DEA-C01 Exam Questions
308 real DEA-C01 exam questions with expert-verified answers and explanations. Page 1 of 7.
- Question #1: Data Ingestion and Transformation
A company stores daily records of the financial performance of investment portfolios in .csv format in an Amazon S3 bucket. A data engineer uses AWS Glue crawlers to crawl the S3 d...
AWS Glue, IAM, Data Catalog, Crawlers
- Question #2: Data Operations and Support
A company loads transaction data for each day into Amazon Redshift tables at the end of each day. The company wants to have the ability to track which tables have been loaded and w...
Redshift Data API, EventBridge, Lambda invocation, Data load monitoring
- Question #3: Data Ingestion and Transformation
A data engineer needs to securely transfer 5 TB of data from an on-premises data center to an Amazon S3 bucket. Approximately 5% of the data changes every day. Updates to the data...
On-premises data transfer, AWS DataSync, Automated data transfer, Incremental data synchronization
- Question #4: Data Ingestion and Transformation
A company uses an on-premises Microsoft SQL Server database to store financial transaction data. The company migrates the transaction data from the on-premises database to AWS at t...
Database Migration, AWS DMS, On-premises to AWS, Minimal Downtime
- Question #5: Data Ingestion and Transformation
A data engineer is building a data pipeline on AWS by using AWS Glue extract, transform, and load (ETL) jobs. The data engineer needs to process data from Amazon RDS and MongoDB, p...
AWS Glue, ETL Pipeline, Scheduling, Database Connectivity
- Question #6: Data Store Management
A company uses an Amazon Redshift cluster that runs on RA3 nodes. The company wants to scale read and write capacity to meet demand. A data engineer needs to identify a solution th...
Redshift, Concurrency Scaling, Workload Management (WLM), Scalability
- Question #7: Data Operations and Support
A data engineer must orchestrate a series of Amazon Athena queries that will run every day. Each query can run for more than 15 minutes. Which combination of steps will meet these...
Athena, Workflow Orchestration, AWS Step Functions, AWS Lambda
- Question #8: Data Ingestion and Transformation
A company is migrating on-premises workloads to AWS. The company wants to reduce overall operational overhead. The company also wants to explore serverless options. The company's c...
ETL, AWS Big Data Services, Apache Frameworks, Workload Migration
- Question #9: Data Ingestion and Transformation
A data engineer must use AWS services to ingest a dataset into an Amazon S3 data lake. The data engineer profiles the dataset and discovers that the dataset contains personally ide...
PII, Data Obfuscation, AWS Glue Studio, AWS Glue Data Catalog
- Question #10: Data Operations and Support
A company maintains multiple extract, transform, and load (ETL) workflows that ingest data from the company's operational databases into an Amazon S3 based data lake. The ETL workf...
ETL orchestration, Serverless workflows, AWS Step Functions, Operational overhead
- Question #11: Data Store Management
A company currently stores all of its data in Amazon S3 by using the S3 Standard storage class. A data engineer examined data access patterns to identify trends. During the first 6...
S3 Lifecycle Policies, S3 Storage Classes, Cost Optimization, Data Availability
- Question #12: Data Store Management
A company maintains an Amazon Redshift provisioned cluster that the company uses for extract, transform, and load (ETL) operations to support critical analysis tasks. A sales team...
Redshift data sharing, Cross-cluster data access, Resource isolation, ETL/BI integration
- Question #13: Data Ingestion and Transformation
A data engineer needs to join data from multiple sources to perform a one-time analysis job. The data is stored in Amazon DynamoDB, Amazon RDS, Amazon Redshift, and Amazon S3. Whic...
Athena Federated Query, Cross-source data joining, Cost optimization, Ad-hoc data analysis
- Question #14: Data Store Management
A company is planning to use a provisioned Amazon EMR cluster that runs Apache Spark jobs to perform big data analysis. The company requires high reliability. A big data team must...
EMR optimization, Cost-effective computing, Spark workloads, Big data storage
- Question #15: Data Ingestion and Transformation
A company wants to implement real-time analytics capabilities. The company wants to use Amazon Kinesis Data Streams and Amazon Redshift to ingest and process streaming data at the...
Amazon Redshift Streaming Ingestion, Amazon Kinesis Data Streams, Real-time analytics, Operational overhead
- Question #16: Data Ingestion and Transformation
A company uses an Amazon QuickSight dashboard to monitor usage of one of the company's applications. The company uses AWS Glue jobs to process data for the dashboard. The company s...
AWS Glue Performance, S3 Data Partitioning, ETL Optimization, Data Lake Optimization
- Question #17: Data Operations and Support
A data engineering team is using an Amazon Redshift data warehouse for operational reporting. The team wants to prevent performance issues that might result from long-running quer...
Redshift, Performance Monitoring, System Views, Query Optimization Alerts
- Question #18: Data Ingestion and Transformation
A data engineer must ingest a source of structured data that is in .csv format into an Amazon S3 data lake. The .csv files contain 15 columns. Data analysts need to run Amazon Athe...
AWS Glue ETL, Amazon Athena, Data Lake, Data Formats
- Question #19: Data Security and Governance
A company has five offices in different AWS Regions. Each office has its own human resources (HR) department that uses a unique IAM role. The company stores employee records in a d...
AWS Lake Formation, Fine-grained access control, Data security, S3 data lake
- Question #20: Data Operations and Support
A company uses AWS Step Functions to orchestrate a data pipeline. The pipeline consists of Amazon EMR jobs that ingest data from data sources and store the data in an Amazon S3 buc...
Step Functions, EMR Orchestration, IAM Permissions, VPC Network Troubleshooting
- Question #21: Data Store Management
A company is developing an application that runs on Amazon EC2 instances. Currently, the data that the application generates is temporary. However, the company needs to persist the...
EC2, EBS, Data Persistence, AMI
- Question #22: Data Ingestion and Transformation
A company uses Amazon Athena to run SQL queries for extract, transform, and load (ETL) tasks by using Create Table As Select (CTAS). The company must use Apache Spark instead of SQ...
Amazon Athena, Apache Spark, AWS Glue Data Catalog, ETL
- Question #23: Data Store Management
A company needs to partition the Amazon S3 storage that the company uses for a data lake. The partitioning will use a path of the S3 object keys in the following format: s3://bucke...
AWS Glue Data Catalog, Data Partitioning, S3 Data Lake, API Integration
- Question #24: Data Ingestion and Transformation
A media company uses software as a service (SaaS) applications to gather data by using third-party tools. The company needs to store the data in an Amazon S3 bucket. The company w...
SaaS Integration, Data Ingestion, Amazon AppFlow, Low Operational Overhead
- Question #25: Data Operations and Support
A data engineer is using Amazon Athena to analyze sales data that is in Amazon S3. The data engineer writes a query to retrieve sales amounts for 2023 for several products from a t...
Amazon Athena, SQL Querying, Data Filtering, Aggregate Functions
- Question #26: Data Operations and Support
A data engineer has a one-time task to read data from objects that are in Apache Parquet format in an Amazon S3 bucket. The data engineer needs to query only one column of the data...
S3 Select, Parquet, Serverless Querying, Operational Efficiency
- Question #27: Data Operations and Support
A company uses Amazon Redshift for its data warehouse. The company must automate refresh schedules for Amazon Redshift materialized views. Which solution will meet this requirement...
Amazon Redshift, Materialized Views, Automation, AWS Lambda
- Question #28: Data Operations and Support
A data engineer must orchestrate a data pipeline that consists of one AWS Lambda function and one AWS Glue job. The solution must integrate with AWS services. Which solution will m...
Workflow Orchestration, AWS Step Functions, Serverless Workflows, Data Pipelines
- Question #29: Data Ingestion and Transformation
A company needs to set up a data catalog and metadata management for data sources that run in the AWS Cloud. The company will use the data catalog to maintain the metadata of all t...
AWS Glue, Data Catalog, Metadata Management, Glue Crawlers
- Question #30: Data Store Management
A company stores data from an application in an Amazon DynamoDB table that operates in provisioned capacity mode. The workloads of the application have predictable throughput load...
DynamoDB, Provisioned Capacity, Auto Scaling, Cost Optimization
- Question #31: Data Store Management
A company is planning to migrate on-premises Apache Hadoop clusters to Amazon EMR. The company also needs to migrate a data catalog into a persistent storage solution. The company...
Amazon EMR, AWS Glue Data Catalog, Hive Metastore Migration, Serverless Data Catalog
- Question #32: Data Store Management
A company uses an Amazon Redshift provisioned cluster as its database. The Redshift cluster has five reserved ra3.4xlarge nodes and uses key distribution. A data engineer notices t...
Redshift Performance Tuning, Data Distribution, Workload Balancing, Compute Node Management
- Question #33: Data Ingestion and Transformation
A security company stores IoT data that is in JSON format in an Amazon S3 bucket. The data structure can change when the company upgrades the IoT devices. The company wants to crea...
AWS Glue, Data Catalog, Schema Evolution, S3 Data Lakes
- Question #34: Data Security and Governance
A company stores details about transactions in an Amazon S3 bucket. The company wants to log all writes to the S3 bucket into another S3 bucket that is in the same AWS Region. Whic...
CloudTrail, S3 Data Events, Logging, Audit Trail
- Question #35: Data Store Management
A data engineer needs to maintain a central metadata repository that users access through Amazon EMR and Amazon Athena queries. The repository needs to provide the schema and prope...
AWS Glue Data Catalog, Metadata Management, Apache Hive, Amazon EMR
- Question #36: Data Security and Governance
A company needs to build a data lake in AWS. The company must provide row-level data access and column-level data access to specific teams. The teams will access the data by using...
Data Lake, Data Security, Access Control, AWS Lake Formation
- Question #37: Data Ingestion and Transformation
An airline company is collecting metrics about flight activities for analytics. The company is conducting a proof of concept (POC) test to show how analytics can provide insights t...
Athena Query Performance, S3 Data Storage, Columnar Formats, Data Transformation
- Question #38: Data Store Management
A company's data engineer needs to optimize the performance of table SQL queries. The company stores data in an Amazon Redshift cluster. The data engineer cannot increase the size...
Amazon Redshift, Distribution Styles, Query Optimization, Data Modeling
- Question #39: Data Ingestion and Transformation
A company receives .csv files that contain physical address data. The data is in columns that have the following names: Door_No, Street_Name, City, and Zip_Code. The company wants...
AWS Glue DataBrew, Data Transformation, NEST_TO_MAP, Column Aggregation
- Question #40: Data Security and Governance
A company receives call logs as Amazon S3 objects that contain sensitive customer information. The company must protect the S3 objects by using encryption. The company must also us...
S3 Encryption, AWS KMS, Key Management, Access Control
- Question #41: Data Store Management
A company stores petabytes of data in thousands of Amazon S3 buckets in the S3 Standard storage class. The data supports analytics workloads that have unpredictable and variable da...
S3 Storage Classes, S3 Intelligent-Tiering, Cost Optimization, Data Retrieval
- Question #42: Data Security and Governance
During a security review, a company identified a vulnerability in an AWS Glue job. The company discovered that credentials to access an Amazon Redshift cluster were hard coded in t...
AWS Glue, AWS Secrets Manager, IAM, Credential Management
- Question #43: Data Store Management
A data engineer uses Amazon Redshift to run resource-intensive analytics processes once every month. Every month, the data engineer creates a new Redshift provisioned cluster. The...
Redshift Serverless, Operational Overhead, Data Warehousing, Infrastructure Management
- Question #44: Data Ingestion and Transformation
A company receives a daily file that contains customer data in .xls format. The company stores the file in Amazon S3. The daily file is approximately 2 GB in size. A data engineer...
Data Transformation, Data Aggregation, AWS Glue DataBrew, Operational Efficiency
- Question #45: Data Ingestion and Transformation
A healthcare company uses Amazon Kinesis Data Streams to stream real-time health data from wearable devices, hospital equipment, and patient records. A data engineer needs to find...
Redshift Streaming Ingestion, Kinesis Data Streams, Near Real-time Analytics, Operational Overhead
- Question #46: Data Security and Governance
A data engineer needs to use an Amazon QuickSight dashboard that is based on Amazon Athena queries on data that is stored in an Amazon S3 bucket. When the data engineer connects to...
QuickSight Permissions, Athena Permissions, S3 Access Control, KMS Decryption
- Question #47: Data Ingestion and Transformation
A company stores datasets in JSON format and .csv format in an Amazon S3 bucket. The company has Amazon RDS for Microsoft SQL Server databases, Amazon DynamoDB tables that are in p...
AWS Glue, Data Catalog, Federated Query, Serverless Analytics
- Question #48: Data Security and Governance
A data engineer is configuring Amazon SageMaker Studio to use AWS Glue interactive sessions to prepare data for machine learning (ML) models. The data engineer receives an access d...
IAM, SageMaker Studio, Access Control, Managed Policies
- Question #49: Data Ingestion and Transformation
A company extracts approximately 1 TB of data every day from data sources such as SAP HANA, Microsoft SQL Server, MongoDB, Apache Kafka, and Amazon DynamoDB. Some of the data sourc...
AWS Glue, ETL, Schema Detection, Managed Services
- Question #50: Data Security and Governance
A company has multiple applications that use datasets that are stored in an Amazon S3 bucket. The company has an ecommerce application that generates a dataset that contains person...
S3 Object Lambda, Data Redaction, Data Governance, Operational Efficiency