DEA-C01 Exam Questions
308 real DEA-C01 exam questions with expert-verified answers and explanations. Page 2 of 7.
- Question #51: Data Ingestion and Transformation
A data engineer needs to build an extract, transform, and load (ETL) job. The ETL job will process daily incoming .csv files that users upload to an Amazon S3 bucket. The size of e...
ETL, AWS Glue, Apache Spark, Cost Optimization
- Question #52: Data Store Management
In data modeling, an entity-relationship diagram (ERD) is primarily used to:
Data Modeling, ERD, Database Design, Data Relationships
- Question #53: Data Store Management
A company is designing a data lake on Amazon S3. To ensure high performance when accessing the data, which best practice should the company adopt in organizing its data in the S3 b...
Amazon S3, Data Lake Design, Data Partitioning, Performance Optimization
- Question #54: Data Ingestion and Transformation
You have been tasked with migrating an on-premises MySQL database to Amazon Aurora PostgreSQL using AWS Database Migration Service (DMS). The stakeholder emphasizes that the source...
AWS DMS, Database Migration, Continuous Replication, Zero Downtime
- Question #55: Data Ingestion and Transformation
A data engineer is configuring an AWS Glue job to read data from an Amazon S3 bucket. The data engineer has set up the necessary AWS Glue connection details and an associated IAM r...
AWS Glue, Amazon S3, VPC Endpoints, Route Tables
- Question #56: Data Security and Governance
A retail company has a customer data hub in an Amazon S3 bucket. Employees from many countries use the data hub to support company-wide analytics. A governance team must ensure tha...
AWS Lake Formation, Row-level security, Data governance, S3 data lake
- Question #57: Data Ingestion and Transformation
A media company wants to improve a system that recommends media content to customers based on user behavior and preferences. To improve the recommendation system, the company needs...
Data Ingestion, AWS DataSync, AWS Data Exchange, Third-party data integration
- Question #58: Data Security and Governance
A financial company wants to implement a data mesh. The data mesh must support centralized data governance, data analysis, and data access control. The company has decided to use A...
Data Mesh, Data Lake, Data Governance, Access Control
- Question #59: Data Operations and Support
A data engineer maintains custom Python scripts that perform a data formatting process that many AWS Lambda functions use. When the data engineer needs to modify the Python scripts...
AWS Lambda, Lambda Layers, Code Management, Serverless Architecture
- Question #60: Data Security and Governance
Company DEF has a strict security policy that mandates that all data at rest in Amazon S3 must be encrypted. They want to ensure that the encryption keys are managed by AWS, but th...
S3 encryption, AWS KMS, Data Security, Key Management
- Question #61: Data Security and Governance
In a data engineering pipeline, a company is using multiple applications and teams to access a shared Amazon S3 bucket. To streamline access and simplify permissions management for...
S3 Access Points, Permissions Management, Data Access Control, S3
- Question #62: Data Operations and Support
When processing large datasets using distributed computing frameworks, uneven distribution of data can lead to processing delays. What is this phenomenon commonly known as?
Distributed computing, Data skew, Performance optimization, Data processing
- Question #63: Data Ingestion and Transformation
A data engineer needs to use AWS Step Functions to design an orchestration workflow. The workflow must parallel process a large collection of data files and apply a specific transf...
AWS Step Functions, Workflow Orchestration, Map State, Parallel Processing
- Question #64: Data Ingestion and Transformation
A company is migrating a legacy application to an Amazon S3-based data lake. A data engineer reviewed data that is associated with the legacy application. The data engineer found t...
AWS Glue, ETL, Data Deduplication, Machine Learning Transforms
- Question #65: Data Store Management
A company is building an analytics solution. The solution uses Amazon S3 for data lake storage and Amazon Redshift for a data warehouse. The company wants to use Amazon Redshift Sp...
Redshift Spectrum, Query Optimization, Data Partitioning, Columnar Storage
- Question #66: Data Store Management
A company uses Amazon RDS to store transactional data. The company runs an RDS DB instance in a private subnet. A developer wrote an AWS Lambda function with default settings to in...
AWS Lambda, Amazon RDS, VPC Networking, Security Groups
- Question #67: Data Operations and Support
A company has a frontend ReactJS website that uses Amazon API Gateway to invoke REST APIs. The APIs perform the functionality of the website. A data engineer needs to write a Pytho...
AWS Lambda, API Gateway, Serverless Compute, Operational Overhead
- Question #68: Data Ingestion and Transformation
A company has a production AWS account that runs company workloads. The company's security team created a security AWS account to store and analyze security logs from the productio...
Kinesis Data Streams, CloudWatch Logs, Cross-account data transfer, IAM roles
- Question #69: Data Ingestion and Transformation
A company uses Amazon S3 to store semi-structured data in a transactional data lake. Some of the data files are small, but other data files are tens of terabytes. A data engineer m...
Change Data Capture, Data Lake Formats, Amazon S3, Cost Optimization
- Question #70: Data Store Management
A data engineer runs Amazon Athena queries on data that is in an Amazon S3 bucket. The Athena queries use AWS Glue Data Catalog as a metadata table. The data engineer notices that...
Athena, AWS Glue Data Catalog, Partitioning, Performance Optimization
- Question #71: Data Ingestion and Transformation
A data engineer must manage the ingestion of real-time streaming data into AWS. The data engineer wants to perform real-time analytics on the incoming streaming data by using time-...
Real-time data processing, Stream analytics, Windowed aggregations, Amazon Managed Service for Apache Flink
- Question #72: Data Store Management
A company is planning to upgrade its Amazon Elastic Block Store (Amazon EBS) General Purpose SSD storage from gp2 to gp3. The company wants to prevent any interruptions in its Amaz...
Amazon EBS, Volume Modification, Storage Upgrade, Operational Efficiency
- Question #73: Data Ingestion and Transformation
A company is migrating its database servers from Amazon EC2 instances that run Microsoft SQL Server to Amazon RDS for Microsoft SQL Server DB instances. The company's analytics tea...
SQL Server Views, Data Transformation, Operational Efficiency, Data Preparation
- Question #74: Data Ingestion and Transformation
For evolving schema and high compatibility, which data format should be chosen for downstream analytics?
Data Formats, Schema Evolution, Data Compatibility, Analytics Data Formats
- Question #75: Data Security and Governance
What is the primary purpose of data lineage in data engineering?
data lineage, data governance, data quality
- Question #76: Data Store Management
Which of the following best describes the type of data found in traditional relational databases?
Relational databases, Structured data, Data types, Database concepts
- Question #77: Data Ingestion and Transformation
Pivoting in SQL is mainly used to transform data from:
SQL Pivoting, Data Transformation, SQL
- Question #78: Data Operations and Support
A company created an extract, transform, and load (ETL) data pipeline in AWS Glue. A data engineer must crawl a table that is in Microsoft SQL Server. The data engineer needs to ex...
AWS Glue Workflows, Data Pipeline Orchestration, ETL, Cost Optimization
- Question #79: Data Store Management
A financial services company stores financial data in Amazon Redshift. A data engineer wants to run real-time queries on the financial data to support a web-based trading applicati...
Amazon Redshift, Redshift Data API, Application Integration, Operational Overhead
- Question #80: Data Security and Governance
A company uses Amazon Athena for one-time queries against data that is in Amazon S3. The company has several use cases. The company must implement permission controls to separate q...
Athena Workgroups, Permissions Management, Query History, Data Security
- Question #81: Data Ingestion and Transformation
A data engineer needs to schedule a workflow that runs a set of AWS Glue jobs every day. The data engineer does not require the Glue jobs to run or finish at a specific time. Which...
AWS Glue, Cost Optimization, ETL Jobs, FLEX Execution
- Question #82: Data Ingestion and Transformation
A data engineer needs to create an AWS Lambda function that converts the format of data from .csv to Apache Parquet. The Lambda function must run only if a user uploads a .csv file...
S3 Event Notifications, AWS Lambda, Data Transformation, Operational Efficiency
- Question #83: Data Ingestion and Transformation
A data engineer needs Amazon Athena queries to finish faster. The data engineer notices that all the files the Athena queries use are currently stored in uncompressed .csv format....
Athena, Data Formats, Columnar Storage, Performance Optimization
- Question #84: Data Ingestion and Transformation
A manufacturing company collects sensor data from its factory floor to monitor and enhance operational efficiency. The company uses Amazon Kinesis Data Streams to publish the data...
Real-time stream processing, Low latency architecture, Kinesis Data Streams, Apache Flink
- Question #85: Data Store Management
A company uses Amazon RDS for MySQL as the database for a critical application. The database workload is mostly writes, with a small number of reads. A data engineer notices that t...
RDS Performance Tuning, Database Scaling, Performance Monitoring, CPU Optimization
- Question #86: Data Store Management
A company has used an Amazon Redshift table that is named Orders for 6 months. The company performs weekly updates and deletes on the table. The table has an interleaved sort key o...
Redshift VACUUM, Sort Keys, Disk Space Management, Table Maintenance
- Question #87: Data Ingestion and Transformation
A manufacturing company wants to collect data from sensors. A data engineer needs to implement a solution that ingests sensor data in near real time. The solution must store the da...
Near real-time ingestion, Serverless architecture, NoSQL databases, Operational efficiency
- Question #88: Data Security and Governance
A company stores data in a data lake that is in Amazon S3. Some data that the company stores in the data lake contains personally identifiable information (PII). Multiple user grou...
Data Lake Security, AWS Lake Formation, Fine-grained Access Control, PII Protection
- Question #89: Data Ingestion and Transformation
A data engineer must build an extract, transform, and load (ETL) pipeline to process and load data from 10 source systems into 10 tables that are in an Amazon Redshift database. Al...
AWS Glue, ETL Pipeline, Schema Evolution, Data Pipeline Orchestration
- Question #90: Data Operations and Support
A financial company wants to use Amazon Athena to run on-demand SQL queries on a petabyte-scale dataset to support a business intelligence (BI) application. An AWS Glue job that r...
Amazon Athena, Cost Optimization, Query Result Reuse, Operational Efficiency
- Question #91: Data Store Management
A data engineer creates an AWS Glue Data Catalog table by using an AWS Glue crawler that is named Orders. The data engineer wants to add the following new partitions: s3://transact...
AWS Glue Data Catalog, Amazon Athena, Partition Management, DDL
- Question #92: Data Ingestion and Transformation
A company stores 10 to 15 TB of uncompressed .csv files in Amazon S3. The company is evaluating Amazon Athena as a one-time query engine. The company wants to transform the data to...
Amazon Athena, Data Lake Formats, Columnar Storage, Compression Strategies
- Question #93: Data Operations and Support
A company uses Apache Airflow to orchestrate the company's current on-premises data pipelines. The company runs SQL data quality check tasks as part of the pipelines. The company w...
Amazon MWAA, Data Orchestration, Cloud Migration, Apache Airflow
- Question #94: Data Ingestion and Transformation
A company uses Amazon EMR as an extract, transform, and load (ETL) pipeline to transform data that comes from multiple sources. A data engineer must orchestrate the pipeline to max...
ETL Orchestration, AWS Step Functions, Amazon EMR, Cost Optimization
- Question #95: Data Store Management
An online retail company stores Application Load Balancer (ALB) access logs in an Amazon S3 bucket. The company wants to use Amazon Athena to query the logs to analyze traffic patt...
Athena Performance Optimization, AWS Glue Crawler, S3 Data Lake, Log Analytics
- Question #96: Data Ingestion and Transformation
A company has a business intelligence platform on AWS. The company uses an AWS Storage Gateway Amazon S3 File Gateway to transfer files from the company's on-premises environment t...
Event-driven Architecture, Amazon EventBridge, AWS Glue Workflows, AWS Storage Gateway
- Question #97: Data Store Management
A retail company uses Amazon Aurora PostgreSQL to process and store live transactional data. The company uses an Amazon Redshift cluster for a data warehouse. An extract, transform...
Redshift Cost Optimization, Redshift Federated Query, Data Archiving, Hybrid Querying
- Question #98: Data Ingestion and Transformation
A manufacturing company has many IoT devices in facilities around the world. The company uses Amazon Kinesis Data Streams to collect data from the devices. The data includes device...
Kinesis Data Streams, Partition Key, Hot Shard, Write Throughput
- Question #99: Data Store Management
A data engineer wants to improve the performance of SQL queries in Amazon Athena that run against a sales data table. The data engineer wants to understand the execution plan of a...
Athena Query Performance, SQL Execution Plan, EXPLAIN ANALYZE
- Question #100: Data Ingestion and Transformation
A company plans to provision a log delivery stream within a VPC. The company configured the VPC flow logs to publish to Amazon CloudWatch Logs. The company needs to send the flow l...
Kinesis Data Firehose, CloudWatch Logs, Log Streaming, Splunk Integration