DEA-C01 Question #221: Real Exam Question with Answer & Explanation

The correct answer is A: Partition the data based on order date. Use Amazon Athena to query the data.

Data Ingestion and Transformation

Question

A data engineer needs to optimize the performance of a data pipeline that handles retail orders. Data about the orders is ingested daily into an Amazon S3 bucket. The data engineer runs queries once each week to extract metrics from the orders data based on the order date for multiple date ranges. The data engineer needs an optimization solution that ensures the query performance will not degrade when the volume of data increases. Which solution will meet this requirement MOST cost-effectively?

Options

  • A. Partition the data based on order date. Use Amazon Athena to query the data.
  • B. Partition the data based on order date. Use Amazon Redshift to query the data.
  • C. Partition the data based on load date. Use Amazon EMR to query the data.
  • D. Partition the data based on load date. Use Amazon Aurora to query the data.

Explanation

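Partitioning the S3 data by order date lets Athena prune each scan to the partitions that match the queried date range, so the amount of data read depends on the range requested rather than on the total size of the dataset. Because Athena is a serverless, pay-per-query service, you pay only for the data scanned and nothing between the weekly runs, making it the most cost-effective way to run the weekly date-range metrics. Partitioning by load date (options C and D) would not help, because the queries filter on order date. Amazon Redshift, Amazon EMR, and Amazon Aurora generally involve provisioned capacity that is harder to justify for a once-a-week workload.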
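
The pruning argument can be illustrated with a small local simulation. This is only a sketch and does not call Athena or S3: the bucket name, the `order_date=` key layout, and the partition sizes are hypothetical values chosen for illustration. It assumes Hive-style partition prefixes, a layout Athena can prune when the query's WHERE clause filters on the partition column.

```python
from datetime import date, timedelta
from typing import Dict, List

# Hypothetical bucket and prefix; not taken from the original question.
BASE_PREFIX = "s3://example-orders-bucket/orders"


def partition_prefix(order_date: date) -> str:
    """Return the Hive-style partition prefix for one order date."""
    return f"{BASE_PREFIX}/order_date={order_date.isoformat()}/"


def prefixes_for_range(start: date, end: date) -> List[str]:
    """List the partition prefixes an inclusive date-range query must read."""
    days = (end - start).days + 1
    return [partition_prefix(start + timedelta(days=i)) for i in range(days)]


def scanned_bytes(partition_sizes: Dict[date, int], start: date, end: date) -> int:
    """Sum the bytes of the partitions whose order date falls in [start, end]."""
    return sum(size for d, size in partition_sizes.items() if start <= d <= end)


if __name__ == "__main__":
    one_gib = 1024 ** 3
    first_day = date(2023, 1, 1)
    # Made-up dataset sizes: one partition per day, 1 GiB each.
    one_year = {first_day + timedelta(days=i): one_gib for i in range(365)}
    two_years = {first_day + timedelta(days=i): one_gib for i in range(730)}

    week = (date(2023, 6, 1), date(2023, 6, 7))
    # The same one-week query reads the same amount of data in both datasets.
    print(scanned_bytes(one_year, *week) // one_gib)   # 7
    print(scanned_bytes(two_years, *week) // one_gib)  # 7
    print(prefixes_for_range(*week)[0])
```

Doubling the dataset leaves the bytes read by the one-week query unchanged, which is the property the question asks for. With load-date partitions, the same order-date filter could not be mapped to a fixed set of prefixes, so the scan would grow with the dataset.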

Topics

  • Data Partitioning
  • Amazon Athena
  • S3 Data Lake
  • Cost Optimization