DEA-C01 Question #33: Real Exam Question with Answer & Explanation
The correct answer is A: Create an AWS Glue Data Catalog. Configure an AWS Glue Schema Registry. Create a new…
Question
A security company stores IoT data that is in JSON format in an Amazon S3 bucket. The data structure can change when the company upgrades the IoT devices. The company wants to create a data catalog that includes the IoT data. The company's analytics department will use the data catalog to index the data. Which solution will meet these requirements MOST cost-effectively?
Options
- A. Create an AWS Glue Data Catalog. Configure an AWS Glue Schema Registry. Create a new…
- B. Create an Amazon Redshift provisioned cluster. Create an Amazon Redshift Spectrum database…
- C. Create an Amazon Athena workgroup. Explore the data that is in Amazon S3 by using Apache…
- D. Create an AWS Glue Data Catalog. Configure an AWS Glue Schema Registry. Create AWS…
Explanation
The requirement is a cost-effective data catalog for IoT data in Amazon S3 that tolerates changing data structures (schema evolution). The AWS Glue Data Catalog provides the catalog, and an AWS Glue Schema Registry manages the schema versions as the devices change, so that Amazon Athena can query the indexed data.
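As a rough illustration of the Schema Registry step, the sketch below builds the request parameters for the Glue `CreateRegistry` and `CreateSchema` API calls and passes them to a boto3 Glue client. The registry name, schema name, sample payload fields, and the choice of `BACKWARD` compatibility are all assumptions made for this example; none of them come from the question.

```python
import json

# Hypothetical names, chosen only for illustration.
REGISTRY_NAME = "iot-device-schemas"
SCHEMA_NAME = "sensor-reading"

# Hypothetical first version of the device payload, written as JSON Schema.
SENSOR_SCHEMA_V1 = {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "type": "object",
    "properties": {
        "device_id": {"type": "string"},
        "temperature": {"type": "number"},
    },
    "required": ["device_id"],
}


def build_registry_request(registry_name):
    """Return keyword arguments for the Glue CreateRegistry call."""
    return {
        "RegistryName": registry_name,
        "Description": "Schemas for IoT JSON payloads stored in Amazon S3",
    }


def build_schema_request(registry_name, schema_name, definition):
    """Return keyword arguments for the Glue CreateSchema call."""
    return {
        "RegistryId": {"RegistryName": registry_name},
        "SchemaName": schema_name,
        "DataFormat": "JSON",
        # Assumption: BACKWARD compatibility; the registry offers other modes.
        "Compatibility": "BACKWARD",
        "SchemaDefinition": json.dumps(definition),
    }


def register_initial_schema(glue_client):
    """Create the registry and the first schema version.

    `glue_client` is expected to be a boto3 Glue client.
    """
    glue_client.create_registry(**build_registry_request(REGISTRY_NAME))
    return glue_client.create_schema(
        **build_schema_request(REGISTRY_NAME, SCHEMA_NAME, SENSOR_SCHEMA_V1)
    )
```

With AWS credentials configured, calling `register_initial_schema(boto3.client("glue"))` would issue both requests. After a device upgrade, a new version could be added with the client's `register_schema_version` call, which the registry checks against the configured compatibility mode.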
Common mistakes.
- B. An Amazon Redshift provisioned cluster, even with Redshift Spectrum, is a significantly more expensive and higher-overhead solution than a serverless approach for creating a data catalog and indexing S3 data.
- C. While Amazon Athena workgroups aid cost control, Apache Hudi is a data lake format, not a data catalog service, and does not provide the comprehensive schema management solution needed for changing data structures on its own.
- D. AWS Glue crawlers are valuable for schema discovery, but for data whose structure changes frequently, the AWS Glue Schema Registry provides explicit, versioned schema management. The Schema Registry directly addresses schema evolution and integrates with the Data Catalog, which makes it the core of handling changing schemas. Option A combines the two in a complete analytics pipeline, with Athena as the querying service.
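To make "schema evolution" concrete, here is a deliberately simplified sketch of a backward-compatibility check between two versions of a JSON payload schema. It is not the Schema Registry's actual algorithm. It only encodes the general idea that a new version may drop fields or add optional ones, but must not start requiring a field or change an existing field's type. The field names are hypothetical.

```python
def is_backward_compatible(old_schema, new_schema):
    """Return True if readers of new_schema can still accept old records.

    Simplified rule set (an assumption for illustration):
    - no field becomes required that was not required before
    - no field that exists in both versions changes its declared type
    """
    old_props = old_schema.get("properties", {})
    new_props = new_schema.get("properties", {})

    # A newly required field would reject old records that lack it.
    old_required = set(old_schema.get("required", []))
    new_required = set(new_schema.get("required", []))
    if new_required - old_required:
        return False

    # A changed type would reject old records that carry the old type.
    for name, old_spec in old_props.items():
        if name in new_props and new_props[name].get("type") != old_spec.get("type"):
            return False

    return True


# Hypothetical payload versions before and after a device upgrade.
v1 = {
    "properties": {
        "device_id": {"type": "string"},
        "temperature": {"type": "number"},
    },
    "required": ["device_id"],
}

# Adds an optional field: old records remain readable.
v2_optional_field = {
    "properties": {**v1["properties"], "firmware_version": {"type": "string"}},
    "required": ["device_id"],
}

# Makes the new field mandatory: old records would be rejected.
v2_required_field = {
    "properties": v2_optional_field["properties"],
    "required": ["device_id", "firmware_version"],
}

print(is_backward_compatible(v1, v2_optional_field))  # True
print(is_backward_compatible(v1, v2_required_field))  # False
```

A versioned registry applies checks of this kind each time a new schema version is registered, so an incompatible device upgrade is caught at registration time instead of surfacing later as failed queries.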
Concept tested. Data cataloging with schema evolution
Reference. https://docs.aws.amazon.com/glue/latest/dg/schema-registry.html