A security company stores IoT data that is in JSON format in an Amazon S3 bucket. The data structure can change when the company upgrades the IoT devices. The company wants to create a data catalog that includes the IoT data. The company's analytics department will use the data catalog to index the data. Which solution will meet these requirements MOST cost-effectively?

Question

Accepted Answer

A. Create an AWS Glue Data Catalog. Configure an AWS Glue Schema Registry. Create a new ...

This solution uses the AWS Glue Data Catalog for central metadata storage and the AWS Glue Schema Registry for explicit management of schema evolution, which matters for IoT data whose structure can change. The Schema Registry provides versioning and compatibility checks, directly addressing the changing data structures, and the cataloged data can be queried cost-effectively through Amazon Athena.
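To make "versioning and compatibility checks" concrete, here is a minimal sketch in plain Python of what a backward-compatibility check between two schema versions might look like. It is not the Schema Registry's actual algorithm; the function `is_backward_compatible`, the field names (`device_id`, `temp`, `firmware`, `battery`), and the simplified rule are all assumptions made for illustration. The rule used here: a new version is accepted only if it adds no new required fields and does not change the type of any field it shares with the previous version, so records written under the old schema remain readable.

```python
from typing import Any, Dict

Schema = Dict[str, Any]

# Hypothetical schema versions for an IoT reading (JSON Schema style).
V1: Schema = {
    "type": "object",
    "properties": {
        "device_id": {"type": "string"},
        "temp": {"type": "number"},
    },
    "required": ["device_id", "temp"],
}

# Device upgrade adds an OPTIONAL field: old records are still valid.
V2_OPTIONAL_FIELD: Schema = {
    "type": "object",
    "properties": {
        "device_id": {"type": "string"},
        "temp": {"type": "number"},
        "firmware": {"type": "string"},
    },
    "required": ["device_id", "temp"],
}

# Device upgrade adds a REQUIRED field: old records lack it.
V2_REQUIRED_FIELD: Schema = {
    "type": "object",
    "properties": {
        "device_id": {"type": "string"},
        "temp": {"type": "number"},
        "battery": {"type": "number"},
    },
    "required": ["device_id", "temp", "battery"],
}


def is_backward_compatible(old: Schema, new: Schema) -> bool:
    """Simplified check (an assumption, not the Glue implementation):
    data written with `old` must still be readable with `new`."""
    old_props = old.get("properties", {})
    new_props = new.get("properties", {})

    # The new version may not require anything the old version did not.
    if not set(new.get("required", [])) <= set(old.get("required", [])):
        return False

    # Fields present in both versions must keep the same declared type.
    for name in old_props.keys() & new_props.keys():
        if old_props[name].get("type") != new_props[name].get("type"):
            return False
    return True


if __name__ == "__main__":
    print(is_backward_compatible(V1, V2_OPTIONAL_FIELD))
    print(is_backward_compatible(V1, V2_REQUIRED_FIELD))
```

In a real deployment the registry would perform this kind of check when a new schema version is registered, rejecting versions that violate the compatibility mode configured for that schema.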

Answer

B. Create an Amazon Redshift provisioned cluster. Create an Amazon Redshift Spectrum database ...

An Amazon Redshift provisioned cluster, even with Redshift Spectrum, is a significantly more expensive and higher-overhead solution than a serverless approach for creating a data catalog and indexing S3 data.

Answer

C. Create an Amazon Athena workgroup. Explore the data that is in Amazon S3 by using Apache ...

While Amazon Athena workgroups aid cost control, Apache Hudi is a data lake format, not a data catalog service, and does not provide the comprehensive schema management solution needed for changing data structures on its own.

Answer

D. Create an AWS Glue Data Catalog. Configure an AWS Glue Schema Registry. Create AWS ...

While AWS Glue crawlers are valuable for schema discovery, the AWS Glue Schema Registry provides explicit, versioned schema management for data whose structure changes frequently, and option A describes a complete pipeline for analytics. The Schema Registry directly addresses schema evolution, and its integration with the Data Catalog makes it the core component for handling changing schemas, with Athena as the query service.
