DEA-C01 Question #276: Real Exam Question with Answer & Explanation
The correct answer is B: Create an AWS Glue DataBrew project for the data in the S3 bucket. Create a ruleset for the data.
Question
A company is developing machine learning (ML) models. A data engineer needs to apply data quality rules to training data. The company stores the training data in an Amazon S3 bucket. Which solution will meet these requirements with the LEAST operational overhead?
Options
- A. Create an AWS Lambda function to check data quality and to raise exceptions in the code. Run
- B. Create an AWS Glue DataBrew project for the data in the S3 bucket. Create a ruleset for the data
- C. Create an Amazon EMR provisioned cluster. Add a Python open source data quality package to
- D. Create AWS Lambda functions to evaluate data quality rules. Use AWS Step Functions to
Explanation
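AWS Glue DataBrew is a fully managed, serverless service, so there are no clusters to provision and no validation code to maintain. A DataBrew ruleset defines the data quality rules for a dataset that points at the S3 data, and a profile job validates the data against that ruleset. Triggering the profile job via Amazon EventBridge on object creation applies the rules with minimal code and management overhead. The other options rely on custom Lambda code (A and D) or a provisioned Amazon EMR cluster (C), each of which adds more to build and operate.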
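As a rough illustration of the ruleset and profile job described above, the following Python sketch builds the request payloads that could be passed to the boto3 DataBrew client (`create_ruleset`, `create_profile_job`, `start_job_run`). It is not taken from the exam material. Every name, ARN, bucket, and the example rule itself is a hypothetical placeholder, and the check expression should be verified against the DataBrew documentation before use.

```python
# Sketch only: all names, ARNs, and the rule below are hypothetical placeholders.


def build_ruleset_request(dataset_arn, ruleset_name="training-data-rules"):
    """Build keyword arguments for the DataBrew CreateRuleset API."""
    return {
        "Name": ruleset_name,
        # ARN of the DataBrew dataset that points at the S3 training data.
        "TargetArn": dataset_arn,
        "Rules": [
            {
                # Illustrative rule: the dataset must contain no duplicate rows.
                # Assumed expression syntax; confirm it in the DataBrew docs.
                "Name": "no-duplicate-rows",
                "CheckExpression": "AGG(DUPLICATE_ROWS_COUNT) == :val1",
                "SubstitutionMap": {":val1": "0"},
            }
        ],
    }


def build_profile_job_request(dataset_name, ruleset_arn, role_arn, output_bucket,
                              job_name="training-data-profile"):
    """Build keyword arguments for the DataBrew CreateProfileJob API."""
    return {
        "Name": job_name,
        "DatasetName": dataset_name,
        "RoleArn": role_arn,
        # Bucket where the profile and validation results are written.
        "OutputLocation": {"Bucket": output_bucket},
        # Attaching the ruleset makes the profile job validate the data against it.
        "ValidationConfigurations": [
            {"RulesetArn": ruleset_arn, "ValidationMode": "CHECK_ALL"}
        ],
    }


def start_profile_job(client, job_name="training-data-profile"):
    """Start one run of the profile job and return its run ID.

    `client` is expected to behave like boto3.client("databrew"). Whatever
    reacts to the S3 object-created event could call this function.
    """
    response = client.start_job_run(Name=job_name)
    return response["RunId"]
```

With boto3 installed and credentials configured, the payloads could be applied as `client.create_ruleset(**build_ruleset_request(...))` followed by `client.create_profile_job(**build_profile_job_request(...))`, where `client = boto3.client("databrew")`. Keeping the payload builders separate from the client calls is a design choice made here so the requests can be inspected without touching AWS.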