Data Engineering

Question

A data engineer needs to provide a team of data scientists with the appropriate dataset to run machine learning training jobs. The data will be stored in Amazon S3. The data engineer is obtaining the data from an Amazon Redshift database and is using join queries to extract a single tabular dataset. A portion of the schema is as follows:

TransactionTimestamp (Timestamp)
CardName (Varchar)
CardNo (Varchar)

The data engineer must provide the data so that any row with a CardNo value of NULL is removed. Also, the TransactionTimestamp column must be separated into a TransactionDate column and a TransactionTime column. Finally, the CardName column must be renamed to NameOnCard. The data will be extracted on a monthly basis and will be loaded into an S3 bucket. The solution must minimize the effort that is needed to set up infrastructure for the ingestion and transformation. The solution also must be automated and must minimize the load on the Amazon Redshift cluster. Which solution meets these requirements?

Options

A. Set up an Amazon EMR cluster. Create an Apache Spark job to read the data from the Amazon
B. Set up an Amazon EC2 instance with a SQL client tool, such as SQL Workbench/J, to query the
C. Set up an AWS Glue job that has the Amazon Redshift cluster as the source and the S3 bucket
D. Use Amazon Redshift Spectrum to run a query that writes the data directly to the S3 bucket.
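
Whichever option is chosen, the question asks for the same three row-level transformations: drop rows where CardNo is NULL, split TransactionTimestamp into TransactionDate and TransactionTime, and rename CardName to NameOnCard. The sketch below illustrates that logic in plain Python only; it is not code for any of the AWS services named above. The sample rows and the `transform` helper are hypothetical and exist purely for illustration.

```python
from datetime import datetime

# Hypothetical sample rows that use the column names from the schema above.
rows = [
    {"TransactionTimestamp": datetime(2023, 1, 5, 14, 30, 0),
     "CardName": "A. Example", "CardNo": "4111"},
    {"TransactionTimestamp": datetime(2023, 1, 6, 9, 15, 45),
     "CardName": "B. Example", "CardNo": None},
]


def transform(rows):
    """Apply the three transformations the question describes."""
    out = []
    for row in rows:
        # 1. Remove any row whose CardNo is NULL (None in Python).
        if row["CardNo"] is None:
            continue
        ts = row["TransactionTimestamp"]
        out.append({
            # 2. Split the timestamp into separate date and time columns.
            "TransactionDate": ts.date().isoformat(),
            "TransactionTime": ts.time().isoformat(),
            # 3. Rename CardName to NameOnCard.
            "NameOnCard": row["CardName"],
            "CardNo": row["CardNo"],
        })
    return out


result = transform(rows)
print(result)
```

In a managed ETL service the same steps would typically be expressed as a filter, a derived-column step, and a column rename rather than a hand-written loop.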

Topics

#Data Ingestion #Data Transformation #ETL #AWS Glue
