Data Preparation for Machine Learning

Question

A company has significantly increased the amount of data that is stored as .csv files in an Amazon S3 bucket. Data transformation scripts and queries are now taking much longer than they used to take. An ML engineer must implement a solution to optimize the data for query performance. Which solution will meet this requirement with the LEAST operational overhead?

Options

A. Configure an AWS Lambda function to split the .csv files into smaller objects in the S3 bucket.
B. Configure an AWS Glue job to drop columns that have string type values and to save the results
C. Configure an AWS Glue extract, transform, and load (ETL) job to convert the .csv files to Apache
D. Configure an Amazon EMR cluster to process the data that is in the S3 bucket.

Topics

#Data format optimization #Apache Parquet #AWS Glue ETL #Query performance
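
The topics above name Apache Parquet, which is a columnar file format. As a rough illustration of why a columnar layout can help queries that read only a few columns, the sketch below uses only the Python standard library. It does not read or write Parquet and does not touch S3 or AWS Glue; the sample data and all function names are hypothetical and exist only for this example.

```python
import csv
import io

# Hypothetical sample data standing in for one .csv object in the bucket.
CSV_TEXT = """order_id,region,amount
1,us-east,10.5
2,eu-west,20.0
3,us-east,4.5
"""


def to_columns(csv_text):
    """Pivot row-oriented CSV text into a dict of column name -> list of values."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = {name: [] for name in reader.fieldnames}
    for row in reader:
        for name, value in row.items():
            columns[name].append(value)
    return columns


def total_amount_row_wise(csv_text):
    """Row layout: every field of every row is parsed just to reach one column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return sum(float(row["amount"]) for row in reader)


def total_amount_column_wise(columns):
    """Columnar layout: only the 'amount' values are touched."""
    return sum(float(value) for value in columns["amount"])


columns = to_columns(CSV_TEXT)
row_total = total_amount_row_wise(CSV_TEXT)
col_total = total_amount_column_wise(columns)
```

Both functions return the same total, but the column-wise version never looks at `order_id` or `region`. Real columnar formats add compression and per-column metadata on top of this idea, which this sketch does not attempt to model.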