MLA-C01 Question #103: Real Exam Question with Answer & Explanation

The correct answer is A: Use Amazon SageMaker Pipe mode.

ML Model Development

Question

A machine learning team has several large CSV datasets in Amazon S3. Historically, models built with the Amazon SageMaker Linear Learner algorithm have taken hours to train on similar-sized datasets. The team's leaders need to accelerate the training process. What can a machine learning specialist do to address this concern?

Options

  • A. Use Amazon SageMaker Pipe mode.
  • B. Use Amazon Machine Learning to train the models.
  • C. Use Amazon Kinesis to stream the data to Amazon SageMaker.
  • D. Use AWS Glue to transform the CSV dataset to the JSON format.

Explanation

Pipe mode streams training data from S3 directly into the training container instead of first copying the full dataset to the instance's local disk, which is what the default File mode does. Training can begin as soon as the first records arrive, so the up-front download wait is removed and wall-clock training time drops for large datasets. Linear Learner accepts CSV input in Pipe mode, so the team's existing files need no conversion.
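As a rough illustration of where the input mode is set, the sketch below builds a request in the shape accepted by the SageMaker `CreateTrainingJob` API. The helper name `build_training_request`, the instance type, volume size, runtime limit, and every ARN, image URI, and S3 path passed to it are placeholders assumed for the example; none of them come from the question.

```python
from typing import Any, Dict


def build_training_request(
    job_name: str,
    image_uri: str,
    role_arn: str,
    train_s3_uri: str,
    output_s3_uri: str,
    input_mode: str = "Pipe",
) -> Dict[str, Any]:
    """Build a CreateTrainingJob-style request dict (hypothetical helper).

    Only the two input modes contrasted in the question are accepted here.
    """
    if input_mode not in ("File", "Pipe"):
        raise ValueError(f"unsupported input mode for this sketch: {input_mode}")

    return {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,
            # "Pipe" streams from S3; "File" downloads the dataset first.
            "TrainingInputMode": input_mode,
        },
        "RoleArn": role_arn,
        "InputDataConfig": [
            {
                "ChannelName": "train",
                "ContentType": "text/csv",  # the team's data stays as CSV
                "DataSource": {
                    "S3DataSource": {
                        "S3DataType": "S3Prefix",
                        "S3Uri": train_s3_uri,
                        "S3DataDistributionType": "FullyReplicated",
                    }
                },
            }
        ],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        # Assumed sizing, chosen only so the request is complete.
        "ResourceConfig": {
            "InstanceType": "ml.m5.xlarge",
            "InstanceCount": 1,
            "VolumeSizeInGB": 30,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
        "HyperParameters": {"predictor_type": "regressor"},
    }
```

With credentials configured, a dict like this could be submitted with `boto3.client("sagemaker").create_training_job(**request)`. Switching between modes is a one-field change, which is why it is the lowest-effort option among the four.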

Why the distractors are wrong:

  • B (Amazon Machine Learning): Amazon Machine Learning is an older, simpler service with fewer configuration options and no meaningful training-speed advantage; SageMaker has largely superseded it.
  • C (Kinesis): Kinesis is for real-time data streaming pipelines, not for accelerating batch training jobs against static S3 datasets.
  • D (Glue/JSON format): Changing the file format doesn't address the core bottleneck (data ingestion speed), and Linear Learner already handles CSV natively, so the conversion adds work without solving the problem.

Memory tip: Think of Pipe mode as a garden hose connected directly to the algorithm: instead of filling a bucket (local disk) first and then pouring it in, data flows continuously. When an exam question pairs large S3 datasets with slow SageMaker training, Pipe mode is usually the intended answer.
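The bucket-versus-hose picture can be made concrete with a toy model. This is only a simulation in plain Python, not how SageMaker implements either mode; `remote_records` is a hypothetical stand-in for reading from S3. It counts how many records have to be transferred before the algorithm sees its first one.

```python
from typing import Iterator, List, Tuple


def remote_records(n: int, log: List[int]) -> Iterator[int]:
    """Stand-in for fetching n records from S3; logs each transfer."""
    for i in range(n):
        log.append(i)
        yield i


def file_mode_first_record(n: int) -> Tuple[int, int]:
    """Bucket: copy every record locally, then hand over the first one."""
    log: List[int] = []
    local_copy = list(remote_records(n, log))  # the full download happens here
    return local_copy[0], len(log)


def pipe_mode_first_record(n: int) -> Tuple[int, int]:
    """Hose: hand over the first record as soon as it arrives."""
    log: List[int] = []
    stream = remote_records(n, log)
    return next(stream), len(log)


if __name__ == "__main__":
    # Each call returns (first record, transfers made before it was available).
    print(file_mode_first_record(1000))
    print(pipe_mode_first_record(1000))
```

In this toy model the bucket version transfers all 1000 records before the first is usable, while the hose version transfers one. The gap grows with dataset size, which matches the question's emphasis on large datasets.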

Topics

#SageMaker Pipe Mode · #Training Optimization · #Data Ingestion
