MLS-C01 Question #72: Real Exam Question with Answer & Explanation

The correct answer is C: Transform the dataset into the RecordIO protobuf format. Most Amazon SageMaker built-in algorithms work best when the training data is supplied in the optimized protobuf RecordIO format. https://docs.aws.amazon.com/sagemaker/latest/dg/cdf-training.html

Data Engineering

Question

A Machine Learning Specialist is preparing data for training on Amazon SageMaker. The Specialist is using one of the SageMaker built-in algorithms for the training. The dataset is stored in .CSV format and is transformed into a numpy.array, which appears to be negatively affecting the speed of the training. What should the Specialist do to optimize the data for training on SageMaker?

Options

  • A. Use the SageMaker batch transform feature to transform the training data into a DataFrame.
  • B. Use AWS Glue to compress the data into the Apache Parquet format.
  • C. Transform the dataset into the RecordIO protobuf format.
  • D. Use the SageMaker hyperparameter optimization feature to automatically optimize the data.

Explanation

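Most Amazon SageMaker built-in algorithms work best when the training data is in the optimized protobuf RecordIO format, so converting the dataset from CSV to RecordIO protobuf (option C) is the change that addresses the slow training. The other options do not optimize the training input: batch transform (A) runs inference over a dataset rather than preparing training data, Parquet (B) is not the format the SageMaker documentation recommends for the built-in algorithms, and hyperparameter optimization (D) tunes hyperparameters and does not change the data format. https://docs.aws.amazon.com/sagemaker/latest/dg/cdf-training.html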
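To make the idea of a RecordIO stream more concrete, here is a minimal sketch of RecordIO-style framing using only the Python standard library. It covers the container layer only: each record is written as a magic number, a length, the payload, and zero padding to a 4-byte boundary. The magic constant, the little-endian byte order, and the helper names write_record and read_records are assumptions made for illustration. In the real format each payload is a serialized protobuf Record message, for which this sketch substitutes placeholder bytes. In practice the conversion is normally done with a helper from the SageMaker Python SDK (for example write_numpy_to_dense_tensor in sagemaker.amazon.common) rather than by hand.

```python
import io
import struct

# Assumed MXNet-style magic number that marks the start of each record.
RECORDIO_MAGIC = 0xCED7230A


def write_record(stream, payload: bytes) -> None:
    """Write one record: magic, length, payload, zero padding to 4 bytes."""
    length = len(payload)
    pad = (-length) % 4  # bytes needed to reach the next 4-byte boundary
    stream.write(struct.pack("<I", RECORDIO_MAGIC))
    stream.write(struct.pack("<I", length))
    stream.write(payload)
    stream.write(b"\x00" * pad)


def read_records(stream):
    """Yield the payloads of records written by write_record."""
    while True:
        header = stream.read(8)
        if len(header) < 8:
            return  # end of stream
        magic, length = struct.unpack("<II", header)
        if magic != RECORDIO_MAGIC:
            raise ValueError("bad RecordIO magic number")
        payload = stream.read(length)
        stream.read((-length) % 4)  # skip the padding bytes
        yield payload


if __name__ == "__main__":
    # Placeholder payloads: one CSV-like row packed as three float32 values,
    # plus an odd-length byte string to exercise the padding logic.
    # A real payload would be a serialized protobuf Record message.
    row = struct.pack("<3f", 1.0, 2.0, 3.0)
    buf = io.BytesIO()
    write_record(buf, row)
    write_record(buf, b"abcde")
    buf.seek(0)
    for payload in read_records(buf):
        print(len(payload), payload)
```

Because every record carries its own length, a reader can pull records off the stream one at a time without parsing text, which is part of why this layout suits training input better than row-by-row CSV parsing.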

Topics

#SageMaker data formats #Data optimization #Training performance #RecordIO Protobuf
