nerdexam
Amazon

MLS-C01 Question #253: Real Exam Question with Answer & Explanation

Machine Learning Implementation and Operations

Question

A data scientist is training a large PyTorch model by using Amazon SageMaker. It takes 10 hours on average to train the model on GPU instances. The data scientist suspects that training is not converging and that resource utilization is not optimal. What should the data scientist do to identify and address training issues with the LEAST development effort?

Options

  • A. Use CPU utilization metrics that are captured in Amazon CloudWatch. Configure a CloudWatch
  • B. Use high-resolution custom metrics that are captured in Amazon CloudWatch. Configure an AWS
  • C. Use the SageMaker Debugger vanishing_gradient and LowGPUUtilization built-in rules to detect
  • D. Use the SageMaker Debugger confusion and feature_importance_overweight built-in rules to
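
Options C and D name SageMaker Debugger built-in rules. As a rough illustration of the kind of check that rules such as vanishing_gradient and LowGPUUtilization perform, here is a minimal plain-Python sketch. It is not the Debugger implementation and does not call the SageMaker SDK; the function names, the thresholds (1e-7 and 70 percent), and the nearest-rank percentile method are all assumptions chosen for illustration.

```python
import math
from statistics import mean


def gradients_vanishing(grad_values, threshold=1e-7):
    """Return True if the mean absolute gradient is below `threshold`.

    `threshold` is an illustrative value, not a documented default.
    """
    return mean(abs(g) for g in grad_values) < threshold


def gpu_underutilized(utilization_pct, threshold=70.0):
    """Return True if the 95th-percentile GPU utilization is below `threshold`.

    Uses a nearest-rank percentile; the real rule may compute this differently.
    """
    ordered = sorted(utilization_pct)
    rank = max(1, math.ceil(0.95 * len(ordered)))  # 1-based nearest rank
    return ordered[rank - 1] < threshold


if __name__ == "__main__":
    # Hypothetical samples: near-zero gradients and a mostly idle GPU.
    print(gradients_vanishing([1e-9, -2e-9, 5e-10]))
    print(gpu_underutilized([20, 25, 30, 35, 40]))
```

In a managed training job, checks of this kind run alongside training and need no changes to the training script, which is why built-in rules are usually weighed against hand-built CloudWatch metrics when development effort matters.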

Topics

#SageMaker Debugger  #Model Training Optimization  #Resource Utilization Monitoring  #Convergence Issues