

MLA-C01 Question #72: Real Exam Question with Answer & Explanation

The correct answer is C.

ML Model Development

Question

An ML engineer is using Amazon SageMaker to train a deep learning model that requires distributed training. After some training attempts, the ML engineer observes that the instances are not performing as expected. The ML engineer identifies communication overhead between the training instances. What should the ML engineer do to MINIMIZE the communication overhead between the instances?

Options

  • A. Place the instances in the same VPC subnet. Store the data in a different AWS Region from
  • B. Place the instances in the same VPC subnet but in different Availability Zones. Store the data in a
  • C. Place the instances in the same VPC subnet. Store the data in the same AWS Region and
  • D. Place the instances in the same VPC subnet. Store the data in the same AWS Region but in a

Explanation

Option C is correct because placing all training instances in the same VPC subnet keeps them in a single Availability Zone, so instance-to-instance traffic avoids inter-AZ and inter-Region hops. Storing the training data in the same AWS Region (and ideally the same Availability Zone) avoids cross-Region transfers that would otherwise bottleneck I/O during training.
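
To make the co-location concrete, here is a minimal Python sketch (not part of the exam material) that builds a request dictionary in the shape accepted by boto3's SageMaker `create_training_job` call. It assumes a single subnet ID is passed in `VpcConfig` so that every training instance is launched in that subnet, and that the S3 URIs point to buckets created in the same Region as the job. The function name, defaults, and all argument values are hypothetical placeholders.

```python
from typing import Dict, List


def build_training_job_request(
    job_name: str,
    image_uri: str,
    role_arn: str,
    train_s3_uri: str,
    output_s3_uri: str,
    subnet_id: str,
    security_group_ids: List[str],
    instance_type: str = "ml.p4d.24xlarge",  # example instance type only
    instance_count: int = 2,
) -> Dict:
    """Build a request dict shaped like the create_training_job input.

    Exactly one subnet is listed, so all training instances share it.
    The S3 buckets are assumed to live in the same Region as the job.
    """
    if instance_count < 2:
        raise ValueError("distributed training needs at least two instances")
    return {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,
            "TrainingInputMode": "File",
        },
        "RoleArn": role_arn,
        "InputDataConfig": [
            {
                "ChannelName": "train",
                "DataSource": {
                    "S3DataSource": {
                        "S3DataType": "S3Prefix",
                        "S3Uri": train_s3_uri,
                        "S3DataDistributionType": "ShardedByS3Key",
                    }
                },
            }
        ],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        "ResourceConfig": {
            "InstanceType": instance_type,
            "InstanceCount": instance_count,
            "VolumeSizeInGB": 100,
        },
        # A single subnet keeps every instance on the same network segment.
        "VpcConfig": {
            "SecurityGroupIds": security_group_ids,
            "Subnets": [subnet_id],
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 86400},
    }
```

The resulting dictionary could then be passed as keyword arguments to `create_training_job` on a boto3 SageMaker client created for the Region that holds the data.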

Why the distractors fail:

  • A is wrong because storing data in a different AWS Region introduces high-latency, high-cost cross-region network traffic every time the instances read training batches.
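  • B is wrong because placing instances in different Availability Zones adds inter-AZ latency to every gradient synchronization message, and that cost is paid again on every training step. The option is also self-contradictory: a VPC subnet resides entirely within one Availability Zone, so instances in the same subnet cannot be in different AZs.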
  • D is wrong because even though the region matches, storing data in a separate location (e.g., a different AZ) from the compute adds unnecessary data-fetch latency compared to co-locating everything.
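
To illustrate how per-message latency compounds, the sketch below estimates only the latency term of a ring all-reduce, which takes 2 × (N − 1) sequential hops for N workers. It assumes one all-reduce per training step and ignores bandwidth entirely; real frameworks may use other collective algorithms. The function names and the latency figures in the usage note are hypothetical, not measurements.

```python
def ring_allreduce_latency_ms(num_workers: int, per_hop_latency_ms: float) -> float:
    """Latency term of one ring all-reduce: 2 * (N - 1) sequential hops."""
    if num_workers < 2:
        return 0.0  # a single worker has nothing to synchronize
    return 2 * (num_workers - 1) * per_hop_latency_ms


def added_overhead_seconds(
    num_workers: int,
    steps: int,
    same_az_ms: float,
    cross_az_ms: float,
) -> float:
    """Extra wall-clock time from the slower per-hop latency, over all steps."""
    extra_per_step_ms = (
        ring_allreduce_latency_ms(num_workers, cross_az_ms)
        - ring_allreduce_latency_ms(num_workers, same_az_ms)
    )
    return steps * extra_per_step_ms / 1000.0
```

With 8 workers, 100,000 steps, and illustrative per-hop latencies of 0.1 ms within an AZ versus 1.0 ms across AZs, `added_overhead_seconds(8, 100_000, 0.1, 1.0)` comes to roughly 1,260 seconds of added synchronization time.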

Memory tip: Think "colocation wins": for distributed training, minimize every hop. Same subnet = fastest instance-to-instance communication; same Region and AZ for data = fastest data loading. Any time traffic crosses an AZ or Region boundary, you pay in latency.

Topics

#Distributed Training #Amazon SageMaker #Network Optimization #Data Locality

