MLA-C01 question #37

Deployment and Orchestration of ML Workflows

Question

A company has trained an ML model in Amazon SageMaker. The company needs to host the model to provide inferences in a production environment. The model must be highly available and must respond with minimum latency. The size of each request will be between 1 KB and 3 MB. The model will receive unpredictable bursts of requests during the day. The inferences must adapt proportionally to the changes in demand. How should the company deploy the model into production to meet these requirements?

Options

A. Create a SageMaker real-time inference endpoint. Configure auto scaling. Configure the endpoint
B. Deploy the model on an Amazon Elastic Container Service (Amazon ECS) cluster. Use ECS
C. Install SageMaker Operator on an Amazon Elastic Kubernetes Service (Amazon EKS) cluster.
D. Use Spot Instances with a Spot Fleet behind an Application Load Balancer (ALB) for inferences.
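
For context, option A mentions a SageMaker real-time inference endpoint with auto scaling. The following is a minimal sketch of what a target-tracking scaling setup for such an endpoint might look like through the Application Auto Scaling API. The endpoint name, variant name, capacity limits, target value, and cooldowns are all illustrative assumptions, not values from the question, and the helper functions `build_scaling_requests` and `apply_scaling` are hypothetical names introduced here.

```python
def build_scaling_requests(endpoint_name, variant_name="AllTraffic",
                           min_instances=2, max_instances=10,
                           target_invocations_per_instance=100.0):
    """Build request parameters for Application Auto Scaling.

    All default values are assumptions chosen for illustration only.
    """
    # Resource ID format used for SageMaker endpoint variants.
    resource_id = f"endpoint/{endpoint_name}/variant/{variant_name}"
    dimension = "sagemaker:variant:DesiredInstanceCount"

    # Parameters for register_scalable_target: bounds on instance count.
    target = {
        "ServiceNamespace": "sagemaker",
        "ResourceId": resource_id,
        "ScalableDimension": dimension,
        "MinCapacity": min_instances,
        "MaxCapacity": max_instances,
    }

    # Parameters for put_scaling_policy: track invocations per instance so
    # that capacity follows demand.
    policy = {
        "PolicyName": f"{endpoint_name}-invocations-target-tracking",
        "ServiceNamespace": "sagemaker",
        "ResourceId": resource_id,
        "ScalableDimension": dimension,
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": target_invocations_per_instance,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance",
            },
            "ScaleInCooldown": 300,   # seconds, assumed
            "ScaleOutCooldown": 60,   # seconds, assumed
        },
    }
    return target, policy


def apply_scaling(autoscaling_client, target, policy):
    """Send both requests using an Application Auto Scaling client."""
    autoscaling_client.register_scalable_target(**target)
    autoscaling_client.put_scaling_policy(**policy)
```

With AWS credentials configured, the client would typically be created with `boto3.client("application-autoscaling")` and passed to `apply_scaling`. The minimum of two instances is an assumption made here with the high-availability requirement in mind.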

Topics

#SageMaker Inference, #Real-time Endpoints, #Auto Scaling, #Model Deployment