Deployment and Orchestration of ML Workflows

Question

An ML engineer is configuring auto scaling for an inference component of a model that runs behind an Amazon SageMaker AI endpoint. The ML engineer configures SageMaker AI auto scaling with a target tracking scaling policy set to 100 invocations per model per minute. The SageMaker AI endpoint scales appropriately during normal business hours. However, the ML engineer notices that at the start of each business day, there are zero instances available to handle requests, which causes delays in processing. The ML engineer must ensure that the SageMaker AI endpoint can handle incoming requests at the start of each business day. Which solution will meet this requirement?

Options

A. Reduce the SageMaker AI auto scaling cooldown period to the minimum supported value. Add an
B. Change the target metric to CPU utilization.
C. Modify the scaling policy target value to one.
D. Apply a step scaling policy that scales based on an Amazon CloudWatch alarm. Apply a second
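
For context, the setup described in the question stem (not any of the answer options) can be sketched as request parameters for Application Auto Scaling. This is a minimal sketch under stated assumptions: the helper names and the component name "my-component" are hypothetical, a minimum capacity of zero copies is inferred from the stem's "zero instances available", and the stem's "invocations per model" is assumed to correspond to the predefined metric SageMakerInferenceComponentInvocationsPerCopy. The functions only build dictionaries and do not call AWS.

```python
from typing import Any, Dict

# Identifiers Application Auto Scaling uses for SageMaker inference components.
NAMESPACE = "sagemaker"
DIMENSION = "sagemaker:inference-component:DesiredCopyCount"


def build_scalable_target(component_name: str, min_copies: int, max_copies: int) -> Dict[str, Any]:
    """Hypothetical helper: keyword arguments for register_scalable_target."""
    return {
        "ServiceNamespace": NAMESPACE,
        "ResourceId": f"inference-component/{component_name}",
        "ScalableDimension": DIMENSION,
        "MinCapacity": min_copies,  # 0 is an assumption inferred from the stem
        "MaxCapacity": max_copies,
    }


def build_target_tracking_policy(component_name: str, target_value: float = 100.0) -> Dict[str, Any]:
    """Hypothetical helper: keyword arguments for put_scaling_policy."""
    return {
        "PolicyName": f"{component_name}-target-tracking",
        "ServiceNamespace": NAMESPACE,
        "ResourceId": f"inference-component/{component_name}",
        "ScalableDimension": DIMENSION,
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "PredefinedMetricSpecification": {
                # Assumed mapping of the stem's "invocations per model".
                "PredefinedMetricType": "SageMakerInferenceComponentInvocationsPerCopy"
            },
            "TargetValue": target_value,  # 100 invocations, as stated in the stem
        },
    }


target = build_scalable_target("my-component", min_copies=0, max_copies=4)
policy = build_target_tracking_policy("my-component")
```

With boto3 installed and credentials configured, these dictionaries could be passed as `client.register_scalable_target(**target)` and `client.put_scaling_policy(**policy)`, where `client = boto3.client("application-autoscaling")`. The maximum of 4 copies is an arbitrary illustrative value.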


Topics

#SageMaker Endpoints #Auto Scaling Policies #Cold Start Problem #Proactive Scaling
