Exam MLA-C01 Topic 3 Question 32 Discussion
Actual exam question for Amazon's MLA-C01 exam
Question #: 32
Topic #: 3
An ML engineer is configuring auto scaling for an inference component of a model that runs behind an Amazon SageMaker AI endpoint. The ML engineer configures SageMaker AI auto scaling with a target tracking scaling policy set to 100 invocations per model per minute. The SageMaker AI endpoint scales appropriately during normal business hours. However, the ML engineer notices that at the start of each business day, there are zero instances available to handle requests, which causes delays in processing.
The ML engineer must ensure that the SageMaker AI endpoint can handle incoming requests at the start of each business day.
Which solution will meet this requirement?
Suggested Answer: D
This issue occurs because the inference component is allowed to scale in to zero, and a target tracking policy does not scale it back out from zero on its own: with nothing running, the tracked invocations metric has no data to act on. At the start of the business day, the first requests therefore find no capacity and are delayed.
The AWS guidance on scaling a SageMaker AI endpoint to zero pairs the target tracking policy with a step scaling policy that a CloudWatch alarm triggers when requests arrive and no capacity is available. Where traffic follows a predictable pattern, as business hours do here, capacity can also be raised ahead of time with scheduled scaling.
Option D correctly uses Amazon CloudWatch alarms with step scaling to bring the endpoint from zero to one instance at the start of the business day. Once capacity is above zero again, the target tracking policy resumes normal scaling.
Options A, B, and C do not restore capacity once the endpoint has scaled in to zero. Cooldown tuning and metric changes only alter how target tracking reacts to load that it can already measure.
Therefore, step scaling triggered by CloudWatch alarms is the correct solution.
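For reference, here is a minimal Python sketch of the step scaling plus CloudWatch alarm setup, written as plain dictionaries of request arguments for boto3. It is an illustration under assumptions, not the exam's official solution: the component name, policy name, and alarm name are hypothetical, the alarm settings (NoCapacityInvocationFailures, 60-second period, threshold of 1) follow the pattern in the AWS scale-to-zero guidance as I understand it, and the inference component is assumed to be already registered as a scalable target with a minimum capacity of 0.

```python
"""Sketch: scale a SageMaker AI inference component out from zero copies."""

# Hypothetical names, chosen for illustration only.
INFERENCE_COMPONENT = "my-inference-component"
RESOURCE_ID = f"inference-component/{INFERENCE_COMPONENT}"
DIMENSION = "sagemaker:inference-component:DesiredCopyCount"


def step_policy_params(resource_id: str = RESOURCE_ID) -> dict:
    """Arguments for Application Auto Scaling put_scaling_policy."""
    return {
        "PolicyName": "scale-out-from-zero",  # hypothetical name
        "ServiceNamespace": "sagemaker",
        "ResourceId": resource_id,
        "ScalableDimension": DIMENSION,
        "PolicyType": "StepScaling",
        "StepScalingPolicyConfiguration": {
            "AdjustmentType": "ChangeInCapacity",
            "MetricAggregationType": "Maximum",
            "Cooldown": 60,
            # Add one copy whenever the alarm metric breaches its threshold.
            "StepAdjustments": [
                {"MetricIntervalLowerBound": 0, "ScalingAdjustment": 1}
            ],
        },
    }


def alarm_params(policy_arn: str, component: str = INFERENCE_COMPONENT) -> dict:
    """Arguments for CloudWatch put_metric_alarm."""
    return {
        "AlarmName": "no-capacity-invocation-failures",  # hypothetical name
        "Namespace": "AWS/SageMaker",
        "MetricName": "NoCapacityInvocationFailures",
        "Dimensions": [{"Name": "InferenceComponentName", "Value": component}],
        "Statistic": "Maximum",
        "Period": 60,
        "EvaluationPeriods": 1,
        "DatapointsToAlarm": 1,
        "Threshold": 1,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "TreatMissingData": "missing",
        # The alarm invokes the step scaling policy when it fires.
        "AlarmActions": [policy_arn],
    }


def apply() -> None:
    """Create the policy and the alarm. Needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the builders above work without it

    autoscaling = boto3.client("application-autoscaling")
    cloudwatch = boto3.client("cloudwatch")
    response = autoscaling.put_scaling_policy(**step_policy_params())
    cloudwatch.put_metric_alarm(**alarm_params(response["PolicyARN"]))
```

Calling apply() would create both resources in the account and Region of the active credentials. Keeping the arguments in separate builder functions makes it possible to inspect or adjust them before anything is sent to AWS.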
by Zona at Apr 03, 2026, 03:55 AM