PROFESSIONAL-MACHINE-LEARNING-ENGINEER · Question #56
Question
You developed an ML model with AI Platform, and you want to move it to production. You serve a few thousand queries per second and are experiencing latency issues. Incoming requests are served by a load balancer that distributes them across multiple Kubeflow CPU-only pods running on Google Kubernetes Engine (GKE). Your goal is to improve the serving latency without changing the underlying infrastructure. What should you do?
Options
- A. Significantly increase the max_batch_size TensorFlow Serving parameter.
- B. Switch to the tensorflow-model-server-universal version of TensorFlow Serving.
- C. Significantly increase the max_enqueued_batches TensorFlow Serving parameter.
- D. Recompile TensorFlow Serving using the source to support CPU-specific optimizations.