Submitted by naveen.iyer · Apr 18, 2026 · ML pipeline operationalization

Question

You work as an ML researcher at an investment bank, and you are experimenting with the Gemma large language model (LLM). You plan to deploy the model for an internal use case. You need to have full control of the model's underlying infrastructure and minimize the model's inference time. Which serving configuration should you use for this task?

Options

A. Deploy the model on a Vertex AI endpoint manually by creating a custom inference container.
B. Deploy the model on a Google Kubernetes Engine (GKE) cluster by using the deployment options
C. Deploy the model on a Vertex AI endpoint by using one-click deployment in Model Garden.
D. Deploy the model on a Google Kubernetes Engine (GKE) cluster manually by creating a custom

Topics

#ML Model Deployment · #LLM Serving · #Google Kubernetes Engine (GKE) · #Infrastructure Control