
PROFESSIONAL-MACHINE-LEARNING-ENGINEER · Question #142



Submitted by yuriko_h · Apr 18, 2026 · Monitoring, optimizing, and maintaining ML solutions

Question

You are an ML engineer at a mobile gaming company. A data scientist on your team recently trained a TensorFlow model, and you are responsible for deploying this model into a mobile application. You discover that the inference latency of the current model doesn't meet production requirements. You need to reduce the inference time by 50%, and you are willing to accept a small decrease in model accuracy in order to reach the latency requirement. Without training a new model, which model optimization technique for reducing latency should you try first?

Options

  • A. Weight pruning
  • B. Dynamic range quantization
  • C. Model distillation
  • D. Dimensionality reduction
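
As background on option B: dynamic range quantization is a post-training technique that stores the weights of an already-trained model as 8-bit integers instead of 32-bit floats, so no retraining is involved. The sketch below illustrates only the underlying arithmetic with a symmetric per-tensor scale. It is not how TensorFlow Lite implements it internally, and the function names `quantize_int8` and `dequantize` are hypothetical helpers introduced for this illustration.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Map float weights to int8 values using one symmetric scale per tensor.

    Illustrative assumption: the scale is chosen so that the largest
    absolute weight maps to 127.
    """
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer and clamp into the int8 range used here.
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate float weights from the int8 values."""
    return [q * scale for q in quantized]


# Hypothetical weights standing in for one layer of a trained model.
weights = [0.5, -1.27, 0.0, 1.0]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)

# The rounding error per weight is bounded by half the scale, which is
# the source of the small accuracy loss the question mentions.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(quantized, scale, max_error)
```

Each weight now needs one byte instead of four, and the rounding error stays within half a quantization step, which matches the trade-off described in the question: a smaller, faster model in exchange for a small loss of precision.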


Topics

#Model optimization #Quantization #Inference latency #Mobile ML