Submitted by olafpl · Apr 18, 2026 · Monitoring, optimizing, and maintaining ML solutions

Question #156

You have trained a DNN regressor with TensorFlow to predict housing prices using a set of predictive features. Your default precision is tf.float64, and you use a standard TensorFlow estimator: Your model performs well, but just before deploying it to production, you discover that your current serving latency is 10 ms at the 90th percentile, and you currently serve on CPUs. Your production requirements expect a model latency of 8 ms at the 90th percentile. You're willing to accept a small decrease in performance in order to reach the latency requirement. Therefore, your plan is to improve latency while evaluating how much the model's prediction performance decreases. What should you try first to quickly lower the serving latency?
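The latency figures in the question are 90th-percentile values, so any candidate change has to be compared on that statistic and not on the mean. A minimal sketch of one way to measure it is shown here, using only the Python standard library. The names `p90`, `p90_latency_ms`, and `dummy_predict` are hypothetical, and `dummy_predict` stands in for a real call to the served model.

```python
import statistics
import time


def p90(samples):
    """Return the 90th percentile of a list of numeric samples."""
    # quantiles(n=10) returns 9 cut points; the last one is the 90th percentile.
    return statistics.quantiles(samples, n=10)[-1]


def p90_latency_ms(predict_fn, requests, warmup=10):
    """Time predict_fn on each request and return the p90 latency in ms."""
    # Warm-up calls are made first and not timed, so one-off setup costs
    # do not distort the measurement.
    for request in requests[:warmup]:
        predict_fn(request)

    samples = []
    for request in requests:
        start = time.perf_counter()
        predict_fn(request)
        samples.append((time.perf_counter() - start) * 1000.0)
    return p90(samples)


def dummy_predict(features):
    # Hypothetical stand-in for a real model call (assumption for illustration).
    return sum(features) * 0.5


requests = [[float(i), 2.0, 3.0] for i in range(200)]
latency = p90_latency_ms(dummy_predict, requests)
print(f"p90 latency: {latency:.4f} ms (target from the question: 8 ms)")
```

Running the same measurement before and after a change gives a like-for-like comparison against the 8 ms requirement.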

Options

A. Switch from CPU to GPU serving.
B. Apply quantization to your SavedModel by reducing the floating point precision to tf.float16.
C. Increase the dropout rate to 0.8 and retrain your model.
D. Increase the dropout rate to 0.8 in _PREDICT mode by adjusting the TensorFlow Serving
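Option B trades numeric precision for speed, and the question asks you to evaluate how much is lost. The sketch below illustrates only the rounding effect of storing a float64 value as a 16-bit float, using the standard library `struct` module. It is not a SavedModel conversion, and the helper name `to_float16` and the sample values are hypothetical.

```python
import struct


def to_float16(value):
    """Round a Python float (float64) to the nearest float16 value."""
    # The 'e' format code packs an IEEE 754 half-precision (16-bit) float.
    return struct.unpack("<e", struct.pack("<e", value))[0]


# Hypothetical values chosen for illustration (for example, model weights).
values = [0.1234567, -1.9876543, 312.8]

for original in values:
    rounded = to_float16(original)
    error = abs(original - rounded)
    print(f"float64={original!r}  float16={rounded!r}  abs error={error:.6f}")
```

The absolute error grows with the magnitude of the value, which is why the effect on prediction quality has to be measured on real evaluation data.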

Topics

#Model optimization #Quantization #Inference latency #TensorFlow serving
