CERTIFIED-MACHINE-LEARNING-PROFESSIONAL · Question #25
The correct answer is E: Real-time.
Question
A machine learning engineer needs to select a deployment strategy for a new machine learning application. The feature values are not available until the time of delivery, and results are needed exceedingly fast for one record at a time. Which of the following deployment strategies can be used to meet these requirements?
Options
- A. Edge/on-device
- B. Streaming
- C. None of these strategies will meet the requirements.
- D. Batch
- E. Real-time
Explanation
Real-time (online) inference is the correct strategy because it processes individual records on-demand at the moment of prediction, which satisfies both constraints: features are computed at delivery time (not pre-computed), and latency is minimized for single-record responses.
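As a rough illustration of the on-demand pattern described above, the following Python sketch scores one record per request, using features that arrive with the request itself. The `WEIGHTS`, `BIAS`, feature names, and `handle_request` function are hypothetical stand-ins chosen for this example, not part of any real serving framework; a production deployment would place a trained model behind a serving endpoint.

```python
import math
import time

# Hypothetical stand-in for a trained model: a fixed logistic scorer.
WEIGHTS = {"amount": 0.8, "distance_km": -0.05}
BIAS = -1.0


def predict_one(features: dict) -> float:
    """Score a single record as soon as its features arrive."""
    z = BIAS + sum(WEIGHTS[name] * features[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def handle_request(features: dict) -> dict:
    """Simulate one synchronous request/response cycle for one record."""
    start = time.perf_counter()
    score = predict_one(features)  # no precomputed features are consulted
    latency_ms = (time.perf_counter() - start) * 1000.0
    return {"score": score, "latency_ms": latency_ms}


# The features only exist at delivery time, so they are passed in with the call.
response = handle_request({"amount": 2.5, "distance_km": 10.0})
print(response)
```

The point of the sketch is the shape of the call: one record in, one prediction out, with nothing computed ahead of the request.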
Why the distractors fail:
- Batch (D) processes many records together on a schedule. It requires features to be available in advance and does not deliver "exceedingly fast" results for a single record.
- Streaming (B) handles continuous data flows in near-real-time, but it is optimized for pipelines of events rather than instant single-record lookups, and it typically introduces pipeline latency.
- Edge/on-device (A) is a deployment location, not a deployment strategy in the same sense: it describes where the model runs, not how inference is triggered. It could support real-time inference, but on its own it does not answer the strategy question.
- C is wrong because real-time satisfies both requirements.
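To make the contrast between Batch and Real-time concrete, here is a minimal Python sketch under stated assumptions. The toy `model`, the `run_batch_job` and `serve_real_time` functions, and the record IDs are all hypothetical names invented for this illustration. It shows a scheduled batch job missing a record whose features did not exist when the job ran, while an on-demand call can score it.

```python
def model(features: dict) -> float:
    # Hypothetical toy model: a linear function, assumed only for illustration.
    return 2.0 * features["x"] + 1.0


def run_batch_job(feature_table: dict) -> dict:
    """Batch: score every record whose features already exist when the job runs."""
    return {record_id: model(f) for record_id, f in feature_table.items()}


def serve_real_time(features: dict) -> float:
    """Real-time: score one record using features supplied with the request."""
    return model(features)


# Features that were known when the scheduled job ran.
precomputed = {"r1": {"x": 1.0}, "r2": {"x": 2.0}}
batch_scores = run_batch_job(precomputed)

# A new record whose features only become available at delivery time.
late_record_id, late_features = "r3", {"x": 3.0}

batch_result = batch_scores.get(late_record_id)    # None: the job never saw it
real_time_result = serve_real_time(late_features)  # scored on demand

print(batch_result, real_time_result)
```

In this toy setup the batch lookup has no score for the late record, which mirrors why Batch cannot satisfy the "features not available until delivery" constraint.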
Memory tip: Think of the two constraints as a filter. "One record at a time" eliminates Batch, and "features not available until delivery" eliminates any pre-computation approach. What is left standing is Real-time inference, which waits for the live request before doing any work.