CERTIFIED-MACHINE-LEARNING-PROFESSIONAL · Question #24
The correct answer is E: fs.score_batch(model_uri, spark_df).
Question
A machine learning engineer has developed a model and registered it using the FeatureStoreClient fs. The model has model URI model_uri. The engineer now needs to perform batch inference on a customer-level Spark DataFrame spark_df, but it is missing a few of the static features that were used when training the model. The customer_id column is the primary key of both spark_df and the training set used when training and logging the model. Which of the following code blocks can be used to compute predictions for spark_df when the missing feature values can be found in the Feature Store by searching for features by customer_id?
Options
- A. df = fs.get_missing_features(spark_df, model_uri)
- B. fs.score_model(model_uri, spark_df)
- C. df = fs.get_missing_features(spark_df, model_uri)
- D. fs.score_batch(model_uri, df)
- E. fs.score_batch(model_uri, spark_df)
Explanation
fs.score_batch(model_uri, spark_df) is the correct call for Feature Store batch inference. It takes the model URI and a DataFrame that contains the primary key (customer_id), looks up the missing features in the Feature Store by that key, and returns a new DataFrame with a prediction column appended. This works because the model was logged through the FeatureStoreClient, so the feature lookup information is packaged with the model.
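The lookup-then-predict behaviour described above can be illustrated with a small plain-Python sketch. This is not the Databricks API: FEATURE_TABLE, toy_model, score_batch_sketch, and the feature names (tenure_months, plan_tier, recent_purchases) are all hypothetical stand-ins, and lists of dicts replace Spark DataFrames so the example runs anywhere.

```python
# Plain-Python sketch of "look up by key, then predict". Every name here is
# hypothetical; none of this is the Databricks Feature Store API.

# Stand-in for a feature table keyed by customer_id (the static features).
FEATURE_TABLE = {
    1: {"tenure_months": 24, "plan_tier": 2},
    2: {"tenure_months": 3, "plan_tier": 1},
}


def toy_model(features):
    """Hypothetical model: a fixed integer score over three features."""
    return (
        features["tenure_months"]
        + 10 * features["plan_tier"]
        + features["recent_purchases"]
    )


def score_batch_sketch(rows, feature_table, model, key="customer_id"):
    """Join stored features onto each row by primary key, then add a prediction."""
    scored = []
    for row in rows:
        looked_up = feature_table[row[key]]  # lookup by primary key
        features = {**looked_up, **row}      # input columns win on a name clash
        scored.append({**features, "prediction": model(features)})
    return scored


# Input batch: has the key and one feature, but lacks the static features.
batch = [
    {"customer_id": 1, "recent_purchases": 5},
    {"customer_id": 2, "recent_purchases": 0},
]

scored = score_batch_sketch(batch, FEATURE_TABLE, toy_model)
for row in scored:
    print(row)
```

The caller supplies only the key and whatever columns it already has; the join and the prediction happen inside one call, which is the shape of option E.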
Options A and C both call fs.get_missing_features(...), which is not a FeatureStoreClient method: score_batch handles feature retrieval and scoring together, not as separate steps. Option B uses fs.score_model(...), which does not exist in the Databricks Feature Store API either. Calling either nonexistent method would raise an AttributeError. Option D uses the correct method (score_batch) but passes df, a variable that is never defined in the option as shown (it would only exist as the result of the nonexistent get_missing_features call), so it would raise a NameError at runtime.
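The way the incorrect options fail can be reproduced with a stub, sketched below. StubClient is a hypothetical stand-in that exposes only score_batch, following the explanation above; it is not the real FeatureStoreClient, and the string arguments are placeholders for the model URI and DataFrame.

```python
class StubClient:
    """Hypothetical client exposing only score_batch, as the explanation assumes."""

    def score_batch(self, model_uri, df):
        # Placeholder: a real client would join features and add predictions.
        return df


fs = StubClient()


def outcome(call):
    """Run a zero-argument callable and report the exception type, or 'ok'."""
    try:
        call()
    except Exception as exc:
        return type(exc).__name__
    return "ok"


# Options A and C: the method does not exist on the client.
option_a = outcome(lambda: fs.get_missing_features("spark_df", "model_uri"))
# Option B: same problem with a different method name.
option_b = outcome(lambda: fs.score_model("model_uri", "spark_df"))
# Option D: right method, but the name df was never assigned.
option_d = outcome(lambda: fs.score_batch("model_uri", df))
# Option E: right method, defined argument.
option_e = outcome(lambda: fs.score_batch("model_uri", "spark_df"))

print(option_a, option_b, option_d, option_e)
```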
Memory tip: Think of score_batch as "one call does it all" - it handles feature lookup and scoring in a single step. If you see any code that tries to retrieve missing features as a separate step before scoring, that pattern doesn't match how the Databricks Feature Store API actually works.