Databricks

CERTIFIED-DATA-ENGINEER-PROFESSIONAL Question #104: Real Exam Question with Answer & Explanation

ML Model Integration in Spark Data Pipelines

Question

The data science team has created and logged a production model using MLflow. The model accepts a list of column names and returns a new column of type DOUBLE. The following code correctly imports the production model, loads the customers table containing the customer_id key column into a DataFrame, and defines the feature columns needed for the model. Which code block will output a DataFrame with the schema "customer_id LONG, predictions DOUBLE"?

Options

  • A. model.predict(df, columns)
  • B. df.map(lambda x:model(x[columns])).select("customer_id, predictions")
  • C. df.select("customer_id", model(*columns).alias("predictions"))
  • D. df.apply(model, columns).select("customer_id, predictions")
  • E. df.select("customer_id", pandas_udf(model, columns).alias("predictions"))
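
The setup code the question refers to is not reproduced on this page. The sketch below shows one plausible shape for it and for applying the model, assuming the model was wrapped as a Spark UDF with mlflow.pyfunc.spark_udf (the question does not say how it was loaded). The function names load_inputs and add_predictions, the model URI argument, and the feature names are hypothetical and used only for illustration.

```python
def load_inputs(spark, model_uri, table_name="customers"):
    """Hypothetical setup: wrap the MLflow model as a Spark UDF and load the table."""
    # Imported lazily so this module can be loaded without MLflow installed.
    import mlflow.pyfunc

    # Assumption: the logged model is exposed as a UDF that returns DOUBLE.
    model = mlflow.pyfunc.spark_udf(spark, model_uri=model_uri, result_type="double")
    df = spark.table(table_name)
    columns = ["feature_a", "feature_b"]  # hypothetical feature column names
    return model, df, columns


def add_predictions(df, model, columns, key="customer_id"):
    """Return a DataFrame holding only the key column and the model output."""
    # Calling a Spark UDF with column names yields a Column expression.
    # alias() names that column, and select() keeps just the two columns.
    return df.select(key, model(*columns).alias("predictions"))
```

With a live SparkSession, add_predictions(df, model, columns) would be expected to yield two columns, customer_id and predictions. The type of customer_id comes from the table itself, and predictions is DOUBLE because of the UDF's declared result type.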

Topics

#PySpark DataFrames  #MLflow  #UDFs  #Model Inference