CERTIFIED-MACHINE-LEARNING-PROFESSIONAL Question #46: Real Exam Question with Answer & Explanation

The answer key lists B, spark.read.format("delta").load(exp_id), as correct. Based on Databricks MLflow documentation, the key likely contains an error: option E, spark.read.format("mlflow-experiment").load(exp_id), is the correct answer.

Question

A data scientist is utilizing MLflow to track their machine learning experiments. After completing a series of runs for the experiment with experiment ID exp_id, the data scientist wants to programmatically work with the experiment run data in a Spark DataFrame. They have an active MLflow Client client and an active Spark session spark. Which of the following lines of code can be used to obtain run-level results for exp_id in a Spark DataFrame?

Options

  • A. client.list_run_infos(exp_id)
  • B. spark.read.format("delta").load(exp_id)
  • C. There is no way to programmatically return row-level results from an MLflow Experiment.
  • D. mlflow.search_runs(exp_id)
  • E. spark.read.format("mlflow-experiment").load(exp_id)

Explanation

Note on the stated correct answer: Based on Databricks MLflow documentation, this question likely contains an error - option E is actually the correct answer, not B. Here's why:

spark.read.format("mlflow-experiment").load(exp_id) is the purpose-built Databricks data source that accepts an experiment ID directly and returns run metadata as a Spark DataFrame. This is explicitly documented in the Databricks MLflow integration docs.
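A minimal sketch of option E wrapped in a helper, assuming a Databricks runtime where the "mlflow-experiment" data source is available. The function name load_experiment_runs is hypothetical and chosen only for illustration; spark and exp_id are the session and experiment ID from the question.

```python
def load_experiment_runs(spark, exp_id):
    """Return run-level data for one MLflow experiment as a Spark DataFrame.

    Assumption: this runs on Databricks, where the "mlflow-experiment"
    data source is registered. On plain open-source Spark the format
    lookup is expected to fail.
    """
    # Same call as option E: the experiment ID is passed where a path would go.
    return spark.read.format("mlflow-experiment").load(exp_id)


# Example usage in a Databricks notebook (not executed here):
# runs_df = load_experiment_runs(spark, exp_id)
# runs_df.printSchema()
```

The helper only forwards to the reader API, so the result is already a Spark DataFrame and needs no conversion step.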

Why the other options are wrong:

  • B - spark.read.format("delta").load(exp_id) reads a Delta table from a storage path, not from an experiment ID. Spark would treat the bare experiment ID string as a path, and because no Delta table exists at that location, the call would typically fail rather than return run data.
  • A - client.list_run_infos(exp_id) returns a Python list of RunInfo objects, not a Spark DataFrame.
  • D - mlflow.search_runs(exp_id) returns a pandas DataFrame by default, not a Spark DataFrame.
  • C - Clearly false; both A and D show that programmatic access exists.
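To illustrate why option D falls short by itself, here is a hedged sketch of the extra conversion step it would need. The function name search_runs_as_spark is hypothetical; the sketch assumes MLflow is installed and that the pandas result has column types Spark can infer, which may not hold for every experiment.

```python
def search_runs_as_spark(spark, exp_id):
    """Fetch runs with mlflow.search_runs, then convert them to Spark."""
    # Imported lazily so MLflow is only required when the function is called.
    import mlflow

    # experiment_ids takes a list of IDs; the default output is a pandas DataFrame.
    pandas_runs = mlflow.search_runs(experiment_ids=[exp_id])

    # This explicit conversion is the step that option D leaves out.
    return spark.createDataFrame(pandas_runs)
```

This takes two calls instead of one, so it does not match the single line the question asks for.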

Memory tip: The "mlflow-experiment" format name is self-describing - when you want MLflow data in Spark, use the format that literally says "mlflow-experiment". If your exam question says B is correct instead of E, that's likely a question bank error worth flagging to your instructor.
