nerdexam
Databricks

CERTIFIED-MACHINE-LEARNING-PROFESSIONAL · Question #42

CERTIFIED-MACHINE-LEARNING-PROFESSIONAL Question #42: Real Exam Question with Answer & Explanation

The correct answer is D: spark.read.table(path).drop("star_rating").

Question

A data scientist wants to remove the star_rating column from the Delta table at the location path. To do this, they need to load in data and drop the star_rating column. Which of the following code blocks accomplishes this task?

Options

  • A. spark.read.format("delta").load(path).drop("star_rating")
  • B. spark.read.format("delta").table(path).drop("star_rating")
  • C. Delta tables cannot be modified
  • D. spark.read.table(path).drop("star_rating")
  • E. spark.sql("SELECT * EXCEPT star_rating FROM path")

Explanation

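Option D chains spark.read.table(path), the DataFrameReader method for reading a table registered in the metastore, with .drop("star_rating"), which returns a new DataFrame without that column. Note that table() expects a table identifier rather than a file-system location, so this answer treats path as the table's registered name. Also note that drop() only changes the DataFrame in memory; the stored Delta table keeps the column until the result is written back.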
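The two ways of loading the data can be sketched side by side. This is a minimal illustration, not part of the exam material: the helper names drop_column_from_table and drop_column_from_location are hypothetical, and spark is assumed to be an existing SparkSession with Delta Lake available.

```python
# Hypothetical helpers contrasting the two read paths discussed above.
# `spark` is assumed to be an active SparkSession with Delta Lake configured.

def drop_column_from_table(spark, table_name, column="star_rating"):
    """Read a table registered in the metastore by name, then drop a column."""
    # DataFrameReader.table() resolves a table identifier, not a file path.
    return spark.read.table(table_name).drop(column)


def drop_column_from_location(spark, location, column="star_rating"):
    """Read a Delta table from its storage location, then drop a column."""
    # DataFrameReader.load() takes a path; format("delta") selects the reader.
    return spark.read.format("delta").load(location).drop(column)
```

Both helpers return a new DataFrame and leave the stored table unchanged. Which one applies depends on whether the value at hand is a registered table name or a storage path.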

Why each distractor fails:

  • A uses spark.read.format("delta").load(path), which reads a Delta table directly from a storage location. That is the right call when path is a file-system path, but load() does not resolve a table name registered in the metastore, which is how the answer key interprets path.
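  • B chains .table() after .format("delta"). This is not a syntax error, since format() returns the same DataFrameReader and .table() is still available on it, but the explicit format is unnecessary for a table registered in the metastore, so the answer key prefers the plain form in D.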
  • C is factually wrong - Delta tables support schema evolution and can be modified.
  • E embeds path as a literal identifier in SQL (FROM path), so it looks for a table literally named "path" rather than using the value of the Python variable. In addition, the Databricks SQL form is SELECT * EXCEPT (star_rating), with the column list in parentheses; the unparenthesized version shown is not valid.
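A corrected version of the SQL in option E would interpolate the table name and parenthesize the excluded column. The sketch below only builds the query string; the function name select_except_sql and the table name reviews are hypothetical, and the EXCEPT clause is assumed to run on a runtime that supports the Databricks SQL star-except syntax.

```python
# Hypothetical helper: build a SELECT that excludes one column.
# Assumes the target runtime supports `SELECT * EXCEPT (col)`.

def select_except_sql(table_name, column="star_rating"):
    """Return a query string that selects every column except `column`."""
    # The f-string inserts the table name held in the Python variable,
    # instead of the literal word "path" used in option E.
    return f"SELECT * EXCEPT ({column}) FROM {table_name}"


query = select_except_sql("reviews")
print(query)
```

The resulting string would then be passed to spark.sql(query). Table and column names are inserted without escaping here, so this form is only suitable for trusted identifiers.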

Memory tip: Think of it as "table for tables, load for files" - use spark.read.table() for registered/metastore tables and spark.read.format("delta").load() only when pointing at raw file-system paths.
