Databricks
CERTIFIED-DATA-ENGINEER-PROFESSIONAL Question #62: Real Exam Question with Answer & Explanation
Optimizing Spark Applications
Question
Which statement describes the correct use of pyspark.sql.functions.broadcast?
Options
- A. It marks a column as having low enough cardinality to properly map distinct values to available
- B. It marks a column as small enough to store in memory on all executors, allowing a broadcast join.
- C. It caches a copy of the indicated table on attached storage volumes for all active clusters within a
- D. It marks a DataFrame as small enough to store in memory on all executors, allowing a broadcast
- E. It caches a copy of the indicated table on all nodes in the cluster for use in all future queries
Topics
#PySpark, #Broadcast Join, #Spark Optimization, #Spark DataFrames