CERTIFIED-DATA-ENGINEER-PROFESSIONAL Question #49: Real Exam Question with Answer & Explanation

Optimizing Spark Applications

Question

A user new to Databricks is trying to troubleshoot long execution times for some pipeline logic they are working on. Presently, the user is executing code cell-by-cell, using display() calls to confirm code is producing the logically correct results as new transformations are added to an operation. To get a measure of average time to execute, the user is running each cell multiple times interactively. Which of the following adjustments will get a more accurate measure of how code is likely to perform in production?

Options

A. Scala is the only language that can be accurately tested using interactive notebooks; because the
B. The only way to meaningfully troubleshoot code execution times in development notebooks is to
C. Production code development should only be done using an IDE; executing code against a local
D. Calling display() forces a job to trigger, while many transformations will only add to the logical
E. The Jobs UI should be leveraged to occasionally run the notebook as a job and track execution
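
The stem and option D both turn on lazy evaluation: a transformation only extends a plan, while an action such as display() triggers the actual work. The following is a minimal toy sketch in plain Python, not the Spark or Databricks API. The names LazyPlan, transform, collect, slow_double, and timed are all hypothetical and exist only for illustration. It shows why timing a cell that merely defines transformations says little about what the work costs.

```python
import time


class LazyPlan:
    """Toy stand-in for a lazily evaluated dataset (hypothetical, not the Spark API)."""

    def __init__(self, rows, steps=()):
        self._rows = rows
        self._steps = tuple(steps)

    def transform(self, fn):
        # A transformation only records a step in the plan; no rows are processed here.
        return LazyPlan(self._rows, self._steps + (fn,))

    def collect(self):
        # An action runs every recorded step over the data.
        out = list(self._rows)
        for fn in self._steps:
            out = [fn(row) for row in out]
        return out


def slow_double(x):
    time.sleep(0.001)  # simulate per-row work
    return x * 2


def timed(fn):
    # Return the function's result together with its wall-clock duration in seconds.
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


base = LazyPlan(range(50))

# Defining two transformations: only the plan grows, so this returns almost immediately.
planned, define_seconds = timed(lambda: base.transform(slow_double).transform(slow_double))

# Running the action: this is where the recorded steps actually execute.
rows, action_seconds = timed(planned.collect)

print(f"defining transformations: {define_seconds:.4f}s")
print(f"running the action:       {action_seconds:.4f}s")
```

In this toy model nearly all of the measured time lands in the action rather than in the cells that define transformations, which is one reason cell-by-cell timings in an interactive session can be a poor guide to end-to-end cost.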

Topics

#Spark Lazy Evaluation #Databricks Notebooks #Performance Tuning #Spark Actions
