CERTIFIED-DATA-ENGINEER-PROFESSIONAL · Question #98
Question
A junior data engineer is working to implement logic for a Lakehouse table named silver_device_recordings. The source data contains 100 unique fields in a highly nested JSON structure. The silver_device_recordings table will be used downstream for highly selective joins on a number of fields, and will also be leveraged by the machine learning team to filter on a handful of relevant fields. In total, 15 fields have been identified that will often be used for filter and join logic. The data engineer is trying to determine the best approach for dealing with these nested fields before declaring the table schema. Which of the following accurately presents information about Delta Lake and Databricks that may impact their decision-making process?
Options
- A. Because Delta Lake uses Parquet for data storage, Dremel encoding information for nesting can
- B. Tungsten encoding used by Databricks is optimized for storing string data: newly-added native
- C. Schema inference and evolution on Databricks ensure that inferred types will always accurately
- D. By default Delta Lake collects statistics on the first 32 columns in a table; these statistics are
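Option D mentions that Delta Lake by default collects statistics on the first 32 columns of a table. To make that concrete, here is a minimal sketch in plain Python (no Spark) of one way to think about the schema: flatten the nested record and move the frequently filtered and joined fields to the front so they fall inside that 32-column window. The record, the field names, and the helpers `flatten` and `order_columns` are all hypothetical and stand in for the 100-field source and the 15 identified fields.

```python
from typing import Any

# Default number of leading columns with statistics, as stated in option D.
STATS_WINDOW = 32


def flatten(record: dict[str, Any], prefix: str = "") -> dict[str, Any]:
    """Flatten nested dicts into one level with underscore-joined names."""
    flat: dict[str, Any] = {}
    for key, value in record.items():
        name = f"{prefix}_{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


def order_columns(all_columns: list[str], hot_columns: list[str]) -> list[str]:
    """Put frequently filtered/joined columns first; keep the rest in order."""
    missing = [c for c in hot_columns if c not in all_columns]
    if missing:
        raise ValueError(f"unknown columns: {missing}")
    hot = set(hot_columns)
    return hot_columns + [c for c in all_columns if c not in hot]


# Hypothetical nested record standing in for the 100-field source JSON.
record = {
    "payload": {"notes": "ok", "reading": {"temp": 21.5, "unit": "C"}},
    "device": {"id": "d-1", "site": {"region": "eu"}},
    "ts": "2024-01-01T00:00:00Z",
}

flat = flatten(record)

# Hypothetical filter/join fields (the scenario has 15 of these).
hot_columns = ["device_id", "device_site_region", "ts"]
ordered = order_columns(list(flat), hot_columns)

# Every hot column should sit inside the default statistics window.
assert all(ordered.index(c) < STATS_WINDOW for c in hot_columns)
```

The resulting `ordered` list could then be used as the column order when declaring the table schema. On Delta Lake the size of the statistics window is controlled by the `delta.dataSkippingNumIndexedCols` table property, so reordering columns is only one of the options available.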