nerdexam
Databricks

CERTIFIED-DATA-ENGINEER-PROFESSIONAL Question #8: Real Exam Question with Answer & Explanation

Data Ingestion and Processing

Question

An upstream source writes Parquet data as hourly batches to directories named with the current date. A nightly batch job runs the following code to ingest all data from the previous day, as indicated by the date variable:

Assume that the fields customer_id and order_id serve as a composite key to uniquely identify each order. If the upstream system is known to occasionally produce duplicate entries for a single order hours apart, which statement is correct?

Options

  • A. Each write to the orders table will only contain unique records, and only those records without
  • B. Each write to the orders table will only contain unique records, but newly written records may
  • C. Each write to the orders table will only contain unique records; if existing records with the same
  • D. Each write to the orders table will only contain unique records; if existing records with the same
  • E. Each write to the orders table will run deduplication over the union of new and existing records,
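
The code listing referred to in the question is not reproduced on this page, so the following is only a rough sketch of the scenario, not the job's actual code. It assumes, hypothetically, that the nightly job deduplicates each incoming batch on (customer_id, order_id) and then appends the result to the orders table. Plain Python lists stand in for Spark DataFrames and the Delta table, and the names dedupe_batch, nightly_append, day1, and day2 are invented for illustration.

```python
# Hypothetical stand-in for the nightly job: plain Python lists replace
# Spark DataFrames and the target table. The append-only behaviour is an
# assumption, since the original code listing is not shown.

def dedupe_batch(records):
    """Keep the first record seen for each (customer_id, order_id) pair."""
    seen = set()
    unique = []
    for rec in records:
        key = (rec["customer_id"], rec["order_id"])
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


def nightly_append(table, batch):
    """Deduplicate only within the incoming batch, then append to the table."""
    table.extend(dedupe_batch(batch))


orders = []  # stands in for the orders table

day1 = [
    {"customer_id": 1, "order_id": 10, "hour": 9},
    {"customer_id": 1, "order_id": 10, "hour": 14},  # duplicate within the same day
    {"customer_id": 2, "order_id": 20, "hour": 23},
]
day2 = [
    {"customer_id": 2, "order_id": 20, "hour": 1},   # same order, re-sent after midnight
    {"customer_id": 3, "order_id": 30, "hour": 8},
]

nightly_append(orders, day1)
nightly_append(orders, day2)

keys = [(r["customer_id"], r["order_id"]) for r in orders]
print(len(keys), len(set(keys)))  # rows in the table vs. distinct composite keys
```

Under these assumptions, each batch is free of duplicates when it is written, but nothing compares the incoming rows against rows already in the table, so an order re-sent across a day boundary ends up stored twice. Whether the real job behaves this way depends on the code that the page omits.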

Topics

#Data Ingestion, #Deduplication, #Delta Lake, #Data Quality