CERTIFIED-DATA-ENGINEER-PROFESSIONAL · Question #8
Question
An upstream source writes Parquet data as hourly batches to directories named with the current date. A nightly batch job runs the following code to ingest all data from the previous day as indicated by the date variable:

Assume that the fields customer_id and order_id serve as a composite key to uniquely identify each order. If the upstream system is known to occasionally produce duplicate entries for a single order hours apart, which statement is correct?
Options
- A. Each write to the orders table will only contain unique records, and only those records without
- B. Each write to the orders table will only contain unique records, but newly written records may
- C. Each write to the orders table will only contain unique records; if existing records with the same
- D. Each write to the orders table will only contain unique records; if existing records with the same
- E. Each write to the orders table will run deduplication over the union of new and existing records,
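
The ingestion code the question refers to is not reproduced on this page, so the following is only a plain-Python model of one plausible reading of the scenario: a job that deduplicates each day's batch on the composite key (customer_id, order_id) and then appends it to the orders table. The helper names drop_duplicates and append_batch, the sample records, and the hour field are all hypothetical and chosen for illustration. In PySpark the within-batch step would typically be DataFrame.dropDuplicates, which only considers rows in the DataFrame it is called on.

```python
from typing import Dict, List, Sequence, Tuple

Record = Dict[str, int]


def drop_duplicates(
    batch: List[Record],
    key_fields: Sequence[str] = ("customer_id", "order_id"),
) -> List[Record]:
    """Keep the first record seen for each composite key within ONE batch."""
    seen: set = set()
    unique: List[Record] = []
    for rec in batch:
        key: Tuple[int, ...] = tuple(rec[f] for f in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


def append_batch(table: List[Record], batch: List[Record]) -> None:
    """Hypothetical append-mode write: dedupe the incoming batch, then append.

    Existing rows in `table` are never inspected, which is the assumption
    this sketch is meant to illustrate.
    """
    table.extend(drop_duplicates(batch))


orders_table: List[Record] = []

# Day 1: order (1, 100) arrives twice within the same day's directory,
# and order (2, 200) arrives late in the evening.
day1 = [
    {"customer_id": 1, "order_id": 100, "hour": 9},
    {"customer_id": 1, "order_id": 100, "hour": 14},
    {"customer_id": 2, "order_id": 200, "hour": 23},
]

# Day 2: a duplicate of order (2, 200) lands in the next day's directory.
day2 = [
    {"customer_id": 2, "order_id": 200, "hour": 1},
    {"customer_id": 3, "order_id": 300, "hour": 5},
]

append_batch(orders_table, day1)
append_batch(orders_table, day2)

keys = [(r["customer_id"], r["order_id"]) for r in orders_table]
print(keys)
```

Under these assumptions, a duplicate that falls inside a single day's batch is removed before the write, while a duplicate that straddles two nightly runs is not compared against rows already in the table. Whether the real job behaves this way depends on the code omitted from the question.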