nerdexam
Databricks

CERTIFIED-DATA-ANALYST-ASSOCIATE Question #84: Real Exam Question with Answer & Explanation

The correct answer is A: MERGE INTO suppliers.

Question

An analyst has been asked to combine the data in two tables: suppliers and new_suppliers. It's possible that some of the supplier_id values match in both tables, meaning that those particular suppliers have already been added to the suppliers table. If that is the case, the data should be unchanged. Which command will combine the two tables without duplicating the rows with the same supplier_id?

Options

  • A. MERGE INTO suppliers
  • B. COPY INTO suppliers
  • D. INSERT INTO suppliers

Explanation

MERGE INTO is correct because it can perform a conditional insert: it matches source rows to target rows on a join key (here supplier_id) and, when given only a WHEN NOT MATCHED THEN INSERT clause, inserts the rows that have no match while leaving existing rows untouched. COPY INTO is designed for bulk-loading data from external files (such as CSVs or staged files) into a table, not for combining two existing tables. INSERT INTO appends every row from the source without checking for duplicates, so any supplier present in both tables would end up with duplicate supplier_id entries. As a memory tip, think of MERGE as a "smart insert": for each source row it asks whether a match exists, then skips or updates the row if it does and inserts it if it does not, whereas INSERT simply appends everything without looking.
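
The question shows only the first words of each option, so the full statement below is an illustrative sketch rather than the exam's exact wording. It assumes that suppliers is a Delta table and that new_suppliers has the same columns; the aliases t and s are hypothetical names chosen for readability.

```sql
-- Insert only the suppliers that are not already in the target table.
-- Assumes suppliers and new_suppliers share the same columns.
-- There is no WHEN MATCHED clause, so rows whose supplier_id already
-- exists in suppliers are left unchanged.
MERGE INTO suppliers AS t
USING new_suppliers AS s
  ON t.supplier_id = s.supplier_id
WHEN NOT MATCHED THEN
  INSERT *;
```

By contrast, a plain INSERT INTO suppliers SELECT * FROM new_suppliers would append every source row, including suppliers that are already present.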
