SOL-C01 Question #158: Real Exam Question with Answer & Explanation

The correct answer is B: CREATE TABLE customers AS SELECT UPPER(customer_name) AS customer_name, TRIM(customer_address) AS customer_address FROM (SELECT *, ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY created_date DESC) AS rn FROM staging_customers) WHERE rn = 1;

Data Loading and Unloading

Question

You are designing a data warehouse in Snowflake and need to load data from various sources. You have a table named 'staging_customers' that contains raw customer data. You want to create a new table named 'customers' that contains cleansed and transformed data from the 'staging_customers' table. You need to perform the following transformations: 1) Convert 'customer_name' to uppercase. 2) Remove leading and trailing spaces from 'customer_address'. 3) Handle potential duplicate records based on 'customer_id' by inserting only the latest record (assuming 'load_date' indicates the load timestamp). Which of the following approaches, using a combination of CTAS (CREATE TABLE AS SELECT) and other Snowflake features, is the MOST efficient and recommended way to achieve this?

Options

  • A. CREATE TABLE customers AS SELECT UPPER(customer_name) AS customer_name, TRIM(customer_address) AS customer_address FROM staging_customers WHERE ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY created_date DESC) = 1;
  • B. CREATE TABLE customers AS SELECT UPPER(customer_name) AS customer_name, TRIM(customer_address) AS customer_address FROM (SELECT *, ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY created_date DESC) AS rn FROM staging_customers) WHERE rn = 1;
  • C. CREATE TABLE customers AS SELECT UPPER(customer_name) AS customer_name, TRIM(customer_address) AS customer_address FROM staging_customers GROUP BY customer_id;
  • D. CREATE TABLE customers AS SELECT DISTINCT UPPER(customer_name) AS customer_name, TRIM(customer_address) AS customer_address FROM staging_customers;

Explanation

Option B is correct. It computes ROW_NUMBER() in a subquery, partitioned by customer_id and ordered by created_date descending, then keeps only the rows where rn = 1, so each customer_id contributes its most recent row. The UPPER and TRIM transformations run in the same CTAS statement, so the table is created, cleansed, and deduplicated in a single step. Note that the options order by created_date while the question names load_date as the load timestamp; the ORDER BY column should be whichever one records recency.

Option A is invalid because a window function cannot appear in a WHERE clause. Snowflake's QUALIFY clause exists for this purpose and lets the same filter be written without a subquery. Option C fails because customer_name and customer_address are neither aggregated nor listed in the GROUP BY, and even with aggregates, GROUP BY does not guarantee that the returned values all come from the latest record for that customer_id. Option D relies on DISTINCT, which removes only rows whose transformed name and address are identical; it ignores customer_id and recency, so a customer whose address changed would still appear more than once.

The source explanation also refers to an Option E that creates the table first and then deletes duplicates. That option is not among those listed here; such a two-step approach is less efficient than doing the work in a single CTAS statement.
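The explanation mentions QUALIFY. As a minimal sketch, assuming the same table and column names as the listed options, the same deduplication could be written without a subquery as follows. The created_date column is taken from the options; if load_date is the column that records recency in your staging table, use that instead.

```sql
-- Sketch only: same source table and output columns as option B.
-- created_date comes from the listed options; swap in load_date if that
-- is the column that records recency.
CREATE TABLE customers AS
SELECT
    UPPER(customer_name)   AS customer_name,
    TRIM(customer_address) AS customer_address
FROM staging_customers
-- QUALIFY filters on a window function result,
-- much as HAVING filters on an aggregate.
QUALIFY ROW_NUMBER() OVER (
    PARTITION BY customer_id
    ORDER BY created_date DESC
) = 1;
```

If two rows for the same customer_id share the same timestamp, ROW_NUMBER() picks one of them arbitrarily. Adding a second ORDER BY column as a tiebreaker makes the result deterministic.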

Topics

#CTAS #Data Transformation #Deduplication #Window Functions
