
SOL-C01 Question #120: Real Exam Question with Answer & Explanation

The correct answer is C: COPY INTO CUSTOMER_PROFILES (customer_id, first_name, last_name, address) FROM (SELECT $1:customer_id::NUMBER, TRY_CAST($1:first_name AS VARCHAR), TRY_CAST($1:last_name AS VARCHAR), $1:address FROM @my_stage/customer_data.json) FILE_FORMAT = (TYPE = 'JSON');

Data Loading and Unloading

Question

A data engineer needs to load JSON data containing customer profiles into a Snowflake table named `CUSTOMER_PROFILES`. Some JSON objects have missing fields, while others contain nested arrays. The target table `CUSTOMER_PROFILES` has columns `customer_id NUMBER`, `first_name VARCHAR`, `last_name VARCHAR`, and `address VARIANT`. Which of the following SQL statements is the MOST efficient and appropriate way to insert the data, handling potential missing fields without causing errors and allowing for future querying of the nested address data?

Options

  • A. INSERT INTO CUSTOMER_PROFILES (customer_id, first_name, last_name, address) SELECT $1:customer_id::NUMBER, $1:first_name::VARCHAR, $1:last_name::VARCHAR, $1:address::VARIANT FROM @my_stage/customer_data.json (FILE_FORMAT => 'my_json_format');
  • B. COPY INTO CUSTOMER_PROFILES FROM @my_stage/customer_data.json FILE_FORMAT = (TYPE = 'JSON') MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE;
  • C. COPY INTO CUSTOMER_PROFILES (customer_id, first_name, last_name, address) FROM (SELECT $1:customer_id::NUMBER, TRY_CAST($1:first_name AS VARCHAR), TRY_CAST($1:last_name AS VARCHAR), $1:address FROM @my_stage/customer_data.json) FILE_FORMAT = (TYPE = 'JSON');
  • D. INSERT INTO CUSTOMER_PROFILES SELECT PARSE_JSON(column1):customer_id, PARSE_JSON(column1):first_name, PARSE_JSON(column1):last_name, PARSE_JSON(column1):address FROM @my_stage/customer_data.json;

Explanation

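Option C is the best fit. It uses COPY INTO, Snowflake's bulk-loading command, with a transformation query that pulls each field out of the JSON document (`$1`) by path and maps it to an explicit column list. A path that does not exist in a given object, such as `$1:first_name` on a record with no first name, evaluates to NULL instead of raising an error, and TRY_CAST is intended to return NULL rather than fail when a value cannot be converted. `$1:address` is loaded without a cast, so it lands in the VARIANT column with its nested objects and arrays intact and stays queryable afterwards. Option A extracts the same fields, but it uses INSERT ... SELECT over the staged file instead of COPY INTO, so it bypasses the bulk-load path and its load tracking, and it depends on a named file format, `my_json_format`, already existing. Option B relies on MATCH_BY_COLUMN_NAME, which depends on the top-level JSON keys matching the table's column names and offers no per-field control over casting. Option D references `column1`, although staged file columns are addressed as `$1`, `$2`, and so on; it also specifies no file format, so the file is read as JSON only if the stage itself defines one, and it re-parses the document once per column with PARSE_JSON.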
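
The two behaviors the explanation relies on, missing paths evaluating to NULL and nested data staying queryable in a VARIANT column, can be sketched as follows. This is a minimal illustration rather than part of the exam answer: the keys inside `address` (`city`, `phone_numbers`) are hypothetical, since the question does not describe the shape of the address data.

```sql
-- Sketch only: the keys inside address (city, phone_numbers) are hypothetical.

-- 1. A path that is absent from a JSON object evaluates to NULL, not an error.
SELECT
    t.doc:customer_id::NUMBER AS customer_id,
    t.doc:first_name::VARCHAR AS first_name   -- NULL: the key is missing
FROM (
    SELECT PARSE_JSON('{"customer_id": 1, "last_name": "Lee"}') AS doc
) AS t;

-- 2. After the load, read a nested scalar from the VARIANT column by path.
SELECT customer_id,
       address:city::VARCHAR AS city
FROM CUSTOMER_PROFILES;

-- 3. Expand a nested array into one row per element.
--    OUTER => TRUE keeps customers whose address has no such array.
SELECT c.customer_id,
       p.value::VARCHAR AS phone_number
FROM CUSTOMER_PROFILES AS c,
     LATERAL FLATTEN(INPUT => c.address:phone_numbers, OUTER => TRUE) AS p;
```

Because `address` is stored as VARIANT, queries like the second and third can be written after the load without changing the table definition.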

Topics

#JSON data loading  #Semi-structured data  #VARIANT data type  #Safe JSON parsing
