SOL-C01 · Question #127
The correct answer is C: Create a single target table with all possible columns from all CSV files, using 'SKIP_HEADER = 1'.
Question
You are tasked with loading data from a series of CSV files stored in an Amazon S3 bucket into Snowflake. The CSV files contain a header row, but some files have slight variations in the number and order of columns. You want to ensure that all relevant data is loaded correctly, even if the column order differs, and that any extra columns are ignored. Which of the following approaches is the MOST appropriate and efficient?
Options
- A. Create a separate external table for each CSV file with a different column structure.
- B. Define a single external table with a VARIANT column and use Snowflake's CSV parsing
- C. Create a single target table with all possible columns from all CSV files, using 'SKIP_HEADER = 1'
- D. Create a VIEW on top of the external table to ensure that column names are consistent across all
- E. Pre-process the CSV files to standardize the column order and names before loading them into
Explanation
Creating a single target table with all possible columns and explicitly mapping the columns in the 'COPY INTO' statement is the most appropriate approach. The explicit mapping handles variations in column order, because each column in a CSV file is assigned to a named column in the target table, and any file column left out of the mapping is simply not loaded. It is more performant than loading into a VARIANT column and does not require external preprocessing. Option A is not scalable and is difficult to maintain. Option B is suitable for schema evolution but is not recommended when the schemas are already known. Option D is not a direct approach: it requires an external table, and the COPY command needs a table, not a view, as its load target. Option E, pre-processing, helps if data consistency is a high priority, but it adds complexity to the workflow and is not part of Snowflake's functionality.
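The explicit mapping described above can be sketched as a small helper that reads one file's header row and builds the matching 'COPY INTO' statement. This is a minimal illustration, not the exam's reference solution: the column names in TARGET_COLUMNS, the table name orders, the stage name my_s3_stage, and the file name are all hypothetical, and the helper assumes the header row has already been read from the file. Because 'SKIP_HEADER = 1' makes Snowflake discard the header, the mapping uses positional references ($1, $2, ...), so a separate statement is generated for each distinct column layout.

```python
import csv
import io

# Hypothetical target schema: the union of relevant columns across all CSV files.
TARGET_COLUMNS = ["order_id", "customer_id", "amount", "order_date"]


def build_copy_statement(header_line, table, stage, file_name):
    """Build a COPY INTO statement that maps one file's columns by position.

    All identifiers are assumed to come from trusted configuration; they are
    interpolated into the SQL text without escaping.
    """
    # Parse the header row with the csv module so quoted names are handled.
    header = next(csv.reader(io.StringIO(header_line)))
    # 1-based position of each column in this particular file.
    position = {name.strip().lower(): i for i, name in enumerate(header, start=1)}

    # Keep only the target columns present in this file. Extra file columns are
    # never referenced, so they are ignored. Target columns missing from the
    # file are left out of the column list and receive their default (NULL).
    present = [col for col in TARGET_COLUMNS if col in position]
    if not present:
        raise ValueError(f"{file_name}: no known columns in header")

    column_list = ", ".join(present)
    select_list = ", ".join(f"t.${position[col]}" for col in present)
    return (
        f"COPY INTO {table} ({column_list})\n"
        f"FROM (SELECT {select_list} FROM @{stage} t)\n"
        f"FILES = ('{file_name}')\n"
        f"FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)"
    )


# Example: a file whose columns are reordered and which has an extra column.
statement = build_copy_statement(
    "amount,order_id,promo_code,customer_id",
    table="orders",
    stage="my_s3_stage",
    file_name="orders_b.csv",
)
print(statement)
```

For the example header, the generated statement loads order_id, customer_id, and amount from positions $2, $4, and $1, skips promo_code, and leaves order_date to its default.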