SOL-C01 · Question #25
SOL-C01 Question #25: Real Exam Question with Answer & Explanation
The correct answer is B: VARIANT. Query efficiency is achieved using dot notation and FLATTEN table function, but. VARIANT is the most suitable data type for storing semi-structured data like JSON in Snowflake. It automatically infers the schema and allows for flexible storage of evolving JSON structures. Dot notation and the FLATTEN table function enable efficient querying of nested fields.
Question
A data engineer needs to create a table in Snowflake to store JSON data from an external API. The API returns a nested JSON structure that is frequently updated with new fields. Which data type is MOST suitable for storing this JSON data, and what are the key considerations for querying it efficiently?
Options
- AVARCHAR with a size large enough to accommodate the largest expected JSON document.
- BVARIANT. Query efficiency is achieved using dot notation and FLATTEN table function, but
- COBJECT. Suitable if the JSON structure is strictly defined and known in advance, allowing for
- DARRAY. Best suited if the primary structure is an array of similar JSON objects, allowing for easy
- EBINARY. Suitable for storing raw JSON bytes. Requires custom conversion functions for querying,
Explanation
VARIANT is the most suitable data type for storing semi-structured data like JSON in Snowflake. It automatically infers the schema and allows for flexible storage of evolving JSON structures. Dot notation and the FLATTEN table function enable efficient querying of nested fields. While storage might be larger than VARCHAR due to internal metadata, the ease of use and query performance benefits outweigh this cost in many scenarios. Other options are less suitable because VARCHAR requires string manipulation, OBJECT needs a predefined schema, ARRAY requires an array structure, and BINARY needs custom conversion.
Topics
Community Discussion
No community discussion yet for this question.