

SOL-C01 Question #237: Real Exam Question with Answer & Explanation

The correct answer is A: Use external tables to access data stored in cloud storage (e.g., AWS S3) in its native format,

Snowflake Overview and Architecture

Question

You are designing a data lake solution in Snowflake that requires storing and processing both structured and semi-structured data from various sources. The data lake will be used for ad-hoc querying, data science, and reporting. Which of the following combinations of Snowflake features and practices would be MOST appropriate for building a scalable, performant, and cost-effective data lake?

Options

  • A. Use external tables to access data stored in cloud storage (e.g., AWS S3) in its native format,
  • B. Load all data into Snowflake internal tables in a raw format (e.g., JSON or Parquet), and create
  • C. Store all data in Snowflake internal tables in a fully normalized relational format, creating indexes
  • D. Use only external tables to access data in cloud storage, avoiding any data loading into Snowflake
  • E. Ingest all data into a single Snowflake database, and use stored procedures to perform all data

Explanation

Options A and B are the most appropriate practices.

Option A: External tables let you query data in cloud storage directly, without loading it into Snowflake, which is cost-effective for large volumes of infrequently accessed data. Combining them with materialized views can significantly improve the performance of frequently executed queries, because the results are pre-computed and stored.

Option B: Loading data in its raw format into internal tables, and separating it into raw, curated, and transformed schemas as applicable, helps manage the data effectively.

Option C is not ideal because a fully normalized relational format does not suit all types of data, especially semi-structured data. Creating indexes on all columns would also be inefficient and costly.

Option D may lead to performance bottlenecks, because every query would incur the overhead of reading data directly from cloud storage.

Option E would lead to poor organization and maintainability.
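The pattern in Option A can be sketched as two DDL statements: an external table over files in a stage, and a materialized view that pre-computes a frequent aggregate on top of it. The following is a minimal illustration only. It builds the SQL as Python strings and does not connect to Snowflake, and every object name in it (`raw.events_ext`, `raw.lake_stage`, `curated.daily_event_counts`, and the `event_date` and `event_type` fields) is hypothetical and not taken from the question. It also assumes the stage already exists and that the account's edition supports materialized views.

```python
from textwrap import dedent


def external_table_ddl(table: str, stage_path: str) -> str:
    """Return DDL for an external table over Parquet files in a stage.

    The column names are hypothetical; each one is a virtual column
    derived from the VALUE variant that external tables expose.
    """
    return dedent(f"""\
        CREATE OR REPLACE EXTERNAL TABLE {table} (
          event_date DATE AS (TO_DATE(VALUE:event_date::STRING)),
          event_type STRING AS (VALUE:event_type::STRING)
        )
        LOCATION = @{stage_path}
        FILE_FORMAT = (TYPE = PARQUET);
        """)


def materialized_view_ddl(view: str, source_table: str) -> str:
    """Return DDL for a materialized view that pre-aggregates the source."""
    return dedent(f"""\
        CREATE OR REPLACE MATERIALIZED VIEW {view} AS
        SELECT event_date, event_type, COUNT(*) AS event_count
        FROM {source_table}
        GROUP BY event_date, event_type;
        """)


if __name__ == "__main__":
    # Hypothetical names: a "raw" schema for the external table and a
    # "curated" schema for the pre-computed aggregate.
    print(external_table_ddl("raw.events_ext", "raw.lake_stage/events/"))
    print(materialized_view_ddl("curated.daily_event_counts", "raw.events_ext"))
```

Placing the external table in a raw schema and the materialized view in a curated schema mirrors the raw/curated/transformed separation mentioned for Option B. Frequently run reports would then read the materialized view, while ad-hoc queries could still go to the external table.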

Topics

#Data Lake Design #External Tables #Internal Tables #Semi-structured Data
