SOL-C01 · Question #182
The correct answer is B: Create a materialized view on top of the directory table, including the 'METADATA$FILENAME' ...
Question
You have enabled directory tables on an external stage pointing to an AWS S3 bucket. The S3 bucket contains millions of small JSON files. When you query the directory table, you notice that the query performance is slow, even when filtering by 'METADATA$FILENAME'. What strategies can you implement to optimize the performance of querying the directory table?
Options
- A. Increase the virtual warehouse size used for querying the directory table.
- B. Create a materialized view on top of the directory table, including the 'METADATA$FILENAME' ...
- C. Periodically refresh the directory table using 'ALTER STAGE REFRESH;'.
- D. Create an index on the 'METADATA$FILENAME' column of the directory table.
- E. Reduce the number of files in the S3 bucket. Directory tables perform better with fewer files.
Explanation
Creating a materialized view (Option B) is the most effective way to optimize queries against a directory table. A materialized view pre-computes and stores the results of a query, so retrieval is faster, especially when the underlying table (here, the directory table) is large. Increasing the warehouse size (Option A) can help to a certain extent, but a materialized view provides a more significant performance boost. Refreshing the directory table (Option C) keeps the metadata up to date but doesn't directly improve query performance. Creating an index (Option D) is not possible on directory tables. Reducing the number of files (Option E) might help slightly, but it is often impractical and doesn't address the underlying issue of querying a large dataset. It's best to handle millions of files with proper indexing and data ...
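For reference, the refresh mentioned in Option C and a filtered directory table query can be sketched as below. This is a minimal sketch, not part of the original question: the stage name `my_json_stage` and the path prefix are hypothetical. Note that a `DIRECTORY()` query exposes the file path in the `RELATIVE_PATH` column, while `METADATA$FILENAME` is the pseudocolumn available when querying staged files or external tables, so the sketch filters on `RELATIVE_PATH`.

```sql
-- Hypothetical stage name (assumption): my_json_stage.
-- Synchronize the directory table metadata with the current bucket contents.
ALTER STAGE my_json_stage REFRESH;

-- Query the directory table, filtering on the file path.
-- The 'logs/2024/' prefix is made up for illustration.
SELECT relative_path, size, last_modified
FROM DIRECTORY(@my_json_stage)
WHERE relative_path LIKE 'logs/2024/%.json';
```

The refresh only updates the file metadata; as the explanation notes, it does not by itself make the `SELECT` faster.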