SOL-C01 Question #193: Real Exam Question with Answer & Explanation

The correct answer is B: Pre-process the documents to remove irrelevant sections (e.g., boilerplate text, headers, footers).

Querying and Performance

Question

A company stores unstructured text data (PDF and DOCX files) in an external stage (AWS S3). They want to use Snowflake Cortex's PARSE_DOCUMENT function to extract specific information, but are encountering performance issues and high costs. Which of the following strategies could optimize performance and reduce costs when using PARSE_DOCUMENT in this scenario?

Options

  • A. Increase the size of the virtual warehouse used for processing, even if it means paying for larger
  • B. Pre-process the documents to remove irrelevant sections (e.g., boilerplate text, headers, footers)
  • C. Utilize Snowflake's caching mechanism by storing parsed results in a separate table and
  • D. Reduce the number of documents being processed in a single batch to minimize memory
  • E. Implement a robust error handling mechanism to prevent processing from halting due to

Explanation

Option B is correct because pre-processing reduces the amount of data that PARSE_DOCUMENT needs to process. Partitioning in the external stage enables Snowflake to retrieve the relevant data more efficiently. Option C is correct because caching prevents redundant processing, together with reducing MAX_FILE_SIZE to a lower value. Option E is correct because error handling ensures that processing continues, and monitoring provides insight into resource usage. Option A, increasing the warehouse size and MAX_FILE_SIZE without other optimizations, is often a brute-force approach that does not address the root cause of the performance problems and leads to unnecessary costs. Option D, limiting batch size, can help with memory issues but does not fundamentally improve the efficiency of document parsing.
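The caching and error-handling ideas behind options C and E can be illustrated outside Snowflake. The following is a minimal Python sketch, not Snowflake code: `parse_with_cache`, `fake_parse`, and the in-memory `cache` dictionary are hypothetical names introduced here, and `fake_parse` merely stands in for an expensive parsing call such as PARSE_DOCUMENT. Each distinct document content is parsed once, and a failing document is recorded instead of halting the batch.

```python
import hashlib
from typing import Callable, Dict, List, Tuple


def parse_with_cache(
    documents: Dict[str, bytes],
    parse: Callable[[bytes], str],
    cache: Dict[str, str],
) -> Tuple[Dict[str, str], List[Tuple[str, str]]]:
    """Parse each distinct document content once and collect failures."""
    results: Dict[str, str] = {}
    failures: List[Tuple[str, str]] = []
    for name, content in documents.items():
        # Key the cache on a content hash so identical files share one parse.
        key = hashlib.sha256(content).hexdigest()
        if key not in cache:
            try:
                cache[key] = parse(content)
            except Exception as exc:
                # Record the failure and move on instead of stopping the batch.
                failures.append((name, str(exc)))
                continue
        results[name] = cache[key]
    return results, failures


# Hypothetical stand-in for an expensive parser; it records every call made.
calls: List[bytes] = []


def fake_parse(content: bytes) -> str:
    calls.append(content)
    if not content:
        raise ValueError("empty document")
    return content.decode("utf-8").upper()


docs = {"a.pdf": b"invoice 42", "b.pdf": b"invoice 42", "empty.pdf": b""}
cache: Dict[str, str] = {}
results, failures = parse_with_cache(docs, fake_parse, cache)
```

In this sketch the two identical files trigger a single parse, and the empty file ends up in `failures` while the rest of the batch completes. In the scenario from the question, the dictionary would correspond to the separate table of parsed results described in option C.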

Topics

#PARSE_DOCUMENT #Performance Tuning #Cost Optimization #Unstructured Data Processing
