
DEA-C02 Question #4: Real Exam Question with Answer & Explanation

The correct answers are B, D, and E.

Data Movement

Question

A Data Engineer needs to load JSON output from some software into Snowflake using Snowpipe. Which recommendations apply to this scenario? (Choose three.)

Options

  • A. Load large files (1 GB or larger).
  • B. Ensure that data files are 100-250 MB (or larger) in size, compressed.
  • C. Load a single huge array containing multiple records into a single table row.
  • D. Verify each value of each unique element stores a single native data type (string or number).
  • E. Extract semi-structured data elements containing null values into relational columns before
  • F. Create data files that are less than 100 MB and stage them in cloud storage at a sequence

Explanation

For optimal Snowpipe JSON ingestion, B is correct because Snowflake recommends compressed data files of roughly 100–250 MB (or larger): many small files add per-file overhead, while a few very large files limit how much of the load can run in parallel. D is correct because Snowflake extracts the elements of semi-structured data in a VARIANT column into an internal columnar form where it can, and an element whose values mix native types (e.g., a field that is sometimes a string, sometimes a number) is not extracted, so queries on it must scan the whole structure and run slower. E is correct for a related reason: an element that contains even one JSON null value is also left out of the columnar form, so extracting such elements into relational columns before loading keeps queries on them fast.
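The type-consistency and null checks behind D and E can be run on a sample of the JSON before it is staged. The following is a minimal sketch in Python, not a Snowflake API: the helper names `json_type` and `flag_problem_elements` are hypothetical, and it assumes each record is a JSON object and inspects top-level elements only.

```python
import json
from collections import defaultdict

def json_type(value):
    """Return the JSON type name of a decoded Python value."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # check bool before int: bool subclasses int
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    return "object"

def flag_problem_elements(records):
    """Return (elements with mixed native types, elements containing JSON null)."""
    types_by_key = defaultdict(set)
    for record in records:
        for key, value in record.items():
            types_by_key[key].add(json_type(value))
    mixed = sorted(k for k, t in types_by_key.items() if len(t - {"null"}) > 1)
    has_null = sorted(k for k, t in types_by_key.items() if "null" in t)
    return mixed, has_null

# Hypothetical sample records, for illustration only.
sample = [
    json.loads('{"id": 1, "price": 9.5, "note": null}'),
    json.loads('{"id": "2", "price": 12, "note": "rush"}'),
]
mixed, has_null = flag_problem_elements(sample)
print(mixed)     # elements that violate recommendation D
print(has_null)  # elements that are candidates for extraction under E
```

In this sample, `id` is flagged for mixing a number and a string, and `note` is flagged for containing a JSON null; `price` passes because 9.5 and 12 are both JSON numbers.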

Why the distractors are wrong:

  • A - Files of 1 GB or more are well above the recommended 100–250 MB range; a few very large files reduce parallelism and can become a bottleneck.
  • C - Storing a huge array as a single row bloats VARIANT storage, makes querying inefficient, and bypasses Snowflake's ability to process records individually.
  • F - Files under 100 MB fall below the recommended range, and staging many small files adds per-file overhead; this contradicts the 100–250 MB guidance in B.
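To avoid the single-huge-array situation in C, the array can be broken into one record per line before staging. The sketch below is one possible approach, not Snowflake's own tooling: `split_array_to_ndjson_chunks` is a hypothetical helper, it loads the whole array into memory, and its `max_bytes` limit counts uncompressed bytes, whereas the 100–250 MB guidance in B refers to compressed size. Snowflake's JSON file format also offers the `STRIP_OUTER_ARRAY` option, which may make this preprocessing unnecessary.

```python
import json

def split_array_to_ndjson_chunks(array_text, max_bytes):
    """Split a top-level JSON array into newline-delimited JSON chunks.

    Each chunk holds whole records and stays at or under max_bytes
    (uncompressed UTF-8), unless a single record is larger than the limit.
    """
    records = json.loads(array_text)
    if not isinstance(records, list):
        raise ValueError("expected a top-level JSON array")
    chunks, current, size = [], [], 0
    for record in records:
        line = json.dumps(record, separators=(",", ":")) + "\n"
        line_size = len(line.encode("utf-8"))
        # Start a new chunk when adding this record would exceed the limit.
        if current and size + line_size > max_bytes:
            chunks.append("".join(current))
            current, size = [], 0
        current.append(line)
        size += line_size
    if current:
        chunks.append("".join(current))
    return chunks

# Tiny illustrative input; a real limit would be on the order of megabytes.
raw = '[{"id":1},{"id":2},{"id":3}]'
chunks = split_array_to_ndjson_chunks(raw, max_bytes=18)
print(len(chunks))
```

Each resulting chunk can then be compressed and staged as its own file, so every record lands in its own table row.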

Memory tip: Think "BDE = Best Data Engineering": Big enough files (100–250 MB), Data types consistent per element, Extract nulls before loading. If an option suggests either extreme (1 GB+ or <100 MB) or packing many records into one row, it's wrong.

Topics

#Snowpipe #Semi-structured Data #Data Loading Best Practices #Performance Optimization
