DEA-C02 Question #4: Real Exam Question with Answer & Explanation

The correct answers are B, D, and E. The full reasoning is given in the Explanation section below.

Data Movement

Question

A Data Engineer needs to load JSON output from some software into Snowflake using Snowpipe. Which recommendations apply to this scenario? (Choose three.)

Options

A. Load large files (1 GB or larger).
B. Ensure that data files are 100-250 MB (or larger) in size, compressed.
C. Load a single huge array containing multiple records into a single table row.
D. Verify each value of each unique element stores a single native data type (string or number).
E. Extract semi-structured data elements containing null values into relational columns before
F. Create data files that are less than 100 MB and stage them in cloud storage at a sequence

Explanation

For Snowpipe JSON ingestion, B is correct because Snowflake recommends data files of roughly 100–250 MB (or larger) compressed: many tiny files add per-file overhead, while very large files reduce the parallelism of the load. D is correct because, when JSON is loaded into a VARIANT column, Snowflake extracts elements into an internal columnar form where it can. An element whose values mix native types (e.g., a field that is sometimes a string, sometimes a number) is not extracted, so queries that reference it must scan the whole JSON structure and run slower. E is correct for the same reason: an element that contains even one JSON null value is not extracted into columnar form, so Snowflake recommends extracting such elements into relational columns before loading (or, if the nulls only mean "missing", stripping them with the STRIP_NULL_VALUES file format option).
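The checks behind D and E can be run on a sample of the JSON before it is staged. The following is a minimal sketch, not a Snowflake tool: it assumes the records are JSON objects and inspects only top-level keys, and the names `json_type`, `audit_elements`, and `problem_elements` are hypothetical helpers introduced for illustration.

```python
import json
from collections import defaultdict


def json_type(value):
    """Map a parsed Python value back to its JSON type name."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # check bool before int: bool is a subclass of int
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    return "object"


def audit_elements(records):
    """Collect the set of JSON types seen for each top-level key."""
    types_seen = defaultdict(set)
    for record in records:
        for key, value in record.items():
            types_seen[key].add(json_type(value))
    return dict(types_seen)


def problem_elements(records):
    """Report keys that contain nulls or mix more than one non-null type."""
    report = {}
    for key, kinds in audit_elements(records).items():
        issues = []
        if "null" in kinds:
            issues.append("contains null")
        if len(kinds - {"null"}) > 1:
            issues.append("mixed types")
        if issues:
            report[key] = issues
    return report


# Hypothetical sample: "price" mixes number and string, "note" contains a null.
sample = [
    json.loads('{"id": 1, "price": 9.5, "note": "ok"}'),
    json.loads('{"id": 2, "price": "9.50", "note": null}'),
]
print(problem_elements(sample))
# {'price': ['mixed types'], 'note': ['contains null']}
```

Keys flagged as "mixed types" are candidates for normalizing to one type upstream; keys flagged as "contains null" are candidates for extraction into relational columns or for null stripping, depending on what the nulls mean.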

Why the distractors are wrong:

A - Files of 1 GB or more are well above the recommended size; Snowflake advises splitting large files so the load can be parallelized.
C - Storing a huge array as a single row risks exceeding the VARIANT size limit, makes querying inefficient, and prevents Snowflake from processing records individually. The STRIP_OUTER_ARRAY file format option exists to load each array element as its own row instead.
F - Staging many files smaller than 100 MB adds per-file overhead that Snowpipe includes in its charges, without a guaranteed improvement in load latency; this runs against the 100–250 MB guidance in B.
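As an illustration of the alternative to options A and C, the sketch below rewrites one large outer array as newline-delimited JSON, one record per line, split into size-bounded chunks. It is a sketch under stated assumptions: `split_outer_array` is a hypothetical helper, the whole input is parsed in memory (a real multi-gigabyte file would need a streaming parser), and the size bound is measured on uncompressed bytes, whereas the 100–250 MB guidance refers to compressed size.

```python
import json


def split_outer_array(json_text, max_chunk_bytes):
    """Yield newline-delimited JSON chunks of at most max_chunk_bytes each.

    A single record larger than max_chunk_bytes is emitted as its own chunk.
    """
    records = json.loads(json_text)
    if not isinstance(records, list):
        raise ValueError("expected a top-level JSON array")

    chunk, size = [], 0
    for record in records:
        line = json.dumps(record, separators=(",", ":")) + "\n"
        line_bytes = len(line.encode("utf-8"))
        # Start a new chunk when adding this record would exceed the bound.
        if chunk and size + line_bytes > max_chunk_bytes:
            yield "".join(chunk)
            chunk, size = [], 0
        chunk.append(line)
        size += line_bytes
    if chunk:
        yield "".join(chunk)


# Hypothetical tiny input; a real target would be on the order of 100-250 MB compressed.
big_array = json.dumps([{"id": i} for i in range(5)])
chunks = list(split_outer_array(big_array, max_chunk_bytes=20))
print(len(chunks))
# 3
```

Each chunk would then be compressed and staged as a separate file, so every record lands in its own table row. When the source file cannot be rewritten, the STRIP_OUTER_ARRAY file format option achieves the per-record row layout on the Snowflake side.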

Memory tip: Think "BDE = Best Data Engineering" - Big enough files (100–250 MB), Data types consistent per element, Extract nulls before loading. If an option suggests either extreme (1 GB+ or <100 MB) or packing many records into one row, it's wrong.

Topics

#Snowpipe #Semi-structured Data #Data Loading Best Practices #Performance Optimization
