Google

PROFESSIONAL-DATA-ENGINEER · Question #185

Submitted by tunde_lagos · Mar 30, 2026 · Designing data processing systems

Question

You need to create a near real-time inventory dashboard that reads the main inventory tables in your BigQuery data warehouse. Historical inventory data is stored as inventory balances by item and location. You have several thousand updates to inventory every hour. You want to maximize performance of the dashboard and ensure that the data is accurate. What should you do?

Options

  • A. Leverage BigQuery UPDATE statements to update the inventory balances as they are changing.
  • B. Partition the inventory balance table by item to reduce the amount of data scanned with each inventory update.
  • C. Use BigQuery streaming to stream changes into a daily inventory movement table. Calculate balances in a view that joins it to the historical inventory.
  • D. Use the BigQuery bulk loader to batch load inventory changes into a daily inventory movement table.

Explanation

Option C is correct because BigQuery's streaming insert API allows inventory changes to be ingested with seconds of latency, satisfying the near real-time requirement. Storing changes as an append-only movement table (rather than mutating balances) aligns with BigQuery's columnar, read-optimized architecture, and computing current balances in a view by joining movements to historical data ensures accuracy without expensive rewrites.
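The view-computed balance in option C can be sketched as follows. This is a minimal illustration, not BigQuery itself: an in-memory SQLite database stands in for the warehouse so the example runs anywhere, and the table, column, and view names (`inventory_balance`, `inventory_movement`, `quantity_delta`, `current_inventory`) are assumptions made up for this example. The same join-and-sum shape would apply to a BigQuery view, though the exact SQL dialect differs.

```python
import sqlite3

# SQLite stands in for BigQuery; all table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE inventory_balance (   -- historical snapshot, rewritten rarely
    item TEXT, location TEXT, balance INTEGER
);
CREATE TABLE inventory_movement (  -- append-only: one row per inventory change
    item TEXT, location TEXT, quantity_delta INTEGER
);
-- Current balance = snapshot balance + sum of movements since the snapshot.
CREATE VIEW current_inventory AS
SELECT b.item,
       b.location,
       b.balance + COALESCE(SUM(m.quantity_delta), 0) AS current_balance
FROM inventory_balance AS b
LEFT JOIN inventory_movement AS m
       ON m.item = b.item AND m.location = b.location
GROUP BY b.item, b.location, b.balance;
""")

# Snapshot balances, as they might look after the last rebuild.
conn.executemany(
    "INSERT INTO inventory_balance VALUES (?, ?, ?)",
    [("widget", "NYC", 100), ("gadget", "NYC", 40)],
)

# Movements only ever get appended; nothing is updated in place.
conn.executemany(
    "INSERT INTO inventory_movement VALUES (?, ?, ?)",
    [("widget", "NYC", -3), ("widget", "NYC", 10)],
)

rows = conn.execute(
    "SELECT item, location, current_balance FROM current_inventory ORDER BY item"
).fetchall()
print(rows)  # [('gadget', 'NYC', 40), ('widget', 'NYC', 107)]
```

The LEFT JOIN keeps items that have had no movements since the snapshot. One limitation of this sketch: an item that appears only in the movement table, with no snapshot row, would not show up in the view.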

A is wrong because BigQuery is not designed for frequent row-level DML - thousands of UPDATE statements per hour would be slow and costly, since each DML operation rewrites entire storage segments rather than modifying individual rows in place.

B is wrong because partitioning only reduces the amount of data scanned per query; it does nothing to solve the near real-time ingestion problem, since it does not address how inventory changes reach the table at all.

D is wrong because bulk loading is inherently batch-oriented and introduces significant latency (typically minutes to hours), which directly violates the near real-time requirement.

Memory tip: In BigQuery, "near real-time" almost always means streaming inserts, and "frequent mutations" is a red flag - BigQuery prefers append-only patterns. When you see both requirements together, think stream movements + view-computed balances.
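The "stream movements" half of the pattern might look like the sketch below. It assumes the `google-cloud-bigquery` Python client, whose `Client.insert_rows_json(table, json_rows)` method returns a list of per-row errors (empty on success). The helper names, the row fields, and the table ID shown in the usage comment are hypothetical. Note that Google's documentation recommends the Storage Write API over the legacy streaming insert method for new projects, so treat this as an illustration of the append-only idea rather than a recommended implementation.

```python
from datetime import datetime, timezone


def build_movement_row(item, location, quantity_delta, moved_at=None):
    """Build one append-only movement record (field names are hypothetical)."""
    moved_at = moved_at or datetime.now(timezone.utc)
    return {
        "item": item,
        "location": location,
        "quantity_delta": quantity_delta,
        "moved_at": moved_at.isoformat(),
    }


def stream_movements(client, table_id, movements):
    """Append movements via a client exposing insert_rows_json.

    `client` is expected to behave like google.cloud.bigquery.Client;
    `movements` is an iterable of (item, location, quantity_delta) tuples.
    Returns the number of rows sent, or raises if any row was rejected.
    """
    rows = [build_movement_row(*movement) for movement in movements]
    errors = client.insert_rows_json(table_id, rows)
    if errors:
        raise RuntimeError(f"streaming insert reported errors: {errors}")
    return len(rows)


# Usage (requires credentials and an existing table; the ID is made up):
#   from google.cloud import bigquery
#   client = bigquery.Client()
#   stream_movements(
#       client,
#       "my-project.inventory.inventory_movement",
#       [("widget", "NYC", -3), ("widget", "NYC", 10)],
#   )
```

Each inventory change becomes a new row with a signed delta, so the balance table is never updated in place during the day.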

Topics

#BigQuery streaming · #Real-time analytics · #Data warehousing patterns · #Near real-time inventory
