Which Cloud Dataflow / Beam feature should you use to aggregate data in an unbounded data source every hour based on the time when the data entered the pipeline?

The correct answer is A. An hourly watermark.

Submitted by dimitri_ru · Mar 30, 2026 · Building and operationalizing data processing systems

Question

Which Cloud Dataflow / Beam feature should you use to aggregate data in an unbounded data source every hour based on the time when the data entered the pipeline?

Options

A. An hourly watermark
B. Dataflow pipelines can consume data from other Google Cloud services
C. Dataflow pipelines can be programmed in Java
D. Dataflow pipelines use a unified programming model, so can work both with streaming and batch data sources

How the community answered

(24 responses)

A: 79% (19)
B: 4% (1)
C: 4% (1)
D: 13% (3)

Explanation

When collecting and grouping data into windows, Beam uses triggers to determine when to emit the aggregated results of each window.

Processing time triggers: these operate on the processing time, that is, the time when the data element is processed at any given stage in the pipeline.
Event time triggers: these operate on the event time, as indicated by the timestamp on each data element.

Beam's default trigger is event time-based. Because the question asks for hourly aggregation based on when the data entered the pipeline, it is the processing-time behavior that matters here.

Dataflow pipelines can also run on alternate runtimes like Spark and Flink, as they are built using the Apache Beam SDKs.
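The difference between the two notions of time can be seen in a small plain-Python model. This is only a sketch and does not use the Beam API: the hourly_sums helper, the field names and the sample elements are all hypothetical, and a real Beam trigger fires panes over time rather than grouping a finished list. It shows how the same elements land in different hourly buckets depending on which timestamp is used.

```python
from collections import defaultdict

HOUR = 3600  # seconds


def hourly_sums(elements, time_field):
    """Sum 'value' into one-hour buckets keyed by the chosen timestamp.

    elements: iterable of dicts with 'event_time', 'processing_time'
              (both in seconds) and 'value'.
    time_field: 'event_time' or 'processing_time'.
    """
    sums = defaultdict(int)
    for element in elements:
        # Round the timestamp down to the start of its hour.
        bucket_start = (element[time_field] // HOUR) * HOUR
        sums[bucket_start] += element["value"]
    return dict(sums)


# Hypothetical data: the second element was produced in hour 0
# but only reached the pipeline in hour 1.
elements = [
    {"event_time": 100, "processing_time": 200, "value": 1},
    {"event_time": 3500, "processing_time": 3700, "value": 2},
    {"event_time": 3800, "processing_time": 3900, "value": 4},
]

by_processing_time = hourly_sums(elements, "processing_time")
by_event_time = hourly_sums(elements, "event_time")

print(by_processing_time)  # {0: 1, 3600: 6}
print(by_event_time)       # {0: 3, 3600: 4}
```

The late-arriving element is counted in the second hour when grouping by processing time and in the first hour when grouping by event time, which is the distinction the explanation draws.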

Topics

#windowing #watermarks #unbounded data #streaming aggregation

