PROFESSIONAL-DATA-ENGINEER · Question #94
The correct answer is A: An hourly watermark.
Question
Which Cloud Dataflow / Beam feature should you use to aggregate data in an unbounded data source every hour based on the time when the data entered the pipeline?
Options
- A. An hourly watermark
- B. Dataflow pipelines can consume data from other Google Cloud services
- C. Dataflow pipelines can be programmed in Java
- D. Dataflow pipelines use a unified programming model, so they can work with both streaming and batch data sources
Explanation
When collecting and grouping data into windows, Beam uses triggers to determine when to emit the aggregated results of each window.
- Processing time triggers operate on the processing time, that is, the time when the data element is processed at any given stage in the pipeline.
- Event time triggers operate on the event time, as indicated by the timestamp on each data element. Beam's default trigger is event time-based.
The question asks for aggregation based on the time when the data entered the pipeline, which corresponds to processing time rather than event time.
Because Dataflow pipelines are built with the Apache Beam SDKs, they can also run on alternative runners such as Spark and Flink.
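The difference between processing time and event time described above can be illustrated with a small plain-Python sketch. It does not use the Beam API; the record layout, the field names `event_time` and `processing_time`, and the helper `hourly_totals` are all hypothetical, chosen only to show how the same records fall into different hourly buckets depending on which timestamp drives the aggregation.

```python
from collections import defaultdict

HOUR = 3600  # bucket size in seconds

def hourly_totals(records, time_field):
    """Sum `value` per one-hour bucket, keyed by the start of the bucket.

    `time_field` selects which timestamp decides the bucket:
    "event_time" (when the event happened) or
    "processing_time" (when the pipeline saw the record).
    """
    totals = defaultdict(int)
    for record in records:
        timestamp = record[time_field]
        bucket_start = timestamp - timestamp % HOUR
        totals[bucket_start] += record["value"]
    return dict(totals)

# Hypothetical records; timestamps are seconds since an arbitrary origin.
records = [
    {"value": 1, "event_time": 100, "processing_time": 200},
    # Happened in the first hour but reached the pipeline in the second hour.
    {"value": 2, "event_time": 3500, "processing_time": 3700},
    {"value": 4, "event_time": 3700, "processing_time": 3800},
]

by_event_time = hourly_totals(records, "event_time")
by_processing_time = hourly_totals(records, "processing_time")

print(by_event_time)       # {0: 3, 3600: 4}
print(by_processing_time)  # {0: 1, 3600: 6}
```

The second record is counted in the first hour when grouped by event time and in the second hour when grouped by processing time. In Beam itself this choice is expressed through the trigger, for example `AfterProcessingTime` in the Python SDK's `apache_beam.transforms.trigger` module, rather than through manual bucketing as in this sketch.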