
PROFESSIONAL-DATA-ENGINEER Question #352: Real Exam Question with Answer & Explanation

The correct answer is B: Use the BigQueryInsertJobOperator in Cloud Composer, set the retry parameter to three, and set the email_on_failure parameter to true.

Submitted by ravi_2018 · Mar 30, 2026 · Building and operationalizing data processing systems

Question

You need to create a SQL pipeline. The pipeline runs an aggregate SQL transformation on a BigQuery table every two hours and appends the result to another existing BigQuery table. You need to configure the pipeline to retry if errors occur. You want the pipeline to send an email notification after three consecutive failures. What should you do?

Options

  • A. Use the BigQueryUpsertTableOperator in Cloud Composer, set the retry parameter to three, and set the email_on_failure parameter to true.
  • B. Use the BigQueryInsertJobOperator in Cloud Composer, set the retry parameter to three, and set the email_on_failure parameter to true.
  • C. Create a BigQuery scheduled query to run the SQL transformation with schedule options that repeats every two hours, and enable email notifications.
  • D. Create a BigQuery scheduled query to run the SQL transformation with schedule options that repeats every two hours, and enable notification to Pub/Sub.

Explanation

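Option B is correct because BigQueryInsertJobOperator is the Cloud Composer (Apache Airflow) operator for running BigQuery jobs, including SQL queries with aggregate transformations. A query job can append its results to an existing table by naming a destination table and using the WRITE_APPEND write disposition. Airflow's built-in retries parameter (the option text calls it "retry") controls how many times a failed task is retried, and email_on_failure=True sends an email only once those retries are exhausted and the task is marked failed. The notification therefore arrives after repeated consecutive failures rather than on the first error, which satisfies all requirements. Strictly, retries=3 allows up to three retries after the initial attempt, and setting email_on_retry=False keeps the intermediate retries silent.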
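To make option B concrete, here is a minimal sketch of such a DAG, assuming Airflow 2.4 or later with the Google provider package installed. The project, dataset, table names, SQL text, DAG id, and email address are all hypothetical placeholders, not values from the question. Airflow is imported lazily inside build_dag() so the configuration can be inspected without an Airflow installation.

```python
from datetime import datetime, timedelta

# Hypothetical identifiers; replace with real project, dataset, and table names.
PROJECT = "my-project"
SOURCE_TABLE = "my-project.analytics.events"
DEST_DATASET = "analytics"
DEST_TABLE = "events_2h_agg"

# Task-level settings that Airflow applies to every task in the DAG.
DEFAULT_ARGS = {
    "retries": 3,                          # up to three retries after the first attempt
    "retry_delay": timedelta(minutes=5),   # assumed wait between attempts
    "email": ["data-alerts@example.com"],  # hypothetical recipient
    "email_on_failure": True,              # mail once the task is finally marked failed
    "email_on_retry": False,               # stay quiet on intermediate retries
}


def build_query_job_config() -> dict:
    """Return a BigQuery query-job configuration that appends to the destination table."""
    sql = (
        "SELECT user_id, COUNT(*) AS event_count "
        f"FROM `{SOURCE_TABLE}` GROUP BY user_id"
    )
    return {
        "query": {
            "query": sql,
            "useLegacySql": False,
            "destinationTable": {
                "projectId": PROJECT,
                "datasetId": DEST_DATASET,
                "tableId": DEST_TABLE,
            },
            "writeDisposition": "WRITE_APPEND",   # append, do not overwrite
            "createDisposition": "CREATE_NEVER",  # the target table already exists
        }
    }


def build_dag():
    """Build the DAG; requires Airflow and the Google provider at call time."""
    from airflow import DAG
    from airflow.providers.google.cloud.operators.bigquery import (
        BigQueryInsertJobOperator,
    )

    with DAG(
        dag_id="bq_aggregate_every_2h",
        start_date=datetime(2024, 1, 1),
        schedule="0 */2 * * *",  # every two hours; older Airflow uses schedule_interval
        catchup=False,
        default_args=DEFAULT_ARGS,
    ) as dag:
        BigQueryInsertJobOperator(
            task_id="aggregate_and_append",
            configuration=build_query_job_config(),
        )
    return dag
```

In a real DAG file, the result would be assigned at module level (for example `dag = build_dag()`) so the scheduler can discover it. Email delivery also depends on the environment having an email backend configured.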

Option A is wrong because BigQueryUpsertTableOperator creates a table or updates an existing table's definition from a table resource. It does not run SQL, so it cannot execute the aggregate transformation or append query results - it's the wrong tool for the job entirely.

Option C is wrong because BigQuery Scheduled Queries lack fine-grained retry control; you cannot configure "retry up to N times before notifying." The email notification fires on any failure, not specifically after three consecutive ones.
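The difference between the two notification behaviours can be illustrated with a toy model in plain Python. This is a simulation for intuition only, not Airflow or BigQuery code, and every function name in it is hypothetical. It assumes Airflow's convention that retries counts the retries made after the first attempt, so retries=3 allows up to four attempts before the failure email.

```python
from typing import Callable, List


def run_with_retries(task: Callable[[], None], retries: int) -> List[str]:
    """Toy model of retry-then-notify: email only after the last attempt fails."""
    events: List[str] = []
    # One initial attempt plus `retries` further attempts.
    for attempt in range(1, retries + 2):
        try:
            task()
            events.append(f"attempt {attempt}: success")
            return events
        except Exception:
            events.append(f"attempt {attempt}: failed")
    events.append("email sent")
    return events


def notify_on_any_failure(task: Callable[[], None]) -> List[str]:
    """Toy model of notify-on-any-failure: a single failed run alerts immediately."""
    try:
        task()
        return ["run 1: success"]
    except Exception:
        return ["run 1: failed", "email sent"]


def always_fails() -> None:
    """Stand-in for a query that errors on every run."""
    raise RuntimeError("simulated query error")


if __name__ == "__main__":
    print(run_with_retries(always_fails, retries=3))
    print(notify_on_any_failure(always_fails))
```

In the first model a transient error that clears on a later attempt produces no email at all, which is the behaviour the question asks for.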

Option D is wrong because it has the same retry limitation as C. In addition, Pub/Sub notifications do not send email natively, so turning failure events into email would require extra infrastructure.

Memory tip: Think of the operator name literally - you're inserting a job (a SQL query job) into BigQuery. InsertJob = run SQL. UpsertTable = create or update the table itself. When you see "aggregate transformation + append," reach for BigQueryInsertJobOperator.

Topics

#BigQuery #Cloud Composer #Data Pipelines #Error Handling

