PROFESSIONAL-MACHINE-LEARNING-ENGINEER · Question #219
The correct answer is A: Use the Kubeflow Pipelines SDK to implement the pipeline. Use the BigQueryJobOp component.
Question
You recently developed a wide and deep model in TensorFlow. You generated training datasets using a SQL script that preprocessed raw data in BigQuery by performing instance-level transformations of the data. You need to create a training pipeline to retrain the model on a weekly basis. The trained model will be used to generate daily recommendations. You want to minimize model development and training time. How should you develop the training pipeline?
Options
- A. Use the Kubeflow Pipelines SDK to implement the pipeline. Use the BigQueryJobOp component
- B. Use the Kubeflow Pipelines SDK to implement the pipeline. Use the DataflowPythonJobOp
- C. Use the TensorFlow Extended SDK to implement the pipeline. Use the ExampleGen component
- D. Use the TensorFlow Extended SDK to implement the pipeline. Implement the preprocessing
Explanation
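The preprocessing already exists as a BigQuery SQL script that performs instance-level transformations, so the fastest path is to run that script unchanged as a pipeline step. With the Kubeflow Pipelines SDK, the BigQueryJobOp component executes the SQL directly in BigQuery, and the rest of the pipeline can then retrain the TensorFlow wide and deep model each week. No preprocessing logic has to be rewritten, which keeps development and training time to a minimum.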
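As a minimal sketch of what option A can look like, the snippet below wraps an existing SQL script in a Kubeflow Pipelines step. It assumes the `kfp` and `google-cloud-pipeline-components` packages, where the query component is exposed as `BigqueryQueryJobOp` (the question's "BigQueryJobOp" is taken to refer to it). The pipeline name, the `pipeline_parameters` helper, and the placeholder SQL are hypothetical and only for illustration; the training step is left out.

```python
def pipeline_parameters(project: str, location: str, sql_script: str) -> dict:
    """Bundle the existing SQL script into pipeline parameter values.

    Hypothetical helper: the keys match the arguments of the sketch
    pipeline defined in build_pipeline() below.
    """
    return {"project": project, "location": location, "query": sql_script.strip()}


def build_pipeline():
    """Return a KFP pipeline function; SDK imports are deferred to call time."""
    from kfp import dsl
    from google_cloud_pipeline_components.v1.bigquery import BigqueryQueryJobOp

    @dsl.pipeline(name="weekly-wide-deep-retraining")  # hypothetical name
    def pipeline(project: str, location: str, query: str):
        # Step 1: run the existing SQL preprocessing unchanged in BigQuery.
        preprocess = BigqueryQueryJobOp(
            project=project, location=location, query=query
        )
        # Step 2 (omitted): a training component for the wide and deep model
        # would be chained here, for example with .after(preprocess).

    return pipeline


# Placeholder query standing in for the real preprocessing script.
params = pipeline_parameters("my-project", "US", "\nSELECT 1 AS feature\n")
```

The resulting `params` dictionary is the kind of value that would be supplied as pipeline parameters when the compiled pipeline is submitted; a weekly trigger would be configured on that submission rather than inside the pipeline definition.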
Common mistakes.
- B. Using DataflowPythonJobOp would require rewriting the existing SQL preprocessing logic into Python for Dataflow, which increases development time instead of minimizing it.
- C. While TensorFlow Extended (TFX) is a robust framework, using its ExampleGen component alone might not be sufficient to directly incorporate complex BigQuery SQL preprocessing, and adapting the SQL logic into TFX's data transformation components would typically involve more development effort.
- D. Implementing preprocessing within the TensorFlow Extended (TFX) SDK would necessitate rewriting the existing BigQuery SQL preprocessing logic using tf.Transform or similar TFX components, which goes against the goal of minimizing development and training time.
Concept tested. Integrating BigQuery SQL preprocessing into an MLOps pipeline
Reference. https://cloud.google.com/vertex-ai/docs/pipelines/create-component-from-bigquery-query