PROFESSIONAL-DATA-ENGINEER Exam Questions
357 real PROFESSIONAL-DATA-ENGINEER exam questions with expert-verified answers and explanations. Page 6 of 8.
- Question #279
You are implementing a chatbot to help an online retailer streamline their customer service. The chatbot must be able to respond to both text and voice inquiries. You are looking f...
- Question #280
You are developing a new deep learning model that predicts a customer's likelihood to buy on your ecommerce site. After running an evaluation of the model against both the original...
- Question #281
You are loading CSV files from Cloud Storage to BigQuery. The files have known data quality issues, including mismatched data types, such as STRINGs and INT64s in the same column,...
- Question #282
You need to migrate 1 PB of data from an on-premises data center to Google Cloud. Data transfer time during the migration should take only a few hours. You want to follow Google-re...
- Question #283
Your startup has a web application that currently serves customers out of a single region in Asia. You are targeting funding that will allow your startup to serve customers globall...
- Question #284
Your new customer has requested daily reports that show their net consumption of Google Cloud compute resources and who used the resources. You need to quickly and efficiently gene...
- Question #285
You issue a new batch job to Dataflow. The job starts successfully, processes a few elements, and then suddenly fails and shuts down. You navigate to the Dataflow monitoring interf...
- Question #286: Designing data processing systems
You want to create a machine learning model using BigQuery ML and create an endpoint for hosting the model using Vertex AI. This will enable the processing of continuous streaming...
Streaming Data Processing, Data Ingestion, Data Transformation, BigQuery ML
- Question #287: Building and operationalizing data processing systems
You have a data processing application that runs on Google Kubernetes Engine (GKE). Containers need to be launched with their latest available configurations from a container regis...
GKE, Infrastructure as Code, CI/CD, Container Deployment
- Question #288: Designing data processing systems
You need ads data to serve AI models and historical data for analytics. Longtail and outlier data points need to be identified. You want to cleanse the data in near-real time befo...
Real-time Processing, Data Cleansing, Outlier Detection, Google Dataflow
- Question #289: Designing data processing systems
You are collecting IoT sensor data from millions of devices across the world and storing the data in BigQuery. Your access pattern is based on recent data, filtered by location_id...
BigQuery, Data Partitioning, Data Clustering, Query Optimization
- Question #290: Designing data processing systems
A live TV show asks viewers to cast votes using their mobile phones. The event generates a large volume of data during a 3-minute period. You are in charge of the "Voting infrastru...
Pub/Sub, BigQuery, Real-time Analytics, Scalable Data Ingestion
- Question #291: Designing data processing systems
A shipping company has live package-tracking data that is sent to an Apache Kafka stream in real time. This is then loaded into BigQuery. Analysts in your company want to query the...
BigQuery, Clustering, Query Optimization, Data Modeling
- Question #292: Designing data processing systems
You are designing a data mesh on Google Cloud with multiple distinct data engineering teams building data products. The typical data curation design pattern consists of landing fil...
Dataplex, Data Mesh, Access Control, Data Governance
- Question #293: Ensuring solution quality
You are using BigQuery with a multi-region dataset that includes a table with the daily sales volumes. This table is updated multiple times per day. You need to protect your sales...
BigQuery, Disaster Recovery, Data Backup, Cost Optimization
- Question #294: Building and operationalizing data processing systems
You are troubleshooting your Dataflow pipeline that processes data from Cloud Storage to BigQuery. You have discovered that the Dataflow worker nodes cannot communicate with one an...
Dataflow Networking, Firewall Rules, Network Tags, Troubleshooting
- Question #296: Building and operationalizing data processing systems
You have a Standard Tier Memorystore for Redis instance deployed in a production environment. You need to simulate a Redis instance failover in the most accurate disaster recovery...
Memorystore for Redis, Failover, Disaster Recovery Testing, Data Protection Modes
- Question #297: Designing data processing systems
You are administering a BigQuery dataset that uses a customer-managed encryption key (CMEK). You need to share the dataset with a partner organization that does not have access to...
BigQuery, Data Sharing, CMEK, Analytics Hub
- Question #298: Designing data processing systems
You are developing an Apache Beam pipeline to extract data from a Cloud SQL instance by using JdbcIO. You have two projects running in Google Cloud. The pipeline will be deployed a...
VPC Network Peering, Cloud SQL Connectivity, Dataflow Networking, Private Networking
- Question #299: Ensuring solution quality
You have a BigQuery table that contains customer data, including sensitive information such as names and addresses. You need to share the customer data with your data analytics and...
BigQuery Security, Policy Tags, Authorized Views, IAM Roles
- Question #301: Building and operationalizing data processing systems
You orchestrate ETL pipelines by using Cloud Composer. One of the tasks in the Apache Airflow directed acyclic graph (DAG) relies on a third-party service. You want to be notified...
Cloud Composer, Apache Airflow, Task failure, Callbacks
- Question #302: Designing data processing systems
You are migrating your on-premises data warehouse to BigQuery. One of the upstream data sources resides on a MySQL database that runs in your on-premises data center with no publi...
Data migration, BigQuery, Datastream, Cloud Interconnect
- Question #303: Designing data processing systems
You store and analyze your relational data in BigQuery on Google Cloud with all data that resides in US regions. You also have a variety of object stores across Microsoft Azure and...
BigQuery Omni, BigLake tables, Multi-cloud analytics, Data locality
- Question #304: Designing data processing systems
You have a variety of files in Cloud Storage that your data science team wants to use in their models. Currently, users do not have a method to explore, cleanse, and validate the d...
Data Preparation, Data Cleansing, Low Code, Google Cloud Dataprep
- Question #305
You are building an ELT solution in BigQuery by using Dataform. You need to perform uniqueness and null value checks on your final tables. What should you do to efficiently integra...
- Question #306: Building and operationalizing data processing systems
A web server sends click events to a Pub/Sub topic as messages. The web server includes an eventTimestamp attribute in the messages, which is the time when the click occurred. You...
Dataflow, Streaming Performance, Monitoring Metrics, Latency Optimization
- Question #307: Designing data processing systems
Your organization stores customer data in an on-premises Apache Hadoop cluster in Apache Parquet format. Data is processed on a daily basis by Apache Spark jobs that run on the clu...
Data Migration, Apache Spark, Dataproc Metastore, Cloud Storage
- Question #308: Designing data processing systems
Your organization has two Google Cloud projects, project A and project B. In project A, you have a Pub/Sub topic that receives data from confidential sources. Only the resources in...
VPC Service Controls, Data Exfiltration Prevention, Pub/Sub Security, Cross-Project Security
- Question #309: Designing data processing systems
You stream order data by using a Dataflow pipeline, and write the aggregated result to Memorystore. You provisioned a Memorystore for Redis instance with Basic Tier, 4 GB capacity,...
Memorystore for Redis, Read Replicas, Scaling, High Availability
- Question #310
You have a streaming pipeline that ingests data from Pub/Sub in production. You need to update this streaming pipeline with improved business logic. You need to ensure that the upd...
- Question #311: Designing data processing systems
You currently use a SQL-based tool to visualize your data stored in BigQuery. The data visualizations require the use of outer joins and analytic functions. Visualizations must be...
BigQuery, Materialized Views, Query Optimization, Data Staleness
- Question #312: Designing data processing systems
You need to modernize your existing on-premises data strategy. Your organization currently uses: - Apache Hadoop clusters for processing multiple large data sets, including on-prem...
Data Migration, Hadoop, ETL Orchestration, Google Cloud Dataproc
- Question #313
You recently deployed several data processing jobs into your Cloud Composer 2 environment. You notice that some tasks are failing in Apache Airflow. On the monitoring dashboard, yo...
- Question #314: Ensuring solution quality
You are on the data governance team and are implementing security requirements to deploy resources. You need to ensure that resources are limited to only the europe-west3 region. Y...
Google Cloud Organization Policy, Resource Location Restriction, Data Governance, Compliance
- Question #316: Designing data processing systems
You migrated a data backend for an application that serves 10 PB of historical product data for analytics. Only the last known state for a product, which is about 10 GB of data, ne...
BigQuery, Data warehousing, Large-scale data storage, Database selection
- Question #317: Designing data processing systems
You want to schedule a number of sequential load and transformation jobs. Data files will be added to a Cloud Storage bucket by an upstream process. There is no fixed schedule for...
Workflow Orchestration, Cloud Composer, Data Pipelines, BigQuery
- Question #318: Building and operationalizing data processing systems
You are deploying a MySQL database workload onto Cloud SQL. The database must be able to scale up to support several readers from various geographic regions. The database must be h...
Cloud SQL, Database High Availability, Read Replicas, Cross-Region Disaster Recovery
- Question #319
You are planning to load some of your existing on-premises data into BigQuery on Google Cloud. You want to either stream or batch-load data, depending on your use case. Additionall...
- Question #320
You want to encrypt the customer data stored in BigQuery. You need to implement per-user crypto-deletion on data stored in your tables. You want to adopt native features in Google...
- Question #321: Building and operationalizing data processing systems
The data analyst team at your company uses BigQuery for ad-hoc queries and scheduled SQL pipelines in a Google Cloud project with a slot reservation of 2000 slots. However, with th...
BigQuery Slot Management, BigQuery Query Types, Workload Management, Concurrency Control
- Question #322: Designing data processing systems
You are designing a data mesh on Google Cloud by using Dataplex to manage data in BigQuery and Cloud Storage. You want to simplify data asset permissions. You are creating a custom...
Dataplex, IAM Roles, Data Mesh, Data Lake Security
- Question #323: Designing data processing systems
You are designing the architecture of your application to store data in Cloud Storage. Your application consists of pipelines that read data from a Cloud Storage bucket that contai...
Cloud Storage, High Availability, Disaster Recovery, RPO
- Question #324: Designing data processing systems
You have designed an Apache Beam processing pipeline that reads from a Pub/Sub topic. The topic has a message retention duration of one day, and writes to a Cloud Storage bucket. Y...
Disaster Recovery, Cloud Storage, Data Durability, RPO
- Question #325: Preparing and processing data for machine learning using Google Cloud BigQuery and BigQueryML
You are preparing data that your machine learning team will use to train a model using BigQueryML. They want to predict the price per square foot of real estate. The training data...
BigQuery SQL, Data Preprocessing, BigQueryML, Feature Engineering
- Question #326: Designing data processing systems
Different teams in your organization store customer and performance data in BigQuery. Each team needs to keep full control of their collected data, be able to query data within the...
BigQuery, Data Sharing, Analytics Hub, Data Governance
- Question #327
You are developing a model to identify the factors that lead to sales conversions for your customers. You have completed processing your data. You want to continue through the mode...
- Question #328: Building and operationalizing data processing systems
You have one BigQuery dataset which includes customers' street addresses. You want to retrieve all occurrences of street addresses from the dataset. What should you do? ...
Cloud Data Loss Prevention (DLP), Data Discovery, PII, BigQuery
- Question #329: Designing data processing systems
Your company operates in three domains: airlines, hotels, and ride-hailing services. Each domain has two teams: analytics and data science, which create data assets in BigQuery wit...
Data Mesh, Dataplex, Data Lake Architecture, Decentralized Data Management
- Question #330
dataset.inventory_vm sample records: You have an inventory of VM data stored in the BigQuery table. You want to prepare the data for regular reporting in the most cost-effective wa...
- Question #332: Building and operationalizing data processing systems
Your company's data platform ingests CSV file dumps of booking and user profile data from upstream sources into Cloud Storage. The data analyst team wants to join these datasets on...
Dynamic Data Masking, BigQuery, Data De-identification, PII Protection