PROFESSIONAL-DATA-ENGINEER Exam Questions
357 real PROFESSIONAL-DATA-ENGINEER exam questions with expert-verified answers and explanations. Page 1 of 8.
- Question #1
Your company built a TensorFlow neural-network model with a large number of neurons and layers. The model fits well for the training data. However, when tested against new data, it...
- Question #2
You are building a model to make clothing recommendations. You know a user's fashion preference is likely to change over time, so you build a data pipeline to stream new data back to the mod...
- Question #3
You designed a database for patient records as a pilot project to cover a few hundred patients in three clinics. Your design used a single database table to represent all patients...
- Question #4
You create an important report for your large team in Google Data Studio 360. The report uses Google BigQuery as its data source. You notice that visualizations are not showing dat...
- Question #5
An external customer provides you with a daily dump of data from their database. The data flows into Google Cloud Storage (GCS) as comma-separated values (CSV) files. You want to ana...
- Question #7
You are creating a model to predict housing prices. Due to budget constraints, you must run it on a single resource-constrained virtual machine. Which learning algorithm should you...
- Question #8
You are building a new real-time data warehouse for your company and will use Google BigQuery streaming inserts. There is no guarantee that data will only be sent in once but you do...
- Question #9
Your company is using wildcard tables to query data across multiple tables with similar names. The SQL statement is currently failing with the following error: # Syntax error : Ex...
- Question #10 (Ensuring solution quality)
Your company is in a highly regulated industry. One of your requirements is to ensure individual users have access only to the minimum amount of information required to do their jo...
BigQuery Security, Data Access Control, Least Privilege, Data Governance
- Question #11
You are designing a basket abandonment system for an ecommerce company. The system will send a message to a user based on these rules: - No interaction by the user on the site for...
- Question #12 (Designing data processing systems)
Your company handles data processing for a number of different clients. Each client prefers to use their own suite of analytics tools, with some allowing direct query access via Go...
BigQuery, IAM, Data Security, Multi-tenancy
- Question #13
You want to process payment transactions in a point-of-sale application that will run on Google Cloud Platform. Your user base could grow exponentially, but you do not want to mana...
- Question #14
You want to use a database of information about tissue samples to classify future tissue samples as either normal or mutated. You are evaluating an unsupervised anomaly detection m...
- Question #15
You need to store and analyze social media postings in Google BigQuery at a rate of 10,000 messages per minute in near real-time. Initially, design the application to use streaming...
- Question #16
Your startup has never implemented a formal security policy. Currently, everyone in the company has access to the datasets stored in Google BigQuery. Teams have freedom to use the...
- Question #17
Your company is migrating their 30-node Apache Hadoop cluster to the cloud. They want to re-use Hadoop jobs they have already created and minimize the management of the cluster as...
- Question #18
Business owners at your company have given you a database of bank transactions. Each row contains the user ID, transaction type, transaction location, and transaction amount. They...
- Question #19 (Designing data processing systems)
Your company's on-premises Apache Hadoop servers are approaching end-of-life, and IT has decided to migrate the cluster to Google Cloud Dataproc. A like-for-like migration of the...
Cloud Dataproc, Google Cloud Storage, Cost Optimization, Data Migration
- Question #20
You work for a car manufacturer and have set up a data pipeline using Google Cloud Pub/Sub to capture anomalous sensor events. You are using a push subscription in Cloud Pub/Sub th...
- Question #21
Your company uses a proprietary system to send inventory data every 6 hours to a data ingestion service in the cloud. Transmitted data includes a payload of several fields and the...
- Question #22
Your company has hired a new data scientist who wants to perform complicated analyses across very large datasets stored in Google Cloud Storage and in a Cassandra cluster on Google...
- Question #23
You are deploying 10,000 new Internet of Things devices to collect temperature data in your warehouses globally. You need to process, store and analyze these very large datasets in...
- Question #25
You want to use Google Stackdriver Logging to monitor Google BigQuery usage. You need an instant notification to be sent to your monitoring tool when new data is appended to a cert...
- Question #26
You are working on a sensitive project involving private user data. You have set up a project on Google Cloud Platform to house your work internally. An external consultant is goin...
- Question #27 (Building and operationalizing data processing systems)
You are building a model to predict whether or not it will rain on a given day. You have thousands of input features and want to see if you can improve training speed by removing s...
Feature Engineering, Dimensionality Reduction, Model Optimization, Data Preprocessing
- Question #28
Your company is performing data preprocessing for a learning algorithm in Google Cloud Dataflow. Numerous data logs are being generated during this step, and the team wan...
- Question #30
Your company's customer and order databases are often under heavy load. This makes performing analytics against them difficult without harming operations. The databases are in a My...
- Question #31
You have a Google Cloud Dataflow streaming pipeline running with a Google Cloud Pub/Sub subscription as the source. You need to make an update to the code that will make the new Clou...
- Question #32
Your company is running their first dynamic campaign, serving different offers by analyzing real-time data during the holiday season. The data scientists are collecting terabytes o...
- Question #33
Your software uses a simple JSON format for all messages. These messages are published to Google Cloud Pub/Sub, then processed with Google Cloud Dataflow to create a real-time dash...
- Question #34
destination. The company has grown rapidly, expanding their offerings to include rail, truck, aircraft, and oceanic shipping. Company Background The company started as a regional t...
- Question #35
Case Study 1 - Flowlogistic Company Background The company started as a regional trucking company, and then expanded into other logistics markets. Because they have not updated thei...
- Question #36 (Ensuring solution quality)
Case Study 1 - Flowlogistic Company Background The company started as a regional trucking company, and then expanded into other logistics markets. Because they have not updated thei...
BigQuery, Cost Optimization, Data Simplification, Data Views
- Question #37 (Designing data processing systems)
Case Study 1 - Flowlogistic Company Overview Flowlogistic is a leading logistics and supply chain provider. They help businesses throughout the world manage their resources and tra...
Real-time processing, Event time, Data ingestion, Google Cloud Pub/Sub
- Question #38 (Building and operationalizing data processing systems)
Case Study 2 - MJTelco Company Overview MJTelco is a startup that plans to build networks in rapidly growing, underserved markets around the world. The company has patents for inno...
Cloud Dataflow, Auto-scaling, Pipeline Configuration, Compute Scaling
- Question #39 (Designing data processing systems)
Case Study 2 - MJTelco Company Overview MJTelco is a startup that plans to build networks in rapidly growing, underserved markets around the world. The company has patents for inno...
Data Warehousing, Business Intelligence, Real-time Analytics, Scalable Data Infrastructure
- Question #40 (Designing data processing systems)
Case Study 2 - MJTelco Company Overview MJTelco is a startup that plans to build networks in rapidly growing, underserved markets around the world. The company has patents for inno...
BigQuery, Data Security, Access Control, Data Organization
- Question #41
Case Study 2 - MJTelco Company Overview MJTelco is a startup that plans to build networks in rapidly growing, underserved markets around the world. The company has patents for inno...
- Question #42
Case Study 2 - MJTelco Company Overview MJTelco is a startup that plans to build networks in rapidly growing, underserved markets around the world. The company has patents for inno...
- Question #43 (Ensuring solution quality)
Case Study 2 - MJTelco Company Overview MJTelco is a startup that plans to build networks in rapidly growing, underserved markets around the world. The company has patents for inno...
Data Visualization, Dashboard Design, Dynamic Reporting, Operational Monitoring
- Question #44
Case Study 2 - MJTelco Company Overview MJTelco is a startup that plans to build networks in rapidly growing, underserved markets around the world. The company has patents for inno...
- Question #45
Your company has recently grown rapidly and is now ingesting data at a significantly higher rate than it was previously. You manage the daily batch MapReduce analytics jobs in Apache...
- Question #46
You work for a large fast food restaurant chain with over 400,000 employees. You store employee information in Google BigQuery in a Users table consisting of a FirstName field and...
- Question #47 (Designing data processing systems)
You are deploying a new storage system for your mobile application, which is a media streaming service. You decide the best fit is Google Cloud Datastore. You have entities with mu...
Google Cloud Datastore, Indexing, Query Optimization, Multi-valued Properties
- Question #48 (Building and operationalizing data processing systems)
You work for a manufacturing plant that batches application log files together into a single log file once a day at 2:00 AM. You have written a Google Cloud Dataflow job to process...
Dataflow, Scheduling, Cost Optimization, Batch Processing
- Question #49
You work for an economic consulting firm that helps companies identify economic trends as they happen. As part of your analysis, you use Google BigQuery to correlate customer data...
- Question #50
You are designing the database schema for a machine learning-based food ordering service that will predict what users want to eat. Here is some of the information you need to store...
- Question #51
Your company is loading comma-separated values (CSV) files into Google BigQuery. The data is fully imported successfully; however, the imported data is not matching byte-to-byte to...
- Question #52
executives by 10:00 a.m. each day. This design is barely able to keep up with the current volume, even though the bandwidth utilization is rather low. You are told that due to seas...
- Question #53
You are choosing a NoSQL database to handle telemetry data submitted from millions of Internet-of-Things (IoT) devices. The volume of data is growing at 100 TB per year, and each d...