PROFESSIONAL-DATA-ENGINEER Exam Questions
357 real PROFESSIONAL-DATA-ENGINEER exam questions with expert-verified answers and explanations.
- Question #54
Suppose you have a table that includes a nested column called "city" inside a column called "person", but when you try to submit the following query in BigQuery, it gives you an er...
- Question #55
What are two of the benefits of using denormalized data structures in BigQuery?
- Question #57
What are all of the BigQuery operations that Google charges for?
- Question #58
Which of the following is not possible using primitive roles?
- Question #59
Which of these statements about BigQuery caching is true?
- Question #60
Which of these sources can you not load data into BigQuery from?
- Question #61
Which of the following statements about Legacy SQL and Standard SQL is not true?
- Question #62
How would you query specific partitions in a BigQuery table?
- Question #63
Which SQL keyword can be used to reduce the number of columns processed by BigQuery?
- Question #64
To give a user read permission for only the first three columns of a table, which access control method would you use?
- Question #65
What are two methods that can be used to denormalize tables in BigQuery?
- Question #67
Which of these operations can you perform from the BigQuery Web UI?
- Question #68
Which methods can be used to reduce the number of rows processed by BigQuery?
- Question #69
Why do you need to split a machine learning dataset into training data and test data?
- Question #70
Which of these numbers are adjusted by a neural network as it learns from a training dataset (select 2 answers)?
- Question #72
Which software libraries are supported by Cloud Machine Learning Engine?
- Question #73
Which TensorFlow function can you use to configure a categorical column if you don't know all of the possible values for that column?
- Question #74
Which of the following statements about the Wide & Deep Learning model are true? (Select 2 answers.)
- Question #75
To run a TensorFlow training job on your own computer using Cloud Machine Learning Engine, what would your command start with?
- Question #77
Suppose you have a dataset of images that are each labeled as to whether or not they contain a human face. To create a neural network that recognizes human faces in images using th...
- Question #78
What are two of the characteristics of using online prediction rather than batch prediction?
- Question #79
Which of these are examples of a value in a sparse vector? (Select 2 answers.)
- Question #81
If a dataset contains rows with individual people and columns for year of birth, country, and income, how many of the columns are continuous and how many are categorical?
- Question #82
Which of the following are examples of hyperparameters? (Select 2 answers.)
- Question #83
Which of the following are feature engineering techniques? (Select 2 answers.)
- Question #84
You want to use a BigQuery table as a data sink. In which writing mode(s) can you use BigQuery as a sink?
- Question #86
When you run a pipeline that has a BigQuery source on your local machine, you keep getting permission-denied errors. What could be the reason for that?
- Question #87
What Dataflow concept determines when a Window's contents should be output based on certain criteria being met?
- Question #88
Which of the following is NOT one of the three main types of triggers that Dataflow supports?
- Question #89
Which Java SDK class can you use to run your Dataflow programs locally?
- Question #91
The _________ for Cloud Bigtable makes it possible to use Cloud Bigtable in a Cloud Dataflow pipeline.
- Question #92
Does Dataflow process batch data pipelines or streaming data pipelines?
- Question #93
You are planning to use Google's Dataflow SDK to analyze customer data such as that displayed below. Your project requirement is to extract only the customer name from the data source a...
- Question #94
Which Cloud Dataflow / Beam feature should you use to aggregate data in an unbounded data source every hour based on the time when the data entered the pipeline?
- Question #96
You are developing a software application using Google's Dataflow SDK, and want to use conditionals, for loops, and other complex programming structures to create a branching pipelin...
- Question #97
Which of the following IAM roles does your Compute Engine account require to be able to run pipeline jobs?
- Question #98
Which of the following is not true about Dataflow pipelines?
- Question #99
By default, which of the following windowing behaviors does Dataflow apply to unbounded data sets?
- Question #100
Which of the following job types are supported by Cloud Dataproc (select 3 answers)?
- Question #101
What are the minimum permissions needed for a service account used with Google Dataproc?
- Question #102
Which role must be assigned to a service account used by the virtual machines in a Dataproc cluster so they can execute jobs?
- Question #103
When creating a new Cloud Dataproc cluster with the projects.regions.clusters.create operation, these four values are required: project, region, name, and ____.
- Question #104
Which Google Cloud Platform service is an alternative to Hadoop with Hive?
- Question #105
Which of these rules apply when you add preemptible workers to a Dataproc cluster (select 2 answers)?
- Question #106
When using Cloud Dataproc clusters, you can access the YARN web interface by configuring a browser to connect through a ____ proxy.
- Question #107
Cloud Dataproc is a managed Apache Hadoop and Apache _____ service.
- Question #108
Which action can a Cloud Dataproc Viewer perform?
- Question #109
Dataproc clusters contain many configuration files. To update these files, you will need to use the --properties option. The format for the option is: file_prefix:property=_____.
- Question #110
Scaling a Cloud Dataproc cluster typically involves ____.
- Question #111
Cloud Dataproc charges you only for what you really use with _____ billing.