E20-007 Exam Questions
162 real E20-007 exam questions with expert-verified answers and explanations. Page 3 of 4.
- Question #110
What does the R code z <- f[1:10, ] do?
- Question #111
In R, functions like plot() and hist() are known as what?
- Question #112
Review the following code: SELECT pn, vn, sum(prc*qty) FROM sale GROUP BY CUBE(pn, vn) ORDER BY 1, 2, 3; Which combination of subtotals do you expect to be returned by the query?
- Question #113
In MADlib what does MAD stand for?
- Question #114
The web analytics team uses Hadoop to process access logs. They now want to correlate this data with structured user data residing in their massively parallel database. Which tool...
- Question #115
When would you prefer a Naive Bayes model to a logistic regression model for classification?
- Question #116
Before you build an ARMA model, how can you tell if your time series is weakly stationary?
- Question #117
In a Student's t-test, what is the meaning of the p-value?
- Question #118
In addition to less data movement and the ability to use larger datasets in calculations, what is a benefit of analytical calculations in a database?
- Question #119
You have been assigned to do a study of the daily revenue effect of a pricing model of online transactions. When have you completed the analytics lifecycle?
- Question #120
Consider these itemsets: (hat, scarf, coat), (hat, scarf, coat, gloves), (hat, scarf, gloves), (hat, gloves), (scarf, coat, gloves). What is the confidence of the rule (gloves -> hat)?
- Question #121
What is holdout data?
- Question #122
Which characteristic applies mainly to Data Science as opposed to Business Intelligence?
- Question #123
Which word or phrase completes the statement? Theater actor is to "Artistic and Expressive" as Data Scientist is to ________________
- Question #124
Which process in text analysis can be used to reduce dimensionality?
- Question #125
What is the format of the output from the Map function of MapReduce?
- Question #126
Which data type value is used for the observed response variable in a logistic regression model?
- Question #127
A data scientist is given an R data frame, "empdata", with the columns Age, Salary, Occupation, Education, and Gender. The data scientist would like to examine only the Salary and...
- Question #128
What is required in a presentation for business analysts?
- Question #129
What is LOESS used for?
- Question #130
Which word or phrase completes the statement? Mahout is to Hadoop as MADlib is to ____________ .
- Question #131
In linear regression modeling, which action can be taken to improve the linearity of the relationship between the dependent and independent variables?
- Question #132
Data visualization is used in the final presentation of an analytics project. For what else is this technique commonly used?
- Question #133
Which data asset is an example of semi-structured data?
- Question #134
Your colleague, who is new to Hadoop, approaches you with a question. They want to know how best to access their data. This colleague has previously worked extensively with SQL and...
- Question #135
In linear regression, what indicates that an estimated coefficient is significantly different from zero?
- Question #136
Which graphical representation shows the distribution and multiple summary statistics of a continuous variable for each value of a corresponding discrete variable?
- Question #137
Assume that you have a data frame in R. Which function would you use to display descriptive statistics about this variable?
- Question #138
What is the mandatory clause that must be included when using window functions?
- Question #139
What is the purpose of the process step "parsing" in text analysis?
- Question #140
Which word or phrase completes the statement? A data warehouse is to a centralized database for reporting as an analytic sandbox is to a _______?
- Question #141
You do a Student's t-test to compare the average test scores of sample groups from populations A and B. Group A averaged 10 points higher than group B. You find that this differenc...
- Question #142
Which word or phrase completes the statement? Business Intelligence is to ad-hoc reporting and dashboards as Data Science is to ______________ .
- Question #143
What is a property of window functions in SQL commands?
- Question #144
You are attempting to find the Euclidean distance between two centroids. Centroid A's coordinates: (X = 2, Y = 4). Centroid B's coordinates: (X = 8, Y = 10). Which formula finds the c...
- Question #145
In data visualization, which type of chart is recommended to represent frequency data?
- Question #146
A data scientist is asked to implement an article recommendation feature for an online magazine. The magazine does not want to use client tracking technologies such as cookies or...
- Question #147
How are window functions different from regular aggregate functions?
- Question #148
Consider these itemsets: (hat, scarf, coat), (hat, scarf, coat, gloves), (hat, scarf, gloves), (hat, gloves), (scarf, coat, gloves). What is the confidence of the rule (hat, scarf) -> g...
- Question #149
In the MapReduce framework, what is the purpose of the Map Function?
- Question #150
You have completed your model and are handing it off to be deployed in production. What should you deliver to the production team, along with your commented code?
- Question #151
While having a discussion with your colleague, this person mentions that they want to perform K-means clustering on text file data stored in HDFS. Which tool would you recommend t...
- Question #152
Which method is used to solve for the coefficients b0, b1, ..., bn in your linear regression model: Y = b0 + b1x1 + b2x2 + ... + bnxn?
- Question #153
What describes a true limitation of the logistic regression method?
- Question #154
You submit a MapReduce job to a Hadoop cluster and notice that although the job was successfully submitted, it is not completing. What should you do?
- Question #155
A disk drive manufacturer has a defect rate of less than 1.5% with 98% confidence. A quality assurance team samples 1000 disk drives and finds 14 defective units. Which action shou...
- Question #156
Which word or phrase completes the statement? Data-ink ratio is to data visualization as __________ .
- Question #157
Consider a database with 4 transactions: Transaction 1: {cheese, bread, milk} Transaction 2: {soda, bread, milk} Transaction 3: {cheese, bread} Transaction 4: {cheese, soda, juice}...
- Question #158
You are using the Apriori algorithm to determine the likelihood that a person who owns a home has a good credit score. You have determined that the confidence for the rules used in...
- Question #159
Consider a database with 4 transactions: Transaction 1: {cheese, bread, milk} Transaction 2: {soda, bread, milk} Transaction 3: {cheese, bread} Transaction 4: {cheese, soda, juice}...