DATABRICKS-CERTIFIED-PROFESSIONAL-DATA-SCIENTIST Exam Questions
138 real DATABRICKS-CERTIFIED-PROFESSIONAL-DATA-SCIENTIST exam questions with expert-verified answers and explanations. Page 2 of 3.
- Question #51
Which of the following question statements falls under the data science category?
- Question #52
Which of the following skills does a data scientist require?
- Question #53
Which of the following steps will you use in the discovery phase?
- Question #54
Refer to the Exhibit. In the Exhibit, the table shows the values for the input Boolean attributes "A", "B", and "C". It also shows the values for the output attribute "class". Whic...
- Question #55
What are the key outcomes of successful analytical projects?
- Question #56
You are working on a data science project, and during the project you have been given the responsibility of interviewing all the stakeholders in the project. In which phase of the project...
- Question #57
While working with Netflix, the movie rating website, you have developed a recommender system that has produced ratings predictions for your data set that are consistently exactly 1...
- Question #58
Refer to the exhibit. You are using K-means clustering to classify customer behavior for a large retailer. You need to determine the optimum number of customer groups. You plot the...
- Question #59
You are doing advanced analytics for a medical application using regression, and you have two variables, weight and height, which are very important inpu...
- Question #60
You are working as a data science consultant for a gaming company. You have a three-member team, and all other stakeholders are from the company itself, like project managers and proj...
- Question #61
You are working on a classification model for a book written by HadoopExam Learning Resources and have decided to build a text classification model to determine whether this...
- Question #62
Stories appear on the front page of Digg as they are "voted up" (rated positively) by the community. As the community becomes larger and more diverse, the promoted stories can bett...
- Question #63
Which of the following is not a correct application of classification?
- Question #64
You are building a classifier on a very high-dimensional data set similar to the one shown in the image, with 5000 variables (lots of columns, not that many rows). It can handle both den...
- Question #65
Consider the following confusion matrix for a data set with 600 out of 11,100 instances positive: In this case, Precision = 50%, Recall = 83%, Specificity = 95%, and Accuracy = 95%...
- Question #66
The method based on principal component analysis (PCA) evaluates the features according to
- Question #67
Which analytical method is considered unsupervised? may have a trend component that is quadratic in nature. Which pattern of data will indicate that the trend in the time series da...
- Question #68
You have used k-means clustering to classify the behavior of 100,000 customers for a retail store. You decide to use household income, age, gender and yearly purchase amount as measur...
- Question #69
As a data science consultant at ABC Corp, you are working on a recommendation engine for learning resources for end users. Which recommender system technique benefits most...
- Question #70
Which activity is performed in the Operationalize phase of the Data Analytics Lifecycle?
- Question #71
In which phase of the analytic lifecycle would you expect to spend most of the project time?
- Question #72
Refer to the exhibit. You are asked to write a report on how specific variables impact your client's sales using a data set provided to you by the client. The data includes 15 variables...
- Question #73
You have data on 10,000 people who make purchases at a specific grocery store. You also have their income details in the data. You have created 5 clusters using this data. Bu...
- Question #74
You are working on a clustering solution for customer datasets. Almost 40 variables are available for each customer, and almost 1.00,0000 customers' data is availab...
- Question #75
You have data on 1000 patients, with height and age, where age is in years and height is in meters. You want to create clusters using these two attributes. You want to have near...
- Question #76
Which of the following is true with regard to the K-Means clustering algorithm?
- Question #77
In which lifecycle stage are test and training data sets created?
- Question #78
Select the correct statement which applies to logistic regression
- Question #79
You are studying the behavior of a population, and you are provided with multidimensional data at the individual level. You have identified four specific individuals who are valuab...
- Question #80
You are designing a recommendation engine for a website where the ability to generate more personalized recommendations by analyzing information from the past activity of a specifi...
- Question #81
Which of the following metrics are useful in measuring the accuracy and quality of a recommender system?
- Question #82
What is the best way to evaluate the quality of the model found by an unsupervised algorithm like k-means clustering, given metrics for the cost of the clustering (how well it fit...
- Question #83
In which lifecycle stage are appropriate analytical techniques determined?
- Question #84
A problem statement is given below. Hospital records show that, of patients suffering from a certain disease, 75% die of it. What is the probability that of 6 randomly selected pa...
- Question #85
Which of the following problems can you solve using the binomial distribution?
- Question #86
Of all the smokers in a particular district, 40% prefer brand A and 60% prefer brand B. Of those smokers who prefer brand A, 30% are females, and of those who prefer brand B, 40% ar...
- Question #87
Google AdWords studies the number of men and women clicking an advertisement on the search engine for an hour at midnight each day. Google finds that the number of men that c...
- Question #88
Projecting a multi-dimensional dataset onto which vector has the greatest variance?
- Question #89
There are 5000 balls of different colors, out of which 1200 are pink. What is the maximum likelihood estimate for the proportion of "pink" items in the test set of c...
- Question #90
If E1 and E2 are two events, how do you represent the conditional probability that E2 occurs given that E1 has occurred?
- Question #91
What is the probability that the total of two dice will be greater than 8, given that the first die is a 6?
- Question #92
Let A denote the event 'student is female' and let B denote the event 'student is French'. In a class of 100 students suppose 60 are French, and suppose that 10 of the French students...
- Question #93
Suppose there are three events. Which formula must always be equal to P(E1|E2,E3)?
- Question #94
You are working on an email spam filtering assignment. While working on it, you find that a new word, e.g. HadoopExam, appears in an email, and in your solution you have never come across t...
- Question #95
Consider flipping a coin for which the probability of heads is p, where p is unknown, and our goal is to estimate p. The obvious approach is to count how many times the coin came up...
- Question #96
Which of the following is a continuous probability distribution?
- Question #97
Which method is used to solve for coefficients b0, b1, ..., bn in your linear regression model:
- Question #98
In which phase of the data analytics lifecycle do Data Scientists spend the most time in a project?
- Question #99
You are using k-means clustering to classify heart patients for a hospital. You have chosen Patient Sex, Height, Weight, Age and Income as measures and have used 3 clusters. When y...
- Question #100
You are asked to create a model to predict the total number of monthly subscribers for a specific magazine. You are provided with 1 year's worth of subscription and payment data, u...
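
Several of the questions above (for example #90 to #93) rest on the definition of conditional probability, P(B|A) = P(A and B) / P(A). As an illustration only, the dice scenario in Question #91 can be checked by brute-force enumeration. The sketch below assumes two fair six-sided dice; the helper name `conditional_probability` is hypothetical and is not taken from the exam material.

```python
from fractions import Fraction
from itertools import product


def conditional_probability(event, given):
    """Return P(event | given) over all rolls of two fair six-sided dice.

    Both arguments are predicates that take a (first, second) tuple.
    Assumption: every one of the 36 ordered outcomes is equally likely.
    """
    outcomes = list(product(range(1, 7), repeat=2))
    # Restrict the sample space to outcomes where the condition holds.
    given_outcomes = [o for o in outcomes if given(o)]
    # Count the outcomes in that restricted space where the event also holds.
    both = [o for o in given_outcomes if event(o)]
    return Fraction(len(both), len(given_outcomes))


# Question #91 scenario: total greater than 8, given the first die is a 6.
p = conditional_probability(
    event=lambda roll: sum(roll) > 8,
    given=lambda roll: roll[0] == 6,
)
print(p)  # 2/3
```

With the first die fixed at 6, the total exceeds 8 exactly when the second die shows 3, 4, 5 or 6, which is 4 of the 6 equally likely values.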