
Data Engineering

Question

A Data Scientist is developing a machine learning model to predict future patient outcomes based on information collected about each patient and their treatment plans. The model should output a continuous value as its prediction. The data available includes labeled outcomes for a set of 4,000 patients. The study was conducted on a group of individuals over the age of 65 who have a particular disease that is known to worsen with age. Initial models have performed poorly. While reviewing the underlying data, the Data Scientist notices that, out of 4,000 patient observations, there are 450 where the patient age has been input as 0. The other features for these observations appear normal compared to the rest of the sample population. How should the Data Scientist correct this issue?

Options

A. Drop all records from the dataset where age has been set to 0.
B. Replace the age field value for records with a value of 0 with the mean or median value from the
C. Drop the age feature from the dataset and train the model using the rest of the features.
D. Use k-means clustering to handle missing features.
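
To make options A, B, and C concrete, here is a minimal Python sketch of what each transformation does to a dataset. The records, the field names (`age`, `dosage_mg`), and the helper `impute_zero_age` are hypothetical and chosen only for illustration; they are not part of the question. For scale, the 450 affected observations are 450 / 4,000 = 11.25% of the data.

```python
from statistics import median

# Hypothetical toy records; field names and values are illustrative only.
patients = [
    {"age": 70, "dosage_mg": 10.0},
    {"age": 0,  "dosage_mg": 12.5},   # age entered as 0 (invalid for a 65+ cohort)
    {"age": 82, "dosage_mg": 9.0},
    {"age": 68, "dosage_mg": 11.0},
    {"age": 0,  "dosage_mg": 10.5},   # age entered as 0
    {"age": 75, "dosage_mg": 13.0},
]

def impute_zero_age(records):
    """Return (new_records, fill_value): ages of 0 replaced by the median valid age."""
    valid_ages = [r["age"] for r in records if r["age"] > 0]
    fill_value = median(valid_ages)
    new_records = [
        {**r, "age": fill_value} if r["age"] == 0 else dict(r)
        for r in records
    ]
    return new_records, fill_value

# Option A: drop every record whose age is 0 (loses the other features too).
dropped_rows = [r for r in patients if r["age"] != 0]

# Option B: keep every record and replace age 0 with the median valid age.
imputed, fill_value = impute_zero_age(patients)

# Option C: keep every record but discard the age feature entirely.
dropped_feature = [{k: v for k, v in r.items() if k != "age"} for r in patients]

print(len(dropped_rows), len(imputed), len(dropped_feature))
print(fill_value)
```

In this sketch, option A shrinks the dataset, option C keeps all rows but removes a feature the question describes as related to disease progression, and option B keeps both the rows and the feature at the cost of replacing the invalid ages with an estimate.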


Topics

#Data Cleaning #Missing Data Imputation #Data Preprocessing
