CERTIFIED-MACHINE-LEARNING-PROFESSIONAL · Question #37
The correct answer is B: One-way Chi-squared Test.
Question
A machine learning engineer is monitoring categorical input variables for a production machine learning application. The engineer believes that missing values are becoming more prevalent in more recent data for a particular value in one of the categorical input variables. Which of the following tools can the machine learning engineer use to assess their theory?
Options
- A. Kolmogorov-Smirnov (KS) test
- B. One-way Chi-squared Test
- C. Two-way Chi-squared Test
- D. Jensen-Shannon distance
- E. None of these
Explanation
Option B is correct because the one-way chi-squared (goodness-of-fit) test compares observed frequencies with expected frequencies for a single categorical variable, which is exactly what is needed here. The engineer can compare the observed counts of missing and non-missing values in recent data against the counts expected from the historical baseline rate. A small p-value indicates that the missingness rate has shifted. The test itself is non-directional, so the engineer should also check that the observed missing rate is higher than the baseline before concluding that missingness has increased.
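The comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the baseline missing rate and the recent counts are hypothetical numbers, and the function name `one_way_chi_squared` is made up for this example. It handles only the two-category case (missing vs. present), where the test has one degree of freedom and the p-value has a closed form.

```python
import math


def one_way_chi_squared(observed, expected_props):
    """Goodness-of-fit test for exactly two categories (1 degree of freedom).

    observed: counts in recent data, e.g. [missing, present]
    expected_props: baseline proportions for the same categories
    Returns (statistic, p_value).
    """
    if len(observed) != 2 or len(expected_props) != 2:
        raise ValueError("this sketch only supports two categories")
    total = sum(observed)
    # Expected counts if recent data followed the baseline proportions.
    expected = [p * total for p in expected_props]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # With 1 degree of freedom, P(X > stat) equals erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical numbers: 5% of baseline rows were missing,
# while 80 of the 1000 most recent rows are missing.
baseline_missing_rate = 0.05
recent_counts = [80, 920]  # [missing, present]

stat, p_value = one_way_chi_squared(
    recent_counts, [baseline_missing_rate, 1 - baseline_missing_rate]
)
observed_rate = recent_counts[0] / sum(recent_counts)

print(f"chi-squared = {stat:.2f}, p-value = {p_value:.2g}")
if p_value < 0.05 and observed_rate > baseline_missing_rate:
    print("Missingness is significantly higher than the baseline.")
```

In practice a statistics library would normally be used for this, and variables with more than two categories need the general chi-squared distribution rather than the one-degree-of-freedom shortcut used here.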
The distractors fail for these reasons:
- A (KS test): Designed for continuous distributions, not categorical data; it measures the maximum difference between two cumulative distribution functions.
- C (Two-way chi-squared): Tests for independence between two categorical variables, not whether a single variable's frequency distribution has drifted from an expectation.
- D (Jensen-Shannon distance): A distance measure between two distributions, not a hypothesis test. It produces a distance value but no p-value with which to confirm or reject a theory about increasing missingness.
- E (None): Incorrect because B works.
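To make the point about option D concrete, here is a small sketch of the Jensen-Shannon distance written from its standard definition (the square root of the Jensen-Shannon divergence, using base-2 logarithms so the result lies between 0 and 1). The function name and the two example distributions are hypothetical. Note that the output is a single number with no accompanying p-value.

```python
import math


def jensen_shannon_distance(p, q, base=2.0):
    """Jensen-Shannon distance between two discrete distributions.

    p and q are sequences of probabilities over the same categories.
    """
    # Mixture distribution halfway between p and q.
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(a, b):
        # Kullback-Leibler divergence; zero-probability terms contribute 0.
        return sum(x * math.log(x / y, base) for x, y in zip(a, b) if x > 0)

    divergence = (kl(p, m) + kl(q, m)) / 2
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(0.0, divergence))


# Hypothetical [missing, present] proportions.
baseline = [0.05, 0.95]
recent = [0.08, 0.92]

distance = jensen_shannon_distance(baseline, recent)
print(f"Jensen-Shannon distance = {distance:.4f}")
```

The distance says how far apart the two distributions are, but deciding whether that gap is statistically meaningful still requires a threshold chosen by the engineer or a separate hypothesis test.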
Memory tip: Think "one variable → one-way chi-squared." You have one thing to measure (missing vs. not missing) and you're checking if its observed counts match what you'd expect from historical data - that's a goodness-of-fit problem, which is exactly what the one-way chi-squared solves.