DP-100 Question #91: Real Exam Question with Answer & Explanation
The correct answer is D: Synthetic Minority Oversampling Technique (SMOTE). The task is to resolve class imbalance in a classification model where Class A is a minority class (100 samples) and Class B is a majority class (10,000 samples) with high variation.
Question
You create a classification model with a dataset that contains 100 samples of Class A and 10,000 samples of Class B. The variation of Class B is very high. You need to resolve the imbalance. Which method should you use?
Options
- A. Partition and Sample
- B. Cluster Centroids
- C. Tomek links
- D. Synthetic Minority Oversampling Technique (SMOTE)
Explanation
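SMOTE resolves the imbalance by oversampling the minority class. Instead of duplicating rows, it generates synthetic Class A samples by interpolating between existing Class A samples and their nearest Class A neighbors. No Class B samples are removed, so the high variation in Class B is preserved. Undersampling 10,000 samples down toward 100 would discard most of that information.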
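To make the idea behind SMOTE concrete, here is a minimal sketch in plain Python. It is an illustration of the interpolation step only, not the implementation used by the Azure Machine Learning SMOTE component or by any library. The function name `smote`, its parameters, and the toy data are all hypothetical, chosen only to mirror the 100 vs. 10,000 split in the question.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Return n_synthetic points interpolated between minority neighbours.

    Hypothetical helper for illustration; not a library API.
    """
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy data standing in for the question's classes (values are made up).
data_rng = random.Random(42)
class_a = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(100)]
class_b = [(data_rng.gauss(5, 4), data_rng.gauss(5, 4)) for _ in range(10_000)]

# Oversample Class A until it matches Class B; Class B is left untouched.
new_a = smote(class_a, n_synthetic=len(class_b) - len(class_a))
balanced_a = class_a + new_a
print(len(balanced_a), len(class_b))  # prints: 10000 10000
```

Every synthetic point lies on a line segment between two real Class A points, so the minority class grows without copying rows and without touching Class B. In practice you would more likely use an existing implementation, such as the `SMOTE` class in `imblearn.over_sampling` from the imbalanced-learn package, than hand-rolled code like this.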
Common mistakes.
- A. Partition and Sample is a general data preparation step for splitting datasets and sampling, which does not directly resolve class imbalance.
- B. Cluster Centroids is an undersampling technique that reduces the majority class, potentially leading to a loss of valuable information from that class.
- C. Tomek links is an undersampling technique used to clean decision boundaries by removing majority class samples, which also reduces the size of the majority class and might discard useful data.
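To see why Tomek links is a boundary-cleaning step and not a rebalancing method, the sketch below finds Tomek links in a tiny made-up 1-D dataset. A Tomek link is a pair of points from different classes that are each other's nearest neighbor. The helper name `tomek_links` and the data are hypothetical, and the brute-force search is only meant for a handful of points.

```python
import math


def tomek_links(points, labels):
    """Return index pairs (i, j) that form Tomek links.

    Hypothetical helper for illustration; brute-force nearest neighbours.
    """
    def nearest(i):
        return min(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(points[i], points[j]),
        )

    nn = [nearest(i) for i in range(len(points))]
    # Keep mutual nearest neighbours whose class labels differ.
    return [
        (i, j)
        for i, j in enumerate(nn)
        if i < j and nn[j] == i and labels[i] != labels[j]
    ]


# Made-up data: one Class A point sits right next to the Class B cluster.
points = [(0.0,), (0.2,), (1.0,), (1.1,), (3.0,), (3.2,), (5.0,)]
labels = ["A", "A", "A", "B", "B", "B", "B"]

links = tomek_links(points, labels)
# Undersampling with Tomek links drops the majority ("B") member of each link.
majority_to_drop = sorted({i if labels[i] == "B" else j for i, j in links})
print(links, majority_to_drop)  # prints: [(2, 3)] [3]
```

Only the single Class B point that touches the class boundary is flagged. On a real dataset the method typically removes a small number of boundary samples in the same way, so it would leave a 100 vs. 10,000 split almost as skewed as before.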
Concept tested. Imbalanced dataset handling techniques
Reference. https://learn.microsoft.com/en-us/azure/machine-learning/component-reference/smote