nerdexam
Microsoft

DP-100 · Question #91

DP-100 Question #91: Real Exam Question with Answer & Explanation

The correct answer is D: Synthetic Minority Oversampling Technique (SMOTE). The task is to resolve class imbalance in a classification model where Class A is a minority class (100 samples) and Class B is a majority class (10,000 samples) with high variation.

Design and prepare a machine learning solution

Question

You create a classification model with a dataset that contains 100 samples with Class A and 10,000 samples with Class B. The variation of Class B is very high. You need to resolve imbalances. Which method should you use?

Options

  • A. Partition and Sample
  • B. Cluster Centroids
  • C. Tomek links
  • D. Synthetic Minority Oversampling Technique (SMOTE)

Explanation

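The dataset is heavily imbalanced: Class A is the minority class (100 samples) and Class B is the majority class (10,000 samples), and Class B has high variation. SMOTE resolves this by oversampling the minority class. It creates synthetic Class A samples by interpolating between existing Class A samples and their nearest Class A neighbors, so no Class B samples are discarded and the variation in Class B is preserved.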
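The interpolation idea behind SMOTE can be illustrated with a minimal standard-library sketch. This is not the Azure Machine Learning SMOTE component or any library's implementation; the function name `smote_sketch`, its parameters, and the toy 2D data are hypothetical and chosen only to mirror the 100 versus 10,000 counts in the question.

```python
import math
import random


def smote_sketch(minority, n_synthetic, k=5, seed=0):
    """Return n_synthetic points interpolated between minority neighbours.

    Hypothetical helper for illustration only, not a library API.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Assumed toy data: 100 minority points (Class A) in 2D.
data_rng = random.Random(42)
class_a = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(100)]

# Generate 9,900 synthetic points so Class A matches the 10,000 of Class B.
new_points = smote_sketch(class_a, n_synthetic=9_900)
balanced_a = class_a + new_points
print(len(balanced_a))
```

Because each synthetic point lies on a segment between two real Class A points, the minority class grows without touching any Class B sample.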

Common mistakes.

  • A. Partition and Sample is a general data preparation step for splitting datasets and sampling, which does not directly resolve class imbalance.
  • B. Cluster Centroids is an undersampling technique that reduces the majority class, potentially leading to a loss of valuable information from that class.
  • C. Tomek links is an undersampling technique used to clean decision boundaries by removing majority class samples, which also reduces the size of the majority class and might discard useful data.
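To make the Tomek links option concrete, here is a small sketch that finds Tomek links in the usual sense: pairs of samples from different classes that are each other's nearest neighbour. The helper `tomek_links` and the 1D toy data are hypothetical, written only for illustration. Since each sample has a single nearest neighbour, each Class A sample can take part in at most one link, so with 100 Class A samples this kind of cleaning could remove at most 100 of the 10,000 Class B samples.

```python
import math


def tomek_links(points, labels):
    """Return index pairs (i, j) that are mutual nearest neighbours
    and carry different labels. Hypothetical helper for illustration."""
    def nearest(i):
        return min(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(points[i], points[j]),
        )

    nn = [nearest(i) for i in range(len(points))]
    return [
        (i, j)
        for i, j in enumerate(nn)
        if i < j and nn[j] == i and labels[i] != labels[j]
    ]


# Assumed toy data: one Class A point sits right next to a Class B point.
points = [(0.0,), (1.0,), (1.1,), (5.0,), (6.0,)]
labels = ["A", "A", "B", "B", "B"]

links = tomek_links(points, labels)
# For each link, pick the majority (Class B) member as the removal candidate.
majority_to_drop = [j if labels[j] == "B" else i for i, j in links]
print(links, majority_to_drop)
```

In this toy data only the boundary pair is flagged, which shows why this technique cleans the decision boundary rather than rebalancing the classes.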

Concept tested. Imbalanced dataset handling techniques

Reference. https://learn.microsoft.com/en-us/azure/machine-learning/component-reference/smote

Topics

#Imbalanced datasets #Oversampling #SMOTE #Data preparation
