
MLS-C01 Question #352: Real Exam Question with Answer & Explanation

The correct answers are D (F1 score) and E (True positive rate). The question asks for two metrics, and for an imbalanced fraud detection dataset where the goal is to capture as many fraudulent transactions as possible, the true positive rate (recall) and the F1 score are the most relevant.

Modeling

Question

A company wants to detect credit card fraud. The company has observed that an average of 2% of credit card transactions are fraudulent. A data scientist trains a classifier on a year's worth of credit card transaction data. The classifier needs to identify the fraudulent transactions. The company wants to accurately capture as many fraudulent transactions as possible. Which metrics should the data scientist use to optimize the classifier? (Choose two.)

Options

  • A. Specificity
  • B. False positive rate
  • C. Accuracy
  • D. F1 score
  • E. True positive rate

Explanation

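With only 2% of transactions fraudulent, the dataset is heavily imbalanced. The true positive rate (recall) is the fraction of actual fraudulent transactions that the classifier flags, so it directly measures how much fraud is captured. The F1 score is the harmonic mean of precision and recall, so it rewards capturing fraud while keeping the flagged transactions accurate. Together they are the most relevant metrics for this goal.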
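As a rough illustration of how these two metrics are computed, the sketch below derives recall, precision, F1, and accuracy from confusion-matrix counts. The helper name `classification_metrics` and the counts (10,000 transactions, 200 of them fraudulent) are hypothetical values chosen to match the 2% fraud rate in the question, not figures from the exam.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return recall (TPR), precision, F1, and accuracy from confusion-matrix counts."""
    # Recall / true positive rate: share of actual fraud that was flagged.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Precision: share of flagged transactions that were really fraud.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # F1: harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"recall": recall, "precision": precision, "f1": f1, "accuracy": accuracy}


# Hypothetical counts: 10,000 transactions, 2% fraud (200 fraudulent).
metrics = classification_metrics(tp=150, fp=50, fn=50, tn=9750)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these assumed counts, accuracy is 0.99 even though a quarter of the fraud is missed, whereas recall and F1 (both 0.75) reflect that gap.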

Common mistakes.

  • A. Specificity measures the proportion of actual negative cases (non-fraudulent transactions) that are correctly identified, which is not the primary objective of capturing fraudulent transactions.
  • B. False positive rate measures the proportion of non-fraudulent transactions that are incorrectly classified as fraudulent, and while minimizing it is important, it does not directly address the goal of capturing as many fraudulent transactions as possible.
  • C. Accuracy can be misleading in imbalanced datasets because a model that always predicts the majority class will still show high accuracy, failing to correctly identify the critical minority class (fraud).
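To make the points about options A, B, and C concrete, here is a minimal sketch of a baseline that never flags fraud, evaluated on hypothetical data with the 2% fraud rate from the question. The labels and variable names are illustrative assumptions only.

```python
# Hypothetical data: 10,000 transactions with a 2% fraud rate (1 = fraud, 0 = legitimate).
y_true = [1] * 200 + [0] * 9800
# Baseline "classifier" that always predicts the majority class (never flags fraud).
y_pred = [0] * len(y_true)

pairs = list(zip(y_true, y_pred))
tp = sum(1 for t, p in pairs if t == 1 and p == 1)
fn = sum(1 for t, p in pairs if t == 1 and p == 0)
fp = sum(1 for t, p in pairs if t == 0 and p == 1)
tn = sum(1 for t, p in pairs if t == 0 and p == 0)

accuracy = (tp + tn) / len(y_true)      # looks strong despite catching no fraud
specificity = tn / (tn + fp)            # option A: perfect for this baseline
false_positive_rate = fp / (fp + tn)    # option B: also perfect (zero)
true_positive_rate = tp / (tp + fn)     # option E: exposes the failure

print(f"accuracy={accuracy:.2f}, specificity={specificity:.2f}, "
      f"FPR={false_positive_rate:.2f}, TPR={true_positive_rate:.2f}")
```

In this sketch accuracy, specificity, and false positive rate all look ideal while the true positive rate is zero, which is why they are poor optimization targets when the goal is to capture fraud.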

Concept tested. Classification metrics for imbalanced datasets

Reference. https://scikit-learn.org/stable/modules/model_evaluation.html#f1-score

Topics

#Machine Learning Metrics #Classification #Imbalanced Data #Fraud Detection
