MLA-C01 Question #14: Real Exam Question with Answer & Explanation
The correct answer is A: LightGBM.
Question
Case Study: An ML engineer is developing a fraud detection model on AWS. The training dataset includes transaction logs, customer profiles, and tables from an on-premises MySQL database. The transaction logs and customer profiles are stored in Amazon S3. The dataset has a class imbalance that affects the learning of the model's algorithm. Additionally, many of the features have interdependencies. The algorithm is not capturing all the desired underlying patterns in the data. The ML engineer needs to use an Amazon SageMaker built-in algorithm to train the model. Which algorithm should the ML engineer use to meet this requirement?
Options
- A. LightGBM
- B. Linear learner
- C. K-means clustering
- D. Neural Topic Model (NTM)
Explanation
LightGBM is the correct choice because it is a gradient boosting framework that exposes parameters for class imbalance (such as scale_pos_weight and is_unbalance), captures non-linear feature interdependencies through its ensemble of decision trees, and is available as a SageMaker built-in algorithm. Together these make it well suited to structured, tabular fraud detection data.
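To make the class-imbalance point concrete, here is a minimal sketch of how scale_pos_weight is commonly derived: the ratio of negative to positive examples. The label counts are invented for illustration, and the hyperparameter dictionary is only an assumed starting point, not a tuned or official configuration.

```python
from collections import Counter

# Hypothetical labels: 980 legitimate transactions (0) and 20 fraudulent ones (1).
labels = [0] * 980 + [1] * 20

counts = Counter(labels)

# Common heuristic: weight the positive class by (negatives / positives)
# so the rare fraud cases contribute as much to the loss as the majority class.
scale_pos_weight = counts[0] / counts[1]

# Illustrative hyperparameters only; the values are assumptions, not recommendations.
hyperparameters = {
    "num_boost_round": 200,
    "learning_rate": 0.05,
    "num_leaves": 31,
    "scale_pos_weight": scale_pos_weight,
}

print(hyperparameters)
```

A dictionary like this would typically be passed as the hyperparameters of a training job. Check the exact names and accepted ranges against the current SageMaker LightGBM documentation before relying on them.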
Linear Learner (B) is wrong because it models the output as a linear function of the features, so without manual feature engineering it cannot capture the complex interdependencies the question explicitly mentions as a problem.
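The limitation can be seen on a toy example. The sketch below uses a hypothetical four-row dataset in which the label depends on the interaction of two binary features (an XOR pattern). It brute-forces a grid of linear decision rules and compares the best one with a hand-written depth-2 decision tree. The data and the grid are assumptions chosen for illustration, not part of the exam question.

```python
from itertools import product

# Hypothetical data: two binary flags per transaction; the label is 1 only
# when exactly one flag is set, so it depends on the interaction of both.
points = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]


def linear_accuracy(w1, w2, b):
    """Accuracy of the linear rule: predict 1 when w1*x1 + w2*x2 + b > 0."""
    correct = sum(
        int((w1 * x1 + w2 * x2 + b > 0) == bool(y)) for (x1, x2), y in points
    )
    return correct / len(points)


# Try every combination of weights and bias on a coarse grid from -3 to 3.
grid = [i / 2 for i in range(-6, 7)]
best_linear = max(
    linear_accuracy(w1, w2, b) for w1, w2, b in product(grid, repeat=3)
)


def tree_predict(x1, x2):
    """A depth-2 decision tree: split on x1, then on x2 within each branch."""
    if x1 == 0:
        return 1 if x2 == 1 else 0
    return 0 if x2 == 1 else 1


tree_accuracy = sum(tree_predict(x1, x2) == y for (x1, x2), y in points) / len(points)

print(best_linear, tree_accuracy)
```

No linear rule classifies all four rows correctly because the pattern is not linearly separable, while two nested splits suffice. Tree ensembles such as gradient boosting learn splits of this kind automatically.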
K-means (C) is wrong because it's an unsupervised clustering algorithm - it has no concept of a "fraud" label and cannot be used to train a supervised classification model.
Neural Topic Model (D) is wrong because NTM is designed for text data to discover abstract topics in documents - it has no application to tabular transaction/profile data.
Memory tip: When you see the combo "tabular data + class imbalance + feature interactions + supervised learning," think gradient boosting. Among the options listed, LightGBM is the only gradient boosting algorithm (SageMaker also offers XGBoost and CatBoost as built-ins, but they are not choices here). K-means = clustering only, NTM = text topics only, Linear Learner = linear problems only.