MLS-C01 Question #217: Real Exam Question with Answer & Explanation

The correct answers are A (IP Insights) and D (Random Cut Forest). The data has no labels, so this is an unsupervised anomaly detection problem.

Modeling

Question

An ecommerce company wants to use machine learning (ML) to monitor fraudulent transactions on its website. The company is using Amazon SageMaker to research, train, deploy, and monitor the ML models. The historical transactions data is in a .csv file that is stored in Amazon S3. The data contains features such as the user's IP address, navigation time, average time on each page, and the number of clicks for each session. There is no label in the data to indicate if a transaction is anomalous. Which models should the company use in combination to detect anomalous transactions? (Choose two.)

Options

  • A. IP Insights
  • B. K-nearest neighbors (k-NN)
  • C. Linear learner with a logistic function
  • D. Random Cut Forest (RCF)
  • E. XGBoost

Explanation

The data has no labels, which makes this an unsupervised anomaly detection problem, so the company should combine Amazon SageMaker IP Insights and Random Cut Forest (RCF). IP Insights learns the usual associations between entities (such as user accounts) and IPv4 addresses and flags unusual pairings, while RCF is a general-purpose unsupervised anomaly detection algorithm suited to finding statistical outliers in the other transaction features (navigation time, average time on each page, and number of clicks per session).
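Because the two algorithms consume different inputs, the single .csv file would likely need to be split before training. The following is a minimal sketch of that split, not a definitive pipeline. The column names (`user_id`, `ip_address`, `navigation_time`, `avg_time_per_page`, `clicks`) are hypothetical, since the question lists only the kinds of features, and the presence of an entity identifier such as `user_id` is an assumption. It relies on IP Insights training on headerless (entity, IPv4 address) pairs and RCF training on numeric feature rows.

```python
import csv
import io

# Hypothetical column names; the question does not give the real ones.
ENTITY_COL = "user_id"  # assumed entity identifier (account or session)
IP_COL = "ip_address"
NUMERIC_COLS = ["navigation_time", "avg_time_per_page", "clicks"]


def split_for_models(raw_csv: str):
    """Split one transactions CSV into two headerless CSV strings.

    Returns (ip_insights_csv, rcf_csv).
    """
    reader = csv.DictReader(io.StringIO(raw_csv))
    ip_out, rcf_out = io.StringIO(), io.StringIO()
    ip_writer = csv.writer(ip_out, lineterminator="\n")
    rcf_writer = csv.writer(rcf_out, lineterminator="\n")
    for row in reader:
        # IP Insights input: one (entity, IPv4 address) pair per line.
        ip_writer.writerow([row[ENTITY_COL], row[IP_COL]])
        # RCF input: numeric features only, one vector per line.
        rcf_writer.writerow([float(row[col]) for col in NUMERIC_COLS])
    return ip_out.getvalue(), rcf_out.getvalue()


# Tiny made-up sample, using documentation-reserved IP ranges.
sample = (
    "user_id,ip_address,navigation_time,avg_time_per_page,clicks\n"
    "u1,192.0.2.10,120.5,15.0,8\n"
    "u2,198.51.100.7,30.0,3.0,10\n"
)
ip_csv, rcf_csv = split_for_models(sample)
print(ip_csv)
print(rcf_csv)
```

Each output would then be uploaded to Amazon S3 as the training channel for its respective algorithm; that step is omitted here because it depends on the account's bucket and role setup.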

Common mistakes.

  • B. The SageMaker built-in k-nearest neighbors (k-NN) algorithm is a supervised, index-based algorithm for classification and regression, so it needs labeled training data. Distance-based methods can be adapted for anomaly detection in general, but RCF is the built-in algorithm designed for unsupervised anomaly detection.
  • C. A linear learner with a logistic function is a supervised classification algorithm, requiring labeled data (fraudulent vs. non-fraudulent) for training, which is explicitly stated as unavailable in the problem statement.
  • E. XGBoost is a highly effective and popular supervised learning algorithm used for classification and regression, but it requires labeled training data to identify fraudulent transactions, which is not available in this unsupervised scenario.

Concept tested. Unsupervised anomaly detection

Reference. https://docs.aws.amazon.com/sagemaker/latest/dg/ip-insights.html

Topics

#Anomaly Detection, #Unsupervised Learning, #Amazon SageMaker Built-in Algorithms, #Fraud Detection
