Data Preparation for Machine Learning

Question

A company is using Amazon SageMaker AI to create a classification model to categorize the company’s sales performance for each month of the previous 20 years on a scale from 1 to 5. The dataset includes fields for month, sales region, regional aggregate sales, and the number of stores in each sales region. The company notices that during two months of every year, the aggregate sales values are unexpectedly high. The company performs one-hot encoding on all non-numerical features in the training and validation datasets. The company uses the training dataset to train the classification model. When the company evaluates the model against the validation dataset, the results are less accurate than expected. The company must improve the model’s accuracy on the validation dataset. Which solution will meet this requirement?

Options

A. Remove records that include outliers across all features.
B. Use a stratified split on the month and sales region features.
C. Perform normalization on the aggregate sales feature.
D. Perform normalization on the aggregate sales feature for each sales region.
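
To make the techniques named in options B and D concrete, here is a minimal sketch, using only the Python standard library, of a stratified split on month and sales region and of min-max normalization computed within each sales region. The record layout, the helper names `stratified_split` and `normalize_per_group`, the 25% validation fraction, and the choice of months 11 and 12 as the high-sales months are all assumptions made for illustration; the question does not specify them, and the sketch does not indicate which option is correct.

```python
import random
from collections import defaultdict


def stratified_split(records, key_fields, val_fraction=0.25, seed=0):
    """Split records so every combination of key_fields appears in both
    the training and validation sets in roughly the same proportion."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for rec in records:
        groups[tuple(rec[f] for f in key_fields)].append(rec)

    train, val = [], []
    for members in groups.values():
        members = members[:]          # copy so the caller's list is untouched
        rng.shuffle(members)
        # Groups with a single record go entirely to training.
        n_val = max(1, round(len(members) * val_fraction)) if len(members) > 1 else 0
        val.extend(members[:n_val])
        train.extend(members[n_val:])
    return train, val


def normalize_per_group(records, group_field, value_field):
    """Min-max scale value_field separately within each group_field value."""
    bounds = {}
    for rec in records:
        v = rec[value_field]
        lo, hi = bounds.get(rec[group_field], (v, v))
        bounds[rec[group_field]] = (min(lo, v), max(hi, v))

    scaled_records = []
    for rec in records:
        lo, hi = bounds[rec[group_field]]
        scaled = 0.0 if hi == lo else (rec[value_field] - lo) / (hi - lo)
        scaled_records.append({**rec, value_field + "_scaled": scaled})
    return scaled_records


# Hypothetical data: 20 years x 12 months x 2 regions, with months 11 and 12
# assumed (for illustration only) to be the unusually high-sales months.
data_rng = random.Random(42)
records = [
    {
        "year": year,
        "month": month,
        "region": region,
        "sales": data_rng.uniform(50, 100) * (3 if month in (11, 12) else 1),
    }
    for year in range(2005, 2025)
    for month in range(1, 13)
    for region in ("north", "south")
]

train, val = stratified_split(records, ["month", "region"])
train_scaled = normalize_per_group(train, "region", "sales")
```

With a stratified split, the high-sales months are represented in both sets rather than landing mostly in one of them by chance. In practice the scaling bounds would be computed on the training set and then reused for the validation set; the sketch scales only the training records to stay short.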

Topics

Data Splitting, Stratified Sampling, Validation Accuracy, Categorical Features