MLS-C01 Question #189: Real Exam Question with Answer & Explanation

The correct answers are B (add new columns that store a one-hot representation of colors) and E (replace each color name by its training set frequency). Wall_Color is a nominal categorical variable: its values have no inherent order.

Modeling

Question

A machine learning specialist is developing a regression model to predict rental rates from rental listings. A variable named Wall_Color represents the most prominent exterior wall color of the property. The following is the sample data, excluding all other variables: The specialist chose a model that needs numerical input data. Which feature engineering approaches should the specialist use to allow the regression model to learn from the Wall_Color data? (Choose two.)

Options

  • A. Apply integer transformation and set Red = 1, White = 5, and Green = 10.
  • B. Add new columns that store one-hot representation of colors.
  • C. Replace the color name string by its length.
  • D. Create three columns to encode the color in RGB format.
  • E. Replace each color name by its training set frequency.

Explanation

Correct answers: B, E

Wall_Color is a nominal categorical variable: its values (Red, White, Green) have no inherent order, so any encoding must avoid implying a ranking.

Option B (one-hot encoding) creates a separate binary column per color, representing membership without any false ordinal relationship. Option E (frequency encoding) replaces each color with how often it appears in the training set, which can capture a prevalence signal the model may find useful.

Option A (integer encoding with arbitrary values such as 1, 5, and 10) is incorrect because it implies Red < White < Green, a false ordinal relationship with arbitrary distances that the regression model would treat as real. Option C (replacing the name with its string length) discards the color information: White and Green both have five letters and would become indistinguishable. Option D (RGB encoding) adds three correlated numeric columns that introduce engineering complexity without a clear benefit for a regression task predicting rental rates.
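The two accepted encodings can be sketched in a few lines of plain Python. This is a minimal illustration, not the exam's reference solution: the sample values in `wall_colors` and the helper names `one_hot_encode` and `frequency_encode` are hypothetical, since the question's sample data is not reproduced on this page.

```python
from collections import Counter

# Hypothetical sample values; the question's actual sample table is not shown.
wall_colors = ["Red", "White", "Green", "Red", "White", "Red"]


def one_hot_encode(values):
    """Return (categories, rows); each row holds a single 1 marking its category."""
    categories = sorted(set(values))
    rows = [
        [1 if value == category else 0 for category in categories]
        for value in values
    ]
    return categories, rows


def frequency_encode(train_values, values):
    """Replace each value with its relative frequency in the training set."""
    counts = Counter(train_values)
    total = len(train_values)
    # A category never seen in training gets 0.0 (Counter returns 0 for it).
    return [counts[value] / total for value in values]


categories, one_hot_rows = one_hot_encode(wall_colors)
frequencies = frequency_encode(wall_colors, wall_colors)

print(categories)
for color, row, freq in zip(wall_colors, one_hot_rows, frequencies):
    print(color, row, round(freq, 3))
```

One-hot encoding keeps every color distinct at the cost of one column per category. Frequency encoding needs only one column, but colors that happen to occur equally often in the training set receive the same value. In practice, library routines such as pandas `get_dummies` or scikit-learn `OneHotEncoder` would typically handle the one-hot step.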

Topics

#Feature Engineering #Categorical Data Encoding #Regression Models #One-Hot Encoding
