MLS-C01 Question #321: Real Exam Question with Answer & Explanation

The correct answer is B: Logarithmic transformation. A distribution where mode < median < mean is right-skewed, and a logarithmic transformation is commonly used to reduce that skew before fitting a linear regression model.

Modeling

Question

A data scientist is building a linear regression model. The scientist inspects the dataset and notices that the mode of the distribution is lower than the median, and the median is lower than the mean. Which data transformation will give the data scientist the ability to apply a linear regression model?

Options

  • A. Exponential transformation
  • B. Logarithmic transformation
  • C. Polynomial transformation
  • D. Sinusoidal transformation

Explanation

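The ordering mode < median < mean indicates a right-skewed (positively skewed) distribution: a long tail of large values pulls the mean above the median. A logarithmic transformation compresses large values more than small ones, which shortens that tail and makes the distribution more symmetric and closer to normal. That is why it is commonly applied to such data before fitting a linear regression model. The logarithm is defined only for strictly positive values, so the data must be positive (or shifted to be positive) first.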
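
The effect can be checked on synthetic data. The following is a minimal sketch, not part of the exam material: it assumes log-normal samples as a stand-in for the right-skewed dataset, and the `skewness` helper is a hypothetical name defined here for illustration (the mean of cubed z-scores).

```python
import math
import random
import statistics


def skewness(values):
    """Population skewness: the mean of the cubed z-scores."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return sum(((v - mean) / std) ** 3 for v in values) / len(values)


random.seed(0)  # fixed seed so repeated runs use the same sample

# Assumption: log-normal samples stand in for the right-skewed dataset.
# They are strictly positive, so the logarithm is defined for every value.
raw = [random.lognormvariate(0.0, 1.0) for _ in range(10_000)]
logged = [math.log(v) for v in raw]

raw_skew = skewness(raw)
log_skew = skewness(logged)

print(f"raw:    median={statistics.median(raw):.3f} "
      f"mean={statistics.fmean(raw):.3f} skewness={raw_skew:.3f}")
print(f"logged: median={statistics.median(logged):.3f} "
      f"mean={statistics.fmean(logged):.3f} skewness={log_skew:.3f}")
```

In the raw sample the median sits below the mean and the skewness is clearly positive. After the log transformation the median and mean nearly coincide and the skewness is close to zero.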

Common mistakes.

  • A. Exponential transformation would increase the skewness of an already right-skewed distribution, moving it further away from a normal distribution suitable for linear regression.
  • C. Polynomial transformation primarily helps capture non-linear relationships between variables but does not inherently correct for skewness in the distribution of a single variable to make it more Gaussian.
  • D. Sinusoidal transformation is used for cyclical or periodic data, which is not indicated by the described skewness of the distribution.
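
The contrast between options A and B can be sketched numerically. This is an illustrative example under stated assumptions: the data is synthetic (log-normal samples), and `skewness` is a hypothetical helper defined here, not a function from the referenced AWS post.

```python
import math
import random
import statistics


def skewness(values):
    """Population skewness: the mean of the cubed z-scores."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return sum(((v - mean) / std) ** 3 for v in values) / len(values)


random.seed(1)  # fixed seed so repeated runs use the same sample

# Assumption: a mildly right-skewed, strictly positive synthetic sample.
data = [random.lognormvariate(0.0, 0.5) for _ in range(5_000)]

candidates = {
    "original": data,
    "exponential (option A)": [math.exp(v) for v in data],
    "logarithmic (option B)": [math.log(v) for v in data],
}

# Skewness after each candidate transformation.
results = {name: skewness(vals) for name, vals in candidates.items()}
for name, value in results.items():
    print(f"{name:<24} skewness = {value:.2f}")
```

On this sample the exponential transformation stretches the right tail and raises the skewness, while the logarithmic transformation brings it close to zero.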

Concept tested. Data transformation for skewed distributions

Reference. https://aws.amazon.com/blogs/machine-learning/optimizing-machine-learning-predictions-with-data-transformations-part-1-power-transforms/

Topics

#Data Transformation, #Skewness, #Linear Regression Preprocessing, #Statistical Distributions
