Microsoft
AI-900 · Question #19
AI-900 Question #19: Real Exam Question with Answer & Explanation
The correct answer is C: to test the model by using data that was not used to train the model. Holding out a random subset of rows gives an estimate of how the model performs on unseen data, which is how overfitting is detected.
Submitted by haruto_sh · Mar 30, 2026 · Describe fundamental principles of machine learning on Azure
Question
When training a model, why should you randomly split the rows into separate subsets?
Options
- A. to train the model twice to attain better accuracy
- B. to train multiple models simultaneously to attain better performance
- C. to test the model by using data that was not used to train the model
Explanation
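Randomly splitting the rows into separate training and testing subsets lets you evaluate the model on data it did not see during training. Comparing performance on the two subsets shows whether the model generalizes or has overfit the training data. The split does not prevent overfitting by itself; it makes overfitting detectable. Choosing the rows at random helps keep both subsets representative of the full dataset, for example when the rows are stored in a sorted order.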
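The random split described above can be sketched in a few lines of plain Python. This is a minimal illustration, not how Azure Machine Learning implements it: the helper name `random_split`, the 30% test fraction, and the seed are all assumptions chosen for the example.

```python
import random

def random_split(rows, test_fraction=0.3, seed=42):
    """Shuffle a copy of rows, then cut it into (train, test) lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(rows)                  # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_test = round(len(shuffled) * test_fraction)
    # The first n_test shuffled rows are held out; the rest are used for training.
    return shuffled[n_test:], shuffled[:n_test]

rows = list(range(10))  # stand-in for ten dataset rows
train_rows, test_rows = random_split(rows)
```

Every row lands in exactly one subset, so no test row can leak into training. The model is fitted on `train_rows` only and scored on `test_rows`.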
Common mistakes.
- A. Training a model twice, whether on the same data or on different subsets, does not inherently improve accuracy. Without a held-out test set, the model can still be overfit and there is no way to tell.
- B. While multiple models can be trained, the primary reason for splitting data is not to train them simultaneously for better performance, but rather to properly evaluate a single model's generalization capability.
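A small sketch can show why scoring a model on its own training rows is misleading. It assumes a deliberately bad, hypothetical "model" that only memorizes the training rows, and synthetic data whose labels are coin flips, so there is nothing real to learn. All names here are illustrative.

```python
import random

rng = random.Random(0)

# Hypothetical dataset: 200 rows, each a (row_id, label) pair with a random 0/1 label.
rows = [(i, rng.randint(0, 1)) for i in range(200)]
rng.shuffle(rows)
train_rows, test_rows = rows[:140], rows[140:]  # 70% train, 30% test

# The "model" memorizes every training row and guesses 0 for anything unseen.
memory = dict(train_rows)

def predict(row_id):
    return memory.get(row_id, 0)

def accuracy(data):
    """Fraction of rows whose predicted label matches the true label."""
    return sum(predict(row_id) == label for row_id, label in data) / len(data)

train_accuracy = accuracy(train_rows)  # perfect, because every row was memorized
test_accuracy = accuracy(test_rows)    # only as good as guessing 0 on unseen rows

print(f"train accuracy: {train_accuracy:.2f}")
print(f"test accuracy:  {test_accuracy:.2f}")
```

The training score is perfect even though the model has learned nothing that transfers. Only the score on the held-out rows reveals that.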
Concept tested. Data splitting for machine learning evaluation
Topics
#Data splitting #Model evaluation #Training data #Test data