DP-100 Question #93: Real Exam Question with Answer & Explanation
The correct answer is B: Split Rows with the Randomized split parameter set to true.
Question
You are working with a time series dataset in Azure Machine Learning Studio. You need to split your dataset into training and testing subsets by using the Split Data module. Which splitting mode should you use?
Options
- A. Regular Expression Split
- B. Split Rows with the Randomized split parameter set to true
- C. Relative Expression Split
- D. Recommender Split
Explanation
Note: this answer is technically problematic for time series data. Split Rows with Randomized split set to true shuffles the rows before splitting, which destroys temporal order and causes data leakage: observations from the future can end up in the training set. For a genuine time series split, use Split Rows with Randomized split set to false, which preserves the chronological sequence as long as the rows are already sorted by time. The other options are irrelevant here: Regular Expression Split (A) filters rows by text patterns, Relative Expression Split (C) splits on a condition applied to a column's values, and Recommender Split (D) is for collaborative filtering datasets. If B is the given exam answer, it likely reflects a poorly constructed question.
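The leakage concern can be illustrated outside of Studio with a minimal Python sketch. This is not the Split Data module itself. The function names, the 80/20 fraction, and the synthetic daily series are all assumptions made for illustration. The ordered split stands in for Randomized split set to false, and the shuffled split stands in for Randomized split set to true.

```python
import random
from datetime import date, timedelta


def chronological_split(rows, train_fraction=0.8):
    """Sort by timestamp, then cut once: earlier rows train, later rows test."""
    ordered = sorted(rows, key=lambda r: r[0])
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


def randomized_split(rows, train_fraction=0.8, seed=0):
    """Shuffle first, then cut: timestamps end up mixed across both subsets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def has_leakage(train, test):
    """True if any training row is dated after the earliest test row."""
    return max(r[0] for r in train) > min(r[0] for r in test)


# Hypothetical series: 100 daily (timestamp, value) observations.
start = date(2024, 1, 1)
series = [(start + timedelta(days=i), float(i)) for i in range(100)]

chrono_train, chrono_test = chronological_split(series)
rand_train, rand_test = randomized_split(series)

print("chronological split leaks:", has_leakage(chrono_train, chrono_test))
print("randomized split leaks:", has_leakage(rand_train, rand_test))
```

With the ordered split, every training timestamp precedes every test timestamp, so the check reports no leakage. With the shuffled split, later dates almost inevitably land in the training subset.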