Cloudera
DS-200 Question #39: Real Exam Question with Answer & Explanation
Question
You have a large file of N records (one per line) and want to randomly sample 10% of them. You have two functions that are perfect random number generators (though they are a bit slow):

random_uniform() generates a uniformly distributed number in the interval [0, 1].
random_permutation(M) generates a random permutation of the numbers 0 through M - 1.

Below are three different functions that implement the sampling.

Method A

    for line in file:
        if random_uniform() < 0.1:
            print line

Method B

    i = 0
    for line in file:
        if i % 10 == 0:
            print line
        i += 1

Method C

    idxs = random_permutation(N)[:(N/10)]
    i = 0
    for line in file:
        if i in idxs:
            print line
        i += 1

Which method is least likely to give you exactly 10% of your data?
Options
- A. Method A
- B. Method B
- C. Method C
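One way to reason about the three methods is to run them and compare how many records each one keeps. The following is a minimal sketch, not part of the original question: it assumes Python's `random.Random` as a stand-in for `random_uniform` and `random_permutation` (note that `rng.random()` draws from [0, 1) rather than [0, 1]), and the names `method_a`, `method_b`, `method_c`, and `sample_sizes` are hypothetical helpers introduced for illustration.

```python
import random


def method_a(lines, rng):
    # Method A: keep each line independently with probability 0.1.
    return [line for line in lines if rng.random() < 0.1]


def method_b(lines):
    # Method B: keep every tenth line, starting with the first one.
    return [line for i, line in enumerate(lines) if i % 10 == 0]


def method_c(lines, rng):
    # Method C: shuffle the indices 0..N-1 (assumed stand-in for
    # random_permutation(N)) and keep the first N // 10 of them.
    n = len(lines)
    perm = list(range(n))
    rng.shuffle(perm)
    idxs = set(perm[: n // 10])  # a set makes the membership test fast
    return [line for i, line in enumerate(lines) if i in idxs]


def sample_sizes(n=1000, trials=200, seed=0):
    # Collect the distinct sample sizes each method produces over many runs.
    rng = random.Random(seed)
    lines = [f"record {i}" for i in range(n)]
    return {
        "A": {len(method_a(lines, rng)) for _ in range(trials)},
        "B": {len(method_b(lines))},
        "C": {len(method_c(lines, rng)) for _ in range(trials)},
    }


if __name__ == "__main__":
    for name, sizes in sample_sizes().items():
        print(name, sorted(sizes))
```

Running this shows whether a method's sample size is fixed or varies from run to run. Methods B and C pick a number of lines that depends only on N, while the count under Method A is the sum of N independent coin flips and so follows a binomial distribution around N/10.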