nerdexam
Cloudera

DS-200 Question #39: Real Exam Question with Answer & Explanation

Question

You have a large file of N records (one per line) and want to randomly sample 10% of them. You have two functions that are perfect random number generators (though they are a bit slow):

  • random_uniform() generates a uniformly distributed number in the interval [0, 1].
  • random_permutation(M) generates a random permutation of the numbers 0 through M - 1.

Below are three different functions that implement the sampling.

Method A

    for line in file:
        if random_uniform() < 0.1:
            print line

Method B

    i = 0
    for line in file:
        if i % 10 == 0:
            print line
        i += 1

Method C

    idxs = random_permutation(N)[:(N/10)]
    i = 0
    for line in file:
        if i in idxs:
            print line
        i += 1

Which method is least likely to give you exactly 10% of your data?

Options

  • A. Method A
  • B. Method B
  • C. Method C
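
To compare how many records each method returns, the three methods in the question can be sketched in Python. This is a minimal sketch under stated assumptions, not the exam's reference code: the file is modeled as an in-memory list, Python's `random.Random` stands in for the "perfect" generators (its `random()` draws from [0, 1) rather than [0, 1]), and the names `method_a`, `method_b`, `method_c`, and `exact_hit_rate` are hypothetical helpers introduced here.

```python
import random


def method_a(lines, rng):
    # Method A: one independent uniform draw per line.
    # The number of lines kept is itself random (binomial with p = 0.1).
    return [line for line in lines if rng.random() < 0.1]


def method_b(lines):
    # Method B: deterministic, keeps every 10th line starting at index 0.
    return [line for i, line in enumerate(lines) if i % 10 == 0]


def method_c(lines, rng):
    # Method C: shuffle the indices 0..N-1 and keep the first N // 10 of them.
    n = len(lines)
    perm = list(range(n))
    rng.shuffle(perm)  # stand-in for random_permutation(N)
    idxs = set(perm[: n // 10])  # a set makes the membership test O(1)
    return [line for i, line in enumerate(lines) if i in idxs]


def exact_hit_rate(n=1000, trials=200, seed=0):
    # Fraction of trials in which Method A returns exactly n // 10 lines.
    rng = random.Random(seed)
    lines = [f"record {i}" for i in range(n)]
    target = n // 10
    hits = sum(len(method_a(lines, rng)) == target for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    lines = [f"record {i}" for i in range(1000)]
    rng = random.Random(42)
    print("Method A size:", len(method_a(lines, rng)))
    print("Method B size:", len(method_b(lines)))
    print("Method C size:", len(method_c(lines, rng)))
    print("Method A exact-10% rate:", exact_hit_rate())
```

In this sketch, Methods B and C fix the sample size by construction (every 10th index, or the first N // 10 entries of a permutation), while Method A's size varies from run to run because each line is kept or dropped independently. Note that Method B is not a random sample at all, which is a separate concern from the sample-size question asked here.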
