E20-007 Question #29: Real Exam Question with Answer & Explanation

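The correct answer is B: run MapReduce to transform the data, and find relevant key-value pairs. K-means operates on numeric feature vectors, so the raw XML profiles must first be parsed and reduced to structured features before any clustering can run.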

Question

You are given 10,000,000 user profile pages of an online dating site as XML files stored in HDFS. You are assigned to divide the users into groups based on the content of their profiles. You have been instructed to try K-means clustering on this data. How should you proceed?

Options

  • A. Divide the data into sets of 1,000 user profiles, and run K-means clustering in RHadoop iteratively.
  • B. Run MapReduce to transform the data, and find relevant key-value pairs.
  • C. Run a Naive Bayes classification as a pre-processing step in HDFS.
  • D. Partition the data by XML file size, and run K-means clustering in each partition.
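
The transformation described in option B can be sketched as a map step that parses each XML profile into key-value pairs and a reduce step that aggregates them. The sketch below simulates both steps in plain Python rather than using Hadoop itself. The profile schema (a `profile` element with an `id` attribute and an `about` child), the sample data, and the function names `map_profile`, `reduce_counts`, and `run` are all assumptions made for illustration; the question does not specify the XML layout or which features are relevant.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical profile schema; the real XML layout is not given in the question.
SAMPLE_PROFILES = [
    '<profile id="u1"><about>Hiking and jazz. Hiking every weekend.</about></profile>',
    '<profile id="u2"><about>Jazz, cooking, travel</about></profile>',
]


def map_profile(xml_text):
    """Map step: parse one profile and emit ((user_id, term), 1) pairs."""
    root = ET.fromstring(xml_text)
    user_id = root.get("id")
    text = root.findtext("about", default="")
    # Lowercase the free text and split it into alphabetic terms.
    for term in re.findall(r"[a-z]+", text.lower()):
        yield (user_id, term), 1


def reduce_counts(pairs):
    """Reduce step: sum the emitted values for each (user_id, term) key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def run(profiles):
    """Chain the map and reduce steps over an iterable of XML strings."""
    pairs = (pair for xml_text in profiles for pair in map_profile(xml_text))
    return reduce_counts(pairs)


term_counts = run(SAMPLE_PROFILES)
```

The resulting per-user term counts are one possible basis for the numeric feature vectors that K-means needs. On a real cluster the same map and reduce logic would run as a distributed job over the files in HDFS, and the choice of features would depend on the actual profile content.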
