70-475 Question #85: Real Exam Question with Answer & Explanation

The correct answers are D (Azure Data Lake) and E (Azure Data Factory).

Question

You have an Apache Hive cluster in Microsoft Azure HDInsight. The cluster contains 10 million data files. You plan to archive the data. The data will be analyzed monthly. You need to recommend a solution to move and store the data. The solution must minimize how long it takes to move the data and must minimize costs. Which two services should you include in the recommendation? Each correct answer presents part of the solution. NOTE: Each correct selection is worth one point.

Options

  • A. Azure Queue storage
  • B. Microsoft SQL Server Integration Services (SSIS)
  • C. Azure Table Storage
  • D. Azure Data Lake
  • E. Azure Data Factory

Explanation

D: To analyze data in an HDInsight cluster, you can store the data in Azure Storage, in Azure Data Lake Storage Gen1 or Gen2, or in both. Both storage options let you safely delete HDInsight clusters that are used for computation without losing user data.

E: The Spark activity in a Data Factory pipeline executes a Spark program on your own or an on-demand HDInsight cluster. It handles data transformation.

https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-hadoop-use-data-lake-store
https://docs.microsoft.com/en-us/azure/data-factory/transform-data-using-spark
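
To make the pairing of D and E more concrete, the following is a minimal sketch of what a Data Factory pipeline that moves files into Azure Data Lake might look like, built as a plain Python dictionary and printed as JSON. It is not taken from the question or its explanation. The pipeline name, activity name, and dataset names are hypothetical, and the choice of a Copy activity with a Blob source is an assumption about where the cluster's files live; a real deployment would also need linked services and dataset definitions that are not shown here.

```python
import json


def build_copy_pipeline(source_dataset: str, sink_dataset: str) -> dict:
    """Build a pipeline definition containing a single Copy activity.

    All names below are hypothetical placeholders for illustration.
    """
    return {
        "name": "ArchiveHiveData",  # hypothetical pipeline name
        "properties": {
            "activities": [
                {
                    "name": "CopyClusterStorageToDataLake",  # hypothetical
                    "type": "Copy",
                    # Datasets are referenced by name; they must be
                    # defined separately in the data factory.
                    "inputs": [
                        {"referenceName": source_dataset,
                         "type": "DatasetReference"}
                    ],
                    "outputs": [
                        {"referenceName": sink_dataset,
                         "type": "DatasetReference"}
                    ],
                    "typeProperties": {
                        # Assumes the source files sit in Azure Blob storage.
                        "source": {"type": "BlobSource", "recursive": True},
                        # Sink type for Azure Data Lake Storage Gen1.
                        "sink": {"type": "AzureDataLakeStoreSink"},
                    },
                }
            ]
        },
    }


pipeline = build_copy_pipeline("HiveWarehouseBlob", "ArchiveDataLake")
print(json.dumps(pipeline, indent=2))
```

In this sketch Data Factory does the moving and Data Lake is only the destination, which mirrors how the two selected options each cover one part of the solution.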
