Hotspot Question You plan to develop a dataset named Purchases by using Azure Databricks. Purchases will contain the following columns: - ProductID - ItemPrice - LineTotal - Quantity - StoreID - Minute - Month - Hour - Year - Day You need to store the data to support hourly incremental load pipelines that will vary for each StoreID. The solution must minimize storage costs. How should you complete the code? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point. Answer:

Question

Accepted Answer

df.write method: .partitionBy; method arguments: ("StoreID", "Hour"); final write action: .parquet("/Purchases")

Hotspot Question You plan to develop a dataset named Purchases by using Azure Databricks. Purchases will contain the following columns: - ProductID - ItemPrice - LineTotal - Quantity - StoreID - Minut

Question

Exhibit

Answer Area

Explanation

Topics

Community Discussion