

CERTIFIED-DATA-ENGINEER-PROFESSIONAL Question #37: Real Exam Question with Answer & Explanation

The correct answer is C: cross-region reads and writes can incur significant costs and latency, so whenever possible compute should be deployed in the same region where the data is stored. The contractors are based in India, but all the company's data is stored in regional cloud storage in the United States.

Databricks Infrastructure Planning

Question

A small company based in the United States has recently contracted a consulting firm in India to implement several new data engineering pipelines to power artificial intelligence applications. All the company's data is stored in regional cloud storage in the United States. The workspace administrator at the company is uncertain about where the Databricks workspace used by the contractors should be deployed. Assuming that all data governance considerations are accounted for, which statement accurately informs this decision?

Options

  • A. Databricks runs HDFS on cloud volume storage; as such, cloud virtual machines must be
  • B. Databricks workspaces do not rely on any regional infrastructure; as such, the decision should be
  • C. Cross-region reads and writes can incur significant costs and latency; whenever possible,
  • D. Databricks leverages user workstations as the driver during interactive development; as such,
  • E. Databricks notebooks send all executable code from the user's browser to virtual machines over

Explanation

Option C is correct. The decision is about where to deploy the Databricks workspace used by the contractors, who are based in India, while all the company's data is stored in regional cloud storage in the United States. When choosing a region for a Databricks workspace, one of the most important factors is proximity to the data sources and sinks, because the clusters that process the data run in the workspace's region. Cross-region reads and writes can incur significant costs, since cloud providers typically charge data transfer fees for traffic between regions, and significant latency, since every read and write has to cross a long network path. Therefore, whenever possible, compute should be deployed in the same region where the data is stored to optimize performance and reduce costs. In this scenario, that means a United States region matching the storage, even though the contractors work from India.
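The cost and latency reasoning in this explanation can be made concrete with a rough back-of-the-envelope estimate. The sketch below is illustrative only: the function names are hypothetical, and the per-GB rate and round-trip time are made-up placeholder values, not actual cloud provider pricing or measured latency. Check your provider's current data transfer pricing before relying on any figure.

```python
def estimate_cross_region_cost(gb_per_day: float, usd_per_gb: float, days: int = 30) -> float:
    """Estimate the data transfer charge (USD) for moving data between regions.

    usd_per_gb is an assumed inter-region rate supplied by the caller;
    it is NOT a real price from any cloud provider.
    """
    if gb_per_day < 0 or usd_per_gb < 0 or days < 0:
        raise ValueError("inputs must be non-negative")
    return gb_per_day * usd_per_gb * days


def estimate_added_latency_seconds(round_trips: int, rtt_ms: float) -> float:
    """Estimate time spent waiting on the network for sequential storage requests.

    rtt_ms is an assumed round-trip time between the compute region and the
    storage region. Real jobs issue many requests in parallel, so treat this
    as a rough upper bound.
    """
    if round_trips < 0 or rtt_ms < 0:
        raise ValueError("inputs must be non-negative")
    return round_trips * rtt_ms / 1000.0


if __name__ == "__main__":
    # Hypothetical workload: 500 GB read per day at an assumed 0.02 USD/GB.
    monthly_cost = estimate_cross_region_cost(gb_per_day=500, usd_per_gb=0.02)
    # Hypothetical: 10,000 sequential requests at an assumed 200 ms round trip.
    wait = estimate_added_latency_seconds(round_trips=10_000, rtt_ms=200.0)
    print(f"Estimated monthly transfer cost: ${monthly_cost:.2f}")
    print(f"Estimated added network wait: {wait:.0f} s")
```

When compute and storage share a region, the inter-region transfer charge does not apply and the round-trip time drops sharply, which is the trade-off option C describes.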

Topics

#Cloud Cost Management · #Data Locality · #Databricks Workspace Deployment · #Performance Optimization
