PROFESSIONAL-DATA-ENGINEER · Question #233
PROFESSIONAL-DATA-ENGINEER Question #233: Real Exam Question with Answer & Explanation
Sign in or unlock PROFESSIONAL-DATA-ENGINEER to reveal the answer and full explanation for question #233. The question stem and answer options stay visible for context.
Question
You want to build a managed Hadoop system as your data lake. The data transformation process is composed of a series of Hadoop jobs executed in sequence. To accomplish the design of separating storage from compute, you decided to use the Cloud Storage connector to store all input data, output data, and intermediary data. However, you noticed that one Hadoop job runs very slowly with Cloud Dataproc, when compared with the on-premises bare-metal Hadoop environment (8-core nodes with 100-GB RAM). Analysis shows that this particular Hadoop job is disk I/O intensive. You want to resolve the issue. What should you do?
Options
- AAllocate sufficient memory to the Hadoop cluster, so that the intermediary data of that particular Hadoop job can be held in memory
- BAllocate sufficient persistent disk space to the Hadoop cluster, and store the intermediate data of that particular Hadoop job on native HDFS
- CAllocate more CPU cores of the virtual machine instances of the Hadoop cluster so that the networking bandwidth for each instance can scale up
- DAllocate additional network interface card (NIC), and configure link aggregation in the operating system to use the combined throughput when working with
Unlock PROFESSIONAL-DATA-ENGINEER to see the answer
You've previewed enough free PROFESSIONAL-DATA-ENGINEER questions. Unlock PROFESSIONAL-DATA-ENGINEER for full answers, explanations, the timed quiz mode, progress tracking, and the master PDF. Question stem and options stay visible so you can still see what's on the exam.