CCD-410 Exam Questions
57 real CCD-410 exam questions with expert-verified answers and explanations. Page 1 of 2.
- Question #1
MapReduce v2 (MRv2/YARN) is designed to address which two issues?
- Question #2
You need to run the same job many times with minor variations. Rather than hardcoding all job configuration options in your driver code, you've decided to have your Driver subclass...
- Question #3
You are developing a MapReduce job for sales reporting. The mapper will process input keys representing the year (IntWritable) and input values representing product identifiers (Te...
- Question #4
Identify the MapReduce v2 (MRv2/YARN) daemon responsible for launching application containers and monitoring application resource usage.
- Question #5
Which best describes how TextInputFormat processes input files and line breaks?
- Question #6
For each input key-value pair, mappers can emit:
- Question #7
In a MapReduce job, the reducer receives all values associated with the same key. Which statement best describes the ordering of these values?
- Question #8
You need to create a job that does frequency analysis on input data. You will do this by writing a Mapper that uses TextInputFormat and splits each value (a line of text from an in...
- Question #9
You want to count the number of occurrences for each unique word in the supplied input data. You've decided to implement this by having your mapper tokenize each word and emit a li...
- Question #10
Your client application submits a MapReduce job to your Hadoop cluster. Identify the Hadoop daemon on which the Hadoop framework will look for an available slot to schedule a MapReduc...
- Question #11
Which project gives you a distributed, scalable data store that allows random, real-time read/write access to hundreds of terabytes of data?
- Question #12
You use the hadoop fs -put command to write a 300 MB file using an HDFS block size of 64 MB. Just after this command has finished writing 200 MB of this file, what would another u...
- Question #13
Identify the tool best suited to import a portion of a relational database every day as files into HDFS, and generate Java classes to interact with that imported data?
- Question #14
You have a directory named jobdata in HDFS that contains four files: _first.txt, second.txt, .third.txt and #data.txt. How many files will be processed by the FileInputFormat.setIn...
- Question #15
You write a MapReduce job to process 100 files in HDFS. Your MapReduce algorithm uses TextInputFormat: the mapper applies a regular expression over input values and emits key-values...
- Question #16
A combiner reduces:
- Question #18
MapReduce v2 (MRv2/YARN) splits which major functions of the JobTracker into separate daemons? Select two.
- Question #19
What types of algorithms are difficult to express in MapReduce v1 (MRv1)?
- Question #20
In the reducer, the MapReduce API provides you with an iterator over Writable values. What does calling the next() method return?
- Question #21
Table metadata in Hive is:
- Question #22
Analyze each scenario below and identify which best describes the behavior of the default partitioner.
- Question #23
You need to move a file titled "weblogs" into HDFS. When you try to copy the file, you can't. You know you have ample space on your DataNodes. Which action should you take to relie...
- Question #24
In a large MapReduce job with m mappers and n reducers, how many distinct copy operations will there be in the sort/shuffle phase?
- Question #25
Workflows expressed in Oozie can contain:
- Question #26
Which best describes what the map method accepts and emits?
- Question #27
When can a reduce class also serve as a combiner without affecting the output of a MapReduce program?
- Question #28
You want to perform analysis on a large collection of images. You want to store this data in HDFS and process it with MapReduce but you also want to give your data analysts and dat...
- Question #29
You want to run Hadoop jobs on your development workstation for testing before you submit them to your production cluster. Which mode of operation in Hadoop allows you to most clos...
- Question #30
Your cluster's HDFS block size is 64 MB. You have a directory containing 100 plain text files, each of which is 100 MB in size. The InputFormat for your job is TextInputFormat. Determi...
- Question #31
What is a SequenceFile?
- Question #32
When is the earliest point at which the reduce method of a given Reducer can be called?
- Question #33
Which describes how a client reads a file from HDFS?
- Question #36
How are keys and values presented and passed to the reducers during a standard sort and shuffle phase of MapReduce?
- Question #37
Assuming default settings, which best describes the order of data provided to a reducer's reduce method:
- Question #38
You wrote a map function that throws a runtime exception when it encounters a control character in input data. The input supplied to your mapper contains twelve such characters tot...
- Question #39
You want to populate an associative array in order to perform a map-side join. You've decided to put this information in a text file, place that file into the DistributedCache and...
- Question #40
You've written a MapReduce job that will process 500 million input records and generate 500 million key-value pairs. The data is not uniformly distributed. Your MapReduce job will...
- Question #41
Can you use MapReduce to perform a relational join on two large tables sharing a key? Assume that the two tables are formatted as comma-separated files in HDFS.
- Question #42
You have just executed a MapReduce job. Where is intermediate data written to after being emitted from the Mapper's map method?
- Question #43
You want to understand more about how users browse your public website, such as which pages they visit prior to placing an order. You have a farm of 200 web servers hosting your we...
- Question #44
You have the following key-value pairs as output from your Map task: (the, 1) (fox, 1) (faster, 1) (than, 1) (the, 1) (dog, 1) How many keys will be passed to the Reducer's reduce...
- Question #46
What is the disadvantage of using multiple reducers with the default HashPartitioner and distributing your workload across your cluster?
- Question #47
Given a directory of files with the following structure: line number, tab character, string: Example: 1 abialkjfjkaoasdfjksdlkjhqweroij 2 kadfjhuwqounahagtnbvaswslmnbfgy 3 kjfteiom...
- Question #48
You need to perform statistical analysis in your MapReduce job and would like to call methods in the Apache Commons Math library, which is distributed as a 1.3 megabyte Java archiv...
- Question #49
The Hadoop framework provides a mechanism for coping with machine issues such as faulty configuration or impending hardware failure. MapReduce detects that one or a number of machi...
- Question #50
For each intermediate key, each reducer task can emit:
- Question #51
What data does a Reducer's reduce method process?
- Question #52
All keys used for intermediate output from mappers must:
- Question #53
On a cluster running MapReduce v1 (MRv1), a TaskTracker heartbeats into the JobTracker on your cluster, and alerts the JobTracker it has an open map task slot. What determines how...
- Question #54
Identify which best defines a SequenceFile.