MapReduce Types
Point out the correct statement: a) The reduce input must have the same types as the map output, although the reduce output types may be different again b) The map input key and value types (K1 and V1) may be different from the map output types (K2 and V2) c) The partition function operates on the intermediate key d) All of the mentioned
d) All of the mentioned In practice, the partition is determined solely by the key (the value is ignored).
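The key-only behaviour described above can be sketched in a few lines. This is a toy model, not Hadoop code: the function name `partition` is hypothetical, and `zlib.crc32` is an assumed stand-in for Java's `hashCode()`. Hadoop's default HashPartitioner follows the same pattern, computing `(key.hashCode() & Integer.MAX_VALUE) % numPartitions`.

```python
import zlib

def partition(key, value, num_partitions):
    """Return a partition index in [0, num_partitions) for an intermediate pair.

    The value is accepted but deliberately ignored, as in a hash partitioner.
    """
    # crc32 is an assumed stand-in for Java's hashCode(); it is stable across runs.
    return zlib.crc32(key.encode("utf-8")) % num_partitions

pairs = [("apple", 1), ("banana", 7), ("apple", 99)]
indices = [partition(k, v, 4) for k, v in pairs]
```

Both "apple" pairs land in the same partition regardless of their values, which is what guarantees that all values for one key reach the same reducer.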
An ___________ is responsible for creating the input splits, and dividing them into records. a) TextOutputFormat b) TextInputFormat c) OutputInputFormat d) InputFormat
d) InputFormat As a MapReduce application writer, you don't need to deal with InputSplits directly, as they are created by an InputFormat.
Point out the wrong statement: a) If V2 and V3 are the same, you only need to use setOutputValueClass() b) The overall effect of the default Streaming job is to perform a sort of the input c) A Streaming application can control the separator that is used when a key-value pair is turned into a series of bytes and sent to the map or reduce process over standard input d) None of the mentioned
d) None of the mentioned If a combine function is used, it has the same form as the reduce function, except that its output types are the intermediate key and value types (K2 and V2), so that its output can feed the reduce function.
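To make the type relationship concrete, here is a minimal word-count sketch in plain Python. All names (`map_fn`, `combine_fn`, `reduce_fn`, `run`) are hypothetical, and the "splits" are just lists of lines. The combiner takes (K2, list of V2) and emits (K2, V2), so its output can feed the reducer; the reducer is free to emit a different value type (here a string).

```python
from collections import defaultdict

def map_fn(_offset, line):
    # (K1, V1) -> list of (K2, V2): one (word, 1) pair per word
    return [(word, 1) for word in line.split()]

def combine_fn(word, counts):
    # (K2, [V2]) -> (K2, V2): output types match the intermediate types
    return [(word, sum(counts))]

def reduce_fn(word, counts):
    # (K2, [V2]) -> (K3, V3): V3 is a string here to show it may differ from V2
    return [(word, str(sum(counts)))]

def group(pairs):
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def run(splits, use_combiner):
    intermediate = []
    for split in splits:  # one simulated map task per split
        out = [kv for off, line in enumerate(split) for kv in map_fn(off, line)]
        if use_combiner:
            out = [kv for k, vs in group(out).items() for kv in combine_fn(k, vs)]
        intermediate.extend(out)
    grouped = group(intermediate)
    return dict(kv for k in sorted(grouped) for kv in reduce_fn(k, grouped[k]))

splits = [["a b a"], ["b a"]]
with_combiner = run(splits, use_combiner=True)
without_combiner = run(splits, use_combiner=False)
```

The result is the same with or without the combiner; the combiner only reduces how many intermediate pairs each map task hands to the shuffle.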
An input _________ is a chunk of the input that is processed by a single map. a) textformat b) split c) datanode d) all of the mentioned
b) split Each split is divided into records, and the map processes each record—a key-value pair—in turn.
______________ is another implementation of the MapRunnable interface that runs mappers concurrently in a configurable number of threads. a) MultithreadedRunner b) MultithreadedMap c) MultithreadedMapRunner d) SinglethreadedMapRunner
c) MultithreadedMapRunner A RecordReader is little more than an iterator over records, and the map task uses one to generate record key-value pairs, which it passes to the map function.
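The two ideas above, a record reader as an iterator and a runner that feeds records to the map function from several threads, can be sketched as follows. This is a plain-Python model under stated assumptions, not the Hadoop classes: `record_reader`, `run_single_threaded`, and `run_multithreaded` are hypothetical names, and `ThreadPoolExecutor` stands in for the runner's thread pool.

```python
from concurrent.futures import ThreadPoolExecutor

def record_reader(lines):
    """Iterate over records as (key, value) pairs; the key is the line index."""
    for index, line in enumerate(lines):
        yield index, line

def map_fn(key, value):
    # Emit one (upper-cased line, length) pair per input record.
    return [(value.upper(), len(value))]

def run_single_threaded(records, map_fn):
    output = []
    for key, value in records:  # one record at a time, in order
        output.extend(map_fn(key, value))
    return output

def run_multithreaded(records, map_fn, num_threads=4):
    # The thread count is configurable, as with a multithreaded runner.
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        results = pool.map(lambda kv: map_fn(*kv), records)
        return [pair for result in results for pair in result]

lines = ["alpha", "beta", "gamma"]
sequential = run_single_threaded(record_reader(lines), map_fn)
concurrent = run_multithreaded(record_reader(lines), map_fn, num_threads=2)
```

In this sketch `pool.map` happens to preserve input order; that is a property of the Python executor and should not be read as a guarantee made by Hadoop. A multithreaded runner mainly pays off when the map function is not CPU-bound, for example when it waits on external resources.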
_________ is the base class for all implementations of InputFormat that use files as their data source. a) FileTextFormat b) FileInputFormat c) FileOutputFormat d) None of the mentioned
b) FileInputFormat FileInputFormat provides an implementation for generating splits for the input files.
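As a rough illustration of split generation, the sketch below applies the documented split-size rule, max(minimumSize, min(maximumSize, blockSize)), to a single file. The function names and the (offset, length) tuple representation are assumptions made for this example, and refinements in the real implementation (such as unsplittable files) are deliberately left out.

```python
def compute_split_size(block_size, min_size=1, max_size=2**63 - 1):
    # Split size rule: max(minimumSize, min(maximumSize, blockSize))
    return max(min_size, min(max_size, block_size))

def get_splits(file_length, block_size, min_size=1, max_size=2**63 - 1):
    """Return (offset, length) pairs covering a file of file_length bytes."""
    split_size = compute_split_size(block_size, min_size, max_size)
    splits = []
    offset = 0
    while offset < file_length:
        length = min(split_size, file_length - offset)  # last split may be short
        splits.append((offset, length))
        offset += length
    return splits

# A 300-byte file with 128-byte blocks (tiny numbers chosen for readability).
splits = get_splits(file_length=300, block_size=128)
```

With the default minimum and maximum, the split size equals the block size, so each map task typically processes one block's worth of data.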
Which of the following is the default implementation used for running mappers? a) MapReducer b) MapRunner c) MapRed d) All of the mentioned
b) MapRunner Having calculated the splits, the client sends them to the jobtracker.
In _____________, the default job is similar, but not identical, to the Java equivalent. a) Mapreduce b) Streaming c) Orchestration d) All of the mentioned
b) Streaming In MapReduce, the data processing model is simple: the inputs and outputs of the map and reduce functions are key-value pairs.
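A toy model of the default Streaming job helps show why its overall effect is a sort of the input. The sketch assumes each line is split into key and value at the first separator (tab by default), that map and reduce are identity functions, and that the shuffle orders pairs by key. The function names are hypothetical, and real Streaming behaviour has details this model ignores.

```python
def to_pair(line, separator="\t"):
    # Text before the first separator is the key; the rest is the value.
    key, _, value = line.partition(separator)
    return key, value

def default_streaming_job(lines, separator="\t"):
    pairs = [to_pair(line, separator) for line in lines]  # identity map
    pairs.sort(key=lambda kv: kv[0])                      # shuffle sorts by key
    return pairs                                          # identity reduce

sorted_pairs = default_streaming_job(["b\t2", "a\t1", "c"])
comma_pairs = default_streaming_job(["y,2", "x,1"], separator=",")
```

The `separator` parameter mirrors the fact that a Streaming application can control which separator delimits key and value.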
___________ generates keys of type LongWritable and values of type Text. a) TextOutputFormat b) TextInputFormat c) OutputInputFormat d) None of the mentioned
b) TextInputFormat If K2 and K3 are the same, you don't need to call setMapOutputKeyClass(), because it defaults to the type set by setOutputKeyClass().
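With TextInputFormat, the LongWritable key is the byte offset of the line within the file and the Text value is the line's contents without the line terminator. A minimal sketch of that record shape, using a hypothetical `text_records` generator over an in-memory byte string:

```python
def text_records(data):
    """Yield (byte_offset, line) pairs from raw bytes, one per line."""
    offset = 0
    for raw_line in data.splitlines(keepends=True):
        # The value excludes the line terminator; the key is where the line starts.
        yield offset, raw_line.rstrip(b"\r\n").decode("utf-8")
        offset += len(raw_line)

records = list(text_records(b"alpha\nbeta\ngamma\n"))
```

Note that the keys are byte offsets, not line numbers: the second line starts at offset 6 because "alpha" plus its newline occupies six bytes.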
Which of the following methods adds a path or paths to the list of inputs? a) setInputPaths() b) addInputPath() c) setInput() d) none of the mentioned
b) addInputPath() FileInputFormat offers four static convenience methods for setting a JobConf's input paths.
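The difference between the "add" and "set" methods is that the former append to the list of inputs while the latter replace it in one go. The toy model below illustrates that semantics only; `ToyJobConf` and the snake_case functions are hypothetical Python stand-ins, not the Java API, and the paths are made up.

```python
class ToyJobConf:
    """Hypothetical stand-in for a job configuration holding input paths."""
    def __init__(self):
        self.input_paths = []

def add_input_path(conf, path):
    conf.input_paths.append(path)  # append a single path

def add_input_paths(conf, comma_separated_paths):
    for path in comma_separated_paths.split(","):
        add_input_path(conf, path)  # append each path in turn

def set_input_paths(conf, *paths):
    conf.input_paths = list(paths)  # replace the whole list

conf = ToyJobConf()
add_input_path(conf, "/data/2023")
add_input_paths(conf, "/data/2024,/data/2025")
after_adds = list(conf.input_paths)
set_input_paths(conf, "/data/only")
after_set = list(conf.input_paths)
```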