
Map Reduce Concepts

Job Tracker

The JobTracker is responsible for accepting jobs from clients, dividing those jobs into tasks, and assigning those tasks to be executed by worker nodes.

Task Tracker

The TaskTracker is a process that manages the execution of the tasks currently assigned to its node. Each TaskTracker has a fixed number of slots for executing tasks (two map slots and two reduce slots by default).

MapReduce co-located with HDFS


[Figure: a client submits a MapReduce job to the JobTracker. Slave nodes A, B, and C each run a TaskTracker and a DataNode.]

The JobTracker and the NameNode need not be on the same node.

TaskTrackers (compute nodes) and DataNodes are co-located, which gives high aggregate bandwidth across the cluster.

Introduction to MapReduce Framework

MapReduce is a programming model for parallel data processing. Hadoop can run MapReduce programs written in multiple languages, such as Java, Python, Ruby, and C++.

Map function:
    Operates on a set of (key, value) pairs.
    Map is applied in parallel across the input data set.
    It produces output keys and a list of values for each key, depending on the functionality.
    Mapper output is partitioned per reducer; the number of partitions equals the number of reduce tasks for the job.

Reduce function:
    Operates on a set of (key, value) pairs.
    Reduce is then applied in parallel to each group, again producing a collection of (key, value) pairs.
    The number of reducers can be set by the user.
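The map and reduce functions described above can be illustrated without a cluster. The following is a minimal in-memory sketch, not Hadoop code: the class InMemoryWordCount and its methods are hypothetical names chosen for illustration, and the grouping step stands in for what the framework does between the two phases.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical in-memory illustration; this is not part of the Hadoop API.
final class InMemoryWordCount {

    // Map function: emits one (word, 1) pair for every token of an input line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<Map.Entry<String, Integer>>();
        StringTokenizer tokenizer = new StringTokenizer(line);
        while (tokenizer.hasMoreTokens()) {
            out.add(new SimpleEntry<String, Integer>(tokenizer.nextToken(), 1));
        }
        return out;
    }

    // Reduce function: sums the list of values collected for one key.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // Applies map to every line, groups intermediate values by key,
    // then applies reduce once per key.
    static Map<String, Integer> run(List<String> lines) {
        Map<String, List<Integer>> groups = new TreeMap<String, List<Integer>>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> pair : map(line)) {
                List<Integer> values = groups.get(pair.getKey());
                if (values == null) {
                    values = new ArrayList<Integer>();
                    groups.put(pair.getKey(), values);
                }
                values.add(pair.getValue());
            }
        }
        Map<String, Integer> result = new TreeMap<String, Integer>();
        for (Map.Entry<String, List<Integer>> group : groups.entrySet()) {
            result.put(group.getKey(), reduce(group.getKey(), group.getValue()));
        }
        return result;
    }
}
```

In a real job the map calls run on different nodes and the grouping happens during the shuffle; here everything runs in one loop only to make the data flow visible.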

How does a map-reduce algorithm work?

Understanding processing in a MapReduce framework


The user runs a program on the client computer.
The program submits a job to Hadoop. The job contains:
    Input data
    Map / Reduce program
    Configuration information
Two types of daemons control job execution:
    JobTracker (master node)
    TaskTrackers (slave nodes)
The job is sent to the JobTracker.
The JobTracker communicates with the NameNode and assigns parts of the job to TaskTrackers (a TaskTracker runs on each DataNode).
A task is a single map or reduce operation over a piece of data.
Hadoop divides the input of a MapReduce job into fixed-size splits.
The JobTracker knows (from the NameNode) which node contains the data and which other machines are nearby.
Task processes send heartbeats to their TaskTracker, and each TaskTracker sends heartbeats to the JobTracker.

Any task that does not report within a certain time (10 minutes by default) is assumed to have failed; the TaskTracker kills its JVM and reports the failure to the JobTracker.
The JobTracker reschedules any failed task (on a different TaskTracker).
If the same task fails 4 times, the whole job fails.
A TaskTracker that reports a high number of failed tasks is blacklisted, and no further tasks are assigned to that node.
The JobTracker maintains and manages the status of each job. Results from failed tasks are ignored.
In summary: 1 JobTracker (master), n TaskTrackers (slaves), m tasks.
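The rescheduling rule above (retry a failed task, give up on the whole job after 4 failures of the same task) can be sketched as a small loop. This is only an illustration under stated assumptions: TaskRetrySketch and runWithRetries are hypothetical names, and the predicate stands in for actually running a task attempt on some TaskTracker.

```java
import java.util.function.IntPredicate;

// Hypothetical sketch of the retry policy; not the JobTracker's real code.
final class TaskRetrySketch {

    // The text states that a job fails once the same task has failed 4 times.
    static final int MAX_ATTEMPTS = 4;

    // Runs attempts 1..MAX_ATTEMPTS until one succeeds.
    // Returns the number of the successful attempt, or -1 if every attempt
    // failed, in which case the whole job would be marked as failed.
    static int runWithRetries(IntPredicate attemptSucceeds) {
        for (int attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
            if (attemptSucceeds.test(attempt)) {
                return attempt;
            }
            // The result of a failed attempt is ignored; the next iteration
            // models rescheduling the task on a different TaskTracker.
        }
        return -1;
    }
}
```

For example, `TaskRetrySketch.runWithRetries(attempt -> attempt == 3)` models a task that fails twice and then succeeds on its third attempt.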

Map/Reduce data flow


The output of Map is stored on local disk.
The output of Reduce is stored in HDFS.
When there is more than one reducer, the map tasks partition their output:
    There is one partition for each reduce task.
    Each partition holds many keys and their associated values, but the records for any given key are all in the same partition.
    Partitioning can be controlled by a user-defined function (the default is a hash function).
The data flow between map and reduce tasks is called the shuffle.

Computing parallelism meets data locality

All map tasks are equivalent, so they can run in parallel.
All reduce tasks can also run in parallel.
Input data on HDFS can be processed independently.
Therefore, run each map task on whatever data is local (or closest) to a particular node in HDFS.
For map task assignment, the JobTracker has an affinity for a node that holds a replica of the input data.
If a lot of data does happen to pile up on the same node, nearby nodes will map instead.
The result is good performance and better recovery from partial failure of servers or storage during the operation: if one map or reduce task fails, the work can be rescheduled.
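The locality preference described above can be sketched as a simple selection rule. This is a simplified illustration, not the JobTracker's actual scheduling code: LocalityAwareAssignment and chooseNode are hypothetical names, and the list of free nodes is assumed to be ordered from nearest to farthest so that the first entry can stand in for "closest".

```java
import java.util.List;
import java.util.Set;

// Hypothetical sketch of locality-aware map task assignment.
final class LocalityAwareAssignment {

    // replicaNodes: nodes that hold a replica of the task's input data.
    // freeNodes: nodes with a free map slot, assumed ordered nearest first.
    // Returns the chosen node, or null if no node has a free slot.
    static String chooseNode(Set<String> replicaNodes, List<String> freeNodes) {
        // First preference: a node that already stores the input data.
        for (String node : freeNodes) {
            if (replicaNodes.contains(node)) {
                return node;
            }
        }
        // Otherwise fall back to the nearest free node, which must then
        // read the data over the network.
        return freeNodes.isEmpty() ? null : freeNodes.get(0);
    }
}
```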

Data Distribution
In a MapReduce cluster, data is distributed to all the nodes of the cluster as it is being loaded in.
An underlying distributed file system (e.g., GFS) splits large data files into chunks, which are managed by different nodes in the cluster.

[Figure: the input data, a large file, is split into chunks of input data held on Node 1, Node 2, and Node 3.]

Even though the file chunks are distributed across several machines, they form a single namespace.

Keys and Values


The programmer in MapReduce has to specify two functions, the map function and the reduce function, which implement the Mapper and the Reducer in a MapReduce program.
In MapReduce, data elements are always structured as key-value (i.e., (K, V)) pairs.
The map and reduce functions receive and emit (K, V) pairs.

[Figure: Input Splits ((K, V) pairs) -> Map Function -> Intermediate Outputs ((K, V) pairs) -> Reduce Function -> Final Outputs ((K, V) pairs)]

Partitions
In MapReduce, intermediate output values are not usually reduced together.
All values with the same key are presented to a single Reducer together.
More specifically, a different subset of the intermediate key space is assigned to each Reducer.
These subsets are known as partitions.
Partitions are the input to Reducers.

[Figure: different colors represent different keys, potentially from different Mappers.]

Hadoop MapReduce: A Closer Look


[Figure: the same pipeline runs on Node 1 and Node 2. Files are loaded from the local HDFS store -> InputFormat -> Splits -> RecordReaders (RR) -> input (K, V) pairs -> Map -> intermediate (K, V) pairs -> Partitioner -> shuffling process, in which intermediate (K, V) pairs are exchanged by all nodes -> Sort -> Reduce -> final (K, V) pairs -> OutputFormat -> writeback to the local HDFS store.]

Input Files
Input files are where the data for a MapReduce task is initially stored.
The input files typically reside in a distributed file system (e.g., HDFS).
The format of input files is arbitrary:
    Line-based log files
    Binary files
    Multi-line input records
    Or something else entirely

InputFormat
How the input files are split up and read is defined by the InputFormat.
InputFormat is a class that does the following:
    Selects the files that should be used for input
    Defines the InputSplits that break up a file
    Provides a factory for RecordReader objects that read the file

InputFormat Types
Several InputFormats are provided with Hadoop:

InputFormat               Description                                         Key                                         Value
TextInputFormat           Default format; reads lines of text files           The byte offset of the line                 The line contents
KeyValueTextInputFormat   Parses lines into (K, V) pairs                      Everything up to the first tab character    The remainder of the line
SequenceFileInputFormat   A Hadoop-specific high-performance binary format    User-defined                                User-defined
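To make the key-value row of the table concrete, here is a minimal sketch of how one line could be split into a key and a value at the first tab character. KeyValueLineParser is a hypothetical helper written for illustration, not a Hadoop class, and treating a line without a tab as a key with an empty value is an assumption of this sketch.

```java
// Hypothetical helper illustrating the key-value line format.
final class KeyValueLineParser {

    // Splits a line at the FIRST tab character.
    // Returns a two-element array: {key, value}.
    // Assumption: a line without any tab becomes the key, with an empty value.
    static String[] parse(String line) {
        int tab = line.indexOf('\t');
        if (tab < 0) {
            return new String[] {line, ""};
        }
        // Everything before the first tab is the key; the remainder of the
        // line, including any later tabs, is the value.
        return new String[] {line.substring(0, tab), line.substring(tab + 1)};
    }
}
```

For instance, parsing the line "user42<TAB>clicked<TAB>home" under this sketch yields the key "user42" and the value "clicked<TAB>home".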

Input Splits
An input split describes a unit of work that comprises a single map task in a MapReduce program.
By default, the InputFormat breaks a file up into 64 MB splits.
By dividing the file into splits, we allow several map tasks to operate on a single file in parallel.
If the file is very large, this can improve performance significantly through parallelism.
Each map task corresponds to a single input split.
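Since each map task corresponds to one split, the number of map tasks for a file follows from its size. The sketch below uses plain ceiling division with the 64 MB figure quoted above; SplitCounter is a hypothetical name, and the real split computation in Hadoop has additional rules (for example configurable minimum and maximum sizes) that this simplification ignores.

```java
// Hypothetical sketch: how many 64 MB splits (and map tasks) a file yields.
final class SplitCounter {

    // Split size quoted in the text: 64 MB.
    static final long SPLIT_SIZE_BYTES = 64L * 1024 * 1024;

    // Ceiling division: a partially filled last split still counts as one.
    static long numberOfSplits(long fileSizeBytes) {
        if (fileSizeBytes < 0) {
            throw new IllegalArgumentException("file size must not be negative");
        }
        return (fileSizeBytes + SPLIT_SIZE_BYTES - 1) / SPLIT_SIZE_BYTES;
    }
}
```

Under this simplification a 200 MB file yields 4 splits (three full 64 MB splits and one of 8 MB), so up to 4 map tasks can work on the file in parallel.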

RecordReader
The input split defines a slice of work but does not describe how to access it.
The RecordReader class actually loads data from its source and converts it into (K, V) pairs suitable for reading by Mappers.
The RecordReader is invoked repeatedly on the input until the entire split is consumed.
Each invocation of the RecordReader leads to another call of the map function defined by the programmer.
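The "invoke repeatedly until the split is consumed" pattern can be sketched with a tiny line reader that produces (byte offset, line contents) pairs, the same shape of pair that the default text input format hands to the Mapper. LineRecordReaderSketch is a hypothetical class for illustration only; it assumes UTF-8 text with '\n' line endings and holds the whole split in memory, which a real RecordReader would not do.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical in-memory reader; not Hadoop's RecordReader interface.
final class LineRecordReaderSketch {
    private final byte[] data;
    private int position = 0;
    private long currentKey;      // byte offset of the current line
    private String currentValue;  // contents of the current line

    LineRecordReaderSketch(String splitContents) {
        this.data = splitContents.getBytes(StandardCharsets.UTF_8);
    }

    // Advances to the next record. Returns false once the split is consumed.
    boolean next() {
        if (position >= data.length) {
            return false;
        }
        int end = position;
        while (end < data.length && data[end] != '\n') {
            end++;
        }
        currentKey = position;
        currentValue = new String(data, position, end - position, StandardCharsets.UTF_8);
        position = end + 1; // skip the newline
        return true;
    }

    long getKey() {
        return currentKey;
    }

    String getValue() {
        return currentValue;
    }
}
```

A driver loop would then look like `while (reader.next()) { map(reader.getKey(), reader.getValue()); }`, where `map` is the programmer's map function: one call per record.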

Mapper and Reducer


The Mapper performs the user-defined work of the first phase of the MapReduce program.
A new instance of Mapper is created for each split.
The Reducer performs the user-defined work of the second phase of the MapReduce program.
A new instance of Reducer is created for each partition.
For each key in the partition assigned to a Reducer, the Reducer is called once.

Partitioner
Each mapper may emit (K, V) pairs to any partition.
Therefore, the map nodes must all agree on where to send different pieces of intermediate data.
The partitioner class determines which partition a given (K, V) pair will go to.
The default partitioner computes a hash value for a given key and assigns it to a partition based on this result.
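A hash-based partitioning rule of the kind described above fits in one line. The sketch below masks off the sign bit of the key's hash code and takes the remainder modulo the number of reduce tasks, which is the usual way to write such a rule; HashPartitionSketch is a hypothetical standalone class using String keys, not Hadoop's Partitioner interface.

```java
// Hypothetical standalone version of a hash partitioning rule.
final class HashPartitionSketch {

    // Returns a partition index in the range [0, numReduceTasks).
    // The bit mask clears the sign bit so that keys with a negative
    // hash code still map to a valid, non-negative partition.
    static int partitionFor(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }
}
```

Because the result depends only on the key and the number of reduce tasks, every map node computes the same partition for the same key without any coordination, which is what lets all values for one key meet at a single Reducer.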

Sort
Each Reducer is responsible for reducing the values associated with (several) intermediate keys.
The set of intermediate keys on a single node is automatically sorted by MapReduce before they are presented to the Reducer.
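The effect of the sort step on one node can be sketched with a sorted map: intermediate pairs arrive in arbitrary order, and the Reducer then sees each key once, in sorted order, together with all of its values. SortAndGroupSketch is a hypothetical in-memory illustration; the real framework sorts and merges data on disk rather than in a TreeMap.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for the sort phase on a single node.
final class SortAndGroupSketch {

    // Groups values by key; iterating the result visits keys in sorted order,
    // which is the order in which reduce would be called.
    static SortedMap<String, List<Integer>> sortAndGroup(List<Map.Entry<String, Integer>> pairs) {
        SortedMap<String, List<Integer>> grouped = new TreeMap<String, List<Integer>>();
        for (Map.Entry<String, Integer> pair : pairs) {
            List<Integer> values = grouped.get(pair.getKey());
            if (values == null) {
                values = new ArrayList<Integer>();
                grouped.put(pair.getKey(), values);
            }
            values.add(pair.getValue());
        }
        return grouped;
    }
}
```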

OutputFormat
The OutputFormat class defines the way (K, V) pairs produced by Reducers are written to output files.
The instances of OutputFormat provided by Hadoop write to files on the local disk or in HDFS.
Several OutputFormats are provided by Hadoop:

OutputFormat               Description
TextOutputFormat           Default; writes lines in "key \t value" format
SequenceFileOutputFormat   Writes binary files suitable for reading into subsequent MapReduce jobs
NullOutputFormat           Generates no output files

Job Scheduling in MapReduce


In MapReduce, an application is represented as a job.
A job encompasses multiple map and reduce tasks.
MapReduce in Hadoop comes with a choice of schedulers:
    The default is the FIFO scheduler, which schedules jobs in order of submission.
    There is also a multi-user scheduler called the Fair scheduler, which aims to give every user a fair share of the cluster capacity over time.

FIFO Scheduling

[Figure: job queue]

Fair Scheduling

[Figure: job queue]

Fair Scheduler Basics


Group jobs into pools.
Assign each pool a guaranteed minimum share.
Divide excess capacity evenly between pools.

Pools are determined from a configurable job property.
    Default in 0.20: user.name (one pool per user)

Pools have properties:
    Minimum map slots
    Minimum reduce slots
    Limit on number of running jobs

Example Pool Allocations


[Figure: an entire cluster of 100 slots shared by the pools matei, jeff, tom (min share = 30), and ads (min share = 40). The jobs shown are job 1 (30 slots), job 2 (15 slots), job 3 (15 slots), and job 4 (40 slots).]

Scheduling Algorithm
Split each pool's min share among its jobs.
Split each pool's total share among its jobs.
When a slot needs to be assigned:
    If there is any job below its min share, schedule it.
    Else schedule the job that we've been most unfair to (based on deficit).
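The slot assignment rule above can be sketched as a two-pass selection. This is an illustration under stated assumptions rather than the Fair scheduler's real code: FairSlotAssignment and JobState are hypothetical names, the deficit is taken as a precomputed number, and when several jobs are below their min share this sketch simply picks the first one in the list.

```java
import java.util.List;

// Hypothetical sketch of choosing which job receives a free slot.
final class FairSlotAssignment {

    static final class JobState {
        final String name;
        final int runningSlots; // slots the job currently occupies
        final int minShare;     // the job's portion of its pool's min share
        final double deficit;   // how far the job has fallen behind its fair share

        JobState(String name, int runningSlots, int minShare, double deficit) {
            this.name = name;
            this.runningSlots = runningSlots;
            this.minShare = minShare;
            this.deficit = deficit;
        }
    }

    // Returns the job that should get the next free slot, or null if there are no jobs.
    static JobState pick(List<JobState> jobs) {
        // Rule 1: a job below its min share is scheduled first.
        for (JobState job : jobs) {
            if (job.runningSlots < job.minShare) {
                return job;
            }
        }
        // Rule 2: otherwise schedule the job with the largest deficit.
        JobState mostUnfair = null;
        for (JobState job : jobs) {
            if (mostUnfair == null || job.deficit > mostUnfair.deficit) {
                mostUnfair = job;
            }
        }
        return mostUnfair;
    }
}
```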

Fault Tolerance in Hadoop


MapReduce can guide jobs toward successful completion even when they run on a large cluster, where the probability of failures increases.
The primary way that MapReduce achieves fault tolerance is through restarting tasks.
If a TaskTracker (TT) fails to communicate with the JobTracker (JT) for a period of time (by default, 10 minutes in Hadoop), the JT will assume that the TT in question has crashed.
    If the job is still in the map phase, the JT asks another TT to re-execute all Mappers that previously ran at the failed TT.
    If the job is in the reduce phase, the JT asks another TT to re-execute all Reducers that were in progress on the failed TT.
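Detecting a crashed TaskTracker from missing heartbeats can be sketched as a table of last-seen timestamps. HeartbeatMonitorSketch is a hypothetical class, not the JobTracker's implementation; the expiry interval is passed in as a parameter, and timestamps are supplied by the caller so that the logic is easy to follow and test.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of heartbeat-based failure detection.
final class HeartbeatMonitorSketch {
    private final long expiryMillis;
    // Last heartbeat time per TaskTracker; TreeMap keeps names in sorted order.
    private final Map<String, Long> lastHeartbeat = new TreeMap<String, Long>();

    HeartbeatMonitorSketch(long expiryMillis) {
        this.expiryMillis = expiryMillis;
    }

    // Records that a TaskTracker reported in at the given time.
    void heartbeat(String taskTracker, long nowMillis) {
        lastHeartbeat.put(taskTracker, nowMillis);
    }

    // Returns the TaskTrackers whose last heartbeat is older than the expiry
    // interval; their tasks would be handed to other TaskTrackers.
    List<String> expired(long nowMillis) {
        List<String> result = new ArrayList<String>();
        for (Map.Entry<String, Long> entry : lastHeartbeat.entrySet()) {
            if (nowMillis - entry.getValue() > expiryMillis) {
                result.add(entry.getKey());
            }
        }
        return result;
    }
}
```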

Speculative Execution
A MapReduce job is dominated by the slowest task.
MapReduce attempts to locate slow tasks (stragglers) and run redundant (speculative) tasks that will optimistically commit before the corresponding stragglers.
This process is known as speculative execution.
Only one copy of a straggler is allowed to be speculated.
Whichever of the two copies of a task commits first becomes the definitive copy, and the other copy is killed by the JobTracker.

Locating Stragglers
How does Hadoop locate stragglers?
    Hadoop monitors each task's progress using a progress score between 0 and 1.
    If a task's progress score is less than (average - 0.2), and the task has run for at least 1 minute, it is marked as a straggler.

[Figure: over time, task T1 with progress score PS = 2/3 is not a straggler, while task T2 with PS = 1/12 is a straggler.]
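The straggler test can be written out as a small check. The sketch assumes the threshold reads "average progress score minus 0.2" and a minimum runtime of 1 minute; StragglerDetector is a hypothetical name, and the two scores in the usage note are the T1 and T2 values from the figure.

```java
// Hypothetical sketch of the straggler rule.
final class StragglerDetector {

    // A task must trail the average progress score by more than this gap.
    static final double SCORE_GAP = 0.2;
    // A task must have run at least this long before it can be a straggler.
    static final long MIN_RUNTIME_MILLIS = 60_000L;

    // Average progress score over all tasks being compared.
    static double average(double[] scores) {
        double sum = 0.0;
        for (double score : scores) {
            sum += score;
        }
        return sum / scores.length;
    }

    // True if the task is slow relative to the average and has run long enough.
    static boolean isStraggler(double score, double averageScore, long runtimeMillis) {
        return score < averageScore - SCORE_GAP && runtimeMillis >= MIN_RUNTIME_MILLIS;
    }
}
```

With T1 at 2/3 and T2 at 1/12, the average is 0.375 and the threshold is 0.175. T2 (about 0.083) falls below the threshold and is a straggler once it has run for a minute, while T1 is not.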

What Makes MapReduce Unique?


MapReduce is characterized by:
1. Its simplified programming model, which allows the user to quickly write and test distributed systems
2. Its efficient and automatic distribution of data and workload across machines
3. Its flat scalability curve. Specifically, after a MapReduce program is written and functioning on 10 nodes, very little work, if any, is required to make that same program run on 1000 nodes

Programming using MapReduce

WordCount is a simple application that counts the number of occurrences of each word in a given input file. Here we divide the entire code into 3 files:
1) Mapper.java
2) Reducer.java
3) Basic.java

Mapper.java
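import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

// This class shares its simple name with Hadoop's Mapper interface, so the
// interface is referenced by its fully qualified name to avoid a name clash.
public class Mapper extends MapReduceBase
        implements org.apache.hadoop.mapred.Mapper<LongWritable, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    // Called once per input line: key is the byte offset, value is the line.
    public void map(LongWritable key, Text value,
                    OutputCollector<Text, IntWritable> output, Reporter reporter)
            throws IOException {
        String line = value.toString();
        StringTokenizer tokenizer = new StringTokenizer(line);
        // Emit (word, 1) for every token in the line.
        while (tokenizer.hasMoreTokens()) {
            word.set(tokenizer.nextToken());
            output.collect(word, one);
        }
    }
}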

Reducer.java
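import java.io.IOException;
import java.util.Iterator;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

// This class shares its simple name with Hadoop's Reducer interface, so the
// interface is referenced by its fully qualified name to avoid a name clash.
public class Reducer extends MapReduceBase
        implements org.apache.hadoop.mapred.Reducer<Text, IntWritable, Text, IntWritable> {

    // Called once per word: values holds every count emitted for that word.
    public void reduce(Text key, Iterator<IntWritable> values,
                       OutputCollector<Text, IntWritable> output, Reporter reporter)
            throws IOException {
        int sum = 0;
        while (values.hasNext()) {
            sum += values.next().get();
        }
        // Emit (word, total count).
        output.collect(key, new IntWritable(sum));
    }
}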

Basic.java
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.TextOutputFormat;

// Driver class: it only configures and submits the job, so it does not
// extend MapReduceBase or implement Reducer.
public class Basic {

    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(Basic.class);
        conf.setJobName("wordcount");

        // Types of the (K, V) pairs written by the job.
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);

        // Mapper and Reducer here are the classes from Mapper.java and Reducer.java.
        conf.setMapperClass(Mapper.class);
        conf.setReducerClass(Reducer.class);

        conf.setInputFormat(TextInputFormat.class);
        conf.setOutputFormat(TextOutputFormat.class);

        // args[0] is the input path, args[1] is the output path.
        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        JobClient.runJob(conf);
    }
}

Executing the MapReduce program


1) Compile all 3 Java files, which creates 3 .class files.
2) Add all 3 .class files to a single jar file with this command:
    jar cvf file_name.jar *.class
3) Execute the jar file with this command:
    bin/hadoop jar file_name.jar Basic input_file_name output_file_name