This document provides an overview of Hadoop MapReduce. It begins with an introduction to MapReduce and defines it as the processing component of Apache Hadoop that processes data in parallel across a distributed environment. The document then discusses two main advantages of MapReduce: 1) parallel processing, which makes data processing fast, and 2) data locality, where processing is moved to the data rather than moving large amounts of data to the processing. It also provides an example of how MapReduce can be used to efficiently count words in a document by splitting the work across nodes and aggregating the results.
4. Agenda For Today’s Session
◉ What is Hadoop MapReduce?
◉ MapReduce in a Nutshell
◉ Two Advantages of MapReduce
◉ Hadoop MapReduce Approach with an Example
7. MapReduce Data Processing and Programming
◉ MapReduce is the processing component of Apache Hadoop
◉ It processes data in parallel in a distributed environment
13. Election vote counting: Traditional way
Election Vote Casting
◉ Votes are stored at different booths
◉ The result centre has the details of all the booths
14. Election vote counting: MapReduce way
Counting - MapReduce Approach
◉ Votes are counted at the individual booths.
◉ Booth-wise results are sent back to the result centre.
◉ The final result is declared easily and quickly this way.
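The booth-wise counting described above can be sketched in a few lines of Python. This is only an illustration of the analogy, not Hadoop code: the booth names, the candidate labels, and the helper functions `count_booth` and `declare_result` are all hypothetical.

```python
from collections import Counter

# Hypothetical data: the votes stored at each booth (made up for illustration).
booths = {
    "booth_1": ["A", "B", "A"],
    "booth_2": ["B", "B", "C"],
    "booth_3": ["A", "C", "A"],
}

def count_booth(votes):
    """Count the votes locally at one booth (the 'map' side of the analogy)."""
    return Counter(votes)

def declare_result(booth_results):
    """Combine the booth-wise tallies at the result centre (the 'reduce' side)."""
    total = Counter()
    for tally in booth_results:
        total.update(tally)
    return total

# Each booth counts its own votes; only the small tallies are sent onward.
booth_results = [count_booth(votes) for votes in booths.values()]
final_result = declare_result(booth_results)
print(dict(final_result))
```

In this sketch only one small tally per booth travels to the result centre instead of every individual vote, which is the same idea as moving the processing to the data.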
22. MapReduce way – Word Count Process
The overall MapReduce Word Count Process
Input
Deer Bear River
Car Car River
Deer Car Bear
Splitting
Split 1: Deer Bear River
Split 2: Car Car River
Split 3: Deer Car Bear
Mapping
Split 1: Deer,1 Bear,1 River,1
Split 2: Car,1 Car,1 River,1
Split 3: Deer,1 Car,1 Bear,1
Shuffling
Bear,(1,1)
Car,(1,1,1)
Deer,(1,1)
River,(1,1)
Reducing
Bear,2
Car,3
Deer,2
River,2
Final Result
Bear,2
Car,3
Deer,2
River,2
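The word count stages above can be mimicked in plain Python. This is a minimal single-process sketch, assuming one split per input line. It does not use the Hadoop API, and the function names (`split_input`, `map_split`, `shuffle`, `reduce_counts`) are hypothetical, chosen only to mirror the stage names on the slide. In a real Hadoop job the map and reduce tasks would run in parallel on different nodes.

```python
from collections import defaultdict

# The input from the slide, one line per split.
input_text = "Deer Bear River\nCar Car River\nDeer Car Bear"

def split_input(text):
    """Splitting: break the input into independent pieces (here, lines)."""
    return text.splitlines()

def map_split(line):
    """Mapping: emit a (word, 1) pair for every word in one split."""
    return [(word, 1) for word in line.split()]

def shuffle(pairs):
    """Shuffling: group all the 1s that belong to the same word, sorted by word."""
    grouped = defaultdict(list)
    for word, one in pairs:
        grouped[word].append(one)
    return dict(sorted(grouped.items()))

def reduce_counts(grouped):
    """Reducing: sum the grouped 1s to get the count per word."""
    return {word: sum(ones) for word, ones in grouped.items()}

splits = split_input(input_text)
mapped = [pair for line in splits for pair in map_split(line)]
shuffled = shuffle(mapped)
result = reduce_counts(shuffled)
print(result)  # {'Bear': 2, 'Car': 3, 'Deer': 2, 'River': 2}
```

Each `map_split` call depends only on its own split, which is why the mapping stage can be spread across nodes. Only the shuffle needs to bring pairs with the same key together.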