MapReduce in Simple Terms

Saliya Ekanayake
Saliya EkanayakeApplied Computer Scientist at Virginia Tech
The Story of Sam
Saliya Ekanayake
SALSA HPC Group
http://salsahpc.indiana.edu
Pervasive Technology Institute, Indiana University, Bloomington
 Believed “an apple a day keeps a doctor away”
Mother
Sam
An Apple
 Sam thought of “drinking” the apple
 He used a to cut
the and a
to make juice.
 (map ‘( ))
( )
 Sam applied his invention to all the
fruits he could find in the fruit basket
 (reduce ‘( )) Classical Notion of MapReduce in
Functional Programming
A list of values mapped into
another list of values, which gets
reduced into a single value
 Sam got his first job in JuiceRUs for his
talent in making juice
 Now, it’s not just one basket
but a whole container of fruits
 Also, they produce a list of
juice types separately
NOT ENOUGH !!
 But, Sam had just ONE
and ONE
Large data and list of values
for output
Wait!
 Implemented a parallel version of his innovation
(<a, > , <o, > , <p, > , …)
Each input to a map is a list of <key, value> pairs
Each output of a map is a list of <key, value> pairs
(<a’, > , <o’, > , <p’, > , …)
Grouped by key
Each input to a reduce is a <key, value-list>
(possibly a list of these, depending on the
grouping/hashing mechanism)
e.g. <a’, ( …)>
Reduced into a list of values
 Implemented a parallel version of his innovation
The idea of MapReduce in Data
Intensive Computing
A list of <key, value> pairs mapped
into another list of <key, value> pairs
which gets grouped by the key and
reduced into a list of values
 Sam realized,
◦ To create his favorite mix fruit juice he can use a combiner after the
reducers
◦ If several <key, value-list> fall into the same group (based on the
grouping/hashing algorithm) then use the blender (reducer) separately on
each of them
◦ The knife (mapper) and blender (reducer) should not contain residue after
use – Side Effect Free
◦ In general reducer should be associative and commutative
 We think Sam was you 
1 of 9

Recommended

Advanced Sorting Algorithms by
Advanced Sorting AlgorithmsAdvanced Sorting Algorithms
Advanced Sorting AlgorithmsDamian T. Gordon
5.5K views180 slides
Maximum sum subarray by
Maximum sum subarrayMaximum sum subarray
Maximum sum subarrayShaheen kousar
482 views17 slides
Lecture optimal binary search tree by
Lecture optimal binary search tree Lecture optimal binary search tree
Lecture optimal binary search tree Divya Ks
1.2K views16 slides
The Floyd–Warshall algorithm by
The Floyd–Warshall algorithmThe Floyd–Warshall algorithm
The Floyd–Warshall algorithmJosé Juan Herrera
7.2K views14 slides
linked list by
linked listlinked list
linked listShaista Qadir
394 views32 slides
Linear and Binary search by
Linear and Binary searchLinear and Binary search
Linear and Binary searchNisha Soms
499 views31 slides

More Related Content

What's hot

Normalisation by
NormalisationNormalisation
NormalisationSoumyajit Dutta
621 views82 slides
Design and Analysis of Algorithms by
Design and Analysis of AlgorithmsDesign and Analysis of Algorithms
Design and Analysis of AlgorithmsSwapnil Agrawal
16K views37 slides
UNIT I LINEAR DATA STRUCTURES – LIST by
UNIT I 	LINEAR DATA STRUCTURES – LIST 	UNIT I 	LINEAR DATA STRUCTURES – LIST
UNIT I LINEAR DATA STRUCTURES – LIST Kathirvel Ayyaswamy
5.3K views109 slides
Dynamic programming by
Dynamic programmingDynamic programming
Dynamic programmingShakil Ahmed
10.9K views39 slides
Quick Sort , Merge Sort , Heap Sort by
Quick Sort , Merge Sort ,  Heap SortQuick Sort , Merge Sort ,  Heap Sort
Quick Sort , Merge Sort , Heap SortMohammed Hussein
48.7K views37 slides

What's hot(20)

Design and Analysis of Algorithms by Swapnil Agrawal
Design and Analysis of AlgorithmsDesign and Analysis of Algorithms
Design and Analysis of Algorithms
Swapnil Agrawal16K views
Dynamic programming by Shakil Ahmed
Dynamic programmingDynamic programming
Dynamic programming
Shakil Ahmed10.9K views
Quick Sort , Merge Sort , Heap Sort by Mohammed Hussein
Quick Sort , Merge Sort ,  Heap SortQuick Sort , Merge Sort ,  Heap Sort
Quick Sort , Merge Sort , Heap Sort
Mohammed Hussein48.7K views
Data Structures - Searching & sorting by Kaushal Shah
Data Structures - Searching & sortingData Structures - Searching & sorting
Data Structures - Searching & sorting
Kaushal Shah13.6K views
Ch7 Process Synchronization galvin by Shubham Singh
Ch7 Process Synchronization galvinCh7 Process Synchronization galvin
Ch7 Process Synchronization galvin
Shubham Singh955 views
Advanced Reflection in Java by kim.mens
Advanced Reflection in JavaAdvanced Reflection in Java
Advanced Reflection in Java
kim.mens1.6K views
Binary Search - Design & Analysis of Algorithms by Drishti Bhalla
Binary Search - Design & Analysis of AlgorithmsBinary Search - Design & Analysis of Algorithms
Binary Search - Design & Analysis of Algorithms
Drishti Bhalla14.6K views
Chapter - 6 Data Mining Concepts and Techniques 2nd Ed slides Han & Kamber by error007
Chapter - 6 Data Mining Concepts and Techniques 2nd Ed slides Han & KamberChapter - 6 Data Mining Concepts and Techniques 2nd Ed slides Han & Kamber
Chapter - 6 Data Mining Concepts and Techniques 2nd Ed slides Han & Kamber
error00714.4K views

More from Saliya Ekanayake

The Art and Craft of Woodworking by
The Art and Craft of WoodworkingThe Art and Craft of Woodworking
The Art and Craft of WoodworkingSaliya Ekanayake
38 views17 slides
Java Thread and Process Performance for Parallel Machine Learning on Multicor... by
Java Thread and Process Performance for Parallel Machine Learning on Multicor...Java Thread and Process Performance for Parallel Machine Learning on Multicor...
Java Thread and Process Performance for Parallel Machine Learning on Multicor...Saliya Ekanayake
728 views17 slides
Towards a Systematic Study of Big Data Performance and Benchmarking by
Towards a Systematic Study of Big Data Performance and BenchmarkingTowards a Systematic Study of Big Data Performance and Benchmarking
Towards a Systematic Study of Big Data Performance and BenchmarkingSaliya Ekanayake
458 views38 slides
Sandhi Wimarshana by
Sandhi WimarshanaSandhi Wimarshana
Sandhi WimarshanaSaliya Ekanayake
193 views1 slide
High Performance Data Analytics with Java on Large Multicore HPC Clusters by
High Performance Data Analytics with Java on Large Multicore HPC ClustersHigh Performance Data Analytics with Java on Large Multicore HPC Clusters
High Performance Data Analytics with Java on Large Multicore HPC ClustersSaliya Ekanayake
1.1K views18 slides
Survey onhpcs languages by
Survey onhpcs languagesSurvey onhpcs languages
Survey onhpcs languagesSaliya Ekanayake
426 views12 slides

More from Saliya Ekanayake(6)

Java Thread and Process Performance for Parallel Machine Learning on Multicor... by Saliya Ekanayake
Java Thread and Process Performance for Parallel Machine Learning on Multicor...Java Thread and Process Performance for Parallel Machine Learning on Multicor...
Java Thread and Process Performance for Parallel Machine Learning on Multicor...
Saliya Ekanayake728 views
Towards a Systematic Study of Big Data Performance and Benchmarking by Saliya Ekanayake
Towards a Systematic Study of Big Data Performance and BenchmarkingTowards a Systematic Study of Big Data Performance and Benchmarking
Towards a Systematic Study of Big Data Performance and Benchmarking
Saliya Ekanayake458 views
High Performance Data Analytics with Java on Large Multicore HPC Clusters by Saliya Ekanayake
High Performance Data Analytics with Java on Large Multicore HPC ClustersHigh Performance Data Analytics with Java on Large Multicore HPC Clusters
High Performance Data Analytics with Java on Large Multicore HPC Clusters
Saliya Ekanayake1.1K views

Recently uploaded

Igniting Next Level Productivity with AI-Infused Data Integration Workflows by
Igniting Next Level Productivity with AI-Infused Data Integration Workflows Igniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration Workflows Safe Software
317 views86 slides
Mini-Track: Challenges to Network Automation Adoption by
Mini-Track: Challenges to Network Automation AdoptionMini-Track: Challenges to Network Automation Adoption
Mini-Track: Challenges to Network Automation AdoptionNetwork Automation Forum
17 views27 slides
"Node.js Development in 2024: trends and tools", Nikita Galkin by
"Node.js Development in 2024: trends and tools", Nikita Galkin "Node.js Development in 2024: trends and tools", Nikita Galkin
"Node.js Development in 2024: trends and tools", Nikita Galkin Fwdays
17 views38 slides
Business Analyst Series 2023 - Week 3 Session 5 by
Business Analyst Series 2023 -  Week 3 Session 5Business Analyst Series 2023 -  Week 3 Session 5
Business Analyst Series 2023 - Week 3 Session 5DianaGray10
345 views20 slides
Microsoft Power Platform.pptx by
Microsoft Power Platform.pptxMicrosoft Power Platform.pptx
Microsoft Power Platform.pptxUni Systems S.M.S.A.
61 views38 slides
SAP Automation Using Bar Code and FIORI.pdf by
SAP Automation Using Bar Code and FIORI.pdfSAP Automation Using Bar Code and FIORI.pdf
SAP Automation Using Bar Code and FIORI.pdfVirendra Rai, PMP
25 views38 slides

Recently uploaded(20)

Igniting Next Level Productivity with AI-Infused Data Integration Workflows by Safe Software
Igniting Next Level Productivity with AI-Infused Data Integration Workflows Igniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration Workflows
Safe Software317 views
"Node.js Development in 2024: trends and tools", Nikita Galkin by Fwdays
"Node.js Development in 2024: trends and tools", Nikita Galkin "Node.js Development in 2024: trends and tools", Nikita Galkin
"Node.js Development in 2024: trends and tools", Nikita Galkin
Fwdays17 views
Business Analyst Series 2023 - Week 3 Session 5 by DianaGray10
Business Analyst Series 2023 -  Week 3 Session 5Business Analyst Series 2023 -  Week 3 Session 5
Business Analyst Series 2023 - Week 3 Session 5
DianaGray10345 views
SAP Automation Using Bar Code and FIORI.pdf by Virendra Rai, PMP
SAP Automation Using Bar Code and FIORI.pdfSAP Automation Using Bar Code and FIORI.pdf
SAP Automation Using Bar Code and FIORI.pdf
Special_edition_innovator_2023.pdf by WillDavies22
Special_edition_innovator_2023.pdfSpecial_edition_innovator_2023.pdf
Special_edition_innovator_2023.pdf
WillDavies2218 views
HTTP headers that make your website go faster - devs.gent November 2023 by Thijs Feryn
HTTP headers that make your website go faster - devs.gent November 2023HTTP headers that make your website go faster - devs.gent November 2023
HTTP headers that make your website go faster - devs.gent November 2023
Thijs Feryn26 views
TrustArc Webinar - Managing Online Tracking Technology Vendors_ A Checklist f... by TrustArc
TrustArc Webinar - Managing Online Tracking Technology Vendors_ A Checklist f...TrustArc Webinar - Managing Online Tracking Technology Vendors_ A Checklist f...
TrustArc Webinar - Managing Online Tracking Technology Vendors_ A Checklist f...
TrustArc72 views
Webinar : Desperately Seeking Transformation - Part 2: Insights from leading... by The Digital Insurer
Webinar : Desperately Seeking Transformation - Part 2:  Insights from leading...Webinar : Desperately Seeking Transformation - Part 2:  Insights from leading...
Webinar : Desperately Seeking Transformation - Part 2: Insights from leading...
2024: A Travel Odyssey The Role of Generative AI in the Tourism Universe by Simone Puorto
2024: A Travel Odyssey The Role of Generative AI in the Tourism Universe2024: A Travel Odyssey The Role of Generative AI in the Tourism Universe
2024: A Travel Odyssey The Role of Generative AI in the Tourism Universe
Simone Puorto13 views
"Surviving highload with Node.js", Andrii Shumada by Fwdays
"Surviving highload with Node.js", Andrii Shumada "Surviving highload with Node.js", Andrii Shumada
"Surviving highload with Node.js", Andrii Shumada
Fwdays33 views
STPI OctaNE CoE Brochure.pdf by madhurjyapb
STPI OctaNE CoE Brochure.pdfSTPI OctaNE CoE Brochure.pdf
STPI OctaNE CoE Brochure.pdf
madhurjyapb14 views
PharoJS - Zürich Smalltalk Group Meetup November 2023 by Noury Bouraqadi
PharoJS - Zürich Smalltalk Group Meetup November 2023PharoJS - Zürich Smalltalk Group Meetup November 2023
PharoJS - Zürich Smalltalk Group Meetup November 2023
Noury Bouraqadi139 views

MapReduce in Simple Terms

  • 1. The Story of Sam Saliya Ekanayake SALSA HPC Group http://salsahpc.indiana.edu Pervasive Technology Institute, Indiana University, Bloomington
  • 2.  Believed “an apple a day keeps a doctor away” Mother Sam An Apple
  • 3.  Sam thought of “drinking” the apple  He used a to cut the and a to make juice.
  • 4.  (map ‘( )) ( )  Sam applied his invention to all the fruits he could find in the fruit basket  (reduce ‘( )) Classical Notion of MapReduce in Functional Programming A list of values mapped into another list of values, which gets reduced into a single value
  • 5.  Sam got his first job in JuiceRUs for his talent in making juice  Now, it’s not just one basket but a whole container of fruits  Also, they produce a list of juice types separately NOT ENOUGH !!  But, Sam had just ONE and ONE Large data and list of values for output Wait!
  • 6.  Implemented a parallel version of his innovation (<a, > , <o, > , <p, > , …) Each input to a map is a list of <key, value> pairs Each output of a map is a list of <key, value> pairs (<a’, > , <o’, > , <p’, > , …) Grouped by key Each input to a reduce is a <key, value-list> (possibly a list of these, depending on the grouping/hashing mechanism) e.g. <a’, ( …)> Reduced into a list of values
  • 7.  Implemented a parallel version of his innovation The idea of MapReduce in Data Intensive Computing A list of <key, value> pairs mapped into another list of <key, value> pairs which gets grouped by the key and reduced into a list of values
  • 8.  Sam realized, ◦ To create his favorite mix fruit juice he can use a combiner after the reducers ◦ If several <key, value-list> fall into the same group (based on the grouping/hashing algorithm) then use the blender (reducer) separately on each of them ◦ The knife (mapper) and blender (reducer) should not contain residue after use – Side Effect Free ◦ In general reducer should be associative and commutative
  • 9.  We think Sam was you 