SlideShare a Scribd company logo
1 of 9
The Story of Sam
Saliya Ekanayake
SALSA HPC Group
http://salsahpc.indiana.edu
Pervasive Technology Institute, Indiana University, Bloomington
 Believed “an apple a day keeps a doctor away”
Mother
Sam
An Apple
 Sam thought of “drinking” the apple
 He used a to cut
the and a
to make juice.
 (map ‘( ))
( )
 Sam applied his invention to all the
fruits he could find in the fruit basket
 (reduce ‘( )) Classical Notion of MapReduce in
Functional Programming
A list of values mapped into
another list of values, which gets
reduced into a single value
 Sam got his first job in JuiceRUs for his
talent in making juice
 Now, it’s not just one basket
but a whole container of fruits
 Also, they produce a list of
juice types separately
NOT ENOUGH !!
 But, Sam had just ONE
and ONE
Large data and list of values
for output
Wait!
 Implemented a parallel version of his innovation
(<a, > , <o, > , <p, > , …)
Each input to a map is a list of <key, value> pairs
Each output of a map is a list of <key, value> pairs
(<a’, > , <o’, > , <p’, > , …)
Grouped by key
Each input to a reduce is a <key, value-list>
(possibly a list of these, depending on the
grouping/hashing mechanism)
e.g. <a’, ( …)>
Reduced into a list of values
 Implemented a parallel version of his innovation
The idea of MapReduce in Data
Intensive Computing
A list of <key, value> pairs mapped
into another list of <key, value> pairs
which gets grouped by the key and
reduced into a list of values
 Sam realized,
◦ To create his favorite mix fruit juice he can use a combiner after the
reducers
◦ If several <key, value-list> fall into the same group (based on the
grouping/hashing algorithm) then use the blender (reducer) separately on
each of them
◦ The knife (mapper) and blender (reducer) should not contain residue after
use – Side Effect Free
◦ In general reducer should be associative and commutative
 We think Sam was you 

More Related Content

What's hot

MapReduce Scheduling Algorithms
MapReduce Scheduling AlgorithmsMapReduce Scheduling Algorithms
MapReduce Scheduling AlgorithmsLeila panahi
 
Operating system 31 multiple processor scheduling
Operating system 31 multiple processor schedulingOperating system 31 multiple processor scheduling
Operating system 31 multiple processor schedulingVaibhav Khanna
 
Research issues in object oriented software testing
Research issues in object oriented software testingResearch issues in object oriented software testing
Research issues in object oriented software testingAnshul Vinayak
 
Federated Cloud Computing - The OpenNebula Experience v1.0s
Federated Cloud Computing  - The OpenNebula Experience v1.0sFederated Cloud Computing  - The OpenNebula Experience v1.0s
Federated Cloud Computing - The OpenNebula Experience v1.0sIgnacio M. Llorente
 
Deadlock Detection in Distributed Systems
Deadlock Detection in Distributed SystemsDeadlock Detection in Distributed Systems
Deadlock Detection in Distributed SystemsDHIVYADEVAKI
 
Software Process Models
Software Process ModelsSoftware Process Models
Software Process ModelsHassan A-j
 
Unified process,agile process,process assesment ppt
Unified process,agile process,process assesment pptUnified process,agile process,process assesment ppt
Unified process,agile process,process assesment pptShweta Ghate
 
Software Testing - Boundary Value Analysis, Equivalent Class Partition, Decis...
Software Testing - Boundary Value Analysis, Equivalent Class Partition, Decis...Software Testing - Boundary Value Analysis, Equivalent Class Partition, Decis...
Software Testing - Boundary Value Analysis, Equivalent Class Partition, Decis...priyasoundar
 
Software estimation techniques
Software estimation techniquesSoftware estimation techniques
Software estimation techniquesTan Tran
 
Cloud computing and Cloud Enabling Technologies
Cloud computing and Cloud Enabling TechnologiesCloud computing and Cloud Enabling Technologies
Cloud computing and Cloud Enabling TechnologiesAbdelkhalik Mosa
 
Ecg analysis in the cloud
Ecg analysis in the cloudEcg analysis in the cloud
Ecg analysis in the cloudgaurav jain
 
Introduction to AWS Step Functions:
Introduction to AWS Step Functions: Introduction to AWS Step Functions:
Introduction to AWS Step Functions: Amazon Web Services
 
Hadoop MapReduce Framework
Hadoop MapReduce FrameworkHadoop MapReduce Framework
Hadoop MapReduce FrameworkEdureka!
 

What's hot (20)

MapReduce Scheduling Algorithms
MapReduce Scheduling AlgorithmsMapReduce Scheduling Algorithms
MapReduce Scheduling Algorithms
 
Operating system 31 multiple processor scheduling
Operating system 31 multiple processor schedulingOperating system 31 multiple processor scheduling
Operating system 31 multiple processor scheduling
 
Research issues in object oriented software testing
Research issues in object oriented software testingResearch issues in object oriented software testing
Research issues in object oriented software testing
 
Federated Cloud Computing - The OpenNebula Experience v1.0s
Federated Cloud Computing  - The OpenNebula Experience v1.0sFederated Cloud Computing  - The OpenNebula Experience v1.0s
Federated Cloud Computing - The OpenNebula Experience v1.0s
 
Deadlock Detection in Distributed Systems
Deadlock Detection in Distributed SystemsDeadlock Detection in Distributed Systems
Deadlock Detection in Distributed Systems
 
DynamoDB & DAX
DynamoDB & DAXDynamoDB & DAX
DynamoDB & DAX
 
Software Process Models
Software Process ModelsSoftware Process Models
Software Process Models
 
Sqa plan
Sqa planSqa plan
Sqa plan
 
Unified process,agile process,process assesment ppt
Unified process,agile process,process assesment pptUnified process,agile process,process assesment ppt
Unified process,agile process,process assesment ppt
 
Software Testing - Boundary Value Analysis, Equivalent Class Partition, Decis...
Software Testing - Boundary Value Analysis, Equivalent Class Partition, Decis...Software Testing - Boundary Value Analysis, Equivalent Class Partition, Decis...
Software Testing - Boundary Value Analysis, Equivalent Class Partition, Decis...
 
Spark
SparkSpark
Spark
 
Software estimation techniques
Software estimation techniquesSoftware estimation techniques
Software estimation techniques
 
Cloud computing and Cloud Enabling Technologies
Cloud computing and Cloud Enabling TechnologiesCloud computing and Cloud Enabling Technologies
Cloud computing and Cloud Enabling Technologies
 
Ecg analysis in the cloud
Ecg analysis in the cloudEcg analysis in the cloud
Ecg analysis in the cloud
 
Map Reduce
Map ReduceMap Reduce
Map Reduce
 
Cloud security
Cloud securityCloud security
Cloud security
 
MapReduce
MapReduceMapReduce
MapReduce
 
Introduction to AWS Step Functions:
Introduction to AWS Step Functions: Introduction to AWS Step Functions:
Introduction to AWS Step Functions:
 
Hadoop MapReduce Framework
Hadoop MapReduce FrameworkHadoop MapReduce Framework
Hadoop MapReduce Framework
 
Introduction to Amazon EC2
Introduction to Amazon EC2Introduction to Amazon EC2
Introduction to Amazon EC2
 

More from Saliya Ekanayake

The Art and Craft of Woodworking
The Art and Craft of WoodworkingThe Art and Craft of Woodworking
The Art and Craft of WoodworkingSaliya Ekanayake
 
Java Thread and Process Performance for Parallel Machine Learning on Multicor...
Java Thread and Process Performance for Parallel Machine Learning on Multicor...Java Thread and Process Performance for Parallel Machine Learning on Multicor...
Java Thread and Process Performance for Parallel Machine Learning on Multicor...Saliya Ekanayake
 
Towards a Systematic Study of Big Data Performance and Benchmarking
Towards a Systematic Study of Big Data Performance and BenchmarkingTowards a Systematic Study of Big Data Performance and Benchmarking
Towards a Systematic Study of Big Data Performance and BenchmarkingSaliya Ekanayake
 
High Performance Data Analytics with Java on Large Multicore HPC Clusters
High Performance Data Analytics with Java on Large Multicore HPC ClustersHigh Performance Data Analytics with Java on Large Multicore HPC Clusters
High Performance Data Analytics with Java on Large Multicore HPC ClustersSaliya Ekanayake
 

More from Saliya Ekanayake (6)

The Art and Craft of Woodworking
The Art and Craft of WoodworkingThe Art and Craft of Woodworking
The Art and Craft of Woodworking
 
Java Thread and Process Performance for Parallel Machine Learning on Multicor...
Java Thread and Process Performance for Parallel Machine Learning on Multicor...Java Thread and Process Performance for Parallel Machine Learning on Multicor...
Java Thread and Process Performance for Parallel Machine Learning on Multicor...
 
Towards a Systematic Study of Big Data Performance and Benchmarking
Towards a Systematic Study of Big Data Performance and BenchmarkingTowards a Systematic Study of Big Data Performance and Benchmarking
Towards a Systematic Study of Big Data Performance and Benchmarking
 
Sandhi Wimarshana
Sandhi WimarshanaSandhi Wimarshana
Sandhi Wimarshana
 
High Performance Data Analytics with Java on Large Multicore HPC Clusters
High Performance Data Analytics with Java on Large Multicore HPC ClustersHigh Performance Data Analytics with Java on Large Multicore HPC Clusters
High Performance Data Analytics with Java on Large Multicore HPC Clusters
 
Survey onhpcs languages
Survey onhpcs languagesSurvey onhpcs languages
Survey onhpcs languages
 

Recently uploaded

A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 

Recently uploaded (20)

A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 

MapReduce in Simple Terms

  • 1. The Story of Sam Saliya Ekanayake SALSA HPC Group http://salsahpc.indiana.edu Pervasive Technology Institute, Indiana University, Bloomington
  • 2.  Believed “an apple a day keeps a doctor away” Mother Sam An Apple
  • 3.  Sam thought of “drinking” the apple  He used a to cut the and a to make juice.
  • 4.  (map ‘( )) ( )  Sam applied his invention to all the fruits he could find in the fruit basket  (reduce ‘( )) Classical Notion of MapReduce in Functional Programming A list of values mapped into another list of values, which gets reduced into a single value
  • 5.  Sam got his first job in JuiceRUs for his talent in making juice  Now, it’s not just one basket but a whole container of fruits  Also, they produce a list of juice types separately NOT ENOUGH !!  But, Sam had just ONE and ONE Large data and list of values for output Wait!
  • 6.  Implemented a parallel version of his innovation (<a, > , <o, > , <p, > , …) Each input to a map is a list of <key, value> pairs Each output of a map is a list of <key, value> pairs (<a’, > , <o’, > , <p’, > , …) Grouped by key Each input to a reduce is a <key, value-list> (possibly a list of these, depending on the grouping/hashing mechanism) e.g. <a’, ( …)> Reduced into a list of values
  • 7.  Implemented a parallel version of his innovation The idea of MapReduce in Data Intensive Computing A list of <key, value> pairs mapped into another list of <key, value> pairs which gets grouped by the key and reduced into a list of values
  • 8.  Sam realized, ◦ To create his favorite mix fruit juice he can use a combiner after the reducers ◦ If several <key, value-list> fall into the same group (based on the grouping/hashing algorithm) then use the blender (reducer) separately on each of them ◦ The knife (mapper) and blender (reducer) should not contain residue after use – Side Effect Free ◦ In general reducer should be associative and commutative
  • 9.  We think Sam was you 