MapReduce: Simplified Data
Processing on Large Clusters
          Rob Keisler
           CSCI 638
         Summer 2011
Outline

● Background

● Model

● Examples

● Execution

● Conclusions
Background

● Transformation operations are conceptually straightforward
   ○ Until the data is large and the computation must be
     distributed over hundreds or thousands of machines

● So, Google created MapReduce

● MapReduce is a programming abstraction
   ○ Expresses simple computations
   ○ Hides complexity details
Model

● Uses two user-defined functions, Map and Reduce, to take a
  set of input key/value pairs and produce a set of output
  key/value pairs

● Map
   ○ Takes an input key/value pair and produces a set of
     intermediate key/value pairs

● Reduce
   ○ Accepts an intermediate key I and a set of values for
     that key, and merges those values to form a possibly
     smaller set of values
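
The Map and Reduce roles described above can be sketched with a word count run in a single process. This is only an illustration of the model, not the distributed implementation: the names `map_fn`, `reduce_fn`, and `run_mapreduce` and the sample documents are assumptions made for this example, and the grouping step stands in for the shuffle that a real system performs across machines.

```python
from collections import defaultdict


def map_fn(doc_name, contents):
    """Map: take one input key/value pair, emit intermediate (word, 1) pairs."""
    for word in contents.split():
        yield word.lower(), 1


def reduce_fn(word, counts):
    """Reduce: merge all values for one intermediate key into a smaller set."""
    yield sum(counts)


def run_mapreduce(inputs, map_fn, reduce_fn):
    """Hypothetical single-process driver: map, group by key, then reduce."""
    intermediate = defaultdict(list)
    for key, value in inputs:
        for out_key, out_value in map_fn(key, value):
            intermediate[out_key].append(out_value)
    # Keys are processed in sorted order, one reduce call per key.
    return {
        key: list(reduce_fn(key, values))
        for key, values in sorted(intermediate.items())
    }


# Made-up sample input: (document name, document contents) pairs.
docs = [
    ("a.txt", "the quick brown fox"),
    ("b.txt", "the lazy dog and the fox"),
]
word_counts = run_mapreduce(docs, map_fn, reduce_fn)
print(word_counts)
```

Each reduce call here returns a single-element list, which matches the "possibly smaller set of values" wording: three occurrences of a word collapse into one count.
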
Examples

● Distributed Grep

● Count of URL Access Frequency

● Reverse Web-Link Graph

● Term-Vector per Host

● Inverted Index

● Distributed Sort
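
Two of the examples listed above, Distributed Grep and Inverted Index, can be sketched as Map/Reduce pairs. This is a minimal single-process sketch under stated assumptions: the `run_mapreduce` driver, the function names, and the sample documents are hypothetical and chosen for this illustration; only the division of work between Map and Reduce follows the examples named on the slide.

```python
import re
from collections import defaultdict


def run_mapreduce(inputs, map_fn, reduce_fn):
    """Hypothetical single-process driver: map, group by key, then reduce."""
    intermediate = defaultdict(list)
    for key, value in inputs:
        for out_key, out_value in map_fn(key, value):
            intermediate[out_key].append(out_value)
    return {
        key: list(reduce_fn(key, values))
        for key, values in sorted(intermediate.items())
    }


def make_grep_map(pattern):
    """Distributed Grep: Map emits each line that matches the pattern."""
    regex = re.compile(pattern)

    def grep_map(filename, contents):
        for line in contents.splitlines():
            if regex.search(line):
                yield filename, line

    return grep_map


def identity_reduce(key, values):
    """Distributed Grep: Reduce copies the intermediate data to the output."""
    yield from values


def index_map(doc_id, contents):
    """Inverted Index: Map emits one (word, document ID) pair per distinct word."""
    for word in set(contents.lower().split()):
        yield word, doc_id


def index_reduce(word, doc_ids):
    """Inverted Index: Reduce emits the sorted document IDs for one word."""
    yield from sorted(set(doc_ids))


# Made-up sample input: (document ID, document contents) pairs.
docs = [
    ("d1", "map reduce\nlarge clusters"),
    ("d2", "reduce phase\nmap phase"),
]
grep_hits = run_mapreduce(docs, make_grep_map(r"map"), identity_reduce)
index = run_mapreduce(docs, index_map, index_reduce)
print(grep_hits)
print(index)
```

The same driver serves both jobs; only the two user-supplied functions change, which is the point of the abstraction.
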
Execution Overview
Conclusions

● The MapReduce programming model proved to be a useful
  abstraction for many different purposes
   ○ Easy to use
       ■ even for programmers without experience with
         parallel and distributed systems
   ○ A large variety of problems are easily expressible as
     MapReduce computations
   ○ The implementation scales to large clusters of machines

● Greatly simplifies large-scale computations at Google
Questions?

http://labs.google.com/papers/mapreduce.html
