MapReduce
Presented by – Somesh maliye
Content
•Motivation for MapReduce.
•What is MapReduce.
•Map() & Reduce() functions.
•MapReduce - Example.
•Dataflow.
•MapReduce Job.
•Job Tracker & Task tracker.
•Characteristics of MapReduce.
•Real time uses.
•Failures in MapReduce.
•Conclusion.
Motivation For MapReduce
•Large scale data processing.
◦ Want to use 1000s of CPUs
•MapReduce Architecture provides
◦ Automatic parallelization & distribution
◦ Fault tolerance
◦ I/O scheduling
◦ Monitoring & status updates
What is MapReduce
•MapReduce is a programming model and an associated implementation for
processing and generating large data sets with a parallel, distributed algorithm
on a cluster.
Map() function
• Reads an input pair <Key, Value>
• Outputs zero or more pairs <K’, V’>
• Let’s count the occurrences of each word in user queries (or Tweets/Blogs)
• The input to map() will be <queryID, QueryText>:
• <Q1,“The teacher went to the store. The store was closed; the store opens in
the morning. The store opens at 9am.” >
• The output would be:
<The, 1> <teacher, 1> <went, 1> <to, 1> <the, 1> <store, 1> <the, 1> <store, 1> <was, 1>
<closed, 1> <the, 1> <store, 1> <opens, 1> <in, 1> <the, 1> <morning, 1> <the, 1> <store, 1>
<opens, 1> <at, 1> <9am, 1>
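The Map() step above can be sketched in a few lines of Python. This is only an illustration, not the API of any particular framework: the name map_fn and the tokenization rule (runs of letters and digits, punctuation dropped) are assumptions. The sketch keeps each word's original capitalization, whereas the slide lowercases most repeats of "The".

```python
import re
from typing import Iterator, Tuple


def map_fn(query_id: str, query_text: str) -> Iterator[Tuple[str, int]]:
    """Emit one <word, 1> pair for every word found in the query text."""
    # Assumed tokenization: a word is a run of letters or digits.
    for word in re.findall(r"[A-Za-z0-9]+", query_text):
        yield (word, 1)


query = ("The teacher went to the store. The store was closed; "
         "the store opens in the morning. The store opens at 9am.")
pairs = list(map_fn("Q1", query))
print(len(pairs))  # one pair per word in the query
print(pairs[:4])
```

Note that map_fn does no counting of its own; it only emits pairs, and all aggregation is left to Reduce().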
Reduce() function
•Accepts the Map() output and aggregates the values for each key
•For our example, the reducer input would be:
• <The, 1> <teacher, 1> <went, 1> <to, 1> <the, 1> <store, 1> <the, 1> <store, 1> <was, 1> <closed, 1>
<the, 1> <store, 1> <opens, 1> <in, 1> <the, 1> <morning, 1> <the, 1> <store, 1> <opens, 1> <at, 1>
<9am, 1>
• The output would be (treating “The” and “the” as the same key):
• <The, 6> <teacher, 1> <went, 1> <to, 1> <store, 4> <was, 1> <closed, 1> <opens, 2> <in, 1> <morning, 1>
<at, 1> <9am, 1>
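A minimal Python sketch of the aggregation described above, assuming the keys have already been lowercased so that "The" and "the" group together. The helper names group_by_key and reduce_fn are hypothetical, and in a real framework the grouping is done by the runtime rather than by user code.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def group_by_key(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Collect all values emitted for the same key (the framework's job)."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def reduce_fn(key: str, values: Iterable[int]) -> Tuple[str, int]:
    """Aggregate the values of one key; for word count this is a sum."""
    return (key, sum(values))


# The 21 pairs from the example, with keys lowercased (an assumption).
words = ("the teacher went to the store the store was closed "
         "the store opens in the morning the store opens at 9am").split()
map_output = [(word, 1) for word in words]

counts = dict(reduce_fn(key, values)
              for key, values in group_by_key(map_output).items())
print(counts)
```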
MapReduce - Example
Dataflow
The dataflow is determined by the following components:
• an input reader
• a Map function
• a partition function
• a compare function
• a Reduce function
• an output writer
Dataflow (Cont.)
•Input reader
The input reader divides the input into appropriately sized 'splits' (in practice typically 64 MB to 128 MB),
and the framework assigns one split to each Map function. The input reader reads data from stable
storage (typically a distributed file system) and generates key/value pairs.
•Map function
The Map function takes a series of key/value pairs, processes each, and generates zero or more output
key/value pairs.
• Partition function
Each Map function output is allocated to a particular reducer by the application's partition function for
sharding purposes. The partition function is given the key and the number of reducers and returns the
index of the desired reducer.
•Comparison function
The input for each Reduce is pulled from the machine where the Map ran and sorted using the
application's comparison function.
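The partition and comparison steps described above could look like the following Python sketch. The hash-modulo rule is one common choice and is an assumption here, as are the names partition, compare and shuffle; zlib.crc32 is used because, unlike Python's built-in hash() on strings, it returns the same value in every process.

```python
import zlib
from functools import cmp_to_key
from typing import List, Tuple


def partition(key: str, num_reducers: int) -> int:
    """Given a key and the number of reducers, return a reducer index."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def compare(a: str, b: str) -> int:
    """Comparison function: negative, zero or positive, like strcmp."""
    return (a > b) - (a < b)


def shuffle(map_output: List[Tuple[str, int]],
            num_reducers: int) -> List[List[Tuple[str, int]]]:
    """Route every pair to its reducer, then sort each reducer's input by key."""
    buckets: List[List[Tuple[str, int]]] = [[] for _ in range(num_reducers)]
    for key, value in map_output:
        buckets[partition(key, num_reducers)].append((key, value))
    for bucket in buckets:
        bucket.sort(key=cmp_to_key(lambda x, y: compare(x[0], y[0])))
    return buckets


pairs = [("store", 1), ("the", 1), ("opens", 1), ("the", 1), ("store", 1)]
buckets = shuffle(pairs, num_reducers=3)
for index, bucket in enumerate(buckets):
    print(index, bucket)
```

Because the reducer index depends only on the key, every pair with the same key reaches the same reducer, which is what lets a single Reduce call see all values for that key.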
Dataflow (Cont.)
•Reduce function
The framework calls the application's Reduce function once for each unique key in the sorted order. The
Reduce can iterate through the values that are associated with that key and produce zero or more
outputs.
•Output writer
The output writer writes the output of the Reduce to stable storage.
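The six dataflow components can be tied together in a small single-process Python sketch. Everything here is a stand-in chosen for illustration: splits are a fixed number of lines rather than 64 MB to 128 MB blocks, "stable storage" is an in-memory list, and all function names are hypothetical.

```python
import zlib
from functools import cmp_to_key
from itertools import groupby


def input_reader(lines, split_size):
    """Divide the input into splits of (line_number, line) key/value pairs."""
    for start in range(0, len(lines), split_size):
        chunk = lines[start:start + split_size]
        yield [(start + offset, line) for offset, line in enumerate(chunk)]


def map_fn(key, value):
    """Emit <word, 1> for each word; lowercasing is an assumption."""
    for word in value.lower().split():
        yield (word, 1)


def partition(key, num_reducers):
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def compare(a, b):
    return (a > b) - (a < b)


def reduce_fn(key, values):
    yield (key, sum(values))


def output_writer(records, sink):
    """Stand-in for writing to stable storage: append to a list."""
    sink.extend(records)


def run_job(lines, split_size=2, num_reducers=2):
    # Map phase: one map task per split; output is partitioned per reducer.
    buckets = [[] for _ in range(num_reducers)]
    for split in input_reader(lines, split_size):
        for key, value in split:
            for out_key, out_value in map_fn(key, value):
                buckets[partition(out_key, num_reducers)].append((out_key, out_value))
    # Reduce phase: sort each reducer's input, then call reduce once per key.
    sink = []
    for bucket in buckets:
        bucket.sort(key=cmp_to_key(lambda x, y: compare(x[0], y[0])))
        for key, group in groupby(bucket, key=lambda pair: pair[0]):
            output_writer(reduce_fn(key, (value for _, value in group)), sink)
    return sink


result = dict(run_job(["the store opens", "the store was closed", "the teacher went"]))
print(result)
```

In a real cluster the map tasks and reduce tasks run on different machines in parallel; the loops here only show the order in which data moves between the components.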
MapReduce Job
A job is a full MapReduce program, which typically causes multiple Map
and Reduce functions to run in parallel over the life of the program. A task is a
map or reduce function executed on a subset of the data.
Job tracker & Task tracker
Characteristics of MapReduce
Real time uses
Failures in MapReduce
• Failures are the norm on commodity hardware
• Worker failure
◦ Detect failure via periodic heartbeats
◦ Re-execute in-progress map/reduce tasks
• Master failure
◦ Single point of failure; resume from the execution log
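Worker-failure handling as listed above can be sketched as follows. This is a simplified, assumed design rather than the code of any real scheduler: the Master class, its method names and the 10-second timeout are all hypothetical, and time is passed in explicitly so the example is deterministic.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Master:
    timeout: float = 10.0  # assumed: seconds of silence before a worker is declared dead
    last_seen: Dict[str, float] = field(default_factory=dict)   # worker -> last heartbeat time
    in_progress: Dict[str, str] = field(default_factory=dict)   # task -> worker running it
    pending: List[str] = field(default_factory=list)            # tasks waiting to be (re)assigned

    def heartbeat(self, worker: str, now: float) -> None:
        """Record a periodic heartbeat from a worker."""
        self.last_seen[worker] = now

    def assign(self, task: str, worker: str) -> None:
        self.in_progress[task] = worker

    def check_workers(self, now: float) -> List[str]:
        """Declare silent workers failed and requeue their in-progress tasks."""
        failed = [w for w, seen in self.last_seen.items() if now - seen > self.timeout]
        for worker in failed:
            del self.last_seen[worker]
            for task, owner in list(self.in_progress.items()):
                if owner == worker:
                    del self.in_progress[task]
                    self.pending.append(task)  # will be re-executed elsewhere
        return failed


master = Master(timeout=10.0)
master.heartbeat("w1", now=0.0)
master.heartbeat("w2", now=0.0)
master.assign("map-0", "w1")
master.assign("reduce-0", "w2")
master.heartbeat("w2", now=8.0)          # w1 stays silent
failed = master.check_workers(now=12.0)
print(failed, master.pending)
```

The sketch covers only worker failure; recovering the master from an execution log is not modeled.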
Conclusion.
•Simplifies large-scale computations that fit this model
•Allows users to focus on the problem without worrying about the details
Thank You
