SlideShare a Scribd company logo
1 of 11
Download to read offline
1
Cloud Computing :
MapReduce - Tutorial
Prof. Soumya K Ghosh
Department of Computer Science and Engineering
IIT KHARAGPUR
Introduction
• MapReduce: programming model developed at Google
• Objective:
– Implement large scale search
– Text processing on massively scalable web data stored using BigTable and GFS distributed file
system
• Designed for processing and generating large volumes of data via massively parallel
computations, utilizing tens of thousands of processors at a time
• Fault tolerant: ensure progress of computation even if processors and networks fail
• Example:
– Hadoop: open source implementation of MapReduce (developed at Yahoo!)
– Available on pre-packaged AMIs on Amazon EC2 cloud platform
9/11/2017 2
MapReduce Model
9/11/2017 3
• Parallel programming abstraction
• Used by many different parallel applications which carry out large-scale
computation involving thousands of processors
• Leverages a common underlying fault-tolerant implementation
• Two phases of MapReduce:
– Map operation
– Reduce operation
• A configurable number of M ‘mapper’ processors and R ‘reducer’ processors are
assigned to work on the problem
• The computation is coordinated by a single master process
MapReduce Model Contd…
9/11/2017 4
• Map phase:
– Each mapper reads approximately 1/M of the input from the global file
system, using locations given by the master
– Map operation consists of transforming one set of key-value pairs to
another:
– Each mapper writes computation results in one file per reducer
– Files are sorted by a key and stored to the local file system
– The master keeps track of the location of these files
MapReduce Model Contd…
9/11/2017 5
• Reduce phase:
– The master informs the reducers where the partial computations have been stored
on local files of respective mappers
– Reducers make remote procedure call requests to the mappers to fetch the files
– Each reducer groups the results of the map step using the same key and performs a
function f on the list of values that correspond to these key value:
– Final results are written back to the GFS file system
MapReduce: Example
9/11/2017 6
• 3 mappers; 2 reducers
• Map function:
• Reduce function:
Problem-1
9/11/2017 7
In a MapReduce framework consider the HDFS block size is 64 MB.
We have 3 files of size 64K, 65Mb and 127Mb. How many blocks will
be created by Hadoop framework?
Problem-2
9/11/2017 8
Write the pseudo-codes (for map and reduce functions) for calculating
the average of a set of integers in MapReduce.
Suppose A = (10, 20, 30, 40, 50) is a set of integers. Show the map and
reduce outputs.
Problem-3
9/11/2017 9
Compute total and average salary of organization XYZ and group by
based on gender (male or female) using MapReduce. The input is as
follows
Name, Gender, Salary
John, M, 10,000
Martha, F, 15,000
----
Problem-4
9/11/2017 10
Write the Map and Reduce functions (pseudo-codes) for the following Word
Length Categorization problem under MapReduce model.
Word Length Categorization: Given a text paragraph (containing only words),
categorize each word into following categories. Output the frequency of
occurrence of words in each category.
Categories:
tiny: 1-2 letters; small: 3-5 letters; medium: 6-9 letters; big: 10 or more letters
11

More Related Content

What's hot

Hybrid Cloud and Its Implementation
Hybrid Cloud and Its ImplementationHybrid Cloud and Its Implementation
Hybrid Cloud and Its ImplementationSai P Mishra
 
Computational Learning Theory
Computational Learning TheoryComputational Learning Theory
Computational Learning Theorybutest
 
Cloud Security, Standards and Applications
Cloud Security, Standards and ApplicationsCloud Security, Standards and Applications
Cloud Security, Standards and ApplicationsDr. Sunil Kr. Pandey
 
Virtualization in cloud computing
Virtualization in cloud computingVirtualization in cloud computing
Virtualization in cloud computingMehul Patel
 
Classification Algorithm.
Classification Algorithm.Classification Algorithm.
Classification Algorithm.Megha Sharma
 
distributed shared memory
 distributed shared memory distributed shared memory
distributed shared memoryAshish Kumar
 
Week2 cloud computing week2
Week2 cloud computing week2Week2 cloud computing week2
Week2 cloud computing week2Ankit Gupta
 
HT2014 Tutorial: Evaluating Recommender Systems - Ensuring Replicability of E...
HT2014 Tutorial: Evaluating Recommender Systems - Ensuring Replicability of E...HT2014 Tutorial: Evaluating Recommender Systems - Ensuring Replicability of E...
HT2014 Tutorial: Evaluating Recommender Systems - Ensuring Replicability of E...Alejandro Bellogin
 
Multilayer & Back propagation algorithm
Multilayer & Back propagation algorithmMultilayer & Back propagation algorithm
Multilayer & Back propagation algorithmswapnac12
 
PRESENTATION ON CLOUD COMPUTING
PRESENTATION ON CLOUD COMPUTINGPRESENTATION ON CLOUD COMPUTING
PRESENTATION ON CLOUD COMPUTINGvipluv mittal
 
Federated Cloud Computing
Federated Cloud ComputingFederated Cloud Computing
Federated Cloud ComputingDavid Wallom
 
Cloud computing vs grid computing
Cloud computing vs grid computingCloud computing vs grid computing
Cloud computing vs grid computing8neutron8
 

What's hot (20)

Hybrid Cloud and Its Implementation
Hybrid Cloud and Its ImplementationHybrid Cloud and Its Implementation
Hybrid Cloud and Its Implementation
 
Failover cluster
Failover clusterFailover cluster
Failover cluster
 
Computational Learning Theory
Computational Learning TheoryComputational Learning Theory
Computational Learning Theory
 
Cloud Security, Standards and Applications
Cloud Security, Standards and ApplicationsCloud Security, Standards and Applications
Cloud Security, Standards and Applications
 
Virtualization in cloud computing
Virtualization in cloud computingVirtualization in cloud computing
Virtualization in cloud computing
 
Classification Algorithm.
Classification Algorithm.Classification Algorithm.
Classification Algorithm.
 
Google App Engine ppt
Google App Engine  pptGoogle App Engine  ppt
Google App Engine ppt
 
Module 4 part_1
Module 4 part_1Module 4 part_1
Module 4 part_1
 
distributed shared memory
 distributed shared memory distributed shared memory
distributed shared memory
 
Week2 cloud computing week2
Week2 cloud computing week2Week2 cloud computing week2
Week2 cloud computing week2
 
HT2014 Tutorial: Evaluating Recommender Systems - Ensuring Replicability of E...
HT2014 Tutorial: Evaluating Recommender Systems - Ensuring Replicability of E...HT2014 Tutorial: Evaluating Recommender Systems - Ensuring Replicability of E...
HT2014 Tutorial: Evaluating Recommender Systems - Ensuring Replicability of E...
 
Cluster computing
Cluster computingCluster computing
Cluster computing
 
Multilayer & Back propagation algorithm
Multilayer & Back propagation algorithmMultilayer & Back propagation algorithm
Multilayer & Back propagation algorithm
 
Data cleaning-outlier-detection
Data cleaning-outlier-detectionData cleaning-outlier-detection
Data cleaning-outlier-detection
 
Hadoop Map Reduce
Hadoop Map ReduceHadoop Map Reduce
Hadoop Map Reduce
 
PRESENTATION ON CLOUD COMPUTING
PRESENTATION ON CLOUD COMPUTINGPRESENTATION ON CLOUD COMPUTING
PRESENTATION ON CLOUD COMPUTING
 
Clustering
ClusteringClustering
Clustering
 
Federated Cloud Computing
Federated Cloud ComputingFederated Cloud Computing
Federated Cloud Computing
 
Virtual machine security
Virtual machine securityVirtual machine security
Virtual machine security
 
Cloud computing vs grid computing
Cloud computing vs grid computingCloud computing vs grid computing
Cloud computing vs grid computing
 

Similar to Mod05lec23(map reduce tutorial)

Report Hadoop Map Reduce
Report Hadoop Map ReduceReport Hadoop Map Reduce
Report Hadoop Map ReduceUrvashi Kataria
 
module3part-1-bigdata-230301002404-3db4f2a4 (1).pdf
module3part-1-bigdata-230301002404-3db4f2a4 (1).pdfmodule3part-1-bigdata-230301002404-3db4f2a4 (1).pdf
module3part-1-bigdata-230301002404-3db4f2a4 (1).pdfTSANKARARAO
 
Optimal Execution Of MapReduce Jobs In Cloud - Voices 2015
Optimal Execution Of MapReduce Jobs In Cloud - Voices 2015Optimal Execution Of MapReduce Jobs In Cloud - Voices 2015
Optimal Execution Of MapReduce Jobs In Cloud - Voices 2015Deanna Kosaraju
 
Hadoop trainting in hyderabad@kelly technologies
Hadoop trainting in hyderabad@kelly technologiesHadoop trainting in hyderabad@kelly technologies
Hadoop trainting in hyderabad@kelly technologiesKelly Technologies
 
Hadoop introduction
Hadoop introductionHadoop introduction
Hadoop introductionDong Ngoc
 
Hadoop scheduler with deadline constraint
Hadoop scheduler with deadline constraintHadoop scheduler with deadline constraint
Hadoop scheduler with deadline constraintijccsa
 
Hadoop and Mapreduce Introduction
Hadoop and Mapreduce IntroductionHadoop and Mapreduce Introduction
Hadoop and Mapreduce Introductionrajsandhu1989
 
Introduccion a Hadoop / Introduction to Hadoop
Introduccion a Hadoop / Introduction to HadoopIntroduccion a Hadoop / Introduction to Hadoop
Introduccion a Hadoop / Introduction to HadoopGERARDO BARBERENA
 
(Berkeley CS186 guest lecture) Big Data Analytics Systems: What Goes Around C...
(Berkeley CS186 guest lecture) Big Data Analytics Systems: What Goes Around C...(Berkeley CS186 guest lecture) Big Data Analytics Systems: What Goes Around C...
(Berkeley CS186 guest lecture) Big Data Analytics Systems: What Goes Around C...Reynold Xin
 
Hadoop Mapreduce Performance Enhancement Using In-Node Combiners
Hadoop Mapreduce Performance Enhancement Using In-Node CombinersHadoop Mapreduce Performance Enhancement Using In-Node Combiners
Hadoop Mapreduce Performance Enhancement Using In-Node Combinersijcsit
 
Hadoop mapreduce and yarn frame work- unit5
Hadoop mapreduce and yarn frame work-  unit5Hadoop mapreduce and yarn frame work-  unit5
Hadoop mapreduce and yarn frame work- unit5RojaT4
 
Design Issues and Challenges of Peer-to-Peer Video on Demand System
Design Issues and Challenges of Peer-to-Peer Video on Demand System Design Issues and Challenges of Peer-to-Peer Video on Demand System
Design Issues and Challenges of Peer-to-Peer Video on Demand System cscpconf
 

Similar to Mod05lec23(map reduce tutorial) (20)

Report Hadoop Map Reduce
Report Hadoop Map ReduceReport Hadoop Map Reduce
Report Hadoop Map Reduce
 
E031201032036
E031201032036E031201032036
E031201032036
 
Hadoop
HadoopHadoop
Hadoop
 
module3part-1-bigdata-230301002404-3db4f2a4 (1).pdf
module3part-1-bigdata-230301002404-3db4f2a4 (1).pdfmodule3part-1-bigdata-230301002404-3db4f2a4 (1).pdf
module3part-1-bigdata-230301002404-3db4f2a4 (1).pdf
 
Big Data.pptx
Big Data.pptxBig Data.pptx
Big Data.pptx
 
Hadoop - Introduction to HDFS
Hadoop - Introduction to HDFSHadoop - Introduction to HDFS
Hadoop - Introduction to HDFS
 
Hadoop
HadoopHadoop
Hadoop
 
Optimal Execution Of MapReduce Jobs In Cloud - Voices 2015
Optimal Execution Of MapReduce Jobs In Cloud - Voices 2015Optimal Execution Of MapReduce Jobs In Cloud - Voices 2015
Optimal Execution Of MapReduce Jobs In Cloud - Voices 2015
 
Big Data Processing
Big Data ProcessingBig Data Processing
Big Data Processing
 
Hadoop trainting in hyderabad@kelly technologies
Hadoop trainting in hyderabad@kelly technologiesHadoop trainting in hyderabad@kelly technologies
Hadoop trainting in hyderabad@kelly technologies
 
Hadoop
HadoopHadoop
Hadoop
 
Map reducecloudtech
Map reducecloudtechMap reducecloudtech
Map reducecloudtech
 
Hadoop introduction
Hadoop introductionHadoop introduction
Hadoop introduction
 
Hadoop scheduler with deadline constraint
Hadoop scheduler with deadline constraintHadoop scheduler with deadline constraint
Hadoop scheduler with deadline constraint
 
Hadoop and Mapreduce Introduction
Hadoop and Mapreduce IntroductionHadoop and Mapreduce Introduction
Hadoop and Mapreduce Introduction
 
Introduccion a Hadoop / Introduction to Hadoop
Introduccion a Hadoop / Introduction to HadoopIntroduccion a Hadoop / Introduction to Hadoop
Introduccion a Hadoop / Introduction to Hadoop
 
(Berkeley CS186 guest lecture) Big Data Analytics Systems: What Goes Around C...
(Berkeley CS186 guest lecture) Big Data Analytics Systems: What Goes Around C...(Berkeley CS186 guest lecture) Big Data Analytics Systems: What Goes Around C...
(Berkeley CS186 guest lecture) Big Data Analytics Systems: What Goes Around C...
 
Hadoop Mapreduce Performance Enhancement Using In-Node Combiners
Hadoop Mapreduce Performance Enhancement Using In-Node CombinersHadoop Mapreduce Performance Enhancement Using In-Node Combiners
Hadoop Mapreduce Performance Enhancement Using In-Node Combiners
 
Hadoop mapreduce and yarn frame work- unit5
Hadoop mapreduce and yarn frame work-  unit5Hadoop mapreduce and yarn frame work-  unit5
Hadoop mapreduce and yarn frame work- unit5
 
Design Issues and Challenges of Peer-to-Peer Video on Demand System
Design Issues and Challenges of Peer-to-Peer Video on Demand System Design Issues and Challenges of Peer-to-Peer Video on Demand System
Design Issues and Challenges of Peer-to-Peer Video on Demand System
 

More from Ankit Gupta

Biometricstechnology in iot and machine learning
Biometricstechnology in iot and machine learningBiometricstechnology in iot and machine learning
Biometricstechnology in iot and machine learningAnkit Gupta
 
Week 8 lecture material
Week 8 lecture materialWeek 8 lecture material
Week 8 lecture materialAnkit Gupta
 
Week 4 lecture material cc (1)
Week 4 lecture material cc (1)Week 4 lecture material cc (1)
Week 4 lecture material cc (1)Ankit Gupta
 
Week 1 lecture material cc
Week 1 lecture material ccWeek 1 lecture material cc
Week 1 lecture material ccAnkit Gupta
 
Mod05lec25(resource mgmt ii)
Mod05lec25(resource mgmt ii)Mod05lec25(resource mgmt ii)
Mod05lec25(resource mgmt ii)Ankit Gupta
 
Mod05lec24(resource mgmt i)
Mod05lec24(resource mgmt i)Mod05lec24(resource mgmt i)
Mod05lec24(resource mgmt i)Ankit Gupta
 
Mod05lec21(sla tutorial)
Mod05lec21(sla tutorial)Mod05lec21(sla tutorial)
Mod05lec21(sla tutorial)Ankit Gupta
 
Lecture29 cc-security4
Lecture29 cc-security4Lecture29 cc-security4
Lecture29 cc-security4Ankit Gupta
 
Lecture28 cc-security3
Lecture28 cc-security3Lecture28 cc-security3
Lecture28 cc-security3Ankit Gupta
 
Lecture27 cc-security2
Lecture27 cc-security2Lecture27 cc-security2
Lecture27 cc-security2Ankit Gupta
 
Lecture26 cc-security1
Lecture26 cc-security1Lecture26 cc-security1
Lecture26 cc-security1Ankit Gupta
 
Lecture 30 cloud mktplace
Lecture 30 cloud mktplaceLecture 30 cloud mktplace
Lecture 30 cloud mktplaceAnkit Gupta
 
Week 7 lecture material
Week 7 lecture materialWeek 7 lecture material
Week 7 lecture materialAnkit Gupta
 
Gurukul Cse cbcs-2015-16
Gurukul Cse cbcs-2015-16Gurukul Cse cbcs-2015-16
Gurukul Cse cbcs-2015-16Ankit Gupta
 
Microprocessor full hand made notes
Microprocessor full hand made notesMicroprocessor full hand made notes
Microprocessor full hand made notesAnkit Gupta
 
Transfer Leaning Using Pytorch synopsis Minor project pptx
Transfer Leaning Using Pytorch  synopsis Minor project pptxTransfer Leaning Using Pytorch  synopsis Minor project pptx
Transfer Leaning Using Pytorch synopsis Minor project pptxAnkit Gupta
 
Intro/Overview on Machine Learning Presentation -2
Intro/Overview on Machine Learning Presentation -2Intro/Overview on Machine Learning Presentation -2
Intro/Overview on Machine Learning Presentation -2Ankit Gupta
 
Intro/Overview on Machine Learning Presentation
Intro/Overview on Machine Learning PresentationIntro/Overview on Machine Learning Presentation
Intro/Overview on Machine Learning PresentationAnkit Gupta
 
Cloud computing ebook
Cloud computing ebookCloud computing ebook
Cloud computing ebookAnkit Gupta
 
java program assigment -2
java program assigment -2java program assigment -2
java program assigment -2Ankit Gupta
 

More from Ankit Gupta (20)

Biometricstechnology in iot and machine learning
Biometricstechnology in iot and machine learningBiometricstechnology in iot and machine learning
Biometricstechnology in iot and machine learning
 
Week 8 lecture material
Week 8 lecture materialWeek 8 lecture material
Week 8 lecture material
 
Week 4 lecture material cc (1)
Week 4 lecture material cc (1)Week 4 lecture material cc (1)
Week 4 lecture material cc (1)
 
Week 1 lecture material cc
Week 1 lecture material ccWeek 1 lecture material cc
Week 1 lecture material cc
 
Mod05lec25(resource mgmt ii)
Mod05lec25(resource mgmt ii)Mod05lec25(resource mgmt ii)
Mod05lec25(resource mgmt ii)
 
Mod05lec24(resource mgmt i)
Mod05lec24(resource mgmt i)Mod05lec24(resource mgmt i)
Mod05lec24(resource mgmt i)
 
Mod05lec21(sla tutorial)
Mod05lec21(sla tutorial)Mod05lec21(sla tutorial)
Mod05lec21(sla tutorial)
 
Lecture29 cc-security4
Lecture29 cc-security4Lecture29 cc-security4
Lecture29 cc-security4
 
Lecture28 cc-security3
Lecture28 cc-security3Lecture28 cc-security3
Lecture28 cc-security3
 
Lecture27 cc-security2
Lecture27 cc-security2Lecture27 cc-security2
Lecture27 cc-security2
 
Lecture26 cc-security1
Lecture26 cc-security1Lecture26 cc-security1
Lecture26 cc-security1
 
Lecture 30 cloud mktplace
Lecture 30 cloud mktplaceLecture 30 cloud mktplace
Lecture 30 cloud mktplace
 
Week 7 lecture material
Week 7 lecture materialWeek 7 lecture material
Week 7 lecture material
 
Gurukul Cse cbcs-2015-16
Gurukul Cse cbcs-2015-16Gurukul Cse cbcs-2015-16
Gurukul Cse cbcs-2015-16
 
Microprocessor full hand made notes
Microprocessor full hand made notesMicroprocessor full hand made notes
Microprocessor full hand made notes
 
Transfer Leaning Using Pytorch synopsis Minor project pptx
Transfer Leaning Using Pytorch  synopsis Minor project pptxTransfer Leaning Using Pytorch  synopsis Minor project pptx
Transfer Leaning Using Pytorch synopsis Minor project pptx
 
Intro/Overview on Machine Learning Presentation -2
Intro/Overview on Machine Learning Presentation -2Intro/Overview on Machine Learning Presentation -2
Intro/Overview on Machine Learning Presentation -2
 
Intro/Overview on Machine Learning Presentation
Intro/Overview on Machine Learning PresentationIntro/Overview on Machine Learning Presentation
Intro/Overview on Machine Learning Presentation
 
Cloud computing ebook
Cloud computing ebookCloud computing ebook
Cloud computing ebook
 
java program assigment -2
java program assigment -2java program assigment -2
java program assigment -2
 

Recently uploaded

chaitra-1.pptx fake news detection using machine learning
chaitra-1.pptx  fake news detection using machine learningchaitra-1.pptx  fake news detection using machine learning
chaitra-1.pptx fake news detection using machine learningmisbanausheenparvam
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130Suhani Kapoor
 
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerStudy on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerAnamika Sarkar
 
Microscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxMicroscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxpurnimasatapathy1234
 
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfAsst.prof M.Gokilavani
 
Concrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptxConcrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptxKartikeyaDwivedi3
 
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.eptoze12
 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile servicerehmti665
 
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfCCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfAsst.prof M.Gokilavani
 
GDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentationGDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentationGDSCAESB
 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxDeepakSakkari2
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...Soham Mondal
 
Introduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxIntroduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxk795866
 
Current Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLCurrent Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLDeelipZope
 

Recently uploaded (20)

chaitra-1.pptx fake news detection using machine learning
chaitra-1.pptx  fake news detection using machine learningchaitra-1.pptx  fake news detection using machine learning
chaitra-1.pptx fake news detection using machine learning
 
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptxExploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
 
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerStudy on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
 
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
 
Microscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxMicroscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptx
 
Design and analysis of solar grass cutter.pdf
Design and analysis of solar grass cutter.pdfDesign and analysis of solar grass cutter.pdf
Design and analysis of solar grass cutter.pdf
 
POWER SYSTEMS-1 Complete notes examples
POWER SYSTEMS-1 Complete notes  examplesPOWER SYSTEMS-1 Complete notes  examples
POWER SYSTEMS-1 Complete notes examples
 
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
 
Concrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptxConcrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptx
 
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
 
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.
 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile service
 
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfCCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
 
GDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentationGDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentation
 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptx
 
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCRCall Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
 
Introduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxIntroduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptx
 
Current Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLCurrent Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCL
 

Mod05lec23(map reduce tutorial)

  • 1. 1 Cloud Computing : MapReduce - Tutorial Prof. Soumya K Ghosh Department of Computer Science and Engineering IIT KHARAGPUR
  • 2. Introduction • MapReduce: programming model developed at Google • Objective: – Implement large scale search – Text processing on massively scalable web data stored using BigTable and GFS distributed file system • Designed for processing and generating large volumes of data via massively parallel computations, utilizing tens of thousands of processors at a time • Fault tolerant: ensure progress of computation even if processors and networks fail • Example: – Hadoop: open source implementation of MapReduce (developed at Yahoo!) – Available on pre-packaged AMIs on Amazon EC2 cloud platform 9/11/2017 2
  • 3. MapReduce Model 9/11/2017 3 • Parallel programming abstraction • Used by many different parallel applications which carry out large-scale computation involving thousands of processors • Leverages a common underlying fault-tolerant implementation • Two phases of MapReduce: – Map operation – Reduce operation • A configurable number of M ‘mapper’ processors and R ‘reducer’ processors are assigned to work on the problem • The computation is coordinated by a single master process
  • 4. MapReduce Model Contd… 9/11/2017 4 • Map phase: – Each mapper reads approximately 1/M of the input from the global file system, using locations given by the master – Map operation consists of transforming one set of key-value pairs to another: – Each mapper writes computation results in one file per reducer – Files are sorted by a key and stored to the local file system – The master keeps track of the location of these files
  • 5. MapReduce Model Contd… 9/11/2017 5 • Reduce phase: – The master informs the reducers where the partial computations have been stored on local files of respective mappers – Reducers make remote procedure call requests to the mappers to fetch the files – Each reducer groups the results of the map step using the same key and performs a function f on the list of values that correspond to these key value: – Final results are written back to the GFS file system
  • 6. MapReduce: Example 9/11/2017 6 • 3 mappers; 2 reducers • Map function: • Reduce function:
  • 7. Problem-1 9/11/2017 7 In a MapReduce framework consider the HDFS block size is 64 MB. We have 3 files of size 64K, 65Mb and 127Mb. How many blocks will be created by Hadoop framework?
  • 8. Problem-2 9/11/2017 8 Write the pseudo-codes (for map and reduce functions) for calculating the average of a set of integers in MapReduce. Suppose A = (10, 20, 30, 40, 50) is a set of integers. Show the map and reduce outputs.
  • 9. Problem-3 9/11/2017 9 Compute total and average salary of organization XYZ and group by based on gender (male or female) using MapReduce. The input is as follows Name, Gender, Salary John, M, 10,000 Martha, F, 15,000 ----
  • 10. Problem-4 9/11/2017 10 Write the Map and Reduce functions (pseudo-codes) for the following Word Length Categorization problem under MapReduce model. Word Length Categorization: Given a text paragraph (containing only words), categorize each word into following categories. Output the frequency of occurrence of words in each category. Categories: tiny: 1-2 letters; small: 3-5 letters; medium: 6-9 letters; big: 10 or more letters
  • 11. 11