Distributed Systems 1, Columbia Course 4113, Instructor: Roxana Geambasu
Distributed Systems 1
CUCS Course 4113
https://systems.cs.columbia.edu/ds1-class/
Instructor: Roxana Geambasu
MapReduce
Outline
• Today: MapReduce
– Analytics programming interface
– Word count example
– Chaining
– Reading/writing from/to HDFS
– Dealing with failures
• In future: Spark
– Motivation
– Resilient Distributed Datasets (RDDs)
– Programming interface
– Transformations and actions
Large-scale Analytics
• Compute the frequency of
words in a corpus of
documents.
• Count how many times users
have clicked on each of a
(large) set of ads.
• PageRank: Compute the
“importance” of a web page
based on the “importances”
of the pages that link to it.
• ….
Option 1: SQL
• Before MapReduce, analytics were mostly done in SQL or manually.
• Example: Count word appearances in a corpus of documents.
• With SQL, the rough query might be:
• Very expressive, convenient to program, but no one knew how
to scale SQL execution!
SELECT word, COUNT(*)
FROM (
  SELECT UNNEST(string_to_array(doc_content, ' ')) AS word
  FROM Corpus
) AS words
GROUP BY word;
Option 2: Manual
• Example: Count word appearances
in a corpus of documents.
[Diagram: a corpus of documents (Docs) and machines M1…Mn connected by a network.]
Option 2: Manual
• Example: Count word appearances
in a corpus of documents.
• Phase 1: Assign documents to
different machines/nodes.
– Each computes a dictionary:
{word: local_freq}.
[Diagram: documents (Docs) distributed across machines M1…Mn over the network.]
Option 2: Manual
• Example: Count word appearances
in a corpus of documents.
• Phase 1: Assign documents to
different machines/nodes.
– Each computes a dictionary:
{word: local_freq}.
• Phase 2: Nodes exchange
dictionaries (how?) to aggregate
local_freq’s.
[Diagram: documents (Docs) distributed across machines M1…Mn over the network.]
Option 2: Manual
• Example: Count word appearances
in a corpus of documents.
• Phase 1: Assign documents to
different machines/nodes.
– Each computes a dictionary:
{word: local_freq}.
• Phase 2: Nodes exchange
dictionaries (how?) to aggregate
local_freq’s.
– But how to make this scale??
[Diagram: documents (Docs) distributed across machines M1…Mn over the network.]
Option 2: Manual (cont’d)
• Phase 2, Option a: Send all {word: local_freq} dictionaries to one node, which aggregates.
– But what if it’s too much data for one node?
• Phase 2, Option b: Each node
sends (word, local_freq) to a
designated node, e.g., node
with ID hash(word) % N.
[Diagram: documents (Docs) distributed across machines M1…Mn over the network.]
Option 2: Manual (cont’d)
• Phase 2, Option a: Send all {word: local_freq} dictionaries to one node, which aggregates.
– But what if it’s too much data for one node?
• Phase 2, Option b: Each node
sends (word, local_freq) to a
designated node, e.g., node
with ID hash(word) % N.
[Diagram: documents (Docs) distributed across machines M1…Mn over the network.]
We’ve roughly discovered an app-specific version of MapReduce!
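The Option-b routing can be sketched in Python. This is a toy illustration: the node count N and the dictionary contents are made up, and a stable hash (hashlib) is used because Python’s built-in hash() is randomized per process.

```python
import hashlib

N = 4  # number of aggregator nodes (assumed for illustration)

def designated_node(word, n_nodes=N):
    # Stable hash of the word, so every sender routes the same word
    # to the same aggregator node, no matter which machine computes it.
    digest = hashlib.md5(word.encode("utf-8")).hexdigest()
    return int(digest, 16) % n_nodes

# Each node routes its local counts to the responsible aggregators:
local_freq = {"store": 4, "the": 5, "teacher": 1}
outbox = {i: [] for i in range(N)}
for word, count in local_freq.items():
    outbox[designated_node(word)].append((word, count))
```

Because the routing depends only on the word, every (word, local_freq) pair for a given word lands on the same node, which can then simply sum them.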
Option 2: Challenges
• How to generalize to other
applications?
• How to deal with failures?
• How to deal with slow nodes?
• How to deal with load balancing
(some docs are very large,
others small)?
• How to deal with skew (some
words are very frequent, so
nodes designated to aggregate
them will be pounded)?
• …
[Diagram: documents (Docs) distributed across machines M1…Mn over the network.]
MapReduce
• Parallelizable programming model:
– Applies to a broad class of analytics applications.
– Not as expressive as SQL, but easier to scale.
– Consists of two phases, each intrinsically parallelizable:
• Map: processes input elements independently to emit relevant
(key, value) pairs from each.
• Transparently, the system groups all the values for each key
together: (key, [list of values]).
• Reduce: aggregates all the values for each key to emit a global
value for each key.
• Scalable, efficient, fault tolerant runtime system.
– Addresses preceding challenges (and more).
• Technical paper: please read it!
How it works
• Input: a collection of elements of (key, value) pair type.
• Programmer defines two functions:
– Map(key, value) → 0, 1, or more (key’, value’) pairs
– Reduce(key, value-list) → output
• Execution:
– Apply Map to each input key-value pair, in parallel for different keys.
– Sort and group the emitted (key’, value’) pairs to produce (key’, value’-list) pairs.
– Apply Reduce to each (key, value-list) pair, in parallel for different
keys.
• Output is the union of all Reduce invocations’ outputs.
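The execution steps above can be sketched as a single-machine simulation in Python. This is only an illustration of the model (the function name and structure are mine, not the distributed runtime):

```python
from collections import defaultdict

def map_reduce(inputs, map_fn, reduce_fn):
    """inputs: iterable of (key, value) pairs."""
    # 1. Map: apply map_fn to each input pair; it may emit 0, 1, or
    #    more (key', value') pairs per input.
    intermediate = []
    for k, v in inputs:
        intermediate.extend(map_fn(k, v))
    # 2. Shuffle/sort: group all emitted values by their key.
    groups = defaultdict(list)
    for k, v in sorted(intermediate, key=lambda kv: kv[0]):
        groups[k].append(v)
    # 3. Reduce: apply reduce_fn to each (key, value-list) pair.
    return {k: reduce_fn(k, vs) for k, vs in groups.items()}
```

For instance, `map_reduce([("D1", "a b a")], lambda k, v: [(w, 1) for w in v.split()], lambda k, vs: sum(vs))` returns `{"a": 2, "b": 1}`.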
Workflow
[Diagram: Input data in a DFS is divided into Split 0, Split 1, Split 2. Mapper workers read the splits (Map phase: extract something you care about from each record) and write intermediate data to the local FS (“local write”). The system does remote reads and sorts. Reducer workers (Reduce phase: aggregate) write Output File 0 and Output File 1 back to the DFS.]
Example: Word count
• We have a directory, which contains many documents.
• The documents contain words separated by whitespace and punctuation.
• Goal: Count the number of times each distinct word appears in the corpus.
Word count with MapReduce
Map(key, value):
// key: document ID; value: document content
FOR (each word w IN value)
emit(w, 1);
Reduce(key, value-list):
// key: a word; value-list: a list of integers
result = 0;
FOR (each integer v IN value-list)
result += v;
emit(key, result);
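A runnable Python version of this pseudocode, self-contained on one machine. The tokenizer regex is an assumption about how words are split; everything else mirrors the Map and Reduce functions above.

```python
import re
from collections import defaultdict

def map_word_count(doc_id, content):
    # key: document ID; value: document content.
    # Emit (word, 1) for each word, splitting on whitespace/punctuation.
    return [(word, 1) for word in re.findall(r"[A-Za-z0-9']+", content)]

def reduce_word_count(word, counts):
    # key: a word; counts: a list of integers.
    return sum(counts)

def word_count(docs):
    # docs: {document ID: content}. Run Map, group by key, then Reduce.
    intermediate = []
    for doc_id, content in docs.items():
        intermediate.extend(map_word_count(doc_id, content))
    groups = defaultdict(list)
    for word, one in intermediate:
        groups[word].append(one)
    return {w: reduce_word_count(w, cs) for w, cs in groups.items()}
```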
Word count with MapReduce
Map(key, value):
// key: document ID; value: document content
FOR (each word w IN value)
emit(w, 1);
Reduce(key, value-list):
// key: a word; value-list: a list of integers
result = 0;
FOR (each integer v IN value-list)
result += v;
emit(key, result);
Note: the values are expected to be all 1’s, but “combiners” allow local summing of integers with the same key before passing them to reducers.
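A combiner is essentially the reduce function run locally over a single mapper’s output before the shuffle. A minimal sketch (the function name and inputs are illustrative):

```python
from collections import Counter

def combine(pairs):
    # pairs: the (word, 1) pairs emitted by one mapper.
    # Sum values with the same key locally, so the mapper ships one
    # (word, local_sum) pair per word instead of many (word, 1) pairs.
    summed = Counter()
    for word, count in pairs:
        summed[word] += count
    return list(summed.items())
```

This is safe for word count because addition is associative and commutative, so local partial sums do not change the final reduce result.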
Map Phase
• Mapper is given key: document ID; value: document content, say:
(D1,“The teacher went to the store. The store was closed; the
store opens in the morning. The store opens at 9am.”)
• It will emit the following pairs:
<The, 1> <teacher, 1> <went, 1> <to, 1> <the, 1> <store, 1>
<the, 1> <store, 1> <was, 1> <closed, 1> <the, 1> <store, 1>
<opens, 1> <in, 1> <the, 1> <morning, 1> <the, 1> <store, 1>
<opens, 1> <at, 1> <9am, 1>
• Normally, there will be many documents, hence many Mappers that emit such pairs in parallel, but for simplicity, let’s say that these are all the pairs emitted from the Map phase.
Intermediary Phase
• System sorts emitted (key, value) pairs by key:
<9am, 1> <at, 1> <closed, 1> <in, 1> <morning, 1> <opens, 1> <opens, 1>
<store, 1> <store, 1> <store, 1> <store, 1> <teacher, 1>
<the, 1> <the, 1> <the, 1> <the, 1> <the, 1> <to, 1> <was, 1> <went, 1> <The, 1>
Intermediary Phase
• System sorts emitted (key, value) pairs by key and assigns each key to a reducer:
– Reducer 1: <9am, 1> <at, 1> <closed, 1> <in, 1> <morning, 1> <opens, 1> <opens, 1> <store, 1> <store, 1> <store, 1> <store, 1>
– Reducer 2: <teacher, 1> <the, 1> <the, 1> <the, 1> <the, 1> <the, 1> <to, 1> <was, 1> <went, 1> <The, 1>
Reduce Phase
• For each unique key emitted from the Map phase, the function Reduce(key, value-list) is invoked on Reducer 1 or Reducer 2.
• Across their invocations, these Reducers will emit:
– Reducer 1: <9am, 1> <at, 1> <closed, 1> <in, 1> <morning, 1> <opens, 2> <store, 4>
– Reducer 2: <teacher, 1> <the, 5> <to, 1> <was, 1> <went, 1> <The, 1>
• Together, this is the output of the entire program.
Chaining MapReduce
• The programming model seems pretty restrictive.
• But quite a few analytics applications can be written with it,
especially with a technique called chaining.
• If the output of reducers is (key, value) pairs, then their output
can be passed onto other Map/Reduce processes.
• This chaining can support a variety of analytics (though certainly not all types; e.g., no iterative ML, because there are no loops).
Example: Word Frequency
• Suppose instead of word count, we wanted to compute word
frequency: the probability that a word would appear in a document.
• This means computing the fraction of times a word appears, out of
the total number of words in the corpus.
Solution: Chain two MapReduce’s
• First Map/Reduce: Word Count (like before)
– Map: process documents and output <word, 1> pairs.
– Multiple Reducers: emit <word, word_count> for each word.
• Second MapReduce:
– Map: processes <word, word_count> and outputs (1, (word, word_count)).
– 1 Reducer: does two passes:
• In first pass, sums up all word_count’s to calculate overall_count.
• In second pass calculates fractions. Emits multiple <word,
word_count/overall_count>.
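The second job of the chain can be sketched in Python. This is illustrative: `word_counts` stands for the first job’s output, and the single reducer’s two passes are simulated on one machine.

```python
def word_frequency(word_counts):
    # Map: re-key every (word, count) pair under the single key 1,
    # so that one reducer receives the entire list of values.
    values = [(word, count) for word, count in word_counts.items()]
    # Reduce, pass 1: sum all word_counts to get overall_count.
    overall_count = sum(count for _, count in values)
    # Reduce, pass 2: emit <word, word_count/overall_count> fractions.
    return {word: count / overall_count for word, count in values}
```

Note the scalability cost of this design: a single reducer must see all (word, word_count) pairs, which is acceptable only because the first job already shrank the data to one pair per distinct word.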
Option 2: Challenges (recall)
• How to generalize to other applications?
– See original MapReduce paper for more examples.
• How to deal with failures?
• How to deal with slow nodes?
• How to deal with load balancing?
• How to deal with skew?
• The MapReduce runtime tries to hide these challenges as
much as possible. We’ll talk about a few of the performance
and fault tolerance challenges here.
Architecture
[Diagram: the same workflow as before, plus a Master that assigns map tasks to Mappers and reduce tasks to Reducers. Input splits and output files live on DFS Data Nodes, a DFS Name Server tracks block locations, and intermediate (Tmp) data stays on the workers’ local FS.]
Data locality for performance
• Master scheduling policy:
– Asks DFS for locations of replicas of input file blocks.
– Map tasks are scheduled so that a replica of each DFS input block is on the same machine or on the same rack.
• Effect: Thousands of machines read input at local
disk speed.
– Don’t need to transfer input data all over the cluster over
the network: eliminate network bottleneck!
Heartbeats, replication for fault tolerance
• Failures are the norm in data centers.
• Worker failure:
– Master detects if workers failed by periodically pinging
them (this is called a “heartbeat”).
– Re-execute in-progress map/reduce tasks.
• Master failure:
– Initially, the master was a single point of failure; on restart, it resumed from an execution log. Subsequent versions used replication and consensus.
• Effect: Google’s paper reports one run that lost 1,600 of 1,800 machines but still finished fine.
Redundant execution for performance
• Slow workers (“stragglers”) significantly lengthen completion time.
– The slowest worker can determine the total latency! (This is why many systems measure 99th-percentile latency.)
– Causes: other jobs consuming resources on the machine; bad disks with errors that transfer data very slowly.
• Solution: spawn backup copies of tasks.
– Whichever copy finishes first “wins.”
– I.e., treat slow executions as failed executions!
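The backup-task idea can be sketched with threads. This is a toy illustration of “first copy wins,” not the actual MapReduce scheduler; the function name is mine.

```python
import concurrent.futures as cf

def run_with_backup(task, *args, n_copies=2):
    # Launch redundant copies of the same task and return the result of
    # whichever copy finishes first.
    with cf.ThreadPoolExecutor(max_workers=n_copies) as pool:
        futures = [pool.submit(task, *args) for _ in range(n_copies)]
        done, _ = cf.wait(futures, return_when=cf.FIRST_COMPLETED)
        # A real system would also cancel or ignore the straggler copy;
        # this toy version simply waits for it when the pool shuts down.
        return next(iter(done)).result()
```

This only helps when tasks are idempotent and side-effect free, which MapReduce’s deterministic Map and Reduce functions are designed to be.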
Take-Aways
• MapReduce is a programming model and runtime for scalable, fault-tolerant, and efficient analytics.
– Well, some types of analytics; it is not very expressive.
• It hides common at-scale challenges under convenient abstractions.
– Programmers need only implement Map and Reduce; the system takes care of running them at scale on lots of data.
– Failures raise semantic and performance challenges, which MapReduce typically handles through redundancy.
• Open-source implementations exist, including Hadoop and Flink.
• Spark, a popular data analytics framework, supports MapReduce as well as a wider range of programming models.
Acknowledgement
• Slides prepared based on Asaf Cidon’s 2020 DS-1
invited lecture.
32

More Related Content

Similar to 02-map-reduce.pdf

Hadoop fault tolerance
Hadoop  fault toleranceHadoop  fault tolerance
Hadoop fault tolerancePallav Jha
 
Big data shim
Big data shimBig data shim
Big data shimtistrue
 
Evaluating Cues for Resuming Interrupted Programming TAsks
Evaluating Cues for Resuming Interrupted Programming TAsksEvaluating Cues for Resuming Interrupted Programming TAsks
Evaluating Cues for Resuming Interrupted Programming TAsksChris Parnin
 
5_RNN_LSTM.pdf
5_RNN_LSTM.pdf5_RNN_LSTM.pdf
5_RNN_LSTM.pdfFEG
 
Computing Scientometrics in Large-Scale Academic Search Engines with MapReduce
Computing Scientometrics in Large-Scale Academic Search Engines with MapReduceComputing Scientometrics in Large-Scale Academic Search Engines with MapReduce
Computing Scientometrics in Large-Scale Academic Search Engines with MapReduceLeonidas Akritidis
 
L19CloudMapReduce introduction for cloud computing .ppt
L19CloudMapReduce introduction for cloud computing .pptL19CloudMapReduce introduction for cloud computing .ppt
L19CloudMapReduce introduction for cloud computing .pptMaruthiPrasad96
 
Hyperoptimized Machine Learning and Deep Learning Methods For Geospatial and ...
Hyperoptimized Machine Learning and Deep Learning Methods For Geospatial and ...Hyperoptimized Machine Learning and Deep Learning Methods For Geospatial and ...
Hyperoptimized Machine Learning and Deep Learning Methods For Geospatial and ...Neelabha Pant
 
The Legion Programming Model for HPC
The Legion Programming Model for HPCThe Legion Programming Model for HPC
The Legion Programming Model for HPCinside-BigData.com
 
Online learning with structured streaming, spark summit brussels 2016
Online learning with structured streaming, spark summit brussels 2016Online learning with structured streaming, spark summit brussels 2016
Online learning with structured streaming, spark summit brussels 2016Ram Sriharsha
 
Mapreduce script
Mapreduce scriptMapreduce script
Mapreduce scriptHaripritha
 
mapreduce.pptx
mapreduce.pptxmapreduce.pptx
mapreduce.pptxShimoFcis
 
OrientDB - Time Series and Event Sequences - Codemotion Milan 2014
OrientDB - Time Series and Event Sequences - Codemotion Milan 2014OrientDB - Time Series and Event Sequences - Codemotion Milan 2014
OrientDB - Time Series and Event Sequences - Codemotion Milan 2014Luigi Dell'Aquila
 

Similar to 02-map-reduce.pdf (20)

Unit-2 Hadoop Framework.pdf
Unit-2 Hadoop Framework.pdfUnit-2 Hadoop Framework.pdf
Unit-2 Hadoop Framework.pdf
 
Unit-2 Hadoop Framework.pdf
Unit-2 Hadoop Framework.pdfUnit-2 Hadoop Framework.pdf
Unit-2 Hadoop Framework.pdf
 
MapReduce basics
MapReduce basicsMapReduce basics
MapReduce basics
 
Hadoop fault tolerance
Hadoop  fault toleranceHadoop  fault tolerance
Hadoop fault tolerance
 
Big data shim
Big data shimBig data shim
Big data shim
 
Evaluating Cues for Resuming Interrupted Programming TAsks
Evaluating Cues for Resuming Interrupted Programming TAsksEvaluating Cues for Resuming Interrupted Programming TAsks
Evaluating Cues for Resuming Interrupted Programming TAsks
 
5_RNN_LSTM.pdf
5_RNN_LSTM.pdf5_RNN_LSTM.pdf
5_RNN_LSTM.pdf
 
Computing Scientometrics in Large-Scale Academic Search Engines with MapReduce
Computing Scientometrics in Large-Scale Academic Search Engines with MapReduceComputing Scientometrics in Large-Scale Academic Search Engines with MapReduce
Computing Scientometrics in Large-Scale Academic Search Engines with MapReduce
 
L19CloudMapReduce introduction for cloud computing .ppt
L19CloudMapReduce introduction for cloud computing .pptL19CloudMapReduce introduction for cloud computing .ppt
L19CloudMapReduce introduction for cloud computing .ppt
 
Hyperoptimized Machine Learning and Deep Learning Methods For Geospatial and ...
Hyperoptimized Machine Learning and Deep Learning Methods For Geospatial and ...Hyperoptimized Machine Learning and Deep Learning Methods For Geospatial and ...
Hyperoptimized Machine Learning and Deep Learning Methods For Geospatial and ...
 
The Legion Programming Model for HPC
The Legion Programming Model for HPCThe Legion Programming Model for HPC
The Legion Programming Model for HPC
 
Online learning with structured streaming, spark summit brussels 2016
Online learning with structured streaming, spark summit brussels 2016Online learning with structured streaming, spark summit brussels 2016
Online learning with structured streaming, spark summit brussels 2016
 
05 k-means clustering
05 k-means clustering05 k-means clustering
05 k-means clustering
 
Map reduce part1
Map reduce part1Map reduce part1
Map reduce part1
 
Enar short course
Enar short courseEnar short course
Enar short course
 
Mapreduce script
Mapreduce scriptMapreduce script
Mapreduce script
 
mapreduce.pptx
mapreduce.pptxmapreduce.pptx
mapreduce.pptx
 
OrientDB - Time Series and Event Sequences - Codemotion Milan 2014
OrientDB - Time Series and Event Sequences - Codemotion Milan 2014OrientDB - Time Series and Event Sequences - Codemotion Milan 2014
OrientDB - Time Series and Event Sequences - Codemotion Milan 2014
 
ch02-mapreduce.pptx
ch02-mapreduce.pptxch02-mapreduce.pptx
ch02-mapreduce.pptx
 
Lecture 1 (bce-7)
Lecture   1 (bce-7)Lecture   1 (bce-7)
Lecture 1 (bce-7)
 

Recently uploaded

handbook on reinforce concrete and detailing
handbook on reinforce concrete and detailinghandbook on reinforce concrete and detailing
handbook on reinforce concrete and detailingAshishSingh1301
 
Basics of Relay for Engineering Students
Basics of Relay for Engineering StudentsBasics of Relay for Engineering Students
Basics of Relay for Engineering Studentskannan348865
 
Maximizing Incident Investigation Efficacy in Oil & Gas: Techniques and Tools
Maximizing Incident Investigation Efficacy in Oil & Gas: Techniques and ToolsMaximizing Incident Investigation Efficacy in Oil & Gas: Techniques and Tools
Maximizing Incident Investigation Efficacy in Oil & Gas: Techniques and Toolssoginsider
 
engineering chemistry power point presentation
engineering chemistry  power point presentationengineering chemistry  power point presentation
engineering chemistry power point presentationsj9399037128
 
Adsorption (mass transfer operations 2) ppt
Adsorption (mass transfer operations 2) pptAdsorption (mass transfer operations 2) ppt
Adsorption (mass transfer operations 2) pptjigup7320
 
21scheme vtu syllabus of visveraya technological university
21scheme vtu syllabus of visveraya technological university21scheme vtu syllabus of visveraya technological university
21scheme vtu syllabus of visveraya technological universityMohd Saifudeen
 
analog-vs-digital-communication (concept of analog and digital).pptx
analog-vs-digital-communication (concept of analog and digital).pptxanalog-vs-digital-communication (concept of analog and digital).pptx
analog-vs-digital-communication (concept of analog and digital).pptxKarpagam Institute of Teechnology
 
Diploma Engineering Drawing Qp-2024 Ece .pdf
Diploma Engineering Drawing Qp-2024 Ece .pdfDiploma Engineering Drawing Qp-2024 Ece .pdf
Diploma Engineering Drawing Qp-2024 Ece .pdfJNTUA
 
Artificial intelligence presentation2-171219131633.pdf
Artificial intelligence presentation2-171219131633.pdfArtificial intelligence presentation2-171219131633.pdf
Artificial intelligence presentation2-171219131633.pdfKira Dess
 
Theory of Time 2024 (Universal Theory for Everything)
Theory of Time 2024 (Universal Theory for Everything)Theory of Time 2024 (Universal Theory for Everything)
Theory of Time 2024 (Universal Theory for Everything)Ramkumar k
 
Insurance management system project report.pdf
Insurance management system project report.pdfInsurance management system project report.pdf
Insurance management system project report.pdfKamal Acharya
 
CLOUD COMPUTING SERVICES - Cloud Reference Modal
CLOUD COMPUTING SERVICES - Cloud Reference ModalCLOUD COMPUTING SERVICES - Cloud Reference Modal
CLOUD COMPUTING SERVICES - Cloud Reference ModalSwarnaSLcse
 
Intro to Design (for Engineers) at Sydney Uni
Intro to Design (for Engineers) at Sydney UniIntro to Design (for Engineers) at Sydney Uni
Intro to Design (for Engineers) at Sydney UniR. Sosa
 
Involute of a circle,Square, pentagon,HexagonInvolute_Engineering Drawing.pdf
Involute of a circle,Square, pentagon,HexagonInvolute_Engineering Drawing.pdfInvolute of a circle,Square, pentagon,HexagonInvolute_Engineering Drawing.pdf
Involute of a circle,Square, pentagon,HexagonInvolute_Engineering Drawing.pdfJNTUA
 
UNIT 4 PTRP final Convergence in probability.pptx
UNIT 4 PTRP final Convergence in probability.pptxUNIT 4 PTRP final Convergence in probability.pptx
UNIT 4 PTRP final Convergence in probability.pptxkalpana413121
 
Independent Solar-Powered Electric Vehicle Charging Station
Independent Solar-Powered Electric Vehicle Charging StationIndependent Solar-Powered Electric Vehicle Charging Station
Independent Solar-Powered Electric Vehicle Charging Stationsiddharthteach18
 
Tembisa Central Terminating Pills +27838792658 PHOMOLONG Top Abortion Pills F...
Tembisa Central Terminating Pills +27838792658 PHOMOLONG Top Abortion Pills F...Tembisa Central Terminating Pills +27838792658 PHOMOLONG Top Abortion Pills F...
Tembisa Central Terminating Pills +27838792658 PHOMOLONG Top Abortion Pills F...drjose256
 
Developing a smart system for infant incubators using the internet of things ...
Developing a smart system for infant incubators using the internet of things ...Developing a smart system for infant incubators using the internet of things ...
Developing a smart system for infant incubators using the internet of things ...IJECEIAES
 
History of Indian Railways - the story of Growth & Modernization
History of Indian Railways - the story of Growth & ModernizationHistory of Indian Railways - the story of Growth & Modernization
History of Indian Railways - the story of Growth & ModernizationEmaan Sharma
 
Autodesk Construction Cloud (Autodesk Build).pptx
Autodesk Construction Cloud (Autodesk Build).pptxAutodesk Construction Cloud (Autodesk Build).pptx
Autodesk Construction Cloud (Autodesk Build).pptxMustafa Ahmed
 

Recently uploaded (20)

handbook on reinforce concrete and detailing
handbook on reinforce concrete and detailinghandbook on reinforce concrete and detailing
handbook on reinforce concrete and detailing
 
Basics of Relay for Engineering Students
Basics of Relay for Engineering StudentsBasics of Relay for Engineering Students
Basics of Relay for Engineering Students
 
Maximizing Incident Investigation Efficacy in Oil & Gas: Techniques and Tools
Maximizing Incident Investigation Efficacy in Oil & Gas: Techniques and ToolsMaximizing Incident Investigation Efficacy in Oil & Gas: Techniques and Tools
Maximizing Incident Investigation Efficacy in Oil & Gas: Techniques and Tools
 
engineering chemistry power point presentation
engineering chemistry  power point presentationengineering chemistry  power point presentation
engineering chemistry power point presentation
 
Adsorption (mass transfer operations 2) ppt
Adsorption (mass transfer operations 2) pptAdsorption (mass transfer operations 2) ppt
Adsorption (mass transfer operations 2) ppt
 
21scheme vtu syllabus of visveraya technological university
21scheme vtu syllabus of visveraya technological university21scheme vtu syllabus of visveraya technological university
21scheme vtu syllabus of visveraya technological university
 
analog-vs-digital-communication (concept of analog and digital).pptx
analog-vs-digital-communication (concept of analog and digital).pptxanalog-vs-digital-communication (concept of analog and digital).pptx
analog-vs-digital-communication (concept of analog and digital).pptx
 
Diploma Engineering Drawing Qp-2024 Ece .pdf
Diploma Engineering Drawing Qp-2024 Ece .pdfDiploma Engineering Drawing Qp-2024 Ece .pdf
Diploma Engineering Drawing Qp-2024 Ece .pdf
 
Artificial intelligence presentation2-171219131633.pdf
Artificial intelligence presentation2-171219131633.pdfArtificial intelligence presentation2-171219131633.pdf
Artificial intelligence presentation2-171219131633.pdf
 
Theory of Time 2024 (Universal Theory for Everything)
Theory of Time 2024 (Universal Theory for Everything)Theory of Time 2024 (Universal Theory for Everything)
Theory of Time 2024 (Universal Theory for Everything)
 
Insurance management system project report.pdf
Insurance management system project report.pdfInsurance management system project report.pdf
Insurance management system project report.pdf
 
CLOUD COMPUTING SERVICES - Cloud Reference Modal
CLOUD COMPUTING SERVICES - Cloud Reference ModalCLOUD COMPUTING SERVICES - Cloud Reference Modal
CLOUD COMPUTING SERVICES - Cloud Reference Modal
 
Intro to Design (for Engineers) at Sydney Uni
Intro to Design (for Engineers) at Sydney UniIntro to Design (for Engineers) at Sydney Uni
Intro to Design (for Engineers) at Sydney Uni
 
Involute of a circle,Square, pentagon,HexagonInvolute_Engineering Drawing.pdf
Involute of a circle,Square, pentagon,HexagonInvolute_Engineering Drawing.pdfInvolute of a circle,Square, pentagon,HexagonInvolute_Engineering Drawing.pdf
Involute of a circle,Square, pentagon,HexagonInvolute_Engineering Drawing.pdf
 
UNIT 4 PTRP final Convergence in probability.pptx
UNIT 4 PTRP final Convergence in probability.pptxUNIT 4 PTRP final Convergence in probability.pptx
UNIT 4 PTRP final Convergence in probability.pptx
 
Independent Solar-Powered Electric Vehicle Charging Station
Independent Solar-Powered Electric Vehicle Charging StationIndependent Solar-Powered Electric Vehicle Charging Station
Independent Solar-Powered Electric Vehicle Charging Station
 
Tembisa Central Terminating Pills +27838792658 PHOMOLONG Top Abortion Pills F...
Tembisa Central Terminating Pills +27838792658 PHOMOLONG Top Abortion Pills F...Tembisa Central Terminating Pills +27838792658 PHOMOLONG Top Abortion Pills F...
Tembisa Central Terminating Pills +27838792658 PHOMOLONG Top Abortion Pills F...
 
Developing a smart system for infant incubators using the internet of things ...
Developing a smart system for infant incubators using the internet of things ...Developing a smart system for infant incubators using the internet of things ...
Developing a smart system for infant incubators using the internet of things ...
 
History of Indian Railways - the story of Growth & Modernization
History of Indian Railways - the story of Growth & ModernizationHistory of Indian Railways - the story of Growth & Modernization
History of Indian Railways - the story of Growth & Modernization
 
Autodesk Construction Cloud (Autodesk Build).pptx
Autodesk Construction Cloud (Autodesk Build).pptxAutodesk Construction Cloud (Autodesk Build).pptx
Autodesk Construction Cloud (Autodesk Build).pptx
 

02-map-reduce.pdf

  • 1. Distributed Systems 1, Columbia Course 4113, Instructor: Roxana Geambasu Distributed Systems 1 CUCS Course 4113 https://systems.cs.columbia.edu/ds1-class/ Instructor: Roxana Geambasu 1
  • 2. Distributed Systems 1, Columbia Course 4113, Instructor: Roxana Geambasu MapReduce 2
  • 3. Distributed Systems 1, Columbia Course 4113, Instructor: Roxana Geambasu Outline • Today: MapReduce – Analytics programming interface – Word count example – Chaining – Reading/write from/to HDFS – Dealing with failures • In future: Spark – Motivation – Resilient Distributed Datasets (RDDs) – Programming interface – Transformations and actions 3
  • 4. Distributed Systems 1, Columbia Course 4113, Instructor: Roxana Geambasu Large-scale Analytics • Compute the frequency of words in a corpus of documents. • Count how many times users have clicked on each of a (large) set of ads. • PageRank: Compute the “importance” of a web page based on the “importances” of the pages that link to it. • …. 4
  • 5. Distributed Systems 1, Columbia Course 4113, Instructor: Roxana Geambasu Option 1: SQL 5 • Before MapReduce, analytics mostly done in SQL, or manually. • Example: Count word appearances in a corpus of documents. • With SQL, the rough query might be: • Very expressive, convenient to program, but no one knew how to scale SQL execution! SELECT COUNT(*) FROM ( SELECT UNNEST(string_to_array(doc_content, ‘ ’)) as word FROM Corpus ) GROUP BY word
  • 6. Distributed Systems 1, Columbia Course 4113, Instructor: Roxana Geambasu Option 2: Manual 6 • Example: Count word appearances in a corpus of documents. Docs net- work M1 M2 M5 M4 M3 Mn
  • 7. Distributed Systems 1, Columbia Course 4113, Instructor: Roxana Geambasu Option 2: Manual 7 • Example: Count word appearances in a corpus of documents. • Phase 1: Assign documents to different machines/nodes. – Each computes a dictionary: {word: local_freq}. net- work M1 M2 M5 M4 M3 Mn Docs
  • 8. Distributed Systems 1, Columbia Course 4113, Instructor: Roxana Geambasu Option 2: Manual 8 • Example: Count word appearances in a corpus of documents. • Phase 1: Assign documents to different machines/nodes. – Each computes a dictionary: {word: local_freq}. • Phase 2: Nodes exchange dictionaries (how?) to aggregate local_freq’s. Docs net- work M1 M2 M5 M4 M3 Mn
  • 9. Distributed Systems 1, Columbia Course 4113, Instructor: Roxana Geambasu Option 2: Manual 9 • Example: Count word appearances in a corpus of documents. • Phase 1: Assign documents to different machines/nodes. – Each computes a dictionary: {word: local_freq}. • Phase 2: Nodes exchange dictionaries (how?) to aggregate local_freq’s. – But how to make this scale?? Docs net- work M1 M2 M5 M4 M3 Mn
  • 10. Distributed Systems 1, Columbia Course 4113, Instructor: Roxana Geambasu Option 2: Manual (cont’ed) 10 • Phase 2, Option a: Send all {word: local_freq} dictionaries to one node, who aggregates. – But what if it’s too much data for one node? • Phase 2, Option b: Each node sends (word, local_freq) to a designated node, e.g., node with ID hash(word) % N. net- work M1 M2 M5 M4 M3 Mn Docs
  • 11. Distributed Systems 1, Columbia Course 4113, Instructor: Roxana Geambasu Option 2: Manual (cont’ed) 11 • Phase 2, Option a: Send all {word: local_freq} dictionaries to one node, who aggregates. – But what if it’s too much data for one node? • Phase 2, Option b: Each node sends (word, local_freq) to a designated node, e.g., node with ID hash(word) % N. net- work M1 M2 M5 M4 M3 Mn Docs We’ve roughly discovered an app-specific version of MapReduce!
  • 12. Distributed Systems 1, Columbia Course 4113, Instructor: Roxana Geambasu Option 2: Challenges 12 • How to generalize to other applications? • How to deal with failures? • How to deal with slow nodes? • How to deal with load balancing (some docs are very large, others small)? • How to deal with skew (some words are very frequent, so nodes designated to aggregate them will be pounded)? • … net- work M1 M2 M5 M4 M3 Mn Docs
  • 13. Distributed Systems 1, Columbia Course 4113, Instructor: Roxana Geambasu MapReduce 13 • Parallelizable programming model: – Applies to a broad class of analytics applications. – Isn’t as expressive as SQL but it is easier to scale. – Consists of two phases, each intrinsically parallelizable: • Map: processes input elements independently to emit relevant (key, value) pairs from each. • Transparently, the system groups all the values for each key together: (key, [list of values]). • Reduce: aggregates all the values for each key to emit a global value for each key. • Scalable, efficient, fault tolerant runtime system. – Addresses preceding challenges (and more). • Technical paper: please read it!
  • 14. Distributed Systems 1, Columbia Course 4113, Instructor: Roxana Geambasu How it works 14 • Input: a collection of elements of (key, value) pair type. • Programmer defines two functions: – Map(key, value) → 0, 1, or more (key’, value’) pairs – Reduce(key, value-list) → output • Execution: – Apply Map to each input key-value pair, in parallel for different keys. – Sort emitted (key’, value’) pairs to produce (key, values-list) pairs. – Apply Reduce to each (key, value-list) pair, in parallel for different keys. • Output is the union of all Reduce invocations’ outputs.
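This execution model can be sketched in a few lines of single-process Python (illustrative only: a real MapReduce runs the phases across many machines, and these function names are my own, not any real API):

```python
# Minimal in-memory sketch of the map -> sort/group -> reduce pipeline.
from itertools import groupby

def run_mapreduce(inputs, map_fn, reduce_fn):
    # Map: apply map_fn to every input (key, value) pair.
    intermediate = []
    for k, v in inputs:
        intermediate.extend(map_fn(k, v))
    # "Shuffle": sort emitted pairs by key, then group values per key.
    intermediate.sort(key=lambda kv: kv[0])
    grouped = [(k, [v for _, v in g])
               for k, g in groupby(intermediate, key=lambda kv: kv[0])]
    # Reduce: aggregate each key's value list.
    return [reduce_fn(k, vs) for k, vs in grouped]

# Word count expressed in this model:
def wc_map(doc_id, content):
    return [(w, 1) for w in content.split()]

def wc_reduce(word, counts):
    return (word, sum(counts))

out = run_mapreduce([("D1", "the store the")], wc_map, wc_reduce)
assert dict(out) == {"the": 2, "store": 1}
```

The key property: every Map call and every Reduce call is independent of its siblings, which is what lets the real system run them on different machines.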
  • 15. Distributed Systems 1, Columbia Course 4113, Instructor: Roxana Geambasu Workflow 15 Reducer Reducer Mapper Mapper Mapper read local write system does remote read, sort Output File 0 Output File 1 write Split 0 Split 1 Split 2 Input data (in a DFS) Output data (in a DFS) Map phase: extract something you care about from each record Reduce phase: aggregate Workers Workers Tmp data (local FS)
  • 16. Distributed Systems 1, Columbia Course 4113, Instructor: Roxana Geambasu • We have a directory, which contains many documents. • The documents contain words separated by whitespace and punctuation. • Goal: Count the number of times each distinct word appears in the file. Example: Word count
• 17. Distributed Systems 1, Columbia Course 4113, Instructor: Roxana Geambasu Word count with MapReduce Map(key, value): // key: document ID; value: document content FOR (each word w IN value) emit(w, 1); Reduce(key, value-list): // key: a word; value-list: a list of integers result = 0; FOR (each integer v IN value-list) result += v; emit(key, result);
• 18. Distributed Systems 1, Columbia Course 4113, Instructor: Roxana Geambasu Word count with MapReduce Map(key, value): // key: document ID; value: document content FOR (each word w IN value) emit(w, 1); Reduce(key, value-list): // key: a word; value-list: a list of integers result = 0; FOR (each integer v IN value-list) result += v; emit(key, result); Expect the values to be all 1’s, but “combiners” allow local summing of integers with the same key before passing them to reducers.
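A combiner can be sketched as local pre-summing inside each mapper (my illustration; here `Counter` plays the role of the per-mapper aggregation):

```python
# Sketch of a combiner: instead of shipping one (word, 1) pair per
# occurrence, each mapper ships one (word, partial_count) pair.
from collections import Counter

def map_with_combiner(doc_id, content):
    # A plain map would emit (w, 1) per word; the combiner pre-sums them
    # locally, shrinking the data sent over the network to reducers.
    return list(Counter(content.split()).items())

pairs = map_with_combiner("D1", "the store the store the")
assert sorted(pairs) == [("store", 2), ("the", 3)]
```

This works because addition is associative and commutative, so summing partial counts at the reducer gives the same result as summing raw 1's.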
• 19. Distributed Systems 1, Columbia Course 4113, Instructor: Roxana Geambasu Map Phase • Mapper is given key: document ID; value: document content, say: (D1,“The teacher went to the store. The store was closed; the store opens in the morning. The store opens at 9am.”) • It will emit the following pairs: <The, 1> <teacher, 1> <went, 1> <to, 1> <the, 1> <store, 1> <the, 1> <store, 1> <was, 1> <closed, 1> <the, 1> <store, 1> <opens, 1> <in, 1> <the, 1> <morning, 1> <the, 1> <store, 1> <opens, 1> <at, 1> <9am, 1> • Normally, there will be many documents, hence many Mappers that emit such pairs in parallel, but for simplicity, let’s say that these are all the pairs emitted from the Map phase. 19
• 20. Distributed Systems 1, Columbia Course 4113, Instructor: Roxana Geambasu Intermediary Phase • System sorts emitted (key, value) pairs by key: 20 <9am, 1> <at, 1> <closed, 1> <in, 1> <morning, 1> <opens, 1> <opens, 1> <store, 1> <store, 1> <store, 1> <store, 1> <teacher, 1> <the, 1> <the, 1> <the, 1> <the, 1> <the, 1> <to, 1> <was, 1> <went, 1> <The, 1>
• 21. Distributed Systems 1, Columbia Course 4113, Instructor: Roxana Geambasu Intermediary Phase • System sorts emitted (key, value) pairs by key: 21 <9am, 1> <at, 1> <closed, 1> <in, 1> <morning, 1> <opens, 1> <opens, 1> <store, 1> <store, 1> <store, 1> <store, 1> <teacher, 1> <the, 1> <the, 1> <the, 1> <the, 1> <the, 1> <to, 1> <was, 1> <went, 1> <The, 1> Reducer 2 Reducer 1
• 22. Distributed Systems 1, Columbia Course 4113, Instructor: Roxana Geambasu • For each unique key emitted from the Map Phase, function Reduce(key, value-list) is invoked on Reducer 1 or Reducer 2. • Across their invocations, these Reducers will emit: 22 Reduce Phase <9am, 1> <at, 1> <closed, 1> <in, 1> <morning, 1> <opens, 2> <store, 4> <teacher, 1> <the, 5> <to, 1> <was, 1> <went, 1> <The, 1> Reducer 1 Reducer 2 Output of the entire program
• 23. Distributed Systems 1, Columbia Course 4113, Instructor: Roxana Geambasu Chaining MapReduce • The programming model seems pretty restrictive. • But quite a few analytics applications can be written with it, especially with a technique called chaining. • Since reducers emit (key, value) pairs, their output can be passed on to other Map/Reduce processes. • This chaining supports a variety of analytics (though certainly not all types, e.g., no ML because there are no loops). 23
  • 24. Distributed Systems 1, Columbia Course 4113, Instructor: Roxana Geambasu Example: Word Frequency • Suppose instead of word count, we wanted to compute word frequency: the probability that a word would appear in a document. • This means computing the fraction of times a word appears, out of the total number of words in the corpus. 24
• 25. Distributed Systems 1, Columbia Course 4113, Instructor: Roxana Geambasu Solution: Chain two MapReduce’s • First MapReduce: word count (as before) – Map: process documents and output <word, 1> pairs. – Multiple Reducers: emit <word, word_count> for each word. • Second MapReduce: – Map: processes <word, word_count> and outputs (1, (word, word_count)). – A single Reducer does two passes: • In the first pass, sums all word_count’s to compute overall_count. • In the second pass, computes the fractions and emits <word, word_count/overall_count> pairs. 25
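A single-process sketch of this two-job chain (the function names and in-memory plumbing are mine; in a real deployment each job would run as a separate MapReduce over the cluster):

```python
# Job 1 computes (word, word_count); job 2 rekeys everything under a
# single key so one reducer can see the whole table and compute fractions.
from collections import Counter

def job1_word_count(docs):
    counts = Counter()
    for content in docs:          # "map": tokenize each document
        counts.update(content.split())
    return counts                 # "reduce" output: (word, word_count) pairs

def job2_word_frequency(word_counts):
    # First pass: sum all word_count's to get overall_count.
    overall = sum(word_counts.values())
    # Second pass: emit (word, word_count / overall_count).
    return {w: c / overall for w, c in word_counts.items()}

freqs = job2_word_frequency(job1_word_count(["the store the", "the"]))
assert abs(freqs["the"] - 3 / 4) < 1e-9
```

Note the trick in job 2: keying every pair by the constant 1 forces all data through one reducer, which is exactly why this step does not scale as well as job 1.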
  • 26. Distributed Systems 1, Columbia Course 4113, Instructor: Roxana Geambasu Option 2: Challenges (recall) 26 • How to generalize to other applications? – See original MapReduce paper for more examples. • How to deal with failures? • How to deal with slow nodes? • How to deal with load balancing? • How to deal with skew? • The MapReduce runtime tries to hide these challenges as much as possible. We’ll talk about a few of the performance and fault tolerance challenges here.
  • 27. Distributed Systems 1, Columbia Course 4113, Instructor: Roxana Geambasu Architecture 27 Reducer Reducer Mapper Mapper Mapper read local write system does remote read, sort Output File 0 Output File 1 write Split 0 Split 1 Split 2 Input data (in a DFS) Output data (in a DFS) Tmp data (local FS) Master assign map task assign reduce task DFS Data Nodes DFS Data Nodes DFS Name Server
• 28. Distributed Systems 1, Columbia Course 4113, Instructor: Roxana Geambasu Data locality for performance • Master scheduling policy: – Asks the DFS for the locations of the replicas of the input file’s blocks. – Schedules each Map task so that a replica of its DFS input block is on the same machine, or at least on the same rack. • Effect: Thousands of machines read input at local disk speed. – No need to transfer input data across the cluster over the network: this eliminates the network bottleneck! 28
• 29. Distributed Systems 1, Columbia Course 4113, Instructor: Roxana Geambasu Heartbeats, replication for fault tolerance • Failures are the norm in data centers. • Worker failure: – Master detects failed workers by periodically pinging them (a “heartbeat”). – Re-executes their in-progress map/reduce tasks. • Master failure: – Initially a single point of failure: recovery meant resuming from an execution log. Subsequent versions used replication and consensus. • Effect: Google’s paper reports a run that lost 1,600 of 1,800 machines, yet finished fine. 29
• 30. Distributed Systems 1, Columbia Course 4113, Instructor: Roxana Geambasu Redundant execution for performance • Slow workers (“stragglers”) significantly lengthen completion time: the slowest worker can determine the total latency! – This is why many systems measure 99th-percentile latency. – Causes: other jobs consuming resources on the machine, or bad disks with errors that transfer data very slowly. • Solution: spawn backup copies of tasks. – Whichever copy finishes first “wins.” – I.e., treat slow executions as failed executions! 30
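Backup-task execution can be mimicked on one machine with a thread pool (purely illustrative; the sleep delays stand in for a straggler, and the task body is a placeholder):

```python
# Sketch of backup ("speculative") execution: launch two copies of the
# same task and take whichever finishes first.
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

def task(delay):
    time.sleep(delay)  # stand-in for real work; the slow copy straggles
    return "result"

with ThreadPoolExecutor(max_workers=2) as pool:
    copies = [pool.submit(task, 0.01), pool.submit(task, 0.5)]
    winner = next(as_completed(copies))  # first finisher "wins"
    assert winner.result() == "result"
```

This trade is deliberate: some CPU is wasted re-running work, but tail latency no longer depends on the slowest machine. It is safe only because map and reduce tasks are deterministic and side-effect-free, so duplicate executions produce identical output.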
• 31. Distributed Systems 1, Columbia Course 4113, Instructor: Roxana Geambasu Take-Aways • MapReduce is a programming model and runtime for scalable, fault-tolerant, and efficient analytics. – Well, some types of analytics; it’s not very expressive. • It hides common at-scale challenges under convenient abstractions. – Programmers need only implement Map and Reduce, and the system takes care of running them at scale on lots of data. – Failures raise semantic and performance challenges, which MapReduce typically handles through redundancy. • There exist open-source implementations, including Hadoop and Flink. • Spark, a popular data analytics framework, supports MapReduce as well as a wider range of programming models. 31
  • 32. Distributed Systems 1, Columbia Course 4113, Instructor: Roxana Geambasu Acknowledgement • Slides prepared based on Asaf Cidon’s 2020 DS-1 invited lecture. 32