Cloud Computing
MapReduce in Heterogeneous
Environments
Eva Kalyvianaki
ek264@cam.ac.uk
Contents
 Looking at MapReduce performance in heterogeneous
clusters
 Material is from the paper:
“Improving MapReduce Performance in Heterogeneous Environments”,
by Matei Zaharia, Andy Konwinski, Anthony D. Joseph, Randy Katz, and
Ion Stoica, published at the USENIX OSDI conference, 2008
 and their presentation at OSDI
2
Motivation: MapReduce is becoming popular
 Open-source implementation, Hadoop, used by Yahoo!,
Facebook, Last.fm, …
 Scale: 20 PB/day at Google, O(10,000) nodes at Yahoo, 3000
jobs/day at Facebook
3
Stragglers in MapReduce
 A straggler is a node that performs poorly or does not perform
at all.
 Original MapReduce mitigation approach was:
 To run a speculative copy (called a backup task)
 Whichever finishes first, the original or the copy, is used
 Without speculative execution, a job would be as slow as its
slowest sub-task
 Google notes that speculative execution can improve job
response times by 44%
 Is this approach good enough for modern clusters?
4
Modern Clusters: Heterogeneity is the norm
 Cloud computing providers like Amazon’s Elastic Compute
Cloud (EC2) provide cheap on-demand computing:
 Price: 2 cents / VM / hour
 Scale: thousands of VMs
 Caveat: less control of performance
 Main challenge for Hadoop on EC2 is performance
heterogeneity, which breaks task scheduler assumptions
 This lecture/paper is on a new LATE scheduler that can cut
response time in half
5
MapReduce Revised
6
MapReduce Implementation, Hadoop
7
Scheduling in MapReduce
 When a node has an empty slot, Hadoop chooses a task for it from
one of three categories, in the following priority order:
1. Failed tasks, which are given the highest priority
2. Unscheduled tasks. For maps, tasks with data local to the node are
chosen first.
3. Speculative tasks, considered only when nothing else is waiting.
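
The priority order above can be sketched in a few lines of Python. This is only an illustration of the three-step choice, not Hadoop's actual code: the `Task` class, the `pick_task` function, and the three input lists are hypothetical names introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    task_id: str
    kind: str                                      # "map" or "reduce"
    local_nodes: set = field(default_factory=set)  # nodes holding the input split


def pick_task(node, failed, unscheduled, speculative_candidates):
    """Return the next task for a free slot on `node`, or None."""
    # 1. Failed tasks get the highest priority.
    if failed:
        return failed[0]
    # 2. Unscheduled tasks; prefer maps whose input data is local to the node.
    if unscheduled:
        for task in unscheduled:
            if task.kind == "map" and node in task.local_nodes:
                return task
        return unscheduled[0]
    # 3. Otherwise, look for a task to run speculatively.
    if speculative_candidates:
        return speculative_candidates[0]
    return None
```

How the speculative candidates are chosen is the subject of the following slides.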
8
Deciding on Speculative Tasks
 Which task to execute speculatively?
 Hadoop monitors task progress using a progress score: a
number between 0 and 1
 For mappers: the score is the fraction of input data read
 For reducers: the execution is divided into three equal phases,
1/3 of the score each:
 Copy phase: fraction of maps whose output has been copied
 Sort phase: map outputs are sorted by key: percent of data merged
 Reduce phase: percent of data passed through the reduce function
 Example: a task halfway through the copy phase has
progress score = 1/2*1/3 = 1/6.
 Example: a task halfway through the reduce phase has
progress score = 1/3 + 1/3 + 1/2 * 1/3 = 5/6
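
The two worked examples can be reproduced with a small sketch. The function names and the `PHASES` tuple are assumptions made for illustration; only the scoring rule (fraction of input read for maps, three equal phases for reduces) comes from the slide.

```python
PHASES = ("copy", "sort", "reduce")  # each phase is worth 1/3 of the score


def map_progress(bytes_read, total_bytes):
    """Progress score of a map task: fraction of input data read."""
    return bytes_read / total_bytes


def reduce_progress(phase, fraction_done):
    """Progress score of a reduce task that is `fraction_done` through `phase`."""
    completed_phases = PHASES.index(phase)
    return (completed_phases + fraction_done) / len(PHASES)


halfway_copy = reduce_progress("copy", 0.5)      # 1/2 * 1/3 = 1/6
halfway_reduce = reduce_progress("reduce", 0.5)  # 1/3 + 1/3 + 1/2 * 1/3 = 5/6
```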
9
Deciding on Speculative Tasks (cont’d)
 Hadoop looks at the average progress of each category of
tasks (maps and reduces) and defines a threshold:
 When a task’s progress is less than the average for its
category minus 0.2, and the task has run at least one
minute, it is marked as a straggler:
threshold = avgProgress – 0.2
 All tasks with progress score < threshold are stragglers
 Ties are broken by data locality
 This approach works reasonably well in homogeneous clusters
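
A minimal sketch of this threshold rule, assuming each task is described by a hypothetical `(task_id, progress_score, start_time)` tuple with times in seconds. The function name and its parameters are illustrative, and tie-breaking by data locality is left out.

```python
def find_stragglers(tasks, now, min_runtime=60.0, gap=0.2):
    """Return ids of tasks in one category (maps or reduces) marked as stragglers.

    tasks: list of (task_id, progress_score, start_time) tuples.
    """
    if not tasks:
        return []
    avg_progress = sum(score for _, score, _ in tasks) / len(tasks)
    threshold = avg_progress - gap  # threshold = avgProgress - 0.2
    return [
        task_id
        for task_id, score, start in tasks
        if score < threshold and now - start >= min_runtime
    ]


example = [("a", 0.9, 0), ("b", 0.8, 0), ("c", 0.1, 0), ("d", 0.1, 100)]
stragglers = find_stragglers(example, now=120)
```

In the example, the average progress is 0.475, so the threshold is 0.275. Tasks `c` and `d` both fall below it, but `d` has run for only 20 seconds, so only `c` is marked.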
10
Scheduler’s Assumptions
1. Nodes can perform work at roughly the same rate
2. Tasks progress at a constant rate throughout time
3. There is no cost to starting a speculative task
4. A task’s progress is roughly equal to the fraction of its total
work
5. Tasks tend to finish in waves, so a task with a low progress
score is likely a slow task
6. Different tasks of the same category (maps or reduces) require
roughly the same amount of work
11
Revising Scheduler’s Assumptions
1. Nodes can perform work at roughly the same rate
2. Tasks progress at a constant rate throughout time
 (1) In heterogeneous clusters some nodes are slower (older)
than others
 (2) Virtualized clusters “suffer” from co-location interference
12
Heterogeneity in Virtualized Environments
 VM technology isolates CPU and memory, but disk and
network are shared
 Full bandwidth when no contention
 Equal shares when there is contention
 2.5x performance difference
[Figure: I/O performance per VM (MB/s) versus number of VMs on the physical host (1 to 7)]
13
Revising Scheduler’s Assumptions
3. There is no cost to starting a speculative task
4. A task’s progress is roughly equal to the fraction of its total
work
5. Tasks tend to finish in waves, so a task with a low progress
score is likely a slow task
 (3) Too many speculative tasks can take away resources
from other running tasks
 (4) The copy phase of reducers is the slowest part, because
it involves all-pairs communication. Yet this phase counts
for only 1/3 of the reduce progress score.
 (5) Tasks from different generations execute
concurrently, so newer, faster tasks are averaged together with older,
slow tasks, and avgProgress changes a lot.
14
Idea: Progress Rates
 Instead of using progress score values, compute progress
rates, and back up tasks that are “far enough” below the
mean
 Problem: can still select the wrong tasks
15
Progress Rate Example
[Figure: task timeline (min) on three nodes, marked at 1 min and 2 min. Node 1: 1 task/min; Node 2: 3x slower; Node 3: 1.9x slower]
16
Progress Rate Example
What if the job had 5 tasks?
[Figure: task timeline (min) on Nodes 1 to 3 at the 2 min mark. Node 2's task has 1 min left; Node 3's task has 1.8 min left]
Node 2 is slowest, but should back up Node 3’s
task!
17
Our Scheduler: LATE
 Insight: back up the task with the largest estimated finish
time
 “Longest Approximate Time to End” → LATE
 Look forward instead of looking backward
 Sanity thresholds:
 Cap number of backup tasks
 Launch backups on fast nodes
 Only back up tasks that are sufficiently slow
18
LATE Details
 Estimating finish times:
progress rate = progress score / execution time
estimated time left = (1 – progress score) / progress rate
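
The two estimates translate directly into code. This is a sketch with illustrative function names; treating a task with no measurable progress as having infinite time left is an assumption made here to avoid dividing by zero.

```python
def progress_rate(progress_score, execution_time):
    """progress rate = progress score / execution time"""
    return progress_score / execution_time


def estimated_time_left(progress_score, execution_time):
    """estimated time left = (1 - progress score) / progress rate"""
    if execution_time <= 0 or progress_score <= 0:
        return float("inf")  # assumption: no progress yet, so no finite estimate
    rate = progress_rate(progress_score, execution_time)
    return (1.0 - progress_score) / rate


# A task that is 25% done after 2 minutes needs about 6 more minutes.
remaining = estimated_time_left(0.25, 2.0)
```

The estimate assumes the task keeps progressing at its observed average rate, which is exactly the kind of assumption the earlier slides show can fail (for example, across reduce phases).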
19
LATE Scheduler
 If a task slot becomes available and there are fewer than
SpeculativeCap speculative tasks running, then:
1. Ignore the request if the node’s total progress is below
SlowNodeThreshold (=25th percentile)
2. Rank currently running, non-speculatively executed tasks by
estimated time left
3. Launch a copy of the highest-ranked task with progress rate below
SlowTaskThreshold (=25th percentile)
 Threshold values:
 10% cap on backups, 25th percentiles for slow node/task
 Validated by sensitivity analysis
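
The decision procedure can be sketched as follows. This is a simplified illustration under several assumptions, not the authors' implementation: the `RunningTask` class, the `has_backup` flag, the interpolated `percentile` helper, and expressing the 10% cap as a fraction of a `total_slots` argument are all choices made for this example.

```python
from dataclasses import dataclass


@dataclass
class RunningTask:
    task_id: str
    progress_score: float      # 0.0 .. 1.0
    execution_time: float      # time since the task started
    speculative: bool = False  # True if this task is itself a backup copy
    has_backup: bool = False   # assumption: at most one backup per task

    @property
    def rate(self) -> float:
        if self.execution_time <= 0:
            return 0.0
        return self.progress_score / self.execution_time

    @property
    def time_left(self) -> float:
        if self.rate == 0:
            return float("inf")
        return (1.0 - self.progress_score) / self.rate


def percentile(values, pct):
    """Linearly interpolated percentile (an illustrative choice)."""
    ordered = sorted(values)
    pos = pct / 100 * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])


def late_pick(node_progress, all_node_progress, tasks, total_slots,
              speculative_cap=0.10, slow_node_pct=25, slow_task_pct=25):
    """Return the task to back up on the requesting node, or None."""
    # Cap the number of backup tasks running at once.
    if sum(t.speculative for t in tasks) >= speculative_cap * total_slots:
        return None
    # 1. Do not launch backups on slow nodes.
    if node_progress < percentile(all_node_progress, slow_node_pct):
        return None
    # 2. Rank running, non-speculative tasks by estimated time left.
    candidates = [t for t in tasks if not t.speculative and not t.has_backup]
    if not candidates:
        return None
    slow_rate = percentile([t.rate for t in candidates], slow_task_pct)
    ranked = sorted(candidates, key=lambda t: t.time_left, reverse=True)
    # 3. Back up the highest-ranked task that is also sufficiently slow.
    for task in ranked:
        if task.rate < slow_rate:
            return task
    return None
```

With very few running tasks, a percentile-based threshold is coarse, so the result of a sketch like this depends on how the percentile is computed.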
20
LATE Example
[Figure: task timeline (min) on Nodes 1 to 3 at the 2 min mark]
Node 2: Progress = 66%
Estimated time left: (1-0.66) / (1/3) = 1
Node 3: Progress = 5.3%
Estimated time left: (1-0.05) / (1/1.9) = 1.8
LATE correctly picks Node 3
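
The arithmetic of this example can be checked with a short sketch. It uses the progress scores and rates from the slide; the dictionary layout and variable names are illustrative. It contrasts picking the task with the lowest progress rate against picking the one with the longest estimated time left.

```python
# node -> (progress score at t = 2 min, progress rate in tasks per minute)
running = {
    "Node 2": (0.66, 1 / 3),    # 3x slower node, task started at t = 0
    "Node 3": (0.05, 1 / 1.9),  # 1.9x slower node, task started at t = 1.9
}

# estimated time left = (1 - progress score) / progress rate
time_left = {
    node: (1 - score) / rate for node, (score, rate) in running.items()
}

lowest_rate_choice = min(running, key=lambda node: running[node][1])
late_choice = max(time_left, key=time_left.get)
```

Here `lowest_rate_choice` is Node 2, while `late_choice` is Node 3, whose task has about 1.8 minutes left compared with about 1 minute for Node 2's task.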
21
Evaluation
 Environments:
 EC2 (3 job types, 200-250 nodes)
 Small local testbed
 Self-contention through VM placement
 Stragglers through background processes
22
EC2 Sort without Stragglers (Sec 5.2.1)
 106 machines, 7-8 VMs per machine → total of 243 VMs
 128 MB data per host, 30 GB in total
 486 map tasks and 437 reduce tasks
 average 27% speedup over native, 31% over no backups
[Figure: normalized response time (worst, best, average) for No Backups, Hadoop Native, and LATE Scheduler]
23
EC2 Sort with Stragglers (Sec 5.2.2)
 8 VMs out of 100 in total are manually slowed down
 by running CPU- and disk-intensive background jobs
 average 58% speedup over native, 220% over no backups
 93% max speedup over native
[Figure: normalized response time (worst, best, average) for No Backups, Hadoop Native, and LATE Scheduler]
24
Conclusion
 Heterogeneity is a challenge for parallel apps, and is
growing more important
 Lessons:
 Back up tasks which hurt response time most
 2x improvement using simple algorithm
25
Summary
 MapReduce is a very powerful and expressive model
 Performance depends a lot on implementation details
 Material is from the paper:
“Improving MapReduce Performance in Heterogeneous Environments”,
by Matei Zaharia, Andy Konwinski, Anthony D. Joseph, Randy Katz, and
Ion Stoica, published at the USENIX OSDI conference, 2008
 and their presentation at OSDI
26
