Revisiting Size-Based Scheduling 
with Estimated Job Sizes 
Matteo Dell’Amico (EURECOM, France), 
Damiano Carra (Univ. Verona, Italy) 
Mario Pastorelli, Pietro Michiardi (EURECOM, France) 
MASCOTS 2014 
These slides at http://bit.ly/schedsim 
On Size-Based Scheduling
An Example: Processor-Sharing vs. Size-Based
[Figure: cluster usage (%) over time (s) for jobs 1, 2 and 3, in two panels. Processor sharing: time axis marked at 10, 15, 37.5, 42.5 and 50 s. Size-based: jobs run one at a time in the order job 1, job 2, job 3, job 1, with the time axis marked at 10, 20, 30 and 50 s.]
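The tick marks of this example are consistent with three jobs of sizes 30, 10 and 10 arriving at 0, 10 and 15 s. Those sizes and arrival times are inferred from the axis values, not stated on the slide, so treat them as an assumption. Under that assumption, a minimal event-driven sketch (the `simulate` function and `EXAMPLE_JOBS` are hypothetical names, unrelated to the authors' simulator) reproduces both timelines:

```python
def simulate(jobs, policy):
    """Event-driven simulation of one server of unit speed.

    jobs: list of (arrival_time, size) tuples, sorted by arrival time.
    policy: "PS" (equal shares) or "SRPT" (least remaining work first).
    Returns the sojourn time of each job, in the same order as jobs.
    """
    remaining = {}                      # job index -> remaining work
    sojourn = [None] * len(jobs)
    now, next_job = 0.0, 0
    while next_job < len(jobs) or remaining:
        # Service rate of each active job under the chosen policy.
        if not remaining:
            rates = {}
        elif policy == "PS":
            rates = {j: 1.0 / len(remaining) for j in remaining}
        else:  # SRPT: the whole server goes to the smallest remaining job
            rates = {min(remaining, key=remaining.get): 1.0}
        # Advance to the next completion or arrival, whichever comes first.
        dt = min((remaining[j] / r for j, r in rates.items()),
                 default=float("inf"))
        if next_job < len(jobs):
            dt = min(dt, jobs[next_job][0] - now)
        for j, r in rates.items():
            remaining[j] -= r * dt
        now += dt
        # Record completions (with a small tolerance for rounding).
        for j in [j for j, w in remaining.items() if w <= 1e-9]:
            sojourn[j] = now - jobs[j][0]
            del remaining[j]
        # Admit jobs that have arrived by now.
        while next_job < len(jobs) and jobs[next_job][0] <= now + 1e-9:
            remaining[next_job] = jobs[next_job][1]
            next_job += 1
    return sojourn


# Assumed example: (arrival time, size) for jobs 1, 2 and 3.
EXAMPLE_JOBS = [(0, 30), (10, 10), (15, 10)]
for policy in ("PS", "SRPT"):
    times = simulate(EXAMPLE_JOBS, policy)
    print(policy, "mean sojourn time:", round(sum(times) / len(times), 2))
```

With these assumed inputs, processor sharing completes the jobs at 37.5, 42.5 and 50 s (mean sojourn time about 35 s), while SRPT completes them at 20, 30 and 50 s (mean sojourn time about 25 s), matching the tick marks in both panels.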
Size-Based Scheduling: Some Properties
Shortest Remaining Processing Time (SRPT)
Minimizes mean sojourn time (MST) [Schrage, OPER RES ’68]
Sojourn time: interval between job submission and completion
Fair Sojourn Protocol (FSP)
Jobs are scheduled in the order in which they would complete under Processor Sharing (PS)
Efficiency: very close to SRPT
Fairness: each job completes no later than it would under Processor Sharing
[Friedman & Henderson, SIGMETRICS ’03]
Where Are All the Size-Based Schedulers?
Job size is almost never known a priori
Related Work: Inexact Job Size Information
Simulation-based study: estimates need to be precise
[Lu et al., MASCOTS 2004]
Analytic study: bounded errors, over-estimation only
[Wierman & Nuyens, SIGMETRICS PER, 2008]
What Motivated Us
We developed HFSP, a size-based scheduler for Hadoop
We found it works very well even with very rough estimates
[Pastorelli et al., BIGDATA 2013]
Understanding Size-Based Scheduling With Errors
Scheduling Simulation
Main Features
Simulates single-server, preemptive scheduling
Can create synthetic traces or replay real ones
Injects artificial size estimation errors
(when run on estimated sizes, SRPT and FSP are called SRPTE and FSPE)
Efficient, and easy to prototype new schedulers with
10,000 jobs are simulated in half a second on an old laptop
FSP is ~50 lines of Python code
Free Software: Apache License 2.0
https://bitbucket.org/bigfootproject/schedsim
Log-Normal Error Distribution
[Figure: PDF of the log-normal error distribution for sigma = 0.125, 0.25, 1 and 4, with x from 0 to 3.]
Error: ratio between real size and estimated size
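As a minimal sketch of how such errors could be injected, the helper below multiplies the real size by a log-normal factor. The function name `estimate_size` and the choice of mu = 0 are assumptions for illustration, not the simulator's actual API. With mu = 0 a log-normal variable and its reciprocal have the same distribution, so multiplying or dividing by the factor gives the same error distribution.

```python
import random


def estimate_size(real_size, sigma, rng=random):
    """Return an estimated size with a multiplicative log-normal error.

    sigma controls how rough the estimate is; with mu = 0 the estimate
    is too large or too small with equal probability.
    """
    return real_size * rng.lognormvariate(0.0, sigma)


# Hypothetical usage: estimates for a job of size 10 at two error levels.
rng = random.Random(0)  # seeded only to make the sketch repeatable
for sigma in (0.125, 4):
    print(sigma, [round(estimate_size(10.0, sigma, rng), 2) for _ in range(3)])
```

Small values of sigma keep the estimates close to the real size; sigma = 4 produces estimates that are commonly off by orders of magnitude.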
Weibull Job Size Distribution
[Figure: PDF of the Weibull job size distribution for shape = 0.125, 1, 2 and 4, with x from 0 to 3.]
Interpolates between
heavy-tailed job size distributions (shape<1)
exponential distributions (shape=1)
bell-shaped distributions (shape>1)
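A short sketch of drawing job sizes from a Weibull distribution with Python's standard library follows. The helper name `weibull_sizes` and the normalization to a fixed mean are assumptions made here for illustration. The scale is derived from the identity mean = scale * Gamma(1 + 1/shape), so that changing the shape does not change the average job size.

```python
import math
import random


def weibull_sizes(n, shape, mean=1.0, rng=random):
    """Draw n job sizes from a Weibull distribution with the given mean."""
    # Mean of a Weibull variable is scale * Gamma(1 + 1/shape).
    scale = mean / math.gamma(1.0 + 1.0 / shape)
    # random.weibullvariate takes (scale, shape), in that order.
    return [rng.weibullvariate(scale, shape) for _ in range(n)]


# Hypothetical usage: compare the largest job across shapes.
rng = random.Random(0)  # seeded only to make the sketch repeatable
for shape in (0.25, 1, 4):
    sizes = weibull_sizes(10_000, shape, rng=rng)
    print(shape, "largest job:", round(max(sizes), 1))
```

For shape < 1 a few very large jobs dominate the workload, which is the regime the later slides identify as problematic when sizes are estimated.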
Other Parameters
Number of jobs (default: 10,000 per workload)
at least 30 repetitions per data point
System load (default: 0.9)
Ratio between requested and available resources
Job inter-arrival time distribution (default: exponential)
Bursty vs. regular arrivals
These parameters are not fundamental
more details in the paper
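To make the load parameter concrete: for a single server of unit speed, load equals arrival rate times mean job size, so the arrival rate that yields a target load is load / mean size. The sketch below generates arrival times with exponential inter-arrival gaps under that relation. The function name `arrival_times` is hypothetical, and reading "exponential" as the inter-arrival distribution is an assumption.

```python
import random


def arrival_times(n, load, mean_size=1.0, rng=random):
    """Return n arrival times whose average rate produces the given load."""
    rate = load / mean_size          # load = rate * mean_size
    now, times = 0.0, []
    for _ in range(n):
        now += rng.expovariate(rate)  # exponential gap with mean 1 / rate
        times.append(now)
    return times


# Hypothetical usage: 10,000 jobs at the default load of 0.9.
rng = random.Random(0)  # seeded only to make the sketch repeatable
times = arrival_times(10_000, load=0.9, rng=rng)
print("mean gap between arrivals:", round(times[-1] / len(times), 3))
```

At load 0.9 with unit mean size, the gaps average about 1.11 time units, so the server is busy roughly 90% of the time in the long run.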
Size-Based Scheduling With Errors
[Figure: two panels, SRPTE and FSPE.]
Problems for heavy-tailed job size distributions
Otherwise, size-based scheduling works very well
Over-Estimations and Under-Estimations
[Figure: remaining size over time t, in four panels. Over-estimation: jobs J1, J2 and J3. Under-estimation: jobs J4, J5 and J6. Hats mark estimated sizes.]
Over-estimating hurts a single job: limited damage
Under-estimating very large jobs can wreak havoc
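The asymmetry can be seen in a small sketch of SRPT driven by estimates. The job sizes and estimates below are made up for illustration, and the function `srpte_sojourn_times` is a hypothetical name. It also assumes that a job's priority is its estimate minus the work already done, which keeps decreasing past zero, so an under-estimated job keeps top priority until it really finishes.

```python
def srpte_sojourn_times(jobs):
    """SRPT on estimated sizes, single server of unit speed.

    jobs: list of (arrival, real_size, estimated_size), sorted by arrival.
    Returns the sojourn time of each job.
    """
    real_left, est_left = {}, {}
    sojourn = [None] * len(jobs)
    now, nxt = 0.0, 0
    while nxt < len(jobs) or real_left:
        if not real_left:
            now = max(now, jobs[nxt][0])       # idle until the next arrival
        while nxt < len(jobs) and jobs[nxt][0] <= now:
            real_left[nxt], est_left[nxt] = jobs[nxt][1], jobs[nxt][2]
            nxt += 1
        # Serve the job with the smallest estimated remaining size.
        running = min(est_left, key=est_left.get)
        dt = real_left[running]
        if nxt < len(jobs):
            dt = min(dt, jobs[nxt][0] - now)
        real_left[running] -= dt
        est_left[running] -= dt                # may drop below zero
        now += dt
        if real_left[running] <= 1e-9:
            sojourn[running] = now - jobs[running][0]
            del real_left[running], est_left[running]
    return sojourn


# One large job (size 100) followed by two small ones (size 1).
exact = srpte_sojourn_times([(0, 100, 100), (2, 1, 1), (3, 1, 1)])
over = srpte_sojourn_times([(0, 100, 100), (2, 1, 500), (3, 1, 1)])
under = srpte_sojourn_times([(0, 100, 1), (2, 1, 1), (3, 1, 1)])
print("exact:", exact)
print("small job over-estimated:", over)
print("large job under-estimated:", under)
```

In this made-up workload, over-estimating one small job delays only that job, whereas under-estimating the large job makes both small jobs wait for nearly its entire duration.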
Size-Based Scheduling For Approximate Sizes
FSPE + PS
Idea
Without errors, every job completes in the real system before it completes in the virtual (PS) one
When it doesn’t (the job is late), there has been an estimation error
The scheduler can detect this and take corrective action
Realization
A scheduler such that late jobs don’t block the system
Just do processor sharing between late jobs instead of scheduling the “most late” one
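One possible reading of this idea is sketched below; it is not the authors' implementation. The function `fspe_ps_sojourn_times` and the example workloads are hypothetical. A virtual processor-sharing system runs on the estimated sizes. While no job is late, the real server runs the job closest to virtual completion (plain FSPE). As soon as some jobs have finished virtually but not really, the server is shared equally among those late jobs.

```python
EPS = 1e-9


def fspe_ps_sojourn_times(jobs):
    """jobs: list of (arrival, real_size, estimated_size), sorted by arrival."""
    real, virt, late = {}, {}, []   # real work left, virtual work left, late jobs
    sojourn = [None] * len(jobs)
    now, nxt = 0.0, 0
    while nxt < len(jobs) or real:
        if late:                     # corrective action: PS among late jobs
            rates = {j: 1.0 / len(late) for j in late}
        elif real:                   # FSPE: earliest virtual completion first
            rates = {min(real, key=virt.get): 1.0}
        else:
            rates = {}
        # Next event: real completion, virtual completion, or arrival.
        dt = min((real[j] / r for j, r in rates.items()), default=float("inf"))
        if virt:
            dt = min(dt, min(virt.values()) * len(virt))
        if nxt < len(jobs):
            dt = min(dt, jobs[nxt][0] - now)
        for j, r in rates.items():
            real[j] -= r * dt
        for j in virt:               # virtual system: equal shares for all
            virt[j] -= dt / len(virt)
        now += dt
        for j in [j for j in rates if real[j] <= EPS]:
            sojourn[j] = now - jobs[j][0]
            del real[j]
            if j in late:
                late.remove(j)
        for j in [j for j, v in virt.items() if v <= EPS]:
            del virt[j]
            if j in real:            # finished virtually but not really: late
                late.append(j)
        while nxt < len(jobs) and jobs[nxt][0] <= now + EPS:
            real[nxt], virt[nxt] = jobs[nxt][1], jobs[nxt][2]
            nxt += 1
    return sojourn


# Exact sizes: behaves like FSP. Under-estimated large job: no long blocking.
print(fspe_ps_sojourn_times([(0, 30, 30), (10, 10, 10), (15, 10, 10)]))
print(fspe_ps_sojourn_times([(0, 100, 1), (2, 1, 1), (3, 1, 1)]))
```

In the second workload, a job of real size 100 is estimated at 1. Under this sketch the two small jobs wait only until they become late themselves, then share the server with the large job and finish after about 3.5 time units each, instead of waiting for the large job to complete.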
FSPE + PS: Results
[Figure: two panels, FSPE and FSPE + PS.]
Performance becomes very close to optimal
Outperformed by PS only for extreme skew and errors
Replaying real-world traces gives analogous results
Schedulers vs. SRPT
[Figure: MST / MST(SRPT), on an axis from 2 to 10, against shape (0.125 to 4) for SRPTE, FSPE, FSPE+PS, PS, LAS and FIFO. Sigma: 0.5]
Hadoop @ Facebook
[Figure: MST / MST(SRPT), on an axis from 2 to 10, against sigma (0.125 to 4) for SRPTE, FSPE, FSPE+PS, PS and LAS, in two panels: Synthetic workload (shape=0.25) and Facebook Hadoop Cluster.]
Web Cache
[Figure: MST / MST(SRPT), on a logarithmic axis, against sigma (0.125 to 4) for SRPTE, FSPE, FSPE+PS, PS and LAS, in two panels: Synthetic workload (shape=0.177) and IRCache Web Cache. The second legend also lists FIFO.]
Take-Home Messages
For System Designers
Do not be afraid of size-based scheduling
it can work great even with very rough estimates
Further Research
Schedulers like FSPE+PS, designed for estimated sizes, work very well
Can we design a scheduler that always outperforms PS?
Can we get a better analytical understanding of the phenomena we observed?
These slides (plus bonus content) available at
http://bit.ly/schedsim
Bonus Content
Fairness: Slowdown
[Figure: ECDF of slowdown (logarithmic axis, 10^0 to 10^2) for SRPTE, FSPE, FSPE+PS, PS, LAS and FIFO, in two panels: the full range (ECDF from 0.0 to 1.0) and a zoom on the top of the distribution (ECDF from 0.90 to 1.00). Shape: 0.25, sigma: 0.5]
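The slide does not define slowdown; the sketch below assumes the usual definition, sojourn time divided by job size, so a slowdown of 1 means the job never waited. The helper names and the sample numbers are hypothetical: they reuse the three-job example from the start of the talk, with sizes 30, 10 and 10 inferred from its tick marks.

```python
def slowdowns(sojourn_times, sizes):
    """Slowdown of each job: time spent in the system per unit of work."""
    return [t / s for t, s in zip(sojourn_times, sizes)]


def ecdf(values, x):
    """Empirical CDF: fraction of values that are at most x."""
    return sum(v <= x for v in values) / len(values)


sizes = [30, 10, 10]                                   # assumed job sizes
ps = slowdowns([50, 27.5, 27.5], sizes)                # processor sharing
size_based = slowdowns([50, 10, 15], sizes)            # size-based
print("PS slowdowns:", [round(s, 2) for s in ps])
print("size-based slowdowns:", [round(s, 2) for s in size_based])
print("fraction of jobs with slowdown <= 1.5:",
      ecdf(ps, 1.5), "vs", round(ecdf(size_based, 1.5), 2))
```

Plotting `ecdf` over a range of x values for each scheduler gives curves of the kind shown in the figure; a curve further to the left means more jobs experience a low slowdown.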
Fairness: Conditional Slowdown
[Figure: slowdown (logarithmic axis, 10^0 to 10^7) against job size (logarithmic axis) for SRPTE, FSPE, FSPE+PS, PS, LAS and FIFO. Shape: 0.25, sigma: 0.5]
