Juan Sebastián Numpaque - Nicolás Cardozo
@ncardoz
{js.numpaque10, n.cardozo}@uniandes.edu.co
CCC’21 - 15 Congreso Colombiano de Computación - November 22 to 26 - (Virtual)
Evaluation of Work Stealing Algorithms
Scheduling computation
[Figure: a computation DAG with tasks v1 to v15 assigned to processors P1 to P4, shown under static and dynamic scheduling]
Work stealing
[Blumofe et al. Scheduling multithreaded computations by work stealing. 1995]
Idle processors steal tasks from processors with tasks in their queue
[Figure: the computation DAG (tasks v1 to v15), processors P1 to P4, and a task queue shown holding v3 v2 v1 in the first frame]
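The stealing step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `steal_if_idle`, the list-of-deques layout, the random choice of victim, and stealing from the head of the victim's queue are all assumptions made for the example.

```python
import random
from collections import deque


def steal_if_idle(queues, idle, rng=random):
    """Move one task to processor `idle` if its own queue is empty.

    `queues` is a list of deques, one per processor (hypothetical layout).
    Returns (victim, task) when a steal happens, otherwise None.
    """
    if queues[idle]:
        return None  # not idle: it still has local work
    # Candidate victims are the other processors that have queued tasks.
    victims = [p for p, q in enumerate(queues) if p != idle and q]
    if not victims:
        return None  # nothing to steal anywhere
    victim = rng.choice(victims)     # assumption: victim picked at random
    task = queues[victim].popleft()  # assumption: steal from the head
    queues[idle].append(task)
    return victim, task


# P1 holds three tasks, P2 to P4 are idle; P2 (index 1) tries to steal.
queues = [deque(["v3", "v2", "v1"]), deque(), deque(), deque()]
result = steal_if_idle(queues, idle=1)
```

Which end of the victim's queue is used for the steal is a policy decision; the example fixes one choice only to stay short.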
Work stealing
Work stealing improves on dynamic scheduling with respect to:
• Automated work balancing
• Better portability
• Scalability with the number of processors
Work stealing algorithms are good,
but how good are they?
Work stealing
[Blumofe et al. Scheduling multithreaded computations by work stealing. 1995]
[Figure: tasks V1 to V5 distributed over per-processor queues (Queue P1 to Queue P4) for processors P1 to P4; the queue head is marked, and the final frame labels the two orders LIFO and FIFO]
Work stealing algorithms
FIFO
• A task’s children are enqueued at the back of the queue in the processor that executed the parent task
• If the processor is idle, it takes the task at the queue’s head
• Tasks are stolen from another processor’s queue head
LIFO
• A task’s children are enqueued at the head of the queue in the processor that executed the parent task
• If the processor is idle, it takes the task at the queue’s head
• Tasks are stolen from the back of another processor’s queue
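The two sets of queue rules can be sketched with a double-ended queue. This is an illustrative sketch, not the authors' code; it assumes the usual reading of the names (FIFO enqueues children at the back, LIFO at the head), and the class and method names (`TaskQueue`, `add_children`, `take_local`, `steal`) are hypothetical.

```python
from collections import deque


class TaskQueue:
    """Per-processor task queue; index 0 is the head, the last index the back."""

    def __init__(self, policy):
        if policy not in ("FIFO", "LIFO"):
            raise ValueError("policy must be 'FIFO' or 'LIFO'")
        self.policy = policy
        self.tasks = deque()

    def add_children(self, children):
        for child in children:
            if self.policy == "FIFO":
                self.tasks.append(child)      # FIFO: enqueue at the back
            else:
                self.tasks.appendleft(child)  # LIFO: enqueue at the head

    def take_local(self):
        # Under both policies the owner takes the task at the head.
        return self.tasks.popleft() if self.tasks else None

    def steal(self):
        if not self.tasks:
            return None
        if self.policy == "FIFO":
            return self.tasks.popleft()  # FIFO: thieves take from the head
        return self.tasks.pop()          # LIFO: thieves take from the back


fifo, lifo = TaskQueue("FIFO"), TaskQueue("LIFO")
for q in (fifo, lifo):
    q.add_children(["v1", "v2", "v3"])
```

With the same three children, the FIFO owner runs the oldest task first, while the LIFO owner runs the newest task and leaves the oldest one at the back for thieves.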
Priority-based work stealing
Longest path over the computation nodes
[Figure: computation DAG with tasks v1 to v15; successive frames annotate it with task sequences such as v7 and v7 v8 v13]
Priority-based work stealing
Tasks further away from the end node (v14) should take priority over tasks closer to the end of the computation
• A task’s children are enqueued at the back of the queue, ordered by priority
• If the processor is idle, it takes the task at the queue’s head
• Tasks are stolen from another processor’s queue head
[Figure: computation DAG with tasks v1 to v15]
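One way to turn "further away from the end node" into a number is the longest path length from each task to a task without children. The sketch below assumes that reading and uses a binary heap as a stand-in for a priority-ordered queue; the names `distance_to_end` and `PriorityTaskQueue` and the five-task example DAG are hypothetical, not taken from the slides.

```python
import heapq
from graphlib import TopologicalSorter  # Python 3.9+


def distance_to_end(dag):
    """Longest path length (in edges) from each task to a task with no children.

    `dag` maps every task to the list of its children.
    """
    # Passing children as "predecessors" makes static_order() yield each
    # task after all of its children, which is the order needed here.
    dist = {}
    for task in TopologicalSorter(dag).static_order():
        children = dag.get(task, ())
        dist[task] = 1 + max(dist[c] for c in children) if children else 0
    return dist


class PriorityTaskQueue:
    """Queue whose head is always the task furthest from the end."""

    def __init__(self, dist):
        self.dist = dist
        self.heap = []

    def add_children(self, children):
        for child in children:
            # heapq is a min-heap, so the distance is negated; the task
            # name breaks ties between equal distances.
            heapq.heappush(self.heap, (-self.dist[child], child))

    def take(self):
        """Owner and thieves both take the task at the head."""
        return heapq.heappop(self.heap)[1] if self.heap else None


# Hypothetical DAG: a -> b -> d -> e, and a -> c.
dag = {"a": ["b", "c"], "b": ["d"], "c": [], "d": ["e"], "e": []}
dist = distance_to_end(dag)
queue = PriorityTaskQueue(dist)
queue.add_children(dag["a"])
first, second = queue.take(), queue.take()
```

In this example `b` sits on the longer path to the end, so it leaves the queue before `c`.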
• Performance of the algorithm depends on the way tasks are chosen (avoid possible bottlenecks!)
• Classic algorithms are not fair
Evaluation
We evaluate the performance and fairness of existing work stealing algorithms and our proposed approach
1. Generate random computation DAGs
   the number of graph nodes varies in [50, 1600]
   the edge density varies in {0.2, 0.5, 0.8}
2. Scale the number of processors in the execution in [1, 96]
3. Execute all the tasks in the DAG using each algorithm
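Step 1 can be sketched as below. The slides do not define "density", so this example assumes it is the probability that any forward pair of nodes is connected; the function name `random_dag` and the `seed` parameter are likewise assumptions for illustration.

```python
import random


def random_dag(num_nodes, density, seed=None):
    """Return a random DAG as {node: [children]} over nodes 0..num_nodes-1.

    Each pair (i, j) with i < j becomes an edge i -> j with probability
    `density`. Edges only go from lower to higher ids, so no cycle can form.
    """
    rng = random.Random(seed)
    dag = {i: [] for i in range(num_nodes)}
    for i in range(num_nodes):
        for j in range(i + 1, num_nodes):
            if rng.random() < density:
                dag[i].append(j)
    return dag


# One graph at the smallest size and the middle density from the setup above.
dag = random_dag(50, 0.5, seed=1)
edges = sum(len(children) for children in dag.values())
```

Fixing the seed makes a generated graph reproducible, so the same DAG can be handed to each algorithm under comparison.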
Performance results
https://flaglab.github.io/WorkStealingAlgorithms/
[Figure, density = 0.2: execution time in ms against number of DAG nodes (50 to 1600) for PRIO, FIFO and LIFO; panels labelled 8, 32 and 96 processors]
Performance results
[Figure, density = 0.5: execution time in ms against number of DAG nodes (50 to 1600) for PRIO, FIFO and LIFO; panels labelled 8, 32 and 96 processors]
Performance results
[Figure, density = 0.8: execution time in ms against number of DAG nodes (50 to 1600) for PRIO, FIFO and LIFO; panels labelled 8, 32 and 96 processors]
Fairness results
[Figure: load (number of tasks) for PRIO, FIFO and LIFO, x-axis "No. of processors" (1 to 8); panels for densities 0.2, 0.5 and 0.8]
Fairness results
[Figure: load (number of tasks) for PRIO, FIFO and LIFO, x-axis "No. of processors" (1 to 32); panels for densities 0.2, 0.5 and 0.8]
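The fairness plots report load as the number of tasks handled by each processor. A single-number summary of such a load vector could look like the sketch below; the coefficient of variation is an assumption chosen for illustration (the slides do not name a metric), and the two load vectors are made-up inputs, not measurements from the evaluation.

```python
from statistics import mean, pstdev


def load_summary(tasks_per_processor):
    """Summarize how evenly tasks were spread over processors."""
    loads = list(tasks_per_processor)
    avg = mean(loads)
    return {
        "min": min(loads),
        "max": max(loads),
        "mean": avg,
        # Coefficient of variation: 0 means every processor ran the same
        # number of tasks; larger values mean a less even spread.
        "cv": pstdev(loads) / avg if avg else 0.0,
    }


# Made-up load vectors for four processors (illustrative only).
balanced = load_summary([25, 25, 25, 25])
skewed = load_summary([70, 10, 10, 10])
```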
Conclusion
• FIFO falls short in both performance and balance at scale
• LIFO scales better than the other algorithms
• Priority performs well, but its performance can decay rapidly with many nodes; however, it presents the best balance
@ncardoz n.cardozo@uniandes.edu.co
https://flaglab.github.io
Questions?

More Related Content

What's hot

BKK16-503 Undefined Behavior and Compiler Optimizations – Why Your Program St...
BKK16-503 Undefined Behavior and Compiler Optimizations – Why Your Program St...BKK16-503 Undefined Behavior and Compiler Optimizations – Why Your Program St...
BKK16-503 Undefined Behavior and Compiler Optimizations – Why Your Program St...
Linaro
 
Understanding git
Understanding gitUnderstanding git
Understanding gitAvik Das
 
Berkeley Packet Filters
Berkeley Packet FiltersBerkeley Packet Filters
Berkeley Packet Filters
Kernel TLV
 
NYAN Conference: Debugging asynchronous scenarios in .net
NYAN Conference: Debugging asynchronous scenarios in .netNYAN Conference: Debugging asynchronous scenarios in .net
NYAN Conference: Debugging asynchronous scenarios in .net
Alexandra Hayere
 
p4alu: Arithmetic Logic Unit in P4
p4alu: Arithmetic Logic Unit in P4p4alu: Arithmetic Logic Unit in P4
p4alu: Arithmetic Logic Unit in P4
Kentaro Ebisawa
 
Veriloggen.Stream: データフローからハードウェアを作る(2018年3月3日 高位合成友の会 第5回 @東京工業大学)
Veriloggen.Stream: データフローからハードウェアを作る(2018年3月3日 高位合成友の会 第5回 @東京工業大学)Veriloggen.Stream: データフローからハードウェアを作る(2018年3月3日 高位合成友の会 第5回 @東京工業大学)
Veriloggen.Stream: データフローからハードウェアを作る(2018年3月3日 高位合成友の会 第5回 @東京工業大学)
Shinya Takamaeda-Y
 
Radare2 @ ndh2k15 : First r2babies steps
Radare2 @ ndh2k15 : First r2babies stepsRadare2 @ ndh2k15 : First r2babies steps
Radare2 @ ndh2k15 : First r2babies steps
Maijin
 
Mod06 new development tools
Mod06 new development toolsMod06 new development tools
Mod06 new development tools
Peter Haase
 
Building Observable Applications w/ Node.js -- BayNode Meetup, March 2014
Building Observable Applications w/ Node.js -- BayNode Meetup, March 2014Building Observable Applications w/ Node.js -- BayNode Meetup, March 2014
Building Observable Applications w/ Node.js -- BayNode Meetup, March 2014
Yunong Xiao
 
The Simple Scheduler in Embedded System @ OSDC.TW 2014
The Simple Scheduler in Embedded System @ OSDC.TW 2014The Simple Scheduler in Embedded System @ OSDC.TW 2014
The Simple Scheduler in Embedded System @ OSDC.TW 2014
Jian-Hong Pan
 
Specializing the Data Path - Hooking into the Linux Network Stack
Specializing the Data Path - Hooking into the Linux Network StackSpecializing the Data Path - Hooking into the Linux Network Stack
Specializing the Data Path - Hooking into the Linux Network Stack
Kernel TLV
 
ebpf and IO Visor: The What, how, and what next!
ebpf and IO Visor: The What, how, and what next!ebpf and IO Visor: The What, how, and what next!
ebpf and IO Visor: The What, how, and what next!
Affan Syed
 
Pimp my gc - Supersonic Scala
Pimp my gc - Supersonic ScalaPimp my gc - Supersonic Scala
Pimp my gc - Supersonic Scala
Pierre Laporte
 
Andrea Righi - Spying on the Linux kernel for fun and profit
Andrea Righi - Spying on the Linux kernel for fun and profitAndrea Righi - Spying on the Linux kernel for fun and profit
Andrea Righi - Spying on the Linux kernel for fun and profit
linuxlab_conf
 
SuperAGILE Standard Orbital data Analysis pipeline
SuperAGILE Standard Orbital  data Analysis pipelineSuperAGILE Standard Orbital  data Analysis pipeline
SuperAGILE Standard Orbital data Analysis pipeline
Francesco Lazzarotto
 
BKK16-302: Android Optimizing Compiler: New Member Assimilation Guide
BKK16-302: Android Optimizing Compiler: New Member Assimilation GuideBKK16-302: Android Optimizing Compiler: New Member Assimilation Guide
BKK16-302: Android Optimizing Compiler: New Member Assimilation Guide
Linaro
 
Make Your Own Developement Board @ 2014.4.21 JuluOSDev
Make Your Own Developement Board @ 2014.4.21 JuluOSDevMake Your Own Developement Board @ 2014.4.21 JuluOSDev
Make Your Own Developement Board @ 2014.4.21 JuluOSDevJian-Hong Pan
 
Using Kafka in your python application - Python fwdays 2020
Using Kafka in your python application - Python fwdays 2020Using Kafka in your python application - Python fwdays 2020
Using Kafka in your python application - Python fwdays 2020
Oleksandr Tarasenko
 
Oleksandr Tarasenko "Using Kafka in your python applications"
Oleksandr Tarasenko "Using Kafka in your python applications"Oleksandr Tarasenko "Using Kafka in your python applications"
Oleksandr Tarasenko "Using Kafka in your python applications"
Fwdays
 
#Include os - From bootloader to REST API with the new C++
#Include os - From bootloader to REST API with the new C++#Include os - From bootloader to REST API with the new C++
#Include os - From bootloader to REST API with the new C++
IncludeOS
 

What's hot (20)

BKK16-503 Undefined Behavior and Compiler Optimizations – Why Your Program St...
BKK16-503 Undefined Behavior and Compiler Optimizations – Why Your Program St...BKK16-503 Undefined Behavior and Compiler Optimizations – Why Your Program St...
BKK16-503 Undefined Behavior and Compiler Optimizations – Why Your Program St...
 
Understanding git
Understanding gitUnderstanding git
Understanding git
 
Berkeley Packet Filters
Berkeley Packet FiltersBerkeley Packet Filters
Berkeley Packet Filters
 
NYAN Conference: Debugging asynchronous scenarios in .net
NYAN Conference: Debugging asynchronous scenarios in .netNYAN Conference: Debugging asynchronous scenarios in .net
NYAN Conference: Debugging asynchronous scenarios in .net
 
p4alu: Arithmetic Logic Unit in P4
p4alu: Arithmetic Logic Unit in P4p4alu: Arithmetic Logic Unit in P4
p4alu: Arithmetic Logic Unit in P4
 
Veriloggen.Stream: データフローからハードウェアを作る(2018年3月3日 高位合成友の会 第5回 @東京工業大学)
Veriloggen.Stream: データフローからハードウェアを作る(2018年3月3日 高位合成友の会 第5回 @東京工業大学)Veriloggen.Stream: データフローからハードウェアを作る(2018年3月3日 高位合成友の会 第5回 @東京工業大学)
Veriloggen.Stream: データフローからハードウェアを作る(2018年3月3日 高位合成友の会 第5回 @東京工業大学)
 
Radare2 @ ndh2k15 : First r2babies steps
Radare2 @ ndh2k15 : First r2babies stepsRadare2 @ ndh2k15 : First r2babies steps
Radare2 @ ndh2k15 : First r2babies steps
 
Mod06 new development tools
Mod06 new development toolsMod06 new development tools
Mod06 new development tools
 
Building Observable Applications w/ Node.js -- BayNode Meetup, March 2014
Building Observable Applications w/ Node.js -- BayNode Meetup, March 2014Building Observable Applications w/ Node.js -- BayNode Meetup, March 2014
Building Observable Applications w/ Node.js -- BayNode Meetup, March 2014
 
The Simple Scheduler in Embedded System @ OSDC.TW 2014
The Simple Scheduler in Embedded System @ OSDC.TW 2014The Simple Scheduler in Embedded System @ OSDC.TW 2014
The Simple Scheduler in Embedded System @ OSDC.TW 2014
 
Specializing the Data Path - Hooking into the Linux Network Stack
Specializing the Data Path - Hooking into the Linux Network StackSpecializing the Data Path - Hooking into the Linux Network Stack
Specializing the Data Path - Hooking into the Linux Network Stack
 
ebpf and IO Visor: The What, how, and what next!
ebpf and IO Visor: The What, how, and what next!ebpf and IO Visor: The What, how, and what next!
ebpf and IO Visor: The What, how, and what next!
 
Pimp my gc - Supersonic Scala
Pimp my gc - Supersonic ScalaPimp my gc - Supersonic Scala
Pimp my gc - Supersonic Scala
 
Andrea Righi - Spying on the Linux kernel for fun and profit
Andrea Righi - Spying on the Linux kernel for fun and profitAndrea Righi - Spying on the Linux kernel for fun and profit
Andrea Righi - Spying on the Linux kernel for fun and profit
 
SuperAGILE Standard Orbital data Analysis pipeline
SuperAGILE Standard Orbital  data Analysis pipelineSuperAGILE Standard Orbital  data Analysis pipeline
SuperAGILE Standard Orbital data Analysis pipeline
 
BKK16-302: Android Optimizing Compiler: New Member Assimilation Guide
BKK16-302: Android Optimizing Compiler: New Member Assimilation GuideBKK16-302: Android Optimizing Compiler: New Member Assimilation Guide
BKK16-302: Android Optimizing Compiler: New Member Assimilation Guide
 
Make Your Own Developement Board @ 2014.4.21 JuluOSDev
Make Your Own Developement Board @ 2014.4.21 JuluOSDevMake Your Own Developement Board @ 2014.4.21 JuluOSDev
Make Your Own Developement Board @ 2014.4.21 JuluOSDev
 
Using Kafka in your python application - Python fwdays 2020
Using Kafka in your python application - Python fwdays 2020Using Kafka in your python application - Python fwdays 2020
Using Kafka in your python application - Python fwdays 2020
 
Oleksandr Tarasenko "Using Kafka in your python applications"
Oleksandr Tarasenko "Using Kafka in your python applications"Oleksandr Tarasenko "Using Kafka in your python applications"
Oleksandr Tarasenko "Using Kafka in your python applications"
 
#Include os - From bootloader to REST API with the new C++
#Include os - From bootloader to REST API with the new C++#Include os - From bootloader to REST API with the new C++
#Include os - From bootloader to REST API with the new C++
 

Similar to [CCC'21] Evaluation of Work Stealing Algorithms

May2010 hex-core-opt
May2010 hex-core-optMay2010 hex-core-opt
May2010 hex-core-optJeff Larkin
 
Scaling mysql with python (and Docker).
Scaling mysql with python (and Docker).Scaling mysql with python (and Docker).
Scaling mysql with python (and Docker).
Roberto Polli
 
Joblib: Lightweight pipelining for parallel jobs (v2)
Joblib:  Lightweight pipelining for parallel jobs (v2)Joblib:  Lightweight pipelining for parallel jobs (v2)
Joblib: Lightweight pipelining for parallel jobs (v2)
Marcel Caraciolo
 
LinuxLabs 2017 talk: Container monitoring challenges
LinuxLabs 2017 talk: Container monitoring challengesLinuxLabs 2017 talk: Container monitoring challenges
LinuxLabs 2017 talk: Container monitoring challenges
Xavier Vello
 
Global Interpreter Lock: Episode I - Break the Seal
Global Interpreter Lock: Episode I - Break the SealGlobal Interpreter Lock: Episode I - Break the Seal
Global Interpreter Lock: Episode I - Break the Seal
Tzung-Bi Shih
 
Braxton McKee, Founder & CEO, Ufora at MLconf SF - 11/13/15
Braxton McKee, Founder & CEO, Ufora at MLconf SF - 11/13/15Braxton McKee, Founder & CEO, Ufora at MLconf SF - 11/13/15
Braxton McKee, Founder & CEO, Ufora at MLconf SF - 11/13/15
MLconf
 
Reverse engineering Swisscom's Centro Grande Modem
Reverse engineering Swisscom's Centro Grande ModemReverse engineering Swisscom's Centro Grande Modem
Reverse engineering Swisscom's Centro Grande Modem
Cyber Security Alliance
 
Apache Flink(tm) - A Next-Generation Stream Processor
Apache Flink(tm) - A Next-Generation Stream ProcessorApache Flink(tm) - A Next-Generation Stream Processor
Apache Flink(tm) - A Next-Generation Stream Processor
Aljoscha Krettek
 
Cray XT Porting, Scaling, and Optimization Best Practices
Cray XT Porting, Scaling, and Optimization Best PracticesCray XT Porting, Scaling, and Optimization Best Practices
Cray XT Porting, Scaling, and Optimization Best PracticesJeff Larkin
 
Make static instrumentation great again, High performance fuzzing for Windows...
Make static instrumentation great again, High performance fuzzing for Windows...Make static instrumentation great again, High performance fuzzing for Windows...
Make static instrumentation great again, High performance fuzzing for Windows...
Lucas Leong
 
Serverless on OpenStack with Docker Swarm, Mistral, and StackStorm
Serverless on OpenStack with Docker Swarm, Mistral, and StackStormServerless on OpenStack with Docker Swarm, Mistral, and StackStorm
Serverless on OpenStack with Docker Swarm, Mistral, and StackStorm
Dmitri Zimine
 
Docker In the Bank
Docker In the BankDocker In the Bank
Docker In the Bank
Aleksandr Tarasov
 
Automating with NX-OS: Let's Get Started!
Automating with NX-OS: Let's Get Started!Automating with NX-OS: Let's Get Started!
Automating with NX-OS: Let's Get Started!
Cisco DevNet
 
第二回CTF勉強会資料
第二回CTF勉強会資料第二回CTF勉強会資料
第二回CTF勉強会資料
Asuka Nakajima
 
Profiling the logwriter and database writer
Profiling the logwriter and database writerProfiling the logwriter and database writer
Profiling the logwriter and database writer
Kyle Hailey
 
(Even more) Rapid App Development with RubyMotion
(Even more) Rapid App Development with RubyMotion(Even more) Rapid App Development with RubyMotion
(Even more) Rapid App Development with RubyMotion
Stefan Haflidason
 
Power of linked list
Power of linked listPower of linked list
Power of linked list
Peter Hlavaty
 
HPC Examples
HPC ExamplesHPC Examples
HPC Examples
Wendi Sapp
 
Lec7 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Dynamic Sch...
Lec7 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Dynamic Sch...Lec7 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Dynamic Sch...
Lec7 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Dynamic Sch...
Hsien-Hsin Sean Lee, Ph.D.
 
Comp architecture : branch prediction
Comp architecture : branch predictionComp architecture : branch prediction
Comp architecture : branch prediction
rinnocente
 

Similar to [CCC'21] Evaluation of Work Stealing Algorithms (20)

May2010 hex-core-opt
May2010 hex-core-optMay2010 hex-core-opt
May2010 hex-core-opt
 
Scaling mysql with python (and Docker).
Scaling mysql with python (and Docker).Scaling mysql with python (and Docker).
Scaling mysql with python (and Docker).
 
Joblib: Lightweight pipelining for parallel jobs (v2)
Joblib:  Lightweight pipelining for parallel jobs (v2)Joblib:  Lightweight pipelining for parallel jobs (v2)
Joblib: Lightweight pipelining for parallel jobs (v2)
 
LinuxLabs 2017 talk: Container monitoring challenges
LinuxLabs 2017 talk: Container monitoring challengesLinuxLabs 2017 talk: Container monitoring challenges
LinuxLabs 2017 talk: Container monitoring challenges
 
Global Interpreter Lock: Episode I - Break the Seal
Global Interpreter Lock: Episode I - Break the SealGlobal Interpreter Lock: Episode I - Break the Seal
Global Interpreter Lock: Episode I - Break the Seal
 
Braxton McKee, Founder & CEO, Ufora at MLconf SF - 11/13/15
Braxton McKee, Founder & CEO, Ufora at MLconf SF - 11/13/15Braxton McKee, Founder & CEO, Ufora at MLconf SF - 11/13/15
Braxton McKee, Founder & CEO, Ufora at MLconf SF - 11/13/15
 
Reverse engineering Swisscom's Centro Grande Modem
Reverse engineering Swisscom's Centro Grande ModemReverse engineering Swisscom's Centro Grande Modem
Reverse engineering Swisscom's Centro Grande Modem
 
Apache Flink(tm) - A Next-Generation Stream Processor
Apache Flink(tm) - A Next-Generation Stream ProcessorApache Flink(tm) - A Next-Generation Stream Processor
Apache Flink(tm) - A Next-Generation Stream Processor
 
Cray XT Porting, Scaling, and Optimization Best Practices
Cray XT Porting, Scaling, and Optimization Best PracticesCray XT Porting, Scaling, and Optimization Best Practices
Cray XT Porting, Scaling, and Optimization Best Practices
 
Make static instrumentation great again, High performance fuzzing for Windows...
Make static instrumentation great again, High performance fuzzing for Windows...Make static instrumentation great again, High performance fuzzing for Windows...
Make static instrumentation great again, High performance fuzzing for Windows...
 
Serverless on OpenStack with Docker Swarm, Mistral, and StackStorm
Serverless on OpenStack with Docker Swarm, Mistral, and StackStormServerless on OpenStack with Docker Swarm, Mistral, and StackStorm
Serverless on OpenStack with Docker Swarm, Mistral, and StackStorm
 
Docker In the Bank
Docker In the BankDocker In the Bank
Docker In the Bank
 
Automating with NX-OS: Let's Get Started!
Automating with NX-OS: Let's Get Started!Automating with NX-OS: Let's Get Started!
Automating with NX-OS: Let's Get Started!
 
第二回CTF勉強会資料
第二回CTF勉強会資料第二回CTF勉強会資料
第二回CTF勉強会資料
 
Profiling the logwriter and database writer
Profiling the logwriter and database writerProfiling the logwriter and database writer
Profiling the logwriter and database writer
 
(Even more) Rapid App Development with RubyMotion
(Even more) Rapid App Development with RubyMotion(Even more) Rapid App Development with RubyMotion
(Even more) Rapid App Development with RubyMotion
 
Power of linked list
Power of linked listPower of linked list
Power of linked list
 
HPC Examples
HPC ExamplesHPC Examples
HPC Examples
 
Lec7 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Dynamic Sch...
Lec7 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Dynamic Sch...Lec7 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Dynamic Sch...
Lec7 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Dynamic Sch...
 
Comp architecture : branch prediction
Comp architecture : branch predictionComp architecture : branch prediction
Comp architecture : branch prediction
 

More from Universidad de los Andes

An expressive and modular layer activation mechanism for Context-Oriented Pro...
An expressive and modular layer activation mechanism for Context-Oriented Pro...An expressive and modular layer activation mechanism for Context-Oriented Pro...
An expressive and modular layer activation mechanism for Context-Oriented Pro...
Universidad de los Andes
 
[FTfJP23] Points-to Analysis for Context-oriented Javascript Programs
[FTfJP23] Points-to Analysis for Context-oriented Javascript Programs[FTfJP23] Points-to Analysis for Context-oriented Javascript Programs
[FTfJP23] Points-to Analysis for Context-oriented Javascript Programs
Universidad de los Andes
 
[JIST] Programming language implementations for context-oriented self-adaptiv...
[JIST] Programming language implementations for context-oriented self-adaptiv...[JIST] Programming language implementations for context-oriented self-adaptiv...
[JIST] Programming language implementations for context-oriented self-adaptiv...
Universidad de los Andes
 
[CAIN'23] Prevalence of Code Smells in Reinforcement Learning Projects
[CAIN'23] Prevalence of Code Smells in Reinforcement Learning Projects[CAIN'23] Prevalence of Code Smells in Reinforcement Learning Projects
[CAIN'23] Prevalence of Code Smells in Reinforcement Learning Projects
Universidad de los Andes
 
[CIbSE2023] Cross-language clone detection for Mobile Apps
[CIbSE2023] Cross-language clone detection for Mobile Apps[CIbSE2023] Cross-language clone detection for Mobile Apps
[CIbSE2023] Cross-language clone detection for Mobile Apps
Universidad de los Andes
 
Keeping Up! with LaTeX
Keeping Up! with LaTeXKeeping Up! with LaTeX
Keeping Up! with LaTeX
Universidad de los Andes
 
[JPDC,JCC@LMN22] Ad hoc systems Management and specification with distributed...
[JPDC,JCC@LMN22] Ad hoc systems Management and specification with distributed...[JPDC,JCC@LMN22] Ad hoc systems Management and specification with distributed...
[JPDC,JCC@LMN22] Ad hoc systems Management and specification with distributed...
Universidad de los Andes
 
Generating Adaptations from the System Execution using Reinforcement Learning...
Generating Adaptations from the System Execution using Reinforcement Learning...Generating Adaptations from the System Execution using Reinforcement Learning...
Generating Adaptations from the System Execution using Reinforcement Learning...
Universidad de los Andes
 
Language Abstractions and Techniques for Developing Collective Adaptive Syste...
Language Abstractions and Techniques for Developing Collective Adaptive Syste...Language Abstractions and Techniques for Developing Collective Adaptive Syste...
Language Abstractions and Techniques for Developing Collective Adaptive Syste...
Universidad de los Andes
 
Does Neuron Coverage Matter for Deep Reinforcement Learning? A preliminary study
Does Neuron Coverage Matter for Deep Reinforcement Learning? A preliminary studyDoes Neuron Coverage Matter for Deep Reinforcement Learning? A preliminary study
Does Neuron Coverage Matter for Deep Reinforcement Learning? A preliminary study
Universidad de los Andes
 
Learning run-time composition of interacting adaptations
Learning run-time composition of interacting adaptationsLearning run-time composition of interacting adaptations
Learning run-time composition of interacting adaptations
Universidad de los Andes
 
Distributed context Petri nets
Distributed context Petri netsDistributed context Petri nets
Distributed context Petri nets
Universidad de los Andes
 
CQL: declarative language for context activation
CQL: declarative language for context activationCQL: declarative language for context activation
CQL: declarative language for context activation
Universidad de los Andes
 
Generating software adaptations using machine learning
Generating software adaptations using machine learningGenerating software adaptations using machine learning
Generating software adaptations using machine learning
Universidad de los Andes
 
[Bachelor_project] Asignación de exámenes finales
[Bachelor_project] Asignación de exámenes finales[Bachelor_project] Asignación de exámenes finales
[Bachelor_project] Asignación de exámenes finales
Universidad de los Andes
 
Programming language techniques for adaptive software
Programming language techniques for adaptive softwareProgramming language techniques for adaptive software
Programming language techniques for adaptive software
Universidad de los Andes
 
Peace COrP: Learning to solve conflicts between contexts
Peace COrP: Learning to solve conflicts between contextsPeace COrP: Learning to solve conflicts between contexts
Peace COrP: Learning to solve conflicts between contexts
Universidad de los Andes
 
Emergent Software Services
Emergent Software ServicesEmergent Software Services
Emergent Software Services
Universidad de los Andes
 

More from Universidad de los Andes (18)

An expressive and modular layer activation mechanism for Context-Oriented Pro...
An expressive and modular layer activation mechanism for Context-Oriented Pro...An expressive and modular layer activation mechanism for Context-Oriented Pro...
An expressive and modular layer activation mechanism for Context-Oriented Pro...
 
[FTfJP23] Points-to Analysis for Context-oriented Javascript Programs
[FTfJP23] Points-to Analysis for Context-oriented Javascript Programs[FTfJP23] Points-to Analysis for Context-oriented Javascript Programs
[FTfJP23] Points-to Analysis for Context-oriented Javascript Programs
 
[JIST] Programming language implementations for context-oriented self-adaptiv...
[JIST] Programming language implementations for context-oriented self-adaptiv...[JIST] Programming language implementations for context-oriented self-adaptiv...
[JIST] Programming language implementations for context-oriented self-adaptiv...
 
[CAIN'23] Prevalence of Code Smells in Reinforcement Learning Projects
[CAIN'23] Prevalence of Code Smells in Reinforcement Learning Projects[CAIN'23] Prevalence of Code Smells in Reinforcement Learning Projects
[CAIN'23] Prevalence of Code Smells in Reinforcement Learning Projects
 
[CIbSE2023] Cross-language clone detection for Mobile Apps
[CIbSE2023] Cross-language clone detection for Mobile Apps[CIbSE2023] Cross-language clone detection for Mobile Apps
[CIbSE2023] Cross-language clone detection for Mobile Apps
 
Keeping Up! with LaTeX
Keeping Up! with LaTeXKeeping Up! with LaTeX
Keeping Up! with LaTeX
 
[CCC'21] Evaluation of Work Stealing Algorithms

  • 1. Evaluation of Work Stealing Algorithms. Juan Sebastián Numpaque, Nicolás Cardozo (@ncardoz), {js.numpaque10, n.cardozo}@uniandes.edu.co. CCC’21, 15th Congreso Colombiano de Computación, November 22 to 26 (virtual).
  • 2-3. Scheduling computation: static vs. dynamic. [Figure: computation DAG with tasks v1 to v15 and processors P1 to P4.]
  • 4-6. Work stealing [Blumofe et al. Scheduling multithreaded computations by work stealing. 1995]. Idle processors steal tasks from processors with tasks in their queue. [Figure: computation DAG with tasks v1 to v15, processors P1 to P4, and a task queue holding v3, v2, v1.]
  • 7. Work stealing. Work stealing improves on dynamic scheduling with respect to: automated work balancing; better portability; scalability with the number of processors.
  • 8. Work stealing algorithms are good, but how good are they?
  • 9-12. Work stealing [Blumofe et al. Scheduling multithreaded computations by work stealing. 1995]. [Figure: tasks V1 to V5 and the queues of processors P1 to P4 (Queue P1 to Queue P4), with the queue head marked; the last animation step adds the labels LIFO and FIFO.]
  • 13. Work stealing algorithms. FIFO: • A task’s children are enqueued at the back of the queue of the processor that executed the parent task • If the processor is idle, it takes the task at the queue’s head • Tasks are stolen from the head of another processor’s queue. LIFO: • A task’s children are enqueued at the head of the queue of the processor that executed the parent task • If the processor is idle, it takes the task at the queue’s head • Tasks are stolen from the back of another processor’s queue.
  • 14-18. Priority-based work stealing. Longest path over the computation nodes. [Figure: computation DAG with tasks v1 to v15; successive animation steps add the labels v7, v8, v13, v3, v6, and v5.]
  • 19. Priority-based work stealing. Tasks further away from the end node (v14) should take priority over tasks closer to the end of the computation • A task’s children are enqueued at the back of the queue, ordered by priority • If the processor is idle, it takes the task at the queue’s head • Tasks are stolen from the head of another processor’s queue
  • 20. • Performance of the algorithm depends on the way tasks are chosen (avoid possible bottlenecks!) • Classic algorithms are not fair
  • 21. Evaluation. We evaluate the performance and fairness of existing work stealing algorithms and our proposed approach: 1. Generate random computation DAGs, with the number of graph nodes varying in [50, 1600] and the edge density varying in {0.2, 0.5, 0.8} 2. Scale the number of processors in the execution over [1, 96] 3. Execute all the tasks in the DAG using each algorithm
  • 22. Performance results (https://flaglab.github.io/WorkStealingAlgorithms/), density = 0.2. [Plots: execution time in ms against the number of DAG nodes (50, 100, 200, 400, 800, 1600) for PRIO, FIFO, and LIFO; panels labeled 8, 32, and 96 processors.]
  • 23. Performance results (https://flaglab.github.io/WorkStealingAlgorithms/), density = 0.5. [Plots: execution time in ms against the number of DAG nodes (50, 100, 200, 400, 800, 1600) for PRIO, FIFO, and LIFO; panels labeled 8, 32, and 96 processors.]
  • 24. Performance results (https://flaglab.github.io/WorkStealingAlgorithms/), density = 0.8. [Plots: execution time in ms against the number of DAG nodes (50, 100, 200, 400, 800, 1600) for PRIO, FIFO, and LIFO; panels labeled 8, 32, and 96 processors.]
  • 25. Fairness results (https://flaglab.github.io/WorkStealingAlgorithms/). [Plots: load (no. of tasks) against processors 1 to 8 for PRIO, FIFO, and LIFO; panels for densities 0.2, 0.5, and 0.8.]
  • 26. Fairness results (https://flaglab.github.io/WorkStealingAlgorithms/). [Plots: load (no. of tasks) against processors 1 to 32 for PRIO, FIFO, and LIFO; panels for densities 0.2, 0.5, and 0.8.]
  • 27. Conclusion • FIFO falls short in both performance and balance at scale • LIFO scales better than the other algorithms • Priority performs well, but its performance can decay rapidly with many nodes; however, it presents the best balance. @ncardoz, n.cardozo@uniandes.edu.co, https://flaglab.github.io
  • 28. Questions?
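
The FIFO and LIFO rules on the "Work stealing algorithms" slide can be illustrated with a small round-based simulation. This is only a sketch, not the authors' implementation. It takes FIFO to be the variant that enqueues children at the back and steals from the head, and LIFO the one that enqueues children at the head and steals from the back (the usual meaning of the two terms). The function name `run`, the choice to place all root tasks on processor 0, and the rule that a thief tries victims in index order are assumptions made for illustration.

```python
from collections import deque


def run(dag, num_procs, policy):
    """Simulate round-based work stealing; return the tasks each processor ran.

    dag maps each task to the list of its children; policy is "FIFO" or "LIFO".
    """
    # Count unfinished parents so a child is enqueued only once it is ready.
    pending = {task: 0 for task in dag}
    for children in dag.values():
        for child in children:
            pending[child] += 1

    queues = [deque() for _ in range(num_procs)]
    executed = [[] for _ in range(num_procs)]
    # Assumption: all root tasks start in the queue of processor 0.
    queues[0].extend(t for t in dag if pending[t] == 0)

    remaining = len(dag)
    while remaining:
        # Phase 1: every processor picks at most one task for this round.
        picked = []
        for p in range(num_procs):
            if queues[p]:
                picked.append((p, queues[p].popleft()))  # own queue: take the head
                continue
            # Idle: steal from the first other processor that has queued tasks.
            for victim in range(num_procs):
                if victim != p and queues[victim]:
                    if policy == "FIFO":
                        picked.append((p, queues[victim].popleft()))  # steal the head
                    else:
                        picked.append((p, queues[victim].pop()))  # steal the back
                    break
        # Phase 2: run the picked tasks and enqueue children that became ready.
        for p, task in picked:
            executed[p].append(task)
            remaining -= 1
            for child in dag[task]:
                pending[child] -= 1
                if pending[child] == 0:
                    if policy == "FIFO":
                        queues[p].append(child)  # back of the queue
                    else:
                        queues[p].appendleft(child)  # head of the queue
    return executed
```

For example, `run({"v1": ["v2", "v3"], "v2": ["v4"], "v3": ["v4"], "v4": []}, 2, "LIFO")` returns one list of executed tasks per processor; the lengths of those lists give the per-processor load that the fairness slides plot.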
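
The priority rule on the "Priority-based work stealing" slides (tasks further from the end node go first, using the longest path over the computation nodes) could be sketched as follows. The slides do not say how the distance is computed or how the queue is stored, so this version assumes the priority of a task is the number of edges on its longest path to a task without children, and it uses a binary heap for the ordered queue. The names `distances_to_end` and `PriorityQueue` are hypothetical.

```python
import heapq


def distances_to_end(dag):
    """Longest path length, in edges, from each task to a task without children."""
    # Topological order via Kahn's algorithm (the list grows while we iterate).
    indegree = {task: 0 for task in dag}
    for children in dag.values():
        for child in children:
            indegree[child] += 1
    order = [task for task in dag if indegree[task] == 0]
    for task in order:
        for child in dag[task]:
            indegree[child] -= 1
            if indegree[child] == 0:
                order.append(child)
    # Walk the order backwards so every child is resolved before its parents.
    dist = {}
    for task in reversed(order):
        dist[task] = 1 + max((dist[c] for c in dag[task]), default=-1)
    return dist


class PriorityQueue:
    """Queue whose head is always the task furthest from the end of the DAG."""

    def __init__(self, dist):
        self._dist = dist
        self._heap = []
        self._count = 0  # insertion counter: ties keep their arrival order

    def push(self, task):
        heapq.heappush(self._heap, (-self._dist[task], self._count, task))
        self._count += 1

    def pop_head(self):
        # Used both by the owner and by a thief: both take the queue's head.
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)
```

Because the slide has both the owning processor and thieves take tasks from the head, a single `pop_head` operation covers both cases in this sketch.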
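
The first evaluation step (random computation DAGs with a given number of nodes and edge density) could look like the sketch below. The slides do not describe the generator, so this is one common construction and an assumption: nodes are numbered, and each possible forward edge from a lower-numbered to a higher-numbered node is included with probability equal to the density, which guarantees the graph is acyclic. The function name `random_dag` and the `seed` parameter are hypothetical.

```python
import random


def random_dag(num_nodes, density, seed=0):
    """Return a random DAG as a dict mapping each task to its list of children.

    Assumption: density is the probability of including each forward edge
    (v_i, v_j) with i < j; only forward edges are generated, so no cycles arise.
    """
    rng = random.Random(seed)  # seeded so that experiments are repeatable
    names = [f"v{i}" for i in range(1, num_nodes + 1)]
    dag = {name: [] for name in names}
    for i in range(num_nodes):
        for j in range(i + 1, num_nodes):
            if rng.random() < density:
                dag[names[i]].append(names[j])
    return dag
```

An experiment grid in the spirit of the evaluation slide would then loop over node counts such as 50, 100, 200, 400, 800, and 1600, densities 0.2, 0.5, and 0.8, and processor counts from 1 to 96, executing each generated DAG once per algorithm.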