Tobias Fuchs 
Evaluation of Task Scheduling 
Algorithms and Wait-Free Data 
Structures for Embedded Multi-Core 
Systems 
• Master's thesis presentation
• Supervisor: Prof. Dr. Dieter Kranzlmüller
• Advisors: Dr. Karl Fürlinger (LMU)
Dr. Tobias Schüle (Siemens CT)
• Date of presentation: 5 November 2014
Structure of this talk 
1. Introduction
   1.1 Motivation
   1.2 Problem Statement and Objectives
2. Wait-free data structures
   2.1 Foundations
   2.2 Pools
   2.3 Queues
   2.4 Stacks
3. Task Scheduling
   3.1 Work stealing
   3.2 Prioritized work stealing in EMBB
4. Conclusion
Task Scheduling Algorithms and Wait-Free Data Structures for Embedded Multi-Core Systems 2
Wait-freedom: 
Motivation 
Wait-free algorithms 
• Strongest possible fault tolerance 
• Guarantee progress and upper bound for execution time 
Gains: 
+ Progress is potentially a formal constraint in real-time 
computing 
+ Wait-freedom eliminates the classic concurrency problems: 
Deadlocks, Priority Inversion, Convoying, Kill-Intolerance 
Problem statement 
State of the art 
No suitable wait-free data structures for embedded systems: 
• Employing mechanisms such as garbage collection 
• Not designed for restricted resources 
• No evaluation for latency 
Challenges: 
- Transforming data structures to wait-free equivalents is 
non-trivial, usually from-scratch redesign 
- Implementations depend on platform architecture 
Objectives 
1. Review and evaluation of state of the art approaches for 
suitability on embedded systems 
2. Real-time compliant implementations of wait-free data 
structures 
3. Definition, implementation and evaluation of suitable 
benchmark scenarios for wait-free data structures and 
task scheduling algorithms 
+ Automated verification derived from semantic definition 
Foundations 
Progress conditions 
Classification of progress 
On the Nature of Progress (Herlihy, Shavit 2011) 
Real-time requirements 
Performance priorities on real-time systems 
Guarantees on worst-case runtime behavior
→ Aim for latency and jitter reduction, neglecting throughput
→ Avoid non-determinism, as in malloc / new (see: MISRA)
Evaluation methodology 
Real-time applications are designed to optimize latency 
Related work does not evaluate latency, but only mean or 
median throughput 
Evaluation of worst-case latency is tough: 
• In related work, measurements outside of 97.5% confidence 
interval are considered outliers and ignored 
• These outliers are our data 
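The difference between a throughput-oriented summary and a worst-case summary can be sketched in a few lines of C++. This is only an illustration under assumptions, not the benchmark framework of the thesis: `LatencyStats` and `Summarize` are hypothetical names, and samples are assumed to be latencies in nanoseconds.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical summary of latency samples (assumed unit: nanoseconds).
struct LatencyStats {
  double mean;
  std::uint64_t median;
  std::uint64_t max;  // worst case: the value a real-time evaluation looks at
};

// Summarizes all samples. No sample is discarded as an outlier.
LatencyStats Summarize(std::vector<std::uint64_t> samples) {
  assert(!samples.empty());
  std::sort(samples.begin(), samples.end());
  const double sum = std::accumulate(samples.begin(), samples.end(), 0.0);
  return LatencyStats{sum / samples.size(),
                      samples[samples.size() / 2],
                      samples.back()};
}
```

With one slow sample among fast ones, the median barely moves while `max` exposes the spike, which is why trimming outliers hides exactly the behavior of interest here.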
Pools 
Wait-free data structures: 
Pools 
… realize dynamic memory allocation 
… while eliminating heap fragmentation 
• Fundamental data structure of any concurrent container 
• Fixed number of objects in static or automatic memory 
• Pools manage concurrent removal and reclamation of 
objects 
RemoveAny(pool, er): Remove and return element er
Add(pool, e): Add element e back to the pool
Pools: 
Related work 
Close to none: 
• Several lock-free pools, e.g. tree-based 
• Wait-free pools: array-based, simple yet inefficient 
Why are wait-free pools hard to design? 
Common wait-free paradigms require dynamic memory 
allocation … 
Array-based pools 
Array-based wait-free pools 
• Consists of array holding atomic reservation flags 
• Threads traverse reservation array from the beginning 
and try to reserve a flag atomically (CAS) 
• Index of successfully toggled flag is acquired element index 
• Worst-case complexity: O(n) 
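The array-based scheme can be sketched as follows. This is a minimal sketch under assumptions, not the EMBB implementation: the class name `ArrayPool`, the sentinel `kNone` and the index-based signatures of `RemoveAny` and `Add` are chosen for illustration, and `std::atomic<bool>` is assumed to be lock-free on the target platform.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Sketch of an array-based pool: one atomic reservation flag per element.
template <std::size_t N>
class ArrayPool {
 public:
  static constexpr int kNone = -1;  // hypothetical "pool exhausted" marker

  ArrayPool() {
    for (auto& flag : reserved_) flag.store(false, std::memory_order_relaxed);
  }

  // Traverses the flags from the beginning and tries to reserve one via CAS.
  // At most N CAS attempts are made, so the call completes in O(N) steps
  // no matter what other threads do. Returns the acquired index or kNone.
  int RemoveAny() {
    for (std::size_t i = 0; i < N; ++i) {
      bool expected = false;
      if (reserved_[i].compare_exchange_strong(expected, true)) {
        return static_cast<int>(i);
      }
    }
    return kNone;
  }

  // Returns element `index` to the pool by clearing its reservation flag.
  void Add(int index) {
    assert(index >= 0 && static_cast<std::size_t>(index) < N);
    reserved_[index].store(false);
  }

 private:
  std::atomic<bool> reserved_[N];  // fixed size, no dynamic allocation
};
```

The flags live in a fixed-size array inside the object, so the pool itself can be placed in static or automatic memory.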
Compartment pool 
Wait-free pool with thread-specific compartments 
• Array-based pool with additional range of elements that 
can only be acquired by a specific thread 
• Threads acquire elements from their private compartment 
first 
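A possible layout for such a pool is sketched below. It is an assumption-laden illustration rather than the thesis code: the name `CompartmentPool`, its template parameters and the decision to place the private compartments in front of the shared range are all hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Sketch: flags [t * PerThread, (t + 1) * PerThread) form the private
// compartment of thread t, the remaining flags are shared by all threads.
template <std::size_t Threads, std::size_t PerThread, std::size_t Shared>
class CompartmentPool {
 public:
  static constexpr int kNone = -1;  // hypothetical "pool exhausted" marker

  CompartmentPool() {
    for (auto& flag : reserved_) flag.store(false, std::memory_order_relaxed);
  }

  // Tries the caller's private compartment first, then the shared range.
  int RemoveAny(std::size_t thread_id) {
    assert(thread_id < Threads);
    const std::size_t begin = thread_id * PerThread;
    const int index = Scan(begin, begin + PerThread);
    return index != kNone ? index : Scan(kSharedBegin, kSize);
  }

  // Any thread may return an element, so the flags stay atomic.
  void Add(int index) {
    assert(index >= 0 && static_cast<std::size_t>(index) < kSize);
    reserved_[index].store(false);
  }

 private:
  static constexpr std::size_t kSharedBegin = Threads * PerThread;
  static constexpr std::size_t kSize = kSharedBegin + Shared;
  static_assert(kSize > 0, "pool must hold at least one element");

  // Bounded scan over [begin, end): at most (end - begin) CAS attempts.
  int Scan(std::size_t begin, std::size_t end) {
    for (std::size_t i = begin; i < end; ++i) {
      bool expected = false;
      if (reserved_[i].compare_exchange_strong(expected, true)) {
        return static_cast<int>(i);
      }
    }
    return kNone;
  }

  std::atomic<bool> reserved_[kSize];
};
```

Because no other thread acquires from a private compartment, the first scan does not compete with other acquirers; contention is limited to the shared range.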
Wait-free data structures: 
Pools - Evaluation 
Queues 
Queues: 
Related work 
Kogan and Petrank presented the first wait-free queue for 
multiple enqueuers and dequeuers 
Wait-Free Queues With Multiple Enqueuers and Dequeuers (Kogan, Petrank, 2011) 
- Implemented in Java 
- Relying on garbage collection 
- Requires monotonic counter (phase) 
Kogan-Petrank queue 
Adapting the Kogan-Petrank wait-free queue 
Redesign helping scheme to remove phase counter 
• In original publication, new phase value is greater than all 
phases of any announced operation (including non-pending) 
• Modification: Help all other pending operations first
• This may help operations that are newer than the thread's
own operation
• Fairness is maintained: all other threads are guaranteed 
to help this thread’s operation before engaging in their own 
Memory reclamation 
Hazard pointers scheme typically presented as a solution 
Hazard pointers: Safe memory reclamation for lock-free objects (Michael, 2004) 
Introduce hazard pointers 
Step 1: Find upper memory bound for hazard pointers 
Step 2: Guard queue nodes using hazard pointers 
Culprit: Guarding is not wait-free 
pointer p = node.Next;
// -- node.Next may change here --
while (hp.GuardPointer(p) && p != node.Next) {
  // Guard is stale: release it, re-read node.Next and retry.
  // The number of retries is unbounded.
  hp.ReleaseGuard(p);
  p = node.Next;
}
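Written out in C++ with `std::atomic`, the guard-and-validate step might look like the sketch below. `Node`, `HazardSlot` and `GuardNext` are hypothetical names chosen for illustration and are not taken from the EMBB sources; a single hazard slot is assumed for brevity. The `retries` counter makes the problem visible: its value depends on how often other threads change `node.next`, and nothing in the loop bounds it.

```cpp
#include <atomic>
#include <cassert>

struct Node {
  explicit Node(int v) : value(v) {}
  int value;
  std::atomic<Node*> next{nullptr};
};

// Hypothetical hazard pointer slot; a real scheme keeps several per thread.
struct HazardSlot {
  std::atomic<Node*> guarded{nullptr};
};

// Publishes a guard for the current successor of `node` and returns it.
// `retries` reports how often the guard had to be re-published.
Node* GuardNext(Node& node, HazardSlot& slot, unsigned& retries) {
  retries = 0;
  Node* p = node.next.load();
  for (;;) {
    slot.guarded.store(p);             // announce: p must not be reclaimed
    Node* current = node.next.load();  // validate after the announcement
    if (current == p) return p;        // the guard was published in time
    p = current;                       // stale guard: retry with new value
    ++retries;                         // no upper bound on this counter
  }
}
```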
Fortunately, retry loops can be avoided in the Kogan-Petrank
queue, but the implementation is not trivial;
see the implementation at
https://github.com/fuchsto/embb/tree/benchmark/
Queues - Evaluation 
Queue benchmark scenarios 
In addition to scenarios for bag semantics 
• Buffer latency 
Elements are enqueued with the current timestamp; the difference
from the timestamp at dequeue is the buffer latency
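The buffer latency measurement can be sketched as follows. This is an illustration under assumptions, not the benchmark code of the thesis: `Stamped`, `EnqueueStamped` and `DequeueLatency` are hypothetical names, and a sequential `std::queue` stands in for the concurrent queue under test.

```cpp
#include <cassert>
#include <chrono>
#include <queue>

using Clock = std::chrono::steady_clock;  // monotonic, suited for intervals

// Queue element carrying the time at which it was enqueued.
struct Stamped {
  int payload;
  Clock::time_point enqueued_at;
};

// Enqueues `payload` together with the current timestamp.
void EnqueueStamped(std::queue<Stamped>& q, int payload) {
  q.push(Stamped{payload, Clock::now()});
}

// Dequeues one element and returns its buffer latency, i.e. the time
// the element spent inside the queue.
Clock::duration DequeueLatency(std::queue<Stamped>& q) {
  assert(!q.empty());
  const Stamped element = q.front();
  q.pop();
  return Clock::now() - element.enqueued_at;
}
```

In a concurrent benchmark the enqueue and dequeue would run on different threads, so the clock must be consistent across cores for the difference to be meaningful.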
Stacks 
Stacks: 
Related work 
Fatourou presented a wait-free “universal” construction
that is applicable to stacks
A highly efficient universal construction (Fatourou, 2011)
Elimination stack 
Fatourou’s universal construction SIM 
A highly efficient universal construction (Fatourou, 2011) 
Principle 
• Optimized helping scheme 
• Threads apply operations to a local copy of the stack 
• Every thread tries to replace the global shared object with 
its local copy via CAS 
• Only applicable for shared objects with small state 
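The copy-then-CAS step can be illustrated with a deliberately tiny shared object. The sketch below is not Fatourou's algorithm: it omits the helping scheme entirely and only shows why small state matters, namely that the whole object fits into one word that a single CAS can replace. `PackedStack`, `PushLocal`, `PopLocal`, `TryPush` and `TryPop` are hypothetical names, and `std::atomic<std::uint64_t>` is assumed to be lock-free on the target.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Tiny stack packed into one 64-bit word: byte 0 holds the size,
// bytes 1..7 hold up to seven one-byte elements.
using PackedStack = std::uint64_t;

unsigned Size(PackedStack s) { return static_cast<unsigned>(s & 0xFF); }

// Applies a push to a local copy and returns the new state.
PackedStack PushLocal(PackedStack s, std::uint8_t value) {
  const unsigned n = Size(s);
  assert(n < 7);
  return (s | (static_cast<PackedStack>(value) << (8 * (n + 1)))) + 1;
}

// Applies a pop to a local copy, stores the top element in *out
// and returns the new state.
PackedStack PopLocal(PackedStack s, std::uint8_t* out) {
  const unsigned n = Size(s);
  assert(n > 0);
  *out = static_cast<std::uint8_t>(s >> (8 * n));
  return (s & ~(static_cast<PackedStack>(0xFF) << (8 * n))) - 1;
}

// One attempt to replace the shared object with the modified local copy.
// Returns false if another thread changed the object in the meantime.
bool TryPush(std::atomic<PackedStack>& shared, std::uint8_t value) {
  PackedStack seen = shared.load();
  const PackedStack updated = PushLocal(seen, value);
  return shared.compare_exchange_strong(seen, updated);
}

bool TryPop(std::atomic<PackedStack>& shared, std::uint8_t* out) {
  PackedStack seen = shared.load();
  const PackedStack updated = PopLocal(seen, out);
  return shared.compare_exchange_strong(seen, updated);
}
```

A failed attempt here simply reports failure; in a universal construction with helping, a thread whose CAS fails can rely on the winning thread having applied its announced operation as well.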
Elimination 
• Push and Pop are inverse operations:
Pop(Push(stack, e)) = stack
• Eliminated operations are completed immediately
if they do not alter the object's state
→ Significantly improves performance where applicable
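One way to picture elimination is a thread applying a batch of announced operations to its local copy of the stack. The sketch below is an assumption about how such a batch could be processed, not the scheme from the publication: `Op` and `ApplyWithElimination` are hypothetical names, and a `std::vector<int>` stands in for the local stack copy.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One announced operation. For a push, `value` is the argument;
// for a pop, `value` receives the result.
struct Op {
  bool is_push;
  int value;
  bool eliminated;
};

// Applies `batch` in order to `stack`. A pop that follows a not yet
// applied push takes that push's value directly, so neither operation
// touches the stack. Returns the number of eliminated push/pop pairs.
std::size_t ApplyWithElimination(std::vector<int>& stack,
                                 std::vector<Op>& batch) {
  std::vector<Op*> pending_pushes;
  std::size_t pairs = 0;
  for (Op& op : batch) {
    if (op.is_push) {
      pending_pushes.push_back(&op);   // defer: it may still be eliminated
    } else if (!pending_pushes.empty()) {
      Op* push = pending_pushes.back();
      pending_pushes.pop_back();
      op.value = push->value;          // the pop returns the pushed value
      op.eliminated = true;
      push->eliminated = true;
      ++pairs;
    } else {
      assert(!stack.empty());
      op.value = stack.back();         // regular pop from the stack
      stack.pop_back();
    }
  }
  // Pushes that found no matching pop are applied to the stack.
  for (Op* push : pending_pushes) stack.push_back(push->value);
  return pairs;
}
```

The result is the same as applying the operations one by one, but every eliminated pair saves two modifications of the stack.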
Original version is not suitable for real-time applications: 
- ABA problem is prevented using tagged pointers 
- Thread-local pools with unbounded capacity 
- No deallocation in published algorithm 
Modified version of Fatourou’s stack 
- Uses hazard pointers for safe reclamation 
- Uses compartment pool with limited capacity 
- Employs the elimination scheme from the original 
publication 
Stacks: 
Evaluation 
Task scheduling 
Task Scheduling: 
Objectives 
Task Scheduling 
• Intra-process task scheduling with priority queues 
• Low-overhead, fine-grained scheduling of thousands of
small tasks
→ Priorities:
Focus on low latency and jitter reduction (i.e. predictability),
treating maximum throughput as a secondary benchmark.
Task scheduling: 
Work stealing 
• One worker thread per SMP core, no migration
• Tasks passed as &func
• Load balancing on task queues
• Many flavors of concrete implementations
Work stealing with task priorities 
• Work stealing extended by a separate queue for every
priority
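The selection order in a prioritized work-stealing scheduler could look like the sketch below. It is an assumption, not the EMBB scheduler: `Worker`, `NextTask` and `kPriorities` are hypothetical names, priority 0 is assumed to be the highest, the policy "exhaust a priority level on all workers before moving to the next" is one possible choice, and sequential `std::deque` objects stand in for the concurrent task queues.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

using Task = void (*)();                 // tasks passed as &func
constexpr std::size_t kPriorities = 2;   // assumed: 0 is the highest

// Each worker owns one task queue per priority.
struct Worker {
  std::array<std::deque<Task>, kPriorities> queues;
};

// Returns the next task for worker `self`, or nullptr if no task exists.
// For each priority level, the worker's own queue is tried first,
// then the queues of the other workers are searched for a task to steal.
Task NextTask(std::vector<Worker>& workers, std::size_t self) {
  assert(self < workers.size());
  for (std::size_t prio = 0; prio < kPriorities; ++prio) {
    std::deque<Task>& own = workers[self].queues[prio];
    if (!own.empty()) {
      Task task = own.back();            // owner takes the newest task
      own.pop_back();
      return task;
    }
    for (std::size_t victim = 0; victim < workers.size(); ++victim) {
      if (victim == self) continue;
      std::deque<Task>& other = workers[victim].queues[prio];
      if (!other.empty()) {
        Task task = other.front();       // thief takes the oldest task
        other.pop_front();
        return task;
      }
    }
  }
  return nullptr;
}
```

With this policy a worker steals a high-priority task from another core before it runs a low-priority task of its own, which favors latency of urgent tasks over locality.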
Conclusion 
Revisiting the objective 
• Wait-free implementations of pools, queues and stacks now 
available for real-time applications 
• Benchmark framework and evaluation tools (R) are 
published as open source 
• Reproducible evaluation of real-time performance 
• Verification tool chain on the way 
Recommendations 
• Wait-free data structures can rival performance of lock-free 
implementations 
• But are hard to maintain 
• Formal wait-freedom is practically not achievable 
Employ wait-free data structures for fault-tolerance, not as a 
guarantee for critical deadlines 
Thank You 
Source code (data structures, benchmarks, R scripts): 
https://github.com/fuchsto/embb/tree/benchmark/ 
Official development source base of embb: 
https://github.com/siemens/embb/tree/development/ 
Wiki to this thesis: 
http://wiki.coreglit.ch 
Task Scheduling Algorithms and Wait-Free Data Structures for Embedded Multi-Core Systems 46

More Related Content

What's hot

4838281 operating-system-scheduling-on-multicore-architectures
4838281 operating-system-scheduling-on-multicore-architectures4838281 operating-system-scheduling-on-multicore-architectures
4838281 operating-system-scheduling-on-multicore-architecturesIslam Samir
 
Optimization of Remote Core Locking Synchronization in Multithreaded Programs...
Optimization of Remote Core Locking Synchronization in Multithreaded Programs...Optimization of Remote Core Locking Synchronization in Multithreaded Programs...
Optimization of Remote Core Locking Synchronization in Multithreaded Programs...
ITIIIndustries
 
Bounded ant colony algorithm for task Allocation on a network of homogeneous ...
Bounded ant colony algorithm for task Allocation on a network of homogeneous ...Bounded ant colony algorithm for task Allocation on a network of homogeneous ...
Bounded ant colony algorithm for task Allocation on a network of homogeneous ...ijcsit
 
NETWORK-AWARE DATA PREFETCHING OPTIMIZATION OF COMPUTATIONS IN A HETEROGENEOU...
NETWORK-AWARE DATA PREFETCHING OPTIMIZATION OF COMPUTATIONS IN A HETEROGENEOU...NETWORK-AWARE DATA PREFETCHING OPTIMIZATION OF COMPUTATIONS IN A HETEROGENEOU...
NETWORK-AWARE DATA PREFETCHING OPTIMIZATION OF COMPUTATIONS IN A HETEROGENEOU...
IJCNCJournal
 
Association Rule Mining Using WEKA
Association Rule Mining Using WEKAAssociation Rule Mining Using WEKA
Association Rule Mining Using WEKA
Prothoma Diteeya
 
Static Load Balancing of Parallel Mining Efficient Algorithm with PBEC in Fre...
Static Load Balancing of Parallel Mining Efficient Algorithm with PBEC in Fre...Static Load Balancing of Parallel Mining Efficient Algorithm with PBEC in Fre...
Static Load Balancing of Parallel Mining Efficient Algorithm with PBEC in Fre...
IRJET Journal
 
Efficient Resource Management Mechanism with Fault Tolerant Model for Computa...
Efficient Resource Management Mechanism with Fault Tolerant Model for Computa...Efficient Resource Management Mechanism with Fault Tolerant Model for Computa...
Efficient Resource Management Mechanism with Fault Tolerant Model for Computa...
Editor IJCATR
 
Data Analytics and Simulation in Parallel with MATLAB*
Data Analytics and Simulation in Parallel with MATLAB*Data Analytics and Simulation in Parallel with MATLAB*
Data Analytics and Simulation in Parallel with MATLAB*
Intel® Software
 
226 team project-report-manjula kollipara
226 team project-report-manjula kollipara226 team project-report-manjula kollipara
226 team project-report-manjula kollipara
Manjula Kollipara
 
[Paper Reading]Orca: A Modular Query Optimizer Architecture for Big Data
[Paper Reading]Orca: A Modular Query Optimizer Architecture for Big Data[Paper Reading]Orca: A Modular Query Optimizer Architecture for Big Data
[Paper Reading]Orca: A Modular Query Optimizer Architecture for Big Data
PingCAP
 
Thilaganga mphil cs viva presentation ppt
Thilaganga mphil cs viva presentation pptThilaganga mphil cs viva presentation ppt
Thilaganga mphil cs viva presentation ppt
thilaganga
 
Implementation of linear regression and logistic regression on Spark
Implementation of linear regression and logistic regression on SparkImplementation of linear regression and logistic regression on Spark
Implementation of linear regression and logistic regression on Spark
Dalei Li
 

What's hot (14)

4838281 operating-system-scheduling-on-multicore-architectures
4838281 operating-system-scheduling-on-multicore-architectures4838281 operating-system-scheduling-on-multicore-architectures
4838281 operating-system-scheduling-on-multicore-architectures
 
Optimization of Remote Core Locking Synchronization in Multithreaded Programs...
Optimization of Remote Core Locking Synchronization in Multithreaded Programs...Optimization of Remote Core Locking Synchronization in Multithreaded Programs...
Optimization of Remote Core Locking Synchronization in Multithreaded Programs...
 
Bounded ant colony algorithm for task Allocation on a network of homogeneous ...
Bounded ant colony algorithm for task Allocation on a network of homogeneous ...Bounded ant colony algorithm for task Allocation on a network of homogeneous ...
Bounded ant colony algorithm for task Allocation on a network of homogeneous ...
 
NETWORK-AWARE DATA PREFETCHING OPTIMIZATION OF COMPUTATIONS IN A HETEROGENEOU...
NETWORK-AWARE DATA PREFETCHING OPTIMIZATION OF COMPUTATIONS IN A HETEROGENEOU...NETWORK-AWARE DATA PREFETCHING OPTIMIZATION OF COMPUTATIONS IN A HETEROGENEOU...
NETWORK-AWARE DATA PREFETCHING OPTIMIZATION OF COMPUTATIONS IN A HETEROGENEOU...
 
Association Rule Mining Using WEKA
Association Rule Mining Using WEKAAssociation Rule Mining Using WEKA
Association Rule Mining Using WEKA
 
Static Load Balancing of Parallel Mining Efficient Algorithm with PBEC in Fre...
Static Load Balancing of Parallel Mining Efficient Algorithm with PBEC in Fre...Static Load Balancing of Parallel Mining Efficient Algorithm with PBEC in Fre...
Static Load Balancing of Parallel Mining Efficient Algorithm with PBEC in Fre...
 
Efficient Resource Management Mechanism with Fault Tolerant Model for Computa...
Efficient Resource Management Mechanism with Fault Tolerant Model for Computa...Efficient Resource Management Mechanism with Fault Tolerant Model for Computa...
Efficient Resource Management Mechanism with Fault Tolerant Model for Computa...
 
Data Analytics and Simulation in Parallel with MATLAB*
Data Analytics and Simulation in Parallel with MATLAB*Data Analytics and Simulation in Parallel with MATLAB*
Data Analytics and Simulation in Parallel with MATLAB*
 
226 team project-report-manjula kollipara
226 team project-report-manjula kollipara226 team project-report-manjula kollipara
226 team project-report-manjula kollipara
 
[Paper Reading]Orca: A Modular Query Optimizer Architecture for Big Data
[Paper Reading]Orca: A Modular Query Optimizer Architecture for Big Data[Paper Reading]Orca: A Modular Query Optimizer Architecture for Big Data
[Paper Reading]Orca: A Modular Query Optimizer Architecture for Big Data
 
nn network
nn networknn network
nn network
 
4 026
4 0264 026
4 026
 
Thilaganga mphil cs viva presentation ppt
Thilaganga mphil cs viva presentation pptThilaganga mphil cs viva presentation ppt
Thilaganga mphil cs viva presentation ppt
 
Implementation of linear regression and logistic regression on Spark
Implementation of linear regression and logistic regression on SparkImplementation of linear regression and logistic regression on Spark
Implementation of linear regression and logistic regression on Spark
 

Viewers also liked

TASK SCHEDULING ON ADAPTIVE MULTI-CORE
TASK SCHEDULING ON ADAPTIVE MULTI-CORETASK SCHEDULING ON ADAPTIVE MULTI-CORE
TASK SCHEDULING ON ADAPTIVE MULTI-CORE
Haris Muhammed
 
Multicore scheduling in automotive ECUs
Multicore scheduling in automotive ECUsMulticore scheduling in automotive ECUs
Multicore scheduling in automotive ECUs
RealTime-at-Work (RTaW)
 
Lock free algorithms
Lock free algorithmsLock free algorithms
Lock free algorithmsPan Ip
 
AN EFFICIENT HYBRID SCHEDULER USING DYNAMIC SLACK FOR REAL-TIME CRITICAL TASK...
AN EFFICIENT HYBRID SCHEDULER USING DYNAMIC SLACK FOR REAL-TIME CRITICAL TASK...AN EFFICIENT HYBRID SCHEDULER USING DYNAMIC SLACK FOR REAL-TIME CRITICAL TASK...
AN EFFICIENT HYBRID SCHEDULER USING DYNAMIC SLACK FOR REAL-TIME CRITICAL TASK...ijesajournal
 
C# Parallel programming
C# Parallel programmingC# Parallel programming
C# Parallel programmingUmeshwaran V
 
Partitioning CCGrid 2012
Partitioning CCGrid 2012Partitioning CCGrid 2012
Partitioning CCGrid 2012
Weiwei Chen
 
Sara Afshar: Scheduling and Resource Sharing in Multiprocessor Real-Time Systems
Sara Afshar: Scheduling and Resource Sharing in Multiprocessor Real-Time SystemsSara Afshar: Scheduling and Resource Sharing in Multiprocessor Real-Time Systems
Sara Afshar: Scheduling and Resource Sharing in Multiprocessor Real-Time Systems
knowdiff
 
Real time system in Multicore/Multiprocessor system
Real time system in Multicore/Multiprocessor systemReal time system in Multicore/Multiprocessor system
Real time system in Multicore/Multiprocessor system
Mayank Garg
 
Lock-Free, Wait-Free Hash Table
Lock-Free, Wait-Free Hash TableLock-Free, Wait-Free Hash Table
Lock-Free, Wait-Free Hash Table
Azul Systems Inc.
 
Critical Chain Project Management
Critical Chain Project ManagementCritical Chain Project Management
Critical Chain Project Management
Fred Wiersma
 

Viewers also liked (10)

TASK SCHEDULING ON ADAPTIVE MULTI-CORE
TASK SCHEDULING ON ADAPTIVE MULTI-CORETASK SCHEDULING ON ADAPTIVE MULTI-CORE
TASK SCHEDULING ON ADAPTIVE MULTI-CORE
 
Multicore scheduling in automotive ECUs
Multicore scheduling in automotive ECUsMulticore scheduling in automotive ECUs
Multicore scheduling in automotive ECUs
 
Lock free algorithms
Lock free algorithmsLock free algorithms
Lock free algorithms
 
AN EFFICIENT HYBRID SCHEDULER USING DYNAMIC SLACK FOR REAL-TIME CRITICAL TASK...
AN EFFICIENT HYBRID SCHEDULER USING DYNAMIC SLACK FOR REAL-TIME CRITICAL TASK...AN EFFICIENT HYBRID SCHEDULER USING DYNAMIC SLACK FOR REAL-TIME CRITICAL TASK...
AN EFFICIENT HYBRID SCHEDULER USING DYNAMIC SLACK FOR REAL-TIME CRITICAL TASK...
 
C# Parallel programming
C# Parallel programmingC# Parallel programming
C# Parallel programming
 
Partitioning CCGrid 2012
Partitioning CCGrid 2012Partitioning CCGrid 2012
Partitioning CCGrid 2012
 
Sara Afshar: Scheduling and Resource Sharing in Multiprocessor Real-Time Systems
Sara Afshar: Scheduling and Resource Sharing in Multiprocessor Real-Time SystemsSara Afshar: Scheduling and Resource Sharing in Multiprocessor Real-Time Systems
Sara Afshar: Scheduling and Resource Sharing in Multiprocessor Real-Time Systems
 
Real time system in Multicore/Multiprocessor system
Real time system in Multicore/Multiprocessor systemReal time system in Multicore/Multiprocessor system
Real time system in Multicore/Multiprocessor system
 
Lock-Free, Wait-Free Hash Table
Lock-Free, Wait-Free Hash TableLock-Free, Wait-Free Hash Table
Lock-Free, Wait-Free Hash Table
 
Critical Chain Project Management
Critical Chain Project ManagementCritical Chain Project Management
Critical Chain Project Management
 

Similar to Wait-free data structures on embedded multi-core systems

Intro_2.ppt
Intro_2.pptIntro_2.ppt
Intro_2.ppt
MumitAhmed1
 
Intro.ppt
Intro.pptIntro.ppt
Intro.ppt
SharabiNaif
 
Intro.ppt
Intro.pptIntro.ppt
Intro.ppt
Anonymous9etQKwW
 
Automating materials science workflows with pymatgen, FireWorks, and atomate
Automating materials science workflows with pymatgen, FireWorks, and atomateAutomating materials science workflows with pymatgen, FireWorks, and atomate
Automating materials science workflows with pymatgen, FireWorks, and atomate
Anubhav Jain
 
Real-Time Inverted Search NYC ASLUG Oct 2014
Real-Time Inverted Search NYC ASLUG Oct 2014Real-Time Inverted Search NYC ASLUG Oct 2014
Real-Time Inverted Search NYC ASLUG Oct 2014
Bryan Bende
 
Real-time Inverted Search in the Cloud Using Lucene and Storm
Real-time Inverted Search in the Cloud Using Lucene and StormReal-time Inverted Search in the Cloud Using Lucene and Storm
Real-time Inverted Search in the Cloud Using Lucene and Storm
lucenerevolution
 
Mike Bartley - Innovations for Testing Parallel Software - EuroSTAR 2012
Mike Bartley - Innovations for Testing Parallel Software - EuroSTAR 2012Mike Bartley - Innovations for Testing Parallel Software - EuroSTAR 2012
Mike Bartley - Innovations for Testing Parallel Software - EuroSTAR 2012
TEST Huddle
 
Strata + Hadoop 2015 Slides
Strata + Hadoop 2015 SlidesStrata + Hadoop 2015 Slides
Strata + Hadoop 2015 SlidesJun Liu
 
Public vs. Private Cloud Performance by Flex
Public vs. Private Cloud Performance by FlexPublic vs. Private Cloud Performance by Flex
Public vs. Private Cloud Performance by Flex
StackIQ
 
Challenges on Distributed Machine Learning
Challenges on Distributed Machine LearningChallenges on Distributed Machine Learning
Challenges on Distributed Machine Learning
jie cao
 
참여기관_발표자료-국민대학교 201301 정기회의
참여기관_발표자료-국민대학교 201301 정기회의참여기관_발표자료-국민대학교 201301 정기회의
참여기관_발표자료-국민대학교 201301 정기회의DzH QWuynh
 
CS 301 Computer ArchitectureStudent # 1 EID 09Kingdom of .docx
CS 301 Computer ArchitectureStudent # 1 EID 09Kingdom of .docxCS 301 Computer ArchitectureStudent # 1 EID 09Kingdom of .docx
CS 301 Computer ArchitectureStudent # 1 EID 09Kingdom of .docx
faithxdunce63732
 
An application classification guided cache tuning heuristic for
An application classification guided cache tuning heuristic forAn application classification guided cache tuning heuristic for
An application classification guided cache tuning heuristic for
Khyati Rajput
 
Exploring Oracle Database Performance Tuning Best Practices for DBAs and Deve...
Exploring Oracle Database Performance Tuning Best Practices for DBAs and Deve...Exploring Oracle Database Performance Tuning Best Practices for DBAs and Deve...
Exploring Oracle Database Performance Tuning Best Practices for DBAs and Deve...Aaron Shilo
 
Model Based Test Validation and Oracles for Data Acquisition Systems
Model Based Test Validation and Oracles for Data Acquisition SystemsModel Based Test Validation and Oracles for Data Acquisition Systems
Model Based Test Validation and Oracles for Data Acquisition Systems
Lionel Briand
 
Simulation of Heterogeneous Cloud Infrastructures
Simulation of Heterogeneous Cloud InfrastructuresSimulation of Heterogeneous Cloud Infrastructures
Simulation of Heterogeneous Cloud Infrastructures
CloudLightning
 
Overview of Scientific Workflows - Why Use Them?
Overview of Scientific Workflows - Why Use Them?Overview of Scientific Workflows - Why Use Them?
Overview of Scientific Workflows - Why Use Them?
inside-BigData.com
 
Cs 331 Data Structures
Cs 331 Data StructuresCs 331 Data Structures
Chap 2 classification of parralel architecture and introduction to parllel p...
Chap 2  classification of parralel architecture and introduction to parllel p...Chap 2  classification of parralel architecture and introduction to parllel p...
Chap 2 classification of parralel architecture and introduction to parllel p...
Malobe Lottin Cyrille Marcel
 

Similar to Wait-free data structures on embedded multi-core systems (20)

Intro_2.ppt
Intro_2.pptIntro_2.ppt
Intro_2.ppt
 
Intro.ppt
Intro.pptIntro.ppt
Intro.ppt
 
Intro.ppt
Intro.pptIntro.ppt
Intro.ppt
 
Automating materials science workflows with pymatgen, FireWorks, and atomate
Automating materials science workflows with pymatgen, FireWorks, and atomateAutomating materials science workflows with pymatgen, FireWorks, and atomate
Automating materials science workflows with pymatgen, FireWorks, and atomate
 
Real-Time Inverted Search NYC ASLUG Oct 2014
Real-Time Inverted Search NYC ASLUG Oct 2014Real-Time Inverted Search NYC ASLUG Oct 2014
Real-Time Inverted Search NYC ASLUG Oct 2014
 
Real-time Inverted Search in the Cloud Using Lucene and Storm
Real-time Inverted Search in the Cloud Using Lucene and StormReal-time Inverted Search in the Cloud Using Lucene and Storm
Real-time Inverted Search in the Cloud Using Lucene and Storm
 
Mike Bartley - Innovations for Testing Parallel Software - EuroSTAR 2012
Mike Bartley - Innovations for Testing Parallel Software - EuroSTAR 2012Mike Bartley - Innovations for Testing Parallel Software - EuroSTAR 2012
Mike Bartley - Innovations for Testing Parallel Software - EuroSTAR 2012
 
Strata + Hadoop 2015 Slides
Strata + Hadoop 2015 SlidesStrata + Hadoop 2015 Slides
Strata + Hadoop 2015 Slides
 
HPPS 2008 - Maesani Moro
HPPS 2008 - Maesani MoroHPPS 2008 - Maesani Moro
HPPS 2008 - Maesani Moro
 
Public vs. Private Cloud Performance by Flex
Public vs. Private Cloud Performance by FlexPublic vs. Private Cloud Performance by Flex
Public vs. Private Cloud Performance by Flex
 
Challenges on Distributed Machine Learning
Challenges on Distributed Machine LearningChallenges on Distributed Machine Learning
Challenges on Distributed Machine Learning
 
참여기관_발표자료-국민대학교 201301 정기회의
참여기관_발표자료-국민대학교 201301 정기회의참여기관_발표자료-국민대학교 201301 정기회의
참여기관_발표자료-국민대학교 201301 정기회의
 
CS 301 Computer ArchitectureStudent # 1 EID 09Kingdom of .docx
CS 301 Computer ArchitectureStudent # 1 EID 09Kingdom of .docxCS 301 Computer ArchitectureStudent # 1 EID 09Kingdom of .docx
CS 301 Computer ArchitectureStudent # 1 EID 09Kingdom of .docx
 
An application classification guided cache tuning heuristic for
An application classification guided cache tuning heuristic forAn application classification guided cache tuning heuristic for
An application classification guided cache tuning heuristic for
 
Exploring Oracle Database Performance Tuning Best Practices for DBAs and Deve...
Exploring Oracle Database Performance Tuning Best Practices for DBAs and Deve...Exploring Oracle Database Performance Tuning Best Practices for DBAs and Deve...
Exploring Oracle Database Performance Tuning Best Practices for DBAs and Deve...
 
Model Based Test Validation and Oracles for Data Acquisition Systems
Model Based Test Validation and Oracles for Data Acquisition SystemsModel Based Test Validation and Oracles for Data Acquisition Systems
Model Based Test Validation and Oracles for Data Acquisition Systems
 
Simulation of Heterogeneous Cloud Infrastructures
Simulation of Heterogeneous Cloud InfrastructuresSimulation of Heterogeneous Cloud Infrastructures
Simulation of Heterogeneous Cloud Infrastructures
 
Overview of Scientific Workflows - Why Use Them?
Overview of Scientific Workflows - Why Use Them?Overview of Scientific Workflows - Why Use Them?
Overview of Scientific Workflows - Why Use Them?
 
Cs 331 Data Structures
Cs 331 Data StructuresCs 331 Data Structures
Cs 331 Data Structures
 
Chap 2 classification of parralel architecture and introduction to parllel p...
Chap 2  classification of parralel architecture and introduction to parllel p...Chap 2  classification of parralel architecture and introduction to parllel p...
Chap 2 classification of parralel architecture and introduction to parllel p...
 


Wait-free data structures on embedded multi-core systems

  • 1. Tobias Fuchs: Evaluation of Task Scheduling Algorithms and Wait-Free Data Structures for Embedded Multi-Core Systems
    • Master's thesis presentation
    • Supervisor: Prof. Dr. Dieter Kranzlmüller
    • Advisors: Dr. Karl Fürlinger (LMU), Dr. Tobias Schüle (Siemens CT)
    • Date of presentation: 5 November 2014
  • 2. Structure of this talk
    1. Introduction
       1. Motivation
       2. Problem Statement and Objectives
    2. Wait-free data structures
       1. Foundations
       2. Pools
       3. Queues
       4. Stacks
    3. Task Scheduling
       1. Work stealing
       2. Prioritized work stealing in EMBB
    4. Conclusion
  • 3. Wait-freedom: Motivation
  • 4. Motivation
    Wait-free algorithms
    • Strongest possible fault tolerance
    • Guarantee progress and an upper bound on execution time
    Gains:
    + Progress is potentially a formal constraint in real-time computing
    + Wait-freedom eliminates the classic concurrency problems: deadlocks, priority inversion, convoying, kill-intolerance
  • 5. Problem statement
    State of the art: no suitable wait-free data structures for embedded systems
    • Employing mechanisms such as garbage collection
    • Not designed for restricted resources
    • No evaluation for latency
    Challenges:
    - Transforming data structures into wait-free equivalents is non-trivial and usually means a from-scratch redesign
    - Implementations depend on the platform architecture
  • 6. Objectives
    1. Review and evaluation of state-of-the-art approaches for suitability on embedded systems
    2. Real-time compliant implementations of wait-free data structures
    3. Definition, implementation and evaluation of suitable benchmark scenarios for wait-free data structures and task scheduling algorithms
    + Automated verification derived from semantic definition
  • 7. Foundations
  • 8. Progress conditions
    Classification of progress; see: On the Nature of Progress (Herlihy, Shavit, 2011)
  • 9. Real-time requirements
    Performance priorities on real-time systems: guarantees on worst-case runtime behavior
    • Aim for latency and jitter reduction, neglecting throughput
    • Avoid non-determinism, as in malloc / new (see: MISRA)
  • 10. Evaluation methodology
    Real-time applications are designed to optimize latency. Related work does not evaluate latency, but only mean or median throughput.
    Evaluating worst-case latency is hard:
    • In related work, measurements outside of the 97.5% confidence interval are considered outliers and ignored
    • These outliers are our data
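The point of slide 10 can be illustrated with a small sketch. It is not taken from the thesis benchmark; the names `LatencySummary` and `Summarize` are hypothetical. It summarizes a series of latency samples and keeps the maximum instead of discarding it as an outlier.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical summary type: not part of the thesis benchmark framework.
struct LatencySummary {
  std::uint64_t mean;    // what throughput-oriented evaluations tend to report
  std::uint64_t median;  // robust against outliers, so it hides them
  std::uint64_t worst;   // the value a real-time evaluation cares about
};

// Summarizes latency samples (in nanoseconds) without dropping outliers.
LatencySummary Summarize(std::vector<std::uint64_t> samples_ns) {
  assert(!samples_ns.empty());
  std::sort(samples_ns.begin(), samples_ns.end());
  std::uint64_t sum = std::accumulate(samples_ns.begin(), samples_ns.end(),
                                      std::uint64_t{0});
  return {sum / samples_ns.size(),
          samples_ns[samples_ns.size() / 2],
          samples_ns.back()};
}
```

With the samples 100, 110, 105, 95 and 5000 ns, the median stays at 105 ns while the worst case is 5000 ns, which is exactly the measurement an outlier filter would remove.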
  • 11. Pools
  • 12. Wait-free data structures: Pools
    Pools realize dynamic memory allocation while eliminating heap fragmentation
    • Fundamental data structure of any concurrent container
    • Fixed number of objects in static or automatic memory
    • Pools manage concurrent removal and reclamation of objects
    Operations:
    • RemoveAny(pool, er): remove and return element er
    • Add(pool, e): add element e back to the pool
  • 13. Pools: Related work
    Close to none:
    • Several lock-free pools, e.g. tree-based
    • Wait-free pools: array-based, simple yet inefficient
    Why are wait-free pools hard to design? Common wait-free paradigms require dynamic memory allocation …
  • 14. Array-based pools
    Array-based wait-free pools
    • Consist of an array holding atomic reservation flags
    • Threads traverse the reservation array from the beginning and try to reserve a flag atomically (CAS)
    • The index of the successfully toggled flag is the index of the acquired element
    • Worst-case complexity: O(n)
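The array-based scheme from slide 14 might look roughly like the following. This is a minimal sketch under assumptions, not the EMBB implementation: the class name `ArrayPool`, the index-based interface and the `kNone` sentinel are made up for illustration.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical sketch of an array-based pool; names are not the EMBB API.
template <std::size_t N>
class ArrayPool {
 public:
  static constexpr int kNone = -1;  // assumed sentinel: no free element seen

  ArrayPool() {
    for (auto& flag : reserved_) flag.store(false);
  }

  // Traverse the flags from the beginning. The index of the first flag this
  // thread toggles from free to reserved is the acquired element index.
  // One pass performs at most N CAS attempts, hence the O(n) bound.
  int RemoveAny() {
    for (std::size_t i = 0; i < N; ++i) {
      bool expected = false;
      if (reserved_[i].compare_exchange_strong(expected, true)) {
        return static_cast<int>(i);
      }
    }
    return kNone;
  }

  // Give an element back to the pool by clearing its reservation flag.
  void Add(int index) {
    assert(index >= 0 && static_cast<std::size_t>(index) < N);
    reserved_[static_cast<std::size_t>(index)].store(false);
  }

 private:
  std::array<std::atomic<bool>, N> reserved_;
};
```

Every call finishes after a bounded number of steps regardless of what other threads do, which is what makes the design wait-free; the price is the linear scan.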
  • 15. Compartment pool
    Wait-free pool with thread-specific compartments
    • Array-based pool with an additional range of elements that can only be acquired by a specific thread
    • Threads acquire elements from their private compartment first
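A possible reading of the compartment idea from slide 15, again as a hedged sketch: the memory layout (private compartments first, shared range last), the explicit `thread_id` parameter and the class name `CompartmentPool` are assumptions for illustration and may differ from the thesis implementation.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch; layout assumed here:
// [ compartment of thread 0 | ... | compartment of thread T-1 | shared range ]
class CompartmentPool {
 public:
  static constexpr int kNone = -1;  // assumed sentinel: nothing acquired

  CompartmentPool(std::size_t threads, std::size_t per_thread,
                  std::size_t shared)
      : per_thread_(per_thread),
        shared_begin_(threads * per_thread),
        reserved_(threads * per_thread + shared) {
    for (auto& flag : reserved_) flag.store(false);
  }

  // Try the caller's private compartment first, then the shared range.
  int RemoveAny(std::size_t thread_id) {
    std::size_t begin = thread_id * per_thread_;
    assert(begin + per_thread_ <= shared_begin_);
    int index = Scan(begin, begin + per_thread_);
    if (index != kNone) return index;
    return Scan(shared_begin_, reserved_.size());
  }

  // Give an element back by clearing its reservation flag.
  void Add(int index) {
    assert(index >= 0 && static_cast<std::size_t>(index) < reserved_.size());
    reserved_[static_cast<std::size_t>(index)].store(false);
  }

 private:
  // Bounded scan with one CAS attempt per flag, as in the array-based pool.
  int Scan(std::size_t begin, std::size_t end) {
    for (std::size_t i = begin; i < end; ++i) {
      bool expected = false;
      if (reserved_[i].compare_exchange_strong(expected, true)) {
        return static_cast<int>(i);
      }
    }
    return kNone;
  }

  std::size_t per_thread_;
  std::size_t shared_begin_;
  std::vector<std::atomic<bool>> reserved_;
};
```

As long as a thread finds a free element in its own compartment, it never competes with other threads for the same flag, so the common case avoids scanning the contended shared range.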
  • 16. Wait-free data structures: Pools - Evaluation
  • 17. Wait-free data structures: Pools - Evaluation
  • 18. Wait-free data structures: Pools - Evaluation
  • 19. Queues
  • 20. Queues: Related work
    Kogan and Petrank presented the first wait-free queue for multiple enqueuers and dequeuers: Wait-Free Queues With Multiple Enqueuers and Dequeuers (Kogan, Petrank, 2011)
    - Implemented in Java
    - Relying on garbage collection
    - Requires a monotonic counter (phase)
  • 21. Kogan-Petrank queue
    Adapting the Kogan-Petrank wait-free queue: redesign the helping scheme to remove the phase counter
    • In the original publication, a new phase value is greater than the phases of all announced operations (including non-pending ones)
  • 22. Kogan-Petrank queue
    Adapting the Kogan-Petrank wait-free queue: redesign the helping scheme to remove the phase counter
    • Modification: help all other pending operations first
    • This possibly helps operations that are newer than the thread's own operation
  • 23. Kogan-Petrank queue
    Adapting the Kogan-Petrank wait-free queue: redesign the helping scheme to remove the phase counter
    • Fairness is maintained: all other threads are guaranteed to help this thread's operation before engaging in their own
  • 24. Kogan-Petrank queue
    Adapting the Kogan-Petrank wait-free queue: memory reclamation
    The hazard pointers scheme is typically presented as a solution; see: Hazard pointers: Safe memory reclamation for lock-free objects (Michael, 2004)
  • 25. Kogan-Petrank queue
    Adapting the Kogan-Petrank wait-free queue: introduce hazard pointers
    Step 1: Find an upper memory bound for hazard pointers
    Step 2: Guard queue nodes using hazard pointers
  • 26. Kogan-Petrank queue
    Adapting the Kogan-Petrank wait-free queue: introduce hazard pointers
    Step 2: Guard queue nodes using hazard pointers
    Culprit: guarding is not wait-free

        pointer p = node.Next;
        // node.Next may change between the read above and the guard below
        while (hp.GuardPointer(p) && p != node.Next) {
          // Guard is stale: release and retry, unbounded number of retries
          hp.ReleaseGuard(p);
          p = node.Next;
        }
  • 27. Kogan-Petrank queue
    Adapting the Kogan-Petrank wait-free queue: introduce hazard pointers
    Step 2: Guard queue nodes using hazard pointers
    Culprit: guarding is not wait-free
    Fortunately, retry loops can be avoided in the Kogan-Petrank queue, but the implementation is not trivial; see the implementation at https://github.com/fuchsto/embb/tree/benchmark/
  • 28. Queues - Evaluation
    Queue benchmark scenarios, in addition to the scenarios for bag semantics:
    • Buffer latency: elements are enqueued with the current timestamp; the difference from the timestamp at dequeue is the buffer latency
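The buffer latency scenario from slide 28 can be sketched as follows. This is an illustration only: `std::queue` stands in for the concurrent queue under test, and the names `Stamped`, `Enqueue` and `DequeueLatency` are hypothetical, not part of the published benchmark framework.

```cpp
#include <cassert>
#include <chrono>
#include <queue>

using Clock = std::chrono::steady_clock;  // monotonic, suited for intervals

// Element carrying the time at which it entered the buffer.
struct Stamped {
  int payload;
  Clock::time_point enqueued_at;
};

// Enqueue with the current timestamp. std::queue is a single-threaded
// stand-in for the concurrent queue being benchmarked.
void Enqueue(std::queue<Stamped>& q, int payload) {
  q.push(Stamped{payload, Clock::now()});
}

// Dequeue one element and return how long it stayed in the buffer.
std::chrono::nanoseconds DequeueLatency(std::queue<Stamped>& q) {
  assert(!q.empty());
  Stamped element = q.front();
  q.pop();
  return std::chrono::duration_cast<std::chrono::nanoseconds>(
      Clock::now() - element.enqueued_at);
}
```

In a real benchmark the producer and consumer would run on different threads, so the measured value includes the time an element waits inside the queue, not just the cost of the two operations.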
  • 29. Queues - Evaluation
  • 30. Queues - Evaluation
  • 31. Stacks
  • 32. Stacks: Related work
    Fatourou presented a wait-free "universal" construction that is applicable to stacks: A highly efficient universal construction (Fatourou, 2011)
  • 33. Elimination stack
    Fatourou's universal construction SIM; see: A highly efficient universal construction (Fatourou, 2011)
    Principle
    • Optimized helping scheme
    • Threads apply operations to a local copy of the stack
    • Every thread tries to replace the global shared object with its local copy via CAS
    • Only applicable to shared objects with small state
  • 34. Elimination stack
    Fatourou's universal construction SIM; see: A highly efficient universal construction (Fatourou, 2011)
    Elimination
    • Push and Pop have inverse semantics: Push(Pop(stack)) = Pop(Push(stack)) = stack
    • Eliminated operations are completed immediately if they do not alter the object's state
    Significantly improves performance if applicable
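The elimination idea from slide 34 can be shown with a purely sequential sketch. It is not the SIM algorithm: helping, the local copy and the final CAS are left out, and the names `Op`, `ApplyBatch` and `kEmpty` are hypothetical. It only shows how a pop that meets a not-yet-applied push in the same batch can be answered without touching the stack.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr int kEmpty = -1;  // assumed sentinel; pushed values are non-negative

// Hypothetical operation record for this illustration.
struct Op {
  bool is_push;
  int value;        // value to push (unused for pops)
  int result;       // value returned by a pop
  bool eliminated;  // true if the operation never touched the stack
};

// Applies a batch of operations, in order, to a stack (top at the back).
// A pop takes its value directly from the latest push that is still waiting
// in the batch; both operations are then complete and the stack is unchanged.
void ApplyBatch(std::vector<int>& stack, std::vector<Op>& batch) {
  std::vector<std::size_t> waiting_pushes;  // indices into batch
  for (std::size_t i = 0; i < batch.size(); ++i) {
    Op& op = batch[i];
    if (op.is_push) {
      waiting_pushes.push_back(i);
    } else if (!waiting_pushes.empty()) {
      Op& push = batch[waiting_pushes.back()];
      waiting_pushes.pop_back();
      op.result = push.value;
      op.eliminated = push.eliminated = true;
    } else if (!stack.empty()) {
      op.result = stack.back();
      stack.pop_back();
    } else {
      op.result = kEmpty;
    }
  }
  // Pushes without a matching pop are the only ones that alter the stack.
  for (std::size_t i : waiting_pushes) stack.push_back(batch[i].value);
}
```

The results are the same as applying the operations one by one, but every eliminated pair saves two modifications of the shared state.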
  • 35. Elimination stack
    Fatourou's universal construction SIM; see: A highly efficient universal construction (Fatourou, 2011)
    The original version is not suitable for real-time applications:
    - ABA problem is prevented using tagged pointers
    - Thread-local pools with unbounded capacity
    - No deallocation in the published algorithm
  • 36. Elimination stack
    Fatourou's universal construction SIM; see: A highly efficient universal construction (Fatourou, 2011)
    Modified version of Fatourou's stack:
    - Uses hazard pointers for safe reclamation
    - Uses a compartment pool with limited capacity
    - Employs the elimination scheme from the original publication
  • 37. Stacks: Evaluation
  • 38. Stacks: Evaluation
  • 39. Task scheduling
  • 40. Task Scheduling: Objectives
    Task scheduling
    • Intra-process task scheduling with priority queues
    • Low-overhead, fine-grained scheduling of thousands of small tasks
    Priorities: focus on low latency and jitter reduction (i.e. predictability), thus regarding maximum throughput as a secondary benchmark.
  • 41. Task scheduling: Work stealing
    Work stealing
    • One worker thread per SMP core, no migration
    • Tasks passed as &func
    • Load-balancing on task queues
    • Many flavors of concrete implementations
  • 42. Task scheduling: Work stealing
    Work stealing with task priorities
    • Work stealing extended by queues for every priority
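One way to picture work stealing with a queue per priority is sketched below. Several points are assumptions rather than facts about EMBB: the lookup order (own queue, then other workers, before moving to the next lower priority), stealing from the back of a victim's queue, the number of priorities, and all names. The plain `std::deque` is not thread-safe and only stands in for the concurrent queues.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>
#include <vector>

constexpr std::size_t kPriorities = 3;  // assumption: 0 is the highest
using Task = std::function<void()>;

// One worker per core, holding one task queue for every priority.
struct Worker {
  std::array<std::deque<Task>, kPriorities> queues;
};

// Single-threaded illustration of the lookup order only.
// Returns false if no task is left anywhere.
bool NextTask(std::vector<Worker>& workers, std::size_t self, Task& out) {
  for (std::size_t prio = 0; prio < kPriorities; ++prio) {
    std::deque<Task>& own = workers[self].queues[prio];
    if (!own.empty()) {
      out = std::move(own.front());
      own.pop_front();
      return true;
    }
    // Own queue is empty at this priority: try to steal from another worker.
    for (std::size_t v = 0; v < workers.size(); ++v) {
      if (v == self) continue;
      std::deque<Task>& victim = workers[v].queues[prio];
      if (!victim.empty()) {
        out = std::move(victim.back());
        victim.pop_back();
        return true;
      }
    }
  }
  return false;
}
```

With this order, a high-priority task waiting on another core is preferred over a low-priority task in the worker's own queue, trading some locality for lower latency of important tasks.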
  • 43. Conclusion
  • 44. Conclusion
    Revisiting the objectives
    • Wait-free implementations of pools, queues and stacks are now available for real-time applications
    • Benchmark framework and evaluation tools (R) are published as open source
    • Reproducible evaluation of real-time performance
    • Verification tool chain on the way
  • 45. Conclusion
    Recommendations
    • Wait-free data structures can rival the performance of lock-free implementations
    • But they are hard to maintain
    • Formal wait-freedom is practically not achievable
    Employ wait-free data structures for fault tolerance, not as a guarantee for critical deadlines
  • 46. Thank You
    Source code (data structures, benchmarks, R scripts): https://github.com/fuchsto/embb/tree/benchmark/
    Official development source base of embb: https://github.com/siemens/embb/tree/development/
    Wiki for this thesis: http://wiki.coreglit.ch