The LEGaTO project has received funding from the European Union’s Horizon 2020 research and innovation
programme under the grant agreement No 780681. www.legato-project.eu
Low Energy Task Scheduling based on Work Stealing
Jing Chen
Chalmers University of Technology
Directed Acyclic Graph (DAG)
- A task-based way to express multithreaded applications.
- Nodes are tasks.
- Edges are dependencies.
Asymmetric platforms feature
- High-performance, power-hungry cores
- Small, energy-efficient cores
Dynamic Task Scheduling
Work stealing: scales better on larger systems and causes less communication contention than a centralized scheme.
Performance improvement alone is NOT enough for energy reduction
DVFS: dynamic voltage and frequency scaling
- Users are usually not permitted to manipulate DVFS settings.
- Overhead: tens of μs to over one ms. Multithreaded applications have μs-level fine-grained tasks, so it is NOT realistic to use DVFS per task.
- State of the art, per-core DVFS: significant hardware cost (inductors and capacitors), so most systems only feature cluster-based DVFS.
- State of the art, complete platform control: if other applications run on the same cluster, their energy consumption is badly affected.
Low Energy Runtime Design
Power Profiling
- Helps the runtime understand CPU power consumption trends (number and type of cores, different frequencies)
- We evaluate two power profiling techniques:
  (a) Directly sample power by accessing the onboard power sensor, e.g. the INA3221 on the NVIDIA Jetson TX2.
  (b) Use the Intel RAPL energy model: sample energy at fixed time intervals, then compute
      $P_{n+1} = (E_{n+1} - E_n) / (t_{n+1} - t_n)$
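The energy-to-power conversion in (b) can be sketched as follows. This is a minimal illustration, not the runtime's actual code: the function name and the `(time, energy)` sample format are assumptions, and a real RAPL reader would also have to handle energy counter wraparound, which is omitted here.

```python
from typing import List, Tuple

def power_from_energy_samples(samples: List[Tuple[float, float]]) -> List[float]:
    """Average power between consecutive (time_s, energy_J) samples.

    Hypothetical helper: applies P = (E_next - E_prev) / (t_next - t_prev)
    to each adjacent pair. Counter wraparound is not handled.
    """
    powers = []
    for (t0, e0), (t1, e1) in zip(samples, samples[1:]):
        dt = t1 - t0
        if dt <= 0:
            raise ValueError("timestamps must be strictly increasing")
        powers.append((e1 - e0) / dt)
    return powers

# Made-up samples taken every 0.5 s, for illustration only.
samples = [(0.0, 100.0), (0.5, 110.0), (1.0, 125.0)]
print(power_from_energy_samples(samples))  # [20.0, 30.0]
```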
Dynamic Performance Modeling
- Provides accurate predictions for future tasks given a set of resources
- Independent of platforms and frequencies
- Achieves scalability and portability goals
Idleness Tracing
- Gives information about the real-time status of cores
- Puts a core to "sleep" when it is under-utilized
- Sleep time follows an exponential backoff strategy
- Provides the real-time parallel slackness of active cores => used to calculate the shared board static power attributed to each running task
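One way to picture an exponential backoff for sleep time is the sketch below. The poster does not give the base delay, the cap, or the growth factor, so the class name `IdleBackoff`, the doubling factor, and the microsecond values are all assumptions made for illustration.

```python
class IdleBackoff:
    """Hypothetical sleep-time tracker for an idle worker core.

    Each consecutive failed attempt to find work doubles the sleep time,
    up to a cap; finding work resets it to the base value.
    """

    def __init__(self, base_us: int = 100, max_us: int = 10_000) -> None:
        self.base_us = base_us
        self.max_us = max_us
        self.current_us = base_us

    def next_sleep_us(self) -> int:
        """Return the sleep time for this idle round, then grow it."""
        delay = self.current_us
        self.current_us = min(self.current_us * 2, self.max_us)
        return delay

    def reset(self) -> None:
        """Call when the core finds work again."""
        self.current_us = self.base_us

backoff = IdleBackoff(base_us=100, max_us=1000)
print([backoff.next_sleep_us() for _ in range(5)])  # [100, 200, 400, 800, 1000]
```

In a worker loop, `next_sleep_us()` would be consulted after each unsuccessful steal attempt and `reset()` after a successful one.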
Task Mapping Algorithm (per-task level)
For a given configuration (start core, number of cores):
- Performance Tracer => Execution Time Prediction
- Power Profiles => Dynamic Power Prediction
- Power Profiles + Idleness Tracer => Static Power Prediction
Energy Prediction = (Static Power + Dynamic Power) × Execution Time
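The energy formula above can be illustrated with a small sketch that evaluates it for several candidate configurations and keeps the one with the lowest predicted energy. The `Prediction` type, the function name, and all numbers are hypothetical; in the runtime the three inputs would come from the performance tracer, the power profiles, and the idleness tracer.

```python
from dataclasses import dataclass
from typing import Iterable

@dataclass(frozen=True)
class Prediction:
    """Hypothetical per-configuration prediction for one task."""
    start_core: int
    num_cores: int
    exec_time_s: float
    dynamic_power_w: float
    static_power_w: float

    @property
    def energy_j(self) -> float:
        # Energy = (static power + dynamic power) x execution time
        return (self.static_power_w + self.dynamic_power_w) * self.exec_time_s

def lowest_energy(candidates: Iterable[Prediction]) -> Prediction:
    """Pick the configuration with the smallest predicted energy."""
    return min(candidates, key=lambda p: p.energy_j)

# Made-up predictions for one task on 1, 2, and 4 cores.
candidates = [
    Prediction(0, 1, 8.0, 2.0, 1.0),  # (1.0 + 2.0) * 8.0 = 24.0 J
    Prediction(0, 2, 4.5, 3.5, 1.0),  # (1.0 + 3.5) * 4.5 = 20.25 J
    Prediction(0, 4, 3.0, 7.0, 1.0),  # (1.0 + 7.0) * 3.0 = 24.0 J
]
best = lowest_energy(candidates)
print(best.num_cores, best.energy_j)  # 2 20.25
```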
Experimental Results
Name | Acronym | Notion
Random Work Stealing (+Sleep) | RWS (+S) | Typical greedy scheduling (enhanced with Sleep)
Fastest Cores with Criticality (+Sleep) | FCC (+S) | Critical tasks are mapped to the set of cores that minimizes execution time and are excluded from work stealing; non-critical tasks follow the parent queue and only search for the number of cores that minimizes the task's execution time (enhanced with Sleep)
Lowest Cost with Criticality (+Sleep) | LCC (+S) | Same as FCC, except that it minimizes parallel cost instead of execution time, where parallel cost = execution time × number of cores (enhanced with Sleep)
Lowest Energy without Criticality | LENC | Task scheduling targets lowest energy; no criticality awareness is needed
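The difference between minimizing execution time (FCC) and minimizing parallel cost (LCC) can be seen in a short sketch. The function names and the predicted times are invented for illustration and cover only the choice of core count, not the criticality or queueing rules in the table.

```python
from typing import Dict

def fastest_core_count(times_s: Dict[int, float]) -> int:
    """Core count with the smallest predicted execution time."""
    return min(times_s, key=lambda n: times_s[n])

def lowest_cost_core_count(times_s: Dict[int, float]) -> int:
    """Core count with the smallest parallel cost (time x number of cores)."""
    return min(times_s, key=lambda n: times_s[n] * n)

# Made-up predictions: number of cores -> execution time in seconds.
times_s = {1: 8.0, 2: 4.5, 4: 3.0}
print(fastest_core_count(times_s))      # 4  (3.0 s is the shortest time)
print(lowest_cost_core_count(times_s))  # 1  (costs: 8.0, 9.0, 12.0)
```

With sub-linear speedup, as in these numbers, the two objectives pick opposite ends of the range.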
[Figure: 2D-Heat on one Haswell node, 1000 iterations, resolution=10240. Y-axis: energy consumption [J] (x1000); x-axis: RWS, RWS+Sleep, FCC, FCC+Sleep, LCC, LCC+Sleep, LENC.]
[Figure: VGG-16 on NVIDIA Jetson TX2. Y-axis: energy [J]; x-axis: frequency settings MAX&MAX, MAX&MIN, MIN&MAX, MIN&MIN; series: RWS, RWS+S, FCC, FCC+S, LCC, LCC+S, LENC.]
- MAX&MIN (x-axis) means that on the TX2 the Denver cluster frequency is at maximum and the A57 cluster frequency is at minimum.
- LENC achieves the lowest energy, e.g. 31%-74% less energy than RWS, 19%-68% less than FCC, and 25%-73% less than LCC.
- Haswell is a symmetric platform. 2D-Heat includes two kernels: copy (memory-bound) and stencil (compute-bound).
- The sleep strategy brings a 38% energy reduction for RWS+S vs. RWS, 9% for FCC+S vs. FCC, and 33% for LCC+S vs. LCC.
- LENC achieves low energy through task type awareness:
  (a) Copy tasks choose number of cores = 5.
  (b) Stencil tasks choose number of cores = 10.
Background
The importance of task feature awareness:
- Naive assignment causes a mismatch between task types and core types, e.g. running compute-bound kernels on only the powerful Denver cluster of the TX2 is more energy efficient than using all cores.

More Related Content

What's hot

LEGaTO: Software Stack Runtimes
LEGaTO: Software Stack RuntimesLEGaTO: Software Stack Runtimes
LEGaTO: Software Stack RuntimesLEGATO project
 
Task programming in cloud computing
Task programming in cloud computingTask programming in cloud computing
Task programming in cloud computingSuresh Pokharel
 
Realizing Robust and Scalable Evolutionary Algorithms toward Exascale Era
Realizing Robust and Scalable Evolutionary Algorithms toward Exascale EraRealizing Robust and Scalable Evolutionary Algorithms toward Exascale Era
Realizing Robust and Scalable Evolutionary Algorithms toward Exascale EraMasaharu Munetomo
 
High performance computing
High performance computingHigh performance computing
High performance computingGuy Tel-Zur
 
08 Supercomputer Fugaku
08 Supercomputer Fugaku08 Supercomputer Fugaku
08 Supercomputer FugakuRCCSRENKEI
 
High Performance Computing in the Cloud?
High Performance Computing in the Cloud?High Performance Computing in the Cloud?
High Performance Computing in the Cloud?Ian Lumb
 
Introduction to High Performance Computing
Introduction to High Performance ComputingIntroduction to High Performance Computing
Introduction to High Performance ComputingUmarudin Zaenuri
 
Extracting a Rails Engine to a separated application
Extracting a Rails Engine to a separated applicationExtracting a Rails Engine to a separated application
Extracting a Rails Engine to a separated applicationJônatas Paganini
 
European Exascale System Interconnect & Storage
European Exascale System Interconnect & StorageEuropean Exascale System Interconnect & Storage
European Exascale System Interconnect & Storageinside-BigData.com
 
CloudLightning Simulator
CloudLightning SimulatorCloudLightning Simulator
CloudLightning SimulatorCloudLightning
 
Machine Learning for Weather Forecasts
Machine Learning for Weather ForecastsMachine Learning for Weather Forecasts
Machine Learning for Weather Forecastsinside-BigData.com
 
Scalable Deep Learning in ExtremeEarth-phiweek19
Scalable Deep Learning in ExtremeEarth-phiweek19Scalable Deep Learning in ExtremeEarth-phiweek19
Scalable Deep Learning in ExtremeEarth-phiweek19ExtremeEarth
 
Update on the Exascale Computing Project (ECP)
Update on the Exascale Computing Project (ECP)Update on the Exascale Computing Project (ECP)
Update on the Exascale Computing Project (ECP)inside-BigData.com
 
High performance computing
High performance computingHigh performance computing
High performance computingMaher Alshammari
 
Rain technology ppt
Rain technology pptRain technology ppt
Rain technology pptDC Graphics
 
Presentation
PresentationPresentation
Presentationbutest
 
Varun Gatne - Resume - Final
Varun Gatne - Resume - FinalVarun Gatne - Resume - Final
Varun Gatne - Resume - FinalVarun Gatne
 
UberCloud Webinar Abaqus and cloud computing
UberCloud Webinar Abaqus and cloud computingUberCloud Webinar Abaqus and cloud computing
UberCloud Webinar Abaqus and cloud computingThomas Francis
 
Resume_Mahadevan_new (2)
Resume_Mahadevan_new (2)Resume_Mahadevan_new (2)
Resume_Mahadevan_new (2)Mahadevan N
 

What's hot (20)

LEGaTO: Software Stack Runtimes
LEGaTO: Software Stack RuntimesLEGaTO: Software Stack Runtimes
LEGaTO: Software Stack Runtimes
 
Task programming in cloud computing
Task programming in cloud computingTask programming in cloud computing
Task programming in cloud computing
 
Realizing Robust and Scalable Evolutionary Algorithms toward Exascale Era
Realizing Robust and Scalable Evolutionary Algorithms toward Exascale EraRealizing Robust and Scalable Evolutionary Algorithms toward Exascale Era
Realizing Robust and Scalable Evolutionary Algorithms toward Exascale Era
 
High performance computing
High performance computingHigh performance computing
High performance computing
 
08 Supercomputer Fugaku
08 Supercomputer Fugaku08 Supercomputer Fugaku
08 Supercomputer Fugaku
 
High Performance Computing in the Cloud?
High Performance Computing in the Cloud?High Performance Computing in the Cloud?
High Performance Computing in the Cloud?
 
Introduction to High Performance Computing
Introduction to High Performance ComputingIntroduction to High Performance Computing
Introduction to High Performance Computing
 
Cloud, Fog, or Edge: Where and When to Compute?
Cloud, Fog, or Edge: Where and When to Compute?Cloud, Fog, or Edge: Where and When to Compute?
Cloud, Fog, or Edge: Where and When to Compute?
 
Extracting a Rails Engine to a separated application
Extracting a Rails Engine to a separated applicationExtracting a Rails Engine to a separated application
Extracting a Rails Engine to a separated application
 
European Exascale System Interconnect & Storage
European Exascale System Interconnect & StorageEuropean Exascale System Interconnect & Storage
European Exascale System Interconnect & Storage
 
CloudLightning Simulator
CloudLightning SimulatorCloudLightning Simulator
CloudLightning Simulator
 
Machine Learning for Weather Forecasts
Machine Learning for Weather ForecastsMachine Learning for Weather Forecasts
Machine Learning for Weather Forecasts
 
Scalable Deep Learning in ExtremeEarth-phiweek19
Scalable Deep Learning in ExtremeEarth-phiweek19Scalable Deep Learning in ExtremeEarth-phiweek19
Scalable Deep Learning in ExtremeEarth-phiweek19
 
Update on the Exascale Computing Project (ECP)
Update on the Exascale Computing Project (ECP)Update on the Exascale Computing Project (ECP)
Update on the Exascale Computing Project (ECP)
 
High performance computing
High performance computingHigh performance computing
High performance computing
 
Rain technology ppt
Rain technology pptRain technology ppt
Rain technology ppt
 
Presentation
PresentationPresentation
Presentation
 
Varun Gatne - Resume - Final
Varun Gatne - Resume - FinalVarun Gatne - Resume - Final
Varun Gatne - Resume - Final
 
UberCloud Webinar Abaqus and cloud computing
UberCloud Webinar Abaqus and cloud computingUberCloud Webinar Abaqus and cloud computing
UberCloud Webinar Abaqus and cloud computing
 
Resume_Mahadevan_new (2)
Resume_Mahadevan_new (2)Resume_Mahadevan_new (2)
Resume_Mahadevan_new (2)
 

Similar to Low Energy Task Scheduling based on Work Stealing

A Study on Task Scheduling in Could Data Centers for Energy Efficacy
A Study on Task Scheduling in Could Data Centers for Energy Efficacy A Study on Task Scheduling in Could Data Centers for Energy Efficacy
A Study on Task Scheduling in Could Data Centers for Energy Efficacy Ehsan Sharifi
 
22). smlevel energy eff-dynamictaskschedng
22). smlevel energy eff-dynamictaskschedng22). smlevel energy eff-dynamictaskschedng
22). smlevel energy eff-dynamictaskschedngPoornima_Rajanna
 
Dvfs nima-afraz
Dvfs nima-afrazDvfs nima-afraz
Dvfs nima-afrazNima Afraz
 
Battery Aware Dynamic Scheduling for Periodic Task Graphs
Battery Aware Dynamic Scheduling for Periodic Task GraphsBattery Aware Dynamic Scheduling for Periodic Task Graphs
Battery Aware Dynamic Scheduling for Periodic Task GraphsNicolas Navet
 
DYNAMIC VOLTAGE SCALING FOR POWER CONSUMPTION REDUCTION IN REAL-TIME MIXED TA...
DYNAMIC VOLTAGE SCALING FOR POWER CONSUMPTION REDUCTION IN REAL-TIME MIXED TA...DYNAMIC VOLTAGE SCALING FOR POWER CONSUMPTION REDUCTION IN REAL-TIME MIXED TA...
DYNAMIC VOLTAGE SCALING FOR POWER CONSUMPTION REDUCTION IN REAL-TIME MIXED TA...cscpconf
 
Scheduling Task-parallel Applications in Dynamically Asymmetric Environments
Scheduling Task-parallel Applications in Dynamically Asymmetric EnvironmentsScheduling Task-parallel Applications in Dynamically Asymmetric Environments
Scheduling Task-parallel Applications in Dynamically Asymmetric EnvironmentsLEGATO project
 
"Energy-efficient Hardware for Embedded Vision and Deep Convolutional Neural ...
"Energy-efficient Hardware for Embedded Vision and Deep Convolutional Neural ..."Energy-efficient Hardware for Embedded Vision and Deep Convolutional Neural ...
"Energy-efficient Hardware for Embedded Vision and Deep Convolutional Neural ...Edge AI and Vision Alliance
 
Large scale simulation ship power system hebner-herbst-gatozzi - july 2010
Large scale simulation ship power system  hebner-herbst-gatozzi - july 2010Large scale simulation ship power system  hebner-herbst-gatozzi - july 2010
Large scale simulation ship power system hebner-herbst-gatozzi - july 2010cahouser
 
Achitecture Aware Algorithms and Software for Peta and Exascale
Achitecture Aware Algorithms and Software for Peta and ExascaleAchitecture Aware Algorithms and Software for Peta and Exascale
Achitecture Aware Algorithms and Software for Peta and Exascaleinside-BigData.com
 
Run-time power management in cloud and containerized environments
Run-time power management in cloud and containerized environmentsRun-time power management in cloud and containerized environments
Run-time power management in cloud and containerized environmentsNECST Lab @ Politecnico di Milano
 
CoolDC'16: Seeing into a Public Cloud: Monitoring the Massachusetts Open Cloud
CoolDC'16: Seeing into a Public Cloud: Monitoring the Massachusetts Open CloudCoolDC'16: Seeing into a Public Cloud: Monitoring the Massachusetts Open Cloud
CoolDC'16: Seeing into a Public Cloud: Monitoring the Massachusetts Open CloudAta Turk
 
Intelligent Placement of Datacenter for Internet Services
Intelligent Placement of Datacenter for Internet Services Intelligent Placement of Datacenter for Internet Services
Intelligent Placement of Datacenter for Internet Services Arinto Murdopo
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
A MULTI-OBJECTIVE PERSPECTIVE FOR OPERATOR SCHEDULING USING FINEGRAINED DVS A...
A MULTI-OBJECTIVE PERSPECTIVE FOR OPERATOR SCHEDULING USING FINEGRAINED DVS A...A MULTI-OBJECTIVE PERSPECTIVE FOR OPERATOR SCHEDULING USING FINEGRAINED DVS A...
A MULTI-OBJECTIVE PERSPECTIVE FOR OPERATOR SCHEDULING USING FINEGRAINED DVS A...VLSICS Design
 
Saving Energy for Cloud Applications in Mobile Devices using Nearby Resources
Saving Energy for Cloud Applications in Mobile Devices using Nearby ResourcesSaving Energy for Cloud Applications in Mobile Devices using Nearby Resources
Saving Energy for Cloud Applications in Mobile Devices using Nearby ResourcesAnas Toma
 
Electricity price forecasting with Recurrent Neural Networks
Electricity price forecasting with Recurrent Neural NetworksElectricity price forecasting with Recurrent Neural Networks
Electricity price forecasting with Recurrent Neural NetworksTaegyun Jeon
 
Apache track d updated
Apache   track d updatedApache   track d updated
Apache track d updatedAlona Gradman
 

Similar to Low Energy Task Scheduling based on Work Stealing (20)

A Study on Task Scheduling in Could Data Centers for Energy Efficacy
A Study on Task Scheduling in Could Data Centers for Energy Efficacy A Study on Task Scheduling in Could Data Centers for Energy Efficacy
A Study on Task Scheduling in Could Data Centers for Energy Efficacy
 
22). smlevel energy eff-dynamictaskschedng
22). smlevel energy eff-dynamictaskschedng22). smlevel energy eff-dynamictaskschedng
22). smlevel energy eff-dynamictaskschedng
 
Dvfs nima-afraz
Dvfs nima-afrazDvfs nima-afraz
Dvfs nima-afraz
 
Battery Aware Dynamic Scheduling for Periodic Task Graphs
Battery Aware Dynamic Scheduling for Periodic Task GraphsBattery Aware Dynamic Scheduling for Periodic Task Graphs
Battery Aware Dynamic Scheduling for Periodic Task Graphs
 
HYPPO - NECSTTechTalk 23/04/2020
HYPPO - NECSTTechTalk 23/04/2020HYPPO - NECSTTechTalk 23/04/2020
HYPPO - NECSTTechTalk 23/04/2020
 
DYNAMIC VOLTAGE SCALING FOR POWER CONSUMPTION REDUCTION IN REAL-TIME MIXED TA...
DYNAMIC VOLTAGE SCALING FOR POWER CONSUMPTION REDUCTION IN REAL-TIME MIXED TA...DYNAMIC VOLTAGE SCALING FOR POWER CONSUMPTION REDUCTION IN REAL-TIME MIXED TA...
DYNAMIC VOLTAGE SCALING FOR POWER CONSUMPTION REDUCTION IN REAL-TIME MIXED TA...
 
Scheduling Task-parallel Applications in Dynamically Asymmetric Environments
Scheduling Task-parallel Applications in Dynamically Asymmetric EnvironmentsScheduling Task-parallel Applications in Dynamically Asymmetric Environments
Scheduling Task-parallel Applications in Dynamically Asymmetric Environments
 
"Energy-efficient Hardware for Embedded Vision and Deep Convolutional Neural ...
"Energy-efficient Hardware for Embedded Vision and Deep Convolutional Neural ..."Energy-efficient Hardware for Embedded Vision and Deep Convolutional Neural ...
"Energy-efficient Hardware for Embedded Vision and Deep Convolutional Neural ...
 
Large scale simulation ship power system hebner-herbst-gatozzi - july 2010
Large scale simulation ship power system  hebner-herbst-gatozzi - july 2010Large scale simulation ship power system  hebner-herbst-gatozzi - july 2010
Large scale simulation ship power system hebner-herbst-gatozzi - july 2010
 
Achitecture Aware Algorithms and Software for Peta and Exascale
Achitecture Aware Algorithms and Software for Peta and ExascaleAchitecture Aware Algorithms and Software for Peta and Exascale
Achitecture Aware Algorithms and Software for Peta and Exascale
 
Umit hw6
Umit hw6Umit hw6
Umit hw6
 
Run-time power management in cloud and containerized environments
Run-time power management in cloud and containerized environmentsRun-time power management in cloud and containerized environments
Run-time power management in cloud and containerized environments
 
CoolDC'16: Seeing into a Public Cloud: Monitoring the Massachusetts Open Cloud
CoolDC'16: Seeing into a Public Cloud: Monitoring the Massachusetts Open CloudCoolDC'16: Seeing into a Public Cloud: Monitoring the Massachusetts Open Cloud
CoolDC'16: Seeing into a Public Cloud: Monitoring the Massachusetts Open Cloud
 
Intelligent Placement of Datacenter for Internet Services
Intelligent Placement of Datacenter for Internet Services Intelligent Placement of Datacenter for Internet Services
Intelligent Placement of Datacenter for Internet Services
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
A MULTI-OBJECTIVE PERSPECTIVE FOR OPERATOR SCHEDULING USING FINEGRAINED DVS A...
A MULTI-OBJECTIVE PERSPECTIVE FOR OPERATOR SCHEDULING USING FINEGRAINED DVS A...A MULTI-OBJECTIVE PERSPECTIVE FOR OPERATOR SCHEDULING USING FINEGRAINED DVS A...
A MULTI-OBJECTIVE PERSPECTIVE FOR OPERATOR SCHEDULING USING FINEGRAINED DVS A...
 
Saving Energy for Cloud Applications in Mobile Devices using Nearby Resources
Saving Energy for Cloud Applications in Mobile Devices using Nearby ResourcesSaving Energy for Cloud Applications in Mobile Devices using Nearby Resources
Saving Energy for Cloud Applications in Mobile Devices using Nearby Resources
 
Green scheduling
Green schedulingGreen scheduling
Green scheduling
 
Electricity price forecasting with Recurrent Neural Networks
Electricity price forecasting with Recurrent Neural NetworksElectricity price forecasting with Recurrent Neural Networks
Electricity price forecasting with Recurrent Neural Networks
 
Apache track d updated
Apache   track d updatedApache   track d updated
Apache track d updated
 

More from LEGATO project

Scrooge Attack: Undervolting ARM Processors for Profit
Scrooge Attack: Undervolting ARM Processors for ProfitScrooge Attack: Undervolting ARM Processors for Profit
Scrooge Attack: Undervolting ARM Processors for ProfitLEGATO project
 
A practical approach for updating an integrity-enforced operating system
A practical approach for updating an integrity-enforced operating systemA practical approach for updating an integrity-enforced operating system
A practical approach for updating an integrity-enforced operating systemLEGATO project
 
TEEMon: A continuous performance monitoring framework for TEEs
TEEMon: A continuous performance monitoring framework for TEEsTEEMon: A continuous performance monitoring framework for TEEs
TEEMon: A continuous performance monitoring framework for TEEsLEGATO project
 
secureTF: A Secure TensorFlow Framework
secureTF: A Secure TensorFlow FrameworksecureTF: A Secure TensorFlow Framework
secureTF: A Secure TensorFlow FrameworkLEGATO project
 
PipeTune: Pipeline Parallelism of Hyper and System Parameters Tuning for Deep...
PipeTune: Pipeline Parallelism of Hyper and System Parameters Tuning for Deep...PipeTune: Pipeline Parallelism of Hyper and System Parameters Tuning for Deep...
PipeTune: Pipeline Parallelism of Hyper and System Parameters Tuning for Deep...LEGATO project
 
LEGaTO: Machine Learning Use Case
LEGaTO: Machine Learning Use CaseLEGaTO: Machine Learning Use Case
LEGaTO: Machine Learning Use CaseLEGATO project
 
Smart Home AI at the edge
Smart Home AI at the edgeSmart Home AI at the edge
Smart Home AI at the edgeLEGATO project
 
LEGaTO: Low-Energy Heterogeneous Computing Use of AI in the project
LEGaTO: Low-Energy Heterogeneous Computing Use of AI in the projectLEGaTO: Low-Energy Heterogeneous Computing Use of AI in the project
LEGaTO: Low-Energy Heterogeneous Computing Use of AI in the projectLEGATO project
 
LEGaTO: Software Stack Programming Models
LEGaTO: Software Stack Programming ModelsLEGaTO: Software Stack Programming Models
LEGaTO: Software Stack Programming ModelsLEGATO project
 
LEGaTO Heterogeneous Hardware
LEGaTO Heterogeneous HardwareLEGaTO Heterogeneous Hardware
LEGaTO Heterogeneous HardwareLEGATO project
 
LEGaTO: Low-Energy Heterogeneous Computing Workshop
LEGaTO: Low-Energy Heterogeneous Computing WorkshopLEGaTO: Low-Energy Heterogeneous Computing Workshop
LEGaTO: Low-Energy Heterogeneous Computing WorkshopLEGATO project
 
TZ4Fabric: Executing Smart Contracts with ARM TrustZone
TZ4Fabric: Executing Smart Contracts with ARM TrustZoneTZ4Fabric: Executing Smart Contracts with ARM TrustZone
TZ4Fabric: Executing Smart Contracts with ARM TrustZoneLEGATO project
 
Infection Research with Maxeler Dataflow Computing
Infection Research with Maxeler Dataflow ComputingInfection Research with Maxeler Dataflow Computing
Infection Research with Maxeler Dataflow ComputingLEGATO project
 
Smart Home - AI at the edge
Smart Home - AI at the edgeSmart Home - AI at the edge
Smart Home - AI at the edgeLEGATO project
 
FPGA Undervolting and Checkpointing for Energy-Efficiency and Error-Resiliency
FPGA Undervolting and Checkpointing for Energy-Efficiency and Error-ResiliencyFPGA Undervolting and Checkpointing for Energy-Efficiency and Error-Resiliency
FPGA Undervolting and Checkpointing for Energy-Efficiency and Error-ResiliencyLEGATO project
 
RECS – Cloud to Edge Microserver Platform for Energy-Efficient Computing
RECS – Cloud to Edge Microserver Platform for Energy-Efficient ComputingRECS – Cloud to Edge Microserver Platform for Energy-Efficient Computing
RECS – Cloud to Edge Microserver Platform for Energy-Efficient ComputingLEGATO project
 
Secure Task-Based Programming with OmpSs and SGX
Secure Task-Based Programming with OmpSs and SGXSecure Task-Based Programming with OmpSs and SGX
Secure Task-Based Programming with OmpSs and SGXLEGATO project
 
HiPerMAb: A statistical tool for judging the potential of short fat data
HiPerMAb: A statistical tool for judging the potential of short fat dataHiPerMAb: A statistical tool for judging the potential of short fat data
HiPerMAb: A statistical tool for judging the potential of short fat dataLEGATO project
 

More from LEGATO project (20)

Scrooge Attack: Undervolting ARM Processors for Profit
Scrooge Attack: Undervolting ARM Processors for ProfitScrooge Attack: Undervolting ARM Processors for Profit
Scrooge Attack: Undervolting ARM Processors for Profit
 
A practical approach for updating an integrity-enforced operating system
A practical approach for updating an integrity-enforced operating systemA practical approach for updating an integrity-enforced operating system
A practical approach for updating an integrity-enforced operating system
 
TEEMon: A continuous performance monitoring framework for TEEs
TEEMon: A continuous performance monitoring framework for TEEsTEEMon: A continuous performance monitoring framework for TEEs
TEEMon: A continuous performance monitoring framework for TEEs
 
secureTF: A Secure TensorFlow Framework
secureTF: A Secure TensorFlow FrameworksecureTF: A Secure TensorFlow Framework
secureTF: A Secure TensorFlow Framework
 
PipeTune: Pipeline Parallelism of Hyper and System Parameters Tuning for Deep...
PipeTune: Pipeline Parallelism of Hyper and System Parameters Tuning for Deep...PipeTune: Pipeline Parallelism of Hyper and System Parameters Tuning for Deep...
PipeTune: Pipeline Parallelism of Hyper and System Parameters Tuning for Deep...
 
LEGaTO: Machine Learning Use Case
LEGaTO: Machine Learning Use CaseLEGaTO: Machine Learning Use Case
LEGaTO: Machine Learning Use Case
 
Smart Home AI at the edge
Smart Home AI at the edgeSmart Home AI at the edge
Smart Home AI at the edge
 
LEGaTO: Low-Energy Heterogeneous Computing Use of AI in the project
LEGaTO: Low-Energy Heterogeneous Computing Use of AI in the projectLEGaTO: Low-Energy Heterogeneous Computing Use of AI in the project
LEGaTO: Low-Energy Heterogeneous Computing Use of AI in the project
 
LEGaTO Integration
LEGaTO IntegrationLEGaTO Integration
LEGaTO Integration
 
LEGaTO: Use cases
LEGaTO: Use casesLEGaTO: Use cases
LEGaTO: Use cases
 
LEGaTO: Software Stack Programming Models
LEGaTO: Software Stack Programming ModelsLEGaTO: Software Stack Programming Models
LEGaTO: Software Stack Programming Models
 
LEGaTO Heterogeneous Hardware
LEGaTO Heterogeneous HardwareLEGaTO Heterogeneous Hardware
LEGaTO Heterogeneous Hardware
 
LEGaTO: Low-Energy Heterogeneous Computing Workshop
LEGaTO: Low-Energy Heterogeneous Computing WorkshopLEGaTO: Low-Energy Heterogeneous Computing Workshop
LEGaTO: Low-Energy Heterogeneous Computing Workshop
 
TZ4Fabric: Executing Smart Contracts with ARM TrustZone
TZ4Fabric: Executing Smart Contracts with ARM TrustZoneTZ4Fabric: Executing Smart Contracts with ARM TrustZone
TZ4Fabric: Executing Smart Contracts with ARM TrustZone
 
Infection Research with Maxeler Dataflow Computing
Infection Research with Maxeler Dataflow ComputingInfection Research with Maxeler Dataflow Computing
Infection Research with Maxeler Dataflow Computing
 
Smart Home - AI at the edge
Smart Home - AI at the edgeSmart Home - AI at the edge
Smart Home - AI at the edge
 
FPGA Undervolting and Checkpointing for Energy-Efficiency and Error-Resiliency
FPGA Undervolting and Checkpointing for Energy-Efficiency and Error-ResiliencyFPGA Undervolting and Checkpointing for Energy-Efficiency and Error-Resiliency
FPGA Undervolting and Checkpointing for Energy-Efficiency and Error-Resiliency
 
RECS – Cloud to Edge Microserver Platform for Energy-Efficient Computing
RECS – Cloud to Edge Microserver Platform for Energy-Efficient ComputingRECS – Cloud to Edge Microserver Platform for Energy-Efficient Computing
RECS – Cloud to Edge Microserver Platform for Energy-Efficient Computing
 
Secure Task-Based Programming with OmpSs and SGX
Secure Task-Based Programming with OmpSs and SGXSecure Task-Based Programming with OmpSs and SGX
Secure Task-Based Programming with OmpSs and SGX
 
HiPerMAb: A statistical tool for judging the potential of short fat data
HiPerMAb: A statistical tool for judging the potential of short fat dataHiPerMAb: A statistical tool for judging the potential of short fat data
HiPerMAb: A statistical tool for judging the potential of short fat data
 

Recently uploaded

Speech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptxSpeech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptxpriyankatabhane
 
PROJECTILE MOTION-Horizontal and Vertical
PROJECTILE MOTION-Horizontal and VerticalPROJECTILE MOTION-Horizontal and Vertical
PROJECTILE MOTION-Horizontal and VerticalMAESTRELLAMesa2
 
bonjourmadame.tumblr.com bhaskar's girls
bonjourmadame.tumblr.com bhaskar's girlsbonjourmadame.tumblr.com bhaskar's girls
bonjourmadame.tumblr.com bhaskar's girlshansessene
 
Organic farming with special reference to vermiculture
Organic farming with special reference to vermicultureOrganic farming with special reference to vermiculture
Organic farming with special reference to vermicultureTakeleZike1
 
The dark energy paradox leads to a new structure of spacetime.pptx
The dark energy paradox leads to a new structure of spacetime.pptxThe dark energy paradox leads to a new structure of spacetime.pptx
The dark energy paradox leads to a new structure of spacetime.pptxEran Akiva Sinbar
 
User Guide: Capricorn FLX™ Weather Station
User Guide: Capricorn FLX™ Weather StationUser Guide: Capricorn FLX™ Weather Station
User Guide: Capricorn FLX™ Weather StationColumbia Weather Systems
 
Biological classification of plants with detail
Biological classification of plants with detailBiological classification of plants with detail
Biological classification of plants with detailhaiderbaloch3
 
Pests of Blackgram, greengram, cowpea_Dr.UPR.pdf
Pests of Blackgram, greengram, cowpea_Dr.UPR.pdfPests of Blackgram, greengram, cowpea_Dr.UPR.pdf
Pests of Blackgram, greengram, cowpea_Dr.UPR.pdfPirithiRaju
 
Microphone- characteristics,carbon microphone, dynamic microphone.pptx
Microphone- characteristics,carbon microphone, dynamic microphone.pptxMicrophone- characteristics,carbon microphone, dynamic microphone.pptx
Microphone- characteristics,carbon microphone, dynamic microphone.pptxpriyankatabhane
 
Quarter 4_Grade 8_Digestive System Structure and Functions


Low Energy Task Scheduling based on Work Stealing

  • 1. The LEGaTO project has received funding from the European Union’s Horizon 2020 research and innovation programme under the grant agreement No 780681. www.legato-project.eu

Jing Chen
Chalmers University of Technology

Directed Acyclic Graph (DAG)
- A task-based way to express multithreading applications.
- Nodes are tasks.
- Edges are dependencies.

Asymmetric platforms feature
- High-performance, power-hungry cores
- Small, energy-efficient cores

Dynamic Task Scheduling
Work stealing: better scalability in larger systems and less communication contention than a centralized scheme.

Performance improvement is NOT enough for energy reduction

DVFS: dynamic voltage and frequency scaling
- Users are usually not permitted to manipulate DVFS settings.
- DVFS overhead ranges from tens of µs to over one ms, while multithreading applications have µs-level fine-grained tasks, so it is NOT realistic to use DVFS per task.
- State of the art: per-core DVFS. It has a significant hardware cost (inductors and capacitors), so most systems only feature cluster-based DVFS.
- State of the art: complete control of the platform. If other applications run on the same cluster, this badly influences the energy of those applications.

Low Energy Runtime Design

Power Profiling
- Help the runtime understand CPU power consumption trends (number/type of cores, different frequencies).
- We evaluate power profiling techniques:
(a) Directly sample power by accessing the onboard power sensor, e.g. the INA3221 on the NVIDIA Jetson TX2.
(b) Intel RAPL energy model: sample energy at a fixed time interval, then compute
$\mathrm{Power}_{n+1} = (\mathrm{Energy}_{n+1} - \mathrm{Energy}_{n}) / (t_{n+1} - t_{n})$

Dynamic Performance Modeling
- Provide accurate predictions for future tasks given a set of resources.
- Independent of platforms and frequencies.
- Achieve scalability and portability goals.

Idleness Tracing
- Give information about the real-time status of cores.
- Put cores to "sleep" when they are under-utilized.
- Sleep time follows an exponential backoff strategy.
- Provide the real-time parallel slackness of active cores => calculation of the shared board static power for each running task.

Task Mapping Algorithm (per-task level)
For a given configuration (start core, number of cores):
- Performance Tracer => Execution Time Prediction
- Power Profiles => Dynamic Power Prediction
- Power Profiles + Idleness Tracer => Static Power Prediction
Energy Prediction = (Static Power + Dynamic Power) x Execution Time

Experimental Results

Name | Acronym | Notion
Random Work Stealing (+Sleep) | RWS (+S) | Typical greedy scheduling (enhanced with Sleep).
Fastest Cores with Criticality (+Sleep) | FCC (+S) | Critical tasks are mapped to the set of cores that minimizes execution time and are not allowed to be stolen; noncritical tasks follow the parent queue and only search for the number of cores that minimizes the execution time of the task (enhanced with Sleep).
Lowest Cost with Criticality (+Sleep) | LCC (+S) | Differs from FCC in that minimizing execution time becomes minimizing parallel cost, where parallel cost means "execution time * number of cores" (enhanced with Sleep).
Lowest Energy without Criticality | LENC | Task scheduling targets lowest energy; no need for criticality awareness.

[Figure: 2D-Heat on Haswell, one node, 1000 iterations, resolution=10240. Energy consumption [J] x1000 for RWS, RWS+Sleep, FCC, FCC+Sleep, LCC, LCC+Sleep and LENC.]

[Figure: VGG-16 on NVIDIA Jetson TX2. Energy [J] for RWS, RWS+S, FCC, FCC+S, LCC, LCC+S and LENC under the frequency settings MAX&MAX, MAX&MIN, MIN&MAX and MIN&MIN.]

- MAX&MIN (x-axis) means that on TX2 the Denver cluster frequency is at its maximum and the A57 cluster frequency is at its minimum.
- LENC achieves the lowest energy, e.g. 31%-74% less energy than RWS, 19%-68% less than FCC and 25%-73% less than LCC.
- Haswell is a symmetric platform. 2D-Heat includes two kernels: copy (memory-bound) and stencil (compute-bound).
- The Sleep strategy brings a 38% energy reduction in RWS vs. RWS+S, 9% in FCC vs. FCC+S and 33% in LCC vs. LCC+S.
- LENC achieves low-energy task-type awareness: (a) copy tasks choose number of cores=5; (b) stencil tasks choose number of cores=10.

Background
The importance of task feature awareness:
- Naive assignment causes a mismatch between task types and core types, e.g. running compute-bound kernels on the powerful Denver cluster on TX2 is more energy efficient than using all.
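
The RAPL-based profiling step (b) under Power Profiling derives power by differencing consecutive energy samples. A minimal Python sketch of that formula follows. The function name and the sample layout (timestamp in seconds, cumulative energy in joules) are assumptions made for illustration, not the poster's implementation, and the wraparound of real RAPL counters is not handled.

```python
from typing import List, Tuple


def power_from_energy_samples(samples: List[Tuple[float, float]]) -> List[float]:
    """Average power (W) over each interval between consecutive samples.

    Each sample is (t_seconds, cumulative_energy_joules); this layout is
    an assumption for illustration. Counter wraparound is ignored.
    """
    powers = []
    for (t0, e0), (t1, e1) in zip(samples, samples[1:]):
        dt = t1 - t0
        if dt <= 0:
            raise ValueError("timestamps must be strictly increasing")
        # Power_{n+1} = (Energy_{n+1} - Energy_n) / (t_{n+1} - t_n)
        powers.append((e1 - e0) / dt)
    return powers


# Three samples taken every 0.5 s (made-up values)
samples = [(0.0, 100.0), (0.5, 110.0), (1.0, 125.0)]
print(power_from_energy_samples(samples))  # [20.0, 30.0]
```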
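
The Idleness Tracing section says that the sleep time follows an exponential backoff strategy but gives no parameters. The sketch below shows one common form of exponential backoff: the sleep interval doubles each time an idle core finds no work and resets once work is found. The base interval, the cap, the doubling factor and the function name are all hypothetical.

```python
def next_sleep_us(current_us: float, found_work: bool,
                  base_us: float = 10.0, cap_us: float = 1000.0) -> float:
    """Return the next sleep interval in microseconds.

    base_us, cap_us and the factor 2 are hypothetical values; the poster
    does not state them.
    """
    if found_work:
        return base_us                     # reset once the core is busy again
    return min(current_us * 2.0, cap_us)   # double while idle, up to the cap


sleep_us = 10.0
trace = []
for found in [False, False, False, True, False]:
    sleep_us = next_sleep_us(sleep_us, found)
    trace.append(sleep_us)
print(trace)  # [20.0, 40.0, 80.0, 10.0, 20.0]
```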
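
The Task Mapping Algorithm predicts energy for each configuration (start core, number of cores) as (Static Power + Dynamic Power) x Execution Time. The sketch below applies that formula to a small table of made-up predictions and compares three selection objectives: lowest energy (as in LENC), lowest parallel cost (as in LCC) and shortest execution time (as in FCC). The `Prediction` class, `pick_config` and all numbers are illustrative assumptions, and the criticality handling of FCC and LCC is left out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

Config = Tuple[int, int]  # (start_core, num_cores)


@dataclass
class Prediction:
    exec_time_s: float      # would come from the performance tracer
    dynamic_power_w: float  # would come from the power profiles
    static_power_w: float   # would come from power profiles + idleness tracer

    @property
    def energy_j(self) -> float:
        # Energy = (Static Power + Dynamic Power) x Execution Time
        return (self.static_power_w + self.dynamic_power_w) * self.exec_time_s


def pick_config(table: Dict[Config, Prediction],
                objective: Callable[[Config, Prediction], float]) -> Config:
    """Return the configuration with the smallest objective value."""
    return min(table, key=lambda cfg: objective(cfg, table[cfg]))


def energy(cfg: Config, p: Prediction) -> float:
    return p.energy_j


def parallel_cost(cfg: Config, p: Prediction) -> float:
    return p.exec_time_s * cfg[1]  # execution time * number of cores


def exec_time(cfg: Config, p: Prediction) -> float:
    return p.exec_time_s


# Made-up predictions for one task on 1, 2 and 4 cores
table = {
    (0, 1): Prediction(8.0, 2.0, 1.0),  # 24.0 J, cost 8.0
    (0, 2): Prediction(4.5, 3.5, 1.0),  # 20.25 J, cost 9.0
    (0, 4): Prediction(3.0, 7.0, 1.0),  # 24.0 J, cost 12.0
}
print(pick_config(table, energy))         # (0, 2)
print(pick_config(table, parallel_cost))  # (0, 1)
print(pick_config(table, exec_time))      # (0, 4)
```

With these numbers the three objectives pick three different configurations, which shows how the fastest or the cheapest configuration is not necessarily the lowest-energy one.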