Energy Aware Task Allocation on a Large Scale Heterogeneous Multi-Core SOC
Eric Jonardi

1
Problem
• upper bound on the performance of single core processors
• future embedded systems must be faster while consuming less
energy
• reducing energy consumption usually results in reduced
performance

• Solution: multiple processors per die
• multi-core system on chip (MCSOC).
• combining processors with different architectures
• heterogeneity creates opportunity for optimization

• highly effective for large scale data centers
• task mapping grows increasingly complex
• reliable and fast task mapping is needed

2
Project Goal
• develop a static/offline method of assigning incoming tasks to
the various cores of a heterogeneous MCSOC (also known as
mapping)
• the mapping will minimize the energy consumed to fully
execute a workload, such that all tasks are executed

3
Overview
• MCSOC model
• Simulated annealing algorithm
• ARM A7 & A15 architecture
• Gem5 task simulations
• Mapping algorithm setup
• Results

4
MCSOC Device Model

5
Simulated Annealing
• task mapping is NP-complete
• simulated annealing is an iterative search heuristic
• allows escape from local minima
• solution not ideal, but is “good enough”

6
Simulated Annealing

7
Simulated Annealing
• iterative search heuristic
• allows escape from local minima
• primary variables
• Initial temperature
• Cooling rate
• Parameter mutation rate
• P-state mutation rate
• Task flow mutation rate
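
A minimal sketch of the loop described on these slides, assuming geometric cooling and the Metropolis acceptance rule (neither is stated in the deck). The names `simulated_annealing`, `toy_energy`, and `toy_mutate` are hypothetical, and the toy 1-D problem stands in for the real mapping problem purely for illustration.

```python
import math
import random


def simulated_annealing(initial, energy, mutate, initial_temp=10.0,
                        cooling_rate=0.995, iterations=10_000, seed=0):
    """Generic simulated annealing loop (illustrative sketch only).

    energy(solution) returns the value to minimize.
    mutate(solution, rng) returns a neighbouring solution.
    Geometric cooling and Metropolis acceptance are assumptions.
    """
    rng = random.Random(seed)
    current, current_e = initial, energy(initial)
    best, best_e = current, current_e
    temp = initial_temp
    for _ in range(iterations):
        candidate = mutate(current, rng)
        candidate_e = energy(candidate)
        delta = candidate_e - current_e
        # Improvements are always accepted. Worse moves are accepted with
        # probability exp(-delta / temp), which allows escape from local minima.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_e = candidate, candidate_e
            if current_e < best_e:
                best, best_e = current, current_e
        temp *= cooling_rate  # lower the temperature after every iteration
    return best, best_e


def toy_energy(x):
    """Hypothetical 1-D objective with several local minima."""
    return x * x + 10.0 * math.sin(3.0 * x)


def toy_mutate(x, rng):
    """Hypothetical mutation: take a small random step."""
    return x + rng.uniform(-0.5, 0.5)


best, best_e = simulated_annealing(4.0, toy_energy, toy_mutate)
print(f"best x = {best:.3f}, energy = {best_e:.3f}")
```

In the mapping problem the mutation step would change a core's P-state or task flow rates instead of a single number; the three mutation rates listed above would control how often each kind of change is attempted.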

8
ARM A7 & A15 Architectures

9
P-states for A7 & A15
Figure: Normalized P-states. Power (y-axis, 0 to 1) vs. Performance (x-axis, 0 to 1); series: A15, A7

10
Gem5 Simulated Tasks
Figure: Runtime (seconds, 0 to 0.016) vs. Clock Speed (axis labeled GHz, 400 to 1,000); series: fft, ocean, oceanNC, fft2, radix, lu

11
ECS Matrix

12
Mapping Algorithm Setup
• Global arrival rate for each task type
• Each core:
• p-state
• task flow rate for each task type

• Global execution rate for each task type
• Simulated annealing loop
• Fitness function (evaluated energy)
• Solution repair function
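
One way the pieces listed above could fit together is sketched below. Everything here is an assumption made for illustration: the `CoreAssignment` type, the `exec_rate` and `power` tables, the utilization-based energy model, and the proportional rescaling used for repair are hypothetical and are not taken from the slides.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class CoreAssignment:
    p_state: int          # index into this core's P-state table
    flow: list[float]     # tasks/second routed to this core, per task type


def repair(solution, arrival_rates):
    """Rescale flows so each task type's total flow equals its arrival rate."""
    for t, arrival in enumerate(arrival_rates):
        total = sum(core.flow[t] for core in solution)
        for core in solution:
            if total > 0:
                core.flow[t] *= arrival / total
            else:
                # Nothing routed yet: spread the arrivals evenly.
                core.flow[t] = arrival / len(solution)
    return solution


def fitness(solution, exec_rate, power):
    """Average power draw (energy per second), or inf if a core is overloaded.

    exec_rate[c][p][t]: tasks/second core c completes for type t at P-state p.
    power[c][p]: power of core c at P-state p while busy.
    Both tables are hypothetical inputs.
    """
    total = 0.0
    for c, core in enumerate(solution):
        # Fraction of time the core is busy with its assigned flows.
        utilization = sum(f / exec_rate[c][core.p_state][t]
                          for t, f in enumerate(core.flow))
        if utilization > 1.0 + 1e-9:
            return float("inf")
        total += utilization * power[c][core.p_state]
    return total


# Hypothetical two-core, two-task-type example.
arrival_rates = [10.0, 4.0]
exec_rate = [[[20.0, 10.0], [40.0, 20.0]],     # core 0 at P-states 0 and 1
             [[50.0, 25.0], [100.0, 50.0]]]    # core 1 at P-states 0 and 1
power = [[0.5, 1.0], [2.0, 4.0]]
solution = repair([CoreAssignment(0, [1.0, 1.0]),
                   CoreAssignment(1, [3.0, 1.0])], arrival_rates)
print(fitness(solution, exec_rate, power))
```

Running the repair step after every mutation keeps each candidate consistent with the global arrival rates, so the annealing loop only ever scores mappings that execute the whole workload.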

13
Running the simulations

14
Results
Figure: four panels, Energy Decrease at stress = 0.1, 0.4, 0.6, and 0.8. Each panel plots Avg % Decrease in Energy against # Iterations (1000 to 1000000) for grid sizes 3x3, 4x4, 5x5, 6x6, and 10x10.

15
Simulation Runtimes
Avg Runtime (s) by grid size and # iterations
Size    1000    10000   100000  1000000
3x3     <1      <1      <1      14
4x4     <1      <1      2       20
5x5     <1      <1      4       36
6x6     <1      <1      5       48
10x10   <1      2       25      244

16
Conclusion
• Heterogeneity in MCSOCs creates opportunities for
optimization
• Simulated annealing is an effective optimization heuristic
• Proper mapping of workloads in heterogeneous MCSOCs can
greatly reduce total energy consumption when compared to a
non-energy aware mapping methodology

17

Editor's Notes

  1. Certain tasks run faster on one architecture than on another. This reduction in runtime lowers the energy required to execute the task (energy is the integral of power over time, so less time means less energy at constant power consumption). Properly matching tasks to machines can reduce total system power.
  2. Square grid. The MCSOC is assumed to have two types of processor cores: high efficiency and high performance. There is an equal number of each core type (N^2/2) where possible. For cases with an odd number of total cores
  3. Simulated annealing is an iterative heuristic search algorithm that mimics the formation of structures in metals during cooling. The nature of the structures is a function of the rate of cooling. Faster cooling results in more irregular structures (i.e. higher total energy); slower cooling results in more regular structures (i.e. lower total energy).
  4. A neighbor is generated by modifying some of the parameters of the current solution.
  5. The solution is accepted if z > y. The smaller the change in energy (fitness function) and the higher the temperature, the greater the chance of accepting the proposed solution.
  6. Two architectures were chosen to create a heterogeneous computing environment: the A7 is high efficiency, while the A15, with its much more complex pipeline, offers higher performance at much higher power.
  7. The A7 and A15 each have 4 p-states. Power and performance are normalized for simplicity. Relative power is used for p-state power in the simulation (code snippet shown). The relative performance of each p-state is used in the Gem5 workload simulations to build the ECS matrix.
  8. Synthetically generated workload. Five standard benchmarks (FFT, ocean with contiguous partitions, ocean with non-contiguous partitions, radix, lu) were simulated on the ARM architecture included in Gem5 at four different clock speeds (1 GHz, 866 MHz, 650 MHz, 434 MHz) taken from the real-world ARM p-states shown in the previous slide. The FFT benchmark was run twice with two different problem sizes. While the simulated workloads are not a comprehensive survey of all possible tasks for an embedded system, they vary sufficiently in runtime and computational intensity for the purposes of this investigation.
  9. Runtimes are used to generate the ECS matrix for all task/core/p-state combinations. The actual ECS matrix from the code is shown. ECS is the inverse of runtime.
  10. Tasks are assigned to cores as flow rates rather than as individual tasks. A flow rate is the fraction of time that the core spends executing that task. The execution rate for a task type on a core in a given p-state is the product of the flow rate and the ECS for that task/core/p-state. The global execution rate for each task is the sum of the individual execution rates on each core. Energy is calculated by summing the energy of each core; the energy of each core is a function of its core type and its current p-state. The relative energy for each p-state on each core type was obtained from the previously mentioned ARM whitepaper. A generated solution might not be valid; a repair function randomly increases p-states until all task types are fully executed.
  11. Several hours were needed to collect the necessary number of trials for all data points.
  12. Five different MCSOC sizes were considered. Each configuration was simulated five times for each of the four iteration limits (1,000, 10,000, 100,000, and 1,000,000 iterations) of the SA algorithm. Percent decrease is calculated as the relative decrease in total device energy from a randomly generated initial solution. The five trials were averaged to accurately represent performance (due to the random initial solution). Explain stress factor: the stress factor is the percentage of the maximum workload that the device can support. For example, a stress factor of 0.8 means that the workload is 80% of the maximum. Higher stress factors allow less opportunity for optimization, as more of the device resources are utilized, limiting the number of available allocation options. During simulations, 4 stress factors were tested to simulate a full spectrum of MCSOC workload conditions. A stress factor of 1 was not simulated, as this would mean that the entire device was fully utilized and therefore there would be no opportunity for optimization.
  13. While this simulation is intended to be a static mapping, it is important to consider how long the mapping takes to complete. The mapping times are very small except for large MCSOCs with a large number of iterations of the SA algorithm.
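
The acceptance rule in notes 3 to 5 can be sketched roughly as follows. The notes do not define z and y, so this sketch assumes the common Metropolis form: z = exp(-ΔE / T) and y is a uniform random number in [0, 1). The function names, the geometric cooling schedule, and the toy energy function are all hypothetical and are not taken from the project code.

```python
import math
import random


def acceptance_probability(delta_energy, temperature):
    """Assumed Metropolis criterion: improvements are always accepted;
    worse solutions are accepted with probability exp(-dE / T)."""
    if delta_energy <= 0:
        return 1.0
    return math.exp(-delta_energy / temperature)


def simulated_annealing(initial, energy, neighbor, t0=1.0, cooling=0.95,
                        iterations=1000, rng=None):
    """Generic SA loop; returns the best solution seen and its energy."""
    rng = rng or random.Random(0)
    current, current_e = initial, energy(initial)
    best, best_e = current, current_e
    t = t0
    for _ in range(iterations):
        candidate = neighbor(current, rng)
        candidate_e = energy(candidate)
        z = acceptance_probability(candidate_e - current_e, t)
        y = rng.random()
        if z > y:  # accept the proposed solution
            current, current_e = candidate, candidate_e
            if current_e < best_e:
                best, best_e = current, current_e
        # Geometric cooling (assumed); the floor avoids division by zero.
        t = max(t * cooling, 1e-12)
    return best, best_e


# Toy problem (hypothetical): minimize (x - 3)^2 starting from x = 10.
def toy_energy(x):
    return (x - 3.0) ** 2


def toy_neighbor(x, rng):
    return x + rng.uniform(-1.0, 1.0)


best, best_e = simulated_annealing(10.0, toy_energy, toy_neighbor)
print(f"best x = {best:.3f}, energy = {best_e:.5f}")
```

Early on, the high temperature lets the search accept some worse solutions and escape local minima; as the temperature falls, the loop behaves more and more like a greedy descent.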
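
The flow-rate model in notes 9 and 10 can be illustrated with a small sketch. Every number below (runtimes, relative energies, arrival rates) is invented for illustration, the example uses two p-states per core type rather than four to stay short, and the data layout and function names are hypothetical rather than the project's own.

```python
import random

# Hypothetical runtimes in seconds: RUNTIME[core_type][p_state][task_type].
# p-state 0 is the slowest, lowest-power state.
RUNTIME = {
    "A7": [[0.016, 0.008], [0.008, 0.004]],
    "A15": [[0.008, 0.002], [0.004, 0.001]],
}
# Hypothetical relative energy per core type and p-state.
RELATIVE_ENERGY = {"A7": [0.1, 0.3], "A15": [0.4, 1.0]}


def ecs(core_type, p_state, task):
    """ECS is the inverse of runtime (note 9)."""
    return 1.0 / RUNTIME[core_type][p_state][task]


def global_execution_rates(cores, n_tasks):
    """Sum flow rate * ECS over all cores for each task type."""
    rates = [0.0] * n_tasks
    for core in cores:
        for task, flow in enumerate(core["flows"]):
            rates[task] += flow * ecs(core["type"], core["p_state"], task)
    return rates


def total_energy(cores):
    """Device energy is the sum of per-core energies, which depend only
    on core type and current p-state."""
    return sum(RELATIVE_ENERGY[c["type"]][c["p_state"]] for c in cores)


def repair(cores, arrival_rates, rng):
    """Randomly raise p-states until every task type is fully executed.
    Returns False if all cores reach their top p-state first."""
    while True:
        rates = global_execution_rates(cores, len(arrival_rates))
        if all(r >= a for r, a in zip(rates, arrival_rates)):
            return True
        upgradable = [c for c in cores
                      if c["p_state"] < len(RELATIVE_ENERGY[c["type"]]) - 1]
        if not upgradable:
            return False
        rng.choice(upgradable)["p_state"] += 1


# Each core splits its time evenly between two task types; the flows on a
# core are fractions of its time, so they must sum to at most 1.
cores = [
    {"type": "A7", "p_state": 0, "flows": [0.5, 0.5]},
    {"type": "A15", "p_state": 0, "flows": [0.5, 0.5]},
]
arrival_rates = [150.0, 400.0]  # hypothetical tasks per second

initial_rates = global_execution_rates(cores, 2)
is_valid = repair(cores, arrival_rates, random.Random(0))
final_rates = global_execution_rates(cores, 2)
print(initial_rates, is_valid, final_rates, total_energy(cores))
```

In this toy setup the initial mapping cannot keep up with the arrival rates, so the repair step raises p-states until it can, at the cost of higher total energy. That trade-off is what the annealing search then tries to minimize.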