1 
How to build an energy model 
for your SoC 
Linaro Connect LCU14, Burlingame, CA. 
Morten Rasmussen
Why do you need an energy model? 
 Most of the Linux kernel is blissfully unaware of SoC power 
management features: 
 P-states, clock domains, C-states, power domains, ... 
 Only largely autonomous subsystems are aware of some of these 
details (cpufreq, cpuidle, …) 
 The plan is to change that by coordinating task scheduling, frequency 
scaling, and idle-state selection to improve power management. 
 Energy saving techniques must be applied under the right 
circumstances which vary between SoCs. 
 The kernel must therefore have a better understanding of 
power(energy)/performance trade-offs for the particular SoC to make 
the right decisions. 
 An energy model can provide that information. 
 As a bonus, the energy model may also be used by tools to make quick 
energy estimates based on execution traces. 
2
Modelling limitations 
 Models are never accurate, but we only need enough detail 
to make the right decisions most of the time. 
 The model will be used by critical code paths in the kernel, 
so it has to be as simple as possible. 
 The model only considers cpus, not memory or peripherals. 
3
A simplified system view 
4 
[Diagram: two cpu clusters, cpu0/cpu1 and cpu2/cpu3, each with its own shared HW and 
clock source, all fed from a common power supply. Legend: clock gating points, power 
gating points, and power domain boundaries.] 
Energy consumption simplified 
5 
[Plot: power vs. time for a cpu running busy at P-state Px, transitioning into idle 
state Cz, then transitioning back to run busy at P-state Py. The shaded areas are the 
busy energy, the transition energy, and the idle energy.] 
Scheduler Topology Hierarchy 
6 
[Diagram: sched_domain hierarchy for cpus 0-3, grouped into two clusters/packages. 
Energy model tables are attached to each struct sched_group: per-core P-states and 
C-states at the cpu level, cluster/package P-states and C-states at the level above.] 
Disclaimer: This is a simplified view of the sched_domain hierarchy. 
Energy model data 
 P-states: 
 Compute capacity: Performance score normalized to the highest P-state of the 
fastest cpu in the system (1024). Choose the benchmark carefully; preferably use a 
suite of benchmarks. 
 Power: Busy power = energy/second. May be normalized to any reference, but 
must be consistent across all cpus. 
 C-states: 
 Power: Idle power = energy/second. Normalized. 
 Wake-up energy. Energy consumed during P->C + C->P state 
transitions. Unit must be consistent with power numbers. 
 Note: 
7 
 Power numbers should only include the power consumption associated 
with the group where the tables are attached, i.e. per-core P-state 
power should only include power consumed by the core itself; shared 
HW is accounted for in the table belonging to the level above.
Energy model data 
8 
Cluster-level tables: 
  P-states:  capacity   power   (freq) 
               358       2967   (350) 
               ...        ...    ... 
              1024       4905   (1000) 
  C-states:  power   wu   (state) 
               10      6   (C1) 
              ...     ...    ... 
Per-cpu tables (cpu 0 and cpu 1): 
  P-states:  capacity   power   (freq) 
               358        187   (350) 
               ...        ...    ... 
              1024       1024   (1000) 
  C-states:  power   wu   (state) 
                0      0   (WFI) 
              ...     ...    ... 
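The tables above can be sketched directly as data. Below is a minimal Python illustration; the structure and the names (CLUSTER_TABLE, CPU_TABLE, busy_power) are assumptions for this sketch, not the kernel's data structures, and only the first and last rows from the example are filled in. 

# Illustrative encoding of the example energy model tables above.
CLUSTER_TABLE = {
    # (capacity, busy power, freq in MHz)
    "p_states": [(358, 2967, 350), (1024, 4905, 1000)],
    # (idle power, wake-up energy, state)
    "c_states": [(10, 6, "C1")],
}
CPU_TABLE = {
    "p_states": [(358, 187, 350), (1024, 1024, 1000)],
    "c_states": [(0, 0, "WFI")],
}

def busy_power(table, capacity):
    # Busy power at the lowest P-state that provides at least `capacity`;
    # saturate at the highest P-state otherwise.
    for cap, power, _freq in table["p_states"]:
        if cap >= capacity:
            return power
    return table["p_states"][-1][1]

print(busy_power(CPU_TABLE, 358))   # -> 187
print(busy_power(CPU_TABLE, 500))   # -> 1024 (rounded up to the next P-state)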
Energy model algorithm 
9 
for_each_domain(cpu, sd) { 
    sg = sched_group_of(cpu) 
    energy_before = curr_util(sg) * busy_power(sg) 
                    + (1 - curr_util(sg)) * idle_power(sg) 
    energy_after = new_util(sg) * busy_power(sg) 
                   + (1 - new_util(sg)) * idle_power(sg) 
                   + (1 - new_util(sg)) * wakeups * wakeup_energy(sg) 
    energy_diff += energy_before - energy_after 
    if (energy_before == energy_after) 
        break; 
} 
return energy_diff
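A runnable Python sketch of the walk above, assuming each level of the hierarchy is handed over as a plain record (curr_util, new_util, busy_power, idle_power, wakeups, wakeup_energy); the names are illustrative, not scheduler code. 

def group_energy(util, busy_power, idle_power, wakeups=0, wakeup_energy=0):
    # Busy energy + idle energy + (for the "after" case) wake-up cost.
    return (util * busy_power
            + (1 - util) * idle_power
            + (1 - util) * wakeups * wakeup_energy)

def estimate_energy_diff(groups):
    # `groups` is ordered from the cpu-level sched_group upwards.
    diff = 0.0
    for g in groups:
        before = group_energy(g["curr_util"], g["busy_power"], g["idle_power"])
        after = group_energy(g["new_util"], g["busy_power"], g["idle_power"],
                             g["wakeups"], g["wakeup_energy"])
        diff += before - after
        if before == after:
            # Nothing changed at this level, so nothing changes further up.
            break
    return diff

# Example: the cpu-level group changes, the cluster-level group does not
# (triggering the early exit). Numbers are made up.
print(estimate_energy_diff([
    {"curr_util": 0.5, "new_util": 0.3, "busy_power": 1024, "idle_power": 0,
     "wakeups": 100, "wakeup_energy": 0.1},
    {"curr_util": 0.5, "new_util": 0.5, "busy_power": 4905, "idle_power": 10,
     "wakeups": 0, "wakeup_energy": 0},
]))   # -> ~197.8 (positive: the change saves energy)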
Backups 
10
11 
Platform performance/energy 
data/model in scheduler or 
user-space 
Energy-Aware Workshop @ Kernel Summit 2014, Chicago 
Morten Rasmussen
Sub-topics 
 Techniques for reducing energy consumption vary between 
platforms: 
 Race-to-idle 
 Task packing 
 P- and C-state constraints (Turbo Mode, package C-states, …) 
 … but they are not universally good; most apply only to a certain 
extent. 
 We need to know when to apply each of the techniques for a 
particular platform. 
 Proposals: 
12 
 Tunable heuristics for each technique that can be controlled by somebody 
else (user-space?), basically passing the problem on to others. 
 Provide in-kernel performance/energy model that can estimate the 
impact of scheduling decisions.
Backup/More stuff 
13
Model Validation: ARM TC2, sysbench 
14 
Correlation (Pearson): 
A15 = 0.93 
A7 = 0.96
Model Validation: ARM TC2, periodic 
15 
Correlation (Pearson): 
A15 = 0.17 
A7 = -0.01
Model Validation: ARM TC2, Android audio 
16 
Correlation (Pearson): 
A15 = 0.03 
A7 = 0.48
Model Validation: ARM TC2, Android 
bbench 
17 
Correlation (Pearson): 
A15 = 0.67 
A7 = 0.80
Old slides 
18
Motivation 
 Energy cost driven task placement (load-balancing) 
19 
 Focus on the actual goal of the energy-aware scheduling activities: 
 Saving energy while achieving (near) optimum performance. 
 Energy benefit of scheduling decision clear when made. 
 Assuming energy cost estimates are fairly accurate. 
 Introduce a simple energy model to estimate costs and guide 
scheduling decisions. 
 Requested by maintainers at the KS workshop. 
 Gives the right amount of packing and spreading. 
 May simplify balancing decision logic. 
 Strong focus on saving energy in load balancing algorithms. 
 big.LITTLE support comes naturally and almost for free. 
 This is just one part of the energy efficiency work. 
 Several related sessions this week.
Energy Load Balancing 
 The idea (a bit simplified): 
20 
 Let the resulting energy consumption guide all balancing decisions: 
 if (energy_diff(task, src_cpu, dst_cpu) > 0) { 
       move_task(task, src_cpu, dst_cpu); 
   } else { 
       /* Try some other task */ 
   } 
 Ideally, we should get the optimum balance if we try all combinations 
of tasks and cpus. 
 In reality it is not that simple. We can't try all combinations, but we 
can get fairly close for most scenarios. 
 If the energy model is accurate enough, we get packing and spreading 
implicitly, and only when it saves energy. 
 Should work for any system. SMP and big.LITTLE (with a few 
extensions).
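As a toy illustration of the balancing idea above (not scheduler code): given an energy_diff() like the one sketched later in the deck, where a positive value means the move saves energy, pick the candidate task with the largest estimated saving. 

def pick_task_to_move(task_loads, src_cpu, dst_cpu, energy_diff):
    # Try each candidate task load and keep the one with the largest positive
    # estimated energy saving; return None if no move saves energy.
    best_load, best_saving = None, 0.0
    for tload in task_loads:
        saving = energy_diff(tload, src_cpu, dst_cpu)
        if saving > best_saving:
            best_load, best_saving = tload, saving
    return best_load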
Power and Energy 
 Goal: Save energy, not power. 
21 
[Plot: power vs. time; energy is the area under the power curve.] 
 e_cpu = P * t,  where t = inst / cc  and  cc = compute capacity (~ freq * uarch) 
 e_cpu = P(cc) * inst / cc,  so P(cc)/cc = energy per instruction: this is what we try to minimize. 
 Splitting the instructions into the tasks' work and idle time: 
   e_cpu = P(cc) * (inst_task / cc + inst_idle / cc) = e_task + e_idle 
 If we have cpuidle support we get: 
   e_cpu = P_busy(cc) * inst_task / cc + P_idle * inst_idle / cc 
 inst_task / cc ~ utilization (tracked load: time in runnable state) 
 We have to add an additional leakage energy term to reflect that it is better not to 
wake cpus unnecessarily. 
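A tiny worked example of the busy/idle split above, using made-up numbers borrowed from the "entirely made up" little-cpu table later in the deck; the helper name is illustrative. It shows that which P-state is cheaper for a fixed amount of work depends entirely on the busy and idle power numbers. 

def window_energy(work, window, cc, p_busy, p_idle):
    # e_cpu = P_busy(cc) * inst_task/cc + P_idle * inst_idle/cc over a fixed
    # wall-clock window: busy time is work/cc, the rest of the window is idle.
    busy_time = work / cc
    idle_time = window - busy_time
    return p_busy * busy_time + p_idle * idle_time

# Same work, a window of 10 time units, two P-states of the made-up little cpu.
print(window_energy(1.0, 10.0, cc=0.2, p_busy=0.4, p_idle=0.1))  # -> 2.5
print(window_energy(1.0, 10.0, cc=0.4, p_busy=0.9, p_idle=0.1))  # -> 3.0

With these particular numbers the lower P-state wins; with a different power curve or non-zero wake-up costs, running faster and idling longer (race-to-idle) can win instead, which is why the tables have to come from the particular SoC. 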
Simple Energy Model 
22 
 cpu_energy = power(cc) * util/cc 
              + idle_power * (1 - util/cc) 
              + leakage_energy 
 cluster_energy = c_active_power * c_util 
                  + c_idle_power * (1 - c_util) 
 util = Scale invariant cpu utilization (Tracked load). 
 cc = Current compute capacity (depends on freq and uarch). 
 power(cc) = Busy power (fully loaded) at current capacity from table. 
 idle_power = Idle power consumption (~WFI). 
 leakage_energy = Constant representing the cost of waking the cpu. 
 c_util = Cluster utilization. Depends on max(util/cc) ratio of its cpus. 
 c_active_power = Cluster active power. 
 c_idle_power = Cluster idle power.
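A direct Python transcription of the two formulas above, handy for experimenting with made-up tables (illustrative only; leakage_energy is the wake-up cost constant and is simply added, matching the formula as written): 

def cpu_energy(util, cc, busy_power, idle_power, leakage_energy=0.0):
    # power(cc) * util/cc + idle_power * (1 - util/cc) + leakage_energy
    return (busy_power * (util / cc)
            + idle_power * (1 - util / cc)
            + leakage_energy)

def cluster_energy(c_util, c_active_power, c_idle_power):
    # c_active_power * c_util + c_idle_power * (1 - c_util)
    return c_active_power * c_util + c_idle_power * (1 - c_util)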
Compute Capacity and Power 
 Processor-specific table expressing power and compute 
capacity at each P-state. 
 The sched domain hierarchy is in a good position to hold this type of 
information. 
 Example (entirely made up): 
23 
Little cpu:                        Big cpu: 
  Capacity   Power                   Capacity   Power 
    0.2       0.4                      0.4       1.6 
    0.4       0.9                      0.8       4.4 
    0.6       1.5                      1.2       9.0 
    0.8       2.2                      1.6      15.0 
    1.0       3.2                      2.0      23.0 
  idle        0.1                    idle        0.3 
  leakage     0.1                    leakage     0.5 
(Equal compute capacity rows can be compared directly: at capacity 0.8 the big cpu 
draws twice the power of the little.) 
Cluster:          Little   Big 
  active power     2.4     6.0 
  idle power       0.0     0.0
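The made-up tables above are enough to drive the energy_diff() sketch on the next slide. Below is one way to encode them in Python together with the two table lookups that sketch relies on; for simplicity the helpers take a cpu type string ("little"/"big") rather than a cpu id, and all names are illustrative. 

CAP_POWER = {
    "little": [(0.2, 0.4), (0.4, 0.9), (0.6, 1.5), (0.8, 2.2), (1.0, 3.2)],
    "big":    [(0.4, 1.6), (0.8, 4.4), (1.2, 9.0), (1.6, 15.0), (2.0, 23.0)],
}
IDLE_POWER = {"little": 0.1, "big": 0.3}
LEAKAGE_ENERGY = {"little": 0.1, "big": 0.5}
CLUSTER_POWER = {"little": {"active": 2.4, "idle": 0.0},
                 "big":    {"active": 6.0, "idle": 0.0}}

def find_cpu_cap(cpu_type, util):
    # Lowest P-state capacity that can serve `util` (saturate at the top).
    for cap, _power in CAP_POWER[cpu_type]:
        if cap >= util:
            return cap
    return CAP_POWER[cpu_type][-1][0]

def cpu_cc_power(cpu_type, cc):
    # Busy power at compute capacity `cc`, looked up from the table.
    for cap, power in CAP_POWER[cpu_type]:
        if cap >= cc:
            return power
    return CAP_POWER[cpu_type][-1][1]

print(find_cpu_cap("little", 0.3), cpu_cc_power("little", 0.4))  # -> 0.4 0.9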
energy_diff() 
 Balancing two cpus: 
24 
def energy_diff(tload, scpu, dcpu): 
    # Estimate the source cpu compute capacity (P-state) 
    s_new_cc = find_cpu_cap(scpu, cpu_util(scpu)) 
    # Energy model cost for the task on the source cpu 
    s_task_energy = tload/s_new_cc * cpu_cc_power(scpu, s_new_cc) 
    if nr_running(scpu) == 1: 
        s_task_energy += cpu_leakage_energy[cpu_type[scpu]] 
    # Estimate the destination cpu compute capacity after adding the task 
    d_new_cc = find_cpu_cap(dcpu, cpu_util(dcpu) + tload) 
    # Energy model cost for the task on the destination cpu 
    d_task_energy = tload/d_new_cc * cpu_cc_power(dcpu, d_new_cc) 
    if nr_running(dcpu) == 0: 
        d_task_energy += cpu_leakage_energy[cpu_type[dcpu]] 
    return s_task_energy - d_task_energy 
 Balancing sched domains is slightly more complicated as it 
involves cluster power as well.
Example 
25 
Before the balance (total energy 3.35): 
  cpu      rq          util   cap   cc_power   leak   power 
  0        {0.2}       0.2    0.2   0.4        0.1    0.5 
  1        {0.1}       0.1    0.2   0.4        0.1    0.35 
  2        {}          0.0    0.2   0.4        0.1    0.1 
  cluster  -           1.0    -     2.4        -      2.4 
  Total                                               3.35 
energy_diff() for moving the 0.1 task from cpu 1 to cpu 0 = 0.075* 
After EA load balance (total energy 2.8, i.e. 0.55 saved): 
  cpu      rq          util   cap   cc_power   leak   power 
  0        {0.2, 0.1}  0.3    0.4   0.9        0.1    0.8 
  1        {}          0.0    0.4   0.9        0.1    0.1 
  2        {}          0.0    0.4   0.9        0.1    0.1 
  cluster  -           0.75   -     2.4        -      1.8 
  Total                                               2.8 
* energy_diff() ignores cluster power and other tasks to keep computations cheap and simple. 
Better accuracy can be added if necessary.
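The numbers above can be reproduced with a few lines of self-contained Python using the made-up little-cpu tables from the earlier slide. The helper names and the packed-cluster P-state assumption (all cpus in the cluster share the P-state required by the busiest cpu) are illustrative. 

CAP_POWER = [(0.2, 0.4), (0.4, 0.9), (0.6, 1.5), (0.8, 2.2), (1.0, 3.2)]
IDLE, LEAK, C_ACTIVE = 0.1, 0.1, 2.4

def cap_for(util):
    return next(cap for cap, _p in CAP_POWER if cap >= util)

def busy_power(cc):
    return next(p for cap, p in CAP_POWER if cap >= cc)

def cpu_power(util, cc):
    # Idle cpus only pay idle power; busy cpus pay busy + idle + leakage.
    if util == 0:
        return IDLE
    return busy_power(cc) * util / cc + IDLE * (1 - util / cc) + LEAK

def system_energy(utils):
    cc = cap_for(max(utils))                       # cluster-wide P-state
    c_util = max(u / cc for u in utils)            # cluster utilization
    return sum(cpu_power(u, cc) for u in utils) + C_ACTIVE * c_util

before = system_energy([0.2, 0.1, 0.0])            # tasks spread
after = system_energy([0.3, 0.0, 0.0])             # tasks packed on cpu 0
print(round(before, 2), round(after, 2), round(before - after, 2))  # 3.35 2.8 0.55

# energy_diff() for moving the 0.1 task from cpu 1 to cpu 0 (cluster ignored):
s_cc, d_cc = cap_for(0.1), cap_for(0.2 + 0.1)
s_task = 0.1 / s_cc * busy_power(s_cc) + LEAK      # task was alone on cpu 1
d_task = 0.1 / d_cc * busy_power(d_cc)             # cpu 0 already busy
print(round(s_task - d_task, 3))                   # 0.075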
Is the energy model too simple? 
 It is essential that the energy model is fast and easy to use for load-balancing. 
26 
 The scheduler is a critical path and already complex enough. 
 Python model tests 
 Disclaimer: These numbers have not been validated in any way. 
 Test configuration: 3+3 big.LITTLE, 1000 random balance scenarios. 
 Rand/Opt: Random balance energy (starting point) worse than best possible balance 
energy (brute-force). 
 EA/Opt: Energy model based balance energy worse than best possible balance energy. 
 EA == Opt: Scenarios where EA found best possible balance. 
Tasks Rand/Opt EA/Opt EA == Opt 
2 7.86% 0.09% 72.60% 
3 7.79% 0.15% 64.80% 
4 9.39% 0.45% 62.00% 
5 10.02% 1.15% 51.10% 
6 11.44% 2.23% 38.30%
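The brute-force comparison described above can be sketched in a few lines: enumerate every assignment of the task loads to cpus, score each with an energy function, and compare the best against the model-guided placement. The energy() callback is deliberately left abstract here; the original tests used the full big.LITTLE model and are not reproduced. 

from itertools import product

def brute_force_best(task_loads, n_cpus, energy):
    # Exhaustively enumerate placements (n_cpus ** n_tasks of them) and keep
    # the minimum-energy one. Only feasible for small scenarios like the above.
    best_energy, best_placement = None, None
    for placement in product(range(n_cpus), repeat=len(task_loads)):
        utils = [0.0] * n_cpus
        for load, cpu in zip(task_loads, placement):
            utils[cpu] += load
        e = energy(utils)
        if best_energy is None or e < best_energy:
            best_energy, best_placement = e, placement
    return best_energy, best_placement

Plugging in a system_energy() like the one sketched for the example slide would give the "Opt" column; comparing it against random and model-guided placements gives the ratios in the table. 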
What is next? 
 Early prototype to validate the idea. Initial focus is getting 
energy_diff() working on a simple SMP system. 
 Post on LKML very soon. 
 Open Issues 
 Exposing power/capacity tables to the kernel. Essential to make the right 
decisions. 
 Plumbing: Where do the tables come from? DT? 
 Next steps: 
27 
 Scale invariance: Requirement for the energy model to work. 
 Fix cpu_power/compute capacity use in scheduler. 
 Tooling and benchmarks (covered in another session) 
 Idle integration (covered in another session)
Questions? 
28
