Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.
Quantifying Energy Consumption
for Practical Fork-Join Parallelism
on an Embedded Real-Time Operating System
Antonio Paoli...
2
Goal:
3
Goal:
Parallelism helps to reduce energy
while meeting real-time requirements
4
5
Quantifying Energy Consumption
for Practical Fork-Join Parallelism
on an Embedded Real-Time Operating System
6
Quantifying Energy Consumption
for Practical Fork-Join Parallelism
on an Embedded Real-Time Operating System
7
Quantifying Energy Consumption
for Practical Fork-Join Parallelism
on an Embedded Real-Time Operating System
8
Quantifying Energy Consumption
for Practical Fork-Join Parallelism
on an Embedded Real-Time Operating System
9
Quantifying Energy Consumption
for Practical Fork-Join Parallelism
on an Embedded Real-Time Operating System
10
Quantifying Energy Consumption
for Practical Fork-Join Parallelism
on an Embedded Real-Time Operating System
11
Quantifying Energy Consumption
for Practical Fork-Join Parallelism
on an Embedded Real-Time Operating System
12
Quantifying Energy Consumption
for Practical Fork-Join Parallelism
on an Embedded Real-Time Operating System
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
Combine:
● parallel task model
● Power-aware multi-core scheduling
55
Save more with parallelism? Yes!
Combine:
● parallel task model
● Power-aware multi-core scheduling
→ validated in theory → malleable jobs (RTCSA’14)
56
Sa...
This work
Goal:
Reduce energy consumption as much possible
while meeting real-time requirements.
57
This work
Goal:
Reduce energy consumption as much possible
while meeting real-time requirements.
Validate it in practice
58
Validate in practice
● Parallel programming model
● RTOS techniques → power-aware multi-core hard real-time scheduling
● E...
1. Run-time framework
2. Task model and analysis
3. Experiments
60
1. Run-time framework
2. Task model and analysis
3. Experiments
61
Run-time framework
62
MPSoC platform
Run-time framework
63
MPSoC platform
RTOS
& lib support
Run-time framework
64
MPSoC platform
RTOS
& lib support
Run-time framework
65
MPSoC platform
RTOS
& lib support
Parallel app.
benchmark
Run-time framework
66
MPSoC platform
RTOS
& lib support
Parallel app.
benchmark
Parallel programming model
67
(Simple) Parallel Program
68
(Simple) Parallel Program Flow
69
(Simple) Parallel Program Flow
70
An embedded RTOS
71
An embedded RTOS
72
● Micro-kernel based
● Natively supports multi-core
● Provides hard-real time scheduling
● Power manag...
1. Run-time framework
2. Task model and analysis
3. Experiments
73
Parallel task model
74
Parallel task model
75
● Sporadic fork-join
● Arbitrary deadlines
● Rate monotonic based analysis
● Threads are partitioned
● 3 stages, very simp...
● Symmetric Multi-Core
● Global DVFS, finite set of operating points 〈 Vdd
, f 〉
● Power increases monotonically with freq...
Optimisation & partitioning process
78
Power optimisation
Partitioner
Schedulability test
Optimisation & partitioning process
79
Power optimisation
Partitioner
Schedulability test
Axer et al
proved wrong (shepher...
1. Run-time framework
2. Task model and analysis
3. Experiments
80
1. Run-time framework
2. Task model and analysis
3. Experiments
a. Testbed description
b. Single use cases
c. Task systems...
1. Run-time framework
2. Task model and analysis
3. Experiments
a. Testbed description
b. Single use cases
c. Task systems...
Experimental stack
83
MPSoC platform
RTOS
& lib support
Parallel app.
benchmark
Experimental stack
84
MPSoC platform
RTOS
& lib support
Parallel app.
benchmark
Experimental stack
85
MPSoC platform
RTOS
& lib support
Parallel app.
benchmark
Measurement
framework
Experimental stack
86
MPSoC platform
RTOS
& lib support
Parallel app.
benchmark
Measurement
framework
Target platform - i.MX6q SabreLite
87
● Embedded board ARM Cortex A9 MP
● Supported by HIPPEROS
● 4 cores, but global DVFS...
Operating points
88
Operating points
89
90
MPSoC
Embedded board
91
MPSoC
Embedded board
Power supply
220 V
220 V
Transformer
5 V
92
MPSoC
Oscilloscope
Embedded board
Power supply
220 V
220 V5 V
R
Voltage
measurements
P = V² / R
Transformer
93
MPSoC
Control Computer
Oscilloscope
Embedded board
Power supply
220 V
220 V5 V
R
Serial
JTAG
USB
Captured data
transfer...
Use cases
94
MPSoC platform
RTOS
& lib support
Parallel app.
benchmark
Measurement
framework
Use cases
95
MPSoC platform
RTOS
& lib support
Parallel app.
benchmark
Measurement
framework
π
AES
Image
● Different workloads
● Easy OpenMP implementation
● Good scaling expected
Use cases
96
Image
AES π
1. Run-time framework
2. Task model and analysis
3. Experiments
a. Testbed description
b. Single use cases
c. Task systems...
Use case alone - voltage probe output
98
Use case alone - converted to power values
99
Use case alone - execution time measurement
100
Use case alone - peak power measurement
101
Use case alone - energy measurement
102
Use case alone - execution time
103
Use case alone - peak power
104
Single core power consumption
105
P ∝ Vdd
2
f
Vdd
∝ f
Single core power consumption
106
P ∝ Vdd
2
f
Vdd
∝ f
→ P ∝ f3
Multi-core power consumption
107
P ∝ k f3
Use case alone - peak power
108
P ∝ k f 3
Use case alone - energy
109
Execution time speedup
110
Frequency scaling OpenMP parallelism
1. Run-time framework
2. Task model and analysis
3. Experiments
a. Testbed description
b. Single use cases
c. Task systems...
System experiments
112
● Generate 232 random task systems (U = 0.6 → 2.9)
System experiments
113
● Generate 232 random task systems (U = 0.6 → 2.9)
● Bind generated tasks to use cases
System experiments
114
● Generate 232 random task systems (U = 0.6 → 2.9)
● Bind generated tasks to use cases
● Scheduling test & optimisation in...
● Generate 232 random task systems (U = 0.6 → 2.9)
● Bind generated tasks to use cases
● Scheduling test & optimisation in...
● Generate 232 random task systems (U = 0.6 → 2.9)
● Bind generated tasks to use cases
● Scheduling test & optimisation in...
● Generate 232 random task systems (U = 0.6 → 2.9)
● Bind generated tasks to use cases
● Scheduling test & optimisation in...
● Generate 232 random task systems (U = 0.6 → 2.9)
● Bind generated tasks to use cases
● Scheduling test & optimisation in...
Energy consumption
120
Relative energy savings
121
Conclusions
122
● Practical experimental framework flow for parallel real-time applications
● Parallelisation saves up to ...
Conclusion
123
Parallelism helps to reduce energy
while meeting real-time requirements
Thank you.
124
Questions?
125
Dynamic power vs static leakage
126
P ∝ k f3
+ α k
P ∝ k f3
+ α k
Dynamic power vs static leakage
127
dynamic static
Dynamic power vs static leakage
128
dynamic static
● α is platform-dependent
● Comparison between DVFS and DPM
P ∝ k f3
+ ...
● 232 systems for each degree of //
○ No high utilisation systems
→ Partitioned Rate Monotonic
○ Only feasible systems for...
● Same systems, but rel. to 1 thread
● Parallelism helps to save energy :-)
● < 0.13, but not a lot of systems...
● “4 thr...
● “3 threads” is bad:
○ Loss of symmetry with 4 cores
○ Different tasks executes simultaneously (1+3)
● Measures on the wh...
● Flawed, but does not impact results
○ Axer et al (ECRTS’13)
○ Flaw pointed by Fonseca et al (SIES’16)
● Scheduling frame...
Upcoming SlideShare
Loading in …5
×

of

Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 1 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 2 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 3 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 4 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 5 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 6 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 7 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 8 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 9 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 10 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 11 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 12 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 13 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 14 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 15 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 16 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 17 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 18 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 19 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 20 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 21 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 22 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 23 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 24 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 25 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 26 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 27 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 28 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 29 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 30 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 31 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 32 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 33 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 34 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 35 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 36 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 37 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 38 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 39 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 40 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 41 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 42 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 43 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 44 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 45 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 46 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 47 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 48 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 49 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 50 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 51 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 52 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 53 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 54 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 55 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 56 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 57 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 58 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 59 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 60 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 61 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 62 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 63 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 64 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 65 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 66 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 67 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 68 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 69 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 70 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 71 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 72 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 73 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 74 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 75 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 76 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 77 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 78 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 79 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 80 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 81 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 82 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 83 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 84 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 85 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 86 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 87 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 88 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 89 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 90 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 91 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 92 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 93 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 94 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 95 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 96 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 97 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 98 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 99 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 100 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 101 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 102 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 103 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 104 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 105 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 106 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 107 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 108 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 109 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 110 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 111 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 112 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 113 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 114 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 115 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 116 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 117 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 118 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 119 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 120 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 121 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 122 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 123 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 124 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 125 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 126 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 127 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 128 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 129 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 130 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 131 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Slide 132
Upcoming SlideShare
What to Upload to SlideShare
Next
Download to read offline and view in fullscreen.

0 Likes

Share

Download to read offline

Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System

Download to read offline

Presentation by Antonio Paolillo (HIPPEROS) at the 24th ACM International Conference on Real-Time Networks and Systems (21st October 2016)

Related Books

Free with a 30 day trial from Scribd

See all

Related Audiobooks

Free with a 30 day trial from Scribd

See all
  • Be the first to like this

Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System

  1. 1. Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System Antonio Paolillo - Ph.D student & software engineer 24th ACM International Conference on Real-Time Networks and Systems 21th October 2016
  2. 2. 2
  3. 3. Goal: 3
  4. 4. Goal: Parallelism helps to reduce energy while meeting real-time requirements 4
  5. 5. 5 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System
  6. 6. 6 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System
  7. 7. 7 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System
  8. 8. 8 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System
  9. 9. 9 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System
  10. 10. 10 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System
  11. 11. 11 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System
  12. 12. 12 Quantifying Energy Consumption for Practical Fork-Join Parallelism on an Embedded Real-Time Operating System
  13. 13. 13
  14. 14. 14
  15. 15. 15
  16. 16. 16
  17. 17. 17
  18. 18. 18
  19. 19. 19
  20. 20. 20
  21. 21. 21
  22. 22. 22
  23. 23. 23
  24. 24. 24
  25. 25. 25
  26. 26. 26
  27. 27. 27
  28. 28. 28
  29. 29. 29
  30. 30. 30
  31. 31. 31
  32. 32. 32
  33. 33. 33
  34. 34. 34
  35. 35. 35
  36. 36. 36
  37. 37. 37
  38. 38. 38
  39. 39. 39
  40. 40. 40
  41. 41. 41
  42. 42. 42
  43. 43. 43
  44. 44. 44
  45. 45. 45
  46. 46. 46
  47. 47. 47
  48. 48. 48
  49. 49. 49
  50. 50. 50
  51. 51. 51
  52. 52. 52
  53. 53. 53
  54. 54. 54
  55. 55. Combine: ● parallel task model ● Power-aware multi-core scheduling 55 Save more with parallelism? Yes!
  56. 56. Combine: ● parallel task model ● Power-aware multi-core scheduling → validated in theory → malleable jobs (RTCSA’14) 56 Save more with parallelism? Yes!
  57. 57. This work Goal: Reduce energy consumption as much possible while meeting real-time requirements. 57
  58. 58. This work Goal: Reduce energy consumption as much possible while meeting real-time requirements. Validate it in practice 58
  59. 59. Validate in practice ● Parallel programming model ● RTOS techniques → power-aware multi-core hard real-time scheduling ● Evaluated with full experimental stack (RTOS + hardware in the loop) ● Simple, but real application use cases 59
  60. 60. 1. Run-time framework 2. Task model and analysis 3. Experiments 60
  61. 61. 1. Run-time framework 2. Task model and analysis 3. Experiments 61
  62. 62. Run-time framework 62 MPSoC platform
  63. 63. Run-time framework 63 MPSoC platform RTOS & lib support
  64. 64. Run-time framework 64 MPSoC platform RTOS & lib support
  65. 65. Run-time framework 65 MPSoC platform RTOS & lib support Parallel app. benchmark
  66. 66. Run-time framework 66 MPSoC platform RTOS & lib support Parallel app. benchmark
  67. 67. Parallel programming model 67
  68. 68. (Simple) Parallel Program 68
  69. 69. (Simple) Parallel Program Flow 69
  70. 70. (Simple) Parallel Program Flow 70
  71. 71. An embedded RTOS 71
  72. 72. An embedded RTOS 72 ● Micro-kernel based ● Natively supports multi-core ● Provides hard-real time scheduling ● Power management fixed at task level We developed HOMPRTL → OpenMP run-time support for HIPPEROS.
  73. 73. 1. Run-time framework 2. Task model and analysis 3. Experiments 73
  74. 74. Parallel task model 74
  75. 75. Parallel task model 75
  76. 76. ● Sporadic fork-join ● Arbitrary deadlines ● Rate monotonic based analysis ● Threads are partitioned ● 3 stages, very simple Parallel task model 76
  77. 77. ● Symmetric Multi-Core ● Global DVFS, finite set of operating points 〈 Vdd , f 〉 ● Power increases monotonically with frequency Platform model 77
  78. 78. Optimisation & partitioning process 78 Power optimisation Partitioner Schedulability test
  79. 79. Optimisation & partitioning process 79 Power optimisation Partitioner Schedulability test Axer et al proved wrong (shepherding)
  80. 80. 1. Run-time framework 2. Task model and analysis 3. Experiments 80
  81. 81. 1. Run-time framework 2. Task model and analysis 3. Experiments a. Testbed description b. Single use cases c. Task systems 81
  82. 82. 1. Run-time framework 2. Task model and analysis 3. Experiments a. Testbed description b. Single use cases c. Task systems 82
  83. 83. Experimental stack 83 MPSoC platform RTOS & lib support Parallel app. benchmark
  84. 84. Experimental stack 84 MPSoC platform RTOS & lib support Parallel app. benchmark
  85. 85. Experimental stack 85 MPSoC platform RTOS & lib support Parallel app. benchmark Measurement framework
  86. 86. Experimental stack 86 MPSoC platform RTOS & lib support Parallel app. benchmark Measurement framework
  87. 87. Target platform - i.MX6q SabreLite 87 ● Embedded board ARM Cortex A9 MP ● Supported by HIPPEROS ● 4 cores, but global DVFS ● Operating points
  88. 88. Operating points 88
  89. 89. Operating points 89
  90. 90. 90 MPSoC Embedded board
  91. 91. 91 MPSoC Embedded board Power supply 220 V 220 V Transformer 5 V
  92. 92. 92 MPSoC Oscilloscope Embedded board Power supply 220 V 220 V5 V R Voltage measurements P = V² / R Transformer
  93. 93. 93 MPSoC Control Computer Oscilloscope Embedded board Power supply 220 V 220 V5 V R Serial JTAG USB Captured data transfer Oscilloscope trigger Voltage measurements P = V² / R Transformer HIPPEROS Inputs Outputs Serial
  94. 94. Use cases 94 MPSoC platform RTOS & lib support Parallel app. benchmark Measurement framework
  95. 95. Use cases 95 MPSoC platform RTOS & lib support Parallel app. benchmark Measurement framework π AES Image
  96. 96. ● Different workloads ● Easy OpenMP implementation ● Good scaling expected Use cases 96 Image AES π
  97. 97. 1. Run-time framework 2. Task model and analysis 3. Experiments a. Testbed description b. Single use cases c. Task systems 97
  98. 98. Use case alone - voltage probe output 98
  99. 99. Use case alone - converted to power values 99
  100. 100. Use case alone - execution time measurement 100
  101. 101. Use case alone - peak power measurement 101
  102. 102. Use case alone - energy measurement 102
  103. 103. Use case alone - execution time 103
  104. 104. Use case alone - peak power 104
  105. 105. Single core power consumption 105 P ∝ Vdd 2 f Vdd ∝ f
  106. 106. Single core power consumption 106 P ∝ Vdd 2 f Vdd ∝ f → P ∝ f3
  107. 107. Multi-core power consumption 107 P ∝ k f3
  108. 108. Use case alone - peak power 108 P ∝ k f 3
  109. 109. Use case alone - energy 109
  110. 110. Execution time speedup 110 Frequency scaling OpenMP parallelism
  111. 111. 1. Run-time framework 2. Task model and analysis 3. Experiments a. Testbed description b. Single use cases c. Task systems 111
  112. 112. System experiments 112
  113. 113. ● Generate 232 random task systems (U = 0.6 → 2.9) System experiments 113
  114. 114. ● Generate 232 random task systems (U = 0.6 → 2.9) ● Bind generated tasks to use cases System experiments 114
  115. 115. ● Generate 232 random task systems (U = 0.6 → 2.9) ● Bind generated tasks to use cases ● Scheduling test & optimisation in Python for 1, 2, 3, 4 threads System experiments 115
  116. 116. ● Generate 232 random task systems (U = 0.6 → 2.9) ● Bind generated tasks to use cases ● Scheduling test & optimisation in Python for 1, 2, 3, 4 threads ● Operating points and threads partition for each task System experiments 116
  117. 117. ● Generate 232 random task systems (U = 0.6 → 2.9) ● Bind generated tasks to use cases ● Scheduling test & optimisation in Python for 1, 2, 3, 4 threads ● Operating points and threads partition for each task ● Generate HIPPEROS builds System experiments 117
  118. 118. ● Generate 232 random task systems (U = 0.6 → 2.9) ● Bind generated tasks to use cases ● Scheduling test & optimisation in Python for 1, 2, 3, 4 threads ● Operating points and threads partition for each task ● Generate HIPPEROS builds ● Run feasible systems and measures System experiments 118
  119. 119. ● Generate 232 random task systems (U = 0.6 → 2.9) ● Bind generated tasks to use cases ● Scheduling test & optimisation in Python for 1, 2, 3, 4 threads ● Operating points and threads partition for each task ● Generate HIPPEROS builds ● Run feasible systems and measures System experiments 119
  120. 120. Energy consumption 120
  121. 121. Relative energy savings 121
  122. 122. Conclusions 122 ● Practical experimental framework flow for parallel real-time applications ● Parallelisation saves up to 25% energy (the whole board) ● Confronted theory with practice ○ Speedup factors ○ Power measurements ● Challenge: integrate “OpenMP-like” programs in industry systems
  123. 123. Conclusion 123 Parallelism helps to reduce energy while meeting real-time requirements
  124. 124. Thank you. 124
  125. 125. Questions? 125
  126. 126. Dynamic power vs static leakage 126 P ∝ k f3 + α k
  127. 127. P ∝ k f3 + α k Dynamic power vs static leakage 127 dynamic static
  128. 128. Dynamic power vs static leakage 128 dynamic static ● α is platform-dependent ● Comparison between DVFS and DPM P ∝ k f3 + α k
  129. 129. ● 232 systems for each degree of // ○ No high utilisation systems → Partitioned Rate Monotonic ○ Only feasible systems for 1 → 4 threads ● More threads consume less ● Low utilisation systems don’t need high operating points (often idle) ● Analysis is pessimistic for high number of threads Energy consumption 129
  130. 130. ● Same systems, but rel. to 1 thread ● Parallelism helps to save energy :-) ● < 0.13, but not a lot of systems... ● “4 threads” dominates ● ↗ Utot ⇒ ↘ energy ○ Until Utot = 2.4 ○ For high Utot : ■ Analysis pessimism ■ Schedulable systems only bias, so already well distributed ■ Some systems are only schedulable in parallel Relative energy savings 130
  131. 131. ● “3 threads” is bad: ○ Loss of symmetry with 4 cores ○ Different tasks executes simultaneously (1+3) ● Measures on the whole platform ○ CPU alone would give better results ● Need more data ○ Charts would be smoother ○ But it takes time… ■ 1 minute/execution ■ 4 executions/system ■ 232 systems, ≈ 15 hours Questions on experiment results 131
  132. 132. ● Flawed, but does not impact results ○ Axer et al (ECRTS’13) ○ Flaw pointed by Fonseca et al (SIES’16) ● Scheduling framework ○ Part of the technical framework ○ Not the important contribution/result ● Will be improved with future work Analysis technique 132

Presentation by Antonio Paolillo (HIPPEROS) at the 24th ACM International Conference on Real-Time Networks and Systems (21st October 2016)

Views

Total views

213

On Slideshare

0

From embeds

0

Number of embeds

0

Actions

Downloads

8

Shares

0

Comments

0

Likes

0

×