
OSIS19_Cloud : Performance and power management in virtualized data centers, by Alain Tchana


My research is in the domain of virtualized infrastructures. I aim to minimize electricity consumption while improving application performance. To achieve the first goal, I work both at the level of the entire datacenter (by providing better VM placement strategies) and at the level of the physical machine (by providing better power management policies). For the second goal, I work both at the level of the VM monitor (to minimize its overhead) and at the level of the VM's operating system (OS), to make it aware that it is virtualized.
In this talk I present two contributions of my research team, one for each objective.
The first contribution is Drowsy-DC, a novel way to reduce data center power consumption, inspired by smartphones.
The second contribution is XPV (eXtended Para-Virtualization), a new principle for efficiently virtualizing NUMA machines.

Published in: Software

  1. Drowsy-DC: Power management in datacenters inspired by smartphones. Alain Tchana (“I don't do AI and have no wish to!”), Full Professor (Professeur des Universités), Université de Nice Sophia Antipolis (alain.tchana@unice.fr). Joint work with EPFL.
  2. “Prediction is a difficult thing, especially when it concerns the future.” Winston Churchill
  3. Energy consumption. Electricity cost: +50% of TCO [Computer’11, HP report]. Eolas (Grenoble): 5 employees to manage the data center. Basic approach: ACPI. ACPI defines both the performance levels and the sleep modes of the whole server and of its internal components (S-, P-, C-, D-states). Thus, power management (PM) policies adapt the power consumed by the server to its utilization rate (e.g., DVFS on CPUs).
  4. Current situation on servers with ACPI. Figure: consumed energy (%) versus utilization rate, with the S5, S4, S3 and S0-idle levels marked. S5: off; S4: suspend to disk; S3: suspend to RAM; S0: idle.
  5. Current situation on servers with ACPI. Reasons: not all server components are individually manageable; PM policies are not easy for sysadmins to implement; relation between utilization and energy consumption.
  6. Virtualization. Almost everywhere (data centers, SDN, NFV, IoT, phones, 5G devices, etc.). Allows: optimal resource utilization (colocation); quick deployment (packaging); maintenance with zero downtime (live migration); fault tolerance (checkpointing, live migration); scalability (dynamic resource management); ... ⇒ savings in time, space, staff, and money.
  7. Resource utilization. Observation: machines are under-used, -20% (e.g., Twitter [ASPLOS’14], Amazon [EuroSys’10], Microsoft Azure [SOSP’17], Eolas, etc.). Because of: resource oversizing; workload variation; ACPI limitations.
  8. Resource utilization importance. Figure: M. Fontoura [SOSP’17].
  9. VM Consolidation. (1) Exploit the variability of the workload; (2) dynamically relocate VMs (using migration); (3) turn off empty machines. Examples: OpenStack Neat, VMware DRS/DPM.
  10. VM Consolidation. Limitations: memory is the most demanded resource type; working set size estimation and prediction is a very hard task. Thus machines are still under-used (Alibaba traces in 2018, Microsoft Azure [SOSP’17]).
  11. Contribution 1 (Energy). Drowsy-DC: power management inspired by smartphones [IPDPS’19]. Collaboration with EPFL.
  12. Observation. Emergence of Long-Lived, Mostly Idle (LLMI) VMs: always-available services [EuroSys’16]; very small audience or clear periodicity in usage (ski resort web servers, weekly report generation...). Figure: IBM cluster’s traces (activity in % over days).
  13. Drowsy Data Center (Drowsy-DC): a new consolidation paradigm. Basic idea: instead of consolidating to put empty servers to sleep, consolidate to maximize idle servers and put them to sleep.
  14. Drowsy-DC: basic idea.

               Classic consolidation                      Drowsy-DC consolidation
      Goal     Put unused hosts to sleep                  Suspend idle hosts
      Method   Gather workload to maximize unused hosts   Gather workload to maximize idle periods
  15. Architecture and prototype (open source).
  16. Idleness-aware VM consolidation. The Idleness Probability (IP) is computed from the Idleness Model (IM). Consolidate VMs with similar IPs during the next hour (how likely is the VM to be idle in the next hour?).
  17. Idleness-aware VM consolidation. Idleness Model (built by studying 308 clusters from Nutanix). Different time scales: hour in the day (e.g., morning); day in the week (e.g., weekend); day in the month (e.g., end of the month); month in the year (e.g., winter). Example: a national diploma results website is mostly used at some specific hours (2 p.m., 3 p.m.) of a specific day (20th) of one month (July), every year.
  18. Idleness-aware VM consolidation. The IM is composed of Synthesized Idlenesses and Weights. Synthesized Idlenesses ($SI_*$): (24 $SI_d$, $24 \times 7$ $SI_w$, $24 \times 31$ $SI_m$, $24 \times 365$ $SI_y$). $SI_d(h)$: synthesized idleness during hour $h$ regarding its position in the day. $SI_w(h, d_w)$: synthesized idleness depending on $h$ and the day of the week $d_w$. $SI_m(h, d_m)$: synthesized idleness depending on $h$ and the day of the month $d_m$. $SI_y(h, d_m, m)$: synthesized idleness depending on $h$, $d_m$ and the month $m$ of the year.
  19. Idleness-aware VM consolidation. The IM is composed of Synthesized Idlenesses and Weights. Synthesized Idlenesses ($SI_*$): (24 $SI_d$, $24 \times 7$ $SI_w$, $24 \times 31$ $SI_m$, $24 \times 365$ $SI_y$). Weights ($w_d$, $w_w$, $w_m$, and $w_y$): the importance of each time scale in the IM, e.g., a national diploma results website is mostly used at some specific hours (2 p.m., 3 p.m.) of a specific day (20th) of one month (July), every year.
  20. Idleness-aware VM consolidation. The IM is composed of Synthesized Idlenesses and Weights. Synthesized Idlenesses ($SI_*$): (24 $SI_d$, $24 \times 7$ $SI_w$, $24 \times 31$ $SI_m$, $24 \times 365$ $SI_y$). Weights ($w_d$, $w_w$, $w_m$, and $w_y$).

      $$IP(h, d_w, d_m, m) = w_d \cdot SI_d(h) + w_w \cdot SI_w(h, d_w) + w_m \cdot SI_m(h, d_m) + w_y \cdot SI_y(h, d_m, m) \quad (1)$$
  21. Idleness-aware VM consolidation. Computation of $w_*$ and other details in [IPDPS’19].
  22. Server wakeup optimizations. Figure (wakeup timeline), labels: wakeup order sent; wakeup order received; hardware wakeup; OS wakeup; firmware (470 ms); CPU wakeup; device wakeup; start of network card reinitialization; OS operational; kernel (190 ms); Ethernet link established; network card reset (1.3 s); network operational; process scheduled (50 ms). Total service wakeup latency: ~2 s. Optimizations: kernel, select only the needed drivers and CPUs; network device, use a "critical section" to eliminate the reset.
  23. Low scale experiments. 5 HP machines, consuming about 10% energy in S3; 6 LLMI VMs (VM3-8), initially spread; 2 LLMU VMs (VM1-2), initially spread. Figure: IBM cluster’s traces (activity in % over days, for VM3, VM4 and VM6).
  24. Low scale experiments: Results. Figure: Colocation percentage of each VM. Black cells: V1 and V2 were LLMU VMs; dark gray cells: V3 and V4 received the same workload. The last column is the number of migrations a VM experienced.
  25. Low scale experiments: Results.

      Algorithm        P2   P3   P4   P5   Global
      Drowsy-DC         0   94   79   91       66
      OpenStack Neat   89    7    8   93       49

      Table: Fraction of time (percent) spent by hosts in suspended power state, with Drowsy-DC and with Neat.
  26. Large scale experiments: energy saving on Google traces (12,000 machines). Figure: power consumption (kWh, ×10^3) versus LLMI VM proportion in the datacenter (%), for Neat, Neat+S3, Oasis and Drowsy-DC. Drowsy-DC: 76% - 81% energy savings compared to OpenStack Neat.
  27. “Those who do not know where they are going are surprised to arrive somewhere else.” Pierre Dac
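
Equation (1) on slide 20 and the "consolidate VMs with similar IPs" rule on slide 16 can be illustrated with a small Python sketch. Everything named here is hypothetical and not taken from the talk: the `IdlenessModel` class, the dictionary layout of the four synthesized-idleness tables, the `default` value used for time slots without history, and the `group_by_similar_ip` helper with its `slots_per_host` parameter. The grouping step simply sorts VMs by IP and cuts the list into host-sized chunks; it is a simplification for illustration, not the placement algorithm described in [IPDPS’19], and the weights are taken as given since their computation is only referenced on slide 21.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Tuple


@dataclass
class IdlenessModel:
    """Hypothetical per-VM idleness model: four SI tables plus four weights."""
    si_d: Dict[int, float]                      # key: hour h (up to 24 entries)
    si_w: Dict[Tuple[int, int], float]          # key: (h, day of week d_w)
    si_m: Dict[Tuple[int, int], float]          # key: (h, day of month d_m)
    si_y: Dict[Tuple[int, int, int], float]     # key: (h, d_m, month m)
    weights: Tuple[float, float, float, float]  # (w_d, w_w, w_m, w_y)
    default: float = 0.0  # assumption: value for slots with no recorded history

    def idleness_probability(self, when: datetime) -> float:
        """Weighted sum of the four synthesized idlenesses, as in equation (1)."""
        h, d_w, d_m, m = when.hour, when.weekday(), when.day, when.month
        w_d, w_w, w_m, w_y = self.weights
        return (
            w_d * self.si_d.get(h, self.default)
            + w_w * self.si_w.get((h, d_w), self.default)
            + w_m * self.si_m.get((h, d_m), self.default)
            + w_y * self.si_y.get((h, d_m, m), self.default)
        )


def group_by_similar_ip(ips: Dict[str, float], slots_per_host: int) -> List[List[str]]:
    """Illustrative grouping only: sort VMs by IP, then fill hosts in order.

    VMs with close idleness probabilities end up on the same host, so a host
    holding only high-IP VMs is a good candidate for suspension.
    """
    if slots_per_host < 1:
        raise ValueError("slots_per_host must be at least 1")
    # Ties are broken by VM name so the result is deterministic.
    ordered = sorted(ips, key=lambda vm: (ips[vm], vm))
    return [ordered[i:i + slots_per_host] for i in range(0, len(ordered), slots_per_host)]
```

In this sketch, the IP of each VM would be evaluated for the coming hour with `idleness_probability`, and the resulting name-to-IP mapping passed to `group_by_similar_ip`. A real consolidation manager would also have to respect CPU and memory capacity and limit the number of migrations, which the sketch ignores.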
