Drowsy-DC: Power management in
datacenters inspired by smartphones
Alain Tchana (I don't do AI and have no wish to!)
Full Professor (Professeur des Universités)
Université de Nice Sophia Antipolis
(alain.tchana@unice.fr)
Joint work with EPFL
“Prediction is a difficult thing,
especially when it concerns the future.”
Winston Churchill
Energy consumption
Electricity cost
+50% of TCO [Computer’11, HP report]
Eolas (Grenoble): 5 employees to manage the data center
Basic approach: ACPI
ACPI defines both the performance level and the sleep modes of
the whole server and its internal components
S-, P-, C-, D-states
Thus, power management (PM) policies adapt the power
consumed by the server according to its utilization rate (e.g.
DVFS on CPUs)
Current situation on servers with ACPI
Figure: Consumed energy (%) versus utilization rate (0 to 100), with the S5, S4, S3 and S0 idle states marked
S5: Off; S4: suspend to disk; S3: suspend to RAM; S0: idle.
Current situation on servers with ACPI
Reasons
Not all server components are individually manageable
PM policies are not easy for sysadmins to implement
Relation between utilization and energy consumption
Virtualization
Almost everywhere (data centers, SDN,
NFV, IoT, phones, 5G devices, etc.)
Allows
optimal resource utilization (colocation)
quick deployment (packaging)
maintenance with zero downtime (live migration)
fault tolerance (checkpointing, live migration)
scalability (dynamic resource management)
... ⇒ savings in time, floor space, staff and money
Resource utilization
Observation
Machines are under-used
Utilization below 20% (e.g., Twitter [ASPLOS’14], Amazon [EuroSys’10],
Microsoft Azure [SOSP’17], Eolas, etc.)
Because of
Resource oversizing
Workload variation
ACPI limitations
Resource utilization importance
Figure: M. Fontoura [SOSP’17]
VM Consolidation
1 Exploit the variability of the workload
2 Dynamically relocate VMs (using migration)
3 Turn off empty machines
Examples: OpenStack Neat, VMware DRS/DPM
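The consolidation steps above can be illustrated with a minimal sketch. It assumes a single resource dimension (memory) and uses first-fit decreasing packing. This is a generic heuristic chosen for illustration, not the algorithm used by OpenStack Neat or VMware DRS/DPM. The `Host` class, the `consolidate` function and all capacities are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    capacity: float                      # hypothetical unit, e.g. GiB of RAM
    vms: dict = field(default_factory=dict)

    def free(self) -> float:
        return self.capacity - sum(self.vms.values())


def consolidate(vm_demand: dict, hosts: list) -> list:
    """Re-place all VMs with first-fit decreasing; return names of hosts left empty."""
    for host in hosts:
        host.vms.clear()
    # Largest VMs first, each one on the first host that still has room.
    for vm, demand in sorted(vm_demand.items(), key=lambda kv: kv[1], reverse=True):
        target = next((h for h in hosts if h.free() >= demand), None)
        if target is None:
            raise ValueError(f"no host can fit {vm}")
        target.vms[vm] = demand
    # Hosts without any VM are the candidates for being turned off.
    return [h.name for h in hosts if not h.vms]


hosts = [Host("h1", 16), Host("h2", 16), Host("h3", 16)]
demand = {"vm1": 8, "vm2": 4, "vm3": 4, "vm4": 6, "vm5": 2}
empty_hosts = consolidate(demand, hosts)
print(empty_hosts)
```

In a real system each re-placement would be a live migration with a cost, so a consolidation manager would migrate only when the expected saving justifies it.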
VM Consolidation
Limitations
Memory is the most demanded resource type
Working set size estimation and prediction is a very hard task
Thus machines are still under-used (Alibaba traces in 2018,
Microsoft Azure [SOSP’17])
1 (Energy) Drowsy-DC: power mgt. inspired by smartphones [IPDPS’19]
Collaboration with EPFL
Observation
Emergence of Long-Lived, Mostly Idle (LLMI) VMs
Always available services [EuroSys’16]
Very small audience or clear periodicity in usage
ski resort web servers, weekly report generation...
Figure: IBM cluster’s traces (activity (%) over days 1 to 6)
Drowsy Data Center (Drowsy-DC): a new
consolidation paradigm
Basic idea
Instead of consolidating to put empty servers to sleep, consolidate to
maximize idle servers and put them to sleep.
Drowsy-DC: basic idea
          Classic consolidation        Drowsy-DC consolidation
Goal      Put unused hosts to sleep    Suspend idle hosts
Method    Gather workload to           Gather workload to
          maximize unused hosts        maximize idle periods
Architecture and prototype (open source)
Idleness-aware VM consolidation
Idleness Probability (IP) is computed
from the Idleness Model (IM)
IP: how likely is the VM to be idle during the next
hour?
Consolidate VMs with similar IPs for
the next hour
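As a rough illustration of grouping VMs with similar IPs, the sketch below sorts VMs by their IP for the next hour and cuts the sorted list into host-sized groups. It assumes every host holds a fixed number of VMs, which is a simplification. The function name and the IP values are hypothetical, and the actual Drowsy-DC placement algorithm is described in [IPDPS’19], not here.

```python
def group_by_idleness(ip_by_vm: dict, vms_per_host: int) -> list:
    """Sort VMs by idleness probability, then cut the list into host-sized groups."""
    ordered = sorted(ip_by_vm, key=ip_by_vm.get, reverse=True)
    return [ordered[i:i + vms_per_host] for i in range(0, len(ordered), vms_per_host)]


# Hypothetical IPs for the next hour (1.0 = certainly idle).
ip_next_hour = {
    "vm1": 0.05, "vm2": 0.10, "vm3": 0.95,
    "vm4": 0.90, "vm5": 0.85, "vm6": 0.20,
}
placement = group_by_idleness(ip_next_hour, vms_per_host=3)
print(placement)
```

With these values the mostly idle VMs end up on one host, which can then be suspended as a whole, while the active VMs share the other host.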
Idleness-aware VM consolidation
Idleness Model (by studying 308 clusters from Nutanix)
Different time scales
hour in the day (e.g., morning)
day in the week (e.g., weekend)
day in the month (e.g., end of the month)
month in the year (e.g., winter)
Example
a national diploma results website is mostly used at some specific
hours (2 p.m., 3 p.m.) of a specific day (20th) of one month
(July), every year
Idleness-aware VM consolidation
IM is composed of Synthesized Idlenesses and Weights
Synthesized Idlenesses ($SI_*$):
($24\ SI_d$, $24 \times 7\ SI_w$, $24 \times 31\ SI_m$, $24 \times 365\ SI_y$)
$SI_d(h)$: synthesized idleness during hour $h$ given its position in the day
$SI_w(h, d_w)$: synthesized idleness depending on $h$ and the day of the week $d_w$
$SI_m(h, d_m)$: synthesized idleness depending on $h$ and the day of the month $d_m$
$SI_y(h, d_m, m)$: synthesized idleness depending on $h$, $d_m$ and the month $m$ of the year
Idleness-aware VM consolidation
Weights ($w_d$, $w_w$, $w_m$ and $w_y$)
the importance of each time scale in the IM
e.g., a national diploma results website is mostly used at some
specific hours (2 p.m., 3 p.m.) of a specific day (20th) of one month
(July), every year
Idleness-aware VM consolidation
$$IP(h, d_w, d_m, m) = w_d \cdot SI_d(h) + w_w \cdot SI_w(h, d_w) + w_m \cdot SI_m(h, d_m) + w_y \cdot SI_y(h, d_m, m) \qquad (1)$$
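Equation (1) can be evaluated with a few lines of code. The sketch below assumes that each synthesized idleness lies in [0, 1], that the weights sum to 1, and that the SI tables are dictionaries keyed by the time indices. The table contents, the weights, the year 2019 and the `default` parameter are hypothetical values picked to mimic the diploma results website example. How the weights are actually computed is covered in [IPDPS’19].

```python
from datetime import datetime


def idleness_probability(ts, si_d, si_w, si_m, si_y, weights, default=1.0):
    """Evaluate equation (1) for the hour containing `ts`.

    Missing table entries fall back to `default` (assumed here: fully idle).
    """
    h, dw, dm, m = ts.hour, ts.weekday(), ts.day, ts.month
    w_d, w_w, w_m, w_y = weights
    return (w_d * si_d.get(h, default)
            + w_w * si_w.get((h, dw), default)
            + w_m * si_m.get((h, dm), default)
            + w_y * si_y.get((h, dm, m), default))


# Hypothetical model: the site is busy at 2 p.m. and 3 p.m. on July 20th.
si_y = {(14, 20, 7): 0.0, (15, 20, 7): 0.0}
weights = (0.1, 0.1, 0.1, 0.7)   # the yearly scale dominates (assumed values)

busy = idleness_probability(datetime(2019, 7, 20, 14), {}, {}, {}, si_y, weights)
quiet = idleness_probability(datetime(2019, 3, 5, 14), {}, {}, {}, si_y, weights)
print(round(busy, 2), round(quiet, 2))
```

Because the yearly weight dominates in this example, the IP drops sharply only during the two busy hours of the year and stays close to 1 otherwise.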
Idleness-aware VM consolidation
Computation of w∗ and other details in [IPDPS’19]
Server wakeup optimizations
Figure: Wake-up timeline (total service wake-up latency: ~2 s)
Events, in order: wake-up order sent; wake-up order received; hardware wake-up; OS wake-up; CPU wake-up; device wake-up; network card reset begins; OS operational; Ethernet link established; network operational; process scheduled
Phase durations: firmware (470 ms); kernel (190 ms); network card reset (1.3 s); process scheduled (50 ms)
Optimizations
Kernel
select only needed drivers and CPUs
Network device
use "critical section" to eliminate reset
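The wake-up latency budget from the timeline can be checked with simple arithmetic. The sketch below assumes the four phases run strictly one after another and that the 50 ms figure is the time needed to schedule the process; both are readings of the figure, not statements from the slides. The last value is only an upper bound on what eliminating the network card reset could save, not a measured result.

```python
# Phase durations in milliseconds, taken from the wake-up timeline.
phases_ms = {
    "firmware": 470,
    "kernel": 190,
    "network card reset": 1300,
    "process scheduling": 50,   # assumed meaning of the 50 ms label
}

total_ms = sum(phases_ms.values())
# Hypothetical best case: the reset is removed entirely.
without_reset_ms = total_ms - phases_ms["network card reset"]

print(total_ms, without_reset_ms)
```

The total of about 2 s matches the latency reported on the slide, and the network card reset accounts for roughly two thirds of it.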
Low scale experiments
5 HP machines, consume about 10% energy in S3
6 LLMI VMs (VM3-8), initially spread
2 LLMU VMs (VM1-2), initially spread
Figure: IBM cluster’s traces (activity (%) over days 1 to 6; curves labeled VM3, VM4 and VM6)
Low scale experiments: Results
Figure: Colocation percentage of each VM. Black cells: V1 and V2
were LLMU VMs; dark gray cells: V3 and V4 received the same
workload. The last column is the number of migrations each VM experienced.
Low scale experiments: Results
Algorithm        P2   P3   P4   P5   Global
Drowsy-DC         0   94   79   91       66
OpenStack Neat   89    7    8   93       49
Table: Fraction of time (percent) spent by hosts in suspended power
state, with Drowsy-DC and with Neat
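The Global column appears to be the unweighted mean of the four per-host values, rounded to the nearest integer. That reading is an assumption, since the slides do not say how Global is computed, but it can be checked quickly:

```python
# Per-host suspended-time percentages (P2..P5) from the table.
suspended_pct = {
    "Drowsy-DC": [0, 94, 79, 91],
    "OpenStack Neat": [89, 7, 8, 93],
}

# Assumed definition: Global = rounded unweighted mean over the hosts.
global_pct = {algo: round(sum(vals) / len(vals)) for algo, vals in suspended_pct.items()}
print(global_pct)
```

Under this reading Drowsy-DC suspends three of the four hosts most of the time, while Neat suspends only two.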
Large scale experiments: Energy saving
on Google traces (12,000 machines)
Figure: Power consumption (kWh, ×10^3) versus LLMI VM proportion in the datacenter (10% to 90%), for Neat, Neat+S3, Oasis and Drowsy-DC
Drowsy-DC: 76% - 81% energy savings
compared to OpenStack Neat
“Those who do not know where they are going
are surprised to end up somewhere else.”
Pierre Dac

OSIS19_Cloud : Performance and power management in virtualized data centers, by Alain Tchana

  • 1. Drowsy-DC: Power management in datacenters inspired by smartphones Alain Tchana (Ne fais pas d’IA et ne souhaite pas en faire!) Professeur des Universités Université de Nice Sophia Antipolis (alain.tchana@unice.fr) Joint work with EPFL 1 / 27 Alain Tchana (Ne fais pas d’IA et ne souhaite pas en faire!)
  • 2. “La prévision est une chose difficile surtout lorsqu’il s’agit de l’avenir.” Winston Churchill 2 / 27 Alain Tchana (Ne fais pas d’IA et ne souhaite pas en faire!)
  • 3. Energy consumption Electricity cost +50% of TCO [Computer’11, HP report] Eolas (Grenoble): 5 employees to manage the data center Basic approach: ACPI ACPI defines both the performance level and the sleep modes of the whole server and its internal components S-, P-, C-, D-states Thus, power management (PM) policies adapt the power consumed by the server according to its utilization rate (e.g. DVFS on CPUs) 3 / 27 Alain Tchana (Ne fais pas d’IA et ne souhaite pas en faire!)
  • 4. Current situation on servers with ACPI 0 20 40 60 80 100 0 50 100 S5 S4S3 S0idle Utilization rate Consumedenergy(%) S5: Off; S4: suspend to disk; S3: suspend to RAM; S0: idle. 4 / 27 Alain Tchana (Ne fais pas d’IA et ne souhaite pas en faire!)
  • 5. Current situation on servers with ACPI Reasons Not all server components are individually manageable PM policies are not easy to implement by sysadmins Relation between utilization and energy consumption 5 / 27 Alain Tchana (Ne fais pas d’IA et ne souhaite pas en faire!)
  • 6. Virtualization Almost everywhere (data centers, SDN, NFV, IoT, phones, 5G devices, etc.) Allows optimal resource utilization (colocation) quick deployment (packaging) maintenance with zero downtime (live migration) fault tolerance (checkpointing, live migration) scalability (dynamic resource management) ...⇒(time, square, staff, money) saving 6 / 27 Alain Tchana (Ne fais pas d’IA et ne souhaite pas en faire!)
  • 7. Resource utilization. Observation: machines are under-used, -20% (e.g., Twitter [ASPLOS’14], Amazon [EuroSys’10], Microsoft Azure [SOSP’17], Eolas, etc.). Because of: resource oversizing; workload variation; ACPI limitations.
  • 8. Resource utilization importance. Figure: M. Fontoura [SOSP’17]
  • 9. VM Consolidation. (1) Exploit the variability of the workload. (2) Dynamically relocate VMs (using migration). (3) Turn off empty machines. (4) E.g., OpenStack Neat, VMware DRS/DPM.
  • 10. VM Consolidation: limitations. Memory is the most demanded resource type. Working set size estimation and prediction is a very hard task. Thus machines are still under-used (Alibaba traces in 2018, Microsoft Azure [SOSP’17]).
  • 11. 1 (Energy) Drowsy-DC: power management inspired by smartphones [IPDPS’19]. Collaboration with EPFL.
  • 12. Observation: emergence of Long-Lived, Mostly Idle (LLMI) VMs. Always-available services [EuroSys’16]. Very small audience or clear periodicity in usage (ski resort web servers, weekly report generation, ...). [Figure: IBM cluster’s traces, activity (%) over days.]
  • 13. Drowsy Data Center (Drowsy-DC): a new consolidation paradigm. Basic idea: instead of consolidating to put empty servers to sleep, consolidate to maximize idle servers and put them to sleep.
  • 14. Drowsy-DC: basic idea. Goal: classic consolidation puts unused hosts to sleep; Drowsy-DC consolidation suspends idle hosts. Method: classic consolidation gathers workload to maximize unused hosts; Drowsy-DC consolidation gathers workload to maximize idle periods.
  • 15. Architecture and prototype (open source)
  • 16. Idleness-aware VM consolidation. Idleness Probability (IP): how likely is the VM to be idle during the next hour? IP is computed from the Idleness Model (IM). Consolidate VMs with similar IPs for the next hour.
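The idea of colocating VMs with similar IPs can be illustrated with a deliberately naive sketch. This is not the Drowsy-DC placement algorithm (the slides do not give it): it only sorts VMs by a given IP value and cuts the sorted list into fixed-size groups. The function name `group_by_idleness`, the `vms_per_host` capacity, and the IP values are all hypothetical.

```python
def group_by_idleness(vm_ips, vms_per_host):
    """Group VMs so that each group holds VMs with neighbouring IP values.

    vm_ips: dict mapping a VM name to its idleness probability (assumed in [0, 1]).
    vms_per_host: hypothetical fixed number of VMs a host can take.
    """
    # Sort VM names by increasing idleness probability.
    ordered = sorted(vm_ips, key=vm_ips.get)
    # Cut the sorted list into consecutive chunks, one chunk per host.
    return [ordered[i:i + vms_per_host]
            for i in range(0, len(ordered), vms_per_host)]


# Made-up IPs for the next hour: two busy VMs and two mostly idle ones.
example_ips = {"vm1": 0.10, "vm2": 0.90, "vm3": 0.15, "vm4": 0.85}
placement = group_by_idleness(example_ips, vms_per_host=2)
print(placement)
```

With these values the busy VMs end up together and the mostly idle VMs end up together, so the host carrying the idle pair has a better chance of being suspended for the whole hour.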
  • 17. Idleness-aware VM consolidation. Idleness Model (by studying 308 clusters from Nutanix). Different time scales: hour in the day (e.g., morning); day in the week (e.g., weekend); day in the month (e.g., end of the month); month in the year (e.g., winter). Example: a national diploma results website is mostly used at some specific hours (2 p.m., 3 p.m.) of a specific day (20th) of one month (July), every year.
  • 18. Idleness-aware VM consolidation. IM is composed of Synthesized Idlenesses and Weights. Synthesized Idlenesses ($SI_*$): 24 $SI_d$, 24 × 7 $SI_w$, 24 × 31 $SI_m$, 24 × 365 $SI_y$. $SI_d(h)$: synthesized idleness during hour $h$ regarding its position in the day. $SI_w(h, d_w)$: synthesized idleness depending on $h$ and the day of the week $d_w$. $SI_m(h, d_m)$: synthesized idleness depending on $h$ and the day of the month $d_m$. $SI_y(h, d_m, m)$: synthesized idleness depending on $h$, $d_m$ and the month $m$ of the year.
  • 19. Idleness-aware VM consolidation. IM is composed of Synthesized Idlenesses and Weights. Synthesized Idlenesses ($SI_*$): 24 $SI_d$, 24 × 7 $SI_w$, 24 × 31 $SI_m$, 24 × 365 $SI_y$. Weights ($w_d$, $w_w$, $w_m$, and $w_y$): the importance of each time scale in the IM. E.g., a national diploma results website is mostly used at some specific hours (2 p.m., 3 p.m.) of a specific day (20th) of one month (July), every year.
  • 20. Idleness-aware VM consolidation. IM is composed of Synthesized Idlenesses and Weights. Synthesized Idlenesses ($SI_*$): 24 $SI_d$, 24 × 7 $SI_w$, 24 × 31 $SI_m$, 24 × 365 $SI_y$. Weights ($w_d$, $w_w$, $w_m$, and $w_y$). $IP(h, d_w, d_m, m) = w_d \cdot SI_d(h) + w_w \cdot SI_w(h, d_w) + w_m \cdot SI_m(h, d_m) + w_y \cdot SI_y(h, d_m, m)$ (1)
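Equation (1) is a weighted sum and can be sketched in a few lines. The storage layout below is an assumption, not something the slides specify: each synthesized idleness table is a dictionary keyed by its time coordinates, and the weights are passed as a 4-tuple. The function name, the sample values, and the choice of weights that sum to 1 are all illustrative; how the weights are actually computed is left to [IPDPS’19].

```python
def idleness_probability(si_d, si_w, si_m, si_y, weights, h, dw, dm, m):
    """Evaluate equation (1) for hour h, weekday dw, day of month dm, month m.

    si_d: {h: value}, si_w: {(h, dw): value},
    si_m: {(h, dm): value}, si_y: {(h, dm, m): value}  (assumed layout)
    weights: (w_d, w_w, w_m, w_y)
    """
    w_d, w_w, w_m, w_y = weights
    return (w_d * si_d[h]
            + w_w * si_w[(h, dw)]
            + w_m * si_m[(h, dm)]
            + w_y * si_y[(h, dm, m)])


# Made-up example: hour 14, weekday 5, 20th day of the month, month 7.
si_d = {14: 1.0}
si_w = {(14, 5): 0.5}
si_m = {(14, 20): 0.0}
si_y = {(14, 20, 7): 0.5}
ip = idleness_probability(si_d, si_w, si_m, si_y,
                          (0.25, 0.25, 0.25, 0.25), h=14, dw=5, dm=20, m=7)
print(ip)
```

Equal weights are used here only to keep the arithmetic easy to follow; a service with a strong yearly pattern, such as the diploma results website, would presumably give more weight to the yearly term.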
  • 21. Idleness-aware VM consolidation. Computation of $w_*$ and other details in [IPDPS’19].
  • 22. Server wakeup optimizations. [Figure: wake-up timeline. Events: wake-up order sent, wake-up order received, hardware wake-up, OS wake-up, CPU wake-up, device wake-up, start of network card reset, OS operational, Ethernet link established, network operational, process scheduled. Durations: firmware (470 ms), kernel (190 ms), network card reset (1.3 s), process scheduling (50 ms). Total service wake-up latency: ~2 s.] Optimizations. Kernel: select only needed drivers and CPUs. Network device: use a "critical section" to eliminate the reset.
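The durations on the wake-up timeline can be checked against the stated total with simple arithmetic. The sketch below assumes the four durations are sequential and non-overlapping, which the slide suggests but does not state; the stage names and the function `total_wake_latency_ms` are hypothetical labels.

```python
# Durations read from the timeline, in milliseconds.
WAKE_STAGES_MS = {
    "firmware": 470,
    "kernel": 190,
    "nic_reset": 1300,
    "process_scheduling": 50,
}


def total_wake_latency_ms(stages, skip=()):
    """Sum the stage durations, leaving out any stage named in `skip`."""
    return sum(ms for name, ms in stages.items() if name not in skip)


baseline = total_wake_latency_ms(WAKE_STAGES_MS)
without_reset = total_wake_latency_ms(WAKE_STAGES_MS, skip=("nic_reset",))
print(baseline, without_reset)
```

The baseline comes to 2010 ms, in line with the ~2 s on the slide. The second figure, 710 ms, is only what the remaining stages add up to if the network card reset disappeared and nothing else changed; it is not a measured result from the talk.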
  • 23. Low scale experiments. 5 HP machines, which consume about 10% energy in S3. 6 LLMI VMs (VM3-8), initially spread. 2 LLMU VMs (VM1-2), initially spread. [Figure: IBM cluster’s traces, activity (%) over days, with curves labeled VM3, VM4 and VM6.]
  • 24. Low scale experiments: results. Figure: Colocation percentage of each VM. Black cells: V1 and V2 were LLMU VMs; dark gray cells: V3 and V4 received the same workload. The last column is the number of migrations a VM experienced.
  • 25. Low scale experiments: results. Table: Fraction of time (percent) spent by hosts in suspended power state, with Drowsy-DC and with Neat. Drowsy-DC: P2 = 0, P3 = 94, P4 = 79, P5 = 91, Global = 66. OpenStack Neat: P2 = 89, P3 = 7, P4 = 8, P5 = 93, Global = 49.
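The slide does not say how the Global column is obtained. As a quick check, the values are consistent with an unweighted mean over hosts P2 to P5, rounded to the nearest percent; that reading is an assumption, and the helper name `mean_suspended` is hypothetical.

```python
# Per-host suspended-time fractions (percent) for P2..P5, copied from the table.
SUSPENDED = {
    "Drowsy-DC": [0, 94, 79, 91],
    "OpenStack Neat": [89, 7, 8, 93],
}


def mean_suspended(percentages):
    """Unweighted mean of the per-host percentages."""
    return sum(percentages) / len(percentages)


for algorithm, values in SUSPENDED.items():
    print(algorithm, mean_suspended(values))
```

The means are 66.0 for Drowsy-DC and 49.25 for OpenStack Neat, matching the reported 66 and 49 after rounding.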
  • 26. Large scale experiments: energy saving on Google traces (12,000 machines). [Figure: power consumption (kWh, ×10³) versus LLMI VM proportion in the datacenter (%), for Neat, Neat+S3, Oasis and Drowsy-DC.] Drowsy-DC: 76% - 81% energy savings compared to OpenStack Neat.
  • 27. “Those who do not know where they are going are surprised to arrive somewhere else.” Pierre Dac