OK Labs - Virtualization as the Nexus of Multicore Power Management

ARM TechCon Session "Virtualization as the Nexus of Multicore Power Management"

Thursday, November 11, 2010

Adoption of multicore technology for the desktop, data center and embedded designs responds to comparable needs: to scale compute capacity without stepping up system clocks and to attain more MIPS-per-watt for devices and applications. Multicore for the desktop and data center enjoys mature support from deployed OSes. Even as embedded OSes become more adept at running on multicore CPUs, applications and middleware still face challenges of thread-safety, concurrency and load balancing. Mobile virtualization is a means to get maximum value from multicore ARM designs, at both architectural and app levels. This session examines multicore use cases for virtualization, and how virtualization brings superior CPU utilization, greater security, smoother legacy migration, and smarter energy management to multicore designs.

Presentation Transcript

  • November 9-11, 2010 · The Santa Clara Convention Center · www.armtechcon.com
  • Energy Management for Mobile Devices Power to the Microvisor!  
  • >  Energy-management >  Virtualization basics >  Enter multicore >  Summary
  • >  Device uses energy •  Drains battery >  Goal of energy management: •  Maximize battery life
  • Dynamic voltage and frequency scaling >  CMOS power consumption: •  $P = P_{dyn} + P_{stat}$ •  $P_{dyn} \propto f V^2$ •  $V_{min} \propto f$ (very approximately) >  Assuming execution time $T \propto 1/f$ •  $E_{dyn} = P_{dyn} T \propto f V^2 / f = V^2 \propto f^2$ •  lower frequency ⇒ lower dynamic energy
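The scaling argument on this slide can be checked numerically. The sketch below is only an illustration: the constant `k`, the normalized cycle count, and the assumption that voltage tracks frequency exactly (V = f in normalized units) are hypothetical and not taken from the talk.

```python
def dynamic_energy(freq, cycles=1.0, k=1.0):
    """Dynamic energy for a fixed amount of work under the naive DVFS model.

    Assumes V is proportional to f (normalized so that V = f)
    and that execution time is T = cycles / f.
    """
    voltage = freq                     # V_min ∝ f (very approximately)
    power = k * freq * voltage ** 2    # P_dyn ∝ f V^2
    time = cycles / freq               # T ∝ 1/f
    return power * time                # E_dyn ∝ f^2


# Halving the frequency should cut dynamic energy to a quarter.
ratio = dynamic_energy(0.5) / dynamic_energy(1.0)
print(ratio)
```

Under these assumptions the ratio is one quarter, which is the quadratic dependence the slide derives.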
  • >  When CPU is idle, turn clock off •  $P_{dyn} = 0 \Rightarrow P = P_{stat}$ >  Sleep states reduce power further: •  $P_{sleep} < P_{stat}$ >  Typically have multiple sleep states •  shallow sleep states save some energy −  but fast to enter/exit •  deep sleep states save more energy −  but lose state and are expensive to enter/exit >  Complex tradeoff
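One way to picture the sleep-state tradeoff is as a break-even calculation over the expected idle time. The following sketch assumes three hypothetical states with made-up power and transition costs in arbitrary units; real platforms differ, and the cost model (a fixed transition energy plus power times idle time) is a simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SleepState:
    name: str
    power: float              # power drawn while in the state
    transition_energy: float  # energy to enter and leave the state
    transition_time: float    # time to enter and leave the state


# Hypothetical numbers, for illustration only.
STATES = [
    SleepState("clock-gated", power=0.30, transition_energy=0.0, transition_time=0.0),
    SleepState("shallow", power=0.10, transition_energy=0.5, transition_time=0.1),
    SleepState("deep", power=0.01, transition_energy=5.0, transition_time=2.0),
]


def idle_energy(state, idle_time):
    """Energy spent if the whole idle period is served by one state."""
    return state.transition_energy + state.power * idle_time


def pick_state(idle_time):
    """Cheapest state whose transition fits inside the idle period."""
    feasible = [s for s in STATES if s.transition_time <= idle_time]
    return min(feasible, key=lambda s: idle_energy(s, idle_time))


for idle in (1.0, 10.0, 100.0):
    print(idle, pick_state(idle).name)
```

With these numbers short idle periods favor clock gating, medium ones the shallow state, and only long ones amortize the cost of the deep state.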
  • >  $E_{dyn} \propto f^2$ ⇒ lowest frequency is best >  Ignores static energy! •  $E = E_{dyn} + E_{stat}$ •  $E_{dyn} \propto f^2$ •  $E_{stat} = P_{stat} T \propto 1/f$ >  Low $f$ increases execution time ⇒ $E_{stat}$ increases at low $f$!
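Adding the static term turns the energy curve into one with an interior minimum. The sketch below assumes hypothetical coefficients `a` and `b` and a made-up list of normalized frequencies; it only illustrates the shape of the tradeoff.

```python
def total_energy(freq, a=1.0, b=0.5):
    """E = E_dyn + E_stat with E_dyn ∝ f^2 and E_stat ∝ 1/f."""
    e_dyn = a * freq ** 2
    e_stat = b / freq
    return e_dyn + e_stat


def best_frequency(freqs, a=1.0, b=0.5):
    """Frequency with the lowest total energy among the candidates."""
    return min(freqs, key=lambda f: total_energy(f, a, b))


FREQS = [0.2, 0.4, 0.6, 0.8, 1.0]  # hypothetical normalized setpoints
best = best_frequency(FREQS)
print(best)
```

With these coefficients the minimum lies at a middle setpoint: the lowest frequency loses on static energy, the highest on dynamic energy.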
  • >  Run at maximum $f$, then go to sleep •  Tries to minimize static power, but: •  dynamic power isn’t irrelevant (yet) −  $T \propto 1/f$ isn’t correct either: it ignores memory! •  Effect of memory stalls •  $T = T_{CPU} + T_{mem}$ •  $T_{CPU} \propto 1/f$ •  $T_{mem} = \text{const}$ •  $E_{stat} \propto T \propto 1/f + \text{const}$ >  Ignores sleep energy!
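The effect of memory stalls on execution time can be illustrated with a two-term timing model. In this sketch the split between CPU time and memory time is hypothetical, and memory time is assumed to be completely independent of the core clock.

```python
def execution_time(freq, t_cpu_at_full_speed, t_mem):
    """T = T_CPU + T_mem, where only T_CPU scales with 1/f."""
    return t_cpu_at_full_speed / freq + t_mem


def slowdown_at_half_speed(t_cpu, t_mem):
    """How much longer the workload runs when the clock is halved."""
    return execution_time(0.5, t_cpu, t_mem) / execution_time(1.0, t_cpu, t_mem)


cpu_bound = slowdown_at_half_speed(t_cpu=1.0, t_mem=0.0)
memory_bound = slowdown_at_half_speed(t_cpu=0.2, t_mem=0.8)
print(cpu_bound, memory_bound)
```

Under these assumptions the CPU-bound workload slows down by the full factor of two, while the memory-bound one slows down by only about 20%, so lowering the clock costs it far less performance.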
  • >  Run at maximum $f$, then go to sleep >  Earlier completion ⇒ longer sleep •  $E = E_{dyn} + E_{stat} + E_{sleep}$ •  $E_{sleep} = P_{sleep} T_{sleep}$ •  $T_{sleep} = T_0 - T$ •  $E_{sleep} = P_{sleep} (T_0 - T)$ >  Still ignores dynamic energy!
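Putting the three terms together over a fixed period shows how the depth of the sleep state shifts the best operating point. The sketch below assumes a CPU-bound workload with normalized work, a cubic dynamic-power law (from V ∝ f), and hypothetical values for the static power, sleep power and period.

```python
FREQS = [0.2, 0.4, 0.6, 0.8, 1.0]  # hypothetical normalized setpoints


def energy_per_period(freq, p_sleep, p_stat=0.5, period=6.0, k=1.0):
    """E = E_dyn + E_stat + E_sleep over one period T0."""
    busy = 1.0 / freq                     # T ∝ 1/f for normalized CPU-bound work
    if busy > period:
        raise ValueError("work does not fit in the period")
    e_dyn = k * freq ** 3 * busy          # P_dyn ∝ f^3 when V ∝ f
    e_stat = p_stat * busy                # E_stat = P_stat T
    e_sleep = p_sleep * (period - busy)   # E_sleep = P_sleep (T0 - T)
    return e_dyn + e_stat + e_sleep


def best_frequency(p_sleep):
    """Setpoint with the lowest energy per period for a given sleep power."""
    return min(FREQS, key=lambda f: energy_per_period(f, p_sleep))


low_power_sleep = best_frequency(p_sleep=0.05)
high_power_sleep = best_frequency(p_sleep=0.45)
print(low_power_sleep, high_power_sleep)
```

With these numbers a low-power sleep state rewards finishing sooner (a higher setpoint wins), while a high-power sleep state makes sleeping less attractive and pushes the optimum toward a lower frequency.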
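  • [Figure: plot with curves labeled memory-bound and CPU-bound]
  • [Figure: plot with curves labeled CPU-bound, memory-bound and naïve model]
  • [Figure: plot comparing a high-power sleep state with a low-power sleep state]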
  • >  Energy management is complex! >  Optimal setting depends on: •  Workload −  memory-bound vs CPU-bound vs in-between •  Hardware platform −  static vs dynamic energy −  CPU vs memory power −  depth of sleep states and cost of entering >  Simple models don’t work!
  • >  How to establish memory-boundedness? >  Easy way out: pre-characterization •  measure behavior off-line •  determine optimal power setting −  by model or trial-and-error >  Ok-ish for pre-defined workloads >  Unsuitable for open systems •  ... such as phones >  Tricky with apps which change behavior
  • >  Need to observe app and adjust setting •  works for any app •  adjusts to changing behavior >  Solution by [Snowdon et al., EuroSys’09] >  Performance counters are your friends! •  e.g. cache misses indicate memory access >  Can systematically select best counters •  build model of platform •  Linear combination of performance-counter readings •  pre-characterize hardware •  pick counters which provide most accurate model •  using sound statistical methods
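The idea of modeling the platform as a linear combination of performance-counter readings can be sketched with an ordinary least-squares fit. This is not the method from the cited paper, only a generic illustration: the counter names, the sample values and the "measured" power figures are all made up, and the fit uses plain normal equations in pure Python.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_linear(samples, targets):
    """Least-squares weights for target ≈ w0 + w1*c1 + w2*c2 + ..."""
    rows = [[1.0] + list(s) for s in samples]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve(xtx, xty)


def predict(weights, counters):
    """Evaluate the fitted linear model for one set of counter readings."""
    return weights[0] + sum(w * c for w, c in zip(weights[1:], counters))


# Hypothetical pre-characterization data: (cycle rate, cache-miss rate) -> power.
SAMPLES = [(1.0, 0.1), (0.8, 0.3), (0.5, 0.5), (0.9, 0.0), (0.3, 0.2)]
POWER = [0.90, 1.20, 1.45, 0.65, 0.75]

weights = fit_linear(SAMPLES, POWER)
estimate = predict(weights, (0.6, 0.4))
print(weights, estimate)
```

Selecting which counters to include would then amount to repeating the fit for different counter subsets and keeping the subset with the smallest prediction error on held-out measurements.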
  • >  Model predicts energy consumption and relative execution speed •  at present setpoint •  at different setpoints >  Accurately predicts energy and performance response to DVFS •  within a few % >  Can use this for informed energy-management decisions
  • [Figure: plot with curves labeled memory-bound and CPU-bound]
  • [Figure: plot with curves labeled CPU-bound and memory-bound]
  • >  What is “best”? •  Maximal performance? •  Minimal energy? •  Minimal power? >  Depends... >  May change •  battery depletes >  Need flexible policies [Diagram labels: Workload Statistics, Candidate Setpoints, QoS Info, Setting]
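A flexible policy layer can be pictured as a function that picks one of the candidate setpoints according to the current goal. In the sketch below the `Setpoint` fields, the policy names and all numbers are hypothetical; the predicted values are assumed to come from a model of the kind described earlier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Setpoint:
    freq_mhz: int
    rel_performance: float  # predicted speed relative to the fastest setpoint
    energy: float           # predicted energy for the workload
    power: float            # predicted average power


def choose(candidates, policy, min_performance=0.0):
    """Pick a setpoint for the given policy, subject to a QoS floor."""
    ok = [c for c in candidates if c.rel_performance >= min_performance]
    if not ok:
        raise ValueError("no setpoint meets the performance floor")
    if policy == "max_performance":
        return max(ok, key=lambda c: c.rel_performance)
    if policy == "min_energy":
        return min(ok, key=lambda c: c.energy)
    if policy == "min_power":
        return min(ok, key=lambda c: c.power)
    raise ValueError(f"unknown policy: {policy}")


# Hypothetical predictions for one workload.
CANDIDATES = [
    Setpoint(200, rel_performance=0.4, energy=1.3, power=0.5),
    Setpoint(400, rel_performance=0.7, energy=1.0, power=0.7),
    Setpoint(800, rel_performance=1.0, energy=1.4, power=1.4),
]

print(choose(CANDIDATES, "min_energy").freq_mhz)
print(choose(CANDIDATES, "min_power", min_performance=0.6).freq_mhz)
```

Switching the policy string, for example when the battery runs low, changes the selected setpoint without touching the model or the mechanism that applies it.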
  • [Figure: plot labeled Performance and Energy, with CPU-bound and memory-bound curves]
  • [Figure: plot with curves labeled CPU-bound and memory-bound]
  • >  Implementation of power model and policies •  once for platform vs once for each guest •  no guest has global view, hypervisor does •  integration with other cores −  DSPs, baseband processor •  policy-mechanism separation
  • >  Controls all resources •  CPU, memory, devices >  De-privileged guest OSes •  execute in user mode •  prevents interference −  with hypervisor −  with other guests •  ensures hypervisor retains control over resources
  • >  Subsystems compete for it >  Cannot let subsystems manage it •  just as with memory, CPU >  Needs trusted, central authority >  Needs to be done in virtualization layer
  • >  Mechanisms in hypervisor >  Policies in isolated management module >  Keep hypervisor policy-free •  HW-like
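The split between mechanism and policy can be sketched as two small classes. Both class names and the battery threshold are hypothetical and do not describe the OKL4 interfaces; the point is only that the mechanism validates and applies a setting, while the decision lives in a separate module.

```python
class FrequencyMechanism:
    """Hypothetical hypervisor-side mechanism: applies settings, decides nothing."""

    def __init__(self, allowed_freqs):
        self.allowed_freqs = tuple(sorted(allowed_freqs))
        self.current = self.allowed_freqs[-1]

    def set_frequency(self, freq):
        if freq not in self.allowed_freqs:
            raise ValueError(f"unsupported frequency: {freq}")
        self.current = freq


class BatteryAwarePolicy:
    """Hypothetical isolated management module: decides, then calls the mechanism."""

    def __init__(self, mechanism, low_battery=0.2):
        self.mechanism = mechanism
        self.low_battery = low_battery

    def on_battery_level(self, fraction):
        freqs = self.mechanism.allowed_freqs
        target = freqs[0] if fraction < self.low_battery else freqs[-1]
        self.mechanism.set_frequency(target)


mechanism = FrequencyMechanism([200, 400, 800])
policy = BatteryAwarePolicy(mechanism)
policy.on_battery_level(0.1)
print(mechanism.current)
```

Replacing `BatteryAwarePolicy` with a different policy class requires no change to the mechanism, which is the property the slide asks for.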
  • >  Additional degree of freedom •  DVFS + sleep states + core shutdown •  Hypervisor supports transparent, temporary consolidation of cores •  Unneeded cores turned off to reduce power >  Different tradeoffs •  Performance vs power close to linear >  Important to manage cores globally •  On average more cores off than with per-guest management •  Can use deeper sleep state •  Less overall energy use [Diagrams: two panels, each showing Subsystem #1 and Subsystem #2 with four VCPUs on the OKL4 Microvisor over four CPUs]
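The benefit of managing cores globally can be illustrated with a simple packing exercise. The sketch below uses a greedy first-fit-decreasing heuristic, which is an assumption of this example and not necessarily what the microvisor does; the VCPU names and load figures are made up.

```python
def consolidate(vcpu_loads, num_cores, capacity=1.0):
    """Greedy first-fit-decreasing placement of VCPUs onto as few cores as fit."""
    cores = [[] for _ in range(num_cores)]
    used = [0.0] * num_cores
    for vcpu, load in sorted(vcpu_loads.items(), key=lambda kv: -kv[1]):
        for i in range(num_cores):
            if used[i] + load <= capacity + 1e-9:
                cores[i].append(vcpu)
                used[i] += load
                break
        else:
            raise ValueError(f"no core can host {vcpu}")
    return cores


def active(cores):
    """Number of cores that host at least one VCPU and must stay powered."""
    return sum(1 for c in cores if c)


# Hypothetical utilization of four VCPUs belonging to two subsystems.
LOADS = {"s1.vcpu0": 0.3, "s1.vcpu1": 0.2, "s2.vcpu0": 0.4, "s2.vcpu1": 0.1}

global_active = active(consolidate(LOADS, num_cores=4))

# Per-guest management: each subsystem can only pack its own two cores.
guest1 = {k: v for k, v in LOADS.items() if k.startswith("s1.")}
guest2 = {k: v for k, v in LOADS.items() if k.startswith("s2.")}
per_guest_active = active(consolidate(guest1, 2)) + active(consolidate(guest2, 2))

print(global_active, per_guest_active)
```

With these loads a global view needs one powered core, whereas two independently managed guests keep two cores on.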
  • >  Cache coherency couples clock frequencies of multiple cores >  OSes running on different cores cannot adjust clock independently >  Requires entity with global view
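When cores share a clock, some entity has to reconcile the frequency requests of the guests running on them. The sketch below assumes one plausible rule, picking the lowest supported frequency that satisfies the most demanding request; the rule, the guest names and the frequency table are all hypothetical.

```python
def arbitrate(requests, allowed):
    """Lowest supported frequency that satisfies every guest's request."""
    needed = max(requests.values())
    for freq in sorted(allowed):
        if freq >= needed:
            return freq
    return max(allowed)  # nothing is fast enough, so run flat out


ALLOWED_MHZ = [200, 400, 800]  # hypothetical shared-clock setpoints

print(arbitrate({"guest_a": 300, "guest_b": 650}, ALLOWED_MHZ))
print(arbitrate({"guest_a": 150, "guest_b": 200}, ALLOWED_MHZ))
```

No single guest can compute this result, because it depends on the requests of all guests in the shared clock domain.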
  • >  Cores have same ISA but different clock rates >  Hypervisor can determine optimal mapping of subsystems to cores •  Using same infrastructure as for DVFS •  Integrate with temporary core consolidation [Diagram: a CPU-bound subsystem and a memory-bound subsystem with four VCPUs on the OKL4 Microvisor over two fast and two slow CPUs]
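Mapping subsystems onto cores of different speeds can be sketched as a sort by memory-boundedness. The heuristic below (most CPU-bound VCPUs get the fast cores) is an assumption of this example, and the memory-boundedness scores and core names are made up.

```python
def map_to_cores(mem_boundedness, fast_cores, slow_cores):
    """Assign the most CPU-bound VCPUs to fast cores, the rest to slow cores."""
    order = sorted(mem_boundedness, key=mem_boundedness.get)  # most CPU-bound first
    cores = list(fast_cores) + list(slow_cores)
    if len(order) > len(cores):
        raise ValueError("more VCPUs than cores")
    return dict(zip(order, cores))


# Hypothetical scores: 0.0 = purely CPU-bound, 1.0 = purely memory-bound.
SCORES = {"a.vcpu0": 0.1, "a.vcpu1": 0.2, "b.vcpu0": 0.8, "b.vcpu1": 0.9}

mapping = map_to_cores(SCORES, fast_cores=["cpu0", "cpu1"], slow_cores=["cpu2", "cpu3"])
print(mapping)
```

The scores could come from the same counter-based model used for DVFS decisions, which is why the slide describes the two as sharing infrastructure.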
  • >  Move subsystems between cores •  including temporary consolidation of different subsystems on common core >  Architectural inter-core dependencies •  cannot manage core clocks independently >  Requires global control •  ... outside individual OSes •  indirection layer between OS and hardware >  No practical alternative to virtualization! [Diagram: Subsystem #1 and Subsystem #2 with four VCPUs on the OKL4 Microvisor over four CPUs]
  • >  Virtualization is unavoidable long-term >  ... but provides other benefits short-term >  Early uptake maximises benefits >  Future-proof your designs!
  •   Thank You!