XS Boston 2008 Memory Overcommit
Dan Magenheimer: Memory Overcommit...Without the Commitment

Document Transcript

  • Memory Overcommit… without the Commitment
    Speaker: Dan Magenheimer
    © 2008 Oracle Corporation
  • Overview
    • What is overcommitment? (aka oversubscription or overbooking)
    • Why (and why not) overcommit memory?
    • Known techniques for memory overcommit
    • Feedback-directed ballooning
  • CPU overcommitment
    Four underutilized 2-CPU virtual servers consolidated onto one 4-CPU physical server.
    Xen supports CPU overcommitment (aka “consolidation”).
  • I/O overcommitment
    Four underutilized 2-CPU virtual servers, each with a 1Gb NIC, on one 4-CPU physical server with a single 1Gb NIC.
    Xen supports I/O overcommitment.
  • Memory overcommitment???
    Four underutilized 2-CPU virtual servers, each with 1GB RAM, on one 4-CPU physical server with 4GB RAM.
    SORRY! Xen doesn’t support memory overcommitment!
  • Why doesn’t Xen overcommit memory?
    • Memory is cheap – buy more
    • “When you overbook memory excessively, performance takes a hit”
    • Most consolidated workloads don’t benefit much
    • Overcommit requires swapping – and Xen doesn’t do I/O in the hypervisor
    • Overcommit adds lots of complexity and latency to important features like save/restore/migration
    • Operating systems know what they are doing and can use all the memory they can get
  • Why doesn’t Xen overcommit memory?
    • Memory is cheap – buy more… except when you’re out of slots or need BIG DIMMs.
    • “When you overbook memory excessively, performance takes a hit”… yes, but that’s true of overbooking CPU and I/O too; sometimes tradeoffs have to be made.
    • Most consolidated workloads don’t benefit much… but some workloads do!
    • Overcommit requires swapping – and Xen doesn’t do I/O in the hypervisor… only if black-box swapping is required.
    • Overcommit adds lots of complexity and latency to important features like save/restore/migration… some techniques maybe… we’ll see…
    • Operating systems know what they are doing and can use all the memory they can get… but an idle or lightly loaded OS may not!
  • Why should Xen support memory overcommitment?
    • Competitive reasons: “VMware Infrastructure’s exclusive ability to overcommit memory gives it an advantage in cost per VM that others can’t match” *
    • High(er)-density consolidation can save money
    • The sum of guest working sets is often smaller than available physical memory
    • Inefficient guest OS utilization of physical memory (caching vs. “hoarding”)
    * E. Horschman, “Cheap Hypervisors: A Fine Idea – If You Can Afford Them,” blog posting, http://blogs.vmware.com/virtualreality/2008/03/cheap-hyperviso.html
  • Problem statement
    Oracle OnDemand businesses (both internal and external):
    • would like to use Oracle VM (Xen-based)
    • but use memory overcommit extensively
    The Oracle VM team was asked: can we
    • implement memory overcommit on Xen?
    • get it accepted upstream?
  • Memory Overcommit Investigation
    • Technology survey
      • understand known techniques and implementations
      • understand what Xen has today and its limitations
    • Propose a solution
      • OK to place requirements on the guest (e.g. a black-box solution is unnecessary)
      • soon and good is better than late and great; phased delivery OK if necessary (e.g. Oracle Enterprise Linux now, Windows later)
      • preferably high bang for the buck (e.g. 80% of the value at 20% of the cost)
  • Techniques for memory overcommit
    • Ballooning
    • Content-based page sharing
    • VMM-driven demand paging
    • Hot-plug memory add/delete
    • Ticketed ballooning
    • Swapping entire guests
    Black-box or gray-box*… or white-box?
    * T. Wood et al., “Black-box and Gray-box Strategies for Virtual Machine Migration,” Proceedings of NSDI ’07. http://www.cs.umass.edu/~twood/pubs/NSDI07.pdf
  • WHAT IF…?
    Operating systems were able to:
    • recognize when physical memory is not being used efficiently, and communicate relevant statistics
    • surrender memory when it is underutilized
    • reclaim memory when it is needed
    And Xen/domain0 could balance the allocation of physical memory, just as it does for CPUs and devices?
    … Maybe this is already possible?!?
  • Ballooning (gray-box)
    • In-guest device driver “steals” / reclaims memory via guest in-kernel APIs
      • e.g. get_free_page() and MmAllocatePagesForMdl()
    • Balloon inflation increases guest memory pressure
      • leverages the guest’s native memory-management algorithms
    • Xen has ballooning today
      • mostly used for domain0 autoballooning
      • has problems, but a recent patch avoids the worst*
    • VMware and KVM have it today too
    Issues:
    • driver must be installed
    • not available during boot
    • reclaim may not be fast enough; potential out-of-memory conditions
    * J. Beulich, “[PATCH] linux/balloon: don’t allow ballooning down a domain below a reasonable limit,” xen-devel archive, http://lists.xensource.com/archives/html/xen-devel/2008-04/msg00143.html
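    For context, this is the mechanism domain0 already drives through the xm toolstack: writing a new memory target causes the in-guest balloon driver to inflate or deflate. A minimal sketch (the domain name and sizes are illustrative):

      # Shrink a guest to 512MB: domain0 publishes the new target and the
      # guest's balloon driver inflates, returning the difference to Xen.
      xm mem-set guest1 512

      # Deflate the balloon later, giving the memory back to the guest.
      xm mem-set guest1 1024

      # Observe current allocations for all domains.
      xm list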
  • Content-based page sharing (black-box)
    • One physical page frame used for multiple identical pages
      • sharing works both intra-guest and inter-guest
      • hypervisor periodically scans for copies and “merges” them
      • copy-on-write breaks the share
    • Investigated on Xen, but never in-tree* **
      • measured savings of 4-12%
    • VMware*** has had it for a long time; KVM soon****
    Issues:
    • performance cost of discovery scans and frequent share set-up/tear-down
    • high complexity for relatively low gain
    * J. Kloster et al., “On the Feasibility of Memory Sharing: Content-Based Page Sharing in the Xen Virtual Machine Monitor,” technical report, 2006. http://www.cs.aau.dk/library/files/rapbibfiles1/1150283144.pdf
    ** G. Milos, “Memory COW in Xen,” Xen Summit presentation, Nov 2007. http://www.xen.org/files/xensummit_fall2007/18_GregorMilos.pdf
    *** C. Waldspurger, “Memory Resource Management in VMware ESX Server,” Proceedings of OSDI ’02. http://www.usenix.org/events/osdi02/tech/walspurger/waldspurger.pdf
    **** A. Kivity, “Memory Overcommit with KVM,” http://avikivity.blogspot.com/2008/04/memory-overcommit-with-kvm.html
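    As a back-of-the-envelope check on figures like the 4-12% savings above, the sharing opportunity in a memory image can be estimated offline by hashing page-sized chunks and counting duplicates. A rough sketch, assuming a raw guest memory dump in guest.mem (a hypothetical file; this is slow, purely illustrative tooling, not the in-hypervisor scanner):

      #!/bin/bash
      # Hash each 4KiB page of a raw memory image and count how many
      # pages duplicate the content of another page.
      DUMP=guest.mem                               # hypothetical dump file
      PAGES=$(( $(stat -c%s "$DUMP") / 4096 ))
      UNIQUE=$(
        for ((i = 0; i < PAGES; i++)); do
          dd if="$DUMP" bs=4096 skip=$i count=1 2>/dev/null | md5sum
        done | sort -u | wc -l
      )
      echo "pages=$PAGES unique=$UNIQUE shareable=$(( PAGES - UNIQUE ))"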
  • Demand paging (black-box)
    • VMM reclaims memory and swaps it to disk
    • VMware has it today
      • used as a last resort
      • randomized page selection
    • Could potentially be done on Xen via domain0
    Issues:
    • “hypervisor” must have disk/net drivers
    • “semantic gap”*; double paging
    * P. Chen et al., “When Virtual is Better than Real,” Proceedings of HotOS ’01. http://www.eecs.umich.edu/~pmchen/papers/chen01.pdf
  • Hotplug memory add/delete (white-box)
    • Essentially just ballooning, but with:
      • larger granularity
      • less fragmentation
      • potentially unlimited maximum memory
      • no kernel data overhead for unused pages
    Issues:
    • not widely available yet (for x86)
    • larger granularity
    • hotplug delete requires defragmentation
    J. Schopp et al., “Resizing Memory with Balloons and Hotplug,” Ottawa Linux Symposium 2006. https://ols2006.108.redhat.com/reprints/schopp-reprint.pdf
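    On Linux kernels built with memory hotplug support, the granularity difference is visible directly in sysfs, where memory comes and goes a whole section at a time. A minimal sketch (the section number is illustrative; offlining additionally requires hot-remove support in the kernel):

      # List memory sections and their state; a section is typically
      # 128MiB on x86, versus one page for ballooning.
      grep . /sys/devices/system/memory/memory*/state

      # Offline one section (succeeds only if its pages can be migrated
      # away; this is the defragmentation requirement for delete)...
      echo offline > /sys/devices/system/memory/memory8/state

      # ...and bring it back online.
      echo online > /sys/devices/system/memory/memory8/state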
  • “Ticketed” ballooning (white-box)
    • Proposed by Ian Pratt*
    • A ticket is obtained when a page is surrendered to the balloon driver
    • The original page can be retrieved if Xen hasn’t given the page to another domain
    • Similar to a system-wide second-chance buffer cache or an unreliable swap device
    • Never implemented (afaik)
    * http://lists.xensource.com/archives/html/xen-devel/2008-05/msg00321.html
  • Whole-guest swapping (black?-box)
    • Proposed by Keir Fraser*
    • Forced save/restore of idle/low-priority guests
    • Wake-on-LAN-like mechanism triggers the restore
    • Never implemented (afaik)
    Issues:
    • very long latency for guest resume
    • very high system I/O overhead when densely overcommitted
    * http://lists.xensource.com/archives/html/xen-devel/2005-12/msg00409.html
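    In toolstack terms, whole-guest swapping would essentially script what xm already exposes; a sketch (the domain name and path are illustrative):

      # "Swap out" an idle guest: suspend it and write its entire memory
      # image to disk, freeing all of its physical memory.
      xm save idle-guest /var/lib/xen/save/idle-guest.chkpt

      # "Swap in" later; this step is where the long resume latency and
      # heavy I/O appear when many guests cycle in and out.
      xm restore /var/lib/xen/save/idle-guest.chkpt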
  • Observations
    • The Xen balloon driver works well
      • a recent patch avoids O-O-M problems
      • works on HVM if pv-on-hvm drivers are present
      • ballooning up from “memory=xxx” to “maxmem=yyy” works (on pvm domains)
      • a ballooned-down domain doesn’t restrict creation of new domains
    • Linux provides lots of memory-status information
      • /proc/meminfo and /proc/vmstat
      • Committed_AS is a decent estimator of current memory need
    • Linux does OK when put under memory pressure
      • rapid/frequent balloon inflation/deflation just works… as long as remaining available Linux memory is not too small
      • a properly configured Linux swap disk works when necessary, obviating the need for “system-wide” demand paging
    • Xenstore tools work for two-way communication, even in HVM guests
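    The statistics referenced above are available to any user-land script; for example:

      # Committed_AS: address space committed to by all processes, used
      # here as an estimator of the guest's current memory need.
      grep Committed_AS /proc/meminfo

      # Broader picture for feedback and monitoring.
      grep -E 'MemTotal|MemFree|Committed_AS' /proc/meminfo
      grep -E 'pswpin|pswpout' /proc/vmstat    # swap activity under pressure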
  • Proposed Solution: Feedback-directed ballooning (gray-box)
    Use relevant Linux memory statistics to control balloon size.
    • Selfballooning:
      • local feedback loop; immediate balloon changes
      • eagerly inflates the balloon to create memory pressure
      • no management or domain0 involvement
    • Directed ballooning:
      • memory stats fed from each domainU to domain0
      • a policy module in domain0 determines balloon size and controls memory pressure for each domain (not yet implemented)
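    A minimal sketch of the selfballooning feedback loop, assuming the balloon target is writable at /proc/xen/balloon in KiB (the exact interface is kernel-dependent) and using illustrative floor and hysteresis values:

      #!/bin/bash
      # Selfballooning: steer guest memory toward Committed_AS, rate-
      # limited by hysteresis and bounded below by a safety floor.
      FLOOR_KB=65536    # illustrative minimum-memory floor
      while true; do
        committed=$(awk '/Committed_AS/ {print $2}' /proc/meminfo)
        current=$(awk '/MemTotal/ {print $2}' /proc/meminfo)
        target=$(( committed > FLOOR_KB ? committed : FLOOR_KB ))
        # Hysteresis: move only part of the way toward the target each step.
        echo $(( current + (target - current) / 4 )) > /proc/xen/balloon
        sleep 1
      done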
  • Implementation: Feedback-directed ballooning
    • No changes to Xen or to the domain0 kernel or drivers!
    • Entirely implemented with user-land bash scripts
    • Selfballooning and stat reporting/monitoring only (for now)
    • Committed_AS used (for now) as the memory estimator
    • Hysteresis parameters, settable to rate-limit balloon changes
    • Minimum memory floor enforced to avoid O-O-M conditions
      • same maxmem-dependent algorithm as the recent balloon driver bugfix
    • Other guest requirements:
      • properly sized and configured swap (virtual) disk for each guest
      • HVM: pv-on-hvm drivers present
      • xenstore tools present (but not needed for selfballooning)
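    For the stat-reporting side, the standard xenstore command-line tools suffice for the guest-to-domain0 channel; a sketch (the key name is hypothetical; relative paths resolve under the domain's own /local/domain/<domid> home):

      # In the guest: publish current memory need to xenstore.
      committed=$(awk '/Committed_AS/ {print $2}' /proc/meminfo)
      xenstore-write data/committed_kb "$committed"

      # In domain0: read it back for, e.g., domain 3 and feed the
      # (not-yet-implemented) policy module.
      xenstore-read /local/domain/3/data/committed_kb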
  • Feedback-directed Ballooning Results
    • Overcommit ratio
      • 7:4 with the default configuration (7 loaded 512MB guests, 2GB physical memory)
      • 15:4 with an aggressive configuration (15 idle 512MB guests, 2GB physical memory)
      • for pvm guests, arbitrarily higher due to “maxmem=”
    • Preliminary performance (Linux kernel make after make clean, 5 runs, mean of middle 3):

      Memory  Ballooning  Min   User   Sys    Elapsed  Major page  Down hysteresis
      (MB)                (MB)  (sec)  (sec)  (sec)    faults      (sec)
      2048    off         -     785    121    954      0           -
      2048    self        368   778    104    1209     8940        10    (27% slower)
      1024    off         -     775    95     1120     4650        -
      1024    self        238   775    93     1201     9490        10
      512     off         -     775    79     1167     8520        -
      512     self        172   775    80     1202     9650        10    (3% slower)
      256     off         -     773    79     1186     9150        -
      256     self        106   775    80     1202     9650        10

    Selfballooning is costly for large-memory domains but barely noticeable for smaller-memory domains.
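    The “maxmem=” headroom mentioned above comes straight from the guest configuration file; a sketch with illustrative values:

      # /etc/xen/guest1 (illustrative): boot the pvm guest with 512MB but
      # allow ballooning up to 2GB when physical memory is available.
      memory = 512
      maxmem = 2048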
  • [Screenshot: domain0 with monitoring tool and xentop showing memory overcommitment]
  • Future Work
    • Domain0 policy module for directed ballooning
      • some combination of directed and self-ballooning??
    • Improved feedback/heuristics
      • combine multiple memory statistics; check idle time
    • Prototype kernel changes (“white-box” feedback)
      • better “idle memory” metrics
    • Benchmarking in the real world
    • More aggressive minimum-memory experiments
    • Windows support
  • Conclusions
    • Xen does do memory overcommit today!
    • Memory overcommit has some performance impact
      • but it is still useful in environments where high VM density matters more than maximum performance
    • Lots of cool research directions possible for “virtualization-aware” OS memory management
  • Memory Overcommit… without the Commitment
    Speaker: Dan Magenheimer
    © 2008 Oracle Corporation