XS Boston 2008 Memory Overcommit

Dan Magenheimer: Memory Overcommit...Without the Commitment



Transcript

  • 1. Memory Overcommit… without the Commitment
    Speaker: Dan Magenheimer
    2008 Oracle Corporation
  • 2. Overview
    • What is overcommitment? (aka oversubscription or overbooking)
    • Why (and why not) overcommit memory?
    • Known techniques for memory overcommit
    • Feedback-directed ballooning
    Memory Overcommit without the Commitment (Xen Summit 2008) - Dan Magenheimer
  • 3. CPU overcommitment
    • Four underutilized 2-CPU virtual servers
    • One 4-CPU physical server
    • Xen supports CPU overcommitment (aka “consolidation”)
  • 4. I/O overcommitment
    • Four underutilized 2-CPU virtual servers, each with a 1Gb NIC
    • One 4-CPU physical server with a 1Gb NIC
    • Xen supports I/O overcommitment
  • 5. Memory overcommitment???
    • Four underutilized 2-CPU virtual servers, each with 1GB RAM
    • One 4-CPU physical server with 4GB RAM
    • SORRY! Xen doesn’t support memory overcommitment!
  • 6. Why doesn’t Xen overcommit memory?
    • Memory is cheap – buy more
    • “When you overbook memory excessively, performance takes a hit”
    • Most consolidated workloads don’t benefit much
    • Overcommit requires swapping – and Xen doesn’t do I/O in the hypervisor
    • Overcommit adds lots of complexity and latency to important features like save/restore/migration
    • Operating systems know what they are doing and can use all the memory they can get
  • 7. Why doesn’t Xen overcommit memory?
    • Memory is cheap – buy more… except when you’re out of slots or need BIG DIMMs
    • “When you overbook memory excessively, performance takes a hit”… yes, but that’s true of overbooking CPU and I/O too; sometimes tradeoffs have to be made
    • Most consolidated workloads don’t benefit much… but some workloads do!
    • Overcommit requires swapping – and Xen doesn’t do I/O in the hypervisor… only if black-box swapping is required
    • Overcommit adds lots of complexity and latency to important features like save/restore/migration… some techniques maybe… we’ll see…
    • Operating systems know what they are doing and can use all the memory they can get… but an idle or lightly loaded OS may not!
  • 8. Why should Xen support memory overcommitment?
    • Competitive reasons: “VMware Infrastructure’s exclusive ability to overcommit memory gives it an advantage in cost per VM that others can’t match”*
    • High(er)-density consolidation can save money
    • Sum of guest working sets is often smaller than available physical memory
    • Inefficient guest OS utilization of physical memory (caching vs. “hoarding”)
    * E. Horschman, "Cheap Hypervisors: A Fine Idea -- If You Can Afford Them", blog posting, http://blogs.vmware.com/virtualreality/2008/03/cheap-hyperviso.html
  • 9. Problem statement
    Oracle OnDemand businesses (both internal and external):
    • would like to use Oracle VM (Xen-based)
    • but use memory overcommit extensively
    The Oracle VM team was asked… can we:
    • implement memory overcommit on Xen?
    • get it accepted upstream?
  • 10. Memory Overcommit Investigation
    • Technology survey
      • understand known techniques and implementations
      • understand what Xen has today and its limitations
    • Propose a solution
      • OK to place requirements on the guest (e.g. a black-box solution is unnecessary)
      • soon and good is better than late and great
      • phased delivery OK if necessary (e.g. Oracle Enterprise Linux now, Windows later)
      • preferably high bang for the buck (e.g. 80% of value with 20% of cost)
  • 11. Techniques for memory overcommit
    • Ballooning
    • Content-based page sharing
    • VMM-driven demand paging
    • Hot-plug memory add/delete
    • Ticketed ballooning
    • Swapping entire guests
    Black-box or gray-box*… or white-box?
    * T. Wood et al., "Black-box and Gray-box Strategies for Virtual Machine Migration", in Proceedings of NSDI ’07. http://www.cs.umass.edu/~twood/pubs/NSDI07.pdf
  • 12. WHAT IF…?
    Operating systems were able to:
    • recognize when physical memory is not being used efficiently, and communicate relevant statistics
    • surrender memory when it is underutilized
    • reclaim memory when it is needed
    And Xen/domain0 could balance the allocation of physical memory, just as it does for CPUs and devices?
    … Maybe this is already possible?!?
  • 13. Ballooning (gray-box)
    Currently implemented by:
    • in-guest device driver “steals”/reclaims memory via guest in-kernel APIs, e.g. get_free_page() and MmAllocatePagesForMdl()
    • balloon inflation increases guest memory pressure; leverages the guest’s native memory management algorithms
    • Xen has ballooning today
      • mostly used for domain0 autoballooning
      • has problems, but a recent patch avoids the worst*
    • VMware and KVM have it today too
    Issues:
    • driver must be installed
    • not available during boot
    • reclaim may not be fast enough; potential out-of-memory conditions
    * J. Beulich, "[PATCH] linux/balloon: don’t allow ballooning down a domain below a reasonable limit", xen-devel archive, http://lists.xensource.com/archives/html/xen-devel/2008-04/msg00143.html
  • 14. Content-based page sharing (black-box)
    • One physical page frame used for multiple identical pages
      • sharing works both intra-guest and inter-guest
      • hypervisor periodically scans for copies and “merges” them
      • copy-on-write breaks the share
    Currently implemented by:
    • investigated on Xen, but never in-tree* ** (measured savings of 4-12%)
    • VMware*** has had it for a long time; KVM soon****
    Issues:
    • performance cost of discovery scans and frequent share set-up/tear-down
    • high complexity for relatively low gain
    * J. Kloster et al., "On the feasibility of memory sharing: content-based page sharing in the Xen virtual machine monitor", Technical Report, 2006. http://www.cs.aau.dk/library/files/rapbibfiles1/1150283144.pdf
    ** G. Milos, "Memory COW in Xen", presentation at Xen Summit, Nov 2007. http://www.xen.org/files/xensummit_fall2007/18_GregorMilos.pdf
    *** C. Waldspurger, "Memory Resource Management in VMware ESX Server", in Proceedings of OSDI ’02. http://www.usenix.org/events/osdi02/tech/walspurger/waldspurger.pdf
    **** A. Kivity, "Memory overcommit with kvm", http://avikivity.blogspot.com/2008/04/memory-overcommit-with-kvm.html
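The discovery scan at the heart of page sharing can be illustrated outside of any hypervisor. The sketch below is my own illustration, not Xen or VMware code: `shareable_pages` is a hypothetical helper that treats a file as a memory image, hashes each fixed-size "page", and counts pages that duplicate an earlier one, i.e. frames a hypervisor could merge and protect copy-on-write.

```shell
#!/bin/sh
# Illustrative sketch of a content-based sharing scan (not hypervisor code).
PAGE=4096   # x86 page size in bytes

shareable_pages() {
    image=$1
    # Number of PAGE-sized pages in the image (rounded up).
    total=$(( ($(wc -c < "$image") + PAGE - 1) / PAGE ))
    # Hash every page; pages whose hash was already seen are merge candidates.
    unique=$(
        i=0
        while [ "$i" -lt "$total" ]; do
            dd if="$image" bs=$PAGE skip=$i count=1 2>/dev/null | cksum
            i=$((i + 1))
        done | sort -u | wc -l
    )
    # Duplicates = total pages minus distinct page contents.
    echo $(( total - unique ))
}
```

A real implementation would verify byte-for-byte equality on a hash match before sharing, since a checksum alone can collide.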
  • 15. Demand paging (black-box)
    • VMM reclaims memory and swaps it to disk
    Currently implemented by:
    • VMware has it today
      • used as a last resort
      • randomized page selection
    • KVM (via the host kernel)
    • could potentially be done on Xen via domain0
    Issues:
    • “hypervisor” must have disk/net drivers
    • “semantic gap”*
    • double paging
    * P. Chen et al., "When Virtual is Better than Real", in Proceedings of HotOS ’01. http://www.eecs.umich.edu/~pmchen/papers/chen01.pdf
  • 16. Hotplug memory add/delete (white-box)
    • Essentially just ballooning, but with:
      • larger granularity
      • less fragmentation
      • potentially unlimited maximum memory
      • no kernel data overhead for unused pages
    Issues:
    • not widely available yet (for x86)
    • larger granularity
    • hotplug delete requires defragmentation
    J. Schopp et al., "Resizing Memory with Balloons and Hotplug", Ottawa Linux Symposium 2006. https://ols2006.108.redhat.com/reprints/schopp-reprint.pdf
  • 17. “Ticketed” ballooning (white-box)
    • Proposed by Ian Pratt*
    • A ticket is obtained when a page is surrendered to the balloon driver
    • The original page can be retrieved if Xen hasn’t given the page to another domain
    • Similar to a system-wide second-chance buffer cache, or an unreliable swap device
    • Never implemented (afaik)
    * http://lists.xensource.com/archives/html/xen-devel/2008-05/msg00321.html
  • 18. Whole-guest swapping (black?-box)
    • Proposed by Keir Fraser*
    • Forced save/restore of idle/low-priority guests
    • Wake-on-LAN-like mechanism causes restore
    • Never implemented (afaik)
    Issues:
    • very long latency for guest resume
    • very high system I/O overhead when densely overcommitted
    * http://lists.xensource.com/archives/html/xen-devel/2005-12/msg00409.html
  • 19. Observations
    • The Xen balloon driver works well
      • recent patch avoids out-of-memory problems
      • works on hvm if pv-on-hvm drivers are present
      • ballooning up from “memory=xxx” to “maxmem=yyy” works (on pvm domains)
      • a ballooned-down domain doesn’t restrict creation of new domains
    • Linux provides lots of memory-status information
      • /proc/meminfo and /proc/vmstat
      • Committed_AS is a decent estimator of current memory need
    • Linux does OK when put under memory pressure
      • rapid/frequent balloon inflation/deflation just works… as long as remaining available Linux memory is not too small
      • a properly configured Linux swap disk works when necessary; obviates the need for “system-wide” demand paging
    • Xenstore tools work for two-way communication, even in hvm
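Committed_AS is a single line in /proc/meminfo, reported in kB. A minimal reader for the stat-reporting side might look like this (the function name is mine; the field format is standard Linux):

```shell
#!/bin/sh
# Extract the Committed_AS estimate, converted from kB to MB,
# from a file in /proc/meminfo format.
committed_as_mb() {
    awk '/^Committed_AS:/ { print int($2 / 1024) }' "$1"
}

# On a live Linux guest:
#   committed_as_mb /proc/meminfo
```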
  • 20. Proposed solution: Feedback-directed ballooning (gray-box)
    Use relevant Linux memory statistics to control balloon size
    • Selfballooning:
      • local feedback loop; immediate balloon changes
      • eagerly inflates the balloon to create memory pressure
      • no management or domain0 involvement
    • Directed ballooning:
      • memory stats fed from each domainU to domain0
      • policy module in domain0 determines balloon size and controls memory pressure for each domain (not yet implemented)
  • 21. Implementation: Feedback-directed ballooning
    • No changes to Xen or to the domain0 kernel or drivers!
    • Entirely implemented with user-land bash scripts
    • Selfballooning and stat reporting/monitoring only (for now)
    • Committed_AS used (for now) as the memory estimator
    • Settable hysteresis parameters rate-limit balloon changes
    • Minimum memory floor enforced to avoid out-of-memory conditions
      • same maxmem-dependent algorithm as the recent balloon driver bugfix
    • Other guest requirements:
      • properly sized and configured swap (virtual) disk for each guest
      • hvm: pv-on-hvm drivers present
      • xenstore tools present (but not needed for selfballooning)
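The original scripts are not reproduced here, but the feedback step they describe can be sketched as follows. All names and constants below are my own assumptions, not the original code: aim the balloon target at Committed_AS, never below a minimum floor, and step only part of the way toward the goal each pass (hysteresis) so the balloon does not thrash.

```shell
#!/bin/sh
# Hedged sketch of one selfballooning feedback iteration.
FLOOR_KB=65536   # assumed minimum memory floor (kB) to avoid OOM
HYSTERESIS=8     # move 1/HYSTERESIS of the gap per iteration

next_target_kb() {
    cur=$1        # current balloon target (kB)
    committed=$2  # Committed_AS from /proc/meminfo (kB)
    goal=$committed
    # Enforce the minimum memory floor.
    [ "$goal" -lt "$FLOOR_KB" ] && goal=$FLOOR_KB
    # Rate-limit: step only a fraction of the way toward the goal.
    echo $(( cur + (goal - cur) / HYSTERESIS ))
}

# On a real guest, a loop would feed the result to the balloon driver,
# e.g. writing the target to /proc/xen/balloon on classic Xen pv kernels.
```

A production version would also add slack above Committed_AS and use the maxmem-dependent floor from the balloon driver bugfix rather than a fixed constant.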
  • 22. Feedback-directed ballooning results
    • Overcommit ratio
      • 7:4 w/default configuration (7 512MB loaded guests, 2GB physical memory)
      • 15:4 w/aggressive configuration (15 512MB idle guests, 2GB physical memory)
      • for pvm guests, arbitrarily higher due to “maxmem=”
    • Preliminary performance (Linux kernel make after make clean; 5 runs, mean of middle 3)

      Memory (MB) | Ballooning | Min (MB) | User (sec) | Sys (sec) | Elapsed (sec)     | Major page faults | Hysteresis (sec)
      2048        | Off        |    -     |    785     |    121    |  954              |     0             |   -
      2048        | Self       |   368    |    778     |    104    | 1209 (27% slower) |  8940             |  10
      1024        | Off        |    -     |    775     |     95    | 1120              |  4650             |   -
      1024        | Self       |   238    |    775     |     93    | 1201              |  9490             |  10
       512        | Off        |    -     |    775     |     79    | 1167              |  8520             |   -
       512        | Self       |   172    |    775     |     80    | 1202 (3% slower)  |  9650             |  10
       256        | Off        |    -     |    773     |     79    | 1186              |  9150             |   -
       256        | Self       |   106    |    775     |     80    | 1202              |  9650             |  10

    • Selfballooning is costly for large-memory domains but barely noticeable for smaller-memory domains
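The quoted overcommit ratios are just configured guest memory over physical memory; a one-liner (function name mine) makes the arithmetic explicit: 7 guests x 512MB on 2GB is 7:4, i.e. 175%.

```shell
#!/bin/sh
# Overcommit ratio, in percent, of total configured guest memory
# to physical memory.
overcommit_pct() {
    guests=$1    # number of guests
    guest_mb=$2  # configured memory per guest (MB)
    phys_mb=$3   # physical memory on the host (MB)
    echo $(( guests * guest_mb * 100 / phys_mb ))
}
```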
  • 23. Domain0 screenshot with monitoring tool and xentop showing memory overcommitment
  • 24. Future work
    • Domain0 policy module for directed ballooning
      • some combination of directed and self-ballooning?
    • Improved feedback/heuristics
      • combine multiple memory statistics; check idle time
    • Prototype kernel changes (“white-box” feedback)
      • better “idle memory” metrics
    • Benchmarking in the real world
    • More aggressive minimum-memory experiments
    • Windows support
  • 25. Conclusions
    • Xen does do memory overcommit today!
    • Memory overcommit has some performance impact
      • but it is still useful in environments where high VM density is more important than maximum performance
    • Lots of cool research directions possible for “virtualization-aware” OS memory management
  • 26. Memory Overcommit… without the Commitment
    Speaker: Dan Magenheimer
    2008 Oracle Corporation