Your SlideShare is downloading. ×
0
<Insert Picture Here>




                        Transcendent
                        Memory on Xen
    2009             ...
Agenda

         •   Motivation and Challenge
         •   Overview of Physical Memory Management
         •   Transcenden...
Motivation
        • Memory is increasingly becoming a
          bottleneck in virtualized system
        • Existing mecha...
The Virtualized Physical Memory
                      Resource Optimization Challenge
         Optimize, across time, the ...
The Virtualized Physical Memory
                      Resource Optimization Challenge
         Optimize, across time, the ...
Agenda

         • Motivation and Challenge
         • Overview of Physical Memory Management
               • in an opera...
OS Physical Memory Management



                                                                 • Operating systems
    ...
OS Physical Memory Management

                                                                 • Operating systems are
  ...
OS Physical Memory Management

                                                                 • Operating systems are
  ...
OS Physical Memory Management


                                                                    • What does an OS do
 ...
OS Physical Memory Management


                                                                 • What does an OS do
    ...
OS Physical Memory Management


                                                                 • What does an OS do
    ...
OS Physical Memory Management


                                                                 • What does an OS do
    ...
Agenda

         • Motivation and Challenge
         • Overview of Physical Memory Management
               • in an opera...
VMM Physical Memory Management


                                                                 • Xen partitions memory
...
VMM Physical Memory Management


                                                                 • Xen partitions
       ...
VMM Physical Memory Management


                                                                 • Xen partitions
       ...
VMM Physical Memory Management


                                                                         • Xen partitions...
VMM Physical Memory Management
                            in the presence of migration
                                  ...
VMM Physical Memory Management
                            in the presence of ballooning

                                ...
VMM Physical Memory Management
                            in the presence of ballooning

                                ...
VMM Physical Memory Management
                            in the presence of ballooning

                                ...
VMM Physical Memory Management
                            in the presence of ballooning
                                 ...
The Virtualized Physical Memory
                      Resource Optimization Challenge
         Optimize, across time, the ...
Why this IS a hard problem!
                                         Summary
         • OS’s use as much memory as they ar...
Agenda

         •   Motivation and Challenge
         •   Overview of Physical Memory Management
         •   Transcenden...
Transcendent memory
                    creating the transcendent memory pool

                                           ...
Transcendent memory
                    creating the transcendent memory pool

                                           ...
Transcendent memory
                                              API characteristics

                                   ...
Transcendent memory
     four different subpool types                                                four different uses

...
Transcendent memory
                                                                 caveats
        • Requirements
      ...
Agenda

         • Motivation and Challenge
         • Overview of Physical Memory Management
         • Transcendent Memo...
hcache
                                                                 • a second-chance clean
                          ...
hcache (with compression)

                                                                 • Compression
                ...
hcache (multiple guests)

                                                                 • second-chance page cache
    ...
shared hcache (for clustering)


                                                                 • guests sharing a
     ...
hswap

                                                                  • over-ballooned guests
                         ...
Agenda

         •   Motivation and Challenge
         •   Overview of Physical Memory Management
         •   Transcenden...
Current Status

         • hcache and hswap fully working
               • shared hcache soon
         • xen-side patch re...
Future Work

         • finish “shared hcache” work (ocfs2)
         • shared-persistent pool investigation
              ...
Acknowledgements

     • Chris Mason (Oracle)
           • Linux vfs changes for hcache
     • Zhigang Wang (Oracle)
     ...
For more information



           http://oss.oracle.com/projects/tmem




Transcendent Memory on Xen (Xen Summit 2009) - ...
<Insert Picture Here>




                        Transcendent
                        Memory on Xen
    2009             ...
Backup Slides




Transcendent Memory on Xen (Xen Summit 2009) - Dan Magenheimer
Transcendent Memory API
                                                        overview (API v0.0.1)




Transcendent Mem...
Transcendent memory API
                                          op overview (API v0.0.1)

         Two classes of operat...
Transcendent memory API
                                         pool creation (API v0.0.1)
           Syntax: pool_id = t...
Transcendent Memory API
                              what is a “handle”?? (API v0.0.1)
                  retval = tmem_op...
Transcendent Memory API
                                      API operations (API v0.0.1)
        • tmem_new_pool(uuid,fla...
Transcendent Memory API
                         important semantic details (v0.0.1)
           • get_page on a private+ep...
Transcendent memory
                                           hcache performance
                                        ...
Transcendent memory
                                  hcache compensates for
                                 underprovisi...
hcache (multiple domains + compressed)

                                                                 • shared compress...
Upcoming SlideShare
Loading in...5
×

XS Oracle 2009 Transcendent Memory

1,042

Published on

Dan Magenheimer: Transcendent Memory on Xen

Published in: Technology
0 Comments
1 Like
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total Views
1,042
On Slideshare
0
From Embeds
0
Number of Embeds
0
Actions
Shares
0
Downloads
36
Comments
0
Likes
1
Embeds 0
No embeds

No notes for slide

Transcript of "XS Oracle 2009 Transcendent Memory"

  1. 1. <Insert Picture Here> Transcendent Memory on Xen 2009 Speaker: Dan Magenheimer Oracle Corporation
  2. 2. Agenda • Motivation and Challenge • Overview of Physical Memory Management • Transcendent Memory (“tmem”) Overview • Transcendent Memory in Action • Status, Futures, etc. Transcendent Memory on Xen (Xen Summit 2009) - Dan Magenheimer
  3. 3. Motivation • Memory is increasingly becoming a bottleneck in virtualized system • Existing mechanisms have major holes ballooning Four underutilized 2-cpu virtual servers each with 1GB RAM One 4-CPU physical server w/4GB RAM X X memory overcommitment page sharing Transcendent Memory on Xen (Xen Summit 2009) - Dan Magenheimer
  4. 4. The Virtualized Physical Memory Resource Optimization Challenge Optimize, across time, the distribution of machine memory among a maximal set of virtual machines by: • measuring the current and future memory need of each running VM and • reclaiming memory from those VMs that have an excess of memory and either: • providing it to VMs that need more memory or • using it to provision additional new VMs. • without suffering a significant performance penalty Transcendent Memory on Xen (Xen Summit 2009) - Dan Magenheimer
  5. 5. The Virtualized Physical Memory Resource Optimization Challenge Optimize, across time, the distribution of machine memory among a maximal set of virtual machines by: • measuring the current and future memory need of each running VM and • reclaiming memory from those VMs that have an excess of memory and either: • providing it to VMs that need more memory or • using it to provision additional new VMs. • without suffering a significant performance penalty …..Why is this a hard problem? Transcendent Memory on Xen (Xen Summit 2009) - Dan Magenheimer
  6. 6. Agenda • Motivation and Challenge • Overview of Physical Memory Management • in an operating system • in a virtual machine monitor (Xen) • Transcendent Memory Overview • Transcendent Memory In Action • Status, Futures, etc. Transcendent Memory on Xen (Xen Summit 2009) - Dan Magenheimer
  7. 7. OS Physical Memory Management • Operating systems are memory hogs! OS Memory constraint Transcendent Memory on Xen (Xen Summit 2009) - Dan Magenheimer
  8. 8. OS Physical Memory Management • Operating systems are memory hogs! OS If you give an operating system more memory….. New larger memory constraint Transcendent Memory on Xen (Xen Summit 2009) - Dan Magenheimer
  9. 9. OS Physical Memory Management • Operating systems are memory hogs! My name is Linux and I am a memory …it uses up any hog memory you give it! Memory constraint Transcendent Memory on Xen (Xen Summit 2009) - Dan Magenheimer
  10. 10. OS Physical Memory Management • What does an OS do with all that memory? Kernel code Pa ge User data ta Page bl es cache Kernel User code data Everything else Transcendent Memory on Xen (Xen Summit 2009) - Dan Magenheimer
  11. 11. OS Physical Memory Management • What does an OS do with all that memory? page cache Transcendent Memory on Xen (Xen Summit 2009) - Dan Magenheimer
  12. 12. OS Physical Memory Management • What does an OS do with all that memory? page cache Transcendent Memory on Xen (Xen Summit 2009) - Dan Magenheimer
  13. 13. OS Physical Memory Management • What does an OS do with all that memory? page cache …much of the time mostly page cache … some of which will be useful in the future … and some of which is wasted Everything else Transcendent Memory on Xen (Xen Summit 2009) - Dan Magenheimer
  14. 14. Agenda • Motivation and Challenge • Overview of Physical Memory Management • in an operating system • in a virtual machine monitor (Xen) • Transcendent Memory Overview • Transcendent Memory In Action • Status, Futures, etc. Transcendent Memory on Xen (Xen Summit 2009) - Dan Magenheimer
  15. 15. VMM Physical Memory Management • Xen partitions memory • hypervisor memory guest • dom0 memory • guest memory Dom0 is special ☺ Transcendent Memory on Xen (Xen Summit 2009) - Dan Magenheimer
  16. 16. VMM Physical Memory Management • Xen partitions memory guest • Xen memory • dom0 memory • guest 1 memory • guest 2 memory guest • whatever’s left over: “fallow” memory fallow, adj., land left without a crop for one or more years Transcendent Memory on Xen (Xen Summit 2009) - Dan Magenheimer
  17. 17. VMM Physical Memory Management • Xen partitions fallow memory guest • Xen memory fallow • dom0 memory fallow • guest 1 memory • guest 2 memory guest • whatever’s left over: “fallow” memory fallow, adj., land left without a crop for one or more years Transcendent Memory on Xen (Xen Summit 2009) - Dan Magenheimer
  18. 18. VMM Physical Memory Management • Xen partitions memory fallow among more guests fallow gues • Xen memory guest t • dom0 memory guest • guest 1 memory • guest 2 memory • guest 3… fallow guest • BUT still fallow memory leftover Transcendent Memory on Xen (Xen Summit 2009) - Dan Magenheimer
  19. 19. VMM Physical Memory Management in the presence of migration fallow fallow gue guest s t Physical gues machine “B” • migration t fallow • requires fallow memory in the target machine • leaves behind fallow fallow memory in the fallow gue s t originating machine gues t fallow guest Physical machine “A” Transcendent Memory on Xen (Xen Summit 2009) - Dan Magenheimer
  20. 20. VMM Physical Memory Management in the presence of ballooning • Use ballooning to fallow allow guest memory fallow gues guest t size to grow? guest • Goal: fill fallow memory fallow guest Transcendent Memory on Xen (Xen Summit 2009) - Dan Magenheimer
  21. 21. VMM Physical Memory Management in the presence of ballooning • Look! No more fallow guest fallow memory! guest But…. guest gue st gue st Transcendent Memory on Xen (Xen Summit 2009) - Dan Magenheimer
  22. 22. VMM Physical Memory Management in the presence of ballooning • Look! No more fallow fallow guest memory! guest But…. guest gue st gue st And but… Transcendent Memory on Xen (Xen Summit 2009) - Dan Magenheimer
  23. 23. VMM Physical Memory Management in the presence of ballooning Using ballooning to take memory away: • not instantaneous (memory inertia) • guest can’t predict future needs • good pages are evicted along with the bad • don’t know how much/fast to balloon • Too much or too fast thrashing or the dreaded OOM killer Transcendent Memory on Xen (Xen Summit 2009) - Dan Magenheimer
  24. 24. The Virtualized Physical Memory Resource Optimization Challenge Optimize, across time, the distribution of machine memory among a maximal set of virtual machines by: • measuring the current and future memory need of each running VM and • reclaiming memory from those VMs that have an excess of memory and either: • providing it to VMs that need more memory or • using it to provision additional new VMs. • without suffering a significant performance penalty …..This IS a hard problem!!! Transcendent Memory on Xen (Xen Summit 2009) - Dan Magenheimer
  25. 25. Why this IS a hard problem! Summary • OS’s use as much memory as they are given • but cannot predict the future so often guess wrong • and often much memory owned by an OS is wasted • Xen leaves large amounts of memory fallow • fixed partitioning results in fragmentation • migration requires fallow memory to succeed • Ballooning helps but: • can’t predict future memory needs of guests • memory has inertia • the price of incorrect guesses can be dire NEED A NEW APPROACH TO VIRTUALIZED PHYSICAL MEMORY MANAGEMENT!! Transcendent Memory on Xen (Xen Summit 2009) - Dan Magenheimer
  26. 26. Agenda • Motivation and Challenge • Overview of Physical Memory Management • Transcendent Memory Overview • Transcendent Memory In Action • Status, Futures, etc. Transcendent Memory on Xen (Xen Summit 2009) - Dan Magenheimer
  27. 27. Transcendent memory creating the transcendent memory pool • Step 1a: reclaim all fallow memory fallow • Step 1b: reclaim wasted guest fallow memory (e.g. via ballooning) guest guest • Step 1c: collect it all into a pool guest fallow guest Transcendent memory pool Transcendent Memory on Xen (Xen Summit 2009) - Dan Magenheimer
  28. 28. Transcendent memory creating the transcendent memory pool • Step 2: provide indirect access, strictly controlled by guest the hypervisor and dom0 guest data data control Transcendent guest memory data pool control data guest Transcendent Memory on Xen (Xen Summit 2009) - Dan Magenheimer
  29. 29. Transcendent memory API characteristics Transcendent memory API guest guest • paravirtualized (lightly) • narrow • well-specified • operations are: • synchronous • page-oriented (one page per op) • copy-based • multi-faceted • extensible Transcendent memory pool Transcendent Memory on Xen (Xen Summit 2009) - Dan Magenheimer
  30. 30. Transcendent memory four different subpool types four different uses ephemeral persistent private “second-chance” Fast swap clean-page cache!! “device”!! Implemented and working “hcache” “hswap” today (Linux + Xen) In development shared server-side cluster inter-domain filesystem cache? shared Under investigation “shared hcache” memory? eph-em-er-al, adj., … transitory, existing only briefly, short-lived (i.e. NOT persistent) Transcendent Memory on Xen (Xen Summit 2009) - Dan Magenheimer
  31. 31. Transcendent memory caveats • Requirements • guest OS must be paravirtualized • 64-bit hypervisor and CPU • Workload: • should exert memory pressure in at least one guest • memory pressure in multiple guests should vary across time • For best results: • dom0 should be configured with a fixed memory size • guest should have a (virtual) swap disk configured • Complementary to: • feedback-directed ballooning • transparent content-based page sharing Transcendent Memory on Xen (Xen Summit 2009) - Dan Magenheimer
  32. 32. Agenda • Motivation and Challenge • Overview of Physical Memory Management • Transcendent Memory Overview • Transcendent Memory In Action “hcache”* • private-ephemeral pool “shared hcache” • shared-ephemeral pool “hswap”* • private-persistent pool • Status, Future, etc. * called “precache” and “preswap” for Linux Transcendent Memory on Xen (Xen Summit 2009) - Dan Magenheimer
  33. 33. hcache • a second-chance clean page cache for a guest • “put” clean pages only • “get” only valuable pages • pages eventually are evicted Transcendent memory pool • coherency managed by guest (private+ephemeral) • exclusive cache semantics “put” Transcendent Memory Pool types “get” persistent ephemeral guest “second-chance” Fast swap private clean-page cache!! “device”!! “hcache” “hswap” shared server-side cluster inter-domain filesystem cache? shared memory? “shared hcache” Transcendent Memory on Xen (Xen Summit 2009) - Dan Magenheimer
  34. 34. hcache (with compression) • Compression • Option (per-domain) guest • nominally doubles available memory • performance-space tradeoff Transcendent Memory on Xen (Xen Summit 2009) - Dan Magenheimer
  35. 35. hcache (multiple guests) • second-chance page cache for multiple guests guest • Need “memory scheduler”: • global admission/eviction policy: private ephemeral • LRU queue, or tmem pool #1 • weight balanced (future) private ephemeral tmem pool #2 guest Transcendent Memory on Xen (Xen Summit 2009) - Dan Magenheimer
  36. 36. shared hcache (for clustering) • guests sharing a clustered filesystem guest • non-exclusive • LFU instead of LRU • compression optional SHARED ephemeral a server-side disk cache! tmem pool Clustered filesystem Transcendent Memory Pool types persistent ephemeral guest private “second-chance” Fast swap clean-page cache!! “device”!! “hcache” “hswap” server-side cluster inter-domain shared filesystem cache? shared memory? “shared hcache” Transcendent Memory on Xen (Xen Summit 2009) - Dan Magenheimer
  37. 37. hswap • over-ballooned guests experiencing unexpected memory pressure have an emergency swap disk • much faster than swapping • persistent (“dirty”) pages OK • prioritized higher than hcache • limited by domain’s maxmem Transcendent Memory Pool types ephemeral persistent “second-chance” Fast swap private clean-page cache!! “device”!! “hcache” “hswap” shared server-side cluster inter-domain filesystem cache? shared memory? “shared hcache” Transcendent Memory on Xen (Xen Summit 2009) - Dan Magenheimer
  38. 38. Agenda • Motivation and Challenge • Overview of Physical Memory Management • Transcendent Memory Overview • Transcendent Memory In Action • Status, Future, etc. Transcendent Memory on Xen (Xen Summit 2009) - Dan Magenheimer
  39. 39. Current Status • hcache and hswap fully working • shared hcache soon • xen-side patch ready for inclusion in xen-unstable • ~3K line patch, but low impact on existing code • enabled with xen boot option (off by default) • “technology preview” • goal: broader community usage (3.4?) • linux-side patch ready • low impact on existing code • 2.6.18-xen version ready for inclusion in Xen-linux tree • 2.6.28 version working Transcendent Memory on Xen (Xen Summit 2009) - Dan Magenheimer
  40. 40. Future Work • finish “shared hcache” work (ocfs2) • shared-persistent pool investigation • inter-domain communication? • real world performance measurement/analysis • identify tuning opportunities (e.g. scaleability) and repeat • finish “memory scheduler” • tmem for: • native Linux? • Linux containers? • KVM? • Hvm domains? Transcendent Memory on Xen (Xen Summit 2009) - Dan Magenheimer
  41. 41. Acknowledgements • Chris Mason (Oracle) • Linux vfs changes for hcache • Zhigang Wang (Oracle) • Xen tools (xm + libxc) code • Kurt Hackel (Oracle), various HP friends, Ian, Keir, Jeremy • design feedback along the way Transcendent Memory on Xen (Xen Summit 2009) - Dan Magenheimer
  42. 42. For more information http://oss.oracle.com/projects/tmem Transcendent Memory on Xen (Xen Summit 2009) - Dan Magenheimer
  43. 43. <Insert Picture Here> Transcendent Memory on Xen 2009 Speaker: Dan Magenheimer Oracle Corporation
  44. 44. Backup Slides Transcendent Memory on Xen (Xen Summit 2009) - Dan Magenheimer
  45. 45. Transcendent Memory API overview (API v0.0.1) Transcendent Memory on Xen (Xen Summit 2009) - Dan Magenheimer
  46. 46. Transcendent memory API op overview (API v0.0.1) Two classes of operations: • Create a pool Syntax: pool_id = tmem_new_pool(uuid, flags) • Operate on a created pool Generic syntax: retval = tmem_op(handle,pfn[,ofs1,ofs2,len]) Transcendent Memory on Xen (Xen Summit 2009) - Dan Magenheimer
  47. 47. Transcendent memory API pool creation (API v0.0.1) Syntax: pool_id = tmem_new_pool(uuid, flags) ephemeral persistent private “second-chance” Fast swap clean- page cache!! “device”!! Implemented and working “hcache” “hswap” today (Linux + Xen) shared server-side cluster inter-domain Under investigation filesystem cache? shared memory? flags: private vs. shared, ephemeral vs. persistent, page size, API version, … ??? uuid: 128-bit “share name” (for shared pools, ignored for private pools) Transcendent Memory on Xen (Xen Summit 2009) - Dan Magenheimer
  48. 48. Transcendent Memory API what is a “handle”?? (API v0.0.1) retval = tmem_op(handle,pfn) (is actually) retval = tmem_op(pool_id,object_id,page_id,pfn) • The “handle” used in previous slides is actually a three-element “handle-tuple” consisting of: • a 32-bit pool-id (obtained from tmem_new_pool()) • a 64-bit object-id • a 32-bit page-id • In filesystem-like usage: • pool-id one per filesystem • object-id inode • page-id page index into a file Transcendent Memory on Xen (Xen Summit 2009) - Dan Magenheimer
  49. 49. Transcendent Memory API API operations (API v0.0.1) • tmem_new_pool(uuid,flags) • tmem_destroy_pool(pool_id) • tmem_put_page(pool_id,object_id,page_id,pfn) • tmem_get_page(pool_id,object_id,page_id,empty_pfn) • tmem_flush_page(pool_id,object_id,page_id) • tmem_flush_object(pool_id,object_id) • tmem_read(pool_id,object_id,page_id,pfn, offset1,offset2,len) • tmem_write(pool_id,object_id,page_id,pfn, offset1,offset2,len • tmem_xchg(pool_id,object_id,page_id,pfn, offset1,offset2,len) • tmem_control(TBD…) Transcendent Memory on Xen (Xen Summit 2009) - Dan Magenheimer
  50. 50. Transcendent Memory API important semantic details (v0.0.1) • get_page on a private+ephemeral pool is destructive (auto-flush) • implements exclusive cache semantics • no serialization guarantees are provided for SMP VMs • clients must ensure coherency with their own caches/data stores but implementation provides following guarantees: • put/put/get (aka “dup put”) coherency tmem_put_page(ABC,D1); tmem_put_page(ABC,D2); tmem_get_page(ABC,E); E may never contain the data from D1. (implies that on persistent pools, dup put must never fail) • get/get coherency tmem_get_page(ABC,E); tmem_get_page(ABC,E); If the first get fails, the second must also fail • all flush operations must always succeed • return values: >=0 means success, < 0 failure (errno) • see spec for more information Transcendent Memory on Xen (Xen Summit 2009) - Dan Magenheimer
  51. 51. Transcendent memory hcache performance (smaller is better) 100 70 60 80 disk reads (K) 50 60 seconds 40 30 40 20 20 10 0 0 pcpu=2 pcpu=4 pcpu=4 pcpu=2 pcpu=4 pcpu=4 vcpu=2 vcpu=2 vcpu=4 vcpu=2 vcpu=2 vcpu=4 256MB w/hcache 256MB no hcache 256MB w/hcache 256MB no hcache 1024MB no hcache 2048MB no hcache 1024MB no hcache 2048MB no hcache Benchmark: Linux compile, cold page cache, pre-caching enabled (ccache) Transcendent Memory on Xen (Xen Summit 2009) - Dan Magenheimer
  52. 52. Transcendent memory hcache compensates for underprovisioned memory 120 600 100 500 disk reads (K) 80 400 seconds 60 300 40 200 20 100 0 0 pcpu=4 vcpu=4 pcpu=4 vcpu=4 128MB w/hcache 128MB no hcache 128MB w/hcache 128MB no hcache 256MB no hcache 1024MB no hcache 256MB no hcache 1024MB no hcache Benchmark: Linux compile, warm page cache, pre-caching disabled Transcendent Memory on Xen (Xen Summit 2009) - Dan Magenheimer
  53. 53. hcache (multiple domains + compressed) • shared compressed extended page cache for guest more than one guest guest Transcendent Memory on Xen (Xen Summit 2009) - Dan Magenheimer
  1. A particular slide catching your eye?

    Clipping is a handy way to collect important slides you want to go back to later.

×