Speaker notes

  • Gerd Knorr of SUSE, Rusty Russell, Jon Mason and Ryan Harper of IBM
  • POPF doesn’t trap if executed with insufficient privilege
  • Commercial / production use. Released under the GPL; guest OSes can be under any licence
  • x86 is hard to virtualize, so it benefits from more modifications than are necessary for IA64/PPC. Avoiding fault trapping: popf and cli/sti can be handled at guest level
  • Important for fork
  • RaW hazards
  • Use update_va_mapping in preference
  • Typically 2 pages are writeable: one in the other page table
  • Most benchmark results are identical; in addition to these, some slight slowdown
  • Xen has always supported SMP hosts; SMP guest support is important to make virtualization ubiquitous in the enterprise
  • Pacifica
  • At least double the number of faults; memory requirements; hiding which machine pages are in use may interfere with page colouring, large TLB entries, etc.
  • Framebuffer console; NUMA
  • Net-restart
  • newring
  • We have run 128 guests; the tickless patch may help for large numbers of guests
  • Disk is easy; new IO isn't as fast as not all ring 0; win big time
  • 8(diff): using QoS controls to split resources in 1:2:3:4:5:6:7:8 ratios. OSDB-OLTP suffers due to current poor disk scheduling; this should be fixed in a future Xen release

1. Xen 3.0 and the Art of Virtualization
   Ian Pratt, Keir Fraser, Steven Hand, Christian Limpach, Andrew Warfield, Dan Magenheimer (HP), Jun Nakajima (Intel), Asit Mallick (Intel)
   Computer Laboratory
2. Outline
  • Virtualization Overview
  • Xen Architecture
  • New Features in Xen 3.0
  • VM Relocation
  • Xen Roadmap
3. Virtualization Overview
  • Single OS image: Virtuozzo, Vservers, Zones
    - Group user processes into resource containers
    - Hard to get strong isolation
  • Full virtualization: VMware, VirtualPC, QEMU
    - Run multiple unmodified guest OSes
    - Hard to efficiently virtualize x86
  • Para-virtualization: UML, Xen
    - Run multiple guest OSes ported to a special arch
    - The Xen/x86 arch is very close to normal x86
4. Virtualization in the Enterprise
  • Consolidate under-utilized servers to reduce CapEx and OpEx
  • Avoid downtime with VM Relocation
  • Dynamically re-balance workload to guarantee application SLAs
  • Enforce security policy
5. Xen Today: Xen 2.0.6
  • Secure isolation between VMs
  • Resource control and QoS
  • Only the guest kernel needs to be ported
    - User-level apps and libraries run unmodified
    - Linux 2.4/2.6, NetBSD, FreeBSD, Plan9, Solaris
  • Execution performance close to native
  • Broad x86 hardware support
  • Live Relocation of VMs between Xen nodes
6. Para-Virtualization in Xen
  • Xen extensions to the x86 arch
    - Like x86, but Xen is invoked for privileged ops
    - Avoids binary rewriting
    - Minimizes the number of privilege transitions into Xen
    - Modifications are relatively simple and self-contained
  • Modify the kernel to understand the virtualised environment
    - Wall-clock time vs. virtual processor time
      · Both types of alarm timer are desirable
    - Expose real resource availability
      · Enables the OS to optimise its own behaviour
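To make the "Xen is invoked for privileged ops" point concrete, here is a minimal user-space sketch of one such paravirtualization, assuming an illustrative shared structure that is not Xen's real ABI: instead of executing the privileged cli/sti instructions, the guest toggles a mask flag in memory shared with the hypervisor, which the hypervisor consults before delivering a virtual event.

```c
/* Minimal sketch of paravirtualized interrupt masking.
 * Field and function names are illustrative, not Xen's actual ABI. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

struct shared_vcpu_info {
    volatile uint8_t event_mask;     /* 1 = guest has "interrupts" disabled */
    volatile uint8_t event_pending;  /* set by hypervisor when an event waits */
};

static struct shared_vcpu_info vcpu0;   /* would live in a page shared with Xen */

static void guest_cli(void) { vcpu0.event_mask = 1; }          /* replaces cli */
static void guest_sti(void)                                     /* replaces sti */
{
    vcpu0.event_mask = 0;
    if (vcpu0.event_pending)
        printf("guest: re-checking pending events after unmask\n");
}

/* Hypervisor side: only inject an event if the guest has not masked them. */
static bool hypervisor_try_deliver_event(struct shared_vcpu_info *v)
{
    if (v->event_mask) { v->event_pending = 1; return false; }
    return true;  /* would invoke the guest's event callback here */
}

int main(void)
{
    guest_cli();
    printf("delivered while masked? %d\n", hypervisor_try_deliver_event(&vcpu0));
    guest_sti();
    printf("delivered after unmask? %d\n", hypervisor_try_deliver_event(&vcpu0));
    return 0;
}
```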
7. Xen 2.0 Architecture
   [Diagram: the Xen Virtual Machine Monitor (event channels, virtual MMU, virtual CPU, control interface, safe hardware interface) runs on the hardware (SMP, MMU, physical memory, Ethernet, SCSI/IDE). VM0 runs XenLinux with native device drivers, back-end drivers, and the device manager & control software; VM1 runs XenLinux with native device drivers, back-end drivers, and unmodified user software; VM2 (XenLinux) and VM3 (XenBSD) run unmodified user software over front-end device drivers.]
8. Xen 3.0 Architecture
   [Diagram: as in 2.0, the Xen Virtual Machine Monitor (event channels, virtual MMU, virtual CPU, control interface, safe hardware interface) runs on the hardware (SMP, MMU, physical memory, Ethernet, SCSI/IDE), now covering VT-x, x86_32, x86_64 and IA64 with AGP, ACPI, PCI and SMP support. VM0 runs XenLinux with native device drivers, back-end drivers, and the device manager & control software; VM1 runs XenLinux with native device drivers and back-end drivers; VM2 runs XenLinux with front-end device drivers; VM3 runs an unmodified guest OS (WinXP) with unmodified user software.]
9. x86_32
  • Xen reserves the top of the VA space
  • Segmentation protects Xen from the kernel
  • System call speed unchanged
  • Xen 3 now supports PAE for >4GB memory
   [Diagram: 4GB virtual address layout — Xen at the top (ring 0), the kernel below it up to 3GB (ring 1), user space from 0 to 3GB (ring 3); Xen and kernel mappings are supervisor, user mappings are user.]
10. x86_64
  • The large VA space makes life a lot easier, but:
  • No segment limit support
  • Need to use page-level protection to protect the hypervisor
   [Diagram: 64-bit virtual address layout — user space at the bottom, kernel at the top of the 2^64 range, Xen in the reserved region between 2^47 and 2^64 − 2^47.]
11. x86_64
  • Run user space and the kernel in ring 3 using different pagetables
    - Two PGDs (PML4s): one with user entries; one with user plus kernel entries
  • System calls require an additional syscall/sysret transition via Xen
  • Per-CPU trampoline to avoid needing GS in Xen
   [Diagram: kernel and user both run in ring 3 over separate page tables, Xen in ring 0; system calls pass through Xen via syscall/sysret.]
12. Para-Virtualizing the MMU
  • Guest OSes allocate and manage their own PTs
    - Hypercall to change the PT base
  • Xen must validate PT updates before use
    - Allows incremental updates; avoids revalidation
  • Validation rules applied to each PTE:
    1. Guest may only map pages it owns*
    2. Pagetable pages may only be mapped RO
  • Xen traps PTE updates and emulates, or 'unhooks' the PTE page for bulk updates
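A rough sketch of the two validation rules above, in C; the page ownership model, flag layout, and function names are invented for illustration and are not Xen's real interfaces.

```c
/* Sketch of the two PTE validation rules (illustrative names and flags). */
#include <stdbool.h>
#include <stdint.h>

#define PTE_WRITABLE  0x2u

enum page_type { PAGE_NORMAL, PAGE_PAGETABLE };

struct page_info {
    int owner_domain;        /* which guest owns this machine page */
    enum page_type type;     /* is it currently in use as a page table? */
};

/* Rule 1: a guest may only map machine pages it owns.
 * Rule 2: pages holding page tables may only be mapped read-only,
 *         so every later update still goes through validation.     */
static bool validate_pte(const struct page_info *pg, uint64_t pte_flags,
                         int requesting_domain)
{
    if (pg->owner_domain != requesting_domain)
        return false;
    if (pg->type == PAGE_PAGETABLE && (pte_flags & PTE_WRITABLE))
        return false;
    return true;
}

int main(void)
{
    struct page_info pt_page = { .owner_domain = 1, .type = PAGE_PAGETABLE };
    /* Mapping a page-table page writable must be refused. */
    return validate_pte(&pt_page, PTE_WRITABLE, 1) ? 1 : 0;
}
```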
13. Writeable Page Tables: 1 – Write fault
   [Diagram: guest reads go directly through the virtual→machine page table; the first guest write to a page-table page traps to the Xen VMM as a page fault.]
14. Writeable Page Tables: 2 – Emulate?
   [Diagram: on the trapped first write, Xen decides whether to simply emulate the single update ("emulate? yes") or to unhook the page.]
15. Writeable Page Tables: 3 – Unhook
   [Diagram: Xen unhooks the page-table page from the virtual→machine page table; subsequent guest writes to it proceed without faulting while guest reads continue through the remaining mappings.]
16. Writeable Page Tables: 4 – First Use
   [Diagram: a page fault on first use of an address translated through the unhooked page signals that the batched updates are finished.]
17. Writeable Page Tables: 5 – Re-hook
   [Diagram: Xen validates the modified entries and re-hooks the page into the virtual→machine page table.]
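The five steps above can be condensed into a small sketch. This is a user-space illustration only, assuming a toy page-table layout and validation rule; real Xen performs these steps on machine frames inside the hypervisor.

```c
/* User-space sketch of the "writable page tables" trick (illustrative names).
 * On the first guest write to a page-table page, the hypervisor unhooks that
 * page so the guest can batch writes without faulting; on first use of an
 * address translated through it, every entry is validated and re-hooked. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define ENTRIES 4

struct pt_page {
    uint64_t entry[ENTRIES];
    bool hooked;          /* currently part of the live page-table tree? */
};

/* Toy rule: a not-present entry is fine, otherwise the frame must be one
 * the guest owns (here, pretend it owns frames 0..1023). */
static bool validate_entry(uint64_t e) { return (e & 1) == 0 || (e >> 12) < 1024; }

static void on_first_guest_write(struct pt_page *pt)
{
    pt->hooked = false;               /* unhook: further writes take no faults */
    printf("unhooked page-table page for batched updates\n");
}

static bool on_first_use(struct pt_page *pt)
{
    for (int i = 0; i < ENTRIES; i++)        /* validate every modified PTE */
        if (!validate_entry(pt->entry[i]))
            return false;
    pt->hooked = true;                        /* re-hook into the live tree */
    printf("validated %d entries and re-hooked\n", ENTRIES);
    return true;
}

int main(void)
{
    struct pt_page pt = { .hooked = true };
    on_first_guest_write(&pt);       /* guest's first write traps here */
    pt.entry[0] = (5u << 12) | 1;    /* guest batches updates without faults */
    return on_first_use(&pt) ? 0 : 1;
}
```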
18. MMU Micro-Benchmarks
   [Chart: lmbench page fault (µs) and process fork (µs) results on Linux (L), Xen (X), VMware Workstation (V), and UML (U), relative scale 0.0–1.1.]
19. SMP Guest Kernels
  • Xen extended to support multiple VCPUs
    - Virtual IPIs sent via Xen event channels
    - Currently up to 32 VCPUs supported
  • Simple hotplug/unplug of VCPUs
    - From within the VM or via the control tools
    - Optimize the one-active-VCPU case by binary patching spinlocks
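The one-active-VCPU optimisation might look roughly like the following sketch, which swaps function pointers to stand in for the in-place binary patching the slide describes (the real kernel rewrites the lock call sites themselves):

```c
/* Sketch of the one-VCPU spin-lock optimisation (illustrative; the real
 * kernel patches the lock call sites in place rather than using function
 * pointers). With a single online VCPU there is nothing to contend with,
 * so the lock operations can collapse to no-ops. */
#include <stdatomic.h>
#include <stdio.h>

typedef struct { atomic_flag f; } spinlock_t;

static void lock_smp(spinlock_t *l)   { while (atomic_flag_test_and_set(&l->f)) ; }
static void unlock_smp(spinlock_t *l) { atomic_flag_clear(&l->f); }
static void lock_up(spinlock_t *l)    { (void)l; }   /* no-op on one VCPU */
static void unlock_up(spinlock_t *l)  { (void)l; }

static void (*do_lock)(spinlock_t *)   = lock_smp;
static void (*do_unlock)(spinlock_t *) = unlock_smp;

/* Called when VCPUs are hot-plugged or unplugged. */
static void retune_locks(int online_vcpus)
{
    do_lock   = (online_vcpus == 1) ? lock_up   : lock_smp;
    do_unlock = (online_vcpus == 1) ? unlock_up : unlock_smp;
}

int main(void)
{
    spinlock_t l = { ATOMIC_FLAG_INIT };
    retune_locks(1);              /* single active VCPU: locks become no-ops */
    do_lock(&l); do_unlock(&l);
    retune_locks(4);              /* back to real atomics when VCPUs return */
    do_lock(&l); do_unlock(&l);
    printf("lock paths retuned for VCPU count\n");
    return 0;
}
```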
20. SMP Guest Kernels
  • Takes great care to get good SMP performance while remaining secure
    - Requires extra TLB synchronization IPIs
  • The paravirtualized approach enables several important benefits
    - Avoids many virtual IPIs
    - Allows 'bad preemption' avoidance
    - Auto hotplug/unplug of CPUs
  • SMP scheduling is a tricky problem
    - Strict gang scheduling leads to wasted cycles
21. I/O Architecture
  • Xen IO-Spaces delegate protected access to specified h/w devices to guest OSes
    - Virtual PCI configuration space
    - Virtual interrupts
    - (Need an IOMMU for full DMA protection)
  • Devices are virtualised and exported to other VMs via Device Channels
    - Safe asynchronous shared-memory transport
    - 'Backend' drivers export to 'frontend' drivers
    - Net: use normal bridging, routing, iptables
    - Block: export any block device, e.g. sda4, loop0, vg3
  • (Infiniband / smart NICs for direct guest IO)
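As a hedged illustration of a Device Channel, here is a minimal shared-memory request/response ring between a frontend and a backend; the layout, field names, and the omitted memory barriers and event-channel notifications are simplifications, not the actual Xen ring ABI.

```c
/* Minimal sketch of a device-channel style shared-memory ring (illustrative
 * layout). The frontend advances a request producer index, the backend
 * consumes requests and produces responses; an event-channel notification
 * (not shown) tells the peer to look at the ring. */
#include <stdint.h>
#include <stdio.h>

#define RING_SIZE 8   /* power of two so indices can wrap with a mask */

struct request  { uint64_t sector; uint32_t op; };
struct response { uint32_t status; };

struct ring {
    volatile uint32_t req_prod, req_cons;
    volatile uint32_t rsp_prod, rsp_cons;
    struct request  req[RING_SIZE];
    struct response rsp[RING_SIZE];
};

static void frontend_submit(struct ring *r, struct request rq)
{
    r->req[r->req_prod & (RING_SIZE - 1)] = rq;
    r->req_prod++;                 /* real code adds a write barrier here */
}

static void backend_service(struct ring *r)
{
    while (r->req_cons != r->req_prod) {
        struct request rq = r->req[r->req_cons++ & (RING_SIZE - 1)];
        printf("backend: op %u on sector %llu\n", rq.op,
               (unsigned long long)rq.sector);
        r->rsp[r->rsp_prod++ & (RING_SIZE - 1)] = (struct response){ .status = 0 };
    }
}

int main(void)
{
    static struct ring shared;     /* would be a page shared between domains */
    frontend_submit(&shared, (struct request){ .sector = 42, .op = 1 });
    backend_service(&shared);
    return 0;
}
```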
22. VT-x / (Pacifica)
  • Enable guest OSes to be run without para-virtualization modifications
    - E.g. legacy Linux, Windows XP/2003
  • The CPU provides traps for certain privileged instructions
  • Shadow page tables are used to provide MMU virtualization
  • Xen provides simple platform emulation
    - BIOS, Ethernet (ne2k), IDE emulation
  • (Install paravirtualized drivers after booting for high-performance IO)
23. [Architecture diagram: the Xen hypervisor (control interface, hypercalls, event channels, scheduler, and device models for PIT/APIC/PIC/IOAPIC I/O) manages processor and memory. Domain 0 (Linux xen64, ring 0P/3P) runs native device drivers, backend virtual drivers, and the control panel (xm/xend); Domain N (Linux xen64) runs frontend virtual drivers. Unmodified 32-bit and 64-bit guest VMs (VMX) run a guest BIOS and frontend virtual drivers over a virtual platform, entering Xen via VMExit and receiving callbacks/hypercalls and event-channel notifications.]
24. MMU Virtualization: Shadow-Mode
   [Diagram: the guest OS reads and writes its own virtual→pseudo-physical page tables; the VMM propagates those updates into virtual→machine shadow tables that the hardware actually walks, and reflects accessed & dirty bits back to the guest.]
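A small sketch of the shadow-mode propagation path, assuming single-level toy tables and an invented pseudo-physical-to-machine map; real shadow code walks multi-level tables and also manages accessed/dirty bits.

```c
/* Sketch of shadow-mode MMU propagation (illustrative data structures).
 * The guest writes a virtual->pseudo-physical PTE; the VMM traps the write,
 * translates the pseudo-physical frame to a machine frame, and installs the
 * result in the shadow table that the hardware actually uses. */
#include <stdint.h>
#include <stdio.h>

#define NPAGES 16

static uint64_t guest_pt[NPAGES];    /* virtual -> pseudo-physical (guest's view) */
static uint64_t shadow_pt[NPAGES];   /* virtual -> machine (what the MMU walks)   */
static uint64_t p2m[NPAGES];         /* pseudo-physical -> machine frame map      */

/* VMM handler for a trapped guest PTE write. */
static void shadow_propagate(unsigned vpage, uint64_t guest_pte)
{
    guest_pt[vpage] = guest_pte;
    uint64_t pframe = guest_pte >> 12;
    shadow_pt[vpage] = (p2m[pframe] << 12) | (guest_pte & 0xfff);
    printf("vpage %u: pseudo-frame %llu -> machine frame %llu\n",
           vpage, (unsigned long long)pframe, (unsigned long long)p2m[pframe]);
}

int main(void)
{
    for (uint64_t i = 0; i < NPAGES; i++)
        p2m[i] = i + 100;                      /* arbitrary machine frames */
    shadow_propagate(3, (7ull << 12) | 0x3);   /* present + writable */
    return 0;
}
```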
25. VM Relocation: Motivation
  • VM relocation enables:
    - High availability
      · Machine maintenance
    - Load balancing
      · Statistical multiplexing gain
26. Assumptions
  • Networked storage
    - NAS: NFS, CIFS
    - SAN: Fibre Channel
    - iSCSI, network block devices
    - DRBD network RAID
  • Good connectivity
    - Common L2 network
    - L3 re-routing
27. Challenges
  • VMs have lots of state in memory
  • Some VMs have soft real-time requirements
    - E.g. web servers, databases, game servers
    - May be members of a cluster quorum
    - Minimize down-time
  • Performing relocation requires resources
    - Bound and control the resources used
28. Relocation Strategy
  • Stage 0: pre-migration — VM active on host A; destination host selected (block devices mirrored)
  • Stage 1: reservation — initialize container on target host
  • Stage 2: iterative pre-copy — copy dirty pages in successive rounds
  • Stage 3: stop-and-copy — suspend VM on host A; redirect network traffic; synchronize remaining state
  • Stage 4: commitment — activate on host B; VM state on host A released
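The heart of the strategy is the iterative pre-copy (stage 2) followed by a brief stop-and-copy (stage 3). The following self-contained sketch shows the shape of that loop, with an invented dirtying model and thresholds standing in for Xen's dirty-page log and the migration daemon's actual policy.

```c
/* Sketch of the iterative pre-copy loop. Numbers and the fake "dirtying"
 * model are illustrative; a real implementation reads Xen's dirty-page
 * bitmap and sends pages over the network. Each round re-sends the pages
 * dirtied during the previous round; when the remaining set is small
 * enough (or a round limit is hit), the VM is paused for stop-and-copy. */
#include <stdio.h>

#define MAX_ROUNDS      10
#define STOP_THRESHOLD  64            /* pages: small enough for a short pause */

/* Stand-in for dirty-page logging: assume a quarter of whatever was just
 * sent gets dirtied again while it is being copied. */
static int pages_dirtied_while_sending(int pages_sent) { return pages_sent / 4; }

static void precopy_migrate(int total_pages)
{
    int to_send = total_pages;                   /* round 1: send everything */
    for (int round = 1; round <= MAX_ROUNDS; round++) {
        printf("round %d: sending %d pages\n", round, to_send);
        to_send = pages_dirtied_while_sending(to_send);
        if (to_send <= STOP_THRESHOLD)
            break;                               /* writable working set is small */
    }
    printf("stop-and-copy: pausing VM, sending final %d pages + CPU state\n",
           to_send);
}

int main(void) { precopy_migrate(262144); return 0; }   /* 1 GB of 4 KB pages */
```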
29. Pre-Copy Migration: Round 1
30. Pre-Copy Migration: Round 1
31. Pre-Copy Migration: Round 1
32. Pre-Copy Migration: Round 1
33. Pre-Copy Migration: Round 1
34. Pre-Copy Migration: Round 2
35. Pre-Copy Migration: Round 2
36. Pre-Copy Migration: Round 2
37. Pre-Copy Migration: Round 2
38. Pre-Copy Migration: Round 2
39. Pre-Copy Migration: Final
40. Writable Working Set
  • Pages that are dirtied must be re-sent
    - Super-hot pages
      · E.g. process stacks; top of the free-page list
    - Buffer cache
    - Network receive / disk buffers
  • Dirtying rate determines VM down-time
    - Shorter iterations -> less dirtying -> …
41. Writable Working Set
  • Set of pages written to by the OS/application
  • Pages that are dirtied must be re-sent
    - Hot pages
      · E.g. process stacks
      · Top of the free-page list (works like a stack)
    - Buffer cache
    - Network receive / disk buffers
42. Page Dirtying Rate
  • Dirtying rate determines VM down-time
    - Shorter iterations -> less dirtying -> shorter iterations
    - Stop and copy the final pages
  • Application 'phase changes' create spikes
   [Chart: number of dirty pages vs. time into the iteration.]
43. Rate Limited Relocation
  • Dynamically adjust the resources committed to performing the page transfer
    - Dirty logging costs the VM ~2-3%
    - CPU and network usage are closely linked
  • E.g. first copy iteration at 100Mb/s, then increase based on the observed dirtying rate
    - Minimize the impact of relocation on the server while minimizing down-time
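A sketch of the rate-limiting policy described above; the constants and the simple "observed dirtying rate plus headroom" rule are illustrative, not the exact algorithm used by Xen's migration tools.

```c
/* Sketch of rate-limited relocation (illustrative constants): start the
 * first copy iteration at a low bandwidth cap, then raise the cap toward
 * the observed dirtying rate plus some headroom, clamped to a maximum, so
 * relocation stays ahead of dirtying without swamping the server. */
#include <stdio.h>

#define MIN_RATE_MBPS   100
#define MAX_RATE_MBPS   1000
#define HEADROOM_MBPS   50

static int next_rate_limit(int observed_dirty_mbps)
{
    int rate = observed_dirty_mbps + HEADROOM_MBPS;   /* keep ahead of dirtying */
    if (rate < MIN_RATE_MBPS) rate = MIN_RATE_MBPS;
    if (rate > MAX_RATE_MBPS) rate = MAX_RATE_MBPS;
    return rate;
}

int main(void)
{
    int dirty_rates[] = { 0, 180, 320, 290, 120 };    /* Mb/s seen per round */
    int rate = MIN_RATE_MBPS;                          /* first iteration cap */
    for (int round = 0; round < 5; round++) {
        printf("round %d: copy at %d Mb/s (dirtying %d Mb/s)\n",
               round + 1, rate, dirty_rates[round]);
        rate = next_rate_limit(dirty_rates[round]);
    }
    return 0;
}
```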
44. Web Server Relocation
45. Iterative Progress: SPECWeb 52s
46. Iterative Progress: Quake 3
47. Quake 3 Server Relocation
48. Extensions
  • Cluster load balancing
    - Pre-migration analysis phase
    - Optimization over coarse timescales
  • Evacuating nodes for maintenance
    - Move easy-to-migrate VMs first
  • Storage-system support for VM clusters
    - Decentralized, data replication, copy-on-write
  • Wide-area relocation
    - IPsec tunnels and CoW network mirroring
49. Current 3.0 Status
   [Table: feature support matrix across x86_32, x86_32p, x86_64, IA64, and Power — rows: Domain 0, Domain U, SMP Guests (new!), Save/Restore/Migrate (~tools), >4GB memory (16GB / 4TB), VT (?), 64-on-64, Driver Domains.]
50. 3.1 Roadmap
  • Improved full-virtualization support
    - Pacifica / VT-x abstraction
  • Enhanced control tools project
  • Performance tuning and optimization
    - Less reliance on manual configuration
  • Infiniband / smart NIC support
  • (NUMA, virtual framebuffer, etc.)
51. IO Virtualization
  • IO virtualization in s/w incurs overhead
    - Latency vs. overhead tradeoff
      · More of an issue for network than storage
    - Can burn 10-30% more CPU
  • The solution is well understood
    - Direct h/w access from VMs
      · Multiplexing and protection implemented in h/w
    - Smart NICs / HCAs
      · Infiniband, Level-5, Aarohi, etc.
      · Will become commodity before too long
52. Research Roadmap
  • Whole-system debugging
    - Lightweight checkpointing and replay
    - Cluster/distributed system debugging
  • Software-implemented h/w fault tolerance
    - Exploit deterministic replay
  • VM forking
    - Lightweight service replication, isolation
  • Secure virtualization
    - Multi-level secure Xen
53. Conclusions
  • Xen is a complete and robust GPL VMM
  • Outstanding performance and scalability
  • Excellent resource control and protection
  • Vibrant development community
  • Strong vendor support
  • http://xen.sf.net
54. Thanks!
  • The Xen project is hiring, in Cambridge UK, Palo Alto, and New York
  • [email_address]
   Computer Laboratory
55. Backup slides
56. Isolated Driver VMs
  • Run device drivers in separate domains
  • Detect failure, e.g.
    - Illegal access
    - Timeout
  • Kill the domain, restart it
  • E.g. 275ms outage from a failed Ethernet driver
   [Chart: behaviour over time (s) around the driver failure and restart.]
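One plausible shape for the failure detection described above is a heartbeat watchdog in domain 0; the structure, timeout, and restart action here are invented for illustration.

```c
/* Sketch of driver-domain failure detection (illustrative timings and
 * names): domain 0 watches a heartbeat counter the driver domain updates;
 * if it stalls past a timeout, the driver domain is destroyed and
 * restarted, and frontends reconnect to the new backend. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define TIMEOUT_TICKS 5

struct driver_domain {
    volatile uint64_t heartbeat;   /* incremented by the driver domain */
    uint64_t last_seen;
    int stalled_ticks;
};

/* Called once per tick from domain 0's monitoring loop. */
static bool watchdog_tick(struct driver_domain *d)
{
    d->stalled_ticks = (d->heartbeat == d->last_seen) ? d->stalled_ticks + 1 : 0;
    d->last_seen = d->heartbeat;
    if (d->stalled_ticks >= TIMEOUT_TICKS) {
        printf("driver domain stalled: destroy and restart it\n");
        d->heartbeat = d->last_seen = 0;
        d->stalled_ticks = 0;
        return true;    /* restart triggered */
    }
    return false;
}

int main(void)
{
    struct driver_domain net = { 0 };
    for (int t = 0; t < 7; t++)                /* driver stops updating */
        watchdog_tick(&net);
    return 0;
}
```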
57. Device Channel Interface
58. Scalability
  • Scalability is principally limited by the applications' resource requirements
    - Several tens of VMs on server-class machines
  • A balloon driver controls domain memory usage by returning pages to Xen
    - Normal OS paging mechanisms can deflate quiescent domains to <4MB
    - Xen per-guest memory usage is <32KB
  • Additional multiplexing overhead is negligible
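A rough sketch of what a balloon driver does; the names and bookkeeping are invented, and the real driver allocates pages inside the guest kernel and hands the underlying machine frames back to Xen via a memory-op hypercall.

```c
/* Sketch of balloon-driver behaviour (illustrative): "inflating" the
 * balloon pins pages inside the guest and gives the underlying frames
 * back to the hypervisor, shrinking the guest's usable memory;
 * "deflating" reclaims them. A real driver would also clamp against
 * what the balloon currently holds and what Xen can supply. */
#include <stdio.h>

static long guest_pages   = 65536;  /* pages the guest currently owns */
static long balloon_pages = 0;      /* pages pinned by the balloon     */

static void balloon_set_target(long target_pages)
{
    long delta = guest_pages - target_pages;   /* >0: inflate, <0: deflate */
    balloon_pages += delta;
    guest_pages   -= delta;
    printf("guest now holds %ld pages (%ld in balloon)\n",
           guest_pages, balloon_pages);
}

int main(void)
{
    balloon_set_target(16384);   /* squeeze a quiescent domain down */
    balloon_set_target(65536);   /* give the memory back under load */
    return 0;
}
```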
59. System Performance
   [Chart: SPEC INT2000 (score), Linux build time (s), OSDB-OLTP (tup/s), and SPEC WEB99 (score) on Linux (L), Xen (X), VMware Workstation (V), and UML (U), relative scale 0.0–1.1.]
60. TCP Results
   [Chart: TCP bandwidth (Tx and Rx, MTU 1500 and MTU 500, Mbps) on Linux (L), Xen (X), VMware Workstation (V), and UML (U), relative scale 0.0–1.1.]
61. Scalability
   [Chart: aggregate score for 2, 4, 8, and 16 simultaneous SPEC WEB99 instances on Linux (L) and Xen (X), scale 0–1000.]
62. Resource Differentiation
   [Chart: 2, 4, 8, and 8(diff) simultaneous OSDB-IR and OSDB-OLTP instances on Xen; aggregate throughput relative to one instance, scale 0.0–2.0.]