Xen Euro Par07

  • Transcript

    • 1. Virtualizing the Data Center with Xen. Steve Hand, University of Cambridge and XenSource.
    • 2. What is Xen Anyway?
      • Open source hypervisor
        • run multiple OSes on one machine
        • dynamic sizing of virtual machine
        • much improved manageability
      • Pioneered paravirtualization
        • modify OS kernel to run on Xen
        • (application mods not required)
        • extremely low overhead (~1%)
      • Massive development effort
        • first (open source) release in 2003
        • today have hundreds of talented community developers
    • 3. Problem: Success of Scale-out
      • “OS+app per server” provisioning leads to server sprawl
      • Server utilization rates <10%
      • Expensive to maintain, house, power, and cool
      • Slow to provision, inflexible to change or scale
      • Poor resilience to failures
    • 4. Solution: Virtualize With Xen
      • Consolidation: running on fewer servers slashes CapEx and OpEx
      • Higher utilization: make the most of existing investments
      • “Instant on” provisioning: any app on any server, any time
      • Robustness to failures and “auto-restart” of VMs on failure
    • 5. Further Virtualization Benefits
      • Separating the OS from the hardware
        • Users no longer forced to upgrade OS to run on latest hardware
      • Device support is part of the platform
        • Write one device driver rather than N
        • Better for system reliability/availability
        • Faster to get new hardware deployed
      • Enables “Virtual Appliances”
        • Applications encapsulated with their OS
        • Easy configuration and management
    • 6. Virtualization Possibilities
      • Value-added functionality from outside the OS:
        • Firewalling / network IDS / “inverse firewall”
        • VPN tunnelling; LAN authentication
        • Virus, mal-ware and exploit scanning
        • OS patch-level status monitoring
        • Performance monitoring and instrumentation
        • Storage backup and snapshots
        • Local disk as just a cache for network storage
        • Carry your world on a USB stick
        • Multi-level secure systems
    • 7. This talk
      • Introduction
      • Xen 3
        • para-virtualization, I/O architecture
        • hardware virtualization (today + tomorrow)
      • XenEnterprise: overview and performance
      • Case Study: Live Migration
      • Outlook
    • 8. Xen 3.0 (5th Dec 2005)
      • Secure isolation between VMs
      • Resource control and QoS
      • x86 32/PAE36/64 plus HVM; IA64, Power
      • PV guest kernel needs to be ported
        • User-level apps and libraries run unmodified
      • Execution performance close to native
      • Broad (Linux) hardware support
      • Live Relocation of VMs between Xen nodes
      • Latest stable release is 3.1 (May 2007)
    • 9. Xen 3.x Architecture [diagram]: the Xen virtual machine monitor (event channels, virtual CPU/MMU, control interface, safe hardware interface; HVM support; x86_32, x86_64, IA64) runs on the hardware (SMP, MMU, physical memory, Ethernet, SCSI/IDE). VM0 runs the device manager and control software on XenLinux with native device drivers and back-end drivers; VM1 and VM2 run paravirtualized XenLinux guests with front-end device drivers and unmodified user software; VM3 runs an unmodified guest OS (WinXP) with unmodified user software.
    • 10. Para-Virtualization in Xen
      • Xen extensions to x86 arch
        • Like x86, but Xen invoked for privileged ops
        • Avoids binary rewriting
        • Minimize number of privilege transitions into Xen
        • Modifications relatively simple and self-contained
      • Modify kernel to understand virtualised env.
        • Wall-clock time vs. virtual processor time
          • Desire both types of alarm timer
        • Expose real resource availability
          • Enables OS to optimise its own behaviour
    • 11. x86 CPU virtualization
      • Xen runs in ring 0 (most privileged)
      • Ring 1/2 for guest OS, 3 for user-space
        • GPF if guest attempts to use privileged instr
      • Xen lives in top 64MB of linear addr space
        • Segmentation used to protect Xen as switching page tables too slow on standard x86
      • Hypercalls jump to Xen in ring 0
        • Indirection via hypercall page allows flexibility
      • Guest OS may install ‘fast trap’ handler
        • Direct user-space to guest OS system calls
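
To make the hypercall path above concrete, here is a minimal sketch of how a 32-bit PV guest kernel might invoke Xen through the hypercall page. The 32-byte stub spacing, register convention and the `__HYPERVISOR_xen_version` / `XENVER_version` numbers follow the public Xen interface, but the `hypercall2` wrapper and its inline assembly are illustrative rather than a copy of the kernel's real `_hypercallN` macros.

```c
/* Sketch: issuing a Xen hypercall from a 32-bit PV guest kernel.
 * Xen maps a "hypercall page" into the guest; each hypercall is a
 * 32-byte stub at (number * 32).  Calling through the page (rather
 * than hard-coding a trap instruction) is the indirection mentioned
 * above.  Illustrative, not the kernel's actual macros. */

#include <stdint.h>

#define __HYPERVISOR_xen_version  17   /* from xen/include/public/xen.h */
#define XENVER_version             0   /* sub-op: return major/minor    */

extern char hypercall_page[];          /* filled in by Xen at boot */

static inline long hypercall2(unsigned int op, unsigned long a1,
                              unsigned long a2)
{
    long ret;
    /* 32-bit PV convention: args in ebx, ecx; result in eax. */
    asm volatile("call *%[stub]"
                 : "=a" (ret)
                 : [stub] "r" (hypercall_page + op * 32),
                   "b" (a1), "c" (a2)
                 : "memory");
    return ret;
}

static unsigned int xen_major_minor(void)
{
    /* XENVER_version packs (major << 16) | minor into the return value. */
    return (unsigned int)hypercall2(__HYPERVISOR_xen_version,
                                    XENVER_version, 0);
}
```

Because the guest only ever calls through the page, Xen is free to fill the stubs with whatever trap instruction suits the CPU it is running on (historically `int 0x82` on 32-bit x86), which is the flexibility the slide alludes to.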
    • 12. Para-Virtualizing the MMU
      • Guest OSes allocate and manage own PTs
        • Hypercall to change PT base
      • Xen must validate PT updates before use
        • Allows incremental updates, avoids revalidation
      • Validation rules applied to each PTE:
        • 1. Guest may only map pages it owns*
        • 2. Pagetable pages may only be mapped RO
      • Xen traps PTE updates and emulates, or ‘unhooks’ PTE page for bulk updates
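
A sketch of what a validated page-table update looks like from the guest side. `struct mmu_update`, `__HYPERVISOR_mmu_update` and `DOMID_SELF` are taken from the public Xen headers; `hypercall4` stands in for the guest kernel's real hypercall stub and `set_pte` is a hypothetical helper.

```c
/* Sketch: batching page-table updates through Xen for validation. */

#include <stdint.h>

#define __HYPERVISOR_mmu_update  1
#define DOMID_SELF               0x7FF0

struct mmu_update {
    uint64_t ptr;   /* machine address of the PTE to change */
    uint64_t val;   /* new PTE value (must pass Xen's validation rules) */
};

extern long hypercall4(unsigned int op, unsigned long a1, unsigned long a2,
                       unsigned long a3, unsigned long a4); /* assumed stub */

/* Write one PTE: Xen checks that the guest owns the target frame and
 * that page-table pages are only ever mapped read-only, then applies
 * the update on the guest's behalf. */
static int set_pte(uint64_t pte_machine_addr, uint64_t new_val)
{
    struct mmu_update req = { .ptr = pte_machine_addr, .val = new_val };
    unsigned int done = 0;

    return (int)hypercall4(__HYPERVISOR_mmu_update,
                           (unsigned long)&req, 1,
                           (unsigned long)&done, DOMID_SELF);
}
```

In practice many such requests are queued and submitted in one hypercall, which is how the "incremental updates" above avoid a privilege transition per PTE.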
    • 13. SMP Guest Kernels
      • Xen 3.x supports multiple VCPUs
        • Virtual IPIs sent via Xen event channels
        • Currently up to 32 VCPUs supported
      • Simple hotplug/unplug of VCPUs
        • From within VM or via control tools
        • Optimize one active VCPU case by binary patching spinlocks
      • NB: Many applications exhibit poor SMP scalability – often better off running multiple instances each in their own OS
    • 14. Performance: SPECJBB [chart]: average 0.75% overhead vs. native (3 GHz Xeon, 1 GB memory per guest; 2.6.9.EL vs. 2.6.9.EL-xen)
    • 15. Performance: dbench [chart]: less than 5% overhead up to 8-way SMP
    • 16. Kernel build, 32-bit PAE [chart]: parallel make, 4 processes per CPU; overhead of 5%, 8%, 13%, 18% and 23% as the number of virtual CPUs increases (Source: XenSource, Inc., 10/06)
    • 17. I/O Architecture
      • Xen IO-Spaces delegate protected access to specified h/w devices to guest OSes
        • Virtual PCI configuration space
        • Virtual interrupts
        • (Need IOMMU for full DMA protection)
      • Devices are virtualised and exported to other VMs via Device Channels
        • Safe asynchronous shared memory transport
        • ‘Backend’ drivers export to ‘frontend’ drivers
        • Net: use normal bridging, routing, iptables
        • Block: export any block device, e.g. sda4, loop0, vg3
      • (Infiniband / Smart NICs for direct guest IO)
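
The device-channel transport can be pictured as a single shared page carrying a request ring plus producer/consumer indices, with event-channel notifications (not shown) to wake the other side. The sketch below is simplified and uses illustrative field names; the real layout is generated by the macros in `xen/include/public/io/ring.h` and also carries a response ring and grant references for the bulk data pages.

```c
/* Simplified sketch of a split-driver device channel. */

#include <stdint.h>

#define RING_SIZE 32   /* power of two so the indices can wrap forever */

struct blk_request {
    uint64_t sector;        /* start sector on the virtual block device */
    uint32_t nr_sectors;    /* length of the transfer                   */
    uint32_t gref;          /* grant reference of the guest data page   */
};

struct shared_ring {
    volatile uint32_t req_prod;   /* written only by the frontend */
    volatile uint32_t req_cons;   /* written only by the backend  */
    struct blk_request ring[RING_SIZE];
};

/* Frontend: queue a request if there is space; the caller would then
 * notify the backend over the event channel. */
static int frontend_enqueue(struct shared_ring *r,
                            const struct blk_request *req)
{
    if (r->req_prod - r->req_cons == RING_SIZE)
        return -1;                            /* ring full */
    r->ring[r->req_prod % RING_SIZE] = *req;
    __sync_synchronize();                     /* publish payload before index */
    r->req_prod++;
    return 0;
}
```

The backend consumes from `req_cons` in the same way, which is what makes the transport safe and asynchronous: each side only ever writes its own index and the data it owns.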
    • 18. Isolated Driver Domains [diagram]: as in the earlier architecture figure, but the native device drivers and back-end drivers run in a separate driver domain (VM1), while VM0 keeps the device manager and control software; VM2 (XenLinux) and VM3 (XenBSD) reach the hardware through front-end device drivers.
    • 19. Isolated Driver VMs
      • Run device drivers in separate domains
      • Detect failure e.g.
        • Illegal access
        • Timeout
      • Kill domain, restart
      • E.g. 275ms outage from failed Ethernet driver
    • 20. Hardware Virtualization (1)
      • Paravirtualization…
        • has fundamental benefits… (cf. MS Viridian)
        • but is limited to OSes with PV kernels.
      • Recently seen new CPUs from Intel, AMD
        • enable safe trapping of ‘difficult’ instructions
        • provide additional privilege layers (“rings”)
        • currently shipping in most modern server, desktop and notebook systems
      • Solves part of the problem, but…
    • 21. Hardware Virtualization (2)
      • CPU is only part of the system
        • also need to consider memory and I/O
      • Memory:
        • OS wants contiguous physical memory, but Xen needs to share between many OSes
        • Need to dynamically translate between guest physical and ‘real’ physical addresses
        • Use shadow page tables to mirror guest OS page tables (and implicit ‘no paging’ mode)
      • Xen 3.0 includes software shadow page tables; future x86 processors will include hardware support.
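
The translation step itself is small: when Xen (or, later, hardware) builds the page-table entry the real MMU walks, it swaps the guest's pseudo-physical frame number for the machine frame via a per-domain lookup. The sketch below is a simplification of that idea; `p2m_table` and the flag handling are illustrative, not Xen's actual shadow-paging code.

```c
/* Sketch: guest-physical -> machine translation behind a shadow PTE. */

#include <stdint.h>

#define PAGE_SHIFT     12
#define PTE_FLAGS_MASK 0xFFFULL            /* low bits: present, rw, ... */

/* Per-domain table mapping guest pseudo-physical frames to machine
 * frames (maintained by the hypervisor; assumed here). */
extern uint64_t p2m_table[];

/* Given a PTE from the guest's own page table, produce the entry that
 * goes into the shadow page table the real MMU actually walks. */
static uint64_t shadow_pte(uint64_t guest_pte)
{
    uint64_t gfn = guest_pte >> PAGE_SHIFT;   /* guest frame number */
    uint64_t mfn = p2m_table[gfn];            /* machine frame      */
    return (mfn << PAGE_SHIFT) | (guest_pte & PTE_FLAGS_MASK);
}
```

Hardware-assisted MMU virtualization effectively moves this lookup into the processor, which is why it removes the need for software shadow page tables.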
    • 22. Hardware Virtualization (3)
      • Finally we need to solve the I/O issue
        • non-PV OSes don’t know about Xen
        • hence run ‘standard’ PC ISA/PCI drivers
      • Just emulate devices in software?
        • complex, fragile and non-performant…
        • … but ok as backstop mechanism.
      • Better:
        • add PV (or “enlightened”) device drivers to OS
        • well-defined driver model makes this relatively easy
        • get PV performance benefits for I/O path
    • 23. Xen 3: Hardware Virtual Machines
      • Enable Guest OSes to be run without modification
        • E.g. legacy Linux, Solaris x86, Windows XP/2003
      • CPU provides vmexits for certain privileged instrs
      • Shadow page tables used to virtualize MMU
      • Xen provides simple platform emulation
        • BIOS, APIC, IOAPIC, RTC, net (pcnet32), IDE emulation
      • Install paravirtualized drivers after booting for high-performance IO
      • Possibility for CPU and memory paravirtualization
        • Non-invasive hypervisor hints from OS
    • 24. HVM Architecture [diagram]: Domain 0 (Linux xen64) holds the control panel (xm/xend), device models, back-end virtual drivers and native device drivers; unmodified guest OSes run in HVM guest VMs (32-bit and 64-bit) with a guest BIOS and virtual platform, entering Xen via VMExit and reaching Domain 0 over event channels and callbacks/hypercalls. Xen itself provides the hypercall interface, scheduler, event channels, control interface, memory management and emulated platform I/O (PIT, APIC, PIC, IOAPIC).
    • 25. Progressive Paravirtualization
      • Hypercall API available to HVM guests
      • Selectively add PV extensions to optimize
        • Network and Block IO
        • XenAPIC (event channels)
        • MMU operations
          • multicast TLB flush
          • PTE updates (faster than page fault)
          • page sharing
        • Time (wall-clock and virtual time)
        • CPU and memory hotplug
    • 26. Xen 3.0.3 Enhanced HVM I/O [diagram]: as in the previous architecture figure, but the HVM guests additionally load front-end (FE) virtual drivers that talk to the back-end virtual drivers in Domain 0, while PIC/APIC/IOAPIC emulation remains with the device models.
    • 27. HVM I/O Performance [chart]: measured with ttcp (1500-byte MTU), comparing emulated I/O, PV-on-HVM drivers and pure PV (Source: XenSource, Sep 06)
    • 28. Future plan: I/O stub domains [diagram]: device models and I/O emulation move out of Domain 0 into per-guest stub domains, while Domain 0 keeps the control panel, back-end virtual drivers and native device drivers; the HVM guests continue to use front-end virtual drivers.
    • 29. HW Virtualization Summary
      • CPU virtualization available today
        • lets Xen support legacy/proprietary OSes
      • Additional platform protection imminent
        • protect Xen from IO devices
        • full IOMMU extensions coming soon
      • MMU virtualization also coming soon:
        • avoid the need for s/w shadow page tables
        • should improve performance and reduce complexity
      • Device virtualization arriving from various folks:
        • networking already here (Ethernet, InfiniBand)
        • [remote] storage in the works (NPIV, VSAN)
        • graphics and other devices sure to follow…
    • 30. What is XenEnterprise?
      • Multi-Operating System
        • Windows, Linux and Solaris
      • Bundled Management Console
      • Easy to use
        • Xen and guest installers and P2V tools
        • For standard servers and blades
      • High Performance Paravirtualization
        • Exploits hardware virtualization
        • Per guest resource guarantees
      • Extensible Architecture
        • Secure, tiny, low maintenance
        • Extensible by ecosystem partners
      “Ten minutes to Xen”
    • 31. Get Started with XenEnterprise
      • Packaged and supported virtualization platform for x86 servers
      • Includes the Xen™ Hypervisor
      • High performance bare metal virtualization for both Windows and Linux guests
      • Easy to use and install
      • Management console included for standard management of Xen systems and guests
      • Microsoft supports Windows customers on XenEnterprise
      • Download free from www.xensource.com and try it out!
      “Ten minutes to Xen”
    • 32. A few XE 3.2 screenshots: resource usage statistics, easy VM creation, clone and export of VMs; guests shown include Windows 2003 Server, Windows XP and Red Hat Linux 4.
    • 33. SPECjbb2005 Sun JVM installed
    • 34. w2k3 SPECcpu2000 integer (multi)
    • 35. w2k3 Passmark-CPU results vs native
    • 36. w2k3 Passmark memory vs native
    • 37. Performance: Conclusions
      • Xen PV guests typically pay ~1% (1VCPU) or ~2% (2VCPU) compared to native
        • ESX 3.0.1 pays 12-21% for same benchmark
      • XenEnterprise includes:
        • hardened and optimized OSS Xen
        • proprietary PV drivers for MS OSes
        • matches or beats ESX performance in nearly all cases
      • Some room for further improvement…
    • 38. Live Migration of VMs: Motivation
      • VM migration enables:
        • High-availability
          • Machine maintenance
        • Load balancing
          • Statistical multiplexing gain
    • 39. Assumptions
      • Networked storage
        • NAS: NFS, CIFS
        • SAN: Fibre Channel
        • iSCSI, network block dev
        • DRBD network RAID
      • Good connectivity
        • common L2 network
        • L3 re-routing
    • 40. Challenges
      • VMs have lots of state in memory
      • Some VMs have soft real-time requirements
        • E.g. web servers, databases, game servers
        • May be members of a cluster quorum
        • Minimize down-time
      • Performing relocation requires resources
        • Bound and control resources used
    • 41. Relocation Strategy
      • Stage 0 (pre-migration): VM active on host A; destination host selected (block devices mirrored)
      • Stage 1 (reservation): initialize container on target host
      • Stage 2 (iterative pre-copy): copy dirty pages in successive rounds
      • Stage 3 (stop-and-copy): suspend VM on host A; redirect network traffic; synch remaining state
      • Stage 4 (commitment): activate on host B; VM state on host A released
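
A control-flow sketch of stages 2 to 4 above, using hypothetical helpers (`send_dirty_pages`, `pages_dirtied_since_last`, `suspend_vm`, and so on) purely to show the shape of the loop; the real logic lives in the Xen toolstack.

```c
/* Sketch of iterative pre-copy followed by stop-and-copy. */

#include <stddef.h>

#define MAX_ROUNDS           30
#define STOP_COPY_THRESHOLD  64   /* pages: small enough to send while paused */

extern size_t send_dirty_pages(void);          /* transmit current dirty set   */
extern size_t pages_dirtied_since_last(void);  /* via Xen's dirty logging      */
extern void   suspend_vm(void);
extern void   send_remaining_state(void);      /* last dirty pages + CPU state */
extern void   activate_on_destination(void);

void precopy_migrate(void)
{
    /* Round 1 sends everything; later rounds send only what was dirtied. */
    send_dirty_pages();

    for (int round = 2; round <= MAX_ROUNDS; round++) {
        if (pages_dirtied_since_last() <= STOP_COPY_THRESHOLD)
            break;                 /* writable working set is small enough */
        send_dirty_pages();
    }

    suspend_vm();                  /* stage 3: stop-and-copy */
    send_remaining_state();
    activate_on_destination();     /* stage 4: commitment    */
}
```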
    • 42.-46. Pre-Copy Migration: Round 1 (animation over five slides)
    • 47.-51. Pre-Copy Migration: Round 2 (animation over five slides)
    • 52. Pre-Copy Migration: Final
    • 53. Writeable Working Set
      • Pages that are dirtied must be re-sent
        • Super hot pages
          • e.g. process stacks; top of page free list
        • Buffer cache
        • Network receive / disk buffers
      • Dirtying rate determines VM down-time (a rough estimate follows below)
        • Shorter iterations -> less dirtying -> …
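
As a rough illustration of why the writable working set matters (the numbers below are made up): the pages still dirty when the VM is paused must cross the wire before it can resume, so down-time is roughly their total size divided by the link bandwidth.

```c
/* Back-of-envelope estimate of stop-and-copy down-time. */

#include <stdio.h>

int main(void)
{
    double wws_pages  = 2000;        /* residual dirty pages (illustrative) */
    double page_bytes = 4096;        /* x86 page size                       */
    double link_Bps   = 1e9 / 8;     /* 1 Gbit/s expressed in bytes/second  */

    double downtime_s = (wws_pages * page_bytes) / link_Bps;
    printf("estimated down-time: %.0f ms\n", downtime_s * 1000);  /* ~66 ms */
    return 0;
}
```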
    • 54. Rate Limited Relocation
      • Dynamically adjust resources committed to performing page transfer
        • Dirty logging costs VM ~2-3%
        • CPU and network usage closely linked
      • E.g. first copy iteration at 100Mb/s, then increase based on observed dirtying rate
        • Minimize impact of relocation on server while minimizing down-time
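
One plausible formulation of such a rate-limited policy, consistent with the 100 Mb/s starting point on the slide: each later round runs slightly faster than the dirtying rate observed in the previous round, and pre-copy stops once that would exceed a configured cap. The increment and cap below are illustrative parameters, not values fixed by Xen.

```c
/* Sketch of a rate-adaptive pre-copy bandwidth policy. */

#define FIRST_ROUND_MBPS    100   /* as on the slide                    */
#define RATE_INCREMENT_MBPS  50   /* headroom above observed dirty rate */
#define MAX_RATE_MBPS      1000   /* give up pre-copy beyond this       */

/* Returns the bandwidth (Mb/s) to use for the next round, or 0 to move
 * to stop-and-copy.  observed_dirty_mbps comes from Xen's dirty logging. */
static unsigned int next_round_rate(int round, unsigned int observed_dirty_mbps)
{
    if (round == 1)
        return FIRST_ROUND_MBPS;

    unsigned int rate = observed_dirty_mbps + RATE_INCREMENT_MBPS;
    return (rate > MAX_RATE_MBPS) ? 0 : rate;
}
```

Capping the rate bounds the resources the relocation consumes, while tracking the observed dirtying rate keeps the number of rounds, and hence the eventual down-time, small.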
    • 55. Web Server Migration
    • 56. Iterative Progress: SPECWeb 52s
    • 57. Iterative Progress: Quake3
    • 58. Quake 3 Server relocation
    • 59. Xen 3.1 Highlights
      • Released May 18th 2007
      • Lots of new features:
        • 32-on-64 PV guest support
          • run PAE PV guests on a 64-bit Xen!
        • Save/restore/migrate for HVM guests
        • Dynamic memory control for HVM guests
        • Blktap copy-on-write support (incl checkpoint)
        • XenAPI 1.0 support
          • XML configuration files for virtual machines
          • VM life-cycle management operations; and
          • Secure on- or off-box XML-RPC with many bindings.
    • 60. Xen 3.x Roadmap
      • Continued improvement of full virtualization
        • HVM (VT/AMD-V) optimizations
        • DMA protection of Xen, dom0
        • Optimizations for (software) shadow modes
      • Client deployments: support for 3D graphics, etc
      • Live/continuous checkpoint / rollback
        • disaster recovery for the masses
      • Better NUMA (memory + scheduling)
      • Smart I/O enhancements
      • “XenSE”: Open Trusted Computing
    • 61. Backed By All Major IT Vendors * Logos are registered trademarks of their owners
    • 62. Summary
      • Xen is re-shaping the IT industry
      • Commoditize the hypervisor
      • Key to volume adoption of virtualization
      • Paravirtualization in the next release of all OSes
      • XenSource Delivers Volume Virtualization
      • XenEnterprise shipping now
      • Closely aligned with our ecosystem to deliver full-featured, open and extensible solutions
      • Partnered with all key OSVs to deliver an interoperable virtualized infrastructure
    • 63. Thanks!
      • EuroSys 2008
      • Venue: Glasgow, Scotland, Apr 2-4 2008
      • Important dates:
        • 14 Sep 2007: submission of abstracts
        • 21 Sep 2007: submission of papers
        • 20 Dec 2007: notification to authors
      • More details at http://www.dcs.gla.ac.uk/Conferences/EuroSys2008