Transcript

  • 1. Xen and the Art of Virtualization
    By Paul Barham, Boris Dragovic, Steven Hand, Tim Harris, Alex Ho, Rolf Neugebauer, Ian Pratt, and Andrew Warfield. Presented by Diana Carroll.
  • 2. Virtual Machines
    - One hardware system with memory, processors, and I/O devices.
    - Multiple execution environments that each map to an identical representation of the physical system.
      - An OS running on a virtual machine is not aware that it is sharing the machine.
        - Virtual machines must be isolated from each other even though they share the same hardware.
        - The execution of one cannot stall or corrupt the others.
        - The performance overhead needs to be acceptably small.
      - The virtual machines must share the hardware as equally as possible.
    - A Virtual Machine Monitor is needed to accomplish this.
  • 3. Virtual Machine Monitors
    - Also known as a hypervisor.
    - Provides an interface for multiple virtual machines to coexist.
    - Can run multiple operating systems on a single computer.
      - Provides stability: even if one OS crashes, the rest of the machine remains functional.
      - Can eliminate the need for multiple machines dedicated to different operating systems.
    - Provides isolation between operating system instances and multiplexes physical resources across the running virtual machines.
      - Much like an OS does with processes.
  • 4. Xen
    - Xen is a Virtual Machine Monitor (VMM).
    - Allows users to dynamically instantiate an operating system.
    - Hosts operating systems such as Linux and Windows.
      - Some source code modifications are necessary.
        - In the paper, XenoLinux was complete; Windows XP and NetBSD were still in progress.
        - Now, NetBSD, Linux, FreeBSD, Plan 9, and NetWare ports are complete. The Windows XP port was successful, but licensing restrictions prevent it from being released. (1)
    - Multiple operating systems can run simultaneously and perform different tasks.
    - Xen is completely software-based and requires no special hardware support.
      - Full virtualization, in which the virtual hardware is identical to the underlying physical hardware, is virtually impossible on the x86 architecture.
      - Xen provides a similar, but not quite identical, view of the hardware.
  • 5. Xen Design Principles
    - Support unmodified application binaries.
      - Necessary to ensure that Xen is useful to users.
    - Support fully functional, multi-application operating systems as guests.
    - Use paravirtualization to provide high performance and strong resource isolation.
      - The guest operating system has to be modified to run on the Virtual Machine Monitor.
      - Specifically, the guest OS can no longer execute in ring 0, because that ring is now occupied by the VMM.
      - The guest OS has to be modified to run outside of ring 0.
    - Sometimes more correct behavior and better performance are achieved when the resource virtualization is not completely hidden.
  • 6. Xen versus Disco
    - Disco uses true virtualization (almost).
      - True virtualization does not require any modification of the guest OS.
      - The virtual machine is indistinguishable from the real hardware.
    - Xen uses paravirtualization.
      - The guest OS has to be modified, or ported, to run on the Xen hypervisor.
      - Xen virtual machines resemble the real hardware but do not attempt to be an exact match.
      - When appropriate, the guest OS makes calls to the hypervisor rather than to the hardware.
        - e.g., for memory management and I/O.
      - This solves the problem of architectures like the x86 that do not support true virtualization.
        - On x86, the TLB is hardware-managed rather than software-managed.
  • 7. The Virtual Machine Interface
    - A paravirtualized version of the x86 interface.
      - In this case, the x86 architecture is a worst-case environment.
    - Divided into memory management, CPU, and I/O.
    - Guest operating systems execute within domains.
      - A domain is a running virtual machine.
  • 8. Memory Management
    - Guest OSes are responsible for allocating and managing the hardware page tables.
      - Minimal involvement from Xen is required to ensure safety and isolation.
      - This arrangement is necessary because x86 lacks a software-managed TLB, which could otherwise be efficiently virtualized.
    - Xen occupies a 64MB section at the top of each address space.
      - This avoids the TLB being flushed each time execution enters or leaves the hypervisor.
    - A guest OS allocates and initializes a new page table from its own memory and then registers it with Xen.
      - All subsequent updates must be validated by Xen.
      - Updates can be batched to improve efficiency (see the sketch below).
    - Segment descriptors are also validated. They must have lower privilege than Xen and cannot allow access to the Xen-reserved portion of the address space.
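To make the batching point concrete, here is a minimal C sketch of queuing page-table writes and flushing them in a single hypercall. It is modeled on the mmu_update hypercall from Xen's public headers, but the exact signature has varied across Xen versions, so treat the wrapper below as an assumption.

    /* Sketch of batched page-table updates. Modeled on Xen's
     * mmu_update hypercall; the exact signature has varied across
     * Xen versions, so treat the stub below as illustrative. */
    #include <stdint.h>

    typedef struct mmu_update {
        uint64_t ptr;   /* machine address of the PTE to modify */
        uint64_t val;   /* new contents for that PTE */
    } mmu_update_t;

    /* Assumed hypercall stub; a real guest port provides this. */
    extern int HYPERVISOR_mmu_update(mmu_update_t *reqs,
                                     unsigned count, unsigned *done);

    #define MAX_BATCH 32
    static mmu_update_t batch[MAX_BATCH];
    static unsigned batch_len;

    /* One hypercall validates and applies the whole batch. */
    void flush_pte_updates(void)
    {
        unsigned done;
        if (batch_len != 0)
            HYPERVISOR_mmu_update(batch, batch_len, &done);
        batch_len = 0;
    }

    /* Queue one PTE write instead of trapping immediately. */
    void queue_pte_update(uint64_t pte_maddr, uint64_t new_val)
    {
        batch[batch_len].ptr = pte_maddr;
        batch[batch_len].val = new_val;
        if (++batch_len == MAX_BATCH)
            flush_pte_updates();
    }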
  • 9. Virtualizing the CPU
    - Applications run at different privilege levels.
      - Typically, on x86, an OS runs in ring 0 as the most privileged entity in the system.
      - Applications usually run in ring 3.
    - With a virtualized CPU, the OS no longer runs in ring 0.
      - This privilege level is now reserved for the VMM.
      - The guest OS must be modified to run at a lower privilege level.
      - Since most OS implementations do not use rings 1 and 2, the guest OS can be ported to ring 1.
    - This prevents the guest OS from executing privileged hypervisor code, but keeps it safely isolated from applications, which still run in ring 3.
  • 10. CPU Virtualization, Continued
    - Privileged instructions must be validated and executed within Xen.
      - e.g., installing a new page table or yielding the processor.
      - Attempts to execute a privileged instruction fail, since only Xen operates at the highest privilege level.
    - Exceptions are managed using a table of exception handlers (see the sketch below).
      - The page fault handler is the only one that has to be modified, to read the faulting address from an extended stack frame instead of a register.
      - For system calls, each guest OS can register a 'fast' exception handler, since it is not necessary for these to indirect through ring 0.
    - All exception handlers are validated by Xen.
      - Each is checked to ensure that the handler code does not specify execution in ring 0.
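Handler registration can be pictured as handing Xen a table of (vector, segment, entry point) tuples. A hedged C sketch, modeled on the trap_info/set_trap_table interface in Xen's headers; the field names and the segment selector value are illustrative assumptions:

    /* Sketch of registering guest exception handlers with Xen.
     * Modeled on Xen's trap_info_t / set_trap_table interface;
     * field names and selector values here are illustrative. */
    #include <stdint.h>

    #define GUEST_CS 0x0819  /* hypothetical ring-1 code segment */

    typedef struct trap_info {
        uint8_t  vector;         /* exception vector number */
        uint8_t  flags;          /* e.g., lowest ring allowed to raise it */
        uint16_t cs;             /* code segment of the handler */
        unsigned long address;   /* handler entry point */
    } trap_info_t;

    /* Assumed hypercall stub; a real guest port provides this. */
    extern int HYPERVISOR_set_trap_table(trap_info_t *table);

    extern void divide_error(void);  /* vector 0 handler */
    extern void page_fault(void);    /* vector 14 handler */

    static trap_info_t trap_table[] = {
        {  0, 0, GUEST_CS, (unsigned long)divide_error },
        { 14, 0, GUEST_CS, (unsigned long)page_fault   },
        {  0, 0, 0, 0 }  /* terminator */
    };

    void install_traps(void)
    {
        /* Xen validates each entry before accepting the table:
         * no handler may request execution in ring 0. */
        HYPERVISOR_set_trap_table(trap_table);
    }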
  • 11. I/O
    - Xen uses a set of device abstractions instead of emulating existing hardware devices.
      - I/O data is transferred to and from each domain via Xen.
      - “Shared memory, asynchronous buffer-descriptor rings” are used to pass I/O buffer information vertically through the system.
    - Asynchronous notifications of I/O events are made to a domain.
      - Made by updating a bitmap of pending event types, and possibly calling an event handler specified by the guest OS.
  • 12. Porting an OS to Xen
    - Requires less than 2% of the total lines of code to be modified.
    - User software runs on the guest OS without requiring modification.
  • 13. Separating Policy from Mechanism
    - The hypervisor only provides basic control operations.
      - Authorized domains can export these operations through a control interface.
      - An initial domain, Domain0, is created at boot time and can access the control interface.
        - It can then use the control interface to create and manage additional domains (see the sketch below).
        - It is responsible for building each domain and the initial structures that support each guest OS.
        - It can be specialized to handle the varying requirements of different OSes.
      - The control interface also supports virtual I/O devices.
        - Virtual Network Interfaces (VIFs) and Virtual Block Devices (VBDs).
        - Additional administrative tools may be added to Domain0 in the future.
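To make the policy/mechanism split concrete, here is a purely hypothetical sketch of how Domain0 might drive the control interface to boot a guest. None of these function names come from the paper or from Xen's actual tools; they only illustrate that Xen supplies the primitives while Domain0 supplies the policy.

    /* Hypothetical Domain0-side flow: every ctrl_* function below
     * is an invented stand-in for a control-interface operation. */
    extern int ctrl_create_domain(unsigned long mem_kb, unsigned vcpus);
    extern int ctrl_load_guest_kernel(int domid, const char *image);
    extern int ctrl_attach_vbd(int domid, const char *extents);
    extern int ctrl_attach_vif(int domid, const char *bridge);
    extern int ctrl_unpause(int domid);

    int boot_guest(void)
    {
        int domid = ctrl_create_domain(64 * 1024, 1);  /* 64MB guest */
        if (domid < 0)
            return -1;
        ctrl_load_guest_kernel(domid, "/boot/xenolinux.gz");
        ctrl_attach_vbd(domid, "sda1");     /* virtual block device */
        ctrl_attach_vif(domid, "bridge0");  /* virtual network interface */
        return ctrl_unpause(domid);         /* start the new domain */
    }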
  • 14. Control Transfer
    - Hypercalls are made from a domain to Xen.
      - A hypercall is a synchronous software trap into the hypervisor.
        - e.g., to request a set of page-table updates or another privileged operation.
      - Control is returned to the calling domain when the call is completed.
    - Notifications from Xen to a domain are made using an asynchronous event mechanism (see the sketch below).
      - Replaces the usual delivery mechanism for device interrupts.
      - Allows lightweight notification of events.
      - Similar to Unix signals.
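The guest side of the event mechanism can be pictured as scanning a shared pending bitmap and dispatching handlers, as in this illustrative C sketch (the bitmap layout and all names are assumptions, not Xen's actual interface):

    /* Illustrative sketch of guest-side event dispatch: Xen sets
     * bits in a shared pending bitmap; the guest clears them and
     * calls per-event handlers, much like Unix signal delivery. */
    #include <stdint.h>

    #define NR_EVENTS 64

    /* Shared with Xen: Xen sets bits, the guest consumes them. */
    extern volatile uint64_t pending_events;

    typedef void (*event_handler_t)(unsigned event);
    static event_handler_t handlers[NR_EVENTS];

    void dispatch_events(void)
    {
        /* Atomically take the whole pending set so no event is lost. */
        uint64_t pending =
            __atomic_exchange_n(&pending_events, 0, __ATOMIC_ACQ_REL);
        while (pending != 0) {
            unsigned ev = __builtin_ctzll(pending);  /* lowest set bit */
            pending &= pending - 1;                  /* clear that bit */
            if (handlers[ev] != 0)
                handlers[ev](ev);
        }
    }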
  • 15. Device Transfer
    - The virtual machine monitor is an extra protection domain between a guest OS and an I/O device.
      - Data needs to be transferred from I/O device to OS with as little overhead as possible.
    - An I/O descriptor ring is a circular queue of descriptors (see the sketch below).
      - Descriptors are allocated by a domain but accessible from within Xen.
      - Access to the ring is controlled by two producer/consumer pointer pairs:
        - Domains produce requests and advance the request producer pointer.
        - Xen removes requests and advances the request consumer pointer.
        - Xen produces responses and advances the response producer pointer.
        - Domains remove responses and advance the response consumer pointer.
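A conceptual C sketch of such a ring, showing the domain-side operations; the structure and field names are illustrative rather than Xen's actual definitions:

    /* Conceptual sketch of a Xen-style asynchronous I/O ring: a
     * fixed-size circular buffer whose slots are reused for requests
     * and responses, guarded by two producer/consumer index pairs. */
    #include <stdint.h>

    #define RING_SIZE 64  /* power of two, so masking wraps indices */
    #define MASK(i) ((i) & (RING_SIZE - 1))

    struct io_desc { uint64_t id, buffer, length; };

    struct io_ring {
        volatile uint32_t req_prod, req_cons;  /* domain / Xen advance */
        volatile uint32_t rsp_prod, rsp_cons;  /* Xen / domain advance */
        struct io_desc slot[RING_SIZE];
    };

    /* Domain side: enqueue a request if the ring has room. A slot
     * is free once its previous response has been consumed. */
    int ring_put_request(struct io_ring *r, const struct io_desc *d)
    {
        if (r->req_prod - r->rsp_cons >= RING_SIZE)
            return -1;                           /* ring full */
        r->slot[MASK(r->req_prod)] = *d;
        __atomic_thread_fence(__ATOMIC_RELEASE); /* publish data first */
        r->req_prod++;
        return 0;
    }

    /* Domain side: consume one response if any are outstanding. */
    int ring_get_response(struct io_ring *r, struct io_desc *d)
    {
        if (r->rsp_cons == r->rsp_prod)
            return -1;                           /* nothing pending */
        *d = r->slot[MASK(r->rsp_cons)];
        r->rsp_cons++;
        return 0;
    }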
  • 16. Virtualization of System Components
    - CPU scheduling is done using the Borrowed Virtual Time (BVT) algorithm (see the sketch below).
      - Thread execution is monitored in terms of virtual time.
        - The scheduler selects the thread with the earliest effective virtual time.
        - A thread can borrow virtual time by warping back to appear earlier and gain dispatch priority.
          - But it then goes to the end of the line after execution.
      - Protects against low-latency threads consuming excessive processing cycles.
      - CPU resources are allocated dynamically; there is no need to predict processing requirements in advance.
    - Guest OSes are given three ways of interpreting time:
      - Virtual time advances only while the domain is executing.
      - Real time is the time in nanoseconds since machine boot (it can be locked to an external time source).
      - Wall-clock time is real time plus an offset.
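A minimal C sketch of the BVT dispatch rule described above; the bookkeeping is simplified and the names are illustrative:

    /* Sketch of Borrowed Virtual Time dispatch: each runnable domain
     * accumulates actual virtual time (avt); while warping, its
     * effective virtual time (evt) is avt minus its warp allowance.
     * The scheduler always runs the smallest evt. Simplified. */
    #include <stdint.h>
    #include <stddef.h>

    struct bvt_dom {
        int64_t avt;      /* accumulated virtual time */
        int64_t warp;     /* how far a latency-sensitive domain may warp back */
        int     warping;  /* currently borrowing virtual time? */
        int     runnable;
    };

    static int64_t effective_vt(const struct bvt_dom *d)
    {
        return d->warping ? d->avt - d->warp : d->avt;
    }

    /* Pick the runnable domain with the earliest effective virtual time. */
    struct bvt_dom *bvt_pick(struct bvt_dom *doms, size_t n)
    {
        struct bvt_dom *best = 0;
        for (size_t i = 0; i < n; i++) {
            if (!doms[i].runnable)
                continue;
            if (best == 0 || effective_vt(&doms[i]) < effective_vt(best))
                best = &doms[i];
        }
        return best;
    }

    /* Charge CPU use against the domain's share: avt grows more slowly
     * for domains with larger weights, and a domain that warped repays
     * the borrowed time by falling behind afterwards. */
    void bvt_charge(struct bvt_dom *d, int64_t cpu_ns, int64_t weight)
    {
        d->avt += cpu_ns / weight;
    }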
  • 17. Components Continued: Virtual Address Translation
    - The x86 architecture uses hardware page tables, which makes memory virtualization more difficult.
    - Xen only deals with page-table updates.
      - Guest OS page tables are registered directly with the MMU.
      - Guest OSes have read-only access to them.
      - There is no need for shadow page tables.
    - A guest OS passes Xen its page-table updates using a hypercall.
      - Requests are validated and then applied (see the sketch below).
      - A type and reference count are kept for each machine page frame and are used to validate updates.
        - Frames that have already been validated are marked so they do not have to be revalidated.
    - Hypercall requests can be batched to improve performance.
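The validation step can be sketched as a type check against a frame table. This is an illustrative simplification in C, not Xen's actual code:

    /* Illustrative sketch of update validation: each machine frame
     * carries a type and reference count, and a PTE may grant write
     * access only to frames that are not themselves page tables. */
    #include <stdint.h>
    #include <stdbool.h>

    enum frame_type { PGT_NONE, PGT_PAGE_TABLE, PGT_WRITABLE };

    struct frame_info {
        enum frame_type type;
        unsigned refcount;
        bool validated;   /* already checked; skip revalidation */
    };

    extern struct frame_info frame_table[];  /* one entry per frame */

    #define PTE_WRITE      0x2
    #define PTE_FRAME(pte) ((pte) >> 12)     /* frame number field */

    /* Accept a PTE update only if it cannot break isolation. */
    bool validate_pte_update(uint64_t new_pte)
    {
        struct frame_info *f = &frame_table[PTE_FRAME(new_pte)];

        if ((new_pte & PTE_WRITE) && f->type == PGT_PAGE_TABLE)
            return false;  /* guest could forge its own mappings */

        if (!f->validated) {
            /* full ownership/type check elided in this sketch */
            f->validated = true;
        }
        f->refcount++;
        return true;
    }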
  • 18. Physical Memory and Disk
    - Each domain receives an initial reservation of memory.
      - Memory is statically partitioned between domains.
        - A domain may claim additional memory pages up to its reservation limit.
        - A domain may also release pages back to Xen.
      - A balloon driver passes memory pages between Xen and the guest OS's page allocator (see the sketch below).
        - Mapping from physical to hardware addresses is left to the OS.
        - Xen provides a shared translation array that is readable by all domains; updates to it are validated by Xen first.
    - Only Domain0 has direct access to physical disks. All other domains access virtual block devices (VBDs).
      - Domain0 manages the virtual block devices, using the I/O ring queuing mechanism to control access.
      - A VBD is composed of a list of extents with associated ownership and access-control information.
      - To a guest OS, a VBD behaves very much like a SCSI disk.
        - Xen keeps the translation table and may reorder requests or process them in batches.
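A hedged sketch of the balloon driver's two directions; the xen_* reservation calls stand in for Xen's memory-reservation hypercalls, whose actual names have differed across Xen versions, and the guest_* allocator hooks are assumptions:

    /* Sketch of a balloon driver: inflate hands frames back to Xen,
     * deflate reclaims frames (up to the reservation limit) and
     * returns them to the guest's page allocator. All xen_* and
     * guest_* functions are assumed stand-ins. */
    #include <stdint.h>
    #include <stddef.h>

    #define BALLOON_BATCH 64

    extern int xen_decrease_reservation(uint64_t *frames, size_t n);
    extern int xen_increase_reservation(uint64_t *frames, size_t n);
    extern uint64_t guest_alloc_frame(void);
    extern void guest_free_frame(uint64_t frame);

    int balloon_inflate(size_t n)  /* give n frames back to Xen */
    {
        uint64_t frames[BALLOON_BATCH];
        if (n > BALLOON_BATCH)
            return -1;
        for (size_t i = 0; i < n; i++)
            frames[i] = guest_alloc_frame();  /* take from allocator */
        return xen_decrease_reservation(frames, n);
    }

    int balloon_deflate(size_t n)  /* reclaim n frames from Xen */
    {
        uint64_t frames[BALLOON_BATCH];
        if (n > BALLOON_BATCH)
            return -1;
        if (xen_increase_reservation(frames, n) != 0)
            return -1;  /* over the limit, or Xen has no free memory */
        for (size_t i = 0; i < n; i++)
            guest_free_frame(frames[i]);      /* hand to allocator */
        return 0;
    }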
  • 19. Performance
    - Five implementations were compared in total.
      - Three VMMs:
        - VMware Workstation 3.2
        - User-Mode Linux (runs the Linux OS in user mode on a Linux host)
        - Xen with the XenoLinux port
      - Plus native Linux.
    - All used Red Hat 7.2 with the Linux 2.4.21 kernel, i686 architecture, and the ext3 file system.
    - All ran on Dell 2650 dual-processor 2.4GHz systems with 2GB RAM, gigabit Ethernet, and a 146GB SCSI drive. Hyperthreading was disabled.
    - They also tested VMware ESX Server, which replaces the host OS with a dedicated kernel, but were unable to report the results (EULA restrictions).
  • 20. Performance Results
    - Cluster 1: SPEC CPU suite.
      - Computationally intensive applications with very little I/O and OS interaction.
    - Cluster 2: Time taken to build a default configuration of the Linux 2.4.21 kernel with gcc 2.96.
    - Cluster 3: Open Source Database Benchmark suite in its default configuration.
      - Information retrieval, measured in tuples per second.
    - Cluster 4: Open Source Database Benchmark suite in its default configuration.
      - Online Transaction Processing workloads, measured in tuples per second.
    - Cluster 5: The dbench program, emulating the load placed on a file server.
    - Cluster 6: SPEC WEB99, a web server benchmark.
  • 21. Operating System Benchmarks
    - Measured using the lmbench program, version 3.0-a3.
      - L-UP is native Linux, uniprocessor.
      - L-SMP is native Linux, multiprocessor.
      - Xen is running XenoLinux, their port of the Linux OS.
      - VMW is VMware.
      - UML is User-Mode Linux.
  • 22. Further Performance Measures
    - Multiple instances of PostgreSQL in separate domains.
      - OSDB-IR = Open Source Database Benchmark, Information Retrieval.
      - OSDB-OLTP = Open Source Database Benchmark, On-Line Transaction Processing.
    - Performance isolation
      - They could not find another OS-based implementation of performance isolation to compare against.
      - They tested Xen using 4 domains running with equal resource allocations.
        - 2 domains ran previously measured workloads.
        - 2 domains ran disruptive processes (e.g., a disk bandwidth hog, a fork bomb, a memory grabber).
      - The impact of the disruptive processes was only a 2-4% decrease in performance.
      - The same processes effectively shut down a native Linux system.
  • 23. Scalability
    - Xen's target was to scale to 100 domains.
    - They were able to configure a guest OS for server functionality running in only 4MB of memory, with swap.
      - When an incoming request was received, it could request more memory from Xen.
    - Compared to native Linux, they found a tradeoff.
      - Long time slices give the highest throughput, but less responsiveness. Xen running with 50ms time slices had throughput similar to Linux.
      - Short time slices lower throughput but improve responsiveness.
        - With 128 domains running, Xen still provided a response time of 5.4ms.
        - 5ms time slices resulted in 7.5% lower throughput.
  • 24. Conclusion
    - Xen is a software-based Virtual Machine Monitor (hypervisor).
      - It allows multiple OSes to be hosted simultaneously on the same machine.
      - It requires the OS to be modified (ported) in order to run on the VMM.
      - It provides the protection of performance isolation between domains.
    - Xen today...
      - An open-source project published under the GPL.
        - Currently on version 3.0.
        - NetBSD, Linux (several distros, including SuSE, Fedora, RHEL, and Mandrake), FreeBSD, Plan 9, and NetWare ports are complete. The Windows XP port was successful, but licensing restrictions prevent it from being released.
    - Hardware support for virtualization
      - Intel is releasing a new line of processors that support virtualization.
      - Two forms of CPU operation:
        - In addition to rings 0-3, there is a root mode where the VMM can run.
        - Guest OSes can still run at ring 0, so porting is no longer required.
      - A Virtual Machine Control Structure (VMCS) manages VM entries and exits.
  • 25. References
    - University of Cambridge Xen page
      - http://www.cl.cam.ac.uk/Research/SRG/netos/xen/
    - Wikipedia entry for Xen
      - http://en.wikipedia.org/wiki/Xen_%28virtual_machine_monitor%29
    - "Intel Virtualization Technology," by Rich Uhlig, Gil Neiger, Dion Rodgers, Amy Santoni, Fernando Martins, Andrew Anderson, Steven Bennett, Alain Kagi, Felix Leung, and Larry Smith.
      - Published in Computer magazine, May 2005 (Vol. 38, No. 5). ISSN 0018-9162.
    - "Borrowed-Virtual-Time (BVT) Scheduling: Supporting Latency-Sensitive Threads in a General-Purpose Scheduler," by Kenneth J. Duda and David R. Cheriton.
