SystemVM2.ppt Presentation Transcript

  • 1. Chapter 8 System Virtual Machines 2005.11.9 Dong In Shin Distributed Computing System Laboratory Seoul National Univ. System VMs
  • 2. Contents
    • 1. Performance Enhancement of System VMs
    • 2. Case Study: VMware Virtual Platform
    • 3. Case Study: The Intel VT-x Technology
    • 4. ** Case Study: Xen
  • 3. Performance Enhancement of System Virtual Machines
  • 4. Reasons for Performance Degradation
    • Setup
    • Emulation
      • Some guest instructions need to be emulated (usually via interpretation) by the VMM.
    • Interrupt handling
    • State saving
    • Bookkeeping
      • Ex. The accounting of time charged to a user
    • Time elongation
  • 5. Instruction Emulation Assists
    • The VMM emulates a privileged instruction using a routine whose operation depends on whether the virtual machine is supposed to be executing in system mode or in user mode.
      • Hardware assist for checking the state and performing the actions.
  • 6. Virtual Machine Monitor Assists
    • Context switch
      • Using hardware to save and restore registers
    • Decoding of privileged instructions
      • Hardware assists, such as decoding the privileged instructions.
    • Virtual interval timer
      • Decrementing the virtual counter by some amount estimated by the VMM from the amount that the real timer decrements.
    • Adding to the instruction set
      • A number of new instructions that are not a part of the ISA of the machine.
  • 7. Improving Performance of the Guest System
    • Non-paged mode
      • The guest OS disables dynamic address translation and defines its real address space to be as large as the largest virtual address space → page frames are mapped to fixed real pages.
      • The guest OS no longer has to exercise demand paging.
      • No double paging
      • No potential conflict in paging decisions by the guest OS and the VMM
  • 8. Double Paging
    • Two independent layers of paging interact and perform poorly.
    • The guest OS incorrectly believes a page to be in physical memory (green/gold pages); the VMM believes an unneeded page is still in use (teal pages); the guest evicts a page despite available physical memory (red pages).
  • 9. Pseudo-page-fault handling
    • A page fault in a VM system
      • A page fault in some VM’s page table
      • A page fault in the VMM’s page table
        • Pseudo-page-fault handling
    • Process
      • The VMM initiates a page-in operation from the backing store.
      • The VMM triggers a ‘pseudo page fault’ in the guest.
      • The guest OS suspends only the faulting user process.
      • The VMM does not suspend the guest as a whole.
    • On completion of the page-in operation
      • The VMM invokes the guest pseudo-page-fault handler again.
      • The guest OS handler wakes up the blocked user process.
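The handshake above can be sketched as a small simulation; the class and method names (Guest, VMM, pseudo_page_fault, and so on) are illustrative stand-ins, not a real VMM interface:

```python
# Illustrative model of pseudo-page-fault handling: the VMM starts the
# page-in and notifies the guest, which blocks only the faulting process.

class Guest:
    def __init__(self):
        self.blocked = {}            # process id -> page it is waiting on
        self.runnable = set()

    def pseudo_page_fault(self, pid, page):
        # Guest OS suspends only the faulting process, then schedules others.
        self.runnable.discard(pid)
        self.blocked[pid] = page

    def pseudo_fault_complete(self, page):
        # Wake every process that was waiting on this page.
        for pid, p in list(self.blocked.items()):
            if p == page:
                del self.blocked[pid]
                self.runnable.add(pid)

class VMM:
    def __init__(self, guest):
        self.guest = guest
        self.inflight = set()        # page-in operations in progress

    def handle_fault(self, pid, page):
        # Start the page-in and notify the guest -- do NOT suspend the guest.
        self.inflight.add(page)
        self.guest.pseudo_page_fault(pid, page)

    def page_in_done(self, page):
        self.inflight.discard(page)
        self.guest.pseudo_fault_complete(page)

guest = Guest()
guest.runnable = {1, 2}
vmm = VMM(guest)
vmm.handle_fault(1, page=7)          # process 1 faults on page 7
assert guest.runnable == {2}         # process 2 keeps running meanwhile
vmm.page_in_done(7)
assert guest.runnable == {1, 2}      # process 1 is woken again
```

The point of the protocol is visible in the middle assertion: the guest as a whole stays runnable while the page-in is in flight.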
  • 10. The others…
    • Spool files
      • Without any special mechanism, the VMM must intercept the I/O commands and work out that the virtual machines are simultaneously attempting to send jobs to the I/O devices.
      • Handshaking allows the VMM to pick up the spool file and merge it into its own buffer.
    • Inter-virtual-machine communication
      • Communication between two physical machines involves processing message packets through several layers on the sender and receiver sides.
      • This process can be streamlined, simplified, and made faster if the two machines are virtual machines on the same host platform.
  • 11. Specialized Systems
    • Virtual-equals-real (V=R) virtual machine
      • The host address space representing the guest real memory is mapped one-to-one to the host real memory address space.
    • Shadow-table bypass assist
      • The guest page tables can point directly to physical addresses if the dynamic address translation hardware is allowed to manipulate the guest page tables.
    • Preferred-machine assist
      • Allow a guest OS to operate in system mode rather than user mode.
    • Segment sharing
      • Sharing the code segments of the operating system among the virtual machines, provided the operating system code is written in a reentrant manner.
  • 12. Generalized Support for Virtual Machines
    • Interpretive Execution Facility (IEF)
      • The processor directly executes most of the functions of the virtual machine in hardware.
      • An extreme case of a VM assist.
    • Interpretive Execution Entry and Exit
      • Entry
        • Start Interpretive Execution (SIE): the software gives up control to the hardware IEF and the processor enters interpretive execution mode.
      • Exit
        • Host Interrupt
        • Interception
          • Unsupported hardware instructions.
          • Exception during the execution of interpreted instruction.
          • Some special case…
  • 13. Interpretive Execution Entry and Exit [diagram: VMM software issues SIE to enter interpretive execution mode; exit to the host interrupt handler on a host interrupt, or to VMM emulation on an interception]
  • 14. Full-virtualization Versus Para-virtualization
    • Full virtualization
      • Provides a total abstraction of the underlying physical system and creates a complete virtual system in which a guest operating system can execute.
      • No modification is required in the guest OS or application.
      • The guest OS or application is not aware of the virtualized environment.
    • Advantages
      • Streamlining the migration of applications and workloads between different physical systems.
      • Complete isolation of different applications, which makes this approach highly secure.
    • Disadvantages
      • Performance penalty
    • Microsoft Virtual Server and VMware ESX Server
  • 15. Full-virtualization Versus Para-virtualization
    • Para Virtualization
      • A virtualization technique that presents a software interface to virtual machines that is similar but not identical to that of the underlying hardware.
      • This technique requires modifications to the guest OSes running on the VMs.
      • The guest OSes are aware that they are executing on a VM.
    • Advantages
      • Near-native performance
    • Disadvantages
      • Some limitations, including several security weaknesses such as exposure of guest OS cached data, unauthenticated connections, and so forth.
    • Xen system
  • 16. Case Study: VMware Virtual Platform
  • 17. VMware Virtual Platform
    • A popular virtual machine infrastructure for IA-32-based PCs and servers.
    • An example of a hosted virtual machine system
      • Native virtualization product → VMware ESX Server
      • This book is limited to the hosted system, VMware GSX Server (VMWare2001)
    • Challenges
      • The IA-32 environment is difficult to virtualize efficiently.
      • The openness of the system architecture.
      • Easy installation.
  • 18. Vmware’s Hosted Virtual Machine Model
  • 19. Processor Virtualization
    • Critical Instructions in Intel IA-32 architecture
      • not efficiently virtualizable.
    • Protection system references
      • Reference the storage protection system, memory system, or address relocation system. (ex. mov ax, cs )
    • Sensitive register instructions
      • Read or change resource-related registers and memory locations (ex. POPF)
    • Problems
      • Sensitive instructions executed in user mode do not behave as expected unless they are emulated.
    • Solutions
      • The VM monitor substitutes the instruction with another set of instructions and emulates the action of the original code.
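POPF is the classic concrete case: executed at user level, the real CPU silently ignores writes to the interrupt-enable flag instead of trapping. A toy model (all names hypothetical) contrasts the hardware behavior with the VMM's substituted emulation routine:

```python
IF = 0x200  # interrupt-enable flag bit in EFLAGS

def hardware_popf(eflags, value, cpl):
    # What the real CPU does: at CPL 3, POPF silently leaves IF unchanged
    # rather than trapping -- the reason IA-32 is not classically virtualizable.
    if cpl == 3:
        value = (value & ~IF) | (eflags & IF)
    return value

def vmm_emulate_popf(vcpu, value):
    # The VMM's substituted routine applies the write to the *virtual*
    # EFLAGS according to the guest's virtual privilege level.
    if vcpu["virtual_cpl"] == 0:
        vcpu["eflags"] = value                            # guest kernel may toggle IF
    else:
        vcpu["eflags"] = (value & ~IF) | (vcpu["eflags"] & IF)

# Running the guest kernel's POPF directly at CPL 3 loses the IF update:
assert hardware_popf(eflags=0x0, value=IF, cpl=3) == 0x0
# The emulated version preserves the guest kernel's intent:
vcpu = {"eflags": 0x0, "virtual_cpl": 0}
vmm_emulate_popf(vcpu, IF)
assert vcpu["eflags"] == IF
```

This is exactly the "does not behave as expected unless emulated" problem from the slide: the instruction fails silently rather than faulting, so the monitor must find and replace it.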
  • 20. Input/Output Virtualization
    • The PC platform supports many more devices and types of devices than any other platform.
    • Emulation in VMMonitor
      • Converting the guest’s IN and OUT I/O instructions into emulated I/O operations.
      • Requires some knowledge of the device interfaces.
    • New Capability for Devices Through Abstraction Layer
      • VMApp’s ability to insert a layer of abstraction above the physical device.
    • Advantages
      • Reduce performance losses due to virtualization.
        • Ex) Virtual Ethernet switch between a virtual NIC and a physical NIC.
  • 21. Using the Services of the Host Operating System
    • The request is converted into a host OS call.
    • Advantages
      • No limitations on the VMM’s access to the host OS’s I/O features.
      • Running performance-critical applications.
  • 22. Memory Virtualization
    • Paging requests of the guest OS
      • Not directly intercepted by the VMM, but converted into disk reads/writes.
      • The VMMonitor translates them into requests on the host OS through the VMApp.
    • Page replacement policy of host OS
      • The host could evict critical pages of the VM system as it competes with other host applications.
      • The VMDriver pins critical pages for the virtual memory system.
  • 23. Vmware ESX Server
    • Native VM
      • A thin software layer designed to multiplex hardware resources among virtual machines
      • Providing higher I/O performance and complete control over resource management
    • Full Virtualization
      • For servers running multiple instances of unmodified operating systems
  • 24. Page Replacement Issues
    • Problem of double paging
      • Unintended interactions between the native memory management policies of the guest operating systems and the host system.
    • Ballooning
      • Reclaims the pages considered least valuable by the operating system running in a virtual machine.
      • Small balloon module loaded into the guest OS as a pseudo-device driver or kernel service.
      • Module communicates with ESX server via a private channel.
  • 25. Ballooning in VMware ESX Server
    • Inflating a balloon
      • When the server wants to reclaim memory
      • The driver allocates pinned physical pages within the VM.
      • The increased memory pressure causes the guest OS to reclaim space to satisfy the driver’s allocation request.
      • The driver communicates the physical page number of each allocated page to the ESX Server.
    • Deflating
      • Frees up memory for general use within the guest OS
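The inflate/deflate cycle can be modeled with a toy class (illustrative names only, not the real ESX or driver interface):

```python
# Toy model of ballooning: the balloon driver pins pages inside the guest;
# the host can then reuse the machine pages backing them.

class BalloonedVM:
    def __init__(self, total_pages):
        self.total = total_pages
        self.balloon = set()       # guest-physical pages pinned by the driver

    def inflate(self, n):
        # Allocating pinned pages raises memory pressure; the guest OS's own
        # replacement policy decides what to evict to satisfy the request.
        free = [p for p in range(self.total) if p not in self.balloon]
        taken = free[:n]
        self.balloon.update(taken)
        return taken               # page numbers reported to the server

    def deflate(self, n):
        # Returning pages frees memory for general use within the guest.
        for p in list(self.balloon)[:n]:
            self.balloon.discard(p)

vm = BalloonedVM(total_pages=8)
reclaimed = vm.inflate(3)          # host reclaims 3 pages' worth of memory
assert len(vm.balloon) == 3
vm.deflate(2)                      # host gives 2 pages' worth back
assert len(vm.balloon) == 1
```

The design point the slides make is that the *guest's* paging policy, not the VMM's, chooses which pages to give up, which is what makes ballooning avoid the double-paging problem.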
  • 26. Virtualizing I/O Devices on VMware Workstation
    • Supported virtual devices of VMware
      • PS/2 keyboard, PS/2 mouse, floppy drive, IDE controllers with ATA disks and ATAPI CD-ROMs, a Soundblaster 16 sound card, serial and parallel ports, virtual BusLogic SCSI controllers, AMD PCNet Ethernet adapters, and an SVGA video controller.
    • Procedures
      • Intercept I/O operations issued by the guest OS (IA-32 IN and OUT).
      • Emulate them in either the VMM or the VMApp.
    • Drawbacks
      • Virtualizing I/O devices can incur overhead from world switches between the VMM and the host.
      • Handling the privileged instructions used to communicate with the hardware.
  • 27. Case Study: The Intel VT-x (Vanderpool) Technology
  • 28. Overview
    • VT-x (Vanderpool) technology for IA-32 processors
      • Enhances the performance of VM implementations through hardware enhancements to the processor.
    • Main Feature
      • The inclusion of the new VMX mode of operation (VMX root/non-root operation)
      • VMX root operation
        • Fully privileged, intended for the VM monitor
        • New instructions – the VMX instructions
      • VMX non-root operation
        • Not fully privileged, intended for guest software
        • Reduces Guest SW privilege w/o relying on rings
  • 29. Technological Overview [diagram: VMXON enters VMX operation in root mode (VMM); VMLAUNCH starts VM1/VM2 in non-root mode; VM exits return to the root-mode VMM, which issues VMRESUME to continue a guest; VMXOFF returns to regular mode]
  • 30. VT-x Operations [diagram: ordinary IA-32 operation enters VMX root operation via VMXON; VMLAUNCH/VMRESUME transfer to VMX non-root operation, where each VM 1..n runs its own ring 0–ring 3 software; VM exits return to root operation; one VMCS per virtual machine]
  • 31. Capabilities of the Technology
    • A key aspect
      • The elimination of the need to run all guest code in user mode.
    • Maintenance of state information
      • A major source of overhead in a software-based solution
      • A hardware technique that allows all of the state-holding data elements to be mapped to their native structures.
      • VMCS (Virtual Machine Control Structure)
        • The hardware implementation takes over the tasks of loading and unloading state from its physical locations.
  • 32. Virtual Machine Control Structure (VMCS)
    • Control Structures in Memory
      • Only one VMCS active per virtual processor at any given time
    • VMCS Payload
      • VM execution, VM exit, and VM entry controls
      • Guest and host state
      • VM-exit information fields
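Putting the pieces together, the control flow VT-x implies for the monitor is a dispatch loop over VM-exit reasons recorded in the VMCS. The sketch below uses simplified stand-ins (a dataclass, string exit reasons), not the actual VMX encodings or instruction semantics:

```python
# Hedged sketch of a VMM dispatch loop: resume the guest, let "hardware"
# run it until a VM exit, then handle the exit reason from the VMCS.

from dataclasses import dataclass, field

@dataclass
class VMCS:
    guest_state: dict = field(default_factory=dict)   # loaded on VM entry
    host_state: dict = field(default_factory=dict)    # restored on VM exit
    exit_reason: str = ""                             # VM-exit information field
    exits_handled: int = 0

def run_guest(vmcs, pending_exits):
    # Stand-in for VMLAUNCH/VMRESUME: "run" the guest until its next exit.
    vmcs.exit_reason = pending_exits.pop(0)

def vmm_loop(vmcs, pending_exits):
    handlers = {
        "CPUID": lambda v: v.guest_state.update(eax=0),  # emulate, then resume
        "IO":    lambda v: None,                         # would call device model
        "HLT":   lambda v: None,                         # guest is finished
    }
    while pending_exits:
        run_guest(vmcs, pending_exits)
        handlers[vmcs.exit_reason](vmcs)
        vmcs.exits_handled += 1
        if vmcs.exit_reason == "HLT":
            break

vmcs = VMCS()
vmm_loop(vmcs, ["CPUID", "IO", "HLT"])
assert vmcs.exits_handled == 3
```

The hardware's contribution, per the slides, is that the guest/host state swap on each entry and exit happens in the processor rather than in VMM software.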
  • 33. ** Case Study: Xen Virtualization
  • 34. Xen Design Principle
    • Support for unmodified application binaries is essential.
    • Supporting full multi-application operating systems is important.
    • Paravirtualization is necessary to obtain high performance and strong resource isolation.
  • 35. Xen Features
    • Secure isolation between VMs
    • Resource Control and QoS
    • Only guest kernel needs to be ported
      • All user-level apps and libraries run unmodified.
      • Linux 2.4/2.6 , NetBSD, FreeBSD, WinXP
    • Execution performance is close to native.
    • Live Migration of VMs between Xen nodes.
  • 36. Xen 3.0 Architecture
  • 37. Xen para-virtualization
    • On Xen/x86, privileged instructions are replaced with Xen hypercalls.
    • Hypercalls
      • Synchronous calls from a domain into Xen; notifications are delivered to domains from Xen using an asynchronous event mechanism.
    • Modify OS to understand virtualized environment
      • Wall-clock time vs. virtual processor time
        • Xen provides both types of alarm timer
      • Expose real resource availability
    • Xen Hypervisor
      • Additional protection domain between guest OSes and I/O devices.
  • 38. X86 Processor Virtualization
    • Xen runs in ring 0 (most privileged)
    • Ring 1,2 for guest OS, 3 for user-space
    • Xen lives in the top 64 MB of the linear address space.
      • Segmentation used to protect Xen as switching page tables too slow on standard X86
    • Hypercalls jump to Xen in ring 0
    • Guest OS may install ‘fast trap’ handler
    • MMU-virtualization : shadow vs. direct-mode
  • 39. Para-virtualizing the MMU
    • Guest OSes allocate and manage their own page tables
      • Hypercalls to change the page-table base.
    • The Xen hypervisor is responsible for trapping accesses to the virtual page table, validating updates, and propagating changes.
    • Xen must validate page-table updates before use
      • Updates may be queued and batch processed
    • Validation rules are applied to each PTE
      • A guest may only map pages it owns
    • XenoLinux implements a balloon driver
      • Adjusts a domain’s memory usage by passing memory pages back and forth between Xen and XenoLinux
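The queue-and-batch scheme with per-PTE validation might be sketched as follows; the Hypervisor and GuestMMU classes are hypothetical, with mmu_update loosely echoing the name of Xen's batched page-table hypercall:

```python
# Sketch of validated, batched page-table updates: the guest queues PTE
# writes and flushes them in one "hypercall"; the hypervisor applies each
# update only if the guest owns the target machine frame.

class Hypervisor:
    def __init__(self, owner_of):
        self.owner_of = owner_of       # machine frame -> owning domain

    def mmu_update(self, domain, batch, page_table):
        applied = 0
        for vaddr, frame in batch:
            if self.owner_of.get(frame) == domain:   # validation rule
                page_table[vaddr] = frame
                applied += 1
        return applied

class GuestMMU:
    def __init__(self, xen, domain):
        self.xen, self.domain = xen, domain
        self.page_table, self.queue = {}, []

    def set_pte(self, vaddr, frame):
        self.queue.append((vaddr, frame))   # queue instead of trapping per write

    def flush(self):
        # One hypercall validates and applies the whole batch.
        n = self.xen.mmu_update(self.domain, self.queue, self.page_table)
        self.queue = []
        return n

xen = Hypervisor(owner_of={10: "dom1", 11: "dom1", 12: "dom2"})
mmu = GuestMMU(xen, "dom1")
mmu.set_pte(0x1000, 10)
mmu.set_pte(0x2000, 12)                 # dom2's frame: must be rejected
assert mmu.flush() == 1                 # only the owned mapping was applied
assert mmu.page_table == {0x1000: 10}
```

Batching amortizes the hypercall cost over many PTE writes, while the ownership check preserves isolation between domains.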
  • 40. MMU virtualization
  • 41. Writable Page Tables
  • 42. I/O Architecture
    • Asynchronous buffer-descriptor rings
      • Using shared memory
    • Xen I/O spaces delegate to guest OSes protected access to specified hardware devices.
    • The guest OS passes buffer information vertically through the system.
    • Xen performs validation checks.
    • Xen supports a lightweight event-delivery mechanism which is used for sending asynchronous notifications to a domain.
  • 43. Data Transfer : I/O Descriptor Rings
  • 44. Device Channel Interface
  • 45. Performance
  • 46. Thank You !