Computer Architecture course lecture on Virtualization and ...
Published in: Technology
  • 1. Virtualization Susanta K. Nanda ECSL CSE 502, Fall’05
  • 2. Virtualization at the Hardware Level
    • Observation
      • Hardware resources are typically under-utilized
      • Hardware resources directly relate to cost
    • Goal: Improve hardware utilization
    • How?
      • Share hardware resources across multiple machines
      • May make sense for network attached storage, but what about processor, memory, etc.?
    • Theme
      • Decouple machine from hardware
    • Virtual Machine (VM)
      • A machine decoupled from the hardware, i.e. does not necessarily correspond to the hardware
      • Multiple “Virtual Machines” on the same physical host could share the underlying hardware
      • First VM: IBM System/360 Model 40 VM [1965]
  • 3. Virtual Machine Monitor (VMM)
    • A thin layer of software on top of the bare machine to facilitate virtualization of hardware resources
    • Mediates between VMs and the hardware
    • Manages VMs
      • Create, Destroy, Power Off/On, etc.
    • Concerns
      • Isomorphism: State transitions must be isomorphic to those of a physical machine
      • Isolation: Each VM is isolated from all others
      • Performance: Close-to-native
      • Correctness: Present exactly the same hardware interface to the guest OS, so that commodity OSes run without any modification
  • 4. A Stolen Picture
  • 5. VM: Additional Advantages
    • Non-existing hardware
      • Virtual devices through emulation via a combination of software and other available devices
      • Example: SCSI-disk using IDE-disks, (virtual) timer
      • Use: Legacy systems/software
    • Hides heterogeneity of the underlying hardware
      • Ability to switch hardware vendors
    • Mobility
      • Decoupling helps move a VM from one physical host to another, just like a file
      • Use: Server consolidation, hardware maintenance, etc.
    • OS Debugging, Mixed OS, Event monitoring, Execution Undo, and Many more…
  • 6. Key Concepts: Appearance
    • A VM consists of Shared and Dedicated Hardware
      • Shared: Disk, Memory, NIC, CPU, Printer, etc
      • Dedicated: Keyboard, Mouse, Display, Speakers, CD-Drive, etc
      • A server VM may not require some dedicated devices
    • Dedicated hardware
      • Per User
      • Sharable across multiple VMs if they belong to the same user
  • 7. Key Concepts: State Management
    • Each VM would have its own architected state information
      • Example: registers/memory/disks, page table/TLB
    • Not always possible to map all architected state to its natural level in the host
      • Insufficient/Unavailable host resources
      • Example: Registers of a VM may be architected using main memory in the host
    • VMs keep getting switched in/out by the VMM
      • “Isomorphism” requires all state transitions to be performed on the VM states
      • “Performance” requires efficient state management
    • State Management: Indirection Vs. Copying
  • 8. Key Concepts: State Management cont’d
    • Indirection
      • Hold state for each VM in fixed locations in the host’s memory hierarchy
      • A pointer managed by VMM indicating the guest state that is currently active
      • Example: Register block maintained in memory and a processor register pointing to the register block of the currently active VM
      • Pros: Ease of management
      • Cons: Inefficient (e.g. mov eax, ebx requires 2 instructions)
    • Copying
      • Copy VM’s state information to its natural level in memory hierarchy when switched in
      • Copy them back to the original place when switched out
      • Example: Copy all the VM registers to the processor registers
      • Pros: Efficient (most instructions are executed natively)
      • Cons: Copying overhead
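The two state-management strategies above can be contrasted in a minimal Python sketch. The class names, the four-register set, and the dictionary standing in for the physical register file are all illustrative assumptions, not part of the lecture.

```python
from dataclasses import dataclass, field

REG_NAMES = ("eax", "ebx", "ecx", "edx")  # toy register set (assumption)


@dataclass
class GuestState:
    """Register block for one VM, held in host memory."""
    regs: dict = field(default_factory=lambda: {r: 0 for r in REG_NAMES})


class IndirectionVMM:
    """Guest registers stay in memory; a pointer selects the active block."""

    def __init__(self, guests):
        self.guests = guests
        self.active = None  # plays the role of the register pointing at the block

    def switch_in(self, name):
        self.active = self.guests[name]  # only the pointer changes

    def mov(self, dst, src):
        # Two steps per guest move: load from the block, then store to it.
        value = self.active.regs[src]
        self.active.regs[dst] = value


class CopyingVMM:
    """Guest registers are copied to the 'physical' registers on switch-in."""

    def __init__(self, guests):
        self.guests = guests
        self.cpu = {r: 0 for r in REG_NAMES}  # stands in for the real registers
        self.current = None

    def switch_in(self, name):
        if self.current is not None:
            # Copy the outgoing VM's registers back to its block.
            self.guests[self.current].regs = dict(self.cpu)
        self.cpu = dict(self.guests[name].regs)
        self.current = name

    def mov(self, dst, src):
        self.cpu[dst] = self.cpu[src]  # runs directly on the register file
```

With indirection a switch is cheap but every register access goes through memory; with copying the switch pays the copy cost and ordinary instructions run unmodified.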
  • 9. Key Concepts: Resource Control
    • VMM must maintain overall control of the hardware resources
      • Hardware resources are assigned to VMs when they are created/executed
      • Should have a way to get them back when they need to be assigned to a different VM
      • Similar to multi-programming in OS
    • Privileged Resources
      • Certain resources are accessible only to and managed by VMM
      • Interrupts relating to such resources must then be handled by VMM
      • Privileged resources are emulated by VMM for the VM
      • Example : interval timer
    • All resources that could help maintain control are marked privileged
      • “Interval timer” is used to decide VM scheduling
      • “Page table base register” (CR3 on x86) is used to isolate VM memory
    • Issues: VM scheduling (An ideally fair scheduling may not be good)
  • 10. Key Concepts: Native/Hosted VMs
    • Native VMs
      • VMM is installed on the bare machine, no host OS
      • All other VMs are then created through the VMM
      • Pros: Clean Architecture, Efficient
      • Cons: Complicated VMM due to device drivers
      • Example: VMware ESX Server
    • Hosted VMs
      • VMM is installed on top of a host OS
      • User-mode: VMM runs in non-privileged mode
      • Dual-mode: VMM runs partly in privileged mode (as a driver on the host OS) and partly in unprivileged mode (like an application)
      • Pros: VMM uses drivers in the host OS for I/O → Thin VMM
      • Cons: Inefficient for I/O intensive applications
      • Example: Microsoft Virtual Server
  • 11. Processor Virtualization
    • Privilege Levels/Rings
      • System/User mode
    • System ISA vs. User ISA
    • Emulation
      • Guest ISA may differ from Host ISA
      • Binary translation
      • Slower
    • Native Execution
      • Guest and Host ISA must be the same
      • Some “critical” instructions may still need to be emulated
      • Issues: Complexity of discovering and emulating critical instructions efficiently
  • 12. ISA Virtualizability
    • Privileged Instructions (PI)
      • Instructions that generate a trap when executed in any but most-privileged level
      • Example: LIDT (load interrupt descriptor table)
    • Sensitive Instructions (SI)
      • Instructions whose behavior depends on the current privilege level
      • Example: POPF (pops the stack to EFLAGS)
        • In user mode, the “Interrupt Enable” bit of the EFLAGS register is not over-written
        • In system mode, the value is blindly copied
    • Popek/Goldberg Theorem
      • For any conventional third-generation computer, a virtual machine monitor may be constructed if the set of sensitive instructions for that computer is a subset of the set of privileged instructions.
      • In other words, an ISA is virtualizable if SI is a subset of PI (the theorem states a sufficient condition)
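The subset condition can be checked mechanically once instructions are classified. The sketch below uses a toy classification built from the two examples on this slide (LIDT and POPF); the function names are hypothetical, and the sets are not a complete x86 listing.

```python
def critical_instructions(sensitive, privileged):
    """Instructions that are sensitive but do not trap (not privileged)."""
    return set(sensitive) - set(privileged)


def satisfies_popek_goldberg(sensitive, privileged):
    """True when every sensitive instruction is also privileged."""
    return set(sensitive) <= set(privileged)


# Toy classification (assumption): LIDT traps outside ring 0, POPF does not.
privileged = {"LIDT"}
sensitive = {"LIDT", "POPF"}

print(satisfies_popek_goldberg(sensitive, privileged))
print(critical_instructions(sensitive, privileged))
```

Any instruction returned by `critical_instructions` is one the VMM cannot rely on the hardware to trap.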
  • 13. When ISA is not Virtualizable?
    • All is not lost if an ISA violates Popek/Goldberg theorem
      • However, it brings additional complexity and inefficiency to the VMM implementation
    • Critical instructions:
      • Instructions that are sensitive but not privileged
      • x86 has 17 critical instructions
      • All critical instructions must be emulated by VMM
    • VMM Components
      • Binary Scanner: Inspects and inserts trap at critical instructions
      • Dispatcher: Gets control when a trap occurs
      • Allocator: Allocates machine resources (e.g. load relocation bounds register)
      • Interpreters: Each interpreter interprets one privileged instruction
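A rough sketch of how the scanner, dispatcher, and interpreters could fit together is given below. It assumes a toy instruction stream of `(opcode, operand)` tuples and a single critical instruction (POPF); `ToyVMM` and its methods are hypothetical names, and the allocator is left out.

```python
CRITICAL = {"POPF"}  # toy set; the slide counts 17 such instructions on x86
IF_BIT = 9           # position of the Interrupt Enable flag in EFLAGS


def scan(code):
    """Binary scanner: wrap each critical instruction in a trap marker."""
    return [("TRAP", ins) if ins[0] in CRITICAL else ins for ins in code]


class ToyVMM:
    def __init__(self):
        self.virtual_if = 1  # the guest's virtual Interrupt Enable flag
        self.native_log = []  # instructions that would run natively
        # One interpreter routine per emulated instruction.
        self.interpreters = {"POPF": self.emulate_popf}

    def emulate_popf(self, popped_value):
        # Update the guest's virtual flag instead of the real one.
        self.virtual_if = (popped_value >> IF_BIT) & 1

    def dispatch(self, trapped):
        """Dispatcher: gets control on a trap and picks an interpreter."""
        opcode, operand = trapped
        self.interpreters[opcode](operand)

    def run(self, code):
        for ins in scan(code):
            if ins[0] == "TRAP":
                self.dispatch(ins[1])
            else:
                self.native_log.append(ins)


vmm = ToyVMM()
vmm.run([("MOV", 1), ("POPF", 0x000), ("ADD", 2)])
print(vmm.virtual_if, vmm.native_log)
```

Only the scanned-out instruction reaches the VMM; everything else is left to run as-is, which is what keeps native execution cheap.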
  • 14. Memory Virtualization
    • Virtual memory support in traditional architectures
      • Architected TLB vs. Architected Page Table
      • Page-fault and Swap
      • One level of indirection: Page Table
    • VMM requires two levels of indirection
      • Virtual Memory to Real Memory: Page Table (Guest OS)
      • Real Memory to Physical Memory: Real Map Table (VMM)
    • Architected Page Table
      • Additional Data Structures
        • Real Map Table (VMM)
        • Shadow Page Tables (VMM): Used by hardware for address translation, directly maps virtual address to physical (not real) address
      • Maintenance:
        • VMM intercepts and emulates Page table modifications, Page table base register modifications by the Guest OS
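The relationship between the guest page table, the real map table, and the shadow page table can be sketched as a composition of two mappings. The dictionaries, the function names, and the 4 KiB page size below are illustrative assumptions.

```python
PAGE_SIZE = 4096  # assumed page size


def build_shadow(guest_page_table, real_map):
    """Compose virtual->real (guest OS) with real->physical (VMM).

    Real pages with no entry in real_map (e.g. not resident) are left
    out, so an access to them would fault into the VMM.
    """
    shadow = {}
    for vpage, rpage in guest_page_table.items():
        if rpage in real_map:
            shadow[vpage] = real_map[rpage]
    return shadow


def translate(shadow, vaddr):
    """Translate a virtual address using only the shadow table."""
    vpage, offset = divmod(vaddr, PAGE_SIZE)
    return shadow[vpage] * PAGE_SIZE + offset


guest_pt = {0: 5, 1: 7}    # virtual page -> real page (guest OS view)
real_map = {5: 42, 7: 13}  # real page -> physical page (VMM view)
shadow = build_shadow(guest_pt, real_map)
print(shadow, translate(shadow, PAGE_SIZE + 16))
```

Because the shadow table depends on both inputs, the VMM has to rebuild or patch it whenever the guest changes its page table or page table base register, which is why those modifications are intercepted.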
  • 15. Memory Virtualization contd
    • Architected TLB
      • Virtual TLB: maintained by guest OS
        • Virtual ASID, Virtual Page, Real Page
      • Real TLB: maintained by VMM
        • Real ASID, Virtual Page, Physical Page
      • ASID map table
        • Virtual ASID, Real ASID
      • VMM intercepts/emulates all modifications to TLB by the guest OS
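For the architected-TLB case, the bookkeeping above might look like the following sketch. It assumes the VMM keys the ASID map by `(vm, virtual ASID)` so that two guests using the same virtual ASID get distinct real ASIDs; the class and method names are hypothetical.

```python
class TLBVirtualizer:
    def __init__(self, real_map):
        self.real_map = real_map  # (vm, real page) -> physical page
        self.asid_map = {}        # (vm, virtual ASID) -> real ASID
        self.next_real_asid = 0
        self.virtual_tlb = {}     # (vm, virtual ASID, virtual page) -> real page
        self.real_tlb = {}        # (real ASID, virtual page) -> physical page

    def real_asid(self, vm, vasid):
        """Look up, or allocate, the real ASID for a guest's virtual ASID."""
        key = (vm, vasid)
        if key not in self.asid_map:
            self.asid_map[key] = self.next_real_asid
            self.next_real_asid += 1
        return self.asid_map[key]

    def guest_tlb_write(self, vm, vasid, vpage, rpage):
        """Emulate an intercepted TLB write from the guest OS."""
        # Record what the guest believes it wrote.
        self.virtual_tlb[(vm, vasid, vpage)] = rpage
        # Install the translated entry the hardware will actually use.
        rasid = self.real_asid(vm, vasid)
        self.real_tlb[(rasid, vpage)] = self.real_map[(vm, rpage)]
```

A guest reading its TLB back would be served from `virtual_tlb`, while address translation uses `real_tlb`.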
  • 16. I/O Virtualization
    • Virtualizing Devices
      • Dedicated Devices: Display, Keyboard, Mouse, etc.
      • Partitioned Devices: Disk
      • Shared Devices: Network adapter
      • Spooled Devices: Printer
      • Non-existent Physical Devices: virtual network adapter
    • Virtualizing I/O Operations
      • Intercepting/emulating IN/OUT, INS/OUTS
      • Map virtual resource ID to physical device ID
      • De-multiplexing the interrupts for the devices
    • Virtualizing I/O in Hosted VMM
      • VMM-driver translates I/O instructions back to system calls in the host OS
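As one concrete case of mapping a virtual resource ID to a physical one, a partitioned disk could be handled as sketched below. The contiguous-allocation scheme and the names `PartitionedDisk`, `attach`, and `to_physical` are assumptions made for illustration only.

```python
class PartitionedDisk:
    """Gives each VM a contiguous slice of one physical disk."""

    def __init__(self, total_blocks):
        self.total_blocks = total_blocks
        self.partitions = {}  # vm -> (first physical block, block count)
        self.next_free = 0

    def attach(self, vm, blocks):
        if self.next_free + blocks > self.total_blocks:
            raise ValueError("not enough free blocks")
        self.partitions[vm] = (self.next_free, blocks)
        self.next_free += blocks

    def to_physical(self, vm, virtual_block):
        """Map a VM's virtual block number to a physical block number."""
        start, count = self.partitions[vm]
        if not 0 <= virtual_block < count:
            # Isolation: a VM may not reach outside its own partition.
            raise IndexError("virtual block outside this VM's partition")
        return start + virtual_block


disk = PartitionedDisk(total_blocks=100)
disk.attach("vm1", 40)
disk.attach("vm2", 60)
print(disk.to_physical("vm2", 0))
```

Each guest sees block numbers starting at zero, and the bounds check is what keeps one VM's I/O from touching another VM's data.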
  • 17. Performance Degradation in VMMs
    • Setup: VM State initialization
    • Emulation: Emulating critical instructions
    • Interrupt Handling
      • Interrupts generated by a program within a VM must first be handled by the VMM, even when that is not strictly required
    • State Saving: During world switches
    • Bookkeeping: Timers, etc
    • Time Elongation: Memory references take longer
  • 18. VT-x: Vanderpool Technology
    • VMX Mode for Processors
      • VMX Root and VMX Non-root
      • All four privilege levels (rings) are available in both root and non-root in VMX mode
        • Thus, four new, less-privileged levels compared with earlier Pentium processors
      • Guest VMs can run in VMX non-root
      • Host (Hosted VMM) and VMM in VMX root
    • VMX instructions
      • VMX root has access to a new set of instructions
      • Critical shared resources are kept under the control of a monitor in VMX root
      • VMX non-root ring 0 does not have access to the critical resources
      • An example of a critical resource: Memory for state management
  • 19. An Example Operation
    • VMXON: Switch into VMX mode: To VMM
    • VMLAUNCH VM1 : Start executing VM1 in VMX non-root operation
    • VM1 Exits: Go back to VMM
    • VMLAUNCH VM2: Start executing VM2
    • VM2 Exits: Go back to VMM
    • VMRESUME VM2: Switch to VM2 again
    • VM2 Exits: Go back to VMM
    • VMRESUME VM2 : Switch to VM2
    • VMRESUME VM1: VM2 exits, VM1 switched in
    • VM1 exits: Go back to VMM
    • VMXOFF : Get back to Regular mode
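The sequence above can be traced with a small state machine. This is a toy model of the mode transitions only: the `ToyVMX` class is hypothetical, and the rule that a VM is launched once and resumed afterwards is modelled with a simple set rather than any real processor state.

```python
class ToyVMX:
    """Tracks VMX mode transitions only; no real CPU state is modelled."""

    def __init__(self):
        self.mode = "regular"  # "regular", "root" or "non-root"
        self.launched = set()  # VMs that have been launched at least once
        self.running = None    # VM currently executing in non-root mode
        self.trace = []        # who held the processor, in order

    def _require(self, mode):
        if self.mode != mode:
            raise RuntimeError(f"expected {mode} mode, currently {self.mode}")

    def _enter(self, vm):
        self.mode, self.running = "non-root", vm
        self.trace.append(vm)

    def vmxon(self):
        self._require("regular")
        self.mode = "root"
        self.trace.append("VMM")

    def vmlaunch(self, vm):
        self._require("root")
        if vm in self.launched:
            raise RuntimeError("already launched; use vmresume")
        self.launched.add(vm)
        self._enter(vm)

    def vmresume(self, vm):
        self._require("root")
        if vm not in self.launched:
            raise RuntimeError("not launched yet; use vmlaunch")
        self._enter(vm)

    def vm_exit(self):
        self._require("non-root")
        self.mode, self.running = "root", None
        self.trace.append("VMM")

    def vmxoff(self):
        self._require("root")
        self.mode = "regular"


cpu = ToyVMX()
cpu.vmxon()
cpu.vmlaunch("VM1"); cpu.vm_exit()
cpu.vmlaunch("VM2"); cpu.vm_exit()
cpu.vmresume("VM2"); cpu.vm_exit()
cpu.vmresume("VM1"); cpu.vm_exit()
cpu.vmxoff()
print(cpu.trace)
```

Every switch between guests passes through the VMM in root mode, so control always returns to the monitor between two VMs.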
  • 20. Maintenance of State
    • VMCS Data Structure
      • Fully specified, various fields defined
      • Manipulated only by hardware or software in VMX-root
      • VMPTR points to the VMCS structure of the currently executing VM
      • There can be multiple VMs active at any point, but only one of them is executing
      • VMWRITE/VMREAD to write/read the contents of the VMCS
      • State: More than normal, e.g. architecturally hidden part of segment registers
    • Control Fields: Define under what condition a VM exits
      • Example: Some specific interrupt/instruction/etc, number of model-specific registers (MSRs) that need to be saved when VM exits
    • VM exit info
      • Informs the VMM of the reason for the exit, along with supporting info
  • 21. Maintenance of State contd
    • State Area
      • Guest State: Register state, Interruptibility state
      • Host State: Register State
    • Control Area
      • VM Execution Controls
        • Pin/Processor-based execution controls, bitmap fields, etc
      • VM Exit Controls
        • Control bitmap, MSR Controls
      • VM Entry Controls
        • Control bitmap, MSR Controls, Controls for Event Injection
    • VM Exit Information
      • Basic Info: VM-Exit Info, Vectoring Event Info
      • Other Exit Info: Due to event delivery, due to instruction execution
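The VMCS layout described on the last two slides can be pictured as a record with one area per group of fields. The sketch below is a simplification under stated assumptions: `ToyVMCS` and `ToyProcessor` are hypothetical, fields are addressed by `(area, name)` rather than by any real field encoding, and a non-root access simply raises an error.

```python
from dataclasses import dataclass, field

AREAS = ("guest_state", "host_state", "execution_controls",
         "exit_controls", "entry_controls", "exit_info")


@dataclass
class ToyVMCS:
    """One control structure per VM, split into the areas on the slides."""
    guest_state: dict = field(default_factory=dict)
    host_state: dict = field(default_factory=dict)
    execution_controls: dict = field(default_factory=dict)
    exit_controls: dict = field(default_factory=dict)
    entry_controls: dict = field(default_factory=dict)
    exit_info: dict = field(default_factory=dict)


class ToyProcessor:
    def __init__(self):
        self.in_vmx_root = True
        self.vmptr = None  # VMCS of the currently executing VM

    def _require_root(self):
        if not self.in_vmx_root:
            raise PermissionError("VMCS is only accessible from VMX root")

    def load_vmcs(self, vmcs):
        """Point VMPTR at another VM's VMCS."""
        self._require_root()
        self.vmptr = vmcs

    def vmwrite(self, area, name, value):
        self._require_root()
        if area not in AREAS:
            raise KeyError(area)
        getattr(self.vmptr, area)[name] = value

    def vmread(self, area, name):
        self._require_root()
        if area not in AREAS:
            raise KeyError(area)
        return getattr(self.vmptr, area)[name]
```

Several `ToyVMCS` instances can exist at once, one per active VM, while `vmptr` selects the single one that reads and writes apply to.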