Published in: Technology


  • 1. Virtualization Extensions for Current Processors. Shibdas Bandyopadhyay, Dept. of CISE, University of Florida
  • 2. Outline
    • Virtualization
    • Problem with the x86 architecture
    • Intel VT extensions
    • How it solves the problems
    • Future Extensions
  • 3. Virtualization
    • Virtualization is a framework or methodology of dividing the resources of a computer into multiple execution environments, by applying one or more concepts or technologies such as hardware and software partitioning, time-sharing, partial or complete machine simulation, emulation, quality of service and many others (from Intel Technology Journal, Volume 10, Issue 2)
    • Implemented long ago by IBM, mainly for mainframes. With the emergence of many-core CMPs, it has become an efficient way to utilize hardware resources
    • Useful for consolidating the workloads of several under-utilized servers onto a single machine running multiple virtual machines
    • Secure and isolated execution environment
    • and many more…
  • 4. Virtualization Strategies
    • Pure Virtualization – The architecture is implemented in software. Instructions can be passed straight down to the hardware if they are “virtualization-aware”, but the VMM needs to monitor every executing instruction. Example – VMware
    • Para-virtualization – A thin microkernel (the hypervisor) provides an abstract view of the hardware. A “sensitive” instruction that is not “virtualization-aware” should trap to the VMM so that it can react accordingly. Example – Xen
    • The second approach is more efficient, but because not all sensitive instructions generate a trap (as discussed later), the guest operating system has to be modified to make it work
  • 5. Problem with the x86 architecture
    • Traditionally designed to run a single operating system
    • “Ring De-privileging” – The guest OS cannot run in ring 0, because the VMM must keep direct control of the hardware. The VMM uses ring de-privileging to run the guest OS in ring 1 or ring 3
    • “Ring Aliasing” – Software runs at a privilege level it was not designed for and, moreover, is able to detect that. Example – the IA-32 PUSH CS instruction saves the contents of CS (which includes the current privilege level) on the stack, so the guest OS can easily determine that it is not running at the privilege level it expects
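The ring-aliasing check described above can be sketched in a few lines. This is only an illustration: the selector values used in the usage note are hypothetical, and the function names are not from the slides. It relies on the fact that the low two bits of a segment selector carry the privilege level.

```python
# Illustrative sketch: how a guest kernel could notice ring aliasing
# from the CS value that PUSH CS leaves on the stack.
PRIVILEGE_MASK = 0x3  # low two bits of a selector hold the privilege level


def privilege_level(selector: int) -> int:
    """Return the privilege level encoded in the low two bits of a selector."""
    return selector & PRIVILEGE_MASK


def guest_detects_deprivileging(pushed_cs: int, expected_ring: int = 0) -> bool:
    """A guest kernel expects ring 0; any other level reveals de-privileging."""
    return privilege_level(pushed_cs) != expected_ring
```

For instance, with hypothetical selectors 0x08 (ring 0) and 0x09 (ring 1), `guest_detects_deprivileging(0x09)` is true while `guest_detects_deprivileging(0x08)` is false.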
  • 6. Problem – Address Space Compression
    • OSs expect to have access to the processor’s full address space
    • A VMM must reserve part of the guest’s virtual address space for the control data structures required for transitions between the VMM and the guest. The VMM itself may also reside in the guest address space for easy communication with the guest OS
    • On IA-32 processors, these control structures include the IDT and the GDT
    • VMM integrity can be compromised if guests are able to access these structures, so the VMM must protect this region, and any guest access to it should cause a transition to the VMM
  • 7. Problem – Non-faulting sensitive instructions
    • Some sensitive instructions access privileged state without generating a fault, so the VMM has no way to trap them and respond accordingly
    • For example, the IA-32 registers GDTR, IDTR, and LDTR contain data that governs CPU operation
    • Writes (loads) to these registers from non-privileged code generate a fault, but reads are allowed without privilege
    • Because the VMM uses these registers, a guest that reads them can get values it does not expect and can also determine that it is running in a virtualized environment
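One way to picture the problem is as a table of instructions, each marked as sensitive or not and as faulting outside ring 0 or not. The sketch below is an assumption-laden illustration: the table is a small hand-picked sample (the load and store instructions for GDTR, IDTR, and LDTR, plus ADD as a harmless control), and it reflects the pre-VT-x behavior the slide describes. It lists the instructions a trap-and-emulate VMM cannot intercept.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instruction:
    name: str
    sensitive: bool          # reads or writes privileged processor state
    faults_outside_ring0: bool  # raises a fault when executed without privilege


# Hand-picked sample for illustration, not a complete list.
INSTRUCTIONS = [
    Instruction("LGDT", sensitive=True, faults_outside_ring0=True),
    Instruction("LIDT", sensitive=True, faults_outside_ring0=True),
    Instruction("LLDT", sensitive=True, faults_outside_ring0=True),
    Instruction("SGDT", sensitive=True, faults_outside_ring0=False),
    Instruction("SIDT", sensitive=True, faults_outside_ring0=False),
    Instruction("SLDT", sensitive=True, faults_outside_ring0=False),
    Instruction("ADD", sensitive=False, faults_outside_ring0=False),
]


def untrappable(instructions):
    """Names of sensitive instructions that never fault, so a VMM cannot trap them."""
    return [i.name for i in instructions
            if i.sensitive and not i.faults_outside_ring0]
```

With this sample table, `untrappable(INSTRUCTIONS)` returns the three store instructions, which correspond to the unprivileged reads of GDTR, IDTR, and LDTR mentioned above.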
  • 8. Problem – Guest System Calls
    • Ring de-privileging may interfere with the facilities that enable fast transitions to OS software. The IA-32 SYSENTER and SYSEXIT instructions support low-latency system calls
    • SYSENTER always causes a transition to privilege level 0, and SYSEXIT faults if executed outside ring 0
    • Execution of SYSENTER by a guest application results in a transition to the VMM instead of the guest OS
    • Execution of SYSEXIT by the guest OS causes a fault. The VMM must therefore emulate every SYSENTER and SYSEXIT, which degrades performance because system calls are frequent
  • 9. Problem – Interrupt Virtualization
    • An operating system masks interrupts when it is not ready to receive them by modifying the interrupt flag (IF) in the EFLAGS register
    • The VMM is likely to manage interrupts itself, so it needs to prevent the guest OS from masking and unmasking them directly
    • Because an OS masks and unmasks interrupts frequently, intercepting every such attempt means a trap to the VMM each time
    • It gets more complicated when the VMM wants to deliver a virtual interrupt
  • 10. Other Problems
    • “Hidden States” – Some components of IA-32 processor state do not correspond to any software-accessible register, so the VMM cannot save and restore them when switching VMs
    • Example – the IA-32 segment descriptor cache. Loading a segment register copies the referenced segment descriptor from the GDT or LDT into this cache, which is not accessible through any register
    • “Ring Compression” – In 64-bit mode (Intel EM64T), paging must be used. Paging does not distinguish among rings 0, 1, and 2, so the guest OS has to run in ring 3. That puts it at the same privilege level as guest applications, and it is no longer protected from them
    • Frequent access to privileged resources – The VMM must guard accesses to resources such as the Task Priority Register (TPR), which controls interrupt priority. Intercepting these frequent accesses causes a performance drop
  • 11. Solution – Architecture Extension
    • Intel VT-x extensions for IA-32 and VT-i extensions for the Itanium architecture
    • Two new forms of CPU operation – VMX root operation (intended for the VMM) and VMX non-root operation (for guest VMs)
    • VMX root operation is very similar to IA-32 without VT-x. Both root and non-root operation support all four privilege levels
    • A transition from VMX root operation to VMX non-root operation is called a VM entry, and a transition from non-root to root operation is called a VM exit
    • Entries and exits are managed through the VMCS (Virtual Machine Control Structure), which includes a guest-state area and a host-state area
    • A VM entry loads processor state from the guest-state area. A VM exit saves the current processor state to the guest-state area and loads processor state from the host-state area
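The entry and exit rules above can be modeled with a toy simulation. This is a sketch under strong simplifying assumptions: processor state is just a dictionary, and the `Vmcs` and `Cpu` classes are hypothetical names invented here, not the hardware layout or any real API.

```python
from dataclasses import dataclass, field


@dataclass
class Vmcs:
    """Toy VMCS holding only the two state areas named in the slide."""
    guest_state: dict = field(default_factory=dict)
    host_state: dict = field(default_factory=dict)


class Cpu:
    def __init__(self, state):
        self.state = dict(state)  # current processor state (simplified)
        self.root_mode = True     # start in VMX root operation (the VMM)

    def vm_entry(self, vmcs):
        # VM entry: load processor state from the guest-state area.
        self.state = dict(vmcs.guest_state)
        self.root_mode = False

    def vm_exit(self, vmcs):
        # VM exit: save current state to the guest-state area,
        # then load processor state from the host-state area.
        vmcs.guest_state = dict(self.state)
        self.state = dict(vmcs.host_state)
        self.root_mode = True
```

A round trip shows the intended behavior: after `vm_entry` the CPU holds the guest's state, any change the guest makes (for example to a `cr3` entry) is captured in `vmcs.guest_state` on `vm_exit`, and the VMM's own state comes back from `vmcs.host_state`.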
  • 12. Intel VT-x
    • Processor operation changes substantially in VMX non-root operation. Many instructions and events cause VM exits
    • Some instructions (e.g. INVD) cause VM exits unconditionally and therefore cannot be executed in non-root operation, while others (e.g. INVLPG) can be configured to cause VM exits conditionally, based on the VM-execution control fields in the VMCS
    • Guest-state area – contains fields for registers that manage processor operation and must be reloaded on a VM exit for the VMM to run properly, such as the segment registers, CR3, and IDTR
    • It also covers the “hidden state” mentioned earlier. For example, it records the processor’s interruptibility state (whether interrupts are masked, and whether NMIs are masked because an NMI is being handled)
    • It has no fields for the general-purpose registers (GPRs), because software can save and restore those more efficiently
  • 13. Intel VT-x – VM Execution Control Fields
    • The VMCS contains a number of fields that specify which instructions and events cause VM exits in VMX non-root operation. It also includes controls that support interrupt virtualization
    • “External-interrupt exiting” – When this is set, all external interrupts cause VM exits, and the guest cannot mask them
    • “Interrupt-window exiting” – When this is set, a VM exit occurs when the guest is ready to receive interrupts
    • “Use TPR shadow” – When this is set, accesses to the APIC’s TPR through control register CR8 are handled using a TPR shadow referenced by a pointer in the VMCS. The VMCS also includes a TPR threshold, and a VM exit occurs after any instruction that reduces the TPR value below that threshold
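The TPR shadow and threshold can be sketched as follows. This is a simplified model, assuming plain integer priorities and a hypothetical `TprShadow` class; it ignores how the real hardware encodes the TPR fields.

```python
class TprShadow:
    """Toy model of the TPR shadow and TPR threshold held via the VMCS."""

    def __init__(self, threshold: int, value: int = 0):
        self.threshold = threshold  # set by the VMM
        self.value = value          # shadow copy the guest reads and writes

    def guest_read(self) -> int:
        # Reads are served from the shadow; no VM exit is needed.
        return self.value

    def guest_write(self, new_value: int) -> bool:
        """Update the shadow and report whether a VM exit must follow."""
        self.value = new_value
        # Only a value below the threshold needs the VMM's attention.
        return new_value < self.threshold
```

With a threshold of 4, a guest write of 6 updates the shadow without an exit, while a later write of 2 reports that a VM exit is required. Most TPR accesses therefore stay inside the guest.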
  • 14. Intel VT-x – VM Execution Control Fields
    • The VMCS also contains controls for efficient virtualization of the control registers (CRs). A VMM might want to retain control of registers that control paging but
    • The VMCS includes a guest/host mask for each of these registers. The guest can freely modify unmasked bits, but an attempt to write to any masked bit causes a VM exit. For these registers, the VMCS also includes a read shadow, whose value is returned for the masked bits when the guest reads them
    • The VMCS includes bitmaps that let the VMM be selective about which events cause VM exits
    • - The exception bitmap lets the VMM specify which exceptions cause VM exits
    • - The I/O bitmaps let the VMM specify which port accesses cause VM exits
    • - The MSR bitmaps let the VMM specify which model-specific register accesses cause VM exits
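The guest/host mask and read shadow can be expressed as a pair of bitwise rules. The sketch below is illustrative: the function names are made up here, and it assumes that a guest write counts as touching a masked bit when the written value differs from the read shadow in that bit. The bit positions of TS (bit 3) and PG (bit 31) in CR0 are architectural.

```python
CR0_TS = 1 << 3    # Task Switched bit in CR0
CR0_PG = 1 << 31   # Paging bit in CR0


def guest_read_cr(real: int, mask: int, shadow: int) -> int:
    """Masked bits come from the read shadow, unmasked bits from the real register."""
    return (shadow & mask) | (real & ~mask)


def guest_write_cr(real: int, mask: int, shadow: int, value: int):
    """Return (new_real_value, vm_exit_needed) for a guest write of `value`."""
    if (value ^ shadow) & mask:
        # The guest tried to change a bit the VMM owns: exit, leave register as is.
        return real, True
    # Unmasked bits take the guest's value; masked bits keep the VMM's value.
    return (real & mask) | (value & ~mask), False
```

For example, if the VMM masks PG, keeps PG set in the real CR0, and shows the guest a shadow with PG clear, the guest can clear TS without an exit, but an attempt to set PG reports that a VM exit is needed.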
  • 15. Intel VT-x – VMCS & VM Entry
    • The VMCS is referenced by physical address, so it does not need to live in the guest’s virtual address space, which may differ from the VMM’s
    • The VMCS layout is not architecturally defined, which leaves room for implementation-specific optimizations that improve performance
    • A VM entry loads processor state from the guest-state area of the VMCS [Note: because this includes CR3 (which holds the base of the page tables on IA-32), the VMM and the guest can be in different address spaces]
    • A VM entry also lets the VMM inject events. The CPU uses the guest’s IDT to deliver them, so virtual interrupts can be delivered this way
  • 16. Intel VT-x – VM Exit
    • VM exits save processor state into the guest-state area and then load processor state from the host-state area
    • Each VM exit is required to save detailed information into the VMCS, for example, which instruction caused the VM exit. It may also record a detailed exit qualification specifying various arguments of the instruction
    • Each VM exit caused by an IA-32 exception also saves, in addition to the exception itself, information about any event that was being delivered at that time. This helps the VMM virtualize nested exceptions
  • 17. Using Intel VT
    • “Address-Space Compression” – Transitions between guest software and the VMM can change the virtual address space, so the guest gets full use of its own virtual address space
    • “Ring Compression and Ring Aliasing” – Guest software can use all privilege levels, so these problems do not occur
    • “Non-faulting access to privileged state” – Such accesses now result in a VM exit, returning control to the VMM. Also, some privileged state, such as the GDT, LDT, and IDT, is controlled entirely by the guest, so accesses to it do not need to transfer control to the VMM
    • “Guest System Calls” – The problem no longer arises, because the guest has all privilege levels available
  • 18. Using Intel VT
    • “Interrupt Virtualization” – With the external-interrupt exiting control in the VMCS, a VM exit is forced whenever an external interrupt occurs. The VMM can also set the interrupt-window exiting control when it has a virtual interrupt to deliver
    • “Access to hidden state” – Hidden state is included in the guest-state area, which is saved on every VM exit and loaded on every VM entry, so it is saved and restored properly
    • “Frequent access to privileged resources” – TPR accesses are governed by the VMCS, and a VM exit occurs only when the TPR goes below a threshold stored in the VMCS. This results in fewer VM exits
  • 19. Using Intel VT – An example
    • Lazy floating-point state processing
    • - The OS does not load floating-point state during a context switch, because most processes do not use the floating-point hardware
    • - The OS sets the TS bit in CR0, so that any floating-point access raises an exception
    • - Any attempt to use the floating-point unit then raises a device-not-available (#NM) fault, which the OS handles by loading the floating-point state and clearing TS
    • The VMM can do the same for the guest OS:
    • - Set the TS bit in CR0 so that any floating-point access raises an exception
    • - Set the bit in the VMCS exception bitmap so that this exception causes a VM exit
    • - Set the TS bit in the CR0 guest/host mask so that any guest attempt to modify TS causes a VM exit
    • Handle the resulting VM exits by restoring the guest’s floating-point state and clearing the bits that were set in the VMCS
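The lazy floating-point flow on the VMM side can be simulated in a few lines. This is a minimal sketch, not a real VMM: the `LazyFpuVmm` class and its fields are hypothetical, and the guest's own view of TS is left out. It does rely on two architectural facts: TS is bit 3 of CR0, and the device-not-available exception (#NM) is vector 7.

```python
CR0_TS = 1 << 3   # Task Switched bit in CR0
NM_VECTOR = 7     # device-not-available exception (#NM)


class LazyFpuVmm:
    """Toy model of a VMM deferring the guest's floating-point state load."""

    def __init__(self):
        self.real_cr0 = 0
        self.exception_bitmap = 0
        self.cr0_mask = 0
        self.guest_fpu_loaded = False

    def arm(self):
        """Switch to a guest without loading its floating-point state."""
        self.real_cr0 |= CR0_TS                  # FP access will raise #NM
        self.exception_bitmap |= 1 << NM_VECTOR  # #NM causes a VM exit
        self.cr0_mask |= CR0_TS                  # guest changes to TS cause a VM exit
        self.guest_fpu_loaded = False

    def guest_fp_access(self) -> bool:
        """Simulate a guest FP instruction; return True if it caused a VM exit."""
        if (self.real_cr0 & CR0_TS) and (self.exception_bitmap & (1 << NM_VECTOR)):
            self._handle_nm_exit()
            return True
        return False

    def _handle_nm_exit(self):
        # Restore the guest's FP state, then undo the three settings made in arm().
        self.guest_fpu_loaded = True
        self.real_cr0 &= ~CR0_TS
        self.exception_bitmap &= ~(1 << NM_VECTOR)
        self.cr0_mask &= ~CR0_TS
```

After `arm()`, the first floating-point access exits to the VMM, which loads the state and clears its settings. Later accesses run without any exit, which is the point of doing the work lazily.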
  • 21. Future Extensions
    • “ NMI-window exiting” – For NMIs blocked by another NMI or SMI
    • “Virtual Processor Identifier” (VPID) – Similar to an address space identifier (ASID). It is used to tag TLB entries so that they do not have to be flushed on every VM switch
    • “Extended Page Table” (EPT) – Provides another level of page-table translation. The normal page tables translate virtual addresses to guest-physical addresses, and the EPT translates guest-physical addresses to host-physical addresses. This lets the guest OS handle page faults directly without switching to the VMM
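The two-stage translation that EPT adds can be sketched with two lookup tables. This is a deliberately flat model: real page tables and EPT are multi-level structures, and the dictionaries, the 4 KB page size, and the exception names below are assumptions made for illustration.

```python
PAGE_SIZE = 4096  # assumed 4 KB pages


class GuestPageFault(Exception):
    """Missing guest mapping: the guest OS can handle this itself."""


class EptViolation(Exception):
    """Missing EPT mapping: this one needs the VMM."""


def translate(vaddr: int, guest_page_table: dict, ept: dict) -> int:
    """Translate guest-virtual -> guest-physical -> host-physical."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    if vpn not in guest_page_table:
        raise GuestPageFault(hex(vaddr))
    guest_pfn = guest_page_table[vpn]   # stage 1: the guest's own page tables
    if guest_pfn not in ept:
        raise EptViolation(hex(guest_pfn * PAGE_SIZE))
    host_pfn = ept[guest_pfn]           # stage 2: EPT, owned by the VMM
    return host_pfn * PAGE_SIZE + offset
```

Because the first stage belongs entirely to the guest, a fault there can go straight to the guest OS. Only a missing second-stage mapping has to involve the VMM.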
  • 22. Thank You