Virtualization Technology Overview


                               Liu, Jinsong
                         (jinsong.liu@intel.com)




SSG System Software Division
Agenda
• Introduction
    History
    Usage model
• Virtualization overview
    CPU virtualization
    Memory virtualization
    I/O virtualization
• Xen/KVM architecture
    Xen
    KVM
• Some Intel work for OpenStack
    OAT



                                  2   2012/11/28
Virtualization history

• 1960s            IBM - CP/CMS on S/360, VM/370, …
• 1970s-1980s      Silence
• 1998             VMware - SimOS project, Stanford
• 2003             Xen - Xen project, Cambridge
• After that:      KVM/Hyper-V/Parallels …




What is Virtualization

        VM0             VM1                   VMn

           Apps             Apps                  Apps

          Guest OS        Guest OS      ...    Guest OS



               Virtual Machine Monitor (VMM)


        Platform HW
            Memory         Processors         I/O Devices


• VMM is a layer of abstraction
   supports multiple guest OSes
   de-privileges each OS to run as a guest OS
• VMM is a layer of redirection
   redirects the one physical platform into the illusion of many virtual platforms
   provides a virtual platform to each guest OS



Server Virtualization Usage Model
• Server consolidation
    App/OS stacks from separate HW are consolidated onto one VMM and HW
    Benefit: cost savings
       • Consolidate services
       • Power saving
• R&D and production
    App/OS stacks run on a VMM in both environments
    Benefit: business agility and productivity
• Dynamic load balancing
    Apps 1 to 4 run on two VMM hosts, one at 90% CPU usage and one at 30%
    Benefit: productivity
• Disaster recovery
    App/OS stacks move from one VMM host to another
    Benefit: loss saving
       • RAS
       • Live migration
       • Loss relief
Agenda

• Introduction
• Virtualization overview
    CPU virtualization
    Memory virtualization
    I/O virtualization
• Xen/KVM architecture
• Some Intel work for OpenStack
X86 virtualization challenges
• Ring deprivileging
    Goal: isolate the guest OS from
       • Controlling physical resources directly
       • Modifying VMM code and data

    Ring deprivileging layout
       • VMM runs at fully privileged ring 0
       • Guest kernel runs at
            •   x86-32: deprivileged ring 1
            •   x86-64: deprivileged ring 3
       • Guest apps run at ring 3

    Ring deprivileging problems
       • Unnecessary faulting
            •   some privileged instructions
            •   some exceptions
       • Guest kernel protection (x86-64): guest kernel and guest apps share ring 3

• Virtualization holes
    19 instructions
       • SIDT/SGDT/SLDT …
       • PUSHF/POPF …
    Some userspace holes are hard to fix in software
       • Hard to trap, or
       • High performance overhead
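
The difference between a privileged instruction that faults and a virtualization hole can be illustrated with a toy model. The sketch below is not real x86 semantics: `ToyCpu`, the instruction names, and `run_deprivileged_guest` are hypothetical. It assumes only that CLI faults when the guest kernel is deprivileged, while a POPF that tries to clear the interrupt flag is silently ignored outside ring 0.

```python
class PrivilegeFault(Exception):
    """Raised when a privileged instruction runs outside ring 0."""


class ToyCpu:
    """Hypothetical CPU model with a single interrupt flag (IF)."""

    def __init__(self, ring):
        self.ring = ring
        self.interrupt_flag = True

    def execute(self, insn):
        if insn == "CLI":
            # Privileged: faults when deprivileged, so the VMM can intercept it.
            if self.ring != 0:
                raise PrivilegeFault(insn)
            self.interrupt_flag = False
        elif insn == "POPF_CLEAR_IF":
            # Virtualization hole: outside ring 0 the IF update is silently
            # dropped and no fault is raised.
            if self.ring == 0:
                self.interrupt_flag = False
        else:
            raise ValueError(f"unknown instruction: {insn}")


def run_deprivileged_guest(instructions):
    """Run a guest kernel at ring 1 and return (virtual_if, trapped)."""
    cpu = ToyCpu(ring=1)
    virtual_if = True  # the VMM's per-guest copy of IF
    trapped = []
    for insn in instructions:
        try:
            cpu.execute(insn)
        except PrivilegeFault:
            # Trap and emulate: apply the effect to the guest's virtual state.
            # CLI is the only faulting instruction in this toy model.
            trapped.append(insn)
            virtual_if = False
    return virtual_if, trapped
```

With `["CLI"]` the VMM sees the trap and updates the virtual flag. With `["POPF_CLEAR_IF"]` nothing traps, so the guest believes interrupts are off while the VMM's view is unchanged.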




X86 virtualization challenges


              VM0              VM1              VM2

Ring3     Guest Apps       Guest Apps       Guest Apps

Ring1    Guest Kernel     Guest Kernel     Guest Kernel

Ring0            Virtual Machine Monitor (VMM)




Typical X86 virtualization approaches
• Para-virtualization (PV)
     Para-virtualization approach, like Xen
     Modified guest OS is aware of the VMM and cooperates with it
     Standardization milestone: Linux 3.0
         • VMI vs. PVOPS
         • Bare metal vs. virtual platform


• Binary Translation (BT)
     Full virtualization approach, like VMware
     Unmodified guest OS
     Translates binary code on the fly
         • Translation blocks with caching
               •   usually used for kernel code, ~80% of native performance
               •   userspace apps run natively as much as possible, ~100% of native performance
               •   overall ~95% of native performance
         • Complicated: must handle cases such as self-modifying code


• Hardware-assisted Virtualization (VT)
       Full virtualization approach assisted by hardware, like KVM
       Unmodified guest OS
       Intel VT-x, AMD-V
       Benefits:
         • Closes virtualization holes in hardware
         • Simplifies VMM software
         • Optimizes for performance
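
The three binary translation figures (~80% for kernel code, ~100% for userspace, ~95% overall) fit together only for a particular kernel/user split. As a rough check, the sketch below assumes a simple time-weighted model: a workload spends a fraction of its native run time in kernel code, and each part is slowed down by its own factor. The function names and the model itself are assumptions, not taken from the slides.

```python
def overall_performance(kernel_fraction, kernel_perf=0.80, user_perf=1.00):
    """Fraction of native performance for a given kernel time share."""
    # Virtualized run time, with native run time normalized to 1.0.
    virtualized_time = (kernel_fraction / kernel_perf
                        + (1.0 - kernel_fraction) / user_perf)
    return 1.0 / virtualized_time


def kernel_fraction_for(target_perf, kernel_perf=0.80, user_perf=1.00):
    """Solve 1/target = k/kernel_perf + (1-k)/user_perf for k."""
    return ((1.0 / target_perf - 1.0 / user_perf)
            / (1.0 / kernel_perf - 1.0 / user_perf))


if __name__ == "__main__":
    k = kernel_fraction_for(0.95)
    print(f"kernel share of native time: {k:.3f}")
    print(f"overall performance: {overall_performance(k):.3f}")
```

Under this model, an overall figure of ~95% corresponds to a workload that spends roughly a fifth of its native run time in kernel code.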




Memory virtualization challenges
• Guest OS makes 2 assumptions
    It expects to own physical memory starting from 0
      • BIOS/legacy OSes are designed to boot from the low 1 MB of the address space

    It expects to own basically contiguous physical memory
      •   OS kernel requires a minimum of contiguous low memory
      •   DMA requires a certain level of contiguous memory
      •   Efficient memory management, e.g., less buddy allocator overhead
      •   Efficient TLB use, e.g., super page TLB entries


• MMU virtualization
    How to keep the physical TLB valid
    Different approaches involve different complexity and overhead




Memory virtualization challenges
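[Figure: VM1 to VM4 each see contiguous guest pseudo-physical memory (pages 1 to 5 shown for one VM). The hypervisor maps these pages to scattered, non-contiguous locations in machine physical memory.]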
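
The mapping in the figure can be sketched as a per-VM table from guest frame numbers to machine frame numbers. This is a minimal illustration, assuming 4 KiB pages; `GuestMemoryMap` and the frame numbers are hypothetical and do not reflect any particular hypervisor's data structures.

```python
PAGE_SIZE = 4096  # assumed 4 KiB pages


class GuestMemoryMap:
    """Hypothetical per-VM table: guest frame number -> machine frame number."""

    def __init__(self, machine_frames):
        # Guest frame i is backed by machine_frames[i], so the guest sees
        # contiguous memory starting at 0 whatever the backing frames are.
        self.p2m = list(machine_frames)

    def to_machine(self, guest_phys_addr):
        guest_frame, offset = divmod(guest_phys_addr, PAGE_SIZE)
        return self.p2m[guest_frame] * PAGE_SIZE + offset


# Five contiguous guest pages backed by scattered machine frames.
vm1 = GuestMemoryMap([17, 3, 42, 8, 29])
```

Here guest physical address 0 lands in machine frame 17, and the next guest page lands in machine frame 3, although the guest sees them as adjacent.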




Memory virtualization approaches
(gva = guest virtual address, gpa = guest physical address, hpa = host physical address)

• Direct page table
     Guest and VMM are in the same linear address space
     Guest and VMM share the same page table
• Shadow page table
     Guest page table unmodified
         •   gva -> gpa
     VMM shadow page table
         •   gva -> hpa
     Complexity and memory overhead
• Extended page table
     Guest page table unmodified
         •   gva -> gpa
         •   guest has full control of CR3 and page faults
     VMM extended page table
         •   gpa -> hpa
         •   hardware based
         •   good scalability for SMP
         •   low memory overhead
         •   greatly reduces page-fault VM exits
• Flexible choices
     Para-virtualization
         • Direct page table
         • Shadow page table
     Full virtualization
         • Shadow page table
         • Extended page table

[Figure: the guest page table maps GVA -> GPA and the extended page table maps GPA -> HPA; the direct page table and the shadow page table each map GVA -> HPA in one step.]




Shadow page table

• Guest page table remains unmodified to the guest
    Translates gva -> gpa
• Hypervisor creates a new page table for the physical hardware
    Uses hpa in PDE/PTE
    Translates gva -> hpa
    Invisible to the guest

[Figure: the virtual vCR3 points to the guest's page directory (PDE) and page table (PTE); the physical pCR3 points to the shadow page directory and page table built by the hypervisor.]
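
The relationship between the two tables can be sketched by treating each page table as a flat dictionary of page numbers. This is a simplification: real tables are multi-level, and `build_shadow` and the sample page numbers are hypothetical. The sketch shows that the shadow table is the composition of the guest's gva -> gpa mapping with the VMM's gpa -> hpa mapping, and that it goes stale when the guest edits its own table.

```python
def build_shadow(guest_page_table, gpa_to_hpa):
    """Compose gva -> gpa (guest) with gpa -> hpa (VMM) into gva -> hpa."""
    return {
        gva_page: gpa_to_hpa[gpa_page]
        for gva_page, gpa_page in guest_page_table.items()
        if gpa_page in gpa_to_hpa  # skip guest pages with no backing frame
    }


guest_page_table = {0x10: 0, 0x11: 1}   # maintained by the guest
gpa_to_hpa = {0: 17, 1: 3, 2: 42}       # maintained by the VMM
shadow = build_shadow(guest_page_table, gpa_to_hpa)

# The guest remaps a page. The shadow table is stale until the VMM
# notices the change and rebuilds the affected entries.
guest_page_table[0x11] = 2
stale_entry = shadow[0x11]
shadow = build_shadow(guest_page_table, gpa_to_hpa)
fresh_entry = shadow[0x11]
```

Keeping the shadow table in sync is the source of the complexity noted earlier: the VMM has to intercept guest page table updates, which a full rebuild only approximates here.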




Extended page table

Guest linear address -> [guest page tables, rooted at guest CR3] -> guest physical address -> [extended page tables, rooted at EPT base pointer] -> host physical address

 • Extended page table
      Guest has full control over its page tables and related events
               • CR3, INVLPG, page fault
      VMM controls the extended page tables
         • Complicated shadow page table is eliminated
         • Improved scalability for SMP guests
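
The two-stage translation can be sketched with flat dictionaries standing in for the page tables. Real hardware walks multi-level tables; `translate` and the page numbers below are hypothetical. The point of the sketch is that the two stages are looked up separately at access time, so a guest page table update takes effect without any VMM involvement.

```python
def translate(gva_page, guest_page_table, extended_page_table):
    """Two-stage lookup: gva -> gpa (guest-owned), then gpa -> hpa (VMM-owned)."""
    gpa_page = guest_page_table[gva_page]
    return extended_page_table[gpa_page]


guest_page_table = {0x10: 0, 0x11: 1}     # guest has full control
extended_page_table = {0: 17, 1: 3, 2: 42}  # VMM has full control

before = translate(0x11, guest_page_table, extended_page_table)

# The guest remaps a page on its own. Nothing needs to be rebuilt.
guest_page_table[0x11] = 2
after = translate(0x11, guest_page_table, extended_page_table)
```

Compared with a shadow page table, there is no derived gva -> hpa table to keep in sync, which is why the guest can be left to manage CR3, INVLPG, and page faults itself.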




I/O virtualization requirements

[Figure: a device and the CPU interact through interrupts, register accesses, and DMA to shared memory.]

• I/O device from the OS point of view
      Resource configuration and probing
      I/O requests: port I/O, MMIO
      I/O data: DMA
      Interrupts

• I/O virtualization requires
    presenting the guest OS driver with a complete device interface
        • Presenting an existing interface
             •   Software emulation
             •   Direct assignment
        • Presenting a brand new interface
             •   Paravirtualization
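
Software emulation of an existing interface can be sketched as a device model that handles register accesses the VMM has trapped. Everything here is hypothetical: `SerialPortModel` is an invented two-register device, not a model of real hardware, and `ToyVmm.on_mmio_exit` stands in for whatever exit handler a real VMM would use.

```python
class SerialPortModel:
    """Hypothetical emulated device with a data and a status register."""

    DATA = 0x0
    STATUS = 0x4

    def __init__(self):
        self.output = []

    def mmio_write(self, offset, value):
        if offset == self.DATA:
            self.output.append(chr(value & 0xFF))

    def mmio_read(self, offset):
        if offset == self.STATUS:
            return 1  # always report "ready to transmit"
        return 0


class ToyVmm:
    """Routes trapped guest MMIO accesses to the matching device model."""

    def __init__(self):
        self.regions = []  # list of (base, size, device)

    def map_device(self, base, size, device):
        self.regions.append((base, size, device))

    def on_mmio_exit(self, addr, is_write, value=None):
        for base, size, device in self.regions:
            if base <= addr < base + size:
                if is_write:
                    device.mmio_write(addr - base, value)
                    return None
                return device.mmio_read(addr - base)
        raise LookupError(f"no device at {addr:#x}")
```

The guest driver only sees register reads and writes at the mapped address, so an unmodified driver can be used, at the cost of one trap per access.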




I/O virtualization approaches
• Emulated I/O
     Software emulates a real hardware device
     VMs run the same driver they would use for the real hardware device
     Good legacy software compatibility
     Emulation overhead limits performance

• Paravirtualized I/O
     Uses abstract interfaces and stack for I/O services
     FE (front-end) driver: the guest runs virtualization-aware drivers
     BE (back-end) driver: driver based on a simplified I/O interface and stack
     Better performance than emulated I/O

• Direct I/O
   Directly assigns a device to a guest
        • Guest accesses the I/O device directly
        • High performance and low CPU utilization
   DMA issue
        • Guest programs guest physical addresses
        • DMA hardware only accepts host physical addresses
   Solution: DMA remapping (a.k.a. IOMMU)
        • An I/O page table is introduced
        • The DMA engine translates according to the I/O page table
   Some limitations under live migration
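
The DMA remapping step for direct I/O can be sketched as follows. This is a toy model assuming 4 KiB pages and a flat, single-level I/O page table; `IoPageTable`, `DmaFault`, and `dma_write` are hypothetical names, and real remapping hardware uses multi-level tables per device.

```python
PAGE_SIZE = 4096  # assumed 4 KiB pages


class DmaFault(Exception):
    """Raised when a device targets a guest page that is not mapped."""


class IoPageTable:
    """Hypothetical I/O page table: guest frame -> host frame."""

    def __init__(self):
        self.entries = {}

    def map_page(self, guest_frame, host_frame):
        self.entries[guest_frame] = host_frame

    def translate(self, guest_phys_addr):
        frame, offset = divmod(guest_phys_addr, PAGE_SIZE)
        if frame not in self.entries:
            raise DmaFault(hex(guest_phys_addr))
        return self.entries[frame] * PAGE_SIZE + offset


def dma_write(host_memory, io_page_table, guest_phys_addr, data):
    """Device writes to a guest physical address; each byte is remapped."""
    for i, byte in enumerate(data):
        host_addr = io_page_table.translate(guest_phys_addr + i)
        host_memory[host_addr] = byte
```

The guest driver programs the device with guest physical addresses only. The table both redirects the access to the right host frame and blocks DMA to pages the guest does not own.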




Virtual platform models
• Hypervisor model
     A preferred OS with its apps runs beside the guests
     The hypervisor contains P, M, DR, and DM
• Host-based model
     Guests run on top of a host OS
     ULM contains DM; the host OS contains DR; LKM contains P and M
• Hybrid model
     A service VM runs the preferred OS with its apps, ULM (with DM), and DR
     The U-hypervisor contains P, M, and N

Legend: P = processor management code, M = memory management code, DR = device driver, DM = device model, N = NoDMA
Agenda

• Introduction
• Virtualization
• Xen/KVM architecture
• Some Intel work for OpenStack
Xen Architecture
[Figure: Xen architecture]
• Domain 0: XenLinux64 with control panel (xm/xend), device models, backend virtual drivers, and native device drivers
• DomainU: XenLinux64 with front-end virtual drivers
• HVM domains (32-bit and 64-bit): unmodified OS with FE drivers, guest BIOS, and virtual platform; they enter Xen through VM exits
• Domains communicate through inter-domain event channels and reach Xen through callbacks/hypercalls
• Xen hypervisor: control interface, scheduler, event channel, hypercalls; it virtualizes processor, memory, and I/O (PIT, APIC, PIC, IOAPIC)
• Privilege labels in the figure: 3P, 1/3P, 0P, 3D, 0D
KVM Architecture

[Figure: KVM architecture]
• Non-root mode: Windows guest and Linux guest, run by qemu-kvm
• Root mode: VMCS structures for the guests
• Linux kernel with the KVM module: vCPU, vMEM, vTimer, vPIC, vAPIC, vIOAPIC
Agenda

• Introduction
• Virtualization
• Xen/KVM architecture
• Some Intel work for OpenStack
Trusted Pools - Implementation

[Figure: trusted pools flow]
• User specifies: Mem > 2G, Disk > 50G, GPGPU=Intel, trusted_host=trusted
• The create request reaches OpenStack through the EC2 API or OSAPI
• The scheduler's TrustedFilter queries the attestation service (Query API), which answers trusted/untrusted
• Attestation server (OAT-based): Query API, Host Agent API, Privacy CA, Appraiser, Whitelist API, Whitelist DB
• The host agent on each host attests and reports to the attestation server
• Host stack: apps and OSes on hypervisor / tboot, on tboot-enabled HW/TXT
• The scheduler creates the VM on a trusted host
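
The scheduling decision in the figure can be sketched as a host filter. This is an illustration only and is not the OpenStack scheduler's actual code or API: `trusted_filter`, the host dictionaries, the request keys, and the `attest` callback are all hypothetical stand-ins for the TrustedFilter and the attestation service query.

```python
def trusted_filter(hosts, request, attest):
    """Return names of hosts that satisfy the request.

    hosts:   list of dicts with "name", "mem_gb", "disk_gb" (hypothetical)
    request: dict with "min_mem_gb", "min_disk_gb", optional "trusted_host"
    attest:  callable(host_name) -> "trusted" or "untrusted"
    """
    selected = []
    for host in hosts:
        if host["mem_gb"] <= request["min_mem_gb"]:
            continue  # fails "Mem > N"
        if host["disk_gb"] <= request["min_disk_gb"]:
            continue  # fails "Disk > N"
        wants_trusted = request.get("trusted_host") == "trusted"
        if wants_trusted and attest(host["name"]) != "trusted":
            continue  # attestation service did not vouch for this host
        selected.append(host["name"])
    return selected


hosts = [
    {"name": "a", "mem_gb": 4, "disk_gb": 100},
    {"name": "b", "mem_gb": 8, "disk_gb": 200},
    {"name": "c", "mem_gb": 1, "disk_gb": 500},
]
# Stub for the attestation query; a real deployment would ask the server.
attestation_results = {"a": "trusted", "b": "untrusted", "c": "trusted"}
request = {"min_mem_gb": 2, "min_disk_gb": 50, "trusted_host": "trusted"}
chosen = trusted_filter(hosts, request, attestation_results.get)
```

In this example only host `a` passes: `b` is untrusted and `c` has too little memory. Dropping `trusted_host` from the request would admit `b` as well.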

Virtualization Technology Overview

  • 1. Virtualization Technology Overview Liu, Jinsong (jinsong.liu@intel.com) SSG System Software Division
  • 2. Agenda • Introduction  history  Usage model • Virtualization overview  cpu virtualiztion  memory virtualization  I/O virtualization • Xen/KVM architecture  Xen  KVM • Some intel work for Openstack  OAT 2 2012/11/28
  • 3. Virtualization history • 60s’ IBM - CP/CMS on S360, VM370, … • 70’s 80s’ Silence • 1998 VMWare - SimOS project, Stanford • 2003 Xen - Xen project, Cambridge • After that: KVM/Hyper-v/Parallels … 3 2012/11/28
  • 4. What is Virtualization VM0 VM1 VMn Apps Apps Apps Guest OS Guest OS ... Guest OS Virtual Machine Monitor (VMM) Platform HW Memory Processors I/O Devices • VMM is a layer of abstraction  support multiple guest OSes  de-privilege each OS to run as Guest OS • VMM is a layer of redirection  redirect physical platform to virtual platform illusions of many  provide virtaul platfom to guest os 4 2012/11/28
  • 5. Server Virtualization Usage Model Server Consolidation R&D Production App App App … OS OS OS HW HW VMM VMM HW HW Benefit: Cost Savings • Consolidate services Benefit: Business Agility and Productivity • Power saving Dynamic Load Balancing Disaster Recovery App App App App 1 2 3 4 App App … OS OS OS OS OS OS VMM VMM VMM HW HW VMM CPU Usage CPU Usage HW HW 90% 30% Benefit: Lost saving • RAS • live migration • relief lost • Benefit: Productivity 5 2012/11/28
  • 6. Agenda • Introduction • Virtualization overview  CPU virtualization  Memory virtualization  I/O virtualization • Xen/KVM architecture • Some intel work for Openstack 6 2012/11/28
  • 7. X86 virtualization challenges • Ring Deprivileging  Goal: isolate guest OS from • Controlling physical resources directly • Modifying VMM code and data  Ring deprivileging layout • vmm runs at full privileged ring0 • Guest kernel runs at • X86-32: deprivileging ring 1 • X86-64: deprivileging ring 3 • Guest app runs at ring 3  Ring deprivileging problems • Unnecessary faulting • some privilege instructions • some exceptions • Guest kernel protection (x86-64) • Virtualization holes  19 instructions • SIDT/SGDT/SLDT … • PUSHF/POPF …  Some userspace holes hard to fix by s/w approach • Hard to trap, or • Performance overhead 7 2012/11/28
  • 8. X86 virtualization challenges
      [Diagram: VM0, VM1, and VM2 each run guest apps in ring 3 and a guest kernel in ring 1; the Virtual Machine Monitor (VMM) runs in ring 0]
  • 9. Typical X86 virtualization approaches
      • Para-virtualization (PV)
          - Para-virtualization approach, like Xen
          - Modified guest OS is aware of and cooperates with the VMM
          - Standardization milestone: Linux 3.0
              • VMI vs. PVOPS
              • Bare metal vs. virtual platform
      • Binary Translation (BT)
          - Full virtualization approach, like VMware
          - Unmodified guest OS
          - Translates binary code on the fly
              • Translation blocks with caching
              • Usually used for the kernel, ~80% of native performance
              • Userspace apps run natively as much as possible, ~100% of native performance
              • Overall ~95% of native performance
              • Complicated: involves excessive complexity, e.g., self-modifying code
      • Hardware-assisted Virtualization (VT)
          - Full virtualization approach assisted by hardware, like KVM
          - Unmodified guest OS
          - Intel VT-x, AMD-V
          - Benefits:
              • Closes virtualization holes in hardware
              • Simplifies VMM software
              • Optimizes for performance
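The "translation block with caching" idea from the binary translation bullets can be sketched as follows. This is only an illustration under simplifying assumptions: instructions are plain mnemonic strings, `SENSITIVE` is a hypothetical subset of sensitive instructions, and `ToyTranslator` and the `call vmm_emulate_*` rewrite are invented names, not how any real translator encodes its output.

```python
SENSITIVE = {"popf", "pushf", "sidt", "sgdt", "sldt"}  # illustrative subset only


class ToyTranslator:
    def __init__(self):
        self.cache = {}          # block start address -> translated block
        self.translations = 0    # how many blocks were actually translated

    def translate_block(self, addr, block):
        """Return the translated block, translating only on a cache miss."""
        if addr not in self.cache:
            self.translations += 1
            self.cache[addr] = tuple(
                f"call vmm_emulate_{op}" if op in SENSITIVE else op
                for op in block
            )
        return self.cache[addr]

    def invalidate(self, addr):
        """Drop a cached block, e.g. after the guest wrote to its own code."""
        self.cache.pop(addr, None)
```

The `invalidate` method hints at why self-modifying code adds complexity: the translator has to detect writes to already translated code and throw the stale translation away.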
  • 10. Memory virtualization challenges
      • Guest OS makes 2 assumptions
          - Expects to own physical memory starting from 0
              • BIOS/legacy OSes are designed to boot from the low 1 MB of the address space
          - Expects to own largely contiguous physical memory
              • OS kernel requires a minimum amount of contiguous low memory
              • DMA requires a certain level of contiguous memory
              • Efficient memory management, e.g., less buddy allocator overhead
              • Efficient TLB use, e.g., superpage TLB entries
      • MMU virtualization
          - How to keep the physical TLB valid
          - Different approaches involve different complexity and overhead
  • 11. Memory virtualization challenges
      [Diagram: VM1 through VM4 each see guest pseudo-physical memory; the hypervisor maps the numbered guest pages (1-5) onto machine physical memory]
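The pseudo-physical to machine mapping in the diagram can be sketched as a per-guest table. This is a minimal sketch, assuming 4 KiB pages; `build_p2m` and `pseudo_to_machine` are hypothetical helper names chosen for illustration.

```python
PAGE_SIZE = 4096  # assumed page size


def build_p2m(free_machine_frames, guest_frames):
    """Give the guest frames 0..guest_frames-1, backed by any free machine frames."""
    if guest_frames > len(free_machine_frames):
        raise ValueError("not enough machine memory")
    return {pfn: free_machine_frames[pfn] for pfn in range(guest_frames)}


def pseudo_to_machine(p2m, guest_paddr):
    """Translate a guest pseudo-physical address to a machine address."""
    pfn, offset = divmod(guest_paddr, PAGE_SIZE)
    return p2m[pfn] * PAGE_SIZE + offset
```

The guest sees contiguous memory starting at 0, which satisfies both assumptions from the previous slide, even though the backing machine frames may be scattered.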
  • 12. Memory virtualization approaches
      [Diagram: GVA -> guest page table -> GPA -> extended page table -> HPA; direct and shadow page tables map GVA -> HPA]
      • Direct page table
          - Guest/VMM in the same linear space
          - Guest/VMM share the same page table
      • Shadow page table
          - Guest page table unmodified
              • gva -> gpa
          - VMM shadow page table
              • gva -> hpa
          - Complication and memory overhead
      • Extended page table
          - Guest page table unmodified
              • gva -> gpa
              • Full control of CR3, page faults
          - VMM extended page table
              • gpa -> hpa
              • Hardware based
              • Good scalability for SMP
              • Low memory overhead
              • Greatly reduces page-fault VM exits
      • Flexible choices
          - Para-virtualization
              • Direct page table
              • Shadow page table
          - Full virtualization
              • Shadow page table
              • Extended page table
  • 13. Shadow page table
      [Diagram: vCR3 points to the guest page directory (PDE) and page table (PTE), labeled virtual; pCR3 points to the shadow page directory (PDE) and page table (PTE), labeled physical]
      • Guest page table remains unmodified from the guest's point of view
          - Translates gva -> gpa
      • Hypervisor creates a new page table for the physical hardware
          - Uses hpa in PDE/PTE
          - Translates gva -> hpa
          - Invisible to the guest
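The relationship between the guest page table and the shadow page table can be sketched with single-level tables keyed by page number. This is a simplified model, assuming that every guest page-table write is intercepted by the hypervisor; `ShadowMMU` and its method names are hypothetical.

```python
class ShadowMMU:
    def __init__(self, p2m):
        self.p2m = p2m          # gpa page -> hpa page, owned by the hypervisor
        self.guest_pt = {}      # gva page -> gpa page, what the guest sees
        self.shadow_pt = {}     # gva page -> hpa page, what the hardware walks

    def guest_write_pte(self, gva_page, gpa_page):
        """Model an intercepted guest page-table write."""
        self.guest_pt[gva_page] = gpa_page
        if gpa_page in self.p2m:
            # Keep the shadow entry in sync with the guest's new mapping.
            self.shadow_pt[gva_page] = self.p2m[gpa_page]
        else:
            # No backing machine page: leave the shadow entry not present.
            self.shadow_pt.pop(gva_page, None)

    def hardware_translate(self, gva_page):
        """The physical MMU only ever consults the shadow table."""
        return self.shadow_pt[gva_page]
```

Every guest mapping costs a second entry plus an interception to keep it in sync, which is one way to read the "complication and memory overhead" noted for shadow paging.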
  • 14. Extended page table
      [Diagram: guest linear address -> guest page tables (rooted at guest CR3) -> guest physical address -> extended page tables (rooted at the EPT base pointer) -> host physical address]
      • Extended page table
          - Guest can have full control over its page tables and events
              • CR3, INVLPG, page fault
          - VMM controls the extended page tables
              • Complicated shadow page table is eliminated
              • Improved scalability for SMP guests
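The two-stage walk in the diagram can be sketched as two independent lookups. This is a simplified model with single-level tables and an assumed 4 KiB page size; it omits the fact that on real hardware the guest's own page-table accesses are also translated through the extended page tables. `GuestPageFault`, `EPTViolation`, and `translate` are hypothetical names.

```python
PAGE_SIZE = 4096  # assumed page size


class GuestPageFault(Exception):
    """Stage 1 miss: delivered to the guest, no VMM involvement needed."""


class EPTViolation(Exception):
    """Stage 2 miss: causes an exit to the VMM."""


def translate(gva, guest_pt, ept):
    page, offset = divmod(gva, PAGE_SIZE)
    if page not in guest_pt:
        raise GuestPageFault(hex(gva))
    gpa_page = guest_pt[page]          # stage 1: guest-controlled table
    if gpa_page not in ept:
        raise EPTViolation(hex(gpa_page * PAGE_SIZE))
    hpa_page = ept[gpa_page]           # stage 2: VMM-controlled table
    return hpa_page * PAGE_SIZE + offset
```

Because `guest_pt` and `ept` are separate structures, the guest can edit its own table freely and the VMM has nothing to keep in sync, in contrast to the shadow approach.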
  • 15. I/O virtualization requirements
      [Diagram: the CPU and a device interact through register access, interrupts, and DMA to shared memory]
      • I/O device from the OS point of view
          - Resource configuration and probing
          - I/O requests: port I/O, MMIO
          - I/O data: DMA
          - Interrupts
      • I/O virtualization requires
          - Presenting the guest OS driver with a complete device interface
              • Presenting an existing interface
                  • Software emulation
                  • Direct assignment
              • Presenting a brand new interface
                  • Paravirtualization
  • 16. I/O virtualization approaches
      • Emulated I/O
          - Software emulates a real hardware device
          - VMs run the unmodified driver for the emulated hardware device
          - Good legacy software compatibility
          - Emulation overhead limits performance
      • Paravirtualized I/O
          - Uses abstract interfaces and stacks for I/O services
          - FE (front-end) driver: the guest runs virtualization-aware drivers
          - BE (back-end) driver: driver based on a simplified I/O interface and stack
          - Better performance than emulated I/O
      • Direct I/O
          - Directly assigns a device to the guest
              • Guest accesses the I/O device directly
              • High performance and low CPU utilization
          - DMA issue
              • Guest programs guest physical addresses
              • DMA hardware only accepts host physical addresses
          - Solution: DMA remapping (a.k.a. IOMMU)
              • An I/O page table is introduced
              • DMA engine translates according to the I/O page table
          - Some limitations under live migration
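The DMA remapping step for direct I/O can be sketched as a per-device I/O page table. This is a toy model, assuming 4 KiB pages and one table per assigned device; `ToyIOMMU`, `DMAFault`, and the device identifier string are hypothetical and do not reflect any real IOMMU programming interface.

```python
PAGE_SIZE = 4096  # assumed page size


class DMAFault(Exception):
    """Raised when a device issues DMA to an address it has no mapping for."""


class ToyIOMMU:
    def __init__(self):
        self.io_page_tables = {}   # device id -> {gpa page: hpa page}

    def assign_device(self, device_id, gpa_to_hpa_pages):
        """Install the I/O page table for a device assigned to a guest."""
        self.io_page_tables[device_id] = dict(gpa_to_hpa_pages)

    def remap(self, device_id, guest_paddr):
        """Translate the guest physical address the driver programmed."""
        table = self.io_page_tables.get(device_id)
        page, offset = divmod(guest_paddr, PAGE_SIZE)
        if table is None or page not in table:
            raise DMAFault(f"{device_id}: {hex(guest_paddr)}")
        return table[page] * PAGE_SIZE + offset
```

The guest driver keeps programming guest physical addresses, and the translation to host physical addresses happens on the DMA path, so a device can only reach memory that its own table maps.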
  • 17. Virtual platform models
      [Diagram: three models side by side: Hypervisor Model (guests on a hypervisor), Host-based Model (guests run by a ULM on a host OS with an LKM), Hybrid Model (guests and a service VM on a U-Hypervisor)]
      • Legend
          - P: processor management code
          - M: memory management code
          - DR: device driver
          - DM: device model
          - N: NoDMA
  • 18. Agenda
      • Introduction
      • Virtualization
      • Xen/KVM architecture
      • Some Intel work for OpenStack
  • 19. Xen Architecture
      [Diagram: Domain 0 (XenLinux64) hosts the control panel (xm/xend), device models, back-end virtual drivers, and native device drivers; DomainU (XenLinux64) uses front-end virtual drivers; 32-bit and 64-bit HVM domains run unmodified OSes with FE drivers on a guest BIOS and virtual platform, reaching Xen through VM exits and callbacks/hypercalls. The Xen hypervisor provides the control interface, scheduler, event channels, hypercalls, and processor, memory, and I/O virtualization (PIT, APIC, PIC, IOAPIC); domains communicate over inter-domain event channels. Privilege labels in the figure: 0P, 0D, 1/3P, 3P, 3D]
  • 20. KVM Architecture
      [Diagram: Windows and Linux guests run in non-root mode alongside qemu-kvm; in root mode, the KVM module inside the Linux kernel holds the VMCS structures and provides vCPU, vMEM, vTimer, vPIC, vAPIC, and vIOAPIC]
  • 21. Agenda
      • Introduction
      • Virtualization
      • Xen/KVM architecture
      • Some Intel work for OpenStack
  • 22. Trusted Pools - Implementation
      [Diagram: a user submits a Create VM request through the EC2 API or OSAPI, specifying Mem > 2G, Disk > 50G, GPGPU=Intel, trusted_host=trusted. The OpenStack scheduler's TrustedFilter queries the attestation server (query API, OAT-based attestation service with appraiser, Privacy CA, whitelist API, and whitelist DB), which attests each host through its host agent and reports trusted/untrusted. Hosts run apps and OSes on a hypervisor launched with tboot on TXT-enabled hardware]
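The scheduling flow in the diagram can be sketched as a host filter that combines resource checks with an attestation query. This is an illustrative sketch only, not the OpenStack or OAT API: `Host`, `filter_hosts`, and the `attest` callable (standing in for the attestation server query) are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    free_mem_gb: int
    free_disk_gb: int


def filter_hosts(hosts, min_mem_gb, min_disk_gb, require_trusted, attest):
    """Return the names of hosts that satisfy the user's request.

    `attest` is a callable taking a host name and returning "trusted" or
    "untrusted"; it stands in for the query to the attestation server.
    """
    selected = []
    for host in hosts:
        # Resource constraints are strict, as in "Mem > 2G, Disk > 50G".
        if host.free_mem_gb <= min_mem_gb or host.free_disk_gb <= min_disk_gb:
            continue
        # Only query attestation when the request asks for a trusted host.
        if require_trusted and attest(host.name) != "trusted":
            continue
        selected.append(host.name)
    return selected
```

A request with `trusted_host=trusted` would map to `require_trusted=True`, so a host with enough memory and disk is still skipped if attestation reports it as untrusted.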