XPDS13: XenGT - A software based Intel Graphics Virtualization Solution - Haitao Shan, Intel
GPU virtualization has become an increasingly important requirement for client virtualization and cloud. Significant challenges exist in multiplexing graphics, media and compute workloads from multiple VMs while remaining fully functional, high performance and secure. In this presentation, we will first review existing graphics virtualization technologies, and then introduce how XenGT, an open source solution from Intel, approaches the problem differently. Broad functionality and good performance are achieved by accelerating the native OS graphics stack in each VM with minimum hypervisor intervention. A software mediator ensures the secure multiplexing of workloads from the multiple VMs by managing the scheduling of VMs on the GPU and controlling access to privileged resources and operations.


  1. XenGT: A Software Based Intel Graphics Virtualization Solution
      Oct 22, 2013
      Haitao Shan, haitao.shan@intel.com
      Kevin Tian, kevin.tian@intel.com
      Eddie Dong, eddie.dong@intel.com
  2. Agenda
      •  Background
      •  Existing Arts
      •  XenGT Architecture
      •  Performance
      •  Summary
  3. Background
  4. Graphics Computing
      •  Entertainment applications
          •  Gaming, video playback, browser, etc.
      •  General purpose windowing
          •  Windows Aero, Compiz Fusion, etc.
      •  High performance computing
          •  Computer aided designs, weather broadcast, etc.
      The same capability is required when these tasks are moved into a VM.
  5. Graphics Virtualization
      •  Performance vs. multiplexing
          •  Consistent and rich user experience in all VMs
          •  Share a single GPU among multiple VMs
      Usage segments (from the slide figure):
          •  Client: rich virtual client
          •  Server: VDI, transcoder, GPGPU
          •  Embedded: smartphone, tablet, IVI
  6. Existing Arts
  7. Device Emulation
      •  Only for legacy VGA cards
          •  E.g. Cirrus Logic VGA card
      •  Limited graphics capability
          •  2D only
          •  Optimizations on frame buffer operations
              •  E.g. PV framebuffer
      •  Impossible to emulate a modern GPU
          •  Complexity
          •  Poor performance
  8. Split Driver Model
      •  Frontend/Backend drivers
          •  Forward OpenGL/DirectX API calls
          •  Implementation specific for the level of forwarding
          •  E.g. VMGL, VMware vGPU, Virgil
      •  Hardware agnostic
      •  Challenges on forwarding between host/guest graphics stacks
          •  API compatibility
          •  CPU overhead
  9. Direct Pass-Through/SR-IOV
      •  Best performance with direct pass-through
          •  However no multiplexing
  10. XenGT Architecture
  11. XenGT
      •  A mediated pass-through solution for graphics virtualization
          •  Pass-through performance critical resources
          •  Trap-and-emulate privileged operations
      •  Maintain a device model per VM
      •  Run native graphics driver in VM
      •  Achieve good performance and moderate multiplexing capability
      Figure: Performance vs. Multiplexing chart positioning Device Emulation, Split Driver Model, Mediated Pass-Through and Direct Pass-Through.
  12. XenGT Architecture
  13. Intel Processor Graphics
      •  Graphics memory
          •  Virtual memory address spaces
              •  A single global virtual memory (GVM) space
              •  Multiple per-process virtual memory (PPVM) spaces
          •  Backed by system memory through GTTs
      •  Render engine
          •  Fulfill the acceleration capability through fixed pipelines and execution units
      •  Display engine
          •  Route data from graphics memory to external monitors
      •  Global state
          •  Represent remaining circuits, including initialization, PM, etc.
      Figure labels: GPU (Global State, Render Engine, Display Engine); GPU Commands; External Monitors; Per-Process Virtual Memory; Global Virtual Memory; Per-Process Graphics Translation Tables (PPGTTs); Global Graphics Translation Table (GGTT); Graphics Memory (System Memory).
  14. Mediated Pass-Through Policies
      •  Access frequency on GPU interfaces
      •  Policies

          Pass-through                      Mediation
          ------------------------------    ------------------------------
          Graphics virtual memory spaces    MMIO registers
          Command buffers                   GTTs
                                            PCI configuration space
                                            Legacy VGA I/O ports
  15. Global Virtual Memory Space
      •  The single GVM space is partitioned
          •  Access to VM’s own GVM region is passed through
          •  Classical memory virtualization challenge
              •  Host view vs. guest view
              •  Address space ballooning with driver cooperation
      •  GGTT accesses are mediated
          •  Access to its own GGTT entries is translated
              •  GPFN <-> MFN
          •  Access to others’ entries is virtualized
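The GGTT mediation rule on this slide can be illustrated with a small Python sketch. It is not XenGT code: the class `VmGgttView`, the function `mediate_ggtt_write`, and the dictionary-based `p2m` map are hypothetical names chosen for illustration, and the physical GGTT is modeled as a plain list.

```python
from dataclasses import dataclass, field


@dataclass
class VmGgttView:
    """Hypothetical per-VM view of the partitioned GGTT (illustrative names)."""
    first_entry: int   # first GGTT index assigned to this VM
    num_entries: int   # number of GGTT entries in its partition
    p2m: dict          # guest page frame number -> machine frame number
    virtual_entries: dict = field(default_factory=dict)  # emulated foreign entries

    def owns(self, index: int) -> bool:
        return self.first_entry <= index < self.first_entry + self.num_entries


def mediate_ggtt_write(vm: VmGgttView, physical_ggtt: list, index: int, gpfn: int) -> str:
    """Handle one trapped guest write of `gpfn` into GGTT entry `index`."""
    if vm.owns(index):
        # Own entry: translate GPFN -> MFN before touching the real table.
        physical_ggtt[index] = vm.p2m[gpfn]
        return "translated"
    # Entry in another VM's partition: record it in a per-VM virtual copy only.
    vm.virtual_entries[index] = gpfn
    return "virtualized"
```

In this sketch a write that lands outside the VM's partition never reaches the physical table, which is one way to read "access to others' entries is virtualized".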
  16. Per-Process Virtual Memory Spaces
      •  Each VM manages its own PPVM spaces
          •  Active space pointed by PP_DIR_BASE
          •  Accesses are passed through
      •  PPGTT accesses are write-protected
          •  Shadow PPGTT table
          •  Switch PP_DIR_BASE at render context switch
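A rough Python sketch of the shadow PPGTT idea follows. It assumes that the hardware is pointed at the shadow copy (machine frame numbers) while the guest keeps seeing its own entries; the slide does not spell this out. `ShadowPpgtt`, `on_protected_write` and `pp_dir_base_for` are hypothetical names.

```python
class ShadowPpgtt:
    """Hypothetical shadow of one guest PPGTT (illustrative names only)."""

    def __init__(self, p2m):
        self.p2m = p2m      # guest page frame number -> machine frame number
        self.guest = {}     # entries as the guest driver wrote them (GPFNs)
        self.shadow = {}    # entries the GPU is assumed to walk (MFNs)

    def on_protected_write(self, index, gpfn):
        """Called when a write to the write-protected guest table traps."""
        self.guest[index] = gpfn             # complete the guest's write
        self.shadow[index] = self.p2m[gpfn]  # keep the shadow in sync


def pp_dir_base_for(owner, shadows):
    """Pick the table PP_DIR_BASE would reference for the new render owner."""
    return shadows[owner].shadow
```

Write protection is what keeps the two copies consistent here: every guest update traps, so the shadow never lags behind the guest table.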
  17. Command Buffers
      •  Command buffer access is passed through
          •  Reside in virtual memory spaces
      •  Command submission request is mediated
          •  Through MMIO register (ring tail)
          •  Render scheduler makes the decision
          •  Render owner request is submitted to render engine
          •  Non-render owner request is blocked
      Figure labels: Graphics Driver; GPU; Ring Tail; Ring Head; Command Ring Buffer; Batch Buffer; Chained Batch Buffers. Timeline (T1, T2): Queue Commands, Command Submission, Access Registers, Execute Commands, Completion.
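The submission rule can be sketched in Python as a handler for trapped ring-tail writes. This is a simplified model, not the vGT implementation: `TailMediator`, `hw_tail` and `pending_tail` are hypothetical names, and the real ring tail register is reduced to a single integer.

```python
class TailMediator:
    """Hypothetical mediator for trapped ring-tail MMIO writes."""

    def __init__(self, render_owner):
        self.render_owner = render_owner  # VM currently owning the render engine
        self.hw_tail = 0                  # value last written to the real register
        self.pending_tail = {}            # vm -> latest tail value held back

    def on_tail_write(self, vm, tail):
        """Forward the owner's tail update; hold back everyone else's."""
        if vm == self.render_owner:
            self.hw_tail = tail           # GPU starts fetching the new commands
            return "submitted"
        self.pending_tail[vm] = tail      # blocked until this VM owns the engine
        return "blocked"
```

Only the tail write is trapped in this model; the command buffers themselves are written by the guest without intervention, matching the pass-through policy stated on the slide.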
  18. Render Engine Sharing
      •  A simple round-robin scheduler
          •  In 16ms epoch
      Render context switch flow
      1.  Wait for the VM1 ring buffer to become empty
      2.  Save render MMIO registers for VM1
      3.  Flush internal TLB/caches
      4.  Hardware context switch
      5.  Restore render MMIO registers for VM2
      6.  Submit previously queued commands
      •  Render owner access is trap-and-forwarded to the render engine
      •  Non-render owner access is trap-and-emulated
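As a minimal sketch of the scheduling described on this slide, the Python below rotates render ownership in 16 ms epochs and lists the six switch steps in order. The function names `round_robin` and `context_switch_steps` are hypothetical, and the steps are only logged as strings rather than performed on hardware.

```python
from itertools import cycle

EPOCH_MS = 16  # scheduling epoch stated on the slide


def context_switch_steps(prev_vm, next_vm):
    """Return the six switch steps from the slide, in order, as log strings."""
    return [
        f"wait for {prev_vm} ring buffer to become empty",
        f"save render MMIO registers for {prev_vm}",
        "flush internal TLB/caches",
        "hardware context switch",
        f"restore render MMIO registers for {next_vm}",
        "submit previously queued commands",
    ]


def round_robin(vms, total_ms):
    """Assign the render engine to each VM in turn, one epoch at a time."""
    order = cycle(vms)
    return [(start, next(order)) for start in range(0, total_ms, EPOCH_MS)]
```

For example, `round_robin(["VM1", "VM2"], 48)` yields three epochs that alternate between the two VMs, with a context switch implied at each boundary.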
  19. Display Engine Sharing
      Direct display model
      -  Display engine points to the frame buffer of the foreground VM
      -  vGT driver configures display engine for foreground/background switch
      Indirect display model
      -  vGT driver provides interface to decode VM frame buffer location/format
      -  An OpenGL app composites VM frame buffers
  20. Performance
  21. 3D Performance
      Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information about performance and benchmark results, visit www.intel.com/benchmarks
  22. Single VM vs. Two VMs
  23. Summary
  24. Summary
      •  Sustain consistent and rich user experience in VM
          •  Running native graphics driver in VM
      •  Achieve good performance
          •  Minimum impact on performance critical operations
      •  Support moderate multiplexing capability
          •  Trap-and-emulate privileged operations
      •  Call for action: try it and give feedback
          •  https://github.com/01org/XenGT-Preview-kernel
          •  https://github.com/01org/XenGT-Preview-xen
          •  https://github.com/01org/XenGT-Preview-qemu
  25. Notices and Disclaimers
      INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL® PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL’S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER, AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL® PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. INTEL PRODUCTS ARE NOT INTENDED FOR USE IN MEDICAL, LIFE SAVING, OR LIFE SUSTAINING APPLICATIONS.
      Intel may make changes to specifications and product descriptions at any time, without notice. All products, dates, and figures specified are preliminary based on current expectations, and are subject to change without notice. Intel, processors, chipsets, and desktop boards may contain design defects or errors known as errata, which may cause the product to deviate from published specifications. Current characterized errata are available on request.
      No computer system can provide absolute security under all conditions. Intel® Trusted Execution Technology (Intel® TXT) requires a computer with Intel® Virtualization Technology, an Intel TXT-enabled processor, chipset, BIOS, Authenticated Code Modules and an Intel TXT-compatible measured launched environment (MLE). Intel TXT also requires the system to contain a TPM v1.s. For more information, visit http://www.intel.com/technology/security
      Intel, Intel logo, Xeon, and Xeon Inside are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. *Other names and brands may be claimed as the property of others. Copyright © 2013 Intel Corporation. All rights reserved.
