HKG18-214 - Progress of Wrapdrive


"Session ID: HKG18-214
Session Name: HKG18-214 - Progress of WrapDrive
Speaker: Zaibo Xu
Track: Networking

★ Session Summary ★
A Common Accelerator Framework for User Space Applications was introduced at Connect SFO 2017 by Kenneth Lee. Since then, we have done a series of work to support user applications better. DMA mapping for accelerators from multiple processes and SVM (Shared Virtual Memory) without page faults both run on our D06 board, enabled by new VFIO and IOMMU APIs based on the SVM patch set (RFC v3) from Jean-Philippe Brucker of ARM. We also ran performance tests on these scenarios with our SoC device (ZIP) to show the advantages of Wrapdrive. Next, Wrapdrive will support SVM with page faults from devices, with high performance and a simple programming model.
★ Resources ★
Event Page:
★ Event Details ★
Linaro Connect Hong Kong 2018 (HKG18)
19-23 March 2018
Regal Airport Hotel Hong Kong

Keyword: Networking"

Published in: Technology

  1. 1. Progress of Wrapdrive (A Common Accelerator Framework for User Space) Zaibo Xu from Huawei
  2. 2. Contents ● Looking back ● Progress ● Current Challenges ● Plans
  3. 3. Looking back on Wrapdrive: SFO 2017
     What is Wrapdrive (WD)?
     1. An accelerator framework for user space, leveraging hardware accelerators with native performance.
     2. Based on VFIO and VFIO Mdev, allowing direct access to hardware with improved security and efficiency.
     3. A hardware accelerator is accessed via a ‘queue’, the minimal working unit for the user. The queue’s DMA priority is controlled by the user.
  4. 4. Looking back on Wrapdrive: SFO 2017
     Why Wrapdrive?
     ● Get the native performance of the accelerator.
     ● Use ‘queues’ to break the limit of one device serving only one process.
     ● Manage ‘queues’ with better security.
     Status back then
     ● Shared virtual memory (SVM, SVA) was still at the RFC stage.
     ● Substream ID was not supported by the SMMU driver.
     ● Wrapdrive supported only one process.
  5. 5. Wrapdrive Progress ● Support SVA ● DMA mapping one device from multiple processes ● Keep compatibility (native, normal mdev)
  6. 6. Support SVA
     ● Extends the SVA patch set from Jean-Philippe Brucker.
     ● Currently no I/O page fault (IOPF).
       ○ Uses ‘mlock’ to trigger any faults before DMA access and to prevent the pages from being swapped out.
       ○ In theory, WD can support SVA with IOPF, but no accelerator is available yet to test it.
     ● Tested on a Hisilicon D06 board with a ZIP accelerator (see the Example slide).
     ● One Wrapdrive device (Wdev) serves multiple processes, including the kernel.
       ○ VFIO can bind one Mdev (queue) to one process with its own I/O page table, so an accelerator that supports PASID and has multiple queues can serve multiple processes. The kernel’s default I/O page table keeps PASID zero, as before.
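The ‘mlock’ step on slide 6 can be sketched in user space roughly as follows. This is a minimal illustration under stated assumptions, not Wrapdrive code: the helper names `wd_alloc_locked` and `wd_free_locked` are hypothetical, and only standard POSIX calls (`posix_memalign`, `mlock`, `munlock`) are used. A successful `mlock` leaves every page of the range resident, so a device without IOPF support should not hit a missing page during DMA.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of Wrapdrive): allocate a page-aligned
 * buffer and mlock() it so its pages are faulted in and kept in RAM
 * before any device is pointed at it.
 * Returns NULL on any failure.
 */
static void *wd_alloc_locked(size_t len)
{
    void *buf = NULL;
    long page = sysconf(_SC_PAGESIZE);

    if (len == 0 || page <= 0)
        return NULL;
    if (posix_memalign(&buf, (size_t)page, len) != 0)
        return NULL;
    /* mlock() can fail, for example when RLIMIT_MEMLOCK is too small. */
    if (mlock(buf, len) != 0) {
        free(buf);
        return NULL;
    }
    return buf;
}

/* Hypothetical counterpart: unlock and free a buffer from wd_alloc_locked(). */
static void wd_free_locked(void *buf, size_t len)
{
    if (buf == NULL)
        return;
    munlock(buf, len);
    free(buf);
}
```

A caller would allocate the source and destination buffers this way before handing their addresses to a queue, and release them with `wd_free_locked` once the request has completed.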
  7. 7. DMA Mapping in multiple processes
     ● Add VFIO APIs for private DMA map, similar to SVA VFIO bind/unbind.
         +#define VFIO_IOMMU_ATTACH _IO(VFIO_TYPE, VFIO_BASE + 24)
         +#define VFIO_IOMMU_DETACH _IO(VFIO_TYPE, VFIO_BASE + 25)
     ● VFIO_IOMMU_ATTACH creates a PASID-linked IOVA address space for the VFIO container.
     ● This IOVA address space is retrieved using a DMA map operation.
     ● PASID-aware versions of iommu map/unmap are added to the iommu operations.
         /* Actually, ‘io_mm’ denotes an address space and it includes a PASID */
         struct iommu_ops {
             …
             int (*map)(struct iommu_domain *domain, unsigned long iova,
                        phys_addr_t paddr, size_t size, int prot);
             size_t (*unmap)(struct iommu_domain *domain, unsigned long iova,
                             size_t size);
         +   int (*sva_map)(struct iommu_domain *domain, struct io_mm *io_mm,
         +                  unsigned long iova, phys_addr_t paddr, size_t size,
         +                  int prot);
         +   size_t (*sva_unmap)(struct iommu_domain *domain, struct io_mm *io_mm,
         +                       unsigned long iova, size_t size);
             …
         };
     ● May be merged with normal map/unmap in future.
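To make the two request numbers on slide 7 concrete, the sketch below shows how they are encoded and decoded with the standard ioctl macros from `<linux/ioctl.h>`. `VFIO_TYPE` and `VFIO_BASE` repeat the values used by the kernel's `<linux/vfio.h>`; the two `VFIO_IOMMU_*` definitions are copied from the slide and are part of the proposal, and the helpers `is_vfio_request` and `vfio_request_index` are hypothetical names added for illustration. The slide does not show what argument, if any, the ioctls take, so no call is attempted here.

```c
#include <assert.h>
#include <linux/ioctl.h>

/* Same values as in <linux/vfio.h>; repeated so the sketch builds alone. */
#ifndef VFIO_TYPE
#define VFIO_TYPE (';')
#endif
#ifndef VFIO_BASE
#define VFIO_BASE 100
#endif

/* Request numbers as proposed on the slide. */
#define VFIO_IOMMU_ATTACH _IO(VFIO_TYPE, VFIO_BASE + 24)
#define VFIO_IOMMU_DETACH _IO(VFIO_TYPE, VFIO_BASE + 25)

/* Hypothetical helper: non-zero if req lies in the VFIO ioctl namespace. */
static int is_vfio_request(unsigned long req)
{
    return _IOC_TYPE(req) == VFIO_TYPE && _IOC_NR(req) >= VFIO_BASE;
}

/* Hypothetical helper: command index relative to VFIO_BASE. */
static unsigned int vfio_request_index(unsigned long req)
{
    return (unsigned int)(_IOC_NR(req) - VFIO_BASE);
}
```

Because both requests are built with `_IO`, no argument size is encoded in the number itself; any argument layout would have to come from the patch set rather than from the request code.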
  8. 8. Extend standard interfaces
     ● A pointer to the parent device iommu_group is added to the vfio_group.
         struct vfio_group {
             struct iommu_group *iommu_group;
         +   /* iommu_group of mdev's parent device */
         +   struct iommu_group *parent_group;
             struct list_head next;
         };
     ● A Wdev enabling method is added in ‘vfio_iommu_type1_attach_group’.
         mdev_bus = symbol_get(mdev_bus_type);
         if (mdev_bus) {
             if ((bus == mdev_bus) && !iommu_present(bus)) {
                 …
         +       /* Check if it is wdev (wrapdrive device) and get parent device’s group, or go default logic */
         +       ret = iommu_group_for_each_dev(iommu_group, &pgroup, vfio_wdev_type);
                 …
         +       domain->domain = iommu_group_default_domain(pgroup);
         +       group->parent_group = pgroup;
                 …
         +       return 0;
         +   }
             if (!iommu->external_domain) {
     ● Normal Mediated Device handling is untouched (vGPU etc.)
  9. 9. Relationship with VFIO / MDEV
     Wrapdrive is moved into VFIO as VFIO_Wdev, and a hardware queue can be an Mdev if the device is a VFIO_Wdev.
     For an Mdev from VFIO_Wdev, VFIO operations are ultimately applied to the Mdev’s parent device.
     One device supports multiple processes without unbinding its driver.
     [Diagram labels: VFIO, VFIO_Mdev, VFIO_Wdev, IOMMU, DEV, Queue, Process, IO PGTBL, PGTBL, SVA Processes, DMA MAP, iommu_domain, Kernel IO PGTBL (PASID = 0)]
  10. 10. Example
      ● zlib acceleration on a Hisilicon D06 board (using a ZIP accelerator)
      Source code: Branch: wrapdrive-4.15-rc9
      [Diagram labels: ZIP K_DRV, VFIO_WDEV, vfio_wdev_register, VFIO, ZIP U_DRV, U_WD, Test Sample]
      1. Synchronous and asynchronous mode APIs of WD are compared.
      2. Multiple DMA map and SVA without IOPF scenarios are compared.
      3. Each scenario is run with 1 or 3 processes.
      4. A range of different packet lengths is used.
  11. 11. Zlib acceleration performance
      ● zlib throughput on a Hisilicon D06 board (using a ZIP accelerator)
      Source code: Branch: wrapdrive-4.15-rc9
      [Charts: axes labeled Mpps and Bytes]
      8 scenarios: the ZIP accelerator serves 1 or 3 processes, with SVA without IOPF or multiple DMA map, in synchronous or asynchronous mode.
  12. 12. Current Challenges
      ● A security risk remains, since several users work on one device.
        ○ A ‘queue’ is not isolated as strongly as a task in the OS.
      ● Wrapdrive queue management on VFIO Mdev is still awkward.
        ○ Creating an Mdev for WD needs root permission.
        ○ Getting/putting a WD queue involves a series of sysfs operations.
        ○ An Mdev held by a process cannot be released automatically if the process exits unexpectedly.
      ● Coexistence with Linux kernel Crypto / AF_ALG (controversial topic)
        ○ The ecosystem is ready: devices support PASID/PRI/ATS/ATC, and the corresponding software such as VFIO and SVA is in place.
        ○ User space should be able to use accelerators directly with high performance, now that devices can DMA to user space in a fast and secure way.
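The "series of sysfs operations" on slide 12 can be pictured with the generic mdev sysfs interface, where a device is created by writing a UUID to a `create` attribute under the parent device and destroyed by writing to its `remove` attribute. The sketch below only covers that generic layout, as described in the kernel's VFIO mediated device documentation. The helper names, the parent path and the type name used in the usage note are hypothetical, and the actual Wrapdrive attributes may differ.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build "<parent>/mdev_supported_types/<type>/create".
 * Returns 0 on success, -1 if the result does not fit in out.
 */
static int mdev_create_path(char *out, size_t cap,
                            const char *parent_sysfs, const char *type_id)
{
    int n = snprintf(out, cap, "%s/mdev_supported_types/%s/create",
                     parent_sysfs, type_id);
    return (n < 0 || (size_t)n >= cap) ? -1 : 0;
}

/* Build "/sys/bus/mdev/devices/<uuid>/remove". Same return convention. */
static int mdev_remove_path(char *out, size_t cap, const char *uuid)
{
    int n = snprintf(out, cap, "/sys/bus/mdev/devices/%s/remove", uuid);
    return (n < 0 || (size_t)n >= cap) ? -1 : 0;
}

/* Write a short string to a sysfs attribute. Returns 0 on success. */
static int sysfs_write(const char *path, const char *value)
{
    FILE *f = fopen(path, "w");
    int ok;

    if (f == NULL)
        return -1;              /* typically EACCES without root */
    ok = fputs(value, f) >= 0;
    if (fclose(f) != 0)         /* sysfs reports store errors on flush */
        ok = 0;
    return ok ? 0 : -1;
}
```

Creating a queue would then amount to building the `create` path for a parent such as a hypothetical `/sys/devices/wd0`, writing a freshly generated UUID to it with `sysfs_write`, and later writing `1` to the matching `remove` path. Each of these writes needs root, which is the inconvenience the slide points out.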
  13. 13. Plans
      ● SVA with I/O page fault (hardware dependent)
      ● Extend to other types of devices (e.g. NIC)
        ○ Benefits user space data plane applications (ODP/DPDK etc.)
      ● Optimize the VFIO_Mdev framework to better support the requirements of Wrapdrive.
  14. 14. Contributions/ack
  15. 15. Thank You! Any questions?