www.huawei.com
HUAWEI TECHNOLOGIES CO., LTD.
Make Accelerator Pluggable for
Container Engine
Author/ Email: Jiuyue Ma / majiuyue@huawei.com
Version: V1.0 (20150504)

Today’s Topic
Container Engine
Hardware Accelerators (e.g. GPU, QAT, FPGA)
  OLTP/OLAP BigData Processing
  Neural Network
  CloudRAN
Pluggable Accelerator Support for Container Engine

Agenda
• Why use accelerators in containers, and how?
• Key problems to deal with
  - Identify accelerator requirements
  - Prepare accelerator runtimes
  - Manage accelerator resources
• Further development

Why Container?

Why Container? Why NOT Container?
* Raphael Da Silva. OPEN SOURCE TECHNOLOGY: Docker Containers on IBM Bluemix. 2015.

Why Accelerator?
• CPU is not fast/effective/… enough
• Heterogeneous computing is the future
  - Nvidia GPU compute, Intel Skylake Xeon+FPGA, AWS P2/F1 instances
  - Use the right processor, in the right place, at the right time
Container, a.k.a. Docker
[baidu, HotChips 2016]

Google it! “Use XXX in docker”
NVIDIA’s GPU-Docker Solution
DPDK
FPGA
GPU

Use GPU in Docker (from nvidia-docker)
Key Problem: VERSION MISMATCH
$ ldconfig -p | grep -E 'nvidia|cuda'
libnvidia-ml.so (libc6,x86-64) => /usr/lib/nvidia-361/libnvidia-ml.so
libnvidia-glcore.so.361.48 (libc6,x86-64) => /usr/lib/nvidia-361/lib…
libnvidia-compiler.so.361.48 (libc6,x86-64) => /usr/lib/nvidia-361/lib…
libcuda.so (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libcuda.so
...
$ lsmod | grep nvidia
nvidia_uvm 711531 2
nvidia_modeset 742329 0
nvidia 10058469 80 nvidia_modeset,nvidia_uvm
$ nvidia-smi
Failed to initialize NVML: Driver/library version mismatch
① Labeled Image
LABEL com.nvidia.volumes.needed="nvidia_driver"
LABEL com.nvidia.cuda.version="7.5"
② VOLUME: nvidia-docker/nvidia_driver_361.48
③ nvidia-docker cli-wrapper
• detect/check cuda version
• hook docker cli args: --volume-driver / --volume nvidia
Wrapped Command
docker run --volume-driver=nvidia-docker --volume=nvidia_driver_361.48:/usr/local/nvidia:ro
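The three steps on this slide (labeled image, driver volume, CLI wrapper) can be sketched roughly as follows. This is an illustration only, not nvidia-docker's actual implementation: the function name `wrap_docker_run`, passing the labels in as a dictionary, and passing the driver version as a string are assumptions made for the example. Only the label key, the volume naming pattern, and the two injected arguments come from the slide.

```python
from typing import Dict, List

# Label key taken from the slide; everything else is a hypothetical sketch.
VOLUMES_NEEDED = "com.nvidia.volumes.needed"


def wrap_docker_run(image_labels: Dict[str, str], driver_version: str,
                    user_args: List[str]) -> List[str]:
    """Build a `docker run` argv, injecting the driver volume if the image asks for it."""
    argv = ["docker", "run"]
    if image_labels.get(VOLUMES_NEEDED) == "nvidia_driver":
        # The volume name embeds the host driver version so that the
        # userspace libraries match the loaded kernel module.
        volume = f"nvidia_driver_{driver_version}"
        argv += [
            "--volume-driver=nvidia-docker",
            f"--volume={volume}:/usr/local/nvidia:ro",
        ]
    return argv + user_args


labels = {VOLUMES_NEEDED: "nvidia_driver", "com.nvidia.cuda.version": "7.5"}
print(" ".join(wrap_docker_run(labels, "361.48", ["my/cuda-image"])))
```

An image without the label passes through unchanged, which is the behaviour a wrapper of this kind would need in order to stay transparent for non-GPU containers.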

Do we really need “nvidia-docker”?
If so, we also need:
• dpdk-docker, qat-docker, fpga-docker, …
But what if we need both GPU and DPDK?
Pluggable Accelerator (nvidia, DPDK, QAT, FPGA)

Pluggable Accelerator, How?
• What’s the difference?
Device Files (--device)
  nvidia: /dev/nvidia0
          /dev/nvidia-smi
          /dev/nvidia-uvm
  DPDK:   /dev/uio0
  FPGA:   /dev/fpga0
File Bindings (-v src:dest)
  nvidia: /usr/lib/nvidia-361/libnvidia-ml.so
          /usr/lib/nvidia-361/libnvidia-glcore.so.361.48
          /usr/lib/nvidia-361/libnvidia-compiler.so.361.48
          /usr/lib/x86_64-linux-gnu/libcuda.so
          ...
  DPDK:   /sys/bus/pci/devices/xxxx:xx:xx.x
          /dev/hugepages
  FPGA:   /usr/lib/libfpga.so
          /sys/bus/pci/devices/xxxx:xx:xx.x
          /dev/hugepages
Environments (-e env)
  nvidia: PATH=$PATH:/usr/lib/nvidia-361/bin
          LD_LIBRARY_PATH=…
Accelerator Plugin Interface
  Plugins: nvidia, DPDK, FPGA, in-house FPGA Accel
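Since every accelerator on this slide reduces to the same three kinds of input (device files, file bindings, environment variables), a plugin interface could hand the engine one common structure. The sketch below assumes a hypothetical `AccelSpec` type and `merge` helper, neither of which is named in the talk; `--device`, `-v` and `-e` are the standard `docker run` flags.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class AccelSpec:
    """What one accelerator plugin asks the engine to inject (hypothetical structure)."""
    devices: List[str] = field(default_factory=list)            # device files
    binds: List[Tuple[str, str]] = field(default_factory=list)  # (src, dest) file bindings
    env: Dict[str, str] = field(default_factory=dict)           # environment variables

    def to_docker_args(self) -> List[str]:
        """Render the spec as `docker run` arguments."""
        args: List[str] = []
        for dev in self.devices:
            args += ["--device", dev]
        for src, dest in self.binds:
            args += ["-v", f"{src}:{dest}"]
        for key, value in self.env.items():
            args += ["-e", f"{key}={value}"]
        return args


def merge(specs: List[AccelSpec]) -> AccelSpec:
    """Combine several plugins' specs, e.g. when one container needs two accelerators."""
    merged = AccelSpec()
    for spec in specs:
        merged.devices += [d for d in spec.devices if d not in merged.devices]
        merged.binds += [b for b in spec.binds if b not in merged.binds]
        merged.env.update(spec.env)
    return merged


dpdk = AccelSpec(devices=["/dev/uio0"],
                 binds=[("/dev/hugepages", "/dev/hugepages")])
fpga = AccelSpec(devices=["/dev/fpga0"],
                 binds=[("/usr/lib/libfpga.so", "/usr/lib/libfpga.so"),
                        ("/dev/hugepages", "/dev/hugepages")])
print(merge([dpdk, fpga]).to_docker_args())
```

Merging specs is what a per-vendor wrapper cannot do: with a shared structure, a container that needs two accelerators simply gets the union of both plugins' requests, with the shared `/dev/hugepages` binding mounted once.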

Implementation (1/3): IDENTIFY
• Accelerator requirements
  - We use the function provided by a device, not the device itself
  - Identified by runtime instead of by device
  - e.g. “a SHA256 accelerator” instead of “an FPGA with XXXX-bitstream”
• Identify user requirements
  - Integrate into the image:
      LABEL runtime gpu0=cuda:7.5
  - Specify when starting the container (cli / rest-api):
      $ docker run --accel fpga:crypto huawei/crypto-image
• Identify device capabilities
  - Identified by the “accelerator plugin”
  - Reported to the engine (dockerd) through the plugin API
Device        Accel Runtime
/dev/nvidia0  cuda:7.5, cuda:8.0, OpenCL 2.0, …
/dev/qat0     md5, sha256, aes, …
/dev/fpga0    crypto, compress, decode, …
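Matching a requested runtime against the capabilities that plugins report could look roughly like this. The table contents are copied from the slide; the `find_device` function, the first-match policy, and the `in_use` set are assumptions made for the sketch, since the talk does not describe the matching logic.

```python
from typing import Dict, Optional, Set

# Capabilities as plugins might report them to the engine (values from the slide).
CAPABILITIES: Dict[str, Set[str]] = {
    "/dev/nvidia0": {"cuda:7.5", "cuda:8.0", "OpenCL 2.0"},
    "/dev/qat0": {"md5", "sha256", "aes"},
    "/dev/fpga0": {"crypto", "compress", "decode"},
}


def find_device(runtime: str, in_use: Set[str]) -> Optional[str]:
    """Return the first free device whose reported runtimes include `runtime`."""
    for device, runtimes in CAPABILITIES.items():
        if runtime in runtimes and device not in in_use:
            return device
    return None  # no device can serve this requirement right now


print(find_device("sha256", in_use=set()))
print(find_device("crypto", in_use={"/dev/fpga0"}))
```

The caller asks for "a SHA256 accelerator" and never names a device, which is the point of identifying requirements by runtime.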

Implementation (2/3): PREPARE
• Select a plugin from the accelerator requirements
• Device initialization by the plugin
• Dynamically inject the required libraries & devices
Plugins and what each provides:
  DPDK-plugin: eal.so/…, igb_uio.ko, /dev/uioX
  FPGA-plugin: fpga.so, fpga.ko, /dev/fpgaX
  GPU-plugin:  gpu.so / cuda.so, gpu.ko, /dev/gpuX
  QAT-plugin:  qat.so, qat.ko, /dev/qatX
[Diagram labels: PaaS, accel requirements, dockerd, Host, container, Application, libacc, fpga.ko v1, fpga.so v1, /dev/fpga0, generate, bind; steps: (2) mount userspace lib, (3) mount device]
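The prepare stage (select a plugin, let it initialize the device, collect what must be injected) might be sketched as below. All names here (`Injection`, `FpgaPlugin`, `PLUGINS`, `prepare_for`) are hypothetical, and splitting a requirement such as `fpga:crypto` into a plugin name and a runtime is an assumption about the format; the file names follow the FPGA-plugin box on the slide.

```python
from typing import Dict, List, NamedTuple


class Injection(NamedTuple):
    libraries: List[str]  # userspace libraries to mount into the container
    devices: List[str]    # device nodes to expose to the container


class FpgaPlugin:
    """Hypothetical plugin modelled on the FPGA-plugin box."""

    def __init__(self) -> None:
        self.initialized = False

    def init_device(self) -> None:
        # A real plugin would load fpga.ko and set up the device here.
        self.initialized = True

    def prepare(self, index: int) -> Injection:
        if not self.initialized:
            self.init_device()
        return Injection(libraries=["fpga.so"], devices=[f"/dev/fpga{index}"])


PLUGINS: Dict[str, FpgaPlugin] = {"fpga": FpgaPlugin()}


def prepare_for(requirement: str) -> Injection:
    """Select a plugin from a requirement like 'fpga:crypto' and let it prepare."""
    plugin_name, _, _runtime = requirement.partition(":")
    return PLUGINS[plugin_name].prepare(index=0)


print(prepare_for("fpga:crypto"))
```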

Implementation (3/3): MANAGEMENT
• Accelerator Life-Cycle
  - Allocate → Prepare → Release
  - Plugin hooks for each stage
• Integrate with Container Life-Cycle
  - Persistent: Create ↔ Delete
  - Non-persistent: Start ↔ Stop
[Diagram: container life-cycle states INIT, Creating, Created, Running, Stopped, Deleting; accelerator plugin actions: parse accel requirements, allocate persistent, allocate non-persistent, prepare, release non-persistent, release persistent]
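How plugin hooks could attach to the container life-cycle is sketched below: persistent resources are tied to create/delete and non-persistent ones to start/stop, as the slide states. The class and method names are hypothetical, and calling `prepare` at start time is an assumption, since the slide does not say exactly when that hook fires.

```python
from typing import List


class RecordingPlugin:
    """Hypothetical plugin that only records which hooks were called."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    def allocate(self, persistent: bool) -> None:
        self.calls.append("allocate " + ("persistent" if persistent else "non-persistent"))

    def prepare(self) -> None:
        self.calls.append("prepare")

    def release(self, persistent: bool) -> None:
        self.calls.append("release " + ("persistent" if persistent else "non-persistent"))


class Container:
    """Minimal container life-cycle that invokes accelerator hooks at each transition."""

    def __init__(self, plugin: RecordingPlugin) -> None:
        self.plugin = plugin
        self.state = "INIT"

    def create(self) -> None:
        self.plugin.allocate(persistent=True)    # lives until delete
        self.state = "Created"

    def start(self) -> None:
        self.plugin.allocate(persistent=False)   # lives until stop
        self.plugin.prepare()                    # assumed to run at start
        self.state = "Running"

    def stop(self) -> None:
        self.plugin.release(persistent=False)
        self.state = "Stopped"

    def delete(self) -> None:
        self.plugin.release(persistent=True)
        self.state = "INIT"


plugin = RecordingPlugin()
c = Container(plugin)
c.create(); c.start(); c.stop(); c.delete()
print(plugin.calls)
```

A stopped container keeps its persistent allocation, so restarting it only repeats the non-persistent allocate and prepare steps.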

Example Flow

Further Developments
• Standardize “Accelerator Runtime”
  - The OCI Runtime/Image Spec should be a good place
• Integrate with Docker Swarm / K8S / etc.
  - Cluster-level accelerator scheduling & management
• More features
  - Accelerator Sharing
  - Accelerator Exception Interface
  - Accelerator Hot-Plug
  - …
Copyright© 2011 Huawei Technologies Co., Ltd. All Rights Reserved.
The information in this document may contain predictive statements including, without limitation, statements regarding the future
financial and operating results, future product portfolio, new technology, etc. There are a number of factors that could cause actual
results and developments to differ materially from those expressed or implied in the predictive statements. Therefore, such information
is provided for reference purpose only and constitutes neither an offer nor an acceptance. Huawei may change the information at any
time without notice.
Thank you
www.huawei.com

More Related Content

What's hot

[20200720]cloud native develoment - Nelson Lin
[20200720]cloud native develoment - Nelson Lin[20200720]cloud native develoment - Nelson Lin
[20200720]cloud native develoment - Nelson Lin
HanLing Shen
 
OCI Support in Mesos
OCI Support in MesosOCI Support in Mesos
Containers for the Enterprise: Delivering OpenShift on OpenStack for Performa...
Containers for the Enterprise: Delivering OpenShift on OpenStack for Performa...Containers for the Enterprise: Delivering OpenShift on OpenStack for Performa...
Containers for the Enterprise: Delivering OpenShift on OpenStack for Performa...
Stephen Gordon
 
Scale Kubernetes to support 50000 services
Scale Kubernetes to support 50000 servicesScale Kubernetes to support 50000 services
Scale Kubernetes to support 50000 services
LinuxCon ContainerCon CloudOpen China
 
Introduction to CNI (Container Network Interface)
Introduction to CNI (Container Network Interface)Introduction to CNI (Container Network Interface)
Introduction to CNI (Container Network Interface)
HungWei Chiu
 
Running Legacy Applications with Containers
Running Legacy Applications with ContainersRunning Legacy Applications with Containers
Running Legacy Applications with Containers
LinuxCon ContainerCon CloudOpen China
 
OpenNebulaConf 2016 - OpenNebula, a story about flexibility and technological...
OpenNebulaConf 2016 - OpenNebula, a story about flexibility and technological...OpenNebulaConf 2016 - OpenNebula, a story about flexibility and technological...
OpenNebulaConf 2016 - OpenNebula, a story about flexibility and technological...
OpenNebula Project
 
Unikernelized Linux
Unikernelized LinuxUnikernelized Linux
Linuxcon secureefficientcontainerimagemanagementharbor
Linuxcon secureefficientcontainerimagemanagementharborLinuxcon secureefficientcontainerimagemanagementharbor
Linuxcon secureefficientcontainerimagemanagementharbor
LinuxCon ContainerCon CloudOpen China
 
Kubernetes 1001
Kubernetes 1001Kubernetes 1001
Kubernetes 1001
HungWei Chiu
 
OpenNebulaConf 2016 - Budgeting: the Ugly Duckling of Cloud computing? by Mat...
OpenNebulaConf 2016 - Budgeting: the Ugly Duckling of Cloud computing? by Mat...OpenNebulaConf 2016 - Budgeting: the Ugly Duckling of Cloud computing? by Mat...
OpenNebulaConf 2016 - Budgeting: the Ugly Duckling of Cloud computing? by Mat...
OpenNebula Project
 
Containerize ovs ovn components
Containerize ovs ovn componentsContainerize ovs ovn components
Containerize ovs ovn components
Aliasgar Ginwala
 
OpenNebula Conf 2014: CentOS, QA an OpenNebula - Christoph Galuschka
OpenNebula Conf 2014: CentOS, QA an OpenNebula - Christoph GaluschkaOpenNebula Conf 2014: CentOS, QA an OpenNebula - Christoph Galuschka
OpenNebula Conf 2014: CentOS, QA an OpenNebula - Christoph Galuschka
NETWAYS
 
Application-Based Routing
Application-Based RoutingApplication-Based Routing
Application-Based Routing
HungWei Chiu
 
DCSF 19 Online Feature Extraction and Event Generation for Computer-Animal In...
DCSF 19 Online Feature Extraction and Event Generation for Computer-Animal In...DCSF 19 Online Feature Extraction and Event Generation for Computer-Animal In...
DCSF 19 Online Feature Extraction and Event Generation for Computer-Animal In...
Docker, Inc.
 
Leveraging the Power of containerd Events - Evan Hazlett
Leveraging the Power of containerd Events - Evan HazlettLeveraging the Power of containerd Events - Evan Hazlett
Leveraging the Power of containerd Events - Evan Hazlett
Docker, Inc.
 
Container Orchestration from Theory to Practice
Container Orchestration from Theory to PracticeContainer Orchestration from Theory to Practice
Container Orchestration from Theory to Practice
Docker, Inc.
 
OpenShift v3 Internal networking details
OpenShift v3 Internal networking detailsOpenShift v3 Internal networking details
OpenShift v3 Internal networking details
Etsuji Nakai
 
Virtualization inside kubernetes
Virtualization inside kubernetesVirtualization inside kubernetes
Virtualization inside kubernetes
inwin stack
 
Linux Kernel Development
Linux Kernel DevelopmentLinux Kernel Development
Linux Kernel Development
LinuxCon ContainerCon CloudOpen China
 

What's hot (20)

[20200720]cloud native develoment - Nelson Lin
[20200720]cloud native develoment - Nelson Lin[20200720]cloud native develoment - Nelson Lin
[20200720]cloud native develoment - Nelson Lin
 
OCI Support in Mesos
OCI Support in MesosOCI Support in Mesos
OCI Support in Mesos
 
Containers for the Enterprise: Delivering OpenShift on OpenStack for Performa...
Containers for the Enterprise: Delivering OpenShift on OpenStack for Performa...Containers for the Enterprise: Delivering OpenShift on OpenStack for Performa...
Containers for the Enterprise: Delivering OpenShift on OpenStack for Performa...
 
Scale Kubernetes to support 50000 services
Scale Kubernetes to support 50000 servicesScale Kubernetes to support 50000 services
Scale Kubernetes to support 50000 services
 
Introduction to CNI (Container Network Interface)
Introduction to CNI (Container Network Interface)Introduction to CNI (Container Network Interface)
Introduction to CNI (Container Network Interface)
 
Running Legacy Applications with Containers
Running Legacy Applications with ContainersRunning Legacy Applications with Containers
Running Legacy Applications with Containers
 
OpenNebulaConf 2016 - OpenNebula, a story about flexibility and technological...
OpenNebulaConf 2016 - OpenNebula, a story about flexibility and technological...OpenNebulaConf 2016 - OpenNebula, a story about flexibility and technological...
OpenNebulaConf 2016 - OpenNebula, a story about flexibility and technological...
 
Unikernelized Linux
Unikernelized LinuxUnikernelized Linux
Unikernelized Linux
 
Linuxcon secureefficientcontainerimagemanagementharbor
Linuxcon secureefficientcontainerimagemanagementharborLinuxcon secureefficientcontainerimagemanagementharbor
Linuxcon secureefficientcontainerimagemanagementharbor
 
Kubernetes 1001
Kubernetes 1001Kubernetes 1001
Kubernetes 1001
 
OpenNebulaConf 2016 - Budgeting: the Ugly Duckling of Cloud computing? by Mat...
OpenNebulaConf 2016 - Budgeting: the Ugly Duckling of Cloud computing? by Mat...OpenNebulaConf 2016 - Budgeting: the Ugly Duckling of Cloud computing? by Mat...
OpenNebulaConf 2016 - Budgeting: the Ugly Duckling of Cloud computing? by Mat...
 
Containerize ovs ovn components
Containerize ovs ovn componentsContainerize ovs ovn components
Containerize ovs ovn components
 
OpenNebula Conf 2014: CentOS, QA an OpenNebula - Christoph Galuschka
OpenNebula Conf 2014: CentOS, QA an OpenNebula - Christoph GaluschkaOpenNebula Conf 2014: CentOS, QA an OpenNebula - Christoph Galuschka
OpenNebula Conf 2014: CentOS, QA an OpenNebula - Christoph Galuschka
 
Application-Based Routing
Application-Based RoutingApplication-Based Routing
Application-Based Routing
 
DCSF 19 Online Feature Extraction and Event Generation for Computer-Animal In...
DCSF 19 Online Feature Extraction and Event Generation for Computer-Animal In...DCSF 19 Online Feature Extraction and Event Generation for Computer-Animal In...
DCSF 19 Online Feature Extraction and Event Generation for Computer-Animal In...
 
Leveraging the Power of containerd Events - Evan Hazlett
Leveraging the Power of containerd Events - Evan HazlettLeveraging the Power of containerd Events - Evan Hazlett
Leveraging the Power of containerd Events - Evan Hazlett
 
Container Orchestration from Theory to Practice
Container Orchestration from Theory to PracticeContainer Orchestration from Theory to Practice
Container Orchestration from Theory to Practice
 
OpenShift v3 Internal networking details
OpenShift v3 Internal networking detailsOpenShift v3 Internal networking details
OpenShift v3 Internal networking details
 
Virtualization inside kubernetes
Virtualization inside kubernetesVirtualization inside kubernetes
Virtualization inside kubernetes
 
Linux Kernel Development
Linux Kernel DevelopmentLinux Kernel Development
Linux Kernel Development
 

Viewers also liked

From Resilient to Antifragile Chaos Engineering Primer
From Resilient to Antifragile Chaos Engineering PrimerFrom Resilient to Antifragile Chaos Engineering Primer
From Resilient to Antifragile Chaos Engineering Primer
LinuxCon ContainerCon CloudOpen China
 
Secure Containers with EPT Isolation
Secure Containers with EPT IsolationSecure Containers with EPT Isolation
Secure Containers with EPT Isolation
LinuxCon ContainerCon CloudOpen China
 
Open Source Software Business Models Redux
Open Source Software Business Models ReduxOpen Source Software Business Models Redux
Open Source Software Business Models Redux
LinuxCon ContainerCon CloudOpen China
 
Rebuild - Simplifying Embedded and IoT Development Using Linux Containers
Rebuild - Simplifying Embedded and IoT Development Using Linux ContainersRebuild - Simplifying Embedded and IoT Development Using Linux Containers
Rebuild - Simplifying Embedded and IoT Development Using Linux Containers
LinuxCon ContainerCon CloudOpen China
 
GPU Acceleration for Containers on Intel Processor Graphics
GPU Acceleration for Containers on Intel Processor GraphicsGPU Acceleration for Containers on Intel Processor Graphics
GPU Acceleration for Containers on Intel Processor Graphics
LinuxCon ContainerCon CloudOpen China
 
Status of Embedded Linux
Status of Embedded LinuxStatus of Embedded Linux
Status of Embedded Linux
LinuxCon ContainerCon CloudOpen China
 
Libvirt API Certification
Libvirt API CertificationLibvirt API Certification
Libvirt API Certification
LinuxCon ContainerCon CloudOpen China
 
LiteOS
LiteOS LiteOS
High Performance Linux Virtual Machine on Microsoft Azure: SR-IOV Networking ...
High Performance Linux Virtual Machine on Microsoft Azure: SR-IOV Networking ...High Performance Linux Virtual Machine on Microsoft Azure: SR-IOV Networking ...
High Performance Linux Virtual Machine on Microsoft Azure: SR-IOV Networking ...
LinuxCon ContainerCon CloudOpen China
 
kdump: usage and_internals
kdump: usage and_internalskdump: usage and_internals
kdump: usage and_internals
LinuxCon ContainerCon CloudOpen China
 
Practical CNI
Practical CNIPractical CNI
点融网区块链即服务实践 - The Practice of Blockchain as a Service in Dianrong
点融网区块链即服务实践 - The Practice of Blockchain as a Service in Dianrong点融网区块链即服务实践 - The Practice of Blockchain as a Service in Dianrong
点融网区块链即服务实践 - The Practice of Blockchain as a Service in Dianrong
LinuxCon ContainerCon CloudOpen China
 
See what happened with real time kvm when building real time cloud pezhang@re...
See what happened with real time kvm when building real time cloud pezhang@re...See what happened with real time kvm when building real time cloud pezhang@re...
See what happened with real time kvm when building real time cloud pezhang@re...
LinuxCon ContainerCon CloudOpen China
 
Zephyr: Creating a Best-of-Breed, Secure RTOS for IoT
Zephyr: Creating a Best-of-Breed, Secure RTOS for IoTZephyr: Creating a Best-of-Breed, Secure RTOS for IoT
Zephyr: Creating a Best-of-Breed, Secure RTOS for IoT
LinuxCon ContainerCon CloudOpen China
 
Policy-based Resource Placement
Policy-based Resource PlacementPolicy-based Resource Placement
Policy-based Resource Placement
LinuxCon ContainerCon CloudOpen China
 
OpenDaylight OpenStack Integration
OpenDaylight OpenStack IntegrationOpenDaylight OpenStack Integration
OpenDaylight OpenStack Integration
LinuxCon ContainerCon CloudOpen China
 
Hyperledger Technical Community in China.
Hyperledger Technical Community in China. Hyperledger Technical Community in China.
Hyperledger Technical Community in China.
LinuxCon ContainerCon CloudOpen China
 
Obstacles & Solutions for Livepatch Support on ARM64 Architecture
Obstacles & Solutions for Livepatch Support on ARM64 ArchitectureObstacles & Solutions for Livepatch Support on ARM64 Architecture
Obstacles & Solutions for Livepatch Support on ARM64 Architecture
LinuxCon ContainerCon CloudOpen China
 
Building a Better Thermostat
Building a Better ThermostatBuilding a Better Thermostat
Building a Better Thermostat
LinuxCon ContainerCon CloudOpen China
 

Viewers also liked (19)

From Resilient to Antifragile Chaos Engineering Primer
From Resilient to Antifragile Chaos Engineering PrimerFrom Resilient to Antifragile Chaos Engineering Primer
From Resilient to Antifragile Chaos Engineering Primer
 
Secure Containers with EPT Isolation
Secure Containers with EPT IsolationSecure Containers with EPT Isolation
Secure Containers with EPT Isolation
 
Open Source Software Business Models Redux
Open Source Software Business Models ReduxOpen Source Software Business Models Redux
Open Source Software Business Models Redux
 
Rebuild - Simplifying Embedded and IoT Development Using Linux Containers
Rebuild - Simplifying Embedded and IoT Development Using Linux ContainersRebuild - Simplifying Embedded and IoT Development Using Linux Containers
Rebuild - Simplifying Embedded and IoT Development Using Linux Containers
 
GPU Acceleration for Containers on Intel Processor Graphics
GPU Acceleration for Containers on Intel Processor GraphicsGPU Acceleration for Containers on Intel Processor Graphics
GPU Acceleration for Containers on Intel Processor Graphics
 
Status of Embedded Linux
Status of Embedded LinuxStatus of Embedded Linux
Status of Embedded Linux
 
Libvirt API Certification
Libvirt API CertificationLibvirt API Certification
Libvirt API Certification
 
LiteOS
LiteOS LiteOS
LiteOS
 
High Performance Linux Virtual Machine on Microsoft Azure: SR-IOV Networking ...
High Performance Linux Virtual Machine on Microsoft Azure: SR-IOV Networking ...High Performance Linux Virtual Machine on Microsoft Azure: SR-IOV Networking ...
High Performance Linux Virtual Machine on Microsoft Azure: SR-IOV Networking ...
 
kdump: usage and_internals
kdump: usage and_internalskdump: usage and_internals
kdump: usage and_internals
 
Practical CNI
Practical CNIPractical CNI
Practical CNI
 
点融网区块链即服务实践 - The Practice of Blockchain as a Service in Dianrong
点融网区块链即服务实践 - The Practice of Blockchain as a Service in Dianrong点融网区块链即服务实践 - The Practice of Blockchain as a Service in Dianrong
点融网区块链即服务实践 - The Practice of Blockchain as a Service in Dianrong
 
See what happened with real time kvm when building real time cloud pezhang@re...
See what happened with real time kvm when building real time cloud pezhang@re...See what happened with real time kvm when building real time cloud pezhang@re...
See what happened with real time kvm when building real time cloud pezhang@re...
 
Zephyr: Creating a Best-of-Breed, Secure RTOS for IoT
Zephyr: Creating a Best-of-Breed, Secure RTOS for IoTZephyr: Creating a Best-of-Breed, Secure RTOS for IoT
Zephyr: Creating a Best-of-Breed, Secure RTOS for IoT
 
Policy-based Resource Placement
Policy-based Resource PlacementPolicy-based Resource Placement
Policy-based Resource Placement
 
OpenDaylight OpenStack Integration
OpenDaylight OpenStack IntegrationOpenDaylight OpenStack Integration
OpenDaylight OpenStack Integration
 
Hyperledger Technical Community in China.
Hyperledger Technical Community in China. Hyperledger Technical Community in China.
Hyperledger Technical Community in China.
 
Obstacles & Solutions for Livepatch Support on ARM64 Architecture
Obstacles & Solutions for Livepatch Support on ARM64 ArchitectureObstacles & Solutions for Livepatch Support on ARM64 Architecture
Obstacles & Solutions for Livepatch Support on ARM64 Architecture
 
Building a Better Thermostat
Building a Better ThermostatBuilding a Better Thermostat
Building a Better Thermostat
 

Similar to Make Accelerator Pluggable for Container Engine

State of Containers and the Convergence of HPC and BigData
State of Containers and the Convergence of HPC and BigDataState of Containers and the Convergence of HPC and BigData
State of Containers and the Convergence of HPC and BigData
inside-BigData.com
 
Instant scaling and deployment of Vitis Libraries on Alveo clusters using InA...
Instant scaling and deployment of Vitis Libraries on Alveo clusters using InA...Instant scaling and deployment of Vitis Libraries on Alveo clusters using InA...
Instant scaling and deployment of Vitis Libraries on Alveo clusters using InA...
Christoforos Kachris
 
CA Performance Manager Agility by using Docker Containers for Network Manag...
CA Performance Manager Agility by using Docker Containers for Network Manag...CA Performance Manager Agility by using Docker Containers for Network Manag...
CA Performance Manager Agility by using Docker Containers for Network Manag...
CA Technologies
 
DCSF 19 Accelerating Docker Containers with NVIDIA GPUs
DCSF 19 Accelerating Docker Containers with NVIDIA GPUsDCSF 19 Accelerating Docker Containers with NVIDIA GPUs
DCSF 19 Accelerating Docker Containers with NVIDIA GPUs
Docker, Inc.
 
Delivering Docker & K3s worloads to IoT Edge devices
Delivering Docker & K3s worloads to IoT Edge devicesDelivering Docker & K3s worloads to IoT Edge devices
Delivering Docker & K3s worloads to IoT Edge devices
Ajeet Singh Raina
 
Why you’re going to fail running java on docker!
Why you’re going to fail running java on docker!Why you’re going to fail running java on docker!
Why you’re going to fail running java on docker!
Red Hat Developers
 
No more Dockerfiles? Buildpacks to help you ship your image!
No more Dockerfiles? Buildpacks to help you ship your image!No more Dockerfiles? Buildpacks to help you ship your image!
No more Dockerfiles? Buildpacks to help you ship your image!
Anthony Dahanne
 
Ceph Day Shanghai - Hyper Converged PLCloud with Ceph
Ceph Day Shanghai - Hyper Converged PLCloud with Ceph Ceph Day Shanghai - Hyper Converged PLCloud with Ceph
Ceph Day Shanghai - Hyper Converged PLCloud with Ceph
Ceph Community
 
Dayta AI Seminar - Kubernetes, Docker and AI on Cloud
Dayta AI Seminar - Kubernetes, Docker and AI on CloudDayta AI Seminar - Kubernetes, Docker and AI on Cloud
Dayta AI Seminar - Kubernetes, Docker and AI on Cloud
Jung-Hong Kim
 
Develop QNAP NAS App by Docker
Develop QNAP NAS App by DockerDevelop QNAP NAS App by Docker
Develop QNAP NAS App by Docker
Terry Chen
 
DevFest 2022 - Cloud Workstation Introduction TaiChung
DevFest 2022 - Cloud Workstation Introduction TaiChungDevFest 2022 - Cloud Workstation Introduction TaiChung
DevFest 2022 - Cloud Workstation Introduction TaiChung
KAI CHU CHUNG
 
Simplifying and accelerating converged media with Open Visual Cloud
Simplifying and accelerating converged media with Open Visual CloudSimplifying and accelerating converged media with Open Visual Cloud
Simplifying and accelerating converged media with Open Visual Cloud
Liz Warner
 
Node.js, Vagrant, Chef, and Mathoid @ Benetech
Node.js, Vagrant, Chef, and Mathoid @ BenetechNode.js, Vagrant, Chef, and Mathoid @ Benetech
Node.js, Vagrant, Chef, and Mathoid @ Benetech
Christopher Bumgardner
 
Introduction to Docker
Introduction to DockerIntroduction to Docker
Introduction to Docker
Nissan Dookeran
 
Docker engine - Indroduc
Docker engine - IndroducDocker engine - Indroduc
Docker engine - Indroduc
Al Gifari
 
Automate drupal deployments with linux containers, docker and vagrant
Automate drupal deployments with linux containers, docker and vagrant Automate drupal deployments with linux containers, docker and vagrant
Automate drupal deployments with linux containers, docker and vagrant
Ricardo Amaro
 
Building a data warehouse with Pentaho and Docker
Building a data warehouse with Pentaho and DockerBuilding a data warehouse with Pentaho and Docker
Building a data warehouse with Pentaho and Docker
Wellington Marinho
 
Containerizing GPU Applications with Docker for Scaling to the Cloud
Containerizing GPU Applications with Docker for Scaling to the CloudContainerizing GPU Applications with Docker for Scaling to the Cloud
Containerizing GPU Applications with Docker for Scaling to the Cloud
Subbu Rama
 
Scaleable PHP Applications in Kubernetes
Scaleable PHP Applications in KubernetesScaleable PHP Applications in Kubernetes
Scaleable PHP Applications in Kubernetes
Robert Lemke
 
Hands on Docker - Launch your own LEMP or LAMP stack - SunshinePHP
Hands on Docker - Launch your own LEMP or LAMP stack - SunshinePHPHands on Docker - Launch your own LEMP or LAMP stack - SunshinePHP
Hands on Docker - Launch your own LEMP or LAMP stack - SunshinePHP
Dana Luther
 

Similar to Make Accelerator Pluggable for Container Engine (20)

State of Containers and the Convergence of HPC and BigData
State of Containers and the Convergence of HPC and BigDataState of Containers and the Convergence of HPC and BigData
State of Containers and the Convergence of HPC and BigData
 
Instant scaling and deployment of Vitis Libraries on Alveo clusters using InA...
Instant scaling and deployment of Vitis Libraries on Alveo clusters using InA...Instant scaling and deployment of Vitis Libraries on Alveo clusters using InA...
Instant scaling and deployment of Vitis Libraries on Alveo clusters using InA...
 
CA Performance Manager Agility by using Docker Containers for Network Manag...
CA Performance Manager Agility by using Docker Containers for Network Manag...CA Performance Manager Agility by using Docker Containers for Network Manag...
CA Performance Manager Agility by using Docker Containers for Network Manag...
 
DCSF 19 Accelerating Docker Containers with NVIDIA GPUs
DCSF 19 Accelerating Docker Containers with NVIDIA GPUsDCSF 19 Accelerating Docker Containers with NVIDIA GPUs
DCSF 19 Accelerating Docker Containers with NVIDIA GPUs
 
Delivering Docker & K3s worloads to IoT Edge devices
Delivering Docker & K3s worloads to IoT Edge devicesDelivering Docker & K3s worloads to IoT Edge devices
Delivering Docker & K3s worloads to IoT Edge devices
 
Why you’re going to fail running java on docker!
Why you’re going to fail running java on docker!Why you’re going to fail running java on docker!
Why you’re going to fail running java on docker!
 
No more Dockerfiles? Buildpacks to help you ship your image!
No more Dockerfiles? Buildpacks to help you ship your image!No more Dockerfiles? Buildpacks to help you ship your image!
No more Dockerfiles? Buildpacks to help you ship your image!
 
Ceph Day Shanghai - Hyper Converged PLCloud with Ceph
Ceph Day Shanghai - Hyper Converged PLCloud with Ceph Ceph Day Shanghai - Hyper Converged PLCloud with Ceph
Ceph Day Shanghai - Hyper Converged PLCloud with Ceph
 
Dayta AI Seminar - Kubernetes, Docker and AI on Cloud
Dayta AI Seminar - Kubernetes, Docker and AI on CloudDayta AI Seminar - Kubernetes, Docker and AI on Cloud
Dayta AI Seminar - Kubernetes, Docker and AI on Cloud
 
Develop QNAP NAS App by Docker
Develop QNAP NAS App by DockerDevelop QNAP NAS App by Docker
Develop QNAP NAS App by Docker
 
DevFest 2022 - Cloud Workstation Introduction TaiChung
DevFest 2022 - Cloud Workstation Introduction TaiChungDevFest 2022 - Cloud Workstation Introduction TaiChung
DevFest 2022 - Cloud Workstation Introduction TaiChung
 
Simplifying and accelerating converged media with Open Visual Cloud
Simplifying and accelerating converged media with Open Visual CloudSimplifying and accelerating converged media with Open Visual Cloud
Simplifying and accelerating converged media with Open Visual Cloud
 
Node.js, Vagrant, Chef, and Mathoid @ Benetech
Node.js, Vagrant, Chef, and Mathoid @ BenetechNode.js, Vagrant, Chef, and Mathoid @ Benetech
Node.js, Vagrant, Chef, and Mathoid @ Benetech
 
Introduction to Docker
Introduction to DockerIntroduction to Docker
Introduction to Docker
 
Docker engine - Indroduc
Docker engine - IndroducDocker engine - Indroduc
Docker engine - Indroduc
 
Automate drupal deployments with linux containers, docker and vagrant
Automate drupal deployments with linux containers, docker and vagrant Automate drupal deployments with linux containers, docker and vagrant
Automate drupal deployments with linux containers, docker and vagrant
 
Building a data warehouse with Pentaho and Docker
Building a data warehouse with Pentaho and DockerBuilding a data warehouse with Pentaho and Docker
Building a data warehouse with Pentaho and Docker
 
Containerizing GPU Applications with Docker for Scaling to the Cloud
Containerizing GPU Applications with Docker for Scaling to the CloudContainerizing GPU Applications with Docker for Scaling to the Cloud
Containerizing GPU Applications with Docker for Scaling to the Cloud
 
Scaleable PHP Applications in Kubernetes
Scaleable PHP Applications in KubernetesScaleable PHP Applications in Kubernetes
Scaleable PHP Applications in Kubernetes
 
Hands on Docker - Launch your own LEMP or LAMP stack - SunshinePHP
Hands on Docker - Launch your own LEMP or LAMP stack - SunshinePHPHands on Docker - Launch your own LEMP or LAMP stack - SunshinePHP
Hands on Docker - Launch your own LEMP or LAMP stack - SunshinePHP
 

Make Accelerator Pluggable for Container Engine

  • 1. www.huawei.com HUAWEI TECHNOLOGIES CO., LTD. Make Accelerator Pluggable for Container Engine Author/ Email: Jiuyue Ma / majiuyue@huawei.com Version: V1.0 (20150504)
  • 2. Today’s Topic: Pluggable Accelerator Support for Container Engine. Container Engine; Hardware Accelerators (e.g. GPU, QAT, FPGA); workloads: OLTP/OLAP, BigData Processing, Neural Network, CloudRAN.
  • 3. Agenda
      - Use accelerator in containers, and how?
      - Key problems to deal with
          - Identify accelerator requirements
          - Prepare accelerator runtimes
          - Manage accelerator resources
      - Further development
  • 4. Why Container?
  • 5. Why Container? Why NOT Container? * Raphael Da Silva. OPEN SOURCE TECHNOLOGY: Docker Containers on IBM Bluemix. 2015.
  • 6. Why Accelerator?
      - CPU is not fast/effective/… enough
      - Heterogeneous is the future
          - Nvidia GPU-compute, Intel Skylake Xeon+FPGA, AWS P2/F1 Instance
          - Use the right processor, in the right place, at the right time
      Container a.k.a. Docker [baidu, HotChips 2016]
  • 7. Google it! “Use XXX in docker” (DPDK, FPGA, GPU). NVIDIA’s GPU-Docker Solution
  • 8. Use GPU in Docker (from nvidia-docker)
      ① Labeled Image
          LABEL com.nvidia.volumes.needed="nvidia_driver"
          LABEL com.nvidia.cuda.version="7.5"
      ② VOLUME: nvidia-docker/nvidia_driver_361.48
      ③ nvidia-docker cli-wrapper
          - detect/check cuda version
          - hook docker cli args: --volume-driver / --volume
      Wrapped Command
          docker run --volume-driver=nvidia-docker --volume=nvidia_driver_361.48:/usr/local/nvidia:ro
      Key Problem: VERSION MISMATCH
          $ ldconfig -p | grep -E 'nvidia|cuda'
          libnvidia-ml.so (libc6,x86-64) => /usr/lib/nvidia-361/libnvidia-ml.so
          libnvidia-glcore.so.361.48 (libc6,x86-64) => /usr/lib/nvidia-361/lib…
          libnvidia-compiler.so.361.48 (libc6,x86-64) => /usr/lib/nvidia-361/lib…
          libcuda.so (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libcuda.so
          ...
          $ lsmod | grep nvidia
          nvidia_uvm 711531 2
          nvidia_modeset 742329 0
          nvidia 10058469 80 nvidia_modeset,nvidia_uvm
          $ nvidia-smi
          Failed to initialize NVML: Driver/library version mismatch
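The wrapper step on slide 8 (read the image labels, then hook the docker CLI arguments) can be sketched roughly as below. This is a minimal illustration, not the actual nvidia-docker implementation: the function name `wrap_docker_run`, the way the driver version is passed in, and the image name `cuda-image` are assumptions made for the example. Only the label keys and the two `--volume*` flags come from the slide.

```python
def wrap_docker_run(labels, driver_version, user_args):
    """Build a `docker run` argv, injecting the driver volume when the image asks for it.

    labels         -- image labels as a dict (hypothetical input; a real wrapper
                      would have to read them from the image metadata)
    driver_version -- host driver version string, e.g. "361.48"
    user_args      -- the arguments the user originally passed after `docker run`
    """
    argv = ["docker", "run"]
    needed = labels.get("com.nvidia.volumes.needed")
    if needed:
        # The volume name ties the mounted libraries to the host driver version,
        # which is what avoids the driver/library version mismatch.
        volume = f"{needed}_{driver_version}"
        argv += [
            "--volume-driver=nvidia-docker",
            f"--volume={volume}:/usr/local/nvidia:ro",
        ]
    return argv + list(user_args)


labels = {
    "com.nvidia.volumes.needed": "nvidia_driver",
    "com.nvidia.cuda.version": "7.5",
}
print(" ".join(wrap_docker_run(labels, "361.48", ["cuda-image"])))
```

An image without the `com.nvidia.volumes.needed` label passes through unchanged, so the same wrapper can front ordinary containers.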
  • 9. Do we really need “nvidia-docker”? If so, we also need dpdk-docker, qat-docker, fpga-docker, … But what if we need both GPU and DPDK? Answer: Pluggable Accelerator (diagram: nvidia, DPDK, QAT, FPGA as plugins).
  • 10. Pluggable Accelerator, How? What’s the difference?
      Device Files (--device)
          nvidia: /dev/nvidia0, /dev/nvidia-smi, /dev/nvidia-uvm
          DPDK: /dev/uio0
          FPGA: /dev/fpga0
      File Bindings (-v src:dest)
          nvidia: /usr/lib/nvidia-361/libnvidia-ml.so, /usr/lib/nvidia-361/libnvidia-glcore.so.361.48, /usr/lib/nvidia-361/libnvidia-compiler.so.361.48, /usr/lib/x86_64-linux-gnu/libcuda.so, ...
          DPDK: /sys/bus/pci/devices/xxxx:xx:xx.x, /dev/hugepages
          FPGA: /usr/lib/libfpga.so, /sys/bus/pci/devices/xxxx:xx:xx.x, /dev/hugepages
      Environments (-e env)
          PATH=$PATH:/usr/lib/nvidia-361/bin
          LD_LIBRARY_PATH=…
      Accelerator Plugin Interface: nvidia, DPDK, FPGA, in-house FPGA Accel
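Slide 10 suggests that accelerators differ only in which device files, file bindings, and environment variables they need. One way to picture a plugin interface built on that idea is sketched below. The `AccelSpec` type and `to_docker_args` helper are hypothetical names invented for this example, and the output uses Docker's `--device`, `-v`, and `-e` flags.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AccelSpec:
    """What one (hypothetical) accelerator plugin asks the engine to inject."""
    devices: List[str] = field(default_factory=list)    # device files
    binds: Dict[str, str] = field(default_factory=dict)  # host path -> container path
    env: Dict[str, str] = field(default_factory=dict)    # environment variables


def to_docker_args(specs):
    """Flatten several plugin specs into `docker run` flags, in plugin order."""
    args = []
    for spec in specs:
        for dev in spec.devices:
            args += ["--device", dev]
        for src, dest in spec.binds.items():
            args += ["-v", f"{src}:{dest}"]
        for key, value in spec.env.items():
            args += ["-e", f"{key}={value}"]
    return args


# Two plugins combined, as in the "both GPU and DPDK" case.
gpu = AccelSpec(
    devices=["/dev/nvidia0"],
    binds={"/usr/lib/x86_64-linux-gnu/libcuda.so": "/usr/lib/x86_64-linux-gnu/libcuda.so"},
)
dpdk = AccelSpec(
    devices=["/dev/uio0"],
    binds={"/dev/hugepages": "/dev/hugepages"},
)
print(to_docker_args([gpu, dpdk]))
```

The sketch does not reconcile conflicts, for example two plugins both setting `LD_LIBRARY_PATH`; a real engine would need a merge policy for that.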
  • 11. Implementation (1/3): IDENTIFY
      - Accelerator Requirements
          - We are using the function provided by the device, not the device itself
          - Identified by runtime instead of device
          - e.g. “a SHA256 accelerator” instead of “an FPGA with XXXX-bitstream”
      - Identify user requirements
          - Integrate into image: LABEL runtime gpu0=cuda:7.5
          - Specify when starting the container: $ docker run --accel fpga:crypto huawei/crypto-image
      - Identify device capabilities
          - Identified by “accelerator plugin”
          - Reported to the engine through the plugin API
      Device        Accel Runtime
      /dev/nvidia0  cuda:7.5, cuda:8.0, OpenCL 2.0, …
      /dev/qat0     md5, sha256, aes, …
      /dev/fpga0    crypto, compress, decode, …
      Diagram: image and cli / rest-api feed dockerd; the Accelerator Plugin reports to dockerd through the Plugin API.
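The "identified by runtime instead of device" idea on slide 11 amounts to a lookup from a requested runtime to the devices that report it. The sketch below reuses the device/runtime table from the slide. The dictionary layout and the `find_devices` function are assumptions for illustration, and the sketch matches plain runtime strings only, without interpreting the `fpga:crypto` form used with `--accel`.

```python
# Capabilities as an accelerator plugin might report them to the engine
# (entries copied from the slide's table; the "…" entries are left out).
CAPABILITIES = {
    "/dev/nvidia0": {"cuda:7.5", "cuda:8.0", "OpenCL 2.0"},
    "/dev/qat0": {"md5", "sha256", "aes"},
    "/dev/fpga0": {"crypto", "compress", "decode"},
}


def find_devices(runtime, capabilities=CAPABILITIES):
    """Return the devices (sorted by path) that offer the requested runtime."""
    return sorted(
        device
        for device, runtimes in capabilities.items()
        if runtime in runtimes
    )


# The user asks for "a SHA256 accelerator", not for a specific device.
print(find_devices("sha256"))
print(find_devices("cuda:7.5"))
```

An empty result means no plugin reported the runtime, which is the point at which an engine would reject the container request.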
  • 12. Implementation (2/3): PREPARE
      - Select plugin from accelerator requirements
      - Device initialization by plugin
      - Dynamically inject required libraries & devices
      Plugins and what each provides:
          DPDK-plugin: eal.so/…, igb_uio.ko, /dev/uioX
          FPGA-plugin: fpga.so, fpga.ko, /dev/fpgaX
          GPU-plugin: gpu.so / cuda.so, gpu.ko, /dev/gpuX
          QAT-plugin: qat.so, qat.ko, /dev/qatX
      Diagram (FPGA example): PaaS passes accel requirements to dockerd; the host holds fpga.ko v1, fpga.so v1, and /dev/fpga0; the container holds the Application, fpga.so, and /dev/fpga0. Labelled steps: (2) mount userspace lib, (3) mount device. Other labels: generate, bind, libacc.
  • 13. Implementation (3/3): MANAGEMENT
      - Accelerator Life-Cycle
          - Allocate
          - Prepare
          - Release
      - Plugin hooks for each stage
      - Integrate with Container Life-Cycle
          - Persistent: Create/Delete
          - Non-persistent: Start/Stop
      Diagram: Container Life-Cycles (INIT, Creating, Created, Running, Stopped, Deleting) with Accelerator Plugin hooks: parse accel requirements, allocate persistent, allocate non-persistent, prepare, release non-persistent, release persistent.
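The pairing of accelerator hooks with container life-cycle events on slide 13 can be illustrated with a toy model. Everything here is a sketch under assumptions: the class names are invented, the exact point at which `prepare` runs (on every start) is a guess from the diagram labels, and returning to `INIT` after delete is a simplification.

```python
class RecordingPlugin:
    """Hypothetical accelerator plugin that records which hooks were called."""

    def __init__(self):
        self.calls = []

    def allocate(self, kind):
        self.calls.append(f"allocate {kind}")

    def prepare(self):
        self.calls.append("prepare")

    def release(self, kind):
        self.calls.append(f"release {kind}")


class Container:
    """Toy container whose life-cycle transitions trigger the plugin hooks."""

    def __init__(self, plugin):
        self.plugin = plugin
        self.state = "INIT"

    def create(self):
        # Persistent resources live from create until delete.
        self.plugin.allocate("persistent")
        self.state = "Created"

    def start(self):
        # Non-persistent resources live from start until stop.
        self.plugin.allocate("non-persistent")
        self.plugin.prepare()  # assumption: prepare runs on every start
        self.state = "Running"

    def stop(self):
        self.plugin.release("non-persistent")
        self.state = "Stopped"

    def delete(self):
        self.plugin.release("persistent")
        self.state = "INIT"  # simplification: model returns to the initial state


plugin = RecordingPlugin()
box = Container(plugin)
for step in (box.create, box.start, box.stop, box.start, box.stop, box.delete):
    step()
print(plugin.calls)
```

Restarting the container repeats only the non-persistent allocate/release pair, while the persistent allocation is made once and released once.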
  • 14. Example Flow
  • 15. Further Developments
      - Standardize “Accelerator Runtime”
          - OCI Runtime/Image Spec should be a good place
      - Integrate with Docker Swarm / K8S / etc.
          - Cluster-level accelerator scheduling & management
      - More Features
          - Accelerator Sharing
          - Accelerator Exception Interface
          - Accelerator Hot-Plug
          - …
  • 16. Copyright© 2011 Huawei Technologies Co., Ltd. All Rights Reserved. The information in this document may contain predictive statements including, without limitation, statements regarding the future financial and operating results, future product portfolio, new technology, etc. There are a number of factors that could cause actual results and developments to differ materially from those expressed or implied in the predictive statements. Therefore, such information is provided for reference purpose only and constitutes neither an offer nor an acceptance. Huawei may change the information at any time without notice. Thank you www.huawei.com