SlideShare a Scribd company logo

Vulkan ML Japan Virtual Open House Feb 2021

This document discusses using Vulkan for machine learning acceleration. It outlines objectives to enable native Vulkan applications to use ML with low latency and overhead by avoiding large frameworks. It describes available data type extensions in Vulkan and improved compute shader capabilities. It also mentions several third party frameworks and libraries that use Vulkan for ML tasks. The document calls for feedback on functionality needed in Vulkan for ML and ways its use could be improved.

1 of 10
Download to read offline
© The Khronos® Group Inc. 2020 - Page 1
This work is licensed under a Creative Commons Attribution 4.0 International License
Machine Learning in
Khronos | Vulkan
Pierre Boudier
NVIDIA
January 2021
© The Khronos® Group Inc. 2020 - Page 2
This work is licensed under a Creative Commons Attribution 4.0 International License
Objectives of Vulkan Machine Learning (ML)
• Enable native Vulkan application to use ML with low latency and overhead
- Avoid interop, or to embed very large third-party frameworks (python)
• There are many common blocks in Neural Nets:
- Matrix multiplication
- Convolution
- Tanh/Sigmoid/ReLU
- …
• But the eco system is actually diverse and new mathematical approaches are introduced
every week:
- Locality sensitive hashing, binary weights, sparse matrices, …
• Vulkan should not limit itself to current architectures
- Luckily, Vulkan already has a compute shader abstraction !
© The Khronos® Group Inc. 2020 - Page 3
This work is licensed under a Creative Commons Attribution 4.0 International License
Training Data
Trained
Networks
NNEF/ONNX
Neural Network
Training
Inferencing Acceleration
GPU
Compiled
Code
Hardware Acceleration APIs
Accélération Hardware
(GPUs)
Applications link to
compiled inferencing code
or dynamically generate it
Networks trained on high-end
desktop and cloud systems
© The Khronos® Group Inc. 2020 - Page 4
This work is licensed under a Creative Commons Attribution 4.0 International License
Data type extensions
• By default, the 32 bit floating precision is used for both training and
inferencing, which are basically just running a computational graph:
- Training runs a forward pass, and often times a backward pass to propagate
back the gradient
- Inferencing is just about doing the forward pass
• But both can be done in lower precision types for faster compute time and
reduced data storage
- The following extensions are available in vulkan 1.2:
- VK_KHR_shader_float16_int8
- VK_KHR_8bit_storage / VK_KHR_16bit_storage
- 8 bit integers data types are used for quantized Neural Nets
- FP16 data types can be used for faster math with gradient rescaling in training
- Upcoming: SPV_KHR_integer_dot_product
- Take advantage of fast integer math without implicit compiler peephole optimization
for quantized neural networks
© The Khronos® Group Inc. 2020 - Page 5
This work is licensed under a Creative Commons Attribution 4.0 International License
Improved Compute Shader
• New extensions are devised to improve efficiency
- Upcoming:
- VK_KHR_workgroup_memory_explicit_layout
- Allow more efficient data loading into shared memory for further use with efficient
matrix multiplication operations.
- VK_EXT_ML_primitives
- Exposes basic primitives used in the main stream Neural Nets as optimized building
blocks
- Available:
- VK_NV_cooperative_matrix (NVIDIA)
- Exposes high throughput matrix/vector multiplication hardware units
- Typically used be convolution / matmul layer in fp16 formats
© The Khronos® Group Inc. 2020 - Page 6
This work is licensed under a Creative Commons Attribution 4.0 International License
Import Formats
Caffe, Keras,
MXNet, ONNX
TensorFlow Graph,
PyTorch, ONNX
Front-end / IR NNVM / Relay IR XLA HLO
Output SPIR-V SPIR-V / vulkan API
Primary Machine Learning Compilers

Recommended

OpenCL Overview Japan Virtual Open House Feb 2021
OpenCL Overview Japan Virtual Open House Feb 2021OpenCL Overview Japan Virtual Open House Feb 2021
OpenCL Overview Japan Virtual Open House Feb 2021The Khronos Group Inc.
 
Khronos Overview Japan Virtual Open House Feb 2021
Khronos Overview Japan Virtual Open House Feb 2021Khronos Overview Japan Virtual Open House Feb 2021
Khronos Overview Japan Virtual Open House Feb 2021The Khronos Group Inc.
 
glTF Overview Japan Virtual Open House Feb 2021
glTF Overview Japan Virtual Open House Feb 2021glTF Overview Japan Virtual Open House Feb 2021
glTF Overview Japan Virtual Open House Feb 2021The Khronos Group Inc.
 
Vulkan Update Japan Virtual Open House Feb 2021
Vulkan Update Japan Virtual Open House Feb 2021Vulkan Update Japan Virtual Open House Feb 2021
Vulkan Update Japan Virtual Open House Feb 2021The Khronos Group Inc.
 
"Update on Khronos Standards for Vision and Machine Learning," a Presentation...
"Update on Khronos Standards for Vision and Machine Learning," a Presentation..."Update on Khronos Standards for Vision and Machine Learning," a Presentation...
"Update on Khronos Standards for Vision and Machine Learning," a Presentation...Edge AI and Vision Alliance
 
HKG18-312 - CMSIS-NN
HKG18-312 - CMSIS-NNHKG18-312 - CMSIS-NN
HKG18-312 - CMSIS-NNLinaro
 
"SoCs for Computer Vision-enabled IoT Devices," a March 2019 Silicon Valle...
 	 "SoCs for Computer Vision-enabled IoT Devices," a March 2019 Silicon Valle... 	 "SoCs for Computer Vision-enabled IoT Devices," a March 2019 Silicon Valle...
"SoCs for Computer Vision-enabled IoT Devices," a March 2019 Silicon Valle...Edge AI and Vision Alliance
 

More Related Content

What's hot

"Current and Planned Standards for Computer Vision and Machine Learning," a P...
"Current and Planned Standards for Computer Vision and Machine Learning," a P..."Current and Planned Standards for Computer Vision and Machine Learning," a P...
"Current and Planned Standards for Computer Vision and Machine Learning," a P...Edge AI and Vision Alliance
 
ONS 2018 LA - Intel Tutorial: Cloud Native to NFV - Alon Bernstein, Cisco & K...
ONS 2018 LA - Intel Tutorial: Cloud Native to NFV - Alon Bernstein, Cisco & K...ONS 2018 LA - Intel Tutorial: Cloud Native to NFV - Alon Bernstein, Cisco & K...
ONS 2018 LA - Intel Tutorial: Cloud Native to NFV - Alon Bernstein, Cisco & K...Kuralamudhan Ramakrishnan
 
Construire une « data fabric » pour les environnements edge
Construire une « data fabric » pour les environnements edgeConstruire une « data fabric » pour les environnements edge
Construire une « data fabric » pour les environnements edgeOpen Source Experience
 
KubeCon China June 2019 - Survey of Kubernetes related solutions for IoT and ...
KubeCon China June 2019 - Survey of Kubernetes related solutions for IoT and ...KubeCon China June 2019 - Survey of Kubernetes related solutions for IoT and ...
KubeCon China June 2019 - Survey of Kubernetes related solutions for IoT and ...Steve Wong
 
XPDDS19: Bringing Xen to the Masses: The Story of Building a Community-driven...
XPDDS19: Bringing Xen to the Masses: The Story of Building a Community-driven...XPDDS19: Bringing Xen to the Masses: The Story of Building a Community-driven...
XPDDS19: Bringing Xen to the Masses: The Story of Building a Community-driven...The Linux Foundation
 
Container Native Development Tools - Talk by Mickey Boxell
Container Native Development Tools - Talk by Mickey BoxellContainer Native Development Tools - Talk by Mickey Boxell
Container Native Development Tools - Talk by Mickey BoxellOracle Developers
 
PKS is Not JAK8sP (Just Another Kubernetes Platform)
PKS is Not JAK8sP (Just Another Kubernetes Platform)PKS is Not JAK8sP (Just Another Kubernetes Platform)
PKS is Not JAK8sP (Just Another Kubernetes Platform)VMware Tanzu
 
20190613 - IBM Cloud Côte d'Azur meetup - "Cloud & Containers"
20190613 - IBM Cloud Côte d'Azur meetup - "Cloud & Containers"20190613 - IBM Cloud Côte d'Azur meetup - "Cloud & Containers"
20190613 - IBM Cloud Côte d'Azur meetup - "Cloud & Containers"IBM France Lab
 
Open Stack Cloud Services
Open Stack Cloud ServicesOpen Stack Cloud Services
Open Stack Cloud ServicesSaurabh Gupta
 
P to V to C: The Value of Bringing “Everything” to Containers
P to V to C: The Value of Bringing “Everything” to ContainersP to V to C: The Value of Bringing “Everything” to Containers
P to V to C: The Value of Bringing “Everything” to ContainersVMware Tanzu
 
M9 cloud & open source
M9 cloud & open sourceM9 cloud & open source
M9 cloud & open sourceJosep Bardallo
 
cncf overview and building edge computing using kubernetes
cncf overview and building edge computing using kubernetescncf overview and building edge computing using kubernetes
cncf overview and building edge computing using kubernetesKrishna-Kumar
 
Presentazione IBM Power System Evento Venaria 14 ottobre
Presentazione IBM Power System Evento Venaria 14 ottobrePresentazione IBM Power System Evento Venaria 14 ottobre
Presentazione IBM Power System Evento Venaria 14 ottobrePRAGMA PROGETTI
 
Curated "Cloud Design Patterns" for Call Center Platforms
Curated "Cloud Design Patterns" for Call Center PlatformsCurated "Cloud Design Patterns" for Call Center Platforms
Curated "Cloud Design Patterns" for Call Center PlatformsAlejandro Rios Peña
 
Apache DeviceMap - ApacheCon Europe 2014
Apache DeviceMap - ApacheCon Europe 2014Apache DeviceMap - ApacheCon Europe 2014
Apache DeviceMap - ApacheCon Europe 2014Werner Keil
 

What's hot (20)

NFV features in kubernetes
NFV features in kubernetesNFV features in kubernetes
NFV features in kubernetes
 
"Current and Planned Standards for Computer Vision and Machine Learning," a P...
"Current and Planned Standards for Computer Vision and Machine Learning," a P..."Current and Planned Standards for Computer Vision and Machine Learning," a P...
"Current and Planned Standards for Computer Vision and Machine Learning," a P...
 
ONS 2018 LA - Intel Tutorial: Cloud Native to NFV - Alon Bernstein, Cisco & K...
ONS 2018 LA - Intel Tutorial: Cloud Native to NFV - Alon Bernstein, Cisco & K...ONS 2018 LA - Intel Tutorial: Cloud Native to NFV - Alon Bernstein, Cisco & K...
ONS 2018 LA - Intel Tutorial: Cloud Native to NFV - Alon Bernstein, Cisco & K...
 
Construire une « data fabric » pour les environnements edge
Construire une « data fabric » pour les environnements edgeConstruire une « data fabric » pour les environnements edge
Construire une « data fabric » pour les environnements edge
 
KubeCon China June 2019 - Survey of Kubernetes related solutions for IoT and ...
KubeCon China June 2019 - Survey of Kubernetes related solutions for IoT and ...KubeCon China June 2019 - Survey of Kubernetes related solutions for IoT and ...
KubeCon China June 2019 - Survey of Kubernetes related solutions for IoT and ...
 
XPDDS19: Bringing Xen to the Masses: The Story of Building a Community-driven...
XPDDS19: Bringing Xen to the Masses: The Story of Building a Community-driven...XPDDS19: Bringing Xen to the Masses: The Story of Building a Community-driven...
XPDDS19: Bringing Xen to the Masses: The Story of Building a Community-driven...
 
Shree duth awasthi_cv
Shree duth awasthi_cvShree duth awasthi_cv
Shree duth awasthi_cv
 
Container Native Development Tools - Talk by Mickey Boxell
Container Native Development Tools - Talk by Mickey BoxellContainer Native Development Tools - Talk by Mickey Boxell
Container Native Development Tools - Talk by Mickey Boxell
 
Shree_Duth_Awasthi_Resume
Shree_Duth_Awasthi_ResumeShree_Duth_Awasthi_Resume
Shree_Duth_Awasthi_Resume
 
PKS is Not JAK8sP (Just Another Kubernetes Platform)
PKS is Not JAK8sP (Just Another Kubernetes Platform)PKS is Not JAK8sP (Just Another Kubernetes Platform)
PKS is Not JAK8sP (Just Another Kubernetes Platform)
 
20190613 - IBM Cloud Côte d'Azur meetup - "Cloud & Containers"
20190613 - IBM Cloud Côte d'Azur meetup - "Cloud & Containers"20190613 - IBM Cloud Côte d'Azur meetup - "Cloud & Containers"
20190613 - IBM Cloud Côte d'Azur meetup - "Cloud & Containers"
 
Open Stack Cloud Services
Open Stack Cloud ServicesOpen Stack Cloud Services
Open Stack Cloud Services
 
P to V to C: The Value of Bringing “Everything” to Containers
P to V to C: The Value of Bringing “Everything” to ContainersP to V to C: The Value of Bringing “Everything” to Containers
P to V to C: The Value of Bringing “Everything” to Containers
 
M9 cloud & open source
M9 cloud & open sourceM9 cloud & open source
M9 cloud & open source
 
cncf overview and building edge computing using kubernetes
cncf overview and building edge computing using kubernetescncf overview and building edge computing using kubernetes
cncf overview and building edge computing using kubernetes
 
Watson on bluemix
Watson on bluemixWatson on bluemix
Watson on bluemix
 
Presentazione IBM Power System Evento Venaria 14 ottobre
Presentazione IBM Power System Evento Venaria 14 ottobrePresentazione IBM Power System Evento Venaria 14 ottobre
Presentazione IBM Power System Evento Venaria 14 ottobre
 
Considering Bare Metal
Considering Bare MetalConsidering Bare Metal
Considering Bare Metal
 
Curated "Cloud Design Patterns" for Call Center Platforms
Curated "Cloud Design Patterns" for Call Center PlatformsCurated "Cloud Design Patterns" for Call Center Platforms
Curated "Cloud Design Patterns" for Call Center Platforms
 
Apache DeviceMap - ApacheCon Europe 2014
Apache DeviceMap - ApacheCon Europe 2014Apache DeviceMap - ApacheCon Europe 2014
Apache DeviceMap - ApacheCon Europe 2014
 

Similar to Vulkan ML Japan Virtual Open House Feb 2021

"APIs for Accelerating Vision and Inferencing: An Industry Overview of Option...
"APIs for Accelerating Vision and Inferencing: An Industry Overview of Option..."APIs for Accelerating Vision and Inferencing: An Industry Overview of Option...
"APIs for Accelerating Vision and Inferencing: An Industry Overview of Option...Edge AI and Vision Alliance
 
"APIs for Accelerating Vision and Inferencing: Options and Trade-offs," a Pre...
"APIs for Accelerating Vision and Inferencing: Options and Trade-offs," a Pre..."APIs for Accelerating Vision and Inferencing: Options and Trade-offs," a Pre...
"APIs for Accelerating Vision and Inferencing: Options and Trade-offs," a Pre...Edge AI and Vision Alliance
 
Matteo Valoriani, Antimo Musone - The Future of Factory - Codemotion Rome 2019
Matteo Valoriani, Antimo Musone - The Future of Factory - Codemotion Rome 2019Matteo Valoriani, Antimo Musone - The Future of Factory - Codemotion Rome 2019
Matteo Valoriani, Antimo Musone - The Future of Factory - Codemotion Rome 2019Codemotion
 
IBM: The Linux Ecosystem
IBM: The Linux EcosystemIBM: The Linux Ecosystem
IBM: The Linux EcosystemKangaroot
 
OpenStack - An Overview
OpenStack - An OverviewOpenStack - An Overview
OpenStack - An Overviewgraziol
 
"Update on Khronos Standards for Vision and Machine Learning," a Presentation...
"Update on Khronos Standards for Vision and Machine Learning," a Presentation..."Update on Khronos Standards for Vision and Machine Learning," a Presentation...
"Update on Khronos Standards for Vision and Machine Learning," a Presentation...Edge AI and Vision Alliance
 
Implementing AI: High Performace Architectures
Implementing AI: High Performace ArchitecturesImplementing AI: High Performace Architectures
Implementing AI: High Performace ArchitecturesKTN
 
The Future of Networks is Open...Source
The Future of Networks is Open...SourceThe Future of Networks is Open...Source
The Future of Networks is Open...SourceFrancois Duthilleul
 
Strata - Scaling Jupyter with Jupyter Enterprise Gateway
Strata - Scaling Jupyter with Jupyter Enterprise GatewayStrata - Scaling Jupyter with Jupyter Enterprise Gateway
Strata - Scaling Jupyter with Jupyter Enterprise GatewayLuciano Resende
 
Rapyuta a cloud robotics platform
Rapyuta a cloud robotics platformRapyuta a cloud robotics platform
Rapyuta a cloud robotics platformieeepondy
 
Learn .NET Core - Introduction
Learn .NET Core - IntroductionLearn .NET Core - Introduction
Learn .NET Core - IntroductionEng Teong Cheah
 
HKG18-300K2 - Keynote: Tomas Evensen - All Programmable SoCs? – Platforms to ...
HKG18-300K2 - Keynote: Tomas Evensen - All Programmable SoCs? – Platforms to ...HKG18-300K2 - Keynote: Tomas Evensen - All Programmable SoCs? – Platforms to ...
HKG18-300K2 - Keynote: Tomas Evensen - All Programmable SoCs? – Platforms to ...Linaro
 
Built Cross-Platform Application with .NET Core Development.pdf
Built Cross-Platform Application with .NET Core Development.pdfBuilt Cross-Platform Application with .NET Core Development.pdf
Built Cross-Platform Application with .NET Core Development.pdfI-Verve Inc
 
IoTWorld 2016 OSS Keynote Param Singh, Ian Skerrett
IoTWorld 2016 OSS Keynote Param Singh, Ian SkerrettIoTWorld 2016 OSS Keynote Param Singh, Ian Skerrett
IoTWorld 2016 OSS Keynote Param Singh, Ian SkerrettParam Singh
 
LCNA14: Why Use Xen for Large Scale Enterprise Deployments? - Konrad Rzeszute...
LCNA14: Why Use Xen for Large Scale Enterprise Deployments? - Konrad Rzeszute...LCNA14: Why Use Xen for Large Scale Enterprise Deployments? - Konrad Rzeszute...
LCNA14: Why Use Xen for Large Scale Enterprise Deployments? - Konrad Rzeszute...The Linux Foundation
 
Parallel universe-issue-29
Parallel universe-issue-29Parallel universe-issue-29
Parallel universe-issue-29DESMOND YUEN
 
“Khronos Standard APIs for Accelerating Vision and Inferencing,” a Presentati...
“Khronos Standard APIs for Accelerating Vision and Inferencing,” a Presentati...“Khronos Standard APIs for Accelerating Vision and Inferencing,” a Presentati...
“Khronos Standard APIs for Accelerating Vision and Inferencing,” a Presentati...Edge AI and Vision Alliance
 
Python Data Science and Machine Learning at Scale with Intel and Anaconda
Python Data Science and Machine Learning at Scale with Intel and AnacondaPython Data Science and Machine Learning at Scale with Intel and Anaconda
Python Data Science and Machine Learning at Scale with Intel and AnacondaIntel® Software
 

Similar to Vulkan ML Japan Virtual Open House Feb 2021 (20)

"APIs for Accelerating Vision and Inferencing: An Industry Overview of Option...
"APIs for Accelerating Vision and Inferencing: An Industry Overview of Option..."APIs for Accelerating Vision and Inferencing: An Industry Overview of Option...
"APIs for Accelerating Vision and Inferencing: An Industry Overview of Option...
 
OpenStackDay - XIFI Federation
OpenStackDay - XIFI FederationOpenStackDay - XIFI Federation
OpenStackDay - XIFI Federation
 
"APIs for Accelerating Vision and Inferencing: Options and Trade-offs," a Pre...
"APIs for Accelerating Vision and Inferencing: Options and Trade-offs," a Pre..."APIs for Accelerating Vision and Inferencing: Options and Trade-offs," a Pre...
"APIs for Accelerating Vision and Inferencing: Options and Trade-offs," a Pre...
 
Matteo Valoriani, Antimo Musone - The Future of Factory - Codemotion Rome 2019
Matteo Valoriani, Antimo Musone - The Future of Factory - Codemotion Rome 2019Matteo Valoriani, Antimo Musone - The Future of Factory - Codemotion Rome 2019
Matteo Valoriani, Antimo Musone - The Future of Factory - Codemotion Rome 2019
 
Enabling NFV features in kubernetes
Enabling NFV features in kubernetesEnabling NFV features in kubernetes
Enabling NFV features in kubernetes
 
IBM: The Linux Ecosystem
IBM: The Linux EcosystemIBM: The Linux Ecosystem
IBM: The Linux Ecosystem
 
OpenStack - An Overview
OpenStack - An OverviewOpenStack - An Overview
OpenStack - An Overview
 
"Update on Khronos Standards for Vision and Machine Learning," a Presentation...
"Update on Khronos Standards for Vision and Machine Learning," a Presentation..."Update on Khronos Standards for Vision and Machine Learning," a Presentation...
"Update on Khronos Standards for Vision and Machine Learning," a Presentation...
 
Implementing AI: High Performace Architectures
Implementing AI: High Performace ArchitecturesImplementing AI: High Performace Architectures
Implementing AI: High Performace Architectures
 
The Future of Networks is Open...Source
The Future of Networks is Open...SourceThe Future of Networks is Open...Source
The Future of Networks is Open...Source
 
Strata - Scaling Jupyter with Jupyter Enterprise Gateway
Strata - Scaling Jupyter with Jupyter Enterprise GatewayStrata - Scaling Jupyter with Jupyter Enterprise Gateway
Strata - Scaling Jupyter with Jupyter Enterprise Gateway
 
Rapyuta a cloud robotics platform
Rapyuta a cloud robotics platformRapyuta a cloud robotics platform
Rapyuta a cloud robotics platform
 
Learn .NET Core - Introduction
Learn .NET Core - IntroductionLearn .NET Core - Introduction
Learn .NET Core - Introduction
 
HKG18-300K2 - Keynote: Tomas Evensen - All Programmable SoCs? – Platforms to ...
HKG18-300K2 - Keynote: Tomas Evensen - All Programmable SoCs? – Platforms to ...HKG18-300K2 - Keynote: Tomas Evensen - All Programmable SoCs? – Platforms to ...
HKG18-300K2 - Keynote: Tomas Evensen - All Programmable SoCs? – Platforms to ...
 
Built Cross-Platform Application with .NET Core Development.pdf
Built Cross-Platform Application with .NET Core Development.pdfBuilt Cross-Platform Application with .NET Core Development.pdf
Built Cross-Platform Application with .NET Core Development.pdf
 
IoTWorld 2016 OSS Keynote Param Singh, Ian Skerrett
IoTWorld 2016 OSS Keynote Param Singh, Ian SkerrettIoTWorld 2016 OSS Keynote Param Singh, Ian Skerrett
IoTWorld 2016 OSS Keynote Param Singh, Ian Skerrett
 
LCNA14: Why Use Xen for Large Scale Enterprise Deployments? - Konrad Rzeszute...
LCNA14: Why Use Xen for Large Scale Enterprise Deployments? - Konrad Rzeszute...LCNA14: Why Use Xen for Large Scale Enterprise Deployments? - Konrad Rzeszute...
LCNA14: Why Use Xen for Large Scale Enterprise Deployments? - Konrad Rzeszute...
 
Parallel universe-issue-29
Parallel universe-issue-29Parallel universe-issue-29
Parallel universe-issue-29
 
“Khronos Standard APIs for Accelerating Vision and Inferencing,” a Presentati...
“Khronos Standard APIs for Accelerating Vision and Inferencing,” a Presentati...“Khronos Standard APIs for Accelerating Vision and Inferencing,” a Presentati...
“Khronos Standard APIs for Accelerating Vision and Inferencing,” a Presentati...
 
Python Data Science and Machine Learning at Scale with Intel and Anaconda
Python Data Science and Machine Learning at Scale with Intel and AnacondaPython Data Science and Machine Learning at Scale with Intel and Anaconda
Python Data Science and Machine Learning at Scale with Intel and Anaconda
 

More from The Khronos Group Inc.

Vulkan Ray Tracing Update JP Translation
Vulkan Ray Tracing Update JP TranslationVulkan Ray Tracing Update JP Translation
Vulkan Ray Tracing Update JP TranslationThe Khronos Group Inc.
 
Vulkan Ray Tracing Update Japan Virtual Open House Feb 2021
Vulkan Ray Tracing Update Japan Virtual Open House Feb 2021Vulkan Ray Tracing Update Japan Virtual Open House Feb 2021
Vulkan Ray Tracing Update Japan Virtual Open House Feb 2021The Khronos Group Inc.
 

More from The Khronos Group Inc. (20)

OpenXR 1.0 Reference Guide
OpenXR 1.0 Reference GuideOpenXR 1.0 Reference Guide
OpenXR 1.0 Reference Guide
 
Vulkan Ray Tracing Update JP Translation
Vulkan Ray Tracing Update JP TranslationVulkan Ray Tracing Update JP Translation
Vulkan Ray Tracing Update JP Translation
 
Vulkan ML JP Translation
Vulkan ML JP TranslationVulkan ML JP Translation
Vulkan ML JP Translation
 
OpenCL Overview JP Translation
OpenCL Overview JP TranslationOpenCL Overview JP Translation
OpenCL Overview JP Translation
 
glTF overview JP Translation
glTF overview JP TranslationglTF overview JP Translation
glTF overview JP Translation
 
Khronos Overview JP Translation
Khronos Overview JP TranslationKhronos Overview JP Translation
Khronos Overview JP Translation
 
Vulkan Ray Tracing Update Japan Virtual Open House Feb 2021
Vulkan Ray Tracing Update Japan Virtual Open House Feb 2021Vulkan Ray Tracing Update Japan Virtual Open House Feb 2021
Vulkan Ray Tracing Update Japan Virtual Open House Feb 2021
 
OpenCL 3.0 Reference Guide
OpenCL 3.0 Reference GuideOpenCL 3.0 Reference Guide
OpenCL 3.0 Reference Guide
 
OpenVX 1.3 Reference Guide
OpenVX 1.3 Reference GuideOpenVX 1.3 Reference Guide
OpenVX 1.3 Reference Guide
 
OpenXR 0.90 Overview Guide
OpenXR 0.90 Overview GuideOpenXR 0.90 Overview Guide
OpenXR 0.90 Overview Guide
 
Vulkan 1.1 Reference Guide
Vulkan 1.1 Reference GuideVulkan 1.1 Reference Guide
Vulkan 1.1 Reference Guide
 
SYCL 1.2.1 Reference Card
SYCL 1.2.1 Reference CardSYCL 1.2.1 Reference Card
SYCL 1.2.1 Reference Card
 
OpenCL 2.2 Reference Guide
OpenCL 2.2 Reference GuideOpenCL 2.2 Reference Guide
OpenCL 2.2 Reference Guide
 
OpenGL 4.6 Reference Guide
OpenGL 4.6 Reference GuideOpenGL 4.6 Reference Guide
OpenGL 4.6 Reference Guide
 
glTF 2.0 Reference Guide
glTF 2.0 Reference GuideglTF 2.0 Reference Guide
glTF 2.0 Reference Guide
 
OpenVX 1.2 Reference Guide
OpenVX 1.2 Reference GuideOpenVX 1.2 Reference Guide
OpenVX 1.2 Reference Guide
 
WebGL 2.0 Reference Guide
WebGL 2.0 Reference GuideWebGL 2.0 Reference Guide
WebGL 2.0 Reference Guide
 
OpenGL SC 2.0 Quick Reference
OpenGL SC 2.0 Quick ReferenceOpenGL SC 2.0 Quick Reference
OpenGL SC 2.0 Quick Reference
 
OpenVX 1.1 Reference Guide
OpenVX 1.1 Reference GuideOpenVX 1.1 Reference Guide
OpenVX 1.1 Reference Guide
 
Vulkan 1.0 Quick Reference
Vulkan 1.0 Quick ReferenceVulkan 1.0 Quick Reference
Vulkan 1.0 Quick Reference
 

Recently uploaded

Improving IT Investment Decisions and Business Outcomes with Integrated Enter...
Improving IT Investment Decisions and Business Outcomes with Integrated Enter...Improving IT Investment Decisions and Business Outcomes with Integrated Enter...
Improving IT Investment Decisions and Business Outcomes with Integrated Enter...Cprime
 
Centralized TLS Certificates Management Using Vault PKI + Cert-Manager
Centralized TLS Certificates Management Using Vault PKI + Cert-ManagerCentralized TLS Certificates Management Using Vault PKI + Cert-Manager
Centralized TLS Certificates Management Using Vault PKI + Cert-ManagerSaiLinnThu2
 
Q4 2023 Quarterly Investor Presentation - FINAL.pdf
Q4 2023 Quarterly Investor Presentation - FINAL.pdfQ4 2023 Quarterly Investor Presentation - FINAL.pdf
Q4 2023 Quarterly Investor Presentation - FINAL.pdfTejal81
 
AI for Educators - Integrating AI in the Classrooms
AI for Educators - Integrating AI in the ClassroomsAI for Educators - Integrating AI in the Classrooms
AI for Educators - Integrating AI in the ClassroomsPremsankar Chakkingal
 
GraphSummit London Feb 2024 - ABK - Neo4j Product Vision and Roadmap.pptx
GraphSummit London Feb 2024 - ABK - Neo4j Product Vision and Roadmap.pptxGraphSummit London Feb 2024 - ABK - Neo4j Product Vision and Roadmap.pptx
GraphSummit London Feb 2024 - ABK - Neo4j Product Vision and Roadmap.pptxNeo4j
 
Introduction to Multimodal LLMs with LLaVA
Introduction to Multimodal LLMs with LLaVAIntroduction to Multimodal LLMs with LLaVA
Introduction to Multimodal LLMs with LLaVARobert McDermott
 
Revolutionizing The Banking Industry: The Monzo Way by CPO, Monzo
Revolutionizing The Banking Industry: The Monzo Way by CPO, MonzoRevolutionizing The Banking Industry: The Monzo Way by CPO, Monzo
Revolutionizing The Banking Industry: The Monzo Way by CPO, MonzoProduct School
 
Launching New Products In Companies Where It Matters Most by Product Director...
Launching New Products In Companies Where It Matters Most by Product Director...Launching New Products In Companies Where It Matters Most by Product Director...
Launching New Products In Companies Where It Matters Most by Product Director...Product School
 
AGFM - Toyota Coaster 1HZ Install Guide.pdf
AGFM - Toyota Coaster 1HZ Install Guide.pdfAGFM - Toyota Coaster 1HZ Install Guide.pdf
AGFM - Toyota Coaster 1HZ Install Guide.pdfRodneyThomas28
 
My Journey towards Artificial Intelligence
My Journey towards Artificial IntelligenceMy Journey towards Artificial Intelligence
My Journey towards Artificial IntelligenceVijayananda Mohire
 
Harnessing the Power of GenAI for Exceptional Product Outcomes by Booking.com...
Harnessing the Power of GenAI for Exceptional Product Outcomes by Booking.com...Harnessing the Power of GenAI for Exceptional Product Outcomes by Booking.com...
Harnessing the Power of GenAI for Exceptional Product Outcomes by Booking.com...Product School
 
Mastering Play Store App Listing and Optimization
Mastering Play Store App Listing and OptimizationMastering Play Store App Listing and Optimization
Mastering Play Store App Listing and OptimizationAppsthentic Technology
 
LF Energy Webinar: Introduction to TROLIE
LF Energy Webinar: Introduction to TROLIELF Energy Webinar: Introduction to TROLIE
LF Energy Webinar: Introduction to TROLIEDanBrown980551
 
The Art of the Possible with Graph by Dr Jim Webber Neo4j.pptx
The Art of the Possible with Graph by Dr Jim Webber Neo4j.pptxThe Art of the Possible with Graph by Dr Jim Webber Neo4j.pptx
The Art of the Possible with Graph by Dr Jim Webber Neo4j.pptxNeo4j
 
Unleash the Solace Pub Sub connector | Banaglore MuleSoft Meetup #31
Unleash the Solace Pub Sub connector | Banaglore MuleSoft Meetup #31Unleash the Solace Pub Sub connector | Banaglore MuleSoft Meetup #31
Unleash the Solace Pub Sub connector | Banaglore MuleSoft Meetup #31shyamraj55
 
AI improves software testing to be more fault tolerant, focused and efficient
AI improves software testing to be more fault tolerant, focused and efficientAI improves software testing to be more fault tolerant, focused and efficient
AI improves software testing to be more fault tolerant, focused and efficientKari Kakkonen
 
Cultivating Entrepreneurial Mindset in Product Management: Strategies for Suc...
Cultivating Entrepreneurial Mindset in Product Management: Strategies for Suc...Cultivating Entrepreneurial Mindset in Product Management: Strategies for Suc...
Cultivating Entrepreneurial Mindset in Product Management: Strategies for Suc...Product School
 
Utilising Energy Modelling for LCSF and PSDS Funding Applications
Utilising Energy Modelling for LCSF and PSDS Funding ApplicationsUtilising Energy Modelling for LCSF and PSDS Funding Applications
Utilising Energy Modelling for LCSF and PSDS Funding ApplicationsIES VE
 
Establishing data sharing standards to promote global industry development
Establishing data sharing standards to promote global industry developmentEstablishing data sharing standards to promote global industry development
Establishing data sharing standards to promote global industry developmentThorsten Huelsmann
 

Recently uploaded (20)

Improving IT Investment Decisions and Business Outcomes with Integrated Enter...
Improving IT Investment Decisions and Business Outcomes with Integrated Enter...Improving IT Investment Decisions and Business Outcomes with Integrated Enter...
Improving IT Investment Decisions and Business Outcomes with Integrated Enter...
 
Centralized TLS Certificates Management Using Vault PKI + Cert-Manager
Centralized TLS Certificates Management Using Vault PKI + Cert-ManagerCentralized TLS Certificates Management Using Vault PKI + Cert-Manager
Centralized TLS Certificates Management Using Vault PKI + Cert-Manager
 
Q4 2023 Quarterly Investor Presentation - FINAL.pdf
Q4 2023 Quarterly Investor Presentation - FINAL.pdfQ4 2023 Quarterly Investor Presentation - FINAL.pdf
Q4 2023 Quarterly Investor Presentation - FINAL.pdf
 
AI for Educators - Integrating AI in the Classrooms
AI for Educators - Integrating AI in the ClassroomsAI for Educators - Integrating AI in the Classrooms
AI for Educators - Integrating AI in the Classrooms
 
GraphSummit London Feb 2024 - ABK - Neo4j Product Vision and Roadmap.pptx
GraphSummit London Feb 2024 - ABK - Neo4j Product Vision and Roadmap.pptxGraphSummit London Feb 2024 - ABK - Neo4j Product Vision and Roadmap.pptx
GraphSummit London Feb 2024 - ABK - Neo4j Product Vision and Roadmap.pptx
 
Introduction to Multimodal LLMs with LLaVA
Introduction to Multimodal LLMs with LLaVAIntroduction to Multimodal LLMs with LLaVA
Introduction to Multimodal LLMs with LLaVA
 
Revolutionizing The Banking Industry: The Monzo Way by CPO, Monzo
Revolutionizing The Banking Industry: The Monzo Way by CPO, MonzoRevolutionizing The Banking Industry: The Monzo Way by CPO, Monzo
Revolutionizing The Banking Industry: The Monzo Way by CPO, Monzo
 
Launching New Products In Companies Where It Matters Most by Product Director...
Launching New Products In Companies Where It Matters Most by Product Director...Launching New Products In Companies Where It Matters Most by Product Director...
Launching New Products In Companies Where It Matters Most by Product Director...
 
AGFM - Toyota Coaster 1HZ Install Guide.pdf
AGFM - Toyota Coaster 1HZ Install Guide.pdfAGFM - Toyota Coaster 1HZ Install Guide.pdf
AGFM - Toyota Coaster 1HZ Install Guide.pdf
 
My Journey towards Artificial Intelligence
My Journey towards Artificial IntelligenceMy Journey towards Artificial Intelligence
My Journey towards Artificial Intelligence
 
Harnessing the Power of GenAI for Exceptional Product Outcomes by Booking.com...
Harnessing the Power of GenAI for Exceptional Product Outcomes by Booking.com...Harnessing the Power of GenAI for Exceptional Product Outcomes by Booking.com...
Harnessing the Power of GenAI for Exceptional Product Outcomes by Booking.com...
 
Mastering Play Store App Listing and Optimization
Mastering Play Store App Listing and OptimizationMastering Play Store App Listing and Optimization
Mastering Play Store App Listing and Optimization
 
LF Energy Webinar: Introduction to TROLIE
LF Energy Webinar: Introduction to TROLIELF Energy Webinar: Introduction to TROLIE
LF Energy Webinar: Introduction to TROLIE
 
The Art of the Possible with Graph by Dr Jim Webber Neo4j.pptx
The Art of the Possible with Graph by Dr Jim Webber Neo4j.pptxThe Art of the Possible with Graph by Dr Jim Webber Neo4j.pptx
The Art of the Possible with Graph by Dr Jim Webber Neo4j.pptx
 
Unleash the Solace Pub Sub connector | Banaglore MuleSoft Meetup #31
Unleash the Solace Pub Sub connector | Banaglore MuleSoft Meetup #31Unleash the Solace Pub Sub connector | Banaglore MuleSoft Meetup #31
Unleash the Solace Pub Sub connector | Banaglore MuleSoft Meetup #31
 
AI improves software testing to be more fault tolerant, focused and efficient
AI improves software testing to be more fault tolerant, focused and efficientAI improves software testing to be more fault tolerant, focused and efficient
AI improves software testing to be more fault tolerant, focused and efficient
 
Cultivating Entrepreneurial Mindset in Product Management: Strategies for Suc...
Cultivating Entrepreneurial Mindset in Product Management: Strategies for Suc...Cultivating Entrepreneurial Mindset in Product Management: Strategies for Suc...
Cultivating Entrepreneurial Mindset in Product Management: Strategies for Suc...
 
Utilising Energy Modelling for LCSF and PSDS Funding Applications
Utilising Energy Modelling for LCSF and PSDS Funding ApplicationsUtilising Energy Modelling for LCSF and PSDS Funding Applications
Utilising Energy Modelling for LCSF and PSDS Funding Applications
 
In sharing we trust. Taking advantage of a diverse consortium to build a tran...
In sharing we trust. Taking advantage of a diverse consortium to build a tran...In sharing we trust. Taking advantage of a diverse consortium to build a tran...
In sharing we trust. Taking advantage of a diverse consortium to build a tran...
 
Establishing data sharing standards to promote global industry development
Establishing data sharing standards to promote global industry developmentEstablishing data sharing standards to promote global industry development
Establishing data sharing standards to promote global industry development
 

Vulkan ML Japan Virtual Open House Feb 2021

  • 1. © The Khronos® Group Inc. 2020 - Page 1 This work is licensed under a Creative Commons Attribution 4.0 International License Machine Learning in Khronos | Vulkan Pierre Boudier NVIDIA January 2021
  • 2. © The Khronos® Group Inc. 2020 - Page 2 This work is licensed under a Creative Commons Attribution 4.0 International License Objectives of Vulkan Machine Learning (ML) • Enable native Vulkan application to use ML with low latency and overhead - Avoid interop, or to embed very large third-party frameworks (python) • There are many common blocks in Neural Nets: - Matrix multiplication - Convolution - Tanh/Sigmoid/ReLU - … • But the eco system is actually diverse and new mathematical approaches are introduced every week: - Locality sensitive hashing, binary weights, sparse matrices, … • Vulkan should not limit itself to current architectures - Luckily, Vulkan already has a compute shader abstraction !
  • 3. © The Khronos® Group Inc. 2020 - Page 3 This work is licensed under a Creative Commons Attribution 4.0 International License Training Data Trained Networks NNEF/ONNX Neural Network Training Inferencing Acceleration GPU Compiled Code Hardware Acceleration APIs Accélération Hardware (GPUs) Applications link to compiled inferencing code or dynamically generate it Networks trained on high-end desktop and cloud systems
  • 4. © The Khronos® Group Inc. 2020 - Page 4 This work is licensed under a Creative Commons Attribution 4.0 International License Data type extensions • By default, the 32 bit floating precision is used for both training and inferencing, which are basically just running a computational graph: - Training runs a forward pass, and often times a backward pass to propagate back the gradient - Inferencing is just about doing the forward pass • But both can be done in lower precision types for faster compute time and reduced data storage - The following extensions are available in vulkan 1.2: - VK_KHR_shader_float16_int8 - VK_KHR_8bit_storage / VK_KHR_16bit_storage - 8 bit integers data types are used for quantized Neural Nets - FP16 data types can be used for faster math with gradient rescaling in training - Upcoming: SPV_KHR_integer_dot_product - Take advantage of fast integer math without implicit compiler peephole optimization for quantized neural networks
  • 5. © The Khronos® Group Inc. 2020 - Page 5 This work is licensed under a Creative Commons Attribution 4.0 International License Improved Compute Shader • New extensions are devised to improve efficiency - Upcoming: - VK_KHR_workgroup_memory_explicit_layout - Allow more efficient data loading into shared memory for further use with efficient matrix multiplication operations. - VK_EXT_ML_primitives - Exposes basic primitives used in the main stream Neural Nets as optimized building blocks - Available: - VK_NV_cooperative_matrix (NVIDIA) - Exposes high throughput matrix/vector multiplication hardware units - Typically used be convolution / matmul layer in fp16 formats
  • 6. © The Khronos® Group Inc. 2020 - Page 6 This work is licensed under a Creative Commons Attribution 4.0 International License Import Formats Caffe, Keras, MXNet, ONNX TensorFlow Graph, PyTorch, ONNX Front-end / IR NNVM / Relay IR XLA HLO Output SPIR-V SPIR-V / vulkan API Primary Machine Learning Compilers
  • 7. © The Khronos® Group Inc. 2020 - Page 7 This work is licensed under a Creative Commons Attribution 4.0 International License Google MLIR and IREE Compilers MLIR Multi-level Intermediate Representation Format and library of compiler utilities that sits between the trained model representation and low-level compilers/executors that generate hardware-specific code IREE Intermediate Representation Execution Environment Lowers and optimizes ML models for real-time accelerated inferencing on mobile/edge heterogeneous hardware Contains scheduling logic to communicate data dependencies to low-level parallel pipelined hardware/APIs like Vulkan, and execution logic to encode dense computation in the form of hardware/API- specific binaries like SPIR-V IREE is a research project today. Google is working with Khronos working groups to explore how SPIR-V code can provide effective inferencing acceleration on APIs such as Vulkan Trained Models Generate Hardware Specific Binaries Optimizes and Lowers for Acceleration
  • 8. © The Khronos® Group Inc. 2020 - Page 8 This work is licensed under a Creative Commons Attribution 4.0 International License Third Party frameworks Alibaba Tencent Ax inc. MNN NCNN Ailia https://github.com/alibaba/M NN https://github.com/Tencent/nc nn https://axinc.jp/en/ Lightweight frameworks for inferencing and training for mobile platforms Cross platform optimized inferencing framework SDK for optimized running of many popular neural networks on multiple platform
  • 9. © The Khronos® Group Inc. 2020 - Page 9 This work is licensed under a Creative Commons Attribution 4.0 International License Non Neural Net Machine Learning Jülich Ethical ML UNITY VkFFT Kompute ML agents https://github.com/DTolm/VkF FT https://github.com/EthicalML/ vulkan-kompute https://github.com/Unity- Technologies/ml-agents Accelerated fast fourrier transform library, which can be used for image resampling, image registration, signal processing … General purpose GPU compute framework cross graphics card Open source project that enables simulation for training intelligent agents via the Unity engine
  • 10. © The Khronos® Group Inc. 2020 - Page 10 This work is licensed under a Creative Commons Attribution 4.0 International License Call To Action • The Vulkan Machine Learning Subgroup is welcoming feedback: - What functionality would you like to see in Vulkan to accelerate your ML needs ? - Neural Nets - Other algorithms: - random forest - logistic regression - K-nearest neighbor - T-SNE - … - Do you have existing product using Vulkan for ML: - Do you want to get in touch with hardware vendors ? - Do you have suggestion on how to improve your pain points? • Contact us at: pboudier@nvidia.com • Help us and join the Vulkan Advisory Panel, or become a Khronos member

Editor's Notes

  1. 1. Hi everyone – I am Neil Trevett - I work on developer ecosystems at NVIDA and I am President of the Khronos Group. I am going to give the latest updates on Khronos’ open standards for vision and inferencing acceleration.
  2. 5. Here we see how Khronos standards can be used in a typical mobile or embedded application needing vision or inferencing acceleration. Neural network training typically occurs on high-powered desktops or servers at high precision, using large training data sets. The resulting trained network, encapsulated in an NNEF file, can be imported into the target system in two ways: either ingested into an inferencing engine such as OpenVX, or passed into a machine learning compiler to generate code that executes inferencing operations. Either the compiled code or the inferencing engine API is linked into the application, which may also use the SYCL compiler to generate additional parallel code if the application is written in C++. Code generated by all these methods can target the lower level OpenCL API to portably reach into diverse hardware processors.   Let’s look at each of these Khronos standards in a little more detail – starting with NNEF.
  3. 16. In addition to inferencing engines, there is active research into machine learning compilers from companies such as Amazon with TVM, Facebook with Glow and Google with XLA.