"Update on Khronos Standards for Vision and Machine Learning," a Presentation from the Khronos Group

© Copyright Khronos Group 2017 - Page 1
Update on Khronos Standards for
Vision and Machine Learning
December 2017
Neil Trevett | Khronos President
NVIDIA | VP Developer Ecosystem
ntrevett@nvidia.com | @neilt3d |www.khronos.org

Khronos Mission
Software
Silicon
Khronos is an International Industry Consortium of over 100 companies creating
royalty-free, Open Standard APIs to enable software to access hardware
acceleration for 3D Graphics, Virtual and Augmented Reality, Parallel Computing,
Neural Networks and Vision Processing

Khronos Open Standards
Vision
sensor(s)
Tracking and
Positioning
Geometric scene
reconstruction
Semantic scene
understanding
(Neural Networks)
Generate Low Latency 3D
Content and Augmentations
Interact with sensor, haptic
and display devices
Download 3D
augmentation object
and scene data
VR/AR Application

Embedded/Mobile
Vision/Inferencing Hardware
Embedded/Mobile
Embedded/Mobile
Neural Net Training
Frameworks
Neural Net Training
Frameworks
Neural Net Training
Frameworks
Vision and Neural Net
Inferencing Runtime
Embedded/Mobile
Standards for Vision and Neural Networks
Vision/AI
Applications
Neural Net Training
Frameworks
Desktop and
Cloud Hardware
Trained
Networks

NNEF - Solving Neural Net Fragmentation
NN Authoring Framework 1
Inference Engine 1
Inference Engine 2
Inference Engine 3
Every Tool Needs an Exporter to
Every Accelerator
With NNEF- NN Training and Inferencing Interoperability
Before NNEF – NN Training and Inferencing Fragmentation
Inference Engine 1
Inference Engine 2
Inference Engine 3
Optimization and processing tools
NNEF is a Cross-vendor Neural Net file format
Encapsulates network formal semantics, structure, data formats,
commonly-used operations (such as convolution, pooling, normalization, etc.)

NNEF Status and Roadmap
• NNEF V1.0 provisional release – targeting late December 2017
- Focus on passing trained frameworks to embedded inference engines
- Support ‘First cut’ range of network types
• NNEF Roadmap
- Track development of new network types
- Allow authoring and retraining (3rd party tools)
- Address a wider range of applications (outside vision apps)
- Increase the expressive power of the format Baseline
Importer
File
Format
Validator
Working Group Participants
Baseline
Exporter
Khronos
Open Source
Projects

OpenVX
PowerEfficiency
Computation Flexibility
Dedicated
Hardware
GPU
Compute
Multi-core
CPUX1
X10
X100
Vision
DSPs
Wide range of vision hardware architectures
OpenVX provides a high-level Graph-based abstraction
->
Enables Graph-level optimizations!
Can be implemented on almost any hardware or processor!
->
Portable, Efficient Vision Processing!
Vision
Node
Vision
Node
Vision
NodeVision
Node
Vision Processing Graph
GPU
Vision Engines
Middleware
Applications
DSP
Hardware
Software
Portability

Simple Edge Detector in OpenVX
vx_image input = vxCreateImage(1920, 1080);
vx_image output = vxCreateImage(0, 0);
vx_image horiz = vxCreateVirtualImage();
vx_image vert = vxCreateVirtualImage();
vx_image mag = vxCreateVirtualImage();
vx_graph g = vxCreateGraph();
vxSobel3x3Node(g, input, horiz, vert);
vxMagnitudeNode(g, horiz, vert, mag);
vxThresholdNode(g, mag, THRESH, output);
status = vxVerifyGraph(g);
status = vxProcessGraph(g);
m
h
i
v
oS M T
Compile the Graph
Execute the Graph
Declare Input and Output Images
Declare Intermediate Images
Construct the Graph topology

OpenVX Evolution
OpenVX 1.0
Spec released October 2014
Conformant
Implementations
OpenVX 1.1
Spec released May 2016
Conformant
Implementations
AMD OpenVX Tools
- Open source, highly optimized
for x86 CPU and OpenCL for GPU
- “Graph Optimizer”
- Scripting for rapid prototyping,
without re-compiling, at
production performance levels
http://gpuopen.com/compute-product/amd-openvx/
New Functionality
Expanded Nodes Functionality
Enhanced Graph Framework
OpenVX 1.2
Spec released May 2017
Adopters Program November 2017
New Functionality
Conditional node execution
Feature detection
Classification operators
Expanded imaging operations
Extensions
Neural Network Acceleration
Graph Save and Restore
16-bit image operation
Safety Critical
OpenVX 1.1 SC for
safety-certifiable systems
OpenVX
Roadmap under Discussion
NNEF Import
Programmable user
kernels with
accelerator offload

New OpenVX 1.2 Functions
• Feature detection: find features useful for object detection and recognition
- Histogram of gradients – HOG  Template matching
- Local binary patterns – LBP  Line finding
• Classification: detect and recognize objects in an image based on a set of features
- Import a classifier model trained offline
- Classify objects based on a set of input features
• Image Processing: transform an image
- Generalized nonlinear filter: Dilate, erode, median with arbitrary kernel shapes
- Non maximum suppression: Find local maximum values in an image
- Edge-preserving noise reduction
• Conditional execution & node predication
- Selectively execute portions of a graph based on a true/false predicate
• Many, many minor improvements
• New Extensions
- Import/export: compile a graph; save and run later
- 16-bit support: signed 16-bit image data
- Neural networks: Layers are represented as OpenVX nodes
B C
S
A
Condition
If A then S ← B else S ← C

© Copyright Khronos Group 2017 - Page 11© Copyright Khronos Group 2017 - Page 11
OpenVX 1.2 and Neural Net Extension
• Convolution Neural Network topologies can be represented as OpenVX graphs
- Layers are represented as OpenVX nodes
- Layers connected by multi-dimensional tensors objects
- Layer types include convolution, activation, pooling, fully-connected, soft-max
- CNN nodes can be mixed with traditional vision nodes
• Import/Export Extension
- Efficient handling of network Weights/Biases or complete networks
• OpenVX will be able to import NNEF files into OpenVX Neural Nets
Vision
Node
Vision
Node
Vision
Node
Downstream
Application
Processing
Native
Camera
Control CNN Nodes
An OpenVX graph mixing CNN nodes
with traditional vision nodes

OpenVX SC - Safety Critical Vision Processing
• OpenVX 1.1 - based on OpenVX 1.1 main specification
- Enhanced determinism
- Specification identifies and numbers requirements
• MISRA C clean per KlocWorks v10
• Divides functionality into “development” and “deployment” feature sets
- Adds requirement to support import/export extension
OpenVX SC
Development Feature
Set (Create Graph)
OpenVX SC
Deployment Feature Set
(Execute Graph)
Binary
format
Verify
Export
Import
Entire graph creation API No graph creation APIImplementation
dependent format

Safety Critical APIs
New Generation APIs for safety
certifiable vision, graphics and
compute
e.g. ISO 26262 and DO-178B/C
OpenGL ES 1.0 - 2003
Fixed function graphics
OpenGL ES 2.0 - 2007
Shader programmable pipeline
OpenGL SC 1.0 - 2005
Fixed function graphics subset
OpenGL SC 2.0 - April 2016
Shader programmable pipeline subset
Experience and Guidelines
OpenCL SC TSG Formed
Working on OpenCL SC 1.2
Eliminate Undefined Behavior
Eliminate Callback Functions
Static Pool of Event Objects
OpenVX SC 1.1 Released May 2017
Restricted “deployment” implementation
executes on the target hardware by reading the
binary format and executing the pre-compiled
graphs
Khronos SCAP
‘Safety Critical Advisory Panel’
Guidelines for designing APIs that ease
system certification.
Open to Khronos member AND
industry experts

Embedded/Mobile
Embedded/Mobile
Embedded/Mobile
Neural Net Training
Frameworks
Neural Net Training
Frameworks
Neural Net Training
Frameworks
Inferencing Runtime
Embedded/Mobile
Vision/AI
Applications
Neural Net Training
Frameworks
Desktop and
Cloud Hardware
Trained
Networks

OpenCL – Low-level Parallel Programing
• Low-level, explicit programming of heterogeneous parallel compute resources
- One code tree can be executed on CPUs, GPUs, DSPs and FPGA …
• OpenCL C or C++ language to write kernel programs to execute on any compute device
- Platform Layer API - to query, select and initialize compute devices
- Runtime API - to build and execute kernels programs on multiple devices
• The programmer gets to control:
- What programs execute on what device
- Where data is stored in various speed and size memories in the system
- When programs are run, and what operations are dependent on earlier operations
OpenCL
Kernel
Code
OpenCL
Kernel
Code
OpenCL
Kernel
Code
OpenCL
Kernel
Code
GPU
DSP
CPU
CPU
FPGA
Kernel code
compiled for
devices
Devices
CPU
Host
Runtime API
loads and executes
kernels across devices

OpenCL 2.2 Released in May 2017
2011
OpenCL 1.2
Becomes
industry
baseline for
heterogeneous
parallel
computing
OpenCL 2.1
SPIR-V 1.0
SPIR-V in Core
Kernel Language
Flexibility
OpenCL 2.2
SPIR-V 1.2
OpenCL C++ Kernel Language
Static subset of C++14
Templates and Lambdas
SPIR-V 1.2
OpenCL C++ support
Pipes
Efficient device-scope
communication between kernels
Code Generation Optimizations
- Specialization constants at
SPIR-V compilation time
- Constructors and destructors of
program scope global objects
- User callbacks can be set at
program release time
201720152013
OpenCL 2.0
Enables new class
of hardware
SVM
Generic Addresses
On-device dispatch
https://www.khronos.org/opencl/

OpenCL Conformant Implementations
OpenCL 1.0
Specification
Dec08 Jun10
OpenCL 1.1
Specification
Nov11
OpenCL 1.2
Specification
OpenCL 2.0
Specification
Nov13
1.0|Jul13
1.0|Aug09
1.0|May09
1.0|May10
1.0 | Feb11
1.0|May09
1.0|Jan10
1.1|Aug10
1.1|Jul11
1.2|May12
1.2|Jun12
1.1|Feb11
1.1|Mar11
1.1|Jun10
1.1|Aug12
1.1|Nov12
1.1|May13
1.1|Apr12
1.2|Apr14
1.2|Sep13
1.2|Dec12
Desktop
Mobile
FPGA
2.0|Jul14
OpenCL 2.1
Specification
Nov15
1.2|May15
2.0|Dec14
1.0|Dec14
1.2|Dec14
1.2|Sep14
Vendor timelines are
first conformant
submission for each spec
generation
1.2|May15
Embedded
1.2|Aug15
1.2|Mar16
2.0|Nov15
2.0|Apr17
2.1 | Jun16

OpenCL as Language/Library Backend
C++ based
Neural
network
framework
MulticoreWare
open source
project on
Bitbucket
Compiler
directives for
Fortran,
C and C++
Java language
extensions
for
parallelism
Language for
image
processing and
computational
photography
Single
Source C++
Programming
for OpenCL
Hundreds of languages, frameworks
and projects using OpenCL to access
vendor-optimized, heterogeneous
compute runtimes
Vision
processing
open source
project
Open source
software library
for machine
learning
Over 4,000 GitHub repositories
using OpenCL: tools, applications,
libraries, languages - up from
2,000 two years ago

SYCL Ecosystem
• Single-source heterogeneous programming using STANDARD C++
- Use C++ templates and lambda functions for host & device code
- Layered over OpenCL
• Fast and powerful path for bring C++ apps and libraries to OpenCL
- C++ Kernel Fusion - better performance on complex software than hand-coding
- Halide, Eigen, Boost.Compute, SYCLBLAS, SYCL Eigen, SYCL TensorFlow, SYCL GTX
- triSYCL, ComputeCpp, VisionCpp, ComputeCpp SDK …
• More information at http://sycl.tech

SYCL 1.2.1 Publicly Released Today!
• Major update representing two and a half years of work by Khronos members
- Incorporates significant experience gained from three separate implementations
- Feedback from developers of machine learning frameworks such as TensorFlow
- TensorFlow now supports SYCL alongside the original CUDA accelerator back-end
- Layers over OpenCL 1.2
“This is a significant release update for SYCL with an enhanced
ecosystem that matches our intention to support machine learning
and align with modern C++17. SYCL continues to help us to drive
the C++ standard towards heterogeneous support. Our intention is
to move forward quickly with the SYCL roadmap to bring even more
emphasis to machine learning and Safety Critical support, as well as
continued alignment with future ISO C++,”
Michael Wong, SYCL working group chair
C++ Kernel Language
Low Level Control
‘GPGPU’-style separation of
device-side kernel source
code and host code
Single-source C++
Programmer Familiarity
Approach also taken by
C++ AMP and OpenMP
Developer Choice
The development of the two specifications are aligned so
code can be easily shared between the two approaches

OpenCL Roadmap
2011
OpenCL 1.2
OpenCL C Kernel
Language
OpenCL 2.1
SPIR-V in Core
2015
SYCL 1.2
C++11 Single source
programming
OpenCL 2.2
C++ Kernel Language
2017
SYCL 2.2
C++14 Single source
programming
Industry working to bring
Heterogeneous compute to
standard ISO C++
C++17 Parallel STL hosted by Khronos
Executors – for scheduling work
“Managed pointers” or “channels” –
for sharing data
OpenCL for DSPs
- Embedded imaging, vision and inferencing
- Flexible reduced precision
- Conformance without IEEE 32 Floating Point
- Explicit DMA
Single source C++ programming.
Great for supporting C++ apps,
libraries and frameworks
Help bring OpenCL-
class compute to Vulkan
for GPUs

Embedded/Mobile
Embedded/Mobile
Embedded/Mobile
Neural Net Training
Frameworks
Neural Net Training
Frameworks
Neural Net Training
Frameworks
Inferencing Runtime
Embedded/Mobile
Vision/AI
Applications
Neural Net Training
Frameworks
Desktop and
Cloud Hardware
Trained
Networks

Pervasive Vulkan
All Major GPU Companies shipping Vulkan Drivers for Desktop and Mobile Platforms
Embedded LinuxAndroid TVAndroid 7.0+ Nintendo Switch
http://vulkan.gpuinfo.org/
Oculus RiftSteamVR
VR Platforms
Desktop, Mobile, Embedded and Console Platforms Supporting Vulkan
Including phones and tablets from Google, Huawei, Samsung, Sony, Xiaomi - both premium and mid-range devices
GearVR Google Daydream
Game Engines
Desktop

Market Demand for Universal 3D Portability
Games
Engines
Native 3D
Apps
Browser
Engines Games
Engines
Native 3D
Apps
Browser
Engines
Vulkan Universally
Portable Subset
JavaScript and
WebAssembly Native bindings
for ‘nexgen WebGL’
Community Outreach at GDC 2017
Create a hybrid Portability API?
Feedback - AVOID CREATING A FOURTH API!!!
Would need new specification, CTS, Documentation.
Additional developer learning curve.
A whole new specification to name, brand, promote.
Would INCREASE industry fragmentation
Tools Layers
API Libraries
Shader Translators
MAP Vulkan to Metal
and DX12

Vulkan Portability TSG Process
API
Overlap
Analysis
Metal Shading
Language
HLSL
Vulkan Portability Deliverables
1. Vulkan Subset Diff Spec
2. Vulkan Subset Development Layer
3. Vulkan Subset API Library over DX12/Metal
4. SPIRV-Cross Translator
5. Vulkan Subset Conformance Tests
Expand/test existing
open source SPIRV-Cross Tool
Possible proposals for Vulkan extensions for
enhanced portability (and possibly Web
robustness) sent to Vulkan WG
Identify Vulkan
features not directly mappable
to DX12 and Metal
New Vulkan functionality may affect the
overlap analysis
Layers, APIs, Translators and Tests all to be
developed and released in open source
Open source project with similar goals
https://github.com/gfx-rs/gfx
Vulkan on iOS and macOS
https://moltengl.com/moltenvk/

Vulkan Evolution
Feb16
Vulkan 1.0
Vulkan Extensions
Maintenance updates plus additional functionality
Explicit Building Blocks for VR
Explicit Building Blocks for Homogeneous Multi-GPU
Enhanced Windows System Integration
Increased Shader Language Flexibility
Enhanced Cross-Process and Cross-API Sharing
Vulkan Roadmap
Regular Vulkan Core Releases
Integrates proven KHR extensions
Roadmap Discussions
Pushes forward the envelope of GPU Acceleration:
Enhanced Compute and Language Flexibility
Optimized Vision and Inferencing Acceleration
Ray Tracing
Widened Platform Support
Through Vulkan Portability Initiative
Including Vulkan on macOS and iOS
Strengthened Ecosystem and SDK
Enhanced developer and debugging tools
Regression testing for SDK stability
Enhanced Conformance Testing (API now has 198K test cases
- up from 107K last year)
Compiler robustness - including HLSL support
Explicit Access to GPU
Acceleration

Clspv OpenCL C to Vulkan Compiler
• Experimental collaboration between Google, Codeplay, and Adobe
- Successfully tested on over 200K lines of Adobe OpenCL C production code
- Released in open source https://github.com/google/clspv
- Tracks top-of-tree LLVM and clang, not a fork
• Uses new Vulkan extensions to support OpenCL C compute operations
- VK_KHR_16bit_storage/SPV_KHR_16bit_storage
- VK_KHR_variable_pointers/SPV_KHR_variable_pointers
• Compiles OpenCL C’s programming model to Vulkan’s SPIR-V execution environment
- Proof-of-concept that OpenCL compute can be brought seamlessly to Vulkan
Vulkan
Universal
Portability
OpenCL-class
compute in
Vulkan
Universally
Portable
Graphics and
Advanced
Compute

Dedicated Vision
Hardware
Layered Vision/ Neural Net Ecosystem
Programmable Vision
Processors
Application
Implementers may use OpenCL or Vulkan to implement
OpenVX nodes on programmable processors
And then developers can use OpenVX to enable a
developer to easily connect those nodes into a graph
The OpenVX graph abstraction enables implementers to optimize execution
across diverse hardware architectures for optimal power and performance
OpenVX enables the graph to be extended to include hardware
architectures that don’t support programmable APIs
Vulkan Roadmap
Enhanced compute – vision and
inferencing on any platforms with
GPU acceleration
OpenCL Roadmap
Flexible precision for
widespread deployment on low-
cost embedded processors

Khronos Cooperative Framework
Royalty-Free
Specifications
Adopters
Programs
Implementations, Tools
Documentation, Samples
Adopters
Build conformant
implementation and products
Developers
Develop applications
using the APIs
Educators / Certifiers
Create Courses
Training and Certification
Educator Guidelines
Courseware Materials
API Working Groups
(Industry, Academic and
Associate members)
$
$
Khronos Board
Strategy, budget and
oversight$$
One vote per industry member
Conformance
Tests
Open and
royalty-free.
Often open sourced
Advisory Panels
By invitation, no fee.
Provide requirements
and draft spec feedback
Industry and
Community
Feedback and
contributions on
released materials
Membership Fees
Academic $1,000
Small companies $175/employee
(min $3,500)
Non-profits $7,500
Larger companies - $18,000

New Open Source Engagement Model
• Khronos is open sourcing specification sources,
conformance tests, tools
- Merge requests welcome from the community
(subject to review by OpenCL working group)
• Deeper Community Enablement
- Mix your own documentation!
- Contribute and fix conformance tests
- Fix the specification, headers, ICD etc.
- Contribute new features (carefully)
Khronos builds and Ratifies
Canonical Specification under
Khronos IP Framework.
No changes or re-hosting allowed
Spec and Ref Language Source and
derivative materials.
Re-mixable under CC-BY by the
industry and community
Source Materials for Specifications and Reference
Documentation CONTRIBUTED Under Khronos IP Framework
(you won’t assert patents against conformant implementations,
and license copyright for Khronos use)
Contributions and
Distribution under
Apache 2.0 Spec Build
System and
Scripts
Spec and
Ref Language
Source Redistribution
under CC-BY 4.0
Conformance
Test Suite
Source
Community built
documentation and
tools
Contributions and
Distribution under
Apache 2.0
Anyone can test any
implementation at
any time
Khronos Adopters
Program
Conformant Implementations can
use trademark and are covered
by Khronos IP Framework

Khronos Advisory Panels
Working
Group
Advisory
Panel
Shared Email
list and
Repository
Khronos Members
Any company can join
Membership Fee
Sign NDA and IP Framework
Khronos Advisors
Invited industry experts
$0 Cost
Sign NDA and IP Framework
Specification drafts and
invitations for
requirements and feedback
Requirements and
feedback on
specification drafts
Hosted by Khronos.
Under Khronos NDA
Advisory Panels Active for Vulkan, OpenCL/SYCL and NNEF

Need for Camera Control API?
• Advanced control of ISP and camera subsystem – with cross-platform portability
- Generate sophisticated image stream for advanced imaging & vision apps
• No platform API currently fulfills all developer requirements
- Portable access to growing sensor diversity: e.g. depth sensors and sensor arrays
- Cross sensor synch: e.g. synch of camera and MEMS sensors
- Advanced, high-frequency per-frame burst control of camera/sensor: e.g. ROI
- Multiple input, output re-circulating streams with RAW, Bayer or YUV Processing
Image Signal
Processor (ISP)
Defines control of Sensor, Color Filter Array
Lens, Flash, Focus, Aperture
Auto Exposure (AE)
Auto White Balance (AWB)
Auto Focus (AF) Stream of Images for
Vision Processing
OpenKCAM standard is
currently on ice – do we
need to restart?

Please Get Involved!
• Khronos is creating cutting-edge royalty-free open standards
- For embedded vision and neural network acceleration
• Khronos standards are key to many emerging markets such as
3D rendering and advanced gaming, Augmented and Virtual Reality,
Embedded Vision and machine learning
- Advanced next generation capabilities for ALL platforms
• Khronos encourages your participation
- www.khronos.org
- ntrevett@nvidia.com
- @neilt3d

OpenVX - Graph-Level Abstraction
• OpenVX developers express a graph of image operations (‘Nodes’)
- Using a C API
• Nodes can be executed on any hardware or processor coded in any language
- Implementers can optimize under the high-level graph abstraction
• Graphs are the key to run-time power and performance optimizations
- E.g. Node fusion, tiled graph processing for cache efficiency etc.
Array of
Keypoints
YUV
Frame
Gray
Frame
Camera
Input
Rendering
Output
Pyrt
Color
Conversion
Channel
Extract
Optical
Flow
Harris
Track
Image
Pyramid
RGB
Frame
Array of
Features
Ftrt-1OpenVX Graph
OpenVX Nodes
Feature Extraction Example Graph

OpenVX Efficiency through Graphs..
Reuse
pre-allocated
memory for
multiple
intermediate data
Memory
Management
Less allocation overhead,
more memory for
other applications
Replace a sub-
graph with a
single faster node
Kernel
Fusion
Better memory
locality, less kernel
launch overhead
Split the graph
execution across
the whole
system: CPU /
GPU / dedicated
HW
Graph
Scheduling
Faster execution
or lower power
consumption
Execute a sub-
graph at tile
granularity
instead of image
granularity
Data
Tiling
Better use of
data cache and
local memory

SPIR-V Ecosystem
LLVM
Third party kernel and
shader languages
SPIR-V
• Khronos defined cross-API IR
• Native graphics and parallel compute
• Easily parsed/extended 32-bit stream
• Data object/control flow retained for
effective code generation/translation
OpenCL C++
Front-end
OpenCL C
Front-end
glslang
Khronos open source
tools and translators
IHV Driver
Runtimes
Additional
Intermediate Forms
SPIR-V Validator
SPIR-V (Dis)Assembler
LLVM to SPIR-V
Bi-directional
Translator
https://github.com/KhronosGroup/SPIRV-Tools
GLSL HLSL
Khronos liaising with
Clang/LLVM Community
E.g. discussing SPIR-V as
supported Clang target
SPIRV-Cross
GLSL
HLSL
MSL
SPIRV-opt | SPIRV-remap
SPIR-V Optimizations
• Inlining (exhaustive)
• Store/Load Elimination
• Dead Code Elimination
• Dead Branch Elimination
• Common Uniform Elimination
Coming
• Loop Unrolling and Constant Folding
• Common Subexpression Elimination

Vulkan and New Generation 3D APIs
Only AppleOnly Windows 10
Cross Platform
Clean, modern architecture | Low overhead, explicit GPU access
Portable across desktop and mobile | Multi-thread / multi-core friendly
Efficient, low-latency, predictable performance

Vulkan Explicit GPU Control
GPU
High-level Driver
Abstraction
Layered GPU Control
Context management
Memory allocation
Full GLSL compiler
Error detection
Application
Single thread per context
GPU
Thin Driver
Explicit GPU Control
Application
Memory allocation
Thread management
Synchronization
Multi-threaded generation
of command buffers
Multiple Front-end
Compilers
GLSL, HLSL etc.
Loadable debug and
validation layers
Vulkan 1.0 provides access to
OpenGL ES 3.1 / OpenGL 4.X-class GPU functionality
but with increased performance and flexibility
SPIR-V
pre-compiled shaders
Complex drivers
cause overhead
and inconsistent
behavior across
vendors
Always active
error handling
Full GLSL
preprocessor and
compiler in
driver
OpenGL vs.
OpenGL ES
Resource management
offloaded to app:
low-overhead, low-latency
driver
Consistent behavior:
no ‘fighting with driver
heuristics’
Validation and debug layers
loaded only when needed
SPIR-V intermediate
language: shading language
flexibility
Multi-threaded command
creation. Multiple graphics,
command and DMA queues
Unified API across all
platforms with feature set
flexibility
Vulkan = high performance and
low latency 3D and GPU Compute.
Ideal for VR/AR applications

OpenCL Ecosystem
Hardware Implementers
Desktop/Mobile/Embedded/FPGA
100s of applications using
OpenCL acceleration
Rendering, visualization, video editing,
simulation, image processing, vision and
neural network inferencing
Single Source C++ Programming
Portable Kernel Intermediate Language
Core API and Language Specs
OpenCL 2.2 – Top to Bottom C++

"Update on Khronos Standards for Vision and Machine Learning," a Presentation from the Khronos Group

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Similar to "Update on Khronos Standards for Vision and Machine Learning," a Presentation from the Khronos Group

Similar to "Update on Khronos Standards for Vision and Machine Learning," a Presentation from the Khronos Group (20)

More from Edge AI and Vision Alliance

More from Edge AI and Vision Alliance (20)

Recently uploaded

Recently uploaded (20)

"Update on Khronos Standards for Vision and Machine Learning," a Presentation from the Khronos Group