© Copyright Khronos Group 2016 - Page 1
The Vision API Maze
Options and Trade-offs
Embedded Vision Summit, May 2016
Neil Trevett | Khronos President
NVIDIA Vice President Developer Ecosystem
Accelerated Vision API Jungle
Vision Frameworks
Language-based
Acceleration Frameworks
Explicit
Kernels
GPU FPGA
DSP
Dedicated
Hardware
Neural Net Libraries
http://hwstats.unity3d.com/mobile/gpu.html
OpenGL ES 2.0
OpenGL ES 3.x
OpenGL ES
- 2003: 1.0, fixed function pipeline
- 2004: 1.1 (driver update)
- 2007: 2.0 (silicon update), vertex and fragment shaders
- 2012: 3.0 (silicon update), 32-bit integers and floats; NPOT, 3D/depth textures; texture arrays; multiple render targets
- 2014: 3.1 (driver update), compute shaders
- 2015: 3.2 (silicon update), tessellation and geometry shaders; ASTC texture compression; floating point render targets; debug and robustness for security
Epic’s Rivalry demo using full Unreal Engine 4
https://www.youtube.com/watch?v=jRr-G95GdaM
+AEP
OpenGL ES 3.1 and
Android Extension Pack
since Android 5.0 (Lollipop)
Vertex and Fragment Shaders Compute Shaders
New Generation GPU APIs
Only Apple
Only Windows 10
Cross Platform
Vulkan 1.0 launched in February
Shipping now on Windows, Linux, Android platforms from multiple vendors
‘Half Way New Gen’
Retains Traditional Binding Model
Mixes OpenGL ES 3.1/OpenCL 1.2
C++11-based kernel language
Objective-C or Swift
Vulkan Explicit GPU Control
GPU
High-level Driver
Abstraction
Context management
Memory allocation
Full GLSL compiler
Error detection
Layered GPU Control
Application
Single thread per context
GPU
Thin Driver
Explicit GPU Control
Application
Memory allocation
Thread management
Synchronization
Multi-threaded generation
of command buffers
Language Front-end
Compilers
Initially GLSL
Loadable debug and
validation layers
Vulkan 1.0 provides access to
OpenGL ES 3.1 / OpenGL 4.X-class GPU functionality
but with increased performance and flexibility
Vulkan Benefits
• Loadable Layers:
- No error handling overhead in production code
• SPIR-V Pre-compiled Shaders:
- No front-end compiler in driver
- Future shading language flexibility
• Simpler drivers:
- Improved efficiency/performance
- Reduced CPU bottlenecks
- Lower latency
- Increased portability
• Graphics, compute and DMA queues:
- Work dispatch flexibility
• Command Buffers:
- Command creation can be multi-threaded
- Multiple CPU cores increase performance
• Resource management in app code:
- Fewer hitches and surprises
SPIR-V pre-compiled
shaders
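The multi-threaded command buffer generation described on this slide can be pictured with a small model. The sketch below is plain Python, not the Vulkan API: `CommandBuffer`, `record_draws` and `build_frame` are hypothetical names, and the "commands" are just strings. It only shows the shape of the idea, assuming each thread records into its own buffer and submission happens once, in a fixed order, after recording.

```python
import threading


class CommandBuffer:
    """Hypothetical stand-in for a recorded list of GPU commands."""

    def __init__(self, name):
        self.name = name
        self.commands = []

    def record(self, command):
        self.commands.append(command)


def record_draws(buffer, first, count):
    # Each worker records into its own buffer, so no locking is needed here.
    for i in range(first, first + count):
        buffer.record(f"draw object {i}")


def build_frame(num_threads=4, draws_per_thread=3):
    buffers = [CommandBuffer(f"cb{t}") for t in range(num_threads)]
    workers = [
        threading.Thread(
            target=record_draws,
            args=(buffers[t], t * draws_per_thread, draws_per_thread),
        )
        for t in range(num_threads)
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    # Submission is a single ordered step once all recording has finished.
    submitted = []
    for b in buffers:
        submitted.extend(b.commands)
    return submitted


frame = build_frame()
```

Because the buffers are private to their threads, recording scales with CPU cores while the submission order stays deterministic.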
NVIDIA CUDA
• The industry’s original dedicated GPU Compute language
- C/C++11 language extensions for ‘single source’ programming
• Easy programmability and low level access to GPU
- Unified Memory, Virtual Addressing, Dynamic Parallelism etc.
• Mature and optimized tools and compute / imaging libraries
- Thrust, NPP, cuFFT, cuBLAS, cuda-gdb, nvprof etc.
• CUDA 7.5 released September 2015
- Added 16-bit floating point (FP16) data format support
• NVIDIA only, GPU only
CUDA 8 (Coming Soon)
Support for Pascal Unified Memory
Flexible Mixing of
CUDA (C++ language extensions)
OpenACC (parallelism through standard C++)
Thrust (C++ parallel library)
OpenCL
• Heterogeneous parallel programming of diverse compute resources
- One code tree can be executed on CPUs, GPUs, DSPs and FPGAs
• OpenCL = Two APIs and Two Kernel languages
- C Platform Layer API to query, select and initialize compute devices
- C Runtime API to build and execute kernels across multiple devices
- OpenCL C and OpenCL C++ kernel languages
• The OpenCL C++ kernel language is a static subset of C++14
- Adaptable and elegant sharable code – great for building libraries
- Templates enable meta-programming for highly adaptive software
- Lambdas used to implement nested/dynamic parallelism
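To make the kernel model concrete, here is a minimal emulation in plain Python. It is not the OpenCL API: `enqueue_nd_range` and `vector_add` are hypothetical names, and the loop runs sequentially on the host, whereas a real runtime would compile the kernel for a device and run the invocations in parallel. It only illustrates that a kernel is a function invoked once per work-item, identified by a global index.

```python
def vector_add(gid, a, b, out):
    # Body of the "kernel": one invocation handles exactly one element.
    out[gid] = a[gid] + b[gid]


def enqueue_nd_range(kernel, global_size, *args):
    # Hypothetical emulation of a 1-D dispatch; sequential here,
    # parallel across work-items on a real device.
    for gid in range(global_size):
        kernel(gid, *args)


a = [1.0, 2.0, 3.0, 4.0]
b = [10.0, 20.0, 30.0, 40.0]
out = [0.0] * len(a)
enqueue_nd_range(vector_add, len(a), a, b, out)
```

The kernel body never mentions the device it runs on, which is what lets one code tree target different processors.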
OpenCL
Kernel
Code
OpenCL
Kernel
Code
OpenCL
Kernel
Code
OpenCL
Kernel
Code
GPU
DSP
CPU
CPU
FPGA
Kernel code
compiled for
devices
Devices
CPU
Host
Runtime API
loads and executes
kernels across devices
OpenCL Conformant Implementations
Specification releases: OpenCL 1.0 (Dec08), OpenCL 1.1 (Jun10), OpenCL 1.2 (Nov11), OpenCL 2.0 (Nov13), OpenCL 2.1 (Nov15)
Vendor timelines are first implementation of each spec generation
[Figure: per-vendor conformance timelines grouped into Desktop, Mobile, Embedded and FPGA rows; the vendor names were logos and are not recoverable from the extracted text.]
OpenCL 2.2 - Top to Bottom C++
OpenCL 1.0 Specification: Dec08
OpenCL 1.1 Specification: Jun10 (18 months later)
- 3-component vectors
- Additional image formats
- Multiple hosts and devices
- Buffer region operations
- Enhanced event-driven execution
- Additional OpenCL C built-ins
- Improved OpenGL data/event interop
OpenCL 1.2 Specification: Nov11 (18 months later)
- Device partitioning
- Separate compilation and linking
- Enhanced image support
- Built-in kernels / custom devices
- Enhanced DX and OpenGL Interop
OpenCL 2.0 Specification: Nov13 (24 months later)
- Shared Virtual Memory
- On-device dispatch
- Generic Address Space
- Enhanced Image Support
- C11 Atomics
- Pipes
- Android ICD
OpenCL 2.1 Specification: Nov15 (24 months later)
- SPIR-V in Core
- Subgroups into core
- Subgroup query operations
- clCloneKernel
- Low-latency device timer queries
OpenCL 2.2 PROVISIONAL: May16 (7 months later)
- OpenCL C++14 Kernel Language into core
Single Source C++ Programming
Full support for features in C++14 Kernel Language
API and Language Specs
Brings C++14 Kernel Language into core specification
Portable Kernel Intermediate Language
Support for C++14 kernel language e.g. constructors/destructors
SPIR-V Ecosystem
LLVM
Third party kernel and
shader Languages
SPIR-V
• Khronos defined and controlled
cross-API intermediate language
• Native support for graphics
and parallel constructs
• 32-bit Word Stream
• Extensible and easily parsed
• Retains data object and control
flow information for effective
code generation and translation
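The "32-bit word stream" and "easily parsed" points can be illustrated with a few lines of Python. This is a minimal sketch, not a replacement for the Khronos SPIR-V tools: it assumes a little-endian module, reads the five-word header, and then walks instructions using the word count stored in the high 16 bits of each instruction's first word. The tiny module is hand-built for the example and contains only `OpCapability Shader` (opcode 17) and `OpMemoryModel Logical GLSL450` (opcode 14); `parse_header` and `iter_instructions` are names made up here.

```python
import struct

SPIRV_MAGIC = 0x07230203


def parse_header(blob):
    # The header is five 32-bit words: magic, version, generator, bound, schema.
    magic, version, generator, bound, schema = struct.unpack_from("<5I", blob, 0)
    if magic != SPIRV_MAGIC:
        raise ValueError("not a little-endian SPIR-V module")
    major = (version >> 16) & 0xFF
    minor = (version >> 8) & 0xFF
    return {
        "version": (major, minor),
        "generator": generator,
        "bound": bound,
        "schema": schema,
    }


def iter_instructions(blob):
    words = struct.unpack(f"<{len(blob) // 4}I", blob)
    pos = 5  # skip the header
    while pos < len(words):
        word_count = words[pos] >> 16   # high half: instruction length in words
        opcode = words[pos] & 0xFFFF    # low half: opcode
        if word_count == 0:
            raise ValueError("malformed instruction")
        yield opcode, words[pos + 1:pos + word_count]
        pos += word_count


# Hand-built example module (an assumption for illustration, not compiler output).
module = struct.pack("<5I", SPIRV_MAGIC, 0x00010000, 0, 1, 0)
module += struct.pack("<2I", (2 << 16) | 17, 1)      # OpCapability Shader
module += struct.pack("<3I", (3 << 16) | 14, 0, 1)   # OpMemoryModel Logical GLSL450

header = parse_header(module)
instructions = list(iter_instructions(module))
```

Since every instruction carries its own length, a tool can skip opcodes it does not understand, which is part of what makes the format extensible.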
OpenCL C++
OpenCL C
GLSL
Khronos has open sourced
these tools and translators
IHV Driver
Runtimes
Other
Intermediate
Forms
SPIR-V Validator
SPIR-V Tools
SPIR-V (Dis)Assembler
LLVM to SPIR-V
Bi-directional
Translator
Khronos plans to open
source these tools soon
https://github.com/KhronosGroup/SPIR/tree/spirv-1.1
Open source C++ front-end released
SYCL for OpenCL
• Single-source heterogeneous programming using STANDARD C++
- Use C++ templates and lambda functions for host & device code
• Aligns the hardware acceleration of OpenCL with direction of the C++ standard
- C++14 with open source C++17 Parallel STL hosted by Khronos
C++ Kernel Language
Low Level Control
‘GPGPU’-style separation of
device-side kernel source
code and host code
Single-source C++
Programmer Familiarity
Approach also taken by
C++ AMP and OpenMP
Developer Choice
The development of the two specifications is aligned so
code can be easily shared between the two approaches
OpenCL Roadmap Discussions…
Embedded
Use cases: Signal and Pixel Processing
Roadmap: arbitrary precision for power
efficiency, hard real-time scheduling,
asynch DMA
FPGAs
Use cases: Network and
Stream Processing
Roadmap: enhanced execution model,
self-synchronized and self-scheduled graphs,
fine-grained synchronization between kernels,
DSL in C++
HPC, SciViz, Datacenter
Use case: Numerical Simulation, Virtualization
Roadmap: enhanced streaming processing,
enhanced library support
Vulkan Compute can leverage OpenCL?
Gaming Compute, Pixel Processing, Inference
Fine grain graphics and compute (no interop needed)
SPIR-V for shading language flexibility – C/C++
Low-latency, fine grain run-time
Google Android adoption
Competes well with Metal (=C++/OpenCL 1.2)
Roadmap: types, precision and accuracy
Pointers and address spaces, execution model
Desktop
Use cases: Video and Image Processing, Gaming Compute
Roadmap: Vulkan interop, arbitrary precision for
increased performance, pre-emption, collective
programming and improved execution model
Mobile
Use case: Photo and Vision Processing
Roadmap: arbitrary precision for
inference engine and pixel processing efficiency,
pre-emption and QoS scheduling for power efficiency
Possible learnings from Vulkan Philosophy
1. Explicit - provide direct access to hardware capabilities with thin driver
2. Feature Sets – enable diverse architectures to ship just relevant features
3. Open source conformance tests for deep community engagement
OpenCV
• Extensive and widely used open source
vision library - written in optimized C/C++
- Free-use BSD license
• C++, C, Python and Java interfaces
- Windows, Linux, Mac OS, iOS and Android
• Increasingly taking advantage of
heterogeneous processing using OpenCL
- OpenCV 3.X Transparent API;
single API entry for each function/algorithm
- Dynamically loads OpenCL runtime if available;
otherwise falls back to CPU code
- Runtime Dispatching;
no recompilation!
CPU
Thread
CPU
Thread
CPU
Thread
…
ocl::Queue
ocl::Device
ocl::Queue ocl::Queue
ocl::Device
…
…
ocl::Context
OpenCV Application
OpenCV Transparent API for OpenCL Kernel Offload
• One OpenCL queue per CPU thread
• CPU threads can share a device
• OpenCL kernels are executed asynchronously
OpenCV is active open source - not an API specification
A strength and a weakness!
Production deployment often needs tightly defined callable API
Vision Pipeline Challenges and Opportunities
Sensor Proliferation
Growing Camera Diversity
Diverse Vision Processors
Flexible sensor and camera
control to GENERATE
an image stream
Use efficient acceleration to
PROCESS
the image stream
Combine vision output
with other sensor data
on device
© Copyright Khronos Group 2016 - Page 15
OpenVX – Low Power Vision Acceleration
• Precisely defined API for production deployment of vision acceleration
- Targeted at real-time mobile and embedded platforms
• Higher abstraction than OpenCL for performance portability across diverse architectures
- Multi-core CPUs, GPUs, DSPs and DSP arrays, ISPs, Dedicated hardware…
• Extends portable vision acceleration to very low power domains
- Doesn’t require high-power CPU/GPU Complex or OpenCL precision
- Low-power host can set up and manage frame-rate graph
GPU
Vision Engine
Middleware
Application
DSP
Hardware
[Figure: power efficiency versus computation flexibility for multi-core CPU, GPU compute, vision DSPs and dedicated hardware, with vision processing efficiency marked at X1 (multi-core CPU), X10 and X100.]
© Copyright Khronos Group 2016 - Page 16
OpenVX Graphs
• OpenVX developers express a graph of image operations (‘Nodes’)
- Nodes can be on any hardware or processor coded in any language
• Graphs can execute almost autonomously
- Possible to minimize host interaction during frame-rate graph execution
• Graphs are the key to run-time optimization opportunities…
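A toy model helps show why a declared graph gives the runtime room to work. The sketch below is plain Python (3.9+ for `graphlib`), not the OpenVX API: `Graph`, `add_node`, `verify` and `process` are hypothetical names, and images are short lists of pixels. It assumes the graph is declared first, its execution order is resolved once, and it is then run per frame with no further decisions by the host.

```python
from graphlib import TopologicalSorter


class Graph:
    """Hypothetical miniature graph runtime for illustration only."""

    def __init__(self):
        self.nodes = {}    # output name -> (function, input names)
        self.order = None

    def add_node(self, output, func, *inputs):
        self.nodes[output] = (func, inputs)
        self.order = None

    def verify(self):
        # Resolve the execution order once, before frame-rate processing.
        deps = {out: set(inputs) for out, (_, inputs) in self.nodes.items()}
        ordered = TopologicalSorter(deps).static_order()
        self.order = [name for name in ordered if name in self.nodes]

    def process(self, **sources):
        if self.order is None:
            self.verify()
        data = dict(sources)
        for out in self.order:
            func, inputs = self.nodes[out]
            data[out] = func(*(data[name] for name in inputs))
        return data


def rgb_to_gray(rgb):
    return [(r + g + b) // 3 for r, g, b in rgb]


def threshold(gray):
    return [1 if v >= 128 else 0 for v in gray]


graph = Graph()
graph.add_node("mask", threshold, "gray")    # declared before its producer on purpose
graph.add_node("gray", rgb_to_gray, "rgb")
graph.verify()

frame = [(255, 255, 255), (0, 0, 0), (200, 100, 90)]
result = graph.process(rgb=frame)
```

Because the whole pipeline is known before the first frame arrives, the runtime rather than the application decides ordering, and in a real implementation also placement and memory reuse.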
Array of
Keypoints
YUV
Frame
Gray
Frame
Camera
Input
Rendering
Output
Pyr_t
Color
Conversion
Channel
Extract
Optical
Flow
Harris
Track
Image
Pyramid
RGB
Frame
Array of
Features
Ftr_t-1
OpenVX Graph
OpenVX Nodes
Feature Extraction Example Graph
© Copyright Khronos Group 2016 - Page 17
OpenVX Efficiency through Graphs...
• Memory Management
- Reuse pre-allocated memory for multiple intermediate data
- Benefit: less allocation overhead, more memory for other applications
• Kernel Merge
- Replace a sub-graph with a single faster node
- Benefit: better memory locality, less kernel launch overhead
• Graph Scheduling
- Split the graph execution across the whole system: CPU / GPU / dedicated HW
- Benefit: faster execution or lower power consumption
• Data Tiling
- Execute a sub-graph at tile granularity instead of image granularity
- Benefit: better use of data cache and local memory
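Of the four optimizations, kernel merge is the easiest to show in a few lines. The following is a minimal sketch in plain Python with made-up node functions (`scale`, `offset`), not an OpenVX implementation. It assumes two per-pixel nodes in sequence and shows that a merged node produces the same image while touching each pixel once and allocating no intermediate image.

```python
def scale(image, k):
    return [[p * k for p in row] for row in image]


def offset(image, c):
    return [[p + c for p in row] for row in image]


def scale_then_offset_unfused(image, k, c):
    # Two nodes: a full intermediate image exists between them.
    return offset(scale(image, k), c)


def scale_then_offset_fused(image, k, c):
    # One merged node: each pixel is read once, no intermediate image.
    return [[p * k + c for p in row] for row in image]


image = [[1, 2], [3, 4]]
unfused = scale_then_offset_unfused(image, 2, 1)
fused = scale_then_offset_fused(image, 2, 1)
```

A graph runtime can make this substitution automatically only because it sees both nodes and knows the intermediate result is not used elsewhere.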
© Copyright Khronos Group 2016 - Page 18
Example Relative Performance
Relative Performance (chart legend: OpenCV (GPU accelerated), OpenVX (GPU accelerated))
- Arithmetic: 1.1
- Analysis: 2.9
- Filter: 8.7
- Geometric: 1.5
- Overall: 2.5
NVIDIA implementation experience.
Geometric mean of >2200 primitives, grouped into categories,
running at different image sizes and parameter settings
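The slide summarizes many per-primitive ratios with a geometric mean. As a small worked example, the Python sketch below computes one over made-up speedup values; the numbers are hypothetical and are not the data behind the chart. The geometric mean suits ratios because a 2x slowdown and a 2x speedup cancel out, whereas an arithmetic mean would let large speedups dominate.

```python
import math


def geometric_mean(ratios):
    if not ratios:
        raise ValueError("need at least one ratio")
    if any(r <= 0 for r in ratios):
        raise ValueError("ratios must be positive")
    # Average in log space, then map back.
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# Hypothetical per-primitive speedups, chosen for easy arithmetic.
speedups = [0.5, 2.0, 8.0]
gm = geometric_mean(speedups)
am = sum(speedups) / len(speedups)
```

Here the geometric mean is the cube root of 0.5 × 2 × 8 = 8, while the arithmetic mean of the same values is 3.5.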
© Copyright Khronos Group 2016 - Page 19
Layered Vision Processing Ecosystem
Programmable Vision
Processors
Dedicated Vision
Hardware
Application
Processor Hardware
Powerful, flexible
low-level APIs / languages
Application Software
Engines/frameworks
C/C++
Implementers may use OpenCL or Compute Shaders to
implement OpenVX nodes on programmable processors
Developers can then use OpenVX to
easily connect those nodes into a graph
The OpenVX graph enables implementers to optimize execution across
diverse hardware architectures and drive toward lower-power implementations
OpenVX enables the graph to be extended to include hardware
architectures that don’t support programmable APIs
© Copyright Khronos Group 2016 - Page 20
OpenVX 1.0 Shipping, OpenVX 1.1 Released!
• Multiple OpenVX 1.0 Implementations shipping – spec in October 2014
- Open source sample implementation and conformance tests available
• OpenVX 1.1 Specification released 2nd May 2016 at Embedded Vision Summit
- Expands node functionality AND enhances graph framework
- Laplacian pyramids and enhanced filters
- Easier user nodes and control over execution on heterogeneous platforms
- Sample source and conformance tests will be updated to OpenVX 1.1 in 1H16
= shipping implementations
© Copyright Khronos Group 2016 - Page 21
OpenVX Roadmap and Safety Critical APIs
New Generation APIs for
safety certifiable vision,
graphics and compute
e.g. ISO 26262 and DO-178B/C
OpenGL ES 1.0 - 2003
Fixed function graphics
OpenGL ES 2.0 - 2007
Shader programmable pipeline
OpenGL SC 1.0 - 2005
Fixed function graphics subset
OpenGL SC 2.0 - April 2016
Shader programmable pipeline subset
Experience and Guidelines
Small driver size
Advanced functionality
Graphics and compute
OpenVX Roadmap Discussions
Significantly broaden node functionality
In-graph neural nets
Programmable nodes (OpenCL or SPIR-V?)
Market-specific feature sets
OpenVX SC?
© Copyright Khronos Group 2016 - Page 22
Get Involved!
• A diverse set of vision APIs in the industry
- Developer choice is good – but need to choose wisely!
• Many APIs originally created to program GPUs
- But vision processing needs are increasingly driving API roadmaps
• Industry will tend to consolidate around leading APIs
- Working toward a multi-layer API ecosystem
- Powerful foundational hardware APIs enabling rich middleware APIs and libraries
• Any company or organization is welcome to join Khronos
for a voice and a vote in any of its standards
- www.khronos.org
• Neil Trevett
- ntrevett@nvidia.com
- @neilt3d

More Related Content

What's hot

Openvino ncs2
Openvino ncs2Openvino ncs2
Openvino ncs2
艾鍗科技
 
Виктор Ерухимов Open VX mixar moscow sept'15
Виктор Ерухимов Open VX  mixar moscow sept'15 Виктор Ерухимов Open VX  mixar moscow sept'15
Виктор Ерухимов Open VX mixar moscow sept'15
mixARConference
 
"The OpenVX Hardware Acceleration API for Embedded Vision Applications and Li...
"The OpenVX Hardware Acceleration API for Embedded Vision Applications and Li..."The OpenVX Hardware Acceleration API for Embedded Vision Applications and Li...
"The OpenVX Hardware Acceleration API for Embedded Vision Applications and Li...
Edge AI and Vision Alliance
 
"How to Get the Best Deep Learning Performance with the OpenVINO Toolkit," a ...
"How to Get the Best Deep Learning Performance with the OpenVINO Toolkit," a ..."How to Get the Best Deep Learning Performance with the OpenVINO Toolkit," a ...
"How to Get the Best Deep Learning Performance with the OpenVINO Toolkit," a ...
Edge AI and Vision Alliance
 
HSA-4146, Creating Smarter Applications and Systems Through Visual Intelligen...
HSA-4146, Creating Smarter Applications and Systems Through Visual Intelligen...HSA-4146, Creating Smarter Applications and Systems Through Visual Intelligen...
HSA-4146, Creating Smarter Applications and Systems Through Visual Intelligen...
AMD Developer Central
 
AI & Computer Vision (OpenVINO) - CPBR12
AI & Computer Vision (OpenVINO) - CPBR12AI & Computer Vision (OpenVINO) - CPBR12
AI & Computer Vision (OpenVINO) - CPBR12
Jomar Silva
 
ScilabTEC 2015 - Xilinx
ScilabTEC 2015 - XilinxScilabTEC 2015 - Xilinx
ScilabTEC 2015 - Xilinx
Scilab
 
Qualcomm Hexagon SDK: Optimize Your Multimedia Solutions
Qualcomm Hexagon SDK: Optimize Your Multimedia SolutionsQualcomm Hexagon SDK: Optimize Your Multimedia Solutions
Qualcomm Hexagon SDK: Optimize Your Multimedia Solutions
Qualcomm Developer Network
 
Continuous Integration for BSP
Continuous Integration for BSPContinuous Integration for BSP
Continuous Integration for BSP
Witekio
 
!Zpx Overview New
!Zpx Overview New!Zpx Overview New
!Zpx Overview Newcynthiabro
 
Using the Cypress PSoC Processor
Using the Cypress PSoC ProcessorUsing the Cypress PSoC Processor
Using the Cypress PSoC Processor
LloydMoore
 
CC-4001, Aparapi and HSA: Easing the developer path to APU/GPU accelerated Ja...
CC-4001, Aparapi and HSA: Easing the developer path to APU/GPU accelerated Ja...CC-4001, Aparapi and HSA: Easing the developer path to APU/GPU accelerated Ja...
CC-4001, Aparapi and HSA: Easing the developer path to APU/GPU accelerated Ja...
AMD Developer Central
 
Tyrone-Intel oneAPI Webinar: Optimized Tools for Performance-Driven, Cross-Ar...
Tyrone-Intel oneAPI Webinar: Optimized Tools for Performance-Driven, Cross-Ar...Tyrone-Intel oneAPI Webinar: Optimized Tools for Performance-Driven, Cross-Ar...
Tyrone-Intel oneAPI Webinar: Optimized Tools for Performance-Driven, Cross-Ar...
Tyrone Systems
 
MM-4092, Optimizing FFMPEG and Handbrake Using OpenCL and Other AMD HW Capabi...
MM-4092, Optimizing FFMPEG and Handbrake Using OpenCL and Other AMD HW Capabi...MM-4092, Optimizing FFMPEG and Handbrake Using OpenCL and Other AMD HW Capabi...
MM-4092, Optimizing FFMPEG and Handbrake Using OpenCL and Other AMD HW Capabi...
AMD Developer Central
 
HC-4022, Towards an Ecosystem for Heterogeneous Parallel Computing, by Wu Feng
HC-4022, Towards an Ecosystem for Heterogeneous Parallel Computing, by Wu FengHC-4022, Towards an Ecosystem for Heterogeneous Parallel Computing, by Wu Feng
HC-4022, Towards an Ecosystem for Heterogeneous Parallel Computing, by Wu Feng
AMD Developer Central
 
Advanced Single Instruction Multiple Data (SIMD) Programming with Intel® Impl...
Advanced Single Instruction Multiple Data (SIMD) Programming with Intel® Impl...Advanced Single Instruction Multiple Data (SIMD) Programming with Intel® Impl...
Advanced Single Instruction Multiple Data (SIMD) Programming with Intel® Impl...
Intel® Software
 
PG-4039, RapidFire API, by Dmitry Kozlov
PG-4039, RapidFire API, by Dmitry KozlovPG-4039, RapidFire API, by Dmitry Kozlov
PG-4039, RapidFire API, by Dmitry KozlovAMD Developer Central
 
Tinychip PSoC Workshop
Tinychip PSoC WorkshopTinychip PSoC Workshop
Tinychip PSoC Workshop
Dr. Shivananda Koteshwar
 
Design and Testing Challenges for Chiplet Based Design: Assembly and Test View
Design and Testing Challenges for Chiplet Based Design: Assembly and Test ViewDesign and Testing Challenges for Chiplet Based Design: Assembly and Test View
Design and Testing Challenges for Chiplet Based Design: Assembly and Test View
ODSA Workgroup
 

What's hot (20)

Openvino ncs2
Openvino ncs2Openvino ncs2
Openvino ncs2
 
Виктор Ерухимов Open VX mixar moscow sept'15
Виктор Ерухимов Open VX  mixar moscow sept'15 Виктор Ерухимов Open VX  mixar moscow sept'15
Виктор Ерухимов Open VX mixar moscow sept'15
 
"The OpenVX Hardware Acceleration API for Embedded Vision Applications and Li...
"The OpenVX Hardware Acceleration API for Embedded Vision Applications and Li..."The OpenVX Hardware Acceleration API for Embedded Vision Applications and Li...
"The OpenVX Hardware Acceleration API for Embedded Vision Applications and Li...
 
"How to Get the Best Deep Learning Performance with the OpenVINO Toolkit," a ...
"How to Get the Best Deep Learning Performance with the OpenVINO Toolkit," a ..."How to Get the Best Deep Learning Performance with the OpenVINO Toolkit," a ...
"How to Get the Best Deep Learning Performance with the OpenVINO Toolkit," a ...
 
HSA-4146, Creating Smarter Applications and Systems Through Visual Intelligen...
HSA-4146, Creating Smarter Applications and Systems Through Visual Intelligen...HSA-4146, Creating Smarter Applications and Systems Through Visual Intelligen...
HSA-4146, Creating Smarter Applications and Systems Through Visual Intelligen...
 
AI & Computer Vision (OpenVINO) - CPBR12
AI & Computer Vision (OpenVINO) - CPBR12AI & Computer Vision (OpenVINO) - CPBR12
AI & Computer Vision (OpenVINO) - CPBR12
 
ScilabTEC 2015 - Xilinx
ScilabTEC 2015 - XilinxScilabTEC 2015 - Xilinx
ScilabTEC 2015 - Xilinx
 
Qualcomm Hexagon SDK: Optimize Your Multimedia Solutions
Qualcomm Hexagon SDK: Optimize Your Multimedia SolutionsQualcomm Hexagon SDK: Optimize Your Multimedia Solutions
Qualcomm Hexagon SDK: Optimize Your Multimedia Solutions
 
Continuous Integration for BSP
Continuous Integration for BSPContinuous Integration for BSP
Continuous Integration for BSP
 
!Zpx Overview New
!Zpx Overview New!Zpx Overview New
!Zpx Overview New
 
Using the Cypress PSoC Processor
Using the Cypress PSoC ProcessorUsing the Cypress PSoC Processor
Using the Cypress PSoC Processor
 
CC-4001, Aparapi and HSA: Easing the developer path to APU/GPU accelerated Ja...
CC-4001, Aparapi and HSA: Easing the developer path to APU/GPU accelerated Ja...CC-4001, Aparapi and HSA: Easing the developer path to APU/GPU accelerated Ja...
CC-4001, Aparapi and HSA: Easing the developer path to APU/GPU accelerated Ja...
 
Tyrone-Intel oneAPI Webinar: Optimized Tools for Performance-Driven, Cross-Ar...
Tyrone-Intel oneAPI Webinar: Optimized Tools for Performance-Driven, Cross-Ar...Tyrone-Intel oneAPI Webinar: Optimized Tools for Performance-Driven, Cross-Ar...
Tyrone-Intel oneAPI Webinar: Optimized Tools for Performance-Driven, Cross-Ar...
 
MM-4092, Optimizing FFMPEG and Handbrake Using OpenCL and Other AMD HW Capabi...
MM-4092, Optimizing FFMPEG and Handbrake Using OpenCL and Other AMD HW Capabi...MM-4092, Optimizing FFMPEG and Handbrake Using OpenCL and Other AMD HW Capabi...
MM-4092, Optimizing FFMPEG and Handbrake Using OpenCL and Other AMD HW Capabi...
 
Jai kumar fpga_prototyping
Jai kumar fpga_prototypingJai kumar fpga_prototyping
Jai kumar fpga_prototyping
 
HC-4022, Towards an Ecosystem for Heterogeneous Parallel Computing, by Wu Feng
HC-4022, Towards an Ecosystem for Heterogeneous Parallel Computing, by Wu FengHC-4022, Towards an Ecosystem for Heterogeneous Parallel Computing, by Wu Feng
HC-4022, Towards an Ecosystem for Heterogeneous Parallel Computing, by Wu Feng
 
Advanced Single Instruction Multiple Data (SIMD) Programming with Intel® Impl...
Advanced Single Instruction Multiple Data (SIMD) Programming with Intel® Impl...Advanced Single Instruction Multiple Data (SIMD) Programming with Intel® Impl...
Advanced Single Instruction Multiple Data (SIMD) Programming with Intel® Impl...
 
PG-4039, RapidFire API, by Dmitry Kozlov
PG-4039, RapidFire API, by Dmitry KozlovPG-4039, RapidFire API, by Dmitry Kozlov
PG-4039, RapidFire API, by Dmitry Kozlov
 
Tinychip PSoC Workshop
Tinychip PSoC WorkshopTinychip PSoC Workshop
Tinychip PSoC Workshop
 
Design and Testing Challenges for Chiplet Based Design: Assembly and Test View
Design and Testing Challenges for Chiplet Based Design: Assembly and Test ViewDesign and Testing Challenges for Chiplet Based Design: Assembly and Test View
Design and Testing Challenges for Chiplet Based Design: Assembly and Test View
 

Similar to "The Vision API Maze: Options and Trade-offs," a Presentation from the Khronos Group

"An Update on Open Standard APIs for Vision Processing," a Presentation from ...
"An Update on Open Standard APIs for Vision Processing," a Presentation from ..."An Update on Open Standard APIs for Vision Processing," a Presentation from ...
"An Update on Open Standard APIs for Vision Processing," a Presentation from ...
Edge AI and Vision Alliance
 
"New Standards for Embedded Vision and Neural Networks," a Presentation from ...
"New Standards for Embedded Vision and Neural Networks," a Presentation from ..."New Standards for Embedded Vision and Neural Networks," a Presentation from ...
"New Standards for Embedded Vision and Neural Networks," a Presentation from ...
Edge AI and Vision Alliance
 
"The Vision Acceleration API Landscape: Options and Trade-offs," a Presentati...
"The Vision Acceleration API Landscape: Options and Trade-offs," a Presentati..."The Vision Acceleration API Landscape: Options and Trade-offs," a Presentati...
"The Vision Acceleration API Landscape: Options and Trade-offs," a Presentati...
Edge AI and Vision Alliance
 
“Open Standards: Powering the Future of Embedded Vision,” a Presentation from...
“Open Standards: Powering the Future of Embedded Vision,” a Presentation from...“Open Standards: Powering the Future of Embedded Vision,” a Presentation from...
“Open Standards: Powering the Future of Embedded Vision,” a Presentation from...
Edge AI and Vision Alliance
 
"APIs for Accelerating Vision and Inferencing: Options and Trade-offs," a Pre...
"APIs for Accelerating Vision and Inferencing: Options and Trade-offs," a Pre..."APIs for Accelerating Vision and Inferencing: Options and Trade-offs," a Pre...
"APIs for Accelerating Vision and Inferencing: Options and Trade-offs," a Pre...
Edge AI and Vision Alliance
 
OpenCL Overview Japan Virtual Open House Feb 2021
OpenCL Overview Japan Virtual Open House Feb 2021OpenCL Overview Japan Virtual Open House Feb 2021
OpenCL Overview Japan Virtual Open House Feb 2021
The Khronos Group Inc.
 
“Khronos Group Standards: Powering the Future of Embedded Vision,” a Presenta...
“Khronos Group Standards: Powering the Future of Embedded Vision,” a Presenta...“Khronos Group Standards: Powering the Future of Embedded Vision,” a Presenta...
“Khronos Group Standards: Powering the Future of Embedded Vision,” a Presenta...
Edge AI and Vision Alliance
 
LCU13: GPGPU on ARM Experience Report
LCU13: GPGPU on ARM Experience ReportLCU13: GPGPU on ARM Experience Report
LCU13: GPGPU on ARM Experience Report
Linaro
 
Debugging CUDA applications
Debugging CUDA applicationsDebugging CUDA applications
Debugging CUDA applications
Rogue Wave Software
 
"Update on Khronos Standards for Vision and Machine Learning," a Presentation...
"Update on Khronos Standards for Vision and Machine Learning," a Presentation..."Update on Khronos Standards for Vision and Machine Learning," a Presentation...
"Update on Khronos Standards for Vision and Machine Learning," a Presentation...
Edge AI and Vision Alliance
 
Learn more about the tremendous value Open Data Plane brings to NFV
Learn more about the tremendous value Open Data Plane brings to NFVLearn more about the tremendous value Open Data Plane brings to NFV
Learn more about the tremendous value Open Data Plane brings to NFV
Ghodhbane Mohamed Amine
 
“Khronos Standard APIs for Accelerating Vision and Inferencing,” a Presentati...
“Khronos Standard APIs for Accelerating Vision and Inferencing,” a Presentati...“Khronos Standard APIs for Accelerating Vision and Inferencing,” a Presentati...
“Khronos Standard APIs for Accelerating Vision and Inferencing,” a Presentati...
Edge AI and Vision Alliance
 
PT-4052, Introduction to AMD Developer Tools, by Yaki Tebeka and Gordon Selley
PT-4052, Introduction to AMD Developer Tools, by Yaki Tebeka and Gordon SelleyPT-4052, Introduction to AMD Developer Tools, by Yaki Tebeka and Gordon Selley
PT-4052, Introduction to AMD Developer Tools, by Yaki Tebeka and Gordon Selley
AMD Developer Central
 
Массовый параллелизм для гетерогенных вычислений на C++ для беспилотных автом...
Массовый параллелизм для гетерогенных вычислений на C++ для беспилотных автом...Массовый параллелизм для гетерогенных вычислений на C++ для беспилотных автом...
Массовый параллелизм для гетерогенных вычислений на C++ для беспилотных автом...
CEE-SEC(R)
 
Coscup2018 itri android-in-cloud
Coscup2018 itri android-in-cloudCoscup2018 itri android-in-cloud
Coscup2018 itri android-in-cloud
Tian-Jian Wu
 
Minko - Targeting Flash/Stage3D with C++ and GLSL
Minko - Targeting Flash/Stage3D with C++ and GLSLMinko - Targeting Flash/Stage3D with C++ and GLSL
Minko - Targeting Flash/Stage3D with C++ and GLSLMinko3D
 
Debugging Numerical Simulations on Accelerated Architectures - TotalView fo...
 Debugging Numerical Simulations on Accelerated Architectures  - TotalView fo... Debugging Numerical Simulations on Accelerated Architectures  - TotalView fo...
Debugging Numerical Simulations on Accelerated Architectures - TotalView fo...
Rogue Wave Software
 
Hands on OpenCL
Hands on OpenCLHands on OpenCL
Hands on OpenCL
Vladimir Starostenkov
 
Redfish and python-redfish for Software Defined Infrastructure
Redfish and python-redfish for Software Defined InfrastructureRedfish and python-redfish for Software Defined Infrastructure
Redfish and python-redfish for Software Defined Infrastructure
Bruno Cornec
 
DRIVE PX 2
DRIVE PX 2DRIVE PX 2
DRIVE PX 2
Shri Sundaram
 

Similar to "The Vision API Maze: Options and Trade-offs," a Presentation from the Khronos Group (20)

"An Update on Open Standard APIs for Vision Processing," a Presentation from ...
"An Update on Open Standard APIs for Vision Processing," a Presentation from ..."An Update on Open Standard APIs for Vision Processing," a Presentation from ...
"An Update on Open Standard APIs for Vision Processing," a Presentation from ...
 
"New Standards for Embedded Vision and Neural Networks," a Presentation from ...
"New Standards for Embedded Vision and Neural Networks," a Presentation from ..."New Standards for Embedded Vision and Neural Networks," a Presentation from ...
"New Standards for Embedded Vision and Neural Networks," a Presentation from ...
 
"The Vision Acceleration API Landscape: Options and Trade-offs," a Presentati...
"The Vision Acceleration API Landscape: Options and Trade-offs," a Presentati..."The Vision Acceleration API Landscape: Options and Trade-offs," a Presentati...
"The Vision Acceleration API Landscape: Options and Trade-offs," a Presentati...
 
"The Vision API Maze: Options and Trade-offs," a Presentation from the Khronos Group

  • 1. The Vision API Maze: Options and Trade-offs. Embedded Vision Summit, May 2016. Neil Trevett, Khronos President and NVIDIA Vice President Developer Ecosystem. © Copyright Khronos Group 2016.
  • 2. Accelerated Vision API Jungle: Vision Frameworks; Neural Net Libraries; Language-based Acceleration Frameworks; Explicit Kernels; GPU, FPGA, DSP, Dedicated Hardware.
  • 3. OpenGL ES
      - 2003: OpenGL ES 1.0. Fixed function pipeline.
      - 2004: OpenGL ES 1.1 (driver update).
      - 2007: OpenGL ES 2.0 (silicon update). Vertex and fragment shaders.
      - 2012: OpenGL ES 3.0 (silicon update). 32-bit integers and floats; NPOT, 3D/depth textures; texture arrays; multiple render targets.
      - 2014: OpenGL ES 3.1 (driver update). Compute shaders.
      - 2015: OpenGL ES 3.2 (silicon update). Tessellation and geometry shaders; ASTC texture compression; floating point render targets; debug and robustness for security.
      - OpenGL ES 3.1 and Android Extension Pack (+AEP) since Android 5.0 (Lollipop): vertex and fragment shaders, compute shaders.
      - Chart: OpenGL ES 2.0 and OpenGL ES 3.x (source: http://hwstats.unity3d.com/mobile/gpu.html).
      - Epic’s Rivalry demo using full Unreal Engine 4: https://www.youtube.com/watch?v=jRr-G95GdaM
  • 4. New Generation GPU APIs
      - Only Apple: ‘Half Way New Gen’; retains traditional binding model; mixes OpenGL ES 3.1/OpenCL 1.2; C++11-based kernel language; Objective-C or Swift.
      - Only Windows 10.
      - Cross platform: Vulkan 1.0 launched in February; shipping now on Windows, Linux, Android platforms from multiple vendors.
  • 5. Vulkan Explicit GPU Control
      - Layered GPU control: application (single thread per context); high-level driver abstraction (context management, memory allocation, full GLSL compiler, error detection); GPU.
      - Explicit GPU control: application (memory allocation, thread management, synchronization, multi-threaded generation of command buffers); language front-end compilers (initially GLSL); loadable debug and validation layers; thin driver; GPU.
      - Vulkan 1.0 provides access to OpenGL ES 3.1 / OpenGL 4.X-class GPU functionality but with increased performance and flexibility.
      - Vulkan benefits:
          - Loadable layers: no error handling overhead in production code.
          - SPIR-V pre-compiled shaders: no front-end compiler in driver; future shading language flexibility.
          - Simpler drivers: improved efficiency/performance; reduced CPU bottlenecks; lower latency; increased portability.
          - Graphics, compute and DMA queues: work dispatch flexibility.
          - Command buffers: command creation can be multi-threaded; multiple CPU cores increase performance.
          - Resource management in app code: fewer hitches and surprises.
  • 6. NVIDIA CUDA
      • The industry’s original dedicated GPU compute language
          - C/C++11 language extensions for ‘single source’ programming
      • Easy programmability and low level access to GPU
          - Unified Memory, Virtual Addressing, Dynamic Parallelism etc.
      • Mature and optimized tools and compute / imaging libraries
          - Thrust, NPP, cuFFT, cuBLAS, cuda-gdb, nvprof etc.
      • CUDA 7.5 released September 2015
          - Added 16-bit floating point (FP16) data format support
      • NVIDIA only, GPU only
      • CUDA 8 (coming soon): support for Pascal; Unified Memory
      • Flexible mixing of CUDA (C++ language extensions), OpenACC (parallelism through standard C++) and Thrust (C++ parallel library)
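
As a rough illustration of what the FP16 data format mentioned on this slide trades away, the sketch below rounds values to IEEE 754 half precision on the CPU using Python's standard `struct` module (format code `e`). It is not CUDA code, and the helper name `to_fp16` is made up for this example.

```python
import struct


def to_fp16(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision value and return it as a float."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


# 0.1 is not exactly representable; FP16 keeps only 10 mantissa bits.
print(to_fp16(0.1))

# Above 2048, consecutive integers can no longer all be represented.
print(to_fp16(2049.0))

# The largest finite FP16 value.
print(to_fp16(65504.0))

# Storage per element in bytes: FP16 versus FP32.
print(struct.calcsize("<e"), struct.calcsize("<f"))
```

Halving the storage per element is the attraction for image and inference workloads; the limited range and precision are the cost, so whether FP16 is acceptable depends on the data being processed.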
  • 7. OpenCL
      • Heterogeneous parallel programming of diverse compute resources
          - One code tree can be executed on CPUs, GPUs, DSPs and FPGAs
      • OpenCL = two APIs and two kernel languages
          - C Platform Layer API to query, select and initialize compute devices
          - C Runtime API to build and execute kernels across multiple devices
          - OpenCL C and OpenCL C++ kernel languages
      • The OpenCL C++ kernel language is a static subset of C++14
          - Adaptable and elegant sharable code, great for building libraries
          - Templates enable meta-programming for highly adaptive software
          - Lambdas used to implement nested/dynamic parallelism
      • Diagram: OpenCL kernel code is compiled for devices (GPU, DSP, CPU, FPGA); the Runtime API on the CPU host loads and executes kernels across devices
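
The split between kernel code and the host-side runtime can be pictured with a small pure-Python model. This is only a conceptual sketch, not the OpenCL API: `saxpy_kernel` and `enqueue_nd_range` are hypothetical names, and the sequential loop stands in for work-items that a real device would run in parallel.

```python
def saxpy_kernel(gid, a, x, y, out):
    """Kernel body, executed once per work-item.

    gid plays the role that get_global_id(0) plays in an OpenCL C kernel.
    """
    out[gid] = a * x[gid] + y[gid]


def enqueue_nd_range(kernel, global_size, *args):
    """Hypothetical stand-in for a runtime call that launches a kernel over a 1-D range."""
    for gid in range(global_size):  # a real device would run these concurrently
        kernel(gid, *args)


# Host code: prepare buffers, launch the kernel, read back the result.
x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 20.0, 30.0, 40.0]
out = [0.0] * len(x)
enqueue_nd_range(saxpy_kernel, len(x), 2.0, x, y, out)
print(out)
```

The point of the model is the division of labour: the kernel only describes the work for one index, while the host decides how many indices to launch and where the data lives.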
  • 8. OpenCL Conformant Implementations
      - Specifications: OpenCL 1.0 (Dec08); OpenCL 1.1 (Jun10); OpenCL 1.2 (Nov11); OpenCL 2.0 (Nov13); OpenCL 2.1 (Nov15).
      - Categories: Desktop, Mobile, Embedded, FPGA.
      - Vendor timelines are first implementation of each spec generation (version | date): 1.0 | Jul13, 1.0 | Aug09, 1.0 | May09, 1.0 | May10, 1.0 | Feb11, 1.0 | May09, 1.0 | Jan10, 1.1 | Aug10, 1.1 | Jul11, 1.2 | May12, 1.2 | Jun12, 1.1 | Feb11, 1.1 | Mar11, 1.1 | Jun10, 1.1 | Aug12, 1.1 | Nov12, 1.1 | May13, 1.1 | Apr12, 1.2 | Apr14, 1.2 | Sep13, 1.2 | Dec12, 2.0 | Jul14, 1.2 | May15, 2.0 | Dec14, 1.0 | Dec14, 1.2 | Dec14, 1.2 | Sep14, 1.2 | May15, 1.2 | Aug15, 1.2 | Mar16, 2.0 | Nov15.
  • 9. OpenCL 2.2 - Top to Bottom C++
      - OpenCL 1.0 Specification (Dec08).
      - OpenCL 1.1 Specification (Jun10, 18 months): 3-component vectors; additional image formats; multiple hosts and devices; buffer region operations; enhanced event-driven execution; additional OpenCL C built-ins; improved OpenGL data/event interop.
      - OpenCL 1.2 Specification (Nov11, 18 months): device partitioning; separate compilation and linking; enhanced image support; built-in kernels / custom devices; enhanced DX and OpenGL interop.
      - OpenCL 2.0 Specification (Nov13, 24 months): Shared Virtual Memory; on-device dispatch; generic address space; enhanced image support; C11 atomics; pipes; Android ICD.
      - OpenCL 2.1 Specification (Nov15, 24 months): SPIR-V in core; subgroups into core; subgroup query operations; clCloneKernel; low-latency device timer queries.
      - OpenCL 2.2 PROVISIONAL (May16, 7 months): OpenCL C++14 kernel language into core.
          - Single source C++ programming: full support for features in C++14 kernel language.
          - API and language specs: brings C++14 kernel language into core specification.
          - Portable kernel intermediate language: support for C++14 kernel language, e.g. constructors/destructors.
  • 10. SPIR-V Ecosystem
      • SPIR-V
          - Khronos defined and controlled cross-API intermediate language
          - Native support for graphics and parallel constructs
          - 32-bit word stream
          - Extensible and easily parsed
          - Retains data object and control flow information for effective code generation and translation
      [Diagram labels: OpenCL C++, OpenCL C, GLSL, third party kernel and shader languages, LLVM, SPIR-V, IHV driver runtimes, other intermediate forms]
      • Tools shown: SPIR-V Validator, SPIR-V Tools, SPIR-V (Dis)Assembler, LLVM to SPIR-V Bi-directional Translator
      • Diagram legend: "Khronos has open sourced these tools and translators" / "Khronos plans to open source these tools soon"
      • Open source C++ front-end released: https://github.com/KhronosGroup/SPIR/tree/spirv-1.1
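
The "32-bit word stream" and "easily parsed" points on the SPIR-V slide can be illustrated with a minimal sketch. It assumes a little-endian module and relies only on the binary layout defined in the SPIR-V specification: a five-word header (magic number, version, generator, id bound, schema) followed by instructions whose first word packs the word count in the high 16 bits and the opcode in the low 16 bits. The function name `parse_spirv` and the hand-assembled `module` are illustrative and not part of any Khronos tool.

```python
import struct

SPIRV_MAGIC = 0x07230203  # magic number defined by the SPIR-V specification


def parse_spirv(blob: bytes):
    """Split a little-endian SPIR-V binary into a header and instructions.

    Returns (header, instructions), where each instruction is a tuple
    (opcode, operand_words).
    """
    if len(blob) % 4 != 0 or len(blob) < 20:
        raise ValueError("expected 32-bit words and a 5-word header")
    words = struct.unpack(f"<{len(blob) // 4}I", blob)
    magic, version, generator, bound, schema = words[:5]
    if magic != SPIRV_MAGIC:
        raise ValueError("bad magic number (or a big-endian module)")
    header = {
        # Version word layout: 0 | major | minor | 0 (one byte each).
        "version": ((version >> 16) & 0xFF, (version >> 8) & 0xFF),
        "generator": generator,
        "bound": bound,
        "schema": schema,
    }
    instructions = []
    pos = 5
    while pos < len(words):
        word_count = words[pos] >> 16   # includes the first word itself
        opcode = words[pos] & 0xFFFF
        if word_count == 0 or pos + word_count > len(words):
            raise ValueError("malformed instruction")
        instructions.append((opcode, words[pos + 1:pos + word_count]))
        pos += word_count
    return header, instructions


# Hand-assembled fragment for illustration; not a complete, valid shader.
module = struct.pack(
    "<10I",
    SPIRV_MAGIC,
    0x00010100,            # version 1.1
    0,                     # generator id
    1,                     # id bound
    0,                     # schema (reserved)
    (2 << 16) | 17, 1,     # OpCapability Shader
    (3 << 16) | 14, 0, 1,  # OpMemoryModel Logical GLSL450
)

header, instructions = parse_spirv(module)
```

Because every instruction carries its own length, a tool can skip opcodes it does not understand, which is one way to read the slide's "extensible" claim.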
  • 11. SYCL for OpenCL
      • Single-source heterogeneous programming using STANDARD C++
          - Use C++ templates and lambda functions for host & device code
      • Aligns the hardware acceleration of OpenCL with direction of the C++ standard
          - C++14 with open source C++17 Parallel STL hosted by Khronos
      • Developer Choice: the development of the two specifications is aligned so code can be easily shared between the two approaches
          - C++ Kernel Language (Low Level Control): ‘GPGPU’-style separation of device-side kernel source code and host code
          - Single-source C++ (Programmer Familiarity): approach also taken by C++ AMP and OpenMP
  • 12. OpenCL Roadmap Discussions…
      • Embedded
          - Use cases: Signal and Pixel Processing
          - Roadmap: arbitrary precision for power efficiency, hard real-time scheduling, asynch DMA
      • FPGAs
          - Use cases: Network and Stream Processing
          - Roadmap: enhanced execution model, self-synchronized and self-scheduled graphs, fine-grained synchronization between kernels, DSL in C++
      • HPC, SciViz, Datacenter
          - Use case: Numerical Simulation, Virtualization
          - Roadmap: enhanced streaming processing, enhanced library support
      • Vulkan Compute can leverage OpenCL?
          - Gaming Compute, Pixel Processing, Inference
          - Fine grain graphics and compute (no interop needed)
          - SPIR-V for shading language flexibility – C/C++
          - Low-latency, fine grain run-time
          - Google Android adoption
          - Competes well with Metal (=C++/OpenCL 1.2)
          - Roadmap: types, precision and accuracy; pointers and address spaces, execution model
      • Desktop
          - Use cases: Video and Image Processing, Gaming Compute
          - Roadmap: Vulkan interop, arbitrary precision for increased performance, pre-emption, collective programming and improved execution model
      • Mobile
          - Use case: Photo and Vision Processing
          - Roadmap: arbitrary precision for inference engine and pixel processing efficiency, pre-emption and QoS scheduling for power efficiency
      • Possible learnings from Vulkan Philosophy
          1. Explicit - provide direct access to hardware capabilities with thin driver
          2. Feature Sets – enable diverse architectures to ship just relevant features
          3. Open source conformance tests for deep community engagement
  • 13. OpenCV
      • Extensive and widely used open source vision library
          - Written in optimized C/C++
          - Free-use BSD license
      • C++, C, Python and Java interfaces
          - Windows, Linux, Mac OS, iOS and Android
      • Increasingly taking advantage of heterogeneous processing using OpenCL
          - OpenCV 3.X Transparent API; single API entry for each function/algorithm
          - Dynamically loads OpenCL runtime if available; otherwise falls back to CPU code
          - Runtime Dispatching; no recompilation!
      • OpenCV Transparent API for OpenCL Kernel Offload
          - One OpenCL queue per CPU thread
          - CPU threads can share a device
          - OpenCL kernels are executed asynchronously
      [Diagram: an OpenCV application with several CPU threads, each owning an ocl::Queue; queues map onto ocl::Device objects within one ocl::Context]
      • OpenCV is active open source - not an API specification
          - A strength and a weakness!
          - Production deployment often needs tightly defined callable API
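
The "single API entry" plus "falls back to CPU code" behaviour can be sketched as a generic runtime-dispatch pattern. This is not OpenCV code: the names `invert`, `register_accelerated` and the registry below are hypothetical and only show the shape of the idea, assuming an accelerated kernel may or may not have been registered when the program starts.

```python
from typing import Callable, Dict, List

Image = List[List[int]]

# Hypothetical registry of accelerated kernels. It stays empty unless an
# accelerated runtime was found and registered at start-up.
_accelerated: Dict[str, Callable[[Image], Image]] = {}


def register_accelerated(name: str, kernel: Callable[[Image], Image]) -> None:
    """Make an accelerated implementation available under an API name."""
    _accelerated[name] = kernel


def _invert_cpu(img: Image) -> Image:
    """Reference CPU implementation: invert 8-bit pixel values."""
    return [[255 - p for p in row] for row in img]


def invert(img: Image) -> Image:
    """Single entry point: use the accelerated kernel when one is
    registered and working, otherwise run the CPU code."""
    kernel = _accelerated.get("invert")
    if kernel is not None:
        try:
            return kernel(img)
        except RuntimeError:
            pass  # accelerated path failed at run time; use the CPU path
    return _invert_cpu(img)
```

The caller always writes `invert(img)`; which implementation runs is decided when the call is made, so the same program works on machines with and without the accelerated runtime.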
  • 14. Vision Pipeline Challenges and Opportunities
      • Growing Camera Diversity: flexible sensor and camera control to GENERATE an image stream
      • Diverse Vision Processors: use efficient acceleration to PROCESS the image stream
      • Sensor Proliferation: combine vision output with other sensor data on device
  • 15. OpenVX – Low Power Vision Acceleration
      • Precisely defined API for production deployment of vision acceleration
          - Targeted at real-time mobile and embedded platforms
      • Higher abstraction than OpenCL for performance portability across diverse architectures
          - Multi-core CPUs, GPUs, DSPs and DSP arrays, ISPs, Dedicated hardware…
      • Extends portable vision acceleration to very low power domains
          - Doesn’t require high-power CPU/GPU Complex or OpenCL precision
          - Low-power host can set up and manage frame-rate graph
      [Diagram: Application, Vision Engine and Middleware layered over GPU, DSP and Hardware]
      [Chart: Vision Processing Efficiency; axes Power Efficiency and Computation Flexibility; points for Multi-core CPU, GPU Compute, Vision DSPs and Dedicated Hardware; scale marks X1, X10, X100]
  • 16. OpenVX Graphs
      • OpenVX developers express a graph of image operations (‘Nodes’)
          - Nodes can be on any hardware or processor coded in any language
      • Graphs can execute almost autonomously
          - Possible to minimize host interaction during frame-rate graph execution
      • Graphs are the key to run-time optimization opportunities…
      [Diagram: Feature Extraction Example Graph. OpenVX nodes between Camera Input and Rendering Output: Color Conversion, Channel Extract, Image Pyramid, Optical Flow, Harris Track. Data objects: RGB Frame, YUV Frame, Gray Frame, Pyr_t, Ftr_t-1, Array of Keypoints, Array of Features]
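
A small sketch of the "build the graph once, then run it at frame rate" idea from the OpenVX Graphs slide. It loosely mirrors the split between verifying a graph and processing it (`vxVerifyGraph` and `vxProcessGraph` in the OpenVX C API) but is plain Python, not OpenVX. The node names are taken from the example graph; the wiring between them is an assumption made for illustration, and the kernels are stand-ins that only record data flow. It needs Python 3.9 or later for `graphlib`.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# node -> nodes whose outputs it consumes. This wiring is assumed for
# illustration; it is not read off the slide.
DEPS = {
    "color_convert": set(),
    "channel_extract": {"color_convert"},
    "image_pyramid": {"channel_extract"},
    "harris_track": {"channel_extract"},
    "optical_flow": {"image_pyramid", "harris_track"},
}


def verify_graph(deps):
    """Validate the graph once and return a fixed execution order.

    Raises graphlib.CycleError if the graph contains a cycle.
    """
    return list(TopologicalSorter(deps).static_order())


def process_graph(order, deps, frame):
    """Run every node once for one frame, feeding each node its inputs."""
    results = {}
    for node in order:
        # Source nodes read the camera frame; others read upstream results.
        inputs = [results[d] for d in sorted(deps[node])] or [frame]
        # Stand-in kernel: record the data flow instead of touching pixels.
        results[node] = f"{node}({', '.join(inputs)})"
    return results


order = verify_graph(DEPS)            # done once, up front
for frame in ("frame0", "frame1"):    # done per frame, no graph rebuilding
    outputs = process_graph(order, DEPS, frame)
```

Since the whole graph is known before the first frame arrives, an implementation is free to reorder, merge or relocate nodes ahead of time, which is where the run-time optimization opportunities come from.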
  • 17. OpenVX Efficiency through Graphs...
      • Memory Management: reuse pre-allocated memory for multiple intermediate data
          - Less allocation overhead, more memory for other applications
      • Kernel Merge: replace a sub-graph with a single faster node
          - Better memory locality, less kernel launch overhead
      • Graph Scheduling: split the graph execution across the whole system: CPU / GPU / dedicated HW
          - Faster execution or lower power consumption
      • Data Tiling: execute a sub-graph at tile granularity instead of image granularity
          - Better use of data cache and local memory
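
The memory-management point (reusing pre-allocated memory for multiple intermediates) can be illustrated with a simple liveness-based buffer assignment. This is a sketch of one common technique, not a description of how any particular OpenVX implementation works; `assign_buffers` and the four-node chain are hypothetical, and each node is assumed to produce exactly one intermediate of the same size.

```python
def assign_buffers(schedule, consumers):
    """Assign a buffer index to each node's output, reusing dead buffers.

    schedule:  node names in execution order.
    consumers: node -> list of nodes that read its output.
    A buffer is released only after its last reader has run, so a node
    never writes into a buffer it is still reading from.
    """
    position = {node: i for i, node in enumerate(schedule)}
    last_use = {
        node: max((position[c] for c in consumers.get(node, [])),
                  default=position[node])
        for node in schedule
    }
    free, live, assignment, next_id = [], [], {}, 0
    for step, node in enumerate(schedule):
        # Release buffers whose last reader ran before this step.
        still_live = []
        for end, buf in live:
            if end < step:
                free.append(buf)
            else:
                still_live.append((end, buf))
        live = still_live
        if free:
            buf = free.pop()
        else:
            buf, next_id = next_id, next_id + 1
        assignment[node] = buf
        live.append((last_use[node], buf))
    return assignment


# Hypothetical linear pipeline: each node feeds only the next one.
schedule = ["color_convert", "channel_extract", "blur", "edges"]
consumers = {
    "color_convert": ["channel_extract"],
    "channel_extract": ["blur"],
    "blur": ["edges"],
}
assignment = assign_buffers(schedule, consumers)
```

For this chain the four intermediates fit in two buffers that are used alternately; without the graph, each stage would typically allocate its own output.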
  • 18. Example Relative Performance
      [Bar chart: relative performance, OpenCV (GPU accelerated) vs. OpenVX (GPU accelerated); categories Arithmetic, Analysis, Filter, Geometric, Overall; bar labels 1.1, 2.9, 8.7, 1.5, 2.5; axis from 0 to 10]
      • NVIDIA implementation experience
      • Geometric mean of >2200 primitives, grouped into categories, running at different image sizes and parameter settings
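
The performance slide summarizes ratios with a geometric mean, which is the usual choice for averaging speedups: a 2x gain and a 2x loss cancel to 1, whereas an arithmetic mean would report 1.25. The sketch below assumes the five bar labels are listed in the same order as the five category names, so that 2.5 is the "Overall" bar. The slide's overall figure is presumably computed across all primitives rather than from the four category bars, so combining the bars here is only an illustration of the arithmetic.

```python
from statistics import geometric_mean  # Python 3.8+

# Assumed pairing of bar labels with categories (same order as on the slide).
category_speedup = {
    "Arithmetic": 1.1,
    "Analysis": 2.9,
    "Filter": 8.7,
    "Geometric": 1.5,
}

# Geometric mean: the n-th root of the product of n ratios.
from_categories = geometric_mean(list(category_speedup.values()))

# Why not the arithmetic mean: a 2x speedup and a 2x slowdown should cancel.
cancelling = geometric_mean([2.0, 0.5])
arithmetic = (2.0 + 0.5) / 2
```

Under that pairing, the geometric mean of the four category bars comes out at roughly 2.5, in line with the "Overall" bar.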
  • 19. Layered Vision Processing Ecosystem
      [Diagram labels: Application Software; Engines/frameworks; C/C++; Powerful, flexible low-level APIs / languages; Hardware: Application Processor, Programmable Vision Processors, Dedicated Vision Hardware]
      • Implementers may use OpenCL or Compute Shaders to implement OpenVX nodes on programmable processors
      • Developers can then use OpenVX to easily connect those nodes into a graph
      • The OpenVX graph enables implementers to optimize execution across diverse hardware architectures and drive toward lower power implementations
      • OpenVX enables the graph to be extended to include hardware architectures that don’t support programmable APIs
  • 20. OpenVX 1.0 Shipping, OpenVX 1.1 Released!
      • Multiple OpenVX 1.0 Implementations shipping – spec in October 2014
          - Open source sample implementation and conformance tests available
      • OpenVX 1.1 Specification released 2nd May 2016 at Embedded Vision Summit
          - Expands node functionality AND enhances graph framework
          - Laplacian pyramids and enhanced filters
          - Easier user nodes and control over execution on heterogeneous platforms
          - Sample source and conformance tests will be updated to OpenVX 1.1 in 1H16
      [Logos of vendors with shipping implementations are not recoverable from the transcript]
  • 21. OpenVX Roadmap and Safety Critical APIs
      • OpenGL ES 1.0 - 2003: Fixed function graphics
      • OpenGL ES 2.0 - 2007: Shader programmable pipeline
      • OpenGL SC 1.0 - 2005: Fixed function graphics subset
      • OpenGL SC 2.0 - April 2016: Shader programmable pipeline subset
      • New Generation APIs for safety certifiable vision, graphics and compute, e.g. ISO 26262 and DO-178B/C
          - Experience and Guidelines
          - Small driver size
          - Advanced functionality
          - Graphics and compute
      • OpenVX Roadmap Discussions
          - Significantly broaden node functionality
          - In-graph neural nets
          - Programmable nodes (OpenCL or SPIR-V?)
          - Market-specific feature sets
          - OpenVX SC?
  • 22. Get Involved!
      • A diverse set of vision APIs in the industry
          - Developer choice is good – but need to choose wisely!
      • Many APIs originally created to program GPUs
          - But vision processing needs are increasingly driving API roadmaps
      • Industry will tend to consolidate around leading APIs
          - Working toward a multi-layer API ecosystem
          - Powerful foundational hardware APIs enabling rich middleware APIs and libraries
      • Any company or organization is welcome to join Khronos for a voice and a vote in any of its standards
          - www.khronos.org
      • Neil Trevett
          - ntrevett@nvidia.com
          - @neilt3d