SlideShare a Scribd company logo
1 of 39
Download to read offline
© Copyright Khronos Group 2017 - Page 1
Update on Khronos Standards for
Vision and Machine Learning
December 2017
Neil Trevett | Khronos President
NVIDIA | VP Developer Ecosystem
ntrevett@nvidia.com | @neilt3d |www.khronos.org
© Copyright Khronos Group 2017 - Page 2
Khronos Mission
Software
Silicon
Khronos is an International Industry Consortium of over 100 companies creating
royalty-free, Open Standard APIs to enable software to access hardware
acceleration for 3D Graphics, Virtual and Augmented Reality, Parallel Computing,
Neural Networks and Vision Processing
© Copyright Khronos Group 2017 - Page 3
Khronos Open Standards
Vision
sensor(s)
Tracking and
Positioning
Geometric scene
reconstruction
Semantic scene
understanding
(Neural Networks)
Generate Low Latency 3D
Content and Augmentations
Interact with sensor, haptic
and display devices
Download 3D
augmentation object
and scene data
VR/AR Application
© Copyright Khronos Group 2017 - Page 4
Embedded/Mobile
Vision/Inferencing Hardware
Embedded/Mobile
Vision/Inferencing Hardware
Embedded/Mobile
Vision/Inferencing Hardware
Neural Net Training
Frameworks
Neural Net Training
Frameworks
Neural Net Training
Frameworks
Vision and Neural Net
Inferencing Runtime
Embedded/Mobile
Vision/Inferencing Hardware
Standards for Vision and Neural Networks
Vision/AI
Applications
Neural Net Training
Frameworks
Desktop and
Cloud Hardware
Trained
Networks
© Copyright Khronos Group 2017 - Page 5
NNEF - Solving Neural Net Fragmentation
NN Authoring Framework 1
NN Authoring Framework 2
NN Authoring Framework 3
Inference Engine 1
Inference Engine 2
Inference Engine 3
Every Tool Needs an Exporter to
Every Accelerator
With NNEF- NN Training and Inferencing Interoperability
Before NNEF – NN Training and Inferencing Fragmentation
NN Authoring Framework 1
NN Authoring Framework 2
NN Authoring Framework 3
Inference Engine 1
Inference Engine 2
Inference Engine 3
Optimization and processing tools
NNEF is a Cross-vendor Neural Net file format
Encapsulates network formal semantics, structure, data formats,
commonly-used operations (such as convolution, pooling, normalization, etc.)
© Copyright Khronos Group 2017 - Page 6
NNEF Status and Roadmap
• NNEF V1.0 provisional release – targeting late December 2017
- Focus on passing trained frameworks to embedded inference engines
- Support ‘First cut’ range of network types
• NNEF Roadmap
- Track development of new network types
- Allow authoring and retraining (3rd party tools)
- Address a wider range of applications (outside vision apps)
- Increase the expressive power of the format Baseline
Importer
File
Format
Validator
Working Group Participants
Baseline
Exporter
Khronos
Open Source
Projects
© Copyright Khronos Group 2017 - Page 7
OpenVX
PowerEfficiency
Computation Flexibility
Dedicated
Hardware
GPU
Compute
Multi-core
CPUX1
X10
X100
Vision
DSPs
Wide range of vision hardware architectures
OpenVX provides a high-level Graph-based abstraction
->
Enables Graph-level optimizations!
Can be implemented on almost any hardware or processor!
->
Portable, Efficient Vision Processing!
Vision
Node
Vision
Node
Vision
NodeVision
Node
Vision Processing Graph
GPU
Vision Engines
Middleware
Applications
DSP
Hardware
Software
Portability
© Copyright Khronos Group 2017 - Page 8
Simple Edge Detector in OpenVX
vx_image input = vxCreateImage(1920, 1080);
vx_image output = vxCreateImage(0, 0);
vx_image horiz = vxCreateVirtualImage();
vx_image vert = vxCreateVirtualImage();
vx_image mag = vxCreateVirtualImage();
vx_graph g = vxCreateGraph();
vxSobel3x3Node(g, input, horiz, vert);
vxMagnitudeNode(g, horiz, vert, mag);
vxThresholdNode(g, mag, THRESH, output);
status = vxVerifyGraph(g);
status = vxProcessGraph(g);
m
h
i
v
oS M T
Compile the Graph
Execute the Graph
Declare Input and Output Images
Declare Intermediate Images
Construct the Graph topology
© Copyright Khronos Group 2017 - Page 9
OpenVX Evolution
OpenVX 1.0
Spec released October 2014
Conformant
Implementations
OpenVX 1.1
Spec released May 2016
Conformant
Implementations
AMD OpenVX Tools
- Open source, highly optimized
for x86 CPU and OpenCL for GPU
- “Graph Optimizer”
- Scripting for rapid prototyping,
without re-compiling, at
production performance levels
http://gpuopen.com/compute-product/amd-openvx/
New Functionality
Expanded Nodes Functionality
Enhanced Graph Framework
OpenVX 1.2
Spec released May 2017
Adopters Program November 2017
New Functionality
Conditional node execution
Feature detection
Classification operators
Expanded imaging operations
Extensions
Neural Network Acceleration
Graph Save and Restore
16-bit image operation
Safety Critical
OpenVX 1.1 SC for
safety-certifiable systems
OpenVX
Roadmap under Discussion
NNEF Import
Programmable user
kernels with
accelerator offload
© Copyright Khronos Group 2017 - Page 10
New OpenVX 1.2 Functions
• Feature detection: find features useful for object detection and recognition
- Histogram of gradients – HOG  Template matching
- Local binary patterns – LBP  Line finding
• Classification: detect and recognize objects in an image based on a set of features
- Import a classifier model trained offline
- Classify objects based on a set of input features
• Image Processing: transform an image
- Generalized nonlinear filter: Dilate, erode, median with arbitrary kernel shapes
- Non maximum suppression: Find local maximum values in an image
- Edge-preserving noise reduction
• Conditional execution & node predication
- Selectively execute portions of a graph based on a true/false predicate
• Many, many minor improvements
• New Extensions
- Import/export: compile a graph; save and run later
- 16-bit support: signed 16-bit image data
- Neural networks: Layers are represented as OpenVX nodes
B C
S
A
Condition
If A then S ← B else S ← C
© Copyright Khronos Group 2017 - Page 11© Copyright Khronos Group 2017 - Page 11
OpenVX 1.2 and Neural Net Extension
• Convolution Neural Network topologies can be represented as OpenVX graphs
- Layers are represented as OpenVX nodes
- Layers connected by multi-dimensional tensors objects
- Layer types include convolution, activation, pooling, fully-connected, soft-max
- CNN nodes can be mixed with traditional vision nodes
• Import/Export Extension
- Efficient handling of network Weights/Biases or complete networks
• OpenVX will be able to import NNEF files into OpenVX Neural Nets
Vision
Node
Vision
Node
Vision
Node
Downstream
Application
Processing
Native
Camera
Control CNN Nodes
An OpenVX graph mixing CNN nodes
with traditional vision nodes
© Copyright Khronos Group 2017 - Page 12© Copyright Khronos Group 2017 - Page 12
OpenVX SC - Safety Critical Vision Processing
• OpenVX 1.1 - based on OpenVX 1.1 main specification
- Enhanced determinism
- Specification identifies and numbers requirements
• MISRA C clean per KlocWorks v10
• Divides functionality into “development” and “deployment” feature sets
- Adds requirement to support import/export extension
OpenVX SC
Development Feature
Set (Create Graph)
OpenVX SC
Deployment Feature Set
(Execute Graph)
Binary
format
Verify
Export
Import
Entire graph creation API No graph creation APIImplementation
dependent format
© Copyright Khronos Group 2017 - Page 13
Safety Critical APIs
New Generation APIs for safety
certifiable vision, graphics and
compute
e.g. ISO 26262 and DO-178B/C
OpenGL ES 1.0 - 2003
Fixed function graphics
OpenGL ES 2.0 - 2007
Shader programmable pipeline
OpenGL SC 1.0 - 2005
Fixed function graphics subset
OpenGL SC 2.0 - April 2016
Shader programmable pipeline subset
Experience and Guidelines
OpenCL SC TSG Formed
Working on OpenCL SC 1.2
Eliminate Undefined Behavior
Eliminate Callback Functions
Static Pool of Event Objects
OpenVX SC 1.1 Released May 2017
Restricted “deployment” implementation
executes on the target hardware by reading the
binary format and executing the pre-compiled
graphs
Khronos SCAP
‘Safety Critical Advisory Panel’
Guidelines for designing APIs that ease
system certification.
Open to Khronos member AND
industry experts
© Copyright Khronos Group 2017 - Page 14
Embedded/Mobile
Vision/Inferencing Hardware
Embedded/Mobile
Vision/Inferencing Hardware
Embedded/Mobile
Vision/Inferencing Hardware
Neural Net Training
Frameworks
Neural Net Training
Frameworks
Neural Net Training
Frameworks
Vision and Neural Net
Inferencing Runtime
Embedded/Mobile
Vision/Inferencing Hardware
Standards for Vision and Neural Networks
Vision/AI
Applications
Neural Net Training
Frameworks
Desktop and
Cloud Hardware
Trained
Networks
© Copyright Khronos Group 2017 - Page 15
OpenCL – Low-level Parallel Programing
• Low-level, explicit programming of heterogeneous parallel compute resources
- One code tree can be executed on CPUs, GPUs, DSPs and FPGA …
• OpenCL C or C++ language to write kernel programs to execute on any compute device
- Platform Layer API - to query, select and initialize compute devices
- Runtime API - to build and execute kernels programs on multiple devices
• The programmer gets to control:
- What programs execute on what device
- Where data is stored in various speed and size memories in the system
- When programs are run, and what operations are dependent on earlier operations
OpenCL
Kernel
Code
OpenCL
Kernel
Code
OpenCL
Kernel
Code
OpenCL
Kernel
Code
GPU
DSP
CPU
CPU
FPGA
Kernel code
compiled for
devices
Devices
CPU
Host
Runtime API
loads and executes
kernels across devices
© Copyright Khronos Group 2017 - Page 16
OpenCL 2.2 Released in May 2017
2011
OpenCL 1.2
Becomes
industry
baseline for
heterogeneous
parallel
computing
OpenCL 2.1
SPIR-V 1.0
SPIR-V in Core
Kernel Language
Flexibility
OpenCL 2.2
SPIR-V 1.2
OpenCL C++ Kernel Language
Static subset of C++14
Templates and Lambdas
SPIR-V 1.2
OpenCL C++ support
Pipes
Efficient device-scope
communication between kernels
Code Generation Optimizations
- Specialization constants at
SPIR-V compilation time
- Constructors and destructors of
program scope global objects
- User callbacks can be set at
program release time
201720152013
OpenCL 2.0
Enables new class
of hardware
SVM
Generic Addresses
On-device dispatch
https://www.khronos.org/opencl/
© Copyright Khronos Group 2017 - Page 17
OpenCL Conformant Implementations
OpenCL 1.0
Specification
Dec08 Jun10
OpenCL 1.1
Specification
Nov11
OpenCL 1.2
Specification
OpenCL 2.0
Specification
Nov13
1.0|Jul13
1.0|Aug09
1.0|May09
1.0|May10
1.0 | Feb11
1.0|May09
1.0|Jan10
1.1|Aug10
1.1|Jul11
1.2|May12
1.2|Jun12
1.1|Feb11
1.1|Mar11
1.1|Jun10
1.1|Aug12
1.1|Nov12
1.1|May13
1.1|Apr12
1.2|Apr14
1.2|Sep13
1.2|Dec12
Desktop
Mobile
FPGA
2.0|Jul14
OpenCL 2.1
Specification
Nov15
1.2|May15
2.0|Dec14
1.0|Dec14
1.2|Dec14
1.2|Sep14
Vendor timelines are
first conformant
submission for each spec
generation
1.2|May15
Embedded
1.2|Aug15
1.2|Mar16
2.0|Nov15
2.0|Apr17
2.1 | Jun16
© Copyright Khronos Group 2017 - Page 18
OpenCL as Language/Library Backend
C++ based
Neural
network
framework
MulticoreWare
open source
project on
Bitbucket
Compiler
directives for
Fortran,
C and C++
Java language
extensions
for
parallelism
Language for
image
processing and
computational
photography
Single
Source C++
Programming
for OpenCL
Hundreds of languages, frameworks
and projects using OpenCL to access
vendor-optimized, heterogeneous
compute runtimes
Vision
processing
open source
project
Open source
software library
for machine
learning
Over 4,000 GitHub repositories
using OpenCL: tools, applications,
libraries, languages - up from
2,000 two years ago
© Copyright Khronos Group 2017 - Page 19
SYCL Ecosystem
• Single-source heterogeneous programming using STANDARD C++
- Use C++ templates and lambda functions for host & device code
- Layered over OpenCL
• Fast and powerful path for bring C++ apps and libraries to OpenCL
- C++ Kernel Fusion - better performance on complex software than hand-coding
- Halide, Eigen, Boost.Compute, SYCLBLAS, SYCL Eigen, SYCL TensorFlow, SYCL GTX
- triSYCL, ComputeCpp, VisionCpp, ComputeCpp SDK …
• More information at http://sycl.tech
© Copyright Khronos Group 2017 - Page 20
SYCL 1.2.1 Publicly Released Today!
• Major update representing two and a half years of work by Khronos members
- Incorporates significant experience gained from three separate implementations
- Feedback from developers of machine learning frameworks such as TensorFlow
- TensorFlow now supports SYCL alongside the original CUDA accelerator back-end
- Layers over OpenCL 1.2
“This is a significant release update for SYCL with an enhanced
ecosystem that matches our intention to support machine learning
and align with modern C++17. SYCL continues to help us to drive
the C++ standard towards heterogeneous support. Our intention is
to move forward quickly with the SYCL roadmap to bring even more
emphasis to machine learning and Safety Critical support, as well as
continued alignment with future ISO C++,”
Michael Wong, SYCL working group chair
C++ Kernel Language
Low Level Control
‘GPGPU’-style separation of
device-side kernel source
code and host code
Single-source C++
Programmer Familiarity
Approach also taken by
C++ AMP and OpenMP
Developer Choice
The development of the two specifications are aligned so
code can be easily shared between the two approaches
© Copyright Khronos Group 2017 - Page 21
OpenCL Roadmap
2011
OpenCL 1.2
OpenCL C Kernel
Language
OpenCL 2.1
SPIR-V in Core
2015
SYCL 1.2
C++11 Single source
programming
OpenCL 2.2
C++ Kernel Language
2017
SYCL 2.2
C++14 Single source
programming
Industry working to bring
Heterogeneous compute to
standard ISO C++
C++17 Parallel STL hosted by Khronos
Executors – for scheduling work
“Managed pointers” or “channels” –
for sharing data
OpenCL for DSPs
- Embedded imaging, vision and inferencing
- Flexible reduced precision
- Conformance without IEEE 32 Floating Point
- Explicit DMA
Single source C++ programming.
Great for supporting C++ apps,
libraries and frameworks
Help bring OpenCL-
class compute to Vulkan
for GPUs
© Copyright Khronos Group 2017 - Page 22
Embedded/Mobile
Vision/Inferencing Hardware
Embedded/Mobile
Vision/Inferencing Hardware
Embedded/Mobile
Vision/Inferencing Hardware
Neural Net Training
Frameworks
Neural Net Training
Frameworks
Neural Net Training
Frameworks
Vision and Neural Net
Inferencing Runtime
Embedded/Mobile
Vision/Inferencing Hardware
Standards for Vision and Neural Networks
Vision/AI
Applications
Neural Net Training
Frameworks
Desktop and
Cloud Hardware
Trained
Networks
© Copyright Khronos Group 2017 - Page 23
Pervasive Vulkan
All Major GPU Companies shipping Vulkan Drivers for Desktop and Mobile Platforms
Embedded LinuxAndroid TVAndroid 7.0+ Nintendo Switch
http://vulkan.gpuinfo.org/
Oculus RiftSteamVR
VR Platforms
Desktop, Mobile, Embedded and Console Platforms Supporting Vulkan
Including phones and tablets from Google, Huawei, Samsung, Sony, Xiaomi - both premium and mid-range devices
GearVR Google Daydream
Game Engines
Desktop
© Copyright Khronos Group 2017 - Page 24
Market Demand for Universal 3D Portability
Games
Engines
Native 3D
Apps
Browser
Engines Games
Engines
Native 3D
Apps
Browser
Engines
Vulkan Universally
Portable Subset
JavaScript and
WebAssembly Native bindings
for ‘nexgen WebGL’
Community Outreach at GDC 2017
Create a hybrid Portability API?
Feedback - AVOID CREATING A FOURTH API!!!
Would need new specification, CTS, Documentation.
Additional developer learning curve.
A whole new specification to name, brand, promote.
Would INCREASE industry fragmentation
Tools Layers
API Libraries
Shader Translators
MAP Vulkan to Metal
and DX12
© Copyright Khronos Group 2017 - Page 25
Vulkan Portability TSG Process
API
Overlap
Analysis
Metal Shading
Language
HLSL
Vulkan Portability Deliverables
1. Vulkan Subset Diff Spec
2. Vulkan Subset Development Layer
3. Vulkan Subset API Library over DX12/Metal
4. SPIRV-Cross Translator
5. Vulkan Subset Conformance Tests
Expand/test existing
open source SPIRV-Cross Tool
Possible proposals for Vulkan extensions for
enhanced portability (and possibly Web
robustness) sent to Vulkan WG
Identify Vulkan
features not directly mappable
to DX12 and Metal
New Vulkan functionality may affect the
overlap analysis
Layers, APIs, Translators and Tests all to be
developed and released in open source
Open source project with similar goals
https://github.com/gfx-rs/gfx
Vulkan on iOS and macOS
https://moltengl.com/moltenvk/
© Copyright Khronos Group 2017 - Page 26
Vulkan Evolution
Feb16
Vulkan 1.0
Vulkan Extensions
Maintenance updates plus additional functionality
Explicit Building Blocks for VR
Explicit Building Blocks for Homogeneous Multi-GPU
Enhanced Windows System Integration
Increased Shader Language Flexibility
Enhanced Cross-Process and Cross-API Sharing
Vulkan Roadmap
Regular Vulkan Core Releases
Integrates proven KHR extensions
Roadmap Discussions
Pushes forward the envelope of GPU Acceleration:
Enhanced Compute and Language Flexibility
Optimized Vision and Inferencing Acceleration
Ray Tracing
Widened Platform Support
Through Vulkan Portability Initiative
Including Vulkan on macOS and iOS
Strengthened Ecosystem and SDK
Enhanced developer and debugging tools
Regression testing for SDK stability
Enhanced Conformance Testing (API now has 198K test cases
- up from 107K last year)
Compiler robustness - including HLSL support
Explicit Access to GPU
Acceleration
© Copyright Khronos Group 2017 - Page 27
Clspv OpenCL C to Vulkan Compiler
• Experimental collaboration between Google, Codeplay, and Adobe
- Successfully tested on over 200K lines of Adobe OpenCL C production code
- Released in open source https://github.com/google/clspv
- Tracks top-of-tree LLVM and clang, not a fork
• Uses new Vulkan extensions to support OpenCL C compute operations
- VK_KHR_16bit_storage/SPV_KHR_16bit_storage
- VK_KHR_variable_pointers/SPV_KHR_variable_pointers
• Compiles OpenCL C’s programming model to Vulkan’s SPIR-V execution environment
- Proof-of-concept that OpenCL compute can be brought seamlessly to Vulkan
Vulkan
Universal
Portability
OpenCL-class
compute in
Vulkan
Universally
Portable
Graphics and
Advanced
Compute
© Copyright Khronos Group 2017 - Page 28
Dedicated Vision
Hardware
Layered Vision/ Neural Net Ecosystem
Programmable Vision
Processors
Application
Implementers may use OpenCL or Vulkan to implement
OpenVX nodes on programmable processors
And then developers can use OpenVX to enable a
developer to easily connect those nodes into a graph
The OpenVX graph abstraction enables implementers to optimize execution
across diverse hardware architectures for optimal power and performance
OpenVX enables the graph to be extended to include hardware
architectures that don’t support programmable APIs
Vulkan Roadmap
Enhanced compute – vision and
inferencing on any platforms with
GPU acceleration
OpenCL Roadmap
Flexible precision for
widespread deployment on low-
cost embedded processors
© Copyright Khronos Group 2017 - Page 29
Khronos Cooperative Framework
Royalty-Free
Specifications
Adopters
Programs
Implementations, Tools
Documentation, Samples
Adopters
Build conformant
implementation and products
Developers
Develop applications
using the APIs
Educators / Certifiers
Create Courses
Training and Certification
Educator Guidelines
Courseware Materials
API Working Groups
(Industry, Academic and
Associate members)
$
$
Khronos Board
Strategy, budget and
oversight$$
One vote per industry member
Conformance
Tests
Open and
royalty-free.
Often open sourced
Advisory Panels
By invitation, no fee.
Provide requirements
and draft spec feedback
Industry and
Community
Feedback and
contributions on
released materials
Membership Fees
Academic $1,000
Small companies $175/employee
(min $3,500)
Non-profits $7,500
Larger companies - $18,000
© Copyright Khronos Group 2017 - Page 30
New Open Source Engagement Model
• Khronos is open sourcing specification sources,
conformance tests, tools
- Merge requests welcome from the community
(subject to review by OpenCL working group)
• Deeper Community Enablement
- Mix your own documentation!
- Contribute and fix conformance tests
- Fix the specification, headers, ICD etc.
- Contribute new features (carefully)
Khronos builds and Ratifies
Canonical Specification under
Khronos IP Framework.
No changes or re-hosting allowed
Spec and Ref Language Source and
derivative materials.
Re-mixable under CC-BY by the
industry and community
Source Materials for Specifications and Reference
Documentation CONTRIBUTED Under Khronos IP Framework
(you won’t assert patents against conformant implementations,
and license copyright for Khronos use)
Contributions and
Distribution under
Apache 2.0 Spec Build
System and
Scripts
Spec and
Ref Language
Source Redistribution
under CC-BY 4.0
Conformance
Test Suite
Source
Community built
documentation and
tools
Contributions and
Distribution under
Apache 2.0
Anyone can test any
implementation at
any time
Khronos Adopters
Program
Conformant Implementations can
use trademark and are covered
by Khronos IP Framework
© Copyright Khronos Group 2017 - Page 31
Khronos Advisory Panels
Working
Group
Advisory
Panel
Shared Email
list and
Repository
Khronos Members
Any company can join
Membership Fee
Sign NDA and IP Framework
Khronos Advisors
Invited industry experts
$0 Cost
Sign NDA and IP Framework
Specification drafts and
invitations for
requirements and feedback
Requirements and
feedback on
specification drafts
Hosted by Khronos.
Under Khronos NDA
Advisory Panels Active for Vulkan, OpenCL/SYCL and NNEF
© Copyright Khronos Group 2017 - Page 32© Copyright Khronos Group 2017 - Page 32
Need for Camera Control API?
• Advanced control of ISP and camera subsystem – with cross-platform portability
- Generate sophisticated image stream for advanced imaging & vision apps
• No platform API currently fulfills all developer requirements
- Portable access to growing sensor diversity: e.g. depth sensors and sensor arrays
- Cross sensor synch: e.g. synch of camera and MEMS sensors
- Advanced, high-frequency per-frame burst control of camera/sensor: e.g. ROI
- Multiple input, output re-circulating streams with RAW, Bayer or YUV Processing
Image Signal
Processor (ISP)
Defines control of Sensor, Color Filter Array
Lens, Flash, Focus, Aperture
Auto Exposure (AE)
Auto White Balance (AWB)
Auto Focus (AF) Stream of Images for
Vision Processing
OpenKCAM standard is
currently on ice – do we
need to restart?
© Copyright Khronos Group 2017 - Page 33© Copyright Khronos Group 2017 - Page 33
Please Get Involved!
• Khronos is creating cutting-edge royalty-free open standards
- For embedded vision and neural network acceleration
• Khronos standards are key to many emerging markets such as
3D rendering and advanced gaming, Augmented and Virtual Reality,
Embedded Vision and machine learning
- Advanced next generation capabilities for ALL platforms
• Khronos encourages your participation
- www.khronos.org
- ntrevett@nvidia.com
- @neilt3d
© Copyright Khronos Group 2017 - Page 34© Copyright Khronos Group 2017 - Page 34
OpenVX - Graph-Level Abstraction
• OpenVX developers express a graph of image operations (‘Nodes’)
- Using a C API
• Nodes can be executed on any hardware or processor coded in any language
- Implementers can optimize under the high-level graph abstraction
• Graphs are the key to run-time power and performance optimizations
- E.g. Node fusion, tiled graph processing for cache efficiency etc.
Array of
Keypoints
YUV
Frame
Gray
Frame
Camera
Input
Rendering
Output
Pyrt
Color
Conversion
Channel
Extract
Optical
Flow
Harris
Track
Image
Pyramid
RGB
Frame
Array of
Features
Ftrt-1OpenVX Graph
OpenVX Nodes
Feature Extraction Example Graph
© Copyright Khronos Group 2017 - Page 35
OpenVX Efficiency through Graphs..
Reuse
pre-allocated
memory for
multiple
intermediate data
Memory
Management
Less allocation overhead,
more memory for
other applications
Replace a sub-
graph with a
single faster node
Kernel
Fusion
Better memory
locality, less kernel
launch overhead
Split the graph
execution across
the whole
system: CPU /
GPU / dedicated
HW
Graph
Scheduling
Faster execution
or lower power
consumption
Execute a sub-
graph at tile
granularity
instead of image
granularity
Data
Tiling
Better use of
data cache and
local memory
© Copyright Khronos Group 2017 - Page 36
SPIR-V Ecosystem
LLVM
Third party kernel and
shader languages
SPIR-V
• Khronos defined cross-API IR
• Native graphics and parallel compute
• Easily parsed/extended 32-bit stream
• Data object/control flow retained for
effective code generation/translation
OpenCL C++
Front-end
OpenCL C
Front-end
glslang
Khronos open source
tools and translators
IHV Driver
Runtimes
Additional
Intermediate Forms
SPIR-V Validator
SPIR-V (Dis)Assembler
LLVM to SPIR-V
Bi-directional
Translator
https://github.com/KhronosGroup/SPIRV-Tools
GLSL HLSL
Khronos liaising with
Clang/LLVM Community
E.g. discussing SPIR-V as
supported Clang target
SPIRV-Cross
GLSL
HLSL
MSL
SPIRV-opt | SPIRV-remap
SPIR-V Optimizations
• Inlining (exhaustive)
• Store/Load Elimination
• Dead Code Elimination
• Dead Branch Elimination
• Common Uniform Elimination
Coming
• Loop Unrolling and Constant Folding
• Common Subexpression Elimination
© Copyright Khronos Group 2017 - Page 37
Vulkan and New Generation 3D APIs
Only AppleOnly Windows 10
Cross Platform
Clean, modern architecture | Low overhead, explicit GPU access
Portable across desktop and mobile | Multi-thread / multi-core friendly
Efficient, low-latency, predictable performance
© Copyright Khronos Group 2017 - Page 38
Vulkan Explicit GPU Control
GPU
High-level Driver
Abstraction
Layered GPU Control
Context management
Memory allocation
Full GLSL compiler
Error detection
Application
Single thread per context
GPU
Thin Driver
Explicit GPU Control
Application
Memory allocation
Thread management
Synchronization
Multi-threaded generation
of command buffers
Multiple Front-end
Compilers
GLSL, HLSL etc.
Loadable debug and
validation layers
Vulkan 1.0 provides access to
OpenGL ES 3.1 / OpenGL 4.X-class GPU functionality
but with increased performance and flexibility
SPIR-V
pre-compiled shaders
Complex drivers
cause overhead
and inconsistent
behavior across
vendors
Always active
error handling
Full GLSL
preprocessor and
compiler in
driver
OpenGL vs.
OpenGL ES
Resource management
offloaded to app:
low-overhead, low-latency
driver
Consistent behavior:
no ‘fighting with driver
heuristics’
Validation and debug layers
loaded only when needed
SPIR-V intermediate
language: shading language
flexibility
Multi-threaded command
creation. Multiple graphics,
command and DMA queues
Unified API across all
platforms with feature set
flexibility
Vulkan = high performance and
low latency 3D and GPU Compute.
Ideal for VR/AR applications
© Copyright Khronos Group 2017 - Page 39
OpenCL Ecosystem
Hardware Implementers
Desktop/Mobile/Embedded/FPGA
100s of applications using
OpenCL acceleration
Rendering, visualization, video editing,
simulation, image processing, vision and
neural network inferencing
Single Source C++ Programming
Portable Kernel Intermediate Language
Core API and Language Specs
OpenCL 2.2 – Top to Bottom C++

More Related Content

What's hot

"The Path from ADAS to Autonomy," a Presentation from Strategy Analytics
"The Path from ADAS to Autonomy," a Presentation from Strategy Analytics"The Path from ADAS to Autonomy," a Presentation from Strategy Analytics
"The Path from ADAS to Autonomy," a Presentation from Strategy AnalyticsEdge AI and Vision Alliance
 
"Implementing the TensorFlow Deep Learning Framework on Qualcomm’s Low-power ...
"Implementing the TensorFlow Deep Learning Framework on Qualcomm’s Low-power ..."Implementing the TensorFlow Deep Learning Framework on Qualcomm’s Low-power ...
"Implementing the TensorFlow Deep Learning Framework on Qualcomm’s Low-power ...Edge AI and Vision Alliance
 
"Collaboratively Benchmarking and Optimizing Deep Learning Implementations," ...
"Collaboratively Benchmarking and Optimizing Deep Learning Implementations," ..."Collaboratively Benchmarking and Optimizing Deep Learning Implementations," ...
"Collaboratively Benchmarking and Optimizing Deep Learning Implementations," ...Edge AI and Vision Alliance
 
"How to Test and Validate an Automated Driving System," a Presentation from M...
"How to Test and Validate an Automated Driving System," a Presentation from M..."How to Test and Validate an Automated Driving System," a Presentation from M...
"How to Test and Validate an Automated Driving System," a Presentation from M...Edge AI and Vision Alliance
 
“A Highly Data-Efficient Deep Learning Approach,” a Presentation from Samsung
“A Highly Data-Efficient Deep Learning Approach,” a Presentation from Samsung“A Highly Data-Efficient Deep Learning Approach,” a Presentation from Samsung
“A Highly Data-Efficient Deep Learning Approach,” a Presentation from SamsungEdge AI and Vision Alliance
 
"New Dataflow Architecture for Machine Learning," a Presentation from Wave Co...
"New Dataflow Architecture for Machine Learning," a Presentation from Wave Co..."New Dataflow Architecture for Machine Learning," a Presentation from Wave Co...
"New Dataflow Architecture for Machine Learning," a Presentation from Wave Co...Edge AI and Vision Alliance
 
"Approaches for Vision-based Driver Monitoring," a Presentation from PathPart...
"Approaches for Vision-based Driver Monitoring," a Presentation from PathPart..."Approaches for Vision-based Driver Monitoring," a Presentation from PathPart...
"Approaches for Vision-based Driver Monitoring," a Presentation from PathPart...Edge AI and Vision Alliance
 
Fixation Prediction for 360° Video Streaming in Head-Mounted Virtual Reality
Fixation Prediction for 360° Video Streaming in Head-Mounted Virtual RealityFixation Prediction for 360° Video Streaming in Head-Mounted Virtual Reality
Fixation Prediction for 360° Video Streaming in Head-Mounted Virtual RealityWen-Chih Lo
 
“Applying the Right Deep Learning Model with the Right Data for Your Applicat...
“Applying the Right Deep Learning Model with the Right Data for Your Applicat...“Applying the Right Deep Learning Model with the Right Data for Your Applicat...
“Applying the Right Deep Learning Model with the Right Data for Your Applicat...Edge AI and Vision Alliance
 
Kevin Shaw at AI Frontiers: AI on the Edge: Bringing Intelligence to Small De...
Kevin Shaw at AI Frontiers: AI on the Edge: Bringing Intelligence to Small De...Kevin Shaw at AI Frontiers: AI on the Edge: Bringing Intelligence to Small De...
Kevin Shaw at AI Frontiers: AI on the Edge: Bringing Intelligence to Small De...AI Frontiers
 
"Dataflow: Where Power Budgets Are Won and Lost," a Presentation from Movidius
"Dataflow: Where Power Budgets Are Won and Lost," a Presentation from Movidius"Dataflow: Where Power Budgets Are Won and Lost," a Presentation from Movidius
"Dataflow: Where Power Budgets Are Won and Lost," a Presentation from MovidiusEdge AI and Vision Alliance
 
"Trade-offs in Implementing Deep Neural Networks on FPGAs," a Presentation fr...
"Trade-offs in Implementing Deep Neural Networks on FPGAs," a Presentation fr..."Trade-offs in Implementing Deep Neural Networks on FPGAs," a Presentation fr...
"Trade-offs in Implementing Deep Neural Networks on FPGAs," a Presentation fr...Edge AI and Vision Alliance
 
"Deep Learning and Vision Algorithm Development in MATLAB Targeting Embedded ...
"Deep Learning and Vision Algorithm Development in MATLAB Targeting Embedded ..."Deep Learning and Vision Algorithm Development in MATLAB Targeting Embedded ...
"Deep Learning and Vision Algorithm Development in MATLAB Targeting Embedded ...Edge AI and Vision Alliance
 
“DepthAI: Embedded, Performant Spatial AI and Computer Vision,” a Presentatio...
“DepthAI: Embedded, Performant Spatial AI and Computer Vision,” a Presentatio...“DepthAI: Embedded, Performant Spatial AI and Computer Vision,” a Presentatio...
“DepthAI: Embedded, Performant Spatial AI and Computer Vision,” a Presentatio...Edge AI and Vision Alliance
 
360° Video Viewing Dataset in Head-Mounted Virtual Reality
360° Video Viewing Dataset in Head-Mounted Virtual Reality360° Video Viewing Dataset in Head-Mounted Virtual Reality
360° Video Viewing Dataset in Head-Mounted Virtual RealityWen-Chih Lo
 
GTC Taiwan 2017 在 Google Cloud 當中使用 GPU 進行效能最佳化
GTC Taiwan 2017 在 Google Cloud 當中使用 GPU 進行效能最佳化GTC Taiwan 2017 在 Google Cloud 當中使用 GPU 進行效能最佳化
GTC Taiwan 2017 在 Google Cloud 當中使用 GPU 進行效能最佳化NVIDIA Taiwan
 
"Machine Learning- based Image Compression: Ready for Prime Time?," a Present...
"Machine Learning- based Image Compression: Ready for Prime Time?," a Present..."Machine Learning- based Image Compression: Ready for Prime Time?," a Present...
"Machine Learning- based Image Compression: Ready for Prime Time?," a Present...Edge AI and Vision Alliance
 
Metta Innovations - Introdução ao Deep Learning aplicado a vídeo analytics
Metta Innovations - Introdução ao Deep Learning aplicado a vídeo analyticsMetta Innovations - Introdução ao Deep Learning aplicado a vídeo analytics
Metta Innovations - Introdução ao Deep Learning aplicado a vídeo analyticsEduardo Gaspar
 
NVIDIA 深度學習教育機構 (DLI): Medical image segmentation using digits
NVIDIA 深度學習教育機構 (DLI): Medical image segmentation using digitsNVIDIA 深度學習教育機構 (DLI): Medical image segmentation using digits
NVIDIA 深度學習教育機構 (DLI): Medical image segmentation using digitsNVIDIA Taiwan
 

What's hot (20)

"The Path from ADAS to Autonomy," a Presentation from Strategy Analytics
"The Path from ADAS to Autonomy," a Presentation from Strategy Analytics"The Path from ADAS to Autonomy," a Presentation from Strategy Analytics
"The Path from ADAS to Autonomy," a Presentation from Strategy Analytics
 
"Implementing the TensorFlow Deep Learning Framework on Qualcomm’s Low-power ...
"Implementing the TensorFlow Deep Learning Framework on Qualcomm’s Low-power ..."Implementing the TensorFlow Deep Learning Framework on Qualcomm’s Low-power ...
"Implementing the TensorFlow Deep Learning Framework on Qualcomm’s Low-power ...
 
"Collaboratively Benchmarking and Optimizing Deep Learning Implementations," ...
"Collaboratively Benchmarking and Optimizing Deep Learning Implementations," ..."Collaboratively Benchmarking and Optimizing Deep Learning Implementations," ...
"Collaboratively Benchmarking and Optimizing Deep Learning Implementations," ...
 
"How to Test and Validate an Automated Driving System," a Presentation from M...
"How to Test and Validate an Automated Driving System," a Presentation from M..."How to Test and Validate an Automated Driving System," a Presentation from M...
"How to Test and Validate an Automated Driving System," a Presentation from M...
 
“A Highly Data-Efficient Deep Learning Approach,” a Presentation from Samsung
“A Highly Data-Efficient Deep Learning Approach,” a Presentation from Samsung“A Highly Data-Efficient Deep Learning Approach,” a Presentation from Samsung
“A Highly Data-Efficient Deep Learning Approach,” a Presentation from Samsung
 
"New Dataflow Architecture for Machine Learning," a Presentation from Wave Co...
"New Dataflow Architecture for Machine Learning," a Presentation from Wave Co..."New Dataflow Architecture for Machine Learning," a Presentation from Wave Co...
"New Dataflow Architecture for Machine Learning," a Presentation from Wave Co...
 
"Approaches for Vision-based Driver Monitoring," a Presentation from PathPart...
"Approaches for Vision-based Driver Monitoring," a Presentation from PathPart..."Approaches for Vision-based Driver Monitoring," a Presentation from PathPart...
"Approaches for Vision-based Driver Monitoring," a Presentation from PathPart...
 
Fixation Prediction for 360° Video Streaming in Head-Mounted Virtual Reality
Fixation Prediction for 360° Video Streaming in Head-Mounted Virtual RealityFixation Prediction for 360° Video Streaming in Head-Mounted Virtual Reality
Fixation Prediction for 360° Video Streaming in Head-Mounted Virtual Reality
 
“Applying the Right Deep Learning Model with the Right Data for Your Applicat...
“Applying the Right Deep Learning Model with the Right Data for Your Applicat...“Applying the Right Deep Learning Model with the Right Data for Your Applicat...
“Applying the Right Deep Learning Model with the Right Data for Your Applicat...
 
Kevin Shaw at AI Frontiers: AI on the Edge: Bringing Intelligence to Small De...
Kevin Shaw at AI Frontiers: AI on the Edge: Bringing Intelligence to Small De...Kevin Shaw at AI Frontiers: AI on the Edge: Bringing Intelligence to Small De...
Kevin Shaw at AI Frontiers: AI on the Edge: Bringing Intelligence to Small De...
 
"Dataflow: Where Power Budgets Are Won and Lost," a Presentation from Movidius
"Dataflow: Where Power Budgets Are Won and Lost," a Presentation from Movidius"Dataflow: Where Power Budgets Are Won and Lost," a Presentation from Movidius
"Dataflow: Where Power Budgets Are Won and Lost," a Presentation from Movidius
 
On-Device AI
On-Device AIOn-Device AI
On-Device AI
 
"Trade-offs in Implementing Deep Neural Networks on FPGAs," a Presentation fr...
"Trade-offs in Implementing Deep Neural Networks on FPGAs," a Presentation fr..."Trade-offs in Implementing Deep Neural Networks on FPGAs," a Presentation fr...
"Trade-offs in Implementing Deep Neural Networks on FPGAs," a Presentation fr...
 
"Deep Learning and Vision Algorithm Development in MATLAB Targeting Embedded ...
"Deep Learning and Vision Algorithm Development in MATLAB Targeting Embedded ..."Deep Learning and Vision Algorithm Development in MATLAB Targeting Embedded ...
"Deep Learning and Vision Algorithm Development in MATLAB Targeting Embedded ...
 
“DepthAI: Embedded, Performant Spatial AI and Computer Vision,” a Presentatio...
“DepthAI: Embedded, Performant Spatial AI and Computer Vision,” a Presentatio...“DepthAI: Embedded, Performant Spatial AI and Computer Vision,” a Presentatio...
“DepthAI: Embedded, Performant Spatial AI and Computer Vision,” a Presentatio...
 
360° Video Viewing Dataset in Head-Mounted Virtual Reality
360° Video Viewing Dataset in Head-Mounted Virtual Reality360° Video Viewing Dataset in Head-Mounted Virtual Reality
360° Video Viewing Dataset in Head-Mounted Virtual Reality
 
GTC Taiwan 2017 在 Google Cloud 當中使用 GPU 進行效能最佳化
GTC Taiwan 2017 在 Google Cloud 當中使用 GPU 進行效能最佳化GTC Taiwan 2017 在 Google Cloud 當中使用 GPU 進行效能最佳化
GTC Taiwan 2017 在 Google Cloud 當中使用 GPU 進行效能最佳化
 
"Machine Learning- based Image Compression: Ready for Prime Time?," a Present...
"Machine Learning- based Image Compression: Ready for Prime Time?," a Present..."Machine Learning- based Image Compression: Ready for Prime Time?," a Present...
"Machine Learning- based Image Compression: Ready for Prime Time?," a Present...
 
Metta Innovations - Introdução ao Deep Learning aplicado a vídeo analytics
Metta Innovations - Introdução ao Deep Learning aplicado a vídeo analyticsMetta Innovations - Introdução ao Deep Learning aplicado a vídeo analytics
Metta Innovations - Introdução ao Deep Learning aplicado a vídeo analytics
 
NVIDIA 深度學習教育機構 (DLI): Medical image segmentation using digits
NVIDIA 深度學習教育機構 (DLI): Medical image segmentation using digitsNVIDIA 深度學習教育機構 (DLI): Medical image segmentation using digits
NVIDIA 深度學習教育機構 (DLI): Medical image segmentation using digits
 

Similar to "Update on Khronos Standards for Vision and Machine Learning," a Presentation from the Khronos Group

"The OpenVX Hardware Acceleration API for Embedded Vision Applications and Li...
"The OpenVX Hardware Acceleration API for Embedded Vision Applications and Li..."The OpenVX Hardware Acceleration API for Embedded Vision Applications and Li...
"The OpenVX Hardware Acceleration API for Embedded Vision Applications and Li...Edge AI and Vision Alliance
 
"The Vision Acceleration API Landscape: Options and Trade-offs," a Presentati...
"The Vision Acceleration API Landscape: Options and Trade-offs," a Presentati..."The Vision Acceleration API Landscape: Options and Trade-offs," a Presentati...
"The Vision Acceleration API Landscape: Options and Trade-offs," a Presentati...Edge AI and Vision Alliance
 
"New Standards for Embedded Vision and Neural Networks," a Presentation from ...
"New Standards for Embedded Vision and Neural Networks," a Presentation from ..."New Standards for Embedded Vision and Neural Networks," a Presentation from ...
"New Standards for Embedded Vision and Neural Networks," a Presentation from ...Edge AI and Vision Alliance
 
"Portable Performance via the OpenVX Computer Vision Library: Case Studies," ...
"Portable Performance via the OpenVX Computer Vision Library: Case Studies," ..."Portable Performance via the OpenVX Computer Vision Library: Case Studies," ...
"Portable Performance via the OpenVX Computer Vision Library: Case Studies," ...Edge AI and Vision Alliance
 
Виктор Ерухимов Open VX mixar moscow sept'15
Виктор Ерухимов Open VX  mixar moscow sept'15 Виктор Ерухимов Open VX  mixar moscow sept'15
Виктор Ерухимов Open VX mixar moscow sept'15 mixARConference
 
"Current and Planned Standards for Computer Vision and Machine Learning," a P...
"Current and Planned Standards for Computer Vision and Machine Learning," a P..."Current and Planned Standards for Computer Vision and Machine Learning," a P...
"Current and Planned Standards for Computer Vision and Machine Learning," a P...Edge AI and Vision Alliance
 
"Recent Developments in Khronos Standards for Embedded Vision," a Presentatio...
"Recent Developments in Khronos Standards for Embedded Vision," a Presentatio..."Recent Developments in Khronos Standards for Embedded Vision," a Presentatio...
"Recent Developments in Khronos Standards for Embedded Vision," a Presentatio...Edge AI and Vision Alliance
 
“OpenVX 1.3: An Open Standard for Computer Vision Software Acceleration,” a P...
“OpenVX 1.3: An Open Standard for Computer Vision Software Acceleration,” a P...“OpenVX 1.3: An Open Standard for Computer Vision Software Acceleration,” a P...
“OpenVX 1.3: An Open Standard for Computer Vision Software Acceleration,” a P...Edge AI and Vision Alliance
 
"APIs for Accelerating Vision and Inferencing: Options and Trade-offs," a Pre...
"APIs for Accelerating Vision and Inferencing: Options and Trade-offs," a Pre..."APIs for Accelerating Vision and Inferencing: Options and Trade-offs," a Pre...
"APIs for Accelerating Vision and Inferencing: Options and Trade-offs," a Pre...Edge AI and Vision Alliance
 
"The OpenVX Computer Vision and Neural Network Inference Library Standard for...
"The OpenVX Computer Vision and Neural Network Inference Library Standard for..."The OpenVX Computer Vision and Neural Network Inference Library Standard for...
"The OpenVX Computer Vision and Neural Network Inference Library Standard for...Edge AI and Vision Alliance
 
"The Vision API Maze: Options and Trade-offs," a Presentation from the Khrono...
"The Vision API Maze: Options and Trade-offs," a Presentation from the Khrono..."The Vision API Maze: Options and Trade-offs," a Presentation from the Khrono...
"The Vision API Maze: Options and Trade-offs," a Presentation from the Khrono...Edge AI and Vision Alliance
 
“Open Standards: Powering the Future of Embedded Vision,” a Presentation from...
“Open Standards: Powering the Future of Embedded Vision,” a Presentation from...“Open Standards: Powering the Future of Embedded Vision,” a Presentation from...
“Open Standards: Powering the Future of Embedded Vision,” a Presentation from...Edge AI and Vision Alliance
 
“The OpenVX Standard API: Computer Vision for the Masses,” a Presentation fro...
“The OpenVX Standard API: Computer Vision for the Masses,” a Presentation fro...“The OpenVX Standard API: Computer Vision for the Masses,” a Presentation fro...
“The OpenVX Standard API: Computer Vision for the Masses,” a Presentation fro...Edge AI and Vision Alliance
 
How APIs are Transforming Cisco Solutions and Catalyzing an Innovation Ecosystem
How APIs are Transforming Cisco Solutions and Catalyzing an Innovation EcosystemHow APIs are Transforming Cisco Solutions and Catalyzing an Innovation Ecosystem
How APIs are Transforming Cisco Solutions and Catalyzing an Innovation EcosystemCisco DevNet
 
Resume_Appaji
Resume_AppajiResume_Appaji
Resume_AppajiAppaji K
 
“Khronos Group Standards: Powering the Future of Embedded Vision,” a Presenta...
“Khronos Group Standards: Powering the Future of Embedded Vision,” a Presenta...“Khronos Group Standards: Powering the Future of Embedded Vision,” a Presenta...
“Khronos Group Standards: Powering the Future of Embedded Vision,” a Presenta...Edge AI and Vision Alliance
 
"APIs for Accelerating Vision and Inferencing: An Industry Overview of Option...
"APIs for Accelerating Vision and Inferencing: An Industry Overview of Option..."APIs for Accelerating Vision and Inferencing: An Industry Overview of Option...
"APIs for Accelerating Vision and Inferencing: An Industry Overview of Option...Edge AI and Vision Alliance
 
"Fast Deployment of Low-power Deep Learning on CEVA Vision Processors," a Pre...
"Fast Deployment of Low-power Deep Learning on CEVA Vision Processors," a Pre..."Fast Deployment of Low-power Deep Learning on CEVA Vision Processors," a Pre...
"Fast Deployment of Low-power Deep Learning on CEVA Vision Processors," a Pre...Edge AI and Vision Alliance
 
"OpenCV on Zynq: Accelerating 4k60 Dense Optical Flow and Stereo Vision," a P...
"OpenCV on Zynq: Accelerating 4k60 Dense Optical Flow and Stereo Vision," a P..."OpenCV on Zynq: Accelerating 4k60 Dense Optical Flow and Stereo Vision," a P...
"OpenCV on Zynq: Accelerating 4k60 Dense Optical Flow and Stereo Vision," a P...Edge AI and Vision Alliance
 
CNCF Introduction - Feb 2018
CNCF Introduction - Feb 2018CNCF Introduction - Feb 2018
CNCF Introduction - Feb 2018Krishna-Kumar
 

Similar to "Update on Khronos Standards for Vision and Machine Learning," a Presentation from the Khronos Group (20)

"The OpenVX Hardware Acceleration API for Embedded Vision Applications and Li...
"The OpenVX Hardware Acceleration API for Embedded Vision Applications and Li..."The OpenVX Hardware Acceleration API for Embedded Vision Applications and Li...
"The OpenVX Hardware Acceleration API for Embedded Vision Applications and Li...
 
"The Vision Acceleration API Landscape: Options and Trade-offs," a Presentati...
"The Vision Acceleration API Landscape: Options and Trade-offs," a Presentati..."The Vision Acceleration API Landscape: Options and Trade-offs," a Presentati...
"The Vision Acceleration API Landscape: Options and Trade-offs," a Presentati...
 
"New Standards for Embedded Vision and Neural Networks," a Presentation from ...
"New Standards for Embedded Vision and Neural Networks," a Presentation from ..."New Standards for Embedded Vision and Neural Networks," a Presentation from ...
"New Standards for Embedded Vision and Neural Networks," a Presentation from ...
 
"Portable Performance via the OpenVX Computer Vision Library: Case Studies," ...
"Portable Performance via the OpenVX Computer Vision Library: Case Studies," ..."Portable Performance via the OpenVX Computer Vision Library: Case Studies," ...
"Portable Performance via the OpenVX Computer Vision Library: Case Studies," ...
 
Виктор Ерухимов Open VX mixar moscow sept'15
Виктор Ерухимов Open VX  mixar moscow sept'15 Виктор Ерухимов Open VX  mixar moscow sept'15
Виктор Ерухимов Open VX mixar moscow sept'15
 
"Current and Planned Standards for Computer Vision and Machine Learning," a P...
"Current and Planned Standards for Computer Vision and Machine Learning," a P..."Current and Planned Standards for Computer Vision and Machine Learning," a P...
"Current and Planned Standards for Computer Vision and Machine Learning," a P...
 
"Recent Developments in Khronos Standards for Embedded Vision," a Presentatio...
"Recent Developments in Khronos Standards for Embedded Vision," a Presentatio..."Recent Developments in Khronos Standards for Embedded Vision," a Presentatio...
"Recent Developments in Khronos Standards for Embedded Vision," a Presentatio...
 
“OpenVX 1.3: An Open Standard for Computer Vision Software Acceleration,” a P...
“OpenVX 1.3: An Open Standard for Computer Vision Software Acceleration,” a P...“OpenVX 1.3: An Open Standard for Computer Vision Software Acceleration,” a P...
“OpenVX 1.3: An Open Standard for Computer Vision Software Acceleration,” a P...
 
"APIs for Accelerating Vision and Inferencing: Options and Trade-offs," a Pre...
"APIs for Accelerating Vision and Inferencing: Options and Trade-offs," a Pre..."APIs for Accelerating Vision and Inferencing: Options and Trade-offs," a Pre...
"APIs for Accelerating Vision and Inferencing: Options and Trade-offs," a Pre...
 
"The OpenVX Computer Vision and Neural Network Inference Library Standard for...
"The OpenVX Computer Vision and Neural Network Inference Library Standard for..."The OpenVX Computer Vision and Neural Network Inference Library Standard for...
"The OpenVX Computer Vision and Neural Network Inference Library Standard for...
 
"The Vision API Maze: Options and Trade-offs," a Presentation from the Khrono...
"The Vision API Maze: Options and Trade-offs," a Presentation from the Khrono..."The Vision API Maze: Options and Trade-offs," a Presentation from the Khrono...
"The Vision API Maze: Options and Trade-offs," a Presentation from the Khrono...
 
“Open Standards: Powering the Future of Embedded Vision,” a Presentation from...
“Open Standards: Powering the Future of Embedded Vision,” a Presentation from...“Open Standards: Powering the Future of Embedded Vision,” a Presentation from...
“Open Standards: Powering the Future of Embedded Vision,” a Presentation from...
 
“The OpenVX Standard API: Computer Vision for the Masses,” a Presentation fro...
“The OpenVX Standard API: Computer Vision for the Masses,” a Presentation fro...“The OpenVX Standard API: Computer Vision for the Masses,” a Presentation fro...
“The OpenVX Standard API: Computer Vision for the Masses,” a Presentation fro...
 
How APIs are Transforming Cisco Solutions and Catalyzing an Innovation Ecosystem
How APIs are Transforming Cisco Solutions and Catalyzing an Innovation EcosystemHow APIs are Transforming Cisco Solutions and Catalyzing an Innovation Ecosystem
How APIs are Transforming Cisco Solutions and Catalyzing an Innovation Ecosystem
 
Resume_Appaji
Resume_AppajiResume_Appaji
Resume_Appaji
 
“Khronos Group Standards: Powering the Future of Embedded Vision,” a Presenta...
“Khronos Group Standards: Powering the Future of Embedded Vision,” a Presenta...“Khronos Group Standards: Powering the Future of Embedded Vision,” a Presenta...
“Khronos Group Standards: Powering the Future of Embedded Vision,” a Presenta...
 
"APIs for Accelerating Vision and Inferencing: An Industry Overview of Option...
"APIs for Accelerating Vision and Inferencing: An Industry Overview of Option..."APIs for Accelerating Vision and Inferencing: An Industry Overview of Option...
"APIs for Accelerating Vision and Inferencing: An Industry Overview of Option...
 
"Fast Deployment of Low-power Deep Learning on CEVA Vision Processors," a Pre...
"Fast Deployment of Low-power Deep Learning on CEVA Vision Processors," a Pre..."Fast Deployment of Low-power Deep Learning on CEVA Vision Processors," a Pre...
"Fast Deployment of Low-power Deep Learning on CEVA Vision Processors," a Pre...
 
"OpenCV on Zynq: Accelerating 4k60 Dense Optical Flow and Stereo Vision," a P...
"OpenCV on Zynq: Accelerating 4k60 Dense Optical Flow and Stereo Vision," a P..."OpenCV on Zynq: Accelerating 4k60 Dense Optical Flow and Stereo Vision," a P...
"OpenCV on Zynq: Accelerating 4k60 Dense Optical Flow and Stereo Vision," a P...
 
CNCF Introduction - Feb 2018
CNCF Introduction - Feb 2018CNCF Introduction - Feb 2018
CNCF Introduction - Feb 2018
 

More from Edge AI and Vision Alliance

“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...Edge AI and Vision Alliance
 
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...Edge AI and Vision Alliance
 
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...Edge AI and Vision Alliance
 
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...Edge AI and Vision Alliance
 
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...Edge AI and Vision Alliance
 
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...Edge AI and Vision Alliance
 
“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...Edge AI and Vision Alliance
 
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsightsEdge AI and Vision Alliance
 
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...Edge AI and Vision Alliance
 
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...Edge AI and Vision Alliance
 
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...Edge AI and Vision Alliance
 
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...Edge AI and Vision Alliance
 
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...Edge AI and Vision Alliance
 
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...Edge AI and Vision Alliance
 
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...Edge AI and Vision Alliance
 
“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from Samsara“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from SamsaraEdge AI and Vision Alliance
 
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...Edge AI and Vision Alliance
 
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...Edge AI and Vision Alliance
 
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...Edge AI and Vision Alliance
 
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...Edge AI and Vision Alliance
 

More from Edge AI and Vision Alliance (20)

“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
 
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
 
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
 
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
 
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
 
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
 
“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...
 
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
 
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
 
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
 
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
 
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
 
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
 
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
 
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
 
“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from Samsara“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from Samsara
 
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
 
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
 
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
 
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
 

Recently uploaded

Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
costume and set research powerpoint presentation
costume and set research powerpoint presentationcostume and set research powerpoint presentation
costume and set research powerpoint presentationphoebematthew05
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 

Recently uploaded (20)

Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
costume and set research powerpoint presentation
costume and set research powerpoint presentationcostume and set research powerpoint presentation
costume and set research powerpoint presentation
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 

"Update on Khronos Standards for Vision and Machine Learning," a Presentation from the Khronos Group

  • 1. © Copyright Khronos Group 2017 - Page 1 Update on Khronos Standards for Vision and Machine Learning December 2017 Neil Trevett | Khronos President NVIDIA | VP Developer Ecosystem ntrevett@nvidia.com | @neilt3d |www.khronos.org
  • 2. © Copyright Khronos Group 2017 - Page 2 Khronos Mission Software Silicon Khronos is an International Industry Consortium of over 100 companies creating royalty-free, Open Standard APIs to enable software to access hardware acceleration for 3D Graphics, Virtual and Augmented Reality, Parallel Computing, Neural Networks and Vision Processing
  • 3. © Copyright Khronos Group 2017 - Page 3 Khronos Open Standards Vision sensor(s) Tracking and Positioning Geometric scene reconstruction Semantic scene understanding (Neural Networks) Generate Low Latency 3D Content and Augmentations Interact with sensor, haptic and display devices Download 3D augmentation object and scene data VR/AR Application
  • 4. © Copyright Khronos Group 2017 - Page 4 Embedded/Mobile Vision/Inferencing Hardware Embedded/Mobile Vision/Inferencing Hardware Embedded/Mobile Vision/Inferencing Hardware Neural Net Training Frameworks Neural Net Training Frameworks Neural Net Training Frameworks Vision and Neural Net Inferencing Runtime Embedded/Mobile Vision/Inferencing Hardware Standards for Vision and Neural Networks Vision/AI Applications Neural Net Training Frameworks Desktop and Cloud Hardware Trained Networks
  • 5. © Copyright Khronos Group 2017 - Page 5 NNEF - Solving Neural Net Fragmentation NN Authoring Framework 1 NN Authoring Framework 2 NN Authoring Framework 3 Inference Engine 1 Inference Engine 2 Inference Engine 3 Every Tool Needs an Exporter to Every Accelerator With NNEF- NN Training and Inferencing Interoperability Before NNEF – NN Training and Inferencing Fragmentation NN Authoring Framework 1 NN Authoring Framework 2 NN Authoring Framework 3 Inference Engine 1 Inference Engine 2 Inference Engine 3 Optimization and processing tools NNEF is a Cross-vendor Neural Net file format Encapsulates network formal semantics, structure, data formats, commonly-used operations (such as convolution, pooling, normalization, etc.)
  • 6. © Copyright Khronos Group 2017 - Page 6 NNEF Status and Roadmap • NNEF V1.0 provisional release – targeting late December 2017 - Focus on passing trained frameworks to embedded inference engines - Support ‘First cut’ range of network types • NNEF Roadmap - Track development of new network types - Allow authoring and retraining (3rd party tools) - Address a wider range of applications (outside vision apps) - Increase the expressive power of the format Baseline Importer File Format Validator Working Group Participants Baseline Exporter Khronos Open Source Projects
  • 7. © Copyright Khronos Group 2017 - Page 7 OpenVX PowerEfficiency Computation Flexibility Dedicated Hardware GPU Compute Multi-core CPUX1 X10 X100 Vision DSPs Wide range of vision hardware architectures OpenVX provides a high-level Graph-based abstraction -> Enables Graph-level optimizations! Can be implemented on almost any hardware or processor! -> Portable, Efficient Vision Processing! Vision Node Vision Node Vision NodeVision Node Vision Processing Graph GPU Vision Engines Middleware Applications DSP Hardware Software Portability
  • 8. © Copyright Khronos Group 2017 - Page 8 Simple Edge Detector in OpenVX vx_image input = vxCreateImage(1920, 1080); vx_image output = vxCreateImage(0, 0); vx_image horiz = vxCreateVirtualImage(); vx_image vert = vxCreateVirtualImage(); vx_image mag = vxCreateVirtualImage(); vx_graph g = vxCreateGraph(); vxSobel3x3Node(g, input, horiz, vert); vxMagnitudeNode(g, horiz, vert, mag); vxThresholdNode(g, mag, THRESH, output); status = vxVerifyGraph(g); status = vxProcessGraph(g); m h i v oS M T Compile the Graph Execute the Graph Declare Input and Output Images Declare Intermediate Images Construct the Graph topology
  • 9. © Copyright Khronos Group 2017 - Page 9 OpenVX Evolution OpenVX 1.0 Spec released October 2014 Conformant Implementations OpenVX 1.1 Spec released May 2016 Conformant Implementations AMD OpenVX Tools - Open source, highly optimized for x86 CPU and OpenCL for GPU - “Graph Optimizer” - Scripting for rapid prototyping, without re-compiling, at production performance levels http://gpuopen.com/compute-product/amd-openvx/ New Functionality Expanded Nodes Functionality Enhanced Graph Framework OpenVX 1.2 Spec released May 2017 Adopters Program November 2017 New Functionality Conditional node execution Feature detection Classification operators Expanded imaging operations Extensions Neural Network Acceleration Graph Save and Restore 16-bit image operation Safety Critical OpenVX 1.1 SC for safety-certifiable systems OpenVX Roadmap under Discussion NNEF Import Programmable user kernels with accelerator offload
  • 10. © Copyright Khronos Group 2017 - Page 10 New OpenVX 1.2 Functions • Feature detection: find features useful for object detection and recognition - Histogram of gradients – HOG  Template matching - Local binary patterns – LBP  Line finding • Classification: detect and recognize objects in an image based on a set of features - Import a classifier model trained offline - Classify objects based on a set of input features • Image Processing: transform an image - Generalized nonlinear filter: Dilate, erode, median with arbitrary kernel shapes - Non maximum suppression: Find local maximum values in an image - Edge-preserving noise reduction • Conditional execution & node predication - Selectively execute portions of a graph based on a true/false predicate • Many, many minor improvements • New Extensions - Import/export: compile a graph; save and run later - 16-bit support: signed 16-bit image data - Neural networks: Layers are represented as OpenVX nodes B C S A Condition If A then S ← B else S ← C
  • 11. © Copyright Khronos Group 2017 - Page 11© Copyright Khronos Group 2017 - Page 11 OpenVX 1.2 and Neural Net Extension • Convolution Neural Network topologies can be represented as OpenVX graphs - Layers are represented as OpenVX nodes - Layers connected by multi-dimensional tensors objects - Layer types include convolution, activation, pooling, fully-connected, soft-max - CNN nodes can be mixed with traditional vision nodes • Import/Export Extension - Efficient handling of network Weights/Biases or complete networks • OpenVX will be able to import NNEF files into OpenVX Neural Nets Vision Node Vision Node Vision Node Downstream Application Processing Native Camera Control CNN Nodes An OpenVX graph mixing CNN nodes with traditional vision nodes
  • 12. © Copyright Khronos Group 2017 - Page 12© Copyright Khronos Group 2017 - Page 12 OpenVX SC - Safety Critical Vision Processing • OpenVX 1.1 - based on OpenVX 1.1 main specification - Enhanced determinism - Specification identifies and numbers requirements • MISRA C clean per KlocWorks v10 • Divides functionality into “development” and “deployment” feature sets - Adds requirement to support import/export extension OpenVX SC Development Feature Set (Create Graph) OpenVX SC Deployment Feature Set (Execute Graph) Binary format Verify Export Import Entire graph creation API No graph creation APIImplementation dependent format
  • 13. © Copyright Khronos Group 2017 - Page 13 Safety Critical APIs New Generation APIs for safety certifiable vision, graphics and compute e.g. ISO 26262 and DO-178B/C OpenGL ES 1.0 - 2003 Fixed function graphics OpenGL ES 2.0 - 2007 Shader programmable pipeline OpenGL SC 1.0 - 2005 Fixed function graphics subset OpenGL SC 2.0 - April 2016 Shader programmable pipeline subset Experience and Guidelines OpenCL SC TSG Formed Working on OpenCL SC 1.2 Eliminate Undefined Behavior Eliminate Callback Functions Static Pool of Event Objects OpenVX SC 1.1 Released May 2017 Restricted “deployment” implementation executes on the target hardware by reading the binary format and executing the pre-compiled graphs Khronos SCAP ‘Safety Critical Advisory Panel’ Guidelines for designing APIs that ease system certification. Open to Khronos member AND industry experts
  • 14. © Copyright Khronos Group 2017 - Page 14 Embedded/Mobile Vision/Inferencing Hardware Embedded/Mobile Vision/Inferencing Hardware Embedded/Mobile Vision/Inferencing Hardware Neural Net Training Frameworks Neural Net Training Frameworks Neural Net Training Frameworks Vision and Neural Net Inferencing Runtime Embedded/Mobile Vision/Inferencing Hardware Standards for Vision and Neural Networks Vision/AI Applications Neural Net Training Frameworks Desktop and Cloud Hardware Trained Networks
  • 15. © Copyright Khronos Group 2017 - Page 15 OpenCL – Low-level Parallel Programing • Low-level, explicit programming of heterogeneous parallel compute resources - One code tree can be executed on CPUs, GPUs, DSPs and FPGA … • OpenCL C or C++ language to write kernel programs to execute on any compute device - Platform Layer API - to query, select and initialize compute devices - Runtime API - to build and execute kernels programs on multiple devices • The programmer gets to control: - What programs execute on what device - Where data is stored in various speed and size memories in the system - When programs are run, and what operations are dependent on earlier operations OpenCL Kernel Code OpenCL Kernel Code OpenCL Kernel Code OpenCL Kernel Code GPU DSP CPU CPU FPGA Kernel code compiled for devices Devices CPU Host Runtime API loads and executes kernels across devices
  • 16. © Copyright Khronos Group 2017 - Page 16 OpenCL 2.2 Released in May 2017 2011 OpenCL 1.2 Becomes industry baseline for heterogeneous parallel computing OpenCL 2.1 SPIR-V 1.0 SPIR-V in Core Kernel Language Flexibility OpenCL 2.2 SPIR-V 1.2 OpenCL C++ Kernel Language Static subset of C++14 Templates and Lambdas SPIR-V 1.2 OpenCL C++ support Pipes Efficient device-scope communication between kernels Code Generation Optimizations - Specialization constants at SPIR-V compilation time - Constructors and destructors of program scope global objects - User callbacks can be set at program release time 201720152013 OpenCL 2.0 Enables new class of hardware SVM Generic Addresses On-device dispatch https://www.khronos.org/opencl/
  • 17. © Copyright Khronos Group 2017 - Page 17 OpenCL Conformant Implementations OpenCL 1.0 Specification Dec08 Jun10 OpenCL 1.1 Specification Nov11 OpenCL 1.2 Specification OpenCL 2.0 Specification Nov13 1.0|Jul13 1.0|Aug09 1.0|May09 1.0|May10 1.0 | Feb11 1.0|May09 1.0|Jan10 1.1|Aug10 1.1|Jul11 1.2|May12 1.2|Jun12 1.1|Feb11 1.1|Mar11 1.1|Jun10 1.1|Aug12 1.1|Nov12 1.1|May13 1.1|Apr12 1.2|Apr14 1.2|Sep13 1.2|Dec12 Desktop Mobile FPGA 2.0|Jul14 OpenCL 2.1 Specification Nov15 1.2|May15 2.0|Dec14 1.0|Dec14 1.2|Dec14 1.2|Sep14 Vendor timelines are first conformant submission for each spec generation 1.2|May15 Embedded 1.2|Aug15 1.2|Mar16 2.0|Nov15 2.0|Apr17 2.1 | Jun16
  • 18. © Copyright Khronos Group 2017 - Page 18 OpenCL as Language/Library Backend C++ based Neural network framework MulticoreWare open source project on Bitbucket Compiler directives for Fortran, C and C++ Java language extensions for parallelism Language for image processing and computational photography Single Source C++ Programming for OpenCL Hundreds of languages, frameworks and projects using OpenCL to access vendor-optimized, heterogeneous compute runtimes Vision processing open source project Open source software library for machine learning Over 4,000 GitHub repositories using OpenCL: tools, applications, libraries, languages - up from 2,000 two years ago
  • 19. © Copyright Khronos Group 2017 - Page 19 SYCL Ecosystem • Single-source heterogeneous programming using STANDARD C++ - Use C++ templates and lambda functions for host & device code - Layered over OpenCL • Fast and powerful path for bring C++ apps and libraries to OpenCL - C++ Kernel Fusion - better performance on complex software than hand-coding - Halide, Eigen, Boost.Compute, SYCLBLAS, SYCL Eigen, SYCL TensorFlow, SYCL GTX - triSYCL, ComputeCpp, VisionCpp, ComputeCpp SDK … • More information at http://sycl.tech
  • 20. © Copyright Khronos Group 2017 - Page 20 SYCL 1.2.1 Publicly Released Today! • Major update representing two and a half years of work by Khronos members - Incorporates significant experience gained from three separate implementations - Feedback from developers of machine learning frameworks such as TensorFlow - TensorFlow now supports SYCL alongside the original CUDA accelerator back-end - Layers over OpenCL 1.2 “This is a significant release update for SYCL with an enhanced ecosystem that matches our intention to support machine learning and align with modern C++17. SYCL continues to help us to drive the C++ standard towards heterogeneous support. Our intention is to move forward quickly with the SYCL roadmap to bring even more emphasis to machine learning and Safety Critical support, as well as continued alignment with future ISO C++,” Michael Wong, SYCL working group chair C++ Kernel Language Low Level Control ‘GPGPU’-style separation of device-side kernel source code and host code Single-source C++ Programmer Familiarity Approach also taken by C++ AMP and OpenMP Developer Choice The development of the two specifications are aligned so code can be easily shared between the two approaches
  • 21. © Copyright Khronos Group 2017 - Page 21 OpenCL Roadmap 2011 OpenCL 1.2 OpenCL C Kernel Language OpenCL 2.1 SPIR-V in Core 2015 SYCL 1.2 C++11 Single source programming OpenCL 2.2 C++ Kernel Language 2017 SYCL 2.2 C++14 Single source programming Industry working to bring Heterogeneous compute to standard ISO C++ C++17 Parallel STL hosted by Khronos Executors – for scheduling work “Managed pointers” or “channels” – for sharing data OpenCL for DSPs - Embedded imaging, vision and inferencing - Flexible reduced precision - Conformance without IEEE 32 Floating Point - Explicit DMA Single source C++ programming. Great for supporting C++ apps, libraries and frameworks Help bring OpenCL- class compute to Vulkan for GPUs
  • 22. © Copyright Khronos Group 2017 - Page 22 Embedded/Mobile Vision/Inferencing Hardware Embedded/Mobile Vision/Inferencing Hardware Embedded/Mobile Vision/Inferencing Hardware Neural Net Training Frameworks Neural Net Training Frameworks Neural Net Training Frameworks Vision and Neural Net Inferencing Runtime Embedded/Mobile Vision/Inferencing Hardware Standards for Vision and Neural Networks Vision/AI Applications Neural Net Training Frameworks Desktop and Cloud Hardware Trained Networks
  • 23. © Copyright Khronos Group 2017 - Page 23 Pervasive Vulkan All Major GPU Companies shipping Vulkan Drivers for Desktop and Mobile Platforms Embedded LinuxAndroid TVAndroid 7.0+ Nintendo Switch http://vulkan.gpuinfo.org/ Oculus RiftSteamVR VR Platforms Desktop, Mobile, Embedded and Console Platforms Supporting Vulkan Including phones and tablets from Google, Huawei, Samsung, Sony, Xiaomi - both premium and mid-range devices GearVR Google Daydream Game Engines Desktop
  • 24. © Copyright Khronos Group 2017 - Page 24 Market Demand for Universal 3D Portability Games Engines Native 3D Apps Browser Engines Games Engines Native 3D Apps Browser Engines Vulkan Universally Portable Subset JavaScript and WebAssembly Native bindings for ‘nexgen WebGL’ Community Outreach at GDC 2017 Create a hybrid Portability API? Feedback - AVOID CREATING A FOURTH API!!! Would need new specification, CTS, Documentation. Additional developer learning curve. A whole new specification to name, brand, promote. Would INCREASE industry fragmentation Tools Layers API Libraries Shader Translators MAP Vulkan to Metal and DX12
  • 25. © Copyright Khronos Group 2017 - Page 25 Vulkan Portability TSG Process API Overlap Analysis Metal Shading Language HLSL Vulkan Portability Deliverables 1. Vulkan Subset Diff Spec 2. Vulkan Subset Development Layer 3. Vulkan Subset API Library over DX12/Metal 4. SPIRV-Cross Translator 5. Vulkan Subset Conformance Tests Expand/test existing open source SPIRV-Cross Tool Possible proposals for Vulkan extensions for enhanced portability (and possibly Web robustness) sent to Vulkan WG Identify Vulkan features not directly mappable to DX12 and Metal New Vulkan functionality may affect the overlap analysis Layers, APIs, Translators and Tests all to be developed and released in open source Open source project with similar goals https://github.com/gfx-rs/gfx Vulkan on iOS and macOS https://moltengl.com/moltenvk/
  • 26. © Copyright Khronos Group 2017 - Page 26 Vulkan Evolution Feb16 Vulkan 1.0 Vulkan Extensions Maintenance updates plus additional functionality Explicit Building Blocks for VR Explicit Building Blocks for Homogeneous Multi-GPU Enhanced Windows System Integration Increased Shader Language Flexibility Enhanced Cross-Process and Cross-API Sharing Vulkan Roadmap Regular Vulkan Core Releases Integrates proven KHR extensions Roadmap Discussions Pushes forward the envelope of GPU Acceleration: Enhanced Compute and Language Flexibility Optimized Vision and Inferencing Acceleration Ray Tracing Widened Platform Support Through Vulkan Portability Initiative Including Vulkan on macOS and iOS Strengthened Ecosystem and SDK Enhanced developer and debugging tools Regression testing for SDK stability Enhanced Conformance Testing (API now has 198K test cases - up from 107K last year) Compiler robustness - including HLSL support Explicit Access to GPU Acceleration
  • 27. © Copyright Khronos Group 2017 - Page 27 Clspv OpenCL C to Vulkan Compiler • Experimental collaboration between Google, Codeplay, and Adobe - Successfully tested on over 200K lines of Adobe OpenCL C production code - Released in open source https://github.com/google/clspv - Tracks top-of-tree LLVM and clang, not a fork • Uses new Vulkan extensions to support OpenCL C compute operations - VK_KHR_16bit_storage/SPV_KHR_16bit_storage - VK_KHR_variable_pointers/SPV_KHR_variable_pointers • Compiles OpenCL C’s programming model to Vulkan’s SPIR-V execution environment - Proof-of-concept that OpenCL compute can be brought seamlessly to Vulkan Vulkan Universal Portability OpenCL-class compute in Vulkan Universally Portable Graphics and Advanced Compute
  • 28. © Copyright Khronos Group 2017 - Page 28 Dedicated Vision Hardware Layered Vision/ Neural Net Ecosystem Programmable Vision Processors Application Implementers may use OpenCL or Vulkan to implement OpenVX nodes on programmable processors And then developers can use OpenVX to enable a developer to easily connect those nodes into a graph The OpenVX graph abstraction enables implementers to optimize execution across diverse hardware architectures for optimal power and performance OpenVX enables the graph to be extended to include hardware architectures that don’t support programmable APIs Vulkan Roadmap Enhanced compute – vision and inferencing on any platforms with GPU acceleration OpenCL Roadmap Flexible precision for widespread deployment on low- cost embedded processors
  • 29. © Copyright Khronos Group 2017 - Page 29 Khronos Cooperative Framework Royalty-Free Specifications Adopters Programs Implementations, Tools Documentation, Samples Adopters Build conformant implementation and products Developers Develop applications using the APIs Educators / Certifiers Create Courses Training and Certification Educator Guidelines Courseware Materials API Working Groups (Industry, Academic and Associate members) $ $ Khronos Board Strategy, budget and oversight$$ One vote per industry member Conformance Tests Open and royalty-free. Often open sourced Advisory Panels By invitation, no fee. Provide requirements and draft spec feedback Industry and Community Feedback and contributions on released materials Membership Fees Academic $1,000 Small companies $175/employee (min $3,500) Non-profits $7,500 Larger companies - $18,000
  • 30. © Copyright Khronos Group 2017 - Page 30 New Open Source Engagement Model • Khronos is open sourcing specification sources, conformance tests, tools - Merge requests welcome from the community (subject to review by OpenCL working group) • Deeper Community Enablement - Mix your own documentation! - Contribute and fix conformance tests - Fix the specification, headers, ICD etc. - Contribute new features (carefully) Khronos builds and Ratifies Canonical Specification under Khronos IP Framework. No changes or re-hosting allowed Spec and Ref Language Source and derivative materials. Re-mixable under CC-BY by the industry and community Source Materials for Specifications and Reference Documentation CONTRIBUTED Under Khronos IP Framework (you won’t assert patents against conformant implementations, and license copyright for Khronos use) Contributions and Distribution under Apache 2.0 Spec Build System and Scripts Spec and Ref Language Source Redistribution under CC-BY 4.0 Conformance Test Suite Source Community built documentation and tools Contributions and Distribution under Apache 2.0 Anyone can test any implementation at any time Khronos Adopters Program Conformant Implementations can use trademark and are covered by Khronos IP Framework
  • 31. © Copyright Khronos Group 2017 - Page 31 Khronos Advisory Panels Working Group Advisory Panel Shared Email list and Repository Khronos Members Any company can join Membership Fee Sign NDA and IP Framework Khronos Advisors Invited industry experts $0 Cost Sign NDA and IP Framework Specification drafts and invitations for requirements and feedback Requirements and feedback on specification drafts Hosted by Khronos. Under Khronos NDA Advisory Panels Active for Vulkan, OpenCL/SYCL and NNEF
  • 32. © Copyright Khronos Group 2017 - Page 32© Copyright Khronos Group 2017 - Page 32 Need for Camera Control API? • Advanced control of ISP and camera subsystem – with cross-platform portability - Generate sophisticated image stream for advanced imaging & vision apps • No platform API currently fulfills all developer requirements - Portable access to growing sensor diversity: e.g. depth sensors and sensor arrays - Cross sensor synch: e.g. synch of camera and MEMS sensors - Advanced, high-frequency per-frame burst control of camera/sensor: e.g. ROI - Multiple input, output re-circulating streams with RAW, Bayer or YUV Processing Image Signal Processor (ISP) Defines control of Sensor, Color Filter Array Lens, Flash, Focus, Aperture Auto Exposure (AE) Auto White Balance (AWB) Auto Focus (AF) Stream of Images for Vision Processing OpenKCAM standard is currently on ice – do we need to restart?
  • 33. © Copyright Khronos Group 2017 - Page 33© Copyright Khronos Group 2017 - Page 33 Please Get Involved! • Khronos is creating cutting-edge royalty-free open standards - For embedded vision and neural network acceleration • Khronos standards are key to many emerging markets such as 3D rendering and advanced gaming, Augmented and Virtual Reality, Embedded Vision and machine learning - Advanced next generation capabilities for ALL platforms • Khronos encourages your participation - www.khronos.org - ntrevett@nvidia.com - @neilt3d
  • 34. © Copyright Khronos Group 2017 - Page 34© Copyright Khronos Group 2017 - Page 34 OpenVX - Graph-Level Abstraction • OpenVX developers express a graph of image operations (‘Nodes’) - Using a C API • Nodes can be executed on any hardware or processor coded in any language - Implementers can optimize under the high-level graph abstraction • Graphs are the key to run-time power and performance optimizations - E.g. Node fusion, tiled graph processing for cache efficiency etc. Array of Keypoints YUV Frame Gray Frame Camera Input Rendering Output Pyrt Color Conversion Channel Extract Optical Flow Harris Track Image Pyramid RGB Frame Array of Features Ftrt-1OpenVX Graph OpenVX Nodes Feature Extraction Example Graph
  • 35. © Copyright Khronos Group 2017 - Page 35 OpenVX Efficiency through Graphs.. Reuse pre-allocated memory for multiple intermediate data Memory Management Less allocation overhead, more memory for other applications Replace a sub- graph with a single faster node Kernel Fusion Better memory locality, less kernel launch overhead Split the graph execution across the whole system: CPU / GPU / dedicated HW Graph Scheduling Faster execution or lower power consumption Execute a sub- graph at tile granularity instead of image granularity Data Tiling Better use of data cache and local memory
  • 36. © Copyright Khronos Group 2017 - Page 36 SPIR-V Ecosystem LLVM Third party kernel and shader languages SPIR-V • Khronos defined cross-API IR • Native graphics and parallel compute • Easily parsed/extended 32-bit stream • Data object/control flow retained for effective code generation/translation OpenCL C++ Front-end OpenCL C Front-end glslang Khronos open source tools and translators IHV Driver Runtimes Additional Intermediate Forms SPIR-V Validator SPIR-V (Dis)Assembler LLVM to SPIR-V Bi-directional Translator https://github.com/KhronosGroup/SPIRV-Tools GLSL HLSL Khronos liaising with Clang/LLVM Community E.g. discussing SPIR-V as supported Clang target SPIRV-Cross GLSL HLSL MSL SPIRV-opt | SPIRV-remap SPIR-V Optimizations • Inlining (exhaustive) • Store/Load Elimination • Dead Code Elimination • Dead Branch Elimination • Common Uniform Elimination Coming • Loop Unrolling and Constant Folding • Common Subexpression Elimination
  • 37. © Copyright Khronos Group 2017 - Page 37 Vulkan and New Generation 3D APIs Only AppleOnly Windows 10 Cross Platform Clean, modern architecture | Low overhead, explicit GPU access Portable across desktop and mobile | Multi-thread / multi-core friendly Efficient, low-latency, predictable performance
  • 38. © Copyright Khronos Group 2017 - Page 38 Vulkan Explicit GPU Control GPU High-level Driver Abstraction Layered GPU Control Context management Memory allocation Full GLSL compiler Error detection Application Single thread per context GPU Thin Driver Explicit GPU Control Application Memory allocation Thread management Synchronization Multi-threaded generation of command buffers Multiple Front-end Compilers GLSL, HLSL etc. Loadable debug and validation layers Vulkan 1.0 provides access to OpenGL ES 3.1 / OpenGL 4.X-class GPU functionality but with increased performance and flexibility SPIR-V pre-compiled shaders Complex drivers cause overhead and inconsistent behavior across vendors Always active error handling Full GLSL preprocessor and compiler in driver OpenGL vs. OpenGL ES Resource management offloaded to app: low-overhead, low-latency driver Consistent behavior: no ‘fighting with driver heuristics’ Validation and debug layers loaded only when needed SPIR-V intermediate language: shading language flexibility Multi-threaded command creation. Multiple graphics, command and DMA queues Unified API across all platforms with feature set flexibility Vulkan = high performance and low latency 3D and GPU Compute. Ideal for VR/AR applications
  • 39. © Copyright Khronos Group 2017 - Page 39 OpenCL Ecosystem Hardware Implementers Desktop/Mobile/Embedded/FPGA 100s of applications using OpenCL acceleration Rendering, visualization, video editing, simulation, image processing, vision and neural network inferencing Single Source C++ Programming Portable Kernel Intermediate Language Core API and Language Specs OpenCL 2.2 – Top to Bottom C++