Khronos OpenVX - GDC 2014


Open Standard APIs for embedded vision processing. Neil Trevett.

Published in: Technology

  1. © Copyright Khronos Group 2014
     Open Standard APIs for Embedded Vision Processing
     Neil Trevett
     Vice President Mobile Ecosystem, NVIDIA
     President, Khronos Group
  2. Speakers This Morning
     • Neil Trevett
       - Vice President Mobile Ecosystem, NVIDIA
       - President, Khronos
       - Chair, OpenCL Working Group
     • Mikael Sevenier
       - Chair, Camera working group
     • Jim Steele
       - CTO, Sensor Platforms
       - Chair, StreamInput
  3. Khronos Connects Software to Silicon
     • Open Consortium creating ROYALTY-FREE, OPEN STANDARD APIs for hardware acceleration
     • Defining the roadmap for low-level silicon interfaces needed on every platform
     • Graphics, compute, rich media, vision, sensor and camera processing
     • Rigorous specifications AND conformance tests for cross-vendor portability
     • Acceleration APIs BY the Industry FOR the Industry
     • Well over a BILLION people use Khronos APIs Every Day…
  4. Khronos Standards
     • Visual Computing
       - 3D Graphics
       - Heterogeneous Parallel Computing
     • 3D Asset Handling
       - 3D authoring asset interchange
       - 3D asset transmission format with compression
     • Acceleration in HTML5
       - 3D in browser, no plug-in
       - Heterogeneous computing for JavaScript
     • Sensor Processing
       - Vision Acceleration
       - Camera Control
       - Sensor Fusion
     • Camera Control API
     Over 100 companies defining royalty-free APIs to connect software to silicon
  5. Sensors & Vision Driving Key Mobile Use Cases
     • Augmented Reality
     • Natural UI with Face, Body and Gesture Tracking
     • Computational Photography and Videography
     • 3D Scene and Object Reconstruction
     [Diagram: use cases arranged along a "Time" axis]
  6. Vision Pipeline Challenges and Opportunities
     • Sensor Proliferation: diverse sensor awareness of the user and surroundings
       - Light / Proximity
       - 2 cameras
       - 3 microphones
       - Touch
       - Position: GPS, WiFi (fingerprint), Cellular trilateration, NFC/Bluetooth Beacons
       - Accelerometer
       - Magnetometer
       - Gyroscope
       - Pressure / Temp / Humidity
       [Diagram label: 19]
     • Growing Camera Diversity: capturing color, range and lightfields
       - Camera sensors >20MPix
       - Novel sensor configurations
       - Stereo pairs
       - Active Structured Light
       - Active TOF
       - Plenoptic Arrays
     • Diverse Vision Processors: driving for high performance and low power
       - Camera ISPs
       - Dedicated vision IP blocks
       - DSPs and DSP arrays
       - Programmable GPUs
       - Multi-core CPUs
     • Flexible sensor and camera control to generate required image stream
     • Use best processing available for image stream processing, with code portability
     • Control/fuse vision data by/with all other sensor data on device
     [Diagram label: Camera Control API]
  7. OpenVX: Power Efficient Vision Acceleration
     • Acceleration API for real-time vision
       - Focus on mobile and embedded systems
     • Enable diverse efficient implementations
       - From CPUs, through GPUs and DSPs to dedicated hardware
     • Foundational API for vision acceleration
       - Can be used by middleware libraries or by applications directly
     • Complementary to OpenCV
       - Which is great for prototyping
     • Khronos open source sample implementation
       - To be released with final specification
       - Sample, not reference: the spec remains the definitive definition of OpenVX operation
     [Diagram: Application; OpenCV open source library; Other higher-level CV libraries; Open source sample implementation; Hardware vendor implementations]
  8. OpenVX Graphs: The Key to Efficiency
     • Vision processing directed graphs for power and performance efficiency
       - Each Node can be implemented in software or accelerated hardware
       - Nodes may be fused by the implementation to eliminate memory transfers
       - Processing can be tiled to keep data entirely in local memory/cache
     • EGLStreams can provide data and event interop with other Khronos APIs
       - BUT use of other Khronos APIs is not mandated
     • VXU Utility Library for access to single nodes
       - Easy way to start using OpenVX by calling each node independently
     [Diagram: Example OpenVX Graph: Native Camera Control feeding four OpenVX Nodes; Heterogeneous Processing]
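The node-fusion point on this slide can be illustrated with a toy model. The sketch below is plain Python, not the OpenVX API: `run_unfused`, `run_fused`, and the two per-pixel operations are hypothetical names chosen for illustration. It only shows why a graph description lets an implementation skip intermediate full-image buffers.

```python
from typing import Callable, List, Tuple

PixelOp = Callable[[int], int]  # hypothetical per-pixel node: 8-bit value in, 8-bit value out


def brighten(p: int) -> int:
    """Add a fixed offset, saturating at 255."""
    return min(p + 10, 255)


def threshold(p: int) -> int:
    """Binarize around mid-grey."""
    return 255 if p >= 128 else 0


def run_unfused(image: List[int], ops: List[PixelOp]) -> Tuple[List[int], int]:
    """Run node by node, writing one full intermediate image per node."""
    buffers_written = 0
    current = image
    for op in ops:
        current = [op(p) for p in current]
        buffers_written += 1
    return current, buffers_written


def run_fused(image: List[int], ops: List[PixelOp]) -> Tuple[List[int], int]:
    """Apply every node to a pixel before moving on, so only the final image is written."""
    def fused(p: int) -> int:
        for op in ops:
            p = op(p)
        return p
    return [fused(p) for p in image], 1


pixels = [0, 100, 120, 200]
unfused_result, unfused_buffers = run_unfused(pixels, [brighten, threshold])
fused_result, fused_buffers = run_fused(pixels, [brighten, threshold])
```

Both paths produce the same image; the fused path writes one buffer instead of two. A real implementation would make this decision internally once the whole graph is known, which is the advantage the slide attributes to graph-based execution.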
  9. OpenVX 1.0 Function Overview
     • Core data structures
       - Images and Image Pyramids
       - Processing Graphs, Kernels, Parameters
     • Image Processing
       - Arithmetic, Logical, and statistical operations
       - Multichannel Color and BitDepth Extraction and Conversion
       - 2D Filtering and Morphological operations
       - Image Resizing and Warping
     • Core Computer Vision
       - Pyramid computation
       - Integral Image computation
     • Feature Extraction and Tracking
       - Histogram Computation and Equalization
       - Canny Edge Detection
       - Harris and FAST Corner detection
       - Sparse Optical Flow
     • OpenVX Specification Evolution
       - OpenVX 1.0 defines framework for creating, managing and executing graphs
       - Focused set of widely used functions that are readily accelerated
       - Implementers can add functions as extensions
       - Widely used extensions adopted into future versions of the core
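As a concrete instance of one function in this list, here is a minimal pure-Python sketch of integral image computation. It is not the OpenVX kernel; `integral_image` and `box_sum` are hypothetical helper names, and images are assumed to be small lists of rows.

```python
from typing import List


def integral_image(img: List[List[int]]) -> List[List[int]]:
    """out[y][x] holds the sum of all pixels at rows <= y and columns <= x."""
    height = len(img)
    width = len(img[0]) if height else 0
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += img[y][x]
            out[y][x] = row_sum + (out[y - 1][x] if y > 0 else 0)
    return out


def box_sum(ii: List[List[int]], x0: int, y0: int, x1: int, y1: int) -> int:
    """Sum over the inclusive rectangle (x0, y0)-(x1, y1) using at most four lookups."""
    total = ii[y1][x1]
    if x0 > 0:
        total -= ii[y1][x0 - 1]
    if y0 > 0:
        total -= ii[y0 - 1][x1]
    if x0 > 0 and y0 > 0:
        total += ii[y0 - 1][x0 - 1]
    return total


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
ii = integral_image(image)
lower_right = box_sum(ii, 1, 1, 2, 2)  # pixels 5, 6, 8, 9
```

Once the table is built, any rectangular sum costs a constant number of lookups regardless of the rectangle size, which is why the operation is a common building block for feature detectors.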
  10. Example Graph: Stereo Machine Vision
      [Diagram: OpenVX Graph: Camera 1 and Camera 2 each feed Stereo Rectify with Remap; Compute Depth Map (User Node); Image Pyramid; Delay; Compute Optical Flow; Detect and track objects (User Node); output: Object coordinates]
      • Tiling extension enables user nodes (extensions) to also optimally run in local memory
  11. OpenVX and OpenCV are Complementary
      • Governance
        - OpenCV: Community driven open source with no formal specification
        - OpenVX: Formal specification defined and implemented by hardware vendors
      • Conformance
        - OpenCV: No conformance tests for consistency and every vendor implements different subset
        - OpenVX: Full conformance test suite / process creates a reliable acceleration platform
      • Portability
        - OpenCV: APIs can vary depending on processor
        - OpenVX: Hardware abstracted for portability
      • Scope
        - OpenCV: Very wide; 1000s of imaging and vision functions; multiple camera APIs/interfaces
        - OpenVX: Tight focus on hardware accelerated functions for mobile vision; use external camera API
      • Efficiency
        - OpenCV: Memory-based architecture; each operation reads and writes memory
        - OpenVX: Graph-based execution; optimizable computation, data transfer
      • Use Case
        - OpenCV: Rapid experimentation
        - OpenVX: Production development & deployment
  12. OpenVX Participants and Timeline
      • Provisional 1.0 specification released November 2013 for industry feedback
      • Aiming for specification finalization and conformance tests 3Q14
      • Itseez is working group chair (the convener of OpenCV)
      • Qualcomm and TI are specification editors
  13. OpenCL: Portable Heterogeneous Computing
      • Portable Heterogeneous programming of diverse compute resources
        - Targeting supercomputers -> embedded systems -> mobile devices
      • One code tree can be executed on CPUs, GPUs, DSPs and hardware
        - Dynamically interrogate system load and balance work across available processors
      • OpenCL = Two APIs and C-based Kernel language
        - Platform Layer API to query, select and initialize compute devices
        - Kernel language: Subset of ISO C99 + language extensions
        - C Runtime API to build and execute kernels across multiple devices
      [Diagram: four blocks of OpenCL Kernel Code dispatched to GPU, DSP, CPU, CPU, HW]
  14. OpenCL as Foundation for Parallel Compute
      • 100+ tool chains and languages leveraging OpenCL
        - Heterogeneous solutions emerging for the most popular programming languages
      • OpenCL provides vendor optimized, cross-platform, cross-vendor access to heterogeneous compute resources
      [Diagram: projects built on OpenCL; some names were logos and did not survive extraction]
        - SYCL: C++ syntax compiler extensions
        - WebCL: JavaScript binding for initiation of OpenCL C kernels
        - River Trail: Language extensions to JavaScript
        - C++ AMP Shevlin Park: Uses Clang and LLVM
        - Harlan: High level language for GPU programming
        - Compiler directives for Fortran, C and C++
        - Aparapi: Java language extensions for parallelism
        - PyOpenCL: Python wrapper around OpenCL
        - Language for image processing and computational photography
        - SPIR: Standard Portable Intermediate Representation (extending LLVM for parallel computation); SPIR 1.2 released in January 2014
  15. OpenVX and OpenCL are Complementary
      • Use Case
        - OpenCL: General Heterogeneous programming
        - OpenVX: Domain targeted Vision processing
      • Architecture
        - OpenCL: Language-based, needs online compilation
        - OpenVX: Library-based, no online compiler required
      • Target Hardware
        - OpenCL: 'Exposed' architected memory model, can impact performance portability
        - OpenVX: Abstracted node and memory model; diverse implementations can be optimized for power and performance
      • Precision
        - OpenCL: Full IEEE floating point mandated
        - OpenVX: Minimal floating point requirements, optimized for vision operators
      • Ease of Use
        - OpenCL: General-purpose math libraries with no built-in vision functions
        - OpenVX: Fully implemented vision operators and framework 'out of the box'
      It is possible to use OpenCL to build OpenVX Nodes
  16. Need for Camera Control API
      • We have a choice of APIs for image and vision processing
        - BUT no open standard API for camera control to FEED these APIs!
      • Need advanced control of ISP and camera subsystem
        - Generate sophisticated image stream for advanced imaging & vision apps
      • No system API fulfills all developer requirements
        - Advanced, high-frequency burst control of camera and sensor operation
        - Portable support for diversity of sensors: e.g. depth sensors and sensor arrays
        - Tight system integration: e.g. synch of camera and MEMS sensors
      [Diagram: Scope of Camera Control API: Lens, Flash, Focus, Aperture; Sensor, Color Filter Array; Bayer; Pre-processing; Image Signal Processor (ISP); Post-processing; RGB/YUV; Image/Vision Applications. Controls: Lens, sensor, aperture control; 3A - Auto Exposure (AE), Auto White Balance (AWB), Auto Focus (AF)]
  17. Advanced Camera Control Use Cases
      • High-dynamic range (HDR) and computational flash photography
        - High-speed burst with individual frame control over exposure and flash
      • Subject isolation and depth detection
        - High-speed burst with individual frame control over focus
      • Rolling shutter elimination
        - High-precision intra-frame synchronization between camera and motion sensor
      • Augmented Reality
        - 60Hz, low-latency capture with motion sensor synchronization
        - Multiple Region of Interest (ROI) capture
        - Synchronized stereo sensors for scene scaling
        - Detailed feedback on camera operation per frame
      • Time-of-flight or structured light depth camera processing
        - Aligned stacking of data from multiple sensors
  18. Camera API Architecture will be FCAM-based
      • No global state
        - State travels with image requests
        - Every stage in the pipeline may have different state
        - Enables fast, deterministic state changes
      • Synchronize devices
        - Lens, flash, sound capture, gyro…
        - Devices can schedule Actions
        - E.g. to be triggered on exposure change
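The "state travels with image requests" idea can be sketched as follows. This is an illustrative Python model only, not the FCAM or Khronos camera API: the `Shot`, `Frame`, and `capture_burst` names and fields are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Shot:
    """A capture request carrying its own complete settings (hypothetical fields)."""
    exposure_us: int
    gain: float
    flash: bool = False


@dataclass(frozen=True)
class Frame:
    """A captured frame tagged with the exact request that produced it."""
    shot: Shot
    index: int


def capture_burst(shots: List[Shot]) -> List[Frame]:
    """Stand-in for a camera pipeline: there is no global state to mutate between
    frames, so each frame's settings are exactly those of its own request."""
    return [Frame(shot=shot, index=i) for i, shot in enumerate(shots)]


# An HDR-style burst: short exposure, long exposure, long exposure with flash.
hdr_burst = [
    Shot(exposure_us=1000, gain=1.0),
    Shot(exposure_us=8000, gain=1.0),
    Shot(exposure_us=8000, gain=1.0, flash=True),
]
frames = capture_burst(hdr_burst)
```

Because every frame references its own immutable request, an application can tell which settings produced which image without tracking when a global setting took effect, which is what makes per-frame burst control deterministic.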
  19. Khronos Camera API Requirements
      • Application control over ISP processing (including 3A)
        - Including multiple, re-entrant ISPs
      • Control multiple sensors with synch and alignment
        - E.g. Stereo pairs, Plenoptic arrays, TOF or structured light depth cameras
      • Enhanced per frame detailed control
        - Format flexibility, Region of Interest (ROI) selection
      • Global timing & synchronization
        - E.g. Between cameras and MEMS sensors
      • Flexible processing/streaming
        - Multiple input and output streams with RAW, Bayer or YUV Processing
        - Streaming of rows (not just frames)
      Enable new camera functionality not available on current platforms and align with future platform directions for easy adoption
  20. Camera API Design Milestones and Philosophy
      • C-language API starting from proven designs
        - e.g. FCAM
      • Design alignment with widely used hardware standards
        - e.g. MIPI CSI
      • Focus on mobile, power-limited devices
        - But do not preclude other use cases such as automotive, surveillance, DSLR…
      • Minimize overlap and maximize interoperability with other Khronos APIs
        - But other Khronos APIs are not mandated
      • Support vendor-specific extensions
      [Timeline]
        - Apr13: Working group proposed
        - Jul13: Group charter approved
        - 4Q13: Architectural Design
        - 1Q14: First draft specification
        - 2Q14: Sample implementation and tests
        - 3Q14: Specification ratification
  21. Potential Adoption on Android
      • Android exposes Java camera APIs to developers
        - Controls underlying Camera HAL
      • Camera HAL v1 API simplified basic point and shoot apps
        - Difficult or impossible to do much else
      • Camera HAL v3 API is a fundamentally different API
        - Streams-based to enable more sophisticated camera applications
      [Diagram notes]
        - FCAM: open source project developed by Nokia and Stanford
        - Camera API HAL V3 adopts many FCAM ideas and can use EGL in its implementation
        - Khronos Camera API builds on FCAM with a goal of being forward compatible with Android architecture
        - Khronos Camera API may be used to IMPLEMENT Android Camera HAL, and provide an advanced native camera API in NDK
  22. StreamInput
      Jim Steele
      CTO, Sensor Platforms
      Chair, StreamInput Working Group
  23. Sensor Industry Fragmentation …
  24. Low-level Sensor Abstraction API
      • Apps Need Sophisticated Access to Sensor Data
        - Without coding to specific sensor hardware
      • Apps request semantic sensor information
        - StreamInput defines possible requests, e.g.
          Read Physical or Virtual Sensors, e.g. "Game Quaternion"
          Context detection, e.g. "Am I in an elevator?"
      • StreamInput processing graph provides optimized sensor data stream
        - High-value, smart sensor fusion middleware can connect to apps in a portable way
        - Apps can gain 'magical' situational awareness
      • Advanced Sensors Everywhere
        - Multi-axis motion/position, quaternions, context-awareness, gestures, activity monitoring, health and environmental sensors
      • Sensor Discoverability
      • Sensor Code Portability
  25. Sensor Types
      • Basic sensor data:
        - Acceleration, Magnetic Field, Angular Rates
        - Pressure, Ambient Light, Proximity, Temperature, Humidity, RGB light, UV light
        - Heart rate, Blood Oxygen Level, Skin Hydration, Breathalyzer
      • Sensor fusion
        - Orientation (Quaternion or Euler Angles), Gravity, Linear Acceleration
        - Position
      • Context awareness
        - Device Motion: general movement of the device: still, free-fall, …
        - Carry: how the device is being held by a user: in pocket, in hand, …
        - Posture: how the body holding the device is positioned: standing, sitting, step, …
        - Transport: about the environment around the device: in elevator, in car, …
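Two of the fused quantities listed above are related by simple formulas, sketched here in plain Python. This is not StreamInput code: the function names are hypothetical, the quaternion is assumed to be a unit quaternion in (w, x, y, z) order, and the Euler angles assume the common Z-Y-X (yaw, pitch, roll) convention.

```python
import math
from typing import Tuple


def quaternion_to_euler(w: float, x: float, y: float, z: float) -> Tuple[float, float, float]:
    """Convert a unit quaternion to (roll, pitch, yaw) in radians, Z-Y-X convention."""
    roll = math.atan2(2.0 * (w * x + y * z), 1.0 - 2.0 * (x * x + y * y))
    # Clamp to guard against rounding pushing the value just outside [-1, 1].
    sin_pitch = max(-1.0, min(1.0, 2.0 * (w * y - z * x)))
    pitch = math.asin(sin_pitch)
    yaw = math.atan2(2.0 * (w * z + x * y), 1.0 - 2.0 * (y * y + z * z))
    return roll, pitch, yaw


def linear_acceleration(accel: Tuple[float, float, float],
                        gravity: Tuple[float, float, float]) -> Tuple[float, ...]:
    """Linear acceleration is the measured acceleration with the gravity estimate removed."""
    return tuple(a - g for a, g in zip(accel, gravity))


# A 90 degree rotation about the vertical (z) axis.
s = math.sqrt(0.5)
roll, pitch, yaw = quaternion_to_euler(s, 0.0, 0.0, s)

# Device lying flat and pushed gently along x.
lin = linear_acceleration((0.5, 0.0, 9.81), (0.0, 0.0, 9.81))
```

Quaternions avoid the gimbal-lock singularity that Euler angles hit at a pitch of plus or minus 90 degrees, which is one reason a sensor API might offer both representations.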
  26. StreamInput: Potential Sensor Fusion Stack
      [Diagram labels: Applications; Middleware (e.g. context-awareness engines, gaming engines); OS Sensor APIs (e.g. Android SensorManager or iOS CoreMotion); Sensor Hub, Sensor Hub; Sensor, Sensor, Sensor, …]
      • Low-level native API defines access to fused sensor data stream and context-awareness
      • StreamInput implementations compete on sensor stream quality, reduced power consumption, environment triggering and context detection, enabling sensor subsystem vendors to increase ADDED VALUE
      • Platforms can provide increased access to improved sensor data stream, driving faster, deeper sensor usage by applications
      • Middleware engines need platform-portable access to native, low-level sensor data streams
      • Mobile or embedded platforms without sensor fusion APIs can provide direct application access to StreamInput
      • Hardware transport interfaces are defined by each system, e.g. IIO or HID sensor
      • Embedded processors or peripheral hardware implementing StreamInput provide a standard interface to other system processors
  27. Khronos APIs for Augmented Reality
      [Diagram: MEMS Sensors; Sensor Fusion; Camera Control API (advanced camera control and stream generation); Vision Processing; Application on CPUs, GPUs and DSPs; 3D Rendering and Video Composition on GPU; Audio Rendering]
      • EGLStream: stream data between APIs
      • Precision timestamps on all sensor samples
      • AR needs not just advanced sensor processing, vision acceleration, computation and rendering, but also for all these subsystems to work efficiently together
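One reason precision timestamps matter is that they let an application pair each camera frame with the motion sample taken closest to it. The sketch below is a generic Python illustration, not part of any Khronos API; `nearest_sample` and the microsecond timestamps are made up for the example.

```python
from bisect import bisect_left
from typing import List


def nearest_sample(timestamps: List[int], t: int) -> int:
    """Return the index of the timestamp closest to t.

    `timestamps` must be sorted ascending; ties go to the earlier sample.
    """
    if not timestamps:
        raise ValueError("no sensor samples available")
    i = bisect_left(timestamps, t)
    if i == 0:
        return 0
    if i == len(timestamps):
        return len(timestamps) - 1
    before, after = timestamps[i - 1], timestamps[i]
    return i if (after - t) < (t - before) else i - 1


# Gyroscope samples every 5 ms and two camera frames, all in microseconds.
gyro_timestamps_us = [0, 5000, 10000, 15000, 20000]
frame_timestamps_us = [4000, 16667]
matches = [nearest_sample(gyro_timestamps_us, t) for t in frame_timestamps_us]
```

This only works if both streams are stamped against the same clock, which is the kind of cross-subsystem requirement the slide is pointing at.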
  28. Summary
      • Khronos is building a family of interoperating APIs for portable and power-efficient vision processing
      • OpenVX 1.0 has been provisionally released and non-members are invited to provide feedback on the forums
      • Khronos camera and sensor fusion APIs are currently in design and complement and integrate with OpenVX
      • Any company is welcome to join Khronos to influence the direction of mobile and embedded vision processing!
        - $15K annual membership fee for access to all Khronos API working groups
        - Well-defined IP framework protects your IP and conformant implementations