Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

OpenVX, Camera and StreamInput Overview


Published on

OpenVX enables performance and power optimized computer vision algorithms for use cases such as face, body and gesture tracking, smart video surveillance, automatic driver assistance systems, object and scene reconstruction, augmented reality, visual inspection, robotics and more. The provisional release of the specification enables developers and implementers to provide feedback before specification finalization, which is expected within six months.

OpenVX enables significant implementation innovation while maintaining a consistent API for developers. An OpenVX application expresses vision processing holistically as a graph of function nodes. An OpenVX implementer can optimize graph execution through a wide variety of techniques such as: acceleration of nodes on CPUs, GPUs, DSPs or dedicated hardware, compiler optimizations, node coalescing, and tiled execution to keep sections of processed images in local memories as they flow through the graph. Khronos has released a provisional tiled execution extension alongside the main OpenVX specification to enable user custom kernels to exploit this style of optimization. Additionally, Khronos has released the VXU™ utility library to enable developers using OpenVX to call individual nodes as standalone functions for easy code migration.

OpenVX can be used directly by applications or to accelerate higher-level middleware, such as the popular OpenCV open source vision library that is often used for application prototyping. OpenVX will have extensive conformance tests to complement a focused and tightly defined finalized specification for consistent and reliable operation across multiple vendors and platforms making OpenVX an ideal foundation for shipping production vision applications. Finally, as any Khronos specification, OpenVX is extensible to enable nodes to be deployed to meet customer needs, ahead of being integrated into the core specification.

Published in: Technology, Business
  • Be the first to comment

OpenVX, Camera and StreamInput Overview

  1. 1. APIs for Vision Acceleration, Camera Control and Sensor Fusion Neil Trevett Vice President NVIDIA, President Khronos © Copyright Khronos Group 2013 - Page 1
  2. 2. Khronos Connects Software to Silicon Open Consortium creating ROYALTY-FREE, OPEN STANDARD APIs for hardware acceleration Defining the roadmap for low-level silicon interfaces needed on every platform Graphics, video, audio, compute, vision, sensor and camera processing Rigorous specifications AND conformance tests for crossvendor portability Acceleration APIs BY the Industry FOR the Industry Well over a BILLION people use Khronos APIs Every Day… © Copyright Khronos Group 2013 - Page 2
  3. 3. Khronos Standards 3D Asset Handling - 3D authoring asset interchange - 3D asset transmission format with compression Visual Computing - 3D Graphics - Heterogeneous Parallel Computing Over 100 companies defining royalty-free APIs to connect software to silicon Camera Control API Provisional 1.0 spec released! Acceleration in HTML5 - 3D in browser – no Plug-in - Heterogeneous computing for JavaScript Sensor Processing - Vision Acceleration - Camera Control - Sensor Fusion © Copyright Khronos Group 2013 - Page 3
  4. 4. Sensors & Vision Driving Key Mobile Use Cases Computational Photography and Videography Natural UI with Face, Body and Gesture Tracking 3D Scene and Object Reconstruction Augmented Reality Time © Copyright Khronos Group 2013 - Page 4
  5. 5. Vision Pipeline Challenges and Opportunities Growing Camera Diversity Diverse Vision Processors Sensor Proliferation Capturing color, range and lightfields Driving for high performance and low power Diverse sensor awareness of the user and surroundings • Light / Proximity • 2 cameras • 3 microphones • Touch • Camera sensors >20MPix • Novel sensor configurations • Stereo pairs • Active Structured Light • Active TOF • Camera ISPs • Dedicated vision IP blocks • DSPs and DSP arrays • Programmable GPUs • Multi-core CPUs • Plenoptic Arrays Flexible sensor and camera control to generate required image stream Use best processing available for image stream processing – with code portability 19 • Position - GPS - WiFi (fingerprint) - Cellular trilateration - NFC/Bluetooth Beacons • Accelerometer • Magnetometer • Gyroscope • Pressure / Temp / Humidity Control/fuse vision data by/with all other sensor data on device Camera Control API © Copyright Khronos Group 2013 - Page 5
  6. 6. OpenVX – Power Efficient Vision Acceleration • Acceleration API for real-time vision - Focus on mobile and embedded systems • Enable diverse efficient implementations - From CPUs, through GPUs and DSPs to dedicated hardware • Foundational API for vision acceleration - Can be used by middleware libraries or by applications directly Application Other higher-level CV libraries OpenCV open source library • Complementary to OpenCV - Which is great for prototyping • Khronos open source sample implementation - To be released with final specification - Sample - not reference - spec remains the definitive definition of OpenVX operation Open source sample implementation Hardware vendor implementations © Copyright Khronos Group 2013 - Page 6
  7. 7. OpenVX and OpenCV are Complementary Governance Community driven open source with no formal specification Formal specification defined and implemented by hardware vendors Conformance No conformance tests for consistency and every vendor implements different subset Full conformance test suite / process creates a reliable acceleration platform Portability APIs can vary depending on processor Hardware abstracted for portability Scope Very wide 1000s of imaging and vision functions Multiple camera APIs/interfaces Tight focus on hardware accelerated functions for mobile vision Use external camera API Efficiency Memory-based architecture Each operation reads and writes memory Graph-based execution Optimizable computation, data transfer Use Case Rapid experimentation Production development & deployment © Copyright Khronos Group 2013 - Page 7
  8. 8. OpenVX Graphs – The Key to Efficiency • Vision processing directed graphs for power and performance efficiency - Each Node can be implemented in software or accelerated hardware - Nodes may be fused by the implementation to eliminate memory transfers - Processing can be tiled to keep data entirely in local memory/cache • EGLStreams can provide data and event interop with other Khronos APIs - BUT use of other Khronos APIs are not mandated • VXU Utility Library for access to single nodes - Easy way to start using OpenVX by calling each node independently Native Camera Control OpenVX Node OpenVX Node OpenVX Node Example OpenVX Graph OpenVX Node Heterogeneous Processing © Copyright Khronos Group 2013 - Page 8
  9. 9. OpenVX 1.0 Function Overview • Core data structures - Images and Image Pyramids - Processing Graphs, Kernels, Parameters • Image Processing - Arithmetic, Logical, and statistical operations - Multichannel Color and BitDepth Extraction and Conversion - 2D Filtering and Morphological operations - Image Resizing and Warping • Core Computer Vision - Pyramid computation - Integral Image computation • Feature Extraction and Tracking - Histogram Computation and Equalization - Canny Edge Detection - Harris and FAST Corner detection - Sparse Optical Flow Widely used extensions adopted into future versions of the core OpenVX Specification Evolution Open VX 1.0 defines Framework for creating, managing and executing graphs Focused set of widely used functions that are readily accelerated Implementers can add functions as extensions © Copyright Khronos Group 2013 - Page 9
  10. 10. Example Graph - Stereo Machine Vision OpenVX Graph Camera 1 Stereo Rectify with Remap Camera 2 Stereo Rectify with Remap Compute Depth Map (User Node) Detect and track objects (User Node) Image Pyramid Object coordinates Compute Optical Flow Delay Tiling extension enables user nodes (extensions) to also optimally run in local memory © Copyright Khronos Group 2013 - Page 10
  11. 11. OpenVX Participants and Timeline • Aiming for specification finalization by mid-2014 • Itseez is working group chair (the convener of OpenCV) • Qualcomm and TI are specification editors © Copyright Khronos Group 2013 - Page 11
  12. 12. OpenVX and OpenCL are Complementary Use Case General Heterogeneous programming Domain targeted - vision processing Architecture Language-based – needs online compilation Library-based - no online compiler required Target Hardware ‘Exposed’ architected memory model – can impact performance portability Abstracted node and memory model diverse implementations can be optimized for power and performance Precision Full IEEE floating point mandated Minimal floating point requirements – optimized for vision operators Ease of Use General-purpose math libraries with no built-in vision functions Fully implemented vision operators and framework ‘out of the box’ © Copyright Khronos Group 2013 - Page 12
  13. 13. Need for Camera Control API • We have APIs for imaging/vision in applications and pre- and post-ISP processing - E.g. OpenCL can be used to accelerate imaging on CPU, GPU, DSP etc. • BUT no open standard API for advanced control of ISP and camera subsystem - ISP is involved in camera control via 3A algorithms • Need sophisticated image stream generation for advanced imaging & vision apps - Advanced, high-frequency burst control of camera and sensor operation - Portable support more types of sensors – including depth sensors - Tighter system integration – e.g. synch of camera and MEMS sensors Scope of Camera Control API 3A - Auto Exposure (AE), Auto White Balance (AWB), Auto Focus (AF) Lens, sensor, aperture control Bayer Pre-processing Sensor, Color Filter Array Lens, Flash, Focus, Aperture Image Signal Processor (ISP) RGB/YUV Post-processing Image/Vision Applications © Copyright Khronos Group 2013 - Page 13
  14. 14. Advanced Camera Control Use Cases • High-dynamic range (HDR) and computational flash photography - High-speed burst with individual frame control over exposure and flash • Subject isolation and depth detection - High-speed burst with individual frame control over focus • Rolling shutter elimination - High-precision intra-frame synchronization between camera and motion sensor • Augmented Reality - 60Hz, low-latency capture with motion sensor synchronization - Multiple Region of Interest (ROI) capture - Synchronized stereo sensors for scene scaling - Detailed feedback on camera operation per frame • Time-of-flight or structured light depth camera processing - Aligned stacking of data from multiple sensors © Copyright Khronos Group 2013 - Page 14
  15. 15. Khronos Camera API Requirements • Application control over ISP processing (including 3A) - Including multiple, re-entrant ISPs • Control multiple sensors with synch and alignment - E.g. Stereo pairs, Plenoptic arrays, TOF or structured light depth cameras • Enhanced per frame detailed control - Format flexibility, Region of Interest (ROI) selection • Global timing & synchronization - E.g. Between cameras and MEMS sensors • Flexible processing/streaming - Multiple input and output streams with RAW, Bayer or YUV Processing - Streaming of rows (not just frames) Enable new camera functionality not available on current platforms and align with future platform directions for easy adoption © Copyright Khronos Group 2013 - Page 15
  16. 16. Camera API Design Milestones and Philosophy • C-language API starting from proven designs - e.g. FCAM • Design alignment with widely used hardware standards - e.g. MIPI CSI • Focus on mobile, power-limited devices - But do not preclude other use cases such as automotive, surveillance, DSLR… • Minimize overlap and maximize interoperability with other Khronos APIs - But other Khronos APIs are not mandated • Support vendor-specific extensions Group charter approved Apr13 Working group proposed First draft specification 4Q13 Jul13 Architectural Design 1Q14 Specification ratification 2Q14 Sample implementation and tests 3Q14 © Copyright Khronos Group 2013 - Page 16
  17. 17. Camera API Architecture will be FCAM-based • No global state - State travels with image requests - Every stage in the pipeline may have different state - Enables fast, deterministic state changes • Synchronize devices - Lens, flash, sound capture, gyro… - Devices can schedule Actions - E.g. to be triggered on exposure change © Copyright Khronos Group 2013 - Page 17
  18. 18. Potential Adoption on Android • Android Exposes Java camera APIs to developers - Controls underlying Camera HAL • Camera HAL v1 API simplified basic point and shoot apps - Difficult or impossible to do much else • Camera HAL v3 API is a fundamentally different API - Streams-based to enable more sophisticated camera applications Khronos Camera API builds on FCAM with a goal of being forward compatible with Android architecture Camera API Open source project developed by Nokia and Stanford HAL V3 adopts many FCAM ideas and can use EGL in its implementation Khronos Camera API may be used to IMPLEMENT Android Camera HAL – and provide an advanced native camera API in NDK © Copyright Khronos Group 2013 - Page 18
  19. 19. Sensor Industry Fragmentation … © Copyright Khronos Group 2013 - Page 19
  20. 20. Portable Access to Sensor Fusion Apps request semantic sensor information StreamInput defines possible requests, e.g. “Provide Skeleton Position” “Am I in an elevator?” Advanced Sensors Everywhere Apps Need Sophisticated Access to Sensor Data RGB and depth cameras, multi-axis motion/position, touch and gestures, microphones, wireless controllers, haptics keyboards, mice, track pads Without coding to specific sensor hardware Processing graph provides sensor data stream Utilizes optimized, smart, sensor middleware Apps can gain ‘magical’ situational awareness © Copyright Khronos Group 2013 - Page 20
  21. 21. StreamInput Concepts • Application-defined filtering and conversion - Set up a graph of nodes to generate required semantics - Standardized node intercommunication - Filter nodes modify data from inputs - Can create virtual input devices • Node types enable specific data abstractions - Unicode text node for keyboards, speech recognition, stylus, etc. - 3D skeleton node for depth-sensing camera or a motion-tracking suit … • Extensibility to any sensor type - Can define new node data types, state and methods Filter Node App Filter Node Filter Node Input Device Input Device Input Device • Sensor Synchronization - Universal time stamp on every sample • Provision of device meta-data - Including pictorial representations Standardized Node Intercommunication Universal Timestamps © Copyright Khronos Group 2013 - Page 21
  22. 22. StreamInput Adoption Flexibility • Defines access to high-quality fused sensor stream and context changes - Implementers can optimize and innovate generation of the sensor stream Applications Platforms can provide increased access to improved sensor data stream – driving faster, deeper sensor usage by applications StreamInput implementations compete on sensor stream quality, reduced power consumption, environment triggering and context detection – enabling sensor subsystem vendors to increased ADDED VALUE OS Sensor OS APIs Middleware (E.g. Android SensorManager or iOS CoreMotion) Middleware engines need platformportable access to native, low-level sensor data stream (E.g. Augmented Reality engines, gaming engines) Low-level native API defines access to fused sensor data stream and context-awareness Sensor Sensor … Mobile or embedded platforms without sensor fusion APIs can provide direct application access to StreamInput Sensor Sensor Hub Hub © Copyright Khronos Group 2013 - Page 22
  23. 23. Khronos APIs for Augmented Reality AR needs not just advanced sensor processing, vision acceleration, computation and rendering - but also for all these subsystems to work efficiently together MEMS Sensors Sensor Fusion Application on CPUs, GPUs and DSPs Vision Processing Precision timestamps on all sensor samples Advanced Camera Control and stream generation Audio Rendering EGLStream stream data between APIs 3D Rendering and Video Composition On GPU Camera Control API © Copyright Khronos Group 2013 - Page 23
  24. 24. Summary • Khronos is building a family of interoperating APIs for portable and power-efficient vision processing • OpenVX 1.0 has been provisionally released and non-members are invited to provide feedback on the forums - • Khronos Camera and sensor fusion APIs are currently in design and complement and integrate with OpenVX • Any company is welcome to join Khronos to influence the direction of mobile and embedded vision processing! - $10K annual membership fee for access to all Khronos API working groups - Well-defined IP framework protects your IP and conformant implementations • - © Copyright Khronos Group 2013 - Page 24