Re-Vision stack presentation
Peter Hobden MSc
Lincoln University
Introduction: Responsive and Reconfigurable Vision Systems
Why FPGAs
More responsive than typical SoCs and embedded GPUs:
• 6X better images/sec/Watt in machine learning
• 42X higher frames/sec/Watt for computer vision processing
• 1/5th the latency
Zynq UltraScale+ MPSoC ZCU102/ZCU104
Components
• xfOpenCV
• Vivado version base TRD
• SDSoC (C-like language)
• PetaLinux
OpenCV library functions are essential to developing many computer vision applications. Xilinx’s xFAST library for computer
vision, based on key OpenCV functions, lets you compose and accelerate computer vision functions in the FPGA
fabric through the SDx or HLx environments.
In addition, xFAST functions are consistent with OpenCV and are optimized for performance, resource utilization and ease of
use.
• Thousands of functions in the OpenCV 3.1 library are available for Cortex-A9 and Cortex-A53
• OpenCV functions (including the OpenVX subset) are available as a library of optimized functions for Xilinx SoCs
• Complete library user guide with device utilization and performance
xfOpenCV
xfOpenCV functions
Re-Vision stack - Implementations
• The reVISION stack includes four initial design templates (with more to come, hopefully) that are
intended to get you up and running in a very short time. These design examples aim to
help you see the distinct advantage Xilinx All Programmable SoCs have in high-performance
embedded vision applications. The following is a brief description of these four design examples.
• LK Dense Optical Flow @ 4K60: Real-time dense implementation of optical flow, detecting
object motion for every single pixel. This example uses a non-iterative, non-pyramidal
implementation on 4K input at 60 FPS coming from a Sony IMX274 sensor via the MIPI interface.
• Stereo Vision: Real-time stereo disparity map calculation including remap, rectification and local
block matching. It can process dual 1080p30 stereo camera input via USB3.
• Combined dense optical flow and stereo vision.
• Future: Combines the three major, complex algorithms commonly used in vision-guided systems
today, namely a Convolutional Neural Network (CNN) for object detection or scene segmentation,
Dense Optical Flow for motion tracking and Stereo Vision for depth perception, running on a
single Zynq UltraScale+ MPSoC device.
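The "local block matching" step of the Stereo Vision example can be illustrated in software. The following is a minimal pure-Python sketch, not the reVISION hardware implementation: it assumes already rectified grayscale images stored as lists of rows, uses a sum of absolute differences (SAD) cost, and the names `sad` and `block_match` are hypothetical.

```python
def sad(left, right, row, col_l, col_r, half):
    """Sum of absolute differences between two square windows."""
    total = 0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            total += abs(left[row + dy][col_l + dx] - right[row + dy][col_r + dx])
    return total


def block_match(left, right, max_disp, half=1):
    """Disparity map for rectified grayscale images given as lists of rows.

    Border pixels where the window does not fit are left at 0.
    """
    height, width = len(left), len(left[0])
    disparity = [[0] * width for _ in range(height)]
    for row in range(half, height - half):
        for col in range(half, width - half):
            best_cost, best_d = None, 0
            # A point at column `col` in the left image is searched for
            # at column `col - d` in the right image.
            for d in range(max_disp + 1):
                if col - d - half < 0:
                    break  # window would fall off the left edge
                cost = sad(left, right, row, col, col - d, half)
                if best_cost is None or cost < best_cost:
                    best_cost, best_d = cost, d
            disparity[row][col] = best_d
    return disparity
```

Every output pixel is computed independently of the others, which is the property that makes this kind of algorithm a good fit for parallel implementation in FPGA fabric.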
Optical flow
https://www.youtube.com/watch?v=4vR0-Icx2lo
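As a software illustration of what "non-iterative, non-pyramidal" Lucas-Kanade means, here is a minimal pure-Python sketch. It is an assumption-laden teaching example, not the accelerated implementation: frames are lists of rows of floats, gradients use central differences, and the function names `lk_flow_at` and `dense_flow` are hypothetical. For each pixel it solves one 2x2 least-squares system over a small window, exactly once, at a single image scale.

```python
def lk_flow_at(prev, curr, row, col, half=2, eps=1e-9):
    """Single-step Lucas-Kanade flow (u, v) at one pixel.

    Returns (0.0, 0.0) where the window has too little texture
    (the 2x2 system is singular).
    """
    sxx = sxy = syy = sxt = syt = 0.0
    for y in range(row - half, row + half + 1):
        for x in range(col - half, col + half + 1):
            ix = (prev[y][x + 1] - prev[y][x - 1]) / 2.0  # spatial gradient in x
            iy = (prev[y + 1][x] - prev[y - 1][x]) / 2.0  # spatial gradient in y
            it = curr[y][x] - prev[y][x]                  # temporal gradient
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
            sxt += ix * it
            syt += iy * it
    det = sxx * syy - sxy * sxy
    if abs(det) < eps:
        return 0.0, 0.0
    # Solve [[sxx, sxy], [sxy, syy]] @ [u, v] = [-sxt, -syt] in closed form.
    u = (-syy * sxt + sxy * syt) / det
    v = (sxy * sxt - sxx * syt) / det
    return u, v


def dense_flow(prev, curr, half=2):
    """Flow vector for every pixel whose window fits inside the frame."""
    height, width = len(prev), len(prev[0])
    margin = half + 1  # window plus one pixel for the central differences
    flow = [[(0.0, 0.0)] * width for _ in range(height)]
    for row in range(margin, height - margin):
        for col in range(margin, width - margin):
            flow[row][col] = lk_flow_at(prev, curr, row, col, half)
    return flow
```

Because there is no iteration and no image pyramid, the work per pixel is fixed, which suits a streaming pipeline; the trade-off is that a single step of this kind only estimates small motions well.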
Supported devices
• MIPI CSI-2 sensor (Xilinx): The MIPI CSI-2 Receiver Subsystem and MIPI CSI-2 Transmitter Subsystem implement the
Mobile Industry Processor Interface (MIPI) based Camera Serial Interface (CSI-2) according to version 1.1 on Xilinx's
UltraScale+™ devices, allowing users to capture raw images from MIPI CSI-2 sensors.
• logiSLVS_RX Camera Sub-LVDS Receiver, sensor (Xylon): IP core for interfacing ultra-high-resolution Sony CMOS
image sensors to image signal processing pipelines and application processors implemented in Xilinx All
Programmable devices.
• HDMI In/Out (Xilinx): HDMI TX and RX subsystems, designed in compliance with version 2.0 of the HDMI
specification from the HDMI Forum.
• DisplayPort In/Out (Xilinx): DisplayPort LogiCORE™ and DisplayPort TX and RX subsystems help users implement
the DisplayPort video interface as defined by the VESA DisplayPort v1.2 specification.
• UHD-SDI (up to 12G) In/Out (Xilinx): UHD Serial Digital Interface (UHD-SDI) is used to transport uncompressed
digital video streams at up to 4K resolution over coax cable. The LogiCORE™ IP UHD-SDI interface provides
receiver and transmitter interfaces for the SMPTE SD-SDI, HD-SDI, 3G-SDI, 6G-SDI and 12G-SDI standards.
• GigE Vision In/Out
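To see why 4K at 60 FPS needs the 12G variant of SDI, a rough calculation helps. The sketch below is an approximation under stated assumptions: it counts active pixels only (blanking is ignored), assumes 10-bit 4:2:2 video (20 bits per pixel), and uses the nominal SDI link rates. The function names are hypothetical.

```python
# Nominal SDI link rates in bits per second.
SDI_LINK_RATES = [
    ("SD-SDI", 270e6),
    ("HD-SDI", 1.485e9),
    ("3G-SDI", 2.97e9),
    ("6G-SDI", 5.94e9),
    ("12G-SDI", 11.88e9),
]


def active_video_bitrate(width, height, fps, bits_per_pixel=20):
    """Bit rate of the active picture only (assumes 10-bit 4:2:2 by default)."""
    return width * height * fps * bits_per_pixel


def smallest_sdi_link(width, height, fps, bits_per_pixel=20):
    """Name of the slowest SDI link whose nominal rate covers the payload."""
    payload = active_video_bitrate(width, height, fps, bits_per_pixel)
    for name, rate in SDI_LINK_RATES:
        if payload <= rate:
            return name
    return None  # does not fit on a single link


if __name__ == "__main__":
    for fmt in [(1920, 1080, 30), (1920, 1080, 60), (3840, 2160, 60)]:
        print(fmt, active_video_bitrate(*fmt) / 1e9, "Gb/s", smallest_sdi_link(*fmt))
```

For 3840x2160 at 60 FPS the active payload is about 9.95 Gb/s, which only fits within the 11.88 Gb/s nominal rate of 12G-SDI.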
Creating a custom platform in Vivado
FPGA logic for capturing and displaying video
MPSoC Base TRD – Block Diagram
VHDL / Verilog code
IP Blocks
Simulation
MATLAB – Model Composer
SDSoC Integration (SDx)
SDSoC Environment Overview
Combines HLS and SoC together!
• Familiar embedded C/C++/OpenCL application development experience
• The SDSoC™ development environment provides a familiar embedded
C/C++/OpenCL application development experience, including an easy-to-use
Eclipse IDE and a comprehensive design environment for
heterogeneous Zynq® All Programmable SoC and MPSoC deployment.
• Complete with the industry's first C/C++/OpenCL full-system optimising
compiler, SDSoC delivers system-level profiling, automated software
acceleration in programmable logic, automated system connectivity
generation, and libraries to speed programming.
• It also enables end-user and third-party platform developers to define,
integrate, and verify system-level solutions and to provide their end customers
with a customized programming environment.
• Cross-compile for Cortex-A53
Parallel processing on Hardware
Hardware / Not ARM processor
Linux – Add OpenCV libraries
OpenCV references
PetaLinux
Linux processors
Neural Networks – Deep Learning
• The idea is to integrate the reVISION stack with a deep learning
engine
• Caffe
• TensorFlow
DNN/CNN
Resource conflict issues
Deep learning requires fast access to memory
But so does the video!
Additional PL connections are required
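The conflict can be made concrete with a back-of-the-envelope memory budget. The sketch below is illustrative only: the 2 bytes per pixel, the 19.2 GB/s peak figure (a 64-bit interface at 2400 MT/s), and the 70% efficiency factor are placeholder assumptions, not measured values for any board, and the function names are hypothetical.

```python
def stream_bandwidth(width, height, fps, bytes_per_pixel):
    """Bytes per second moved through memory by one video stream."""
    return width * height * fps * bytes_per_pixel


def remaining_bandwidth(ddr_peak, efficiency, streams):
    """Bandwidth left for other clients (e.g. a DNN engine), in bytes/s."""
    return ddr_peak * efficiency - sum(streams)


if __name__ == "__main__":
    # Assumption: 4K60 at 2 bytes per pixel, written once by capture
    # and read once by display.
    capture = stream_bandwidth(3840, 2160, 60, 2)
    display = stream_bandwidth(3840, 2160, 60, 2)

    ddr_peak = 8 * 2400e6  # placeholder: 8 bytes wide x 2400 MT/s
    efficiency = 0.7       # placeholder: achievable fraction of peak

    left_over = remaining_bandwidth(ddr_peak, efficiency, [capture, display])
    print("video traffic: %.2f GB/s" % ((capture + display) / 1e9))
    print("left for deep learning: %.2f GB/s" % (left_over / 1e9))
```

Each extra pass over the frame (scaling, optical flow, stereo) adds another read and write stream to the list, so the share left for the deep learning engine shrinks quickly.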
