OpenCV for Embedded: Lessons Learned

Copyright © 2015 Itseez
Yury Gorbachev
12-May-2015

• Open-source Computer Vision library (>2500 algos)
• De-facto standard in CV, BSD license
• Written in C++, C interface is now deprecated
• Supports multiple platforms (Linux, Windows, OS X, Android, iOS, QNX)
• Used by Google, NVIDIA, Microsoft, Intel, Stanford, etc.
• Funding/contributions from Willow Garage, NVIDIA, GSoC, AMD, Intel
• Maintained by Itseez
What is OpenCV

• OpenCV provides extensive means to create an entire application
• Camera interface (for example, V4L2 interface on Linux)
• Video Reading interface (using ffmpeg)
• UI primitives (windows, keyboard/mouse input, etc.)
• Decent performance out of the box
• Scalar performance is already good enough
• Some algorithms can run at ~100 FPS on average desktops
• Extra optimization is not required in most cases
• Good and pretty stable acceleration possibilities
• Intel® TBB is sufficient for multi-core
• AVX, IPP, OpenCL, CUDA
Desktops are good and fast

• Mostly ARM platforms
• Exotic execution environments
• C++ is not the default language (e.g. on Android)
• Different interfaces (Camera, UI, Log)
• Hard to troubleshoot
• Insufficient and unpredictable performance
• Mobile and Embedded are still behind Desktop
• Thermal protection, power saving and other tricky issues
• Zoo of acceleration possibilities
• SIMD, DSP, GPU offload, FPGA
• Multi-core systems, heterogeneous systems
Embedded changes a lot

• OpenCV structure (diagram):
• Platform agnostic modules: core, imgproc, calib3d, video, ml, objdetect, features2d, photo, …
• Platform dependent modules: gpu, highgui, androidcamera, python and java bindings
• Dependencies: JPEG, PNG, Jasper, multimedia, OpenNI
• Dependencies: CMake
• Accelerations: TBB/GCD/Concurrency, IPP, Eigen
• Algorithm modules are easy to migrate to a new environment
• C++ and CMake are the only requirements!
• OpenCV accuracy tests
• Easily verify correctness of OpenCV on a new platform
• Some vendors use them for regression tests during environment updates
OpenCV based algorithms are highly portable
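The accuracy-test idea can be illustrated without any OpenCV code: run a trusted reference and the ported build on the same input, then compare the outputs within a tolerance. The sketch below is a minimal, standard-library-only illustration; `maxAbsDiff` and `matchesReference` are hypothetical helper names, not part of OpenCV, and the real OpenCV test suite is far more extensive.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Largest per-element absolute difference between two equally sized buffers.
int maxAbsDiff(const std::vector<std::uint8_t>& a,
               const std::vector<std::uint8_t>& b) {
    assert(a.size() == b.size());
    int worst = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const int d = std::abs(static_cast<int>(a[i]) - static_cast<int>(b[i]));
        if (d > worst) worst = d;
    }
    return worst;
}

// True when the ported output stays within `tolerance` gray levels of the reference.
bool matchesReference(const std::vector<std::uint8_t>& reference,
                      const std::vector<std::uint8_t>& ported,
                      int tolerance) {
    return maxAbsDiff(reference, ported) <= tolerance;
}
```

A small non-zero tolerance is often needed because rounding can differ between a scalar build and a SIMD build of the same function.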

• Workflow (diagram): Prototyping (x86) → Porting → Profiling → Bottleneck optimization → Fine Tuning → Productization
• Diagram also shows: Regression Tests, Performance Tests
• Video input, more debug possibilities, simple UI, higher speed
• Focus on algorithm, not environment!
Use desktop for algorithm development

• HW performance is always an issue for vision systems
• Heavy image processing requires significant memory bandwidth
• Usual bottleneck; multiple cores do not help
• Collocation of multiple algorithms on a single system (e.g. ADAS)
• Mobile platforms are even more complicated
• Thermal protection, power saving are hard to control and influence
• Hard to predict when/if we are consuming too much
• Unstable FPS impacts algorithm complexity (e.g. object tracking)
• Hardware selection is not easy
• Very hard to predict final application performance beforehand
• No valid benchmarks to emulate computer vision patterns
Consider embedded performance issues
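A rough back-of-the-envelope estimate shows why memory bandwidth becomes the bottleneck. The sketch below assumes every processing pass reads the whole frame once and writes it once, and it ignores caches; the function name and the example figures (resolution, number of passes, frame rate) are illustrative assumptions, not numbers from the talk.

```cpp
#include <cassert>
#include <cstdint>

// Bytes moved per second when each pass reads the frame once and writes it once.
std::uint64_t bytesPerSecond(std::uint64_t width, std::uint64_t height,
                             std::uint64_t bytesPerPixel,
                             std::uint64_t passes, std::uint64_t fps) {
    assert(width > 0 && height > 0);
    const std::uint64_t frameBytes = width * height * bytesPerPixel;
    return frameBytes * 2 * passes * fps;  // 2 = one read + one write per pass
}
```

For a 1920x1080 grayscale frame with 5 passes at 30 FPS this gives 622,080,000 bytes/s (about 0.62 GB/s); the same pipeline on 4-byte RGBA moves about 2.49 GB/s. Adding cores does not lower either figure, since all cores share the same memory bus.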

• OpenCV was initially optimized for desktop where it works fast
• ARM optimizations are far behind
• Scalar code does not perform as well on ARM as on x86
• Optimization might help to some extent
[Chart: Number of optimized functions within OpenCV. Categories: SSE, IPP, NEON (OpenCV 3), NEON. Labels as extracted: 150, 100, 50, 5]
It is normal if it’s slow without optimizations
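To make the scalar vs. NEON distinction concrete, here is a minimal sketch of one function with both paths: a saturating add of two 8-bit buffers. The function name `addSaturate8u` is hypothetical and this is not OpenCV's implementation; the NEON branch uses the standard `arm_neon.h` intrinsics `vld1q_u8`, `vqaddq_u8` and `vst1q_u8`, and is compiled only when the target defines `__ARM_NEON`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

// dst[i] = min(255, a[i] + b[i]) for n bytes.
void addSaturate8u(const std::uint8_t* a, const std::uint8_t* b,
                   std::uint8_t* dst, std::size_t n) {
    assert(a != nullptr && b != nullptr && dst != nullptr);
    std::size_t i = 0;
#if defined(__ARM_NEON)
    // NEON path: 16 pixels per iteration with a saturating vector add.
    for (; i + 16 <= n; i += 16) {
        const uint8x16_t va = vld1q_u8(a + i);
        const uint8x16_t vb = vld1q_u8(b + i);
        vst1q_u8(dst + i, vqaddq_u8(va, vb));
    }
#endif
    // Scalar path: handles the tail on ARM and the whole buffer elsewhere.
    for (; i < n; ++i) {
        const unsigned sum = static_cast<unsigned>(a[i]) + b[i];
        dst[i] = static_cast<std::uint8_t>(sum > 255u ? 255u : sum);
    }
}
```

Keeping the scalar loop as the fallback means the same source builds on the x86 desktop used for prototyping and on the ARM target.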

• Optimize the algorithm first, and only then the hotspots
• Reduce search and track areas, use grayscale, reduce resolution
• Select proper HW if possible
• At least compare development kit performance
• Try ARMv8, it is better in scalar performance
• Use OpenCV packages from HW vendors (NVIDIA, TI)
• Vendor specific packages yield out of the box improvements on specific HW, very easy to try
• Not a cross-platform solution
• Optimize functions yourself
• NEON, DSP and other HW specific options
A few optimization hints
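The algorithm-level hints (use grayscale, reduce resolution, reduce the search area) all cut the number of bytes later stages must touch. The sketch below shows one plain C++ way to do each step; `GrayImage`, `toGray`, `downscale2x` and `cropRoi` are hypothetical names chosen for illustration, and the integer luma weights are a common approximation rather than anything specified in the talk.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct GrayImage {
    int width = 0;
    int height = 0;
    std::vector<std::uint8_t> pixels;  // row-major, one byte per pixel
};

// Integer luma approximation: (77 R + 150 G + 29 B) / 256.
std::uint8_t toGray(std::uint8_t r, std::uint8_t g, std::uint8_t b) {
    return static_cast<std::uint8_t>((77 * r + 150 * g + 29 * b) >> 8);
}

// Halve the resolution by averaging each 2x2 block (an odd last row/column is dropped).
GrayImage downscale2x(const GrayImage& src) {
    assert(src.pixels.size() == static_cast<std::size_t>(src.width) * src.height);
    GrayImage dst;
    dst.width = src.width / 2;
    dst.height = src.height / 2;
    dst.pixels.resize(static_cast<std::size_t>(dst.width) * dst.height);
    const std::size_t stride = static_cast<std::size_t>(src.width);
    for (int y = 0; y < dst.height; ++y) {
        for (int x = 0; x < dst.width; ++x) {
            const std::size_t i = static_cast<std::size_t>(2 * y) * stride + 2 * x;
            const int sum = src.pixels[i] + src.pixels[i + 1] +
                            src.pixels[i + stride] + src.pixels[i + stride + 1];
            dst.pixels[static_cast<std::size_t>(y) * dst.width + x] =
                static_cast<std::uint8_t>((sum + 2) / 4);  // rounded average
        }
    }
    return dst;
}

// Copy a rectangular region of interest so later stages process fewer pixels.
GrayImage cropRoi(const GrayImage& src, int x0, int y0, int w, int h) {
    assert(x0 >= 0 && y0 >= 0 && w >= 0 && h >= 0);
    assert(x0 + w <= src.width && y0 + h <= src.height);
    GrayImage dst;
    dst.width = w;
    dst.height = h;
    dst.pixels.reserve(static_cast<std::size_t>(w) * h);
    for (int y = 0; y < h; ++y) {
        const auto row = src.pixels.begin() +
                         static_cast<std::ptrdiff_t>(y0 + y) * src.width + x0;
        dst.pixels.insert(dst.pixels.end(), row, row + w);
    }
    return dst;
}
```

Going from RGBA to grayscale cuts the data by 4x and halving the resolution by another 4x, before any hotspot has been hand-optimized.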

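[Chart: Processing on ARM v7A, OpenCV vs. Itseez. Functions: Filter 2D, Adaptive Threshold, Blur, FAST. Data labels as extracted, units not stated: 18.9, 138, 163.6, 32.4; 2.3, 3.1, 3.1, 7.9]
• Note scalar difference ARM v7A vs. v8
[Chart: Processing on ARM v8, OpenCV vs. Itseez. Functions: Filter 2D, Adaptive Threshold, Blur, FAST. Data labels as extracted, units not stated: 30.8, 30.1, 27.1, 23.2; 2.5, 1.4, 0.6, 5]
Itseez achievements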

• Itseez ADAS solution
• Traffic Sign Recognition
• Front Collision Warning
• Lane Departure Warning
• Pedestrian Detection
• All algorithms are running real-time on off-the-shelf ARM device
• Designed and tested using OpenCV
• Product implements intelligent pipeline layer to reduce load
• Uses custom accelerated functions
Actual product example
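One plausible reading of a pipeline layer that reduces load is a per-frame cache: an expensive intermediate result is computed the first time any algorithm asks for it and is reused by the others. The sketch below is only an illustration of that idea under this assumption; `FrameCache` and its methods are hypothetical and say nothing about how the Itseez product is actually built.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <vector>

using Buffer = std::vector<std::uint8_t>;

// Per-frame store of intermediate results, keyed by a stage name.
class FrameCache {
public:
    // Returns the cached result for `key`, running `compute` on first request only.
    const Buffer& get(const std::string& key,
                      const std::function<Buffer()>& compute) {
        assert(static_cast<bool>(compute));
        auto it = results_.find(key);
        if (it == results_.end()) {
            it = results_.emplace(key, compute()).first;
            ++computeCount_;
        }
        return it->second;
    }

    // Drop all results when a new frame arrives.
    void nextFrame() { results_.clear(); }

    // Number of times a stage actually had to be computed.
    int computeCount() const { return computeCount_; }

private:
    std::map<std::string, Buffer> results_;
    int computeCount_ = 0;
};
```

With such a cache, a sign detector and a pedestrian detector could both request a "gray" entry for the current frame, and the conversion would run once.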

• Intelligent pipeline
• Shares computation results between algorithms
• Complicated processing is performed only once, used by all
• Multiple frame sizes used where appropriate
• Custom NEON optimizations
• Heavily optimized using only NEON, no GPU or DSP
• Multiple processing functions are joined to reduce memory access
• E.g. demosaicing with conversion to grayscale & RGBA
• Some interesting statistics
• Algorithm optimizations accelerate by a factor of 2-3x
• NEON accelerations give another 3-4x
Itseez ADAS - Some more details
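Joining functions to reduce memory access can be sketched as a single loop that reads each raw pixel once and writes both outputs. The example below assumes an RGGB Bayer layout and produces half-resolution output (one pixel per 2x2 cell) to stay short; the layout, the names `FusedOutput` and `demosaicGrayRgba`, and the luma weights are assumptions for illustration, not the method used in the product.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct FusedOutput {
    int width = 0;
    int height = 0;
    std::vector<std::uint8_t> rgba;  // 4 bytes per pixel
    std::vector<std::uint8_t> gray;  // 1 byte per pixel
};

// One pass over an RGGB Bayer frame: each 2x2 cell is read once and both
// outputs are written, instead of demosaicing first and re-reading RGBA for gray.
FusedOutput demosaicGrayRgba(const std::vector<std::uint8_t>& bayer,
                             int width, int height) {
    assert(width % 2 == 0 && height % 2 == 0);
    assert(bayer.size() == static_cast<std::size_t>(width) * height);
    FusedOutput out;
    out.width = width / 2;
    out.height = height / 2;
    const std::size_t count = static_cast<std::size_t>(out.width) * out.height;
    out.rgba.resize(count * 4);
    out.gray.resize(count);
    for (int y = 0; y < out.height; ++y) {
        const std::uint8_t* top = bayer.data() + static_cast<std::size_t>(2 * y) * width;
        const std::uint8_t* bottom = top + width;
        for (int x = 0; x < out.width; ++x) {
            const int r = top[2 * x];
            const int g = (top[2 * x + 1] + bottom[2 * x] + 1) / 2;  // mean of two greens
            const int b = bottom[2 * x + 1];
            const std::size_t i = static_cast<std::size_t>(y) * out.width + x;
            out.rgba[4 * i + 0] = static_cast<std::uint8_t>(r);
            out.rgba[4 * i + 1] = static_cast<std::uint8_t>(g);
            out.rgba[4 * i + 2] = static_cast<std::uint8_t>(b);
            out.rgba[4 * i + 3] = 255;  // opaque alpha
            out.gray[i] = static_cast<std::uint8_t>((77 * r + 150 * g + 29 * b) >> 8);
        }
    }
    return out;
}
```

Compared with running demosaicing and grayscale conversion as two separate functions, the fused loop avoids one full read of the RGBA image from memory.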

• OpenVX standard by Khronos
• Hardware accelerated vision – easier life for everyone
• Currently being implemented by a number of vendors
• OpenCV HAL (a part of OpenCV 3.x)
• Low level API beneath the standard OpenCV
• Open-source, but can potentially use proprietary components
• Generic multi-core scheduler (Planned feature)
• Make multi-core scheduler more intelligent on mobile architectures
• pthread-based backend in addition to existing options
• Vision benchmarks for hardware (Desired feature)
• Some performance tests are present in OpenCV already
• Not possible to use for benchmarking directly, some work is needed
• OpenCV Manager for Android could also contain benchmarking
What is missing? What is planned?
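The idea of a low-level layer beneath the standard API can be sketched as an optional hook with a portable fallback: the public function tries a platform-specific implementation first and falls back to generic code if the hook declines. All names below (`HalStatus`, `HalAdd8u`, `vendorAdd8u`, `addSaturate8u`) are hypothetical and do not reproduce the actual OpenCV HAL interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

enum HalStatus { HAL_OK = 0, HAL_NOT_IMPLEMENTED = 1 };

// Signature a platform back end may provide (hypothetical).
using HalAdd8u = HalStatus (*)(const std::uint8_t*, const std::uint8_t*,
                               std::uint8_t*, std::size_t);

// Portable fallback that works on every target.
void addSaturate8uGeneric(const std::uint8_t* a, const std::uint8_t* b,
                          std::uint8_t* dst, std::size_t n) {
    assert(a != nullptr && b != nullptr && dst != nullptr);
    for (std::size_t i = 0; i < n; ++i) {
        const unsigned sum = static_cast<unsigned>(a[i]) + b[i];
        dst[i] = static_cast<std::uint8_t>(sum > 255u ? 255u : sum);
    }
}

// Example vendor hook: accepts only lengths that are a multiple of 16 and
// declines everything else. The generic loop stands in for an accelerated kernel.
HalStatus vendorAdd8u(const std::uint8_t* a, const std::uint8_t* b,
                      std::uint8_t* dst, std::size_t n) {
    if (n % 16 != 0) return HAL_NOT_IMPLEMENTED;
    addSaturate8uGeneric(a, b, dst, n);
    return HAL_OK;
}

// Public entry point: try the hook, fall back to generic code.
// Returns true when the hook handled the call.
bool addSaturate8u(HalAdd8u hook, const std::uint8_t* a, const std::uint8_t* b,
                   std::uint8_t* dst, std::size_t n) {
    if (hook != nullptr && hook(a, b, dst, n) == HAL_OK) return true;
    addSaturate8uGeneric(a, b, dst, n);
    return false;
}
```

Because callers only see the public entry point, a vendor could ship a closed-source hook without any change to application code.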

• Itseez Web: www.itseez.com
• OpenCV home: www.opencv.org
• OpenCV documentation: docs.opencv.org
• GitHub: https://github.com/Itseez/opencv
• OpenCV resources on Embedded Vision Alliance (plenty of info):
http://www.embedded-vision.com/opencv-resources
• OpenCV on TI: http://www.ti.com/lit/wp/spry175/spry175.pdf
• OpenCV on NVIDIA: https://developer.nvidia.com/opencv
• E-mail me: yury.gorbachev@itseez.com
Resources

Q & A
