More Related Content Similar to "Challenges in Object Detection on Embedded Devices," a Presentation from CEVA (20) More from Edge AI and Vision Alliance (20) "Challenges in Object Detection on Embedded Devices," a Presentation from CEVA1. Copyright © 2014, CEVA Inc. 1
Adar Paz
May 29, 2014
Challenges in Object Detection on
Embedded Devices
2. Copyright © 2014, CEVA Inc. 2
CEVA by Numbers
> 220 licensees & 330 licensing agreements
>40%
Worldwide handset market
share in 2013 (Strategy
Analytics, December 2013)
5 Billion
CEVA-powered devices -
shipped worldwide to date
#1
in licensable computer vision
and imaging Processors
#1
DSP licensor dominant
market share (>3X of any
other DSP IP vendor)
#1
DSP architecture in handsets
– more than 900m in 2013
#1
DSP core in audio – more than
3 billion devices shipped to
date
3. Copyright © 2014, CEVA Inc. 3
Feature Detection Use in Computer Vision
Corner
Blob
Edge
4. Copyright © 2014, CEVA Inc. 4
Gesture
Control
Base Technologies
• Feature Extraction
& Description
• Feature Matching
and Tracking
Applications
Augmented
Reality
Object Detection
Recognition
Depth Map
Image Stitching
Face Detection and
Recognition
Motion Detection
Emotion Detection
AgeGender
Detection
Segmentation
Irregular Behavior
Detection
Forward Collision
Warning (FCW)
Lane Departure
Warning (LDW)
Pedestrian
Detection (PD)
Traffic Sign Rec.
(TSR)
Driver Fatigue
Warning
Surround View
Monitor
Always-On
Computer Vision Understanding
Content
5. Copyright © 2014, CEVA Inc. 5
Mobile
Gesture
Control
Applications
Augmented
Reality
Object Detection
Recognition
Depth Map
Image Stitching
Face Detection and
Recognition
Motion Detection
Emotion Detection
AgeGender
Detection
Segmentation
Irregular Behavior
Detection
Forward Collision
Warning (FCW)
Lane Departure
Warning (LDW)
Pedestrian
Detection (PD)
Traffic Sign Rec.
(TSR)
Driver Fatigue
Warning
Surround View
Monitor
Always-On
Base Technologies
• Feature Extraction
& Description
• Feature Matching
and Tracking
Computer Vision Understanding
Content
6. Copyright © 2014, CEVA Inc. 6
Wearables
Gesture
Control
Base Technologies
• Feature Extraction
& Description
• Feature Matching
and Tracking
Applications
Augmented
Reality
Object Detection
Recognition
Depth Map
Image Stitching
Face Detection and
Recognition
Motion Detection
Emotion Detection
AgeGender
Detection
Segmentation
Irregular Behavior
Detection
Forward Collision
Warning (FCW)
Lane Departure
Warning (LDW)
Pedestrian
Detection (PD)
Traffic Sign Rec.
(TSR)
Driver Fatigue
Warning
Surround View
Monitor
Always-On
Computer Vision Understanding
Content
7. Copyright © 2014, CEVA Inc. 7
Surveillance
Gesture
Control
Base Technologies
• Feature Extraction
& Description
• Feature Matching
and Tracking
Applications
Augmented
Reality
Object Detection
Recognition
Depth Map
Image Stitching
Face Detection and
Recognition
Motion Detection
Emotion Detection
Segmentation
Irregular Behavior
Detection
Forward Collision
Warning (FCW)
Lane Departure
Warning (LDW)
Pedestrian
Detection (PD)
Traffic Sign Rec.
(TSR)
Driver Fatigue
Warning
Surround View
Monitor
Always-On
AgeGender
Detection
Computer Vision Understanding
Content
8. Copyright © 2014, CEVA Inc. 8
Automotive
Gesture
Control
Base Technologies
• Feature Extraction
& Description
• Feature Matching
and Tracking
Applications
Augmented
Reality
Object Detection
Recognition
Depth Map
Image Stitching
Face Detection and
Recognition
Motion Detection
Emotion Detection
AgeGender
Detection
Segmentation
Irregular Behavior
Detection
Forward Collision
Warning (FCW)
Lane Departure
Warning (LDW)
Pedestrian
Detection (PD)
Traffic Sign Rec.
(TSR)
Driver Fatigue
Warning
Surround View
Monitor
Always-On
Computer Vision Understanding Content
9. Copyright © 2014, CEVA Inc. 9
RTL
design Production
Chip
ready
Distribution
Why Use a Programmable CV Processor?
Development
Time Line
RTL
design
CV
algorithm
design
Production
HWSolution
Programmable CV algorithm design
Continued algorithms
development
ProgrammableSolution
Chip
ready
Distribution
10. Copyright © 2014, CEVA Inc. 10
Dedicated CV Processor or Mobile GPU?
CEVA-MM3101Performancegain(resultpercycle)
0
5
10
15
20
25
30
MM3101 Vs. Mobile GPU A MM3101 Vs. Mobile GPU B MM3101 Vs. Mobile GPU C
sobel
corr3x3
corrsep5x5
corrsep11x11
Histogram
HSV2RGB
RGB2HSV
Max3x3
Min
Median3x3
CornerHarris
Average
Typical set of
representative
CV & imaging
algorithms
MM3101 shows average 13x performance boost over leading Mobile GPUs.
Power factor >50x more efficient
11. Copyright © 2014, CEVA Inc. 11
Typical Feature Treatment Flow
Sobel
Gaussian
Build image
pyramids
Integral sum
Gamma
normalization
HOG
BRIEF
SIFT
SURF
FREAK
LBP
HAAR
SVM
Decision
tree
Harris
FAST
SIFT
SURF
12. Copyright © 2014, CEVA Inc. 12
Flow Chart — HOG Descriptor
Input image
Scaled image
Scale 1
Scaled image
Gamma
Normalization
Gradient
Calculation
Descriptor
Calculation
Bilinear
Scaling
HOG algorithm is based on Dalal & Triggs paper (2005)
Common use is object detection, especially pedestrian detection
Reference Code – OpenCV 2.4.3
Scale 9
13. Copyright © 2014, CEVA Inc. 13
HOG — Bilinear Scaling
Bilinear Interpolation
X
X
X
Step 1: Vertical Interpolation
Step 2: Horizontal Interpolation
14. Copyright © 2014, CEVA Inc. 14
1. Load 2 X 8 pixels in a single cycle
2. 16 filter operations in a single cycle
3. Transposed Store (4x4) in single cycle
4. Perform the load and filter again (1,2)
5. Transposed Store in single cycle (3)
HOG — Bilinear Scaling — Implementation
Memory
vA
vB
vC
vD
Vector Registers
Memory
vA
vB
vC
vD
Vector Registers
Memory
vA
vB
vC
vD
Vector Registers
transpose
15. Copyright © 2014, CEVA Inc. 15
• Gamma Function: P
(γ)
• Implemented using ‘Look Up Table’ (LUT) – parallel access to local
memory in single cycle
• Loading of multiple gamma values in a single cycle
HOG — Gamma Normalization
16. Copyright © 2014, CEVA Inc. 16
• ORB — Oriented FAST and Rotated BRIEF
• An efficient alternative to SIFT
• Pyramid is used for scale-invariance
• Features are detected using FAST9 , Harris and non-max-suppress
• Descriptors are based on BRIEF with normalized orientation
ORB — Feature Extraction
Input
Image
Fast9 Harris
Non-Max-
Suppress
Oriented
BRIEF
Descriptors
list
Pyramid
17. Copyright © 2014, CEVA Inc. 17
ORB — FAST9 Implementation
• Continuous arc of 9 or more pixels:
• All much brighter then (p+Th)
or
• All much darker then (p-Th)
18. Copyright © 2014, CEVA Inc. 18
• Early exit is used to detect potential positions
• Long memory access of 32 bytes using
• quickly load consecutive pixels
• Vector compare is used to compare the center of the corner to
the borders
• Building a binary (bit) map with positions that need to be calculated
• Calculation of multiple positions in parallel
• Using different two dimensional loads
• Vector predicates are used selectively calculate only the locations
that pass the threshold
• Using multi-way parallel lookup table access to decide on
consecutive locations
ORB — FAST9 Implementation (2)
19. Copyright © 2014, CEVA Inc. 19
ORB — FAST9 Implementation (3)
SIMD Efficient, Random access to memory
Fast First Pass
Candidate
list
Input
Image
Fast First Pass
Candidate
list
Full FAST9 Second
Pass
Output
feature list
Input
Image
20. Copyright © 2014, CEVA Inc. 20
• Oriented brief uses the normalized orientation and calculates a 256 bit
wide descriptor
• The descriptor is calculated by comparison of pre-defined 256 pairs of
pixels in the surrounding of the feature center
• Each pair comparison donates a single bit in the
descriptor
• Orientation is normalized by rotating the image (or
pairs coordinates, in our implementation) according
to the moment of the feature center
BRIEF — Descriptor
21. Copyright © 2014, CEVA Inc. 21
BRIEF DSP Implementation
Calculate patch orientation (Patch
Moment)
Utilizing LUT capability read the pixels
address.
Load with random access to memory
the pixel couples
Use SIMD capability to efficiently
calculate descriptor
Orientation
22. Copyright © 2014, CEVA Inc. 22
• Inspired by the SIFT descriptor, much faster
• Main modules
• Detect features location according to pixels response to feature
• Calculate feature descriptor
SURF — Speeded Up Robust Features
Integral Sum
Image
Feature
Response
Find Local
Maximum
Choose Best
Features
Feature
Descriptor
23. Copyright © 2014, CEVA Inc. 23
+ +
T T
vector memory vector register vector memory vector register
• Two pass approach
• Horizontal and vertical data access (Full Bandwidth)
• Load and transpose data in a single instruction
SURF — Integral Sum Image
24. Copyright © 2014, CEVA Inc. 24
• Calculate feature response using Box Filter
• Small box: A-B+D-C
• Large box: E-F+H-G
• Total response: E-F+H-G - 3x(A-B+D-C)
• Can be executed in a single cycle
• Two operations in a single cycle
Res = (E-F)+(H-G)
Res += (A-B)*(-3)+(D-C)*(-3)
SURF — Feature Response
Flexible & Powerful Filter Instruction
-2
1
1
A B
C D
E F
G H
25. Copyright © 2014, CEVA Inc. 25
• Handles multiple features in parallel
• Memory access of several different features in a single instruction
• Calculates Integral sum
Vertical and horizontal memory access
• Calculates gradient using box filter
result = (A-B) + (D-C)
SURF — Feature Descriptor
Feature Data 0 Feature Data 1 ... Feature Data 7
parallel load
26. Copyright © 2014, CEVA Inc. 26
Feature Extraction Summary Table
Algorithm Memory Access Execution
2-Dimensional
Access
LUT Support Parallel
Memory Access
Dedicated
Instructions
Vectorized
Conditional Flow
HOG – Bilinear
HOG - Gamma
FAST9 - Detector
BRIEF - Descriptor
SURF – Int. Sum
SURF – Feature
Response
SURF - Descriptor
27. Copyright © 2014, CEVA Inc. 27
Conclusion —
Critical Ingredients for Efficient CV Processor
CEVA-MM3101 Imaging & Computer Vision IP platform:
Includes all above features (and more…)
CEVA-CV library already includes various feature extraction algos
Enables shorter development cycle, efficiency and algorithm flexibility
CV
Processor
Efficient filter
processing
Good
conditional code
execution
Fast & flexible
random access to
memory
Good bit and
byte data
manipulation
Wide memory
bandwidth
Large internal
memory
28. Copyright © 2014, CEVA Inc. 28
1. ORB: an efficient alternative to SIFT or SURF
Rublee, E. ; Rabaud, V. ; Konolige, K. ; Bradski, G. - Computer Vision (ICCV), 2011 IEEE
2. Histograms of oriented gradients for human detection
Dalal, N. ; Triggs, B. - Computer Vision and Pattern Recognition, 2005. CVPR 2005
3. SURF: Speeded Up Robust Features
Herbert Bay, Tinne Tuytelaars, Luc Van Gool - Computer Vision – ECCV 2006
4. BRIEF: Binary Robust Independent Elementary Features
Michael Calonder, Vincent Lepetit, Christoph Strecha, Pascal Fua - Computer Vision – ECCV 2010
5. Distinctive Image Features from Scale-Invariant Keypoints
David G. Lowe - International Journal of Computer Vision, Volume 60, Issue 2 , pp 91-110
Resource for further Investigation
𝒀𝒐𝒖 are all invited to the CEVA demo table
Thank You!