"Challenges in Object Detection on Embedded Devices," a Presentation from CEVA

For the full video of this presentation, please visit:
http://www.embedded-vision.com/platinum-members/ceva/embedded-vision-training/videos/pages/may-2014-embedded-vision-summit

For more information about embedded vision, please visit:
http://www.embedded-vision.com

Adar Paz, Imaging and Computer Vision Team Leader at CEVA, presents the "Challenges in Object Detection on Embedded Devices" tutorial at the May 2014 Embedded Vision Summit.

As more products ship with integrated cameras, there is an increased potential for computer vision (CV) to enable innovation. For instance, CV can tackle the "scene understanding" problem by first figuring out what the various objects in the scene are. Such "object detection" capability holds big promise for embedded devices in mobile, automotive, and surveillance markets. However, performing real-time object detection while meeting a strict power budget remains a challenge on existing processors.

In this session, Paz analyzes the trade-offs of various object detection, feature extraction, and feature matching algorithms and their suitability for embedded vision processing, and recommends methods for efficient implementation in a power- and budget-constrained embedded device.


  1. Challenges in Object Detection on Embedded Devices. Adar Paz, May 29, 2014. Copyright © 2014, CEVA Inc.
  2. CEVA by Numbers
     • >220 licensees and 330 licensing agreements
     • >40% worldwide handset market share in 2013 (Strategy Analytics, December 2013)
     • 5 billion CEVA-powered devices shipped worldwide to date
     • #1 in licensable computer vision and imaging processors
     • #1 DSP licensor, with dominant market share (>3X of any other DSP IP vendor)
     • #1 DSP architecture in handsets: more than 900m in 2013
     • #1 DSP core in audio: more than 3 billion devices shipped to date
  3. Feature Detection Use in Computer Vision: Corner, Blob, Edge
  4. Computer Vision → Understanding Content
     • Base technologies: Feature Extraction & Description; Feature Matching and Tracking
     • Applications: Gesture Control, Augmented Reality, Object Detection Recognition, Depth Map, Image Stitching, Face Detection and Recognition, Motion Detection, Emotion Detection, Age/Gender Detection, Segmentation, Irregular Behavior Detection, Forward Collision Warning (FCW), Lane Departure Warning (LDW), Pedestrian Detection (PD), Traffic Sign Recognition (TSR), Driver Fatigue Warning, Surround View Monitor, Always-On
  5. Computer Vision → Understanding Content: Mobile (base technologies and applications as listed on slide 4)
  6. Computer Vision → Understanding Content: Wearables (base technologies and applications as listed on slide 4)
  7. Computer Vision → Understanding Content: Surveillance (base technologies and applications as listed on slide 4)
  8. Computer Vision → Understanding Content: Automotive (base technologies and applications as listed on slide 4)
  9. Why Use a Programmable CV Processor? [Figure: development timelines comparing a hardware solution and a programmable solution. Stages shown: CV algorithm design, RTL design, chip ready, production, distribution; the programmable solution adds continued algorithm development.]
  10. Dedicated CV Processor or Mobile GPU? [Chart: CEVA-MM3101 performance gain (result per cycle) vs. Mobile GPU A, Mobile GPU B, and Mobile GPU C over a typical set of representative CV and imaging algorithms: sobel, corr3x3, corrsep5x5, corrsep11x11, Histogram, HSV2RGB, RGB2HSV, Max3x3, Min, Median3x3, CornerHarris, Average.] MM3101 shows an average 13x performance boost over leading mobile GPUs. Power factor: >50x more efficient.
  11. Typical Feature Treatment Flow • Sobel • Gaussian • Build image pyramids • Integral sum • Gamma normalization • HOG • BRIEF • SIFT • SURF • FREAK • LBP • HAAR • SVM • Decision tree • Harris • FAST • SIFT • SURF
  12. Flow Chart: HOG Descriptor
     • Flow: Input image → Bilinear Scaling → Scaled image (Scale 1 through Scale 9) → Gamma Normalization → Gradient Calculation → Descriptor Calculation
     • The HOG algorithm is based on the Dalal & Triggs paper (2005)
     • Common use is object detection, especially pedestrian detection
     • Reference code: OpenCV 2.4.3
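The gradient and descriptor stages in the flow above can be illustrated for a single cell. This is a minimal sketch, not the CEVA or OpenCV implementation: the function name `cell_histogram`, the centered `[-1, 0, 1]` gradient, the 9 unsigned orientation bins, and the omission of vote interpolation and block normalization are all simplifying assumptions.

```python
import math

def cell_histogram(cell, bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations.

    `cell` is a list of rows of pixel intensities. Border pixels are
    skipped because the centered difference needs both neighbours.
    """
    h, w = len(cell), len(cell[0])
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]  # horizontal gradient
            gy = cell[y + 1][x] - cell[y - 1][x]  # vertical gradient
            mag = math.hypot(gx, gy)
            # Unsigned orientation folded into [0, 180) degrees.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[min(int(angle // bin_width), bins - 1)] += mag
    return hist
```

For a cell whose intensity rises only from left to right, every vote lands in the first bin; a full descriptor would concatenate and normalize many such histograms.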
  13. HOG: Bilinear Scaling. Bilinear interpolation is done in two steps. Step 1: Vertical Interpolation. Step 2: Horizontal Interpolation.
  14. HOG: Bilinear Scaling: Implementation
       1. Load 2 x 8 pixels in a single cycle
       2. 16 filter operations in a single cycle
       3. Transposed store (4x4) in a single cycle
       4. Perform the load and filter again (1, 2)
       5. Transposed store in a single cycle (3)
     [Figure: data moving between memory and vector registers vA, vB, vC, vD, with a transpose on store]
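The filter, transposed store, filter, transposed store sequence above can be sketched in scalar Python. This is only an illustration of the separable structure, not the vectorized DSP code: the function names and the corner-aligned sample mapping are assumptions, and OpenCV's resize uses a different pixel-center mapping.

```python
def lerp_rows(img, out_h):
    """Vertical interpolation: resample the rows of img to out_h rows."""
    in_h = len(img)
    out = []
    for j in range(out_h):
        # Corner-aligned mapping of output row j onto the input rows.
        pos = j * (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(pos)
        y1 = min(y0 + 1, in_h - 1)
        t = pos - y0
        out.append([(1 - t) * a + t * b for a, b in zip(img[y0], img[y1])])
    return out

def transpose(img):
    """Swap rows and columns (stands in for the transposed store)."""
    return [list(col) for col in zip(*img)]

def bilinear_scale(img, out_h, out_w):
    tmp = lerp_rows(img, out_h)   # step 1: vertical interpolation
    tmp = transpose(tmp)          # transposed store
    tmp = lerp_rows(tmp, out_w)   # step 2: horizontal pass, reusing the same filter
    return transpose(tmp)         # transposed store restores the layout
```

Because the second pass runs on transposed data, the same row-wise filter serves both directions, which is the point of the transposed store in the steps above.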
  15. HOG: Gamma Normalization
     • Gamma function: P^γ
     • Implemented using a look-up table (LUT): parallel access to local memory in a single cycle
     • Loading of multiple gamma values in a single cycle
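A table-driven gamma step can be sketched as follows. This is a minimal illustration assuming 8-bit pixels and a power-law curve scaled back to the 0..255 range; the names `build_gamma_lut` and `apply_lut` are hypothetical, and the parallel LUT access described on the slide is replaced here by ordinary indexing.

```python
def build_gamma_lut(gamma, max_val=255):
    """Precompute round(max_val * (p / max_val) ** gamma) for every pixel value."""
    return [round(max_val * (p / max_val) ** gamma) for p in range(max_val + 1)]

def apply_lut(img, lut):
    """Replace each pixel by its table entry; no power function at run time."""
    return [[lut[p] for p in row] for row in img]
```

The table is built once per gamma value, so the per-pixel cost is a single lookup, which is what makes parallel LUT hardware attractive for this stage.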
  16. ORB: Feature Extraction
     • ORB: Oriented FAST and Rotated BRIEF
     • An efficient alternative to SIFT
     • A pyramid is used for scale invariance
     • Features are detected using FAST9, Harris, and non-max suppression
     • Descriptors are based on BRIEF with normalized orientation
     • Flow: Input Image → Pyramid → FAST9 → Harris → Non-Max-Suppress → Oriented BRIEF → Descriptors list
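The non-max suppression stage in the flow above can be sketched on a grid of corner scores. This is an assumption-laden illustration: the 3x3 neighbourhood, the strict comparison, and the name `non_max_suppress` are choices made here, and the score grid stands in for whatever Harris response the real pipeline produces.

```python
def non_max_suppress(score, min_score=0):
    """Keep (x, y, score) where the score strictly exceeds all 8 neighbours."""
    h, w = len(score), len(score[0])
    keep = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            s = score[y][x]
            if s <= min_score:
                continue
            neighbours = [
                score[y + dy][x + dx]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
                if (dx, dy) != (0, 0)
            ]
            if all(s > n for n in neighbours):
                keep.append((x, y, s))
    return keep
```

A pipeline like the one described would typically sort the surviving points by score and keep the strongest ones before computing descriptors.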
  17. ORB: FAST9 Implementation
     • A pixel p is a corner if a continuous arc of 9 or more pixels on the surrounding circle is:
     • all much brighter than (p + Th), or
     • all much darker than (p - Th)
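The arc criterion above can be written as a scalar check on the circle pixels. This is a minimal sketch assuming the usual 16-pixel circle passed in as a list in circular order; the name `has_arc` is hypothetical, and the vectorized approach on the following slides is not reproduced.

```python
def has_arc(ring, center, threshold, arc_len=9):
    """True if arc_len contiguous ring pixels are all brighter than
    center + threshold or all darker than center - threshold."""
    n = len(ring)
    for sign in (1, -1):  # 1 tests for a brighter arc, -1 for a darker arc
        run = 0
        # Walk the ring twice so arcs that wrap past index 0 are counted.
        for i in range(2 * n):
            if sign * (ring[i % n] - center) > threshold:
                run += 1
                if run >= arc_len:
                    return True
            else:
                run = 0
    return False
```

For example, nine bright pixels followed by seven pixels equal to the center pass the test, while eight bright pixels do not.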
  18. ORB: FAST9 Implementation (2)
     • Early exit is used to detect potential positions
     • A long memory access of 32 bytes is used to quickly load consecutive pixels
     • Vector compare is used to compare the center of the corner to the borders
     • A binary (bit) map is built with the positions that need to be calculated
     • Multiple positions are calculated in parallel
     • Different two-dimensional loads are used
     • Vector predicates are used to selectively calculate only the locations that pass the threshold
     • Multi-way parallel lookup table access is used to decide on consecutive locations
  19. ORB: FAST9 Implementation (3). SIMD efficient, random access to memory. Flow: Input Image → Fast First Pass → Candidate list → Full FAST9 Second Pass → Output feature list
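The two-pass structure above can be sketched in plain Python. The first-pass test used here is an assumption rather than CEVA's actual early exit: it checks only the four compass pixels of the circle, which is a safe filter because any 9-pixel arc on a 16-pixel circle covers at least two of them. The names `CIRCLE`, `quick_pass`, and `full_pass` are hypothetical.

```python
# (dx, dy) offsets of the 16-pixel circle of radius 3, in circular order.
CIRCLE = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
          (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3)]

def quick_pass(img, th):
    """First pass: cheap necessary condition on the 4 compass pixels."""
    h, w = len(img), len(img[0])
    candidates = []
    for y in range(3, h - 3):
        for x in range(3, w - 3):
            c = img[y][x]
            compass = [img[y + dy][x + dx] for dx, dy in CIRCLE[::4]]
            brighter = sum(p > c + th for p in compass)
            darker = sum(p < c - th for p in compass)
            if brighter >= 2 or darker >= 2:
                candidates.append((x, y))
    return candidates

def full_pass(img, candidates, th, arc_len=9):
    """Second pass: full contiguous-arc test, run on candidates only."""
    corners = []
    for x, y in candidates:
        c = img[y][x]
        ring = [img[y + dy][x + dx] for dx, dy in CIRCLE]
        for sign in (1, -1):
            flags = [sign * (p - c) > th for p in ring]
            doubled = flags + flags  # lets an arc wrap past index 0
            if any(all(doubled[i:i + arc_len]) for i in range(16)):
                corners.append((x, y))
                break
    return corners
```

The first pass touches 4 pixels per position and produces a short candidate list; only those positions pay for the 16-pixel test, which mirrors the candidate-list flow on the slide.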
  20. BRIEF: Descriptor
     • Oriented BRIEF uses the normalized orientation and calculates a 256-bit-wide descriptor
     • The descriptor is calculated by comparing 256 predefined pairs of pixels in the area surrounding the feature center
     • Each pair comparison contributes a single bit to the descriptor
     • Orientation is normalized by rotating the image (or the pair coordinates, in our implementation) according to the moment of the feature center
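The one-bit-per-pair construction above can be sketched as follows. This is an illustration only: the pair pattern is drawn from a seeded random generator as a stand-in for the predefined pattern, and the names `make_pairs` and `brief_descriptor`, the patch radius, and the "first pixel darker sets the bit" convention are assumptions.

```python
import random

def make_pairs(n_pairs=256, radius=15, seed=0):
    """Hypothetical stand-in for the predefined pattern: random offset pairs."""
    rng = random.Random(seed)

    def offset():
        return (rng.randint(-radius, radius), rng.randint(-radius, radius))

    return [(offset(), offset()) for _ in range(n_pairs)]

def brief_descriptor(img, cx, cy, pairs):
    """Pack one comparison bit per pixel pair into a single integer."""
    desc = 0
    for i, ((x1, y1), (x2, y2)) in enumerate(pairs):
        if img[cy + y1][cx + x1] < img[cy + y2][cx + x2]:
            desc |= 1 << i  # pair i contributes bit i
    return desc
```

With 256 pairs the result fits in 256 bits, so two descriptors can be compared with a bitwise XOR and a bit count.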
  21. BRIEF DSP Implementation
     • Calculate the patch orientation (patch moment)
     • Use the LUT capability to read the pixel addresses
     • Load the pixel pairs with random access to memory
     • Use the SIMD capability to efficiently calculate the descriptor
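The orientation step can be sketched with the intensity-centroid moments described in the ORB paper, followed by rotating the pair coordinates instead of the image. The function names, the circular patch, and the rounding of rotated offsets are assumptions; the LUT of pixel addresses mentioned above is not reproduced here.

```python
import math

def patch_orientation(img, cx, cy, radius):
    """Angle of the intensity centroid relative to the patch center."""
    m10 = m01 = 0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if dx * dx + dy * dy <= radius * radius:  # circular patch
                v = img[cy + dy][cx + dx]
                m10 += dx * v  # first-order moment in x
                m01 += dy * v  # first-order moment in y
    return math.atan2(m01, m10)

def rotate_pairs(pairs, theta):
    """Rotate every (x, y) offset in the pattern by theta radians."""
    c, s = math.cos(theta), math.sin(theta)

    def rot(x, y):
        return (round(c * x - s * y), round(s * x + c * y))

    return [(rot(*p), rot(*q)) for p, q in pairs]
```

Rotating a few hundred coordinate pairs is far cheaper than resampling the patch, which is presumably why the implementation rotates the pattern rather than the image.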
  22. SURF: Speeded Up Robust Features
     • Inspired by the SIFT descriptor, but much faster
     • Main modules: detect feature locations according to the pixels' response to the feature; calculate the feature descriptor
     • Flow: Integral Sum Image → Feature Response → Find Local Maximum → Choose Best Features → Feature Descriptor
  23. SURF: Integral Sum Image
     • Two-pass approach
     • Horizontal and vertical data access (full bandwidth)
     • Load and transpose data in a single instruction
     [Figure: vector memory and vector registers, with accumulation (+) and transpose (T) steps]
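The two-pass approach above can be sketched as a horizontal running sum followed by a vertical one. This scalar version only shows the data flow; the function name is hypothetical and the single-instruction load-and-transpose is not modeled.

```python
def integral_image(img):
    """Integral image built in two passes over a list of rows."""
    # Pass 1: running sum along each row (horizontal access).
    rows = []
    for row in img:
        acc, out = 0, []
        for v in row:
            acc += v
            out.append(acc)
        rows.append(out)
    # Pass 2: running sum down each column (vertical access).
    for y in range(1, len(rows)):
        rows[y] = [a + b for a, b in zip(rows[y], rows[y - 1])]
    return rows
```

Each output entry then holds the sum of all pixels above and to the left of it, inclusive, so any box sum can later be read from four entries.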
  24. SURF: Feature Response (flexible and powerful filter instruction)
     • Calculate the feature response using a box filter
     • Small box: A-B+D-C
     • Large box: E-F+H-G
     • Total response: E-F+H-G - 3x(A-B+D-C)
     • Can be executed as two operations, each in a single cycle: Res = (E-F)+(H-G), then Res += (A-B)*(-3)+(D-C)*(-3)
     [Figure: box filter with corner points A, B, C, D (small box) and E, F, G, H (large box)]
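The corner arithmetic above can be checked with a small sketch. It assumes an integral image padded with a zero top row and left column, and it assumes the small box is the middle third of the large box, so that "large minus 3 x small" yields lobe weights of 1, -2, 1 as in the SURF box filters; the function names and the horizontal lobe layout are illustrative choices.

```python
def integral_image(img):
    """Integral image with an extra zero row and column at the top and left."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        for x in range(w):
            ii[y + 1][x + 1] = img[y][x] + ii[y][x + 1] + ii[y + 1][x] - ii[y][x]
    return ii

def box_sum(ii, x0, y0, x1, y1):
    """Sum of pixels in rows y0..y1-1 and columns x0..x1-1, as A - B + D - C."""
    a, b = ii[y0][x0], ii[y0][x1]  # top-left, top-right corners
    c, d = ii[y1][x0], ii[y1][x1]  # bottom-left, bottom-right corners
    return a - b + d - c

def response(ii, x, y, lobe_w, lobe_h):
    """Large box minus 3x the middle box: weights 1, -2, 1 across three lobes."""
    large = box_sum(ii, x, y, x + 3 * lobe_w, y + lobe_h)
    small = box_sum(ii, x + lobe_w, y, x + 2 * lobe_w, y + lobe_h)
    return large - 3 * small
```

The cost is eight table reads and a handful of additions regardless of box size, which is why the response fits in one or two filter instructions.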
  25. SURF: Feature Descriptor
     • Handles multiple features in parallel
     • Memory access of several different features in a single instruction (parallel load of Feature Data 0, Feature Data 1, ... Feature Data 7)
     • Calculates the integral sum with vertical and horizontal memory access
     • Calculates the gradient using a box filter: result = (A-B) + (D-C)
  26. Feature Extraction Summary Table
     • Columns: Algorithm; Memory Access (2-Dimensional Access, LUT Support, Parallel Memory Access); Execution (Dedicated Instructions, Vectorized Conditional Flow)
     • HOG – Bilinear: ✓
     • HOG – Gamma: ✓
     • FAST9 – Detector: ✓ ✓
     • BRIEF – Descriptor: ✓ ✓
     • SURF – Int. Sum: ✓ ✓ ✓
     • SURF – Feature Response: ✓ ✓
     • SURF – Descriptor: ✓ ✓
  27. Conclusion: Critical Ingredients for an Efficient CV Processor
     • CV processor ingredients: efficient filter processing; good conditional code execution; fast and flexible random access to memory; good bit and byte data manipulation; wide memory bandwidth; large internal memory
     • CEVA-MM3101 Imaging & Computer Vision IP platform: includes all above features (and more…)
     • The CEVA-CV library already includes various feature extraction algorithms
     • Enables a shorter development cycle, efficiency, and algorithm flexibility
  28. Resources for Further Investigation
       1. Rublee, E.; Rabaud, V.; Konolige, K.; Bradski, G. "ORB: an efficient alternative to SIFT or SURF." Computer Vision (ICCV), 2011 IEEE.
       2. Dalal, N.; Triggs, B. "Histograms of oriented gradients for human detection." Computer Vision and Pattern Recognition, 2005 (CVPR 2005).
       3. Bay, H.; Tuytelaars, T.; Van Gool, L. "SURF: Speeded Up Robust Features." Computer Vision – ECCV 2006.
       4. Calonder, M.; Lepetit, V.; Strecha, C.; Fua, P. "BRIEF: Binary Robust Independent Elementary Features." Computer Vision – ECCV 2010.
       5. Lowe, D. G. "Distinctive Image Features from Scale-Invariant Keypoints." International Journal of Computer Vision, Volume 60, Issue 2, pp. 91-110.
     You are all invited to the CEVA demo table. Thank you!