Single Shot Multibox Detector


NamHyuk Ahn

Presentation file for lab seminar.

SSD:
Single Shot Multibox Detector
NamHyuk Ahn

Object Detection
- mean Average Precision (mAP)
• Popular evaluation metric
• Compute the average precision (AP) for each class, then average
the APs over all classes
• A detection is a true positive if its box overlaps a ground-truth
box by more than some threshold (usually an IoU of 0.5)
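
The overlap test above can be sketched in a few lines. This is a minimal illustration, assuming boxes are given as (x1, y1, x2, y2) corner coordinates and that overlap means intersection over union (IoU); the function names `iou` and `is_true_positive` are hypothetical. A full mAP evaluation would also match each ground-truth box at most once, which this sketch leaves out.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Width and height are clamped at zero for non-overlapping boxes.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(det_box, gt_boxes, threshold=0.5):
    """Simplified check: does the detection overlap any ground-truth box
    by more than the threshold? (Assumed helper, not the full VOC protocol.)"""
    return any(iou(det_box, gt) > threshold for gt in gt_boxes)
```

For example, `iou((0, 0, 2, 2), (1, 1, 3, 3))` gives 1/7, which is below the usual 0.5 threshold, so that detection would not count as a true positive.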

Object Detection
- R-CNN Family
• Most popular detection approach in deep learning
• Relies on a region proposal step, which makes the model slow
• Good accuracy (Faster R-CNN: 73.2% mAP), but very slow
• R-CNN: 50 sec/img, Fast R-CNN: 2 sec/img, Faster R-CNN: 0.2 sec/img (7 FPS)
- YOLO (You Only Look Once)
• Real-time (45 FPS), but lower accuracy (63.4% mAP)

YOLO:
You Only Look Once
- Single-shot detector model
• Does not separate classification from bbox regression
- Divide the image into an S x S grid (7x7 in the paper)
• Each grid cell predicts a vector of length (4+1)*B + C
• B: # of boxes per grid cell (2 in the paper)
• C: # of classes (20 in the paper)
• (4+1): 4 box coordinates + 1 box confidence
- Direct prediction using a CNN with a regression loss
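
As a quick check of the arithmetic above, the sketch below computes the size of the YOLO output tensor from S, B and C. The function name `yolo_output_shape` is hypothetical; the default values are the ones quoted on the slide.

```python
def yolo_output_shape(S=7, B=2, C=20):
    """Shape of the YOLO prediction tensor for an S x S grid.

    Each cell holds B boxes with (4 coords + 1 confidence) each,
    plus C class scores shared by the cell.
    """
    per_cell = (4 + 1) * B + C
    return (S, S, per_cell)


shape = yolo_output_shape()
total_values = shape[0] * shape[1] * shape[2]
print(shape, total_values)
```

With the paper's settings this gives a 7 x 7 x 30 tensor, i.e. 1470 predicted values per image.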

YOLO:
You Only Look Once
- Operates on a single-scale feature map (last pooling layer)
• Poor accuracy on very large or very small objects
- Predicts bboxes using a fully connected (fc) layer
- Heavy data augmentation, 448x448 input image
- Uses a custom CNN architecture

SSD:
Single Shot Multibox Detector
- Multi-scale feature maps for detection
• Add conv layers at the end of the base network, decreasing in size progressively
• Concatenate the outputs of the multi-scale feature maps at the last layer
- Convolutional predictors for detection
• YOLO uses an fc layer, but SSD uses 3x3 conv kernels

SSD:
Single Shot Multibox Detector
- Default boxes and aspect ratios
• Set default boxes at each location, and predict offsets relative to
the corresponding default box
• Output dims: (C+4)*K*M*N,
K = # of default boxes, C = # of classes, M x N = feature map size
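
The output size formula above can be evaluated directly. The sketch below is only an illustration: the function name `ssd_output_size` and the example values (21 classes, 6 default boxes, a 19x19 feature map) are assumptions chosen for the arithmetic, not figures taken from the slides.

```python
def ssd_output_size(C, K, M, N):
    """Number of values predicted on one M x N feature map.

    Each of the K default boxes at each location gets C class scores
    and 4 box offsets, giving (C + 4) * K * M * N outputs in total.
    """
    return (C + 4) * K * M * N


# Hypothetical example values, used only to show the calculation.
print(ssd_output_size(C=21, K=6, M=19, N=19))
```

Summing this quantity over every feature map used for detection gives the total number of raw predictions the network emits per image.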

SSD:
Single Shot Multibox Detector
- Default boxes and aspect ratios
• Use 6 default boxes at each feature cell
• { 1, 2, 3, 1/2, 1/3 } aspect ratio boxes + 1 extra box with aspect ratio 1
• Set 3 boxes in conv4_3 to reduce computation
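
One way to generate the six box shapes is sketched below. It assumes the convention described in the SSD paper, where a box with scale s and aspect ratio a has width s*sqrt(a) and height s/sqrt(a), and the extra aspect-ratio-1 box uses the geometric mean of the current and next layer's scales. The function name `default_box_shapes` and the scale values in the example are hypothetical.

```python
import math


def default_box_shapes(scale, next_scale, aspect_ratios=(1, 2, 3, 1 / 2, 1 / 3)):
    """Return (width, height) pairs for the default boxes at one location.

    Sizes are relative to the image (e.g. 0.2 means 20% of the image side).
    """
    # One box per aspect ratio, all with the same area scale**2.
    shapes = [(scale * math.sqrt(a), scale / math.sqrt(a)) for a in aspect_ratios]

    # Extra square box at an intermediate scale (assumed convention).
    extra = math.sqrt(scale * next_scale)
    shapes.append((extra, extra))
    return shapes


# Hypothetical scales, chosen only for illustration.
for w, h in default_box_shapes(scale=0.2, next_scale=0.35):
    print(f"w={w:.3f} h={h:.3f}")
```

Dropping some aspect ratios from `aspect_ratios` (as done for conv4_3 on the slide) reduces the number of boxes per cell and therefore the amount of computation.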

SSD:
Single Shot Multibox Detector
- Output feature (final layer)
• Given the output boxes from the multi-scale features, sort them
by class confidence
• Pick the top 200 boxes and represent each box as a 7-dim vector
• [ batch_idx, class_confidence, label, box offset… ]
• Output feature dim is 7x200
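
The sort-and-select step above can be sketched as follows. This is a simplified illustration, not the actual output layer: the dictionary keys and the function name `top_k_detections` are hypothetical, and the 7 values are assumed to be batch index, confidence, label and 4 box values, in the order listed on the slide.

```python
def top_k_detections(detections, k=200):
    """Keep the k most confident detections as 7-dim vectors.

    Each input detection is a dict with keys 'batch_idx', 'confidence',
    'label' and 'box' (a sequence of 4 numbers). These keys are an
    assumption made for this sketch.
    """
    ranked = sorted(detections, key=lambda d: d["confidence"], reverse=True)[:k]
    return [
        [d["batch_idx"], d["confidence"], d["label"], *d["box"]]
        for d in ranked
    ]


example = [
    {"batch_idx": 0, "confidence": 0.3, "label": 5, "box": (0.1, 0.1, 0.4, 0.4)},
    {"batch_idx": 0, "confidence": 0.9, "label": 2, "box": (0.2, 0.2, 0.6, 0.7)},
    {"batch_idx": 0, "confidence": 0.6, "label": 2, "box": (0.5, 0.5, 0.9, 0.9)},
]
top2 = top_k_detections(example, k=2)
```

With k=200 and at least 200 candidate boxes, the result has 200 rows of 7 values each, matching the 7x200 output noted above.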

Model analysis
- Data augmentation is very important
- More feature maps are better
• Lower feature maps can capture fine-grained details of objects
- More default box shapes are better
• With only 4 boxes, performance drops by 0.9%
• Using a variety of default box shapes makes box prediction easier
- Atrous VGG is better and faster

Result
- Accuracy is comparable to the state of the art, while running in
real time

Reference
- Liu, Wei, et al. "SSD: Single Shot MultiBox Detector." arXiv preprint
arXiv:1512.02325 (2015).


