Explanation of the architecture used in one of my open-source object detection projects, including Convolutional Neural Networks, Single Shot Detector and MobileNet v2
3. Library
The TensorFlow Object Detection API allows quick
experiments on Computer Vision problems
https://github.com/tensorflow/models/tree/master/research/object_detection
4. Model Zoo
For each task, a model pre-trained on a given dataset is offered
Choose one or another based on the accuracy / speed tradeoff
https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md
5. SSD Mobilenet
SSD: Single Shot MultiBox Detector
MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications
MobileNetV2: Inverted Residuals and Linear Bottlenecks
13. ResNets
As a DNN becomes deeper, the risk of having vanishing / exploding gradients
increases. Residual Nets aim to address this problem by adding skip
connections, so that each block learns a residual F(x) that is added to its input x
[He et al., 2015, Deep Residual Learning for Image Recognition]
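A minimal sketch of the skip connection idea, assuming small dense layers in plain Python instead of the convolutions and batch normalization used in the actual ResNet paper. The helper names (relu, linear, residual_block) are hypothetical and only serve the illustration. It shows that when the learned part F(x) is zero, the block simply passes its input through:

```python
def relu(v):
    """Element-wise ReLU on a list of floats."""
    return [max(0.0, x) for x in v]


def linear(weights, v):
    """Matrix-vector product; weights is a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) for row in weights]


def residual_block(x, w1, w2):
    """Compute relu(F(x) + x), where F(x) = W2 * relu(W1 * x).

    Assumption: input and output have the same size, so the
    skip connection can add x without any projection.
    """
    fx = linear(w2, relu(linear(w1, x)))
    return relu([f + xi for f, xi in zip(fx, x)])


x = [1.0, 2.0, 3.0]
zeros = [[0.0] * 3 for _ in range(3)]

# With all-zero weights F(x) is zero, so the block returns its
# (non-negative) input unchanged thanks to the skip connection.
y = residual_block(x, zeros, zeros)
print(y)
```

Because the identity path is always present, a very deep stack of such blocks can start close to an identity mapping, which is one common way to explain why residual networks are easier to train.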
21. Non-Maxima Suppression
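● Choose the box with the highest prediction score
● Compare it with the remaining boxes using IoU (Intersection over Union) and discard those whose overlap exceeds a threshold
● Repeat with the remaining boxes until none are left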
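The steps above can be sketched in plain Python as follows. This is an illustrative version, not the code used by the TF Object Detection API; the box format (x1, y1, x2, y2), the function names and the default threshold of 0.5 are assumptions made for the example.

```python
def iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Return the indices of the boxes that are kept, best score first."""
    # Candidate indices sorted by descending prediction score
    remaining = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while remaining:
        # Step 1: take the box with the highest score
        best = remaining.pop(0)
        keep.append(best)
        # Step 2: drop every box that overlaps it too much
        remaining = [i for i in remaining
                     if iou(boxes[best], boxes[i]) <= iou_threshold]
        # Step 3: the loop repeats with whatever is left
    return keep


# Two heavily overlapping boxes and one separate box (made-up values)
boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 150, 150)]
scores = [0.9, 0.8, 0.7]
kept = non_max_suppression(boxes, scores)
print(kept)
```

In this toy input the second box overlaps the first with an IoU of about 0.82, so it is suppressed, while the third box does not overlap and is kept. In practice NMS is usually applied per class.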
22. Alternative models
● Region-based Convolutional Network (R-CNN)
● Fast Region-based Convolutional Network (Fast R-CNN)
● Faster Region-based Convolutional Network (Faster R-CNN)
● Region-based Fully Convolutional Network (R-FCN)
● YOLOv3
● RetinaNet
Article: Review of Deep Learning Algorithms for Object Detection
23. Project
Define the classes of interest to detect
Use a video processing library: OpenCV
Modify the detection code to adjust it to your project’s needs
Build an executable version with PyInstaller
https://github.com/xavialex/object-detection