Yolo

Real-time Object Detection using YOLO
Presented by
Sourav Garai
MCSE 2018-2020
Jadavpur University

Object classification by Convolutional Neural Network

Objective: Find objects and mark a boundary around each one.
Generate patches with a sliding-window detector or selective search (2000 patches).
Problem: Computationally intensive, since a CNN must be run on every generated patch.
Very slow.
Object detection + Classification
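
To give a feel for why the patch-based approach is slow, here is a minimal sketch that counts how many patches a sliding window produces. The image size, window sizes and stride are hypothetical values chosen for illustration; they do not come from the slides.

```python
def count_sliding_windows(image_w, image_h, window, stride):
    """Count the positions of a square window sliding over an image."""
    if window > image_w or window > image_h:
        return 0
    cols = (image_w - window) // stride + 1
    rows = (image_h - window) // stride + 1
    return cols * rows


# Hypothetical settings: a 448x448 image, three window sizes, stride of 32 px.
total_patches = sum(count_sliding_windows(448, 448, w, 32) for w in (64, 128, 256))
print(total_patches)  # each of these patches would need its own CNN forward pass
```

Even this coarse setting yields hundreds of patches per image, and every one of them needs its own CNN run.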

YOLO (You Only Look Once)
● A single neural network
● Trains on full images
● No need for a complex pipeline
● Sees the entire image

Algorithm
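YOLO divides the image into an S x S grid. Each grid cell predicts B bounding boxes, each with 5 values (x, y, w, h and a confidence score), plus C class probabilities, so each cell outputs N = ( B * 5 + C ) values.
Remove boxes with low confidence, say below 30%.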
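
The thresholding step can be sketched as below. The dictionary layout and the function name `filter_by_confidence` are assumptions made for this example, and keeping a box that sits exactly at the threshold is an arbitrary choice.

```python
def filter_by_confidence(boxes, threshold=0.30):
    """Keep only the boxes whose confidence is at least `threshold`."""
    return [box for box in boxes if box["confidence"] >= threshold]


# Hypothetical predictions, for illustration only.
predictions = [
    {"label": "dog", "confidence": 0.85},
    {"label": "bicycle", "confidence": 0.12},
    {"label": "car", "confidence": 0.30},
]
kept = filter_by_confidence(predictions)
print([box["label"] for box in kept])
```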

Architecture
The network has 24 convolutional layers, followed by 2 fully connected layers that produce the predictions.
➔ 7×7 grid
➔ 2 bounding boxes per cell
➔ 20 classes
The final output is 7 × 7 × ( 2×5 + 20 ) = 7 × 7 × 30
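
The output-size arithmetic can be checked with a short sketch. The per-cell layout used in `split_cell_vector` (all box values first, then the class probabilities) is an assumption for illustration; the slides do not specify the ordering.

```python
S, B, C = 7, 2, 20  # grid size, boxes per cell, number of classes

values_per_cell = B * 5 + C            # 5 values per box: x, y, w, h, confidence
output_shape = (S, S, values_per_cell)
total_values = S * S * values_per_cell


def split_cell_vector(cell, num_boxes=B, num_classes=C):
    """Split one cell's flat prediction into box tuples and class probabilities.

    Assumed layout: [box_1 (5 values), ..., box_B (5 values), class probs].
    """
    if len(cell) != num_boxes * 5 + num_classes:
        raise ValueError("unexpected cell vector length")
    boxes = [tuple(cell[i * 5:(i + 1) * 5]) for i in range(num_boxes)]
    class_probs = list(cell[num_boxes * 5:])
    return boxes, class_probs


boxes, class_probs = split_cell_vector([0.0] * values_per_cell)
print(output_shape, total_values, len(boxes), len(class_probs))
```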

Class-specific confidence = Pr(Class_i | Object) ∗ Pr(Object) ∗ IOU
= Pr(Class_i) ∗ IOU
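
As a small worked example of this product, the sketch below multiplies a cell's conditional class probabilities by one box's confidence, where the box confidence stands for Pr(Object) ∗ IOU. The function name and the numbers are made up for illustration.

```python
def class_specific_confidence(class_probs, box_confidence):
    """Scale each Pr(Class_i | Object) by the box confidence Pr(Object) * IOU."""
    return [p * box_confidence for p in class_probs]


# Hypothetical values: three classes and a box confidence of 0.8.
scores = class_specific_confidence([0.7, 0.2, 0.1], 0.8)
best_class = max(range(len(scores)), key=scores.__getitem__)
print(scores, best_class)
```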

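Intersection over Union (IOU): an evaluation metric for the accuracy of an object detector. It is the area where the predicted box and the ground-truth box overlap, divided by the area of their union.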
Image Credits: https://www.pyimagesearch.com/2016/11/07/intersection-over-union-iou-for-object-detection/
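
A minimal sketch of the IOU computation follows, assuming axis-aligned boxes given as (x_min, y_min, x_max, y_max) corner coordinates. That box format is an assumption for this example; YOLO itself predicts centre coordinates plus width and height.

```python
def iou(box_a, box_b):
    """Intersection over Union of two (x_min, y_min, x_max, y_max) boxes."""
    # Corners of the overlap rectangle (may be empty).
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # partial overlap
print(iou((0, 0, 1, 1), (2, 2, 3, 3)))  # no overlap
```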

Training - Loss function
YOLO uses the sum-squared error between the predictions and the ground truth to calculate the loss. The loss function is composed of:
● the classification loss.
● the localization loss (errors between the predicted bounding box and the ground truth).
● the confidence loss (the objectness of the box).

Classification loss
If an object is present in a cell, the classification loss for that cell is the squared error of the class conditional probabilities, summed over all classes.
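
The classification term for a single cell can be sketched as below. The function name, the one-hot ground truth and the example numbers are assumptions for illustration.

```python
def classification_loss(pred_probs, true_probs, object_in_cell):
    """Squared error over class probabilities; zero when the cell has no object."""
    if not object_in_cell:
        return 0.0
    return sum((t - p) ** 2 for p, t in zip(pred_probs, true_probs))


# Hypothetical cell: three classes, the true class is the first one.
loss = classification_loss([0.7, 0.2, 0.1], [1.0, 0.0, 0.0], object_in_cell=True)
print(loss)
```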

Localization loss
The localization loss measures the errors in the predicted bounding box locations and sizes. Only the box responsible for detecting the object is counted. The term is multiplied by a constant to give more weight to bounding box accuracy.
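
A sketch of the localization term for one responsible box is given below. Two details are not stated on the slide and are taken from the original YOLO paper as I understand it: the weighting constant is 5, and widths and heights are compared through their square roots so that an error on a small box counts for more than the same error on a large box. Treat both as assumptions here.

```python
import math

LAMBDA_COORD = 5.0  # assumed weighting constant; not given on the slide


def localization_loss(pred, truth, lambda_coord=LAMBDA_COORD):
    """Weighted squared error on (x, y, w, h) for the responsible box.

    Widths and heights must be non-negative because of the square roots.
    """
    px, py, pw, ph = pred
    tx, ty, tw, th = truth
    center_err = (tx - px) ** 2 + (ty - py) ** 2
    size_err = (math.sqrt(tw) - math.sqrt(pw)) ** 2 + (math.sqrt(th) - math.sqrt(ph)) ** 2
    return lambda_coord * (center_err + size_err)


# Hypothetical normalized boxes, for illustration only.
loc = localization_loss((0.5, 0.5, 0.25, 0.25), (0.6, 0.5, 0.36, 0.25))
print(loc)
```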

Confidence loss (Object present)
If an object is detected in the box, the confidence loss is the squared error between the predicted and the target box confidence.

Confidence loss (Object not present)
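If no object is detected in the box, the confidence loss is the same squared error, weighted down by a constant so that the many empty boxes do not dominate training.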
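
Both confidence cases can be sketched in one function. The down-weighting constant of 0.5 for boxes without an object is taken from the original YOLO paper as I understand it, not from the slide, so treat it as an assumption. The target confidence would be the IOU with the ground truth when an object is present, and 0 otherwise.

```python
LAMBDA_NOOBJ = 0.5  # assumed weight for boxes without an object


def confidence_loss(pred_conf, target_conf, object_present, lambda_noobj=LAMBDA_NOOBJ):
    """Squared confidence error, down-weighted when no object is present."""
    err = (target_conf - pred_conf) ** 2
    return err if object_present else lambda_noobj * err


# Hypothetical values: one box with an object, one without.
with_object = confidence_loss(0.6, 0.8, object_present=True)
without_object = confidence_loss(0.2, 0.0, object_present=False)
print(with_object, without_object)
```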

Total error
The final loss adds localization, confidence and classification losses together.
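
For reference, the combined loss can be written out as follows. This is reconstructed from the original YOLO paper as I understand it rather than copied from the slides, so the notation is an assumption: hatted symbols denote predictions, C is the box confidence, p(c) is a class probability, and the indicator is 1 when box j of cell i is responsible for an object. The paper's weights are, to my knowledge, 5 for the coordinate term and 0.5 for the no-object term.

```latex
\[
\begin{aligned}
\text{Loss} ={}& \lambda_{\text{coord}} \sum_{i=0}^{S^2} \sum_{j=0}^{B}
    \mathbb{1}_{ij}^{\text{obj}} \left[ (x_i - \hat{x}_i)^2 + (y_i - \hat{y}_i)^2 \right] \\
&+ \lambda_{\text{coord}} \sum_{i=0}^{S^2} \sum_{j=0}^{B}
    \mathbb{1}_{ij}^{\text{obj}} \left[ \left(\sqrt{w_i} - \sqrt{\hat{w}_i}\right)^2
    + \left(\sqrt{h_i} - \sqrt{\hat{h}_i}\right)^2 \right] \\
&+ \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{\text{obj}} \left(C_i - \hat{C}_i\right)^2 \\
&+ \lambda_{\text{noobj}} \sum_{i=0}^{S^2} \sum_{j=0}^{B}
    \mathbb{1}_{ij}^{\text{noobj}} \left(C_i - \hat{C}_i\right)^2 \\
&+ \sum_{i=0}^{S^2} \mathbb{1}_{i}^{\text{obj}} \sum_{c \in \text{classes}}
    \left(p_i(c) - \hat{p}_i(c)\right)^2
\end{aligned}
\]
```

The first two lines are the localization loss, the next two are the confidence loss with and without an object, and the last line is the classification loss.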

Performance Comparison
● Less computationally intensive
● High FPS
● High Mean Average Precision
● Can be used in embedded devices or mobile devices
● Suits time-sensitive tasks such as self-driving cars and robots

Limitations
● Struggles to localize objects correctly
● Struggles with small objects that appear in groups, such as flocks of birds
Detects 5 out of 9 people in the lower-left area.

Thank you
J. Redmon and A. Farhadi, "YOLO9000: Better, Faster, Stronger," 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, 2017, pp. 6517-6525. doi: 10.1109/CVPR.2017.690
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8100173&isnumber=8099483
