You Only Look Once (YOLO):
Unified Real-Time Object Detection
Joseph Redmon, Santosh Divvala, Ross Girshick, Ali Farhadi
University of Washington, Allen Institute for AI, Facebook AI Research
~ Ashish
Previously : Object Detection by Classifiers
● DPM (Deformable Parts Model)
○ Sliding window → classifier (evenly spaced locations)
● R-CNN
○ Region proposals → potential bounding boxes (BB)
○ Run classifier on each BB
○ Post-processing (refine boxes, eliminate duplicates, rescore)
● YOLO
○ Resize image, run convolutional network, non-max suppression
YOLO : Object Detection as Regression Problem
● output: Bounding box coordinates and Class Probabilities
● Single Neural Network
● Benefits:
○ Extremely fast (a single network, 45 frames per second), with more than twice the mAP of other real-time systems
○ Global reasoning (sees the whole image, so it knows context and makes fewer background errors)
○ Generalizable representations (trained on natural images, tested on artwork; applicable to new domains)
Unified Detection
● Feature Extraction
○ Uses features from the entire image to predict all bounding boxes across all classes simultaneously
● SxS Grid
○ Each cell predicts B bounding boxes + Confidence Score
● Confidence Score
○ Confidence = P(object) × IOU between the predicted box and the ground truth box
● Class Probability
● Tensor
Detection Process (YOLO)
S × S grid, S = 7
Confidence Score
Each grid cell predicts B bounding boxes and a confidence score for each of those boxes.
If a cell contains an object, the confidence score should equal the intersection over union (IOU)
between the predicted box and the ground truth; if it contains no object, the confidence should be zero.
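As an illustration of the confidence target described on this slide, here is a minimal Python sketch of IOU. It assumes boxes are given as corner coordinates `(x_min, y_min, x_max, y_max)`, which is a convenience for this example only (the slides parameterise boxes as x, y, w, h); the function names are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max).

    The corner format is an assumption made for this sketch.
    """
    # Corners of the overlapping rectangle (if any).
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Clamp to zero when the boxes do not overlap.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def target_confidence(has_object, predicted_box, truth_box):
    """Training target for the confidence: IOU if the cell has an object, else 0."""
    return iou(predicted_box, truth_box) if has_object else 0.0
```

For example, two 2 × 2 boxes offset by one unit in each direction overlap in a 1 × 1 square, so their IOU is 1 / (4 + 4 - 1) = 1/7.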
Detection Process (YOLO)
Each cell predicts B boxes (x, y, w, h) and a confidence for each box: P(Object)
● (x, y) = box center; w, h = box width and height
● B = 2
● P1, P2 = probability that each box contains an object
● Low-confidence boxes: no object
Each cell predicts bounding boxes and confidences
Each cell also predicts class probabilities
Classes: Bicycle, Dog, Car
● E.g. Dog: 0.8, Car: 0, Bicycle: 0
● E.g. Dog: 0, Car: 0, Bicycle: 0.7
● E.g. Dog: 0, Car: 0.7, Bicycle: 0
Bounding Boxes + Class Prediction
Class score for each box = P(class | object) × P(object) × IOU
Thresholding: keep only boxes whose class score exceeds a threshold
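The product and thresholding step on this slide can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the box confidence is taken as a single number already produced by the network, and the threshold of 0.2 is an arbitrary value chosen for the example.

```python
def class_scores(box_confidence, class_probs):
    """Multiply each conditional class probability P(class | object)
    by the box confidence to get a class-specific score for the box."""
    return {name: p * box_confidence for name, p in class_probs.items()}


def keep_detections(scores, threshold=0.2):
    """Keep only the classes whose score reaches the threshold.

    The default threshold is an assumption for illustration.
    """
    return {name: s for name, s in scores.items() if s >= threshold}


# One box with confidence 0.9 in a cell that predicts "dog" with probability 0.8.
scores = class_scores(0.9, {"dog": 0.8, "car": 0.0, "bicycle": 0.0})
kept = keep_detections(scores)
```

Here only the "dog" score (about 0.72) survives the threshold; the other two classes score zero and are dropped.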
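Model
These predictions are encoded as a tensor of dimension S × S × (B × 5 + C), where:
● S × S = grid size
● B = number of bounding boxes per cell (each with x, y, w, h, confidence)
● C = number of classes
For PASCAL VOC: S = 7, B = 2, C = 20, giving a 7 × 7 × 30 tensor.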
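To make the tensor dimension concrete, the following Python sketch computes the output shape and splits one cell's vector into its boxes and class probabilities. The ordering inside the vector (all boxes first, then the class probabilities) is an assumption made for this example; actual implementations may lay the values out differently. Function names are hypothetical.

```python
def output_shape(s, b, c):
    """Shape of the prediction tensor: S x S x (B * 5 + C)."""
    return (s, s, b * 5 + c)


def split_cell(cell_vector, b, c):
    """Split one cell's vector into B boxes and C class probabilities.

    Assumed layout (for illustration only):
    [x, y, w, h, conf] repeated B times, followed by C class probabilities.
    """
    boxes = [tuple(cell_vector[i * 5:(i + 1) * 5]) for i in range(b)]
    class_probs = list(cell_vector[b * 5:b * 5 + c])
    return boxes, class_probs


# PASCAL VOC setting: S = 7, B = 2, C = 20.
shape = output_shape(7, 2, 20)       # (7, 7, 30)
boxes_per_image = 7 * 7 * 2          # 98 boxes per image
```

With S = 7 and B = 2 the network proposes 7 × 7 × 2 = 98 boxes per image, each cell sharing one set of 20 class probabilities.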
Network Design
● Inspired by the GoogLeNet (image classification)
● 24 convolutional layers followed by 2 fully connected layers
● Fast YOLO uses 9 convolutional layers (instead of 24)
Training
1. Pretrain on the ImageNet 1000-class dataset
2. Pretraining uses the first 20 convolutional layers + an average pooling layer + a fully connected layer
3. Trained for about 1 week; single-crop top-5 accuracy of 88% (ImageNet 2012 validation set)
4. Convert the model to perform detection
5. Add 4 convolutional layers + 2 fully connected layers, and increase the input resolution from 224 x 224 to 448 x 448
6. Final layer predicts class probabilities + BB coordinates
7. Linear activation function (final layer), leaky ReLU (all other layers)
8. Sum of squared errors as the loss function (easy to optimise)
Loss Function
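The slide showed the loss as an image, which is not reproduced in this transcript. For reference, the multi-part sum-squared loss given in the YOLO paper can be written as below, in notation consistent with the slides: S × S grid cells indexed by i, B boxes per cell indexed by j, box confidence C, and class probabilities p(c). The indicator 1^obj_ij is 1 when box j in cell i is responsible for an object, and 1^obj_i is 1 when an object appears in cell i. The paper sets λ_coord = 5 and λ_noobj = 0.5. The block assumes the `amsmath` and `amssymb` packages.

```latex
\begin{aligned}
\mathcal{L} ={}& \lambda_{\text{coord}} \sum_{i=0}^{S^2} \sum_{j=0}^{B}
    \mathbb{1}_{ij}^{\text{obj}}
    \left[ (x_i - \hat{x}_i)^2 + (y_i - \hat{y}_i)^2 \right] \\
&+ \lambda_{\text{coord}} \sum_{i=0}^{S^2} \sum_{j=0}^{B}
    \mathbb{1}_{ij}^{\text{obj}}
    \left[ \left(\sqrt{w_i} - \sqrt{\hat{w}_i}\right)^2
         + \left(\sqrt{h_i} - \sqrt{\hat{h}_i}\right)^2 \right] \\
&+ \sum_{i=0}^{S^2} \sum_{j=0}^{B}
    \mathbb{1}_{ij}^{\text{obj}} \left( C_i - \hat{C}_i \right)^2 \\
&+ \lambda_{\text{noobj}} \sum_{i=0}^{S^2} \sum_{j=0}^{B}
    \mathbb{1}_{ij}^{\text{noobj}} \left( C_i - \hat{C}_i \right)^2 \\
&+ \sum_{i=0}^{S^2} \mathbb{1}_{i}^{\text{obj}}
    \sum_{c \,\in\, \text{classes}} \left( p_i(c) - \hat{p}_i(c) \right)^2
\end{aligned}
```

The square roots on width and height reduce the penalty for a given absolute error in a large box relative to the same error in a small box, and the two λ weights keep the many empty cells from dominating the gradient.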
Training - Validation
1. Train the network for about 135 epochs on the training and validation data sets from PASCAL
VOC 2007 and 2012
2. Testing data: VOC 2007 & 2012
3. Batch size = 64, momentum = 0.9, decay = 0.0005
4. Learning rate (LR):
a. For the first few epochs, slowly raise the LR from 10^-3 to 10^-2
b. The model often diverges if the starting LR is high, due to unstable gradients
c. 75 epochs at LR 10^-2
d. Next 30 epochs at LR 10^-3
e. Final 30 epochs at LR 10^-4
5. To avoid overfitting:
a. Dropout layer with rate 0.5 after the first fully connected layer
b. For data augmentation, random scaling and translation of up to 20% of the original image size
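The learning-rate schedule in item 4 can be sketched as a simple function of the epoch. The slides do not say how many epochs the warm-up lasts or what shape it takes, so the 5-epoch linear ramp below is purely an assumption for illustration, as is counting the warm-up inside the 75 epochs at 10^-2.

```python
def learning_rate(epoch, warmup_epochs=5):
    """Learning rate for a given 0-based epoch (may be fractional).

    Assumptions for this sketch: a linear warm-up over `warmup_epochs`
    epochs, counted as part of the first 75 epochs.
    """
    if epoch < warmup_epochs:
        # Ramp from 1e-3 up to 1e-2 to avoid early divergence.
        return 1e-3 + (1e-2 - 1e-3) * epoch / warmup_epochs
    if epoch < 75:
        return 1e-2   # 75 epochs at 1e-2
    if epoch < 105:
        return 1e-3   # next 30 epochs
    return 1e-4       # final 30 epochs (up to epoch 135)
```

The three plateaus add up to 75 + 30 + 30 = 135 epochs, matching the total training length in item 1.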
Inference
● On PASCAL VOC, YOLO predicts 98 BB per image and class probabilities for
each box.
● Large objects and objects near the border of multiple cells can be localised by multiple cells
○ Non-maximal suppression can be used to fix these multiple detections (non-max suppression keeps
the highest-scoring box and discards lower-scoring boxes that overlap it heavily)
■ Adds 2 to 3% to mAP
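A minimal sketch of greedy non-max suppression for detections of a single class is given below. It assumes boxes in corner format `(x_min, y_min, x_max, y_max)` and an IOU threshold of 0.5; both are choices made for this example, not values taken from the slides.

```python
def iou(a, b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(detections, iou_threshold=0.5):
    """Greedy NMS over a list of (score, box) pairs for one class.

    Repeatedly keeps the highest-scoring remaining detection and drops
    every other detection that overlaps it by at least `iou_threshold`.
    """
    remaining = sorted(detections, key=lambda d: d[0], reverse=True)
    kept = []
    while remaining:
        best = remaining.pop(0)
        kept.append(best)
        remaining = [d for d in remaining if iou(best[1], d[1]) < iou_threshold]
    return kept


detections = [
    (0.9, (0, 0, 10, 10)),
    (0.8, (1, 1, 11, 11)),    # heavy overlap with the first box
    (0.7, (20, 20, 30, 30)),  # separate object
]
kept = non_max_suppression(detections)
```

In this example the second box overlaps the first with an IOU of about 0.68, so it is suppressed, while the third box is kept because it does not overlap at all.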
Limitation of YOLO
● Struggles with small objects, especially those that appear in groups
● Struggles to generalize to objects in new or unusual aspect ratios or configurations
● Loss function treats errors in small and large bounding boxes the same
Comparison with other Real time Systems:
● DPM: disjoint pipeline (sliding window, extract features, classify, predict BB);
YOLO performs all of these steps concurrently in a single network
● R-CNN: region proposals (about 2000 BB per image), complex pipeline (extract
features, score boxes, adjust BB, non-max suppression), more than 40 sec per image;
YOLO proposes only 98 BB per image
● Deep MultiBox: CNN predicts regions of interest, but cannot do general object detection
● OverFeat: CNN, disjoint system, no global context
● MultiGrasp: similar in design to YOLO, but only needs to find a single graspable region
Experiments
● PASCAL VOC 2007
● Real-time:
○ YOLO vs. DPM (30 Hz)
VOC 2007 Error Analysis
Combining Fast R-CNN and YOLO
● YOLO makes fewer background
mistakes than Fast R-CNN
● This combination doesn't benefit
from the speed of YOLO, since
each model is run separately and
the results are then combined.
VOC 2012 Results
● YOLO struggles with small objects (bottle, sheep, tv/monitor) compared to its closest competitors
● Fast R-CNN + YOLO: one of the highest performing detection methods
Generalizability: Person Detection in Artwork
● YOLO has good performance on VOC 2007
● Its AP degrades less than other methods when applied to artwork.
● Artwork and natural images are very different on a pixel level but similar in terms of the size and
shape of objects, so YOLO can still predict good bounding boxes and detections.
Results
Darknet (YOLO) Results on random images

You only look once (YOLO) : unified real time object detection

  • 1. You Only Look Once (YOLO): Unified Real-Time Object Detection Joseph Redmon, Santosh Divvala, Ross Girshick, Ali Farhadi University of Washington, Allen Institute for AI, Facebook AI Research ~ Ashish
  • 2. Previously : Object Detection by Classifiers ● DPM (Deformable Parts Model) ○ Sliding window → classifier (evenly spaced locations) ● R-CNN ○ Region proposal --> potential BB ○ Run classifiers on BB ○ Post processing (refinement, eliminate, rescore) ● YOLO ○ Resize image, run convolutional network, non-max suppression
  • 3. YOLO: Object Detection as a Regression Problem ● Output: bounding box coordinates and class probabilities ● Single neural network ● Benefits: ○ Extremely fast (a single network, 45 frames per second), with more than twice the mAP of other real-time systems ○ Global reasoning (sees the whole image for context, fewer background errors) ○ Generalizable representations (trained on natural images, tested on artwork; applicable to new domains)
  • 4. Unified Detection ● Feature Extraction ○ Predict bounding boxes for all classes simultaneously ● SxS Grid ○ Each cell predicts B bounding boxes + Confidence Score ● Confidence Score ○ Confidence = P(Object) x IOU between the predicted box and the ground truth box ● Class Probability ● Tensor
  • 5. Detection Process (YOLO): SxS grid, S = 7
  • 6. Confidence Score Each grid cell predicts B bounding boxes and confidence scores for those boxes. If a cell contains an object, the confidence score should equal the intersection over union (IOU) between the predicted box and the ground truth; if it contains no object, the confidence score should be zero.
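The IOU used as the confidence target can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `iou` and the corner format `(x_min, y_min, x_max, y_max)` are assumptions made here for simplicity, whereas the slides describe boxes as (x, y, w, h).

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Boxes are (x_min, y_min, x_max, y_max); this corner format is an
    assumption for the sketch, not the (x, y, w, h) encoding on the slides.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap region (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    # Union = sum of both areas minus the overlap counted twice.
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# Two 2x2 boxes overlapping in a 1x1 square: IOU = 1 / (4 + 4 - 1).
print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
```

A perfect prediction gives an IOU of 1, and disjoint boxes give 0, which is why the value works as a confidence target.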
  • 7. Detection Process (YOLO) Each cell predicts B boxes (x, y, w, h) and a confidence for each box: P(Object). B = 2. [Figure: a box with centre (x, y), width w and height h; P1, P2 are the probabilities that each box contains an object; cells labelled "No Object"]
  • 8. Each cell predicts Bounding Boxes and Confidence
  • 9. Each cell also predicts class probabilities (Bicycle, Dog, Car). E.g. Dog: 0.8, Car: 0, Bicycle: 0. E.g. Dog: 0, Car: 0, Bicycle: 0.7. E.g. Dog: 0, Car: 0.7, Bicycle: 0.
  • 10. Bounding Boxes + Class Prediction: P(class) = P(class | object) x P(object), followed by thresholding
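The product and thresholding step on this slide can be sketched as follows. The function name `class_scores`, the dictionary layout and the threshold value of 0.2 are hypothetical choices for illustration; the slides do not state which threshold is used.

```python
def class_scores(cond_class_probs, box_confidence, threshold=0.2):
    """Combine a cell's conditional class probabilities with a box confidence.

    cond_class_probs: dict mapping class name -> P(class | object)
    box_confidence:   P(object) predicted for one box
    threshold:        hypothetical cut-off; the slides give no value
    """
    # P(class) = P(class | object) * P(object) for every class.
    scores = {name: p * box_confidence for name, p in cond_class_probs.items()}
    # Keep only the classes whose score clears the threshold.
    return {name: s for name, s in scores.items() if s >= threshold}


kept = class_scores({"dog": 0.8, "car": 0.0, "bicycle": 0.0}, box_confidence=0.9)
print(kept)
```

With the example probabilities from the previous slide, only the "dog" entry survives, with a score of roughly 0.72.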
  • 11. Model These predictions are encoded as a tensor of dimension (SxSx(Bx5+C)): SxS grid, B = number of bounding boxes per cell (each with 5 values: x, y, w, h, confidence), C = number of classes.
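As a quick check of the tensor size, the arithmetic can be written out directly. S = 7 and B = 2 come from the slides; C = 20 is assumed here because PASCAL VOC has 20 object classes. The helper names are hypothetical.

```python
def output_shape(S, B, C):
    """Shape of the prediction tensor: S x S cells, each holding
    B boxes of 5 values (x, y, w, h, confidence) plus C class probabilities."""
    return (S, S, B * 5 + C)


def boxes_per_image(S, B):
    """Total number of boxes predicted for one image."""
    return S * S * B


# S and B are taken from the slides; C = 20 assumes PASCAL VOC.
print(output_shape(7, 2, 20))   # (7, 7, 30)
print(boxes_per_image(7, 2))    # 98
```

The 98 boxes match the per-image count quoted on the inference slide.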
  • 12. Network Design ● Inspired by GoogLeNet (an image classification network) ● 24 convolutional layers followed by 2 fully connected layers ● Fast YOLO uses 9 convolutional layers (instead of 24)
  • 13. Training 1. Pretrain on the ImageNet 1000-class dataset 2. Pretraining uses 20 convolutional layers + an average pooling layer + a fully connected layer 3. Trained for about 1 week, reaching 88% top-5 accuracy (ImageNet 2012 validation set) 4. Convert the model to perform detection 5. Add 4 convolutional layers + 2 fully connected layers and increase the input resolution from 224 x 224 to 448 x 448 6. Final layer predicts class probabilities + BB coordinates 7. Linear activation function (final layer), leaky ReLU (all other layers) 8. Sum of squared error as loss function (easy to optimise)
  • 15. Training - Validation 1. Train the network for 135 epochs on the training and validation data sets from PASCAL VOC 2007 and 2012 2. Test on VOC 2007 and 2012 3. Batch size = 64, momentum = 0.9, decay = 0.0005 4. Learning rate (LR): a. Over the first few epochs, slowly raise the LR from 10^-3 to 10^-2 b. The model often diverges if training starts at a high LR, because of unstable gradients c. first 75 epochs, LR 10^-2 d. next 30 epochs, LR 10^-3 e. next 30 epochs, LR 10^-4 5. To avoid overfitting: a. Dropout layer with rate 0.5 b. Data augmentation: random scaling and translation of up to 20% of the original image size
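The learning-rate schedule above can be sketched as a function of the epoch index. Two things are assumptions made for this sketch: the slides do not say how many epochs "first few" means (the hypothetical `warmup_epochs` parameter defaults to 5), and the ramp is taken to be linear and to fall inside the first 75 epochs so that the total stays at 135.

```python
def learning_rate(epoch, warmup_epochs=5):
    """Learning rate for a 0-based epoch index.

    warmup_epochs is a hypothetical parameter: the slides only say
    "first few epochs". The linear ramp is also an assumption.
    """
    if epoch < warmup_epochs:
        # Ramp from 1e-3 up towards 1e-2 to avoid early divergence.
        frac = epoch / warmup_epochs
        return 1e-3 + frac * (1e-2 - 1e-3)
    if epoch < 75:
        return 1e-2   # first 75 epochs
    if epoch < 105:
        return 1e-3   # next 30 epochs
    return 1e-4       # final 30 epochs


for e in (0, 10, 75, 105, 134):
    print(e, learning_rate(e))
```

In a real training loop this value would be passed to the optimiser at the start of each epoch, alongside the momentum of 0.9 and weight decay of 0.0005 listed on the slide.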
  • 16. Inference ● On PASCAL VOC, YOLO predicts 98 BB per image and class probabilities for each box. ● Large objects, or objects near the border of multiple cells, can be localised by multiple cells ○ Non-maximal suppression can be used to fix these multiple detections (non-max suppression keeps the highest-scoring box and discards other boxes that overlap it heavily) ■ Adds 2 to 3% to mAP
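A common greedy form of non-max suppression can be sketched as below. This is a generic version, not necessarily the exact procedure used in Darknet; the function names, the corner box format `(x_min, y_min, x_max, y_max)` and the IOU threshold of 0.5 are all assumptions for illustration.

```python
def iou(a, b):
    """IOU of two boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(detections, iou_threshold=0.5):
    """Greedy NMS for the detections of a single class.

    detections: list of (score, box) tuples.
    iou_threshold: hypothetical value; the slides do not specify one.
    """
    # Process boxes from highest to lowest score.
    remaining = sorted(detections, key=lambda d: d[0], reverse=True)
    kept = []
    while remaining:
        best = remaining.pop(0)
        kept.append(best)
        # Drop every remaining box that overlaps the kept box too much.
        remaining = [d for d in remaining if iou(best[1], d[1]) < iou_threshold]
    return kept


dets = [(0.9, (0, 0, 10, 10)), (0.8, (1, 1, 11, 11)), (0.7, (20, 20, 30, 30))]
print(non_max_suppression(dets))
```

In the example, the second box overlaps the first with an IOU of about 0.68 and is suppressed, while the distant third box is kept.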
  • 17. Limitations of YOLO ● Struggles with small objects ● Struggles to generalise to objects with new or unusual aspect ratios ● The loss function treats errors the same in small and large bounding boxes
  • 18. Comparison with Other Real-Time Systems: ● DPM: disjoint pipeline (sliding window, extract features, classify, predict BB); YOLO does these concurrently ● R-CNN: region proposals, complex pipeline (predict BB, extract features, non-max suppression), over 40 sec per image with about 2000 BB; YOLO predicts 98 BB ● Deep MultiBox: CNN-based, cannot perform general object detection ● OverFeat: CNN-based, disjoint system, no global context ● MultiGrasp: similar in design to YOLO, but only needs to find a single region
  • 19. Experiments ● PASCAL VOC 2007 ● Real-time: ○ YOLO vs. DPM at 30 Hz
  • 20. VOC 2007 Error Analysis
  • 21. Combining Fast R-CNN and YOLO ● YOLO makes fewer background mistakes than Fast R-CNN ● This combination doesn't benefit from the speed of YOLO, since each model is run separately and the results are then combined.
  • 22. VOC 2012 Results ● YOLO struggles with small objects (bottle, sheep, tv/monitor) ● Fast R-CNN + YOLO: one of the highest-performing detection methods
  • 23. Generalizability: Person Detection in Artwork ● YOLO has good performance on VOC 2007 ● Its AP degrades less than that of other methods when applied to artwork. ● Artwork and natural images are very different at the pixel level but similar in the size and shape of objects, so YOLO can still predict good bounding boxes and detections.
  • 25. Darknet (YOLO) Results on random images