SlideShare a Scribd company logo
1 of 97
Download to read offline
You Only Look Once:
Unified, Real-Time Object Detection (2016)
Taegyun Jeon
Redmon, Joseph, et al. "You only look once: Unified, real-time object detection." 

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016.
Slide courtesy of DeepSystem.io “YOLO: You only look once Review”
Evaluation on VOC2007
Main Concept
• Object Detection
• Regression problem
• YOLO
• Only One Feedforward
• Global context
• Unified (Real-time detection)
• YOLO: 45 FPS
• Fast YOLO: 155 FPS
• General representation
• Robust on various background
• Other domain
(Real-time system: https://cs.stackexchange.com/questions/56569/what-is-real-time-in-a-computer-vision-context)
Object Detection as Regression Problem
• Previous: Repurpose classifiers to perform detection
• Deformable Parts Models (DPM)
• Sliding window
• R-CNN based methods
• 1) generate potential bounding boxes.
• 2) run classifiers on these proposed boxes
• 3) post-processing (refinement, elimination, rescore)
Object Detection as Regression Problem
• YOLO: Single Regression Problem
• Image → bounding box coordinate and class probability.
• Extremely Fast
• Global reasoning
• Generalizable representation
Unified Detection
• All BBox, All classes
1) Image → S x S grids
2) grid cell 

→ B: BBoxes and Confidence score
→ C: class probabilities w.r.t #classes
x, y, w, h, confidence
Appendix: Intersection over Union (IOU)
Unified Detection
• Predict one set of class probabilities 

per grid cell, regardless of the number 

of boxes B.
• At test time, 

individual box confidence prediction
Network Design: YOLO
• Modified GoogLeNet
• 1x1 reduction layer (“Network in Network”)
Appendix: GoogLeNet
1 1 4 10 6 2 1 1
24
2
Conv. layer
Fully conn. Layer
Conv. layer
3x3x512
Maxpool Layer
2x2-s-2
Conv. layer
3x3x1024
Maxpool Layer
2x2-s-2
Conv. layer
3x3x1024
Conv. Layer
3x3x1024-s-2
Conv. layer
3x3x1024
Conv. Layer
3x3x1024
Network Design: YOLO-tiny (9 Conv. Layers)
Conv. layer
3x3x16
Maxpool Layer
2x2-s-2
Conv. layer
3x3x32
Maxpool Layer
2x2-s-2
Conv. layer
3x3x64
Maxpool Layer
2x2-s-2
Conv. layer
3x3x128
Maxpool Layer
2x2-s-2
Conv. layer
3x3x256
Maxpool Layer
2x2-s-2
Conv. layer
3x3x512
Maxpool Layer
2x2-s-2
Conv. layer
3x3x1024
Conv. layer
3x3x1024
Conv. layer
3x3x1024
9
2
Conv. layer
Fully conn. Layer
1
1
1
1
1
1
1 1 1
YOLO-Tiny
YOLO
Training
1) Pretrain with ImageNet 1000-class 

competition dataset
20Conv. layer
Feature Extractor Object Classifier
Training
2) “Network on Convolutional Feature Maps”
Increased input resolution (224x224)
→(448x448)
Appendix: Networks on Convolutional Feature Maps
20Conv. layer
4
2
Conv. layer
Fully conn. Layer
Feature Extractor Object Classifier
Inference
Loss Function (sum-squared error)
Appendix: sum-squared error
Loss Function (sum-squared error)
Loss Function (sum-squared error)
Re
If object appears in cell i
The jth bbox predictor in cell i is 

“responsible” for that prediction
Train strategy
epochs=135
batch_size=64
momentum_a = 0.9
decay=0.0005
lr=[10-3, 10-2, 10-3, 10-4]
dropout_rate=0.5
augmentation

=[scaling, translation, exposure, saturation]
Inference
Just like in training?
S=7, B=2 for Pascal VOC
Limitation of YOLO
• Group of small objects
• Unusual aspect ratios
• Coarse feature
• Localization error of bounding box
Comparison to Other Detection System
Comparison to Other Real-Time System
VOC Error
Combining Fast R-CNN and YOLO
VOC 2012 Leaderboard
Generalizability: Person Detection in Artwork
Generalizability: Person Detection in Artwork
Summary
Appendix | Implementation
• YOLO (darknet): https://pjreddie.com/darknet/yolov1/
• YOLOv2 (darknet): https://pjreddie.com/darknet/yolo/
• YOLO (caffe): https://github.com/xingwangsfu/caffe-yolo
• YOLO (TensorFlow: Train+Test): https://github.com/thtrieu/darkflow
• YOLO (TensorFlow: Test): https://github.com/gliese581gg/YOLO_tensorflow
Appendix | Slides
• DeepSense.io (google presentation - "YOLO: Inference")
Thanks
fb.com/taegyun.jeon
github.com/tgjeon
taylor.taegyun.jeon@gmail.com
Paper reviewed by 

Taegyun Jeon
Appendix | Intersection over Union (IoU)
• IoU(pred, truth)=[0, 1]
Appendix | GoogLeNet
Szegedy, Christian, et al. "Going deeper with convolutions." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015.
Appendix | GoogLeNet
Inception Module (1x1 convolution for dimension reductions)
Appendix | Networks on Convolutional Feature Maps
Previous: FC
Proposed : Conv + FC
Ren, Shaoqing, et al. "Object detection networks on convolutional feature maps." IEEE Transactions on Pattern Analysis and Machine Intelligence (2016).
Appendix | Sum-squared error (SSE)
sum of squared errors of prediction (SSE), is the sum of the
squares of residuals (deviations predicted from actual empirical values
of data). It is a measure of the discrepancy between the data and an
estimation model. A small RSS indicates a tight fit of the model to the
data. It is used as an optimality criterion in parameter selection and
model selection.
https://en.wikipedia.org/wiki/Residual_sum_of_squares

More Related Content

What's hot

Object detection with deep learning
Object detection with deep learningObject detection with deep learning
Object detection with deep learningSushant Shrivastava
 
Object Detection using Deep Neural Networks
Object Detection using Deep Neural NetworksObject Detection using Deep Neural Networks
Object Detection using Deep Neural NetworksUsman Qayyum
 
PR-207: YOLOv3: An Incremental Improvement
PR-207: YOLOv3: An Incremental ImprovementPR-207: YOLOv3: An Incremental Improvement
PR-207: YOLOv3: An Incremental ImprovementJinwon Lee
 
Object detection presentation
Object detection presentationObject detection presentation
Object detection presentationAshwinBicholiya
 
PR-132: SSD: Single Shot MultiBox Detector
PR-132: SSD: Single Shot MultiBox DetectorPR-132: SSD: Single Shot MultiBox Detector
PR-132: SSD: Single Shot MultiBox DetectorJinwon Lee
 
Real-time object detection coz YOLO!
Real-time object detection coz YOLO!Real-time object detection coz YOLO!
Real-time object detection coz YOLO!J On The Beach
 
Deep learning for object detection
Deep learning for object detectionDeep learning for object detection
Deep learning for object detectionWenjing Chen
 
GAN - Theory and Applications
GAN - Theory and ApplicationsGAN - Theory and Applications
GAN - Theory and ApplicationsEmanuele Ghelfi
 
yolov3-4-5.pdf
yolov3-4-5.pdfyolov3-4-5.pdf
yolov3-4-5.pdfNguynnhLm4
 
Multi Object Tracking | Presentation 1 | ID 103001
Multi Object Tracking | Presentation 1 | ID 103001Multi Object Tracking | Presentation 1 | ID 103001
Multi Object Tracking | Presentation 1 | ID 103001Md. Minhazul Haque
 
CNN and its applications by ketaki
CNN and its applications by ketakiCNN and its applications by ketaki
CNN and its applications by ketakiKetaki Patwari
 
Convolutional Neural Networks (CNN)
Convolutional Neural Networks (CNN)Convolutional Neural Networks (CNN)
Convolutional Neural Networks (CNN)Gaurav Mittal
 
Object detection and Instance Segmentation
Object detection and Instance SegmentationObject detection and Instance Segmentation
Object detection and Instance SegmentationHichem Felouat
 

What's hot (20)

Yolov3
Yolov3Yolov3
Yolov3
 
Object detection with deep learning
Object detection with deep learningObject detection with deep learning
Object detection with deep learning
 
Object Detection using Deep Neural Networks
Object Detection using Deep Neural NetworksObject Detection using Deep Neural Networks
Object Detection using Deep Neural Networks
 
Yolo
YoloYolo
Yolo
 
PR-207: YOLOv3: An Incremental Improvement
PR-207: YOLOv3: An Incremental ImprovementPR-207: YOLOv3: An Incremental Improvement
PR-207: YOLOv3: An Incremental Improvement
 
Object detection presentation
Object detection presentationObject detection presentation
Object detection presentation
 
PR-132: SSD: Single Shot MultiBox Detector
PR-132: SSD: Single Shot MultiBox DetectorPR-132: SSD: Single Shot MultiBox Detector
PR-132: SSD: Single Shot MultiBox Detector
 
Object detection
Object detectionObject detection
Object detection
 
Real-time object detection coz YOLO!
Real-time object detection coz YOLO!Real-time object detection coz YOLO!
Real-time object detection coz YOLO!
 
Object detection
Object detectionObject detection
Object detection
 
Object Detection and Ship Classification Using YOLOv5
Object Detection and Ship Classification Using YOLOv5Object Detection and Ship Classification Using YOLOv5
Object Detection and Ship Classification Using YOLOv5
 
Deep learning for object detection
Deep learning for object detectionDeep learning for object detection
Deep learning for object detection
 
GAN - Theory and Applications
GAN - Theory and ApplicationsGAN - Theory and Applications
GAN - Theory and Applications
 
YOLO
YOLOYOLO
YOLO
 
yolov3-4-5.pdf
yolov3-4-5.pdfyolov3-4-5.pdf
yolov3-4-5.pdf
 
Multi Object Tracking | Presentation 1 | ID 103001
Multi Object Tracking | Presentation 1 | ID 103001Multi Object Tracking | Presentation 1 | ID 103001
Multi Object Tracking | Presentation 1 | ID 103001
 
Cnn
CnnCnn
Cnn
 
CNN and its applications by ketaki
CNN and its applications by ketakiCNN and its applications by ketaki
CNN and its applications by ketaki
 
Convolutional Neural Networks (CNN)
Convolutional Neural Networks (CNN)Convolutional Neural Networks (CNN)
Convolutional Neural Networks (CNN)
 
Object detection and Instance Segmentation
Object detection and Instance SegmentationObject detection and Instance Segmentation
Object detection and Instance Segmentation
 

Similar to [PR12] You Only Look Once (YOLO): Unified Real-Time Object Detection

Lecture 2.B: Computer Vision Applications - Full Stack Deep Learning - Spring...
Lecture 2.B: Computer Vision Applications - Full Stack Deep Learning - Spring...Lecture 2.B: Computer Vision Applications - Full Stack Deep Learning - Spring...
Lecture 2.B: Computer Vision Applications - Full Stack Deep Learning - Spring...Sergey Karayev
 
ppt - of a project will help you on your college projects
ppt - of a project will help you on your college projectsppt - of a project will help you on your college projects
ppt - of a project will help you on your college projectsvikaspandey0702
 
IISc Internship Report
IISc Internship ReportIISc Internship Report
IISc Internship ReportHarshilJain26
 
Anatomy of YOLO - v1
Anatomy of YOLO - v1Anatomy of YOLO - v1
Anatomy of YOLO - v1Jihoon Song
 
Groovy to infinity and beyond - SpringOne2GX - 2010 - Guillaume Laforge
Groovy to infinity and beyond - SpringOne2GX - 2010 - Guillaume LaforgeGroovy to infinity and beyond - SpringOne2GX - 2010 - Guillaume Laforge
Groovy to infinity and beyond - SpringOne2GX - 2010 - Guillaume LaforgeGuillaume Laforge
 
Procedural Content Generation with Clojure
Procedural Content Generation with ClojureProcedural Content Generation with Clojure
Procedural Content Generation with ClojureMike Anderson
 
Artificial Intelligence, Machine Learning and Deep Learning
Artificial Intelligence, Machine Learning and Deep LearningArtificial Intelligence, Machine Learning and Deep Learning
Artificial Intelligence, Machine Learning and Deep LearningSujit Pal
 
“Understanding DNN-Based Object Detectors,” a Presentation from Au-Zone Techn...
“Understanding DNN-Based Object Detectors,” a Presentation from Au-Zone Techn...“Understanding DNN-Based Object Detectors,” a Presentation from Au-Zone Techn...
“Understanding DNN-Based Object Detectors,” a Presentation from Au-Zone Techn...Edge AI and Vision Alliance
 
Yolo v2 ai_tech_20190421
Yolo v2 ai_tech_20190421Yolo v2 ai_tech_20190421
Yolo v2 ai_tech_20190421穗碧 陳
 
[系列活動] 一日搞懂生成式對抗網路
[系列活動] 一日搞懂生成式對抗網路[系列活動] 一日搞懂生成式對抗網路
[系列活動] 一日搞懂生成式對抗網路台灣資料科學年會
 
#6 PyData Warsaw: Deep learning for image segmentation
#6 PyData Warsaw: Deep learning for image segmentation#6 PyData Warsaw: Deep learning for image segmentation
#6 PyData Warsaw: Deep learning for image segmentationMatthew Opala
 
Testing multithreaded java applications for synchronization problems
Testing multithreaded java applications for synchronization problemsTesting multithreaded java applications for synchronization problems
Testing multithreaded java applications for synchronization problemsVassil Popovski
 
#10 pydata warsaw object detection with dn ns
#10   pydata warsaw object detection with dn ns#10   pydata warsaw object detection with dn ns
#10 pydata warsaw object detection with dn nsAndrew Brozek
 
[NDC2017] 딥러닝으로 게임 콘텐츠 제작하기 - VAE를 이용한 콘텐츠 생성 기법 연구 사례
[NDC2017] 딥러닝으로 게임 콘텐츠 제작하기 - VAE를 이용한 콘텐츠 생성 기법 연구 사례[NDC2017] 딥러닝으로 게임 콘텐츠 제작하기 - VAE를 이용한 콘텐츠 생성 기법 연구 사례
[NDC2017] 딥러닝으로 게임 콘텐츠 제작하기 - VAE를 이용한 콘텐츠 생성 기법 연구 사례Hwanhee Kim
 
A brief introduction to recent segmentation methods
A brief introduction to recent segmentation methodsA brief introduction to recent segmentation methods
A brief introduction to recent segmentation methodsShunta Saito
 
YOLO9000 - PR023
YOLO9000 - PR023YOLO9000 - PR023
YOLO9000 - PR023Jinwon Lee
 
Code Search Based on Deep Neural Network and Code Mutation
Code Search Based on Deep Neural Network and Code MutationCode Search Based on Deep Neural Network and Code Mutation
Code Search Based on Deep Neural Network and Code MutationNorihiro Yoshida
 
Lego For Engineers - Dependency Injection for LIDNUG (2011-06-03)
Lego For Engineers - Dependency Injection for LIDNUG (2011-06-03)Lego For Engineers - Dependency Injection for LIDNUG (2011-06-03)
Lego For Engineers - Dependency Injection for LIDNUG (2011-06-03)Theo Jungeblut
 

Similar to [PR12] You Only Look Once (YOLO): Unified Real-Time Object Detection (20)

Lecture 2.B: Computer Vision Applications - Full Stack Deep Learning - Spring...
Lecture 2.B: Computer Vision Applications - Full Stack Deep Learning - Spring...Lecture 2.B: Computer Vision Applications - Full Stack Deep Learning - Spring...
Lecture 2.B: Computer Vision Applications - Full Stack Deep Learning - Spring...
 
Yolo releases gianmaria
Yolo releases gianmariaYolo releases gianmaria
Yolo releases gianmaria
 
ppt - of a project will help you on your college projects
ppt - of a project will help you on your college projectsppt - of a project will help you on your college projects
ppt - of a project will help you on your college projects
 
IISc Internship Report
IISc Internship ReportIISc Internship Report
IISc Internship Report
 
Anatomy of YOLO - v1
Anatomy of YOLO - v1Anatomy of YOLO - v1
Anatomy of YOLO - v1
 
Groovy to infinity and beyond - SpringOne2GX - 2010 - Guillaume Laforge
Groovy to infinity and beyond - SpringOne2GX - 2010 - Guillaume LaforgeGroovy to infinity and beyond - SpringOne2GX - 2010 - Guillaume Laforge
Groovy to infinity and beyond - SpringOne2GX - 2010 - Guillaume Laforge
 
Procedural Content Generation with Clojure
Procedural Content Generation with ClojureProcedural Content Generation with Clojure
Procedural Content Generation with Clojure
 
Sync considered unethical
Sync considered unethicalSync considered unethical
Sync considered unethical
 
Artificial Intelligence, Machine Learning and Deep Learning
Artificial Intelligence, Machine Learning and Deep LearningArtificial Intelligence, Machine Learning and Deep Learning
Artificial Intelligence, Machine Learning and Deep Learning
 
“Understanding DNN-Based Object Detectors,” a Presentation from Au-Zone Techn...
“Understanding DNN-Based Object Detectors,” a Presentation from Au-Zone Techn...“Understanding DNN-Based Object Detectors,” a Presentation from Au-Zone Techn...
“Understanding DNN-Based Object Detectors,” a Presentation from Au-Zone Techn...
 
Yolo v2 ai_tech_20190421
Yolo v2 ai_tech_20190421Yolo v2 ai_tech_20190421
Yolo v2 ai_tech_20190421
 
[系列活動] 一日搞懂生成式對抗網路
[系列活動] 一日搞懂生成式對抗網路[系列活動] 一日搞懂生成式對抗網路
[系列活動] 一日搞懂生成式對抗網路
 
#6 PyData Warsaw: Deep learning for image segmentation
#6 PyData Warsaw: Deep learning for image segmentation#6 PyData Warsaw: Deep learning for image segmentation
#6 PyData Warsaw: Deep learning for image segmentation
 
Testing multithreaded java applications for synchronization problems
Testing multithreaded java applications for synchronization problemsTesting multithreaded java applications for synchronization problems
Testing multithreaded java applications for synchronization problems
 
#10 pydata warsaw object detection with dn ns
#10   pydata warsaw object detection with dn ns#10   pydata warsaw object detection with dn ns
#10 pydata warsaw object detection with dn ns
 
[NDC2017] 딥러닝으로 게임 콘텐츠 제작하기 - VAE를 이용한 콘텐츠 생성 기법 연구 사례
[NDC2017] 딥러닝으로 게임 콘텐츠 제작하기 - VAE를 이용한 콘텐츠 생성 기법 연구 사례[NDC2017] 딥러닝으로 게임 콘텐츠 제작하기 - VAE를 이용한 콘텐츠 생성 기법 연구 사례
[NDC2017] 딥러닝으로 게임 콘텐츠 제작하기 - VAE를 이용한 콘텐츠 생성 기법 연구 사례
 
A brief introduction to recent segmentation methods
A brief introduction to recent segmentation methodsA brief introduction to recent segmentation methods
A brief introduction to recent segmentation methods
 
YOLO9000 - PR023
YOLO9000 - PR023YOLO9000 - PR023
YOLO9000 - PR023
 
Code Search Based on Deep Neural Network and Code Mutation
Code Search Based on Deep Neural Network and Code MutationCode Search Based on Deep Neural Network and Code Mutation
Code Search Based on Deep Neural Network and Code Mutation
 
Lego For Engineers - Dependency Injection for LIDNUG (2011-06-03)
Lego For Engineers - Dependency Injection for LIDNUG (2011-06-03)Lego For Engineers - Dependency Injection for LIDNUG (2011-06-03)
Lego For Engineers - Dependency Injection for LIDNUG (2011-06-03)
 

More from Taegyun Jeon

TensorFlow-KR 3rd meetup - Lightning Talk for SI Analytics
TensorFlow-KR 3rd meetup - Lightning Talk for SI AnalyticsTensorFlow-KR 3rd meetup - Lightning Talk for SI Analytics
TensorFlow-KR 3rd meetup - Lightning Talk for SI AnalyticsTaegyun Jeon
 
TensorFlow Dev Summit 2018 Extended: TensorFlow Eager Execution
TensorFlow Dev Summit 2018 Extended: TensorFlow Eager ExecutionTensorFlow Dev Summit 2018 Extended: TensorFlow Eager Execution
TensorFlow Dev Summit 2018 Extended: TensorFlow Eager ExecutionTaegyun Jeon
 
[OSGeo-KR Tech Workshop] Deep Learning for Single Image Super-Resolution
[OSGeo-KR Tech Workshop] Deep Learning for Single Image Super-Resolution[OSGeo-KR Tech Workshop] Deep Learning for Single Image Super-Resolution
[OSGeo-KR Tech Workshop] Deep Learning for Single Image Super-ResolutionTaegyun Jeon
 
[PR12] PR-063: Peephole predicting network performance before training
[PR12] PR-063: Peephole predicting network performance before training[PR12] PR-063: Peephole predicting network performance before training
[PR12] PR-063: Peephole predicting network performance before trainingTaegyun Jeon
 
GDG DevFest Xiamen 2017
GDG DevFest Xiamen 2017GDG DevFest Xiamen 2017
GDG DevFest Xiamen 2017Taegyun Jeon
 
[PR12] PR-050: Convolutional LSTM Network: A Machine Learning Approach for Pr...
[PR12] PR-050: Convolutional LSTM Network: A Machine Learning Approach for Pr...[PR12] PR-050: Convolutional LSTM Network: A Machine Learning Approach for Pr...
[PR12] PR-050: Convolutional LSTM Network: A Machine Learning Approach for Pr...Taegyun Jeon
 
GDG DevFest Seoul 2017: Codelab - Time Series Analysis for Kaggle using Tenso...
GDG DevFest Seoul 2017: Codelab - Time Series Analysis for Kaggle using Tenso...GDG DevFest Seoul 2017: Codelab - Time Series Analysis for Kaggle using Tenso...
GDG DevFest Seoul 2017: Codelab - Time Series Analysis for Kaggle using Tenso...Taegyun Jeon
 
[PR12] PR-036 Learning to Remember Rare Events
[PR12] PR-036 Learning to Remember Rare Events[PR12] PR-036 Learning to Remember Rare Events
[PR12] PR-036 Learning to Remember Rare EventsTaegyun Jeon
 
[대전AI포럼] 위성영상 분석 기술 개발 현황 소개
[대전AI포럼] 위성영상 분석 기술 개발 현황 소개[대전AI포럼] 위성영상 분석 기술 개발 현황 소개
[대전AI포럼] 위성영상 분석 기술 개발 현황 소개Taegyun Jeon
 
[PR12] PR-026: Notes for CVPR Machine Learning Sessions
[PR12] PR-026: Notes for CVPR Machine Learning Sessions[PR12] PR-026: Notes for CVPR Machine Learning Sessions
[PR12] PR-026: Notes for CVPR Machine Learning SessionsTaegyun Jeon
 
[PR12] image super resolution using deep convolutional networks
[PR12] image super resolution using deep convolutional networks[PR12] image super resolution using deep convolutional networks
[PR12] image super resolution using deep convolutional networksTaegyun Jeon
 
Google Dev Summit Extended Seoul - TensorFlow: Tensorboard & Keras
Google Dev Summit Extended Seoul - TensorFlow: Tensorboard & KerasGoogle Dev Summit Extended Seoul - TensorFlow: Tensorboard & Keras
Google Dev Summit Extended Seoul - TensorFlow: Tensorboard & KerasTaegyun Jeon
 
TensorFlow KR 2nd Meetup - Lightening talk (Satrec Initiative)
TensorFlow KR 2nd Meetup - Lightening talk (Satrec Initiative)TensorFlow KR 2nd Meetup - Lightening talk (Satrec Initiative)
TensorFlow KR 2nd Meetup - Lightening talk (Satrec Initiative)Taegyun Jeon
 
인공지능: 변화와 능력개발
인공지능: 변화와 능력개발인공지능: 변화와 능력개발
인공지능: 변화와 능력개발Taegyun Jeon
 
Electricity price forecasting with Recurrent Neural Networks
Electricity price forecasting with Recurrent Neural NetworksElectricity price forecasting with Recurrent Neural Networks
Electricity price forecasting with Recurrent Neural NetworksTaegyun Jeon
 

More from Taegyun Jeon (15)

TensorFlow-KR 3rd meetup - Lightning Talk for SI Analytics
TensorFlow-KR 3rd meetup - Lightning Talk for SI AnalyticsTensorFlow-KR 3rd meetup - Lightning Talk for SI Analytics
TensorFlow-KR 3rd meetup - Lightning Talk for SI Analytics
 
TensorFlow Dev Summit 2018 Extended: TensorFlow Eager Execution
TensorFlow Dev Summit 2018 Extended: TensorFlow Eager ExecutionTensorFlow Dev Summit 2018 Extended: TensorFlow Eager Execution
TensorFlow Dev Summit 2018 Extended: TensorFlow Eager Execution
 
[OSGeo-KR Tech Workshop] Deep Learning for Single Image Super-Resolution
[OSGeo-KR Tech Workshop] Deep Learning for Single Image Super-Resolution[OSGeo-KR Tech Workshop] Deep Learning for Single Image Super-Resolution
[OSGeo-KR Tech Workshop] Deep Learning for Single Image Super-Resolution
 
[PR12] PR-063: Peephole predicting network performance before training
[PR12] PR-063: Peephole predicting network performance before training[PR12] PR-063: Peephole predicting network performance before training
[PR12] PR-063: Peephole predicting network performance before training
 
GDG DevFest Xiamen 2017
GDG DevFest Xiamen 2017GDG DevFest Xiamen 2017
GDG DevFest Xiamen 2017
 
[PR12] PR-050: Convolutional LSTM Network: A Machine Learning Approach for Pr...
[PR12] PR-050: Convolutional LSTM Network: A Machine Learning Approach for Pr...[PR12] PR-050: Convolutional LSTM Network: A Machine Learning Approach for Pr...
[PR12] PR-050: Convolutional LSTM Network: A Machine Learning Approach for Pr...
 
GDG DevFest Seoul 2017: Codelab - Time Series Analysis for Kaggle using Tenso...
GDG DevFest Seoul 2017: Codelab - Time Series Analysis for Kaggle using Tenso...GDG DevFest Seoul 2017: Codelab - Time Series Analysis for Kaggle using Tenso...
GDG DevFest Seoul 2017: Codelab - Time Series Analysis for Kaggle using Tenso...
 
[PR12] PR-036 Learning to Remember Rare Events
[PR12] PR-036 Learning to Remember Rare Events[PR12] PR-036 Learning to Remember Rare Events
[PR12] PR-036 Learning to Remember Rare Events
 
[대전AI포럼] 위성영상 분석 기술 개발 현황 소개
[대전AI포럼] 위성영상 분석 기술 개발 현황 소개[대전AI포럼] 위성영상 분석 기술 개발 현황 소개
[대전AI포럼] 위성영상 분석 기술 개발 현황 소개
 
[PR12] PR-026: Notes for CVPR Machine Learning Sessions
[PR12] PR-026: Notes for CVPR Machine Learning Sessions[PR12] PR-026: Notes for CVPR Machine Learning Sessions
[PR12] PR-026: Notes for CVPR Machine Learning Sessions
 
[PR12] image super resolution using deep convolutional networks
[PR12] image super resolution using deep convolutional networks[PR12] image super resolution using deep convolutional networks
[PR12] image super resolution using deep convolutional networks
 
Google Dev Summit Extended Seoul - TensorFlow: Tensorboard & Keras
Google Dev Summit Extended Seoul - TensorFlow: Tensorboard & KerasGoogle Dev Summit Extended Seoul - TensorFlow: Tensorboard & Keras
Google Dev Summit Extended Seoul - TensorFlow: Tensorboard & Keras
 
TensorFlow KR 2nd Meetup - Lightening talk (Satrec Initiative)
TensorFlow KR 2nd Meetup - Lightening talk (Satrec Initiative)TensorFlow KR 2nd Meetup - Lightening talk (Satrec Initiative)
TensorFlow KR 2nd Meetup - Lightening talk (Satrec Initiative)
 
인공지능: 변화와 능력개발
인공지능: 변화와 능력개발인공지능: 변화와 능력개발
인공지능: 변화와 능력개발
 
Electricity price forecasting with Recurrent Neural Networks
Electricity price forecasting with Recurrent Neural NetworksElectricity price forecasting with Recurrent Neural Networks
Electricity price forecasting with Recurrent Neural Networks
 

Recently uploaded

Adaptive Restore algorithm & importance Monte Carlo
Adaptive Restore algorithm & importance Monte CarloAdaptive Restore algorithm & importance Monte Carlo
Adaptive Restore algorithm & importance Monte CarloChristian Robert
 
Terpineol and it's characterization pptx
Terpineol and it's characterization pptxTerpineol and it's characterization pptx
Terpineol and it's characterization pptxMuhammadRazzaq31
 
ANATOMY OF DICOT AND MONOCOT LEAVES.pptx
ANATOMY OF DICOT AND MONOCOT LEAVES.pptxANATOMY OF DICOT AND MONOCOT LEAVES.pptx
ANATOMY OF DICOT AND MONOCOT LEAVES.pptxRASHMI M G
 
Warming the earth and the atmosphere.pptx
Warming the earth and the atmosphere.pptxWarming the earth and the atmosphere.pptx
Warming the earth and the atmosphere.pptxGlendelCaroz
 
Introduction and significance of Symbiotic algae
Introduction and significance of  Symbiotic algaeIntroduction and significance of  Symbiotic algae
Introduction and significance of Symbiotic algaekushbuR
 
dkNET Webinar: The 4DN Data Portal - Data, Resources and Tools to Help Elucid...
dkNET Webinar: The 4DN Data Portal - Data, Resources and Tools to Help Elucid...dkNET Webinar: The 4DN Data Portal - Data, Resources and Tools to Help Elucid...
dkNET Webinar: The 4DN Data Portal - Data, Resources and Tools to Help Elucid...dkNET
 
FORENSIC CHEMISTRY ARSON INVESTIGATION.pdf
FORENSIC CHEMISTRY ARSON INVESTIGATION.pdfFORENSIC CHEMISTRY ARSON INVESTIGATION.pdf
FORENSIC CHEMISTRY ARSON INVESTIGATION.pdfSuchita Rawat
 
Efficient spin-up of Earth System Models usingsequence acceleration
Efficient spin-up of Earth System Models usingsequence accelerationEfficient spin-up of Earth System Models usingsequence acceleration
Efficient spin-up of Earth System Models usingsequence accelerationSérgio Sacani
 
GBSN - Microbiology (Unit 5) Concept of isolation
GBSN - Microbiology (Unit 5) Concept of isolationGBSN - Microbiology (Unit 5) Concept of isolation
GBSN - Microbiology (Unit 5) Concept of isolationAreesha Ahmad
 
Factor Causing low production and physiology of mamary Gland
Factor Causing low production and physiology of mamary GlandFactor Causing low production and physiology of mamary Gland
Factor Causing low production and physiology of mamary GlandRcvets
 
Costs to heap leach gold ore tailings in Karamoja region of Uganda
Costs to heap leach gold ore tailings in Karamoja region of UgandaCosts to heap leach gold ore tailings in Karamoja region of Uganda
Costs to heap leach gold ore tailings in Karamoja region of UgandaTimothyOkuna
 
NuGOweek 2024 programme final FLYER short.pdf
NuGOweek 2024 programme final FLYER short.pdfNuGOweek 2024 programme final FLYER short.pdf
NuGOweek 2024 programme final FLYER short.pdfpablovgd
 
Harry Coumnas Thinks That Human Teleportation is Possible in Quantum Mechanic...
Harry Coumnas Thinks That Human Teleportation is Possible in Quantum Mechanic...Harry Coumnas Thinks That Human Teleportation is Possible in Quantum Mechanic...
Harry Coumnas Thinks That Human Teleportation is Possible in Quantum Mechanic...kevin8smith
 
GBSN - Biochemistry (Unit 3) Metabolism
GBSN - Biochemistry (Unit 3) MetabolismGBSN - Biochemistry (Unit 3) Metabolism
GBSN - Biochemistry (Unit 3) MetabolismAreesha Ahmad
 
GBSN - Biochemistry (Unit 8) Enzymology
GBSN - Biochemistry (Unit 8) EnzymologyGBSN - Biochemistry (Unit 8) Enzymology
GBSN - Biochemistry (Unit 8) EnzymologyAreesha Ahmad
 
Manganese‐RichSandstonesasanIndicatorofAncientOxic LakeWaterConditionsinGale...
Manganese‐RichSandstonesasanIndicatorofAncientOxic  LakeWaterConditionsinGale...Manganese‐RichSandstonesasanIndicatorofAncientOxic  LakeWaterConditionsinGale...
Manganese‐RichSandstonesasanIndicatorofAncientOxic LakeWaterConditionsinGale...Sérgio Sacani
 
Molecular and Cellular Mechanism of Action of Hormones such as Growth Hormone...
Molecular and Cellular Mechanism of Action of Hormones such as Growth Hormone...Molecular and Cellular Mechanism of Action of Hormones such as Growth Hormone...
Molecular and Cellular Mechanism of Action of Hormones such as Growth Hormone...Ansari Aashif Raza Mohd Imtiyaz
 
Soil and Water Conservation Engineering (SWCE) is a specialized field of stud...
Soil and Water Conservation Engineering (SWCE) is a specialized field of stud...Soil and Water Conservation Engineering (SWCE) is a specialized field of stud...
Soil and Water Conservation Engineering (SWCE) is a specialized field of stud...yogeshlabana357357
 
Information science research with large language models: between science and ...
Information science research with large language models: between science and ...Information science research with large language models: between science and ...
Information science research with large language models: between science and ...Fabiano Dalpiaz
 
A Scientific PowerPoint on Albert Einstein
A Scientific PowerPoint on Albert EinsteinA Scientific PowerPoint on Albert Einstein
A Scientific PowerPoint on Albert Einsteinxgamestudios8
 

Recently uploaded (20)

Adaptive Restore algorithm & importance Monte Carlo
Adaptive Restore algorithm & importance Monte CarloAdaptive Restore algorithm & importance Monte Carlo
Adaptive Restore algorithm & importance Monte Carlo
 
Terpineol and it's characterization pptx
Terpineol and it's characterization pptxTerpineol and it's characterization pptx
Terpineol and it's characterization pptx
 
ANATOMY OF DICOT AND MONOCOT LEAVES.pptx
ANATOMY OF DICOT AND MONOCOT LEAVES.pptxANATOMY OF DICOT AND MONOCOT LEAVES.pptx
ANATOMY OF DICOT AND MONOCOT LEAVES.pptx
 
Warming the earth and the atmosphere.pptx
Warming the earth and the atmosphere.pptxWarming the earth and the atmosphere.pptx
Warming the earth and the atmosphere.pptx
 
Introduction and significance of Symbiotic algae
Introduction and significance of  Symbiotic algaeIntroduction and significance of  Symbiotic algae
Introduction and significance of Symbiotic algae
 
dkNET Webinar: The 4DN Data Portal - Data, Resources and Tools to Help Elucid...
dkNET Webinar: The 4DN Data Portal - Data, Resources and Tools to Help Elucid...dkNET Webinar: The 4DN Data Portal - Data, Resources and Tools to Help Elucid...
dkNET Webinar: The 4DN Data Portal - Data, Resources and Tools to Help Elucid...
 
FORENSIC CHEMISTRY ARSON INVESTIGATION.pdf
FORENSIC CHEMISTRY ARSON INVESTIGATION.pdfFORENSIC CHEMISTRY ARSON INVESTIGATION.pdf
FORENSIC CHEMISTRY ARSON INVESTIGATION.pdf
 
Efficient spin-up of Earth System Models usingsequence acceleration
Efficient spin-up of Earth System Models usingsequence accelerationEfficient spin-up of Earth System Models usingsequence acceleration
Efficient spin-up of Earth System Models usingsequence acceleration
 
GBSN - Microbiology (Unit 5) Concept of isolation
GBSN - Microbiology (Unit 5) Concept of isolationGBSN - Microbiology (Unit 5) Concept of isolation
GBSN - Microbiology (Unit 5) Concept of isolation
 
Factor Causing low production and physiology of mamary Gland
Factor Causing low production and physiology of mamary GlandFactor Causing low production and physiology of mamary Gland
Factor Causing low production and physiology of mamary Gland
 
Costs to heap leach gold ore tailings in Karamoja region of Uganda
Costs to heap leach gold ore tailings in Karamoja region of UgandaCosts to heap leach gold ore tailings in Karamoja region of Uganda
Costs to heap leach gold ore tailings in Karamoja region of Uganda
 
NuGOweek 2024 programme final FLYER short.pdf
NuGOweek 2024 programme final FLYER short.pdfNuGOweek 2024 programme final FLYER short.pdf
NuGOweek 2024 programme final FLYER short.pdf
 
Harry Coumnas Thinks That Human Teleportation is Possible in Quantum Mechanic...
Harry Coumnas Thinks That Human Teleportation is Possible in Quantum Mechanic...Harry Coumnas Thinks That Human Teleportation is Possible in Quantum Mechanic...
Harry Coumnas Thinks That Human Teleportation is Possible in Quantum Mechanic...
 
GBSN - Biochemistry (Unit 3) Metabolism
GBSN - Biochemistry (Unit 3) MetabolismGBSN - Biochemistry (Unit 3) Metabolism
GBSN - Biochemistry (Unit 3) Metabolism
 
GBSN - Biochemistry (Unit 8) Enzymology
GBSN - Biochemistry (Unit 8) EnzymologyGBSN - Biochemistry (Unit 8) Enzymology
GBSN - Biochemistry (Unit 8) Enzymology
 
Manganese‐RichSandstonesasanIndicatorofAncientOxic LakeWaterConditionsinGale...
Manganese‐RichSandstonesasanIndicatorofAncientOxic  LakeWaterConditionsinGale...Manganese‐RichSandstonesasanIndicatorofAncientOxic  LakeWaterConditionsinGale...
Manganese‐RichSandstonesasanIndicatorofAncientOxic LakeWaterConditionsinGale...
 
Molecular and Cellular Mechanism of Action of Hormones such as Growth Hormone...
Molecular and Cellular Mechanism of Action of Hormones such as Growth Hormone...Molecular and Cellular Mechanism of Action of Hormones such as Growth Hormone...
Molecular and Cellular Mechanism of Action of Hormones such as Growth Hormone...
 
Soil and Water Conservation Engineering (SWCE) is a specialized field of stud...
Soil and Water Conservation Engineering (SWCE) is a specialized field of stud...Soil and Water Conservation Engineering (SWCE) is a specialized field of stud...
Soil and Water Conservation Engineering (SWCE) is a specialized field of stud...
 
Information science research with large language models: between science and ...
Information science research with large language models: between science and ...Information science research with large language models: between science and ...
Information science research with large language models: between science and ...
 
A Scientific PowerPoint on Albert Einstein
A Scientific PowerPoint on Albert EinsteinA Scientific PowerPoint on Albert Einstein
A Scientific PowerPoint on Albert Einstein
 

[PR12] You Only Look Once (YOLO): Unified Real-Time Object Detection

  • 1. You Only Look Once: Unified, Real-Time Object Detection (2016) Taegyun Jeon Redmon, Joseph, et al. "You only look once: Unified, real-time object detection." 
 Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016.
  • 2. Slide courtesy of DeepSystem.io “YOLO: You only look once Review” Evaluation on VOC2007
  • 3. Main Concept • Object Detection • Regression problem • YOLO • Only One Feedforward • Global context • Unified (Real-time detection) • YOLO: 45 FPS • Fast YOLO: 155 FPS • General representation • Robust on various background • Other domain (Real-time system: https://cs.stackexchange.com/questions/56569/what-is-real-time-in-a-computer-vision-context)
  • 4. Object Detection as Regression Problem • Previous: Repurpose classifiers to perform detection • Deformable Parts Models (DPM) • Sliding window • R-CNN based methods • 1) generate potential bounding boxes. • 2) run classifiers on these proposed boxes • 3) post-processing (refinement, elimination, rescore)
  • 5. Object Detection as Regression Problem • YOLO: Single Regression Problem • Image → bounding box coordinate and class probability. • Extremely Fast • Global reasoning • Generalizable representation
  • 6. Unified Detection • All BBox, All classes 1) Image → S x S grids 2) grid cell 
 → B: BBoxes and Confidence score → C: class probabilities w.r.t #classes x, y, w, h, confidence Appendix: Intersection over Union (IOU)
  • 7. Unified Detection • Predict one set of class probabilities 
 per grid cell, regardless of the number 
 of boxes B. • At test time, 
 individual box confidence prediction
  • 8. Network Design: YOLO • Modified GoogLeNet • 1x1 reduction layer (“Network in Network”) Appendix: GoogLeNet 1 1 4 10 6 2 1 1 24 2 Conv. layer Fully conn. Layer Conv. layer 3x3x512 Maxpool Layer 2x2-s-2 Conv. layer 3x3x1024 Maxpool Layer 2x2-s-2 Conv. layer 3x3x1024 Conv. Layer 3x3x1024-s-2 Conv. layer 3x3x1024 Conv. Layer 3x3x1024
  • 9. Network Design: YOLO-tiny (9 Conv. Layers) Conv. layer 3x3x16 Maxpool Layer 2x2-s-2 Conv. layer 3x3x32 Maxpool Layer 2x2-s-2 Conv. layer 3x3x64 Maxpool Layer 2x2-s-2 Conv. layer 3x3x128 Maxpool Layer 2x2-s-2 Conv. layer 3x3x256 Maxpool Layer 2x2-s-2 Conv. layer 3x3x512 Maxpool Layer 2x2-s-2 Conv. layer 3x3x1024 Conv. layer 3x3x1024 Conv. layer 3x3x1024 9 2 Conv. layer Fully conn. Layer 1 1 1 1 1 1 1 1 1 YOLO-Tiny YOLO
  • 10. Training 1) Pretrain with ImageNet 1000-class 
 competition dataset 20Conv. layer Feature Extractor Object Classifier
  • 11. Training 2) “Network on Convolutional Feature Maps” Increased input resolution (224x224) →(448x448) Appendix: Networks on Convolutional Feature Maps 20Conv. layer 4 2 Conv. layer Fully conn. Layer Feature Extractor Object Classifier
  • 12.
  • 14.
  • 15.
  • 16. Loss Function (sum-squared error) Appendix: sum-squared error
  • 18. Loss Function (sum-squared error) Re If object appears in cell i The jth bbox predictor in cell i is 
 “responsible” for that prediction
  • 19. Train strategy epochs=135 batch_size=64 momentum_a = 0.9 decay=0.0005 lr=[10-3, 10-2, 10-3, 10-4] dropout_rate=0.5 augmentation
 =[scaling, translation, exposure, saturation]
  • 20. Inference Just like in training? S=7, B=2 for Pascal VOC
  • 21.
  • 22.
  • 23.
  • 24.
  • 25.
  • 26.
  • 27.
  • 28.
  • 29.
  • 30.
  • 31.
  • 32.
  • 33.
  • 34.
  • 35.
  • 36.
  • 37.
  • 38.
  • 39.
  • 40.
  • 41.
  • 42.
  • 43.
  • 44.
  • 45.
  • 46.
  • 47.
  • 48.
  • 49.
  • 50.
  • 51.
  • 52.
  • 53.
  • 54.
  • 55.
  • 56.
  • 57.
  • 58.
  • 59.
  • 60.
  • 61.
  • 62.
  • 63.
  • 64.
  • 65.
  • 66.
  • 67.
  • 68.
  • 69.
  • 70.
  • 71.
  • 72.
  • 73.
  • 74.
  • 75.
  • 76.
  • 77.
  • 78.
  • 79.
  • 80.
  • 81. Limitation of YOLO • Group of small objects • Unusual aspect ratios • Coarse feature • Localization error of bounding box
  • 82. Comparison to Other Detection System
  • 83. Comparison to Other Real-Time System
  • 90. Appendix | Implementation • YOLO (darknet): https://pjreddie.com/darknet/yolov1/ • YOLOv2 (darknet): https://pjreddie.com/darknet/yolo/ • YOLO (caffe): https://github.com/xingwangsfu/caffe-yolo • YOLO (TensorFlow: Train+Test): https://github.com/thtrieu/darkflow • YOLO (TensorFlow: Test): https://github.com/gliese581gg/YOLO_tensorflow
  • 91. Appendix | Slides • DeepSense.io (google presentation - "YOLO: Inference")
  • 93. Appendix | Intersection over Union (IoU) • IoU(pred, truth)=[0, 1]
  • 94. Appendix | GoogLeNet Szegedy, Christian, et al. "Going deeper with convolutions." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015.
  • 95. Appendix | GoogLeNet Inception Module (1x1 convolution for dimension reductions)
  • 96. Appendix | Networks on Convolutional Feature Maps Previous: FC Proposed : Conv + FC Ren, Shaoqing, et al. "Object detection networks on convolutional feature maps." IEEE Transactions on Pattern Analysis and Machine Intelligence (2016).
  • 97. Appendix | Sum-squared error (SSE) sum of squared errors of prediction (SSE), is the sum of the squares of residuals (deviations predicted from actual empirical values of data). It is a measure of the discrepancy between the data and an estimation model. A small RSS indicates a tight fit of the model to the data. It is used as an optimality criterion in parameter selection and model selection. https://en.wikipedia.org/wiki/Residual_sum_of_squares