You Only Look Once:
Unified, Real-Time Object Detection
Joseph Redmon, Santosh Divvala, Ross Girshick, Ali Farhadi
전희선
1. Introduction
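• Earlier models perform object localization and classification as separate steps → this falls short of mimicking the human visual system
• YOLO instead treats object localization and classification as a single regression problem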
Strengths
- Extremely fast
- Reasons globally about the image
- Learns generalizable representations of objects
Weaknesses
- Lags behind state-of-the-art detection systems in accuracy
2. Unified Detection
1. Divide the image into an S × S grid (S × S grid cells in total)
Hyperparameters:
S (number of grid divisions per side)
B (number of bounding boxes per grid cell)
C (number of classes)
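The grid division can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `locate_cell` and the 448 × 448 image size are assumptions made for the example. It finds which grid cell a point falls into and where the point sits inside that cell.

```python
def locate_cell(cx, cy, img_w, img_h, S=7):
    """Return (row, col) of the grid cell containing the point (cx, cy),
    plus the point's offset inside that cell.

    The name and signature are hypothetical, chosen for this sketch.
    """
    # Rescale pixel coordinates so that one unit equals one grid cell.
    gx = cx / img_w * S
    gy = cy / img_h * S
    # Clamp so a point on the right or bottom edge stays in the last cell.
    col = min(int(gx), S - 1)
    row = min(int(gy), S - 1)
    return row, col, gx - col, gy - row


# Example with an assumed 448 x 448 image and S = 7 (64-pixel cells).
row, col, x_off, y_off = locate_cell(224, 100, img_w=448, img_h=448)
print(row, col, x_off, y_off)
```

The offsets returned here correspond to a center expressed relative to its grid cell rather than to the whole image.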
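2. For each grid cell, predict B bounding boxes and compute a confidence score for each bounding box
Components of each bounding box:
(x, y): center of the bounding box (relative to the grid cell)
(w, h): width and height of the bounding box (relative to the whole image)
confidence: confidence score
Confidence score:
Reflects how confident the model is that the box contains an object and how accurately the box is predicted
$$\text{confidence} = \Pr(\text{Object}) \cdot \text{IOU}^{\text{truth}}_{\text{pred}}$$
Pr(Object): 1 if the grid cell contains an object, 0 otherwise
IOU (Intersection over Union):
Measures how much the predicted region overlaps the ground-truth region
$$\text{IOU}^{\text{truth}}_{\text{pred}} = \frac{\text{area}(\text{truth} \cap \text{pred})}{\text{area}(\text{truth} \cup \text{pred})}$$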
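The IOU definition above can be sketched as follows. The corner format `(x_min, y_min, x_max, y_max)` is an assumption made to keep the example short; YOLO itself predicts boxes as center plus width and height, so those would need converting first.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes.

    Boxes are (x_min, y_min, x_max, y_max); this format is assumed
    for the sketch and is not the encoding the network outputs.
    """
    # Corners of the overlapping rectangle (if any).
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Width and height are clamped to zero when the boxes do not overlap.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


truth = (0, 0, 2, 2)
pred = (1, 1, 3, 3)
print(iou(truth, pred))  # overlap area 1, union area 7
```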
3. For each grid cell, compute C conditional class probabilities
→ assign the class with the highest probability
Conditional class probability:
$$\Pr(\text{Class}_i \mid \text{Object})$$
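4. Final detection
At test time, compute a class-specific confidence score for each box:
$$\Pr(\text{Class}_i \mid \text{Object}) \cdot \Pr(\text{Object}) \cdot \text{IOU}^{\text{truth}}_{\text{pred}} = \Pr(\text{Class}_i) \cdot \text{IOU}^{\text{truth}}_{\text{pred}}$$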
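For a single grid cell, the class-specific confidence score is the product of each box's confidence and each conditional class probability. A minimal sketch, with a hypothetical function name and made-up numbers (3 classes and 2 boxes, to keep the output readable):

```python
def class_specific_scores(class_probs, box_confidences):
    """Multiply each box confidence by each conditional class probability.

    class_probs:      C values of Pr(Class_i | Object) for one grid cell.
    box_confidences:  B values of Pr(Object) * IOU for that cell's boxes.
    Returns a B x C nested list of class-specific confidence scores.
    """
    return [[conf * p for p in class_probs] for conf in box_confidences]


# Illustrative values only, not taken from a trained model.
class_probs = [0.7, 0.2, 0.1]
box_confidences = [0.9, 0.3]

scores = class_specific_scores(class_probs, box_confidences)
# Index of the highest-scoring class for each box.
best = [max(range(len(row)), key=row.__getitem__) for row in scores]
print(scores, best)
```

Because the class probabilities are shared by all boxes of a cell, every box in the cell ends up with the same best class; only the score differs.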
2.1 Network Design
Based on the GoogLeNet model
In place of GoogLeNet's Inception modules, uses 1 × 1 reduction layers followed by 3 × 3 conv layers
First 20 conv layers (modified from GoogLeNet): feature extractor
Last 4 conv layers + 2 FC layers: object classifier
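Output for each grid cell:
- Class probabilities Pr(Class_i | Object), one per class
- x, y, w, h and confidence for each bounding box (see the bounding box components under step 2 of Unified Detection; here the number of bounding boxes is 2)
Final output tensor size
= S × S × (B × 5 + C)
= 7 × 7 × (2 × 5 + 20)
= 7 × 7 × 30
S (number of grid divisions) = 7
B (number of bounding boxes) = 2
C (number of classes) = 20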
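The output size arithmetic can be checked directly. The sketch below also splits one cell's 30-value vector into boxes and class probabilities; the ordering used (all boxes first, then the class probabilities) is an assumption for illustration, since the slides do not specify the memory layout, and `split_cell_vector` is a hypothetical helper.

```python
S, B, C = 7, 2, 20

depth = B * 5 + C          # values predicted per grid cell
total = S * S * depth      # values predicted for the whole image
print(depth, total)


def split_cell_vector(vec, B=2, C=20):
    """Split one grid cell's prediction vector into boxes and class probs.

    Assumed layout (not confirmed by the slides):
    [box_1 (x, y, w, h, conf), ..., box_B (...), class_1, ..., class_C]
    """
    if len(vec) != B * 5 + C:
        raise ValueError("unexpected vector length")
    boxes = [tuple(vec[k * 5:(k + 1) * 5]) for k in range(B)]
    class_probs = list(vec[B * 5:])
    return boxes, class_probs


# Dummy vector 0.0 .. 29.0 so the slicing is easy to follow.
cell = [float(v) for v in range(depth)]
boxes, class_probs = split_cell_vector(cell)
```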
2.2 Training – Loss Function
For bounding box j of grid cell i in which an object is present,
compute the loss on x and y
For bounding box j of grid cell i in which an object is present,
compute the loss on w and h
(square roots are used so that a small deviation in a large box is penalized less than the same deviation in a small box)
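The effect of the square root can be seen with a small numeric check. This is only an illustration with made-up box sizes (fractions of the image width); the helper `size_error` is hypothetical and covers a single dimension.

```python
import math


def size_error(true_size, pred_size, use_sqrt):
    """Squared error on one box dimension, with or without square roots."""
    if use_sqrt:
        return (math.sqrt(true_size) - math.sqrt(pred_size)) ** 2
    return (true_size - pred_size) ** 2


deviation = 0.05  # the same absolute error applied to both boxes

# Without square roots the small and large box are penalized equally.
small_plain = size_error(0.10, 0.10 + deviation, use_sqrt=False)
large_plain = size_error(0.80, 0.80 + deviation, use_sqrt=False)

# With square roots the small box is penalized more than the large one.
small_sqrt = size_error(0.10, 0.10 + deviation, use_sqrt=True)
large_sqrt = size_error(0.80, 0.80 + deviation, use_sqrt=True)

print(small_plain, large_plain)
print(small_sqrt, large_sqrt)
```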
For bounding box j of grid cell i in which an object is present,
compute the loss on the confidence score
(C_i = 1)
For bounding box j of grid cell i in which no object is present,
compute the loss on the confidence score
(C_i = 0)
For grid cell i in which an object is present,
compute the loss on the conditional class probabilities
(p_i(c) = 1 for the correct class, p_i(c) = 0 otherwise)
The coordinate weight λ_coord is typically 10 times the no-object weight λ_noobj
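The slide image showing the full loss did not survive extraction. For reference, the terms described in this section combine into the sum-squared loss given in the YOLO paper by Redmon et al.; the notation below follows that paper and should be read as a reconstruction rather than a copy of the original slide. Here 1_ij^obj marks box j of cell i as responsible for an object, 1_ij^noobj marks boxes with no object, and 1_i^obj marks cells that contain an object; hatted symbols are predictions.

```latex
\begin{aligned}
\mathcal{L} ={}& \lambda_{\text{coord}} \sum_{i=0}^{S^2} \sum_{j=0}^{B}
      \mathbb{1}_{ij}^{\text{obj}}
      \left[ (x_i - \hat{x}_i)^2 + (y_i - \hat{y}_i)^2 \right] \\
 &+ \lambda_{\text{coord}} \sum_{i=0}^{S^2} \sum_{j=0}^{B}
      \mathbb{1}_{ij}^{\text{obj}}
      \left[ \left(\sqrt{w_i} - \sqrt{\hat{w}_i}\right)^2
           + \left(\sqrt{h_i} - \sqrt{\hat{h}_i}\right)^2 \right] \\
 &+ \sum_{i=0}^{S^2} \sum_{j=0}^{B}
      \mathbb{1}_{ij}^{\text{obj}} \left( C_i - \hat{C}_i \right)^2 \\
 &+ \lambda_{\text{noobj}} \sum_{i=0}^{S^2} \sum_{j=0}^{B}
      \mathbb{1}_{ij}^{\text{noobj}} \left( C_i - \hat{C}_i \right)^2 \\
 &+ \sum_{i=0}^{S^2} \mathbb{1}_{i}^{\text{obj}}
      \sum_{c \in \text{classes}} \left( p_i(c) - \hat{p}_i(c) \right)^2
\end{aligned}
```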
2.2 Training – Hyperparameters
1. Pretrain the first 20 conv layers on the ImageNet 1000-class dataset,
then add 4 conv layers and 2 FC layers and train on the PASCAL VOC dataset
2. λ_coord = 5, λ_noobj = 0.5 (i.e. typically a 10 times larger weight where an object is present)
3. Batch size = 64
4. Dropout rate = 0.5
5. Activation function = leaky ReLU
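As a quick reminder of what the leaky ReLU activation does, here is a scalar sketch. The negative slope of 0.1 is the value reported in the YOLO paper; the slides do not state it, so treat it as an assumption here.

```python
def leaky_relu(x, negative_slope=0.1):
    """Leaky ReLU: identity for positive inputs, a small slope otherwise.

    The default slope of 0.1 is assumed from the YOLO paper.
    """
    return x if x > 0 else negative_slope * x


print(leaky_relu(2.0), leaky_relu(-2.0))
```

Unlike a plain ReLU, negative inputs keep a small non-zero gradient, so units are less likely to stop updating during training.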
2.3 Inference
2.4 Limitations of YOLO
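Each grid cell predicts only B boxes and a single class → objects that appear close together in groups are hard to detect
Bounding boxes with new or unusual shapes cannot be predicted accurately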
References
http://www.navisphere.net/6028/you-only-look-once-unified-real-time-object-detection/
https://curt-park.github.io/2017-03-26/yolo/
https://www.youtube.com/watch?v=eTDcoeqj1_w&t=1572s
https://www.youtube.com/watch?v=4eIBisqx9_g
https://www.youtube.com/watch?v=8DjIJc7xH5U
https://www.youtube.com/watch?v=Cgxsv1riJhI
