SlideShare a Scribd company logo
1 of 46
Download to read offline
A Brief History of Object Detection
From Haar-like features to losing anchors @ PFN Day #4
July 11, 2019
Tommi Kerola
A Brief History of Object Detection
1 Motivation
2 History Before Deep Learning
3 Two-stage Methods
4 Single-shot Methods
5 Anchor-free Methods
6 Problems and Summary
2/41
Motivation
• Localizing objects is a crucial task for using computer vision in the real world.
• Autonomous driving, personal robots, industrial robotics, . . . .
3/41
(Figs. from towardsdatascience.com, preferred.jp, robotics.org)
Problem definition
• Given an input image, predict the locations of a certain class of objects in the image.
• Locations are usually represented using bounding boxes.
• Evaluation using IoU (intersection over union).
IoU
A ∩ B
A ∪ B
, typically, IoU ≥ 0.5 =⇒ match with GT (1)
Ground-truth class: Horse
Ground-truth location:
Predicted class: Horse
Predicted location:
Problem definition
• Given an input image, predict the locations of a certain class of objects in the image.
• Locations are usually represented using bounding boxes.
• Evaluation using IoU (intersection over union).
IoU
A ∩ B
A ∪ B
, typically, IoU ≥ 0.5 =⇒ match with GT (1)
Ground-truth class: Horse
Ground-truth location:
Predicted class: Horse
Predicted location:
4/41
(Fig. from telegraph.co.uk)
Problem definition
• Given an input image, predict the locations of a certain class of objects in the image.
• Locations are usually represented using bounding boxes.
• Evaluation using IoU (intersection over union).
IoU
A ∩ B
A ∪ B
, typically, IoU ≥ 0.5 =⇒ match with GT (1)
IoU = 0.584
Ground-truth class: Horse
Ground-truth location:
Predicted class: Horse
Predicted location:
4/41
(Fig. from telegraph.co.uk)
Problem definition
• Given an input image, predict the locations of a certain class of objects in the image.
• Locations are usually represented using bounding boxes.
• Evaluation using IoU (intersection over union).
IoU
A ∩ B
A ∪ B
, typically, IoU ≥ 0.5 =⇒ match with GT (1)
Ground-truth class: Horse
Ground-truth location:
Predicted class: Horse
Predicted location:
4/41
(Fig. from telegraph.co.uk)
Problem definition
• Given an input image, predict the locations of a certain class of objects in the image.
• Locations are usually represented using bounding boxes.
• Evaluation using IoU (intersection over union).
IoU
A ∩ B
A ∪ B
, typically, IoU ≥ 0.5 =⇒ match with GT (1)
IoU = 0.5
Ground-truth class: Horse
Ground-truth location:
Predicted class: Horse
Predicted location:
• Note: IoU is not a perfect metric, but the current de-facto standard.
4/41
(Fig. from telegraph.co.uk)
Target Audience
People who
• Know the basics of deep learning and CNNs.
• Have not implemented a DL-based object detector before.
• But want to find out how!
5/41
Purpose of this Talk
Region
proposals
Feature
calculation
Region
classification
for each region object or not?
• Understand general object detection pipeline (above).
• Explain the history of object detection up until recent research.
• Illustrate various types of recent object detectors and their merits.
6/41
1 Motivation
2 History Before Deep Learning
3 Two-stage Methods
4 Single-shot Methods
5 Anchor-free Methods
6 Problems and Summary
7/41
History Before Deep Learning
• Early object detectors were based on handcrafted features.
• Sliding window classifier, check if feature response is strong enough, if so: output
detection.
• We will review a few representative features pre-deep learning.
• Haar-like features
• Histograms of Oriented Gradients
• Deformable Part Models
8/41
(Fig. from researchgate.net)
Haar-like features [Viola and Jones(2001)]
• Hand-crafted weak features, calculate in sliding window, use boosted classifier like
AdaBoost. Name comes from similarity to Haar wavelets.
• e.g. “Is the middle part of the image darker than the outer parts?”
“Nose-detector”
“Eye-detector”
9/41
(Fig. from wikipedia.org)
Necessary Post-Processing: Non-maximum Suppression
• Sliding window classifiers yield lots of correlated detections.
• Non-maximum suppression (NMS) is a simple, greedy algorithm for turning these
into a single detection per object (ideally).
• (Recently, improvements upon NMS exist [Zhou et al.(2017), Bodla et al.(2017)])
10/41
(Fig. from pyimagesearch.com)
Histograms of Oriented Gradients [Dalal and Triggs(2005)]
• Computes histogram of gradient orientation (HOG feature) over sub-image blocks.
11/41
(Fig. from [Dalal and Triggs(2005)])
Deformable Part Models [Felzenszwalb et al.(2008)]
• Learn the relationships between HOG features of object parts via a latent SVM.
12/41
(Fig. from [Felzenszwalb et al.(2008)])
Better Region Proposals: Selective
Search [Uijlings et al.(2013)]
• Instead of sliding window, propose regions that have high “objectness”.
• Oversegment image and merge regions hierarchically by color, texture, size and shape.
13/41
(Fig. from [Uijlings et al.(2013)])
Deep Learning Era [Krizhevsky et al.(2012)]
• Starting with AlexNet in 2012, deep
learning methods significantly improved
image classification.
• The same holds for object detection:
> 30 pp. gap. Recent methods give
∼ 150% improvement.
deep learning gap
14/41
(Fig. from [Zou et al.(2019)])
1 Motivation
2 History Before Deep Learning
3 Two-stage Methods
4 Single-shot Methods
5 Anchor-free Methods
6 Problems and Summary
15/41
Two-stage Methods
Image
Region
proposals
Classify
regions
Detections
• As implied, operates in two serial stages:
• Generate region proposals (instead of sliding window).
• Classify each proposed region, if feature response strong enough, output detection.
• When to use? Two-stage methods are accurate but computationally heavy.
• We will look at three historically representative methods:
• R-CNN
• Fast R-CNN
• Faster R-CNN
• (Further reading: [Li et al.(2019), Lin et al.(2017a), Lu et al.(2019), Singh et al.(2018)])
16/41
R-CNN [Girshick et al.(2014)]
• Extract proposals via selective search [Uijlings et al.(2013)].
• Extract CNN features.
• Classify with an SVM.
17/41
(Fig. from [Girshick et al.(2014)])
Fast R-CNN [Girshick(2015)]
• Extract proposals via selective search.
• Extract features and classify with CNN.
18/41
(Fig. from [Girshick(2015)])
Faster R-CNN [Ren et al.(2015)]
• Proposes novel RPN: Region Proposal Network – no more selective search.
• End2End: Extract proposals, features and classify with CNN.
• Note: PFDet [Akiba et al.(2018)] is based on this type of detector.
19/41
(Fig. from [Ren et al.(2015)])
Interlude: What is an anchor?
• Human-designed prior on object shapes.
• Designing anchors is deciding which and how many prior anchor shapes to use.
• Anchors are typically selected to be close in shape to typical objects. Upright
rectangle for pedestrian, lying rectangle for car, etc.
20/41
(Fig. from [Liu et al.(2016)])
1 Motivation
2 History Before Deep Learning
3 Two-stage Methods
4 Single-shot Methods
5 Anchor-free Methods
6 Problems and Summary
21/41
Single-shot Methods
Image
Classify and
regress bboxes
Detections
• Unlike Faster R-CNN et al., only has a single stage.
• Can be thought of as that the RPN both localizes and classifies the object.
• When to use? Single-shot methods are generally fast and have moderate accuracy.
• We will look at two representative methods:
• SSD: Single Shot MultiBox Detector
• YOLO: You Only Look Once
• (Out of scope, but recommended reading: [Zhang et al.(2018), Lin et al.(2017b)])
22/41
SSD: Single Shot MultiBox Detector [Liu et al.(2016)]
• At each feature scale, predict class and bounding box regression (via Huber loss).
Linear scale target center x ˆgcx
j
Log scale target width ˆgw
j
23/41
(Fig. from [Liu et al.(2016)])
YOLO: You Only Look Once [Redmon et al.(2016)]
• Unlike SSD, much simpler. Just a single scale of features and fully connected layers.
• YOLOv2 [Redmon and Farhadi(2017)] makes the method more like SSD, removes
fully connected layers (+ other tricks).
24/41
(Fig. from [Liu et al.(2016)])
1 Motivation
2 History Before Deep Learning
3 Two-stage Methods
4 Single-shot Methods
5 Anchor-free Methods
6 Problems and Summary
25/41
Anchor-free Methods
• Recently, researchers have tried to remove anchors from object detection methods.
• Why are anchors bad?
• Human-designed prior. (cf. hand-crafted featured vs deep learning on ILSVRC in 2012)
• Anchor-free methods provide more flexibility to leverage large-scale data.
• Why are we able to remove the anchors? Mainly thanks to progress in
• deep keypoint detection research [Newell et al.(2017), Newell and Deng(2017)].
• detection loss formulation, such as focal loss [Lin et al.(2017b)].
• Unlike previous architectures, more similar to semantic segmentation.
• When to use? Instead of single-shot methods for fast and accurate inference.
• If you were thinking of using SSD for your project – think again.
• Caveat: Two-stage methods tend to still be more accurate.
26/41
CornerNet: Detecting Objects as Paired
Keypoints [Law and Deng(2018)]
• First competitive anchor-free method – detect corners of objects.
• Key contributions are corner pooling and improving focal loss for keypoint detection:
27/41
(Fig. from [Law and Deng(2018)])
CenterNet(s): Objects as Points
• Found there are actually two CenterNet papers: posted on arXiv 1 day apart.
• Apr 16: Predict center, regress object size.
• Apr 17: Extend CornerNet to take both corners and centers into account.
28/41
(Figs. from [Zhou et al.(2019b), Duan et al.(2019)])
FCOS: Fully Convolutional One-Stage Object
Detection [Tian et al.(2019)]
• Similar to the Apr 16 CenterNet – predict center, regress object size.
29/41
(Figs. from [Tian et al.(2019)])
Bottom-up Object Detection by Grouping Extreme and Center
Points [Zhou et al.(2019a)]
• Detect not only bboxes, but convex octagons enclosing objects. Same author as Apr
16 CenterNet.
30/41
(Figs. from [Zhou et al.(2019a)])
1 Motivation
2 History Before Deep Learning
3 Two-stage Methods
4 Single-shot Methods
5 Anchor-free Methods
6 Problems and Summary
31/41
Problems for Current Object Detection Methods
• Hand-crafted post-processing (NMS) is still required due to correlated detections
• Attempts at “truly” end-to-end object detectors exist, but not
practical [Hu et al.(2018), Hosang et al.(2017)].
• Bounding box representation not optimal – remember the horse?
32/41
(Fig. from telegraph.co.uk)
Problems for Current Object Detection Methods
• Hand-crafted post-processing (NMS) is still required due to correlated detections
• Attempts at “truly” end-to-end object detectors exist, but not
practical [Hu et al.(2018), Hosang et al.(2017)].
• Bounding box representation not optimal – remember the horse?
• You can miss half the horse – and still considered a correct detection!
32/41
(Fig. from telegraph.co.uk)
Problems – and solutions?
• Bounding box representation and hand-crafted NMS not optimal
• New field, panoptic segmentation, tackles this [Kirillov et al.(2019)].
• No need for NMS, as the task requires explaining each pixel, including background
classes.
• Proposes new metric PQ: Panoptic Quality for replacing IoU metric.
33/41
(Fig. from [Kirillov et al.(2019)])
Summary
• Deep learning has resulted in a significant (∼ 150%) improvement in object detection.
• Three main architectures of deep object detectors
• Two-stage
• Single-shot
• Anchor-free (recent research!)
• Issues
• Hand-crafted post-processing (NMS) is still required due to correlated detections.
• Issue of using bounding boxes to represent objects remains.
• → Recent new field of panoptic segmentation aims to mitigate these issues.
• Takeaway message:
• Do not use SSD! There are much better methods available for your project.
• Anchor-free methods are promising for fast and accurate object detection.
34/41
References I
T. Akiba et al.
Pfdet: 2nd place solution to open images challenge 2018 object detection track.
arXiv preprint arXiv:1809.00778, 2018.
N. Bodla et al.
Soft-nms–improving object detection with one line of code.
In Proceedings of the IEEE International Conference on Computer Vision, pages 5561–5569, 2017.
N. Dalal and B. Triggs.
Histograms of oriented gradients for human detection.
In International Conference on computer vision & Pattern Recognition (CVPR’05), volume 1, pages 886–893. IEEE Computer Society, 2005.
K. Duan et al.
Centernet: Object detection with keypoint triplets.
arXiv preprint arXiv:1904.08189, 2019.
P. Felzenszwalb et al.
A discriminatively trained, multiscale, deformable part model.
In CVPR, 2008.
R. Girshick.
Fast r-cnn.
In Proceedings of the IEEE international conference on computer vision, pages 1440–1448, 2015.
35/41
References II
R. Girshick et al.
Rich feature hierarchies for accurate object detection and semantic segmentation.
In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 580–587, 2014.
J. Hosang et al.
Learning non-maximum suppression.
In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 4507–4515, 2017.
H. Hu et al.
Relation networks for object detection.
In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 3588–3597, 2018.
A. Kirillov et al.
Panoptic segmentation.
In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2019.
A. Krizhevsky et al.
Imagenet classification with deep convolutional neural networks.
In Advances in neural information processing systems, pages 1097–1105, 2012.
H. Law and J. Deng.
Cornernet: Detecting objects as paired keypoints.
In Proceedings of the European Conference on Computer Vision (ECCV), pages 734–750, 2018.
36/41
References III
Y. Li et al.
Scale-aware trident networks for object detection.
arXiv preprint arXiv:1901.01892, 2019.
T.-Y. Lin et al.
Feature pyramid networks for object detection.
In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2117–2125, 2017a.
T.-Y. Lin et al.
Focal loss for dense object detection.
In Proceedings of the IEEE international conference on computer vision, pages 2980–2988, 2017b.
W. Liu et al.
Ssd: Single shot multibox detector.
In European conference on computer vision, pages 21–37. Springer, 2016.
X. Lu et al.
Grid r-cnn.
In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2019.
A. Newell and J. Deng.
Pixels to graphs by associative embedding.
In Advances in neural information processing systems, pages 2171–2180, 2017.
37/41
References IV
A. Newell et al.
Associative embedding: End-to-end learning for joint detection and grouping.
In Advances in Neural Information Processing Systems, pages 2277–2287, 2017.
J. Redmon and A. Farhadi.
Yolo9000: better, faster, stronger.
In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 7263–7271, 2017.
J. Redmon et al.
You only look once: Unified, real-time object detection.
In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 779–788, 2016.
S. Ren et al.
Faster r-cnn: Towards real-time object detection with region proposal networks.
In Advances in neural information processing systems, pages 91–99, 2015.
B. Singh et al.
Sniper: Efficient multi-scale training.
In Advances in Neural Information Processing Systems, pages 9310–9320, 2018.
Z. Tian et al.
Fcos: Fully convolutional one-stage object detection.
arXiv preprint arXiv:1904.01355, 2019.
38/41
References V
J. Uijlings et al.
Selective search for object recognition.
International journal of computer vision, 104(2):154–171, 2013.
P. Viola and M. Jones.
Rapid object detection using a boosted cascade of simple features.
CVPR, 2001.
S. Zhang et al.
Single-shot refinement neural network for object detection.
In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 4203–4212, 2018.
H. Zhou et al.
Cad: Scale invariant framework for real-time object detection.
In The IEEE International Conference on Computer Vision (ICCV) Workshops, Oct 2017.
X. Zhou et al.
Bottom-up object detection by grouping extreme and center points.
In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2019a.
X. Zhou et al.
Objects as points.
arXiv preprint arXiv:1904.07850, 2019b.
39/41
References VI
Z. Zou et al.
Object detection in 20 years: A survey.
arXiv preprint arXiv:1905.05055, 2019.
40/41
Thank you for listening! Questions?
41/41

More Related Content

What's hot

Object detection with deep learning
Object detection with deep learningObject detection with deep learning
Object detection with deep learningSushant Shrivastava
 
[PR12] You Only Look Once (YOLO): Unified Real-Time Object Detection
[PR12] You Only Look Once (YOLO): Unified Real-Time Object Detection[PR12] You Only Look Once (YOLO): Unified Real-Time Object Detection
[PR12] You Only Look Once (YOLO): Unified Real-Time Object DetectionTaegyun Jeon
 
Image segmentation with deep learning
Image segmentation with deep learningImage segmentation with deep learning
Image segmentation with deep learningAntonio Rueda-Toicen
 
Object Detection Using R-CNN Deep Learning Framework
Object Detection Using R-CNN Deep Learning FrameworkObject Detection Using R-CNN Deep Learning Framework
Object Detection Using R-CNN Deep Learning FrameworkNader Karimi
 
Object Detection and Recognition
Object Detection and Recognition Object Detection and Recognition
Object Detection and Recognition Intel Nervana
 
Anchor free object detection by deep learning
Anchor free object detection by deep learningAnchor free object detection by deep learning
Anchor free object detection by deep learningYu Huang
 
Neural Radiance Field
Neural Radiance FieldNeural Radiance Field
Neural Radiance FieldDong Heon Cho
 
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis taeseon ryu
 
Moving object detection
Moving object detectionMoving object detection
Moving object detectionManav Mittal
 
PR-132: SSD: Single Shot MultiBox Detector
PR-132: SSD: Single Shot MultiBox DetectorPR-132: SSD: Single Shot MultiBox Detector
PR-132: SSD: Single Shot MultiBox DetectorJinwon Lee
 
Single Shot Multibox Detector
Single Shot Multibox DetectorSingle Shot Multibox Detector
Single Shot Multibox DetectorNamHyuk Ahn
 
You only look once: Unified, real-time object detection (UPC Reading Group)
You only look once: Unified, real-time object detection (UPC Reading Group)You only look once: Unified, real-time object detection (UPC Reading Group)
You only look once: Unified, real-time object detection (UPC Reading Group)Universitat Politècnica de Catalunya
 

What's hot (20)

Moving object detection
Moving object detectionMoving object detection
Moving object detection
 
Yolo
YoloYolo
Yolo
 
Object detection with deep learning
Object detection with deep learningObject detection with deep learning
Object detection with deep learning
 
[PR12] You Only Look Once (YOLO): Unified Real-Time Object Detection
[PR12] You Only Look Once (YOLO): Unified Real-Time Object Detection[PR12] You Only Look Once (YOLO): Unified Real-Time Object Detection
[PR12] You Only Look Once (YOLO): Unified Real-Time Object Detection
 
You only look once
You only look onceYou only look once
You only look once
 
Object detection
Object detectionObject detection
Object detection
 
SSD: Single Shot MultiBox Detector (UPC Reading Group)
SSD: Single Shot MultiBox Detector (UPC Reading Group)SSD: Single Shot MultiBox Detector (UPC Reading Group)
SSD: Single Shot MultiBox Detector (UPC Reading Group)
 
Image segmentation with deep learning
Image segmentation with deep learningImage segmentation with deep learning
Image segmentation with deep learning
 
Object Detection Using R-CNN Deep Learning Framework
Object Detection Using R-CNN Deep Learning FrameworkObject Detection Using R-CNN Deep Learning Framework
Object Detection Using R-CNN Deep Learning Framework
 
Object detection
Object detectionObject detection
Object detection
 
Yolov5
Yolov5 Yolov5
Yolov5
 
Mask R-CNN
Mask R-CNNMask R-CNN
Mask R-CNN
 
Object Detection and Recognition
Object Detection and Recognition Object Detection and Recognition
Object Detection and Recognition
 
Anchor free object detection by deep learning
Anchor free object detection by deep learningAnchor free object detection by deep learning
Anchor free object detection by deep learning
 
Neural Radiance Field
Neural Radiance FieldNeural Radiance Field
Neural Radiance Field
 
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
 
Moving object detection
Moving object detectionMoving object detection
Moving object detection
 
PR-132: SSD: Single Shot MultiBox Detector
PR-132: SSD: Single Shot MultiBox DetectorPR-132: SSD: Single Shot MultiBox Detector
PR-132: SSD: Single Shot MultiBox Detector
 
Single Shot Multibox Detector
Single Shot Multibox DetectorSingle Shot Multibox Detector
Single Shot Multibox Detector
 
You only look once: Unified, real-time object detection (UPC Reading Group)
You only look once: Unified, real-time object detection (UPC Reading Group)You only look once: Unified, real-time object detection (UPC Reading Group)
You only look once: Unified, real-time object detection (UPC Reading Group)
 

Similar to A Brief History of Object Detection / Tommi Kerola

Various object detection and tracking methods
Various object detection and tracking methodsVarious object detection and tracking methods
Various object detection and tracking methodssujeeshkumarj
 
Optical Flow with Semantic Segmentation and Localized Layers
Optical Flow with Semantic Segmentation and Localized LayersOptical Flow with Semantic Segmentation and Localized Layers
Optical Flow with Semantic Segmentation and Localized LayersSeval Çapraz
 
Wits presentation 2_19052015
Wits presentation 2_19052015Wits presentation 2_19052015
Wits presentation 2_19052015Beatrice van Eden
 
auto-assistance system for visually impaired person
auto-assistance system for visually impaired personauto-assistance system for visually impaired person
auto-assistance system for visually impaired personshahsamkit73
 
Survey of The Problem of Object Detection In Real Images
Survey of The Problem of Object Detection In Real ImagesSurvey of The Problem of Object Detection In Real Images
Survey of The Problem of Object Detection In Real ImagesCSCJournals
 
[RSS2023] Local Object Crop Collision Network for Efficient Simulation
[RSS2023] Local Object Crop Collision Network for Efficient Simulation[RSS2023] Local Object Crop Collision Network for Efficient Simulation
[RSS2023] Local Object Crop Collision Network for Efficient SimulationDongwonSon1
 
Common Understanding about YOLO
Common Understanding about YOLOCommon Understanding about YOLO
Common Understanding about YOLO재민 임
 
Video Object Segmentation in Videos
Video Object Segmentation in VideosVideo Object Segmentation in Videos
Video Object Segmentation in VideosNAVER Engineering
 
Object based image analysis tools for opticks
Object based image analysis tools for opticksObject based image analysis tools for opticks
Object based image analysis tools for opticksMohit Kumar
 
Topological Data Analysis
Topological Data AnalysisTopological Data Analysis
Topological Data AnalysisDeviousQuant
 
OBJECT DETECTION AND RECOGNITION: A SURVEY
OBJECT DETECTION AND RECOGNITION: A SURVEYOBJECT DETECTION AND RECOGNITION: A SURVEY
OBJECT DETECTION AND RECOGNITION: A SURVEYJournal For Research
 
IISc Internship Report
IISc Internship ReportIISc Internship Report
IISc Internship ReportHarshilJain26
 
A Survey about Object Retrieval
A Survey about Object RetrievalA Survey about Object Retrieval
A Survey about Object RetrievalNguyen Tuan
 
An analysis of_machine_and_human_analytics_in_classification
An analysis of_machine_and_human_analytics_in_classificationAn analysis of_machine_and_human_analytics_in_classification
An analysis of_machine_and_human_analytics_in_classificationSubhashis Hazarika
 
Scrdet++ analysis
Scrdet++ analysisScrdet++ analysis
Scrdet++ analysisNEHA Kapoor
 
Intergrated Single-arm Assembly and Manipulation Planning using Dynamic Regra...
Intergrated Single-arm Assembly and Manipulation Planning using Dynamic Regra...Intergrated Single-arm Assembly and Manipulation Planning using Dynamic Regra...
Intergrated Single-arm Assembly and Manipulation Planning using Dynamic Regra...Kensuke Harada
 
Self-supervised Learning Lecture Note
Self-supervised Learning Lecture NoteSelf-supervised Learning Lecture Note
Self-supervised Learning Lecture NoteSangwoo Mo
 
Final Year Major Project Report ( Year 2010-2014 Batch )
Final Year Major Project Report ( Year 2010-2014 Batch )Final Year Major Project Report ( Year 2010-2014 Batch )
Final Year Major Project Report ( Year 2010-2014 Batch )parul6792
 
Action Genome: Action As Composition of Spatio Temporal Scene Graphs
Action Genome: Action As Composition of Spatio Temporal Scene GraphsAction Genome: Action As Composition of Spatio Temporal Scene Graphs
Action Genome: Action As Composition of Spatio Temporal Scene GraphsSangmin Woo
 

Similar to A Brief History of Object Detection / Tommi Kerola (20)

Various object detection and tracking methods
Various object detection and tracking methodsVarious object detection and tracking methods
Various object detection and tracking methods
 
Optical Flow with Semantic Segmentation and Localized Layers
Optical Flow with Semantic Segmentation and Localized LayersOptical Flow with Semantic Segmentation and Localized Layers
Optical Flow with Semantic Segmentation and Localized Layers
 
Wits presentation 2_19052015
Wits presentation 2_19052015Wits presentation 2_19052015
Wits presentation 2_19052015
 
auto-assistance system for visually impaired person
auto-assistance system for visually impaired personauto-assistance system for visually impaired person
auto-assistance system for visually impaired person
 
Survey of The Problem of Object Detection In Real Images
Survey of The Problem of Object Detection In Real ImagesSurvey of The Problem of Object Detection In Real Images
Survey of The Problem of Object Detection In Real Images
 
[RSS2023] Local Object Crop Collision Network for Efficient Simulation
[RSS2023] Local Object Crop Collision Network for Efficient Simulation[RSS2023] Local Object Crop Collision Network for Efficient Simulation
[RSS2023] Local Object Crop Collision Network for Efficient Simulation
 
Common Understanding about YOLO
Common Understanding about YOLOCommon Understanding about YOLO
Common Understanding about YOLO
 
Video Object Segmentation in Videos
Video Object Segmentation in VideosVideo Object Segmentation in Videos
Video Object Segmentation in Videos
 
Yolo releases gianmaria
Yolo releases gianmariaYolo releases gianmaria
Yolo releases gianmaria
 
Object based image analysis tools for opticks
Object based image analysis tools for opticksObject based image analysis tools for opticks
Object based image analysis tools for opticks
 
Topological Data Analysis
Topological Data AnalysisTopological Data Analysis
Topological Data Analysis
 
OBJECT DETECTION AND RECOGNITION: A SURVEY
OBJECT DETECTION AND RECOGNITION: A SURVEYOBJECT DETECTION AND RECOGNITION: A SURVEY
OBJECT DETECTION AND RECOGNITION: A SURVEY
 
IISc Internship Report
IISc Internship ReportIISc Internship Report
IISc Internship Report
 
A Survey about Object Retrieval
A Survey about Object RetrievalA Survey about Object Retrieval
A Survey about Object Retrieval
 
An analysis of_machine_and_human_analytics_in_classification
An analysis of_machine_and_human_analytics_in_classificationAn analysis of_machine_and_human_analytics_in_classification
An analysis of_machine_and_human_analytics_in_classification
 
Scrdet++ analysis
Scrdet++ analysisScrdet++ analysis
Scrdet++ analysis
 
Intergrated Single-arm Assembly and Manipulation Planning using Dynamic Regra...
Intergrated Single-arm Assembly and Manipulation Planning using Dynamic Regra...Intergrated Single-arm Assembly and Manipulation Planning using Dynamic Regra...
Intergrated Single-arm Assembly and Manipulation Planning using Dynamic Regra...
 
Self-supervised Learning Lecture Note
Self-supervised Learning Lecture NoteSelf-supervised Learning Lecture Note
Self-supervised Learning Lecture Note
 
Final Year Major Project Report ( Year 2010-2014 Batch )
Final Year Major Project Report ( Year 2010-2014 Batch )Final Year Major Project Report ( Year 2010-2014 Batch )
Final Year Major Project Report ( Year 2010-2014 Batch )
 
Action Genome: Action As Composition of Spatio Temporal Scene Graphs
Action Genome: Action As Composition of Spatio Temporal Scene GraphsAction Genome: Action As Composition of Spatio Temporal Scene Graphs
Action Genome: Action As Composition of Spatio Temporal Scene Graphs
 

More from Preferred Networks

PodSecurityPolicy からGatekeeper に移行しました / Kubernetes Meetup Tokyo #57
PodSecurityPolicy からGatekeeper に移行しました / Kubernetes Meetup Tokyo #57PodSecurityPolicy からGatekeeper に移行しました / Kubernetes Meetup Tokyo #57
PodSecurityPolicy からGatekeeper に移行しました / Kubernetes Meetup Tokyo #57Preferred Networks
 
Optunaを使ったHuman-in-the-loop最適化の紹介 - 2023/04/27 W&B 東京ミートアップ #3
Optunaを使ったHuman-in-the-loop最適化の紹介 - 2023/04/27 W&B 東京ミートアップ #3Optunaを使ったHuman-in-the-loop最適化の紹介 - 2023/04/27 W&B 東京ミートアップ #3
Optunaを使ったHuman-in-the-loop最適化の紹介 - 2023/04/27 W&B 東京ミートアップ #3Preferred Networks
 
Kubernetes + containerd で cgroup v2 に移行したら "failed to create fsnotify watcher...
Kubernetes + containerd で cgroup v2 に移行したら "failed to create fsnotify watcher...Kubernetes + containerd で cgroup v2 に移行したら "failed to create fsnotify watcher...
Kubernetes + containerd で cgroup v2 に移行したら "failed to create fsnotify watcher...Preferred Networks
 
深層学習の新しい応用と、 それを支える計算機の進化 - Preferred Networks CEO 西川徹 (SEMICON Japan 2022 Ke...
深層学習の新しい応用と、 それを支える計算機の進化 - Preferred Networks CEO 西川徹 (SEMICON Japan 2022 Ke...深層学習の新しい応用と、 それを支える計算機の進化 - Preferred Networks CEO 西川徹 (SEMICON Japan 2022 Ke...
深層学習の新しい応用と、 それを支える計算機の進化 - Preferred Networks CEO 西川徹 (SEMICON Japan 2022 Ke...Preferred Networks
 
Kubernetes ControllerをScale-Outさせる方法 / Kubernetes Meetup Tokyo #55
Kubernetes ControllerをScale-Outさせる方法 / Kubernetes Meetup Tokyo #55Kubernetes ControllerをScale-Outさせる方法 / Kubernetes Meetup Tokyo #55
Kubernetes ControllerをScale-Outさせる方法 / Kubernetes Meetup Tokyo #55Preferred Networks
 
Kaggle Happywhaleコンペ優勝解法でのOptuna使用事例 - 2022/12/10 Optuna Meetup #2
Kaggle Happywhaleコンペ優勝解法でのOptuna使用事例 - 2022/12/10 Optuna Meetup #2Kaggle Happywhaleコンペ優勝解法でのOptuna使用事例 - 2022/12/10 Optuna Meetup #2
Kaggle Happywhaleコンペ優勝解法でのOptuna使用事例 - 2022/12/10 Optuna Meetup #2Preferred Networks
 
最新リリース:Optuna V3の全て - 2022/12/10 Optuna Meetup #2
最新リリース:Optuna V3の全て - 2022/12/10 Optuna Meetup #2最新リリース:Optuna V3の全て - 2022/12/10 Optuna Meetup #2
最新リリース:Optuna V3の全て - 2022/12/10 Optuna Meetup #2Preferred Networks
 
Optuna Dashboardの紹介と設計解説 - 2022/12/10 Optuna Meetup #2
Optuna Dashboardの紹介と設計解説 - 2022/12/10 Optuna Meetup #2Optuna Dashboardの紹介と設計解説 - 2022/12/10 Optuna Meetup #2
Optuna Dashboardの紹介と設計解説 - 2022/12/10 Optuna Meetup #2Preferred Networks
 
スタートアップが提案する2030年の材料開発 - 2022/11/11 QPARC講演
スタートアップが提案する2030年の材料開発 - 2022/11/11 QPARC講演スタートアップが提案する2030年の材料開発 - 2022/11/11 QPARC講演
スタートアップが提案する2030年の材料開発 - 2022/11/11 QPARC講演Preferred Networks
 
Deep Learningのための専用プロセッサ「MN-Core」の開発と活用(2022/10/19東大大学院「 融合情報学特別講義Ⅲ」)
Deep Learningのための専用プロセッサ「MN-Core」の開発と活用(2022/10/19東大大学院「 融合情報学特別講義Ⅲ」)Deep Learningのための専用プロセッサ「MN-Core」の開発と活用(2022/10/19東大大学院「 融合情報学特別講義Ⅲ」)
Deep Learningのための専用プロセッサ「MN-Core」の開発と活用(2022/10/19東大大学院「 融合情報学特別講義Ⅲ」)Preferred Networks
 
PFNにおける研究開発(2022/10/19 東大大学院「融合情報学特別講義Ⅲ」)
PFNにおける研究開発(2022/10/19 東大大学院「融合情報学特別講義Ⅲ」)PFNにおける研究開発(2022/10/19 東大大学院「融合情報学特別講義Ⅲ」)
PFNにおける研究開発(2022/10/19 東大大学院「融合情報学特別講義Ⅲ」)Preferred Networks
 
自然言語処理を 役立てるのはなぜ難しいのか(2022/10/25東大大学院「自然言語処理応用」)
自然言語処理を 役立てるのはなぜ難しいのか(2022/10/25東大大学院「自然言語処理応用」)自然言語処理を 役立てるのはなぜ難しいのか(2022/10/25東大大学院「自然言語処理応用」)
自然言語処理を 役立てるのはなぜ難しいのか(2022/10/25東大大学院「自然言語処理応用」)Preferred Networks
 
Kubernetes にこれから入るかもしれない注目機能!(2022年11月版) / TechFeed Experts Night #7 〜 コンテナ技術を語る
Kubernetes にこれから入るかもしれない注目機能!(2022年11月版) / TechFeed Experts Night #7 〜 コンテナ技術を語るKubernetes にこれから入るかもしれない注目機能!(2022年11月版) / TechFeed Experts Night #7 〜 コンテナ技術を語る
Kubernetes にこれから入るかもしれない注目機能!(2022年11月版) / TechFeed Experts Night #7 〜 コンテナ技術を語るPreferred Networks
 
Matlantis™のニューラルネットワークポテンシャルPFPの適用範囲拡張
Matlantis™のニューラルネットワークポテンシャルPFPの適用範囲拡張Matlantis™のニューラルネットワークポテンシャルPFPの適用範囲拡張
Matlantis™のニューラルネットワークポテンシャルPFPの適用範囲拡張Preferred Networks
 
PFNのオンプレ計算機クラスタの取り組み_第55回情報科学若手の会
PFNのオンプレ計算機クラスタの取り組み_第55回情報科学若手の会PFNのオンプレ計算機クラスタの取り組み_第55回情報科学若手の会
PFNのオンプレ計算機クラスタの取り組み_第55回情報科学若手の会Preferred Networks
 
続・PFN のオンプレML基盤の取り組み / オンプレML基盤 on Kubernetes 〜PFN、ヤフー〜 #2
続・PFN のオンプレML基盤の取り組み / オンプレML基盤 on Kubernetes 〜PFN、ヤフー〜 #2続・PFN のオンプレML基盤の取り組み / オンプレML基盤 on Kubernetes 〜PFN、ヤフー〜 #2
続・PFN のオンプレML基盤の取り組み / オンプレML基盤 on Kubernetes 〜PFN、ヤフー〜 #2Preferred Networks
 
Kubernetes Service Account As Multi-Cloud Identity / Cloud Native Security Co...
Kubernetes Service Account As Multi-Cloud Identity / Cloud Native Security Co...Kubernetes Service Account As Multi-Cloud Identity / Cloud Native Security Co...
Kubernetes Service Account As Multi-Cloud Identity / Cloud Native Security Co...Preferred Networks
 
KubeCon + CloudNativeCon Europe 2022 Recap / Kubernetes Meetup Tokyo #51 / #k...
KubeCon + CloudNativeCon Europe 2022 Recap / Kubernetes Meetup Tokyo #51 / #k...KubeCon + CloudNativeCon Europe 2022 Recap / Kubernetes Meetup Tokyo #51 / #k...
KubeCon + CloudNativeCon Europe 2022 Recap / Kubernetes Meetup Tokyo #51 / #k...Preferred Networks
 
KubeCon + CloudNativeCon Europe 2022 Recap - Batch/HPCの潮流とScheduler拡張事例 / Kub...
KubeCon + CloudNativeCon Europe 2022 Recap - Batch/HPCの潮流とScheduler拡張事例 / Kub...KubeCon + CloudNativeCon Europe 2022 Recap - Batch/HPCの潮流とScheduler拡張事例 / Kub...
KubeCon + CloudNativeCon Europe 2022 Recap - Batch/HPCの潮流とScheduler拡張事例 / Kub...Preferred Networks
 
独断と偏見で選んだ Kubernetes 1.24 の注目機能と今後! / Kubernetes Meetup Tokyo 50
独断と偏見で選んだ Kubernetes 1.24 の注目機能と今後! / Kubernetes Meetup Tokyo 50独断と偏見で選んだ Kubernetes 1.24 の注目機能と今後! / Kubernetes Meetup Tokyo 50
独断と偏見で選んだ Kubernetes 1.24 の注目機能と今後! / Kubernetes Meetup Tokyo 50Preferred Networks
 

More from Preferred Networks (20)

PodSecurityPolicy からGatekeeper に移行しました / Kubernetes Meetup Tokyo #57
PodSecurityPolicy からGatekeeper に移行しました / Kubernetes Meetup Tokyo #57PodSecurityPolicy からGatekeeper に移行しました / Kubernetes Meetup Tokyo #57
PodSecurityPolicy からGatekeeper に移行しました / Kubernetes Meetup Tokyo #57
 
Optunaを使ったHuman-in-the-loop最適化の紹介 - 2023/04/27 W&B 東京ミートアップ #3
Optunaを使ったHuman-in-the-loop最適化の紹介 - 2023/04/27 W&B 東京ミートアップ #3Optunaを使ったHuman-in-the-loop最適化の紹介 - 2023/04/27 W&B 東京ミートアップ #3
Optunaを使ったHuman-in-the-loop最適化の紹介 - 2023/04/27 W&B 東京ミートアップ #3
 
Kubernetes + containerd で cgroup v2 に移行したら "failed to create fsnotify watcher...
Kubernetes + containerd で cgroup v2 に移行したら "failed to create fsnotify watcher...Kubernetes + containerd で cgroup v2 に移行したら "failed to create fsnotify watcher...
Kubernetes + containerd で cgroup v2 に移行したら "failed to create fsnotify watcher...
 
深層学習の新しい応用と、 それを支える計算機の進化 - Preferred Networks CEO 西川徹 (SEMICON Japan 2022 Ke...
深層学習の新しい応用と、 それを支える計算機の進化 - Preferred Networks CEO 西川徹 (SEMICON Japan 2022 Ke...深層学習の新しい応用と、 それを支える計算機の進化 - Preferred Networks CEO 西川徹 (SEMICON Japan 2022 Ke...
深層学習の新しい応用と、 それを支える計算機の進化 - Preferred Networks CEO 西川徹 (SEMICON Japan 2022 Ke...
 
Kubernetes ControllerをScale-Outさせる方法 / Kubernetes Meetup Tokyo #55
Kubernetes ControllerをScale-Outさせる方法 / Kubernetes Meetup Tokyo #55Kubernetes ControllerをScale-Outさせる方法 / Kubernetes Meetup Tokyo #55
Kubernetes ControllerをScale-Outさせる方法 / Kubernetes Meetup Tokyo #55
 
Kaggle Happywhaleコンペ優勝解法でのOptuna使用事例 - 2022/12/10 Optuna Meetup #2
Kaggle Happywhaleコンペ優勝解法でのOptuna使用事例 - 2022/12/10 Optuna Meetup #2Kaggle Happywhaleコンペ優勝解法でのOptuna使用事例 - 2022/12/10 Optuna Meetup #2
Kaggle Happywhaleコンペ優勝解法でのOptuna使用事例 - 2022/12/10 Optuna Meetup #2
 
最新リリース:Optuna V3の全て - 2022/12/10 Optuna Meetup #2
最新リリース:Optuna V3の全て - 2022/12/10 Optuna Meetup #2最新リリース:Optuna V3の全て - 2022/12/10 Optuna Meetup #2
最新リリース:Optuna V3の全て - 2022/12/10 Optuna Meetup #2
 
Optuna Dashboardの紹介と設計解説 - 2022/12/10 Optuna Meetup #2
Optuna Dashboardの紹介と設計解説 - 2022/12/10 Optuna Meetup #2Optuna Dashboardの紹介と設計解説 - 2022/12/10 Optuna Meetup #2
Optuna Dashboardの紹介と設計解説 - 2022/12/10 Optuna Meetup #2
 
スタートアップが提案する2030年の材料開発 - 2022/11/11 QPARC講演
スタートアップが提案する2030年の材料開発 - 2022/11/11 QPARC講演スタートアップが提案する2030年の材料開発 - 2022/11/11 QPARC講演
スタートアップが提案する2030年の材料開発 - 2022/11/11 QPARC講演
 
Deep Learningのための専用プロセッサ「MN-Core」の開発と活用(2022/10/19東大大学院「 融合情報学特別講義Ⅲ」)
Deep Learningのための専用プロセッサ「MN-Core」の開発と活用(2022/10/19東大大学院「 融合情報学特別講義Ⅲ」)Deep Learningのための専用プロセッサ「MN-Core」の開発と活用(2022/10/19東大大学院「 融合情報学特別講義Ⅲ」)
Deep Learningのための専用プロセッサ「MN-Core」の開発と活用(2022/10/19東大大学院「 融合情報学特別講義Ⅲ」)
 
PFNにおける研究開発(2022/10/19 東大大学院「融合情報学特別講義Ⅲ」)
PFNにおける研究開発(2022/10/19 東大大学院「融合情報学特別講義Ⅲ」)PFNにおける研究開発(2022/10/19 東大大学院「融合情報学特別講義Ⅲ」)
PFNにおける研究開発(2022/10/19 東大大学院「融合情報学特別講義Ⅲ」)
 
自然言語処理を 役立てるのはなぜ難しいのか(2022/10/25東大大学院「自然言語処理応用」)
自然言語処理を 役立てるのはなぜ難しいのか(2022/10/25東大大学院「自然言語処理応用」)自然言語処理を 役立てるのはなぜ難しいのか(2022/10/25東大大学院「自然言語処理応用」)
自然言語処理を 役立てるのはなぜ難しいのか(2022/10/25東大大学院「自然言語処理応用」)
 
Kubernetes にこれから入るかもしれない注目機能!(2022年11月版) / TechFeed Experts Night #7 〜 コンテナ技術を語る
Kubernetes にこれから入るかもしれない注目機能!(2022年11月版) / TechFeed Experts Night #7 〜 コンテナ技術を語るKubernetes にこれから入るかもしれない注目機能!(2022年11月版) / TechFeed Experts Night #7 〜 コンテナ技術を語る
Kubernetes にこれから入るかもしれない注目機能!(2022年11月版) / TechFeed Experts Night #7 〜 コンテナ技術を語る
 
Matlantis™のニューラルネットワークポテンシャルPFPの適用範囲拡張
Matlantis™のニューラルネットワークポテンシャルPFPの適用範囲拡張Matlantis™のニューラルネットワークポテンシャルPFPの適用範囲拡張
Matlantis™のニューラルネットワークポテンシャルPFPの適用範囲拡張
 
PFNのオンプレ計算機クラスタの取り組み_第55回情報科学若手の会
PFNのオンプレ計算機クラスタの取り組み_第55回情報科学若手の会PFNのオンプレ計算機クラスタの取り組み_第55回情報科学若手の会
PFNのオンプレ計算機クラスタの取り組み_第55回情報科学若手の会
 
続・PFN のオンプレML基盤の取り組み / オンプレML基盤 on Kubernetes 〜PFN、ヤフー〜 #2
続・PFN のオンプレML基盤の取り組み / オンプレML基盤 on Kubernetes 〜PFN、ヤフー〜 #2続・PFN のオンプレML基盤の取り組み / オンプレML基盤 on Kubernetes 〜PFN、ヤフー〜 #2
続・PFN のオンプレML基盤の取り組み / オンプレML基盤 on Kubernetes 〜PFN、ヤフー〜 #2
 
Kubernetes Service Account As Multi-Cloud Identity / Cloud Native Security Co...
Kubernetes Service Account As Multi-Cloud Identity / Cloud Native Security Co...Kubernetes Service Account As Multi-Cloud Identity / Cloud Native Security Co...
Kubernetes Service Account As Multi-Cloud Identity / Cloud Native Security Co...
 
KubeCon + CloudNativeCon Europe 2022 Recap / Kubernetes Meetup Tokyo #51 / #k...
KubeCon + CloudNativeCon Europe 2022 Recap / Kubernetes Meetup Tokyo #51 / #k...KubeCon + CloudNativeCon Europe 2022 Recap / Kubernetes Meetup Tokyo #51 / #k...
KubeCon + CloudNativeCon Europe 2022 Recap / Kubernetes Meetup Tokyo #51 / #k...
 
KubeCon + CloudNativeCon Europe 2022 Recap - Batch/HPCの潮流とScheduler拡張事例 / Kub...
KubeCon + CloudNativeCon Europe 2022 Recap - Batch/HPCの潮流とScheduler拡張事例 / Kub...KubeCon + CloudNativeCon Europe 2022 Recap - Batch/HPCの潮流とScheduler拡張事例 / Kub...
KubeCon + CloudNativeCon Europe 2022 Recap - Batch/HPCの潮流とScheduler拡張事例 / Kub...
 
独断と偏見で選んだ Kubernetes 1.24 の注目機能と今後! / Kubernetes Meetup Tokyo 50
独断と偏見で選んだ Kubernetes 1.24 の注目機能と今後! / Kubernetes Meetup Tokyo 50独断と偏見で選んだ Kubernetes 1.24 の注目機能と今後! / Kubernetes Meetup Tokyo 50
独断と偏見で選んだ Kubernetes 1.24 の注目機能と今後! / Kubernetes Meetup Tokyo 50
 

Recently uploaded

Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
costume and set research powerpoint presentation
costume and set research powerpoint presentationcostume and set research powerpoint presentation
costume and set research powerpoint presentationphoebematthew05
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDGMarianaLemus7
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 

Recently uploaded (20)

Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
costume and set research powerpoint presentation
costume and set research powerpoint presentationcostume and set research powerpoint presentation
costume and set research powerpoint presentation
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort ServiceHot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptxVulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDG
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 

A Brief History of Object Detection / Tommi Kerola

  • 1. A Brief History of Object Detection From Haar-like features to losing anchors @ PFN Day #4 July 11, 2019 Tommi Kerola
  • 2. A Brief History of Object Detection 1 Motivation 2 History Before Deep Learning 3 Two-stage Methods 4 Single-shot Methods 5 Anchor-free Methods 6 Problems and Summary 2/41
  • 3. Motivation • Localizing objects is a crucial task for using computer vision in the real world. • Autonomous driving, personal robots, industrial robotics, . . . . 3/41 (Figs. from towardsdatascience.com, preferred.jp, robotics.org)
  • 4. Problem definition • Given an input image, predict the locations of a certain class of objects in the image. • Locations are usually represented using bounding boxes. • Evaluation using IoU (intersection over union). IoU A ∩ B A ∪ B , typically, IoU ≥ 0.5 =⇒ match with GT (1) Ground-truth class: Horse Ground-truth location: Predicted class: Horse Predicted location:
  • 5. Problem definition • Given an input image, predict the locations of a certain class of objects in the image. • Locations are usually represented using bounding boxes. • Evaluation using IoU (intersection over union). IoU A ∩ B A ∪ B , typically, IoU ≥ 0.5 =⇒ match with GT (1) Ground-truth class: Horse Ground-truth location: Predicted class: Horse Predicted location: 4/41 (Fig. from telegraph.co.uk)
  • 6. Problem definition • Given an input image, predict the locations of a certain class of objects in the image. • Locations are usually represented using bounding boxes. • Evaluation using IoU (intersection over union). IoU A ∩ B A ∪ B , typically, IoU ≥ 0.5 =⇒ match with GT (1) IoU = 0.584 Ground-truth class: Horse Ground-truth location: Predicted class: Horse Predicted location: 4/41 (Fig. from telegraph.co.uk)
  • 7. Problem definition • Given an input image, predict the locations of a certain class of objects in the image. • Locations are usually represented using bounding boxes. • Evaluation using IoU (intersection over union). IoU A ∩ B A ∪ B , typically, IoU ≥ 0.5 =⇒ match with GT (1) Ground-truth class: Horse Ground-truth location: Predicted class: Horse Predicted location: 4/41 (Fig. from telegraph.co.uk)
  • 8. Problem definition • Given an input image, predict the locations of a certain class of objects in the image. • Locations are usually represented using bounding boxes. • Evaluation using IoU (intersection over union). IoU A ∩ B A ∪ B , typically, IoU ≥ 0.5 =⇒ match with GT (1) IoU = 0.5 Ground-truth class: Horse Ground-truth location: Predicted class: Horse Predicted location: • Note: IoU is not a perfect metric, but the current de-facto standard. 4/41 (Fig. from telegraph.co.uk)
  • 9. Target Audience People who • Know the basics of deep learning and CNNs. • Have not implemented a DL-based object detector before. • But want to find out how! 5/41
  • 10. Purpose of this Talk Region proposals Feature calculation Region classification for each region object or not? • Understand general object detection pipeline (above). • Explain the history of object detection up until recent research. • Illustrate various types of recent object detectors and their merits. 6/41
  • 11. 1 Motivation 2 History Before Deep Learning 3 Two-stage Methods 4 Single-shot Methods 5 Anchor-free Methods 6 Problems and Summary 7/41
  • 12. History Before Deep Learning • Early object detectors were based on handcrafted features. • Sliding window classifier, check if feature response is strong enough, if so: output detection. • We will review a few representative features pre-deep learning. • Haar-like features • Histograms of Oriented Gradients • Deformable Part Models 8/41 (Fig. from researchgate.net)
  • 13. Haar-like features [Viola and Jones(2001)] • Hand-crafted weak features, calculate in sliding window, use boosted classifier like AdaBoost. Name comes from similarity to Haar wavelets. • e.g. “Is the middle part of the image darker than the outer parts?” “Nose-detector” “Eye-detector” 9/41 (Fig. from wikipedia.org)
  • 14. Necessary Post-Processing: Non-maximum Suppression • Sliding window classifiers yield lots of correlated detections. • Non-maximum suppression (NMS) is a simple, greedy algorithm for turning these into a single detection per object (ideally). • (Recently, improvements upon NMS exist [Zhou et al.(2017), Bodla et al.(2017)]) 10/41 (Fig. from pyimagesearch.com)
  • 15. Histograms of Oriented Gradients [Dalal and Triggs(2005)] • Computes histogram of gradient orientation (HOG feature) over sub-image blocks. 11/41 (Fig. from [Dalal and Triggs(2005)])
  • 16. Deformable Part Models [Felzenszwalb et al.(2008)] • Learn the relationships between HOG features of object parts via a latent SVM. 12/41 (Fig. from [Felzenszwalb et al.(2008)])
  • 17. Better Region Proposals: Selective Search [Uijlings et al.(2013)] • Instead of sliding window, propose regions that have high “objectness”. • Oversegment image and merge regions hierarchically by color, texture, size and shape. 13/41 (Fig. from [Uijlings et al.(2013)])
  • 18. Deep Learning Era [Krizhevsky et al.(2012)] • Starting with AlexNet in 2012, deep learning methods significantly improved image classification. • The same holds for object detection: > 30 pp. gap. Recent methods give ∼ 150% improvement. deep learning gap 14/41 (Fig. from [Zou et al.(2019)])
  • 19. 1 Motivation 2 History Before Deep Learning 3 Two-stage Methods 4 Single-shot Methods 5 Anchor-free Methods 6 Problems and Summary 15/41
  • 20. Two-stage Methods Image Region proposals Classify regions Detections • As implied, operates in two serial stages: • Generate region proposals (instead of sliding window). • Classify each proposed region, if feature response strong enough, output detection. • When to use? Two-stage methods are accurate but computationally heavy. • We will look at three historically representative methods: • R-CNN • Fast R-CNN • Faster R-CNN • (Further reading: [Li et al.(2019), Lin et al.(2017a), Lu et al.(2019), Singh et al.(2018)]) 16/41
  • 21. R-CNN [Girshick et al.(2014)] • Extract proposals via selective search [Uijlings et al.(2013)]. • Extract CNN features. • Classify with an SVM. 17/41 (Fig. from [Girshick et al.(2014)])
  • 22. Fast R-CNN [Girshick(2015)] • Extract proposals via selective search. • Extract features and classify with CNN. 18/41 (Fig. from [Girshick(2015)])
  • 23. Faster R-CNN [Ren et al.(2015)] • Proposes novel RPN: Region Proposal Network – no more selective search. • End2End: Extract proposals, features and classify with CNN. • Note: PFDet [Akiba et al.(2018)] is based on this type of detector. 19/41 (Fig. from [Ren et al.(2015)])
  • 24. Interlude: What is an anchor? • Human-designed prior on object shapes. • Designing anchors is deciding which and how many prior anchor shapes to use. • Anchors are typically selected to be close in shape to typical objects. Upright rectangle for pedestrian, lying rectangle for car, etc. 20/41 (Fig. from [Liu et al.(2016)])
  • 25. 1 Motivation 2 History Before Deep Learning 3 Two-stage Methods 4 Single-shot Methods 5 Anchor-free Methods 6 Problems and Summary 21/41
  • 26. Single-shot Methods Image Classify and regress bboxes Detections • Unlike Faster R-CNN et al., only has a single stage. • Can be thought of as that the RPN both localizes and classifies the object. • When to use? Single-shot methods are generally fast and have moderate accuracy. • We will look at two representative methods: • SSD: Single Shot MultiBox Detector • YOLO: You Only Look Once • (Out of scope, but recommended reading: [Zhang et al.(2018), Lin et al.(2017b)]) 22/41
  • 27. SSD: Single Shot MultiBox Detector [Liu et al.(2016)] • At each feature scale, predict class and bounding box regression (via Huber loss). Linear scale target center x ˆgcx j Log scale target width ˆgw j 23/41 (Fig. from [Liu et al.(2016)])
  • 28. YOLO: You Only Look Once [Redmon et al.(2016)] • Unlike SSD, much simpler. Just a single scale of features and fully connected layers. • YOLOv2 [Redmon and Farhadi(2017)] makes the method more like SSD, removes fully connected layers (+ other tricks). 24/41 (Fig. from [Liu et al.(2016)])
  • 29. 1 Motivation 2 History Before Deep Learning 3 Two-stage Methods 4 Single-shot Methods 5 Anchor-free Methods 6 Problems and Summary 25/41
  • 30. Anchor-free Methods • Recently, researchers have tried to remove anchors from object detection methods. • Why are anchors bad? • Human-designed prior. (cf. hand-crafted featured vs deep learning on ILSVRC in 2012) • Anchor-free methods provide more flexibility to leverage large-scale data. • Why are we able to remove the anchors? Mainly thanks to progress in • deep keypoint detection research [Newell et al.(2017), Newell and Deng(2017)]. • detection loss formulation, such as focal loss [Lin et al.(2017b)]. • Unlike previous architectures, more similar to semantic segmentation. • When to use? Instead of single-shot methods for fast and accurate inference. • If you were thinking of using SSD for your project – think again. • Caveat: Two-stage methods tend to still be more accurate. 26/41
  • 31. CornerNet: Detecting Objects as Paired Keypoints [Law and Deng(2018)] • First competitive anchor-free method – detect corners of objects. • Key contributions are corner pooling and improving focal loss for keypoint detection: 27/41 (Fig. from [Law and Deng(2018)])
  • 32. CenterNet(s): Objects as Points • Found there are actually two CenterNet papers: posted on arXiv 1 day apart. • Apr 16: Predict center, regress object size. • Apr 17: Extend CornerNet to take both corners and centers into account. 28/41 (Figs. from [Zhou et al.(2019b), Duan et al.(2019)])
  • 33. FCOS: Fully Convolutional One-Stage Object Detection [Tian et al.(2019)] • Similar to the Apr 16 CenterNet – predict center, regress object size. 29/41 (Figs. from [Tian et al.(2019)])
  • 34. Bottom-up Object Detection by Grouping Extreme and Center Points [Zhou et al.(2019a)] • Detect not only bboxes, but convex octagons enclosing objects. Same author as Apr 16 CenterNet. 30/41 (Figs. from [Zhou et al.(2019a)])
  • 35. 1 Motivation 2 History Before Deep Learning 3 Two-stage Methods 4 Single-shot Methods 5 Anchor-free Methods 6 Problems and Summary 31/41
  • 36. Problems for Current Object Detection Methods • Hand-crafted post-processing (NMS) is still required due to correlated detections • Attempts at “truly” end-to-end object detectors exist, but not practical [Hu et al.(2018), Hosang et al.(2017)]. • Bounding box representation not optimal – remember the horse? 32/41 (Fig. from telegraph.co.uk)
  • 37. Problems for Current Object Detection Methods • Hand-crafted post-processing (NMS) is still required due to correlated detections • Attempts at “truly” end-to-end object detectors exist, but not practical [Hu et al.(2018), Hosang et al.(2017)]. • Bounding box representation not optimal – remember the horse? • You can miss half the horse – and still considered a correct detection! 32/41 (Fig. from telegraph.co.uk)
  • 38. Problems – and solutions? • Bounding box representation and hand-crafted NMS not optimal • New field, panoptic segmentation, tackles this [Kirillov et al.(2019)]. • No need for NMS, as the task requires explaining each pixel, including background classes. • Proposes new metric PQ: Panoptic Quality for replacing IoU metric. 33/41 (Fig. from [Kirillov et al.(2019)])
  • 39. Summary • Deep learning has resulted in a significant (∼ 150%) improvement in object detection. • Three main architectures of deep object detectors • Two-stage • Single-shot • Anchor-free (recent research!) • Issues • Hand-crafted post-processing (NMS) is still required due to correlated detections. • Issue of using bounding boxes to represent objects remains. • → Recent new field of panoptic segmentation aims to mitigate these issues. • Takeaway message: • Do not use SSD! There are much better methods available for your project. • Anchor-free methods are promising for fast and accurate object detection. 34/41
  • 40. References I T. Akiba et al. Pfdet: 2nd place solution to open images challenge 2018 object detection track. arXiv preprint arXiv:1809.00778, 2018. N. Bodla et al. Soft-nms–improving object detection with one line of code. In Proceedings of the IEEE International Conference on Computer Vision, pages 5561–5569, 2017. N. Dalal and B. Triggs. Histograms of oriented gradients for human detection. In International Conference on computer vision & Pattern Recognition (CVPR’05), volume 1, pages 886–893. IEEE Computer Society, 2005. K. Duan et al. Centernet: Object detection with keypoint triplets. arXiv preprint arXiv:1904.08189, 2019. P. Felzenszwalb et al. A discriminatively trained, multiscale, deformable part model. In CVPR, 2008. R. Girshick. Fast r-cnn. In Proceedings of the IEEE international conference on computer vision, pages 1440–1448, 2015. 35/41
  • 41. References II R. Girshick et al. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 580–587, 2014. J. Hosang et al. Learning non-maximum suppression. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 4507–4515, 2017. H. Hu et al. Relation networks for object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 3588–3597, 2018. A. Kirillov et al. Panoptic segmentation. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2019. A. Krizhevsky et al. Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems, pages 1097–1105, 2012. H. Law and J. Deng. Cornernet: Detecting objects as paired keypoints. In Proceedings of the European Conference on Computer Vision (ECCV), pages 734–750, 2018. 36/41
  • 42. References III Y. Li et al. Scale-aware trident networks for object detection. arXiv preprint arXiv:1901.01892, 2019. T.-Y. Lin et al. Feature pyramid networks for object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2117–2125, 2017a. T.-Y. Lin et al. Focal loss for dense object detection. In Proceedings of the IEEE international conference on computer vision, pages 2980–2988, 2017b. W. Liu et al. Ssd: Single shot multibox detector. In European conference on computer vision, pages 21–37. Springer, 2016. X. Lu et al. Grid r-cnn. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2019. A. Newell and J. Deng. Pixels to graphs by associative embedding. In Advances in neural information processing systems, pages 2171–2180, 2017. 37/41
  • 43. References IV A. Newell et al. Associative embedding: End-to-end learning for joint detection and grouping. In Advances in Neural Information Processing Systems, pages 2277–2287, 2017. J. Redmon and A. Farhadi. Yolo9000: better, faster, stronger. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 7263–7271, 2017. J. Redmon et al. You only look once: Unified, real-time object detection. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 779–788, 2016. S. Ren et al. Faster r-cnn: Towards real-time object detection with region proposal networks. In Advances in neural information processing systems, pages 91–99, 2015. B. Singh et al. Sniper: Efficient multi-scale training. In Advances in Neural Information Processing Systems, pages 9310–9320, 2018. Z. Tian et al. Fcos: Fully convolutional one-stage object detection. arXiv preprint arXiv:1904.01355, 2019. 38/41
  • 44. References V J. Uijlings et al. Selective search for object recognition. International journal of computer vision, 104(2):154–171, 2013. P. Viola and M. Jones. Rapid object detection using a boosted cascade of simple features. CVPR, 2001. S. Zhang et al. Single-shot refinement neural network for object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 4203–4212, 2018. H. Zhou et al. Cad: Scale invariant framework for real-time object detection. In The IEEE International Conference on Computer Vision (ICCV) Workshops, Oct 2017. X. Zhou et al. Bottom-up object detection by grouping extreme and center points. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2019a. X. Zhou et al. Objects as points. arXiv preprint arXiv:1904.07850, 2019b. 39/41
  • 45. References VI Z. Zou et al. Object detection in 20 years: A survey. arXiv preprint arXiv:1905.05055, 2019. 40/41
  • 46. Thank you for listening! Questions? 41/41