Advanced Deep Learning based Object Detection Methods
Improving Object Detection With One Line of Code
● Non-Maximum Suppression is a greedy
process.
○ It worked well enough in 2007 but it doesn’t
anymore.
● High-scoring detections can be suppressed just like low-scoring ones.
○ Overlap with a stronger detection is the only criterion.
● Should one detection completely suppress
another detection, or simply reduce its
confidence?
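A minimal NumPy sketch of the greedy procedure described above, assuming boxes are given as [x1, y1, x2, y2]. The function names and the 0.5 threshold are illustrative choices, not taken from the slides.

```python
import numpy as np

def iou_one_to_many(box, boxes):
    """IoU between one box and an (N, 4) array of boxes, all as [x1, y1, x2, y2]."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Return indices of the kept boxes, highest score first."""
    order = np.argsort(-scores)  # strongest detection first
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        overlaps = iou_one_to_many(boxes[best], boxes[rest])
        # Overlap is the only criterion: the neighbour's own score is ignored.
        order = rest[overlaps < iou_threshold]
    return keep

# Toy input: the second box overlaps the first heavily, the third is far away.
boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])
kept = greedy_nms(boxes, scores)
```

In this toy input the second box scores 0.8 yet is discarded outright, which is the behaviour the last bullet questions.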
Improving Object Detection With One Line of Code
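● NMS: s_i ← s_i if iou(M, b_i) < N_t, else 0 (M: current highest-scoring box, b_i: another box with score s_i).
● Linear Soft-NMS: s_i ← s_i if iou(M, b_i) < N_t, else s_i · (1 − iou(M, b_i)).
○ Linear Soft-NMS is not continuous in the overlap: a sudden penalty is applied once the NMS threshold N_t is reached.
○ Instead, we can use a continuous function:
● Gaussian Soft-NMS: s_i ← s_i · exp(−iou(M, b_i)² / σ).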
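The three rescoring rules can be sketched in one loop. This is an illustrative NumPy version, not the authors' code: the function name `soft_nms`, the default thresholds, and the final score cut-off are assumptions made for the example.

```python
import numpy as np

def iou_one_to_many(box, boxes):
    """IoU between one box and an (N, 4) array of boxes, all as [x1, y1, x2, y2]."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def soft_nms(boxes, scores, method="gaussian", iou_threshold=0.5,
             sigma=0.5, score_threshold=0.001):
    """Return (kept indices in selection order, rescored copy of scores)."""
    scores = scores.astype(float).copy()
    remaining = list(range(len(scores)))
    keep = []
    while remaining:
        best = max(remaining, key=lambda i: scores[i])
        remaining.remove(best)
        keep.append(best)
        if not remaining:
            break
        idx = np.array(remaining)
        overlaps = iou_one_to_many(boxes[best], boxes[idx])
        if method == "hard":        # classic NMS: zero the score above the threshold
            weight = np.where(overlaps >= iou_threshold, 0.0, 1.0)
        elif method == "linear":    # decay proportional to overlap, above the threshold
            weight = np.where(overlaps >= iou_threshold, 1.0 - overlaps, 1.0)
        elif method == "gaussian":  # continuous decay, no threshold needed
            weight = np.exp(-(overlaps ** 2) / sigma)
        else:
            raise ValueError(f"unknown method: {method}")
        scores[idx] *= weight
        # Drop boxes whose score has decayed below the cut-off.
        remaining = [i for i in remaining if scores[i] > score_threshold]
    return keep, scores

boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])
keep_hard, scores_hard = soft_nms(boxes, scores, method="hard")
keep_lin, scores_lin = soft_nms(boxes, scores, method="linear")
keep_gauss, scores_gauss = soft_nms(boxes, scores, method="gaussian")
```

With the hard rule the overlapping second box disappears; with the linear and Gaussian rules it survives with a reduced score and is simply ranked below the distant third box.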
Learning Non-Maximum Suppression
● Object detectors are mostly trained
end-to-end, except for the NMS.
○ NMS is still fully hand-crafted, and forces a
trade-off between recall and precision.
● Training loss is not evaluation loss.
○ Training is performed without NMS.
○ During evaluation, multiple detections of the same object count as false positives.
● Instead, train the network to include the
suppression process.
○ Only output one bounding box per object.
○ Learn how to handle close objects.
Learning Non-Maximum Suppression
● Additional blocks that:
○ Encode pairwise information.
○ For each detection, pool information from all
pairings.
○ Update feature vector.
○ Repeat.
● New loss:
○ Only one positive candidate per object.
○ This replaces the current practice of treating every detection with IoU > 50% as a positive.
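The difference between the two labelling schemes can be sketched as follows. This is a hedged illustration, assuming a benchmark-style matching in which detections are visited in descending score order and each ground-truth box can be claimed only once; the function names are hypothetical.

```python
import numpy as np

def iou_matrix(dets, gts):
    """Pairwise IoU between (N, 4) detections and (M, 4) ground-truth boxes."""
    x1 = np.maximum(dets[:, None, 0], gts[None, :, 0])
    y1 = np.maximum(dets[:, None, 1], gts[None, :, 1])
    x2 = np.minimum(dets[:, None, 2], gts[None, :, 2])
    y2 = np.minimum(dets[:, None, 3], gts[None, :, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_d = (dets[:, 2] - dets[:, 0]) * (dets[:, 3] - dets[:, 1])
    area_g = (gts[:, 2] - gts[:, 0]) * (gts[:, 3] - gts[:, 1])
    return inter / (area_d[:, None] + area_g[None, :] - inter)

def all_overlapping_positive(dets, gts, iou_threshold=0.5):
    """Common practice: every detection overlapping an object enough is a positive."""
    return (iou_matrix(dets, gts).max(axis=1) > iou_threshold).astype(int)

def one_positive_per_object(dets, scores, gts, iou_threshold=0.5):
    """Each object yields at most one positive: the best-scoring detection that matches it."""
    labels = np.zeros(len(dets), dtype=int)
    ious = iou_matrix(dets, gts)
    taken = np.zeros(len(gts), dtype=bool)
    for d in np.argsort(-scores):  # highest score gets first pick
        candidates = np.where((ious[d] > iou_threshold) & ~taken)[0]
        if candidates.size:
            g = candidates[np.argmax(ious[d, candidates])]
            taken[g] = True
            labels[d] = 1
    return labels

dets = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])
gts = np.array([[0, 0, 10, 10]], dtype=float)
labels_all = all_overlapping_positive(dets, gts)
labels_one = one_positive_per_object(dets, scores, gts)
```

Under the second scheme the duplicate detection becomes a negative training example, so the network is pushed to lower its score itself.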
Multi-Scale Object Detection
● Multi-scale object detection using an image pyramid
○ Predict different scales by applying the same model at different image resolutions.
● Classic method.
● Also used in OverFeat.
● Slow: requires multiple evaluations of the same model.
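A rough sketch of the image-pyramid approach, assuming a hypothetical `detector` callable that returns boxes and scores for one image. The nearest-neighbour resize and the scale set are placeholders chosen to keep the example dependency-free.

```python
import numpy as np

def resize_nearest(image, scale):
    """Nearest-neighbour resize of an (H, W) or (H, W, C) array."""
    h, w = image.shape[:2]
    new_h = max(1, int(round(h * scale)))
    new_w = max(1, int(round(w * scale)))
    rows = np.minimum((np.arange(new_h) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / scale).astype(int), w - 1)
    return image[rows][:, cols]

def detect_with_image_pyramid(image, detector, scales=(0.5, 1.0, 2.0)):
    """Run the same detector once per scale and map boxes back to the original frame."""
    all_boxes, all_scores = [], []
    for s in scales:
        # One full model evaluation per pyramid level: this is why the method is slow.
        boxes, scores = detector(resize_nearest(image, s))
        all_boxes.append(np.asarray(boxes, dtype=float).reshape(-1, 4) / s)
        all_scores.append(np.asarray(scores, dtype=float))
    return np.concatenate(all_boxes), np.concatenate(all_scores)

# Toy stand-in for a real model: reports one box covering the whole input.
calls = []
def toy_detector(img):
    h, w = img.shape[:2]
    calls.append((h, w))
    return [[0, 0, w, h]], [1.0]

image = np.zeros((40, 60))
boxes, scores = detect_with_image_pyramid(image, toy_detector)
```

The merged detections would normally be passed through NMS afterwards, since the same object is found at several scales.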
Multi-Scale Object Detection
● Predict multiple object scales using a single feature map.
● Same as Faster R-CNN.
● Fast.
● Single model (same in training as in testing).
● Poor feature resolution for small objects.
Multi-Scale Object Detection
● Predict different object sizes at different feature scales.
● Same as SSD.
● Good feature resolution for small objects.
● But features in the shallow, high-resolution layers are much weaker than in deeper layers.
Feature Pyramid Network (FPN)
● Single model (same in training as in testing).
● Good feature resolution for small objects.
● Strong features in all layers.
● Almost no overhead over SSD (= fast).
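A simplified NumPy sketch of how FPN obtains strong features at every resolution: each bottom-up map passes through a 1x1 lateral convolution and is added to the upsampled coarser level. The helper names and random weights are assumptions for illustration, and the 3x3 smoothing convolution that FPN applies to each merged map is omitted.

```python
import numpy as np

def conv1x1(x, weight):
    """1x1 convolution on a (C_in, H, W) map with a (C_out, C_in) weight matrix."""
    return np.einsum("oc,chw->ohw", weight, x)

def upsample2x(x):
    """Nearest-neighbour 2x upsampling of a (C, H, W) map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def fpn_top_down(bottom_up, lateral_weights):
    """bottom_up: backbone maps, finest first, each half the resolution of the previous one."""
    # Start from the coarsest, semantically strongest map.
    pyramid = [conv1x1(bottom_up[-1], lateral_weights[-1])]
    for feat, w in zip(reversed(bottom_up[:-1]), reversed(lateral_weights[:-1])):
        # Lateral connection (1x1 conv) plus top-down enrichment (upsampled coarser level).
        pyramid.append(conv1x1(feat, w) + upsample2x(pyramid[-1]))
    return pyramid[::-1]  # finest first, same order as the input

rng = np.random.default_rng(0)
bottom_up = [rng.standard_normal((c, s, s)) for c, s in [(8, 16), (16, 8), (32, 4)]]
lateral_weights = [rng.standard_normal((4, c)) for c in (8, 16, 32)]
pyramid = fpn_top_down(bottom_up, lateral_weights)
```

All output levels share the same channel count, so a single detection head can be applied to every level.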
Feature Pyramid Network (FPN)
● How important is top-down enrichment?
● How important are lateral connections?
● How important are pyramid representations?
Focal Loss for Dense Object Detection
● Can we train a single-stage detector to be as accurate as two-stage detectors?
● Contributions:
○ RetinaNet: single-stage object detector based on an FPN backbone.
○ New loss.
Focal Loss for Dense Object Detection
● Class imbalance is an important issue for object detection.
● Previous solutions:
○ Random resampling at 1:3 ratio.
○ Hard negative resampling at 1:3 ratio.
● Both solutions mean that at each step only a few samples actually matter to the loss function.
● Instead, include all samples but use a different weight for each class.
○ Regular cross entropy: CE(p_t) = −log(p_t), where p_t = p if y = 1, else 1 − p.
○ Weighted cross entropy: CE(p_t) = −α_t log(p_t), with class weight α_t = α if y = 1, else 1 − α.
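The two losses can be written in a few lines of NumPy. This is a sketch for binary labels; `p` is the predicted foreground probability, `y` is 1 for foreground and 0 for background, and the value of `alpha` in the example is an arbitrary illustration rather than a recommended setting.

```python
import numpy as np

def cross_entropy(p, y):
    """Per-sample binary cross entropy: -log of the probability given to the true class."""
    p_t = np.where(y == 1, p, 1.0 - p)
    return -np.log(p_t)

def weighted_cross_entropy(p, y, alpha):
    """Same loss, scaled by alpha for foreground samples and (1 - alpha) for background."""
    alpha_t = np.where(y == 1, alpha, 1.0 - alpha)
    return alpha_t * cross_entropy(p, y)

p = np.array([0.9, 0.1, 0.1, 0.1])
y = np.array([1, 0, 0, 0])          # one foreground sample, three background samples
ce = cross_entropy(p, y)
wce = weighted_cross_entropy(p, y, alpha=0.75)
```

Every sample still contributes to the loss, but the weight depends only on the class, not on how well the sample is already classified.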
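Focal Loss for Dense Object Detection
● Using weighted CE as a baseline:
○ Can we do better?
○ Can we use a different weight for each sample?
● Focal loss: FL(p_t) = −α_t (1 − p_t)^γ log(p_t), where p_t is the probability assigned to the true class and γ ≥ 0 is the focusing parameter.
● Every sample is weighted according to its error.
○ We want to focus on samples which are misclassified.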
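A minimal NumPy sketch of the focal loss for binary labels, assuming `p` is the predicted foreground probability and `y` is 1 for foreground and 0 for background. The defaults γ = 2 and α = 0.25 are the values reported as working best in the focal loss paper.

```python
import numpy as np

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Per-sample focal loss: cross entropy scaled by alpha_t * (1 - p_t) ** gamma."""
    p_t = np.where(y == 1, p, 1.0 - p)              # probability of the true class
    alpha_t = np.where(y == 1, alpha, 1.0 - alpha)  # per-class weight
    modulating = (1.0 - p_t) ** gamma               # near 0 for easy samples, near 1 for hard ones
    return -alpha_t * modulating * np.log(p_t)

p = np.array([0.9, 0.1])   # the same foreground label, predicted well and predicted badly
y = np.array([1, 1])
easy_loss, hard_loss = focal_loss(p, y)
plain_weighted = focal_loss(p, y, gamma=0.0)   # gamma = 0 falls back to weighted cross entropy
```

With γ = 2 the well-classified sample is down-weighted by a factor of 0.01 and the badly classified one by only 0.81, so the total loss is dominated by hard samples.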
Focal Loss for Dense Object Detection
● Different parameters for RetinaNet
● Comparison with online hard negative mining
● Accuracy/speed trade-offs
● Benchmark results
Also Read:
Deformable Convolutional Networks
https://arxiv.org/abs/1703.06211
YouTube Videos
● CS231n
○ Lecture 11 - Detection and segmentation https://youtu.be/nDPWywWRIRo
● Deep Learning for Objects and Scenes (CVPR 2017 Workshop)
○ Lecture 1: Learning Deep Representations for Visual Recognition, by Kaiming He
https://youtu.be/jHv37mKAhV4
○ Lecture 2: Deep Learning for Instance-level Object Understanding, by Ross Girshick
https://youtu.be/jHv37mKAhV4?t=39m4s
Looking for brilliant researchers
cv@brodmann17.com / amir@brodmann17.com
Computer Vision Tasks
Source: CS231n Object detection http://cs231n.stanford.edu/slides/2016/winter1516_lecture8.pdf
Mask R-CNN
● Instance segmentation with pose
estimation for people.
● Extends Faster R-CNN by adding a new branch for the instance mask task.
● Pose estimation can be added by adding one more branch.
● SOTA accuracy on detection, segmentation and pose estimation at 5 FPS on GPU.
● https://arxiv.org/abs/1703.06870
● Girshick won the Young Researcher Award.
Mask R-CNN
● RoiPool
○ Quantization breaks pixel-to-pixel alignment.
○ Too coarse for the fine spatial information that mask prediction requires.
● RoiAlign
○ Bilinearly sample the proposal region and avoid the quantization.
○ Smoothly normalize features and predictions into a coordinate frame free of scale and aspect ratio.
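The core of RoiAlign can be sketched as bilinear sampling at continuous coordinates. The version below is a simplified single-channel illustration: the function names, the (y1, x1, y2, x2) RoI convention, the choice of averaging, and the sample layout are assumptions made for this example.

```python
import numpy as np

def bilinear_sample(feature, y, x):
    """Bilinearly interpolate a 2-D feature map at a continuous (y, x) location."""
    h, w = feature.shape
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = (1 - dx) * feature[y0, x0] + dx * feature[y0, x1]
    bottom = (1 - dx) * feature[y1, x0] + dx * feature[y1, x1]
    return (1 - dy) * top + dy * bottom

def roi_align(feature, roi, output_size=2, samples_per_bin=2):
    """Pool a float-valued RoI (y1, x1, y2, x2) into an output_size x output_size grid."""
    y1, x1, y2, x2 = roi                      # kept as floats: no rounding anywhere
    bin_h = (y2 - y1) / output_size
    bin_w = (x2 - x1) / output_size
    out = np.zeros((output_size, output_size))
    for i in range(output_size):
        for j in range(output_size):
            vals = []
            for a in range(samples_per_bin):
                for b in range(samples_per_bin):
                    # Regularly spaced sample points inside bin (i, j).
                    sy = y1 + (i + (a + 0.5) / samples_per_bin) * bin_h
                    sx = x1 + (j + (b + 0.5) / samples_per_bin) * bin_w
                    vals.append(bilinear_sample(feature, sy, sx))
            out[i, j] = np.mean(vals)
    return out

# A horizontal ramp (value == x coordinate) makes the sub-pixel offset visible.
feature = np.tile(np.arange(8.0), (8, 1))
pooled = roi_align(feature, roi=(1.2, 1.2, 5.2, 5.2))
```

On the ramp, the pooled values carry the 0.2 sub-pixel offset of the RoI, which a RoiPool-style rounding of the RoI to integer coordinates would discard.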
Mask R-CNN
● Backbone architecture
○ ResNet
○ ResNeXt
○ FPN
● Mask representation
○ FC vs. Convolutional
○ Multinomial vs. Independent Masks: softmax
vs. sigmoid
○ Class-Specific vs. Class-Agnostic Masks:
almost same accuracy
● Multi-task learning
○ Mask task improves object detection accuracy.
○ Keypoint task reduces object detection
accuracy.
Mask R-CNN
● Pose estimation
○ Simply add an additional branch.
○ Model a keypoint’s location as a one-hot mask,
and adopt Mask R-CNN to predict K masks.
○ Experiments are mainly to demonstrate the
generality of the Mask R-CNN framework.
○ RoiAlign improves this task’s accuracy as well.
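A small sketch of the one-hot keypoint target and a matching loss, assuming each keypoint mask is trained with cross entropy over a softmax across all mask locations. The mask size and the function names are illustrative.

```python
import numpy as np

def keypoint_target(y, x, size):
    """One-hot mask: a single foreground pixel at the keypoint location."""
    target = np.zeros((size, size))
    target[y, x] = 1.0
    return target

def keypoint_loss(logits, target):
    """Cross entropy over a softmax taken across all size * size mask locations."""
    flat = logits.reshape(-1)
    flat = flat - flat.max()                        # numerical stability
    log_probs = flat - np.log(np.exp(flat).sum())   # log-softmax over locations
    return -log_probs[np.argmax(target)]            # argmax indexes the flattened mask

size, num_keypoints = 4, 3
targets = [keypoint_target(1, 2, size), keypoint_target(0, 0, size), keypoint_target(3, 3, size)]

# K masks, one per keypoint type; all-zero logits mean a uniform guess over locations.
uniform_logits = np.zeros((num_keypoints, size, size))
uniform_losses = [keypoint_loss(uniform_logits[k], targets[k]) for k in range(num_keypoints)]

# A confident, correct prediction for the first keypoint drives its loss toward zero.
confident_logits = 20.0 * targets[0]
confident_loss = keypoint_loss(confident_logits, targets[0])
```

Treating the mask as one softmax over locations pushes the branch to select a single pixel per keypoint rather than a region.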

More Related Content

What's hot

Transfer Learning and Fine-tuning Deep Neural Networks
 Transfer Learning and Fine-tuning Deep Neural Networks Transfer Learning and Fine-tuning Deep Neural Networks
Transfer Learning and Fine-tuning Deep Neural NetworksPyData
 
Object Detection using Deep Neural Networks
Object Detection using Deep Neural NetworksObject Detection using Deep Neural Networks
Object Detection using Deep Neural NetworksUsman Qayyum
 
Introduction to Generative Adversarial Networks (GANs)
Introduction to Generative Adversarial Networks (GANs)Introduction to Generative Adversarial Networks (GANs)
Introduction to Generative Adversarial Networks (GANs)Appsilon Data Science
 
Deep learning for object detection
Deep learning for object detectionDeep learning for object detection
Deep learning for object detectionWenjing Chen
 
Object Detection Using R-CNN Deep Learning Framework
Object Detection Using R-CNN Deep Learning FrameworkObject Detection Using R-CNN Deep Learning Framework
Object Detection Using R-CNN Deep Learning FrameworkNader Karimi
 
Deep Learning’s Application in Radar Signal Data
Deep Learning’s Application in Radar Signal DataDeep Learning’s Application in Radar Signal Data
Deep Learning’s Application in Radar Signal DataYu Huang
 
Computer Science Thesis Defense
Computer Science Thesis DefenseComputer Science Thesis Defense
Computer Science Thesis Defensetompitkin
 
Introduction to object detection
Introduction to object detectionIntroduction to object detection
Introduction to object detectionBrodmann17
 
Human pose estimation with deep learning
Human pose estimation with deep learningHuman pose estimation with deep learning
Human pose estimation with deep learningengiyad95
 
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis taeseon ryu
 
Single Image Super Resolution Overview
Single Image Super Resolution OverviewSingle Image Super Resolution Overview
Single Image Super Resolution OverviewLEE HOSEONG
 
Machine Learning - Object Detection and Classification
Machine Learning - Object Detection and ClassificationMachine Learning - Object Detection and Classification
Machine Learning - Object Detection and ClassificationVikas Jain
 
A Heart Disease Prediction Model using Logistic Regression
A Heart Disease Prediction Model using Logistic RegressionA Heart Disease Prediction Model using Logistic Regression
A Heart Disease Prediction Model using Logistic Regressionijtsrd
 
Deep learning for real life applications
Deep learning for real life applicationsDeep learning for real life applications
Deep learning for real life applicationsAnas Arram, Ph.D
 
Faster R-CNN: Towards real-time object detection with region proposal network...
Faster R-CNN: Towards real-time object detection with region proposal network...Faster R-CNN: Towards real-time object detection with region proposal network...
Faster R-CNN: Towards real-time object detection with region proposal network...Universitat Politècnica de Catalunya
 
Camera-Based Road Lane Detection by Deep Learning II
Camera-Based Road Lane Detection by Deep Learning IICamera-Based Road Lane Detection by Deep Learning II
Camera-Based Road Lane Detection by Deep Learning IIYu Huang
 
CNN Attention Networks
CNN Attention NetworksCNN Attention Networks
CNN Attention NetworksTaeoh Kim
 
Fisheye/Omnidirectional View in Autonomous Driving IV
Fisheye/Omnidirectional View in Autonomous Driving IVFisheye/Omnidirectional View in Autonomous Driving IV
Fisheye/Omnidirectional View in Autonomous Driving IVYu Huang
 
Feature Extraction
Feature ExtractionFeature Extraction
Feature Extractionskylian
 

What's hot (20)

Transfer Learning and Fine-tuning Deep Neural Networks
 Transfer Learning and Fine-tuning Deep Neural Networks Transfer Learning and Fine-tuning Deep Neural Networks
Transfer Learning and Fine-tuning Deep Neural Networks
 
Object Detection using Deep Neural Networks
Object Detection using Deep Neural NetworksObject Detection using Deep Neural Networks
Object Detection using Deep Neural Networks
 
Introduction to Generative Adversarial Networks (GANs)
Introduction to Generative Adversarial Networks (GANs)Introduction to Generative Adversarial Networks (GANs)
Introduction to Generative Adversarial Networks (GANs)
 
Deep learning for object detection
Deep learning for object detectionDeep learning for object detection
Deep learning for object detection
 
Object Detection Using R-CNN Deep Learning Framework
Object Detection Using R-CNN Deep Learning FrameworkObject Detection Using R-CNN Deep Learning Framework
Object Detection Using R-CNN Deep Learning Framework
 
Deep Learning’s Application in Radar Signal Data
Deep Learning’s Application in Radar Signal DataDeep Learning’s Application in Radar Signal Data
Deep Learning’s Application in Radar Signal Data
 
Computer Science Thesis Defense
Computer Science Thesis DefenseComputer Science Thesis Defense
Computer Science Thesis Defense
 
Introduction to object detection
Introduction to object detectionIntroduction to object detection
Introduction to object detection
 
Introduction to Transformer Model
Introduction to Transformer ModelIntroduction to Transformer Model
Introduction to Transformer Model
 
Human pose estimation with deep learning
Human pose estimation with deep learningHuman pose estimation with deep learning
Human pose estimation with deep learning
 
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
 
Single Image Super Resolution Overview
Single Image Super Resolution OverviewSingle Image Super Resolution Overview
Single Image Super Resolution Overview
 
Machine Learning - Object Detection and Classification
Machine Learning - Object Detection and ClassificationMachine Learning - Object Detection and Classification
Machine Learning - Object Detection and Classification
 
A Heart Disease Prediction Model using Logistic Regression
A Heart Disease Prediction Model using Logistic RegressionA Heart Disease Prediction Model using Logistic Regression
A Heart Disease Prediction Model using Logistic Regression
 
Deep learning for real life applications
Deep learning for real life applicationsDeep learning for real life applications
Deep learning for real life applications
 
Faster R-CNN: Towards real-time object detection with region proposal network...
Faster R-CNN: Towards real-time object detection with region proposal network...Faster R-CNN: Towards real-time object detection with region proposal network...
Faster R-CNN: Towards real-time object detection with region proposal network...
 
Camera-Based Road Lane Detection by Deep Learning II
Camera-Based Road Lane Detection by Deep Learning IICamera-Based Road Lane Detection by Deep Learning II
Camera-Based Road Lane Detection by Deep Learning II
 
CNN Attention Networks
CNN Attention NetworksCNN Attention Networks
CNN Attention Networks
 
Fisheye/Omnidirectional View in Autonomous Driving IV
Fisheye/Omnidirectional View in Autonomous Driving IVFisheye/Omnidirectional View in Autonomous Driving IV
Fisheye/Omnidirectional View in Autonomous Driving IV
 
Feature Extraction
Feature ExtractionFeature Extraction
Feature Extraction
 

Similar to Advanced deep learning based object detection methods

Fast methods for deep learning based object detection
Fast methods for deep learning based object detectionFast methods for deep learning based object detection
Fast methods for deep learning based object detectionBrodmann17
 
Brodmann17 CVPR 2017 review - meetup slides
Brodmann17 CVPR 2017 review - meetup slides Brodmann17 CVPR 2017 review - meetup slides
Brodmann17 CVPR 2017 review - meetup slides Brodmann17
 
Cvpr 2017 Summary Meetup
Cvpr 2017 Summary MeetupCvpr 2017 Summary Meetup
Cvpr 2017 Summary MeetupAmir Alush
 
物件偵測與辨識技術
物件偵測與辨識技術物件偵測與辨識技術
物件偵測與辨識技術CHENHuiMei
 
Recent Progress on Object Detection_20170331
Recent Progress on Object Detection_20170331Recent Progress on Object Detection_20170331
Recent Progress on Object Detection_20170331Jihong Kang
 
Learning visual representation without human label
Learning visual representation without human labelLearning visual representation without human label
Learning visual representation without human labelKai-Wen Zhao
 
Object Detection Beyond Mask R-CNN and RetinaNet I
Object Detection Beyond Mask R-CNN and RetinaNet IObject Detection Beyond Mask R-CNN and RetinaNet I
Object Detection Beyond Mask R-CNN and RetinaNet IWanjin Yu
 
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)Universitat Politècnica de Catalunya
 
Александр Заричковый "Faster than real-time face detection"
Александр Заричковый "Faster than real-time face detection"Александр Заричковый "Faster than real-time face detection"
Александр Заричковый "Faster than real-time face detection"Fwdays
 
DLD meetup 2017, Efficient Deep Learning
DLD meetup 2017, Efficient Deep LearningDLD meetup 2017, Efficient Deep Learning
DLD meetup 2017, Efficient Deep LearningBrodmann17
 
最近の研究情勢についていくために - Deep Learningを中心に -
最近の研究情勢についていくために - Deep Learningを中心に - 最近の研究情勢についていくために - Deep Learningを中心に -
最近の研究情勢についていくために - Deep Learningを中心に - Hiroshi Fukui
 
Leveraging high level and low-level features for multimedia event detection.2...
Leveraging high level and low-level features for multimedia event detection.2...Leveraging high level and low-level features for multimedia event detection.2...
Leveraging high level and low-level features for multimedia event detection.2...Lu Jiang
 
Emily Denton - Unsupervised Learning of Disentangled Representations from Vid...
Emily Denton - Unsupervised Learning of Disentangled Representations from Vid...Emily Denton - Unsupervised Learning of Disentangled Representations from Vid...
Emily Denton - Unsupervised Learning of Disentangled Representations from Vid...Luba Elliott
 
[2A4]DeepLearningAtNAVER
[2A4]DeepLearningAtNAVER[2A4]DeepLearningAtNAVER
[2A4]DeepLearningAtNAVERNAVER D2
 
State of the art time-series analysis with deep learning by Javier Ordóñez at...
State of the art time-series analysis with deep learning by Javier Ordóñez at...State of the art time-series analysis with deep learning by Javier Ordóñez at...
State of the art time-series analysis with deep learning by Javier Ordóñez at...Big Data Spain
 
Deep Neural Networks Presentation
Deep Neural Networks PresentationDeep Neural Networks Presentation
Deep Neural Networks PresentationBohdan Klimenko
 
Cvpr 2018 papers review (efficient computing)
Cvpr 2018 papers review (efficient computing)Cvpr 2018 papers review (efficient computing)
Cvpr 2018 papers review (efficient computing)DonghyunKang12
 

Similar to Advanced deep learning based object detection methods (20)

Fast methods for deep learning based object detection
Fast methods for deep learning based object detectionFast methods for deep learning based object detection
Fast methods for deep learning based object detection
 
Brodmann17 CVPR 2017 review - meetup slides
Brodmann17 CVPR 2017 review - meetup slides Brodmann17 CVPR 2017 review - meetup slides
Brodmann17 CVPR 2017 review - meetup slides
 
Cvpr 2017 Summary Meetup
Cvpr 2017 Summary MeetupCvpr 2017 Summary Meetup
Cvpr 2017 Summary Meetup
 
物件偵測與辨識技術
物件偵測與辨識技術物件偵測與辨識技術
物件偵測與辨識技術
 
Recent Progress on Object Detection_20170331
Recent Progress on Object Detection_20170331Recent Progress on Object Detection_20170331
Recent Progress on Object Detection_20170331
 
Learning visual representation without human label
Learning visual representation without human labelLearning visual representation without human label
Learning visual representation without human label
 
Object Detection Beyond Mask R-CNN and RetinaNet I
Object Detection Beyond Mask R-CNN and RetinaNet IObject Detection Beyond Mask R-CNN and RetinaNet I
Object Detection Beyond Mask R-CNN and RetinaNet I
 
Paper review
Paper reviewPaper review
Paper review
 
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
 
Александр Заричковый "Faster than real-time face detection"
Александр Заричковый "Faster than real-time face detection"Александр Заричковый "Faster than real-time face detection"
Александр Заричковый "Faster than real-time face detection"
 
DLD meetup 2017, Efficient Deep Learning
DLD meetup 2017, Efficient Deep LearningDLD meetup 2017, Efficient Deep Learning
DLD meetup 2017, Efficient Deep Learning
 
最近の研究情勢についていくために - Deep Learningを中心に -
最近の研究情勢についていくために - Deep Learningを中心に - 最近の研究情勢についていくために - Deep Learningを中心に -
最近の研究情勢についていくために - Deep Learningを中心に -
 
fpres
fpresfpres
fpres
 
Leveraging high level and low-level features for multimedia event detection.2...
Leveraging high level and low-level features for multimedia event detection.2...Leveraging high level and low-level features for multimedia event detection.2...
Leveraging high level and low-level features for multimedia event detection.2...
 
Emily Denton - Unsupervised Learning of Disentangled Representations from Vid...
Emily Denton - Unsupervised Learning of Disentangled Representations from Vid...Emily Denton - Unsupervised Learning of Disentangled Representations from Vid...
Emily Denton - Unsupervised Learning of Disentangled Representations from Vid...
 
[2A4]DeepLearningAtNAVER
[2A4]DeepLearningAtNAVER[2A4]DeepLearningAtNAVER
[2A4]DeepLearningAtNAVER
 
State of the art time-series analysis with deep learning by Javier Ordóñez at...
State of the art time-series analysis with deep learning by Javier Ordóñez at...State of the art time-series analysis with deep learning by Javier Ordóñez at...
State of the art time-series analysis with deep learning by Javier Ordóñez at...
 
Deep Neural Networks Presentation
Deep Neural Networks PresentationDeep Neural Networks Presentation
Deep Neural Networks Presentation
 
Cvpr 2018 papers review (efficient computing)
Cvpr 2018 papers review (efficient computing)Cvpr 2018 papers review (efficient computing)
Cvpr 2018 papers review (efficient computing)
 
K Nearest Neighbor Algorithm
K Nearest Neighbor AlgorithmK Nearest Neighbor Algorithm
K Nearest Neighbor Algorithm
 

Recently uploaded

Luciferase in rDNA technology (biotechnology).pptx
Luciferase in rDNA technology (biotechnology).pptxLuciferase in rDNA technology (biotechnology).pptx
Luciferase in rDNA technology (biotechnology).pptxAleenaTreesaSaji
 
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsSérgio Sacani
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSarthak Sekhar Mondal
 
Boyles law module in the grade 10 science
Boyles law module in the grade 10 scienceBoyles law module in the grade 10 science
Boyles law module in the grade 10 sciencefloriejanemacaya1
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTSérgio Sacani
 
Physiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptxPhysiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptxAArockiyaNisha
 
VIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C PVIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C PPRINCE C P
 
Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfmuntazimhurra
 
Natural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsNatural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsAArockiyaNisha
 
zoogeography of pakistan.pptx fauna of Pakistan
zoogeography of pakistan.pptx fauna of Pakistanzoogeography of pakistan.pptx fauna of Pakistan
zoogeography of pakistan.pptx fauna of Pakistanzohaibmir069
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bSérgio Sacani
 
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfBehavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfSELF-EXPLANATORY
 
GFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptxGFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptxAleenaTreesaSaji
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Sérgio Sacani
 
Orientation, design and principles of polyhouse
Orientation, design and principles of polyhouseOrientation, design and principles of polyhouse
Orientation, design and principles of polyhousejana861314
 
Analytical Profile of Coleus Forskohlii | Forskolin .pptx
Analytical Profile of Coleus Forskohlii | Forskolin .pptxAnalytical Profile of Coleus Forskohlii | Forskolin .pptx
Analytical Profile of Coleus Forskohlii | Forskolin .pptxSwapnil Therkar
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Lokesh Kothari
 
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptxUnlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptxanandsmhk
 

Recently uploaded (20)

Luciferase in rDNA technology (biotechnology).pptx
Luciferase in rDNA technology (biotechnology).pptxLuciferase in rDNA technology (biotechnology).pptx
Luciferase in rDNA technology (biotechnology).pptx
 
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
 
Boyles law module in the grade 10 science
Boyles law module in the grade 10 scienceBoyles law module in the grade 10 science
Boyles law module in the grade 10 science
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOST
 
Engler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomyEngler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomy
 
Physiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptxPhysiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptx
 
VIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C PVIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C P
 
Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdf
 
Natural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsNatural Polymer Based Nanomaterials
Natural Polymer Based Nanomaterials
 
zoogeography of pakistan.pptx fauna of Pakistan
zoogeography of pakistan.pptx fauna of Pakistanzoogeography of pakistan.pptx fauna of Pakistan
zoogeography of pakistan.pptx fauna of Pakistan
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
 
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfBehavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
 
GFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptxGFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptx
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
 
Orientation, design and principles of polyhouse
Orientation, design and principles of polyhouseOrientation, design and principles of polyhouse
Orientation, design and principles of polyhouse
 
Analytical Profile of Coleus Forskohlii | Forskolin .pptx
Analytical Profile of Coleus Forskohlii | Forskolin .pptxAnalytical Profile of Coleus Forskohlii | Forskolin .pptx
Analytical Profile of Coleus Forskohlii | Forskolin .pptx
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
 
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
 
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptxUnlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
 

Advanced deep learning based object detection methods

  • 1. Advanced Deep Learning based Object Detection Methods
  • 2. Improving Object Detection With One Line of Code ● Non-Maximum Suppression is a greedy process. ○ It worked well enough in 2007 but it doesn’t anymore. ● High scoring detections can be suppressed just as low scoring detections. ○ Overlap with stronger detection is the only criteria. ● Should one detection completely suppress another detection, or simply reduce its confidence?
  • 3. Improving Object Detection With One Line of Code ● NMS: ● Linear Soft-NMS: ● Gaussian Soft-NMS: ○ Linear Soft-NMS is not continuous in terms of overlap and a sudden penalty is applied when a NMS threshold is reached. ○ Instead we can use a continuous function:
  • 4. Improving Object Detection With One Line of Code
  • 5. Improving Object Detection With One Line of Code
  • 6. Learning Non-Maximum Suppression ● Object detectors are mostly trained end-to-end, except for the NMS. ○ NMS is still fully hand-crafted, and forces a trade-off between recall and precision. ● Training loss is not evaluation loss. ○ Training is performed without NMS ○ During evaluation, multiple detections for same object count as false positives. ● Instead, train the network to include the suppression process. ○ Only output one bounding box per object. ○ Learn how to handle close objects.
  • 7. Learning Non-Maximum Suppression ● Additional blocks that: ○ Encode pairwise information. ○ For each detection, pool information from all pairings. ○ Update feature vector. ○ Repeat. ● New loss: ○ Only one positive candidate per object. ○ Instead of the current practice to take all objects with IoU>50%
  • 10. Multi-Scale Object Detection ● Multi-scale object detection using an image pyramid ○ Predict different scales by applying the same model at different image resolutions. ● Classic method. ● But also used in OverFeat. ● Slow. Requires multiple evaluations of the same model.
  • 11. Multi-Scale Object Detection ● Predict objects at multiple scales using a single feature map. ● Same as Faster R-CNN. ● Fast. ● Single model (same in training as in testing). ● Poor feature resolution for small objects.
  • 12. Multi-Scale Object Detection ● Predict different object sizes at different feature scales. ● Same as SSD. ● Good feature resolution for small objects. ● But the shallow, high-resolution features are much weaker than those in deeper layers.
  • 13. Feature Pyramid Network (FPN) ● Single model (same in training as in testing). ● Good feature resolution for small objects. ● Strong features in all layers. ● Almost no overhead over SSD (= fast).
  • 15. Feature Pyramid Network (FPN) ● How important is top-down enrichment? ● How important are lateral connections? ● How important are pyramid representations?
  • 16. Feature Pyramid Network (FPN) ● How important is top-down enrichment? ● How important are lateral connections? ● How important are pyramid representations?
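
The top-down pathway and lateral connections that these questions refer to can be sketched on toy data. This is a simplified illustration under stated assumptions, not the FPN implementation: feature maps are single-channel 2-D lists, each level is exactly half the size of the previous one, the lateral 1x1 convolution is reduced to a scalar weight, and the smoothing convolution that normally follows each merge is omitted. All function names are hypothetical.

```python
def upsample2x(fm):
    """Nearest-neighbour 2x upsampling of a 2-D list."""
    out = []
    for row in fm:
        wide = [v for v in row for _ in range(2)]  # repeat each column
        out.append(wide)
        out.append(list(wide))                     # repeat each row
    return out


def lateral(fm, weight):
    """Stand-in for a 1x1 lateral convolution on a single channel."""
    return [[weight * v for v in row] for row in fm]


def add(a, b):
    """Element-wise sum of two equally sized 2-D lists."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def top_down_pyramid(backbone_maps, lateral_weights):
    """Build pyramid levels from backbone maps ordered finest to coarsest."""
    # The coarsest level only gets its lateral projection.
    merged = lateral(backbone_maps[-1], lateral_weights[-1])
    pyramid = [merged]
    # Walk from coarse to fine, adding upsampled context to each level.
    finer = zip(reversed(backbone_maps[:-1]), reversed(lateral_weights[:-1]))
    for fm, w in finer:
        merged = add(lateral(fm, w), upsample2x(merged))
        pyramid.append(merged)
    pyramid.reverse()  # return in the same finest-to-coarsest order
    return pyramid


# Toy backbone outputs: 4x4, 2x2 and 1x1 maps.
c3 = [[1.0] * 4 for _ in range(4)]
c4 = [[2.0] * 2 for _ in range(2)]
c5 = [[3.0]]
p3, p4, p5 = top_down_pyramid([c3, c4, c5], [1.0, 1.0, 1.0])
```

In this toy setup every fine-resolution cell of `p3` carries a contribution from the coarse `c5` map, which is the sense in which all pyramid levels end up with strong features.
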
  • 17. Focal Loss for Dense Object Detection ● Can we train a single-stage detector to be as accurate as two-stage detectors? ● Contributions: ○ RetinaNet: single-stage object detector based on an FPN backbone. ○ New loss.
  • 18. Focal Loss for Dense Object Detection ● Class imbalance is an important issue for object detection. ● Previous solutions: ○ Random resampling at 1:3 ratio. ○ Hard negative resampling at 1:3 ratio. ● Both solutions mean that at each step only a few samples actually matter to the loss function. ● Instead, include all samples but use a different weight for each class (p_t is the predicted probability of the true class, α_t the weight of that class). ○ Regular cross entropy: CE(p_t) = −log(p_t) ○ Weighted cross entropy: CE(p_t) = −α_t · log(p_t)
  • 19. Focal Loss for Dense Object Detection ● Using weighted CE as a baseline: ○ Can we do better? ○ Can we use a different weight for each sample? ● Focal loss: FL(p_t) = −α_t · (1 − p_t)^γ · log(p_t), where p_t is the predicted probability of the true class. ● Every sample is weighted according to its error. ○ We want to focus on samples which are misclassified.
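
To make the difference between the three losses concrete, here is a small sketch for the binary case. It is an illustration under assumptions rather than RetinaNet's training code: the function names are hypothetical, `p` is taken to be the predicted probability of the positive class, `y` is 0 or 1, and `alpha=0.25`, `gamma=2.0` are used only as example settings.

```python
import math

EPS = 1e-12  # guards against log(0); an implementation detail of this sketch


def _p_t(p, y):
    """Probability the model assigns to the true class."""
    return max(p if y == 1 else 1.0 - p, EPS)


def cross_entropy(p, y):
    """Regular cross entropy: -log(p_t)."""
    return -math.log(_p_t(p, y))


def weighted_cross_entropy(p, y, alpha=0.25):
    """Class-weighted cross entropy: -alpha_t * log(p_t)."""
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * math.log(_p_t(p, y))


def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Focal loss: -alpha_t * (1 - p_t)**gamma * log(p_t)."""
    p_t = _p_t(p, y)
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # (1 - p_t)**gamma shrinks the loss of well-classified samples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# An easy positive (p_t = 0.9) versus a hard positive (p_t = 0.1).
easy = focal_loss(0.9, 1)
hard = focal_loss(0.1, 1)
```

With `gamma=0` the focal loss reduces to the weighted cross entropy, so the modulating factor `(1 - p_t)**gamma` is the only new ingredient. For the easy sample it scales the loss by 0.01, for the hard one by 0.81, which is how every sample ends up weighted according to its error.
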
  • 20. Focal Loss for Dense Object Detection ● Different parameters for RetinaNet
  • 21. Focal Loss for Dense Object Detection ● Comparison with online hard negative mining
  • 22. Focal Loss for Dense Object Detection ● Accuracy/speed trade-offs
  • 23. Focal Loss for Dense Object Detection ● Benchmark results
  • 24. Also Read: Deformable Convolutional Networks https://arxiv.org/abs/1703.06211
  • 25. YouTube Videos ● CS231n ○ Lecture 11 - Detection and segmentation https://youtu.be/nDPWywWRIRo ● Deep Learning for Objects and Scenes (CVPR 2017 Workshop) ○ Lecture 1: Learning Deep Representations for Visual Recognition, by Kaiming He https://youtu.be/jHv37mKAhV4 ○ Lecture 2: Deep Learning for Instance-level Object Understanding, by Ross Girshick https://youtu.be/jHv37mKAhV4?t=39m4s
  • 26. Looking for brilliant researchers cv@brodmann17.com / amir@brodmann17.com
  • 27. Computer Vision Tasks Source: CS231n Object detection http://cs231n.stanford.edu/slides/2016/winter1516_lecture8.pdf
  • 28. Mask R-CNN ● Instance segmentation with pose estimation for people. ● Extends Faster R-CNN by adding a new branch for the instance mask task. ● Pose estimation can be added by simply adding an additional branch. ● SOTA accuracy on detection, segmentation and pose estimation at 5 FPS on GPU. ● https://arxiv.org/abs/1703.06870 ● Girshick won a young researcher award.
  • 32. Mask R-CNN ● RoiPool ○ Quantization breaks pixel-to-pixel alignment. ○ Too coarse for the fine spatial information required by the mask. ● RoiAlign ○ Bilinearly sample the proposal region and avoid the quantization. ○ Smoothly normalize features and predictions into a coordinate frame free of scale and aspect ratio.
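
The idea of sampling the proposal region bilinearly instead of quantizing it can be sketched as follows. This is a minimal single-channel illustration, not the Mask R-CNN implementation: the function names, the `(y1, x1, y2, x2)` region layout in continuous feature-map coordinates, the convention that feature values sit at integer coordinates, and the choice of averaging a regular grid of `samples` x `samples` points per output bin are all assumptions of the example.

```python
def bilinear(fm, y, x):
    """Bilinearly interpolate the 2-D list fm at real-valued (y, x)."""
    h, w = len(fm), len(fm[0])
    # Clamp to the valid range so border samples stay defined.
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(y), int(x)
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = fm[y0][x0] * (1.0 - dx) + fm[y0][x1] * dx
    bottom = fm[y1][x0] * (1.0 - dx) + fm[y1][x1] * dx
    return top * (1.0 - dy) + bottom * dy


def roi_align(fm, roi, out_h, out_w, samples=2):
    """Pool a region into out_h x out_w bins without rounding coordinates."""
    y1, x1, y2, x2 = roi  # kept as floats, never snapped to the grid
    bin_h = (y2 - y1) / out_h
    bin_w = (x2 - x1) / out_w
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0.0
            # Average a regular grid of sample points inside the bin.
            for sy in range(samples):
                for sx in range(samples):
                    y = y1 + (i + (sy + 0.5) / samples) * bin_h
                    x = x1 + (j + (sx + 0.5) / samples) * bin_w
                    total += bilinear(fm, y, x)
            row.append(total / (samples * samples))
        out.append(row)
    return out


# A horizontal ramp: the value at (y, x) equals x.
ramp = [[float(x) for x in range(4)] for _ in range(4)]
pooled = roi_align(ramp, (0.0, 0.5, 2.0, 2.5), out_h=1, out_w=2)
```

On the ramp the two bins cover x in [0.5, 1.5] and [1.5, 2.5], and the pooled values land on the bin centres 1.0 and 2.0. Rounding the region to integer cells first, as a quantizing pooling step would, shifts which cells are read and loses that half-pixel offset.
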
  • 34. Mask R-CNN ● Backbone architecture ○ ResNet ○ ResNeXt ○ FPN ● Mask representation ○ FC vs. Convolutional ○ Multinomial vs. Independent Masks: softmax vs. sigmoid ○ Class-Specific vs. Class-Agnostic Masks: almost same accuracy ● Multi-task learning ○ Mask task improves object detection accuracy. ○ Keypoint task reduces object detection accuracy.
  • 35. Mask R-CNN ● Pose estimation ○ Simply add an additional branch. ○ Model a keypoint’s location as a one-hot mask, and adopt Mask R-CNN to predict K masks. ○ Experiments are mainly to demonstrate the generality of the Mask R-CNN framework. ○ RoiAlign improves this task’s accuracy as well.
  • 36. Looking for brilliant researchers cv@brodmann17.com