Object Detection Using R-CNN Deep Learning Framework
Nader Karimi Bavandpour (nader.karimi.b@gmail.com)
Summer School of Intelligent Learning
IPM, 2019
Table of Contents
● Machine Learning Key Point: Inductive Bias
● From Classification to Instance Segmentation
● Region Proposal
● R-CNN Framework
Machine Learning Key Point: Inductive Bias
Definition of Inductive Bias
The kind of necessary assumptions about the nature of the target function are subsumed in the phrase
inductive bias.
- Wikipedia
Every machine learning algorithm with any ability to generalize beyond the training data that it sees has
some type of inductive bias.
- StackOverflow
Examples of Inductive Bias
● Maximum Margin: Maximize the width of the boundary between two classes
● Nearest Neighbors: Most of the cases in a small neighborhood in feature space belong to the same
class
● Minimum Cross-Validation Error: Select the hypothesis with the lowest cross-validation error
○ Although cross-validation may seem to be free of bias, the "no free lunch" theorems show that cross-validation must be biased.
● Locality of Receptive Field: Use convolutional layers instead of fully connected (fc) layers
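To make the locality bias concrete, the sketch below counts the learnable parameters of a fully connected layer and of a convolutional layer applied to the same input. The input size (224 x 224 x 3), the 4096 fc units, and the 64 filters of size 3 x 3 are illustrative assumptions, not values from the slides.

```python
def fc_params(in_h, in_w, in_c, out_units):
    """Weights plus biases of a fully connected layer on a flattened image."""
    return in_h * in_w * in_c * out_units + out_units


def conv_params(kernel_h, kernel_w, in_c, out_c):
    """Weights plus biases of a conv layer; independent of the image size."""
    return kernel_h * kernel_w * in_c * out_c + out_c


# Illustrative sizes (assumptions, not from the slides).
fc_count = fc_params(224, 224, 3, 4096)
conv_count = conv_params(3, 3, 3, 64)
print(f"fc layer:   {fc_count:,} parameters")
print(f"conv layer: {conv_count:,} parameters")
```

The convolutional layer assumes that nearby pixels matter most and that the same filter is useful everywhere, which is why its parameter count does not grow with the image.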
From Classification to Instance Segmentation
Object Classification
● Image Category Recognition
● Input: Image
● Output: Class label
● Types:
○ Binary Classification
○ Multi-class Classification
○ Multi-label Classification
Object Localization
● Object Bounding Box Recognition
● Input: image
● Output: Box in the image (x, y, w, h)
Semantic Segmentation
● Pixel Category Recognition
● Input: Image
● Output: Category-aware pixel labels
Instance Segmentation
● Instance-Aware Pixel Category Recognition
● Input: Image
● Output: Instance-aware pixel labels
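The difference between category-aware and instance-aware pixel labels can be sketched with two small label maps. The class ids, the tiny 2 x 6 "image", and the helper names are hypothetical and chosen only for illustration.

```python
# 0 = background, 1 = "cat" (hypothetical class ids for illustration).
semantic_labels = [
    [0, 1, 1, 0, 1, 1],
    [0, 1, 1, 0, 1, 1],
]

# Instance-aware labels: each object gets its own id, plus an id -> class map.
instance_labels = [
    [0, 1, 1, 0, 2, 2],
    [0, 1, 1, 0, 2, 2],
]
instance_to_class = {0: 0, 1: 1, 2: 1}  # both instances are cats


def to_semantic(instance_map, id_to_class):
    """Collapse instance ids to class ids, discarding object identity."""
    return [[id_to_class[i] for i in row] for row in instance_map]


def count_instances(instance_map):
    """Number of distinct non-background objects."""
    return len({i for row in instance_map for i in row if i != 0})
```

Semantic labels can always be derived from instance labels, but not the other way around: the semantic map above cannot tell that there are two separate cats.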
Intersection Over Union (IoU)
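● Important measurement for object localization: the area of overlap between the predicted and ground-truth boxes divided by the area of their union
● Used in both training and evaluation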
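A minimal sketch of IoU for two axis-aligned boxes follows. It assumes the (x, y, w, h) box format used on the localization slide, with (x, y) taken to be the top-left corner; that corner convention is an assumption, since the slides do not state it.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x, y, w, h)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Width and height of the overlap rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    intersection = inter_w * inter_h
    union = aw * ah + bw * bh - intersection
    return intersection / union if union > 0 else 0.0
```

For example, two 2 x 2 boxes offset by one pixel in each direction overlap in a 1 x 1 square, so their IoU is 1 / (4 + 4 - 1) = 1/7.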
Datasets: ImageNet Challenge
● 1000 Classes
● Each image has 1 class with at least one bounding box
● About 800 Training images per class
● The algorithm produces 5 (class + bounding box) guesses
● Correct if at least one guess has the correct class and a bounding box with at least 50% intersection over union with the ground truth
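The correctness rule above can be written as a short check. This is a sketch under stated assumptions: boxes are given as (x1, y1, x2, y2) corners, and the function names and argument layout are hypothetical rather than taken from the official evaluation code.

```python
def iou_xyxy(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2) corners."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def image_is_correct(guesses, true_class, true_boxes,
                     iou_threshold=0.5, max_guesses=5):
    """True if any of the first five (class, box) guesses matches the image."""
    for guessed_class, guessed_box in guesses[:max_guesses]:
        if guessed_class != true_class:
            continue
        # The image may contain several boxes of the class; one match suffices.
        if any(iou_xyxy(guessed_box, gt) >= iou_threshold for gt in true_boxes):
            return True
    return False
```

A guess with the right class but a poorly placed box does not count, and neither does a well-placed box with the wrong class.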
Region Proposal
Selective Search for Region Proposal
● A region proposal algorithm used in object detection
● Designed to be fast with a very high recall
● Computes a hierarchical grouping of similar regions, using color, texture, size and shape compatibility
● First takes an image as input
● Generates initial sub-segmentations
● Combines similar regions to form larger regions
○ based on color similarity, texture similarity, size similarity, and shape compatibility
● Finally, these regions produce the Regions of Interest (RoI)
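The grouping steps above can be sketched as a greedy merge loop. This is a deliberately simplified toy, not the published algorithm: each region is reduced to a bounding box, one mean gray value in [0, 1], and a pixel count, and the similarity uses only a color term and a size term (texture and shape compatibility are omitted). All names are hypothetical.

```python
from itertools import combinations


def touches(a, b):
    """True if boxes (x1, y1, x2, y2) overlap or share an edge."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def similarity(r1, r2, image_area):
    color_sim = 1.0 - abs(r1["color"] - r2["color"])
    size_sim = 1.0 - (r1["size"] + r2["size"]) / image_area  # favors small regions
    return color_sim + size_sim


def merge(r1, r2):
    size = r1["size"] + r2["size"]
    color = (r1["color"] * r1["size"] + r2["color"] * r2["size"]) / size
    a, b = r1["box"], r2["box"]
    box = (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))
    return {"box": box, "color": color, "size": size}


def hierarchical_grouping(regions, image_area):
    """Merge the most similar neighbors until one region is left.

    Every region seen along the way contributes its box as a proposal.
    """
    regions = list(regions)
    proposals = [r["box"] for r in regions]
    while len(regions) > 1:
        pairs = [(i, j) for i, j in combinations(range(len(regions)), 2)
                 if touches(regions[i]["box"], regions[j]["box"])]
        if not pairs:
            break
        i, j = max(pairs, key=lambda p: similarity(regions[p[0]], regions[p[1]], image_area))
        merged = merge(regions[i], regions[j])
        regions = [r for k, r in enumerate(regions) if k not in (i, j)] + [merged]
        proposals.append(merged["box"])
    return proposals


# Three initial sub-segments of a 4 x 2 image: two dark strips and one bright block.
initial = [
    {"box": (0, 0, 1, 2), "color": 0.10, "size": 2},
    {"box": (1, 0, 2, 2), "color": 0.15, "size": 2},
    {"box": (2, 0, 4, 2), "color": 0.90, "size": 4},
]
proposals = hierarchical_grouping(initial, image_area=8)
```

In this toy run the two dark strips merge first because they are similar in color and small, and the result then merges with the bright block, so the proposals cover objects at several scales.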
R-CNN Framework
R-CNN Family
● R-CNN: Selective search → Cropped Image → CNN
● Fast R-CNN: Selective search → Crop feature map of CNN
● Faster R-CNN: CNN → Region-Proposal Network → Crop feature map of CNN
● Mask R-CNN: Adds object mask prediction to Faster R-CNN
R-CNN
Problems with R-CNN
● Selective search extracts about 2,000 regions for each image
● The CNN extracts features separately for every image region: for N images, it computes N × 2,000 feature vectors, one forward pass each
● The entire process of object detection using R-CNN has three models:
○ CNN for feature extraction
○ Linear SVM classifier for identifying objects
○ Regression model for tightening the bounding boxes
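The cost of the second point can be made concrete with a little arithmetic. The sketch below only counts CNN forward passes; the dataset size of 1,000 images is an illustrative assumption, and the comparison function reflects the "crop the feature map" idea from the R-CNN Family slide, where the CNN runs once per image.

```python
REGIONS_PER_IMAGE = 2000  # number of proposals per image, from the slide


def rcnn_forward_passes(num_images, regions_per_image=REGIONS_PER_IMAGE):
    """R-CNN runs the CNN once per cropped region."""
    return num_images * regions_per_image


def shared_feature_map_passes(num_images):
    """One CNN pass per image when regions are cropped from a shared feature map."""
    return num_images


num_images = 1000  # illustrative dataset size (assumption)
print(rcnn_forward_passes(num_images))        # CNN passes for R-CNN
print(shared_feature_map_passes(num_images))  # CNN passes with a shared feature map
```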
Fast R-CNN
● Still relies on selective search to find the Regions of Interest, which is slow
● Takes around 2 seconds per image to detect objects, which is much faster than R-CNN
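Fast R-CNN crops each region from the shared CNN feature map and pools it to a fixed size (RoI pooling). A minimal single-channel sketch in plain Python follows; the function name, the list-of-lists feature map, and the (x1, y1, x2, y2) RoI given in feature-map cells with exclusive upper bounds are all assumptions made for illustration.

```python
import math


def roi_max_pool(feature_map, roi, out_h, out_w):
    """Max-pool the RoI of a 2-D feature map into an out_h x out_w grid."""
    x1, y1, x2, y2 = roi
    roi_h, roi_w = y2 - y1, x2 - x1
    pooled = []
    for i in range(out_h):
        # Row range of bin i; floor/ceil keeps every bin non-empty.
        r0 = y1 + math.floor(i * roi_h / out_h)
        r1 = y1 + math.ceil((i + 1) * roi_h / out_h)
        row = []
        for j in range(out_w):
            c0 = x1 + math.floor(j * roi_w / out_w)
            c1 = x1 + math.ceil((j + 1) * roi_w / out_w)
            row.append(max(feature_map[r][c]
                           for r in range(r0, r1) for c in range(c0, c1)))
        pooled.append(row)
    return pooled


feature_map = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
whole = roi_max_pool(feature_map, (0, 0, 4, 4), 2, 2)
corner = roi_max_pool(feature_map, (1, 1, 4, 4), 2, 2)
```

Because every RoI comes out with the same shape regardless of its original size, the pooled features can be fed to the same fully connected layers.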
Faster R-CNN
● Region Proposal Network (RPN) for region proposal
○ Input: Image of any size
○ Output: A set of rectangular object proposals and objectness
scores
○ Related to attention mechanisms
● Feature maps from CNN are passed to the
Region Proposal Network (RPN)
● k anchor boxes of different shapes are generated at each position of a sliding window in the RPN
● Anchor boxes are fixed-size bounding boxes that are placed throughout the image and have different shapes and sizes
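Anchor generation can be sketched as follows. The three scales, three aspect ratios (so k = 9), and the stride of 16 pixels between sliding-window positions are assumed values for illustration; the slides only say that k anchors of different shapes and sizes are used.

```python
import math


def anchors_at(center_x, center_y, scales, aspect_ratios):
    """Return k = len(scales) * len(aspect_ratios) boxes as (x1, y1, x2, y2)."""
    anchors = []
    for scale in scales:
        for ratio in aspect_ratios:  # ratio = height / width
            width = scale / math.sqrt(ratio)
            height = scale * math.sqrt(ratio)  # area stays scale ** 2
            anchors.append((center_x - width / 2, center_y - height / 2,
                            center_x + width / 2, center_y + height / 2))
    return anchors


def all_anchors(map_h, map_w, stride, scales, aspect_ratios):
    """Slide over every feature-map cell and place k anchors at its center."""
    anchors = []
    for row in range(map_h):
        for col in range(map_w):
            center_x = (col + 0.5) * stride
            center_y = (row + 0.5) * stride
            anchors.extend(anchors_at(center_x, center_y, scales, aspect_ratios))
    return anchors


SCALES = (128, 256, 512)         # assumed anchor side lengths in pixels
ASPECT_RATIOS = (0.5, 1.0, 2.0)  # assumed height / width ratios
anchors = all_anchors(map_h=2, map_w=3, stride=16,
                      scales=SCALES, aspect_ratios=ASPECT_RATIOS)
print(len(anchors))  # 2 * 3 positions, 9 anchors each
```

The anchors themselves are never learned; the RPN only learns to score and adjust them.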
● For each anchor, the RPN predicts two things:
○ The probability that the anchor contains an object (it does not consider which class the object belongs to)
○ A bounding box regression that adjusts the anchor to better fit the object
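The second prediction can be illustrated with the box parameterization commonly used in the R-CNN family: center offsets scaled by the anchor size and log-scale width and height changes. The slides do not spell out the formula, so treat this as an assumed form; boxes here are (center x, center y, width, height) and the function names are hypothetical.

```python
import math


def decode(anchor, deltas):
    """Apply predicted offsets (tx, ty, tw, th) to an anchor (cx, cy, w, h)."""
    cx, cy, w, h = anchor
    tx, ty, tw, th = deltas
    return (cx + tx * w, cy + ty * h, w * math.exp(tw), h * math.exp(th))


def encode(anchor, box):
    """Inverse of decode: the regression target for a ground-truth box."""
    cx, cy, w, h = anchor
    bx, by, bw, bh = box
    return ((bx - cx) / w, (by - cy) / h, math.log(bw / w), math.log(bh / h))


anchor = (100.0, 100.0, 50.0, 40.0)
ground_truth = (110.0, 90.0, 100.0, 20.0)
target = encode(anchor, ground_truth)  # what the regressor is trained to output
recovered = decode(anchor, target)     # applying it moves the anchor onto the object
```

All-zero offsets leave the anchor unchanged, so the regressor only has to learn corrections relative to the anchor.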
Mask R-CNN
● Extends Faster R-CNN by adding a
branch for predicting an object mask in
parallel with the existing branch for
bounding box recognition
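● Defines a multi-task loss on each sampled RoI as:
L = L_cls + L_box + L_mask
where L_cls is the classification loss, L_box the bounding box regression loss, and L_mask the mask loss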
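A minimal sketch of this sum for one positive RoI is given below. It assumes log loss for classification, smooth L1 for the box offsets, and average per-pixel binary cross-entropy for the mask of the ground-truth class. These choices follow common descriptions of Mask R-CNN but are not stated on the slides, and the function names are hypothetical.

```python
import math


def classification_loss(class_probs, true_class):
    """Cross-entropy (log loss) on the predicted class distribution."""
    return -math.log(class_probs[true_class])


def box_loss(pred_offsets, target_offsets):
    """Smooth L1 loss summed over the box offsets."""
    total = 0.0
    for pred, target in zip(pred_offsets, target_offsets):
        diff = abs(pred - target)
        total += 0.5 * diff * diff if diff < 1.0 else diff - 0.5
    return total


def mask_loss(mask_probs, mask_target):
    """Average per-pixel binary cross-entropy on the true class's mask."""
    per_pixel = [-(t * math.log(p) + (1 - t) * math.log(1 - p))
                 for p, t in zip(mask_probs, mask_target)]
    return sum(per_pixel) / len(per_pixel)


def multi_task_loss(class_probs, true_class, pred_offsets, target_offsets,
                    mask_probs, mask_target):
    """L = L_cls + L_box + L_mask for a single positive RoI."""
    return (classification_loss(class_probs, true_class)
            + box_loss(pred_offsets, target_offsets)
            + mask_loss(mask_probs, mask_target))


# A maximally uncertain classifier and mask, with a perfect box.
total = multi_task_loss(
    class_probs=[0.5, 0.5], true_class=0,
    pred_offsets=[0.0, 0.0, 0.0, 0.0], target_offsets=[0.0, 0.0, 0.0, 0.0],
    mask_probs=[0.5, 0.5, 0.5, 0.5], mask_target=[1, 0, 1, 0],
)
```

Since the three terms are simply added, each branch can be trained jointly with the others from one backward pass.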
Thanks for Your Attention!