PR-157: Best of both worlds: Human-machine collaboration for object annotation (CVPR 2015)
Presenter: visionNoob (이재원)
Authors: Olga Russakovsky, Li-Jia Li, Li Fei-Fei
Stanford University, Snapchat
[paper] http://ai.stanford.edu/~olga/papers/RussakovskyCVPR15.pdf
[CVPR’15 poster] http://ai.stanford.edu/~olga/posters/cvpr15-poster.pdf
[supplements] http://ai.stanford.edu/~olga/papers/RussakovskyCVPR15_supp.pdf
[slides made by first author] http://ai.stanford.edu/~olga/slides/best_of_both_worlds_slides.pdf
Goal
Efficiently and accurately detect all objects in an image
- Green boxes: RCNN results (with NMS)
- Yellow boxes: objects from ILSVRC dataset classes that RCNN fails to detect
- Pink boxes: objects outside the range of capabilities of object detectors
Related Works
1. Recognition with human in the loop
2. Better object detection
3. Cheaper manual annotation
Related Works
1. Recognition with human in the loop
2. Better object detection
3. Cheaper manual annotation
- Weakly supervised learning [42, 23, 52, 8, 24, 15]
- Active learning [32, 56] (see also [PR-119])
- Mine the web for object detection [8, 11, 15]
-> minimize human annotation
http://mpawankumar.info/tutorials/cvpr2013/index.html
Related Works
1. Recognition with human in the loop
2. Better object detection
3. Cheaper manual annotation
Crowdsourcing techniques
- Annotation games [57, 12, 30]
- Tricks to reduce the annotation search space [13, 4]
- Effective user interface design [50, 58]
- Making use of existing annotations [5]
- Making use of weak human supervision [26, 7]
- Accurately computing the number of required workers [46]
System Overview
input
1. image to label
2. Constraints
- utility
- precision
- and/or budget
output
Bi: bounding box
Ci: class label
pi: confidence (probability that the detection is correct)
System Overview
Constraints
- utility (U*)
- precision (P*)
- budget (B*) : cost of human time
= 1 (in this paper)
Choose only two of the three constraints
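One way to read the precision constraint: if each returned detection carries a probability p_i of being correct, the expected precision of a returned set is the mean of its p_i. The sketch below assumes that reading; the names `Detection`, `expected_precision`, and `select_for_precision` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    box: Tuple[float, float, float, float]  # B_i as (x1, y1, x2, y2)
    label: str                              # C_i
    prob: float                             # p_i, probability the detection is correct


def expected_precision(dets: List[Detection]) -> float:
    """Mean probability of being correct; 1.0 for an empty set by convention."""
    return sum(d.prob for d in dets) / len(dets) if dets else 1.0


def select_for_precision(dets: List[Detection], target: float) -> List[Detection]:
    """Largest prefix (by descending p_i) whose expected precision is >= target."""
    ranked = sorted(dets, key=lambda d: d.prob, reverse=True)
    chosen: List[Detection] = []
    total = 0.0
    for d in ranked:
        # The running mean never increases on a descending list,
        # so the first failure marks the largest feasible prefix.
        if (total + d.prob) / (len(chosen) + 1) < target:
            break
        chosen.append(d)
        total += d.prob
    return chosen
```

Under this assumption, raising the precision target shrinks the returned set, which is the trade-off the constraints are meant to control.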
Method
Model : Markov Decision Process (MDP)
State
Action
Transition probability
Reward
Optimization
Method
Model : Markov Decision Process (MDP)
State : set of object detections, with probabilities
cls(C | I, U)
det(B, C | I, U)
moreinst(B, X | I, U)
obj(B | I , U)
morecls(C | I, U)
Method
Model : Markov Decision Process (MDP)
State : set of object detections, with probabilities
Action : a question to ask humans
Transition probability : probability distribution over user responses
Reward : increase in estimated quality of labeling divided by the cost of actions
Optimization : 2-step lookahead search
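The MDP components above can be sketched on a toy problem. This is a minimal illustration, not the paper's implementation: the state only tracks P(class present) per class, the only action is a yes/no question about one class, and `ERR`, `COST`, and the `quality` function are assumptions chosen for the example.

```python
from typing import Dict, Iterator, Optional, Tuple

State = Dict[str, float]  # class name -> P(class is present in the image)

ERR = 0.1   # assumed symmetric human error rate (hypothetical value)
COST = 1.0  # assumed cost of one yes/no question (hypothetical value)


def quality(s: State) -> float:
    """Toy quality: expected number of classes labeled correctly at threshold 0.5."""
    return sum(max(p, 1.0 - p) for p in s.values())


def outcomes(s: State, c: str) -> Iterator[Tuple[float, State]]:
    """Yield (P(answer), next_state) for the action 'ask whether class c is present'."""
    p = s[c]
    p_yes = (1 - ERR) * p + ERR * (1 - p)   # distribution over user responses
    post_yes = (1 - ERR) * p / p_yes        # updated belief after a "yes"
    post_no = ERR * p / (1 - p_yes)         # updated belief after a "no"
    yield p_yes, {**s, c: post_yes}
    yield 1 - p_yes, {**s, c: post_no}


def lookahead(s: State, depth: int) -> Tuple[float, Optional[str]]:
    """Best (value, action); value sums expected per-step rewards over `depth` steps."""
    if depth == 0:
        return 0.0, None
    best_value, best_action = 0.0, None
    for c in s:
        value = 0.0
        for pr, nxt in outcomes(s, c):
            reward = (quality(nxt) - quality(s)) / COST  # quality gain per unit cost
            value += pr * (reward + lookahead(nxt, depth - 1)[0])
        if best_action is None or value > best_value:
            best_value, best_action = value, c
    return best_value, best_action
```

Calling `lookahead(state, 2)` corresponds to a 2-step lookahead in this toy setting; a real system would have several question types with different costs and error rates.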
Method
Computing the transition probability
[Equation not recovered: the transition from step t-1 to step t, expanded using the law of total probability]
Method
Computing the transition probability
Bayes' rule: posterior ∝ likelihood × prior
Examples:
P( C | I ) // classifier
P( B, C | I ) // object detector
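As a worked illustration of the two steps (total probability for the user response, then Bayes' rule for the updated belief), the sketch below folds yes/no answers into a prior such as a classifier score P(C | I). The function names and the error-rate parameters `tpr` (chance a user says yes when the event is true) and `fpr` (chance a user says yes when it is false) are assumptions for this example, and answers are assumed independent given the truth.

```python
def answer_probability(prior: float, tpr: float, fpr: float) -> float:
    """P(user says yes), by total probability over whether the event is true."""
    return tpr * prior + fpr * (1.0 - prior)


def posterior(prior: float, said_yes: bool, tpr: float, fpr: float) -> float:
    """P(event | answer) by Bayes' rule: likelihood times prior, normalized."""
    like_true = tpr if said_yes else 1.0 - tpr
    like_false = fpr if said_yes else 1.0 - fpr
    evidence = like_true * prior + like_false * (1.0 - prior)  # assumed non-zero
    return like_true * prior / evidence


# Start from a hypothetical classifier score P(C | I) and fold in three answers.
p = 0.30
for said_yes in (True, True, False):
    p = posterior(p, said_yes, tpr=0.9, fpr=0.2)
```

Because each update multiplies in one likelihood term, the final belief does not depend on the order in which the answers arrive.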
Method
Multiple computer vision models
Method
Pre-computed human error rates
Experimental Setup
dataset
ImageNet Large Scale Visual Recognition Challenge (ILSVRC) detection dataset
train set : 400K
validation : 20K (the val set is split into two sets, val1 and val2, for testing)
computer vision models
1. Image classifier : 200 class CNN classifiers [Hoffman NIPS14]
2. Object detector : 200 class RCNN [Girshick CVPR14]
3. Probability of object region : Objectness measure [Alexe PAMI2012]
4. Probability of another instance of same class : statistics from ILSVRC2014 val-DET data
5. Probability of another class in image : statistics from ILSVRC2014 val-DET data
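The slides only say that items 4 and 5 come from dataset statistics. One plausible estimator for item 4, shown here purely as an assumption, is the empirical fraction of images with more than k instances of a class among images with at least k instances. The function name `prob_more_instances` is hypothetical.

```python
from collections import Counter
from typing import Iterable


def prob_more_instances(counts: Iterable[int], k: int) -> float:
    """Estimate P(more than k instances | at least k instances) from per-image counts."""
    hist = Counter(counts)
    at_least_k = sum(n for c, n in hist.items() if c >= k)
    more_than_k = sum(n for c, n in hist.items() if c > k)
    # With no supporting images, fall back to 0.0 (an arbitrary choice for this sketch).
    return more_than_k / at_least_k if at_least_k else 0.0
```

Here `counts` would hold, for one class, the number of annotated instances in each image that contains the class.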
Experimental Results
The ILSVRC detection system :
Step 1 : Determining what object classes are present in the images.
Step 2 : Asking users to draw bounding boxes.
Conclusions
We presented a principled approach to unifying multiple inputs
from both computer vision and humans to label objects in images.
Discussion
