PR-157: Best of both worlds: human-machine collaboration for object annotation
1. Best of both worlds:
Human-machine
collaboration
for object annotation (CVPR2015)
visionNoob
(Jaewon Lee)
Olga Russakovsky, Li-Jia Li, Li Fei-Fei
Stanford University, Snapchat
[paper] http://ai.stanford.edu/~olga/papers/RussakovskyCVPR15.pdf
[CVPR’15 poster] http://ai.stanford.edu/~olga/posters/cvpr15-poster.pdf
[supplements] http://ai.stanford.edu/~olga/papers/RussakovskyCVPR15_supp.pdf
[slides made by first author] http://ai.stanford.edu/~olga/slides/best_of_both_worlds_slides.pdf
2. Goal
Efficiently and accurately detect all objects in an image.
Green boxes : R-CNN detections (with NMS)
Yellow boxes : ILSVRC dataset classes that R-CNN fails to detect
Pink boxes : objects outside the range of capabilities of current object detectors
6. Related Work
1. Recognition with human in the loop
2. Better object detection
3. Cheaper manual annotation
- Weakly supervised learning [42, 23, 52, 8, 24, 15]
- Active learning [32, 56] (see also [PR-119])
- Mine the web for object detection [8, 11, 15]
-> minimize human annotation effort
http://mpawankumar.info/tutorials/cvpr2013/index.html
7. Related Work
1. Recognition with human in the loop
2. Better object detection
3. Cheaper manual annotation
Crowdsourcing techniques
- Annotation games [57, 12, 30]
- Tricks to reduce the annotation search space [13, 4]
- Effective user interface design [50, 58]
- Making use of existing annotations [5]
- Making use of weak human supervision [26, 7]
- Accurately computing the number of required workers [46]
12. Method
Model : Markov Decision Process (MDP)
State
Action
Transition probability
Reward
Optimization
13. Method
Model : Markov Decision Process (MDP)
State : set of object detections, with probability estimates for the open annotation questions
- cls(C | I, U) : image I contains class C (given user input U so far)
- det(B, C | I, U) : box B tightly bounds an instance of class C
- moreinst(B, X | I, U) : more instances of class X exist beyond the boxes B
- obj(B | I, U) : box B tightly bounds some object
- morecls(C | I, U) : the image contains classes beyond the set C
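These probability estimates can be pictured as the fields of one state object that is updated after every human answer. A minimal sketch, assuming a dict-based representation; all names here are illustrative, not from the paper's code:

```python
from dataclasses import dataclass, field

@dataclass
class AnnotationState:
    """MDP state: current detections plus per-question probability estimates.

    Each map stores the system's current belief given the image I and the
    user input U collected so far (field names are hypothetical).
    """
    p_cls: dict = field(default_factory=dict)       # P(class C present | I, U)
    p_det: dict = field(default_factory=dict)       # P(box B tightly bounds class C | I, U)
    p_moreinst: dict = field(default_factory=dict)  # P(more instances of X beyond boxes B | I, U)
    p_obj: dict = field(default_factory=dict)       # P(box B bounds some object | I, U)
    p_morecls: float = 0.5                          # P(classes beyond C remain | I, U)

state = AnnotationState(p_cls={"dog": 0.9, "cat": 0.2})
```

Answering a question would then simply overwrite the corresponding belief (e.g. set `p_cls["dog"] = 1.0` after a "yes").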
15. Method
Model : Markov Decision Process (MDP)
State : set of object detections, with probabilities
Action : a question to ask humans
Transition probability : probability distribution over user responses
Reward : increase in estimated quality of labeling divided by the cost of actions
Optimization : 2-step lookahead search
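The reward and the 2-step lookahead can be illustrated with a toy model. This is NOT the paper's implementation: here the state is just a map from label to belief, asking about a label resolves it, reward is utility gain divided by the question's cost, and the follow-up step is discounted by an assumed factor:

```python
GAMMA = 0.5  # discount on the follow-up question (an assumption of this sketch)
COSTS = {"dog": 1.0, "cat": 1.0, "table": 3.0}  # e.g. drawing a box costs more

def quality(state):
    """Labeling quality: how far each belief sits from total uncertainty (0.5)."""
    return sum(abs(p - 0.5) for p in state.values())

def candidate_actions(state):
    return [k for k, p in state.items() if 0.0 < p < 1.0]

def transition_probs(state, action):
    """Distribution over user responses: 'yes' with prob = current belief."""
    p = state[action]
    return [(1.0, p), (0.0, 1.0 - p)]

def update(state, action, answer):
    new = dict(state)
    new[action] = answer  # the user's answer resolves this label
    return new

def expected_reward(state, action, depth=2):
    """Expected (quality gain / cost), looking `depth` questions ahead."""
    total = 0.0
    for answer, p in transition_probs(state, action):
        nxt = update(state, action, answer)
        reward = (quality(nxt) - quality(state)) / COSTS[action]
        if depth > 1:
            follow_ups = candidate_actions(nxt)
            if follow_ups:
                reward += GAMMA * max(
                    expected_reward(nxt, a, depth - 1) for a in follow_ups)
        total += p * reward
    return total

state = {"dog": 0.8, "cat": 0.5, "table": 0.25}
best = max(candidate_actions(state), key=lambda a: expected_reward(state, a))
# Here the most uncertain cheap question ("cat") scores highest.
```

The point of the lookahead is visible in `COSTS`: an expensive question can still win if the states it leads to admit very informative follow-ups.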
22. Experimental Setup
dataset
- ImageNet Large Scale Visual Recognition Challenge (ILSVRC) detection dataset
- train set : 400K
- validation set : 200K (split into two sets (val1, val2) for testing)
computer vision models
1. Image classifier : 200-class CNN classifier [Hoffman NIPS14]
2. Object detector : 200-class R-CNN [Girshick CVPR14]
3. Probability of object region : Objectness measure [Alexe PAMI2012]
4. Probability of another instance of same class : statistics from ILSVRC2014 val-DET data
5. Probability of another class in image : statistics from ILSVRC2014 val-DET data
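The five models above supply the initial (prior) probability estimates of the MDP state before any human input arrives. A hedged sketch of that wiring, with all function names and scores as illustrative stubs:

```python
# Hypothetical wiring of the five models into initial state priors
# (stub scores stand in for real model outputs).

def classifier_score(image, cls):      # 1. image classifier
    return 0.7

def detector_score(image, box, cls):   # 2. object detector (R-CNN style)
    return 0.6

def objectness(image, box):            # 3. objectness measure
    return 0.8

# 4./5. dataset statistics (e.g. estimated from ILSVRC2014 val-DET)
P_MORE_INSTANCES = {"dog": 0.3}
P_MORE_CLASSES = 0.4

def init_priors(image, boxes, classes):
    """Assemble the per-question priors from the five sources above."""
    return {
        "cls": {c: classifier_score(image, c) for c in classes},
        "det": {(b, c): detector_score(image, b, c)
                for b in boxes for c in classes},
        "obj": {b: objectness(image, b) for b in boxes},
        "moreinst": {c: P_MORE_INSTANCES.get(c, 0.5) for c in classes},
        "morecls": P_MORE_CLASSES,
    }

priors = init_priors("img.jpg", boxes=["b1"], classes=["dog"])
```

Human answers then replace these priors with certainties as annotation proceeds.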
23. Experimental Results
The ILSVRC detection annotation pipeline:
Step 1 : determining which object classes are present in the images
Step 2 : asking users to draw bounding boxes
24. Conclusions
We presented a principled approach to unifying multiple inputs
from both computer vision and humans to label objects in images.