PR157: Best of both worlds: human-machine collaboration for object annotation


jaewon lee

YouTube: https://youtu.be/JbXdn44myP4

PR-157: Best of both worlds: Human-machine collaboration for object annotation (CVPR 2015)
visionNoob (이재원)
Olga Russakovsky, Li-Jia Li, Li Fei-Fei
Stanford University, Snapchat
[paper] http://ai.stanford.edu/~olga/papers/RussakovskyCVPR15.pdf
[CVPR’15 poster] http://ai.stanford.edu/~olga/posters/cvpr15-poster.pdf
[supplements] http://ai.stanford.edu/~olga/papers/RussakovskyCVPR15_supp.pdf
[slides made by first author] http://ai.stanford.edu/~olga/slides/best_of_both_worlds_slides.pdf

Goal
Efficiently and accurately detect all objects in an image
- Green boxes: RCNN results (with NMS)
- Yellow boxes: objects from ILSVRC dataset classes that RCNN fails to detect
- Pink boxes: objects outside the range of capabilities of object detectors

Related Works
1. Recognition with human in the loop
2. Better object detection
3. Cheaper manual annotation
- Weakly supervised learning [42, 23, 52, 8, 24, 15]
- Active learning [32, 56] (see also [PR-119])
- Mining the web for object detection [8, 11, 15]
-> goal: minimize human annotation
http://mpawankumar.info/tutorials/cvpr2013/index.html

Related Works
1. Recognition with human in the loop
2. Better object detection
3. Cheaper manual annotation
Crowdsourcing techniques:
- Annotation games [57, 12, 30]
- Tricks to reduce the annotation search space [13, 4]
- Effective user interface design [50, 58]
- Making use of existing annotations [5]
- Making use of weak human supervision [26, 7]
- Accurately computing the number of required workers [46]

System Overview
Input
1. Image to label
2. Constraints
- utility
- precision
- and/or budget
Output (for each detection i)
- B_i: bounding box
- C_i: class label
- p_i: confidence (probability that the detection is correct)

System Overview
Constraints
- utility (U*)
- precision (P*)
- budget (B*): cost of human time
= 1 (in this paper)
Choose only 2 of the 3 constraints
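
As a rough illustration of how the output (B_i, C_i, p_i) and a precision constraint P* might interact, here is a minimal Python sketch. It assumes simplified definitions that are not taken from the slides: expected precision is the mean of the p_i, and utility is the expected number of correct detections (the sum of the p_i). The names `Detection`, `expected_utility`, `expected_precision` and `select_labeling` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Detection:
    box: Tuple[float, float, float, float]  # B_i as (x1, y1, x2, y2)
    label: str                              # C_i
    prob: float                             # p_i, probability the detection is correct


def expected_utility(dets: Sequence[Detection]) -> float:
    # Assumed definition: expected number of correct detections.
    return sum(d.prob for d in dets)


def expected_precision(dets: Sequence[Detection]) -> float:
    # Assumed definition: mean probability of being correct.
    return expected_utility(dets) / len(dets) if dets else 1.0


def select_labeling(dets: Sequence[Detection], min_precision: float) -> List[Detection]:
    """Keep the most confident detections while expected precision >= min_precision."""
    chosen: List[Detection] = []
    for det in sorted(dets, key=lambda d: d.prob, reverse=True):
        if expected_precision(chosen + [det]) < min_precision:
            break  # probabilities are sorted, so later ones only lower the mean
        chosen.append(det)
    return chosen


candidates = [
    Detection((0, 0, 10, 10), "dog", 0.9),
    Detection((5, 5, 20, 20), "cat", 0.8),
    Detection((1, 1, 4, 4), "cup", 0.3),
]
labeling = select_labeling(candidates, min_precision=0.8)
print([d.label for d in labeling], expected_utility(labeling))
```

Fixing a target precision and then collecting as much utility as possible corresponds to one way of choosing 2 of the 3 constraints. A budget-limited variant would instead cap the total cost of the human questions.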

Method
Model: Markov Decision Process (MDP)
- State: set of object detections, with probabilities, each given the image I and the user input U:
  cls(C | I, U): the image contains an object of class C
  det(B, C | I, U): box B is a detection of class C
  moreinst(B, X | I, U): there is another instance of the same class
  obj(B | I, U): box B is an object region
  morecls(C | I, U): there is another class in the image
- Action: a question to ask humans
- Transition probability: probability distribution over user responses
- Reward: increase in estimated quality of the labeling divided by the cost of the action
- Optimization: 2-step lookahead search
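
The MDP components above can be sketched in a toy setting. This is only an illustration under strong assumptions that the slides do not state: every action is a yes/no question about one hypothesis, all humans share one fixed accuracy `acc`, labeling quality is the expected number of hypotheses decided correctly, and the lookahead scores a first question by the expected gain of the best two-question sequence divided by its expected cost. All function names are hypothetical.

```python
from typing import Dict, Iterator, Tuple

State = Dict[str, float]  # hypothesis name -> probability that it is true


def answer_outcomes(state: State, h: str, acc: float) -> Iterator[Tuple[float, State]]:
    """Transition model: yield (probability of answer, next state) for a yes/no question about h."""
    p = state[h]
    for says_yes in (True, False):
        like_true = acc if says_yes else 1.0 - acc
        like_false = 1.0 - acc if says_yes else acc
        p_answer = p * like_true + (1.0 - p) * like_false
        if p_answer > 0.0:
            nxt = dict(state)
            nxt[h] = p * like_true / p_answer  # Bayes update of the asked hypothesis
            yield p_answer, nxt


def quality(state: State) -> float:
    """Assumed quality measure: expected number of hypotheses decided correctly."""
    return sum(max(p, 1.0 - p) for p in state.values())


def one_step_gain(state: State, h: str, acc: float) -> float:
    """Expected increase in quality from asking one question about h."""
    expected = sum(pa * quality(nxt) for pa, nxt in answer_outcomes(state, h, acc))
    return expected - quality(state)


def best_action(state: State, costs: Dict[str, float], acc: float, eps: float = 1e-9) -> str:
    """Pick the first question using a 2-step lookahead on gain per unit cost."""
    def score(first: str) -> float:
        gain, cost = one_step_gain(state, first, acc), costs[first]
        for p_answer, nxt in answer_outcomes(state, first, acc):
            second = max(costs, key=lambda h: one_step_gain(nxt, h, acc) / costs[h])
            second_gain = one_step_gain(nxt, second, acc)
            if second_gain > eps:  # otherwise stop asking on this branch
                gain += p_answer * second_gain
                cost += p_answer * costs[second]
        return gain / cost

    return max(costs, key=score)


state = {"dog": 0.5, "bird": 0.99}
costs = {"dog": 1.0, "bird": 1.0}
print(best_action(state, costs, acc=0.9))
```

In this toy example the question about the uncertain hypothesis is preferred, because asking about a hypothesis that is already near certain adds cost without improving the expected quality.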

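Method
Computing the transition probability
[Equation not recovered: relates step t to step t-1 via the law of total probability]

Method
Computing the transition probability
[Equation not recovered: Bayes' rule, posterior ∝ likelihood × prior]
Examples of the prior:
P(C | I): image classifier
P(B, C | I): object detector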
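
As a rough illustration of the Bayes' rule step, the sketch below combines a prior from a computer vision model (for example P(C | I) from a classifier) with yes/no answers from humans. It assumes the answers are conditionally independent given the truth and that the human error rates are known constants. The function name and both rate parameters are hypothetical, not taken from the slides.

```python
from typing import Iterable


def update_with_responses(prior: float, responses: Iterable[bool],
                          true_pos_rate: float, false_pos_rate: float) -> float:
    """Posterior that an event holds: proportional to prior times the product of P(answer | event)."""
    p_true, p_false = prior, 1.0 - prior
    for said_yes in responses:
        # Likelihood of this answer if the event is true / if it is false.
        p_true *= true_pos_rate if said_yes else 1.0 - true_pos_rate
        p_false *= false_pos_rate if said_yes else 1.0 - false_pos_rate
    return p_true / (p_true + p_false)  # normalize, replacing the "∝"


# A classifier gives P(C | I) = 0.3; two workers then both answer "yes".
posterior = update_with_responses(0.3, [True, True],
                                  true_pos_rate=0.9, false_pos_rate=0.1)
print(round(posterior, 3))
```

With no responses the function returns the prior unchanged, which matches the role of the computer vision model as the starting estimate before any human input.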

Experimental Setup
Dataset
ImageNet Large Scale Visual Recognition Challenge (ILSVRC) detection dataset
- train set: 400K
- validation set: 20K (split into two sets, val1 and val2, for testing)
Computer vision models
1. Image classifier: 200-class CNN classifiers [Hoffman NIPS14]
2. Object detector: 200-class RCNN [Girshick CVPR14]
3. Probability of object region: objectness measure [Alexe PAMI2012]
4. Probability of another instance of the same class: statistics from ILSVRC2014 val-DET data
5. Probability of another class in the image: statistics from ILSVRC2014 val-DET data

Experimental Results
The ILSVRC detection system:
Step 1: determining what object classes are present in the images
Step 2: asking users to draw bounding boxes

Conclusions
We presented a principled approach to unifying multiple inputs
from both computer vision and humans to label objects in images.

20. Method Multiple computer vision models

21. Method Pre-computed human error rates

25. Discussion
