Fcv rep hoiem

zukun

Representational Challenges of Recognition,
from Detection to Interpretation

NSF Frontiers in Computer Vision Workshop

Derek Hoiem
University of Illinois (UIUC)
Aug 2011

Recognition in last 15 years
• Focus on object search: “Where is it?”
• Build templates that quickly differentiate object
patch from background patch
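
The template idea above can be sketched as a sliding-window scorer. This is a minimal illustration, not the detector discussed in the talk: it assumes a single-channel image stored as nested lists, a linear template scored by a dot product, and a hand-picked threshold. The names `window_score` and `detect` are hypothetical.

```python
def window_score(image, template, top, left, bias=0.0):
    """Dot product of the template with the image patch at (top, left), plus a bias."""
    return bias + sum(
        image[top + r][left + c] * template[r][c]
        for r in range(len(template))
        for c in range(len(template[0]))
    )


def detect(image, template, threshold, bias=0.0):
    """Return (top, left, score) for every window scoring above the threshold."""
    th, tw = len(template), len(template[0])
    hits = []
    for top in range(len(image) - th + 1):
        for left in range(len(image[0]) - tw + 1):
            score = window_score(image, template, top, left, bias)
            if score > threshold:  # "object" patch; everything else is "non-object"
                hits.append((top, left, score))
    return hits


# Toy example: a 2x2 bright block on a dark 6x6 background.
image = [[0] * 6 for _ in range(6)]
for r, c in [(2, 3), (2, 4), (3, 3), (3, 4)]:
    image[r][c] = 1
template = [[1, 1], [1, 1]]
print(detect(image, template, threshold=3.5))
```

Only the window that covers the whole block clears the threshold; windows that overlap it partially score lower, which is the source of the localization errors discussed next.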

[Figure: a "Dog Model" template labels each image patch as object or non-object]

Template Matching Problem

[Figure: "Dog Model" detections grouped by outcome: true detections, bad localization, confused with similar object, confused with dissimilar object, misc. background]

Breakdown of top 100 false positives
(Felzenszwalb et al. (v4) detector, PASCAL VOC 2010 validation set)

Category   Localization   Similar Object   Other Object   Misc. Background
Airplane   65%            15%              4%             16%
Car        70%            16%              5%             9%
Cat        41%            39%              15%            5%
Dog        9%             51%              23%            17%
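
A breakdown of this kind requires a rule for assigning each false positive to one type. The sketch below is one plausible rule, not necessarily the one used for the slide: the overlap threshold `lo=0.1`, the priority order (localization, then similar object, then other object, then background), and the function names are all assumptions.

```python
from collections import Counter


def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def false_positive_type(det_box, target, ground_truth, similar, lo=0.1):
    """Classify a detection already known to be a false positive.

    ground_truth: list of (label, box); similar: labels considered similar to target.
    The threshold `lo` is an assumed value, not taken from the slides.
    """
    best = {"target": 0.0, "similar": 0.0, "other": 0.0}
    for label, box in ground_truth:
        kind = "target" if label == target else "similar" if label in similar else "other"
        best[kind] = max(best[kind], iou(det_box, box))
    if best["target"] >= lo:
        return "localization"  # poorly placed box or duplicate on the right object
    if best["similar"] >= lo:
        return "similar object"
    if best["other"] >= lo:
        return "other object"
    return "background"


ground_truth = [("dog", (0, 0, 10, 10)), ("cat", (20, 0, 30, 10)), ("car", (40, 0, 50, 10))]
false_positives = [(5, 0, 15, 10), (22, 0, 32, 10), (42, 0, 52, 10), (100, 100, 110, 110)]
tally = Counter(
    false_positive_type(box, "dog", ground_truth, similar={"cat"})
    for box in false_positives
)
print(tally)
```

Dividing each count in `tally` by the number of false positives examined gives percentages comparable in form to the breakdown above.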

Key Challenge: localize the object from a
detection

[Figure: "Dog Model" detections with good and bad localization]

Need good category-sensitive
segmentation methods

Can free up detectors to focus
on discriminative pieces

Key challenge: differentiate between similar
categories

Robustness through learned abstraction (e.g.,
shape), rather than hand-coded invariance
Compare details, rather than holistic appearance

To get large improvements, we need to solve the
“mid-level” problems

Potential Gains in Precision-Recall
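
One way to estimate such a potential gain is to recompute average precision after discarding a single type of false positive. The sketch below is an assumption about how this could be done, not a description of how the slide's figure was produced. It uses a simple non-interpolated average precision, which differs from the interpolated PASCAL VOC variants, and the detections are made-up toy data.

```python
def average_precision(detections, num_positives):
    """Non-interpolated AP. detections: list of (score, is_true_positive)."""
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    true_positives = 0
    total = 0.0
    for rank, (_, is_tp) in enumerate(ranked, start=1):
        if is_tp:
            true_positives += 1
            total += true_positives / rank  # precision at this recall step
    return total / num_positives if num_positives else 0.0


def ap_without(typed_detections, num_positives, error_type=None):
    """AP after dropping all false positives of one type (None keeps everything).

    typed_detections: list of (score, kind), where kind is "tp" or a
    false-positive type such as "localization" or "similar".
    """
    kept = [(s, kind == "tp") for s, kind in typed_detections if kind != error_type]
    return average_precision(kept, num_positives)


# Toy detections (hypothetical scores and labels), with 3 objects in total.
dets = [(0.9, "tp"), (0.8, "localization"), (0.7, "tp"), (0.6, "similar"), (0.5, "tp")]
baseline = ap_without(dets, 3)
for kind in ("localization", "similar"):
    gain = ap_without(dets, 3, kind) - baseline
    print(f"removing {kind} errors: AP gain {gain:.3f}")
```

Removing the higher-ranked error helps more, because it lifts the precision of every true positive ranked below it.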

Object Recognition Challenges
• Last 15 years: object detection
– Good methods to detect objects, ignore
background
– Better segmentation and mid-level
representations are crucial for further
improvement

• Next 10+ years: object interpretation
– How do we represent objects themselves?

Key Challenge: How do we deal with
objects that we can’t categorize?

How to localize objects without categorization?

How to build representations that apply to novel objects?

Key Challenge: build/infer representations
that encode physical context

How to infer physical relations (contact, engagement, etc.)?

How to interpret an object’s role in the scene?

Key Challenge: build/infer representations
that depend on task context

[Figure labels: "Cow"; "Big animal ahead, moving left"]

Which objects are relevant, and how are they relevant?

We need complex, multi-faceted representations

• Categories, pose, material, unusual characteristics, etc.

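[Figure: motorcycle image annotated with the following labels]
– Categories: Vehicle, Two-wheeled, Motorcycle
– Pose and context: Facing right, On the street, Has a rider
– Parts: Mirrors, Gas tank, Seat, Headlight, Lic. Plate, Tail light, Exhaust, Engine, Wheel, Wheel
– Materials: Metal, Rubber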
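
As a rough illustration of what a multi-faceted representation could look like as a data structure, the sketch below stores the motorcycle labels from this slide in a small record. The class name, its fields, and the grouping of labels into facets are assumptions made for illustration; the slides do not prescribe a format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ObjectDescription:
    """Hypothetical record combining several facets of one object."""
    categories: List[str] = field(default_factory=list)  # coarse to fine
    pose: str = ""
    context: List[str] = field(default_factory=list)     # situational facts
    parts: List[str] = field(default_factory=list)
    materials: List[str] = field(default_factory=list)

    def most_specific_category(self):
        """Finest known category, or a generic fallback for unfamiliar objects."""
        return self.categories[-1] if self.categories else "unknown object"

    def count_part(self, name):
        """Number of times a part label appears."""
        return self.parts.count(name)


motorcycle = ObjectDescription(
    categories=["Vehicle", "Two-wheeled", "Motorcycle"],
    pose="Facing right",
    context=["On the street", "Has a rider"],
    parts=["Mirrors", "Gas tank", "Seat", "Headlight", "Lic. Plate",
           "Tail light", "Exhaust", "Engine", "Wheel", "Wheel"],
    materials=["Metal", "Rubber"],
)
print(motorcycle.most_specific_category(), motorcycle.count_part("Wheel"))

# An unfamiliar object can still be described by whichever facets are known.
novel = ObjectDescription(categories=["Vehicle"], parts=["Wheel", "Wheel", "Wheel"])
print(novel.most_specific_category(), novel.count_part("Wheel"))
```

Keeping the facets separate means a description stays useful when the fine-grained category is missing.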

Summary

• In object detection, key challenges are object
segmentation and fine differentiation

• Object interpretation is a wide-open problem,
and we need new object representations
– Unfamiliar objects
– Situational context
– Task context


Thank you
