1. Representational Challenges of Recognition,
from Detection to Interpretation
NSF Frontiers in Computer Vision Workshop
Derek Hoiem
University of Illinois (UIUC)
Aug 2011
2. Recognition in last 15 years
• Focus on object search: “Where is it?”
• Build templates that quickly differentiate object
patch from background patch
[Figure: a Dog Model template labels each patch: Object or Non-Object?]
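The template idea on this slide can be sketched as a sliding-window scorer. This is a minimal illustration only, assuming a linear template applied by dot product to a small grayscale grid; the function names (`score_windows`, `detect`), the toy image, and the threshold are hypothetical and not taken from the talk.

```python
# Sketch: slide a template over an image and label each window as
# object or non-object. Names, data, and threshold are illustrative assumptions.

def score_windows(image, template):
    """Return {(row, col): score} for every window, scored by dot product."""
    th, tw = len(template), len(template[0])
    ih, iw = len(image), len(image[0])
    scores = {}
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            scores[(r, c)] = sum(
                template[i][j] * image[r + i][c + j]
                for i in range(th)
                for j in range(tw)
            )
    return scores


def detect(image, template, threshold):
    """Return the top-left corners of windows whose score reaches the threshold."""
    scores = score_windows(image, template)
    return sorted(pos for pos, s in scores.items() if s >= threshold)


# Toy 4x4 image with a bright 2x2 "object" in the middle.
image = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
template = [[1, 1], [1, 1]]
detections = detect(image, template, threshold=4)
```

Real detectors score learned feature templates at many scales; the sketch only shows the object-versus-background decision per patch.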
3. Template Matching Problem
[Figure: Dog Model detections, grouped by outcome]
• True Detections
• Bad Localization
• Confused with Similar Object
• Confused with Dissimilar Object
• Misc. Background
4. Breakdown of top 100 false positives
                    Airplane   Car    Cat    Dog
Localization           65%     70%    41%    23%
Similar Object         15%     16%    39%    51%
Other Object            4%      5%    15%     9%
Misc. Background       16%      9%     5%    17%
Felzenszwalb et al. (v4) Detector
PASCAL VOC 2010 valset
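One way to sort false positives into the four groups on this slide is by their overlap with annotated objects. The sketch below is an assumption-laden illustration, not the procedure used to produce these numbers: the 0.1 minimum overlap, the priority order (same category first, then similar, then other), and the names `iou` and `classify_false_positive` are all hypothetical.

```python
# Sketch: assign a false-positive detection to one of four error types.
# Threshold, priority order, and all names are illustrative assumptions.

def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def classify_false_positive(det_box, target, ground_truth, similar, min_overlap=0.1):
    """ground_truth is a list of (category, box); similar is a set of categories."""
    best = {"Localization": 0.0, "Similar Object": 0.0, "Other Object": 0.0}
    for category, box in ground_truth:
        if category == target:
            kind = "Localization"
        elif category in similar:
            kind = "Similar Object"
        else:
            kind = "Other Object"
        best[kind] = max(best[kind], iou(det_box, box))
    # Check error types in priority order; fall back to background.
    for kind in ("Localization", "Similar Object", "Other Object"):
        if best[kind] >= min_overlap:
            return kind
    return "Misc. Background"


ground_truth = [
    ("dog", (0, 0, 10, 10)),
    ("cat", (20, 0, 30, 10)),
    ("car", (40, 0, 50, 10)),
]
similar_to_dog = {"cat"}
labels = [
    classify_false_positive(box, "dog", ground_truth, similar_to_dog)
    for box in [(5, 0, 15, 10), (20, 0, 30, 10), (40, 0, 50, 10), (100, 100, 110, 110)]
]
```

The input is assumed to contain only detections already judged false positives, so a poorly aligned box on a real dog counts as a localization error rather than a hit.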
5. Key Challenge: localize the object from a
detection
[Figure: examples of good and bad localization by the Dog Model]
Need good category-sensitive
segmentation methods
Can free up detectors to focus
on discriminative pieces
6. Key challenge: differentiate between similar
categories
Robustness through learned abstraction (e.g.,
shape), rather than hand-coded invariance
Compare details, rather than holistic appearance
[Figure: Dog Model]
7. To get large improvements, we need to solve the
“mid-level” problems
Potential Gains in Precision-Recall
8. Object Recognition Challenges
• Last 15 years: object detection
– Good methods to detect objects, ignore
background
– Better segmentation and mid-level
representations are crucial for further
improvement
• Next 10+ years: object interpretation
– How do we represent objects themselves?
9.
10.
11. Key Challenge: How do we deal with
objects that we can’t categorize?
How to localize objects without categorization?
How to build representations that apply to novel objects?
12. Key Challenge: build/infer representations
that encode physical context
How to infer physical relations (contact, engagement, etc.)?
How to interpret an object’s role in the scene?
13. Key Challenge: build/infer representations
that depend on task context
[Figure: the same object described for different tasks: “Cow” vs. “Big animal ahead, moving left”]
Which objects are relevant, and how are they relevant?
14. We need complex, multi-faceted representations
• Categories, pose, material, unusual characteristics, etc.
[Figure: motorcycle image annotated with the following labels]
Vehicle, Two-wheeled, Motorcycle
Facing right
On the street, Has a rider
Mirrors, Gas tank, Seat, Headlight, Lic. Plate, Tail light, Exhaust, Engine, Wheel, Wheel
Metal, Rubber
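A multi-faceted representation like the one on this slide could be held in a simple record type. The sketch below is only one possible layout: the class name `ObjectDescription`, its field names, and the grouping of the slide's labels into facets are assumptions made for illustration.

```python
# Sketch: one record holding several facets of an object description.
# Class name, field names, and label grouping are illustrative assumptions.
from dataclasses import dataclass, field


@dataclass
class ObjectDescription:
    categories: list = field(default_factory=list)
    pose: str = ""
    context: list = field(default_factory=list)
    parts: list = field(default_factory=list)
    materials: list = field(default_factory=list)

    def facets(self):
        """Return the names of facets that carry at least one value."""
        return [name for name, value in vars(self).items() if value]


motorcycle = ObjectDescription(
    categories=["Vehicle", "Two-wheeled", "Motorcycle"],
    pose="Facing right",
    context=["On the street", "Has a rider"],
    parts=["Mirrors", "Gas tank", "Seat", "Headlight", "Lic. Plate",
           "Tail light", "Exhaust", "Engine", "Wheel", "Wheel"],
    materials=["Metal", "Rubber"],
)

# An unfamiliar object may fill only some facets.
unknown = ObjectDescription(pose="Facing right", materials=["Metal"])
```

Leaving facets empty lets the same record describe an object that cannot be categorized, which is the case raised on slide 11.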
15. Summary
• In object detection, key challenges are object
segmentation and fine differentiation
• Object interpretation is a wide-open problem,
and we need new object representations
– Unfamiliar objects
– Situational context
– Task context