Representational Challenges of Recognition,
from Detection to Interpretation

NSF Frontiers in Computer Vision Workshop

Derek Hoiem
University of Illinois (UIUC)
Aug 2011
Recognition in last 15 years
• Focus on object search: “Where is it?”
• Build templates that quickly differentiate object
  patch from background patch



[Figure: "Dog Model" template applied to an image patch: "Object or Non-Object?"]
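
The "template vs. background patch" idea can be sketched roughly as a sliding-window scorer. This is a minimal illustration only, not the detector discussed in the talk: the plain dot-product score, the names `score_patch` and `detect`, the threshold, and the toy image are all assumptions made for the example.

```python
from typing import List, Tuple

Image = List[List[float]]


def score_patch(image: Image, template: Image, top: int, left: int) -> float:
    """Dot product between the template and the image patch at (top, left)."""
    return sum(
        template[r][c] * image[top + r][left + c]
        for r in range(len(template))
        for c in range(len(template[0]))
    )


def detect(image: Image, template: Image, threshold: float) -> List[Tuple[int, int, float]]:
    """Slide the template over every position; keep patches scoring above threshold."""
    t_h, t_w = len(template), len(template[0])
    hits = []
    for top in range(len(image) - t_h + 1):
        for left in range(len(image[0]) - t_w + 1):
            score = score_patch(image, template, top, left)
            if score > threshold:  # "object" if above threshold, else "non-object"
                hits.append((top, left, score))
    return hits


# Toy 4x4 image with a bright 2x2 blob in the middle (made-up data).
image = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
template = [[1, 1], [1, 1]]  # hypothetical "object" template
hits = detect(image, template, threshold=3.5)
print(hits)
```

Only the window that fully covers the blob clears the threshold; partially overlapping windows score lower, which is where localization errors tend to come from.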
Dog Model
Template Matching Problem

• True detections
• Bad localization
• Confused with similar object
• Confused with dissimilar object
• Misc. background
Breakdown of top 100 false positives
Category   Localization   Similar Object   Other Object   Misc. Background
Airplane       65%             15%              4%              16%
Car            70%             16%              5%               9%
Cat            41%             39%             15%               5%
Dog            23%             51%              9%              17%

Felzenszwalb et al. (v4) Detector
PASCAL VOC 2010 valset
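
One way such a breakdown could be computed is sketched below. The slides do not state the exact rules, so everything here is an assumption for illustration: the 0.1 overlap threshold, the hypothetical `SIMILAR` category groups, and the function names. The input detection is assumed to already be known to be a false positive.

```python
from typing import Dict, List, Set, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = min(a[2], b[2]) - max(a[0], b[0])
    inter_h = min(a[3], b[3]) - max(a[1], b[1])
    if inter_w <= 0 or inter_h <= 0:
        return 0.0
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union


# Hypothetical similarity groups, chosen only for this example.
SIMILAR: Dict[str, Set[str]] = {
    "dog": {"cat", "horse", "sheep", "cow"},
    "cat": {"dog", "horse", "sheep", "cow"},
}


def classify_false_positive(
    det_box: Box,
    det_label: str,
    ground_truth: List[Tuple[str, Box]],
    similar: Dict[str, Set[str]] = SIMILAR,
    min_overlap: float = 0.1,  # assumed threshold, not taken from the slides
) -> str:
    """Assign a false positive to one of the four types in the breakdown."""
    best = {"same": 0.0, "similar": 0.0, "other": 0.0}
    for gt_label, gt_box in ground_truth:
        if gt_label == det_label:
            kind = "same"
        elif gt_label in similar.get(det_label, set()):
            kind = "similar"
        else:
            kind = "other"
        best[kind] = max(best[kind], iou(det_box, gt_box))
    if best["same"] >= min_overlap:
        return "localization"
    if best["similar"] >= min_overlap:
        return "similar object"
    if best["other"] >= min_overlap:
        return "other object"
    return "misc. background"


ground_truth = [("dog", (0, 0, 10, 10)), ("cat", (20, 0, 30, 10)), ("car", (40, 0, 50, 10))]
print(classify_false_positive((5, 0, 15, 10), "dog", ground_truth))
print(classify_false_positive((20, 0, 30, 10), "dog", ground_truth))
```

The checks run in priority order, so a detection that partly overlaps a true dog is counted as a localization error even if it also touches a cat.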
Key Challenge: localize the object from a
detection

[Figure: Dog Model detections labeled Good / Bad localization]

• Need good category-sensitive segmentation methods
• Can free up detectors to focus on discriminative pieces
Key Challenge: differentiate between similar
categories

• Robustness through learned abstraction (e.g.,
  shape), rather than hand-coded invariance
• Compare details, rather than holistic appearance

[Figure: Dog Model]
To get large improvements, we need to solve the
“mid-level” problems

[Figure: Potential Gains in Precision-Recall]
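
A rough sketch of how a "potential gain" in precision-recall might be estimated: recompute the curve after dropping one type of false positive. The detections below are made-up toy data, not results from the talk, and simply removing the errors (rather than correcting them into true detections) is a simplifying assumption; `precision_recall` and `without` are hypothetical helper names.

```python
from typing import List, Tuple

# (score, outcome): outcome is "tp" or the name of a false-positive type.
Detection = Tuple[float, str]


def precision_recall(dets: List[Detection], num_objects: int) -> List[Tuple[float, float]]:
    """(precision, recall) after each detection, in decreasing score order."""
    ranked = sorted(dets, key=lambda d: d[0], reverse=True)
    true_positives = 0
    curve = []
    for rank, (_, outcome) in enumerate(ranked, start=1):
        if outcome == "tp":
            true_positives += 1
        curve.append((true_positives / rank, true_positives / num_objects))
    return curve


def without(dets: List[Detection], fp_type: str) -> List[Detection]:
    """Pretend one false-positive type has been solved by removing it."""
    return [d for d in dets if d[1] != fp_type]


# Toy data for illustration only.
dets = [
    (0.9, "tp"),
    (0.8, "localization"),
    (0.7, "tp"),
    (0.6, "similar object"),
    (0.5, "tp"),
]
before = precision_recall(dets, num_objects=4)
after = precision_recall(without(dets, "localization"), num_objects=4)
print(before[-1], after[-1])
```

Recall at the end of the list is unchanged, but precision at every rank below the removed error goes up.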
Object Recognition Challenges
• Last 15 years: object detection
  – Good methods to detect objects, ignore
    background
  – Better segmentation and mid-level
    representations are crucial for further
    improvement


• Next 10+ years: object interpretation
  – How do we represent objects themselves?
Key Challenge: How do we deal with
objects that we can’t categorize?




How to localize objects without categorization?

How to build representations that apply to novel objects?
Key Challenge: build/infer representations
that encode physical context




How to infer physical relations (contact, engagement, etc.)?

How to interpret an object’s role in the scene?
Key Challenge: build/infer representations
that depend on task context




[Figure labels: "Big animal ahead, moving left" and "Cow"]

 Which objects are relevant, and how are they relevant?
We need complex, multi-faceted representations


• Categories, pose, material, unusual characteristics, etc.


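[Figure: motorcycle image annotated with the labels below]

• Description: Vehicle; Two-wheeled; Motorcycle; Facing right;
  On the street; Has a rider
• Labels on the image: Mirrors, Gas tank, Seat, Headlight,
  Lic. Plate, Tail light, Exhaust, Engine, Wheel, Wheel,
  Metal, Rubber, Motorcycle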
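
As a data structure, a multi-faceted description along these lines might look like the sketch below. The class name `ObjectDescription`, its fields, and the split of the figure's labels into parts and materials are assumptions for illustration; the slides do not prescribe a format.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ObjectDescription:
    """Hypothetical container for several facets of one object."""
    categories: List[str]          # general to specific
    pose: Optional[str] = None
    context: List[str] = field(default_factory=list)
    parts: List[str] = field(default_factory=list)
    materials: List[str] = field(default_factory=list)

    def most_specific_category(self) -> Optional[str]:
        """Last entry of the general-to-specific list, if any."""
        return self.categories[-1] if self.categories else None


# Labels taken from the motorcycle figure; the grouping is an assumption.
motorcycle = ObjectDescription(
    categories=["Vehicle", "Two-wheeled", "Motorcycle"],
    pose="Facing right",
    context=["On the street", "Has a rider"],
    parts=["Mirrors", "Gas tank", "Seat", "Headlight", "Lic. Plate",
           "Tail light", "Exhaust", "Engine", "Wheel", "Wheel"],
    materials=["Metal", "Rubber"],
)

# An unfamiliar object can still be described without a specific category.
unknown = ObjectDescription(categories=["Vehicle"], parts=["Wheel", "Wheel"])
print(motorcycle.most_specific_category(), unknown.most_specific_category())
```

Keeping the category list separate from pose, parts, and materials means an object that cannot be named precisely still gets a usable description.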
Summary

• In object detection, key challenges are object
  segmentation and fine differentiation

• Object interpretation is a wide-open problem,
  and we need new object representations
  – Unfamiliar objects
  – Situational context
  – Task context
Thank you
