SlideShare a Scribd company logo
1 of 31
Download to read offline
What do we want from
Computational Scene Understanding?




                             Š Quint Buchholz


          Alexei “Alyosha” Efros
        Carnegie Mellon University
Many ways to Understand a Scene…




•   Scene Categories (e.g. street scene, city, Boston, etc.)
•   Innumerate objects (people, cars, lampposts, etc)
•   Label/Segment scene elements (road, buildings, sky, etc)
•   Scene Geometry (qualitative or quantitative) and Illumination
•   Physical Affordances (where can I walk?)
•   Prediction (What will happen next?)
Scene Categorization
                         beach  mountain                                            forest




Animal vs. no animal (Thorpe,                 city               street             farm
Poggio & Oliva)




                                      Basic Scene Categories (Oliva, Renenger, Fei-Fei, etc)n




Image classification, e.g. “Boat” (Caltech, etc.)




     Spatial envelope (Oliva & Torralba)                                  im2gps
Poster Spotlight: SUN Attributes: A Large-Scale
         Database of Scene Attributes
       Genevieve Patterson and James Hays, Brown University

                                  Global, binary attributes
                                  describing:
                                  • Affordances / Functions (e.g. farming,
                                  eating)
                                  • Materials (e.g. carpet, running water)
                                  • Surface Properties (e.g. aged, sterile)
                                  • Spatial Envelope (e.g. enclosed,
                                  symmetrical)
                                  Statistics of database:
                                  • 14340 images from 717 scene
                                  categories
                                  • 106 attributes
      Space of Scenes             • 2 million+ labels collected so far
   Organized by Attributes        • Outlier workers manually graded,
                                  good workers ~90% accurate.
Enumerating Objects
good for scene retrieval

                                       Lamp



           Couch                                      Couch




                                              Table

Standard detection task, e.g. PASCAL
But picture is worth… 4 words?

             Lamp



  Couch                     Couch




                    Table
Where can I sit ?

              Lamp



Couch                        Couch




                     Table
Labeling Pixels




See Alan and Lana talk…
3D Scene Understanding




                                  Hoeim et al
 See Martial and Silvio’s talk…
Pushing Back Evaluation Horizon…
• So far, we have acted as cognitive
  psychologists:
  – proposing and evaluating intermediate “mental
    representations”
  – E.g. object/scene categories, pixel labels,
    geometry
• But we can also be more behaviorist:
  – Focus on tasks instead of “mental states”
  – Evaluate action plans and behavior predictions
Task-Specific Questions




 Pushable, Reachable, Sittable ……
Human Centric Scene Understanding

                         Can Move

               Can Sit




                  Can Push
                                         Can Walk

Reasoning in terms of set of allowable actions/body poses.
Human Workspace                      3D Scene Geometry




Gupta et al,
CVPR’11        Joint Space of Human-Scene Interactions
Subjective Interpretation
Holy Grail: Predicting the Future
Event prediction
Input image



                     Video database




                 Liu, Yuen, Torralba. CVPR 2009. Yuen, Torralba. ECCV 2010
Event prediction
Input image



                  Video database




               Liu, Yuen, Torralba. CVPR 2009; Yuen, Torralba. ECCV 2010
What do we need
 to get to there?
  The Op-Ed Part ☺
Organizing Our Data
“It irritated him that the “dog” of 3:14 in the
  afternoon, seen in profile, should be
  indicated by the same noun as the dog of
  3:15, seen frontally…”
“My memory, sir, is like a garbage heap.”

            Fumes the Memorious
            Jorge Luis Borges
Trouble with Classic Platonic
           Categorization
 • Step 1: cut up the world into
   categories



PASCAL
“train”
category


 • Step 2: train an SVM on
   positives vs. negatives
It gets more complicated…




• Number of objects * number of interactions *
  number of outcomes… = too many categories
• Don’t want to categorize too early
  – “Dealing with the world as it comes to us” [Derek]
• Let’s categorize at run-time, once we know the task!
The Dictatorship of Librarians


Arts and recreation
Arts and recreation       Language
                          Language



                      Philosophy and
                      Philosophy and
 Literature
 Literature           Psychology
                      Psychology



  Technology
  Technology            Religion
                        Religion
                                   23
categories are losing money…




             vs.
Association instead of
             categorization
Ask not “what is this?”, ask “what is this like”
                         – Moshe Bar

• Exemplar Theory (Medin & Schaffer 1978,
  Nosofsky 1986, Krushke 1992)
 –categories represented in terms of remembered objects
  (exemplars)
 –Similarity is measured between input and all exemplars
 –think non-parametric density estimation
• Vanevar Bush (1945), Memex (MEMory
  EXtender)
 –Inspired hypertext, WWW, Google…
Bush’s Memex (1945)
•   Store publications, correspondence, personal work, on microfilm
•   Items retrieved rapidly using index codes
     – Builds on “rapid selector”
•   Can annotate text with margin notes, comments
•   Can construct a trail through the material and save it
     – Roots of hypertext
•   Acts as an external memory
Visual Memex, a proposal




                                Nodes = instances
                                Edges = associations

                                   types of edges:
                                   • visual similarity
                                   • spatial, temporal
                                   co-occurrence
                                   • geometric structure
                                   • language
                                   • geography
                                   •..
Milosewicz,Efros, NIPS’08]
Poster Spotlight: Relative Attributes




• Previous work restricts attributes to binary categories,
  but many attributes are more fluid and should be
  expressed relatively.
                                       [Parikh & Grauman, ICCV 2011]
Relative attributes
                           [Parikh & Grauman, ICCV 2011]

• Learn a ranking function per attribute, given ordering
  constraints among exemplars or categories
          Youth:
                                  ,                    …
• Allows two novel tasks:
          1) Zero-shot learning from                2) Description relative to
                 comparisons                            examples/classes
  Train: “Unseen person C is
  younger than S, older than H”,…
                                                           is more dense than     ,

                       S
Smiling




                                                       and less dense than

                   C           M
            H              Z                     Precise descriptions are more
                                                 recognizable to human subjects
                   Youth
Poster Spotlight: Ensemble of Exemplar-SVMs
       for Object Detection and Beyond




                        Milosewicz,Gupta,Efros, ICCV 2011]
Fcv scene efros

More Related Content

Viewers also liked

Fcv appli science_golland
Fcv appli science_gollandFcv appli science_golland
Fcv appli science_golland
zukun
 
Fcv taxo chellappa
Fcv taxo chellappaFcv taxo chellappa
Fcv taxo chellappa
zukun
 
Skiena algorithm 2007 lecture06 sorting
Skiena algorithm 2007 lecture06 sortingSkiena algorithm 2007 lecture06 sorting
Skiena algorithm 2007 lecture06 sorting
zukun
 
Fcv scene lazebnik
Fcv scene lazebnikFcv scene lazebnik
Fcv scene lazebnik
zukun
 
Fcv rep a_berg
Fcv rep a_bergFcv rep a_berg
Fcv rep a_berg
zukun
 
Fcv acad ind_martin
Fcv acad ind_martinFcv acad ind_martin
Fcv acad ind_martin
zukun
 
Fcv hum mach_belongie
Fcv hum mach_belongieFcv hum mach_belongie
Fcv hum mach_belongie
zukun
 

Viewers also liked (7)

Fcv appli science_golland
Fcv appli science_gollandFcv appli science_golland
Fcv appli science_golland
 
Fcv taxo chellappa
Fcv taxo chellappaFcv taxo chellappa
Fcv taxo chellappa
 
Skiena algorithm 2007 lecture06 sorting
Skiena algorithm 2007 lecture06 sortingSkiena algorithm 2007 lecture06 sorting
Skiena algorithm 2007 lecture06 sorting
 
Fcv scene lazebnik
Fcv scene lazebnikFcv scene lazebnik
Fcv scene lazebnik
 
Fcv rep a_berg
Fcv rep a_bergFcv rep a_berg
Fcv rep a_berg
 
Fcv acad ind_martin
Fcv acad ind_martinFcv acad ind_martin
Fcv acad ind_martin
 
Fcv hum mach_belongie
Fcv hum mach_belongieFcv hum mach_belongie
Fcv hum mach_belongie
 

Similar to Fcv scene efros

Lecture 21 - Image Categorization - Computer Vision Spring2015
Lecture 21 - Image Categorization -  Computer Vision Spring2015Lecture 21 - Image Categorization -  Computer Vision Spring2015
Lecture 21 - Image Categorization - Computer Vision Spring2015
Jia-Bin Huang
 
MIT6.870 Grounding Object Recognition and Scene Understanding: lecture 1
MIT6.870 Grounding Object Recognition and Scene Understanding: lecture 1MIT6.870 Grounding Object Recognition and Scene Understanding: lecture 1
MIT6.870 Grounding Object Recognition and Scene Understanding: lecture 1
zukun
 
Iccv11 salientobjectdetection
Iccv11 salientobjectdetectionIccv11 salientobjectdetection
Iccv11 salientobjectdetection
Jie Feng
 
Machine Learning in Computer Vision
Machine Learning in Computer VisionMachine Learning in Computer Vision
Machine Learning in Computer Vision
butest
 
Machine Learning in Computer Vision
Machine Learning in Computer VisionMachine Learning in Computer Vision
Machine Learning in Computer Vision
butest
 
NIPS2009: Understand Visual Scenes - Part 1
NIPS2009: Understand Visual Scenes - Part 1NIPS2009: Understand Visual Scenes - Part 1
NIPS2009: Understand Visual Scenes - Part 1
zukun
 
Objects for modeling world
Objects for modeling worldObjects for modeling world
Objects for modeling world
Eswaran P
 

Similar to Fcv scene efros (20)

Lecture 21 - Image Categorization - Computer Vision Spring2015
Lecture 21 - Image Categorization -  Computer Vision Spring2015Lecture 21 - Image Categorization -  Computer Vision Spring2015
Lecture 21 - Image Categorization - Computer Vision Spring2015
 
Promising avenues for interdisciplinary research in vision
Promising avenues for interdisciplinary research in visionPromising avenues for interdisciplinary research in vision
Promising avenues for interdisciplinary research in vision
 
Intro to data visualization
Intro to data visualizationIntro to data visualization
Intro to data visualization
 
Deep Learning for NLP: An Introduction to Neural Word Embeddings
Deep Learning for NLP: An Introduction to Neural Word EmbeddingsDeep Learning for NLP: An Introduction to Neural Word Embeddings
Deep Learning for NLP: An Introduction to Neural Word Embeddings
 
Experimental Media Voodoo™
Experimental Media Voodoo™Experimental Media Voodoo™
Experimental Media Voodoo™
 
RAFTing Through History
RAFTing Through HistoryRAFTing Through History
RAFTing Through History
 
MIT6.870 Grounding Object Recognition and Scene Understanding: lecture 1
MIT6.870 Grounding Object Recognition and Scene Understanding: lecture 1MIT6.870 Grounding Object Recognition and Scene Understanding: lecture 1
MIT6.870 Grounding Object Recognition and Scene Understanding: lecture 1
 
Iccv11 salientobjectdetection
Iccv11 salientobjectdetectionIccv11 salientobjectdetection
Iccv11 salientobjectdetection
 
Machine Learning in Computer Vision
Machine Learning in Computer VisionMachine Learning in Computer Vision
Machine Learning in Computer Vision
 
Machine Learning in Computer Vision
Machine Learning in Computer VisionMachine Learning in Computer Vision
Machine Learning in Computer Vision
 
strong slot and filler
strong slot and fillerstrong slot and filler
strong slot and filler
 
Georgia Ward Dyer - O Time thy pyramids - Creative AI meetup
Georgia Ward Dyer - O Time thy pyramids - Creative AI meetupGeorgia Ward Dyer - O Time thy pyramids - Creative AI meetup
Georgia Ward Dyer - O Time thy pyramids - Creative AI meetup
 
Knowledge and its organization as a matter of multiple facets, forms and func...
Knowledge and its organization as a matter of multiple facets, forms and func...Knowledge and its organization as a matter of multiple facets, forms and func...
Knowledge and its organization as a matter of multiple facets, forms and func...
 
Random Forests without the Randomness Sept_2023.pptx
Random Forests without the Randomness Sept_2023.pptxRandom Forests without the Randomness Sept_2023.pptx
Random Forests without the Randomness Sept_2023.pptx
 
Ch 5 Object Recognition.pptx
Ch 5 Object Recognition.pptxCh 5 Object Recognition.pptx
Ch 5 Object Recognition.pptx
 
20 aug 1 iit symposium tsuchiya
20 aug 1 iit symposium tsuchiya20 aug 1 iit symposium tsuchiya
20 aug 1 iit symposium tsuchiya
 
NIPS2009: Understand Visual Scenes - Part 1
NIPS2009: Understand Visual Scenes - Part 1NIPS2009: Understand Visual Scenes - Part 1
NIPS2009: Understand Visual Scenes - Part 1
 
Objects for modeling world
Objects for modeling worldObjects for modeling world
Objects for modeling world
 
16 17 bag_words
16 17 bag_words16 17 bag_words
16 17 bag_words
 
Language as social sensor - Marko Grobelnik - Dubrovnik - HrTAL2016 - 30 Sep ...
Language as social sensor - Marko Grobelnik - Dubrovnik - HrTAL2016 - 30 Sep ...Language as social sensor - Marko Grobelnik - Dubrovnik - HrTAL2016 - 30 Sep ...
Language as social sensor - Marko Grobelnik - Dubrovnik - HrTAL2016 - 30 Sep ...
 

More from zukun

My lyn tutorial 2009
My lyn tutorial 2009My lyn tutorial 2009
My lyn tutorial 2009
zukun
 
ETHZ CV2012: Tutorial openCV
ETHZ CV2012: Tutorial openCVETHZ CV2012: Tutorial openCV
ETHZ CV2012: Tutorial openCV
zukun
 
ETHZ CV2012: Information
ETHZ CV2012: InformationETHZ CV2012: Information
ETHZ CV2012: Information
zukun
 
Siwei lyu: natural image statistics
Siwei lyu: natural image statisticsSiwei lyu: natural image statistics
Siwei lyu: natural image statistics
zukun
 
Lecture9 camera calibration
Lecture9 camera calibrationLecture9 camera calibration
Lecture9 camera calibration
zukun
 
Brunelli 2008: template matching techniques in computer vision
Brunelli 2008: template matching techniques in computer visionBrunelli 2008: template matching techniques in computer vision
Brunelli 2008: template matching techniques in computer vision
zukun
 
Modern features-part-4-evaluation
Modern features-part-4-evaluationModern features-part-4-evaluation
Modern features-part-4-evaluation
zukun
 
Modern features-part-3-software
Modern features-part-3-softwareModern features-part-3-software
Modern features-part-3-software
zukun
 
Modern features-part-2-descriptors
Modern features-part-2-descriptorsModern features-part-2-descriptors
Modern features-part-2-descriptors
zukun
 
Modern features-part-1-detectors
Modern features-part-1-detectorsModern features-part-1-detectors
Modern features-part-1-detectors
zukun
 
Modern features-part-0-intro
Modern features-part-0-introModern features-part-0-intro
Modern features-part-0-intro
zukun
 
Lecture 02 internet video search
Lecture 02 internet video searchLecture 02 internet video search
Lecture 02 internet video search
zukun
 
Lecture 01 internet video search
Lecture 01 internet video searchLecture 01 internet video search
Lecture 01 internet video search
zukun
 
Lecture 03 internet video search
Lecture 03 internet video searchLecture 03 internet video search
Lecture 03 internet video search
zukun
 
Icml2012 tutorial representation_learning
Icml2012 tutorial representation_learningIcml2012 tutorial representation_learning
Icml2012 tutorial representation_learning
zukun
 
Advances in discrete energy minimisation for computer vision
Advances in discrete energy minimisation for computer visionAdvances in discrete energy minimisation for computer vision
Advances in discrete energy minimisation for computer vision
zukun
 
Gephi tutorial: quick start
Gephi tutorial: quick startGephi tutorial: quick start
Gephi tutorial: quick start
zukun
 
EM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysisEM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysis
zukun
 
Object recognition with pictorial structures
Object recognition with pictorial structuresObject recognition with pictorial structures
Object recognition with pictorial structures
zukun
 
Iccv2011 learning spatiotemporal graphs of human activities
Iccv2011 learning spatiotemporal graphs of human activities Iccv2011 learning spatiotemporal graphs of human activities
Iccv2011 learning spatiotemporal graphs of human activities
zukun
 

More from zukun (20)

My lyn tutorial 2009
My lyn tutorial 2009My lyn tutorial 2009
My lyn tutorial 2009
 
ETHZ CV2012: Tutorial openCV
ETHZ CV2012: Tutorial openCVETHZ CV2012: Tutorial openCV
ETHZ CV2012: Tutorial openCV
 
ETHZ CV2012: Information
ETHZ CV2012: InformationETHZ CV2012: Information
ETHZ CV2012: Information
 
Siwei lyu: natural image statistics
Siwei lyu: natural image statisticsSiwei lyu: natural image statistics
Siwei lyu: natural image statistics
 
Lecture9 camera calibration
Lecture9 camera calibrationLecture9 camera calibration
Lecture9 camera calibration
 
Brunelli 2008: template matching techniques in computer vision
Brunelli 2008: template matching techniques in computer visionBrunelli 2008: template matching techniques in computer vision
Brunelli 2008: template matching techniques in computer vision
 
Modern features-part-4-evaluation
Modern features-part-4-evaluationModern features-part-4-evaluation
Modern features-part-4-evaluation
 
Modern features-part-3-software
Modern features-part-3-softwareModern features-part-3-software
Modern features-part-3-software
 
Modern features-part-2-descriptors
Modern features-part-2-descriptorsModern features-part-2-descriptors
Modern features-part-2-descriptors
 
Modern features-part-1-detectors
Modern features-part-1-detectorsModern features-part-1-detectors
Modern features-part-1-detectors
 
Modern features-part-0-intro
Modern features-part-0-introModern features-part-0-intro
Modern features-part-0-intro
 
Lecture 02 internet video search
Lecture 02 internet video searchLecture 02 internet video search
Lecture 02 internet video search
 
Lecture 01 internet video search
Lecture 01 internet video searchLecture 01 internet video search
Lecture 01 internet video search
 
Lecture 03 internet video search
Lecture 03 internet video searchLecture 03 internet video search
Lecture 03 internet video search
 
Icml2012 tutorial representation_learning
Icml2012 tutorial representation_learningIcml2012 tutorial representation_learning
Icml2012 tutorial representation_learning
 
Advances in discrete energy minimisation for computer vision
Advances in discrete energy minimisation for computer visionAdvances in discrete energy minimisation for computer vision
Advances in discrete energy minimisation for computer vision
 
Gephi tutorial: quick start
Gephi tutorial: quick startGephi tutorial: quick start
Gephi tutorial: quick start
 
EM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysisEM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysis
 
Object recognition with pictorial structures
Object recognition with pictorial structuresObject recognition with pictorial structures
Object recognition with pictorial structures
 
Iccv2011 learning spatiotemporal graphs of human activities
Iccv2011 learning spatiotemporal graphs of human activities Iccv2011 learning spatiotemporal graphs of human activities
Iccv2011 learning spatiotemporal graphs of human activities
 

Recently uploaded

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 

Recently uploaded (20)

Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 

Fcv scene efros

  • 1. What do we want from Computational Scene Understanding? Š Quint Buchholz Alexei “Alyosha” Efros Carnegie Mellon University
  • 2. Many ways to Understand a Scene… • Scene Categories (e.g. street scene, city, Boston, etc.) • Innumerate objects (people, cars, lampposts, etc) • Label/Segment scene elements (road, buildings, sky, etc) • Scene Geometry (qualitative or quantitative) and Illumination • Physical Affordances (where can I walk?) • Prediction (What will happen next?)
  • 3. Scene Categorization beach mountain forest Animal vs. no animal (Thorpe, city street farm Poggio & Oliva) Basic Scene Categories (Oliva, Renenger, Fei-Fei, etc)n Image classification, e.g. “Boat” (Caltech, etc.) Spatial envelope (Oliva & Torralba) im2gps
  • 4. Poster Spotlight: SUN Attributes: A Large-Scale Database of Scene Attributes Genevieve Patterson and James Hays, Brown University Global, binary attributes describing: • Affordances / Functions (e.g. farming, eating) • Materials (e.g. carpet, running water) • Surface Properties (e.g. aged, sterile) • Spatial Envelope (e.g. enclosed, symmetrical) Statistics of database: • 14340 images from 717 scene categories • 106 attributes Space of Scenes • 2 million+ labels collected so far Organized by Attributes • Outlier workers manually graded, good workers ~90% accurate.
  • 6. good for scene retrieval Lamp Couch Couch Table Standard detection task, e.g. PASCAL
  • 7. But picture is worth… 4 words? Lamp Couch Couch Table
  • 8. Where can I sit ? Lamp Couch Couch Table
  • 9. Labeling Pixels See Alan and Lana talk…
  • 10. 3D Scene Understanding Hoeim et al See Martial and Silvio’s talk…
  • 11. Pushing Back Evaluation Horizon… • So far, we have acted as cognitive psychologists: – proposing and evaluating intermediate “mental representations” – E.g. object/scene categories, pixel labels, geometry • But we can also be more behaviorist: – Focus on tasks instead of “mental states” – Evaluate action plans and behavior predictions
  • 12. Task-Specific Questions Pushable, Reachable, Sittable ……
  • 13. Human Centric Scene Understanding Can Move Can Sit Can Push Can Walk Reasoning in terms of set of allowable actions/body poses.
  • 14. Human Workspace 3D Scene Geometry Gupta et al, CVPR’11 Joint Space of Human-Scene Interactions
  • 17. Event prediction Input image Video database Liu, Yuen, Torralba. CVPR 2009. Yuen, Torralba. ECCV 2010
  • 18. Event prediction Input image Video database Liu, Yuen, Torralba. CVPR 2009; Yuen, Torralba. ECCV 2010
  • 19. What do we need to get to there? The Op-Ed Part ☺
  • 20. Organizing Our Data “It irritated him that the “dog” of 3:14 in the afternoon, seen in profile, should be indicated by the same noun as the dog of 3:15, seen frontally…” “My memory, sir, is like a garbage heap.” Fumes the Memorious Jorge Luis Borges
  • 21. Trouble with Classic Platonic Categorization • Step 1: cut up the world into categories PASCAL “train” category • Step 2: train an SVM on positives vs. negatives
  • 22. It gets more complicated… • Number of objects * number of interactions * number of outcomes… = too many categories • Don’t want to categorize too early – “Dealing with the world as it comes to us” [Derek] • Let’s categorize at run-time, once we know the task!
  • 23. The Dictatorship of Librarians Arts and recreation Arts and recreation Language Language Philosophy and Philosophy and Literature Literature Psychology Psychology Technology Technology Religion Religion 23
  • 24. categories are losing money… vs.
  • 25. Association instead of categorization Ask not “what is this?”, ask “what is this like” – Moshe Bar • Exemplar Theory (Medin & Schaffer 1978, Nosofsky 1986, Krushke 1992) –categories represented in terms of remembered objects (exemplars) –Similarity is measured between input and all exemplars –think non-parametric density estimation • Vanevar Bush (1945), Memex (MEMory EXtender) –Inspired hypertext, WWW, Google…
  • 26. Bush’s Memex (1945) • Store publications, correspondence, personal work, on microfilm • Items retrieved rapidly using index codes – Builds on “rapid selector” • Can annotate text with margin notes, comments • Can construct a trail through the material and save it – Roots of hypertext • Acts as an external memory
  • 27. Visual Memex, a proposal Nodes = instances Edges = associations types of edges: • visual similarity • spatial, temporal co-occurrence • geometric structure • language • geography •.. Milosewicz,Efros, NIPS’08]
  • 28. Poster Spotlight: Relative Attributes • Previous work restricts attributes to binary categories, but many attributes are more fluid and should be expressed relatively. [Parikh & Grauman, ICCV 2011]
  • 29. Relative attributes [Parikh & Grauman, ICCV 2011] • Learn a ranking function per attribute, given ordering constraints among exemplars or categories Youth: , … • Allows two novel tasks: 1) Zero-shot learning from 2) Description relative to comparisons examples/classes Train: “Unseen person C is younger than S, older than H”,… is more dense than , S Smiling and less dense than C M H Z Precise descriptions are more recognizable to human subjects Youth
  • 30. Poster Spotlight: Ensemble of Exemplar-SVMs for Object Detection and Beyond Milosewicz,Gupta,Efros, ICCV 2011]