Using geometry and related things
Representation spectrum (Qualitative → More quantitative, more precise → Explicit):
Region labels | + Boundaries and objects | Stronger geometric constraints from domain knowledge | Reasoning on aspects and poses | 3D point clouds
[D. Hoiem, A. A. Efros, and M. Hebert. Recovering surface layout from an image. IJCV, 75(1):151–172, 2007]




What level of representation?
How qualitative?
What type of training information is available?
Assumptions (camera geometry, etc.)?

Learning from image features to depth + MRF: A. Saxena, et al. 3-D depth reconstruction from a single still image. IJCV, 76, 2007.
Make3D: Learning 3D Scene Structure from a Single Still Image: A. Saxena, et al. TPAMI, 2010.
Stage classes: Nedovic, V., Smeulders, A., Redert, A., Geusebroek, J.: Stages as models of scene geometry. In: PAMI (2010)
Next: can coarse surface labels be used to improve object recognition and scene analysis performance through better geometric reasoning?
[Figure: Object Detection | Surface Estimates | Viewpoint Prior, with Local Car Detector and Local Ped Detector outputs]

From geometry to objects and back?
Distributions versus decisions?

D. Hoiem, A. Efros, M. Hebert. Putting objects in perspective. IJCV 2009.
S.Y. Bao, M. Sun, S. Savarese. Toward Coherent Object Detection And Scene Layout Understanding. CVPR 2010.
B. Leibe, N. Cornelis, K. Cornelis, and L. Van Gool. Dynamic 3D Scene Analysis from a Moving Vehicle. CVPR07.
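One way a viewpoint prior can constrain local detections is through the standard ground-plane relation between the horizon row, the camera height, and the image extent of an object. The sketch below is only an illustration under stated assumptions: a level camera (no tilt or roll), objects standing on a flat ground plane, and made-up numbers for the camera height, horizon row, detection boxes, and pedestrian height prior. The function names are hypothetical and are not taken from the cited papers.

```python
import math


def object_height(v_top, v_bottom, v_horizon, camera_height):
    """World height of an object standing on the ground plane.

    Rows are in pixels and increase downward. Assumes a level camera
    (no tilt or roll), so the horizon is the single image row v_horizon.
    """
    if v_bottom <= v_horizon:
        raise ValueError("object base must lie below the horizon row")
    return camera_height * (v_bottom - v_top) / (v_bottom - v_horizon)


def plausibility(height, mean, std):
    """Unnormalized Gaussian score of a height under a class prior."""
    z = (height - mean) / std
    return math.exp(-0.5 * z * z)


# Hypothetical numbers: camera 1.6 m above the ground, horizon at row 200.
CAMERA_HEIGHT, HORIZON = 1.6, 200.0
detections = {"near_box": (180.0, 360.0), "far_box": (150.0, 250.0)}  # (top, bottom) rows
for name, (top, bottom) in detections.items():
    h = object_height(top, bottom, HORIZON, CAMERA_HEIGHT)
    # Score each box against an assumed pedestrian prior (1.7 m +/- 0.1 m).
    print(name, round(h, 2), plausibility(h, mean=1.7, std=0.1))
```

A box whose implied height is far from the class prior gets a low score, which is one simple way for geometry to feed back into detection while keeping a soft score rather than a hard decision.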
• Is a more precise representation possible?
• Boundaries, interposition, relative depth ordering….
• Still low-level: Can we combine reasoning about semantic labels

Representation spectrum (Qualitative → More quantitative, more precise → Explicit):
Region labels | + More geometric relations and semantic labels | Stronger geometric constraints from domain knowledge | Reasoning on aspects and poses | 3D point clouds
Iterative refinement
[Figure: scene interpretation after Iter 1, Iter 2, and Final]
D. Hoiem, A. A. Efros, and M. Hebert. Closing the loop on scene interpretation. In CVPR, 2008.

Toward true integration of geometric and semantic cues: How to beat the intractable nature of the problem?

Global optimization
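F(D | I, L) = \sum_p \psi_1(d_p) + \sum_{pqr} \psi_2(d_p, d_q, d_r) + \sum_p \psi_3(d_p, d_g)

What features/cues can be used?

F(\alpha | I, L, S) = \sum_p \psi_1(\alpha_i) + \sum_i \psi_2(\alpha_i) + \sum_{ij} \psi_3(\alpha_i, \alpha_j)

[Figure: neighboring planes \alpha_i^T x = 1 and \alpha_j^T x = 1]
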


 B. Liu, S. Gould, D. Koller. Single image depth estimation from predicted semantic labels. CVPR 2010
 S. Gould, R. Fulton, D. Koller. Decomposing a scene into geometric and semantically consistent regions. ICCV 2009
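To make the structure of such a three-term energy concrete, here is a minimal sketch on a one-dimensional row of pixels. Everything specific in it is an assumption for illustration: the quadratic form of each term, the weights, the toy depth values, and the use of iterated conditional modes (ICM, a simple coordinate-wise minimizer) in place of whatever inference the cited papers use.

```python
def energy(d, d_pred, d_prior, w_smooth=1.0, w_prior=0.5):
    """Toy three-term energy over a 1-D row of depths.

    unary:  agreement with per-pixel predicted depth
    triple: second-difference smoothness over consecutive pixels (p, q, r)
    prior:  agreement with a geometry-derived depth d_g per pixel
    """
    unary = sum((a - b) ** 2 for a, b in zip(d, d_pred))
    triple = sum((d[k] - 2 * d[k + 1] + d[k + 2]) ** 2 for k in range(len(d) - 2))
    prior = sum((a - g) ** 2 for a, g in zip(d, d_prior))
    return unary + w_smooth * triple + w_prior * prior


def nearest_label(value, labels):
    """Discrete depth label closest to a continuous value."""
    return min(labels, key=lambda l: (l - value) ** 2)


def icm(d_pred, d_prior, labels, sweeps=10):
    """Iterated conditional modes: update one pixel at a time.

    Each update keeps the current label as a candidate, so the energy
    never increases. The result is a local minimum, not a global one.
    """
    d = [nearest_label(v, labels) for v in d_pred]
    for _ in range(sweeps):
        changed = False
        for p in range(len(d)):
            best = min(labels, key=lambda l: energy(d[:p] + [l] + d[p + 1:], d_pred, d_prior))
            if best != d[p]:
                d[p] = best
                changed = True
        if not changed:
            break
    return d


labels = [0.5 + 0.25 * k for k in range(13)]   # candidate depths 0.5 .. 3.5
d_pred = [1.0, 1.2, 3.0, 1.6, 1.8]             # noisy per-pixel predictions
d_prior = [1.0, 1.25, 1.5, 1.75, 2.0]          # assumed geometric prior
print(icm(d_pred, d_prior, labels))
```

The greedy updates illustrate why the full problem is hard: each step only sees one variable, so the quality of the result depends on initialization and on how strongly the terms couple neighboring pixels.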
• Still mostly bottom-up classification approach
• No use of domain constraints or constraints governing the physical world




Representation spectrum (Qualitative → More quantitative, more precise → Explicit):
Region labels | + Boundaries and objects | Stronger geometric constraints from domain knowledge | Reasoning on aspects and poses | 3D point clouds
Score

How to generate and search through hypotheses (in a tractable manner)?
How to evaluate score?
How to avoid early decisions?
How to represent constraints in a more general way?

[Figure: Lines | Faces]

Classifiers: f(x, y, w) = w^T \psi(x, y)
  w: learned weight vector
  \psi(x, y): feature vector measuring agreement between lines, faces and labels

D. Lee, T. Kanade, M. Hebert. Geometric Reasoning for Single Image Structure Recovery. CVPR09.
V. Hedau, D. Hoiem, D. Forsyth, “Recovering the Spatial Layout of Cluttered Rooms,” IEEE International Conference on Computer Vision (ICCV), 2009.
H. Wang, S. Gould, D. Koller. Discriminative learning with latent variables for cluttered indoor scene understanding. ECCV 2010.
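A linear score of this kind, a learned weight vector applied to a feature vector that measures agreement between a hypothesis and the image, can be sketched in a few lines. The example below picks the best of a few candidate layouts and adjusts the weights with a structured perceptron update. The perceptron is a deliberately simple stand-in chosen for brevity, not the learning method of the cited papers, and the candidate names and feature values are invented.

```python
def score(w, features):
    """Linear score: dot product of weights and hypothesis features."""
    return sum(wi * fi for wi, fi in zip(w, features))


def predict(w, candidates):
    """Return the candidate hypothesis with the highest score.

    candidates maps a hypothesis name to its feature vector.
    """
    return max(candidates, key=lambda y: score(w, candidates[y]))


def perceptron_update(w, candidates, y_true, lr=1.0):
    """Structured perceptron step: move w toward the true hypothesis
    and away from the currently predicted one when they differ."""
    y_hat = predict(w, candidates)
    if y_hat == y_true:
        return list(w)
    return [wi + lr * (ft - fh)
            for wi, ft, fh in zip(w, candidates[y_true], candidates[y_hat])]


# Hypothetical features: agreement with lines, faces, and surface labels.
candidates = {
    "layout_a": [0.9, 0.2, 0.1],
    "layout_b": [0.4, 0.8, 0.7],
}
w = [0.0, 0.0, 0.0]
print(predict(w, candidates))                       # tie at zero weights
w = perceptron_update(w, candidates, y_true="layout_b")
print(predict(w, candidates), w)
```

Because the score is linear in the features, comparing hypotheses is cheap. The cost lies in generating the candidates and computing their features.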
Representation spectrum (Qualitative → More quantitative, more precise → Explicit):
Region labels | + Boundaries and objects | Stronger geometric constraints from domain knowledge + more constraints | 3D point clouds
• Finite volume
• Spatial exclusion
• Containment
• Stability
• Contact
• Proximity
• ………….
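Several of the constraints listed above can be checked directly once objects are approximated by simple volumes. The sketch below assumes axis-aligned boxes in a room-aligned coordinate frame, which is a simplification made for illustration. The `Box` type, the function names, and the example dimensions are hypothetical. It covers finite volume, spatial exclusion, and containment only.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Box:
    """Axis-aligned box given by its minimum and maximum corners (meters)."""
    xmin: float
    ymin: float
    zmin: float
    xmax: float
    ymax: float
    zmax: float


def has_finite_volume(b):
    """Finite volume: every side has strictly positive length."""
    return b.xmax > b.xmin and b.ymax > b.ymin and b.zmax > b.zmin


def overlaps(a, b):
    """True when the interiors of two boxes intersect."""
    return (a.xmin < b.xmax and b.xmin < a.xmax and
            a.ymin < b.ymax and b.ymin < a.ymax and
            a.zmin < b.zmax and b.zmin < a.zmax)


def contains(outer, inner):
    """Containment: inner lies entirely inside outer."""
    return (outer.xmin <= inner.xmin and inner.xmax <= outer.xmax and
            outer.ymin <= inner.ymin and inner.ymax <= outer.ymax and
            outer.zmin <= inner.zmin and inner.zmax <= outer.zmax)


def is_valid(room, objects):
    """A configuration is valid when all objects have volume, sit inside
    the room, and no two objects occupy the same space."""
    if not all(has_finite_volume(o) and contains(room, o) for o in objects):
        return False
    return not any(overlaps(a, b) for a, b in combinations(objects, 2))


room = Box(0, 0, 0, 4.0, 3.0, 2.5)
bed = Box(0.5, 0.5, 0, 2.5, 2.0, 0.6)
table = Box(3.0, 1.0, 0, 3.8, 2.0, 0.8)
print(is_valid(room, [bed, table]))
```

Stability, contact, and proximity need more than box intersection tests (for example support relations and distances), so they are left out of this sketch.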
• Search through hypothesis space

[Figure panels: Input image; Line segments and vanishing points; Room hypotheses; Geometric context; Orientation map; Object hypotheses; Reject invalid configurations]

f(x, y) = w^T \psi(x, y) + w_\phi^T \phi(y)
  x: features from image (surface labels, vanishing points, etc.)
  y: hypothesis (scene layout + object hypothesis)
  w^T \psi(x, y): compatibility of image data with geometric configuration
  w_\phi^T \phi(y): penalty term for incompatible configurations


D. Lee, A. Gupta, M. Hebert, and T. Kanade. Estimating Spatial Layout of Rooms using Volumetric Reasoning about Objects
and Surfaces. Advances in Neural Information Processing Systems (NIPS), Vol. 24, 2010.
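The search described above, which enumerates joint room and object hypotheses, rejects invalid configurations, and scores the rest with a compatibility term plus a penalty term, can be sketched on a toy problem. To keep it short, objects are reduced to intervals along one wall. The fit values, the weights, and the choice of penalty (counting object pairs closer than a minimum gap) are all assumptions for illustration and are not the features used in the cited paper.

```python
from itertools import combinations


def gap(a, b):
    """Signed gap between two intervals; negative means they overlap."""
    return max(a[0], b[0]) - min(a[1], b[1])


def is_valid(width, spans):
    """Hard constraints: every span inside the room, no two spans overlap."""
    inside = all(0.0 <= s[0] and s[1] <= width for s in spans)
    disjoint = all(gap(a, b) >= 0.0 for a, b in combinations(spans, 2))
    return inside and disjoint


def score(room, chosen, objects, w=(1.0, 1.0), w_phi=(-0.5,), min_gap=0.3):
    """Compatibility term w.psi(x, y) plus penalty term w_phi.phi(y)."""
    spans = [objects[n]["span"] for n in chosen]
    psi = (room["fit"], sum(objects[n]["fit"] for n in chosen))
    # Assumed soft penalty: number of object pairs closer than min_gap.
    phi = (sum(1 for a, b in combinations(spans, 2) if gap(a, b) < min_gap),)
    return (sum(wi * f for wi, f in zip(w, psi)) +
            sum(wi * f for wi, f in zip(w_phi, phi)))


def search(rooms, objects):
    """Exhaustively enumerate (room, object subset) hypotheses."""
    best = None
    names = sorted(objects)
    for room_name, room in rooms.items():
        for k in range(len(names) + 1):
            for chosen in combinations(names, k):
                spans = [objects[n]["span"] for n in chosen]
                if not is_valid(room["width"], spans):
                    continue  # reject invalid configurations
                s = score(room, chosen, objects)
                if best is None or s > best[0]:
                    best = (s, room_name, chosen)
    return best


rooms = {"room_a": {"width": 4.0, "fit": 0.8}, "room_b": {"width": 3.0, "fit": 0.9}}
objects = {
    "bed": {"span": (0.5, 2.5), "fit": 0.7},
    "sofa": {"span": (2.0, 3.5), "fit": 0.6},
    "desk": {"span": (3.0, 3.8), "fit": 0.3},
    "lamp": {"span": (2.6, 2.8), "fit": 0.2},
}
print(search(rooms, objects))
```

Exhaustive enumeration only works because the toy hypothesis space is tiny. With realistic numbers of room and object candidates the same loop becomes the tractability problem raised in the slides.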
[Figure: Original Image; Surface Layout; Density Map; Bag of Segments; 3D Parse Graph]
Catalogue: Frontal, Front-Right, Front-Left, Left-Right, Left-Occluded, Right-Occluded, Porous, Solid
3D Parse Graph labels: sky, Ground, above, Infront, supported, Point-supported, Medium, High
A. Gupta, A. Efros, and M. Hebert. Blocks World Revisited: Image Understanding Using Qualitative Geometry and
Mechanics. ECCV 2010.

http://www.cs.cmu.edu/~abhinavg/blocksworld
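A parse graph of this kind can be held in a very small data structure: blocks as nodes tagged with a catalogue class, and qualitative relations as labeled edges. The sketch below is a hypothetical representation written for illustration, not the authors' code. The block names and the example scene are invented, and the only reasoning it does is to check that each block is connected to the ground through support relations.

```python
from dataclasses import dataclass, field

# Relation labels that count as physical support in this sketch.
SUPPORT_RELATIONS = {"supported", "point-supported"}


@dataclass
class ParseGraph:
    view_class: dict = field(default_factory=dict)  # block name -> catalogue class
    edges: list = field(default_factory=list)       # (relation, source, target)

    def add_block(self, name, view_class):
        self.view_class[name] = view_class

    def relate(self, relation, source, target):
        """Record a qualitative relation, e.g. ('supported', 'awning', 'building')."""
        self.edges.append((relation, source, target))

    def is_grounded(self, name, _seen=None):
        """True if a chain of support relations leads from name to 'ground'."""
        if name == "ground":
            return True
        seen = set() if _seen is None else _seen
        if name in seen:          # guard against cyclic support chains
            return False
        seen.add(name)
        return any(rel in SUPPORT_RELATIONS and src == name and self.is_grounded(dst, seen)
                   for rel, src, dst in self.edges)


g = ParseGraph()
g.add_block("building", "frontal")
g.add_block("tree", "porous")
g.add_block("awning", "front-left")
g.relate("supported", "building", "ground")
g.relate("point-supported", "tree", "ground")
g.relate("supported", "awning", "building")
g.relate("infront", "tree", "building")
print({name: g.is_grounded(name) for name in g.view_class})
```

Keeping the relations qualitative means the graph can be checked for physical plausibility without committing to metric 3D coordinates.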
• Direct search through hypothesis space
  • Sampling

L. Del Pero, J. Guan, E. Brau, J. Schlecht, K. Barnard. Sampling Bedrooms. CVPR 2011.

Diffusion moves:
  Sample room boundary: Change r = (x,y,z,w,h,l, )
  Sample camera: Change c =       f
  Sample object parameters: Change o = (x,y,z,w,h,l)
  Sample over a block edge only
Jump moves: [figure]
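The diffusion moves listed above perturb continuous parameters such as a room box (x, y, z, w, h, l). A minimal Metropolis-Hastings sketch of that idea follows. The log-score is a made-up quadratic stand-in for an image likelihood, the target values and step size are arbitrary, and jump moves are not covered. It is meant only to show the accept/reject mechanics of a single-coordinate Gaussian proposal.

```python
import math
import random


def log_score(r, target):
    """Stand-in log-posterior: higher when r is closer to target.
    A real system would compare the projected hypothesis with the image."""
    return -sum((a - b) ** 2 for a, b in zip(r, target))


def diffusion_chain(r0, target, steps=2000, sigma=0.1, seed=0):
    """Metropolis-Hastings with symmetric Gaussian moves on one coordinate."""
    rng = random.Random(seed)
    r, lp = list(r0), log_score(r0, target)
    best, best_lp = list(r), lp
    for _ in range(steps):
        k = rng.randrange(len(r))              # pick one parameter to perturb
        proposal = list(r)
        proposal[k] += rng.gauss(0.0, sigma)
        lp_new = log_score(proposal, target)
        # Accept with probability min(1, exp(lp_new - lp)).
        if rng.random() < math.exp(min(0.0, lp_new - lp)):
            r, lp = proposal, lp_new
            if lp > best_lp:
                best, best_lp = list(r), lp
    return best, best_lp


r0 = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]            # initial (x, y, z, w, h, l)
target = [2.0, 1.5, 0.0, 4.0, 2.5, 3.0]        # hypothetical best-fitting box
best, best_lp = diffusion_chain(r0, target)
print([round(v, 2) for v in best], round(best_lp, 4))
```

Because proposals that lower the score are sometimes accepted, the chain can escape poor local configurations, which is the main reason to sample rather than follow a greedy search.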
• Direct search through hypothesis space
  • Sampling
  • (Constrained) object detection

P(O_1, \ldots, O_N, L, H | I) = P(H) P(L | H, I) \prod_i P(O_i | L, H, I)


V. Hedau, D. Hoiem, D. Forsyth, “Thinking Inside the Box: Using Appearance Models and Context Based on Room Geometry,”
European Conference on Computer Vision (ECCV), 2010.
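The factorization above means a joint hypothesis can be scored by multiplying one prior factor, one layout factor, and one factor per object. In practice this is usually done in log space to avoid underflow. The sketch below assumes the individual probabilities are already available as numbers. The slide does not define H and L, so the code treats them as opaque hypothesis variables, and all values are invented.

```python
import math


def log_joint(p_h, p_l_given_h, p_objects):
    """log P(O_1..O_N, L, H | I) under the factorization
    P(H) * P(L | H, I) * prod_i P(O_i | L, H, I)."""
    return (math.log(p_h) + math.log(p_l_given_h) +
            sum(math.log(p) for p in p_objects))


# Two hypothetical joint hypotheses with made-up factor values.
hypotheses = {
    "hyp_1": {"p_h": 0.5, "p_l": 0.4, "p_objs": [0.9, 0.5]},
    "hyp_2": {"p_h": 0.5, "p_l": 0.6, "p_objs": [0.3, 0.2]},
}
scores = {name: log_joint(h["p_h"], h["p_l"], h["p_objs"])
          for name, h in hypotheses.items()}
best = max(scores, key=scores.get)
print(best, math.exp(scores[best]))
```

Each object factor is conditioned on the layout, so a detection that looks strong in isolation can still be down-weighted when it is inconsistent with the room hypothesis.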
• Direct search through hypothesis space
  • Sampling
  • Object detection
  • Grammars

[Figure: facade parsing results (Truth, Max) with labels Window, Wall, Balcony, Door, Roof]

L. Simon, O. Teboul, P. Koutsourakis and N. Paragios. Random Exploration of the Procedural Space for Single-View 3D Modeling of Buildings. International Journal of Computer Vision (IJCV), 2010.
O. Teboul, I. Kokkinos, P. Koutsourakis, L. Simon and N. Paragios. Shape Grammar Parsing via Reinforcement Learning. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2011.
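A grammar turns the hypothesis space into the set of derivations it can produce. The sketch below samples one random derivation from a tiny facade grammar whose terminal symbols reuse the labels shown on the slide (Roof, Wall, Window, Balcony, Door). The production rules themselves are invented for illustration and are far simpler than the shape grammars in the cited papers, which also search over derivations instead of sampling a single one.

```python
import random

# Hypothetical production rules; the first option of each rule is the
# shortest one and is forced once the depth limit is reached.
RULES = {
    "Facade": [["Roof", "Floors"]],
    "Floors": [["Floor"], ["Floor", "Floors"]],
    "Floor": [["Tiles"]],
    "Tiles": [["Tile"], ["Tile", "Tiles"]],
    "Tile": [["Wall", "Window", "Wall"],
             ["Wall", "Balcony", "Wall"],
             ["Wall", "Door", "Wall"]],
}
TERMINALS = {"Roof", "Wall", "Window", "Balcony", "Door"}


def derive(symbol, rng, depth=0, max_depth=6):
    """Expand a symbol into a flat list of terminal labels."""
    if symbol in TERMINALS:
        return [symbol]
    options = RULES[symbol]
    rhs = options[0] if depth >= max_depth else rng.choice(options)
    out = []
    for child in rhs:
        out.extend(derive(child, rng, depth + 1, max_depth))
    return out


rng = random.Random(0)
print(derive("Facade", rng))
```

Scoring each derivation against the image and keeping the best one would turn this sampler into a crude random-exploration parser.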
•   Direct search through hypothesis space
    •   Sampling
    •   Object detection
    •   Grammars

What constraints?
How to combine low level classifiers with “higher-level” reasoning?
What are the right tools to search through and score hypotheses?
How to represent partial interpretations without early
commitment?
G. Tsai et al. Real-time Indoor Scene Understanding using Bayesian Filtering with Motion Cues. ICCV 2011.

Representation spectrum (Qualitative → More quantitative, more precise → Explicit):
Region labels | + Boundaries and objects | Stronger geometric constraints from domain knowledge + more constraints | 3D point clouds
Additional inputs: + sparse/partial 3D data, + other constraints, + (large) prior data
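Bayesian filtering over scene hypotheses, the idea named in the Tsai et al. reference above, can be illustrated with a recursive update over a fixed, discrete set of hypotheses. This sketch shows only the measurement update (posterior proportional to prior times likelihood) with invented per-frame likelihoods. It has no motion or prediction step and is not the method of the cited paper.

```python
def bayes_update(prior, likelihood):
    """One measurement update over a discrete hypothesis set.

    prior and likelihood map hypothesis name -> probability / likelihood.
    """
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("all hypotheses have zero posterior mass")
    return {h: v / total for h, v in unnormalized.items()}


# Three hypothetical layout hypotheses, uniform prior.
belief = {"layout_a": 1 / 3, "layout_b": 1 / 3, "layout_c": 1 / 3}
# Made-up likelihoods of the observations in two consecutive frames.
frames = [
    {"layout_a": 0.2, "layout_b": 0.7, "layout_c": 0.1},
    {"layout_a": 0.3, "layout_b": 0.6, "layout_c": 0.1},
]
for likelihood in frames:
    belief = bayes_update(belief, likelihood)
print(belief)
```

Keeping a distribution over hypotheses across frames is one way to postpone commitment: weak hypotheses lose mass gradually and are not discarded after a single frame.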
• What level of representation? Do we need explicit
  parse of the input or how far can we go with
  associations?
   – Hierarchical, region labels, associations,...
• How to incorporate knowledge/context external to
  the input image?
   – Task, geometry, contextual info (scene type,
     location), text, ...
• Should we use global models vs. sequences of
  simpler models? Is the problem too hard as posed,
  i.e., intractable?
• What are the right definitions of actions, activities,
  behaviors?
• How to combine temporal (actions) and spatial
  (scenes, objects) information effectively?
• What should be evaluated and at what stage?
   – bounding boxes, pixelwise labels, 3D models,
     actions/behaviors, predictions?

Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxnull - The Open Security Community
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 

Recently uploaded (20)

SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
Unlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power SystemsUnlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power Systems
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
The transition to renewables in India.pdf
The transition to renewables in India.pdfThe transition to renewables in India.pdf
The transition to renewables in India.pdf
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptxVulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
 

Using geometry for scene understanding

  • 1. Using geometry and related things
  • 2. Spectrum of representations, from qualitative to explicit (more quantitative, more precise): Region labels → + Boundaries and objects → Stronger geometric constraints from domain knowledge → Reasoning on aspects and poses → 3D point clouds
  • 3. What level of representation? How qualitative? What type of training information is available? Assumptions (camera geometry, etc.)?
      – D. Hoiem, A. A. Efros, and M. Hebert. Recovering surface layout from an image. IJCV, 75(1):151–172, 2007.
      – Learning from image features to depth + MRF: A. Saxena et al. 3-D depth reconstruction from a single still image. IJCV, 76, 2007.
      – Make3D: A. Saxena et al. Make3D: Learning 3D Scene Structure from a Single Still Image. TPAMI, 2010.
      – Stage classes: V. Nedovic, A. Smeulders, A. Redert, J. Geusebroek. Stages as models of scene geometry. PAMI, 2010.
  • 4. Next: Can coarse surface labels be used to improve object recognition and scene analysis performance through better geometric reasoning?
  • 5. From geometry to objects and back? Distributions versus decisions?
      – Figure panels: Object Detection, Surface Estimates, Viewpoint Prior; Local Car Detector, Local Ped Detector.
      – D. Hoiem, A. Efros, M. Hebert. Putting objects in perspective. IJCV 2009.
      – S.Y. Bao, M. Sun, S. Savarese. Toward Coherent Object Detection And Scene Layout Understanding. CVPR 2010.
      – B. Leibe, N. Cornelis, K. Cornelis, and L. Van Gool. Dynamic 3D Scene Analysis from a Moving Vehicle. CVPR 2007.
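The viewpoint prior on slide 5 links an object's size in the image to the horizon line and the camera height. A rough sketch of that relation follows, assuming the ground-plane approximation described in the cited Hoiem et al. paper (object resting on the ground, camera roughly level, image rows measured upward from the bottom of the image), under which the world height is approximately y ≈ h · y_c / (v_0 − v_i). The function names and the numbers are illustrative, not taken from the slides.

```python
import math


def world_height(h_img, v_bottom, v_horizon, camera_height):
    """Approximate object height in metres: y ~= h * y_c / (v_0 - v_i).

    h_img, v_bottom and v_horizon share the same image units
    (for example, fractions of the image height).
    """
    gap = v_horizon - v_bottom
    if gap <= 0:
        raise ValueError("object bottom must lie below the horizon")
    return h_img * camera_height / gap


def expected_image_height(obj_height, v_bottom, v_horizon, camera_height):
    """Invert the relation: image height expected for a known object height."""
    gap = v_horizon - v_bottom
    if gap <= 0:
        raise ValueError("object bottom must lie below the horizon")
    return obj_height * gap / camera_height


# Hypothetical numbers: camera 1.6 m above the ground, horizon at mid-image.
car_height = world_height(h_img=0.1875, v_bottom=0.3, v_horizon=0.5,
                          camera_height=1.6)
assert math.isclose(car_height, 1.5)
```

A detector window whose implied world height is implausible for its class (say, a 6 m pedestrian) can then be down-weighted rather than rejected outright, which is one reading of "distributions versus decisions".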
  • 6. Is a more precise representation possible?
      – Boundaries, interposition, relative depth ordering…
      – Still low-level: Can we combine reasoning about semantic labels
      – Spectrum (qualitative to explicit; more quantitative, more precise): Region labels → + More geom. relations and semantic labels → Stronger geometric constraints from domain knowledge → Reasoning on aspects and poses → 3D point clouds
  • 7. Toward true integration of geometric and semantic cues: How to beat the intractable nature of the problem? What features/cues can be used?
      – Iterative refinement (figure panels: Iter 1, Iter 2, Final): D. Hoiem, A. A. Efros, and M. Hebert. Closing the loop on scene interpretation. CVPR, 2008.
      – Global optimization: $F(D \mid I, L) = \sum_{p} \psi_1(d_p) + \sum_{pqr} \psi_2(d_p, d_q, d_r) + \sum_{p} \psi_3(d_p, d_g)$, and a second three-term energy $F(\cdot \mid I, L, S)$ with two terms indexed by $i$ and one indexed by $(i, j)$.
      – B. Liu, S. Gould, D. Koller. Single image depth estimation from predicted semantic labels. CVPR 2010.
      – S. Gould, R. Fulton, D. Koller. Decomposing a scene into geometric and semantically consistent regions. ICCV 2009.
  • 8. Still mostly a bottom-up classification approach
      – No use of domain constraints or constraints governing the physical world
      – Spectrum (qualitative to explicit; more quantitative, more precise): Region labels → + Boundaries and objects → Stronger geometric constraints from domain knowledge → Reasoning on aspects and poses → 3D point clouds
  • 9. How to generate and search through hypotheses (in a tractable manner)? How to evaluate score? How to avoid early decisions? How to represent constraints in a more general way?
      – Figure labels: Score, Lines, Faces, Classifiers.
      – $f(x, y, w) = w^\top \psi(x, y)$, where $\psi(x, y)$ is the feature vector measuring agreement between lines, faces and labels, and $w$ is the learned weight vector.
      – D. Lee, T. Kanade, M. Hebert. Geometric Reasoning for Single Image Structure Recovery. CVPR 2009.
      – V. Hedau, D. Hoiem, D. Forsyth. Recovering the Spatial Layout of Cluttered Rooms. IEEE International Conference on Computer Vision (ICCV), 2009.
      – H. Wang, S. Gould, D. Koller. Discriminative learning with latent variables for cluttered indoor scene understanding. ECCV 2010.
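The linear score on slide 9 can be illustrated with a minimal sketch: each candidate layout y is summarized by a feature vector, the score is its dot product with a weight vector, and the best hypothesis is the arg-max. The feature values and weights below are made up for illustration; in the cited papers the weights are learned and the features come from line, face and label agreement.

```python
def score(w, psi):
    """f(x, y, w) = w^T psi(x, y) for a single hypothesis."""
    return sum(wi * pi for wi, pi in zip(w, psi))


def best_hypothesis(w, candidates):
    """Return (index, score) of the highest-scoring candidate.

    candidates[k] is the feature vector psi(x, y_k) of hypothesis y_k.
    """
    scored = [(score(w, psi), k) for k, psi in enumerate(candidates)]
    best_score, best_k = max(scored)
    return best_k, best_score


# Hypothetical weights and per-hypothesis features.
w = [1.0, 0.5, 2.0]
candidates = [
    [0.2, 0.9, 0.1],
    [0.8, 0.4, 0.6],
    [0.5, 0.5, 0.2],
]
best_k, best_score = best_hypothesis(w, candidates)
```

Keeping the full list of scores, rather than only the arg-max, is one simple way to postpone early decisions: later stages can re-rank the top few hypotheses.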
  • 10. Spectrum (qualitative to explicit; more quantitative, more precise): Region labels → + Boundaries and objects → Stronger geometric constraints from domain knowledge + more constraints → 3D point clouds
  • 11. Finite volume; Spatial exclusion; Containment; Stability; Contact; Proximity; …
  • 12. Search through hypothesis space
      – Pipeline (figure): Input image; Line segments and vanishing points; Room hypotheses; Geometric context; Orientation map; Object hypotheses; Reject invalid configurations.
      – $f(x, y) = w^\top \psi(x, y) + w_\phi^\top \phi(y)$: the first term measures compatibility of image data with the geometric configuration, the second is a penalty term for incompatible configurations. $x$: features from the image (surface labels, vanishing points, etc.); $y$: hypothesis (scene layout + object hypothesis).
      – D. Lee, A. Gupta, M. Hebert, and T. Kanade. Estimating Spatial Layout of Rooms using Volumetric Reasoning about Objects and Surfaces. Advances in Neural Information Processing Systems (NIPS), Vol. 24, 2010.
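The "reject invalid configurations" step on slide 12 can be sketched with two of the volumetric constraints listed on slide 11: containment (every object lies inside the room) and spatial exclusion (no two objects share volume). The sketch assumes axis-aligned boxes given as (xmin, ymin, zmin, xmax, ymax, zmax); that representation and all names are illustrative simplifications, not the formulation of the cited paper.

```python
def contained(box, room):
    """True if box lies entirely inside room (touching the walls is allowed)."""
    return (all(room[i] <= box[i] for i in range(3))
            and all(box[i + 3] <= room[i + 3] for i in range(3)))


def overlaps(a, b):
    """True if boxes a and b share positive volume (mere contact is not overlap)."""
    return all(a[i] < b[i + 3] and b[i] < a[i + 3] for i in range(3))


def is_valid(room, objects):
    """Check containment and spatial exclusion for one scene hypothesis."""
    if not all(contained(obj, room) for obj in objects):
        return False
    for i in range(len(objects)):
        for j in range(i + 1, len(objects)):
            if overlaps(objects[i], objects[j]):
                return False
    return True


# Hypothetical room and furniture boxes, in metres.
room = (0, 0, 0, 4, 3, 5)
bed = (0, 0, 0, 2, 1, 2)
table = (2, 0, 0, 3, 1, 1)      # touches the bed along x = 2
intruder = (1, 0, 1, 3, 1, 3)   # cuts through the bed
```

A soft variant would return the overlapping volume instead of a boolean, so that it can feed a penalty term rather than a hard rejection.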
  • 13. Figure labels: Surface Layout; Density Map; Bag of Segments; Catalogue (Frontal, Front-Right, Front-Left, Left-Right, Left-Occluded, Right-Occluded, Porous, Solid)
  • 14. Figure: Original Image and 3D Parse Graph (node and edge labels: sky, Ground, above, Infront, supported, Point-supported, Medium, High). A. Gupta, A. Efros, and M. Hebert. Blocks World Revisited: Image Understanding Using Qualitative Geometry and Mechanics. ECCV 2010. http://www.cs.cmu.edu/~abhinavg/blocksworld
  • 15. Direct search through hypothesis space; Sampling
      – L. Del Pero, J. Guan, E. Brau, J. Schlecht, K. Barnard. Sampling Bedrooms. CVPR 2011.
      – Diffusion moves: Sample room boundary (change r = (x,y,z,w,h,l, )); Sample camera (change c = f); Sample object parameters (change o = (x,y,z,w,h,l)); Sample over a block edge only.
      – Jump moves:
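A diffusion move of the kind listed on slide 15 perturbs continuous parameters and accepts or rejects the change. The sketch below is a plain random-walk Metropolis step on a room's (w, h, l), with a toy Gaussian log-posterior standing in for the image likelihood; the target values, step size and function names are assumptions for illustration, and it is not the sampler of the cited paper. Jump moves, which change the number of objects, need extra reversible-jump bookkeeping and are not shown.

```python
import math
import random

TARGET = (4.0, 3.0, 5.0)  # hypothetical "true" room width, height, length
SIGMA = 0.5               # spread of the toy posterior


def log_posterior(room):
    """Toy stand-in for the log posterior of the room parameters."""
    return -sum((v - t) ** 2 for v, t in zip(room, TARGET)) / (2 * SIGMA ** 2)


def diffusion_move(room, rng, step=0.3):
    """One Metropolis step with a symmetric Gaussian proposal."""
    proposal = tuple(v + rng.gauss(0.0, step) for v in room)
    log_alpha = log_posterior(proposal) - log_posterior(room)
    if log_alpha >= 0 or rng.random() < math.exp(log_alpha):
        return proposal   # accept
    return room           # reject, keep the current state


def run_chain(n_steps, seed=0, start=(1.0, 1.0, 1.0)):
    """Run the chain and return every visited state."""
    rng = random.Random(seed)
    room, samples = start, []
    for _ in range(n_steps):
        room = diffusion_move(room, rng)
        samples.append(room)
    return samples
```

After a burn-in period the visited states concentrate around the high-posterior room, and the spread of the samples gives a distribution over layouts rather than a single early decision.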
  • 16. Direct search through hypothesis space; Sampling; (Constrained) object detection
      – $P(O_1, \ldots, O_N, L, H \mid I) = P(H)\, P(L \mid H, I) \prod_i P(O_i \mid L, H, I)$
      – V. Hedau, D. Hoiem, D. Forsyth. Thinking Inside the Box: Using Appearance Models and Context Based on Room Geometry. European Conference on Computer Vision (ECCV), 2010.
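The factorization on slide 16 multiplies one term for H, one for L given H, and one term per object. A small sketch of evaluating it in log space (to avoid underflow when there are many objects) is given below; the probabilities are made-up numbers and the function name is illustrative.

```python
import math


def log_joint(p_h, p_l_given_h, p_objects):
    """log P(O_1..O_N, L, H | I) = log P(H) + log P(L|H,I) + sum_i log P(O_i|L,H,I)."""
    return (math.log(p_h)
            + math.log(p_l_given_h)
            + sum(math.log(p) for p in p_objects))


# Two hypothetical scene hypotheses, each with two object detections.
hyp_a = log_joint(p_h=0.5, p_l_given_h=0.4, p_objects=[0.9, 0.5])
hyp_b = log_joint(p_h=0.5, p_l_given_h=0.7, p_objects=[0.3, 0.2])
best = "a" if hyp_a > hyp_b else "b"
```

Because each object term is conditioned on L and H, a detection that is strong in isolation can still lower the joint score when it is inconsistent with the room geometry.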
  • 17. Direct search through hypothesis space; Sampling; Object detection; Grammars
      – Figure labels: Truth, Max; Window, Wall, Balcony, Door, Roof.
      – L. Simon, O. Teboul, P. Koutsourakis and N. Paragios. Random Exploration of the Procedural Space for Single-View 3D Modeling of Buildings. International Journal of Computer Vision (IJCV), 2010.
      – O. Teboul, I. Kokkinos, P. Koutsourakis, L. Simon and N. Paragios. Shape Grammar Parsing via Reinforcement Learning. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2011.
  • 18. Direct search through hypothesis space; Sampling; Object detection; Grammars
      – What constraints?
      – How to combine low-level classifiers with "higher-level" reasoning?
      – What are the right tools to search through and score hypotheses?
      – How to represent partial interpretations without early commitment?
  • 19. G. Tsai et al. Real-time Indoor Scene Understanding using Bayesian Filtering with Motion Cues. ICCV 2011.
      – Spectrum (qualitative to explicit; more quantitative, more precise): Region labels → + Boundaries and objects → Stronger geometric constraints from domain knowledge + more constraints → 3D point clouds
      – Additional inputs: + sparse/partial 3D data; + other constraints; + (large) prior data
  • 20. Open questions:
      – What level of representation? Do we need an explicit parse of the input, or how far can we go with associations? (Hierarchical, region labels, associations, ...)
      – How to incorporate knowledge/context external to the input image? (Task, geometry, contextual info (scene type, location), text, ...)
      – Should we use global models or sequences of simpler models? Is the problem too hard as posed, i.e., intractable?
      – What are the right definitions of actions, activities, behaviors?
      – How to combine temporal (actions) and spatial (scenes, objects) information effectively?
      – What should be evaluated, and at what stage? (Bounding boxes, pixelwise labels, 3D models, actions/behaviors, predictions?)