SlideShare a Scribd company logo
Semantic Structure From Motion
                                                                                       Sid Yingze Bao and Silvio Savarese
     VISION LAB                                                      Electrical and Computer Engineering, University of Michigan at Ann Arbor
                                                                                              Source code and data: http://www.eecs.umich.edu/vision/projects/ssfm/index.html

Introduction                                                                                                        Model Overview                                                                                                        Results
Goal:                                                                                                                                                               Assumption:                                                                                                                                         Car                     Person
                                                                                                                    {O,Q,C}=arg max P(q,u,o|C,O,Q)                                                                                        Comparison Baselines
                                                                                                                                                                                                                                                                                                                                                                     Office

Estimate 3D location and pose of objects, 3D location                                                                                                               Given camera hypothesis, objects
                                                                                                                     =arg max P(q,u|C,Q)P(o|C,O)                                                                                          - Camera Pose Est.: Bundler [1]
of points, and camera parameters from 2 or more images.                                                                                                             and points are independent
                                                                                                                                                                                                                                          - Object Detection: LSVM [2]
                                                                                                                     Point Likelihood P(q,u|C,Q)
                                                                                                                                                                                                                                        1. Car Dataset [3] (available online)
                                                                                                                                                                                                                                           - Images and Dense Lidar Points
                                                                                                                                                                                                                                           - ~500 testing images in 10 scenarios
Motivation:                                                                                                           Object Likelihood P(o|C,O)                                                                                                 Cam. T. est. error v.s. baseline         3D object localization
                                                                                                                                                                                                                                                                               0.3
- Most 3D reconstruction methods do                                                                                                                                                                                                      40
                                                                                                                                                                                                                                                 e (degree)
                                                                                                                                                                                                                                                  T                           0.25
                                                                                                                                                                                                                                                                                    Recall
                                                                                                                       - Estimate 3D object likelihood by 2D                                                                                          Bundler                  0.2                          4 Cam
  not povide semantic information.                                                                                                                                                                                                                                      SSFM 0.15                           3 Cam
                                                                                                                         projection appearance:                                                                                          20
                                                                                                                                                                                                                                                                               0.1                          2 Cam
- Most recognition methds do not         Conventional                                                                                                                                                                                                                         0.05
                                                                                                                                                                                                                                                                                                            1 Cam
                                                                                                                                                                                                                                                                                           False Positive per Image
                                                                                                                                                                                                                                         0
  provide geometry and camera pose. Structure From Motion                                                                                                                                                                                    1            2           3     4
                                                                                                                                                                                                                                                                                0
                                                                                                                                                                                                                                                                                   0    2       4      6      8    10

- We propose to solve these two problem jointly.                                     2D object detection
Advantages:
- Improve camera pose estimation, compared to feature-point-based SFM.
- Improve object detections given multiple images, compared to independently
  detecting objects from each single images.
- Establish object correspondences across views.



SSFM Problem Formulation                                                                                                                                                                                                                2. Kinect Office Dataset (available online)                                           Cam. T. est. error v.s. baseline
                                                                                                                                                                                                                                                                                                                         20 e (degree)
                                                                                                                                                                                                                                                                                                                                                                 3D object localization


                                                                                                            Q
                                                                                                                                                                                                                                           - Images and calibrated Kinect 3D range data                                      T                                   Recall
                                                                                                                                                                                                                                                                                                                                                                          2 Cam
                                                                                                                                                                                                                                                                                                                                                                          1 Cam
                                                                                                                                                                                                                                                                                                                                            Bundler
 Measurements                                                                                                                                                                                                                              - Mouse, Monitor, and Keyboard                                                10

 - q: point features (e.g. DOG+SIFT)                                                                                                                                                                                                       - 500 images in 10 scenarios                                                   0
                                                                                                                                                                                                                                                                                                                                                     SSFM                 False Positive per Image
                                                                                             O                                                                                                                                                                                                                            0.4         0.6      0.8          1
 - u: point matches (e.g. threshold test)
 - o: 2D objects (e.g. [2])
                                                              o
                                                                                                                    Joint Likelihood Maximization
                                                                                                                     Main challenge:                                  image 1 object det.                   image 2 object det.
Model Parameters (unknowns)                                                                                          High dimensionality of unknowns =>
                                                                              q
                                                                                                                                                                                          azimuth=20


- C: camera (K is known)
                                                                                                                                                                                           zenith =10



                                               C
                                                                                                    o           q    Sample P(q,u,o|C,O,Q) with MCMC                                                                       azimuth=90
                                                                                                                                                                                                                            zenith =5

- Q: 3D points (locations)                                                                                                                                            azimuth=-20
                                                                                                                                                                       zenith =9
                                                                                                                                                                                                             azimuth=70


                                                                                                                    Parameter Initialization
                                                                                                                                                                                                              zenith =10


- O: 3D objects (locations, poses, categories)                                                                                                                                                                                           3. Person Dataset
                                                                                                        C           - Use object detection scale and pose to                                one of camera                                   - A pair of stereo cameras
 Intuition:                                                                                                           initialize cameras relative poses                                   pose initializations
                                                                                                                                                                                                                                            - 400 image pairs in 10 scenarios
 In addition to point features, measurements of objects across views provide add-                                   - Theorem: camera parameters can be
 itional geometrical constraints that allow to relate cameras and scene parameters.                                   estimated given:
                                                                                                                      i) 3 objects with scale; ii) 2 objects with
                                                                                                                      pose; iii) 1 object with scale and pose.
Reference                                                                                                           Monte Carlo Markov Chain
[1] N. Snavely, S. M. Seitz, and R. S. Szeliski. Modeling the world from internet photo collections. IJCV. 2008.    - Sampling starts from different initializations
[2] P. Felzenszwalb, R. Girshick, D. McAllester, and D. Ramanan. Object detection with discriminatively trained
                                                                                                                    - Proposal distribution P(q,u,o|C,O,Q)                                                                              Ackowledgement
     part based models. IEEE Transactions on Pattern Analysis and Machine Intelligence of Pattern Analysis, 2009.                                                                   one of camera poses and objects initializations
[3] Gaurav Pandey, James McBride,,and Ryan Eustice,.Ford campus vision and lidar data set. International Journal    - Combine all samples to identify the maximum                                                                       We acknowledge the support of NSF CAREER #1054127 and the Gigascale Systems Research Center.
    of Robotics Research. 2011                                                                                                                                                                                                          We thank Mohit Bagra for collecting the Kinect dataset and Min Sun for helpful feedback.

More Related Content

Viewers also liked

Pre production of music video
Pre production of music videoPre production of music video
Pre production of music video
y9_
 
Curso de idiomas globo inglês livro 016
Curso de idiomas globo inglês livro 016Curso de idiomas globo inglês livro 016
Curso de idiomas globo inglês livro 016
rosemere12
 
M A R R I A G E & M U T U A L B L O S S O M I N G D R
M A R R I A G E &  M U T U A L  B L O S S O M I N G   D RM A R R I A G E &  M U T U A L  B L O S S O M I N G   D R
M A R R I A G E & M U T U A L B L O S S O M I N G D R
sangh1212
 
Beware Of Meditation Dr Shriniwas Kashalikar
Beware Of Meditation Dr Shriniwas KashalikarBeware Of Meditation Dr Shriniwas Kashalikar
Beware Of Meditation Dr Shriniwas Kashalikar
sangh1212
 
Opiniao 2b cultura popular (2), 2007
Opiniao 2b cultura popular (2), 2007Opiniao 2b cultura popular (2), 2007
Opiniao 2b cultura popular (2), 2007
Elisio Estanque
 
Donnie darko 9 shot
Donnie darko 9 shotDonnie darko 9 shot
Donnie darko 9 shot
Jack Murphy
 

Viewers also liked (19)

NANOBOY
NANOBOYNANOBOY
NANOBOY
 
Juventude amordaçada
Juventude amordaçadaJuventude amordaçada
Juventude amordaçada
 
MOOCafé ABP
MOOCafé ABP MOOCafé ABP
MOOCafé ABP
 
Pre production of music video
Pre production of music videoPre production of music video
Pre production of music video
 
Curso de idiomas globo inglês livro 016
Curso de idiomas globo inglês livro 016Curso de idiomas globo inglês livro 016
Curso de idiomas globo inglês livro 016
 
M A R R I A G E & M U T U A L B L O S S O M I N G D R
M A R R I A G E &  M U T U A L  B L O S S O M I N G   D RM A R R I A G E &  M U T U A L  B L O S S O M I N G   D R
M A R R I A G E & M U T U A L B L O S S O M I N G D R
 
The grand waltz 1º movement
The grand waltz 1º movementThe grand waltz 1º movement
The grand waltz 1º movement
 
Unidad2 actividad2
Unidad2 actividad2Unidad2 actividad2
Unidad2 actividad2
 
Anvisa aprova anticoagulante de uso oral para tratar av cs
Anvisa aprova anticoagulante de uso oral para tratar av csAnvisa aprova anticoagulante de uso oral para tratar av cs
Anvisa aprova anticoagulante de uso oral para tratar av cs
 
Main Street Square Bierborse 2011
Main Street Square Bierborse 2011Main Street Square Bierborse 2011
Main Street Square Bierborse 2011
 
Beware Of Meditation Dr Shriniwas Kashalikar
Beware Of Meditation Dr Shriniwas KashalikarBeware Of Meditation Dr Shriniwas Kashalikar
Beware Of Meditation Dr Shriniwas Kashalikar
 
Opiniao 2b cultura popular (2), 2007
Opiniao 2b cultura popular (2), 2007Opiniao 2b cultura popular (2), 2007
Opiniao 2b cultura popular (2), 2007
 
2 156 l'anello mancante1 copia
2 156 l'anello mancante1 copia2 156 l'anello mancante1 copia
2 156 l'anello mancante1 copia
 
El cuento 2
El cuento 2El cuento 2
El cuento 2
 
Comunidad del aprendizaje
Comunidad del aprendizajeComunidad del aprendizaje
Comunidad del aprendizaje
 
iPad productivity usage 101 (basics)
iPad productivity usage 101 (basics)iPad productivity usage 101 (basics)
iPad productivity usage 101 (basics)
 
Donnie darko 9 shot
Donnie darko 9 shotDonnie darko 9 shot
Donnie darko 9 shot
 
Debit card loans
Debit card loansDebit card loans
Debit card loans
 
Antonio machado
Antonio machadoAntonio machado
Antonio machado
 

Similar to Fcv poster bao

Real-time 3D Object Pose Estimation and Tracking for Natural Landmark Based V...
Real-time 3D Object Pose Estimation and Tracking for Natural Landmark Based V...Real-time 3D Object Pose Estimation and Tracking for Natural Landmark Based V...
Real-time 3D Object Pose Estimation and Tracking for Natural Landmark Based V...
c.choi
 
Sketch-Finder: uma abordagem para recuperação efetiva e eficiente de imagens ...
Sketch-Finder: uma abordagem para recuperação efetiva e eficiente de imagens ...Sketch-Finder: uma abordagem para recuperação efetiva e eficiente de imagens ...
Sketch-Finder: uma abordagem para recuperação efetiva e eficiente de imagens ...
Big Data Week São Paulo
 
ETHZ CV2012: Tutorial openCV
ETHZ CV2012: Tutorial openCVETHZ CV2012: Tutorial openCV
ETHZ CV2012: Tutorial openCV
zukun
 

Similar to Fcv poster bao (20)

Real-time 3D Object Pose Estimation and Tracking for Natural Landmark Based V...
Real-time 3D Object Pose Estimation and Tracking for Natural Landmark Based V...Real-time 3D Object Pose Estimation and Tracking for Natural Landmark Based V...
Real-time 3D Object Pose Estimation and Tracking for Natural Landmark Based V...
 
Vehicle tracking and distance estimation based on multiple image features
Vehicle tracking and distance estimation based on multiple image featuresVehicle tracking and distance estimation based on multiple image features
Vehicle tracking and distance estimation based on multiple image features
 
Sketch-Finder: uma abordagem para recuperação efetiva e eficiente de imagens ...
Sketch-Finder: uma abordagem para recuperação efetiva e eficiente de imagens ...Sketch-Finder: uma abordagem para recuperação efetiva e eficiente de imagens ...
Sketch-Finder: uma abordagem para recuperação efetiva e eficiente de imagens ...
 
ETHZ CV2012: Tutorial openCV
ETHZ CV2012: Tutorial openCVETHZ CV2012: Tutorial openCV
ETHZ CV2012: Tutorial openCV
 
An integrated framework for 3 d modeling, object detection, and pose estimati...
An integrated framework for 3 d modeling, object detection, and pose estimati...An integrated framework for 3 d modeling, object detection, and pose estimati...
An integrated framework for 3 d modeling, object detection, and pose estimati...
 
final_presentation
final_presentationfinal_presentation
final_presentation
 
p927-chang
p927-changp927-chang
p927-chang
 
sduGroupEvent
sduGroupEventsduGroupEvent
sduGroupEvent
 
A Novel Approach to Image Denoising and Image in Painting
A Novel Approach to Image Denoising and Image in PaintingA Novel Approach to Image Denoising and Image in Painting
A Novel Approach to Image Denoising and Image in Painting
 
3D Reconstruction from Multiple uncalibrated 2D Images of an Object
3D Reconstruction from Multiple uncalibrated 2D Images of an Object3D Reconstruction from Multiple uncalibrated 2D Images of an Object
3D Reconstruction from Multiple uncalibrated 2D Images of an Object
 
PCA and SVD in brief
PCA and SVD in briefPCA and SVD in brief
PCA and SVD in brief
 
[3D勉強会@関東] Deep Reinforcement Learning of Volume-guided Progressive View Inpa...
[3D勉強会@関東] Deep Reinforcement Learning of Volume-guided Progressive View Inpa...[3D勉強会@関東] Deep Reinforcement Learning of Volume-guided Progressive View Inpa...
[3D勉強会@関東] Deep Reinforcement Learning of Volume-guided Progressive View Inpa...
 
Real-Time Face Tracking with GPU Acceleration
Real-Time Face Tracking with GPU AccelerationReal-Time Face Tracking with GPU Acceleration
Real-Time Face Tracking with GPU Acceleration
 
CUDA Accelerated Face Recognition
CUDA Accelerated Face RecognitionCUDA Accelerated Face Recognition
CUDA Accelerated Face Recognition
 
Pose estimation from RGB images by deep learning
Pose estimation from RGB images by deep learningPose estimation from RGB images by deep learning
Pose estimation from RGB images by deep learning
 
OBJECT DETECTION FOR SERVICE ROBOT USING RANGE AND COLOR FEATURES OF AN IMAGE
OBJECT DETECTION FOR SERVICE ROBOT USING RANGE AND COLOR FEATURES OF AN IMAGEOBJECT DETECTION FOR SERVICE ROBOT USING RANGE AND COLOR FEATURES OF AN IMAGE
OBJECT DETECTION FOR SERVICE ROBOT USING RANGE AND COLOR FEATURES OF AN IMAGE
 
Object Detection for Service Robot Using Range and Color Features of an Image
Object Detection for Service Robot Using Range and Color Features of an ImageObject Detection for Service Robot Using Range and Color Features of an Image
Object Detection for Service Robot Using Range and Color Features of an Image
 
ei2106-submit-opt-415
ei2106-submit-opt-415ei2106-submit-opt-415
ei2106-submit-opt-415
 
FCN-Based 6D Robotic Grasping for Arbitrary Placed Objects
FCN-Based 6D Robotic Grasping for Arbitrary Placed ObjectsFCN-Based 6D Robotic Grasping for Arbitrary Placed Objects
FCN-Based 6D Robotic Grasping for Arbitrary Placed Objects
 
Object class recognition by unsupervide scale invariant learning - kunal
Object class recognition by unsupervide scale invariant learning - kunalObject class recognition by unsupervide scale invariant learning - kunal
Object class recognition by unsupervide scale invariant learning - kunal
 

More from zukun

My lyn tutorial 2009
My lyn tutorial 2009My lyn tutorial 2009
My lyn tutorial 2009
zukun
 
ETHZ CV2012: Information
ETHZ CV2012: InformationETHZ CV2012: Information
ETHZ CV2012: Information
zukun
 
Siwei lyu: natural image statistics
Siwei lyu: natural image statisticsSiwei lyu: natural image statistics
Siwei lyu: natural image statistics
zukun
 
Lecture9 camera calibration
Lecture9 camera calibrationLecture9 camera calibration
Lecture9 camera calibration
zukun
 
Brunelli 2008: template matching techniques in computer vision
Brunelli 2008: template matching techniques in computer visionBrunelli 2008: template matching techniques in computer vision
Brunelli 2008: template matching techniques in computer vision
zukun
 
Modern features-part-4-evaluation
Modern features-part-4-evaluationModern features-part-4-evaluation
Modern features-part-4-evaluation
zukun
 
Modern features-part-3-software
Modern features-part-3-softwareModern features-part-3-software
Modern features-part-3-software
zukun
 
Modern features-part-2-descriptors
Modern features-part-2-descriptorsModern features-part-2-descriptors
Modern features-part-2-descriptors
zukun
 
Modern features-part-1-detectors
Modern features-part-1-detectorsModern features-part-1-detectors
Modern features-part-1-detectors
zukun
 
Modern features-part-0-intro
Modern features-part-0-introModern features-part-0-intro
Modern features-part-0-intro
zukun
 
Lecture 02 internet video search
Lecture 02 internet video searchLecture 02 internet video search
Lecture 02 internet video search
zukun
 
Lecture 01 internet video search
Lecture 01 internet video searchLecture 01 internet video search
Lecture 01 internet video search
zukun
 
Lecture 03 internet video search
Lecture 03 internet video searchLecture 03 internet video search
Lecture 03 internet video search
zukun
 
Icml2012 tutorial representation_learning
Icml2012 tutorial representation_learningIcml2012 tutorial representation_learning
Icml2012 tutorial representation_learning
zukun
 
Advances in discrete energy minimisation for computer vision
Advances in discrete energy minimisation for computer visionAdvances in discrete energy minimisation for computer vision
Advances in discrete energy minimisation for computer vision
zukun
 
Gephi tutorial: quick start
Gephi tutorial: quick startGephi tutorial: quick start
Gephi tutorial: quick start
zukun
 
EM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysisEM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysis
zukun
 
Object recognition with pictorial structures
Object recognition with pictorial structuresObject recognition with pictorial structures
Object recognition with pictorial structures
zukun
 
Iccv2011 learning spatiotemporal graphs of human activities
Iccv2011 learning spatiotemporal graphs of human activities Iccv2011 learning spatiotemporal graphs of human activities
Iccv2011 learning spatiotemporal graphs of human activities
zukun
 
Icml2012 learning hierarchies of invariant features
Icml2012 learning hierarchies of invariant featuresIcml2012 learning hierarchies of invariant features
Icml2012 learning hierarchies of invariant features
zukun
 

More from zukun (20)

My lyn tutorial 2009
My lyn tutorial 2009My lyn tutorial 2009
My lyn tutorial 2009
 
ETHZ CV2012: Information
ETHZ CV2012: InformationETHZ CV2012: Information
ETHZ CV2012: Information
 
Siwei lyu: natural image statistics
Siwei lyu: natural image statisticsSiwei lyu: natural image statistics
Siwei lyu: natural image statistics
 
Lecture9 camera calibration
Lecture9 camera calibrationLecture9 camera calibration
Lecture9 camera calibration
 
Brunelli 2008: template matching techniques in computer vision
Brunelli 2008: template matching techniques in computer visionBrunelli 2008: template matching techniques in computer vision
Brunelli 2008: template matching techniques in computer vision
 
Modern features-part-4-evaluation
Modern features-part-4-evaluationModern features-part-4-evaluation
Modern features-part-4-evaluation
 
Modern features-part-3-software
Modern features-part-3-softwareModern features-part-3-software
Modern features-part-3-software
 
Modern features-part-2-descriptors
Modern features-part-2-descriptorsModern features-part-2-descriptors
Modern features-part-2-descriptors
 
Modern features-part-1-detectors
Modern features-part-1-detectorsModern features-part-1-detectors
Modern features-part-1-detectors
 
Modern features-part-0-intro
Modern features-part-0-introModern features-part-0-intro
Modern features-part-0-intro
 
Lecture 02 internet video search
Lecture 02 internet video searchLecture 02 internet video search
Lecture 02 internet video search
 
Lecture 01 internet video search
Lecture 01 internet video searchLecture 01 internet video search
Lecture 01 internet video search
 
Lecture 03 internet video search
Lecture 03 internet video searchLecture 03 internet video search
Lecture 03 internet video search
 
Icml2012 tutorial representation_learning
Icml2012 tutorial representation_learningIcml2012 tutorial representation_learning
Icml2012 tutorial representation_learning
 
Advances in discrete energy minimisation for computer vision
Advances in discrete energy minimisation for computer visionAdvances in discrete energy minimisation for computer vision
Advances in discrete energy minimisation for computer vision
 
Gephi tutorial: quick start
Gephi tutorial: quick startGephi tutorial: quick start
Gephi tutorial: quick start
 
EM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysisEM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysis
 
Object recognition with pictorial structures
Object recognition with pictorial structuresObject recognition with pictorial structures
Object recognition with pictorial structures
 
Iccv2011 learning spatiotemporal graphs of human activities
Iccv2011 learning spatiotemporal graphs of human activities Iccv2011 learning spatiotemporal graphs of human activities
Iccv2011 learning spatiotemporal graphs of human activities
 
Icml2012 learning hierarchies of invariant features
Icml2012 learning hierarchies of invariant featuresIcml2012 learning hierarchies of invariant features
Icml2012 learning hierarchies of invariant features
 

Recently uploaded

Accounting and finance exit exam 2016 E.C.pdf
Accounting and finance exit exam 2016 E.C.pdfAccounting and finance exit exam 2016 E.C.pdf
Accounting and finance exit exam 2016 E.C.pdf
YibeltalNibretu
 
The Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdfThe Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdf
kaushalkr1407
 

Recently uploaded (20)

Benefits and Challenges of Using Open Educational Resources
Benefits and Challenges of Using Open Educational ResourcesBenefits and Challenges of Using Open Educational Resources
Benefits and Challenges of Using Open Educational Resources
 
Students, digital devices and success - Andreas Schleicher - 27 May 2024..pptx
Students, digital devices and success - Andreas Schleicher - 27 May 2024..pptxStudents, digital devices and success - Andreas Schleicher - 27 May 2024..pptx
Students, digital devices and success - Andreas Schleicher - 27 May 2024..pptx
 
UNIT – IV_PCI Complaints: Complaints and evaluation of complaints, Handling o...
UNIT – IV_PCI Complaints: Complaints and evaluation of complaints, Handling o...UNIT – IV_PCI Complaints: Complaints and evaluation of complaints, Handling o...
UNIT – IV_PCI Complaints: Complaints and evaluation of complaints, Handling o...
 
How to Break the cycle of negative Thoughts
How to Break the cycle of negative ThoughtsHow to Break the cycle of negative Thoughts
How to Break the cycle of negative Thoughts
 
Accounting and finance exit exam 2016 E.C.pdf
Accounting and finance exit exam 2016 E.C.pdfAccounting and finance exit exam 2016 E.C.pdf
Accounting and finance exit exam 2016 E.C.pdf
 
PART A. Introduction to Costumer Service
PART A. Introduction to Costumer ServicePART A. Introduction to Costumer Service
PART A. Introduction to Costumer Service
 
The geography of Taylor Swift - some ideas
The geography of Taylor Swift - some ideasThe geography of Taylor Swift - some ideas
The geography of Taylor Swift - some ideas
 
Supporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptxSupporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptx
 
Home assignment II on Spectroscopy 2024 Answers.pdf
Home assignment II on Spectroscopy 2024 Answers.pdfHome assignment II on Spectroscopy 2024 Answers.pdf
Home assignment II on Spectroscopy 2024 Answers.pdf
 
Danh sách HSG Bộ môn cấp trường - Cấp THPT.pdf
Danh sách HSG Bộ môn cấp trường - Cấp THPT.pdfDanh sách HSG Bộ môn cấp trường - Cấp THPT.pdf
Danh sách HSG Bộ môn cấp trường - Cấp THPT.pdf
 
Basic phrases for greeting and assisting costumers
Basic phrases for greeting and assisting costumersBasic phrases for greeting and assisting costumers
Basic phrases for greeting and assisting costumers
 
Fish and Chips - have they had their chips
Fish and Chips - have they had their chipsFish and Chips - have they had their chips
Fish and Chips - have they had their chips
 
Sha'Carri Richardson Presentation 202345
Sha'Carri Richardson Presentation 202345Sha'Carri Richardson Presentation 202345
Sha'Carri Richardson Presentation 202345
 
NLC-2024-Orientation-for-RO-SDO (1).pptx
NLC-2024-Orientation-for-RO-SDO (1).pptxNLC-2024-Orientation-for-RO-SDO (1).pptx
NLC-2024-Orientation-for-RO-SDO (1).pptx
 
The Art Pastor's Guide to Sabbath | Steve Thomason
The Art Pastor's Guide to Sabbath | Steve ThomasonThe Art Pastor's Guide to Sabbath | Steve Thomason
The Art Pastor's Guide to Sabbath | Steve Thomason
 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
 
The Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdfThe Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdf
 
Phrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXX
Phrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXXPhrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXX
Phrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXX
 
The Challenger.pdf DNHS Official Publication
The Challenger.pdf DNHS Official PublicationThe Challenger.pdf DNHS Official Publication
The Challenger.pdf DNHS Official Publication
 
MARUTI SUZUKI- A Successful Joint Venture in India.pptx
MARUTI SUZUKI- A Successful Joint Venture in India.pptxMARUTI SUZUKI- A Successful Joint Venture in India.pptx
MARUTI SUZUKI- A Successful Joint Venture in India.pptx
 

Fcv poster bao

  • 1. Semantic Structure From Motion Sid Yingze Bao and Silvio Savarese VISION LAB Electrical and Computer Engineering, University of Michigan at Ann Arbor Source code and data: http://www.eecs.umich.edu/vision/projects/ssfm/index.html Introduction Model Overview Results Goal: Assumption: Car Person {O,Q,C}=arg max P(q,u,o|C,O,Q) Comparison Baselines Office Estimate 3D location and pose of objects, 3D location Given camera hypothesis, objects =arg max P(q,u|C,Q)P(o|C,O) - Camera Pose Est.: Bundler [1] of points, and camera parameters from 2 or more images. and points are independent - Object Detection: LSVM [2] Point Likelihood P(q,u|C,Q) 1. Car Dataset [3] (available online) - Images and Dense Lidar Points - ~500 testing images in 10 scenarios Motivation: Object Likelihood P(o|C,O) Cam. T. est. error v.s. baseline 3D object localization 0.3 - Most 3D reconstruction methods do 40 e (degree) T 0.25 Recall - Estimate 3D object likelihood by 2D Bundler 0.2 4 Cam not povide semantic information. SSFM 0.15 3 Cam projection appearance: 20 0.1 2 Cam - Most recognition methds do not Conventional 0.05 1 Cam False Positive per Image 0 provide geometry and camera pose. Structure From Motion 1 2 3 4 0 0 2 4 6 8 10 - We propose to solve these two problem jointly. 2D object detection Advantages: - Improve camera pose estimation, compared to feature-point-based SFM. - Improve object detections given multiple images, compared to independently detecting objects from each single images. - Establish object correspondences across views. SSFM Problem Formulation 2. Kinect Office Dataset (available online) Cam. T. est. error v.s. baseline 20 e (degree) 3D object localization Q - Images and calibrated Kinect 3D range data T Recall 2 Cam 1 Cam Bundler Measurements - Mouse, Monitor, and Keyboard 10 - q: point features (e.g. DOG+SIFT) - 500 images in 10 scenarios 0 SSFM False Positive per Image O 0.4 0.6 0.8 1 - u: point matches (e.g. threshold test) - o: 2D objects (e.g. [2]) o Joint Likelihood Maximization Main challenge: image 1 object det. image 2 object det. Model Parameters (unknowns) High dimensionality of unknowns => q azimuth=20 - C: camera (K is known) zenith =10 C o q Sample P(q,u,o|C,O,Q) with MCMC azimuth=90 zenith =5 - Q: 3D points (locations) azimuth=-20 zenith =9 azimuth=70 Parameter Initialization zenith =10 - O: 3D objects (locations, poses, categories) 3. Person Dataset C - Use object detection scale and pose to one of camera - A pair of stereo cameras Intuition: initialize cameras relative poses pose initializations - 400 image pairs in 10 scenarios In addition to point features, measurements of objects across views provide add- - Theorem: camera parameters can be itional geometrical constraints that allow to relate cameras and scene parameters. estimated given: i) 3 objects with scale; ii) 2 objects with pose; iii) 1 object with scale and pose. Reference Monte Carlo Markov Chain [1] N. Snavely, S. M. Seitz, and R. S. Szeliski. Modeling the world from internet photo collections. IJCV. 2008. - Sampling starts from different initializations [2] P. Felzenszwalb, R. Girshick, D. McAllester, and D. Ramanan. Object detection with discriminatively trained - Proposal distribution P(q,u,o|C,O,Q) Ackowledgement part based models. IEEE Transactions on Pattern Analysis and Machine Intelligence of Pattern Analysis, 2009. one of camera poses and objects initializations [3] Gaurav Pandey, James McBride,,and Ryan Eustice,.Ford campus vision and lidar data set. International Journal - Combine all samples to identify the maximum We acknowledge the support of NSF CAREER #1054127 and the Gigascale Systems Research Center. of Robotics Research. 2011 We thank Mohit Bagra for collecting the Kinect dataset and Min Sun for helpful feedback.