Slides from my presentation at WSCG 2011. They describe some modifications to existing techniques for camera orientation estimation in "Manhattan Worlds", aimed at faster calculation times.
Speeding up probabilistic inference of camera orientation by function approximation and grid masking - WSCG2011 presentation
1. Speeding up probabilistic inference of camera orientation by function approximation and grid masking
Nicolau L. Werneck, doctoral candidate
Supervisor: Prof. Anna Helena Reali Costa
Intelligent Techniques Laboratory, LTI, PCS, Poli
Universidade de São Paulo (USP), Brazil
WSCG'2011, Plzen, Feb/2011
LTI–PCS–EPUSP, nic-wscg2011
Outline: 1–Introduction, 2–Methodology, 3–Results, References
© N. Werneck
2. Introduction
The problem: camera orientation estimation
Environment edges are assumed to be in the three directions of the reference frame. (Lego Land, Manhattan World)
We want to calculate the camera orientation in relation to this reference frame, in real-time.
Technique based on continuous optimization. No edge extraction or matching involved. (Maximum likelihood)
3. Introduction
Geometrical constraints
Knowing the camera orientation from a picture we can
predict the directions of image edges.
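As a minimal sketch of this prediction, assume a pinhole camera with a hypothetical focal length and principal point (the slides do not give a camera model). The vanishing points are then the images of the three reference-frame axes, and the predicted edge direction at a pixel is the direction from that pixel towards each vanishing point. The function names below are illustrative, not taken from the presentation.

```python
import numpy as np

def vanishing_points(R, K):
    """Homogeneous vanishing points of the three reference-frame axes.

    R: 3x3 rotation matrix mapping world directions to camera coordinates.
    K: 3x3 intrinsic matrix (assumed pinhole model).
    Column k of the result is the vanishing point of world axis k.
    """
    return K @ R

def predicted_edge_directions(u, vps):
    """Unit 2-D directions an edge through pixel u would have
    if it were aligned with each of the three axes."""
    x, y = u
    dirs = []
    for k in range(3):
        vx, vy, vw = vps[:, k]
        # Direction from u to the point (vx/vw, vy/vw), written without
        # the division so vanishing points at infinity (vw == 0) also work.
        d = np.array([vx - x * vw, vy - y * vw])
        dirs.append(d / np.linalg.norm(d))
    return np.array(dirs)

# Hypothetical camera: focal length 500 px, principal point (320, 240).
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)  # camera axes aligned with the reference frame
dirs = predicted_edge_directions((100.0, 50.0), vanishing_points(R, K))
```

With the camera aligned to the reference frame, two of the predicted directions are the image axes and the third points towards the principal point.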
4.
5. Introduction
Bayesian camera orientation estimation
The data analyzed is the gradient of the input image.
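A minimal sketch of computing this input, assuming a grayscale image stored as a 2-D NumPy array and central differences via `np.gradient` (the slides do not say which derivative filter was used):

```python
import numpy as np

def image_gradient(img):
    """Return per-pixel gradient components, norm and direction."""
    # np.gradient returns derivatives along rows (y) first, then columns (x).
    gy, gx = np.gradient(img.astype(float))
    norm = np.hypot(gx, gy)
    angle = np.arctan2(gy, gx)
    return gx, gy, norm, angle

# Synthetic image with a vertical step edge between columns 3 and 4.
img = np.zeros((8, 8))
img[:, 4:] = 1.0
gx, gy, norm, angle = image_gradient(img)
```

For this step edge the gradient is nonzero only in the two columns next to the step, and it points along the x axis.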
6. Introduction
Bayesian camera orientation estimation
The Bayesian camera orientation estimation works by defining an objective function L(Ψ) to be optimized. The solution is Ψ* = argmax L(Ψ).
The function L tells how well the arguments "explain" the evidence. (Likelihood function)
In this problem Ψ is a set of arguments that model the camera orientation.
L tells how much the edges in the images are aligned to the directions expected from the vanishing points produced by Ψ.
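The argmax formulation can be illustrated with a one-dimensional toy problem. This is not the likelihood from the slides: the score `cos(4 * (phi - psi))` is a hypothetical stand-in that peaks when an observed edge angle matches the candidate orientation modulo 90 degrees, and the brute-force search stands in for whatever optimizer is actually used.

```python
import numpy as np

def toy_objective(psi, phi):
    """Toy alignment score: largest when every observed angle phi_u
    equals psi modulo 90 degrees."""
    return np.sum(np.cos(4.0 * (phi - psi)))

def estimate_orientation(phi, n_candidates=900):
    """Brute-force argmax of the toy objective over psi in [0, pi/2)."""
    candidates = np.linspace(0.0, np.pi / 2, n_candidates, endpoint=False)
    scores = np.array([toy_objective(psi, phi) for psi in candidates])
    return candidates[np.argmax(scores)]

# Synthetic data: edges along two perpendicular directions, plus noise.
rng = np.random.default_rng(0)
psi_true = 0.3
phi = (psi_true
       + rng.integers(0, 2, size=200) * (np.pi / 2)
       + rng.normal(0.0, 0.02, size=200))
psi_hat = estimate_orientation(phi)
```

The estimate `psi_hat` lands close to `psi_true`; in the real problem Ψ has three rotational degrees of freedom, so an exhaustive search like this one becomes much more expensive.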
7. Existing techniques
This work is based on previous research by Coughlan and Yuille [2003], Deutscher et al. [2002], Schindler and Dellaert [2004], Denis et al. [2008].
They are all based on likelihood maximization. The differences lie in:
What parameters are estimated. (Other than orientation).
What optimization algorithm is employed.
Expression of the likelihood function. (Especially what PDF models are used).
Subsampling technique.
8. Original expression
In Coughlan and Yuille [2003] the image likelihood is a product of the likelihoods of gradients E_u at each pixel u.
Observation model: [equation not recovered from the slide; its two factors are labelled "Lik. pixel is edge" and "Lik. orientation match"]
The expression built is a maximum a posteriori estimator.
Using M_k for P(m_u = k), Φ_k for P(φ_u | m_u = k, Ψ, u) and taking the log we arrive at the objective function...
9. Proposed expression
$$L(\Psi) = \sum_u \log\Big( P_{\mathrm{off}}(E_u)\,\Phi_1 M_1 + P_{\mathrm{on}}(E_u)\,\Phi_5 M_5 + P_{\mathrm{on}}(E_u) \sum_{k=2}^{4} \Phi_k M_k \Big)$$
Using $\log(b + a) \approx \frac{a}{b} + \log(b)$, we arrive at
[equation not recovered from the slide; its two factors are labelled "Lik. pixel is edge" and "Lik. orientation match"]
There is a weighting coefficient based on the gradient norm multiplied by something that depends on the gradient directions and camera orientation.
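One way to see how the approximation produces such a product is sketched here. It assumes that Φ_1 and Φ_5 are both equal to a constant Φ_0 that does not depend on Ψ (an assumption made for this sketch; the slides do not state it), takes b as the first two terms of the sum inside the logarithm and a as the remaining sum over k = 2..4.

```latex
\begin{align*}
b_u &= P_{\mathrm{off}}(E_u)\,\Phi_0 M_1 + P_{\mathrm{on}}(E_u)\,\Phi_0 M_5,
\qquad
a_u = P_{\mathrm{on}}(E_u) \sum_{k=2}^{4} \Phi_k M_k, \\
\log(b_u + a_u) &\approx \log b_u + \frac{a_u}{b_u}
= \log b_u + \frac{1}{\Phi_0}
\left( \frac{P_{\mathrm{off}}(E_u)}{P_{\mathrm{on}}(E_u)}\, M_1 + M_5 \right)^{-1}
\sum_{k=2}^{4} \Phi_k M_k .
\end{align*}
```

Under that assumption log b_u does not depend on Ψ, so only the last term matters for the maximization: a factor that depends on the gradient norm through P_on and P_off, multiplied by a sum that depends on the gradient direction and on Ψ. The first factor has the form of the mask generating function W(E_u) shown on the next slide.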
10. Gradient norm masking
The mask generating function
$$W(E_u) = \left( \frac{P_{\mathrm{off}}(E_u)}{P_{\mathrm{on}}(E_u)}\, M_1 + M_5 \right)^{-1}$$
Also...
We replaced W with an approximation based on the logistic function.
We also used vector dot products instead of calculating arctan.
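To illustrate the two replacements, here is a minimal sketch. The logistic weight uses hypothetical parameters `g0` and `s` (not values from the slides), and the alignment measure shows that a normalized dot product gives the same cosine of the angle difference that would otherwise be obtained through `arctan2`.

```python
import numpy as np

def logistic_weight(grad_norm, g0=0.1, s=0.02):
    """Smooth 0-to-1 weight on the gradient norm.
    g0 (midpoint) and s (slope scale) are hypothetical parameters."""
    return 1.0 / (1.0 + np.exp(-(grad_norm - g0) / s))

def cos_error_arctan(g, d):
    """Cosine of the angle between vectors g and d, via arctan2."""
    return np.cos(np.arctan2(g[1], g[0]) - np.arctan2(d[1], d[0]))

def cos_error_dot(g, d):
    """Same quantity from a normalized dot product, with no arctan call."""
    return np.dot(g, d) / (np.linalg.norm(g) * np.linalg.norm(d))

g = np.array([0.3, -0.4])  # example gradient vector
d = np.array([1.0, 2.0])   # example predicted direction
```

Both `cos_error_arctan(g, d)` and `cos_error_dot(g, d)` return the same value, but the second needs only multiplications, one division and square roots, which is friendlier to vectorized code.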
11. Grid masking
We select one from every few lines and columns.
Image edges are sampled regularly.
Minimally long lines are necessarily sampled.
Better strategy for high resolution images, where edge pixels are "rare".
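A minimal sketch of such a mask, assuming a single spacing `step` for both rows and columns (a hypothetical parameter; the slides do not give the spacing used):

```python
import numpy as np

def grid_mask(height, width, step):
    """Boolean mask selecting every `step`-th row and every `step`-th column."""
    mask = np.zeros((height, width), dtype=bool)
    mask[::step, :] = True
    mask[:, ::step] = True
    return mask

mask = grid_mask(480, 640, 16)
fraction = mask.mean()  # share of pixels kept by the mask
```

Any straight segment that spans at least `step` pixels horizontally or vertically crosses a selected column or row, which is one reading of "minimally long lines are necessarily sampled".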
12. Results
Expression evaluation
Speed
Expressions were implemented in Cython, using SIMD instructions, and tested on c1.xlarge AWS computers.
A speedup of 50–64× was measured.
Original: 1100.0 ± 60 ms
Proposed: 18.9 ± 2.4 ms
(4 s per image with the proposal, without subsampling.)
Quality
From 102 tests, the original expression "fixed" the solution on 5 occasions, but ruined 6 good solutions.
Mean error went from 4.7° to 5.5°. (Large outliers)
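As a quick check of the speed figures, the ratio of the two mean times (about 58×) falls inside the reported range:

```python
original_ms = 1100.0  # mean evaluation time, original expression
proposed_ms = 18.9    # mean evaluation time, proposed expression
speedup = original_ms / proposed_ms  # ratio of the mean times
```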
13. Results
Grid masking evaluation
Speed increases as solution quality drops.
14. Conclusion
The proposed expression is simpler, faster, intuitive and justifies selecting pixels by gradient norm.
The grid masking technique proved to be a good alternative for subsampling images deterministically.
Future work
Develop a complete pixel selection method.
Find best parameters.
Try to use gradient-based optimization.
Thanks! THE END
http://nwerneck.sdf.org
15. References
James M. Coughlan and A. L. Yuille. Manhattan world: orientation and outlier detection by Bayesian inference. Neural Comput., 15(5):1063–1088, 2003. ISSN 0899-7667. doi:10.1162/089976603765202668.
Patrick Denis, James H. Elder, and Francisco J. Estrada. Efficient edge-based methods for estimating Manhattan frames in urban imagery. In David A. Forsyth, Philip H. S. Torr, and Andrew Zisserman, editors, ECCV (2), volume 5303 of Lecture Notes in Computer Science, pages 197–210. Springer, 2008. ISBN 978-3-540-88685-3.
Jonathan Deutscher, Michael Isard, and John MacCormick. Automatic camera calibration from a single Manhattan image. In Eur. Conf. on Computer Vision (ECCV), pages 175–205, 2002.
Grant Schindler and Frank Dellaert. Atlanta world: An expectation maximization framework for simultaneous low-level edge grouping and camera calibration in complex man-made environments. In CVPR (1), pages 203–209, 2004.