Lecture 7

Computer Vision: Camera Models

IIT Kharagpur

Computer Science and Engineering,
Indian Institute of Technology
Kharagpur.

(IIT Kharagpur) Camera Models Jan ’10

What is a camera?
A camera is a mapping between the 3D world (object space) and
a 2D image.
A camera model is a matrix with particular properties that
represents the camera mapping.

The general projective camera specializes into two major classes:
Finite camera: a central projection camera whose centre is a finite
point.
Camera at infinity: a camera whose centre lies at infinity, for example
the affine camera.


The basic pin-hole model
The centre of projection is called the camera centre.
The plane on which the image is formed is called the image
plane.
The line through the camera centre and perpendicular to the
image plane is called the principal axis of the camera.
The point where the principal axis meets the image plane is called
the principal point.
The plane through the camera centre parallel to the image plane
is called the principal plane of the camera.


Camera settings
Typical settings:
The camera centre is taken to be the origin of the Euclidean
coordinate frame.
The image plane is taken to be the plane Z = f.
The central projection mapping from Euclidean 3-space $\mathbb{R}^3$ to $\mathbb{R}^2$ is
given by:
\[ (X, Y, Z)^T \mapsto (fX/Z,\; fY/Z)^T \]
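A minimal sketch of this mapping, assuming the settings above (camera centre at the origin, image plane Z = f). The function name `project_pinhole` and the sample values are illustrative and not part of the lecture:

```python
def project_pinhole(point, f):
    """Central projection of (X, Y, Z) onto the image plane Z = f.

    Assumes the camera centre is the origin and the principal point is
    the image origin, as in the settings above.
    """
    X, Y, Z = point
    if Z == 0:
        # Such points lie on the plane through the camera centre parallel
        # to the image plane and have no finite image.
        raise ValueError("point with Z = 0 has no finite image")
    return (f * X / Z, f * Y / Z)


# Two points on the same ray through the camera centre (sample values).
near = project_pinhole((1.0, 2.0, 4.0), f=2.0)
far = project_pinhole((2.0, 4.0, 8.0), f=2.0)
print(near, far)
```

Both points lie on one ray through the camera centre, so they share the image point (0.5, 1.0).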


Central projection using homogeneous coordinates
\[
\begin{pmatrix} X \\ Y \\ Z \\ 1 \end{pmatrix}
\mapsto
\begin{pmatrix} fX \\ fY \\ Z \end{pmatrix}
=
\begin{bmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}
\begin{pmatrix} X \\ Y \\ Z \\ 1 \end{pmatrix}
\]

\[ \mathbf{x} = P\mathbf{X}, \qquad P = \operatorname{diag}(f, f, 1)\,[\, I \mid \mathbf{0} \,] \]

The measurements on the image plane assume that the principal point
is the origin of the image plane.


Principal point offset
If the principal point has general coordinates $(p_x, p_y)^T$, then the
mapping changes to

\[ (X, Y, Z)^T \mapsto (fX/Z + p_x,\; fY/Z + p_y)^T \]

In homogeneous coordinates:

\[
\begin{pmatrix} X \\ Y \\ Z \\ 1 \end{pmatrix}
\mapsto
\begin{pmatrix} fX + Zp_x \\ fY + Zp_y \\ Z \end{pmatrix}
=
\begin{bmatrix} f & 0 & p_x & 0 \\ 0 & f & p_y & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}
\begin{pmatrix} X \\ Y \\ Z \\ 1 \end{pmatrix}
\]

\[
K = \begin{bmatrix} f & 0 & p_x \\ 0 & f & p_y \\ 0 & 0 & 1 \end{bmatrix}
\]

\[ \mathbf{x} = K\,[\, I \mid \mathbf{0} \,]\,\mathbf{X}_{cam} \]


Camera Calibration matrix
x = K [ I | 0] Xcam

The matrix K is the camera calibration matrix.
Writing Xcam denotes that the world point is represented in the
camera coordinate system, with the camera centre being the
origin.


Camera rotation and translation
In general, points in space will be expressed in terms of a different
Euclidean coordinate frame, known as the world coordinate frame.
The two coordinate frames are related via rotation and translation.
A point expressed in the world coordinate system as X can be
represented in the camera coordinate system as Xcam

Xcam = R(X − C)

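C represents the coordinates of the camera centre in the world
coordinate frame. R is the rotation matrix.
In homogeneous coordinates:
\[
\mathbf{X}_{cam}
= \begin{bmatrix} R & -R\mathbf{C} \\ \mathbf{0}^T & 1 \end{bmatrix}
\begin{pmatrix} X \\ Y \\ Z \\ 1 \end{pmatrix}
= \begin{bmatrix} R & -R\mathbf{C} \\ \mathbf{0}^T & 1 \end{bmatrix} \mathbf{X}
\]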

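Concatenating the matrices
\[
\mathbf{x} = K\,[\, I \mid \mathbf{0} \,]\,\mathbf{X}_{cam},
\qquad
\mathbf{X}_{cam} = \begin{bmatrix} R & -R\mathbf{C} \\ \mathbf{0}^T & 1 \end{bmatrix} \mathbf{X}
\]
Since $[\, I \mid \mathbf{0} \,] \begin{bmatrix} R & -R\mathbf{C} \\ \mathbf{0}^T & 1 \end{bmatrix} = [\, R \mid -R\mathbf{C} \,] = R\,[\, I \mid -\mathbf{C} \,]$, we obtain
\[ \mathbf{x} = K R\,[\, I \mid -\mathbf{C} \,]\,\mathbf{X} \]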


Camera matrix
x = K R [ I | − C] X
Camera Matrix:
P = KR [ I | − C]

P is a 3 × 4 matrix with 9 degrees of freedom: 3 for K (the elements
f, px, py), 3 for R and 3 for C.
The parameters in K are the internal parameters.
The parameters in R and C are the external parameters.

A representation which hides the camera centre:

P = K [R | t], where t = −RC
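The two forms of the camera matrix can be compared numerically. The sketch below uses plain Python lists. The helper names (`matmul`, `camera_matrix`, `project`) and all numeric values are assumptions made for illustration only:

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def camera_matrix(K, R, C):
    """Return the 3x4 matrix P = K R [I | -C]."""
    I_minus_C = [[1.0, 0.0, 0.0, -C[0]],
                 [0.0, 1.0, 0.0, -C[1]],
                 [0.0, 0.0, 1.0, -C[2]]]
    return matmul(matmul(K, R), I_minus_C)


def project(P, X):
    """Image (x, y) of the inhomogeneous world point X under P."""
    Xh = list(X) + [1.0]
    x, y, w = (sum(p * c for p, c in zip(row, Xh)) for row in P)
    return (x / w, y / w)


# Illustrative values: f = 2, (px, py) = (3, 4), a 90 degree rotation
# about the Z axis, and camera centre C = (1, 0, -5).
K = [[2.0, 0.0, 3.0], [0.0, 2.0, 4.0], [0.0, 0.0, 1.0]]
R = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
C = [1.0, 0.0, -5.0]

P = camera_matrix(K, R, C)

# The same matrix through the form P = K [R | t] with t = -RC.
t = [-sum(R[i][j] * C[j] for j in range(3)) for i in range(3)]
R_t = [R[i] + [t[i]] for i in range(3)]
P_alt = matmul(K, R_t)

print(P)
print(P_alt == P)
print(project(P, (1.0, 1.0, 0.0)))
```

With these values the world point (1, 1, 0) has camera coordinates R(X − C) = (−1, 0, 5), so its image is (2·(−1)/5 + 3, 2·0/5 + 4) = (2.6, 4).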


CCD cameras: Non-uniform scaling
A CCD camera may have non-square pixels. This has the effect of
introducing unequal scale factors in the two axial directions.

\[
K = \begin{bmatrix} f & 0 & x_0 \\ 0 & f & y_0 \\ 0 & 0 & 1 \end{bmatrix}
\quad \text{changes to} \quad
K = \begin{bmatrix} \alpha_x & 0 & x_0 \\ 0 & \alpha_y & y_0 \\ 0 & 0 & 1 \end{bmatrix}
\]

$m_x$ and $m_y$ denote the number of pixels per unit distance in image
coordinates in the x and y directions.
$\alpha_x = f m_x$, $\alpha_y = f m_y$
$(x_0, y_0)$ are the coordinates of the principal point in terms of pixel
dimensions: $x_0 = m_x p_x$, $y_0 = m_y p_y$.
A CCD camera has 10 degrees of freedom.
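A small sketch of how the pixel-unit entries of K follow from the definitions above. The function name and the numbers are hypothetical, and zero skew is assumed:

```python
def ccd_calibration_matrix(f, mx, my, px, py):
    """K for a CCD camera with non-square pixels and zero skew.

    f      : focal length (length units)
    mx, my : pixels per unit length along x and y
    px, py : principal point (length units)
    """
    alpha_x, alpha_y = f * mx, f * my   # focal length in pixel units
    x0, y0 = mx * px, my * py           # principal point in pixel units
    return [[alpha_x, 0.0, x0],
            [0.0, alpha_y, y0],
            [0.0, 0.0, 1.0]]


# Illustrative numbers only: f = 8 mm, 100 x 125 pixels per mm,
# principal point at (3 mm, 2 mm).
K = ccd_calibration_matrix(f=8.0, mx=100.0, my=125.0, px=3.0, py=2.0)
print(K)
```

Here the two focal-length entries differ (800 and 1000 pixels) only because the pixel densities differ.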


Finite Projective Camera: Skew
If the coordinate system of the image plane is skewed, then the calibration matrix becomes

\[
K = \begin{bmatrix} \alpha_x & s & x_0 \\ 0 & \alpha_y & y_0 \\ 0 & 0 & 1 \end{bmatrix}
\]

s is the skew parameter.

P = K R [ I | −C ]
A finite projective camera has 11 degrees of freedom.
The left 3 × 3 sub-matrix of P is denoted as M.

M = KR


Finite Projective Camera
M = KR
The camera matrix can be written as

P = K R [ I | −C ] = [ M | p4 ]

where p4 denotes the last column of the camera matrix.


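Camera Anatomy: Projective Camera
Camera centre: the camera centre C is the point for which
PC = 0
Consider a line containing C and any other point A in 3-space.
Points on this line can be represented as:

X(λ) = λA + (1 − λ)C

Under the mapping x = PX, points on this line are projected to

x = PX(λ) = λPA + (1 − λ)PC = λPA

Since homogeneous points are defined only up to scale, every point on
the line maps to the same image point PA, so the line is a ray through
the camera centre.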


Column Vectors: Projective Camera
The columns of the projective camera are 3-vectors which have a
geometric meaning as particular image points.
The first 3 columns of P, i.e. p1, p2, p3, are the vanishing points of
the world coordinate X, Y and Z axes respectively.
The column p4 is the image of the world origin.
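The column interpretation can be checked by applying P to the points at infinity along the world axes and to the world origin. The short derivation below uses only x = PX, with $\mathbf{p}_i$ denoting the i-th column of P:

```latex
\begin{aligned}
P\,(1, 0, 0, 0)^T &= \mathbf{p}_1 && \text{(image of the point at infinity on the X axis)} \\
P\,(0, 1, 0, 0)^T &= \mathbf{p}_2 && \text{(image of the point at infinity on the Y axis)} \\
P\,(0, 0, 1, 0)^T &= \mathbf{p}_3 && \text{(image of the point at infinity on the Z axis)} \\
P\,(0, 0, 0, 1)^T &= \mathbf{p}_4 && \text{(image of the world origin)}
\end{aligned}
```

The image of the point at infinity in a given direction is the vanishing point of all lines with that direction, which gives the statement about the first three columns.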


Row Vectors
The rows of the projective camera are 4-vectors which are
interpreted geometrically as particular world planes.
\[
P = \begin{bmatrix}
p_{11} & p_{12} & p_{13} & p_{14} \\
p_{21} & p_{22} & p_{23} & p_{24} \\
p_{31} & p_{32} & p_{33} & p_{34}
\end{bmatrix}
= \begin{bmatrix} \mathbf{P}^{1T} \\ \mathbf{P}^{2T} \\ \mathbf{P}^{3T} \end{bmatrix}
\]

The set of points X which lie on the plane $\mathbf{P}^1$ satisfies $\mathbf{P}^{1T}\mathbf{X} = 0$.


Principal plane P3
The principal plane is the plane through the camera centre,
parallel to the image plane.
It consists of the set of points which are imaged on the line at
infinity of the image.
If a point X lies on the principal plane, then $P\mathbf{X} = (x, y, 0)^T$. Thus
a point lies on the principal plane if and only if $\mathbf{P}^{3T}\mathbf{X} = 0$.


Axis planes P1 , P2
The points on the plane $\mathbf{P}^1$ satisfy $\mathbf{P}^{1T}\mathbf{X} = 0$, and so are imaged at
$P\mathbf{X} = (0, y, w)^T$. These are points on the image y-axis.
Since PC = 0, we also have $\mathbf{P}^{1T}\mathbf{C} = 0$, so C lies on the
plane $\mathbf{P}^1$.
The plane $\mathbf{P}^1$ is defined by the camera centre and the line x = 0 in the
image.
The plane $\mathbf{P}^2$ is defined by the camera centre and the line y = 0 in the
image.
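The row interpretation can be illustrated with a small numeric check. The camera matrix and the sample points below are hypothetical, chosen only to give round numbers:

```python
def image_of(P, X):
    """Homogeneous image 3-vector P X of the homogeneous 4-vector X."""
    return tuple(sum(p * c for p, c in zip(row, X)) for row in P)


# Hypothetical camera matrix; its centre is C = (1, 0, -5, 1).
P = [[0, -2, 3, 15],
     [2, 0, 4, 18],
     [0, 0, 1, 5]]

C = (1, 0, -5, 1)
on_principal_plane = (7, 3, -5, 1)   # satisfies P3^T X = 0
on_plane_P1 = (4, 9, 1, 1)           # satisfies P1^T X = 0

print(image_of(P, C))                   # all three rows vanish at C
print(image_of(P, on_principal_plane))  # third coordinate is 0
print(image_of(P, on_plane_P1))         # first coordinate is 0
```

The second point maps to a point at infinity in the image, and the third maps to a point on the image line x = 0.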


Orthographic Projection
The projection along the Z-axis in matrix form:
\[
P = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
\]

The mapping takes a point $(X, Y, Z, 1)^T$ to the image point
$(X, Y, 1)^T$, dropping the Z coordinate.
For a general orthographic projection mapping, we precede this
map by a 3D Euclidean coordinate change of the form

\[
H = \begin{bmatrix} R & \mathbf{t} \\ \mathbf{0}^T & 1 \end{bmatrix}
\]

H is a 4 × 4 homography of $\mathbb{P}^3$.
R is a 3 × 3 rotation matrix. t is a 3 × 1 translation vector.


Writing $\mathbf{t} = (t_1, t_2, t_3)^T$, and $\mathbf{r}_1^T, \mathbf{r}_2^T, \mathbf{r}_3^T$ for the rows of the 3 × 3 rotation
matrix R, the coordinate change is:
\[
H_{4\times4} = \begin{bmatrix} R & \mathbf{t} \\ \mathbf{0}^T & 1 \end{bmatrix}
= \begin{bmatrix}
\mathbf{r}_1^T & t_1 \\
\mathbf{r}_2^T & t_2 \\
\mathbf{r}_3^T & t_3 \\
\mathbf{0}^T & 1
\end{bmatrix}
\]

Aligning the world coordinate system and the camera coordinate
system:

\[
P \, H_{4\times4} =
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix}
\mathbf{r}_1^T & t_1 \\
\mathbf{r}_2^T & t_2 \\
\mathbf{r}_3^T & t_3 \\
\mathbf{0}^T & 1
\end{bmatrix}
=
\begin{bmatrix}
\mathbf{r}_1^T & t_1 \\
\mathbf{r}_2^T & t_2 \\
\mathbf{0}^T & 1
\end{bmatrix}
\]

A general orthographic camera is therefore of the form:

\[
P = \begin{bmatrix}
\mathbf{r}_1^T & t_1 \\
\mathbf{r}_2^T & t_2 \\
\mathbf{0}^T & 1
\end{bmatrix}
\]

Five degrees of freedom: 3 for R and 2 for $t_1$, $t_2$.
The orthographic projection matrix P = [M | t] has the matrix M
with last row zero, with the first two rows orthogonal and of unit
norm, and $t_3 = 1$.
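A minimal sketch of the general orthographic camera, assuming a rotation R and translation t are given. The helper names and sample values are illustrative:

```python
def orthographic_camera(R, t):
    """P = [r1^T t1; r2^T t2; 0^T 1] from a rotation R and translation t."""
    return [list(R[0]) + [t[0]],
            list(R[1]) + [t[1]],
            [0.0, 0.0, 0.0, 1.0]]


def project(P, X):
    """Image (x, y) of the inhomogeneous world point X under P."""
    Xh = list(X) + [1.0]
    x, y, w = (sum(p * c for p, c in zip(row, Xh)) for row in P)
    return (x / w, y / w)


# Illustrative values: 90 degree rotation about Z; t3 is never used.
R = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
t = [5.0, 6.0, 7.0]
P = orthographic_camera(R, t)

a = project(P, (1.0, 2.0, 3.0))
# Same point pushed along the projection direction r3 (here the Z axis).
b = project(P, (1.0, 2.0, 300.0))
print(a, b)
```

Both points map to (3, 7): moving a point along the direction of projection does not change its image, and the third row of R and t3 play no role.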


Scaled orthographic projection
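Orthographic projection followed by isotropic scaling.
\[
P = \begin{bmatrix} k & 0 & 0 \\ 0 & k & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix}
\mathbf{r}_1^T & t_1 \\
\mathbf{r}_2^T & t_2 \\
\mathbf{0}^T & 1
\end{bmatrix}
=
\begin{bmatrix}
\mathbf{r}_1^T & t_1 \\
\mathbf{r}_2^T & t_2 \\
\mathbf{0}^T & 1/k
\end{bmatrix}
\]
The equality holds up to the overall scale factor k, which does not matter for a homogeneous matrix.

Six degrees of freedom.
A scaled orthographic projection matrix P = [M | t] has matrix M
with last row zero, and the first two rows orthogonal and of equal
norm.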


Weak perspective projection
A weak perspective camera is a camera at infinity for which the scale
factors in the two axial image directions need not be equal.

\[
P = \begin{bmatrix} \alpha_x & 0 & 0 \\ 0 & \alpha_y & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix}
\mathbf{r}_1^T & t_1 \\
\mathbf{r}_2^T & t_2 \\
\mathbf{0}^T & 1
\end{bmatrix}
\]

Seven degrees of freedom.
A weak perspective projection matrix P = [M | t] has matrix M with
last row zero, and the first two rows orthogonal (they need not
have equal norm).


The affine camera
\[
P_A = \begin{bmatrix} \alpha_x & s & 0 \\ 0 & \alpha_y & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix}
\mathbf{r}_1^T & t_1 \\
\mathbf{r}_2^T & t_2 \\
\mathbf{0}^T & 1
\end{bmatrix}
\equiv
\begin{bmatrix}
m_{11} & m_{12} & m_{13} & t_1 \\
m_{21} & m_{22} & m_{23} & t_2 \\
0 & 0 & 0 & 1
\end{bmatrix}
\]

Eight degrees of freedom.
An affine projection matrix P = [M | t] has matrix M whose first
two rows form a sub-matrix $M_{2\times3}$ of rank 2. This arises from the
requirement that the rank of P is 3.


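The affine camera $P_A$
Projection under an affine camera is a linear mapping on
inhomogeneous coordinates composed with a translation:
\[
\begin{pmatrix} x \\ y \end{pmatrix}
=
\begin{bmatrix} m_{11} & m_{12} & m_{13} \\ m_{21} & m_{22} & m_{23} \end{bmatrix}
\begin{pmatrix} X \\ Y \\ Z \end{pmatrix}
+
\begin{pmatrix} t_1 \\ t_2 \end{pmatrix}
\]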


Properties of the affine camera PA
The plane at infinity in space is mapped to points at infinity in the
image.
\[ P_A\,(X, Y, Z, 0)^T = (X, Y, 0)^T \]
The principal plane of the camera is the plane at infinity.
Parallel world lines are projected to parallel image lines.
The vector d satisfying $M_{2\times3}\,\mathbf{d} = \mathbf{0}$ is the direction of parallel
projection.
The camera centre is $(\mathbf{d}^T, 0)^T$, since $P_A \begin{pmatrix} \mathbf{d} \\ 0 \end{pmatrix} = \mathbf{0}$.
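Two of these properties can be checked numerically. In the sketch below the matrix M, the translation t and the sample points are hypothetical, and taking d as the cross product of the two rows of M is one convenient way (assumed here) to solve M d = 0 when the rows are independent:

```python
def affine_project(M, t, X):
    """Inhomogeneous image point M X + t for a 2x3 matrix M."""
    return tuple(sum(m * c for m, c in zip(row, X)) + ti
                 for row, ti in zip(M, t))


def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def image_direction(M, t, base, v):
    """Image displacement between base and base + v."""
    moved = tuple(b + vi for b, vi in zip(base, v))
    p, q = affine_project(M, t, moved), affine_project(M, t, base)
    return tuple(pi - qi for pi, qi in zip(p, q))


# Hypothetical affine camera (values chosen only for round numbers).
M = [[2, 1, 0], [0, 3, 1]]
t = (4, 5)

# Two parallel world lines: same direction v, different base points.
v = (1, 1, 2)
dir_A = image_direction(M, t, (0, 0, 0), v)
dir_B = image_direction(M, t, (3, -1, 7), v)

# Direction of parallel projection: d is orthogonal to both rows of M.
d = cross(M[0], M[1])
shift_along_d = image_direction(M, t, (3, -1, 7), d)

print(dir_A, dir_B, d, shift_along_d)
```

The two image directions agree, so the image lines are parallel, and moving a point along d leaves its image unchanged.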



Push Broom camera
The Linear Pushbroom (LP) camera is a commonly used type of
sensor in satellites.
A linear sensor array captures a single line of imagery at
a time.
As the sensor moves, the sensor plane sweeps out a region of
space, capturing the image a single line at a time.
The second dimension of the image is provided by the motion of
the sensor.
In the linear pushbroom model, the sensor is assumed to move in
a straight line at a constant velocity with respect to the ground.


Push Broom camera
In the direction along the sensor array, the image is effectively a
perspective image.
In the direction of the sensor motion, it is an orthographic
projection.
As with the general projective camera, the mapping from object
space to the image may be described by a 3 × 4 camera matrix.
Only the interpretation of the result changes.

Let $\mathbf{X} = (X, Y, Z, 1)^T$ be an object point, and let
P be the camera matrix of the linear pushbroom
camera. Suppose that $P\mathbf{X} = (x, y, w)^T$. Then
the corresponding image point (represented as
an inhomogeneous 2-vector) is $(x, y/w)^T$.
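The changed interpretation amounts to dividing only the second coordinate by w. A minimal sketch, with a hypothetical matrix P that is not a calibrated sensor model:

```python
def pushbroom_image_point(P, X):
    """Image of world point X = (X, Y, Z) under a linear pushbroom matrix P.

    With P X = (x, y, w)^T, only the second coordinate is divided by w.
    """
    Xh = list(X) + [1.0]
    x, y, w = (sum(p * c for p, c in zip(row, Xh)) for row in P)
    return (x, y / w)


# Hypothetical 3x4 matrix, for illustration only.
P = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 2.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0]]

p_near = pushbroom_image_point(P, (3.0, 4.0, 8.0))
p_far = pushbroom_image_point(P, (3.0, 4.0, 16.0))
print(p_near, p_far)
```

With this P, doubling the depth halves the second image coordinate (perspective behaviour) while the first coordinate stays the same (orthographic behaviour).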


Cameras at infinity
A camera at infinity means that the camera centre is at infinity.
The camera centre is the 1-dimensional right null-space C of P,
i.e. PC = 0

Finite camera (M is not singular):
\[ \mathbf{C} = \begin{pmatrix} -M^{-1}\mathbf{p}_4 \\ 1 \end{pmatrix} \]

Camera at infinity (M is singular):
\[ \mathbf{C} = \begin{pmatrix} \mathbf{d} \\ 0 \end{pmatrix}, \quad \text{i.e. } M\mathbf{d} = \mathbf{0} \]

Md = 0 implies that M has a one-dimensional right null space spanned by d. Hence M is singular.
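Both cases can be handled by one computation. The sketch below does not use the $-M^{-1}\mathbf{p}_4$ formula directly. Instead it swaps in the standard cofactor expression for the null vector of a 3 × 4 matrix (signed 3 × 3 minors), which also works when M is singular. Function names and matrices are illustrative assumptions:

```python
def det3(a, b, c):
    """Determinant of the 3x3 matrix whose columns are a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - b[0] * (a[1] * c[2] - a[2] * c[1])
            + c[0] * (a[1] * b[2] - a[2] * b[1]))


def camera_centre(P):
    """Homogeneous 4-vector C with P C = 0, from the 3x3 minors of P."""
    p1, p2, p3, p4 = zip(*P)   # columns of P
    return (det3(p2, p3, p4), -det3(p1, p3, p4),
            det3(p1, p2, p4), -det3(p1, p2, p3))


# Finite camera K [I | -C] with K = [[2,0,1],[0,2,1],[0,0,1]], C = (1, 2, 3).
P_finite = [[2, 0, 1, -5], [0, 2, 1, -7], [0, 0, 1, -3]]
# Orthographic projection along the Z axis (a camera at infinity).
P_affine = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1]]

C_finite = camera_centre(P_finite)
C_affine = camera_centre(P_affine)
print(C_finite, C_affine)
```

The last coordinate equals −det M, so it is zero exactly when M is singular. For the finite camera, dividing by the last coordinate recovers (1, 2, 3); for the orthographic camera the centre is the point at infinity in the Z direction.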


Cameras at infinity

Affine Camera
An affine camera is one whose camera matrix P has a last
row $\mathbf{P}^{3T}$ of the form (0, 0, 0, 1).
Points at infinity are mapped to points at infinity.

Non-Affine Camera
The 3 × 3 matrix M is singular, but the last row of P is not of the form (0, 0, 0, 1).


Smooth transition
Projective camera to Affine camera
Consider what happens as we apply the cinematographic technique of
"tracking back" while "zooming in", in such a way as to keep objects of
interest the same size.


Projective to Affine Camera Model Transition
Tracking back implies that we are moving the camera centre away
from the object.
Zooming in implies increasing the focal length.
We take the limit of the process of tracking back and zooming in,
such that both the focal length and the distance of the camera
from the object keep increasing.
The initial camera model is:
\[
P_0 = K R\,[\, I \mid -\mathbf{C} \,] = K
\begin{bmatrix}
\mathbf{r}_1^T & -\mathbf{r}_1^T\mathbf{C} \\
\mathbf{r}_2^T & -\mathbf{r}_2^T\mathbf{C} \\
\mathbf{r}_3^T & -\mathbf{r}_3^T\mathbf{C}
\end{bmatrix}
\]

where $\mathbf{r}_i^T$ is the i-th row of the rotation matrix R.


The vector $\mathbf{r}_3$ gives the direction of the principal ray.
$d_0 = -\mathbf{r}_3^T\mathbf{C}$ is the distance of the world origin from the camera
centre in the direction of the principal ray.

Start moving the camera back:
The camera centre is moved backwards along
the principal ray at unit speed for a time t, so that
the centre of the camera is moved to $\mathbf{C} - t\,\mathbf{r}_3$.
Substitute the updated centre in the camera matrix:

\[
P_t = K
\begin{bmatrix}
\mathbf{r}_1^T & -\mathbf{r}_1^T(\mathbf{C} - t\,\mathbf{r}_3) \\
\mathbf{r}_2^T & -\mathbf{r}_2^T(\mathbf{C} - t\,\mathbf{r}_3) \\
\mathbf{r}_3^T & -\mathbf{r}_3^T(\mathbf{C} - t\,\mathbf{r}_3)
\end{bmatrix}
\]

The terms $\mathbf{r}_i^T\mathbf{r}_3$ are zero for i = 1, 2 and $\mathbf{r}_3^T\mathbf{r}_3 = 1$, because R is a rotation matrix. Hence

\[
P_t = K
\begin{bmatrix}
\mathbf{r}_1^T & -\mathbf{r}_1^T\mathbf{C} \\
\mathbf{r}_2^T & -\mathbf{r}_2^T\mathbf{C} \\
\mathbf{r}_3^T & d_t
\end{bmatrix}
\]

The scalar $d_t = -\mathbf{r}_3^T\mathbf{C} + t$ is the depth of the world origin with
respect to the camera centre in the direction of the principal ray $\mathbf{r}_3$ of
the camera.


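Effect of Tracking:
\[
P_0 = K
\begin{bmatrix}
\mathbf{r}_1^T & -\mathbf{r}_1^T\mathbf{C} \\
\mathbf{r}_2^T & -\mathbf{r}_2^T\mathbf{C} \\
\mathbf{r}_3^T & -\mathbf{r}_3^T\mathbf{C}
\end{bmatrix},
\qquad
P_t = K
\begin{bmatrix}
\mathbf{r}_1^T & -\mathbf{r}_1^T\mathbf{C} \\
\mathbf{r}_2^T & -\mathbf{r}_2^T\mathbf{C} \\
\mathbf{r}_3^T & d_t
\end{bmatrix}
\]
The effect of tracking along the principal ray is to replace the (3, 4)
entry of the matrix by the depth $d_t$ of the camera centre from the
world origin.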


Effect of Zooming:
The focal length is increased by a factor k, i.e. the calibration
matrix K is multiplied on the right by diag(k, k, 1):
\[
K \;\to\; K \begin{bmatrix} k & 0 & 0 \\ 0 & k & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]


Effect of TRACKING + ZOOMING:
The focal length is increased by a factor $k = d_t/d_0$ so that the
image size remains fixed.

\[
P_t = K
\begin{bmatrix} d_t/d_0 & 0 & 0 \\ 0 & d_t/d_0 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix}
\mathbf{r}_1^T & -\mathbf{r}_1^T\mathbf{C} \\
\mathbf{r}_2^T & -\mathbf{r}_2^T\mathbf{C} \\
\mathbf{r}_3^T & d_t
\end{bmatrix}
= \frac{d_t}{d_0}\, K
\begin{bmatrix}
\mathbf{r}_1^T & -\mathbf{r}_1^T\mathbf{C} \\
\mathbf{r}_2^T & -\mathbf{r}_2^T\mathbf{C} \\
\frac{d_0}{d_t}\,\mathbf{r}_3^T & d_0
\end{bmatrix}
\]

The factor $d_t/d_0$ can be ignored, since a camera matrix is defined only up to scale.
When t = 0 the camera matrix $P_t$ is the same as $P_0$.
In the limit as $d_t$ tends to ∞, this matrix becomes
\[
P_\infty = \lim_{t \to \infty} P_t = K
\begin{bmatrix}
\mathbf{r}_1^T & -\mathbf{r}_1^T\mathbf{C} \\
\mathbf{r}_2^T & -\mathbf{r}_2^T\mathbf{C} \\
\mathbf{0}^T & d_0
\end{bmatrix}
\]

This is a subcategory of the affine camera:
the weak perspective camera.
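The limiting process can be watched numerically. The sketch below is a simplified setting and not the general case: it assumes R = I and C = (0, 0, −d0), so that the plane through the world origin perpendicular to the principal ray is Z = 0. The function names and all values are illustrative:

```python
def tracked_zoomed_camera(f, px, py, d0, t):
    """P_t (overall factor d_t/d_0 kept) for R = I and C = (0, 0, -d0)."""
    d_t = d0 + t          # depth of the world origin after tracking back
    k = d_t / d0          # zoom factor that keeps the image size fixed
    return [[f * k, 0.0, px, px * d_t],
            [0.0, f * k, py, py * d_t],
            [0.0, 0.0, 1.0, d_t]]


def project(P, X):
    """Image (x, y) of the inhomogeneous world point X under P."""
    Xh = list(X) + [1.0]
    x, y, w = (sum(p * c for p, c in zip(row, Xh)) for row in P)
    return (x / w, y / w)


f, px, py, d0 = 2.0, 3.0, 4.0, 10.0
on_plane = (1.0, 2.0, 0.0)    # lies on the plane Z = 0
off_plane = (1.0, 2.0, 5.0)   # 5 units behind that plane

for t in (0.0, 10.0, 1000.0, 1.0e6):
    P_t = tracked_zoomed_camera(f, px, py, d0, t)
    print(t, project(P_t, on_plane), project(P_t, off_plane))
```

The image of the on-plane point stays at (f·1/d0 + px, f·2/d0 + py) for every t, up to rounding, while the image of the off-plane point approaches that same position as t grows.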


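Error in employing an Affine Camera Model
Any point on the plane through the world origin and perpendicular
to the principal axis direction $\mathbf{r}_3$ can be written as

\[
\mathbf{X} = \begin{pmatrix} \alpha\,\mathbf{r}_1 + \beta\,\mathbf{r}_2 \\ 1 \end{pmatrix}
\]

The three camera matrices are

\[
P_0 = K
\begin{bmatrix}
\mathbf{r}_1^T & -\mathbf{r}_1^T\mathbf{C} \\
\mathbf{r}_2^T & -\mathbf{r}_2^T\mathbf{C} \\
\mathbf{r}_3^T & d_0
\end{bmatrix},
\quad
P_t = \frac{d_t}{d_0}\, K
\begin{bmatrix}
\mathbf{r}_1^T & -\mathbf{r}_1^T\mathbf{C} \\
\mathbf{r}_2^T & -\mathbf{r}_2^T\mathbf{C} \\
\frac{d_0}{d_t}\,\mathbf{r}_3^T & d_0
\end{bmatrix},
\quad
P_\infty = K
\begin{bmatrix}
\mathbf{r}_1^T & -\mathbf{r}_1^T\mathbf{C} \\
\mathbf{r}_2^T & -\mathbf{r}_2^T\mathbf{C} \\
\mathbf{0}^T & d_0
\end{bmatrix}
\]

One can verify that $P_0\mathbf{X} = P_t\mathbf{X} = P_\infty\mathbf{X}$ (as homogeneous points) for all t, since
$\mathbf{r}_3^T(\alpha\,\mathbf{r}_1 + \beta\,\mathbf{r}_2) = 0$.
This means that the image of the point X is unchanged by
combined zooming and backward tracking.
For points not on this plane, the images under $P_0$ and $P_\infty$ differ.
How large is the error?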


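Consider a point X which is at a perpendicular distance ∆ from
this plane.
\[
\mathbf{X} = \begin{pmatrix} \alpha\,\mathbf{r}_1 + \beta\,\mathbf{r}_2 + \Delta\,\mathbf{r}_3 \\ 1 \end{pmatrix}
\]
The point X is imaged by the cameras $P_0$ and $P_\infty$ as:
\[
\mathbf{x}_{proj} = P_0\mathbf{X} = K \begin{pmatrix} \tilde{x} \\ \tilde{y} \\ d_0 + \Delta \end{pmatrix}
\quad \text{and} \quad
\mathbf{x}_{affine} = P_\infty\mathbf{X} = K \begin{pmatrix} \tilde{x} \\ \tilde{y} \\ d_0 \end{pmatrix}
\]

where $\tilde{x} = \alpha - \mathbf{r}_1^T\mathbf{C}$ and $\tilde{y} = \beta - \mathbf{r}_2^T\mathbf{C}$.

Using the calibration matrix K written as

\[
K = \begin{bmatrix} K_{2\times2} & \tilde{\mathbf{x}}_0 \\ \tilde{\mathbf{0}}^T & 1 \end{bmatrix}
\]

and writing $\tilde{\mathbf{x}} = (\tilde{x}, \tilde{y})^T$, we obtain

\[
\mathbf{x}_{proj} = \begin{pmatrix} K_{2\times2}\,\tilde{\mathbf{x}} + (d_0 + \Delta)\,\tilde{\mathbf{x}}_0 \\ d_0 + \Delta \end{pmatrix}
\quad \text{and} \quad
\mathbf{x}_{affine} = \begin{pmatrix} K_{2\times2}\,\tilde{\mathbf{x}} + d_0\,\tilde{\mathbf{x}}_0 \\ d_0 \end{pmatrix}
\]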


After dehomogenizing the two points $\mathbf{x}_{proj}$ and $\mathbf{x}_{affine}$ we have

\[
\tilde{\mathbf{x}}_{proj} = \tilde{\mathbf{x}}_0 + \frac{K_{2\times2}\,\tilde{\mathbf{x}}}{d_0 + \Delta},
\qquad
\tilde{\mathbf{x}}_{affine} = \tilde{\mathbf{x}}_0 + \frac{K_{2\times2}\,\tilde{\mathbf{x}}}{d_0}
\]

Error:
\[
\tilde{\mathbf{x}}_{affine} - \tilde{\mathbf{x}}_0 = \frac{d_0 + \Delta}{d_0}\,\bigl(\tilde{\mathbf{x}}_{proj} - \tilde{\mathbf{x}}_0\bigr)
\]

The effect of the affine approximation $P_\infty$ to the true camera matrix $P_0$ is to move
the image of the point X radially towards or away from the principal
point $\tilde{\mathbf{x}}_0$ by a factor equal to $(d_0 + \Delta)/d_0$.

Rewriting the error as:

\[
\tilde{\mathbf{x}}_{affine} - \tilde{\mathbf{x}}_{proj} = \frac{\Delta}{d_0}\,\bigl(\tilde{\mathbf{x}}_{proj} - \tilde{\mathbf{x}}_0\bigr)
\]

The distance between the true perspective image position and the
position obtained using the affine camera approximation $P_\infty$ will
be small provided:

The depth relief (∆) is small compared to the average depth ($d_0$).
The distance of the point from the principal ray is small.

The latter condition is satisfied by a small field of view.
Images acquired using a lens with a longer focal length tend to satisfy
these conditions.


For scenes in which there are many points at different depths, the
affine camera is not a good approximation.
If the scene contains close foreground as well as background
objects, the affine camera model should not be used.
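The error relation can be checked with a few numbers. The sketch below again assumes the simplified setting R = I and C = (0, 0, −d0), with a zero-skew K having a single focal length f. Helper names and values are illustrative:

```python
import math


def project(P, X):
    """Image (x, y) of the inhomogeneous world point X under P."""
    Xh = list(X) + [1.0]
    x, y, w = (sum(p * c for p, c in zip(row, Xh)) for row in P)
    return (x / w, y / w)


def cameras(f, px, py, d0):
    """P_0 and P_inf for R = I and C = (0, 0, -d0), so d0 = -r3^T C."""
    P0 = [[f, 0.0, px, px * d0],
          [0.0, f, py, py * d0],
          [0.0, 0.0, 1.0, d0]]
    P_inf = [[f, 0.0, 0.0, px * d0],
             [0.0, f, 0.0, py * d0],
             [0.0, 0.0, 0.0, d0]]
    return P0, P_inf


f, px, py, d0 = 2.0, 3.0, 4.0, 10.0
P0, P_inf = cameras(f, px, py, d0)
x0 = (px, py)   # principal point

for delta in (0.0, 0.5, 5.0):
    X = (1.0, 2.0, delta)          # alpha = 1, beta = 2, depth relief delta
    x_proj = project(P0, X)
    x_aff = project(P_inf, X)
    actual = tuple(a - p for a, p in zip(x_aff, x_proj))
    predicted = tuple((delta / d0) * (p - c) for p, c in zip(x_proj, x0))
    agree = all(math.isclose(a, b, abs_tol=1e-12)
                for a, b in zip(actual, predicted))
    print(delta, actual, predicted, agree)
```

For ∆ = 0 the two images coincide, and the discrepancy grows with the ratio ∆/d0, in line with the conditions listed above.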


Conclusion
We have discussed several types of camera projection matrices.
In the most general form the camera matrix P has 11 degrees of
freedom.
Finite projective camera (non-uniform scale + skew) −→ 11
CCD camera (non-uniform scale) −→ 10
Non-CCD −→ 9
Orthographic projection −→ 5
Scaled orthographic projection (uniform scale) −→ 6
Weak perspective projection (non-uniform scale) −→ 7
Affine projection (non-uniform scale + skew) −→ 8
Since we prefer to work with simple models, we also
explored what happens when we use a simple affine camera
model (6 dof) instead of a general camera model (9 dof). Our
analysis of imaging errors indicates that an affine camera can indeed
be used to approximate a projective camera under certain scene
conditions: small depth relief compared to the average depth, and a
small field of view.


What Next?
We now understand and appreciate the linear model P for the
projective mapping from the 3-D scene to the camera image
plane.
Who will provide us with the linear model?
Most of the time we work with the camera as a black box given to us.
Thankfully we have access to the acquired image.
We also have some knowledge about the settings of the scene.

