COMP 557 lecture 4                                                                                                      Sept. 11, 2008


Perspective Projection
A simple model of image formation is that a 3D scene is projected towards a single point – called the
center of projection. This center of projection is just the position of the camera (or “view reference
point” defined in lecture notes of last class).
    The image is not defined at the projection point, but rather it is defined on a plane, called the
projection plane. The projection plane is perpendicular to the camera z axis (the n vector defined
in lecture 3).
    For real cameras, the projection plane and the scene lie on opposite sides of the center of
projection. The center of the camera aperture (or the pupil of the eye) serves roughly¹ as the center of
projection. Light passes through the camera aperture and then arrives at a light-sensitive surface².
The image of the scene is upside down on the projection plane of real cameras and eyes, which can
be very confusing. To avoid confusion, it is common in computer graphics to consider a projection
plane that lies on the same side of the center of projection as the scene.
    In camera coordinates, the center of projection is (0, 0, 0) and the projection plane is z = f. In
a right-handed coordinate system, f < 0. A general point (x, y, z) is projected to (x*, y*, f). Using
similar triangles, we observe that the projection should satisfy:
\[
\frac{x}{z} = \frac{x^*}{f}, \qquad \frac{y}{z} = \frac{y^*}{f}
\]
and so the projection maps
\[
(x, y, z) \rightarrow \left( f\,\frac{x}{z},\; f\,\frac{y}{z},\; f \right).
\]
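As a minimal sketch, the mapping above can be written directly in Python. The function name `project_to_plane` and the sample numbers are illustrative assumptions, not part of the lecture.

```python
def project_to_plane(x, y, z, f):
    """Project (x, y, z) toward the origin onto the plane z = f."""
    if z == 0:
        # Such a point lies in the plane through the center of projection
        # that is parallel to the projection plane; it has no finite image.
        raise ValueError("z = 0 has no finite projection")
    return (f * x / z, f * y / z, f)


# Right-handed camera coordinates: the scene has z < 0, and f < 0.
print(project_to_plane(2.0, 4.0, -10.0, -1.0))  # (0.2, 0.4, -1.0)
```

Note that the point and its image have the same ratios x/z and y/z, which is exactly the similar-triangles condition.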

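[Figure: two sketches of the projection onto the plane z = f, with the center of projection at the origin. Left, viewed from "above": the x and z axes, with the point (x, z) projecting to (x*, f). Right, viewed from the "side": the y and z axes, with the point (y, z) projecting to (y*, f).]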

  ¹ I am ignoring lenses here.
  ² CCD array for a camera, or retina for the eye.




Homogeneous coordinates
Last class we represented a 3D point (x, y, z) as a point (x, y, z, 1) in ℜ⁴. We now generalize
this by representing (x, y, z) as any 4D vector of the form (wx, wy, wz, w) where w ≠ 0. Note that
the set of points
                                     { (wx, wy, wz, w) : w ≠ 0 }
is a line in ℜ⁴ which passes through the origin and the point (x, y, z, 1) in ℜ⁴, with the origin
itself (w = 0) excluded. [ASIDE: note we are
associating each point in ℜ³ with a line in ℜ⁴, in particular, a line that passes through the origin.]
    Is this generalization consistent with the 4 × 4 rotation, translation, and scaling matrices which
we introduced last class? Yes, it is, since for any q = (x, y, z, 1),

                                          w(Mq) = M(wq),

that is, multiplying each component of the vector q by w and transforming by M yields the same
vector as transforming q by M and then multiplying each component of Mq by w. But multiplying
each component of a 4D vector by w ≠ 0 doesn’t change the 3D point that is represented.
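To make this concrete, here is a small sketch in plain Python. The helper names `mat_vec` and `to_3d`, the translation by (1, 2, 3), and the choice w = 2 are my own illustrative assumptions.

```python
def mat_vec(M, v):
    """Multiply a 4x4 matrix (given as a list of rows) by a 4-vector."""
    return [sum(M[i][j] * v[j] for j in range(4)) for i in range(4)]


def to_3d(p):
    """Divide by the fourth coordinate to recover the 3D point."""
    x, y, z, w = p
    if w == 0:
        raise ValueError("w = 0 does not represent a finite 3D point")
    return (x / w, y / w, z / w)


# Translation by (1, 2, 3), as an example of a matrix M from last class.
M = [[1.0, 0.0, 0.0, 1.0],
     [0.0, 1.0, 0.0, 2.0],
     [0.0, 0.0, 1.0, 3.0],
     [0.0, 0.0, 0.0, 1.0]]

q = [4.0, 5.0, 6.0, 1.0]
w = 2.0
wq = [w * c for c in q]  # another representation of the same 3D point

print(to_3d(mat_vec(M, q)))   # (5.0, 7.0, 9.0)
print(to_3d(mat_vec(M, wq)))  # (5.0, 7.0, 9.0)
```

Both representations of the point are carried to the same 3D point, as claimed.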
   But what do we gain by identifying (x, y, z, 1) with (wx, wy, wz, w)? Answer: a lot. For example,
consider the projection mapping above. We rewrite our projected points as follows:
\[
\left( f\,\frac{x}{z},\; f\,\frac{y}{z},\; f,\; 1 \right) \equiv (fx,\; fy,\; fz,\; z).
\]
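This allows us to represent the projection transformation by a 4 × 4 matrix, i.e.:
\[
\begin{bmatrix} fx \\ fy \\ fz \\ z \end{bmatrix}
=
\begin{bmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & f & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}
\begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}
\]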

    Several observations can be made. First, since we are now treating all 4D points (wx, wy, wz, w)
as representing the same 3D point, we can multiply any 4 × 4 transformation matrix by a nonzero constant
without changing the transformation that is carried out by the matrix. The above matrix can be
divided by the constant f and written instead as:
\[
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & \frac{1}{f} & 0 \end{bmatrix}
\]

This matrix, like the one above, projects the 3D scene onto the plane z = f , with center of projection
being the origin (0, 0, 0).
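The following sketch checks this on one point. The helper names and the sample values (f = −2 and the point (1, 2, −4)) are assumptions chosen for illustration.

```python
def mat_vec(M, v):
    """Multiply a 4x4 matrix (given as a list of rows) by a 4-vector."""
    return [sum(M[i][j] * v[j] for j in range(4)) for i in range(4)]


def to_3d(p):
    """Divide by the fourth coordinate to recover the 3D point."""
    x, y, z, w = p
    return (x / w, y / w, z / w)


def projection_matrix(f):
    """The projection matrix with f on the diagonal."""
    return [[f, 0.0, 0.0, 0.0],
            [0.0, f, 0.0, 0.0],
            [0.0, 0.0, f, 0.0],
            [0.0, 0.0, 1.0, 0.0]]


def scaled_projection_matrix(f):
    """The same matrix divided by the constant f."""
    return [[1.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 1.0 / f, 0.0]]


f = -2.0
q = [1.0, 2.0, -4.0, 1.0]

print(mat_vec(projection_matrix(f), q))                # [-2.0, -4.0, 8.0, -4.0]
print(mat_vec(scaled_projection_matrix(f), q))         # [1.0, 2.0, -4.0, 2.0]
print(to_3d(mat_vec(projection_matrix(f), q)))         # (0.5, 1.0, -2.0)
print(to_3d(mat_vec(scaled_projection_matrix(f), q)))  # (0.5, 1.0, -2.0)
```

The two matrices produce different 4D vectors, but both vectors represent the same 3D point on the plane z = f.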
    A second observation is that, whereas the 4 × 4 translation, rotation, and scaling matrices were
invertible (and hence of rank 4), the projection matrix is clearly not invertible. It is of rank 3 since
the third and fourth rows are linearly dependent i.e. they differ only by a multiplicative constant.
    A third observation is that there is nothing magic about the z = f projection plane. We can
easily define projections onto other planes. For example, the following matrix projects onto the
x = f plane:
\[
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ \frac{1}{f} & 0 & 0 & 0 \end{bmatrix}
\]
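As a quick check of this last matrix, here is a sketch with assumed sample values (f = 2 and the point (4, 6, 8)); the helper names are again my own.

```python
def mat_vec(M, v):
    """Multiply a 4x4 matrix (given as a list of rows) by a 4-vector."""
    return [sum(M[i][j] * v[j] for j in range(4)) for i in range(4)]


def to_3d(p):
    """Divide by the fourth coordinate to recover the 3D point."""
    x, y, z, w = p
    return (x / w, y / w, z / w)


f = 2.0
# The fourth row is (1/f, 0, 0, 0), so the fourth coordinate becomes x / f.
P_x = [[1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [1.0 / f, 0.0, 0.0, 0.0]]

print(to_3d(mat_vec(P_x, [4.0, 6.0, 8.0, 1.0])))  # (2.0, 3.0, 4.0)
```

The image point has x = f, and it lies on the line from the origin through (4, 6, 8).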

Another example of perspective projection
Suppose the center of projection is placed at (0, 0, f) where now f > 0, and suppose we set the
projection plane to z = 0. Take a point (x, y, z) such that z < 0 and project this point to (x*, y*, 0).
Using similar triangles, we see that
\[
\frac{x}{f - z} = \frac{x^*}{f}, \qquad \frac{y}{f - z} = \frac{y^*}{f}
\]
where, in the sketch below, we note that z < 0. Hence
\[
(x, y, z) \rightarrow (x^*, y^*, 0) = \left( f\,\frac{x}{f - z},\; f\,\frac{y}{f - z},\; 0 \right) \tag{1}
\]
Expressing the above transformation in homogeneous coordinates, we have:
\[
(x, y, z, 1) \rightarrow \left( f\,\frac{x}{f - z},\; f\,\frac{y}{f - z},\; 0,\; 1 \right)
\]
We multiply each component on the right side by w = f − z, and note
\[
\left( f\,\frac{x}{f - z},\; f\,\frac{y}{f - z},\; 0,\; 1 \right) \equiv (xf,\; yf,\; 0,\; f - z)
\]
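   Can we invent a 4 × 4 matrix that achieves the transformation

                                      (x, y, z, 1) → (xf, yf, 0, f − z) ?

Yes, we can:
\[
\begin{bmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & -1 & f \end{bmatrix}
\]
That is,
\[
\begin{bmatrix} xf \\ yf \\ 0 \\ f - z \end{bmatrix}
=
\begin{bmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & -1 & f \end{bmatrix}
\begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}
\]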
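Here is a sketch that compares this matrix with formula (1) on one point. The function names and the sample values (f = 2 and the point (3, 6, −4)) are assumptions for illustration only.

```python
def mat_vec(M, v):
    """Multiply a 4x4 matrix (given as a list of rows) by a 4-vector."""
    return [sum(M[i][j] * v[j] for j in range(4)) for i in range(4)]


def to_3d(p):
    """Divide by the fourth coordinate to recover the 3D point."""
    x, y, z, w = p
    return (x / w, y / w, z / w)


def project_direct(x, y, z, f):
    """Formula (1): center of projection (0, 0, f), projection plane z = 0."""
    return (f * x / (f - z), f * y / (f - z), 0.0)


def project_matrix(f):
    """The 4x4 matrix for the same projection."""
    return [[f, 0.0, 0.0, 0.0],
            [0.0, f, 0.0, 0.0],
            [0.0, 0.0, 0.0, 0.0],
            [0.0, 0.0, -1.0, f]]


f = 2.0
x, y, z = 3.0, 6.0, -4.0

print(mat_vec(project_matrix(f), [x, y, z, 1.0]))         # [6.0, 12.0, 0.0, 6.0]
print(to_3d(mat_vec(project_matrix(f), [x, y, z, 1.0])))  # (1.0, 2.0, 0.0)
print(project_direct(x, y, z, f))                         # (1.0, 2.0, 0.0)
```

The fourth coordinate of the matrix product is f − z, and dividing by it gives back formula (1).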






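[Figure: two sketches of the projection onto the plane z = 0, with the center of projection at (0, 0, f). Left, viewed from "above": the x and z axes, with the point (x, z) projecting to (x*, 0). Right, viewed from the "side": the y and z axes, with the point (y, z) projecting to (y*, 0).]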




Orthographic projection
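Extending this last example, what happens if we let f → ∞, essentially moving the camera very
far away in the z direction? To see this, we first divide the above projection matrix by f, which
does not change the transformation it represents, and rewrite it as:
\[
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & -\frac{1}{f} & 1 \end{bmatrix}.
\]
Letting f → ∞ yields:
\[
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}.
\]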
This transformation just sets the z value to 0 and leaves the x and y values as they are. It is
called the orthographic projection in the z direction.
    Some of you may have heard of orthographic projection before. It is quite a simple projection,
perhaps the simplest one can define! It is not obvious, though, why this transformation should
have anything to do with the usual way to project images, namely towards a center of projection.
The above derivation makes this connection: an orthographic projection is the limit of the central
projection that you get when the camera moves far back from the scene.
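We can watch this limit numerically. The sketch below applies the rescaled matrix (fourth row (0, 0, −1/f, 1)) for increasing f; the function name `perspective_xy` and the sample point are assumptions of mine.

```python
def perspective_xy(x, y, z, f):
    """Apply the matrix with rows (1,0,0,0), (0,1,0,0), (0,0,0,0), (0,0,-1/f,1),
    then divide by the fourth coordinate w = 1 - z/f."""
    w = 1.0 - z / f
    return (x / w, y / w, 0.0)


point = (3.0, 6.0, -4.0)
for f in (10.0, 1000.0, 1e6):
    # As f grows, w tends to 1 and the image tends to (x, y, 0).
    print(f, perspective_xy(*point, f))
```

For large f the printed image points approach (3, 6, 0), which is what the orthographic projection in the z direction gives.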
    More generally, an orthographic projection is defined by an arbitrary plane in 3D: you
project all points (x, y, z) onto the plane in the direction parallel to the normal to the plane. The “view from
above” and “view from the side” sketches shown earlier in this lecture can be thought of as the
orthographic projection in the y and x directions, respectively.





Points at infinity
We have considered points (wx, wy, wz, w) under the condition that w ≠ 0. We have allowed
ourselves to talk about (0, 0, 0, w) provided that w ≠ 0. Let’s look at the remaining points (x, y, z, 0),
where at least one of x, y, z is non-zero.³ How are we to interpret this case?
   Consider (x, y, z, ε) and consider what happens when ε → 0. Assuming that ε > 0, we can write
\[
(x, y, z, \epsilon) \equiv \left( \frac{x}{\epsilon},\; \frac{y}{\epsilon},\; \frac{z}{\epsilon},\; 1 \right)
\]
This is very interesting. As ε → 0, the corresponding 3D point goes to infinity, and stays along the
line from the origin through the point (x, y, z, 1). We thus identify the limit (x, y, z, 0) with a point
at infinity.
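A small numerical sketch of this limit follows. The function name `distance_and_direction` and the sample values are my own assumptions.

```python
import math


def distance_and_direction(x, y, z, eps):
    """Return the distance from the origin and the unit direction of the
    3D point represented by (x, y, z, eps), for eps > 0."""
    px, py, pz = x / eps, y / eps, z / eps
    dist = math.sqrt(px * px + py * py + pz * pz)
    return dist, (px / dist, py / dist, pz / dist)


for eps in (1.0, 0.1, 0.001):
    dist, direction = distance_and_direction(1.0, 2.0, 2.0, eps)
    # The distance grows like 1/eps while the direction stays the same.
    print(eps, dist, direction)
```

The distance blows up as ε shrinks, but the direction from the origin stays (up to rounding) at (1/3, 2/3, 2/3).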
    What happens to a point at infinity when we perform a rotation, translation, or scaling? Since
the bottom row of each of these 4×4 matrices is (0,0,0,1), it is easy to see that these transformations
map points at infinity to points at infinity. In particular,

    • a translation matrix does not affect a point at infinity, i.e. it behaves the same as the identity
      matrix;

    • a rotation matrix maps a point at infinity in exactly the same way it maps a finite point,
      namely, (x, y, z, 1) rotates to (x′, y′, z′, 1) if and only if (x, y, z, 0) rotates to (x′, y′, z′, 0);

    • a scale matrix maps a point at infinity in exactly the same way it maps a finite point, namely,
      (x, y, z, 1) scales to (s_x x, s_y y, s_z z, 1) if and only if (x, y, z, 0) scales to (s_x x, s_y y, s_z z, 0).
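The three cases above can be checked with a few lines of Python. The particular matrices (a translation by (5, 6, 7), a 90 degree rotation about the z axis, a scaling by (2, 1, 1)) and the helper name `mat_vec` are illustrative assumptions.

```python
def mat_vec(M, v):
    """Multiply a 4x4 matrix (given as a list of rows) by a 4-vector."""
    return [sum(M[i][j] * v[j] for j in range(4)) for i in range(4)]


T = [[1, 0, 0, 5],   # translation by (5, 6, 7)
     [0, 1, 0, 6],
     [0, 0, 1, 7],
     [0, 0, 0, 1]]

R = [[0, -1, 0, 0],  # rotation by 90 degrees about the z axis
     [1, 0, 0, 0],
     [0, 0, 1, 0],
     [0, 0, 0, 1]]

S = [[2, 0, 0, 0],   # scaling by (2, 1, 1)
     [0, 1, 0, 0],
     [0, 0, 1, 0],
     [0, 0, 0, 1]]

finite = [1, 2, 3, 1]       # a finite point
at_infinity = [1, 2, 3, 0]  # the corresponding point at infinity

print(mat_vec(T, finite))       # [6, 8, 10, 1]
print(mat_vec(T, at_infinity))  # [1, 2, 3, 0]
print(mat_vec(R, finite))       # [-2, 1, 3, 1]
print(mat_vec(R, at_infinity))  # [-2, 1, 3, 0]
print(mat_vec(S, finite))       # [2, 2, 3, 1]
print(mat_vec(S, at_infinity))  # [2, 2, 3, 0]
```

In every case the fourth coordinate of the point at infinity stays 0, since the bottom row of each matrix is (0, 0, 0, 1).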

   What happens when we project a point at infinity (x, y, z, 0) onto a projection plane, with the
center of projection at the origin? It is easy
to verify that it projects onto exactly the same point that the finite point (x, y, z, 1) projects to.
This should make sense. For any (x, y, z) ∈ ℜ³, consider the line of points (wx, wy, wz, 1) that passes through
the origin (0, 0, 0, 1) and through the point (x, y, z, 1). All points on this line (except the origin)
project to the same image point. The point at infinity is just the limit point of this line.
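The verification is short enough to sketch in code, using the projection matrix onto z = f with center of projection at the origin. The sample values (f = −1 and (x, y, z) = (2, 4, −10)) and the helper names are assumptions.

```python
def mat_vec(M, v):
    """Multiply a 4x4 matrix (given as a list of rows) by a 4-vector."""
    return [sum(M[i][j] * v[j] for j in range(4)) for i in range(4)]


def to_3d(p):
    """Divide by the fourth coordinate to recover the 3D point."""
    x, y, z, w = p
    return (x / w, y / w, z / w)


f = -1.0
# Projection onto z = f with center of projection at the origin.
# The fourth column is zero, so the input's fourth coordinate is ignored.
P = [[f, 0.0, 0.0, 0.0],
     [0.0, f, 0.0, 0.0],
     [0.0, 0.0, f, 0.0],
     [0.0, 0.0, 1.0, 0.0]]

finite = [2.0, 4.0, -10.0, 1.0]
at_infinity = [2.0, 4.0, -10.0, 0.0]

print(to_3d(mat_vec(P, finite)))       # (0.2, 0.4, -1.0)
print(to_3d(mat_vec(P, at_infinity)))  # (0.2, 0.4, -1.0)
```

The reason is visible in the matrix itself: its fourth column is all zeros.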

Direction vectors
It is often useful to interpret points at infinity as direction vectors, that is, they have a direction
but no position. We can rotate them and scale⁴ them. But we cannot translate them.
    As a concrete example, consider the equation of a plane that passes through the origin (0, 0, 0).
We can write this plane as
\[
N_x x + N_y y + N_z z = 0
\]
or using homogeneous coordinates
\[
(N_x, N_y, N_z, 0) \cdot (x, y, z, 1) = 0.
\]
   ³ The case (0, 0, 0, 0) is not considered here. We will not try to interpret this case.
   ⁴ We can only scale a “direction vector” in a relative sense, not an absolute sense, e.g. if we scale by (s_x, s_y, s_z) =
(2, 1, 1), then we are doubling lengths along the x axis relative to lengths along the y and z axes. If we scale by
(s_x, s_y, s_z) = (s, s, s), where s ≠ 0, then this doesn’t change the direction vector, since we get the same point at
infinity.




The normal to the plane (N_x, N_y, N_z) is a direction vector, and so we represent it in homogeneous
coordinates as (N_x, N_y, N_z, 0).
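   What happens to the plane if we apply a rotation transformation R? Consider:
\begin{align*}
0 &= (N_x, N_y, N_z, 0) \cdot (x, y, z, 1) \\
  &= (N_x, N_y, N_z, 0)\,(x, y, z, 1)^T \\
  &= (N_x, N_y, N_z, 0)\,R^T R\,(x, y, z, 1)^T \\
  &= \left( R\,(N_x, N_y, N_z, 0)^T \right) \cdot \left( R\,(x, y, z, 1)^T \right)
\end{align*}
The third line uses the fact that RᵀR is the identity for a rotation matrix R.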

Thus, applying the rotation to the points (x, y, z) on the plane and to the normal vector (N_x, N_y, N_z)
gives a new plane and a new normal which are perpendicular to each other. This should not be
surprising. I show it mainly so that you can see how this property is expressed mathematically.
(We will also use similar but more subtle arguments later in the course.)
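Here is a numerical sketch of the same argument. The rotation (30 degrees about the z axis), the normal (1, 2, 3), the point (3, 0, −1) on the plane, and all helper names are assumptions chosen for illustration.

```python
import math


def mat_vec(M, v):
    """Multiply a 4x4 matrix (given as a list of rows) by a 4-vector."""
    return [sum(M[i][j] * v[j] for j in range(4)) for i in range(4)]


def dot(a, b):
    """4D dot product."""
    return sum(ai * bi for ai, bi in zip(a, b))


def rotation_z(theta):
    """4x4 rotation by angle theta (in radians) about the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0, 0.0],
            [s, c, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


normal = [1.0, 2.0, 3.0, 0.0]  # direction vector: fourth coordinate is 0
point = [3.0, 0.0, -1.0, 1.0]  # a point on the plane x + 2y + 3z = 0

R = rotation_z(math.radians(30.0))

print(dot(normal, point))  # 0.0
# After rotating both, the dot product is still zero up to rounding error.
print(abs(dot(mat_vec(R, normal), mat_vec(R, point))) < 1e-12)  # True
```

Because the normal has fourth coordinate 0, the 4D dot product reduces to the usual 3D dot product between the normal and the point.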



