Name: Anish Hemmady
Project Report
Assignment No. 1
1. Problem statement: To calibrate a camera using data points taken from an image of a calibration rig, and to find the intrinsic and extrinsic parameters.
2. Procedure: Below is the image of the calibration rig, which shows three axes x, y, z. The x axis increases from left to right and the z axis increases from bottom to top. The real-world coordinates are read from this diagram: each white block spans 2 cm in height and width, and the x, y, z axes denote world coordinates in cm.
Fig 1. Calibration rig image
2.1. Steps taken:
a) Note down the world coordinates manually. Some initial coordinates are given in the image: the distance from the origin to the first square on the left side is (4,0,0), and on the right side it is (0,4,0). Since the gap between two squares is 2 cm and the height is also 2 cm, you can count squares and find the world coordinates of the points of interest.
b) The image size is 1920 × 1280 pixels.
c) Note down at least 6 world coordinates and the 6 corresponding image coordinates.
d) Note down the image coordinates using ImageJ, or build your own tool. I built my own tool in Python: a Tkinter GUI captures the user's mouse clicks as events and returns their image coordinates.
e) Now write a program that automatically takes in these world and image coordinates and generates the camera matrix.
f) Using the projection matrix, we can then calculate the intrinsic and extrinsic parameters.
2.2. Calculating image coordinates using Tkinter (Python GUI)
I wrote a Python GUI that picks out the image coordinates (x, y). Since we want x increasing from left to right and y increasing from bottom to top, I subtracted the reported y value from the image height, which gave the y coordinate in the desired orientation; the x coordinate was left untouched. I collected 6 coordinates with Tkinter.
2.3. Calculation of world coordinates
World coordinates were chosen manually according to the scale given in the project question. Each square object measured 2 cm in height and width, and the distance from the x, y, z origin marked in the image to the first left- and right-side squares measured 4 cm. From this information I was able to find the world coordinates.
3. Calculation of Projection Matrix
The projection matrix is the following 3 × 4 matrix:

        | a11  a12  a13  a14 |
    ∏ = | a21  a22  a23  a24 |
        | a31  a32  a33  a34 |

The a(ij) are the parameters to estimate.
The camera equation is:

    image point (u, v, w) = [3 × 4 matrix ∏] · world point (x, y, z, 1)

where (u, v, w) are homogeneous image coordinates; the pixel coordinates are recovered as (u/w, v/w).
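As a quick worked illustration of this equation (the matrix values below are arbitrary, chosen only for the example, not the matrix estimated later in this report):

```python
import numpy as np

# Arbitrary illustrative 3 x 4 projection matrix.
P = np.array([[1000.0,    0.0, 0.0, 960.0],
              [   0.0, 1000.0, 0.0, 640.0],
              [   0.0,    0.0, 0.0,   1.0]])

Xw = np.array([0.1, 0.2, 0.0, 1.0])  # world point in homogeneous form
u, v, w = P @ Xw                     # the camera equation
x, y = u / w, v / w                  # divide by w to get pixel coordinates
print(x, y)                          # 1060.0 840.0
```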
There are two approaches to solving for the projection matrix:
1) Use the pseudo-inverse property: multiply the pseudo-inverse by the known column vector and calculate the a(ij) parameters.
2) Use the eigenvector corresponding to the minimum eigenvalue.
From the matrix above, camera equations can be generated and inserted into our program, forming a matrix of 12 rows and 11 columns. It has 11 columns because in approach 1 the scale factor a34 is set to 1, and the remaining 11 parameters are calculated first.
I will be using approach 1 to solve for the projection matrix, as follows:
a) After manipulating the equations from the matrix above, we have to solve Bp = b, where B is the 12 × 11 matrix formed from the measured image and world coordinates, and p is the column vector of the 11 unknown a(ij) parameters (11 rows × 1 column).
b) Since we fix the scale by setting a34 = 1, constant terms appear in the equations; moving them to the right-hand side gives the known 12 × 1 column vector b of image coordinates, i.e. Bp = [12 × 1 image coordinates].
c) Now build the matrix B from these equations.
d) Then calculate the pseudo-inverse of B, given by (BᵀB)⁻¹Bᵀ.
e) We take the transpose because B is not a square matrix: BᵀB is square, so it can be inverted.
f) I used NumPy's pseudo-inverse routine (numpy.linalg.pinv) to compute this result directly.
g) Multiply the pseudo-inverse of B by the known column vector of image coordinates; the output is the 11 unknown parameters.
h) To recover the true a34 (which we assumed to be 1.0), take the third row of the estimated matrix, take the elements a31, a32, a33, and do the following.
i) Compute √(a31² + a32² + a33²).
j) Divide the entire matrix by this square-root value; the result is our final projection matrix.
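The steps above can be sketched as follows, using the six point correspondences listed in the Results section. This is a sketch of the approach, not the actual program.

```python
import numpy as np

# Six world points (cm) and the corresponding image points from the
# Results section.
world = [(6, 0, 0), (18, 0, 14), (6, 0, 14), (0, 4, 0), (0, 16, 0), (0, 16, 14)]
image = [(794.0, 331.0), (456.0, 845.0), (844.0, 799.0),
         (1036.0, 335.0), (1402.0, 325.0), (1398.0, 852.0)]

# Each correspondence gives two equations; with a34 fixed to 1 the
# constants move to the right-hand side, giving Bp = b with B 12 x 11.
rows, b = [], []
for (X, Y, Z), (u, v) in zip(world, image):
    rows.append([X, Y, Z, 1, 0, 0, 0, 0, -u * X, -u * Y, -u * Z])
    rows.append([0, 0, 0, 0, X, Y, Z, 1, -v * X, -v * Y, -v * Z])
    b.extend([u, v])
B, b = np.array(rows), np.array(b)

# Pseudo-inverse solve: p = (B^T B)^(-1) B^T b.
p = np.linalg.pinv(B) @ b

# Append the assumed a34 = 1, reshape into the 3 x 4 matrix, then
# normalize by sqrt(a31^2 + a32^2 + a33^2).
P = np.append(p, 1.0).reshape(3, 4)
P = P / np.linalg.norm(P[2, :3])
print(P)
```

Note that the final normalization does not change the projected pixel coordinates, since the common scale cancels when dividing by w.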
4. Calculating Intrinsic and Extrinsic parameters
Intrinsic parameters-Intrinsic parameters are those parameters which depends on internal
configuration of camera. Focal length, skew and origin values are considered Inrinsic
parameters.(uo , vo, α,β)
The intrinsic parameters are calculated from the normalized projection matrix obtained above, using the first three elements of each row:
1) Take the dot product of the first row with the third row; this gives u0.
2) Similarly, the dot product of the second row with the third row gives v0.
3) For α, take the dot product of the first row with itself, subtract the square of u0 from it, and take the square root of the difference.
4) Similarly, take the dot product of the second row with itself, subtract the square of v0, and take the square root; this gives β.
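A sketch of these four computations, applied to the normalized projection matrix reported in the Results section (values copied from this report):

```python
import numpy as np

# Normalized projection matrix from the Results section.
P = np.array([[-1.83570407e+03,  4.74934743e+02,  3.83797011e+02, 4.52319695e+04],
              [-4.15282031e+02, -2.79212815e+02,  1.66721925e+03, 1.66100665e+04],
              [-7.35406639e-01, -6.20392970e-01,  2.72561255e-01, 4.77012794e+01]])

a1, a2, a3 = P[0, :3], P[1, :3], P[2, :3]

u0 = a1 @ a3                          # step 1: dot(row 1, row 3)
v0 = a2 @ a3                          # step 2: dot(row 2, row 3)
alpha = np.sqrt(a1 @ a1 - u0 ** 2)    # step 3: sqrt(|row 1|^2 - u0^2)
beta  = np.sqrt(a2 @ a2 - v0 ** 2)    # step 4: sqrt(|row 2|^2 - v0^2)

K = np.array([[alpha,  0.0,  u0],
              [  0.0, beta,  v0],
              [  0.0,  0.0, 1.0]])
print(K)
```

These formulas assume the third row has already been normalized to unit length, as described in step j) above.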
4.1. Extrinsic parameters
We can calculate the extrinsic parameters from the projection matrix together with the intrinsic parameter matrix formed above.
The extrinsic parameters are the rotation matrix and the translation vector.
Follow these steps to calculate the extrinsic parameters:
1) Take the first 3 rows and first 3 columns of the projection matrix; this 3 × 3 block carries the rotation information. Multiply it by the inverse of the intrinsic parameter matrix to get the rotation matrix.
2) Take the fourth (last) column of the projection matrix and multiply it by the inverse of the intrinsic matrix to get the translation vector.
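A sketch of this decomposition, using the normalized projection matrix and intrinsic matrix reported in the Results section, and assuming the standard form in which the translation comes from the last column of ∏:

```python
import numpy as np

# Normalized projection matrix and intrinsic matrix from the Results section.
P = np.array([[-1.83570407e+03,  4.74934743e+02,  3.83797011e+02, 4.52319695e+04],
              [-4.15282031e+02, -2.79212815e+02,  1.66721925e+03, 1.66100665e+04],
              [-7.35406639e-01, -6.20392970e-01,  2.72561255e-01, 4.77012794e+01]])
K = np.array([[1548.28496,        0.0, 1159.95098],
              [       0.0, 1469.51395, 933.042203],
              [       0.0,        0.0,        1.0]])

K_inv = np.linalg.inv(K)
R = K_inv @ P[:, :3]  # rotation: K^-1 times the first three columns
t = K_inv @ P[:, 3]   # translation: K^-1 times the last column
print(R)
print(t)
```

A quick sanity check on R is that each of its rows should have unit length when the decomposition is consistent.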
5. Results
Image coordinates values
(794.0,331.0),(456.0,845.0),(844.0,799.0),(1036.0,335.0),(1402.0,325.0),(1398.0,852.0)
World coordinates values :(6,0,0),(18,0,14),(6,0,14),(0,4,0),(0,16,0),(0,16,14)
Estimated projection matrix containing the 11 parameters I found, plus the value a34 = 1.0 that we had assumed:
[[ -3.84833298e+01 9.95643616e+00 8.04584312e+00 9.48233887e+02]
[ -8.70588875e+00 -5.85336113e+00 3.49512482e+01 3.48210083e+02]
[ -1.54169165e-02 -1.30057931e-02 5.71391918e-03 1.00000000e+00]]
Final normalized projection matrix, containing all 12 parameter values:
[[ -1.83570407e+03 4.74934743e+02 3.83797011e+02 4.52319695e+04]
[ -4.15282031e+02 -2.79212815e+02 1.66721925e+03 1.66100665e+04]
[ -7.35406639e-01 -6.20392970e-01 2.72561255e-01 4.77012794e+01]]
Intrinsic parameters matrix
[[ 1.54828496e+03 0.00000000e+00 1.15995098e+03]
[ 0.00000000e+00 1.46951395e+03 9.33042203e+02]
[ 0.00000000e+00 0.00000000e+00 1.00000000e+00]]
Extrinsic parameters matrix for Rotation
[[-0.63468189 0.77153767 0.04368661]
[ 0.18433537 0.20390416 0.96147988]
[-0.73540664 -0.62039297 0.27256126]]
Extrinsic parameters for Translation
[ 0.04368661 0.96147988 0.27256126]
Parameter | Value
u0        | 1159.95097641
v0        | 933.042202582
α         | 1548.28495503
β         | 1469.51395002
6. Reconstructing Image coordinates (Error checking)
To check how precisely the projection matrix obtained earlier reproduces the image coordinates, we do error checking by reconstructing the image coordinates from the world coordinates. Here only 6 coordinates were taken into account, so the error rate is high.
Image coordinates (measured using ImageJ) | World coordinates (cm) | Image coordinates (estimated by my program) | Error normalization
(688,394)  | (10,0,2,1) | (675.98,386.17) | 15.2
(794,398)  | (6,0,2,1)  | (798,398.15)    | 4.0
(946,342)  | (0,0,0,1)  | (948,348)       | 5.65
(1234,398) | (0,12,2,1) | (1267,390)      | 34
Error normalization is computed by taking the differences between the x and y coordinates of the measured (ImageJ) and estimated image points, squaring both differences, adding them, and taking the square root, i.e. the Euclidean distance. Most of the error values are moderate, but the last one is rather large, which leads me to the conclusion that taking a larger number of image points gives a smaller error.
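The error computation just described can be sketched as a small helper (the function name is illustrative):

```python
import numpy as np

def reprojection_error(P, world_pt, image_pt):
    # Project the homogeneous world point through the 3 x 4 matrix P,
    # divide by w, and return the Euclidean distance to the measured point.
    proj = P @ np.append(np.asarray(world_pt, dtype=float), 1.0)
    u, v = proj[0] / proj[2], proj[1] / proj[2]
    return float(np.hypot(u - image_pt[0], v - image_pt[1]))

# Simple check with a matrix that maps (X, Y, Z, 1) to (X, Y): an offset
# of 3 px in x and 4 px in y gives an error of 5.
P_demo = np.array([[1.0, 0.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0, 0.0],
                   [0.0, 0.0, 0.0, 1.0]])
print(reprojection_error(P_demo, (3, 4, 0), (0, 0)))  # 5.0
```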
The table below was formed by taking 12 coordinates instead of 6. You can cross-check it against the 12 coordinate values commented in my program.
Image coordinates (measured using ImageJ) | World coordinates (cm) | Image coordinates (estimated by my program) | Error normalization
(688,394)  | (10,0,2,1) | (694.26,396.1) | 6.32
(794,398)  | (6,0,2,1)  | (800,398)      | 6.0
(946,342)  | (0,0,0,1)  | (940,340)      | 6.32
(1234,398) | (0,12,2,1) | (1245,399)     | 11.0
Error normalization example for (10,0,2,1): √((694.26 − 688)² + (396.1 − 394)²) ≈ 6.32
Errors can be introduced while noting down the world coordinates manually, and if you take a small number of points for the matrix, the error rate when reconstructing the image will be high. Errors can also be introduced numerically during the matrix computations.
Conclusion: For camera calibration one should collect as many points as possible to get an accurate projection matrix. If the number of points collected is small, the error rate will be high. It also depends on how you choose your points: I observed that if we take more points from the left-hand side of the image, the error rate will be higher on the right-hand side, since fewer points were taken from the right side. An equal number of points should be collected from both sides.