Two cameras displaced horizontally from one another obtain two differing views of a scene, similar to human binocular vision. By comparing these two images and examining the relative positions of objects in them, 3D information such as depth can be extracted from the 2D images. This process is called stereo vision. The depth information is encoded in the pixel displacements between the two images, called disparities, which are inversely proportional to scene distance. Given the cameras' intrinsics and their relative pose, 3D point positions can be reconstructed by triangulating corresponding image points. Reconstruction accuracy depends on factors such as the disparity, the baseline distance between the cameras, and the focal length.
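For a rectified stereo pair, the inverse relationship between disparity and distance reduces to the standard formula Z = f · B / d, where f is the focal length in pixels, B is the baseline, and d is the disparity. The sketch below illustrates this; the function name and the numeric values are illustrative assumptions, not taken from any particular system.

```python
import numpy as np

def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Depth Z = f * B / d for a rectified stereo pair.

    disparity_px    : array of disparities in pixels
    focal_length_px : focal length in pixels
    baseline_m      : horizontal distance between camera centers, in meters
    Returns depth in meters; zero/negative disparities map to infinity.
    """
    d = np.asarray(disparity_px, dtype=float)
    depth = np.full_like(d, np.inf)        # no disparity => point at infinity
    valid = d > 0
    depth[valid] = focal_length_px * baseline_m / d[valid]
    return depth

# Illustrative numbers: f = 700 px, B = 0.12 m.
# A disparity of 42 px gives Z = 700 * 0.12 / 42 = 2.0 m;
# halving the disparity to 21 px doubles the depth to 4.0 m.
print(depth_from_disparity(np.array([42.0, 21.0]), 700.0, 0.12))
```

Note how the inverse relation makes depth resolution degrade with distance: far points produce small disparities, so a one-pixel matching error translates into a much larger depth error than it would for nearby points.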