1. Generation and Weighting of 3D Point Correspondences for Improved Registration of RGB-D Data
Kourosh Khoshelham
Daniel Dos Santos
George Vosselman
2. MAPPING BY RGB-D DATA
RGB-D cameras such as Kinect have great potential for indoor mapping;
Kinect captures depth and colour images at ~30 fps, yielding a sequence of coloured point clouds.
[Figure: the Kinect sensor, with the IR emitter, RGB camera, and IR camera labelled.]
3. REGISTRATION OF RGB-D DATA
Mapping requires registration of consecutive frames;
Registration: transforming all point clouds into one coordinate system (usually that of the first frame).
For point $i$ in frames $j-1$ and $j$:

$\mathbf{X}_{i,j-1} = \mathbf{R}_j^{j-1}\,\mathbf{X}_{i,j} + \mathbf{t}_j^{j-1}$

where $\mathbf{R}_j^{j-1}$ and $\mathbf{t}_j^{j-1}$ define the transformation from frame $j$ to frame $j-1$.
4. REGISTRATION BY VISUAL FEATURES
Extraction and matching of keypoints is done more reliably in RGB images than in depth images;
Two main components:
i. keypoint extraction and matching (SIFT, SURF, …);
ii. outlier detection (RANSAC, M-estimator, …).
Pipeline (a sketch follows below):
SURF matches → conversion to 3D correspondences (using depth data) → removal of outliers (RANSAC) → least-squares estimation of registration parameters.
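A minimal sketch of the front end of this pipeline, using OpenCV with SIFT in place of SURF (SURF requires opencv-contrib); the ratio-test threshold, the intrinsics, and the naive back-projection are illustrative assumptions, not the authors' implementation:

```python
import cv2
import numpy as np

def match_keypoints(rgb_prev, rgb_curr):
    """Detect and match keypoints in two consecutive RGB frames."""
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(rgb_prev, None)
    kp2, des2 = sift.detectAndCompute(rgb_curr, None)
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    # Lowe's ratio test to discard ambiguous matches
    good = [m for m, n in matcher.knnMatch(des1, des2, k=2)
            if m.distance < 0.7 * n.distance]
    pts_prev = np.float32([kp1[m.queryIdx].pt for m in good])
    pts_curr = np.float32([kp2[m.trainIdx].pt for m in good])
    return pts_prev, pts_curr

def naive_backproject(pts, depth, fx, fy, cx, cy):
    """Convert 2D keypoints to 3D points by indexing the depth image
    directly at the keypoint location (pinhole model). Note: this ignores
    the relative orientation between the RGB and IR cameras, which is
    exactly the problem addressed on the following slides."""
    u, v = pts[:, 0], pts[:, 1]
    z = depth[v.astype(int), u.astype(int)]
    return np.column_stack(((u - cx) * z / fx, (v - cy) * z / fy, z))
```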
5. CHALLENGES AND OBJECTIVES
Challenge:
Pairwise registration errors accumulate, resulting in a deformed point cloud.
Objective:
More accurate pairwise registration by:
i. Accurate generation of 3D correspondences from 2D points;
ii. Weighting 3D point pairs based on random error of depth.
6. GENERATION OF 3D POINT CORRESPONDENCES
2D keypoints → 3D point correspondences? (ill-posed)
Do RGB image coordinates relate to depth image coordinates by a simple shift?
Note: the fields of view of the RGB camera and the IR camera are different!
Our approach:
Transform 2D keypoints from the RGB image to the depth image using the relative orientation between the two cameras;
Search along the epipolar line for the correct 3D coordinates.
Note: relative orientation parameters are estimated during calibration.
7. GENERATION OF 3D POINT CORRESPONDENCES
More formally:
Given a keypoint in the RGB frame:
1. calculate the epipolar line in the depth frame using the relative
orientation parameters;
2. define a search band along the epipolar line using the minimum and
maximum of the range of depth values (0.5 m and 5 m respectively);
For all pixels within the search band:
1. calculate 3D coordinates and re-project the resulting 3D point back
to the RGB frame;
2. calculate and store the distance between the reprojected point and
the original keypoint;
Return the 3D point whose re-projection has the smallest distance
to the keypoint.
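A minimal sketch of this search, assuming hypothetical camera-model helpers (depth_to_3d for the calibrated IR camera, project_to_rgb applying the relative orientation) and a precomputed list of pixels inside the epipolar band; all names are illustrative, not the authors' code:

```python
import numpy as np

def find_3d_correspondence(keypoint_rgb, depth_img, band_pixels,
                           depth_to_3d, project_to_rgb):
    """Return the 3D point in the depth frame whose re-projection into the
    RGB frame is closest to the given 2D keypoint.

    band_pixels: (u, v) depth-image pixels inside the epipolar search band,
    bounded by the depth range 0.5-5 m.
    depth_to_3d / project_to_rgb: calibrated camera-model helpers (assumed).
    """
    best_point, best_dist = None, np.inf
    for (u, v) in band_pixels:
        z = depth_img[v, u]
        if not (0.5 <= z <= 5.0):       # valid Kinect depth range
            continue
        X = depth_to_3d(u, v, z)        # 3D point in the depth-camera frame
        x_rgb = project_to_rgb(X)       # re-project via relative orientation
        dist = np.linalg.norm(x_rgb - keypoint_rgb)
        if dist < best_dist:
            best_point, best_dist = X, dist
    return best_point
```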
8. GENERATION OF 3D POINT CORRESPONDENCES
[Figure: finding 3D points in the depth image (right) corresponding to 2D keypoints in the RGB image (left) by searching along epipolar lines (red bands).]
9. ESTIMATING RELATIVE ORIENTATION PARAMETERS
Relative orientation between the RGB camera and IR camera:
approximate by a shift;
estimate by stereo calibration;
estimate by space resection.
[Figure: manually measured markers in the disparity image (left) and colour image (right), used for the estimation of relative orientation parameters by space resection.]
10. WEIGHTING OF 3D POINT CORRESPONDENCES
Observation equation in the estimation model:

$\mathbf{v}_i = \mathbf{X}_{i,j-1} - \mathbf{R}_j^{j-1}\,\mathbf{X}_{i,j} - \mathbf{t}_j^{j-1}$

Approximate as:

$\mathbf{v}_i \approx \mathbf{X}_{i,j-1} - \mathbf{X}_{i,j}$

Note: because of the high frame rate, the transformation parameters between consecutive frames are quite small.

Define weights as:

$w_i = \frac{k^2}{\sigma_{v_i}^2} = \frac{k^2}{\sigma_{\mathbf{X}_{i,j-1}}^2 + \sigma_{\mathbf{X}_{i,j}}^2}$
11. WEIGHTING OF 3D POINT CORRESPONDENCES
We use the random error of depth only;
Relation between disparity ($d$) and depth ($Z$):

$Z^{-1} = c_0 + c_1\,d$

where $c_0$ and $c_1$ are calibration parameters.

Propagation of variance:

$\sigma_Z^2 = c_1^2\,Z^4\,\sigma_d^2$

Substituting into $w_i = k^2/\sigma_{v_i}^2$ gives the weight:

$w_i = \frac{k^2}{c_1^2\,\sigma_d^2\left(Z_{i,j-1}^4 + Z_{i,j}^4\right)}$
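A minimal sketch of computing these weights and using them in a weighted least-squares estimate of R and t (a weighted Kabsch/Umeyama solution via SVD, named here as a standard substitute for the authors' least-squares estimator); c1, sigma_d, and the function names are illustrative assumptions:

```python
import numpy as np

def depth_weights(Z_prev, Z_curr, c1, sigma_d, k=1.0):
    """Weights w_i = k^2 / (c1^2 sigma_d^2 (Z_{i,j-1}^4 + Z_{i,j}^4))."""
    return k**2 / (c1**2 * sigma_d**2 * (Z_prev**4 + Z_curr**4))

def weighted_rigid_transform(P, Q, w):
    """Weighted least-squares rigid transform mapping points Q (frame j)
    onto P (frame j-1): returns R, t with P ~ R @ Q_i + t."""
    w = w / w.sum()
    p_mean = (w[:, None] * P).sum(axis=0)
    q_mean = (w[:, None] * Q).sum(axis=0)
    Pc, Qc = P - p_mean, Q - q_mean
    H = (w[:, None] * Qc).T @ Pc           # weighted cross-covariance
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T                     # enforce a proper rotation
    t = p_mean - R @ q_mean
    return R, t
```

With uniform weights this reduces to the unweighted estimator (the "without weight" registrations in the results).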
12. RESULTS: ACCURACY OF 3D POINT CORRESPONDENCES
[Figure: accuracy of 3D point correspondences with the relative orientation approximated by a shift.]
13. RESULTS: ACCURACY OF 3D POINT CORRESPONDENCES
[Figure: accuracy of 3D point correspondences with the relative orientation estimated by stereo calibration.]
14. RESULTS: ACCURACY OF 3D POINT CORRESPONDENCES
[Figure: accuracy of 3D point correspondences with the relative orientation estimated by space resection.]
15. EFFECT OF WEIGHTS IN REGISTRATION
Six RGB-D sequences of an office environment;
Trajectories formed closed loops;
Evaluation by closing error:
Closing rotation $\mathbf{R}$ and closing translation $\mathbf{v}$ are obtained from:

$H_n^1\,H_1^n = \begin{bmatrix} \mathbf{R} & \mathbf{v} \\ \mathbf{0}^T & 1 \end{bmatrix}$

where $H_1^n$ is the transformation from the first frame to the last frame accumulated over the pairwise registrations, and $H_n^1$ is the transformation from the last frame to the first frame; for an error-free registration the product is the identity, so $\mathbf{R}$ and $\mathbf{v}$ quantify the loop-closing error.
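A minimal sketch of evaluating this closing error from 4×4 homogeneous transforms, assuming the per-pair estimates have been composed into H_1_n and the loop-closing transform H_n_1 is available; converting the residual rotation to an angle via the trace is a standard identity:

```python
import numpy as np

def closing_error(H_1_n, H_n_1):
    """Compose the accumulated first-to-last transform with the
    last-to-first transform; ideally the product is the identity.
    Returns (closing angle in degrees, closing distance)."""
    D = H_n_1 @ H_1_n                 # 4x4 residual transform
    R, v = D[:3, :3], D[:3, 3]
    # Rotation angle from the trace: cos(theta) = (tr(R) - 1) / 2
    cos_theta = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    angle_deg = np.degrees(np.arccos(cos_theta))
    return angle_deg, np.linalg.norm(v)
```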
16. EFFECT OF WEIGHTS IN REGISTRATION
Closing distance for the six sequences registered with and without weights:
[Figure: closing distance per sequence.]
17. EFFECT OF WEIGHTS IN REGISTRATION
Closing angle for the six sequences registered with and without weights:
[Figure: closing angle per sequence.]
18. EFFECT OF WEIGHTS IN REGISTRATION
Average closing errors for registrations with and without weights:

Registration    | Average closing distance [cm] | Average closing angle [deg]
without weight  | 6.42                          | 6.32
with weight     | 3.80                          | 4.74
19. EFFECT OF WEIGHTS IN REGISTRATION
The trajectory obtained by weighted registration (blue) is more accurate than the one obtained without weights (red).
[Figure: the two estimated trajectories.]
22. CONCLUSIONS
Accurate transformation of keypoints from RGB space to 3D space yields more accurate registration of consecutive frames;
Assigning weights based on the random error of depth improves the accuracy of pairwise registration and of the sensor pose estimates;
Using the weights, covariance matrices can be derived for the pose vectors and used to weight the pose vectors in the global adjustment, leading to more accurate loop closure;
The influence of synchronization errors (between the RGB and IR cameras) may be addressed by fine registration using point and plane correspondences extracted directly from the point cloud.
25. Measurement principle of Kinect
Depth measurement by triangulation:
The laser source emits a laser beam;
A diffraction grating splits the beam to create a pattern of speckles
projected onto the scene;
The speckles are captured by the infrared camera;
The speckle image is correlated with a reference image obtained by
capturing a plane at a known distance from the sensor;
The result of the correlation is a disparity value for each pixel, from which depth can be calculated.
[Figure: IR image of the speckle pattern projected onto the scene, and the resulting disparity image.]
26. Depth-disparity relation and calculation of point coordinates
From triangle similarities:

$Z_k = \frac{Z_o}{1 + \frac{Z_o}{fb}\,d}$

and:

$X_k = -\frac{Z_k}{f}\,(x_k - x_o + \delta x)$

$Y_k = -\frac{Z_k}{f}\,(y_k - y_o + \delta y)$

where:
$Z_o$: distance of the reference plane;
$f$: focal length of the IR camera;
$d$: measured disparity;
$b$: base length between the emitter and the IR camera;
$x_k, y_k$: image coordinates of point $k$;
$x_o, y_o$: principal point offsets;
$\delta x, \delta y$: lens distortion corrections.
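A minimal sketch of this conversion, assuming calibrated values for f, b, Zo and the principal point; the lens-distortion corrections are set to zero for brevity:

```python
import numpy as np

def disparity_to_point(xk, yk, d, f, b, Zo, xo=0.0, yo=0.0,
                       dx=0.0, dy=0.0):
    """Convert an image point (xk, yk) with measured disparity d into
    3D coordinates (Xk, Yk, Zk), following the depth-disparity relation.
    dx, dy: lens-distortion corrections (zero here for brevity)."""
    Zk = Zo / (1.0 + (Zo / (f * b)) * d)   # depth from disparity
    Xk = -(Zk / f) * (xk - xo + dx)        # back-project to X
    Yk = -(Zk / f) * (yk - yo + dy)        # back-project to Y
    return np.array([Xk, Yk, Zk])
```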
27. Calibration
Calibration procedure:
Standard calibration of the IR camera:
focal length ($f$);
principal point offsets ($x_o$, $y_o$);
lens distortion coefficients (in $\delta x$, $\delta y$).
Depth calibration:
base length ($b$);
distance of the reference pattern ($Z_o$).

Normalization of the measured disparity: $d = m\,d' + n$, with $d'$ the normalized disparity. Substituting into the depth equation gives:

$Z_k^{-1} = \left(\frac{m}{fb}\right) d' + \left(Z_o^{-1} + \frac{n}{fb}\right)$

where $\frac{m}{fb}$ and $Z_o^{-1} + \frac{n}{fb}$ are the depth calibration parameters (the $c_1$ and $c_0$ used earlier).
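A minimal sketch of estimating these two depth-calibration parameters by linear least squares from observations of normalized disparity at known depths; the variable names and the use of np.polyfit are illustrative:

```python
import numpy as np

def fit_depth_calibration(d_norm, Z_true):
    """Fit Z^-1 = c1 * d' + c0 by linear least squares.
    d_norm: normalized disparities d'; Z_true: known depths (e.g. planes
    at measured distances). Returns (c0, c1), where c1 = m/(fb) and
    c0 = Zo^-1 + n/(fb)."""
    c1, c0 = np.polyfit(d_norm, 1.0 / Z_true, deg=1)
    return c0, c1

def depth_from_disparity(d_norm, c0, c1):
    """Invert the calibrated model to get depth from normalized disparity."""
    return 1.0 / (c0 + c1 * d_norm)
```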
28. Theoretical model of depth random error
Depth equation:

$Z_k^{-1} = \left(\frac{m}{fb}\right) d' + \left(Z_o^{-1} + \frac{n}{fb}\right)$

Propagation of variance (derivation below) gives the depth random error:

$\sigma_{Z_k} = \left(\frac{m}{fb}\right) Z_k^2\,\sigma_{d'}$

The random error is a quadratic function of depth.
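The propagation step written out (standard first-order variance propagation, with $u = Z_k^{-1}$ so that $\sigma_u = \frac{m}{fb}\,\sigma_{d'}$):

$$Z_k = \frac{1}{u}, \qquad \left|\frac{\partial Z_k}{\partial u}\right| = \frac{1}{u^2} = Z_k^2, \qquad \sigma_{Z_k} = Z_k^2\,\sigma_u = \left(\frac{m}{fb}\right) Z_k^2\,\sigma_{d'}.$$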
29. Depth random error
Standard deviation of
plane fitting residuals as
a measure of depth
random error;
As expected, depth
random error increases
quadratically with
increasing distance from
the sensor.
[Figure: standard deviation of plane-fitting residuals for planes at 1.0 m, 2.0 m, 3.0 m, 4.0 m, and 5.0 m from the sensor.]
30. Depth resolution
Depth resolution is also proportional to the squared distance from the sensor;
At the maximum range of 5 m, the depth resolution is 7 cm.
[Figure: distribution of plane-fitting residuals on the plane at 4 m distance.]
[Figure: side view of the points on the plane at 4 m (effect of depth resolution).]