Honours_Thesis2015_final

3D Surface Change Detection
by
Marcus Low Junxiang
Submitted to the School of Computer Science
Honours Thesis in fulﬁllment of the requirements for the degree of
Honours Bachelor of Computer Science
at the
UNIVERSITY OF ADELAIDE
October 2015
c UNIVERSITY OF ADELAIDE 2015. All rights reserved.
Submitted to . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
School of Computer Science
6th
November, 2015
Supervised by . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Dr. Tat-Jun Chin
Project Supervisor

by
Marcus Low Junxiang
Submitted to the School of Computer Science
on 6th
November, 2015, in fulfillment of the
requirements for the degree of
Honours Bachelor of Computer Science
Abstract
Change detection across 2D data is a common and well-studied field. The field of
computer vision has many studies on 2D image differencing algorithms. However,
evaluating change of a three-dimensional surface is rare and not commonly applied.
Structure from Motion methods inherently rely on 2D images and are not suitable
for detecting changes in a 3D scene. Other attempted methods at 3D surface change
detection typically use expensive hardware or are not feasible for common house-
hold usage. On the other hand, cheap and simple 3D range-scanning devices are
usually noisy and thus inaccurate. This Honours Project explores various methods
for pre-processing 3D data, point cloud registration, data smoothing techniques and
threshold-based change detection. Finally, a comprehensive pipeline for 3D surface
change detection is proposed.
Project Supervisor: Dr. Tat-Jun Chin

Contents
1 Introduction 1
1.1 Applications of Surface Change Detection . . . . . . . . . . . . . . . 2
1.2 Motivations for 3D surface change detection . . . . . . . . . . . . . . 2
1.3 Goal & Challenges . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.4 Thesis Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
2 Literature Review 6
2.1 Related Work in Surface Change Detection . . . . . . . . . . . . . . . 6
2.2 Related Work in Surface Reconstruction . . . . . . . . . . . . . . . . 7
2.3 Related Work in Moving Least Squares . . . . . . . . . . . . . . . . . 8
2.4 Other Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
3 Research Setting and Surface Representations 9
3.1 Research Setting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
3.2 Point Set Representation . . . . . . . . . . . . . . . . . . . . . . . . . 10
3.3 2D Array Representation . . . . . . . . . . . . . . . . . . . . . . . . . 11
3.4 Discussion of Representations . . . . . . . . . . . . . . . . . . . . . . 11
4 Moving Least Squares 13
4.1 Introduction to MLS . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
4.1.1 The MLS Projection Operator . . . . . . . . . . . . . . . . . . 13
4.1.2 General Method . . . . . . . . . . . . . . . . . . . . . . . . . . 14
4.2 MLS as a Data Smoothing Technique . . . . . . . . . . . . . . . . . . 15
i

4.2.1 Computing the MLS projected surface . . . . . . . . . . . . . 16
4.3 MLS as an Interpolation Technique . . . . . . . . . . . . . . . . . . . 18
4.3.1 2.5D MLS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
4.4 Discussion of MLS . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
5 Dealing with Noise 22
5.1 What Noise in 3D data is . . . . . . . . . . . . . . . . . . . . . . . . 22
5.1.1 Inherent Noise in Sensor Output . . . . . . . . . . . . . . . . . 22
5.1.2 Illuminance & Reflectivity . . . . . . . . . . . . . . . . . . . . 23
5.1.3 Other Factors . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
5.2 Minimising Noise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
5.2.1 Average Over Multiple Rapid Captures . . . . . . . . . . . . . 25
5.2.2 Moving Least Squares . . . . . . . . . . . . . . . . . . . . . . 26
6 Surface Correspondence 27
6.1 Issues in Point Set Correspondence . . . . . . . . . . . . . . . . . . . 27
6.1.1 General Case: Ideal Point Set Correspondence . . . . . . . . . 27
6.1.2 Difference in Resolution . . . . . . . . . . . . . . . . . . . . . 28
6.1.3 The Occlusion Problem . . . . . . . . . . . . . . . . . . . . . . 28
6.1.4 Discussion on Surface Correspondences . . . . . . . . . . . . . 29
6.2 Point Cloud Registration . . . . . . . . . . . . . . . . . . . . . . . . . 29
6.2.1 Point Cloud Registration: 6DOF Optimal Registration . . . . 30
6.2.2 Point Cloud Registration: Iterative Closest Points . . . . . . . 31
6.2.3 Finding an Optimal Alignment Efficiently . . . . . . . . . . . 32
6.3 Reconstructing the Surface . . . . . . . . . . . . . . . . . . . . . . . . 33
6.3.1 Initialising a Larger Frame of Reference . . . . . . . . . . . . . 33
6.3.2 MLS Interpolation . . . . . . . . . . . . . . . . . . . . . . . . 34
6.3.3 Invalidating Occluded Points . . . . . . . . . . . . . . . . . . . 34
6.4 General Algorithm for Finding a Surface Correspondence . . . . . . . 35

7 Differencing 44
7.1 Representing Difference . . . . . . . . . . . . . . . . . . . . . . . . . . 44
7.1.1 Before-to-After Differencing . . . . . . . . . . . . . . . . . . . 44
7.1.2 Absolute Differencing . . . . . . . . . . . . . . . . . . . . . . . 46
7.2 Defining Change . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
7.2.1 Area of Effect vs Magnitude of Change . . . . . . . . . . . . . 46
7.2.2 Percentile of Change as a Threshold . . . . . . . . . . . . . . . 48
7.2.3 Representation of True Change . . . . . . . . . . . . . . . . . 48
7.2.4 Binary Classifier . . . . . . . . . . . . . . . . . . . . . . . . . 48
8 3D Surface Change Detection Pipeline 52
8.1 Summary of Stages . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
8.1.1 Inputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
8.1.2 Alignment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
8.1.3 Reconstruction of Corresponding Surfaces . . . . . . . . . . . 53
8.1.4 Finding Change . . . . . . . . . . . . . . . . . . . . . . . . . . 53
8.2 Algorithm for 3D Surface Change Detection . . . . . . . . . . . . . . 53
9 Experiments 54
9.1 Forming the Dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
9.1.1 Equipment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
9.1.2 Movement in Camera Pose . . . . . . . . . . . . . . . . . . . . 57
9.1.3 Changes Applied . . . . . . . . . . . . . . . . . . . . . . . . . 57
9.2 Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
9.2.1 Effect of MLS Gaussian Parameter h . . . . . . . . . . . . . . 58
9.2.2 Evaluation of the Binary Classifier . . . . . . . . . . . . . . . 60
9.2.3 Optimal vs Non-Optimal Alignment . . . . . . . . . . . . . . . 61
9.3 Discussion on Experiments . . . . . . . . . . . . . . . . . . . . . . . . 62
Bibliography 63

List of Algorithms
1 MLS: General Projection Algorithm . . . . . . . . . . . . . . . . . . . 16
2 Modiﬁed MLS: Simple Projection Procedure . . . . . . . . . . . . . . 17
3 P: 2.5D MLS Projection . . . . . . . . . . . . . . . . . . . . . . . . 19
4 ICP: Registration of Point Clouds . . . . . . . . . . . . . . . . . . . . 31
5 Finding a Rigid-Body Transform . . . . . . . . . . . . . . . . . . . . 32
6 F: Finding Surface Correspondence between R and S . . . . . . . . 38
7 B: 3D Surface Change Detection . . . . . . . . . . . . . . . . . . . . 53
iv

Chapter 1
Introduction
In the ﬁeld of computer vision, many algorithms and well-studied methods exist
to handle two-dimensional data. However, three-dimensional data provides more
realistic representations of real-world surfaces. Point cloud representation and mesh
representation are the two common representations of surfaces, that can be either
modelled directly into a simulated surface such as those used in Computer-Generated-
Imagery (CGI), or obtained from a scanning device. Scanning devices typically use
a laser to obtain a Depth Image, a three-dimensional grid of points whose x and
y-coordinates represent ﬁxed points in space projected onto the sensor, while the
z-coordinate of every point represents the depth of said point from the sensor.
Due to the pinhole nature of laser-based depth scanners, any represented points are
usually calibrated with the position of the sensors, and a three-dimensional structure
can be reconstructed from each progressive scan. Each scan can thus be said to be
the representation of a three-dimensional scene. When the scan is a partial section of
a large continuous real-world surface such as a wall or a landscape, the representation
of the three-dimensional scene can be modeled as a surface in 3D by converting the
depth map obtained by the scanner into point cloud data.
In the case of this project, all 3D data examples (unless otherwise stated) are
obtained by using a Creative Senz3D RGBD Webcam. Full implementations of the
algorithm and its components were written using MATLAB R2015.
1

1.1 Applications of Surface Change Detection
The measurement of physical change over time is a commonly used metric in all forms
of science. Radioactive half-life, molecular changes in organic cells, and electrical
current flow are just some examples of measurements that are used to improve modern
technology. The accuracy of these measurements are dependent on the sensitivity of
the sensors that are used.
The representation of a real-world scene in three-dimensions allows us to measure
the change of said surface over time. Observation of a surface is common in medical
imaging technology, where current systems mostly utilise 2D imagery to reconstruct
anatomical models. It is also common in the observation of geological changes to a
particular landscape over time, e.g temporal analysis of shoreline recession. The two
scenarios are examples of small and large-scale usage of surface change detection.
Other examples of potential applications for surface change detection is in analysis
of localised structural deformation or finding cracks during a manufacturing or main-
tenance process, where minor deformations in the shape of a product can be a valid
condition for failing certain quality standards. One such situation is in the fuselage
and engine checks performed on an aeroplane before it is cleared for take off. Current
industry standards require a trained engineer to manually move his hand across the
fuselage/engine surface to ”feel” for deformations. This is clearly a non-optimal pro-
cedure and subject to human-error, and it shows that a computational method for
detecting surface changes is a promising field of research.
1.2 Motivations for 3D surface change detection
Aside from the issue of removing the human-expert element from surface analysis
procedures by developing a computational method, one of the problems in current
surface change technology is that most methods for surface change detection utilise
a structure-from-motion(SfM) method together with data from 2D images. These
methods work well only with general visual changes, and do not work with depth-

Figure 1-1: Creative Senz3D RGBD
Webcam
Figure 1-2: Microsoft Kinect
Figure 1-3: SwissRanger SR4000
Camera
Figure 1-4: Velodyne HDL LIDAR
Sensor
wise (in the direction of the camera) changes that might be signiﬁcant, but too small
in magnitude for SfM techniques to detect. Other methods that involve true 3D
representations typically utilise high-end Terrestrial Laser Scanners (TLS) or LIDAR,
that are expensive and not at all feasible for consumer usage.
Although cheap range scanners such as the Microsoft Kinect and the Creative
Senz3D Webcam are available to regular consumers, these sensors are inherently
noisy and unable to detect changes any smaller than large bodily movements. This
noise usually hides all trace of a change along a surface, and thus the ability to detect
change is dependent on how much a surface representation is distorted by noise.
RGBD scanners like the Creative Senz3D Webcam captures standard RGB image
data and a depth map, which is essentially a grid with the same dimensional ratio of
the RGB image, where every pixel value depicts a depth value instead of an RGB hex-
adecimal value. This value gives a sense of spatiality to the captured scene, allowing
for applications in 3D hand gesture recognition and basic 3D scene reconstruction.
However, depth data from cheap scanners are typically noisy, and while a general
near-far distinction can be made from the data (e.g detecting a hand much closer to
the camera than the background), depth data is too noisy to distinguish which of two
objects are nearer/further if the depth of both objects are close.

1.3 Goal & Challenges
The goal of this project is to develop an algorithm for detecting real-world changes
to a 3D surface using noisy data obtained at different times.
The main challenge will be investigating the limitations of detecting real-world
changes to a 3D surface using noisy data obtained at different times.
1.4 Thesis Overview
Aside from this introductory section, Chapter 2 describes related work in four fields,
namely in surface change detection, surface reconstruction, moving least squares, and
other related work. Chapter 3 gives a short introduction to two different surface repre-
sentation methods used in computing techniques, and also discusses some limitations
of each of them.
At the core of this research is the Moving Least Squares (MLS) method, first
proposed by [17] and implemented computationally by [1]. Chapter 4 explains MLS
in detail and shows its applicability as both a data interpolation technique and a
smoothing operator.
Chapter 5 discusses the presence of noise in the datasets, and the factors which
affect it. It also describes how noise was minimised in the experiments using various
methods.
In Chapter 6, methods for finding a correspondence between two surfaces is in-
troduced and explained in detail. A consequence of my method for finding a surface
correspondence is the Occlusion Problem, which is described in Sections 6.1.3 and
6.3.3. My approach to solving the Occlusion Problem is also detailed.
Chapter 7 describes the process of representing a difference between two surfaces.
The importance of a threshold is introduced and the various methods of difference
representation is discussed. A combined method for difference representation is pro-
posed for the purposes of this project.
Chapter 8 delivers a summary of each stage of the change detection pipeline, and

will describe the 3D surface change detection algorithm in full. Finally, Chapter 9
describes the experimental set-up of this project and its results, as well as a discussion
of the results and its various factors at each stage of the pipeline.

Chapter 2
Literature Review
This literature review is divided into three sections, namely related work in surface
change detection, surface reconstruction, moving least squares, and other related work
involving the methods that are being researched.
2.1 Related Work in Surface Change Detection
With regards to large-scale surface change detection, [23, 24] conducted research in
a system for detecting general visual changes in tunnel linings by utilising Structure-
from-Motion techniques that reconstructs and matches a 3D model of the tunnel
using geometric priors. Differences detected are limited to two-dimensional changes
such as cracks and graffiti on smooth concrete surfaces. Using Terrestrial Laser Scan-
ners (TLS), [18] also attempted a similar experiment, which performed better on
differences smaller than 15mm, but which proved unstable below 5mm. [9] delivers a
method for detection of changes in the general shape of a tunnel (landscape shape,
not surface) using elliptical fitting algorithms on data also obtained from TLS scans.
The method depends on the pre-determination of five or more fixed control points
within the environment, and uses statistical analysis to profile the deformations to a
tunnel’s shape based on the scans.
With regards to landscape (topological) changes, [3] compares two techniques,
Cloud to Mesh (C2M) and Multiscale Model to Model Comparison (M3C2) on topo-
6

logical data of the Selawik River in Alaska, obtained using airborne and terrestrial
laser scanners to detect local topological change during the seasons over each year.
[12] also developed a software framework for semi-automatic change detection of large-
scale point cloud data also obtained using highly accurate ground lasers. Hausdorff
Distance method was determined to be the most accurate for detecting changes, al-
though the changes detected were still in the larger range (approx. 2-10m)
It should be noted that very little work has been done with regards to small-scale
topological change, such as rock deformation observation and analysis. [20] describes
an experiment on surface deformation analysis that utilises the Least Squares 3D
algorithm (LS3D) by [13] to register the surface, and also to derive the deviation of
the later surface from the prior from the residual in the LS3D method.
In most cases, the usage of highly accurate equipment on large areas seem to
be the general direction where research in 3D surface change detection is concerned.
However our goal of small-scale change detection aims to minimise the problem of
noise in less accurate range scanners to deliver the same result as those obtained
through highly accurate equipment, on a smaller scale. This is more in line with the
research done by [20].
2.2 Related Work in Surface Reconstruction
[2] describes a method of computing a surface from point set surfaces using the Moving
Least Squares method for interpolating a surface from discrete sample points, which
is the core mathematical method behind the work presented in this thesis. [15, 21]
present KinectFusion, a project for surface reconstruction by utilising Structure From
Motion (SFM) techniques to track the movement of a moving depth camera, while
simultaneously adding sample points to a surface. The KinectFusion project uses a
First-In-First-Out buffer to replace old points with new samples, and as such cannot
be considered as a change-detection mechanism, due to the on-the-fly nature of the
depth maps produced by the moving sensor. With this in mind, we can consider an
extension to the KinectFusion project based on the method described in this thesis

as future work.
2.3 Related Work in Moving Least Squares
Moving Least Squares (MLS) was originally presented as a method for surface recon-
struction in [17] and elaborated on in [8]. [2] describes an algorithm for performing the
iterative non-linear minimisation process of projecting the point. It also describes the
factors, most importantly the Gaussian parameter h, that affects the primary smooth-
ness of the final computed surface. [10] describes an adaptive version of MLS that
varies the Gaussian parameter (and hence the weight function) based on feature sizes
on the sampled surface. Independent work done by [11, 19] describe methods for
performing MLS on subsets of the point cloud data in order to reconstruct surfaces
with sharp edges. [5] describes a method for using the discrete implementation of
MLS to perform noise filtering and superresolution of 2D images.
2.4 Other Related Work
For the purposes of this project, [4] introduces the Iterative Closest Point (ICP) algo-
rithm for matching one surface to another by firstly finding corresponding matches of
every point, finding the rigid-body transform that minimises the mean-squared error
between each point correspondence, and repeating the procedure until the computed
rotation and translation fall below a predetermined tolerance. Several methods for
calculating the rigid-body transform via Singular Value Decomposition (SVD) is de-
scribed in [14, 22], which presents a more efficient approximation method that runs
in linear time.
Although the ICP method delivers relatively accurate results if the initial relative
alignment is close, throughout the experiments conducted, it was shown that the lack
of an accurate alignment can introduce errors in the change detection result. For the
purpose of the experiments, the 6DOF method from [7] was used together with ICP
to find globally optimal alignments of our datasets.

Chapter 3
Research Setting and Surface
Representations
This chapter discusses the research setting and the process of obtaining data, as
well as the two general methods of representing a surface using Point Set representa-
tion and 2D Array representation
3.1 Research Setting
The broad problem statement that this research project addresses is as follows:
Given two scans of a real-world surface from a low-cost, 3D scanner at
different times and camera pose, to what degree can we detect accurately
any changes to the surface that occurred between the times when the two
scans were made.
The research setting of the project is such that the surface is scanned at differ-
ent times at different angles, with the raw output from the sensor being used as
inputs. The principal product of this research project is a computational pipeline
that produces a representation of the changes that has occurred. The raw output
of a 3D scanner is a 2D array of depth values (depthmap), although the output can
9

Figure 3-1: Visualisation of a raw depthmap output from a Creative Senz3D Webcam.
Photo courtesy of [16]
vary across sensors, with more sophisticated sensors also reporting illuminance data
or RGB imagery.
For this research project, we use only the raw depthmap output from the scanners
as inputs into the pipeline. Figure 3-1 shows a visualised depthmap, where the depth
values values deﬁne (approximately) the shape and form of the surface that is being
observed.
3.2 Point Set Representation
Point Sets, more commonly referred to as point clouds, are a collection of discrete
3-dimensional sample points in a representative space. This representation method
is the preferred method of representation for 3D surfaces, due to its accuracy and

versatility. The representation of each sample point as a vector of x, y, and z values
also allows for fast and eﬃcient transformation operations to be performed on the
surface.
A point p in a Point Set P can be represented as a vector of x, y, and z values.
3.3 2D Array Representation
The default output of 3D scanners are depthmaps, are a set of depth points relative
to a single observation point. The values are stored as a 2D array, where each (x, y)
value represents the magnitude of the observed depth from the observation point.
Depth maps are the typical raw output that can be obtained from range scanners,
and the values are usually normalised and rescaled using external data (such as the
known position and orientation of the scanner), to create a point cloud. It should
be noted that the depth values represented by the grid do not truly represent a 3D
surface to scale, due to pinhole nature of the camera.
A Point Cloud representation of a surface in 2D array representation thus follows
the form:
pi =





x
y
Dx,y





(3.1)
3.4 Discussion of Representations
A 2D array representation is commonly regarded as the simplest (most raw) form of
depth data that can be obtained. This raw data is always relative to the scanner,
and as such must usually be normalised in order to accurately represent real-world
surfaces. However, in this project, the objective is not to recreate a representation of
the surface, but rather to observe any changes that have occurred on a single surface
at diﬀerent times. Our goal is thus to observe the changes (if any) on a surface relative
to one viewpoint. This presents further obstacles that are discussed in Chapter 4 and
Section 6.1.3.

Figure 3-2: An example of how a range scanner’s distance from a surface affects the
features that it can detect. The range scanner has a fixed field of vision θ and only
records the depths at evenly-spaced fixed points within the field of vision.
It should be noted that the accuracy of both representations is often limited to
the quality of sampling. Since a point cloud is discrete, there exist areas of continuous
space on the real-world surface that, at best, can only be estimated. As most range
scanners are based on the principle of a pinhole observation point, any detectable
features are always larger than the smallest distance between any two depth points.
Figure 3-2 shows an example of how the distance of a range scanner from a surface
affects the accuracy of the features that are detectable by the scanner. This limitation
is a hardware-based limitation and is thus not considered in this project.

Chapter 4
Moving Least Squares
4.1 Introduction to MLS
Moving Least Squares (MLS) is a technique commonly used in surface reconstruction.
It has a unique property of being able to estimate a best-fit surface from sparse point
set data. The procedure is elaborate but elegant.
First proposed by [17] and further elaborated on in [8], the idea behind MLS is to
project every point in a sample set onto a reference plane such that the weighted least
squares difference between the projected point and the other points in its neighbour-
hood is minimised. [2] shows an iterative method for computing an MLS projection
of a sample set, and shows that the gaussian parameter that controls the weighting
factor can be adjusted to control the level of smoothness on a surface.
4.1.1 The MLS Projection Operator
The set of projected points SP is the result of an MLS projection operator P acting
on every point pi in the sample set P. As MLS is a deterministic algorithm, given
fixed parameters, using the projection operator P on the projected set SP will yeild
no changes, that is:
P(SP ) = SP (4.1)
It should be noted that Moving Least Squares(MLS) method is a method for
13

reconstructing both Discrete and Continuous values from disorganised samples. In
this project, we only consider the discrete version of MLS, as we only deal with
discrete points in 3D and 2D.
4.1.2 General Method
Given a set of sampled points P, MLS is performed by first finding a reference domain
H (plane in R3
) to each point r in the sample set R, that minimises the local weighted
sum of square distances from pi to H for all pi ∈ P. If the point q is assumed to be
the projection of r onto H, then the weights attached to pi in this step is defined as
a function θ of the distance of pi to q. Given the normal vector ˆn, either given or by
local estimation, q and thus H can be found by locally minimising
N
i=1
( n, pi − D)2
θ( pi − q ) (4.2)
Since q is defined here as a projection of r on H, we can set q = r + tˆn , and
rewrite the equation as
N
i=1
n, pi − r − tˆn 2
θ( pi − r − tˆn ) (4.3)
Where θ is a monotone decreasing weight function.
The next step upon finding q is to discover the local bivariate polynomial approxi-
mation in a neighbourhood of r. This polynomial approximation is represented in the
local coordinate system of H, where q is the origin. It can be automatically observed
that ˆn is the local orthogonal axis at q.
The H-local bivariate polynomial g(xi, yi) is found by computing its coefficients
in order to minimise the weighted least squares error
N
i=1
(g(xi, yi) − fi)2
θ( pi − q ) (4.4)
Where fi is the local height of pi over H, i.e fi = ˆn · (pi − q)

Figure 4-1: Original Surface Scan P Figure 4-2: MLS Surface SP
The MLS projection of r, P(r) is then defined by the polynomial value at the
local origin q, i.e, where the polynomial meets ˆn at the local coordinates (0, 0).
P(r) = q + g(0, 0)ˆn
= r + (t + g(0, 0))ˆn (4.5)
4.2 MLS as a Data Smoothing Technique
The radial weight function θ in Equation 4.2 defines the energy that each point pi
contributes to computing the projection P of r. Since the projection effectively
smoothens the surface by moving each point closer into the general neighbourhood
of its closests points, the weight function values points closer to r more than points
that are further away.
The weight function suggested by [8] is
θ(d) = e− d2
h2
(4.6)
and is a Gaussian function where h is a pre-determined parameter that reflects the
anticipated average spacing between neighbouring points after projection. By using
this weight function, a smaller value for h causes the gaussian function to decay
more rapidly, and the approximation becomes more local, and points beyond the
neighbourhood defined by h become nearly insignificant. The projected surface SP

can be thus adjusted to smooth out features of size < h. (See figure [4-1,4-2])
It should be observed that the further away a neighbouring point is, its weight
tends towards zero. This allows us to trim the neighbourhood beyond a fixed neglect
distance dn since all neighbouring points beyond the neglect distance contributes
insignificantly to the minimisation process in Equation 4.2.
[2] also notes that it is possible to ignore the concept of having a neglect distance
by using smooth compact weight functions, such as:
θ(x) = 2x3
− 3x2
+ 1 (4.7)
Finding an appropriate weight function differs by task and application of the MLS
technique.
4.2.1 Computing the MLS projected surface
The general MLS projection procedure is shown in algorithm 1.
Algorithm 1 MLS: General Projection Algorithm
Input: sample set P, weight function θ, threshold ε
Output: projected set SP
1: for each point pi ∈ P do
2: r ← pi
3: ˆn ← Compute(estimate) the normal of pi
4: while (∆t > ε) (∆ˆn > ε) do
5: t ← minimise equation 4.3 within bounds −h
2
≤ t ≤ h
2
6: q ← r + tˆn
7: q ← minimise q on H : (t, ˆn) using conjugate gradients {note*}
8: new [t, n] ← q
9: end while
10: compute coefficients to g(xi, yi) on local coordinate system H : (t, ˆn) {note**}
11: SP [i] ← r + (t + g(0, 0))ˆn
12: end for
As shown by [2], for input points, where the sample set S is expected to be close
to the surface they define, it is safe to assume that the local plane H passes through
the projected point, i.e P(r) = q. This allows computation of the projected point
to trade accuracy for speed by not having to compute the local bivariate polynomial

(See note** in algorithm 1). The reduced MLS projection procedure is still fairly
strong if the sample set S is obtained from a sensor, as the surface denoted by S is
close to the real-world surface).
Another trade-off of accuracy for speed can be performed by eliminating the need
to fine-tune ˆn (See note* in algorithm 1). This also takes advantage of the fact that
the input points are close to the real-world surface, and are evenly sampled (via
sensor input) and thus the normals at each point can be estimated with a relatively
high degree of confidence. Implementations of the general MLS procedure with and
without minimising q(note*) deliver very similar results. The results also show that
ˆn does not deviate much from its initial estimate.
It is worth noting that as opposed to surface reconstruction, where the accuracy
of the final projected surface is important, because change detection is ultimately a
binary classification problem, utilising MLS in change detection only requires that
the same projection function is applied to both surfaces, and does not require an
accurate shape representation of either surface.
Algorithm 2 Modified MLS: Simple Projection Procedure
Input: sample set P, weight function θ, threshold ε
Output: projected set SP
1: for each point pi ∈ P do
2: r ← pi
3: ˆn ← Compute(estimate) the normal of pi
4: t ← minimise equation 4.3 within bounds −h
2
≤ t ≤ h
2
5: SP [i] ← r + tˆn
6: end for
The projection procedure can thus be optimised to only minimise along t for each
r. As minimisation on q is not performed, there is no change to the normal ˆn, ie. ∆ˆn
does not exist. This significantly decreases the runtime complexity of the modified
MLS procedure by eliminating the need for the while loop condition (comparing ∆t
and ∆ˆn with ε).
This allows for a much faster MLS projection procedure that delivers our goal of
smoothing a surface in preparation for change detection, without the need to worry
about a very accurate representation of its shape.

The modified MLS projection procedure for noise smoothing of a surface is shown
in algorithm 2.
4.3 MLS as an Interpolation Technique
This section discusses the usage of MLS as a discrete value interpolation technique.
The need for interpolation arises when a set of regular discrete values are not available
(e.g. from a random sampling of points). This is most commonly seen when projecting
point cloud data onto the plane orthonormal to the z-axis. By interpolating the points
onto a grid, the data can be transformed into depth maps and used for comparison
against other 2D data.
4.3.1 2.5D MLS
For this project we introduce a modified version of MLS that operates the Projection
step with a fixed normal(scalar estimation of t), while still utilising the weighted
distance in 3D Euclidean space. The projected point is a fixed point in the x and y
dimensions, and only projected in the z-direction.
The goal of 2.5D MLS is to recreate a depth map D∗
representation of a surface
from a given point cloud P. The point cloud P is first normalised such that all x and
y values are positive. The depth map D∗
must be initialised with a pre-determined
width w and height h. We formulate 2.5D MLS as a function P2.5D such that
D∗
= P2.5D(D, P) (4.8)
where D has width w and height h, and the function operator P2.5D has a pre-
determined weight function, Gaussian parameter, and threshold as described in Al-
gorithm 2 and Section 4.3.1.
To obtain the values in the cells of D∗
, for each cell D∗
x,y we first assign it the
z value of its nearest neighbour in P (in 2D Euclidean space). The MLS projection
P(D∗
x,y) = D∗
x,y + t is then found by locally minimising

N
i=1
(pi(3) − D∗
x,y − t)2
θ( pi − q ) (4.9)
where
q =





x
y
D∗
x,y





(4.10)
As described, the MLS projection occurs only in one dimension (x and y are ﬁxed),
and hence the squared error of the projected point from pi is (pi(3)−D∗
x,y −t)2
, which
is a scalar value. However, to accurately project the point, the weight is a function
of its distance to its neighbouring points in 3D Euclidean space. This results in an
interpolated 2D point on the grid D∗
that minimises the weighted least squares error
of its 3D neighbourhood (giving the name 2.5D). Figure 4-3 shows a small example
of 2.5D MLS interpolation on a set of sample points.
Algorithm 3 P: 2.5D MLS Projection
Input: Depth Map D (width w, height h), Point Cloud P
Output: Depth Map D∗
1: for each i ∈ 1, 2, . . . , w do
2: for each j ∈ 1, 2, . . . , h do
3: D∗
i,j ← Initialise as 1NN([i, j], P)
4: t ← minimise equation 4.9
5: D∗
i,j ← D∗
i,j + t
6: end for
7: end for
8: return D∗
4.4 Discussion of MLS
MLS is a technique for recreating continuous functions from disorganised samples,
without a need for a frame of reference, such as mesh-based interpolation methods.
The weighting function allows MLS to be used as a smoothing operator by modifying
the gaussian parameter h. Section 6.1.3 discusses how our method uses both the

Figure 4-3: 2.5D MLS Interpolation on a set of randomly sampled points (in blue).
The result of 2.5D MLS Interpolation is a set of points with integer x and y values
(in orange). The z-direction magnitudes of each interpolated point is a best-ﬁt value
with the smallest weighted least-squares error, where the weighted sum is a function
of the point’s distance to its neighbouring points in 3D Euclidean space.

smoothing and interpolation properties to reconstruct a projected view of a surface.

Chapter 5
Dealing with Noise
This chapter details the various sources of noise in range scanners and how various
scenery artifacts aﬀect them. Our methods of minimising noise from the sensors in
our experiments will also be detailed.
5.1 What Noise in 3D data is
Noise in data predominantly appears as a random series of corruptions that appear
consistently within data. Noise in 3D data (or in our case, 2.5D data) often appear to
be within some range that allows for the data to still be discernible to humans. This
behaviour can be seen in examples of noisy point cloud representations of some 3D
models, where the noise is still below a particular level that allows people to identify
the 3D model easily.
5.1.1 Inherent Noise in Sensor Output
The sensors used in our experiments are the Creative Senz3D RGBD Webcam, and
the Swissranger SR4000. The data output from both sensors contain varying levels
of noise depending on the scenery and distance from the scenery.
Any surface change that is smaller than the amount of noise inherent to a sensor
will be hidden by the noise. The term hidden here refers to the fact that ﬁnding
22

Figure 5-1: Preliminary experimental result showing sensor noise. The mesh graph
on the right shows the raw ”changes” detected in two shots taken rapidly one after
the other, with no changes in the surface or in camera pose.
the difference between two corresponding surfaces will not deliver a result that can
be identified as a change. Figures [5-1-5-3] show how the noise in a standard range
scanning camera can hide the presence of objects that are smaller (in the depth sense)
than the average noise level, even if the object in question covers a significant portion
of the observed surface. Only large changes that far outweigh the noise level can be
clearly observed, as shown in Figure 5-2.
5.1.2 Illuminance & Reflectivity
When capturing data from a scene, both the Senz3D and SR4000 emit infrared light
onto the scene and capture it using the onboard sensor. This allows properties in
the scenery itself to interfere with the infrared light. The illuminance of a scene can
affect the depth scene, especially if the scene is too bright, or if the light source is
directly in front of the sensor, thus creating glare.
The reflectivity of a surface can also affect the angle at which the infrared beams
reflect off a surface. This causes the depth at the surface to appear at a different level
than is accurate. In our experiments, we have found that extremely reflective objects

Figure 5-2: Preliminary experimental result showing sensor noise, with a large change
clearly visible in the raw ”change” map
Figure 5-3: Preliminary experimental result showing sensor noise. The amount of
noise (in the mesh graph on the right) hides the change (a 0.8cm thick smartphone
on the surface)

like mirrors and aluminium foil often appear as blank patches in the depth scene.
This is possibly because the infrared beams are scattered as they hit the reflective
surface, and are lost to the surroundings. As such these beams never return to the
sensor, and the corresponding patch in the frame is registered as infinite.
5.1.3 Other Factors
The noise in a depth data stream is often largely due to the quality of the sensor.
This is also closely related to the price of the camera that houses the sensor. Higher
quality hardware such as the Swissranger SR4000 often includes hardware-based op-
timisations that further increase the accuracy of the output data. Such optimisations
take into account other factors such as the temperature of the sensor, and the repeata-
bility of a measured data point. Repeatability is an arbitrary value characterised by
the spread of a measurement around the mean value and is related to the precision
on a single measurement. Repeatability gives an indication on the noise of a given
measurement.
5.2 Minimising Noise
This section discusses two methods that are used in this project to minimise the
output.
5.2.1 Average Over Multiple Rapid Captures
Data outputs from any form of sensor is typically noisy, but general shape and form
of the surface can still be seen. The noise can thus be characterised to be systemic.
Given a noise threshold Ñ , we can generalise the depth output of a 3D scanner in
the form
D = D ± σ (5.1)
D is an M ×N array of depth values, D is the ideally accurate depth map, and σ

refers to the array of noise values, which are within the continuous range [− Ñ , Ñ ].
Assuming that σ is evenly distributed, a simple method of reducing the amount of
noise in a depth map is to take the average of multiple scans. In our experiments, the
initial stage of capturing a scan involves capturing multiple scans in a rapid fashion
(burst), and finding the depth map of average depth values. As the rapid-capture
functionality of modern cameras are relatively quick, we can assume the averaged
result to be representative of the surface at a single moment in time (epoch).
It should be noted that for the remainder of this thesis, any terms referencing the
output depth map from a sensor is regarded as the averaged depth map over 20 rapid
captures.
5.2.2 Moving Least Squares
As described in Section 4.2, the radial weight function θ and its predetermined Gaus-
sian parameter h define the granularity and smoothness of an MLS-Projected surface,
as h defines the radius of the neighbourhood which influences the projection of a point,
thus smoothing out features which are < h. By assuming a statistically even distri-
bution of noise in the data, the smoothing process virtually decreases the noise in the
data since it finds the best estimate of the surface.

Chapter 6
Surface Correspondence
6.1 Issues in Point Set Correspondence
Given two scans of a surface from different times, namely a before scan and an after
scan, one of the key steps to finding a surface change is identifying which parts of the
before surface correspond to the after surface. Although this may seem like a trivial
task for a human, finding a correspondence between surfaces computationally is often
difficult due to the limited data that can be derived from the surface scans themselves.
This section will describe a general case of finding a point set correspondence, and how
certain real-world factors can make the process of finding a surface correspondence
problematic.
6.1.1 General Case: Ideal Point Set Correspondence
Given two point cloud representations of a surface model R and S, the ideal condition
of finding a point set correspondence is such that each point r ∈ R has a single unique
corresponding point s ∈ S. This is also known as a one-to-one correspondence. In
the case of a one-to-one surface correspondence, finding a surface change is a trivial
matter of finding the difference between each pair of correspondences, that is
∆i = ri − si ∀r ∈ R, ∀s ∈ S (6.1)
27

This general equation shows the ideal condition of a surface correspondence, and
is in contrast to a real world surface correspondence, where surfaces are continuous
instead of discrete. The following sections will discuss how our method reduces the
surface correspondence problem to the general form in Equation 6.1. The general
algorithm is presented in 6.
6.1.2 Difference in Resolution
Figure 3-2 (mentioned in Section 3.4) shows how any particular segment of a surface
can be represented by a different number of depth points depending on the resolution
of the sensor, which is directly affected by changes in the viewing angle and distance of
the sensor to the target surface. Often, obtaining a true one-to-one correspondence is
virtually impossible under realistic conditions and a true correspondence is typically
one-to-many or many-to-one, i.e. any particular segment of the surface is represented
by a different set of points in both R and S (assuming R and S are scans taken
separately from different angles). This affects the resolution of the image not just
across different scans but also locally within a single scan, since surfaces that are
closer to the sensor have a higher points-per-unit-area density. This is similar to the
differences in definition in a 2D RGB image, where objects closer to the camera will
naturally appear more detailed.
6.1.3 The Occlusion Problem
Another issue with attempting to find a correspondence between R and S is the
problem of occlusion, where a subset of R represents a section of the surface that
is not represented in S, and vice versa. When attempting to find a correspondence
between the two representations, occluded points will inevitably exist if R and S
are captured from unique angles. As the surface correspondence method involves
reconstructing the surface with interpolation, interpolating a new point S∗
i,j with no
nearby points in S can leads to a result that may be inaccurate, as shown in Figure
6-4. As such, occluded points must be invalidated to deliver a proper correspondence.

Section 6.3.3 describes a method for determining if a point is a valid, by means of
a range search in 2D Euclidean space.
6.1.4 Discussion on Surface Correspondences
The core issue with finding a surface correspondence is the characteristic of point den-
sity. While a smooth continuous surface is representable by a series of mathematical
expressions, a real-world rough surface can only be represented by discrete samples
of the surface. Obtaining these samples often leads to differences in resolution and
occlusion. Hence, there exist gaps between these samples in which the true surface
cannot be truly known. The naive method of finding a correspondence thus implies
the condition that the surface scans must always be conducted from the same position
and orientation. This is a non-realistic condition and cannot possibly be achieved in
real-world experiments. The issues of difference in resolution, as well as the occlusion
problem are issues that we aim to solve in the following sections.
6.2 Point Cloud Registration
This section describes a three-part pipeline that can be used to find a surface cor-
respondence from two sets of point set data R and S, where a sub-set ˆR ⊂ R and
a sub-set ˆS ⊂ S are both representations of the same real-world surface. ˆR and ˆS
can differ in resolution (different number of points-per-unit-area), and under realistic
conditions a correspondence cannot be found between ˆR and ˆS.
The three stages of the pipeline are as follows:
• Optimal Alignment using Point Cloud Registration
• Invalidation of Occluded Sections of the Surface
• Reconstruction of a Projected View of the Surface using MLS Interpolation

6.2.1 Point Cloud Registration: 6DOF Optimal Registration
Given two point clouds M = {mi}M
i=1 and B = {bi}B
j=1, we use the globally optimal
method described in [7]. The method extends the branch and bound method proposed
by [6] for finding an optimal geometric transformation that aligns two point clouds.
Mathematically, the optimal alignment is a transformations defined by a rotation R
and translation t that maximises the objective function
Q(R, t) =
i
max
j
R(mi + t) − bj ≤ (6.2)
where Q defines a quality of matching. The segment defined by predicate repre-
sents the standard indicator function:
predicate =



1 if predicate is true
0 if predicate is false
(6.3)
The original branch and bound method by [6] works by first considering a bounded
set of isometric transformations Tall where all the contained transformations T : R2
→
R2
. The transformation space T is branched into two disjoint transformation spaces
Tk and Tk+1 such that Tk = T2k ∪ T2k+1. The optimal transformation must be
contained within either of the two spaces. The algorithm uses a priority queue
to ensure that the space bounded by the higher priority ˆQ will be searched first,
that is, Q(Tmax) = ˆQ(Tk) ≥ ˆQ(Tremaining). By recursively dividing the search space
and inserting each sub-space into the priority queue with priority ˆQ, the algorithm
tightens the bound on the search space until the bounds that define the quality of
matching is within an acceptable range. The method is a divide-and-conquer method
for finding the optimal transformation for alignment.
The 6DOF method given in [7] extends the method in [6] by dividing each search
space into eight disjoint sub-spaces instead of just two. The method uses a novel
approach of utilising a 3-dimensional search space, with an efficient bound evaluation
method using stereographical projections. The underlying concept is similar to the
branch and bound method, but is noticeably quicker due to its division of the search

space into 8 sub-sections, each with its own priority ˆQ, that allows the algorithm to
arrive at an optimal alignment faster.
6.2.2 Point Cloud Registration: Iterative Closest Points
One of the more popular methods for co-registration of point cloud data is the Iter-
ative Closest Points (ICP) method described in [4]. This method involves iteratively
finding the transformation that minimises the mean-squared error between each point
si ∈ S and its closest point ri ∈ R. This transformation is defined by a rotation R
and a translation t such that:
(R, t) = arg min
R,t
n
i=1
wi (Rsi + t) − ri
2
(6.4)
At each iteration of ICP, each point si ∈ S is matched to its closest point ri ∈ R,
and a rigid-body transformation is found and applied to S. The process is repeated
until the the change in R and t is below a pre-determined threshold . To find the
rigid-body transformation in each iteration, we use the SVD decomposition method
detailed in [22]. Algorithms 4 and 5 describe the process of ICP and finding the
rigid-body transform respectively.
Algorithm 4 ICP: Iterative Closest Point
Input: Reference set R, Source Set S
Output: Co-Registered Source Set S
1: S ← S
2: while ∆R, ∆t > do
3: for each si in S do
4: ri ← 1NN(si, R)
5: end for
6: [R, t] = svdRigidBodyTransform(R, S )
7: S ← RS + t
8: end while
9: return S

Algorithm 5 SVD Method for calculating Rigid-Body Transform
Input: reference set M, source set S
Output: rotation matrix R, translation vector t
1: ¯m ← 1
n
n
i=1
mi, mi ∈ M
2: ¯s ← 1
n
n
i=1
si, si ∈ S
3: Mcentered ← {m1 − ¯m, ..., mn − ¯m}
4: Scentered ← {s1 − ¯s, ..., sn − ¯s}
5: C ← McenteredST
centered
6: [U, S, V T
] ← svd(C)
7: R ← Udiag(1, 1, ..., det(UV T
))V T
8: t ← ¯s − Rot ¯m
9: return R, t
6.2.3 Finding an Optimal Alignment Efficiently
One of the major drawbacks of utilising a fast method like ICP for co-registration of
point cloud data is the necessary condition that the initial state of both point clouds
must be close enough for the registration to find an optimal alignment. This is in
contrast to the 6DOF method, where branch-and-bound segmentation of the search
space disregards the intial state of the point clouds, at the expense of being noticeably
slower compared to ICP.
To compute an optimal alignment efficiently, we apply 6DOF with a looser bound
threshold to first compute an approximate near-optimal alignment of S on R. The
result is then used as the initial state for alignment with the ICP method.
It should be noted that finding an optimal alignment is necessary for finding a
correct surface correspondence between R and S, as an incorrect alignment will cause
the interpolation of S∗
(described in Section 6.3) to return an inaccurate result.

6.3 Reconstructing the Surface
6.3.1 Initialising a Larger Frame of Reference
After S is aligned to R, forming S , we can observe that R has remained unchanged
throughout the pipeline up to this point, and as such can still be represented both as
a 2D array of depth values or as a point cloud. In contrast, under the transformation
applied during alignment, the x and y values of each point s ∈ S are possibly non-
integers, and hence S can no longer be represented in cell-grid form.
We ﬁrst translate R and S such that all (x, y) coordinates of both R and S are
positive. This is a pre-processing step to prepare the data for interpolation onto a
2D array, where the indices are non-negative, ie. i, j ∈ Z+
. To ensure that R will still
be representable as a 2D array, the translation must be a 2D translation with integer
values, ie.
t =


δx
δy

 δx, δy ∈ Z+
(6.5)
Note that the translation is only done if any point in S has non-negative x or y-
coordinates. In the case where the x,y-coordinates of all points in S are still positive
after alignment, this step can be ignored.
Next, we initialise R∗
and S∗
as two large 2D arrays of width w and height h such
that each of the (x, y) coordinates of all points in R and S can be represented within
a grid of the same width and height.
0 ≤ xMin < xMax ≤ w
0 ≤ yMin < yMax ≤ h
(6.6)
where xMin, yMin, and yMin, yMax are the minimum and maximum x, y-
coordinates respectively in both R and S . To simplify, we can set
w = ceiling(xMax)
h = ceiling(yMax)
(6.7)

6.3.2 MLS Interpolation
The final step to finding a surface correspondence between R and S is to reconstruct
the surface representations R∗
and S∗
, both of which are 2D arrays of width w and
height h. This step is done by applying the 2.5D MLS projection operator described
in 4.3.1 on R∗
and S∗
at this stage of the pipeline, using the weights from points in
R and S respectively:
R∗
= P2.5D(R∗
, R)
S∗
= P2.5D(S∗
, S )
(6.8)
It should be noted that the pre-processing step of initialising a larger array is
part of the 2.5D MLS projection procedure, although it is detailed in the previous
sub-section.
Another observation is the fact that R∗
is simply a smoothened version of R
inserted in a larger 2D array, as the x, y-coordinates of R are already integers.
6.3.3 Invalidating Occluded Points
As shown in Section 6.1.3, R and S may contain points representing parts of the
scanned surface which do not appear in the other. This problem can be overcome
by reconstructing only parts of the surface of which both R and S contain sample
points. This ensures the relative local accuracy of points in the reconstructed surface
S∗
.
To find which points to invalidate in S∗
, we identify a valid neighbourhood distance
dvalid such that a coordinate S∗
i,j is valid only if there exists a single point in S
(projection of S that is orthonormal to the z-axis) within the neighbourhood in 2D
euclidean space, that is
valid(S∗
i,j) =



true if rangeSearchx,y([i, j], S , dvalid) ≥ 1
false otherwise
(6.9)
where rangeSearchx,y([i, j], S , dvalid) is a function that searches for the total num-

ber of neighbouring points from [i, j] that falls within the valid neighbourhood distance
in on the x, y-plane in 2D Euclidean space.
In our implementation, we set dvalid to
√
0.5. This value is derived from the
observation that valid x, y-coordinates can be any of the four coordinates that form
a 1unit-wide box around the points. Hence,
dvalid =
√
0.52 + 0.52 =
√
0.5 (6.10)
Figures 6-1 and 6-2 visualises how valid/invalid x, y-coordinates for the recon-
structed surface S∗
are selected using dvalid =
√
0.5.
It should be noted that the invalidation process is independent of the 2.5D MLS
projection procedure within the pipeline, that is, invalidation of occluded points can
be performed after the projection procedure, or vice versa. This is due to the fact
that MLS is a deterministic algorithim that operates on each point independently of
the projection of other points. In our implementation, we implement validation of
each x, y-coordinate as a branching step before performing MLS projection for said
coordinate, so that no computational resources are wasted performing MLS projection
on coordinates that fall within an invalid region.
6.4 General Algorithm for Finding a Surface Cor-
respondence
Given a representative surface R and a model surface S, the pipeline reconstructs two
new representations R∗
and S∗
such that R∗
has a one-to-one correspondence with
S∗
. Diﬀerences in the surface can then be found using Equation 6.1. It should be
noted that R and S are regarded as surface scans in 2D array representation, although
they can be represented as Point Cloud data as well. In contrast, the reconstructed
surfaces R∗
and S∗
are 2D array Representations of the surface from the same viewing
angle as R.
We can generalise the method of ﬁnding a correspondence as a function F that

Figure 6-1: A projection of a point cloud on the x, y-plane in 2D Euclidean Space.
The purple neighbourhood contains at least one point, so (6, 8) is a valid point in S∗
.
The red neighbourhood contains no points, so (3, 6) is not a valid point in S∗

Figure 6-2: The valid points in S∗
after the validation step using the data shown in
Figure 6-1

takes in two representations of a surface R and S, and returns R∗
and S∗
, such that
R∗
i,j corresponds to S∗
i,j.
A diﬀerence can thus be found by re-writing Equation 6.1:
∆i,j = R∗
i,j − S∗
i,j i ∈ {1, 2, . . . , w}, j ∈ {1, 2, . . . , h} (6.11)
Algorithm 6 shows the general algorithm formed from the methods described in
the previous sections. For simplicity of understanding, Figures [6-3 - 6-7] show a 2D
analogical version of the process of ﬁnding a surface correspondence.
Algorithm 6 F: Finding Surface Correspondence between R and S
Input: Point Clouds R and S
Output: 2D arrays R∗
and S∗
1: S ← Find optimal alignment of S onto R
2: w ← ceiling xMax
3: h ← ceiling yMax
4: R∗
, S∗
← Initialise 2D array of width w and height h
5: for i ← 1tow do
6: R∗
i,j ← Ri,j if point exists
7: end for
8: if rangeSearch([i, j], Sx,y, dvalid) ≥ 1 then
9: S∗
i,j ← P2.5D(RRx,y)
10: end if
11: return R∗
and S∗

Figure 6-3: 1: Observing a surface from two separate angles can result in both scans
containing sample points from diﬀerent areas of the surface, resulting in an incorrect
correspondence, despite an optimal alignment. The green points refer to points in R,
and the red points refer to points in S

Figure 6-4: 2: Using 2.5D MLS to interpolate depth values for S∗
at the areas
corresponding to the sample points of R

Figure 6-5: 3: An interpolated point is invalidated if there exist no points in the
original data within the valid neighbourhood distance

Figure 6-6: 4: The MLS property causes the data to smoothen.

Figure 6-7: 5: The two corresponding results can be used to ﬁnd changes.

Chapter 7
Differencing
Changes in a 3D surface can be defined as an amount by which a representation
of an after surface S∗
deviates from a representation of a before surface R∗
, where R∗
and S∗
have a one-to-one correspondence. The previous chapters describe in detail
how two raw data scans with no immediate correspondence R and S can be processed
to form R∗
and S∗
. This chapter details the method of determining change from the
processed surfaces.
7.1 Representing Difference
This section describes two methods for representing change between R∗
and S∗
. The
difference is represented as a change map ∆, which is a 2D array with the same
dimensions as both R∗
and S∗
.
7.1.1 Before-to-After Differencing
The simple method for determining change is to obtain a difference between two
corresponding surfaces:
∆simple
i,j = S∗
i,j − R∗
i,j (7.1)
There exists the possibility that either S∗
or R∗
contain NULL values, due to
occlusion. To prepare for such a situation we set an additional condition:
44

Figure 7-1: Simple Differencing: The red and blue segments represents positive and
negative change values respectively, while the light green represents values close to
zero.
a − b =



NULL if a, b = NULL
a − b otherwise
(7.2)
Note that obtaining ∆simple
using 7.1 defines the values of changes in terms of
R∗
. As the sensors measure depth (distance from surface to sensor), positive change
values in ∆simple
reflect indentations(increased depth) made in the surfaces over time,
ie. R∗
→ S∗
, while negative change values reflect extrusions(decreased depth). Figure
7-1 shows an example of a surface where indentations and extrusion changes exist.

7.1.2 Absolute Differencing
A second method for determining change is to obtain the magnitude (absolute differ-
ence) between two corresponding surfaces:
|∆|i,j = |S∗
i,j − R∗
i,j| (7.3)
Finding the absolute difference allows for a measurement of total change across the
surface, regardless of whether it is an indentation or an extrusion. Figure 7-2 shows
the same change map as the one shown in Figure 7-1 using absolute differencing. In
our pipeline for determining change, we use magnitude of change as the measurement
of change, with no distinction as to the nature of the change. It should be noted that
the two methods are inter-useable, depending on the objectives of the application.
7.2 Defining Change
This section describes the method of determining if a change map contains true
change, and the various ways of representing them, if required by the application.
7.2.1 Area of Effect vs Magnitude of Change
Unlike changes in 2D imagery, where changes in colour are usually the major concern,
the RGB value of changes in a 3D surface is not within the scope of this work. This
project is concerned primarily with the magnitude of change, and its area of
change on the surface. This means that changes in the 2D arrays R∗
and S∗
can be
determined to be changes not just by the amount of total changes detected, but also
by the size of the changed area.
”Acceptable” levels of change typically vary across applications: A landscape sur-
veyor may choose to ignore small changes and only detect large areas of change with
less overall change in magnitude, while an expert inspecting the integrity of a mining
wall for shifts in the rock structure may be more interested in large shifts in the rock
over a smaller area.

Figure 7-2: Absolute Diﬀerencing: Change values are ≥ 0

We thus have two factors to consider when determining if a change map repre-
sents a true change or not. Section 7.2.2 describes a threshold for magnitude of
change, and a threshold for area of change is used in the Binary Classifier for
change detailed in Section 7.2.4
7.2.2 Percentile of Change as a Threshold
Given a change map ∆ and a threshold of magnitude ϑ where 0 ≤ ϑ ≤ 1, we can find
the scalar value ϑt that corresponds to ϑ within the entire range of change values in
∆. To use statistical terminology, we can say that change values in the nth
percentile
of ∆ are true changes, where n = 100ϑ.
ϑt = min(∆) + ϑ(max(∆) − min(∆)) (7.4)
A value in ∆ can thus be regarded as true change if it is ≥ ϑt.
7.2.3 Representation of True Change
To represent true changes, we define a true change map ∆ that only shows changes
in ∆ if the changes pass a threshold:
∆x,y =



∆x,y if ∆x,y ≥ ϑt
0 otherwise
(7.5)
Figure 7-3 shows ∆x,y using the same data that was used to visualise Figures 7-1
and 7-2.
7.2.4 Binary Classifier
At the end of the pipeline for 3D Surface Change Detection, the ultimate goal is to
detect if a change has occurred between two input representations of a surface. The
output is thus a single true/false value.

Figure 7-3: True Changes: Only values that are above the threshold tϑ are shown

Figure 7-4: Binary Differencing: Red pixels define areas in the change map that are
above the change threshold tϑ
To accomplish this, we first rewrite Equation 7.5 such that the true change map
∆ becomes a 2D array of logical values:
∆∗
x,y =



1 if ∆x,y ≥ tϑ
0 otherwise
(7.6)
Figure 7-4 shows ∆∗
using the same data that was used to visualise Figures 7-1
and 7-2.
The binary classification for detected change can then be expressed as the number

of cells in ∆∗
that passes an area of change threshold aϑ:
classification = (
w,h
i,j
∆∗
i,j) ≥ aϑ (7.7)
where Equation 7.7 follows the standard indicator function described in Section
6.2.1, and aϑ is an integer representing the size of the change threshold in pixels.

Chapter 8
Pipeline
This chapter summarises all the stages of the pipeline and describes the recom-
mended data structures at each stage of the pipeline.
8.1 Summary of Stages
8.1.1 Inputs
The pipeline begins with two input depth maps R and S, both of which are 2D arrays
which may or may not be of similar height and width. Both depth maps are converted
to point clouds, which can be represented as a list of N observations of 3 dimensions
each.
8.1.2 Alignment
An optimal alignment for R and S is found by ﬁnding the transformation that aligns
S to R, forming S .
52

8.1.3 Reconstruction of Corresponding Surfaces
The point clouds R and S are first adjusted such that none of their x, y-coordinates
are negative. R∗
and S∗
are first initialised as 2D arrays with the width w and height
h such that the x, y-coordinates of each point in R and S fits in the coordinate space
formed by w and h.
2.5D MLS is performed on both R and S to fill the values of R∗
and S∗
.
8.1.4 Finding Change
A change map (2D array) ∆∗
can be obtained using Equation 7.5, and a classifica-
tion for change / no change can be obtained given the magnitude of change
threshold tϑ and the area of change threshold aϑ.
The pipeline can be generalised (black box) as the binary classifier B(R, S) that
returns true/false showing if a true change above the thresholds has occurred or not.
B(R, S) =



true if R → S contains change
false if R → S contains no change
(8.1)
8.2 Algorithm for 3D Surface Change Detection
Algorithm 7 B: 3D Surface Change Detection
Input: Reference set R, Source Set S, Change Threshold tϑ, Area Threshold aϑ
Output: true/false
1: [R∗
, S∗
] ← Reconstruct Surface with Correspondence F(R, S) {Algorithm 6}
2: ∆i,j ← R∗
i,j − S∗
i,j {Equation 6.11}
3: ∆∗
← ∆x,y ≥ tϑ {Equation 7.6}
4: classification ← (
w,h
i,j
∆∗
i,j) ≥ aϑ {Equation 7.7}
5: return classification

Chapter 9
Experiments
This chapter details the experiments done during the progress of this research.
The goal of conducting the experiments is to observe the effectiveness of our designed
pipeline on changes applied to real-world surfaces.
The experiments are performed in 3 phases:
Forming the Dataset We obtained 80 sets of raw surface scan data, each with
varying properties. Section 9.1 details the steps taken in forming the dataset.
Experiments We ran 3 types of Experiments to show the effects of various parame-
ters on the effectiveness of the change detection pipeline. Section 9.2 describes
each of the experiments and their results.
9.1 Forming the Dataset
This section describes the dataset of raw surface scan data. The dataset consists
of 80 sets of 3 scans of a surface with different properties described in the following
sub-sections. The three scans are listed in each set as v0, v1, and v2. As mentioned in
Section 5.2.1, each of these ”scans” is actually an average of 20 scans taken in rapid
fashion (taking about 1 second in total), as a pre-proccesing step for minimising noise.
54

Figure 9-2: Close up of plasticine model surface
9.1.1 Equipment
40 sets of the data obtained were surface scans taken using a Creative Senz3D Web-
cam (Figure 1-1) using the Senz3D Acquisition Interface developed by [16], with the
remaining 40 taken using a SwissRanger SR4000 (SR4K) sensor (Figure 1-3) using
the included driver software. The experimental setup was performed by stabilising
the sensor using a tripod, with the sensor aimed at a model of a surface. The model
used was a 40cm × 30cm plasticine model of a surface, with regions having up to
12cm diﬀerences in surface height. Figures 9-1 and 9-2 show the set up of the sensor
and surface model.
The Senz3D camera uses the Structured-Light(SL) method for determining dis-
tance from an object. This method involves projecting a pre-determined rigid pattern
of points onto a surface, and obtaining information about the depth of the scene by ob-
serving the deformation of the rigid pattern. The technology that powers the Senz3D
is similar to that of the ﬁrst-generation Microsoft Kinect, and is of consumer-grade

Figure 9-3: Example of a small change in camera pose used to obtain data in the
with movement dataset
quality and thus easily obtainable. However, this also means that the useable depth
of the sensor is within 0.5metres to 1.5metres, and also has a higher level of noise in
its output.
Time-of-Flight(TOF) is a method of reflecting IR beams from multiple active
illumination sources off a scenery, where the reflected beams will be caught by a
sensor, and using the round-trip time difference between each individual light beams
to calculate the depth of the scene. The SR4000 uses TOF and thus has significantly
higher degree of accuracy at obtaining the depth values in a scene. However, the
SR4000 delivers a much lower resolution (pixels per frame) image of a scene than the
Senz3D, given the same scenery and camera set up.
9.1.2 Movement in Camera Pose
Among the 40 datasets obtained for each sensor, 20 sets were taken by moving the
camera to a different angle of view between v0, v1, and v2, while the remaining 20
sets had the same angle of view for all three scans. Figure 9-3 shows an example of
the change in the camera’s view of the surface when its pose is changed.
9.1.3 Changes Applied
In each of the 20 datasets (Moving and Non-Moving) obtained for each sensor, 10 sets
had a series of changes applied to them between v0, v1, and v2, while the remaining

had no changes applied to them. The types of changes are as follows:
1-5 Small Extrusions Between v0 and v1, an extrusion approximately the size of
an Australian 20cent coin was added. This was repeated in between v1 and v2
while leaving the changes in v1 intact. 5 of the 10 sets containing changes use
this change method with varying heights of the extrusion (up to 5× height of a
coin).
Large-Area Extrusions 2 of the 10 sets containing changes had large strips of extra
plasticine added to the surface model in between v0 and v1, and between v1 and
v2.
Depression 1 of the remaining 3 sets containing changes had 3× slight(< 0.2cm)
depressions made across the surface between v0 and v1, and between v1 and
v2. Another set had changes made in the same style, but with deep (≥ 0.2cm)
depressions instead.
The remaining set had a large-area, large-magnitude change performed in be-
tween the scans. This was done by creating a ﬁst-sized indentation at a random
location in the surface model.
9.2 Experiments
This section details the various experiments conducted on the dataset obtained in
Section 9.1.
9.2.1 Eﬀect of MLS Gaussian Parameter h
The pipeline developed in this research project performs optimally under the con-
dition that the noise from various sources(sensor and environment) are minimised.
We utilise the Moving Least Squares(MLS) technique as a noise-smoothing operator
to minimise the amount of noise that remains in the reconstructed surfaces R∗
and

Figure 9-4: Rate of changes detected in an experiment with no expected change,
over varying values of the Moving Least Squares gaussian parameter h
S∗
. The weighting function that we utilise in MLS requires a pre-determined Gaus-
sian parameter h, that affects the smoothness of the reconstructed surface, thus also
affecting the level of change that can be detected by our pipeline.
To find the optimal value for h, we conducted a series of experiments using the
Senz3Ddataset with no expected changes. We modified the pipeline to deliver
the total number of pixels identifying change, and re-ran the experiment over varying
levels of the magnitude-of-change treshold tϑ. As the dataset is of the no expected
changes variety, the goal in this experiment is to identify an optimal value for h
that gives the lowest rate of false-positives (changes detected) in the dataset.
As can be seen from the result graph of this experiment in Figure 9-4, the optimal
value of h is consistently within the range of 3.0 → 3.25. For all further experiments
conducted, we set h = 3.0.

Figure 9-5: ROC curves of the pipeline operating on the datasets of the two sensors
9.2.2 Evaluation of the Binary Classifier
To evaluate the performance of our designed pipeline, we run all 80 data instances
through the pipeline and evaluate the accuracy of the binary classification process.
We apply the standard Receiver Operating Characteristic (ROC) test on results from
the two sensors over 4 area-of-effect thresholds:
• Single Pixel
• 25% of the Total number of Pixels (w × h)
where each test runs on all feasible values of magnitude thresholds. A total of 160
comparisons were made for each area-of-effect threshold, each having 40 comparison
results with an expected change condition and 40 with an expected no change
condition. The results are shown in Figure 9-5.

Figure 9-6: Effect of Optimal vs. Non-Optimal Alignment on Success Rate of Change
Detection
As expected, the sensitivity of the SR4K camera allows the tests done using the
SR4K dataset to maintain a consistently high rate of successful change detections
despite the increasing area-of-effect thresholds. This is in contrast to the accuracy
of the classifier on the Senz3D dataset, which decreases significantly when the area-
of-effect threshold is above 75%. However, the sensitivity of the SR4K camera could
also explain why it has a consistent non-zero false positive rate (FPR ≥ 0.05). In-
vestigating the limitations of the SR4K camera can be an area for future work, but
it falls out of the scope of this research project, which operates on the data itself
regardless of sensor type.
9.2.3 Optimal vs Non-Optimal Alignment
Another major requirement for successfully detecting changes in the surface is the
optimal alignment strategy described in Section 6.2. To prove that a non-optimal
alignment affects the accuracy of reconstruction and deteriorates the success rate

of change detection, we run a series of experiments using the SR4K dataset with
camera movement (total of 20 sets). The dataset was put through the pipeline
twice, once using the optimal alignment strategy (6DOF + ICP) described in Section
6.2.3, and the second time using only ICP to ﬁnd a best estimate (but possibly not
optimal) alignment.
Figure 9-6 shows the results of the ROC test on the results (2 groups of 20 result
data sets). The results show that using an optimal alignment strategy will always lead
to an equal or better result. This is fairly obvious given the nature of the interpolation
method.
It should be noted that using the ICP method can possibly deliver an optimal
alignment, although this generally depends on the initial state of the two point clouds
that are to be aligned. This experiment simply serves to show that the change detec-
tion method works as long as an optimal alignment can be found.
9.3 Discussion on Experiments
In hindsight, the experiments could have been conducted with simulated data (dig-
itally constructed surfaces with simulated changes). This would have allowed for
further analysis of how much change the detection method could detect in compari-
son to how much actual change was applied. In our experiments with data obtained
from real-world surfaces, the actual position, magnitude, and area of change that is
applied cannot truly be known in advance.
In any case, the results clearly show that our change detection pipeline works for
the general case where the optimal Gaussian parameter for the MLS procedure is
known, and that the alignment strategy is optimal.

Bibliography
[1] M. Alexa, J. Behr, D. Cohen-Or, S. Fleishman, D. Levin, and C.T. Silva. Point
set surfaces. Proc. Vis. 2001. VIS ’01., 2001.
[2] Marc Alexa, Johannes Behr, Daniel Cohen-Or, Shachar Fleishman, David Levin,
and Claudio T. Silva. Computing and rendering point set surfaces. IEEE Trans.
Vis. Comput. Graph., 9(1):3–15, 2003.
[3] Theodore B. Barnhart and Benjamin T. Crosby. Comparing two methods of sur-
face change detection on an evolving thermokarst using high-temporal-frequency
terrestrial laser scanning, Selawik River, Alaska. Remote Sens., 5:2813–2837,
2013.
[4] Paul Besl and Neil McKay. A Method for Registration of 3-D Shapes, 1992.
[5] N K Bose and Nilesh A Ahuja. Superresolution and noise filtering using moving
least squares. IEEE Trans. Image Process., 15:2239–2248, 2006.
[6] Thomas M. Breuel. Implementation techniques for geometric branch-and-bound
matching methods. Comput. Vis. Image Underst., 90(3):258–294, 2003.
[7] Alvaro Joaquin Parra Bustos, Tat-Jun Chin, and David Suter. Fast Rotation
Search with Stereographic Projections for 3D Registration. 2014 IEEE Conf.
Comput. Vis. Pattern Recognit., pages 3930–3937, 2014.
[8] Levin David. Mesh-independent surface interpolation. Geom. Model. Sci. Vis.,
3:37–49, 2003.
[9] Dani Delaloye. Development of a new Methodology for Measuring Deformation
in Tunnels and Shafts with Terrestrial Laser Scanning (LIDAR) using Elliptical
Fitting Algorithms. page 217, 2012.
[10] Tamal K. Dey and Jian Sun. An Adaptive MLS Surface for Reconstruction with
Guarantees. Eurographics Symp. Geom. Process., page 43, 2005.
[11] Shachar Fleishman, Daniel Cohen-Or, and Cláudio T. Silva. Robust moving
least-squares fitting with sharp features. ACM Trans. Graph., 24:544, 2005.
[12] D Girardeau-Montaut and Michel Roux. Change detection on points cloud data
acquired with a ground laser scanner. Int. Arch. Photogramm. Remote Sens.
Spat. Inf. Sci., 36:W19, 2005.
63

[13] Armin Gruen and Devrim Akca. Least squares 3D surface and curve matching.
ISPRS J. Photogramm. Remote Sens., 59(3):151–174, 2005.
[14] S Inge. Using SVD for some fitting problems. (2):2–5, 2009.
[15] Shahram Izadi, Andrew Davison, Andrew Fitzgibbon, David Kim, Otmar
Hilliges, David Molyneaux, Richard Newcombe, Pushmeet Kohli, Jamie Shotton,
Steve Hodges, and Dustin Freeman. Kinect Fusion: Real-time 3D Reconstruc-
tion and Interaction Using a Moving Depth Camera. Proc. 24th Annu. ACM
Symp. User interface Softw. Technol. - UIST ’11, page 559, 2011.
[16] Dirk-Jan Kroon. Senz3D Acquisition interface. http://au.mathworks.
com/matlabcentral/fileexchange/42581-senz3d-acquisition-interface,
2014.
[17] David Levin. The approximation power of moving least-squares. Math. Comput.,
67(224):1517–1532, 1998.
[18] Roderik Lindenbergh, Lukasz Uchanski, Alexander Bucksch, Rinske Van Gosliga,
Space Systems, and Rotterdam Public Works. Structural Monitoring of Tunnels
Using Terrestrial Laser Scanning. 2009.
[19] Yaron Lipman, Daniel Cohen-Or, and David Levin. Data-dependent MLS for
faithful surface approximation. Proc. fifth Eurographics Symp. Geom. Process.,
pages 59–67, 2007.
[20] O Monserrat and M Crosetto. Deformation measurement using terrestrial laser
scanning data and least squares 3D surface matching. ISPRS J. Photogramm.
Remote Sens., 63:142–154, 2008.
[21] Richard a Newcombe, David Molyneaux, David Kim, Andrew J. Davison, Jamie
Shotton, Steve Hodges, and Andrew Fitzgibbon. KinectFusion: Real-Time Dense
Surface Mapping and Tracking. IEEE Int. Symp. Mix. Augment. Real., pages
127–136, 2011.
[22] Olga Sorkine. Least-squares rigid motion using svd. Tech. notes, (February):1–6,
2009.
[23] Roberto Stent, Simon and Gherardi, Riccardo and Stenger, Björn and Soga,
Kenichi and Cipolla. An Image-Based System for Change Detection on Tunnel
Linings. In 13th IAPR Int. Conf. Mach. Vis. Appl. Kyoto, Japan, pages 2–5,
2013.
[24] Masato Ukai. Advanced Inspection System of Tunnel Wall Deformation using
Image Processing. Q. Rep. RTRI, 48(2):94–98, 2007.

Honours_Thesis2015_final

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Viewers also liked

Viewers also liked (12)

Similar to Honours_Thesis2015_final

Similar to Honours_Thesis2015_final (20)

Honours_Thesis2015_final