Image Analysis and Pattern Recognition
for Remote Sensing
with Algorithms in ENVI/IDL
Morton John Canty
Forschungszentrum Jülich GmbH
m.canty@fz-juelich.de
March 21, 2005
Contents

1 Images, Arrays and Vectors
  1.1 Multispectral satellite images
  1.2 Algebra of vectors and matrices
  1.3 Eigenvalues and eigenvectors
  1.4 Finding minima and maxima

2 Image Statistics
  2.1 Random variables
  2.2 The normal distribution
  2.3 A special function
  2.4 Conditional probabilities and Bayes' Theorem
  2.5 Linear regression

3 Transformations
  3.1 Fourier transforms
    3.1.1 Discrete Fourier transform
    3.1.2 Discrete Fourier transform of an image
  3.2 Wavelets
  3.3 Principal components
  3.4 Minimum noise fraction
  3.5 Maximum autocorrelation factor (MAF)

4 Radiometric enhancement
  4.1 Lookup tables
    4.1.1 Histogram equalization
    4.1.2 Histogram matching
  4.2 Convolutions
    4.2.1 Laplacian of Gaussian filter

5 Topographic modelling
  5.1 RST transformation
  5.2 Imaging transformations
  5.3 Camera models and RFM approximations
  5.4 Stereo imaging, elevation models and orthorectification
  5.5 Slope and aspect
  5.6 Illumination correction

6 Image Registration
  6.1 Frequency domain registration
  6.2 Feature matching
    6.2.1 Contour detection
    6.2.2 Closed contours
    6.2.3 Chain codes
    6.2.4 Invariant moments
    6.2.5 Contour matching
    6.2.6 Consistency check
  6.3 Re-sampling and warping

7 Image Sharpening
  7.1 HSV fusion
  7.2 Brovey fusion
  7.3 PCA fusion
  7.4 Wavelet fusion
    7.4.1 Discrete wavelet transform
    7.4.2 À trous filtering
  7.5 Quality indices

8 Change Detection
  8.1 Algebraic methods
  8.2 Principal components
  8.3 Post-classification comparison
  8.4 Multivariate alteration detection
    8.4.1 Canonical correlation analysis
    8.4.2 Solution by Cholesky factorization
    8.4.3 Properties of the MAD components
    8.4.4 Covariance of MAD variates with original observations
    8.4.5 Scale invariance
    8.4.6 Improving signal to noise
    8.4.7 Decision thresholds
  8.5 Radiometric normalization

9 Unsupervised Classification
  9.1 A simple cost function
  9.2 Algorithms that minimize the simple cost function
    9.2.1 K-means
    9.2.2 Extended K-means
    9.2.3 Agglomerative hierarchical clustering
    9.2.4 Fuzzy K-means
  9.3 EM Clustering
    9.3.1 Simulated annealing
    9.3.2 Partition density
    9.3.3 Including spatial information
  9.4 The Kohonen Self Organizing Map
  9.5 Unsupervised classification of changes

10 Supervised Classification
  10.1 Bayes decision rule
  10.2 Training data
  10.3 Bayes Maximum likelihood classification
  10.4 Non-parametric methods
  10.5 Neural networks
    10.5.1 The feed-forward network
    10.5.2 Cost functions
    10.5.3 Training
    10.5.4 Backpropagation
  10.6 Evaluation
    10.6.1 Standard deviation of misclassification
    10.6.2 Model comparison
    10.6.3 Confusion matrices

11 Hyperspectral analysis
  11.1 Mixture modelling
    11.1.1 Full linear unmixing
    11.1.2 Unconstrained linear unmixing
    11.1.3 Intrinsic end-members and pixel purity
  11.2 Orthogonal subspace projection

A Least Squares Procedures
  A.1 Generalized least squares
  A.2 Recursive least squares
  A.3 Orthogonal regression

B The Discrete Wavelet Transformation
  B.1 Inner product space
  B.2 Haar wavelets
  B.3 Multi-resolution analysis
  B.4 Fixpoint wavelet approximation
  B.5 The mother wavelet
  B.6 The Daubechies wavelet
  B.7 Wavelets and filter banks

C Advanced Neural Network Training Algorithms
  C.1 The Hessian matrix
    C.1.1 The R-operator
    C.1.2 Calculating the Hessian
  C.2 Scaled conjugate gradient training
    C.2.1 Conjugate directions
    C.2.2 Minimizing a quadratic function
    C.2.3 The algorithm
  C.3 Kalman filter training
    C.3.1 Linearization
    C.3.2 The algorithm

D ENVI Extensions
  D.1 Installation
  D.2 Topographic modelling
    D.2.1 Calculating building heights
    D.2.2 Illumination correction
  D.3 Image registration
  D.4 Image fusion
    D.4.1 DWT fusion
    D.4.2 ATWT fusion
    D.4.3 Quality index
  D.5 Change detection
    D.5.1 Multivariate Alteration Detection
    D.5.2 Maximum autocorrelation factor
    D.5.3 Radiometric normalization
  D.6 Unsupervised classification
    D.6.1 Hierarchical clustering
    D.6.2 Fuzzy K-means clustering
    D.6.3 EM clustering
    D.6.4 Probabilistic label relaxation
    D.6.5 Kohonen self organizing map
    D.6.6 A GUI for change clustering
  D.7 Neural network: Scaled conjugate gradient
  D.8 Neural network: Kalman filter
  D.9 Neural network: Hybrid

Bibliography
Chapter 1
Images, Arrays and Vectors
1.1 Multispectral satellite images
A number of multispectral satellite-based sensors used for earth observation are currently in orbit. As representative of these we mention here the Landsat ETM+ system. The ETM+ instrument on the Landsat 7 spacecraft contains sensors to measure radiance in three spectral intervals:

• visible and near infrared (VNIR) bands: bands 1, 2, 3, 4 and 8 (PAN), with a spectral range between 0.4 and 1.0 micrometer

• short wavelength infrared (SWIR) bands: bands 5 and 7, with a spectral range between 1.0 and 3.0 micrometers

• thermal long wavelength infrared (LWIR) band: band 6, with a spectral range between 8.0 and 12.0 micrometers

In addition, a panchromatic (PAN) image (band 8) covering the visible spectrum is provided. Ground resolutions are 15 m (PAN), 30 m (VNIR, SWIR) and 60 m (LWIR). Figure 1.1 shows a color composite image of a Landsat 7 scene over Morocco acquired in 1999.
A single multispectral image can be represented as an array of gray-scale values or digital numbers

$$g_k(i,j), \quad 1 \le i \le c,\ 1 \le j \le r,$$

where $c$ is the number of pixel columns and $r$ is the number of pixel rows. If we are dealing with an $N$-band multispectral image, then the index $k$, $1 \le k \le N$, denotes the spectral band. Often a pixel intensity is stored in a single byte, so that $0 \le g_k \le 255$.
The gray-scale values are the result of sampling, along an array of sensors, the at-sensor radiance $f_\lambda(x,y)$ at wavelength $\lambda$ due to sunlight reflected from some point $(x,y)$ on the Earth's surface and focussed by the satellite's optical system onto the sensors. Ignoring atmospheric effects, this radiance is given roughly by

$$f_\lambda(x,y) \sim i_\lambda(x,y)\, r_\lambda(x,y),$$

where $i_\lambda(x,y)$ is the sun's irradiance at the surface in units of watt/m²µm, and $r_\lambda(x,y)$ is the surface reflectance, a number between 0 and 1.

Figure 1.1: Color composite of bands 4 (red), 5 (green) and 7 (blue) for a Landsat ETM+ image over Morocco.

The conversion between gray-scale or digital number $g$ and at-sensor radiance $f$ is determined by the sensor calibration as measured (and maintained) by the satellite image provider:

$$f = C\, g(i,j) + f_{min},$$

where $C = (f_{max} - f_{min})/255$, in which $f_{max}$ and $f_{min}$ are the maximum and minimum measurable radiances at the sensor.
Atmospheric scattering and absorption models are used to calculate surface reflectance
from the observed at-sensor radiance, as it is the reflectance which is directly related to the
physical properties of the surface being examined.
Various conventions can be used for storing the image array $g(i,j)$ in computer memory or on storage media. In band interleaved by pixel (BIP) format, for example, a two-channel, 3 × 3 pixel image would be stored as

g_1(1,1) g_2(1,1) g_1(2,1) g_2(2,1) g_1(3,1) g_2(3,1)
g_1(1,2) g_2(1,2) g_1(2,2) g_2(2,2) g_1(3,2) g_2(3,2)
g_1(1,3) g_2(1,3) g_1(2,3) g_2(2,3) g_1(3,3) g_2(3,3),

whereas in band interleaved by line (BIL) format it would be stored as

g_1(1,1) g_1(2,1) g_1(3,1) g_2(1,1) g_2(2,1) g_2(3,1)
g_1(1,2) g_1(2,2) g_1(3,2) g_2(1,2) g_2(2,2) g_2(3,2)
g_1(1,3) g_1(2,3) g_1(3,3) g_2(1,3) g_2(2,3) g_2(3,3),

and in band sequential (BSQ) format as

g_1(1,1) g_1(2,1) g_1(3,1)
g_1(1,2) g_1(2,2) g_1(3,2)
g_1(1,3) g_1(2,3) g_1(3,3)
g_2(1,1) g_2(2,1) g_2(3,1)
g_2(1,2) g_2(2,2) g_2(3,2)
g_2(1,3) g_2(2,3) g_2(3,3).
In the computer language IDL, so-called row major indexing is used for arrays and the elements in an array are numbered from zero. This means that, if a gray-scale image $g$ is stored in an IDL array variable G, then the intensity value $g(i,j)$ is addressed as G[i-1,j-1]. An $N$-band multispectral image is stored in IDL as an $N \times c \times r$ array in BIP format, as a $c \times N \times r$ array in BIL format, and as a $c \times r \times N$ array in BSQ format.
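As a minimal illustration of these orderings (a sketch with made-up dimensions, not data from the text), IDL's transpose function with a permutation vector converts between the interleave formats:

; convert a BIP image (band, column, row) to BSQ (column, row, band);
; image_bip is a hypothetical 2-band, 3x3 example
num_bands = 2 & num_cols = 3 & num_rows = 3
image_bip = indgen(num_bands, num_cols, num_rows)
image_bsq = transpose(image_bip, [1, 2, 0])   ; permute the dimensions
help, image_bip, image_bsq                    ; Array[2,3,3] becomes Array[3,3,2]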
Auxiliary information, such as image acquisition parameters and georeferencing, is normally included with the image data in the same file, and the format may or may not make use of compression algorithms. Examples are the GeoTIFF¹ file format, used for example by Space Imaging Inc. for distributing Carterra© imagery, which includes lossless compression; the HDF (Hierarchical Data Format), in which for example ASTER images are distributed; and the cross-platform PCIDSK format employed by PCI Geomatics with its image processing software, which is in plain ASCII code and not compressed. ENVI uses a simple "flat binary" file structure with an additional ASCII header file.

¹GeoTIFF refers to TIFF files which have geographic (or cartographic) data embedded as tags within the TIFF file. The geographic data can then be used to position the image in the correct location and geometry on the screen of a geographic information display.
1.2 Algebra of vectors and matrices
It is very convenient to use a vector representation for multispectral images, namely
$$g(i,j) = \begin{pmatrix} g_1(i,j) \\ \vdots \\ g_N(i,j) \end{pmatrix}, \tag{1.1}$$

which is a column vector of multispectral gray-scale values at the position $(i,j)$.
Since we will be making extensive use of the vector notation of Eq. (1.1), we review here some of the basic properties of vectors and matrices. We can illustrate most of these properties in just two dimensions.

Figure 1.2: A vector in two dimensions.
The transpose of the two-dimensional column vector shown in Fig. 1.2,

$$x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix},$$

is the row vector

$$x^\top = (x_1, x_2).$$

The sum of two vectors is given by

$$x + y = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = \begin{pmatrix} x_1 + y_1 \\ x_2 + y_2 \end{pmatrix},$$

and the inner product by

$$x^\top y = (x_1, x_2)\begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = x_1 y_1 + x_2 y_2.$$

The length or norm of the vector $x$ is

$$\|x\| = |x| = \sqrt{x_1^2 + x_2^2} = \sqrt{x^\top x}.$$
The programming language IDL is especially good at manipulating vectors and matrices:
1.2. ALGEBRA OF VECTORS AND MATRICES 5
IDL> x = [[1],[2]]
IDL> print, x
       1
       2
IDL> print, transpose(x)
       1       2
Figure 1.3: The inner product.
The inner product can be written in terms of the vector lengths and the angle θ between the two vectors as

$$x^\top y = |x|\,|y|\cos\theta = xy\cos\theta,$$

see Fig. 1.3. If θ = 90°, the vectors are orthogonal, so that

$$x^\top y = 0.$$
Any vector can be decomposed into orthogonal unit vectors:

$$x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = x_1 \begin{pmatrix} 1 \\ 0 \end{pmatrix} + x_2 \begin{pmatrix} 0 \\ 1 \end{pmatrix}.$$
A two-by-two matrix is written

$$A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}.$$

When a matrix is multiplied with a vector, the result is another vector, e.g.

$$Ax = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} a_{11}x_1 + a_{12}x_2 \\ a_{21}x_1 + a_{22}x_2 \end{pmatrix}.$$
The IDL operator for matrix and vector multiplication is ##.
IDL> a = [[1,2],[3,4]]
IDL> print, a
       1       2
       3       4
IDL> print, a##x
       5
      11
Matrices also have a transposed form, obtained by interchanging their rows and columns:

$$A^\top = \begin{pmatrix} a_{11} & a_{21} \\ a_{12} & a_{22} \end{pmatrix}.$$

The product of two matrices is given by

$$AB = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}\begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix} = \begin{pmatrix} a_{11}b_{11}+a_{12}b_{21} & \cdots \\ \cdots & \cdots \end{pmatrix}$$

and is another matrix. The determinant of a two-dimensional matrix is

$$|A| = \det A = a_{11}a_{22} - a_{12}a_{21}.$$
The outer product of two vectors is a matrix:

$$x y^\top = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}(y_1, y_2) = \begin{pmatrix} x_1 y_1 & x_1 y_2 \\ x_2 y_1 & x_2 y_2 \end{pmatrix}.$$

The identity matrix is given by

$$I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad IA = AI = A.$$

The matrix inverse $A^{-1}$ is defined in terms of the identity matrix according to

$$A^{-1}A = AA^{-1} = I.$$
In two dimensions it is easy to verify that

$$A^{-1} = \frac{1}{|A|}\begin{pmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{pmatrix}.$$
IDL> print, determ(float(a))
      -2.00000
IDL> print, invert(a)
      -2.00000      1.00000
       1.50000    -0.500000
IDL> print, a##invert(a)
       1.00000      0.00000
       0.00000      1.00000
If $|A| = 0$, then $A$ has no inverse and is said to be a singular matrix. The trace of a square matrix is the sum of its diagonal elements:

$$\mathrm{Tr}\, A = a_{11} + a_{22}.$$
1.3 Eigenvalues and eigenvectors
The statistical properties of ensembles of pixel intensities (for example entire images or specific land-cover classes) are often approximated by their mean values and covariance matrices. As we will see later, covariance matrices are always symmetric. A matrix $A$ is symmetric if it doesn't change when it is transposed, i.e. if

$$A = A^\top.$$
Very often we have to solve the so-called eigenvalue problem, which is to find eigenvectors $x$ and eigenvalues $\lambda$ that satisfy the equation

$$Ax = \lambda x$$

or, equivalently,

$$\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \lambda \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}.$$

This is the same as the two equations

$$\begin{aligned} (a_{11}-\lambda)x_1 + a_{12}x_2 &= 0 \\ a_{21}x_1 + (a_{22}-\lambda)x_2 &= 0. \end{aligned} \tag{1.2}$$
If we eliminate $x_1$ and make use of the symmetry $a_{12} = a_{21}$, we obtain

$$[(a_{11}-\lambda)(a_{22}-\lambda) - a_{12}^2]\, x_2 = 0.$$

In general $x_2 \ne 0$, so we must have

$$(a_{11}-\lambda)(a_{22}-\lambda) - a_{12}^2 = 0,$$

which is known as the characteristic equation for the eigenvalue problem. It is a quadratic equation in $\lambda$ with solutions

$$\begin{aligned} \lambda^{(1)} &= \tfrac{1}{2}\Big[a_{11}+a_{22} + \sqrt{(a_{11}+a_{22})^2 - 4(a_{11}a_{22}-a_{12}^2)}\Big] \\ \lambda^{(2)} &= \tfrac{1}{2}\Big[a_{11}+a_{22} - \sqrt{(a_{11}+a_{22})^2 - 4(a_{11}a_{22}-a_{12}^2)}\Big]. \end{aligned} \tag{1.3}$$
Thus there are two eigenvalues and, correspondingly, two eigenvectors $x^{(1)}$ and $x^{(2)}$, which can be obtained by substituting $\lambda^{(1)}$ and $\lambda^{(2)}$ into (1.2) and solving for $x_1$ and $x_2$. It is easy to show that the eigenvectors are orthogonal:

$$(x^{(1)})^\top x^{(2)} = 0.$$
The matrix formed by the two eigenvectors,

$$u = (x^{(1)}, x^{(2)}) = \begin{pmatrix} x_1^{(1)} & x_1^{(2)} \\ x_2^{(1)} & x_2^{(2)} \end{pmatrix},$$

is said to diagonalize the matrix $A$. That is,

$$u^\top A u = \begin{pmatrix} \lambda^{(1)} & 0 \\ 0 & \lambda^{(2)} \end{pmatrix}. \tag{1.4}$$
We can illustrate the whole procedure in IDL as follows:
IDL> a = float([[1,2],[2,3]])
IDL> print, a
      1.00000      2.00000
      2.00000      3.00000
IDL> print, eigenql(a, eigenvectors=u, /double)
       4.2360680     -0.23606798
IDL> print, transpose(u)##a##u
       4.2360680  -2.2204460e-016
  -1.6653345e-016     -0.23606798
Note that, after diagonalization, the off-diagonal elements are not precisely zero due to
rounding errors in the computation.
All of the above properties generalize easily to N dimensions.
1.4 Finding minima and maxima
In order to maximize some desirable property of a multispectral image, such as signal to
noise or spread in intensity, we often need to take derivatives of vectors. A vector (partial)
derivative in two dimensions is written $\partial/\partial x$ and is defined as the vector

$$\frac{\partial}{\partial x} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}\frac{\partial}{\partial x_1} + \begin{pmatrix} 0 \\ 1 \end{pmatrix}\frac{\partial}{\partial x_2}.$$
Many of the operations with vector derivatives correspond exactly to operations with ordinary scalar derivatives (they can all be verified easily by writing out the expressions component by component):

$$\frac{\partial}{\partial x}(x^\top y) = y \quad\text{analogous to}\quad \frac{d}{dx}(xy) = y$$

$$\frac{\partial}{\partial x}(x^\top x) = 2x \quad\text{analogous to}\quad \frac{d}{dx}\,x^2 = 2x.$$
The scalar expression

$$x^\top A y,$$

where $A$ is a matrix, is called a quadratic form. We have

$$\frac{\partial}{\partial x}(x^\top A y) = A y, \qquad \frac{\partial}{\partial y}(x^\top A y) = A^\top x$$

and

$$\frac{\partial}{\partial x}(x^\top A x) = A x + A^\top x.$$

Note that, if $A$ is a symmetric matrix, this last equation can be written

$$\frac{\partial}{\partial x}(x^\top A x) = 2Ax.$$
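These rules are easy to check numerically. The following sketch (with an arbitrary, made-up matrix, not from the text) compares a finite-difference gradient of the quadratic form with the analytic result $Ax + A^\top x$:

; numerical check of d/dx (x^T A x) = A x + A^T x by finite differences
a = float([[1,2],[0,3]])            ; arbitrary non-symmetric matrix
x = [[1.0],[2.0]]                   ; column vector
f = transpose(x) ## a ## x          ; the quadratic form x^T A x
eps = 1e-4
grad = fltarr(1,2)
for i=0,1 do begin
   xp = x
   xp[0,i] = xp[0,i] + eps
   fp = transpose(xp) ## a ## xp
   grad[0,i] = (fp[0] - f[0])/eps   ; forward difference
endfor
print, grad                         ; approximately 6.0 and 14.0
print, (a + transpose(a)) ## x      ; analytic gradient: 6.0 and 14.0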
Suppose $x^*$ is a critical point of the function $f(x)$, i.e.

$$\frac{d}{dx} f(x)\Big|_{x=x^*} = 0, \tag{1.5}$$
Figure 1.4: A function of one variable.
see Fig. 1.4. Then $f(x^*)$ is a local minimum if $\frac{d^2}{dx^2}f(x^*) > 0$. This becomes obvious if we express $f(x)$ as a Taylor series about $x^*$:

$$f(x) = f(x^*) + (x - x^*)\frac{d}{dx}f(x^*) + \frac{1}{2}(x - x^*)^2 \frac{d^2}{dx^2}f(x^*) + \ldots$$

For $|x - x^*|$ sufficiently small, this is equivalent to

$$f(x) \approx f(x^*) + \frac{1}{2}(x - x^*)^2 \frac{d^2}{dx^2}f(x^*).$$
The situation is similar for scalar functions of a vector:

$$f(x) \approx f(x^*) + (x - x^*)^\top \frac{\partial f(x^*)}{\partial x} + \frac{1}{2}(x - x^*)^\top H\, (x - x^*), \tag{1.6}$$

where $H$ is called the Hessian matrix:

$$(H)_{ij} = \frac{\partial^2}{\partial x_i\, \partial x_j} f(x^*). \tag{1.7}$$
In the neighborhood of the critical point, since $\frac{\partial f(x^*)}{\partial x} = 0$, we get the approximation

$$f(x) \approx f(x^*) + \frac{1}{2}(x - x^*)^\top H\, (x - x^*).$$

Now the condition for a local minimum is that the Hessian matrix be positive definite at the point $x^*$. Positive definiteness means that

$$x^\top H x > 0 \quad\text{for all } x \ne 0. \tag{1.8}$$
Suppose we want to find a minimum (or maximum) of a scalar function $f(x)$ of the vector $x$. If there are no constraints, then we solve the set of equations

$$\frac{\partial f(x)}{\partial x_i} = 0, \quad i = 1, 2,$$

or, in terms of our notation for vector derivatives,

$$\frac{\partial f(x)}{\partial x} = \mathbf 0 = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.$$
However, suppose that $x$ is constrained by the equation

$$g(x) = 0.$$

For example, we might have

$$g(x) = x_1^2 + x_2^2 - 1 = 0,$$

which constrains $x$ to lie on a circle of radius 1.
Finding a minimum of $f$ subject to $g = 0$ is equivalent to finding an unconstrained minimum of

$$f(x) + \lambda g(x), \tag{1.9}$$

where $\lambda$ is called a Lagrange multiplier and is treated like an additional variable, see [Mil99]. That is, we solve the set of equations

$$\begin{aligned} \frac{\partial}{\partial x_i}\big(f(x) + \lambda g(x)\big) &= 0, \quad i = 1, 2 \\ \frac{\partial}{\partial \lambda}\big(f(x) + \lambda g(x)\big) &= 0. \end{aligned} \tag{1.10}$$

The latter equation is just $g(x) = 0$.
For example, let $f(x) = a x_1^2 + b x_2^2$ and $g(x) = x_1 + x_2 - 1$. Then we get the three equations

$$\begin{aligned} \frac{\partial}{\partial x_1}\big(f(x) + \lambda g(x)\big) &= 2 a x_1 + \lambda = 0 \\ \frac{\partial}{\partial x_2}\big(f(x) + \lambda g(x)\big) &= 2 b x_2 + \lambda = 0 \\ \frac{\partial}{\partial \lambda}\big(f(x) + \lambda g(x)\big) &= x_1 + x_2 - 1 = 0. \end{aligned}$$

The solution is

$$x_1 = \frac{b}{a+b}, \qquad x_2 = \frac{a}{a+b}.$$
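A quick numerical illustration of this result (with assumed values a = 1, b = 2) can be carried out in IDL by scanning along the constraint:

a = 1.0 & b = 2.0
print, b/(a+b), a/(a+b)            ; analytic solution: 0.666667  0.333333
t = findgen(1001)/1000             ; points on the constraint: x1 = t, x2 = 1-t
f = a*t^2 + b*(1-t)^2
print, min(f, imin), t[imin]       ; minimum 0.666667 at x1 = 0.667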
Exercises

1. Show that the outer product of two 2-dimensional vectors is a singular matrix.

2. Prove that the eigenvectors of a 2 × 2 symmetric matrix are orthogonal.

3. Differentiate the function

$$\frac{1}{x^\top A y}$$

with respect to $y$.

4. Verify the following matrix identity in IDL:

$$(AB)^\top = B^\top A^\top.$$

5. Calculate the eigenvalues and eigenvectors of a non-symmetric matrix with IDL.

6. Plot the function $f(x) = x_1^2 - x_2^2$ with IDL. Find its minima and maxima subject to the constraint $g(x) = x_1^2 + x_2^2 - 1 = 0$.
Chapter 2
Image Statistics
It is useful to think of image pixel intensities g(x) as realizations of a random vector G(x)
drawn independently from some probability distribution.
2.1 Random variables
A random variable can be used to represent some quantity which changes in an unpredictable way each time it is observed. If there is a discrete set of $M$ possible events $\{E_i\}$, $i = 1 \ldots M$, associated with some random process, let $p_i$ be the probability that the $i$-th event $E_i$ will occur. If $n_i$ represents the number of times $E_i$ occurs in $n$ trials, we expect that $n_i/n \to p_i$ in the limit $n \to \infty$ and that

$$\sum_{i=1}^{M} p_i = 1.$$
For example, on the throw of a pair of dice,

$$\{E_i\} = (1,1), (1,2), (2,1) \ldots (6,6),$$

and each event is equally probable:

$$p_i = 1/36, \quad i = 1 \ldots 36.$$

Formally, a random variable $X$ is a real function on the set of possible events:

$$X = f(E_i).$$

If, for example, $X$ is the sum of the points on the dice,

$$X = f(E_1) = 2, \quad X = f(E_2) = 3, \quad X = f(E_3) = 3, \ldots, X = f(E_{36}) = 12.$$

On the basis of the probabilities of the individual events, we can associate a distribution function $P(x)$ with the random variable $X$, defined by

$$P(x) = \Pr(X \le x).$$

For the dice example,

$$P(1) = 0, \quad P(2) = 1/36, \quad P(3) = 1/12, \ldots, P(12) = 1.$$
For continuous random variables, such as the measured radiance at a satellite sensor, the distribution function is not expressed in terms of discrete probabilities, but rather in terms of a probability density function $p(x)$, where $p(x)dx$ is the probability that the value of the random variable $X$ lies in the interval $[x, x+dx]$. Then

$$P(x) = \Pr(X \le x) = \int_{-\infty}^{x} p(t)\, dt$$

and, of course,

$$P(-\infty) = 0, \qquad P(\infty) = 1.$$
Two random variables X and Y are said to be independent when
Pr(X ≤ x and Y ≤ y) = Pr(X ≤ x, Y ≤ y) = P(x)P(y).
The mean or expected value of a random variable $X$ is written $\langle X \rangle$ and is defined in terms of the probability density function:

$$\langle X \rangle = \int_{-\infty}^{\infty} x\, p(x)\, dx.$$

The variance of $X$, written $\mathrm{var}(X)$, is defined as the expected value of the random variable $(X - \langle X \rangle)^2$, i.e.

$$\mathrm{var}(X) = \langle (X - \langle X \rangle)^2 \rangle.$$

In terms of the probability density function, it is given by

$$\mathrm{var}(X) = \int_{-\infty}^{\infty} (x - \langle X \rangle)^2\, p(x)\, dx.$$

Two simple but very useful identities follow from the definition of variance:

$$\mathrm{var}(X) = \langle X^2 \rangle - \langle X \rangle^2, \qquad \mathrm{var}(aX) = a^2\, \mathrm{var}(X). \tag{2.1}$$
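The identities (2.1) are easily verified by simulation; a minimal sketch with standard normal samples:

x = randomn(seed, 100000L)                 ; samples with var(X) = 1
a = 3.0
print, variance(a*x), a^2*variance(x)      ; both approximately 9
print, mean(x^2) - mean(x)^2, variance(x)  ; first identity: both approximately 1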
2.2 The normal distribution
It is often the case that random variables are well-described by the normal or Gaussian probability density function

$$p(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\Big(-\frac{1}{2\sigma^2}(x-\mu)^2\Big).$$

In that case

$$\langle X \rangle = \mu, \qquad \mathrm{var}(X) = \sigma^2.$$
The expected value of the vector of pixel intensities

$$G(x) = \begin{pmatrix} G_1(x) \\ G_2(x) \\ \vdots \\ G_N(x) \end{pmatrix},$$
where $x$ denotes the pixel coordinates, i.e. $x = (i,j)$, is estimated by averaging over all of the pixels in the image,

$$\langle G(x) \rangle \approx \frac{1}{cr} \sum_{i,j=1}^{c,r} g(i,j),$$

referred to as the sample mean vector. It is usually assumed to be independent of $x$, i.e.

$$\langle G(x) \rangle = \langle G \rangle.$$
The covariance between bands $k$ and $\ell$ is defined according to

$$\mathrm{cov}(G_k, G_\ell) = \langle (G_k - \langle G_k \rangle)(G_\ell - \langle G_\ell \rangle) \rangle$$

and is estimated again by averaging over the pixels:

$$\mathrm{cov}(G_k, G_\ell) \approx \frac{1}{cr} \sum_{i,j=1}^{c,r} (g_k(i,j) - \langle G_k \rangle)(g_\ell(i,j) - \langle G_\ell \rangle),$$

which is called the sample covariance. The covariance is also usually assumed to be independent of $x$. The variance of band $k$ is given by

$$\mathrm{var}(G_k) = \mathrm{cov}(G_k, G_k) = \langle (G_k - \langle G_k \rangle)^2 \rangle.$$
The random vector $G$ is often assumed to be described by a multivariate normal probability density function $p(g)$, given by

$$p(g) = \frac{1}{(2\pi)^{N/2}\, |\Sigma|^{1/2}} \exp\Big(-\frac{1}{2}(g-\mu)^\top \Sigma^{-1} (g-\mu)\Big).$$

We indicate this by writing

$$G \sim N(\mu, \Sigma).$$
The distribution function of the multispectral pixels is then completely determined by the expected value $\langle G \rangle = \mu$ and by the covariance matrix $\Sigma$. In two dimensions, for example,

$$\Sigma = \begin{pmatrix} \mathrm{var}(G_1) & \mathrm{cov}(G_1,G_2) \\ \mathrm{cov}(G_2,G_1) & \mathrm{var}(G_2) \end{pmatrix} = \begin{pmatrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{21} & \sigma_2^2 \end{pmatrix}.$$

Note that, since $\mathrm{cov}(G_k, G_\ell) = \mathrm{cov}(G_\ell, G_k)$, the covariance matrix is symmetric, $\Sigma = \Sigma^\top$. The covariance matrix can also be written as an outer product:

$$\Sigma = \langle (G - \langle G \rangle)(G - \langle G \rangle)^\top \rangle,$$

as can its estimated value:

$$\Sigma \approx \frac{1}{cr} \sum_{i,j=1}^{c,r} (g(i,j) - \langle G \rangle)(g(i,j) - \langle G \rangle)^\top.$$

If $\langle G \rangle = 0$, we can write simply

$$\Sigma = \langle G G^\top \rangle.$$
Another useful identity applies to any linear combination $a^\top G$ of the random vector $G$, namely

$$\mathrm{var}(a^\top G) = a^\top \Sigma\, a. \tag{2.2}$$

This is obvious in two dimensions, since we have

$$\begin{aligned} \mathrm{var}(a^\top G) &= \mathrm{cov}(a_1 G_1 + a_2 G_2,\ a_1 G_1 + a_2 G_2) \\ &= a_1^2\, \mathrm{var}(G_1) + a_1 a_2\, \mathrm{cov}(G_1,G_2) + a_1 a_2\, \mathrm{cov}(G_2,G_1) + a_2^2\, \mathrm{var}(G_2) \\ &= (a_1, a_2) \begin{pmatrix} \mathrm{var}(G_1) & \mathrm{cov}(G_1,G_2) \\ \mathrm{cov}(G_2,G_1) & \mathrm{var}(G_2) \end{pmatrix} \begin{pmatrix} a_1 \\ a_2 \end{pmatrix}. \end{aligned}$$

Variance is always nonnegative and the vector $a$ in (2.2) is arbitrary, so we have

$$a^\top \Sigma\, a \ge 0 \quad\text{for all } a.$$

The covariance matrix is therefore said to be positive semi-definite.
The correlation matrix $C$ is similar to the covariance matrix, except that each matrix element $(k, \ell)$ is normalized by $\sqrt{\mathrm{var}(G_k)\,\mathrm{var}(G_\ell)}$. In two dimensions,

$$C = \begin{pmatrix} 1 & \rho_{12} \\ \rho_{21} & 1 \end{pmatrix} = \begin{pmatrix} 1 & \frac{\mathrm{cov}(G_1,G_2)}{\sqrt{\mathrm{var}(G_1)\mathrm{var}(G_2)}} \\ \frac{\mathrm{cov}(G_2,G_1)}{\sqrt{\mathrm{var}(G_1)\mathrm{var}(G_2)}} & 1 \end{pmatrix} = \begin{pmatrix} 1 & \frac{\sigma_{12}}{\sigma_1\sigma_2} \\ \frac{\sigma_{21}}{\sigma_1\sigma_2} & 1 \end{pmatrix}.$$
The following ENVI/IDL program calculates and prints out the covariance matrix of a multispectral image:

envi_select, title='Choose multispectral image', fid=fid, dims=dims, pos=pos
if (fid eq -1) then return
num_cols = dims[2]-dims[1]+1
num_rows = dims[4]-dims[3]+1
num_pixels = num_cols*num_rows
num_bands = n_elements(pos)
samples = intarr(num_bands, num_pixels)   ; one row of samples per band
for i=0,num_bands-1 do samples[i,*] = envi_get_data(fid=fid,dims=dims,pos=pos[i])
print, correlate(samples, /covariance, /double)
end

ENVI> .GO
      111.46663       82.123236       159.58377       133.80637
      82.123236       64.532431       124.84815       104.45298
      159.58377       124.84815       246.18004       205.63420
      133.80637       104.45298       205.63420       192.70367
2.3 A special function
If $n$ is an integer, the factorial of $n$ is defined by

$$n! = n(n-1)\cdots 1, \qquad 1! = 0! = 1.$$

The generalization of this to non-integers $z$ is the gamma function

$$\Gamma(z) = \int_0^\infty t^{z-1} e^{-t}\, dt.$$

It has the property

$$\Gamma(z+1) = z\,\Gamma(z).$$
The factorial is a special case, i.e. for integer $n$,

$$\Gamma(n+1) = n!\,.$$

A further generalization is the incomplete gamma function

$$\Gamma_P(a, x) = \frac{1}{\Gamma(a)} \int_0^x t^{a-1} e^{-t}\, dt.$$

It has the properties

$$\Gamma_P(a, 0) = 0, \qquad \Gamma_P(a, \infty) = 1.$$
Here is a plot of $\Gamma_P$ for $a = 3$ in IDL:

x = findgen(100)/10
envi_plot_data, x, igamma(3, x)

Figure 2.1: The incomplete gamma function.
We are interested in this function for the following reason. Suppose that the random variables $X_i$, $i = 1 \ldots n$, are independent and normally distributed with zero mean and variance $\sigma_i^2$. Then the random variable

$$Z = \sum_{i=1}^{n} \Big(\frac{X_i}{\sigma_i}\Big)^2$$

has the distribution function

$$P(z) = \Pr(Z \le z) = \Gamma_P(n/2, z/2),$$

and is said to be chi-square distributed with $n$ degrees of freedom.
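This relationship is easy to check by simulation. The sketch below (with assumed values n = 3 and unit variances) compares the empirical distribution function of Z with the incomplete gamma function:

n = 3 & m = 100000L
z = total(randomn(seed, n, m)^2, 1)           ; Z for m independent trials
z0 = 2.5
print, n_elements(where(z le z0))/float(m)    ; empirical Pr(Z <= z0)
print, igamma(n/2.0, z0/2.0)                  ; Gamma_P(n/2, z0/2)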
2.4 Conditional probabilities and Bayes' Theorem
If $A$ and $B$ are two events such that the probability of $A$ and $B$ occurring simultaneously is $P(A, B)$, then the conditional probability of $A$ occurring, given that $B$ has occurred, is

$$P(A \mid B) = \frac{P(A, B)}{P(B)}.$$
Bayes' Theorem (named after Rev. Thomas Bayes, an 18th century mathematician who derived a special case) is the basic starting point for inference problems using probability theory as logic. We will use it in the following form. Let $X$ be a random variable describing a pixel intensity, and let $\{C_k \mid k = 1 \ldots M\}$ be a set of possible classes for the pixels. Then the a posteriori conditional probability for class $C_k$, given the measured pixel intensity $x$, is

$$P(C_k \mid x) = \frac{P(x \mid C_k)\, P(C_k)}{P(x)}, \tag{2.3}$$

where

$P(C_k)$ is the prior probability for class $C_k$,

$P(x \mid C_k)$ is the conditional probability of observing the value $x$ if it belongs to class $C_k$,

$P(x) = \sum_{k=1}^{M} P(x \mid C_k) P(C_k)$ is the total probability for $x$.
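As a small numeric illustration of Eq. (2.3) (all values assumed for the example), consider two classes with normal class-conditional densities:

p_c = [0.3, 0.7]                       ; prior probabilities P(C_k)
mu = [100.0, 120.0]                    ; assumed class means
sigma = [10.0, 15.0]                   ; assumed class standard deviations
x = 110.0                              ; observed pixel intensity
p_x_c = exp(-(x - mu)^2/(2*sigma^2))/(sqrt(2*!pi)*sigma)
print, p_x_c*p_c/total(p_x_c*p_c)      ; posterior probabilities P(C_k|x)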
2.5 Linear regression
Applying radiometric corrections to digital images often involves fitting a set of $m$ data points $(x_i, y_i)$ to a straight line:

$$y(x) = a + bx + \epsilon.$$

Suppose that the measurements $y_i$ include a random error $\epsilon$ with variance $\sigma^2$ and that the measurements $x_i$ are exact. Define a "goodness of fit" function

$$\chi^2(a, b) = \sum_{i=1}^{m} \Big(\frac{y_i - a - b x_i}{\sigma}\Big)^2. \tag{2.4}$$
If the random variable is normally distributed, then we obtain the most likely (i.e. best) values for $a$ and $b$ by minimizing this function, that is, by solving the equations

$$\frac{\partial \chi^2}{\partial a} = \frac{\partial \chi^2}{\partial b} = 0.$$

The solution is

$$\hat b = \frac{s_{xy}}{s_{xx}^2}, \qquad \hat a = \bar y - \hat b\, \bar x, \tag{2.5}$$

where

$$s_{xy} = \frac{1}{m} \sum_{i=1}^{m} (x_i - \bar x)(y_i - \bar y), \qquad s_{xx}^2 = \frac{1}{m} \sum_{i=1}^{m} (x_i - \bar x)^2,$$

$$\bar x = \frac{1}{m} \sum_{i=1}^{m} x_i, \qquad \bar y = \frac{1}{m} \sum_{i=1}^{m} y_i.$$
The uncertainties in the estimates $\hat a$ and $\hat b$ are given by

$$\sigma_a^2 = \frac{\sigma^2 \sum x_i^2}{m \sum x_i^2 - (\sum x_i)^2}, \qquad \sigma_b^2 = \frac{\sigma^2\, m}{m \sum x_i^2 - (\sum x_i)^2}. \tag{2.6}$$
If $\sigma^2$ is not known a priori, then it can be estimated by

$$\hat\sigma^2 = \frac{1}{m-2} \sum_{i=1}^{m} (y_i - \hat a - \hat b x_i)^2.$$
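The estimates (2.5) and the variance estimate above are straightforward to implement; the following sketch fits synthetic data with assumed true values a = 1, b = 2:

m = 100
x = findgen(m)/10.0
y = 1.0 + 2.0*x + 0.5*randomn(seed, m)    ; noise with sigma = 0.5
xm = mean(x) & ym = mean(y)
sxy = total((x - xm)*(y - ym))/m
sxx2 = total((x - xm)^2)/m
b_hat = sxy/sxx2                          ; Eq. (2.5)
a_hat = ym - b_hat*xm
sigma2_hat = total((y - a_hat - b_hat*x)^2)/(m - 2)
print, a_hat, b_hat, sigma2_hat           ; approximately 1.0, 2.0, 0.25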
Generalized and orthogonal least squares methods are described in Appendix A. A recursive procedure is described in Appendix C.
Exercises

1. Write the multivariate normal probability density function $p(g)$ for the case $\Sigma = \sigma^2 I$. Show that the probability density function for a one-dimensional random variable $G$ is a special case. Prove that $\langle G \rangle = \mu$.
2. In the Monty Hall game a contestant is asked to choose between one of three doors.
Behind one of the doors is an automobile as prize for choosing the correct door. After
the contestant has chosen, Monty Hall opens one of the other two doors to show that
the automobile is not there. He then asks the contestant if she wishes to change her
mind and choose the other unopened door. Use Bayes’ theorem to prove that her
correct answer is “yes”.
3. Derive the uncertainty for $a$ in (2.6) from the formula for error propagation

$$\sigma_a^2 = \sum_{i=1}^{N} \sigma^2 \Big(\frac{\partial f}{\partial y_i}\Big)^2.$$
Chapter 3
Transformations
Up until now we have thought of multispectral images as (r × c × N)-dimensional arrays
of measured pixel intensities. In the present chapter we consider other representations of
images which are often useful in image analysis.
3.1 Fourier transforms
Figure 3.1: Fourier series approximation of a sawtooth function. The series was truncated at k = ±4. The left hand side shows the intensities $|\hat x(k)|^2$.
A periodic function $x(t)$ with period $T$,

$$x(t) = x(t + T),$$

can always be expressed as the infinite Fourier series

$$x(t) = \sum_{k=-\infty}^{\infty} \hat x(k)\, e^{i 2\pi (k f) t}, \tag{3.1}$$

where $f = 1/T = \omega/2\pi$ and $e^{ix} = \cos x + i \sin x$. From the orthogonality of the e-functions, the coefficients $\hat x(k)$ in the expansion are given by

$$\hat x(k) = f \int_{-1/2f}^{1/2f} x(t)\, e^{-i 2\pi (k f) t}\, dt. \tag{3.2}$$
Figure 3.1 shows an example for the sawtooth function with period $T = 1$:

$$x(t) = t, \quad -1/2 \le t < 1/2.$$

Parseval's formula follows directly from (3.2):

$$\sum_k |\hat x(k)|^2 = f \int_{-1/2f}^{1/2f} (x(t))^2\, dt.$$
3.1.1 Discrete Fourier transform
Let $g(j)$ be a discrete sample of the real function $g(x)$ (a row of pixels), sampled $c$ times at the sampling interval $\Delta$ over a complete period $T$, i.e.

$$g(j) = g(x = j\Delta), \quad j = 0 \ldots c-1.$$

The corresponding discrete Fourier series is written

$$g(j) = \frac{1}{c} \sum_{k=-c/2}^{c/2} \hat g(k)\, e^{i 2\pi (k f)(j\Delta)}, \quad j = 0 \ldots c-1, \tag{3.3}$$

where the truncation frequency $\pm\frac{c}{2}f$ is the highest frequency component that can be determined by the sampling. This frequency is called the Nyquist critical frequency and is given by $1/2\Delta$, so that $f$ is determined by

$$\frac{cf}{2} = \frac{1}{2\Delta} \quad\text{or}\quad f = \frac{1}{c\Delta}.$$
(This corresponds to sampling over one complete period: $c\Delta = T$.) Thus (3.3) becomes

$$g(j) = \frac{1}{c} \sum_{k=-c/2}^{c/2} \hat g(k)\, e^{i 2\pi k j / c}, \quad j = 0 \ldots c-1.$$

With the observation

$$e^{i 2\pi (-c/2) j / c} = e^{-i\pi j} = (-1)^j = e^{i\pi j} = e^{i 2\pi (c/2) j / c},$$

we can write this as

$$g(j) = \frac{1}{c} \sum_{k=-c/2}^{c/2-1} \hat g(k)\, e^{i 2\pi k j / c}, \quad j = 0 \ldots c-1,$$

a set of $c$ equations in the $c$ unknown frequency components $\hat g(k)$. Equivalently,

$$\begin{aligned} g(j) &= \frac{1}{c} \sum_{k=0}^{c/2-1} \hat g(k)\, e^{i 2\pi k j / c} + \frac{1}{c} \sum_{k=-c/2}^{-1} \hat g(k)\, e^{i 2\pi k j / c} \\ &= \frac{1}{c} \sum_{k=0}^{c/2-1} \hat g(k)\, e^{i 2\pi k j / c} + \frac{1}{c} \sum_{k'=c/2}^{c-1} \hat g(k'-c)\, e^{i 2\pi (k'-c) j / c} \\ &= \frac{1}{c} \sum_{k=0}^{c/2-1} \hat g(k)\, e^{i 2\pi k j / c} + \frac{1}{c} \sum_{k=c/2}^{c-1} \hat g(k-c)\, e^{i 2\pi k j / c}. \end{aligned}$$

Thus we can write

$$g(j) = \frac{1}{c} \sum_{k=0}^{c-1} \hat g(k)\, e^{i 2\pi k j / c}, \quad j = 0 \ldots c-1, \tag{3.4}$$

if we interpret $\hat g(k) \to \hat g(k-c)$ when $k \ge c/2$.
The solution to (3.4) for the complex frequency components $\hat g(k)$ is called the discrete Fourier transform and is given by

$$\hat g(k) = \sum_{j=0}^{c-1} g(j)\, e^{-i 2\pi k j / c}, \quad k = 0 \ldots c-1. \tag{3.5}$$

This follows from the orthogonality property

$$\sum_{j=0}^{c-1} e^{i 2\pi (k - k') j / c} = c\, \delta_{k,k'}. \tag{3.6}$$

Eq. (3.4) itself is the discrete inverse Fourier transform. The discrete analog of Parseval's formula is

$$\sum_{k=0}^{c-1} |\hat g(k)|^2 = c \sum_{j=0}^{c-1} g(j)^2. \tag{3.7}$$
Determining the frequency components in (3.5) would appear to involve, in all, $c^2$ floating point multiplication operations. The fast Fourier transform (FFT) exploits the structure of the complex e-functions to reduce this to order $c \log c$, see for example [PFTV86].
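Both (3.5) and the Parseval formula (3.7) can be checked directly with IDL's built-in FFT, bearing in mind that IDL's forward transform includes a factor 1/c:

g = [2.0, 4.0, 6.0, 8.0]
c = n_elements(g)
ghat = c*fft(g)                            ; Eq. (3.5): undo IDL's 1/c factor
print, ghat                                ; (20, -4+4i, -4, -4-4i)
print, total(abs(ghat)^2), c*total(g^2)    ; both sides of (3.7): 480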
3.1.2 Discrete Fourier transform of an image
The discrete Fourier transform is easily generalized to two dimensions for the purpose of image analysis. Let $g(i,j)$, $i, j = 0 \ldots c-1$, represent a (quadratic) gray scale image. Its discrete Fourier transform is

$$\hat g(k, \ell) = \sum_{i=0}^{c-1} \sum_{j=0}^{c-1} g(i,j)\, e^{-i 2\pi (ik + j\ell)/c} \tag{3.8}$$

and the corresponding inverse transform is

$$g(i,j) = \frac{1}{c^2} \sum_{k=0}^{c-1} \sum_{\ell=0}^{c-1} \hat g(k, \ell)\, e^{i 2\pi (ik + j\ell)/c}. \tag{3.9}$$
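IDL's FFT function operates on arrays of any dimension, so (3.8) and (3.9) can each be applied to an image with a single call; a round-trip sketch on stand-in data:

g = randomu(seed, 64, 64)               ; stand-in 64 x 64 gray-scale image
ghat = fft(g)                           ; forward transform, Eq. (3.8) up to 1/c^2
g2 = real_part(fft(ghat, /inverse))     ; inverse transform, Eq. (3.9)
print, max(abs(g - g2))                 ; round-trip error at machine precision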
3.2 Wavelets
Unlike the Fourier transform, which represents a signal (array of pixel intensities) in terms
of pure frequency functions, the wavelet transform expresses the signal in terms of functions
which are restricted both in terms of frequency and spatial extent. In many applications,
this turns out to be particularly efficient and useful. We’ll see an example of this in Chapter
7, where we discuss image fusion in more detail. The wavelet transform is discussed in
Appendix B.
3.3 Principal components
The principal components transformation forms linear combinations of multispectral pixel
intensities which are mutually uncorrelated and which have maximum variance.
We assume without loss of generality that $\langle G \rangle = 0$, so that the covariance matrix of a multispectral image is $\Sigma = \langle G G^\top \rangle$, and look for a linear combination $Y = a^\top G$ with maximum variance, subject to the normalization condition $a^\top a = 1$. Since the variance of $Y$ is $a^\top \Sigma a$, this is equivalent to maximizing an unconstrained Lagrange function, see Section 1.4:

$$L = a^\top \Sigma\, a - 2\lambda (a^\top a - 1).$$
The maximum of $L$ occurs at that value of $a$ for which $\frac{\partial L}{\partial a} = 0$. Recalling the rules for vector differentiation,

$$\frac{\partial L}{\partial a} = 2\Sigma a - 2\lambda a = 0,$$

which is the eigenvalue problem

$$\Sigma a = \lambda a.$$
Since $\Sigma$ is real and symmetric, the eigenvectors are orthogonal (and normalized). Denote them $a_1 \ldots a_N$ for eigenvalues $\lambda_1 \ge \ldots \ge \lambda_N$. Define the matrix

$$A = (a_1 \ldots a_N), \qquad A^\top A = I,$$

and let the transformed principal component vector be $Y = A^\top G$ with covariance matrix $\Sigma'$. Then we have

$$\Sigma' = \langle Y Y^\top \rangle = A^\top \langle G G^\top \rangle A = A^\top \Sigma A = \mathrm{Diag}(\lambda_1 \ldots \lambda_N) = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_N \end{pmatrix} =: \Lambda.$$
The fraction of the total variance in the original multispectral image which is described by the first $i$ principal components is

$$\frac{\lambda_1 + \ldots + \lambda_i}{\lambda_1 + \ldots + \lambda_N}.$$
If the original multispectral channels are highly correlated, as is usually the case, the first few principal components will account for a very high percentage of the variance in the image. For example, a color composite of the first 3 principal components of a LANDSAT TM scene displays essentially all of the information contained in the 6 spectral components in one single image. Nevertheless, because of the approximation involved in the assumption of a normal distribution, higher order principal components may also contain significant information [JRR99].
The principal components transformation can be performed directly from the ENVI main
menu. However the following IDL program illustrates the procedure in detail:
; Principal components analysis
envi_select, title='Choose multispectral image', $
             fid=fid, dims=dims, pos=pos
if (fid eq -1) then return
num_cols = dims[2]+1
num_lines = dims[4]+1
num_pixels = num_cols*num_lines
num_channels = n_elements(pos)
image = fltarr(num_channels, num_pixels)   ; float, since the means are subtracted
for i=0,num_channels-1 do begin
   temp = envi_get_data(fid=fid, dims=dims, pos=pos[i])
   m = mean(temp)
   image[i,*] = temp - m
endfor
; calculate the transformation matrix A
sigma = correlate(image, /covariance, /double)
lambda = eigenql(sigma, eigenvectors=A, /double)
print, 'Covariance matrix'
print, sigma
print, 'Eigenvalues'
print, lambda
print, 'Eigenvectors'
print, A
; transform the image
image = image##transpose(A)
; reform to BSQ format
PC_array = bytarr(num_cols, num_lines, num_channels)
for i = 0,num_channels-1 do PC_array[*,*,i] = $
   reform(image[i,*], num_cols, num_lines, /overwrite)
; output the result to memory
envi_enter_data, PC_array
end
3.4 Minimum noise fraction
Principal components analysis maximizes variance. This doesn't always lead to components of decreasing image quality (i.e. of increasing noise). The MNF transformation minimizes the noise content rather than maximizing variance, so, if this is the desired criterion, it is to be preferred over PCA.

Suppose we can represent a gray scale image $G$ with covariance matrix $\Sigma$ and zero mean as a sum of uncorrelated signal and noise components

$$G = S + N,$$
both normally distributed, with covariance matrices $\Sigma_S$ and $\Sigma_N$ and zero mean. Then we have

$$\Sigma = \langle G G^\top \rangle = \langle (S+N)(S+N)^\top \rangle = \langle S S^\top \rangle + \langle N N^\top \rangle,$$

since noise and signal are uncorrelated, i.e. $\langle S N^\top \rangle = \langle N S^\top \rangle = 0$. Thus

$$\Sigma = \Sigma_S + \Sigma_N. \tag{3.10}$$
Now let us seek a linear combination $a^\top G$ for which the signal to noise ratio

$$\mathrm{SNR} = \frac{\mathrm{var}(a^\top S)}{\mathrm{var}(a^\top N)} = \frac{a^\top \Sigma_S\, a}{a^\top \Sigma_N\, a}$$

is maximized. From (3.10) we can write this in the form

$$\mathrm{SNR} = \frac{a^\top \Sigma\, a}{a^\top \Sigma_N\, a} - 1. \tag{3.11}$$
Differentiating, we get

$$\frac{\partial\, \mathrm{SNR}}{\partial a} = \frac{2}{a^\top \Sigma_N a}\, \Sigma a - \frac{a^\top \Sigma a}{(a^\top \Sigma_N a)^2}\, 2\Sigma_N a = 0,$$

or, equivalently,

$$(a^\top \Sigma_N a)\, \Sigma a = (a^\top \Sigma a)\, \Sigma_N a.$$

This condition is met when $a$ solves the generalized eigenvalue problem

$$\Sigma_N a = \lambda \Sigma a. \tag{3.12}$$
Both $\Sigma_N$ and $\Sigma$ are symmetric and the latter is also positive definite. Its Cholesky factorization is

$$\Sigma = L L^\top,$$

where $L$ is a lower triangular matrix, which can be thought of as the "square root" of $\Sigma$. Such an $L$ always exists if $\Sigma$ is positive definite. With this, we can write (3.12) as

$$\Sigma_N a = \lambda L L^\top a$$

or, equivalently,

$$L^{-1} \Sigma_N (L^\top)^{-1} L^\top a = \lambda L^\top a$$

or, with $b = L^\top a$ and commutativity of inverse and transpose,

$$[L^{-1} \Sigma_N (L^{-1})^\top]\, b = \lambda b,$$

a standard eigenproblem for the real, symmetric matrix $L^{-1} \Sigma_N (L^{-1})^\top$.
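Numerically, the generalized eigenvalues can also be obtained as the eigenvalues of the (nonsymmetric) matrix $\Sigma^{-1}\Sigma_N$; a minimal sketch with made-up 2 × 2 matrices (not data from the text):

sigma   = [[3.0, 1.0], [1.0, 2.0]]      ; assumed covariance matrix
sigma_n = [[0.5, 0.1], [0.1, 0.4]]      ; assumed noise covariance matrix
lambda = la_eigenproblem(invert(sigma) ## sigma_n, eigenvectors = a)
print, lambda                           ; generalized eigenvalues (complex type)
print, a                                ; corresponding eigenvectors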
From (3.11) we see that the SNR for eigenvalue $\lambda_i$ is just

$$\mathrm{SNR}_i = \frac{a_i^\top \Sigma\, a_i}{a_i^\top (\lambda_i \Sigma\, a_i)} - 1 = \frac{1}{\lambda_i} - 1.$$

Thus the eigenvector $a_i$ corresponding to the smallest eigenvalue $\lambda_i$ will maximize the signal to noise ratio. Note that (3.12) can be written in the form

$$\Sigma_N A = \Sigma A \Lambda, \tag{3.13}$$

where $A = (a_1 \ldots a_N)$ and $\Lambda = \mathrm{Diag}(\lambda_1 \ldots \lambda_N)$.
The MNF transformation is available in the ENVI environment. It is carried out in two steps which are equivalent to the above. First of all, the noise contribution to $G$ is "whitened", i.e. transformed so that the random vector $N$ has covariance matrix $I$, the identity matrix. Since $\Sigma_N$ can be assumed to be diagonal anyway (the noise in any band is uncorrelated with the noise in any other band), we accomplish this by doing a transformation which divides the components of $G$ by the standard deviations of the noise,

$$X = \Sigma_N^{-1/2} G,$$

where

$$\Sigma_N^{-1/2} \Sigma_N \Sigma_N^{-1/2} = I.$$

The transformed random vector $X$ thus has covariance matrix

$$\Sigma_X = \Sigma_N^{-1/2} \Sigma\, \Sigma_N^{-1/2}. \tag{3.14}$$
Next we do an ordinary principal components transformation on $X$, i.e.

$$Y = B^\top X,$$

where

$$B^\top \Sigma_X B = \Lambda_X, \qquad B^\top B = I. \tag{3.15}$$

The overall transformation is thus

$$Y = B^\top \Sigma_N^{-1/2} G = A^\top G,$$

where $A = \Sigma_N^{-1/2} B$ is not an orthogonal transformation. To see that this transformation is equivalent to solving the generalized eigenvalue problem, consider

$$\begin{aligned} \Sigma_N A &= \Sigma_N \Sigma_N^{-1/2} B \\ &= \Sigma_N^{1/2} \Sigma_X B \Lambda_X^{-1} \\ &= \Sigma_N^{1/2} \Sigma_N^{-1/2} \Sigma\, \Sigma_N^{-1/2} B \Lambda_X^{-1} \\ &= \Sigma A \Lambda_X^{-1}. \end{aligned}$$

This is equivalent to (3.13) with

$$\lambda_{Xi} = \frac{1}{\lambda_i} = \mathrm{SNR}_i + 1.$$
Thus an eigenvalue in the second transformation equal to one corresponds to “pure noise”.
Before the transformation can be performed, it is of course necessary to estimate the noise covariance matrix $\Sigma_N$. This can be done for example by differencing with respect to the local mean:

$$(\Sigma_N)_{k\ell} \approx \frac{1}{cr} \sum_{i,j}^{c,r} (g_k(i,j) - m_k(i,j))(g_\ell(i,j) - m_\ell(i,j)),$$

where $m_k(i,j)$ is the local mean of pixels in some neighborhood of $(i,j)$.
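A crude version of this estimate (a sketch, with a 3 × 3 boxcar average standing in for the local mean, on a stand-in image band) can be written as:

; estimate the noise in band k as the difference from a 3x3 local mean
band_k = randomu(seed, 100, 100)               ; stand-in image band
noise_k = band_k - smooth(band_k, 3, /edge_truncate)
; the matrix element (Sigma_N)_kl is the pixel average of noise_k*noise_l;
; for the diagonal element:
print, mean(noise_k^2)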
3.5 Maximum autocorrelation factor (MAF)
Let $x$ represent the coordinates of a pixel within image $G$, i.e. $x = (i,j)$. We consider the covariance matrix $\Gamma$ between the original image, represented by $G(x)$, and the same image $G(x+\Delta)$ shifted by an amount $\Delta = (\Delta_x, \Delta_y)$:

$$\Gamma(\Delta) = \langle G(x)\, G(x+\Delta)^\top \rangle,$$

assumed to be independent of $x$. Then

$$\Gamma(0) = \Sigma,$$

and furthermore

$$\Gamma(-\Delta) = \langle G(x)\, G(x-\Delta)^\top \rangle = \langle G(x+\Delta)\, G(x)^\top \rangle = \langle G(x)\, G(x+\Delta)^\top \rangle^\top = \Gamma(\Delta)^\top.$$
Now we consider the covariance of projections of the original and shifted images:

$$\begin{aligned} \mathrm{cov}(a^\top G(x),\, a^\top G(x+\Delta)) &= a^\top \langle G(x)\, G(x+\Delta)^\top \rangle\, a \\ &= a^\top \Gamma(\Delta)\, a = a^\top \Gamma(-\Delta)\, a \\ &= \frac{1}{2}\, a^\top \big(\Gamma(\Delta) + \Gamma(-\Delta)\big)\, a. \end{aligned} \tag{3.16}$$
Define $\Sigma_\Delta$ as the covariance matrix of the difference image $G(x) - G(x+\Delta)$, i.e.

$$\begin{aligned} \Sigma_\Delta &= \langle (G(x) - G(x+\Delta))(G(x) - G(x+\Delta))^\top \rangle \\ &= \langle G(x)G(x)^\top \rangle + \langle G(x+\Delta)G(x+\Delta)^\top \rangle - \langle G(x)G(x+\Delta)^\top \rangle - \langle G(x+\Delta)G(x)^\top \rangle \\ &= 2\Sigma - \Gamma(\Delta) - \Gamma(-\Delta). \end{aligned}$$

Hence $\Gamma(\Delta) + \Gamma(-\Delta) = 2\Sigma - \Sigma_\Delta$ and we can write (3.16) in the form

$$\mathrm{cov}(a^\top G(x),\, a^\top G(x+\Delta)) = a^\top \Sigma\, a - \frac{1}{2}\, a^\top \Sigma_\Delta\, a.$$
The correlation of the projections is therefore given by

$$\mathrm{corr}(a^\top G(x),\, a^\top G(x+\Delta)) = \frac{a^\top \Sigma a - \frac{1}{2} a^\top \Sigma_\Delta a}{\sqrt{\mathrm{var}(a^\top G(x))\,\mathrm{var}(a^\top G(x+\Delta))}} = \frac{a^\top \Sigma a - \frac{1}{2} a^\top \Sigma_\Delta a}{\sqrt{(a^\top \Sigma a)(a^\top \Sigma a)}} = 1 - \frac{1}{2}\, \frac{a^\top \Sigma_\Delta a}{a^\top \Sigma a}. \tag{3.17}$$
We want to determine the vector $a$ which extremalizes this correlation, so we wish to extremalize the function

$$R(a) = \frac{a^\top \Sigma_\Delta\, a}{a^\top \Sigma\, a}.$$
Differentiating,

$$\frac{\partial R}{\partial a} = \frac{2}{a^\top \Sigma a}\, \Sigma_\Delta a - \frac{a^\top \Sigma_\Delta a}{(a^\top \Sigma a)^2}\, 2\Sigma a = 0,$$

or

$$(a^\top \Sigma a)\, \Sigma_\Delta a = (a^\top \Sigma_\Delta a)\, \Sigma a.$$
This condition is met when $a$ solves the generalized eigenvalue problem

$$\Sigma_\Delta a = \lambda \Sigma a, \tag{3.18}$$

which is seen to have the same form as (3.12). Again both $\Sigma_\Delta$ and $\Sigma$ are symmetric and the latter is also positive definite, and we obtain the standard eigenproblem

$$[L^{-1} \Sigma_\Delta (L^{-1})^\top]\, b = \lambda b$$

for the real, symmetric matrix $L^{-1} \Sigma_\Delta (L^{-1})^\top$.
Let the eigenvalues be $\lambda_1 \ge \ldots \ge \lambda_N$ and the corresponding (orthogonal) eigenvectors be $b_i$. We have

$$0 = b_i^\top b_j = a_i^\top L L^\top a_j = a_i^\top \Sigma\, a_j, \quad i \ne j,$$

and therefore

$$\mathrm{cov}(a_i^\top G(x),\, a_j^\top G(x)) = a_i^\top \Sigma\, a_j = 0, \quad i \ne j,$$

so that the MAF components are orthogonal (uncorrelated). Moreover, with equation (3.17) we have

$$\mathrm{corr}(a_i^\top G(x),\, a_i^\top G(x+\Delta)) = 1 - \frac{1}{2}\lambda_i,$$

and the first MAF component has minimum autocorrelation.
An ENVI plug-in for performing the MAF transformation is given in Appendix D.5.2.
Exercises

1. Show that, for $x(t) = \sin(2\pi t)$ in Eq. (3.2),

$$\hat x(-1) = -\frac{1}{2i}, \qquad \hat x(1) = \frac{1}{2i},$$

and $\hat x(k) = 0$ otherwise.

2. Calculate the discrete Fourier transform of the sequence 2, 4, 6, 8 from (3.4). You have to solve four simultaneous equations, the first of which is

$$2 = \frac{1}{4}\big(\hat g(0) + \hat g(1) + \hat g(2) + \hat g(3)\big).$$

Verify your result in IDL with the command

print, FFT([2,4,6,8])
Chapter 4
Radiometric enhancement
4.1 Lookup tables
Figure 4.1: Contrast enhancement with a lookup table represented as the continuous function
f(x) [JRR99].
Intensity enhancement of an image is easily accomplished by means of lookup tables. For byte-encoded data, the pixel intensities $g$ are used to index an array

$$\mathrm{LUT}[k], \quad k = 0 \ldots 255,$$

the entries of which also lie between 0 and 255. These entries can be chosen to implement linear stretching, saturation, histogram equalization, etc. according to

$$\hat g_k(i,j) = \mathrm{LUT}[g_k(i,j)], \quad 0 \le i \le r-1,\ 0 \le j \le c-1.$$
It is also useful to think of the lookup table as an approximately continuous function $y = f(x)$. If $h_{in}(x)$ is the histogram of the original image and $h_{out}(y)$ is the histogram of the image after transformation through the lookup table, then, since the number of pixels is constant,

$$h_{out}(y)\, dy = h_{in}(x)\, dx,$$

see Fig. 4.1.
4.1.1 Histogram equalization
For histogram equalization, we want $h_{out}(y)$ to be constant, independent of $y$. Hence

$$dy \sim h_{in}(x)\, dx$$

and

$$y = f(x) \sim \int_0^x h_{in}(t)\, dt.$$

The lookup table $y$ for histogram equalization is thus proportional to the cumulative sum of the original histogram.
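In IDL the equalization lookup table is only a few lines, since the cumulative sum of the histogram is available through the /cumulative keyword of total; a sketch for a byte-encoded image (stand-in data assumed):

image = byte(randomu(seed, 200, 200)*255)    ; stand-in byte image
h = histogram(image, min=0, max=255)         ; h_in
cdf = total(h, /cumulative)                  ; cumulative sum of the histogram
lut = byte(255*cdf/cdf[255])                 ; normalize to 0...255
equalized = lut[image]                       ; apply the lookup table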
4.1.2 Histogram matching
Figure 4.2: Steps required for histogram matching [JRR99].
It is often desirable to match the histogram of one image to that of another so as to make their apparent brightnesses as similar as possible, for example when the two images are combined in a mosaic. We can do this by first equalizing both the input histogram $h_{in}(x)$ and the reference histogram $h_{ref}(y)$ with the cumulative lookup tables $z = f(x)$ and $z = g(y)$, respectively. The required lookup table is then

$$y = g^{-1}(z) = g^{-1}(f(x)).$$

The necessary steps for implementing this function are illustrated in Fig. 4.2, taken from [JRR99].
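A sketch of this procedure in IDL (with stand-in images; the inversion of z = g(y) is done numerically with value_locate):

image_in  = byte(randomu(seed, 200, 200)*255)      ; stand-in input image
image_ref = bytscl(randomn(seed, 200, 200))        ; stand-in reference image
f = total(histogram(image_in, min=0, max=255), /cumulative)
f = f/f[255]                                       ; z = f(x)
g = total(histogram(image_ref, min=0, max=255), /cumulative)
g = g/g[255]                                       ; z = g(y)
lut = bytarr(256)
for x=0,255 do lut[x] = value_locate(g, f[x]) > 0  ; y = g^{-1}(f(x))
matched = lut[image_in]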
4.2 Convolutions
With the convention

$$\omega = 2\pi k / c,$$

we can write (3.5) in the form

$$\hat g(\omega) = \sum_{j=0}^{c-1} g(j)\, e^{-i\omega j}. \tag{4.1}$$
The convolution of $g$ with a filter $h = (h(0), h(1), \ldots)$ is defined by

$$f(j) = \sum_k h(k)\, g(j-k) =: h * g, \tag{4.2}$$

where the sum is over all nonzero elements of the filter $h$. If the number of nonzero elements is finite, we speak of a finite impulse response (FIR) filter.
Theorem 1 (Convolution theorem) In the frequency domain, convolution is replaced by multiplication: $\hat f(\omega) = \hat h(\omega)\hat g(\omega)$.

Proof:

$$\hat f(\omega) = \sum_j f(j)\, e^{-i\omega j} = \sum_{j,k} h(k)\, g(j-k)\, e^{-i\omega j}$$

$$\hat h(\omega)\hat g(\omega) = \sum_k h(k)\, e^{-i\omega k} \sum_\ell g(\ell)\, e^{-i\omega\ell} = \sum_{k,\ell} h(k)\, g(\ell)\, e^{-i\omega(k+\ell)} = \sum_{k,j} h(k)\, g(j-k)\, e^{-i\omega j} = \hat f(\omega).$$

This can of course be generalized to two-dimensional images, so that there are three basic steps involved in image filtering:

1. The image and the convolution filter are transformed from the spatial domain to the frequency domain using the FFT.
2. The transformed image is multiplied with the frequency filter.
3. The filtered image is transformed back to the spatial domain.
We often distinguish between low-pass and high-pass filters. Low-pass filters perform some sort of averaging. The simplest example is

$$h = (1/2,\ 1/2,\ 0 \ldots),$$

which computes the average of two consecutive pixels. A high-pass filter computes differences of nearby pixels, e.g.

$$h = (1/2,\ -1/2,\ 0 \ldots).$$

Figure 4.3 shows the Fourier transforms of these two simple filters, generated by the IDL program

; Hi-Lo pass filters
x = fltarr(64)
x[0] = 0.5
x[1] = -0.5
p1 = abs(FFT(x))
x[1] = 0.5
p2 = abs(FFT(x))
envi_plot_data, lindgen(64), [[p1],[p2]]
end
Figure 4.3: Low-pass (red) and high-pass (white) filters in the frequency domain. The quantity $|\hat h(k)|^2$ is plotted as a function of $k$. The highest frequency is at the center of the plots, $k = c/2 = 32$.
4.2.1 Laplacian of Gaussian filter
We shall illustrate image filtering with the so-called Laplacian of Gaussian (LoG) filter, which will be used in Chapter 6 to implement contour matching for automatic determination of ground control points. To begin with, consider the gradient operator for a two-dimensional image:

$$\nabla = \frac{\partial}{\partial x} = \mathbf i\, \frac{\partial}{\partial x_1} + \mathbf j\, \frac{\partial}{\partial x_2},$$
where $\mathbf i$ and $\mathbf j$ are unit vectors in the vertical and horizontal directions, respectively. $\nabla g(x)$ is a vector in the direction of the maximum rate of change of gray scale intensity. Since the intensity values are discrete, the partial derivatives must be approximated. For example, we can use the Sobel operators:

$$\begin{aligned} \frac{\partial g(x)}{\partial x_1} &\approx [g(i-1,j-1) + 2g(i,j-1) + g(i+1,j-1)] \\ &\quad - [g(i-1,j+1) + 2g(i,j+1) + g(i+1,j+1)] = \nabla_2(i,j) \\ \frac{\partial g(x)}{\partial x_2} &\approx [g(i-1,j-1) + 2g(i-1,j) + g(i-1,j+1)] \\ &\quad - [g(i+1,j-1) + 2g(i+1,j) + g(i+1,j+1)] = \nabla_1(i,j), \end{aligned}$$

which are equivalent to the two-dimensional FIR filters

$$h_1 = \begin{pmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{pmatrix} \quad\text{and}\quad h_2 = \begin{pmatrix} 1 & 2 & 1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{pmatrix},$$

respectively. The magnitude of the gradient is

$$|\nabla| = \sqrt{\nabla_1^2 + \nabla_2^2}.$$

Edge detection can be achieved by calculating the filtered image

$$f(i,j) = |\nabla|(i,j)$$

and setting an appropriate threshold.
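Since the Sobel operators are small FIR filters, they can be applied directly in the spatial domain with IDL's convol function; a sketch on a stand-in image with an arbitrarily chosen threshold:

h1 = [[-1,0,1],[-2,0,2],[-1,0,1]]          ; the two Sobel kernels
h2 = [[1,2,1],[0,0,0],[-1,-2,-1]]
g = float(dist(200))                       ; stand-in gray scale image
d1 = convol(g, h1, /edge_truncate)
d2 = convol(g, h2, /edge_truncate)
mag = sqrt(d1^2 + d2^2)                    ; gradient magnitude
edges = mag gt 0.5*max(mag)                ; threshold (chosen arbitrarily)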
Figure 4.4: Laplacian of Gaussian filter.
Now consider the second derivatives of the image intensities, which can be represented formally by the Laplacian

$$\nabla^2 = \nabla\cdot\nabla = \frac{\partial^2}{\partial x_1^2} + \frac{\partial^2}{\partial x_2^2}.$$

$\nabla^2 g(x)$ is a scalar quantity which is zero whenever the gradient is maximum. Therefore changes in intensity from dark to light or vice versa correspond to sign changes in the Laplacian, and these can also be used for edge detection. The Laplacian can also be approximated by a FIR filter; however, such filters tend to be very sensitive to image noise. Usually a low-pass Gauss filter is first used to smooth the image before the Laplacian filter is applied. It is more efficient, however, to calculate the Laplacian of the Gauss function itself and then use the resulting function to derive a high-pass filter. The Gauss function in two dimensions is given by

$$\frac{1}{2\pi\sigma^2} \exp\Big(-\frac{1}{2\sigma^2}(x_1^2 + x_2^2)\Big),$$

where the parameter $\sigma$ determines its extent. Its Laplacian is

$$\frac{1}{2\pi\sigma^6}\,(x_1^2 + x_2^2 - 2\sigma^2)\, \exp\Big(-\frac{1}{2\sigma^2}(x_1^2 + x_2^2)\Big),$$

a plot of which is shown in Fig. 4.4.
The following program illustrates the application of the filter to a gray scale image, see Fig. 4.5:

pro LoG
   sigma = 2.0
   filter = fltarr(17,17)
   for i=0L,16 do for j=0L,16 do $
      filter[i,j] = (1/(2*!pi*sigma^6))*((i-8)^2+(j-8)^2-2*sigma^2) $
                    *exp(-((i-8)^2+(j-8)^2)/(2*sigma^2))
; output as EPS file
   thisDevice = !D.Name
   set_plot, 'PS'
   device, filename='c:\temp\LoG.eps', xsize=4, ysize=4, /inches, /encapsulated
   shade_surf, filter
   device, /close_file
   set_plot, thisDevice
; read a jpg image
   filename = Dialog_Pickfile(Filter='*.jpg', /Read)
   OK = Query_JPEG(filename, fileinfo)
   if not OK then return
   xsize = fileinfo.dimensions[0]
   ysize = fileinfo.dimensions[1]
   window, 11, xsize=xsize, ysize=ysize
   Read_JPEG, filename, image1
   image = bytarr(xsize, ysize)
   image[*,*] = image1[0,*,*]
   tvscl, image
; run the filter
   filt = image*0.0
   filt[0:16,0:16] = filter[*,*]
   image1 = float(fft(fft(image)*fft(filt), 1))
; get zero-crossings and display
   image2 = bytarr(xsize, ysize)
   indices = where( (image1*shift(image1,1,0) lt 0) or (image1*shift(image1,0,1) lt 0) )
   image2[indices] = 255
   wset, 11
   tv, image2
end
Figure 4.5: Image filtered with the Laplacian of Gaussian filter.
Chapter 5
Topographic modelling
Satellite images are two-dimensional representations of the three-dimensional earth surface.
The correct treatment of the third dimension – the elevation – is essential for terrain mod-
elling and accurate georeferencing.
5.1 RST transformation
Transformations of spatial coordinates¹ in 3 dimensions which involve only rotations, scaling and translations can be represented by a 4 × 4 transformation matrix $A$:

$$v^* = A v, \tag{5.1}$$

where $v$ is the column vector containing the original coordinates,

$$v = (X, Y, Z, 1)^\top,$$

and $v^*$ contains the transformed coordinates,

$$v^* = (X^*, Y^*, Z^*, 1)^\top.$$

¹The following treatment closely follows Chapter 2 of Gonzalez and Woods [GW02].
For example, the translation

$$X^* = X + X_0, \qquad Y^* = Y + Y_0, \qquad Z^* = Z + Z_0$$

corresponds to the transformation matrix

$$T = \begin{pmatrix} 1 & 0 & 0 & X_0 \\ 0 & 1 & 0 & Y_0 \\ 0 & 0 & 1 & Z_0 \\ 0 & 0 & 0 & 1 \end{pmatrix},$$

a uniform scaling by 50% to

$$S = \begin{pmatrix} 1/2 & 0 & 0 & 0 \\ 0 & 1/2 & 0 & 0 \\ 0 & 0 & 1/2 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix},$$
and a simple rotation θ about the Z-axis to

$$R_\theta = \begin{pmatrix} \cos\theta & \sin\theta & 0 & 0 \\ -\sin\theta & \cos\theta & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix},$$

etc. The complete RST transformation is then

$$v^* = R\, S\, T\, v = A v. \tag{5.2}$$

The inverse transformation is of course represented by $A^{-1}$.
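The composition (5.2) is conveniently built up with IDL's matrix operator ##; a sketch with assumed numerical values for the rotation angle, scale and translation:

theta = 30*!dtor                           ; assumed rotation angle
R = [[ cos(theta), sin(theta), 0, 0], $
     [-sin(theta), cos(theta), 0, 0], $
     [ 0, 0, 1, 0], $
     [ 0, 0, 0, 1]]
S = diag_matrix([0.5, 0.5, 0.5, 1.0])      ; uniform scaling by 50%
T = identity(4)
T[3,0:2] = [10.0, 20.0, 5.0]               ; assumed translation (X0, Y0, Z0)
A = R ## S ## T
v = [[1.0], [2.0], [3.0], [1.0]]           ; homogeneous coordinates (X,Y,Z,1)
print, A ## v                              ; transformed coordinates v* = A v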
5.2 Imaging transformations
An imaging (or perspective) transformation projects 3D points onto a plane. It is used to
describe the formation of a camera image and, unlike the RST transformation, is non-linear
since it involves division by coordinate values.
Figure 5.1: Basic imaging process, from [GW02].
In Figure 5.1, the camera coordinate system $(x, y, z)$ is aligned with the world coordinate system describing the terrain to be imaged. The camera focal length is $\lambda$. From simple geometry we obtain expressions for the image plane coordinates in terms of the world coordinates:

$$x = \frac{\lambda X}{\lambda - Z}, \qquad y = \frac{\lambda Y}{\lambda - Z}. \tag{5.3}$$

Solving for the X and Y world coordinates:

$$X = \frac{x}{\lambda}(\lambda - Z), \qquad Y = \frac{y}{\lambda}(\lambda - Z). \tag{5.4}$$
Thus, in order to extract the geographical coordinates (X, Y ) of a point on the earth’s
surface from its image coordinates, we require knowledge of the elevation Z. Correcting for
the elevation in this way constitutes the process of orthorectification.
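A small numeric illustration (all values assumed) of Eqs. (5.3) and (5.4):

lambda = 1.0                     ; focal length
X = 100.0 & Y = 50.0 & Z = 0.2   ; world coordinates with elevation Z
x = lambda*X/(lambda - Z)        ; image plane coordinates, Eq. (5.3)
y = lambda*Y/(lambda - Z)
; given the elevation, Eq. (5.4) recovers the world coordinates
print, x*(lambda - Z)/lambda, y*(lambda - Z)/lambda   ; 100.0  50.0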
5.3 Camera models and RFM approximations
Equation (5.3) is overly simplified, as it assumes that the origin of world and image coordi-
nates coincide. In order to apply it, one has first to transform the image coordinate system
from the satellite to the world coordinate system. This is done in a straightforward way
with the rotation and translation transformations introduced in Section 5.1. However it
requires accurate knowledge of the height and orientation of the satellite imaging system at
the time of the image acquisition (or, more exactly, during the acquisition, since the latter
is normally not instantaneous). The resulting non-linear equations that relate image and
world coordinates are what constitute the camera or sensor model for that particular image.
Direct use of the camera model for image processing is complicated as it requires ex-
tremely exact, sometimes proprietary information about the sensor system and its orbit.
An alternative exists if the image provider also supplies a so-called rational function model
(RFM) which approximates the camera model for each acquisition as a ratio of rational
polynomials, see e.g. [TH01]. Such RFMs have the form
\[
r' = f(X', Y', Z') = \frac{a(X', Y', Z')}{b(X', Y', Z')}, \qquad
c' = g(X', Y', Z') = \frac{c(X', Y', Z')}{d(X', Y', Z')}, \qquad (5.5)
\]
where $c'$ and $r'$ are the column and row (XY) coordinates in the image plane relative to an origin $(c_0, r_0)$ and scaled by factors $c_s$ and $r_s$, respectively:
\[
c' = \frac{c - c_0}{c_s}, \qquad r' = \frac{r - r_0}{r_s}.
\]
Similarly $X'$, $Y'$ and $Z'$ are relative, scaled world coordinates:
\[
X' = \frac{X - X_0}{X_s}, \qquad Y' = \frac{Y - Y_0}{Y_s}, \qquad Z' = \frac{Z - Z_0}{Z_s}.
\]
The polynomials a, b, c and d are typically of third order in the world coordinates, e.g.
\[
\begin{aligned}
a(X, Y, Z) = {}& a_0 + a_1 X + a_2 Y + a_3 Z + a_4 XY + a_5 XZ + a_6 YZ + a_7 X^2 + a_8 Y^2 + a_9 Z^2\\
& + a_{10} XYZ + a_{11} X^3 + a_{12} XY^2 + a_{13} XZ^2 + a_{14} X^2 Y + a_{15} Y^3 + a_{16} YZ^2\\
& + a_{17} X^2 Z + a_{18} Y^2 Z + a_{19} Z^3.
\end{aligned}
\]
The advantage of using ratios of polynomials is that these are less subject to interpolation error.
For a given acquisition the provider fits the RFM to its camera model with a least squares procedure, using a three-dimensional grid of points covering the image and world spaces. The RFM is capable of representing the camera model extremely well and can be used as a replacement for it. Both Space Imaging and Digital Globe provide RFMs with their high resolution IKONOS and QuickBird imagery. Below is a sample QuickBird RFM file giving the origins, scaling factors and polynomial coefficients needed in Eq. (5.5).
satId = QB02;
bandId = P;
SpecId = RPC00B;
BEGIN_GROUP = IMAGE
errBias = 56.01;
errRand = 0.12;
lineOffset = 4683;
sampOffset = 4154;
latOffset = 32.5709;
longOffset = 51.8391;
heightOffset = 1582;
lineScale = 4733;
sampScale = 4399;
latScale = 0.0256;
longScale = 0.0269;
heightScale = 500;
lineNumCoef = (
+1.162844E-03,
-7.011681E-03,
-9.993482E-01,
-1.119999E-02,
-6.682911E-06,
+7.591306E-05,
+3.632740E-04,
-1.111298E-04,
-5.842086E-04,
+2.212466E-06,
-1.275349E-06,
+1.279061E-06,
+1.918762E-08,
-6.957548E-07,
-1.240783E-06,
-7.644403E-07,
+3.479752E-07,
+1.259300E-05,
+1.085128E-06,
-1.571375E-06);
lineDenCoef = (
+1.000000E+00,
+1.801541E-06,
+5.822024E-04,
+3.774278E-04,
-2.141015E-08,
-6.984359E-07,
-1.344888E-06,
-9.669251E-07,
-4.726988E-08,
+1.329814E-06,
+2.113403E-08,
-2.914653E-06,
-4.367422E-07,
+6.988065E-07,
+4.035593E-07,
+3.275453E-07,
-2.740827E-07,
-4.147675E-06,
-1.074015E-06,
+2.218804E-06);
sampNumCoef = (
-9.783496E-04,
+9.566915E-01,
-8.477919E-03,
-5.393803E-02,
-1.590864E-04,
+5.477412E-04,
-3.968308E-04,
+4.819512E-04,
-3.965558E-06,
-3.442885E-05,
+5.821180E-08,
+2.952683E-08,
-1.363146E-07,
+2.454422E-07,
+1.372698E-07,
+1.987710E-07,
-3.167074E-07,
-1.038018E-06,
+1.376092E-07,
-2.352636E-07);
sampDenCoef = (
+1.000000E+00,
+5.029785E-04,
+1.225257E-04,
-5.780883E-04,
-1.543054E-07,
+1.240426E-06,
-1.830526E-07,
+3.264812E-07,
-1.255831E-08,
-5.177631E-07,
-5.868514E-07,
-9.029287E-07,
+7.692317E-08,
+1.289335E-07,
-3.649242E-07,
+0.000000E+00,
+1.229000E-07,
-1.290467E-05,
+4.318970E-08,
-8.391348E-08);
END_GROUP = IMAGE
END;
To illustrate a simple use of the RFM data, consider a vertical structure in a high-resolution image, such as a chimney or a building facade. Suppose we determine the image coordinates of the bottom and top of the structure to be $(r_b, c_b)$ and $(r_t, c_t)$, respectively. Then from (5.5),
\[
r_b = f(X, Y, Z_b), \quad c_b = g(X, Y, Z_b), \quad r_t = f(X, Y, Z_t), \quad c_t = g(X, Y, Z_t), \qquad (5.6)
\]
since the (X, Y) coordinates must be the same. This would appear to constitute a set of four equations in the four unknowns $X$, $Y$, $Z_b$ and $Z_t$; however, the solution is unstable because $Z_t$ is very close to $Z_b$. Nevertheless the object height $Z_t - Z_b$ can be obtained by the following procedure:
1. Get (rb, cb) and (rt, ct) from the image.
2. Solve the first two equations in (5.6) (e.g. with Newton's method) for X and Y, with $Z_b$ set equal to the average elevation in the scene if no DEM is available, otherwise to the true elevation.
3. For a spanning range of $Z_t$ values, calculate $(r_t, c_t)$ from the second two equations in (5.6) and choose the value of $Z_t$ which gives the closest agreement with the image coordinates read in.
Quite generally, the RFM can approximate the camera model very well and can be used
as an alternative for providing end users with the necessary information to perform their
own photogrammetric processing. An ENVI plug-in for object height determination
from RFM data is given in Appendix D.2.1.
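For orientation, here is a minimal IDL sketch of how the forward model (5.5) might be evaluated; the function names are hypothetical, and the 20-element coefficient vectors are assumed to be ordered exactly as in the cubic polynomial written out above:

function rpc_poly, c, X, Y, Z
; evaluate the cubic RPC polynomial with coefficient vector c
return, c[0] + c[1]*X + c[2]*Y + c[3]*Z + c[4]*X*Y $
      + c[5]*X*Z + c[6]*Y*Z + c[7]*X^2 + c[8]*Y^2 + c[9]*Z^2 $
      + c[10]*X*Y*Z + c[11]*X^3 + c[12]*X*Y^2 + c[13]*X*Z^2 $
      + c[14]*X^2*Y + c[15]*Y^3 + c[16]*Y*Z^2 + c[17]*X^2*Z $
      + c[18]*Y^2*Z + c[19]*Z^3
end

function rfm_forward, num_r, den_r, num_c, den_c, X, Y, Z
; scaled row and column coordinates, Eq. (5.5)
r = rpc_poly(num_r,X,Y,Z)/rpc_poly(den_r,X,Y,Z)
c = rpc_poly(num_c,X,Y,Z)/rpc_poly(den_c,X,Y,Z)
return, [r, c]
end

The unscaled image coordinates then follow by inverting the offset/scale relations, e.g. $r = r_s r' + r_0$.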
5.4 Stereo imaging, elevation models and orthorectification
The missing elevation information Z in (5.3) or in (5.5) can be obtained with stereoscopic
imaging techniques. Figure 5.2 shows two cameras viewing the same world point w from
two positions. The separation of the lens centers is the baseline. The objective is to find
the coordinates (X, Y, Z) of w if its image points have coordinates (x1, y1) and (x2, y2). We
assume that the cameras are identical and that their image coordinate systems are perfectly
aligned, differing only in the location of their origins. The Z coordinate of w is the same for
both coordinate systems.
In Figure 5.3 the first camera is brought into coincidence with the world coordinate
system. Then from (5.4),
\[
X_1 = \frac{x_1}{\lambda}(\lambda - Z).
\]
Alternatively, if the second camera is brought to the origin of the world coordinate system,
\[
X_2 = \frac{x_2}{\lambda}(\lambda - Z).
\]
Figure 5.2: The stereo imaging process, from [GW02].
Figure 5.3: Top view of Figure 5.2, from [GW02].
But, from the figures,
X2 = X1 + B,
where B is the baseline. We have from the above three equations:
\[
Z = \lambda - \frac{\lambda B}{x_2 - x_1}. \qquad (5.7)
\]
Thus if the displacement of the image coordinates of the point w, namely $x_2 - x_1$, can be determined, the Z coordinate can be calculated. The task is then to find two corresponding points in different images of the same scene. This is usually accomplished by spatial
correlation techniques and is closely related to the problem of image-to-image registration
discussed in the next chapter.
Figure 5.4: ASTER stereo acquisition geometry.
Because the stereo images must be correlated, best results are obtained if they are acquired within a very short time of each other, preferably “along track” if a single platform is used, see Figure 5.4. This figure shows the orientation and imaging geometry of the VNIR 3N and 3B cameras on the ASTER platform for acquiring a stereo full scene. The satellite travels at a speed of 6.7 km/sec at a height of 705 km. A 60 × 60 km² full scene is scanned in 9 seconds. 55 seconds later the same scene is scanned by the back-looking camera, corresponding to a baseline of 370 km. The along-track geometry means that the stereo pair is unipolar, that is, the displacements due to viewing angle are only along the y axis in the imaging plane. Therefore the spatial correlation algorithm used to match points can be one-dimensional. If carried out on a pixel-by-pixel basis, one obtains a digital elevation model (DEM).
Figure 5.5: ASTER 3N nadir camera image.
Figure 5.6: ASTER 3B back-looking camera image.
As an example, Figures 5.5 and 5.6 show an ASTER stereo pair. Both images have been
rotated so as to make them unipolar.
The following IDL program calculates a very rudimentary DEM:
pro test_correl_images
height = 705.0     ; satellite height in km
base = 370.0       ; stereo baseline in km
pixel_size = 15.0  ; ground pixel size in meters
envi_select, title='Choose 1st image', fid=fid1, dims=dims1, pos=pos1, /band_only
envi_select, title='Choose 2nd image', fid=fid2, dims=dims2, pos=pos2, /band_only
im1 = envi_get_data(fid=fid1,dims=dims1,pos=pos1)
im2 = envi_get_data(fid=fid2,dims=dims2,pos=pos2)
n_cols = dims1[2]-dims1[1]+1
n_rows = dims1[4]-dims1[3]+1
parallax = fltarr(n_cols,n_rows)
progressbar = Obj_New('progressbar', Color='blue', Text='0',$
   title='Cross correlation, column ...',xsize=250,ysize=20)
progressbar->Start
for i=7L,n_cols-8 do begin
   if progressbar->CheckCancel() then begin
      envi_enter_data,pixel_size*parallax*(height/base)
      progressbar->Destroy
      return
   endif
   progressbar->Update,(i*100)/n_cols,text=strtrim(i,2)
   for j=25L,n_rows-26 do begin
; correlate an 11x11 nadir window with a strip in the back-looking image
      cim = correl_images(im1[i-5:i+5,j-5:j+5],im2[i-7:i+7,j-25:j+25], $
                          xoffset_b=0,yoffset_b=-20,xshift=0,yshift=20)
      corrmat_analyze,cim,xoff,yoff,m,e,p
      parallax[i,j] = yoff < (-5.0)  ; clip with IDL's minimum operator
   endfor
endfor
progressbar->Destroy
envi_enter_data,pixel_size*parallax*(height/base)
end
This program makes use of the routines correl_images and corrmat_analyze from the IDL Astronomy User's Library² to calculate the cross-correlation of the two images. For each pixel in the nadir image an 11 × 11 window is moved along an 11 × 51 window in the back-looking image centered at the same position. The point of maximum correlation defines the parallax or displacement p. This is related to the relative elevation e of the pixel according to
\[
e = \frac{h}{b}\, p \times 15\,\mathrm{m},
\]
where h is the height of the sensor and b is the baseline, see Figure 5.7.
Figure 5.8 shows the result. Clearly there are many problems due to correlation errors; however, the relative elevations are approximately correct when compared to the DEM determined with the ENVI commercial add-on AsterDTM, see Figure 5.9.
²www.astro.washington.edu/deutsch/idl/htmlhelp/index.html
Figure 5.7: Relating parallax p to elevation e by similar triangles: e/p = (h − e)/b ≈ h/b.
Figure 5.8: A rudimentary DEM.
Figure 5.9: DEM generated with the commercial product AsterDTM.
Either the complete camera model or an RFM can be used, but usually neither is sufficient
for an absolute DEM relative to mean sea level. Most often additional ground reference
points within the image whose elevations are known are also required for absolute calibration.
The orthorectification of the image is carried out on the basis of a suitable DEM and
consists of projecting the (X, Y, Z) coordinates of each pixel onto the (X, Y ) coordinates of
a given map projection.
5.5 Slope and aspect
Terrain analysis involves the processing of elevation data. Specifically we consider here
the generation of slope images, which give the steepness of the terrain at each pixel, and
aspect images, which give the prevailing direction relative to north of a vector normal to the
landscape at each pixel.
A 3×3 pixel window can be used to determine both slope and aspect, see Figure 5.10.
Define
\[
\begin{aligned}
\Delta x_1 &= c - a, & \Delta y_1 &= a - g,\\
\Delta x_2 &= f - d, & \Delta y_2 &= b - h,\\
\Delta x_3 &= i - g, & \Delta y_3 &= c - i,
\end{aligned}
\]
and
\[
\Delta x = \frac{\Delta x_1 + \Delta x_2 + \Delta x_3}{3 x_s}, \qquad
\Delta y = \frac{\Delta y_1 + \Delta y_2 + \Delta y_3}{3 y_s},
\]
where $x_s$, $y_s$ give the pixel dimensions in meters. Then the slope in % at the central pixel position is given by
\[
s = \sqrt{(\Delta x)^2 + (\Delta y)^2} \times 100,
\]
whereas the aspect in radians measured clockwise from north is
\[
\theta = \tan^{-1}\!\left(\frac{\Delta x}{\Delta y}\right).
\]
a b c
d e f
g h i
Figure 5.10: Pixel elevations in an 8-neighborhood. The letters represent elevations.
Slope/aspect determinations from a DEM are available in the ENVI main menu under
Topographic/Topographic Modelling.
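The formulas translate directly into IDL; the following minimal sketch (not the ENVI implementation) computes slope and aspect images from a DEM array, assuming dem[column, row] with the row index increasing toward north. Border pixels are invalid because shift wraps around the array edges.

pro slope_aspect, dem, xs, ys, slope=slope, aspect=aspect
; 8-neighborhood elevations a..i as shifted copies of the DEM
a = shift(dem, 1,-1) & b = shift(dem, 0,-1) & c = shift(dem,-1,-1)
d = shift(dem, 1, 0) & f = shift(dem,-1, 0)
g = shift(dem, 1, 1) & h = shift(dem, 0, 1) & i = shift(dem,-1, 1)
; averaged finite differences per meter
dx = ((c-a) + (f-d) + (i-g))/(3*xs)
dy = ((a-g) + (b-h) + (c-i))/(3*ys)
slope  = sqrt(dx^2 + dy^2)*100      ; slope in percent
aspect = atan(dx, dy)               ; radians clockwise from north
end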
5.6 Illumination correction
Figure 5.11: Angles involved in computation of local solar elevation, taken from [RCSA03].
Topographic modelling can be used to correct images for the effects of local solar illu-
mination, which depends not only upon the sun’s position (elevation and azimuth) but also
upon the local slope and aspect of the terrain being illuminated. Figure 5.11 shows the
angles involved [RCSA03]. Solar elevation is θi, solar azimuth is φa, θp is the slope and φ0
is the aspect. The quantity to be calculated is the local solar elevation γi which determines
the local irradiance. From trigonometry we have
cos γi = cos θp cos θi + sin θp sin θi cos(φa − φ0). (5.8)
An example of a cos γi image in hilly terrain is shown in Figure 5.12.
Figure 5.12: Cosine of local solar illumination angle stretched across a DEM.
Let $\rho_T$ represent the reflectance of the inclined surface in Figure 5.11. Then for a Lambertian surface, i.e. a surface which scatters reflected radiation uniformly in all directions, the reflectance of the corresponding horizontal surface $\rho_H$ would be
\[
\rho_H = \rho_T\,\frac{\cos\theta_i}{\cos\gamma_i}. \qquad (5.9)
\]
The Lambertian assumption is in general not correct, the actual reflectance being described by a complicated bidirectional reflectance distribution function (BRDF). An empirical approach which gives a better approximation to the BRDF is the C-correction [TGG82]. Let m and b be the slope and intercept of a regression line for reflectance vs. $\cos\gamma_i$ for a particular image band. Then instead of (5.9) one uses
\[
\rho_H = \rho_T\,\frac{\cos\theta_i + b/m}{\cos\gamma_i + b/m}. \qquad (5.10)
\]
An ENVI plug-in for illumination correction with the C-correction approxi-
mation is given in Appendix D.2.2.
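Both (5.8) and the C-correction are straightforward to code. Here is a minimal IDL sketch (not the Appendix D.2.2 plug-in; the function names are hypothetical), where the angle arrays and the image band are assumed inputs:

function local_illumination, theta_p, phi_o, theta_i, phi_a
; cosine of the local solar illumination angle, Eq. (5.8)
return, cos(theta_p)*cos(theta_i) + sin(theta_p)*sin(theta_i)*cos(phi_a - phi_o)
end

function c_correction, band, cos_gamma, cos_theta
; regression band = b + m*cos(gamma_i), then Eq. (5.10)
coef = linfit(cos_gamma[*], band[*])   ; coef = [b, m]
c = coef[0]/coef[1]                    ; c = b/m
return, band*(cos_theta + c)/(cos_gamma + c)
end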
Chapter 6
Image Registration
Image registration, either to another image or to a map, is a fundamental task in image
processing. It is required for georeferencing, stereo imaging, accurate change detection, or
any kind of multitemporal image analysis.
Image-to-image registration methods can be divided into roughly four classes [RC96]:
1. algorithms that use pixel values directly, i.e. correlation methods
2. frequency- or wavelet-domain methods that use e.g. the fast Fourier transform (FFT)
3. feature-based methods that use low-level features such as edges and corners
4. algorithms that use high level features and the relations between them, e.g. object-
oriented methods
We consider examples of frequency-domain and feature-based methods here.
6.1 Frequency domain registration
Consider two N × N gray scale images $g_1(i', j')$ and $g_2(i, j)$, where $g_2$ is offset relative to $g_1$ by an integer number of pixels:
\[
g_2(i, j) = g_1(i', j') = g_1(i - i_0, j - j_0), \qquad i_0, j_0 < N.
\]
Taking the Fourier transform we have
\[
\hat g_2(k, l) = \sum_{ij} g_1(i - i_0, j - j_0)\, e^{-i2\pi(ik + jl)/N},
\]
or, with a change of indices to $i'j'$,
\[
\hat g_2(k, l) = \sum_{i'j'} g_1(i', j')\, e^{-i2\pi(i'k + j'l)/N}\, e^{-i2\pi(i_0 k + j_0 l)/N} = \hat g_1(k, l)\, e^{-i2\pi(i_0 k + j_0 l)/N}.
\]
(This is referred to as the Fourier translation property.) Therefore we can write
\[
\frac{\hat g_2(k, l)\, \hat g_1^*(k, l)}{|\hat g_2(k, l)\, \hat g_1^*(k, l)|} = e^{-i2\pi(i_0 k + j_0 l)/N}, \qquad (6.1)
\]
Figure 6.1: Phase correlation of two identical images shifted by 10 pixels.
where $\hat g_1^*$ is the complex conjugate of $\hat g_1$. The inverse transform of the right hand side exhibits a Dirac delta function (spike) at the coordinates $(i_0, j_0)$. Thus if two otherwise identical images are offset by an integer number of pixels, the offset can be found by taking their Fourier transforms, computing the ratio on the left hand side of (6.1) (the so-called cross-power spectrum) and then taking the inverse transform of the result. The position of the maximum value in the inverse transform gives the values of $i_0$ and $j_0$. The following IDL program illustrates the procedure, see Fig. 6.1.
; Image matching by phase correlation
; read a bitmap image and cut out two 512x512 pixel arrays
filename = Dialog_Pickfile(Filter='*.jpg',/Read)
if filename eq '' then print, 'cancelled' else begin
Read_JPEG,filename,image
g1 = image[0,10:521,10:521]
g2 = image[0,0:511,0:511]
; perform Fourier transforms
f1 = fft(g1, /double)
f2 = fft(g2, /double)
; determine the offset from the cross-power spectrum
g = fft( f2*conj(f1)/abs(f1*conj(f1)), /inverse, /double )
pos = where(g eq max(g))
print, 'Offset = ' + strtrim(pos mod 512,2) + strtrim(pos/512,2)
; output as EPS file
thisDevice = !D.Name
set_plot, 'PS'
Device, Filename='c:\temp\phasecorr.eps',xsize=4,ysize=4,/inches,/Encapsulated
shade_surf,g[0,0:50,0:50]
device,/close_file
set_plot, thisDevice
endelse
end
Images which differ not only by an offset but also by a rigid rotation and change of scale
can in principle be registered similarly, see [RC96].
6.2 Feature matching
A tedious task associated with image-image registration using low level image features is the setting of ground control points (GCPs) since, in general, it is necessary to resort to manual entry. However, various techniques for automatic determination of GCPs have been suggested in the literature. We will discuss one such method, namely contour matching [LMM95]. This technique has been found to function reliably in bitemporal scenes in which vegetation changes do not dominate. It can of course be augmented (or replaced) by other automatic methods or by manual determination. The procedures involved in image-image registration using contour matching are shown in Fig. 6.2 [LMM95].
[Flow diagram: LoG filter → zero crossings / edge strengths → contour finder → chain code encoder → closed contour matching → consistency check → warping; applied to Image 1 and Image 2, producing Image 2 (registered).]
Figure 6.2: Image-image registration with contour matching.
6.2.1 Contour detection
The first step involves the application of a Laplacian of Gaussian filter to both images. After
determining the contours by examining zero-crossings of the LoG-filtered image, the contour
strengths are encoded in the pixel intensities. Strengths are taken to be proportional to the
magnitude of the gradient at the zero-crossing.
6.2.2 Closed contours
In the next step, all closed contours with strengths above some given threshold are deter-
mined by tracing the contours. Pixels which have been visited during tracing are set to zero
so that they will not be visited again.
6.2.3 Chain codes
For subsequent matching purposes, all significant closed contours found in the preceding
step are chain encoded. Any digital curve can be represented by an integer sequence
{a1, a2 . . . ai . . .}, ai ∈ {0, 1, 2, 3, 4, 5, 6, 7}, depending on the relative position of the current
pixel with respect to the previous pixel in the curve. This simple code has the drawback
that some contours produce wrap around. For example the line in the direction −22.5° has the chain code {707070 . . .}. Li et al. [LMM95] suggest the smoothing operation
\[
\{a_1 a_2 \ldots a_n\} \rightarrow \{b_1 b_2 \ldots b_n\},
\]
where $b_1 = a_1$ and $b_i = q_i$, with $q_i$ an integer satisfying $(q_i - a_i) \bmod 8 = 0$ and $|q_i - b_{i-1}|$ minimal, $i = 2, 3 \ldots n$. They also suggest applying the Gaussian smoothing filter {0.1, 0.2, 0.4, 0.2, 0.1} to the result. Two chain codes can be compared by “sliding” one over the other and determining the maximum correlation between them.
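The smoothing operation can be coded in a few lines; the following minimal IDL sketch (the function name is hypothetical) resolves the wrap around in the sense described above:

function smooth_chain_code, a
n = n_elements(a)
b = intarr(n)
b[0] = a[0]
for i=1,n-1 do begin
   ; choose q = a[i] + 8k closest to b[i-1]
   b[i] = a[i] + 8*round(float(b[i-1] - a[i])/8.0)
endfor
return, b
end

For example, the code {7, 0, 7, 0, . . .} for the −22.5° line is smoothed to {7, 8, 7, 8, . . .}.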
6.2.4 Invariant moments
The closed contours are first matched according to their invariant moments. These are
defined as follows, see [Hab95, GW02]. Let the set C denote the set of pixels defining a
contour, with |C| = n, that is, n is the number of pixels on the contour. The moment of
order p, q of the contour is defined as
\[
m_{pq} = \sum_{i,j \in C} j^p\, i^q. \qquad (6.2)
\]
Note that $n = m_{00}$. The center of gravity $(x_c, y_c)$ of the contour is thus
\[
x_c = \frac{m_{10}}{m_{00}}, \qquad y_c = \frac{m_{01}}{m_{00}}.
\]
The centralized moments are then given by
\[
\mu_{pq} = \sum_{i,j \in C} (j - x_c)^p (i - y_c)^q, \qquad (6.3)
\]
and the normalized centralized moments by
\[
\eta_{pq} = \frac{\mu_{pq}}{\mu_{00}^{(p+q)/2 + 1}}. \qquad (6.4)
\]
For example,
\[
\eta_{20} = \frac{\mu_{20}}{\mu_{00}^2} = \frac{1}{n^2} \sum_{i,j \in C} (j - x_c)^2.
\]
The normalized centralized moments are, apart from effects of digital quantization, invariant
under scale changes and translations of the contours.
Finally, we can define moments which are also invariant under rotations, see [Hu62]. The
first two such invariant moments are
\[
h_1 = \eta_{20} + \eta_{02}, \qquad h_2 = (\eta_{20} - \eta_{02})^2 + 4\eta_{11}^2. \qquad (6.5)
\]
For example, consider a general rotation of the coordinate axes with origin at the center of gravity of a contour:
\[
\begin{pmatrix} j'\\ i' \end{pmatrix} =
\begin{pmatrix} \cos\theta & \sin\theta\\ -\sin\theta & \cos\theta \end{pmatrix}
\begin{pmatrix} j\\ i \end{pmatrix} = A \begin{pmatrix} j\\ i \end{pmatrix}.
\]
The first invariant moment in the rotated coordinate system is
\[
h_1' = \frac{1}{n^2} \sum_{i',j' \in C} (j'^2 + i'^2)
     = \frac{1}{n^2} \sum_{i',j' \in C} (j', i') \begin{pmatrix} j'\\ i' \end{pmatrix}
     = \frac{1}{n^2} \sum_{i,j \in C} (j, i)\, A^\top A \begin{pmatrix} j\\ i \end{pmatrix}
     = \frac{1}{n^2} \sum_{i,j \in C} (j^2 + i^2) = h_1,
\]
since $A^\top A = I$.
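A minimal IDL sketch (the function name is hypothetical) computing $h_1$ and $h_2$ from the pixel coordinates of a contour, with i and j given as vectors of equal length:

function hu_moments, i, j
n = n_elements(i)
xc = total(j)/n & yc = total(i)/n
mu20 = total((j-xc)^2)
mu02 = total((i-yc)^2)
mu11 = total((j-xc)*(i-yc))
; normalized centralized moments, Eq. (6.4): for p+q = 2 divide by mu00^2 = n^2
eta20 = mu20/n^2 & eta02 = mu02/n^2 & eta11 = mu11/n^2
; invariant moments, Eq. (6.5)
return, [eta20 + eta02, (eta20 - eta02)^2 + 4*eta11^2]
end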
6.2.5 Contour matching
Each significant contour in one image is first matched with contours in the second image
according to their invariant moments h1, h2. This is done by setting a threshold on the
allowed differences, for instance one standard deviation. If one or more matches are found, the best candidate for a GCP pair is then chosen to be that matched contour in the second image for which the chain code correlation with the contour in the first image is maximum. If the maximum correlation is less than some threshold, e.g. 0.9, then no match is declared. The actual GCP coordinates are taken to be the centers of gravity of the matched contours.
6.2.6 Consistency check
The contour matching procedure invariably generates false GCP pairs, so a further process-
ing step is required. In [LMM95] use is made of the fact that distances are preserved under
a rigid transformation. Let A1A2 represent the distance between two points A1 and A2 in
an image. For two sets of m matched contour centers {Ai} and {Bi} in image 1 and 2, the
ratios
AiAj/BiBj, i = 1 . . . m, j = i + 1 . . . m,
are calculated. These should form a cluster, so that pairs scattered away from the cluster
center can be rejected as false matches.
An ENVI plug-in for GCP determination via contour matching is given in
Appendix D.3.
6.3 Re-sampling and warping
We represent with (x, y) the coordinates of a point in image 1 and the corresponding point
in image 2 with (u, v). A second order polynomial map of image 2 to image 1, for example,
is given by
\[
\begin{aligned}
u &= a_0 + a_1 x + a_2 y + a_3 xy + a_4 x^2 + a_5 y^2\\
v &= b_0 + b_1 x + b_2 y + b_3 xy + b_4 x^2 + b_5 y^2.
\end{aligned}
\]
Since there are 12 unknown coefficients, we require at least 6 GCP pairs to determine the
map (each pair generates 2 equations). If more than 6 pairs are available, the coefficients can
be found by least squares fitting. This has the advantage that an RMS error for the mapping
can be estimated. Similar considerations apply for lower or higher order polynomial maps.
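The least squares fit can be carried out with the normal equations; here is a minimal IDL sketch (the function name is hypothetical) for the u-coefficients, given vectors x, y of GCP coordinates in image 1 and u of the matched coordinates in image 2:

function poly_map_coeffs, x, y, u
n = n_elements(x)
; design matrix: one row [1, x, y, xy, x^2, y^2] per GCP
D = fltarr(6,n)
D[0,*] = 1.0
D[1,*] = x & D[2,*] = y & D[3,*] = x*y
D[4,*] = x^2 & D[5,*] = y^2
; normal equations: a = (D^T D)^(-1) D^T u
return, invert(transpose(D) ## D) ## transpose(D) ## transpose(u)
end

The b-coefficients follow from the same call with v in place of u, and the RMS error from the residuals of the fit.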
Having determined the map coefficients, image 2 can be registered to image 1 by re-
sampling. Nearest neighbor resampling simply chooses the actual pixel in image 2 that has
its center nearest the calculated coordinates (u, v) and transfers it to location (x, y). This
is the preferred technique for classification or change detection, since the registered image
consists of the original pixel brightnesses, simply rearranged in position to give a correct
image geometry. Other commonly used resampling methods are bilinear interpolation and
cubic convolution interpolation, see [JRR99] for details. These methods mix the spectral
intensities of neighboring pixels.
Exercises
1. We can approximate the centralized moments (6.3) of a contour by the integral
\[
\mu_{pq} = \int\!\!\int (x - x_c)^p (y - y_c)^q f(x, y)\, dx\, dy,
\]
where the integration is over the whole image and where f(x, y) = 1 if the point (x, y) lies on the contour and f(x, y) = 0 otherwise. Use this approximation to prove that the normalized centralized moments $\eta_{pq}$ given in (6.4) are invariant under scaling transformations of the form
\[
\begin{pmatrix} x'\\ y' \end{pmatrix} =
\begin{pmatrix} \alpha & 0\\ 0 & \alpha \end{pmatrix} \cdot
\begin{pmatrix} x\\ y \end{pmatrix}.
\]
Chapter 7
Image Sharpening
The change detection and classification algorithms that we will meet in the next chapters exploit not only the spatial but also the spectral information of satellite imagery. Many common platforms (Landsat 7 ETM+, IKONOS, SPOT, QuickBird) offer panchromatic images with higher ground resolution than that of the spectral channels. Application of multispectral change detection or classification methods is therefore restricted to the lower resolution. Conventional image fusion techniques, such as the well-known HSV transformation, can be used to sharpen the spectral components; however, mixing in the panchromatic image often “dilutes” the spectral information. Another disadvantage of the HSV transformation is that one is restricted to using three of the available spectral channels. In the following we will outline the HSV method and then consider alternative fusion techniques.
7.1 HSV fusion
In computers with 24-bit graphics (true color), any three channels of a multispectral image
can be displayed with 8 bits for each of the additive primary colors red, green and blue. The
monitor displays this as an RGB color composite image which, depending on the choice of
image channels and their relative intensities, may or may not appear to be natural. There
are 2²⁴ ≈ 16 million possible colors.
Another means of color definition is in terms of hue, saturation and value (HSV). Value
(or intensity) can be thought of as an axis equidistant from the three orthogonal primary
color axes. Hue refers to the actual color and is defined as an angle on a circle perpendicular
to the value axis. Saturation is the “amount” of color present and is represented by the
radius of the circle described by the hue.
A commonly used method for fusion of two images (for example a lower resolution multi-
spectral image with a higher resolution panchromatic image) is to transform the first image
from RGB to HSV space, replace the V component with the grayscale values of the second
image after performing a radiometric normalization, and then transform back to RGB space.
The forward transformation begins by rotating the RGB coordinate axes into the diagonal
axis of the RGB color cube. The coordinates in the new reference system are given by
\[
\begin{pmatrix} m_1\\ m_2\\ i_1 \end{pmatrix} =
\begin{pmatrix}
2/\sqrt{6} & -1/\sqrt{6} & -1/\sqrt{6}\\
0 & 1/\sqrt{2} & -1/\sqrt{2}\\
1/\sqrt{3} & 1/\sqrt{3} & 1/\sqrt{3}
\end{pmatrix} \cdot
\begin{pmatrix} R\\ G\\ B \end{pmatrix}.
\]
Then the rectangular coordinates $(m_1, m_2, i_1)$ are transformed into the cylindrical HSV coordinates:
\[
H = \arctan(m_1/m_2), \qquad S = \sqrt{m_1^2 + m_2^2}, \qquad I = \sqrt{3}\, i_1.
\]
The following IDL code illustrates the necessary steps for HSV fusion, making use of ENVI batch procedures. These can also be invoked directly from the ENVI main menu.
pro HSVFusion, event
; get MS image
envi_select, title='Select low resolution three-band input file', $
   fid=fid1, dims=dims1, pos=pos1
if (fid1 eq -1) or (n_elements(pos1) ne 3) then return
; get PAN image
envi_select, title='Select panchromatic image', $
   fid=fid2, pos=pos2, dims=dims2, /band_only
if (fid2 eq -1) then return
envi_check_save, /transform
; linear stretch the images and convert to byte format
envi_doit,'stretch_doit', fid=fid1, dims=dims1, pos=pos1, method=1, $
   r_fid=r_fid1, out_min=0, out_max=255, $
   range_by=0, i_min=0, i_max=100, out_dt=1, out_name='c:\temp\hsv_temp'
envi_doit,'stretch_doit', fid=fid2, dims=dims2, pos=pos2, method=1, $
   r_fid=r_fid2, out_min=0, out_max=255, $
   range_by=0, i_min=0, i_max=100, out_dt=1, /in_memory
envi_file_query, r_fid2, ns=f_ns, nl=f_nl
f_dims = [-1l, 0, f_ns-1, 0, f_nl-1]
; HSV sharpening
envi_doit, 'sharpen_doit', $
   fid=[r_fid1,r_fid1,r_fid1], pos=[0,1,2], f_fid=r_fid2, $
   f_dims=f_dims, f_pos=[0], method=0, interp=0, /in_memory
; remove temporary files from ENVI
envi_file_mng, id=r_fid1, /remove, /delete
envi_file_mng, id=r_fid2, /remove
end
7.2 Brovey fusion
In its simplest form this method multiplies each re-sampled multispectral pixel by the ratio
of the corresponding panchromatic pixel intensity to the sum of all of the multispectral
intensities. The corrected pixel intensities $\bar g_k(i, j)$ in the kth fused multispectral channel are given by
\[
\bar g_k(i, j) = g_k(i, j) \cdot \frac{g_p(i, j)}{\sum_{k'} g_{k'}(i, j)}, \qquad (7.1)
\]
where gk(i, j) is the (re-sampled) pixel intensity in the kth channel and gp(i, j) is the corre-
sponding pixel intensity in the panchromatic image. (The ENVI-environment offers Brovey
fusion in its main menu.) This technique assumes that the spectral range spanned by the
panchromatic image is essentially the same as that covered by the multispectral channels.
This is seldom the case. Moreover, to avoid bias, the intensities used should be the radiances
at the satellite sensors, implying use of the sensors’ calibration.
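Equation (7.1) is one line of IDL per band. A minimal sketch (the function name is hypothetical), assuming an already re-sampled multispectral cube ms with dimensions columns × rows × N and a co-registered panchromatic band pan of the same spatial size:

function brovey, ms, pan
; sum over the spectral bands, guarded against division by zero
s = total(ms,3) > 1e-10
n = (size(ms,/dimensions))[2]
fused = float(ms)
for k=0,n-1 do fused[*,*,k] = ms[*,*,k]*pan/s
return, fused
end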
7.3 PCA fusion
Panchromatic sharpening using principal components analysis (PCA) is similar to the HSV
method. After the PCA transformation, the first principal component is replaced by the
panchromatic image, again after radiometric normalization, see Figure 7.1.
Figure 7.1: Panchromatic fusion with the principal components transformation.
Image sharpening using PCA and the closely related Gram-Schmidt transformation is
available from the ENVI main menu.
7.4 Wavelet fusion
Wavelets provide an efficient means of representing high and low frequency components of
multispectral images and can be used to perform image sharpening. Two examples are given
here.
7.4.1 Discrete wavelet transform
The discrete wavelet transform (DWT) of a two-dimensional image is shown in Appendix
B to be equivalent to an iterative application of the high-low-pass filter bank illustrated in
Figure 7.2.
Figure 7.2: Wavelet filter bank. H is a low-pass and G a high-pass filter derived from the
coefficients of the wavelet transformation. The symbol ↓ indicates downsampling by a factor
of 2. The original image gk(i, j) can be reconstructed by inverting the filter.
A single application of the filter corresponding to the Daubechies D4 wavelet to a satellite image $g_1(i, j)$ (1 m resolution) is shown in Figure B.12. The high frequency information (wavelet coefficients) is stored in the arrays $C^H_2$, $C^V_2$ and $C^D_2$ and displayed in the upper right, lower left and lower right quadrants, respectively. The original image with its resolution degraded by a factor of two, $g_2(i, j)$, is in the upper left quadrant. Applying the filter bank iteratively to the upper left quadrant yields a further reduction by a factor of 2.
The fusion procedure for IKONOS or QuickBird imagery, for instance, in which the resolutions of the panchromatic and the 4 multispectral components differ exactly by a factor of 4, is then as follows. Both the degraded panchromatic image and the four multispectral images are compressed once again (e.g. to 8 m resolution in the case of IKONOS) and the high frequency components $C^z_4$, z = H, V, D, are sampled to estimate the correction coefficients
\[
a^z = \sigma^z_{ms}/\sigma^z_{pan}, \qquad b^z = m^z_{ms} - a^z m^z_{pan}, \qquad (7.2)
\]
where $m^z$ and $\sigma^z$ denote mean and standard deviation, respectively. These coefficients are then used to normalize the wavelet coefficients for the panchromatic image to those of the multispectral image:
\[
C^z_i(i, j) \rightarrow a^z C^z_i(i, j) + b^z, \qquad z = H, V, D, \quad i = 2, 3. \qquad (7.3)
\]
The degraded panchromatic image $g_3(i, j)$ is then replaced by each of the four multispectral images in turn, and the normalized wavelet coefficients are used to reconstruct the original 1 m resolution. We thus obtain what would be seen if the multispectral sensors had the resolution
of the panchromatic sensor [RW00].
An ENVI plug-in for panchromatic sharpening with the DWT is given in
Appendix D.4.1.
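In IDL the normalization amounts to two statistics per quadrant; a minimal sketch (the function name is hypothetical; C_pan and C_ms are the corresponding wavelet coefficient arrays of the panchromatic and multispectral decompositions):

function normalize_coeffs, C_pan, C_ms
a = stddev(C_ms)/stddev(C_pan)    ; Eq. (7.2)
b = mean(C_ms) - a*mean(C_pan)
return, a*C_pan + b               ; Eq. (7.3)
end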
7.4.2 À trous filtering
The radiometric fidelity obtained with the discrete wavelet transform is excellent, as will be
shown in the next section. However the lack of translational invariance of the DWT often
leads to spatial artifacts (blurring, shadowing, staircase effect) in the sharpened product.
This is illustrated in the following program, in which an image is transformed once with the
DWT and the low-pass quadrant shifted by one pixel relative to the high-pass quadrants
(i.e. the wavelet coefficients). After inverting the transformation, serious degradation is
apparent, see Figure 7.3.
pro translate_wavelet
; get an image band
envi_select, title='Select input file', $
   fid=fid, dims=dims, pos=pos, /band_only
if fid eq -1 then return
; create a DWT object
aDWT = Obj_New('DWT',envi_get_data(fid=fid,dims=dims,pos=pos))
; compress
aDWT->Compress
; shift the compressed portion, suppressing the phase correlation match
aDWT->Inject, shift(aDWT->Get_Quadrant(0),[1,1]), pc=0
; restore
aDWT->Expand
; return result to ENVI
envi_enter_data, aDWT->Get_Image()
end
As an alternative to the DWT, the à trous wavelet transform (ATWT) has been proposed for image sharpening [AABG02]. The ATWT is a multiresolution decomposition defined formally by a low-pass filter H = {h(0), h(1), . . .} and a high-pass filter G = δ − H, where δ denotes an all-pass filter. Thus the high frequency part is just the difference between the original image and the low-pass filtered image. Not surprisingly, this transformation does not allow perfect reconstruction if the output is downsampled. Therefore downsampling is not performed at all. Rather, at the kth iteration of the low-pass filter, $2^{k-1}$ zeroes are inserted between the elements of H. This means that every other pixel is interpolated on the first iteration:
H = {h(0), 0, h(1), 0, . . .},
while on the second iteration
H = {h(0), 0, 0, h(1), 0, 0, . . .},
etc. (hence the name à trous = with holes). The low-pass filter is usually chosen to be symmetric (unlike the Daubechies wavelet filters, for example). The prototype filter chosen
here is the cubic B-spline filter
H = {1/16, 1/4, 3/8, 1/4, 1/16}.
The transformation is highly redundant and requires considerably more computer storage to implement. However, when used for image sharpening it is much less sensitive to misalignment between the multispectral and panchromatic images.
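One decomposition step is easily sketched in IDL (a minimal sketch following the zero-insertion rule stated above, with the cubic B-spline prototype; the function name is hypothetical):

function atrous_step, image, k, lowpass=lowpass
h0 = [1./16, 1./4, 3./8, 1./4, 1./16]   ; prototype cubic B-spline filter
spacing = 2^(k-1) + 1                   ; 2^(k-1) zeroes inserted at iteration k
h = fltarr(4*spacing+1)
h[indgen(5)*spacing] = h0               ; the filter "with holes"
kernel = h # h                          ; separable 2D kernel (outer product)
lowpass = convol(float(image), kernel, /edge_truncate)
return, image - lowpass                 ; high frequency part: G = delta - H
end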
Figure 7.3: Artifacts due to lack of translational invariance of the DWT.
Figure 7.4 outlines the scheme implemented in the ENVI plug-in for ATWT panchromatic sharpening. The MS band is nearest-neighbor upsampled by a factor of 2 to match the dimensions of the high resolution band. The à trous transformation is applied to both bands (columns and rows are filtered with the upsampled cubic spline filter, with the difference determining the high-pass result). The high frequency component of the pan image is normalized to that of the MS image in the same way as for DWT sharpening, equations (7.2) and (7.3). Then the low frequency pan component is replaced by the filtered MS image and the transformation inverted. An ENVI plug-in for ATWT sharpening is described in Appendix D.4.2.
7.5 Quality indices
Wang and Bovik [WB02] suggest the following measure of radiometric fidelity between two
image bands f and g:
Figure 7.4: À trous image sharpening scheme for an MS to panchromatic resolution ratio of two. The symbol ↑H denotes the upsampled low-pass filter.
Figure 7.5: Comparison of three image sharpening methods with the Wang-Bovik quality index. Left to right: Gram-Schmidt, ATWT, DWT.
\[
Q = \frac{\sigma_{fg}}{\sigma_f \sigma_g} \cdot \frac{2\bar f \bar g}{\bar f^2 + \bar g^2} \cdot \frac{2\sigma_f \sigma_g}{\sigma_f^2 + \sigma_g^2}
  = \frac{4\sigma_{fg}\,\bar f \bar g}{(\bar f^2 + \bar g^2)(\sigma_f^2 + \sigma_g^2)}, \qquad (7.4)
\]
where $\bar f$ and $\sigma_f^2$ are the mean and variance of band f and $\sigma_{fg}$ is the covariance of the two bands. The first term in (7.4) is the correlation coefficient between the two images, with values in [−1, 1]; the second term compares their average brightness, with values in [0, 1]; and the third term compares their contrasts, also in [0, 1]. Thus perfect radiometric correspondence would give the value Q = 1.
Since image quality is usually not spatially invariant, it is usual to compute Q in, say, M sliding windows and then average over all such windows:
\[
Q = \frac{1}{M} \sum_{j=1}^{M} Q_j.
\]
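A minimal IDL sketch of (7.4) for a single window (the function name is hypothetical; the sliding-window average is omitted):

function quality_index, f, g
fm = mean(f) & gm = mean(g)
vf = variance(f) & vg = variance(g)
cfg = mean((f - fm)*(g - gm))   ; covariance of the two bands
return, 4*cfg*fm*gm/((fm^2 + gm^2)*(vf + vg))
end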
An ENVI plug-in for determining the quality index for pansharpened images is
given in Appendix D.4.3.
Figure 7.5 shows a comparison of three image sharpening methods applied to a QuickBird image, namely the Gram-Schmidt, ATWT and DWT transformations. The DWT gives by far the best radiometric fidelity, but spatial artifacts are apparent.
Chapter 8
Change Detection
To quote Singh’s review article on change detection [Sin89],
“The basic premise in using remote sensing data for change detection is that
changes in land cover must result in changes in radiance values ... [which] must
be large with respect to radiance changes from other factors.”
In the present chapter we will mention briefly the most commonly used digital techniques for
enhancing this “change signal” in bitemporal satellite images, and then focus our attention
on the so-called multivariate alteration detection algorithm of Nielsen et al. [NCS98].
8.1 Algebraic methods
In order to see changes in the two multispectral images represented by N-dimensional ran-
dom vectors F and G, a simple procedure is to subtract them from each other component-
by-component, examining the N differenced images characterized by
F − G = (F1 − G1, F2 − G2 . . . FN − GN ) (8.1)
for significant changes. Pixel intensity differences near zero indicate no change, large positive
or negative values indicate change, and decision thresholds can be set to define significant
changes. If the difference signatures in the spectral channels are used to classify the kind of
change that has taken place, one speaks of change vector analysis. Thresholds are usually
expressed in standard deviations from the mean difference value, which is taken to correspond
to no change.
Alternatively, ratios of intensities of the form
\[
\frac{F_k}{G_k}, \qquad k = 1 \ldots N, \qquad (8.2)
\]
can be built between successive images. Ratios near unity correspond to no-change, while small and large values indicate change. A disadvantage of this method is that random variables of the form (8.2) are not normally distributed, so simple threshold values defined in terms of standard deviations are not valid.
Other algebraic combinations, such as differences in vegetation indices (Section 2.1) are
also in use. All of these “band math” operations can of course be performed conveniently
within the ENVI/IDL environment.
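For example, a simple two-sided decision threshold at ±2 standard deviations from the mean difference can be coded directly (a minimal band-math sketch; band1 and band2 are assumed co-registered single-band arrays):

d = float(band1) - float(band2)
m = mean(d) & s = stddev(d)
change = (d gt m + 2*s) or (d lt m - 2*s)   ; byte mask of significant changes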
8.2 Principal components
Figure 8.1: Change detection with principal components.
Consider the bitemporal feature space for a single spectral band m in which each pixel
is denoted by a point (fm, gm), a realization of the random vector (Fm, Gm). Since the
unchanged pixels are highly correlated, they will lie in a narrow, elongated cluster along the
principal axis, whereas change pixels will lie some distance away from it, see Fig. 8.1. The
second principal component will thus quantify the degree of change associated with a given
pixel. Since the principal axes are determined by diagonalization of the covariance matrix for
all of the pixels, the no-change axis may be poorly determined. To avoid this problem, the
principal components can be determined iteratively using weights for each pixel according
to the magnitude of the second principal component. This method can be generalized to
treat all multispectral bands simultaneously [Wie97].
8.3 Post-classification comparison
If two co-registered satellite images have been classified, then the class labels can be com-
pared to determine land cover changes. If classification is carried out at the pixel level (as
opposed to segments or objects), then classification errors (typically > 5%) may dominate
the true changes, depending on the magnitude of the latter. ENVI offers functions for
statistical analysis of post-classification change detection.
Stochastic Processes and Simulations – A Machine Learning Perspectivee2wi67sy4816pahn
 
numpyxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
numpyxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxnumpyxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
numpyxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxsin3divcx
 
Wireless Communications Andrea Goldsmith, Stanford University.pdf
Wireless Communications Andrea Goldsmith, Stanford University.pdfWireless Communications Andrea Goldsmith, Stanford University.pdf
Wireless Communications Andrea Goldsmith, Stanford University.pdfJanviKale2
 
Math for programmers
Math for programmersMath for programmers
Math for programmersmustafa sarac
 
Introduction to Radial Basis Function Networks
Introduction to Radial Basis Function NetworksIntroduction to Radial Basis Function Networks
Introduction to Radial Basis Function NetworksESCOM
 

Similar to Morton john canty image analysis and pattern recognition for remote sensing with algorithms in envi-idl (20)

BenThesis
BenThesisBenThesis
BenThesis
 
Methods for Applied Macroeconomic Research.pdf
Methods for Applied Macroeconomic Research.pdfMethods for Applied Macroeconomic Research.pdf
Methods for Applied Macroeconomic Research.pdf
 
book.pdf
book.pdfbook.pdf
book.pdf
 
time_series.pdf
time_series.pdftime_series.pdf
time_series.pdf
 
Applied Statistics With R
Applied Statistics With RApplied Statistics With R
Applied Statistics With R
 
Cliff sugerman
Cliff sugermanCliff sugerman
Cliff sugerman
 
probability_stats_for_DS.pdf
probability_stats_for_DS.pdfprobability_stats_for_DS.pdf
probability_stats_for_DS.pdf
 
Computer Graphics Notes.pdf
Computer Graphics Notes.pdfComputer Graphics Notes.pdf
Computer Graphics Notes.pdf
 
ubc_2015_november_angus_edward
ubc_2015_november_angus_edwardubc_2015_november_angus_edward
ubc_2015_november_angus_edward
 
An Introduction to MATLAB for Geoscientists.pdf
An Introduction to MATLAB for Geoscientists.pdfAn Introduction to MATLAB for Geoscientists.pdf
An Introduction to MATLAB for Geoscientists.pdf
 
Thats How We C
Thats How We CThats How We C
Thats How We C
 
D-STG-SG02.16.1-2001-PDF-E.pdf
D-STG-SG02.16.1-2001-PDF-E.pdfD-STG-SG02.16.1-2001-PDF-E.pdf
D-STG-SG02.16.1-2001-PDF-E.pdf
 
thesis
thesisthesis
thesis
 
Stochastic Processes and Simulations – A Machine Learning Perspective
Stochastic Processes and Simulations – A Machine Learning PerspectiveStochastic Processes and Simulations – A Machine Learning Perspective
Stochastic Processes and Simulations – A Machine Learning Perspective
 
numpyxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
numpyxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxnumpyxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
numpyxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
 
Wireless Communications Andrea Goldsmith, Stanford University.pdf
Wireless Communications Andrea Goldsmith, Stanford University.pdfWireless Communications Andrea Goldsmith, Stanford University.pdf
Wireless Communications Andrea Goldsmith, Stanford University.pdf
 
main-moonmath.pdf
main-moonmath.pdfmain-moonmath.pdf
main-moonmath.pdf
 
Math for programmers
Math for programmersMath for programmers
Math for programmers
 
Na 20130603
Na 20130603Na 20130603
Na 20130603
 
Introduction to Radial Basis Function Networks
Introduction to Radial Basis Function NetworksIntroduction to Radial Basis Function Networks
Introduction to Radial Basis Function Networks
 

Recently uploaded

Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...VICTOR MAESTRE RAMIREZ
 
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVRajaP95
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )Tsuyoshi Horigome
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024Mark Billinghurst
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024hassan khalil
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130Suhani Kapoor
 
Introduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptxIntroduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptxvipinkmenon1
 
What are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxWhat are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxwendy cai
 
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfCCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfAsst.prof M.Gokilavani
 
Current Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLCurrent Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLDeelipZope
 
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝soniya singh
 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSKurinjimalarL3
 
power system scada applications and uses
power system scada applications and usespower system scada applications and uses
power system scada applications and usesDevarapalliHaritha
 
microprocessor 8085 and its interfacing
microprocessor 8085  and its interfacingmicroprocessor 8085  and its interfacing
microprocessor 8085 and its interfacingjaychoudhary37
 
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort serviceGurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort servicejennyeacort
 
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.eptoze12
 

Recently uploaded (20)

Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...
 
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
 
Introduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptxIntroduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptx
 
What are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxWhat are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptx
 
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfCCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
 
Current Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLCurrent Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCL
 
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Serviceyoung call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
 
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
 
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
 
power system scada applications and uses
power system scada applications and usespower system scada applications and uses
power system scada applications and uses
 
microprocessor 8085 and its interfacing
microprocessor 8085  and its interfacingmicroprocessor 8085  and its interfacing
microprocessor 8085 and its interfacing
 
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort serviceGurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
 
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.
 

Morton john canty image analysis and pattern recognition for remote sensing with algorithms in envi-idl

6.3 Re-sampling and warping . . . . . . . . . . . . . . . . . . . . . . . . . . . 58

7 Image Sharpening 61
7.1 HSV fusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
7.2 Brovey fusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
7.3 PCA fusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
7.4 Wavelet fusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
7.4.1 Discrete wavelet transform . . . . . . . . . . . . . . . . . . . . . . . . . 64
7.4.2 A trous filtering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
7.5 Quality indices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66

8 Change Detection 69
8.1 Algebraic methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
8.2 Principal components . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
8.3 Post-classification comparison . . . . . . . . . . . . . . . . . . . . . . . . . . 70
8.4 Multivariate alteration detection . . . . . . . . . . . . . . . . . . . . . . . . 71
8.4.1 Canonical correlation analysis . . . . . . . . . . . . . . . . . . . . . . . 71
8.4.2 Solution by Cholesky factorization . . . . . . . . . . . . . . . . . . . . 72
8.4.3 Properties of the MAD components . . . . . . . . . . . . . . . . . . . . 73
8.4.4 Covariance of MAD variates with original observations . . . . . . . . . 74
8.4.5 Scale invariance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
8.4.6 Improving signal to noise . . . . . . . . . . . . . . . . . . . . . . . . . . 75
8.4.7 Decision thresholds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
8.5 Radiometric normalization . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77

9 Unsupervised Classification 79
9.1 A simple cost function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
9.2 Algorithms that minimize the simple cost function . . . . . . . . . . . . . . 81
9.2.1 K-means . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
9.2.2 Extended K-means . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
9.2.3 Agglomerative hierarchical clustering . . . . . . . . . . . . . . . . . . . 83
9.2.4 Fuzzy K-means . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
9.3 EM Clustering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
9.3.1 Simulated annealing . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
9.3.2 Partition density . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
9.3.3 Including spatial information . . . . . . . . . . . . . . . . . . . . . . . 87
9.4 The Kohonen Self Organizing Map . . . . . . . . . . . . . . . . . . . . . . . 89
9.5 Unsupervised classification of changes . . . . . . . . . . . . . . . . . . . . . 91

10 Supervised Classification 93
10.1 Bayes decision rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
10.2 Training data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
10.3 Bayes Maximum likelihood classification . . . . . . . . . . . . . . . . . . . 95
10.4 Non-parametric methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
10.5 Neural networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
10.5.1 The feed-forward network . . . . . . . . . . . . . . . . . . . . . . . . . 99
10.5.2 Cost functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
10.5.3 Training . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
10.5.4 Backpropagation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
10.6 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
10.6.1 Standard deviation of misclassification . . . . . . . . . . . . . . . . . . 111
10.6.2 Model comparison . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
10.6.3 Confusion matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114

11 Hyperspectral analysis 117
11.1 Mixture modelling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
11.1.1 Full linear unmixing . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
11.1.2 Unconstrained linear unmixing . . . . . . . . . . . . . . . . . . . . . . 119
11.1.3 Intrinsic end-members and pixel purity . . . . . . . . . . . . . . . . . . 119
11.2 Orthogonal subspace projection . . . . . . . . . . . . . . . . . . . . . . . . 121

A Least Squares Procedures 125
A.1 Generalized least squares . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
A.2 Recursive least squares . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
A.3 Orthogonal regression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129

B The Discrete Wavelet Transformation 131
B.1 Inner product space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
B.2 Haar wavelets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
B.3 Multi-resolution analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
B.4 Fixpoint wavelet approximation . . . . . . . . . . . . . . . . . . . . . . . . 138
B.5 The mother wavelet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
B.6 The Daubechies wavelet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
B.7 Wavelets and filter banks . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143

C Advanced Neural Network Training Algorithms 151
C.1 The Hessian matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
C.1.1 The R-operator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
C.1.2 Calculating the Hessian . . . . . . . . . . . . . . . . . . . . . . . . . . 155
C.2 Scaled conjugate gradient training . . . . . . . . . . . . . . . . . . . . . . . 156
C.2.1 Conjugate directions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
C.2.2 Minimizing a quadratic function . . . . . . . . . . . . . . . . . . . . . 157
C.2.3 The algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
C.3 Kalman filter training . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
C.3.1 Linearization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
C.3.2 The algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165

D ENVI Extensions 171
D.1 Installation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
D.2 Topographic modelling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
D.2.1 Calculating building heights . . . . . . . . . . . . . . . . . . . . . . . . 172
D.2.2 Illumination correction . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
D.3 Image registration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
D.4 Image fusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
D.4.1 DWT fusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
D.4.2 ATWT fusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179
D.4.3 Quality index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182
D.5 Change detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184
D.5.1 Multivariate Alteration Detection . . . . . . . . . . . . . . . . . . . . . 184
D.5.2 Maximum autocorrelation factor . . . . . . . . . . . . . . . . . . . . . 186
D.5.3 Radiometric normalization . . . . . . . . . . . . . . . . . . . . . . . . . 187
D.6 Unsupervised classification . . . . . . . . . . . . . . . . . . . . . . . . . . . 189
D.6.1 Hierarchical clustering . . . . . . . . . . . . . . . . . . . . . . . . . . . 189
D.6.2 Fuzzy K-means clustering . . . . . . . . . . . . . . . . . . . . . . . . . 190
D.6.3 EM clustering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
D.6.4 Probabilistic label relaxation . . . . . . . . . . . . . . . . . . . . . . . 194
D.6.5 Kohonen self organizing map . . . . . . . . . . . . . . . . . . . . . . . 196
D.6.6 A GUI for change clustering . . . . . . . . . . . . . . . . . . . . . . . . 197
D.7 Neural network: Scaled conjugate gradient . . . . . . . . . . . . . . . . . . 198
D.8 Neural network: Kalman filter . . . . . . . . . . . . . . . . . . . . . . . . . 200
D.9 Neural network: Hybrid . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201

Bibliography 203
Chapter 1

Images, Arrays and Vectors

1.1 Multispectral satellite images

There are a number of multispectral satellite-based sensors currently in orbit which are used for earth observation. Representative of these we mention here the Landsat ETM+ system. The ETM+ instrument on the Landsat 7 spacecraft contains sensors to measure radiance in three spectral intervals:

• visible and near infrared (VNIR) bands: bands 1, 2, 3, 4 and 8 (PAN), with a spectral range between 0.4 and 1.0 micrometers;

• short wavelength infrared (SWIR) bands: bands 5 and 7, with a spectral range between 1.0 and 3.0 micrometers;

• thermal long wavelength infrared (LWIR) band: band 6, with a spectral range between 8.0 and 12.0 micrometers.

In addition a panchromatic (PAN) image (band 8) covering the visible spectrum is provided. Ground resolutions are 15 m (PAN), 30 m (VNIR, SWIR) and 60 m (LWIR). Figure 1.1 shows a color composite image of a Landsat 7 scene over Morocco acquired in 1999.

[Figure 1.1: Color composite of bands 4 (red), 5 (green) and 7 (blue) for a Landsat ETM+ image over Morocco.]

A single multispectral image band can be represented as an array of gray-scale values or digital numbers $g_k(i,j)$, $1 \le i \le c$, $1 \le j \le r$, where $c$ is the number of pixel columns and $r$ is the number of pixel rows. If we are dealing with an $N$-band multispectral image, then the index $k$, $1 \le k \le N$, denotes the spectral band. Often a pixel intensity is stored in a single byte, so that $0 \le g_k \le 255$.

The gray-scale values are the result of sampling, along an array of sensors, the at-sensor radiance $f_\lambda(x,y)$ at wavelength $\lambda$ due to sunlight reflected from some point $(x,y)$ on the Earth's surface and focused by the satellite's optical system onto the sensors. Ignoring atmospheric effects, this radiance is given roughly by

$$f_\lambda(x,y) \sim i_\lambda(x,y)\,r_\lambda(x,y),$$

where $i_\lambda(x,y)$ is the sun's irradiance at the surface in units of $\mathrm{W\,m^{-2}\,\mu m^{-1}}$, and $r_\lambda(x,y)$ is the surface reflectance, a number between 0 and 1. The conversion between gray-scale or digital number $g$ and at-sensor radiance $f$ is determined by the sensor calibration as measured (and maintained) by the satellite image provider:

$$f = C\,g(i,j) + f_{\min}, \quad\text{where}\quad C = (f_{\max} - f_{\min})/255,$$

in which $f_{\max}$ and $f_{\min}$ are the maximum and minimum measurable radiances at the sensor. Atmospheric scattering and absorption models are used to calculate surface reflectance from the observed at-sensor radiance, as it is the reflectance which is directly related to the physical properties of the surface being examined.

Various conventions can be used for storing the image array $g(i,j)$ in computer memory or on storage media. In band interleaved by pixel (BIP) format, for example, a two-channel, 3 x 3 pixel image would be stored as

g1(1,1) g2(1,1) g1(2,1) g2(2,1) g1(3,1) g2(3,1)
g1(1,2) g2(1,2) g1(2,2) g2(2,2) g1(3,2) g2(3,2)
g1(1,3) g2(1,3) g1(2,3) g2(2,3) g1(3,3) g2(3,3),

whereas in band interleaved by line (BIL) format it would be stored as

g1(1,1) g1(2,1) g1(3,1) g2(1,1) g2(2,1) g2(3,1)
g1(1,2) g1(2,2) g1(3,2) g2(1,2) g2(2,2) g2(3,2)
g1(1,3) g1(2,3) g1(3,3) g2(1,3) g2(2,3) g2(3,3),

and in band sequential (BSQ) format it is stored as

g1(1,1) g1(2,1) g1(3,1) g1(1,2) g1(2,2) g1(3,2) g1(1,3) g1(2,3) g1(3,3)
g2(1,1) g2(2,1) g2(3,1) g2(1,2) g2(2,2) g2(3,2) g2(1,3) g2(2,3) g2(3,3).

In the computer language IDL, so-called row-major indexing is used for arrays and the elements in an array are numbered from zero. This means that, if a gray-scale image g is stored in an IDL array variable G, then the intensity value g(i,j) is addressed as G[i-1,j-1]. An N-band multispectral image is stored in BIP format as an N x c x r array in IDL, in BIL format as a c x N x r array, and in BSQ format as a c x r x N array.

Auxiliary information, such as image acquisition parameters and georeferencing, is normally included with the image data in the same file, and the format may or may not make use of compression algorithms. Examples are the geoTIFF¹ file format used, for example, by Space Imaging Inc. for distributing Carterra(c) imagery, which includes lossless compression; the HDF (Hierarchical Data Format), in which, for example, ASTER images are distributed; and the cross-platform PCIDSK format employed by PCI Geomatics with its image processing software, which is in plain ASCII code and not compressed. ENVI uses a simple "flat binary" file structure with an additional ASCII header file.

¹ geoTIFF refers to TIFF files which have geographic (or cartographic) data embedded as tags within the TIFF file. The geographic data can then be used to position the image in the correct location and geometry on the screen of a geographic information display.
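The three interleave conventions differ only in the order of the array dimensions, so in IDL one can convert between them with TRANSPOSE and a permutation vector. A minimal sketch, using a hypothetical two-channel, 3 x 3 test image (the variable names are our own):

; create a 2-band, 3x3 test image in BSQ order (c x r x N)
g_bsq = indgen(3,3,2)
; permute the dimensions: BIP is N x c x r, BIL is c x N x r
g_bip = transpose(g_bsq, [2,0,1])
g_bil = transpose(g_bsq, [0,2,1])
print, size(g_bip, /dimensions)   ; prints: 2 3 3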
1.2 Algebra of vectors and matrices

It is very convenient to use a vector representation for multispectral images, namely

$$\mathbf{g}(i,j) = \begin{pmatrix} g_1(i,j) \\ \vdots \\ g_N(i,j) \end{pmatrix}, \qquad (1.1)$$

which is a column vector of multispectral gray-scale values at the position $(i,j)$. Since we will be making extensive use of the vector notation of Eq. (1.1), we review here some of the basic properties of vectors and matrices. We can illustrate most of these properties in just two dimensions.

[Figure 1.2: A vector in two dimensions.]

The transpose of the two-dimensional column vector shown in Fig. 1.2,

$$\mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix},$$

is the row vector $\mathbf{x}^\top = (x_1, x_2)$. The sum of two vectors is given by

$$\mathbf{x} + \mathbf{y} = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = \begin{pmatrix} x_1 + y_1 \\ x_2 + y_2 \end{pmatrix},$$

and the inner product by

$$\mathbf{x}^\top\mathbf{y} = (x_1, x_2)\begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = x_1 y_1 + x_2 y_2.$$

The length or norm of the vector $\mathbf{x}$ is

$$\|\mathbf{x}\| = |\mathbf{x}| = \sqrt{x_1^2 + x_2^2} = \sqrt{\mathbf{x}^\top\mathbf{x}}.$$

The programming language IDL is especially good at manipulating vectors and matrices:

IDL> x=[[1],[2]]
IDL> print,x
       1
       2
IDL> print,transpose(x)
       1       2

[Figure 1.3: The inner product.]

The inner product can be written in terms of the vector lengths and the angle $\theta$ between the two vectors as

$$\mathbf{x}^\top\mathbf{y} = |\mathbf{x}||\mathbf{y}|\cos\theta = xy\cos\theta,$$

see Fig. 1.3. If $\theta = 90^\circ$ the vectors are orthogonal, so that $\mathbf{x}^\top\mathbf{y} = 0$. Any vector can be decomposed into orthogonal unit vectors:

$$\mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = x_1\begin{pmatrix} 1 \\ 0 \end{pmatrix} + x_2\begin{pmatrix} 0 \\ 1 \end{pmatrix}.$$

A two-by-two matrix is written

$$A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}.$$

When a matrix is multiplied with a vector the result is another vector, e.g.

$$A\mathbf{x} = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} a_{11}x_1 + a_{12}x_2 \\ a_{21}x_1 + a_{22}x_2 \end{pmatrix}.$$

The IDL operator for matrix and vector multiplication is ##.

IDL> a=[[1,2],[3,4]]
IDL> print,a
       1       2
       3       4
IDL> print,a##x
       5
      11

Matrices also have a transposed form, obtained by interchanging their rows and columns:

$$A^\top = \begin{pmatrix} a_{11} & a_{21} \\ a_{12} & a_{22} \end{pmatrix}.$$

The product of two matrices is given by

$$AB = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}\begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix} = \begin{pmatrix} a_{11}b_{11} + a_{12}b_{21} & \cdots \\ \cdots & \cdots \end{pmatrix}$$

and is another matrix. The determinant of a two-dimensional matrix is

$$|A| = \det A = a_{11}a_{22} - a_{12}a_{21}.$$

The outer product of two vectors is a matrix:

$$\mathbf{x}\mathbf{y}^\top = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}(y_1, y_2) = \begin{pmatrix} x_1y_1 & x_1y_2 \\ x_2y_1 & x_2y_2 \end{pmatrix}.$$

The identity matrix is given by

$$I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad IA = AI = A.$$

The matrix inverse $A^{-1}$ is defined in terms of the identity matrix according to

$$A^{-1}A = AA^{-1} = I.$$

In two dimensions it is easy to verify that

$$A^{-1} = \frac{1}{|A|}\begin{pmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{pmatrix}.$$

IDL> print, determ(float(a))
      -2.00000
IDL> print, invert(a)
     -2.00000      1.00000
      1.50000    -0.500000
IDL> print, a##invert(a)
      1.00000     0.000000
     0.000000      1.00000

If $|A| = 0$, then $A$ has no inverse and is said to be a singular matrix. The trace of a square matrix is the sum of its diagonal elements: $\operatorname{Tr} A = a_{11} + a_{22}$.
1.3 Eigenvalues and eigenvectors

The statistical properties of ensembles of pixel intensities (for example entire images or specific land-cover classes) are often approximated by their mean values and covariance matrices. As we will see later, covariance matrices are always symmetric. A matrix $A$ is symmetric if it doesn't change when it is transposed, i.e. if $A = A^\top$.

Very often we have to solve the so-called eigenvalue problem, which is to find eigenvectors $\mathbf{x}$ and eigenvalues $\lambda$ that satisfy the equation

$$A\mathbf{x} = \lambda\mathbf{x}$$

or, equivalently,

$$\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \lambda\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}.$$

This is the same as the two equations

$$(a_{11} - \lambda)x_1 + a_{12}x_2 = 0, \qquad a_{21}x_1 + (a_{22} - \lambda)x_2 = 0. \qquad (1.2)$$

If we eliminate $x_1$ and make use of the symmetry $a_{12} = a_{21}$, we obtain

$$\big[(a_{11} - \lambda)(a_{22} - \lambda) - a_{12}^2\big]\,x_2 = 0.$$

In general $x_2 \ne 0$, so we must have

$$(a_{11} - \lambda)(a_{22} - \lambda) - a_{12}^2 = 0,$$

which is known as the characteristic equation for the eigenvalue problem. It is a quadratic equation in $\lambda$ with solutions

$$\lambda^{(1)} = \tfrac{1}{2}\Big(a_{11} + a_{22} + \sqrt{(a_{11} + a_{22})^2 - 4(a_{11}a_{22} - a_{12}^2)}\Big), \\
\lambda^{(2)} = \tfrac{1}{2}\Big(a_{11} + a_{22} - \sqrt{(a_{11} + a_{22})^2 - 4(a_{11}a_{22} - a_{12}^2)}\Big). \qquad (1.3)$$

Thus there are two eigenvalues and, correspondingly, two eigenvectors $\mathbf{x}^{(1)}$ and $\mathbf{x}^{(2)}$, which can be obtained by substituting $\lambda^{(1)}$ and $\lambda^{(2)}$ into (1.2) and solving for $x_1$ and $x_2$. It is easy to show that the eigenvectors are orthogonal:

$$(\mathbf{x}^{(1)})^\top\mathbf{x}^{(2)} = 0.$$

The matrix formed by the two eigenvectors,

$$u = (\mathbf{x}^{(1)}, \mathbf{x}^{(2)}) = \begin{pmatrix} x_1^{(1)} & x_1^{(2)} \\ x_2^{(1)} & x_2^{(2)} \end{pmatrix},$$

is said to diagonalize the matrix $A$. That is,

$$u^\top A\,u = \begin{pmatrix} \lambda^{(1)} & 0 \\ 0 & \lambda^{(2)} \end{pmatrix}. \qquad (1.4)$$

We can illustrate the whole procedure in IDL as follows:
IDL> a=float([[1,2],[2,3]])
IDL> print,a
      1.00000      2.00000
      2.00000      3.00000
IDL> print,eigenql(a,eigenvectors=u,/double)
       4.2360680     -0.23606798
IDL> print,transpose(u)##a##u
       4.2360680  -2.2204460e-016
  -1.6653345e-016     -0.23606798

Note that, after diagonalization, the off-diagonal elements are not precisely zero due to rounding errors in the computation. All of the above properties generalize easily to $N$ dimensions.

1.4 Finding minima and maxima

In order to maximize some desirable property of a multispectral image, such as signal to noise or spread in intensity, we often need to take derivatives of vectors. A vector (partial) derivative in two dimensions is written $\partial/\partial\mathbf{x}$ and is defined as the vector

$$\frac{\partial}{\partial\mathbf{x}} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}\frac{\partial}{\partial x_1} + \begin{pmatrix} 0 \\ 1 \end{pmatrix}\frac{\partial}{\partial x_2}.$$

Many of the operations with vector derivatives correspond exactly to operations with ordinary scalar derivatives (they can all be verified easily by writing out the expressions component by component):

$$\frac{\partial}{\partial\mathbf{x}}(\mathbf{x}^\top\mathbf{y}) = \mathbf{y} \quad\text{analogous to}\quad \frac{\partial}{\partial x}(xy) = y,$$

$$\frac{\partial}{\partial\mathbf{x}}(\mathbf{x}^\top\mathbf{x}) = 2\mathbf{x} \quad\text{analogous to}\quad \frac{\partial}{\partial x}x^2 = 2x.$$

The scalar expression $\mathbf{x}^\top A\,\mathbf{y}$, where $A$ is a matrix, is called a quadratic form. We have

$$\frac{\partial}{\partial\mathbf{x}}(\mathbf{x}^\top A\,\mathbf{y}) = A\mathbf{y}, \qquad \frac{\partial}{\partial\mathbf{y}}(\mathbf{x}^\top A\,\mathbf{y}) = A^\top\mathbf{x}$$

and

$$\frac{\partial}{\partial\mathbf{x}}(\mathbf{x}^\top A\,\mathbf{x}) = A\mathbf{x} + A^\top\mathbf{x}.$$

Note that, if $A$ is a symmetric matrix, this last equation can be written as

$$\frac{\partial}{\partial\mathbf{x}}(\mathbf{x}^\top A\,\mathbf{x}) = 2A\mathbf{x}.$$

Suppose $x^*$ is a critical point of the function $f(x)$, i.e.

$$\frac{d}{dx}f(x^*) = \frac{d}{dx}f(x)\Big|_{x=x^*} = 0, \qquad (1.5)$$

see Fig. 1.4.

[Figure 1.4: A function of one variable.]

Then $f(x^*)$ is a local minimum if

$$\frac{d^2}{dx^2}f(x^*) > 0.$$

This becomes obvious if we express $f(x)$ as a Taylor series about $x^*$:

$$f(x) = f(x^*) + (x - x^*)\frac{d}{dx}f(x^*) + \frac{1}{2}(x - x^*)^2\frac{d^2}{dx^2}f(x^*) + \ldots$$

For $|x - x^*|$ sufficiently small this is equivalent to

$$f(x) \approx f(x^*) + \frac{1}{2}(x - x^*)^2\frac{d^2}{dx^2}f(x^*).$$

The situation is similar for scalar functions of a vector:

$$f(\mathbf{x}) \approx f(\mathbf{x}^*) + (\mathbf{x} - \mathbf{x}^*)^\top\frac{\partial f(\mathbf{x}^*)}{\partial\mathbf{x}} + \frac{1}{2}(\mathbf{x} - \mathbf{x}^*)^\top H\,(\mathbf{x} - \mathbf{x}^*), \qquad (1.6)$$

where $H$ is called the Hessian matrix:

$$(H)_{ij} = \frac{\partial^2}{\partial x_i\partial x_j}f(\mathbf{x}^*). \qquad (1.7)$$

In the neighborhood of the critical point, since $\partial f(\mathbf{x}^*)/\partial\mathbf{x} = 0$, we get the approximation

$$f(\mathbf{x}) \approx f(\mathbf{x}^*) + \frac{1}{2}(\mathbf{x} - \mathbf{x}^*)^\top H\,(\mathbf{x} - \mathbf{x}^*).$$

Now the condition for a local minimum is that the Hessian matrix be positive definite at the point $\mathbf{x}^*$. Positive definiteness means that

$$\mathbf{x}^\top H\,\mathbf{x} > 0 \quad\text{for all}\quad \mathbf{x} \ne 0. \qquad (1.8)$$

Suppose we want to find a minimum (or maximum) of a scalar function $f(\mathbf{x})$ of the vector $\mathbf{x}$. If there are no constraints, then we solve the set of equations

$$\frac{\partial f(\mathbf{x})}{\partial x_i} = 0, \quad i = 1, 2,$$

or, in terms of our notation for vector derivatives,

$$\frac{\partial f(\mathbf{x})}{\partial\mathbf{x}} = \mathbf{0} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.$$
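Condition (1.8) is easy to test numerically, since a symmetric matrix is positive definite exactly when all of its eigenvalues are positive. A small sketch (the matrix is our own example, chosen arbitrarily):

IDL> h = float([[2,1],[1,3]])
IDL> print, eigenql(h)
      3.61803      1.38197

Both eigenvalues are positive, so h would qualify as the Hessian of a local minimum.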
Suppose now, however, that $\mathbf{x}$ is constrained by the equation $g(\mathbf{x}) = 0$. For example, we might have

$$g(\mathbf{x}) = x_1^2 + x_2^2 - 1 = 0,$$

which constrains $\mathbf{x}$ to lie on a circle of radius 1. Finding a minimum of $f$ subject to $g = 0$ is equivalent to finding an unconstrained minimum of

$$f(\mathbf{x}) + \lambda g(\mathbf{x}), \qquad (1.9)$$

where $\lambda$ is called a Lagrange multiplier and is treated like an additional variable, see [Mil99]. That is, we solve the set of equations

$$\frac{\partial}{\partial x_i}\big(f(\mathbf{x}) + \lambda g(\mathbf{x})\big) = 0, \quad i = 1, 2, \\
\frac{\partial}{\partial\lambda}\big(f(\mathbf{x}) + \lambda g(\mathbf{x})\big) = 0. \qquad (1.10)$$

The latter equation is just $g(\mathbf{x}) = 0$. For example, let $f(\mathbf{x}) = ax_1^2 + bx_2^2$ and $g(\mathbf{x}) = x_1 + x_2 - 1$. Then we get the three equations

$$\frac{\partial}{\partial x_1}\big(f(\mathbf{x}) + \lambda g(\mathbf{x})\big) = 2ax_1 + \lambda = 0, \\
\frac{\partial}{\partial x_2}\big(f(\mathbf{x}) + \lambda g(\mathbf{x})\big) = 2bx_2 + \lambda = 0, \\
\frac{\partial}{\partial\lambda}\big(f(\mathbf{x}) + \lambda g(\mathbf{x})\big) = x_1 + x_2 - 1 = 0.$$

The solution is

$$x_1 = \frac{b}{a+b}, \qquad x_2 = \frac{a}{a+b}.$$
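The solution is easily checked numerically. A minimal sketch with arbitrarily chosen coefficients (our own example, not from the text):

a = 1.0 & b = 2.0
x1 = b/(a+b) & x2 = a/(a+b)
lambda = -2*a*x1                    ; from the first stationarity equation
print, 2*b*x2 + lambda, x1 + x2 - 1 ; both expressions print zero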
Exercises

1. Show that the outer product of two 2-dimensional vectors is a singular matrix.

2. Prove that the eigenvectors of a 2 x 2 symmetric matrix are orthogonal.

3. Differentiate the function $1/(\mathbf{x}^\top A\,\mathbf{y})$ with respect to $\mathbf{y}$.

4. Verify the following matrix identity in IDL: $(AB)^\top = B^\top A^\top$.

5. Calculate the eigenvalues and eigenvectors of a non-symmetric matrix with IDL.

6. Plot the function $f(\mathbf{x}) = x_1^2 - x_2^2$ with IDL. Find its minima and maxima subject to the constraint $g(\mathbf{x}) = x_1^2 + x_2^2 - 1 = 0$.
Chapter 2

Image Statistics

It is useful to think of image pixel intensities $\mathbf{g}(\mathbf{x})$ as realizations of a random vector $G(\mathbf{x})$ drawn independently from some probability distribution.

2.1 Random variables

A random variable can be used to represent some quantity which changes in an unpredictable way each time it is observed. If there is a discrete set of $M$ possible events $\{E_i\}$, $i = 1\ldots M$, associated with some random process, let $p_i$ be the probability that the $i$th event $E_i$ will occur. If $n_i$ represents the number of times $E_i$ occurs in $n$ trials, we expect that $p_i \to n_i/n$ in the limit $n \to \infty$ and that

$$\sum_{i=1}^M p_i = 1.$$

For example, on the throw of a pair of dice, $\{E_i\} = (1,1), (1,2), (2,1), \ldots, (6,6)$ and each event is equally probable: $p_i = 1/36$, $i = 1\ldots 36$.

Formally, a random variable $X$ is a real function on the set of possible events: $X = f(E_i)$. If, for example, $X$ is the sum of the points on the dice,

$$X = f(E_1) = 2, \quad X = f(E_2) = 3, \quad X = f(E_3) = 3, \quad\ldots\quad X = f(E_{36}) = 12.$$

On the basis of the probabilities of the individual events, we can associate a distribution function $P(x)$ with the random variable $X$, defined by

$$P(x) = \Pr(X \le x).$$

For the dice example, $P(1) = 0$, $P(2) = 1/36$, $P(3) = 1/12$, ... $P(12) = 1$.
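A quick sketch in IDL (our own check, not from the text) tabulates this distribution function by enumerating all 36 outcomes:

s = intarr(36)
k = 0
for i = 1, 6 do begin
  for j = 1, 6 do begin
    s[k] = i + j        ; sum of points for this outcome
    k = k + 1
  endfor
endfor
for x = 2, 12 do print, x, total(s le x)/36.0

For x = 3, for instance, this prints 0.0833333 = 1/12, in agreement with the values above.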
For continuous random variables, such as the measured radiance at a satellite sensor, the distribution function is not expressed in terms of discrete probabilities, but rather in terms of a probability density function $p(x)$, where $p(x)\,dx$ is the probability that the value of the random variable $X$ lies in the interval $[x, x+dx]$. Then

$$P(x) = \Pr(X \le x) = \int_{-\infty}^x p(t)\,dt$$

and, of course, $P(-\infty) = 0$, $P(\infty) = 1$.

Two random variables $X$ and $Y$ are said to be independent when

$$\Pr(X \le x \text{ and } Y \le y) = \Pr(X \le x,\, Y \le y) = P(x)P(y).$$

The mean or expected value of a random variable $X$ is written $\langle X\rangle$ and is defined in terms of the probability density function:

$$\langle X\rangle = \int_{-\infty}^\infty x\,p(x)\,dx.$$

The variance of $X$, written $\mathrm{var}(X)$, is defined as the expected value of the random variable $(X - \langle X\rangle)^2$, i.e.

$$\mathrm{var}(X) = \langle(X - \langle X\rangle)^2\rangle.$$

In terms of the probability density function, it is given by

$$\mathrm{var}(X) = \int_{-\infty}^\infty (x - \langle X\rangle)^2\,p(x)\,dx.$$

Two simple but very useful identities follow from the definition of variance:

$$\mathrm{var}(X) = \langle X^2\rangle - \langle X\rangle^2, \qquad \mathrm{var}(aX) = a^2\,\mathrm{var}(X). \qquad (2.1)$$

2.2 The normal distribution

It is often the case that random variables are well described by the normal or Gaussian probability density function

$$p(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\Big(-\frac{1}{2\sigma^2}(x - \mu)^2\Big).$$

In that case $\langle X\rangle = \mu$ and $\mathrm{var}(X) = \sigma^2$.
The expected value of the vector of pixel intensities

$$G(\mathbf{x}) = \begin{pmatrix} G_1(\mathbf{x}) \\ G_2(\mathbf{x}) \\ \vdots \\ G_N(\mathbf{x}) \end{pmatrix},$$

where $\mathbf{x}$ denotes the pixel coordinates, i.e. $\mathbf{x} = (i,j)$, is estimated by averaging over all of the pixels in the image,

$$\langle G(\mathbf{x})\rangle \approx \frac{1}{cr}\sum_{i,j=1}^{c,r}\mathbf{g}(i,j),$$

referred to as the sample mean vector. It is usually assumed to be independent of $\mathbf{x}$, i.e. $\langle G(\mathbf{x})\rangle = \langle G\rangle$.

The covariance between bands $k$ and $\ell$ is defined according to

$$\mathrm{cov}(G_k, G_\ell) = \langle(G_k - \langle G_k\rangle)(G_\ell - \langle G_\ell\rangle)\rangle$$

and is estimated again by averaging over the pixels:

$$\mathrm{cov}(G_k, G_\ell) \approx \frac{1}{cr}\sum_{i,j=1}^{c,r}(g_k(i,j) - \langle G_k\rangle)(g_\ell(i,j) - \langle G_\ell\rangle),$$

which is called the sample covariance. The covariance is also usually assumed to be independent of $\mathbf{x}$. The variance for band $k$ is given by

$$\mathrm{var}(G_k) = \mathrm{cov}(G_k, G_k) = \langle(G_k - \langle G_k\rangle)^2\rangle.$$

The random vector $G$ is often assumed to be described by a multivariate normal probability density function $p(\mathbf{g})$, given by

$$p(\mathbf{g}) = \frac{1}{(2\pi)^{N/2}\sqrt{|\Sigma|}}\exp\Big(-\frac{1}{2}(\mathbf{g} - \boldsymbol\mu)^\top\Sigma^{-1}(\mathbf{g} - \boldsymbol\mu)\Big).$$

We indicate this by writing $G \sim N(\boldsymbol\mu, \Sigma)$. The distribution function of the multispectral pixels is then completely determined by the expected value $\langle G\rangle = \boldsymbol\mu$ and by the covariance matrix $\Sigma$. In two dimensions, for example,

$$\Sigma = \begin{pmatrix} \mathrm{var}(G_1) & \mathrm{cov}(G_1, G_2) \\ \mathrm{cov}(G_2, G_1) & \mathrm{var}(G_2) \end{pmatrix} = \begin{pmatrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{21} & \sigma_2^2 \end{pmatrix}.$$

Note that, since $\mathrm{cov}(G_k, G_\ell) = \mathrm{cov}(G_\ell, G_k)$, the covariance matrix is symmetric, $\Sigma = \Sigma^\top$. The covariance matrix can also be written as an outer product:

$$\Sigma = \langle(G - \langle G\rangle)(G - \langle G\rangle)^\top\rangle,$$

as can its estimated value:

$$\Sigma \approx \frac{1}{cr}\sum_{i,j=1}^{c,r}(\mathbf{g}(i,j) - \langle G\rangle)(\mathbf{g}(i,j) - \langle G\rangle)^\top.$$

If $\langle G\rangle = 0$, we can write simply $\Sigma = \langle GG^\top\rangle$.

Another useful identity applies to any linear combination $\mathbf{a}^\top G$ of the random vector $G$, namely

$$\mathrm{var}(\mathbf{a}^\top G) = \mathbf{a}^\top\Sigma\,\mathbf{a}. \qquad (2.2)$$

This is obvious in two dimensions, since we have

$$\mathrm{var}(\mathbf{a}^\top G) = \mathrm{cov}(a_1G_1 + a_2G_2,\; a_1G_1 + a_2G_2) \\
= a_1^2\,\mathrm{var}(G_1) + a_1a_2\,\mathrm{cov}(G_1, G_2) + a_1a_2\,\mathrm{cov}(G_2, G_1) + a_2^2\,\mathrm{var}(G_2) \\
= (a_1, a_2)\begin{pmatrix} \mathrm{var}(G_1) & \mathrm{cov}(G_1, G_2) \\ \mathrm{cov}(G_2, G_1) & \mathrm{var}(G_2) \end{pmatrix}\begin{pmatrix} a_1 \\ a_2 \end{pmatrix}.$$

Variance is always nonnegative and the vector $\mathbf{a}$ in (2.2) is arbitrary, so we have $\mathbf{a}^\top\Sigma\,\mathbf{a} \ge 0$ for all $\mathbf{a}$. The covariance matrix is therefore said to be positive semi-definite.

The correlation matrix $C$ is similar to the covariance matrix, except that each matrix element $(i,j)$ is normalized to $\sqrt{\mathrm{var}(G_i)\,\mathrm{var}(G_j)}$. In two dimensions

$$C = \begin{pmatrix} 1 & \rho_{12} \\ \rho_{21} & 1 \end{pmatrix} = \begin{pmatrix} 1 & \frac{\mathrm{cov}(G_1,G_2)}{\sqrt{\mathrm{var}(G_1)\mathrm{var}(G_2)}} \\ \frac{\mathrm{cov}(G_2,G_1)}{\sqrt{\mathrm{var}(G_1)\mathrm{var}(G_2)}} & 1 \end{pmatrix} = \begin{pmatrix} 1 & \frac{\sigma_{12}}{\sigma_1\sigma_2} \\ \frac{\sigma_{21}}{\sigma_1\sigma_2} & 1 \end{pmatrix}.$$

The following ENVI/IDL program calculates and prints out the covariance matrix of a multispectral image:

envi_select, title='Choose multispectral image', fid=fid, dims=dims, pos=pos
if (fid eq -1) then return
num_cols = dims[2]-dims[1]+1
num_rows = dims[4]-dims[3]+1
num_pixels = num_cols*num_rows
num_bands = n_elements(pos)
samples = intarr(num_bands, num_pixels)
for i=0,num_bands-1 do samples[i,*] = envi_get_data(fid=fid,dims=dims,pos=pos[i])
print, correlate(samples, /covariance, /double)
end

ENVI> .GO
     111.46663       82.123236       159.58377       133.80637
     82.123236       64.532431       124.84815       104.45298
     159.58377       124.84815       246.18004       205.63420
     133.80637       104.45298       205.63420       192.70367

2.3 A special function

If $n$ is an integer, the factorial of $n$ is defined by $n! = n(n-1)\cdots 1$, with $1! = 0! = 1$. The generalization of this to non-integers $z$ is the gamma function

$$\Gamma(z) = \int_0^\infty t^{z-1}e^{-t}\,dt.$$

It has the property

$$\Gamma(z+1) = z\,\Gamma(z).$$
The factorial is a special case, i.e. for integer $n$,

$$\Gamma(n+1) = n!$$

A further generalization is the incomplete gamma function

$$\Gamma_P(a, x) = \frac{1}{\Gamma(a)}\int_0^x t^{a-1}e^{-t}\,dt.$$

It has the properties $\Gamma_P(a, 0) = 0$ and $\Gamma_P(a, \infty) = 1$. Here is a plot of $\Gamma_P$ for $a = 3$ in IDL:

x = findgen(100)/10
envi_plot_data, x, igamma(3,x)

[Figure 2.1: The incomplete gamma function.]

We are interested in this function for the following reason. Suppose that the random variables $X_i$, $i = 1\ldots n$, are independent and normally distributed with zero mean and variance $\sigma_i^2$. Then the random variable

$$Z = \sum_{i=1}^n\Big(\frac{X_i}{\sigma_i}\Big)^2$$

has the distribution function

$$P(z) = \Pr(Z \le z) = \Gamma_P(n/2,\, z/2),$$

and is said to be chi-square distributed with $n$ degrees of freedom.
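This connection is easy to verify by simulation. A hedged sketch (sample size and test point chosen arbitrarily) comparing the empirical distribution function of $Z$ for $n = 3$ with the incomplete gamma function:

n = 3 & m = 100000L
x = randomn(seed, n, m)        ; n standard normal variates per trial
z = total(x^2, 1)              ; realizations of Z
print, total(z le 2.0)/m, igamma(n/2., 1.0)
; both numbers should be close to 0.43
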
2.4 Conditional probabilities and Bayes Theorem

If $A$ and $B$ are two events such that the probability of $A$ and $B$ occurring simultaneously is $P(A, B)$, then the conditional probability of $A$ occurring given that $B$ has occurred is

$$P(A \mid B) = \frac{P(A, B)}{P(B)}.$$

Bayes' Theorem (named after Rev. Thomas Bayes, an 18th century mathematician who derived a special case) is the basic starting point for inference problems using probability theory as logic. We will use it in the following form. Let $X$ be a random variable describing a pixel intensity, and let $\{C_k \mid k = 1\ldots M\}$ be a set of possible classes for the pixels. Then the a posteriori conditional probability for class $C_k$, given the measured pixel intensity $x$, is

$$P(C_k \mid x) = \frac{P(x \mid C_k)\,P(C_k)}{P(x)}, \qquad (2.3)$$

where $P(C_k)$ is the prior probability for class $C_k$, $P(x \mid C_k)$ is the conditional probability of observing the value $x$ if it belongs to class $C_k$, and

$$P(x) = \sum_{k=1}^M P(x \mid C_k)\,P(C_k)$$

is the total probability for $x$.

2.5 Linear regression

Applying radiometric corrections to digital images often involves fitting a set of $m$ data points $(x_i, y_i)$ to a straight line:

$$y(x) = a + bx + \epsilon.$$

Suppose that the measurements $y_i$ include a random error $\epsilon$ with variance $\sigma^2$ and that the measurements $x_i$ are exact. Define a "goodness of fit" function

$$\chi^2(a, b) = \sum_{i=1}^m\Big(\frac{y_i - a - bx_i}{\sigma}\Big)^2. \qquad (2.4)$$

If the random variable $\epsilon$ is normally distributed, then we obtain the most likely (i.e. best) values for $a$ and $b$ by minimizing this function, that is, by solving the equations

$$\frac{\partial\chi^2}{\partial a} = \frac{\partial\chi^2}{\partial b} = 0.$$

The solution is

$$\hat b = \frac{s_{xy}}{s_{xx}^2}, \qquad \hat a = \bar y - \hat b\,\bar x, \qquad (2.5)$$

where

$$s_{xy} = \frac{1}{m}\sum_{i=1}^m(x_i - \bar x)(y_i - \bar y), \qquad s_{xx}^2 = \frac{1}{m}\sum_{i=1}^m(x_i - \bar x)^2, \\
\bar x = \frac{1}{m}\sum_{i=1}^m x_i, \qquad \bar y = \frac{1}{m}\sum_{i=1}^m y_i.$$

The uncertainties in the estimates $\hat a$ and $\hat b$ are given by

$$\sigma_a^2 = \frac{\sigma^2\sum x_i^2}{m\sum x_i^2 - \left(\sum x_i\right)^2}, \qquad \sigma_b^2 = \frac{\sigma^2\,m}{m\sum x_i^2 - \left(\sum x_i\right)^2}. \qquad (2.6)$$

If $\sigma^2$ is not known a priori, then it can be estimated by

$$\hat\sigma^2 = \frac{1}{m-2}\sum_{i=1}^m(y_i - \hat a - \hat b x_i)^2.$$

Generalized and orthogonal least squares methods are described in Appendix A; a recursive procedure is described there as well.
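As a concrete check, the following sketch (synthetic data, our own example) computes (2.5) directly and compares the result with IDL's built-in LINFIT routine:

m = 50
x = findgen(m)
y = 3.0 + 0.5*x + randomn(seed, m)     ; true a = 3, b = 0.5, sigma = 1
sxy = total((x - mean(x))*(y - mean(y)))/m
sxx2 = total((x - mean(x))^2)/m
b_hat = sxy/sxx2
a_hat = mean(y) - b_hat*mean(x)
print, a_hat, b_hat
print, linfit(x, y)                    ; should print nearly the same pair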
Exercises

1. Write the multivariate normal probability density function $p(\mathbf{g})$ for the case $\Sigma = \sigma^2 I$. Show that the probability density function for a one-dimensional random variable $G$ is a special case. Prove that $\langle G\rangle = \boldsymbol\mu$.

2. In the Monty Hall game a contestant is asked to choose between one of three doors. Behind one of the doors is an automobile as prize for choosing the correct door. After the contestant has chosen, Monty Hall opens one of the other two doors to show that the automobile is not there. He then asks the contestant if she wishes to change her mind and choose the other unopened door. Use Bayes' theorem to prove that her correct answer is "yes".

3. Derive the uncertainty for $a$ in (2.6) from the formula for error propagation

$$\sigma_a^2 = \sum_{i=1}^N \sigma^2\Big(\frac{\partial f}{\partial y_i}\Big)^2.$$
Chapter 3

Transformations

Up until now we have thought of multispectral images as $(r \times c \times N)$-dimensional arrays of measured pixel intensities. In the present chapter we consider other representations of images which are often useful in image analysis.

3.1 Fourier transforms

A periodic function $x(t)$ with period $T$, $x(t) = x(t+T)$, can always be expressed as the infinite Fourier series

$$x(t) = \sum_{k=-\infty}^\infty \hat x(k)\,e^{i2\pi(kf)t}, \qquad (3.1)$$

where $f = 1/T = \omega/2\pi$ and $e^{ix} = \cos x + i\sin x$. From the orthogonality of the e-functions, the coefficients $\hat x(k)$ in the expansion are given by

$$\hat x(k) = f\int_{-1/2f}^{1/2f} x(t)\,e^{-i2\pi(kf)t}\,dt. \qquad (3.2)$$

Figure 3.1 shows an example for the sawtooth function with period $T = 1$:

$$x(t) = t, \quad -1/2 \le t < 1/2.$$

[Figure 3.1: Fourier series approximation of a sawtooth function. The series was truncated at $k = \pm 4$. The left hand side shows the intensities $|\hat x(k)|^2$.]
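For the sawtooth, evaluating (3.2) gives $\hat x(k) = i(-1)^k/(2\pi k)$ for $k \ne 0$, so the truncated series of Figure 3.1 can be plotted directly. A minimal sketch (our own reconstruction of the figure):

t = findgen(200)/200 - 0.5
xs = fltarr(200)
for k = 1, 4 do xs = xs + (-1)^(k+1) * sin(2*!pi*k*t)/(!pi*k)
plot, t, xs                  ; truncated Fourier series, compare Figure 3.1
oplot, t, t, linestyle=1     ; the sawtooth itself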
Parseval's formula follows directly from (3.2):

$$\sum_k|\hat x(k)|^2 = f\int_{-1/2f}^{1/2f}\big(x(t)\big)^2\,dt.$$

3.1.1 Discrete Fourier transform

Let $g(j)$ be a discrete sample of the real function $g(x)$ (a row of pixels), sampled $c$ times at the sampling interval $\Delta$ over a complete period $T$, i.e.

$$g(j) = g(x = j\Delta), \quad j = 0\ldots c-1.$$

The corresponding discrete Fourier series is written

$$g(j) = \frac{1}{c}\sum_{k=-c/2}^{c/2}\hat g(k)\,e^{i2\pi(kf)(j\Delta)}, \quad j = 0\ldots c-1, \qquad (3.3)$$

where the truncation frequency $\pm\frac{c}{2}f$ is the highest frequency component that can be determined by the sampling. This frequency is called the Nyquist critical frequency and is given by $1/2\Delta$, so that $f$ is determined by

$$\frac{cf}{2} = \frac{1}{2\Delta} \quad\text{or}\quad f = \frac{1}{c\Delta}.$$

(This corresponds to sampling over one complete period: $c\Delta = T$.) Thus (3.3) becomes

$$g(j) = \frac{1}{c}\sum_{k=-c/2}^{c/2}\hat g(k)\,e^{i2\pi kj/c}, \quad j = 0\ldots c-1.$$

With the observation

$$e^{i2\pi(-c/2)j/c} = e^{-i\pi j} = (-1)^j = e^{i\pi j} = e^{i2\pi(c/2)j/c},$$

we can write this as

$$g(j) = \frac{1}{c}\sum_{k=-c/2}^{c/2-1}\hat g(k)\,e^{i2\pi kj/c}, \quad j = 0\ldots c-1,$$

a set of $c$ equations in the $c$ unknown frequency components $\hat g(k)$. Equivalently,

$$g(j) = \frac{1}{c}\sum_{k=0}^{c/2-1}\hat g(k)\,e^{i2\pi kj/c} + \frac{1}{c}\sum_{k=-c/2}^{-1}\hat g(k)\,e^{i2\pi kj/c} \\
= \frac{1}{c}\sum_{k=0}^{c/2-1}\hat g(k)\,e^{i2\pi kj/c} + \frac{1}{c}\sum_{k'=c/2}^{c-1}\hat g(k'-c)\,e^{i2\pi(k'-c)j/c} \\
= \frac{1}{c}\sum_{k=0}^{c/2-1}\hat g(k)\,e^{i2\pi kj/c} + \frac{1}{c}\sum_{k=c/2}^{c-1}\hat g(k-c)\,e^{i2\pi kj/c}.$$

Thus we can write

$$g(j) = \frac{1}{c}\sum_{k=0}^{c-1}\hat g(k)\,e^{i2\pi kj/c}, \quad j = 0\ldots c-1, \qquad (3.4)$$

if we interpret $\hat g(k) \to \hat g(k-c)$ when $k \ge c/2$. The solution to (3.4) for the complex frequency components $\hat g(k)$ is called the discrete Fourier transform and is given by

$$\hat g(k) = \sum_{j=0}^{c-1}g(j)\,e^{-i2\pi kj/c}, \quad k = 0\ldots c-1. \qquad (3.5)$$

This follows from the following orthogonality property:

$$\sum_{j=0}^{c-1}e^{i2\pi(k-k')j/c} = c\,\delta_{k,k'}. \qquad (3.6)$$

Eq. (3.4) itself is the discrete inverse Fourier transform. The discrete analog of Parseval's formula is

$$\sum_{k=0}^{c-1}|\hat g(k)|^2 = c\sum_{j=0}^{c-1}g(j)^2. \qquad (3.7)$$

Determining the frequency components in (3.5) would appear to involve, in all, $c^2$ floating point multiplication operations. The fast Fourier transform (FFT) exploits the structure of the complex e-functions to reduce this to order $c\log c$; see for example [PFTV86].

3.1.2 Discrete Fourier transform of an image

The discrete Fourier transform is easily generalized to two dimensions for the purpose of image analysis. Let $g(i,j)$, $i,j = 0\ldots c-1$, represent a (quadratic) gray-scale image. Its discrete Fourier transform is

$$\hat g(k,\ell) = \sum_{i=0}^{c-1}\sum_{j=0}^{c-1}g(i,j)\,e^{-i2\pi(ik + j\ell)/c} \qquad (3.8)$$

and the corresponding inverse transform is

$$g(i,j) = \frac{1}{c^2}\sum_{k=0}^{c-1}\sum_{\ell=0}^{c-1}\hat g(k,\ell)\,e^{i2\pi(ik + j\ell)/c}. \qquad (3.9)$$
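Equations (3.5) and (3.7) are easy to check numerically with IDL's built-in FFT function. Note that IDL places the factor $1/c$ in the forward transform, so the coefficients in the convention of (3.5) are obtained as $c$ times the FFT output. A brief sketch (random test signal, our own example):

c = 64
g = randomu(seed, c)
ghat = c*fft(g, -1)            ; forward DFT in the convention of (3.5)
print, total(abs(ghat)^2), c*total(g^2)   ; the two sums agree (up to rounding)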
  • 32. 24 CHAPTER 3. TRANSFORMATIONS 3.3 Principal components The principal components transformation forms linear combinations of multispectral pixel intensities which are mutually uncorrelated and which have maximum variance. We assume without loss of generality that G = 0, so that the covariance matrix of a multispectral image is is Σ = GG , and look for a linear combination Y = a G with maximum variance, subject to the normalization condition a a = 1. Since the covariance of Y is a Σa, this is equivalent to maximizing an unconstrained Lagrange function, see Section 1.4, L = a Σa − 2λ(a a − 1). The maximum of L occurs at that value of a for which ∂L ∂a = 0. Recalling the rules for vector differentiation, ∂L ∂a = 2Σa − 2λa = 0 which is the eigenvalue problem Σa = λa. Since Σ is real and symmetric, the eigenvectors are orthogonal (and normalized). Denote them a1 . . . aN for eigenvalues λ1 ≥ . . . ≥ λN . Define the matrix A = (a1 . . . aN ), AA = I, and let the the transformed principal component vector be Y = A G with covariance matrix Σ . Then we have Σ = YY = A GG A = A ΣA = Diag(λ1 . . . λN ) =     λ1 0 · · · 0 0 λ2 · · · 0 ... ... ... ... 0 0 · · · λN     =: Λ. The fraction of the total variance in the original multispectral image which is described by the first i principal components is λ1 + . . . + λi λ1 + . . . + λi + . . . + λN . If the original multispectral channels are highly correlated, as is usually the case, the first few principal components will account for a very high percentage of the variance the image. For example, a color composite of the first 3 principal components of a LANDSAT TM scene displays essentially all of the information contained in the 6 spectral components in one single image. Nevertheless, because of the approximation involved in the assumption of a normal distribution, higher order principal components may also contain significant information [JRR99]. The principal components transformation can be performed directly from the ENVI main menu. However the following IDL program illustrates the procedure in detail: ; Principal components analysis envi_select, title=’Choose multispectral image’, $
   fid=fid, dims=dims, pos=pos
if (fid eq -1) then return
num_cols = dims[2]+1
num_lines = dims[4]+1
num_pixels = (num_cols*num_lines)
num_channels = n_elements(pos)
image = intarr(num_channels, num_pixels)
for i=0,num_channels-1 do begin
   temp = envi_get_data(fid=fid, dims=dims, pos=pos[i])
   m = mean(temp)
   image[i,*] = temp - m
endfor
; calculate the transformation matrix A
sigma = correlate(image, /covariance, /double)
lambda = eigenql(sigma, eigenvectors=A, /double)
print, 'Covariance matrix'
print, sigma
print, 'Eigenvalues'
print, lambda
print, 'Eigenvectors'
print, A
; transform the image
image = image ## transpose(A)
; reform to BSQ format
PC_array = fltarr(num_cols, num_lines, num_channels)
for i=0,num_channels-1 do PC_array[*,*,i] = $
   reform(image[i,*], num_cols, num_lines, /overwrite)
; output the result to memory
envi_enter_data, PC_array
end

3.4 Minimum noise fraction

Principal components analysis maximizes variance, which does not always produce a sequence of components of decreasing image quality (i.e. of increasing noise). The minimum noise fraction (MNF) transformation orders the components by noise content rather than by variance, so if image quality is the desired criterion it is to be preferred over PCA. Suppose we can represent a gray scale image $G$ with covariance matrix $\Sigma$ and zero mean as a sum of uncorrelated signal and noise components,
$$G = S + N,$$
both normally distributed, with covariance matrices $\Sigma_S$ and $\Sigma_N$ and zero mean. Then we have
$$\Sigma = \langle GG^\top\rangle = \langle(S+N)(S+N)^\top\rangle = \langle SS^\top\rangle + \langle NN^\top\rangle,$$
since noise and signal are uncorrelated, i.e. $\langle SN^\top\rangle = \langle NS^\top\rangle = 0$. Thus
$$\Sigma = \Sigma_S + \Sigma_N. \qquad (3.10)$$
Now let us seek a linear combination $a^\top G$ for which the signal-to-noise ratio
$$\mathrm{SNR} = \frac{\mathrm{var}(a^\top S)}{\mathrm{var}(a^\top N)} = \frac{a^\top\Sigma_S a}{a^\top\Sigma_N a}$$
is maximized. From (3.10) we can write this in the form
$$\mathrm{SNR} = \frac{a^\top\Sigma a}{a^\top\Sigma_N a} - 1. \qquad (3.11)$$
Differentiating,
$$\frac{\partial\,\mathrm{SNR}}{\partial a} = \frac{2\Sigma a}{a^\top\Sigma_N a} - \frac{(a^\top\Sigma a)\,2\Sigma_N a}{(a^\top\Sigma_N a)^2} = 0,$$
or, equivalently,
$$(a^\top\Sigma_N a)\,\Sigma a = (a^\top\Sigma a)\,\Sigma_N a.$$
This condition is met when $a$ solves the generalized eigenvalue problem
$$\Sigma_N a = \lambda\Sigma a. \qquad (3.12)$$
Both $\Sigma_N$ and $\Sigma$ are symmetric and the latter is also positive definite. Its Cholesky factorization is $\Sigma = LL^\top$, where $L$ is a lower triangular matrix which can be thought of as the "square root" of $\Sigma$. Such an $L$ always exists if $\Sigma$ is positive definite. With this, we can write (3.12) as
$$\Sigma_N a = \lambda LL^\top a$$
or, equivalently, $L^{-1}\Sigma_N(L^\top)^{-1}L^\top a = \lambda L^\top a$, or, with $b = L^\top a$ and the commutativity of inverse and transpose,
$$[L^{-1}\Sigma_N(L^{-1})^\top]\,b = \lambda b,$$
a standard eigenproblem for the real, symmetric matrix $L^{-1}\Sigma_N(L^{-1})^\top$. From (3.11) we see that the SNR for eigenvalue $\lambda_i$ is just
$$\mathrm{SNR}_i = \frac{a_i^\top\Sigma a_i}{a_i^\top(\lambda_i\Sigma a_i)} - 1 = \frac{1}{\lambda_i} - 1.$$
Thus the eigenvector $a_i$ corresponding to the smallest eigenvalue $\lambda_i$ maximizes the signal-to-noise ratio. Note that (3.12) can be written in the form
$$\Sigma_N A = \Sigma A\Lambda, \qquad (3.13)$$
where $A = (a_1 \ldots a_N)$ and $\Lambda = \mathrm{Diag}(\lambda_1 \ldots \lambda_N)$.

The MNF transformation is available in the ENVI environment. It is carried out in two steps which are equivalent to the above. First the noise contribution to $G$ is "whitened", i.e. transformed so that the random vector $N$ has covariance matrix $I$, the identity matrix. Since $\Sigma_N$ can be assumed to be diagonal anyway (the noise in any band is uncorrelated with the noise in any other band), we accomplish this with a transformation which divides the components of $G$ by the standard deviations of the noise,
$$X = \Sigma_N^{-1/2}G, \qquad \text{where}\quad \Sigma_N^{-1/2}\Sigma_N\Sigma_N^{-1/2} = I.$$
The transformed random vector $X$ thus has covariance matrix
$$\Sigma_X = \Sigma_N^{-1/2}\Sigma\,\Sigma_N^{-1/2}. \qquad (3.14)$$
Next we perform an ordinary principal components transformation on $X$, i.e. $Y = B^\top X$, where
$$B^\top\Sigma_X B = \Lambda_X, \qquad B^\top B = I. \qquad (3.15)$$
The overall transformation is thus
$$Y = B^\top\Sigma_N^{-1/2}G = A^\top G, \qquad \text{where}\quad A = \Sigma_N^{-1/2}B$$
is not an orthogonal transformation. To see that this is equivalent to solving the generalized eigenvalue problem, consider
$$\Sigma_N A = \Sigma_N\Sigma_N^{-1/2}B = \Sigma_N^{1/2}\Sigma_X B\Lambda_X^{-1} = \Sigma_N^{1/2}\Sigma_N^{-1/2}\Sigma\,\Sigma_N^{-1/2}B\Lambda_X^{-1} = \Sigma A\Lambda_X^{-1}.$$
This is equivalent to (3.13) with
$$\lambda_{X_i} = \frac{1}{\lambda_i} = \mathrm{SNR}_i + 1.$$
Thus an eigenvalue in the second transformation equal to one corresponds to "pure noise". Before the transformation can be performed it is of course necessary to estimate the noise covariance matrix $\Sigma_N$. This can be done, for example, by differencing with respect to the local mean:
$$(\Sigma_N)_{k\ell} \approx \frac{1}{cr}\sum_{i,j}^{c,r}\bigl(g_k(i,j) - m_k(i,j)\bigr)\bigl(g_\ell(i,j) - m_\ell(i,j)\bigr),$$
where $m_k(i,j)$ is the local mean of band $k$ in some neighborhood of $(i,j)$.
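As an illustration (not the exact procedure of the ENVI implementation), here is a minimal IDL sketch of this noise estimate: each band is differenced against its 3x3 boxcar mean and the covariance matrix of the residuals is computed. The variables image, num_channels, num_cols and num_lines are assumed to be set up as in the principal components program of Section 3.3.

; Sketch: estimate Sigma_N by differencing each band with its local mean
noise = image*0.0
for k=0,num_channels-1 do begin
   band = reform(image[k,*], num_cols, num_lines)
   d = band - smooth(band, 3, /edge_truncate)   ; residual about 3x3 mean
   noise[k,*] = d[*]
endfor
sigma_N = correlate(noise, /covariance, /double)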
3.5 Maximum autocorrelation factor (MAF)

Let $x$ represent the coordinates of a pixel within image $G$, i.e. $x = (i,j)$. We consider the covariance matrix $\Gamma$ between the original image, represented by $G(x)$, and the same image $G(x+\Delta)$ shifted by an amount $\Delta = (\Delta x, \Delta y)$:
$$\Gamma(\Delta) = \langle G(x)G(x+\Delta)^\top\rangle,$$
assumed to be independent of $x$. Then $\Gamma(0) = \Sigma$, and furthermore
$$\Gamma(-\Delta) = \langle G(x)G(x-\Delta)^\top\rangle = \langle G(x+\Delta)G(x)^\top\rangle = \bigl(\langle G(x)G(x+\Delta)^\top\rangle\bigr)^\top = \Gamma(\Delta)^\top.$$
Now we consider the covariance of projections of the original and shifted images:
$$\mathrm{cov}\bigl(a^\top G(x),\,a^\top G(x+\Delta)\bigr) = a^\top\langle G(x)G(x+\Delta)^\top\rangle a = a^\top\Gamma(\Delta)a = a^\top\Gamma(-\Delta)a = \tfrac12\,a^\top\bigl(\Gamma(\Delta)+\Gamma(-\Delta)\bigr)a. \qquad (3.16)$$
Define $\Sigma_\Delta$ as the covariance matrix of the difference image $G(x)-G(x+\Delta)$, i.e.
$$\Sigma_\Delta = \langle(G(x)-G(x+\Delta))(G(x)-G(x+\Delta))^\top\rangle = \langle G(x)G(x)^\top\rangle + \langle G(x+\Delta)G(x+\Delta)^\top\rangle - \langle G(x)G(x+\Delta)^\top\rangle - \langle G(x+\Delta)G(x)^\top\rangle = 2\Sigma - \Gamma(\Delta) - \Gamma(-\Delta).$$
Hence $\Gamma(\Delta)+\Gamma(-\Delta) = 2\Sigma - \Sigma_\Delta$ and we can write (3.16) in the form
$$\mathrm{cov}\bigl(a^\top G(x),\,a^\top G(x+\Delta)\bigr) = a^\top\Sigma a - \tfrac12\,a^\top\Sigma_\Delta a.$$
The correlation of the projections is therefore given by
$$\mathrm{corr}\bigl(a^\top G(x),\,a^\top G(x+\Delta)\bigr) = \frac{a^\top\Sigma a - \tfrac12 a^\top\Sigma_\Delta a}{\sqrt{\mathrm{var}(a^\top G(x))\,\mathrm{var}(a^\top G(x+\Delta))}} = \frac{a^\top\Sigma a - \tfrac12 a^\top\Sigma_\Delta a}{\sqrt{(a^\top\Sigma a)(a^\top\Sigma a)}} = 1 - \frac12\,\frac{a^\top\Sigma_\Delta a}{a^\top\Sigma a}. \qquad (3.17)$$
We want to determine the vector $a$ which extremalizes this correlation, so we wish to extremalize the function
$$R(a) = \frac{a^\top\Sigma_\Delta a}{a^\top\Sigma a}.$$
Differentiating,
$$\frac{\partial R}{\partial a} = \frac{2\Sigma_\Delta a}{a^\top\Sigma a} - \frac{(a^\top\Sigma_\Delta a)\,2\Sigma a}{(a^\top\Sigma a)^2} = 0,$$
or
$$(a^\top\Sigma a)\,\Sigma_\Delta a = (a^\top\Sigma_\Delta a)\,\Sigma a.$$
This condition is met when $a$ solves the generalized eigenvalue problem
$$\Sigma_\Delta a = \lambda\Sigma a, \qquad (3.18)$$
which has the same form as (3.12). Again both $\Sigma_\Delta$ and $\Sigma$ are symmetric and the latter is also positive definite, and we obtain the standard eigenproblem
$$[L^{-1}\Sigma_\Delta(L^{-1})^\top]\,b = \lambda b$$
for the real, symmetric matrix $L^{-1}\Sigma_\Delta(L^{-1})^\top$. Let the eigenvalues be $\lambda_1 \ge \ldots \ge \lambda_N$ and the corresponding (orthogonal) eigenvectors be $b_i$. We have
$$0 = b_i^\top b_j = a_i^\top LL^\top a_j = a_i^\top\Sigma a_j, \qquad i \ne j,$$
and therefore
$$\mathrm{cov}\bigl(a_i^\top G(x),\,a_j^\top G(x)\bigr) = a_i^\top\Sigma a_j = 0, \qquad i \ne j,$$
so that the MAF components are mutually uncorrelated. Moreover, with equation (3.17) we have
$$\mathrm{corr}\bigl(a_i^\top G(x),\,a_i^\top G(x+\Delta)\bigr) = 1 - \frac{\lambda_i}{2},$$
and the first MAF component has minimum autocorrelation. An ENVI plug-in for performing the MAF transformation is given in Appendix D.5.2.
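For readers who wish to experiment outside the plug-in, the generalized eigenproblem (3.18) can also be attacked directly in IDL. The following fragment is only a sketch: it recasts (3.18) as the (in general non-symmetric) standard problem $\Sigma^{-1}\Sigma_\Delta a = \lambda a$, which is mathematically equivalent to, though numerically less careful than, the Cholesky route described above. The matrices sigma and sigma_D are assumed to have been estimated already.

; Sketch: Sigma_Delta a = lambda Sigma a via the standard problem
; (Sigma^-1 Sigma_Delta) a = lambda a
C = invert(sigma, /double) ## sigma_D
lambda = la_eigenproblem(C, eigenvectors=evecs, /double)
; eigenvalues are returned complex; imaginary parts should be negligible
idx = sort(real_part(lambda))   ; smallest lambda = maximum autocorrelation
print, real_part(lambda[idx])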
Exercises

1. Show that, for $x(t) = \sin(2\pi t)$ in Eq. (2.2),
$$\hat x(-1) = -\frac{1}{2i}, \qquad \hat x(1) = \frac{1}{2i},$$
and $\hat x(k) = 0$ otherwise.

2. Calculate the discrete Fourier transform of the sequence $2, 4, 6, 8$ from (3.4). You have to solve four simultaneous equations, the first of which is
$$2 = \frac14\bigl(\hat g(0) + \hat g(1) + \hat g(2) + \hat g(3)\bigr).$$
Verify your result in IDL with the command

print, FFT([2,4,6,8])

(note that IDL includes the factor $1/c$ in the forward transform).
Chapter 4

Radiometric enhancement

4.1 Lookup tables

Figure 4.1: Contrast enhancement with a lookup table represented as the continuous function f(x) [JRR99].

Intensity enhancement of an image is easily accomplished by means of lookup tables. For byte-encoded data, the pixel intensities $g$ are used to index an array $\mathrm{LUT}[k]$, $k = 0 \ldots 255$, whose entries also lie between 0 and 255. The entries can be chosen to implement linear stretching, saturation, histogram equalization, etc., according to
$$\hat g_k(i,j) = \mathrm{LUT}[g_k(i,j)], \qquad 0 \le i \le r-1, \quad 0 \le j \le c-1.$$
It is also useful to think of the lookup table as an approximately continuous function $y = f(x)$. If $h_{in}(x)$ is the histogram of the original image and $h_{out}(y)$ is the histogram of the image after transformation through the lookup table then, since the number of pixels is constant,
$$h_{out}(y)\,dy = h_{in}(x)\,dx,$$
see Fig. 4.1.

4.1.1 Histogram equalization

For histogram equalization we want $h_{out}(y)$ to be constant, independent of $y$. Hence $dy \sim h_{in}(x)\,dx$ and
$$y = f(x) \sim \int_0^x h_{in}(t)\,dt.$$
The lookup table $y$ for histogram equalization is thus proportional to the cumulative sum of the original histogram.

4.1.2 Histogram matching

Figure 4.2: Steps required for histogram matching [JRR99].

It is often desirable to match the histogram of one image to that of another so as to make their apparent brightnesses as similar as possible, for example when the two images
are combined in a mosaic. We can do this by first equalizing both the input histogram $h_{in}(x)$ and the reference histogram $h_{ref}(y)$ with the cumulative lookup tables $z = f(x)$ and $z = g(y)$, respectively. The required lookup table is then
$$y = g^{-1}(z) = g^{-1}(f(x)).$$
The necessary steps for implementing this function are illustrated in Fig. 4.2, taken from [JRR99].

4.2 Convolutions

With the convention $\omega = 2\pi k/c$ we can write (3.5) in the form
$$\hat g(\omega) = \sum_{j=0}^{c-1} g(j)\,e^{-i\omega j}. \qquad (4.1)$$
The convolution of $g$ with a filter $h = (h(0), h(1), \ldots)$ is defined by
$$f(j) = \sum_k h(k)\,g(j-k) =: h * g, \qquad (4.2)$$
where the sum is over all nonzero elements of the filter $h$. If the number of nonzero elements is finite, we speak of a finite impulse response (FIR) filter.

Theorem 1 (Convolution theorem) In the frequency domain, convolution is replaced by multiplication: $\hat f(\omega) = \hat h(\omega)\hat g(\omega)$.

Proof:
$$\hat f(\omega) = \sum_j f(j)e^{-i\omega j} = \sum_{j,k} h(k)g(j-k)e^{-i\omega j},$$
$$\hat h(\omega)\hat g(\omega) = \sum_k h(k)e^{-i\omega k}\sum_\ell g(\ell)e^{-i\omega\ell} = \sum_{k,\ell} h(k)g(\ell)e^{-i\omega(k+\ell)} = \sum_{k,j} h(k)g(j-k)e^{-i\omega j} = \hat f(\omega).$$

This generalizes to two-dimensional images, so that there are three basic steps involved in frequency domain image filtering:

1. The image and the convolution filter are transformed from the spatial domain to the frequency domain using the FFT.
2. The transformed image is multiplied with the frequency filter.
3. The filtered image is transformed back to the spatial domain.
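A minimal IDL sketch of these three steps follows, using a simple two-point averaging filter zero-padded to the image dimensions. The test image and variable names are assumptions; the final factor compensates for IDL's placement of the normalization in the forward transform.

; Sketch of FFT-based convolution (the three steps above)
image = float(dist(256))
h = fltarr(256,256)
h[0,0] = 0.5 & h[1,0] = 0.5      ; low-pass: average of adjacent pixels
; step 1: transform image and filter; step 2: multiply; step 3: invert
filtered = real_part(fft(fft(image)*fft(h), /inverse))*n_elements(image)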
We often distinguish between low-pass and high-pass filters. Low-pass filters perform some sort of averaging; the simplest example is $h = (1/2, 1/2, 0 \ldots)$, which computes the average of two consecutive pixels. A high-pass filter computes differences of nearby pixels, e.g. $h = (1/2, -1/2, 0 \ldots)$. Figure 4.3 shows the Fourier transforms of these two simple filters, generated with the IDL program

; Hi-Lo pass filters
x = fltarr(64)
x[0] = 0.5
x[1] = -0.5
p1 = abs(FFT(x))
x[1] = 0.5
p2 = abs(FFT(x))
envi_plot_data, lindgen(64), [[p1],[p2]]
end

Figure 4.3: Low-pass (red) and high-pass (white) filters in the frequency domain. The quantity $|\hat h(k)|^2$ is plotted as a function of $k$. The highest frequency is at the center of the plots, $k = c/2 = 32$.

4.2.1 Laplacian of Gaussian filter

We shall illustrate image filtering with the so-called Laplacian of Gaussian (LoG) filter, which will be used in Chapter 6 to implement contour matching for the automatic determination of ground control points. To begin with, consider the gradient operator for a two-dimensional image:
$$\nabla = \frac{\partial}{\partial x} = \boldsymbol{i}\,\frac{\partial}{\partial x_1} + \boldsymbol{j}\,\frac{\partial}{\partial x_2},$$
where $\boldsymbol{i}$ and $\boldsymbol{j}$ are unit vectors in the vertical and horizontal directions, respectively. $\nabla g(x)$ is a vector in the direction of the maximum rate of change of gray scale intensity. Since the intensity values are discrete, the partial derivatives must be approximated. For example, we can use the Sobel operators
$$\frac{\partial g(x)}{\partial x_1} \approx [g(i-1,j-1) + 2g(i,j-1) + g(i+1,j-1)] - [g(i-1,j+1) + 2g(i,j+1) + g(i+1,j+1)] =: \nabla_1(i,j)$$
$$\frac{\partial g(x)}{\partial x_2} \approx [g(i-1,j-1) + 2g(i-1,j) + g(i-1,j+1)] - [g(i+1,j-1) + 2g(i+1,j) + g(i+1,j+1)] =: \nabla_2(i,j),$$
which are equivalent to the two-dimensional FIR filters
$$h_1 = \begin{pmatrix}-1 & 0 & 1\\ -2 & 0 & 2\\ -1 & 0 & 1\end{pmatrix} \quad\text{and}\quad h_2 = \begin{pmatrix}1 & 2 & 1\\ 0 & 0 & 0\\ -1 & -2 & -1\end{pmatrix},$$
respectively. The magnitude of the gradient is
$$|\nabla| = \sqrt{\nabla_1^2 + \nabla_2^2}.$$
Edge detection can be achieved by calculating the filtered image $f(i,j) = |\nabla|(i,j)$ and setting an appropriate threshold.

Figure 4.4: Laplacian of Gaussian filter.
Now consider the second derivatives of the image intensities, which can be represented formally by the Laplacian
$$\nabla^2 = \nabla\cdot\nabla = \frac{\partial^2}{\partial x_1^2} + \frac{\partial^2}{\partial x_2^2}.$$
$\nabla^2 g(x)$ is a scalar quantity which is zero whenever the gradient is maximum. Therefore changes in intensity from dark to light, or vice versa, correspond to sign changes in the Laplacian, and these can also be used for edge detection. The Laplacian can likewise be approximated by an FIR filter, but such filters tend to be very sensitive to image noise. Usually a low-pass Gaussian filter is first used to smooth the image before the Laplacian filter is applied. It is more efficient, however, to calculate the Laplacian of the Gaussian function itself and then use the resulting function to derive a high-pass filter. The Gaussian in two dimensions is given by
$$\frac{1}{2\pi\sigma^2}\exp\Bigl(-\frac{1}{2\sigma^2}(x_1^2 + x_2^2)\Bigr),$$
where the parameter $\sigma$ determines its extent. Its Laplacian is
$$\frac{1}{2\pi\sigma^6}\,(x_1^2 + x_2^2 - 2\sigma^2)\exp\Bigl(-\frac{1}{2\sigma^2}(x_1^2 + x_2^2)\Bigr),$$
a plot of which is shown in Fig. 4.4. The following program illustrates the application of the filter to a gray scale image, see Fig. 4.5:

pro LoG
   sigma = 2.0
   filter = fltarr(17,17)
   for i=0L,16 do for j=0L,16 do $
      filter[i,j] = (1/(2*!pi*sigma^6))*((i-8)^2+(j-8)^2-2*sigma^2) $
                    *exp(-((i-8)^2+(j-8)^2)/(2*sigma^2))
   ; output as EPS file
   thisDevice = !D.Name
   set_plot, 'PS'
   Device, Filename='c:\temp\LoG.eps', xsize=4, ysize=4, /inches, /Encapsulated
   shade_surf, filter
   device, /close_file
   set_plot, thisDevice
   ; read a JPEG image
   filename = Dialog_Pickfile(Filter='*.jpg', /Read)
   OK = Query_JPEG(filename, fileinfo)
   if not OK then return
   xsize = fileinfo.dimensions[0]
   ysize = fileinfo.dimensions[1]
   window, 11, xsize=xsize, ysize=ysize
   Read_JPEG, filename, image1
   image = bytarr(xsize, ysize)
   image[*,*] = image1[0,*,*]
   tvscl, image
   ; run the filter
   filt = image*0.0
   filt[0:16,0:16] = filter[*,*]
   image1 = float(fft(fft(image)*fft(filt), 1))
   ; get zero-crossings and display
   image2 = bytarr(xsize, ysize)
   indices = where( (image1*shift(image1,1,0) lt 0) $
                 or (image1*shift(image1,0,1) lt 0) )
   image2[indices] = 255
   wset, 11
   tv, image2
end

Figure 4.5: Image filtered with the Laplacian of Gaussian filter.
Chapter 5

Topographic modelling

Satellite images are two-dimensional representations of the three-dimensional earth surface. The correct treatment of the third dimension, the elevation, is essential for terrain modelling and accurate georeferencing.

5.1 RST transformation

Transformations of spatial coordinates¹ in three dimensions which involve only rotations, scaling and translations can be represented by a $4\times4$ transformation matrix $A$,
$$v^* = Av, \qquad (5.1)$$
where $v$ is the column vector containing the original coordinates, $v = (X, Y, Z, 1)^\top$, and $v^*$ contains the transformed coordinates, $v^* = (X^*, Y^*, Z^*, 1)^\top$. For example the translation
$$X^* = X + X_0, \qquad Y^* = Y + Y_0, \qquad Z^* = Z + Z_0$$
corresponds to the transformation matrix
$$T = \begin{pmatrix}1 & 0 & 0 & X_0\\ 0 & 1 & 0 & Y_0\\ 0 & 0 & 1 & Z_0\\ 0 & 0 & 0 & 1\end{pmatrix},$$
a uniform scaling by 50% to
$$S = \begin{pmatrix}1/2 & 0 & 0 & 0\\ 0 & 1/2 & 0 & 0\\ 0 & 0 & 1/2 & 0\\ 0 & 0 & 0 & 1\end{pmatrix},$$

¹The following treatment closely follows Chapter 2 of Gonzalez and Woods [GW02].
and a simple rotation $\theta$ about the $Z$-axis to
$$R_\theta = \begin{pmatrix}\cos\theta & \sin\theta & 0 & 0\\ -\sin\theta & \cos\theta & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1\end{pmatrix},$$
etc. The complete RST transformation is then
$$v^* = RSTv = Av. \qquad (5.2)$$
The inverse transformation is of course represented by $A^{-1}$.

5.2 Imaging transformations

An imaging (or perspective) transformation projects 3D points onto a plane. It is used to describe the formation of a camera image and, unlike the RST transformation, is non-linear, since it involves division by coordinate values.

Figure 5.1: Basic imaging process, from [GW02].

In Figure 5.1 the camera coordinate system $(x, y, z)$ is aligned with the world coordinate system describing the terrain to be imaged. The camera focal length is $\lambda$. From simple geometry we obtain expressions for the image plane coordinates in terms of the world coordinates:
$$x = \frac{\lambda X}{\lambda - Z}, \qquad y = \frac{\lambda Y}{\lambda - Z}. \qquad (5.3)$$
Solving for the $X$ and $Y$ world coordinates:
$$X = \frac{x}{\lambda}(\lambda - Z), \qquad Y = \frac{y}{\lambda}(\lambda - Z). \qquad (5.4)$$
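A tiny IDL sketch of Eqs. (5.3) and (5.4), with purely hypothetical numbers: a world point is projected into the image plane and its planimetric coordinates are recovered under the assumption that its elevation Z is known.

; Sketch of the perspective equations (5.3)/(5.4); all values hypothetical
lambda = 0.05d                        ; focal length
XYZ = [1000.0d, 2000.0d, 30.0d]       ; world point (X, Y, Z)
x = lambda*XYZ[0]/(lambda - XYZ[2])   ; image plane coordinates, Eq. (5.3)
y = lambda*XYZ[1]/(lambda - XYZ[2])
X = x*(lambda - XYZ[2])/lambda        ; recover world coordinates, Eq. (5.4)
Y = y*(lambda - XYZ[2])/lambda
print, X, Y                           ; prints 1000.0  2000.0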
Thus, in order to extract the geographical coordinates $(X, Y)$ of a point on the earth's surface from its image coordinates, we require knowledge of the elevation $Z$. Correcting for the elevation in this way constitutes the process of orthorectification.

5.3 Camera models and RFM approximations

Equation (5.3) is overly simplified, as it assumes that the origins of the world and image coordinate systems coincide. In order to apply it, one first has to transform the image coordinate system from the satellite to the world coordinate system. This is done in a straightforward way with the rotation and translation transformations introduced in Section 5.1. However, it requires accurate knowledge of the height and orientation of the satellite imaging system at the time of the image acquisition (or, more exactly, during the acquisition, since the latter is normally not instantaneous). The resulting non-linear equations relating image and world coordinates constitute the camera (or sensor) model for that particular image.

Direct use of the camera model for image processing is complicated, as it requires extremely exact, sometimes proprietary, information about the sensor system and its orbit. An alternative exists if the image provider also supplies a so-called rational function model (RFM) which approximates the camera model for each acquisition as a ratio of polynomials, see e.g. [TH01]. Such RFMs have the form
$$r' = f(X', Y', Z') = \frac{a(X', Y', Z')}{b(X', Y', Z')}, \qquad c' = g(X', Y', Z') = \frac{c(X', Y', Z')}{d(X', Y', Z')}, \qquad (5.5)$$
where $c'$ and $r'$ are the column and row coordinates in the image plane relative to an origin $(c_0, r_0)$ and scaled by factors $c_s$ and $r_s$, respectively:
$$c' = \frac{c - c_0}{c_s}, \qquad r' = \frac{r - r_0}{r_s}.$$
Similarly $X'$, $Y'$ and $Z'$ are relative, scaled world coordinates:
$$X' = \frac{X - X_0}{X_s}, \qquad Y' = \frac{Y - Y_0}{Y_s}, \qquad Z' = \frac{Z - Z_0}{Z_s}.$$
The polynomials $a$, $b$, $c$ and $d$ are typically of third order in the world coordinates, e.g.
$$a(X,Y,Z) = a_0 + a_1 X + a_2 Y + a_3 Z + a_4 XY + a_5 XZ + a_6 YZ + a_7 X^2 + a_8 Y^2 + a_9 Z^2 + a_{10}XYZ + a_{11}X^3 + a_{12}XY^2 + a_{13}XZ^2 + a_{14}X^2Y + a_{15}Y^3 + a_{16}YZ^2 + a_{17}X^2Z + a_{18}Y^2Z + a_{19}Z^3.$$
The advantage of using ratios of polynomials is that these are less subject to interpolation error. For a given acquisition the provider fits the RFM to its camera model over a three-dimensional grid of points covering the image and world spaces, using a least squares fitting procedure. The RFM is capable of representing the camera model extremely well and can be used as a replacement for it. Both Space Imaging and DigitalGlobe provide RFMs with their high resolution IKONOS and QuickBird imagery. Below is a sample QuickBird RFM file giving the origins, scaling factors and polynomial coefficients needed in Eq. (5.5).
satId = QB02;
bandId = P;
SpecId = RPC00B;
BEGIN_GROUP = IMAGE
errBias = 56.01;
errRand = 0.12;
lineOffset = 4683;
sampOffset = 4154;
latOffset = 32.5709;
longOffset = 51.8391;
heightOffset = 1582;
lineScale = 4733;
sampScale = 4399;
latScale = 0.0256;
longScale = 0.0269;
heightScale = 500;
lineNumCoef = (
   +1.162844E-03, -7.011681E-03, -9.993482E-01, -1.119999E-02,
   -6.682911E-06, +7.591306E-05, +3.632740E-04, -1.111298E-04,
   -5.842086E-04, +2.212466E-06, -1.275349E-06, +1.279061E-06,
   +1.918762E-08, -6.957548E-07, -1.240783E-06, -7.644403E-07,
   +3.479752E-07, +1.259300E-05, +1.085128E-06, -1.571375E-06);
lineDenCoef = (
   +1.000000E+00, +1.801541E-06, +5.822024E-04, +3.774278E-04,
   -2.141015E-08, -6.984359E-07, -1.344888E-06, -9.669251E-07,
   -4.726988E-08, +1.329814E-06, +2.113403E-08, -2.914653E-06,
   -4.367422E-07, +6.988065E-07, +4.035593E-07, +3.275453E-07,
   -2.740827E-07, -4.147675E-06, -1.074015E-06, +2.218804E-06);
sampNumCoef = (
   -9.783496E-04, +9.566915E-01, -8.477919E-03, -5.393803E-02,
   -1.590864E-04, +5.477412E-04, -3.968308E-04, +4.819512E-04,
   -3.965558E-06, -3.442885E-05, +5.821180E-08, +2.952683E-08,
   -1.363146E-07, +2.454422E-07, +1.372698E-07, +1.987710E-07,
   -3.167074E-07, -1.038018E-06, +1.376092E-07, -2.352636E-07);
sampDenCoef = (
   +1.000000E+00, +5.029785E-04, +1.225257E-04, -5.780883E-04,
   -1.543054E-07, +1.240426E-06, -1.830526E-07, +3.264812E-07,
   -1.255831E-08, -5.177631E-07, -5.868514E-07, -9.029287E-07,
   +7.692317E-08, +1.289335E-07, -3.649242E-07, +0.000000E+00,
   +1.229000E-07, -1.290467E-05, +4.318970E-08, -8.391348E-08);
END_GROUP = IMAGE
END;

To illustrate a simple use of the RFM data, consider a vertical structure in a high-resolution image, such as a chimney or building facade. Suppose we determine the image coordinates of the bottom and top of the structure to be $(r_b, c_b)$ and $(r_t, c_t)$, respectively. Then from (5.5),
$$r_b = f(X, Y, Z_b), \quad c_b = g(X, Y, Z_b), \quad r_t = f(X, Y, Z_t), \quad c_t = g(X, Y, Z_t), \qquad (5.6)$$
since the $(X, Y)$ coordinates must be the same. This would appear to constitute a set of four equations in the four unknowns $X$, $Y$, $Z_b$ and $Z_t$; however, the solution is unstable because of the close similarity of $Z_t$ to $Z_b$. Nevertheless the object height $Z_t - Z_b$ can be obtained by the following procedure:

1. Read $(r_b, c_b)$ and $(r_t, c_t)$ from the image.
2. Solve the first two equations in (5.6) (e.g. with Newton's method) for $X$ and $Y$, with $Z_b$ set equal to the average elevation in the scene if no DEM is available, otherwise to the true elevation.
3. For a spanning range of $Z_t$ values, calculate $(r_t, c_t)$ from the second two equations in (5.6) and choose for $Z_t$ the value which gives the closest agreement to the values read from the image.

Quite generally, the RFM approximates the camera model very well and can be used as an alternative means of providing end users with the information needed to perform their own photogrammetric processing. An ENVI plug-in for object height determination from RFM data is given in Appendix D.2.1.

5.4 Stereo imaging, elevation models and orthorectification

The missing elevation information $Z$ in (5.3) or in (5.5) can be obtained with stereoscopic imaging techniques. Figure 5.2 shows two cameras viewing the same world point $w$ from two positions. The separation of the lens centers is the baseline. The objective is to find the coordinates $(X, Y, Z)$ of $w$ given that its image points have coordinates $(x_1, y_1)$ and $(x_2, y_2)$. We assume that the cameras are identical and that their image coordinate systems are perfectly aligned, differing only in the location of their origins. The $Z$ coordinate of $w$ is the same for both coordinate systems. In Figure 5.3 the first camera is brought into coincidence with the world coordinate system. Then from (5.4),
$$X_1 = \frac{x_1}{\lambda}(\lambda - Z).$$
Alternatively, if the second camera is brought to the origin of the world coordinate system,
$$X_2 = \frac{x_2}{\lambda}(\lambda - Z).$$
Figure 5.2: The stereo imaging process, from [GW02].

Figure 5.3: Top view of Figure 5.2, from [GW02].

But, from the figures, $X_2 = X_1 + B$, where $B$ is the baseline. From the above three equations we obtain
$$Z = \lambda - \frac{\lambda B}{x_2 - x_1}. \qquad (5.7)$$
Thus if the displacement $x_2 - x_1$ of the image coordinates of the point $w$ can be determined, the $Z$ coordinate can be calculated. The task is then to find two corresponding points in different images of the same scene. This is usually accomplished by spatial correlation techniques and is closely related to the problem of image-to-image registration discussed in the next chapter.

Figure 5.4: ASTER stereo acquisition geometry.

Because the stereo images must be correlated, best results are obtained if they are acquired within a very short time of each other, preferably "along track" if a single platform is used, see Figure 5.4. This figure shows the orientation and imaging geometry of the VNIR 3N and 3B cameras on the ASTER platform for acquiring a stereo full scene. The satellite travels at
a speed of 6.7 km/s at a height of 705 km. A $60\times60\,\mathrm{km}^2$ full scene is scanned in 9 seconds. 55 seconds later the same scene is scanned by the back-looking camera, corresponding to a baseline of 370 km. The along-track geometry means that the stereo pair is unipolar, that is, the displacements due to viewing angle are only along the $y$ axis in the imaging plane. Therefore the spatial correlation algorithm used to match points can be one-dimensional. If matching is carried out on a pixel-by-pixel basis, one obtains a digital elevation model (DEM).

Figure 5.5: ASTER 3N nadir camera image.

Figure 5.6: ASTER 3B back-looking camera image.

As an example, Figures 5.5 and 5.6 show an ASTER stereo pair. Both images have been rotated so as to make them unipolar.
The following IDL program calculates a very rudimentary DEM:

pro test_correl_images
   height = 705.0
   base = 370.0
   pixel_size = 15.0
   envi_select, title='Choose 1st image', fid=fid1, dims=dims1, pos=pos1, /band_only
   envi_select, title='Choose 2nd image', fid=fid2, dims=dims2, pos=pos2, /band_only
   im1 = envi_get_data(fid=fid1, dims=dims1, pos=pos1)
   im2 = envi_get_data(fid=fid2, dims=dims2, pos=pos2)
   n_cols = dims1[2]-dims1[1]+1
   n_rows = dims1[4]-dims1[3]+1
   parallax = fltarr(n_cols, n_rows)
   progressbar = Obj_New('progressbar', Color='blue', Text='0', $
                  title='Cross correlation, column ...', xsize=250, ysize=20)
   progressbar->start
   for i=7L,n_cols-8 do begin
      if progressbar->CheckCancel() then begin
         envi_enter_data, pixel_size*parallax*(height/base)
         progressbar->Destroy
         return
      endif
      progressbar->Update, (i*100)/n_cols, text=strtrim(i,2)
      for j=25L,n_rows-26 do begin
         cim = correl_images(im1[i-5:i+5,j-5:j+5], im2[i-7:i+7,j-25:j+25], $
               xoffset_b=0, yoffset_b=-20, xshift=0, yshift=20)
         corrmat_analyze, cim, xoff, yoff, m, e, p
         parallax[i,j] = yoff > (-5.0)
      endfor
   endfor
   progressbar->destroy
   envi_enter_data, pixel_size*parallax*(height/base)
end
This program makes use of the routines correl_images and corrmat_analyze from the IDL Astronomy User's Library² to calculate the cross-correlation of the two images. For each pixel in the nadir image an $11\times11$ window is moved along a window in the back-looking image centered at the same position. The point of maximum correlation defines the parallax or displacement $p$. This is related to the relative elevation $e$ of the pixel according to
$$e = \frac{h}{b}\,p\times 15\,\mathrm{m},$$
where $h$ is the height of the sensor and $b$ is the baseline, see Figure 5.7. Figure 5.8 shows the result. Clearly there are many problems due to correlation errors; however, the relative elevations are approximately correct when compared to the DEM determined with the ENVI commercial add-on AsterDTM, see Figure 5.9.

²www.astro.washington.edu/deutsch/idl/htmlhelp/index.html
Figure 5.7: Relating parallax p to elevation e by similar triangles: $e/p = (h-e)/b \approx h/b$.

Figure 5.8: A rudimentary DEM.
Figure 5.9: DEM generated with the commercial product AsterDTM.

Either the complete camera model or an RFM can be used, but usually neither is sufficient for an absolute DEM relative to mean sea level. Most often, additional ground reference points within the image whose elevations are known are also required for absolute calibration. The orthorectification of the image is then carried out on the basis of a suitable DEM and consists of projecting the $(X, Y, Z)$ coordinates of each pixel onto the $(X, Y)$ coordinates of a given map projection.

5.5 Slope and aspect

Terrain analysis involves the processing of elevation data. Specifically, we consider here the generation of slope images, which give the steepness of the terrain at each pixel, and aspect images, which give the prevailing direction relative to north of a vector normal to the landscape at each pixel. A $3\times3$ pixel window can be used to determine both slope and aspect, see Figure 5.10. Define
$$\Delta x_1 = c - a, \quad \Delta y_1 = a - g, \quad \Delta x_2 = f - d, \quad \Delta y_2 = b - h, \quad \Delta x_3 = i - g, \quad \Delta y_3 = c - i$$
and
$$\Delta x = (\Delta x_1 + \Delta x_2 + \Delta x_3)/(3x_s), \qquad \Delta y = (\Delta y_1 + \Delta y_2 + \Delta y_3)/(3y_s),$$
where $x_s$ and $y_s$ give the pixel dimensions in meters. Then the slope in percent at the central pixel position is given by
$$s = \frac{\sqrt{(\Delta x)^2 + (\Delta y)^2}}{2}\times 100,$$
whereas the aspect in radians measured clockwise from north is
$$\theta = \tan^{-1}\frac{\Delta x}{\Delta y}.$$
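A hedged IDL sketch of these formulas using convolution kernels follows. The DEM array and pixel sizes are assumptions, and, depending on the row ordering of the DEM and on CONVOL's kernel orientation, the signs of the difference images (and hence the aspect) may need to be flipped.

; Sketch: slope (percent) and aspect (radians from north) from a DEM
xs = 30.0 & ys = 30.0                          ; assumed pixel sizes in meters
kx = [[-1,0,1],[-1,0,1],[-1,0,1]]/(3.0*xs)     ; averaged Delta-x differences
ky = [[1,1,1],[0,0,0],[-1,-1,-1]]/(3.0*ys)     ; averaged Delta-y differences
dx = convol(dem, kx, /edge_truncate)
dy = convol(dem, ky, /edge_truncate)
slope = sqrt(dx^2 + dy^2)/2.0*100.0
aspect = atan(dx, dy)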
a  b  c
d  e  f
g  h  i

Figure 5.10: Pixel elevations in an 8-neighborhood. The letters represent elevations.

Slope/aspect determinations from a DEM are available in the ENVI main menu under Topographic/Topographic Modelling.

5.6 Illumination correction

Figure 5.11: Angles involved in the computation of local solar illumination, taken from [RCSA03].

Topographic modelling can be used to correct images for the effects of local solar illumination, which depends not only upon the sun's position (elevation and azimuth) but also upon the local slope and aspect of the terrain being illuminated. Figure 5.11 shows the angles involved [RCSA03]: the solar zenith angle is $\theta_i$, the solar azimuth is $\phi_a$, $\theta_p$ is the slope and $\phi_0$ is the aspect. The quantity to be calculated is the local solar incidence angle $\gamma_i$, which determines
the local irradiance. From trigonometry we have
$$\cos\gamma_i = \cos\theta_p\cos\theta_i + \sin\theta_p\sin\theta_i\cos(\phi_a - \phi_0). \qquad (5.8)$$
An example of a $\cos\gamma_i$ image in hilly terrain is shown in Figure 5.12.

Figure 5.12: Cosine of the local solar illumination angle stretched across a DEM.

Let $\rho_T$ represent the reflectance of the inclined surface in Figure 5.11. Then, for a Lambertian surface, i.e. a surface which scatters reflected radiation uniformly in all directions, the reflectance of the corresponding horizontal surface $\rho_H$ would be
$$\rho_H = \rho_T\,\frac{\cos\theta_i}{\cos\gamma_i}. \qquad (5.9)$$
The Lambertian assumption is in general not correct, the actual reflectance being described by a complicated bidirectional reflectance distribution function (BRDF). An empirical approach which gives a better approximation to the BRDF is the C-correction [TGG82]. Let $m$ and $b$ be the slope and intercept of a regression line of reflectance vs. $\cos\gamma_i$ for a particular image band. Then instead of (5.9) one uses
$$\rho_H = \rho_T\,\frac{\cos\theta_i + b/m}{\cos\gamma_i + b/m}. \qquad (5.10)$$
An ENVI plug-in for illumination correction with the C-correction approximation is given in Appendix D.2.2.
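The essence of the C-correction is a band-wise linear regression followed by Eq. (5.10). A minimal sketch, in which all variable names are assumptions: refl is one uncorrected reflectance band, cos_gamma the cos γi image computed from Eq. (5.8), and theta_i the solar zenith angle in radians.

; Sketch of the C-correction, Eq. (5.10)
coef = linfit(cos_gamma[*], refl[*], /double)   ; refl = b + m*cos_gamma
b = coef[0] & m = coef[1]
c = b/m
refl_corr = refl*(cos(theta_i) + c)/(cos_gamma + c)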
Chapter 6

Image Registration

Image registration, either to another image or to a map, is a fundamental task in image processing. It is required for georeferencing, stereo imaging, accurate change detection and any other kind of multitemporal image analysis. Image-to-image registration methods can be divided roughly into four classes [RC96]:

1. algorithms that use pixel values directly, i.e. correlation methods;
2. frequency- or wavelet-domain methods that use e.g. the fast Fourier transform (FFT);
3. feature-based methods that use low-level features such as edges and corners;
4. algorithms that use high-level features and the relations between them, e.g. object-oriented methods.

We consider examples of frequency-domain and feature-based methods here.

6.1 Frequency domain registration

Consider two $N\times N$ gray scale images $g_1(i', j')$ and $g_2(i, j)$, where $g_2$ is offset relative to $g_1$ by an integer number of pixels:
$$g_2(i,j) = g_1(i', j') = g_1(i - i_0,\ j - j_0), \qquad i_0, j_0 \ll N.$$
Taking the Fourier transform, we have
$$\hat g_2(k,l) = \sum_{ij} g_1(i - i_0, j - j_0)\,e^{-i2\pi(ik+jl)/N},$$
or, with a change of indices to $i', j'$,
$$\hat g_2(k,l) = \sum_{i'j'} g_1(i',j')\,e^{-i2\pi(i'k+j'l)/N}\,e^{-i2\pi(i_0 k + j_0 l)/N} = \hat g_1(k,l)\,e^{-i2\pi(i_0 k + j_0 l)/N}.$$
(This is referred to as the Fourier translation property.) Therefore we can write
$$\frac{\hat g_2(k,l)\,\hat g_1^*(k,l)}{|\hat g_2(k,l)\,\hat g_1^*(k,l)|} = e^{-i2\pi(i_0 k + j_0 l)/N}, \qquad (6.1)$$
Figure 6.1: Phase correlation of two identical images shifted by 10 pixels.

where $\hat g_1^*$ is the complex conjugate of $\hat g_1$. The inverse transform of the right-hand side exhibits a Dirac delta function (a spike) at the coordinates $(i_0, j_0)$. Thus, if two otherwise identical images are offset by an integer number of pixels, the offset can be found by taking their Fourier transforms, computing the ratio on the left-hand side of (6.1) (the so-called cross-power spectrum), and then taking the inverse transform of the result. The position of the maximum value in the inverse transform gives the values of $i_0$ and $j_0$. The following IDL program illustrates the procedure, see Fig. 6.1:

; Image matching by phase correlation
; read a bitmap image and cut out two 512x512 pixel arrays
filename = Dialog_Pickfile(Filter='*.jpg', /Read)
if filename eq '' then print, 'cancelled' else begin
   Read_JPEG, filename, image
   g1 = image[0,10:521,10:521]
   g2 = image[0,0:511,0:511]
   ; perform Fourier transforms
   f1 = fft(g1, /double)
   f2 = fft(g2, /double)
   ; determine the offset
   g = fft( f2*conj(f1)/abs(f1*conj(f1)), /inverse, /double )
   pos = where(g eq max(g))
   print, 'Offset = ' + strtrim(pos mod 512) + strtrim(pos/512)
   ; output as EPS file
   thisDevice = !D.Name
   set_plot, 'PS'
   Device, Filename='c:\temp\phasecorr.eps', xsize=4, ysize=4, /inches, /Encapsulated
   shade_surf, reform(g[0,0:50,0:50])
   device, /close_file
   set_plot, thisDevice
endelse
end

Images which differ not only by an offset but also by a rigid rotation and a change of scale can in principle be registered similarly, see [RC96].

6.2 Feature matching

A tedious task associated with image-to-image registration using low-level image features is the setting of ground control points (GCPs), since in general it is necessary to resort to manual entry. However, various techniques for the automatic determination of GCPs have been suggested in the literature. We will discuss one such method, namely contour matching [LMM95]. This technique has been found to function reliably in bitemporal scenes in which vegetation changes do not dominate. It can of course be augmented (or replaced) by other automatic methods or by manual determination. The procedures involved in image-to-image registration using contour matching are shown in Fig. 6.2 [LMM95].

Figure 6.2: Image-to-image registration with contour matching (LoG zero crossing and edge strength, contour finder, chain code encoder, closed contour matching, consistency check, warping).
6.2.1 Contour detection

The first step involves the application of a Laplacian of Gaussian filter to both images. After determining the contours by examining zero-crossings of the LoG-filtered image, the contour strengths are encoded in the pixel intensities. Strengths are taken to be proportional to the magnitude of the gradient at the zero-crossing.

6.2.2 Closed contours

In the next step, all closed contours with strengths above some given threshold are determined by tracing the contours. Pixels which have been visited during tracing are set to zero so that they will not be visited again.

6.2.3 Chain codes

For subsequent matching purposes, all significant closed contours found in the preceding step are chain-encoded. Any digital curve can be represented by an integer sequence $\{a_1, a_2 \ldots a_i \ldots\}$, $a_i \in \{0, 1, 2, 3, 4, 5, 6, 7\}$, depending on the relative position of the current pixel with respect to the previous pixel in the curve. This simple code has the drawback that some contours produce wrap-around. For example, the line in the direction $-22.5^\circ$ has the chain code $\{707070\ldots\}$. Li et al. [LMM95] suggest the smoothing operation
$$\{a_1 a_2 \ldots a_n\} \to \{b_1 b_2 \ldots b_n\},$$
where $b_1 = a_1$ and $b_i = q_i$, with $q_i$ an integer satisfying $(q_i - a_i) \bmod 8 = 0$ and $|q_i - b_{i-1}| \to \min$, $i = 2, 3 \ldots n$. They also suggest applying the Gaussian smoothing filter $\{0.1, 0.2, 0.4, 0.2, 0.1\}$ to the result. Two chain codes can be compared by "sliding" one over the other and determining the maximum correlation between them.

6.2.4 Invariant moments

The closed contours are first matched according to their invariant moments, which are defined as follows, see [Hab95, GW02]. Let $C$ denote the set of pixels defining a contour, with $|C| = n$; that is, $n$ is the number of pixels on the contour. The moment of order $p,q$ of the contour is defined as
$$m_{pq} = \sum_{i,j\in C} j^p\, i^q. \qquad (6.2)$$
Note that $n = m_{00}$. The center of gravity $(x_c, y_c)$ of the contour is thus
$$x_c = \frac{m_{10}}{m_{00}}, \qquad y_c = \frac{m_{01}}{m_{00}}.$$
The centralized moments are then given by
$$\mu_{pq} = \sum_{i,j\in C}(j - x_c)^p\,(i - y_c)^q, \qquad (6.3)$$
and the normalized centralized moments by
$$\eta_{pq} = \frac{\mu_{pq}}{\mu_{00}^{(p+q)/2+1}}. \qquad (6.4)$$
For example,
$$\eta_{20} = \frac{\mu_{20}}{\mu_{00}^2} = \frac{1}{n^2}\sum_{i,j\in C}(j - x_c)^2.$$
The normalized centralized moments are, apart from effects of digital quantization, invariant under scale changes and translations of the contours. Finally, we can define moments which are also invariant under rotations, see [Hu62]. The first two such invariant moments are
$$h_1 = \eta_{20} + \eta_{02}, \qquad h_2 = (\eta_{20} - \eta_{02})^2 + 4\eta_{11}^2. \qquad (6.5)$$
For example, consider a general rotation of the coordinate axes with origin at the center of gravity of a contour:
$$\begin{pmatrix} j'\\ i'\end{pmatrix} = \begin{pmatrix}\cos\theta & \sin\theta\\ -\sin\theta & \cos\theta\end{pmatrix}\begin{pmatrix} j\\ i\end{pmatrix} = A\begin{pmatrix} j\\ i\end{pmatrix}.$$
The first invariant moment in the rotated coordinate system is
$$h_1' = \frac{1}{n^2}\sum_{i',j'\in C}(j'^2 + i'^2) = \frac{1}{n^2}\sum_{i',j'\in C}(j',\,i')\begin{pmatrix} j'\\ i'\end{pmatrix} = \frac{1}{n^2}\sum_{i,j\in C}(j,\,i)\,A^\top A\begin{pmatrix} j\\ i\end{pmatrix} = \frac{1}{n^2}\sum_{i,j\in C}(j^2 + i^2),$$
since $A^\top A = I$.

6.2.5 Contour matching

Each significant contour in one image is first matched with contours in the second image according to their invariant moments $h_1, h_2$. This is done by setting a threshold on the allowed differences, for instance one standard deviation. If one or more matches is found, the best candidate for a GCP pair is chosen to be the matched contour in the second image for which the chain code correlation with the contour in the first image is maximum. If the maximum correlation is less than some threshold, e.g. 0.9, then no match is declared. The actual GCP coordinates are taken to be the centers of gravity of the matched contours.

6.2.6 Consistency check

The contour matching procedure invariably generates some false GCP pairs, so a further processing step is required. In [LMM95] use is made of the fact that distances are preserved under a rigid transformation. Let $\overline{A_1A_2}$ represent the distance between two points $A_1$ and $A_2$ in
an image. For two sets of $m$ matched contour centers $\{A_i\}$ and $\{B_i\}$ in images 1 and 2, the ratios $\overline{A_iA_j}/\overline{B_iB_j}$, $i = 1 \ldots m$, $j = i+1 \ldots m$, are calculated. These should form a cluster, so that pairs scattered away from the cluster center can be rejected as false matches. An ENVI plug-in for GCP determination via contour matching is given in Appendix D.3.

6.3 Re-sampling and warping

We represent by $(x, y)$ the coordinates of a point in image 1 and the corresponding point in image 2 by $(u, v)$. A second-order polynomial map of image 2 to image 1, for example, is given by
$$u = a_0 + a_1 x + a_2 y + a_3 xy + a_4 x^2 + a_5 y^2$$
$$v = b_0 + b_1 x + b_2 y + b_3 xy + b_4 x^2 + b_5 y^2.$$
Since there are 12 unknown coefficients, we require at least 6 GCP pairs to determine the map (each pair generates two equations). If more than 6 pairs are available, the coefficients can be found by least squares fitting, which has the added advantage that an RMS error for the mapping can be estimated; a short IDL sketch follows at the end of this section. Similar considerations apply for lower- or higher-order polynomial maps.

Having determined the map coefficients, image 2 can be registered to image 1 by re-sampling. Nearest-neighbor re-sampling simply chooses the actual pixel in image 2 whose center is nearest the calculated coordinates $(u, v)$ and transfers it to location $(x, y)$. This is the preferred technique for classification or change detection, since the registered image consists of the original pixel brightnesses, simply rearranged in position to give a correct image geometry. Other commonly used re-sampling methods are bilinear interpolation and cubic convolution interpolation, see [JRR99] for details. These methods mix the spectral intensities of neighboring pixels.
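The least squares fit mentioned above can be sketched in a few lines of IDL. Here gcp1 and gcp2 are assumed to be 2 x m arrays of matched (x, y) and (u, v) coordinates, and the normal equations are used for transparency rather than numerical robustness.

; Sketch: least squares fit of the second-order polynomial map
x = double(gcp1[0,*]) & y = double(gcp1[1,*])
m = n_elements(x)
D = dblarr(6,m)                  ; design matrix, one row per GCP
D[0,*] = 1.0 & D[1,*] = x & D[2,*] = y
D[3,*] = x*y & D[4,*] = x^2 & D[5,*] = y^2
DtD = transpose(D) ## D
a = invert(DtD, /double) ## transpose(D) ## double(gcp2[0,*])   ; a_0..a_5
b = invert(DtD, /double) ## transpose(D) ## double(gcp2[1,*])   ; b_0..b_5
; RMS error of the fit in the u coordinate
u_fit = D ## a
print, sqrt(mean((u_fit - double(gcp2[0,*]))^2))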
Exercises

1. We can approximate the centralized moments (6.3) of a contour by the integral
$$\mu_{pq} = \int\!\!\int (x - x_c)^p\,(y - y_c)^q\,f(x,y)\,dx\,dy,$$
where the integration is over the whole image and where $f(x,y) = 1$ if the point $(x,y)$ lies on the contour and $f(x,y) = 0$ otherwise. Use this approximation to prove that the normalized centralized moments $\eta_{pq}$ given in (6.4) are invariant under scaling transformations of the form
$$\begin{pmatrix} x'\\ y'\end{pmatrix} = \begin{pmatrix}\alpha & 0\\ 0 & \alpha\end{pmatrix}\begin{pmatrix} x\\ y\end{pmatrix}.$$
Chapter 7

Image Sharpening

The change detection and classification algorithms that we will meet in the following chapters exploit not only the spatial but also the spectral information of satellite imagery. Many common platforms (Landsat 7 ETM+, IKONOS, SPOT, QuickBird) offer panchromatic images with higher ground resolution than that of the spectral channels, so that the application of multispectral change detection or classification methods is restricted to the lower resolution. Conventional image fusion techniques, such as the well-known HSV transformation, can be used to sharpen the spectral components; however, the effect of mixing in the panchromatic image is often to "dilute" the spectral resolution. Another disadvantage of the HSV transformation is that one is restricted to using three of the available spectral channels. In the following we outline the HSV method and then consider alternative fusion techniques.

7.1 HSV fusion

In computers with 24-bit graphics (true color), any three channels of a multispectral image can be displayed with 8 bits for each of the additive primary colors red, green and blue. The monitor displays this as an RGB color composite image which, depending on the choice of image channels and their relative intensities, may or may not appear natural. There are $2^{24} \approx 16$ million possible colors. Another means of color definition is in terms of hue, saturation and value (HSV). Value (or intensity) can be thought of as an axis equidistant from the three orthogonal primary color axes. Hue refers to the actual color and is defined as an angle on a circle perpendicular to the value axis. Saturation is the "amount" of color present and is represented by the radius of the circle described by the hue.

A commonly used method for the fusion of two images (for example a lower-resolution multispectral image with a higher-resolution panchromatic image) is to transform the first image from RGB to HSV space, replace the V component with the gray scale values of the second image after performing a radiometric normalization, and then transform back to RGB space. The forward transformation begins by rotating the RGB coordinate axes into the diagonal
axis of the RGB color cube. The coordinates in the new reference system are given by
$$\begin{pmatrix} m_1\\ m_2\\ i_1\end{pmatrix} = \begin{pmatrix}2/\sqrt6 & -1/\sqrt6 & -1/\sqrt6\\ 0 & 1/\sqrt2 & -1/\sqrt2\\ 1/\sqrt3 & 1/\sqrt3 & 1/\sqrt3\end{pmatrix}\begin{pmatrix} R\\ G\\ B\end{pmatrix}.$$
Then the rectangular coordinates $(m_1, m_2, i_1)$ are transformed into the cylindrical HSV coordinates:
$$H = \arctan(m_1/m_2), \qquad S = \sqrt{m_1^2 + m_2^2}, \qquad I = \sqrt3\,i_1.$$
The following IDL code illustrates the necessary steps for HSV fusion making use of ENVI batch procedures. These can also be invoked directly from the ENVI main menu.

pro HSVFusion, event
   ; get MS image
   envi_select, title='Select low resolution three-band input file', $
      fid=fid1, dims=dims1, pos=pos1
   if (fid1 eq -1) or (n_elements(pos1) ne 3) then return
   ; get PAN image
   envi_select, title='Select panchromatic image', $
      fid=fid2, pos=pos2, dims=dims2, /band_only
   if (fid2 eq -1) then return
   envi_check_save, /transform
   ; linear stretch the images and convert to byte format
   envi_doit, 'stretch_doit', fid=fid1, dims=dims1, pos=pos1, method=1, $
      r_fid=r_fid1, out_min=0, out_max=255, $
      range_by=0, i_min=0, i_max=100, out_dt=1, out_name='c:\temp\hsv_temp'
   envi_doit, 'stretch_doit', fid=fid2, dims=dims2, pos=pos2, method=1, $
      r_fid=r_fid2, out_min=0, out_max=255, $
      range_by=0, i_min=0, i_max=100, out_dt=1, /in_memory
   envi_file_query, r_fid2, ns=f_ns, nl=f_nl
   f_dims = [-1L, 0, f_ns-1, 0, f_nl-1]
   ; HSV sharpening
   envi_doit, 'sharpen_doit', $
      fid=[r_fid1,r_fid1,r_fid1], pos=[0,1,2], f_fid=r_fid2, $
      f_dims=f_dims, f_pos=[0], method=0, interp=0, /in_memory
   ; remove temporary files from ENVI
   envi_file_mng, id=r_fid1, /remove, /delete
   envi_file_mng, id=r_fid2, /remove
end
7.2 Brovey fusion

In its simplest form this method multiplies each re-sampled multispectral pixel by the ratio of the corresponding panchromatic pixel intensity to the sum of all the multispectral intensities. The corrected pixel intensities $\bar g_k(i,j)$ in the $k$th fused multispectral channel are given by
$$\bar g_k(i,j) = g_k(i,j)\cdot\frac{g_p(i,j)}{\sum_{k'} g_{k'}(i,j)}, \qquad (7.1)$$
where $g_k(i,j)$ is the (re-sampled) pixel intensity in the $k$th channel and $g_p(i,j)$ is the corresponding pixel intensity in the panchromatic image. (The ENVI environment offers Brovey fusion in its main menu; a minimal IDL sketch is also given below.) This technique assumes that the spectral range spanned by the panchromatic image is essentially the same as that covered by the multispectral channels, which is seldom the case. Moreover, to avoid bias, the intensities used should be the radiances at the satellite sensors, implying use of the sensors' calibration.

7.3 PCA fusion

Panchromatic sharpening using principal components analysis (PCA) is similar to the HSV method. After the PCA transformation, the first principal component is replaced by the panchromatic image, again after radiometric normalization, see Figure 7.1.

Figure 7.1: Panchromatic fusion with the principal components transformation.

Image sharpening using PCA and the closely related Gram-Schmidt transformation is available from the ENVI main menu.
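Returning briefly to Brovey fusion, Eq. (7.1) amounts to only a few lines of IDL. In this sketch (not the ENVI implementation), ms is assumed to be a 3 x cols x rows multispectral array already re-sampled to the panchromatic geometry, and pan the co-registered panchromatic band of the same spatial dimensions.

; Minimal sketch of Brovey fusion, Eq. (7.1)
denom = total(float(ms), 1) > 1e-6     ; spectral sum, guarded against zeros
fused = float(ms)*0.0
for k=0,2 do fused[k,*,*] = reform(ms[k,*,*])*pan/denom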
7.4 Wavelet fusion

Wavelets provide an efficient means of representing the high and low frequency components of multispectral images and can be used to perform image sharpening. Two examples are given here.

7.4.1 Discrete wavelet transform

The discrete wavelet transform (DWT) of a two-dimensional image is shown in Appendix B to be equivalent to an iterative application of the high-pass/low-pass filter bank illustrated in Figure 7.2.
Figure 7.2: Wavelet filter bank. H is a low-pass and G a high-pass filter derived from the coefficients of the wavelet transformation. The columns and then the rows are filtered and downsampled by a factor of 2, yielding $g_{k+1}(i,j)$ and the detail coefficients $C^H_{k+1}(i,j)$, $C^V_{k+1}(i,j)$ and $C^D_{k+1}(i,j)$.

The original image $g_k(i,j)$ can be reconstructed by inverting the filter. A single application of the filter corresponding to the Daubechies D4 wavelet to a satellite image $g_1(i,j)$ (1 m resolution) is shown in Figure B.12. The high frequency information (the wavelet coefficients) is stored in the arrays $C^H_2$, $C^V_2$ and $C^D_2$ and displayed in the upper right, lower left and lower right quadrants, respectively. The original image with its resolution degraded by a factor of two, $g_2(i,j)$, is in the upper left quadrant. Applying the filter bank iteratively to the upper left quadrant yields a further reduction by a factor of 2.

The fusion procedure for IKONOS or QuickBird imagery, for instance, in which the resolutions of the panchromatic and the four multispectral components differ by exactly a factor of 4, is then as follows. Both the degraded panchromatic image and the four multispectral images are compressed once again (e.g. to 8 m resolution in the case of IKONOS) and the high frequency components $C^z_4$, $z = H, V, D$, are sampled to estimate the correction coefficients
$$a^z = \sigma^z_{ms}/\sigma^z_{pan}, \qquad b^z = m^z_{ms} - a^z m^z_{pan}, \qquad (7.2)$$
where $m^z$ and $\sigma^z$ denote mean and standard deviation, respectively. These coefficients are then used to normalize the wavelet coefficients of the panchromatic image to those of the multispectral image:
$$C^z_i(i,j) \to a^z\,C^z_i(i,j) + b^z, \qquad z = H, V, D, \quad i = 2, 3. \qquad (7.3)$$
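The normalization (7.2)-(7.3) itself is a two-line calculation. In this sketch, c_pan and c_ms are assumed to hold corresponding wavelet coefficient arrays (one of z = H, V, D) of the panchromatic and multispectral bands.

; Sketch of Eqs. (7.2) and (7.3) for one coefficient array
a = stddev(c_ms)/stddev(c_pan)
b = mean(c_ms) - a*mean(c_pan)
c_pan = a*c_pan + b      ; normalized panchromatic wavelet coefficients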
The degraded panchromatic image $g_3(i,j)$ is then replaced by each of the four multispectral images in turn, and the normalized wavelet coefficients are used to reconstruct the original 1 m resolution. We thus obtain what would be seen if the multispectral sensors had the resolution of the panchromatic sensor [RW00]. An ENVI plug-in for panchromatic sharpening with the DWT is given in Appendix D.4.1.

7.4.2 À trous filtering

The radiometric fidelity obtained with the discrete wavelet transform is excellent, as will be shown in the next section. However, the lack of translational invariance of the DWT often leads to spatial artifacts (blurring, shadowing, staircase effects) in the sharpened product. This is illustrated in the following program, in which an image is transformed once with the DWT and the low-pass quadrant is shifted by one pixel relative to the high-pass quadrants (i.e. the wavelet coefficients). After inverting the transformation, serious degradation is apparent, see Figure 7.3.

pro translate_wavelet
   ; get an image band
   envi_select, title='Select input file', $
      fid=fid, dims=dims, pos=pos, /band_only
   if fid eq -1 then return
   ; create a DWT object
   aDWT = Obj_New('DWT', envi_get_data(fid=fid, dims=dims, pos=pos))
   ; compress
   aDWT->compress
   ; shift the compressed portion, suppressing the phase correlation match
   aDWT->inject, shift(aDWT->Get_Quadrant(0), [1,1]), pc=0
   ; restore
   aDWT->expand
   ; return result to ENVI
   envi_enter_data, aDWT->get_image()
end

As an alternative to the DWT, the à trous wavelet transform (ATWT) has been proposed for image sharpening [AABG02]. The ATWT is a multiresolution decomposition defined formally by a low-pass filter $H = \{h(0), h(1), \ldots\}$ and a high-pass filter $G = \delta - H$, where $\delta$ denotes an all-pass filter. Thus the high frequency part is just the difference between the original image and the low-pass filtered image. Not surprisingly, this transformation does not allow perfect reconstruction if the output is downsampled, so downsampling is not performed at all. Rather, at the $k$th iteration of the low-pass filter, $2^{k-1}$ zeroes are inserted between the elements of $H$. This means that every other pixel is interpolated on the first iteration, $H = \{h(0), 0, h(1), 0, \ldots\}$, while on the second iteration $H = \{h(0), 0, 0, h(1), 0, 0, \ldots\}$, etc. (hence the name à trous, "with holes"). The low-pass filter is usually chosen to be symmetric (unlike the Daubechies wavelet filters, for example). The prototype filter chosen
here is the cubic B-spline filter
$$H = \{1/16,\ 1/4,\ 3/8,\ 1/4,\ 1/16\}.$$
The transformation is highly redundant and requires considerably more computer storage to implement. However, when used for image sharpening it is much less sensitive to misalignment between the multispectral and panchromatic images.

Figure 7.3: Artifacts due to lack of translational invariance of the DWT.

Figure 7.4 outlines the scheme implemented in the ENVI plug-in for ATWT panchromatic sharpening. The MS band is nearest-neighbor upsampled by a factor of 2 to match the dimensions of the high resolution band. The à trous transformation is applied to both bands (columns and rows are filtered with the upsampled cubic spline filter, the difference from the original determining the high-pass result). The high frequency component of the pan image is normalized to that of the MS image in the same way as for DWT sharpening, equations (7.2) and (7.3). Then the low frequency pan component is replaced by the filtered MS image and the transformation is inverted. An ENVI plug-in for ATWT sharpening is described in Appendix D.4.2.

7.5 Quality indices

Wang and Bovik [WB02] suggest the following measure of radiometric fidelity between two image bands $f$ and $g$:
Figure 7.4: À trous image sharpening scheme for an MS to panchromatic resolution ratio of two. The symbol ↑H denotes the upsampled low-pass filter.

Figure 7.5: Comparison of three image sharpening methods with the Wang-Bovik quality index. Left to right: Gram-Schmidt, ATWT, DWT.
$$Q = \frac{\sigma_{fg}}{\sigma_f\sigma_g}\cdot\frac{2\bar f\bar g}{\bar f^2 + \bar g^2}\cdot\frac{2\sigma_f\sigma_g}{\sigma_f^2 + \sigma_g^2} = \frac{4\sigma_{fg}\,\bar f\bar g}{(\bar f^2 + \bar g^2)(\sigma_f^2 + \sigma_g^2)}, \qquad (7.4)$$
where $\bar f$ and $\sigma_f^2$ are the mean and variance of band $f$ and $\sigma_{fg}$ is the covariance of the two bands. The first term in (7.4) is the correlation coefficient between the two images, with values in $[-1,1]$; the second term compares their average brightness, with values in $[0,1]$; and the third term compares their contrasts, also in $[0,1]$. Thus perfect radiometric correspondence would give the value $Q = 1$. Since image quality is usually not spatially invariant, it is usual to compute $Q$ in, say, $M$ sliding windows and then average over all such windows:
$$Q = \frac{1}{M}\sum_{j=1}^M Q_j.$$
An ENVI plug-in for determining the quality index for pan-sharpened images is given in Appendix D.4.3. Figure 7.5 shows a comparison of three image sharpening methods applied to a QuickBird image, namely the Gram-Schmidt, ATWT and DWT transformations. The last gives by far the best radiometric fidelity, although spatial artifacts are apparent.
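A minimal IDL sketch of Eq. (7.4) for a single window follows (the function name is an assumption, and this is not the plug-in implementation); in practice it would be evaluated in M sliding windows and the results averaged.

; Sketch: Wang-Bovik quality index, Eq. (7.4), for one window
function wang_bovik, f, g
   mf = mean(f, /double) & mg = mean(g, /double)
   vf = variance(f, /double) & vg = variance(g, /double)
   cfg = mean((f - mf)*(g - mg), /double)   ; (biased) covariance estimate
   return, 4*cfg*mf*mg/((mf^2 + mg^2)*(vf + vg))
end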
Chapter 8

Change Detection

To quote Singh's review article on change detection [Sin89], "The basic premise in using remote sensing data for change detection is that changes in land cover must result in changes in radiance values ... [which] must be large with respect to radiance changes from other factors." In the present chapter we briefly describe the most commonly used digital techniques for enhancing this "change signal" in bitemporal satellite images, and then focus our attention on the so-called multivariate alteration detection algorithm of Nielsen et al. [NCS98].

8.1 Algebraic methods

In order to see changes in the two multispectral images represented by the $N$-dimensional random vectors $F$ and $G$, a simple procedure is to subtract them from each other component by component and examine the $N$ difference images
$$F - G = (F_1 - G_1,\ F_2 - G_2 \ldots F_N - G_N)^\top \qquad (8.1)$$
for significant changes. Pixel intensity differences near zero indicate no change, large positive or negative values indicate change, and decision thresholds can be set to define significant changes. If the difference signatures in the spectral channels are used to classify the kind of change that has taken place, one speaks of change vector analysis. The thresholds are usually expressed in standard deviations from the mean difference value, which is taken to correspond to no change.

Alternatively, ratios of intensities of the form
$$\frac{F_k}{G_k}, \qquad k = 1 \ldots N, \qquad (8.2)$$
can be formed between successive images. Ratios near unity correspond to no change, while small and large values indicate change. A disadvantage of this method is that random variables of the form (8.2) are not normally distributed, so that simple threshold values defined in terms of standard deviations are not valid. Other algebraic combinations, such as differences in vegetation indices (Section 2.1), are also in use. All of these "band math" operations can of course be performed conveniently within the ENVI/IDL environment.
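A minimal band math sketch of difference thresholding follows; the band names and the 2-sigma threshold are assumptions, and band1 and band2 must be co-registered arrays of equal dimensions.

; Sketch: band-wise difference change detection with a 2-sigma threshold
d = float(band2) - float(band1)
thr = 2.0*stddev(d)
change_mask = abs(d - mean(d)) gt thr   ; 1 = significant change, 0 = no change
envi_enter_data, byte(change_mask)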
8.2 Principal components

Figure 8.1: Change detection with principal components.

Consider the bitemporal feature space for a single spectral band $m$, in which each pixel is denoted by a point $(f_m, g_m)$, a realization of the random vector $(F_m, G_m)$. Since the unchanged pixels are highly correlated, they will lie in a narrow, elongated cluster along the principal axis, whereas change pixels will lie some distance away from it, see Fig. 8.1. The second principal component will thus quantify the degree of change associated with a given pixel. Since the principal axes are determined by diagonalization of the covariance matrix for all of the pixels, the no-change axis may be poorly determined. To avoid this problem, the principal components can be determined iteratively, weighting each pixel according to the magnitude of its second principal component. This method can be generalized to treat all multispectral bands simultaneously [Wie97].

8.3 Post-classification comparison

If two co-registered satellite images have been classified, then the class labels can be compared to determine land cover changes. If classification is carried out at the pixel level (as opposed to segments or objects), then classification errors (typically of the order of 5%) may dominate the true changes, depending on the magnitude of the latter. ENVI offers functions for the statistical analysis of post-classification change detection.