clear all
clc
%% Ex2-1 A)
% Compute the 5 gravity profiles
d=[-34 -20 0 20 35]; % in km
z=[2 3 5 10 10]; % in km
R=[2 1 3 4 3.5]; % in km
dro=[3.0 5.0 1.0 2.0 1.5]; % in gm/cm^3
gz=(41.93*dro.*R.^2./z)./(d.^2./z.^2+1);
d_itv=-100:1:100;
gz1=(41.93*dro(1)*R(1)^2/z(1))./(d_itv.^2./z(1)^2+1);
gz2=(41.93*dro(2)*R(2)^2/z(2))./(d_itv.^2./z(2)^2+1);
gz3=(41.93*dro(3)*R(3)^2/z(3))./(d_itv.^2./z(3)^2+1);
gz4=(41.93*dro(4)*R(4)^2/z(4))./(d_itv.^2./z(4)^2+1);
gz5=(41.93*dro(5)*R(5)^2/z(5))./(d_itv.^2./z(5)^2+1);
d1=d_itv+d(1);
d2=d_itv+d(2);
d3=d_itv+d(3);
d4=d_itv+d(4);
d5=d_itv+d(5);
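index1=find(abs(d1)<=64);
index2=find(abs(d2)<=64);
index3=find(abs(d3)<=64);
index4=find(abs(d4)<=64);
index5=find(abs(d5)<=64);
% Plot the 5 gravity profiles superimposed
figure(1)
plot(d1(index1), gz1(index1));
hold on;
plot(d2(index2), gz2(index2));
plot(d3(index3), gz3(index3));
plot(d4(index4), gz4(index4));
plot(d5(index5), gz5(index5));
hold off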
title('Gravity Profiles', 'FontSize', 20);
xlabel('Distance (km)', 'FontSize', 15);
ylabel('Gravity (mgal)', 'FontSize', 15);
legend('Cylinder 1','Cylinder 2','Cylinder 3','Cylinder 4','Cylinder 5');
%% Ex2-1 B)
% Compute the total gravity effect of the 5 cylinders by summing their
% effects at each observation point on the profile
g=gz1(index1)+gz2(index2)+gz3(index3)+gz4(index4)+gz5(index5);
% mean value
average = sum(g)/length(g)
% standard deviation (named g_std so the built-in std function is not shadowed)
g_std=sqrt(sum((g-average).^2)/length(g))
% Plot the total gravity effect
figure(2)
plot(d1(index1),g);
hold on
plot(d1(index1),repmat(average,length(index1),1));
title('Total Gravity From Five Cylinders', 'FontSize', 20);
xlabel('Distance (km)', 'FontSize', 15);
ylabel('Gravity (mgal)', 'FontSize', 15);
%% Ex2-1 C)
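% least squares estimates of dro from the normal equations (A'*A)*dro = A'*B
A1=((41.93*R(1)^2/z(1))./(d_itv.^2./z(1)^2+1))';
B1=gz1';
dro1=inv(A1'*A1)*A1'*B1;
A2=((41.93*R(2)^2/z(2))./(d_itv.^2./z(2)^2+1))';
B2=gz2';
dro2=inv(A2'*A2)*A2'*B2;
A3=((41.93*R(3)^2/z(3))./(d_itv.^2./z(3)^2+1))';
B3=gz3';
dro3=inv(A3'*A3)*A3'*B3;
A4=((41.93*R(4)^2/z(4))./(d_itv.^2./z(4)^2+1))';
B4=gz4';
dro4=inv(A4'*A4)*A4'*B4;
A5=((41.93*R(5)^2/z(5))./(d_itv.^2./z(5)^2+1))';
B5=gz5';
dro5=inv(A5'*A5)*A5'*B5;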
%% Ex2-1 D)
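% lower triangular Cholesky factor L of A'*A, so that L*L' = A'*A
L1 = chol(A1'*A1,'lower');
L2 = chol(A2'*A2,'lower');
L3 = chol(A3'*A3,'lower');
L4 = chol(A4'*A4,'lower');
L5 = chol(A5'*A5,'lower');
% coefficients of P from L*P = A'*B
P1=inv(L1)*A1'*B1;
P2=inv(L2)*A2'*B2;
P3=inv(L3)*A3'*B3;
P4=inv(L4)*A4'*B4;
P5=inv(L5)*A5'*B5;
% least squares estimates of dro from L'*dro = P
dro1_CF=inv(L1*L1')*L1*P1;
dro2_CF=inv(L2*L2')*L2*P2;
dro3_CF=inv(L3*L3')*L3*P3;
dro4_CF=inv(L4*L4')*L4*P4;
dro5_CF=inv(L5*L5')*L5*P5;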
admin
Sticky Note
- the process of inversion is shown in the figure on the next page
1-A and in Fig. 2.1 (GeomathBook.pdf, p. 22/153)
- i.e., it is an art form with subjective and quantitative objective
components
admin
Sticky Note
In math, science, and engineering, the 'inverse problem' of
relating data observations to unknown elements or coefficients
of a model is a 'process' commonly called 'inversion'.
However, other names may be used depending on the
application - eg., it may be called
=> blackbox analysis in physics
=> factor analysis in social sciences
=> forward and inverse modeling in geophysics
=> or simply modeling in any discipline
=> trial-and-error analysis
=> deconvolution in signal processing and engineering
=> ????? in geodesy?

=> etc.
admin
Sticky Note
AX is the linear or non-linear forward model,
so that AX = B + 'errors', where
B is the observed data set,
admin
Sticky Note
A is the system input or design matrix,
admin
Sticky Note
and X is the impulse response or set of unknown blackbox
coefficients that the inversion seeks to determine
admin
Sticky Note
The essence of any inverse problem is the forward model = AX,
which in practice is always processed by linear digital
operations.
admin
Sticky Note
SENSITIVITY ANALYSIS - pretty much where most modern
developments in inversion analysis are concentrated.
admin
Sticky Note
the most meaningful set of solution coefficients
admin

Sticky Note
translating (ie., rationalizing) the best set of solution
coefficients into 'new' information on the conceptual model -
usually augmented with digital graphics.
admin
Sticky Note
Inversion examples of conceptual and mathematical models are
shown on page 2-A below and in Fig. 2.2 (GeomathBook.pdf, p.
23/153)
admin
Sticky Note
Conceptual models are imagined simplifications of reality
admin
Sticky Note
Mathematical models are the further simplified expressions of
conceptual models
admin
Sticky Note
joint inversions would obtain density and velocity solutions that
satisfy both gravity and seismic observations
admin
Sticky Note
always
admin

Sticky Note
always
admin
Sticky Note
always
admin
Sticky Note
=> eg., the data set is improperly 'scaled' relative to computer's
working precision
=> for example, the inverse problem is improperly posed to find
angstrom scale variations in observations that are known only to
nearest kilometer, etc.
admin
Sticky Note
Figure on page 8-A below shows examples of model
'appropriateness' relative to inversion 'objectives'
admin
Sticky Note
Inversion (ie., modeling) objectives dictate the mathematical

model that should be invoked - or model 'appropriateness'
In other words, the forward model should
- not be overly complex numerically so as to waste
computational resources,
- or too simple numerically so as not to achieve the inversion
objective.
- Rather, it should be just right or 'optimal' so that the inversion
objective is achieved with 'minimum' resource expenditure.
admin
Sticky Note
<= for example, the gravity profile on the left may be modeled
with equivalent accuracy by the 3 mathematical models
generalized in the bottom 3 panels with varying numbers of
unknowns, m.
admin
Sticky Note
if one wanted to simply interpolate the gravity signal at
unmapped x-coordinates, for example,
then the horizontal cylinder model would be most appropriate,
because only m = 1 unknown (e.g., the density contrast Δρ-value)
needs to be determined from the n-data points
admin
Sticky Note
However, if the objective is to drill the source, which is a
relatively expensive operation,

then a more complex half-space solution might be warranted
involving m = 5+ line segment deviations from the horizontal to
better resolve subsurface details of the source's geometry
The cost of the half-space solutions would be at least 5+ times
the cost of the horizontal cylinder solution, but probably worth
it given the great costs of the drilling application.
admin
Sticky Note
On the other hand, if the objective is to mine or excavate the
source, which is the most expensive way to exploit the
subsurface,
then a 2-D checkerboard model of the subsurface might be
warranted involving the determination of at least m = 5 x 10 =
50 prism densities.
The cost of the checkerboard solutions would be at least 50
times the cost of the horizontal cylinder solution, but probably
worth it given the extreme costs of subsurface mining and
excavation.
admin
Sticky Note
In general, the 'underdetermined' inversion where m > n is
frankly the 'norm' in every mapping application.
admin
Sticky Note
Thus, inversion involves=>
- conceptualizing or fantasizing an appropriate forward model,

- quantitatively determining the solution coefficients,
- critiquing the solution for the most meaningful coefficients,
and
- converting or rationalizing the solution coefficients in terms
of the conceptual model
People who are adept at the art form of conceptualizing models
to relate to data, and transforming the numerical results of
inversion into meaningful rationalizations or stories are highly
sought after and rewarded.
The objective, quantitative components of inversion are
relatively trivial and well established since Gauss' time (~
1800).
admin
Sticky Note
The relative advantages and limitations of trial-and-error
inversion are summarized on the next page.
admin
Sticky Note
Trial-and-error inversion is perhaps the most successful and
widely used method in human history -
it also seems to find wide application throughout nature by
virtue of Darwin's Law of Natural Selection whereby organisms
adapt to their environment.
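A minimal MATLAB sketch of trial-and-error inversion follows. It assumes the horizontal cylinder forward model of Ex2-1 with R = 3 km and z = 5 km, and it uses a synthetic 'observed' profile generated with a density contrast of 1.0 gm/cm^3. The observation offsets, trial range, and step size are illustrative choices, not values from the notes.

```matlab
% Trial-and-error inversion sketch (all numerical values are assumed)
R = 3; z = 5;                          % cylinder radius and depth in km
x = -50:5:50;                          % observation offsets in km
fwd = @(dro) (41.93*dro*R^2/z)./(x.^2/z^2 + 1);  % forward model in mgal
b_obs = fwd(1.0);                      % synthetic observations (true dro = 1.0)

trials = 0:0.1:3;                      % candidate density contrasts to try
ssr = zeros(size(trials));             % sum of squared residuals per trial
for k = 1:numel(trials)
    ssr(k) = sum((b_obs - fwd(trials(k))).^2);
end
[ssr_min, k_best] = min(ssr);          % keep the trial with the smallest misfit
dro_best = trials(k_best);
fprintf('best trial dro = %.1f, SSR = %.3g\n', dro_best, ssr_min);
```

Each trial costs one forward-model evaluation, so the effort grows with the number of candidates tried, which is the practical limit of the approach once there is more than one unknown.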

admin
Sticky Note
i is the row subscript, and
j the column subscript
admin
Sticky Note
don't really need to take a course in linear algebra anymore to
become facile with matrix operations - just become familiar
with a linear equation solver like matlab, mathematica, etc.
admin
Sticky Note
ie. the matrix expression of unity
admin
Sticky Note
the determinant is the value or number associated with a square
matrix that is used to define the characteristic polynomial of the
matrix.
admin
Sticky Note
the number of terms in the determinant of an n-order matrix is
n-factorial or n! -

where for example, 10! = 3,628,800 is the number of terms
needing evaluation for a tenth-order matrix.
admin
Sticky Note
Sarrus' rule=> the determinant for n = 3 is the sum of the products
along the right-pointing diagonals minus the sum of the
products along the left-pointing diagonals
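As a hedged illustration, the diagonal rule can be checked in MATLAB against the built-in det() for a 3 x 3 matrix. The matrix A below is an arbitrary example, not one from the notes.

```matlab
% Sarrus' diagonal rule for a 3x3 matrix, checked against det()
A = [2 -1 0; 1 3 4; 5 0 -2];   % arbitrary example matrix
right = A(1,1)*A(2,2)*A(3,3) + A(1,2)*A(2,3)*A(3,1) + A(1,3)*A(2,1)*A(3,2);
left  = A(1,3)*A(2,2)*A(3,1) + A(1,1)*A(2,3)*A(3,2) + A(1,2)*A(2,1)*A(3,3);
d_sarrus  = right - left;      % right-pointing minus left-pointing products
d_builtin = det(A);            % MATLAB's determinant for comparison
n_terms3  = factorial(3);      % 3! = 6 products in the expansion
n_terms10 = factorial(10);     % number of terms for a tenth-order matrix
```

The six products used here are all 3! terms of the expansion, which is why the shortcut works at this size.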
admin
Sticky Note
X and B are also sometimes called n-tuples
admin
Sticky Note
Cramer's Rule from ca. 1750
admin
Sticky Note
this example is also considered in Fig. 4.2 (GeomathBook.pdf,
p. 32/153)
admin
Sticky Note
we get=>
admin
Sticky Note
where=>

admin
Sticky Note
a buried 2-D horizontal cylinder extending infinitely into and
out of the page
admin
Sticky Note
note that the elements aij and bi are known, whereas the
element xj is unknown
admin
Sticky Note
ie, the design matrix elements aij are evaluated with the
unknown variables x1 and x2 set to unity - or
(x1 = Δρ) = 1 = (x2 = C)
admin
Sticky Note
|A'|/|A| =
admin
Sticky Note
error = [(0.500 - 0.444)/(0.5)]100
= 11.2%
admin

Sticky Note
we do not 'invert' anything here, but rather 'repeat the inversion'
admin
Sticky Note
~ 7% error
admin
Sticky Note
~ 11% error
admin
Sticky Note
inversion of
admin
Sticky Note
compute A @ d1 = 0 km and
d2 = 20 km for the system=>
admin
Sticky Note
coefficients of the A-matrix
admin
Sticky Note
coefficients of the Identity matrix
admin
Sticky Note
we 'triangulate' the system using elementary row operations

admin
Sticky Note
i.e., the a21-element was zeroed out by multiplying the top row
by (4.440/75.474) and subtracting the result from the bottom
row, etc.
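The row operation described in this note can be sketched in MATLAB as follows. The coefficients 75.474 and 4.440 are taken from the note. The second column of ones (for the regional constant C) and the right-hand side built from a hypothetical solution x_true are assumptions made only so that the sketch runs.

```matlab
% Triangulating a 2x2 system with one elementary row operation
A = [75.474 1; 4.440 1];       % rows: d = 0 km and d = 20 km (second column assumed)
x_true = [0.5; 10];            % hypothetical density contrast and regional constant
B = A*x_true;                  % synthetic observations for this sketch
M = [A B];                     % augmented matrix [A | B]
f = M(2,1)/M(1,1);             % factor 4.440/75.474
M(2,:) = M(2,:) - f*M(1,:);    % subtract f times the top row from the bottom row
x2 = M(2,3)/M(2,2);            % back substitution for the second unknown
x1 = (M(1,3) - M(1,2)*x2)/M(1,1);   % then for the first unknown
```

After the row operation the lower-left element of M is zero, so the second unknown follows directly and the first is recovered by back substitution.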
admin
Sticky Note
why can you not use elementary column operations to
triangulate the system?
admin
Sticky Note
for the n-order system
admin
Sticky Note
ie., the system of equations is diagonalized
admin
Sticky Note
in the days before computerized equation solvers were available
like Matlab, Mathematica, etc.,
the 'rank' concept had considerable practical significance for
minimizing manual processing of singular systems
admin

Sticky Note
real-world problems are always underdetermined!!!
admin
Sticky Note
more generally, in modern least squares applications, we have
A => AtA and
B => AtB (the weighted observation vector)
admin
Sticky Note
Gaussian elimination algorithm
admin
Sticky Note
Matrix determinant algorithm
admin
Sticky Note
B is the column matrix containing values of the 'dependent'
variable, and
A is the column matrix (missing the transpose symbol)
containing values of the 'independent' variable
admin
Sticky Note
the least squares solution will be developed using the principle
of 'mathematical induction' - ie., if

1) when a statement is true for a natural number
n = k, then it will also be true for its successor
n = k+1;
2) and the statement is true for n = 1;
then the statement will be true for every natural number n.
To prove a statement by induction, parts 1)- and 2)-above must
be proved.
admin
Sticky Note
in linear regression, the slope x1 is the only unknown
admin
Sticky Note
typo=> x2-above should be x1
admin
Sticky Note
ie., as long as AtA ≠ 0
admin
Sticky Note
typo=> delete the + symbol in the above term - ie., it should be
the product=> x1ai
and not the sum=> x1+ai

admin
Sticky Note
typo=> the above unknowns x2 & x3 should be=> x1 & x2
admin
Sticky Note
ie., j-indices refer to transpose elements,
and
k-indices refer to normal vector elements
admin
Sticky Note
δij = 0 for all i ≠ j
and
δij = 1 for all i = j
admin
Sticky Note
Note that for any number of unknowns m and number of
observations n=>
A(n×m)X(m×1) = B(n×1)

versus
AtA(m×m)X(m×1) = AtB(m×1)
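The dimension bookkeeping above can be verified with a small MATLAB sketch. The sizes n = 6 and m = 2 and the numerical values are illustrative assumptions.

```matlab
% Dimension check for the normal equations (illustrative values)
n = 6; m = 2;
a = (1:n)';
A = [a ones(n,1)];          % (n x m) design matrix
X_true = [2; -1];           % (m x 1) hypothetical unknowns
B = A*X_true;               % (n x 1) noise-free observations
AtA = A'*A;                 % (m x m) normal matrix
AtB = A'*B;                 % (m x 1) weighted observation vector
X = AtA\AtB;                % least squares solution of AtA*X = AtB
disp(size(A)); disp(size(AtA)); disp(size(AtB));
```

However many observations n are taken, the system actually solved is only m x m, which is the practical appeal of the normal equations.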
admin
Sticky Note
but a faster approach that avoids finding (AtA)-1 directly is to
use the Cholesky factorization of (AtA)
admin
Sticky Note
see Lawson & Hansen (1976)
admin
Sticky Note
this is a telescoping series or equation where the x3-solution is
needed to solve for the x2-solution, etc.
admin
Sticky Note
in evaluating a system of m = 100 unknowns, the use of the
#1-option below requires m^3 = 10^6 operations;
#2-option below reduces the operations by a third
to 2m^3/3 = 666,667; and
#3-option below reduces the operations by two-thirds to 2m^3/6
= 333,333
admin
Sticky Note
or 5,050 elements for m =100 unknowns
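The counts quoted in these notes can be reproduced with a few lines of MATLAB. The variable names are illustrative, and the packed count assumes that only the m(m+1)/2 unique elements of the symmetric AtA matrix are stored.

```matlab
% Operation and storage counts for m unknowns
m = 100;
ops1 = m^3;                    % option #1
ops2 = round(2*m^3/3);         % option #2
ops3 = round(2*m^3/6);         % option #3
n_packed = m*(m+1)/2;          % unique elements of the symmetric AtA
n_packed_1000 = 1000*1001/2;   % same count for m = 1,000 unknowns
fprintf('%d %d %d %d %d\n', ops1, ops2, ops3, n_packed, n_packed_1000);
```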
admin
Sticky Note
in the 1980s, OSU's IBM 3081 mainframe computer could
evaluate roughly 1,000 unknowns in-core,
which only held a maximum (AtA)-packed array of about
500,500 elements
admin
Sticky Note
however, at the expense of CPU-time, the number of unknowns
could be expanded as necessary by updating the packed (AtA)-
array on an external (out-of-core) storage device as unknowns
are added to the system.
admin
Sticky Note
Assignment #4=> complete exercise 2.1=>
EARTHSC_5642_Ex2-1_08Feb15
admin
Sticky Note
ie., rank (AtA) is < m
admin
Sticky Note

ie., rank (AtA) is approximately < m
admin
Sticky Note
and overdetermined so that n > m
admin
Sticky Note
note that var(X) = σX^2 and var(B) = σB^2
admin
Sticky Note
r^2 is sometimes also called 'coherency' or the 'coherency
coefficient'
admin
Sticky Note
Assignment #6=> What is the matrix expression for the
correlation coefficient r in terms of the matrices A, X, and B
that also accounts for the sign of r?
admin
Sticky Note

- under the square root symbol is a number that is just the i-th
diagonal element of the n x n matrix A(AtA)^-1 At
- this element is the weight for the confidence interval (CI) or
error bar on the prediction bi^hat
- see example in Fig. VI.5 of the handout
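A short MATLAB sketch of these diagonal elements for a straight-line fit is given below. The five abscissa values are an illustrative assumption, and the sketch only forms the weights, not the full confidence interval.

```matlab
% Diagonal of A*(AtA)^-1*At for a straight-line fit (illustrative data)
a = (0:4)';
A = [a ones(5,1)];          % (n x m) design matrix with n = 5, m = 2
S = A*((A'*A)\A');          % (n x n) matrix A (AtA)^-1 At
h = diag(S);                % i-th element weights the error bar on bi^hat
w = sqrt(h);                % the square root taken in the note
disp([h w]);
```

For these abscissas the weights are smallest at the center of the data and largest at the ends, so predictions near the edges of the profile carry the widest error bars.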
admin
Sticky Note
or Σ(bi^hat - bi^bar)^2 ---> 0
admin
Sticky Note
- even though the synthetic data (ie., the predictions from a
solution) may fit the observed data, the error bars on the
solution may be unacceptably large
- ie., sufficiently large that the solution is useless for anything
else than matching the observations!
- and the solution is said to be unstable or poorly conditioned or
near-singular
admin
Sticky Note
- the coefficients of the A-matrix take up most of the resources

in solving the inverse problem
- thus, rather than re-parameterizing the problem for a new set
of A-coefficients, the pressure is on to manipulate or manage
the original set of coefficients for a better performing solution
- classically, this has been done by considering the
eigenvalue/eigenvector decomposition of AtA
admin
Sticky Note
in matlab, the condition number is=>
COND = λmax/λmin
and the reciprocal condition is given by=>
RCOND = λmin/λmax
admin
Sticky Note
of course, for least squares problems, this system really is=>
AtAX = AtB
admin
Sticky Note
two ways to check your code for finding the eigenvalues of AtA
admin

Sticky Note
so what are the COND or RCOND values here?
admin
Sticky Note
A = [x1 x2; x3 x4] = [4 8; 8 4]
admin
Sticky Note
- arrow intersects the ellipse halfway between (4, 8) and (8, 4)
with semi-major axis length=> λmax = 12 and semi-minor axis
length=> λmin = -4
- where the directions of the two axes are given by the two
corresponding eigenvectors
- i.e., for λmax = 12 the eigenvector is=>
Xmax = (x1 x2)t = (1 1)t
- so that the slope of the semi-major axis is=>
tan^-1(1/1) = 45°
- and for λmin = -4 the eigenvector is=>
Xmin = (-1 1)t
- so that the slope of the semi-minor axis is=>
tan^-1(-1/1) = 135° = 45° + 90°
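The eigenvalues and axis directions quoted in this note can be checked with MATLAB's eig(). The angle convention below (orientation folded into the range 0 to 180 degrees) is an assumption chosen to match the 45 and 135 degree values in the note.

```matlab
% Eigenvalues and eigenvector directions of the 2x2 example
A = [4 8; 8 4];
[V, D] = eig(A);                  % columns of V are the eigenvectors
lambda = diag(D);
[lam_max, imax] = max(lambda);
[lam_min, imin] = min(lambda);
vmax = V(:,imax); vmin = V(:,imin);
ang_max = mod(atand(vmax(2)/vmax(1)), 180);   % semi-major axis direction, degrees
ang_min = mod(atand(vmin(2)/vmin(1)), 180);   % semi-minor axis direction, degrees
fprintf('lambda = %g, %g; angles = %g, %g deg\n', lam_max, lam_min, ang_max, ang_min);
```

The sign of each eigenvector returned by eig() is arbitrary, which is why the direction is computed from the ratio of its components.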
admin
Sticky Note
for A1 = [6 8; 8 6]
- the rows are becoming more similar and the matrix
increasingly near-singular
- as reflected by the eigenvalues=>
λmax = 14, λmin = -2, and related COND
admin
Sticky Note
for A2 = [4 8; 4 8]
- the matrix is singular with eigenvalues=>
λmax = 12, λmin = 0, and COND---> ∞
- ie., the second row adds no new info to the system
admin
Sticky Note
for A3 = [-4 8; 8 4]
- the rows of the matrix are orthogonal with eigenvalues=>
λmax = 8.94 = -λmin, and COND = RCOND = 1
- so that the solution using A-1 is stable in the sense that small
changes in the A-coefficients do not result in large erratic
variations in the predictions
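The conditioning of the four example matrices can be compared with a brief MATLAB sketch. Note that cond() returns the 2-norm condition number, i.e., the ratio of the largest to the smallest singular value, which equals the ratio of eigenvalue magnitudes only for symmetric matrices. The labels in names are just tags for the printout.

```matlab
% Eigenvalues and condition numbers of the four 2x2 examples
mats  = {[4 8; 8 4], [6 8; 8 6], [4 8; 4 8], [-4 8; 8 4]};
names = {'A', 'A1', 'A2', 'A3'};
c = zeros(1,4);
for k = 1:4
    lam  = eig(mats{k});           % eigenvalues of the k-th matrix
    c(k) = cond(mats{k});          % 2-norm condition number
    fprintf('%-2s: eigenvalues = [%g %g], COND = %g\n', ...
            names{k}, lam(1), lam(2), c(k));
end
```

The singular matrix A2 gives an infinite or extremely large COND depending on rounding, while the matrix with orthogonal rows gives COND = 1.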

admin
Sticky Note
correction=>
At = VΛtUt where U = AVΛ^-1 since Λ^-1 = (Λt)^-1
so that to evaluate U we don't need to evaluate the
(n x n) system AAt
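A minimal MATLAB sketch of this shortcut follows, assuming a small full-rank example matrix that is not from the notes. Only the (m x m) eigenproblem of AtA is solved, and U is then formed from A, V, and Λ.

```matlab
% Building U from the (m x m) eigenproblem of AtA only
A = [1 2; 3 4; 5 6];            % illustrative (n x m) matrix, n = 3, m = 2
[V, D] = eig(A'*A);             % eigenvectors and eigenvalues of AtA
Lam = diag(sqrt(diag(D)));      % singular values on the diagonal of Lambda
U = A*V/Lam;                    % U = A*V*inv(Lambda), size (n x m)
A_rebuilt = U*Lam*V';           % decomposition should reproduce A
err = norm(A - A_rebuilt);
orth_err = norm(U'*U - eye(2)); % columns of U should be orthonormal
fprintf('reconstruction error = %.2e\n', err);
```

The division by Λ fails when a singular value is zero, so for rank-deficient systems the corresponding columns would have to be dropped first.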
admin
Sticky Note
of
admin
Sticky Note
correction=>
K = - log(RCOND)
admin
Sticky Note
- modeling a (10 x 10)-array of satellite-observed total field
magnetic anomaly values in nT at 125 km altitude with a 1°-station
spacing
- by a (10 x 10)-array of point dipoles at the earth's surface with
a 1°-station spacing
- in the 1975 International Geomagnetic Reference Field (IGRF-75)
-- the inversion obtained point dipole magnetic susceptibilities
so that the field of the dipoles fit the observed magnetic
anomalies of insert A with least squares accuracy
admin
Sticky Note
- in the remaining 4 inserts, the system's conditioning was
investigated using=> K = -log(RCOND) as a function of various
inversion parameters
- for example, insert B shows that increasing the altitude of the
observations degrades the solution increasingly because the
system is becoming more near-singular and unstable
admin
Sticky Note
- in insert B, the solutions were obtained for an applied
differentially-reduced-to-pole (DRTP) field with intensity Fe =
60,000 nT
- this simplification minimized considering the inversion effects
from varying IGRF intensities, declinations, and inclinations
admin
Sticky Note
- for insert C, the (10° x 10°)-arrays are moved along a constant
longitude
- with the observed anomaly values at 125 km altitude, and
- a DRTP field of Fe = 60,000 nT
- the system becomes increasingly ill-conditioned as it approaches
the earth's poles
admin
Sticky Note
- insert D shows the conditioning effects of moving the problem
along the swath between the geographic equator and 10°N in the
IGRF-75 (curve A),
- along the swath between the geographic equator and 10°S in
the IGRF-75 (curve B), and
- along the swath between the geographic equator and 10°S in
the DRTP (curve C)
- the observed magnetic anomalies were all taken at 125 km
altitude
admin
Sticky Note
- in insert E, curve A shows the conditioning effects at 400 km
altitude in the DRTP of varying the observation grid spacing for
a constant 1° source grid spacing
- ie., this system is best conditioned for observation grid
spacings equal to or greater than roughly 2.5 times the source
spacing
admin
Sticky Note
- curve B of insert E, on the other hand, shows the conditioning
effects at 400 km altitude in the DRTP of varying the source

grid spacing for a constant 1° observation grid spacing
- ie., this system is best conditioned where the source grid
spacing is equal to roughly 1.5 to 2 times the observation
spacing
admin
Sticky Note
ie. the=> GLI
admin
Sticky Note
H is the GLI that is sometimes also referred to as the 'pseudo' or
'Penrose' inverse
admin
Sticky Note
ie., the design matrix A is multiplied from the right by the GLI
to see how close the product is to the n-th order identity matrix
In
admin
Sticky Note
ie., the design matrix A is multiplied from the left by the GLI to
see how close the product is to the m-th order identity matrix
Im
admin
Sticky Note

correction=> minimum
admin
Sticky Note
ie., where m < n
admin
Sticky Note
where the GLI=> H = (AtA)^-1 At
admin
Sticky Note
correction=> quantitatively
admin
Sticky Note
ie., the info density matrix S provides insight on how the
observations are related to the inferred forward model of the
system
admin
Sticky Note
ie., the info density matrix S provides insight on the
observations that may be culled from or added to the system for
a more effective solution
admin
Sticky Note

ie., the info density matrix S provides insight on designing
appropriate surveys, experiments, field work, etc.
admin
Sticky Note
- note that the info density matrix S is a function only of the
coefficients of the design matrix A, and not the magnitudes of
the coefficients of the observation vector B
- ie., solving the system where all the B-coefficients have been
set to zero yields the exact same COND as does the solution for
the original set of B-coefficients
admin
Sticky Note
where the GLI=> H = (AtA)^-1 At
admin
Sticky Note
ie., where m > n
admin
Sticky Note
ie., where (m = n) or (m > n)
admin
Sticky Note
ie., the corresponding k-th variable
admin
Sticky Note

ie., by deleting either xi or xj, etc.
admin
Sticky Note
Inversion of free oscillation earthquake data for a core-to-crustal
surface profile of earth densities, where
- the densities ρ(ro) are estimated at target depths ro that range
from
- ro = (0.10)re near the earth's center in the top-left insert to
- ro = (0.99)re near the earth's surface in the bottom-right
insert,
- and re is the earth's mean radius of about 6,372 km
admin
Sticky Note
for each insert, the horizontal R-axis extends from the earth's
center at R = 0re to its mean surface at R = 1re
admin
Sticky Note
ρ(ro = 1re) ≈ 3 g/cm^3
admin
Sticky Note
ρ(ro = 0re) ≈ 13 g/cm^3
admin
Sticky Note

m(dots) = 35
admin
Sticky Note
where the co-variance is=>
cvar(xk^hat) = σ^2 (Vp Λp^-2 Vpt)
admin
Sticky Note
or 'standardizing'
admin
Sticky Note
The resolution limits of inversion depend on=>
admin
Sticky Note
on page 83/100 below
admin
Sticky Note
on page 83/100 below
admin
Sticky Note
for underdetermined systems

admin
Sticky Note
- the eigenvalues are arranged in decreasing magnitude order
- in general, these spectra tend to be insightful where the
number of unknowns is relatively small like in these examples
- for large numbers of unknowns, the choice of a cut-off value
can be hard to make
- for gridded solutions in particular, the eigenvalue spectrum
tends to decay smoothly without a prominent break in slope
admin
Sticky Note
OPTIONAL Assignment #8=> complete exercise 2.4=>
admin
Sticky Note
in the computationally long-winded approach of Fig. 2.6=>
admin
Sticky Note
on page 84/100
admin
Sticky Note
on page 85/100
admin

Sticky Note
correction=> CONSTRAINED LINEAR INVERSION
admin
Sticky Note
ie., the IDEAL WORLD of mostly textbooks
admin
Sticky Note
ie., the REAL WORLD that includes improperly scaled
inversions
admin
Sticky Note
where X = A^-1 B
admin
Sticky Note
where X = (AtA)^-1 AtB
admin
Sticky Note
where (m - n) of the unknowns must be expressed in terms of
the remaining unknowns or reassigned
admin
Sticky Note
Note that the GLI always exists for obtaining solutions from
REAL WORLD systems

admin
Sticky Note
Note that the conditioning of the AX-system has nothing to do
with how the magnitudes of the observed B-coefficients vary
admin
Sticky Note
where the GLI=> H = [(AtA)^-1 At]
admin
Sticky Note
Thus, the Information Density matrix
S = AH = A[(AtA)^-1 At] ≈ UUt = In
may be useful in cases II, IV, and V, for isolating optimal
observation B-coefficients to relate to a chosen forward model
AX
admin
Sticky Note
and the Resolution matrix
R = HA = [(AtA)^-1 At]A ≈ VVt = Im
may be useful in cases III, IV, and V, for isolating optimal
X-coefficients to relate to a chosen forward model AX
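The information density and resolution matrices can be formed in MATLAB for a small overdetermined example. The design matrix below is an illustrative assumption. No observation vector B enters the calculation, which is consistent with the notes' point that conditioning depends only on the A-coefficients.

```matlab
% Information density S and resolution R from the GLI (illustrative A)
a = (1:5)';
A = [a ones(5,1)];              % overdetermined design matrix, n = 5, m = 2
H = (A'*A)\A';                  % GLI, H = (AtA)^-1 At, size (m x n)
S = A*H;                        % information density matrix, (n x n)
R = H*A;                        % resolution matrix, (m x m)
res_err = norm(R - eye(2));     % R equals Im when A has full column rank
tr_S = trace(S);                % trace of S equals m, so S is not In here
fprintf('|R - Im| = %.2e, trace(S) = %g\n', res_err, tr_S);
```

In this overdetermined case R is the identity, so the unknowns are fully resolved, while the rows of S show how strongly each observation contributes to its own prediction.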
admin
Sticky Note
- in general, cases IV and V are the REAL WORLD cases most
commonly encountered by researchers,
- and in need of sensitivity analysis to establish optimally
performing solutions

admin
Sticky Note
in the journal=> Technometrics (published by the American
Statistical Association and the American Society for Quality)
admin
Sticky Note
that is drawn from the Gaussian distribution=>
N(μ=0, σ) with σ^2 = EV
- using EV to stabilize a system is equivalent to the classical
approach of computing each A-coefficient with random noise
added from the N(0, σ)-distribution
admin
Sticky Note
or error of fit (EOF)
admin
Sticky Note
with λj' = (λj + EV)
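The eigenvalue shift can be illustrated with a hedged MATLAB sketch that assumes EV is added to the diagonal of AtA, as in ridge regression. The nearly collinear design matrix, the observations, and the value EV = 1e-4 are illustrative assumptions.

```matlab
% Damping a near-singular system by adding EV to the diagonal of AtA
A = [1 1; 1 1.0001; 1 0.9999];      % nearly collinear columns (illustrative)
B = [2; 2.0001; 1.9999];            % observations consistent with X = [1; 1]
EV = 1e-4;                          % assumed error variance
AtA = A'*A;
lam  = sort(eig(AtA));              % original eigenvalues
lamp = sort(eig(AtA + EV*eye(2)));  % damped eigenvalues, lam + EV
X_damped = (AtA + EV*eye(2))\(A'*B);    % damped least squares solution
cond_before = max(lam)/min(lam);
cond_after  = max(lamp)/min(lamp);
fprintf('COND before = %.3g, after = %.3g\n', cond_before, cond_after);
```

Every eigenvalue moves up by EV, which barely changes the largest one but lifts the smallest away from zero, so the condition number drops by several orders of magnitude here.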
admin
Sticky Note
ie., X’ , can be computed in about a tenth of the effort it took to
compute the A-coefficients
admin

Sticky Note
on pages 90/100 and 91/100, respectively
admin
Sticky Note
Figures II.1 thru II.6 have been deleted
admin
Sticky Note
The rest of the notes to the end of this 5642Lectures_2_4.pdf
are provided for completeness only...
- the utility of these notes is becoming limited as modern data
processing increasingly relies on linear systems and spectral
analysis
- thus, we will move on to consider the most computationally
efficient and elegant linear system of all=> the spectral model
admin
Sticky Note
Insert A shows a (16 x 16)-array of satellite magnetic
observations with 2°-station spacing over India at 400 km asl
that we want to relate by least squares inversion to the magnetic
susceptibilities of a (16 x 16)-array of point dipoles with
2°-station spacing at 100 km bsl
assuming the IGRF-75 updated to 1980
admin
Sticky Note
Insert B shows that the solution provides an exact match within
working precision to observed magnetic anomalies in A

- however, the condition number as inferred by
K = -log(RCOND) ≈ 14 suggests that the solution contains
almost no significant figures
admin
Sticky Note
Insert C is a map of the DRTP anomalies produced using the
very poorly conditioned solution of insert B
- ie., changing the A-coefficients to process the solution for
anything (eg., interpolations, continuations, etc.) beyond
estimating the original observations yields essentially useless
results
ie., garbage - even though the predictions are nearly perfect in
matching the observed anomalies
admin
Sticky Note
Insert I is the map of the DRTP anomalies produced with a
better conditioned solution based on using the optimal EV = 10^-7
that minimizes
- the misfit in the observed data according to curve C in Fig. 3-below,
and
- the map-to-map differences in the predictions with changing
EV according to curve B in Fig. 3-below
admin
Sticky Note
Insert H provides an effective error map on the predictions in
insert I
that was obtained by subtracting the predictions at EV = 10^-6
from those at EV = 10^-8
admin
Sticky Note
Curve B = sum of squared residuals (SSR) =
Σi[(Pred. Map @ EVi) – (Pred. Map @ EV=1)]^2
for the map of predictions (Pred. Map) at various EV-values
admin
Sticky Note
Curve C =
Σi[(Pred. Obs. @ EVi) – (Obs.)]^2
admin
Sticky Note
Curve D = K = -log(RCOND)
admin
Sticky Note
Curve E = variance of solutions @ EVi
admin
Sticky Note
put cursor on top-left corner to see range of 'optimal' EV=>
|---------------| 10^-6
admin
Sticky Note
Curves B and C provide an effective trade-off diagram for

estimating an 'optimal' EV,
whereas Curves D and E are only marginally effective
admin
Sticky Note
Principles of
Linear Programming <===> Game Theory
admin
Sticky Note
convex solution space that satisfies the a-thru-e inequalities of
eq. 4.1

Geomathematical Analysis
Ralph R.B. von Frese
School of Earth Sciences,
The Ohio State University,
Columbus, OH 43210
Contents
List of illustrations page v
List of tables xii
0.1 Overview 1
1 Basic Digital Data Analysis 2
2 Data Inversion or Modeling 9
3 Trial-and-Error Methods 14
4 Array Methods 15
4.1 Basic Matrix Properties 16
4.2 Matrix Inversion 19
4.2.1 Error Statistics 25

4.2.2 Sensitivity Analysis 27
4.3 Generalized Linear Inversion 30
4.3.1 Eigenvalues & Eigenvectors 31
4.3.2 Singular Value Decomposition 32
4.3.3 Generalized Linear Inverse 33
4.3.4 A) Information Density 34
4.3.5 B) Model Resolution 35
4.3.6 C) Model Variance 37
4.3.7 Sensitivity Analysis 37
4.4 Summary 38
5 Spectral Analysis 40
5.1 Analytical Transforms 43
5.2 Numerical Transforms 45
5.2.1 Numerical Errors 53
5.2.2 A) Gibbs’ Error 53
5.2.3 B) Wavenumber Leakage 55

5.2.4 C) Wavenumber Aliasing 57
5.2.5 D) Wavenumber Resolution 59
5.3 Maximum Entropy Spectral Analysis 59
6 Data Interrogation 64
6.1 Convolution 65
6.1.1 Analytical Convolution 65
6.1.2 Numerical Convolution & Deconvolution 65
6.1.3 Convolution & Correlation Theorems 70
6.1.4 Summary 76
6.2 Isolation & Enhancement 79
6.2.1 Spatial Filtering 80
6.2.2 A) Geological Methods 80
6.2.3 B) Graphical Methods 81
6.2.4 C) Trend Surface Analysis 82
6.2.5 D) Analytical Grid Methods 85
6.2.6 Spectral Filtering 85
6.2.7 A) Wavelength Filters 86
6.2.8 B) Directional Filters 89

6.2.9 C) Correlation Filters 94
6.2.10 D) Derivative & Integral Filters 99
6.2.11 E) Interpolation & Continuation Filters 104
7 Data Graphics 108
7.1 Map Projections & Transformations 109
7.2 Gridding 110
7.2.1 Linear Interpolation 112
7.2.2 Cubic Spline Interpolation 114
7.2.3 Equivalent Source Interpolation 117
7.2.4 Polynomial Interpolation 121
7.2.5 Statistical Interpolation 121
7.3 Graphical Parameters & Practice 123
7.3.1 Standardized & Normalized Data 124
7.3.2 Local Favorability Indices 125
7.4 Presentation Modes 126
8 Key Concepts 131
References 135
Index 139

Illustrations
1.1 Historical trends in computing range from the pre-calculus era of
ancient times when analysis was essentially based on linear discrete
mathematics, through the classical calculus era that began roughly
with Newton and Leibniz who incorporated continuum mathematics and
non-linear systems, to the present computational calculus era that
was inaugurated for the masses by the arrival of the electronic
pocket calculator. All data analyses are fundamentally based on
taking data sums [Σ] and differences [∆]. 3
1.2 The calculus relates the function to the integral of its derivative
and vice versa. Adapted from Apostol (1964). 4
1.3 The integral of the cosine is the cosine phase-shifted by 90° or
the sine, whereas the integral of the sine is the cosine with amplitude
that is twice that of the sine wave. Adapted from Apostol (1964). 8
2.1 The process of inversion (solid arrows) establishes a forward
model (AX) of the observations (B) and performs a sensitivity
analysis to find a subset of possible solutions that may give
new insights on the processes producing the data variations. The
feed-back loop (black-bordered gray arrows) from unacceptable
residuals and useless solutions to the mathematical model
reformulates the forward model until a useful solution is produced.
10

2.2 Schematic examples of the inversion of gravity data for the
subsurface distributions of lower density sediments with densities
that increase with depth within a basin surrounded by higher
density rocks (upper row) and a lower density salt dome displacing
higher density sedimentary rocks (bottom row). Both
conceptual models can be mathematically represented by the
simplified density models on the right that predict a relative
gravity minimum over the center of the sources as illustrated by
the generic gravity signal calculated in the middle panel of the
right column. 11
4.1 The determinant of any n-square matrix is simply the difference
in the diagonal [n − 1]-products of the n-elements, which
are added [+] if right-sloping, and subtracted [-] if left-sloping.
However, care must be taken not to repeat products in the sum -
i.e., for the |A(3×3)| example, the rightmost right-sloping diagonal
product [a11a22a33] was already established and thus cannot
be repeated. Similarly, the rightmost left-sloping product
[a13a22a31] cannot be repeated. 19
4.2 Profile of the gravity effect (g(2D)z) across a buried horizontal
2D cylinder striking perpendicular to the profile superimposed
on a constant regional field C. This profile is used to illustrate
the data inversion process. See the text for additional modeling
details. 20
4.3 Using performance curves to estimate the 'optimal' value of
error variance (EVopt) for effective data inversion. In this example,
curves #1 and #2 with respective C- and D-ordinates were
established to determine the EV-range (shaded) over which regional
magnetic anomalies of southern China and India could be
effectively modeled for their radial components. [Adapted from
von Frese et al. (1988)] 29
4.4 The more the i-th row of the information density matrix S is
a spike function about its diagonal element, the greater is the
importance of the corresponding observation point [bi] to the
solution of the model. 35
4.5 Selection of p-value from an eigenvalue spectrum plot. In
the left
plot, the knee in the curve suggests a threshold value λt where
smaller eigenvalues may be discarded, or equivalently set to
zero.
In the right plot, the threshold value λt and corresponding p-
value may be indicated where the spectrum’s gradient flattens.
38
5.1 The projection of the Fourier transform H(f) and its complex
conjugate H⋆ (f) in the polar complex coordinate (CT, ST)-
plane. 41
5.2 Interpretation of the Fourier transform for a simple
sinusoidal
wave. [Printed and electronically reproduced after Brigham
(1974) by permission of Pearson Education, Inc., Upper Saddle
River, New Jersey.] 42
5.3 Data and frequency domain representations of Equation
5.10. 44
5.4 The bookkeeping details in taking the Fourier transform of a
21-point signal. The transform assumes that the data (open circles)
are infinitely repeated (black dots) on all sides of the signal. 47
5.5 The machine storage format for representing the discrete
Nyquist-centered wavenumber transform of a 16×32 data array.
[Adapted from Reed (1980)] 51
5.6 Symmetry properties of the Nyquist-centered wavenumber
spec-
trum for (a) non-symmetric and (b) symmetric real gridded
data where nx and ny are the number of data in the x- and
y-dimensions of the grid. [Adapted from Reed (1980)] 51
5.7 Comparison of the number of multiplications required for
calcu-
lating the DFT and the FFT. The computational efficiency of the
FFT increases dramatically at about 64 signal coefficients and
higher. [Printed and electronically reproduced after Brigham
5.8 Fourier synthesis of a saw tooth function. (a) The first four
Fourier components of a sawtooth are sin(t),−(1/2) sin(2t), (1/3)
sin(3t),
and -(1/4) sin(4t). (b) Superposition of the Fourier components
reconstitutes the saw tooth function. [After Sheriff (1973)] 55
5.9 Comparison of the Fourier transform for a cosine waveform
trun-
cated over an interval that is an integer multiple of the period

(a) and where the truncation interval is not a multiple of the pe-
riod (b). [Printed and electronically reproduced after Brigham
5.10 Application of the Hanning function (a) to taper the poorly
sampled cosine wave in Figure 5.9(a) greatly improves the ac-
curacy of the cosine’s frequency response (b). [Printed and
elec-
tronically reproduced after Brigham (1974) by permission of
Pearson Education, Inc., Upper Saddle River, New Jersey.] 57
5.11 Aliasing of a 417 Hz signal sampled at 500 Hz occurs at the
folding frequency of 250 Hz, which is the Nyquist frequency of
sampling. 58
5.12 Final prediction error (FPE), Akaike information criterion
(AIC),
and prediction error power (PEP = Pn) as a function of the
length m of the prediction error filter. [Adapted from Saad
(1978)] 63
6.1 Graphical view of the convolution procedure. [Adapted from
Brigham (1974)] 66
6.2 Graphical illustration of convolution’s commutative property,
i.e., [left column] x(t) ⊗ h(t) = h(t) ⊗ x(t) [right column].
[Adapted from Brigham (1974)] 67
6.3 Graphical illustration of the discrete convolution of x(t) [=
1.0 ∀ t ∈
(0, 1) and = 0 otherwise] and h(t) [= 0.5∀ t ∈ (0, 1) and = 0
otherwise] at the uniform interval of t1. [Adapted from Brigham
(1974)] 68

6.4 FORTRAN code [box] used to carry out the digital
convolution
of a 3-point operator h with a 4-point signal x - i.e., yk = xi
⊗hj. 69
6.5 Numerical deconvolution of input and output signal coefficients
for a set of operator coefficients such that the convolution of
operator and signal coefficients results in the output
coefficients. 70
6.6 Examples of taking the convolution [top row], cross-
correlation
[2nd row], and auto-correlation [3rd row] of signals #1 and #2
with respective A1- and A2-amplitude spectra and Φ1- and Φ2-
phase spectra. Band-pass filtering [4th row] and deconvolution
[bottom row] examples are also shown. 72
6.7 The data domain auto-correlation of a n-point signal g(t) in-
volves dragging it past itself at lags k = [0, 1, . . . , (n − 1)] × ∆t
and computing at each lag the correlation coefficient between
the overlapping signal components. 73
6.8 The auto-correlograms for signals A and B degrade as k −→
n
and are considered reliable only for the first 10% to 50% of the
lags. Signals A and B have the same frequency components but
different phases and so their auto-correlations are the same. 74
6.9 Auto-correlograms are even functions and indicative of the
fre-

quency content of the signals g(t). Random noise components in
the signals yield auto-correlograms with maximum values at
zero
lag. Also the decay rate of auto-correlograms from zero lag are
sharp from broad band signals with many frequency components
relative to the lower decay rates of narrower band signals. 74
6.10 The data domain cross-correlation of an n-point well log g(t)
with the m-point target signature h(t) involves dragging them past
each other at lags k = [0, 1, . . . , (n−1)]×∆t and computing at
each lag the correlation coefficient between the overlapping
signal components. The maximum in the cross-correlogram locates
the target signature in the well log at lag = k1. 75
6.11 Cross-associations between two well logs are a maximum
3/6 =
50% at the top two lag positions and inconsequential at all other
lags due to the pinch-out of unit #4 between the boreholes. 76
6.12 Typical processing sequence where convolution operations
such
as the data domain application of a smoothing filter is done
most
efficiently and accurately in the equivalent wavenumber
domain. 78
6.13 Schematic gravity anomaly contour map of the residual
anomaly
superimposed on a regional anomaly gradient illustrating the
method of isolating residual anomalies by subtracting fictitious
smooth contours of the regional. [Adapted from Dobrin and
Savit (1988)] 82
6.14 Computation of the residual gravity anomaly at O based on
eight regional anomaly values selected by the interpreter. The
non-unique regional anomaly at O may be taken as an average
of the selected values or some estimate from a polynomial or
another function fit to them. [Adapted from Griffin (1949)] 83
6.15 Design of the digital low-pass filter operator. 87
6.16 Design of digital high-pass and band-pass filter operators.
88
6.17 Three idealized wavelength filter operators with edges
tapered
for enhanced suppression of Gibbs’ error. The middle contour is
the 50% value of the taper. The design of these filters requires
only a single quadrant. [Adapted from Reed (1980)] 90
6.18 (a) The smoothed or tapered doughnut-shaped response
func-
tion of a band-pass filter for extracting gridded data features
with maximum and minimum wavelengths defined by the
wavenum-
bers at the respective outside edges of the doughnut hole and
the
doughnut. (b) Response function for the composite band-and-
strike-pass filter that passes the maximum and minimum wave-
length features which lack components with wavenumbers in the
wedge-shaped breaks of the doughnut. [Adapted from Lindseth
(1974)] 90
6.19 Map and frequency responses for (a) elliptical N/S-
striking, (b)
elliptical NE/SW-striking, and (c) circular longer and shorter
wavelength anomalies. [Adapted from Lindseth (1974)] 91

6.20 Contrasting representations for the azimuthal orientations
in the
data domain [(a) and (c)] and the wavenumber domains [(b)
and (d)]. Panels (c) and (d) also show the machine storage
formats for the data and wavenumbers, respectively. [Adapted
from Reed (1980)] 93
6.21 Three idealized directional filters. The design of these
filters requires two quadrants. [Adapted from Reed (1980)] 94
6.22 Interpretations of the correlation coefficient r for (a)
positive,
(b) null, and (c) negative feature associations. [Adapted from
6.23 The k-th wavevectors for maps X and Y represented as
complex
polar coordinates. 97
6.24 The meaning of the correlation coefficient described by
Equa-
tion 6.22. 97
6.25 Ideal filter operators for the vertical and horizontal
derivatives
to order p = 1. The design of these filters requires only a single
quadrant. [Adapted from Reed (1980)] 101
6.26 Ideal filter operators for the vertical derivatives to orders p
= 2
(left) and p = 4 (right). The design of these filters requires only
a single quadrant. [Adapted from Fuller (1967)] 101
6.27 Example of the gravity effects of five horizontal cylinders.
The top panel gives the parameters of the cylinders, whereas the
middle panel shows the gravity effects over a 90-km survey at
1-km intervals which are extended as a rind of constant values
in the bottom panel to enhance the accuracy of the derivative
and integral estimates given in Figure 6.28. 102
6.28 Vertical first (top) and second (middle) derivatives
estimated
spectrally for the gravity effects in Figure 6.27. The comparison
(bottom) of the double spectral integration of the derivative in
the middle panel to the original gravity effects reveals
negligible
differences. 103
6.29 Ideal upward (a) and downward (b) continuation filter
coeffi-
cients for continuations of the data to 3 station intervals above
and below the original altitude of the 16 × 32 data array. The
design of these filters requires only a single quadrant with co-
efficients ranging from the x to α values shown. [Adapted from
Reed (1980)] 106
6.30 Ideal upward continuation filters to continue data (a) 1-
and
(b) 2-station intervals. Changing the sign of the transfer func-
tion coefficients continues the data downward 1- and 2-station
intervals, respectively. [Adapted from Fuller (1967)] 107
7.1 Schematic illustration of predicting the value b(tp) for (a) a
pro-

file between known data points (dots) at the location tp by
linear
(dashed line), cubic spline (solid line), and least squares (dotted
line) methods, and for maps using (b) linear interpolation by
surrounding triangles and (c) statistical interpolation by
inverse-
distance weighted averages. [Adapted from Braile (1978)] 112
7.2 Contour map comparisons of data gridded from 200
randomly
located point samples (dots) of the aeromagnetic anomalies in
Map (b). Map (a) is based on the surrounding triangle method
in Figure 7.1(b), Map (c) is from the intersecting piecewise
cubic spline method in Figure 7.3(b), and Map (f) from the
weighted statistical averaging method in Figure 7.1(c). Map
(e) is from the local polynomial surface fitting method illus-
trated in Map (d). All maps are based on grids of 20×20 values
at the interval of 2 km. Contour values are in nT. [Adapted from
Braile (1978)] 113
7.3 Schematic illustration of predicting (a) a profile value bp
between
known data points (dots) at the location tp by the cubic spline
method, and (b) a map value (+) by intersecting piecewise cubic
splines. 114
7.4 Geomagnetic field drift for 4 October 1978 at a prehistoric
archaeological site in Jefferson County, Indiana on the north bank
of the Ohio River across from Louisville, Kentucky. The diurnal
field variation curve (solid line) was interpolated from the
base station observations (dots) by the natural cubic spline with
the mean value of 56,566 nT removed. The horizontal axis is in
decimal time where for example 15.50 = 1530 hours = 3:30
p.m. 115
7.5 Contour maps of marine gravity anomalies prepared from
grids
interpolated by the minimum curvature method. Map (a) was
prepared from grid values obtained by the normal untensioned
minimum curvature method. Map (b) was prepared from grid
values estimated by tensioned minimum curvature. [Adapted
from Smith and Wessel (1990)] 118
7.6 Aeromagnetic survey data gridded (a) using a minimum
curvature algorithm and (b) equivalent source interpolation. Both
maps were produced at a contour interval of 20 nT. [Courtesy
of EDCON-PRJ, Inc.] 119
7.7 Screen shots of (a) the 2D gravity effects of two prismatic
bodies
with different uniform densities equivalently modeled by (b) the
effects of numerous density contrasts within a uniformly thick
layer. [Adapted from Cooper (2000)] 120
7.8 (a) Gridding by least squares collocation. (b) A typical
covari-
ance function adapted to grid satellite altitude magnetic anoma-
lies. [Adapted from Goyal et al. (1990)] 122
7.9 Signals A and B of 128 coefficients each in respective
panels
(b) and (c) with correlation coefficient r = 0.0 have feature
associations that are mapped out using the SLFI and DLFI
coefficients in the respective panels (a) and (d). [Adapted from

Tables
4.1 The analysis of variance (ANOVA) table for testing the
significance of a model’s fit to data. 27
5.1 Analytical properties of the Fourier transform. 46
5.2 Wavenumber structure of the discrete Fourier transform. 49
6.1 Spectral filter coefficients for estimating p-order derivatives
and
integrals in the horizontal (H) and vertical (V) directions. 100
0.1 Overview
Digital data are commonly expressed in standard formats for electronic
analysis and archiving. A voluminous literature full of
application-specialized jargon describes numerous analytical procedures
for data processing and interpretation. However, when considered from
the electronic computing perspective, these procedures simplify into
the core problem of manipulating a digital forward model of the data to
achieve the data analysis objectives. The forward model consists of a
set of coefficients specified by the investigator and a set of unknown
coefficients that must be determined by inversion from the input data
set and the specified forward model coefficients. The inversion
typically establishes a least squares solution, as well as errors on
the estimated coefficients and predictions of the solution in terms of
the data and specified model coefficients. The inversion solution is
never unique because of the errors in the data and specified model
coefficients, the truncated calculation errors, and the source
ambiguity of potential fields. Thus, a sensitivity analysis is
commonly required to establish an ‘optimal’ set or range of solutions
that conforms to the error constraints. Sensitivity analysis assesses
solution performance in achieving data analysis objectives including
the determination of the range of geologically reasonable parameters
that satisfy the observed data.
For electronic analysis, data are typically rendered into a grid that
is most efficiently modeled by the fast Fourier transform (FFT) which
has largely superseded spatial domain calculations. This spectral model
accurately represents the data as the summation of cosine and sine
series over a finite sequence of discrete uniformly spaced wavelengths.
These wavelengths correspond to an equivalent discrete sequence of
frequencies called wavenumbers which are fixed by the number of gridded
data points. The FFT dominates modern data analysis because of the
associated computational efficiency and accuracy, and minimal number of
assumptions.
The purpose of the inversion process is to obtain a model that can be
interrogated for the data’s derivative and integral components to
satisfy data analysis objectives. These objectives tend to focus mainly
on isolating and enhancing data attributes using either spatial domain
convolution methods or frequency domain (i.e., spectral) filtering
operations, as well as on characterizing data sources by forward
modeling using trial-and-error inversion and inverse modeling by linear
matrix inversion methods. Effective presentation graphics to display
the data and data processing results also are essential elements of
modern data analysis.
1
Basic Digital Data Analysis
Measurements provide discrete, finite samples of phenomena that in
theory may exhibit an infinite continuum of infinitesimally small
variations. These discrete numbers with attached physical units are
called data, i.e., the Latin plural for datum. Data are called analog
when measured from lengths of a ruler, heights of mercury in a tube,
slide rule calculations, or other continuous quantities, and digital
when determined by counting clock ticks, heart beats, fingers, toes,
bones, stones, abacus beads, or other discrete units.
With the advent of widespread electronic computing in the latter half
of the 20th century, the counting of open and closed electronic
circuits in pocket calculators, and personal, mainframe, and super
computers became the norm for representing and analyzing data. Thus,
modern analysis has regressed to the pre-calculus era that existed
roughly up through the early 17th century when the more intuitive
linear mathematics of digital data series prevailed (Figure 1.1).
However, the remarkable advances of electronic computing in the current
information age allow us to explore and visualize the analytical
properties of data far more readily and comprehensively than any
previous generation of scientists.
[Figure 1.1 panels, with timeline marks at 1640 and 1975: Pre-Calculus
(finite, linear, discrete math; Σ and ∆); Classical Calculus (infinite,
non-linear, continuum math; ∫ and $d^n x/dy^n$); Computational Calculus
(finite, linear, discrete math plus graphics; Σ and ∆ ≈
$\partial^n x/\partial y^n$).]
Figure 1.1 Historical trends in computing range from the pre-calculus
era of ancient times when analysis was essentially based on linear
discrete mathematics, through the classical calculus era that began
roughly with Newton and Leibniz who incorporated continuum mathematics
and non-linear systems, to the present computational calculus era that
was inaugurated for the masses by the arrival of the electronic pocket
calculator. All data analyses are fundamentally based on taking data
sums [Σ] and differences [∆].

In general, all analysis is based on the elementary inverse operations
of arithmetic addition and subtraction that respectively define the
first-order sums and differences in data. These fundamental operations
are combined into the inverse arithmetic operations of multiplication
and division to describe the second-order behavior of data products and
quotients, respectively. With algebra, the arithmetic operations are
extended to the more comprehensive characterization of data in terms of
representative functions and their inverses. However, the fullest
characterization of the analytical properties of data results from the
calculus that functionally relates the sums and differences by the
inverse operations of integration and differentiation, respectively.
The profiles in Figure 1.2 illustrate the issues in establishing the
analytical properties of data considered in this book. Each profile
plots the data as f(t)-values of the dependent variable along the
ordinate or vertical axis for the corresponding t-values of the
independent variable along the abscissa or horizontal axis. In
practice, measurements of f(t) are always discrete point observations
containing errors and finite in number. Thus, the observations can
never completely replicate the signal from which they originate, but
only approximate a portion of it as illustrated in the middle-right
insert of Figure 1.2.

To infer the function or signal giving rise to data observations, the
patterns in the data are fit with a hypothesized function [i.e., a
likely story] as described in the sections below. The effective
explanation of the point observations in the middle-right insert of
Figure 1.2, for example, requires us to intuit that the analytical
expression for the appropriate function (solid line) describing the
data is $f(t) = a(t^2/2)$, where a is unknown. In general, the
procedure for determining the unknown parameter of the inferred
function is fundamentally straightforward and the same for any
application.
[Figure 1.2 panels: $f_1(t) = a$, $f_2(t) = at$, and $f_3(t) = at^2$
plotted against t, with shaded areas integrated from 0 to x of
Area #1 = $\int a\,dt = ax$, Area #2 = $\int at\,dt = ax^2/2$, and
Area #3 = $\int at^2\,dt = ax^3/3$.]
Figure 1.2 The calculus relates the function to the integral of its
derivative and vice versa. Adapted from Apostol (1964).

This story-telling process is an art form, however, and thus goes by a
sometimes baffling variety of application-dependent labels even though
the labels all essentially cover the same procedural details. For
example, factor analysis commonly describes the data-fitting process in
psychology, the social sciences or economics, whereas it may be labeled
blackbox theory in physics, or parameter estimation, deconvolution,
inverse modeling, or simply modeling in other applications. In this
book, the process will be referred to as inversion, which is the term
most commonly used in mathematics, engineering, and the natural
sciences.
The primary objective of inversion is to establish an effective
analytical model for the data that can be exploited for new insights on
the processes causing the data variations. This exploitation phase is
commonly called systems processing or data interrogation and may range
from simple data interpolations or extrapolations to assessing the
integral and differential properties of the data. Indeed, the two
fundamental theorems of calculus guarantee that the mere act of
plotting or mapping data fully reveals their analytical properties.
Figure 1.2 gives some examples of the great analytical power of data
graphics. Consider, for example, the discrete data points in the
middle-right insert of Figure 1.2 that simulate the late 16th century
experimental observations of Galileo Galilei on the motion of a mass in
the Earth’s gravity field. Here, the vertical axis records the
effective free-fall distance in f(t)-meters traveled by any mass as a
function of time in t-seconds along the horizontal axis. Galileo and
his assistants used a ruler and their pulse rates to measure the data
coordinates of distance and time, respectively. This plot is remarkable
because it is exactly the same for both light and heavy objects, and
thus it contradicts Aristotle’s intuitive claim of heavier objects
falling faster than lighter ones that had prevailed for the previous
2,000 years.
To see how the above example shows that all masses in free-fall [i.e.,
in a vacuum] do so at the same rate or velocity, suppose that the
function $f(t) = a(t^2/2)$ with the a-parameter unknown is intuited to
be a plausible fit to the discrete data. The a-coefficient must be in
units of m/s² because dimensional analysis requires that the physical
units of both sides of the equation be the same or homogeneous. To
estimate the numerical value of a, various candidate values could be
tested until one is found that predicts values essentially on the solid
line through the data in the middle-right insert of Figure 1.2. Thus,
by trial-and-error, it could be determined that a ≈ 9.8 m/s². However,
invoking the calculus that Newton and Leibniz co-invented in the 17th
century can yield further insight on the physical significance of this
proportionality constant.
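The trial-and-error search described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the observation times and distances are synthetic values generated from f(t) = a(t²/2) rather than Galileo's measurements, and the names `times`, `distances`, `misfit`, and `candidates` are hypothetical.

```python
# Synthetic free-fall observations (assumed for illustration, not Galileo's
# actual measurements): distances generated from f(t) = a * t**2 / 2.
A_TRUE = 9.8                                      # m/s^2, used only to build the data
times = [0.5 * i for i in range(1, 9)]            # seconds
distances = [A_TRUE * t**2 / 2 for t in times]    # meters


def misfit(a):
    """Sum of squared differences between predicted and observed distances."""
    return sum((a * t**2 / 2 - d) ** 2 for t, d in zip(times, distances))


# Trial and error: test candidate values 0.1, 0.2, ..., 20.0 and keep the
# one whose predictions lie closest to the data.
candidates = [0.1 * k for k in range(1, 201)]
best_a = min(candidates, key=misfit)
print(f"best trial value: a = {best_a:.1f} m/s^2")
```

The candidate spacing of 0.1 m/s² limits the resolution of the estimate, which is one reason the approach becomes impractical as the number of unknowns grows.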
By the First Fundamental Theorem of Calculus [e.g., Apostol (1964)],
for example, it is known that f(t) is the integral of its derivative to
within a constant. In Figure 1.2, for instance, the top-right insert is
the derivative
$$f'(t) = \frac{d[f(t)]}{dt} = at, \tag{1.1}$$
of the function shown in the middle-right insert. The derivative f′(t)
is the rate of change of the travel-time curve f(t) with time t or the
velocity V(t) in meters per second [e.g., m/s].
For [i = 1, 2, ..., n] discrete observations, the derivative can be
numerically estimated by plotting the [n−2] local 3-data point
graphical differentials
$$V(t_i) = f'(t_i) \approx \frac{f(t_{i+1}) - f(t_{i-1})}{t_{i+1} - t_{i-1}} \tag{1.2}$$
against $t_i$ and ruling a straight line uniformly through them. Note
that the numerical 3-data point derivatives are not defined at the two
end-points $t_1$ and $t_n$. However, these end-point derivatives can be
evaluated by solving the boundary value problems using the Second
Fundamental Theorem of Calculus, which says that f(t) is also the
derivative of its integral
$$f(t) = \frac{d\left[\int f(t)\,dt\right]}{dt} = \frac{at^2}{2}. \tag{1.3}$$
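As a minimal sketch of the 3-data point differentials in Equation 1.2, the Python fragment below estimates the velocity at the interior observation points only. The synthetic travel-time data and the function name `central_differences` are assumptions made for illustration.

```python
def central_differences(t, f):
    """Return the interior times and the 3-point derivative estimates
    [f(t[i+1]) - f(t[i-1])] / [t[i+1] - t[i-1]] of Equation 1.2."""
    interior_t = t[1:-1]
    derivs = [
        (f[i + 1] - f[i - 1]) / (t[i + 1] - t[i - 1])
        for i in range(1, len(t) - 1)
    ]
    return interior_t, derivs


# Synthetic travel-time data f(t) = a * t**2 / 2 with an assumed a = 9.8 m/s^2.
a_true = 9.8
t = [0.25 * i for i in range(9)]
f = [a_true * ti**2 / 2 for ti in t]

t_mid, v = central_differences(t, f)
for ti, vi in zip(t_mid, v):
    print(f"t = {ti:4.2f} s   V = {vi:6.3f} m/s")
```

For n observations the function returns n − 2 estimates, which reflects the undefined end-point derivatives noted in the text.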
Like the derivatives of a function, the integrals to within constants
can be determined either graphically from the data plot or analytically
if the function is known. The more general graphical determination
involves measuring the total area between the data and horizontal axis
and plotting the numerical estimates with t for the integral. In Figure
1.2, for example, the function f(t) in the middle-right insert is the
integral of its derivative in the top-right insert. Thus, at any t, the
integral f(t) numerically equals the shaded area under the velocity
curve in the middle-left insert of Figure 1.2, which is one-half the
area of the rectangle with sides t and at, or
$$f(t) = a\left(\frac{t^2}{2}\right). \tag{1.4}$$
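The graphical integration can be mimicked numerically by accumulating the area between the velocity curve and the horizontal axis. The sketch below uses trapezoids to measure that area, which is an assumption of this example rather than a procedure prescribed by the text; the name `cumulative_area` and the sampled velocities are likewise hypothetical.

```python
def cumulative_area(t, v):
    """Running trapezoidal area between the curve v(t) and the horizontal axis."""
    area = [0.0]
    for i in range(1, len(t)):
        area.append(area[-1] + 0.5 * (v[i] + v[i - 1]) * (t[i] - t[i - 1]))
    return area


# Velocity line V(t) = a * t sampled every 0.25 s (assumed a = 9.8 m/s^2).
a = 9.8
t = [0.25 * i for i in range(17)]
v = [a * ti for ti in t]

distance = cumulative_area(t, v)
# Compare the accumulated area with Equation 1.4 at the last sample.
print(distance[-1], a * t[-1] ** 2 / 2)
```

Because the velocity curve here is a straight line, each trapezoid is exact and the accumulated area reproduces a(t²/2) to within round-off.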
Using the two theorems of calculus, the numerically undefined end-point
derivatives can be established by
$$f'(t_1) = f'(t_2) - \frac{2f(t_2)}{\Delta t} \quad \text{and} \quad f'(t_n) = f'(t_{n-1}) + \frac{2[f(t_n) - f(t_{n-1})]}{\Delta t}, \tag{1.5}$$
where $\Delta t = [t_{i+1} - t_i]$. These numerical procedures may be
repeated on the n numerical derivatives to obtain higher order
derivative estimates that also satisfy the two fundamental theorems of
calculus.
The derivative in the top-right insert of Figure 1.2, obtained either
analytically from the intuited function f(t) or numerically from the
graphical differentials of the data [i.e., $f'(t_i) \equiv V(t_i)$],
shows that the velocity of a freely falling object increases with time
t at the constant rate a. The rate of velocity increase or acceleration
can be estimated from the slope of the velocity line given by the ratio
of its rise to its run [i.e., $a = [V(t_n) - V(t_1)]/[t_n - t_1]$]. The
acceleration also can be evaluated from any estimate $V(t_i)$ as
$a = V(t_i)/t_i$, or by plotting up the numerical derivative of the
velocity curve as shown in the top-left insert of Figure 1.2. Second
order numerical differentials of Galileo’s travel-time data reveal that
the magnitude of the Earth’s gravitational acceleration is a ≈ 9.8
m/s². Newton incorporated this result into his second law [F = ma] to
differentiate between weight, which is the force F acting on a falling
object, and the object’s invariant material property of mass [m].
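The three acceleration estimates mentioned above can be compared in a short sketch. It assumes synthetic travel-time data built with a = 9.8 m/s² and applies the 3-point differential of Equation 1.2 twice, once for velocity and once more for acceleration; the variable names are hypothetical.

```python
def central_differences(t, f):
    """3-point derivative estimates at the interior points (Equation 1.2)."""
    derivs = [
        (f[i + 1] - f[i - 1]) / (t[i + 1] - t[i - 1])
        for i in range(1, len(t) - 1)
    ]
    return t[1:-1], derivs


# Synthetic travel-time data (assumed, not Galileo's measurements).
t = [0.2 * i for i in range(11)]
f = [9.8 * ti**2 / 2 for ti in t]

t_v, v = central_differences(t, f)        # velocities V(t_i)
t_a, acc = central_differences(t_v, v)    # second order differentials

slope = (v[-1] - v[0]) / (t_v[-1] - t_v[0])    # rise over run of the velocity line
ratios = [vi / ti for ti, vi in zip(t_v, v)]   # a = V(t_i) / t_i

print(slope, ratios[0], acc[0])
```

Each differencing pass drops the two end-points, so 11 observations yield 9 velocities and 7 accelerations.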
The ultimate objective in mapping data is to identify an analytical
function or model that can effectively explain the data variations.
Graphical differentiations and integrations of the data can be
insightful when choosing an appropriate fitting function or model for
data inversion. When the number of model unknowns is small [≤ 5 to 10]
like it was for the f(t)-model in the middle-right insert of Figure
1.2, data inversion by trial-and-error can be effective. However, this
approach is not practical for many modern inversions that routinely
solve for vastly greater numbers of unknowns. For these larger scale
inversions, computer adaptations of the early 19th century procedures
of Gauss obtain the least squares solution that minimizes the sum of
the squared residuals between the model’s predictions and the data.
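For the single unknown of the free-fall model, the least squares solution has a closed form that can be sketched as follows. With known coefficients g_i = t_i²/2 and observations b_i, minimizing the sum of squared residuals gives a = Σ g_i b_i / Σ g_i², which is the one-column case of the normal equations solved as `inv(A'*A)*A'*B` in the exercise script. The data values, the alternating ±0.05 m errors, and the names below are assumptions for illustration.

```python
# Synthetic observations of f(t) = a * t**2 / 2 with small, fixed errors
# (assumed values; a deterministic +/-0.05 m pattern stands in for noise).
times = [0.5 * i for i in range(1, 9)]
errors = [0.05 if i % 2 == 0 else -0.05 for i in range(len(times))]
observed = [9.8 * t**2 / 2 + e for t, e in zip(times, errors)]

# Known coefficients of the linear model: observed ~ g * a.
g = [t**2 / 2 for t in times]

# One-unknown normal equation: a = (g . observed) / (g . g).
a_ls = sum(gi * bi for gi, bi in zip(g, observed)) / sum(gi * gi for gi in g)


def sum_squared_residuals(a):
    """Sum of squared residuals between the model predictions and the data."""
    return sum((gi * a - bi) ** 2 for gi, bi in zip(g, observed))


print(f"least squares estimate: a = {a_ls:.4f} m/s^2")
```

Unlike the trial-and-error search, this estimate needs no grid of candidate values, and the same algebra extends to many unknowns through matrix methods.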
For electronic processing, data are always pixelated or rendered into
gridded datasets. In gridded format, the data can be modeled by the
Fast Fourier Transform or simply the FFT, which is a very elegant
version of the early 19th century Fourier series. The FFT is a spectral
model that closely represents the data as the summation or integration
of a series of cosine and sine waves at discrete wavelength/frequency
arguments which are completely fixed by the interval and number of data
points. The data inversion estimates only the amplitudes of these
cosine and sine waves, and thus the FFT is extremely fast to implement.
Indeed, the spectral model is the fastest known procedure for fitting
mapped data and invokes no assumptions about the dataset other than it
is uniformly discretized in each of its 1, 2, 3, or more orthogonal
dimensions.
[Figure 1.3 panels: $f_4(t) = \cos(t)$ with shaded Area #4 =
$\int \cos(t)\,dt = \sin(x)$, and $f_5(t) = \sin(t)$ with shaded
Area #5 = $\int \sin(t)\,dt = 1 - \cos(x)$, both integrated from 0 to x.]
Figure 1.3 The integral of the cosine is the cosine phase-shifted by
90° or the sine, whereas the integral of the sine is the cosine with
amplitude that is twice that of the sine wave. Adapted from Apostol
(1964).

The generic spectral model is commensurately efficient and
comprehensive in accessing the differential and integral properties of
data. In Figure 1.3, for example, the integral of A cos(t) is simply
A sin(t) so that the inverse differential or the derivative of A sin(t)
is A cos(t), where A and t are the respective amplitude and argument of
the wave functions. Similarly, the integral of A sin(t) is −A cos(t) so
that the derivative of A cos(t) is −A sin(t). Thus, to model the data’s
analytical properties, the spectral model’s cosine and sine components
could be simply exchanged for the appropriate derivative- or
integral-producing waves and the modified series re-evaluated. The
entire process only involves summing the products of the wave
amplitudes, which were determined from the data, with the related waves
that the interval and number of data points specified. In summary,
spectral modeling dominates modern digital data analysis because of the
computational efficiency, accuracy, and minimal number of assumptions
involved.
In the sections that follow, further practical details are outlined
concerning
the inversion and systems processing of digital data. The role of
modern
digital graphics for developing insightful presentations of the
data and their
interpretations is also considered.
2
Data Inversion or Modeling
A principal objective in mapping data is to plausibly quantify or model
the data variations. These models are produced from the inversion of
the data. It is the fundamental inverse problem of science that
expresses measured natural processes in the language of mathematics.
As illustrated in Figure 2.1, inversion is a quasi-art form or process
that requires the interpreter to conceive a mathematical model
(f(A, X)) with known and unknown parameters A and X, respectively, that
can be related to variations in the observed data (B) to an appropriate
level of uncertainty or error. The mathematical model is a simple,
usually linear approximation of the conceptual model that the
interpreter proposes to account for the data variations. Assumptions
arise both implicitly and explicitly to convert the conceptual model
into the simple mathematical model. They critically qualify the
applicability of the inversion solution, and thus should be reported
with the conclusions and interpretations of the analysis. Figure 2.2
illustrates two examples of translating conceptual geological models
into simplified mathematical models for inversion.
The forward modeling component of inversion involves evaluating the
mathematical model for synthetic data (i.e., $\hat{b} \in \hat{B}$ in
Figure 2.1). The veracity of an inversion’s solution is judged on how
well the predicted data ($\hat{B}$) compare with the observed data (B)
based on quantitative criteria such as the raw differences
($\hat{B} - B$) or squared differences ($(\hat{B} - B)^2$) or some
other norm. In general, a perfect match between the predictions and
observations is not necessarily expected or desired because the
observational data (B) and the coefficients (A) assumed for the forward
model are subject to inaccuracy, insufficiency, interdependency, and
other distorting effects (Jackson, 1972).
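A minimal sketch of this forward modeling step is given below. It borrows the horizontal cylinder expression from the Ex2-1 exercise script as the known coefficients A, treats the density contrast as the single unknown X, and compares synthetic data with a set of assumed observations. The cylinder radius and depth, the profile offsets, the ±0.05 mGal errors, and all function names are hypothetical choices for illustration.

```python
def cylinder_coefficients(offsets_km, radius_km, depth_km):
    """Known coefficients A (mGal per g/cm^3) of the horizontal cylinder
    gravity effect, following the expression used in the exercise script."""
    scale = 41.93 * radius_km**2 / depth_km
    return [scale / (d**2 / depth_km**2 + 1.0) for d in offsets_km]


def forward(A, x):
    """Synthetic data predicted by the linear forward model A * x."""
    return [a_i * x for a_i in A]


# Assumed profile and source geometry.
offsets = [float(d) for d in range(-20, 21, 5)]
A = cylinder_coefficients(offsets, radius_km=3.0, depth_km=5.0)

# Assumed observations: a density contrast of 1.0 g/cm^3 plus small errors.
errors = [0.05 if i % 2 == 0 else -0.05 for i in range(len(A))]
observed = [b + e for b, e in zip(forward(A, 1.0), errors)]


def residuals(x):
    """Raw differences between predicted and observed data."""
    return [p - o for p, o in zip(forward(A, x), observed)]


def sum_squared(x):
    """Sum of the squared differences for a trial density contrast x."""
    return sum(r * r for r in residuals(x))


for trial in (0.8, 0.9, 1.0, 1.1):
    print(f"x = {trial:.1f}   sum of squared residuals = {sum_squared(trial):.4f}")
```

Even the best trial value leaves non-zero residuals because of the errors in the observations, which is consistent with the point that a perfect match is not expected.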

[Figure 2.1 flowchart labels: Conceptual Model; Assumptions;
Mathematical Model (AX = B + error = $\hat{B}$); Forward Model;
Synthetic Data ($\hat{B}$); Observed Data (B); Inversion; acceptable and
unacceptable residuals ($\hat{B} - B$); Solutions; Constraints;
Sensitivity Analysis; Model Analysis; Useful Solutions. Bulleted
annotations: simplify; qualify; known parameters (A); unknown
parameters (X); never unique; not useful; reduce ambiguities; improve
convergence; improve solution performance; enhance reasonableness;
inaccurate data; insufficient data; inconsistent data; inappropriate
model; etc.]
Figure 2.1 The process of inversion (solid arrows) establishes a
forward model (AX) of the observations (B) and performs a sensitivity
analysis to find a subset of possible solutions that may give new
insights on the processes producing the data variations. The feed-back
loop (black-bordered gray arrows) from unacceptable residuals and
useless solutions to the mathematical model reformulates the forward
model until a useful solution is produced.

Data inaccuracies result from mapping errors, random and systematic,
related to imprecision in the measurement equipment or techniques,
inexact data reduction procedures, truncation (round-off) errors, and
other factors. They scatter the data about a general trend or line
(e.g., Figure 2.2) so that a range of model solutions yields
predictions within the scattered observations. However, as described in
Chapter 3.2.2, the statistical properties of the observational data (B)
can be propagated to characterize uncertainties in the solution
variables (X).
Insufficient data do not contain enough information to describe the
model parameters completely. Examples include the lack of data
observations on the signal’s peak, trough, flank, or other critical
regions. The data coverage also may not be sufficiently comprehensive to
resolve critical longer wavelength properties of data, or the station
interval may be too large to resolve important shorter wavelength
features in the data. Another source of data insufficiency is the
imprecision in the independent variables (A) that promotes solution
biases. In general, data insufficiency issues are difficult to resolve
without new data observations or supplemental geologic and geophysical
information.

[Figure 2.2: conceptual models and simplified density models for a
sedimentary basin and a salt dome, with density labels σ1, σ2, and σ3
(relations σ3 > σ2 > σ1 and σ2 > σ1) and a calculated gravity signal.]

Figure 2.2 Schematic examples of the inversion of gravity data for the
subsurface distributions of lower density sediments with densities that
increase with depth within a basin surrounded by higher density rocks
(upper row) and a lower density salt dome displacing higher density
sedimentary rocks (bottom row). Both conceptual models can be
mathematically represented by the simplified density models on the
right that predict a relative gravity minimum over the center of the
sources as illustrated by the generic gravity signal calculated in the
middle panel of the right column.

Interdependency in data sets is where the independent variables
for two
or more data points cannot be distinguished from each other
within the
working precision of the mathematical model. Examples include
data points
that are too close to each other for the mathematical model to
discriminate,
or where perfect symmetry exists so that only half of the data
observations need to be considered in the inversion (e.g., Figure 4.2).
Incorporating
dependent data into the computations provides no new
information and re-
sults in considerable computational inefficiency and solutions
with erratic
and often unreasonable performance characteristics. However,
as described
in Chapter 3.2.2, this problem can be mitigated somewhat using
random
noise in the A-coefficients to optimize the solution for
acceptable perfor-
mance attributes with only marginal growth in the deviation
between the
observed and predicted data.
The use of an inappropriate model for inversion results in significant
negative consequences ranging from invalid solutions to valid solutions
produced from excessive or unnecessary effort. Compromised solutions
occur, for example, from attempts to fit a two-dimensional model to data
that exhibit three-dimensional effects, or a homogeneous model where the
effects of anisotropy are evident. In addition, the use of a complex
model where a simpler one can achieve the objectives of the inversion
leads to unnecessary and often unacceptable expenditures of effort.
Furthermore, potential field observations (e.g., gravity, magnetic,
electrical, and thermal data) vary inversely with distance to their
sources so that, in theory at least, an infinite number of equivalent
source distributions can be fit to these fields. Thus, the inherent
nonuniqueness of these solutions allows considerable flexibility in
developing appropriate models for effective data inversions.
In general, the solutions of inverse problems are not unique and
require
the application of constraints to identify the subset of the more
effective
solutions for a given objective. Constraints are additional information
about the model parameters that, when imposed on the inverse problem,
help reduce
interpretational ambiguities, improve convergence to a solution,
enhance the
reasonableness of the solution and its predictions, and otherwise
improve the
effectiveness of the solution.
The identification of a useful solution or set of solutions has
only estab-
lished a set of numbers for X that map the assumed numbers in
A into
the observed numbers of B. These three sets of numbers
constitute the core
of the inversion and the basis for other investigators to test the
modeling
efforts. The final stage of the process interprets these numbers
for a story
and graphics in a report that addresses the veracity of the
conceptual hy-
pothesis, the consequences of accepting or rejecting the
hypothesis, alternate
interpretations, and other issues for applying the inversion
results effectively.

In summary, the creative, artful elements of data modeling or
inversion
involve developing the conceptual and forward models and the
final inter-
pretation. The analytical efforts to obtain solutions (X), by
contrast, are
routine and well known since Gauss and Fourier established the
respective
least squares and spectral methods at the beginning of the 19th
century. In
practice, the largest analytical challenge is the sensitivity
analysis required
to identify an optimal set of solution coefficients for the given
modeling
objective (Chapter 3.2.2).
Inversion methods range from relatively simple trial-and-error
forward
modeling to numerically more extensive and complex inverse
modeling us-
ing array methods. The former approach usually involves a
relatively small
number (e.g., 5−10) of unknowns, where trial-and-error
adjustments of the
unknown parameters X are made until the predictions of the forward model
reproduce the observed data within a specified margin of error. For
greater numbers of unknowns, on the other hand, inverse modeling is
commonly applied, which determines the unknown parameters directly from
the observed data using an optimization process that compares the
observed and predicted data. However, inverse modeling is applicable to
any number of unknowns, and is thus the more general approach.
3 Trial-and-Error Methods
Historically, the trial-and-error method has been perhaps the
most widely
used and successful approach for data inversion. In this approach, the
forward problem is solved for predicted data, which are compared to the
observed data. If the two data sets differ substantially, the model
parameters
are modified and the forward problem is solved again for
another set of
predictions. This process is repeated until a satisfactory match
is obtained
between the predicted and observed data.
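
The loop described above can be sketched in MATLAB for a single unknown. This is only an illustration under stated assumptions: it borrows the horizontal-cylinder forward model from the exercise code at the top of this document, generates its own synthetic "observed" data, and uses an arbitrary starting guess, step size, and tolerance that are not taken from the text.

```matlab
% Trial-and-error inversion for one unknown (density contrast dro).
% Forward model: horizontal cylinder, as in the exercise code above.
% All numerical choices below are illustrative assumptions.
R = 3; z = 5;                       % known radius and depth (km)
d = -60:5:60;                       % observation positions (km)
forward = @(dro) (41.93*dro*R^2/z) ./ (d.^2/z^2 + 1);

b_obs = forward(0.5);               % synthetic "observed" data (mGal)

dro  = 0.1;                         % starting guess (g/cm^3)
step = 0.2;                         % initial adjustment size
tol  = 1e-3;                        % acceptable RMS misfit (mGal)
rms_of = @(x) sqrt(mean((forward(x) - b_obs).^2));

for trial = 1:200
    misfit = rms_of(dro);           % compare predictions with observations
    if misfit < tol, break; end     % satisfactory match: stop
    % Try a move in each direction and keep whichever improves the fit;
    % if neither does, halve the step and try again.
    if rms_of(dro + step) < misfit
        dro = dro + step;
    elseif rms_of(dro - step) < misfit
        dro = dro - step;
    else
        step = step/2;
    end
end
fprintf('dro = %.4f g/cm^3 after %d trials\n', dro, trial);
```

With more than a handful of unknowns, the number of trial moves per pass grows quickly, which is the practical limitation the following paragraph describes.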
This approach offers several advantages for data interpretation.
For exam-
ple, it requires only a moderate amount of experience to
implement. In addi-
tion, the forward modeling exercise quickly provides insights on
the range of
data that the given model satisfies, as well as the sensitivity of
the model’s
parameters to the data. In general, moderately experienced
investigators can
often apply trial-and-error inversion to obtain considerable
practical insight
or training on the scientific implications of the data. However,
difficulties with the approach include the lack of uniqueness and of
reliable error statistics for the final solution. Furthermore, for more
complicated models involving large numbers (more than 5−10) of unknown
parameters, the convergence to the final solution by trial-and-error
inversion becomes increasingly laborious, slow, and unmanageable. Thus,
to process the more complex inversions, array or matrix methods are
commonly applied.
4 Array Methods
Electronic computations are digital and carried out by linear
series manip-
ulations on linear digital models of the input data. The ultimate
objec-
tive of modeling data or inversion is not simply to replicate the
data, but
rather to establish a model that can be analytically interrogated
for new in-
sights on the processes causing the data variations. The digital
data model
is the basis for achieving any data processing objective by electronic
computing. It can be demonstrated with the problem of relating n data
values ($b_i \in B$, $\forall\ i = 1, 2, \ldots, n$) to the simultaneous
system of n equations

$$
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1m}x_m &= b_1\\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2m}x_m &= b_2\\
&\ \ \vdots\\
a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nm}x_m &= b_n,
\end{aligned} \tag{4.1}
$$

which is linear in the m unknown variables ($x_j \in X$,
$\forall\ j = 1, 2, \ldots, m$) and the (n×m) known variables
($a_{ij} \in A$) of the forward model (AX) that the investigator assumes
will account for the observations. In matrix notation, this linear
equation system becomes

$$
AX = B, \tag{4.2}
$$

where B is the n-row by 1-column matrix or vector containing the n
observations, X is the m-row by 1-column vector of the m unknowns, and
the design matrix A is the n-row by m-column matrix containing the known
coefficients of the assumed model AX.
The forward model is the heart of any analysis and recognizing
its at-
tributes is essential to productive data analysis. Linear models
dominate
modern analysis, but non-linear forward models also can be
invoked and
were popular in inversion before the advent of electronic
computing because
of their relative ease in manual application. For computer
processing, the
non-linear model is linearized either by the application of the
Taylor series
or chopped up numerically into array computations.
In general, any linear or non-linear inversion involves only
three sets of
coefficients with one set taken from the measured data (i.e., B),
another
set imposed by the interpreter’s assumptions (i.e., A), and the final
set of unknowns (i.e., X) determined by solving Equation 4.2. Electronic
computers process these coefficients as data arrays with basic
matrix properties
so that matrix inversion is extensively used for obtaining the
solution (X).
4.1 Basic Matrix Properties
A matrix is a rectangular array of numbers such as given by

$$
A = \begin{pmatrix}
a_{11} & a_{12} & a_{13}\\
a_{21} & a_{22} & a_{23}\\
a_{31} & a_{32} & a_{33}\\
a_{41} & a_{42} & a_{43}
\end{pmatrix}. \tag{4.3}
$$

The elements of A are $a_{ij}$, where the i-subscript denotes its row
number (→) and the j-subscript is its column number (↓). Thus, the
matrix A above has dimensions of 4 rows by 3 columns, or is of order
4 × 3, which also may be indicated by the notation $A_{(4\times 3)}$.
The matrix is even-ordered or square if the number of rows equals the
number of columns; otherwise it is odd-ordered or non-square.
The matrix A is symmetric if it is even-ordered and $a_{ij} = a_{ji}$,
as in

$$
A = \begin{pmatrix}
1 & 3 & 6\\
3 & 4 & 2\\
6 & 2 & 5
\end{pmatrix}. \tag{4.4}
$$

The symmetric matrix of order n can be packed into a singly dimensioned
array of p = n(n + 1)/2 elements. Thus, symmetric matrices are
convenient to work with because they minimize storage requirements in
computing.
The transpose of the n×m matrix A is the m×n matrix $A^t$ with elements
$a^t_{ij} = a_{ji}$. Thus, in taking the transpose of a matrix, the rows
of the matrix become the columns of its transpose, or equivalently the
columns of the matrix become the rows of its transpose. For example,

$$
A = \begin{pmatrix} 1 & 2 & 3\\ 4 & 5 & 6 \end{pmatrix}
\quad \text{has the transpose} \quad
A^t = \begin{pmatrix} 1 & 4\\ 2 & 5\\ 3 & 6 \end{pmatrix}. \tag{4.5}
$$
Matrix addition and subtraction, C = A ± B, involves the
element-by-element operations defined by $c_{ij} = a_{ij} \pm b_{ij}$.
For the operations to be defined, the matrices A, B, and C must all have
the same dimensions. Matrix addition is associative because
(A + B) + C = A + (B + C), and commutative because A + B = B + A;
subtraction shares these properties only when it is treated as the
addition of a negated matrix.
Matrix multiplication, C = AB, involves the row-by-column multiplication
of matrix elements given by $c_{ij} = \sum_{l=1}^{k} a_{il}b_{lj}$,
where k is the number of columns of A and of rows of B. Matrix
multiplication is possible only between conformal matrices where the
dimensions of the product matrices satisfy
$C_{(m\times n)} = A_{(m\times k)}B_{(k\times n)}$. Clearly, C = AB does
not imply that C = BA, and hence matrix multiplication is not
commutative. As an example, the products of the matrices in Equation 4.5
are

$$
AA^t = \begin{pmatrix} 14 & 32\\ 32 & 77 \end{pmatrix}
\quad \text{and} \quad
A^tA = \begin{pmatrix} 17 & 22 & 27\\ 22 & 29 & 36\\ 27 & 36 & 45 \end{pmatrix}. \tag{4.6}
$$

Note that for an n×m matrix A, $A^tA$ is a symmetric (m×m) matrix,
whereas $AA^t$ is a symmetric (n×n) matrix. These product matrices are
important because the matrix inverse is defined only for a square matrix.
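
The transpose and the two products quoted above can be checked directly in MATLAB. The variable names below are illustrative; only built-in operators are used.

```matlab
A   = [1 2 3; 4 5 6];      % the 2x3 example matrix
At  = A';                  % its 3x2 transpose
AAt = A*At;                % 2x2 product
AtA = At*A;                % 3x3 product
disp(AAt)
disp(AtA)
% Both products are square and symmetric, although A itself is not square.
fprintf('AAt symmetric: %d, AtA symmetric: %d\n', ...
        isequal(AAt, AAt'), isequal(AtA, AtA'));
```

Because the two products have different sizes here, the example also shows concretely why AB and BA cannot be interchanged in general.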
For matrices A and B that are square and of the same order, B is the
inverse of A, or $B = A^{-1}$, if BA = I, where

$$
I = \begin{pmatrix}
1 & 0 & \cdots & 0\\
0 & 1 & \cdots & 0\\
\vdots & \vdots & \ddots & \vdots\\
0 & 0 & \cdots & 1
\end{pmatrix} \tag{4.7}
$$

is the identity matrix that has the special property BI = IB = B. The
inverse $A^{-1}$ can be found by applying elementary row operations to A
until it is transformed into the identity matrix I. The row operations
consist of adding and subtracting linear multiples of the rows from each
other. The effects of these row operations on the corresponding elements
of the identity matrix give the coefficients of the inverse $A^{-1}$.
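As an example, the matrix

$$
A = \begin{pmatrix} 75.474 & 1.000\\ 4.440 & 1.000 \end{pmatrix} \tag{4.8}
$$

can be transformed by elementary row operations into the following
row-equivalent [∼] systems

$$
\begin{aligned}
\left(\begin{array}{cc|cc} 75.474 & 1.000 & 1.000 & 0.000\\ 4.440 & 1.000 & 0.000 & 1.000 \end{array}\right)
&\sim
\left(\begin{array}{cc|cc} 1.000 & 1/75.474 & 1/75.474 & 0.000\\ 4.440 & 1.000 & 0.000 & 1.000 \end{array}\right)\\
&\sim
\left(\begin{array}{cc|cc} 1.000 & 1/75.474 & 1/75.474 & 0.000\\ 0.000 & 0.9412 & -0.0588 & 1.000 \end{array}\right)\\
&\sim
\left(\begin{array}{cc|cc} 1.000 & 1/75.474 & 1/75.474 & 0.000\\ 0.000 & 1.000 & -0.0625 & 1.0625 \end{array}\right)\\
&\sim
\left(\begin{array}{cc|cc} 1.000 & 0.000 & 0.0141 & -0.0141\\ 0.000 & 1.000 & -0.0625 & 1.0625 \end{array}\right)
\end{aligned}
$$

so that

$$
A^{-1} = \begin{pmatrix} 0.0141 & -0.0141\\ -0.0625 & 1.0625 \end{pmatrix}. \tag{4.9}
$$

Of course, a test to see if the coefficients for the inverse were
correctly determined is to verify that $A^{-1}A = AA^{-1} = I$.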
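
The row operations of Equation 4.9 can be sketched as a short MATLAB loop on the augmented system [A | I]. This is a minimal illustration that assumes non-zero pivots and does no row interchanges, so it is not a substitute for a library routine.

```matlab
A   = [75.474 1.000; 4.440 1.000];     % matrix of Equation 4.8
n   = size(A, 1);
Aug = [A eye(n)];                      % augmented system [A | I]
for k = 1:n
    Aug(k,:) = Aug(k,:) / Aug(k,k);    % scale the pivot row so the pivot is 1
    for i = [1:k-1, k+1:n]
        % subtract a multiple of the pivot row to zero the rest of column k
        Aug(i,:) = Aug(i,:) - Aug(i,k)*Aug(k,:);
    end
end
Ainv = Aug(:, n+1:end);                % right half now holds the inverse
disp(Ainv)
disp(Ainv*A)                           % should be close to the identity
```

The loop performs the operations in the same order as the row-equivalent systems shown above: scale row 1, clear row 2, scale row 2, clear row 1.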
However, not every square matrix A has an inverse. Indeed, the inverse
exists only if the determinant of A is not zero, i.e., $|A| \neq 0$.
Specifically, the determinant of the square matrix $A = (a_{ij})$ of
order n is the scalar

$$
|A| = |a_{ij}| = \begin{vmatrix}
a_{11} & \cdots & a_{1n}\\
\vdots & & \vdots\\
a_{n1} & \cdots & a_{nn}
\end{vmatrix}, \tag{4.10}
$$

which for orders 2 and 3 is given by the simple difference in the sums
of the diagonal products. Here, the products of the n elements that lie
on the diagonals sloping down to the right are added [+], and the
products of the n elements along the diagonals sloping down to the left
are subtracted [−], as shown in Figure 4.1. This diagonal shortcut does
not extend to orders of 4 and higher, where the determinant is evaluated
instead by cofactor expansion or elimination. Thus, the determinant for
the example order 3 matrix A is

$$
|A| = a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32}
    - a_{31}a_{22}a_{13} - a_{32}a_{23}a_{11} - a_{33}a_{21}a_{12}.
$$
In general, a test for the existence of an inverse $A^{-1}$ for A is to
see if $|A| \neq 0$. If the determinant is zero, then the matrix A is
said to be singular. Matrix singularity occurs if a row or column is
filled with zeroes or two or more rows or columns are linear multiples
of each other. For example, the matrices given by

$$
A_1 = \begin{pmatrix} 1 & 1 & 0\\ 3 & 7 & 0\\ 1.7 & 2.7 & 0 \end{pmatrix}
\quad \text{and} \quad
A_2 = \begin{pmatrix} 1 & 1 & 1\\ 3 & 7 & 2\\ 2.7 & 2.7 & 2.7 \end{pmatrix}
$$

are both singular and have zero determinants. Specifically, $|A_1| = 0$
because the 3rd column is filled with zeroes so that each product
element of the determinant is zero and thus $A_1^{-1}$ does not exist.
Additionally, $|A_2| = 0$ because the 3rd row is the multiple of 2.7
times the 1st row. The rows 1 and 3 are linearly dependent or co-linear
and can zero each other out with simple row operations so that
$A_2^{-1}$ also does not exist.

[Figure 4.1: diagram of $|A_{(3\times 3)}| = |a_{ij}|$ with the
right-sloping diagonals marked + and the left-sloping diagonals
marked −.]

Figure 4.1 The determinant of a square matrix of order 2 or 3 is simply
the difference in the diagonal products of the n elements, which are
added [+] if right-sloping, and subtracted [−] if left-sloping. However,
care must be taken not to repeat products in the sum, i.e., for the
$|A_{(3\times 3)}|$ example, the rightmost right-sloping diagonal
product [$a_{11}a_{22}a_{33}$] was already established and thus cannot
be repeated. Similarly, the rightmost left-sloping product
[$a_{13}a_{22}a_{31}$] cannot be repeated.
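
As a quick numerical check of the two singular examples, the sketch below evaluates the order 3 determinant from its six diagonal products and reports the rank. The helper name det3 is an assumption made for this illustration; rank is a built-in MATLAB function. Floating-point round-off can leave a residue on the order of 1e-15 rather than an exact zero.

```matlab
A1 = [1 1 0; 3 7 0; 1.7 2.7 0];      % third column all zeros
A2 = [1 1 1; 3 7 2; 2.7 2.7 2.7];    % row 3 = 2.7 * row 1

% Order 3 determinant from the six diagonal products (hypothetical helper)
det3 = @(M) M(1,1)*M(2,2)*M(3,3) + M(1,2)*M(2,3)*M(3,1) + M(1,3)*M(2,1)*M(3,2) ...
          - M(3,1)*M(2,2)*M(1,3) - M(3,2)*M(2,3)*M(1,1) - M(3,3)*M(2,1)*M(1,2);

fprintf('|A1| = %g, rank = %d\n', det3(A1), rank(A1));
fprintf('|A2| = %g, rank = %d\n', det3(A2), rank(A2));
```

A rank below the matrix order is an equivalent indicator of singularity and is often less sensitive to scaling than the raw determinant.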
4.2 Matrix Inversion
Using the elementary matrix properties described above, the common array
methods for solving linear systems of simultaneous equations are readily
developed. To illustrate these methods, the simple gravity profile in
Figure 4.2 is used that crosses perpendicularly over a two-dimensional
[i.e., a strike-infinite] horizontal cylinder source in the presence of
a constant regional gravity field. Each observation along the profile is
the gravity effect of the cylindrical mass plus the regional field
computed from

$$
g^{(2D)}_z = 41.93\,\Delta\sigma\left(\frac{R^2}{z}\right)\left(\frac{1}{(d^2/z^2) + 1}\right) + C, \tag{4.11}
$$

where $g^{(2D)}_z$ is the vertical component of gravity in milligal
[mGal] and d is the distance along the profile in km crossing the
horizontal cylinder with radius R = 3 km, depth to the central axis
z = 5 km, and density (or density contrast) ∆σ = 0.5 g/cm³, in the
regional gravity field of amplitude C = 10.0 mGal.

[Figure 4.2: plot of $g^{(2D)}_z$ (mGal) against d (km) above a
cross-section of the cylinder at depth z (km), for R = 3 km, z = 5 km,
∆σ = 0.5 g/cm³, and C = 10 mGal, with the tabulated profile values:

    d (km)    g_z (mGal)
    0         47.737
    ±2        42.532
    ±5        28.869
    ±10       17.547
    ±20       12.220
    ±50       10.374
    ±100      10.094
]

Figure 4.2 Profile of the gravity effect ($g^{(2D)}_z$) across a buried
horizontal 2D cylinder striking perpendicular to the profile
superimposed on a constant regional field C. This profile is used to
illustrate the data inversion process. See the text for additional
modeling details.
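
Equation 4.11 can be evaluated in a few lines of MATLAB to regenerate the profile values tabulated with Figure 4.2. The variable and function-handle names are illustrative; the parameter values are those quoted in the text.

```matlab
R = 3; z = 5; dro = 0.5; C = 10;        % parameters quoted for Figure 4.2
gz = @(d) 41.93*dro*(R^2/z) ./ (d.^2/z^2 + 1) + C;   % Equation 4.11
d = [0 2 5 10 20 50 100];               % km, right flank of the profile
fprintf('%6s %10s\n', 'd (km)', 'gz (mGal)');
fprintf('%6d %10.3f\n', [d; gz(d)]);    % one (d, gz) pair per printed row
```

Because the model depends on d only through d², evaluating it at −d gives the same value, which is the symmetry the text relies on later.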
Equation 4.11 involves two linear terms so that an interpreter can
estimate a variable in each of the terms from the remaining variables
and the observations. Consider, for example, determining ∆σ and C
assuming that R, z, and d are known and the model [Equation 4.11] is
appropriate for explaining the variations in the gravity profile
observations. To evaluate the two unknowns, at least two observations
are needed which may be modeled by the linear system of equations

$$
\begin{aligned}
b_1 &= a_{11}x_1 + a_{12}x_2\\
b_2 &= a_{21}x_1 + a_{22}x_2,
\end{aligned} \tag{4.12}
$$

where the design matrix coefficients are

$$
\begin{aligned}
a_{11} &= \frac{41.93(R^2/z)}{(d_1^2/z^2) + 1}, & a_{12} &= 1.0,\\
a_{21} &= \frac{41.93(R^2/z)}{(d_2^2/z^2) + 1}, & a_{22} &= 1.0.
\end{aligned} \tag{4.13}
$$

The design matrix coefficients represent the forward model in Equation
4.11 evaluated with the known parameters [R, z, d] set to their
respective values and the unknown parameters [∆σ, C] numerically set to
equal unity.
In mathematical theory, the two equations in two unknowns can be solved
uniquely from any two observations. However, if the two observations are
located symmetrically about the signal’s peak, then the two equations in
the above system are co-linear and |A| = 0 so that no solution is
possible. Thus, in the context of the assumed model, the solution can be
completely obtained from the peak value and the observations either to
the right or left of the peak. In other words, the interpreter’s choice
of the model [Equation 4.11] established the interdependency that made
roughly 50% of data in Figure 4.2 redundant for subsurface analysis.
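
The effect of choosing symmetric observations can be seen by building the 2×2 design matrix of Equation 4.13 for two different station pairs. The station choices and variable names below are illustrative assumptions.

```matlab
R = 3; z = 5;
a = @(d) 41.93*(R^2/z) ./ (d.^2/z^2 + 1);   % first-column coefficient, Eq. 4.13
A_sym  = [a(-20) 1; a(20) 1];   % stations symmetric about the peak
A_good = [a(0)   1; a(20) 1];   % peak plus one flank station
fprintf('determinant, symmetric pair:  %g\n', det(A_sym));
fprintf('determinant, peak and flank:  %g\n', det(A_good));
```

The symmetric pair produces two identical rows, so its determinant vanishes and the system cannot be solved, whereas the peak-and-flank pair gives the well-determined system used in the next example.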
Consider now the signal’s values on the peak at $d_1 = 0$ km and the
flank at $d_2 = 20$ km so that the system becomes

$$
\begin{aligned}
47.737 &= 75.474x_1 + x_2\\
12.220 &= 4.440x_1 + x_2.
\end{aligned} \tag{4.14}
$$

The design matrix and its determinant and inverse for this system are,
respectively,

$$
A = \begin{pmatrix} 75.474 & 1.000\\ 4.440 & 1.000 \end{pmatrix},
\quad |A| = 71.034, \quad \text{and} \quad
A^{-1} = \begin{pmatrix} 0.0141 & -0.0141\\ -0.0625 & 1.0625 \end{pmatrix}. \tag{4.15}
$$

Cramer’s Rule is a relatively simple method for obtaining the solution X
whereby $x_j = |D_j|/|A|$. Here, the augmented matrix $D_j$ is obtained
from A by replacing the j-th column of A with the column vector B. Thus,
by Cramer’s Rule,

$$
x_1 = \frac{35.517}{71.034} = 0.50
\quad \text{and} \quad
x_2 = \frac{710.34}{71.034} = 10.0, \tag{4.16}
$$

which are the correct values. Cramer’s Rule, however, is practical only
for simple systems with relatively few unknowns because of the
computational labor of computing the determinants. For larger systems, a
more efficient approach uses elementary row operations to process A for
its inverse so that the solution to Equation 4.14 can be obtained from

$$
X = A^{-1}B = \begin{pmatrix} 0.50\\ 10.0 \end{pmatrix}. \tag{4.17}
$$
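
Both routes to the solution of Equation 4.14 can be reproduced with a short MATLAB sketch. The variable names are illustrative; det, inv, and the backslash operator are built in.

```matlab
A = [75.474 1.000; 4.440 1.000];
B = [47.737; 12.220];

% Cramer's Rule: replace column j of A with B to form Dj
D1 = A; D1(:,1) = B;
D2 = A; D2(:,2) = B;
x_cramer = [det(D1); det(D2)] / det(A);

% Same system solved with the explicit inverse and with backslash
x_inv   = inv(A)*B;
x_slash = A\B;
disp([x_cramer x_inv x_slash])   % three columns that should agree
```

In practice the backslash form is usually preferred over forming the inverse explicitly, for the efficiency reasons the text goes on to discuss.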
Even more efficient approaches directly diagonalize the linear equations
in place for the solution coefficients. Gaussian elimination uses
elementary row operations to sweep out the lower triangular part of the
equations. As an example, the system 4.14 reduces to

$$
\begin{bmatrix} 47.737 = 75.474x_1 + x_2\\ 12.220 = 4.440x_1 + x_2 \end{bmatrix}
\sim
\begin{bmatrix} 47.737 = 75.474x_1 + x_2\\ 9.412 = 0.0 + 0.9412x_2 \end{bmatrix},
$$

from which clearly the solution coefficients are again

$$
x_2 = \frac{9.412}{0.9412} = 10.0
\quad \text{and} \quad
x_1 = \frac{(47.737 - 10.000)}{75.474} = 0.50, \tag{4.18}
$$

where $x_1$ was obtained by back substituting the solution for $x_2$.
Instead of back substitution, the row operations can be continued to
also sweep out the upper triangular part of the system. This procedure,
called Gauss-Jordan elimination, obtains the solution coefficients
explicitly along the main diagonal as

$$
\begin{bmatrix} 47.737 = 75.474x_1 + x_2\\ 9.412 = 0.0 + 0.9412x_2 \end{bmatrix}
\sim
\begin{bmatrix} 37.737 = 75.474x_1 + 0\\ 9.412 = 0 + 0.9412x_2 \end{bmatrix}
\sim
\begin{bmatrix} 0.5 = x_1 + 0\\ 10 = 0.0 + x_2 \end{bmatrix}.
$$
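
Gaussian elimination with back substitution, as applied above to Equation 4.14, can be sketched in MATLAB as follows. The sketch assumes non-zero pivots and omits the row interchanges that a production solver would use.

```matlab
A = [75.474 1.000; 4.440 1.000];
B = [47.737; 12.220];
n = numel(B);
M = [A B];                               % augmented system [A | B]

% Forward elimination: sweep out the lower triangular part
for k = 1:n-1
    for i = k+1:n
        M(i,:) = M(i,:) - (M(i,k)/M(k,k))*M(k,:);
    end
end

% Back substitution, starting from the last unknown
x = zeros(n,1);
for i = n:-1:1
    x(i) = (M(i,end) - M(i,i+1:n)*x(i+1:n)) / M(i,i);
end
disp(M)      % second row should read approximately [0 0.9412 9.412]
disp(x')
```

Continuing the row operations upward instead of back substituting would give the Gauss-Jordan form shown above.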

These matrix inversion methods are applicable for systems
where the num-
ber of unknowns m equals the number of input data values n.
However, for
most data inversions, m < n and the system of equations is
always related
to an incomplete set of input data containing errors, as well as
uncertain-
ties in estimating the coefficients of the design matrix A and
other errors.
Thus, the least squares solution is commonly adopted that
minimizes the
sum of squared differences between the predicted and observed
data. This
early 19th century solution due to Gauss involves solving the
system of nor-
mal equations derived by differentiating the sum of squared
differences with
respect to the model parameters being estimated in the
inversion.
To obtain a least squares solution, the simple raw error e is used to
define the residual function

$$
R = e^te = (B - AX)^t(B - AX) = \sum_{i=1}^{n}\Big(b_i - \sum_{j=1}^{m} a_{ij}x_j\Big)^2.
$$

Next, the partial derivatives of R with respect to the solution
coefficients are taken and set equal to zero to obtain the normal
equations. For the q-th coefficient $x_q$, for example,

$$
\frac{\partial R}{\partial x_q} = 2\sum_{j=1}^{m} x_j \sum_{i=1}^{n} a_{iq}a_{ij} - 2\sum_{i=1}^{n} a_{iq}b_i \equiv 0,
$$

which by the definition for matrix multiplication reduces to
$A^tAX - A^tB \equiv 0$. Thus, the system AX = B is equivalent to the
$A^t$-weighted system given by $A^tAX = A^tB$, which has the least
squares solution

$$
X = (A^tA)^{-1}A^tB \tag{4.19}
$$

if $|A^tA| \neq 0$.
Of course, the solution is found more efficiently by processing the
weighted system in place with Gaussian elimination than by taking the
inverse $(A^tA)^{-1}$ directly and multiplying it against the weighted
observation vector ($A^tB$).
This method works for any set of equations which can be put into linear
form. No derivatives need be explicitly taken for the least squares
solution, which contains only the known dependent and independent
variable coefficients from B and A, respectively.
As an example, for the simple (2×2) system in Equation 4.14 with the
design matrix A in Equation 4.8,

$$
(A^tA)^{-1} = \begin{pmatrix} 0.0004 & -0.0158\\ -0.0158 & 1.1328 \end{pmatrix}
$$

so that the least squares or natural inverse is given by

$$
(A^tA)^{-1}A^t = \begin{pmatrix} 0.0141 & -0.0141\\ -0.0625 & 1.0625 \end{pmatrix},
$$

which is identical to the inverse $A^{-1}$ that was found previously
using elementary row operations. Thus, for simple systems where (n = m),
the natural inverse reduces to the elementary inverse $A^{-1}$.
However, the natural inverse also applies for the more usual case where
(n > m) and $A^{-1}$ does not exist. Consider, for example, the 7
linearly independent gravity observations in Figure 4.2 that extend from
the peak along the right flank given by

$$
B_{RF} = (47.737\ \ 42.532\ \ 28.869\ \ 17.547\ \ 12.220\ \ 10.374\ \ 10.094)^t
$$

for which

$$
A^t = \begin{pmatrix}
75.474 & 65.064 & 37.737 & 15.095 & 4.440 & 0.747 & 0.188\\
1 & 1 & 1 & 1 & 1 & 1 & 1
\end{pmatrix}.
$$

In this case, the natural inverse is the (2×7) matrix

$$
(A^tA)^{-1}A^t = \begin{pmatrix}
\cdots & -0.0046 & -0.0047\\
\cdots & 0.2746 & 0.2772
\end{pmatrix},
$$

of which only the last two columns are shown, and which again correctly
estimates the unknown parameters from

$$
(A^tA)^{-1}A^tB_{RF} = X = (0.50\ \ 10)^t.
$$
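
The seven-observation example can be reproduced in MATLAB by building the design matrix from Equation 4.13 and solving the normal equations. The variable names are illustrative; because the tabulated observations are rounded to three decimals, the estimates agree with (0.50, 10) only to about that precision.

```matlab
R = 3; z = 5;
d = [0 2 5 10 20 50 100]';                                  % km
B = [47.737 42.532 28.869 17.547 12.220 10.374 10.094]';    % mGal
A = [41.93*(R^2/z) ./ (d.^2/z^2 + 1), ones(size(d))];       % 7x2 design matrix

AtA = A'*A;
AtB = A'*B;
X_normal  = AtA \ AtB;        % normal equations solved in place
X_inverse = inv(AtA)*A'*B;    % explicit natural inverse, Equation 4.19
X_direct  = A \ B;            % backslash least squares on the original system
disp([X_normal X_inverse X_direct])
```

The three columns should agree closely; the in-place solve avoids forming the inverse, in line with the efficiency remark that follows Equation 4.19.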
The matrix $A^tA$ is symmetric and positive definite with $|A^tA| > 0$,
and thus can be factored into an explicit series solution that is much
faster and more efficient to evaluate than the conventional Gaussian
elimination solution. In particular, the matrix $A^tA$ can be decomposed
into (m × m) lower and upper triangular Cholesky factors L and $L^t$,
respectively, so that $LL^t = A^tA$. With $a_{ij}$ here denoting the
elements of $A^tA$, the coefficients of L are obtained from

$$
\begin{aligned}
l_{11} &= \sqrt{a_{11}}, && \forall\ i = j = 1\\
l_{j1} &= a_{j1}/l_{11}, && \forall\ j = 2, 3, \cdots, m\\
l_{ii} &= \sqrt{a_{ii} - \sum_{k=1}^{i-1} l_{ik}^2}, && \forall\ i = 2, 3, \cdots, (m-1)\\
l_{ji} &= 0 && \forall\ i > j\\
       &= \Big(\frac{1}{l_{ii}}\Big)\Big(a_{ji} - \sum_{k=1}^{i-1} l_{ik}l_{jk}\Big), && \forall\ i = 2, 3, \cdots, (m-1) \text{ and } j = (i+1), (i+2), \cdots, m, \text{ and}\\
l_{mm} &= \sqrt{a_{mm} - \sum_{k=1}^{m-1} l_{mk}^2}.
\end{aligned} \tag{4.20}
$$

Note that in the above Cholesky elements, the summations were
effectively taken to be zero for i = 1.
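To see how the Cholesky decomposition obtains the faster series
solution, consider the previous example of the 7 observations in Figure
4.2, where

$$
A^tA = \begin{pmatrix} 1.1602 & 0.0199\\ 0.0199 & 0.0007 \end{pmatrix} \times 10^4
\quad \text{and} \quad
A^tB = \begin{pmatrix} 7.7884\\ 0.1694 \end{pmatrix} \times 10^3.
$$

Now, since $|A^tA| = 4.1714 \times 10^4 > 0$, $A^tA$ can be decomposed
into Cholesky factors

$$
L = \begin{pmatrix} 107.7121 & 0\\ 1.8452 & 1.8962 \end{pmatrix}
$$

and $L^t$ such that $LL^t = A^tA$. Also, let the weighted observation
column vector $A^tB = K = LL^tX$. Thus, setting $P = L^tX$ obtains

$$
\begin{aligned}
p_1 &= l_{11}x_1 + l_{21}x_2\\
p_2 &= 0 + l_{22}x_2,
\end{aligned} \tag{4.21}
$$

as well as LP = K with

$$
\begin{aligned}
p_1 &= K_1/l_{11}\\
p_2 &= K_2/l_{22} - (l_{21}/l_{22})p_1,
\end{aligned} \tag{4.22}
$$

which can be combined for the solution coefficient estimates

$$
\begin{aligned}
x_2 &= (K_2/l_{22}^2) - (K_1/l_{22}^2)(l_{21}/l_{11}) = 10\\
x_1 &= (K_1/l_{11}^2) - (l_{21}/l_{11})x_2 = 0.5.
\end{aligned} \tag{4.23}
$$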
In general, the Cholesky factorization allows $x_m$ to be estimated from
a series expressed completely in the known coefficients of L, A and B,
which is back-substituted into the series expression for $x_{m-1}$,
which in turn is back-substituted into the series for $x_{m-2}$, etc.
The Cholesky solution is much faster than Gaussian elimination
approaches that require twice as many numerical operations to complete.
The Cholesky solution also has minimal in-core memory requirements
because the symmetric $A^tA$ matrix can be packed into a singly
dimensioned array of length m(m + 1)/2. For systems that are too large
to hold in active memory, updating Cholesky algorithms are available
that process the system on external storage devices [e.g., Lawson
and Hanson (1974)].
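
The factorization of Equation 4.20 and the two substitution passes of Equations 4.21 to 4.23 can be sketched in MATLAB for the seven-observation example. The names N and K (for the normal matrix and the weighted observation vector) and the loop layout are choices made for this illustration; the built-in chol function would normally be used instead of the explicit loops.

```matlab
R = 3; z = 5;
d = [0 2 5 10 20 50 100]';
B = [47.737 42.532 28.869 17.547 12.220 10.374 10.094]';
A = [41.93*(R^2/z) ./ (d.^2/z^2 + 1), ones(size(d))];

N = A'*A;  K = A'*B;  m = size(N,1);

% Cholesky factor L (lower triangular) with L*L' = N, column by column
L = zeros(m);
for i = 1:m
    L(i,i) = sqrt(N(i,i) - sum(L(i,1:i-1).^2));
    for j = i+1:m
        L(j,i) = (N(j,i) - L(i,1:i-1)*L(j,1:i-1)') / L(i,i);
    end
end

% Forward substitution for P in L*P = K
P = zeros(m,1);
for i = 1:m
    P(i) = (K(i) - L(i,1:i-1)*P(1:i-1)) / L(i,i);
end

% Back substitution for X in L'*X = P
X = zeros(m,1);
for i = m:-1:1
    X(i) = (P(i) - L(i+1:m,i)'*X(i+1:m)) / L(i,i);
end
disp(L)
disp(X')
```

The computed factor should match chol(N,'lower') and the values of L quoted in the text to the displayed precision.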
The Cholesky factorization is problematic, however, in applications
where the coefficients $l_{ij} \to 0$ within working precision. In these
instances, the products and powers of the coefficients are even smaller
so that solution coefficient estimates become either indeterminate or
blow up wildly and unrealistically. Indeterminate or unstable solutions
respectively signal singular or near-singular $A^tA$ matrices. However,
these ill-conditioned matrices can still be processed for effective
solutions using the error statistics of the inversions to help suppress
their singular or near-singular properties.
4.2.1 Error Statistics
In general, reporting of data inversion results routinely requires
quantifying the statistical uncertainties of the analysis. These
assessments commonly focus on the variances and confidence intervals on
the solution coefficients and predictions of the model, and the
coherency and statistical significance of the fit of the model’s
predictions to the data observations.
For example, applying variance propagation [e.g., Bevington (1969)] to
the linear function $(A^tA)^{-1}A^tB$ shows that the solution variance
can be obtained from

$$
\sigma_X^2 = [(A^tA)^{-1}A^t]^2\sigma_B^2 = (A^tA)^{-1}\sigma_B^2, \tag{4.24}
$$

where the statistical variance in the observations $\sigma_B^2$ is
specified either a priori or from

$$
\sigma_B^2 \simeq [B^tB - X^t(A^tA)X]/(n - m) \quad \forall\ n > m, \tag{4.25}
$$

which is an unbiased estimate if the model AX is correct. Equation 4.24
is the symmetric variance-covariance matrix of X with elements that are
the products $\sigma_{x_i}\sigma_{x_j}$ of the standard deviations of
$x_i$ and $x_j$. Thus, the elements along the diagonal where i = j give
the variances $\sigma_{x_i}^2 = \sigma_{x_i}\sigma_{x_j}$ and the
off-diagonal elements give the covariances that approach zero if $x_i$
and $x_j$ are not correlated.
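
Equations 4.24 and 4.25 can be sketched in MATLAB for the seven-observation example. The only "noise" in these data is the rounding of the tabulated values to three decimals, so the variances come out very small; the variable names are illustrative.

```matlab
R = 3; z = 5;
d = [0 2 5 10 20 50 100]';
B = [47.737 42.532 28.869 17.547 12.220 10.374 10.094]';
A = [41.93*(R^2/z) ./ (d.^2/z^2 + 1), ones(size(d))];
[n, m] = size(A);

X = (A'*A) \ (A'*B);                        % least squares solution
varB  = (B'*B - X'*(A'*A)*X) / (n - m);     % Equation 4.25
varB2 = sum((B - A*X).^2) / (n - m);        % same estimate from the residuals
covX  = inv(A'*A) * varB;                   % Equation 4.24
sigX  = sqrt(diag(covX));                   % standard deviations of x1 and x2
fprintf('var(B) = %.3e (residual form %.3e)\n', varB, varB2);
fprintf('sigma(x1) = %.3e, sigma(x2) = %.3e\n', sigX);
```

The residual form is algebraically identical to Equation 4.25 and is often numerically safer, because it avoids subtracting two large, nearly equal numbers.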
The 100(1 − α)% confidence interval [or equivalently the 100α%
significance interval] on each $x_i \in X$ is given by

$$
x_i \pm \sigma_{x_i}\, t_{n-m}(1 - \alpha/2), \tag{4.26}
$$

where $t_{n-m}(1 - \alpha/2)$ is the value of Student’s t-distribution
for the confidence level (1 − α/2) and degrees of freedom ν = (n − m).
The individual confidence intervals for $x_i$ can also be readily
combined into joint confidence regions [e.g., Jenkins and Watts (1968);
Draper and Smith (1966)], but they are difficult to visualize for m > 3.
Application of the variance propagation rule to AX shows that the
variance on each prediction $\hat{b}_i \in \hat{B}$ ($= AX \simeq B$)
can be estimated from

$$
\sigma_{\hat{b}_i}^2 = A_i(A^tA)^{-1}A_i^t\,\sigma_B^2, \tag{4.27}
$$

where $A_i$ is the row vector defined by
$A_i = (a_{i1}\ a_{i2}\ \cdots\ a_{im})$. Thus, the 100(1 − α)%
confidence limits or error bars for the $\hat{b}_i$ at $A_i$ can be
obtained from

$$
\hat{b}_i \pm \sigma_B\Big[\sqrt{A_i(A^tA)^{-1}A_i^t}\Big]\, t_{n-m}(1 - \alpha/2). \tag{4.28}
$$
Another measure of the fit is the correlation coefficient r, which when
squared gives the coherency that indicates the percent of the
observations $b_i$ fitted by the predictions $\hat{b}_i$. The coherency
in matrix form is given by

$$
r^2 = (X^tA^tB - n\bar{b}^2)/(B^tB - n\bar{b}^2), \tag{4.29}
$$

where $\bar{b}$ is the mean value of the $b_i \in B$. In general, our
confidence in the model AX tends to increase as $r^2 \to 1$.
The coherency (Equation 4.29) reduces algebraically to

$$
r^2 = \Big[\sum_{i=1}^{n}(\hat{b}_i - \bar{b})^2\Big] \Big/ \Big[\sum_{i=1}^{n}(b_i - \bar{b})^2\Big], \tag{4.30}
$$

where $\sum_{i=1}^{n}(\hat{b}_i - \bar{b})^2$ is the sum of squares due
to regression, and $\sum_{i=1}^{n}(b_i - \bar{b})^2$ is the sum of
squares about the mean. In general, the sum of squares about the mean is
$\sum_{i=1}^{n}(b_i - \bar{b})^2 = \sum_{i=1}^{n}(\hat{b}_i - \bar{b})^2 + \sum_{i=1}^{n}(b_i - \hat{b}_i)^2$,
where the last term is the sum of squares about the regression, so that
if AX is correct, then
$[\sum_{i=1}^{n}(\hat{b}_i - \bar{b})^2] \gg [\sum_{i=1}^{n}(b_i - \hat{b}_i)^2]$
and $r^2 \to 1$. Thus, an analysis of variance or ANOVA table can be
constructed to test the statistical significance of the fit between
$\hat{b}_i$ and $b_i$ as shown in Table 4.1. The null hypothesis that
the model’s predictions ($\hat{B}$) do not significantly fit the
observations (B) is tested by comparing the estimate F = MSR/MSD from
the ANOVA table with the critical value ($F_c$) from the Fisher
distribution with degrees of freedom ν = (m − 1), (n − m) at the desired
level α of significance [or alternatively (1 − α) of confidence]. The
hypothesis is rejected if $F > F_c$; otherwise there is no reason to
reject the hypothesis based on the F-test.
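
The coherency and the F estimate can be computed from the matrix expressions above, as sketched below in MATLAB for the seven-observation example. The variable names are illustrative, and the critical value $F_c$ is deliberately not computed: it would have to come from a table of the Fisher distribution or from a statistics routine (for instance finv, which requires MATLAB's Statistics and Machine Learning Toolbox).

```matlab
R = 3; z = 5;
d = [0 2 5 10 20 50 100]';
B = [47.737 42.532 28.869 17.547 12.220 10.374 10.094]';
A = [41.93*(R^2/z) ./ (d.^2/z^2 + 1), ones(size(d))];
[n, m] = size(A);
X = (A'*A) \ (A'*B);

bbar = mean(B);
SSR  = X'*(A'*B) - n*bbar^2;    % sum of squares due to regression
SSD  = B'*B - X'*(A'*B);        % residual sum of squares (about the regression)
SST  = B'*B - n*bbar^2;         % total sum of squares (about the mean)
r2   = SSR/SST;                 % coherency, Equation 4.29
MSR  = SSR/(m - 1);             % mean squares from the ANOVA table
MSD  = SSD/(n - m);
F    = MSR/MSD;                 % to be compared with a tabulated Fc
fprintf('r^2 = %.8f, F = %.4g\n', r2, F);
```

Because these observations were generated from the model itself and differ from it only by rounding, the coherency is essentially 1 and F is very large; real data would give less extreme values.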
Table 4.1 The analysis of variance (ANOVA) table for testing the
significance of a model’s fit to data.

ERROR SOURCE         ν       CORRECTED SUM OF SQUARES (CSS)         MEAN SQUARES     F-TEST
Regression Error     m − 1   $X^tA^tB - (\sum_{i=1}^{n} b_i)^2/n$   MSR = CSS/ν      F = MSR/MSD
Residual Deviation   n − m   $B^tB - X^tA^tB$                       MSD = CSS/ν
Total Error          n − 1   $B^tB - (\sum_{i=1}^{n} b_i)^2/n$
4.2.2 Sensitivity Analysis
The above sections have illustrated how inversion determines an
unknown set
of discrete numbers (X) that relates a set of discrete numbers
(B) observed
from nature to a given set of discrete numbers (A) from the
model (AX) that
the interpreter presumes can account for the observations (B).
Of course,
