In my thesis, Ofer Levi and I present several novel approaches to the regularization problem.
1. Developed the 2D Discrete Picard condition
2. Designed a new hybrid (L1, L2) norm
3. Implemented an amalgamation of convex function optimization methods
We also show the effects of the following on inverse problems.
1. L1,L2 regularization
2. TSVD regularization
3. L-curve optimization
4. 1D,2D Discrete Picard condition
2. o Understanding the problem
o Mathematical modeling
o The need for Regularization
o Regularization Methods
o Solution Development
o Results and Conclusion
4. o The study of the structure and dynamic behavior of molecules is extremely important:
Medical imaging
Industrial quality control
Chemical and pharmaceutical analysis
Safety inspections
o However, molecules are too small to be observed and studied directly.
Nuclear magnetic resonance (NMR) is a versatile and powerful technique for exploring their structure and dynamic behavior.
5. o Protons carry charge and possess a spin; due to this, they have a magnetic moment.
o In an external magnetic field 𝐵₀ (𝐵𝑧) they align parallel or anti-parallel.
o The spinning protons wobble about the axis of the external magnetic field. This motion is called precession, a relationship defined by the Larmor equation: 𝜔₀ = 𝛾𝐵₀.
o An electromagnetic RF pulse at the resonance frequency causes the protons to precess in phase.
6. o T1, the longitudinal (spin-lattice) relaxation.
T1 governs the exponential recovery of 𝑀𝑧:
𝑀𝑧 = 𝑀𝑧,𝑒𝑞(1 − 𝑒^(−𝑡/𝑇1))
o T2, the transversal (spin-spin) relaxation.
T2 governs the exponential decay of the signal 𝑀𝑥𝑦:
𝑀𝑥𝑦 = 𝑀𝑥𝑦,𝑒𝑞 𝑒^(−𝑡/𝑇2)
o Simultaneously, the longitudinal magnetization begins to increase again as the excited spins return to the original 𝑀𝑧 orientation.
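The two relaxation laws above can be sketched numerically (a minimal illustration; the T1, T2 values and the unit equilibrium magnetizations are arbitrary choices for the example):

```python
import math

def mz(t, T1, mz_eq=1.0):
    """Longitudinal recovery: Mz(t) = Mz_eq * (1 - exp(-t/T1))."""
    return mz_eq * (1.0 - math.exp(-t / T1))

def mxy(t, T2, mxy_eq=1.0):
    """Transverse decay: Mxy(t) = Mxy_eq * exp(-t/T2)."""
    return mxy_eq * math.exp(-t / T2)

# After one T1, Mz has recovered to ~63% of equilibrium;
# after one T2, Mxy has decayed to ~37% of its initial value.
recovered = mz(0.5, T1=0.5)
decayed = mxy(0.1, T2=0.1)
```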
8. o Fredholm integral equation of the first kind:
𝑠(𝑡) = ∫_{𝑇1∈𝑡} (1 − 𝑒^(−𝑡/𝑇1)) 𝑓(𝑇1) 𝑑𝑇1
𝑠(𝑡) = ∫_{𝑇2∈𝑡} 𝑒^(−𝑡/𝑇2) 𝑓(𝑇2) 𝑑𝑇2
𝑠(𝑡1, 𝑡2) = ∫_{𝑇2∈𝑡2} ∫_{𝑇1∈𝑡1} (1 − 𝑒^(−𝑡1/𝑇1))(𝑒^(−𝑡2/𝑇2)) 𝑓(𝑇1, 𝑇2) 𝑑𝑇1 𝑑𝑇2
o Discretizing the integrals:
𝑠 = 𝐾𝑓
𝑆 = 𝐾1 𝐹 𝐾2
9. o Solving the equation:
𝑠 = 𝐾𝑓
𝑓 = (𝐾^𝑇 𝐾)^(−1) 𝐾^𝑇 𝑠
o We are done!
o Oh no!
𝑓 ≠ (𝐾^𝑇 𝐾)^(−1) 𝐾^𝑇 𝑠
𝐹 ≠ (𝐾1^𝑇 𝐾1)^(−1) 𝐾1^𝑇 𝑆 𝐾2^𝑇 (𝐾2 𝐾2^𝑇)^(−1)
10. o Inverse problems
Conversion of the relaxation signal into a
continuous distribution of relaxation components is
an inverse Laplace transform problem.
o Ill-posed problems
Inverse problems belong to the class of ill-posed problems and frequently exhibit extreme sensitivity to changes in the input.
o Perturbation theory
That is, even minute perturbations in the data can
vastly affect the computed solution.
12. o Consider, for example, the following system:
𝐴𝑥 = 𝑏
o A is the matrix which describes the model.
o b is the vector which describes the output of the system.
o x, the solution of the inverse problem, is the vector which describes the input of the system.
13. o An underdetermined problem has infinitely many solutions:
(1 1)𝑥 = 1
o The problem is replaced by a nearby problem whose solution is less sensitive to errors in the data.
o This replacement is commonly referred to as regularization.
14. o If we require the 2-norm of x to be a minimum, that is:
min ‖𝑥‖₂  s.t.  𝑥1 + 𝑥2 = 1
o Then there is a unique solution at 𝑥1 = 𝑥2 = 1/2.
o We compute an approximate solution to the linear least-squares minimization problem associated with the linear system of equations.
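The minimum-norm solution of the example (1 1)x = 1 can be checked directly: the Moore-Penrose pseudoinverse picks exactly this solution (a small sketch; `x_other` is just one of the infinitely many alternative solutions):

```python
import numpy as np

# Underdetermined system (1 1) x = 1: infinitely many solutions.
A = np.array([[1.0, 1.0]])
b = np.array([1.0])

# The pseudoinverse returns the minimum 2-norm solution: x1 = x2 = 1/2.
x_min = np.linalg.pinv(A) @ b

# Any other consistent solution, e.g. (1, 0), has a larger norm.
x_other = np.array([1.0, 0.0])
```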
15. o Assume that the solution x can be separated as 𝑥 = 𝑥̂ + 𝑥₀.
o Inserting into 𝐴𝑥 = 𝑏 and rearranging: 𝐴𝑥̂ + 𝐴𝑥₀ = 𝑏.
o If 𝐴𝑥₀ = 0, then the vector 𝑥₀ is a null vector (it lies in the kernel of A).
o The system behaves like an underdetermined system.
16. o To stabilize the solution, we enforce an upper bound on the norm of the solution:
min ‖𝐴𝑥 − 𝑏‖₂²  s.t.  ‖𝑥‖₂² bounded
o From optimization theory, we can incorporate the constraint via a Lagrange multiplier 𝛾:
min ‖𝐴𝑥 − 𝑏‖₂² + 𝛾‖𝑥‖₂²
18. o Tikhonov regularization
Perhaps the most successful and widely used regularization method is Tikhonov regularization.
o Singular Value Decomposition (SVD)
The singular value decomposition, in the discrete setting, is a powerful tool for many useful applications in signal processing and statistics.
19. o Specifically, the Tikhonov solution xλ is defined, for a strictly positive regularization parameter λ, as the solution to the problem
min_𝑥 ‖𝐴𝑥 − 𝑏‖₂² + 𝜆²‖𝑥‖₂²
o The first term, ‖𝐴𝑥 − 𝑏‖₂², is a measure of the goodness of fit.
o If this term is too large, then x cannot be considered a good solution because we are underfitting the model.
o If this term is too small, then we are overfitting our model to the noisy measurements.
20. o If we can control the norm of x, then we can suppress most of the large noise components.
o The objective is to find, via the regularization parameter λ, a suitable balance between these two terms, such that the regularized solution xλ fits the data well and is sufficiently regularized.
o The balance between the two terms is controlled by the factor λ.
21. o For λ = 0 we obtain the least-squares problem:
more weight is given to fitting the noisy data, resulting in a solution that is less regular.
o However, the larger the λ, the more effort is devoted to the regularity of the solution:
more weight is given to the minimization of the L2-norm of the solution, and so as 𝜆 → ∞ we have 𝑥 → 0.
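The two limits can be verified on a small synthetic system (a sketch; A, b, and the λ values are arbitrary test data): λ → 0 recovers plain least squares, and a very large λ drives the solution norm toward 0.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((20, 10))
b = rng.standard_normal(20)

def tikhonov(A, b, lam):
    """x_lam = argmin ||Ax - b||^2 + lam^2 ||x||^2, via the normal equations."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + lam**2 * np.eye(n), A.T @ b)

x_ls    = np.linalg.lstsq(A, b, rcond=None)[0]  # lam = 0: least squares
x_small = tikhonov(A, b, 1e-8)                  # essentially unregularized
x_large = tikhonov(A, b, 1e4)                   # heavily regularized: x -> 0
```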
23. o Discrepancy Principle:
This method is very likely to overestimate the regularization parameter.
⇒ L-Curve:
Some underestimation expected; very robust.
o Generalized Cross Validation (GCV):
Risk of severe over- or underestimation.
o Normalized Cumulative Periodogram (NCP) Criterion:
For low or high noise levels, a considerable overestimate.
24. o The L-curve is a convenient graphical tool for displaying the trade-off between the size of a regularized solution and its fit to the given data, as the regularization parameter varies.
o Advantages of the L-curve criterion:
robustness
the ability to treat perturbations consisting of correlated noise.
o A disadvantage of the L-curve criterion:
for a low noise level, the regularization parameter given is much smaller than the optimal parameter.
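The trade-off the L-curve displays can be computed directly (a sketch on a synthetic ill-conditioned exponential kernel; the grid, noise level, and test spectrum are illustrative choices): as λ grows, the residual norm can only increase while the solution norm can only decrease.

```python
import numpy as np

rng = np.random.default_rng(2)
t = np.linspace(0.05, 3.0, 40)
T = np.linspace(0.1, 2.0, 25)
A = np.exp(-t[:, None] / T[None, :])            # ill-conditioned kernel
f = np.exp(-((T - 1.0) ** 2) / 0.05)            # smooth test spectrum
b = A @ f + 1e-4 * rng.standard_normal(t.size)  # noisy data

# Tikhonov solutions over a lambda grid, computed stably via the SVD filter.
U, sv, Vt = np.linalg.svd(A, full_matrices=False)
beta = U.T @ b
lams = np.logspace(-6, 2, 30)
res_norms, sol_norms = [], []
for lam in lams:
    x = Vt.T @ (sv * beta / (sv**2 + lam**2))
    res_norms.append(np.linalg.norm(A @ x - b))
    sol_norms.append(np.linalg.norm(x))

# The L-curve is the log-log plot of res_norms vs. sol_norms;
# its corner balances fit against the size of the regularized solution.
```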
27. o Formally, the singular value decomposition of an m × n real or complex matrix A is a factorization of the form
𝐴 = 𝑈Σ𝑉* = Σᵢ₌₁ⁿ 𝜎ᵢ 𝑢ᵢ 𝑣ᵢ^𝑇
o 𝑈 ∈ ℝ^(𝑚×𝑚) and 𝑉 ∈ ℝ^(𝑛×𝑛) are orthogonal matrices.
o Σ ∈ ℝ^(𝑚×𝑛) is a rectangular diagonal matrix with non-negative real numbers 𝜎ᵢ:
𝜎1 ≥ 𝜎2 ≥ ⋯ ≥ 𝜎ᵣ > 𝜎ᵣ₊₁ = ⋯ = 𝜎ₙ = 0
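Both forms of the factorization (matrix product and rank-1 expansion) are easy to verify on a small random matrix (a sketch; the 6×4 size is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((6, 4))

# A = U @ diag(s) @ Vt, with singular values returned in decreasing order.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Equivalent rank-1 expansion: A = sum_i s_i * u_i v_i^T
A_sum = sum(s[i] * np.outer(U[:, i], Vt[i, :]) for i in range(len(s)))
```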
28. o As the singular values decrease, the corresponding singular vectors become more oscillatory and carry less information.
30. o This regularization method resolves the issue of the problematic tiny positive singular values by setting them to zero.
o The TSVD approximation of A effectively ignores the smallest singular values:
𝐴ₖ = 𝑈Σₖ𝑉^𝑇 = Σᵢ₌₁ᵏ 𝜎ᵢ 𝑢ᵢ 𝑣ᵢ^𝑇
Σₖ = 𝑑𝑖𝑎𝑔(𝜎1, 𝜎2, …, 𝜎ₖ, 0, …, 0)
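The truncation can be sketched in a few lines (illustrative; the matrix and the choice k = 2 are arbitrary). By the Eckart-Young theorem the spectral-norm error of the rank-k approximation is exactly the first discarded singular value.

```python
import numpy as np

def tsvd_approx(A, k):
    """Rank-k TSVD approximation: keep the k largest singular values, zero the rest."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

rng = np.random.default_rng(4)
A = rng.standard_normal((8, 5))
A2 = tsvd_approx(A, 2)   # best rank-2 approximation of A
```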
32. o A necessary condition for obtaining good regularized solutions is that the Fourier coefficients of the right-hand side, when expressed in terms of the generalized SVD associated with the regularization problem, on average decay to zero faster than the generalized singular values.
o In other words, chop off the SVD components that are dominated by the noise.
o (Note: the need for the Fourier coefficients to converge has been understood for many years.)
33. o This condition must be satisfied in order to obtain "good regularized solutions".
o The Discrete Picard Condition: Let τ denote the level at which the computed singular values σᵢ level off due to rounding errors. The discrete Picard condition is satisfied if, for all singular values larger than τ, the corresponding coefficients |𝑢ᵢ^𝑇 𝑠| on average decay faster than the σᵢ.
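The condition can be inspected numerically (a sketch on a synthetic exponential kernel with made-up grids and noise level): for the noise-dominated components the coefficients |uᵢᵀs| level off at the noise floor while σᵢ keeps decaying, so the ratio |uᵢᵀs|/σᵢ explodes at the tail.

```python
import numpy as np

rng = np.random.default_rng(5)
t = np.linspace(0.05, 3.0, 40)
T = np.linspace(0.1, 2.0, 25)
K = np.exp(-t[:, None] / T[None, :])                  # smooth, ill-conditioned kernel
s_clean = K @ np.exp(-((T - 1.0) ** 2) / 0.05)
s_noisy = s_clean + 1e-6 * rng.standard_normal(t.size)

U, sigma, Vt = np.linalg.svd(K, full_matrices=False)
coeffs = np.abs(U.T @ s_noisy)     # Picard coefficients |u_i^T s|

# Where coeffs stop decaying with sigma, the SVD components are
# noise-dominated and should be dropped.
picard_ratio = coeffs / sigma
```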
36. o Recall our 2D problem 𝑆 = 𝐾1 𝐹 𝐾2.
o Transforming the equation back to 1D:
𝑣𝑒𝑐(𝑆) = (𝐾2^𝑇 ⨂ 𝐾1) 𝑣𝑒𝑐(𝐹)
𝐾2^𝑇 ⨂ 𝐾1 = (𝑉2 ⨂ 𝑈1)(Σ2 ⨂ Σ1)(𝑈2^𝑇 ⨂ 𝑉1^𝑇)
𝑠 = 𝜇𝜉𝜈^𝑇 𝑓
o We now define the Picard curve:
𝜌 = 𝑙𝑜𝑔|𝜇ᵢ^𝑇 𝑠| − 𝑙𝑜𝑔(𝑑𝑖𝑎𝑔(𝜉)ᵢ)
37. o However, the matrix μ is extremely large.
o Instead, perform an inverse Kronecker product operation:
(𝑉2 ⨂ 𝑈1)^𝑇 𝑠 ↔ 𝑈1^𝑇 𝑆 𝑉2
𝑑𝑖𝑎𝑔(Σ2 ⨂ Σ1) = (Σ2 ⨂ Σ1) 𝕀_{𝑚1𝑚2×1} ↔ Σ1 𝕀_{𝑚1×𝑚2} Σ2^𝑇 = 𝑑𝑖𝑎𝑔(Σ1) × 𝑑𝑖𝑎𝑔^𝑇(Σ2)
o We now define the Picard surface as
𝜌 = 𝑙𝑜𝑔|𝑈1^𝑇 𝑆 𝑉2| − 𝑙𝑜𝑔(𝜎1 × 𝜎2^𝑇)
where 𝜎1 and 𝜎2 are the diagonals (the vectors of singular values) of Σ1 and Σ2, respectively.
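The two identities behind this trick can be checked on tiny matrices (a sketch with arbitrary 5×5 and 4×4 orthogonal stand-ins): the vec identity vec(AXB) = (Bᵀ ⊗ A) vec(X) lets us replace the huge Kronecker matrix-vector product with two small matrix products, and the diagonal of Σ2 ⊗ Σ1 is the outer product of the two singular-value vectors.

```python
import numpy as np

rng = np.random.default_rng(6)
U1 = np.linalg.qr(rng.standard_normal((5, 5)))[0]   # orthogonal stand-ins
V2 = np.linalg.qr(rng.standard_normal((4, 4)))[0]
S = rng.standard_normal((5, 4))

# vec in column-major (Fortran) order, matching vec(AXB) = (B^T kron A) vec(X).
vec = lambda M: M.flatten(order="F")

lhs = np.kron(V2, U1).T @ vec(S)   # huge-matrix route (mu^T s)
rhs = vec(U1.T @ S @ V2)           # cheap equivalent, no Kronecker product

# Same trick for the singular values: diag(S2 kron S1) = vec of outer(s1, s2).
s1 = np.array([3.0, 2.0, 1.0])
s2 = np.array([2.0, 0.5])
diag_kron = np.diag(np.kron(np.diag(s2), np.diag(s1)))
outer_vec = vec(np.outer(s1, s2))
```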
39. o Assume a simple problem consisting of the same data structure as our LR-NMR experiment.
o Signal measurements matrix:
S: 16384 by 70, values stored as double precision (8 bytes)
o Our kernel matrices:
K1: 300 by 70
K2: 300 by 16384
o We would need about 45 Megabytes to store our raw data measurements.
40. o In our example, the Picard plot suggested using only the first 9 singular values of the K1 kernel and the first 12 singular values of the second kernel, K2.
o New signal measurements vector:
s: 108 by 1
o New kernel matrices:
K1: 300 by 9
K2: 300 by 12
o We would need 0.05 Megabytes to store our raw data measurements.
o Compression ratio of about 1,000:1.
41. o Consider our method to map the data into a 1D problem.
o New signal measurements vector:
s: 1146880
o New kernel matrix:
K′: 1146880 by 90000
o We would need at least 768.9 Gigabytes of storage space.
o Using the 2D Picard condition:
o New signal measurements vector:
s: 108
o New kernel matrix:
K′: 108 by 90000
o We would need at least 0.073 Gigabytes of storage space.
o Compression ratio of about 10,000:1.
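The storage arithmetic above reproduces directly (within rounding) from the stated matrix sizes; note 1146880 = 16384 × 70 and 90000 = 300 × 300:

```python
# Back-of-the-envelope storage check for the mapped 1D kernel K'
# (double precision = 8 bytes, 1 GiB = 2**30 bytes).
BYTES = 8
GIB = 2**30

full_kernel   = 1146880 * 90000 * BYTES / GIB   # ~769 GiB for the full K'
picard_kernel =     108 * 90000 * BYTES / GIB   # ~0.073 GiB after 2D Picard

ratio = full_kernel / picard_kernel             # = 1146880 / 108, ~10,000:1
```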
42. o It is impossible to extract more knowledge than the information contained in the data.
o If the resolution is fictitiously increased, the information is falsified.
o The regularization parameters are a function of the distribution f and the signal noise Δs.
o No mathematical basis.
44. o Define the functional 𝛷(𝑓):
min 𝛷(𝑓) = ½‖𝐾𝑓 − 𝑠‖₂² + ½𝜆2²‖𝑓‖₂² + 𝜆1‖𝑓‖₁
s.t. 𝑓 ∈ 𝐶, 𝜆1 ≥ 0, 𝜆2 ≥ 0
𝐶 = {𝑓 : 𝑓 ≥ 0, ‖𝑓‖₂ < ∞}
o If K ∈ L2, then Φ(f) has a directional derivative, denoted ∇Φ.
o With the residual 𝑒 = 𝐾𝑓 − 𝑠:
𝛻𝛷(𝑓) = ½ ∂/∂𝑓⟨𝑒, 𝑒⟩ + ½𝜆2² ∂/∂𝑓⟨𝑓, 𝑓⟩ + 𝜆1 ∂/∂𝑓 𝑡𝑟(𝑓 ∙ 𝐼)
𝛻𝛷(𝑓) = 𝐾^𝑇(𝐾𝑓 − 𝑠) + 𝜆2²𝑓 + 𝜆1
45. o The Kuhn-Tucker conditions:
∇Φ(f) = 0 → f > 0: when the derivative is 0, we are at a calculus minimum.
∇Φ(f) ≥ 0 → f = 0: when it is not, a small decrease of f would reduce the function value, but that is where the constraint is reached.
o Rearranging:
𝑓_{𝜆1,𝜆2} = max(0, (𝐾^𝑇𝐾 + 𝜆2²𝐼)^(−1)(𝐾^𝑇𝑠 − 𝜆1))
o We can use the SVD to obtain more insight into the Tikhonov solution:
𝑓_{𝜆1,𝜆2} = max(0, 𝑉(Σ² + 𝜆2²𝐼)^(−1)(Σ𝑈^𝑇𝑠 − 𝑉^𝑇𝜆1))
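The equivalence of the direct and SVD forms can be verified on synthetic data (a sketch; K, s, and the λ values are random test inputs, and λ1 enters as the constant vector λ1·𝟙):

```python
import numpy as np

rng = np.random.default_rng(7)
K = rng.standard_normal((12, 6))
s = rng.standard_normal(12)
lam1, lam2 = 0.1, 0.5
n = K.shape[1]

# Direct form: f = max(0, (K^T K + lam2^2 I)^-1 (K^T s - lam1)).
f_direct = np.maximum(
    0.0, np.linalg.solve(K.T @ K + lam2**2 * np.eye(n), K.T @ s - lam1))

# SVD form: f = max(0, V (S^2 + lam2^2 I)^-1 (S U^T s - V^T lam1)).
U, sv, Vt = np.linalg.svd(K, full_matrices=False)
inner = (sv * (U.T @ s) - Vt @ (lam1 * np.ones(n))) / (sv**2 + lam2**2)
f_svd = np.maximum(0.0, Vt.T @ inner)
```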
50. o The Primal-Dual Convex Optimization solver (PDCO) is a state-of-the-art optimization solver implemented in Matlab.
o It applies a primal-dual interior method to linearly constrained optimization problems with a convex objective function.
51. o The problems are assumed to be of the following form:
min_{𝑓,𝑟} ‖𝑟‖₂² + ‖𝐷1𝑓‖₂² + 𝜑(𝑓)
s.t. 𝐴𝑓 + 𝐷2𝑟 = 𝑏, 𝑙 ≤ 𝑓 ≤ 𝑢
o Where f and r are variables, and 𝐷1 and 𝐷2 are positive-definite diagonal matrices.
o Each PDCO iteration generates search directions ∆f and ∆y for the primal variables f and the dual variables y associated with 𝐴𝑓 + 𝐷2𝑟 = 𝑏.
52. o Until recently, many models used only the l2 penalty function because the solving methods are simple and fast.
o The introduction of the least-absolute-values penalty, l1, to model fitting has greatly improved many applications.
o We adopt a hybrid between l1 and l2.
53. o Hybrid function 𝐻𝑦𝑏_𝑐(𝑓′) = Σᵢ 𝑔(𝑓′ᵢ) with a regularization parameter c, where
𝑔(𝑓′ᵢ) = 𝑓′ᵢ²/(2𝑐)   if 𝑓′ᵢ ≤ 𝑐
𝑔(𝑓′ᵢ) = 𝑓′ᵢ − 𝑐/2   if 𝑓′ᵢ > 𝑐
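This is a Huber-like penalty: quadratic (l2-like) below the threshold c, linear (l1-like) above it, with the two branches meeting continuously at f′ᵢ = c with value c/2. A minimal sketch (written with |f| so it also covers signed inputs, though here f′ ≥ 0):

```python
import numpy as np

def hyb_c(f, c):
    """Hybrid (l1/l2, Huber-like) penalty, applied elementwise and summed.

    Quadratic below the threshold c, linear above it; the branches
    join continuously at |f_i| = c with value c/2.
    """
    f = np.asarray(f, dtype=float)
    g = np.where(np.abs(f) <= c, f**2 / (2 * c), np.abs(f) - c / 2)
    return g.sum()
```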
55. o The most popular entropy functional is the Shannon entropy formula.
o The entropy function 𝐸𝑛𝑡_𝑎(𝑓′) = Σᵢ (𝑓′ᵢ/𝑎) log(𝑓′ᵢ/𝑎), with a regularization parameter a.
o Its origins are in information theory.
o The motivation: it does not introduce correlations into the data beyond those which are required by the data.
𝐸(𝑥) = Σᵢ₌₁ⁿ 𝑥ᵢ log(𝑥ᵢ)
𝐺𝑟𝑎𝑑(𝐸(𝑥)) = log(𝑥) + 1
𝐻𝑒𝑠𝑠(𝐸(𝑥)) = 𝑑𝑖𝑎𝑔(1/𝑥ᵢ)
58. o The solution method proposed in [] is the mathematical formulation of the linearly constrained convex problem:
min_{𝑓′} 𝜆1‖𝑘′𝑓′ − 𝑏′‖₂² + 𝜆2‖𝑓′‖₂² + 𝜑(𝑓′)
s.t. 𝑘′𝑓′ + 𝑟 = 𝑏′, 𝑓′ ≥ 0
o Where k′ is the Kronecker tensor product of K1 and K2,
o f′ is the unknown spectrum vector,
o b′ is the transformed measurements vector,
o r is the residual vector,
o and the convex function 𝜑(𝑓′) is either the entropy function 𝐸𝑛𝑡_𝑎(𝑓′) with a regularization parameter a or the hybrid function 𝐻𝑦𝑏_𝑐(𝑓′) with a regularization parameter c.
66. o Our algorithm produces reconstructions of far greater quality than the other methods, but at the cost of convergence time that comes from the need to tune several parameters.
o In contrast, our approach keeps the reconstruction quality regardless of the data structure and data size.
o We have taken advantage of the inherent stability of the 2D Picard condition to regularize the solution and make it less sensitive to perturbations in the measurement array.
o As a result, all required quantities, such as the gradient and Hessian-vector products, are computed with reduced memory storage and computation time.