The document discusses least-square optimization and sparse linear systems. It introduces least-square optimization as a technique for finding approximate solutions when no exact solution exists, using the example of fitting a line of best fit through three points: the objective is to minimize the sum of squared distances between the line and the points. Solving the optimization problem yields a set of linear equations that can be solved with techniques such as the pseudo-inverse or the conjugate gradient method. Sparse linear systems, whose matrices contain many zero entries, can be solved far more efficiently than dense systems.
Least-Square Optimization and Sparse Linear Systems
1. Topics In Digital Contents Signals – Special Issue 01
The Least-Square Optimization and Sparse Linear System Solver
Presented by Ji-yong Kwon
Visual Computing Lab.
2. Outline
• What is the least-square optimization?
– Optimization
– Least-square optimization
– Application to computer graphics
• Poisson image cloning
• What is the sparse linear system?
  – Dense matrix vs. sparse matrix
– Steepest-descent approach
– Conjugate Gradient method
3. Reference
• Valuable reading materials
– Practical Least-Squares for Computer Graphics
• Pighin and Lewis
• ACM SIGGRAPH 2007 course note
• http://graphics.stanford.edu/~jplewis/lscourse/ls.pdf
– An Introduction to the Conjugate Gradient Method Without the
Agonizing Pain
• J. R. Shewchuk
• CMU tech. report
• http://www.cs.cmu.edu/~quake-papers/painless-conjugate-gradient.pdf
5. Simple Example
• Example
  – Line equation passing through two points on the a-b plane
    • One unique solution

Line equation: $ax + by + 1 = 0$
We know: $(a_1, b_1)$, $(a_2, b_2)$

$a_1 x + b_1 y + 1 = 0$
$a_2 x + b_2 y + 1 = 0$

$\Rightarrow\;
\begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \end{bmatrix}
\begin{bmatrix} x \\ y \end{bmatrix} =
\begin{bmatrix} -1 \\ -1 \end{bmatrix},
\qquad
\begin{bmatrix} x \\ y \end{bmatrix} =
\frac{1}{a_1 b_2 - a_2 b_1}
\begin{bmatrix} b_2 & -b_1 \\ -a_2 & a_1 \end{bmatrix}
\begin{bmatrix} -1 \\ -1 \end{bmatrix}$
6. Simple Example
• Example
  – Line equation passing through three points on the a-b plane
    • No exact solution

Line equation: $ax + by + 1 = 0$
We know: $(a_1, b_1)$, $(a_2, b_2)$, $(a_3, b_3)$

$a_1 x + b_1 y + 1 = 0$
$a_2 x + b_2 y + 1 = 0$
$a_3 x + b_3 y + 1 = 0$

$\Rightarrow\;
\begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \\ a_3 & b_3 \end{bmatrix}
\begin{bmatrix} x \\ y \end{bmatrix} =
\begin{bmatrix} -1 \\ -1 \\ -1 \end{bmatrix}$
7. Why Optimization?
• Observation
  – Unfortunately, many problems do not have a unique solution:
    • too many solutions, or
    • no exact solution.
  – Concept of optimization
    • Find an approximate solution that
      – does not satisfy the conditions exactly,
      – but satisfies them as closely as possible.
• Strategy
  – Set up an objective (or energy) function.
  – Find a solution that minimizes (or maximizes) the objective function.
8. Why Optimization?
• Objective function
  – A.K.A. energy function
  – Input: a set of variables that we want to know
  – Output: a scalar value
  – The output value estimates the quality of a solution.
    • Generally, a small output value (small energy) → good solution.
  – A solution that minimizes the output value of the objective function → optimized solution.
  – Designing a good objective function is the most important task in optimization.
9. Simple Example
• Example again,
  – Line equation passing through three points on the a-b plane
    • No exact solution,
    • but we can compute an approximate solution.
  – Passing through all points is impossible,
  – so find the line that minimizes the distances from all points.
10. Objective Function
• Example again,
  – How to compute the 'distances'?
    • Setting the objective function

Point on the line: $ax + by + 1 = 0$
Point off the line: $ax + by + 1 > 0$ or $ax + by + 1 < 0$

Objective function:
$O(x, y) = \sum_{i=1}^{3} \left( a_i x + b_i y + 1 \right)^2$
11. Optimization Problem
• Problem description
  – Find the line coefficients (x, y) that minimize a sum of squared distances between the line and the given points.
  – Mathematically,

$\text{minimize} \; \sum_{i=1}^{3} \left( a_i x + b_i y + 1 \right)^2$

  – More compact description:

$(x_o, y_o) = \operatorname*{argmin}_{x,\,y} \; \sum_{i=1}^{3} \left( a_i x + b_i y + 1 \right)^2$
12. Solution
• Solution of the example
  – The objective function has a parabolic shape
    → it is minimized at the zero gradient of the function.

$O(x, y) = \sum_{i=1}^{3} \left( a_i x + b_i y + 1 \right)^2$

$\frac{\partial O}{\partial x} = \sum_{i=1}^{3} 2 \left( a_i x + b_i y + 1 \right) a_i
= 2 \left( \sum_{i=1}^{3} a_i^2 \, x + \sum_{i=1}^{3} a_i b_i \, y + \sum_{i=1}^{3} a_i \right) = 0$

$\frac{\partial O}{\partial y} = \sum_{i=1}^{3} 2 \left( a_i x + b_i y + 1 \right) b_i
= 2 \left( \sum_{i=1}^{3} a_i b_i \, x + \sum_{i=1}^{3} b_i^2 \, y + \sum_{i=1}^{3} b_i \right) = 0$
13. Solution
• Solution of the example

$\sum_{i=1}^{3} a_i^2 \, x + \sum_{i=1}^{3} a_i b_i \, y = -\sum_{i=1}^{3} a_i$
$\sum_{i=1}^{3} a_i b_i \, x + \sum_{i=1}^{3} b_i^2 \, y = -\sum_{i=1}^{3} b_i$

$\begin{bmatrix} \sum a_i^2 & \sum a_i b_i \\ \sum a_i b_i & \sum b_i^2 \end{bmatrix}
\begin{bmatrix} x \\ y \end{bmatrix} =
\begin{bmatrix} -\sum a_i \\ -\sum b_i \end{bmatrix}
\;\Rightarrow\;
\begin{bmatrix} x \\ y \end{bmatrix} =
\begin{bmatrix} \sum a_i^2 & \sum a_i b_i \\ \sum a_i b_i & \sum b_i^2 \end{bmatrix}^{-1}
\begin{bmatrix} -\sum a_i \\ -\sum b_i \end{bmatrix}$
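To make the closed-form solve concrete, here is a minimal sketch in plain C that accumulates the sums above and solves the 2x2 normal equations by Cramer's rule; the three point values are invented for illustration, not data from the slides:

    #include <stdio.h>

    int main(void)
    {
        /* Three hypothetical points (a_i, b_i) on the a-b plane. */
        double a[3] = { 1.0, 2.0, 3.0 };
        double b[3] = { 2.0, 3.0, 5.0 };

        /* Accumulate the sums appearing in the normal equations. */
        double Saa = 0.0, Sab = 0.0, Sbb = 0.0, Sa = 0.0, Sb = 0.0;
        for (int i = 0; i < 3; ++i) {
            Saa += a[i] * a[i];
            Sab += a[i] * b[i];
            Sbb += b[i] * b[i];
            Sa  += a[i];
            Sb  += b[i];
        }

        /* Solve [Saa Sab; Sab Sbb][x; y] = [-Sa; -Sb] by Cramer's rule. */
        double det = Saa * Sbb - Sab * Sab;
        double x = (-Sa * Sbb + Sb * Sab) / det;
        double y = (-Sb * Saa + Sa * Sab) / det;

        printf("best-fit line: %g a + %g b + 1 = 0\n", x, y);
        return 0;
    }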
14. Squared Distance
• Why 'squared'?
  – Naive sum
    • Each distance can be positive or negative,
    • so a sum of signed distances does not estimate the quality of the solution.
  – Sum of absolute distances
    • Distances are 0 or positive,
    • but the minimum point cannot be computed easily
      (not differentiable at the minimum point).
  – Sum of squared distances
    • Distances are 0 or positive,
    • differentiable at the minimum point,
    • and the shape of the squared function is parabolic.
$O(x, y) = \sum_{i=1}^{3} \left( a_i x + b_i y + 1 \right)$

$O(x, y) = \sum_{i=1}^{3} \left| a_i x + b_i y + 1 \right|$

$O(x, y) = \sum_{i=1}^{3} \left( a_i x + b_i y + 1 \right)^2$
15. Another Solution
• Pseudo-inverse
  – The ordinary inverse can be computed only when the matrix is square (and non-singular).
  – The pseudo-inverse matrix, for $A\mathbf{x} = \mathbf{b}$:

$A^{+} = (A^T A)^{-1} A^T
\;\Leftarrow\; A^{+} A = I
\qquad \left( \text{cf. } A^{-1} A = I \right)$

  – From the example before:

$\begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \\ a_3 & b_3 \end{bmatrix}
\begin{bmatrix} x \\ y \end{bmatrix} =
\begin{bmatrix} -1 \\ -1 \\ -1 \end{bmatrix}
\;\Rightarrow\;
\mathbf{x} = (A^T A)^{-1} A^T \mathbf{b}$

  – Solution computed by using the pseudo-inverse method
    = solution computed by using least-square optimization.
16. Background
• Background on matrix differentiation
  – Very convenient technique for deriving matrix systems
  – Reference
    • A. M. Mathai, 'Jacobians of Matrix Transformations and Functions of Matrix Argument', World Scientific Publishing, 1997
  – Contents covered in this lecture:
    • a scalar-valued function of a vector,
    • a vector-valued function of a vector.

$y = f(\mathbf{x}), \quad \mathbf{x} = [x_1, x_2, \ldots, x_p]^T
\;\Rightarrow\;
\frac{\partial y}{\partial \mathbf{x}} =
\left[ \frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \ldots, \frac{\partial f}{\partial x_p} \right]^T$
17. Background
• Theorem 1
  – Let $\mathbf{x} = [x_1, x_2, \ldots, x_p]^T$ be the vector of variables and
    $\mathbf{a} = [a_1, a_2, \ldots, a_p]^T$ be a constant vector, then:

$y = \mathbf{a}^T \mathbf{x} \;\Rightarrow\; \frac{\partial y}{\partial \mathbf{x}} = \mathbf{a}$

$y = \mathbf{x}^T \mathbf{x} \;\Rightarrow\; \frac{\partial y}{\partial \mathbf{x}} = 2\mathbf{x}$

$y = \mathbf{x}^T A \mathbf{x} \;\Rightarrow\; \frac{\partial y}{\partial \mathbf{x}} = (A + A^T)\,\mathbf{x}$
18. Background
• Proof of Theorem 1, with $\mathbf{x} = [x_1, x_2, \ldots, x_p]^T$ and $\mathbf{a} = [a_1, a_2, \ldots, a_p]^T$:

1) $y = \mathbf{a}^T \mathbf{x} = a_1 x_1 + a_2 x_2 + \cdots + a_p x_p
\;\Rightarrow\; \frac{\partial y}{\partial x_i} = a_i
\;\Rightarrow\; \frac{\partial y}{\partial \mathbf{x}} =
\left[ \frac{\partial y}{\partial x_1}, \frac{\partial y}{\partial x_2}, \ldots, \frac{\partial y}{\partial x_p} \right]^T = \mathbf{a}$

2) $y = \mathbf{x}^T \mathbf{x} = x_1^2 + x_2^2 + \cdots + x_p^2
\;\Rightarrow\; \frac{\partial y}{\partial x_i} = 2 x_i
\;\Rightarrow\; \frac{\partial y}{\partial \mathbf{x}} = 2\mathbf{x}$

3) $y = \mathbf{x}^T A \mathbf{x} = \sum_{i=1}^{p} \sum_{j=1}^{p} a_{ij} x_i x_j
\;\Rightarrow\; \frac{\partial y}{\partial x_i} = \sum_{j=1}^{p} a_{ij} x_j + \sum_{j=1}^{p} a_{ji} x_j
\;\Rightarrow\; \frac{\partial y}{\partial \mathbf{x}} = (A + A^T)\,\mathbf{x}$
19. Background
• Theorem 2
  – Let $\mathbf{x} = [x_1, x_2, \ldots, x_p]^T$ and $\mathbf{y} = [y_1, y_2, \ldots, y_p]^T$, then

$J = \frac{\partial \mathbf{y}}{\partial \mathbf{x}} =
\left[ \frac{\partial y_i}{\partial x_j} \right] =
\begin{bmatrix}
\frac{\partial y_1}{\partial x_1} & \frac{\partial y_1}{\partial x_2} & \cdots & \frac{\partial y_1}{\partial x_p} \\
\frac{\partial y_2}{\partial x_1} & \frac{\partial y_2}{\partial x_2} & \cdots & \frac{\partial y_2}{\partial x_p} \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial y_p}{\partial x_1} & \frac{\partial y_p}{\partial x_2} & \cdots & \frac{\partial y_p}{\partial x_p}
\end{bmatrix},
\qquad d\mathbf{y} = J \, d\mathbf{x}$
20. Matrix Formulation
• Example again,
  – can be described in matrix form:

$O(x, y) = \sum_{i=1}^{3} \left( a_i x + b_i y + 1 \right)^2
= \left\| \begin{bmatrix} a_1 x + b_1 y + 1 \\ a_2 x + b_2 y + 1 \\ a_3 x + b_3 y + 1 \end{bmatrix} \right\|^2
= \left\| \begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \\ a_3 & b_3 \end{bmatrix}
\begin{bmatrix} x \\ y \end{bmatrix} -
\begin{bmatrix} -1 \\ -1 \\ -1 \end{bmatrix} \right\|^2$

$\Rightarrow\; O(\mathbf{x}) = \left\| A\mathbf{x} - \mathbf{b} \right\|^2
= (A\mathbf{x} - \mathbf{b})^T (A\mathbf{x} - \mathbf{b})$
21. Matrix Formulation
• Matrix formulation of the least-square optimization:

$\frac{\partial O}{\partial \mathbf{x}}
= \frac{\partial}{\partial \mathbf{x}} (A\mathbf{x} - \mathbf{b})^T (A\mathbf{x} - \mathbf{b})
= \frac{\partial}{\partial \mathbf{x}} \left( \mathbf{x}^T A^T A \mathbf{x} - \mathbf{x}^T A^T \mathbf{b} - \mathbf{b}^T A \mathbf{x} + \mathbf{b}^T \mathbf{b} \right)$

$= \frac{\partial}{\partial \mathbf{x}} \left( \mathbf{x}^T A^T A \mathbf{x} - 2 \mathbf{b}^T A \mathbf{x} + \mathbf{b}^T \mathbf{b} \right)
= 2 A^T A \mathbf{x} - 2 A^T \mathbf{b} = \mathbf{0}$

$\therefore\; A^T A \mathbf{x} = A^T \mathbf{b}
\qquad \therefore\; \mathbf{x} = (A^T A)^{-1} A^T \mathbf{b}$

  – Can be solved by using a linear system solver.
22. Constraints
• Slightly different example
  – Line that minimizes the distances from the red points $(a_1, b_1)$, $(a_2, b_2)$, $(a_3, b_3)$,
  – with one additional constraint:
    • this line should pass through the white point $(a_c, b_c)$.

$\text{minimize} \; \sum_{i=1}^{3} \left( a_i x + b_i y + 1 \right)^2
\qquad \text{subject to} \; a_c x + b_c y + 1 = 0$
23. Constraints
• Constraints
  – A.K.A. hard constraints
    • cf. soft constraints → objective (energy) terms
  – Conditions that must be satisfied exactly
  – Constrained optimization
    • Optimization with some constraints
  – Constraints can be (linear / non-linear) and (equality / inequality);
    this lecture only covers linear equality constraints.
24. Constrained Optimization
• Lagrange multiplier
  – A constrained optimization can be expressed as an unconstrained optimization with a Lagrange multiplier:

minimize $O(x, y)$ subject to $C(x, y) = c$
$\;\Rightarrow\;$ minimize $O(x, y) + \lambda \left( C(x, y) - c \right)$

  – Why is this possible?
    • This lecture does not cover the theory of Lagrange multipliers.
    • Reference
      – http://en.wikipedia.org/wiki/Lagrange_multipliers
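As a quick illustration (a toy example, not from the slides), consider minimizing $x^2 + y^2$ subject to $x + y = 1$:

$L(x, y, \lambda) = x^2 + y^2 + \lambda (x + y - 1)$

$\frac{\partial L}{\partial x} = 2x + \lambda = 0, \quad
\frac{\partial L}{\partial y} = 2y + \lambda = 0, \quad
\frac{\partial L}{\partial \lambda} = x + y - 1 = 0
\;\Rightarrow\; x = y = \tfrac{1}{2}, \; \lambda = -1$

Setting the gradient of the augmented objective to zero recovers both the optimality condition and the constraint, which is exactly the mechanism used on the next slide.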
25. Constrained Optimization
• Solution of constrained optimization
  – At the minimum point, the gradient of the objective function should be zero:

$\operatorname*{argmin}_{\mathbf{x}} \; \frac{1}{2} \left\| A\mathbf{x} - \mathbf{b} \right\|^2 + \lambda \left( \mathbf{c}^T \mathbf{x} + 1 \right)$

$\frac{\partial O}{\partial \mathbf{x}} = A^T A \mathbf{x} - A^T \mathbf{b} + \lambda \mathbf{c} = \mathbf{0},
\qquad
\frac{\partial O}{\partial \lambda} = \mathbf{c}^T \mathbf{x} + 1 = 0$

$\begin{bmatrix} A^T A & \mathbf{c} \\ \mathbf{c}^T & 0 \end{bmatrix}
\begin{bmatrix} \mathbf{x} \\ \lambda \end{bmatrix} =
\begin{bmatrix} A^T \mathbf{b} \\ -1 \end{bmatrix}
\quad\Leftrightarrow\quad \hat{A}\hat{\mathbf{x}} = \hat{\mathbf{b}}$

  – Also can be solved by using a linear system solver.
26. Constrained Optimization
• Case of multiple constraints
  – Multiple Lagrange multipliers:

$\operatorname*{argmin}_{\mathbf{x}} \; \frac{1}{2} \left\| A\mathbf{x} - \mathbf{b} \right\|^2 + \boldsymbol{\lambda}^T \left( C\mathbf{x} - \mathbf{c} \right)$

$\frac{\partial O}{\partial \mathbf{x}} = A^T A \mathbf{x} - A^T \mathbf{b} + C^T \boldsymbol{\lambda} = \mathbf{0},
\qquad
\frac{\partial O}{\partial \boldsymbol{\lambda}} = C\mathbf{x} - \mathbf{c} = \mathbf{0}$

$\begin{bmatrix} A^T A & C^T \\ C & 0 \end{bmatrix}
\begin{bmatrix} \mathbf{x} \\ \boldsymbol{\lambda} \end{bmatrix} =
\begin{bmatrix} A^T \mathbf{b} \\ \mathbf{c} \end{bmatrix}
\quad\Leftrightarrow\quad \hat{A}\hat{\mathbf{x}} = \hat{\mathbf{b}}$
27. Implementation
• How to solve the linear system?
  – Provided by many libraries.
  – Using OpenCV:
    • Data structure for storing a matrix
      – CvMat *aMat = cvCreateMat(nRow, nCol, CV_32F);
      – cvReleaseMat(&aMat);
    • Set and get an element of a matrix
      – cvmSet(aMat, m, n, 1.0f);
      – float mn = cvmGet(aMat, m, n);
    • Linear operations
      – cvAdd(aMat, bMat, cMat);
      – cvGEMM(aMat, aMat, 1.0f, NULL, 0.0f, ataMat, CV_GEMM_A_T);  // ataMat = A^T A
    • Solver
      – cvSolve(aMat, bVect, xVect, CV_LU);
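Putting these calls together, here is a hedged end-to-end sketch of the three-point line fit via the normal equations, using the legacy OpenCV C API shown above; the point values are invented for illustration, and the header path may differ across OpenCV versions:

    #include <opencv/cv.h>
    #include <stdio.h>

    int main(void)
    {
        /* Three hypothetical points (a_i, b_i). */
        float pts[3][2] = { {1.0f, 2.0f}, {2.0f, 3.0f}, {3.0f, 5.0f} };

        CvMat *A   = cvCreateMat(3, 2, CV_32F);  /* rows: [a_i, b_i]    */
        CvMat *b   = cvCreateMat(3, 1, CV_32F);  /* right-hand side: -1 */
        CvMat *AtA = cvCreateMat(2, 2, CV_32F);
        CvMat *Atb = cvCreateMat(2, 1, CV_32F);
        CvMat *x   = cvCreateMat(2, 1, CV_32F);  /* unknown line coeffs */

        for (int i = 0; i < 3; ++i) {
            cvmSet(A, i, 0, pts[i][0]);
            cvmSet(A, i, 1, pts[i][1]);
            cvmSet(b, i, 0, -1.0f);
        }

        /* Normal equations: (A^T A) x = A^T b */
        cvGEMM(A, A, 1.0, NULL, 0.0, AtA, CV_GEMM_A_T);
        cvGEMM(A, b, 1.0, NULL, 0.0, Atb, CV_GEMM_A_T);
        cvSolve(AtA, Atb, x, CV_LU);

        printf("line: %f a + %f b + 1 = 0\n", cvmGet(x, 0, 0), cvmGet(x, 1, 0));

        cvReleaseMat(&A);   cvReleaseMat(&b);
        cvReleaseMat(&AtA); cvReleaseMat(&Atb); cvReleaseMat(&x);
        return 0;
    }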
28. Practical Example
• Poisson image cloning
  – An interesting approach to image compositing:
  – paste the modified gradient of the source image while satisfying the boundary colors.
29. Poisson Image Cloning
• Problem description
  – Source image pixels $s_{x,y}$
  – Target image pixels $t_{x,y}$
  – Unknown new image pixels $n_{x,y}$
  – Objective
    • Minimize the difference of gradients between the new image and the source image.
  – Constraint
    • Pixel values at the boundary should equal those of the target image.
  – One concrete formulation is sketched below.
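A common way to write this down (a sketch following the usual discrete Poisson formulation; the region $\Omega$ and the 4-neighbor pairs are notation assumed here, not defined on the slide):

$\min_{\mathbf{n}} \; \sum_{\langle p, q \rangle \in \Omega} \Big( (n_p - n_q) - (s_p - s_q) \Big)^2
\qquad \text{subject to} \; n_p = t_p \;\; \text{for } p \in \partial\Omega$

where $\langle p, q \rangle$ ranges over 4-neighbor pixel pairs inside the pasted region $\Omega$, and $\partial\Omega$ is its boundary. Each term is linear in the unknowns $n_p$, so the whole problem is exactly the constrained least-square form of slides 25–26.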
35. Dense vs. Sparse
• In the previous example,
  – assume that the size of the composite image is 200 x 200:
  – 200 x 200 → 40,000 pixels → 40,000 unknowns.
  – We should solve the linear system $A\mathbf{x} = \mathbf{b}$.
  – Size of A: 40,000 x 40,000
    → 1,600,000,000 elements → 1,600,000,000 floats
    → 6,400,000,000 bytes → about 6.4 GB (at 4 bytes per float).
  – Computing the inverse of a (40,000 x 40,000) matrix is very, very expensive.
36. Dense vs. Sparse
• Concept of the dense / sparse matrix
  – Dense matrix
    • A matrix that has a small number of zero elements
  – Sparse matrix
    • A matrix that has a large number of zero elements
• Storing the dense / sparse matrix, e.g. for
  $\begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 2 \end{bmatrix}$
  – Dense matrix
    • Store all elements: [1, 0, 0; 0, -1, 0; 0, 0, 2]
  – Sparse matrix
    • Store only non-zero elements as (row, col, value) triplets:
      [(1,1,1), (2,2,-1), (3,3,2)]  (see the C sketch below)
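For illustration, a minimal triplet (COO) representation in C, with the matrix-vector product that makes sparsity pay off; this is a sketch only — real libraries use more refined formats such as CSR:

    typedef struct { int row, col; float val; } Triplet;

    typedef struct {
        int nRows, nnz;   /* matrix height and number of non-zeros */
        Triplet *elems;   /* only the non-zero entries are stored  */
    } SparseMat;

    /* y = A * x : touches only the stored non-zeros, O(nnz) work
       instead of the O(n^2) work a dense product would need. */
    void sparse_mul(const SparseMat *A, const float *x, float *y)
    {
        for (int i = 0; i < A->nRows; ++i)
            y[i] = 0.0f;
        for (int k = 0; k < A->nnz; ++k)
            y[A->elems[k].row] += A->elems[k].val * x[A->elems[k].col];
    }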
37. Dense vs. Sparse
• Linear system of Poisson image cloning:

$\begin{bmatrix} G^T G & B^T \\ B & 0 \end{bmatrix}
\begin{bmatrix} \mathbf{n} \\ \boldsymbol{\lambda} \end{bmatrix} =
\begin{bmatrix} G^T G \, \mathbf{s} \\ B \, \mathbf{t} \end{bmatrix}$

  – $G^T G$ has at most 5 non-zero elements per row (see the stencil below).
  – The number of non-zero elements of $B$ is the same as the number of boundary pixels.
  – For 200 x 200 images: (40,000 x 5 + α) elements
    → efficient multiplication.
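Assuming a standard 4-neighbor (finite-difference) gradient operator $G$, each interior row of $G^T G$ is the familiar 5-point Laplacian stencil, which is where the "at most 5 non-zero elements per row" comes from:

$(G^T G \, \mathbf{n})_{x,y} = 4 n_{x,y} - n_{x-1,y} - n_{x+1,y} - n_{x,y-1} - n_{x,y+1}$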
38. Steepest-Descent Method
• How to solve the sparse linear system?
  – Computing the inverse is expensive
    → find an optimized solution iteratively.
  – Strategy of the iterative method:
    • Set the initial solution.
    • Until the objective value converges,
      – compute the gradient of the objective function at the current solution,
      – move the solution in the direction opposite to the gradient:

$\mathbf{x}_{i+1} = \mathbf{x}_i - \alpha \left. \frac{\partial O}{\partial \mathbf{x}} \right|_{\mathbf{x}_i}$
39. Steepest-Descent Method
• Steepest-descent method for the linear system $A\mathbf{x} = \mathbf{b}$
  – Gradient of the objective function (assume A is symmetric):

$f(\mathbf{x}) = \frac{1}{2} \mathbf{x}^T A \mathbf{x} - \mathbf{b}^T \mathbf{x} + c
\;\Rightarrow\; \frac{\partial f}{\partial \mathbf{x}} = A\mathbf{x} - \mathbf{b}$

  – Next iteration:

$\mathbf{x}_{i+1} = \mathbf{x}_i + \alpha \left( \mathbf{b} - A\mathbf{x}_i \right) = \mathbf{x}_i + \alpha \, \mathbf{r}_i$

  – How to determine the step size $\alpha$?
40. Steepest-Descent Method
• Determine the optimal step
  – $\alpha$ minimizes the objective function where the gradient with respect to $\alpha$ is zero:

$\frac{d}{d\alpha} f(\mathbf{x}_{i+1})
= f'(\mathbf{x}_{i+1})^T \frac{d}{d\alpha} \mathbf{x}_{i+1}
= -\mathbf{r}_{i+1}^T \mathbf{r}_i = 0$

$\left( \mathbf{b} - A(\mathbf{x}_i + \alpha \mathbf{r}_i) \right)^T \mathbf{r}_i = 0
\;\Rightarrow\; \left( \mathbf{b} - A\mathbf{x}_i \right)^T \mathbf{r}_i - \alpha \left( A\mathbf{r}_i \right)^T \mathbf{r}_i = 0
\;\Rightarrow\; \mathbf{r}_i^T \mathbf{r}_i - \alpha \, \mathbf{r}_i^T A \mathbf{r}_i = 0$

$\therefore\; \alpha = \frac{\mathbf{r}_i^T \mathbf{r}_i}{\mathbf{r}_i^T A \mathbf{r}_i}$
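Using the toy SparseMat sketched earlier, the steepest-descent loop with the optimal step derived above might look like this; an illustrative sketch for symmetric positive-definite A, not production code:

    #include <stdlib.h>

    void steepest_descent(const SparseMat *A, const float *b, float *x,
                          int n, int maxIter, float tol)
    {
        float *r  = malloc(n * sizeof(float));   /* residual r = b - Ax */
        float *Ar = malloc(n * sizeof(float));

        for (int iter = 0; iter < maxIter; ++iter) {
            /* r = b - A x */
            sparse_mul(A, x, r);
            for (int i = 0; i < n; ++i) r[i] = b[i] - r[i];

            /* alpha = (r^T r) / (r^T A r), the optimal step from slide 40 */
            sparse_mul(A, r, Ar);
            float rr = 0.0f, rAr = 0.0f;
            for (int i = 0; i < n; ++i) { rr += r[i] * r[i]; rAr += r[i] * Ar[i]; }
            if (rr < tol) break;

            float alpha = rr / rAr;
            for (int i = 0; i < n; ++i) x[i] += alpha * r[i];
        }
        free(r); free(Ar);
    }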
41. Conjugate Gradient Method
• Performance of the steepest-descent method
  – The simplest iterative algorithm for solving a linear system,
  – but convergence is slow,
    • especially near the optimized point.
• Conjugate Gradient Method
  – One of the most popular methods for solving sparse linear systems
  – Uses the gradient direction and its conjugate directions to find the solution
  – In exact arithmetic, the conjugate gradient method converges in at most N iterations for an (N x N) matrix system (see the sketch below).
  – This lecture does not cover the details of CGM.
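For comparison, a textbook (unpreconditioned) conjugate gradient loop in the same style, following the algorithm in Shewchuk's notes; again a sketch built on the toy SparseMat, assuming symmetric positive-definite A:

    #include <stdlib.h>

    void conjugate_gradient(const SparseMat *A, const float *b, float *x,
                            int n, int maxIter, float tol)
    {
        float *r = malloc(n * sizeof(float));   /* residual          */
        float *d = malloc(n * sizeof(float));   /* search direction  */
        float *q = malloc(n * sizeof(float));   /* q = A d           */

        /* r = b - A x; first direction is the residual itself */
        sparse_mul(A, x, r);
        for (int i = 0; i < n; ++i) { r[i] = b[i] - r[i]; d[i] = r[i]; }

        float deltaNew = 0.0f;
        for (int i = 0; i < n; ++i) deltaNew += r[i] * r[i];

        for (int iter = 0; iter < maxIter && deltaNew > tol; ++iter) {
            sparse_mul(A, d, q);
            float dq = 0.0f;
            for (int i = 0; i < n; ++i) dq += d[i] * q[i];
            float alpha = deltaNew / dq;

            for (int i = 0; i < n; ++i) { x[i] += alpha * d[i]; r[i] -= alpha * q[i]; }

            float deltaOld = deltaNew;
            deltaNew = 0.0f;
            for (int i = 0; i < n; ++i) deltaNew += r[i] * r[i];

            /* next direction is A-conjugate to the previous ones */
            float beta = deltaNew / deltaOld;
            for (int i = 0; i < n; ++i) d[i] = r[i] + beta * d[i];
        }
        free(r); free(d); free(q);
    }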
42. Conjugate Gradient Method
• Many improved versions of CGM exist
  – for more stable and faster convergence:
  – Preconditioned Conjugate Gradient Method
  – Conjugate Gradient Squared Method
  – Bi-Conjugate Gradient Method
  – Bi-Conjugate Gradient Stabilized Method
  – Reference
    • J. R. Shewchuk, An Introduction to the Conjugate Gradient Method Without the Agonizing Pain, CMU tech. report
    • Wikipedia
43. Implementation
• Libraries for sparse linear solvers
  – TAUCS
    • http://www.tau.ac.il/~stoledo/taucs/
  – OpenNL
    • http://alice.loria.fr/index.php/software/4-library/23-opennl.html
• My own library for CGM
  – Made by Ji-yong Kwon
  – Simple implementation for dense / sparse matrices
  – Features
    • 2 types of data structure: CDenseVect, CSparseMat
    • Linear operations between them
    • A set of sparse matrix solvers (CGM, BiCGSTAB)
    • Multi-core processing
44. Implementation
• Basic usage
  – CSparseMat aMat;
    CDenseVect bVect, xVect;
    CSparseSolverBiCGSTAB solver;
  – Memory allocation
    • bVect.Init(nRow);
    • aMat.Create(nRow, nCol, 32);  // maximum number of elements per row
    • Automatic de-allocation
  – Set/get element
    • bVect.Set(row, 1.0f);
      float value = bVect.Get(row);
    • aMat.AddElement(row, col, 1.0f);
      float value = aMat.GetElement(row, elementId);
45. Implementation
• Basic usage
  – Solver initialization
    • solver.InitSolver(aMat, bVect, xVect, 1.0f);
      – This function initializes the solver's state.
      – xVect should be initialized beforehand.
  – Solve
    • while(solver.CheckTermination())
          solver.OneStep(aMat, bVect, xVect);
  – Residual
    • float residual = solver.GetResidual();
  – Additional comments
    • For Visual Studio, set the project property C/C++ → Language → OpenMP Support to Yes.
    • Release mode is much faster than debug mode.
46. Summary
• Concept of least-square optimization
  – Useful for converting a hard problem into an approximate version
  – Can be solved by using a linear system solver,
    • even when the problem has multiple linear equality constraints.
• Concept of the sparse linear system
  – Solving a large linear system with a dense matrix can be very expensive
    → use iterative methods to solve the sparse linear system efficiently.
• The most important thing is to design the objective function well.