The document discusses least-square optimization and sparse linear systems. It introduces least-square optimization as a technique for finding approximate solutions when no exact solution exists, using the example of fitting a line of best fit through three points: the objective is to minimize the sum of squared distances between the line and the points. Solving the optimization problem yields a set of linear equations that can be solved with techniques such as the pseudo-inverse or the conjugate gradient method. Sparse linear systems, whose matrices contain many zero entries, can be solved far more efficiently than dense systems.
Least-Square Optimization and Sparse Linear Systems
1. Topics In Digital Contents Signals – Special Issue 01
The Least-Square Optimization and Sparse Linear System Solver
Presented by Ji-yong Kwon
Visual Computing Lab.
2. Outline
• What is the least-square optimization?
– Optimization
– Least-square optimization
– Application to computer graphics
• Poisson image cloning
• What is the sparse linear system?
  – Dense matrix vs. sparse matrix
– Steepest-descent approach
– Conjugate Gradient method
3. Reference
• Valuable reading materials
– Practical Least-Squares for Computer Graphics
• Pighin and Lewis
• ACM SIGGRAPH 2007 course note
• http://graphics.stanford.edu/~jplewis/lscourse/ls.pdf
– An Introduction to the Conjugate Gradient Method Without the
Agonizing Pain
• J. R. Shewchuk
• CMU tech. report
• http://www.cs.cmu.edu/~quake-papers/painless-conjugate-gradient.pdf
5. Simple Example
• Example
  – Line equation passing through two points on the a-b plane
    • One unique solution

Line equation: $ax + by + 1 = 0$
We know: $(a_1, b_1)$, $(a_2, b_2)$

$a_1 x + b_1 y + 1 = 0$
$a_2 x + b_2 y + 1 = 0$

$\Rightarrow\;
\begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \end{bmatrix}
\begin{bmatrix} x \\ y \end{bmatrix} =
\begin{bmatrix} -1 \\ -1 \end{bmatrix},
\qquad
\begin{bmatrix} x \\ y \end{bmatrix} =
\frac{1}{a_1 b_2 - a_2 b_1}
\begin{bmatrix} b_2 & -b_1 \\ -a_2 & a_1 \end{bmatrix}
\begin{bmatrix} -1 \\ -1 \end{bmatrix}$
6. Simple Example
• Example
  – Line equation passing through three points on the a-b plane
    • No exact solution

Line equation: $ax + by + 1 = 0$
We know: $(a_1, b_1)$, $(a_2, b_2)$, $(a_3, b_3)$

$a_1 x + b_1 y + 1 = 0$
$a_2 x + b_2 y + 1 = 0$
$a_3 x + b_3 y + 1 = 0$

$\Rightarrow\;
\begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \\ a_3 & b_3 \end{bmatrix}
\begin{bmatrix} x \\ y \end{bmatrix} =
\begin{bmatrix} -1 \\ -1 \\ -1 \end{bmatrix}$
7. Why Optimization?
• Observation
  – Unfortunately, many problems do not have a unique solution:
    • too many solutions, or
    • no exact solution.
  – Concept of optimization
    • Find an approximate solution that
      – does not satisfy the conditions exactly,
      – but satisfies them as closely as possible.
• Strategy
  – Set up an objective (or energy) function.
  – Find a solution that minimizes (or maximizes) the objective function.
8. Why Optimization?
• Objective function
  – A.K.A. energy function
  – Input: a set of variables that we want to know
  – Output: a scalar value
  – The output value estimates the quality of a solution.
    • Generally, a small output value (small energy) → good solution.
  – A solution that minimizes the output value of the objective function → optimized solution.
  – Designing a good objective function is the most important task in optimization.
9. Simple Example
• Example again,
  – Line equation passing through three points on the a-b plane
    • No exact solution,
    • but we can compute an approximate solution.
  – Passing through all points is impossible,
  – so find the line that minimizes the distances from all points.
10. Objective Function
• Example again,
  – How to compute the 'distances'?
    • Setting the objective function

Point on the line: $ax + by + 1 = 0$
Point off the line: $ax + by + 1 > 0$ or $ax + by + 1 < 0$

Objective function:
$O(x, y) = \sum_{i=1}^{3} \left( a_i x + b_i y + 1 \right)^2$
11. Optimization Problem
• Problem description
  – Find the line coefficients (x, y) that minimize a sum of squared distances between the line and the given points.
  – Mathematically,

$\text{minimize} \; \sum_{i=1}^{3} \left( a_i x + b_i y + 1 \right)^2$

  – More compact description:

$(x_o, y_o) = \operatorname*{argmin}_{x,\,y} \; \sum_{i=1}^{3} \left( a_i x + b_i y + 1 \right)^2$
12. Solution
• Solution of the example
  – The objective function has a parabolic shape
    → it is minimized at the zero gradient of the function.

$O(x, y) = \sum_{i=1}^{3} \left( a_i x + b_i y + 1 \right)^2$

$\frac{\partial O}{\partial x} = \sum_{i=1}^{3} 2 \left( a_i x + b_i y + 1 \right) a_i
= 2 \left( \sum_{i=1}^{3} a_i^2 \, x + \sum_{i=1}^{3} a_i b_i \, y + \sum_{i=1}^{3} a_i \right) = 0$

$\frac{\partial O}{\partial y} = \sum_{i=1}^{3} 2 \left( a_i x + b_i y + 1 \right) b_i
= 2 \left( \sum_{i=1}^{3} a_i b_i \, x + \sum_{i=1}^{3} b_i^2 \, y + \sum_{i=1}^{3} b_i \right) = 0$
13. Solution
• Solution of the example

$\sum_{i=1}^{3} a_i^2 \, x + \sum_{i=1}^{3} a_i b_i \, y = -\sum_{i=1}^{3} a_i$
$\sum_{i=1}^{3} a_i b_i \, x + \sum_{i=1}^{3} b_i^2 \, y = -\sum_{i=1}^{3} b_i$

$\begin{bmatrix} \sum a_i^2 & \sum a_i b_i \\ \sum a_i b_i & \sum b_i^2 \end{bmatrix}
\begin{bmatrix} x \\ y \end{bmatrix} =
\begin{bmatrix} -\sum a_i \\ -\sum b_i \end{bmatrix}
\;\Rightarrow\;
\begin{bmatrix} x \\ y \end{bmatrix} =
\begin{bmatrix} \sum a_i^2 & \sum a_i b_i \\ \sum a_i b_i & \sum b_i^2 \end{bmatrix}^{-1}
\begin{bmatrix} -\sum a_i \\ -\sum b_i \end{bmatrix}$
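To make the closed-form solve concrete, here is a minimal sketch in plain C that accumulates the sums above and solves the 2x2 normal equations by Cramer's rule; the three point values are invented for illustration, not data from the slides:

    #include <stdio.h>

    int main(void)
    {
        /* Three hypothetical points (a_i, b_i) on the a-b plane. */
        double a[3] = { 1.0, 2.0, 3.0 };
        double b[3] = { 2.0, 3.0, 5.0 };

        /* Accumulate the sums appearing in the normal equations. */
        double Saa = 0.0, Sab = 0.0, Sbb = 0.0, Sa = 0.0, Sb = 0.0;
        for (int i = 0; i < 3; ++i) {
            Saa += a[i] * a[i];
            Sab += a[i] * b[i];
            Sbb += b[i] * b[i];
            Sa  += a[i];
            Sb  += b[i];
        }

        /* Solve [Saa Sab; Sab Sbb][x; y] = [-Sa; -Sb] by Cramer's rule. */
        double det = Saa * Sbb - Sab * Sab;
        double x = (-Sa * Sbb + Sb * Sab) / det;
        double y = (-Sb * Saa + Sa * Sab) / det;

        printf("best-fit line: %g a + %g b + 1 = 0\n", x, y);
        return 0;
    }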
14. Squared Distance
• Why 'squared'?
  – Naive sum
    • Each distance can be positive or negative,
    • so a sum of signed distances does not estimate the quality of the solution.
  – Sum of absolute distances
    • Distances are 0 or positive,
    • but the minimum point cannot be computed easily
      (not differentiable at the minimum point).
  – Sum of squared distances
    • Distances are 0 or positive,
    • differentiable at the minimum point,
    • and the shape of the squared function is parabolic.
$O(x, y) = \sum_{i=1}^{3} \left( a_i x + b_i y + 1 \right)$

$O(x, y) = \sum_{i=1}^{3} \left| a_i x + b_i y + 1 \right|$

$O(x, y) = \sum_{i=1}^{3} \left( a_i x + b_i y + 1 \right)^2$
15. Another Solution
• Pseudo-inverse
  – The ordinary inverse can be computed only when the matrix is square (and non-singular).
  – The pseudo-inverse matrix, for $A\mathbf{x} = \mathbf{b}$:

$A^{+} = (A^T A)^{-1} A^T
\;\Leftarrow\; A^{+} A = I
\qquad \left( \text{cf. } A^{-1} A = I \right)$

  – From the example before:

$\begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \\ a_3 & b_3 \end{bmatrix}
\begin{bmatrix} x \\ y \end{bmatrix} =
\begin{bmatrix} -1 \\ -1 \\ -1 \end{bmatrix}
\;\Rightarrow\;
\mathbf{x} = (A^T A)^{-1} A^T \mathbf{b}$

  – Solution computed by using the pseudo-inverse method
    = solution computed by using least-square optimization.
16. Background
• Background on matrix differentiation
  – Very convenient technique for deriving matrix systems
  – Reference
    • A. M. Mathai, 'Jacobians of Matrix Transformations and Functions of Matrix Argument', World Scientific Publishing, 1997
  – Contents covered in this lecture:
    • a scalar-valued function of a vector,
    • a vector-valued function of a vector.

$y = f(\mathbf{x}), \quad \mathbf{x} = [x_1, x_2, \ldots, x_p]^T
\;\Rightarrow\;
\frac{\partial y}{\partial \mathbf{x}} =
\left[ \frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \ldots, \frac{\partial f}{\partial x_p} \right]^T$
17. Background
• Theorem 1
  – Let $\mathbf{x} = [x_1, x_2, \ldots, x_p]^T$ be the vector of variables and
    $\mathbf{a} = [a_1, a_2, \ldots, a_p]^T$ be a constant vector, then:

$y = \mathbf{a}^T \mathbf{x} \;\Rightarrow\; \frac{\partial y}{\partial \mathbf{x}} = \mathbf{a}$

$y = \mathbf{x}^T \mathbf{x} \;\Rightarrow\; \frac{\partial y}{\partial \mathbf{x}} = 2\mathbf{x}$

$y = \mathbf{x}^T A \mathbf{x} \;\Rightarrow\; \frac{\partial y}{\partial \mathbf{x}} = (A + A^T)\,\mathbf{x}$
18. Background
• Proof of Theorem 1, with $\mathbf{x} = [x_1, x_2, \ldots, x_p]^T$ and $\mathbf{a} = [a_1, a_2, \ldots, a_p]^T$:

1) $y = \mathbf{a}^T \mathbf{x} = a_1 x_1 + a_2 x_2 + \cdots + a_p x_p
\;\Rightarrow\; \frac{\partial y}{\partial x_i} = a_i
\;\Rightarrow\; \frac{\partial y}{\partial \mathbf{x}} =
\left[ \frac{\partial y}{\partial x_1}, \frac{\partial y}{\partial x_2}, \ldots, \frac{\partial y}{\partial x_p} \right]^T = \mathbf{a}$

2) $y = \mathbf{x}^T \mathbf{x} = x_1^2 + x_2^2 + \cdots + x_p^2
\;\Rightarrow\; \frac{\partial y}{\partial x_i} = 2 x_i
\;\Rightarrow\; \frac{\partial y}{\partial \mathbf{x}} = 2\mathbf{x}$

3) $y = \mathbf{x}^T A \mathbf{x} = \sum_{i=1}^{p} \sum_{j=1}^{p} a_{ij} x_i x_j
\;\Rightarrow\; \frac{\partial y}{\partial x_i} = \sum_{j=1}^{p} a_{ij} x_j + \sum_{j=1}^{p} a_{ji} x_j
\;\Rightarrow\; \frac{\partial y}{\partial \mathbf{x}} = (A + A^T)\,\mathbf{x}$
19. Background
• Theorem 2
  – Let $\mathbf{x} = [x_1, x_2, \ldots, x_p]^T$ and $\mathbf{y} = [y_1, y_2, \ldots, y_p]^T$, then

$J = \frac{\partial \mathbf{y}}{\partial \mathbf{x}} =
\left[ \frac{\partial y_i}{\partial x_j} \right] =
\begin{bmatrix}
\frac{\partial y_1}{\partial x_1} & \frac{\partial y_1}{\partial x_2} & \cdots & \frac{\partial y_1}{\partial x_p} \\
\frac{\partial y_2}{\partial x_1} & \frac{\partial y_2}{\partial x_2} & \cdots & \frac{\partial y_2}{\partial x_p} \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial y_p}{\partial x_1} & \frac{\partial y_p}{\partial x_2} & \cdots & \frac{\partial y_p}{\partial x_p}
\end{bmatrix},
\qquad d\mathbf{y} = J \, d\mathbf{x}$
20. Matrix Formulation
• Example again,
  – can be described in matrix form:

$O(x, y) = \sum_{i=1}^{3} \left( a_i x + b_i y + 1 \right)^2
= \left\| \begin{bmatrix} a_1 x + b_1 y + 1 \\ a_2 x + b_2 y + 1 \\ a_3 x + b_3 y + 1 \end{bmatrix} \right\|^2
= \left\| \begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \\ a_3 & b_3 \end{bmatrix}
\begin{bmatrix} x \\ y \end{bmatrix} -
\begin{bmatrix} -1 \\ -1 \\ -1 \end{bmatrix} \right\|^2$

$\Rightarrow\; O(\mathbf{x}) = \left\| A\mathbf{x} - \mathbf{b} \right\|^2
= (A\mathbf{x} - \mathbf{b})^T (A\mathbf{x} - \mathbf{b})$
21. Matrix Formulation
• Matrix formulation of the least-square optimization:

$\frac{\partial O}{\partial \mathbf{x}}
= \frac{\partial}{\partial \mathbf{x}} (A\mathbf{x} - \mathbf{b})^T (A\mathbf{x} - \mathbf{b})
= \frac{\partial}{\partial \mathbf{x}} \left( \mathbf{x}^T A^T A \mathbf{x} - \mathbf{x}^T A^T \mathbf{b} - \mathbf{b}^T A \mathbf{x} + \mathbf{b}^T \mathbf{b} \right)$

$= \frac{\partial}{\partial \mathbf{x}} \left( \mathbf{x}^T A^T A \mathbf{x} - 2 \mathbf{b}^T A \mathbf{x} + \mathbf{b}^T \mathbf{b} \right)
= 2 A^T A \mathbf{x} - 2 A^T \mathbf{b} = \mathbf{0}$

$\therefore\; A^T A \mathbf{x} = A^T \mathbf{b}
\qquad \therefore\; \mathbf{x} = (A^T A)^{-1} A^T \mathbf{b}$

  – Can be solved by using a linear system solver.
22. Constraints
• Slightly different example
  – Line that minimizes the distances from the red points $(a_1, b_1)$, $(a_2, b_2)$, $(a_3, b_3)$,
  – with one additional constraint:
    • this line should pass through the white point $(a_c, b_c)$.

$\text{minimize} \; \sum_{i=1}^{3} \left( a_i x + b_i y + 1 \right)^2
\qquad \text{subject to} \; a_c x + b_c y + 1 = 0$
23. Constraints
• Constraints
  – A.K.A. hard constraints
    • cf. soft constraints → objective (energy) terms
  – Conditions that must be satisfied exactly
  – Constrained optimization
    • Optimization with some constraints
  – Constraints can be (linear / non-linear) and (equality / inequality);
    this lecture only covers linear equality constraints.
24. Constrained Optimization
• Lagrange multiplier
  – A constrained optimization can be expressed as an unconstrained optimization with a Lagrange multiplier:

minimize $O(x, y)$ subject to $C(x, y) = c$
$\;\Rightarrow\;$ minimize $O(x, y) + \lambda \left( C(x, y) - c \right)$

  – Why is this possible?
    • This lecture does not cover the theory of Lagrange multipliers.
    • Reference
      – http://en.wikipedia.org/wiki/Lagrange_multipliers
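As a quick illustration (a toy example, not from the slides), consider minimizing $x^2 + y^2$ subject to $x + y = 1$:

$L(x, y, \lambda) = x^2 + y^2 + \lambda (x + y - 1)$

$\frac{\partial L}{\partial x} = 2x + \lambda = 0, \quad
\frac{\partial L}{\partial y} = 2y + \lambda = 0, \quad
\frac{\partial L}{\partial \lambda} = x + y - 1 = 0
\;\Rightarrow\; x = y = \tfrac{1}{2}, \; \lambda = -1$

Setting the gradient of the augmented objective to zero recovers both the optimality condition and the constraint, which is exactly the mechanism used on the next slide.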
25. Constrained Optimization
• Solution of constrained optimization
  – At the minimum point, the gradient of the objective function should be zero:

$\operatorname*{argmin}_{\mathbf{x}} \; \frac{1}{2} \left\| A\mathbf{x} - \mathbf{b} \right\|^2 + \lambda \left( \mathbf{c}^T \mathbf{x} + 1 \right)$

$\frac{\partial O}{\partial \mathbf{x}} = A^T A \mathbf{x} - A^T \mathbf{b} + \lambda \mathbf{c} = \mathbf{0},
\qquad
\frac{\partial O}{\partial \lambda} = \mathbf{c}^T \mathbf{x} + 1 = 0$

$\begin{bmatrix} A^T A & \mathbf{c} \\ \mathbf{c}^T & 0 \end{bmatrix}
\begin{bmatrix} \mathbf{x} \\ \lambda \end{bmatrix} =
\begin{bmatrix} A^T \mathbf{b} \\ -1 \end{bmatrix}
\quad\Leftrightarrow\quad \hat{A}\hat{\mathbf{x}} = \hat{\mathbf{b}}$

  – Also can be solved by using a linear system solver.
26. Constrained Optimization
• Case of multiple constraints
  – Multiple Lagrange multipliers:

$\operatorname*{argmin}_{\mathbf{x}} \; \frac{1}{2} \left\| A\mathbf{x} - \mathbf{b} \right\|^2 + \boldsymbol{\lambda}^T \left( C\mathbf{x} - \mathbf{c} \right)$

$\frac{\partial O}{\partial \mathbf{x}} = A^T A \mathbf{x} - A^T \mathbf{b} + C^T \boldsymbol{\lambda} = \mathbf{0},
\qquad
\frac{\partial O}{\partial \boldsymbol{\lambda}} = C\mathbf{x} - \mathbf{c} = \mathbf{0}$

$\begin{bmatrix} A^T A & C^T \\ C & 0 \end{bmatrix}
\begin{bmatrix} \mathbf{x} \\ \boldsymbol{\lambda} \end{bmatrix} =
\begin{bmatrix} A^T \mathbf{b} \\ \mathbf{c} \end{bmatrix}
\quad\Leftrightarrow\quad \hat{A}\hat{\mathbf{x}} = \hat{\mathbf{b}}$
27. Implementation
• How to solve the linear system?
  – Provided by many libraries.
  – Using OpenCV:
    • Data structure for storing a matrix
      – CvMat *aMat = cvCreateMat(nRow, nCol, CV_32F);
      – cvReleaseMat(&aMat);
    • Set and get an element of a matrix
      – cvmSet(aMat, m, n, 1.0f);
      – float mn = cvmGet(aMat, m, n);
    • Linear operations
      – cvAdd(aMat, bMat, cMat);
      – cvGEMM(aMat, aMat, 1.0f, NULL, 0.0f, ataMat, CV_GEMM_A_T);  // ataMat = A^T A
    • Solver
      – cvSolve(aMat, bVect, xVect, CV_LU);
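Putting these calls together, here is a hedged end-to-end sketch of the three-point line fit via the normal equations, using the legacy OpenCV C API shown above; the point values are invented for illustration, and the header path may differ across OpenCV versions:

    #include <opencv/cv.h>
    #include <stdio.h>

    int main(void)
    {
        /* Three hypothetical points (a_i, b_i). */
        float pts[3][2] = { {1.0f, 2.0f}, {2.0f, 3.0f}, {3.0f, 5.0f} };

        CvMat *A   = cvCreateMat(3, 2, CV_32F);  /* rows: [a_i, b_i]    */
        CvMat *b   = cvCreateMat(3, 1, CV_32F);  /* right-hand side: -1 */
        CvMat *AtA = cvCreateMat(2, 2, CV_32F);
        CvMat *Atb = cvCreateMat(2, 1, CV_32F);
        CvMat *x   = cvCreateMat(2, 1, CV_32F);  /* unknown line coeffs */

        for (int i = 0; i < 3; ++i) {
            cvmSet(A, i, 0, pts[i][0]);
            cvmSet(A, i, 1, pts[i][1]);
            cvmSet(b, i, 0, -1.0f);
        }

        /* Normal equations: (A^T A) x = A^T b */
        cvGEMM(A, A, 1.0, NULL, 0.0, AtA, CV_GEMM_A_T);
        cvGEMM(A, b, 1.0, NULL, 0.0, Atb, CV_GEMM_A_T);
        cvSolve(AtA, Atb, x, CV_LU);

        printf("line: %f a + %f b + 1 = 0\n", cvmGet(x, 0, 0), cvmGet(x, 1, 0));

        cvReleaseMat(&A);   cvReleaseMat(&b);
        cvReleaseMat(&AtA); cvReleaseMat(&Atb); cvReleaseMat(&x);
        return 0;
    }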
28. Practical Example
• Poisson image cloning
  – An interesting approach to image compositing:
  – paste the modified gradient of the source image while satisfying the boundary colors.
29. Poisson Image Cloning
• Problem description
  – Source image pixels $s_{x,y}$
  – Target image pixels $t_{x,y}$
  – Unknown new image pixels $n_{x,y}$
  – Objective
    • Minimize the difference of gradients between the new image and the source image.
  – Constraint
    • Pixel values at the boundary should equal those of the target image.
  – One concrete formulation is sketched below.
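A common way to write this down (a sketch following the usual discrete Poisson formulation; the region $\Omega$ and the 4-neighbor pairs are notation assumed here, not defined on the slide):

$\min_{\mathbf{n}} \; \sum_{\langle p, q \rangle \in \Omega} \Big( (n_p - n_q) - (s_p - s_q) \Big)^2
\qquad \text{subject to} \; n_p = t_p \;\; \text{for } p \in \partial\Omega$

where $\langle p, q \rangle$ ranges over 4-neighbor pixel pairs inside the pasted region $\Omega$, and $\partial\Omega$ is its boundary. Each term is linear in the unknowns $n_p$, so the whole problem is exactly the constrained least-square form of slides 25–26.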
35. Dense vs. Sparse
• In the previous example,
  – assume that the size of the composite image is 200 x 200:
  – 200 x 200 → 40,000 pixels → 40,000 unknowns.
  – We should solve the linear system $A\mathbf{x} = \mathbf{b}$.
  – Size of A: 40,000 x 40,000
    → 1,600,000,000 elements → 1,600,000,000 floats
    → 6,400,000,000 bytes → about 6.4 GB (at 4 bytes per float).
  – Computing the inverse of a (40,000 x 40,000) matrix is very, very expensive.
36. Dense vs. Sparse
• Concept of the dense / sparse matrix
  – Dense matrix
    • A matrix that has a small number of zero elements
  – Sparse matrix
    • A matrix that has a large number of zero elements
• Storing the dense / sparse matrix, e.g. for
  $\begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 2 \end{bmatrix}$
  – Dense matrix
    • Store all elements: [1, 0, 0; 0, -1, 0; 0, 0, 2]
  – Sparse matrix
    • Store only non-zero elements as (row, col, value) triplets:
      [(1,1,1), (2,2,-1), (3,3,2)]  (see the C sketch below)
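For illustration, a minimal triplet (COO) representation in C, with the matrix-vector product that makes sparsity pay off; this is a sketch only — real libraries use more refined formats such as CSR:

    typedef struct { int row, col; float val; } Triplet;

    typedef struct {
        int nRows, nnz;   /* matrix height and number of non-zeros */
        Triplet *elems;   /* only the non-zero entries are stored  */
    } SparseMat;

    /* y = A * x : touches only the stored non-zeros, O(nnz) work
       instead of the O(n^2) work a dense product would need. */
    void sparse_mul(const SparseMat *A, const float *x, float *y)
    {
        for (int i = 0; i < A->nRows; ++i)
            y[i] = 0.0f;
        for (int k = 0; k < A->nnz; ++k)
            y[A->elems[k].row] += A->elems[k].val * x[A->elems[k].col];
    }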
37. Dense vs. Sparse
• Linear system of Poisson image cloning:

$\begin{bmatrix} G^T G & B^T \\ B & 0 \end{bmatrix}
\begin{bmatrix} \mathbf{n} \\ \boldsymbol{\lambda} \end{bmatrix} =
\begin{bmatrix} G^T G \, \mathbf{s} \\ B \, \mathbf{t} \end{bmatrix}$

  – $G^T G$ has at most 5 non-zero elements per row (see the stencil below).
  – The number of non-zero elements of $B$ is the same as the number of boundary pixels.
  – For 200 x 200 images: (40,000 x 5 + α) elements
    → efficient multiplication.
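Assuming a standard 4-neighbor (finite-difference) gradient operator $G$, each interior row of $G^T G$ is the familiar 5-point Laplacian stencil, which is where the "at most 5 non-zero elements per row" comes from:

$(G^T G \, \mathbf{n})_{x,y} = 4 n_{x,y} - n_{x-1,y} - n_{x+1,y} - n_{x,y-1} - n_{x,y+1}$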
38. Steepest-Descent Method
• How to solve the sparse linear system?
  – Computing the inverse is expensive
    → find an optimized solution iteratively.
  – Strategy of the iterative method:
    • Set the initial solution.
    • Until the objective value converges,
      – compute the gradient of the objective function at the current solution,
      – move the solution in the direction opposite to the gradient:

$\mathbf{x}_{i+1} = \mathbf{x}_i - \alpha \left. \frac{\partial O}{\partial \mathbf{x}} \right|_{\mathbf{x}_i}$
39. Steepest-Descent Method
• Steepest-descent method for the linear system $A\mathbf{x} = \mathbf{b}$
  – Gradient of the objective function (assume A is symmetric):

$f(\mathbf{x}) = \frac{1}{2} \mathbf{x}^T A \mathbf{x} - \mathbf{b}^T \mathbf{x} + c
\;\Rightarrow\; \frac{\partial f}{\partial \mathbf{x}} = A\mathbf{x} - \mathbf{b}$

  – Next iteration:

$\mathbf{x}_{i+1} = \mathbf{x}_i + \alpha \left( \mathbf{b} - A\mathbf{x}_i \right) = \mathbf{x}_i + \alpha \, \mathbf{r}_i$

  – How to determine the step size $\alpha$?
40. Steepest-Descent Method
• Determine the optimal step
  – $\alpha$ minimizes the objective function where the gradient with respect to $\alpha$ is zero:

$\frac{d}{d\alpha} f(\mathbf{x}_{i+1})
= f'(\mathbf{x}_{i+1})^T \frac{d}{d\alpha} \mathbf{x}_{i+1}
= -\mathbf{r}_{i+1}^T \mathbf{r}_i = 0$

$\left( \mathbf{b} - A(\mathbf{x}_i + \alpha \mathbf{r}_i) \right)^T \mathbf{r}_i = 0
\;\Rightarrow\; \left( \mathbf{b} - A\mathbf{x}_i \right)^T \mathbf{r}_i - \alpha \left( A\mathbf{r}_i \right)^T \mathbf{r}_i = 0
\;\Rightarrow\; \mathbf{r}_i^T \mathbf{r}_i - \alpha \, \mathbf{r}_i^T A \mathbf{r}_i = 0$

$\therefore\; \alpha = \frac{\mathbf{r}_i^T \mathbf{r}_i}{\mathbf{r}_i^T A \mathbf{r}_i}$
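Using the toy SparseMat sketched earlier, the steepest-descent loop with the optimal step derived above might look like this; an illustrative sketch for symmetric positive-definite A, not production code:

    #include <stdlib.h>

    void steepest_descent(const SparseMat *A, const float *b, float *x,
                          int n, int maxIter, float tol)
    {
        float *r  = malloc(n * sizeof(float));   /* residual r = b - Ax */
        float *Ar = malloc(n * sizeof(float));

        for (int iter = 0; iter < maxIter; ++iter) {
            /* r = b - A x */
            sparse_mul(A, x, r);
            for (int i = 0; i < n; ++i) r[i] = b[i] - r[i];

            /* alpha = (r^T r) / (r^T A r), the optimal step from slide 40 */
            sparse_mul(A, r, Ar);
            float rr = 0.0f, rAr = 0.0f;
            for (int i = 0; i < n; ++i) { rr += r[i] * r[i]; rAr += r[i] * Ar[i]; }
            if (rr < tol) break;

            float alpha = rr / rAr;
            for (int i = 0; i < n; ++i) x[i] += alpha * r[i];
        }
        free(r); free(Ar);
    }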
41. Conjugate Gradient Method
• Performance of the steepest-descent method
  – The simplest iterative algorithm for solving a linear system,
  – but convergence is slow,
    • especially near the optimized point.
• Conjugate Gradient Method
  – One of the most popular methods for solving sparse linear systems
  – Uses the gradient direction and its conjugate directions to find the solution
  – In exact arithmetic, the conjugate gradient method converges in at most N iterations for an (N x N) matrix system (see the sketch below).
  – This lecture does not cover the details of CGM.
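For comparison, a textbook (unpreconditioned) conjugate gradient loop in the same style, following the algorithm in Shewchuk's notes; again a sketch built on the toy SparseMat, assuming symmetric positive-definite A:

    #include <stdlib.h>

    void conjugate_gradient(const SparseMat *A, const float *b, float *x,
                            int n, int maxIter, float tol)
    {
        float *r = malloc(n * sizeof(float));   /* residual          */
        float *d = malloc(n * sizeof(float));   /* search direction  */
        float *q = malloc(n * sizeof(float));   /* q = A d           */

        /* r = b - A x; first direction is the residual itself */
        sparse_mul(A, x, r);
        for (int i = 0; i < n; ++i) { r[i] = b[i] - r[i]; d[i] = r[i]; }

        float deltaNew = 0.0f;
        for (int i = 0; i < n; ++i) deltaNew += r[i] * r[i];

        for (int iter = 0; iter < maxIter && deltaNew > tol; ++iter) {
            sparse_mul(A, d, q);
            float dq = 0.0f;
            for (int i = 0; i < n; ++i) dq += d[i] * q[i];
            float alpha = deltaNew / dq;

            for (int i = 0; i < n; ++i) { x[i] += alpha * d[i]; r[i] -= alpha * q[i]; }

            float deltaOld = deltaNew;
            deltaNew = 0.0f;
            for (int i = 0; i < n; ++i) deltaNew += r[i] * r[i];

            /* next direction is A-conjugate to the previous ones */
            float beta = deltaNew / deltaOld;
            for (int i = 0; i < n; ++i) d[i] = r[i] + beta * d[i];
        }
        free(r); free(d); free(q);
    }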
42. Conjugate Gradient Method
• Many improved versions of CGM exist
  – for more stable and faster convergence:
  – Preconditioned Conjugate Gradient Method
  – Conjugate Gradient Squared Method
  – Bi-Conjugate Gradient Method
  – Bi-Conjugate Gradient Stabilized Method
  – Reference
    • J. R. Shewchuk, An Introduction to the Conjugate Gradient Method Without the Agonizing Pain, CMU tech. report
    • Wikipedia
43. Implementation
• Libraries for sparse linear solvers
  – TAUCS
    • http://www.tau.ac.il/~stoledo/taucs/
  – OpenNL
    • http://alice.loria.fr/index.php/software/4-library/23-opennl.html
• My own library for CGM
  – Made by Ji-yong Kwon
  – Simple implementation for dense / sparse matrices
  – Features
    • 2 types of data structure: CDenseVect, CSparseMat
    • Linear operations between them
    • A set of sparse matrix solvers (CGM, BiCGSTAB)
    • Multi-core processing
44. Implementation
• Basic usage
  – CSparseMat aMat;
    CDenseVect bVect, xVect;
    CSparseSolverBiCGSTAB solver;
  – Memory allocation
    • bVect.Init(nRow);
    • aMat.Create(nRow, nCol, 32);  // maximum number of elements per row
    • Automatic de-allocation
  – Set/get element
    • bVect.Set(row, 1.0f);
      float value = bVect.Get(row);
    • aMat.AddElement(row, col, 1.0f);
      float value = aMat.GetElement(row, elementId);
45. Implementation
• Basic usage
  – Solver initialization
    • solver.InitSolver(aMat, bVect, xVect, 1.0f);
      – This function initializes the solver's state.
      – xVect should be initialized beforehand.
  – Solve
    • while(solver.CheckTermination())
          solver.OneStep(aMat, bVect, xVect);
  – Residual
    • float residual = solver.GetResidual();
  – Additional comments
    • For Visual Studio, set the project property C/C++ → Language → OpenMP Support to Yes.
    • Release mode is much faster than debug mode.
46. Summary
• Concept of least-square optimization
  – Useful for converting a hard problem into an approximate version
  – Can be solved by using a linear system solver,
    • even when the problem has multiple linear equality constraints.
• Concept of the sparse linear system
  – Solving a large linear system with a dense matrix can be very expensive
    → use iterative methods to solve the sparse linear system efficiently.
• The most important thing is to design the objective function well.