This document discusses unconstrained optimization problems and algorithms for solving them. It begins by defining an unconstrained optimization problem as one in which the decision vector x may take any value in R^n, the goal being to find a point that minimizes the objective function f(x) over all of R^n. It then describes three properties that a desirable algorithm should have: the descent property, the quadratic termination property, and global convergence. Finally, it covers line search methods for unimodal functions (the Golden Section Rule and the Fibonacci Search Method) and gradient-based methods for functions of several variables (steepest descent, Newton's method, and the conjugate gradient / Fletcher-Reeves methods).
Unconstrained Optimization Problems
An optimization problem in which the decision vector x is allowed to take any value in R^n is called an unconstrained optimization problem.
The aim is to study standard algorithms for unconstrained minimization of functions of one variable as well as of functions of several variables.
Consider the unconstrained minimization problem (UMP)

    Min_{x ∈ R^n} f(x),

i.e. find a point x* ∈ R^n such that f(x*) ≤ f(x) for all x ∈ R^n.
Unfortunately, finding even a local minimum of a general function is a very difficult task.
Solution Set: Let f : R^n → R be continuously differentiable. Then the set Ω = { x ∈ R^n : ∇f(x) = 0 } is called the solution set of the UMP.
Let f : R^n → R be a differentiable convex function and x* be a point of the solution set. Then x* is a global minimum point of the UMP.
We now describe a common basic scheme of the form

    x(k+1) = x(k) + αk d(k)

for solving the UMP, where x(k) is the current solution, d(k) is the direction of movement from x(k), and αk > 0 (called the step size) is the distance up to which we move in the direction d(k) from the current point x(k).
Descent Property: An algorithm for solving the UMP is said to have the descent property if the objective function value decreases as we go through the sequence {x(k)}, i.e. f(x(k+1)) < f(x(k)) for all k. In other words, for the algorithm to possess the descent property, the objective function should decrease as we proceed.
Quadratic Termination Property: An algorithm for the UMP is said to have the quadratic termination property (q.t.p.) if the minimum of a positive definite quadratic form in n variables is reached in at most n iterations.
The motivation for defining the q.t.p. stems from the fact that near a local minimum point the function behaves like a strictly convex function (like a parabola in R, or a positive definite quadratic form in R^n), and therefore if an algorithm behaves well on such a function, it will 'hopefully' do well on other functions as well.
Globally Convergent: An algorithm for the UMP is said to be globally convergent if, starting from any point x(0) ∈ R^n, the sequence {x(k)} always converges to a point of the solution set Ω.
The property of global convergence guarantees that no matter from which arbitrary point x(0) ∈ R^n we start, the generated sequence {x(k)} converges to a point of the solution set. It must be noted that two different starting points will, in general, generate two different sequences of iterates, but both converge to a point of the solution set.
Order of Convergence: Let the sequence {x(k)} converge to a point x* and let x(k) ≠ x* for sufficiently large k. The quantity ||x(k) − x*|| is called the error of the k-th iterate x(k).
Suppose that there exist p and a such that

    lim_{k→∞} ||x(k+1) − x*|| / ||x(k) − x*||^p = a   (0 < a < ∞);

then p is called the order of convergence of the sequence {x(k)}. Thus ||x(k+1) − x*|| ≈ a ||x(k) − x*||^p asymptotically.
If p = 1, the sequence {x(k)} is said to have a linear convergence rate, and for p = 2 it is said to have a quadratic convergence rate. In case p = 1 but a = 0, the sequence {x(k)} is said to have a superlinear convergence rate.
LINE SEARCH METHODS FOR UNIMODAL FUNCTIONS
Unimodal Min Function: The function f : [a, b] → R is said to be a unimodal (to be specific, unimodal min) function if it has only one mode, i.e. a single relative minimum; that is, there exists α with a ≤ α ≤ b such that
(i) f is strictly decreasing (↓) on [a, α);
(ii) f is strictly increasing (↑) on [α, b].
A similar definition holds for a unimodal max function.
A unimodal function need not be differentiable; in fact, it need not even be continuous.
Basic Strategy
Let us choose two distinct points, say x1 and x2, in [a, b], i.e. a ≤ x1 < x2 ≤ b. Let xmin denote the point giving the minimum value of f(x) over [a, b]. Since f is a unimodal min function, it is clear that
(i) f(x1) < f(x2) ⇒ xmin ∈ [a, x2]
(ii) f(x1) > f(x2) ⇒ xmin ∈ [x1, b]
(iii) f(x1) = f(x2) ⇒ xmin ∈ [x1, x2].
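These three cases are the whole content of one interval-reduction step. The following Python sketch (the helper name reduce_interval is our own, not from the text) illustrates them on a simple unimodal function.

```python
def reduce_interval(f, a, b, x1, x2):
    """One interval-reduction step for a unimodal-min function f on [a, b].

    x1 < x2 are two trial points in [a, b]; the returned subinterval is
    guaranteed to contain the minimizer, following cases (i)-(iii) above.
    """
    f1, f2 = f(x1), f(x2)
    if f1 < f2:        # case (i): the minimum lies to the left of x2
        return a, x2
    elif f1 > f2:      # case (ii): the minimum lies to the right of x1
        return x1, b
    else:              # case (iii): the minimum lies between x1 and x2
        return x1, x2

# Example: f(x) = (x - 1)**2 is unimodal on [-5, 5]
print(reduce_interval(lambda x: (x - 1) ** 2, -5, 5, -1, 2))   # (-1, 5)
```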
LINE SEARCH METHODS
(I) The Golden Section Rule.
(II) The Fibonacci Search Method.
In discussing the aforesaid methods we shall make use of the following notation:
xL,k = lower limit of the search interval at the k-th iteration (so xL,1 = a)
xU,k = upper limit of the search interval at the k-th iteration (so xU,1 = b)
xp,k = first trial point at the k-th iteration
xq,k = second trial point at the k-th iteration
Ep,k = f(xp,k) = value of the function at the first trial point
Eq,k = f(xq,k) = value of the function at the second trial point
Ik = xU,k − xL,k = length of the search interval at the k-th iteration (so I1 = b − a)
IL,k = length of the left part of the search interval Ik
IR,k = length of the right part of the search interval Ik.
THE GOLDEN SECTION RULE
Here the two trial points xp,k and xq,k are chosen as per the following criteria:
(I) IL,k = IR,k for all k, i.e. no undue advantage is given to the left or the right part.
(II) xp,k+1 = xq,k and xq,k+1 is computed afresh for the interval IR,k+1 (or xq,k+1 = xp,k and xp,k+1 is computed afresh for the interval IL,k+1), i.e. only one new trial point is computed at the k-th iteration; the other trial point is carried over from the previous iteration. In view of this we have

    Ik = Ik+1 + Ik+2 for all k.

(III) Ik / Ik+1 = Ik+1 / Ik+2 = c (constant) for all k. This is called the golden section criterion.
Thus we wish to choose the trial points xp,k and xq,k according to criteria (I), (II) and (III). From (II) we have

    Ik = Ik+1 + Ik+2,

i.e.

    Ik / Ik+2 = Ik+1 / Ik+2 + 1,

i.e.

    (Ik / Ik+1)(Ik+1 / Ik+2) = Ik+1 / Ik+2 + 1   (by (III)),

i.e.

    c^2 = c + 1.

Therefore c = (1 ± √5)/2. Since c cannot be negative, we get c = (1 + √5)/2 ≈ 1.618 (the golden section ratio), and 1/1.618 ≈ 0.618.
STEPWISE DESCRIPTION
Step 1 Input data: xL,1, xU,1, ε, f (here ε > 0 is the tolerance to be prescribed by the user).
Step 2 Compute the first two trial points xp,1 and xq,1, where
    xp,1 = xU,1 − 0.618 (xU,1 − xL,1)
    xq,1 = xL,1 + 0.618 (xU,1 − xL,1).
Set k = 1.
Step 3 Evaluate the function f at the two trial points xp,k and xq,k. Let
    Ep,k = f(xp,k)
    Eq,k = f(xq,k).
Step 4 Test which interval contains the minimum, i.e. if Ep,k ≤ Eq,k, go to Step 5, otherwise go to Step 6.
Step 5 Use the following relations to update the data:
    xL,k+1 = xL,k
    xU,k+1 = xq,k
    xp,k+1 = xU,k+1 − 0.618 Ik+1
    xq,k+1 = xp,k
    Ep,k+1 = f(xp,k+1)
    Eq,k+1 = f(xq,k+1) = Ep,k.
Step 6 Use the following relations to update the data:
    xL,k+1 = xp,k
    xU,k+1 = xU,k
    xp,k+1 = xq,k
    xq,k+1 = xL,k+1 + 0.618 Ik+1
    Ep,k+1 = f(xp,k+1) = Eq,k
    Eq,k+1 = f(xq,k+1).
Step 7 Test for the end of the optimization, i.e. if Ik+1 ≤ ε, go to Step 8; otherwise set k = k + 1 and go to Step 4.
Step 8 Output xL,n, xU,n. Then xmin ∈ (xL,n, xU,n) and Emin ≤ Min(Ep,n−1, Eq,n−1).
It can be noted that for the golden section rule

    In / I1 = (1/1.618)^(n−1) = (0.618)^(n−1).

Thus, knowing I1 and In, we can find the number of iterations as well as the number of points at which the function is to be evaluated.
Note that if we are going up to the 7th iteration (i.e. obtaining xL,7 and xU,7), then we shall have n = 7 functional evaluations, namely at xp,1 and xq,1, and one each at the 2nd, 3rd, 4th, 5th and 6th iterations. To stop at the 7th iteration we shall have computed xp,6 and xq,6, and Emin ≤ Min(Ep,6, Eq,6).
Thus N functional evaluations will mean stopping at the N-th iteration and computing points up to xp,N−1 and xq,N−1. Also Emin ≤ Min(Ep,N−1, Eq,N−1), and there are only (N − 1) interval reductions.
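Under the assumption that f is unimodal on [a, b], the stepwise description above can be coded directly. The following Python sketch (the function name golden_section is ours) mirrors Steps 1-8, using 0.618 in its exact form (√5 − 1)/2.

```python
import math

def golden_section(f, a, b, eps=1e-5):
    """Golden section search for a unimodal-min f on [a, b].

    A minimal sketch of Steps 1-8 above; returns the final bracketing interval.
    """
    r = (math.sqrt(5) - 1) / 2          # 1 / 1.618... = 0.618...
    xp = b - r * (b - a)                # first trial point  (Step 2)
    xq = a + r * (b - a)                # second trial point (Step 2)
    Ep, Eq = f(xp), f(xq)               # Step 3
    while (b - a) > eps:                # Step 7: continue until the interval is within eps
        if Ep <= Eq:                    # minimum lies in [a, xq]  (Step 5)
            b, xq, Eq = xq, xp, Ep
            xp = b - r * (b - a)
            Ep = f(xp)
        else:                           # minimum lies in [xp, b]  (Step 6)
            a, xp, Ep = xp, xq, Eq
            xq = a + r * (b - a)
            Eq = f(xq)
    return a, b                         # Step 8: xmin lies in (a, b)

# The example below: min of x**2 over [-5, 15] with eps = 1.5
print(golden_section(lambda x: x * x, -5.0, 15.0, eps=1.5))
```

Because only one new trial point is computed per iteration, each interval reduction costs a single function evaluation, exactly as noted above.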
Example: Find min x^2 over [-5, 15] by the golden section rule. Take ε = 1.5.

k    xL,k     xU,k     xp,k     xq,k     Ep,k     Eq,k     L/R
1    -5.00    15.00     2.64     7.36    6.96     54.1     L
2    -5.00     7.36    -0.27     2.64    0.077     6.96    L
3    -5.00     2.64    -2.08    -0.27    4.33      0.077   R
4    -2.08     2.64    -0.27     0.84    0.077     0.71    L
5    -2.08     0.84    -0.96    -0.27    0.92      0.077   R
6    -0.96     0.84    -0.27     0.15    0.077     0.023   R
7    -0.27     0.84

I7 = 1.11 ≤ 1.5 (= ε), so stop; xmin ∈ [-0.27, 0.84] and Emin ≤ Min(0.077, 0.023) = 0.023.
The Fibonacci Search Method
Fibonacci Sequence: Let F0 = 1, F1 = 1 and Fi = Fi−1 + Fi−2 (i ≥ 2). Then {Fn} is called the sequence of Fibonacci numbers or, in short, the Fibonacci sequence.
Thus the Fibonacci sequence is {1, 1, 2, 3, 5, 8, 13, 21, ...}, where F0 = 1, F1 = 1, F2 = 2, F3 = 3, F4 = 5, F5 = 8, F6 = 13, F7 = 21, and so on.
In the Fibonacci search method the trial points are chosen as per the following criteria:
(I) IL,k = IR,k for all k
(II) Ik = Ik+1 + Ik+2
(III) Ik / Ik+1 = Fn−k+1 / Fn−k, where n denotes the number of iterations to be performed.
We now give a justification of criterion (III) above. If we wish to stop at the n-th iteration, then

    In+1 ≈ In = 1·In
    In            = 1·In
    In−1 = In + In+1 = 2·In
    In−2 = In + In−1 = 3·In
    In−3 = 5·In
    In−4 = 8·In
    ...
    I1 = Fn·In.

In the golden section rule Ik / Ik+1 = c = 1.618 for all k and all n. In the Fibonacci search method this ratio depends both on the current iteration k and on the total number of iterations n to be performed. Therefore, instead of 0.618 (i.e. 1/c), we shall use the number Fn−k / Fn−k+1 for determining the points xp,k and xq,k.
STEPWISE APPROACH OF THE FIBONACCI SEARCH METHOD
Step 1 Input data: xL,1, xU,1, n, f.
Step 2 Compute the Fibonacci numbers Fn and Fn−1 and find
    xp,1 = xU,1 − (Fn−1 / Fn)(xU,1 − xL,1)
    xq,1 = xL,1 + (Fn−1 / Fn)(xU,1 − xL,1).
Set k = 1.
Step 3 Evaluate
    Ep,k = f(xp,k)
    Eq,k = f(xq,k)
and test which interval contains the minimum. If Ep,k ≤ Eq,k, go to Step 4; otherwise go to Step 5.
Step 4 Use the following relations to update the data:
    xL,k+1 = xL,k
    xU,k+1 = xq,k
    xp,k+1 = xU,k+1 − (Fn−k−1 / Fn−k) Ik+1
    xq,k+1 = xp,k
    Ep,k+1 = f(xp,k+1)
    Eq,k+1 = f(xq,k+1) = Ep,k.
Step 5 Use the following relations to update the data:
    xL,k+1 = xp,k
    xU,k+1 = xU,k
    xp,k+1 = xq,k
    xq,k+1 = xL,k+1 + (Fn−k−1 / Fn−k) Ik+1
    Ep,k+1 = f(xp,k+1) = Eq,k
    Eq,k+1 = f(xq,k+1).
Step 6 If k ≤ (n − 2), then set k = k + 1 and go to Step 3. If k = (n − 1), then go to Step 7.
Step 7 As F0 = F1 = 1, it can be seen that at the (n − 1)-th iteration the two trial points xp,n−1 and xq,n−1 come out to be the same, i.e. xp,n−1 = xq,n−1. To make them distinct we have to add/subtract a fixed small positive number δ (say δ = 0.01) to one of them. For this we look at the (n − 2)-th iteration. If there (i.e. at the (n − 2)-th iteration) we are searching the left part, then δ is subtracted from xp,n−1; otherwise δ is added to xq,n−1. We then determine the new xp,n−1 and xq,n−1 and hence xL,n and xU,n. Then

    xmin ∈ (xL,n, xU,n) and fmin ≤ Min(Ep,n−1, Eq,n−1).

As In / I1 = 1 / Fn, given I1 and In the value of n can be computed in advance.
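A compact Python sketch of the whole procedure is given below (the function name fibonacci_search and the tie-breaking constant delta are our own; the tie-break at the (n − 1)-th iteration is simplified to always perturbing xq, which brackets the minimizer either way). It reproduces the final interval [−0.24, 0.72] of the example discussed next (min x^2 over [−5, 15] with n = 7).

```python
def fibonacci_search(f, a, b, n, delta=0.01):
    """Fibonacci search for a unimodal-min f on [a, b] with n iterations.

    Returns the final bracketing interval (xL,n, xU,n).
    """
    # Fibonacci numbers F0..Fn with F0 = F1 = 1
    F = [1, 1]
    for _ in range(2, n + 1):
        F.append(F[-1] + F[-2])

    xp = b - (F[n - 1] / F[n]) * (b - a)      # Step 2
    xq = a + (F[n - 1] / F[n]) * (b - a)
    Ep, Eq = f(xp), f(xq)                     # Step 3

    for k in range(1, n):                     # iterations k = 1, ..., n-1
        if k == n - 1:
            # Step 7 (simplified tie-break): xp and xq coincide here,
            # so perturb one of them by a small delta to keep them distinct
            xq = xp + delta
            Eq = f(xq)
        if Ep <= Eq:                          # Step 4: keep the left part
            b = xq
            xq, Eq = xp, Ep
            if k < n - 1:
                xp = b - (F[n - k - 1] / F[n - k]) * (b - a)
                Ep = f(xp)
        else:                                 # Step 5: keep the right part
            a = xp
            xp, Ep = xq, Eq
            if k < n - 1:
                xq = a + (F[n - k - 1] / F[n - k]) * (b - a)
                Eq = f(xq)
    return a, b

# Example from the text: min of x**2 over [-5, 15] with n = 7
print(fibonacci_search(lambda x: x * x, -5.0, 15.0, n=7))
```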
From this table we observe that xmin ∈ [−0.24, 0.72] and fmin ≤ 0.053.
In the given example I1 = 20, and therefore if we take ε = 1.5, then In = 1.5. This gives I1 / In = 20 / 1.5 = 13.3, which from the sequence of Fibonacci numbers determines n = 7, since F6 = 13 < 13.3 ≤ F7 = 21.
Also, as explained in Step 7, the two trial points at the (n − 1)-th iteration, i.e. xp,6 and xq,6, both become equal to −0.24. Therefore we look at the (n − 2)-th iteration, i.e. the 5th iteration in our example. As there we are searching the right part, we add 0.01 to xq,6 to get the new xq,6 and then continue as explained.
RELATION BETWEEN THE FIBONACCI SEARCH METHOD AND THE GOLDEN SECTION RULE
For the constant c appearing in the golden section rule we have c = 1 + 1/c = 1.618, and

    Fn ≈ c^(n+1) / √5   (for large n).

Let RF and RGS denote the reduction factors for the Fibonacci search method and the golden section rule respectively. Then

    RF = In / I1 = 1 / Fn ≈ √5 / c^(n+1),
    RGS = In / I1 = (1/c)^(n−1) = 1 / c^(n−1).

Therefore

    RGS / RF = (1 / c^(n−1)) × (c^(n+1) / √5) = c^2 / √5 ≈ 1.17.
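A quick numerical check of these two reduction factors (our own, not part of the original text) can be written in a few lines of Python; the ratio RGS / RF settles near c^2/√5 ≈ 1.17, i.e. for the same n the Fibonacci search ends with a slightly smaller interval.

```python
import math

c = (1 + math.sqrt(5)) / 2           # golden section ratio, 1.618...

def fib(n):
    """Fibonacci numbers with F0 = F1 = 1, Fi = Fi-1 + Fi-2."""
    a, b = 1, 1
    for _ in range(n - 1):
        a, b = b, a + b
    return b

for n in (5, 10, 20):
    RF = 1 / fib(n)                  # reduction factor of the Fibonacci search
    RGS = (1 / c) ** (n - 1)         # reduction factor of the golden section rule
    print(n, fib(n), round(c ** (n + 1) / math.sqrt(5)), round(RGS / RF, 3))
# The third column checks Fn ~ c**(n+1)/sqrt(5); the last column approaches 1.17.
```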
THE STEEPEST DESCENT METHOD
The steepest descent method is a gradient-based method for solving the UMP. It is simple to implement and is therefore widely used in various applications. The only drawback of the steepest descent method is its slow convergence (its order of convergence is one).
The UMP: Min_{x ∈ R^n} f(x), where f has continuous first-order partial derivatives on R^n.
The basic scheme involved is x(k+1) = x(k) + αk d(k), where d(k) = −∇f(x(k)) / ||∇f(x(k))|| is the direction of movement and the step size αk ≥ 0 is chosen to minimize h(α) = f(x(k) + α d(k)) over α ≥ 0. We stop when ||∇f(x(k))|| ≤ ε.
A stepwise description of the steepest descent method could be:
Step 1 Choose x(0) ∈ R^n and a tolerance ε > 0. Set k = 0.
Step 2 Compute ∇f(x(k)) and d(k) = −∇f(x(k)) / ||∇f(x(k))||.
Step 3 Evaluate x(k+1) = x(k) + αk d(k), where αk ≥ 0 minimizes h(α) = f(x(k) + α d(k)) over α ≥ 0. Set k = k + 1.
Steps 2 and 3 are repeated till ||∇f(x(k))|| ≤ ε. In that case x(k) becomes a point of the solution set and therefore, if f is a convex function, it becomes an optimal solution.
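A minimal Python sketch of Steps 1-3 is given below. The one-dimensional minimization of h(α) is done here by a crude golden-section search over [0, 10]; that range, the function names and the tolerances are our own assumptions, not part of the original text.

```python
import numpy as np

def steepest_descent(f, grad, x0, eps=1e-6, max_iter=1000):
    """Steepest descent with an (approximately) exact line search."""
    def line_search(x, d, lo=0.0, hi=10.0, tol=1e-10):
        # golden-section search for the step size minimizing h(alpha) on [lo, hi]
        r = (np.sqrt(5) - 1) / 2
        a, b = lo, hi
        while b - a > tol:
            p, q = b - r * (b - a), a + r * (b - a)
            if f(x + p * d) <= f(x + q * d):
                b = q
            else:
                a = p
        return (a + b) / 2

    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) <= eps:        # stopping rule ||grad f|| <= eps
            break
        d = -g / np.linalg.norm(g)          # normalized steepest descent direction
        alpha = line_search(x, d)           # step size minimizing h(alpha)
        x = x + alpha * d
    return x

# Example from the text: f(x1, x2) = 3x1^2 - 4x1x2 + 2x2^2 + 4x1 + 6
f = lambda x: 3*x[0]**2 - 4*x[0]*x[1] + 2*x[1]**2 + 4*x[0] + 6
grad = lambda x: np.array([6*x[0] - 4*x[1] + 4, -4*x[0] + 4*x[1]])
print(steepest_descent(f, grad, x0=[0.0, 0.0]))   # approaches (-2, -2)
```

On this quadratic the iterates zig-zag towards (−2, −2) with alternating directions (−1, 0)^T and (0, −1)^T, as the worked example below shows.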
The steepest descent algorithm
(I) has the descent property,
(II) does not have the quadratic termination property,
(III) is globally convergent, and
(IV) has order of convergence p = 1.
Therefore if we use the steepest descent method we can start from any point x(0), and as we proceed the objective function value will decrease, but the algorithm may take a large number of iterations near the optimal solution. Also it may, in general, take more than n iterations to minimize a positive definite quadratic form in n variables.
Example: Use the steepest descent method to minimize f(x1, x2) = 3x1^2 − 4x1x2 + 2x2^2 + 4x1 + 6 over (x1, x2) ∈ R^2.
The given function is convex, and so the steepest descent method will give a global optimal solution. Starting from x(0) = (0, 0)^T and following Steps 1-3 described above, we get

k    x(k)                 ∇f(x(k))       d(k)          αk
0    (0, 0)^T             (4, 0)^T       (−1, 0)^T     2/3
1    (−2/3, 0)^T          (0, 8/3)^T     (0, −1)^T     2/3
2    (−2/3, −2/3)^T       (8/3, 0)^T     (−1, 0)^T     4/9
3    (−10/9, −2/3)^T      (0, 16/9)^T    (0, −1)^T     4/9
4    (−10/9, −10/9)^T     (16/9, 0)^T    (−1, 0)^T     8/27
5    (−38/27, −10/9)^T    ...            ...           ...
...
The function f(x1, x2) is a positive definite quadratic form in two variables, but its optimal solution has not been obtained in at most two iterations. This illustrates that the method of steepest descent does not possess the quadratic termination property.
We observe that the directions d(k) are repeated alternately: (−1, 0)^T, (0, −1)^T, (−1, 0)^T, (0, −1)^T, etc. Any two consecutive directions d(k) and d(k+1) given by the steepest descent method are mutually orthogonal. Therefore in R^2, if the first two directions are d(1) and d(2), then d(3) has to be d(1), d(4) has to be d(2), and so on. But this repetition need not occur in R^3 and higher-dimensional spaces, because there, if d(1) and d(2) are orthogonal, we may have a d(3) different from d(1) which is orthogonal to d(2). So the important thing is the orthogonality of consecutive directions, NOT their alternate repetition.
NEWTON'S METHOD
Newton's method for a single equation g(y) = 0, y ∈ R, uses the basic scheme yk+1 = yk − g(yk) / g'(yk), where yk is the current iterate (the current approximation).
The analogous basic scheme of Newton's method for the UMP,

    x(k+1) = x(k) − (Hf(x(k)))^{-1} ∇f(x(k)),

finds a solution of the system ∇f(x) = 0, where Hf denotes the Hessian matrix of f.
Newton's method for solving UMPs has the quadratic termination property (in fact, for minimizing a positive definite quadratic form in n variables it takes exactly one iteration). Its order of convergence is 2 and it has the descent property, but it does not have the property of global convergence.
Therefore, except for the case when we are minimizing a positive definite quadratic form, we cannot start the method from an arbitrary point x(0) ∈ R^n.
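A minimal Python sketch of this scheme (the function name newton_min and the stopping tolerance are ours) follows; it simply assumes the Hessian is invertible at every iterate.

```python
import numpy as np

def newton_min(grad, hess, x0, eps=1e-8, max_iter=50):
    """Newton's method for the UMP: x(k+1) = x(k) - Hf(x(k))^{-1} grad f(x(k))."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) <= eps:          # stop when the gradient is (nearly) zero
            break
        x = x - np.linalg.solve(hess(x), g)   # Newton step
    return x

# Example from the text: f(x1, x2) = 8x1^2 - 4x1x2 + 5x2^2
grad = lambda x: np.array([16*x[0] - 4*x[1], -4*x[0] + 10*x[1]])
hess = lambda x: np.array([[16.0, -4.0], [-4.0, 10.0]])
print(newton_min(grad, hess, x0=[5.0, 2.0]))   # one Newton step gives (0, 0)
```

On the positive definite quadratic of the next example it terminates in a single step, as claimed above.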
Use Newton's method to minimize f(x1, x2) = 8x1^2 − 4x1x2 + 5x2^2, (x1, x2) ∈ R^2.
As the function f is a positive definite quadratic form, we can start from any arbitrary point x(0) ∈ R^2. To be specific, let x(0) = (5, 2)^T. Then

    ∇f(x(0)) = (16x1 − 4x2, −4x1 + 10x2)^T evaluated at (x1 = 5, x2 = 2) = (72, 0)^T,

    Hf(x(0)) = [ 16  −4 ]
               [ −4  10 ],

    (Hf(x(0)))^{-1} = (1/144) [ 10   4 ]
                              [  4  16 ].

Hence

    x(1) = x(0) − (Hf(x(0)))^{-1} ∇f(x(0))
         = (5, 2)^T − (1/144) (10·72 + 4·0, 4·72 + 16·0)^T = (5, 2)^T − (5, 2)^T = (0, 0)^T,

giving x1 = 0, x2 = 0 as the minimizing point.
THE CONJUGATE GRADIENT METHOD
Conjugate-direction based methods are better than the steepest descent method (in terms of order of convergence) and are simpler to implement than the modified Newton's method.
Conjugate Directions: Let Q be an (n × n) positive definite matrix. Two non-zero vectors (directions) d(1), d(2) ∈ R^n are said to be conjugate vectors or conjugate directions with respect to Q if (d(1))^T Q d(2) = 0.
For Q = I, conjugacy reduces to the usual concept of orthogonality. Therefore, if d(1) and d(2) are conjugate with respect to Q, we sometimes also call them Q-orthogonal.
A set {d(0), ..., d(k)} of (k + 1) vectors in R^n is said to be conjugate if every two of them are so, i.e. (d(i))^T Q d(j) = 0 (i ≠ j).
Let {d(0), d(1), ..., d(k)} be a set of (k + 1) non-zero vectors which are conjugate with respect to a given positive definite matrix Q. Then the vectors d(0), d(1), ..., d(k) are linearly independent.
Consider the problem

    Min_{x ∈ R^n} (1/2) x^T Q x − b^T x,

where Q is an (n × n) symmetric positive definite matrix; hence the problem is strictly convex and has a unique minimizing point x* ∈ R^n. Further, the KKT conditions give ∇((1/2) x^T Q x − b^T x) = 0, i.e. Qx = b, which implies x* = Q^{-1} b (note that as Q is positive definite, Q^{-1} exists). Thus minimizing the quadratic is equivalent to finding the unique solution of the system of equations Qx = b.
Let {d(0), d(1), ..., d(n−1)} be a set of n non-zero vectors in R^n which are conjugate with respect to Q. Then x*, the unique solution of the system Qx = b (equivalently, the unique minimizing point of the problem), is given by

    x* = Σ_{k=0}^{n−1} [ (d(k))^T b / ((d(k))^T Q d(k)) ] d(k).
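This expansion is easy to verify numerically. In the sketch below (our own check, not part of the original text) the eigenvectors of a symmetric positive definite Q serve as the conjugate set, since orthogonal eigenvectors are automatically Q-conjugate; Q and b are taken from the conjugate gradient example later in the text.

```python
import numpy as np

Q = np.array([[6.0, -4.0], [-4.0, 4.0]])
b = np.array([-4.0, 0.0])

_, V = np.linalg.eigh(Q)          # columns of V are mutually Q-conjugate directions
x_star = np.zeros(2)
for k in range(2):
    d = V[:, k]
    # x* = sum_k (d(k)^T b / d(k)^T Q d(k)) d(k)
    x_star += (d @ b) / (d @ Q @ d) * d

print(x_star)                      # [-2. -2.], i.e. the solution of Q x = b
print(np.linalg.solve(Q, b))       # same answer, for comparison
```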
Conjugate Direction Theorem: Let {d(0), d(1), ..., d(n−1)} be a set of n non-zero vectors in R^n which are conjugate with respect to Q. For any x(0) ∈ R^n, the sequence {x(k)} generated by

    x(k+1) = x(k) + αk d(k),
    αk = −(g(k))^T d(k) / ((d(k))^T Q d(k)),
    g(k) = Q x(k) − b,

converges to the unique solution x* of the system Qx = b in exactly n steps, i.e. x(n) = x*.
CONJUGATE GRADIENT METHOD FOR THE QUADRATIC CASE
Step 1 Choose x(0) ∈ R^n arbitrary. Define d(0) = −g(0) = b − Q x(0). Set k = 0.
Step 2 Use the scheme
    x(k+1) = x(k) + αk d(k),
    αk = −(g(k))^T d(k) / ((d(k))^T Q d(k)),
    d(k+1) = −g(k+1) + βk d(k),
    βk = (g(k+1))^T Q d(k) / ((d(k))^T Q d(k)),
    g(k) = Q x(k) − b.
Step 3 Continue till we get x(n). Then x(n) = x*, and hence stop.
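The scheme translates almost line by line into Python; the sketch below (the function name conjugate_gradient is ours) assumes a symmetric positive definite Q and runs the worked example that follows.

```python
import numpy as np

def conjugate_gradient(Q, b, x0):
    """Conjugate gradient method for minimizing (1/2) x^T Q x - b^T x."""
    x = np.asarray(x0, dtype=float)
    g = Q @ x - b                     # g(0)
    d = -g                            # d(0) = -g(0) = b - Q x(0)
    n = len(b)
    for _ in range(n):                # at most n steps (Step 3)
        if not d.any():               # already at the solution
            break
        alpha = -(g @ d) / (d @ Q @ d)
        x = x + alpha * d             # x(k+1) = x(k) + alpha_k d(k)
        g = Q @ x - b                 # g(k+1)
        beta = (g @ Q @ d) / (d @ Q @ d)
        d = -g + beta * d             # d(k+1) = -g(k+1) + beta_k d(k)
    return x

# Example from the text: Q = [[6, -4], [-4, 4]], b = (-4, 0), x(0) = (0, 0)
Q = np.array([[6.0, -4.0], [-4.0, 4.0]])
b = np.array([-4.0, 0.0])
print(conjugate_gradient(Q, b, [0.0, 0.0]))   # [-2. -2.]
```

Note that only the current direction d(k) is kept in memory, which is the point emphasized next.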
The main point to note here is that we do not need all the directions d(0), d(1), ..., d(k) at one go. We start from d(0) and generate subsequent conjugate directions as we proceed with the algorithm. As soon as we determine d(n−1), the point x(n) is known, and that is precisely the point x*.
EXAMPLE
Use the conjugate gradient method to minimize f(x1, x2) = 3x1^2 − 4x1x2 + 2x2^2 + 4x1 + 6, (x1, x2) ∈ R^2.
Express the given quadratic function in the form f(x) = (1/2) x^T Q x − b^T x (up to the constant 6, which does not affect the minimizer). It is simple to check here that

    Q = [ 6  −4 ]       b = [ −4 ]       x = [ x1 ]
        [ −4  4 ],          [  0 ],          [ x2 ].

Q is positive definite. For the sake of illustration let x(0) = (0, 0)^T.
Steps 1 and 2 give d(0) = −g(0) = b − Q x(0) = (−4, 0)^T, α0 = −(g(0))^T d(0) / ((d(0))^T Q d(0)) = 1/6 and hence x(1) = x(0) + α0 d(0) = (−2/3, 0)^T.
Step 3 Now

    g(1) = Q x(1) − b = (−4, 8/3)^T − (−4, 0)^T = (0, 8/3)^T,

    β0 = (g(1))^T Q d(0) / ((d(0))^T Q d(0)) = ((0, 8/3) · Q(−4, 0)^T) / 96 = 4/9,

and

    d(1) = −g(1) + β0 d(0) = (0, −8/3)^T + (4/9)(−4, 0)^T = (−16/9, −8/3)^T.

Step 4 Now we obtain x(2) as x(2) = x(1) + α1 d(1), where α1 = −(g(1))^T d(1) / ((d(1))^T Q d(1)) = 3/4. Thus

    x(2) = (−2/3, 0)^T + (3/4)(−16/9, −8/3)^T = (−2, −2)^T = x*,

i.e. the minimizing point of f(x1, x2) is x1 = −2, x2 = −2.
For the conjugate gradient method, the following are true:
(I) (g(k+1))^T d(k) = 0,
(II) αk = (g(k))^T g(k) / ((d(k))^T Q d(k)),
(III) βk = (g(k+1))^T g(k+1) / ((g(k))^T g(k)),
(IV) (d(k))^T Q d(i) = 0 (i = 0, 1, ..., k − 1).
Here g(k) is the gradient of the objective function at x(k), i.e. g(k) = Q x(k) − b.
FLETCHER AND REEVES' METHOD
Let us consider the UMP Min_{x ∈ R^n} f(x) and attempt to translate the steps of the conjugate gradient method to the general UMP.
Step 1 Choose x(0) ∈ R^n arbitrary and take d(0) = −g(0) = −∇f(x(0)). Set k = 0.
Step 2 Obtain αk ≥ 0 that minimizes h(α) = f(x(k) + α d(k)) over α ≥ 0.
Step 3 Define x(k+1) = x(k) + αk d(k) and take d(k+1) = −g(k+1) + βk d(k), where g(k+1) = ∇f(x(k+1)) and βk is given by

    βk = (g(k+1))^T g(k+1) / ((g(k))^T g(k)).
Step 4 Continue till we get a point x(p) of the solution set, i.e. ||∇f(x(p))|| ≤ ε for some preassigned tolerance ε > 0.
The above method has the descent property and also the quadratic termination property, but it is not globally convergent, i.e. we cannot start the method from an arbitrary point x(0) ∈ R^n. To make the algorithm globally convergent we incorporate Powell's correction into the above procedure.
POWELL'S CORRECTION
The procedure is as follows:
Step 1 Choose x(0) ∈ R^n arbitrary and take d(0) = −g(0) = −∇f(x(0)).
Step 2 For k = 0, 1, ..., (n − 1):
    (A) Set x(k+1) = x(k) + αk d(k), where αk ≥ 0 is chosen to minimize h(α) = f(x(k) + α d(k)) over α ≥ 0.
    (B) Compute g(k+1) = ∇f(x(k+1)).
    (C) Unless k = (n − 1), set
        d(k+1) = −g(k+1) + βk d(k),
        βk = (g(k+1))^T g(k+1) / ((g(k))^T g(k)).
Step 3 Replace x(0) by x(n) and go back to Step 1.
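A minimal Python sketch combining the Fletcher-Reeves update with Powell's correction (restarting from the current point after every n inner steps) is given below. As in the steepest descent sketch, the exact line search is approximated by a golden-section search over [0, 10]; that range and all names are our own assumptions.

```python
import numpy as np

def fletcher_reeves_powell(f, grad, x0, eps=1e-6, max_restarts=100):
    """Fletcher-Reeves method with Powell's correction (restart every n steps)."""
    def line_search(x, d, lo=0.0, hi=10.0, tol=1e-10):
        # golden-section search for the step size minimizing h(alpha) on [lo, hi]
        r = (np.sqrt(5) - 1) / 2
        a, b = lo, hi
        while b - a > tol:
            p, q = b - r * (b - a), a + r * (b - a)
            if f(x + p * d) <= f(x + q * d):
                b = q
            else:
                a = p
        return (a + b) / 2

    x = np.asarray(x0, dtype=float)
    n = x.size
    for _ in range(max_restarts):                 # Step 3: restart from x(n)
        g = grad(x)
        d = -g                                    # Step 1: d(0) = -grad f(x(0))
        for k in range(n):                        # Step 2: k = 0, ..., n-1
            if np.linalg.norm(g) <= eps:          # Step 4 of Fletcher-Reeves
                return x
            alpha = line_search(x, d)             # (A) approximate exact line search
            x = x + alpha * d
            g_new = grad(x)                       # (B)
            if k < n - 1:                         # (C) Fletcher-Reeves update
                beta = (g_new @ g_new) / (g @ g)
                d = -g_new + beta * d
            g = g_new
    return x

# Example: the quadratic from the conjugate gradient example above
f = lambda x: 3*x[0]**2 - 4*x[0]*x[1] + 2*x[1]**2 + 4*x[0] + 6
grad = lambda x: np.array([6*x[0] - 4*x[1] + 4, -4*x[0] + 4*x[1]])
print(fletcher_reeves_powell(f, grad, [0.0, 0.0]))   # close to (-2, -2)
```

On this quadratic it reaches (−2, −2) within the first cycle of n = 2 inner steps, consistent with the quadratic termination property.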