2. Course Materials
• Arora, Introduction to Optimum Design, 3e, Elsevier
  (https://www.researchgate.net/publication/273120102_Introduction_to_Optimum_design)
• Parkinson, Optimization Methods for Engineering Design, Brigham Young University
  (http://apmonitor.com/me575/index.php/Main/BookChapters)
• Iqbal, Fundamental Engineering Optimization Methods, BookBoon
  (https://bookboon.com/en/fundamental-engineering-optimization-methods-ebook)
3. Numerical Optimization
• Consider an unconstrained NLP problem: $\min_{x} f(x)$
• Use an iterative method to solve the problem: $x_{k+1} = x_k + \alpha_k d_k$, where $d_k$ is a search direction and $\alpha_k$ is the step size, chosen such that the function value decreases at each step, i.e., $f(x_{k+1}) < f(x_k)$
• We expect $\lim_{k \to \infty} x_k = x^*$
• The general iterative method is a two-step process:
  – Finding a suitable search direction $d_k$ along which the function value locally decreases and any constraints are obeyed.
  – Performing a line search along $d_k$ to find $x_{k+1}$ such that $f(x_{k+1})$ attains its minimum value.
4. The Iterative Method
• Iterative algorithm:
  1. Initialize: choose $x_0$
  2. Check termination: $\nabla f(x_k) \approx 0$
  3. Find a suitable search direction $d_k$ that obeys the descent condition: $\nabla f(x_k)^T d_k < 0$
  4. Search along $d_k$ to find where $f(x_{k+1})$ attains its minimum value (the line search problem)
  5. Return to step 2
5. The Line Search Problem
• Assuming a suitable search direction $d_k$ has been determined, we seek a step length $\alpha_k$ that minimizes $f(x_{k+1})$.
• Assuming $x_k$ and $d_k$ are known, the projected function value along $d_k$ is expressed as: $f(x_k + \alpha d_k) = f(\alpha)$
• The line search problem, to choose $\alpha$ to minimize $f(x_{k+1})$ along $d_k$, is defined as:
  $\min_{\alpha} f(\alpha) = f(x_k + \alpha d_k)$
• Assuming that a solution exists, it is found by setting $f'(\alpha) = 0$.
6. Example: Quadratic Function
• Consider minimizing a quadratic function:
  $f(x) = \frac{1}{2} x^T A x - b^T x, \quad \nabla f = A x - b$
• Given a descent direction $d$, the line search problem is defined as:
  $\min_{\alpha} f(\alpha) = \frac{1}{2} (x_k + \alpha d)^T A (x_k + \alpha d) - b^T (x_k + \alpha d)$
• A solution is found by setting $f'(\alpha) = 0$, where
  $f'(\alpha) = d^T A (x_k + \alpha d) - d^T b = 0$
  $\alpha = -\frac{d^T (A x_k - b)}{d^T A d} = -\frac{\nabla f(x_k)^T d}{d^T A d}$
• Finally, $x_{k+1} = x_k + \alpha d$ (a Matlab sketch follows).
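• A minimal Matlab sketch of this exact line search, assuming a symmetric positive definite A; the matrix, vector, and starting point are illustrative:
% exact line search for f = 0.5*x'*A*x - b'*x along a descent direction d
A = [2 -1; -1 2]; b = [1; 1];
xk = [0; 0];
g = A*xk - b;             % gradient at xk
d = -g;                   % any descent direction will do; here d = -g
alpha = -(g'*d)/(d'*A*d); % exact step size from the formula above
xk1 = xk + alpha*d;       % updated design point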
7. Computer Methods for the Line Search Problem
• Interval reduction methods
  – Golden section search
  – Fibonacci search
• Approximate search methods
  – Armijo's rule
  – Quadratic curve fitting
8. Interval Reduction Methods
• The interval reduction methods find the minimum of a unimodal function in two steps:
  – Bracketing the minimum to an interval
  – Reducing the interval to the desired accuracy
• The bracketing step aims to find a three-point pattern $x_1, x_2, x_3$, such that $f(x_1) \ge f(x_2) < f(x_3)$.
9. Fibonacci's Method
• The Fibonacci method uses Fibonacci numbers to achieve maximum interval reduction in a given number of steps.
• The Fibonacci number sequence is generated as: $F_0 = F_1 = 1, \; F_n = F_{n-1} + F_{n-2}, \; n \ge 2$.
• The properties of Fibonacci numbers include:
  – They achieve the golden ratio $\tau = \lim_{n \to \infty} \frac{F_{n-1}}{F_n} = \frac{\sqrt{5} - 1}{2} \approx 0.618034$
  – The number of interval reductions $n$ required to achieve a desired accuracy $\varepsilon$ (where $1/F_n < \varepsilon$) is specified in advance.
  – For given $I_1$ and $n$: $I_2 = \frac{F_{n-1}}{F_n} I_1, \; I_3 = I_1 - I_2, \; I_4 = I_2 - I_3$, etc.
10. The Golden Section Method
• The golden section method uses the golden ratio: $\tau = 0.618034$.
• The golden section algorithm is given as (a Matlab sketch follows):
  1. Initialize: specify $x_1, x_4$ ($I_1 = x_4 - x_1$), $\varepsilon$, $n$: $\tau^n < \varepsilon / I_1$
  2. Compute $x_2 = \tau x_1 + (1 - \tau) x_4$; evaluate $f_2$
  3. For $i = 1, \dots, n-1$:
     Compute $x_3 = (1 - \tau) x_1 + \tau x_4$; evaluate $f_3$; if $f_2 < f_3$, set $x_4 \leftarrow x_1$, $x_1 \leftarrow x_3$; else set $x_1 \leftarrow x_2$, $x_2 \leftarrow x_3$, $f_2 \leftarrow f_3$
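• A minimal Matlab sketch of the above algorithm, assuming a unimodal f on the starting interval; the function and tolerance are illustrative:
% golden section search on [x1, x4]
f = @(x) x.^2 + exp(-x);                 % example unimodal function
tau = 0.618034;
x1 = 0; x4 = 1; tol = 1e-4;
n = ceil(log(tol/(x4-x1))/log(tau));     % interval reductions: tau^n < tol/I1
x2 = tau*x1 + (1-tau)*x4; f2 = f(x2);
for i = 1:n-1
    x3 = (1-tau)*x1 + tau*x4; f3 = f(x3);
    if f2 < f3
        x4 = x1; x1 = x3;                % interval flips; x2 stays interior
    else
        x1 = x2; x2 = x3; f2 = f3;
    end
end
xmin = (x1+x4)/2;                        % final estimate of the minimizer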
11. Approximate Search Methods
• Consider the line search problem: $\min_{\alpha} f(\alpha) = f(x_k + \alpha d_k)$
• Sufficient Descent Condition. The sufficient descent condition guards against $d_k$ becoming too close to orthogonal to $\nabla f(x_k)$. The condition is stated as:
  $\nabla f(x_k)^T d_k < -c \, \|\nabla f(x_k)\|^2, \quad c > 0$
• Sufficient Decrease Condition. The sufficient decrease condition ensures a nontrivial reduction in the function value. The condition is stated as:
  $f(x_k + \alpha d_k) - f(x_k) \le \mu \, \alpha \, \nabla f(x_k)^T d_k, \quad 0 < \mu < 1$
• Curvature Condition. The curvature condition guards against $\alpha$ becoming too small. The condition is stated as:
  $\nabla f(x_k + \alpha d_k)^T d_k \ge \eta \, \nabla f(x_k)^T d_k, \quad 0 < \mu < \eta < 1$
12. Approximate Line Search
• Strong Wolfe Conditions. The strong Wolfe conditions commonly used by line search algorithms include:
  1. The sufficient decrease condition (Armijo's rule):
     $f(\alpha) \le f(0) + \mu \, \alpha \, f'(0), \quad 0 < \mu < 1$
  2. The strong curvature condition:
     $|f'(\alpha)| \le \eta \, |f'(0)|, \quad 0 < \mu \le \eta < 1$
13. Approximate Line Search
• The approximate line search includes two steps:
  – Bracketing the minimum
  – Estimating the minimum
• Bracketing the Minimum. In the bracketing step we seek an interval $[\underline{\alpha}, \bar{\alpha}]$ such that $f'(\underline{\alpha}) < 0$ and $f'(\bar{\alpha}) > 0$.
  – Since $f'(0) < 0$ for any descent direction, $\underline{\alpha} = 0$ serves as a lower bound on $\alpha$. To find an upper bound, gradually increase $\alpha$, e.g., $\alpha = 1, 2, \dots$
  – Assume that for some $\alpha_i > 0$ we get $f'(\alpha_i) < 0$ and $f'(\alpha_{i+1}) > 0$; then $\alpha_{i+1}$ serves as an upper bound.
14. Approximate Line Search
• Estimating the Minimum. Once the minimum has been bracketed in a small interval, a quadratic or cubic polynomial approximation is used to find the minimizer.
• If the polynomial minimizer $\hat{\alpha}$ satisfies the strong Wolfe conditions for the desired $\mu$ and $\eta$ values (say $\mu = 0.2$, $\eta = 0.5$), it is taken as the function minimizer.
• Otherwise, $\hat{\alpha}$ is used to replace one of $\underline{\alpha}$ or $\bar{\alpha}$, and the polynomial approximation step is repeated.
15. Quadratic Curve Fitting
• Assuming that the interval $[\alpha_l, \alpha_u]$ contains the minimum of a unimodal function $f(\alpha)$, its quadratic approximation, given as $q(\alpha) = a_0 + a_1 \alpha + a_2 \alpha^2$, is obtained using three points $\{\alpha_l, \alpha_m, \alpha_u\}$, where the mid-point may be used for $\alpha_m$.
• The quadratic coefficients $\{a_0, a_1, a_2\}$ are solved as:
  $a_2 = \frac{1}{\alpha_u - \alpha_m} \left( \frac{f(\alpha_u) - f(\alpha_l)}{\alpha_u - \alpha_l} - \frac{f(\alpha_m) - f(\alpha_l)}{\alpha_m - \alpha_l} \right)$
  $a_1 = \frac{1}{\alpha_m - \alpha_l} \left( f(\alpha_m) - f(\alpha_l) \right) - a_2 (\alpha_l + \alpha_m)$
  $a_0 = f(\alpha_l) - a_1 \alpha_l - a_2 \alpha_l^2$
• Then, the minimum is given as: $\alpha_{min} = -\frac{a_1}{2 a_2}$
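• A minimal Matlab sketch of the quadratic fit, assuming the bracket $[0, 0.5]$ and the function from the example on the next slide; the mid-point choice for $\alpha_m$ is illustrative:
% quadratic curve fitting over a bracket [al, au]
f = @(a) a.^2 + exp(-a);
al = 0; au = 0.5; am = (al+au)/2;        % use the mid-point for am
a2 = ((f(au)-f(al))/(au-al) - (f(am)-f(al))/(am-al))/(au-am);
a1 = (f(am)-f(al))/(am-al) - a2*(al+am);
amin = -a1/(2*a2)                        % returns 0.3531, near the exact 0.3517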
16. Example: Approximate Search
• Let $f(\alpha) = e^{-\alpha} + \alpha^2$, $f'(\alpha) = 2\alpha - e^{-\alpha}$, $f(0) = 1$, $f'(0) = -1$. Let $\mu = 0.2$, and try $\alpha = 0.1, 0.2, \dots$ to bracket the minimum.
• From the sufficient decrease condition, the minimum is bracketed in the interval $[0, 0.5]$.
• Using the quadratic approximation, the minimum is found as: $\alpha^* = 0.3531$. The exact solution is: $\alpha_{min} = 0.3517$.
• The Matlab commands are:
%define the function and the trial step sizes
f=@(x) x.*x+exp(-x);
mu=0.2; al=0:.1:1;
%bracket: last trial point satisfying sufficient decrease (gives au = 0.5)
au=al(find(f(al)<=f(0)-mu*al,1,'last'));
18. Computer Methods for Finding the Search Direction
• Gradient-based methods
  – Steepest descent method
  – Conjugate gradient method
  – Quasi-Newton methods
• Hessian-based methods
  – Newton's method
  – Trust-region methods
19. Steepest Descent Method
• The steepest descent method determines the search direction as $d_k = -\nabla f(x_k)$.
• The update rule is given as: $x_{k+1} = x_k - \alpha_k \nabla f(x_k)$,
  where $\alpha_k$ is determined by minimizing $f(x_{k+1})$ along $d_k$.
• Example: quadratic function
  $f(x) = \frac{1}{2} x^T A x - b^T x, \quad \nabla f = A x - b$
  Then $x_{k+1} = x_k - \alpha \, \nabla f(x_k)$, where $\alpha = \frac{\nabla f(x_k)^T \nabla f(x_k)}{\nabla f(x_k)^T A \, \nabla f(x_k)}$
  Define $r_k = b - A x_k$; then $x_{k+1} = x_k + \alpha_k r_k$, where $\alpha_k = \frac{r_k^T r_k}{r_k^T A r_k}$
20. Steepest Descent Algorithm
• Initialize: choose $x_0$
• For $k = 0, 1, 2, \dots$
  – Compute $\nabla f(x_k)$
  – Check convergence: if $\|\nabla f(x_k)\| < \epsilon$, stop.
  – Set $d_k = -\nabla f(x_k)$
  – Line search problem: find $\min_{\alpha \ge 0} f(x_k + \alpha d_k)$
  – Set $x_{k+1} = x_k + \alpha d_k$.
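• A minimal Matlab sketch of this algorithm for the quadratic example, using the exact step $\alpha_k = r_k^T r_k / r_k^T A r_k$; the data and tolerance are illustrative:
% steepest descent for f = 0.5*x'*A*x - b'*x
A = [2 -1; -1 2]; b = [1; 1];
x = [0; 0]; tol = 1e-6;
for k = 1:1000
    r = b - A*x;                  % residual r = -grad f
    if norm(r) < tol, break, end  % convergence check
    alpha = (r'*r)/(r'*A*r);      % exact line search step
    x = x + alpha*r;              % steepest descent update
end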
23. Steepest Descent Method
• The steepest descent method becomes slow close to the optimum.
• The method progresses in a zigzag fashion, since
  $\frac{d}{d\alpha} f(x_k + \alpha d_k) = \nabla f(x_{k+1})^T d_k = -\nabla f(x_{k+1})^T \nabla f(x_k) = 0$
• The method has linear convergence, with rate constant
  $C = \frac{f(x_{k+1}) - f(x^*)}{f(x_k) - f(x^*)} \le \left( \frac{cond(A) - 1}{cond(A) + 1} \right)^2$
24. Preconditioning
• Preconditioning (scaling) can be used to reduce the condition number of the Hessian matrix and hence aid convergence.
• Consider $f(x) = 0.1 x_1^2 + x_2^2 = x^T A x$, where $A = diag(0.1, 1)$
• Define a linear transformation $x = P y$, where $P = diag(\sqrt{10}, 1)$; then $f(x) = y^T P^T A P \, y = y^T y$
• Since $cond(I) = 1$, the steepest descent method converges in a single iteration for the transformed quadratic function.
25. Conjugate Gradient Method
• For any square matrix $A$, a set of $A$-conjugate vectors is defined by: $d_i^T A d_j = 0, \; i \ne j$
• Let $g_k = \nabla f(x_k)$ denote the gradient; then, starting from $d_0 = -g_0$, a set of $A$-conjugate directions is generated as:
  $d_0 = -g_0; \quad d_{k+1} = -g_{k+1} + \beta_k d_k, \; k \ge 0$
  where $\beta_k = \frac{g_{k+1}^T A d_k}{d_k^T A d_k}$
  There are multiple ways to generate conjugate directions.
• Using $\{d_0, d_1, \dots, d_{n-1}\}$ as search directions, a quadratic function is minimized in $n$ steps.
26. Conjugate Directions Method
• The parameter $\beta_k$ can be computed in different ways (a Matlab sketch follows):
  – By substituting $A d_k = \frac{1}{\alpha_k} (g_{k+1} - g_k)$, we obtain:
    $\beta_k = \frac{g_{k+1}^T (g_{k+1} - g_k)}{d_k^T (g_{k+1} - g_k)}$ (the Hestenes-Stiefel formula)
  – In the case of exact line search, $g_{k+1}^T d_k = 0$; then
    $\beta_k = \frac{g_{k+1}^T (g_{k+1} - g_k)}{g_k^T g_k}$ (the Polak-Ribiere formula)
  – Also, for exact line search, $g_{k+1}^T g_k = \beta_{k-1} (g_k + \alpha_k A d_k)^T d_{k-1} = 0$, resulting in
    $\beta_k = \frac{g_{k+1}^T g_{k+1}}{g_k^T g_k}$ (the Fletcher-Reeves formula)
  Other versions of $\beta_k$ have also been proposed.
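• A minimal Matlab sketch of the conjugate gradient iteration for a quadratic function, using the Fletcher-Reeves formula; the data are illustrative:
% conjugate gradient method for f = 0.5*x'*A*x - b'*x
A = [2 -1; -1 2]; b = [1; 1];
x = [0; 0]; g = A*x - b; d = -g;
for k = 1:size(A,1)               % exact minimum reached in n steps
    alpha = -(g'*d)/(d'*A*d);     % exact line search along d
    x = x + alpha*d;
    gn = A*x - b;                 % new gradient
    beta = (gn'*gn)/(g'*g);       % Fletcher-Reeves formula
    d = -gn + beta*d; g = gn;
end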
30. Conjugate Gradient Method
• Assume that an update that includes steps $\alpha_i$ along $n$ conjugate vectors $d_i$ is assembled as: $y = \sum_{i=1}^{n} \alpha_i d_i$.
• Then, for a quadratic function, the minimization problem is decomposed into a set of one-dimensional problems, i.e.,
  $\min_y f(y) \equiv \sum_{i=1}^{n} \min_{\alpha_i} \left( \frac{1}{2} \alpha_i^2 \, d_i^T A d_i - \alpha_i \, b^T d_i \right)$
• By setting the derivative with respect to $\alpha_i$ equal to zero, i.e., $\alpha_i \, d_i^T A d_i - b^T d_i = 0$, we obtain: $\alpha_i = \frac{b^T d_i}{d_i^T A d_i}$.
• This shows that the CG algorithm iteratively determines the conjugate directions $d_i$ and their coefficients $\alpha_i$.
31. CG Rate of Convergence
• Conjugate gradient methods achieve superlinear convergence:
  – In the case of quadratic functions, the minimum is reached exactly in $n$ iterations.
  – For general nonlinear functions, convergence in $2n$ iterations is to be expected.
• Nonlinear CG methods typically have the lowest per-iteration computational costs of all gradient methods.
32. Newton's Method
• Consider minimizing the second-order approximation of $f(x)$:
  $\min_{\Delta x} f(x_k + \Delta x) = f(x_k) + \nabla f(x_k)^T \Delta x + \frac{1}{2} \Delta x^T H_k \Delta x$
• Apply the FONC: $H_k \Delta x + g_k = 0$, where $g_k = \nabla f(x_k)$.
  Then, assuming that $H_k = \nabla^2 f(x_k)$ stays positive definite, the Newton update rule is derived as: $x_{k+1} = x_k - H_k^{-1} g_k$
• Note:
  – The convergence of Newton's method is dependent on $H_k$ staying positive definite.
  – A step size may be included in Newton's method, i.e., $x_{k+1} = x_k - \alpha_k H_k^{-1} g_k$
33. Marquardt Modification to Newton's Method
• To ensure the positive definite condition on $H_k$, Marquardt proposed the following modification to Newton's method:
  $(H_k + \lambda I) \, d = -g_k$
  where $\lambda$ is selected to ensure that the modified Hessian is positive definite.
• Since $H_k + \lambda I$ is also symmetric, the resulting system of linear equations can be solved for $d$ as:
  $L D L^T d = -\nabla f(x_k)$
34. Newton's Algorithm
Newton's Method (Griva, Nash & Sofer, p. 373):
1. Initialize: choose $x_0$, specify $\epsilon$
2. For $k = 0, 1, \dots$
3. Check convergence: if $\|\nabla f(x_k)\| < \epsilon$, stop
4. Factorize the modified Hessian as $\nabla^2 f(x_k) + E = L D L^T$ and solve $L D L^T d = -\nabla f(x_k)$ for $d$
5. Perform a line search to determine $\alpha_k$ and update the solution estimate as $x_{k+1} = x_k + \alpha_k d_k$
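• A minimal Matlab sketch of a modified Newton iteration; the example function, the eigenvalue-shift modification, and the full step are illustrative choices:
% modified Newton's method with an eigenvalue shift
f = @(x) 0.5*x(1)^2 + x(2)^4 - x(2);
df = @(x) [x(1); 4*x(2)^3 - 1];               % gradient
d2f = @(x) [1 0; 0 12*x(2)^2];                % Hessian
x = [1; 1]; tol = 1e-8;
for k = 1:100
    g = df(x);
    if norm(g) < tol, break, end              % convergence check
    H = d2f(x);
    lam = min(eig(H));
    if lam <= 0                               % shift to restore positive
        H = H + (abs(lam) + 1e-4)*eye(2);     % definiteness (cf. Marquardt)
    end
    x = x - H\g;                              % Newton update (full step)
end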
35. Rate of Convergence
• Newton's method achieves a quadratic rate of convergence in the close neighborhood of the optimal point, and superlinear convergence otherwise.
• The main drawback of Newton's method is its computational cost: the Hessian matrix needs to be computed at every step, and a linear system of equations needs to be solved to obtain the update.
• Due to the high computational and storage costs, the classic Newton's method is rarely used in practice.
36. Quasi-Newton Methods
• The quasi-Newton methods derive from a generalization of the secant method, which approximates the second derivative as:
  $f''(x_k) \approx \frac{f'(x_k) - f'(x_{k-1})}{x_k - x_{k-1}}$
• In the multi-dimensional case, the secant condition is generalized as:
  $H_k (x_k - x_{k-1}) = \nabla f(x_k) - \nabla f(x_{k-1})$
• Define $F_k = H_k^{-1}$; then
  $x_k - x_{k-1} = F_k \left( \nabla f(x_k) - \nabla f(x_{k-1}) \right)$
• The quasi-Newton methods iteratively update $H_k$ or $F_k$ as:
  – Direct update: $H_{k+1} = H_k + \Delta H_k, \; H_0 = I$
  – Inverse update: $F_{k+1} = F_k + \Delta F_k, \; F = H^{-1}, \; F_0 = I$
37. Quasi-Newton Methods
• Quasi-Newton update: let $s_k = x_{k+1} - x_k$, $y_k = \nabla f(x_{k+1}) - \nabla f(x_k)$; then,
  – The DFP (Davidon-Fletcher-Powell) formula for the inverse Hessian update is given as:
    $F_{k+1} = F_k - \frac{F_k y_k (F_k y_k)^T}{y_k^T F_k y_k} + \frac{s_k s_k^T}{s_k^T y_k}$
  – The BFGS (Broyden-Fletcher-Goldfarb-Shanno) formula for the direct Hessian update is given as:
    $H_{k+1} = H_k - \frac{H_k s_k (H_k s_k)^T}{s_k^T H_k s_k} + \frac{y_k y_k^T}{y_k^T s_k}$
38. Quasi-Newton Algorithm
The Quasi-Newton Algorithm (Griva, Nash & Sofer, p. 415):
• Initialize: choose $x_0$, $H_0$ (e.g., $H_0 = I$), specify $\epsilon$
• For $k = 0, 1, \dots$
  – Check convergence: if $\|\nabla f(x_k)\| < \epsilon$, stop
  – Solve $H_k d = -\nabla f(x_k)$ for $d_k$ (alternatively, $d = -F_k \nabla f(x_k)$)
  – Solve $\min_{\alpha} f(x_k + \alpha d_k)$ for $\alpha_k$, and update the current estimate: $x_{k+1} = x_k + \alpha_k d_k$
  – Compute $s_k$, $y_k$, and update $H_k$ (or $F_k$, as applicable)
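• A minimal Matlab sketch of this algorithm with the BFGS update and a backtracking (Armijo) line search; the test function and settings are illustrative:
% quasi-Newton method with BFGS direct Hessian update
f = @(x) (1-x(1))^2 + 5*(x(2)-x(1)^2)^2;
df = @(x) [-2*(1-x(1))-20*x(1)*(x(2)-x(1)^2); 10*(x(2)-x(1)^2)];
x = [-1; 1]; H = eye(2); tol = 1e-6;
for k = 1:200
    g = df(x);
    if norm(g) < tol, break, end
    d = -H\g;                                 % search direction
    a = 1;                                    % backtracking line search
    while f(x+a*d) > f(x) + 0.2*a*g'*d, a = a/2; end
    s = a*d; y = df(x+s) - g;
    if s'*y > 0                               % update keeps H positive definite
        H = H - (H*s)*(H*s)'/(s'*H*s) + y*y'/(y'*s);
    end
    x = x + s;
end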
44. Computer Methods for Constrained Problems
• Penalty and barrier methods
• Augmented Lagrangian (AL) method
• Sequential linear programming (SLP)
• Sequential quadratic programming (SQP)
45. Penalty and Barrier Methods
• Consider the general optimization problem: $\min_x f(x)$
  Subject to:
  $h_i(x) = 0, \; i = 1, \dots, l$;
  $g_j(x) \le 0, \; j = 1, \dots, m$;
  $x_{iL} \le x_i \le x_{iU}, \; i = 1, \dots, n$.
• Define a composite function to be used for constraint compliance:
  $\Phi(x, r) = f(x) + P\left( g(x), h(x), r \right)$
  where $P$ defines a loss function and $r$ is a vector of weights (penalty parameters).
46. Penalty and Barrier Methods
• Penalty Function Method. A penalty function method employs a quadratic loss function and iterates through the infeasible region:
  $P\left( g(x), h(x), r \right) = \frac{r}{2} \left( \sum_j \left( g_j^+(x) \right)^2 + \sum_i \left( h_i(x) \right)^2 \right), \quad g_j^+(x) = \max\left( 0, g_j(x) \right), \; r > 0$
• Barrier Function Method. A barrier method employs a log barrier function and iterates through the feasible region:
  $P\left( g(x), h(x), r \right) = -\frac{1}{r} \sum_j \log\left( -g_j(x) \right)$
• For both penalty and barrier methods, as $r \to \infty$, $x(r) \to x^*$ (a Matlab sketch follows).
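• A minimal Matlab sketch of the penalty method for an equality-constrained problem, with fminsearch standing in for the unconstrained minimizer; the problem and penalty schedule are illustrative:
% quadratic penalty method: min (x1-1)^2 + (x2-2)^2, s.t. x1 + x2 - 2 = 0
f = @(x) (x(1)-1)^2 + (x(2)-2)^2;
h = @(x) x(1) + x(2) - 2;
x = [0; 0];
for r = 10.^(0:4)                 % increasing penalty parameter
    Phi = @(x) f(x) + r/2*h(x)^2; % composite function
    x = fminsearch(Phi, x);       % unconstrained subproblem, warm-started
end
% x approaches the constrained minimum (0.5, 1.5) as r grows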
47. The Augmented Lagrangian Method
• Consider an equality-constrained problem: $\min_x f(x)$, subject to $h_i(x) = 0, \; i = 1, \dots, l$
• Define the augmented Lagrangian (AL) as:
  $\mathcal{A}(x, v, r) = f(x) + \sum_i \left( v_i h_i(x) + \frac{1}{2} r \, h_i^2(x) \right)$
  where the additional term defines an exterior penalty function with $r$ as the penalty parameter.
• For inequality-constrained problems, the AL may be defined as:
  $\mathcal{A}(x, u, r) = f(x) + \sum_i \begin{cases} u_i g_i(x) + \frac{1}{2} r \, g_i^2(x), & \text{if } g_i + \frac{u_i}{r} \ge 0 \\ -\frac{1}{2r} u_i^2, & \text{if } g_i + \frac{u_i}{r} < 0 \end{cases}$
  where a large $r$ makes the Hessian of the AL positive definite at $x$.
48. The Augmented Lagrangian Method
• The dual function for the AL is defined as:
  $\psi(v) = \min_x \mathcal{A}(x, v, r) = f(x) + \sum_i \left( v_i h_i(x) + \frac{1}{2} r \, h_i^2(x) \right)$
• The resulting dual optimization problem is: $\max_v \psi(v)$
• The dual problem may be solved via Newton's method as:
  $v^{k+1} = v^k - \left( \frac{\partial^2 \psi}{\partial v_i \partial v_j} \right)^{-1} h$
  where $\frac{\partial^2 \psi}{\partial v_i \partial v_j} = -\nabla h_i^T \left( \nabla^2 \mathcal{A} \right)^{-1} \nabla h_j$
• For large $r$, the Newton update may be approximated as:
  $v_i^{k+1} = v_i^k + r \, h_i, \; i = 1, \dots, l$
49. Example: Augmented Lagrangian
• Maximize the volume of a cylindrical tank subject to a surface area constraint:
  $\max_{d, l} f(d, l) = \frac{\pi d^2 l}{4}$, subject to $h: \frac{\pi d^2}{4} + \pi d l - A_0 = 0$
• We can normalize the problem as:
  $\min_{d, l} f(d, l) = -d^2 l$, subject to $h: d^2 + 4 d l - 1 = 0$
• The solution to the primal problem is obtained as:
  Lagrangian function: $\mathcal{L}(d, l, \lambda) = -d^2 l + \lambda (d^2 + 4 d l - 1)$
  FONC: $\lambda (d + 2l) - d l = 0, \quad 4 \lambda d - d^2 = 0, \quad d^2 + 4 d l - 1 = 0$
  Optimal solution: $d^* = 2 l^* = 4 \lambda^* = \frac{1}{\sqrt{3}}$.
50. Example: Augmented Lagrangian
• Alternatively, define the augmented Lagrangian function as:
  $\mathcal{A}(d, l, \lambda, r) = -d^2 l + \lambda (d^2 + 4 d l - 1) + \frac{1}{2} r (d^2 + 4 d l - 1)^2$
• Define the dual function: $\psi(\lambda) = \min_{d, l} \mathcal{A}(d, l, \lambda, r)$
• Define the dual optimization problem: $\max_{\lambda} \psi(\lambda)$
• Solution to the dual problem: $\lambda^* = \lambda_{max} = 0.144$
• Solution to the design variables: $d^* = 2 l^* = 0.577$ (a Matlab sketch follows)
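• A minimal Matlab sketch of the AL iteration for this example, with fminsearch standing in for the inner minimization and the multiplier update $v \leftarrow v + r h$; the settings are illustrative:
% augmented Lagrangian iteration for the tank problem
f = @(x) -x(1)^2*x(2);                     % x = [d; l]
h = @(x) x(1)^2 + 4*x(1)*x(2) - 1;
r = 10; v = 0; x = [1; 1];
for k = 1:20
    AL = @(x) f(x) + v*h(x) + r/2*h(x)^2;  % augmented Lagrangian
    x = fminsearch(AL, x);                 % primal step (warm-started)
    v = v + r*h(x);                        % dual (multiplier) update
end
% converges near d = 0.577, l = 0.289, v = 0.144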
51. Sequential Linear Programming
• Consider the general optimization problem: $\min_x f(x)$
  Subject to:
  $h_i(x) = 0, \; i = 1, \dots, l$;
  $g_j(x) \le 0, \; j = 1, \dots, m$;
  $x_{iL} \le x_i \le x_{iU}, \; i = 1, \dots, n$.
• Let $x_k$ denote the current estimate of the design variables, and let $d$ denote the change in the variables; define the first-order expansions of the objective and constraint functions in the neighborhood of $x_k$:
  $f(x_k + d) = f(x_k) + \nabla f(x_k)^T d$
  $g_j(x_k + d) = g_j(x_k) + \nabla g_j(x_k)^T d, \; j = 1, \dots, m$
  $h_i(x_k + d) = h_i(x_k) + \nabla h_i(x_k)^T d, \; i = 1, \dots, l$
52. Sequential Linear Programming
• Let $f_k = f(x_k)$, $g_j^k = g_j(x_k)$, $h_i^k = h_i(x_k)$; $b_j = -g_j^k$, $e_i = -h_i^k$;
  $c = \nabla f(x_k)$, $a_j = \nabla g_j(x_k)$, $n_i = \nabla h_i(x_k)$;
  $A = [a_1, a_2, \dots, a_m]$, $N = [n_1, n_2, \dots, n_l]$.
• Using the first-order expansions, define an LP subprogram for the current iteration of the NLP problem:
  $\min_d \bar{f} = c^T d$
  Subject to: $A^T d \le b$, $N^T d = e$
  where $\bar{f}$ represents the first-order change in the cost function, and the columns of the $A$ and $N$ matrices represent, respectively, the gradients of the inequality and equality constraints.
• The resulting LP problem can be solved via the Simplex method.
53. Sequential Linear Programming
• We may note that:
  – Since both positive and negative changes to the design variables are allowed, the variables $d_i$ are unrestricted in sign.
  – The SLP method requires additional constraints of the form $-\Delta_{il}^k \le d_i^k \le \Delta_{iu}^k$ (termed move limits) to bind the LP solution. These limits represent the maximum allowable change in $d_i$ in the current iteration and are selected as a percentage of the current value.
  – Move limits serve the dual purpose of binding the solution and obviating the need for line search.
  – Overly restrictive move limits tend to make the SLP problem infeasible.
54. SLP Example
• Consider the convex NLP problem:
  $\min_{x_1, x_2} f(x_1, x_2) = x_1^2 - x_1 x_2 + x_2^2$
  Subject to: $1 - x_1^2 - x_2^2 \le 0$; $-x_1 \le 0$; $-x_2 \le 0$
  The problem has a single minimum at $x^* = \left( \frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}} \right)$.
• The objective and constraint gradients are:
  $\nabla f^T = [2 x_1 - x_2, \; 2 x_2 - x_1]$,
  $\nabla g_1^T = [-2 x_1, -2 x_2]$, $\nabla g_2^T = [-1, 0]$, $\nabla g_3^T = [0, -1]$.
• Let $x_0 = (1, 1)$; then $f_0 = 1$, $c^T = [1 \; 1]$, $b_1 = b_2 = b_3 = 1$;
  $a_1^T = [-2 \; {-2}]$, $a_2^T = [-1 \; 0]$, $a_3^T = [0 \; {-1}]$
55. SLP Example
• Define the LP subproblem at the current step as:
  $\min_{d_1, d_2} \bar{f}(x_1, x_2) = d_1 + d_2$
  Subject to:
  $\begin{bmatrix} -2 & -2 \\ -1 & 0 \\ 0 & -1 \end{bmatrix} \begin{bmatrix} d_1 \\ d_2 \end{bmatrix} \le \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$
• In the absence of move limits, the LP problem is unbounded; using 50% move limits, the SLP update is given as: $d^* = \left( -\frac{1}{2}, -\frac{1}{2} \right)^T$, $x_1 = \left( \frac{1}{2}, \frac{1}{2} \right)^T$, with resulting constraint violation $g = \left( \frac{1}{2}, 0, 0 \right)$; smaller move limits may be used to reduce the constraint violation.
56. Sequential Linear Programming
SLP Algorithm (Arora, p. 508):
• Initialize: choose $x_0$, $\epsilon_1 > 0$, $\epsilon_2 > 0$.
• For $k = 0, 1, 2, \dots$
  – Choose move limits $\Delta_{il}^k, \Delta_{iu}^k$ as some fraction of the current design $x_k$
  – Compute $f_k$, $c$, $g_j^k$, $h_i^k$, $b_j$, $e_i$
  – Formulate and solve the LP subproblem for $d_k$
  – If $g_j \le \epsilon_1, \; j = 1, \dots, m$; $|h_i| \le \epsilon_1, \; i = 1, \dots, l$; and $\|d_k\| \le \epsilon_2$, stop
  – Substitute $x_{k+1} \leftarrow x_k + \alpha d_k$, $k \leftarrow k + 1$.
57. Sequential Quadratic Programming
• Sequential quadratic programming (SQP) uses a quadratic approximation to the objective function at every iteration.
• The SQP subproblem is defined as:
  $\min_d \bar{f} = c^T d + \frac{1}{2} d^T d$
  Subject to: $A^T d \le b$, $N^T d = e$
• SQP does not require move limits, alleviating the shortcomings of the SLP method.
• The SQP subproblem is convex; hence, it has a single global minimum.
• SQP can be solved via a Simplex-based linear complementarity problem (LCP) framework.
58. Sequential Quadratic Programming
• The Lagrangian function for the SQP problem is defined as:
  $\mathcal{L}(d, u, v) = c^T d + \frac{1}{2} d^T d + u^T (A^T d - b + s) + v^T (N^T d - e)$
• Then the KKT conditions are:
  Optimality: $\nabla \mathcal{L} = c + d + A u + N v = 0$
  Feasibility: $A^T d + s = b$, $N^T d = e$
  Complementarity: $u^T s = 0$
  Non-negativity: $u \ge 0$, $s \ge 0$
59. Sequential Quadratic Programming
• Since $v$ is unrestricted in sign, let $v = y - z$, $y \ge 0$, $z \ge 0$; the KKT conditions are then compactly written as:
  $\begin{bmatrix} I & A & N & -N & 0 \\ A^T & 0 & 0 & 0 & I \\ N^T & 0 & 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} d \\ u \\ y \\ z \\ s \end{bmatrix} = \begin{bmatrix} -c \\ b \\ e \end{bmatrix}$, or $P X = Q$
• The complementary slackness conditions, $u^T s = 0$, translate as:
  $X_i X_{i+m+2l} = 0, \; i = n+1, \dots, n+m$
• The resulting problem can be solved via the Simplex method using the LCP framework.
60. Descent Function Approach
• In SQP methods, the line search step is based on minimization of a descent function that penalizes constraint violations, i.e.,
  $\Phi(x) = f(x) + R \, V(x)$
  where $f(x)$ is the cost function, $V(x)$ represents the current maximum constraint violation, and $R > 0$ is a penalty parameter.
• The descent function value at the current iteration is computed as:
  $\Phi_k = f_k + R \, V_k, \quad R = \max(R_k, r_k), \quad \text{where } r_k = \sum_{j=1}^{m} u_j^k + \sum_{i=1}^{l} |v_i^k|$
  $V_k = \max\left\{ 0; \; g_j, \; j = 1, \dots, m; \; |h_i|, \; i = 1, \dots, l \right\}$
• The line search subproblem is defined as:
  $\min_{\alpha} \Phi(\alpha) = \Phi(x_k + \alpha d_k)$
61. SQP Algorithm
SQP Algorithm (Arora, p. 526):
• Initialize: choose $x_0$, $R_0 = 1$, $\epsilon_1 > 0$, $\epsilon_2 > 0$.
• For $k = 0, 1, 2, \dots$
  – Compute $f_k$, $g_j^k$, $h_i^k$, $c$, $b_j$, $e_i$; compute $V_k$.
  – Formulate and solve the QP subproblem to obtain $d_k$ and the Lagrange multipliers $u^k$ and $v^k$.
  – If $V_k \le \epsilon_1$ and $\|d_k\| \le \epsilon_2$, stop.
  – Compute $R$; formulate and solve the line search subproblem for $\alpha$
  – Set $x_{k+1} \leftarrow x_k + \alpha d_k$, $R_{k+1} \leftarrow R$, $k \leftarrow k + 1$
• The above algorithm is convergent, i.e., $\Phi(x_k) \le \Phi(x_0)$; $x_k$ converges to the KKT point $x^*$
62. SQP with Approximate Line Search
• The SQP algorithm may use approximate line search as follows:
  Let $t_j, \; j = 0, 1, \dots$ denote a trial step size, $x_{k+1,j}$ denote the trial design point, $f_{k+1,j} = f(x_{k+1,j})$ denote the function value at the trial solution, and $\Phi_{k+1,j} = f_{k+1,j} + R \, V_{k+1,j}$ the penalty function value at the trial solution.
• The trial solution is required to satisfy the descent condition:
  $\Phi_{k+1,j} + t_j \gamma \|d_k\|^2 \le \Phi_k, \quad 0 < \gamma < 1$
  where a common choice is: $\gamma = \frac{1}{2}$, $t_j = \mu^j$, $\mu = \frac{1}{2}$, $j = 0, 1, 2, \dots$
• The above descent condition ensures that the constraint violation decreases at each step of the method.
63. SQP Example
• Consider the NLP problem: $\min_{x_1, x_2} f(x_1, x_2) = x_1^2 - x_1 x_2 + x_2^2$
  subject to $g_1: 1 - x_1^2 - x_2^2 \le 0$, $g_2: -x_1 \le 0$, $g_3: -x_2 \le 0$.
  Then $\nabla f^T = [2 x_1 - x_2, \; 2 x_2 - x_1]$, $\nabla g_1^T = [-2 x_1, -2 x_2]$, $\nabla g_2^T = [-1, 0]$, $\nabla g_3^T = [0, -1]$. Let $x_0 = (1, 1)$; then $f_0 = 1$, $c = [1, 1]^T$, $g_1(1,1) = g_2(1,1) = g_3(1,1) = -1$.
• Since all constraints are initially inactive, $V_0 = 0$, and $d = -c = [-1, -1]^T$; the line search problem is: $\min_{\alpha} \Phi(\alpha) = (1 - \alpha)^2$
• By setting $\Phi'(\alpha) = 0$, we get the analytical solution $\alpha = 1$; thus $x_1 = (0, 0)$, which results in a large constraint violation.
64. SQP Example
• Alternatively, we may use approximate line search as follows (a Matlab sketch follows):
  – Let $R_0 = 10$, $\gamma = \mu = \frac{1}{2}$; let $t_0 = 1$; then $x_{1,0} = (0, 0)$, $f_{1,0} = 0$, $V_{1,0} = 1$, $\Phi_{1,0} = 10$; $\|d_0\|^2 = 2$, and the descent condition $\Phi_{1,0} + \frac{1}{2} \|d_0\|^2 \le \Phi_0 = 1$ is not met at the trial point.
  – Next, for $t_1 = \frac{1}{2}$, we get: $x_{1,1} = \left( \frac{1}{2}, \frac{1}{2} \right)$, $f_{1,1} = \frac{1}{4}$, $V_{1,1} = \frac{1}{2}$, $\Phi_{1,1} = 5\frac{1}{4}$, and the descent condition fails again;
  – Next, for $t_2 = \frac{1}{4}$, we get: $x_{1,2} = \left( \frac{3}{4}, \frac{3}{4} \right)$, $V_{1,2} = 0$, $f_{1,2} = \Phi_{1,2} = \frac{9}{16}$, and the descent condition checks as: $\Phi_{1,2} + \frac{1}{8} \|d_0\|^2 \le \Phi_0 = 1$.
  – Therefore, we set $\alpha = t_2 = \frac{1}{4}$, $x_1 = x_{1,2} = \left( \frac{3}{4}, \frac{3}{4} \right)$, with no constraint violation.
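• A minimal Matlab sketch of this approximate line search; the functions and parameters are those of the example:
% approximate line search with step halving and the descent condition
f = @(x) x(1)^2 - x(1)*x(2) + x(2)^2;
V = @(x) max([0; 1-x(1)^2-x(2)^2; -x(1); -x(2)]); % max constraint violation
x0 = [1; 1]; d = [-1; -1]; R = 10; gam = 0.5;
Phi = @(x) f(x) + R*V(x);                         % descent (penalty) function
t = 1;
while Phi(x0 + t*d) + t*gam*(d'*d) > Phi(x0)
    t = t/2;                                      % halve the trial step size
end
% returns t = 0.25, reproducing x1 = (3/4, 3/4)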
66. SQP via Newton's Method
• Consider the following equality-constrained problem:
  $\min_x f(x)$, subject to $h_i(x) = 0, \; i = 1, \dots, l$
• The Lagrangian function is given as: $\mathcal{L}(x, v) = f(x) + v^T h(x)$
• The KKT conditions are: $\nabla \mathcal{L}(x, v) = \nabla f(x) + N v = 0$, $h(x) = 0$,
  where $N = \nabla h(x)$ is the Jacobian matrix whose $i$th column is $\nabla h_i(x)$
• Using a first-order Taylor series expansion (with shorthand notation):
  $\nabla \mathcal{L}_{k+1} = \nabla \mathcal{L}_k + \nabla^2 \mathcal{L}_k \, \Delta x + N \, \Delta v$
  $h_{k+1} = h_k + N^T \Delta x$
• By expanding $\Delta v = v_{k+1} - v_k$, $\nabla \mathcal{L}_k = \nabla f_k + N v_k$, and setting $\nabla \mathcal{L}_{k+1} = 0$, $h_{k+1} = 0$, we obtain:
  $\begin{bmatrix} \nabla^2 \mathcal{L}_k & N \\ N^T & 0 \end{bmatrix} \begin{bmatrix} \Delta x_k \\ v_{k+1} \end{bmatrix} = -\begin{bmatrix} \nabla f_k \\ h_k \end{bmatrix}$
  which is similar to the N-R update, but uses the Hessian of the Lagrangian.
67. SQP via Newton's Method
• Alternatively, we may consider minimizing the quadratic approximation:
  $\min_{\Delta x} \frac{1}{2} \Delta x^T \nabla^2 \mathcal{L} \, \Delta x + \nabla f^T \Delta x$
  Subject to: $h_i(x) + n_i^T \Delta x = 0, \; i = 1, \dots, l$
• The KKT conditions are: $\nabla f + \nabla^2 \mathcal{L} \, \Delta x + N v = 0$, $h + N^T \Delta x = 0$
• Thus the QP subproblem can be solved via Newton's method:
  $\begin{bmatrix} \nabla^2 \mathcal{L}_k & N \\ N^T & 0 \end{bmatrix} \begin{bmatrix} \Delta x_k \\ v_{k+1} \end{bmatrix} = -\begin{bmatrix} \nabla f_k \\ h_k \end{bmatrix}$
• The Hessian of the Lagrangian can be updated via the BFGS method as:
  $H_{k+1} = H_k + D_k - E_k$
  where $D_k = \frac{y_k y_k^T}{y_k^T \Delta x_k}$, $E_k = \frac{c_k c_k^T}{c_k^T \Delta x_k}$, $c_k = H_k \Delta x_k$, $y_k = \nabla \mathcal{L}_{k+1} - \nabla \mathcal{L}_k$
69. SQP with Hessian Update
• For the next step, the QP problem is defined as:
  $\min_{d_1, d_2} \bar{f} = \frac{3}{4} (d_1 + d_2) + \frac{1}{2} (d_1^2 + d_2^2)$
  Subject to: $-\frac{3}{2} (d_1 + d_2) \le 0$, $-d_1 \le 0$, $-d_2 \le 0$
• The application of the KKT conditions results in a linear system of equations, which is solved to obtain:
  $X^T = (d_1, d_2, u_1, u_2, u_3, s_1, s_2, s_3) = (0.188, 0.188, 0, 0, 0, 0.125, 0.75, 0.75)$
70. Modified SQP Algorithm
Modified SQP Algorithm (Arora, p. 558):
• Initialize: choose $x_0$, $R_0 = 1$, $H_0 = I$; $\epsilon_1, \epsilon_2 > 0$.
• For $k = 0, 1, 2, \dots$
  – Compute $f_k$, $g_j^k$, $h_i^k$, $c$, $b_j$, $e_i$, and $V_k$. If $k > 0$, compute $H_k$
  – Formulate and solve the modified QP subproblem for the search direction $d_k$ and the Lagrange multipliers $u^k$ and $v^k$.
  – If $V_k \le \epsilon_1$ and $\|d_k\| \le \epsilon_2$, stop.
  – Compute $R$; formulate and solve the line search subproblem for $\alpha$
  – Set $x_{k+1} \leftarrow x_k + \alpha d_k$, $R_{k+1} \leftarrow R$, $k \leftarrow k + 1$.
71. SQP Algorithm
%SQP subproblem via Hessian update
%input: xk (current design); Lk (Hessian of Lagrangian estimate)
%initialize
n=size(xk,1);
if ~exist('Lk','var'), Lk=diag(xk+(~xk)); end
tol=1e-7;
%function and constraint values
fk=f(xk);
dfk=df(xk);
gk=g(xk);
dgk=dg(xk);
%N-R update
A=[Lk dgk; dgk' 0*dgk'*dgk];
b=[-dfk;-gk];
dx=A\b;
dxk=dx(1:n);
lam=dx(n+1:end);
72. SQP Algorithm
%inactive constraints
idx1=find(lam<0);
if ~isempty(idx1)
[dxk,lam]=inactive(lam,A,b,n);
end
%check termination
if norm(dxk)<tol, return, end
%adjust increment for constraint compliance
P=@(xk) f(xk)+lam'*abs(g(xk));
while P(xk+dxk)>P(xk)
dxk=dxk/2;
if norm(dxk)<tol, break, end
end
%Hessian update
dL=@(x) df(x)+dg(x)*lam;
Lk=update(Lk, xk, dxk, dL);
xk=xk+dxk;
disp([xk' f(xk) P(xk)])
73. SQP Algorithm
%function definitions
function [dxk,lam]=inactive(lam,A,b,n)
idx1=find(lam<0);
lam(idx1)=0;
idx2=find(lam);
v=[1:n, n+idx2'];
A=A(v,v); b=b(v);
dx=A\b;
dxk=dx(1:n);
lam(idx2)=dx(n+1:end);
end
function Lk=update(Lk, xk, dxk, dL)
ga=dL(xk+dxk)-dL(xk);
Hx=Lk*dxk;
Dk=ga*ga'/(ga'*dxk);
Ek=Hx*Hx'/(Hx'*dxk);
Lk=Lk+Dk-Ek;
end
74. Generalized Reduced Gradient
• The GRG method finds the search direction by projecting the objective function gradient onto the constraint hyperplane.
• The GRG direction is tangent to the constraint hyperplane, so that the iterative steps try to conform to the constraints.
• The constraints are effectively used to implicitly eliminate variables and reduce the problem dimensions.
75. Implicit Elimination
• Consider an equality-constrained problem in two variables:
  Objective: $\min f(x)$, $x^T = (x_1, x_2)$
  Subject to: $g(x) = 0$
• The variations in the objective and constraint functions are:
  $df = \nabla f^T dx = \frac{\partial f}{\partial x_1} dx_1 + \frac{\partial f}{\partial x_2} dx_2$
  $dg = \nabla g^T dx = \frac{\partial g}{\partial x_1} dx_1 + \frac{\partial g}{\partial x_2} dx_2 = 0$
• Solve for $dx_2 = -\frac{\partial g / \partial x_1}{\partial g / \partial x_2} dx_1$ and substitute in the objective function:
  $df = \left( \frac{\partial f}{\partial x_1} - \frac{\partial f}{\partial x_2} \frac{\partial g / \partial x_1}{\partial g / \partial x_2} \right) dx_1$
• Then the reduced gradient of $f$ along $x_1$ is given as:
  $\nabla f_R = \frac{\partial f}{\partial x_1} - \frac{\partial f}{\partial x_2} \frac{\partial g / \partial x_1}{\partial g / \partial x_2}$
76. Implicit Elimination
• Consider a problem in $n$ variables with $m$ equality constraints:
  Objective: $\min f(x)$, $x^T = (x_1, x_2, \dots, x_n)$
  Subject to: $g_j(x) = 0, \; j = 1, \dots, m$
• We define $m$ basic variables in terms of $n - m$ nonbasic variables; let $x^T = (y^T, z^T)$, where $y$ are basic and $z$ are nonbasic.
• The gradient vector is partitioned as $\nabla f^T = \left( \nabla_y f^T, \nabla_z f^T \right)$.
• The variations in the objective and constraint functions are:
  $df = \nabla_y f^T dy + \nabla_z f^T dz$
  $dg = \frac{\partial g}{\partial y} dy + \frac{\partial g}{\partial z} dz = 0$
  where the matrices of partial derivatives are defined as:
  $\left[ \frac{\partial g}{\partial y} \right]_{ji} = \frac{\partial g_j}{\partial y_i}, \quad \left[ \frac{\partial g}{\partial z} \right]_{ji} = \frac{\partial g_j}{\partial z_i}$
77. Generalized Reduced Gradient
• Since $\frac{\partial g}{\partial y}$ is a square $m \times m$ matrix, we may solve for $dy$ as:
  $dy = -\left( \frac{\partial g}{\partial y} \right)^{-1} \frac{\partial g}{\partial z} dz$, and substitute in $df$ to obtain:
  $df = \left( \nabla_z f^T - \nabla_y f^T \left( \frac{\partial g}{\partial y} \right)^{-1} \frac{\partial g}{\partial z} \right) dz$
• Then the reduced gradient $\nabla f_R$ is defined as:
  $\nabla f_R^T = \nabla_z f^T - \nabla_y f^T \left( \frac{\partial g}{\partial y} \right)^{-1} \frac{\partial g}{\partial z}$
• Next, we choose the negative of $\nabla f_R$ as the search direction and perform a line search to determine the step size; then $\Delta z = -\alpha \nabla f_R$,
  $\Delta y = -\left( \frac{\partial g}{\partial y} \right)^{-1} \frac{\partial g}{\partial z} \Delta z$
78. GRG Algorithm
• Initialize: choose $x_0$; evaluate the objective function and constraints; convert binding inequality constraints to equality constraints.
• Partition the variables into $m$ basic and $n - m$ nonbasic ones, e.g., choose the first $m$ values, or the $m$ highest values, as basic variables.
• Compute $\nabla f_R$ along the nonbasic variables. If $\nabla f_R = 0$, exit.
• Set $\Delta z = -\nabla f_R / \|\nabla f_R\|$, $\Delta y = -\left( \frac{\partial g}{\partial y} \right)^{-1} \frac{\partial g}{\partial z} \Delta z$.
• Do a line search along $\Delta x$ to obtain $\alpha$.
• Check feasibility at $x_k + \alpha \Delta x$. If necessary, use Newton-Raphson iterations to adjust $\Delta y$ as: $\Delta y_{k+1} = \Delta y_k - \left( \frac{\partial g}{\partial y} \right)^{-1} g$
• Update: $x_{k+1} = x_k + \alpha \Delta x$
79. Generalized Reduced Gradient
• Consider an equality-constrained problem:
  Objective: $\min f(x) = 3 x_1 + 2 x_2 + 2 x_1^2 - x_1 x_2 + 1.5 x_2^2$
  Subject to: $g(x) = x_1^2 - x_2 - 1 = 0$
• Let $x_0 = (-1, 0)$; then $f_0 = -1$, $\nabla f_0 = (-1, 3)$, $g_0 = 0$, $\nabla g_0 = (-2, -1)$.
• Let $y = x_2$ on the first iteration; then $\nabla f_R^T = -1 - 3 \cdot \frac{-2}{-1} = -7$ (a Matlab sketch follows).
• Let $\Delta z = 1$; then $\Delta y = \frac{-2}{-1} \cdot 1 = 2$. By doing a line search along $\Delta x = (0.333, 0.667)$, we obtain $x_1 = (-0.350, -0.577)$, $f_1 = -2.13$.
• The optimum is reached in three iterations: $x^* = (-0.634, -0.598)$, $f(x^*) = -2.137$.
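• A minimal Matlab sketch of the reduced-gradient computation at $x_0$ for this example, with $y = x_2$ basic and $z = x_1$ nonbasic; the function handles are illustrative:
% reduced gradient at x0 = (-1, 0)
df = @(x) [3 + 4*x(1) - x(2); 2 - x(1) + 3*x(2)]; % gradient of f
dg = @(x) [2*x(1); -1];                           % gradient of g
x0 = [-1; 0];
c = df(x0); a = dg(x0);
dfR = c(1) - c(2)*(a(1)/a(2))                     % = -1 - 3*(-2)/(-1) = -7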