0
Sebastian Bernasek
7-14-2015
Intro to Optimization: Part 1
1
What is optimization?
Identify variable values that minimize or maximize
some objective while satisfying constraints
objective
variables
constraints
minimize f(x)
where x = {x1,x2,..xn}
s.t. Ax < b
2
What for?
Finance
• maximize profit, minimize risk
• constraints: budgets, regulations
Engineering
• maximize IRR, minimize emissions
• constraints: resources, safety
Data modeling
3
Given a proposed model:
y(x) = θ1 sin(θ2 x)
which parameters (θi) best describe the data?
Data modeling
4
Which parameters (θi) best describe the data?
We must quantify goodness-of-fit
Data modeling
5
A good model will have minimal residual error
Goodness-of-fit metrics
ei = Yi − y(Xi)
where (Xi, Yi) are the data and y(Xi) is the model, e.g. y(Xi) = θ1 sin(θ2 Xi)
6
 Least Squares
 Weighted Least Squares
Goodness-of-fit metrics
SSE = Σ_{i=1}^{N} ei² = Σ_{i=1}^{N} (Yi − y(Xi))²   (all data equally important)
WSSE = Σ_{i=1}^{N} ei² / σi² = Σ_{i=1}^{N} (Yi − y(Xi))² / σi²   (gives greater importance to more precise data)
We seek to minimize SSE and WSSE
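As a minimal sketch (not part of the original slides), both metrics can be written directly for the sine model; the array names X, Y, and sigma are assumptions standing in for the plotted data and their per-point uncertainties:

```python
import numpy as np

def sse(theta, X, Y):
    """Sum of squared errors: all data points are weighted equally."""
    residuals = Y - theta[0] * np.sin(theta[1] * X)
    return np.sum(residuals ** 2)

def wsse(theta, X, Y, sigma):
    """Weighted SSE: points with smaller sigma (more precise) count for more."""
    residuals = Y - theta[0] * np.sin(theta[1] * X)
    return np.sum((residuals / sigma) ** 2)
```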
7
 Log likelihood
Define the likelihood L(θ|Y)=p(Y|θ)
as the likelihood of θ being the true
parameters given the observed data
Goodness-of-fit metrics
8
 Log likelihood
Given that the Yi are independent and identically distributed (iid), we can compute p(Y|θ):
the log transform is for convenience
Goodness-of-fit metrics
L(θ|Y) = p(Y|θ) = Π_{i=1}^{N} p(Yi|θ)
ln L(θ|Y) = Σ_{i=1}^{N} ln p(Yi|θ)
We seek to maximize ln L(θ | Y)
9
 Log likelihood
So what is p(Yi|θ) ?
Assume each residual is drawn from a distribution. For
example, assume ei are Gaussian distributed with
Goodness-of-fit metrics
mean μ = ⟨ei⟩ = 0 (so ei − ⟨ei⟩ = ei) and variance σi²:
p(Yi|θ) = (2πσi²)^(−1/2) exp[ −(Yi − y(Xi))² / (2σi²) ]
10
 Log likelihood
Goodness-of-fit metrics
ln L(θ|Y) = Σ_{i=1}^{N} ln p(Yi|θ)
ln L(θ|Y) = Σ_{i=1}^{N} ln[ (2πσi²)^(−1/2) exp( −(Yi − y(Xi))² / (2σi²) ) ]
ln L(θ|Y) = Σ_{i=1}^{N} ln[ (2πσi²)^(−1/2) ] − Σ_{i=1}^{N} (Yi − y(Xi))² / (2σi²)
ln L(θ|Y) = Σ_{i=1}^{N} ln[ (2πσi²)^(−1/2) ] − Σ_{i=1}^{N} (Yi − θ1 sin(θ2 Xi))² / (2σi²)
maximize ln L(θ | Y)
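A hedged sketch of this objective in code (not from the slides): since most optimizers minimize, the negative log-likelihood is returned; X, Y, and sigma are assumed data arrays.

```python
import numpy as np

def neg_log_likelihood(theta, X, Y, sigma):
    """Negative Gaussian log-likelihood for y(x) = theta1 * sin(theta2 * x).
    Minimizing this is equivalent to maximizing ln L(theta | Y)."""
    model = theta[0] * np.sin(theta[1] * X)
    log_norm = np.sum(np.log(1.0 / np.sqrt(2.0 * np.pi * sigma ** 2)))
    return -(log_norm - np.sum((Y - model) ** 2 / (2.0 * sigma ** 2)))
```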
11
 Least Squares
• simple and straightforward to implement
• requires large N for high accuracy
 Weighted Least Squares
• accounts for variability in precision of variables
• converges to least squares for high N
 Log Likelihood
• requires assumption for residuals PDF
Goodness-of-fit metrics
12
Given a proposed model:
y(x) = θ1 sin(θ2 x)
which parameters (θi) best describe the data?
Data modeling
objective
variables
constraints
minimize SSE(θ)
where θ = {θ1,θ2,..θn}
s.t. Aθ < b
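A minimal end-to-end sketch of this fit with scipy.optimize.minimize, using synthetic data in place of the plotted points and ignoring the linear constraint Aθ < b for simplicity; the starting guess x0 is also an assumption:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
X = np.linspace(0, 10, 50)
Y = 5.0 * np.sin(1.0 * X) + rng.normal(scale=1.0, size=X.size)   # synthetic data

def sse(theta):
    return np.sum((Y - theta[0] * np.sin(theta[1] * X)) ** 2)

result = minimize(sse, x0=[4.0, 1.2], method='Nelder-Mead')
print(result.x, result.fun)   # should recover parameters near theta = {5, 1}
```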
13
Given a proposed model:
y(x) = θ1 sin(θ2 x)
which parameters (θi) best describe the data?
Data modeling
optimum
variables
minimum
θ = {5,1}
SSE(θ) = 277
14
minimize f(x)
where x = {x1,x2,..xn}
s.t. Ax < b
So how do we optimize?
15
Types of problems
There are many classes of optimization problems
1. constrained vs unconstrained
2. static vs dynamic
3. continuous vs discrete variables
4. deterministic vs stochastic variables
5. single vs multiple objective functions
16
Types of algorithms
There are many more classes of algorithms that
attempt to solve these problems
NEOS, UW
17
Types of algorithms
There are many more classes of algorithms that
attempt to solve these problems
NEOS, UW
18
Unconstrained Optimization
 Zero-Order Methods (function calls only)
• Nelder-Mead Simplex (direct search)
• Powell Conjugate Directions
 First-Order Methods
• Steepest Descent
• Nonlinear Conjugate Gradients
• Broyden-Fletcher-Goldfarb-Shanno Algorithms (BFGS)
 Second-Order Methods
• Newton’s Method
• Newton Conjugate Gradient
Here we classify algorithms by the derivative
information utilized.
scipy.optimize.fmin
19
 All but the simplex and Newton methods call
one-dimensional line searches as a subroutine
 Common option:
• Bisection Methods (e.g. Golden Search)
General Iterative Scheme
α=step size
dn = search direction
Unconstrained Optimization in 1-D
xn+1 = xn + α dn
R = a/b = golden ratio, where R² + R − 1 = 0
linear convergence, but robust
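A minimal sketch of a golden-section line search, assuming the minimum has already been bracketed by [a, b] (not taken from the slides):

```python
import math

def golden_section(f, a, b, tol=1e-6):
    """Golden-section search for a minimum of f on [a, b]: only function calls,
    linear convergence, but robust."""
    R = (math.sqrt(5.0) - 1.0) / 2.0        # golden ratio, satisfies R**2 + R - 1 = 0
    c, d = b - R * (b - a), a + R * (b - a)
    while abs(b - a) > tol:
        if f(c) < f(d):
            b = d                            # minimum lies in [a, d]
        else:
            a = c                            # minimum lies in [c, b]
        c, d = b - R * (b - a), a + R * (b - a)
    return (a + b) / 2.0

# example: golden_section(lambda x: (x - 2.0) ** 2, 0.0, 5.0) returns ~2.0
```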
20
1-D Optimization
root finding
 Calculus-based option:
• Newton-Raphson
Unconstrained Optimization in 1-D
We want f'(xn+1) = 0, so let g(x) = f'(x).
Linearizing, g(xn+1) = g(xn) + g'(xn)(xn+1 − xn) = 0 gives xn+1 = xn − g(xn)/g'(xn), i.e.
xn+1 = xn − f'(xn) / f''(xn)
f''(xn) ≈ (f'(xn) − f'(xn−1)) / (xn − xn−1)
can use explicit derivatives or a numerical approximation
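A minimal sketch of this scheme, using the finite-difference approximation of f'' from the slide so that only f' needs to be supplied; the test function and starting points are illustrative assumptions:

```python
def newton_raphson_1d(fprime, x0, x1, tol=1e-8, max_iter=100):
    """Drive f'(x) to zero, approximating f''(xn) from the last two gradients."""
    for _ in range(max_iter):
        fp0, fp1 = fprime(x0), fprime(x1)
        fpp = (fp1 - fp0) / (x1 - x0)        # f''(xn) ~ (f'(xn) - f'(xn-1)) / (xn - xn-1)
        x_next = x1 - fp1 / fpp              # xn+1 = xn - f'(xn) / f''(xn)
        if abs(x_next - x1) < tol:
            return x_next
        x0, x1 = x1, x_next
    return x1

# example: for f(x) = (x - 3)**2, f'(x) = 2*(x - 3)
# newton_raphson_1d(lambda x: 2.0 * (x - 3.0), 0.0, 1.0) converges to ~3.0
```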
21
 Move to minimum of quadratic fit at each point
can achieve quadratic convergence for twice
differentiable functions
Newton-Raphson
COS 323 Course Notes, Princeton U.
22
Newton’s Method in N-Dimensions
1. Construct a locally quadratic model (mn) via
Taylor expansion about xn:
For points near xn, let pn = x − xn:
mn(pn) = f(xn) + pnᵀ ∇f(xn) + (1/2) pnᵀ H(xn) pn
2. At each step we want to move toward the minimum of this model, where ∇mn(pn) = 0
differentiating: ∇mn(pn) = ∇f(xn) + H(xn) pn
solving: pn = −H⁻¹(xn) ∇f(xn)
23
Newton’s Method in N-Dimensions
3. The minimum of the local second-order model
lies in the direction pn.
Determine the optimal step size, α, by 1-D optimization
Search direction: pn = −H⁻¹(xn) ∇f(xn)
General iterative scheme: xn+1 = xn + α dn   (α = step size, dn = search direction)
α = argmin_α f(xn + α pn) = argmin_α f(xn − α H⁻¹(xn) ∇f(xn))
1-D options: golden search, Newton's method, Brent's method, Nelder-Mead simplex, etc.
24
Newton’s Method in N-Dimensions
4. Take the step
5. Check termination criteria and return to step 3
xn+1 = xn + α pn
possible criteria:
• Maximum iterations reached
• Change in objective function below threshold
• Change in local gradient below threshold
• Change in local Hessian below threshold
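Putting steps 1-5 together, a minimal sketch of damped Newton's method (an illustration, not the slides' own code); the bounded (0, 10) line-search bracket is an arbitrary assumption:

```python
import numpy as np
from scipy.optimize import minimize_scalar

def newton_nd(f, grad, hess, x0, tol=1e-8, max_iter=50):
    """Newton direction from the local quadratic model, step size from a 1-D search."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:                      # terminate on small gradient
            break
        p = -np.linalg.solve(hess(x), g)                 # pn = -H^{-1}(xn) grad f(xn)
        alpha = minimize_scalar(lambda a: f(x + a * p),
                                bounds=(0.0, 10.0), method='bounded').x
        x = x + alpha * p                                # xn+1 = xn + alpha * pn
    return x
```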
25
Newton’s Method in N-Dimensions
 How do we compute the Hessian?
pn = −H⁻¹(xn) ∇f(xn), i.e. solve H(xn) pn = −∇f(xn)
Newton’s Method
• Define H(xn) expressions analytically
• Invert it and multiply
• Accurate
• Costly for high N
• Requires 2nd derivatives
BFGS Algorithm (quasi-Newton method)
• Numerically approximates H⁻¹(xn)
• Multiply matrices
• Avoids solving the linear system
• Only requires 1st derivatives
• Inverse-Hessian update built from successive gradient differences (secant condition)
scipy.optimize.fmin_bfgs
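A small usage sketch of the scipy routine named on the slide (the test function is an arbitrary example); in newer scipy versions the equivalent call is minimize(f, x0, method='BFGS'):

```python
import numpy as np
from scipy.optimize import fmin_bfgs

def f(x):
    return (x[0] - 1.0) ** 2 + 10.0 * (x[1] + 2.0) ** 2

# the gradient is approximated numerically when fprime is not supplied
x_opt = fmin_bfgs(f, x0=np.zeros(2))
print(x_opt)   # approximately [1, -2]
```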
26
Gradient Descent
 Newton/BFGS make use of the local Hessian
 Alternatively we could just use the gradient
1. Pick a starting point, x0
2. Evaluate the local derivative, ∇f(xn)
3. Perform a line search along the gradient: α = argmin_α f(xn − α ∇f(xn))
4. Move directly along the gradient: xn+1 = xn − α ∇f(xn)
5. Check convergence criteria and return to step 2
27
Gradient Descent
 Function must be differentiable
 Subsequent steps are always perpendicular to one another
 Can get caught in narrow valleys
xn+1 = xn − α ∇f(xn),  α = argmin_α f(xn − α ∇f(xn))
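A minimal sketch of steepest descent with an exact line search at every step, under the same arbitrary (0, 10) step bracket as the Newton sketch above:

```python
import numpy as np
from scipy.optimize import minimize_scalar

def gradient_descent(f, grad, x0, tol=1e-8, max_iter=500):
    """Steepest descent: step along -grad f with a 1-D line search for alpha."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        alpha = minimize_scalar(lambda a: f(x - a * g),
                                bounds=(0.0, 10.0), method='bounded').x
        x = x - alpha * g                    # xn+1 = xn - alpha * grad f(xn)
    return x
```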
28
Conjugate Gradient Method
 Avoids reversing previous iterations by ensuring that
each step is conjugate to all previous steps, creating a
linearly independent set of basis vectors
1. Pick a starting point and evaluate local derivative
2. First step follows gradient descent
3. Compute weights for previous steps, βn
x1 = x0 − α ∇f(x0),  α = argmin_α f(x0 − α ∇f(x0))
βn = Δxnᵀ(Δxn − Δxn−1) / (Δxn−1ᵀ Δxn−1)   (Polak-Ribiere version)
where Δxn = −∇f(xn) is the steepest-descent direction
29
Conjugate Gradient Method
 Creates a set of linearly independent vectors, si, that span the parameter space
4. Compute the search direction: sn = −∇f(xn) + βn sn−1
5. Move to the optimal point along sn: xn+1 = xn + α sn,  α = argmin_α f(xn + α sn)
6. Check convergence criteria and return to step 3
*Note that setting βn = 0 yields the gradient descent algorithm
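A minimal sketch of nonlinear conjugate gradients with the Polak-Ribiere weight from the previous slide; the non-negativity clamp on β (a common restart heuristic) and the (0, 10) line-search bracket are additions not stated on the slides:

```python
import numpy as np
from scipy.optimize import minimize_scalar

def conjugate_gradient(f, grad, x0, tol=1e-8, max_iter=200):
    """Nonlinear CG: s_n = -grad f(x_n) + beta_n * s_{n-1}, Polak-Ribiere beta."""
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    s = -g                                   # first step follows gradient descent
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        alpha = minimize_scalar(lambda a: f(x + a * s),
                                bounds=(0.0, 10.0), method='bounded').x
        x = x + alpha * s
        g_new = grad(x)
        beta = max(0.0, g_new @ (g_new - g) / (g @ g))   # restart when beta < 0
        s = -g_new + beta * s                # s_n = -grad f + beta * s_{n-1}
        g = g_new
    return x
```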
30
Conjugate Gradient Method
 For properly conditioned problems, guaranteed to converge in N iterations
 Very commonly used: scipy.optimize.fmin_cg
31
Powell’s Conjugate Directions
 Performs N line searches along N basis vectors in order
to determine an optimal search direction
 Preserves minimization achieved by previous steps by
retaining the basis vector set between iterations
1. Pick a starting point and a set of basis vectors ui (the unit coordinate vectors, ui = ei, is the conventional choice)
2. Determine the optimum step size, αi, along each vector
3. Let the search vector be the linear combination of the basis vectors: sn = Σ_{i=1}^{N} αi ui
32
Powell’s Conjugate Directions
4. Move along the search vector: xn+1 = xn + α sn
5. Add sn to the basis and drop the oldest basis vector
6. Check the convergence criteria and return to step 2
Problem: Algorithm tends toward a linearly dependent basis set
Solutions: 1. Reset to an orthogonal basis every N iterations
2. At step 5, replace the basis vector corresponding to the
largest change in f(x)
33
Powell’s Conjugate Directions
Advantages
 No derivatives required: only uses function calls
 Quadratic convergence
Accessible via scipy.optimize.fmin_powell
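A small usage sketch of the scipy routine named above, applied to the comparison function used later in the deck (the starting point is chosen arbitrarily):

```python
import numpy as np
from scipy.optimize import fmin_powell

def f(v):
    return np.sin(v[0]) + np.cos(v[1])

x_opt = fmin_powell(f, x0=[0.5, 0.5])   # derivative-free: only function calls
print(f(x_opt))                          # approaches the minimum value of -2
```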
34
Nelder-Mead Simplex Algorithm
Direct search algorithm
Default method: scipy.optimize.fmin
Method consists of a simplex crawling
around the parameter space until it finds
and brackets a local minimum.
35
Nelder-Mead Simplex Algorithm
Simplex: convex hull of N+1 vertices in N-space.
2D: a triangle 3D: a tetrahedron
36
Nelder-Mead Simplex Algorithm
1. Pick a starting point and define a simplex around it with
N+1 vertices xi
2. Evaluate f(xi) at each vertex and rank order the vertices
such that x1 is the best and xN+1 is the worst
3. Evaluate the centroid of the best N vertices
x̄ = (1/N) Σ_{i=1}^{N} xi
37
Nelder-Mead Simplex Algorithm
4. Reflection: let xr = x̄ + α(x̄ − xN+1)
If f(x1) ≤ f(xr) < f(xN), replace xN+1 with xr
(xN+1 is the worst point, i.e. the highest function value)
COS 323 Course Notes, Princeton U.
38
Nelder-Mead Simplex Algorithm
5. Expansion:
If reflection resulted in the best point, try:
xe = x̄ + β(x̄ − xN+1)
If f(xe) < f(xr), then replace xN+1 with xe; if not, replace xN+1 with xr
COS 323 Course Notes, Princeton U.
39
Nelder-Mead Simplex Algorithm
6. Contraction:
If reflected point is still the worst, then try contraction
xc = x̄ − γ(xr − x̄)
COS 323 Course Notes, Princeton U.
40
Nelder-Mead Simplex Algorithm
7. Shrinkage:
If contraction fails, scale all vertices toward the best vertex.
xi = x1 + δ(xi − x1)   for 2 ≤ i ≤ N+1
COS 323 Course Notes, Princeton U.
41
Nelder-Mead Simplex Algorithm
 Advantages:
 Doesn’t require any derivatives
 Few function calls at each iteration
 Works with rough surfaces
 Disadvantages:
 Can require many iterations
 Does not always converge; general convergence guarantees are not known
 Inefficient in very high N
42
Nelder-Mead Simplex Algorithm
Parameter Required Typical
α > 0 1
β > 1 2
γ 0 < γ < 1 0.5
δ 0 < δ < 1 0.5
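A minimal sketch that strings steps 1-7 together using the parameter values in the table above; it follows the simplified scheme on these slides rather than any particular library implementation, and the initial-simplex construction is an assumption:

```python
import numpy as np

def nelder_mead(f, x0, alpha=1.0, beta=2.0, gamma=0.5, delta=0.5,
                tol=1e-8, max_iter=500):
    """Nelder-Mead simplex: reflection, expansion, contraction, shrinkage."""
    x0 = np.asarray(x0, dtype=float)
    n = x0.size
    # initial simplex: x0 plus a small perturbation along each axis (assumed scheme)
    simplex = [x0] + [x0 + 0.1 * np.eye(n)[i] for i in range(n)]
    for _ in range(max_iter):
        simplex.sort(key=f)                          # x1 best ... xN+1 worst
        best, worst = simplex[0], simplex[-1]
        if abs(f(worst) - f(best)) < tol:
            break
        centroid = np.mean(simplex[:-1], axis=0)     # centroid of the best N vertices
        xr = centroid + alpha * (centroid - worst)   # reflection
        if f(best) <= f(xr) < f(simplex[-2]):
            simplex[-1] = xr
        elif f(xr) < f(best):                        # expansion
            xe = centroid + beta * (centroid - worst)
            simplex[-1] = xe if f(xe) < f(xr) else xr
        else:                                        # contraction
            xc = centroid - gamma * (xr - centroid)
            if f(xc) < f(worst):
                simplex[-1] = xc
            else:                                    # shrinkage toward the best vertex
                simplex = [best] + [best + delta * (v - best) for v in simplex[1:]]
    return min(simplex, key=f)
```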
43
Algorithm Comparison
method               min f(x)   iterations   f(x) evals   f'(x) evals
powell               -2         2            43           0
conjugate gradient   -2         4            40           10
gradient descent     -2         3            32           8
bfgs                 -2         6            48           12
simplex              -2         45           87           0
f(x, y) = sin(x) + cos(y)
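A hedged sketch of how such a comparison can be reproduced with scipy.optimize.minimize (plain gradient descent has no scipy method, so it is omitted); the counts depend on the implementation and starting point, so they will not match the table exactly:

```python
import numpy as np
from scipy.optimize import minimize

def f(v):
    return np.sin(v[0]) + np.cos(v[1])

x0 = np.array([0.5, 0.5])
for method in ['Powell', 'CG', 'BFGS', 'Nelder-Mead']:
    res = minimize(f, x0, method=method)
    print(f"{method:12s}  min f = {res.fun:+.2f}   f evals = {res.nfev}")
```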
44
Algorithm Comparison
f(x, y) = sin(xy) + cos(y)
 Simplex & Powell seem to similarly follow valleys with a more “local” focus
 BFGS/CG readily transcend valleys
45
2-D Rosenbrock Function
method               min f(x)   iterations   f(x) evals   f'(x) evals
powell               3.8E-28    25           719          0
conjugate gradient   9.5E-08    33           368          89
gradient descent     1.1E+01    400          1712         428
bfgs                 1.8E-11    47           284          71
simplex              5.6E-10    106          201          0
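scipy ships the Rosenbrock test function, so a comparable experiment can be sketched as below; the starting point [-1.2, 1] is the classic choice but is an assumption here, so the numbers will differ from the table:

```python
from scipy.optimize import minimize, rosen, rosen_der

x0 = [-1.2, 1.0]
for method, jac in [('Powell', None), ('CG', rosen_der),
                    ('BFGS', rosen_der), ('Nelder-Mead', None)]:
    res = minimize(rosen, x0, jac=jac, method=method)
    print(f"{method:12s}  min f = {res.fun:.2e}   iterations = {res.nit}")
```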