Optimization
COS 323
Ingredients
• Objective function
• Variables
• Constraints
Find values of the variables
that minimize or maximize the objective function
while satisfying the constraints
Different Kinds of Optimization
Figure from: Optimization Technology Center
http://www-fp.mcs.anl.gov/otc/Guide/OptWeb/
Different Optimization Techniques
• Algorithms have very different flavor depending
on specific problem
– Closed form vs. numerical vs. discrete
– Local vs. global minima
– Running times ranging from O(1) to NP-hard
• Today:
– Focus on continuous numerical methods
Optimization in 1-D
• Look for analogies to bracketing in root-finding
• What does it mean to bracket a minimum?
(xleft, f(xleft))
(xright, f(xright))
(xmid, f(xmid))
xleft < xmid < xright
f(xmid) < f(xleft)
f(xmid) < f(xright)
Optimization in 1-D
• Once we have these properties, there is at least
one local minimum between xleft and xright
• Establishing bracket initially:
– Given xinitial, increment
– Evaluate f(xinitial), f(xinitial+increment)
– If decreasing, step until find an increase
– Else, step in opposite direction until find an increase
– Grow increment at each step
• For maximization: substitute –f for f
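The bracket-growing procedure above can be sketched in Python (a minimal sketch; the name `bracket_minimum`, the starting step, and the growth factor are illustrative choices, not from the slides):

```python
def bracket_minimum(f, x0, step=1e-2, grow=2.0, max_iter=100):
    """Grow steps from x0 until f increases, yielding (xleft, xmid, xright)
    with f(xmid) below the value at both endpoints."""
    a, b = x0, x0 + step
    fa, fb = f(a), f(b)
    if fb > fa:                       # heading uphill: flip direction
        a, b, fa, fb = b, a, fb, fa
        step = -step
    for _ in range(max_iter):
        step *= grow                  # grow the increment at each step
        c = b + step
        fc = f(c)
        if fc > fb:                   # found an increase: bracket established
            return (a, b, c) if a < c else (c, b, a)
        a, b, fb = b, c, fc
    raise RuntimeError("no bracket found")
```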
Optimization in 1-D
• Strategy: evaluate function at some xnew
(xleft, f(xleft))
(xright, f(xright))
(xmid, f(xmid))
(xnew, f(xnew))
Optimization in 1-D
• Strategy: evaluate function at some xnew
– Here, new “bracket” points are xnew, xmid, xright
(xleft, f(xleft))
(xright, f(xright))
(xmid, f(xmid))
(xnew, f(xnew))
Optimization in 1-D
• Strategy: evaluate function at some xnew
– Here, new “bracket” points are xleft, xnew, xmid
(xleft, f(xleft))
(xright, f(xright))
(xmid, f(xmid))
(xnew, f(xnew))
Optimization in 1-D
• Unlike with root-finding, can’t always guarantee
that interval will be reduced by a factor of 2
• Let’s find the optimal place for xmid, relative to
left and right, that will guarantee same factor of
reduction regardless of outcome
Optimization in 1-D
if f(xnew) < f(xmid)
 new interval = α
else
 new interval = 1 – α²
Golden Section Search
• To assure same interval, want α = 1 – α²
• So, α = (√5 – 1) / 2
• This is the “golden ratio” = 0.618…
• So, interval decreases by 30% per iteration
– Linear convergence
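A minimal Python sketch of Golden Section search, using the α ≈ 0.618 derived above (the function name and tolerance are illustrative choices):

```python
import math

def golden_section(f, a, b, tol=1e-8):
    """Shrink the bracket [a, b] by the golden fraction each iteration."""
    alpha = (math.sqrt(5) - 1) / 2          # ≈ 0.618
    x1 = b - alpha * (b - a)
    x2 = a + alpha * (b - a)
    f1, f2 = f(x1), f(x2)
    while b - a > tol:
        if f1 < f2:                         # minimum lies in [a, x2]
            b, x2, f2 = x2, x1, f1
            x1 = b - alpha * (b - a)
            f1 = f(x1)
        else:                               # minimum lies in [x1, b]
            a, x1, f1 = x1, x2, f2
            x2 = a + alpha * (b - a)
            f2 = f(x2)
    return (a + b) / 2
```

Note that only one new function evaluation is needed per iteration: the surviving interior point is reused.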
Error Tolerance
• Around minimum, derivative = 0, so
f(x + Δx) = f(x) + f′(x) Δx + ½ f″(x) Δx² + …
f(x + Δx) – f(x) ≈ ½ f″(x) Δx²
Δx ~ √ε_machine
• Rule of thumb: pointless to ask for more
accuracy than sqrt(ε_machine)
– Can use double precision if you want a single-
precision result (and/or have single-precision data)
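The √ε rule of thumb can be checked numerically (a small illustration; the quadratic test function is arbitrary). Within about √ε_machine of a minimum, perturbations no longer change f in double precision:

```python
import sys

eps = sys.float_info.epsilon            # ~2.2e-16 for doubles
f = lambda x: (x - 1.0) ** 2 + 1.0      # minimum value 1 at x = 1
dx = eps ** 0.5                         # ~1.5e-8

flat = f(1.0 + 0.1 * dx) == f(1.0)      # True: indistinguishable from the minimum
visible = f(1.0 + 100 * dx) > f(1.0)    # True: difference representable again
```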
Faster 1-D Optimization
• Trade off super-linear convergence for
worse robustness
– Combine with Golden Section search for safety
• Usual bag of tricks:
– Fit parabola through 3 points, find minimum
– Compute derivatives as well as positions, fit cubic
– Use second derivatives: Newton
Newton’s Method
• At each step:
• Requires 1st and 2nd derivatives
• Quadratic convergence
x_{k+1} = x_k – f′(x_k) / f″(x_k)
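The step above can be sketched directly (a minimal sketch; names and tolerances are illustrative):

```python
def newton_1d(df, d2f, x0, tol=1e-10, max_iter=50):
    """Minimize f via x_{k+1} = x_k - f'(x_k) / f''(x_k),
    given the first and second derivatives df and d2f."""
    x = x0
    for _ in range(max_iter):
        step = df(x) / d2f(x)
        x -= step
        if abs(step) < tol:
            break
    return x
```

On a quadratic, where the second-order Taylor model is exact, this converges in a single step.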
Multi-Dimensional Optimization
• Important in many areas
– Fitting a model to measured data
– Finding best design in some parameter space
• Hard in general
– Weird shapes: multiple extrema, saddles,
curved or elongated valleys, etc.
– Can’t bracket
• In general, easier than root-finding
– Can always walk “downhill”
Newton’s Method in
Multiple Dimensions
• Replace 1st derivative with gradient,
2nd derivative with Hessian
∇f(x, y) = ( ∂f/∂x , ∂f/∂y )

H = [ ∂²f/∂x²    ∂²f/∂x∂y ]
    [ ∂²f/∂x∂y   ∂²f/∂y²  ]
Newton’s Method in
Multiple Dimensions
• Replace 1st derivative with gradient,
2nd derivative with Hessian
• So,
• Tends to be extremely fragile unless function
very smooth and starting close to minimum
x_{k+1} = x_k – H⁻¹(x_k) ∇f(x_k)
Important classification of methods
• Use function + gradient + Hessian (Newton)
• Use function + gradient (most descent methods)
• Use function values only (Nelder-Mead, also called
the “simplex” or “amoeba” method)
Steepest Descent Methods
• What if you can’t / don’t want to
use 2nd derivative?
• “Quasi-Newton” methods estimate Hessian
• Alternative: walk along (negative of) gradient…
– Perform 1-D minimization along line passing through
current point in the direction of the gradient
– Once done, re-compute gradient, iterate
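The loop above can be sketched as follows; here a simple backtracking search stands in for the exact 1-D line minimization (an assumption for brevity, and the test function is illustrative):

```python
def steepest_descent(f, grad, x, y, iters=100):
    """Repeatedly step along the negative gradient, choosing the
    step length by crude backtracking (halve until f decreases)."""
    for _ in range(iters):
        gx, gy = grad(x, y)
        t = 1.0
        while t > 1e-12 and f(x - t * gx, y - t * gy) >= f(x, y):
            t *= 0.5                      # shrink step until we go downhill
        x, y = x - t * gx, y - t * gy     # then re-compute gradient, iterate
    return x, y
```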
Problem With Steepest Descent
Conjugate Gradient Methods
• Idea: avoid “undoing”
minimization that’s
already been done
• Walk along direction
d_k = –g_k + β_k d_{k–1}
• Polak and Ribière formula:
β_k = g_kᵀ (g_k – g_{k–1}) / (g_{k–1}ᵀ g_{k–1})
Conjugate Gradient Methods
• Conjugate gradient implicitly obtains
information about Hessian
• For quadratic function in n dimensions, gets
exact solution in n steps (ignoring roundoff error)
• Works well in practice…
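A sketch of nonlinear conjugate gradient with the Polak–Ribière formula, reusing golden-section search for the 1-D line minimization (all names, tolerances, and the max(β, 0) safeguard are illustrative choices):

```python
def cg_minimize(f, grad, x, iters=10):
    """Nonlinear CG; x is a list of coordinates, grad returns a list."""
    alpha = 0.6180339887498949                      # golden fraction

    def line_min(phi):
        # bracket: double the step until phi starts increasing
        t = 1e-3
        while phi(2 * t) < phi(t):
            t *= 2
        a, b = 0.0, 2 * t
        x1, x2 = b - alpha * (b - a), a + alpha * (b - a)
        f1, f2 = phi(x1), phi(x2)
        for _ in range(80):                         # golden-section refinement
            if f1 < f2:
                b, x2, f2 = x2, x1, f1
                x1 = b - alpha * (b - a); f1 = phi(x1)
            else:
                a, x1, f1 = x1, x2, f2
                x2 = a + alpha * (b - a); f2 = phi(x2)
        return 0.5 * (a + b)

    g = grad(x)
    d = [-gi for gi in g]
    for _ in range(iters):
        t = line_min(lambda s: f([xi + s * di for xi, di in zip(x, d)]))
        x = [xi + t * di for xi, di in zip(x, d)]
        g_new = grad(x)
        denom = sum(gi * gi for gi in g)
        if denom < 1e-30:
            break
        # beta_k = g_k^T (g_k - g_{k-1}) / (g_{k-1}^T g_{k-1})
        beta = sum(gn * (gn - go) for gn, go in zip(g_new, g)) / denom
        d = [-gn + max(beta, 0.0) * di for gn, di in zip(g_new, d)]
        g = g_new
    return x
```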
Value-Only Methods in Multi-Dimensions
• If can’t evaluate gradients, life is hard
• Can use approximate (numerically evaluated)
gradients:
∇f(x) ≈ ( [f(x + δe₁) – f(x)] / δ ,
          [f(x + δe₂) – f(x)] / δ ,
          [f(x + δe₃) – f(x)] / δ , … )
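The numerically evaluated gradient can be sketched with forward differences (the step δ and function name are illustrative choices):

```python
def approx_gradient(f, x, h=1e-6):
    """Forward differences: df/dx_i ≈ (f(x + h e_i) - f(x)) / h,
    using n + 1 function evaluations for n dimensions."""
    fx = f(x)
    g = []
    for i in range(len(x)):
        xp = list(x)
        xp[i] += h                 # perturb one coordinate at a time
        g.append((f(xp) - fx) / h)
    return g
```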
Generic Optimization Strategies
• Uniform sampling:
– Cost rises exponentially with # of dimensions
• Simulated annealing:
– Search in random directions
– Start with large steps, gradually decrease
– “Annealing schedule” – how fast to cool?
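A sketch of the random-direction strategy above; note this greedy variant only accepts improvements, whereas full simulated annealing also accepts some uphill moves with a temperature-dependent probability (the cooling rate here is an illustrative schedule):

```python
import math, random

def anneal(f, x, y, step=1.0, cooling=0.997, iters=5000, seed=0):
    """Search in random directions; start with large steps and
    gradually decrease them per the annealing schedule."""
    rng = random.Random(seed)
    fx = f(x, y)
    for _ in range(iters):
        theta = rng.uniform(0, 2 * math.pi)
        nx, ny = x + step * math.cos(theta), y + step * math.sin(theta)
        fn = f(nx, ny)
        if fn < fx:                 # greedy: keep only improvements
            x, y, fx = nx, ny, fn
        step *= cooling             # cool: shrink the step size
    return x, y
```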
Downhill Simplex Method (Nelder-Mead)
• Keep track of n+1 points in n dimensions
– Vertices of a simplex (triangle in 2D,
tetrahedron in 3D, etc.)
• At each iteration: simplex can move,
expand, or contract
– Sometimes known as amoeba method:
simplex “oozes” along the function
Downhill Simplex Method (Nelder-Mead)
• Basic operation: reflection
worst point
(highest function value)
location probed by
reflection step
Downhill Simplex Method (Nelder-Mead)
• If reflection resulted in best (lowest) value so far,
try an expansion
• Else, if reflection helped at all, keep it
location probed by
expansion step
Downhill Simplex Method (Nelder-Mead)
• If reflection didn’t help (reflected point still worst),
try a contraction
location probed by
contraction step
Downhill Simplex Method (Nelder-Mead)
• If all else fails, shrink the simplex around
the best point
Downhill Simplex Method (Nelder-Mead)
• Method fairly efficient at each iteration
(typically 1-2 function evaluations)
• Can take lots of iterations
• Somewhat flaky – sometimes needs restart
after simplex collapses on itself, etc.
• Benefits: simple to implement, doesn’t need
derivative, doesn’t care about function
smoothness, etc.
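The reflect / expand / contract / shrink rules above can be combined into a minimal Nelder–Mead sketch (the coefficients 1, 2, and ½ are the conventional choices; the implementation details are illustrative):

```python
def nelder_mead(f, simplex, iters=500):
    """Minimal Nelder-Mead: simplex is a list of n+1 points,
    each a list of n coordinates."""
    n = len(simplex[0])
    for _ in range(iters):
        simplex.sort(key=f)                      # best first, worst last
        best, worst = simplex[0], simplex[-1]
        # centroid of all points except the worst
        c = [sum(p[i] for p in simplex[:-1]) / n for i in range(n)]
        refl = [2 * c[i] - worst[i] for i in range(n)]
        if f(refl) < f(best):                    # best so far: try expansion
            expa = [c[i] + 2 * (c[i] - worst[i]) for i in range(n)]
            simplex[-1] = expa if f(expa) < f(refl) else refl
        elif f(refl) < f(simplex[-2]):           # helped at all: keep it
            simplex[-1] = refl
        else:                                    # still worst: try contraction
            contr = [c[i] + 0.5 * (worst[i] - c[i]) for i in range(n)]
            if f(contr) < f(worst):
                simplex[-1] = contr
            else:                                # all else fails: shrink
                simplex = [best] + [[0.5 * (b + v) for b, v in zip(best, q)]
                                    for q in simplex[1:]]
    simplex.sort(key=f)
    return simplex[0]
```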
Rosenbrock’s Function
• Designed specifically for testing
optimization techniques
• Curved, narrow valley
f(x, y) = 100 (y – x²)² + (1 – x)²
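For reference, Rosenbrock's function in code, with its known minimum f(1, 1) = 0:

```python
def rosenbrock(x, y):
    """f(x, y) = 100 (y - x^2)^2 + (1 - x)^2; the curved, narrow
    valley follows the parabola y = x^2, with the minimum at (1, 1)."""
    return 100.0 * (y - x * x) ** 2 + (1.0 - x) ** 2
```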
Constrained Optimization
• Equality constraints: optimize f(x)
subject to gi(x)=0
• Method of Lagrange multipliers: convert to a
higher-dimensional problem
• Minimize f(x) + Σᵢ λᵢ gᵢ(x)
w.r.t. (x₁, …, xₙ; λ₁, …, λₖ)
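A tiny worked example of the method (illustrative, not from the slides): minimize f(x, y) = x² + y² subject to g(x, y) = x + y – 1 = 0. Stationarity of L = f + λg gives a linear system:

```python
# dL/dx = 2x + lam = 0
# dL/dy = 2y + lam = 0
# dL/dlam = x + y - 1 = 0
# Solving this 3x3 linear system gives x = y = 1/2, lam = -1.
x, y, lam = 0.5, 0.5, -1.0
residuals = (2 * x + lam, 2 * y + lam, x + y - 1)
```

Note the constrained 2-D problem became an unconstrained stationarity condition in 3 variables, as described above.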
Constrained Optimization
• Inequality constraints are harder…
• If objective function and constraints all linear,
this is “linear programming”
• Observation: minimum must lie at corner of
region formed by constraints
• Simplex method: move from vertex to vertex,
minimizing objective function
Constrained Optimization
• General “nonlinear programming” hard
• Algorithms for special cases (e.g. quadratic)
Global Optimization
• In general, can’t guarantee that you’ve found
global (rather than local) minimum
• Some heuristics:
– Multi-start: try local optimization from
several starting positions
– Very slow simulated annealing
– Use analytical methods (or graphing) to determine
behavior, guide methods to correct neighborhoods