Machine Learning vs Traditional Optimization
Shaoqing Tan <tansq7@gmail.com>
Agenda
● Share my experience with traditional (not data-driven) optimization.
● Provide a slightly different angle to describe what machine learning does.
● Demo with course project.
Machine Learning
● Helps predict the behavior of new samples
● Learns patterns in data via mathematics
● Minimizes the deviation from correct behavior (toy sketch after this list)
● Hyperparameters are not learned
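As a minimal illustration of these bullets (a sketch on invented toy data, not the course project): fit a line by gradient descent on squared error. Note that the learning rate and epoch count are hyperparameters we choose, not learn.

```python
import numpy as np

# Toy data: the "correct behavior" is y = 2x + 1 plus noise.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 100)
y = 2 * x + 1 + 0.1 * rng.normal(size=100)

w, b = 0.0, 0.0        # parameters: learned
lr, epochs = 0.1, 200  # hyperparameters: chosen, not learned

for _ in range(epochs):
    err = w * x + b - y              # deviation from correct behavior
    w -= lr * 2 * np.mean(err * x)   # gradient of mean squared error w.r.t. w
    b -= lr * 2 * np.mean(err)       # ... and w.r.t. b

print(f"learned w={w:.2f}, b={b:.2f}")  # close to 2 and 1
```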
Traditional Optimization
● Helps select parameters in engineering design
● Minimizes a loss function defined by
○ Safety factor
○ Monetary cost
○ Performance
● Often subject to constraints (scipy sketch after this list)
● Loss function is evaluated via
○ Solid / fluid mechanics simulation (FEA)
○ Electrical / electromagnetic simulation
○ Other logic / mathematics
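A hedged sketch of "select parameters subject to constraints" in code. The beam problem and all numbers are invented for illustration; in a real design the loss and the constraint would come from a simulation run, not a closed-form formula.

```python
from scipy.optimize import minimize

# Hypothetical problem: pick a beam's width w and height h (cm) to
# minimize material use, subject to a minimum-stiffness constraint.
def cost(x):
    w, h = x
    return w * h                    # cross-sectional area, a cost proxy

def stiffness_margin(x):
    w, h = x
    return w * h**3 / 12.0 - 100.0  # second moment of area minus requirement

res = minimize(cost, x0=[5.0, 5.0],
               bounds=[(1.0, 20.0), (1.0, 20.0)],
               constraints=[{"type": "ineq", "fun": stiffness_margin}])
print(res.x, res.fun)
```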
Generically Speaking
[Diagram: the generic optimization loop. Initialization → Design experiment → Evaluate loss → Check termination, driven by the definition of loss, the constraints, and domain knowledge. A code sketch follows.]
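One way to read the diagram as runnable code (a sketch: random sampling stands in for the "design experiment" step, and the toy loss and feasibility test are invented):

```python
import random

# The generic loop with the simplest possible choices: random sampling
# designs each experiment; the best point so far is the accumulated knowledge.
def optimize(loss, feasible, bounds, budget=200, seed=0):
    rng = random.Random(seed)                           # initialization
    best_x, best_val = None, float("inf")
    for _ in range(budget):                             # check termination
        x = [rng.uniform(lo, hi) for lo, hi in bounds]  # design experiment
        if not feasible(x):                             # constraint
            continue
        val = loss(x)                                   # evaluate loss
        if val < best_val:
            best_x, best_val = x, val                   # update knowledge
    return best_x, best_val

# Minimize x0^2 + x1^2 subject to x0 + x1 >= 1.
print(optimize(lambda x: x[0]**2 + x[1]**2,
               lambda x: x[0] + x[1] >= 1,
               bounds=[(-2, 2), (-2, 2)]))
```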
Neural Network Training
[Diagram: the same loop specialized to NN training. Definition of loss: NN loss. Knowledge / constraint: training data. Experiment design: gradient descent + back propagation.]
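A minimal "gradient descent + back propagation" instance of the loop: a one-hidden-layer net on invented toy data (a sketch, not the course project's model).

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = X[:, :1] * X[:, 1:2]                 # target: product of the inputs

W1 = rng.normal(scale=0.5, size=(2, 8)); b1 = np.zeros(8)
W2 = rng.normal(scale=0.5, size=(8, 1)); b2 = np.zeros(1)
lr = 0.1                                 # hyperparameter: not learned

for _ in range(2000):
    h = np.tanh(X @ W1 + b1)             # forward pass
    pred = h @ W2 + b2
    d_pred = 2 * (pred - y) / len(X)     # d(MSE loss)/d(pred)
    dW2 = h.T @ d_pred; db2 = d_pred.sum(0)
    d_h = (d_pred @ W2.T) * (1 - h**2)   # back-propagate through tanh
    dW1 = X.T @ d_h;    db1 = d_h.sum(0)
    for p, g in ((W1, dW1), (b1, db1), (W2, dW2), (b2, db2)):
        p -= lr * g                      # gradient descent step

pred = np.tanh(X @ W1 + b1) @ W2 + b2
print("final MSE:", float(np.mean((pred - y) ** 2)))
```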
Mechanical Design
[Diagram: the same loop for mechanical design. Definition of loss: mechanical simulation performance. Constraint: design constraints. Experiment design: black box sampler.]
Neural Network Param Tuning
[Diagram: the same loop for hyperparameter tuning. Definition of loss: cross-validated model performance. Constraint: design constraints. Experiment design: black box sampler.]
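A minimal sketch of this loop with scikit-learn (the dataset, model, and search space are placeholders): the cross-validated score is the loss, the parameter grid is the constraint, and random search plays the black-box sampler.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import RandomizedSearchCV
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=500, random_state=0)

search = RandomizedSearchCV(
    MLPClassifier(max_iter=500, random_state=0),
    param_distributions={                      # constraint: the search space
        "hidden_layer_sizes": [(8,), (16,), (32,), (16, 16)],
        "alpha": [1e-4, 1e-3, 1e-2],
        "learning_rate_init": [1e-3, 1e-2, 1e-1],
    },
    n_iter=10, cv=3, random_state=0,           # 10 experiments, 3-fold CV loss
)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```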
Comparison
● NN training: def of loss = NN loss; constraint = data; experiment design = gradient descent + back propagation
● Mechanical design: def of loss = mechanical simulation performance; constraint = design constraints; experiment design = black box sampler
● NN param tuning: def of loss = cross-validated model performance; constraint = design constraints; experiment design = black box sampler
Black Box Sampling
● Gradient descent
○ Performs very well because it directly knows where it’s going!
○ Needs the derivative of the loss in order to work
● Black box means the derivative is unavailable; such samplers include
○ Grid search: brute force
○ Random search (Monte Carlo): aimless
○ Quasi-gradient descent: susceptible to noise (sketch after this list)
○ Surrogate adaptive sampling: models the known points and samples new points around the surrogate’s minimum
● Each of these samplers carries its own hyperparameters!
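A sketch of quasi-gradient descent on an invented toy loss: central finite differences approximate the unavailable derivative, at the price of 2 × dim loss evaluations per step and sensitivity to evaluation noise.

```python
import numpy as np

# Estimate the gradient of a black-box loss by central finite differences.
def finite_diff_grad(f, x, eps=1e-4):
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x); e[i] = eps
        g[i] = (f(x + e) - f(x - e)) / (2 * eps)
    return g

def quasi_gd(f, x0, lr=0.1, steps=100):
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        x -= lr * finite_diff_grad(f, x)   # descend the estimated gradient
    return x

# Black-box loss: pretend we cannot differentiate it analytically.
f = lambda x: (x[0] - 3) ** 2 + (x[1] + 1) ** 2
print(quasi_gd(f, [0.0, 0.0]))  # converges near [3, -1]
```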
Surrogate Adaptive Sampling
● Surrogate models include (minimal sketch below)
○ Bayesian
○ Radial basis function
○ Gaussian process
○ Spline
○ Ensemble of the above
● Python libraries
○ https://github.com/HIPS/Spearmint
○ https://github.com/dme65/pySOT
● Google “global / black box optimization”
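A minimal surrogate-adaptive-sampling sketch using scipy's RBF interpolator on an invented 1-D loss. This only shows the fit-then-sample-near-the-minimum idea; real libraries such as Spearmint and pySOT add principled acquisition rules, exploration, and noise handling.

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

# Pretend this is an expensive simulation we cannot differentiate.
def expensive_loss(x):
    return np.sin(3 * x) + 0.5 * (x - 1) ** 2

rng = np.random.default_rng(0)
X = rng.uniform(-2, 3, size=(5, 1))      # initial space-filling samples
Y = expensive_loss(X[:, 0])

grid = np.linspace(-2, 3, 500)[:, None]
for _ in range(15):
    surrogate = RBFInterpolator(X, Y)    # model the known points
    x_new = grid[np.argmin(surrogate(grid))]             # surrogate minimum
    x_new = np.clip(x_new + rng.normal(0, 0.02), -2, 3)  # jitter: keeps
    X = np.vstack([X, [x_new]])          # points distinct, adds exploration
    Y = np.append(Y, expensive_loss(x_new[0]))

print("best x:", X[np.argmin(Y), 0], "best loss:", Y.min())
```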
Demo & Questions
