All machine learning and artificial intelligence pipelines - from reinforcement learning agents to deep neural nets - have tunable hyperparameters. Optimizing these hyperparameters can take a model from scrappy prototype to production-ready system. This presentation, given by an engineer who builds advanced and widely used optimization tools, shows techniques for hyperparameter optimization.
Tips and techniques for hyperparameter optimization
1. #GHC17
Tips and Techniques for Hyperparameter Optimization
Alexandra Johnson | @alexandraj777 | alexandra@sigopt.com
2. PAGE | GRACE HOPPER CELEBRATION FOR WOMEN IN COMPUTING 2017
PRESENTED BY THE ANITA BORG INSTITUTE AND THE ASSOCIATION FOR COMPUTING MACHINERY #GHC17
Example: Beating Vegas
Scott Clark. Using Model Tuning to Beat Vegas.
3. Red Box = Hyperparameter
TensorFlow Playground
4. Terminology
Optimization = tuning
Model tuning = hyperparameter optimization
Model selection is related
5. Tune the Whole Pipeline
6. Include Feature Parameters
7. Include Feature Parameters
8. Choosing a Metric
Balance long-term and short-term goals
Question underlying assumptions
Example from Microsoft
9. Composite Metric
Example: Lifetime Value
clicks*w_clicks + likes*w_likes + views*w_views
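The weighted sum above can be sketched in a few lines of Python. This is only an illustration: the function name `composite_metric` and every weight and count below are hypothetical values chosen for the example, not figures from the talk.

```python
def composite_metric(counts, weights):
    """Weighted sum of engagement counts: sum of counts[name] * weights[name]."""
    missing = set(counts) - set(weights)
    if missing:
        raise KeyError(f"no weight for: {sorted(missing)}")
    return sum(counts[name] * weights[name] for name in counts)


# Hypothetical weights expressing how much each event is assumed to be worth.
weights = {"clicks": 0.5, "likes": 2.0, "views": 0.01}
# Hypothetical event counts for one model configuration.
counts = {"clicks": 120, "likes": 30, "views": 5000}

score = composite_metric(counts, weights)
print(score)
```

Collapsing several signals into one number lets a single-objective optimizer be used, at the cost of having to choose the weights up front.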
10. Choose Multiple Metrics
Balance competing metrics
Explore “efficient frontier”
Image from PhD Comics
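One way to read "efficient frontier" is as the set of configurations that no other configuration beats on every metric at once. The sketch below assumes two metrics that are both maximized; the helper name `pareto_frontier` and the (accuracy, throughput) pairs are made up for illustration.

```python
def pareto_frontier(points):
    """Return the points not dominated by any other point (both metrics maximized)."""
    def dominates(a, b):
        # a dominates b if it is at least as good on both metrics and not identical.
        return a[0] >= b[0] and a[1] >= b[1] and a != b

    return sorted(p for p in points if not any(dominates(q, p) for q in points))


# Hypothetical (accuracy, throughput) results for five model configurations.
candidates = [(0.90, 100), (0.92, 80), (0.88, 90), (0.95, 40), (0.91, 80)]
frontier = pareto_frontier(candidates)
print(frontier)
```

Every point on the frontier is a defensible trade-off; picking among them is a product decision rather than an optimization problem.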
11. Avoiding Overfitting
12. Optimization Loop
13. Get A Suggestion
14. Split Data into k Subsamples
15. Repeat for each subsample
Train
Evaluate
16. Report An Observation
17. Repeat for New Hyperparameters
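The loop on the preceding slides (get a suggestion, split the data into k subsamples, train and evaluate on each, report an observation, repeat) might look roughly like the sketch below. Everything concrete here is an assumption made for the example: the toy 1-D ridge model, the synthetic data, the regularization range, and the use of random sampling as a stand-in for whatever service or library actually produces suggestions.

```python
import random


def k_fold_indices(n, k):
    """Split range(n) into k disjoint, nearly equal folds."""
    idx = list(range(n))
    return [idx[i::k] for i in range(k)]


def fit_ridge(xs, ys, lam):
    """1-D ridge regression through the origin: w = sum(x*y) / (sum(x*x) + lam)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)


def cross_validated_mse(xs, ys, lam, k=5):
    """Train on k-1 folds, evaluate on the held-out fold, and average over folds."""
    scores = []
    for held_out in k_fold_indices(len(xs), k):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        w = fit_ridge([xs[i] for i in train], [ys[i] for i in train], lam)
        mse = sum((ys[i] - w * xs[i]) ** 2 for i in held_out) / len(held_out)
        scores.append(mse)
    return sum(scores) / len(scores)


# Synthetic data (an assumption for this sketch): y = 3x plus noise.
rng = random.Random(0)
xs = [rng.uniform(-1, 1) for _ in range(50)]
ys = [3.0 * x + rng.gauss(0, 0.1) for x in xs]

observations = []
for _ in range(20):
    lam = 10 ** rng.uniform(-3, 2)            # get a suggestion (random stand-in)
    score = cross_validated_mse(xs, ys, lam)  # split, train, evaluate
    observations.append((lam, score))         # report an observation

best_lam, best_score = min(observations, key=lambda o: o[1])
print(best_lam, best_score)
```

Reporting the cross-validated score rather than a single train/test score is what guards the tuning process against overfitting to one particular split.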
18. Optimization Methods
19. Hand Tuning
Hand tuning is time consuming and expensive
Algorithms can quickly and cheaply beat expert tuning
20. Alternatives to Hand Tuning
Grid Search
Random Search
Bayesian Optimization
21. Alternatives to Hand Tuning
Genetic algorithms
Particle-based methods
Convex optimizers
Simulated annealing
To name a few...
22. No Grid Search
Hyperparameters    Model Evaluations
2                  100
3                  1,000
4                  10,000
5                  100,000
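The counts in the table are consistent with trying 10 values per hyperparameter, so the grid grows as 10^n. A small sketch reproduces the numbers; the function name `grid_size` and the default of 10 values are assumptions inferred from the table, not stated on the slide.

```python
from itertools import product


def grid_size(num_hyperparameters, values_per_hyperparameter=10):
    """Number of model evaluations a full grid search needs."""
    return values_per_hyperparameter ** num_hyperparameters


for n in (2, 3, 4, 5):
    print(n, grid_size(n))

# The same count, obtained by enumerating an explicit 2-hyperparameter grid.
grid = list(product(range(10), repeat=2))
print(len(grid))
```

Each additional hyperparameter multiplies the cost by the number of values tried per axis, which is why grid search stops being practical after a handful of hyperparameters.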
23. No Random Search
Theoretically more effective than grid search
Large variance in results
No intelligence
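The "large variance" point can be illustrated by running the same small random search with different seeds and comparing the best value each run finds. The toy objective, the budget of 10 evaluations, and the seeds are all hypothetical choices for this sketch.

```python
import random


def objective(x, y):
    """Toy objective to maximize; its peak value of 1.0 is at (0.3, 0.7)."""
    return 1.0 - (x - 0.3) ** 2 - (y - 0.7) ** 2


def random_search(budget, seed):
    """Best objective value found by sampling `budget` uniform points in [0, 1)^2."""
    rng = random.Random(seed)
    return max(objective(rng.random(), rng.random()) for _ in range(budget))


# Same budget, different seeds: the best value found differs from run to run.
results = [random_search(budget=10, seed=s) for s in range(5)]
spread = max(results) - min(results)
print(results, spread)
```

Each sample is drawn without looking at earlier results, which is the sense in which the method has no intelligence: a good region found early does not attract later samples.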
24. Bayesian Optimization
Explore/exploit
Ideal for "expensive" optimization
No requirements on: convexity, differentiability, continuity
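To make the explore/exploit idea concrete, here is a minimal, standard-library-only sketch of one common form of Bayesian optimization: a Gaussian process surrogate with an expected-improvement acquisition function, maximizing over a 1-D grid. It is not the method used by any particular tool; the RBF kernel, its length scale, the noise term, the candidate grid, and the toy objective are all assumptions made for the example.

```python
import math


def rbf(a, b, length_scale=0.2):
    """Squared-exponential kernel with unit variance."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def gp_posterior(xs, ys, x_new, noise=1e-6):
    """Posterior mean and standard deviation of the GP surrogate at x_new."""
    offset = sum(ys) / len(ys)  # center targets so a zero-mean prior is reasonable
    centered = [y - offset for y in ys]
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    k_star = [rbf(a, x_new) for a in xs]
    mean = offset + sum(k * w for k, w in zip(k_star, solve(K, centered)))
    var = 1.0 - sum(k * v for k, v in zip(k_star, solve(K, k_star)))
    return mean, math.sqrt(max(var, 1e-12))


def expected_improvement(mean, std, best):
    """Expected amount by which a point beats `best` (maximization)."""
    z = (mean - best) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)
    cdf = 0.5 * (1 + math.erf(z / math.sqrt(2)))
    return (mean - best) * cdf + std * pdf


def objective(x):
    """Stand-in for an expensive model evaluation (hypothetical)."""
    return -(x - 0.6) ** 2


candidates = [i / 100 for i in range(101)]
xs = [0.0, 1.0]                       # two initial evaluations
ys = [objective(x) for x in xs]
for _ in range(8):                    # small budget: each evaluation is "expensive"
    best = max(ys)
    x_next = max((c for c in candidates if c not in xs),
                 key=lambda c: expected_improvement(*gp_posterior(xs, ys, c), best))
    xs.append(x_next)
    ys.append(objective(x_next))

best_x = xs[ys.index(max(ys))]
print(best_x, max(ys))
```

Expected improvement is large either where the surrogate predicts a high value (exploit) or where its uncertainty is high (explore). The objective is only ever called as a black box, which is why no convexity, differentiability, or continuity is required of it.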
25. Takeaways
Optimize the entire pipeline
Ensure generalization
Use Bayesian optimization
26. Thank you!
alexandra@sigopt.com | @alexandraj777