5. Methods
● Grid search
● Random search
● Guided search
● Grad student search (still best)
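The simplest baseline above, random search, fits in a few lines of plain Python. This is an illustrative sketch only: the objective, the parameter names (`lr`, `momentum`), and the search space are made up.

```python
import random

def random_search(objective, space, n_trials, seed=0):
    """Sample hyperparameters uniformly at random; keep the best (lowest) score."""
    rng = random.Random(seed)
    best_params, best_score = None, float("inf")
    for _ in range(n_trials):
        params = {name: rng.uniform(lo, hi) for name, (lo, hi) in space.items()}
        score = objective(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Toy objective with its minimum at lr=0.1, momentum=0.9 (hypothetical names).
objective = lambda p: (p["lr"] - 0.1) ** 2 + (p["momentum"] - 0.9) ** 2
space = {"lr": (0.0, 1.0), "momentum": (0.0, 1.0)}
best, score = random_search(objective, space, n_trials=200)
```

Grid search would replace the `rng.uniform` sampling with a fixed Cartesian product of values; guided search is what the rest of the talk covers.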
6. Methods
● Better configuration proposal
○ Objective function is estimated with surrogate models
○ Evolutionary methods
○ ...
● Faster objective function calculation
○ Bandit methods
○ Pruning
○ Estimating a score from the learning curve of the NN
○ ...
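One of the speedup ideas above, pruning, can be illustrated with a simple median rule: stop a run whose intermediate loss is worse than the median of what earlier runs reported at the same step. The helper and history below are invented for illustration; real libraries track more state than this.

```python
import statistics

def should_prune(step, value, history):
    """Prune if `value` (a loss) at `step` is worse than the median of the
    intermediate values that earlier trials reported at the same step."""
    earlier = [h[step] for h in history if step in h]
    if len(earlier) < 2:
        return False  # not enough evidence yet
    return value > statistics.median(earlier)

# Three finished trials that reported losses at steps 0..2.
history = [
    {0: 1.0, 1: 0.6, 2: 0.4},
    {0: 1.2, 1: 0.9, 2: 0.8},
    {0: 0.9, 1: 0.5, 2: 0.3},
]
should_prune(1, 1.4, history)  # loss 1.4 at step 1 is above the median 0.6
```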
10. Methods: bandit methods
● Budget options:
○ Dataset size
○ Number of epochs
○ Time
○ Number of features
○ Number of CV-folds
11. Methods: bandit methods
● Successive halving: fixed resource type, fixed budget, fixed number of runs
● Hyperband: random resource allocation, grid search over the number of runs, within a fixed budget
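Successive halving as described above reduces to a short loop: evaluate everything cheaply, keep the top fraction, multiply the budget, repeat. A toy sketch (the `evaluate` function is made up; in practice it would train a model for `budget` steps):

```python
def successive_halving(configs, evaluate, min_budget=1, eta=2):
    """Evaluate all configs on a small budget, keep the best 1/eta,
    multiply the budget by eta, and repeat until one config remains."""
    budget = min_budget
    while len(configs) > 1:
        scored = sorted(configs, key=lambda c: evaluate(c, budget))
        configs = scored[: max(1, len(scored) // eta)]
        budget *= eta
    return configs[0]

# Toy scoring: lower is better, with the optimum at config value 0.3;
# the 1/budget term stands in for "less training = noisier estimate".
evaluate = lambda c, budget: (c - 0.3) ** 2 + 1.0 / budget
winner = successive_halving([0.1 * i for i in range(8)], evaluate)
```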
16. Algorithm
● Objective function estimated with surrogate models
○ Random Forests
○ Gradient Boosted Trees
○ Gaussian process
● Next run params selected via acquisition function
○ Expected Improvement
○ Probability of Improvement
○ Lower Confidence Bound
● No objective func calculation speedup mechanism
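The acquisition step above can be made concrete. Below is an illustrative numpy implementation of Expected Improvement for a minimization problem; the candidate means and standard deviations are invented, standing in for a surrogate model's predictions.

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, best, xi=0.01):
    """EI for minimization: expected amount by which a candidate improves
    on the best observed value, under the surrogate's Gaussian prediction."""
    sigma = np.maximum(sigma, 1e-12)  # avoid division by zero
    z = (best - mu - xi) / sigma
    return (best - mu - xi) * norm.cdf(z) + sigma * norm.pdf(z)

# Surrogate predictions at three candidate points; best observed value = 1.0.
mu = np.array([0.8, 1.0, 1.2])
sigma = np.array([0.1, 0.5, 0.1])
ei = expected_improvement(mu, sigma, best=1.0)
next_point = int(np.argmax(ei))  # candidate with the highest EI is run next
```

Note how the uncertain middle candidate can out-score the one with the better mean: EI trades off exploitation (low `mu`) against exploration (high `sigma`).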
25. API: {fun}_minimize
● There are (hyper)hyperparameters
● Acquisition function:
○ ‘EI’, ‘PI’: Expected Improvement and Probability of Improvement (maximized)
○ ‘LCB’: expected value of the objective minus kappa times the standard deviation of the GP
● Exploration vs exploitation
○ xi for ‘EI’ and ‘PI’: low xi means exploitation, high xi means exploration
○ kappa for ‘LCB’: low kappa means exploitation, high kappa means exploration
32. Speed & Parallelization
● Runs sequentially; you cannot distribute it across many machines
● You can parallelize the base estimator at every run with n_jobs
● If you have just 1 machine, it is fast
34. Conclusions: good
● Easy to use API and great documentation
● A lot of optimizers and tweaking options
● Awesome visualizations
● Solid gains over random search
● Fast if you are running sequentially on 1 machine
● Active project support
37. Algorithm
● Objective function estimated with a Tree-structured Parzen Estimator (TPE)
● Next run params selected via Expected Improvement
● Objective func calculation speedup via run pruning and successive halving (optional)
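The TPE idea can be sketched in one dimension: split past trials into a "good" and a "bad" group by score, fit a Parzen (kernel density) estimator to each, and propose the candidate that maximizes the density ratio l(x)/g(x). This is an illustrative toy, not Optuna's implementation; the trial data is invented.

```python
import numpy as np
from scipy.stats import gaussian_kde

def tpe_suggest(trials, n_candidates=100, gamma=0.25, seed=0):
    """One illustrative 1-D TPE step over past (x, score) trials.
    Maximizing l(x)/g(x) is proportional to Expected Improvement."""
    trials = sorted(trials, key=lambda t: t[1])     # low score = good
    cut = max(2, int(gamma * len(trials)))
    good = gaussian_kde([x for x, _ in trials[:cut]])   # l(x)
    bad = gaussian_kde([x for x, _ in trials[cut:]])    # g(x)
    candidates = good.resample(n_candidates, seed=seed)[0]
    return float(candidates[np.argmax(good(candidates) / bad(candidates))])

# Past trials of a toy objective with its optimum at x = 0.3.
trials = [(x, (x - 0.3) ** 2) for x in (-1.0, -0.5, 0.0, 0.2, 0.3, 0.5, 0.8, 1.2)]
suggestion = tpe_suggest(trials)  # lands near the good region around 0.3
```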
45. API: {fun}_minimize: callbacks

def report_neptune(study, trial):
    neptune.send_metric('value', trial.value)
    neptune.send_metric('best_value', study.best_value)

study = optuna.create_study()
study.optimize(objective, n_trials=100, callbacks=[report_neptune])

*Available in the bleeding-edge version installed from source
52. Conclusions: good
● Easy to use API
● Great documentation
● Can be easily distributed over a cluster of machines
● Has pruning
● Has callbacks
● Search space supports nesting
● Active project support
53. Conclusions: bad
● Only TPE optimizer available
● Only some visualizations
● *No gains over random search (with a 100-iteration budget)
54. Optuna is hyperopt with:
● better API
● waaaay better documentation
● pruning (and halving available)
● exception handling
● simpler parallelization
● active project support
57. HpBandSter
● HyperBand on steroids
● It has state-of-the-art algorithms
○ Hyperband
○ BOHB (Bayesian Optimization + Hyperband)
● Distributed-computing-first API
59. Algorithm
● Objective function estimated with TPE
● Next run params selected via Expected Improvement
● Objective func calculation speedup via bandit methods with random budgets (Hyperband)
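The Hyperband part boils down to a budget schedule: several successive-halving brackets, each starting with a different trade-off between the number of configurations and the budget per configuration. A pure-Python illustration of the standard schedule (not HpBandSter's implementation):

```python
import math

def hyperband_brackets(max_budget, eta=3):
    """List Hyperband's brackets as (s, n, r): bracket index, number of
    starting configs, and starting budget per config. Each bracket then
    runs successive halving with elimination rate eta."""
    s_max = 0
    while eta ** (s_max + 1) <= max_budget:   # s_max = floor(log_eta(max_budget))
        s_max += 1
    brackets = []
    for s in range(s_max, -1, -1):
        n = math.ceil((s_max + 1) / (s + 1) * eta ** s)  # many configs ...
        r = max_budget / eta ** s                        # ... on a small budget
        brackets.append((s, n, r))
    return brackets

brackets = hyperband_brackets(max_budget=81, eta=3)
# The first bracket tries many configs cheaply; the last runs a few at full budget.
```

Running every bracket hedges against picking the wrong budget up front, which is what a single successive-halving run would have to commit to.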
62. API: server
● Workers communicate with server to:
○ get next parameter configuration
○ send results
● You have to define it even for the most basic setups/problems (weird)
83. Conclusions: good
● State-of-the-art algorithm
● Can be distributed over a cluster of machines
● Useful visualizations
● Search space supports nesting
86. Results (mostly subjective)

                       Scikit-Optimize          Optuna          HpBandSter         Hyperopt
API/ease of use        Great                    Great           Difficult          Good
Documentation          Great                    Great           Ok(ish)            Bad
Speed/Parallelization  Fast if sequential/None  Great           Good               Ok
Visualizations         Amazing                  Basic           Very lib specific  Some
*Experimental results  0.8566 (100)             0.8419 (100)    0.8629 (100)       0.8420 (100)
                                                0.8597 (10000)
88. Dream library
Conversions between results objects are in neptune-contrib:

import neptunecontrib.hpo.utils as hpo_utils
results = hpo_utils.optuna2skopt(study)
89. Recommendations
● If you don’t have a lot of resources, use Scikit-Optimize
● If you want SOTA results and don’t care about API/docs, use HpBandSter
● If you want good docs, API, and parallelization, use Optuna
90. Materials
● Slides: linked on Twitter @NeptuneML or LinkedIn @neptune.ml
● Blog posts on Medium @jakub.czakon
● Experiments in Neptune, tags skopt/optuna/hpbandster
○ Code
○ Best hyperparameters and (hyper)hyperparameters
○ Learning curves
○ Diagnostic charts
○ Resource consumption charts
○ Pickled results objects
91. Data science work sharing hub.
Track | Organize | Collaborate
kuba@neptune.ml
@NeptuneML
https://medium.com/neptune-ml
Jakub Czakon