SlideShare a Scribd company logo
All content Copyright © 2017 QuantumBlack Visual Analytics Ltd.
Bayesian optimization
PyData London 2017
6th of May, 2017
Thomas Huijskens
All content Copyright © 2017 QuantumBlack Visual Analytics Ltd. 2
The modelling workflow of a data scientist roughly follows three steps
Feature engineering Feature selection Model selection
All content Copyright © 2017 QuantumBlack Visual Analytics Ltd. 3
In this talk, we will focus on the last step of this workflow
Feature engineering Model selection
Focus of this talk
Feature selection
All content Copyright © 2017 QuantumBlack Visual Analytics Ltd. 4
Most models have a number of hyperparameters that need to be chosen a
priori..
• Typically, we want to optimize some performance metric 𝑓ℳ for a model ℳ, on a hold-out
set of data.
• The model ℳ has some hyperparameters 𝜃 that we need to specify, and the performance
of ℳ is highly dependent on the settings of 𝜃.
• Example: For a classification problem, ℳ could be a support vector machine, and the
performance metric 𝑓ℳ could be the out-of-sample AUC.
• Because the performance of the SVM depends on hyperparameters 𝜃, the metric 𝑓ℳ
depends on 𝜃 as well.
All content Copyright © 2017 QuantumBlack Visual Analytics Ltd. 5
𝑓ℳ 𝜃 𝐷)
Learner ℳ
Performance
function 𝑓
𝑓 depends on the hyperparameters 𝜃 of
the model ℳ
.. and our goal is to find the values of 𝜃 for which the value of the (out-of-
sample) performance 𝑓ℳ is optimal
𝑓 is conditional on the
data 𝐷
All content Copyright © 2017 QuantumBlack Visual Analytics Ltd. 6
Manual grid search works for low-dimensional problems, but does not scale
well to higher dimensions
Curse of dimensionality
makes this inefficient in
higher dimensional
problems
All content Copyright © 2017 QuantumBlack Visual Analytics Ltd. 7
Random grid search works better in higher dimensions, but..
All content Copyright © 2017 QuantumBlack Visual Analytics Ltd. 8
.. shouldn't previously evaluated hyperparameter values guide us in the
search for the true optimal value?
Low values of L here,
should probably not
focus in this area
High values of L here,
should probably focus
in this area
All content Copyright © 2017 QuantumBlack Visual Analytics Ltd. 9
Bayesian optimization
All content Copyright © 2017 QuantumBlack Visual Analytics Ltd. 10
Bayesian optimization is only beneficial in certain situations
In Bayesian optimization, we are building a probabilistic model for the performance metric
𝑓ℳ(𝜃). This introduces computational overhead.
As a result, applying Bayesian optimization only really makes sense if
• The number of hyperparameters is very high (𝜃 is of high dimension); or
• It is computationally very expensive to evaluate 𝑓ℳ(𝜃) for a single point 𝜃.
All content Copyright © 2017 QuantumBlack Visual Analytics Ltd. 11
Evaluate performance
of learner f with
parameters 𝜃
Update current belief
of loss surface of the
learner f
Choose 𝜃 that
maximises some
utility over the
current belief
The classic Bayesian optimization algorithm consists of three steps
𝑓 | 𝑦 𝑛𝑒𝑤
𝑦 𝑛𝑒𝑤 = 𝑓 𝜃 𝑛𝑒𝑤
𝜃 𝑛𝑒𝑤
All content Copyright © 2017 QuantumBlack Visual Analytics Ltd. 12
Evaluate performance
of learner f with
parameters 𝜃
Update current belief
of loss surface of the
learner f
Choose 𝜃 that
maximises some
utility over the
current belief
The classic Bayesian optimization algorithm consists of three steps
𝑓 | 𝑦 𝑛𝑒𝑤
𝑦 𝑛𝑒𝑤 = 𝑓 𝜃 𝑛𝑒𝑤
𝜃 𝑛𝑒𝑤
All content Copyright © 2017 QuantumBlack Visual Analytics Ltd. 13
The performance function 𝑓ℳ is modelled using Gaussian Processes (GPs)
• GPs are the generalization of a Gaussian distribution to a distribution over functions,
instead of random variables
• Just as a Gaussian distribution is completely specified by its mean and variance, a GP is
completely specified by its mean function and covariance function.
• We can think of a GP as a function that, instead of returning a scalar 𝑓(𝒙), returns the
mean and variance of a normal distribution over the possible values of 𝑓 at 𝒙.
All content Copyright © 2017 QuantumBlack Visual Analytics Ltd. 14
GPs are the generalization of a Gaussian distribution to a distribution over
functions, instead of random variables1
1A Tutorial on Bayesian Optimization of Expensive Cost Functions, with Application to Active User Modeling and Hierarchical Reinforcement Learning, Eric Brochu, Vlad M. Cora, Nando de Freitas, 2010. https://arxiv.org/abs/1012.2599
All content Copyright © 2017 QuantumBlack Visual Analytics Ltd. 15
Evaluate performance
of learner f with
parameters 𝜃
Update current belief
of loss surface of the
learner f
Choose 𝜃 that
maximises some
utility over the
current belief
The classic Bayesian optimization algorithm consists of three steps
𝑓 | 𝑦 𝑛𝑒𝑤
𝑦 𝑛𝑒𝑤 = 𝑓 𝜃 𝑛𝑒𝑤
𝜃 𝑛𝑒𝑤
All content Copyright © 2017 QuantumBlack Visual Analytics Ltd. 16
How do we use our current belief to make a good guess about what point to
evaluate next?
All content Copyright © 2017 QuantumBlack Visual Analytics Ltd. 17
Acquisition functions are used to formalize what constitutes a ”best guess”
𝐸𝐼 𝜃 = 𝔼[max
𝜃
0, 𝑓ℳ 𝜃 − 𝑓ℳ 𝜃 ] ,
𝜃 𝑛𝑒𝑤 = argmax
𝜃
𝐸𝐼(𝜃)
All content Copyright © 2017 QuantumBlack Visual Analytics Ltd. 18
The expected improvement acquisition function can be evaluated analytically
𝐸𝐼 𝜃 =
𝜇 𝜃 − 𝑓 𝜃 𝛷 𝑍 + 𝜎(𝜃)𝜙(𝑍), 𝜎 𝜃 > 0
0, 𝜎 𝜃 = 0
,
𝑍 =
𝜇 𝜃 − 𝑓( 𝜃)
𝜎(𝜃)
All content Copyright © 2017 QuantumBlack Visual Analytics Ltd. 19
The expected improvement acquisition trades off exploitation of known
optimal areas, versus exploration of unexplored areas of the loss surface
Exploitation of
known optimal
space
Exploration of
unknown space
Exploration of
unknown space
All content Copyright © 2017 QuantumBlack Visual Analytics Ltd. 20
Evaluate performance
of learner f with
parameters 𝜃
Update current belief
of loss surface of the
learner f
Choose 𝜃 that
maximises some
utility over the
current belief
The classic Bayesian optimization algorithm consists of three steps
𝑓 | 𝑦 𝑛𝑒𝑤
𝑦 𝑛𝑒𝑤 = 𝑓 𝜃 𝑛𝑒𝑤
𝜃 𝑛𝑒𝑤
All content Copyright © 2017 QuantumBlack Visual Analytics Ltd. 21
How do we use this in real-life?
All content Copyright © 2017 QuantumBlack Visual Analytics Ltd. 22
Example – Bayesian optimization can be used to tune the hyperparameters of
an SVM model
All content Copyright © 2017 QuantumBlack Visual Analytics Ltd. 23
Example – A pseudocode implementation of Bayesian optimization shows its
simple API
All content Copyright © 2017 QuantumBlack Visual Analytics Ltd. 24
Example – Bayesian optimization can be used to tune the hyperparameters of
an SVM model
All content Copyright © 2017 QuantumBlack Visual Analytics Ltd. 25
Example – Bayesian optimization can be used to tune the hyperparameters of
an SVM model
All content Copyright © 2017 QuantumBlack Visual Analytics Ltd. 26
GPs can not be used as a simple black-box, and some care must be taken
1. Choose an appropriate scale for your hyperparameters: For parameters like a learning
rate, or regularization term, it makes more sense to sample on the log-uniform domain,
instead of the uniform domain.
2. Choose the kernel of the GP carefully: Each kernel implicitly assumes different properties
on the loss function, in terms of differentiability and periodicity.
All content Copyright © 2017 QuantumBlack Visual Analytics Ltd. 27
Multiple production-ready, and open-source, modules for Bayesian
optimization are available online
1. Spearmint
2. Hyperopt
3. MOE
4. Hyperband (model-free)
5. SMAC (model-free)

More Related Content

What's hot

Common Problems in Hyperparameter Optimization
Common Problems in Hyperparameter OptimizationCommon Problems in Hyperparameter Optimization
Common Problems in Hyperparameter Optimization
SigOpt
 
AWS Forcecast: DeepAR Predictor Time-series
AWS Forcecast: DeepAR Predictor Time-series AWS Forcecast: DeepAR Predictor Time-series
AWS Forcecast: DeepAR Predictor Time-series
PolarSeven Pty Ltd
 
Machine Learning Fundamentals
Machine Learning FundamentalsMachine Learning Fundamentals
Machine Learning Fundamentals
SigOpt
 
MLConf 2016 SigOpt Talk by Scott Clark
MLConf 2016 SigOpt Talk by Scott ClarkMLConf 2016 SigOpt Talk by Scott Clark
MLConf 2016 SigOpt Talk by Scott Clark
SigOpt
 
Advanced Optimization for the Enterprise Webinar
Advanced Optimization for the Enterprise WebinarAdvanced Optimization for the Enterprise Webinar
Advanced Optimization for the Enterprise Webinar
SigOpt
 
Ad Click Prediction - Paper review
Ad Click Prediction - Paper reviewAd Click Prediction - Paper review
Ad Click Prediction - Paper review
Mazen Aly
 
Plotcon 2016 Visualization Talk by Alexandra Johnson
Plotcon 2016 Visualization Talk  by Alexandra JohnsonPlotcon 2016 Visualization Talk  by Alexandra Johnson
Plotcon 2016 Visualization Talk by Alexandra Johnson
SigOpt
 
Python tutorial for ML
Python tutorial for MLPython tutorial for ML
Python tutorial for ML
Bin Han
 
Modeling at scale in systematic trading
Modeling at scale in systematic tradingModeling at scale in systematic trading
Modeling at scale in systematic trading
SigOpt
 
Musings of kaggler
Musings of kagglerMusings of kaggler
Musings of kaggler
Kai Xin Thia
 
Alpine Tech Talk: System ML by Berthold Reinwald
Alpine Tech Talk: System ML by Berthold ReinwaldAlpine Tech Talk: System ML by Berthold Reinwald
Alpine Tech Talk: System ML by Berthold Reinwald
Chester Chen
 
Iaetsd protecting privacy preserving for cost effective adaptive actions
Iaetsd protecting  privacy preserving for cost effective adaptive actionsIaetsd protecting  privacy preserving for cost effective adaptive actions
Iaetsd protecting privacy preserving for cost effective adaptive actionsIaetsd Iaetsd
 
SigOpt at O'Reilly - Best Practices for Scaling Modeling Platforms
SigOpt at O'Reilly - Best Practices for Scaling Modeling PlatformsSigOpt at O'Reilly - Best Practices for Scaling Modeling Platforms
SigOpt at O'Reilly - Best Practices for Scaling Modeling Platforms
SigOpt
 
Using SigOpt to Tune Deep Learning Models with Nervana Cloud
Using SigOpt to Tune Deep Learning Models with Nervana CloudUsing SigOpt to Tune Deep Learning Models with Nervana Cloud
Using SigOpt to Tune Deep Learning Models with Nervana Cloud
SigOpt
 
Presentation: Ad-Click Prediction, A Data-Intensive Problem
Presentation: Ad-Click Prediction, A Data-Intensive ProblemPresentation: Ad-Click Prediction, A Data-Intensive Problem
Presentation: Ad-Click Prediction, A Data-Intensive Problem
Arzam Muzaffar Kotriwala
 
Data Trend Analysis by Assigning Polynomial Function For Given Data Set
Data Trend Analysis by Assigning Polynomial Function For Given Data SetData Trend Analysis by Assigning Polynomial Function For Given Data Set
Data Trend Analysis by Assigning Polynomial Function For Given Data Set
IJCERT
 
Using Bayesian Optimization to Tune Machine Learning Models
Using Bayesian Optimization to Tune Machine Learning ModelsUsing Bayesian Optimization to Tune Machine Learning Models
Using Bayesian Optimization to Tune Machine Learning Models
Scott Clark
 
IRJET- A Comprehensive Study of Artificial Bee Colony (ABC) Algorithms and it...
IRJET- A Comprehensive Study of Artificial Bee Colony (ABC) Algorithms and it...IRJET- A Comprehensive Study of Artificial Bee Colony (ABC) Algorithms and it...
IRJET- A Comprehensive Study of Artificial Bee Colony (ABC) Algorithms and it...
IRJET Journal
 
Lawry-Daniel.doc
Lawry-Daniel.docLawry-Daniel.doc
Lawry-Daniel.docbutest
 
ESTIMATING PROJECT DEVELOPMENT EFFORT USING CLUSTERED REGRESSION APPROACH
ESTIMATING PROJECT DEVELOPMENT EFFORT USING CLUSTERED REGRESSION APPROACHESTIMATING PROJECT DEVELOPMENT EFFORT USING CLUSTERED REGRESSION APPROACH
ESTIMATING PROJECT DEVELOPMENT EFFORT USING CLUSTERED REGRESSION APPROACH
cscpconf
 

What's hot (20)

Common Problems in Hyperparameter Optimization
Common Problems in Hyperparameter OptimizationCommon Problems in Hyperparameter Optimization
Common Problems in Hyperparameter Optimization
 
AWS Forcecast: DeepAR Predictor Time-series
AWS Forcecast: DeepAR Predictor Time-series AWS Forcecast: DeepAR Predictor Time-series
AWS Forcecast: DeepAR Predictor Time-series
 
Machine Learning Fundamentals
Machine Learning FundamentalsMachine Learning Fundamentals
Machine Learning Fundamentals
 
MLConf 2016 SigOpt Talk by Scott Clark
MLConf 2016 SigOpt Talk by Scott ClarkMLConf 2016 SigOpt Talk by Scott Clark
MLConf 2016 SigOpt Talk by Scott Clark
 
Advanced Optimization for the Enterprise Webinar
Advanced Optimization for the Enterprise WebinarAdvanced Optimization for the Enterprise Webinar
Advanced Optimization for the Enterprise Webinar
 
Ad Click Prediction - Paper review
Ad Click Prediction - Paper reviewAd Click Prediction - Paper review
Ad Click Prediction - Paper review
 
Plotcon 2016 Visualization Talk by Alexandra Johnson
Plotcon 2016 Visualization Talk  by Alexandra JohnsonPlotcon 2016 Visualization Talk  by Alexandra Johnson
Plotcon 2016 Visualization Talk by Alexandra Johnson
 
Python tutorial for ML
Python tutorial for MLPython tutorial for ML
Python tutorial for ML
 
Modeling at scale in systematic trading
Modeling at scale in systematic tradingModeling at scale in systematic trading
Modeling at scale in systematic trading
 
Musings of kaggler
Musings of kagglerMusings of kaggler
Musings of kaggler
 
Alpine Tech Talk: System ML by Berthold Reinwald
Alpine Tech Talk: System ML by Berthold ReinwaldAlpine Tech Talk: System ML by Berthold Reinwald
Alpine Tech Talk: System ML by Berthold Reinwald
 
Iaetsd protecting privacy preserving for cost effective adaptive actions
Iaetsd protecting  privacy preserving for cost effective adaptive actionsIaetsd protecting  privacy preserving for cost effective adaptive actions
Iaetsd protecting privacy preserving for cost effective adaptive actions
 
SigOpt at O'Reilly - Best Practices for Scaling Modeling Platforms
SigOpt at O'Reilly - Best Practices for Scaling Modeling PlatformsSigOpt at O'Reilly - Best Practices for Scaling Modeling Platforms
SigOpt at O'Reilly - Best Practices for Scaling Modeling Platforms
 
Using SigOpt to Tune Deep Learning Models with Nervana Cloud
Using SigOpt to Tune Deep Learning Models with Nervana CloudUsing SigOpt to Tune Deep Learning Models with Nervana Cloud
Using SigOpt to Tune Deep Learning Models with Nervana Cloud
 
Presentation: Ad-Click Prediction, A Data-Intensive Problem
Presentation: Ad-Click Prediction, A Data-Intensive ProblemPresentation: Ad-Click Prediction, A Data-Intensive Problem
Presentation: Ad-Click Prediction, A Data-Intensive Problem
 
Data Trend Analysis by Assigning Polynomial Function For Given Data Set
Data Trend Analysis by Assigning Polynomial Function For Given Data SetData Trend Analysis by Assigning Polynomial Function For Given Data Set
Data Trend Analysis by Assigning Polynomial Function For Given Data Set
 
Using Bayesian Optimization to Tune Machine Learning Models
Using Bayesian Optimization to Tune Machine Learning ModelsUsing Bayesian Optimization to Tune Machine Learning Models
Using Bayesian Optimization to Tune Machine Learning Models
 
IRJET- A Comprehensive Study of Artificial Bee Colony (ABC) Algorithms and it...
IRJET- A Comprehensive Study of Artificial Bee Colony (ABC) Algorithms and it...IRJET- A Comprehensive Study of Artificial Bee Colony (ABC) Algorithms and it...
IRJET- A Comprehensive Study of Artificial Bee Colony (ABC) Algorithms and it...
 
Lawry-Daniel.doc
Lawry-Daniel.docLawry-Daniel.doc
Lawry-Daniel.doc
 
ESTIMATING PROJECT DEVELOPMENT EFFORT USING CLUSTERED REGRESSION APPROACH
ESTIMATING PROJECT DEVELOPMENT EFFORT USING CLUSTERED REGRESSION APPROACHESTIMATING PROJECT DEVELOPMENT EFFORT USING CLUSTERED REGRESSION APPROACH
ESTIMATING PROJECT DEVELOPMENT EFFORT USING CLUSTERED REGRESSION APPROACH
 

Similar to Pydata presentation

Sagemaker Automatic model tuning
Sagemaker Automatic model tuningSagemaker Automatic model tuning
Sagemaker Automatic model tuning
Soji Adeshina
 
No more grid search! How to build models effectively by Thomas Huijskens
No more grid search! How to build models effectively by Thomas HuijskensNo more grid search! How to build models effectively by Thomas Huijskens
No more grid search! How to build models effectively by Thomas Huijskens
Sri Ambati
 
Automatic Model Tuning Using Amazon SageMaker (AIM412) - AWS re:Invent 2018
Automatic Model Tuning Using Amazon SageMaker (AIM412) - AWS re:Invent 2018Automatic Model Tuning Using Amazon SageMaker (AIM412) - AWS re:Invent 2018
Automatic Model Tuning Using Amazon SageMaker (AIM412) - AWS re:Invent 2018
Amazon Web Services
 
MCL310_Building Deep Learning Applications with Apache MXNet and Gluon
MCL310_Building Deep Learning Applications with Apache MXNet and GluonMCL310_Building Deep Learning Applications with Apache MXNet and Gluon
MCL310_Building Deep Learning Applications with Apache MXNet and Gluon
Amazon Web Services
 
Bayesian Optimization for Balancing Metrics in Recommender Systems
Bayesian Optimization for Balancing Metrics in Recommender SystemsBayesian Optimization for Balancing Metrics in Recommender Systems
Bayesian Optimization for Balancing Metrics in Recommender Systems
Viral Gupta
 
AWS reInvent 2017 recap - Optimizing Costs as You Scale on AWS
AWS reInvent 2017 recap - Optimizing Costs as You Scale on AWSAWS reInvent 2017 recap - Optimizing Costs as You Scale on AWS
AWS reInvent 2017 recap - Optimizing Costs as You Scale on AWS
Amazon Web Services
 
Spark ML with High Dimensional Labels Michael Zargham and Stefan Panayotov
Spark ML with High Dimensional Labels Michael Zargham and Stefan PanayotovSpark ML with High Dimensional Labels Michael Zargham and Stefan Panayotov
Spark ML with High Dimensional Labels Michael Zargham and Stefan Panayotov
Databricks
 
"A Shallow Dive into Training Deep Neural Networks," a Presentation from Deep...
"A Shallow Dive into Training Deep Neural Networks," a Presentation from Deep..."A Shallow Dive into Training Deep Neural Networks," a Presentation from Deep...
"A Shallow Dive into Training Deep Neural Networks," a Presentation from Deep...
Edge AI and Vision Alliance
 
SigOpt for Machine Learning and AI
SigOpt for Machine Learning and AISigOpt for Machine Learning and AI
SigOpt for Machine Learning and AI
SigOpt
 
Metering the Hybrid Cloud - ARC404 - re:Invent 2017
Metering the Hybrid Cloud - ARC404 - re:Invent 2017Metering the Hybrid Cloud - ARC404 - re:Invent 2017
Metering the Hybrid Cloud - ARC404 - re:Invent 2017
Amazon Web Services
 
Production model lifecycle management 2016 09
Production model lifecycle management 2016 09Production model lifecycle management 2016 09
Production model lifecycle management 2016 09
Greg Makowski
 
LinkedIn talk at Netflix ML Platform meetup Sep 2019
LinkedIn talk at Netflix ML Platform meetup Sep 2019LinkedIn talk at Netflix ML Platform meetup Sep 2019
LinkedIn talk at Netflix ML Platform meetup Sep 2019
Faisal Siddiqi
 
Commercial Management and Cost Optimization on AWS - AWS Online Tech Talks
Commercial Management and Cost Optimization on AWS - AWS Online Tech TalksCommercial Management and Cost Optimization on AWS - AWS Online Tech Talks
Commercial Management and Cost Optimization on AWS - AWS Online Tech Talks
Amazon Web Services
 
Tuning for Systematic Trading: Talk 2: Deep Learning
Tuning for Systematic Trading: Talk 2: Deep LearningTuning for Systematic Trading: Talk 2: Deep Learning
Tuning for Systematic Trading: Talk 2: Deep Learning
SigOpt
 
Auto Scaling Prime Time: Target Tracking Hits the Bullseye at Netflix - CMP31...
Auto Scaling Prime Time: Target Tracking Hits the Bullseye at Netflix - CMP31...Auto Scaling Prime Time: Target Tracking Hits the Bullseye at Netflix - CMP31...
Auto Scaling Prime Time: Target Tracking Hits the Bullseye at Netflix - CMP31...
Amazon Web Services
 
Parametric Estimation in a nutshell
Parametric Estimation in a nutshellParametric Estimation in a nutshell
Parametric Estimation in a nutshell
Planisware
 
WKS402 Well-Architected Workshop
WKS402 Well-Architected WorkshopWKS402 Well-Architected Workshop
WKS402 Well-Architected Workshop
Amazon Web Services
 
Ijcai 2020
Ijcai 2020Ijcai 2020
Ijcai 2020
Viral Gupta
 
AWS Cost Optimisation Best Practices Webinar
AWS Cost Optimisation Best Practices WebinarAWS Cost Optimisation Best Practices Webinar
AWS Cost Optimisation Best Practices Webinar
Amazon Web Services
 
Save up to 90% and Run Production Workloads on Spot - CMP307 - re:Invent 2017
Save up to 90% and Run Production Workloads on Spot - CMP307 - re:Invent 2017Save up to 90% and Run Production Workloads on Spot - CMP307 - re:Invent 2017
Save up to 90% and Run Production Workloads on Spot - CMP307 - re:Invent 2017
Amazon Web Services
 

Similar to Pydata presentation (20)

Sagemaker Automatic model tuning
Sagemaker Automatic model tuningSagemaker Automatic model tuning
Sagemaker Automatic model tuning
 
No more grid search! How to build models effectively by Thomas Huijskens
No more grid search! How to build models effectively by Thomas HuijskensNo more grid search! How to build models effectively by Thomas Huijskens
No more grid search! How to build models effectively by Thomas Huijskens
 
Automatic Model Tuning Using Amazon SageMaker (AIM412) - AWS re:Invent 2018
Automatic Model Tuning Using Amazon SageMaker (AIM412) - AWS re:Invent 2018Automatic Model Tuning Using Amazon SageMaker (AIM412) - AWS re:Invent 2018
Automatic Model Tuning Using Amazon SageMaker (AIM412) - AWS re:Invent 2018
 
MCL310_Building Deep Learning Applications with Apache MXNet and Gluon
MCL310_Building Deep Learning Applications with Apache MXNet and GluonMCL310_Building Deep Learning Applications with Apache MXNet and Gluon
MCL310_Building Deep Learning Applications with Apache MXNet and Gluon
 
Bayesian Optimization for Balancing Metrics in Recommender Systems
Bayesian Optimization for Balancing Metrics in Recommender SystemsBayesian Optimization for Balancing Metrics in Recommender Systems
Bayesian Optimization for Balancing Metrics in Recommender Systems
 
AWS reInvent 2017 recap - Optimizing Costs as You Scale on AWS
AWS reInvent 2017 recap - Optimizing Costs as You Scale on AWSAWS reInvent 2017 recap - Optimizing Costs as You Scale on AWS
AWS reInvent 2017 recap - Optimizing Costs as You Scale on AWS
 
Spark ML with High Dimensional Labels Michael Zargham and Stefan Panayotov
Spark ML with High Dimensional Labels Michael Zargham and Stefan PanayotovSpark ML with High Dimensional Labels Michael Zargham and Stefan Panayotov
Spark ML with High Dimensional Labels Michael Zargham and Stefan Panayotov
 
"A Shallow Dive into Training Deep Neural Networks," a Presentation from Deep...
"A Shallow Dive into Training Deep Neural Networks," a Presentation from Deep..."A Shallow Dive into Training Deep Neural Networks," a Presentation from Deep...
"A Shallow Dive into Training Deep Neural Networks," a Presentation from Deep...
 
SigOpt for Machine Learning and AI
SigOpt for Machine Learning and AISigOpt for Machine Learning and AI
SigOpt for Machine Learning and AI
 
Metering the Hybrid Cloud - ARC404 - re:Invent 2017
Metering the Hybrid Cloud - ARC404 - re:Invent 2017Metering the Hybrid Cloud - ARC404 - re:Invent 2017
Metering the Hybrid Cloud - ARC404 - re:Invent 2017
 
Production model lifecycle management 2016 09
Production model lifecycle management 2016 09Production model lifecycle management 2016 09
Production model lifecycle management 2016 09
 
LinkedIn talk at Netflix ML Platform meetup Sep 2019
LinkedIn talk at Netflix ML Platform meetup Sep 2019LinkedIn talk at Netflix ML Platform meetup Sep 2019
LinkedIn talk at Netflix ML Platform meetup Sep 2019
 
Commercial Management and Cost Optimization on AWS - AWS Online Tech Talks
Commercial Management and Cost Optimization on AWS - AWS Online Tech TalksCommercial Management and Cost Optimization on AWS - AWS Online Tech Talks
Commercial Management and Cost Optimization on AWS - AWS Online Tech Talks
 
Tuning for Systematic Trading: Talk 2: Deep Learning
Tuning for Systematic Trading: Talk 2: Deep LearningTuning for Systematic Trading: Talk 2: Deep Learning
Tuning for Systematic Trading: Talk 2: Deep Learning
 
Auto Scaling Prime Time: Target Tracking Hits the Bullseye at Netflix - CMP31...
Auto Scaling Prime Time: Target Tracking Hits the Bullseye at Netflix - CMP31...Auto Scaling Prime Time: Target Tracking Hits the Bullseye at Netflix - CMP31...
Auto Scaling Prime Time: Target Tracking Hits the Bullseye at Netflix - CMP31...
 
Parametric Estimation in a nutshell
Parametric Estimation in a nutshellParametric Estimation in a nutshell
Parametric Estimation in a nutshell
 
WKS402 Well-Architected Workshop
WKS402 Well-Architected WorkshopWKS402 Well-Architected Workshop
WKS402 Well-Architected Workshop
 
Ijcai 2020
Ijcai 2020Ijcai 2020
Ijcai 2020
 
AWS Cost Optimisation Best Practices Webinar
AWS Cost Optimisation Best Practices WebinarAWS Cost Optimisation Best Practices Webinar
AWS Cost Optimisation Best Practices Webinar
 
Save up to 90% and Run Production Workloads on Spot - CMP307 - re:Invent 2017
Save up to 90% and Run Production Workloads on Spot - CMP307 - re:Invent 2017Save up to 90% and Run Production Workloads on Spot - CMP307 - re:Invent 2017
Save up to 90% and Run Production Workloads on Spot - CMP307 - re:Invent 2017
 

Recently uploaded

DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
DianaGray10
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Thierry Lestable
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
91mobiles
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Product School
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
BookNet Canada
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
Elena Simperl
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
Product School
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Tobias Schneck
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
Product School
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
UiPathCommunity
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
Alison B. Lowndes
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
Sri Ambati
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
Ana-Maria Mihalceanu
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Ramesh Iyer
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Product School
 

Recently uploaded (20)

DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
 

Pydata presentation

  • 1. All content Copyright © 2017 QuantumBlack Visual Analytics Ltd. Bayesian optimization PyData London 2017 6th of May, 2017 Thomas Huijskens
  • 2. All content Copyright © 2017 QuantumBlack Visual Analytics Ltd. 2 The modelling workflow of a data scientist roughly follows three steps Feature engineering Feature selection Model selection
  • 3. All content Copyright © 2017 QuantumBlack Visual Analytics Ltd. 3 In this talk, we will focus on the last step of this workflow Feature engineering Model selection Focus of this talk Feature selection
  • 4. All content Copyright © 2017 QuantumBlack Visual Analytics Ltd. 4 Most models have a number of hyperparameters that need to be chosen a priori.. • Typically, we want to optimize some performance metric 𝑓ℳ for a model ℳ, on a hold-out set of data. • The model ℳ has some hyperparameters 𝜃 that we need to specify, and the performance of ℳ is highly dependent on the settings of 𝜃. • Example: For a classification problem, ℳ could be a support vector machine, and the performance metric 𝑓ℳ could be the out-of-sample AUC. • Because the performance of the SVM depends on hyperparameters 𝜃, the metric 𝑓ℳ depends on 𝜃 as well.
  • 5. All content Copyright © 2017 QuantumBlack Visual Analytics Ltd. 5 𝑓ℳ 𝜃 𝐷) Learner ℳ Performance function 𝑓 𝑓 depends on the hyperparameters 𝜃 of the model ℳ .. and our goal is to find the values of 𝜃 for which the value of the (out-of- sample) performance 𝑓ℳ is optimal 𝑓 is conditional on the data 𝐷
  • 6. All content Copyright © 2017 QuantumBlack Visual Analytics Ltd. 6 Manual grid search works for low-dimensional problems, but does not scale well to higher dimensions Curse of dimensionality makes this inefficient in higher dimensional problems
  • 7. All content Copyright © 2017 QuantumBlack Visual Analytics Ltd. 7 Random grid search works better in higher dimensions, but..
  • 8. All content Copyright © 2017 QuantumBlack Visual Analytics Ltd. 8 .. shouldn't previously evaluated hyperparameter values guide us in the search for the true optimal value? Low values of L here, should probably not focus in this area High values of L here, should probably focus in this area
  • 9. All content Copyright © 2017 QuantumBlack Visual Analytics Ltd. 9 Bayesian optimization
  • 10. All content Copyright © 2017 QuantumBlack Visual Analytics Ltd. 10 Bayesian optimization is only beneficial in certain situations In Bayesian optimization, we are building a probabilistic model for the performance metric 𝑓ℳ(𝜃). This introduces computational overhead. As a result, applying Bayesian optimization only really makes sense if • The number of hyperparameters is very high (𝜃 is of high dimension); or • It is computationally very expensive to evaluate 𝑓ℳ(𝜃) for a single point 𝜃.
  • 11. All content Copyright © 2017 QuantumBlack Visual Analytics Ltd. 11 Evaluate performance of learner f with parameters 𝜃 Update current belief of loss surface of the learner f Choose 𝜃 that maximises some utility over the current belief The classic Bayesian optimization algorithm consists of three steps 𝑓 | 𝑦 𝑛𝑒𝑤 𝑦 𝑛𝑒𝑤 = 𝑓 𝜃 𝑛𝑒𝑤 𝜃 𝑛𝑒𝑤
  • 12. All content Copyright © 2017 QuantumBlack Visual Analytics Ltd. 12 Evaluate performance of learner f with parameters 𝜃 Update current belief of loss surface of the learner f Choose 𝜃 that maximises some utility over the current belief The classic Bayesian optimization algorithm consists of three steps 𝑓 | 𝑦 𝑛𝑒𝑤 𝑦 𝑛𝑒𝑤 = 𝑓 𝜃 𝑛𝑒𝑤 𝜃 𝑛𝑒𝑤
  • 13. All content Copyright © 2017 QuantumBlack Visual Analytics Ltd. 13 The performance function 𝑓ℳ is modelled using Gaussian Processes (GPs) • GPs are the generalization of a Gaussian distribution to a distribution over functions, instead of random variables • Just as a Gaussian distribution is completely specified by its mean and variance, a GP is completely specified by its mean function and covariance function. • We can think of a GP as a function that, instead of returning a scalar 𝑓(𝒙), returns the mean and variance of a normal distribution over the possible values of 𝑓 at 𝒙.
  • 14. All content Copyright © 2017 QuantumBlack Visual Analytics Ltd. 14 GPs are the generalization of a Gaussian distribution to a distribution over functions, instead of random variables1 1A Tutorial on Bayesian Optimization of Expensive Cost Functions, with Application to Active User Modeling and Hierarchical Reinforcement Learning, Eric Brochu, Vlad M. Cora, Nando de Freitas, 2010. https://arxiv.org/abs/1012.2599
  • 15. All content Copyright © 2017 QuantumBlack Visual Analytics Ltd. 15 Evaluate performance of learner f with parameters 𝜃 Update current belief of loss surface of the learner f Choose 𝜃 that maximises some utility over the current belief The classic Bayesian optimization algorithm consists of three steps 𝑓 | 𝑦 𝑛𝑒𝑤 𝑦 𝑛𝑒𝑤 = 𝑓 𝜃 𝑛𝑒𝑤 𝜃 𝑛𝑒𝑤
  • 16. All content Copyright © 2017 QuantumBlack Visual Analytics Ltd. 16 How do we use our current belief to make a good guess about what point to evaluate next?
  • 17. All content Copyright © 2017 QuantumBlack Visual Analytics Ltd. 17 Acquisition functions are used to formalize what constitutes a ”best guess” 𝐸𝐼 𝜃 = 𝔼[max 𝜃 0, 𝑓ℳ 𝜃 − 𝑓ℳ 𝜃 ] , 𝜃 𝑛𝑒𝑤 = argmax 𝜃 𝐸𝐼(𝜃)
  • 18. All content Copyright © 2017 QuantumBlack Visual Analytics Ltd. 18 The expected improvement acquisition function can be evaluated analytically 𝐸𝐼 𝜃 = 𝜇 𝜃 − 𝑓 𝜃 𝛷 𝑍 + 𝜎(𝜃)𝜙(𝑍), 𝜎 𝜃 > 0 0, 𝜎 𝜃 = 0 , 𝑍 = 𝜇 𝜃 − 𝑓( 𝜃) 𝜎(𝜃)
  • 19. All content Copyright © 2017 QuantumBlack Visual Analytics Ltd. 19 The expected improvement acquisition trades off exploitation of known optimal areas, versus exploration of unexplored areas of the loss surface Exploitation of known optimal space Exploration of unknown space Exploration of unknown space
  • 20. All content Copyright © 2017 QuantumBlack Visual Analytics Ltd. 20 Evaluate performance of learner f with parameters 𝜃 Update current belief of loss surface of the learner f Choose 𝜃 that maximises some utility over the current belief The classic Bayesian optimization algorithm consists of three steps 𝑓 | 𝑦 𝑛𝑒𝑤 𝑦 𝑛𝑒𝑤 = 𝑓 𝜃 𝑛𝑒𝑤 𝜃 𝑛𝑒𝑤
  • 21. All content Copyright © 2017 QuantumBlack Visual Analytics Ltd. 21 How do we use this in real-life?
  • 22. All content Copyright © 2017 QuantumBlack Visual Analytics Ltd. 22 Example – Bayesian optimization can be used to tune the hyperparameters of an SVM model
  • 23. All content Copyright © 2017 QuantumBlack Visual Analytics Ltd. 23 Example – A pseudocode implementation of Bayesian optimization shows its simple API
  • 24. All content Copyright © 2017 QuantumBlack Visual Analytics Ltd. 24 Example – Bayesian optimization can be used to tune the hyperparameters of an SVM model
  • 25. All content Copyright © 2017 QuantumBlack Visual Analytics Ltd. 25 Example – Bayesian optimization can be used to tune the hyperparameters of an SVM model
  • 26. All content Copyright © 2017 QuantumBlack Visual Analytics Ltd. 26 GPs can not be used as a simple black-box, and some care must be taken 1. Choose an appropriate scale for your hyperparameters: For parameters like a learning rate, or regularization term, it makes more sense to sample on the log-uniform domain, instead of the uniform domain. 2. Choose the kernel of the GP carefully: Each kernel implicitly assumes different properties on the loss function, in terms of differentiability and periodicity.
  • 27. All content Copyright © 2017 QuantumBlack Visual Analytics Ltd. 27 Multiple production-ready, and open-source, modules for Bayesian optimization are available online 1. Spearmint 2. Hyperopt 3. MOE 4. Hyperband (model-free) 5. SMAC (model-free)

Editor's Notes

  1. Mention the similarity with general sequantial model based optimisation algorithms  this is just a special instance of an algorithm in that class.