Problem Description Bayesian Optimization apsis and its Architecture Project Organisation Performance Evaluation
apsis
Automated Hyperparameter Optimization Using Bayesian
Optimization
Frederik Diehl Andreas Jauch
March 10, 2015
State March 10, 2015
AGENDA
Problem Description
Bayesian Optimization
apsis and its Architecture
Project Organisation
Performance Evaluation
MOTIVATION
Why hyperparameter optimization, and why automate it?
hyperparameter tuning often leads to huge performance gains
"more of an art than a science"
reproducibility of published results
automatic methods might be better than humans
provide ML algorithms to non-expert users
ML PROCESS OVERVIEW
FORMAL PROBLEM DESCRIPTION
λ : hyperparameter vector, $\lambda = (\lambda^{(1)}, \ldots, \lambda^{(n)})$
L(X, f) : loss function evaluated for model f and dataset X
Aλ(X) : learning algorithm with hyperparameter vector λ
learning on dataset X
Xtrain : training data, Xvalid : validation data, Xtest : test data
Ψ(λ) : hyperparameter response function/surface
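Hyperparameter Optimization Problem
$$\hat{\lambda} \approx \operatorname*{argmin}_{\lambda \in \Lambda} \underbrace{\operatorname*{mean}_{x_i \in X_{valid}} L\big(x_i, A_\lambda(X_{train})\big)}_{\Psi(\lambda)} \quad (1)$$
$$= \operatorname*{argmin}_{\lambda \in \Lambda} \Psi(\lambda) \quad (2)$$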
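The definition of Ψ(λ) can be made concrete with a small sketch. The helper names (`response`, `best_of`) and the toy learner (a mean estimator shrunk by λ) are hypothetical and chosen only for illustration. They are not part of apsis.

```python
from statistics import mean

def response(lam, learn, loss, x_train, x_valid):
    """Psi(lambda): mean validation loss of the model A_lambda(X_train)."""
    model = learn(lam, x_train)                      # A_lambda(X_train)
    return mean(loss(x, model) for x in x_valid)     # mean over x_i in X_valid

def best_of(candidates, learn, loss, x_train, x_valid):
    """Approximate argmin of Psi over a finite set of candidate values."""
    return min(candidates,
               key=lambda lam: response(lam, learn, loss, x_train, x_valid))

# Toy learner (assumption): estimate the mean, shrunk towards 0 by lam.
def learn_shrunk_mean(lam, x_train):
    return sum(x_train) / (len(x_train) + lam)

# Toy loss (assumption): squared error of a single validation point.
def squared_error(x, model):
    return (x - model) ** 2

x_train, x_valid = [1.0, 2.0, 3.0], [2.0, 2.0]
best = best_of([0.0, 3.0], learn_shrunk_mean, squared_error, x_train, x_valid)
```

Every call to `response` retrains the model, which is why each evaluation of Ψ is expensive in realistic settings.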
KEY PROBLEM PROPERTIES
unknown, probably non-convex response surface Ψ
no derivative-based optimization possible
every evaluation of Ψ is expensive
evaluation time of Ψ depends on individual value of λ
low effective dimensionality of Ψ
which dimensions are important is dataset dependent
tree-structured configuration space (not addressed in apsis yet)
STATE OF THE ART
optimization still manual in many projects
grid search most common method
often the only method provided by ML frameworks
random search
Bayesian optimization
code by Jasper Snoek et al. (Harvard/Toronto)
Whetlab: Bayesian optimization in the cloud
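For comparison, the two baseline methods named above can be sketched in a few lines. This is a minimal illustration with hypothetical function names, assuming a response function `psi` that takes a tuple of hyperparameter values. It is not the apsis implementation.

```python
import itertools
import random

def grid_search(psi, axes):
    """Evaluate psi on the full Cartesian product of per-dimension values."""
    return min(itertools.product(*axes), key=psi)

def random_search(psi, bounds, n_iter, seed=0):
    """Evaluate psi on n_iter points drawn uniformly from the given bounds."""
    rng = random.Random(seed)
    samples = [tuple(rng.uniform(lo, hi) for lo, hi in bounds)
               for _ in range(n_iter)]
    return min(samples, key=psi)

# Toy response surface (assumption) with its minimum at (0.3, 0.7).
def psi(lam):
    return (lam[0] - 0.3) ** 2 + (lam[1] - 0.7) ** 2

grid_best = grid_search(psi, [[0.0, 0.5, 1.0], [0.0, 0.5, 1.0]])
random_best = random_search(psi, [(0.0, 1.0), (0.0, 1.0)], n_iter=9)
```

Both calls spend nine evaluations. The grid only ever tries three distinct values per dimension, while random search tries nine, which matters when only a few dimensions are important.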
BAYESIAN OPTIMIZATION
approximate Ψ(λ) by a surrogate function M(λ) = y
surrogate function cheaper to evaluate than Ψ
interpret model to find minimization candidates for Ψ
evaluate Ψ for promising candidates
BAYESIAN OPTIMIZATION FUNDAMENTALS
We need two design choices
Surrogate Modelling Function - Gaussian Processes
universal approximation
very flexible and have many useful properties
closed under sampling
Acquisition Function
Probability of Improvement
Expected Improvement
ACQUISITION FUNCTION u
measures the expected utility of evaluating the objective
function at a point λnext
exploitation vs. exploration trade-off
(picture adopted from [1])
OPTIMIZATION - SUCCESSIVELY UPDATING THE GP
1. Find $\lambda_{next} = \operatorname*{argmax}_{\lambda} u(\lambda)$, the maximum of the acquisition function
2. Evaluate Ψ at λnext
3. Update the GP
(picture adopted from [1])
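The loop of maximizing the acquisition function, evaluating the objective Ψ at λnext and updating the GP can be sketched for a one-dimensional problem on [0, 1]. This is a minimal, dependency-free illustration under several assumptions: a zero-mean GP with a fixed squared exponential kernel, expected improvement as acquisition function, and a plain grid in place of a gradient-based acquisition optimizer. All function names are hypothetical, and none of this is the apsis code.

```python
import math

def k_se(a, b, length=0.2):
    """Squared exponential kernel for scalar inputs."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def cholesky(A):
    """Lower-triangular L with A = L L^T (A symmetric positive definite)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(max(s, 1e-12)) if i == j else s / L[j][j]
    return L

def forward_solve(L, b):
    """Solve L x = b for lower-triangular L."""
    x = []
    for i in range(len(b)):
        x.append((b[i] - sum(L[i][k] * x[k] for k in range(i))) / L[i][i])
    return x

def gp_posterior(xs, ys, x_new, noise=1e-6):
    """Zero-mean GP posterior mean and standard deviation at x_new."""
    n = len(xs)
    K = [[k_se(xs[i], xs[j]) + (noise if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    L = cholesky(K)
    a = forward_solve(L, ys)                              # L^-1 y
    v = forward_solve(L, [k_se(x_new, x) for x in xs])    # L^-1 k*
    mu = sum(vi * ai for vi, ai in zip(v, a))             # k*^T K^-1 y
    var = max(k_se(x_new, x_new) - sum(vi * vi for vi in v), 1e-12)
    return mu, math.sqrt(var)

def expected_improvement(mu, sigma, best):
    """Closed-form EI for minimization under a Gaussian prediction."""
    z = (best - mu) / sigma
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return sigma * (z * cdf + pdf)

def bayes_opt(psi, n_init=3, n_iter=10, grid_size=101):
    """Minimize psi on [0, 1] with a GP surrogate and EI."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    xs = [i / (n_init - 1) for i in range(n_init)]        # initial design
    ys = [psi(x) for x in xs]
    for _ in range(n_iter):
        best = min(ys)

        def u(x):
            mu, sigma = gp_posterior(xs, ys, x)
            return expected_improvement(mu, sigma, best)

        x_next = max(grid, key=u)      # 1. maximize the acquisition function
        xs.append(x_next)
        ys.append(psi(x_next))         # 2. evaluate the objective at x_next
        # 3. the GP is updated implicitly: the next posterior uses xs, ys
    i_best = ys.index(min(ys))
    return xs[i_best], ys[i_best]

x_best, y_best = bayes_opt(lambda x: (x - 0.3) ** 2)
```

The sketch refits the GP from scratch for every query, which is wasteful but keeps the three steps easy to see.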
FITTING THE GP TO THE PROBLEM
by tuning the covariance!
Squared Exponential Kernel
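$$K_{SE}(\lambda, \lambda') = \exp\left(-\frac{1}{2l^2} \cdot \sum_{d=1}^{\dim(D)} (\lambda_d - \lambda'_d)^2\right)$$
use Automatic Relevance Determination (ARD)
$$K_{SE}(\lambda, \lambda') = \theta_0 \cdot \exp\left(-\frac{1}{2} \cdot \sum_{d=1}^{\dim(D)} \frac{1}{\theta_d^2} (\lambda_d - \lambda'_d)^2\right)$$
with ARD vector θ
$$\theta = (\underbrace{\theta_0}_{\text{bias}}, \underbrace{\theta_1, \ldots, \theta_d}_{\text{dimension weights}})$$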
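A direct transcription of the ARD kernel into Python may help to see what the dimension weights do. The function name `k_se_ard` and the list layout of `theta` (θ0 first, then one weight per dimension) are assumptions made for this sketch.

```python
import math

def k_se_ard(lam_a, lam_b, theta):
    """Squared exponential kernel with ARD.

    theta[0] is the overall scale theta_0, theta[1:] holds one weight
    theta_d per input dimension.
    """
    theta0, weights = theta[0], theta[1:]
    s = sum(((a - b) / w) ** 2 for a, b, w in zip(lam_a, lam_b, weights))
    return theta0 * math.exp(-0.5 * s)

# With a very large theta_1, a difference in dimension 1 barely changes the
# covariance, so that dimension is treated as irrelevant.
relevant = k_se_ard([0.0, 0.0], [1.0, 0.0], [1.0, 1.0, 1.0])
irrelevant = k_se_ard([0.0, 0.0], [1.0, 0.0], [1.0, 1e6, 1.0])
```

This is how ARD can reflect the low effective dimensionality of Ψ: fitting θ to the data lets unimportant dimensions receive large weights.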
WHAT DO ACQUISITION FUNCTIONS LOOK LIKE?
Expected Improvement (EI)
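$$u_{EI}(\lambda) = \int_{-\infty}^{\infty} \max(\Psi(\lambda^*) - y, 0) \cdot p_M(y \mid \lambda) \, dy$$
closed form solution for GPs available
$$u_{EI}(\lambda \mid M_t) = \sigma(\lambda) \cdot \big( z(\lambda) \cdot \Phi(z(\lambda)) + \varphi(z(\lambda)) \big), \quad z(\lambda) = \frac{\Psi(\lambda^*) - \mu(\lambda)}{\sigma(\lambda)}$$
where µ(λ) and σ(λ) are the GP posterior mean and standard deviation at λ, and Φ and φ are the standard normal CDF and PDF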
gradient analytically derived in apsis for more effective
optimization
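The closed form can be checked numerically against the integral definition of expected improvement. The sketch below assumes a Gaussian prediction with mean `mu` and standard deviation `sigma` at some λ, and `best` as the best value Ψ(λ*) observed so far. The function names are illustrative only.

```python
import math

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def ei_closed_form(mu, sigma, best):
    """sigma * (z * Phi(z) + phi(z)) with z = (best - mu) / sigma."""
    z = (best - mu) / sigma
    return sigma * (z * norm_cdf(z) + norm_pdf(z))

def ei_numeric(mu, sigma, best, n=20000, width=10.0):
    """Midpoint-rule approximation of the integral of max(best - y, 0) * p(y)."""
    lo = mu - width * sigma
    h = 2.0 * width * sigma / n
    total = 0.0
    for i in range(n):
        y = lo + (i + 0.5) * h
        total += max(best - y, 0.0) * norm_pdf((y - mu) / sigma) / sigma * h
    return total

closed = ei_closed_form(mu=0.5, sigma=0.2, best=0.4)
numeric = ei_numeric(mu=0.5, sigma=0.2, best=0.4)
```

The two values should agree to several decimal places. The closed form is also differentiable in µ and σ, which is what makes a gradient-based acquisition optimizer possible.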
THE apsis TOOLKIT
Automated Hyperparameter Optimization Framework for
random search
bayesian optimization
as an open source framework featuring
flexible architecture, ready to be extended for more
optimizers
ready for use with scikit-learn and theano
implemented in Python
PROJECT OBJECTIVES
open source implementation of state of the art research in
bayesian optimization
extensible project to encourage collaboration with other
researchers
easy integration with existing machine learning
frameworks
multi core support
apsis ARCHITECTURE OVERVIEW
apsis CORE MODEL COMPONENTS
Parameter Definitions
define the meta information for each hyperparameter
Candidates
represent a specific hyperparameter vector and its value
hold the function value if available
Experiments
represent an optimization object
keep track of finished and unfinished Candidates
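The relationship between these three components can be pictured with a small data-structure sketch. The class and attribute names below are hypothetical and only mirror the description on this slide. They are not the actual apsis classes or their signatures.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

@dataclass
class Candidate:
    """One hyperparameter vector and, once evaluated, its function value."""
    params: Dict[str, float]
    result: Optional[float] = None

@dataclass
class Experiment:
    """One optimization task: parameter definitions plus all candidates."""
    name: str
    param_defs: Dict[str, Tuple[float, float]]   # name -> (lower, upper)
    pending: List[Candidate] = field(default_factory=list)
    finished: List[Candidate] = field(default_factory=list)

    def add_pending(self, cand: Candidate) -> None:
        self.pending.append(cand)

    def add_finished(self, cand: Candidate, result: float) -> None:
        if cand in self.pending:
            self.pending.remove(cand)
        cand.result = result
        self.finished.append(cand)

    def best(self) -> Optional[Candidate]:
        """Finished candidate with the lowest function value, if any."""
        if not self.finished:
            return None
        return min(self.finished, key=lambda c: c.result)

exp = Experiment("demo", {"learning_rate": (0.0, 1.0)})
first = Candidate({"learning_rate": 0.1})
exp.add_pending(first)
exp.add_finished(first, 0.5)
second = Candidate({"learning_rate": 0.2})
exp.add_finished(second, 0.3)
```

Keeping unfinished candidates separate from finished ones is what allows several evaluations to be in flight at the same time.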
USING apsis - EXPERIMENT ASSISTANTS
single experiment interaction interface
provides plots and result bookkeeping
USING apsis - LAB ASSISTANTS
multiple experiments to compare different optimization
techniques
cross validation
EXTENSIVE EXPERIMENT TRACKING
automated plot writing
automated results writing
write out information at every step
AUTOMATED PLOTTING
plot function evaluations and best results
plot confidence bars when using cross validation
write out plots at every step
PARAMETER REPRESENTATION IN apsis
different representation by parameter type
various nominal and numeric types
NUMERIC PARAMETERS IN apsis
Warping Mechanism
parameters are warped into [0, 1] interval
optimization core can assume a uniform and equal
distribution in [0, 1] space
warping can be user defined
Provided Warpings for
normalization of arbitrary intervals [a, b] into [0, 1] space.
asymptotic parameters, e.g. learning rate asymptotic at 0
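A minimal sketch of the two kinds of warping follows. The interval normalization is the usual affine map. For the asymptotic case the sketch assumes a logarithmic map between two positive bounds, which is one common way to treat a parameter such as a learning rate that approaches 0. The function names are hypothetical, and the warping actually used in apsis may differ.

```python
import math

def warp_interval(value, a, b):
    """Map value from [a, b] linearly into [0, 1]."""
    return (value - a) / (b - a)

def unwarp_interval(u, a, b):
    """Inverse of warp_interval: map u from [0, 1] back into [a, b]."""
    return a + u * (b - a)

def warp_log(value, a, b):
    """Map value from [a, b] with 0 < a < b into [0, 1] on a log scale."""
    return (math.log(value) - math.log(a)) / (math.log(b) - math.log(a))

def unwarp_log(u, a, b):
    """Inverse of warp_log."""
    return math.exp(math.log(a) + u * (math.log(b) - math.log(a)))

# A learning rate of 1e-3 sits in the middle of [1e-5, 1e-1] on a log scale,
# but almost at the lower end on a linear scale.
log_pos = warp_log(1e-3, 1e-5, 1e-1)
lin_pos = warp_interval(1e-3, 1e-5, 1e-1)
```

The optimizer then works only with the warped values in [0, 1] and unwarps a candidate before it is handed to the learning algorithm.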
NOMINAL PARAMETERS IN apsis
generally supported in apsis
no support in Bayesian Optimization
GP kernels based on distance metrics between parameters
interesting topic for further research
no publications on this topic so far
Whetlab claims to deal well with them, but doesn't say how
EXPECTED IMPROVEMENT OPTIMIZATION
gradient analytically derived (see our paper for the derivation)
1000 Steps Random Search for Initialization
Several iterative optimization methods integrated
L-BFGS-B Bounded Low Memory Quasi Newton Method
BFGS Quasi Newton Method
Nelder-Mead
Inexact Newton with Conjugate Gradient Solver
...
DEALING WITH GP HYPERPARAMETERS
the GP surrogate model introduces new hyperparameters
not the subject of the optimization itself ⇒ hyper-hyperparameters
optimization by maximum likelihood method
integrating over these parameters in the acquisition
function using Hybrid Monte Carlo sampling
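The maximum likelihood option can be illustrated with the standard GP log marginal likelihood, log p(y | X, θ) = −½ yᵀK⁻¹y − ½ log|K| − (n/2) log 2π. The sketch below makes several assumptions: one-dimensional inputs, a squared exponential kernel with a single length scale as the only GP hyperparameter, and a search over a fixed list of candidate values in place of a gradient-based optimizer. The function names are hypothetical. The sampling-based alternative is not shown.

```python
import math

def k_se(a, b, length):
    """Squared exponential kernel for scalar inputs."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def cholesky(A):
    """Lower-triangular L with A = L L^T (A symmetric positive definite)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(max(s, 1e-12)) if i == j else s / L[j][j]
    return L

def log_marginal_likelihood(xs, ys, length, noise=1e-6):
    """log p(y | X, length) for a zero-mean GP with SE kernel."""
    n = len(xs)
    K = [[k_se(xs[i], xs[j], length) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    L = cholesky(K)
    # Forward substitution L v = y, so that y^T K^-1 y = v^T v.
    v = []
    for i in range(n):
        v.append((ys[i] - sum(L[i][k] * v[k] for k in range(i))) / L[i][i])
    data_fit = -0.5 * sum(t * t for t in v)
    complexity = -sum(math.log(L[i][i]) for i in range(n))   # -0.5 * log|K|
    return data_fit + complexity - 0.5 * n * math.log(2.0 * math.pi)

def fit_length_scale(xs, ys, candidates):
    """Pick the candidate length scale with the highest marginal likelihood."""
    return max(candidates, key=lambda l: log_marginal_likelihood(xs, ys, l))

xs = [0.0, 0.25, 0.5, 0.75, 1.0]
ys = [math.sin(2.0 * math.pi * x) for x in xs]
length = fit_length_scale(xs, ys, [0.05, 0.1, 0.2, 0.4, 0.8])
```

A single point estimate like this is cheap but ignores uncertainty about the GP hyperparameters, which is the motivation for integrating over them instead.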
apsis PROJECT SET UP
Open-Source project from the beginning
MIT-License
active issue tracking
PEP-8 code styling convention
Fully automated sphinx documentation build on every
commit
90% test coverage
clear commit messages
apsis GITHUB REPOSITORY
Check out http://github.com/FrederikDiehl/apsis !
ISSUE TRACKING AND DISCUSSION
UNIT TESTS
90% overall test coverage
100% in most core components
GOOD CODE DOCUMENTATION
almost 50:50 ratio of code vs. doc
documented according to sphinx standard
FULLY AUTOMATED DOCUMENTATION BUILD
builds on every commit
Visit http://apsis.readthedocs.org !
BRANIN HOO OPTIMIZATION
Best Value by Optimizer
Values by Optimizer
random search finds a better end result, but Bayesian optimization is more stable
performance similar to that reported in other Bayesian optimization literature
no other group publishes a comparison to random search
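For reference, the Branin (Branin-Hoo) benchmark function with its usual constants is easy to write down, together with a random search baseline on its customary domain x1 ∈ [−5, 10], x2 ∈ [0, 15]. The helper `random_search_branin` is a hypothetical name. The sketch does not reproduce the experiment shown in the plots.

```python
import math
import random

def branin(x1, x2):
    """Branin-Hoo function with its standard constants.

    Its global minimum value is about 0.397887, reached at three points,
    for example (pi, 2.275).
    """
    a = 1.0
    b = 5.1 / (4.0 * math.pi ** 2)
    c = 5.0 / math.pi
    r = 6.0
    s = 10.0
    t = 1.0 / (8.0 * math.pi)
    return a * (x2 - b * x1 ** 2 + c * x1 - r) ** 2 + s * (1.0 - t) * math.cos(x1) + s

def random_search_branin(n_iter=1000, seed=0):
    """Best value found by uniform random search on the usual domain."""
    rng = random.Random(seed)
    return min(branin(rng.uniform(-5.0, 10.0), rng.uniform(0.0, 15.0))
               for _ in range(n_iter))

best_value = random_search_branin()
```

Because the minimum value is known, the gap between `best_value` and 0.397887 gives a direct measure of how close an optimizer got.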
OPTIMIZING ARTIFICIAL NOISE FUNCTION
One dimensional noise function with several smoothing
variances
OPTIMIZING ARTIFICIAL NOISE FUNCTION
Minimization result on 3d noise by smoothing factor.
BREZE MNIST NEURAL NETWORK
Neural Network on MNIST using uniform parameters
BREZE MNIST NEURAL NETWORK (2)
Neural Network on MNIST using asymptotic parameters for
learning rate and learning rate decay
TAKING THE PROJECT TO THE NEXT LEVEL -
PROGRAM
implement full multicore support
implement a REST web-service to offer interoperability
with any language
improve integration of matplotlib
TAKING THE PROJECT TO THE NEXT LEVEL -
BAYESIAN OPTIMIZATION
deal with nominal parameters
try replacing GPs with Student-t processes
try to take tree structured configuration space into account
account for evaluation cost depending on hyperparameter
setting
implement freeze-thaw optimization idea [3]
automated learning of input warping [2]
Thank You!
REFERENCES I
[1] Eric Brochu, Vlad M. Cora, and Nando de Freitas. A Tutorial on Bayesian Optimization of Expensive Cost Functions, with Application to Active User Modeling and Hierarchical Reinforcement Learning. IEEE Transactions on Reliability, abs/1012.2, 2010.
[2] Jasper Snoek, Kevin Swersky, Richard S. Zemel, and Ryan P. Adams. Input warping for Bayesian optimization of non-stationary functions. arXiv preprint arXiv:1402.0929, 2014.
[3] Kevin Swersky, Jasper Snoek, and Ryan Prescott Adams. Freeze-thaw Bayesian optimization. arXiv preprint arXiv:1406.3896, 2014.
apsis - Automatic Hyperparameter Optimization Framework for Machine Learning

  • 1. apsis: Automated Hyperparameter Optimization Using Bayesian Optimization. Frederik Diehl, Andreas Jauch. March 10, 2015.
  • 2. AGENDA
      • Problem Description
      • Bayesian Optimization
      • apsis and its Architecture
      • Project Organisation
      • Performance Evaluation
  • 3. MOTIVATION: Why hyperparameter optimization, and why automate it?
      • hyperparameter tuning often leads to large performance gains
      • "more of an art than a science"
      • reproducibility of published results
      • automatic methods might be better than humans
      • make ML algorithms available to non-expert users
  • 4. ML PROCESS OVERVIEW
  • 5. FORMAL PROBLEM DESCRIPTION
      • $\lambda$: hyperparameter vector, $\lambda = (\lambda^{(1)}, \ldots, \lambda^{(n)})$
      • $L(X, f)$: loss function evaluated for model $f$ and dataset $X$
      • $A_\lambda(X)$: learning algorithm with hyperparameter vector $\lambda$, learning on dataset $X$
      • $X_{train}$: training data, $X_{valid}$: validation data, $X_{test}$: test data
      • $\Psi(\lambda)$: hyperparameter response function/surface
      • Hyperparameter Optimization Problem:
        $$\hat{\lambda} \approx \operatorname*{argmin}_{\lambda \in \Lambda} \underbrace{\operatorname*{mean}_{x_i \in X_{valid}} L\big(x_i, A_\lambda(X_{train})\big)}_{\Psi(\lambda)} \quad (1)$$
        $$= \operatorname*{argmin}_{\lambda \in \Lambda} \Psi(\lambda) \quad (2)$$
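The response function in equations (1) and (2) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `response`, `learn_shrunken_mean`, `squared_error`, the toy data, and the small grid standing in for the search space Λ are all hypothetical names and values, not part of apsis.

```python
from statistics import mean


def response(lam, learn, loss, x_train, x_valid):
    """Psi(lambda): mean validation loss of the model trained with lam."""
    model = learn(lam, x_train)  # corresponds to A_lambda(X_train)
    return mean(loss(x, model) for x in x_valid)


# Toy stand-ins, chosen only so the sketch runs end to end (assumptions).
def learn_shrunken_mean(lam, x_train):
    """Toy learner: the training mean shrunk towards 0 by 1 / (1 + lam)."""
    return mean(x_train) / (1.0 + lam)


def squared_error(x, prediction):
    """Toy loss for a single validation point."""
    return (x - prediction) ** 2


x_train = [2.0, 4.0, 6.0]
x_valid = [1.0, 3.0]
grid = [0.0, 0.5, 1.0, 2.0]  # small finite stand-in for the search space Lambda

# Equation (2): pick the lambda that minimizes Psi over the search space.
best_lam = min(
    grid,
    key=lambda lam: response(lam, learn_shrunken_mean, squared_error, x_train, x_valid),
)
print(best_lam)
```

In a realistic setting every call to `response` trains a full model, which is why the number of evaluations matters so much.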
  • 6. KEY PROBLEM PROPERTIES
      • unknown, probably non-convex response surface Ψ
      • no derivative-based optimization possible
      • every evaluation of Ψ is expensive
      • evaluation time of Ψ depends on the individual value of λ
      • low effective dimensionality of Ψ
      • which dimensions are important is dataset dependent
      • tree-structured configuration space (not yet addressed in apsis)
  • 7. STATE OF THE ART
      • optimization is still manual in many projects
      • grid search is the most common method, and often the only method provided by ML frameworks
      • random search
      • Bayesian optimization: code of Jasper Snoek et al. (Harvard/Toronto); whetlab (Bayesian optimization in the cloud)
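As a point of comparison for the methods listed above, random search can be sketched as follows. This is a minimal illustration, not the apsis implementation: `random_search`, `toy_objective`, and the bounds are hypothetical, and the objective is a cheap stand-in for an expensive Ψ.

```python
import random


def random_search(objective, bounds, n_steps, seed=0):
    """Sample hyperparameter vectors uniformly within bounds and keep the best."""
    rng = random.Random(seed)  # fixed seed so runs are reproducible
    best_lam, best_val = None, float("inf")
    for _ in range(n_steps):
        lam = [rng.uniform(lo, hi) for lo, hi in bounds]
        val = objective(lam)  # in practice: train and validate a model
        if val < best_val:
            best_lam, best_val = lam, val
    return best_lam, best_val


def toy_objective(lam):
    """Cheap stand-in for Psi with its minimum at (0.3, -1.0)."""
    return (lam[0] - 0.3) ** 2 + (lam[1] + 1.0) ** 2


bounds = [(-2.0, 2.0), (-2.0, 2.0)]
best_lam, best_val = random_search(toy_objective, bounds, n_steps=500)
print(best_lam, best_val)
```

Unlike grid search, the number of evaluations here is independent of the number of dimensions, which helps when only a few dimensions actually matter.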
  • 8. BAYESIAN OPTIMIZATION
      • approximate Ψ(λ) by a surrogate function M(λ) = y
      • the surrogate function is cheaper to evaluate than Ψ
      • interpret the model to find minimization candidates for Ψ
      • evaluate Ψ for promising candidates
  • 9. BAYESIAN OPTIMIZATION FUNDAMENTALS: two design choices are needed
      • Surrogate modelling function: Gaussian Processes
          • universal approximation
          • very flexible, with many useful properties
          • closed under sampling
      • Acquisition function
          • Probability of Improvement
          • Expected Improvement
  • 10. ACQUISITION FUNCTION
      • u measures the expected utility of evaluating the objective function at a point λ_next
      • exploitation vs. exploration trade-off
      • (picture adapted from [1])
  • 11. OPTIMIZATION - SUCCESSIVELY UPDATING THE GP
      1. find $\lambda_{next} = \operatorname*{argmax}_{\lambda} u(\lambda)$, the maximum of the acquisition function
      2. evaluate Ψ at λ_next
      3. update the GP with the new observation
      (picture adapted from [1])
  • 12. FITTING THE GP TO THE PROBLEM
      • by tuning the covariance!
      • Squared Exponential Kernel:
        $$K_{SE}(\lambda, \lambda') = \exp\left( -\frac{1}{2l^2} \cdot \sum_{d=1}^{\dim(D)} (\lambda_d - \lambda'_d)^2 \right)$$
      • use Automatic Relevance Determination (ARD):
        $$K_{SE}(\lambda, \lambda') = \theta_0 \cdot \exp\left( -\frac{1}{2} \cdot \sum_{d=1}^{\dim(D)} \frac{1}{\theta_d^2} (\lambda_d - \lambda'_d)^2 \right)$$
      • with ARD vector $\theta = (\underbrace{\theta_0}_{\text{bias}}, \underbrace{\theta_1, \ldots, \theta_{\dim(D)}}_{\text{dimension weights}})$
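The ARD form of the squared exponential kernel can be written directly from the formula. The sketch below uses only the Python standard library; the function name `se_kernel_ard` and the layout of `theta` as a flat sequence are assumptions made for illustration, not the apsis API.

```python
import math


def se_kernel_ard(x, y, theta):
    """Squared exponential kernel with ARD.

    theta = (theta_0, theta_1, ..., theta_D): theta_0 scales the whole kernel,
    theta_d is the length scale of dimension d.
    """
    theta0, scales = theta[0], theta[1:]
    if not (len(x) == len(y) == len(scales)):
        raise ValueError("x, y and the ARD length scales must have equal length")
    scaled_sq_dist = sum(((a - b) / s) ** 2 for a, b, s in zip(x, y, scales))
    return theta0 * math.exp(-0.5 * scaled_sq_dist)


# With theta_0 = 1 and one shared length scale l, this reduces to the plain kernel.
print(se_kernel_ard([0.0, 0.0], [1.0, 2.0], (1.0, 2.0, 2.0)))

# A very large length scale makes the kernel ignore that dimension.
print(se_kernel_ard([0.0, 0.0], [0.0, 5.0], (1.0, 1.0, 1e6)))
```

The second call illustrates why ARD suits hyperparameter optimization: fitting the length scales lets the model discover which dimensions are irrelevant for a given dataset.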
  • 13. WHAT DO ACQUISITION FUNCTIONS LOOK LIKE?
      • Expected Improvement (EI):
        $$u_{EI}(\lambda) = \int_{-\infty}^{\infty} \max\big(\Psi(\lambda^*) - y,\, 0\big) \cdot p_M(y \mid \lambda)\, dy$$
      • closed form solution for GPs available:
        $$u_{EI}(\lambda \mid M_t) = \sigma(\lambda) \cdot \left( \frac{f(\lambda^*) - \mu(\lambda)}{\sigma(\lambda)} \cdot \Phi(\lambda) + \varphi(\lambda) \right)$$
      • gradient analytically derived in apsis for more effective optimization
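The closed form can be evaluated with the standard library alone. The sketch below assumes, as in the usual statement of Expected Improvement for minimization, that Φ and φ are the standard normal CDF and PDF evaluated at the standardized improvement z = (best − µ) / σ, where `best` plays the role of f(λ*). The function names are hypothetical, and the handling of σ = 0 is a choice made here rather than something taken from the slides.

```python
import math


def norm_pdf(z):
    """Standard normal probability density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal cumulative distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mu, sigma, best):
    """Closed-form EI for minimization at one point.

    mu, sigma: GP posterior mean and standard deviation at the point.
    best: best (lowest) objective value observed so far.
    """
    if sigma <= 0.0:
        # No predictive uncertainty: the improvement is deterministic.
        return max(best - mu, 0.0)
    z = (best - mu) / sigma
    return sigma * (z * norm_cdf(z) + norm_pdf(z))


# EI grows with uncertainty even when the mean prediction equals the best value.
print(expected_improvement(0.0, 1.0, 0.0), expected_improvement(0.0, 2.0, 0.0))
```

The comparison in the last line is the exploration side of the trade-off: a more uncertain point is worth more, all else being equal.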
  • 14. THE apsis TOOLKIT
      • automated hyperparameter optimization framework for random search and Bayesian optimization
      • open source framework with a flexible architecture, ready to be extended with more optimizers
      • ready for use with scikit-learn and theano
      • implemented in Python
  • 15. PROJECT OBJECTIVES
      • open source implementation of state-of-the-art research in Bayesian optimization
      • extensible project to encourage collaboration with other researchers
      • easy integration with existing machine learning frameworks
      • multi-core support
  • 16. apsis ARCHITECTURE OVERVIEW
  • 17. apsis CORE MODEL COMPONENTS
      • Parameter Definitions: define the meta information for each hyperparameter
      • Candidates: represent a specific hyperparameter vector and its value; hold the function value if available
      • Experiments: represent an optimization object; keep track of finished and unfinished Candidates
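The relationship between these three components can be illustrated with a small data model. This is a hypothetical sketch: the class and method names below (`NumericParamDef`, `Candidate`, `Experiment`, `add_pending`, `add_finished`, `best_candidate`) are chosen for illustration and should not be read as the actual apsis classes or signatures.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class NumericParamDef:
    """Meta information for one numeric hyperparameter (hypothetical)."""
    lower: float
    upper: float

    def is_valid(self, value: float) -> bool:
        return self.lower <= value <= self.upper


@dataclass(eq=False)  # candidates are compared by identity, not by field values
class Candidate:
    """One hyperparameter vector plus its function value, once known."""
    params: Dict[str, float]
    result: Optional[float] = None


@dataclass
class Experiment:
    """Tracks which candidates are still running and which have finished."""
    name: str
    param_defs: Dict[str, NumericParamDef]
    pending: List[Candidate] = field(default_factory=list)
    finished: List[Candidate] = field(default_factory=list)

    def add_pending(self, candidate: Candidate) -> None:
        for key, value in candidate.params.items():
            if not self.param_defs[key].is_valid(value):
                raise ValueError(f"{key}={value} is outside its definition")
        self.pending.append(candidate)

    def add_finished(self, candidate: Candidate, result: float) -> None:
        self.pending.remove(candidate)
        candidate.result = result
        self.finished.append(candidate)

    def best_candidate(self) -> Optional[Candidate]:
        """Finished candidate with the lowest result, or None if there is none."""
        return min(self.finished, key=lambda c: c.result, default=None)


exp = Experiment("demo", {"learning_rate": NumericParamDef(0.0, 1.0)})
a = Candidate({"learning_rate": 0.1})
b = Candidate({"learning_rate": 0.5})
exp.add_pending(a)
exp.add_pending(b)
exp.add_finished(a, 0.42)
exp.add_finished(b, 0.17)
print(exp.best_candidate().params)
```

Keeping pending candidates separate from finished ones is what allows several evaluations to run at once without losing track of them.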
  • 18. USING apsis - EXPERIMENT ASSISTANTS
      • single experiment interaction interface
      • provides plots and result bookkeeping
  • 19. USING apsis - LAB ASSISTANTS
      • multiple experiments to compare different optimization techniques
      • cross validation
  • 20. EXTENSIVE EXPERIMENT TRACKING
      • automated plot writing
      • automated results writing
      • write out information at every step
  • 21. AUTOMATED PLOTTING
      • plot function evaluations and best results
      • plot confidence bars when using cross validation
      • write out plots at every step
  • 22. PARAMETER REPRESENTATION IN apsis
      • different representation by parameter type
      • various nominal and numeric types
  • 23. NUMERIC PARAMETERS IN apsis
      • Warping mechanism
          • parameters are warped into the [0, 1] interval
          • the optimization core can assume a uniform and equal distribution in [0, 1] space
          • warping can be user defined
      • Provided warpings
          • normalization of arbitrary intervals [a, b] into [0, 1] space
          • asymptotic parameters, e.g. a learning rate asymptotic at 0
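The two kinds of warping can be sketched as pairs of inverse functions. The linear normalization follows directly from the slide; the logarithmic warp is only one plausible way to handle a parameter that is asymptotic at 0 and is an assumption here, since the slides do not state which function apsis uses. All function names are hypothetical.

```python
import math


def warp_interval(value, a, b):
    """Linearly map a value from [a, b] into [0, 1]."""
    return (value - a) / (b - a)


def unwarp_interval(u, a, b):
    """Inverse of warp_interval: map u in [0, 1] back into [a, b]."""
    return a + u * (b - a)


def warp_log(value, a, b):
    """Map a value from [a, b], with 0 < a < b, into [0, 1] on a log scale.

    Assumed stand-in for an 'asymptotic at 0' warping: values near 0 are
    spread out, so small learning rates get more resolution.
    """
    return (math.log(value) - math.log(a)) / (math.log(b) - math.log(a))


def unwarp_log(u, a, b):
    """Inverse of warp_log."""
    return math.exp(math.log(a) + u * (math.log(b) - math.log(a)))


# The optimizer proposes points in [0, 1]; the objective sees original units.
print(unwarp_interval(0.25, 10.0, 50.0))
print(unwarp_log(0.5, 1e-5, 1e-1))
```

With the log warp, the midpoint of [0, 1] corresponds to the geometric middle of the range rather than the arithmetic one.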
  • 24. NOMINAL PARAMETERS IN apsis
      • generally supported in apsis
      • no support in Bayesian optimization: GP kernels are based on distance metrics between parameters
      • interesting topic for further research
      • no publications on this topic so far
      • whetlab claims to deal well with them, but doesn't say how
  • 25. EXPECTED IMPROVEMENT OPTIMIZATION
      • gradient analytically derived (see our paper for the derivation)
      • 1000 steps of random search for initialization
      • several iterative optimization methods integrated
          • L-BFGS-B: bounded limited-memory quasi-Newton method
          • BFGS: quasi-Newton method
          • Nelder-Mead
          • inexact Newton with conjugate gradient solver
          • ...
  • 26. DEALING WITH GP HYPERPARAMETERS
      • the GP surrogate model introduces new hyperparameters
      • these are not themselves the subject of optimization ⇒ hyper-hyperparameters
      • optimization by the maximum likelihood method
      • integrating over these parameters in the acquisition function using Hybrid Monte Carlo sampling
  • 27. apsis PROJECT SET UP
      • open source project from the beginning
      • MIT License
      • active issue tracking
      • PEP 8 code styling convention
      • fully automated sphinx documentation build on every commit
      • 90% test coverage
      • clear commit messages
  • 28. apsis GITHUB REPOSITORY: check out http://github.com/FrederikDiehl/apsis
  • 29. ISSUE TRACKING AND DISCUSSION
  • 30. UNIT TESTS
      • 90% overall test coverage
      • 100% in most core components
  • 31. GOOD CODE DOCUMENTATION
      • almost 50:50 ratio of code vs. doc
      • documented according to the sphinx standard
  • 32. FULLY AUTOMATED DOCUMENTATION BUILD
      • builds on every commit
      • visit http://apsis.readthedocs.org
  • 33. BRANIN HOO OPTIMIZATION (plots: Best Value by Optimizer; Values by Optimizer)
      • random search finds a better end result, but Bayesian optimization is more stable
      • similar performance as in other Bayesian optimization literature
      • no other group publishes a comparison to random search
  • 34. OPTIMIZING ARTIFICIAL NOISE FUNCTION: one-dimensional noise function with several smoothing variances
  • 35. OPTIMIZING ARTIFICIAL NOISE FUNCTION: minimization result on 3d noise by smoothing factor
  • 36. BREZE MNIST NEURAL NETWORK: neural network on MNIST using uniform parameters
  • 37. BREZE MNIST NEURAL NETWORK (2): neural network on MNIST using asymptotic parameters for learning rate and learning rate decay
  • 38. TAKING THE PROJECT TO THE NEXT LEVEL - PROGRAM
      • implement full multi-core support
      • implement a REST web service to offer interoperability with any language
      • improve integration of matplotlib
  • 39. TAKING THE PROJECT TO THE NEXT LEVEL - BAYESIAN OPTIMIZATION
      • deal with nominal parameters
      • try replacing GPs with Student-t processes
      • try to take the tree-structured configuration space into account
      • account for evaluation cost depending on the hyperparameter setting
      • implement the freeze-thaw optimization idea [3]
      • automated learning of input warping [2]
  • 40. Thank You!
  • 41. REFERENCES I
      • [1] Eric Brochu, Vlad M. Cora, and Nando de Freitas. A Tutorial on Bayesian Optimization of Expensive Cost Functions, with Application to Active User Modeling and Hierarchical Reinforcement Learning. IEEE Transactions on Reliability, abs/1012.2, 2010.
      • [2] Jasper Snoek, Kevin Swersky, Richard S. Zemel, and Ryan P. Adams. Input Warping for Bayesian Optimization of Non-Stationary Functions. arXiv preprint arXiv:1402.0929, 2014.
      • [3] Kevin Swersky, Jasper Snoek, and Ryan Prescott Adams. Freeze-Thaw Bayesian Optimization. arXiv preprint arXiv:1406.3896, 2014.
  • 42. REFERENCES II