Hyperparameter Optimization
with Hyperband Algorithm
Deep Learning Meetup Italy
● Gilberto Batres-Estrada
Senior Data Scientist @ Trell Technologies
● AIFI: Graduate teaching fellow
● Co-author: Big Data and Machine Learning
in Quantitative Investment, Wiley. (Ch on LSTM)
● MSc in Theoretical Physics, Stockholm University
● MSc in Engineering: Applied Mathematics and Statistics,
KTH Royal Institute of Technology, Stockholm
Goals for today’s talk
1. Make the training process of neural networks faster
2. Obtain more accurate neural networks (lower test error)
3. Free up time for exploring different architectures
Agenda
● Random Search for Hyper-Parameter Optimization
● Bayesian optimization
● Hyperband
● Other methods
● Implementations and examples
Random Search
Proposed by James Bergstra and Yoshua Bengio
http://www.jmlr.org/papers/volume13/bergstra12a/bergstra12a.pdf
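A minimal sketch of random search, assuming a hypothetical two-hyperparameter search space and a toy loss function standing in for actual model training (neither comes from the talk). Each trial draws every hyperparameter independently and uniformly from its range, and the best configuration seen so far is kept.

```python
import random

# Hypothetical search space: (low, high) bounds for each hyperparameter.
SEARCH_SPACE = {
    "log10_learning_rate": (-5.0, -1.0),
    "dropout": (0.0, 0.5),
}


def sample_configuration(rng):
    """Draw one configuration uniformly at random from SEARCH_SPACE."""
    return {name: rng.uniform(low, high) for name, (low, high) in SEARCH_SPACE.items()}


def toy_validation_loss(config):
    """Toy stand-in for training a model and returning its validation loss."""
    return (config["log10_learning_rate"] + 3.0) ** 2 + (config["dropout"] - 0.2) ** 2


def random_search(n_trials, seed=0):
    """Evaluate n_trials random configurations and return the best one."""
    rng = random.Random(seed)
    best_config, best_loss = None, float("inf")
    for _ in range(n_trials):
        config = sample_configuration(rng)
        loss = toy_validation_loss(config)
        if loss < best_loss:
            best_config, best_loss = config, loss
    return best_config, best_loss


if __name__ == "__main__":
    config, loss = random_search(n_trials=200)
    print(config, loss)
```

In practice toy_validation_loss would be replaced by a full training run, which is what makes each trial expensive.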
Bayesian Optimization
Model the conditional probability p(y | λ),
where y is an evaluation metric such as test error and
λ is a set of hyperparameters.
Sequential Model-Based Algorithm Configuration (SMAC)
SMAC uses random forests to model p(y | λ)
as a Gaussian distribution (Hutter et al., 2011)
Tree Structured Parzen Estimator (TPE)
TPE is a non-standard Bayesian optimization algorithm based on tree-structured
Parzen density estimators (Bergstra et al., 2011)
Spearmint
Uses Gaussian processes (GP) to model p(y | λ)
and performs slice sampling over the GP (Snoek et al., 2012)
Hyperband
Successive Halving
Hyperband extends Successive Halving (Jamieson and Talwalkar, 2015) and uses it as a
subroutine
● Uniformly allocate a budget to a set of hyperparameter configurations
● Evaluate the performance of all configurations
● Throw out the worst half
● Repeat until one configuration remains
The algorithm allocates exponentially more resources to more promising configurations.
Lisha Li et al. (2018) http://jmlr.org/papers/volume18/16-558/16-558.pdf
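The Successive Halving steps above can be sketched as follows. This is an illustrative version under stated assumptions: the toy loss function, the "quality" field, and the choice to double the per-configuration budget each round are all hypothetical and not taken from the slides.

```python
def toy_validation_loss(config, budget):
    """Toy stand-in for training: the loss approaches config['quality'] as budget grows."""
    return config["quality"] + 1.0 / budget


def successive_halving(configs, min_budget=1):
    """Keep the better half of the configurations each round until one remains.

    The per-configuration budget doubles every round (an assumption of this
    sketch), so survivors receive exponentially more resources.
    """
    survivors = list(configs)
    budget = min_budget
    while len(survivors) > 1:
        # Evaluate every surviving configuration with the same budget.
        losses = [toy_validation_loss(c, budget) for c in survivors]
        # Rank by loss (lowest first) and throw out the worst half.
        order = sorted(range(len(survivors)), key=lambda i: losses[i])
        keep = max(1, len(survivors) // 2)
        survivors = [survivors[i] for i in order[:keep]]
        budget *= 2
    return survivors[0]


if __name__ == "__main__":
    candidates = [{"quality": q} for q in (0.5, 0.1, 0.9, 0.3, 0.7, 0.2, 0.8, 0.4)]
    print(successive_halving(candidates))
```

With eight candidates the loop runs three rounds (8, 4, 2 configurations) at budgets 1, 2 and 4.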
Hyperband
● get_hyperparameter_configuration(n): returns a set of n i.i.d. samples from some
distribution defined over the hyperparameter configuration space. Here the hyperparameters are sampled uniformly from
a predefined space (a hypercube with min and max bounds for each hyperparameter).
● run_then_return_val_loss(t, r): a function that takes a hyperparameter configuration t
and resource allocation r as input and returns the validation loss after training the configuration for the
allocated resources.
● top_k(configs, losses, k): a function that takes a set of configurations as well as their
associated losses and returns the top k performing configurations.
Hyperband: Implementation
Lisha Li et al. (2018) http://jmlr.org/papers/volume18/16-558/16-558.pdf
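A minimal Python sketch of the Hyperband loop, built on the three functions named on the previous slide and following the bracket formulas of Li et al. (2018). The one-dimensional search space, the toy loss, the extra rng argument, and the default values R = 81 and η = 3 are assumptions for illustration; this is not the Keras Tuner implementation.

```python
import math
import random


def get_hyperparameter_configuration(n, rng):
    """Return n i.i.d. samples; here one hypothetical hyperparameter x in [0, 1]."""
    return [{"x": rng.uniform(0.0, 1.0)} for _ in range(n)]


def run_then_return_val_loss(t, r):
    """Toy stand-in for training configuration t with resource r (e.g. epochs)."""
    return (t["x"] - 0.3) ** 2 + 1.0 / r


def top_k(configs, losses, k):
    """Return the k configurations with the lowest losses."""
    order = sorted(range(len(configs)), key=lambda i: losses[i])
    return [configs[i] for i in order[:k]]


def hyperband(max_resource=81, eta=3, seed=0):
    rng = random.Random(seed)
    # s_max = floor(log_eta(R)), computed with integers to avoid float round-off.
    s_max = 0
    while eta ** (s_max + 1) <= max_resource:
        s_max += 1
    best_config, best_loss = None, float("inf")
    # Outer loop: one bracket per trade-off between n and r.
    for s in range(s_max, -1, -1):
        # n = ceil(B/R * eta^s / (s+1)) with B = (s_max + 1) * R.
        n = math.ceil((s_max + 1) * eta ** s / (s + 1))
        r = max_resource / eta ** s  # initial resource per configuration
        configs = get_hyperparameter_configuration(n, rng)
        # Inner loop: Successive Halving with elimination rate eta.
        for i in range(s + 1):
            n_i = n // eta ** i
            r_i = r * eta ** i
            losses = [run_then_return_val_loss(t, r_i) for t in configs]
            for t, loss in zip(configs, losses):
                if loss < best_loss:
                    best_config, best_loss = t, loss
            configs = top_k(configs, losses, n_i // eta)
    # Configuration with the smallest intermediate loss seen so far.
    return best_config, best_loss


if __name__ == "__main__":
    print(hyperband())
```

Each bracket starts with a different number of configurations n and initial resource r, so the search hedges between many cheap evaluations and few expensive ones.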
Finding the right hyperparameter configuration
Takeaways from Figure 2 of the paper: more resources are needed to differentiate between the two configurations when
either:
1. The envelope functions are wider
2. The terminal losses are closer together
Lisha Li et al. (2018) http://jmlr.org/papers/volume18/16-558/16-558.pdf
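The two takeaways can be illustrated numerically. The sketch below assumes each intermediate loss stays within an envelope γ(t) of its terminal loss ν, so two configurations can be told apart once the envelopes stop overlapping, i.e. 2·γ(t) < |ν2 − ν1|. The envelope shape c / t and all numbers are hypothetical.

```python
def min_resource_to_separate(nu_1, nu_2, envelope, max_resource=10_000):
    """Smallest integer resource t with 2 * envelope(t) < |nu_2 - nu_1|.

    Returns None if the envelopes still overlap at max_resource.
    """
    gap = abs(nu_2 - nu_1)
    for t in range(1, max_resource + 1):
        if 2.0 * envelope(t) < gap:
            return t
    return None


def narrow_envelope(t):
    """Hypothetical envelope that shrinks quickly with the resource t."""
    return 0.5 / t


def wide_envelope(t):
    """Hypothetical envelope that is four times wider at every t."""
    return 2.0 / t


if __name__ == "__main__":
    # Same terminal losses, narrow versus wide envelope.
    print(min_resource_to_separate(0.20, 0.30, narrow_envelope))
    print(min_resource_to_separate(0.20, 0.30, wide_envelope))
    # Same envelope, terminal losses closer together.
    print(min_resource_to_separate(0.20, 0.25, narrow_envelope))
```

Both the wider envelope and the smaller gap push the required resource up, which is the motivation for trying several brackets rather than committing to a single budget.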
Example from the Paper: LeNet
Example from the Paper: LeNet, Parameter Space
Experiment in the Paper
CNN used in Snoek et al. (2012) and Domhan et al. (2015)
Data sets (training, validation, test set sizes)
● CIFAR-10 (40k, 10k, 10k)
● Rotated MNIST with Background Images (MRBI)
(Larochelle et al., 2007) (10k, 2k, 50k)
● Street View House Numbers (SVHN) (600k, 6k, 26k)
Keras Tuner: Hyperparameter search
https://keras-team.github.io/keras-tuner/
Source code for Hyperband:
https://github.com/keras-team/keras-tuner/blob/master/kerastuner/tuners/hyperband.py
Other Methods: Cyclical Learning Rate
Leslie N. Smith
https://arxiv.org/pdf/1506.01186.pdf
Cyclical Learning Rate (CLR)
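A small sketch of the triangular cyclical learning rate policy from Smith's paper, written in plain Python so it does not depend on any framework. The default values for base_lr, max_lr and step_size are illustrative assumptions and should be tuned per problem.

```python
import math


def triangular_clr(iteration, base_lr=0.001, max_lr=0.006, step_size=2000):
    """Learning rate at a given training iteration under the triangular policy.

    The rate rises linearly from base_lr to max_lr over step_size iterations,
    then falls back to base_lr over the next step_size iterations, and repeats.
    """
    cycle = math.floor(1 + iteration / (2 * step_size))
    x = abs(iteration / step_size - 2 * cycle + 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x)


if __name__ == "__main__":
    for it in (0, 1000, 2000, 3000, 4000):
        print(it, triangular_clr(it))
```

The function can be called once per batch to set the optimizer's learning rate.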
Torch:
Learning Rate Scheduler tf.keras
References
Gilberto Batres-Estrada
+46703387868
gilberto.batres-estrada@live.com
Repository https://github.com/gilberto-BE/deep_learning_italia
Cyclical Learning Rate: https://arxiv.org/pdf/1506.01186.pdf
Random Search: http://www.jmlr.org/papers/volume13/bergstra12a/bergstra12a.pdf
Keras tuner: https://keras-team.github.io/keras-tuner/
Learning Rate Scheduler: fastai (pytorch high level API) https://docs.fast.ai/callbacks.one_cycle.html
Source code for Hyperband: https://github.com/keras-team/keras-tuner/blob/master/kerastuner/tuners/hyperband.py

Business update Q1 2024 Lar España Real Estate SOCIMI
 

Hyperparameter Optimization with Hyperband Algorithm

  • 1. Hyperparameter Optimization with Hyperband Algorithm Deep Learning Meetup Italy
  • 2. ● Gilberto Batres-Estrada, Senior Data Scientist @ Trell Technologies ● AIFI: Graduate teaching fellow ● Co-author: Big Data and Machine Learning in Quantitative Investment, Wiley (chapter on LSTM) ● MSc in Theoretical Physics, Stockholm University ● MSc in Engineering: Applied Mathematics and Statistics (KTH Royal Institute of Technology), Stockholm.
  • 3. Goals for today’s talk 1. Make the training process of neural networks faster 2. Get better-performing, more accurate neural networks (lower test error) 3. Free up more time for exploring different architectures
  • 4. Agenda ● Random Search for Hyper-Parameter Optimization ● Bayesian optimization ● Hyperband ● Other methods ● Implementations and examples
  • 5. Random Search Proposed by James Bergstra and Yoshua Bengio http://www.jmlr.org/papers/volume13/bergstra12a/bergstra12a.pdf
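A minimal sketch of random search, for illustration only. The search space (learning rate, dropout, hidden units), its ranges, and the `toy_validation_loss` function are assumptions standing in for a real training run; none of them come from the slides.

```python
import math
import random


def sample_config(rng):
    """Draw one configuration at random (ranges are illustrative assumptions)."""
    return {
        "learning_rate": 10 ** rng.uniform(-5, -1),  # log-uniform in [1e-5, 1e-1]
        "dropout": rng.uniform(0.0, 0.5),
        "hidden_units": rng.choice([32, 64, 128, 256]),
    }


def toy_validation_loss(config):
    """Hypothetical stand-in for training a model and measuring validation loss."""
    lr_term = (math.log10(config["learning_rate"]) + 3) ** 2  # best near 1e-3
    dropout_term = (config["dropout"] - 0.2) ** 2             # best near 0.2
    return lr_term + dropout_term


def random_search(n_trials, seed=0):
    """Evaluate n_trials independent random configurations and keep the best."""
    rng = random.Random(seed)
    best_config, best_loss = None, float("inf")
    for _ in range(n_trials):
        config = sample_config(rng)
        loss = toy_validation_loss(config)
        if loss < best_loss:
            best_config, best_loss = config, loss
    return best_config, best_loss
```

Every trial is independent of the others, so trials can run in parallel; the trade-off is that every configuration receives the full training budget, including the clearly poor ones.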
  • 6. Bayesian Optimization Model the conditional probability p(y | λ), where y is an evaluation metric such as test error and λ is a set of hyperparameters.
  • 7. Sequential Model-Based Algorithm Configuration (SMAC) SMAC uses random forests to model p(y | λ) as a Gaussian distribution (Hutter et al., 2011)
  • 8. Tree Structured Parzen Estimator (TPE) TPE is a non-standard Bayesian optimization algorithm based on tree-structured Parzen density estimators (Bergstra et al., 2011)
  • 9. Spearmint Uses Gaussian processes (GP) to model p(y | λ) and performs slice sampling over the GP's hyperparameters (Snoek et al., 2012)
  • 11. Hyperband Successive Halving Hyperband extends Successive Halving (Jamieson and Talwalkar, 2015) and uses it as a subroutine ● Uniformly allocate a budget to a set of hyperparameter configurations ● Evaluate the performance of all configurations ● Throw out the worst half ● Repeat until one configuration remains The algorithm allocates exponentially more resources to more promising configurations. Lisha Li et al. (2018) http://jmlr.org/papers/volume18/16-558/16-558.pdf
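The four steps of Successive Halving can be sketched as follows. This is a simplified illustration, not the paper's exact budget accounting: it assumes the per-configuration resource simply doubles each round, and the `evaluate` callback and `min_resource` parameter are hypothetical names introduced here.

```python
def successive_halving(configs, evaluate, min_resource=1):
    """Repeatedly evaluate all survivors, drop the worst half, and double the resource.

    configs:  list of candidate configurations
    evaluate: callable (config, resource) -> validation loss (lower is better)
    """
    survivors = list(configs)
    resource = min_resource
    while len(survivors) > 1:
        # Evaluate every surviving configuration with the same resource.
        losses = [evaluate(config, resource) for config in survivors]
        # Rank by loss; the index breaks ties without comparing configs.
        ranked = sorted(range(len(survivors)), key=lambda j: (losses[j], j))
        # Throw out the worst half (always keep at least one).
        keep = max(1, len(survivors) // 2)
        survivors = [survivors[j] for j in ranked[:keep]]
        # Survivors get twice the resource in the next round.
        resource *= 2
    return survivors[0]
```

For example, with configurations represented by their (hypothetical) terminal losses and a toy evaluation `lambda c, r: c + 1.0 / r`, calling `successive_halving([0.5, 0.1, 0.9, 0.3], ...)` evaluates four configurations at resource 1, two at resource 2, and returns the one with the lowest terminal loss.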
  • 12. Hyperband ● get_hyperparameter_configuration(n): returns a set of n i.i.d. samples from some distribution defined over the hyperparameter configuration space. Here the hyperparameters are sampled uniformly from a predefined space (a hypercube with min and max bounds for each hyperparameter). ● run_then_return_val_loss(t, r): takes a hyperparameter configuration t and a resource allocation r as input and returns the validation loss after training the configuration with the allocated resources. ● top_k(configs, losses, k): takes a set of configurations and their associated losses and returns the k best-performing configurations.
  • 13. Hyperband: Implementation Lisha Li et al. (2018) http://jmlr.org/papers/volume18/16-558/16-558.pdf
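A compact sketch of the Hyperband loop built on the three functions from the previous slide, following Algorithm 1 in Li et al. (2018) as I understand it. The search space (a single log-uniform learning rate), the toy loss surface, and the extra `rng` and `seed` parameters are assumptions for illustration; a real `run_then_return_val_loss` would train a model for `r` units of resource (for example, epochs).

```python
import math
import random


def get_hyperparameter_configuration(n, rng):
    """Return n i.i.d. samples (assumed space: one log-uniform learning rate)."""
    return [{"learning_rate": 10 ** rng.uniform(-5, -1)} for _ in range(n)]


def run_then_return_val_loss(t, r):
    """Hypothetical stand-in for training configuration t with resource r."""
    terminal = (math.log10(t["learning_rate"]) + 3) ** 2  # best near 1e-3
    return terminal + 1.0 / r  # loss shrinks as more resource is spent


def top_k(configs, losses, k):
    """Return the k configurations with the smallest losses."""
    order = sorted(range(len(configs)), key=lambda j: losses[j])
    return [configs[j] for j in order[:k]]


def hyperband(R, eta=3, seed=0):
    """R: maximum resource per configuration; eta: downsampling rate."""
    rng = random.Random(seed)
    # s_max = floor(log_eta(R)), computed with integers to avoid float log error.
    s_max = 0
    while eta ** (s_max + 1) <= R:
        s_max += 1
    best_config, best_loss = None, float("inf")
    for s in range(s_max, -1, -1):  # one bracket per value of s
        # n = ceil((B / R) * eta^s / (s + 1)) with B = (s_max + 1) * R
        n = ((s_max + 1) * eta ** s + s) // (s + 1)
        r = R / eta ** s  # initial resource per configuration in this bracket
        T = get_hyperparameter_configuration(n, rng)
        # Successive Halving with rate eta inside the bracket.
        for i in range(s + 1):
            n_i = n // eta ** i
            r_i = r * eta ** i
            L = [run_then_return_val_loss(t, r_i) for t in T]
            for t, loss in zip(T, L):
                if loss < best_loss:  # remember the best loss seen so far
                    best_config, best_loss = t, loss
            T = top_k(T, L, n_i // eta)
    return best_config, best_loss
```

With `R=81` and `eta=3` this runs five brackets: the most aggressive starts 81 configurations at resource 1, and the most conservative trains 5 configurations at the full resource of 81, which is plain random search.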
  • 14. Finding the right hyperparameter configuration Takeaways from Figure 2 of the paper: more resources are needed to differentiate between two configurations when either 1. the envelope functions are wider, or 2. the terminal losses are closer together. Lisha Li et al. (2018) http://jmlr.org/papers/volume18/16-558/16-558.pdf
  • 15. Example from the Paper: LeNet
  • 16. Example from the Paper: LeNet, Parameter Space
  • 17. Experiment in the Paper CNN used in Snoek et al. (2012) and Domhan et al. (2015) Datasets (training, validation, test set sizes) ● CIFAR-10 (40k, 10k, 10k) ● Rotated MNIST with Background Images (MRBI) (Larochelle et al., 2007) (10k, 2k, 50k) ● Street View House Numbers (SVHN) (600k, 6k, 26k)
  • 18. Keras Tuner: Hyperparameter search https://keras-team.github.io/keras-tuner/ Source code for Hyperband: https://github.com/keras-team/keras-tuner/blob/master/kerastuner/tuners/hyperband.py
  • 19. Other Methods: Cyclical Learning Rate Leslie N. Smith https://arxiv.org/pdf/1506.01186.pdf
  • 20. Cyclical Learning Rate (CLR) Torch:
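The Torch snippet from this slide did not survive in the transcript. As a substitute, here is a plain-Python sketch of the triangular policy described in Smith's paper; the function and argument names are my own, and the numeric values in the usage note are illustrative assumptions.

```python
import math


def triangular_clr(iteration, step_size, base_lr, max_lr):
    """Triangular cyclical learning rate.

    The rate rises linearly from base_lr to max_lr over step_size iterations,
    then falls back to base_lr over the next step_size iterations, and repeats.
    """
    cycle = math.floor(1 + iteration / (2 * step_size))  # 1-based cycle index
    x = abs(iteration / step_size - 2 * cycle + 1)       # distance from the peak, in [0, 1]
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x)
```

With `step_size=4`, `base_lr=0.001`, and `max_lr=0.006`, the schedule starts at the base rate at iteration 0, peaks at iteration 4, and is back at the base rate at iteration 8. PyTorch also ships a scheduler for this policy, `torch.optim.lr_scheduler.CyclicLR`, which may be preferable to a hand-rolled function in real training code.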
  • 22. References Gilberto Batres-Estrada, +46703387868, gilberto.batres-estrada@live.com ● Repository: https://github.com/gilberto-BE/deep_learning_italia ● Cyclical Learning Rate: https://arxiv.org/pdf/1506.01186.pdf ● Random Search: http://www.jmlr.org/papers/volume13/bergstra12a/bergstra12a.pdf ● Keras Tuner: https://keras-team.github.io/keras-tuner/ ● Learning Rate Scheduler, fastai (PyTorch high-level API): https://docs.fast.ai/callbacks.one_cycle.html ● Source code for Hyperband: https://github.com/keras-team/keras-tuner/blob/master/kerastuner/tuners/hyperband.py