SlideShare a Scribd company logo
SigOpt. Confidential.
Advanced Optimization for the Enterprise
Considerations and use cases
Scott Clark — Co-Founder and CEO, SigOpt
Tuesday, November 5, 2019
SigOpt. Confidential. 2
Accelerate and amplify the
impact of modelers everywhere
SigOpt. Confidential. 3
Abstract
SigOpt provides an extensive set of advanced features,
which help you, the expert, save time while
increasing performance.
Today, we will be sharing some of the intuition behind
these features, while combining and applying them to
tackle real-world problems
SigOpt. Confidential.
How experimentation impacts your modeling
4
Notebook & Model Framework
Hardware Environment
Data
Preparation
Experimentation, Training, Evaluation
Model
Productionalization
Validation
Serving
Deploying
Monitoring
Managing
Inference
Online Testing
Transformation
Labeling
Pre-Processing
Pipeline Dev.
Feature Eng.
Feature Stores
On-Premise Hybrid Multi-Cloud
Experimentation & Model Optimization
Insights, Tracking,
Collaboration
Model Search,
Hyperparameter Tuning
Resource Scheduler,
Management
...and more
SigOpt. Confidential. 5
Motivation
1. How to solve a black box optimization problem
2. Why you should optimize using
multiple competing metrics
3. How to continuously and efficiently employ your
project’s dedicated compute infrastructure
4. How to tune models with expensive training costs
SigOpt. Confidential.
SigOpt Features
Enterprise
Platform
Optimization
Engine
Experiment
Insights
Reproducibility
Intuitive web dashboards
Cross-team permissions
and collaboration
Advanced experiment
visualizations
Usage insights
Parameter importance
analysis
Multimetric optimization
Continuous, categorical, or
integer parameters
Constraints and
failure regions
Up to 10k observations,
100 parameters
Multitask optimization and
high parallelism
Conditional
parameters
Infrastructure agnostic
REST API
Parallel Resource Scheduler
Black-Box Interface
Tunes without
accessing any data
Libraries for Python,
Java, R, and MATLAB
6
SigOpt. Confidential.
How to solve a
black box optimization problem1
SigOpt. Confidential.
Why black box optimization?
SigOpt was designed to empower you, the practitioner, to re-define most machine
learning problems as black box optimization problems with the added side benefits:
• Amplified performance — incremental gains in accuracy or other success metrics
• Productivity gains — a consistent platform across tasks that facilitates sharing
• Accelerated modeling — early elimination of non-scalable tasks
• Compute efficiency — continuous, full utilization of infrastructure
SigOpt uses an ensemble of Bayesian and Global Optimization methods to solve
these black box optimization problems.
8
Black Box Optimization
SigOpt. Confidential.
Your firewall
Training
Data
AI, ML, DL,
Simulation Model
Model Evaluation
or Backtest
Testing
Data
New
Configurations
Objective
Metric
Better
Results
EXPERIMENT INSIGHTS
Track, organize, analyze and
reproduce any model
ENTERPRISE PLATFORM
Built to fit any stack and scale
with your needs
OPTIMIZATION ENGINE
Explore and exploit with a
variety of techniques
RESTAPI
Configuration
Parameters or
Hyperparameters
Black Box Optimization
SigOpt. Confidential.
EXPERIMENT INSIGHTS
Track, organize, analyze and
reproduce any model
ENTERPRISE PLATFORM
Built to fit any stack and scale
with your needs
OPTIMIZATION ENGINE
Explore and exploit with a
variety of techniques
RESTAPI
Black Box Optimization
Better
Results
SigOpt. Confidential.
A graphical depiction of the iterative process
11
Build a statistical model Build a statistical model
Choose the next point
to maximize the acquisition function
Black Box Optimization
Choose the next point
to maximize the acquisition function
SigOpt. Confidential.
Gaussian processes: a powerful tool for modeling in spatial statistics
A standard tool for building statistical models is the Gaussian process
[Fasshauer et al, 2015, Fraizer, 2018].
• Assume that function values are jointly normally distributed.
• Apply prior beliefs about mean behavior and covariance between observations.
• Posterior beliefs about unobserved locations can be computed rather easily.
Different prior assumptions produce different statistical models:
12
Black Box Optimization
SigOpt. Confidential.
Acquisition function: given a model, how should we choose the next point?
An acquisition function is a strategy for defining the utility of a future sample, given the current samples, while
balancing exploration and exploitation [Shahriari et al, 2016].
Different acquisition functions choose different points (EI, PI, KG, etc.).
13
Black Box Optimization
Exploration:
Learning about the whole function f
Exploitation:
Further resolving regions where good f values
have already been observed
SigOpt Blog Posts: Intuition Behind Bayesian Optimization
Some Relevant Blog Posts
● Intuition Behind Covariance Kernels
● Approximation of Data
● Likelihood for Gaussian Processes
● Profile Likelihood vs. Kriging Variance
● Intuition behind Gaussian Processes
● Dealing with Troublesome Metrics
Find more blog posts visit:
https://sigopt.com/blog/
SigOpt. Confidential.
Why you should optimize using
multiple competing metrics
2
SigOpt. Confidential.
Why optimize against multiple competing metrics?
SigOpt allows the user to specify multiple competing metrics for either optimization or tracking
to better align success of the modeling with business value and has the additional benefit of:
• Multiple metrics — The option to defining multiple metrics, which can yield new and
interesting results
• Insights, metric storage — Insights through tracking of optimized and unoptimized
metrics
• Thresholds — The ability to define thresholds for success to better guide the optimizer
We believe this process gives models that deliver more reliable business outcomes and which
are better tied to real world applications.
16
Multiple Competing Metrics
SigOpt. Confidential.
Your firewall
Training
Data
AI, ML, DL,
Simulation Models
Model Evaluation
or Backtest
Testing
Data
New Configurations
Objective
Metric
Better
Results
EXPERIMENT INSIGHTS
Track, organize, analyze and reproduce
any model
ENTERPRISE PLATFORM
Built to fit any stack and scale with your
needs
OPTIMIZATION ENGINE
Explore and exploit with a variety of
techniques
RESTAPI
Configuration
Parameters or
Hyperparameters
Multiple Competing Metrics
SigOpt. Confidential.
Better
Results
EXPERIMENT INSIGHTS
Track, organize, analyze and reproduce
any model
ENTERPRISE PLATFORM
Built to fit any stack and scale with your
needs
OPTIMIZATION ENGINE
Explore and exploit with a variety of
techniques
RESTAPI
Multiple Competing Metrics
Multiple Optimized
and Unoptimized
Metrics
SigOpt. Confidential.
Multiple Competing Metrics
Balancing competing metrics to find the Pareto frontier
Most problems of practical relevance involve 2 or more competing metrics.
• Neural networks — Balancing accuracy and inference time
• Materials design — Balancing performance and maintenance cost
• Control systems — Balancing performance and safety
In a situation with Competing Metrics, the set of all efficient points (the Pareto frontier) is the solution.
19
Pareto
Frontier
Feasible
Region
SigOpt. Confidential.
Balancing competing metrics to find the Pareto frontier
As shown before, the goal in multi objective or multi criteria optimization the goal is to find
the optimal set of solution across a set of function [Knowles, 2006].
• This is formulated as finding the maximum of the set functions f1
to fn
over
the same domain x
• No single point exist as the solution, but we are actively trying to maximize the size of the
efficient frontier, which represent the set of solutions
• The solution is found through scalarization methods such as convex combination and
epsilon-constraint
20
Multiple Competing Metrics
SigOpt. Confidential.
Multiple Competing Metrics
Intuition: Convex Combination Scalarization
Idea: If we can convert the multimetric problem into a scalar problem, we can solve this problem using
Bayesian optimization.
One possible scalarization is through a convex combination of the objectives.
21
SigOpt. Confidential.
Balancing competing metrics to find the Pareto frontier with threshold
As shown before, the goal in multi objective or multi criteria optimization the goal is to find
the optimal set of solution across a set of function [Knowles, 2006].
• This is formulated as finding the maximum of the set functions f1
to fn
over the same
domain x
• No single point exist as the solution, but we are actively trying to maximize the size of the
efficient frontier, which represent the set of solutions
• The solution is found through constrained scalarization methods such as convex
combination and epsilon-constraint
• Allow users to change constraints as the search progresses [Letham et al, 2019]
22
Multiple Competing Metrics
SigOpt. Confidential.
Multiple Competing Metrics
Constrained Scalarization
1. Model all metrics independently.
• Requires no prior beliefs of how metrics interact.
• Missing data removed on a per metric basis if unrecorded.
2. Expose the efficient frontier through constrained scalar optimization.
• Enforce user constraints when given.
• Iterate through sub constraints to better resolve efficient frontier, if desired.
• Consider different regions of the frontier when parallelism is possible.
3. Allow users to change constraints as the search progresses.
• Allow the problems/goals to evolve as the user’s understanding changes.
Constraints give customers more control over the circumstances and more ability to understand our actions.
23
Variation on
Expected
Improvement
[Letham et al, 2019]
SigOpt. Confidential.
Intuition: Scalarization and Epsilon Constraints
24
Multiple Competing Metrics
SigOpt. Confidential.
Intuition: Constrained Scalarization and Epsilon Constraints
25
Multiple Competing Metrics
Multimetric Use Case 1
● Category: Time Series
● Task: Sequence Classification
● Model: CNN
● Data: Diatom Images
● Analysis: Accuracy-Time Tradeoff
● Result: Similar accuracy, 33% the inference time
Multimetric Use Case 2
● Category: NLP
● Task: Sentiment Analysis
● Model: CNN
● Data: Rotten Tomatoes Movie Reviews
● Analysis: Accuracy-Time Tradeoff
● Result: ~2% in accuracy versus 50% of training time
Learn more
https://devblogs.nvidia.com/sigopt-deep-learning-
hyperparameter-optimization/
Use Case: Balancing Speed & Accuracy in Deep Learning
Design: Question answering data and memory networks
Data Model
Sources:
Facebook AI Research (FAIR) bAbI dataset: https://research.fb.com/downloads/babi/
Sukhbaatar et al.: https://arxiv.org/abs/1503.08895
Comparison of Bayesian Optimization and Random Search
Setup: Hyperparameter Optimization
Standard Parameters Conditional Parameters
Result: Significant boost in consistency, accuracy
Comparison across random search versus Bayesian optimization with conditionals
Result: Highly cost efficient accuracy gains
Comparison across random search versus Bayesian optimization with conditionals
SigOpt is
18.5x as
efficient
SigOpt. Confidential.
How to continuously and efficiently
utilize your project’s allotted compute
infrastructure
3
SigOpt. Confidential.
Utilize compute by asynchronous parallel optimization
SigOpt natively handles Parallel Function Evaluation with the primary goal of
minimizing the Overall Wall-Clock Time. Parallelism also provides:
• Faster time-to-results — minimized overall wall-clock time
• Full resource utilization — asynchronous parallel optimization
• Scaling with infrastructure — optimize across the number of available compute resources
We believe this is essential to increase Research Productivity by lowering the time-to-results and
scaling with available infrastructure.
32
Continuously and efficiently utilize infrastructure
SigOpt. Confidential.
Your firewall
Training
Data
AI, ML, DL, Simulation
Model
Model Evaluation
or Backtest
Testing
Data
New
Configurations
Objective
Metric
Better
Results
EXPERIMENT INSIGHTS
Track, organize, analyze and
reproduce any model
ENTERPRISE PLATFORM
Built to fit any stack and scale with
your needs
OPTIMIZATION ENGINE
Explore and exploit with a variety
of techniques
RESTAPI
Configuration
Parameters or
Hyperparameters
Continuously and efficiently utilize infrastructure
SigOpt. Confidential.
Better
Results
EXPERIMENT INSIGHTS
Track, organize, analyze and
reproduce any model
ENTERPRISE PLATFORM
Built to fit any stack and scale with
your needs
OPTIMIZATION ENGINE
Explore and exploit with a variety
of techniques
RESTAPI
Worker
Continuously and efficiently utilize infrastructure
SigOpt. Confidential.
Better
Results
EXPERIMENT INSIGHTS
Track, organize, analyze and
reproduce any model
ENTERPRISE PLATFORM
Built to fit any stack and scale with
your needs
OPTIMIZATION ENGINE
Explore and exploit with a variety
of techniques
RESTAPI
Worker Worker
Worker Worker
Continuously and efficiently utilize infrastructure
SigOpt. Confidential.
Parallel function evaluations
36
Parallel function evaluations are a way of efficiently maximizing a function
while using all available compute resources [Ginsbourger et al, 2008, Garcia-Barcos et al. 2019].
• Choosing points by jointly maximizing criteria over the entire set
• Asynchronously evaluating over a collection of points
• Fixing points which are currently being evaluated while sampling new ones
Continuously and efficiently utilize infrastructure
1D - Acquisition Function 2D - Acquisition Function
SigOpt. Confidential.
Parallel optimization: multiple worker nodes jointly optimizes over a given function
37
Parallel bandwidth = 1 Parallel bandwidth = 2 Parallel bandwidth = 3
Parallel bandwidth = 4 Parallel bandwidth = 5
Next point(s) to
evaluate:
Parallel bandwidth
represent the # of
available compute
resources
Statistical Model
Continuously and efficiently utilize infrastructure
Parallelism Use Case
● Category: NLP
● Task: Sentiment Analysis
● Model: CNN
● Data: Rotten Tomatoes Movie Reviews
● Analysis: Predicting Positive vs. Negative Sentiment
● Result: 400x speedup
Learn more
https://aws.amazon.com/blogs/machine-learning/fast-cnn-tuni
ng-with-aws-gpu-instances-and-sigopt/
Use Case: Fast CNN Tuning with AWS GPU Instances
Variables we tested in this experiment
Axes of exploration:
• 6 hyperparameters versus 10 parameters (including SGD and alternate architectures)
• CPU versus GPU compute
• Grid search versus random search versus Bayesian optimization
Results that we considered:
• Accuracy
• Compute cost
• Wall clock time
The parameters to tune
A deep dive
Results
Speed and accuracy
SigOpt helps you train your model faster
and achieve higher accuracy
This results in higher practitioner
productivity and better business
outcomes
While random and grid search for
hyperparameters do yield an accuracy
improvement, SigOpt achieves better
results on both dimensions
Results: 8x more cost efficient performance boost
Detailed performance across different optimization processes
Experiment Type Accuracy Trials Epochs CPU Time CPU Cost GPU Time GPU
Cost
Link Percent
Change
% Δ per
Comp $
Default (No Tuning) 75.70 1 50 2 hours $1.82 0 hours $0.04 NA 0 0
Grid Search (SGD
Only)
79.30 729 38394 64 days $1401.38 32 hours $27.58 here 4.60 0.13
Random Search (SGD
Only)
79.94 2400 127092 214 days $4638.86 106 hours $91.29 here 4.24 0.05
SigOpt Search (SGD
Only)
80.40 240 15803 27 days $576.81 13 hours $11.35 here 4.70 0.42
Grid Search (SGD +
Architecture)
Not
Feasible
59049 3109914 5255 days $113511.86 107 days $2233.95 NA NA NA
Random Search (SGD
+ Architecture)
80.12 4000 208729 353 days $7618.61 174 hours $149.94 here 4.42 0.03
SigOpt Search (SGD
+ Architecture)
81.00 400 30060 51 days $1097.19 25 hours $21.59 here 5.30 0.25
% Δ per Comp $ is calculated using GPU compute
Results: the experimentation loop
The training pipeline: from data to optimization and evaluation
SigOpt. Confidential.
How to tune models with
expensive training costs
4
SigOpt. Confidential.
How to efficiently minimize time to optimize any function
SigOpt’s multitask feature is an efficient way for modelers to tune model with an expensive training cost
with the benefit of:
• Faster time-to-market — The ability to bring expensive models into production faster
• Reduction in infrastructure cost — Intelligently leverage infrastructure while reducing cost
Through novel research SigOpt helps the user lower the overall time-to-market,
while reducing the overall compute budget.
45
Expensive Training Cost
SigOpt. Confidential.
Your firewall
Training
Data
AI, ML, DL, Simulation
Model
Model Evaluation or
Backtest
Testing
Data
New
Configurations
Objective
Metric
Better
Results
EXPERIMENT INSIGHTS
Track, organize, analyze and
reproduce any model
ENTERPRISE PLATFORM
Built to fit any stack and scale with
your needs
OPTIMIZATION ENGINE
Explore and exploit with a variety
of techniques
RESTAPI
Configuration
Parameters or
Hyperparameters
Expensive Training Cost
SigOpt. Confidential.
Better
Results
EXPERIMENT INSIGHTS
Track, organize, analyze and
reproduce any model
ENTERPRISE PLATFORM
Built to fit any stack and scale with
your needs
OPTIMIZATION ENGINE
Explore and exploit with a variety
of techniques
RESTAPI
Expensive Training Cost
SigOpt. Confidential.
Using cheap or free information to speed learning
48
Sources:
Aaron Klein, Frank Hutter, et al.: https://arxiv.org/abs/1605.07079
SigOpt allows to the user to define lower-cost functions in order to quickly optimize expensive functions
• Cheaper-cost functions can be flexible (fewer epochs, subsampled data, other custom features)
• Use cheaper tasks earlier in the tuning process to explore
• Inform more expensive tasks later by exploiting what we learn
• In the process, reduce the full time required to tune an expensive model
Expensive Training Cost
SigOpt. Confidential.
Using cheap or free information to speed learning
We can build better models using inaccurate data to help point the
actual optimization in the right direction with less cost.
• Using a warm start through multi-task learning logic [Swersky et al, 2014]
• Combining good anytime performance with active learning [Klein et al, 2018]
• Accepting data from multiple sources without priors [Poloczek et al, 2017]
49
Expensive Training Cost
Use Case: Image Classification on a Budget
Use Case
● Category: Computer Vision
● Task: Image Classification
● Model: CNN
● Data: Stanford Cars Dataset
● Analysis: Architecture Comparison
● Result: 2.4% accuracy gain with a much shallower
model
Learn more
https://mlconf.com/blog/insights-for-building-high-performing-
image-classification-models/
SigOpt. Confidential.
Architecture: Classifying images of cars using ResNet
51
Convolutions Classification
ResNet
Input
Acura
TLX
2015
Output
Label
Sources:
Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun: https://arxiv.org/abs/1512.03385
SigOpt. Confidential.
Training setup comparison
ImageNet Pretrained
Convolutional Layers
Fully Connected Layer
ImageNet Pretrained
Convolutional Layers
Fully Connected Layer
Input
Convolutional Features
Classification
Input
Convolutional Features
Classification
Fine Tuning Feature Extractor
Tuning
Tuning
SigOpt. Confidential.
Hyperparameter setup
53
Hyperparameter Lower Bound Upper Bound
Log Learning Rate 1.2e-4 1.0
Learning Rate Scheduler 0 0.99
Batch Size (powers of 2) 16 256
Nesterov False True
Log Weight Decay 1.2e-5 1.0
Momentum 0 0.9
Scheduler Step 1 20
SigOpt. Confidential.54
Insight: Multitask efficiency at the hyperparameter level
Example: Learning rate accuracy and values by cost of task over time
Progression of observations over time Accuracy and value for each observation Parameter importance analysis
SigOpt. Confidential.
Fine-tuning the smaller
network significantly
outperforms feature
extraction on a bigger
network
Results: Optimizing and tuning the full network outperforms
55
Multitask optimization
drives significant
performance gains
+3.92%
+1.58%
SigOpt. Confidential.
Implication: Fine-tuning significantly outperforms
Cost Breakdown for Multitask Optimization
Cost efficiency Feature Extractor ResNet 50 Fine-Tuning ResNet 18
Hours per training 4.08 4.2
Observations 220 220
Number of Runs 1 1
Total compute hours 898 924
Cost per GPU-hour $0.90 $0.90
% Improvement 1.58% 3.92%
Total compute cost $808 $832
cost ($) per %
improvement $509 $212
Similar Compute Cost
Similar Wall-Clock Time
Fine-Tuning Significantly
More Efficient and Effective
SigOpt. Confidential.
Thank you

More Related Content

What's hot

MLConf 2016 SigOpt Talk by Scott Clark
MLConf 2016 SigOpt Talk by Scott ClarkMLConf 2016 SigOpt Talk by Scott Clark
MLConf 2016 SigOpt Talk by Scott Clark
SigOpt
 
PyData London 2018 talk on feature selection
PyData London 2018 talk on feature selectionPyData London 2018 talk on feature selection
PyData London 2018 talk on feature selection
Thomas Huijskens
 
SigOpt at O'Reilly - Best Practices for Scaling Modeling Platforms
SigOpt at O'Reilly - Best Practices for Scaling Modeling PlatformsSigOpt at O'Reilly - Best Practices for Scaling Modeling Platforms
SigOpt at O'Reilly - Best Practices for Scaling Modeling Platforms
SigOpt
 
Pricing like a data scientist
Pricing like a data scientistPricing like a data scientist
Pricing like a data scientist
Matthew Evans
 
Insurance risk pricing with XGBoost
Insurance risk pricing with XGBoostInsurance risk pricing with XGBoost
Insurance risk pricing with XGBoost
Matthew Evans
 
Modeling at scale in systematic trading
Modeling at scale in systematic tradingModeling at scale in systematic trading
Modeling at scale in systematic trading
SigOpt
 
Online advertising and large scale model fitting
Online advertising and large scale model fittingOnline advertising and large scale model fitting
Online advertising and large scale model fitting
Wush Wu
 
Machine learning with scikitlearn
Machine learning with scikitlearnMachine learning with scikitlearn
Machine learning with scikitlearn
Pratap Dangeti
 
Meetup_Consumer_Credit_Default_Vers_2_All
Meetup_Consumer_Credit_Default_Vers_2_AllMeetup_Consumer_Credit_Default_Vers_2_All
Meetup_Consumer_Credit_Default_Vers_2_AllBernard Ong
 
Tuning the Untunable - Insights on Deep Learning Optimization
Tuning the Untunable - Insights on Deep Learning OptimizationTuning the Untunable - Insights on Deep Learning Optimization
Tuning the Untunable - Insights on Deep Learning Optimization
SigOpt
 
Kaggle Higgs Boson Machine Learning Challenge
Kaggle Higgs Boson Machine Learning ChallengeKaggle Higgs Boson Machine Learning Challenge
Kaggle Higgs Boson Machine Learning Challenge
Bernard Ong
 
VSSML17 Review. Summary Day 1 Sessions
VSSML17 Review. Summary Day 1 SessionsVSSML17 Review. Summary Day 1 Sessions
VSSML17 Review. Summary Day 1 Sessions
BigML, Inc
 
BigML Education - Logistic Regression
BigML Education - Logistic RegressionBigML Education - Logistic Regression
BigML Education - Logistic Regression
BigML, Inc
 
Plotcon 2016 Visualization Talk by Alexandra Johnson
Plotcon 2016 Visualization Talk  by Alexandra JohnsonPlotcon 2016 Visualization Talk  by Alexandra Johnson
Plotcon 2016 Visualization Talk by Alexandra Johnson
SigOpt
 
Evolving Reinforcement Learning Algorithms, JD. Co-Reyes et al, 2021
Evolving Reinforcement Learning Algorithms, JD. Co-Reyes et al, 2021Evolving Reinforcement Learning Algorithms, JD. Co-Reyes et al, 2021
Evolving Reinforcement Learning Algorithms, JD. Co-Reyes et al, 2021
Chris Ohk
 
Feature Engineering
Feature EngineeringFeature Engineering
Feature Engineering
HJ van Veen
 
Machine Learning Fundamentals
Machine Learning FundamentalsMachine Learning Fundamentals
Machine Learning Fundamentals
SigOpt
 
Presentation - Predicting Online Purchases Using Conversion Prediction Modeli...
Presentation - Predicting Online Purchases Using Conversion Prediction Modeli...Presentation - Predicting Online Purchases Using Conversion Prediction Modeli...
Presentation - Predicting Online Purchases Using Conversion Prediction Modeli...Christopher Sneed, MSDS, PMP, CSPO
 
SigOpt at Ai4 Finance—Modeling at Scale
SigOpt at Ai4 Finance—Modeling at Scale SigOpt at Ai4 Finance—Modeling at Scale
SigOpt at Ai4 Finance—Modeling at Scale
SigOpt
 
Boosting Algorithms Omar Odibat
Boosting Algorithms Omar Odibat Boosting Algorithms Omar Odibat
Boosting Algorithms Omar Odibat
omarodibat
 

What's hot (20)

MLConf 2016 SigOpt Talk by Scott Clark
MLConf 2016 SigOpt Talk by Scott ClarkMLConf 2016 SigOpt Talk by Scott Clark
MLConf 2016 SigOpt Talk by Scott Clark
 
PyData London 2018 talk on feature selection
PyData London 2018 talk on feature selectionPyData London 2018 talk on feature selection
PyData London 2018 talk on feature selection
 
SigOpt at O'Reilly - Best Practices for Scaling Modeling Platforms
SigOpt at O'Reilly - Best Practices for Scaling Modeling PlatformsSigOpt at O'Reilly - Best Practices for Scaling Modeling Platforms
SigOpt at O'Reilly - Best Practices for Scaling Modeling Platforms
 
Pricing like a data scientist
Pricing like a data scientistPricing like a data scientist
Pricing like a data scientist
 
Insurance risk pricing with XGBoost
Insurance risk pricing with XGBoostInsurance risk pricing with XGBoost
Insurance risk pricing with XGBoost
 
Modeling at scale in systematic trading
Modeling at scale in systematic tradingModeling at scale in systematic trading
Modeling at scale in systematic trading
 
Online advertising and large scale model fitting
Online advertising and large scale model fittingOnline advertising and large scale model fitting
Online advertising and large scale model fitting
 
Machine learning with scikitlearn
Machine learning with scikitlearnMachine learning with scikitlearn
Machine learning with scikitlearn
 
Meetup_Consumer_Credit_Default_Vers_2_All
Meetup_Consumer_Credit_Default_Vers_2_AllMeetup_Consumer_Credit_Default_Vers_2_All
Meetup_Consumer_Credit_Default_Vers_2_All
 
Tuning the Untunable - Insights on Deep Learning Optimization
Tuning the Untunable - Insights on Deep Learning OptimizationTuning the Untunable - Insights on Deep Learning Optimization
Tuning the Untunable - Insights on Deep Learning Optimization
 
Kaggle Higgs Boson Machine Learning Challenge
Kaggle Higgs Boson Machine Learning ChallengeKaggle Higgs Boson Machine Learning Challenge
Kaggle Higgs Boson Machine Learning Challenge
 
VSSML17 Review. Summary Day 1 Sessions
VSSML17 Review. Summary Day 1 SessionsVSSML17 Review. Summary Day 1 Sessions
VSSML17 Review. Summary Day 1 Sessions
 
BigML Education - Logistic Regression
BigML Education - Logistic RegressionBigML Education - Logistic Regression
BigML Education - Logistic Regression
 
Plotcon 2016 Visualization Talk by Alexandra Johnson
Plotcon 2016 Visualization Talk  by Alexandra JohnsonPlotcon 2016 Visualization Talk  by Alexandra Johnson
Plotcon 2016 Visualization Talk by Alexandra Johnson
 
Evolving Reinforcement Learning Algorithms, JD. Co-Reyes et al, 2021
Evolving Reinforcement Learning Algorithms, JD. Co-Reyes et al, 2021Evolving Reinforcement Learning Algorithms, JD. Co-Reyes et al, 2021
Evolving Reinforcement Learning Algorithms, JD. Co-Reyes et al, 2021
 
Feature Engineering
Feature EngineeringFeature Engineering
Feature Engineering
 
Machine Learning Fundamentals
Machine Learning FundamentalsMachine Learning Fundamentals
Machine Learning Fundamentals
 
Presentation - Predicting Online Purchases Using Conversion Prediction Modeli...
Presentation - Predicting Online Purchases Using Conversion Prediction Modeli...Presentation - Predicting Online Purchases Using Conversion Prediction Modeli...
Presentation - Predicting Online Purchases Using Conversion Prediction Modeli...
 
SigOpt at Ai4 Finance—Modeling at Scale
SigOpt at Ai4 Finance—Modeling at Scale SigOpt at Ai4 Finance—Modeling at Scale
SigOpt at Ai4 Finance—Modeling at Scale
 
Boosting Algorithms Omar Odibat
Boosting Algorithms Omar Odibat Boosting Algorithms Omar Odibat
Boosting Algorithms Omar Odibat
 

Similar to Advanced Optimization for the Enterprise Webinar

Tuning 2.0: Advanced Optimization Techniques Webinar
Tuning 2.0: Advanced Optimization Techniques WebinarTuning 2.0: Advanced Optimization Techniques Webinar
Tuning 2.0: Advanced Optimization Techniques Webinar
SigOpt
 
Modeling at Scale: SigOpt at TWIMLcon 2019
Modeling at Scale: SigOpt at TWIMLcon 2019Modeling at Scale: SigOpt at TWIMLcon 2019
Modeling at Scale: SigOpt at TWIMLcon 2019
SigOpt
 
SigOpt for Machine Learning and AI
SigOpt for Machine Learning and AISigOpt for Machine Learning and AI
SigOpt for Machine Learning and AI
SigOpt
 
Metric Management: a SigOpt Applied Use Case
Metric Management: a SigOpt Applied Use CaseMetric Management: a SigOpt Applied Use Case
Metric Management: a SigOpt Applied Use Case
SigOpt
 
SigOpt for Hedge Funds
SigOpt for Hedge FundsSigOpt for Hedge Funds
SigOpt for Hedge Funds
SigOpt
 
Using Bayesian Optimization to Tune Machine Learning Models
Using Bayesian Optimization to Tune Machine Learning ModelsUsing Bayesian Optimization to Tune Machine Learning Models
Using Bayesian Optimization to Tune Machine Learning Models
Scott Clark
 
Using Bayesian Optimization to Tune Machine Learning Models
Using Bayesian Optimization to Tune Machine Learning ModelsUsing Bayesian Optimization to Tune Machine Learning Models
Using Bayesian Optimization to Tune Machine Learning Models
SigOpt
 
Barga Data Science lecture 10
Barga Data Science lecture 10Barga Data Science lecture 10
Barga Data Science lecture 10
Roger Barga
 
Production model lifecycle management 2016 09
Production model lifecycle management 2016 09Production model lifecycle management 2016 09
Production model lifecycle management 2016 09
Greg Makowski
 
Evolving the Optimal Relevancy Ranking Model at Dice.com
Evolving the Optimal Relevancy Ranking Model at Dice.comEvolving the Optimal Relevancy Ranking Model at Dice.com
Evolving the Optimal Relevancy Ranking Model at Dice.com
Simon Hughes
 
IRJET- Machine Learning Techniques for Code Optimization
IRJET-  	  Machine Learning Techniques for Code OptimizationIRJET-  	  Machine Learning Techniques for Code Optimization
IRJET- Machine Learning Techniques for Code Optimization
IRJET Journal
 
Analytics
AnalyticsAnalytics
Scaling & Transforming Stitch Fix's Visibility into What Folks will love
Scaling & Transforming Stitch Fix's Visibility into What Folks will loveScaling & Transforming Stitch Fix's Visibility into What Folks will love
Scaling & Transforming Stitch Fix's Visibility into What Folks will love
June Andrews
 
Big Data Spain 2018: How to build Weighted XGBoost ML model for Imbalance dat...
Big Data Spain 2018: How to build Weighted XGBoost ML model for Imbalance dat...Big Data Spain 2018: How to build Weighted XGBoost ML model for Imbalance dat...
Big Data Spain 2018: How to build Weighted XGBoost ML model for Imbalance dat...
Alok Singh
 
Exploiting the Enumeration of All Feature Model Configurations: A New Perspec...
Exploiting the Enumeration of All Feature Model Configurations: A New Perspec...Exploiting the Enumeration of All Feature Model Configurations: A New Perspec...
Exploiting the Enumeration of All Feature Model Configurations: A New Perspec...
University of Rennes, INSA Rennes, Inria/IRISA, CNRS
 
Evolving The Optimal Relevancy Scoring Model at Dice.com: Presented by Simon ...
Evolving The Optimal Relevancy Scoring Model at Dice.com: Presented by Simon ...Evolving The Optimal Relevancy Scoring Model at Dice.com: Presented by Simon ...
Evolving The Optimal Relevancy Scoring Model at Dice.com: Presented by Simon ...
Lucidworks
 
Using SigOpt to Tune Deep Learning Models with Nervana Cloud
Using SigOpt to Tune Deep Learning Models with Nervana CloudUsing SigOpt to Tune Deep Learning Models with Nervana Cloud
Using SigOpt to Tune Deep Learning Models with Nervana Cloud
SigOpt
 
Demystifying Xgboost
Demystifying XgboostDemystifying Xgboost
Demystifying Xgboost
halifaxchester
 
Experimentation at Blue Apron (webinar)
Experimentation at Blue Apron (webinar)Experimentation at Blue Apron (webinar)
Experimentation at Blue Apron (webinar)
Optimizely
 
Scott Clark, Co-Founder and CEO, SigOpt at MLconf SF 2016
Scott Clark, Co-Founder and CEO, SigOpt at MLconf SF 2016Scott Clark, Co-Founder and CEO, SigOpt at MLconf SF 2016
Scott Clark, Co-Founder and CEO, SigOpt at MLconf SF 2016
MLconf
 

Similar to Advanced Optimization for the Enterprise Webinar (20)

Tuning 2.0: Advanced Optimization Techniques Webinar
Tuning 2.0: Advanced Optimization Techniques WebinarTuning 2.0: Advanced Optimization Techniques Webinar
Tuning 2.0: Advanced Optimization Techniques Webinar
 
Modeling at Scale: SigOpt at TWIMLcon 2019
Modeling at Scale: SigOpt at TWIMLcon 2019Modeling at Scale: SigOpt at TWIMLcon 2019
Modeling at Scale: SigOpt at TWIMLcon 2019
 
SigOpt for Machine Learning and AI
SigOpt for Machine Learning and AISigOpt for Machine Learning and AI
SigOpt for Machine Learning and AI
 
Metric Management: a SigOpt Applied Use Case
Metric Management: a SigOpt Applied Use CaseMetric Management: a SigOpt Applied Use Case
Metric Management: a SigOpt Applied Use Case
 
SigOpt for Hedge Funds
SigOpt for Hedge FundsSigOpt for Hedge Funds
SigOpt for Hedge Funds
 
Using Bayesian Optimization to Tune Machine Learning Models
Using Bayesian Optimization to Tune Machine Learning ModelsUsing Bayesian Optimization to Tune Machine Learning Models
Using Bayesian Optimization to Tune Machine Learning Models
 
Using Bayesian Optimization to Tune Machine Learning Models
Using Bayesian Optimization to Tune Machine Learning ModelsUsing Bayesian Optimization to Tune Machine Learning Models
Using Bayesian Optimization to Tune Machine Learning Models
 
Barga Data Science lecture 10
Barga Data Science lecture 10Barga Data Science lecture 10
Barga Data Science lecture 10
 
Production model lifecycle management 2016 09
Production model lifecycle management 2016 09Production model lifecycle management 2016 09
Production model lifecycle management 2016 09
 
Evolving the Optimal Relevancy Ranking Model at Dice.com
Evolving the Optimal Relevancy Ranking Model at Dice.comEvolving the Optimal Relevancy Ranking Model at Dice.com
Evolving the Optimal Relevancy Ranking Model at Dice.com
 
IRJET- Machine Learning Techniques for Code Optimization
IRJET-  	  Machine Learning Techniques for Code OptimizationIRJET-  	  Machine Learning Techniques for Code Optimization
IRJET- Machine Learning Techniques for Code Optimization
 
Analytics
AnalyticsAnalytics
Analytics
 
Scaling & Transforming Stitch Fix's Visibility into What Folks will love
Scaling & Transforming Stitch Fix's Visibility into What Folks will loveScaling & Transforming Stitch Fix's Visibility into What Folks will love
Scaling & Transforming Stitch Fix's Visibility into What Folks will love
 
Big Data Spain 2018: How to build Weighted XGBoost ML model for Imbalance dat...
Big Data Spain 2018: How to build Weighted XGBoost ML model for Imbalance dat...Big Data Spain 2018: How to build Weighted XGBoost ML model for Imbalance dat...
Big Data Spain 2018: How to build Weighted XGBoost ML model for Imbalance dat...
 
Exploiting the Enumeration of All Feature Model Configurations: A New Perspec...
Exploiting the Enumeration of All Feature Model Configurations: A New Perspec...Exploiting the Enumeration of All Feature Model Configurations: A New Perspec...
Exploiting the Enumeration of All Feature Model Configurations: A New Perspec...
 
Evolving The Optimal Relevancy Scoring Model at Dice.com: Presented by Simon ...
Evolving The Optimal Relevancy Scoring Model at Dice.com: Presented by Simon ...Evolving The Optimal Relevancy Scoring Model at Dice.com: Presented by Simon ...
Evolving The Optimal Relevancy Scoring Model at Dice.com: Presented by Simon ...
 
Using SigOpt to Tune Deep Learning Models with Nervana Cloud
Using SigOpt to Tune Deep Learning Models with Nervana CloudUsing SigOpt to Tune Deep Learning Models with Nervana Cloud
Using SigOpt to Tune Deep Learning Models with Nervana Cloud
 
Demystifying Xgboost
Demystifying XgboostDemystifying Xgboost
Demystifying Xgboost
 
Experimentation at Blue Apron (webinar)
Experimentation at Blue Apron (webinar)Experimentation at Blue Apron (webinar)
Experimentation at Blue Apron (webinar)
 
Scott Clark, Co-Founder and CEO, SigOpt at MLconf SF 2016
Scott Clark, Co-Founder and CEO, SigOpt at MLconf SF 2016Scott Clark, Co-Founder and CEO, SigOpt at MLconf SF 2016
Scott Clark, Co-Founder and CEO, SigOpt at MLconf SF 2016
 

More from SigOpt

Optimizing BERT and Natural Language Models with SigOpt Experiment Management
Optimizing BERT and Natural Language Models with SigOpt Experiment ManagementOptimizing BERT and Natural Language Models with SigOpt Experiment Management
Optimizing BERT and Natural Language Models with SigOpt Experiment Management
SigOpt
 
Experiment Management for the Enterprise
Experiment Management for the EnterpriseExperiment Management for the Enterprise
Experiment Management for the Enterprise
SigOpt
 
Efficient NLP by Distilling BERT and Multimetric Optimization
Efficient NLP by Distilling BERT and Multimetric OptimizationEfficient NLP by Distilling BERT and Multimetric Optimization
Efficient NLP by Distilling BERT and Multimetric Optimization
SigOpt
 
Detecting COVID-19 Cases with Deep Learning
Detecting COVID-19 Cases with Deep LearningDetecting COVID-19 Cases with Deep Learning
Detecting COVID-19 Cases with Deep Learning
SigOpt
 
Tuning Data Augmentation to Boost Model Performance
Tuning Data Augmentation to Boost Model PerformanceTuning Data Augmentation to Boost Model Performance
Tuning Data Augmentation to Boost Model Performance
SigOpt
 
Machine Learning Infrastructure
Machine Learning InfrastructureMachine Learning Infrastructure
Machine Learning Infrastructure
SigOpt
 
SigOpt at Uber Science Symposium - Exploring the spectrum of black-box optimi...
SigOpt at Uber Science Symposium - Exploring the spectrum of black-box optimi...SigOpt at Uber Science Symposium - Exploring the spectrum of black-box optimi...
SigOpt at Uber Science Symposium - Exploring the spectrum of black-box optimi...
SigOpt
 
SigOpt at GTC - Tuning the Untunable
SigOpt at GTC - Tuning the UntunableSigOpt at GTC - Tuning the Untunable
SigOpt at GTC - Tuning the Untunable
SigOpt
 
SigOpt at GTC - Reducing operational barriers to optimization
SigOpt at GTC - Reducing operational barriers to optimizationSigOpt at GTC - Reducing operational barriers to optimization
SigOpt at GTC - Reducing operational barriers to optimization
SigOpt
 
Lessons for an enterprise approach to modeling at scale
Lessons for an enterprise approach to modeling at scaleLessons for an enterprise approach to modeling at scale
Lessons for an enterprise approach to modeling at scale
SigOpt
 
SigOpt at MLconf - Reducing Operational Barriers to Model Training
SigOpt at MLconf - Reducing Operational Barriers to Model TrainingSigOpt at MLconf - Reducing Operational Barriers to Model Training
SigOpt at MLconf - Reducing Operational Barriers to Model Training
SigOpt
 
Machine Learning Infrastructure
Machine Learning InfrastructureMachine Learning Infrastructure
Machine Learning Infrastructure
SigOpt
 
Tips and techniques for hyperparameter optimization
Tips and techniques for hyperparameter optimizationTips and techniques for hyperparameter optimization
Tips and techniques for hyperparameter optimization
SigOpt
 
MLconf 2017 Seattle Lunch Talk - Using Optimal Learning to tune Deep Learning...
MLconf 2017 Seattle Lunch Talk - Using Optimal Learning to tune Deep Learning...MLconf 2017 Seattle Lunch Talk - Using Optimal Learning to tune Deep Learning...
MLconf 2017 Seattle Lunch Talk - Using Optimal Learning to tune Deep Learning...
SigOpt
 
Using Optimal Learning to Tune Deep Learning Pipelines
Using Optimal Learning to Tune Deep Learning PipelinesUsing Optimal Learning to Tune Deep Learning Pipelines
Using Optimal Learning to Tune Deep Learning Pipelines
SigOpt
 

More from SigOpt (15)

Optimizing BERT and Natural Language Models with SigOpt Experiment Management
Optimizing BERT and Natural Language Models with SigOpt Experiment ManagementOptimizing BERT and Natural Language Models with SigOpt Experiment Management
Optimizing BERT and Natural Language Models with SigOpt Experiment Management
 
Experiment Management for the Enterprise
Experiment Management for the EnterpriseExperiment Management for the Enterprise
Experiment Management for the Enterprise
 
Efficient NLP by Distilling BERT and Multimetric Optimization
Efficient NLP by Distilling BERT and Multimetric OptimizationEfficient NLP by Distilling BERT and Multimetric Optimization
Efficient NLP by Distilling BERT and Multimetric Optimization
 
Detecting COVID-19 Cases with Deep Learning
Detecting COVID-19 Cases with Deep LearningDetecting COVID-19 Cases with Deep Learning
Detecting COVID-19 Cases with Deep Learning
 
Tuning Data Augmentation to Boost Model Performance
Tuning Data Augmentation to Boost Model PerformanceTuning Data Augmentation to Boost Model Performance
Tuning Data Augmentation to Boost Model Performance
 
Machine Learning Infrastructure
Machine Learning InfrastructureMachine Learning Infrastructure
Machine Learning Infrastructure
 
SigOpt at Uber Science Symposium - Exploring the spectrum of black-box optimi...
SigOpt at Uber Science Symposium - Exploring the spectrum of black-box optimi...SigOpt at Uber Science Symposium - Exploring the spectrum of black-box optimi...
SigOpt at Uber Science Symposium - Exploring the spectrum of black-box optimi...
 
SigOpt at GTC - Tuning the Untunable
SigOpt at GTC - Tuning the UntunableSigOpt at GTC - Tuning the Untunable
SigOpt at GTC - Tuning the Untunable
 
SigOpt at GTC - Reducing operational barriers to optimization
SigOpt at GTC - Reducing operational barriers to optimizationSigOpt at GTC - Reducing operational barriers to optimization
SigOpt at GTC - Reducing operational barriers to optimization
 
Lessons for an enterprise approach to modeling at scale
Lessons for an enterprise approach to modeling at scaleLessons for an enterprise approach to modeling at scale
Lessons for an enterprise approach to modeling at scale
 
SigOpt at MLconf - Reducing Operational Barriers to Model Training
SigOpt at MLconf - Reducing Operational Barriers to Model TrainingSigOpt at MLconf - Reducing Operational Barriers to Model Training
SigOpt at MLconf - Reducing Operational Barriers to Model Training
 
Machine Learning Infrastructure
Machine Learning InfrastructureMachine Learning Infrastructure
Machine Learning Infrastructure
 
Tips and techniques for hyperparameter optimization
Tips and techniques for hyperparameter optimizationTips and techniques for hyperparameter optimization
Tips and techniques for hyperparameter optimization
 
MLconf 2017 Seattle Lunch Talk - Using Optimal Learning to tune Deep Learning...
MLconf 2017 Seattle Lunch Talk - Using Optimal Learning to tune Deep Learning...MLconf 2017 Seattle Lunch Talk - Using Optimal Learning to tune Deep Learning...
MLconf 2017 Seattle Lunch Talk - Using Optimal Learning to tune Deep Learning...
 
Using Optimal Learning to Tune Deep Learning Pipelines
Using Optimal Learning to Tune Deep Learning PipelinesUsing Optimal Learning to Tune Deep Learning Pipelines
Using Optimal Learning to Tune Deep Learning Pipelines
 

Recently uploaded

How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
Product School
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
DianaGray10
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
Frank van Harmelen
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
Product School
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
RTTS
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
Elena Simperl
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
Elena Simperl
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Jeffrey Haguewood
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
DianaGray10
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
Product School
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
Product School
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
Safe Software
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
Prayukth K V
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
91mobiles
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
UiPathCommunity
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
Jemma Hussein Allen
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
Alan Dix
 

Recently uploaded (20)

How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 

Advanced Optimization for the Enterprise Webinar

  • 1. SigOpt. Confidential. Advanced Optimization for the Enterprise Considerations and use cases Scott Clark — Co-Founder and CEO, SigOpt Tuesday, November 5, 2019
  • 2. SigOpt. Confidential. 2 Accelerate and amplify the impact of modelers everywhere
  • 3. SigOpt. Confidential. 3 Abstract SigOpt provides an extensive set of advanced features, which help you, the expert, save time while increasing performance. Today, we will be sharing some of the intuition behind these features, while combining and applying them to tackle real-world problems
  • 4. SigOpt. Confidential. How experimentation impacts your modeling 4 Notebook & Model Framework Hardware Environment Data Preparation Experimentation, Training, Evaluation Model Productionalization Validation Serving Deploying Monitoring Managing Inference Online Testing Transformation Labeling Pre-Processing Pipeline Dev. Feature Eng. Feature Stores On-Premise Hybrid Multi-Cloud Experimentation & Model Optimization Insights, Tracking, Collaboration Model Search, Hyperparameter Tuning Resource Scheduler, Management ...and more
  • 5. SigOpt. Confidential. 5 Motivation 1. How to solve a black box optimization problem 2. Why you should optimize using multiple competing metrics 3. How to continuously and efficiently employ your project’s dedicated compute infrastructure 4. How to tune models with expensive training costs
  • 6. SigOpt. Confidential. SigOpt Features Enterprise Platform Optimization Engine Experiment Insights Reproducibility Intuitive web dashboards Cross-team permissions and collaboration Advanced experiment visualizations Usage insights Parameter importance analysis Multimetric optimization Continuous, categorical, or integer parameters Constraints and failure regions Up to 10k observations, 100 parameters Multitask optimization and high parallelism Conditional parameters Infrastructure agnostic REST API Parallel Resource Scheduler Black-Box Interface Tunes without accessing any data Libraries for Python, Java, R, and MATLAB 6
  • 7. SigOpt. Confidential. How to solve a black box optimization problem1
  • 8. SigOpt. Confidential. Why black box optimization? SigOpt was designed to empower you, the practitioner, to re-define most machine learning problems as black box optimization problems with the added side benefits: • Amplified performance — incremental gains in accuracy or other success metrics • Productivity gains — a consistent platform across tasks that facilitates sharing • Accelerated modeling — early elimination of non-scalable tasks • Compute efficiency — continuous, full utilization of infrastructure SigOpt uses an ensemble of Bayesian and Global Optimization methods to solve these black box optimization problems. 8 Black Box Optimization
  • 9. SigOpt. Confidential. Your firewall Training Data AI, ML, DL, Simulation Model Model Evaluation or Backtest Testing Data New Configurations Objective Metric Better Results EXPERIMENT INSIGHTS Track, organize, analyze and reproduce any model ENTERPRISE PLATFORM Built to fit any stack and scale with your needs OPTIMIZATION ENGINE Explore and exploit with a variety of techniques RESTAPI Configuration Parameters or Hyperparameters Black Box Optimization
  • 10. SigOpt. Confidential. EXPERIMENT INSIGHTS Track, organize, analyze and reproduce any model ENTERPRISE PLATFORM Built to fit any stack and scale with your needs OPTIMIZATION ENGINE Explore and exploit with a variety of techniques RESTAPI Black Box Optimization Better Results
  • 11. SigOpt. Confidential. A graphical depiction of the iterative process 11 Build a statistical model Build a statistical model Choose the next point to maximize the acquisition function Black Box Optimization Choose the next point to maximize the acquisition function
  • 12. SigOpt. Confidential. Gaussian processes: a powerful tool for modeling in spatial statistics A standard tool for building statistical models is the Gaussian process [Fasshauer et al, 2015, Fraizer, 2018]. • Assume that function values are jointly normally distributed. • Apply prior beliefs about mean behavior and covariance between observations. • Posterior beliefs about unobserved locations can be computed rather easily. Different prior assumptions produce different statistical models: 12 Black Box Optimization
  • 13. SigOpt. Confidential. Acquisition function: given a model, how should we choose the next point? An acquisition function is a strategy for defining the utility of a future sample, given the current samples, while balancing exploration and exploitation [Shahriari et al, 2016]. Different acquisition functions choose different points (EI, PI, KG, etc.). 13 Black Box Optimization Exploration: Learning about the whole function f Exploitation: Further resolving regions where good f values have already been observed
  • 14. SigOpt Blog Posts: Intuition Behind Bayesian Optimization Some Relevant Blog Posts ● Intuition Behind Covariance Kernels ● Approximation of Data ● Likelihood for Gaussian Processes ● Profile Likelihood vs. Kriging Variance ● Intuition behind Gaussian Processes ● Dealing with Troublesome Metrics Find more blog posts visit: https://sigopt.com/blog/
  • 15. SigOpt. Confidential. Why you should optimize using multiple competing metrics 2
  • 16. SigOpt. Confidential. Why optimize against multiple competing metrics? SigOpt allows the user to specify multiple competing metrics for either optimization or tracking to better align success of the modeling with business value and has the additional benefit of: • Multiple metrics — The option to defining multiple metrics, which can yield new and interesting results • Insights, metric storage — Insights through tracking of optimized and unoptimized metrics • Thresholds — The ability to define thresholds for success to better guide the optimizer We believe this process gives models that deliver more reliable business outcomes and which are better tied to real world applications. 16 Multiple Competing Metrics
  • 17. SigOpt. Confidential. Your firewall Training Data AI, ML, DL, Simulation Models Model Evaluation or Backtest Testing Data New Configurations Objective Metric Better Results EXPERIMENT INSIGHTS Track, organize, analyze and reproduce any model ENTERPRISE PLATFORM Built to fit any stack and scale with your needs OPTIMIZATION ENGINE Explore and exploit with a variety of techniques RESTAPI Configuration Parameters or Hyperparameters Multiple Competing Metrics
  • 18. SigOpt. Confidential. Better Results EXPERIMENT INSIGHTS Track, organize, analyze and reproduce any model ENTERPRISE PLATFORM Built to fit any stack and scale with your needs OPTIMIZATION ENGINE Explore and exploit with a variety of techniques RESTAPI Multiple Competing Metrics Multiple Optimized and Unoptimized Metrics
  • 19. SigOpt. Confidential. Multiple Competing Metrics Balancing competing metrics to find the Pareto frontier Most problems of practical relevance involve 2 or more competing metrics. • Neural networks — Balancing accuracy and inference time • Materials design — Balancing performance and maintenance cost • Control systems — Balancing performance and safety In a situation with Competing Metrics, the set of all efficient points (the Pareto frontier) is the solution. 19 Pareto Frontier Feasible Region
  • 20. SigOpt. Confidential. Balancing competing metrics to find the Pareto frontier As shown before, the goal in multi objective or multi criteria optimization the goal is to find the optimal set of solution across a set of function [Knowles, 2006]. • This is formulated as finding the maximum of the set functions f1 to fn over the same domain x • No single point exist as the solution, but we are actively trying to maximize the size of the efficient frontier, which represent the set of solutions • The solution is found through scalarization methods such as convex combination and epsilon-constraint 20 Multiple Competing Metrics
  • 21. SigOpt. Confidential. Multiple Competing Metrics Intuition: Convex Combination Scalarization Idea: If we can convert the multimetric problem into a scalar problem, we can solve this problem using Bayesian optimization. One possible scalarization is through a convex combination of the objectives. 21
  • 22. SigOpt. Confidential. Balancing competing metrics to find the Pareto frontier with threshold As shown before, the goal in multi objective or multi criteria optimization the goal is to find the optimal set of solution across a set of function [Knowles, 2006]. • This is formulated as finding the maximum of the set functions f1 to fn over the same domain x • No single point exist as the solution, but we are actively trying to maximize the size of the efficient frontier, which represent the set of solutions • The solution is found through constrained scalarization methods such as convex combination and epsilon-constraint • Allow users to change constraints as the search progresses [Letham et al, 2019] 22 Multiple Competing Metrics
  • 23. SigOpt. Confidential. Multiple Competing Metrics Constrained Scalarization 1. Model all metrics independently. • Requires no prior beliefs of how metrics interact. • Missing data removed on a per metric basis if unrecorded. 2. Expose the efficient frontier through constrained scalar optimization. • Enforce user constraints when given. • Iterate through sub constraints to better resolve efficient frontier, if desired. • Consider different regions of the frontier when parallelism is possible. 3. Allow users to change constraints as the search progresses. • Allow the problems/goals to evolve as the user’s understanding changes. Constraints give customers more control over the circumstances and more ability to understand our actions. 23 Variation on Expected Improvement [Letham et al, 2019]
  • 24. SigOpt. Confidential. Intuition: Scalarization and Epsilon Constraints 24 Multiple Competing Metrics
  • 25. SigOpt. Confidential. Intuition: Constrained Scalarization and Epsilon Constraints 25 Multiple Competing Metrics
  • 26. Multimetric Use Case 1 ● Category: Time Series ● Task: Sequence Classification ● Model: CNN ● Data: Diatom Images ● Analysis: Accuracy-Time Tradeoff ● Result: Similar accuracy, 33% the inference time Multimetric Use Case 2 ● Category: NLP ● Task: Sentiment Analysis ● Model: CNN ● Data: Rotten Tomatoes Movie Reviews ● Analysis: Accuracy-Time Tradeoff ● Result: ~2% in accuracy versus 50% of training time Learn more https://devblogs.nvidia.com/sigopt-deep-learning- hyperparameter-optimization/ Use Case: Balancing Speed & Accuracy in Deep Learning
  • 27. Design: Question answering data and memory networks Data Model Sources: Facebook AI Research (FAIR) bAbI dataset: https://research.fb.com/downloads/babi/ Sukhbaatar et al.: https://arxiv.org/abs/1503.08895
  • 28. Comparison of Bayesian Optimization and Random Search Setup: Hyperparameter Optimization Standard Parameters Conditional Parameters
  • 29. Result: Significant boost in consistency, accuracy Comparison across random search versus Bayesian optimization with conditionals
  • 30. Result: Highly cost efficient accuracy gains Comparison across random search versus Bayesian optimization with conditionals SigOpt is 18.5x as efficient
  • 31. SigOpt. Confidential. How to continuously and efficiently utilize your project’s allotted compute infrastructure 3
  • 32. SigOpt. Confidential. Utilize compute by asynchronous parallel optimization SigOpt natively handles Parallel Function Evaluation with the primary goal of minimizing the Overall Wall-Clock Time. Parallelism also provides: • Faster time-to-results — minimized overall wall-clock time • Full resource utilization — asynchronous parallel optimization • Scaling with infrastructure — optimize across the number of available compute resources We believe this is essential to increase Research Productivity by lowering the time-to-results and scaling with available infrastructure. 32 Continuously and efficiently utilize infrastructure
  • 33. SigOpt. Confidential. Your firewall Training Data AI, ML, DL, Simulation Model Model Evaluation or Backtest Testing Data New Configurations Objective Metric Better Results EXPERIMENT INSIGHTS Track, organize, analyze and reproduce any model ENTERPRISE PLATFORM Built to fit any stack and scale with your needs OPTIMIZATION ENGINE Explore and exploit with a variety of techniques RESTAPI Configuration Parameters or Hyperparameters Continuously and efficiently utilize infrastructure
  • 34. SigOpt. Confidential. Better Results EXPERIMENT INSIGHTS Track, organize, analyze and reproduce any model ENTERPRISE PLATFORM Built to fit any stack and scale with your needs OPTIMIZATION ENGINE Explore and exploit with a variety of techniques RESTAPI Worker Continuously and efficiently utilize infrastructure
  • 35. SigOpt. Confidential. Better Results EXPERIMENT INSIGHTS Track, organize, analyze and reproduce any model ENTERPRISE PLATFORM Built to fit any stack and scale with your needs OPTIMIZATION ENGINE Explore and exploit with a variety of techniques RESTAPI Worker Worker Worker Worker Continuously and efficiently utilize infrastructure
  • 36. SigOpt. Confidential. Parallel function evaluations 36 Parallel function evaluations are a way of efficiently maximizing a function while using all available compute resources [Ginsbourger et al, 2008, Garcia-Barcos et al. 2019]. • Choosing points by jointly maximizing criteria over the entire set • Asynchronously evaluating over a collection of points • Fixing points which are currently being evaluated while sampling new ones Continuously and efficiently utilize infrastructure 1D - Acquisition Function 2D - Acquisition Function
  • 37. SigOpt. Confidential. Parallel optimization: multiple worker nodes jointly optimizes over a given function 37 Parallel bandwidth = 1 Parallel bandwidth = 2 Parallel bandwidth = 3 Parallel bandwidth = 4 Parallel bandwidth = 5 Next point(s) to evaluate: Parallel bandwidth represent the # of available compute resources Statistical Model Continuously and efficiently utilize infrastructure
  • 38. Parallelism Use Case ● Category: NLP ● Task: Sentiment Analysis ● Model: CNN ● Data: Rotten Tomatoes Movie Reviews ● Analysis: Predicting Positive vs. Negative Sentiment ● Result: 400x speedup Learn more https://aws.amazon.com/blogs/machine-learning/fast-cnn-tuni ng-with-aws-gpu-instances-and-sigopt/ Use Case: Fast CNN Tuning with AWS GPU Instances
  • 39. Variables we tested in this experiment Axes of exploration: • 6 hyperparameters versus 10 parameters (including SGD and alternate architectures) • CPU versus GPU compute • Grid search versus random search versus Bayesian optimization Results that we considered: • Accuracy • Compute cost • Wall clock time
  • 40. The parameters to tune A deep dive
  • 41. Results Speed and accuracy SigOpt helps you train your model faster and achieve higher accuracy This results in higher practitioner productivity and better business outcomes While random and grid search for hyperparameters do yield an accuracy improvement, SigOpt achieves better results on both dimensions
  • 42. Results: 8x more cost efficient performance boost Detailed performance across different optimization processes Experiment Type Accuracy Trials Epochs CPU Time CPU Cost GPU Time GPU Cost Link Percent Change % Δ per Comp $ Default (No Tuning) 75.70 1 50 2 hours $1.82 0 hours $0.04 NA 0 0 Grid Search (SGD Only) 79.30 729 38394 64 days $1401.38 32 hours $27.58 here 4.60 0.13 Random Search (SGD Only) 79.94 2400 127092 214 days $4638.86 106 hours $91.29 here 4.24 0.05 SigOpt Search (SGD Only) 80.40 240 15803 27 days $576.81 13 hours $11.35 here 4.70 0.42 Grid Search (SGD + Architecture) Not Feasible 59049 3109914 5255 days $113511.86 107 days $2233.95 NA NA NA Random Search (SGD + Architecture) 80.12 4000 208729 353 days $7618.61 174 hours $149.94 here 4.42 0.03 SigOpt Search (SGD + Architecture) 81.00 400 30060 51 days $1097.19 25 hours $21.59 here 5.30 0.25 % Δ per Comp $ is calculated using GPU compute
  • 43. Results: the experimentation loop The training pipeline: from data to optimization and evaluation
  • 44. SigOpt. Confidential. How to tune models with expensive training costs 4
  • 45. SigOpt. Confidential. How to efficiently minimize time to optimize any function SigOpt’s multitask feature is an efficient way for modelers to tune model with an expensive training cost with the benefit of: • Faster time-to-market — The ability to bring expensive models into production faster • Reduction in infrastructure cost — Intelligently leverage infrastructure while reducing cost Through novel research SigOpt helps the user lower the overall time-to-market, while reducing the overall compute budget. 45 Expensive Training Cost
  • 46. SigOpt. Confidential. Your firewall Training Data AI, ML, DL, Simulation Model Model Evaluation or Backtest Testing Data New Configurations Objective Metric Better Results EXPERIMENT INSIGHTS Track, organize, analyze and reproduce any model ENTERPRISE PLATFORM Built to fit any stack and scale with your needs OPTIMIZATION ENGINE Explore and exploit with a variety of techniques RESTAPI Configuration Parameters or Hyperparameters Expensive Training Cost
  • 47. SigOpt. Confidential. Better Results EXPERIMENT INSIGHTS Track, organize, analyze and reproduce any model ENTERPRISE PLATFORM Built to fit any stack and scale with your needs OPTIMIZATION ENGINE Explore and exploit with a variety of techniques RESTAPI Expensive Training Cost
  • 48. SigOpt. Confidential. Using cheap or free information to speed learning 48 Sources: Aaron Klein, Frank Hutter, et al.: https://arxiv.org/abs/1605.07079 SigOpt allows to the user to define lower-cost functions in order to quickly optimize expensive functions • Cheaper-cost functions can be flexible (fewer epochs, subsampled data, other custom features) • Use cheaper tasks earlier in the tuning process to explore • Inform more expensive tasks later by exploiting what we learn • In the process, reduce the full time required to tune an expensive model Expensive Training Cost
  • 49. SigOpt. Confidential. Using cheap or free information to speed learning We can build better models using inaccurate data to help point the actual optimization in the right direction with less cost. • Using a warm start through multi-task learning logic [Swersky et al, 2014] • Combining good anytime performance with active learning [Klein et al, 2018] • Accepting data from multiple sources without priors [Poloczek et al, 2017] 49 Expensive Training Cost
  • 50. Use Case: Image Classification on a Budget Use Case ● Category: Computer Vision ● Task: Image Classification ● Model: CNN ● Data: Stanford Cars Dataset ● Analysis: Architecture Comparison ● Result: 2.4% accuracy gain with a much shallower model Learn more https://mlconf.com/blog/insights-for-building-high-performing- image-classification-models/
  • 51. SigOpt. Confidential. Architecture: Classifying images of cars using ResNet 51 Convolutions Classification ResNet Input Acura TLX 2015 Output Label Sources: Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun: https://arxiv.org/abs/1512.03385
  • 52. SigOpt. Confidential. Training setup comparison ImageNet Pretrained Convolutional Layers Fully Connected Layer ImageNet Pretrained Convolutional Layers Fully Connected Layer Input Convolutional Features Classification Input Convolutional Features Classification Fine Tuning Feature Extractor Tuning Tuning
  • 53. SigOpt. Confidential. Hyperparameter setup 53 Hyperparameter Lower Bound Upper Bound Log Learning Rate 1.2e-4 1.0 Learning Rate Scheduler 0 0.99 Batch Size (powers of 2) 16 256 Nesterov False True Log Weight Decay 1.2e-5 1.0 Momentum 0 0.9 Scheduler Step 1 20
  • 54. SigOpt. Confidential.54 Insight: Multitask efficiency at the hyperparameter level Example: Learning rate accuracy and values by cost of task over time Progression of observations over time Accuracy and value for each observation Parameter importance analysis
  • 55. SigOpt. Confidential. Fine-tuning the smaller network significantly outperforms feature extraction on a bigger network Results: Optimizing and tuning the full network outperforms 55 Multitask optimization drives significant performance gains +3.92% +1.58%
  • 56. SigOpt. Confidential. Implication: Fine-tuning significantly outperforms Cost Breakdown for Multitask Optimization Cost efficiency Feature Extractor ResNet 50 Fine-Tuning ResNet 18 Hours per training 4.08 4.2 Observations 220 220 Number of Runs 1 1 Total compute hours 898 924 Cost per GPU-hour $0.90 $0.90 % Improvement 1.58% 3.92% Total compute cost $808 $832 cost ($) per % improvement $509 $212 Similar Compute Cost Similar Wall-Clock Time Fine-Tuning Significantly More Efficient and Effective