SigOpt. Confidential.
Advanced Techniques to Accelerate Model Tuning
Michael McCourt
Head of Engineering, SigOpt, an Intel company
June 8, 2021
Software for AI Optimization Summit
Today’s agenda
Advances in tuning AI models
• Implementing black box optimization strategies for tuning at scale
• Balancing multiple competing metrics
• Incorporating model information to accelerate tuning
© Intel Corporation. SigOpt and the SigOpt logo are trademarks of Intel Corporation or its subsidiaries
Tuning AI models
Making the right decisions for generalization
Training models finds the parameters which minimize the loss (inaccuracy) of a
model for a given set of training data and fixed set of hyperparameters.
Tuning models finds hyperparameters which allow a model to perform in a
production setting at a level which all the stakeholders find acceptable.
Today, we will discuss how SigOpt guides this tuning process.
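The distinction between training and tuning can be sketched in a few lines. This is a minimal illustration only, assuming a one-weight linear model with an L2 penalty; the names `train`, `tune`, and `lam` are hypothetical and not part of SigOpt.

```python
import random

def train(xs, ys, lam):
    """Training: closed-form weight minimizing sum((w*x - y)^2) + lam*w^2
    for a FIXED hyperparameter lam."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

def mse(w, xs, ys):
    """Mean squared error of the model y = w*x on a data set."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def tune(train_set, val_set, lams):
    """Tuning: pick the hyperparameter whose trained model validates best."""
    scored = [(mse(train(*train_set, lam), *val_set), lam) for lam in lams]
    return min(scored)[1]

rng = random.Random(0)

def make_data(n):
    """Toy data: y = 2x plus Gaussian noise (an assumption for illustration)."""
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    ys = [2.0 * x + rng.gauss(0, 0.5) for x in xs]
    return xs, ys

train_set, val_set = make_data(10), make_data(200)
best_lam = tune(train_set, val_set, [0.0, 0.1, 1.0, 10.0])
```

Training never sees the validation data; tuning only ever compares already-trained models on it.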
SigOpt workflow diagram
[Diagram. Inside your firewall: Training Data; AI, ML, DL, Simulation Model; Model Evaluation or Backtest; Testing Data. Exchanged with SigOpt through the REST API: Objective Metric; New Configurations (Parameters or Hyperparameters). Outcome: Better Results.]
• EXPERIMENT INSIGHTS: Track, organize, analyze and reproduce any model
• ENTERPRISE PLATFORM: Built to fit any stack and scale with your needs
• OPTIMIZATION ENGINE: Explore and exploit with a variety of techniques
Optimally tuning AI models at scale
How do we power and manage this optimization?
At our core, we use black box algorithms to power the optimization.
• Bayesian optimization (a.k.a., model-based optimization)
• Evolutionary algorithms
• Quasi-random sampling (through QMCPy)
The theoretical design of these algorithms is insufficient for production.
• Limited time for our computations
• Multiple trainings running asynchronously in parallel
• Customer edits to the optimization domain/results
We focus our discussion only on Bayesian optimization, the most suitable for ML.
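For readers new to Bayesian optimization, the loop below is a minimal textbook sketch: a Gaussian process model plus an expected improvement acquisition, maximized over a 1-D grid. It assumes NumPy is available, and the kernel, length scale, and grid are illustrative choices. It is not SigOpt's production algorithm.

```python
import math
import numpy as np

def rbf(a, b, length=0.2):
    """Squared-exponential kernel between 1-D point sets a and b."""
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / length) ** 2)

def gp_posterior(x_obs, y_obs, x_new, noise=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP."""
    K = rbf(x_obs, x_obs) + noise * np.eye(len(x_obs))
    Ks = rbf(x_obs, x_new)
    mean = Ks.T @ np.linalg.solve(K, y_obs)
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks), axis=0)
    return mean, np.sqrt(np.maximum(var, 1e-12))

def expected_improvement(mean, std, best):
    """Expected improvement over the best observed value (maximization)."""
    z = (mean - best) / std
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    pdf = np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)
    return (mean - best) * cdf + std * pdf

def bayes_opt(f, n_init=3, n_iter=12, seed=0):
    """Maximize f on [0, 1]: fit the model, pick the max-EI point, repeat."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(0.0, 1.0, 201)
    x = rng.uniform(0.0, 1.0, n_init)
    y = np.array([f(v) for v in x])
    for _ in range(n_iter):
        mean, std = gp_posterior(x, y, grid)
        x_next = grid[np.argmax(expected_improvement(mean, std, y.max()))]
        x = np.append(x, x_next)
        y = np.append(y, f(x_next))
    return x[np.argmax(y)], y.max()

# Toy objective standing in for a validation metric (hypothetical).
best_x, best_y = bayes_opt(lambda v: -(v - 0.3) ** 2)
```

Each iteration refits the model from scratch, which is exactly the kind of cost that the production constraints listed above (limited time, parallel trainings) make hard to afford.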
Standard BO computational workflow
Online/Offline computation
Asynchronous BO computational workflow
Online/Offline computation
[Diagram labels: API Request (Report Results); API Request (Next Point); Retrieve Next Point; Rerank All Points]
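One way to read the asynchronous workflow is sketched below: reporting a result triggers the expensive reranking, while a next-point request only retrieves the current top of a pre-ranked queue. The class `AsyncSuggester` and its placeholder scoring rule are hypothetical, chosen only to make the split between the two request types concrete.

```python
class AsyncSuggester:
    """Keeps a pre-ranked candidate queue so next-point requests return at once."""

    def __init__(self, candidates):
        self.queue = list(candidates)   # ranked best-first
        self.observed = {}              # point -> reported metric
        self.pending = set()            # points handed out, not yet reported

    def _score(self, x):
        # Placeholder acquisition: value of the nearest observed point plus
        # a bonus for being far from anything already observed or pending.
        known = list(self.observed) + list(self.pending)
        bonus = min((abs(x - k) for k in known), default=1.0)
        if not self.observed:
            return bonus
        nearest = min(self.observed, key=lambda k: abs(x - k))
        return self.observed[nearest] + bonus

    def report(self, x, value):
        """Offline step: store a result, then rerank all remaining points."""
        self.pending.discard(x)
        self.observed[x] = value
        self.queue.sort(key=self._score, reverse=True)

    def next_point(self):
        """Online step: retrieve the current top-ranked point, no recomputation."""
        x = self.queue.pop(0)
        self.pending.add(x)
        return x

s = AsyncSuggester([i / 10 for i in range(11)])
a, b = s.next_point(), s.next_point()   # two workers start in parallel
s.report(a, 0.2)                        # one finishes; the queue is reranked
c = s.next_point()                      # served immediately from the new ranking
```

Tracking pending points keeps parallel workers from being sent to the same region while earlier trainings are still running.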
Balancing multiple competing metrics
How do we satisfy multiple stakeholders?
Tuning models with scalar optimization is satisfying, but often incomplete.
Real-world circumstances involve multiple competing metrics.
• High accuracy, low inference time (example)
• Maximizing true positives, minimizing false diagnoses (example)
• Minimal incorrect fraud alerts, easily interpretable model (example)
• High accuracy, limited size, low computation (example)
Our goal must change, from finding the answer to understanding our options.
Studying the Pareto frontier
Understanding the tradeoff between two metrics
Can we find hyperparameters that maximize both accuracy and inference speed?
Probably not … but we can learn what is possible.
[Figure: plot of Accuracy versus Speed showing All Possible Models and the Pareto Frontier.]
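Given a set of evaluated models, the Pareto frontier is the subset that no other model beats on both metrics at once. A minimal sketch, assuming both metrics are maximized and using made-up (accuracy, speed) pairs:

```python
def pareto_frontier(points):
    """Return the points not dominated by any other (all metrics maximized)."""
    def dominates(p, q):
        # p dominates q if it is at least as good everywhere and better somewhere.
        return (all(a >= b for a, b in zip(p, q))
                and any(a > b for a, b in zip(p, q)))
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical (accuracy, speed) results for five trained models.
models = [(0.90, 10.0), (0.85, 30.0), (0.80, 25.0), (0.70, 50.0), (0.88, 8.0)]
frontier = pareto_frontier(models)
# frontier == [(0.90, 10.0), (0.85, 30.0), (0.70, 50.0)]
```

The two dropped models are each beaten on both metrics by a single competitor, so no stakeholder would prefer them.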
Finding the Pareto frontier
What strategies exist?
Much literature exists on finding the Pareto frontier.
• Genetic algorithms are by far the most popular (core reference).
• They work best with massive experimentation/compute.
• This may be impractical in ML tuning.
Bayesian optimization has started to address this problem:
• First with linear combinations of metrics,
• Later with analysis of the Pareto frontier hypervolume,
• Also by adaptively enforcing artificial constraints.
These model-based strategies are more efficient than genetic algorithms.
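Two of the ideas above have short, standard definitions that can be sketched directly: a linear combination of metrics selects one frontier point per weight, and the hypervolume measures how much of metric space a frontier dominates. The sketch assumes two maximized metrics already scaled to [0, 1]; the function names and data are hypothetical.

```python
def scalarize(points, weight):
    """Linear combination: best point for weight*m1 + (1 - weight)*m2."""
    return max(points, key=lambda p: weight * p[0] + (1 - weight) * p[1])

def hypervolume_2d(frontier, ref=(0.0, 0.0)):
    """Area dominated by a 2-metric frontier (maximization) above `ref`."""
    area, prev_y = 0.0, ref[1]
    for x, y in sorted(frontier, reverse=True):   # descending in metric 1
        if y > prev_y:                            # dominated points add nothing
            area += (x - ref[0]) * (y - prev_y)
            prev_y = y
    return area

points = [(1.0, 0.2), (0.8, 0.6), (0.3, 0.9), (0.5, 0.5)]
balanced = scalarize(points, 0.5)     # (0.8, 0.6)
first_only = scalarize(points, 1.0)   # (1.0, 0.2)
hv = hypervolume_2d(points)           # 0.2 + 0.32 + 0.09 = 0.61
```

Sweeping the weight traces out only part of a frontier when the frontier is not convex, which is one reason hypervolume-based methods followed.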
Building neural network models for edge devices
Practical limitations on neural networks
Situation: When deploying neural networks to edge devices (e.g., cell phones or televisions), they must
perform well without requiring supercomputers to run them. (Simplified form of this blog.)
Metrics to be balanced: Validation accuracy, Stability under edge deployment
Dealing with more than 2 metrics
What is practical?
Approximating the Pareto frontier with 2 metrics requires many points.
• Our algorithm has proved effective in tests and in practice.
• Moving beyond 2 metrics requires many more points.
• This may be infeasible for ML tuning situations that have long training times.
What can be done to manage this higher number of metrics?
• We could enforce some metrics as constraints.
• We could define success to more easily incorporate more metrics.
Incorporating more metrics as constraints
Turning a numeric metric into “good enough”
Customers can pass more than two metrics to us as constraints. (example)
• Example: Maximize accuracy, minimize computation, network size of 0.15M or less
Our algorithm modifies our computations to incorporate this constraint.
[Figure: grayed out points violate the constraint.]
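The effect of a metric constraint on the reported frontier can be illustrated after the fact: discard results that violate the size limit, then take the Pareto frontier of what remains. This is only a post-hoc filter on made-up results, assuming the 0.15M limit from the example; it does not show how the optimizer itself incorporates the constraint.

```python
from typing import NamedTuple

class Result(NamedTuple):
    accuracy: float      # maximize
    computation: float   # minimize (hypothetical units)
    size_m: float        # constrained: must be <= SIZE_LIMIT_M

SIZE_LIMIT_M = 0.15

def feasible_frontier(results, limit=SIZE_LIMIT_M):
    """Pareto frontier of (accuracy, computation) among constraint-satisfying results."""
    ok = [r for r in results if r.size_m <= limit]
    def dominated(r):
        return any(o.accuracy >= r.accuracy and o.computation <= r.computation
                   and (o.accuracy > r.accuracy or o.computation < r.computation)
                   for o in ok)
    return [r for r in ok if not dominated(r)]

results = [
    Result(0.95, 9.0, 0.30),   # most accurate, but violates the size limit
    Result(0.91, 5.0, 0.14),
    Result(0.88, 3.0, 0.10),
    Result(0.87, 4.0, 0.12),   # feasible, but dominated by the row above
]
frontier = feasible_frontier(results)
```

Only the second and third results survive: the first is infeasible and the fourth is dominated.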
Constraint active search for many metrics
Exploring all the “good enough” parameters
If a user were only interested in satisfying constraints for all metrics …
• Every choice of hyperparameters would be only satisfactory/unsatisfactory.
• Model-building is independent of the number of metrics.
No optimization is needed … only a search to find all satisfactory models.
This is the thesis of our latest innovation: constraint active search.
• Explore the parameter space to give users actionable information as efficiently
as possible.
ICML 2021 paper accepted, feature to be released July 1.
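The idea of searching for satisfactory configurations, rather than optimizing, can be sketched with a deliberately naive stand-in. The code below uses plain random sampling and a spacing rule to collect a spread-out set of points that meet every threshold. It is not the algorithm from the ICML paper, and the metrics, thresholds, and `min_gap` parameter are hypothetical.

```python
import random

def satisfies(metrics, thresholds):
    """A configuration is satisfactory only if every metric meets its threshold."""
    return all(m >= t for m, t in zip(metrics, thresholds))

def search_satisfactory(evaluate, thresholds, budget=200, min_gap=0.1, seed=0):
    """Collect satisfactory points that are at least `min_gap` apart (L1 distance)."""
    rng = random.Random(seed)
    found = []
    for _ in range(budget):
        x = (rng.random(), rng.random())
        # Skip points too close to a known satisfactory one: they add little
        # new information about the satisfactory region.
        if any(abs(x[0] - f[0]) + abs(x[1] - f[1]) < min_gap for f in found):
            continue
        if satisfies(evaluate(x), thresholds):
            found.append(x)
    return found

def toy_metrics(x):
    """Three hypothetical metrics of two hyperparameters in [0, 1]."""
    return (1 - abs(x[0] - 0.5), 1 - abs(x[1] - 0.5), x[0] + x[1])

good = search_satisfactory(toy_metrics, thresholds=(0.7, 0.7, 0.9))
```

Note that the pass/fail test looks the same whether there are three metrics or thirty, which mirrors the point that the number of metrics stops driving the cost.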
Incorporating model information to accelerate tuning
Moving beyond black box optimization
Our algorithms generally do not leverage information about customer models.
• Many customers want that knowledge to live behind their firewall.
Increasingly, customers seem willing to convey certain information to us.
• Example: The type of model, the role of hyperparameters, the size of their data
We want to use this information to give customers a better experience.
Features specific to gradient-boosted methods
What can be done?
To address model-specific problems, we need to consider a class of models.
• We have focused on XGBoost and LightGBM.
• These have well-defined structure and common hyperparameters.
• These are very popular, both broadly and within our customer base.
• These are likely to have both very experienced users and new users.
Features specific to gradient-boosted methods
Automatic hyperparameter domain selection
If we know the model, we may already know which hyperparameters to study.
Quality-of-life change for our customers, and we gain knowledge about these hyperparameters.
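A sketch of what automatic domain selection could look like from the user's side: the model type alone is enough to look up a default search space. The hyperparameter names below are real XGBoost and LightGBM parameters, but the bounds, the `DEFAULT_DOMAINS` table, and `select_domain` are illustrative assumptions, not SigOpt's actual defaults or API.

```python
# Hypothetical default search domains, keyed by model type.
# Each entry is (scale, lower bound, upper bound); bounds are illustrative only.
DEFAULT_DOMAINS = {
    "xgboost": {
        "eta": ("log_uniform", 1e-3, 1.0),
        "max_depth": ("int", 2, 12),
        "min_child_weight": ("log_uniform", 0.1, 10.0),
        "subsample": ("uniform", 0.5, 1.0),
        "colsample_bytree": ("uniform", 0.5, 1.0),
    },
    "lightgbm": {
        "learning_rate": ("log_uniform", 1e-3, 1.0),
        "num_leaves": ("int", 4, 256),
        "min_child_samples": ("int", 5, 100),
        "feature_fraction": ("uniform", 0.5, 1.0),
    },
}

def select_domain(model_type, overrides=None):
    """Return the default domain for a model type, with optional user overrides."""
    domain = dict(DEFAULT_DOMAINS[model_type.lower()])
    domain.update(overrides or {})
    return domain

# A new user takes the defaults; an experienced user narrows one parameter.
default_domain = select_domain("XGBoost")
custom_domain = select_domain("xgboost", {"max_depth": ("int", 3, 6)})
```

Copying the defaults before applying overrides keeps one user's edits from leaking into another's search space.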
Features specific to gradient-boosted methods
XGBoost/LightGBM specific prior beliefs
Our computations are more effective with prior beliefs: knowledge about which hyperparameter values are likely to be good.
In this gradient-boosted setting, we can load these beliefs for our customers.
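A prior belief can be as simple as a distribution that concentrates early suggestions where good values usually live. The sketch below draws learning rates from a log-normal prior truncated to the search domain, instead of uniformly. The median of 0.1 and the spread are assumptions for illustration, not values SigOpt ships.

```python
import math
import random

def sample_with_prior(n, low, high, prior_median, prior_sigma, seed=0):
    """Draw values from a log-normal prior, truncated to [low, high]."""
    rng = random.Random(seed)
    draws = []
    while len(draws) < n:
        v = math.exp(rng.gauss(math.log(prior_median), prior_sigma))
        if low <= v <= high:          # rejection keeps samples in the domain
            draws.append(v)
    return draws

# Hypothetical prior for a learning rate on the domain [1e-3, 1.0].
rates = sample_with_prior(200, 1e-3, 1.0, prior_median=0.1, prior_sigma=0.5)
```

With these settings most draws fall within roughly a factor of three of 0.1, so early trainings are spent near plausible values while the full domain remains reachable.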
Thank you for having me!
We are starting a Q&A session now, or
email mccourt@sigopt.com with questions.
We’re hiring! New roles for
• Front end engineer
• Full stack engineer
• Product marketing manager
• Product design lead
• Senior product manager
Please contact mccourt@sigopt.com!
Try SigOpt for free
Interested in trying our product?
Sign up here or go to
app.sigopt.com/signup
Our free plan includes: 10 users per organization with 1 TB data storage, 500 hyperparameter optimization
experiments per month and 1,000 observations per experiment, unlimited tracked training runs, and email
support with 48-hour expected response time
Notices & Disclaimers
© Intel Corporation. Intel, SigOpt, the Intel logo, the SigOpt logo, and other Intel marks are
trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the
property of others.
No product or component can be absolutely secure.
Intel does not control or audit third-party data. You should consult other sources to evaluate
accuracy.
Your costs and results may vary.
Intel and SigOpt technologies may require enabled hardware, software or service activation.
Performance varies by use, configuration and other factors. Learn more at
https://sigopt.com/resources/ and www.Intel.com/PerformanceIndex.

Advanced Techniques to Accelerate Model Tuning | Software for AI Optimization Summit 2021 Technical Session

  • 1. SigOpt. Confidential. Advanced Techniques to Accelerate Model Tuning Michael McCourt Head of Engineering, SigOpt, an Intel company June 8, 2021 Software for AI Optimization Summit
  • 2. SigOpt. Confidential. Today’s agenda Advances in tuning AI models • Implementing black box optimization strategies for tuning at scale • Balancing multiple competing metrics • Incorporating model information to accelerate tuning © Intel Corporation. SigOpt and the SigOpt logo are trademarks of Intel Corporation or its subsidiaries 2
  • 3. SigOpt. Confidential. Tuning AI models Making the right decisions for generalization Training models finds the parameters which minimize the loss (inaccuracy) of a model for a given set of training data and fixed set of hyperparameters. Tuning models finds hyperparameters which allow a model to perform in a production setting at a level which all the stakeholders find acceptable. Today, we will discuss how SigOpt guides this tuning process. © Intel Corporation. SigOpt and the SigOpt logo are trademarks of Intel Corporation or its subsidiaries 3
  • 4. SigOpt. Confidential. SigOpt workflow diagram Your firewall Training Data AI, ML, DL, Simulation Model Model Evaluation or Backtest Testing Data New Configurations Objective Metric Better Results EXPERIMENT INSIGHTS Track, organize, analyze and reproduce any model ENTERPRISE PLATFORM Built to fit any stack and scale with your needs OPTIMIZATION ENGINE Explore and exploit with a variety of techniques REST API Parameters or Hyperparameters © Intel Corporation. SigOpt and the SigOpt logo are trademarks of Intel Corporation or its subsidiaries 5
  • 5. Optimally tuning AI models at scale
       How do we power and manage this optimization?
       At our core, we use black box algorithms to power the optimization.
         • Bayesian optimization (a.k.a. model-based optimization)
         • Evolutionary algorithms
         • Quasi-random sampling (through QMCPy)
       The theoretical designs of these algorithms are insufficient for production.
         • Limited time for our computations
         • Multiple trainings running asynchronously in parallel
         • Customer edits to the optimization domain/results
       We focus our discussion only on Bayesian optimization, the most suitable for ML.
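To make the quasi-random sampling item above concrete, here is a minimal sketch of a Halton sequence in plain Python. It does not use QMCPy and is not SigOpt's implementation; the function names and the two hyperparameter ranges are assumptions chosen only for illustration.

```python
def van_der_corput(index: int, base: int) -> float:
    """Radical inverse of `index` in `base`, a value in [0, 1)."""
    value, denom = 0.0, 1.0
    while index > 0:
        index, digit = divmod(index, base)
        denom *= base
        value += digit / denom
    return value


def halton(n_points, bounds, bases=(2, 3, 5, 7, 11, 13)):
    """Return n_points low-discrepancy points scaled into `bounds`.

    `bounds` is a list of (low, high) pairs, one per hyperparameter.
    Each dimension uses a different prime base.
    """
    if len(bounds) > len(bases):
        raise ValueError("not enough prime bases for this many dimensions")
    points = []
    for i in range(1, n_points + 1):  # skip index 0, which maps to the origin
        unit = [van_der_corput(i, bases[d]) for d in range(len(bounds))]
        points.append([lo + u * (hi - lo) for u, (lo, hi) in zip(unit, bounds)])
    return points


# Hypothetical search domain: a learning rate and a tree depth (as a float).
samples = halton(4, [(0.001, 0.1), (2.0, 10.0)])
```

Unlike independent uniform draws, consecutive Halton points fill the domain evenly, which is why sequences of this kind are a common baseline and initialization for model-based methods.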
  • 6. Standard BO computational workflow
       Online/Offline computation
  • 7. Asynchronous BO computational workflow
       Online/Offline computation
       [Diagram labels: API Request (Report Results); API Request (Next Point); Retrieve Next Point; Rerank All Points]
  • 8. Balancing multiple competing metrics
       How do we satisfy multiple stakeholders?
       Tuning models with scalar optimization is satisfying, but often incomplete.
       Real-world circumstances involve multiple competing metrics.
         • High accuracy, low inference time (example)
         • Maximizing true positives, minimizing false diagnoses (example)
         • Minimal incorrect fraud alerts, easily interpretable model (example)
         • High accuracy, limited size, low computation (example)
       Our goal must change, from finding the answer to understanding our options.
  • 9. Studying the Pareto frontier
       Understanding the tradeoff between two metrics
       Can we find hyperparameters that maximize both accuracy and inference speed?
       Probably not … but we can learn what is possible.
       [Figure: plots with axes Accuracy and Speed; labels: Pareto Frontier, All Possible Models]
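The idea of a Pareto frontier can be sketched in a few lines of Python. This is an illustration only, assuming both metrics are to be maximized; the (accuracy, speed) values are made up and the function name is hypothetical.

```python
def pareto_frontier(points):
    """Return the points not dominated by any other point (all metrics maximized)."""

    def dominates(a, b):
        # a dominates b if it is at least as good everywhere and strictly better somewhere
        return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (accuracy, inferences per second) pairs for five trained models.
models = [(0.90, 10.0), (0.85, 30.0), (0.80, 20.0), (0.95, 5.0), (0.70, 40.0)]
frontier = pareto_frontier(models)
```

Here the model at (0.80, 20.0) is dropped because (0.85, 30.0) is both more accurate and faster; the remaining four models are the available tradeoffs.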
  • 10. Finding the Pareto frontier
        What strategies exist?
        Much literature exists on finding the Pareto frontier.
          • Genetic algorithms are by far the most popular (core reference).
          • They work best with massive experimentation/compute.
          • This may be impractical in ML tuning.
        Bayesian optimization has started to address this problem:
          • First with linear combinations of metrics,
          • Later with analysis of the Pareto frontier hypervolume,
          • Also by adaptively enforcing artificial constraints.
        These model-based strategies are more efficient than genetic algorithms.
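Two of the quantities named above, a linear combination of metrics and the Pareto frontier hypervolume, can be sketched for the two-metric case as follows. This is a minimal illustration, assuming both metrics are maximized; it shows only the quantities themselves, not how a Bayesian optimization method uses them, and the function names, reference point, and data are assumptions.

```python
def best_by_weighted_sum(points, weight):
    """Pick the point maximizing weight * metric_1 + (1 - weight) * metric_2."""
    return max(points, key=lambda p: weight * p[0] + (1.0 - weight) * p[1])


def hypervolume_2d(points, reference):
    """Area dominated by `points` and bounded below by `reference` (maximization)."""
    rx, ry = reference
    # Only points strictly better than the reference contribute area.
    candidates = sorted(
        (p for p in points if p[0] > rx and p[1] > ry),
        key=lambda p: p[0],
        reverse=True,
    )
    area, best_y = 0.0, ry
    for x, y in candidates:  # sweep from the largest first metric downward
        if y > best_y:
            area += (x - rx) * (y - best_y)  # new horizontal strip
            best_y = y
    return area


# Hypothetical normalized metric pairs.
observed = [(3.0, 1.0), (2.0, 2.0), (1.0, 3.0), (1.0, 1.0)]
area = hypervolume_2d(observed, reference=(0.0, 0.0))
```

Changing `weight` moves the weighted-sum winner along the frontier, but a weighted sum can only ever select points on the convex part of the frontier. The hypervolume instead scores the whole set of observed points with a single number.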
  • 11. Building neural network models for edge devices
        Practical limitations on neural networks
        Situation: When deploying neural networks to edge devices (e.g., cell phones or televisions) they must perform well without requiring supercomputers to run them. Simplified form of this blog.
        Metrics to be balanced: Validation accuracy, Stability under edge deployment
  • 12. Dealing with more than 2 metrics
        What is practical?
        Approximating the Pareto frontier with 2 metrics requires many points.
          • Our algorithm has proved effective in tests and in practice.
          • Moving beyond 2 metrics requires many more points.
          • This may be infeasible for ML tuning situations that have long training times.
        What can be done to manage this higher number of metrics?
          • We could enforce some metrics as constraints.
          • We could define success to more easily incorporate more metrics.
  • 13. Incorporating more metrics as constraints
        Turning a numeric metric into “good enough”
        Customers can pass more than two metrics to us as constraints. (example)
          • Example: Maximize accuracy, minimize computation, 0.15M or less network size
        Our algorithm modifies our computations to incorporate this constraint.
        [Figure caption: Grayed out points violate the constraint]
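The example above (maximize accuracy, minimize computation, network size of 0.15M or less) can be illustrated with a short Python sketch. It filters already-evaluated models after the fact, which is a simplification and not the way SigOpt's algorithm incorporates the constraint during optimization; the field names (`accuracy`, `gflops`, `params_m`) and all values are hypothetical.

```python
MAX_PARAMS_M = 0.15  # constraint from the slide: network size of 0.15M or less

# Hypothetical evaluated models.
models = [
    {"accuracy": 0.92, "gflops": 4.0, "params_m": 0.20},  # violates the constraint
    {"accuracy": 0.90, "gflops": 3.0, "params_m": 0.15},
    {"accuracy": 0.88, "gflops": 2.0, "params_m": 0.10},
    {"accuracy": 0.85, "gflops": 2.5, "params_m": 0.12},
]


def dominates(a, b):
    """a is no worse than b on both metrics and strictly better on at least one."""
    no_worse = a["accuracy"] >= b["accuracy"] and a["gflops"] <= b["gflops"]
    better = a["accuracy"] > b["accuracy"] or a["gflops"] < b["gflops"]
    return no_worse and better


# Step 1: drop points that violate the constraint (the grayed out points).
feasible = [m for m in models if m["params_m"] <= MAX_PARAMS_M]

# Step 2: keep the feasible points that no other feasible point dominates.
frontier = [m for m in feasible if not any(dominates(o, m) for o in feasible)]
```

The most accurate model is excluded because it is too large, so the constrained frontier contains only the 0.90 and 0.88 accuracy models.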
  • 14. Constraint active search for many metrics
        Exploring all the “good enough” parameters
        If a user were only interested in satisfying constraints for all metrics …
          • Every choice of hyperparameters would be only satisfactory/unsatisfactory.
          • Model-building is independent of the number of metrics.
        No optimization is needed … only a search to find all satisfactory models.
        This is the thesis of our latest innovation: constraint active search.
          • Explore the parameter space to give users actionable information as efficiently as possible.
        ICML 2021 paper accepted, feature to be released July 1.
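The satisfactory/unsatisfactory framing above can be sketched in Python. This is not the constraint active search algorithm from the ICML 2021 paper: the greedy farthest-point selection below is a swapped-in stand-in for "spread out over the satisfactory region", and the thresholds, metric names, and configurations are hypothetical.

```python
import math


def is_satisfactory(metrics, thresholds):
    """True when every metric meets its threshold (all metrics 'higher is better' here)."""
    return all(metrics[name] >= bound for name, bound in thresholds.items())


def diverse_subset(configs, k):
    """Greedily pick up to k configurations that are far apart from each other."""
    if not configs or k <= 0:
        return []
    chosen = [configs[0]]
    while len(chosen) < min(k, len(configs)):
        remaining = [c for c in configs if c not in chosen]
        if not remaining:  # only duplicates are left
            break
        # Take the candidate whose nearest chosen neighbor is farthest away.
        chosen.append(max(remaining, key=lambda c: min(math.dist(c, s) for s in chosen)))
    return chosen


# Hypothetical evaluations: (hyperparameters, metric values).
evaluated = [
    ((0.1, 0.1), {"accuracy": 0.91, "stability": 0.80}),
    ((0.2, 0.1), {"accuracy": 0.92, "stability": 0.85}),
    ((0.9, 0.9), {"accuracy": 0.90, "stability": 0.82}),
    ((0.5, 0.5), {"accuracy": 0.70, "stability": 0.95}),
]
thresholds = {"accuracy": 0.90, "stability": 0.80}

satisfactory = [x for x, m in evaluated if is_satisfactory(m, thresholds)]
spread = diverse_subset(satisfactory, 2)
```

Each configuration is reduced to a single yes/no label no matter how many metrics are in `thresholds`, which is the sense in which the number of metrics stops mattering.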
  • 15. Constraint active search for many metrics
        Exploring all the “good enough” parameters
  • 16. Incorporating model information to accelerate tuning
        Moving beyond black box optimization
        Our algorithms generally do not leverage information about customer models.
          • Many customers want that knowledge to live behind their firewall.
        Increasingly, customers seem willing to convey certain information to us.
          • Example: The type of model, the role of hyperparameters, the size of their data
        We want to use this information to give customers a better experience.
  • 17. Features specific to gradient-boosted methods
        What can be done?
        To address model-specific problems, we need to consider a class of models.
          • We have focused on XGBoost and LightGBM.
          • These have well-defined structure and common hyperparameters.
          • These are very popular, both broadly and within our customer base.
          • These are likely to have both very experienced users and new users.
  • 18. Features specific to gradient-boosted methods
        Automatic hyperparameter domain selection
        If we know the model, we may already know which hyperparameters to study.
        This is a quality-of-life change for our customers, and we gain knowledge about these hyperparameters.
  • 19. Features specific to gradient-boosted methods
        XGBoost/LightGBM specific prior beliefs
        Our computations are more effective with prior beliefs: knowledge about which hyperparameter values are likely to be good.
        In this gradient-boosted setting, we can load these beliefs for our customers.
  • 20. Thank you for having me!
        We are starting a Q&A session now, or email mccourt@sigopt.com with questions.
        We’re hiring! New roles for
          • Front end engineer
          • Full stack engineer
          • Product marketing manager
          • Product design lead
          • Senior product manager
        Please contact mccourt@sigopt.com!
  • 21. Try SigOpt for free
        Interested in trying our product? Sign up here or go to app.sigopt.com/signup
        Our free plan includes:
          • 10 users per organization with 1 TB data storage
          • 500 hyperparameter optimization experiments per month and 1,000 observations per experiment
          • Unlimited tracked training runs
          • Email support with 48-hour expected response time
  • 22. Notices & Disclaimers
        © Intel Corporation. Intel, SigOpt, the Intel logo, the SigOpt logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others.
        No product or component can be absolutely secure. Intel does not control or audit third-party data. You should consult other sources to evaluate accuracy. Your costs and results may vary. Intel and SigOpt technologies may require enabled hardware, software or service activation. Performance varies by use, configuration and other factors. Learn more at https://sigopt.com/resources/ and www.Intel.com/PerformanceIndex.