Use AI to Build AI
The Evolution of AutoML
Ning Jiang
CTO, OneClick.ai
2018
Ning Jiang
Co-founder of OneClick.ai, the first automated Deep Learning platform in the market.
Previously Dev Manager at Microsoft Bing, Ning has over 15 years of R&D experience in AI for ads, search, and cyber security.
So, Why AutoML?
{ Challenges in AI Applications }
1. Never enough experienced data scientists
2. Long development cycles (typically 3 to 6 months)
3. High risk of failure
4. Endless engineering traps in implementation and maintenance
{ Coming Along With Deep Learning }
1. Few experienced data scientists and engineers
2. Increasing complexity in data (mixes of images, text, and numbers)
3. Algorithms need to be customized
4. Increased design choices and hyper-parameters
5. Much harder to debug
What is AutoML
{ AutoML }
[Figure: AutoML loop. Components: Controller, Model Designs, Model Training (with Training Data), Model Validation (with Validation Data)]
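
The loop in this diagram can be sketched in a few lines of Python. This is a minimal illustration, not the design of any particular product: the design space, the `RandomController` class, and the `train_and_validate` scoring function are all hypothetical stand-ins, and the controller here simply samples at random instead of learning from feedback.

```python
import random

# Hypothetical design space: every name and value here is illustrative only.
DESIGN_SPACE = {
    "layers": [1, 2, 3, 4],
    "units": [16, 32, 64],
    "learning_rate": [0.1, 0.01, 0.001],
}


class RandomController:
    """Simplest possible controller: proposes random designs and records feedback."""

    def __init__(self, space, seed=0):
        self.space = space
        self.rng = random.Random(seed)
        self.history = []  # (design, validation_score) pairs fed back to the controller

    def propose(self):
        return {name: self.rng.choice(options) for name, options in self.space.items()}

    def feedback(self, design, score):
        self.history.append((design, score))

    def best(self):
        return max(self.history, key=lambda pair: pair[1])


def train_and_validate(design):
    """Stand-in for real training plus validation: returns a made-up score in [0, 1]."""
    score = 0.5 + 0.1 * design["layers"] - 0.001 * abs(design["units"] - 32)
    return max(0.0, min(1.0, score))


def automl_loop(controller, budget):
    for _ in range(budget):
        design = controller.propose()        # controller -> model design
        score = train_and_validate(design)   # train on training data, score on validation data
        controller.feedback(design, score)   # validation result -> controller
    return controller.best()


best_design, best_score = automl_loop(RandomController(DESIGN_SPACE), budget=20)
print(best_design, best_score)
```

A smarter controller would replace `propose` with something that uses `history`, which is where the search strategies on the following slides come in.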
{ Key Challenges }
1. Satisfy semantic constraints (e.g. data types)
2. Take the feedback to improve model designs
3. Minimize number of models to train
4. Avoid local minima
5. Speed up model training
{ Neural Architecture Search }
1. Evolutionary algorithms
(ref: https://arxiv.org/abs/1703.01041)
2. Greedy search
(ref: https://arxiv.org/abs/1712.00559)
3. Reinforcement learning
(ref: https://arxiv.org/abs/1611.01578)
4. Speed up model training
(ref: https://arxiv.org/abs/1802.03268)
Greedy Search
{ Target Scenarios }
1. Image classification (on CIFAR-10 & ImageNet)
2. Using only Convolution & Pooling layers
3. This is what powers Google AutoML
{ Constraints }
1. Predefined architectures
2. N=2
3. Number of filters decided by heuristics
4. NAS to find the optimal cell structure
{ Basic constructs }
Each construct has
1. Two inputs
2. Processed by two operators
3. One combined output
[Figure: Input 1 and Input 2 feed Operator 1 and Operator 2]
{ Predefined Operators }
Why these and these only?
1. 3x3 convolution
2. 5x5 convolution
3. 7x7 convolution
4. Identity (pass through)
5. 3x3 average pooling
6. 3x3 max pooling
7. 3x3 dilated convolution
8. 1x7 followed by 7x1 convolution
[Figure: Input 1 and Input 2 feed Operator 1 and Operator 2]
{ Cells }
1. Stacking up to 5 basic constructs
2. About 5.6 × 10^14 cell candidates
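
The candidate count above can be reproduced with a short calculation. This is a sketch under an assumption taken from the cited greedy-search paper (arXiv:1712.00559) rather than from the slide itself: each of a construct's two inputs may come from either of the two preceding cells or from any earlier construct in the same cell. The function names are illustrative.

```python
NUM_OPERATORS = 8     # the eight predefined operators
MAX_CONSTRUCTS = 5    # a cell stacks up to five constructs


def construct_choices(position):
    """Number of distinct constructs at a 1-based position inside the cell.

    Assumption (from the cited paper, not the slide): an input can be one of
    the two preceding cells' outputs or any earlier construct in this cell.
    """
    input_choices = 2 + (position - 1)
    return (input_choices ** 2) * (NUM_OPERATORS ** 2)


def cell_candidates(num_constructs):
    """Multiply the per-position choices for a cell with `num_constructs` constructs."""
    total = 1
    for position in range(1, num_constructs + 1):
        total *= construct_choices(position)
    return total


print(construct_choices(1))                      # 256
print(f"{cell_candidates(MAX_CONSTRUCTS):.1e}")  # 5.6e+14
```

Under that assumption a single construct has 2 × 2 × 8 × 8 = 256 variants, and the product over five positions is 720² × 64⁵, roughly 5.6 × 10^14.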
{ Greedy Search }
1. Start with a single construct (m=1)
2. There are 256 possibilities
3. Add one more construct
4. Pick the best K (256) cells to train
5. Repeat steps 3 and 4 until we have 5 constructs in the cell
6. 1,280 models to be trained in total (256 + 4 × 256)
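
The steps above can be sketched as a small beam search. This is a toy version under stated assumptions, not the published implementation: the operator set is cut down to three, a construct is reduced to a pair of operators (input wiring is ignored), `train_and_score` returns made-up accuracies, and a trivial averaging surrogate is swapped in where the slides use a learned predictor.

```python
import itertools

OPERATORS = ["conv3x3", "identity", "maxpool3x3"]   # reduced toy set, not the full eight
TOY_VALUE = {"conv3x3": 0.9, "identity": 0.5, "maxpool3x3": 0.7}  # made-up numbers
K = 4                # cells kept and trained per level (the slide uses 256)
MAX_CONSTRUCTS = 3   # the slide grows cells up to 5 constructs

# Simplification: a construct is just a pair of operators; input wiring is ignored.
CONSTRUCTS = list(itertools.product(OPERATORS, repeat=2))


def train_and_score(cell):
    """Stand-in for training a model built from `cell`; returns a made-up accuracy."""
    ops = [op for construct in cell for op in construct]
    return sum(TOY_VALUE[op] for op in ops) / len(ops)


def predict_score(cell, trained):
    """Toy surrogate: mean measured score of the cell's constructs taken alone."""
    return sum(trained[(construct,)] for construct in cell) / len(cell)


def progressive_search():
    trained = {}                                   # cell -> measured score
    beam = [(construct,) for construct in CONSTRUCTS]
    for cell in beam:                              # steps 1-2: train all single-construct cells
        trained[cell] = train_and_score(cell)
    for _ in range(2, MAX_CONSTRUCTS + 1):
        parents = sorted(beam, key=trained.get, reverse=True)[:K]
        # step 3: add one more construct to every kept cell
        candidates = [parent + (construct,) for parent in parents for construct in CONSTRUCTS]
        # step 4: rank candidates without training, then train only the best K
        candidates.sort(key=lambda cell: predict_score(cell, trained), reverse=True)
        beam = candidates[:K]
        for cell in beam:
            trained[cell] = train_and_score(cell)
    best = max(trained, key=trained.get)
    return best, trained


best_cell, trained_cells = progressive_search()
print(len(trained_cells), "models trained; best cell:", best_cell)
```

The training budget is the number of first-level cells plus K for every additional level, which is what keeps the search far cheaper than enumerating the full space.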
{ Pick the Best Cells }
1. Cells as a sequence of choices
2. LSTM to estimate model accuracy
3. Training data come from trained models (up to 1024 examples)
4. 99.03% accuracy at m=2
5. 99.52% at m=5
[Figure: accuracy predictor. The sequence of inputs and operators feeds an LSTM followed by a Dense layer]
{ Summary }
1. Fewer models to train
○ Remarkable improvement over evolutionary algorithms
2. Search from simple to complex models
3. Heavy use of domain knowledge and heuristics
4. Suboptimal results due to greedy search
5. Can’t generalize to other problems
Reinforcement Learning
{ Why RL? }
1. RL is a generative model
2. RL assumes less domain knowledge about the problem
3. Trained model accuracy is used as rewards
{ RNN Controller }
1. Autoregressive RNN
2. Outputs can describe any architecture
3. Supports non-linear architectures using skip connections
{ Skip Connections }
{ Stochastic Sampling }
For example:
1. The number of filters has 4 choices: 24, 36, 48, 64
2. For each convolution layer, the RNN outputs a distribution:
○ e.g. (60%, 20%, 10%, 10%)
○ With 60% probability, the number of filters will be 24
3. This helps collect data to correct the controller's mistakes
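
The sampling step in this example can be illustrated with the standard library alone. The sketch below hard-codes the distribution from the slide in place of a real RNN output; `sample_choice` is a hypothetical helper name.

```python
import random

OPTIONS = [24, 36, 48, 64]
PROBABILITIES = [0.6, 0.2, 0.1, 0.1]   # example controller output for one layer


def sample_choice(options, probabilities, rng):
    """Draw one option according to the controller's output distribution."""
    return rng.choices(options, weights=probabilities, k=1)[0]


rng = random.Random(0)
samples = [sample_choice(OPTIONS, PROBABILITIES, rng) for _ in range(10_000)]
frequency = {option: samples.count(option) / len(samples) for option in OPTIONS}
print(frequency)
```

Because every option keeps a non-zero probability, less favoured options are still tried occasionally, which is what lets the controller discover that an early preference was wrong.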
{ Training RNN Controller }
1. Use REINFORCE to update controller parameters
○ Binary rewards (0/1)
○ Trained model accuracy is the prob. of reward being 1
○ Apply cross entropy to RNN outputs
2. Designs with higher accuracy are assigned higher prob.
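
A minimal sketch of this update rule, assuming a controller that makes a single four-way choice instead of a full RNN. The accuracies in `TOY_ACCURACY` are invented for illustration, and the function names are hypothetical. With a binary reward, each rewarded sample triggers one cross-entropy step towards the sampled choice; unrewarded samples leave the parameters untouched.

```python
import math
import random

CHOICES = [24, 36, 48, 64]
# Made-up accuracies a trained model would reach for each choice (illustrative only).
TOY_ACCURACY = {24: 0.2, 36: 0.4, 48: 0.9, 64: 0.5}


def softmax(logits):
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce(steps=5000, learning_rate=0.05, seed=0):
    rng = random.Random(seed)
    logits = [0.0] * len(CHOICES)   # controller parameters, uniform at the start
    for _ in range(steps):
        probs = softmax(logits)
        action = rng.choices(range(len(CHOICES)), weights=probs, k=1)[0]
        # Binary reward: 1 with probability equal to the trained model's accuracy.
        reward = 1 if rng.random() < TOY_ACCURACY[CHOICES[action]] else 0
        # The gradient of log p(action) w.r.t. the logits is one_hot(action) - probs,
        # i.e. a cross-entropy step towards the sampled action, scaled by the reward.
        for i in range(len(logits)):
            indicator = 1.0 if i == action else 0.0
            logits[i] += learning_rate * reward * (indicator - probs[i])
    return softmax(logits)


final_probs = reinforce()
print(dict(zip(CHOICES, final_probs)))
```

In expectation the update for each choice is proportional to its accuracy minus the current average accuracy, so probability mass drifts towards the designs that train best.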
{ Speed Up Model Training }
1. Applies when the same layers are shared across architectures
2. Share the same layer parameters
3. Alternate training between models
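
The idea can be illustrated with a shared parameter pool. This is a deliberately simplified sketch: layers are keyed by a descriptive tuple (a hypothetical convention chosen here), the "parameters" are placeholders, and `train_one_step` only counts updates instead of doing real gradient steps.

```python
class SharedWeights:
    """Pool of layer parameters keyed by a layer description."""

    def __init__(self):
        self.pool = {}
        self.initialisations = 0

    def get(self, layer_spec):
        # Create parameters only the first time a layer description is seen.
        if layer_spec not in self.pool:
            self.pool[layer_spec] = {"weights": [0.0] * 4, "updates": 0}  # placeholder params
            self.initialisations += 1
        return self.pool[layer_spec]


def train_one_step(architecture, shared):
    """Stand-in for one training step: touches the shared parameters of every layer."""
    for layer_spec in architecture:
        params = shared.get(layer_spec)
        params["updates"] += 1


shared = SharedWeights()
arch_a = [("conv", 3, 24), ("conv", 5, 36), ("maxpool", 3)]
arch_b = [("conv", 3, 24), ("maxpool", 3), ("conv", 7, 48)]

# Alternate between the two models; common layers keep training instead of restarting.
for step in range(10):
    train_one_step(arch_a if step % 2 == 0 else arch_b, shared)

print(shared.initialisations, shared.pool[("conv", 3, 24)]["updates"])
```

Layers that appear in both architectures are initialised once and updated on every step, so a newly sampled architecture starts from partially trained parameters instead of from scratch.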
{ Summary }
1. Better model accuracy
2. Can be made to work with complex architectures
3. Able to correct controller mistakes (e.g. bias)
4. Speed up training when layers can be shared
○ From 40K to 16 GPU hours
5. Designed for a specific type of problem
6. Still very expensive, typically around 10K GPU hours
So, What is Next?
{ Challenges }
1. NAS algorithms are domain specific
2. Only neural networks are supported
3. Heavy use of human heuristics
4. Expensive (thousands of GPU hours)
5. Cold start problem: NAS has no prior knowledge about data
{ Our Answer }
[Figure: the AutoML loop again (Controller, Model Designs, Model Training, Model Validation, Training Data, Validation Data), with an additional Training Data element]
{ Generalized Architecture Search }
1. Accumulate domain knowledge over time
2. Works with any algorithm (neural networks or not)
3. Automated feature engineering
4. Far fewer models to train
5. GAS powers OneClick.ai
Use AI to Build AI
1. Custom-built Deep Learning models for best performance
2. Model designs improved iteratively in a few hours
3. Better models in fewer shots due to self-learned domain knowledge
Meta-learning evaluates millions of deep learning models in the blink of an eye. US patent pending
Versatile Applications
1. Data types: numeric, categorical, date/time, textual, images
2. Applications: regression, classification, time-series forecasting, clustering, recommendations, vision
Powered by deep learning, we support an unprecedented range of applications and data types
Unparalleled Simplicity
1. Users need zero AI background
2. Simpler to use than Excel
3. Advanced functions available to experts via a chatbot
Thanks to a chatbot-based UX, we can accommodate both newbie and expert users
Use AI to Build AI
Sign up on http://oneclick.ai
ask@oneclick.ai

From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
 

The Evolution of AutoML

  • 1. Use AI to Build AI: The Evolution of AutoML. Ning Jiang, CTO, OneClick.ai. 2018
  • 2. Ning Jiang: Co-founder of OneClick.ai, the first automated Deep Learning platform in the market. Previously Dev Manager at Microsoft Bing, Ning has over 15 years of R&D experience in AI for ads, search, and cyber security.
  • 4. { Challenges in AI Applications }
      1. Never enough experienced data scientists
      2. Long development cycle (typically 3 to 6 months)
      3. High risk of failure
      4. Endless engineering traps in implementation and maintenance
  • 5. { Coming Along With Deep Learning }
      1. Few experienced data scientists and engineers
      2. Increasing complexity in data (a mix of images, text, and numbers)
      3. Algorithms need to be customized
      4. Increased design choices and hyper-parameters
      5. Much harder to debug
  • 7. { AutoML }
      [Diagram: Controller, Model Designs, Model Training, Model Validation, Training Data, Validation Data]
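The loop in the diagram above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the search space, the mutate-the-best controller, and the toy scoring formula are all hypothetical stand-ins, since the slide names the components but not how they are implemented.

```python
import random

# Hypothetical search space: every name and value here is illustrative.
SEARCH_SPACE = {
    "num_layers": [1, 2, 3, 4],
    "hidden_units": [16, 32, 64, 128],
    "learning_rate": [0.1, 0.01, 0.001],
}


def propose_design(rng, best_design=None):
    """Controller step: sample a fresh design, or mutate the best one so far."""
    if best_design is None:
        return {name: rng.choice(options) for name, options in SEARCH_SPACE.items()}
    design = dict(best_design)
    name = rng.choice(sorted(SEARCH_SPACE))
    design[name] = rng.choice(SEARCH_SPACE[name])
    return design


def train_and_validate(design):
    """Stand-in for model training plus model validation.

    A real system would fit a model on the training data and score it on
    the validation data; this toy formula only makes the loop runnable.
    """
    score = 0.5
    score += 0.05 * design["num_layers"]
    score += 0.001 * design["hidden_units"]
    score -= abs(design["learning_rate"] - 0.01)
    return score


def automl_loop(num_trials=50, seed=0):
    """Propose a design, evaluate it, and feed the score back to the controller."""
    rng = random.Random(seed)
    best_design, best_score = None, float("-inf")
    for _ in range(num_trials):
        design = propose_design(rng, best_design)
        score = train_and_validate(design)  # validation feedback
        if score > best_score:
            best_design, best_score = design, score
    return best_design, best_score
```

Calling `automl_loop()` returns the best design found and its score. The search methods on the later slides differ mainly in how the controller proposes the next design.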
  • 8. { Key Challenges }
      1. Satisfy semantic constraints (e.g. data types)
      2. Use feedback to improve model designs
      3. Minimize the number of models to train
      4. Avoid local minima
      5. Speed up model training
  • 9. { Neural Architecture Search }
      1. Evolutionary algorithms (ref: https://arxiv.org/abs/1703.01041)
      2. Greedy search (ref: https://arxiv.org/abs/1712.00559)
      3. Reinforcement learning (ref: https://arxiv.org/abs/1611.01578)
      4. Speed up model training (ref: https://arxiv.org/abs/1802.03268)
  • 11. { Target Scenarios }
      1. Image classification (on CIFAR-10 & ImageNet)
      2. Using only Convolution & Pooling layers
      3. This is what powers Google AutoML
  • 12. { Constraints }
      1. Predefined architectures
      2. N=2
      3. Number of filters decided by heuristics
      4. NAS to find the optimal Cell structure
  • 13. { Basic constructs }
      Each construct has
      1. Two inputs
      2. Processed by two operators
      3. One combined output
      [Diagram labels: Operator 1, Operator 2, Input 1, Input 2]
  • 14. { Predefined Operators }
      Why these and these only?
      1. 3x3 convolution
      2. 5x5 convolution
      3. 7x7 convolution
      4. Identity (pass through)
      5. 3x3 average pooling
      6. 3x3 max pooling
      7. 3x3 dilated convolution
      8. 1x7 followed by 7x1 convolution
      [Diagram labels: Operator 1, Operator 2, Input 1, Input 2]
  • 15. { Cells }
      1. Stacking up to 5 basic constructs
      2. About 5.6 × 10^14 cell candidates
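The cell count can be checked with a short calculation. The sketch below rests on one assumption that the slides do not state: a construct may take each of its two inputs from either of the cell's 2 inputs or from the output of any earlier construct in the same cell. The function names are illustrative.

```python
NUM_OPERATORS = 8    # the predefined operators listed on slide 14
MAX_CONSTRUCTS = 5   # a cell stacks up to 5 basic constructs


def construct_choices(position):
    """Number of ways to build the construct at a 1-based position in the cell.

    Assumption: each input comes from one of the 2 cell inputs or from the
    output of any earlier construct, so there are 2 + (position - 1) options.
    """
    num_inputs = 2 + (position - 1)
    return (num_inputs ** 2) * (NUM_OPERATORS ** 2)


def cell_candidates(max_constructs=MAX_CONSTRUCTS):
    """Multiply the per-construct choices over every position in the cell."""
    total = 1
    for position in range(1, max_constructs + 1):
        total *= construct_choices(position)
    return total
```

Under that assumption, `construct_choices(1)` gives 256 and `cell_candidates()` gives 556,627,761,561,600, which rounds to 5.6 × 10^14.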
  • 16. { Greedy Search }
      1. Start with a single construct (m=1)
      2. There are 256 possibilities
      3. Add one more construct
      4. Pick the best K (256) cells to train
      5. Repeat steps 3 and 4 until we have 5 constructs in the cell
      6. 1028 models to be trained
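A scaled-down sketch of the greedy procedure above. The operator subset, the small K, the shallower cell, and `toy_score` are all hypothetical. In particular, `toy_score` stands in for both ranking candidate cells and training them, which a real search would do separately.

```python
import itertools

OPERATORS = ["conv3x3", "conv5x5", "max_pool3x3", "identity"]  # toy subset
BEAM_WIDTH = 4        # K in the slide (256 there; small here to stay fast)
MAX_CONSTRUCTS = 3    # the slide grows cells up to 5 constructs


def candidate_constructs(position):
    """All (input1, input2, op1, op2) constructs allowed at a 0-based position."""
    inputs = range(2 + position)  # 2 cell inputs plus earlier constructs
    return list(itertools.product(inputs, inputs, OPERATORS, OPERATORS))


def toy_score(cell):
    """Hypothetical stand-in for 'train this cell and measure its accuracy'."""
    weights = {"conv3x3": 3, "conv5x5": 2, "max_pool3x3": 1, "identity": 0}
    return sum(weights[op1] + weights[op2] - 0.1 * abs(i1 - i2)
               for i1, i2, op1, op2 in cell)


def greedy_search(beam_width=BEAM_WIDTH, max_constructs=MAX_CONSTRUCTS):
    # Steps 1 and 2: evaluate every single-construct cell.
    beam = [(construct,) for construct in candidate_constructs(0)]
    models_trained = len(beam)
    beam = sorted(beam, key=toy_score, reverse=True)[:beam_width]
    # Steps 3 to 5: extend each kept cell by one construct, keep the best K.
    for position in range(1, max_constructs):
        expanded = [cell + (construct,)
                    for cell in beam
                    for construct in candidate_constructs(position)]
        beam = sorted(expanded, key=toy_score, reverse=True)[:beam_width]
        models_trained += len(beam)
    return beam[0], models_trained
```

`greedy_search()` returns the best cell found and the number of cells that would have been trained. Only K cells per level are trained, so that number grows linearly with cell depth even though the number of possible cells grows multiplicatively.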
  • 17. { Pick the best cells }
      1. Cells as a sequence of choices
      2. LSTM to estimate model accuracy
      3. Training data comes from trained models (up to 1024 examples)
      4. 99.03% accuracy at m=2
      5. 99.52% at m=5
      [Diagram labels: LSTM, Dense, Input1, Input2, Operator1, Operator2]
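To make "cells as a sequence of choices" concrete, here is one possible encoding that a sequence model such as an LSTM could consume. The token layout (four integers per construct) and the operator names are assumptions for illustration, not the format used in the talk.

```python
OPERATORS = [
    "conv3x3", "conv5x5", "conv7x7", "identity",
    "avg_pool3x3", "max_pool3x3", "dilated_conv3x3", "conv1x7_7x1",
]
OP_INDEX = {name: index for index, name in enumerate(OPERATORS)}


def encode_cell(cell):
    """Flatten a cell into integer tokens.

    Each construct contributes 4 tokens: input1, input2, operator1, operator2.
    """
    tokens = []
    for input1, input2, op1, op2 in cell:
        tokens.extend([input1, input2, OP_INDEX[op1], OP_INDEX[op2]])
    return tokens


def training_examples(trained_cells):
    """Pair each encoded cell with the accuracy measured after training it."""
    return [(encode_cell(cell), accuracy) for cell, accuracy in trained_cells]
```

For example, `encode_cell([(0, 1, "conv3x3", "identity")])` yields `[0, 1, 0, 3]`. A predictor trained on such pairs can then rank cells that have not been trained yet.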
  • 18. { Summary }
      1. Fewer models to train
          ○ Remarkable improvement over evolutionary algorithms
      2. Search from simple to complex models
      3. Heavy use of domain knowledge and heuristics
      4. Suboptimal results due to greedy search
      5. Can’t generalize to other problems
  • 20. { Why RL? }
      1. RL is a generative model
      2. RL assumes less domain knowledge about the problem
      3. Trained model accuracy is used as the reward
  • 22. { RNN Controller }
      1. Autoregressive RNN
      2. Outputs capable of describing any architecture
      3. Supports non-linear architectures using skip connections
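One hypothetical way to represent what such a controller emits: a per-layer decision plus one skip flag for every earlier layer. The sketch below only decodes that sequence into a layer graph. The encoding, the `NETWORK_INPUT` marker, and the function name are assumptions, and the RNN itself is not modeled.

```python
NETWORK_INPUT = -1   # hypothetical marker for "reads the raw network input"


def decode_architecture(decisions):
    """Turn per-layer controller decisions into a layer graph.

    `decisions` holds one (num_filters, skip_flags) pair per layer, where
    skip_flags[j] is True if the output of layer j feeds this layer.
    """
    layers = []
    for index, (num_filters, skip_flags) in enumerate(decisions):
        if len(skip_flags) != index:
            raise ValueError(f"layer {index} needs exactly {index} skip flags")
        inputs = [j for j, connected in enumerate(skip_flags) if connected]
        if not inputs:
            inputs = [NETWORK_INPUT]  # no skip chosen: read the network input
        layers.append({"filters": num_filters, "inputs": inputs})
    return layers
```

With `decode_architecture([(24, []), (36, [True]), (64, [True, True])])`, the third layer receives the outputs of both earlier layers, which is a non-linear (branching) architecture expressed as a flat sequence.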
  • 24. { Stochastic Sampling }
      For example:
      1. Filter size has 4 choices: 24, 36, 48, 64
      2. For each convolution layer, the RNN outputs a distribution:
          ○ (60%, 20%, 10%, 10%)
          ○ With a 60% chance, the filter size will be 24
      3. This helps collect data to correct the controller’s mistakes
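The sampling step in this example can be sketched with the standard library. The choices and probabilities are the ones on the slide; the function names and the fixed seed are illustrative.

```python
import random
from collections import Counter

FILTER_CHOICES = [24, 36, 48, 64]
PROBABILITIES = [0.6, 0.2, 0.1, 0.1]   # controller output for one layer


def sample_choice(rng, choices=FILTER_CHOICES, probabilities=PROBABILITIES):
    """Draw one choice according to the controller's distribution."""
    return rng.choices(choices, weights=probabilities, k=1)[0]


def sample_frequencies(num_samples=10_000, seed=0):
    """Empirical frequency of each choice over many draws."""
    rng = random.Random(seed)
    counts = Counter(sample_choice(rng) for _ in range(num_samples))
    return {choice: counts[choice] / num_samples for choice in FILTER_CHOICES}
```

Because the draw is random rather than always taking the most likely option, the less likely choices (36, 48, 64) still get tried occasionally. Those trials supply the data that lets the controller revise a distribution that started out wrong.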
  • 25. { Training RNN Controller }
      1. Use REINFORCE to update controller parameters
          ○ Binary rewards (0/1)
          ○ Trained model accuracy is the probability of the reward being 1
          ○ Apply cross entropy to RNN outputs
      2. Designs with higher accuracy are assigned higher probability
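A minimal REINFORCE sketch for a single decision, assuming a softmax over four choices in place of the full RNN. The accuracy table is made up, and the learning rate and step count are arbitrary. The update is the gradient of a reward-weighted cross-entropy term, matching the binary-reward description above.

```python
import math
import random

CHOICES = [24, 36, 48, 64]
# Hypothetical validation accuracy of the model built from each choice.
TOY_ACCURACY = {24: 0.2, 36: 0.3, 48: 0.9, 64: 0.4}


def softmax(logits):
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_controller(steps=5000, learning_rate=0.05, seed=0):
    """Return the controller's final distribution over CHOICES."""
    rng = random.Random(seed)
    logits = [0.0] * len(CHOICES)
    for _ in range(steps):
        probs = softmax(logits)
        action = rng.choices(range(len(CHOICES)), weights=probs, k=1)[0]
        # Binary reward: 1 with probability equal to the model's accuracy.
        reward = 1.0 if rng.random() < TOY_ACCURACY[CHOICES[action]] else 0.0
        # REINFORCE step: reward times the gradient of log p(action),
        # which for a softmax is one_hot(action) - probs.
        for i in range(len(logits)):
            one_hot = 1.0 if i == action else 0.0
            logits[i] += learning_rate * reward * (one_hot - probs[i])
    return softmax(logits)
```

After training, most of the probability mass should sit on the choice with the highest accuracy (48 in this toy table), which is the behavior item 2 describes.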
  • 26. { Speed Up Model Training }
      1. When the same layers are shared across architectures
      2. Share the same layer parameters
      3. Alternate training between models
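Parameter sharing can be illustrated with a store keyed by layer specification: two architectures that contain the same layer receive the same parameter object. Everything below is a hypothetical sketch. The layer spec format, the `store` dictionary, and the stand-in `train_step` are assumptions, not the mechanism from the referenced paper.

```python
import random


def get_layer_params(store, layer_spec, rng):
    """Return the parameters for a layer, creating them on first use."""
    if layer_spec not in store:
        _, fan_in, fan_out = layer_spec
        store[layer_spec] = [rng.gauss(0.0, 0.1) for _ in range(fan_in * fan_out)]
    return store[layer_spec]


def build_model(store, architecture, rng):
    """An architecture is a list of hashable layer specs, e.g. ("dense", 4, 8)."""
    return [get_layer_params(store, spec, rng) for spec in architecture]


def train_step(model, learning_rate=0.01):
    """Stand-in update: shrinks weights in place so sharing is observable."""
    for params in model:
        for i in range(len(params)):
            params[i] -= learning_rate * params[i]
```

If `model_a` and `model_b` are built from the same store and both start with `("dense", 4, 8)`, a `train_step` on `model_a` also changes the first layer of `model_b`. Alternating `train_step` calls between the two models therefore trains the shared layer with both.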
  • 27. { Summary }
      1. Better model accuracy
      2. Can be made to work with complex architectures
      3. Able to correct controller mistakes (e.g. bias)
      4. Speed up training when layers can be shared
          ○ From 40K to 16 GPU hours
      5. Designed for a specific type of problem
      6. Still very expensive, typically 10K GPU hours
  • 28. So, What is Next?
  • 29. { Challenges }
      1. NAS algorithms are domain specific
      2. Only neural networks are supported
      3. Heavy use of human heuristics
      4. Expensive (thousands of GPU hours)
      5. Cold start problem: NAS has no prior knowledge about data
  • 30. { Our Answer }
      [Diagram: Controller, Model Designs, Model Training, Model Validation, Validation Data, Training Data (shown twice)]
  • 31. { Generalized Architecture Search }
      1. Accumulate domain knowledge over time
      2. Works with any algorithm (neural networks or not)
      3. Automated feature engineering
      4. Far fewer models to train
      5. GAS powers OneClick.ai
  • 32. Use AI to Build AI
      1. Custom-built Deep Learning models for best performance
      2. Model designs improved iteratively in a few hours
      3. Better models in fewer shots due to self-learned domain knowledge
      Meta-learning evaluates millions of deep learning models in the blink of an eye.
      US patent pending
  • 33. Versatile Applications
      1. Data types: numeric, categorical, date/time, textual, images
      2. Applications: regression, classification, time-series forecasting, clustering, recommendations, vision
      Powered by deep learning, we support an unprecedented range of applications and data types
  • 34. Unparalleled Simplicity
      1. Users need zero AI background
      2. Simpler to use than Excel
      3. Advanced functions available to experts via a chatbot
      Thanks to a chatbot-based UX, we can accommodate both newbie and expert users
  • 35. Use AI to Build AI. Sign up at http://oneclick.ai or contact ask@oneclick.ai