Plotcon 2016 Visualization Talk by Alexandra Johnson


Machine learning is full of ideas that are far abstracted away from the underlying data and difficult to understand. Luckily, this represents an amazing opportunity for visualization! These slides dive into the machine learning meta-problem of hyperparameter optimization. We'll show 4 opportunities for visualization in helping people understand, implement, and evaluate hyperparameter optimization strategies.


Visualizing Abstract Concepts in Machine Learning | 2
What is Machine Learning?
Example labels: Versicolor, Setosa, Virginica
Training Data + Model -> Labels (Classification) or Numbers (Regression)
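
As a minimal sketch of "Training Data + Model -> Labels or Numbers", the snippet below uses a nearest-neighbor rule on a handful of made-up flower measurements. The measurements, the `TRAINING_DATA` name, and both function names are assumptions for illustration; the slides do not say which model or data were used.

```python
import math

# Hypothetical training data: (petal length cm, petal width cm) -> label.
# The numbers are illustrative only, not taken from the talk.
TRAINING_DATA = [
    ((1.4, 0.2), "Setosa"),
    ((1.5, 0.3), "Setosa"),
    ((4.5, 1.5), "Versicolor"),
    ((4.1, 1.3), "Versicolor"),
    ((5.8, 2.2), "Virginica"),
    ((6.0, 2.5), "Virginica"),
]


def predict_label(features):
    """Classification: return the label of the closest training example."""
    nearest = min(TRAINING_DATA, key=lambda row: math.dist(row[0], features))
    return nearest[1]


def predict_width(petal_length):
    """Regression: return a number (petal width) instead of a label.

    Copies the width of the training example whose length is closest.
    """
    nearest = min(TRAINING_DATA, key=lambda row: abs(row[0][0] - petal_length))
    return nearest[0][1]
```

The same training data feeds both functions; only the kind of output changes.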

Why is this so Intimidating?
In-browser deep neural net from playground.tensorflow.org
Hyperparameters = your model's magic numbers
Examples: learning rate, ratio of train to test data, number of hidden layers, neurons per hidden layer
Hyperparameter values must be set before training
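
A small sketch of what "set before training" means in practice, assuming a toy one-weight model fit by gradient descent. The `train` function and its data are hypothetical; `learning_rate` and `num_steps` are the hyperparameters, while the weight `w` is what training learns.

```python
def train(learning_rate, num_steps):
    """Fit y = w * x by gradient descent and return the learned weight w.

    learning_rate and num_steps are hyperparameters: they are fixed before
    the loop starts and are never updated by the training itself.
    """
    xs = [1.0, 2.0, 3.0]
    ys = [2.0, 4.0, 6.0]  # toy data generated with a true weight of 2
    w = 0.0
    for _ in range(num_steps):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= learning_rate * grad
    return w


well_tuned = train(learning_rate=0.1, num_steps=20)    # converges close to 2
too_small = train(learning_rate=0.001, num_steps=20)   # barely moves from 0
too_large = train(learning_rate=0.3, num_steps=20)     # diverges
```

The three calls differ only in one magic number, yet one converges, one stalls, and one blows up.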

Solution: Hyperparameter Optimization
And four visualization challenges

20 Dimensional Math is Hard
Values you choose for your hyperparameters have a direct effect on the performance of your model
Hard to capture interactions of 20 hyperparameters

20 Dimensional Math is Hard
[Figure: Accuracy vs. log_C]
First try: graph model performance vs. hyperparameter value, for every hyperparameter
Good for understanding individual hyperparameters, bad for understanding interactions
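
The one-hyperparameter-at-a-time plot can be sketched as a sweep over `log_C` with everything else held fixed. The accuracy surface below is invented for illustration, and `log_gamma` is a hypothetical second hyperparameter that does not appear in the slides; it is there only to show how an interaction hides from a single sweep.

```python
def toy_accuracy(log_c, log_gamma):
    """Hypothetical accuracy surface with an interaction.

    The best log_c depends on log_gamma, which a single sweep cannot show.
    """
    return 1.0 / (1.0 + (log_c + log_gamma) ** 2)


def sweep_log_c(log_gamma, values=range(-15, 6, 5)):
    """Accuracy for each log_c value while log_gamma stays fixed."""
    return [(log_c, toy_accuracy(log_c, log_gamma)) for log_c in values]


def best_log_c(log_gamma):
    """The log_c value with the highest accuracy in one sweep."""
    return max(sweep_log_c(log_gamma), key=lambda point: point[1])[0]
```

Each sweep gives a clean curve, but `best_log_c(0)` and `best_log_c(5)` disagree, so no single curve tells the whole story.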

20 Dimensional Math is Hard
[Figure: Accuracy scale, 0.3 to 0.9]
Graph up to 4 dimensions at once: x, y, z axis + color
Hard to visualize 4 dimensions at once, imagine 20!
Maybe you want to use an algorithm to handle hyperparameter optimization

Hyperparameter Optimization Strategies are Different
Grid Search, Random Search, Bayesian Optimization
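
To make the difference concrete, here is a rough sketch of grid search and random search spending the same budget of 25 evaluations. The `toy_accuracy` objective is a made-up stand-in for "train the model and measure accuracy", and both hyperparameters are assumed to lie in [0, 1]. Bayesian optimization is left out because it needs a surrogate model, which does not fit in a few lines.

```python
import itertools
import random


def toy_accuracy(x, y):
    """Hypothetical objective standing in for training and scoring a model."""
    return 1.0 - ((x - 0.3) ** 2 + (y - 0.7) ** 2)


def grid_search(points_per_axis):
    """Evaluate every combination on an evenly spaced grid over [0, 1] x [0, 1]."""
    axis = [i / (points_per_axis - 1) for i in range(points_per_axis)]
    return max(toy_accuracy(x, y) for x, y in itertools.product(axis, axis))


def random_search(num_evaluations, seed):
    """Evaluate uniformly random points; the result depends on the seed."""
    rng = random.Random(seed)
    return max(
        toy_accuracy(rng.random(), rng.random()) for _ in range(num_evaluations)
    )


grid_best = grid_search(points_per_axis=5)              # 5 x 5 = 25 evaluations
random_best = random_search(num_evaluations=25, seed=0)
```

Grid search always returns the same answer for a given grid, while random search changes with the seed.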

Some Strategies Produce Better Results
[Figure: Distribution of Best Found Values over Experiments of 25 Iterations. x-axis: Maximum Accuracy; y-axis: Experiments]
Experiment = optimizing hyperparameters of your model, results in some maximum performance
Some hyperparameter optimization strategies are stochastic, can't just look at one experiment
Look at distribution of maximum performance over many experiments optimizing hyperparameters of the same model
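
One way to build such a distribution, sketched under the assumption that an experiment is 25 iterations of random search on a made-up objective. `toy_accuracy`, `run_experiment`, and the 0.01 bin width are illustrative choices, not details from the talk.

```python
import random
from collections import Counter


def toy_accuracy(x, y):
    """Hypothetical objective standing in for training and scoring a model."""
    return 1.0 - ((x - 0.3) ** 2 + (y - 0.7) ** 2)


def run_experiment(seed, iterations=25):
    """One experiment: random search for `iterations` steps, keep the best value."""
    rng = random.Random(seed)
    return max(toy_accuracy(rng.random(), rng.random()) for _ in range(iterations))


def best_value_histogram(num_experiments, bin_width=0.01):
    """Count how many experiments end with a best value in each accuracy bin.

    Keys are integer bin indices: bin k covers [k * bin_width, (k + 1) * bin_width).
    """
    best_values = [run_experiment(seed) for seed in range(num_experiments)]
    return Counter(int(value / bin_width) for value in best_values)


histogram = best_value_histogram(num_experiments=100)
```

Plotting `histogram` as bars gives the kind of chart the slide describes: one bar per accuracy bin, with height equal to the number of experiments.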

Some Strategies Produce Better Results
[Figure: Distribution of Best Found Values over Experiments of 25 Iterations. x-axis: Maximum Accuracy; y-axis: Experiments. Series: Random Search, Grid Search, Bayesian Optimization]
Use the Mann-Whitney U Test to compare distributions of maximum performance
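
A minimal sketch of the Mann-Whitney U statistic, computed directly from its pairwise definition, with a normal approximation for the two-sided p-value. The approximation here skips tie and continuity corrections and is rough for samples this small. The accuracy lists are hypothetical, not results from the talk. In practice, `scipy.stats.mannwhitneyu` is probably the more convenient choice.

```python
import math


def u_statistic(sample_a, sample_b):
    """Mann-Whitney U for sample_a: pairs where a beats b, ties count half."""
    u = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


def two_sided_p_value(sample_a, sample_b):
    """Normal approximation to the p-value (no tie or continuity correction)."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_u = n_a * n_b / 2
    std_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (u_statistic(sample_a, sample_b) - mean_u) / std_u
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical best accuracies from five experiments per strategy.
random_search_best = [0.962, 0.971, 0.968, 0.975, 0.966]
bayesian_best = [0.981, 0.979, 0.985, 0.977, 0.983]

u = u_statistic(random_search_best, bayesian_best)
p = two_sided_p_value(random_search_best, bayesian_best)
```

The test only uses the ordering of the values, so it does not assume the maximum accuracies follow any particular distribution.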

Some Strategies Produce Better Results, Faster
[Figure: Best Seen Trace. x-axis: Timestep; y-axis: Best Seen Accuracy]
How much time do you have for optimization?
Strategies that reliably produce better results faster can optimize the hyperparameters of your model in less time
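
A best seen trace is a running maximum over the accuracies observed so far. A short sketch, with a hypothetical list of per-timestep accuracies:

```python
from itertools import accumulate


def best_seen_trace(accuracies):
    """Running maximum: the best accuracy observed up to each timestep."""
    return list(accumulate(accuracies, max))


# Hypothetical accuracies from five successive evaluations.
observed = [0.3, 0.5, 0.4, 0.9, 0.7]
trace = best_seen_trace(observed)
```

The trace never decreases, which is why it answers the question "what would I have if I stopped now?"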

Some Strategies Produce Better Results, Faster
[Figure: Interquartile Range of Best Seen Traces. x-axis: Timestep; y-axis: Best Seen Accuracy]
Again, consider a distribution of optimization experiments
25th to 75th percentile of the performance our model could achieve if we stopped early
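
The band can be sketched by taking, at every timestep, the 25th and 75th percentiles across the best seen traces of many experiments. The traces below are hypothetical, and the "inclusive" quantile method is one reasonable choice among several; the slides do not say which was used.

```python
import statistics


def interquartile_band(traces):
    """Return one (25th percentile, 75th percentile) pair per timestep.

    `traces` holds one best seen trace per experiment, all the same length.
    """
    band = []
    for values_at_step in zip(*traces):
        q1, _median, q3 = statistics.quantiles(
            values_at_step, n=4, method="inclusive"
        )
        band.append((q1, q3))
    return band


# Hypothetical best seen traces from five experiments, three timesteps each.
traces = [
    [0.1, 0.5, 0.9],
    [0.2, 0.6, 0.9],
    [0.3, 0.7, 0.9],
    [0.4, 0.8, 0.9],
    [0.5, 0.9, 0.9],
]
band = interquartile_band(traces)
```

Shading the region between the two percentiles at each timestep gives the band shown in the figure.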

Some Strategies Produce Better Results, Faster
[Figure: Interquartile Ranges of Best Seen Traces. x-axis: Timestep; y-axis: Best Seen Accuracy. Series: Grid Search, Random Search, Bayesian Optimization]
Compare the area under the curve of different strategies
Further reading at sigopt.com/research
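
Area under the curve can be approximated with the trapezoidal rule. The sketch below assumes unit spacing between timesteps and uses two hypothetical best seen traces that reach the same final accuracy at different speeds.

```python
def area_under_trace(trace):
    """Trapezoidal area under a best seen trace with unit timestep spacing."""
    return sum((left + right) / 2 for left, right in zip(trace, trace[1:]))


# Hypothetical traces: both end at 0.9, but one gets there sooner.
fast_strategy = [0.5, 0.9, 0.9, 0.9]
slow_strategy = [0.5, 0.5, 0.5, 0.9]

fast_area = area_under_trace(fast_strategy)
slow_area = area_under_trace(slow_strategy)
```

A larger area rewards a strategy for finding good values early, not just for its final result.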

Takeaways
Hyperparameter optimization is an invaluable part of any modern machine learning pipeline
Concepts like comparing hyperparameter optimization strategies are extremely abstract and difficult to understand
Visualizations are in their infancy, but are an important part of explaining these ideas

Thank You!
Email: alexandra@sigopt.com
Twitter: @alexandraj777
www.sigopt.com


1. Visualizing Abstract Concepts in Machine Learning. Alexandra Johnson, Software Engineer @ SigOpt. #MachineLearning #MLViz

