Machine learning is full of ideas that are far abstracted away from the underlying data and difficult to understand. Luckily, this represents an amazing opportunity for visualization! These slides dive into the machine learning meta-problem of hyperparameter optimization. We'll show 4 opportunities for visualization in helping people understand, implement, and evaluate hyperparameter optimization strategies.
Plotcon 2016 Visualization Talk by Alexandra Johnson
1. Visualizing Abstract Concepts in Machine Learning
Alexandra Johnson
Software Engineer @ SigOpt
#MachineLearning #MLViz
Visualizing Abstract Concepts in Machine Learning | 1
2. What is Machine Learning?
Versicolor
Setosa
Virginica
Training Data + Model -> Labels (Classification) or Numbers (Regression)
3. Why is this so Intimidating?
In-browser deep neural net from playground.tensorflow.org
Hyperparameters = your model's magic numbers
Examples: learning rate, ratio of train to test data, number of hidden layers, neurons per hidden layer
Hyperparameter values must be set before training
5. 20 Dimensional Math is Hard
Values you choose for your hyperparameters have a direct effect on the performance of your model
Hard to capture interactions of 20 hyperparameters
6. 20 Dimensional Math is Hard
[Figure: Accuracy (y-axis) vs. log_C (x-axis)]
First try: graph model performance vs. hyperparameter value, for every hyperparameter
Good for understanding individual hyperparameters, bad for understanding interactions
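A minimal sketch of this first approach: sweep one hyperparameter at a time while holding the others at a baseline. The `toy_accuracy` function, the second hyperparameter `log_gamma`, and the baseline values are hypothetical stand-ins for training and scoring a real model.

```python
import math

def toy_accuracy(params):
    """Hypothetical stand-in for training a model and returning its accuracy."""
    # Peaks at log_C = -2, log_gamma = -5; purely illustrative.
    distance = (params["log_C"] + 2) ** 2 + (params["log_gamma"] + 5) ** 2
    return math.exp(-distance / 50)

def sweep_one(name, values, baseline, evaluate):
    """Vary one hyperparameter, holding the others at their baseline values."""
    results = []
    for value in values:
        params = dict(baseline, **{name: value})
        results.append((value, evaluate(params)))
    return results

baseline = {"log_C": 0.0, "log_gamma": 0.0}
sweeps = {
    name: sweep_one(name, range(-15, 6, 5), baseline, toy_accuracy)
    for name in baseline
}
for name, curve in sweeps.items():
    best_value, best_score = max(curve, key=lambda point: point[1])
    print(f"{name}: best value {best_value}, accuracy {best_score:.3f}")
```

Each curve is a one-dimensional slice through the baseline, so it cannot show how the two hyperparameters interact.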
7. 20 Dimensional Math is Hard
[Figure: color scale labeled Accuracy]
Graph up to 4 dimensions at once: x, y, z axes + color
Hard to visualize 4 dimensions at once, imagine 20!
Maybe you want to use an algorithm to handle hyperparameter optimization
8. Hyperparameter Optimization Strategies are Different
[Figure panels: Grid Search, Random Search, Bayesian Optimization]
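A rough sketch of how the first two strategies pick points to evaluate, assuming a hypothetical two-hyperparameter search space (`log_C` and `log_gamma`, with made-up bounds). Bayesian optimization, which uses past results to choose the next point, is not sketched here.

```python
import itertools
import random

# Hypothetical search space: (low, high) bounds for two hyperparameters.
bounds = {"log_C": (-15.0, 5.0), "log_gamma": (-15.0, 5.0)}

def grid_search_points(bounds, points_per_axis):
    """Evenly spaced values on each axis (needs >= 2), combined into a full grid."""
    axes = []
    for low, high in bounds.values():
        step = (high - low) / (points_per_axis - 1)
        axes.append([low + i * step for i in range(points_per_axis)])
    return [dict(zip(bounds, combo)) for combo in itertools.product(*axes)]

def random_search_points(bounds, n_points, seed=0):
    """Independent uniform samples from the search space."""
    rng = random.Random(seed)
    return [
        {name: rng.uniform(low, high) for name, (low, high) in bounds.items()}
        for _ in range(n_points)
    ]

grid = grid_search_points(bounds, points_per_axis=5)  # 5 * 5 = 25 points
rand = random_search_points(bounds, n_points=25)      # also 25 points

# Count how many distinct log_C values each strategy tries.
print(len({p["log_C"] for p in grid}), len({p["log_C"] for p in rand}))
```

With the same budget of 25 evaluations, the grid tries only 5 distinct values per hyperparameter, while random search draws a fresh value each time.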
9. Some Strategies Produce Better Results
[Figure: Distribution of Best Found Values over Experiments of 25 Iterations; x-axis: Maximum Accuracy; y-axis: Experiments]
Experiment = optimizing hyperparameters of your model, results in some maximum performance
Some hyperparameter optimization strategies are stochastic, can't just look at one experiment
Look at distribution of maximum performance over many experiments optimizing hyperparameters of the same model
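One way to build such a distribution, sketched with random search on a hypothetical one-hyperparameter objective (`toy_accuracy` is an assumption standing in for a real train-and-score step):

```python
import random
from collections import Counter

def toy_accuracy(log_c):
    """Hypothetical stand-in for training and scoring a model."""
    return max(0.0, 1.0 - ((log_c + 2.0) / 20.0) ** 2)

def run_experiment(seed, iterations=25):
    """One random-search experiment: return the best accuracy it found."""
    rng = random.Random(seed)
    return max(toy_accuracy(rng.uniform(-15.0, 5.0)) for _ in range(iterations))

# Repeat the experiment with different seeds to get a distribution.
best_found = [run_experiment(seed) for seed in range(20)]

# Text histogram: bucket the best values to 2 decimal places.
histogram = Counter(round(value, 2) for value in best_found)
for bucket in sorted(histogram):
    print(f"{bucket:.2f} {'#' * histogram[bucket]}")
```

Each seed is one experiment of 25 iterations, and the histogram shows how much the best found value varies from run to run.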
10. Some Strategies Produce Better Results
[Figure: Distribution of Best Found Values over Experiments of 25 Iterations; x-axis: Maximum Accuracy; y-axis: Experiments; series: Random Search, Grid Search, Bayesian Optimization]
Use the Mann-Whitney U Test to compare distributions of maximum performance
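A simplified sketch of the test in plain Python, assuming made-up best-found accuracies for two hypothetical strategies. The p-value uses the normal approximation without tie or continuity corrections; a library routine such as `scipy.stats.mannwhitneyu` is the better choice in practice.

```python
import math

def mann_whitney_u(sample_a, sample_b):
    """U statistic for sample_a: pairs where a > b count 1, ties count 0.5."""
    u = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u

def two_sided_p_value(u, n_a, n_b):
    """Normal approximation to the U distribution (no tie correction)."""
    mean = n_a * n_b / 2.0
    std = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12.0)
    z = (u - mean) / std
    return math.erfc(abs(z) / math.sqrt(2.0))

# Made-up best-found accuracies, one value per experiment.
strategy_a = [0.962, 0.968, 0.971, 0.974, 0.977, 0.979]
strategy_b = [0.975, 0.981, 0.984, 0.986, 0.988, 0.990]

u_b = mann_whitney_u(strategy_b, strategy_a)
p_value = two_sided_p_value(u_b, len(strategy_b), len(strategy_a))
print(f"U = {u_b}, p = {p_value:.4f}")
```

The test compares ranks rather than means, so it does not assume the maximum accuracies are normally distributed.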
11. Some Strategies Produce Better Results, Faster
[Figure: Best Seen Trace; x-axis: Timestep; y-axis: Best Seen Accuracy]
How much time do you have for optimization?
Strategies that reliably produce better results faster can optimize the hyperparameters of your model in less time
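A best seen trace is a running maximum over the accuracies observed so far. A short sketch, using made-up accuracies from one hypothetical experiment:

```python
from itertools import accumulate

def best_seen_trace(accuracies):
    """Running maximum: the best accuracy observed up to each timestep."""
    return list(accumulate(accuracies, max))

# Made-up accuracies from one optimization experiment, in evaluation order.
observed = [0.41, 0.38, 0.72, 0.65, 0.90, 0.88, 0.93]
trace = best_seen_trace(observed)
print(trace)  # [0.41, 0.41, 0.72, 0.72, 0.9, 0.9, 0.93]
```

The trace never decreases, so its value at any timestep is the result you would keep if you stopped optimizing there.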
12. Some Strategies Produce Better Results, Faster
[Figure: Interquartile Range of Best Seen Traces; x-axis: Timestep; y-axis: Best Seen Accuracy]
Again, consider a distribution of optimization experiments
25th to 75th percentile of the performance our model could achieve if we stopped early
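A sketch of computing that band, assuming random search on a hypothetical objective (`toy_accuracy` is a stand-in for a real model) and 30 repeated experiments:

```python
import random
from itertools import accumulate
from statistics import quantiles

def toy_accuracy(log_c):
    """Hypothetical stand-in for training and scoring a model."""
    return max(0.0, 1.0 - ((log_c + 2.0) / 20.0) ** 2)

def random_search_trace(seed, timesteps=20):
    """Best seen trace of one random-search experiment."""
    rng = random.Random(seed)
    scores = [toy_accuracy(rng.uniform(-15.0, 5.0)) for _ in range(timesteps)]
    return list(accumulate(scores, max))

traces = [random_search_trace(seed) for seed in range(30)]

# For each timestep, take the 25th and 75th percentiles across experiments.
iqr_band = []
for step_values in zip(*traces):
    q1, _median, q3 = quantiles(step_values, n=4)
    iqr_band.append((q1, q3))

for timestep, (q1, q3) in enumerate(iqr_band):
    print(f"t={timestep:2d}  25th={q1:.3f}  75th={q3:.3f}")
```

Reading the band at a given timestep shows the middle half of outcomes you could expect if every experiment had stopped there.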
13. Some Strategies Produce Better Results, Faster
[Figure: Interquartile Ranges of Best Seen Traces; x-axis: Timestep; y-axis: Best Seen Accuracy; series: Grid Search, Random Search, Bayesian Optimization]
Compare the area under the curve of different strategies
Further reading at sigopt.com/research
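One simple way to compute the area under a best seen trace is the trapezoidal rule. A sketch with made-up traces for two hypothetical strategies that reach the same final accuracy:

```python
def area_under_trace(trace):
    """Trapezoidal area under a best seen trace with unit-spaced timesteps."""
    return sum((left + right) / 2.0 for left, right in zip(trace, trace[1:]))

# Made-up best seen traces; both end at 0.95.
fast_strategy = [0.50, 0.80, 0.90, 0.95, 0.95]
slow_strategy = [0.50, 0.55, 0.70, 0.85, 0.95]

area_fast = area_under_trace(fast_strategy)
area_slow = area_under_trace(slow_strategy)
print(f"fast: {area_fast:.3f}, slow: {area_slow:.3f}")
```

The final accuracies are identical, but the larger area rewards the strategy that got there sooner.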
14. Takeaways
Hyperparameter optimization is an invaluable part of any modern machine learning pipeline
Concepts like comparing hyperparameter optimization strategies are extremely abstract and difficult to understand
Visualizations are in their infancy, but are an important part of explaining these ideas