2nd edition
#MLSEV 2
Automating Model Selection
Taking (most of) the work out of model tuning
Charles Parker
VP Algorithms, BigML, Inc
#MLSEV 3
Machine Learning for Machine Learning
#MLSEV 4
Parameter Optimization
• There are lots of algorithms and lots of parameters
• We don’t have time to try even close to everything
• If only we had a way to make a prediction . . .
Did I hear someone say
Machine Learning?
#MLSEV 5
The Allure of ML
“Why don’t we just use
machine learning to predict
the quality of a set of
modeling parameters before
we train a model on them?”
— Every first year ML grad student ever
#MLSEV 6
Bayesian Parameter Optimization
• The performance of an ML algorithm (with associated parameters) is data
dependent
• So: Learn from your previous attempts
• Train a model, then evaluate it
• After you’ve done a number of evaluations, learn a regression model to predict the
performance of future, as-yet-untrained models
• Use this model to choose a promising set of “next models” to evaluate
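As a rough sketch of that loop: the version below is a greedy surrogate search over two made-up parameters (n_estimators and max_depth), using a random-forest regressor as the surrogate. It is illustrative only, not BigML's implementation, and it skips the acquisition-function machinery a full Bayesian optimizer would use.

```python
# Minimal sketch of surrogate-driven parameter search (illustrative only)
import random
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

def evaluate(params):
    """Train and evaluate one candidate parameter set (the expensive step)."""
    model = RandomForestClassifier(random_state=0, **params)
    return cross_val_score(model, X, y, cv=3).mean()

def to_features(params):
    """Encode a parameter dict as a numeric feature vector for the surrogate."""
    return [params["n_estimators"],
            -1 if params["max_depth"] is None else params["max_depth"]]

random.seed(0)
history = []                                 # (params, observed score) pairs so far
for step in range(15):
    # Propose a pool of random candidate parameter sets
    candidates = [{"n_estimators": random.choice([10, 50, 100, 200]),
                   "max_depth": random.choice([3, 5, 10, None])}
                  for _ in range(50)]
    if len(history) < 5:
        chosen = random.choice(candidates)   # too little history: just explore
    else:
        # Surrogate regression model: parameters -> predicted performance
        surrogate = RandomForestRegressor(random_state=0).fit(
            [to_features(p) for p, _ in history], [s for _, s in history])
        preds = surrogate.predict([to_features(c) for c in candidates])
        chosen = candidates[preds.argmax()]  # evaluate the most promising candidate
    history.append((chosen, evaluate(chosen)))

best_params, best_score = max(history, key=lambda h: h[1])
print(best_params, round(best_score, 3))
```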
#MLSEV 7
Bayesian Parameter Optimization
[Diagram: six candidate parameter sets (Parameters 1–6) queued up for the “Model and Evaluate” step]
#MLSEV 8
Bayesian Parameter Optimization
[Diagram: the first parameter set has been modeled and evaluated, scoring 0.75; the rest are still queued]
#MLSEV 9
Bayesian Parameter Optimization
[Diagram: two parameter sets evaluated so far, scoring 0.75 and 0.56]
#MLSEV 10
Bayesian Parameter Optimization
[Diagram: three parameter sets evaluated so far, scoring 0.75, 0.56, and 0.92]
#MLSEV 11
Bayesian Parameter Optimization
[Diagram: same build as the previous slide, with three evaluations completed (0.75, 0.56, 0.92)]
#MLSEV 12
Bayesian Parameter Optimization
[Diagram: the three completed evaluations (0.75, 0.56, 0.92) feed a machine learning model that maps parameters ⟶ performance. “Machine Learning!”]
#MLSEV 13
Bayesian Parameter Optimization
[Diagram: same build as the previous slide, showing the evaluations 0.75, 0.56, 0.92 and the learned parameters ⟶ performance model]
#MLSEV 14
Wow, Magic!
• So all of my problems are solved, right?
NO NO NO
• First, you’re selecting a model based on
held-out data, so you have to have
enough data to do an accurate model
selection
• Second, there are still important
remaining issues and possible ways to
screw up
#MLSEV 15
Remaining Issue #1: Metric Choice
#MLSEV 16
Driving The Search
• So how do we measure the performance of
each model, to figure out what to do next?
• If we choose the wrong metric, we’ll get
models that are the best at something that we
don’t really care about
• But there are so many metrics! How do we
choose the right one?
• Hmmmm, all of this sounds awfully familiar . . .
#MLSEV 17
Flashback #1
Accuracy = (TP + TN) / Total
• “Percentage correct” - like an exam
• If Accuracy = 1 then no mistakes
• If Accuracy = 0 then all mistakes
• Intuitive but not always useful
• Watch out for unbalanced classes!
• Remember, only 1 in 1000 have the disease
• A silly model which always predicts “well” is 99.9% accurate
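That arithmetic, checked with a tiny sketch (the labels below are the made-up 1-in-1000 disease example from above):

```python
# Accuracy of a model that always predicts "well" on a 1-in-1000 disease dataset
from sklearn.metrics import accuracy_score

y_true = ["sick"] * 1 + ["well"] * 999   # 1 in 1000 actually has the disease
y_pred = ["well"] * 1000                 # the "silly" model never predicts "sick"

print(accuracy_score(y_true, y_pred))    # 0.999, yet it catches zero sick patients
```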
#MLSEV 18
A Metric Selection Flowchart
[Flowchart: yes/no questions (Is yours a “ranking” problem? Do you care more about the top-ranked instances? Will you bother about threshold setting? Is your dataset imbalanced?) route you to a metric: Area Under the ROC / PR curve, Kendall’s Tau, Spearman’s Rho, Max. Phi, KS-statistic, Phi coefficient, F-measure, or Accuracy]
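For reference, nearly all of the metrics in the flowchart are available off the shelf in scikit-learn and SciPy. The sketch below uses synthetic labels and scores purely for illustration, and assumes the usual identification of the phi coefficient with the Matthews correlation coefficient for binary classes; it may not match BigML's exact definitions.

```python
# Computing the flowchart's metrics on synthetic data (illustrative only)
import numpy as np
from scipy.stats import kendalltau, spearmanr, ks_2samp
from sklearn.metrics import (accuracy_score, f1_score, matthews_corrcoef,
                             roc_auc_score, average_precision_score)

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=1000)            # binary labels
y_score = y_true * 0.3 + rng.random(1000) * 0.7   # noisy scores correlated with labels
y_pred = (y_score > 0.5).astype(int)              # thresholded predictions

print("Accuracy:       ", accuracy_score(y_true, y_pred))
print("F-measure:      ", f1_score(y_true, y_pred))
print("Phi coefficient:", matthews_corrcoef(y_true, y_pred))  # phi == Matthews corr.
print("ROC AUC:        ", roc_auc_score(y_true, y_score))
print("PR AUC:         ", average_precision_score(y_true, y_score))
print("Kendall's tau:  ", kendalltau(y_true, y_score)[0])
print("Spearman's rho: ", spearmanr(y_true, y_score)[0])
# KS statistic: separation between the score distributions of the two classes
print("KS statistic:   ", ks_2samp(y_score[y_true == 1], y_score[y_true == 0])[0])
```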
#MLSEV 19
Ranking Problems
Medical Diagnosis (no) vs. Stock Picking (yes)
#MLSEV 20
Remaining Issue #2: Holdout Choice
#MLSEV 21
Is Cross-Validation Right for You?
• Cross-validation is a good tool some of the time
• Many other times, it is disastrously bad (overly optimistic)
• This is why BigML offers the option for a specific holdout set.
• Should you use it?
#MLSEV 22
Flashback #2
• Okay, so I’m not testing on the training
data, so I’m good, right? NO NO NO
• You also have to worry about information
leakage between training and test data.
• What is this? Let’s try to predict the daily
closing price of the stock market
• What happens if you hold out 10 random
days from your dataset?
• What if you hold out the last 10 days?
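To make the two holdouts concrete, here is a small pandas sketch on made-up daily closing prices (the column names and data are hypothetical):

```python
# Random vs. time-based holdout on daily closing prices (hypothetical data)
import numpy as np
import pandas as pd

dates = pd.bdate_range("2019-01-01", periods=500)
df = pd.DataFrame({"date": dates,
                   "close": 100 + np.cumsum(np.random.default_rng(0).normal(size=500))})

# Leaky: 10 random days held out; the surrounding days in training leak their values
test_random = df.sample(n=10, random_state=0)
train_random = df.drop(test_random.index)

# Honest: hold out the last 10 days, mimicking how the model will actually be used
train_time, test_time = df.iloc[:-10], df.iloc[-10:]
```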
#MLSEV 23
Flashback #3
• This is common when you have time-distributed
data, but can also happen in other instances:
• Let’s say we have a dataset of 10,000 pictures
from 20 people, each labeled with the year in which
it was taken
• We want to predict the year from the image
• What happens if we hold out random data?
• Solution: Hold out users instead
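One way to hold out whole people rather than random images, sketched with scikit-learn's GroupShuffleSplit on synthetic stand-in data:

```python
# Hold out whole people (groups), not random images, to avoid person-level leakage
import numpy as np
from sklearn.model_selection import GroupShuffleSplit

rng = np.random.default_rng(0)
X = rng.random((10000, 64))                   # stand-in image features
y = rng.integers(2000, 2020, size=10000)      # year each picture was taken
person_ids = rng.integers(0, 20, size=10000)  # which of the 20 people took it

splitter = GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=0)
train_idx, test_idx = next(splitter.split(X, y, groups=person_ids))
X_train, X_test = X[train_idx], X[test_idx]
y_train, y_test = y[train_idx], y[test_idx]

# Every person's images land entirely in train or entirely in test
assert set(person_ids[train_idx]).isdisjoint(person_ids[test_idx])
```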
#MLSEV 24
Again, Take Care!
• These situations are very common whenever
data comes in groups (days, users, etc.)
• The solution is to hold out whole groups of data
• It’s possible that it isn’t a problem in your
dataset, but when in doubt, try both!
#MLSEV 25
Remaining Issue #3: Model Choice?
#MLSEV 26
Which Model is Best?
• Performance isn’t the only issue!
• Retraining: Will the amount of data you have be different in the future?
• Fit stability: How confident must you be that the model’s behavior is
invariant to small data changes?
• Prediction speed: The difference can be orders of magnitude
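To give a feel for that last point, a rough timing sketch on synthetic data; the actual gap depends heavily on the data, the model settings, and the hardware:

```python
# Rough comparison of prediction speed: linear model vs. a largish ensemble
import time
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=20000, n_features=50, random_state=0)

for model in (LogisticRegression(max_iter=1000),
              RandomForestClassifier(n_estimators=500, random_state=0)):
    model.fit(X, y)
    start = time.perf_counter()
    model.predict(X)
    print(type(model).__name__, f"{time.perf_counter() - start:.3f}s to predict")
```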
#MLSEV 27
Flashback #4
Amount of data required: Linear models < trees, ensembles < deep learning
Potential to overfit: Linear models < ensembles < trees, deep learning
Speed: Linear models, trees < ensembles < deep learning
Representational power: Linear models < trees < ensembles < deep learning
• How much data do you have?
• How fast do you need things to go?
• How much performance do you really need?
#MLSEV 28
Modeling Tradeoffs
[Diagram: a spectrum from Simple (Logistic) to Complex (Deepnets), trading Interpretability vs. Representability, Weak vs. Slow, Confidence vs. Performance, and Biased vs. Data-hungry]
#MLSEV 29
Summary
• We can do some simple tricks and use
machine learning to help us search
through the space of possible models
• Even with this, however, there is still
lots of work for the domain expert
• Automated model selection relies on
data. If you don’t have enough, it will
go poorly!