Visualizing the Model Selection Process

Benjamin Bengfort
Benjamin BengfortComputer and Data Scientist
Visualizing the
Model Selection
Process
Benjamin Bengfort
@bbengfort
District Data Labs
Abstract
Machine learning is the hacker art of describing the features of instances that we want to
make predictions about, then fitting the data that describes those instances to a model
form. Applied machine learning has come a long way from it's beginnings in academia, and
with tools like Scikit-Learn, it's easier than ever to generate operational models for a wide
variety of applications. Thanks to the ease and variety of the tools in Scikit-Learn, the
primary job of the data scientist is model selection. Model selection involves performing
feature engineering, hyperparameter tuning, and algorithm selection. These dimensions of
machine learning often lead computer scientists towards automatic model selection via
optimization (maximization) of a model's evaluation metric. However, the search space is
large, and grid search approaches to machine learning can easily lead to failure and
frustration. Human intuition is still essential to machine learning, and visual analysis in
concert with automatic methods can allow data scientists to steer model selection towards
better fitted models, faster. In this talk, we will discuss interactive visual methods for better
understanding, steering, and tuning machine learning models.
So I read about this
great ML model
Koren, Yehuda, Robert Bell, and Chris Volinsky. "Matrix factorization techniques for
recommender systems." Computer 42.8 (2009): 30-37.
def nnmf(R, k=2, steps=5000, alpha=0.0002, beta=0.02):
n, m = R.shape
P = np.random.rand(n,k)
Q = np.random.rand(m,k).T
for step in range(steps):
for idx in range(n):
for jdx in range(m):
if R[idx][jdx] > 0:
eij = R[idx][jdx] - np.dot(P[idx,:], Q[:,jdx])
for kdx in range(K):
P[idx][kdx] = P[idx][kdx] + alpha * (2 * eij * Q[kdx][jdx] - beta * P[idx][kdx])
Q[kdx][jdx] = Q[kdx][jdx] + alpha * (2 * eij * P[idx][kdx] - beta * Q[kdx][jdx])
e = 0
for idx in range(n):
for jdx in range(m):
if R[idx][jdx] > 0:
e += (R[idx][jdx] - np.dot(P[idx,:], Q[:,jdx])) ** 2
if e < 0.001:
break
return P, Q.T
Life with Scikit-Learn
from sklearn.decomposition import NMF
model = NMF(n_components=2, init='random', random_state=0)
model.fit(R)
from sklearn.decomposition import NMF, TruncatedSVD, PCA
models = [
NMF(n_components=2, init='random', random_state=0),
TruncatedSVD(n_components=2),
PCA(n_components=2),
]
for model in models:
model.fit(R)
So now I’m all
Made Possible by the Scikit-Learn API
Buitinck, Lars, et al. "API design for machine learning software: experiences from
the scikit-learn project." arXiv preprint arXiv:1309.0238 (2013).
class Estimator(object):
def fit(self, X, y=None):
"""
Fits estimator to data.
"""
# set state of self
return self
def predict(self, X):
"""
Predict response of X
"""
# compute predictions pred
return pred
class Transformer(Estimator):
def transform(self, X):
"""
Transforms the input data.
"""
# transform X to X_prime
return X_prime
class Pipeline(Transfomer):
@property
def named_steps(self):
"""
Returns a sequence of estimators
"""
return self.steps
@property
def _final_estimator(self):
"""
Terminating estimator
"""
return self.steps[-1]
Algorithm design
stays in the hands of
Academia
Wizardry When Applied
The Model Selection Triple
Arun Kumar http://bit.ly/2abVNrI
Feature Analysis
Algorithm Selection
Hyperparameter
Tuning
The Model Selection Triple
- Define a bounded, high
dimensional feature space
that can be effectively
modeled.
- Transform and manipulate
the space to make
modeling easier.
- Extract a feature
representation of each
instance in the space.
Feature Analysis
Algorithm Selection
The Model Selection Triple
- Select a model family that
best/correctly defines the
relationship between the
variables of interest.
- Define a model form that
specifies exactly how
features interact to make a
prediction.
- Train a fitted model by
optimizing internal
parameters to the data.
Hyperparameter
Tuning
The Model Selection Triple
- Evaluate how the model
form is interacting with the
feature space.
- Identify hyperparameters
(parameters that affect
training or the prior, not
prediction)
- Tune the fitting and
prediction process by
modifying these params.
Can it be automated?
Regularization is a form of automatic feature analysis.
X0
X1
X0
X1
L1 Normalization
Possibility that a feature is eliminated by setting its
coefficient equal to zero.
L2 Normalization
Features are kept balanced by minimizing the
relative change of coefficients during learning.
Automatic Model Selection Criteria
from sklearn.cross_validation import KFold
kfolds = KFold(n=len(X), n_folds=12)
scores = [
model.fit(
X[train], y[train]
).score(
X[test], y[test]
)
for train, test in kfolds
]
F1
R2
Automatic Model Selection: Try Them All!
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.ensemble import AdaBoostClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn import cross_validation as cv
classifiers = [
KNeighborsClassifier(5),
SVC(kernel="linear", C=0.025),
RandomForestClassifier(max_depth=5),
AdaBoostClassifier(),
GaussianNB(),
]
kfold = cv.KFold(len(X), n_folds=12)
max([
cv.cross_val_score(model, X, y, cv=kfold).mean
for model in classifiers
])
Automatic Model Selection: Search Param Space
from sklearn.feature_extraction.text import *
from sklearn.linear_model import SGDClassifier
from sklearn.grid_search import GridSearchCV
from sklearn.pipeline import Pipeline
pipeline = Pipeline([
('vect', CountVectorizer()),
('tfidf', TfidfTransformer()),
('model', SGDClassifier()),
])
parameters = {
'vect__max_df': (0.5, 0.75, 1.0),
'vect__max_features': (None, 5000, 10000),
'tfidf__use_idf': (True, False),
'tfidf__norm': ('l1', 'l2'),
'model__alpha': (0.00001, 0.000001),
'model__penalty': ('l2', 'elasticnet'),
}
search = GridSearchCV(pipeline, parameters)
search.fit(X, y)
Maybe not so Wizard?
Automatic Model Selection: Search?
Search is difficult particularly
in high dimensional space.
Even with techniques like
genetic algorithms or particle
swarm optimization, there is
no guarantee of a solution.
As the search space gets
larger, the amount of time
increases exponentially.
Anscombe, Francis J. "Graphs in statistical analysis."
The American Statistician 27.1 (1973): 17-21.
Anscombe’s Quartet
Through visualization
we can steer the model
selection process
Model Selection Management Systems
Kumar, Arun, et al. "Model selection management systems: The next frontier of
advanced analytics." ACM SIGMOD Record 44.4 (2016): 17-22.
Optimized Implementations
User Interfaces and DSLs
Model Selection Triples
{ {FE} x {AS} X {HT} }
Can we visualize
machine learning?
Data Management
Wrangling
Standardization
Normalization
Selection & Joins
Model Evaluation +
Hyperparameter Tuning
Model Selection
Feature Analysis
Linear
Models
Nearest
Neighbors
SVM
Ensemble Trees Bayes
Feature
Analysis
Feature
Selection
Model
Selection
Revisit
Features
Iterate!
Initial
Model
Model
Storage
Data and Model Management
Is “GitHub for Data” Enough?
Visualizing Feature Analysis
SPLOM (Scatterplot Matrices)
Seo, Jinwook, and Ben Shneiderman. "A rank-by-feature framework for interactive
exploration of multidimensional data." Information visualization 4.2 (2005): 96-113.
Visual Rank by Feature: 1 Dimension
Rank by:
1. Normality of distribution
(Shapiro-Wilk and
Kolmogorov-Smirnov)
2. Uniformity of distribution
(entropy)
3. Number of potential outliers
4. Number of hapaxes
5. Size of gap
Seo, Jinwook, and Ben Shneiderman. "A rank-by-feature framework for interactive
exploration of multidimensional data." Information visualization 4.2 (2005): 96-113.
Visual Rank by Feature: 1 Dimension
Rank by:
1. Normality of distribution
(Shapiro-Wilk and
Kolmogorov-Smirnov)
2. Uniformity of distribution
(entropy)
3. Number of potential outliers
4. Number of hapaxes
5. Size of gap
Seo, Jinwook, and Ben Shneiderman. "A rank-by-feature framework for interactive
exploration of multidimensional data." Information visualization 4.2 (2005): 96-113.
Visual Rank by Feature: 2 Dimensions
Rank by:
1. Correlation Coefficient
(Pearson, Spearman)
2. Least-squares error
3. Quadracity
4. Density based outlier
detection.
5. Uniformity (entropy of grids)
6. Number of items in the most
dense region of the plot.
Joint Plots: Diving Deeper after Rank by Feature
Special thanks to Seaborn for doing statistical visualization right!
Detecting Separablity
Radviz: Radial Visualization
Parallel Coordinates
Decomposition (PCA, SVD) of Feature Space
Visualizing Model Selection
Confusion Matrices
Receiver Operator Characteristic (ROC) and Area Under Curve (AUC)
Prediction Error Plots
Visualizing Residuals
Model Families vs. Model Forms vs. Fitted Models
Rebecca Bilbro http://bit.ly/2a1YoTs
kNN Tuning Slider in 2 Dimensions
Scott Fortmann-Roe http://bit.ly/29P4SS1
Visualizing Evaluation/Tuning
Cross Validation Curves
Visual Grid Search
Integrating Visual Model
Selection with Scikit-Learn
Yellowbrick
Scikit-Learn Pipelines: fit() and predict()
Data Loader
Transformer
Transformer
Estimator
Data Loader
Transformer
Transformer
Estimator
Transformer
Yellowbrick Visual Transformers
Data Loader
Transformer(s)
Feature
Visualization
Estimator
fit()
draw()
predict()
Data Loader
Transformer(s)
EstimatorCV
Evaluation
Visualization
fit()
predict()
score()
draw()
Model Selection Pipelines
Multi-Estimator
Visualization
Data Loader
Transformer(s)
EstimatorEstimatorEstimatorEstimator
Cross Validation Cross Validation Cross Validation Cross Validation
Employ Interactivity to Visualize More
Health and Wealth of Nations Recreated by Mike Bostock
Originally by Hans Rosling http://bit.ly/29RYBJD
Visual Analytics Mantra:
Overview First; Zoom & Filter; Details on Demand
Heer, Jeffrey, and Ben Shneiderman. "Interactive dynamics
for visual analysis." Queue 10.2 (2012): 30.
Codename Trinket
Visual Model Management System
Yellowbrick
http://bit.ly/2a5otxB
DDL Trinket
http://bit.ly/2a2Y0jy
DDL Open Source Projects on GitHub
Questions!
1 of 59

Recommended

Machine learning with scikitlearn by
Machine learning with scikitlearnMachine learning with scikitlearn
Machine learning with scikitlearnPratap Dangeti
3.9K views31 slides
Feature Selection in Machine Learning by
Feature Selection in Machine LearningFeature Selection in Machine Learning
Feature Selection in Machine LearningUpekha Vandebona
3K views47 slides
Model selection and cross validation techniques by
Model selection and cross validation techniquesModel selection and cross validation techniques
Model selection and cross validation techniquesVenkata Reddy Konasani
7.1K views113 slides
Feature selection by
Feature selectionFeature selection
Feature selectionDong Guo
14.2K views29 slides
Ensemble learning by
Ensemble learningEnsemble learning
Ensemble learningHaris Jamil
8.7K views14 slides
Performance Metrics for Machine Learning Algorithms by
Performance Metrics for Machine Learning AlgorithmsPerformance Metrics for Machine Learning Algorithms
Performance Metrics for Machine Learning AlgorithmsKush Kulshrestha
1.7K views35 slides

More Related Content

What's hot

An Introduction to Supervised Machine Learning and Pattern Classification: Th... by
An Introduction to Supervised Machine Learning and Pattern Classification: Th...An Introduction to Supervised Machine Learning and Pattern Classification: Th...
An Introduction to Supervised Machine Learning and Pattern Classification: Th...Sebastian Raschka
101.9K views55 slides
Support Vector Machine - How Support Vector Machine works | SVM in Machine Le... by
Support Vector Machine - How Support Vector Machine works | SVM in Machine Le...Support Vector Machine - How Support Vector Machine works | SVM in Machine Le...
Support Vector Machine - How Support Vector Machine works | SVM in Machine Le...Simplilearn
10.2K views70 slides
Feature selection by
Feature selectionFeature selection
Feature selectiondkpawar
945 views11 slides
Logistic Regression | Logistic Regression In Python | Machine Learning Algori... by
Logistic Regression | Logistic Regression In Python | Machine Learning Algori...Logistic Regression | Logistic Regression In Python | Machine Learning Algori...
Logistic Regression | Logistic Regression In Python | Machine Learning Algori...Simplilearn
4.2K views52 slides
Presentation on supervised learning by
Presentation on supervised learningPresentation on supervised learning
Presentation on supervised learningTonmoy Bhagawati
24.3K views25 slides
Hands on machine learning with scikit-learn and tensor flow by ahmed yousry by
Hands on machine learning with scikit-learn and tensor flow by ahmed yousryHands on machine learning with scikit-learn and tensor flow by ahmed yousry
Hands on machine learning with scikit-learn and tensor flow by ahmed yousryAhmed Yousry
301 views91 slides

What's hot(20)

An Introduction to Supervised Machine Learning and Pattern Classification: Th... by Sebastian Raschka
An Introduction to Supervised Machine Learning and Pattern Classification: Th...An Introduction to Supervised Machine Learning and Pattern Classification: Th...
An Introduction to Supervised Machine Learning and Pattern Classification: Th...
Sebastian Raschka101.9K views
Support Vector Machine - How Support Vector Machine works | SVM in Machine Le... by Simplilearn
Support Vector Machine - How Support Vector Machine works | SVM in Machine Le...Support Vector Machine - How Support Vector Machine works | SVM in Machine Le...
Support Vector Machine - How Support Vector Machine works | SVM in Machine Le...
Simplilearn10.2K views
Feature selection by dkpawar
Feature selectionFeature selection
Feature selection
dkpawar945 views
Logistic Regression | Logistic Regression In Python | Machine Learning Algori... by Simplilearn
Logistic Regression | Logistic Regression In Python | Machine Learning Algori...Logistic Regression | Logistic Regression In Python | Machine Learning Algori...
Logistic Regression | Logistic Regression In Python | Machine Learning Algori...
Simplilearn4.2K views
Presentation on supervised learning by Tonmoy Bhagawati
Presentation on supervised learningPresentation on supervised learning
Presentation on supervised learning
Tonmoy Bhagawati24.3K views
Hands on machine learning with scikit-learn and tensor flow by ahmed yousry by Ahmed Yousry
Hands on machine learning with scikit-learn and tensor flow by ahmed yousryHands on machine learning with scikit-learn and tensor flow by ahmed yousry
Hands on machine learning with scikit-learn and tensor flow by ahmed yousry
Ahmed Yousry301 views
Supervised and unsupervised learning by AmAn Singh
Supervised and unsupervised learningSupervised and unsupervised learning
Supervised and unsupervised learning
AmAn Singh323 views
supervised learning by Amar Tripathi
supervised learningsupervised learning
supervised learning
Amar Tripathi25.1K views
Machine Learning Tutorial Part - 2 | Machine Learning Tutorial For Beginners ... by Simplilearn
Machine Learning Tutorial Part - 2 | Machine Learning Tutorial For Beginners ...Machine Learning Tutorial Part - 2 | Machine Learning Tutorial For Beginners ...
Machine Learning Tutorial Part - 2 | Machine Learning Tutorial For Beginners ...
Simplilearn1.3K views
Hands-On Machine Learning with Scikit-Learn and TensorFlow - Chapter8 by Hakky St
Hands-On Machine Learning with Scikit-Learn and TensorFlow - Chapter8Hands-On Machine Learning with Scikit-Learn and TensorFlow - Chapter8
Hands-On Machine Learning with Scikit-Learn and TensorFlow - Chapter8
Hakky St3.9K views
Support Vector Machines by CloudxLab
Support Vector MachinesSupport Vector Machines
Support Vector Machines
CloudxLab866 views
Support Vector Machines by nextlib
Support Vector MachinesSupport Vector Machines
Support Vector Machines
nextlib19.9K views
A Beginner's Guide to Machine Learning with Scikit-Learn by Sarah Guido
A Beginner's Guide to Machine Learning with Scikit-LearnA Beginner's Guide to Machine Learning with Scikit-Learn
A Beginner's Guide to Machine Learning with Scikit-Learn
Sarah Guido19.6K views
Feature selection concepts and methods by Reza Ramezani
Feature selection concepts and methodsFeature selection concepts and methods
Feature selection concepts and methods
Reza Ramezani9.1K views
Boosting Approach to Solving Machine Learning Problems by Dr Sulaimon Afolabi
Boosting Approach to Solving Machine Learning ProblemsBoosting Approach to Solving Machine Learning Problems
Boosting Approach to Solving Machine Learning Problems
Dr Sulaimon Afolabi1.3K views
K mean-clustering algorithm by parry prabhu
K mean-clustering algorithmK mean-clustering algorithm
K mean-clustering algorithm
parry prabhu50.6K views
Hyperparameter Tuning by Jon Lederman
Hyperparameter TuningHyperparameter Tuning
Hyperparameter Tuning
Jon Lederman2.9K views
Confusion Matrix by Rajat Gupta
Confusion MatrixConfusion Matrix
Confusion Matrix
Rajat Gupta743 views

Viewers also liked

Dynamics in graph analysis (PyData Carolinas 2016) by
Dynamics in graph analysis (PyData Carolinas 2016)Dynamics in graph analysis (PyData Carolinas 2016)
Dynamics in graph analysis (PyData Carolinas 2016)Benjamin Bengfort
1.5K views41 slides
Visualizing Model Selection with Scikit-Yellowbrick: An Introduction to Devel... by
Visualizing Model Selection with Scikit-Yellowbrick: An Introduction to Devel...Visualizing Model Selection with Scikit-Yellowbrick: An Introduction to Devel...
Visualizing Model Selection with Scikit-Yellowbrick: An Introduction to Devel...Benjamin Bengfort
4.8K views60 slides
Data Product Architectures by
Data Product ArchitecturesData Product Architectures
Data Product ArchitecturesBenjamin Bengfort
5.2K views39 slides
An Interactive Visual Analytics Dashboard for the Employment Situation Report by
An Interactive Visual Analytics Dashboard for the Employment Situation ReportAn Interactive Visual Analytics Dashboard for the Employment Situation Report
An Interactive Visual Analytics Dashboard for the Employment Situation ReportBenjamin Bengfort
1.4K views27 slides
Fast Data Analytics with Spark and Python by
Fast Data Analytics with Spark and PythonFast Data Analytics with Spark and Python
Fast Data Analytics with Spark and PythonBenjamin Bengfort
30.3K views75 slides
A Primer on Entity Resolution by
A Primer on Entity ResolutionA Primer on Entity Resolution
A Primer on Entity ResolutionBenjamin Bengfort
14.4K views58 slides

Viewers also liked(20)

Dynamics in graph analysis (PyData Carolinas 2016) by Benjamin Bengfort
Dynamics in graph analysis (PyData Carolinas 2016)Dynamics in graph analysis (PyData Carolinas 2016)
Dynamics in graph analysis (PyData Carolinas 2016)
Benjamin Bengfort1.5K views
Visualizing Model Selection with Scikit-Yellowbrick: An Introduction to Devel... by Benjamin Bengfort
Visualizing Model Selection with Scikit-Yellowbrick: An Introduction to Devel...Visualizing Model Selection with Scikit-Yellowbrick: An Introduction to Devel...
Visualizing Model Selection with Scikit-Yellowbrick: An Introduction to Devel...
Benjamin Bengfort4.8K views
An Interactive Visual Analytics Dashboard for the Employment Situation Report by Benjamin Bengfort
An Interactive Visual Analytics Dashboard for the Employment Situation ReportAn Interactive Visual Analytics Dashboard for the Employment Situation Report
An Interactive Visual Analytics Dashboard for the Employment Situation Report
Benjamin Bengfort1.4K views
Fast Data Analytics with Spark and Python by Benjamin Bengfort
Fast Data Analytics with Spark and PythonFast Data Analytics with Spark and Python
Fast Data Analytics with Spark and Python
Benjamin Bengfort30.3K views
Building Data Products with Python (Georgetown) by Benjamin Bengfort
Building Data Products with Python (Georgetown)Building Data Products with Python (Georgetown)
Building Data Products with Python (Georgetown)
Benjamin Bengfort5.4K views
Beginners Guide to Non-Negative Matrix Factorization by Benjamin Bengfort
Beginners Guide to Non-Negative Matrix FactorizationBeginners Guide to Non-Negative Matrix Factorization
Beginners Guide to Non-Negative Matrix Factorization
Benjamin Bengfort33.2K views
Lecture7 xing fei-fei by Tianlu Wang
Lecture7 xing fei-feiLecture7 xing fei-fei
Lecture7 xing fei-fei
Tianlu Wang797 views
Visualizing Threats: Network Visualization for Cyber Security by Cambridge Intelligence
Visualizing Threats: Network Visualization for Cyber SecurityVisualizing Threats: Network Visualization for Cyber Security
Visualizing Threats: Network Visualization for Cyber Security
Graph Based Machine Learning on Relational Data by Benjamin Bengfort
Graph Based Machine Learning on Relational DataGraph Based Machine Learning on Relational Data
Graph Based Machine Learning on Relational Data
Benjamin Bengfort2.2K views
Plotcon 2016 Visualization Talk by Alexandra Johnson by SigOpt
Plotcon 2016 Visualization Talk  by Alexandra JohnsonPlotcon 2016 Visualization Talk  by Alexandra Johnson
Plotcon 2016 Visualization Talk by Alexandra Johnson
SigOpt1.8K views
Evolutionary Design of Swarms (SSCI 2014) by Benjamin Bengfort
Evolutionary Design of Swarms (SSCI 2014)Evolutionary Design of Swarms (SSCI 2014)
Evolutionary Design of Swarms (SSCI 2014)
Benjamin Bengfort1.1K views
Visualization and Theories of Learning in Education by Liz Dorland
Visualization and Theories of Learning in EducationVisualization and Theories of Learning in Education
Visualization and Theories of Learning in Education
Liz Dorland3.5K views
NetworkX - python graph analysis and visualization @ PyHug by Jimmy Lai
NetworkX - python graph analysis and visualization @ PyHugNetworkX - python graph analysis and visualization @ PyHug
NetworkX - python graph analysis and visualization @ PyHug
Jimmy Lai2.5K views
Networkx & Gephi Tutorial #Pydata NYC by Gilad Lotan
Networkx & Gephi Tutorial #Pydata NYCNetworkx & Gephi Tutorial #Pydata NYC
Networkx & Gephi Tutorial #Pydata NYC
Gilad Lotan18.4K views

Similar to Visualizing the Model Selection Process

Visual diagnostics for more effective machine learning by
Visual diagnostics for more effective machine learningVisual diagnostics for more effective machine learning
Visual diagnostics for more effective machine learningBenjamin Bengfort
3.5K views55 slides
EE660_Report_YaxinLiu_8448347171 by
EE660_Report_YaxinLiu_8448347171EE660_Report_YaxinLiu_8448347171
EE660_Report_YaxinLiu_8448347171Yaxin Liu
441 views9 slides
Learning with Relative Attributes by
Learning with Relative AttributesLearning with Relative Attributes
Learning with Relative AttributesVikas Jain
99 views9 slides
Deep learning for molecules, introduction to chainer chemistry by
Deep learning for molecules, introduction to chainer chemistryDeep learning for molecules, introduction to chainer chemistry
Deep learning for molecules, introduction to chainer chemistryKenta Oono
4K views61 slides
Machine Learning Model Bakeoff by
Machine Learning Model BakeoffMachine Learning Model Bakeoff
Machine Learning Model Bakeoffmrphilroth
1.4K views48 slides
Keynote at IWLS 2017 by
Keynote at IWLS 2017Keynote at IWLS 2017
Keynote at IWLS 2017Manish Pandey
743 views40 slides

Similar to Visualizing the Model Selection Process(20)

Visual diagnostics for more effective machine learning by Benjamin Bengfort
Visual diagnostics for more effective machine learningVisual diagnostics for more effective machine learning
Visual diagnostics for more effective machine learning
Benjamin Bengfort3.5K views
EE660_Report_YaxinLiu_8448347171 by Yaxin Liu
EE660_Report_YaxinLiu_8448347171EE660_Report_YaxinLiu_8448347171
EE660_Report_YaxinLiu_8448347171
Yaxin Liu441 views
Learning with Relative Attributes by Vikas Jain
Learning with Relative AttributesLearning with Relative Attributes
Learning with Relative Attributes
Vikas Jain99 views
Deep learning for molecules, introduction to chainer chemistry by Kenta Oono
Deep learning for molecules, introduction to chainer chemistryDeep learning for molecules, introduction to chainer chemistry
Deep learning for molecules, introduction to chainer chemistry
Kenta Oono4K views
Machine Learning Model Bakeoff by mrphilroth
Machine Learning Model BakeoffMachine Learning Model Bakeoff
Machine Learning Model Bakeoff
mrphilroth1.4K views
Machine Learning: Classification Concepts (Part 1) by Daniel Chan
Machine Learning: Classification Concepts (Part 1)Machine Learning: Classification Concepts (Part 1)
Machine Learning: Classification Concepts (Part 1)
Daniel Chan168 views
Speaker Diarization by HONGJOO LEE
Speaker DiarizationSpeaker Diarization
Speaker Diarization
HONGJOO LEE3.1K views
Dimension reduction techniques[Feature Selection] by AAKANKSHA JAIN
Dimension reduction techniques[Feature Selection]Dimension reduction techniques[Feature Selection]
Dimension reduction techniques[Feature Selection]
AAKANKSHA JAIN274 views
Avihu Efrat's Viola and Jones face detection slides by wolf
Avihu Efrat's Viola and Jones face detection slidesAvihu Efrat's Viola and Jones face detection slides
Avihu Efrat's Viola and Jones face detection slides
wolf11K views
Learning Predictive Modeling with TSA and Kaggle by Yvonne K. Matos
Learning Predictive Modeling with TSA and KaggleLearning Predictive Modeling with TSA and Kaggle
Learning Predictive Modeling with TSA and Kaggle
Yvonne K. Matos107 views
Leveraging R in Big Data of Mobile Ads (R在行動廣告大數據的應用) by Craig Chao
Leveraging R in Big Data of Mobile Ads (R在行動廣告大數據的應用)Leveraging R in Big Data of Mobile Ads (R在行動廣告大數據的應用)
Leveraging R in Big Data of Mobile Ads (R在行動廣告大數據的應用)
Craig Chao5.4K views
A simple framework for contrastive learning of visual representations by Devansh16
A simple framework for contrastive learning of visual representationsA simple framework for contrastive learning of visual representations
A simple framework for contrastive learning of visual representations
Devansh16409 views
I am working on this code for my project- but the accuracy is 0-951601.docx by RyanEAcTuckern
I am working on this code for my project- but the accuracy is 0-951601.docxI am working on this code for my project- but the accuracy is 0-951601.docx
I am working on this code for my project- but the accuracy is 0-951601.docx
RyanEAcTuckern8 views
Mini-lab 1: Stochastic Gradient Descent classifier, Optimizing Logistic Regre... by Yao Yao
Mini-lab 1: Stochastic Gradient Descent classifier, Optimizing Logistic Regre...Mini-lab 1: Stochastic Gradient Descent classifier, Optimizing Logistic Regre...
Mini-lab 1: Stochastic Gradient Descent classifier, Optimizing Logistic Regre...
Yao Yao160 views

More from Benjamin Bengfort

Getting Started with TRISA by
Getting Started with TRISAGetting Started with TRISA
Getting Started with TRISABenjamin Bengfort
942 views77 slides
Introduction to Machine Learning with SciKit-Learn by
Introduction to Machine Learning with SciKit-LearnIntroduction to Machine Learning with SciKit-Learn
Introduction to Machine Learning with SciKit-LearnBenjamin Bengfort
17.6K views78 slides
An Overview of Spanner: Google's Globally Distributed Database by
An Overview of Spanner: Google's Globally Distributed DatabaseAn Overview of Spanner: Google's Globally Distributed Database
An Overview of Spanner: Google's Globally Distributed DatabaseBenjamin Bengfort
8.6K views23 slides
Graph Analyses with Python and NetworkX by
Graph Analyses with Python and NetworkXGraph Analyses with Python and NetworkX
Graph Analyses with Python and NetworkXBenjamin Bengfort
21.5K views92 slides
Natural Language Processing with Python by
Natural Language Processing with PythonNatural Language Processing with Python
Natural Language Processing with PythonBenjamin Bengfort
10.4K views63 slides
Building Data Apps with Python by
Building Data Apps with PythonBuilding Data Apps with Python
Building Data Apps with PythonBenjamin Bengfort
2.9K views25 slides

More from Benjamin Bengfort(6)

Introduction to Machine Learning with SciKit-Learn by Benjamin Bengfort
Introduction to Machine Learning with SciKit-LearnIntroduction to Machine Learning with SciKit-Learn
Introduction to Machine Learning with SciKit-Learn
Benjamin Bengfort17.6K views
An Overview of Spanner: Google's Globally Distributed Database by Benjamin Bengfort
An Overview of Spanner: Google's Globally Distributed DatabaseAn Overview of Spanner: Google's Globally Distributed Database
An Overview of Spanner: Google's Globally Distributed Database
Benjamin Bengfort8.6K views
Graph Analyses with Python and NetworkX by Benjamin Bengfort
Graph Analyses with Python and NetworkXGraph Analyses with Python and NetworkX
Graph Analyses with Python and NetworkX
Benjamin Bengfort21.5K views
Natural Language Processing with Python by Benjamin Bengfort
Natural Language Processing with PythonNatural Language Processing with Python
Natural Language Processing with Python
Benjamin Bengfort10.4K views

Recently uploaded

Presentation on experimental laboratory animal- Hamster by
Presentation on experimental laboratory animal- HamsterPresentation on experimental laboratory animal- Hamster
Presentation on experimental laboratory animal- HamsterKanika13641
6 views8 slides
XUE: Molecular Inventory in the Inner Region of an Extremely Irradiated Proto... by
XUE: Molecular Inventory in the Inner Region of an Extremely Irradiated Proto...XUE: Molecular Inventory in the Inner Region of an Extremely Irradiated Proto...
XUE: Molecular Inventory in the Inner Region of an Extremely Irradiated Proto...Sérgio Sacani
787 views14 slides
ZEBRA FISH: as model organism.pptx by
ZEBRA FISH: as model organism.pptxZEBRA FISH: as model organism.pptx
ZEBRA FISH: as model organism.pptxmahimachoudhary0807
14 views17 slides
IMMUNODIAGNOSTICS KITS.pdf by
IMMUNODIAGNOSTICS KITS.pdfIMMUNODIAGNOSTICS KITS.pdf
IMMUNODIAGNOSTICS KITS.pdfvetrivel303632
31 views10 slides
Exploring the nature and synchronicity of early cluster formation in the Larg... by
Exploring the nature and synchronicity of early cluster formation in the Larg...Exploring the nature and synchronicity of early cluster formation in the Larg...
Exploring the nature and synchronicity of early cluster formation in the Larg...Sérgio Sacani
1.5K views12 slides
selection of preformed arch wires during the alignment stage of preadjusted o... by
selection of preformed arch wires during the alignment stage of preadjusted o...selection of preformed arch wires during the alignment stage of preadjusted o...
selection of preformed arch wires during the alignment stage of preadjusted o...MaherFouda1
8 views100 slides

Recently uploaded(20)

Presentation on experimental laboratory animal- Hamster by Kanika13641
Presentation on experimental laboratory animal- HamsterPresentation on experimental laboratory animal- Hamster
Presentation on experimental laboratory animal- Hamster
Kanika136416 views
XUE: Molecular Inventory in the Inner Region of an Extremely Irradiated Proto... by Sérgio Sacani
XUE: Molecular Inventory in the Inner Region of an Extremely Irradiated Proto...XUE: Molecular Inventory in the Inner Region of an Extremely Irradiated Proto...
XUE: Molecular Inventory in the Inner Region of an Extremely Irradiated Proto...
Sérgio Sacani787 views
Exploring the nature and synchronicity of early cluster formation in the Larg... by Sérgio Sacani
Exploring the nature and synchronicity of early cluster formation in the Larg...Exploring the nature and synchronicity of early cluster formation in the Larg...
Exploring the nature and synchronicity of early cluster formation in the Larg...
Sérgio Sacani1.5K views
selection of preformed arch wires during the alignment stage of preadjusted o... by MaherFouda1
selection of preformed arch wires during the alignment stage of preadjusted o...selection of preformed arch wires during the alignment stage of preadjusted o...
selection of preformed arch wires during the alignment stage of preadjusted o...
MaherFouda18 views
DNA manipulation Enzymes 2.pdf by NetHelix
DNA manipulation Enzymes 2.pdfDNA manipulation Enzymes 2.pdf
DNA manipulation Enzymes 2.pdf
NetHelix6 views
Real Science Radio - Dr Paul Homan Climate Change.pptx by Fred Williams
Real Science Radio - Dr Paul Homan Climate Change.pptxReal Science Radio - Dr Paul Homan Climate Change.pptx
Real Science Radio - Dr Paul Homan Climate Change.pptx
Fred Williams8 views
Heavy Tails Workshop NeurIPS2023.pdf by Charles Martin
Heavy Tails Workshop NeurIPS2023.pdfHeavy Tails Workshop NeurIPS2023.pdf
Heavy Tails Workshop NeurIPS2023.pdf
Charles Martin41 views
Gel Filtration or Permeation Chromatography by Poonam Aher Patil
Gel Filtration or Permeation ChromatographyGel Filtration or Permeation Chromatography
Gel Filtration or Permeation Chromatography
Evaluation and Standardization of the Marketed Polyherbal drug Patanjali Divy... by Anmol Vishnu Gupta
Evaluation and Standardization of the Marketed Polyherbal drug Patanjali Divy...Evaluation and Standardization of the Marketed Polyherbal drug Patanjali Divy...
Evaluation and Standardization of the Marketed Polyherbal drug Patanjali Divy...
ELECTRON TRANSPORT CHAIN by DEEKSHA RANI
ELECTRON TRANSPORT CHAINELECTRON TRANSPORT CHAIN
ELECTRON TRANSPORT CHAIN
DEEKSHA RANI18 views
Towards Error-Corrected Quantum Computing with Neutral Atoms by Yuval Boger
Towards Error-Corrected Quantum Computing with Neutral AtomsTowards Error-Corrected Quantum Computing with Neutral Atoms
Towards Error-Corrected Quantum Computing with Neutral Atoms
Yuval Boger5 views
Effect of Integrated Nutrient Management on Growth and Yield of Solanaceous F... by SwagatBehera9
Effect of Integrated Nutrient Management on Growth and Yield of Solanaceous F...Effect of Integrated Nutrient Management on Growth and Yield of Solanaceous F...
Effect of Integrated Nutrient Management on Growth and Yield of Solanaceous F...
SwagatBehera96 views
Factors affecting fluorescence and phosphorescence.pptx by SamarthGiri1
Factors affecting fluorescence and phosphorescence.pptxFactors affecting fluorescence and phosphorescence.pptx
Factors affecting fluorescence and phosphorescence.pptx
SamarthGiri19 views
Micelle Drug Delivery System (Nanotechnology).pptx by ANANYA KUMAR
Micelle Drug Delivery System (Nanotechnology).pptxMicelle Drug Delivery System (Nanotechnology).pptx
Micelle Drug Delivery System (Nanotechnology).pptx
ANANYA KUMAR5 views
INTRODUCTION TO PLANT SYSTEMATICS.pptx by RASHMI M G
INTRODUCTION TO PLANT SYSTEMATICS.pptxINTRODUCTION TO PLANT SYSTEMATICS.pptx
INTRODUCTION TO PLANT SYSTEMATICS.pptx
RASHMI M G 5 views

Visualizing the Model Selection Process

  • 1. Visualizing the Model Selection Process Benjamin Bengfort @bbengfort District Data Labs
  • 2. Abstract Machine learning is the hacker art of describing the features of instances that we want to make predictions about, then fitting the data that describes those instances to a model form. Applied machine learning has come a long way from it's beginnings in academia, and with tools like Scikit-Learn, it's easier than ever to generate operational models for a wide variety of applications. Thanks to the ease and variety of the tools in Scikit-Learn, the primary job of the data scientist is model selection. Model selection involves performing feature engineering, hyperparameter tuning, and algorithm selection. These dimensions of machine learning often lead computer scientists towards automatic model selection via optimization (maximization) of a model's evaluation metric. However, the search space is large, and grid search approaches to machine learning can easily lead to failure and frustration. Human intuition is still essential to machine learning, and visual analysis in concert with automatic methods can allow data scientists to steer model selection towards better fitted models, faster. In this talk, we will discuss interactive visual methods for better understanding, steering, and tuning machine learning models.
  • 3. So I read about this great ML model
  • 4. Koren, Yehuda, Robert Bell, and Chris Volinsky. "Matrix factorization techniques for recommender systems." Computer 42.8 (2009): 30-37.
  • 5. def nnmf(R, k=2, steps=5000, alpha=0.0002, beta=0.02): n, m = R.shape P = np.random.rand(n,k) Q = np.random.rand(m,k).T for step in range(steps): for idx in range(n): for jdx in range(m): if R[idx][jdx] > 0: eij = R[idx][jdx] - np.dot(P[idx,:], Q[:,jdx]) for kdx in range(K): P[idx][kdx] = P[idx][kdx] + alpha * (2 * eij * Q[kdx][jdx] - beta * P[idx][kdx]) Q[kdx][jdx] = Q[kdx][jdx] + alpha * (2 * eij * P[idx][kdx] - beta * Q[kdx][jdx]) e = 0 for idx in range(n): for jdx in range(m): if R[idx][jdx] > 0: e += (R[idx][jdx] - np.dot(P[idx,:], Q[:,jdx])) ** 2 if e < 0.001: break return P, Q.T
  • 7. from sklearn.decomposition import NMF model = NMF(n_components=2, init='random', random_state=0) model.fit(R)
  • 8. from sklearn.decomposition import NMF, TruncatedSVD, PCA models = [ NMF(n_components=2, init='random', random_state=0), TruncatedSVD(n_components=2), PCA(n_components=2), ] for model in models: model.fit(R)
  • 10. Made Possible by the Scikit-Learn API Buitinck, Lars, et al. "API design for machine learning software: experiences from the scikit-learn project." arXiv preprint arXiv:1309.0238 (2013). class Estimator(object): def fit(self, X, y=None): """ Fits estimator to data. """ # set state of self return self def predict(self, X): """ Predict response of X """ # compute predictions pred return pred class Transformer(Estimator): def transform(self, X): """ Transforms the input data. """ # transform X to X_prime return X_prime class Pipeline(Transfomer): @property def named_steps(self): """ Returns a sequence of estimators """ return self.steps @property def _final_estimator(self): """ Terminating estimator """ return self.steps[-1]
  • 11. Algorithm design stays in the hands of Academia
  • 13. The Model Selection Triple Arun Kumar http://bit.ly/2abVNrI Feature Analysis Algorithm Selection Hyperparameter Tuning
  • 14. The Model Selection Triple - Define a bounded, high dimensional feature space that can be effectively modeled. - Transform and manipulate the space to make modeling easier. - Extract a feature representation of each instance in the space. Feature Analysis
  • 15. Algorithm Selection The Model Selection Triple - Select a model family that best/correctly defines the relationship between the variables of interest. - Define a model form that specifies exactly how features interact to make a prediction. - Train a fitted model by optimizing internal parameters to the data.
  • 16. Hyperparameter Tuning The Model Selection Triple - Evaluate how the model form is interacting with the feature space. - Identify hyperparameters (parameters that affect training or the prior, not prediction) - Tune the fitting and prediction process by modifying these params.
  • 17. Can it be automated?
  • 18. Regularization is a form of automatic feature analysis. X0 X1 X0 X1 L1 Normalization Possibility that a feature is eliminated by setting its coefficient equal to zero. L2 Normalization Features are kept balanced by minimizing the relative change of coefficients during learning.
  • 19. Automatic Model Selection Criteria from sklearn.cross_validation import KFold kfolds = KFold(n=len(X), n_folds=12) scores = [ model.fit( X[train], y[train] ).score( X[test], y[test] ) for train, test in kfolds ] F1 R2
  • 20. Automatic Model Selection: Try Them All! from sklearn.svm import SVC from sklearn.neighbors import KNeighborsClassifier from sklearn.ensemble import RandomForestClassifier from sklearn.ensemble import AdaBoostClassifier from sklearn.naive_bayes import GaussianNB from sklearn import cross_validation as cv classifiers = [ KNeighborsClassifier(5), SVC(kernel="linear", C=0.025), RandomForestClassifier(max_depth=5), AdaBoostClassifier(), GaussianNB(), ] kfold = cv.KFold(len(X), n_folds=12) max([ cv.cross_val_score(model, X, y, cv=kfold).mean for model in classifiers ])
  • 21. Automatic Model Selection: Search Param Space from sklearn.feature_extraction.text import * from sklearn.linear_model import SGDClassifier from sklearn.grid_search import GridSearchCV from sklearn.pipeline import Pipeline pipeline = Pipeline([ ('vect', CountVectorizer()), ('tfidf', TfidfTransformer()), ('model', SGDClassifier()), ]) parameters = { 'vect__max_df': (0.5, 0.75, 1.0), 'vect__max_features': (None, 5000, 10000), 'tfidf__use_idf': (True, False), 'tfidf__norm': ('l1', 'l2'), 'model__alpha': (0.00001, 0.000001), 'model__penalty': ('l2', 'elasticnet'), } search = GridSearchCV(pipeline, parameters) search.fit(X, y)
  • 22. Maybe not so Wizard?
  • 23. Automatic Model Selection: Search? Search is difficult particularly in high dimensional space. Even with techniques like genetic algorithms or particle swarm optimization, there is no guarantee of a solution. As the search space gets larger, the amount of time increases exponentially.
  • 24. Anscombe, Francis J. "Graphs in statistical analysis." The American Statistician 27.1 (1973): 17-21. Anscombe’s Quartet
  • 25. Through visualization we can steer the model selection process
  • 26. Model Selection Management Systems Kumar, Arun, et al. "Model selection management systems: The next frontier of advanced analytics." ACM SIGMOD Record 44.4 (2016): 17-22. Optimized Implementations User Interfaces and DSLs Model Selection Triples { {FE} x {AS} X {HT} }
  • 28. Data Management Wrangling Standardization Normalization Selection & Joins Model Evaluation + Hyperparameter Tuning Model Selection Feature Analysis Linear Models Nearest Neighbors SVM Ensemble Trees Bayes Feature Analysis Feature Selection Model Selection Revisit Features Iterate! Initial Model Model Storage
  • 29. Data and Model Management
  • 30. Is “GitHub for Data” Enough?
  • 33. Seo, Jinwook, and Ben Shneiderman. "A rank-by-feature framework for interactive exploration of multidimensional data." Information visualization 4.2 (2005): 96-113. Visual Rank by Feature: 1 Dimension Rank by: 1. Normality of distribution (Shapiro-Wilk and Kolmogorov-Smirnov) 2. Uniformity of distribution (entropy) 3. Number of potential outliers 4. Number of hapaxes 5. Size of gap
  • 34. Seo, Jinwook, and Ben Shneiderman. "A rank-by-feature framework for interactive exploration of multidimensional data." Information visualization 4.2 (2005): 96-113. Visual Rank by Feature: 1 Dimension Rank by: 1. Normality of distribution (Shapiro-Wilk and Kolmogorov-Smirnov) 2. Uniformity of distribution (entropy) 3. Number of potential outliers 4. Number of hapaxes 5. Size of gap
  • 35. Seo, Jinwook, and Ben Shneiderman. "A rank-by-feature framework for interactive exploration of multidimensional data." Information visualization 4.2 (2005): 96-113. Visual Rank by Feature: 2 Dimensions Rank by: 1. Correlation Coefficient (Pearson, Spearman) 2. Least-squares error 3. Quadracity 4. Density based outlier detection. 5. Uniformity (entropy of grids) 6. Number of items in the most dense region of the plot.
  • 36. Joint Plots: Diving Deeper after Rank by Feature Special thanks to Seaborn for doing statistical visualization right!
  • 40. Decomposition (PCA, SVD) of Feature Space
  • 43. Receiver Operator Characteristic (ROC) and Area Under Curve (AUC)
  • 46. Model Families vs. Model Forms vs. Fitted Models Rebecca Bilbro http://bit.ly/2a1YoTs
  • 47. kNN Tuning Slider in 2 Dimensions Scott Fortmann-Roe http://bit.ly/29P4SS1
  • 51. Integrating Visual Model Selection with Scikit-Learn Yellowbrick
  • 52. Scikit-Learn Pipelines: fit() and predict() Data Loader Transformer Transformer Estimator Data Loader Transformer Transformer Estimator Transformer
  • 53. Yellowbrick Visual Transformers Data Loader Transformer(s) Feature Visualization Estimator fit() draw() predict() Data Loader Transformer(s) EstimatorCV Evaluation Visualization fit() predict() score() draw()
  • 54. Model Selection Pipelines Multi-Estimator Visualization Data Loader Transformer(s) EstimatorEstimatorEstimatorEstimator Cross Validation Cross Validation Cross Validation Cross Validation
  • 55. Employ Interactivity to Visualize More Health and Wealth of Nations Recreated by Mike Bostock Originally by Hans Rosling http://bit.ly/29RYBJD
  • 56. Visual Analytics Mantra: Overview First; Zoom & Filter; Details on Demand Heer, Jeffrey, and Ben Shneiderman. "Interactive dynamics for visual analysis." Queue 10.2 (2012): 30.
  • 57. Codename Trinket Visual Model Management System