SlideShare a Scribd company logo
1 of 59
Download to read offline
Visualizing the
Model Selection
Process
Benjamin Bengfort
@bbengfort
District Data Labs
Abstract
Machine learning is the hacker art of describing the features of instances that we want to
make predictions about, then fitting the data that describes those instances to a model
form. Applied machine learning has come a long way from it's beginnings in academia, and
with tools like Scikit-Learn, it's easier than ever to generate operational models for a wide
variety of applications. Thanks to the ease and variety of the tools in Scikit-Learn, the
primary job of the data scientist is model selection. Model selection involves performing
feature engineering, hyperparameter tuning, and algorithm selection. These dimensions of
machine learning often lead computer scientists towards automatic model selection via
optimization (maximization) of a model's evaluation metric. However, the search space is
large, and grid search approaches to machine learning can easily lead to failure and
frustration. Human intuition is still essential to machine learning, and visual analysis in
concert with automatic methods can allow data scientists to steer model selection towards
better fitted models, faster. In this talk, we will discuss interactive visual methods for better
understanding, steering, and tuning machine learning models.
So I read about this
great ML model
Koren, Yehuda, Robert Bell, and Chris Volinsky. "Matrix factorization techniques for
recommender systems." Computer 42.8 (2009): 30-37.
def nnmf(R, k=2, steps=5000, alpha=0.0002, beta=0.02):
n, m = R.shape
P = np.random.rand(n,k)
Q = np.random.rand(m,k).T
for step in range(steps):
for idx in range(n):
for jdx in range(m):
if R[idx][jdx] > 0:
eij = R[idx][jdx] - np.dot(P[idx,:], Q[:,jdx])
for kdx in range(K):
P[idx][kdx] = P[idx][kdx] + alpha * (2 * eij * Q[kdx][jdx] - beta * P[idx][kdx])
Q[kdx][jdx] = Q[kdx][jdx] + alpha * (2 * eij * P[idx][kdx] - beta * Q[kdx][jdx])
e = 0
for idx in range(n):
for jdx in range(m):
if R[idx][jdx] > 0:
e += (R[idx][jdx] - np.dot(P[idx,:], Q[:,jdx])) ** 2
if e < 0.001:
break
return P, Q.T
Life with Scikit-Learn
from sklearn.decomposition import NMF
model = NMF(n_components=2, init='random', random_state=0)
model.fit(R)
from sklearn.decomposition import NMF, TruncatedSVD, PCA
models = [
NMF(n_components=2, init='random', random_state=0),
TruncatedSVD(n_components=2),
PCA(n_components=2),
]
for model in models:
model.fit(R)
So now I’m all
Made Possible by the Scikit-Learn API
Buitinck, Lars, et al. "API design for machine learning software: experiences from
the scikit-learn project." arXiv preprint arXiv:1309.0238 (2013).
class Estimator(object):
def fit(self, X, y=None):
"""
Fits estimator to data.
"""
# set state of self
return self
def predict(self, X):
"""
Predict response of X
"""
# compute predictions pred
return pred
class Transformer(Estimator):
def transform(self, X):
"""
Transforms the input data.
"""
# transform X to X_prime
return X_prime
class Pipeline(Transfomer):
@property
def named_steps(self):
"""
Returns a sequence of estimators
"""
return self.steps
@property
def _final_estimator(self):
"""
Terminating estimator
"""
return self.steps[-1]
Algorithm design
stays in the hands of
Academia
Wizardry When Applied
The Model Selection Triple
Arun Kumar http://bit.ly/2abVNrI
Feature Analysis
Algorithm Selection
Hyperparameter
Tuning
The Model Selection Triple
- Define a bounded, high
dimensional feature space
that can be effectively
modeled.
- Transform and manipulate
the space to make
modeling easier.
- Extract a feature
representation of each
instance in the space.
Feature Analysis
Algorithm Selection
The Model Selection Triple
- Select a model family that
best/correctly defines the
relationship between the
variables of interest.
- Define a model form that
specifies exactly how
features interact to make a
prediction.
- Train a fitted model by
optimizing internal
parameters to the data.
Hyperparameter
Tuning
The Model Selection Triple
- Evaluate how the model
form is interacting with the
feature space.
- Identify hyperparameters
(parameters that affect
training or the prior, not
prediction)
- Tune the fitting and
prediction process by
modifying these params.
Can it be automated?
Regularization is a form of automatic feature analysis.
X0
X1
X0
X1
L1 Normalization
Possibility that a feature is eliminated by setting its
coefficient equal to zero.
L2 Normalization
Features are kept balanced by minimizing the
relative change of coefficients during learning.
Automatic Model Selection Criteria
from sklearn.cross_validation import KFold
kfolds = KFold(n=len(X), n_folds=12)
scores = [
model.fit(
X[train], y[train]
).score(
X[test], y[test]
)
for train, test in kfolds
]
F1
R2
Automatic Model Selection: Try Them All!
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.ensemble import AdaBoostClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn import cross_validation as cv
classifiers = [
KNeighborsClassifier(5),
SVC(kernel="linear", C=0.025),
RandomForestClassifier(max_depth=5),
AdaBoostClassifier(),
GaussianNB(),
]
kfold = cv.KFold(len(X), n_folds=12)
max([
cv.cross_val_score(model, X, y, cv=kfold).mean
for model in classifiers
])
Automatic Model Selection: Search Param Space
from sklearn.feature_extraction.text import *
from sklearn.linear_model import SGDClassifier
from sklearn.grid_search import GridSearchCV
from sklearn.pipeline import Pipeline
pipeline = Pipeline([
('vect', CountVectorizer()),
('tfidf', TfidfTransformer()),
('model', SGDClassifier()),
])
parameters = {
'vect__max_df': (0.5, 0.75, 1.0),
'vect__max_features': (None, 5000, 10000),
'tfidf__use_idf': (True, False),
'tfidf__norm': ('l1', 'l2'),
'model__alpha': (0.00001, 0.000001),
'model__penalty': ('l2', 'elasticnet'),
}
search = GridSearchCV(pipeline, parameters)
search.fit(X, y)
Maybe not so Wizard?
Automatic Model Selection: Search?
Search is difficult particularly
in high dimensional space.
Even with techniques like
genetic algorithms or particle
swarm optimization, there is
no guarantee of a solution.
As the search space gets
larger, the amount of time
increases exponentially.
Anscombe, Francis J. "Graphs in statistical analysis."
The American Statistician 27.1 (1973): 17-21.
Anscombe’s Quartet
Through visualization
we can steer the model
selection process
Model Selection Management Systems
Kumar, Arun, et al. "Model selection management systems: The next frontier of
advanced analytics." ACM SIGMOD Record 44.4 (2016): 17-22.
Optimized Implementations
User Interfaces and DSLs
Model Selection Triples
{ {FE} x {AS} X {HT} }
Can we visualize
machine learning?
Data Management
Wrangling
Standardization
Normalization
Selection & Joins
Model Evaluation +
Hyperparameter Tuning
Model Selection
Feature Analysis
Linear
Models
Nearest
Neighbors
SVM
Ensemble Trees Bayes
Feature
Analysis
Feature
Selection
Model
Selection
Revisit
Features
Iterate!
Initial
Model
Model
Storage
Data and Model Management
Is “GitHub for Data” Enough?
Visualizing Feature Analysis
SPLOM (Scatterplot Matrices)
Seo, Jinwook, and Ben Shneiderman. "A rank-by-feature framework for interactive
exploration of multidimensional data." Information visualization 4.2 (2005): 96-113.
Visual Rank by Feature: 1 Dimension
Rank by:
1. Normality of distribution
(Shapiro-Wilk and
Kolmogorov-Smirnov)
2. Uniformity of distribution
(entropy)
3. Number of potential outliers
4. Number of hapaxes
5. Size of gap
Seo, Jinwook, and Ben Shneiderman. "A rank-by-feature framework for interactive
exploration of multidimensional data." Information visualization 4.2 (2005): 96-113.
Visual Rank by Feature: 1 Dimension
Rank by:
1. Normality of distribution
(Shapiro-Wilk and
Kolmogorov-Smirnov)
2. Uniformity of distribution
(entropy)
3. Number of potential outliers
4. Number of hapaxes
5. Size of gap
Seo, Jinwook, and Ben Shneiderman. "A rank-by-feature framework for interactive
exploration of multidimensional data." Information visualization 4.2 (2005): 96-113.
Visual Rank by Feature: 2 Dimensions
Rank by:
1. Correlation Coefficient
(Pearson, Spearman)
2. Least-squares error
3. Quadracity
4. Density based outlier
detection.
5. Uniformity (entropy of grids)
6. Number of items in the most
dense region of the plot.
Joint Plots: Diving Deeper after Rank by Feature
Special thanks to Seaborn for doing statistical visualization right!
Detecting Separablity
Radviz: Radial Visualization
Parallel Coordinates
Decomposition (PCA, SVD) of Feature Space
Visualizing Model Selection
Confusion Matrices
Receiver Operator Characteristic (ROC) and Area Under Curve (AUC)
Prediction Error Plots
Visualizing Residuals
Model Families vs. Model Forms vs. Fitted Models
Rebecca Bilbro http://bit.ly/2a1YoTs
kNN Tuning Slider in 2 Dimensions
Scott Fortmann-Roe http://bit.ly/29P4SS1
Visualizing Evaluation/Tuning
Cross Validation Curves
Visual Grid Search
Integrating Visual Model
Selection with Scikit-Learn
Yellowbrick
Scikit-Learn Pipelines: fit() and predict()
Data Loader
Transformer
Transformer
Estimator
Data Loader
Transformer
Transformer
Estimator
Transformer
Yellowbrick Visual Transformers
Data Loader
Transformer(s)
Feature
Visualization
Estimator
fit()
draw()
predict()
Data Loader
Transformer(s)
EstimatorCV
Evaluation
Visualization
fit()
predict()
score()
draw()
Model Selection Pipelines
Multi-Estimator
Visualization
Data Loader
Transformer(s)
EstimatorEstimatorEstimatorEstimator
Cross Validation Cross Validation Cross Validation Cross Validation
Employ Interactivity to Visualize More
Health and Wealth of Nations Recreated by Mike Bostock
Originally by Hans Rosling http://bit.ly/29RYBJD
Visual Analytics Mantra:
Overview First; Zoom & Filter; Details on Demand
Heer, Jeffrey, and Ben Shneiderman. "Interactive dynamics
for visual analysis." Queue 10.2 (2012): 30.
Codename Trinket
Visual Model Management System
Yellowbrick
http://bit.ly/2a5otxB
DDL Trinket
http://bit.ly/2a2Y0jy
DDL Open Source Projects on GitHub
Questions!

More Related Content

What's hot

2.4 rule based classification
2.4 rule based classification2.4 rule based classification
2.4 rule based classificationKrish_ver2
 
Unsupervised learning (clustering)
Unsupervised learning (clustering)Unsupervised learning (clustering)
Unsupervised learning (clustering)Pravinkumar Landge
 
Linear regression in machine learning
Linear regression in machine learningLinear regression in machine learning
Linear regression in machine learningShajun Nisha
 
Hyperparameter Tuning
Hyperparameter TuningHyperparameter Tuning
Hyperparameter TuningJon Lederman
 
Linear regression
Linear regressionLinear regression
Linear regressionMartinHogg9
 
Unsupervised learning clustering
Unsupervised learning clusteringUnsupervised learning clustering
Unsupervised learning clusteringArshad Farhad
 
Linear Regression Analysis | Linear Regression in Python | Machine Learning A...
Linear Regression Analysis | Linear Regression in Python | Machine Learning A...Linear Regression Analysis | Linear Regression in Python | Machine Learning A...
Linear Regression Analysis | Linear Regression in Python | Machine Learning A...Simplilearn
 
Machine learning with ADA Boost
Machine learning with ADA BoostMachine learning with ADA Boost
Machine learning with ADA BoostAman Patel
 
Classification in data mining
Classification in data mining Classification in data mining
Classification in data mining Sulman Ahmed
 
Machine Learning-Linear regression
Machine Learning-Linear regressionMachine Learning-Linear regression
Machine Learning-Linear regressionkishanthkumaar
 
Presentation on supervised learning
Presentation on supervised learningPresentation on supervised learning
Presentation on supervised learningTonmoy Bhagawati
 
2.6 support vector machines and associative classifiers revised
2.6 support vector machines and associative classifiers revised2.6 support vector machines and associative classifiers revised
2.6 support vector machines and associative classifiers revisedKrish_ver2
 
Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L...
Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L...Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L...
Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L...Md. Main Uddin Rony
 
Hierarchical Clustering | Hierarchical Clustering in R |Hierarchical Clusteri...
Hierarchical Clustering | Hierarchical Clustering in R |Hierarchical Clusteri...Hierarchical Clustering | Hierarchical Clustering in R |Hierarchical Clusteri...
Hierarchical Clustering | Hierarchical Clustering in R |Hierarchical Clusteri...Simplilearn
 
Decision Tree - ID3
Decision Tree - ID3Decision Tree - ID3
Decision Tree - ID3Xueping Peng
 
Support Vector Machines for Classification
Support Vector Machines for ClassificationSupport Vector Machines for Classification
Support Vector Machines for ClassificationPrakash Pimpale
 

What's hot (20)

2.4 rule based classification
2.4 rule based classification2.4 rule based classification
2.4 rule based classification
 
Naive bayes
Naive bayesNaive bayes
Naive bayes
 
Machine Learning
Machine Learning Machine Learning
Machine Learning
 
Unsupervised learning (clustering)
Unsupervised learning (clustering)Unsupervised learning (clustering)
Unsupervised learning (clustering)
 
Linear regression in machine learning
Linear regression in machine learningLinear regression in machine learning
Linear regression in machine learning
 
Hyperparameter Tuning
Hyperparameter TuningHyperparameter Tuning
Hyperparameter Tuning
 
Linear regression
Linear regressionLinear regression
Linear regression
 
Unsupervised learning clustering
Unsupervised learning clusteringUnsupervised learning clustering
Unsupervised learning clustering
 
Cluster analysis
Cluster analysisCluster analysis
Cluster analysis
 
Linear Regression Analysis | Linear Regression in Python | Machine Learning A...
Linear Regression Analysis | Linear Regression in Python | Machine Learning A...Linear Regression Analysis | Linear Regression in Python | Machine Learning A...
Linear Regression Analysis | Linear Regression in Python | Machine Learning A...
 
Machine learning with ADA Boost
Machine learning with ADA BoostMachine learning with ADA Boost
Machine learning with ADA Boost
 
Classification in data mining
Classification in data mining Classification in data mining
Classification in data mining
 
Machine Learning-Linear regression
Machine Learning-Linear regressionMachine Learning-Linear regression
Machine Learning-Linear regression
 
Presentation on supervised learning
Presentation on supervised learningPresentation on supervised learning
Presentation on supervised learning
 
2.6 support vector machines and associative classifiers revised
2.6 support vector machines and associative classifiers revised2.6 support vector machines and associative classifiers revised
2.6 support vector machines and associative classifiers revised
 
SPADE -
SPADE - SPADE -
SPADE -
 
Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L...
Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L...Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L...
Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L...
 
Hierarchical Clustering | Hierarchical Clustering in R |Hierarchical Clusteri...
Hierarchical Clustering | Hierarchical Clustering in R |Hierarchical Clusteri...Hierarchical Clustering | Hierarchical Clustering in R |Hierarchical Clusteri...
Hierarchical Clustering | Hierarchical Clustering in R |Hierarchical Clusteri...
 
Decision Tree - ID3
Decision Tree - ID3Decision Tree - ID3
Decision Tree - ID3
 
Support Vector Machines for Classification
Support Vector Machines for ClassificationSupport Vector Machines for Classification
Support Vector Machines for Classification
 

Viewers also liked

Dynamics in graph analysis (PyData Carolinas 2016)
Dynamics in graph analysis (PyData Carolinas 2016)Dynamics in graph analysis (PyData Carolinas 2016)
Dynamics in graph analysis (PyData Carolinas 2016)Benjamin Bengfort
 
Visualizing Model Selection with Scikit-Yellowbrick: An Introduction to Devel...
Visualizing Model Selection with Scikit-Yellowbrick: An Introduction to Devel...Visualizing Model Selection with Scikit-Yellowbrick: An Introduction to Devel...
Visualizing Model Selection with Scikit-Yellowbrick: An Introduction to Devel...Benjamin Bengfort
 
An Interactive Visual Analytics Dashboard for the Employment Situation Report
An Interactive Visual Analytics Dashboard for the Employment Situation ReportAn Interactive Visual Analytics Dashboard for the Employment Situation Report
An Interactive Visual Analytics Dashboard for the Employment Situation ReportBenjamin Bengfort
 
Fast Data Analytics with Spark and Python
Fast Data Analytics with Spark and PythonFast Data Analytics with Spark and Python
Fast Data Analytics with Spark and PythonBenjamin Bengfort
 
A Primer on Entity Resolution
A Primer on Entity ResolutionA Primer on Entity Resolution
A Primer on Entity ResolutionBenjamin Bengfort
 
Building Data Products with Python (Georgetown)
Building Data Products with Python (Georgetown)Building Data Products with Python (Georgetown)
Building Data Products with Python (Georgetown)Benjamin Bengfort
 
Beginners Guide to Non-Negative Matrix Factorization
Beginners Guide to Non-Negative Matrix FactorizationBeginners Guide to Non-Negative Matrix Factorization
Beginners Guide to Non-Negative Matrix FactorizationBenjamin Bengfort
 
Lecture7 xing fei-fei
Lecture7 xing fei-feiLecture7 xing fei-fei
Lecture7 xing fei-feiTianlu Wang
 
Visualizing Threats: Network Visualization for Cyber Security
Visualizing Threats: Network Visualization for Cyber SecurityVisualizing Threats: Network Visualization for Cyber Security
Visualizing Threats: Network Visualization for Cyber SecurityCambridge Intelligence
 
Graph Based Machine Learning on Relational Data
Graph Based Machine Learning on Relational DataGraph Based Machine Learning on Relational Data
Graph Based Machine Learning on Relational DataBenjamin Bengfort
 
Solving graph problems using networkX
Solving graph problems using networkXSolving graph problems using networkX
Solving graph problems using networkXKrishna Sangeeth KS
 
Plotcon 2016 Visualization Talk by Alexandra Johnson
Plotcon 2016 Visualization Talk  by Alexandra JohnsonPlotcon 2016 Visualization Talk  by Alexandra Johnson
Plotcon 2016 Visualization Talk by Alexandra JohnsonSigOpt
 
Evolutionary Design of Swarms (SSCI 2014)
Evolutionary Design of Swarms (SSCI 2014)Evolutionary Design of Swarms (SSCI 2014)
Evolutionary Design of Swarms (SSCI 2014)Benjamin Bengfort
 
Visualization and Theories of Learning in Education
Visualization and Theories of Learning in EducationVisualization and Theories of Learning in Education
Visualization and Theories of Learning in EducationLiz Dorland
 
NetworkX - python graph analysis and visualization @ PyHug
NetworkX - python graph analysis and visualization @ PyHugNetworkX - python graph analysis and visualization @ PyHug
NetworkX - python graph analysis and visualization @ PyHugJimmy Lai
 
Networkx & Gephi Tutorial #Pydata NYC
Networkx & Gephi Tutorial #Pydata NYCNetworkx & Gephi Tutorial #Pydata NYC
Networkx & Gephi Tutorial #Pydata NYCGilad Lotan
 

Viewers also liked (20)

Dynamics in graph analysis (PyData Carolinas 2016)
Dynamics in graph analysis (PyData Carolinas 2016)Dynamics in graph analysis (PyData Carolinas 2016)
Dynamics in graph analysis (PyData Carolinas 2016)
 
Visualizing Model Selection with Scikit-Yellowbrick: An Introduction to Devel...
Visualizing Model Selection with Scikit-Yellowbrick: An Introduction to Devel...Visualizing Model Selection with Scikit-Yellowbrick: An Introduction to Devel...
Visualizing Model Selection with Scikit-Yellowbrick: An Introduction to Devel...
 
Data Product Architectures
Data Product ArchitecturesData Product Architectures
Data Product Architectures
 
An Interactive Visual Analytics Dashboard for the Employment Situation Report
An Interactive Visual Analytics Dashboard for the Employment Situation ReportAn Interactive Visual Analytics Dashboard for the Employment Situation Report
An Interactive Visual Analytics Dashboard for the Employment Situation Report
 
Fast Data Analytics with Spark and Python
Fast Data Analytics with Spark and PythonFast Data Analytics with Spark and Python
Fast Data Analytics with Spark and Python
 
A Primer on Entity Resolution
A Primer on Entity ResolutionA Primer on Entity Resolution
A Primer on Entity Resolution
 
Building Data Products with Python (Georgetown)
Building Data Products with Python (Georgetown)Building Data Products with Python (Georgetown)
Building Data Products with Python (Georgetown)
 
Beginners Guide to Non-Negative Matrix Factorization
Beginners Guide to Non-Negative Matrix FactorizationBeginners Guide to Non-Negative Matrix Factorization
Beginners Guide to Non-Negative Matrix Factorization
 
Lecture7 xing fei-fei
Lecture7 xing fei-feiLecture7 xing fei-fei
Lecture7 xing fei-fei
 
Visualizing Threats: Network Visualization for Cyber Security
Visualizing Threats: Network Visualization for Cyber SecurityVisualizing Threats: Network Visualization for Cyber Security
Visualizing Threats: Network Visualization for Cyber Security
 
Annotation with Redfox
Annotation with RedfoxAnnotation with Redfox
Annotation with Redfox
 
Graph Based Machine Learning on Relational Data
Graph Based Machine Learning on Relational DataGraph Based Machine Learning on Relational Data
Graph Based Machine Learning on Relational Data
 
Rasta processing of speech
Rasta processing of speechRasta processing of speech
Rasta processing of speech
 
Solving graph problems using networkX
Solving graph problems using networkXSolving graph problems using networkX
Solving graph problems using networkX
 
Plotcon 2016 Visualization Talk by Alexandra Johnson
Plotcon 2016 Visualization Talk  by Alexandra JohnsonPlotcon 2016 Visualization Talk  by Alexandra Johnson
Plotcon 2016 Visualization Talk by Alexandra Johnson
 
Evolutionary Design of Swarms (SSCI 2014)
Evolutionary Design of Swarms (SSCI 2014)Evolutionary Design of Swarms (SSCI 2014)
Evolutionary Design of Swarms (SSCI 2014)
 
PROTEUS H2020
PROTEUS H2020 PROTEUS H2020
PROTEUS H2020
 
Visualization and Theories of Learning in Education
Visualization and Theories of Learning in EducationVisualization and Theories of Learning in Education
Visualization and Theories of Learning in Education
 
NetworkX - python graph analysis and visualization @ PyHug
NetworkX - python graph analysis and visualization @ PyHugNetworkX - python graph analysis and visualization @ PyHug
NetworkX - python graph analysis and visualization @ PyHug
 
Networkx & Gephi Tutorial #Pydata NYC
Networkx & Gephi Tutorial #Pydata NYCNetworkx & Gephi Tutorial #Pydata NYC
Networkx & Gephi Tutorial #Pydata NYC
 

Similar to Visualizing the Model Selection Process

Visual diagnostics for more effective machine learning
Visual diagnostics for more effective machine learningVisual diagnostics for more effective machine learning
Visual diagnostics for more effective machine learningBenjamin Bengfort
 
EE660_Report_YaxinLiu_8448347171
EE660_Report_YaxinLiu_8448347171EE660_Report_YaxinLiu_8448347171
EE660_Report_YaxinLiu_8448347171Yaxin Liu
 
Learning with Relative Attributes
Learning with Relative AttributesLearning with Relative Attributes
Learning with Relative AttributesVikas Jain
 
Deep learning for molecules, introduction to chainer chemistry
Deep learning for molecules, introduction to chainer chemistryDeep learning for molecules, introduction to chainer chemistry
Deep learning for molecules, introduction to chainer chemistryKenta Oono
 
Machine Learning Model Bakeoff
Machine Learning Model BakeoffMachine Learning Model Bakeoff
Machine Learning Model Bakeoffmrphilroth
 
Keynote at IWLS 2017
Keynote at IWLS 2017Keynote at IWLS 2017
Keynote at IWLS 2017Manish Pandey
 
Machine Learning: Classification Concepts (Part 1)
Machine Learning: Classification Concepts (Part 1)Machine Learning: Classification Concepts (Part 1)
Machine Learning: Classification Concepts (Part 1)Daniel Chan
 
Speaker Diarization
Speaker DiarizationSpeaker Diarization
Speaker DiarizationHONGJOO LEE
 
Pointcuts and Analysis
Pointcuts and AnalysisPointcuts and Analysis
Pointcuts and AnalysisWiwat Ruengmee
 
Dimension reduction techniques[Feature Selection]
Dimension reduction techniques[Feature Selection]Dimension reduction techniques[Feature Selection]
Dimension reduction techniques[Feature Selection]AAKANKSHA JAIN
 
Avihu Efrat's Viola and Jones face detection slides
Avihu Efrat's Viola and Jones face detection slidesAvihu Efrat's Viola and Jones face detection slides
Avihu Efrat's Viola and Jones face detection slideswolf
 
Learning Predictive Modeling with TSA and Kaggle
Learning Predictive Modeling with TSA and KaggleLearning Predictive Modeling with TSA and Kaggle
Learning Predictive Modeling with TSA and KaggleYvonne K. Matos
 
Leveraging R in Big Data of Mobile Ads (R在行動廣告大數據的應用)
Leveraging R in Big Data of Mobile Ads (R在行動廣告大數據的應用)Leveraging R in Big Data of Mobile Ads (R在行動廣告大數據的應用)
Leveraging R in Big Data of Mobile Ads (R在行動廣告大數據的應用)Craig Chao
 
A simple framework for contrastive learning of visual representations
A simple framework for contrastive learning of visual representationsA simple framework for contrastive learning of visual representations
A simple framework for contrastive learning of visual representationsDevansh16
 
Mini-lab 1: Stochastic Gradient Descent classifier, Optimizing Logistic Regre...
Mini-lab 1: Stochastic Gradient Descent classifier, Optimizing Logistic Regre...Mini-lab 1: Stochastic Gradient Descent classifier, Optimizing Logistic Regre...
Mini-lab 1: Stochastic Gradient Descent classifier, Optimizing Logistic Regre...Yao Yao
 
Obscenity Detection in Images
Obscenity Detection in ImagesObscenity Detection in Images
Obscenity Detection in ImagesAnil Kumar Gupta
 
powerpoint feb
powerpoint febpowerpoint feb
powerpoint febimu409
 

Similar to Visualizing the Model Selection Process (20)

Visual diagnostics for more effective machine learning
Visual diagnostics for more effective machine learningVisual diagnostics for more effective machine learning
Visual diagnostics for more effective machine learning
 
EE660_Report_YaxinLiu_8448347171
EE660_Report_YaxinLiu_8448347171EE660_Report_YaxinLiu_8448347171
EE660_Report_YaxinLiu_8448347171
 
Learning with Relative Attributes
Learning with Relative AttributesLearning with Relative Attributes
Learning with Relative Attributes
 
Deep learning for molecules, introduction to chainer chemistry
Deep learning for molecules, introduction to chainer chemistryDeep learning for molecules, introduction to chainer chemistry
Deep learning for molecules, introduction to chainer chemistry
 
Machine Learning Model Bakeoff
Machine Learning Model BakeoffMachine Learning Model Bakeoff
Machine Learning Model Bakeoff
 
Keynote at IWLS 2017
Keynote at IWLS 2017Keynote at IWLS 2017
Keynote at IWLS 2017
 
Machine Learning: Classification Concepts (Part 1)
Machine Learning: Classification Concepts (Part 1)Machine Learning: Classification Concepts (Part 1)
Machine Learning: Classification Concepts (Part 1)
 
Speaker Diarization
Speaker DiarizationSpeaker Diarization
Speaker Diarization
 
Pointcuts and Analysis
Pointcuts and AnalysisPointcuts and Analysis
Pointcuts and Analysis
 
Dimension reduction techniques[Feature Selection]
Dimension reduction techniques[Feature Selection]Dimension reduction techniques[Feature Selection]
Dimension reduction techniques[Feature Selection]
 
IEEE Projects 2014-2015
IEEE Projects 2014-2015IEEE Projects 2014-2015
IEEE Projects 2014-2015
 
Avihu Efrat's Viola and Jones face detection slides
Avihu Efrat's Viola and Jones face detection slidesAvihu Efrat's Viola and Jones face detection slides
Avihu Efrat's Viola and Jones face detection slides
 
Report
ReportReport
Report
 
Learning Predictive Modeling with TSA and Kaggle
Learning Predictive Modeling with TSA and KaggleLearning Predictive Modeling with TSA and Kaggle
Learning Predictive Modeling with TSA and Kaggle
 
Leveraging R in Big Data of Mobile Ads (R在行動廣告大數據的應用)
Leveraging R in Big Data of Mobile Ads (R在行動廣告大數據的應用)Leveraging R in Big Data of Mobile Ads (R在行動廣告大數據的應用)
Leveraging R in Big Data of Mobile Ads (R在行動廣告大數據的應用)
 
Scalable machine learning
Scalable machine learningScalable machine learning
Scalable machine learning
 
A simple framework for contrastive learning of visual representations
A simple framework for contrastive learning of visual representationsA simple framework for contrastive learning of visual representations
A simple framework for contrastive learning of visual representations
 
Mini-lab 1: Stochastic Gradient Descent classifier, Optimizing Logistic Regre...
Mini-lab 1: Stochastic Gradient Descent classifier, Optimizing Logistic Regre...Mini-lab 1: Stochastic Gradient Descent classifier, Optimizing Logistic Regre...
Mini-lab 1: Stochastic Gradient Descent classifier, Optimizing Logistic Regre...
 
Obscenity Detection in Images
Obscenity Detection in ImagesObscenity Detection in Images
Obscenity Detection in Images
 
powerpoint feb
powerpoint febpowerpoint feb
powerpoint feb
 

More from Benjamin Bengfort

Introduction to Machine Learning with SciKit-Learn
Introduction to Machine Learning with SciKit-LearnIntroduction to Machine Learning with SciKit-Learn
Introduction to Machine Learning with SciKit-LearnBenjamin Bengfort
 
An Overview of Spanner: Google's Globally Distributed Database
An Overview of Spanner: Google's Globally Distributed DatabaseAn Overview of Spanner: Google's Globally Distributed Database
An Overview of Spanner: Google's Globally Distributed DatabaseBenjamin Bengfort
 
Graph Analyses with Python and NetworkX
Graph Analyses with Python and NetworkXGraph Analyses with Python and NetworkX
Graph Analyses with Python and NetworkXBenjamin Bengfort
 
Natural Language Processing with Python
Natural Language Processing with PythonNatural Language Processing with Python
Natural Language Processing with PythonBenjamin Bengfort
 
Building Data Apps with Python
Building Data Apps with PythonBuilding Data Apps with Python
Building Data Apps with PythonBenjamin Bengfort
 

More from Benjamin Bengfort (6)

Getting Started with TRISA
Getting Started with TRISAGetting Started with TRISA
Getting Started with TRISA
 
Introduction to Machine Learning with SciKit-Learn
Introduction to Machine Learning with SciKit-LearnIntroduction to Machine Learning with SciKit-Learn
Introduction to Machine Learning with SciKit-Learn
 
An Overview of Spanner: Google's Globally Distributed Database
An Overview of Spanner: Google's Globally Distributed DatabaseAn Overview of Spanner: Google's Globally Distributed Database
An Overview of Spanner: Google's Globally Distributed Database
 
Graph Analyses with Python and NetworkX
Graph Analyses with Python and NetworkXGraph Analyses with Python and NetworkX
Graph Analyses with Python and NetworkX
 
Natural Language Processing with Python
Natural Language Processing with PythonNatural Language Processing with Python
Natural Language Processing with Python
 
Building Data Apps with Python
Building Data Apps with PythonBuilding Data Apps with Python
Building Data Apps with Python
 

Recently uploaded

Base editing, prime editing, Cas13 & RNA editing and organelle base editing
Base editing, prime editing, Cas13 & RNA editing and organelle base editingBase editing, prime editing, Cas13 & RNA editing and organelle base editing
Base editing, prime editing, Cas13 & RNA editing and organelle base editingNetHelix
 
CHROMATOGRAPHY PALLAVI RAWAT.pptx
CHROMATOGRAPHY  PALLAVI RAWAT.pptxCHROMATOGRAPHY  PALLAVI RAWAT.pptx
CHROMATOGRAPHY PALLAVI RAWAT.pptxpallavirawat456
 
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.PraveenaKalaiselvan1
 
Pests of Blackgram, greengram, cowpea_Dr.UPR.pdf
Pests of Blackgram, greengram, cowpea_Dr.UPR.pdfPests of Blackgram, greengram, cowpea_Dr.UPR.pdf
Pests of Blackgram, greengram, cowpea_Dr.UPR.pdfPirithiRaju
 
GENERAL PHYSICS 2 REFRACTION OF LIGHT SENIOR HIGH SCHOOL GENPHYS2.pptx
GENERAL PHYSICS 2 REFRACTION OF LIGHT SENIOR HIGH SCHOOL GENPHYS2.pptxGENERAL PHYSICS 2 REFRACTION OF LIGHT SENIOR HIGH SCHOOL GENPHYS2.pptx
GENERAL PHYSICS 2 REFRACTION OF LIGHT SENIOR HIGH SCHOOL GENPHYS2.pptxRitchAndruAgustin
 
GenAI talk for Young at Wageningen University & Research (WUR) March 2024
GenAI talk for Young at Wageningen University & Research (WUR) March 2024GenAI talk for Young at Wageningen University & Research (WUR) March 2024
GenAI talk for Young at Wageningen University & Research (WUR) March 2024Jene van der Heide
 
Davis plaque method.pptx recombinant DNA technology
Davis plaque method.pptx recombinant DNA technologyDavis plaque method.pptx recombinant DNA technology
Davis plaque method.pptx recombinant DNA technologycaarthichand2003
 
Servosystem Theory / Cybernetic Theory by Petrovic
Servosystem Theory / Cybernetic Theory by PetrovicServosystem Theory / Cybernetic Theory by Petrovic
Servosystem Theory / Cybernetic Theory by PetrovicAditi Jain
 
PROJECTILE MOTION-Horizontal and Vertical
PROJECTILE MOTION-Horizontal and VerticalPROJECTILE MOTION-Horizontal and Vertical
PROJECTILE MOTION-Horizontal and VerticalMAESTRELLAMesa2
 
Pests of safflower_Binomics_Identification_Dr.UPR.pdf
Pests of safflower_Binomics_Identification_Dr.UPR.pdfPests of safflower_Binomics_Identification_Dr.UPR.pdf
Pests of safflower_Binomics_Identification_Dr.UPR.pdfPirithiRaju
 
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuine
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 GenuineCall Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuine
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuinethapagita
 
Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...
Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...
Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...D. B. S. College Kanpur
 
ALL ABOUT MIXTURES IN GRADE 7 CLASS PPTX
ALL ABOUT MIXTURES IN GRADE 7 CLASS PPTXALL ABOUT MIXTURES IN GRADE 7 CLASS PPTX
ALL ABOUT MIXTURES IN GRADE 7 CLASS PPTXDole Philippines School
 
Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024AyushiRastogi48
 
User Guide: Pulsar™ Weather Station (Columbia Weather Systems)
User Guide: Pulsar™ Weather Station (Columbia Weather Systems)User Guide: Pulsar™ Weather Station (Columbia Weather Systems)
User Guide: Pulsar™ Weather Station (Columbia Weather Systems)Columbia Weather Systems
 
Observational constraints on mergers creating magnetism in massive stars
Observational constraints on mergers creating magnetism in massive starsObservational constraints on mergers creating magnetism in massive stars
Observational constraints on mergers creating magnetism in massive starsSérgio Sacani
 
Microteaching on terms used in filtration .Pharmaceutical Engineering
Microteaching on terms used in filtration .Pharmaceutical EngineeringMicroteaching on terms used in filtration .Pharmaceutical Engineering
Microteaching on terms used in filtration .Pharmaceutical EngineeringPrajakta Shinde
 
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...Universidade Federal de Sergipe - UFS
 
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptxTHE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptxNandakishor Bhaurao Deshmukh
 
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...lizamodels9
 

Recently uploaded (20)

Base editing, prime editing, Cas13 & RNA editing and organelle base editing
Base editing, prime editing, Cas13 & RNA editing and organelle base editingBase editing, prime editing, Cas13 & RNA editing and organelle base editing
Base editing, prime editing, Cas13 & RNA editing and organelle base editing
 
CHROMATOGRAPHY PALLAVI RAWAT.pptx
CHROMATOGRAPHY  PALLAVI RAWAT.pptxCHROMATOGRAPHY  PALLAVI RAWAT.pptx
CHROMATOGRAPHY PALLAVI RAWAT.pptx
 
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
 
Pests of Blackgram, greengram, cowpea_Dr.UPR.pdf
Pests of Blackgram, greengram, cowpea_Dr.UPR.pdfPests of Blackgram, greengram, cowpea_Dr.UPR.pdf
Pests of Blackgram, greengram, cowpea_Dr.UPR.pdf
 
GENERAL PHYSICS 2 REFRACTION OF LIGHT SENIOR HIGH SCHOOL GENPHYS2.pptx
GENERAL PHYSICS 2 REFRACTION OF LIGHT SENIOR HIGH SCHOOL GENPHYS2.pptxGENERAL PHYSICS 2 REFRACTION OF LIGHT SENIOR HIGH SCHOOL GENPHYS2.pptx
GENERAL PHYSICS 2 REFRACTION OF LIGHT SENIOR HIGH SCHOOL GENPHYS2.pptx
 
GenAI talk for Young at Wageningen University & Research (WUR) March 2024
GenAI talk for Young at Wageningen University & Research (WUR) March 2024GenAI talk for Young at Wageningen University & Research (WUR) March 2024
GenAI talk for Young at Wageningen University & Research (WUR) March 2024
 
Davis plaque method.pptx recombinant DNA technology
Davis plaque method.pptx recombinant DNA technologyDavis plaque method.pptx recombinant DNA technology
Davis plaque method.pptx recombinant DNA technology
 
Servosystem Theory / Cybernetic Theory by Petrovic
Servosystem Theory / Cybernetic Theory by PetrovicServosystem Theory / Cybernetic Theory by Petrovic
Servosystem Theory / Cybernetic Theory by Petrovic
 
PROJECTILE MOTION-Horizontal and Vertical
PROJECTILE MOTION-Horizontal and VerticalPROJECTILE MOTION-Horizontal and Vertical
PROJECTILE MOTION-Horizontal and Vertical
 
Pests of safflower_Binomics_Identification_Dr.UPR.pdf
Pests of safflower_Binomics_Identification_Dr.UPR.pdfPests of safflower_Binomics_Identification_Dr.UPR.pdf
Pests of safflower_Binomics_Identification_Dr.UPR.pdf
 
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuine
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 GenuineCall Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuine
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuine
 
Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...
Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...
Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...
 
ALL ABOUT MIXTURES IN GRADE 7 CLASS PPTX
ALL ABOUT MIXTURES IN GRADE 7 CLASS PPTXALL ABOUT MIXTURES IN GRADE 7 CLASS PPTX
ALL ABOUT MIXTURES IN GRADE 7 CLASS PPTX
 
Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024
 
User Guide: Pulsar™ Weather Station (Columbia Weather Systems)
User Guide: Pulsar™ Weather Station (Columbia Weather Systems)User Guide: Pulsar™ Weather Station (Columbia Weather Systems)
User Guide: Pulsar™ Weather Station (Columbia Weather Systems)
 
Observational constraints on mergers creating magnetism in massive stars
Observational constraints on mergers creating magnetism in massive starsObservational constraints on mergers creating magnetism in massive stars
Observational constraints on mergers creating magnetism in massive stars
 
Microteaching on terms used in filtration .Pharmaceutical Engineering
Microteaching on terms used in filtration .Pharmaceutical EngineeringMicroteaching on terms used in filtration .Pharmaceutical Engineering
Microteaching on terms used in filtration .Pharmaceutical Engineering
 
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
 
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptxTHE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
 
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
 

Visualizing the Model Selection Process

  • 1. Visualizing the Model Selection Process Benjamin Bengfort @bbengfort District Data Labs
  • 2. Abstract Machine learning is the hacker art of describing the features of instances that we want to make predictions about, then fitting the data that describes those instances to a model form. Applied machine learning has come a long way from it's beginnings in academia, and with tools like Scikit-Learn, it's easier than ever to generate operational models for a wide variety of applications. Thanks to the ease and variety of the tools in Scikit-Learn, the primary job of the data scientist is model selection. Model selection involves performing feature engineering, hyperparameter tuning, and algorithm selection. These dimensions of machine learning often lead computer scientists towards automatic model selection via optimization (maximization) of a model's evaluation metric. However, the search space is large, and grid search approaches to machine learning can easily lead to failure and frustration. Human intuition is still essential to machine learning, and visual analysis in concert with automatic methods can allow data scientists to steer model selection towards better fitted models, faster. In this talk, we will discuss interactive visual methods for better understanding, steering, and tuning machine learning models.
  • 3. So I read about this great ML model
  • 4. Koren, Yehuda, Robert Bell, and Chris Volinsky. "Matrix factorization techniques for recommender systems." Computer 42.8 (2009): 30-37.
  • 5. def nnmf(R, k=2, steps=5000, alpha=0.0002, beta=0.02): n, m = R.shape P = np.random.rand(n,k) Q = np.random.rand(m,k).T for step in range(steps): for idx in range(n): for jdx in range(m): if R[idx][jdx] > 0: eij = R[idx][jdx] - np.dot(P[idx,:], Q[:,jdx]) for kdx in range(K): P[idx][kdx] = P[idx][kdx] + alpha * (2 * eij * Q[kdx][jdx] - beta * P[idx][kdx]) Q[kdx][jdx] = Q[kdx][jdx] + alpha * (2 * eij * P[idx][kdx] - beta * Q[kdx][jdx]) e = 0 for idx in range(n): for jdx in range(m): if R[idx][jdx] > 0: e += (R[idx][jdx] - np.dot(P[idx,:], Q[:,jdx])) ** 2 if e < 0.001: break return P, Q.T
  • 7. from sklearn.decomposition import NMF model = NMF(n_components=2, init='random', random_state=0) model.fit(R)
  • 8. from sklearn.decomposition import NMF, TruncatedSVD, PCA models = [ NMF(n_components=2, init='random', random_state=0), TruncatedSVD(n_components=2), PCA(n_components=2), ] for model in models: model.fit(R)
  • 10. Made Possible by the Scikit-Learn API Buitinck, Lars, et al. "API design for machine learning software: experiences from the scikit-learn project." arXiv preprint arXiv:1309.0238 (2013). class Estimator(object): def fit(self, X, y=None): """ Fits estimator to data. """ # set state of self return self def predict(self, X): """ Predict response of X """ # compute predictions pred return pred class Transformer(Estimator): def transform(self, X): """ Transforms the input data. """ # transform X to X_prime return X_prime class Pipeline(Transfomer): @property def named_steps(self): """ Returns a sequence of estimators """ return self.steps @property def _final_estimator(self): """ Terminating estimator """ return self.steps[-1]
  • 11. Algorithm design stays in the hands of Academia
  • 13. The Model Selection Triple Arun Kumar http://bit.ly/2abVNrI Feature Analysis Algorithm Selection Hyperparameter Tuning
  • 14. The Model Selection Triple - Define a bounded, high dimensional feature space that can be effectively modeled. - Transform and manipulate the space to make modeling easier. - Extract a feature representation of each instance in the space. Feature Analysis
  • 15. Algorithm Selection The Model Selection Triple - Select a model family that best/correctly defines the relationship between the variables of interest. - Define a model form that specifies exactly how features interact to make a prediction. - Train a fitted model by optimizing internal parameters to the data.
  • 16. Hyperparameter Tuning The Model Selection Triple - Evaluate how the model form is interacting with the feature space. - Identify hyperparameters (parameters that affect training or the prior, not prediction) - Tune the fitting and prediction process by modifying these params.
  • 17. Can it be automated?
  • 18. Regularization is a form of automatic feature analysis. X0 X1 X0 X1 L1 Normalization Possibility that a feature is eliminated by setting its coefficient equal to zero. L2 Normalization Features are kept balanced by minimizing the relative change of coefficients during learning.
  • 19. Automatic Model Selection Criteria from sklearn.cross_validation import KFold kfolds = KFold(n=len(X), n_folds=12) scores = [ model.fit( X[train], y[train] ).score( X[test], y[test] ) for train, test in kfolds ] F1 R2
  • 20. Automatic Model Selection: Try Them All! from sklearn.svm import SVC from sklearn.neighbors import KNeighborsClassifier from sklearn.ensemble import RandomForestClassifier from sklearn.ensemble import AdaBoostClassifier from sklearn.naive_bayes import GaussianNB from sklearn import cross_validation as cv classifiers = [ KNeighborsClassifier(5), SVC(kernel="linear", C=0.025), RandomForestClassifier(max_depth=5), AdaBoostClassifier(), GaussianNB(), ] kfold = cv.KFold(len(X), n_folds=12) max([ cv.cross_val_score(model, X, y, cv=kfold).mean for model in classifiers ])
  • 21. Automatic Model Selection: Search Param Space from sklearn.feature_extraction.text import * from sklearn.linear_model import SGDClassifier from sklearn.grid_search import GridSearchCV from sklearn.pipeline import Pipeline pipeline = Pipeline([ ('vect', CountVectorizer()), ('tfidf', TfidfTransformer()), ('model', SGDClassifier()), ]) parameters = { 'vect__max_df': (0.5, 0.75, 1.0), 'vect__max_features': (None, 5000, 10000), 'tfidf__use_idf': (True, False), 'tfidf__norm': ('l1', 'l2'), 'model__alpha': (0.00001, 0.000001), 'model__penalty': ('l2', 'elasticnet'), } search = GridSearchCV(pipeline, parameters) search.fit(X, y)
  • 22. Maybe not so Wizard?
  • 23. Automatic Model Selection: Search? Search is difficult particularly in high dimensional space. Even with techniques like genetic algorithms or particle swarm optimization, there is no guarantee of a solution. As the search space gets larger, the amount of time increases exponentially.
  • 24. Anscombe, Francis J. "Graphs in statistical analysis." The American Statistician 27.1 (1973): 17-21. Anscombe’s Quartet
  • 25. Through visualization we can steer the model selection process
  • 26. Model Selection Management Systems Kumar, Arun, et al. "Model selection management systems: The next frontier of advanced analytics." ACM SIGMOD Record 44.4 (2016): 17-22. Optimized Implementations User Interfaces and DSLs Model Selection Triples { {FE} x {AS} X {HT} }
  • 28. Data Management Wrangling Standardization Normalization Selection & Joins Model Evaluation + Hyperparameter Tuning Model Selection Feature Analysis Linear Models Nearest Neighbors SVM Ensemble Trees Bayes Feature Analysis Feature Selection Model Selection Revisit Features Iterate! Initial Model Model Storage
  • 29. Data and Model Management
  • 30. Is “GitHub for Data” Enough?
  • 33. Seo, Jinwook, and Ben Shneiderman. "A rank-by-feature framework for interactive exploration of multidimensional data." Information visualization 4.2 (2005): 96-113. Visual Rank by Feature: 1 Dimension Rank by: 1. Normality of distribution (Shapiro-Wilk and Kolmogorov-Smirnov) 2. Uniformity of distribution (entropy) 3. Number of potential outliers 4. Number of hapaxes 5. Size of gap
  • 34. Seo, Jinwook, and Ben Shneiderman. "A rank-by-feature framework for interactive exploration of multidimensional data." Information visualization 4.2 (2005): 96-113. Visual Rank by Feature: 1 Dimension Rank by: 1. Normality of distribution (Shapiro-Wilk and Kolmogorov-Smirnov) 2. Uniformity of distribution (entropy) 3. Number of potential outliers 4. Number of hapaxes 5. Size of gap
  • 35. Seo, Jinwook, and Ben Shneiderman. "A rank-by-feature framework for interactive exploration of multidimensional data." Information visualization 4.2 (2005): 96-113. Visual Rank by Feature: 2 Dimensions Rank by: 1. Correlation Coefficient (Pearson, Spearman) 2. Least-squares error 3. Quadracity 4. Density based outlier detection. 5. Uniformity (entropy of grids) 6. Number of items in the most dense region of the plot.
  • 36. Joint Plots: Diving Deeper after Rank by Feature Special thanks to Seaborn for doing statistical visualization right!
  • 40. Decomposition (PCA, SVD) of Feature Space
  • 43. Receiver Operator Characteristic (ROC) and Area Under Curve (AUC)
  • 46. Model Families vs. Model Forms vs. Fitted Models Rebecca Bilbro http://bit.ly/2a1YoTs
  • 47. kNN Tuning Slider in 2 Dimensions Scott Fortmann-Roe http://bit.ly/29P4SS1
  • 51. Integrating Visual Model Selection with Scikit-Learn Yellowbrick
  • 52. Scikit-Learn Pipelines: fit() and predict() Data Loader Transformer Transformer Estimator Data Loader Transformer Transformer Estimator Transformer
  • 53. Yellowbrick Visual Transformers Data Loader Transformer(s) Feature Visualization Estimator fit() draw() predict() Data Loader Transformer(s) EstimatorCV Evaluation Visualization fit() predict() score() draw()
  • 54. Model Selection Pipelines Multi-Estimator Visualization Data Loader Transformer(s) EstimatorEstimatorEstimatorEstimator Cross Validation Cross Validation Cross Validation Cross Validation
  • 55. Employ Interactivity to Visualize More Health and Wealth of Nations Recreated by Mike Bostock Originally by Hans Rosling http://bit.ly/29RYBJD
  • 56. Visual Analytics Mantra: Overview First; Zoom & Filter; Details on Demand Heer, Jeffrey, and Ben Shneiderman. "Interactive dynamics for visual analysis." Queue 10.2 (2012): 30.
  • 57. Codename Trinket Visual Model Management System