Machine Learning Interpretation Lab using Aquarium
Confidential2
Lab Session Initiation
Link to Aquarium: http://aquarium.h2o.ai
License key: bit.ly/diveintoh2o
LOGGING IN TO THE LAB
Info
Pramit Choudhary
@MaverickPramit
www.linkedin.com/in/pramitc/
pramit.choudhary@h2o.ai
● Lead Data Scientist (ML Scientist) at h2o.ai
● Previously Lead Data Scientist at DataScience.com (acquired by Oracle)
● Exploring better ways to evaluate, extract, and explain the learned decision policies of predictive models at h2o.ai
● For some time, also used machine learning algorithms to help people find love at eHarmony
Machine Learning workflow
Figure: An improved ML workflow showing where MLI fits into a typical predictive-analytics setup
**Reference: Adapted from “Learning From Data” – Professor Yaser Abu-Mostafa
Model Interpretation
Learning from model inference
Driverless AI Components
Automatic
Visualization
Machine Learning
Interpretability
Machine Learning
Experimentation
Project
Management
ML
Recipes
Scoring Pipeline
and Deployment
Feature
Engineering
“Bringing the human into the
loop”
Problems Addressed by Driverless AI
• Supervised Learning
– Regression
– Classification
– Binary
– Multinomial
• Tabular structured data
– Numeric
– Categorical
– Time/Date
– Text
– Missing values are allowed
• Independent and identically distributed (IID) rows
• Time-series
– Single time-series
– Grouped time-series
– Time-series problems with a gap
between training and testing to
account for time to deploy
• NLP
– Categorization
– Sentiment
What is Model Interpretation?
• Ability to explain and present a model in a way that is understandable to humans.
-- Finale Doshi-Velez and Been Kim. “Towards a Rigorous Science of Interpretable Machine Learning.” arXiv preprint, 2017. https://arxiv.org/pdf/1702.08608.pdf
• “Why did you do A?”, “Why DIDN’T you do B?”, “Why CAN’T you do C?”
-- Gilpin, Leilani H., et al. “Explaining Explanations: An Approach to Evaluating Interpretability of Machine Learning.” arXiv preprint arXiv:1806.00069 (2018).
• A model’s result is self-descriptive and needs no further explanation; it is expressed in terms of inputs and outputs. “Mapping an abstract concept (e.g., a predicted class) into a domain that the human can make sense of.”
-- Grégoire Montavon et al. “Methods for Interpreting and Understanding Deep Neural Networks”
“A collection of visual and/or interactive artifacts
that provide a user with sufficient description of
the model behavior to accurately perform tasks
like evaluation, trusting, predicting, or improving
the model.”
-- Assistant Professor Sameer Singh, University of California, Irvine
Motives for Model Interpretation
Model Maker (Producer)
1. Helps in defining hypotheses for the problem.
2. Debugging and improving an ML system – e.g., catching mismatched objectives.
3. Exploring and discovering latent or hidden feature interactions (useful for feature engineering/selection and for resolving preconceptions).
4. Understanding model variability to avoid over-fitting.
5. Helps in model comparison.
6. Building domain knowledge about a particular use case.
7. Bringing transparency to decision making, to enable trust and safety – is the ML system making sound decisions?
Model Breaker (Consumer)
1. Ability to share the explanations with consumers of the predictive model.
2. Explain the model/algorithm.
3. Explain the key features driving the KPI.
4. Verify and validate the accountability of ML systems, e.g., causes of false positives in credit scoring or insurance-claim fraud (identifying spurious correlations is easier for humans).
5. Identify blind spots to prevent adversarial attacks or fix dataset errors.
6. Right to explanation: comply with data-protection laws and regulations, e.g., the EU’s GDPR.
How will it help?
Currently:
ML use case = data munging + data scientist expertise + model building with available computation
How can we scale this 10x?
ML use case = data munging + data scientist expertise + model building with available computation
+ Driverless Machine Learning + Machine Learning Interpretation (MLI)
Scope of Interpretation
Global Interpretation
Being able to explain the conditional interaction
between dependent (response) variables and
independent (predictor or explanatory)
variables based on the complete dataset.
Helpful in explaining the context of the decision
classification.
Local Interpretation
Being able to explain the conditional
interaction between dependent (response)
variables and independent (predictor or
explanatory) variables for a single row or a
subset of rows. Helpful in identifying local
trends and intuitions.
Datasets
• Supervised learning problem:
– Titanic dataset: https://www.kaggle.com/c/titanic/data
• Time-series problem:
– Walmart sales dataset: https://www.kaggle.com/c/walmart-recruiting-store-sales-forecasting/data
Algorithms
1. Native Interpretation
2. Post-hoc Interpretation
Experiment Settings
Dials: Time, Accuracy, Interpretability
• Relative interpretability – higher values favor more interpretable models
• The higher the interpretability setting, the lower the complexity of the engineered features and of the final model(s)
Native Interpretability

| Interpretability | Ensemble Level | Target Transformation | Feature Engineering | Feature Pre-Pruning | Monotonicity Constraints |
|---|---|---|---|---|---|
| 1–3 | <= 3 | | | None | Disabled |
| 4 | <= 3 | Inverse | | None | Disabled |
| 5 | <= 3 | Anscombe | Clustering (ID, distance), Truncated SVD | None | Disabled |
| 6 | <= 2 | Logit, Sigmoid | | Feature selection | Disabled |
| 7 | <= 2 | | Frequency Encoding | Feature selection | Enabled |
| 8 | <= 1 | 4th Root | | Feature selection | Enabled |
| 9 | <= 1 | Square, Square Root | Bulk Interactions (add, subtract, multiply, divide), Weight of Evidence | Feature selection | Enabled |
| 10 | 0 | Identity, Unit Box, Log | Date Decompositions, Number Encoding, Target Encoding, Text (TF-IDF, Frequency) | Feature selection | Enabled |
Good start
Post-hoc Interpretation/Evaluation
Reason Codes using K-LIME
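K-LIME's core idea – cluster the feature space, then fit one interpretable linear surrogate per cluster against the complex model's predictions, so local coefficients act as reason codes – can be sketched as follows. The data, cluster count, and choice of models are illustrative assumptions, not Driverless AI's actual implementation:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = X[:, 0] ** 2 + 2 * X[:, 1] + rng.normal(scale=0.1, size=500)

# "Black-box" model whose predictions we want to explain.
model = GradientBoostingRegressor(random_state=0).fit(X, y)
preds = model.predict(X)

# K-LIME sketch: cluster the rows, then fit a linear surrogate per cluster
# on the black-box predictions (not on the original target).
k = 4
clusters = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
surrogates = {}
for c in range(k):
    mask = clusters == c
    surrogates[c] = Ridge(alpha=1.0).fit(X[mask], preds[mask])

# Reason codes for one row: local coefficient * feature value,
# ranked by absolute contribution.
row = X[0]
c = int(clusters[0])
coefs = surrogates[c].coef_
reason_codes = sorted(zip(["f0", "f1", "f2"], coefs * row),
                      key=lambda t: -abs(t[1]))
print(reason_codes[0][0])  # feature contributing most, locally
```

Because each surrogate is linear, its coefficients are directly readable, while clustering keeps each surrogate faithful to only a local region of the model.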
PDP/ICE
p(HouseAge, Avg. Occupants per Household) vs. Avg. House Value: one can observe that once average occupancy > 2, HouseAge does not seem to have much effect on the average house value
Image credit: Patrick Hall, Navdeep Gill, h2o.ai team
• Helps in visually understanding the interaction impact of two independent features in a low-dimensional space
• Helps in understanding the average partial dependence of the target function f(Y|X) on a subset of features by marginalizing over the rest of the features (the complement set of features)
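The marginalization described above can be computed by brute force: fix the feature of interest at each grid value, predict for every row, and average. A minimal sketch on synthetic data (the model and data are illustrative):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(400, 2))
y = 3 * X[:, 0] + rng.normal(scale=0.05, size=400)  # target depends only on f0

model = RandomForestRegressor(random_state=0).fit(X, y)

def partial_dependence(model, X, feature, grid):
    """Average prediction with `feature` forced to each grid value,
    marginalizing over the remaining (complement) features."""
    pd_vals = []
    for v in grid:
        Xv = X.copy()
        Xv[:, feature] = v
        pd_vals.append(model.predict(Xv).mean())
    return np.array(pd_vals)

grid = np.linspace(-1, 1, 5)
pd_f0 = partial_dependence(model, X, 0, grid)
pd_f1 = partial_dependence(model, X, 1, grid)
# f0 drives the target, so its PD curve should vary far more than f1's.
print(bool(np.ptp(pd_f0) > np.ptp(pd_f1)))  # True
```

Keeping the per-row predictions instead of averaging them would give the ICE curves; the PD curve is just their mean.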
Leave One Covariate Out (LOCO)
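A minimal LOCO sketch: refit the model without each covariate and treat the per-row change in prediction as that covariate's local importance. This refitting variant is one of several LOCO formulations; Driverless AI's exact procedure may differ, and the linear model and data here are illustrative:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 3))
y = 4 * X[:, 0] + 1 * X[:, 1] + rng.normal(scale=0.1, size=300)  # f2 is noise

full = LinearRegression().fit(X, y)
base = full.predict(X)

# LOCO: refit without each covariate j; the per-row absolute change in
# prediction is j's local importance.
loco = {}
for j in range(X.shape[1]):
    keep = [c for c in range(X.shape[1]) if c != j]
    reduced = LinearRegression().fit(X[:, keep], y)
    loco[j] = np.abs(base - reduced.predict(X[:, keep]))

mean_imp = {j: v.mean() for j, v in loco.items()}
print(mean_imp[2] < mean_imp[0])  # the noise feature matters least
```

Unlike permutation importance, LOCO produces a full vector of per-row importances, which is why it fits naturally into local interpretation.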
Shapley Values
• The explanation idea is borrowed from coalitional game theory
• The initial idea was proposed by Lloyd S. Shapley (1953)
-- Lloyd S. Shapley, Alvin E. Roth, et al. The Shapley Value: Essays in Honor of Lloyd S. Shapley. Cambridge University Press, 1988. URL: http://www.library.fa.ru/files/Roth2.pdf
• Idea:
– A feature’s contribution is computed by observing the difference between the model’s initial prediction and its average prediction after perturbing the feature space
– One has to be careful with dependent (correlated) features while perturbing
• Defined as the weighted average of a feature’s marginal contributions over all possible coalitions S of the feature set N:
φ_i(v) = Σ_{S ⊆ N\{i}} [|S|! (|N| − |S| − 1)! / |N|!] (v(S ∪ {i}) − v(S))
-- Erik Štrumbelj and Igor Kononenko. “An Efficient Explanation of Individual Classifications Using Game Theory.” Journal of Machine Learning Research, 11(Jan):1–18, 2010. URL: http://www.jmlr.org/papers/volume11/strumbelj10a/strumbelj10a.pdf
-- Scott M. Lundberg and Su-In Lee. “A Unified Approach to Interpreting Model Predictions.” URL: http://papers.nips.cc/paper/7062-a-unified-approach-to-interpreting-model-predictions.pdf
• Scope of interpretation: helps in computing globally and locally faithful feature importance
• Flexible: can be adopted in different forms – model-agnostic or model-specific approximations (Tree SHAP for tree ensembles, Linear SHAP)
Surrogate Decision Trees
• If the original learned decision function (e.g., of a DNN) is denoted g, with predictions g(X) = Ŷ, then a surrogate tree h_tree can be learned such that h_tree(X, Ŷ) ≈ g(X)
• The faithfulness of the decisions generated by the surrogate decision tree depends on how precisely the tree surrogate captures the decisions learned by the original estimator g
-- Craven, M., & Shavlik, J. W. (1996). “Extracting Tree-Structured Representations of Trained Networks.” In Advances in Neural Information Processing Systems (pp. 24–30).
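The h_tree(X, Ŷ) ≈ g(X) relationship can be sketched in a few lines: train the surrogate on the black-box's predictions, not on the true labels, and measure fidelity as agreement with g. The models and data here are illustrative (a gradient-boosted classifier stands in for the original estimator g):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(4)
X = rng.normal(size=(600, 4))
y = (X[:, 0] + X[:, 1] ** 2 > 0.5).astype(int)

# g: the original "black-box" estimator.
g = GradientBoostingClassifier(random_state=0).fit(X, y)
y_hat = g.predict(X)

# h_tree: a shallow surrogate trained on (X, y_hat), NOT on y.
h_tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y_hat)

# Fidelity: how often the surrogate reproduces g's decisions.
fidelity = (h_tree.predict(X) == y_hat).mean()
print(fidelity)  # high fidelity => the tree is a faithful summary of g
```

Keeping the surrogate shallow is the point: a depth-3 tree can be read as a handful of human-checkable rules, and the fidelity score quantifies exactly the faithfulness caveat in the bullet above.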
Exploring: Capturing Interactions
• Capture feature interactions
• The partial dependence function could be extended to capture feature interactions by applying statistical tests (e.g., Friedman’s H-statistic)
Image: H-statistic scores for all possible pairs of features, ordered in terms of importance
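The pairwise H-statistic pictured above can be computed from partial dependence alone: if two features do not interact, their joint PD decomposes into the sum of their individual PDs, and H² measures the variance that this decomposition fails to explain. A brute-force sketch (model, data, and sample size are illustrative assumptions):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(5)
n = 100
X = rng.uniform(-1, 1, size=(n, 3))
y = X[:, 0] * X[:, 1] + X[:, 2]          # f0 and f1 interact; f2 is additive
model = RandomForestRegressor(random_state=0).fit(X, y)

def pd_at_rows(model, X, cols):
    """Partial dependence of `cols`, evaluated at each row's own values."""
    out = np.empty(len(X))
    for i in range(len(X)):
        Xi = X.copy()
        Xi[:, cols] = X[i, cols]          # fix `cols` to row i's values
        out[i] = model.predict(Xi).mean() # marginalize the other features
    return out - out.mean()               # center, as the H-statistic requires

def h_squared(model, X, j, k):
    """Friedman's pairwise H²: share of joint-PD variance not explained
    by the sum of the one-way PDs."""
    pd_j = pd_at_rows(model, X, [j])
    pd_k = pd_at_rows(model, X, [k])
    pd_jk = pd_at_rows(model, X, [j, k])
    return ((pd_jk - pd_j - pd_k) ** 2).sum() / (pd_jk ** 2).sum()

h01 = h_squared(model, X, 0, 1)
h02 = h_squared(model, X, 0, 2)
print(h01 > h02)  # the truly interacting pair scores higher
```

Ranking all pairs by H² reproduces the ordering shown in the slide's image.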
Time Series
What is a Time Series Problem?
Figure: Three “Sales over time” charts (Dec 2017 – Feb 2018) contrasting a nonlinear (seasonal) relationship with a linear relationship
Time Groups
Figure: “Sales per day (all groups)” vs. “sales by group” (group 1, group 2, group 3), Dec 2017 – Mar 2018, with the underlying data:

| time | group | sales |
|---|---|---|
| 01/01/2018 | group1 | 30 |
| 01/01/2018 | group2 | 100 |
| 01/01/2018 | group3 | 10 |
| 02/01/2018 | group1 | 60.2 |
| 02/01/2018 | group2 | 200.2 |
| 02/01/2018 | group3 | 20.2 |
| 03/01/2018 | group1 | 90.3 |
| 03/01/2018 | group2 | 300.3 |
| 03/01/2018 | group3 | 30.3 |
| 04/01/2018 | group1 | 120.4 |
| 04/01/2018 | group2 | 400.4 |
| 04/01/2018 | group3 | 40.4 |
MLI for Time Series
Can we think?
An interactive playground to simulate, infer, and evaluate all forms of model what-if (scenario) analysis, identifying counterfactuals to establish model robustness
Image reference: Rajeswaran et al. (http://arxiv.org/abs/1703.02660)
Template
Image credit: Patrick Hall
• https://www.fatml.org/resources/principles-for-accountable-algorithms
• https://www.darpa.mil/program/explainable-artificial-intelligence
• https://www.oreilly.com/ideas/ideas-on-interpreting-machine-learning
• https://twitter.com/jpatrickhall/status/1073572209988255744
• https://github.com/Castro38/automatic-ml-intro-tutorial-h2oai/blob/master/automatic-ml-intro-tutorial.md
• https://github.com/h2oai/mli-resources
• https://github.com/jphall663/awesome-machine-learning-interpretability
• https://www.h2o.ai/blog/what-is-your-ai-thinking-part-1/
Questions?
Driverless AI – Machine Learning Interpretability
Gain confidence in models before deploying them!

Get hands-on with Explainable AI at the Machine Learning Interpretability (MLI) Gym!
