Different ways of interpreting
model predictions
Anton Kulesh
Discussion Club #2
InData Labs, 2018
Problem
A black box is a model that accepts inputs and returns predictions, but does not explain how those predictions were obtained.
● credit scoring
● medical diagnostics
● crime detection
● etc.
[Diagram: Data → Magic → Prediction]
1. https://www.techemergence.com/ai-crime-prevention-5-current-applications
2/36
Motivation
1) Understanding and transparency
2) Trust and fairness
3) Prevention of objectionable biases
Usage
● model improvement (model selection, generating new features, etc.)
● identifying data leakage (relying on the model's quality metric alone is not always enough)
● detecting dataset shift (real-world data can differ significantly from the training data)
● insights for business
3/36
General Data Protection Regulation
The GDPR was adopted by the European Parliament in April 2016 and becomes enforceable
throughout the EU in May 2018.
● Non-discrimination
○ race or ethnic origin
○ religious or philosophical beliefs
○ person’s sex life or sexual orientation
○ processing of genetic data
○ data concerning health
○ etc.
https://medium.com/trustableai/gdpr-and-its-impacts-on-machine-learning-applications-d5b5b0c3a815 4/36
General Data Protection Regulation
The GDPR was adopted by the European Parliament in April 2016 and becomes enforceable
throughout the EU in May 2018.
● Right to explanation
○ data subjects have the right to access information
collected about them, and data processors must
ensure that data subjects are notified about the
data collected
○ right to receive “meaningful information about
the logic (algorithm) and possible impact”
1. On the “right to explanation”: https://arxiv.org/abs/1606.08813
5/36
Confusions
A black box model may have a systematic bias or poor generalization
Will the prisoner commit the crime?
https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing
6/36
*Crimes in the U.S. 2016
Percent of population:
- White: 72%
- African American: 12%
- Asian: 5%
https://ucr.fbi.gov/crime-in-the-u.s/2016/crime-in-the-u.s.-2016/topic-pages/tables/table-21
7/36
Confusions
A black box model may have a systematic bias or poor generalization
Is this a husky or a wolf?
Example from [7]
8/36
Interpretability: Definition
Human understanding
● The complexity of the function
○ linear monotonic
○ non-linear monotonic
○ non-linear non-monotonic
● Scope
○ Global
○ Local (why this particular observation was classified this way)
● Application domain
○ model-agnostic
○ model-specific
9/36
Interpretable models
● LASSO / Ridge regression (see the coefficient sketch after this slide)
● Decision tree
● Rule-based algorithms
● Generalized Additive Models
● Quantile regression
● ...what else?
https://www.oreilly.com/ideas/ideas-on-interpreting-machine-learning
10/36
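To make the first item above concrete, a sparse linear model is interpretable because its coefficients directly state how each (standardized) feature moves the prediction. A minimal sketch with scikit-learn; the diabetes dataset and the alpha value are illustrative assumptions, not part of the slides:

```python
# Minimal sketch: a LASSO model whose coefficients serve as a global explanation.
# The diabetes dataset and alpha=0.5 are arbitrary choices for illustration.
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

X, y = load_diabetes(return_X_y=True, as_frame=True)
X_scaled = StandardScaler().fit_transform(X)  # scale so coefficients are comparable

model = Lasso(alpha=0.5).fit(X_scaled, y)

# Non-zero coefficients are the model's own explanation:
# sign = direction of effect, magnitude = strength per standard deviation.
for name, coef in sorted(zip(X.columns, model.coef_), key=lambda t: -abs(t[1])):
    if coef != 0:
        print(f"{name:>6}: {coef:+.2f}")
```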
Global feature importance (tree ensembles)
These values are calculated over the entire dataset
● Gain -- total reduction of loss or impurity contributed by all splits for a given
feature [1]
● Split Count -- how many times a feature is used to split [2]
● Permutation -- randomly permute the values of a feature in the dataset and
observe how the model quality changes (a minimal sketch follows this slide)
● Cover -- the number of training instances covered by the tree nodes that split on a given feature
[1] L. Breiman, J. Friedman et al. Classification and Regression Trees. CRC Press, 1984.
[2] T. Chen, C. Guestrin. XGBoost: A scalable tree boosting system. ACM, 2016.
11/36
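The permutation importance described above is simple to compute by hand. A minimal model-agnostic sketch; the accuracy metric, the numpy-array inputs, and the number of repeats are assumptions for illustration:

```python
# Minimal sketch of permutation importance: shuffle one feature at a time
# on a held-out set and measure how much the model's score drops.
import numpy as np
from sklearn.metrics import accuracy_score

def permutation_importance(model, X_valid, y_valid, n_repeats=5, seed=0):
    rng = np.random.default_rng(seed)
    baseline = accuracy_score(y_valid, model.predict(X_valid))
    importances = {}
    for j in range(X_valid.shape[1]):        # X_valid assumed to be a numpy array
        drops = []
        for _ in range(n_repeats):
            X_perm = X_valid.copy()
            X_perm[:, j] = rng.permutation(X_perm[:, j])  # break the feature/target link
            drops.append(baseline - accuracy_score(y_valid, model.predict(X_perm)))
        importances[j] = float(np.mean(drops))  # larger drop => more important feature
    return importances
```

scikit-learn ships the same idea as sklearn.inspection.permutation_importance, which can be used instead of this hand-rolled version.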
Individualized (Local) methods
Computing feature importance values for each individual prediction
● LIME [7] (see also ELI5)
● SHAP [1,2]
● DeepLIFT [5] (Recursive prediction explanation method for deep learning)
● Layer-Wise Relevance Propagation [6] (interprets predictions of deep networks)
● Attention-Based RNNs [8]
12/36
Local Interpretable Model-agnostic Explanations
LIME is a feature attribution method that can explain the predictions of any
classifier or regressor in a faithful way, by approximating it locally with an
interpretable model
https://github.com/marcotcr/lime 13/36
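For reference, a hedged usage sketch with the lime package from the repository above; X_train, X_test, feature_names, class_names, and model are placeholders assumed to exist (numpy arrays and a scikit-learn-style classifier):

```python
# Sketch of explaining one tabular prediction with the lime package (pip install lime).
from lime.lime_tabular import LimeTabularExplainer

explainer = LimeTabularExplainer(
    X_train,                      # training data as a numpy array
    feature_names=feature_names,
    class_names=class_names,
    mode="classification",
)

# Explain a single instance: LIME perturbs it, queries the model,
# and fits a local weighted linear model.
exp = explainer.explain_instance(X_test[0], model.predict_proba, num_features=5)
print(exp.as_list())              # [(feature condition, weight), ...]
```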
LIME: Explanation example
"Why Should I Trust You?": Explaining the Predictions of Any Classifier
14/36
LIME: Formulation
To find the explanation g, we minimize the following objective (a minimal sketch follows this slide):

ξ(x) = argmin_{g ∈ G} L(f, g, π_x) + Ω(g)

● Ω(g) -- regularization of model complexity (number of non-zero features)
● π_x -- local kernel (exponential kernel)
● L -- loss function (squared loss)
● x' -- simplified inputs (depend on the input space type)
● G -- class of linear functions
15/36
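A minimal tabular sketch of this objective: sample perturbations around x, weight them with an exponential kernel π_x, and fit a weighted linear surrogate as g. Here Ridge stands in for the sparsity penalty Ω, perturbation happens in the raw feature space rather than the paper's binary interpretable representation, and feature_scales (e.g., per-feature standard deviations) is an assumed input, so this is a simplification rather than the reference implementation:

```python
# Rough sketch of the LIME objective for tabular data:
# argmin_g L(f, g, pi_x) + Omega(g), with g a weighted linear model.
import numpy as np
from sklearn.linear_model import Ridge

def lime_tabular(predict_fn, x, feature_scales, n_samples=5000, kernel_width=0.75, seed=0):
    rng = np.random.default_rng(seed)
    # 1) Perturb the instance with Gaussian noise scaled per feature.
    Z = x + rng.normal(0.0, 1.0, size=(n_samples, x.shape[0])) * feature_scales
    # 2) Query the black-box model f on the perturbed points
    #    (predict_fn returns a 1-D array, e.g. probability of one class).
    y = predict_fn(Z)
    # 3) Exponential kernel: closer perturbations get larger weight.
    dist = np.linalg.norm((Z - x) / feature_scales, axis=1)
    weights = np.exp(-(dist ** 2) / kernel_width ** 2)
    # 4) Fit the interpretable surrogate g with weighted least squares
    #    (Ridge stands in for the complexity term Omega).
    g = Ridge(alpha=1.0).fit(Z, y, sample_weight=weights)
    return g.coef_  # local feature attributions around x
```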
LIME: Evaluation with human subjects
● Human subjects were recruited on Amazon Mechanical Turk (100 users)
● Models
○ SVM trained on the original 20 Newsgroups dataset (SVM-I)
○ SVM trained on a “cleaned” 20 Newsgroups dataset (SVM-II)
● Accuracy
○ Test set
SVM-I: 94%, SVM-II: 88.6%
"Why Should I Trust You?": Explaining the Predictions of Any Classifier
16/36
LIME: Evaluation with human subjects
● Human subjects were recruited on Amazon Mechanical Turk (100 users)
● Models
○ SVM trained on the original 20 Newsgroups dataset (SVM-I)
○ SVM trained on a “cleaned” 20 Newsgroups dataset (SVM-II)
● Accuracy
○ Test set
SVM-I: 94%, SVM-II: 88.6%
○ Religion dataset
SVM-I: 57.3%, SVM-II: 69%!!!
"Why Should I Trust You?": Explaining the Predictions of Any Classifier
17/36
LIME: Evaluation with human subjects
● Human subjects were recruited on Amazon Mechanical Turk (100 users)
● Models
○ SVM trained on the original 20 Newsgroups dataset (SVM-I)
○ SVM trained on a “cleaned” 20 Newsgroups dataset (SVM-II)
● Accuracy
○ Test set
SVM-I: 94%, SVM-II: 88.6%
○ Religion dataset
SVM-I: 57.3%, SVM-II: 69%!!!
● LIME vs. Greedy
"Why Should I Trust You?": Explaining the Predictions of Any Classifier
18/36
LIME: Pros and Cons
- The current implementation is still rough
- Too time-consuming for computing global importances
+ Works across different data types (tabular, text, images)
+ Can explain any model
+ Submodular pick (SP-LIME) for selecting representative explanations
19/36
SHapley Additive exPlanation
Scott Lundberg, Su-In Lee. A Unified Approach to Interpreting Model Predictions. NIPS 2017
20/36
SHAP: Additive feature attribution method
The explanation function g is represented as a linear combination of binary variables:

g(z') = φ_0 + Σ_{i=1}^{M} φ_i z'_i,   z' ∈ {0, 1}^M
21/36
SHAP: Additive feature attribution method
The explanation function g is represented as a linear combination of binary variables [2]:

g(z') = φ_0 + Σ_{i=1}^{M} φ_i z'_i,   z' ∈ {0, 1}^M
22/36
SHAP: Properties of additive feature attribution models
● Local accuracy
○ The sum of the feature attributions is equal to the output of the function we are seeking to
explain
● Missingness
○ Features that are missing in the simplified input (z'_i = 0) are attributed no importance
● Consistency
○ Changing the model so that a feature has a larger impact will never decrease the
attribution assigned to that feature
Only one possible explanation model g satisfies all properties…
23/36
SHAP: Consistency
Scott M. Lundberg, Gabriel G. Erion, Su-In Lee. Consistent Individualized Feature Attribution for Tree Ensembles. 24/36
SHAP values
To calculate SHAP values, we combine conditional expectations of the function given subsets of
the input variables with Shapley values from game theory (a brute-force sketch follows this slide):

φ_i = Σ_{S ⊆ N \ {i}} [ |S|! (M − |S| − 1)! / M! ] · [ f_x(S ∪ {i}) − f_x(S) ]

- S is the set of non-zero indices in z'
- f_x(S) = f(h_x(z')) = E[f(x) | x_S] is the expected value of the function conditioned on the
subset S of the input features
25/36
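A brute-force sketch of this formula for a handful of features. The conditional expectation f_x(S) is approximated by fixing the features in S and averaging over a background dataset, which is one common simplifying assumption rather than the paper's exact estimator; x is a 1-D numpy array, background a 2-D array, and predict_fn returns a 1-D array:

```python
# Brute-force Shapley values: exact enumeration over all subsets S of N \ {i}.
# Only feasible for a handful of features (on the order of 2^M model evaluations).
from itertools import combinations
from math import factorial
import numpy as np

def shapley_values(predict_fn, x, background):
    M = x.shape[0]
    features = list(range(M))

    def f_x(S):
        # E[f(x) | x_S] approximated by fixing the features in S to x and
        # averaging the remaining features over a background dataset.
        Z = np.array(background, dtype=float, copy=True)
        Z[:, list(S)] = x[list(S)]
        return predict_fn(Z).mean()

    phi = np.zeros(M)
    for i in features:
        others = [j for j in features if j != i]
        for size in range(len(others) + 1):
            for S in combinations(others, size):
                weight = factorial(len(S)) * factorial(M - len(S) - 1) / factorial(M)
                phi[i] += weight * (f_x(S + (i,)) - f_x(S))
    # Local accuracy: phi.sum() + f_x(()) approximately equals the prediction at x.
    return phi
```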
SHAP: Theoretical proof
Methods not based on Shapley values
violate local accuracy and/or consistency
The authors [1] refer to Young (1985), who demonstrated that Shapley values are the only values that
satisfy three axioms similar to Properties 1 and 3. 26/36
Kernel SHAP: Linear LIME + Shapley values
We can recover Shapley values from the LIME objective, but we must pick specific forms for Ω, π_{x'}, and L [1]:

Ω(g) = 0
π_{x'}(z') = (M − 1) / [ C(M, |z'|) · |z'| · (M − |z'|) ]
L(f, g, π_{x'}) = Σ_{z'} [ f(h_x(z')) − g(z') ]² π_{x'}(z')
27/36
SHAP: force plot
28/36
SHAP: summary plot
● SHAP value distributions
● sorted by global impact
Long tails reaching to the right mean
that extreme values of these
measurements can significantly raise
the risk of death but cannot significantly
lower it
29/36
SHAP: global feature importance
30/36
SHAP: modifications
● Model-Agnostic Approximations
○ Kernel SHAP (Linear LIME + Shapley values)
● Model-Specific Approximations
○ Linear SHAP
○ Low-Order SHAP
○ Max SHAP
○ Deep SHAP (DeepLIFT + Shapley values)
○ Tree SHAP (XGBoost and LightGBM; a usage sketch follows this slide)
31/36
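A hedged usage sketch of the Tree SHAP path with the shap package; the XGBoost model, X, and y are placeholders, and the plotting calls correspond to the force and summary plots on the earlier slides (classic TreeExplainer interface):

```python
# Sketch of Tree SHAP via the shap package (pip install shap) for an XGBoost
# model; X (pandas DataFrame) and y are assumed to exist. For a binary
# classifier the attributions live in the model's margin (log-odds) space.
import shap
import xgboost as xgb

model = xgb.XGBClassifier(n_estimators=200).fit(X, y)

explainer = shap.TreeExplainer(model)      # polynomial-time, exact for trees
shap_values = explainer.shap_values(X)     # one attribution per feature per row

# Local accuracy check (binary case): base value + attributions recover the
# model's margin output for the first row.
print(explainer.expected_value + shap_values[0].sum())

# Plots corresponding to the force and summary plots on the earlier slides
# (in a plain script, pass matplotlib=True to force_plot to render it).
shap.force_plot(explainer.expected_value, shap_values[0], X.iloc[0])
shap.summary_plot(shap_values, X)
```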
Attention-Based RNNs: Image description
https://arxiv.org/pdf/1502.03044.pdf 32/36
Attention-Based RNNs: Image description
https://arxiv.org/pdf/1502.03044.pdf 33/36
Attention-Based RNNs: Medical codes prediction
https://arxiv.org/abs/1802.05695 34/36
What are the benefits?
● We can understand decisions of complex models more clearly
● We can prevent overfitting and assess the fairness of the model
● We can improve our model
● We can choose an appropriate model for production
● We can derive insights for the business
● ....
35/36
Sources
[1] S. M. Lundberg, S.-I. Lee. A Unified Approach to Interpreting Model Predictions. arXiv, 2017.
[2] S. M. Lundberg, G. G. Erion, S.-I. Lee. Consistent Individualized Feature Attribution for Tree
Ensembles. arXiv, 2018.
[3] B. Goodman, S. Flaxman. European Union Regulations on Algorithmic Decision-Making and a "Right to
Explanation". arXiv, 2016.
[4] P. Wu. GDPR and Its Impacts on Machine Learning Applications. Medium, 2017.
[5] A. Shrikumar, P. Greenside, A. Kundaje. Learning Important Features Through Propagating Activation
Differences. arXiv:1704.02685, 2017.
[6] S. Bach et al. On Pixel-Wise Explanations for Non-Linear Classifier Decisions by Layer-Wise
Relevance Propagation. PLoS ONE 10(7): e0130140, 2015.
[7] M. T. Ribeiro, S. Singh, C. Guestrin. "Why Should I Trust You?": Explaining the Predictions
of Any Classifier. ACM KDD, 2016.
[8] K. Xu, J. L. Ba, R. Kiros et al. Show, Attend and Tell: Neural Image Caption Generation with
Visual Attention. arXiv, 2016.
36/36
