1 KYOTO UNIVERSITY
Interpretability of Machine Learning
Daiki Tanaka
Kashima lab., Kyoto University
Paper Reading Seminar, 2018/7/6(Fri)
2 KYOTO UNIVERSITY
Background:
What is interpretability?
n Interpretation is the process of giving explanations to humans.
n There is no clear definition, but it is
l To explain not only output of models (what) but also the reason
(why).
l To interpret the model by using some method or criteria.
3 KYOTO UNIVERSITY
Background:
What can we do by interpreting the model?
n We can make fair and ethical decisions.
l (e.g.) Medical diagnosis, terrorism detection
n We can trust the model prediction.
n We can generate hypotheses.
l (e.g.) A simple regression model might reveal a strong association between
thalidomide use and birth defects, or smoking and lung cancer (Wang et al., 1999).
Wang et al. Smoking and the occurrence of Alzheimer’s disease: Cross-sectional and longitudinal data in a population-based
study. American Journal of Epidemiology, 1999.
4 KYOTO UNIVERSITY
Background:
Approach to interpreting
1. Select an interpretable model.
l Rule-based methods, decision trees, linear regression models…
Sometimes we cannot get enough accuracy this way.
2. Use some method to interpret.
5 KYOTO UNIVERSITY
Background:
Two ways of interpreting
n Global interpretability
l Interpreting the overall tendency of the model or the data.
l Sometimes not accurate locally.
n Local interpretability
l Interpreting the model or the data in a limited region.
l We can get more accurate explanations locally.
6 KYOTO UNIVERSITY
Background:
Overview
Model Specific Model Agnostic
Global interpretability • Regression coefficients
• Feature importance
…
• Surrogate models
• Sensitivity Analysis
…
Local interpretability • Maximum Activation
Analysis
…
• LIME
• LOCO
• SHAP
…
Model agnostic : treating the original model as a black box.
7 KYOTO UNIVERSITY
Background:
Overview
Model Specific Model Agnostic
Global interpretability • Regression coefficients
• Feature importance
…
• Surrogate models
• Sensitivity Analysis
…
Local interpretability • Maximum Activation
Analysis
…
• LIME
• LOCO
• SHAP
…
8 KYOTO UNIVERSITY
Background:
Model Specific & Global interpretability
n Sparse Linear Regression
y = w^⊤ x,   with w estimated by Lasso:  min_w ‖y − Xw‖₂² + λ‖w‖₁
By doing Lasso regression, we know which features are important.
n Feature importance
l Random Forest (e.g. feature_importances_ in scikit-learn)
l In a model-agnostic way:
l do ROC analysis for each explanatory variable (see the sketch below).
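To make the bullets above concrete, here is a minimal scikit-learn sketch (not from the slides; the diabetes dataset and the regularization strength are illustrative assumptions) that reads Lasso coefficients and random-forest feature importances as global, model-specific explanations.

```python
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Lasso

# Small tabular dataset (illustrative choice, not from the slides).
data = load_diabetes()
X, y, feature_names = data.data, data.target, data.feature_names

# Sparse linear regression: Lasso drives unimportant coefficients to zero.
lasso = Lasso(alpha=0.5).fit(X, y)
for name, coef in zip(feature_names, lasso.coef_):
    if coef != 0.0:
        print(f"lasso  {name}: {coef:+.2f}")

# Random forest: impurity-based feature importances.
rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
for name, imp in sorted(zip(feature_names, rf.feature_importances_),
                        key=lambda t: -t[1]):
    print(f"forest {name}: {imp:.3f}")
```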
9 KYOTO UNIVERSITY
Background:
Model Specific Model Agnostic
Global interpretability • Regression coefficients
• Feature importance
…
• Surrogate models
• Sensitivity Analysis
…
Local interpretability • Maximum Activation
Analysis
…
• LIME
• LOCO
• SHAP
…
10 KYOTO UNIVERSITY
Background:
Model Agnostic & Global Interpretability
n Surrogate models
l Finding a simple model that can substitute for the complex model.
l Surrogate models are usually created by training a linear regression or decision
tree on the original inputs and predictions of a complex model.
n Sensitivity Analysis
l Investigating whether model behavior and outputs remain stable when data is
perturbed.
S(x) = ∂f / ∂x

[Figure: a simple surrogate model is taught using the original inputs and the complex model's predictions.]
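A minimal sketch of both ideas follows (the gradient-boosting "complex" model, the depth-3 surrogate tree, and the finite-difference step are assumptions, not from the slides): a global surrogate trained on the complex model's predictions, and a crude estimate of S(x) = ∂f/∂x via finite differences.

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.tree import DecisionTreeRegressor, export_text

X, y = load_diabetes(return_X_y=True)

# Complex model we want to interpret.
complex_model = GradientBoostingRegressor(random_state=0).fit(X, y)

# Global surrogate: fit a simple tree on the complex model's predictions.
surrogate = DecisionTreeRegressor(max_depth=3, random_state=0)
surrogate.fit(X, complex_model.predict(X))
print(export_text(surrogate))  # human-readable rules approximating the model

# Sensitivity analysis: S_j(x) ≈ (f(x + h e_j) - f(x)) / h for each feature j.
x = X[0].copy()
h = 1e-3
base = complex_model.predict(x.reshape(1, -1))[0]
for j in range(X.shape[1]):
    x_pert = x.copy()
    x_pert[j] += h
    s_j = (complex_model.predict(x_pert.reshape(1, -1))[0] - base) / h
    print(f"feature {j}: sensitivity {s_j:+.2f}")
```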
11 KYOTO UNIVERSITY
Background:
Overview
Model Specific Model Agnostic
Global interpretability • Regression coefficients
• Feature importance
…
• Surrogate models
• Sensitivity Analysis
…
Local interpretability • Maximum Activation
Analysis
…
• LIME
• LOCO
• SHAP
…
12 KYOTO UNIVERSITY
Background:
Model specific & Local Interpretability
n Maximum Activation Analysis
l Find examples that maximally activate certain neurons, layers, or filters in a
neural network, or certain trees in decision-tree ensembles.
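As a toy illustration of maximum activation analysis (random data and random weights here; in practice the weights come from a trained network), the sketch below scans a dataset for the examples that most strongly activate one hidden unit.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))           # dataset of 1000 instances, 20 features

# One hidden layer with 8 ReLU units (weights would come from a trained model).
W = rng.normal(size=(20, 8))
b = rng.normal(size=8)
activations = np.maximum(X @ W + b, 0.0)  # shape (1000, 8)

unit = 3                                  # the neuron we want to understand
top = np.argsort(activations[:, unit])[::-1][:5]
print("instances that maximally activate unit", unit, ":", top)
print("their activation values:", activations[top, unit])
```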
13 KYOTO UNIVERSITY
Background:
Model Specific Model Agnostic
Global interpretability • Regression coefficients
• Feature importance
…
• Surrogate models
• Sensitivity Analysis
…
Local interpretability • Maximum Activation
Analysis
…
• LIME
• LOCO
• SHAP
…
14 KYOTO UNIVERSITY
Today’s paper:
paper of LIME
n Title : “Why Should I Trust You?” Explaining the Predictions
of Any Classifier (KDD ‘16)
n Authors:
Marco Tulio Ribeiro
University of Washington
Sameer Singh
University of Washington
Carlos Guestrin
University of Washington
15 KYOTO UNIVERSITY
Background:
“If the users do not trust a model or a prediction, they will not use it.”
n Determining trust in individual predictions is an important problem
when the model is used for decision making.
n By “explaining a prediction”, they mean presenting textual or visual
artifacts that provide qualitative understanding of the relationship
between the instance’s components (e.g. words in text, patches in an
image) and the model’s prediction.
16 KYOTO UNIVERSITY
Background :
Desired characteristics for explainers
n They must be interpretable : they must provide a good understanding of the
relationship between the input variables and the response.
n Another essential criterion is local fidelity : explainers must correspond to
how the model behaves around the instances being explained.
n Explainers should be model-agnostic : treating the original model as a black
box.
17 KYOTO UNIVERSITY
Proposed method
18 KYOTO UNIVERSITY
Interpretable Data Representations:
Features – Interpretable Representations
n x ∈ ℝ^d : original representation of an instance being explained.
l For text classification, x may be a feature vector such as a word embedding.
l For image classification, x is a tensor with three color channels per pixel.
n x′ ∈ {0, 1}^d′ : a binary vector for its interpretable representation.
l For text classification, x′ is a binary vector indicating the presence or absence of
a word.
l For image classification, x′ is a binary vector indicating the “presence” or
“absence” of a contiguous patch of similar pixels (a super-pixel).
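A tiny sketch of the text case (the sentence and vocabulary are made up): x′ is a binary vector over the words of the instance, and turning a bit off removes that word.

```python
# x' is a binary vector over the instance's words; flipping a bit to 0 removes
# that word from the instance.
text = "quick brown fox jumps over the lazy dog"
words = text.split()                  # the d' interpretable components
d_prime = len(words)

def to_text(z_prime):
    """Map a binary interpretable vector back to a (perturbed) text instance."""
    return " ".join(w for w, keep in zip(words, z_prime) if keep)

x_prime = [1] * d_prime               # the original instance: all words present
z_prime = [1, 0, 1, 1, 0, 1, 1, 1]    # a perturbation: "brown" and "over" removed
print(to_text(x_prime))               # quick brown fox jumps over the lazy dog
print(to_text(z_prime))               # quick fox jumps the lazy dog
```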
19 KYOTO UNIVERSITY
Proposed method : KEY IDEA
Training the explainer on instances sampled around x.
[Figure: the original complex decision boundary, the instance being explained x, positive and negative samples drawn around it, and the linear explainer fit locally.]
20 KYOTO UNIVERSITY
Proposed method : KEY IDEA
How to sample around x?
n They sample instances around x′ by drawing non-zero elements of x′
uniformly at random.
n Given a perturbed sample z′ ∈ {0, 1}^d′ , they recover the original
representation z ∈ ℝ^d and obtain f(z), which is used as a label for the
explanation model.
[Figure: the original representation x is mapped to its interpretable representation x′; perturbed samples z′ are drawn from x′, mapped back to z, and labeled with f(z).]
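The sketch below illustrates this sampling loop under toy assumptions (a made-up black-box classifier f and a made-up sentence): perturbations z′ are drawn from x′, mapped back to text z, and labeled with f(z).

```python
import numpy as np

rng = np.random.default_rng(0)
words = ["quick", "brown", "fox", "jumps", "over", "the", "lazy", "dog"]
d_prime = len(words)

def f(text):
    """Toy black-box classifier: probability of class 1 (assumption for the demo)."""
    return 1.0 if "fox" in text else 0.0

def to_text(z_prime):
    """Recover a perturbed instance z from its interpretable vector z'."""
    return " ".join(w for w, keep in zip(words, z_prime) if keep)

x_prime = np.ones(d_prime, dtype=int)

# Perturb x' by randomly switching off some of its non-zero entries.
num_samples = 5
Z_prime = rng.integers(0, 2, size=(num_samples, d_prime)) & x_prime
labels = np.array([f(to_text(z_prime)) for z_prime in Z_prime])
for z_prime, label in zip(Z_prime, labels):
    print(z_prime, "->", f"f(z) = {label}")
```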
21 KYOTO UNIVERSITY
Proposed method:
Minimize local loss and model complexity:

ξ(x) = argmin_{g ∈ G}  ℒ(f, g, π_x) + Ω(g)

n x : the instance being explained.
n f : the target model we want to interpret.
n g ∈ G : an interpretable model, such as a linear model or a decision tree.
n π_x(z) : a proximity measure between an instance z and x.
n Ω(g) : a measure of the complexity of the explanation g.
n ℒ : the locality-aware loss.
22 KYOTO UNIVERSITY
Proposed method:
Sparse Linear Explanations
n Let G be the class of linear models: g(z′) = w_g · z′
n The locality-aware loss ℒ is a weighted square loss:

ℒ(f, g, π_x) = Σ_{z, z′ ∈ Z} π_x(z) ( f(z) − g(z′) )²

p f(z) : the probability that z belongs to the class being explained.
p π_x(z) : a sample weight reflecting the distance between x and z.
n The sampling weight π_x(z) is defined as:

π_x(z) = exp( − D(x, z)² / σ² )
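Putting the pieces together, here is a minimal sketch of the sparse linear explanation (toy black box, a simple fraction-of-words-removed distance, and an arbitrary σ are assumptions; the paper's K-LASSO step is approximated by a small-alpha Lasso with sample weights).

```python
import numpy as np
from sklearn.linear_model import Lasso   # sample_weight needs a recent scikit-learn

rng = np.random.default_rng(0)
words = ["quick", "brown", "fox", "jumps", "over", "the", "lazy", "dog"]
d_prime = len(words)

def f(z_prime):
    """Toy black-box probability for the class being explained."""
    return 0.9 if z_prime[2] == 1 else 0.1     # depends mostly on "fox"

x_prime = np.ones(d_prime, dtype=int)
Z_prime = rng.integers(0, 2, size=(500, d_prime))
labels = np.array([f(z) for z in Z_prime])

# pi_x(z) = exp(-D(x, z)^2 / sigma^2), with D = fraction of words removed.
sigma = 0.75
D = np.mean(Z_prime != x_prime, axis=1)
weights = np.exp(-(D ** 2) / sigma ** 2)

# Weighted sparse linear fit: g(z') = w_g . z'
g = Lasso(alpha=0.01)
g.fit(Z_prime, labels, sample_weight=weights)
for word, w in sorted(zip(words, g.coef_), key=lambda t: -abs(t[1]))[:3]:
    print(f"{word}: {w:+.3f}")
```

As expected, "fox" receives by far the largest coefficient, i.e. the explainer recovers the word the toy black box actually relies on.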
23 KYOTO UNIVERSITY
Example :
Deep networks for images
n When we want to get explanations for image classifiers, we wish to
highlight the super-pixels with positive weight toward a specific class.
n Classifier : Google’s Inception neural networks
[Figure: the reason why an acoustic guitar was classified as “electric guitar”.]
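In practice this is what the authors' open-source lime package does for images; the sketch below shows its typical usage as I recall the API (check the current documentation), with a dummy predict_fn standing in for the Inception classifier, which is assumed and not loaded here.

```python
import numpy as np
from lime import lime_image
from skimage.segmentation import mark_boundaries

def predict_fn(images):
    """Placeholder for the real classifier: batch of images -> class probabilities."""
    return np.tile([0.7, 0.3], (len(images), 1))   # dummy 2-class output

image = np.random.rand(224, 224, 3)                # stand-in for a real photo

explainer = lime_image.LimeImageExplainer()
explanation = explainer.explain_instance(
    image, predict_fn, top_labels=2, hide_color=0, num_samples=100)

# Highlight super-pixels with positive weight toward the top predicted class.
temp, mask = explanation.get_image_and_mask(
    explanation.top_labels[0], positive_only=True, num_features=5, hide_rest=False)
overlay = mark_boundaries(temp, mask)
```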
24 KYOTO UNIVERSITY
Proposed method :
Explaining the model as a whole
n Explaining a single prediction is not enough to evaluate and assess trust in the
model as a whole.
They propose to give a global understanding of the model by explaining a set of
individual instances.
Selecting a good set of instances to explain the whole model.
n The pick problem:
p Given : a set of instances X.
p Output : B instances for users to inspect.
25 KYOTO UNIVERSITY
Proposed method : SP-LIME
Choose instances that cover as many important features as possible.
[Figure: a toy explanation matrix W (rows = instances, columns = features); the most important feature appears in many explanations.]
c(V, W, I) = Σ_j 1[∃ i ∈ V : W_ij > 0] · I_j
• c : coverage — the total importance of the features that appear in at
least one explanation of the picked set V.
• I_j : the global importance of feature j.
• The goal of the pick problem is to find a set of at most B
instances that maximizes coverage.
• Because coverage is submodular, a greedy algorithm gives a
constant-factor approximation.
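A minimal sketch of the greedy pick under toy assumptions (a random explanation matrix W, budget B = 3, and I_j defined as in the paper's text case):

```python
import numpy as np

rng = np.random.default_rng(0)
n_instances, n_features = 8, 5
# Toy explanation matrix: W[i, j] = weight of feature j in the explanation of instance i.
W = np.abs(rng.normal(size=(n_instances, n_features))) \
    * rng.integers(0, 2, size=(n_instances, n_features))

I = np.sqrt(np.abs(W).sum(axis=0))          # global importance of each feature

def coverage(V):
    """c(V, W, I): total importance of the features covered by the picked set V."""
    covered = (W[list(V)] > 0).any(axis=0) if V else np.zeros(n_features, bool)
    return float((covered * I).sum())

# Greedy maximization of the (submodular) coverage function.
B = 3
picked = []
for _ in range(B):
    gains = [(coverage(picked + [i]) - coverage(picked), i)
             for i in range(n_instances) if i not in picked]
    picked.append(max(gains)[1])
print("instances to show the user:", picked, "coverage:", coverage(picked))
```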
26 KYOTO UNIVERSITY
Experiments
30 KYOTO UNIVERSITY
Experiment : Evaluation with human subjects
setting
n They evaluate LIME and SP-LIME in the following settings:
1. Can users select the best classifier?
2. Can non-experts improve a classifier?
3. Do explanations lead to insights?
n Dataset :
l Training : “Christianity” and “Atheism” documents (including features that do not
generalize, such as authors and headers).
l Testing : 819 webpages about each class (real-world data).
n Human subjects are recruited on Amazon Mechanical Turk.
31 KYOTO UNIVERSITY
Experiment :
1. Can users select the best classifier?
n Evaluating whether explanations can help users decide which
classifier is better.
n Two classifiers :
l SVM trained on the original 20 newsgroups dataset
l SVM trained on a “cleaned” dataset. (authors and headers are removed)
n Users are asked to select which classifier will perform best in the real world.
n Explanations are produced by greedy or LIME.
n Instances are selected either by random or Submodular Pick.
32 KYOTO UNIVERSITY
Experiment :
1. Can users select the best classifier?
n Result : Submodular pick greatly improves users’ ability to pick the better
classifier, with LIME outperforming the greedy explainer in both cases.
33 KYOTO UNIVERSITY
Experiment :
2. Can non-experts improve a classifier?
n Explanations can aid feature engineering by presenting the important features
and letting users remove features that do not generalize.
n Users are asked to identify which words from the explanations should be removed
from subsequent training, for the worse classifier from the previous experiment.
n Explanations are produced by SP-LIME or RP-LIME.
[Figure: successive classifiers obtained from the original classifier as words are removed over rounds of interaction.]
34 KYOTO UNIVERSITY
Experiment :
2. Can non-experts improve a classifier?
n Result : The crowd workers are able to improve the model by removing features.
Further, SP-LIME outperforms RP-LIME.
35 KYOTO UNIVERSITY
Experiment :
3. Do explanations lead to insights?
n Task : distinguishing wolves from huskies.
n They train a logistic regression classifier on a training set of 20 images, in which all
pictures of wolves had snow in the background, while huskies did not. → this
classifier predicts “Wolf” if there is snow and “Husky” otherwise.
n The experiment proceeds as follows:
1. They present a balanced set of 10 test predictions without explanations, where
one wolf is not in a snowy background (and thus is predicted as “Husky”) and one husky
is in a snowy background (and is predicted as “Wolf”). The other 8 examples are
classified correctly.
2. They show the same images with explanations.
2. They show the same images with explanations.
36 KYOTO UNIVERSITY
Experiment :
3. Do explanations lead to insights?
They ask the subjects:
1. Do you trust this algorithm to work well in the real world?
2. Why?
3. How do you think the algorithm is able to distinguish between these photos of wolves and
huskies?
n Result : Explaining individual predictions is effective for getting insights into
classifiers.
[Figure: subjects who answered correctly vs. incorrectly, before and after seeing explanations.]
37 KYOTO UNIVERSITY
Conclusion
38 KYOTO UNIVERSITY
n They proposed LIME, an approach to faithfully explain the predictions
of any model in an interpretable way.
n They also introduced SP-LIME, a method to select representative
predictions that give a global view of the model.
n Their experiments show that the explanations are useful.
n Future work includes other explanation models, such as decision trees,
n and other domains, such as speech, video, medical data, and
recommendation systems.
Conclusion and future works:
They proposed a novel interpreting method.
