6. Types of Interpretable Methods
We can interpret a model before building it, while building it, or after
it is built.
Most interpretation methods for DNNs interpret the model after it is built.
9. Attention Mechanisms
Attention mechanisms guide deep neural networks to focus on
relevant input features, which makes it possible to interpret how the model
made certain predictions.
[Bahdanau et al. 15] Neural Machine Translation by Jointly Learning to Align and Translate, ICLR 2015
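As a sketch of why attention weights are interpretable, here is a minimal dot-product attention step in NumPy. This is a simplification of the additive attention in [Bahdanau et al. 15]; all shapes and values are illustrative, not from the paper.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention(query, keys, values):
    """Score each input position against the query, normalize the scores
    into weights, and return the weighted sum plus the weights themselves.
    The weights are what one inspects for interpretation."""
    scores = keys @ query        # one dot-product score per input position
    weights = softmax(scores)    # attention distribution over inputs
    context = weights @ values   # weighted combination of the inputs
    return context, weights

rng = np.random.default_rng(0)
keys = rng.normal(size=(5, 4))   # 5 input positions, feature dim 4
values = rng.normal(size=(5, 4))
query = rng.normal(size=4)
context, weights = attention(query, keys, values)
print(weights)                   # sums to 1; peaks mark "focused" inputs
```

Because the weights form a distribution over input positions, plotting them for a given prediction shows which inputs the model attended to.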
10. Limitation of Conventional Attention Mechanisms
Conventional attention models may allocate attention inaccurately since
they are trained in a weakly-supervised manner.
The problem becomes more prominent when a task has no one-to-one
mapping from inputs to the final predictions.
11. Limitation of Conventional Attention Mechanisms
This is because the conventional attention mechanisms do not consider
uncertainties in the model and the input, which often leads to
overconfident attention allocations.
Such unreliability may lead to incorrect predictions and/or interpretations
which can result in fatal consequences for safety-critical applications.
13. Uncertainty Aware Attention (UA)
Multi-class classification performance on the three health records datasets
14. Info-GAN
There are structures in the noise vectors that have meaningful and
consistent effects on the output of the generator.
However, there is no systematic way to find these structures. The only
thing affecting the generator output is the noise input, so we have no
idea how to modify the noise to generate the images we expect.
15. Info-GAN
The idea is to provide a latent code that has meaningful and consistent
effects on the output, i.e., a disentangled representation.
The hope is that if you keep the code the same and randomly change the
noise, you get variations of the same digit.
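A toy sketch of the idea, with a made-up linear "generator" standing in for a real trained network: the output depends on both an unstructured noise vector and a structured latent code, and fixing the code while resampling the noise yields variations that share the code-driven structure. Everything here (shapes, weights) is illustrative, not InfoGAN's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(1)

def generator(noise, code, W_n, W_c):
    """Toy stand-in for an InfoGAN generator: the output depends on both
    the unstructured noise and the structured latent code."""
    return np.tanh(noise @ W_n + code @ W_c)

W_n = rng.normal(size=(8, 16)) * 0.1   # noise pathway (weak effect here)
W_c = rng.normal(size=(2, 16))         # code pathway (strong, consistent effect)

code = np.array([1.0, 0.0])            # fix the code ("same digit")
samples = [generator(rng.normal(size=8), code, W_n, W_c) for _ in range(3)]
# All three outputs share the code-driven structure; the noise only adds variation.
```

InfoGAN makes this behavior emerge by maximizing the mutual information between the code and the generated output, so the code ends up controlling consistent factors of variation.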
18. Understanding Black-Box Predictions
Given a high-accuracy blackbox model and a prediction from it, can we
answer why the model made a certain prediction?
[Koh and Liang 17] tackle this question by tracing a model’s prediction through its learning algorithm
and back to the training data.
To formalize the impact of a training point on a prediction, they ask the counterfactual:
What would happen if we did not have this training point or if its value were slightly changed?
[Koh and Liang 17] Understanding Black-box Predictions via Influence Functions, ICML 2017
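This counterfactual can be checked directly on small models by brute-force retraining; influence functions approximate the same quantity without retraining. Below is a minimal leave-one-out sketch with toy data and a hand-rolled logistic regression (not the authors' code).

```python
import numpy as np

def train_logreg(X, y, steps=500, lr=0.5):
    """Plain gradient-descent logistic regression; small enough to retrain."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1 / (1 + np.exp(-X @ w))
        w -= lr * X.T @ (p - y) / len(y)
    return w

def predict(w, x):
    return 1 / (1 + np.exp(-x @ w))

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2))
y = (X[:, 0] + 0.3 * rng.normal(size=40) > 0).astype(float)
x_test = np.array([0.5, -0.2])

w_full = train_logreg(X, y)
base = predict(w_full, x_test)

# Leave-one-out: the brute-force counterfactual that influence
# functions approximate without retraining.
influences = []
for i in range(len(X)):
    w_i = train_logreg(np.delete(X, i, 0), np.delete(y, i))
    influences.append(base - predict(w_i, x_test))
most = int(np.argmax(np.abs(influences)))  # most influential training point
```

For deep networks, retraining once per training point is infeasible, which is exactly why the paper derives an influence-function approximation instead.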
19. Interpretable Mimic Learning
This framework is mainly based on knowledge distillation from Neural
Networks.
However, they use Gradient Boosting Trees (GBT) instead of another neural
network as the student model, since GBTs satisfy the requirements for
both learning capacity and interpretability.
[Che et al. 2016] Z. Che, S. Purushotham, R. Khemani, and Y. Liu. Interpretable Deep Models for ICU Outcome Prediction, AMIA 2016.
Knowledge distillation [G. Hinton et al. 15]
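A minimal sketch of the distillation setup: a made-up "teacher" function stands in for the trained deep model, and a depth-1 regression stump stands in for the GBT student (the actual work uses full gradient boosting; all names here are illustrative).

```python
import numpy as np

def teacher(X):
    """Stand-in for a trained deep model: returns soft probabilities."""
    return 1 / (1 + np.exp(-(2 * X[:, 0] - X[:, 1])))

def fit_stump(X, target):
    """Fit a depth-1 regression tree to the teacher's soft targets --
    the simplest possible tree student, kept tiny for illustration."""
    best = None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            left, right = target[X[:, j] <= t], target[X[:, j] > t]
            if len(left) == 0 or len(right) == 0:
                continue
            sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
            if best is None or sse < best[0]:
                best = (sse, j, t, left.mean(), right.mean())
    return best[1:]  # (feature, threshold, left value, right value)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
soft = teacher(X)                 # distillation target: soft labels, not hard ones
feat, thr, lval, rval = fit_stump(X, soft)
# The student is readable: "if x[feat] <= thr predict lval else rval".
```

The key move is training the student on the teacher's soft predictions rather than the original hard labels, so the student mimics the teacher's learned function.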
20. Interpretable Mimic Learning
The resulting simple model works even better than the best deep learning
model – perhaps due to suppression of overfitting.
21. Visualizing Convolutional Neural Networks
They propose a Deconvolutional Network (deconvnet) to map feature
activations back to pixel space, and provide a sensitivity analysis to point out
which regions of an image affect the decision-making process the most.
[Zeiler and Fergus 14] Visualizing and Understanding Convolutional Networks, ECCV 2014
22. Prediction difference analysis
The visualization method shows which pixels of a specific input image are
evidence for or against a prediction
[Zintgraf et al. 2017] Visualizing Deep Neural Network Decisions: Prediction Difference Analysis, ICLR 2017
Shown is the evidence for (red) and against (blue) the prediction.
We see that the facial features of the cockatoo are most supportive for the decision, and
parts of the body seem to constitute evidence against it.
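A sketch of the underlying idea: replace one feature with values drawn from other samples (approximately marginalizing it out) and record how the prediction shifts. The model and data below are made up; the paper works on image pixels with a conditional sampling scheme.

```python
import numpy as np

def model(x):
    """Stand-in classifier: probability driven mostly by features 0 and 1."""
    return 1 / (1 + np.exp(-(3 * x[0] + 2 * x[1] - 0.1 * x[2:].sum())))

rng = np.random.default_rng(0)
background = rng.normal(size=(100, 6))  # samples used to marginalize a feature
x = rng.normal(size=6)
base = model(x)

relevance = np.zeros(6)
for i in range(6):
    # Replace feature i with background values and average the prediction:
    # an approximation of marginalizing that feature out.
    preds = []
    for b in background:
        x_mod = x.copy()
        x_mod[i] = b[i]
        preds.append(model(x_mod))
    relevance[i] = base - np.mean(preds)  # > 0: evidence for; < 0: against
```

Positive relevance means knowing the true feature value raised the prediction (evidence for, red in the paper's figures); negative means it lowered it (evidence against, blue).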
24. Understanding Data Through Examples
[Kim et al. 16] propose to interpret the given data by providing examples
that can show the full picture – majorities + minorities
[Kim et al. 16] Examples are not Enough, Learn to Criticize! Criticism for Interpretability, NIPS 2016
47. Pilot study with human subjects
Definition of interpretability: A method is interpretable if a user can
correctly and efficiently predict the method’s results.
Task: Assign a new data point to one of the groups using 1) all images,
2) prototypes, 3) prototypes and criticisms, or 4) a small set of randomly
selected images
50. Take-home messages
• There are three types of interpretable methods, but most interpret the
model after it is built
• Criticism and prototypes are equally important and are a step towards
improving interpretability of complex data distributions
• MMD-critic learns prototypes + criticisms that highlight aspects of
data that are overlooked by prototypes.
51. Discussion
• If we have insight into a dataset, can we really build a better model?
Human intuition is biased and not reliable!
52. Gap in Interpretable ML research
• There is limited work explaining the operation of RNNs; most work targets
CNNs. The attention mechanism is not enough. Especially in multimodal
networks (CNN + RNN), this kind of research is even more necessary
As a result of the success of deep learning over the past decade, many models succeed and even surpass human performance on classification tasks. However, how deep learning models actually work remains unclear.
DL models are usually considered black boxes.
First and foremost, I would like to provide a bird's-eye view of XAI.
To deal with this, interpretations should be provided to explain the operation of DL models. However, interpretability is not a well-defined concept.
Generally speaking, interpretable methods are divided into three categories: before building the model, when building it, and after building the model. However, most interpretation methods for DNNs interpret the model after it is built.
First, when building a new model, we can use components that are interpretable by construction. An intuitive example is a sparse model, which is easy to understand. In addition, decision trees support human intuition, as we can see the decision made at each stage.
Another solution is to use an attention mechanism, since at each time step we can see how the model adjusts its focal point in the input.
The next category is interpretation after building a model, which covers almost all papers in this course.
In the paper Understanding Black-box Predictions via Influence Functions, Koh and Liang address the question of why the model made a certain prediction by tracing the model’s prediction through its learning algorithm and back to the training data. To formalize the impact of a training point on a prediction, they ask the counterfactual: what would happen if we did not have this training point, or if its value were slightly changed?
In the paper Visualizing and Understanding Convolutional Networks, the authors propose to visualize learned representations in convolutional neural networks using deconvolution and maximally activating images.
Another paper, which most of you know, Visualizing Deep Neural Network Decisions: Prediction Difference Analysis, highlights areas in a given input image that provide evidence for or against a certain class.
The paper I am going to present today falls into the remaining category, interpretation before building a model. It explores data analysis through examples.
Now I will introduce the paper: Examples are not Enough, Learn to Criticize! Criticism for Interpretability
The AI community has invented millions of different DL models, but essentially AI is data-driven: what we get is what we have. That means a model will behave according to the data we provide.
So, it would be nice if we knew what we have before building any model.
Imagine you are given a giant dataset that contains one billion data points. Before modeling, you want to get a sense of what the data looks like. Of course, you don't have time to look at all one billion images, so you might sample from the dataset.
A lot of images look like this
Another group shows that this kind of image is popular.
But the problem is that prototype images don't give you the full picture. There are also groups like this, and we need to look at them to get the full picture. Then the question is: which groups should we look at?
We want to look at important minorities. Others you can ignore.
Like this one, an animal lying on a keyboard. These groups are small but not ignorable.
Or this one. They are different from the prototypes we have seen so far.
So you finally want to come up with an algorithm that efficiently selects majorities and important minorities.
This paper is about an algorithm of that kind. The idea is to select not only prototypes but also important minorities. This helps humans get better insight into a complex, high-dimensional dataset.
Now, coming to the related work of this paper.
Humans tend to over-generalize, and this cartoon illustrates that. The algorithm in this paper helps us minimize over-generalization via prototypes + criticisms.
However, examples are not enough. Relying only on examples to explain a model’s behavior can lead to over-generalization and misunderstanding. Examples alone may be sufficient when the distribution of data points is ‘clean’ – in the sense that there exists a set of prototypical examples which sufficiently represent the data. However, this is rarely the case in real-world data. For instance, fitting models to complex datasets often requires the use of regularization.
Here, "fitting models to complex datasets often requires the use of regularization" means that the regularization added during training smooths over both prototypes and criticisms, so we cannot see the real distribution of the data.
There are a number of methods to select prototypes, but none of them focus on minorities. There are outlier detection methods that consider minorities, but they mostly focus on detecting abnormalities rather than representing the whole distribution.
Now, we will explore how MMD-critic works
So, technically speaking, this work will select prototypes drawn from the data distribution p, and criticisms from …
Here, to measure the distance between the distributions, the authors propose to use MMD.
MMD calculates the discrepancy between two distributions P and Q via a witness function. However, this function is intractable; as a result, we approximate it by sampling.
To put this function to use, the authors draw on Bayesian model criticism and two-sample tests.
Prototypes: minimize MMD, because the representatives should lie close to the data.
Criticisms: maximize the witness function, because the two distributions should be far apart there.
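A minimal sketch of those two steps on a toy two-cluster dataset, using an RBF kernel: prototypes are picked greedily to minimize the (squared) MMD between the data and the prototype set, and criticisms are the points where the absolute witness function is largest. This is a simplification of MMD-critic; function names and data are mine.

```python
import numpy as np

def rbf(A, B, gamma=0.5):
    """RBF kernel matrix between two sets of points."""
    d = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d)

def greedy_prototypes(X, m):
    """Greedily pick m points minimizing MMD^2 between the data
    distribution and the prototype set (up to a constant term)."""
    K = rbf(X, X)
    chosen = []
    for _ in range(m):
        best, best_val = None, -np.inf
        for j in range(len(X)):
            if j in chosen:
                continue
            S = chosen + [j]
            # val tracks -MMD^2: data-vs-prototype term minus prototype self-term
            val = 2 * K[S].mean(1).sum() / len(S) - K[np.ix_(S, S)].mean()
            if val > best_val:
                best, best_val = j, val
        chosen.append(best)
    return chosen

def witness(X, prototypes, xs):
    """witness(x) = mean_i k(x, x_i) - mean_j k(x, p_j); criticisms sit
    where |witness| is largest (under-represented by the prototypes)."""
    return rbf(xs, X).mean(1) - rbf(xs, X[prototypes]).mean(1)

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (50, 2)),   # majority cluster
               rng.normal(3, 0.3, (5, 2))])   # small minority cluster
protos = greedy_prototypes(X, 2)
w = witness(X, protos, X)
criticism = int(np.argmax(np.abs(w)))         # point worst explained by prototypes
```

With only two prototypes, both land in the majority cluster, so the witness function stays positive on the minority cluster: those points are under-represented, which is exactly what criticisms surface.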
Now jumping to experiments
This paper conducts three experiments, both qualitative and quantitative.
Competitive performance with PS, a classifier algorithm that uses nearest neighbors to classify (clustering).
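The nearest-prototype evaluation can be sketched as a 1-NN classifier over the selected prototypes (the prototype points and labels below are toy values, not from the paper):

```python
import numpy as np

def nearest_prototype_classify(x, prototypes, labels):
    """1-NN over selected prototypes: classify x by the label of the
    closest prototype; good prototypes yield good accuracy."""
    d = ((prototypes - x) ** 2).sum(1)
    return labels[int(np.argmin(d))]

proto_points = np.array([[0.0, 0.0], [5.0, 5.0]])
proto_labels = np.array([0, 1])
print(nearest_prototype_classify(np.array([0.5, -0.3]), proto_points, proto_labels))  # 0
```

The better the prototypes summarize each class, the higher this classifier's accuracy, which is why it serves as a quantitative proxy for prototype quality.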
We measure how well they did and how quickly they responded. Talking about speed first, people work fastest using prototypes (which makes sense, since the number of samples in the prototype set is the smallest)…
Conclusion: When criticism is given together with prototypes, a human pilot study suggests that humans are better able to perform a predictive task that requires the data distributions to be well explained. This suggests that criticism and prototypes are a step towards improving the interpretability of complex data distributions. (Group 3 performs best because they already know which images in group 2 are prototypes.) That prototypes + criticisms works best suggests that human intuition works best when the dataset is reduced to prototypes + criticisms => we can filter the data to keep only prototypes and criticisms; humans then have good insight => we may be able to build a better model.