2. Outline for Today’s Talk
1. Counterfactual Explanations and XAI
2. Counterfactual explanation methods for CNN-based classifiers
— Explaining Image Classifiers by Counterfactual Generation. ICLR 2019
— Counterfactual Visual Explanations. ICML 2019
— Generative Counterfactual Introspection for Explainable Deep Learning. arXiv
— Global Explanations of Convolutional Neural Networks With Concept Attribution. CVPR 2020
3. Future plan
3. Background
• XAI
Methods and techniques in AI such that the results of a solution can be understood by human experts; also the name of a program at DARPA (the U.S. Defense Advanced Research Projects Agency)
Example applications: face recognition, self-driving vehicles, medical image diagnosis
• Why it matters
XAI tools are crucial for high-impact, high-risk applications of deep learning
4. A Real Black-Box Problem I Have Met
Q: Why does this phenomenon occur?
5. XAI
[Diagram: input X → model F(X) (conv layers c1–c3, fully connected layers F1–F2) → predictions: "dog" 95.3%, "sheep" 2.1%, "cat" 1.2%, …]
1. What is the model looking at? → Perturbation approaches; counterfactual explanations
2. What does the model learn from X, and how? → Internal representations
3. How can the model's performance in extreme cases be improved? → Advanced training techniques
4. How can a wrong prediction be explained? → Momentarily missed detection in [1]
[1] Yusuke Hosoya, Masanori Suganuma, Takayuki Okatani. Analysis and a Solution of Momentarily Missed Detection for Anchor-based Object Detectors.
6. Counterfactual Explanations
• Counterfactual thinking
A concept from psychology: the human tendency to imagine possible alternatives to events that have already occurred, i.e., scenarios contrary to what actually happened
[Diagram: counterfactual variants of input X are fed through model F(X), changing its predictions ("dog" 95.3%, "sheep" 2.1%, "cat" 1.2%, …)]
7. Outline for Today's Talk
1. Counterfactual Explanations and XAI
2. Counterfactual explanation methods for CNN-based classifiers
— Explaining Image Classifiers by Counterfactual Generation. ICLR 2019
— Counterfactual Visual Explanations. ICML 2019
— Generative Counterfactual Introspection for Explainable Deep Learning. arXiv
— Global Explanations of Convolutional Neural Networks With Concept Attribution. CVPR 2020
3. Future plan
8. Explaining Image Classifiers by Counterfactual Generation
Motivation: Which parts of the image, if not seen by the classifier, would most affect its decision? (a saliency map)
Goal: find the regions that are important for a pre-trained model's classification of the image
Method: replace a proposed masked region (one that may be important) with content generated by a GAN
O.O.D. problem: images produced by previous in-filling methods are unnatural, whereas GAN-generated infill is more realistic and closer to the training data distribution
[Figure: predicted probability of "bird" under different in-filling methods]
[2] Chun-Hao Chang, Elliot Creager, Anna Goldenberg, David Duvenaud. Explaining Image Classifiers by Counterfactual Generation. ICLR 2019.
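To make the masking-and-infill idea concrete, here is a minimal sketch, not the authors' implementation: a soft deletion mask is optimized so that replacing the masked region with generative infill drives down the target-class probability while keeping the deleted area small. The helper name find_deletion_mask, the infill_fn hook, and all hyperparameters are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def find_deletion_mask(image, classifier, infill_fn, target_class,
                       n_steps=300, lr=0.05, area_weight=1e-3):
    # image: (1, 3, H, W); classifier maps images to logits;
    # infill_fn proposes plausible replacement content (e.g., GAN in-painting).
    mask_logits = torch.zeros(1, 1, *image.shape[-2:], requires_grad=True)
    opt = torch.optim.Adam([mask_logits], lr=lr)
    infill = infill_fn(image)                      # counterfactual content
    for _ in range(n_steps):
        m = torch.sigmoid(mask_logits)             # soft mask in [0, 1]
        composite = m * image + (1.0 - m) * infill
        prob = F.softmax(classifier(composite), dim=1)[0, target_class]
        # Drive the target-class probability down while deleting little area.
        loss = prob + area_weight * (1.0 - m).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return torch.sigmoid(mask_logits).detach()     # saliency-like deletion mask
```

The optimized mask then acts like a saliency map: the regions whose counterfactual replacement most reduces the class probability are the evidence the classifier relies on.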
9. • What exactly is explained about CNNs in paper [2]?
Explaining Image Classifiers by Counterfactual Generation
Which parts of the image, if not seen by the classifier, would most affect its decision? (a saliency map)
For example, AlexNet focuses more on the animal's body region, while VGG and ResNet focus more on head features.
[Figure: saliency maps for AlexNet, VGG, and ResNet]
10. Counterfactual Visual Explanations
• Research background: many papers focus on finding the important regions that push the model toward a prediction
• Motivation: how should the image I be different for the model to predict class c′ instead?
[3] Yash Goyal, Ziyan Wu, Jan Ernst, Dhruv Batra, Devi Parikh, Stefan Lee. Counterfactual Visual Explanations. ICML 2019.
If the red-box region on the left (I) were replaced by the red-box region on the right (I′), the model would make a different prediction, i.e., c → c′
• Two contributions claimed by the authors:
1. An approach for generating counterfactual visual explanations
2. Human studies showing that these explanations can help teach humans
11. Method
1. Decompose the CNN into a feature extractor f(I) and a decision network g(f(I)); take the feature map of shape h×w×d
2. Define a transformation that replaces spatial cells in the feature f(I) with cells from the distractor's feature f(I′)
3. Find the minimum replacement region (see the sketch after this list):
Solution 1. An exhaustive search over cell pairs
Solution 2. A continuous relaxation of the gate vector a and permutation matrix P that replaces search with optimization
Counterfactual Visual Explanations
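A minimal sketch of Solution 1 under simplifying assumptions (a single-cell replacement; the name best_single_cell_swap and its arguments are hypothetical): it exhaustively scores every pair of spatial cells and keeps the swap that most raises the target class.

```python
import torch

def best_single_cell_swap(feat_q, feat_d, decision_net, target_class):
    # feat_q, feat_d: feature maps of shape (d, h, w) from f(I) and f(I');
    # decision_net: the decision network g(.), returning class logits.
    d, h, w = feat_q.shape
    best = (None, None, float("-inf"))
    with torch.no_grad():
        fq = feat_q.reshape(d, h * w)
        fd = feat_d.reshape(d, h * w)
        for i in range(h * w):            # cell to overwrite in the query
            for j in range(h * w):        # cell to copy from the distractor
                f = fq.clone()
                f[:, i] = fd[:, j]
                logits = decision_net(f.reshape(1, d, h, w))
                score = logits[0, target_class].item()
                if score > best[2]:
                    best = (i, j, score)
    return best  # (query cell index, distractor cell index, target score)
```

In the paper this search is applied greedily, swapping one cell at a time until the prediction flips to c′; Solution 2 avoids the O((hw)²) search by relaxing the discrete choices a and P into continuous variables optimized with gradients.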
12. • Remaining problem with this method
The edited image can sometimes look very strange, far from a natural image
[Figure panels: query image, distractor image, edited image]
Counterfactual Visual Explanations
13. a) Results on MNIST with a simple CNN (2 conv + 2 FC layers); b) results on Omniglot with the same simple CNN; c) results on CUB with VGG-16
Counterfactual Visual Explanations
• What exactly is explained about CNNs in paper [3]?
If the highlighted region of the query image is replaced by the corresponding region from the distractor image, the model switches its prediction to the target class
14. • Concept attribution
Measures the importance of semantic concepts (e.g., texture, color, layout) to model predictions
• Global
Category-wide interpretations
[Figure: three-step pipeline: a) visualization; b) calculating user-defined concept importance]
Global Explanations of Convolutional Neural Networks With Concept Attribution
[4] Weibin Wu, Yu-Wing Tai. Global Explanations of Convolutional Neural Networks With Concept Attribution. CVPR 2020.
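As a concrete stand-in for user-defined concept importance, here is a minimal TCAV-style sketch (following Kim et al., 2018, a related concept-attribution method, not necessarily the exact procedure of [4]); all argument names are assumptions. A linear probe separates concept activations from random ones, and the score is the fraction of class examples whose gradient aligns with the concept direction.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def concept_importance(acts_concept, acts_random, grads_class):
    # acts_*: (n, d) activations at a chosen layer for concept / random images;
    # grads_class: (m, d) gradients of the class logit w.r.t. that layer,
    # taken over images of the class being explained.
    X = np.concatenate([acts_concept, acts_random])
    y = np.concatenate([np.ones(len(acts_concept)),
                        np.zeros(len(acts_random))])
    probe = LogisticRegression(max_iter=1000).fit(X, y)
    cav = probe.coef_[0] / np.linalg.norm(probe.coef_[0])   # concept vector
    sensitivities = grads_class @ cav   # directional derivatives along the CAV
    return float((sensitivities > 0).mean())  # share of aligned class examples
```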
15. Class Concepts Captured by VGG-16 and ResNet-50
[Figure: for the classes "chickadee" and "tarantula": example image; VGG-16 visualizations (result reported in [4] vs. result reproduced following [5]); ResNet-50 visualizations (result reported in [4] vs. result reproduced following [5])]
Reproduction settings: 1000 iterations; blur frequency 4; blur radius 1; weight decay 0.0001; clip value 0.1.
[5] Jason Yosinski, Hod Lipson. Understanding Neural Networks Through Deep Visualization.
Global Explanations of Convolutional Neural Networks With Concept Attribution
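The reproduction above relies on regularized activation maximization in the spirit of [5]: gradient ascent on a class logit with weight decay, periodic Gaussian blur, and clipping of weakly contributing pixels. A minimal PyTorch sketch follows; the exact update and clipping rules are simplified assumptions, with hyperparameters mirroring the settings listed above.

```python
import torch
from torchvision.transforms.functional import gaussian_blur

def activation_maximization(model, class_idx, size=224, iters=1000, lr=1.0,
                            weight_decay=1e-4, blur_freq=4, blur_radius=1,
                            clip_value=0.1):
    x = torch.randn(1, 3, size, size, requires_grad=True)
    for t in range(iters):
        model.zero_grad()
        score = model(x)[0, class_idx]   # logit of the class to visualize
        score.backward()
        with torch.no_grad():
            x += lr * x.grad             # gradient ascent on the class logit
            x.grad.zero_()
            x *= 1.0 - weight_decay      # weight decay pulls pixels toward zero
            if t % blur_freq == 0:       # periodic blur suppresses high frequencies
                x.copy_(gaussian_blur(x, kernel_size=2 * blur_radius + 1))
            # Simplified low-norm clipping: zero out weakly contributing pixels.
            norms = x.norm(dim=1, keepdim=True)
            x *= (norms > clip_value * norms.mean()).float()
    return x.detach()
```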
16. • What exactly is explained about CNNs in paper [4]?
CNNs may rely on concept information when predicting a specific class, e.g., texture when predicting "zebra"
[Figure panels: importance scores of different concepts; class concepts captured by different models]
Global Explanations of Convolutional Neural Networks With Concept Attribution
17. Outline for Today's Talk
1. Counterfactual Explanations and XAI
2. Counterfactual explanation methods for CNN-based classifiers
— Explaining Image Classifiers by Counterfactual Generation. ICLR 2019
— Counterfactual Visual Explanations. ICML 2019
— Generative Counterfactual Introspection for Explainable Deep Learning. arXiv
— Global Explanations of Convolutional Neural Networks With Concept Attribution. CVPR 2020
3. Future plan
18. Future Plan
• Expanding my knowledge of XAI methods
• Introducing XAI methods to other computer vision tasks
19. Thank you for your attention!
Any questions or comments are welcome.
[Photos: cherry blossoms at Honnmanji Temple (2020.3.24 08:05); warm afternoon at Kamogawa River (2020.3.8 16:58)]
20. Generative Counterfactual Introspection for Explainable Deep Learning
• Motivation: what meaningful change can be made to the input image in order to alter the prediction? (similar to the previous paper)
[6] Shusen Liu, Bhavya Kailkhura, Donald Loveland, Yong Han. Generative Counterfactual Introspection for Explainable Deep Learning. arXiv.
loss_{C,c′}(I(A′)) denotes the cross-entropy loss of classifier C predicting the generated image I(A′) as the target label c′, where I(A′) is the image generated from the modified attributes A′
Supplemental Materials
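A minimal sketch of the optimization this loss drives, assuming a differentiable generator I(·) over an attribute vector A; the function name, the L1 proximity term, and all hyperparameters are illustrative assumptions rather than the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def counterfactual_attributes(A, generator, classifier, target_class,
                              n_steps=200, lr=0.05, prox_weight=0.1):
    # A: original attribute vector; generator renders I(A'); classifier C
    # returns logits. prox_weight weights an L1 pull back toward A.
    A_prime = A.clone().detach().requires_grad_(True)
    opt = torch.optim.Adam([A_prime], lr=lr)
    target = torch.tensor([target_class])
    for _ in range(n_steps):
        img = generator(A_prime)                       # I(A')
        ce = F.cross_entropy(classifier(img), target)  # loss_{C,c'}(I(A'))
        loss = ce + prox_weight * (A_prime - A).abs().sum()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return A_prime.detach()
```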
21. MNIST: changes to an image of the digit 9 that alter its prediction; CelebA: changes to an image of a person that alter its prediction (e.g., toward the attribute "older")
• What exactly is explained about CNNs in paper [6]?
A meaningful change can be made to the input image to alter its prediction to the target class
Supplemental Materials
Generative Counterfactual Introspection for Explainable Deep Learning