Explaining Machine Learning Models
Ankur Taly, Fiddler Labs
ankur@fiddler.ai
Joint work with Mukund Sundararajan¹, Qiqi Yan¹, Kedar Dhamdhere¹, and Pramod Mudrakarta², and colleagues at Fiddler Labs
¹Google, ²University of Chicago
Machine Learning Models are Everywhere!
[Diagram: a deep network with a highlighted neuron, mapping an input (image, sentence, game position, etc.) to an output (label, sentence, next word, next move, etc.)]
Problem: Machine Learning Model is a Black Box
[Diagram: the same input-to-output pipeline, but the model internals are replaced by a “?”]
Top label: “fireboat”
Why did the model label this image as “fireboat”?
Top label: “clog”
Why did the model label this image as “clog”?
Beautiful design and
execution, both the space itself
and the food. Many interesting
options beyond the excellent
traditional Italian meatball.
Great for the Financial District.
Why did the network predict
positive sentiment for this
review?
Example: Credit Lending in a black-box ML world
Fair lending laws [ECOA, FCRA] require credit decisions to be explainable.
[Diagram: a customer requests a credit line increase; the bank’s credit lending model returns Credit Lending Score = 0.3 and the request is denied. The customer queries the AI system: Why? Why not? How?]
Black-Box Models create Confusion & Doubt
● Business Owner: Can I trust our AI decisions?
● Regulators: Are these AI system decisions fair?
● Customer Support: How do I answer this customer complaint?
● IT & Operations: How do I monitor and debug this model?
● Data Scientists: Is this the best model that can be built?
● End User (after a poor decision): Why am I getting this decision? How can I get a better decision?
So, how do you explain a model?
The Attribution Problem
Attribute a model’s prediction on an input to features of the input
Examples:
● Attribute a lending model’s prediction to its features
● Attribute an object recognition network’s prediction to its pixels
● Attribute a text sentiment network’s prediction to individual words
A reductive formulation of “why this prediction”, but surprisingly useful :-)
Applications of Attributions
● Debugging network predictions
● Generating an explanation for the end-user
● Analyzing network robustness
● Assessing prediction confidence
Agenda
● Two Attribution Methods
○ Integrated Gradients
○ Shapley Values
● Applications of attributions
● Discussion
Naive approaches
● Ablations: Drop each feature and note the change in prediction
○ Computationally expensive, Unrealistic inputs, Misleading when features interact
● Feature*Gradient: Attribution for feature xᵢ is xᵢ * 𝜕y/𝜕xᵢ
○ Gradients in the vicinity of the input seem like noise (a minimal sketch follows)
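To make the Feature*Gradient formula concrete, here is a minimal numpy sketch (illustrative only; `grad_fn` is a hypothetical gradient oracle that an autodiff framework would normally supply):

```python
import numpy as np

def feature_times_gradient(x, grad_fn):
    # Feature*Gradient attribution: a_i = x_i * dy/dx_i.
    # grad_fn(x) is assumed to return dF/dx at x (e.g., from autodiff).
    return x * grad_fn(x)

# Toy model: F(x) = 3*x0 + x1**2, so dF/dx = [3, 2*x1].
grad_fn = lambda x: np.array([3.0, 2.0 * x[1]])
x = np.array([1.0, 2.0])
print(feature_times_gradient(x, grad_fn))  # -> [3. 8.]
```

For a deep network, this single-point gradient is exactly what saturates near the input, which is the noise problem noted above.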
[Plot: model score vs. pixel intensity, from a black baseline (intensity 0.0) through scaled inputs up to the actual input (1.0), with the gradients of the scaled inputs shown. Gradients are interesting at low intensities but become uninteresting (saturate) near the input.]
Our Method: Integrated Gradients [ICML 2017]
IG(input, base) ::= (input - base) * ∫₀¹ ∇F(𝛂*input + (1-𝛂)*base) d𝛂
[Original image alongside its Integrated Gradients attribution map]
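As an illustration of the formula, here is a minimal numpy sketch that approximates the path integral with a midpoint Riemann sum (hypothetical names; `grad_fn` stands in for the framework-provided gradient of F):

```python
import numpy as np

def integrated_gradients(x, baseline, grad_fn, steps=50):
    # IG(x, b) = (x - b) * integral_0^1 grad F(b + a*(x - b)) da,
    # approximated by averaging gradients at `steps` midpoints.
    alphas = (np.arange(steps) + 0.5) / steps
    total = np.zeros_like(x)
    for a in alphas:
        total += grad_fn(baseline + a * (x - baseline))
    return (x - baseline) * total / steps

# Toy model with interacting features: F(x) = x0 * x1.
grad_fn = lambda z: np.array([z[1], z[0]])
x, baseline = np.array([3.0, 2.0]), np.zeros(2)
attr = integrated_gradients(x, baseline, grad_fn)
print(attr, attr.sum())  # [3. 3.], sum = 6.0 = F(x) - F(baseline)
```

In practice the gradient calls are batched, and the sum of attributions converging to F(input) - F(baseline) serves as a handy convergence check on the step count.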
What is a baseline?
● A baseline is an informationless input that results in a neutral prediction
○ E.g., Black image for image models
○ E.g., Empty text or zero embedding vector for text models
● Integrated Gradients explains the diff F(input) - F(baseline)
● Baselines (or Norms) are essential to all explanations [Kahneman-Miller 86]
○ E.g., A man suffers from indigestion. The doctor blames it on a stomach ulcer. His wife blames it on eating turnips. Both are correct relative to their baselines
○ The baseline may also be an important analysis knob
Many more Inception+ImageNet examples here
Evaluating an Attribution Method
● Ablate top attributed features and examine the change in prediction
○ Issue: May introduce artifacts in the input (e.g., the square below)
● Compare attributions to (human provided) groundtruth on “feature importance”
○ Issue 1: Attributions may appear incorrect because the network reasons differently
○ Issue 2: Confirmation bias
The mandate for attributions is to be faithful to the network’s reasoning
Our Approach: Axiomatic Justification
● List desirable criteria (axioms) for an attribution method
● Establish a uniqueness result: X is the only method that satisfies these criteria
Axioms
● Insensitivity: A variable that has no effect on the output gets no attribution
● Sensitivity: If baseline and input differ in a single variable, and have different outputs, then that variable should receive some attribution
● Linearity preservation: Attributions(ɑ*F1 + ß*F2) = ɑ*Attributions(F1) + ß*Attributions(F2)
● Implementation invariance: Two networks that compute identical functions for all inputs get identical attributions
● Completeness: Sum(attributions) = F(input) - F(baseline) (see the worked check below)
● Symmetry: Symmetric variables with identical values get equal attributions
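As a quick worked example of the Completeness axiom: for a linear model F(x) = w·x + c, Integrated Gradients has the closed form IG_i = (x_i - b_i)·w_i, so the attributions sum exactly to F(input) - F(baseline). A minimal check under that assumption:

```python
import numpy as np

# Linear model F(x) = w.x + c; IG has the closed form (x - b) * w.
w, c = np.array([0.5, -2.0, 1.5]), 0.1
F = lambda x: w @ x + c

x, baseline = np.array([1.0, 2.0, 3.0]), np.zeros(3)
attributions = (x - baseline) * w

# Completeness: attributions sum to F(input) - F(baseline).
assert np.isclose(attributions.sum(), F(x) - F(baseline))
print(attributions, attributions.sum())  # [0.5 -4.  4.5], 1.0
```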
Result
Theorem [ICML 2017]: Integrated Gradients is the unique path-integral method satisfying: Sensitivity, Insensitivity, Linearity preservation, Implementation invariance, Completeness, and Symmetry
Historical note:
● Integrated Gradients is the Aumann-Shapley method from cooperative game theory, which has a similar characterization; see [Friedman 2004]
Debugging network behavior
Why is this image labeled as “clog”?
[Original image (top label “clog”) alongside its Integrated Gradients attribution for the label “clog”]
Detecting an architecture bug
● Deep network [Kearns, 2016] predicts if a molecule binds to a certain DNA site
● Finding: Some atoms had identical attributions despite different connectivity
● Bug: The architecture had a bug due to which the convolved bond features did not affect the prediction!
Detecting a data issue
● Deep network predicts various diseases from chest x-rays
● Finding: Attributions fell on radiologist’s markings (rather than the pathology)
[Original x-ray image alongside integrated gradients for the top label]
Generating explanations for end users
Prediction: “proliferative” DR¹
● Proliferative implies vision-threatening
Can we provide an explanation to the doctor with supporting evidence for “proliferative” DR?
[Retinal Fundus Image]
¹Diabetic Retinopathy (DR) is a diabetes complication that affects the eye. Deep networks can predict DR grade from retinal fundus images with high accuracy (AUC ~0.97) [JAMA, 2016].
[Retinal Fundus Image alongside Integrated Gradients for the label “proliferative”. Visualization: overlay heatmap on green channel. Callouts mark lesions and neovascularization.]
Neovascularization:
● Hard to see on original image
● Known to be vision-threatening
Efficacy of Explanations
Explanations help when:
● Model is right, and explanation convinces the doctor
● Model is wrong, and explanation reveals the flaw in the model’s reasoning
But, explanations can also hurt when:
● Model is right, but explanation is unintelligible
● Model is wrong, but the explanation convinces the doctor
Be careful about long-term effects too!
Humans and Automation: Use, Misuse, Disuse, Abuse - Parasuraman and Riley, 1997
Assisted-read study
9 doctors grade 2000 images under three different conditions
A. Image only
B. Image + Model’s prediction scores
C. Image + Model’s prediction scores + Explanation (Integrated Gradients)
Some findings:
● Seeing prediction scores (B) significantly increases accuracy vs. image only (A)
● Showing explanations (C) only provides slight additional improvement
○ Masks help more when model certainty is low
● Both B and C increase doctor ↔ model agreement
Paper: Using a deep learning algorithm and integrated gradients explanation to assist
grading for diabetic retinopathy --- Journal of Ophthalmology [2018]
Analyzing Model Robustness
Three question-answering tasks:
● Tabular QA: Neural Programmer (2017) model. 33.5% accuracy on WikiTableQuestions.
Q: How many medals did India win? A: 197
● Visual QA: Kazemi and Elqursh (2017) model. 61.1% on VQA 1.0 dataset (state of the art: 66.7%).
Q: How symmetrical are the white bricks on either side of the building? A: very
● Reading Comprehension: Yu et al. (2018) model. 84.6 F-1 score on SQuAD (state of the art).
Passage: “Peyton Manning became the first quarterback ever to lead two different teams to multiple Super Bowls. He is also the oldest quarterback ever to play in a Super Bowl at age 39. The past record was held by John Elway, who led the Broncos to victory in Super Bowl XXXIII at age 38 and is currently Denver’s Executive Vice President of Football Operations and General Manager.”
Q: Name of the quarterback who was 38 in Super Bowl XXXIII? A: John Elway
Robustness question: Do these networks read the question carefully? :-)
Visual QA
Kazemi and Elqursh (2017) model. Accuracy: 61.1% (state of the art: 66.7%)
Q: How symmetrical are the white bricks on either side of the building? A: very
Q: How asymmetrical are the white bricks on either side of the building? A: very
Q: How big are the white bricks on either side of the building? A: very
Q: How fast are the bricks speaking on either side of the building? A: very
Test/dev accuracy does not show us the entire picture. Need to look inside!
Analysis procedure
● Attribute the answer (or answer selection logic) to question words
○ Baseline: Empty question, but full context (image, text, paragraph)
■ By design, attribution will not fall on the context
● Visualize attributions per example
● Aggregate attributions across examples
Visual QA attributions
Q: How symmetrical are the white bricks on either side of the building? A: very
[Question rendered with per-word attribution colors: red = high attribution, blue = negative attribution, gray = near-zero attribution]
Visual QA Over-stability
During inference, drop all words from the dataset except ones which are frequently top attributions
● E.g., How many red buses are in the picture?
Top tokens: color, many, what, is, how, there, …
[Plot: accuracy vs. number of words retained. 80% of final accuracy is reached with just 100 words (original vocab size: 5305); 50% of final accuracy with just one word, “color”.]
Attack: Subject ablation
Replace the subject of a question with a low-attribution noun from the vocabulary
● This ought to change the answer but often does not!
Low-attribution nouns: 'tweet', 'childhood', 'copyrights', 'mornings', 'disorder', 'importance', 'topless', 'critter', 'jumper', 'fits'
What is the man doing? → What is the tweet doing?
How many children are there? → How many tweet are there?
The VQA model’s response remains the same 75.6% of the time on questions that it originally answered correctly
Many other attacks!
● Visual QA
○ Prefix concatenation attack (accuracy drop: 61.1% to 19%)
○ Stop word deletion attack (accuracy drop: 61.1% to 52%)
● Tabular QA
○ Prefix concatenation attack (accuracy drop: 33.5% to 11.4%)
○ Stop word deletion attack (accuracy drop: 33.5% to 28.5%)
○ Table row reordering attack (accuracy drop: 33.5% to 23%)
● Paragraph QA
○ Improved paragraph concatenation attacks of Jia and Liang from [EMNLP 2017]
Paper: Did the model understand the question? [ACL 2018]
Summary
Integrated Gradients is a technique for attributing a deep network’s (or any differentiable model’s) prediction to its input features. It is very easy to apply, widely applicable, and backed by an axiomatic theory.
References:
● Axiomatic Attribution for Deep Networks [ICML 2017]
● Did the model understand the question? [ACL 2018]
● Using a deep learning algorithm and integrated gradients explanation to assist grading for diabetic retinopathy [Journal of Ophthalmology, 2018]
● Using Attribution to Decode Dataset Bias in Neural Network Models for Chemistry [PNAS, 2019]
● Exploring Principled Visualizations for Deep Network Attributions [ExSS Workshop, 2019]
But, what about non-differentiable models?
● Decision trees
● Boosted trees
● Random forests
● etc.
Shapley Value [Annals of Mathematics Studies, 1953]
Classic result in game theory on distributing gain in a coalition game
● Coalition Games
○ Players collaborating to generate some gain (think: revenue)
○ Set function v(S) determining the gain for any subset S of players
● Shapley Values are a fair way to attribute the total gain to the players based on their contributions
○ Concept: Marginal contribution of a player to a subset of other players: v(S ∪ {i}) - v(S)
○ Shapley value for a player is a specific weighted aggregation of its marginal contributions over all possible subsets of other players (see the sketch below)
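For concreteness, a brute-force sketch of that weighted aggregation, φᵢ = Σ_{S ⊆ N\{i}} |S|!(n-|S|-1)!/n! * (v(S ∪ {i}) - v(S)), over all subsets (illustrative names; exponential in the number of players, so only viable for tiny games):

```python
import math
from itertools import combinations

def shapley_values(players, v):
    # v is assumed to map a frozenset of players to the gain of that coalition.
    n = len(players)
    phi = {}
    for i in players:
        others = [p for p in players if p != i]
        total = 0.0
        for k in range(n):
            for S in combinations(others, k):
                S = frozenset(S)
                weight = (math.factorial(k) * math.factorial(n - k - 1)
                          / math.factorial(n))
                total += weight * (v(S | {i}) - v(S))
        phi[i] = total
    return phi

# Toy game: the gain of 10 is produced only if both players cooperate.
v = lambda S: 10.0 if S == {"a", "b"} else 0.0
print(shapley_values(["a", "b"], v))  # {'a': 5.0, 'b': 5.0}
```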
Shapley Value Justification
Shapley values are unique under four simple axioms:
● Dummy: If a player never contributes to the game then it must receive zero attribution
● Efficiency: Attributions must add to the total gain
● Symmetry: Symmetric players must receive equal attribution
● Linearity: Attribution for the (weighted) sum of two games must be the same as the (weighted) sum of the attributions for each of the games
Shapley Values for Explaining ML models
● We define a coalition game for each model input X
○ Players are the features in the input
○ Gain is the model prediction (output), i.e., gain = F(X)
● Feature attributions are the Shapley values of this game
Challenge: What does it mean for only some players (features) to be present?
● That is, how do we define F(x_1, <absent>, x_3, …, <absent>)?
Modeling Feature Absence
Key Idea: Take the expected prediction when the (absent) feature is sampled from a certain distribution.
Different approaches choose different distributions (a sketch of the marginal-distribution choice follows this list):
● [SHAP, NIPS 2017] Use conditional distribution w.r.t. the present features
● [QII, S&P 2016] Use marginal distribution
● [Strumbelj et al., JMLR 2009] Use uniform distribution
● [Integrated Gradients, ICML 2017] Use a specific baseline point
Coming Soon: Paper investigating the best way(s) to handle feature absence
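A minimal sketch of the marginal-distribution choice (the [QII, S&P 2016] flavor): absent features are resampled from a background dataset, and v(S) is the expected prediction with the features in S pinned to their input values. All names here are illustrative:

```python
import numpy as np

def make_value_fn(model, x, background):
    # v(S): expected prediction when features in S take their input
    # values and the absent features are sampled from `background`.
    def v(S):
        samples = background.copy()
        for i in S:
            samples[:, i] = x[i]
        return model(samples).mean()
    return v

# Toy setup: a vectorized model over rows of a feature matrix.
model = lambda X: X[:, 0] + 2.0 * X[:, 1]
background = np.random.default_rng(0).normal(size=(100, 2))
x = np.array([1.0, 3.0])
v = make_value_fn(model, x, background)
print(v(frozenset()), v({0, 1}))  # baseline expectation vs. F(x) = 7.0
```

The conditional-distribution variant differs only in how `samples` are drawn: conditioned on the present features rather than from the unconditional marginals.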
Computing Shapley Values
Exact Shapley value computation is exponential in the number of features
● Several approximations have been studied in the past (e.g., permutation sampling, sketched below)
● Fiddler offers an efficient, distributed computation of approx. Shapley values
Our strategy
● For non-differentiable models with structured inputs, use Shapley values
● For differentiable models, use Integrated Gradients
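One standard approximation (not necessarily Fiddler's, whose implementation isn't described here) is permutation sampling: average each feature's marginal contribution over random orderings. A minimal sketch:

```python
import numpy as np

def shapley_monte_carlo(n_features, v, n_permutations=200, seed=0):
    # v is assumed to map a frozenset of feature indices to a gain.
    # Each random ordering contributes one marginal per feature.
    rng = np.random.default_rng(seed)
    phi = np.zeros(n_features)
    for _ in range(n_permutations):
        S, prev = frozenset(), v(frozenset())
        for i in rng.permutation(n_features):
            S = S | {i}
            cur = v(S)
            phi[i] += cur - prev
            prev = cur
    return phi / n_permutations

# Toy game: the gain appears only when both features are present.
v = lambda S: 10.0 if S == {0, 1} else 0.0
print(shapley_monte_carlo(2, v))  # approx [5., 5.]
```

The estimate is unbiased and its error shrinks as n_permutations grows, which is what makes sampling practical for models with many features.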
Some limitations and caveats
Attributions are pretty shallow
Attributions do not explain:
● How the network combines the features to produce the answer
● What training data influenced the prediction
● Why gradient descent converged
● etc.
An instance where attributions are useless:
● A network that predicts TRUE when there is an even number of black pixels and FALSE otherwise
Attributions are useful when the network’s behavior entails that a strict subset of input features is important
Attributions are for human consumption
● Humans interpret attributions and generate insights
○ Doctor maps attributions for diabetic retinopathy to pathologies like microaneurysms, hemorrhages, etc.
● Visualization matters as much as the attribution technique
○ Attributions have a large range and a long tail across pixels, so naive scaling of attributions from 0 to 255 washes out detail; clipping attributions at 99% reduces the range (sketch below)
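A minimal sketch of that clipping step, assuming per-pixel attribution magnitudes (the 99th-percentile cutoff follows the slide; everything else is illustrative):

```python
import numpy as np

def attributions_to_grayscale(attr, clip_percentile=99):
    # Clip the long tail, then scale to [0, 255] for display.
    # Naive max-scaling lets a few extreme pixels wash out the rest.
    mag = np.abs(attr)
    hi = np.percentile(mag, clip_percentile)
    mag = np.clip(mag, 0.0, hi) / (hi + 1e-12)
    return (mag * 255).astype(np.uint8)

# Toy: 100 ordinary pixels plus one extreme one.
rng = np.random.default_rng(1)
attr = rng.uniform(0.0, 1.0, size=100)
attr[0] = 100.0
print(attributions_to_grayscale(attr)[:8])  # ordinary pixels stay visible
```

Under naive max-scaling every ordinary pixel here would land near 0 out of 255; clipping restores their contrast.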
Principles of Visualization for Image Attributions
Visualizations must satisfy:
● Commensurateness: Feature brightness must be proportional to attributions
● Coverage: Large fraction of the important features must be visible
● Correspondence: Allow correspondence between attributions and raw image
● Coherence: Visualizations must be optimized for human perception
Paper: Exploring Principled Visualizations for Deep Network Attributions [ExSS 2019]
Thank you!
Summary: Integrated Gradients is a technique for attributing a deep network’s prediction to its input features. It is very easy to apply, widely applicable, and backed by an axiomatic theory.
My email: ataly@google.com
References:
● Axiomatic Attribution for Deep Networks [ICML 2017]
● Did the model understand the question? [ACL 2018]
● Using a deep learning algorithm and integrated gradients explanation to assist grading for diabetic retinopathy [Journal of Ophthalmology, 2018]
● Exploring Principled Visualizations for Deep Network Attributions [ExSS Workshop, 2019]
● Using Attribution to Decode Dataset Bias in Neural Network Models for Chemistry [PNAS, 2019]
