Pang Wei Koh
Stanford University,
Stanford, CA.
Percy Liang
Stanford University,
Stanford, CA.
Presented by,
Zabir Al Nazi
Roll: 1409016
Department of Electronics and Communication Engineering,
Khulna University of Engineering and Technology,
Khulna-9203, Bangladesh.
Mentored by,
Tasnim Azad Abir
Lecturer,
Department of Electronics and Communication Engineering,
Khulna University of Engineering and Technology,
Khulna-9203, Bangladesh.
Proceedings of the 34th International Conference on Machine Learning
Agenda
• Introduction
• Objectives/Research Questions
• Methodology
• Scaling Up
• Results
• Conclusion and Future Work
• Acknowledgements
Introduction (1/2)
Figure 1. ImageNet Large Scale Visual Recognition Challenge top-5 error (%), with human performance shown for reference
[Diagram: ImageNet dataset + Tesla K80 GPU]
Introduction (2/2)
Input → Output:
• Benign (78%)
• Acoustic Neuroma (3%)
• Chordoma (11%)
• Meningioma (8%)
Figure 2. Black-box prediction for brain cancer classification
Objectives/Research Questions
Given a high-accuracy black-box model and a prediction:
• Can we explain why the model made this prediction?
[Diagram: Training Dataset, Input, Prediction (?)]
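Methodology (1/3)
Training points $z_i \in Z$ such that $z_i = (x_i, y_i)$
Pick $\hat{\theta}$ to minimize $\frac{1}{n} \sum_{i=1}^{n} L(z_i, \theta)$
[Diagram: training data $z_1, z_2, z_3, \ldots$ (Dog, Fish, Dog) → $\hat{\theta}$ → prediction: Dog 79%]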
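As a minimal sketch of this step, the snippet below fits a one-parameter logistic regression by minimizing the average training loss. The toy data, the choice of logistic loss, and the names `TRAIN`, `fit`, and `empirical_risk` are illustrative assumptions; none of them come from the paper or the slides.

```python
import math

# Hypothetical toy training set z_i = (x_i, y_i): one feature, labels +1 / -1.
# The data are made up and deliberately not separable, so a finite minimizer exists.
TRAIN = [(-2.0, -1), (-1.0, -1), (-0.5, 1), (0.5, -1), (1.0, 1), (2.0, 1)]

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def loss(z, theta):
    """Logistic loss L(z, theta) = log(1 + exp(-y * theta * x))."""
    x, y = z
    return math.log1p(math.exp(-y * theta * x))

def grad(z, theta):
    """First derivative of L(z, theta) with respect to theta."""
    x, y = z
    return -y * x * sigmoid(-y * theta * x)

def hess(z, theta):
    """Second derivative of L(z, theta) with respect to theta."""
    x, y = z
    p = sigmoid(y * theta * x)
    return x * x * p * (1.0 - p)

def empirical_risk(theta, data):
    """(1/n) * sum_i L(z_i, theta)."""
    return sum(loss(z, theta) for z in data) / len(data)

def fit(data, steps=50):
    """Newton's method on the empirical risk; returns theta_hat."""
    theta = 0.0
    for _ in range(steps):
        g = sum(grad(z, theta) for z in data) / len(data)
        h = sum(hess(z, theta) for z in data) / len(data)
        theta -= g / h
    return theta

theta_hat = fit(TRAIN)
print("theta_hat:", theta_hat)
print("empirical risk at theta_hat:", empirical_risk(theta_hat, TRAIN))
```

A single scalar parameter keeps the Hessian a plain number, which makes the later influence computation easy to follow by hand.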
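Methodology (2/3)
For a training point $z$
Pick $\hat{\theta}_{\epsilon, z}$ to minimize $\frac{1}{n} \sum_{i=1}^{n} L(z_i, \theta) + \epsilon L(z, \theta)$
[Diagram: training data $z_1, z_2, z_3, \ldots$ (Dog, Fish, Dog) with $z$ upweighted → $\hat{\theta}_{\epsilon, z}$ → prediction: Dog 83%]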
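One special case makes the upweighted objective concrete: choosing $\epsilon = -1/n$ cancels the term that $z$ contributes to the average, so the objective reduces to training without $z$. The worked step below assumes $z$ is one of the training points $z_1, \ldots, z_n$; the symbol $\hat{\theta}_{-z}$ is notation introduced here for the parameters fitted without $z$.

```latex
\frac{1}{n}\sum_{i=1}^{n} L(z_i, \theta) - \frac{1}{n} L(z, \theta)
  = \frac{1}{n}\sum_{z_i \neq z} L(z_i, \theta)
\quad\Longrightarrow\quad
\hat{\theta}_{-1/n,\, z}
  = \arg\min_{\theta} \frac{1}{n}\sum_{z_i \neq z} L(z_i, \theta)
  = \hat{\theta}_{-z}
```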
Methodology (3/3)
• Influence
• What is $L(z_{test}, \hat{\theta}_{\epsilon, z}) - L(z_{test}, \hat{\theta})$?
• How much does the prediction change when a single point is removed from the training data?
• But retraining for each $z$, $\epsilon$ is costly.
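To illustrate why retraining is costly, here is a brute-force sketch that refits the model once per training point and records how the test loss changes. It assumes a one-parameter logistic regression on made-up data; `TRAIN`, `Z_TEST`, and `fit_upweighted` are hypothetical names, and removal is expressed as upweighting with eps = -1/n.

```python
import math

# Hypothetical toy data (x, y) with labels +1 / -1; not taken from the paper.
TRAIN = [(-2.0, -1), (-1.0, -1), (-0.5, 1), (0.5, -1), (1.0, 1), (2.0, 1)]
Z_TEST = (1.5, 1)

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def loss(z, theta):
    x, y = z
    return math.log1p(math.exp(-y * theta * x))

def grad(z, theta):
    x, y = z
    return -y * x * sigmoid(-y * theta * x)

def hess(z, theta):
    x, y = z
    p = sigmoid(y * theta * x)
    return x * x * p * (1.0 - p)

def fit_upweighted(data, z=None, eps=0.0, steps=50):
    """Newton's method on (1/n) * sum_i L(z_i, theta) + eps * L(z, theta)."""
    n = len(data)
    theta = 0.0
    for _ in range(steps):
        g = sum(grad(zi, theta) for zi in data) / n
        h = sum(hess(zi, theta) for zi in data) / n
        if z is not None:
            g += eps * grad(z, theta)
            h += eps * hess(z, theta)
        theta -= g / h
    return theta

theta_hat = fit_upweighted(TRAIN)
n = len(TRAIN)

# One full retraining per training point: eps = -1/n removes z from the objective.
loss_changes = []
for z in TRAIN:
    theta_without_z = fit_upweighted(TRAIN, z=z, eps=-1.0 / n)
    loss_changes.append(loss(Z_TEST, theta_without_z) - loss(Z_TEST, theta_hat))

for z, delta in zip(TRAIN, loss_changes):
    print(f"remove {z}: test loss changes by {delta:+.4f}")
```

Even in this tiny setting the loop performs n complete optimizations, which is the cost the influence-function approximation is meant to avoid.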
Scaling Up (1/3)
• Influence Function
• Robust Statistics, 1970
• Consider an estimator T that acts on a distribution F
• How much does T change if we perturb F?
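In robust statistics this question is usually formalized as a derivative with respect to a small contamination of F. The block below states that standard definition for reference, with $\delta_x$ denoting a point mass at $x$ (notation not used on the slide), followed by a worked example in which T is the mean.

```latex
\mathrm{IF}(x; T, F) = \lim_{\epsilon \to 0}
  \frac{T\big((1-\epsilon)F + \epsilon\,\delta_x\big) - T(F)}{\epsilon}

% Worked example: T(F) = E_F[X] = \mu (the mean)
T\big((1-\epsilon)F + \epsilon\,\delta_x\big) = (1-\epsilon)\mu + \epsilon x
\quad\Longrightarrow\quad
\mathrm{IF}(x; T, F) = x - \mu
```

The mean's influence grows without bound in $x$, which is the usual way of saying that the mean is not robust to outliers.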
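Scaling Up (2/3)
• $\hat{\theta}_{\epsilon, z} = \arg\min_{\theta} \frac{1}{n} \sum_{i=1}^{n} L(z_i, \theta) + \epsilon L(z, \theta)$
• The influence of upweighting $z$ on the loss at a test point $z_{test}$ is given by:
• $I_{up,loss}(z, z_{test}) = \left. \frac{\partial L(z_{test}, \hat{\theta}_{\epsilon, z})}{\partial \epsilon} \right|_{\epsilon = 0} = -\nabla_{\theta} L(z_{test}, \hat{\theta})^{T} H_{\hat{\theta}}^{-1} \nabla_{\theta} L(z, \hat{\theta})$ [1]
• $H_{\hat{\theta}} = \frac{1}{n} \sum_{i=1}^{n} \nabla_{\theta}^{2} L(z_i, \hat{\theta})$
[1] Cook & Weisberg, 1982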
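A small sketch can show how this closed form replaces retraining. It follows the sign convention of Koh and Liang's paper, in which the expression carries a leading minus sign, and it assumes a one-parameter logistic regression so that the Hessian is a scalar. The data and the names `TRAIN`, `Z_TEST`, `fit_upweighted`, and `influence_up_loss` are hypothetical.

```python
import math

# Hypothetical toy data (x, y) with labels +1 / -1; not taken from the paper.
TRAIN = [(-2.0, -1), (-1.0, -1), (-0.5, 1), (0.5, -1), (1.0, 1), (2.0, 1)]
Z_TEST = (1.5, 1)

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def loss(z, theta):
    x, y = z
    return math.log1p(math.exp(-y * theta * x))

def grad(z, theta):
    x, y = z
    return -y * x * sigmoid(-y * theta * x)

def hess(z, theta):
    x, y = z
    p = sigmoid(y * theta * x)
    return x * x * p * (1.0 - p)

def fit_upweighted(data, z=None, eps=0.0, steps=50):
    """Newton's method on (1/n) * sum_i L(z_i, theta) + eps * L(z, theta)."""
    n = len(data)
    theta = 0.0
    for _ in range(steps):
        g = sum(grad(zi, theta) for zi in data) / n
        h = sum(hess(zi, theta) for zi in data) / n
        if z is not None:
            g += eps * grad(z, theta)
            h += eps * hess(z, theta)
        theta -= g / h
    return theta

theta_hat = fit_upweighted(TRAIN)
n = len(TRAIN)

# With one parameter the Hessian H is a scalar, so H^{-1} is simply 1 / H.
H = sum(hess(z, theta_hat) for z in TRAIN) / n

def influence_up_loss(z, z_test):
    """I_up,loss(z, z_test) = -grad L(z_test) * H^{-1} * grad L(z), at theta_hat."""
    return -grad(z_test, theta_hat) * (1.0 / H) * grad(z, theta_hat)

for z in TRAIN:
    # First-order estimate of the effect of removing z (eps = -1/n) ...
    predicted = -influence_up_loss(z, Z_TEST) / n
    # ... compared with the change measured by actually retraining without z.
    theta_without_z = fit_upweighted(TRAIN, z=z, eps=-1.0 / n)
    actual = loss(Z_TEST, theta_without_z) - loss(Z_TEST, theta_hat)
    print(f"{z}: predicted {predicted:+.4f}, retrained {actual:+.4f}")
```

With only six training points, eps = -1/n is not a small perturbation, so the two columns should be expected to agree in sign and rough size rather than exactly.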
Scaling Up (3/3)
• Don’t explicitly form $H_{\hat{\theta}}^{-1}$; instead compute $H_{\hat{\theta}}^{-1} v$
$v$ → (Pearlmutter trick [1]) → $H_{\hat{\theta}} v$ → (CG [2] or Taylor [3]) → $H_{\hat{\theta}}^{-1} v$
[1] Pearlmutter, 1994
[2] Martens, 2010
[3] Agarwal, Bullins & Hazan, 2016
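The pipeline on this slide can be sketched in plain Python under a few assumptions. Pearlmutter's trick obtains exact Hessian-vector products through automatic differentiation; since no autodiff library is used here, the sketch swaps in the closed-form product available for logistic regression, which likewise never builds H. The linear solve uses conjugate gradient; the stochastic Taylor approximation is not shown. The data, `THETA` (a stand-in for trained parameters), and all function names are hypothetical.

```python
import math

# Hypothetical inputs: feature rows (last entry 1.0 acts as a bias) and labels in {+1, -1}.
X = [[-2.0, 0.5, 1.0], [-1.0, -1.5, 1.0], [-0.5, 1.0, 1.0],
     [0.5, -0.5, 1.0], [1.0, 2.0, 1.0], [2.0, -1.0, 1.0]]
Y = [-1, -1, 1, -1, 1, 1]
THETA = [1.2, 0.3, -0.1]        # stand-in for trained parameters theta_hat
X_TEST, Y_TEST = [1.5, 0.0, 1.0], 1

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def grad_loss(x, y, theta):
    """Gradient of log(1 + exp(-y * theta . x)) with respect to theta."""
    c = -y * sigmoid(-y * dot(theta, x))
    return [c * xj for xj in x]

def hvp(v, theta=THETA):
    """H v = (1/n) * sum_i s_i (1 - s_i) (x_i . v) x_i, computed without building H."""
    out = [0.0] * len(v)
    for x in X:
        s = sigmoid(dot(theta, x))
        c = s * (1.0 - s) * dot(x, v) / len(X)
        out = [o + c * xj for o, xj in zip(out, x)]
    return out

def conjugate_gradient(matvec, b, max_iters=50, tol=1e-24):
    """Solve A x = b for symmetric positive-definite A using only products A p."""
    x = [0.0] * len(b)
    r = list(b)                  # residual b - A x for the initial guess x = 0
    p = list(r)
    rs = dot(r, r)
    for _ in range(max_iters):
        if rs < tol:             # squared residual norm is small enough
            break
        Ap = matvec(p)
        alpha = rs / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x

# s_test = H^{-1} grad L(z_test); this single solve is reused for every training point.
s_test = conjugate_gradient(hvp, grad_loss(X_TEST, Y_TEST, THETA))
influences = [-dot(s_test, grad_loss(x, y, THETA)) for x, y in zip(X, Y)]
print(influences)
```

Solving once for the test gradient and then taking one dot product per training point is the design choice that keeps the cost at a single linear solve instead of n of them.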
Result (1/5)
Figure 3. Comparing models
Different models can reach the same result by totally different paths.
Result (2/5)
• ML systems get their training data from the outside world, which makes them vulnerable to attack
• Can we create adversarial training examples?
[Figure: five images with predictions Dog (97%), Dog (98%), Dog (98%), Dog (99%), Dog (98%); label: Fish]
Result (3/5)
• How easy is it to fool a machine learning model?
[Figure: five images with predictions Fish (97%), Fish (93%), Fish (87%), Fish (63%), Fish (52%); label: Fish]
Result (4/5)
• Debugging model errors: Why did a model make a wrong prediction?
• Case study: Hospital re-admission (20K patients, 127 features)

Group                           Original   Modified   Change
Healthy + re-admitted adults    ~20K       ~20K       same
Healthy children                21         1          -20
Re-admitted children            3          3          same
Result (5/5)
Figure 4. Debugging models using (a) feature weights (b) training point influence
(a) Bar chart: feature weight (y-axis) for the top 20 features (x-axis); annotation: indicator feature for child
(b) Bar chart: influence (y-axis) for the top 5 influential training examples (x-axis); annotations: healthy child, re-admitted child
Conclusion and Future Work
• A new way of diagnosing high-performing, complex, black-box models
• Applications include creating training-set attacks, debugging, and fixing labels
• Underlying each application is a common tool: the simple idea of the influence function
• The influence function assumes a very small perturbation of the model.
• Open problem: deriving a closed form for global changes in the model
Acknowledgements
• The authors of the conference paper ‘Understanding Black-box Predictions via Influence Functions’, Pang Wei Koh et al.
• I am grateful to my supervisor, Tasnim Azad Abir sir, for his guidance.

Paper presentation on 'Understanding Black-box Predictions via Influence Functions'
