SHapley Additive
exPlanations
SHAP Ted Discussion
shap overview
What’s all the fuss about?
shapley
● Game-theoretic approach to assigning “credit”
to each member of a cooperative group
● Shapley values measure the
importance of a feature by
comparing what a model predicts
with and without the feature.
Because the order in which features
are added can change the size of that
difference, the comparison is repeated
for every possible order and averaged,
so that the features are compared fairly. (source)
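The "every possible order" idea can be sketched for a tiny cooperative game. This is only an illustration: the three players and the payoff table are made-up numbers, and `shapley_by_orderings` is a hypothetical helper name, not part of any library.

```python
from itertools import permutations

def shapley_by_orderings(players, value):
    """Average each player's marginal contribution over every join order."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Credit p with the change in payoff caused by p joining.
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: t / len(orderings) for p, t in totals.items()}

# Hypothetical payoff for each coalition of a 3-player game (made-up numbers).
PAYOFF = {
    frozenset(): 0,
    frozenset("A"): 10,
    frozenset("B"): 20,
    frozenset("C"): 30,
    frozenset("AB"): 40,
    frozenset("AC"): 50,
    frozenset("BC"): 60,
    frozenset("ABC"): 90,
}

credit = shapley_by_orderings("ABC", PAYOFF.__getitem__)
print(credit)
```

The credits add up to the payoff of the full coalition, which is the property that makes the split "fair" in the Shapley sense.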
shap
● Shapley values quantify the contribution that
each player brings to the game. SHAP quantifies
the contribution that each feature brings to
the prediction made by the model.
● One game = one observation, so SHAP is a local explanation method
● Lundberg, Scott M., and Su-In Lee. “A unified
approach to interpreting model predictions.” Advances
in Neural Information Processing Systems (2017)
● Implements Shapley value estimation (TreeSHAP, KernelSHAP)
● Connects LIME and Shapley values
● One line of Python gives you feature explanations
h/t these slides
Quick History
● Imagine a machine learning model that
predicts a person’s income from their
age, gender and job.
● Shapley values are based on the idea
that the outcome of each possible
combination (or coalition) of players
should be considered to determine the
importance of a single player. In our
case, this corresponds to each possible
combination of f features (f going from
0 to F, F being the number of all
features available, in our example 3).
● In math, this is called a “power set” and
can be represented as a tree. h/t this article
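As a small sketch, the power set for the three example features can be enumerated with the standard library. The feature names follow the income example; `power_set` is a hypothetical helper name.

```python
from itertools import combinations

FEATURES = ("age", "gender", "job")

def power_set(items):
    """Return every coalition of `items`, from the empty set up to the full set."""
    return [
        frozenset(combo)
        for size in range(len(items) + 1)   # f goes from 0 to F
        for combo in combinations(items, size)
    ]

coalitions = power_set(FEATURES)
for coalition in coalitions:
    print(sorted(coalition))
print(len(coalitions), "coalitions")
```

With F = 3 features this lists 2^3 = 8 coalitions, including the empty one (no features at all).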
● The cardinality of a power set is 2^n,
where n is the number of elements of
the original set.
● Conceptually, SHAP requires training a distinct
predictive model for each distinct
coalition in the power set (2^F models)
● The models are identical in their
hyperparameters and training data
(the full dataset). The only thing
that changes is the set of features
included in the model.
● Imagine that we have already trained
our 8 models on the same training data.
Take a new observation (call it x₀)
and see what the 8 different models
predict for that same observation x₀.
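A minimal sketch of "one model per coalition", under loud assumptions: the training rows are made-up, and the "model" is a toy group-mean predictor (it predicts the mean income of training rows that match x₀ on the coalition's features) standing in for whatever learner would really be used. `fit` is a hypothetical helper name.

```python
from itertools import combinations
from statistics import mean

FEATURES = ("age", "gender", "job")

# Made-up training data: feature values -> income in $k.
TRAIN = [
    ({"age": "young", "gender": "f", "job": "nurse"}, 40),
    ({"age": "young", "gender": "m", "job": "engineer"}, 60),
    ({"age": "old", "gender": "f", "job": "engineer"}, 90),
    ({"age": "old", "gender": "m", "job": "nurse"}, 50),
    ({"age": "old", "gender": "f", "job": "nurse"}, 55),
    ({"age": "young", "gender": "f", "job": "engineer"}, 65),
]

def fit(coalition):
    """'Train' a group-mean model that only sees the features in `coalition`."""
    table = {}
    for row, income in TRAIN:
        key = tuple(row[f] for f in coalition)
        table.setdefault(key, []).append(income)
    fallback = mean(income for _, income in TRAIN)

    def predict(x):
        key = tuple(x[f] for f in coalition)
        # Fall back to the global mean if this combination was never seen.
        return mean(table[key]) if key in table else fallback

    return predict

# Same data and same learner for every model; only the feature set changes.
coalitions = [c for k in range(len(FEATURES) + 1) for c in combinations(FEATURES, k)]
models = {frozenset(c): fit(c) for c in coalitions}

x0 = {"age": "old", "gender": "f", "job": "engineer"}
predictions = {c: predict(x0) for c, predict in models.items()}
for c, p in predictions.items():
    print(sorted(c), p)
```

The model for the empty coalition sees no features, so it can only predict the average income of the training data.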
● Two nodes connected by an edge differ
by exactly one feature, so the gap between
the predictions of two connected nodes
is due to that additional feature. This gap is
called the “marginal contribution” of the feature.
● Each edge represents the marginal
contribution brought by a feature
● To get the overall effect of Age on the final model
(i.e. the SHAP value of Age for x₀), consider the
marginal contribution of Age in all the models
(the edges highlighted in red) and combine
them as a weighted average.
● How does SHAP figure out the weights -
next section!
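The edges that add Age can be listed with a few lines of Python. The eight predictions below are made-up numbers for illustration only, and `marginal_contributions` is a hypothetical helper name. The weights are deliberately left out here.

```python
# Hypothetical predictions (in $k) of the 8 coalition models for x0; made-up numbers.
PRED = {
    frozenset(): 50,
    frozenset({"age"}): 40,
    frozenset({"gender"}): 47,
    frozenset({"job"}): 85,
    frozenset({"age", "gender"}): 35,
    frozenset({"age", "job"}): 70,
    frozenset({"gender", "job"}): 80,
    frozenset({"age", "gender", "job"}): 65,
}

def marginal_contributions(feature, pred):
    """Gap in prediction along every edge that adds `feature` to a coalition."""
    return {
        coalition: pred[coalition | {feature}] - pred[coalition]
        for coalition in pred
        if feature not in coalition
    }

age_edges = marginal_contributions("age", PRED)
for coalition, gap in age_edges.items():
    print(sorted(coalition), "->", sorted(coalition | {"age"}), gap)
```

With three features there are four such edges for Age, one for each coalition of the other two features.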
shap specifics
Shapley Axioms
Shapley Equation
For a subset S, the weight is the
product of the number of
permutations of S and the number of
permutations of the players outside
both S and i (i.e., N \ (S ∪ {i})), divided by
the number of permutations of N.
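For reference, the standard form of the Shapley value that matches this weight description can be written as follows. The symbols φ_i (the value credited to player i) and v (the payoff of a coalition) are notation assumed here; N, S and i are as in the text.

```latex
\phi_i(v) \;=\; \sum_{S \subseteq N \setminus \{i\}}
  \frac{|S|!\,\bigl(|N| - |S| - 1\bigr)!}{|N|!}\,
  \Bigl( v\bigl(S \cup \{i\}\bigr) - v(S) \Bigr)
```

For |N| = 3 the weights are 1/3 for |S| = 0, 1/6 for each of the two subsets with |S| = 1, and 1/3 for |S| = 2, so they sum to 1.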
shap example
Shapley in ML
● Shapley value is computed by perturbing
input features and seeing how changes to
the input features correspond to the final
model prediction.
● Shapley value = the average marginal
contribution of a feature to the overall
model score
● For ML models, it’s not possible to just
“exclude” a feature when determining a
prediction.
● The formulation of Shapley values within
an ML context simulates “excluded”
features by sampling from the empirical
distribution of each feature’s values and
averaging over multiple samples (Monte
Carlo using feature values borrowed from
other data samples: FrankenFeatures!)
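One common way to do this sampling is permutation sampling: pick a random feature order and a random donor row, then compare the prediction with and without the feature of interest, filling every "excluded" feature from the donor. The sketch below assumes that scheme; it is not necessarily what the shap package does internally, and the model, background rows and function names are all made up.

```python
import random

def model(x):
    """Hypothetical black-box model (a made-up linear function)."""
    return 2.0 * x[0] + 3.0 * x[1] - 1.0 * x[2]

def sampled_shapley(f, x, background, j, n_samples=2000, seed=0):
    """Monte Carlo estimate of feature j's Shapley value for instance x."""
    rng = random.Random(seed)
    n = len(x)
    total = 0.0
    for _ in range(n_samples):
        z = rng.choice(background)      # donor row supplying "excluded" values
        order = list(range(n))
        rng.shuffle(order)              # random feature ordering
        present = set(order[:order.index(j)])  # features that joined before j
        # Mixed rows: x's values for present features, donor values elsewhere.
        with_j = [x[k] if (k in present or k == j) else z[k] for k in range(n)]
        without_j = [x[k] if k in present else z[k] for k in range(n)]
        total += f(with_j) - f(without_j)
    return total / n_samples

# Made-up background data and the observation to explain.
BACKGROUND = [(0.0, 1.0, 5.0), (1.0, 0.0, 2.0), (2.0, 4.0, 1.0), (3.0, 3.0, 0.0)]
x0 = (4.0, 2.0, 3.0)

phi = [sampled_shapley(model, x0, BACKGROUND, j) for j in range(3)]
print(phi)
```

For a linear model the estimates should land near coefficient × (value − background mean) for each feature, which gives a quick sanity check on the sampler.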
Shap Package
Explainers for
● Tree models (e.g. XGBoost)
● Deep explainer (neural nets)
● Linear explainer (regression)
Shapley Usage - Beeswarm Plot
Shapley Usage - Waterfall Plot
Shapley Usage - Force Plot
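A hedged sketch of how the three plots are typically produced with the shap package. It assumes the third-party `shap` package is installed, that `model` is an already fitted single-output tree model (for example an XGBoost regressor) and that `X` is a pandas DataFrame of features. `explain_and_plot` is a hypothetical wrapper name.

```python
def explain_and_plot(model, X):
    """Explain a fitted tree model and draw beeswarm, waterfall and force plots.

    Assumptions: `shap` is installed, `model` is a fitted single-output tree
    model, and `X` is a pandas DataFrame with the model's features.
    """
    import shap  # imported here so the sketch loads even without shap installed

    explainer = shap.TreeExplainer(model)
    shap_values = explainer(X)            # SHAP values for every row of X

    shap.plots.beeswarm(shap_values)      # global view: all rows, all features
    shap.plots.waterfall(shap_values[0])  # local view: the first observation
    shap.plots.force(shap_values[0])      # same observation as a force plot
    return shap_values
```

In a notebook the force plot usually needs `shap.initjs()` to be called first so the JavaScript visualisation can render.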
Advantages / Disadvantages
● Everyone likes explainability
● Using the SHAP Python package takes two lines
and is fairly fast (especially for tree-based
models)
● Model agnostic (black box)
● Performed on each data point - so we
get granularity to a single point, and can
aggregate over the whole model or
subsets of data.
● Brute-force calculation is combinatorial.
SHAP uses Monte Carlo-like approximations
and exploits the model structure (think trees)
when it is known, but it is
still a compute beast
● Stakeholders (who have not heard
Junlin’s talk yet) will mistake SHAP
analysis for causation rather than correlation
● SHAP may make predictions on
unrealistic data
● There is no native Spark version (so you
have to convert PySpark DataFrames to
pandas)

Shapley Tech Talk - SHAP and Shapley Discussion
