Machine Learning
Explainability
“Many people say machine learning models are "black boxes", in the sense that they
can make good predictions but you can't understand the logic behind those
predictions.”
Reference:
https://www.kaggle.com/code/dansbecker/use-cases-for-model-insights
Insights gained:
● What features in the data did the model think are most important?
● For any single prediction from a model, how did each feature in the data affect that particular
prediction?
● How does each feature affect the model's predictions in a big-picture sense (what is its
typical effect when considered over a large number of possible predictions)?
Why do we need to know the logic behind predictions?
1. Debugging
2. Informing Feature Engineering
3. Directing Future Data Collection
4. Informing Human Decision-Making
5. Building Trust
Permutation Importance
- Finds which features have the biggest impact on predictions
- Measures feature importance
Method:
1. Get a trained model
2. Randomly shuffle a single feature column and make predictions with the shuffled data
3. Compute how much the loss function worsened because of the shuffling
4. Undo the shuffle and repeat for the next feature column
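The four steps above can be sketched in plain Python. This is only an illustration under stated assumptions: `predict` is a hypothetical stand-in for a trained model, the data is synthetic, and mean squared error is chosen arbitrarily as the loss. None of these come from the slides.

```python
import random

def predict(row):
    """Hypothetical stand-in for a trained model (an assumption, not the slides' model)."""
    goals, distance_km, yellow_cards = row
    # yellow_cards is deliberately ignored, so its importance should be zero.
    return 0.3 + 0.2 * min(goals, 2) + 0.001 * distance_km

def mean_squared_error(rows, targets):
    return sum((predict(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)

def permutation_importance(rows, targets, seed=0):
    """Return the increase in loss caused by shuffling each column in turn."""
    rng = random.Random(seed)
    baseline_loss = mean_squared_error(rows, targets)
    importances = []
    for col in range(len(rows[0])):
        column = [r[col] for r in rows]
        rng.shuffle(column)  # step 2: shuffle a single feature column
        # Build copies of the rows with only this column replaced. The original
        # rows are never modified, which plays the role of "undoing" the shuffle.
        shuffled_rows = [r[:col] + (v,) + r[col + 1:] for r, v in zip(rows, column)]
        # Step 3: how much worse is the loss on the shuffled data?
        importances.append(mean_squared_error(shuffled_rows, targets) - baseline_loss)
    return importances

# Synthetic data, made up for illustration: (goals, distance_km, yellow_cards).
data_rng = random.Random(42)
rows = [(data_rng.randint(0, 4), data_rng.uniform(80, 120), data_rng.randint(0, 5))
        for _ in range(200)]
targets = [predict(r) for r in rows]  # targets the stand-in model fits exactly

importances = permutation_importance(rows, targets)
for name, value in zip(["goals", "distance_km", "yellow_cards"], importances):
    print(f"{name}: {value:.5f}")
```

In practice a library routine such as scikit-learn's `sklearn.inspection.permutation_importance` would normally be used instead of a hand-written loop.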
Example:
We want to predict whether a soccer/football team will have the "Man of the
Game" winner based on the team's statistics.
Example Implementation:
In the permutation importance output, the top values are the most important features, and those towards the bottom matter least.
Partial Dependence Plots
- Show how a feature affects predictions
- Can be interpreted similarly to the coefficients in linear or logistic regression models
Method:
1. Get a trained model
2. Start with a single row of data
3. Alter the value of one feature, moving from low values to high values, and make a prediction at each value
4. Repeat for every row and compute the average prediction for each value of the feature (from low to high)
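A minimal sketch of the method above, assuming a hypothetical `predict` function in place of a trained model and a tiny made-up dataset. The stand-in model is built so that the first goal matters most, purely to give the averages a recognisable shape.

```python
def predict(row):
    """Hypothetical stand-in for a trained model (an assumption, not the slides' model)."""
    goals, distance_km = row
    goal_effect = 0.25 if goals >= 1 else 0.0             # only the first goal matters
    distance_effect = -0.0005 * (distance_km - 100) ** 2  # peaks near 100 km
    return 0.4 + goal_effect + distance_effect

def partial_dependence(rows, feature_index, grid):
    """Average prediction over all rows with one feature forced to each grid value."""
    averages = []
    for value in grid:
        total = 0.0
        for row in rows:
            altered = list(row)
            altered[feature_index] = value  # overwrite only the feature of interest
            total += predict(altered)
        averages.append(total / len(rows))
    return averages

# Made-up rows: (goals, distance_km).
rows = [(0, 95.0), (1, 100.0), (2, 105.0), (3, 110.0)]
goal_grid = [0, 1, 2, 3]
pdp_goals = partial_dependence(rows, feature_index=0, grid=goal_grid)

for value, average in zip(goal_grid, pdp_goals):
    print(f"goals={value}: average prediction {average:.3f}")
```

Plotting `pdp_goals` against `goal_grid` would give the curve of a partial dependence plot. Plotting libraries often show the change relative to the leftmost grid value rather than the raw average.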
Example Implementation:
Need to specify which feature to plot.
Y-axis → change in prediction from baseline (when feature value = 0)
X-axis → feature value
Blue region → confidence interval
Interpretation (Goal Scored) → Scoring one goal substantially increases your chances of winning "Man of the Match," but extra goals beyond that appear to have little impact on predictions.
Interpretation (Distance Covered) → This model thinks you are more likely to win Man of the Match if your players run a total of 100 km over the course of the game, though running much more leads to lower predictions.
2D Partial Dependence Plots
- Show how the interaction between two features affects predictions
Description → Shows predictions for any combination of Goals Scored and Distance Covered. Lighter color indicates a higher probability of winning.
Interpretation → There is a high chance of winning when a team scores at least 1 goal and runs a total distance close to 100 km. If the team scores 0 goals, distance covered doesn't matter.
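The same averaging idea extends to two features at once. The sketch below assumes a hypothetical `predict` function that is written to ignore distance when no goals are scored, loosely mirroring the interpretation above. The model and the rows are invented for illustration.

```python
def predict(row):
    """Hypothetical stand-in for a trained model (an assumption, not the slides' model)."""
    goals, distance_km, yellow_cards = row
    if goals == 0:
        base = 0.3  # with no goals, distance is ignored
    else:
        base = 0.6 - 0.0005 * (distance_km - 100) ** 2  # best near 100 km
    return base - 0.02 * yellow_cards

def partial_dependence_2d(rows, index_a, grid_a, index_b, grid_b):
    """Average prediction for every combination of two forced feature values."""
    surface = []
    for a in grid_a:
        line = []
        for b in grid_b:
            total = 0.0
            for row in rows:
                altered = list(row)
                altered[index_a] = a  # force the first feature
                altered[index_b] = b  # force the second feature
                total += predict(altered)
            line.append(total / len(rows))
        surface.append(line)
    return surface

# Made-up rows: (goals, distance_km, yellow_cards).
rows = [(0, 95.0, 1), (1, 100.0, 0), (2, 108.0, 3), (1, 102.0, 2)]
goal_grid = [0, 1, 2]
distance_grid = [90.0, 100.0, 110.0]
surface = partial_dependence_2d(rows, 0, goal_grid, 1, distance_grid)

for goals, line in zip(goal_grid, surface):
    print(goals, [round(v, 3) for v in line])
```

Each entry of `surface` corresponds to one cell of the 2D plot. A contour or heat map of this grid would give the colored picture described above.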
SHAP Values (SHapley Additive exPlanations)
- Break down a prediction to show the impact of each feature on that particular prediction
- Useful for justifying the reasons behind a model's prediction
- Example: A model says a bank shouldn't loan someone money, and the bank is legally required to explain the basis for each loan rejection
Method:
1. Get a trained model
2. Make a prediction for a specific row of data
3. Decompose the prediction with the following equation:
sum(SHAP values for all features) = pred_for_team - pred_for_baseline_values
The SHAP values of all features sum up to explain why the prediction was different from the baseline.
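The additive equation above can be checked on a toy example by computing exact Shapley values by brute force. This sketch makes several assumptions: `predict` is a hypothetical model, and a single baseline row stands in for the baseline, whereas SHAP implementations typically average over a background dataset and use faster approximations.

```python
from itertools import combinations
from math import factorial

def predict(row):
    """Hypothetical stand-in for a trained model (an assumption, not the slides' model)."""
    goals, distance_km, yellow_cards = row
    # Yellow cards only hurt when no goals were scored (an invented interaction).
    return 0.3 + 0.1 * goals + 0.002 * (distance_km - 100) - 0.05 * yellow_cards * (goals == 0)

def shapley_values(row, baseline):
    """Exact Shapley values of each feature, relative to a single baseline row."""
    n = len(row)

    def value(subset):
        # Features in `subset` take the row's values; the rest stay at the baseline.
        mixed = [row[i] if i in subset else baseline[i] for i in range(n)]
        return predict(mixed)

    phis = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_i = value(set(subset) | {i})
                without_i = value(set(subset))
                phi += weight * (with_i - without_i)
        phis.append(phi)
    return phis

team = (2, 105.0, 1)             # made-up row: (goals, distance_km, yellow_cards)
baseline_values = (0, 100.0, 2)  # made-up baseline row
shap_like = shapley_values(team, baseline_values)
difference = predict(team) - predict(baseline_values)

print("per-feature contributions:", [round(v, 4) for v in shap_like])
print("sum of contributions:", round(sum(shap_like), 4))
print("pred_for_team - pred_for_baseline_values:", round(difference, 4))
```

The brute-force loop visits every subset of features, so it is only practical for a handful of features. Libraries such as `shap` rely on model-specific algorithms to avoid this cost.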
Example Implementation:
The prediction is decomposed in a graph.
Interpretation
- We predicted 0.71, whereas the base_value is 0.4933.
- Feature values that increase the prediction are shown in pink, and their visual size shows the magnitude of the feature's effect. The biggest impact comes from Goal Scored being 2.
- Feature values that decrease the prediction are shown in blue.
Advanced Uses of SHAP Values
SHAP summary plots: Give us a bird's-eye view of feature importance and what is driving it.
Each dot has three characteristics:
- Vertical location shows which feature it is depicting
- Color shows whether that feature was high or low for that row of the dataset
- Horizontal location shows whether the effect of that value caused a higher or lower prediction
Interpretation:
- The point in the upper left was for a team that scored few goals, reducing the prediction by 0.25.
- Usually Yellow Card doesn't affect the prediction, but there is an extreme case where a high value caused a much lower prediction.
Thank You
Any Questions?


Editor's Notes

  1. I learnt about this recently from a Kaggle course. It is about using statistical packages in Python to help us understand the logic a model uses to make predictions.
  2. Why we need to know the logic behind predictions:
     - Debugging: Understanding the patterns a model is finding will help you identify when those are at odds with your knowledge of the real world, and this is typically the first step in tracking down bugs.
     - Feature engineering: Helps you understand feature importance and feature correlation so that you can create new features from existing ones. This is especially helpful when the number of features is large.
     - Future data collection: The insights can help you understand the value of the features you currently have, which will help you reason about what new values may be most helpful to collect.
     - Human decision-making: Sometimes insights about what led to a prediction can be more important than the prediction itself when planning future decisions.
     - Trust: Showing insights that fit our general understanding of the problem will help build trust, even among people with little deep knowledge of data science.
  3. Disadvantage: Permutation importance doesn't tell you how each feature matters. If a feature has medium permutation importance, that could mean it has a large effect for a few predictions but no effect in general, or a medium effect for all predictions.
  4. The amount of randomness in the performance change is measured by shuffling the same column multiple times and measuring the variance of the change in performance. Negative values: You'll occasionally see negative values for permutation importances. In those cases, the predictions on the shuffled (or noisy) data happened to be more accurate than those on the real data. This happens when the feature didn't matter (should have had an importance close to 0), but random chance caused the predictions on shuffled data to be more accurate. This is more common with small datasets, like the one in this example, because there is more room for luck/chance.
  5. (Model: RandomForest)
  6. SHAP values interpret the impact of having a certain value for a given feature in comparison to the prediction we'd make if that feature took some baseline value. For example: how much was a prediction driven by the fact that the team scored 3 goals instead of some baseline number of goals?
  7. Other SHAP explainer objects can be used for specific models. For example, shap.DeepExplainer works with deep learning models.
  8. Interpretation: For example, the point in the upper left was for a team that scored few goals, reducing the prediction by 0.25. Usually Yellow Card doesn't affect the prediction, but there is an extreme case where a high value caused a much lower prediction. Permutation importance doesn't tell us how each feature matters. If a feature has medium permutation importance, that could mean it has a large effect for a few predictions but no effect in general, or a medium effect for all predictions.
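The notes on permutation importance mention shuffling the same column several times and measuring the variance of the performance change. A minimal sketch of that idea follows, assuming a hypothetical `predict` function that leans slightly on a spurious feature, synthetic noisy targets, and mean squared error as the loss. With a small dataset like this one, individual changes for the spurious feature may come out negative by chance.

```python
import random
from statistics import mean, stdev

def predict(row):
    """Hypothetical stand-in for a trained model (an assumption, not the slides' model)."""
    goals, spurious = row
    return 0.2 * goals + 0.01 * spurious  # tiny, meaningless reliance on `spurious`

def mse(rows, targets):
    return sum((predict(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)

def repeated_importance(rows, targets, col, n_repeats=20, seed=0):
    """Shuffle one column n_repeats times; return mean, spread, and all loss changes."""
    rng = random.Random(seed)
    baseline_loss = mse(rows, targets)
    changes = []
    for _ in range(n_repeats):
        column = [r[col] for r in rows]
        rng.shuffle(column)
        shuffled = [r[:col] + (v,) + r[col + 1:] for r, v in zip(rows, column)]
        changes.append(mse(shuffled, targets) - baseline_loss)
    return mean(changes), stdev(changes), changes

# Small synthetic dataset: the targets depend on goals plus noise only.
data_rng = random.Random(1)
rows = [(data_rng.randint(0, 3), data_rng.uniform(0, 5)) for _ in range(50)]
targets = [0.2 * goals + data_rng.gauss(0, 0.1) for goals, _ in rows]

goals_mean, goals_std, goals_changes = repeated_importance(rows, targets, col=0)
spurious_mean, spurious_std, spurious_changes = repeated_importance(rows, targets, col=1)

print(f"goals:    {goals_mean:.4f} +/- {goals_std:.4f}")
print(f"spurious: {spurious_mean:.4f} +/- {spurious_std:.4f}")
print("negative changes for spurious:", sum(c < 0 for c in spurious_changes))
```

Reporting the mean together with the spread makes it easier to tell a genuinely unimportant feature, whose changes scatter around zero, from one with a small but consistent effect.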