Machine Learning Interpretability
What is it?
Explaining a Linear Model Hypothesis VS Explaining a Non-Linear Model Hypothesis
Global VS Local Interpretability
Model-Agnostic VS Model-Specific
K-LIME
PDP and ICE
LOCO – Feature Importance
Surrogate Models
Shapley Values
Random Forest – Feature Importance
PDP and ICE
Surrogate Models
LOCO – Feature Importance
K-LIME
Why MLI?
Development of MLI
First Wave
Second Wave
Third Wave
DRIVERLESS AI
 Model Built with DAI
 MLI with DAI

Editor's Notes

  • #3 Residual Analysis: If strong patterns are visible in plotted residuals, that is a dead giveaway of problems with your data, your model, or both. Conversely, if a model produces randomly distributed residuals, that is a strong indication of a well-fit, dependable, trustworthy model, especially when other fit statistics (e.g., R², AUC) are in appropriate ranges (see the residual-plot sketch after these notes). Linear: an illustration of an approximate model with exact explanations. Non-linear: an illustration of an exact model with approximate explanations. Here f(x) represents the true, unknown target function, which is approximated by training a machine learning algorithm on the pictured data points.
  • #4 To calculate the local contribution of a feature, simply remove that feature and measure how the prediction changes (the idea behind LOCO). PDP: Partial dependence plots show how machine-learned response functions change with the values of one or two independent variables of interest, while averaging out the effects of all other independent variables. Partial dependence plots with two independent variables are particularly useful for visualizing complex interactions between the independent variables of interest. ICE: Individual conditional expectation (ICE) plots, a newer and less well-known adaptation of partial dependence plots, can be used to create more localized explanations using the same ideas. ICE plots are particularly useful when there are strong relationships among many input variables (see the PDP/ICE sketch after these notes).
  • #5 Here we refer to DARPA's three waves of AI. First wave: reasoning over narrowly defined problems with handcrafted rules and no learning capability, yet very successful in many cases (e.g., the Cyber Grand Challenge). Second wave: statistical models trained on big data, good at classification and prediction. Third wave: constructing explainable models for classes of real-world phenomena.
  • #6 There is no explainability in second-wave AI; skewed training data creates maladaptation.
  • #7 This is the kind of explainability we need for neural nets.
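
Referring to note #3, a minimal sketch of the residual check described there, assuming synthetic data and a deliberately under-fit linear model (all names and data below are illustrative, not from the deck):

```python
# Hedged sketch: plot residuals of a fitted model to check for structure.
# Synthetic data and a linear model are used purely for illustration.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(500, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=500)   # non-linear target

model = LinearRegression().fit(X, y)                     # deliberately under-fit
residuals = y - model.predict(X)

plt.scatter(model.predict(X), residuals, alpha=0.4)
plt.axhline(0.0, color="k", linestyle="--")
plt.xlabel("Predicted value")
plt.ylabel("Residual (actual - predicted)")
plt.title("Structured residuals reveal model misspecification")
plt.show()
```

The sine-shaped pattern in these residuals is exactly the kind of visible structure that flags a problem with the model or the data; random scatter around zero would suggest a better fit.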
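Referring to note #4, a minimal sketch of how ICE curves and their average (the partial dependence) can be computed directly from model predictions; the model, synthetic data, and feature choice are assumptions for illustration only:

```python
# Hedged sketch: ICE curves and their average (the partial dependence)
# for one feature, computed directly from model predictions.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(300, 3))
y = 2 * X[:, 0] ** 2 + X[:, 1] + rng.normal(scale=0.05, size=300)

model = GradientBoostingRegressor().fit(X, y)

feature = 0
grid = np.linspace(X[:, feature].min(), X[:, feature].max(), 30)

ice = np.empty((len(X), len(grid)))
for j, v in enumerate(grid):
    X_mod = X.copy()
    X_mod[:, feature] = v             # force every row to the grid value
    ice[:, j] = model.predict(X_mod)  # one ICE curve per row

pdp = ice.mean(axis=0)                # PDP = average of the ICE curves

plt.plot(grid, ice[:50].T, color="grey", alpha=0.3)
plt.plot(grid, pdp, color="red", linewidth=2, label="Partial dependence")
plt.xlabel(f"Feature {feature}")
plt.ylabel("Model prediction")
plt.legend()
plt.show()
```

Averaging the per-row ICE curves recovers the partial dependence curve; ICE curves that diverge from one another signal interactions that the averaged PDP would hide.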