
Why Should We Trust You? Interpretability of Deep Neural Networks - StampedeCon AI Summit 2017


Despite widespread adoption and success, most machine learning models remain black boxes. Users and practitioners are often asked to implicitly trust their results. However, understanding the reasons behind predictions is critical in assessing trust, which is fundamental if one is asked to take action based on such models, or even to compare two similar models. In this talk I will (1) formulate the notion of interpretability of models, (2) review various attempts and research initiatives to solve this very important problem, and (3) demonstrate real industry use cases and results, focusing primarily on deep neural networks.

Published in: Data & Analytics


  1. Interpretability of Deep Neural Nets
  2. Agenda • Case for interpretability • Just how black is the deep neural network (DNN) box? • Recent research papers/solutions on interpretability of DNNs • Demo / code walkthrough
  3. ML everywhere: Big Data + vast computing resources + key algorithmic breakthroughs
  4. “People worry that computers will get too smart and take over the world, but the real problem is that they’re too stupid and they’ve already taken over the world.” - Pedro Domingos, “The Master Algorithm”. Intelligible Models for Healthcare: Predicting Pneumonia Risk and Hospital 30-day Readmission (Rich Caruana et al. 2015, Microsoft Research) - Goal: predict probability of death to triage patients into inpatient or outpatient care - The neural network predicted asthma patients as low risk (outpatient) - Historically, asthma patients were sent straight to intensive care, which lowered their observed risk - The model was “downgraded” to logistic regression
  5. Interpretability in machine learning: inputs x_1, x_2, ..., x_n → model → prediction ŷ vs. target y*; evaluation metrics; results interpretation. What is interpretability? “Ability to explain or to present in understandable terms to a human.” Explanation - “currency in which we exchange beliefs.” Towards a Rigorous Science of Interpretable Machine Learning (Finale Doshi-Velez, Been Kim)
  6. DARPA - XAI program concept
  7. Explainable models. Credit: DARPA XAI project
  8. Native interpretability? Three regimes: linear and monotonic (linear regressions, decision trees); linear and non-monotonic (multivariate adaptive regression splines, boosted models); non-linear and non-monotonic (non-linear SVMs, DNNs)
  9. When do we need interpretability? The need stems from incompleteness in problem formalization (under-specification); no amount of data may “fix it”. Drivers: research and debugging, mismatched objectives, safety, scientific inquiry, human training, fairness, model lifecycle management, domain adaptation. Example tasks at varying maturity (weak / at par / better than human): visual Q&A, image translation, playing Go, voice, NLP, multi-modal learning.
  10. Scope of interpretability. Global: the complete conditional distribution; relationships between the inputs and the dependent variable; relationship between the ML algorithm and its results. Local: a local region of the conditional distribution; “Why did the model predict this particular case?”; “What if this particular input were absent?”
  11. Deep neural nets. Input space for CNNs trained on ImageNet, per image: number of pixels = 256 × 256 × 3; values each pixel can take = 256; domain of all images = 256^(256 × 256 × 3). Total input: 1.2M images across 1000 classes, i.e., 3 × 256² × 1200 ≈ 236M pixel values per class: a very, very tiny fraction of the domain. Parameter-to-training-set ratio (p/n): 4 to 110. But... SGD always converges to a good solution. (Example: Samoyed vs. white wolf.)
  12. Recent research on understanding DNNs. Why does deep and cheap learning work so well? (Henry W. Lin (Harvard), Max Tegmark (MIT), David Rolnick (MIT)) • Physics-centric theory. Understanding deep learning requires rethinking generalization (Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, Oriol Vinyals (Google Brain and DeepMind)) • Revisits learning theory, esp. generalization bounds in empirical risk minimization. Opening the Black Box of Deep Neural Networks via Information (Ravid Shwartz-Ziv, Naftali Tishby) • Information bottleneck theory
  13. Randomization tests. Partially corrupted labels: independent relabeling with probability p. Random labels: all labels replaced with random ones. Shuffled pixels: one random permutation of the pixels applied to all images. Random pixels: a different random permutation for each image. Gaussian: pixels drawn from a Gaussian distribution (same mean/variance as the original images). All hyper-parameters were kept the same; training error reached 0 in all models.
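The label-corruption setup from these randomization tests can be sketched in a few lines; `corrupt_labels` is a hypothetical helper for illustration, not code from the paper:

```python
import random

def corrupt_labels(labels, p, num_classes, seed=0):
    """Independently relabel each example with probability p,
    drawing the new label uniformly from all classes."""
    rng = random.Random(seed)
    corrupted = []
    for y in labels:
        if rng.random() < p:
            corrupted.append(rng.randrange(num_classes))
        else:
            corrupted.append(y)
    return corrupted

labels = [i % 10 for i in range(1000)]
noisy = corrupt_labels(labels, p=0.5, num_classes=10)
frac_changed = sum(a != b for a, b in zip(labels, noisy)) / len(labels)
# roughly p * (1 - 1/num_classes) of the labels actually change,
# since a "random" relabel can coincide with the original
```

Training a network to zero error on such labels is what demonstrates that its effective capacity suffices to memorize the data.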
  14. Role of regularization • Typical regularizations: dropout, batch normalization, data augmentation, early stopping, weight decay (L2) • Explicit regularizers may improve generalization performance • Implicit regularizers like model architecture and SGD are a better controller of generalization error
  15. Takeaways • The effective capacity of a neural network is sufficient for memorizing the entire data set • Optimization != generalization • DNNs are prone to overfitting; they can shatter any input space • Regularizers may improve performance but are neither necessary nor by themselves sufficient for controlling generalization error
  16. Interpretability of DNNs. Hmmm... Notwithstanding the lack of a unifying theory of deep networks, they work great. Reaching 85-90% accuracy on classification problems is almost easy (of course, with state-of-the-art models and careful hyper-parameter tuning). How can we build trust?
  17. Disentanglement and separation of features. Learning facial expressions: - Images of the same individual are close in pixel space - To extract expression, the network must disentangle and separate expression from identity. A four-layer NN can separate the spirals.
  18. Visualizing representations: t-SNE • Embeds a high-dimensional probability distribution into a 2-D plane • Uses gradient descent to minimize KL divergence. Embedding of the 2-D representation of the final conv layer in AlexNet trained on ImageNet images. Visually inspect clusters for feature coherence. Can be a tool for global visualization of feature separation, but it is not trivial to get good results. Credit: Karpathy, t-SNE visualization of CNN codes
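A minimal t-SNE sketch, assuming scikit-learn is available; the random vectors here are illustrative stand-ins for real conv-layer codes:

```python
import numpy as np
from sklearn.manifold import TSNE

# Stand-in for high-dimensional codes from a trained network:
# 100 "images", each represented by a 64-d feature vector.
rng = np.random.default_rng(0)
features = rng.normal(size=(100, 64))

# t-SNE converts pairwise distances into a probability distribution
# and minimizes the KL divergence to a 2-D embedding.
embedding = TSNE(n_components=2, perplexity=10.0,
                 init="random", random_state=0).fit_transform(features)
print(embedding.shape)  # (100, 2)
```

With real conv codes, points in the 2-D plane can then be colored by class label to inspect cluster coherence.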
  19. Proxy models - knowledge distillation. DNNs learn a probability distribution over targets (“dark knowledge”, Hinton) • First: train a large network on the training set, against a point-mass (one-hot) probability distribution over targets • Next: train a shallow model using the large network's richer probability distribution over targets.
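The “richer probability distribution” can be illustrated with a temperature-softened cross-entropy; `distillation_loss` and the toy logits below are illustrative assumptions, not Hinton's exact training code:

```python
import numpy as np

def softmax(z, T=1.0):
    """Softmax with temperature T; higher T softens the distribution."""
    z = np.asarray(z, dtype=float) / T
    z -= z.max()  # numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, T=4.0):
    """Cross-entropy between the teacher's temperature-softened
    distribution and the student's (the 'dark knowledge' signal)."""
    p_teacher = softmax(teacher_logits, T)
    p_student = softmax(student_logits, T)
    return -np.sum(p_teacher * np.log(p_student + 1e-12))

teacher = [9.0, 3.0, 1.0]        # confident teacher: class 0
good_student = [8.0, 3.5, 1.0]   # similar relative ranking
bad_student = [1.0, 3.0, 9.0]    # reversed ranking
```

A shallow student trained on these soft targets receives far more information per example than a one-hot label does, including which wrong classes the teacher considers plausible.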
  20. Local interpretations: saliency / attribution maps. Visualize the features in the input space that mattered for the classification. Sensitivity analysis on the model: what would happen to output ŷ if we perturb the input x → x + ε? ε can be a feature, data, or specific inputs.
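The perturbation idea can be sketched as a one-feature-at-a-time finite-difference probe; `sensitivity_map` and the toy linear model are hypothetical, for illustration only:

```python
import numpy as np

def sensitivity_map(model, x, eps=1e-4):
    """Approximate |dŷ/dx_i| by perturbing each input feature by eps
    and measuring the change in the model's output."""
    x = np.asarray(x, dtype=float)
    base = model(x)
    sens = np.zeros_like(x)
    for i in range(x.size):
        xp = x.copy()
        xp[i] += eps
        sens[i] = abs(model(xp) - base) / eps
    return sens

# Toy "model": feature 2 dominates the score.
w = np.array([0.1, 0.0, 5.0, -0.5])
model = lambda x: float(w @ x)

s = sensitivity_map(model, np.array([1.0, 2.0, 3.0, 4.0]))
print(s.argmax())  # 2: the dominant feature
```

For real networks the same quantity is usually obtained in one backward pass rather than by finite differences, which is what the gradient-based methods on the next slides do.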
  21. Saliency maps - Grad-CAM - Backprops target-class activations from the final conv layer - Does not need any retraining or architecture change - Quite fast: a single backward operation in most frameworks - Uses guided backprop to only propagate positive activations - Negative gradients get zeroed out • Misses negatively correlated inputs. Credit: Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization
  22. Attribution maps. DeepLIFT (Deep Learning Important FeaTures) - Explains the “difference from reference value” of the output in terms of the “difference from reference value” of each input: Δt → Δx_1, Δx_2, ..., Δx_n - Assigns contributions C_{Δx_i, Δt} such that Σ_i C_{Δx_i, Δt} = Δt - Can account for negative contributions - Very new; hasn't been demonstrated much beyond MNIST-type datasets, and the reference value is empirical. Integrated Gradients - Pick some reference (baseline) value, e.g., an image with all pixel values 0 - Scale the input linearly from the baseline to the actual value, accumulating gradient × Δinput at each step: Δinput × Σ grad_{x_i} - Very fine-grained: operates at the pixel level. (Learning Important Features Through Propagating Activation Differences; Axiomatic Attribution for Deep Networks)
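Integrated gradients reduces to a Riemann sum along the straight path from the baseline to the input. A sketch on a toy linear model, where the method is exact; all names here are illustrative assumptions:

```python
import numpy as np

def integrated_gradients(grad_fn, x, baseline, steps=100):
    """Riemann-sum approximation of integrated gradients:
    (x - baseline) * average gradient along the path baseline -> x."""
    x, baseline = np.asarray(x, float), np.asarray(baseline, float)
    alphas = (np.arange(steps) + 0.5) / steps  # midpoint rule
    total = np.zeros_like(x)
    for a in alphas:
        total += grad_fn(baseline + a * (x - baseline))
    return (x - baseline) * total / steps

# Toy model f(x) = w.x with constant gradient w.
w = np.array([1.0, -2.0, 3.0])
f = lambda x: float(w @ x)
grad_fn = lambda x: w

x = np.array([1.0, 1.0, 1.0])
baseline = np.zeros(3)
attr = integrated_gradients(grad_fn, x, baseline)
# completeness axiom: attributions sum to f(x) - f(baseline)
```

The completeness property (attributions summing to the output difference) is the IG analogue of DeepLIFT's Σ_i C_{Δx_i, Δt} = Δt constraint.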
  23. Sensitivity analysis. Change the input by ε and then observe the prediction probability. - Occlusion-based: the idea is that the probability score will drop as important areas are occluded - Superpixel-based: same idea, but with better coherence.
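A minimal occlusion sketch; the scalar `score_fn` here is an assumed stand-in for a real network's class probability:

```python
import numpy as np

def occlusion_map(score_fn, image, patch=4):
    """Slide a mean-valued patch over the image; where the class score
    drops most, the occluded region mattered most."""
    h, w = image.shape
    base = score_fn(image)
    drops = np.zeros((h // patch, w // patch))
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            occluded = image.copy()
            occluded[i:i+patch, j:j+patch] = image.mean()
            drops[i // patch, j // patch] = base - score_fn(occluded)
    return drops

# Toy scorer that keys on a bright 4x4 blob in the top-left corner.
img = np.zeros((16, 16)); img[:4, :4] = 1.0
score_fn = lambda x: float(x[:4, :4].sum())

drops = occlusion_map(score_fn, img)
print(np.unravel_index(drops.argmax(), drops.shape))  # (0, 0)
```

Superpixel-based variants occlude segments produced by an image-segmentation step instead of square patches, giving maps that respect object boundaries.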
  24. LIME: Local Interpretable Model-agnostic Explanations. Key insights: local vs. global interpretation - a globally faithful interpretation might be impossible; to explain an individual decision we only need to model the small local region around it. Global trust: if we trust the individual reasonings, repeat with good coverage over the input space. Local explanation of the “+” data points: a locally fitted sparse linear model.
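LIME's recipe (perturb around the instance, weight samples by proximity, fit a simple surrogate) can be sketched as weighted least squares; this is a simplified sketch under stated assumptions, not the lime library's implementation:

```python
import numpy as np

def lime_explain(predict_fn, x, n_samples=500, kernel_width=1.0, seed=0):
    """Fit a locally weighted linear surrogate around x: sample
    perturbations, weight them by proximity to x, solve weighted least
    squares; the coefficients are local feature importances."""
    rng = np.random.default_rng(seed)
    X = x + rng.normal(scale=0.5, size=(n_samples, x.size))
    y = np.array([predict_fn(z) for z in X])
    d2 = ((X - x) ** 2).sum(axis=1)
    w = np.exp(-d2 / kernel_width**2)             # proximity kernel
    Xb = np.hstack([X, np.ones((n_samples, 1))])  # intercept column
    W = np.sqrt(w)[:, None]
    coef, *_ = np.linalg.lstsq(Xb * W, y * W[:, 0], rcond=None)
    return coef[:-1]                              # per-feature weights

# Nonlinear toy model: feature 0 drives the output near x = [0, 0].
predict_fn = lambda z: float(np.tanh(3 * z[0]) + 0.1 * z[1])
coef = lime_explain(predict_fn, np.array([0.0, 0.0]))
print(np.abs(coef).argmax())  # 0: feature 0 dominates locally
```

The real method adds a sparsity constraint (e.g., Lasso) and, for images, perturbs interpretable components such as superpixels rather than raw features.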
  25. Example - CNN • Segment the image using OpenCV • Build a linear model of prediction scores against segments
  26. Example – NLP (topic modelling)
  27. Example – Tabular data (random forest)
  28. Conclusion. Interpretability is not a “good to have” feature. This is just the beginning, and the future is bright: • “Right to explanation” – EU General Data Protection Regulation • SR 11-7: Guidance on Model Risk Management • Explainable Artificial Intelligence – DARPA
  29. Backup Slides
  30. Learning theory. Given inputs {x_1, x_2, ..., x_n} ∈ X (e.g., images) and outputs {y_1, y_2, ..., y_n} ∈ Y (e.g., labels), with a hypothesis space H (a set of functions), the goal of supervised learning is to learn a function f̂ such that ŷ_new = f̂(x_new). Define a loss function ℓ(f̂(x), y) and the empirical loss ℓ̂ = (1/n) Σ_i ℓ(f, z_i), where z_i = (x_i, y_i). We want lim_{n→∞} |ℓ̂(f̂) − ℓ(f̂)| = 0, i.e., the training-set error and the real error converge as n tends to infinity. The number of trainable parameters is indicative of model complexity; regularization is used to penalize complexity and reduce variance. Generalization error = |training error − validation error|.
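The empirical-risk and generalization-gap definitions above, in a toy numerical example; the linear predictor `f_hat` is a hypothetical fitted model, not real training output:

```python
import numpy as np

def empirical_risk(f, xs, ys):
    """Average squared loss over a finite sample: (1/n) sum of losses."""
    return float(np.mean((f(xs) - ys) ** 2))

rng = np.random.default_rng(0)
f_hat = lambda x: 2.0 * x  # hypothetical learned predictor

# Data generated by y = 2x + noise; train and validation draws.
x_train = rng.normal(size=200)
y_train = 2.0 * x_train + rng.normal(scale=0.1, size=200)
x_val = rng.normal(size=200)
y_val = 2.0 * x_val + rng.normal(scale=0.1, size=200)

# Generalization gap: |training error - validation error|.
gap = abs(empirical_risk(f_hat, x_train, y_train)
          - empirical_risk(f_hat, x_val, y_val))
print(gap)  # small: empirical and true risk agree for this predictor
```

A memorizing model (as in the randomization tests) would drive the training term to zero while the validation term stays large, making the gap itself the quantity of interest.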
  31. Model selection: bias-variance tradeoff. Deep neural nets
  32. Under-specification. Scientific understanding: • We have no complete way to state what knowledge is • The best we can do is ask for an explanation. Safety: • A complex task is almost never end-to-end testable • Query the model for an explanation. Ethics: • Encoding all protections a priori is not possible • Guard against discrimination. Mismatched objectives: • Optimizing an incomplete objective - all of these may address depression, but which side effect are you willing to accept? Debugging: • We may not know the internals • Domain mismatches • Mislabeled training set. Model lifecycle management: • Compare different models • Training-set evolution. Your own: • …