Unboxing the Black Boxes (updated version, November '18)


As machine learning has become more widely adopted across many industries and involved in many aspects of decision making, interpretability is becoming an integral part of the data scientist's workflow and can no longer be an afterthought. Ultimately, it is reasonable to ask whether we can understand and trust the decisions made by a predictive model.

However, in an increasingly competitive environment, data scientists are using ever more complex machine learning algorithms like XGBoost or deep learning to deliver more accurate models to businesses. Unfortunately, there is a fundamental tension between accuracy and interpretability: the most accurate models are often the hardest to understand. Opaque, complicated nonlinear models limit trust and transparency, slowing the adoption of machine learning in highly regulated industries like banking, healthcare and insurance. But things needn't be that way!

In this talk, Leonardo Noleto, senior data scientist at Bleckwen, explores the vibrant area of machine learning interpretability and explains how to understand the inner workings of black-box models using interpretability techniques. Along the way, Leonardo offers an overview of interpretability and of the trade-offs among the various approaches to making machine learning models interpretable. He concludes with a demonstration of open source tools like LIME and SHAP.



  1. Unboxing the Black Boxes | Leonardo Noleto, Senior Data Scientist | November 2018 | @leonardo_noleto
  2. About me: Leonardo Noleto, f(computer scientist) → data scientist. Former software engineer. Senior data scientist @ Bleckwen. Previously worked for start-ups and large companies (OVH and KPMG). Co-founder and former organizer of the Toulouse Data Science Meetup. @leonardo_noleto | linkedin.com/in/noleto
  3. Machine learning models are everywhere. Machine learning is at the core of many recent advances in science and technology.
  4. Booking an appointment for you
  5. Beating the Go Master
  6. Carrying goods
  7. Defeating breast cancer
  8. Speaking Chinese
  9. And even painting
  10. But sometimes things go wrong… A major flaw in Google's algorithm allegedly tagged two black people's faces with the word 'gorillas'. Source: http://www.businessinsider.fr/us/google-tags-black-people-as-gorillas-2015-7
  11. But sometimes things go very, very badly wrong. There's software used across the country (US) to predict future criminals. And it's biased against blacks. Source: https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing
  12. Machine learning is becoming ubiquitous in our lives (xkcd on Machine Learning)
    • With the growing impact of ML algorithms on society, it is no longer acceptable to trust a model without an answer to the question: why did the model make a particular decision?
    • Data science professionals used to focus on metrics like accuracy and model generalization
    • We need to make sure that the models we put in production do not hurt people
  13. Why should we care about interpretability? (Image from Drive.ai, a self-driving car service for public use in Frisco, Texas)
  14. "Interpretability is the degree to which a human can understand the cause of a decision." (Miller, Tim. 2017. "Explanation in Artificial Intelligence: Insights from the Social Sciences.")
  15. Interpretability: why is it important?
    • STRENGTHEN TRUST AND TRANSPARENCY: A model may discriminate against certain populations or make decisions based on fallacious correlations. Without understanding the captured factors, we have no guarantee that decisions will be fair.
    • SATISFY REGULATORY REQUIREMENTS: The GDPR determines how the personal data of European citizens can be used and analyzed. Interpretability is a key enabler for auditing machine learning models.
    • EXPLAIN DECISIONS: A machine learning model with better interpretability allows humans to establish a diagnosis and understand what happened.
    • IMPROVE MODELS: For data scientists, interpretability also ensures that the model is good for the right reasons and wrong for the right reasons as well. In addition, this offers new possibilities for feature engineering and debugging the model.
  16. The trade-off: accuracy vs. interpretability. No data scientist wants to give up on accuracy… (Image source: https://blog.bigml.com/2018/05/01/prediction-explanation-adding-transparency-to-machine-learning/amp/)
  17. Explaining the taxonomy of interpretability: two axes, scope (global vs. local) and approach (model-agnostic vs. model-specific)
  18. Explaining the taxonomy of interpretability: scope
    • In interpretability research there are two scopes that serve different purposes; they allow us to understand the model on two different levels:
    • Global: it is often important to understand the model as a whole and then zoom in on a specific case (or group of cases). A global explanation provides an overview of the most influential variables in the model, based on the input data and the variable to be predicted.
    • Local: local explanations identify the specific variables that contributed to an individual decision. Note that the most important variables in a global explanation do not necessarily correspond to the most important variables in a local prediction.
    (Figure: Summarizing Global and Local Interpretation. Source: DataScience.com)
  19. Explaining the taxonomy of interpretability: approaches
    • There are two ways to try to achieve interpretability:
    • Model-agnostic: this approach treats the model like a black box, meaning that it only considers the relation between inputs and outputs
    • Model-specific: this approach uses the internal structure of the model itself in order to explain it (trees, neurons, linear coefficients...)
  20. How do we enable model interpretability?
    • The first intuition when seeking explanations is to use only what we call white-box models (models that are intuitively interpretable), like linear models, decision trees and rule lists.
    • The second intuition is to use the internal structure of a specific model to extract the reasons behind its decisions. We will go through one example of the model-specific approach: one that uses the tree structure to explain tree-based models.
    • As a last resort, we can adopt a model-agnostic approach, which allows more liberty in using complex models, hopefully without compromising interpretability.
  21. Using only white-box models
  22. White-box model: Linear Models (global and local, white box)
    • Probably the best-known class of "white box" models
    • Linear models learn a linear, monotonic function between features and target
    • Linearity makes the model interpretable: we can deduce the contribution of a feature by looking at its coefficient in the model (a sketch follows)
    (Figure: monotonically increasing, monotonically decreasing, and non-monotonic functions)
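A minimal sketch of coefficient-based interpretation with scikit-learn; the diabetes data set and Ridge regression are illustrative choices, not from the talk. Inputs are standardized so that coefficients are comparable (the point made on the next slide):

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Ridge
from sklearn.preprocessing import StandardScaler

data = load_diabetes()
# Standardize inputs so coefficient magnitudes are comparable across features
X_std = StandardScaler().fit_transform(data.data)
model = Ridge(alpha=1.0).fit(X_std, data.target)

# Global view: one coefficient per feature, sorted by magnitude
for name, coef in sorted(zip(data.feature_names, model.coef_), key=lambda t: -abs(t[1])):
    print(f"{name:>6}: {coef:+.2f}")

# Local view: per-feature contributions to one prediction
x = X_std[0]
contributions = model.coef_ * x
print("prediction =", model.intercept_ + contributions.sum())
```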
  23. White-box model: Linear Models
    • Pros
      • By looking at the coefficients we can get a global idea of what the model is doing
      • For one instance we can see how much each feature actually contributes to the final result
      • When interpreting models, prefer Ridge to Lasso
      • Provides regulator-approvable models (e.g. Equal Credit Opportunity Act - ECOA)
    • Cons
      • As the name implies, linear models can only represent linear relationships
      • Many unrealistic assumptions for real data (linearity, normality, homoscedasticity, independence, fixed features, and absence of multicollinearity)
      • Linear models are not that easy to interpret when variables are correlated
      • Coefficients cannot easily be compared if inputs are not standardized
  24. White-box model: Decision Trees (global, and enables local, white box)
    • Decision trees often mimic human-level thinking, so it is simple to understand the data and visualize the logic of the different paths leading to the final decision
    • Interpretation can be reached by reading the tree as a bunch of nested if-else statements (as shown below)
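A minimal sketch of that reading, assuming scikit-learn (whose export_text helper prints a fitted tree as nested rules); the iris data set is an illustrative choice:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_iris()
tree = DecisionTreeClassifier(max_depth=3).fit(data.data, data.target)

# Print the fitted tree as nested if-else rules
print(export_text(tree, feature_names=data.feature_names))
```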
  25. White-box model: Decision Trees
    • Pros
      • Can be displayed graphically
      • Can be specified as a series of rules, and more closely approximate human decision-making than other models
      • Can automatically learn feature interactions, regardless of monotonic transformations
      • Tend to ignore irrelevant features
    • Cons
      • Performance is (generally) not competitive with the best supervised learning methods
      • Lack of smoothness
      • Can easily overfit the training data (tuning is required)
      • Small variations in the data can result in a completely different tree (high variance)
      • Struggle with highly imbalanced data sets
  26. White-box model: Generalized Additive Models (global and local, white box)
    • GAMs provide a useful extension of linear models, making them more flexible while still retaining much of their interpretability
    • A GAM consists of multiple smoothing functions (splines)
    • Relationships between the individual predictors and the dependent variable follow smooth patterns that can be linear or nonlinear
    • We can estimate these smooth relationships simultaneously and then predict g(E(Y)) by simply adding them up: g(E(Y)) = b0 + f1(x1) + f2(x2) + … + fp(xp) (a fitting sketch follows)
    (Image source: https://multithreaded.stitchfix.com/blog/2015/07/30/gam/)
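A minimal fitting sketch, assuming the pyGAM library; its s() spline terms and summary() method reflect the package's API, but treat the exact calls and the diabetes data set as illustrative assumptions:

```python
from pygam import LinearGAM, s
from sklearn.datasets import load_diabetes

X, y = load_diabetes(return_X_y=True)

# One smooth spline term per feature: g(E[Y]) = b0 + f1(x1) + ... + f4(x4)
gam = LinearGAM(s(0) + s(1) + s(2) + s(3)).fit(X[:, :4], y)

# Each term's significance and effective degrees of freedom are inspectable
gam.summary()
```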
  27. White-box model: GAMs
    • Pros
      • Interpretability advantages of linear models
      • Relationships between independent and dependent variables are not assumed to be linear
      • Predictor functions are automatically derived during model estimation
      • Regularization of predictor functions helps avoid overfitting
      • GAMs are competitive with popular learning techniques like SVMs and Random Forests
    • Cons
      • Can be computationally expensive for large data sets
      • One may be tempted to make the model overly complex (with many degrees of freedom) to get more accuracy (less interpretability and possible overfitting)
      • Interpretation is not as straightforward as for linear models (more involved)
  28. Model-specific interpretability
  29. Model-specific: TreeInterpreter (local only, tree-based models)
    • TreeInterpreter is a simple idea that gives us local interpretability for tree-based models (decision trees, Random Forests, ExtraTrees, XGBoost…)
    • Principle: feature weights are calculated by following the decision paths in the trees of an ensemble. Each node of a tree has an output score, and the contribution of a feature on the decision path is how much the score changes from parent to child
    • Every prediction can then be trivially presented as a sum of feature contributions, showing how the features lead to a particular prediction
  30. Model-specific: Example TreeInterpreter. Let's take the Boston housing price data set and build a regression decision tree to predict housing prices in the suburbs of Boston
  31. Model-specific: Example TreeInterpreter
    Instance to explain: RM = 3.1, LSTAT = 4.5, NOX = 0.54, DIS = 1.2
    • RM: average number of rooms among homes in the neighborhood
    • LSTAT: percentage of homeowners in the neighborhood considered "lower class"
    • NOX: the air quality
    • DIS: distance from the city center
    Target: median value of owner-occupied homes in $1000s
    Following the decision path: loss from RM = 19.96 - 22.60 = -2.64; gain from LSTAT = 23.47 - 19.96 = +3.51; loss from DIS = 23.03 - 23.47 = -0.44
    Prediction: 23.03 ≈ 22.60 (train-set mean) - 2.64 (loss from RM) + 3.51 (gain from LSTAT) - 0.44 (loss from DIS)
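The same decomposition can be reproduced programmatically with the treeinterpreter package; a sketch follows. The California housing data substitutes for the slide's Boston data set (which has been removed from recent scikit-learn releases), and the forest hyperparameters are illustrative:

```python
from sklearn.datasets import fetch_california_housing
from sklearn.ensemble import RandomForestRegressor
from treeinterpreter import treeinterpreter as ti  # pip install treeinterpreter

# Substitute data set: the slide's Boston data is no longer shipped with scikit-learn
data = fetch_california_housing()
rf = RandomForestRegressor(n_estimators=100).fit(data.data, data.target)

# Decompose one prediction: prediction = bias (train-set mean) + sum(contributions)
prediction, bias, contributions = ti.predict(rf, data.data[:1])
print("prediction:", prediction[0], " bias:", bias[0])
for name, c in zip(data.feature_names, contributions[0]):
    print(f"  {name:>10}: {c:+.3f}")
```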
  32. Model-specific: TreeInterpreter
    • Pros
      • Simple and intuitive
      • Enables local explanation of tree-based models (i.e. why a particular prediction was made)
      • Enables tree-based model debugging
    • Cons
      • The original Python package is limited to the scikit-learn API (use ELI5 for XGBoost)
      • Heuristic method (the maths behind it is not well founded)
      • Biased towards lower splits in the tree (as trees get deeper, this bias only grows)
  33. Model-agnostic interpretability
  34. Model-agnostic: LIME (local only, any model)
    • Local Interpretable Model-agnostic Explanations: explains single predictions of any classifier or regressor by approximating it locally with an interpretable model
    • LIME is based on two simple ideas: perturbation and a local surrogate model
    • Works on tabular data, image data and text data
  35. Model-agnostic: LIME - Intuition
    • Perturbation: LIME takes a prediction you want to explain and systematically perturbs its inputs. This creates perturbed data in a neighborhood of this data point. The perturbed data become a new labelled training data set (the labels come from the complex model)
  36. Model-agnostic: LIME - Intuition
    • Local surrogate model: LIME then fits an interpretable model (a linear model) to describe the relationship between the (perturbed) inputs and outputs
  37. Model-agnostic: LIME - How it works
    Pick a sample to explain → discretize numerical features (optional) → create perturbed samples → label the perturbed data with the complex model → feature selection → fit a weighted regression → extract the coefficients as feature importances (a code sketch follows)
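A minimal end-to-end sketch of that pipeline with the lime package; the wine data set and random forest are illustrative choices, not from the talk:

```python
from lime.lime_tabular import LimeTabularExplainer
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier

data = load_wine()
model = RandomForestClassifier(n_estimators=100).fit(data.data, data.target)

explainer = LimeTabularExplainer(
    data.data,
    feature_names=data.feature_names,
    class_names=data.target_names,
    discretize_continuous=True,  # the optional discretization step
)

# Perturb around one instance, fit the weighted local surrogate model,
# and read its coefficients back as feature importances
exp = explainer.explain_instance(data.data[0], model.predict_proba, num_features=5)
print(exp.as_list())
```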
  38. Model-agnostic: LIME
    • Pros
      • Quite intuitive (based on surrogate models)
      • Creates selective explanations, which humans prefer
      • Provides explanations for tabular data, text and images
    • Cons
      • The library has evolved away from the paper
      • The synthetic data set generation does not reflect the original data set distribution
      • The explainer can be very unstable
      • Tuning the parameters is not always clear: how big should the neighbourhood be? Changing the numerical discretization method can lead to the opposite explanation
      • Works poorly under severe class imbalance
  39. Model-agnostic: Towards a more rigorous approach
    • Most of the methods that we have explored so far are very approximate
    • TreeInterpreter, for example, is based on an intuition and has no formal proof or justification
    • LIME has some theory behind it but makes a lot of assumptions and approximations, which makes its explanations unstable and unreliable
    • We need to go from heuristics and approximations to a more well-founded theory
  40. Enter SHAP
    • SHAP (SHapley Additive exPlanations) is a unified approach to techniques like TreeInterpreter, LIME and others for explaining the output of any machine learning model
    • SHAP connects coalitional game theory with local explanations so that explanations satisfy three requirements that are desirable for interpretability: local accuracy, missingness and consistency
  41. SHAP - Introducing coalitional game theory
    • Definition: coalitional game theory, also referred to as cooperative game theory, is the type of game where players need to form coalitions (cooperate) in order to maximize the overall gain
    • Let's take the example of an escape game played as a group:
      • The game accepts 1 to 3 players
      • The players have one hour to solve the puzzles in the room and find their way out
      • The group's score is the time left out of the hour
      • The prize at the end is ten rupees per minute spared
  42. SHAP - Introducing coalitional game theory
    • A team of 3 escaped with a score of 18 minutes (time left) and won 180 rupees
    • How do we distribute the cash prize fairly among the players?
  43. Model-agnostic: SHAP - Introducing coalitional game theory. "Members should receive payments proportional to their marginal contribution" (Lloyd Shapley, 1923-2016)
  44. Model-agnostic: SHAP - Shapley value properties
    • Efficiency: the attributions for all players sum up to the value of the grand coalition
    • Symmetry: if two players have the same contribution, they have the same payoff
    • Dummy: no value, no payoff
    • Additivity: for the sum of two games, the attribution is the sum of the attributions from the two games
  45. Model-agnostic: SHAP - The Shapley value theorem
    For the coalitional game described by $\langle N, \nu \rangle$ there exists a unique attribution $\phi$ that satisfies the four fairness axioms seen above; it is the Shapley value:
    $$\phi_i(\nu) = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(|N| - |S| - 1)!}{|N|!} \bigl(\nu(S \cup \{i\}) - \nu(S)\bigr)$$
    where $\phi_i$ is the attribution for player $i$, $\nu$ is the function that represents the 'worth' of a coalition, $i$ is the player of interest, $N$ is the finite set of all players (the grand coalition), and $S$ is a subset of players (a coalition). A brute-force sketch follows.
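A brute-force sketch of this formula applied to the escape-room example. The slides only give the full team's 180-rupee prize, so the sub-coalition worths below are invented for illustration:

```python
import math
from itertools import combinations

players = ["Alice", "Bob", "Carol"]

# Hypothetical worth v(S) of each coalition, in rupees. Only the grand
# coalition's 180 comes from the slides; the rest is invented.
v = {
    frozenset(): 0,
    frozenset({"Alice"}): 40,
    frozenset({"Bob"}): 30,
    frozenset({"Carol"}): 20,
    frozenset({"Alice", "Bob"}): 100,
    frozenset({"Alice", "Carol"}): 90,
    frozenset({"Bob", "Carol"}): 70,
    frozenset({"Alice", "Bob", "Carol"}): 180,
}

def shapley(player):
    """Average marginal contribution of `player` over all coalitions S."""
    n = len(players)
    others = [p for p in players if p != player]
    total = 0.0
    for k in range(n):  # coalition sizes 0 .. n-1
        for S in combinations(others, k):
            S = frozenset(S)
            # |S|! (|N| - |S| - 1)! / |N|!
            weight = (math.factorial(len(S)) * math.factorial(n - len(S) - 1)
                      / math.factorial(n))
            total += weight * (v[S | {player}] - v[S])
    return total

for p in players:
    print(f"{p}: {shapley(p):.2f} rupees")
# By the efficiency axiom, the three attributions sum to v(N) = 180
```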
  46. Model-agnostic: SHAP - Connecting to machine learning
    From coalitional game theory to model interpretability: players → features, the game → the model, gain attribution → feature attribution
  47. Model-agnostic: SHAP - Connecting to machine learning
    From coalitional game theory to model interpretability:
    • Local accuracy: f(x) = g(x'), meaning that the model and the explanation model have the same value for the instance that we want to explain
    • Missingness: x'_i = 0 ⇒ φ_i = 0, meaning that if a value is missing, there is no importance attribution
    • Consistency: if the value added by a feature is bigger in a modified model, the corresponding feature attribution is larger. This gives explanations a certain consistency
    • All three properties can be equated to the four properties previously seen in coalitional game theory; the equivalence can be demonstrated
  48. Model-specific: TreeSHAP (global and local, tree-based models)
    • Current feature attribution methods for tree-based models are inconsistent (XGBoost/RF feature importances, gain split, etc.)
    • TreeInterpreter needs to be modified to get the right values: the values that examine the outcomes for every possible subset and match the properties of the Shapley value
    • TreeSHAP values are optimal but challenging to compute
    • The author of SHAP proposes a novel method that reduces the computational cost from exponential in theory to polynomial
    • The detailed computation is beyond the scope of this talk; the mathematical explanation can be found in the author's paper: Consistent feature attribution for tree ensembles (a usage sketch follows)
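A minimal usage sketch with the shap package's tree explainer; the data set and model are illustrative choices, and the final check mirrors the local-accuracy property from slide 47 (up to numerical tolerance):

```python
import numpy as np
import shap
from sklearn.datasets import fetch_california_housing
from sklearn.ensemble import RandomForestRegressor

X, y = fetch_california_housing(return_X_y=True)
model = RandomForestRegressor(n_estimators=100).fit(X, y)

explainer = shap.TreeExplainer(model)          # polynomial-time, exact for trees
shap_values = explainer.shap_values(X[:100])   # one attribution per feature per row

# Local accuracy: prediction = expected value + sum of that row's SHAP values
i = 0
reconstructed = explainer.expected_value + shap_values[i].sum()
assert np.isclose(reconstructed, model.predict(X[:100])[i])
```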
  49. Model-agnostic: KernelSHAP (global and local, any model)
    • Inspired by LIME, but uses the right parameters to recover Shapley values
    • In the original LIME there are hyperparameters to adjust: the vicinity of the sample to explain, the loss function and the regularization term
    • The original LIME choices for these parameters are made heuristically; in general, they do not satisfy the properties of local accuracy and consistency
    • SHAP's author proposes a new way to set these parameters that is proven to yield the Shapley values
    • The proof can be found in the author's NIPS paper: A Unified Approach to Interpreting Model Predictions (a sketch follows)
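The corresponding model-agnostic sketch, again with the shap package. KernelExplainer only needs a prediction function; the SVM model, wine data set and sample sizes are illustrative assumptions:

```python
import shap
from sklearn.datasets import load_wine
from sklearn.svm import SVC

data = load_wine()
model = SVC(probability=True).fit(data.data, data.target)  # any black box works

# A small background sample keeps the (expensive) estimation tractable
background = shap.sample(data.data, 50)
explainer = shap.KernelExplainer(model.predict_proba, background)

# Estimated Shapley values for one instance (one array per class)
shap_values = explainer.shap_values(data.data[0], nsamples=200)
```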
  50. Model-agnostic: Text and Image Interpretability
    • LIME helps to debug text classifiers and explain which parts of the text drive a prediction (ELI5's TextExplainer, based on the LIME algorithm, debugs black-box text classifiers)
    • SHAP, LIME and DeepLIFT provide methods for decomposing the output prediction of a neural network for image classification (e.g. LIME explaining a Bernese mountain dog label from an Inception v3 model)
  51. Model interpretability: the other families of explainers
    • So far, we have essentially seen explainers that output reasons as "feature contributions"
    • There are many other classes of explainers:
      • Visual (Partial Dependence Plots and Individual Conditional Expectations - ICE); see the sketch after this list
      • Rule explanations (Bayesian Rule Lists, Anchors, Skope Rules, etc.)
      • Case-based reasoning (Influence Functions)
      • Contrastive explanations (DeepLIFT, DeepExplainer, etc.)
    (Figures: using partial dependence to understand the relationship between a variable and a model's predictions, source: Skater package; SHAP DeepExplainer output for ten digit classes on four images, where red pixels increase the model's output and blue pixels decrease it)
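A minimal sketch of the visual family, assuming a recent scikit-learn; PartialDependenceDisplay and its kind="both" option (which overlays ICE curves on the average partial dependence line) are assumptions about your installed version, and the data set and model are illustrative:

```python
import matplotlib.pyplot as plt
from sklearn.datasets import fetch_california_housing
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import PartialDependenceDisplay

X, y = fetch_california_housing(return_X_y=True, as_frame=True)
model = GradientBoostingRegressor().fit(X, y)

# PDP (average effect) with ICE curves (per-instance effects) overlaid
PartialDependenceDisplay.from_estimator(
    model, X, features=["MedInc", "AveRooms"], kind="both"
)
plt.show()
```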
  52. Beyond interpretability
    • Transparency: make the design of algorithms, the data collection procedures and all data treatment available externally
    • Fairness: ensure that algorithmic decisions do not create discriminatory or unjust impacts when comparing across different demographics (e.g. race, sex, etc.)
    • Accountability: enable interested third parties to probe, understand, and review the behavior of the algorithm through disclosure of information that enables monitoring, checking, or criticism
    • Privacy: ensure that sensitive information in the data is protected
    (Image source: https://www.wired.com/2016/10/understanding-artificial-intelligence-decisions/)
  53. Lessons learned after 6 months of interpretability
    • Trade-off: approximate models with straightforward, stable explanations vs. accurate models with approximate explanations
    • Evaluate the quality of interpretability: use simulated data to test explanations and compare packages (methods); see the O'Reilly article "Testing machine learning interpretability techniques"
    • Rethink feature engineering: is the feature intelligible? How long does it take to understand the explanation? Is it actionable? Is there a monotonicity constraint?
    • Model sparsity: how many features are used by the explanation?
    • Computational speed: agnostic techniques are time-consuming
  54. How we are using interpretability techniques: our clients fight financial fraud with the power of machine learning and behavioral profiling while keeping decisions interpretable
  55. We combine best-in-class machine learning models with interpretation technologies to get the best from a collective artificial and human intelligence
  56. Collaboration between human beings and artificial intelligence will secure the world.
  57. DEEP TRUTH REVEALED. Anti-fraud solution for banks. Thank you. https://github.com/BleckwenAI/mli-catalog
  58. Additional references
    • Interpretable Machine Learning: A Guide for Making Black Box Models Explainable, by Christoph Molnar
    • O'Reilly report: An Introduction to Machine Learning Interpretability
    • GAMs: Intelligible Machine Learning Models for HealthCare
    • Interpretable ML Symposium: http://interpretable.ml/
    • Awesome Interpretable Machine Learning (GitHub repository)
    • FAT/ML: Fairness, Accountability, and Transparency in Machine Learning, https://www.fatml.org/
  59. BLECKWEN, 6 rue Dewoitine, Immeuble Rubis, 78140 Vélizy, France. www.bleckwen.ai. Let's stay in touch: Leonardo Noleto, Senior Data Scientist, leonardo.noleto@bleckwen.ai
