Machine Learning Interpretability:
the key to ML adoption in the enterprise
Alberto Danese
About me
Day time:
• Senior Data Scientist @ Cerved since 2016 – Innovation & Data Sources Team
• Background: Manager @ EY; Senior Consultant @ Between; Computer Engineer @ Politecnico di Milano (2007)
Night time:
• Active kaggler since 2016 (albedan)
• Just 8 competitions completed so far
• Kaggle Grandmaster after my 6th competition
• 6 gold medals, 4 solo gold, 1 solo prize in a masters-only competition
Talk overview
• Why ML Interpretability is key
• How (and when) to use ML Interpretability
• Drill-down on methods, techniques and approaches
Why ML Interpretability is key
AI & ML history in a nutshell
• Early days (1940s – 1970s): Turing test; Dartmouth workshop; first neural network and perceptrons
• AI winter(s) (1970s – early 1990s)
• AI success and new developments (from the early 1990s):
• Achievements covered by the media: Deep Blue vs. Kasparov (1996–97), AlphaGo vs. Lee Sedol (2016)
• Tech (HW) advancements: GPUs (2006), TPUs (2016)
• Free and open source languages and frameworks
• Competitions and culture: Netflix Prize (2006), Kaggle (2010), HBR – "Data Scientist: The Sexiest Job of the 21st Century" (2012)
And now, in 2019
Let’s use some simple logic:
• 60 years of AI
• 20 years of advancements and accomplishments
• 1 cultural change
AI & ML solutions are now widely deployed in production, across all industries and companies. Right?
And now, in 2019 (hype-free)
• It’s Still Early Days for Machine Learning Adoption[1]
• Nearly half (49%) of the 11,400+ data specialists who took O’Reilly’s survey in June 2018 indicated they were in the exploration phase of machine learning and had not deployed any machine learning models into production
A two-speed world
Technology-first companies
• Large use of ML in production
• Very frequent use, at scale, often on non-critical topics (movie recommendations, targeted ads)
Well-established companies (banks, insurance, healthcare)
• More POCs than actual deployments
• Crucial but less frequent decisions (accepting a mortgage request, a health diagnosis)
• Why the gap? Trust & communication, i.e. ML Interpretability
• Humans – especially regulators – need explanations, and classic statistical methods (e.g. linear regressions) are easily interpretable
• Until a few years ago, many ML models were complete black boxes
ML interpretability beyond regulation
• Husky vs. wolf classification, as in the paper "Why Should I Trust You?"[2]
1. The authors trained a (purposely) biased classifier: every wolf picture had snow in the background
2. They asked 27 ML students whether they trusted the model, and to highlight potential features (with and without ML interpretability)
• Other areas where interpretability matters: hacking / adversarial attacks
• Is snow a key feature? Yes, for 25 out of 27
• Is snow a key feature? Yes, for 12 out of 27
How (and when) to use ML
Interpretability
Let’s agree on the basics
1. Interpretability: the ability to explain or to present in understandable terms to a human[3]
2. Accuracy vs. interpretability is a tradeoff[4], i.e. you can get:
• Accurate models with approximate explanations
• Approximate models with accurate explanations
3. Global vs. local interpretability[4]:
• Global: explain how the model works in predicting unseen data
• Local: explain the "reason" for a specific prediction (i.e. for a single record)
4. Model-agnostic vs. model-specific interpretability approaches
Four levels of interpretability
0. Know your data
• Your model is only as interpretable as your data
• Understand processes / responsibilities on data
1. Before building a model
• Enforcing specific properties in a model, to allow for a better understanding of its internal behaviour
• Fine-tuning what the model is optimizing
2. After the model has been built (globally)
• Showing the global behaviour of the model, i.e. the relation between the predicted target and one or more variables
3. After the model has been built (locally)
• Giving more information on a specific prediction, e.g. what features impacted that prediction the most
My quick framework
• Name and main concepts of the interpretability model / approach
• When it is applied (before / after building the model)
• Where it applies (local / global)
• Other notes / restrictions (model-agnostic or model-specific, etc.)
• What you can say if you use this method / approach
• Main focus on structured data
Each method on the following slides is tagged with its level (0–3), B = before / A = after building the model, and G = global / L = local.
Drill down on methods,
techniques and approaches
Monotonicity constraints (1/2)
• Main concepts:
• Some features are expected to show a monotonic behaviour (constantly non-descending or non-ascending with respect to the target variable)
• Due to the specifics of the training set, a greedy tree-based model (e.g. GBT) is likely to pick up irregularities, even when the overall trend is non-descending or non-ascending
• With a monotonic constraint enforced, a tree-based model only "accepts" splits that are in line with the constraint
• This can help limit overfitting if the monotonic behaviour is consistent
• Monotonicity constraints are applied before building a model
• They act globally and are specific to tree-based models
• When you enforce a monotonic constraint on feature X with respect to target Y, you can safely say: all other features being equal, there’s no way that increasing X will result in a decreasing prediction of Y
[1 · B · G]
Monotonicity constraints (2/2)
• Let’s take a simple parabola with added Gaussian noise
• It’s a one-feature, one-target problem
Monotonicity constraints (2/2)
• Let’s fit a simple LightGBM model
• Overall a decent fit, but locally it can show a decreasing trend
Monotonicity constraints (2/2)
• Let’s add this to the LightGBM parameters:
• +1 = non-descending
• 0 = no constraint
• -1 = non-ascending
• Much better fit!
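In code, the three values above map onto LightGBM's `monotone_constraints` parameter (one entry per feature, in column order). A minimal sketch of the parameter dict for the one-feature example; the training call is only indicated:

```python
# LightGBM parameters for a one-feature regression problem with a
# non-descending constraint on that single feature.
params = {
    "objective": "regression",
    "monotone_constraints": [1],  # +1 non-descending, 0 none, -1 non-ascending
}
# Passed as usual, e.g.: lgb.train(params, lgb.Dataset(X, y))
```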
Monotonicity constraints (2/2)
• If I set -1 as the monotonic constraint, the model doesn’t even find a single valid split
Monotonicity constraints (2/2)
• The model with the correct constraint not only gives a degree of explainability – it’s also the best-performing model!
Other examples
• Build a model that optimizes a custom metric:
• Custom evaluation function
• Custom objective function[5]
• When you apply custom objective and/or evaluation functions, you can say: my model (in)directly optimizes this specific metric
[1 · B · G]
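As a sketch of what a custom objective looks like: gradient-boosting frameworks such as XGBoost and LightGBM generally expect a function returning the gradient and hessian of the loss with respect to the current predictions (the exact callback signature varies by library – see the XGBoost demo in [5]). The squared-error case in plain NumPy:

```python
import numpy as np

def squared_error_objective(preds, labels):
    """Custom objective for the loss 0.5 * (pred - label)^2.

    Returns the first and second derivatives of the loss with
    respect to the predictions, as boosting libraries expect.
    """
    grad = preds - labels       # dL/dpred
    hess = np.ones_like(preds)  # d2L/dpred2 (constant for squared error)
    return grad, hess

preds = np.array([1.0, 2.0, 3.0])
labels = np.array([1.5, 2.0, 2.0])
grad, hess = squared_error_objective(preds, labels)
```

Swapping in the derivatives of a different loss is all it takes to make the model (in)directly optimize another metric.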
Partial dependence plots (1/2) [2 · A · G]
• Main concepts:
• Once you have highlighted the most important features, it is useful to understand how these features affect the predictions
• Partial dependence plots "average out" the other variables and usually represent the effect of one or two features with respect to the outcome[7]
• PDP analysis is performed after a model has been built; it is a global measure, typically model-agnostic
• With PDPs, you can say: on average, the predictions have this specific behaviour with respect to this one variable (or two of them)
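The "averaging out" idea can be sketched in a few lines of NumPy (the black-box function and data below are made up for illustration; in practice you would use a PDP implementation such as the one in [7]):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "black box" with two features (stand-in for a trained model).
def model(X):
    return X[:, 0] ** 2 + 3 * X[:, 1]

X = rng.normal(size=(500, 2))  # background dataset

def partial_dependence(model, X, feature, grid):
    """Fix one feature at each grid value and average the predictions
    over the empirical distribution of the remaining features."""
    pd_values = []
    for v in grid:
        X_mod = X.copy()
        X_mod[:, feature] = v
        pd_values.append(model(X_mod).mean())
    return np.array(pd_values)

grid = np.linspace(-2, 2, 5)
pdp0 = partial_dependence(model, X, feature=0, grid=grid)
# pdp0 traces v**2 plus a constant: the parabolic effect of feature 0,
# with feature 1 averaged out.
```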
Partial dependence plots (2/2) [2 · A · G]
(Figures: a single-feature partial dependence plot and a 2-feature PDP)
LIME (1/2) [3 · A · L]
• Main concepts:
• LIME stands for Local Interpretable Model-agnostic Explanations
• It assumes that, locally, complex models can be approximated with simpler linear models
• It’s based on 4 phases, starting from the single prediction we want to explain:
• Perturbation: alter your dataset and get the black-box predictions for the new observations
• Weighting: the new samples are weighted by their proximity to the instance of interest
• Fitting: a weighted, interpretable model is fitted on the perturbed dataset
• Explanation: the fitted (simple) local model serves as the explanation
• It works on tabular data, text and images, and it’s completely model-agnostic
• With LIME, you can say: this specific prediction is affected by these features, each with its own relevance
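The four phases can be illustrated on a one-feature toy problem in plain NumPy (the black-box function, kernel width and sample counts are made up; real usage would go through a LIME implementation):

```python
import numpy as np

rng = np.random.default_rng(42)

def black_box(x):
    return x ** 2  # stand-in for a complex, non-interpretable model

x0 = 1.0  # the single prediction we want to explain

# 1. Perturbation: sample new observations around the instance.
x_pert = x0 + rng.normal(scale=0.2, size=500)
y_pert = black_box(x_pert)

# 2. Weighting: Gaussian proximity kernel around the instance of interest.
weights = np.exp(-((x_pert - x0) ** 2) / (2 * 0.2 ** 2))

# 3. Fitting: weighted least squares for an interpretable linear model.
A = np.column_stack([np.ones_like(x_pert), x_pert])
sw = np.sqrt(weights)
coef, *_ = np.linalg.lstsq(A * sw[:, None], y_pert * sw, rcond=None)
intercept, slope = coef

# 4. Explanation: locally, the black box behaves like intercept + slope * x;
# near x0 = 1 the true local slope of x**2 is 2.
```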
LIME (2/2) [3 · A · L]
Built with the R package "lime":
• First, build an explainer with the data and a model (here, a classic XGBoost)
• Then, create explanations, specifying the number N of features you want to include
Other examples
• Shapley additive explanations (SHAP)[9]
• Uses game theory to explain predictions
• For each prediction, you get the contribution of each variable to that specific case
• Natively a local method; can be extended to analyze the global behaviour of a model
• A fast implementation for GBTs is available
• With Shapley values, you can say: in this specific prediction, each feature gives this exact contribution
[3 · A · L]
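For a purely linear model the Shapley values have a closed form, which makes the "exact contribution" claim concrete. A sketch with made-up weights and data:

```python
import numpy as np

# Linear model f(x) = w . x + b: the Shapley value of feature i for
# instance x is w_i * (x_i - E[x_i]), with E[.] over a reference population.
w = np.array([2.0, -1.0, 0.5])
b = 3.0
X_background = np.array([[0.0, 1.0, 2.0],
                         [2.0, 3.0, 0.0],
                         [4.0, 2.0, 4.0]])  # reference population
x = np.array([1.0, 0.0, 2.0])               # instance to explain

def linear_shapley(w, x, X_background):
    """Exact Shapley values for a linear model."""
    return w * (x - X_background.mean(axis=0))

phi = linear_shapley(w, x, X_background)

# Local accuracy: contributions sum to f(x) minus the average prediction.
f = lambda X: X @ w + b
assert np.isclose(phi.sum(), f(x[None, :])[0] - f(X_background).mean())
```

For non-linear models such as GBTs, the per-feature contributions are computed algorithmically (e.g. TreeSHAP in [9]) but satisfy the same local-accuracy property.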
ML Interpretability – Recap & Examples
• Need: enforce some kind of expected behaviour in a ML model
Example – real estate: floor surface vs. house price for two otherwise identical apartments (location, quality, etc.)
Approach: enforce monotonicity before building a ML model [1 · B · G]
• Need: show the effects of different features on a specific target, across a large population
Example – healthcare: the most important variables linked to a form of illness, and what their impact is
Approach: feature importance + PDPs [2 · A · G]
• Need: understand a single prediction and possibly define ad hoc strategies based on individual analysis
Example – customer churn: for each customer in the top N% likely to churn, identify the main reason(s) and give actionable insights to define individual marketing campaigns[10]
Approach: LIME + Shapley [3 · A · L]
Wrap up
ML Interpretability – Conclusions
• Regulation, human nature, and in general the desire for fair, robust and transparent models are all good reasons to dig into ML interpretability
• There are many ways to make a ML model interpretable, with radically distinct approaches, but always consider:
• Data first
• Tricks and expedients to use before building a model
• Ex-post global explanations
• Ex-post local explanations
• This field has grown tremendously in the last 2-3 years and currently allows high-performance models (like GBTs) to reach a more than reasonable level of interpretability
• An interpretable model can be more robust, insightful, actionable… simply better
References
• [1] https://www.datanami.com/2018/08/07/its-still-early-days-for-machine-learning-adoption/
• [2] "Why Should I Trust You?": Explaining the Predictions of Any Classifier – Marco Tulio Ribeiro, Sameer Singh, Carlos Guestrin (2016) – https://arxiv.org/abs/1602.04938
• [3] Towards A Rigorous Science of Interpretable Machine Learning – Finale Doshi-Velez, Been Kim (2017) - https://arxiv.org/abs/1702.08608
• [4] An introduction to Machine Learning Interpretability – Patrick Hall and Navdeep Gill – O’Reilly
• [5] https://github.com/dmlc/xgboost/blob/master/R-package/demo/custom_objective.R
• [6] https://medium.com/the-artificial-impostor/feature-importance-measures-for-tree-models-part-i-47f187c1a2c3
• [7] https://bgreenwell.github.io/pdp/articles/pdp-example-xgboost.html
• [8] https://christophm.github.io/interpretable-ml-book/
• [9] https://github.com/slundberg/shap
• [10] https://medium.com/civis-analytics/demystifying-black-box-models-with-shap-value-analysis-3e20b536fc80
• Icons made by Freepik from www.flaticon.com
Thank you!
 
A presentation that explain the Power BI Licensing
A presentation that explain the Power BI LicensingA presentation that explain the Power BI Licensing
A presentation that explain the Power BI Licensing
 
UofT毕业证如何办理
UofT毕业证如何办理UofT毕业证如何办理
UofT毕业证如何办理
 
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
 
原版一比一弗林德斯大学毕业证(Flinders毕业证书)如何办理
原版一比一弗林德斯大学毕业证(Flinders毕业证书)如何办理原版一比一弗林德斯大学毕业证(Flinders毕业证书)如何办理
原版一比一弗林德斯大学毕业证(Flinders毕业证书)如何办理
 
The Ipsos - AI - Monitor 2024 Report.pdf
The  Ipsos - AI - Monitor 2024 Report.pdfThe  Ipsos - AI - Monitor 2024 Report.pdf
The Ipsos - AI - Monitor 2024 Report.pdf
 
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data LakeViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
 
一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理
 
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
 

Kaggle Days Paris - Alberto Danese - ML Interpretability

  • 1. Machine Learning Interpretability: the key to ML adoption in the enterprise Alberto Danese
  • 2. About me • Senior Data Scientist @ Cerved since 2016 • Innovation & Data Sources Team Background • Manager @ EY • Senior Consultant @ Between • Computer Engineer @ Politecnico di Milano (2007) • Active kaggler since 2016 (albedan) • Just 8 competitions completed so far Day time Night time • Kaggle Grandmaster after my 6th competition • 6 gold medals, 4 solo gold, 1 solo prize in masters only competition
  • 3. Talk overview • Why ML Interpretability is key • How (and when) to use ML Interpretability • Drill-down on methods, techniques and approaches
  • 5. AI & ML history in a nutshell Early days (1940s – 1970s): • Turing test • Dartmouth workshop • First neural networks and perceptrons AI winter(s) (1970s – early 1990s) AI success and new developments (from the early 1990s) • Achievements covered by media • Deep Blue vs Kasparov (1996-97) • AlphaGo vs Lee Sedol (2016) • Tech (HW) advancements • GPUs (2006) • TPUs (2016) • Free and open source languages and frameworks • Competitions and culture • Netflix prize (2006) • Kaggle (2010) • HBR – Data scientist: the sexiest job of the 21st century (2012)
  • 6. And now, in 2019 Let’s use some simple logic: • 60 years of AI • 20 years of advancements and accomplishments • 1 cultural change AI & ML solutions are now widely deployed in production, across all industries and companies Right?
  • 7. And now, in 2019 (hype-free) • It’s Still Early Days for Machine Learning Adoption[1] • Nearly half (49%) of the 11,400+ data specialists who took O’Reilly’s survey in June 2018 indicated they were in the exploration phase of machine learning and have not deployed any machine learning models into production
  • 8. A two-speed world Technology-first companies • Large use of ML in production • Very frequent use, at scale, often on non-critical topics (movie recommendations, targeted ads) Well-established companies (banks, insurance, healthcare) • More POCs than actual deployments • Crucial but not so frequent decisions (accepting a mortgage request, health diagnosis) • Why? Trust & communication, i.e. ML Interpretability • Humans need explanations – especially regulators – and classic statistical methods (e.g. linear regressions) are easily interpretable • Until a few years ago, many ML models were complete black boxes
  • 9. ML interpretability besides regulation • Husky vs. Wolf classification as in the paper “Why should I trust you”[2] 1. The authors trained a biased classifier (on purpose): every wolf picture had snow in the background 2. They asked 27 ML students whether they trusted the model and which features they thought mattered (with and without ML interpretability) • Other areas where interpretability matters: hacking / adversarial attacks Is snow a key feature? Yes, for 25 out of 27 Is snow a key feature? Yes, for 12 out of 27
  • 10. How (and when) to use ML Interpretability
  • 11. Let’s agree on the basics 1. Interpretability: the ability to explain or to present in understandable terms to a human[3] 2. Accuracy vs. Interpretability is a tradeoff[4], i.e. you can get: • Accurate models with approximate explanations • Approximate models with accurate explanations 3. Global vs. Local Interpretability[4]: • Global: explain how the model works in predicting unseen data • Local: explain the "reason" of a specific prediction (i.e. of a single record) 4. Model-agnostic vs. model-specific interpretability methods
  • 12. Four levels of interpretability Know your data Before building a model After the model has been built (globally) After the model has been built (locally) • Enforcing specific properties in a model, to allow for a better understanding of its internal behaviour • Fine tuning what the model is optimizing • Showing global behaviour of the model, i.e. the relation between the predicted target and one or more variables • Giving more information on a specific prediction, e.g. what features impacted that prediction the most • Your model is only as interpretable as your data • Understand processes / responsibilities on data F O C U S 0 1 2 3
  • 13. My quick framework • Name and main concepts of the interpretability model / approach • When is it applied ( before / after building the model) • Where it applies ( local / global) • Other notes / restrictions (model agnostic or specific, etc.) • What can you say if you use this method / approach • Main focus on structured data B A GL
  • 14. Drill down on methods, techniques and approaches
  • 15. Monotonicity constraints (1/2) • Main concepts: • Some features are expected to show a monotonic behaviour (constantly non-descending or non-ascending with respect to the target variable) • Usually, due to the specific training set, a tree-based greedy model (e.g. GBT) is likely to "catch" irregularities, even when the overall trend is non-descending or non-ascending • When a monotonic constraint is enforced, the tree-based model only "accepts" splits that are in line with the constraint • This can help limit overfitting if the monotonic behaviour is consistent • Monotonicity constraints are applied before building a model • They act globally and are specific to tree-based models • When you enforce a monotonic constraint on feature X with respect to target Y, you can safely say: when all other features are equal, there’s no way that increasing X will result in a decreasing prediction of Y 1 B G
  • 16. Monotonicity constraints (2/2) • Let’s take a simple parabola with added gaussian noise • It’s a one feature, one target problem 1 B G
  • 17. Monotonicity constraints (2/2) • Let’s fit a simple LightGBM • Overall a decent fit, but locally it can show a decreasing trend 1 B G
  • 18. Monotonicity constraints (2/2) • Let’s add this to the lgb parameters: • +1 = non-descending • 0 = no constraint • -1 = non-ascending • Much better fit! 1 B G
  • 19. Monotonicity constraints (2/2) • If I set -1 as monotonic constraint, the model doesn’t even find a single valid split 1 B G
  • 20. Monotonicity constraints (2/2) • Actually the model with the correct constraint not only gives a degree of explainability, it’s also the best performing model! 1 B G
  • 21. Other examples • Build a model that optimizes a custom metric • Custom evaluation function • Custom objective function • When you apply custom objective and/or evaluation functions, you can say: my model (in)directly optimizes this specific metric 1 B G
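The custom objective hook mentioned above can be sketched library-free: LightGBM and XGBoost both accept an objective as a function returning the gradient and hessian of the loss with respect to the raw predictions. The asymmetric squared error below (its name and the 2x under-prediction weight are illustrative assumptions, not from the slides) follows that (grad, hess) convention, with the analytic gradient checked numerically:

```python
import numpy as np

def asymmetric_se_objective(y_pred, y_true, under_weight=2.0):
    """Custom objective in the (grad, hess) form that LightGBM/XGBoost
    expect: squared error, weighted more heavily when the model
    under-predicts (y_pred < y_true)."""
    residual = y_pred - y_true
    weight = np.where(residual < 0, under_weight, 1.0)
    grad = 2.0 * weight * residual
    hess = 2.0 * weight
    return grad, hess

# Numerical check of the analytic gradient at a few points
y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([0.5, 2.5, 3.0])
grad, hess = asymmetric_se_objective(y_pred, y_true)

def loss(p):
    r = p - y_true
    w = np.where(r < 0, 2.0, 1.0)
    return w * r ** 2

eps = 1e-6
num_grad = (loss(y_pred + eps) - loss(y_pred - eps)) / (2 * eps)
assert np.allclose(grad, num_grad, atol=1e-4)
```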
  • 22. Partial dependence plots (1/2)2 A G • Main concepts: • Once you have highlighted the most important features, it is useful to understand how these features affect the predictions • Partial dependence plots "average out" the other variables and usually represent the effect of one or two features with respect to the outcome[7] • PDP analysis is performed after a model has been built and is a global measure, typically model-agnostic • With PDP, you can say: on average, the predictions have this specific behaviour with respect to this one variable (or two of them)
  • 23. Partial dependence plots (2/2)2 A G Partial Dependence Plots 2-features PDP
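The "average out" computation behind these plots can be sketched directly (scikit-learn packages the same logic as `sklearn.inspection.partial_dependence`): clamp the feature of interest to each grid value, keep all other columns at their observed values, and average the model's predictions. The toy black box below is illustrative:

```python
import numpy as np

def partial_dependence(model, X, feature, grid):
    """Average model prediction with `feature` clamped to each grid
    value, all other columns kept at their observed values."""
    pd_values = []
    for v in grid:
        X_mod = X.copy()
        X_mod[:, feature] = v
        pd_values.append(model(X_mod).mean())
    return np.array(pd_values)

# Toy "black box": linear in x0, quadratic in x1
black_box = lambda X: 2.0 * X[:, 0] + X[:, 1] ** 2

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 2))
grid = np.linspace(-2, 2, 5)
pd_x0 = partial_dependence(black_box, X, feature=0, grid=grid)

# For this model the PD over x0 is exactly 2*x0 + mean(x1^2):
# a straight line with slope 2, whatever x1 looks like
slopes = np.diff(pd_x0) / np.diff(grid)
assert np.allclose(slopes, 2.0, atol=1e-6)
```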
  • 24. LIME (1/2)3 • Main concepts: • LIME stands for Local Interpretable Model-Agnostic Explanations • It assumes that, locally, complex models can be approximated with simpler linear models • It’s based on 4 phases, starting from the single prediction we want to explain • Perturbations: alter your dataset and get the black box predictions for new observations • Weighting: the new samples are weighted by their proximity to the instance of interest • Fitting: a weighted, interpretable model is fitted on the perturbed dataset • Explanation: of the (simple) local model • It works on tabular data, text and images and it’s completely model-agnostic • With LIME, you can say: this specific prediction is affected by these features, each with its own relevance A L
  • 25. LIME (2/2)3 A L Built with R package "lime" First, build an explainer with data and a model (here, a classic XGBoost) Then, create explanations specifying the number N of features you want to include
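The R snippet on this slide has a direct Python counterpart in the `lime` package's LimeTabularExplainer; the four phases themselves can also be sketched by hand, which makes the mechanics explicit. This is a simplified illustration (Gaussian perturbations, an RBF proximity kernel, weighted least squares), not the package's actual implementation:

```python
import numpy as np

def lime_explain(black_box, instance, n_samples=5000, kernel_width=0.75, seed=0):
    """Bare-bones LIME for one prediction: perturb, weight by
    proximity, fit a weighted linear model, return its coefficients
    as the explanation."""
    rng = np.random.default_rng(seed)
    # 1. Perturbation: sample points around the instance of interest
    Z = instance + rng.normal(scale=0.5, size=(n_samples, instance.size))
    y = black_box(Z)
    # 2. Weighting: closer samples count more (RBF kernel on distance)
    dist = np.linalg.norm(Z - instance, axis=1)
    w = np.exp(-(dist ** 2) / kernel_width ** 2)
    # 3. Fitting: weighted least squares with an intercept column
    A = np.column_stack([np.ones(n_samples), Z])
    Aw = A * np.sqrt(w)[:, None]
    coef, *_ = np.linalg.lstsq(Aw, y * np.sqrt(w), rcond=None)
    # 4. Explanation: local feature effects (intercept dropped)
    return coef[1:]

# Non-linear black box; its local gradient at (1, 2) is (2, 3)
f = lambda Z: Z[:, 0] ** 2 + 3.0 * Z[:, 1]
explanation = lime_explain(f, np.array([1.0, 2.0]))
assert np.allclose(explanation, [2.0, 3.0], atol=0.2)
```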
  • 26. Other examples • Shapley additive explanations • Uses game theory to explain predictions • For each prediction, you get the contribution of each variable to that specific case • Natively a local model, can be extended to analyze globally the behaviour of a model • Fast implementation for GBTs available • With Shapley, you can say: in this specific prediction, each feature gives this exact contribution 3 A L
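The game-theoretic definition can be written out exactly for a handful of features by enumerating permutations; the fast GBT implementation referenced above is the `shap` package's TreeExplainer. This brute-force sketch (exponential in the number of features) estimates a coalition's value by fixing absent features to the background mean, which is exact for linear models:

```python
import itertools
import numpy as np

def shapley_values(model, x, background):
    """Exact Shapley values for one prediction, averaging each
    feature's marginal contribution over all join orders."""
    n = x.size
    baseline = background.mean(axis=0)

    def value(subset):
        # Features in `subset` take x's values, the rest the baseline
        z = baseline.copy()
        z[list(subset)] = x[list(subset)]
        return model(z.reshape(1, -1))[0]

    phi = np.zeros(n)
    perms = list(itertools.permutations(range(n)))
    for perm in perms:
        seen = []
        for i in perm:
            before = value(seen)
            seen.append(i)
            phi[i] += value(seen) - before
    return phi / len(perms)

# For a linear model, Shapley values are exactly w_i * (x_i - mean_i)
w = np.array([2.0, -1.0, 0.5])
linear = lambda Z: Z @ w
rng = np.random.default_rng(0)
background = rng.normal(size=(100, 3))
x = np.array([1.0, 1.0, 1.0])

phi = shapley_values(linear, x, background)
expected = w * (x - background.mean(axis=0))
assert np.allclose(phi, expected)
```

Note the "local accuracy" property the slide alludes to: the contributions sum exactly to the prediction minus the baseline prediction.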
  • 27. ML Interpretability – Recap & Examples Need Example Approach Enforce some kind of expected behaviour in a ML model Real estate: floor surface vs. house price for two identical apartments (location, quality, etc.) Enforce monotonicity before building a ML model Show the effects of different features on a specific target, across a large population Healthcare: most important variables that are linked to a form of illness and what their impact is Feature importance + PDPs Understand a single prediction and possibly define ad hoc strategies based on individual analysis Customer churn: for each customer in the top N% likely to churn, identify the main reason(s) and give actionable insights to define individual marketing campaigns[10] LIME + Shapley A L A G B G1 2 3
  • 29. ML Interpretability – Conclusions • Regulation, human nature and in general the desire for fair, robust and transparent models are all good reasons to dig into ML interpretability • There are many ways to make a ML model interpretable with radically distinct approaches, but always consider: • Data first • Tricks and expedients to use before building a model • Ex-post global explanations • Ex-post local explanations • This field has seen tremendous growth in the last 2-3 years and currently allows high-performance models (like GBTs) to have a more than reasonable level of interpretability • An interpretable model can be more robust, insightful, actionable… simply better
  • 30. References • [1] https://www.datanami.com/2018/08/07/its-still-early-days-for-machine-learning-adoption/ • [2] "Why Should I Trust You?": Explaining the Predictions of Any Classifier – Marco Tulio Ribeiro, Sameer Singh, Carlos Guestrin (2016) – https://arxiv.org/abs/1602.04938 • [3] Towards A Rigorous Science of Interpretable Machine Learning – Finale Doshi-Velez, Been Kim (2017) - https://arxiv.org/abs/1702.08608 • [4] An introduction to Machine Learning Interpretability – Patrick Hall and Navdeep Gill – O’Reilly • [5] https://github.com/dmlc/xgboost/blob/master/R-package/demo/custom_objective.R • [6] https://medium.com/the-artificial-impostor/feature-importance-measures-for-tree-models-part-i-47f187c1a2c3 • [7] https://bgreenwell.github.io/pdp/articles/pdp-example-xgboost.html • [8] https://christophm.github.io/interpretable-ml-book/ • [9] https://github.com/slundberg/shap • [10] https://medium.com/civis-analytics/demystifying-black-box-models-with-shap-value-analysis-3e20b536fc80 • Icons made by Freepik from www.flaticon.com Thank you!