How good is your prediction?
Quantifying uncertainty in Machine Learning predictions
PyData London 2019 (12th-14th July)
Maria Navarro
Outline
Motivating example
Introduction to conformal predictions
Conformal predictions in classification
Conformal predictions in regression
Application
Summary and conclusions
References
Motivating example
How good is your prediction?
PROBLEM: To find out whether a car is a total loss or not
To do it we have:
1. A set of historical observations $(x_1, y_1), \dots, (x_N, y_N)$, where:
• $x_i$ describes the accident by age of the driver, model of the car, etc.
• $y_i$ is a label which identifies whether the car is reparable or not
2. A machine learning algorithm ($h(x) = y$)
Motivating example
How good is your prediction, REALLY?
A new accident, $x_{N+1}$, occurs. We run our model, and we obtain the following results:
1. The car is classified as a total loss
2. The probability of total loss according to our model is 0.85
3. The model is roughly 91% accurate on the training, test and validation sets, so we expect the same behaviour on production data
4. The model has an AUC of 0.88 in training, so again that is what we expect on production data
What do these measurements mean?
Do we have any guarantee about accident $x_{N+1}$?
Are we confident about the prediction?
Introduction to conformal predictions
Why Conformal Predictions (CP)?
1. There are several ad hoc ways to obtain some confidence around your predictions (resampling methods, assuming normality, etc.)
2. Conformal prediction assumes very little about the outcome you are trying to predict. It only assumes exchangeability.
3. It can be used with any machine learning algorithm.
4. It provides error bounds at a confidence level that we can select.
5. Probabilities are well-calibrated.
6. It is easy to implement.
7. The framework has been proven:
V. Vovk, A. Gammerman, G. Shafer, Algorithmic Learning in a Random World, Springer, 2005.
Introduction to conformal predictions
General idea
• Let $Z$ be a probability distribution.
• Let $f: Z \to \mathbb{R}$ be some function.
• We draw 5 samples from the distribution $Z$ and apply $f(z)$:
  ▪ $f(z_i) = \alpha_i$, with $i = 1, \dots, 5$
  ▪ For simplicity, we assume $\alpha_1 \le \alpha_2 \le \alpha_3 \le \alpha_4 \le \alpha_5$
• We estimate the cumulative distribution function (CDF) for the scores:
[Figure: estimated CDF, with the levels 0, 0.2, 0.4, 0.6, 0.8, 1 marked along the ordered scores $\alpha_1, \dots, \alpha_5$]
• We draw a new sample $z \in Z$. We assume exchangeability and compute $f(z) = \alpha$.
• We can estimate its probability: $P(\alpha \le \alpha_4) = 0.6$ and $P(\alpha \le \alpha_2) = 0.2$
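The general idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the distribution (a standard normal), the score function (the absolute value) and the helper name `fraction_below` are not from the talk. The helper follows the slide's convention, in which the level attached to a score is the fraction of observed scores lying strictly below it.

```python
import random

def fraction_below(scores, t):
    """Fraction of the observed scores that are strictly smaller than t."""
    return sum(s < t for s in scores) / len(scores)

random.seed(0)                       # reproducible draws
f = abs                              # assumed score function f(z)

# Five samples from an assumed distribution Z, scored and sorted:
# alpha_1 <= alpha_2 <= ... <= alpha_5
scores = sorted(f(random.gauss(0, 1)) for _ in range(5))

p_alpha4 = fraction_below(scores, scores[3])   # level attached to alpha_4
p_alpha2 = fraction_below(scores, scores[1])   # level attached to alpha_2
print(p_alpha4, p_alpha2)
```

Because the draws are exchangeable, a new score is equally likely to fall anywhere in the ordering of the old ones, which is what lets a handful of sorted scores act as a probability scale.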
Introduction to conformal predictions
Relation to our problem
• Let $z_i = (x_i, y_i)$ with $i = 1, \dots, p$ be a sample of the probability distribution $Z = (X, Y)$, where:
  ▪ $x_i$ are our observables and $y_i$ is the target we want to predict
• We define $f(z_i) = |y_i - h(x_i)|$, where:
  ▪ $h(x_i)$ is a regression model trained on $z_i$ with $i = 6, \dots, p$
• We apply $f(z)$ to the 5 remaining samples:
  ▪ $f(z_i) = \alpha_i$, with $i = 1, \dots, 5$
  ▪ We can compute the exact values: $0.10 \le 0.13 \le 0.28 \le 0.30 \le 0.38$
• We estimate the cumulative distribution function (CDF) for the scores:
[Figure: estimated CDF, with the levels 0, 0.2, 0.4, 0.6, 0.8, 1 marked along the ordered scores 0.10, 0.13, 0.28, 0.30, 0.38]
• We draw a new sample $z \in Z$. We assume exchangeability and compute $f(z) = |y - h(x)| = |y - 2|$.
• We can estimate its probability:
  ▪ $P(|y - 2| \le 0.30) = 0.6$ and $P(|y - 2| \le 0.28) = 0.4$
  ▪ $P(y \in 2 \pm 0.30) = 0.6$ and $P(y \in 2 \pm 0.28) = 0.4$
  ▪ $y \in [1.7, 2.3]$ with probability 0.6
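The arithmetic of this slide can be reproduced directly. The five scores and the prediction $h(x) = 2$ are taken from the slide; the helper name `level` is an assumption, and it uses the same convention as the slide (the fraction of scores strictly below the chosen half-width).

```python
scores = [0.10, 0.13, 0.28, 0.30, 0.38]   # sorted calibration scores from the slide
prediction = 2.0                          # h(x) for the new sample

def level(scores, half_width):
    """Fraction of calibration scores strictly below the half-width."""
    return sum(s < half_width for s in scores) / len(scores)

def interval(prediction, half_width):
    """Symmetric interval h(x) +/- half_width."""
    return prediction - half_width, prediction + half_width

wide = interval(prediction, 0.30)      # attached level: level(scores, 0.30)
narrow = interval(prediction, 0.28)    # attached level: level(scores, 0.28)
print(wide, level(scores, 0.30))
print(narrow, level(scores, 0.28))
```

A wider interval buys a higher level: moving the half-width from 0.28 to 0.30 raises the attached probability from 0.4 to 0.6.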
Introduction to conformal predictions
Inputs for conformal predictions
• A set of training examples $z_i = (x_i, y_i)$ with $i = 1, \dots, p$
  ▪ They must be drawn from an exchangeable distribution (the order of observations is irrelevant).
• A non-conformity function $f: Z \to \mathbb{R}$
  ▪ It measures the "weirdness" of an example $(x_i, y_i)$
  ▪ It should give low scores to similar examples $(x_i, y_i)$ and high scores to different ones $(x_i, \neg y_i)$
  ▪ A common choice is some function of the underlying model, but it can be anything: the probability estimate for the correct class, the distance to neighbours of the same class, the probability from the trees, the absolute error of a regression model, etc.
• Set a significance level $\varepsilon \in (0, 1)$, which gives a confidence level of $1 - \varepsilon$
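Two of the choices listed above might look as follows in Python. Both function names and the example probabilities are hypothetical; using one minus the estimated class probability is one common way to turn a probability estimate into a score that is low for plausible labels, and it is an assumption here rather than the talk's definition.

```python
def nonconformity_classification(class_probabilities, label):
    """One minus the estimated probability of `label`:
    low when the model finds the label plausible, high otherwise."""
    return 1.0 - class_probabilities[label]

def nonconformity_regression(y, prediction):
    """Absolute error of the regression model."""
    return abs(y - prediction)

probs = {"TL": 0.85, "non-TL": 0.15}        # hypothetical model output

score_tl = nonconformity_classification(probs, "TL")          # conforming label
score_non_tl = nonconformity_classification(probs, "non-TL")  # non-conforming label
score_reg = nonconformity_regression(2.3, 2.0)
print(score_tl, score_non_tl, score_reg)
```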
Introduction to conformal predictions
How do conformal predictions work?
• Divide the training set into two disjoint sets: $Z_t$ with $|Z_t| = m$ and $Z_c$ with $|Z_c| = n$, $m + n = p$
• Build the underlying model, $h$, using $Z_t$
• Apply $f(z_i) = \alpha_i$ to the elements of $Z_c$, the set you did not use for training $h$, and estimate their probability distribution: $\alpha_1, \dots, \alpha_n \sim Q$
• When a new example $(x, h(x) = y)$ comes in, we ask: is $y$ a very non-conforming example?
  ▪ We will reject $y$ if $f(x, y) = \alpha_y$ does not belong to $Q$
• We compute the non-conformity degree, which is called the p-value, as follows:
$$p_y = \frac{|\{z_j \in Z_c : \alpha_j \ge \alpha_y\}|}{n + 1}$$
• Finally, the prediction region is:
$$\Gamma^{\varepsilon} = \{y \in Y : p_y > \varepsilon\}$$
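The p-value and the prediction region translate almost literally into code. The sketch below follows the formulas on this slide; the function names, the calibration scores and the two candidate scores are hypothetical values chosen for illustration.

```python
def p_value(calibration_scores, alpha):
    """Number of calibration scores at least as large as alpha, over n + 1."""
    n = len(calibration_scores)
    return sum(a >= alpha for a in calibration_scores) / (n + 1)

def prediction_region(calibration_scores, candidate_scores, epsilon):
    """Keep every candidate y whose p-value exceeds epsilon.
    `candidate_scores` maps each candidate y to its score alpha_y."""
    return {y for y, alpha in candidate_scores.items()
            if p_value(calibration_scores, alpha) > epsilon}

# Hypothetical calibration scores alpha_1..alpha_9 and candidate scores alpha_y
calibration = [0.05, 0.10, 0.20, 0.30, 0.40, 0.55, 0.60, 0.70, 0.90]
candidates = {"A": 0.15, "B": 0.95}

region = prediction_region(calibration, candidates, epsilon=0.1)
print(region)
```

Candidate "B" is weirder than every calibration example, so its p-value is 0 and it is rejected; candidate "A" looks ordinary and stays in the region.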
Introduction to conformal predictions
Conformal prediction output
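The prediction region $\Gamma^{\varepsilon}$ contains the prediction $y$ with probability $1 - \varepsilon$
▪ In classification:
  ▪ $\alpha_y$ is known, but we need to compute $p_y$
  ▪ The result is a set of labels:
    $\Gamma^{\varepsilon} = \{\text{Class}_1, \text{Class}_3, \text{Class}_5\}$ s.t. $P(y \in \Gamma^{\varepsilon}) = 1 - \varepsilon$
    o If $\Gamma^{\varepsilon} = \emptyset$, then it is always erroneous
    o If $\Gamma^{\varepsilon} = \{C\}$ (only one class), then it is always true (if it is the correct class)
    o If $\Gamma^{\varepsilon} = \{\text{Class}_1, \text{Class}_3, \dots, \text{Class}_5\}$ (several classes), then it is always correct
▪ In regression:
  ▪ $p_y$ is known, but we need to compute $\alpha_y$
  ▪ The result is an interval:
    $\Gamma^{\varepsilon} = [a, b]$ where $a, b \in \mathbb{R}$ and s.t. $P(y \in \Gamma^{\varepsilon}) = 1 - \varepsilon$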
Conformal predictions in classification
Algorithm to compute conformal prediction regions in classification problems
Let $Z = (X, Y)$ be the historical data set for our classification problem, where:
  ▪ $|Z| = p$, $X$ is the information about the problem and $Y = \{C_1, \dots, C_s\}$ is the set of labels.
  ▪ $Z$ is exchangeable.
To obtain the prediction region:
1. Divide $Z$ into two disjoint sets:
  ▪ $Z_t$, the proper training set, with $|Z_t| = m$
  ▪ $Z_c$, the calibration set, with $|Z_c| = n$
2. Fit a classifier, $h(X) = Y$, using $Z_t$
3. Define a non-conformity function $f(z)$ to measure the weirdness of your samples
4. Apply $f(z)$ to each element in $Z_c$ to obtain the calibration scores: $\alpha_1, \dots, \alpha_n$
5. Set a significance level $\varepsilon \in (0, 1)$
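Steps 1 to 5 can be sketched end to end with a toy model. Everything in this block is an assumption made for illustration: the synthetic one-dimensional data, the nearest-centroid classifier standing in for $h$, the softmax used to turn distances into probabilities, and the choice of one minus the class probability as the non-conformity function.

```python
import math
import random

def fit_centroids(train):
    """Toy classifier: the mean of x per class (stands in for any model h)."""
    sums, counts = {}, {}
    for x, y in train:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def predict_proba(centroids, x):
    """Softmax over negative distances to each class centroid."""
    weights = {y: math.exp(-abs(x - c)) for y, c in centroids.items()}
    total = sum(weights.values())
    return {y: w / total for y, w in weights.items()}

def nonconformity(centroids, x, y):
    """Step 3: one minus the estimated probability of label y."""
    return 1.0 - predict_proba(centroids, x)[y]

random.seed(1)
data = [(random.gauss(0, 1), "C1") for _ in range(50)] + \
       [(random.gauss(3, 1), "C2") for _ in range(50)]
random.shuffle(data)                       # order must not matter (exchangeability)

train, calib = data[:70], data[70:]        # step 1: Z_t (m = 70) and Z_c (n = 30)
model = fit_centroids(train)               # step 2: fit h on Z_t only
scores = [nonconformity(model, x, y) for x, y in calib]   # step 4: alpha_1..alpha_n
epsilon = 0.1                              # step 5: significance level
print(len(scores), min(scores), max(scores))
```

The calibration examples are never shown to the model, which is what keeps their scores comparable to the score of a genuinely new example.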
Conformal predictions in classification
Algorithm to compute conformal predictions in classification problems
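6. For a new sample $(x, y)$, compute the scoring value for each label in $Y$:
$$\forall C_i \in Y: \quad f(x, y = C_i) = \alpha_{C_i}$$
7. For each label in $Y$, compute the p-value as follows:
$$\forall C_i \in Y: \quad p_{C_i} = \frac{|\{z_j \in Z_c : \alpha_j \ge \alpha_{C_i}\}|}{n + 1}$$
8. Finally, build the prediction region as follows:
$$\Gamma^{\varepsilon} = \{C_i \in Y : p_{C_i} > \varepsilon\}$$
Then, for the new prediction $h(x) = y$, $P(y \in \Gamma^{\varepsilon}) = 1 - \varepsilon$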
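Steps 6 to 8 can be illustrated for a single new sample. The calibration scores and the class probabilities below are hypothetical, and the non-conformity score is assumed to be one minus the model's probability for the candidate label; the p-value follows the formula in step 7.

```python
def p_value(calibration_scores, alpha):
    """Step 7: calibration scores at least as large as alpha, over n + 1."""
    return sum(a >= alpha for a in calibration_scores) / (len(calibration_scores) + 1)

def conformal_set(calibration_scores, class_probabilities, epsilon):
    """Steps 6-8: try every label and keep those with p-value > epsilon."""
    region = set()
    for label, prob in class_probabilities.items():
        alpha = 1.0 - prob                       # step 6: score for y = C_i
        if p_value(calibration_scores, alpha) > epsilon:
            region.add(label)                    # step 8: label is not rejected
    return region

# Hypothetical calibration scores (n = 9) and model output for a new x
calibration = [0.02, 0.05, 0.08, 0.10, 0.15, 0.22, 0.30, 0.45, 0.80]
probs = {"C1": 0.72, "C2": 0.25, "C3": 0.03}

cautious = conformal_set(calibration, probs, epsilon=0.05)
bold = conformal_set(calibration, probs, epsilon=0.2)
print(cautious, bold)
```

Lowering $\varepsilon$ asks for more confidence, and the price is a larger set: at 0.05 two labels survive, at 0.2 only one does.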
Conformal predictions in regression
Algorithm to compute conformal prediction regions in regression problems
Let $Z = (X, Y)$ be the historical data set for our regression problem, where:
  ▪ $|Z| = p$, $X$ is the information about the problem and $Y$ is a continuous target.
  ▪ $Z$ is exchangeable.
To obtain the prediction region:
1. Divide $Z$ into two disjoint sets:
  ▪ $Z_t$, the proper training set, with $|Z_t| = m$
  ▪ $Z_c$, the calibration set, with $|Z_c| = n$
2. Fit a regression model, $h(X) = Y$, using $Z_t$
3. Define a non-conformity function $f(z)$ to measure the weirdness of your samples
4. Apply $f(z)$ to each element in $Z_c$ to obtain the calibration scores: $\alpha_1, \dots, \alpha_n$
5. Set a significance level $\varepsilon \in (0, 1)$
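A sketch of steps 1 to 5 for regression, using only the standard library. The synthetic data, the one-variable least-squares fit standing in for $h$, and the absolute error as non-conformity function are all assumptions made for the example.

```python
import random

def fit_line(points):
    """Ordinary least squares for y = a + b * x (stands in for any regressor h)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in points) / \
        sum((x - mean_x) ** 2 for x, _ in points)
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x

random.seed(2)
xs = [random.uniform(0, 10) for _ in range(100)]
data = [(x, 1.0 + 2.0 * x + random.gauss(0, 0.5)) for x in xs]   # synthetic Z

train, calib = data[:70], data[70:]           # step 1: Z_t (m = 70) and Z_c (n = 30)
h = fit_line(train)                           # step 2: fit h on Z_t only
scores = [abs(y - h(x)) for x, y in calib]    # steps 3-4: alpha_i = |y_i - h(x_i)|
epsilon = 0.1                                 # step 5: significance level
print(len(scores), max(scores))
```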
Conformal predictions in regression
Algorithm to compute conformal predictions in regression problems
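6. Sort the calibration scores $\alpha_1, \dots, \alpha_n$ in descending order
7. Compute the index $s = \lfloor \varepsilon (n + 1) \rfloor$
  ▪ This is the index of the $(1 - \varepsilon)$-percentile of the non-conformity scores, $\alpha_s$
8. Finally, the prediction region for a new sample $x_i$ is:
$$\Gamma^{\varepsilon} = h(x_i) \pm \alpha_s, \quad \text{with } P(y_i \in \Gamma^{\varepsilon}) = 1 - \varepsilon$$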
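Steps 6 to 8 fit in one small function. The function name and the calibration scores are hypothetical, and two details are assumptions: the index is rounded down, and it is treated as 1-based into the descending list.

```python
import math

def conformal_interval(calibration_scores, prediction, epsilon):
    """Fixed-width interval h(x) +/- alpha_s from sorted calibration scores."""
    ordered = sorted(calibration_scores, reverse=True)     # step 6: descending
    s = math.floor(epsilon * (len(ordered) + 1))           # step 7: index (assumed floor)
    if s < 1:
        raise ValueError("too few calibration scores for this epsilon")
    alpha_s = ordered[s - 1]                               # 1-based index s
    return prediction - alpha_s, prediction + alpha_s      # step 8

# Hypothetical calibration scores (n = 9) and a hypothetical prediction h(x) = 10
scores = [0.4, 1.1, 0.2, 0.9, 0.3, 0.7, 0.1, 0.6, 0.5]
low, high = conformal_interval(scores, prediction=10.0, epsilon=0.2)
print(low, high)
```

The guard on `s` reflects a practical limit: with $n$ calibration scores the smallest usable significance level is $1 / (n + 1)$, so very high confidence needs a large calibration set.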
Application
Classification with conformal predictors
• The dataset is imbalanced (Total Loss is the minority class)
• The model is XGBoost
• Model performance:
• A new accident happens and the model says it is a Total Loss, but how confident are we?
• Due to business restrictions we have to minimize the number of false positives for TL
PROBLEM: To find out whether a car is a total loss or not
Application
Classification with conformal predictors
• We take the test set, $Z_{test} = \{(x_i, y_i)\}$ with $i = 1, \dots, M$
• We define a non-conformity function:
$$f(z) = \frac{\text{probability}_{class_i} + \text{calibrated probability}_{class_i}}{2}$$
where:
  ▪ $\text{probability}_{class_i}$ is the probability, according to the model, that $y = class_i$
  ▪ $\text{calibrated probability}_{class_i}$ is the recalibrated probability that $y = class_i$
Application
Classification with conformal predictors
• Let us assume $M = 9$ and apply $f(z)$ to each $z_i \in Z_{test}$
• We order the scores, and use them to compute the p-value per label for the new accident:
  ▪ TL = 0.85, p-value TL = 8/(9+1) = 0.8 > $\varepsilon$ = 0.05
  ▪ Non-TL = 0.15, p-value non-TL = 2/(9+1) = 0.2 > $\varepsilon$ = 0.05
$$\Gamma^{\varepsilon} = \{\text{TL}, \text{non-TL}\} \text{ s.t. } P(y \in \Gamma^{\varepsilon}) = 0.95$$
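The two p-values can be reproduced with made-up calibration scores. The nine scores below are hypothetical; only the values 0.85 and 0.15, the counts 8 and 2, and $\varepsilon = 0.05$ come from the slide. One further assumption: because the score here is an average of probabilities (higher means more typical), the counts 8 and 2 are obtained by counting calibration scores that are less than or equal to the new score.

```python
def p_value_conformity(calibration_scores, score):
    """Calibration scores not larger than `score`, over n + 1.
    Assumes a higher score means a more typical example."""
    return sum(a <= score for a in calibration_scores) / (len(calibration_scores) + 1)

def region(calibration_scores, label_scores, epsilon):
    """Labels whose p-value exceeds epsilon."""
    return {label for label, score in label_scores.items()
            if p_value_conformity(calibration_scores, score) > epsilon}

# Hypothetical scores for the M = 9 test examples
calibration = [0.05, 0.12, 0.30, 0.45, 0.55, 0.62, 0.70, 0.80, 0.92]
new_accident = {"TL": 0.85, "non-TL": 0.15}      # values from the slide

p_tl = p_value_conformity(calibration, new_accident["TL"])          # 8 / 10
p_non_tl = p_value_conformity(calibration, new_accident["non-TL"])  # 2 / 10
print(p_tl, p_non_tl, region(calibration, new_accident, epsilon=0.05))
```

At $\varepsilon = 0.05$ neither label can be rejected, so the region holds both. With these made-up scores, a looser level such as $\varepsilon = 0.25$ would drop non-TL and leave only TL.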
Application
Regression with conformal predictors
• The dataset is not correctly labelled; there were some inconsistencies.
• The model is XGBoost.
• Model performance:
• The model output was the input to another model
PROBLEM: to compute/find out the price of a car
Application
Regression with conformal predictors
• We take the test set, $Z_{test} = \{(x_i, y_i)\}$ with $i = 1, \dots, M$
• We define a non-conformity function:
$$f(z) = |y - h(x)|$$
where:
  ▪ $y$ is the true value, and $h(x)$ the model prediction
• Let us assume $M = 9$ and apply $f(z)$ to each $z_i \in Z_{test}$
• We sort the scores in descending order
• We set $\varepsilon = 0.2$, so the index of the score is $s = \lfloor 0.2 \cdot (9 + 1) \rfloor = 2$, which selects $\alpha_{s=2}$
• The fixed-width conformal interval would be: $h(x) \pm 189.52$
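The index arithmetic can be checked numerically. In the list below only 189.52 comes from the slide; the other eight scores and the predicted price are hypothetical, chosen so that 189.52 is the second largest score. Rounding the index down is an assumption.

```python
import math

# Hypothetical absolute errors for M = 9 test examples (189.52 is from the slide)
scores = [12.60, 251.30, 98.40, 189.52, 33.15, 150.10, 76.20, 120.75, 51.90]
epsilon = 0.2

ordered = sorted(scores, reverse=True)            # descending order
s = math.floor(epsilon * (len(scores) + 1))       # floor(0.2 * 10) = 2
alpha_s = ordered[s - 1]                          # second largest score

predicted_price = 5000.0                          # hypothetical h(x)
interval = (predicted_price - alpha_s, predicted_price + alpha_s)
print(s, alpha_s, interval)
```

The same half-width is added to every prediction, which is why the slide calls it a fixed-width interval.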
Summary and conclusions
Take away
• Good model performance does not mean trustworthy predictions.
• Conformal prediction is a useful tool with different applications.
• It is easy to understand and to implement.
• Defining a non-conformity function is not always easy.
• Confidence around predictions brings some …
References
Some interesting readings
1. V. Vovk, A. Gammerman, G. Shafer, Algorithmic Learning in a Random World, Springer, 2005.
2. H. Linusson, An introduction to conformal prediction, 2017.
3. V. Vovk, Cross-conformal predictors, Annals of Mathematics and Artificial Intelligence, 1-20, 2013.
4. U. Johansson, H. Boström, T. Löfström, H. Linusson, Regression conformal prediction with random forests, Machine Learning, 95, 155-176, 2014.
5. V. Balasubramanian, S-S. Ho, V. Vovk, Conformal prediction for reliable machine learning, Science Direct Journal and Book, 2014.
How good is your prediction? Quantifying uncertainty in Machine Learning predictions
Questions

More Related Content

What's hot

Polycystic Ovarian Syndrome.pptx
Polycystic Ovarian Syndrome.pptxPolycystic Ovarian Syndrome.pptx
Polycystic Ovarian Syndrome.pptx
Rafi Rozan
 
Gradient boosting in practice: a deep dive into xgboost
Gradient boosting in practice: a deep dive into xgboostGradient boosting in practice: a deep dive into xgboost
Gradient boosting in practice: a deep dive into xgboost
Jaroslaw Szymczak
 
MATCHING GRAPH THEORY
MATCHING GRAPH THEORYMATCHING GRAPH THEORY
MATCHING GRAPH THEORY
garishma bhatia
 
Cuckoo Search via Levy Flights
Cuckoo Search via Levy FlightsCuckoo Search via Levy Flights
Cuckoo Search via Levy Flights
Xin-She Yang
 
OVARIAN REJUVENATION - ROLE OF PLATELET RICH PLASMA THERAPY BY DR SHASHWAT JANI
OVARIAN REJUVENATION - ROLE OF PLATELET RICH PLASMA THERAPY BY DR SHASHWAT JANIOVARIAN REJUVENATION - ROLE OF PLATELET RICH PLASMA THERAPY BY DR SHASHWAT JANI
OVARIAN REJUVENATION - ROLE OF PLATELET RICH PLASMA THERAPY BY DR SHASHWAT JANI
DR SHASHWAT JANI
 
Clique Relaxation Models in Networks: Theory, Algorithms, and Applications
Clique Relaxation Models in Networks: Theory, Algorithms, and ApplicationsClique Relaxation Models in Networks: Theory, Algorithms, and Applications
Clique Relaxation Models in Networks: Theory, Algorithms, and Applications
SSA KPI
 
Role of hysteroscopy in infertility Management
Role of hysteroscopy in infertility Management  Role of hysteroscopy in infertility Management
Role of hysteroscopy in infertility Management
Azuka Chinweokwu Ezeike
 
Thromboprophylaxis During Pregnancy, Labour And
Thromboprophylaxis During Pregnancy, Labour AndThromboprophylaxis During Pregnancy, Labour And
Thromboprophylaxis During Pregnancy, Labour And
North Cumbria University Hospitals NHS Trust
 
Fertility enhancing hysteroscopic surgery
Fertility enhancing hysteroscopic surgeryFertility enhancing hysteroscopic surgery
Fertility enhancing hysteroscopic surgery
DrRokeyaBegum
 
Bidirectional graph search techniques for finding shortest path in image base...
Bidirectional graph search techniques for finding shortest path in image base...Bidirectional graph search techniques for finding shortest path in image base...
Bidirectional graph search techniques for finding shortest path in image base...
Navin Kumar
 
PANEL DISCUSSION ON PRACTICAL APPROACH TO ENDOMETRIOSIS With FOCUS ON DINOGEST
PANEL DISCUSSION ON PRACTICAL APPROACH TO ENDOMETRIOSISWith FOCUS ON DINOGESTPANEL DISCUSSION ON PRACTICAL APPROACH TO ENDOMETRIOSISWith FOCUS ON DINOGEST
PANEL DISCUSSION ON PRACTICAL APPROACH TO ENDOMETRIOSIS With FOCUS ON DINOGEST
Lifecare Centre
 
Particle Swarm Optimization
Particle Swarm OptimizationParticle Swarm Optimization
Particle Swarm Optimization
Stelios Petrakis
 
PANEL DISCUSSION MANAGEMENT OF PCOS WOMB to TOMB
PANEL DISCUSSION MANAGEMENT OF PCOS WOMB to TOMB  PANEL DISCUSSION MANAGEMENT OF PCOS WOMB to TOMB
PANEL DISCUSSION MANAGEMENT OF PCOS WOMB to TOMB
DGFPublicAwareness
 
Lecture 29 fuzzy systems
Lecture 29   fuzzy systemsLecture 29   fuzzy systems
Lecture 29 fuzzy systems
university of sargodha
 
Ulipristal acetate in treatment of fibroids
Ulipristal acetate in treatment of fibroidsUlipristal acetate in treatment of fibroids
Ulipristal acetate in treatment of fibroids
Indraneel Jadhav
 
Newton-Raphson Method
Newton-Raphson MethodNewton-Raphson Method
Newton-Raphson Method
Sunith Guraddi
 

What's hot (16)

Polycystic Ovarian Syndrome.pptx
Polycystic Ovarian Syndrome.pptxPolycystic Ovarian Syndrome.pptx
Polycystic Ovarian Syndrome.pptx
 
Gradient boosting in practice: a deep dive into xgboost
Gradient boosting in practice: a deep dive into xgboostGradient boosting in practice: a deep dive into xgboost
Gradient boosting in practice: a deep dive into xgboost
 
MATCHING GRAPH THEORY
MATCHING GRAPH THEORYMATCHING GRAPH THEORY
MATCHING GRAPH THEORY
 
Cuckoo Search via Levy Flights
Cuckoo Search via Levy FlightsCuckoo Search via Levy Flights
Cuckoo Search via Levy Flights
 
OVARIAN REJUVENATION - ROLE OF PLATELET RICH PLASMA THERAPY BY DR SHASHWAT JANI
OVARIAN REJUVENATION - ROLE OF PLATELET RICH PLASMA THERAPY BY DR SHASHWAT JANIOVARIAN REJUVENATION - ROLE OF PLATELET RICH PLASMA THERAPY BY DR SHASHWAT JANI
OVARIAN REJUVENATION - ROLE OF PLATELET RICH PLASMA THERAPY BY DR SHASHWAT JANI
 
Clique Relaxation Models in Networks: Theory, Algorithms, and Applications
Clique Relaxation Models in Networks: Theory, Algorithms, and ApplicationsClique Relaxation Models in Networks: Theory, Algorithms, and Applications
Clique Relaxation Models in Networks: Theory, Algorithms, and Applications
 
Role of hysteroscopy in infertility Management
Role of hysteroscopy in infertility Management  Role of hysteroscopy in infertility Management
Role of hysteroscopy in infertility Management
 
Thromboprophylaxis During Pregnancy, Labour And
Thromboprophylaxis During Pregnancy, Labour AndThromboprophylaxis During Pregnancy, Labour And
Thromboprophylaxis During Pregnancy, Labour And
 
Fertility enhancing hysteroscopic surgery
Fertility enhancing hysteroscopic surgeryFertility enhancing hysteroscopic surgery
Fertility enhancing hysteroscopic surgery
 
Bidirectional graph search techniques for finding shortest path in image base...
Bidirectional graph search techniques for finding shortest path in image base...Bidirectional graph search techniques for finding shortest path in image base...
Bidirectional graph search techniques for finding shortest path in image base...
 
PANEL DISCUSSION ON PRACTICAL APPROACH TO ENDOMETRIOSIS With FOCUS ON DINOGEST
PANEL DISCUSSION ON PRACTICAL APPROACH TO ENDOMETRIOSISWith FOCUS ON DINOGESTPANEL DISCUSSION ON PRACTICAL APPROACH TO ENDOMETRIOSISWith FOCUS ON DINOGEST
PANEL DISCUSSION ON PRACTICAL APPROACH TO ENDOMETRIOSIS With FOCUS ON DINOGEST
 
Particle Swarm Optimization
Particle Swarm OptimizationParticle Swarm Optimization
Particle Swarm Optimization
 
PANEL DISCUSSION MANAGEMENT OF PCOS WOMB to TOMB
PANEL DISCUSSION MANAGEMENT OF PCOS WOMB to TOMB  PANEL DISCUSSION MANAGEMENT OF PCOS WOMB to TOMB
PANEL DISCUSSION MANAGEMENT OF PCOS WOMB to TOMB
 
Lecture 29 fuzzy systems
Lecture 29   fuzzy systemsLecture 29   fuzzy systems
Lecture 29 fuzzy systems
 
Ulipristal acetate in treatment of fibroids
Ulipristal acetate in treatment of fibroidsUlipristal acetate in treatment of fibroids
Ulipristal acetate in treatment of fibroids
 
Newton-Raphson Method
Newton-Raphson MethodNewton-Raphson Method
Newton-Raphson Method
 

Similar to Py data19 final

WEKA:Credibility Evaluating Whats Been Learned
WEKA:Credibility Evaluating Whats Been LearnedWEKA:Credibility Evaluating Whats Been Learned
WEKA:Credibility Evaluating Whats Been Learned
weka Content
 
WEKA: Credibility Evaluating Whats Been Learned
WEKA: Credibility Evaluating Whats Been LearnedWEKA: Credibility Evaluating Whats Been Learned
WEKA: Credibility Evaluating Whats Been Learned
DataminingTools Inc
 
Koh_Liang_ICML2017
Koh_Liang_ICML2017Koh_Liang_ICML2017
Koh_Liang_ICML2017
Masa Kato
 
Understanding Blackbox Prediction via Influence Functions
Understanding Blackbox Prediction via Influence FunctionsUnderstanding Blackbox Prediction via Influence Functions
Understanding Blackbox Prediction via Influence Functions
SEMINARGROOT
 
ngboost.pptx
ngboost.pptxngboost.pptx
ngboost.pptx
MohamedAliHabib3
 
Artificial Intelligence Course: Linear models
Artificial Intelligence Course: Linear models Artificial Intelligence Course: Linear models
Artificial Intelligence Course: Linear models
ananth
 
Lecture 3.1_ Logistic Regression.pptx
Lecture 3.1_ Logistic Regression.pptxLecture 3.1_ Logistic Regression.pptx
Lecture 3.1_ Logistic Regression.pptx
ajondaree
 
Machine learning ( Part 1 )
Machine learning ( Part 1 )Machine learning ( Part 1 )
Machine learning ( Part 1 )
Sunil OS
 
ngboost.pptx
ngboost.pptxngboost.pptx
ngboost.pptx
Hadrian7
 
Estimating Causal Effects from Observations
Estimating Causal Effects from ObservationsEstimating Causal Effects from Observations
Estimating Causal Effects from Observations
Antigoni-Maria Founta
 
MACHINE LEARNING.pptx
MACHINE LEARNING.pptxMACHINE LEARNING.pptx
MACHINE LEARNING.pptx
SOURAVGHOSH623569
 
Domain adaptation: A Theoretical View
Domain adaptation: A Theoretical ViewDomain adaptation: A Theoretical View
Domain adaptation: A Theoretical View
Chia-Ching Lin
 
Supervised Learning.pdf
Supervised Learning.pdfSupervised Learning.pdf
Supervised Learning.pdf
gadissaassefa
 
Machine Learning Application: Credit Scoring
Machine Learning Application: Credit ScoringMachine Learning Application: Credit Scoring
Machine Learning Application: Credit Scoring
eurosigdoc acm
 
Model Selection and Validation
Model Selection and ValidationModel Selection and Validation
Model Selection and Validation
gmorishita
 
MLU_DTE_Lecture_2.pptx
MLU_DTE_Lecture_2.pptxMLU_DTE_Lecture_2.pptx
MLU_DTE_Lecture_2.pptx
RahulChaudhry15
 
Study on Evaluation of Venture Capital Based onInteractive Projection Algorithm
	Study on Evaluation of Venture Capital Based onInteractive Projection Algorithm	Study on Evaluation of Venture Capital Based onInteractive Projection Algorithm
Study on Evaluation of Venture Capital Based onInteractive Projection Algorithm
inventionjournals
 
Machine learning introduction lecture notes
Machine learning introduction lecture notesMachine learning introduction lecture notes
Machine learning introduction lecture notes
UmeshJagga1
 
MLlectureMethod.ppt
MLlectureMethod.pptMLlectureMethod.ppt
MLlectureMethod.ppt
butest
 
MLlectureMethod.ppt
MLlectureMethod.pptMLlectureMethod.ppt
MLlectureMethod.ppt
butest
 

Similar to Py data19 final (20)

WEKA:Credibility Evaluating Whats Been Learned
WEKA:Credibility Evaluating Whats Been LearnedWEKA:Credibility Evaluating Whats Been Learned
WEKA:Credibility Evaluating Whats Been Learned
 
WEKA: Credibility Evaluating Whats Been Learned
WEKA: Credibility Evaluating Whats Been LearnedWEKA: Credibility Evaluating Whats Been Learned
WEKA: Credibility Evaluating Whats Been Learned
 
Koh_Liang_ICML2017
Koh_Liang_ICML2017Koh_Liang_ICML2017
Koh_Liang_ICML2017
 
Understanding Blackbox Prediction via Influence Functions
Understanding Blackbox Prediction via Influence FunctionsUnderstanding Blackbox Prediction via Influence Functions
Understanding Blackbox Prediction via Influence Functions
 
ngboost.pptx
ngboost.pptxngboost.pptx
ngboost.pptx
 
Artificial Intelligence Course: Linear models
Artificial Intelligence Course: Linear models Artificial Intelligence Course: Linear models
Artificial Intelligence Course: Linear models
 
Lecture 3.1_ Logistic Regression.pptx
Lecture 3.1_ Logistic Regression.pptxLecture 3.1_ Logistic Regression.pptx
Lecture 3.1_ Logistic Regression.pptx
 
Machine learning ( Part 1 )
Machine learning ( Part 1 )Machine learning ( Part 1 )
Machine learning ( Part 1 )
 
ngboost.pptx
ngboost.pptxngboost.pptx
ngboost.pptx
 
Estimating Causal Effects from Observations
Estimating Causal Effects from ObservationsEstimating Causal Effects from Observations
Estimating Causal Effects from Observations
 
MACHINE LEARNING.pptx
MACHINE LEARNING.pptxMACHINE LEARNING.pptx
MACHINE LEARNING.pptx
 
Domain adaptation: A Theoretical View
Domain adaptation: A Theoretical ViewDomain adaptation: A Theoretical View
Domain adaptation: A Theoretical View
 
Supervised Learning.pdf
Supervised Learning.pdfSupervised Learning.pdf
Supervised Learning.pdf
 
Machine Learning Application: Credit Scoring
Machine Learning Application: Credit ScoringMachine Learning Application: Credit Scoring
Machine Learning Application: Credit Scoring
 
Model Selection and Validation
Model Selection and ValidationModel Selection and Validation
Model Selection and Validation
 
MLU_DTE_Lecture_2.pptx
MLU_DTE_Lecture_2.pptxMLU_DTE_Lecture_2.pptx
MLU_DTE_Lecture_2.pptx
 
Study on Evaluation of Venture Capital Based onInteractive Projection Algorithm
	Study on Evaluation of Venture Capital Based onInteractive Projection Algorithm	Study on Evaluation of Venture Capital Based onInteractive Projection Algorithm
Study on Evaluation of Venture Capital Based onInteractive Projection Algorithm
 
Machine learning introduction lecture notes
Machine learning introduction lecture notesMachine learning introduction lecture notes
Machine learning introduction lecture notes
 
MLlectureMethod.ppt
MLlectureMethod.pptMLlectureMethod.ppt
MLlectureMethod.ppt
 
MLlectureMethod.ppt
MLlectureMethod.pptMLlectureMethod.ppt
MLlectureMethod.ppt
 

Recently uploaded

Learn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queriesLearn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queries
manishkhaire30
 
Udemy_2024_Global_Learning_Skills_Trends_Report (1).pdf
Udemy_2024_Global_Learning_Skills_Trends_Report (1).pdfUdemy_2024_Global_Learning_Skills_Trends_Report (1).pdf
Udemy_2024_Global_Learning_Skills_Trends_Report (1).pdf
Fernanda Palhano
 
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
bopyb
 
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
g4dpvqap0
 
Analysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performanceAnalysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performance
roli9797
 
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
sameer shah
 
University of New South Wales degree offer diploma Transcript
University of New South Wales degree offer diploma TranscriptUniversity of New South Wales degree offer diploma Transcript
University of New South Wales degree offer diploma Transcript
soxrziqu
 
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
apvysm8
 
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging DataPredictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
Kiwi Creative
 
一比一原版(Harvard毕业证书)哈佛大学毕业证如何办理
一比一原版(Harvard毕业证书)哈佛大学毕业证如何办理一比一原版(Harvard毕业证书)哈佛大学毕业证如何办理
一比一原版(Harvard毕业证书)哈佛大学毕业证如何办理
zsjl4mimo
 
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Aggregage
 
一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理
aqzctr7x
 
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
nuttdpt
 
Everything you wanted to know about LIHTC
Everything you wanted to know about LIHTCEverything you wanted to know about LIHTC
Everything you wanted to know about LIHTC
Roger Valdez
 
Experts live - Improving user adoption with AI
Experts live - Improving user adoption with AIExperts live - Improving user adoption with AI
Experts live - Improving user adoption with AI
jitskeb
 
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
Social Samosa
 
一比一原版(牛布毕业证书)牛津布鲁克斯大学毕业证如何办理
一比一原版(牛布毕业证书)牛津布鲁克斯大学毕业证如何办理一比一原版(牛布毕业证书)牛津布鲁克斯大学毕业证如何办理
一比一原版(牛布毕业证书)牛津布鲁克斯大学毕业证如何办理
74nqk8xf
 
Palo Alto Cortex XDR presentation .......
Palo Alto Cortex XDR presentation .......Palo Alto Cortex XDR presentation .......
Palo Alto Cortex XDR presentation .......
Sachin Paul
 
Challenges of Nation Building-1.pptx with more important
Challenges of Nation Building-1.pptx with more importantChallenges of Nation Building-1.pptx with more important
Challenges of Nation Building-1.pptx with more important
Sm321
 
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
74nqk8xf
 

Recently uploaded (20)

Learn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queriesLearn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queries
 
Udemy_2024_Global_Learning_Skills_Trends_Report (1).pdf
Udemy_2024_Global_Learning_Skills_Trends_Report (1).pdfUdemy_2024_Global_Learning_Skills_Trends_Report (1).pdf
Udemy_2024_Global_Learning_Skills_Trends_Report (1).pdf
 
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
 
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
 
Analysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performanceAnalysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performance
 
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
 
University of New South Wales degree offer diploma Transcript
University of New South Wales degree offer diploma TranscriptUniversity of New South Wales degree offer diploma Transcript
University of New South Wales degree offer diploma Transcript
 
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样

Py data19 final

  • 1. How good is your prediction? Quantifying uncertainty in Machine Learning predictions. PyData London 2019 (12th-14th July). Maria Navarro
  • 2. Outline Motivating example Introduction to conformal predictions Conformal predictions in classification Conformal predictions in regression Application Summary and conclusions References
  • 4. Motivating example: How good is your prediction?
      PROBLEM: To find out whether a car is a total loss or not.
      To do it we have:
      1. A set of historical observations $(x_1, y_1), \dots, (x_N, y_N)$, where:
         • $x_i$ describes the accident by age of the driver, model of the car, etc.
         • $y_i$ is a label which identifies whether the car is repairable or not
      2. A machine learning algorithm, $h(x) = y$
  • 5. Motivating example: How good is your prediction, REALLY?
      A new accident, $x_{N+1}$, occurs. We run our model, and we obtain the following results:
      1. The car is classified as a total loss.
      2. The probability of total loss according to our model is 0.85.
      3. The model is roughly 91% accurate in the training, test and validation sets, so we expect the same behaviour in production data.
      4. The model has an AUC of 0.88 in training, so again that is what we expect in production data.
      What do these measurements mean? Do we have any guarantee about accident $x_{N+1}$? Are we confident about the prediction?
  • 7. Introduction to conformal predictions: Why Conformal Predictions (CP)?
      1. There are several ad hoc ways to obtain some confidence around your predictions (resampling methods, assuming normality, etc.).
      2. Conformal prediction assumes very little about the outcome you are trying to predict. It only assumes exchangeability.
      3. It can be used with any machine learning algorithm.
      4. It provides error bounds at a confidence level that we can select.
      5. Probabilities are well calibrated.
      6. It is easy to implement.
      7. The framework has been proven: V. Vovk, A. Gammerman, G. Shafer, Algorithmic Learning in a Random World, Springer, 2005.
  • 8. Introduction to conformal predictions: General idea
      • Let $Z$ be a probability distribution.
      • Let $f: Z \to \mathbb{R}$ be some function.
      • We draw 5 samples from the distribution $Z$ and apply $f$: $f(z_i) = \alpha_i$, with $i = 1, \dots, 5$. For simplicity, we assume $\alpha_1 \le \alpha_2 \le \alpha_3 \le \alpha_4 \le \alpha_5$.
      • We estimate the cumulative distribution function (CDF) for the scores.
        [Figure: estimated CDF of the scores; vertical axis from 0 to 1 in steps of 0.2, horizontal axis $\alpha_1, \dots, \alpha_5$]
      • We draw a new sample $z \in Z$. We assume exchangeability and compute $f(z) = \alpha$.
      • We can estimate its probability: $P(\alpha \le \alpha_4) = 0.6$ and $P(\alpha \le \alpha_2) = 0.2$
  • 9. Introduction to conformal predictions: Relation to our problem
      • Let $z_i = (x_i, y_i)$ with $i = 1, \dots, p$ be a sample of the probability distribution $Z = (X, Y)$, where $x_i$ are our observables and $y_i$ is the target we want to predict.
      • We define $f(z_i) = |y_i - h(x_i)|$, where $h$ is a regression model trained on $z_i$ with $i = 6, \dots, p$.
      • We apply $f$ to the 5 remaining samples: $f(z_i) = \alpha_i$, with $i = 1, \dots, 5$. We can compute the exact values: $0.10 \le 0.13 \le 0.28 \le 0.30 \le 0.38$.
      • We estimate the cumulative distribution function (CDF) for the scores.
        [Figure: estimated CDF of the scores; vertical axis from 0 to 1 in steps of 0.2, horizontal axis 0.10, 0.13, 0.28, 0.30, 0.38]
      • We draw a new sample $z \in Z$. We assume exchangeability and compute $f(z) = |y - h(x)| = |y - 2|$.
      • We can estimate its probability:
         • $P(|y - 2| \le 0.30) = 0.6$ and $P(|y - 2| \le 0.28) = 0.4$
         • $P(y \in 2 \pm 0.30) = 0.6$ and $P(y \in 2 \pm 0.28) = 0.4$
         • $y \in [1.7, 2.3]$ with probability 0.6
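The arithmetic on this slide can be sketched in a few lines of Python. This is only an illustration: it assumes (the slides do not state it) that the quoted probabilities are the fraction of the five calibration scores strictly smaller than the threshold, which is the reading that reproduces 0.6 and 0.4. The names `fraction_below` and `interval` are hypothetical helpers, not part of any library.

```python
# Calibration scores quoted on the slide: |y_i - h(x_i)| for the 5 held-out samples.
scores = [0.10, 0.13, 0.28, 0.30, 0.38]


def fraction_below(scores, threshold):
    """Fraction of calibration scores strictly smaller than `threshold`.

    Assumption: this convention is chosen because it reproduces the
    probabilities shown on the slide; it is not stated in the talk.
    """
    return sum(s < threshold for s in scores) / len(scores)


def interval(prediction, half_width):
    """Symmetric interval prediction +/- half_width."""
    return (prediction - half_width, prediction + half_width)


p_030 = fraction_below(scores, 0.30)  # three of five scores are below 0.30
p_028 = fraction_below(scores, 0.28)  # two of five scores are below 0.28
lo, hi = interval(2.0, 0.30)          # model prediction h(x) = 2 from the slide

print(p_030, p_028, round(lo, 2), round(hi, 2))
```

Under that assumption the interval around the prediction $h(x) = 2$ with half-width 0.30 is the one the slide associates with probability 0.6.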
  • 10. Introduction to conformal predictions: Inputs for conformal predictions
      • A set of training examples $z_i = (x_i, y_i)$ with $i = 1, \dots, p$. They must be drawn from an exchangeable distribution (the order of observations is irrelevant).
      • A non-conformity function $f: Z \to \mathbb{R}$:
         • It measures the "weirdness" of an example $(x_i, y_i)$.
         • It should give low scores to similar examples $(x_i, y_i)$ and high scores to different ones $(x_i, \neg y_i)$.
         • A common choice is to take some function of the underlying model, but it can be anything: the probability estimate for the correct class, the distance to neighbours with the same class, the probability from the trees, the absolute error of a regression model, etc.
      • A significance level $\varepsilon \in (0, 1)$, which gives a confidence level of $1 - \varepsilon$.
  • 11. Introduction to conformal predictions: How do conformal predictions work?
      • Divide the training set into two disjoint sets: $Z_t$ with $|Z_t| = m$ and $Z_c$ with $|Z_c| = n$, $m + n = p$.
      • Build the underlying model, $h$, using $Z_t$.
      • Apply $f(z_i) = \alpha_i$ to the elements of the set you did not use for training $h$, and estimate its probability distribution: $\alpha_1, \dots, \alpha_n \sim Q$.
      • If a new example $(x, h(x) = y)$ comes in, we ask: is $y$ a very non-conforming example? We will reject $y$ if $f(x, y) = \alpha_y$ does not belong to $Q$.
      • We compute the non-conformity degree, which is called the p-value, as follows: $p_y = \frac{|\{z_j \in Z_c : \alpha_j \ge \alpha_y\}|}{n + 1}$
      • Finally, the prediction region is: $\Gamma^{\varepsilon} = \{y \in Y : p_y > \varepsilon\}$
  • 12. Introduction to conformal predictions: Conformal prediction output
      The prediction region $\Gamma^{\varepsilon}$ contains the prediction $y$ with probability $1 - \varepsilon$.
      • In classification:
         • $\alpha_y$ is known, but we need to compute $p_y$.
         • The result is a set of labels, e.g. $\Gamma^{\varepsilon} = \{Class_1, Class_3, Class_5\}$ s.t. $P(y \in \Gamma^{\varepsilon}) = 1 - \varepsilon$
            o If $\Gamma^{\varepsilon} = \emptyset$, then always erroneous
            o If $\Gamma^{\varepsilon} = \{C\}$ (only one class), then always true (if it is the correct class)
            o If $\Gamma^{\varepsilon} = \{Class_1, Class_3, \dots, Class_5\}$ (several classes), then always correct
      • In regression:
         • $p_y$ is known, but we need to compute $\alpha_y$.
         • The result is an interval: $\Gamma^{\varepsilon} = [a, b]$ where $a, b \in \mathbb{R}$ and s.t. $P(y \in \Gamma^{\varepsilon}) = 1 - \varepsilon$
  • 14. Conformal predictions in classification: Algorithm to compute conformal prediction regions in classification problems
      Let $Z = (X, Y)$ be the historical data set for our classification problem, where:
      • $|Z| = p$, $X$ is the information about the problem and $Y = \{C_1, \dots, C_s\}$ is the set of labels.
      • $Z$ is exchangeable.
      To obtain the prediction region:
      1. Divide $Z$ into two disjoint sets:
         • $Z_t$, the proper training set, with $|Z_t| = m$
         • $Z_c$, the calibration set, with $|Z_c| = n$
      2. Fit a classifier, $h(X) = Y$, using $Z_t$.
      3. Define a non-conformity function $f(z)$ to measure the weirdness of your samples.
      4. Apply $f(z)$ to each element in $Z_c$ to obtain the calibration scores: $\alpha_1, \dots, \alpha_n$.
      5. Set a significance level $\varepsilon \in (0, 1)$.
  • 15. Conformal predictions in classification: Algorithm to compute conformal prediction regions in classification problems (continued)
      6. For a new sample $(x, y)$ compute the scoring value for each label in $Y$: $\forall C_i \in Y$, $f(x, y = C_i) = \alpha_{C_i}$
      7. For each label in $Y$ compute the p-value as follows: $\forall C_i \in Y$, $p_{C_i} = \frac{|\{z_j \in Z_c : \alpha_j \ge \alpha_{C_i}\}|}{n + 1}$
      8. Finally build the prediction region as follows: $\Gamma^{\varepsilon} = \{C_i \in Y : p_{C_i} > \varepsilon\}$; then for the new prediction $h(x) = y$, $P(y \in \Gamma^{\varepsilon}) = 1 - \varepsilon$
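Steps 4 to 8 of the classification algorithm can be sketched in plain Python as follows. Everything concrete here is an assumption made for illustration: `predict_proba` is a hypothetical stand-in for an already fitted classifier $h$, the non-conformity score is taken as one minus the estimated probability of the candidate label (one common choice, not necessarily the one used in the talk), and the calibration data are made up.

```python
import math


def predict_proba(x):
    """Hypothetical stand-in for a fitted classifier h: returns P(label | x)."""
    p_tl = 1.0 / (1.0 + math.exp(-x))
    return {"TL": p_tl, "non-TL": 1.0 - p_tl}


def nonconformity(x, label):
    # Assumed score: 1 - estimated probability of the candidate label,
    # so unlikely labels get high ("weird") scores.
    return 1.0 - predict_proba(x)[label]


def p_value(cal_scores, alpha):
    # Step 7: share of calibration scores at least as large as alpha,
    # with n + 1 in the denominator as on the slide.
    return sum(a >= alpha for a in cal_scores) / (len(cal_scores) + 1)


def prediction_region(x, cal_scores, labels, eps):
    # Steps 6 and 8: score every candidate label, keep those with p-value > eps.
    return {c for c in labels if p_value(cal_scores, nonconformity(x, c)) > eps}


# Made-up calibration set Z_c of (x, true label) pairs, n = 9.
calibration = [
    (-2.0, "non-TL"), (-1.5, "non-TL"), (-0.5, "non-TL"), (0.3, "non-TL"),
    (-0.2, "TL"), (0.8, "TL"), (1.2, "TL"), (2.0, "TL"), (2.5, "TL"),
]
labels = ["TL", "non-TL"]

# Step 4: calibration scores use the true label of each calibration example.
cal_scores = [nonconformity(x, y) for x, y in calibration]

confident = prediction_region(3.0, cal_scores, labels, eps=0.2)
ambiguous = prediction_region(0.0, cal_scores, labels, eps=0.1)
print(confident, ambiguous)
```

In this toy setup a clear-cut example yields a single label, while an example near the decision boundary yields both labels, which is how the region signals low confidence.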
  • 17. Conformal predictions in regression: Algorithm to compute conformal prediction regions in regression problems
      Let $Z = (X, Y)$ be the historical data set for our regression problem, where:
      • $|Z| = p$, $X$ is the information about the problem and $Y$ is a continuous target.
      • $Z$ is exchangeable.
      To obtain the prediction region:
      1. Divide $Z$ into two disjoint sets:
         • $Z_t$, the proper training set, with $|Z_t| = m$
         • $Z_c$, the calibration set, with $|Z_c| = n$
      2. Fit a regression model, $h(X) = Y$, using $Z_t$.
      3. Define a non-conformity function $f(z)$ to measure the weirdness of your samples.
      4. Apply $f(z)$ to each element in $Z_c$ to obtain the calibration scores: $\alpha_1, \dots, \alpha_n$.
      5. Set a significance level $\varepsilon \in (0, 1)$.
  • 18. Conformal predictions in regression: Algorithm to compute conformal prediction regions in regression problems (continued)
      6. Sort the calibration scores $\alpha_1, \dots, \alpha_n$ in descending order.
      7. Compute the index $s = \lfloor \varepsilon (n + 1) \rfloor$. This is the index of the $(1 - \varepsilon)$-percentile of the non-conformity scores, $\alpha_s$.
      8. Finally, the prediction region for a new sample $x_i$ is $\Gamma^{\varepsilon} = h(x_i) \pm \alpha_s$, with $P(y_i \in \Gamma^{\varepsilon}) = 1 - \varepsilon$
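A minimal sketch of steps 4 to 8 of the regression algorithm, under stated assumptions: `h` is a hypothetical already fitted model, the non-conformity score is the absolute error, the index is rounded down (the rounding is an assumption; the transcript does not show it explicitly), and the calibration pairs are invented for the example.

```python
import math


def h(x):
    """Hypothetical fitted regression model (stand-in for the real one)."""
    return 2.0 * x + 1.0


def calibration_scores(cal):
    # Steps 4 and 6: absolute errors on the calibration set, sorted descending.
    return sorted((abs(y - h(x)) for x, y in cal), reverse=True)


def interval_half_width(sorted_scores, eps):
    # Step 7: 1-based index s = floor(eps * (n + 1)) into the descending scores.
    # The tiny tolerance guards against floating-point products such as 28.999...
    n = len(sorted_scores)
    s = math.floor(eps * (n + 1) + 1e-9)
    if s < 1:
        raise ValueError("eps is too small for this calibration set size")
    return sorted_scores[s - 1]


def predict_interval(x, sorted_scores, eps):
    # Step 8: fixed-width interval around the point prediction.
    w = interval_half_width(sorted_scores, eps)
    return (h(x) - w, h(x) + w)


# Made-up calibration set Z_c of (x, y) pairs, n = 9.
cal = [(1, 3.5), (2, 3.8), (3, 7.2), (4, 9.1), (5, 10.0),
       (6, 13.3), (7, 15.8), (8, 16.4), (9, 19.9)]

scores = calibration_scores(cal)
region = predict_interval(10, scores, eps=0.2)  # s = floor(0.2 * 10) = 2
print(region)
```

With nine calibration points, a significance level below 0.1 gives an index of zero, so the sketch raises an error rather than returning an interval; a larger calibration set is needed for higher confidence levels.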
  • 20. Application: Classification with conformal predictors
      PROBLEM: To find out whether a car is a total loss or not.
      • The dataset is imbalanced (Total Loss is the minority class).
      • The model is XGBoost.
      • Model performance: [figure not reproduced in the transcript]
      • A new accident happens and the model says it is a Total Loss, but how confident are we?
      • Due to business restrictions we have to minimize the number of false positives in TL.
  • 21. Application: Classification with conformal predictors
      • We take the test set, $Z_{test} = \{(x_i, y_i)\}$ with $i = 1, \dots, M$.
      • We define a non-conformity function: $f(z) = \frac{\text{probability}_{class_i} + \text{calibrated probability}_{class_i}}{2}$, where:
         • $\text{probability}_{class_i}$ is the probability according to the model that $y = class_i$
         • $\text{calibrated probability}_{class_i}$ is the recalibrated probability that $y = class_i$
  • 22. Application: Classification with conformal predictors
      • Let us assume $M = 9$ and apply $f(z)$ to each $z_i \in Z_{test}$.
      • We order the scores, and use them to compute the p-value per label for the new accident:
         • TL = 0.85, p-value TL = 8/(9+1) = 0.8 > $\varepsilon$ = 0.05
         • Non-TL = 0.15, p-value non-TL = 2/(9+1) = 0.2 > $\varepsilon$ = 0.05
      • $\Gamma^{\varepsilon} = \{TL, non\text{-}TL\}$ s.t. $P(y \in \Gamma^{\varepsilon}) = 0.95$
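The p-value arithmetic on this slide can be checked with a short Python sketch. The counts 8 and 2 and the calibration size 9 are taken from the slide; the individual calibration scores are not given in the transcript, so the sketch starts from the counts. The helper names `p_value_from_count` and `region_at` are hypothetical.

```python
N_CAL = 9  # M = 9 calibration examples, as assumed on the slide

# Number of calibration scores at least as large as the new example's score,
# per candidate label (values read off the slide).
counts = {"TL": 8, "non-TL": 2}


def p_value_from_count(n_at_least, n_cal):
    # p-value = count / (n + 1), following the formula used in the talk.
    return n_at_least / (n_cal + 1)


p_values = {label: p_value_from_count(c, N_CAL) for label, c in counts.items()}


def region_at(eps):
    # Keep every label whose p-value exceeds the significance level.
    return {label for label, p in p_values.items() if p > eps}


print(p_values, region_at(0.05))
```

At $\varepsilon = 0.05$ both labels stay in the region, matching the slide; raising $\varepsilon$ to, say, 0.25 (a lower confidence level) would leave only TL.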
  • 25. Application: Regression with conformal predictors
      PROBLEM: To compute the price of a car.
      • The dataset is not correctly labelled; there were some inconsistencies.
      • The model is XGBoost.
      • Model performance: [figure not reproduced in the transcript]
      • The model output was the input to another model.
  • 28. Application: Regression with conformal predictors
      • We take the test set, $Z_{test} = \{(x_i, y_i)\}$ with $i = 1, \dots, M$.
      • We define a non-conformity function: $f(z) = |y - h(x)|$, where $y$ is the true value and $h(x)$ the model prediction.
      • Let us assume $M = 9$ and apply $f(z)$ to each $z_i \in Z_{test}$.
      • We sort the scores in descending order.
      • We set $\varepsilon = 0.2$; then the index of the score is $s = 0.2 \cdot (9 + 1) = 2$, which selects $\alpha_{s=2}$.
      • The fixed-width conformal interval would be: $h(x) \pm 189.52$
  • 31. Summary and conclusions: Take away
      • Good model performance does not mean trustworthy predictions.
      • Conformal prediction is a useful tool with different applications.
      • It is easy to understand and to implement.
      • Defining a non-conformity function is not always easy.
      • Confidence around predictions brings some
  • 34. References: Some interesting readings
      1. V. Vovk, A. Gammerman, G. Shafer, Algorithmic Learning in a Random World, Springer, 2005.
      2. H. Linusson, An introduction to conformal predictions, 2017.
      3. V. Vovk, Cross-conformal predictors, Annals of Mathematics and Artificial Intelligence, 1-20, 2013.
      4. U. Johansson, H. Boström, T. Löfström, H. Linusson, Regression conformal predictors with random forest, Machine Learning, 95, 155-176, 2014.
      5. V. Balasubramanian, S-S. Ho, V. Vovk, Conformal predictions for reliable machine learning, Science Direct Journal and Book, 2014.
  • 35. How good is your prediction? Quantifying uncertainty in Machine Learning predictions. Questions