How good is your prediction?
A gentle introduction to
Conformal Prediction.
Marco Capuccini
System-Software Development Group
RIKEN Center for Computational Science
marco.capuccini@riken.jp
Marco Capuccini
PhD in Scientific Computing, Uppsala University, Sweden
MSc in Bioinformatics, Uppsala University, Sweden
BSc in Computer Science, La Sapienza, Rome, Italy
Cloud / HPC Computing · Data Engineering · Machine Learning · Bioinformatics
Postdoc at RIKEN (R-CCS), Kobe, Japan
RIKEN
▪ RIKEN is Japan's largest (government-funded) research institution
▪ Established in 1917
▪ Research centers and institutes across Japan
…
Machine Learning (ML)
Machine Learning (ML) is a family of methods to derive knowledge or predictions
from data
Supervised learning:
● Given a set of objects x1, x2, … xn with known labels l1, l2, …
● Goal: train a model M that can be used to predict unknown labels ln+1, ln+2, … for new objects xn+1, xn+2, …
Applications: spam filtering, text recognition, image analysis, finance, genomics, song and movie recommendation …
How do we evaluate M?
Current (best) practices
1. Split x1, x2, … xn into a training set x1, x2, … xk and a test set xk+1, xk+2, … xn (1 < k < n)
2. Train M over the training set and evaluate it over the test set
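The split-and-evaluate procedure above can be sketched as follows; scikit-learn, the bundled breast-cancer dataset, and the 70/30 split ratio are illustrative assumptions, not part of the slides.

```python
# Minimal sketch of the train/test evaluation described above.
# Dataset and model choices are illustrative assumptions.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)

# 1. Split x1 ... xn into a training set and a test set (here k ~ 0.7 n)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# 2. Train M over the training set and evaluate it over the test set
M = RandomForestClassifier(random_state=0).fit(X_train, y_train)
print(f"Test accuracy: {M.score(X_test, y_test):.3f}")
```

Note that this only yields an aggregate accuracy over the test set, which motivates the two questions below.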
What if performance changes with new objects?
How do we assign object-specific confidence to predictions?
Conformal Prediction
Mathematical framework (by Vovk et al.). Main idea:
● For an unseen object, instead of producing a single prediction l', a Conformal Predictor (CP) produces a prediction set { l1', l2', … lK' } according to a user-specified significance level 𝜺
● Vovk et al. prove that ℙ( l ∈ { l1', l2', … lK' } ) ≥ 1 - 𝜺, where l is the true label for the unseen object
Can be applied to any ML predictor. How?
Vovk, Vladimir, Alex Gammerman, and Glenn Shafer. Algorithmic learning in a random world. Springer Science &
Business Media, 2005.
Neural network example
Binary classification
● The output sigmoid layer models the probability of positive class
CP requires the underlying predictor to assign a Non-Conformity Measure (NCM) to examples, i.e. a strangeness measure for examples
● Given a labelled object (x, l)
NCM(x,l) = 1 - NN(x) if l is positive; NN(x) otherwise
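As a minimal sketch of this NCM, assuming a toy logistic function as a stand-in for the trained network NN(x):

```python
# Sketch of the NCM above for a binary classifier whose sigmoid
# output NN(x) models P(positive | x). The toy nn() below is a
# stand-in for a trained network (an assumption for illustration).
import math

def nn(x):
    # toy "network": a fixed logistic function of a 1-D input
    return 1.0 / (1.0 + math.exp(-x))

def ncm(x, label):
    """Non-conformity (strangeness) of the labelled example (x, label)."""
    return 1.0 - nn(x) if label == 1 else nn(x)

# A confidently positive example labelled positive is not strange ...
print(ncm(5.0, 1))   # close to 0
# ... but the same example labelled negative is very strange
print(ncm(5.0, 0))   # close to 1
```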
More NCMs (1)
● Logistic regression: NCM(x,l) = 1 - LR(x) if l is positive; LR(x) otherwise
● Linear Support Vector Machines: NCM(x,l) = -SVM(x) if l is positive; SVM(x) otherwise
● Random forests: NCM(x,l) = fraction of trees predicting the wrong label
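The random-forest NCM, for instance, can be computed directly from the individual trees; the dataset and split below are illustrative assumptions.

```python
# Sketch of the random-forest NCM above: the fraction of trees that
# vote for a label other than the one assigned to the example.
# Dataset and split are illustrative assumptions.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_cal, y_train, y_cal = train_test_split(X, y, random_state=0)

rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

def rf_ncm(x, label):
    """Fraction of the forest's trees predicting a label other than `label`."""
    votes = np.array([tree.predict(x.reshape(1, -1))[0] for tree in rf.estimators_])
    return float(np.mean(votes != label))

print(rf_ncm(X_cal[0], y_cal[0]))
```

For a binary problem the two per-label strangeness values of one object sum to 1, since every tree votes for exactly one of the two labels.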
Implementation
Idea
1. Given a new object, for each candidate label:
● compute a “p-value” using a calibration set
2. Add the label to the prediction set if its p-value > 𝜺. The p-values can also be used as a
measure of confidence.
Details
● Shafer, Glenn, and Vladimir Vovk. "A tutorial on conformal prediction." Journal
of Machine Learning Research 9.Mar (2008): 371-421.
● http://jmlr.csail.mit.edu/papers/volume9/shafer08a/shafer08a.pdf
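Putting the steps above together, a minimal inductive (split) conformal predictor might look as follows, reusing the logistic-regression NCM from the previous slide; the dataset, model, and split sizes are illustrative assumptions, and the p-value uses the standard (n+1)-smoothed formula.

```python
# Minimal sketch of an inductive Conformal Predictor: calibrate NCM
# scores on a held-out calibration set, then compute a per-label
# p-value for each new object. Dataset/model are illustrative assumptions.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_rest, y_train, y_rest = train_test_split(X, y, random_state=0)
X_cal, X_new, y_cal, y_new = train_test_split(X_rest, y_rest, random_state=0)

model = LogisticRegression(max_iter=5000).fit(X_train, y_train)

def ncm(X, labels):
    # logistic-regression NCM: 1 - LR(x) if label is positive, LR(x) otherwise
    p_pos = model.predict_proba(X)[:, 1]
    return np.where(labels == 1, 1.0 - p_pos, p_pos)

cal_scores = ncm(X_cal, y_cal)  # calibration non-conformity scores

def p_value(x, label):
    alpha = ncm(x.reshape(1, -1), np.array([label]))[0]
    # fraction of calibration examples at least as strange as (x, label)
    return (np.sum(cal_scores >= alpha) + 1) / (len(cal_scores) + 1)

def predict_set(x, eps):
    # keep every label whose p-value exceeds the significance level
    return {l for l in (0, 1) if p_value(x, l) > eps}

print(predict_set(X_new[0], 0.05))
```

Empirically, the fraction of new objects whose true label falls inside the prediction set should be at least roughly 1 - 𝜺, matching the validity guarantee stated earlier.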
Example: AI-assisted pathology
~80K prostate biopsies from
~7.5K Swedish men
~80K slides → ~5M training tiles
Train CNN with
Inception-v3
Validation over ~500K tiles
Kaggle:
https://www.kaggle.com/c/prostate-cancer-grade-assessment
AI-assisted Pathology with Conformal Prediction
CNN → Conformal Predictor (CP)
For a user-defined 𝜺
CP(𝜺, x1) = {Benignant}
CP(𝜺, x2) = {Benignant, Cancer}
...
CP(𝜺, xn) = {}
where xk, k = 1...n, is an unseen tile.
By construction, the true label of xk is in the prediction set with probability at least 1 - 𝜺
(Vovk et al. provide proof under exchangeability assumption)
Conformal Predictor Efficiency (AI-assisted Pathology)
[Figure: prediction-set efficiency vs. significance level (𝜺)]
Questions?
marco.capuccini@riken.jp
