Quantitative uncertainty in QSAR predictions - Bayesian predictive inference and the magic of bootstrap

Uncertainty in QSAR Predictions –
Bayesian Inference and the Magic of
Bootstrap
Ullrika Sahlin PhD
Centre for Environmental and
Climate Research (CEC)

QSAR integrated assessment
[Diagram: Inputs 1 to 3 (two QSAR predictions and one experimental value) feed an assessment model, which leads to a decision node.]

Uncertainty in hazard assessment –
does it matter?
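Of 386 compounds:
0. No HA: 386 unclassified (?)
1. QSAR predictions without uncertainty: 281 not toxic*, 105 very toxic
2. Median toxicity: 265 not toxic* (+16 very toxic)
3. Expected toxicity: 262 not toxic* (+3 very toxic)
4. Conservative value of toxicity: 153 not toxic* (+109 very toxic)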
Sahlin et al. 2013. Arguments for Considering Uncertainty in QSAR Predictions
in Hazard and Risk Assessments. ATLA

QSAR integrated hazard assessment
and the AD domain problem
[Figure: Predicted No Effect Concentration of 386 triazoles. Axes: log min{EC50} and molecular weight. Annotations: relative toxicity potential; low confidence in prediction.]

Modes of statistical inference
• Parametric inference
– Explain
– Hypothesis-driven
• Predictive inference
– Predict to support decision making
– Generate hypothesis
• Evidence synthesis
– Consider quality
Geisser. Introduction to predictive inference 1993. Sutton and Abrams 2001. Bayesian
methods in meta-analysis and evidence synthesis. Statistical Methods in Medical Research.

To predict…
• is to make a statement about something we have not yet observed
• is always made with uncertainty
• is made using at least one model

How can I…
• Assess uncertainty in a prediction?
• Take my judgement of confidence in the
model into account?
• Validate the assessment?
Principle for QSAR modelling
Principle to judge confidence in predictions
Principle to assess uncertainty

Uncertainty in a prediction
Predictive error: discrepancy between model and reality
Predictive reliability: our confidence in using a model to predict what we want to predict
[Figures: predictive mean versus hat value; log EC50 versus nC]

Different kinds of errors
[Figure: predicted y versus nC]

Predictive reliability
[Figure: prediction versus distance from model, with distance on a logarithmic axis]

Different measures of predictive
reliability
• Similarity to points in the training data set
• Distance from the centre of training data
• Density of training data around the item to be
predicted
• Sensitivity analysis e.g. standard deviation in
perturbed predictions
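
The first three measures listed above can be sketched for a model with a single descriptor. This is a minimal illustration, not the implementation behind the slides: the function names, the training values and the choice of k are all hypothetical, and the leverage formula is the one for simple linear regression with an intercept.

```python
def leverage(x_train, x_query):
    """Hat value of a query point for simple linear regression (one descriptor)."""
    n = len(x_train)
    mean_x = sum(x_train) / n
    sxx = sum((x - mean_x) ** 2 for x in x_train)
    return 1.0 / n + (x_query - mean_x) ** 2 / sxx


def distance_to_centre(x_train, x_query):
    """Absolute distance from the mean of the training descriptor values."""
    mean_x = sum(x_train) / len(x_train)
    return abs(x_query - mean_x)


def knn_mean_distance(x_train, x_query, k=3):
    """Mean distance to the k nearest training points (small means dense data)."""
    distances = sorted(abs(x - x_query) for x in x_train)
    return sum(distances[:k]) / k


# Hypothetical training descriptor values, e.g. number of carbon atoms (nC).
x_train = [2, 3, 4, 5, 6, 7, 8]

for x_query in (5, 14):
    print(
        x_query,
        leverage(x_train, x_query),
        distance_to_centre(x_train, x_query),
        knn_mean_distance(x_train, x_query),
    )
```

A query at the centre of the training data scores low on all three measures, while a query far outside it scores high on all three. The fourth measure, sensitivity analysis, would require refitting the model on perturbed data and is not shown here.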

Predictive error of a regression

Predictive distribution
p(Y < y |X,θ)

Use likelihood to compare!
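
One way to use likelihood for comparison is to sum the log density that each predictive distribution assigns to the values actually observed. The sketch below assumes Gaussian predictive distributions; the function names and the numbers are hypothetical and serve only to show the mechanics.

```python
import math


def gaussian_log_density(y, mean, sd):
    """Log density of a Gaussian predictive distribution at the observed y."""
    return -0.5 * math.log(2 * math.pi * sd ** 2) - (y - mean) ** 2 / (2 * sd ** 2)


def log_likelihood_score(y_obs, means, sds):
    """Sum of log predictive densities; higher (closer to zero) is better."""
    return sum(gaussian_log_density(y, m, s) for y, m, s in zip(y_obs, means, sds))


# Hypothetical observations and point predictions; the third one is badly off.
y_obs = [0.1, -0.4, 1.2]
means = [0.0, -0.5, 0.2]

# Same point predictions, two different assessments of predictive error.
overconfident = log_likelihood_score(y_obs, means, [0.1, 0.1, 0.1])
honest = log_likelihood_score(y_obs, means, [0.6, 0.6, 0.6])
print(overconfident, honest)
```

The score penalises an assessment that claims small uncertainty and then misses, so two approaches with identical point predictions can still be ranked by how well they describe their own errors.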

Different ways to assess
Assessment of predictive distribution
• Frequentist framework
– Frequentist analytical
– Sampling "external data"
– Re-sampling: jackknifing "without replacement", bootstrapping "with replacement"
• Bayesian framework
– Bayesian analytical
– Bayesian sampling

1. Bayesian modelling

• Model parameters are uncertain
• Uncertainty is described by probability
• Prior information is subjective
• Data enters through Bayesian updating
[Figure: MCMC sampling, parameter 1 versus parameter 2]
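
The idea of Bayesian updating can be illustrated with the simplest analytical case: a Gaussian prior on an unknown mean with known noise variance. This is an assumed toy model chosen because it has a closed form; it is not the MCMC-based model shown on the slides, and all names and numbers are hypothetical.

```python
def update_normal_mean(prior_mean, prior_var, data, noise_var):
    """Posterior of an unknown mean under a Gaussian prior and known noise variance."""
    n = len(data)
    post_var = 1.0 / (1.0 / prior_var + n / noise_var)
    post_mean = post_var * (prior_mean / prior_var + sum(data) / noise_var)
    return post_mean, post_var


def predictive_distribution(post_mean, post_var, noise_var):
    """Gaussian predictive distribution for a new observation (mean, variance)."""
    # Parameter uncertainty and observation noise both contribute to the variance.
    return post_mean, post_var + noise_var


# Hypothetical prior belief and data.
post_mean, post_var = update_normal_mean(
    prior_mean=0.0, prior_var=1.0, data=[1.0, 1.0, 1.0, 1.0], noise_var=1.0
)
pred_mean, pred_var = predictive_distribution(post_mean, post_var, noise_var=1.0)
print(post_mean, post_var, pred_mean, pred_var)
```

The predictive variance is always larger than the noise variance alone, which is how uncertainty about the parameters carries through to the prediction.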

Pros
• Uncertainty is measured by probability
• Links to decision theory
• Motivated under small data
Cons
• Treatment of high-dimensional descriptor space?
• Limitation to specific models?
• Re-modelling of QSARs needed

Validation
Fathead minnow data from the QSARdata R package
Park and Casella (2008) Journal of the American Statistical
Association, Gramacy and Pantaleo (2010) Bayesian Analysis.
[Figure: predicted versus observed values]
Training data: R2_Blasso = 0.79
Test data: R2_Blasso = 0.75

Validation
Empirical coverage
[Figure: hit rate versus confidence, for training data and for test data]
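
Empirical coverage can be computed as the share of observations that fall inside the central predictive interval at a given confidence level. The sketch below assumes Gaussian predictive distributions and uses hypothetical names and data; confidence levels must lie strictly between 0 and 1.

```python
from statistics import NormalDist


def empirical_coverage(y_obs, means, sds, confidence):
    """Hit rate: fraction of observations inside the central predictive interval."""
    # Half-width of the central interval in units of standard deviation.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    hits = sum(1 for y, m, s in zip(y_obs, means, sds) if abs(y - m) <= z * s)
    return hits / len(y_obs)


# Hypothetical observations with standard normal predictive distributions.
y_obs = [0.0, 1.0, 3.0]
means = [0.0, 0.0, 0.0]
sds = [1.0, 1.0, 1.0]

for confidence in (0.5, 0.8, 0.95):
    print(confidence, empirical_coverage(y_obs, means, sds, confidence))
```

Plotting the hit rate against the confidence level gives the coverage curve; an honest assessment of uncertainty stays close to the 1:1 line.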

2. Bootstrap sampling
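
Bootstrap sampling of a predictive distribution can be sketched as follows: resample the training pairs with replacement, refit the model, predict the query, and add a resampled residual to represent observation noise. The one-descriptor linear model, the residual step and all names are assumptions made for illustration; the slides do not specify the exact scheme used.

```python
import random


def fit_line(xs, ys):
    """Least-squares intercept and slope for one descriptor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def bootstrap_predictions(xs, ys, x_query, n_boot=1000, seed=0):
    """Draws from an approximate predictive distribution at x_query."""
    rng = random.Random(seed)
    n = len(xs)
    intercept, slope = fit_line(xs, ys)
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    draws = []
    while len(draws) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        boot_x = [xs[i] for i in idx]
        if len(set(boot_x)) < 2:
            continue  # degenerate resample: a line cannot be fitted
        boot_y = [ys[i] for i in idx]
        b0, b1 = fit_line(boot_x, boot_y)
        # Model uncertainty from the refit plus noise from a resampled residual.
        draws.append(b0 + b1 * x_query + rng.choice(residuals))
    return draws


# Hypothetical training data.
xs = [1, 2, 3, 4, 5, 6]
ys = [1.1, 1.9, 3.2, 3.8, 5.3, 5.9]
draws = sorted(bootstrap_predictions(xs, ys, x_query=10, n_boot=200, seed=1))
lower, upper = draws[4], draws[194]  # roughly a central 95 % interval
print(lower, upper)
```

The approach works with any model that can be refitted, which is its main appeal for existing QSARs; the cost is one refit per bootstrap sample.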

3. Assessment considering judgment in
predictive reliability
Inspired by Denham 1997 and Clark 2009
Type of distribution: Gaussian
Mean: point prediction $\hat{y}_q$
Variance: local Predictive Error Sum of Squares (PRESS) divided by denominator
Observed prediction errors: $y_j - \hat{y}_j$
Measure of predictive reliability: $w_{q,j}$
Sampling from distribution of modified residuals

Equal weights:
$$\mathrm{PRESS} = \sum_{j=1}^{n} (y_j - \hat{y}_j)^2$$
Weighted by predictive reliability:
$$\mathrm{W.PRESS}_q = \frac{\sum_{j=1}^{n} w_{q,j}\,(y_j - \hat{y}_j)^2}{\sum_{j=1}^{n} w_{q,j}}$$
k nearest neighbours:
$$\mathrm{kNN.PRESS}_q = \sum_{j \in \mathrm{kNN}(w_{q,j})} (y_j - \hat{y}_j)^2$$
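
The local variance estimates can be sketched in a few lines. The weighted form follows the W.PRESS expression directly. For the kNN form the slides only say "divided by denominator", so dividing by k is an assumption here, as are the function names, the example errors and the convention that a larger weight means a more similar training compound.

```python
def weighted_press(errors, weights):
    """W.PRESS_q: reliability-weighted mean of squared prediction errors."""
    return sum(w * e ** 2 for w, e in zip(weights, errors)) / sum(weights)


def knn_press(errors, weights, k):
    """kNN.PRESS_q over the k most similar compounds, divided by k (assumed)."""
    nearest = sorted(range(len(errors)), key=lambda j: weights[j], reverse=True)[:k]
    return sum(errors[j] ** 2 for j in nearest) / k


# Hypothetical observed prediction errors y_j - yhat_j for four training compounds.
errors = [0.1, -0.2, 1.0, -1.5]
# Hypothetical weights w_qj of each training compound relative to the query q.
weights = [0.9, 0.8, 0.1, 0.05]

equal_variance = weighted_press(errors, [1, 1, 1, 1])
local_variance = weighted_press(errors, weights)
knn_variance = knn_press(errors, weights, k=2)
print(equal_variance, local_variance, knn_variance)
```

Each value serves as the variance of a Gaussian predictive distribution centred on the point prediction. In this example the query resembles the well-predicted compounds, so both local estimates are smaller than the equal-weight one.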

Validate the assessment
Evaluation on external data
[Figure: log likelihood score of the assessment of predictive error, axis from -100 to 0]
[Figure: empirical coverage (external data), hit rate versus confidence level, with the 1:1 line for reference]
Methods compared: equal, W euclidean, W leverage, W ADdens, kNN euclidean, kNN leverage, kNN ADdens

So – which approach is the best?
[Figure: predicted versus observed values]
Training data: R2_pls = 0.77, R2_boot = 0.83, R2_Blasso = 0.79
Test data: R2_pls = 0.77, R2_boot = 0.78, R2_Blasso = 0.75

[Figure: empirical coverage, hit rate versus confidence, with the 1:1 line for reference]
Training data: Blasso, Bootstrap, kNN leverage, equal
Test data: Blasso, Bootstrap, W euclidean, equal

Evaluation on training data
[Figure: log likelihood score of the assessment of predictive error, axis from -200 to 0, for Blasso, Bootstrap, kNN leverage and equal]

Take home messages
• A prediction is complete only when it is given with uncertainty specified by probability
• An assessment of uncertainty needs to be both theoretically motivated and shown to be honest in an empirical evaluation of performance measures
• Three useful approaches are to assess uncertainty through modelling (Bayesian), sampling (e.g. bootstrapping), or post-modelling of predictive error
• Use appropriate measures to validate the assessment of uncertainty

Thank you for your attention
Drive safely in the statistical jungle!
