Maarten van Smeden, PhD
2 November 2020
Why the EPV≥10 sample size rule is rubbish
and what to use instead
M.vanSmeden@umcutrecht.nl | Twitter: @MaartenvSmeden
• Statistician at Julius Center for Health Sciences and Primary Care
• Main interests (but not limited to):
• prognostic and diagnostic modeling
• measurement error
• missing data
Today’s topic:
The EPV≥10 (events per variable) sample size rule (aka the 1 in 10 rule) has been one of the leading
sample size rules in prognostic/diagnostic prediction modeling
Outline
• The EPV≥10 rule-of-thumb: where does it come from?
• Evidence the EPV≥10 rule has no rationale
• Evidence that sample size is important (even if you use the fancier methods)
• Actual sample size calculations for prediction models
Ever wondered if AD/BC gives the “best” estimate of the odds ratio?
What if I told you that AD/BC is biased?
Let’s say we have fitted a logistic regression model to a dataset, and obtain
$$\ln\left(\frac{p_i}{1 - p_i}\right) = \hat{\alpha} + \hat{\beta}_1 X_{1i} + \hat{\beta}_2 X_{2i} + \cdots + \hat{\beta}_k X_{ki}$$
I’m very sorry, but $\hat{\beta}_1$ is a biased estimator, and $\hat{\beta}_2$ too, ….
…. actually they are all finite sample biased
Epidemiology textbooks:
• Confounding bias
• Information bias
• Selection bias
… nothing about finite sample bias
Important: bias vs consistency
• Consistency ≈ as sample size increases, the estimate converges to the truth
• Unbiasedness ≈ over repeated samples of the same size, the average estimate equals the truth
Log(odds) is consistent but finite sample biased
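This can be checked without simulation. A minimal sketch in base R, assuming a single binomial sample with a hypothetical true risk of 0.2 and conditioning on samples where the estimated log odds is finite (0 < x < n). It enumerates every possible outcome and computes the exact expected value of the estimated log odds. The function name `expected_log_odds` and the values of p and n are illustrative choices, not taken from the slides.

```r
# Exact expectation of the sample log odds log(x / (n - x)) for x ~ Binomial(n, p),
# conditional on 0 < x < n (otherwise the estimate is infinite).
expected_log_odds <- function(n, p) {
  x <- 1:(n - 1)                      # outcomes with a finite log odds
  w <- dbinom(x, size = n, prob = p)  # their probabilities
  sum(w * log(x / (n - x))) / sum(w)  # conditional expectation
}

p_true   <- 0.2                       # hypothetical true risk (assumption)
true_val <- qlogis(p_true)            # true log odds
n_values <- c(20, 200, 2000)

# Bias = expected estimate minus true value, for each sample size
bias <- vapply(n_values, function(n) expected_log_odds(n, p_true), numeric(1)) - true_val
print(data.frame(n = n_values, bias = bias))
```

The bias is nonzero at every n (finite sample bias) but shrinks towards zero as n grows (consistency).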
Illustration by simulation
• Simulate 4 normal covariates with equal multivariable log-odds-ratios of 2
• 1,000 simulation samples of N = 50
• Consistency: create 1,000 meta-datasets of increasing size: meta-dataset
r pools all created datasets up to r, and the model is fitted once to the pooled data;
• Bias: calculate the difference between the estimated exposure effect and the true value for
each of the created datasets up to r, and average these differences;
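The set-up above can be sketched in base R. This is a rough illustration, not the code behind the slides, and it makes several assumptions: only 200 datasets instead of 1,000 (to keep it fast), coefficients of log(2) per covariate (reading the effect size as an odds ratio of 2, which also keeps separation rare), independent standard normal covariates, and a zero intercept. All object names are hypothetical.

```r
set.seed(2020)                      # arbitrary seed, for reproducibility only
n_sim     <- 200                    # ASSUMPTION: fewer datasets than on the slide
n         <- 50                     # sample size per simulated dataset
beta_true <- log(2)                 # ASSUMPTION: odds ratio of 2 per covariate

simulate_dataset <- function(n, beta) {
  x <- matrix(rnorm(n * 4), ncol = 4, dimnames = list(NULL, paste0("x", 1:4)))
  y <- rbinom(n, size = 1, prob = plogis(drop(x %*% rep(beta, 4))))
  data.frame(y = y, x)
}

# Maximum likelihood estimate of the x1 coefficient
fit_x1 <- function(d) {
  fit <- suppressWarnings(glm(y ~ x1 + x2 + x3 + x4, family = binomial, data = d))
  unname(coef(fit)["x1"])
}

datasets <- lapply(seq_len(n_sim), function(i) simulate_dataset(n, beta_true))

# Bias: running average of the estimates from the separate small studies
single_est   <- vapply(datasets, fit_x1, numeric(1))
running_mean <- cumsum(single_est) / seq_len(n_sim)

# Consistency: one fit on the pooled meta-dataset of the first r datasets
pooled_est <- vapply(seq_len(n_sim), function(r) {
  fit_x1(do.call(rbind, datasets[seq_len(r)]))
}, numeric(1))

print(c(mean_of_small_studies = running_mean[n_sim],
        one_pooled_study      = pooled_est[n_sim],
        truth                 = beta_true))
```

Both quantities use exactly the same data: the running mean averages many small-sample estimates, while the pooled fit is a single large study.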
Average of 400 studies
with N = 50
1 study with N = 20,000
With decreasing sample size
How we usually think
With decreasing sample size
But actually with odds ratios
(and other ratios)
The origin of the 1 in 10 rule
“For EPV values of 10 or greater, no major problems occurred. For EPV
values less than 10, however, the regression coefficients were biased in
both positive and negative directions”
Source: Peduzzi et al. 1996
?
More simulation studies
Citations based on Google Scholar, Oct 30 2020
citations: 5,736
“a minimum of 10 EPV […] may be too conservative”
“substantial problems even if the number of EPV exceeds 10”
“For EPV values of 10 or greater, no major problems”
citations: 2,438
citations: 216
!?!
• Examine the reasons for substantial differences between the earlier EPV
simulation studies (simulation technicality: handling of “separation”)
• Evaluate a possible solution to reduce the finite sample bias
• Firth’s “correction” aims to reduce finite sample bias in maximum
likelihood estimates, applicable to logistic regression
• It makes clever use of the “Jeffreys prior” (from Bayesian literature) to
penalize the log-likelihood, which shrinks the estimated coefficients
• It has a nice theoretical justification, but does it work well?
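To make the penalty concrete: a minimal base R sketch of the idea, assuming the usual form of the penalized log-likelihood for logistic regression (log-likelihood plus half the log determinant of the Fisher information). It is not the logistf implementation; `firth_negloglik` and the toy data are hypothetical, and a general-purpose optimizer stands in for a dedicated algorithm.

```r
# Negative Firth-type penalized log-likelihood for logistic regression:
# log-likelihood + 0.5 * log det(X' W X), with W = diag(p * (1 - p)).
firth_negloglik <- function(beta, X, y) {
  eta     <- drop(X %*% beta)
  p       <- plogis(eta)
  loglik  <- sum(y * eta - log1p(exp(eta)))
  info    <- crossprod(X, X * (p * (1 - p)))   # Fisher information X' W X
  penalty <- 0.5 * as.numeric(determinant(info, logarithm = TRUE)$modulus)
  -(loglik + penalty)
}

# Toy data with complete separation: the maximum likelihood slope is infinite
x <- c(-3, -2, -1, 1, 2, 3)
y <- c(0, 0, 0, 1, 1, 1)
X <- cbind(intercept = 1, x = x)

# Minimize the negative penalized log-likelihood, starting from zero
fit <- optim(c(0, 0), firth_negloglik, X = X, y = y, method = "BFGS")
firth_coef <- setNames(fit$par, colnames(X))
print(firth_coef)
```

The penalty term goes to minus infinity as the fitted probabilities approach 0 or 1, which is what keeps the estimates finite even when the data are separated.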
Standard vs Firth’s correction
Averaged over 465 simulation conditions with 10,000 replications each
Firth’s correction
Difficult? No
Example R code:
> library(logistf)
> logistf(Y ~ X1 + X2 + X3 + X4, data = df, firth = TRUE)
Compared to default (maximum likelihood) logistic regression, Firth’s correction generally gives:
• Narrower confidence intervals
• Lower MSE
• Better predictions*
*requires adjustment of the intercept using the flic=TRUE option in logistf
Sample size issue solved?
… not quite!
• Precision of regression coefficients
• Variable selection and functional form
• Ensure predictions are adequate
• Why would a one-size-fits-all rule of thumb be appropriate?
• Think of sample size for a randomized clinical trial:
it would be odd to suggest that all trials should have 100 patients in each arm
TRIPOD Item 8. Explain how the study size was arrived at
Moons et al. Ann Intern Med 2015 (TRIPOD Explanation & Elaboration)
“Although there is a consensus on the importance
of having an adequate sample size for model
development, how to determine what counts as
‘adequate’ is not clear …”
Why is sample size important?
• We want to have a large enough sample size to develop a model that
provides accurate risk predictions in new individuals from the target
population
• Many (most?) models do not perform well when checked in new data
• small sample sizes
• overfitting
• lack of (internal) validation
• …
Recent example
• Reviewed 232 prediction models
• “All models were rated at high or
unclear risk of bias”
• Sample size: median 338; IQR 134 to 707
• Number of events: median 69; IQR 37 to 160
Living review, doi: 10.1136/bmj.m1328 (these numbers from a soon to appear review update)
Recent example
• External validation of 22 COVID-19 related
prognostic models
• Performance: poor to very poor
• “Admission oxygen saturation on room air and patient age are strong
predictors of deterioration and mortality among hospitalised adults with
COVID-19, respectively. None of the prognostic models evaluated here
offered incremental value for patient stratification to these univariable
predictors.”
Small sample size and overfitting
• Spurious predictor-outcome associations
• Important predictors can be missed
• Unimportant predictors can be selected
• Regression coefficients too large and uncertain
• Model doesn’t predict well in new data
• Disappointing discrimination
• Often calibration slope < 1
https://twitter.com/LesGuessing/status/997146590442799105
With small N: calibration slope often < 1
Predictions too extreme
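How the calibration slope can be estimated: regress the observed outcome in new data on the linear predictor of the developed model; a slope below 1 means the predictions are too extreme. A rough base R sketch under stated assumptions: four independent standard normal predictors with hypothetical true coefficients of 0.5, a large simulated sample standing in for "new data", and 200 repeated development samples per sample size. All names are illustrative.

```r
set.seed(42)                               # arbitrary seed
beta <- rep(0.5, 4)                        # ASSUMPTION: hypothetical true coefficients

make_data <- function(n) {
  x <- matrix(rnorm(n * 4), ncol = 4, dimnames = list(NULL, paste0("x", 1:4)))
  data.frame(y = rbinom(n, 1, plogis(drop(x %*% beta))), x)
}

validation <- make_data(5000)              # large sample standing in for new data

# Develop a model on n_dev individuals, then estimate its calibration slope
calibration_slope <- function(n_dev) {
  dev <- make_data(n_dev)
  fit <- suppressWarnings(glm(y ~ ., family = binomial, data = dev))
  lp  <- predict(fit, newdata = validation, type = "link")  # linear predictor
  cal <- suppressWarnings(glm(validation$y ~ lp, family = binomial))
  unname(coef(cal)["lp"])
}

slopes_small <- replicate(200, calibration_slope(50))
slopes_large <- replicate(200, calibration_slope(1000))
print(c(median_slope_n50   = median(slopes_small),
        median_slope_n1000 = median(slopes_large)))
```

Looking at the spread of `slopes_small`, not only the median, shows how much the degree of overfitting varies from one small development sample to the next.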
“Modern” methods aim to circumvent overfitting
• Penalised regression: e.g. lasso, ridge regression, elastic net
• Standard regression followed by uniform (global) shrinkage
• Target calibrated predicted risks in new data: shrinkage and penalty
terms estimated using bootstrapping or cross-validation
• Sample size problem solved?
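As an illustration of uniform (global) shrinkage: a minimal base R sketch, assuming the heuristic shrinkage factor (model chi-square minus degrees of freedom, divided by model chi-square) rather than a bootstrap or cross-validated estimate. The simulated data and all object names are hypothetical.

```r
set.seed(7)                                   # arbitrary seed
n <- 100
x <- matrix(rnorm(n * 4), ncol = 4, dimnames = list(NULL, paste0("x", 1:4)))
d <- data.frame(y = rbinom(n, 1, plogis(drop(x %*% rep(0.5, 4)))), x)

fit <- glm(y ~ ., family = binomial, data = d)

# Heuristic shrinkage factor: (model chi-square - df) / model chi-square.
# It is always below 1 and can turn negative when the chi-square is below df.
lr_chisq <- fit$null.deviance - fit$deviance  # likelihood ratio statistic
df_model <- length(coef(fit)) - 1             # predictor parameters, excluding intercept
s_factor <- (lr_chisq - df_model) / lr_chisq

# Shrink all slopes by the same factor, then re-estimate the intercept
# with the shrunken linear predictor held fixed as an offset
slopes_shrunk <- s_factor * coef(fit)[-1]
lp_shrunk     <- drop(as.matrix(d[, names(slopes_shrunk)]) %*% slopes_shrunk)
intercept_new <- unname(coef(glm(d$y ~ 1, family = binomial, offset = lp_shrunk)))

print(c(shrinkage_factor = s_factor, intercept = intercept_new, slopes_shrunk))
```

The shrinkage factor is itself estimated from the same small dataset, so it inherits that dataset's uncertainty.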
“shrinkage works on the average but may fail in the particular unique
problem on which the statistician is working.”
• Required shrinkage is hard to estimate
• Often large uncertainty about the correct value to use, especially in small datasets (!)
“We conclude that, despite improved performance on average, shrinkage often
worked poorly in individual datasets, in particular when it was most needed.
The results imply that shrinkage methods do not solve problems associated
with small sample size or low number of events per variable.”
Our proposal
• Calculate sample size that is needed to
• minimise potential overfitting
• estimate probability (risk) precisely
• Sample size formulas for
• Continuous outcomes
• Time-to-event outcomes
• Binary outcomes (focus today)
Example
• COVID-19 prognosis in hospitalized
patients
• Composite outcome: “deterioration”
(in-hospital death, ventilator support,
ICU)
A priori expectations
• Event fraction at least 30%
• 40 candidate predictor parameters
• C-statistic of 0.71 (conservative estimate)
-> Cox-Snell R2 of 0.24
MedRxiv Preprint (not peer reviewed): 10.1101/2020.10.09.20209957
Restricted cubic splines with 4 knots: 3 degrees of freedom
Note: the EPV rule also counts degrees of freedom of candidate predictors, not variables!
Calculate required sample size
Criterion 1. Shrinkage: expected heuristic shrinkage factor, S ≥ 0.9
(calibration slope, target < 10% overfitting)
Criterion 2. Optimism: Cox-Snell R2 apparent - Cox-Snell R2 validation < 0.05
(overfitting)
Criterion 3. A small margin of error in the overall risk estimate: < 0.05 absolute error
(precision estimated baseline risk)
(Criterion 4: a small margin of absolute error in the estimated risks)
Calculation
R code:
> require(pmsampsize)
> pmsampsize(type="b",rsquared=0.24,parameters=40,prevalence=0.3)
A few alternative scenarios
• rsquared=0.24,parameters=40,prevalence=0.3 -> EPV≥9.7
• rsquared=0.12,parameters=40,prevalence=0.3 -> EPV≥21.0
• rsquared=0.12,parameters=40,prevalence=0.5 -> EPV≥35.0
• rsquared=0.36,parameters=40,prevalence=0.2 -> EPV≥5
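These scenarios can be approximately reproduced by hand. A sketch in base R, assuming the closed-form expressions for criteria 1 to 3 as given in the sample size papers by Riley and colleagues, with criterion 2 taken as a difference of at most 0.05 on the Nagelkerke scale. It is not the pmsampsize source code; `min_sample_size` and its argument names are hypothetical, and EPV is computed here simply as n × prevalence / parameters.

```r
# Minimum sample size for a binary outcome from three closed-form criteria (sketch)
min_sample_size <- function(r2_cs, parameters, prevalence,
                            shrinkage = 0.9, optimism = 0.05, margin = 0.05) {
  # Sample size needed for an expected shrinkage factor s
  n_for_shrinkage <- function(s) parameters / ((s - 1) * log(1 - r2_cs / s))

  # Criterion 1: expected shrinkage factor of at least `shrinkage`
  n1 <- n_for_shrinkage(shrinkage)

  # Criterion 2: small optimism in R2 (ASSUMPTION: 0.05 on the Nagelkerke scale,
  # i.e. 0.05 * maximum attainable Cox-Snell R2 on the Cox-Snell scale)
  lnl_null  <- prevalence * log(prevalence) + (1 - prevalence) * log(1 - prevalence)
  max_r2_cs <- 1 - exp(2 * lnl_null)
  n2 <- n_for_shrinkage(r2_cs / (r2_cs + optimism * max_r2_cs))

  # Criterion 3: margin of error of the overall risk estimate
  n3 <- (1.96 / margin)^2 * prevalence * (1 - prevalence)

  n <- ceiling(max(n1, n2, n3))          # all criteria must be met
  c(n = n, events = n * prevalence, epv = n * prevalence / parameters)
}

scenarios <- data.frame(r2_cs      = c(0.24, 0.12, 0.12, 0.36),
                        prevalence = c(0.3, 0.3, 0.5, 0.2))
results <- t(mapply(min_sample_size,
                    r2_cs = scenarios$r2_cs, prevalence = scenarios$prevalence,
                    MoreArgs = list(parameters = 40)))
print(cbind(scenarios, results))
```

Which criterion drives the result differs by scenario, which is why no single EPV value can serve as a general rule.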
The sample size that meets all criteria is the MINIMUM required
• Why minimum? Other criteria may be important
e.g. missing data, clustering, variable selection
• May raise required sample size further
• Simulation based approaches
Preprint (not peer reviewed) doi: 10.21203/rs.3.rs-87100/v1
Summary
• Default logistic regression produces finite sample biased estimates
• Finite sample bias can be substantial; easily solved using Firth’s correction
• “Modern” approaches (e.g. Firth, Lasso, Ridge) do not compensate for low N
• New sample size criteria to replace the one-size-fits-all EPV≥10 rule
https://www.prognosisresearch.com/
New website by Richard Riley and Kym Snell
Work in collaboration with:
• Carl Moons
• Hans Reitsma
• Richard Riley (Keele, materials for this presentation)
• Gary Collins (Oxford)
• Ben Van Calster (Leuven)
• Ewout Steyerberg (Leiden)
• Rishi Gupta (UCL)
• Many others
Contact: M.vanSmeden@umcutrecht.nl

More Related Content

What's hot

Statistics and ML 21Oct22 sel.pptx
Statistics and ML 21Oct22 sel.pptxStatistics and ML 21Oct22 sel.pptx
Statistics and ML 21Oct22 sel.pptx
Ewout Steyerberg
 
Development and evaluation of prediction models: pitfalls and solutions
Development and evaluation of prediction models: pitfalls and solutionsDevelopment and evaluation of prediction models: pitfalls and solutions
Development and evaluation of prediction models: pitfalls and solutions
Maarten van Smeden
 
Clinical prediction models
Clinical prediction modelsClinical prediction models
Clinical prediction models
Maarten van Smeden
 
Developing and validating statistical models for clinical prediction and prog...
Developing and validating statistical models for clinical prediction and prog...Developing and validating statistical models for clinical prediction and prog...
Developing and validating statistical models for clinical prediction and prog...
Evangelos Kritsotakis
 
QUANTIFYING THE IMPACT OF DIFFERENT APPROACHES FOR HANDLING CONTINUOUS PREDIC...
QUANTIFYING THE IMPACT OF DIFFERENT APPROACHES FOR HANDLING CONTINUOUS PREDIC...QUANTIFYING THE IMPACT OF DIFFERENT APPROACHES FOR HANDLING CONTINUOUS PREDIC...
QUANTIFYING THE IMPACT OF DIFFERENT APPROACHES FOR HANDLING CONTINUOUS PREDIC...
GaryCollins74
 
Introduction to prediction modelling - Berlin 2018 - Part I
Introduction to prediction modelling - Berlin 2018 - Part IIntroduction to prediction modelling - Berlin 2018 - Part I
Introduction to prediction modelling - Berlin 2018 - Part I
Maarten van Smeden
 
Whatever happened to design based inference
Whatever happened to design based inferenceWhatever happened to design based inference
Whatever happened to design based inference
StephenSenn2
 
Machine learning in medicine: calm down
Machine learning in medicine: calm downMachine learning in medicine: calm down
Machine learning in medicine: calm down
BenVanCalster
 
Has modelling killed randomisation inference frankfurt
Has modelling killed randomisation inference frankfurtHas modelling killed randomisation inference frankfurt
Has modelling killed randomisation inference frankfurt
Stephen Senn
 
The basics of prediction modeling
The basics of prediction modeling The basics of prediction modeling
The basics of prediction modeling
Maarten van Smeden
 
Choosing Regression Models
Choosing Regression ModelsChoosing Regression Models
Choosing Regression Models
Stephen Senn
 
Algorithm based medicine: old statistics wine in new machine learning bottles?
Algorithm based medicine: old statistics wine in new machine learning bottles?Algorithm based medicine: old statistics wine in new machine learning bottles?
Algorithm based medicine: old statistics wine in new machine learning bottles?
Maarten van Smeden
 
NNTs, responder analysis & overlap measures
NNTs, responder analysis & overlap measuresNNTs, responder analysis & overlap measures
NNTs, responder analysis & overlap measures
Stephen Senn
 
Correcting for missing data, measurement error and confounding
Correcting for missing data, measurement error and confoundingCorrecting for missing data, measurement error and confounding
Correcting for missing data, measurement error and confounding
Maarten van Smeden
 
Uncertainty in AI
Uncertainty in AIUncertainty in AI
Uncertainty in AI
Maarten van Smeden
 
Prediction, Big Data, and AI: Steyerberg, Basel Nov 1, 2019
Prediction, Big Data, and AI: Steyerberg, Basel Nov 1, 2019Prediction, Big Data, and AI: Steyerberg, Basel Nov 1, 2019
Prediction, Big Data, and AI: Steyerberg, Basel Nov 1, 2019
Ewout Steyerberg
 
Clinical prediction models: development, validation and beyond
Clinical prediction models:development, validation and beyondClinical prediction models:development, validation and beyond
Clinical prediction models: development, validation and beyond
Maarten van Smeden
 
P value wars
P value warsP value wars
P value wars
Stephen Senn
 
Clinical trials are about comparability not generalisability V2.pptx
Clinical trials are about comparability not generalisability V2.pptxClinical trials are about comparability not generalisability V2.pptx
Clinical trials are about comparability not generalisability V2.pptx
StephenSenn3
 
Predictimands
PredictimandsPredictimands
Predictimands
Maarten van Smeden
 

What's hot (20)

Statistics and ML 21Oct22 sel.pptx
Statistics and ML 21Oct22 sel.pptxStatistics and ML 21Oct22 sel.pptx
Statistics and ML 21Oct22 sel.pptx
 
Development and evaluation of prediction models: pitfalls and solutions
Development and evaluation of prediction models: pitfalls and solutionsDevelopment and evaluation of prediction models: pitfalls and solutions
Development and evaluation of prediction models: pitfalls and solutions
 
Clinical prediction models
Clinical prediction modelsClinical prediction models
Clinical prediction models
 
Developing and validating statistical models for clinical prediction and prog...
Developing and validating statistical models for clinical prediction and prog...Developing and validating statistical models for clinical prediction and prog...
Developing and validating statistical models for clinical prediction and prog...
 
QUANTIFYING THE IMPACT OF DIFFERENT APPROACHES FOR HANDLING CONTINUOUS PREDIC...
QUANTIFYING THE IMPACT OF DIFFERENT APPROACHES FOR HANDLING CONTINUOUS PREDIC...QUANTIFYING THE IMPACT OF DIFFERENT APPROACHES FOR HANDLING CONTINUOUS PREDIC...
QUANTIFYING THE IMPACT OF DIFFERENT APPROACHES FOR HANDLING CONTINUOUS PREDIC...
 
Introduction to prediction modelling - Berlin 2018 - Part I
Introduction to prediction modelling - Berlin 2018 - Part IIntroduction to prediction modelling - Berlin 2018 - Part I
Introduction to prediction modelling - Berlin 2018 - Part I
 
Whatever happened to design based inference
Whatever happened to design based inferenceWhatever happened to design based inference
Whatever happened to design based inference
 
Machine learning in medicine: calm down
Machine learning in medicine: calm downMachine learning in medicine: calm down
Machine learning in medicine: calm down
 
Has modelling killed randomisation inference frankfurt
Has modelling killed randomisation inference frankfurtHas modelling killed randomisation inference frankfurt
Has modelling killed randomisation inference frankfurt
 
The basics of prediction modeling
The basics of prediction modeling The basics of prediction modeling
The basics of prediction modeling
 
Choosing Regression Models
Choosing Regression ModelsChoosing Regression Models
Choosing Regression Models
 
Algorithm based medicine: old statistics wine in new machine learning bottles?
Algorithm based medicine: old statistics wine in new machine learning bottles?Algorithm based medicine: old statistics wine in new machine learning bottles?
Algorithm based medicine: old statistics wine in new machine learning bottles?
 
NNTs, responder analysis & overlap measures
NNTs, responder analysis & overlap measuresNNTs, responder analysis & overlap measures
NNTs, responder analysis & overlap measures
 
Correcting for missing data, measurement error and confounding
Correcting for missing data, measurement error and confoundingCorrecting for missing data, measurement error and confounding
Correcting for missing data, measurement error and confounding
 
Uncertainty in AI
Uncertainty in AIUncertainty in AI
Uncertainty in AI
 
Prediction, Big Data, and AI: Steyerberg, Basel Nov 1, 2019
Prediction, Big Data, and AI: Steyerberg, Basel Nov 1, 2019Prediction, Big Data, and AI: Steyerberg, Basel Nov 1, 2019
Prediction, Big Data, and AI: Steyerberg, Basel Nov 1, 2019
 
Clinical prediction models: development, validation and beyond
Clinical prediction models:development, validation and beyondClinical prediction models:development, validation and beyond
Clinical prediction models: development, validation and beyond
 
P value wars
P value warsP value wars
P value wars
 
Clinical trials are about comparability not generalisability V2.pptx
Clinical trials are about comparability not generalisability V2.pptxClinical trials are about comparability not generalisability V2.pptx
Clinical trials are about comparability not generalisability V2.pptx
 
Predictimands
PredictimandsPredictimands
Predictimands
 

Similar to Why the EPV≥10 sample size rule is rubbish and what to use instead

The Seven Habits of Highly Effective Statisticians
The Seven Habits of Highly Effective StatisticiansThe Seven Habits of Highly Effective Statisticians
The Seven Habits of Highly Effective Statisticians
Stephen Senn
 
Sti2018 jws
Sti2018 jwsSti2018 jws
Sti2018 jws
Jesper Schneider
 
Big biomedical data is a lie
Big biomedical data is a lieBig biomedical data is a lie
Big biomedical data is a lie
Paul Agapow
 
Overview of statistical tests: Data handling and data quality (Part II)
Overview of statistical tests: Data handling and data quality (Part II)Overview of statistical tests: Data handling and data quality (Part II)
Overview of statistical tests: Data handling and data quality (Part II)
Bioinformatics and Computational Biosciences Branch
 
Statistics for UX Professionals - Jessica Cameron
Statistics for UX Professionals - Jessica CameronStatistics for UX Professionals - Jessica Cameron
Statistics for UX Professionals - Jessica Cameron
User Vision
 
MLSEV Virtual. Evaluations
MLSEV Virtual. EvaluationsMLSEV Virtual. Evaluations
MLSEV Virtual. Evaluations
BigML, Inc
 

Similar to Why the EPV≥10 sample size rule is rubbish and what to use instead (6)

The Seven Habits of Highly Effective Statisticians
The Seven Habits of Highly Effective StatisticiansThe Seven Habits of Highly Effective Statisticians
The Seven Habits of Highly Effective Statisticians
 
Sti2018 jws
Sti2018 jwsSti2018 jws
Sti2018 jws
 
Big biomedical data is a lie
Big biomedical data is a lieBig biomedical data is a lie
Big biomedical data is a lie
 
Overview of statistical tests: Data handling and data quality (Part II)
Overview of statistical tests: Data handling and data quality (Part II)Overview of statistical tests: Data handling and data quality (Part II)
Overview of statistical tests: Data handling and data quality (Part II)
 
Statistics for UX Professionals - Jessica Cameron
Statistics for UX Professionals - Jessica CameronStatistics for UX Professionals - Jessica Cameron
Statistics for UX Professionals - Jessica Cameron
 
MLSEV Virtual. Evaluations
MLSEV Virtual. EvaluationsMLSEV Virtual. Evaluations
MLSEV Virtual. Evaluations
 

More from Maarten van Smeden

UMC Utrecht AI Methods Lab
UMC Utrecht AI Methods LabUMC Utrecht AI Methods Lab
UMC Utrecht AI Methods Lab
Maarten van Smeden
 
Rage against the machine learning 2023
Rage against the machine learning 2023Rage against the machine learning 2023
Rage against the machine learning 2023
Maarten van Smeden
 
A gentle introduction to AI for medicine
A gentle introduction to AI for medicineA gentle introduction to AI for medicine
A gentle introduction to AI for medicine
Maarten van Smeden
 
Associate professor lecture
Associate professor lectureAssociate professor lecture
Associate professor lecture
Maarten van Smeden
 
Shrinkage in medical prediction: the poor man’s solution for an inadequate sa...
Shrinkage in medical prediction: the poor man’s solution for an inadequate sa...Shrinkage in medical prediction: the poor man’s solution for an inadequate sa...
Shrinkage in medical prediction: the poor man’s solution for an inadequate sa...
Maarten van Smeden
 
Guideline for high-quality diagnostic and prognostic applications of AI in he...
Guideline for high-quality diagnostic and prognostic applications of AI in he...Guideline for high-quality diagnostic and prognostic applications of AI in he...
Guideline for high-quality diagnostic and prognostic applications of AI in he...
Maarten van Smeden
 
Algorithm based medicine
Algorithm based medicineAlgorithm based medicine
Algorithm based medicine
Maarten van Smeden
 
Clinical prediction models for covid-19: alarming results from a living syste...
Clinical prediction models for covid-19: alarming results from a living syste...Clinical prediction models for covid-19: alarming results from a living syste...
Clinical prediction models for covid-19: alarming results from a living syste...
Maarten van Smeden
 
Prediction models for diagnosis and prognosis related to COVID-19
Prediction models for diagnosis and prognosis related to COVID-19Prediction models for diagnosis and prognosis related to COVID-19
Prediction models for diagnosis and prognosis related to COVID-19
Maarten van Smeden
 
Living systematic reviews: now and in the future
Living systematic reviews: now and in the futureLiving systematic reviews: now and in the future
Living systematic reviews: now and in the future
Maarten van Smeden
 
Voorspelmodellen en COVID-19
Voorspelmodellen en COVID-19Voorspelmodellen en COVID-19
Voorspelmodellen en COVID-19
Maarten van Smeden
 
The statistics of the coronavirus
The statistics of the coronavirusThe statistics of the coronavirus
The statistics of the coronavirus
Maarten van Smeden
 
COVID-19 related prediction models for diagnosis and prognosis - a living sys...
COVID-19 related prediction models for diagnosis and prognosis - a living sys...COVID-19 related prediction models for diagnosis and prognosis - a living sys...
COVID-19 related prediction models for diagnosis and prognosis - a living sys...
Maarten van Smeden
 
ML and AI: a blessing and curse for statisticians and medical doctors
ML and AI: a blessing and curse forstatisticians and medical doctorsML and AI: a blessing and curse forstatisticians and medical doctors
ML and AI: a blessing and curse for statisticians and medical doctors
Maarten van Smeden
 
Measurement error in medical research
Measurement error in medical researchMeasurement error in medical research
Measurement error in medical research
Maarten van Smeden
 
The absence of a gold standard: a measurement error problem
The absence of a gold standard: a measurement error problemThe absence of a gold standard: a measurement error problem
The absence of a gold standard: a measurement error problem
Maarten van Smeden
 
Machine learning versus traditional statistical modeling and medical doctors
Machine learning versus traditional statistical modeling and medical doctorsMachine learning versus traditional statistical modeling and medical doctors
Machine learning versus traditional statistical modeling and medical doctors
Maarten van Smeden
 
Anatomy of a successful science thread
Anatomy of a successful science threadAnatomy of a successful science thread
Anatomy of a successful science thread
Maarten van Smeden
 
Is it causal, is it prediction or is it neither?
Is it causal, is it prediction or is it neither?Is it causal, is it prediction or is it neither?
Is it causal, is it prediction or is it neither?
Maarten van Smeden
 

Why the EPV≥10 sample size rule is rubbish and what to use instead

  • 1. Maarten van Smeden, PhD, 2 November 2020: Why the EPV≥10 sample size rule is rubbish and what to use instead
  • 2. M.vanSmeden@umcutrecht.nl | Twitter: @MaartenvSmeden
      • Statistician at Julius Center for Health Sciences and Primary Care
      • Main interests (but not limited to): prognostic and diagnostic modeling; measurement error; missing data
      • Today’s topic: the EPV≥10 sample size rule (aka the 1 in 10 rule) has been one of the leading sample size rules in prognostic/diagnostic prediction modeling
  • 5. Outline
      • The EPV≥10 rule-of-thumb: where does it come from?
      • Evidence the EPV≥10 rule has no rationale
      • Evidence that sample size is important (even if you use the fancier methods)
      • Actual sample size calculations for prediction models
  • 6. Ever wondered if AD/BC gives the “best” estimate of the odds ratio? What if I told you that AD/BC is biased?
  • 7. Let’s say we have fitted a logistic regression model to a dataset, and obtain $\ln\frac{p_i}{1 - p_i} = \hat{\alpha} + \hat{\beta}_1 X_{1i} + \hat{\beta}_2 X_{2i} + \cdots + \hat{\beta}_k X_{ki}$. I’m very sorry, but $\hat{\beta}_1$ is a biased estimator, and $\hat{\beta}_2$ too, … actually they are all finite sample biased
  • 8. Epidemiology text-books:
      • Confounding bias
      • Information bias
      • Selection bias
      • … nothing about finite sample bias
  • 9. Important: bias vs consistency
      • Consistency ≈ as sample size increases, the estimate converges to the truth
      • Unbiasedness ≈ over repeated samples of the same size, the average estimate equals the truth
  • 10. Log(odds) is consistent but finite sample biased
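The claim on slides 6 to 10 can be checked without simulation for the simplest case. The sketch below is an illustration in Python (not taken from the slides) that assumes a 2x2 table with n exposed and n unexposed subjects and hypothetical true risks of 0.5 and 0.2. It enumerates every possible table exactly and averages log(AD/BC) over the tables in which all four cells are non-zero, since the estimate is undefined otherwise. The function names are made up for this example.

```python
from math import comb, log


def binom_pmf(k, n, p):
    """Probability of k events out of n with event probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def expected_logit(n, p):
    """Exact E[log(K / (n - K))] for K ~ Binomial(n, p), given 0 < K < n."""
    total = 0.0
    weight = 0.0
    for k in range(1, n):  # skip k = 0 and k = n, where the log-odds is infinite
        w = binom_pmf(k, n, p)
        total += w * log(k / (n - k))
        weight += w
    return total / weight


def log_or_bias(n, p1, p0):
    """Exact finite-sample bias of log(AD/BC) with n subjects per exposure group.

    log(AD/BC) is the difference of the two group-specific sample log-odds,
    and the groups are independent, so the expectation splits into two terms.
    """
    true_log_or = log(p1 / (1 - p1)) - log(p0 / (1 - p0))
    return expected_logit(n, p1) - expected_logit(n, p0) - true_log_or


for n in (25, 50, 200, 1000):
    print(f"n per group = {n:5d}  bias of log(OR) = {log_or_bias(n, 0.5, 0.2):+.4f}")
```

Under these assumed risks the bias is positive (away from zero) at every n and shrinks as n grows, which is the "consistent but finite sample biased" pattern the slide describes.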
  • 17. Illustration by simulation
      • Simulate 4 normal covariates with equal multivariable log-odds-ratios of 2
      • 1,000 simulation samples of N = 50
      • Consistency: create 1,000 meta-datasets of increasing size: meta-dataset r consists of each created dataset up to r
      • Bias: calculate the difference between the estimated exposure effect and the true value for each of the created datasets up to r
  • 20. [Figure labels] Average of 400 studies with N = 50; 1 study with N = 20,000
  • 21. With decreasing sample size: how we usually think
  • 22. With decreasing sample size: but actually with odds ratios (and other ratios)
  • 23. The origin of the 1 in 10 rule: “For EPV values of 10 or greater, no major problems occurred. For EPV values less than 10, however, the regression coefficients were biased in both positive and negative directions”
  • 24. Source: Peduzzi et al. 1996 ?
  • 25. More simulation studies (citations based on Google Scholar, Oct 30 2020)
      • “For EPV values of 10 or greater, no major problems”
      • “a minimum of 10 EPV […] may be too conservative”
      • “substantial problems even if the number of EPV exceeds 10”
      • Citation counts shown: 5,736; 2,438; 216
  • 26. More simulation studies (citations based on Google Scholar, Oct 30 2020)
      • “For EPV values of 10 or greater, no major problems”
      • “a minimum of 10 EPV […] may be too conservative”
      • “substantial problems even if the number of EPV exceeds 10”
      • Citation counts shown: 5,736; 2,438; 216
      • !?!
  • 27. Aims
      • Examine the reasons for substantial differences between the earlier EPV simulation studies
      • Evaluate a possible solution to reduce the finite sample bias
  • 28. Aims
      • Examine the reasons for substantial differences between the earlier EPV simulation studies (simulation technicality: handling of “separation”)
      • Evaluate a possible solution to reduce the finite sample bias
  • 30. Firth’s correction
      • Firth’s “correction” aims to reduce finite sample bias in maximum likelihood estimates, applicable to logistic regression
      • It makes clever use of the “Jeffreys prior” (from Bayesian literature) to penalize the log-likelihood, which shrinks the estimated coefficients
      • It has a nice theoretical justification, but does it work well?
  • 31. [Figure: Standard] Averaged over 465 simulation conditions with 10,000 replications each
  • 32. [Figure: Standard vs Firth’s correction] Averaged over 465 simulation conditions with 10,000 replications each
  • 33. Firth’s correction. Difficult? No
      • Example R code: > require("logistf") > logistf(Y~X1+X2+X3+X4, firth=TRUE, data=df)
      • Compared to default (maxlik) logistic regression, Firth’s correction generally gives: narrower confidence intervals; lower MSE; better predictions*
      • *requires adjustment of the intercept using the flic=TRUE option in logistf
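To get a feel for what the penalty does, it helps to look at the one case where Firth's estimate has a closed form. For a single binary exposure (the saturated 2x2 model), penalizing with the Jeffreys prior amounts to adding 0.5 to each cell before taking log(AD/BC). The Python sketch below relies on that special case only. It is not the logistf algorithm, and the table sizes and true risks (n = 25 per group, risks 0.5 and 0.2) are hypothetical values chosen for illustration.

```python
from math import comb, isfinite, log


def firth_log_or(a, b, c, d):
    """Firth-type log odds ratio for a 2x2 table (special case: add 0.5 per cell).

    a, b = events and non-events among the exposed;
    c, d = events and non-events among the unexposed.
    """
    return log(((a + 0.5) * (d + 0.5)) / ((b + 0.5) * (c + 0.5)))


def expected_logit(n, p, add):
    """Exact E[log((K + add) / (n - K + add))] for K ~ Binomial(n, p).

    add = 0.0 gives the maximum likelihood log-odds (averaged over 0 < K < n,
    where it is finite); add = 0.5 gives the Firth-type log-odds (always finite).
    """
    ks = range(n + 1) if add > 0 else range(1, n)
    total = 0.0
    weight = 0.0
    for k in ks:
        w = comb(n, k) * p**k * (1 - p) ** (n - k)
        total += w * log((k + add) / (n - k + add))
        weight += w
    return total / weight


def log_or_bias(n, p1, p0, add):
    """Exact bias of the log odds ratio estimate with n subjects per group."""
    truth = log(p1 / (1 - p1)) - log(p0 / (1 - p0))
    return expected_logit(n, p1, add) - expected_logit(n, p0, add) - truth


ml_bias = log_or_bias(25, 0.5, 0.2, add=0.0)
firth_bias = log_or_bias(25, 0.5, 0.2, add=0.5)
print(f"maximum likelihood bias: {ml_bias:+.4f}")
print(f"Firth-type bias:         {firth_bias:+.4f}")
print(f"table with a zero cell:  {firth_log_or(10, 0, 5, 5):.3f}")
```

The corrected estimate also stays finite when a cell is empty, which is the "separation" situation mentioned on slide 28. With several continuous covariates no closed form exists and the penalized likelihood has to be maximized iteratively, which is what logistf does.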
  • 34. Sample size issue solved? … not quite!
      • Precision of regression coefficients
      • Variable selection and functional form
      • Ensure predictions are adequate
  • 35. Sample size issue solved? … not quite!
      • Precision of regression coefficients
      • Variable selection and functional form
      • Ensure predictions are adequate
      • Why would a one-solution-fits-all rule-of-thumb be appropriate?
      • Think of sample size for a randomized clinical trial: would it not be odd to suggest all trials should have 100 patients in each arm?
  • 36. TRIPOD Item 8. Explain how the study size was arrived at. Moons et al. Ann Intern Med 2015 (TRIPOD Explanation & Elaboration): “Although there is a consensus on the importance of having an adequate sample size for model development, how to determine what counts as ‘adequate’ is not clear …”
  • 37. Why is sample size important?
      • We want to have a large enough sample size to develop a model that provides accurate risk predictions in new individuals from the target population
      • Many (most?) models do not perform well when checked in new data: small sample sizes; overfitting; lack of (internal) validation; …
  • 38. Recent example
      • Reviewed 232 prediction models
      • “All models were rated at high or unclear risk of bias”
      • Sample size: median 338; IQR 134 to 707
      • Number of events: median 69; IQR 37 to 160
      • Living review, doi: 10.1136/bmj.m1328 (these numbers are from a soon-to-appear review update)
  • 39. Recent example
      • External validation of 22 COVID-19 related prognostic models
      • Performance: poor to very poor
      • “Admission oxygen saturation on room air and patient age are strong predictors of deterioration and mortality among hospitalised adults with COVID-19, respectively. None of the prognostic models evaluated here offered incremental value for patient stratification to these univariable predictors.”
  • 40. Small sample size and overfitting
      • Spurious predictor-outcome associations
      • Important predictors can be missed
      • Unimportant predictors can be selected
      • Regression coefficients too large and uncertain
      • Model doesn’t predict well in new data
      • Disappointing discrimination
      • Often calibration slope < 1
      • https://twitter.com/LesGuessing/status/997146590442799105
  • 41. With small N: calibration slope often < 1 (predictions too extreme)
  • 42. “Modern” methods aim to circumvent overfitting
      • Penalised regression: e.g. lasso, ridge regression, elastic net
      • Standard regression followed by uniform (global) shrinkage
      • Target calibrated predicted risks in new data: shrinkage and penalty terms estimated using bootstrapping or cross-validation
      • Sample size problem solved?
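As a rough sketch of what "uniform (global) shrinkage" means in practice, the Python snippet below uses the closed-form heuristic shrinkage factor, S = (LR chi-square minus degrees of freedom) / LR chi-square, in place of the bootstrap or cross-validation estimate the slide mentions. That substitution is an assumption made to keep the example short, and all numbers and coefficient values are hypothetical.

```python
def heuristic_shrinkage(lr_chi2, df):
    """Heuristic shrinkage factor from the model likelihood-ratio chi-square
    statistic and the number of candidate predictor parameters (df)."""
    return (lr_chi2 - df) / lr_chi2


def shrink_coefficients(coefs, s):
    """Multiply every predictor coefficient by the same factor s.

    The intercept is not shrunk this way; it would normally be re-estimated
    afterwards so that the mean predicted risk matches the observed event rate.
    """
    return [s * b for b in coefs]


# Hypothetical fitted model: LR chi-square of 80 on 8 parameters.
s = heuristic_shrinkage(lr_chi2=80.0, df=8)
coefs = [0.8, -1.2, 0.35]  # made-up log odds ratios
print("shrinkage factor:", round(s, 3))
print("shrunken coefficients:", [round(b, 3) for b in shrink_coefficients(coefs, s)])

# The same 8 parameters with a much weaker signal need far more shrinkage.
print("weak-signal factor:", round(heuristic_shrinkage(lr_chi2=16.0, df=8), 3))
```

A factor well below 1 signals substantial overfitting. The factor is itself an estimate, so in a small dataset it is uncertain too.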
  • 43. “shrinkage works on the average but may fail in the particular unique problem on which the statistician is working.”
      • Required shrinkage is hard to estimate
      • Often large uncertainty about the correct value to use, especially in small datasets (!)
  • 44. “We conclude that, despite improved performance on average, shrinkage often worked poorly in individual datasets, in particular when it was most needed. The results imply that shrinkage methods do not solve problems associated with small sample size or low number of events per variable.”
  • 47. Our proposal
      • Calculate the sample size that is needed to minimise potential overfitting and to estimate probability (risk) precisely
      • Sample size formulas for: continuous outcomes; time-to-event outcomes; binary outcomes (focus today)
  • 48. Example
      • COVID-19 prognosis in hospitalized patients
      • Composite outcome: “deterioration” (in-hospital death, ventilator support, ICU)
      • A priori expectations: event fraction at least 30%; 40 candidate predictor parameters; C-statistic of 0.71 (conservative estimate) -> Cox-Snell R2 of 0.24
      • MedRxiv Preprint (not peer reviewed): 10.1101/2020.10.09.20209957
  • 49. Restricted cubic splines with 4 knots: 3 degrees of freedom. Note: the EPV rule also counts degrees of freedom of candidate predictors, not variables!
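The note on slide 49 is easy to get wrong in practice, so here is a small Python sketch of the bookkeeping. The predictor list is hypothetical (not the model from the example): each candidate predictor is recorded with the number of parameters it adds, and events are divided by the total number of parameters, not by the number of variables.

```python
# Hypothetical candidate predictors and the parameters (degrees of freedom)
# each one contributes to the model.
candidate_predictors = {
    "age (restricted cubic spline, 4 knots)": 3,
    "sex (binary)": 1,
    "smoking status (3 categories)": 2,
}


def events_per_parameter(n_events, predictors):
    """Events divided by the total number of candidate predictor parameters."""
    return n_events / sum(predictors.values())


def events_per_named_variable(n_events, predictors):
    """The naive count: events divided by the number of variables."""
    return n_events / len(predictors)


n_events = 300  # hypothetical number of outcome events
print("parameters:", sum(candidate_predictors.values()))
print("events per parameter:", events_per_parameter(n_events, candidate_predictors))
print("events per variable (naive):", events_per_named_variable(n_events, candidate_predictors))
```

With these made-up numbers the naive count doubles the apparent EPV. The gap widens as more splines, interactions, or multi-category predictors are considered.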
  • 50. Calculate required sample size
      • Criterion 1. Shrinkage: expected heuristic shrinkage factor, S ≥ 0.9 (calibration slope, target < 10% overfitting)
      • Criterion 2. Optimism: Cox-Snell R2 apparent - Cox-Snell R2 validation < 0.05 (overfitting)
      • Criterion 3. A small margin of error in the overall risk estimate: < 0.05 absolute error (precision of the estimated baseline risk)
      • (Criterion 4. A small margin of absolute error in the estimated risks)
  • 51. Calculation. R code: > require(pmsampsize) > pmsampsize(type="b",rsquared=0.24,parameters=40,prevalence=0.3)
  • 52. A few alternative scenarios
      • rsquared=0.24,parameters=40,prevalence=0.3 -> EPV≥9.7
      • rsquared=0.12,parameters=40,prevalence=0.3 -> EPV≥21.0
      • rsquared=0.12,parameters=40,prevalence=0.5 -> EPV≥35.0
      • rsquared=0.36,parameters=40,prevalence=0.2 -> EPV≥5
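To show where the EPV values in these scenarios come from, the Python sketch below re-implements criteria 1 to 3 for a binary outcome. The closed-form expressions are an assumption on my part: they follow the published sample size formulas as I understand them, and this is not the pmsampsize source code. Function and argument names are chosen for this example only. Criterion 4 is not included.

```python
from math import ceil, exp, log


def n_for_shrinkage(rsquared, parameters, shrinkage):
    """Sample size at which the expected uniform shrinkage factor equals `shrinkage`."""
    return parameters / ((shrinkage - 1) * log(1 - rsquared / shrinkage))


def max_cox_snell_r2(prevalence):
    """Largest attainable Cox-Snell R2 for a binary outcome with this event fraction."""
    lnl_null_per_obs = prevalence * log(prevalence) + (1 - prevalence) * log(1 - prevalence)
    return 1 - exp(2 * lnl_null_per_obs)


def min_sample_size(rsquared, parameters, prevalence,
                    shrinkage=0.9, optimism=0.05, margin=0.05):
    """Smallest n that satisfies criteria 1 to 3 simultaneously."""
    # Criterion 1: expected shrinkage factor of at least `shrinkage`.
    n1 = n_for_shrinkage(rsquared, parameters, shrinkage)
    # Criterion 2: small optimism in R2, expressed as a required shrinkage factor.
    s2 = rsquared / (rsquared + optimism * max_cox_snell_r2(prevalence))
    n2 = n_for_shrinkage(rsquared, parameters, s2)
    # Criterion 3: overall risk estimated to within +/- `margin` (95% interval).
    n3 = (1.96 / margin) ** 2 * prevalence * (1 - prevalence)
    return ceil(max(n1, n2, n3))


scenarios = [(0.24, 40, 0.3), (0.12, 40, 0.3), (0.12, 40, 0.5), (0.36, 40, 0.2)]
for r2, params, prev in scenarios:
    n = min_sample_size(r2, params, prev)
    print(f"R2={r2}, parameters={params}, prevalence={prev}: "
          f"n={n}, EPV={n * prev / params:.2f}")
```

Under these formulas the four scenarios give EPV values of roughly 9.7, 21.0, 35.0 and 5.0, in line with the list above. In the first three scenarios criterion 1 determines the answer; in the last one criterion 2 is the binding constraint. So the required EPV depends on the anticipated R2 and the event fraction, and is not fixed at 10.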
  • 53. The sample size that meets all criteria is the MINIMUM required
      • Why minimum? Other criteria may be important, e.g. missing data, clustering, variable selection
      • May raise required sample size further
      • Simulation based approaches
      • Preprint (not peer reviewed) doi: 10.21203/rs.3.rs-87100/v1
  • 54. Summary
      • Default logistic regression produces finite sample biased estimates
      • Finite sample bias can be substantial; easily solved using Firth’s correction
      • “Modern” approaches (e.g. Firth, Lasso, Ridge) do not compensate for low N
      • New sample size criteria to replace the one-size-fits-all EPV≥10 rule
  • 55. https://www.prognosisresearch.com/ (new website by Richard Riley and Kym Snell)
  • 56. Work in collaboration with:
      • Carl Moons
      • Hans Reitsma
      • Richard Riley (Keele, materials for this presentation)
      • Gary Collins (Oxford)
      • Ben Van Calster (Leuven)
      • Ewout Steyerberg (Leiden)
      • Rishi Gupta (UCL)
      • Many others
      • Contact: M.vanSmeden@umcutrecht.nl