Prediction modeling
Maarten van Smeden, Department of Clinical Epidemiology,
Leiden University Medical Center, Leiden, Netherlands
Berlin, Advanced Methods in Health Data Sciences
Jan 16 2020
Cartoon by Jim Borgman, first published by the Cincinnati Enquirer and King Features Syndicate, April 27 1997
Cookbook review
Schoenfeld & Ioannidis, Am J Clin Nutr 2013, DOI: 10.3945/ajcn.112.047142
“We selected 50 common ingredients from random
recipes of a cookbook”
Cookbook review
veal, salt, pepper, spice, flour, egg, bread, pork, butter, tomato,
lemon, duck, onion, celery, carrot, parsley, mace, sherry, olive,
mushroom, tripe, milk, cheese, coffee, bacon, sugar, lobster,
potato, beef, lamb, mustard, nuts, wine, peas, corn, cinnamon,
cayenne, orange, tea, rum, raisin, bay leaf, cloves, thyme, vanilla,
hickory, molasses, almonds, baking soda, ginger, terrapin
Schoenfeld & Ioannidis, Am J Clin Nutr 2013, DOI: 10.3945/ajcn.112.047142
Studies relating the ingredients to cancer: 40/50
veal, salt, pepper, spice, flour, egg, bread, pork, butter, tomato,
lemon, duck, onion, celery, carrot, parsley, mace, sherry, olive,
mushroom, tripe, milk, cheese, coffee, bacon, sugar, lobster,
potato, beef, lamb, mustard, nuts, wine, peas, corn, cinnamon,
cayenne, orange, tea, rum, raisin, bay leaf, cloves, thyme, vanilla,
hickory, molasses, almonds, baking soda, ginger, terrapin
Schoenfeld & Ioannidis, Am J Clin Nutr 2013, DOI: 10.3945/ajcn.112.047142
Increased/decreased risk of developing cancer: 36/40
veal, salt, pepper, spice, flour, egg, bread, pork, butter, tomato,
lemon, duck, onion, celery, carrot, parsley, mace, sherry, olive,
mushroom, tripe, milk, cheese, coffee, bacon, sugar, lobster,
potato, beef, lamb, mustard, nuts, wine, peas, corn, cinnamon,
cayenne, orange, tea, rum, raisin, bay leaf, cloves, thyme, vanilla,
hickory, molasses, almonds, baking soda, ginger, terrapin
Schoenfeld & Ioannidis, Am J Clin Nutr 2013, DOI: 10.3945/ajcn.112.047142
Credits to Peter Tennant for identifying this example
To explain or to predict?
Explanatory models
• Theory: interest in regression coefficients
• Testing and comparing existing causal theories
• e.g. aetiology of illness, effect of treatment
Predictive models
• Interest in (risk) predictions of future observations
• No concern about causality
• Concerns about overfitting and optimism
• e.g. prognostic or diagnostic prediction model
Descriptive models
• Capture the data structure
Shmueli, Statistical Science 2010, DOI: 10.1214/10-STS330
A
L
Y
exposure outcome
confounder
Shmueli, Statistical Science 2010, DOI: 10.1214/10-STS330
Causal effect estimate
What would have happened to a group of individuals had they
received one treatment or exposure rather than another?
Image sources: https://bit.ly/3a9wRMj https://bit.ly/2uEDRQJ
Randomized clinical trials
exchangeability
Randomized clinical trials
A
L
Y
exposure outcome
confounder
Observational (non-randomized) study
A
L
Y
exposure outcome
confounder
Observational study: diet -> diabetes, age
            Traditional diet         Exotic diet
Age         No diabetes  Diabetes   No diabetes  Diabetes   RR
< 50 years      19           1          37           3      1.50
≥ 50 years      28          12          12           8      1.33
Total           47          13          49          11      0.88
[Figure: diabetes risk (0–50%) under the traditional vs exotic diet, shown for < 50 years, ≥ 50 years, and total]
Numerical example adapted from Peter Tennant with permission: http://tiny.cc/ai6o8y
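The reversal in the table can be checked with a few lines. A minimal sketch (counts read off the table; the risk ratio compares the exotic diet against the traditional diet):

```python
def risk_ratio(events_exposed, n_exposed, events_unexposed, n_unexposed):
    """Risk ratio: risk in the exotic-diet group over risk in the traditional-diet group."""
    return (events_exposed / n_exposed) / (events_unexposed / n_unexposed)

# Stratum-specific RRs (diabetes cases / group size, from the table)
rr_young = risk_ratio(3, 40, 1, 20)    # < 50 years: 0.075 / 0.05 = 1.50
rr_old   = risk_ratio(8, 20, 12, 40)   # >= 50 years: 0.40 / 0.30 = 1.33
# Crude RR, ignoring age: the direction reverses (falls below 1)
rr_crude = risk_ratio(11, 60, 13, 60)
```

Both age strata show an increased risk under the exotic diet, yet the crude comparison suggests a decreased risk: the confounder (age) reverses the apparent direction of effect.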
Observational study: diet -> diabetes, weight loss
            Traditional diet         Exotic diet
Weight      No diabetes  Diabetes   No diabetes  Diabetes   RR
Lost            19           1          37           3      1.50
Gained          28          12          12           8      1.33
Total           47          13          49          11      0.88
[Figure: diabetes risk (0–50%) under the traditional vs exotic diet, shown for lost weight, gained weight, and total]
Numerical example adapted from Peter Tennant with permission: http://tiny.cc/ai6o8y
12 RCTs; 52 nutritional epidemiology claims
0/52 replicated
5/52 effect in the opposite direction
Young & Karr, Significance, 2011, DOI: 10.1111/j.1740-9713.2011.00506.x
But…
Ellie Murray (Jul 13 2018): https://twitter.com/EpiEllie/status/1017622949799571456
To explain or to predict?
Explanatory models
• Theory: interest in regression coefficients
• Testing and comparing existing causal theories
• e.g. aetiology of illness, effect of treatment
Predictive models
• Interest in (risk) predictions of future observations
• No concern about causality
• Concerns about overfitting and optimism
• e.g. prognostic or diagnostic prediction model
Descriptive models
• Capture the data structure
Shmueli, Statistical Science 2010, DOI: 10.1214/10-STS330
The “scientific value” of predictive modeling
1. Uncover potential new causal mechanisms and generate new hypotheses
2. Discover and compare measures and operationalisations of constructs
3. Improve existing explanatory models by capturing underlying complex patterns
4. Reality-check theory: assess the distance between theory and practice
5. Compare competing theories by examining their predictive power
6. Generate knowledge about un-predictability
Shmueli, Statistical Science 2010, DOI: 10.1214/10-STS330 (p292)
Wells rule
Wells et al., Lancet, 1997. doi: 10.1016/S0140-6736(97)08140-3
Apgar
Apgar, JAMA, 1958. doi: 10.1001/jama.1958.03000150027007
Courtesy, Anna Lohmann
Prediction
Usual aim: to make accurate predictions
… of a future outcome or presence of a disease
… for an individual patient
… generally based on >1 factor (predictor)
why?
• to inform decision making (about additional testing/treatment)
• for counseling
To explain or to predict?
Explanatory models
• Causality
• Understanding the role of elements in complex systems
• ”What will happen if….”
Predictive models
• Forecasting
• Often, focus is on the performance of the forecasting
• “What will happen ….”
Descriptive models
• “What happened?”
Require different
research design
and analysis
choices
• Confounding
• Stein’s paradox
• Estimators
Risk estimation example: SCORE
Conroy, European Heart Journal, 2003. doi: 10.1016/S0195-668X(03)00114-3
https://apple.co/2s1aWWa
Risk prediction
Risk prediction can be broadly categorized into:
• Diagnostic: risk of a target disease being currently present vs not present
• Prognostic: risk of a certain health state over a certain time period
• Do we need a randomized controlled trial for diagnostic/prognostic prediction?
• Do we need counterfactual thinking?
Risk prediction
TRIPOD: Collins et al., Annals of Int Medicine, 2015. doi: 10.7326/M14-0697
Risk?
Risk = probability
Probability
How accurate is this point-of-care test?
image from: https://bit.ly/39LuajJ
Classical diagnostic test accuracy study
Patients suspected of target
condition
Reference standard for target
condition
Index test(s)
Domain
“Exposure”
“Outcome”
Classical diagnostic test accuracy study
Patients suspected of target
condition
Reference standard for target
condition
Index test(s)
Role of time?
Cross-sectional in nature: index test and reference standard (in principle) at
same point in time to test for target condition at that time point
Classical diagnostic test accuracy study
Patients suspected of target
condition
Reference standard for target
condition
Index test(s)
Comparator for index test?
None, study of accuracy does not require a comparison to another index test
Classical diagnostic test accuracy study
Patients suspected of target
condition
Reference standard for target
condition
Index test(s)
Confounding (bias)?
No need for (conditional) exchangeability to interpret accuracy; confounding
(bias) is not an issue
Classical diagnostic test accuracy study
Prevalence = (A+B)/(A+B+C+D)
Sensitivity = A/(A+B)
Specificity = D/(C+D)
Positive predictive value = A/(A+C)
Negative predictive value = D/(B+D)
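The five quantities above follow directly from the four cells of the 2×2 table. A minimal sketch using the slide's lettering (A = disease present & test positive, B = disease present & test negative, C = disease absent & test positive, D = disease absent & test negative; the example counts are made up):

```python
def accuracy_measures(A, B, C, D):
    """Diagnostic accuracy measures from a 2x2 table, matching the slide's formulas."""
    return {
        "prevalence":  (A + B) / (A + B + C + D),
        "sensitivity": A / (A + B),
        "specificity": D / (C + D),
        "ppv":         A / (A + C),
        "npv":         D / (B + D),
    }

# Hypothetical study: 100 diseased, 100 non-diseased patients
m = accuracy_measures(A=90, B=10, C=30, D=70)
```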
Probability
• Disease prevalence (Prev): Pr(Disease +)
• Sensitivity (Se): Pr(Test + | Disease +)
• Specificity (Sp): Pr(Test – | Disease –)
• Positive predictive value (PPV): Pr(Disease + | Test +)
• Negative predictive value (NPV): Pr(Disease – | Test –)
What is left and what is right from the “|” sign matters
All probabilities are conditional
• Some conditions go without saying (e.g. that the probability concerns human
individuals), others less so (e.g. prediction in primary vs secondary care)
• Things that are constant (e.g. setting) do not enter the notation
• There is no such thing as "the probability": context is everything
Small side step: the p-value
p-value*: Pr(Data|Hypothesis)
Is not: Pr(Hypothesis|Data)
Somewhat simplified, correct notation would be: Pr(T(X) ≥ x; hypothesis)
Small side step: the p-value
Pr(Death|Handgun)
= 5% to 20%*
Pr(Handgun|Death)
= 0.03%**
*from a New York Times article (nytimes.com, published April 3, 2008)
** from CBS StatLine (concerning deaths and registered gun crimes in 2015 in the Netherlands)
Bayes theorem
Pr(A|B) = Pr(B|A) Pr(A) / Pr(B)
Pr(A|B): probability of A occurring given B happened
Pr(B|A): probability of B occurring given A happened
Pr(A): probability of A occurring
Pr(B): probability of B occurring
Thomas Bayes
(1702-1761)
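Bayes' theorem is what links the column-wise probabilities (Se, Sp) to the row-wise ones (PPV, NPV) via the prevalence. A sketch with hypothetical numbers, illustrating that the same test gives very different predictive values in different settings:

```python
def ppv(se, sp, prev):
    # Pr(D+ | T+) = Pr(T+ | D+) Pr(D+) / Pr(T+)
    return se * prev / (se * prev + (1 - sp) * (1 - prev))

def npv(se, sp, prev):
    # Pr(D- | T-) = Pr(T- | D-) Pr(D-) / Pr(T-)
    return sp * (1 - prev) / (sp * (1 - prev) + (1 - se) * prev)

# Same test (Se = 0.90, Sp = 0.95), two hypothetical settings:
low  = ppv(0.90, 0.95, 0.01)   # rare disease: most positives are false positives
high = ppv(0.90, 0.95, 0.30)   # common disease: most positives are true positives
```

Context is everything: with 1% prevalence, a positive result still leaves the disease unlikely; with 30% prevalence, the same positive result makes it very likely.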
Bayesville
https://youtu.be/otdaJPVQIgg
In-class exercise – ClearBlue compact pregnancy test
• Calculate Prev, Se, Sp, NPV and PPV
• Re-calculate NPV assuming Prev of 10%, and again with 80%
• Make use of NPV = Sp*(1-Prev)/[(1-Se)*Prev + Sp*(1-Prev)]
In reality
• Performance of the Clearblue COMPACT pregnancy test was worse: 38 additional
results among pregnant women were ‘non-conclusive’
• The reference standard was a ‘trained study coordinator’ reading of the same test
Diagnostic test is simplest prediction model
• Nowcasting (instead of forecasting)
• Best available prediction of target disease status is test result
• Assuming no other relevant information is available
• Risk prediction (probability) for disease:
• PPV with positive test
• 1-NPV with negative test
Model development
Research design: aims
Point of intended use of the risk model
• Primary care (paper/computer/app)?
• Secondary care (bedside)?
• Low resource setting?
Complexity
• Number of predictors?
• Transparency of calculation?
• Should it be fast?
Research design: design of data collection
Prospective cohort study: measurement of predictors at baseline + follow-up until event
occurs (time-horizon)
Alternatives
• Randomized trials?
• Routine care data?
• Case-control?
Statistical models
• Regression models for binary/time-to-event outcomes
• Logistic regression
• Cox proportional hazards models (or parametric alternatives)
• Alternatives
• Multinomial and ordinal models
• Decision trees (and descendants)
• Neural networks
Regression model specification
f(X): linear predictor (lp)
lp = b0 + b1X1 + … + bPXP (only "main effects")
Logistic regression: Pr(Y = 1 | X1, …, XP) = 1/(1 + exp{−lp})
b0; intercept, important?
Intercept
Shifting the intercept (intercept − c, intercept, intercept + c) moves every predicted risk down or up accordingly
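A minimal sketch of the linear predictor and the logistic link, showing why the intercept matters: shifting it by ±c moves every individual's predicted risk down or up (coefficients and predictor values are made up for illustration):

```python
import math

def predict_risk(intercept, coefs, x):
    """Logistic regression: Pr(Y = 1 | x) = 1 / (1 + exp(-lp))."""
    lp = intercept + sum(b * xi for b, xi in zip(coefs, x))
    return 1.0 / (1.0 + math.exp(-lp))

coefs = [0.8, -0.5]     # hypothetical b1, b2
x = [1.2, 0.4]          # one individual's predictor values
c = 1.0
risks = [predict_risk(b0, coefs, x) for b0 in (-2.0 - c, -2.0, -2.0 + c)]
# risks are strictly increasing in the intercept: risks[0] < risks[1] < risks[2]
```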
Model predictive performance
Discrimination
• Sensitivity/specificity trade-off
• Arbitrary threshold choice → many possible sensitivity/specificity pairs
• All pairs in one graph: the ROC curve
• Area under the ROC curve: the probability that a random individual with the
event has a higher predicted probability than a random individual without the event
• Area under the ROC curve: the c-statistic (for logistic regression), which takes
values between 0.5 (no better than a coin flip) and 1.0 (perfect discrimination)
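The c-statistic can be computed directly from its probabilistic definition: the proportion of (event, non-event) pairs in which the event gets the higher predicted probability, with ties counted as one half. A sketch with made-up predictions:

```python
def c_statistic(pred_events, pred_nonevents):
    """AUC: probability a random event outranks a random non-event (ties = 0.5)."""
    pairs = concordant = 0.0
    for pe in pred_events:
        for pn in pred_nonevents:
            pairs += 1
            if pe > pn:
                concordant += 1
            elif pe == pn:
                concordant += 0.5
    return concordant / pairs

# Predicted probabilities for 3 patients with and 3 without the event
c = c_statistic([0.9, 0.6, 0.4], [0.5, 0.3, 0.1])   # 8 of 9 pairs concordant
```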
Calibration
https://bmcmedicine.biomedcentral.com/articles/10.1186/s12916-019-1466-7
Optimism
Predictive performance evaluations are too optimistic when estimated on the same data
in which the risk prediction model was developed; such an estimate is therefore called
the apparent performance of the model
Optimism can be large, especially in small datasets and with a large number of predictors
To get a better estimate of the predictive performance (more about this next week):
• Internal validation (same data sample)
• External validation (other data sample)
https://twitter.com/LesGuessing/status/997146590442799105
1955: Stein’s paradox
Stein’s paradox in words (rather simplified)
When one has three or more units (say, individuals), and for each unit one can calculate
an average score (say, average blood pressure), then the best guess of future
observations for each unit (say, blood pressure tomorrow) is NOT the average score.
1961: James-Stein estimator: the next Symposium
James and Stein. Estimation with quadratic loss. Proceedings of the fourth Berkeley symposium on
mathematical statistics and probability. Vol. 1. 1961.
1977: Baseball example
Squared error reduced from .077 to .022
Stein’s paradox
• Probably among the most surprising (and initially doubted) phenomena in statistics
• Now a large “family”: shrinkage estimators reduce prediction variance to an extent
that typically outweighs the bias that is introduced
• Bias/variance trade-off principle has motivated many statistical and machine learning
developments
Expected prediction error = irreducible error + bias² + variance
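The trade-off can be seen in a small simulation with a positive-part James-Stein-type estimator that shrinks each observed group mean toward the grand mean: the shrinkage introduces a little bias but cuts variance enough to lower total squared error (a sketch with simulated data, not the baseball example):

```python
import random

def james_stein(x, sigma2=1.0):
    """Positive-part James-Stein shrinkage toward the grand mean (needs len(x) >= 4)."""
    k = len(x)
    grand = sum(x) / k
    s = sum((xi - grand) ** 2 for xi in x)
    shrink = max(0.0, 1.0 - (k - 3) * sigma2 / s)
    return [grand + shrink * (xi - grand) for xi in x]

random.seed(1)
k, n_sim = 20, 200
mse_raw = mse_js = 0.0
for _ in range(n_sim):
    theta = [random.gauss(0, 1) for _ in range(k)]   # true unit means
    x = [random.gauss(t, 1) for t in theta]          # one noisy observation per unit
    js = james_stein(x)
    mse_raw += sum((xi - t) ** 2 for xi, t in zip(x, theta)) / k
    mse_js  += sum((ji - t) ** 2 for ji, t in zip(js, theta)) / k
mse_raw /= n_sim
mse_js  /= n_sim
# mse_js < mse_raw: the shrunken estimates are closer to the true means on average
```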
Simulate 100 times
Not just lucky
• 5% reduction in MSPE just by shrinkage estimator
• Van Houwelingen and le Cessie’s heuristic shrinkage factor
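The heuristic shrinkage factor fits in one line: ŝ = (χ²model − df) / χ²model, where χ²model is the model's likelihood-ratio statistic and df the number of estimated predictor effects; the regression coefficients are then multiplied by ŝ. A sketch with hypothetical numbers:

```python
def heuristic_shrinkage(model_chi2, df):
    """Van Houwelingen-le Cessie heuristic shrinkage factor: (chi2 - df) / chi2."""
    return (model_chi2 - df) / model_chi2

s = heuristic_shrinkage(model_chi2=30.0, df=5)   # 25/30, roughly 0.83
shrunken = [s * b for b in [0.8, -0.5, 1.2]]     # applied to (hypothetical) coefficients
```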
Heuristic argument for shrinkage
Overfitting
"Idiosyncrasies in the data are fitted rather than generalizable patterns. A
model may hence not be applicable to new patients, even when the setting of
application is very similar to the development setting."
Steyerberg (2009). Clinical Prediction Models.
Overfitting versus underfitting
To avoid overfitting…
Use large data (sample size / number of events) and pre-specify your analyses as much
as possible!
And:
• Be conservative when removing predictor variables
• Apply shrinkage methods
• Correct for optimism
EPV – rule of thumb
Events per variable (EPV) for logistic/survival models:
EPV = number of events (smallest outcome group) / number of candidate predictor variables
EPV ≥ 10 is a commonly used minimal criterion
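The criterion is a simple ratio; a minimal helper (illustrative only, given the criticism of this rule discussed next):

```python
def events_per_variable(n_events, n_nonevents, n_candidate_predictors):
    """EPV = events in the smallest outcome group / candidate predictor variables."""
    return min(n_events, n_nonevents) / n_candidate_predictors

# Hypothetical study: 80 events among 1000 patients, 10 candidate predictors
epv = events_per_variable(n_events=80, n_nonevents=920, n_candidate_predictors=10)
# epv = 8.0: below the (disputed) EPV >= 10 minimal criterion
```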
EPV – rule of dumb?
• EPV values for reliable selection of predictors from a larger set of
candidate predictors may be as large as 50
• Statistical simulation studies behind the minimal EPV rules are highly
heterogeneous and have serious shortcomings
New sample size proposals
Variable selection
• Selection unstable
• Selection and order of entry often overinterpreted
• Limited power to detect true effects
• Predictive ability suffers, ‘underfitting’
• Risk of false-positive associations
• Multiple testing, ‘overfitting’
• Inference biased
• P-values exaggerated; standard errors too small
• Estimated coefficients biased
• ‘testimation’
Selection with small sample size
Conditional probabilities are at the core of prediction
• Perfect or near-perfect predicting models?
Suspect!
• Proving that a probability model generates a wrong risk prediction?
Difficult!
When is a risk model ready for use?
Prediction model landscape
>110 models for prostate cancer (Shariat 2008)
>100 models for Traumatic Brain Injury (Perel 2006)
83 models for stroke (Counsell 2001)
54 models for breast cancer (Altman 2009)
43 models for type 2 diabetes (Collins 2011; Dieren 2012)
31 models for osteoporotic fracture (Steurer 2011)
29 models in reproductive medicine (Leushuis 2009)
26 models for hospital readmission (Kansagara 2011)
>25 models for length of stay in cardiac surgery (Ettema 2010)
>350 models for CVD outcomes (Damen 2016)
• Few prediction models are externally validated
• Predictive performance often poor
To explain or to predict?
Explanatory models
• Theory: interest in regression coefficients
• Testing and comparing existing causal theories
• e.g. aetiology of illness, effect of treatment
Predictive models
• Interest in (risk) predictions of future observations
• No concern about causality
• Concerns about overfitting and optimism
• e.g. prognostic or diagnostic prediction model
Descriptive models
• Capture the data structure
Shmueli, Statistical Science 2010, DOI: 10.1214/10-STS330
Problems in common (selection)
• Generalizability/transportability
• Missing values
• Model misspecification
• Measurement and misclassification error
Two hour tutorial to R (free): www.r-tutorial.nl
Repository of open datasets: http://mvansmeden.com/post/opendatarepos/
Conjugation, transduction and transformationConjugation, transduction and transformation
Conjugation, transduction and transformation
 
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptxSCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
 
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticsPulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
 
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
 
SAMASTIPUR CALL GIRL 7857803690 LOW PRICE ESCORT SERVICE
SAMASTIPUR CALL GIRL 7857803690  LOW PRICE  ESCORT SERVICESAMASTIPUR CALL GIRL 7857803690  LOW PRICE  ESCORT SERVICE
SAMASTIPUR CALL GIRL 7857803690 LOW PRICE ESCORT SERVICE
 
Connaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verified
Connaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verifiedConnaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verified
Connaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verified
 
Feature-aligned N-BEATS with Sinkhorn divergence (ICLR '24)
Feature-aligned N-BEATS with Sinkhorn divergence (ICLR '24)Feature-aligned N-BEATS with Sinkhorn divergence (ICLR '24)
Feature-aligned N-BEATS with Sinkhorn divergence (ICLR '24)
 
Introduction,importance and scope of horticulture.pptx
Introduction,importance and scope of horticulture.pptxIntroduction,importance and scope of horticulture.pptx
Introduction,importance and scope of horticulture.pptx
 

The basics of prediction modeling

  • 1. Prediction modeling Maarten van Smeden, Department of Clinical Epidemiology, Leiden University Medical Center, Leiden, Netherlands. Berlin, Advanced Methods in Health Data Sciences, Jan 16 2020
  • 2. 2
  • 4. 4 Cartoon of Jim Borgman, first published by the Cincinnati Enquirer and King Features Syndicate April 27 1997
  • 5. Cookbook review 5 Schoenfeld & Ioannidis, Am J Clin Nutr 2013, DOI: 10.3945/ajcn.112.047142 “We selected 50 common ingredients from random recipes of a cookbook”
  • 6. Cookbook review veal, salt, pepper spice, flour, egg, bread, pork, butter, tomato, lemon, duck, onion, celery, carrot, parsley, mace, sherry, olive, mushroom, tripe, milk, cheese, coffee, bacon, sugar, lobster, potato, beef, lamb, mustard, nuts, wine, peas, corn, cinnamon, cayenne, orange, tea, rum, raisin, bay leaf, cloves, thyme, vanilla, hickory, molasses, almonds, baking soda, ginger, terrapin 6 Schoenfeld & Ioannidis, Am J Clin Nutr 2013, DOI: 10.3945/ajcn.112.047142
  • 7. Studies relating the ingredients to cancer: 40/50 veal, salt, pepper spice, flour, egg, bread, pork, butter, tomato, lemon, duck, onion, celery, carrot, parsley, mace, sherry, olive, mushroom, tripe, milk, cheese, coffee, bacon, sugar, lobster, potato, beef, lamb, mustard, nuts, wine, peas, corn, cinnamon, cayenne, orange, tea, rum, raisin, bay leaf, cloves, thyme, vanilla, hickory, molasses, almonds, baking soda, ginger, terrapin 7 Schoenfeld & Ioannidis, Am J Clin Nutr 2013, DOI: 10.3945/ajcn.112.047142
  • 8. Increased/decreased risk of developing cancer: 36/40 veal, salt, pepper spice, flour, egg, bread, pork, butter, tomato, lemon, duck, onion, celery, carrot, parsley, mace, sherry, olive, mushroom, tripe, milk, cheese, coffee, bacon, sugar, lobster, potato, beef, lamb, mustard, nuts, wine, peas, corn, cinnamon, cayenne, orange, tea, rum, raisin, bay leaf, cloves, thyme, vanilla, hickory, molasses, almonds, baking soda, ginger, terrapin 8 Schoenfeld & Ioannidis, Am J Clin Nutr 2013, DOI: 10.3945/ajcn.112.047142
  • 9. 9 Credits to Peter Tennant for identifying this example
  • 10.
  • 11. To explain or to predict? Explanatory models • Theory: interest in regression coefficients • Testing and comparing existing causal theories • e.g. aetiology of illness, effect of treatment Predictive models • Interest in (risk) predictions of future observations • No concern about causality • Concerns about overfitting and optimism • e.g. prognostic or diagnostic prediction model Descriptive models • Capture the data structure 11 Shmueli, Statistical Science 2010, DOI: 10.1214/10-STS330
  • 12. To explain or to predict? Explanatory models • Theory: interest in regression coefficients • Testing and comparing existing causal theories • e.g. aetiology of illness, effect of treatment Predictive models • Interest in (risk) predictions of future observations • No concern about causality • Concerns about overfitting and optimism • e.g. prognostic or diagnostic prediction model Descriptive models • Capture the data structure 12 A L Y exposure outcome confounder Shmueli, Statistical Science 2010, DOI: 10.1214/10-STS330
  • 13. Causal effect estimate 13 What would have happened with a group of individuals had they received some treatment or exposure rather than another?
  • 14. Image sources: https://bit.ly/3a9wRMj https://bit.ly/2uEDRQJ
  • 15. Causal effect estimate 15 What would have happened with a group of individuals had they received some treatment or exposure rather than another?
  • 19. Observational study: diet -> diabetes, stratified by age

        Age          Traditional (No diabetes / Diabetes)   Exotic diet (No diabetes / Diabetes)   RR
        < 50 years   19 / 1                                 37 / 3                                 1.50
        ≥ 50 years   28 / 12                                12 / 8                                 1.33
        Total        47 / 13                                49 / 11                                0.88

        [Bar chart omitted: diabetes risk (10%–50%) by age group (< 50 years, ≥ 50 years, Total)]
        Numerical example adapted from Peter Tennant with permission: http://tiny.cc/ai6o8y
  • 20. Observational study: diet -> diabetes, stratified by weight loss

        Weight       Traditional (No diabetes / Diabetes)   Exotic diet (No diabetes / Diabetes)   RR
        Lost         19 / 1                                 37 / 3                                 1.50
        Gained       28 / 12                                12 / 8                                 1.33
        Total        47 / 13                                49 / 11                                0.88

        [Bar chart omitted: diabetes risk (10%–50%) by weight change (Lost wt, Gained wt, Total)]
        Numerical example adapted from Peter Tennant with permission: http://tiny.cc/ai6o8y
  • 21. 12 RCTs; 52 nutritional epidemiology claims 0/52 replicated 5/52 effect in the opposite direction 21 Young & Karr, Significance, 2011, DOI: 10.1111/j.1740-9713.2011.00506.x
  • 22. But… 22 Ellie Murray (Jul 13 2018): https://twitter.com/EpiEllie/status/1017622949799571456
  • 23. 23
  • 24. To explain or to predict? Explanatory models • Theory: interest in regression coefficients • Testing and comparing existing causal theories • e.g. aetiology of illness, effect of treatment Predictive models • Interest in (risk) predictions of future observations • No concern about causality • Concerns about overfitting and optimism • e.g. prognostic or diagnostic prediction model Descriptive models • Capture the data structure 24 Shmueli, Statistical Science 2010, DOI: 10.1214/10-STS330
  • 25. The “scientific value” of predictive modeling 25 1. Uncover potential new causal mechanisms and generation of new hypotheses 2. To discover and compare measures and operationalisations of constructs 3. Improving existing explanatory models by capturing underlying complex patterns 4. Reality check to theory: assessing distance between theory and practice 5. Compare competing theories by examining the predictive power 6. Generate knowledge of un-predictability Shmueli, Statistical Science 2010, DOI: 10.1214/10-STS330 (p292)
  • 26. Wells rule 26 Wells et al., Lancet, 1997. doi: 10.1016/S0140-6736(97)08140-3
  • 27. Apgar 27 Apgar, JAMA, 1958. doi: 10.1001/jama.1958.03000150027007
  • 30. Prediction Usual aim: to make accurate predictions … of a future outcome or presence of a disease … for an individual patient … generally based on >1 factor (predictor) why? • to inform decision making (about additional testing/treatment) • for counseling 30
  • 31. To explain or to predict? Explanatory models • Causality • Understanding the role of elements in complex systems • ”What will happen if….” Predictive models • Forecasting • Often, focus is on the performance of the forecasting • “What will happen ….” Descriptive models • “What happened?” 31 Require different research design and analysis choices • Confounding • Stein’s paradox • Estimators
  • 32. Risk estimation example: SCORE 32 Conroy, European Heart Journal, 2003. doi: 10.1016/S0195-668X(03)00114-3
  • 34. Risk prediction Risk prediction can be broadly categorized into: • Diagnostic: risk of a target disease being currently present vs not present • Prognostic: risk of a certain health state over a certain time period • Do we need a randomized controlled trial for diagnostic/prognostic prediction? • Do we need counterfactual thinking? 34
  • 35. Risk prediction 35 TRIPOD: Collins et al., Annals of Int Medicine, 2015. doi: 10.7326/M14-0697
  • 38. 38
  • 39. 39
  • 40. How accurate is this point-of-care test? 40 image from: https://bit.ly/39LuajJ
  • 41. Classical diagnostic test accuracy study Patients suspected of target condition Reference standard for target condition Index test(s) Domain “Exposure” “Outcome”
  • 42. Classical diagnostic test accuracy study Patients suspected of target condition Reference standard for target condition Index test(s)
  • 43. Classical diagnostic test accuracy study Patients suspected of target condition Reference standard for target condition Index test(s) Role of time? Cross-sectional in nature: index test and reference standard (in principle) at same point in time to test for target condition at that time point
  • 44. Classical diagnostic test accuracy study Patients suspected of target condition Reference standard for target condition Index test(s) Comparator for index test? None, study of accuracy does not require a comparison to another index test
  • 45. Classical diagnostic test accuracy study Patients suspected of target condition Reference standard for target condition Index test(s) Confounding (bias)? No need for (conditional) exchangability to interpret accuracy. Confounding (bias) is not an issue
  • 46. Classical diagnostic test accuracy study Prevalence = (A+B)/(A+B+C+D) Sensitivity = A/(A+B) Specificity = D/(C+D) Positive predictive value = A/(A+C) Negative predictive value = D/(B+D)
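The 2×2 formulas on this slide can be sketched directly in a few lines of Python, with A = true positive, B = false negative, C = false positive, D = true negative; the cell counts below are invented, purely for illustration:

```python
# Diagnostic accuracy measures from a 2x2 table.
# Cell layout matches the slide: A = true positive, B = false negative,
# C = false positive, D = true negative. Counts are hypothetical.

def accuracy_measures(a, b, c, d):
    n = a + b + c + d
    return {
        "prevalence": (a + b) / n,   # Pr(Disease +)
        "sensitivity": a / (a + b),  # Pr(Test + | Disease +)
        "specificity": d / (c + d),  # Pr(Test - | Disease -)
        "ppv": a / (a + c),          # Pr(Disease + | Test +)
        "npv": d / (b + d),          # Pr(Disease - | Test -)
    }

print(accuracy_measures(a=90, b=10, c=30, d=870))
```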
  • 47. Probability • Disease prevalence (Prev): Pr(Disease +) • Sensitivity (Se): Pr(Test + | Disease +) • Specificity (Sp): Pr(Test – | Disease –) • Positive predictive value (PPV): Pr(Disease + | Test +) • Negative predictive value (NPV): Pr(Disease – | Test –) What is left and what is right from the “|” sign matters
  • 48. All probabilities are conditional • Some conditions are given without saying (e.g. probability is about human individuals), others less so (e.g. prediction in first vs secondary care) • Things that are constant (e.g. setting) do not enter in notation • There is no such thing as “the probability”: context is everything
  • 49. Small side step: the p-value p-value*: Pr(Data|Hypothesis) Is not: Pr(Hypothesis|Data) Somewhat simplified, correct notation would be: Pr(T(X) ≥ x; hypothesis)
  • 50. Small side step: the p-value Pr(Death|Handgun) = 5% to 20%* Pr(Handgun|Death) = 0.03%** *from New York Times (http://www.nytimes.com article published: 2008/04/03/) ** from CBS StatLine (concerning deaths and registered gun crimes in 2015 in the Netherlands)
  • 51. Bayes theorem Pr(A|B) = Pr(B|A) × Pr(A) / Pr(B) — Pr(A|B): probability of A occurring given B happened; Pr(B|A): probability of B occurring given A happened; Pr(A): probability of A occurring; Pr(B): probability of B occurring. Thomas Bayes (1702-1761)
  • 53. In-class exercise – ClearBlue compact pregnancy test • Calculate Prev, Se, Sp, NPV and PPV • Re-calculate NPV assuming Prev of 10%, and again with 80% • Make use of NPV = Sp*(1-Prev)/[(1-Se)*Prev + Sp*(1-Prev)]
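The exercise's NPV formula is easy to sketch in Python, showing how NPV falls as prevalence rises; the Se and Sp values below are illustrative placeholders, not the ClearBlue study estimates:

```python
# NPV as a function of prevalence, using the formula from the exercise:
# NPV = Sp*(1-Prev) / [(1-Se)*Prev + Sp*(1-Prev)]
# Se and Sp values are invented for illustration.

def npv(se, sp, prev):
    return sp * (1 - prev) / ((1 - se) * prev + sp * (1 - prev))

for prev in (0.10, 0.50, 0.80):
    print(f"prevalence {prev:.0%}: NPV = {npv(se=0.95, sp=0.98, prev=prev):.3f}")
```

Running this makes the slide's point concrete: the same test (same Se, Sp) gives very different predictive values in populations with different prevalence.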
  • 54. In reality • Performance of the Clearblue COMPACT pregnancy test was worse: 38 additional results among pregnant women were ‘non-conclusive’ • The reference standard was a ‘trained study coordinator’ reading of the same test
  • 55. Diagnostic test is simplest prediction model • Nowcasting (instead of forecasting) • Best available prediction of target disease status is test result • Assuming no other relevant information is available • Risk prediction (probability) for disease: • PPV with positive test • 1-NPV with negative test
  • 57. Research design: aims Point of intended use of the risk model • Primary care (paper/computer/app)? • Secondary care (bedside)? • Low resource setting? Complexity • Number of predictors? • Transparency of calculation? • Should it be fast?
  • 58. Research design: design of data collection Prospective cohort study: measurement of predictors at baseline + follow-up until event occurs (time-horizon) Alternatives • Randomized trials? • Routine care data? • Case-control?
  • 59. Statistical models • Regression models for binary/time-to-event outcomes • Logistic regression • Cox proportional hazards models (or parametric alternatives) • Alternatives • Multinomial and ordinal models • Decision trees (and descendants) • Neural networks
  • 60. Regression model specification f(X): linear predictor (lp) lp = b0 + b1X1 + … + bPXP (only "main effects") Logistic regression: Pr(Y = 1 | X1 ,…,XP ) = 1/(1 + exp{-lp}) b0; intercept, important?
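A minimal Python sketch of this risk calculation, mapping the linear predictor lp through the logistic function; the intercept and coefficients are invented, purely to show the mechanics:

```python
import math

# Logistic regression risk prediction from a linear predictor:
# lp = b0 + b1*X1 + ... + bP*XP ;  Pr(Y=1|X) = 1/(1+exp(-lp))
# The intercept and coefficients here are hypothetical.

def predicted_risk(intercept, coefs, x):
    lp = intercept + sum(b * xi for b, xi in zip(coefs, x))
    return 1.0 / (1.0 + math.exp(-lp))

# e.g. two predictors: age (centred, in years) and a binary test result
risk = predicted_risk(intercept=-2.0, coefs=[0.05, 1.2], x=[10, 1])
print(round(risk, 3))  # 0.426
```

Note the role of the intercept: adding a constant c to b0 shifts every predicted risk up or down the logistic curve, which is why it matters for calibration.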
  • 61. Intercept Intercept = 0 - c, intercept = 0, intercept = 0 + c
  • 63. Discrimination • Sensitivity/specificity trade-off • Arbitrary choice of threshold → many possible sensitivity/specificity pairs • All pairs in 1 graph: ROC curve • Area under the ROC curve: probability that a random individual with the event has a higher predicted probability than a random individual without the event • Area under the ROC curve: the c-statistic (for logistic regression) takes on values between 0.5 (no better than a coin-flip) and 1.0 (perfect discrimination)
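The pairwise definition of the c-statistic above can be computed directly on toy predicted probabilities (ties counted as 1/2, a common convention):

```python
from itertools import product

# c-statistic as defined on the slide: the probability that a randomly chosen
# individual WITH the event has a higher predicted risk than a randomly chosen
# individual WITHOUT it. The predicted risks below are invented.

def c_statistic(risks_events, risks_nonevents):
    pairs = list(product(risks_events, risks_nonevents))
    concordant = sum(1.0 for e, ne in pairs if e > ne)
    ties = sum(0.5 for e, ne in pairs if e == ne)
    return (concordant + ties) / len(pairs)

print(c_statistic([0.9, 0.6, 0.4], [0.5, 0.3, 0.2]))  # 8/9 ≈ 0.889
```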
  • 65. Optimism Predictive performance evaluations are too optimistic when estimated on the same data where the risk prediction model was developed. This is therefore called apparent performance of the model Optimism can be large, especially in small datasets and with a large number of predictors To get a better estimate of the predictive performance (more about this next week): • Internal validation (same data sample) • External validation (other data sample)
  • 68. Stein’s paradox in words (rather simplified) When one has three or more units (say, individuals), and for each unit one can calculate an average score (say, average blood pressure), then the best guess of future observations for each unit (say, blood pressure tomorrow) is NOT the average score.
  • 69. 1961: James-Stein estimator: the next Symposium James and Stein. Estimation with quadratic loss. Proceedings of the fourth Berkeley symposium on mathematical statistics and probability. Vol. 1. 1961.
  • 70. 1977
  • 71. 1977: Baseball example Squared error reduced from .077 to .022
  • 72. Stein’s paradox • Probably among the most surprising (and initially doubted) phenomena in statistics • Now a large “family”: shrinkage estimators reduce prediction variance to an extent that typically outweighs the bias that is introduced • Bias/variance trade-off principle has motivated many statistical and machine learning developments Expected prediction error = irreducible error + bias2 + variance
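A minimal sketch of positive-part James–Stein shrinkage toward the grand mean, assuming a known unit variance; the observations are invented, and this is not the baseball calculation from the slides:

```python
# Positive-part James-Stein estimator for k unit means observed with an
# assumed known variance sigma2: each observed mean is pulled toward the
# grand mean by a common shrinkage factor. Numbers are hypothetical.

def james_stein(z, sigma2=1.0):
    k = len(z)
    grand = sum(z) / k
    s = sum((zi - grand) ** 2 for zi in z)  # spread around the grand mean
    if s == 0:                              # all observations equal: nothing to shrink
        return list(z)
    shrink = max(0.0, 1.0 - (k - 3) * sigma2 / s)  # "positive-part" factor
    return [grand + shrink * (zi - grand) for zi in z]

obs = [1.2, 0.4, -0.8, 2.1, -1.5, 0.3]
print(james_stein(obs))
```

The little spread there is around the grand mean relative to sigma2, the harder the estimates are shrunk — the bias/variance trade-off mentioned above in miniature.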
  • 73.
  • 74.
  • 75.
  • 76.
  • 77.
  • 78.
  • 79.
  • 80.
  • 81.
  • 82.
  • 84. Not just lucky • 5% reduction in MSPE just by shrinkage estimator • Van Houwelingen and le Cessie’s heuristic shrinkage factor
  • 87. Overfitting "Idiosyncrasies in the data are fitted rather than generalizable patterns. A model may hence not be applicable to new patients, even when the setting of application is very similar to the development setting." Steyerberg (2009). Clinical Prediction Models.
  • 89. To avoid overfitting… Large data (sample size / no. events) and to pre-specify your analyses as much as possible! And: • Be conservative when removing predictor variables • Apply shrinkage methods • Correct for optimism
  • 90. EPV – rule of thumb Events per variable (EPV) for logistic/survival models: EPV = number of events (smallest outcome group) / number of candidate predictor variables. EPV = 10 is a commonly used minimal criterion
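The EPV calculation as defined on the slide is a one-liner; the counts below are hypothetical:

```python
# Events per variable (EPV): events in the smallest outcome group divided by
# the number of candidate predictor variables. Counts are invented.

def epv(n_events, n_nonevents, n_candidate_predictors):
    return min(n_events, n_nonevents) / n_candidate_predictors

print(epv(n_events=80, n_nonevents=920, n_candidate_predictors=10))  # 8.0
# i.e. below the often-used (but disputed) minimum of EPV = 10
```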
  • 91. EPV – rule of dumb? • EPV values for reliable selection of predictors from a larger set of candidate predictors may be as large as 50 • Statistical simulation studies on the minimal EPV rules are highly heterogeneous and have serious methodological problems
  • 92. New sample size proposals
  • 93. Variable selection • Selection unstable • Selection and order of entry often overinterpreted • Limited power to detect true effects • Predictive ability suffers, ‘underfitting’ • Risk of false-positive associations • Multiple testing, ‘overfitting’ • Inference biased • P-values exaggerated; standard errors too small • Estimated coefficients biased • ‘testimation’
  • 94. Selection with small sample size
  • 95. Conditional probabilities are at the core of prediction • Perfect or near-perfect predicting models? Suspect! • Proving that a probability model generates a wrong risk prediction? Difficult!
  • 96. When is a risk model ready for use?
  • 97. Prediction model landscape >110 models for prostate cancer (Shariat 2008) >100 models for Traumatic Brain Injury (Perel 2006) 83 models for stroke (Counsell 2001) 54 models for breast cancer (Altman 2009) 43 models for type 2 diabetes (Collins 2011; Dieren 2012) 31 models for osteoporotic fracture (Steurer 2011) 29 models in reproductive medicine (Leushuis 2009) 26 models for hospital readmission (Kansagara 2011) >25 models for length of stay in cardiac surgery (Ettema 2010) >350 models for CVD outcomes (Damen 2016) • Few prediction models are externally validated • Predictive performance often poor 97
  • 98. To explain or to predict? Explanatory models • Theory: interest in regression coefficients • Testing and comparing existing causal theories • e.g. aetiology of illness, effect of treatment Predictive models • Interest in (risk) predictions of future observations • No concern about causality • Concerns about overfitting and optimism • e.g. prognostic or diagnostic prediction model Descriptive models • Capture the data structure 98 Shmueli, Statistical Science 2010, DOI: 10.1214/10-STS330
  • 99. Problems in common (selection) • Generalizability/transportability • Missing values • Model misspecification • Measurement and misclassification error 99
  • 100. 100
  • 101. Two hour tutorial to R (free): www.r-tutorial.nl Repository of open datasets: http://mvansmeden.com/post/opendatarepos/ 101