Is it causal, is it prediction or is it neither?

Maarten van Smeden, Department of Clinical Epidemiology,
Leiden University Medical Center, Leiden, Netherlands
Seminar Erasmus School of Health Policy & Management
June 24 2019

Nature's survey of 1,576 researchers
Nature news (May 25, 2016) https://www.nature.com/news/1-500-scientists-lift-the-lid-on-reproducibility-1.19970

Cookbook review
Schoenfeld & Ioannidis, Am J Clin Nutr 2013, DOI: 10.3945/ajcn.112.047142
“We selected 50 common ingredients from random
recipes of a cookbook”

Cookbook review
veal, salt, pepper spice, flour, egg, bread, pork, butter, tomato,
lemon, duck, onion, celery, carrot, parsley, mace, sherry, olive,
mushroom, tripe, milk, cheese, coffee, bacon, sugar, lobster,
potato, beef, lamb, mustard, nuts, wine, peas, corn, cinnamon,
cayenne, orange, tea, rum, raisin, bay leaf, cloves, thyme, vanilla,
hickory, molasses, almonds, baking soda, ginger, terrapin

Ingredients with studies relating them to cancer: 40/50

Increased/decreased risk of developing cancer: 36/40

Cartoon by Jim Borgman, first published by the Cincinnati Enquirer and King Features Syndicate, April 27, 1997

The American Statistician published 43 articles on statistical significance testing (vol. 73, 2019)

https://bit.ly/2KyLXxo (winner VWN publication prize for best science journalism article in 2018)
Read 19 peer-reviewed articles using data from Dutch cohort studies: 15 had serious limitations

https://www.volkskrant.nl/wetenschap/gezond-drinken-bestaat-toch-niet-ook-dat-ene-glaasje-per-dag-kun-je-beter-laten-staan~b9052f89/

Credits to Peter Tennant for identifying this example

To explain or to predict?
Explanatory models
• Theory: interest in regression coefficients
• Testing and comparing existing causal theories
• e.g. aetiology of illness, effect of treatment
Predictive models
• Interest in (risk) predictions of future observations
• No concern about causality
• Concerns about overfitting and optimism
• e.g. prognostic or diagnostic prediction model
Descriptive models
• Capture the data structure
Shmueli, Statistical Science 2010, DOI: 10.1214/10-STS330

Explanatory models
Predictive models
Descriptive models
[Diagram: exposure A, outcome Y, confounder L]

Causal effect estimate
What would have happened with a group of individuals had they
received some treatment or exposure rather than another?

Randomized clinical trials
exchangeability

Randomized clinical trials
[Diagram: exposure A, outcome Y, confounder L]

Observational (non-randomized) study
[Diagram: exposure A, outcome Y, confounder L]
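The contrast between the randomized and the observational setting can be sketched with a small simulation. Everything in it is an assumption made for illustration and not taken from the slides: the binary confounder L, the exposure probabilities, and a true causal risk difference fixed at 0.10. Under randomization, A is independent of L, so the naive comparison of exposed and unexposed recovers the true effect. When L influences exposure, the same naive comparison is biased (about 0.28 in expectation with these assumed parameters).

```python
import random

TRUE_EFFECT = 0.10  # assumed causal risk difference of A on Y


def simulate(n, randomized, seed=2019):
    """Return the naive risk difference (exposed minus unexposed).

    Hypothetical data-generating model: L is a binary confounder,
    A the binary exposure, Y the binary outcome.
    """
    rng = random.Random(seed)
    exposed = [0, 0]    # [number of individuals, number of events] with A = 1
    unexposed = [0, 0]  # same for A = 0
    for _ in range(n):
        l = 1 if rng.random() < 0.5 else 0
        if randomized:
            p_a = 0.5                    # coin flip: A independent of L
        else:
            p_a = 0.8 if l else 0.2      # L influences who gets exposed
        a = 1 if rng.random() < p_a else 0
        p_y = 0.10 + 0.30 * l + TRUE_EFFECT * a  # L raises risk, A adds 0.10
        y = 1 if rng.random() < p_y else 0
        group = exposed if a else unexposed
        group[0] += 1
        group[1] += y
    return exposed[1] / exposed[0] - unexposed[1] / unexposed[0]


rd_rct = simulate(200_000, randomized=True)
rd_obs = simulate(200_000, randomized=False)
print(f"true risk difference:  {TRUE_EFFECT:.3f}")
print(f"randomized, naive:     {rd_rct:.3f}")
print(f"observational, naive:  {rd_obs:.3f}")
```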

Observational study: diet -> diabetes, age
              Traditional diet          Exotic diet
Age           No diabetes  Diabetes     No diabetes  Diabetes     RR (exotic vs traditional)
< 50 years         19          1             37          3        1.50
≥ 50 years         28         12             12          8        1.33
Total              47         13             49         11        0.85
[Figure: diabetes risk (axis 10% to 50%) by diet, for < 50 years, ≥ 50 years, and total]
Numerical example adapted from Peter Tennant with permission: http://tiny.cc/ai6o8y
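The risk ratios for this example can be recomputed directly from the counts. The sketch below does so for each age stratum and for the collapsed (crude) table, where the totals give 11/13, about 0.85. It also computes a Mantel-Haenszel pooled risk ratio. That estimator is not part of the slides; it is included here as one standard way to adjust for a stratification variable, assuming age is a confounder.

```python
# Counts per diet and age stratum: (no diabetes, diabetes).
strata = {
    "< 50 years":  {"traditional": (19, 1),  "exotic": (37, 3)},
    ">= 50 years": {"traditional": (28, 12), "exotic": (12, 8)},
}


def risk(no_event, event):
    """Proportion of individuals with the event."""
    return event / (no_event + event)


def risk_ratio(exotic, traditional):
    """Risk in the exotic diet group relative to the traditional diet group."""
    return risk(*exotic) / risk(*traditional)


def collapse(diet):
    """Sum the counts for one diet over all age strata."""
    return tuple(sum(cells[diet][i] for cells in strata.values()) for i in range(2))


stratum_rr = {
    name: risk_ratio(cells["exotic"], cells["traditional"])
    for name, cells in strata.items()
}
crude_rr = risk_ratio(collapse("exotic"), collapse("traditional"))

# Mantel-Haenszel pooled risk ratio (an addition for illustration).
numerator = denominator = 0.0
for cells in strata.values():
    n_exotic = sum(cells["exotic"])
    n_traditional = sum(cells["traditional"])
    n_stratum = n_exotic + n_traditional
    numerator += cells["exotic"][1] * n_traditional / n_stratum
    denominator += cells["traditional"][1] * n_exotic / n_stratum
mh_rr = numerator / denominator

for name, rr in stratum_rr.items():
    print(f"{name}: RR = {rr:.2f}")
print(f"crude: RR = {crude_rr:.2f}")
print(f"Mantel-Haenszel: RR = {mh_rr:.2f}")
```

Within each age group the exotic diet is associated with a higher risk, while the crude comparison points the other way. The reversal arises because the older, higher-risk group mostly follows the traditional diet.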

Observational study: diet -> diabetes, weight loss
              Traditional diet          Exotic diet
Weight        No diabetes  Diabetes     No diabetes  Diabetes     RR (exotic vs traditional)
Lost               19          1             37          3        1.50
Gained             28         12             12          8        1.33
Total              47         13             49         11        0.85
[Figure: diabetes risk (axis 10% to 50%) by diet, for lost weight, gained weight, and total]
Numerical example adapted from Peter Tennant with permission: http://tiny.cc/ai6o8y

12 RCTs tested 52 claims from nutritional epidemiology
0/52 replicated
5/52 showed an effect in the opposite direction
Young & Karr, Significance, 2011, DOI: 10.1111/j.1740-9713.2011.00506.x

But…
Ellie Murray (Jul 13 2018): https://twitter.com/EpiEllie/status/1017622949799571456

Explanatory models
Predictive models
Descriptive models

Apgar
Apgar, JAMA, 1958. doi: 10.1001/jama.1958.03000150027007

Risk estimation example: SCORE
Conroy, European Heart Journal, 2003. doi: 10.1016/S0195-668X(03)00114-3

https://twitter.com/LesGuessing/status/997146590442799105

1961
James & Stein. Proceedings of the fourth Berkeley symposium on mathematical statistics and probability. Vol. 1. 1961.

Efron & Morris Scientific American, 1977

Efron & Morris Scientific American, 1977
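Stein's paradox can be illustrated with a short simulation. The sketch below assumes ten independent normal observations with known unit variance and hypothetical true means, and it shrinks toward zero. Efron & Morris's baseball example instead shrinks toward the grand mean, so this is a simplified variant rather than their analysis. The quantity compared is the total squared error summed over all ten means, averaged across replications.

```python
import random


def james_stein(x):
    """James-Stein estimate for observations x with unit variance, shrinking toward zero."""
    p = len(x)
    norm_sq = sum(v * v for v in x)
    factor = 1.0 - (p - 2) / norm_sq
    return [factor * v for v in x]


def squared_error(estimate, truth):
    """Total squared error summed over all coordinates."""
    return sum((e - t) ** 2 for e, t in zip(estimate, truth))


rng = random.Random(1961)
theta = [0.2 * i for i in range(10)]  # hypothetical true means
reps = 5000

mle_loss = js_loss = 0.0
for _ in range(reps):
    x = [rng.gauss(t, 1.0) for t in theta]  # one noisy observation per mean
    mle_loss += squared_error(x, theta)     # the observations themselves as estimates
    js_loss += squared_error(james_stein(x), theta)

mle_risk = mle_loss / reps
js_risk = js_loss / reps
print(f"average total squared error, unshrunken: {mle_risk:.2f}")
print(f"average total squared error, James-Stein: {js_risk:.2f}")
```

The shrunken estimates have lower total squared error on average, even though each coordinate on its own is estimated with bias. This is the sense in which prediction-oriented estimators and unbiased effect estimators can pull in different directions.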

Prediction model landscape
>110 models for prostate cancer (Shariat 2008)
>100 models for Traumatic Brain Injury (Perel 2006)
83 models for stroke (Counsell 2001)
54 models for breast cancer (Altman 2009)
43 models for type 2 diabetes (Collins 2011; Dieren 2012)
31 models for osteoporotic fracture (Steurer 2011)
29 models in reproductive medicine (Leushuis 2009)
26 models for hospital readmission (Kansagara 2011)
>25 models for length of stay in cardiac surgery (Ettema 2010)
>350 models for CVD outcomes (Damen 2016)
• Few prediction models are externally validated
• Predictive performance often poor
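One commonly cited contributor to disappointing performance at validation is optimism: a model evaluated on the data used to develop it looks better than it does on new data. The sketch below makes this visible with a deliberately overfit model, a one-nearest-neighbour classifier, on simulated data. The data-generating process, the sample sizes, and the choice of classifier are all assumptions for illustration and do not correspond to any model cited above.

```python
import random


def make_data(n, rng):
    """Hypothetical data: one predictor x, binary outcome with risk 0.7 if x > 0, else 0.3."""
    data = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        outcome_risk = 0.7 if x > 0 else 0.3
        data.append((x, rng.random() < outcome_risk))
    return data


def predict_1nn(train, x):
    """Predict the outcome of the closest development observation."""
    return min(train, key=lambda row: abs(row[0] - x))[1]


def accuracy(train, data):
    """Proportion of observations in data whose outcome is predicted correctly."""
    return sum(predict_1nn(train, x) == y for x, y in data) / len(data)


rng = random.Random(43)
development = make_data(200, rng)
validation = make_data(2000, rng)  # new data from the same process

apparent = accuracy(development, development)  # evaluated on the development data
validated = accuracy(development, validation)  # evaluated on new data
optimism = apparent - validated
print(f"apparent accuracy:  {apparent:.2f}")
print(f"validated accuracy: {validated:.2f}")
print(f"optimism:           {optimism:.2f}")
```

Each development observation is its own nearest neighbour, so apparent accuracy is perfect by construction, while accuracy on new data is far lower. Real prediction models overfit less drastically, but the gap between apparent and validated performance has the same origin.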

Explanatory models
Predictive models
Descriptive models

Explanatory models
• Causality
• Understanding the role of elements in complex systems
• “What will happen if …?”
Predictive models
• Forecasting
• Often, focus is on the performance of the forecasting
• “What will happen?”
Descriptive models
• “What happened?”
Require different research design and analysis choices:
• Confounding
• Stein’s paradox
• Estimators

Problems in common (selection)
• Generalizability/transportability
• Missing values
• Model misspecification
• Measurement and misclassification error
Preprint: https://osf.io/msx8d/

Two-hour R tutorial (free): www.r-tutorial.nl
Repository of open datasets: http://mvansmeden.com/post/opendatarepos/