August 2023, Milan (Italy)
Sources of uncertainty in
clinical prediction models
Ben Van Calster, PhD
Dept Development & Regeneration, KU Leuven (B)
Dept Biomedical Data Sciences, LUMC Leiden (NL)
Thanks to Laure Wynants and Ewout Steyerberg
Validated = trustworthy?
• Classic prediction model validation is on population level:
• Discrimination (c stat)
• Calibration (curve)
• Clinical utility (net benefit)
• But patients focus on their own risk
• Is that trustworthy, even if the model is calibrated?
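As a minimal sketch of two of the population-level measures listed above, the c statistic and net benefit can be computed as follows. The toy outcomes, risks and the 30% decision threshold are illustrative assumptions, not values from the talk.

```python
def c_statistic(y, p):
    """Share of (event, non-event) pairs in which the event has the higher risk."""
    events = [pi for yi, pi in zip(y, p) if yi == 1]
    nonevents = [pi for yi, pi in zip(y, p) if yi == 0]
    score = 0.0
    for pe in events:
        for pn in nonevents:
            if pe > pn:
                score += 1.0
            elif pe == pn:
                score += 0.5  # ties count as half a concordant pair
    return score / (len(events) * len(nonevents))


def net_benefit(y, p, threshold):
    """Net benefit of intervening on everyone with risk at or above the threshold."""
    n = len(y)
    tp = sum(1 for yi, pi in zip(y, p) if pi >= threshold and yi == 1)
    fp = sum(1 for yi, pi in zip(y, p) if pi >= threshold and yi == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)


# Hypothetical toy data: four patients with observed outcome and predicted risk.
y = [0, 0, 1, 1]
p = [0.10, 0.40, 0.35, 0.80]
auc = c_statistic(y, p)
nb = net_benefit(y, p, threshold=0.30)
print(auc, nb)
```

Both numbers summarise the whole group: neither says how far the risk of any single patient can be trusted, which is the question the slide raises.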
Internal and external validation / validity
Sources of uncertainty
Mateo Dineen, “Elephants in the room”
Elephants in the room
Altman & Royston 2000: “We believe that the distinction between what is achievable at
the group and individual levels is not well understood.”
“Alternative models […], while demonstrating good agreement for describing
patients in the aggregate, are shown to differ considerably for individual patients.”
“Human survival is so uncertain that even the best statistical analysis cannot provide
single-number prediction of real use for individual patients.”
Does (true) individual risk exist?
• Concept of individual risk is debated
• Nonsense, not meaningful, … (Stern 2012, Sniderman 2015, people in 19th century)
• “reference class problem”
• Even then:
• Risk estimate may be seen as the level to which one is ‘willing to bet’ on
the existence/occurrence of the event (cf de Finetti; Nau 2001)
• How stable / uncertain is this ‘betting level’?
Sources of uncertainty
• Aleatory uncertainty
• Epistemic uncertainty
• Approximation/Estimation uncertainty
• Model(er) uncertainty
• Data uncertainty
• Population uncertainty (heterogeneity)
Hüllermeier & Waegeman. Machine Learning 2021;110:457-506.
Gruber et al. arXiv 2023 (statistician’s perspective)
Illustration: ovarian cancer diagnosis
• Data from 1133 patients from the U Hospital Leuven, 1999-2015
• Patients with one or more ovarian tumors scheduled for surgery
• Estimate risk of malignancy
• Main model:
• Standard logistic regression
• age (years), max lesion diameter (mm), proportion solid tissue, CA125
(IU/L), bilateral tumors (Y/N), papillations with blood flow (Y/N)
• First 4 predictors: rcs with 3 knots
• 10 parameters, 37% prevalence, assumed AUC of 0.88 (cf literature):
minimum required sample size 359 (Riley et al 2020)
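The arithmetic implied by these numbers can be checked in a few lines. This sketch does not implement the Riley et al sample size criteria; it only derives the expected number of events and the events per parameter from the figures on the slide.

```python
n_required = 359     # minimum sample size quoted on the slide
prevalence = 0.37    # outcome prevalence quoted on the slide
n_parameters = 10    # number of model parameters quoted on the slide

expected_events = n_required * prevalence
events_per_parameter = expected_events / n_parameters
print(round(expected_events), round(events_per_parameter, 1))
```

That is roughly 133 expected malignancies, or about 13 events per parameter.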
Illustration: ovarian cancer diagnosis
• Random test set of 100 patients, remaining 1033 is training pool
• We randomly select 385 cases from training pool
• First, impute missing CA125 values (31%) using regression imputation
• Train model
• Apply imputation model and prediction model to test set
• Calculate risks on test set
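The steps above can be sketched end to end. This is a hedged illustration, not the author's code: the data are simulated, the model uses only two predictors (age and log CA125) without splines, the imputation model regresses log CA125 on age alone, and every function name is hypothetical.

```python
import math
import random


def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # guard against overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def simulate(n, rng):
    """Synthetic stand-in rows: (age, log CA125 or None, malignant 0/1)."""
    rows = []
    for _ in range(n):
        age = rng.gauss(50, 15)
        log_ca125 = 3.0 + 0.02 * (age - 50) + rng.gauss(0, 1)
        risk = sigmoid(-0.6 + 0.03 * (age - 50) + 0.9 * (log_ca125 - 3.0))
        y = 1 if rng.random() < risk else 0
        if rng.random() < 0.31:  # about 31% missing, as on the slide
            log_ca125 = None
        rows.append((age, log_ca125, y))
    return rows


def fit_imputer(rows):
    """Regression imputation model: least squares of log CA125 on age."""
    complete = [(a, c) for a, c, _ in rows if c is not None]
    mean_a = sum(a for a, _ in complete) / len(complete)
    mean_c = sum(c for _, c in complete) / len(complete)
    sxy = sum((a - mean_a) * (c - mean_c) for a, c in complete)
    sxx = sum((a - mean_a) ** 2 for a, _ in complete)
    slope = sxy / sxx
    return mean_c - slope * mean_a, slope


def impute(rows, imputer):
    intercept, slope = imputer
    return [(a, intercept + slope * a if c is None else c, y) for a, c, y in rows]


def features(age, log_ca125):
    return (1.0, (age - 50) / 15, log_ca125 - 3.0)


def fit_logistic(rows, steps=300, lr=0.5):
    """Logistic regression fitted by plain gradient descent."""
    w = [0.0, 0.0, 0.0]
    for _ in range(steps):
        grad = [0.0, 0.0, 0.0]
        for a, c, y in rows:
            x = features(a, c)
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, x))) - y
            for j in range(3):
                grad[j] += err * x[j] / len(rows)
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


rng = random.Random(2023)
data = simulate(1133, rng)
rng.shuffle(data)
test, pool = data[:100], data[100:]   # 100 test patients, 1033 in the training pool
train = rng.sample(pool, 385)         # one training set of 385 cases
imputer = fit_imputer(train)          # imputation model learned on training data only
weights = fit_logistic(impute(train, imputer))
risks = [sigmoid(sum(wj * xj for wj, xj in zip(weights, features(a, c))))
         for a, c, _ in impute(test, imputer)]
print(len(test), len(pool), len(train), len(risks))
```

The point of the design is that both the imputation model and the prediction model are learned on the training sample and then applied unchanged to the test set.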
Illustration: ovarian cancer diagnosis
10 variations regarding modeling uncertainty:
• Nonlinearity based on first degree fractional polynomials (Sauerbrei et al 2007)
• Nonlinearity ignored (we don’t like it, but it happens a lot)
• Numerical variables dichotomized at median (stay calm)
• Backward elimination with alpha 1%
• Backward elimination with alpha 20%
• Penalized estimation based on AICc (Harrell 2015)
• Including interactions with CA125, backward elimination at alpha 20%
• Random forest model using very deep trees (min.node.size 2)
• Random forest model with less deep trees (min.node.size 20)
• Use 2 other categorical predictors (ordinal vascularization score, irregular walls)
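For the first variation, a first-degree fractional polynomial (FP1) replaces a positive predictor x by a single transformed term, with the power chosen from a conventional candidate set. The sketch below only shows the transformation; selecting the best-fitting power by refitting the model for each candidate is not shown, and the 40 mm diameter is a made-up value.

```python
import math

# Conventional FP1 candidate powers; 0 stands for the natural logarithm.
FP1_POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2, 3)


def fp1_transform(x, power):
    """First-degree fractional polynomial term for a positive value x."""
    if x <= 0:
        raise ValueError("fractional polynomials need positive x")
    return math.log(x) if power == 0 else x ** power


diameter_mm = 40.0  # hypothetical lesion diameter
candidates = {power: fp1_transform(diameter_mm, power) for power in FP1_POWERS}
print(candidates)
```

Power 1 leaves the predictor linear, so "nonlinearity ignored" is one special case of this family.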
Illustration: ovarian cancer diagnosis
3 variations regarding data uncertainty:
• Model tumor size and proportion solid tissue using estimated volumes
• Impute using missing indicator method
• Impute using median imputation conditional on outcome (stay calm)
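The two alternative imputation approaches can be sketched as follows. The CA125 values are invented, the fill value of 0 for the missing indicator method is an assumption (other constants are also used), and the function names are hypothetical.

```python
from statistics import median


def missing_indicator(values, fill=0.0):
    """Return (filled values, 0/1 missingness flags) for use as two predictors."""
    filled = [fill if v is None else v for v in values]
    flags = [1 if v is None else 0 for v in values]
    return filled, flags


def median_by_outcome(values, outcomes):
    """Fill each gap with the median of observed values sharing that outcome."""
    medians = {}
    for label in set(outcomes):
        observed = [v for v, y in zip(values, outcomes) if y == label and v is not None]
        medians[label] = median(observed)
    return [medians[y] if v is None else v for v, y in zip(values, outcomes)]


ca125 = [12.0, None, 30.0, None, 400.0]   # hypothetical values, None = missing
malignant = [0, 0, 0, 1, 1]
filled, flags = missing_indicator(ca125)
conditional = median_by_outcome(ca125, malignant)
print(filled, flags, conditional)
```

Imputing conditional on the outcome needs the outcome itself, which is unknown for a new patient at prediction time; that is one reason to treat this variation with caution.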
2 variations regarding population uncertainty
• Sample from training pool from hospital in Rome (N=1141)
• Sample from training pool from hospital in Malmö (N=1053)
Approximation uncertainty
• Repeat all of this 100 times (100 random training sets of 385 cases)
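The repetition step can be sketched with a deliberately tiny stand-in model. Everything here is an assumption made for illustration: the cohort is simulated, there is a single binary predictor, and the "model" is just the observed event proportion per predictor level.

```python
import random

rng = random.Random(1)


def draw_patient():
    """Synthetic patient: binary predictor x and binary outcome y."""
    x = 1 if rng.random() < 0.4 else 0
    y = 1 if rng.random() < (0.70 if x else 0.15) else 0
    return x, y


pool = [draw_patient() for _ in range(1033)]
test_x = [0, 1]  # two hypothetical test patients, one per predictor level


def fit(train):
    """Toy risk model: observed event proportion within each level of x."""
    model = {}
    for level in (0, 1):
        ys = [y for x, y in train if x == level]
        model[level] = sum(ys) / len(ys)
    return model


predictions = []  # one row of test-set risks per repetition
for _ in range(100):
    train = rng.sample(pool, 385)  # a fresh random training set of 385 cases
    model = fit(train)
    predictions.append([model[x] for x in test_x])

for j, x in enumerate(test_x):
    risks = [row[j] for row in predictions]
    print(f"x={x}: estimated risk from {min(risks):.2f} to {max(risks):.2f}")
```

Each test patient ends up with 100 risk estimates from the same modelling strategy; their spread is the approximation uncertainty for that patient.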
Approximation/instability (main model)
Mean AUC: 0.97 (range 0.96-0.98) (!)
Mean ECE: 0.06 (range 0.04-0.09)
Mean risk range:
20 %p (range 2-74)
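A hedged sketch of how such summaries could be computed. It assumes a common binned definition of the expected calibration error (the slides do not say which estimator was used) and reads "risk range" as, per test patient, the highest minus the lowest estimated risk across the fitted models, in percentage points. The toy numbers are made up.

```python
def expected_calibration_error(y, p, n_bins=10):
    """Binned ECE: weighted mean gap between predicted and observed risk."""
    bins = [[] for _ in range(n_bins)]
    for yi, pi in zip(y, p):
        bins[min(int(pi * n_bins), n_bins - 1)].append((yi, pi))
    total = 0.0
    for members in bins:
        if members:
            observed = sum(yi for yi, _ in members) / len(members)
            predicted = sum(pi for _, pi in members) / len(members)
            total += len(members) / len(y) * abs(predicted - observed)
    return total


def risk_ranges(predictions):
    """Per-patient max minus min risk across models, in percentage points."""
    per_patient = zip(*predictions)  # transpose: models x patients -> patients x models
    return [100 * (max(r) - min(r)) for r in per_patient]


# Hypothetical toy inputs.
y = [0, 1, 1, 0]
p = [0.1, 0.9, 0.8, 0.3]
ece = expected_calibration_error(y, p)

predictions = [[0.1, 0.5], [0.3, 0.4], [0.2, 0.9]]  # 3 models, 2 patients
ranges = risk_ranges(predictions)
mean_range = sum(ranges) / len(ranges)
print(ece, ranges, mean_range)
```

Under this reading, a mean risk range of 20 %p with a range of 2 to 74 would mean that some test patients get nearly identical risks from every training sample while others swing by tens of percentage points.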
Model uncertainty (1st train set)
AUC range 0.95-0.98
ECE range 0.06-0.10
Mean risk range:
21 %p (range 2-63)
Data uncertainty (1st train set)
AUC range 0.97-0.98
ECE range 0.06-0.07
Mean risk range:
6 %p (range 0.2-30)
Population uncertainty (1st train set)
AUC range 0.96-0.97
ECE range 0.06-0.07
Mean risk range:
12 %p (range 1-67)
All uncertainties
Mean AUC 0.97 (range 0.83-0.99)
Mean ECE 0.07 (range 0.02-0.24)
Mean risk range:
52 %p (range 5 to >99)
Conclusions
• Model predictions suffer from multiple sources of uncertainty
• Transparency: for policy makers / physicians / patients
• Risk communication, shared decision making?
• Quantifying uncertainty completely is impossible (multiverse too large)
• Reporting stability is good, but is only a part of uncertainty
• Riley’s sample size methodology is great, but only a start? (Riley 2020)
• Methods to abstain from prediction: idem (Myers 2020; Kompa 2021)
29-Aug-23
Conclusions
• Epistemic uncertainty: under the influence of the modeler
• Larger sample sizes, bias-variance considerations
• Domain knowledge about predictors
• Address heterogeneity between settings during development & validation
• IPD / multicenter (TRIPOD-Cluster: Debray 2023)
• Distribution and effects of covariates (Steyerberg 2019)
• Local versus global model: updating
Literature
• Lemeshow et al. Intens Care Med 1995. Competing models (diff pred).
• Henderson & Keiding. J Med Ethics 2005. Competing models (diff pred).
• Steyerberg et al. JCE 2005. Competing models (diff pred).
• Stern. arXiv 2012. Competing models (diff pred).
• Sniderman et al. JAMA 2015. Competing models (diff pred).
• Pate et al. BMC Med 2019. Modeling decisions (pred, time, region, impute)
• Meijerink et al. arXiv 2020.
• Myers et al. Npj Digit Med 2020. Tool to identify subgroups where risks cannot be trusted.
• Pate et al. Diagn Progn Res 2020. N and stability.
• Kompa et al. Npj Digit Med 2021. Overview of uncertainty quantification/abstention.
• Hüllermeier & Waegeman. Mach Learn 2021. Overview of uncertainty.
• Thomassen et al. ISCB2022/SMDM2023. Stability using effective N (x people like you)
• Riley & Collins. Biom J 2023. Stability (4 levels) and quantification.
• Ledger et al. medRxiv 2023. Uncertainty due to choice of algorithm.
• Gruber et al. arXiv 2023. Overview of uncertainty.

ISCB 2023 Sources of uncertainty b.pptx
