QUANTIFYING THE IMPACT OF DIFFERENT
APPROACHES FOR HANDLING CONTINUOUS
PREDICTORS ON THE PERFORMANCE OF A
PROGNOSTIC MODEL
Gary Collins, Emmanuel Ogundimu, Jonathan Cook,
Yannick Le Manach, Doug Altman
Centre for Statistics in Medicine
University of Oxford
20-July-2016
gary.collins@csm.ox.ac.uk
Outline
▪ Existing guidance
▪ What’s done in practice?
▪ Brief overview of the study sample & simulation set-up
▪ Findings & Discussion
Basis of this presentation
Not a new idea…
It’s all in the title…(1994-2006)
1. Problems in dichotomizing continuous variables (Altman 1994)
2. Dangers of using "optimal" cutpoints in the evaluation of prognostic
factors. (Altman et al 1994)
3. How bad is categorization? (Weinberg; 1995)
4. Seven reasons why you should NOT categorize continuous data
(Dinero; 1996)
5. Breaking Up is Hard to Do: The Heartbreak of Dichotomizing
Continuous Data (Streiner; 2002)
6. Negative consequences of dichotomizing continuous predictor
variables (Irwin & McClelland; 2003)
7. Why carve up your continuous data? (Owen 2005)
8. Chopped liver? OK. Chopped data? Not OK (Butts & Ng 2005)
9. Categorizing continuous variables resulted in different
predictors in a prognostic model for nonspecific neck pain
(Schellingerhout et al 2006)
It’s all in the title…(2006-2014)
10. Dichotomizing continuous predictors in multiple regression: a bad idea
(Royston et al 2006)
11. The cost of dichotomising continuous variables (Altman & Royston; 2006)
12. Leave 'em alone - why continuous variables should be analyzed as such
(van Walraven & Hart; 2008)
13. Dichotomization of continuous data--a pitfall in prognostic factor studies
(Metze; 2008)
14. Analysis by categorizing or dichotomizing continuous variables is
inadvisable: an example from the natural history of unruptured aneurysms
(Naggara et al 2011)
15. Against quantiles: categorization of continuous variables in epidemiologic
research, and its discontents (Bennette & Vickers; 2012)
16. Dichotomizing continuous variables in statistical analysis: a practice to
avoid (Dawson & Weiss; 2012)
17. The danger of dichotomizing continuous variables: A visualization (Kuss
2013)
18. The “anathema” of arbitrary categorization of continuous predictors
(Vintzileos et al; 2014)
19. Ophthalmic statistics note: the perils of dichotomising continuous variables
(Cumberland et al 2014)
Prognostic factor (PF)
[Figure: a prognostic factor split at a cut-point into "PF not present (low risk)" and "PF present (high risk)", with labels A, B, C; annotated "Biologically implausible"]
“Convoluted Reasoning and Anti-intellectual Pomposity” (“C.R.A.P”)
(Norman & Streiner; Biostatistics: the Bare Essentials, 2008)
Slide adapted from Michael Babyak (‘Modeling with Observational Data’)
Still, what happens in practice…?
▪ Breast cancer models (Altman 2009)
– Categorised some/all: 34/53 (64%)
▪ Diabetes models (Collins et al 2011)
– Categorised some/all: 21/43 (49%)
▪ General medical journals (Bouwmeester et al 2012)
– Categorised: 30/64 (47%)
– Dichotomised: 21/64 (33%)
▪ Cancer models (Mallett et al 2010)
– All categorised/dichotomised: 24/47 (51%)
Aim of the study
▪ Investigate the impact of different approaches for
handling continuous predictors on the
– apparent performance (same data)
– validation performance (different data; geographical validation)
▪ Investigate the influence of sample size on the
approach for handling continuous predictors
Sample characteristics (THIN)
Development: 80,800 CVD events; 7721 hip fractures
Validation: 4688 CVD events; 565 hip fractures
Models
▪ Cox models to predict
– 10-year risk of CVD (men & women)
– 10-year risk of hip fracture (women only)
▪ CVD model contained 7 predictors
– Age, sex, family history, cholesterol, SBP, BMI, hypertension
▪ Hip fracture model contained 5 predictors
– Age, BMI, Townsend score, asthma, antidepressants
Resampling strategy
▪ MODEL DEVELOPMENT
– Number of events in each sample fixed at
25, 50, 100, and 2000 events
– Samples were drawn from those with and without the event
(separately)
– 200 samples randomly drawn (with replacement)
▪ MODEL VALIDATION
– All available data were used
• CVD: n=110,934 (4688 CVD events)
• Hip fracture: n=61,563 (565 hip fractures)
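The development resampling described above could be sketched as follows. This is a minimal illustration on simulated data, not the authors' code: the function name, the toy cohort, and the `n_nonevents` argument are assumptions, since the slides do not say how the number of non-events per sample was chosen.

```python
import numpy as np

def draw_development_sample(event, n_events, n_nonevents, rng):
    """Row indices for one development sample with a fixed number of events.

    Events and non-events are sampled separately, with replacement.
    n_nonevents is a hypothetical parameter; the slides do not say how it was set.
    """
    event = np.asarray(event, dtype=bool)
    event_rows = np.flatnonzero(event)
    nonevent_rows = np.flatnonzero(~event)
    picked_events = rng.choice(event_rows, size=n_events, replace=True)
    picked_nonevents = rng.choice(nonevent_rows, size=n_nonevents, replace=True)
    return np.concatenate([picked_events, picked_nonevents])

# Toy cohort (simulated, not THIN data): roughly 5% of 5000 people have the event.
rng = np.random.default_rng(2016)
event = rng.random(5000) < 0.05

# 200 development samples, each with exactly 25 events and 475 non-events.
samples = [draw_development_sample(event, 25, 475, rng) for _ in range(200)]
```

Each element of `samples` indexes the rows used to fit one model; validation would then use all rows of a separate dataset.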
Approaches considered
▪ Dichotomised at the
– Median predictor value
– ‘Optimal’ cut-point based on the logrank test
▪ Categorised into
– 3 groups (using tertile predictor values)
– 4 groups (using quartile predictor values)
– 5 groups (using quintile predictor values)
– 5-year age categories
– 10-year age categories
▪ Linear relationship
▪ Nonlinear relationship
– Fractional polynomials (FP2; 4 degrees of freedom per predictor)
– Restricted cubic splines (3 knots)
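A few of the approaches listed above can be sketched in code. The example below is illustrative only and uses simulated ages: the function names are hypothetical, and placing the three spline knots at the 10th, 50th and 90th percentiles is an assumption (the slides state only "3 knots"). The logrank-based cut-point search and fractional polynomials are not shown.

```python
import numpy as np

def dichotomise_at_median(x):
    """1 if the value is above the sample median, else 0."""
    x = np.asarray(x, dtype=float)
    return (x > np.median(x)).astype(int)

def categorise_by_quantiles(x, n_groups):
    """Group labels 0..n_groups-1, cut at equally spaced sample quantiles."""
    x = np.asarray(x, dtype=float)
    probs = np.linspace(0.0, 1.0, n_groups + 1)[1:-1]
    cuts = np.quantile(x, probs)
    # A value equal to a cut-point falls in the lower group.
    return np.searchsorted(cuts, x, side="left")

def rcs_basis_3_knots(x, knots):
    """Restricted cubic spline basis with 3 knots: x plus one nonlinear term.

    The nonlinear term is zero below the first knot and linear above the last.
    """
    x = np.asarray(x, dtype=float)
    t1, t2, t3 = knots

    def pos3(u):
        return np.maximum(u, 0.0) ** 3

    nonlinear = (
        pos3(x - t1)
        - pos3(x - t2) * (t3 - t1) / (t3 - t2)
        + pos3(x - t3) * (t2 - t1) / (t3 - t2)
    ) / (t3 - t1) ** 2
    return np.column_stack([x, nonlinear])

# Simulated ages (not study data).
rng = np.random.default_rng(1)
age = rng.uniform(30.0, 85.0, size=1000)

above_median = dichotomise_at_median(age)        # 2 groups
tertile_group = categorise_by_quantiles(age, 3)  # 3 groups
quintile_group = categorise_by_quantiles(age, 5) # 5 groups

knots = np.quantile(age, [0.10, 0.50, 0.90])     # assumed knot placement
spline = rcs_basis_3_knots(age, knots)           # columns to enter in the model
```

The dichotomised and categorised versions discard within-group variation in age, whereas the spline columns keep the full continuous information at the cost of one extra coefficient.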
Performance measures calculated
▪ Calibration
– Calibration plot
– Harrell’s “val.surv” function; hazard regression with linear
splines
▪ Discrimination
– Harrell’s c-index
▪ Clinical utility
– Decision curve analysis (Vickers & Elkin 2006)
– Net benefit
• weighted difference between true positives and false positives
▪ D-statistic, Brier score and R-squared also examined
– Not reported here, but in the supplementary material of
Collins et al Stat Med 2016.
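As a reminder of what the discrimination measure above counts, here is a minimal sketch of Harrell's c-index for right-censored data. It is a simplified illustration, not the implementation used in the study: pairs with tied follow-up times are ignored, and the function name and toy data are made up.

```python
import numpy as np

def harrell_c(time, event, risk):
    """Simplified Harrell's c-index.

    A pair is usable when the subject with the shorter follow-up had the event.
    It is concordant when that subject also has the higher predicted risk;
    ties in predicted risk count as half. Tied times are ignored here.
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    risk = np.asarray(risk, dtype=float)
    concordant = 0.0
    usable = 0
    for i in np.flatnonzero(event):
        later = time > time[i]  # subjects known to outlast subject i
        usable += int(later.sum())
        concordant += float((risk[i] > risk[later]).sum())
        concordant += 0.5 * float((risk[i] == risk[later]).sum())
    return concordant / usable

# Toy data: 5 subjects, 3 events, 2 censored.
time = [2, 4, 6, 8, 10]
event = [1, 1, 0, 1, 0]
risk = [0.9, 0.3, 0.5, 0.6, 0.1]

c_index = harrell_c(time, event, risk)  # 6 concordant of 8 usable pairs = 0.75
```

A value of 0.5 corresponds to no discrimination and 1.0 to perfect ranking of subjects by risk.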
Net benefit (recap)
▪ pt is the probability threshold to denote ‘high
risk’
– Used to weight the FP and FN results
▪ TP and FP calculated using Kaplan-Meier
estimates of the percentage surviving at 10
years among those with predicted risks
greater than pt
▪ Bottom line: model with highest NB ‘wins’
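The recap above can be made concrete with the standard decision-curve formula, NB = TP/n − (FP/n) × pt/(1 − pt), with TP and FP taken from a Kaplan-Meier estimate among those classified as high risk. The sketch below is an illustration under those assumptions, using made-up data and hypothetical function names; it is not the code used for the study.

```python
import numpy as np

def km_survival(time, event, horizon):
    """Kaplan-Meier estimate of the probability of surviving past `horizon`."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    surv = 1.0
    for t in np.unique(time[event & (time <= horizon)]):
        at_risk = (time >= t).sum()
        failures = (event & (time == t)).sum()
        surv *= 1.0 - failures / at_risk
    return surv

def net_benefit(time, event, risk, pt, horizon=10.0):
    """Net benefit at threshold pt for a time-to-event outcome."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    risk = np.asarray(risk, dtype=float)
    high = risk > pt                      # classified as 'high risk'
    if not high.any():
        return 0.0
    p_high = high.mean()
    s_high = km_survival(time[high], event[high], horizon)
    tp = (1.0 - s_high) * p_high          # true positives per person
    fp = s_high * p_high                  # false positives per person
    return tp - fp * pt / (1.0 - pt)

# Toy data: 10 subjects, 3 events before 10 years, the rest censored at 12 years.
time = [1, 3, 5, 12, 12, 12, 12, 12, 12, 12]
event = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
risk = [0.8, 0.6, 0.15, 0.3, 0.25, 0.1, 0.1, 0.05, 0.05, 0.05]

nb_model = net_benefit(time, event, risk, pt=0.2)                  # 0.15 here
nb_treat_all = net_benefit(time, event, np.ones(10), pt=0.2)       # 0.125 here
```

In this toy example the model has the higher net benefit at pt = 0.2, so it would "win" over treating everyone at that threshold; a decision curve repeats the calculation over a range of pt values.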
Age & CVD
Total serum cholesterol & CVD
Age, cholesterol, BMI, SBP & CVD
Age, BMI & Hip fracture
RESULTS: CVD 25 events
RESULTS: CVD 50 events
RESULTS: CVD 100 events
RESULTS: CVD 1000 events
RESULTS: Hip fracture 25 events
RESULTS: Hip fracture 50 events
RESULTS: Hip fracture 100 events
RESULTS: Hip fracture 1000 events
RESULTS: Discrimination CVD
▪ At small sample sizes (25 events)
– Large difference between apparent performance and
validation performance for ‘optimal’ dichotomisation
• 0.84 (apparent); 0.72 (validation)
– Smaller differences observed for FP/RCS/Linear
• 0.84 (apparent); 0.78 (validation)
▪ Observed difference between dichotomisation (at
the median) and linear/FP/RCS
– Apparent performance: difference of 0.05
– Validation performance: difference of 0.05
– Observed over all 4 sample sizes examined
▪ Negligible differences between linear/FP/RCS
RESULTS: Discrimination Hip Fracture
▪ At small sample sizes (25 events)
– Large difference between apparent performance and
validation performance for ‘optimal’ dichotomisation
• 0.86 (apparent); 0.76 (validation)
– FP/RCS/Linear
• 0.90 (apparent); 0.87 (validation)
▪ Observed difference between dichotomisation (at
the median) and linear/FP/RCS
– Apparent performance: difference of 0.1
– Validation performance: difference of 0.1
– Observed over all 4 sample sizes examined
▪ Negligible differences between linear/FP/RCS
RESULTS: Discrimination Hip Fracture
RESULTS: Decision Curve Analysis
(CVD only) [higher NB better model]
[Figure labels: FP/RCS; dichotomisation]
RESULTS: Net cases found per 1000
Conclusions
▪ Systematic reviews show that dichotomising /
categorising continuous predictors is routinely done
when developing a prediction model
▪ Dichotomising, either at the median or at an ‘optimal’
predictor value, leads to models with substantially
poorer performance
– Poor discrimination; poor calibration; poor clinical utility
▪ Large discrepancies between apparent performance
and validation performance observed for ‘optimal’
split dichotomising
▪ The impact of how continuous predictors
are handled is more pronounced at smaller sample
sizes
Microteaching on terms used in filtration .Pharmaceutical Engineering
 
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptxLIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
 
TOPIC 8 Temperature and Heat.pdf physics
TOPIC 8 Temperature and Heat.pdf physicsTOPIC 8 Temperature and Heat.pdf physics
TOPIC 8 Temperature and Heat.pdf physics
 
Grafana in space: Monitoring Japan's SLIM moon lander in real time
Grafana in space: Monitoring Japan's SLIM moon lander  in real timeGrafana in space: Monitoring Japan's SLIM moon lander  in real time
Grafana in space: Monitoring Japan's SLIM moon lander in real time
 

QUANTIFYING THE IMPACT OF DIFFERENT APPROACHES FOR HANDLING CONTINUOUS PREDICTORS ON THE PERFORMANCE OF A PROGNOSTIC MODEL

  • 1. QUANTIFYING THE IMPACT OF DIFFERENT APPROACHES FOR HANDLING CONTINUOUS PREDICTORS ON THE PERFORMANCE OF A PROGNOSTIC MODEL Gary Collins, Emmanuel Ogundimu, Jonathan Cook, Yannick Le Manach, Doug Altman Centre for Statistics in Medicine University of Oxford 20-July-2016 gary.collins@csm.ox.ac.uk
  • 2. Outline: Existing guidance; What’s done in practice?; Brief overview of the study sample & simulation set-up; Findings & Discussion
  • 3. Basis of this presentation
  • 4. Not a new idea…
  • 5. It’s all in the title…(1994-2006) 1. Problems in dichotomizing continuous variables (Altman 1994) 2. Dangers of using "optimal" cutpoints in the evaluation of prognostic factors (Altman et al 1994) 3. How bad is categorization? (Weinberg 1995) 4. Seven reasons why you should NOT categorize continuous data (Dinero 1996) 5. Breaking Up is Hard to Do: The Heartbreak of Dichotomizing Continuous Data (Streiner 2002) 6. Negative consequences of dichotomizing continuous predictor variables (Irwin & McClelland 2003) 7. Why carve up your continuous data? (Owen 2005) 8. Chopped liver? OK. Chopped data? Not OK (Butts & Ng 2005) 9. Categorizing continuous variables resulted in different predictors in a prognostic model for nonspecific neck pain (Schellingerhout et al 2006)
  • 6. It’s all in the title…(2006-2014) 10. Dichotomizing continuous predictors in multiple regression: a bad idea (Royston et al 2006) 11. The cost of dichotomising continuous variables (Altman & Royston 2006) 12. Leave 'em alone - why continuous variables should be analyzed as such (van Walraven & Hart 2008) 13. Dichotomization of continuous data--a pitfall in prognostic factor studies (Metze 2008) 14. Analysis by categorizing or dichotomizing continuous variables is inadvisable: an example from the natural history of unruptured aneurysms (Naggara et al 2011) 15. Against quantiles: categorization of continuous variables in epidemiologic research, and its discontents (Bennette & Vickers 2012) 16. Dichotomizing continuous variables in statistical analysis: a practice to avoid (Dawson & Weiss 2012) 17. The danger of dichotomizing continuous variables: A visualization (Kuss 2013) 18. The “anathema” of arbitrary categorization of continuous predictors (Vintzileos et al 2014) 19. Ophthalmic statistics note: the perils of dichotomising continuous variables (Cumberland et al 2014)
  • 7. [Figure: prognostic factor (PF) scale with points A, B and C and a cut-point separating "PF not present (low risk)" from "PF present (high risk)"; annotation: biologically implausible.] Slide adapted from Michael Babyak (‘Modeling with Observational Data’)
  • 8. [Figure: same prognostic factor (PF) scale with points A, B and C and a cut-point; annotation: biologically implausible.] “Convoluted Reasoning and Anti-intellectual Pomposity”, “C.R.A.P” (Norman & Streiner; Biostatistics: the Bare Essentials, 2008). Slide adapted from Michael Babyak (‘Modeling with Observational Data’)
  • 9. Still, what happens in practice…? Breast cancer models (Altman 2009): categorised some/all, 34/53 (64%). Diabetes models (Collins et al 2011): categorised some/all, 21/43 (49%). General medical journals (Bouwmeester et al 2012): categorised, 30/64 (47%); dichotomised, 21/64 (33%). Cancer models (Mallett et al 2010): all categorised/dichotomised, 24/47 (51%).
  • 10. Aim of the study: (1) Investigate the impact of different approaches for handling continuous predictors on the apparent performance (same data) and the validation performance (different data; geographical validation). (2) Investigate the influence that sample size has on the approach for handling continuous predictors.
  • 11. Sample characteristics (THIN). Development cohort: 80,800 CVD events; 7721 hip fractures. Validation cohort: 4688 CVD events; 565 hip fractures.
  • 12. Models: Cox models to predict 10-year risk of CVD (men & women) and 10-year risk of hip fracture (women only). The CVD model contained 7 predictors: age, sex, family history, cholesterol, SBP, BMI, hypertension. The hip fracture model contained 5 predictors: age, BMI, Townsend score, asthma, antidepressants.
  • 13. Resampling strategy. MODEL DEVELOPMENT: the number of events in each sample was fixed at 25, 50, 100, and 2000 events; samples were drawn from those with and without the event (separately); 200 samples were randomly drawn (with replacement). MODEL VALIDATION: all available data were used (CVD: n=110,934, 4688 CVD events; hip fracture: n=61,563, 565 hip fractures).
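The development resampling on slide 13 could be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function name, the toy cohort and the `n_nonevents` parameter are assumptions, since the slide does not say how many non-events were drawn per sample.

```python
import numpy as np

def draw_development_sample(event, n_events, n_nonevents, rng):
    """Return row indices for one development sample.

    event       : 1-D array of 0/1 event indicators for the full cohort
    n_events    : number of events to fix in the sample (e.g. 25, 50, 100)
    n_nonevents : number of non-events to draw (hypothetical parameter;
                  the slide does not state how this was chosen)
    """
    event = np.asarray(event).astype(bool)
    event_rows = np.flatnonzero(event)
    nonevent_rows = np.flatnonzero(~event)
    # Sample the two groups separately, with replacement,
    # so the number of events in the sample is exact.
    chosen_events = rng.choice(event_rows, size=n_events, replace=True)
    chosen_nonevents = rng.choice(nonevent_rows, size=n_nonevents, replace=True)
    return np.concatenate([chosen_events, chosen_nonevents])

rng = np.random.default_rng(2016)
cohort_event = rng.random(10_000) < 0.05   # toy cohort, roughly 5% events
# 200 development samples, each with exactly 25 events.
samples = [draw_development_sample(cohort_event, 25, 475, rng) for _ in range(200)]
```

Drawing events and non-events separately is what keeps the event count fixed; a single draw from the whole cohort would let it vary from sample to sample.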
  • 14. Approaches considered. Dichotomised at the: median predictor value; ‘optimal’ cut-point based on the logrank test. Categorised into: 3 groups (using tertile predictor values); 4 groups (using quartile predictor values); 5 groups (using quintile predictor values); 5-year age categories; 10-year age categories. Linear relationship. Nonlinear relationship: fractional polynomials (FP2; 4 degrees of freedom per predictor); restricted cubic splines (3 knots).
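Three of the approaches on slide 14 (median dichotomisation, quantile categorisation and a 3-knot restricted cubic spline) might be coded along these lines. This is a sketch under stated assumptions: the function names and toy ages are invented for illustration, the spline uses the truncated-power form with the usual scaling by the squared knot range, and placing knots at the 10th, 50th and 90th percentiles is a common default rather than something the slides specify. The logrank-based 'optimal' cut-point and fractional polynomials are not shown.

```python
import numpy as np

def dichotomise_at_median(x):
    """1 if the value is above the sample median, else 0."""
    x = np.asarray(x, dtype=float)
    return (x > np.median(x)).astype(int)

def categorise_by_quantiles(x, n_groups):
    """Group labels 0..n_groups-1 using sample quantiles as cut-points
    (n_groups=3 gives tertiles, 4 quartiles, 5 quintiles)."""
    x = np.asarray(x, dtype=float)
    inner_probs = np.arange(1, n_groups) / n_groups
    cuts = np.quantile(x, inner_probs)
    # A value goes into group k when it exceeds exactly k of the cut-points.
    return np.searchsorted(cuts, x, side="left")

def rcs_basis_3_knots(x, knots):
    """Linear term plus the single nonlinear term of a 3-knot
    restricted cubic spline (linear beyond the outer knots)."""
    x = np.asarray(x, dtype=float)
    t1, t2, t3 = knots
    cube = lambda u: np.clip(u, 0.0, None) ** 3   # truncated cubic (u)_+^3
    nonlinear = (cube(x - t1)
                 - cube(x - t2) * (t3 - t1) / (t3 - t2)
                 + cube(x - t3) * (t2 - t1) / (t3 - t2))
    return np.column_stack([x, nonlinear / (t3 - t1) ** 2])

age = np.linspace(30, 85, 12)                    # toy ages, not study data
above_median = dichotomise_at_median(age)
quartile_group = categorise_by_quantiles(age, 4)
knots = np.quantile(age, [0.10, 0.50, 0.90])     # assumed knot placement
spline_terms = rcs_basis_3_knots(age, knots)
```

Dichotomising spends one parameter per predictor and discards everything except which side of the cut a value falls on; the 3-knot spline spends two and keeps the full ordering.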
  • 15. Performance measures calculated. Calibration: calibration plot; Harrell’s “val.surv” function (hazard regression with linear splines). Discrimination: Harrell’s c-index. Clinical utility: decision curve analysis (Vickers & Elkin 2006); net benefit (weighted difference between true positives and false positives). D-statistic, Brier score and R-squared were also examined; these are not reported here, but are in the supplementary material of Collins et al Stat Med 2016.
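For readers unfamiliar with the discrimination measure on slide 15, a simplified pure-Python version of Harrell's c-index is sketched below. It is an illustration only: the function name and toy data are invented, pairs with tied follow-up times are skipped, and established implementations handle ties and large samples more carefully.

```python
def harrell_c_index(time, event, risk):
    """Simplified Harrell's c-index for right-censored data.

    A pair (i, j) is usable when the subject with the shorter follow-up
    time had the event. It is concordant when that subject also has the
    higher predicted risk; ties in predicted risk count as half.
    """
    concordant = 0.0
    usable = 0
    n = len(time)
    for i in range(n):
        if not event[i]:
            continue  # a censored subject cannot be the earlier failure
        for j in range(n):
            if time[i] < time[j]:
                usable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / usable

# Toy data: follow-up time in years, event indicator, predicted risk.
time = [2, 4, 6, 8]
event = [1, 1, 0, 1]
risk = [0.9, 0.4, 0.5, 0.1]
c = harrell_c_index(time, event, risk)   # 4 of 5 usable pairs concordant
```

A value of 0.5 corresponds to no discrimination and 1.0 to perfect ranking.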
  • 16. Net benefit (recap): pt is the probability threshold used to denote ‘high risk’; it is also used to weight the FP and FN results. TP and FP are calculated using Kaplan-Meier estimates of the percentage surviving at 10 years among those with predicted risks greater than pt. Bottom line: the model with the highest NB ‘wins’.
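The net benefit calculation recapped on slide 16 can be sketched as below, using the standard decision-curve form NB = TP/n - (FP/n) × pt/(1 - pt) with TP and FP taken from a Kaplan-Meier estimate among those flagged as high risk. The function names and toy data are assumptions made for illustration, and the Kaplan-Meier routine is deliberately bare-bones.

```python
import numpy as np

def km_survival(time, event, horizon):
    """Kaplan-Meier estimate of the probability of surviving beyond `horizon`."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event).astype(bool)
    surv = 1.0
    # Step through the distinct event times up to the horizon.
    for t in np.unique(time[event & (time <= horizon)]):
        at_risk = np.sum(time >= t)
        failures = np.sum(event & (time == t))
        surv *= 1.0 - failures / at_risk
    return surv

def net_benefit(time, event, risk, pt, horizon=10.0):
    """Net benefit at probability threshold pt for a time-to-event outcome."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event).astype(bool)
    high = np.asarray(risk, dtype=float) > pt
    if not high.any():
        return 0.0  # nobody is classified as high risk
    surv_high = km_survival(time[high], event[high], horizon)
    frac_high = high.mean()
    tp_rate = (1.0 - surv_high) * frac_high   # flagged and had the event
    fp_rate = surv_high * frac_high           # flagged but event-free
    return tp_rate - fp_rate * pt / (1.0 - pt)

# Toy data: 5 subjects flagged as high risk (2 events), 5 as low risk.
time = [1, 2, 12, 12, 12, 12, 12, 12, 12, 12]
event = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
risk = [0.3] * 5 + [0.05] * 5
nb = net_benefit(time, event, risk, pt=0.2)   # approximately 0.125
```

The weight pt/(1 - pt) is the odds at the threshold, so a lower threshold penalises false positives less.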
  • 19. Age, cholesterol, BMI, SBP & CVD
  • 20. Age, BMI & Hip fracture
  • 21. RESULTS: CVD, 25 events
  • 22. RESULTS: CVD, 50 events
  • 23. RESULTS: CVD, 100 events
  • 24. RESULTS: CVD, 1000 events
  • 25. RESULTS: Hip fracture, 25 events
  • 26. RESULTS: Hip fracture, 50 events
  • 27. RESULTS: Hip fracture, 100 events
  • 28. RESULTS: Hip fracture, 1000 events
  • 29. RESULTS: Discrimination, CVD. At small sample sizes (25 events): large difference between apparent performance and validation performance for ‘optimal’ dichotomisation, 0.84 (apparent) vs 0.72 (validation); smaller differences observed for FP/RCS/linear, 0.84 (apparent) vs 0.78 (validation). Observed difference between dichotomisation (at the median) and linear/FP/RCS: apparent performance, difference of 0.05; validation performance, difference of 0.05; observed over all 4 sample sizes examined. Negligible differences between linear/FP/RCS.
  • 30. RESULTS: Discrimination, hip fracture. At small sample sizes (25 events): large difference between apparent performance and validation performance for ‘optimal’ dichotomisation, 0.86 (apparent) vs 0.76 (validation); FP/RCS/linear, 0.90 (apparent) vs 0.87 (validation). Observed difference between dichotomisation (at the median) and linear/FP/RCS: apparent performance, difference of 0.1; validation performance, difference of 0.1; observed over all 4 sample sizes examined. Negligible differences between linear/FP/RCS.
  • 32. RESULTS: Decision Curve Analysis (CVD only) [higher NB, better model]; curves labelled FP/RCS and dichotomisation
  • 33. RESULTS: Net cases found per 1000
  • 34. Conclusions: Systematic reviews show that dichotomising / categorising continuous predictors is routinely done when developing a prediction model. Dichotomising, either at the median or at the ‘optimal’ predictor value, leads to models with substantially poorer performance (poor discrimination; poor calibration; poor clinical utility). Large discrepancies between apparent performance and validation performance were observed for ‘optimal’ split dichotomising. The impact of how continuous predictors are handled is more pronounced at smaller sample sizes.