Correcting for missing data, measurement
error and confounding
Maarten van Smeden, PhD
University Medical Center Utrecht
Julius Center for Health Sciences and Primary Care
The Netherlands
Twitter: @MvanSmeden
Email: M.vanSmeden@umcutrecht.nl
30 November 2020
Methods meeting
Epidemiology methods group, UMC Utrecht
I have no conflicts of interest to declare
Twitter: @MaartenvSmedenUtrecht, November 30 2020
Rationale
• Confounding -> correlation is not causation
• Measurement error & missing data -> correlation is not always
correlation
• In causal epidemiologic research we often see all three
… but we rarely try to “correct” for all three
There is no shortage of methods
A non-exhaustive list of statistical correction strategies
• Confounding
– Multivariable adjustments
– Weighting
– Matching
– Instrumental variable analysis
– RANDOMIZATION (!)
– “Bayesian approaches”
• Missing data
– Multiple imputation
– Weighting
– Full information maximum likelihood
– Last observation carried forward
– Missing indicator methods
– “Bayesian approaches”
• Measurement error
– Regression calibration
– Weighting
– Multiple imputation for ME
– SIMEX
– Full information maximum likelihood
– “Bayesian approaches”
Outline
• Confounding (1 slide)
• Missing data (2 slides)
• Measurement error (many slides)
• How to “solve” all three? (a couple more slides)
• What about prediction? (if I have time left)
Confounding
• A: treatment (Tx, 1 for treated; 0 for not treated)
• Y: outcome (1 for death; 0 for survival)
• Potential outcomes
– Y^{a=1}: outcome under Tx; Y^{a=0}: outcome under no Tx
– we usually observe either Y^{a=0} or Y^{a=1} for an individual
• Randomized trials: Y^a ⊥ A (unconditional exchangeability)
• Observational studies aim: Y^a ⊥ A | L (conditional exchangeability)
– L: confounding variables -> no unmeasured confounding
• (Additional causal assumptions: positivity, consistency, SUTVA, …)
More info: Hernán & Robins, Causal Inference: What If
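A minimal sketch of what conditional exchangeability buys, under an invented data-generating mechanism (the binary confounder, all probabilities and the true risk difference of 0.10 are assumptions for illustration, not from the slides). The crude contrast mixes the treatment effect with confounding by L; standardizing over L recovers the true risk difference.

```python
import numpy as np

rng = np.random.default_rng(2020)
n = 200_000

# Hypothetical mechanism (assumption): a binary confounder L raises both
# the chance of treatment A and the risk of death Y.
L = rng.binomial(1, 0.5, n)
A = rng.binomial(1, 0.2 + 0.6 * L)            # treated more often when L = 1
Y = rng.binomial(1, 0.1 + 0.1 * A + 0.3 * L)  # true risk difference for A: 0.10


def risk_difference(y, a):
    """Difference in mean outcome between treated and untreated."""
    return y[a == 1].mean() - y[a == 0].mean()


# Crude contrast: ignores L.
crude = risk_difference(Y, A)

# Standardization: stratum-specific contrasts averaged over the distribution
# of L, which targets the causal contrast if Y^a is independent of A given L.
adjusted = sum(
    risk_difference(Y[L == l], A[L == l]) * np.mean(L == l) for l in (0, 1)
)

print(f"crude RD:    {crude:.3f}")
print(f"adjusted RD: {adjusted:.3f}  (true value 0.10)")
```

With an unmeasured or mismeasured L the adjusted contrast would no longer be guaranteed to recover the true value.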
Missing data
• Missing values are observations/records which were:
– never collected (either by design or not)
– lost accidentally
– wrongly collected and so deleted (measurement error?)
• Usually distinguish between three types of missing data
– MCAR: the probability that data are missing does NOT
depend on the values of observed or missing data
– MAR: the probability that data are missing depends on the
values of the observed data, but does NOT depend on the
values of the missing data
– MNAR: the probability that data are missing depends on the
values of the missing data
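The three mechanisms can be illustrated with a small simulation. This is a sketch under assumed settings (one fully observed variable x, one partially missing variable y, logistic missingness models with made-up coefficients). It compares the complete-case mean of y with a regression-based estimate that uses x, which is expected to help under MAR but not under MNAR.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Hypothetical setting: x is always observed, y may be missing.
x = rng.normal(size=n)
y = 0.6 * x + rng.normal(scale=0.8, size=n)   # E[y] = 0, Var[y] = 1


def expit(z):
    return 1.0 / (1.0 + np.exp(-z))


# Probability that y is missing under each mechanism.
p_missing = {
    "MCAR": np.full(n, 0.5),   # depends on nothing
    "MAR": expit(2.0 * x),     # depends on the observed x only
    "MNAR": expit(2.0 * y),    # depends on the value of y itself
}

cc_mean = {}        # complete-case mean of y
adjusted_mean = {}  # regression of y on x in complete cases, averaged over all x
for name, p in p_missing.items():
    observed = rng.uniform(size=n) > p
    cc_mean[name] = y[observed].mean()
    slope, intercept = np.polyfit(x[observed], y[observed], 1)
    adjusted_mean[name] = np.mean(intercept + slope * x)
    print(f"{name}: complete-case {cc_mean[name]:+.3f}, "
          f"using x {adjusted_mean[name]:+.3f} (truth 0)")
```

The regression step stands in for what imputation under MAR relies on: the relation between y and x among the observed records carries over to the records with y missing.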
Beware of mindless imputation
Source: Hughes et al. IJE, 2019, doi:10.1093/ije/dyz032
Personal observations (I may be biased)
Causal inference epidemiology
• Confounding on center stage in analyses and discussion
• Missing data often cannot be ignored (especially for higher %):
performing multiple imputation becoming mainstream?
• Measurement error the elephant in the room: belongs to the
discussion section (not methods), lots of misconceptions!
• (Note: not independent, e.g. measurement error can result in
problems with confounding)
Measurement error
“Errors in reading, calculating or recording a
numerical value. The difference between
observed values of a variable recorded
under similar conditions and some fixed true
value.”
The Cambridge Dictionary of Statistics (4th ed), ISBN: 9780521766999
img: https://bit.ly/2T9UnRt
Measurement of systolic blood pressure
Measurement error due to:
• White coat effect [1]
• Non-adherence to measurement protocol [2]
• Fallibility of measurement instruments [3]
• ….
Measurement error varies:
• Number of BP measurements taken [4]
• Gender [4]
• Circadian rhythm
• ….
doi: [1] 10.1370/afm.1211; [2] 10.3399/096016407782604965; [3] 10.2147/MDER.S141599; [4] 10.3109/08037051.2014.986952
Example: circadian rhythm
doi: 10.1111/j.1552-6909.2000.tb02771.x
Imprecision of medical measurements
doi: 10.1136/bmj.m149
Measurement error: a long list
• Blood pressure
• Dietary intake
• Smoking status
• Air pollution
• BMI
• Physical activity
• Vaccination status
• Social class
• Carotid intima media thickness
• Thyroid hormone levels
• Glucose levels
• Cholesterol levels
• Income
• Family history
• Mental health history
• Education level
• “Intelligence”
• Respiratory rates
• Medication use
• Sedentary hours
• Vitamin use
• Immigration status
• Age at first intercourse
• Age at menopause
• ICD coding
• Symptoms
• Date of symptom onset
• Medication use
• Visceral adipose tissue
• Angina class
• Heart rate
• Grip and pinch strength
• Cough frequency
• Infant height
• Gestational age
• Disease specific mortality
• ….
Measurement error mentioned
Journals of epidemiology
Jurek et al. 2006 [1]: 61% (N = 35)
Brakenhoff et al. 2018 [2]: 56% (N = 198)
Shaw et al. 2019 [3]: 80% (N = 65)
doi: [1] 10.1007/s10654-006-9083-0; [2] 10.1016/j.jclinepi.2018.02.02; [3] 10.1016/j.annepidem.2018.09.001
Measurement error mentioned
Journals of general medicine
Brakenhoff et al. 2018 [2]: 25% (N = 57)
doi: [2] 10.1016/j.jclinepi.2018.02.02
Measurement error “corrections” applied
Journals of epidemiology
Jurek et al. 2006 [1]: 2% (N = 1)
Brakenhoff et al. 2018 [2]: 4% (N = 13)
Shaw et al. 2019 [3]: 6% (N = 5)
doi: [1] 10.1007/s10654-006-9083-0; [2] 10.1016/j.jclinepi.2018.02.02; [3] 10.1016/j.annepidem.2018.09.001
Measurement error “corrections” applied
Journals of general medicine
Brakenhoff et al. 2018 [2]: 0% (N = 0)
doi: [2] 10.1016/j.jclinepi.2018.02.02
• Myth 1: measurement error can be compensated for by large
numbers of observations
• Myth 2: the exposure effect is underestimated when variables
are measured with error
• Myth 3: exposure measurement error is nondifferential if
measurements are taken without knowledge of the outcome
• Myth 4: measurement error can be prevented but not mitigated
in epidemiological data analyses
• Myth 5: certain types of epidemiological research are
unaffected by measurement error
Types of measurement error
• Classical (random) measurement error: measurements fluctuate
around their true value
• Systematic measurement error: measurements are consistently
wrong in a particular direction
• Differential measurement error: measurements are consistently
wrong in a particular direction, varying per group
Courtesy: Linda Nab
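A minimal sketch of classical (random) error in an exposure, assuming simple linear regression, error independent of everything else, and made-up variances. In that setting the slope is multiplied by Var(X) / (Var(X) + Var(U)), where U is the error. The simulation suggests that a larger sample makes the estimate more precise around the attenuated value, not around the truth (Myth 1).

```python
import numpy as np

rng = np.random.default_rng(42)


def ols_slope(x, y):
    """Slope of the simple linear regression of y on x."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)


beta = 1.0      # hypothetical true exposure effect
sd_error = 1.0  # SD of the classical error; with Var(X) = 1 the factor is 0.5

slopes = {}
for n in (1_000, 100_000, 1_000_000):
    x = rng.normal(size=n)                            # true exposure
    x_star = x + rng.normal(scale=sd_error, size=n)   # measured exposure
    y = beta * x + rng.normal(size=n)
    slopes[n] = ols_slope(x_star, y)
    print(f"n = {n:>9,}: slope using x* = {slopes[n]:.3f} (truth {beta})")
```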
Classical measurement error
Triple whammy of measurement error
• Bias
• Increased imprecision
• Masked functional relations
Always weaker effects?
Example: classical measurement error
doi: 10.1371/journal.pone.0192298
Second Manifestations of ARTerial disease (SMART) cohort
[Diagram: effect of interest; confounder with error; outcome]
% bias in hazard ratio for SBP (multivariable Cox regression model)
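Why error in a confounder need not weaken the exposure effect can be sketched in a linear toy example. Everything here is an assumption for illustration: the cited study used a multivariable Cox model on SMART data, whereas this swaps in OLS on simulated data with invented coefficients. Adjusting for the error-prone confounder leaves residual confounding, which in this setup pushes the estimate away from the null (Myth 2).

```python
import numpy as np

rng = np.random.default_rng(7)
n = 200_000

# Hypothetical linear setting (the cited study used Cox regression).
L = rng.normal(size=n)                       # true confounder
A = 0.7 * L + rng.normal(size=n)             # exposure depends on the confounder
Y = 0.5 * A + 1.0 * L + rng.normal(size=n)   # true exposure effect: 0.5
L_star = L + rng.normal(size=n)              # confounder with classical error


def exposure_coef(y, a, covariate):
    """OLS coefficient of a in the regression of y on (1, a, covariate)."""
    X = np.column_stack([np.ones_like(a), a, covariate])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1]


adjusted_true = exposure_coef(Y, A, L)         # adjusts for the true confounder
adjusted_error = exposure_coef(Y, A, L_star)   # adjusts for the error-prone one

print(f"adjusting for L : {adjusted_true:.3f}")
print(f"adjusting for L*: {adjusted_error:.3f}  (truth 0.5)")
```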
Randomized trials unaffected?
excerpt from: 10.1186/s13063-018-2954-3
Randomized controlled trials
• Classical (random) measurement error
– Unbiased Tx effect estimator
– Increased Type-II error
– Nominal Type-I error
• Systematic measurement error
– Possibly biased Tx effect estimator
– Type-II error affected
– Type-I error generally nominal
• Differential measurement error
– Possibly biased Tx effect estimator
– Type-II error affected
– Type-I error not nominal
doi: 10.1002/sim.8359
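The classical-error case can be checked with a short simulation. This is a sketch assuming a continuous endpoint, a difference-in-means estimator and invented values for the effect, arm size and error SD. The estimator stays centered on the true effect, while its spread grows, which is where the loss of power comes from.

```python
import numpy as np

rng = np.random.default_rng(359)
n_trials, n_per_arm = 2_000, 100
effect = 0.5   # hypothetical true treatment effect on a continuous endpoint

# True endpoints for control and treated arms, one row per simulated trial.
y_control = rng.normal(0.0, 1.0, size=(n_trials, n_per_arm))
y_treated = rng.normal(effect, 1.0, size=(n_trials, n_per_arm))


def add_classical_error(y, sd=1.0):
    """Add mean-zero error that is independent of arm and of the true value."""
    return y + rng.normal(0.0, sd, size=y.shape)


# Difference in means per trial, without and with endpoint measurement error.
est_true = y_treated.mean(axis=1) - y_control.mean(axis=1)
est_error = (add_classical_error(y_treated).mean(axis=1)
             - add_classical_error(y_control).mean(axis=1))

print(f"no error  : mean {est_true.mean():.3f}, SD {est_true.std():.3f}")
print(f"with error: mean {est_error.mean():.3f}, SD {est_error.std():.3f}")
```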
Triple whammy of measurement error
• Bias
• Increased imprecision
• Masked functional relations
Usually the target for measurement error “corrections”
Measurement error corrections
• Replicates study: the standard measurements (Y*) are replicated
within the study sample
• External validation set: the study sample has standard
measurements (Y*) only; a separate, external set has standard +
validated measurements
• Internal validation set: the study sample has standard
measurements (Y*); a subset within it has standard + validated
measurements
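How replicates can feed a correction is sketched here for one simple case. The assumptions are mine, not the slides': the error-prone variable is a covariate (the slide's diagram labels it Y*), the error is classical, there are exactly two replicates per person, and all numbers are invented. The within-person difference identifies the error variance, which gives the reliability used to rescale the naive slope.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100_000

# Hypothetical data: true covariate x, outcome y, two error-prone replicates.
x = rng.normal(size=n)
y = 1.0 * x + rng.normal(size=n)          # true slope: 1.0
rep1 = x + rng.normal(scale=0.8, size=n)
rep2 = x + rng.normal(scale=0.8, size=n)

# With independent classical errors, Var(rep1 - rep2) = 2 * Var(error).
var_error = np.var(rep1 - rep2, ddof=1) / 2
var_observed = np.var(rep1, ddof=1)
reliability = (var_observed - var_error) / var_observed

# Naive slope uses a single replicate; the correction divides by reliability.
naive = np.cov(rep1, y)[0, 1] / var_observed
corrected = naive / reliability

print(f"reliability {reliability:.3f}, naive {naive:.3f}, corrected {corrected:.3f}")
```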
Simulation study
OLS regression
Y = a0 + a1·A + b1·L1 + … + bp·Lp + e, e ~ N(0, s)
a1: effect of primary interest
A,L ~ multivariate normal with mean vector 0 and correlation-matrix
with equal pairwise correlations
Random measurement error: on A, generating a new A*
Missing data (MAR): on L1
True values for a0 = 0, a1 = 10, and b1= b2 = … = bp based on total
confounding effect (crude minus adjusted)
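One draw from a data-generating mechanism of this kind could look as follows. This is only a sketch: the function name, the default values for p, the correlation, the b coefficients, the residual SD and the error SD, and the choice to let missingness in L1 depend on Y are all assumptions, since the slides do not give them. The internal validation subset for A is not modeled here.

```python
import numpy as np


def simulate_dataset(n=500, p=3, rho=0.3, a1=10.0, b=2.0,
                     sd_e=5.0, sd_me=1.0, rng=None):
    """One dataset from Y = a0 + a1*A + b*L1 + ... + b*Lp + e, with a0 = 0."""
    rng = np.random.default_rng() if rng is None else rng

    # (A, L1, ..., Lp): multivariate normal, mean 0, equal pairwise correlation.
    corr = np.full((p + 1, p + 1), rho)
    np.fill_diagonal(corr, 1.0)
    AL = rng.multivariate_normal(np.zeros(p + 1), corr, size=n)
    A, L = AL[:, 0], AL[:, 1:]

    Y = a1 * A + L @ np.full(p, b) + rng.normal(0.0, sd_e, size=n)

    # Random (classical) measurement error on A gives the error-prone A*.
    A_star = A + rng.normal(0.0, sd_me, size=n)

    # MAR: the probability that L1 is missing depends only on the observed Y.
    z = (Y - Y.mean()) / Y.std()
    missing = rng.uniform(size=n) < 1.0 / (1.0 + np.exp(-z))
    L_obs = L.copy()
    L_obs[missing, 0] = np.nan

    return {"Y": Y, "A": A, "A_star": A_star, "L": L, "L_obs": L_obs}


data = simulate_dataset(n=1_000, rng=np.random.default_rng(2020))
print("fraction of L1 missing:", np.isnan(data["L_obs"][:, 0]).mean())
```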
Simulation factors
100,000 generated datasets by random draws from simulation factors
Models
Sequential models
• MIME: Multiple imputation for measurement error
Multiply impute both A (observed only in a subset) and the missing
values of L1 by full conditional specification (Y, A, A*, L), followed
by OLS using A and L as covariates (pooled with Rubin’s rules)
• MIRC: Multiple imputation and regression calibration
1. Impute missing values in L1
2. In the subset: OLS for A given A*, L
3. For the entire set: Arc = E(A | A*, L)
4. For each multiply imputed set: OLS using Arc and L as
covariates, and adjust standard errors (RC)
5. Combine using Rubin’s rules
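The regression calibration part (steps 2 to 4) can be sketched on simulated data. This is a simplified illustration, not the implementation behind the slides: L is fully observed here, so the imputation steps and Rubin's rules are skipped, the standard error adjustment is omitted, and the data-generating values and the helper `ols` are invented.

```python
import numpy as np

rng = np.random.default_rng(11)
n, n_val = 5_000, 1_000

# Hypothetical data: A true exposure, A_star error-prone, L a confounder.
L = rng.normal(size=n)
A = 0.5 * L + rng.normal(size=n)
A_star = A + rng.normal(size=n)
Y = 10.0 * A + 2.0 * L + rng.normal(scale=5.0, size=n)   # true a1 = 10
in_validation = np.arange(n) < n_val   # A is treated as observed only here


def ols(y, *columns):
    """OLS coefficients of y on an intercept plus the given columns."""
    X = np.column_stack([np.ones(len(y)), *columns])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


# Step 2: calibration model for A given A* and L, in the validation subset.
g = ols(A[in_validation], A_star[in_validation], L[in_validation])

# Step 3: Arc = E(A | A*, L) for the entire set.
A_rc = g[0] + g[1] * A_star + g[2] * L

# Step 4 (point estimate only): outcome model with Arc and L as covariates.
naive = ols(Y, A_star, L)[1]
calibrated = ols(Y, A_rc, L)[1]

print(f"naive a1: {naive:.2f}, calibrated a1: {calibrated:.2f} (truth 10)")
```

The naive standard error from the last regression ignores the uncertainty in the calibration model, which is why step 4 on the slide calls for an adjustment.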
Simultaneous models
Conditional submodels
• Y | A, L (primary analysis model)
• A*| Y, A, L
• A | L
• L1 | L2, …, Lp
Estimated simultaneously
• MCMC: Bayes (uninformative priors)
• Full information maximum likelihood: FIML (structural equation
model)
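Read together, the four conditional submodels are one way to write the joint distribution of the partly observed quantities via the chain rule. A sketch of that factorization, with L = (L_1, …, L_p) and parameters suppressed (the notation is an assumption, the slide lists only the submodels):

```latex
p(Y, A^{*}, A, L_1 \mid L_2, \dots, L_p)
  = p(Y \mid A, L)\;
    p(A^{*} \mid Y, A, L)\;
    p(A \mid L)\;
    p(L_1 \mid L_2, \dots, L_p)
```

Both the MCMC and the FIML route work with this joint model, handling the unobserved values of A and L_1 within the estimation rather than in a separate imputation step.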
Results
What does this mean?
• Simple setting (OLS, 1 covariate with missing data, 1 covariate
with measurement error, internal validation): “full adjustment”
approaches work really well, even in samples as small as N = 100.
• Differences between approaches show up mainly in rMSE; nearly no bias
• The Bayesian approach seems most promising (for its
frequentist properties!): least bias, easy to extend to other link
functions, multivariate missing data and measurement error
Measurement error models are not new
doi: 10.2307/1422689
Exceptions?
• Measurement error in prognostic factors in an RCT
– Same argument about missing data (e.g. see White and
Thompson, Stat Med 2005)
• Special case of measurement error in a confounder
– e.g. confounding by indication, where indication was based
on the confounder with error
Nab et al. Epidemiology, 2020, doi: 10.1097/EDE.0000000000001239
Sensitivity analysis tool
https://lindanab.shinyapps.io/SensitivityAnalysis/
Preprint: https://arxiv.org/abs/1912.05800
Exceptions? (continued)
• Prediction models BUT…
Measurement heterogeneity
Measurement: are labels the new oil?
https://twitter.com/DrHughHarvey/status/1230218991026819077
Twitter: @MaartenvSmedenUtrecht, November 30 2020

More Related Content

What's hot

Improving epidemiological research: avoiding the statistical paradoxes and fa...
Improving epidemiological research: avoiding the statistical paradoxes and fa...Improving epidemiological research: avoiding the statistical paradoxes and fa...
Improving epidemiological research: avoiding the statistical paradoxes and fa...Maarten van Smeden
 
Regression shrinkage: better answers to causal questions
Regression shrinkage: better answers to causal questionsRegression shrinkage: better answers to causal questions
Regression shrinkage: better answers to causal questionsMaarten van Smeden
 
Improving predictions: Lasso, Ridge and Stein's paradox
Improving predictions: Lasso, Ridge and Stein's paradoxImproving predictions: Lasso, Ridge and Stein's paradox
Improving predictions: Lasso, Ridge and Stein's paradoxMaarten van Smeden
 
Dichotomania and other challenges for the collaborating biostatistician
Dichotomania and other challenges for the collaborating biostatisticianDichotomania and other challenges for the collaborating biostatistician
Dichotomania and other challenges for the collaborating biostatisticianLaure Wynants
 
Algorithm based medicine: old statistics wine in new machine learning bottles?
Algorithm based medicine: old statistics wine in new machine learning bottles?Algorithm based medicine: old statistics wine in new machine learning bottles?
Algorithm based medicine: old statistics wine in new machine learning bottles?Maarten van Smeden
 
Guideline for high-quality diagnostic and prognostic applications of AI in he...
Guideline for high-quality diagnostic and prognostic applications of AI in he...Guideline for high-quality diagnostic and prognostic applications of AI in he...
Guideline for high-quality diagnostic and prognostic applications of AI in he...Maarten van Smeden
 
The basics of prediction modeling
The basics of prediction modeling The basics of prediction modeling
The basics of prediction modeling Maarten van Smeden
 
Developing and validating statistical models for clinical prediction and prog...
Developing and validating statistical models for clinical prediction and prog...Developing and validating statistical models for clinical prediction and prog...
Developing and validating statistical models for clinical prediction and prog...Evangelos Kritsotakis
 
Machine learning versus traditional statistical modeling and medical doctors
Machine learning versus traditional statistical modeling and medical doctorsMachine learning versus traditional statistical modeling and medical doctors
Machine learning versus traditional statistical modeling and medical doctorsMaarten van Smeden
 
QUANTIFYING THE IMPACT OF DIFFERENT APPROACHES FOR HANDLING CONTINUOUS PREDIC...
QUANTIFYING THE IMPACT OF DIFFERENT APPROACHES FOR HANDLING CONTINUOUS PREDIC...QUANTIFYING THE IMPACT OF DIFFERENT APPROACHES FOR HANDLING CONTINUOUS PREDIC...
QUANTIFYING THE IMPACT OF DIFFERENT APPROACHES FOR HANDLING CONTINUOUS PREDIC...GaryCollins74
 
Real world modified
Real world modifiedReal world modified
Real world modifiedStephen Senn
 
Choosing Regression Models
Choosing Regression ModelsChoosing Regression Models
Choosing Regression ModelsStephen Senn
 
Clinical trials are about comparability not generalisability V2.pptx
Clinical trials are about comparability not generalisability V2.pptxClinical trials are about comparability not generalisability V2.pptx
Clinical trials are about comparability not generalisability V2.pptxStephenSenn2
 
ISCB 2023 Sources of uncertainty b.pptx
ISCB 2023 Sources of uncertainty b.pptxISCB 2023 Sources of uncertainty b.pptx
ISCB 2023 Sources of uncertainty b.pptxBenVanCalster
 
Why the EPV≥10 sample size rule is rubbish and what to use instead
Why the EPV≥10 sample size rule is rubbish and what to use instead Why the EPV≥10 sample size rule is rubbish and what to use instead
Why the EPV≥10 sample size rule is rubbish and what to use instead Maarten van Smeden
 
The replication crisis: are P-values the problem and are Bayes factors the so...
The replication crisis: are P-values the problem and are Bayes factors the so...The replication crisis: are P-values the problem and are Bayes factors the so...
The replication crisis: are P-values the problem and are Bayes factors the so...StephenSenn2
 
Meta-Analysis -- Introduction.pptx
Meta-Analysis -- Introduction.pptxMeta-Analysis -- Introduction.pptx
Meta-Analysis -- Introduction.pptxACSRM
 

What's hot (20)

Improving epidemiological research: avoiding the statistical paradoxes and fa...
Improving epidemiological research: avoiding the statistical paradoxes and fa...Improving epidemiological research: avoiding the statistical paradoxes and fa...
Improving epidemiological research: avoiding the statistical paradoxes and fa...
 
Regression shrinkage: better answers to causal questions
Regression shrinkage: better answers to causal questionsRegression shrinkage: better answers to causal questions
Regression shrinkage: better answers to causal questions
 
Improving predictions: Lasso, Ridge and Stein's paradox
Improving predictions: Lasso, Ridge and Stein's paradoxImproving predictions: Lasso, Ridge and Stein's paradox
Improving predictions: Lasso, Ridge and Stein's paradox
 
Dichotomania and other challenges for the collaborating biostatistician
Dichotomania and other challenges for the collaborating biostatisticianDichotomania and other challenges for the collaborating biostatistician
Dichotomania and other challenges for the collaborating biostatistician
 
Algorithm based medicine: old statistics wine in new machine learning bottles?
Algorithm based medicine: old statistics wine in new machine learning bottles?Algorithm based medicine: old statistics wine in new machine learning bottles?
Algorithm based medicine: old statistics wine in new machine learning bottles?
 
P-values in crisis
P-values in crisisP-values in crisis
P-values in crisis
 
Guideline for high-quality diagnostic and prognostic applications of AI in he...
Guideline for high-quality diagnostic and prognostic applications of AI in he...Guideline for high-quality diagnostic and prognostic applications of AI in he...
Guideline for high-quality diagnostic and prognostic applications of AI in he...
 
The basics of prediction modeling
The basics of prediction modeling The basics of prediction modeling
The basics of prediction modeling
 
Developing and validating statistical models for clinical prediction and prog...
Developing and validating statistical models for clinical prediction and prog...Developing and validating statistical models for clinical prediction and prog...
Developing and validating statistical models for clinical prediction and prog...
 
Machine learning versus traditional statistical modeling and medical doctors
Machine learning versus traditional statistical modeling and medical doctorsMachine learning versus traditional statistical modeling and medical doctors
Machine learning versus traditional statistical modeling and medical doctors
 
Clinical prediction models
Clinical prediction modelsClinical prediction models
Clinical prediction models
 
QUANTIFYING THE IMPACT OF DIFFERENT APPROACHES FOR HANDLING CONTINUOUS PREDIC...
QUANTIFYING THE IMPACT OF DIFFERENT APPROACHES FOR HANDLING CONTINUOUS PREDIC...QUANTIFYING THE IMPACT OF DIFFERENT APPROACHES FOR HANDLING CONTINUOUS PREDIC...
QUANTIFYING THE IMPACT OF DIFFERENT APPROACHES FOR HANDLING CONTINUOUS PREDIC...
 
Real world modified
Real world modifiedReal world modified
Real world modified
 
Choosing Regression Models
Choosing Regression ModelsChoosing Regression Models
Choosing Regression Models
 
Clinical trials are about comparability not generalisability V2.pptx
Clinical trials are about comparability not generalisability V2.pptxClinical trials are about comparability not generalisability V2.pptx
Clinical trials are about comparability not generalisability V2.pptx
 
ISCB 2023 Sources of uncertainty b.pptx
ISCB 2023 Sources of uncertainty b.pptxISCB 2023 Sources of uncertainty b.pptx
ISCB 2023 Sources of uncertainty b.pptx
 
Why the EPV≥10 sample size rule is rubbish and what to use instead
Why the EPV≥10 sample size rule is rubbish and what to use instead Why the EPV≥10 sample size rule is rubbish and what to use instead
Why the EPV≥10 sample size rule is rubbish and what to use instead
 
The replication crisis: are P-values the problem and are Bayes factors the so...
The replication crisis: are P-values the problem and are Bayes factors the so...The replication crisis: are P-values the problem and are Bayes factors the so...
The replication crisis: are P-values the problem and are Bayes factors the so...
 
Meta-Analysis -- Introduction.pptx
Meta-Analysis -- Introduction.pptxMeta-Analysis -- Introduction.pptx
Meta-Analysis -- Introduction.pptx
 
On p-values
On p-valuesOn p-values
On p-values
 

Similar to Correcting for missing data, measurement error and confounding

The statistics of the coronavirus
The statistics of the coronavirusThe statistics of the coronavirus
The statistics of the coronavirusMaarten van Smeden
 
The absence of a gold standard: a measurement error problem
The absence of a gold standard: a measurement error problemThe absence of a gold standard: a measurement error problem
The absence of a gold standard: a measurement error problemMaarten van Smeden
 
The future of science and business - a UM Star Lecture
The future of science and business - a UM Star LectureThe future of science and business - a UM Star Lecture
The future of science and business - a UM Star LectureMichel Dumontier
 
COVID-19 related prediction models for diagnosis and prognosis - a living sys...
COVID-19 related prediction models for diagnosis and prognosis - a living sys...COVID-19 related prediction models for diagnosis and prognosis - a living sys...
COVID-19 related prediction models for diagnosis and prognosis - a living sys...Maarten van Smeden
 
Twitter Improves Seasonal Influenza Prediction
Twitter Improves Seasonal Influenza PredictionTwitter Improves Seasonal Influenza Prediction
Twitter Improves Seasonal Influenza PredictionHarshavardhan Achrekar
 
When Usability Meets Social Media: Strengthen Your Connections with Users
When Usability Meets Social Media: Strengthen Your Connections with UsersWhen Usability Meets Social Media: Strengthen Your Connections with Users
When Usability Meets Social Media: Strengthen Your Connections with UsersYoo Young Lee
 
IAOS 2018 - Statistics as a trusted source of information, M. Durand
IAOS 2018 - Statistics as a trusted source of information, M. DurandIAOS 2018 - Statistics as a trusted source of information, M. Durand
IAOS 2018 - Statistics as a trusted source of information, M. DurandStatsCommunications
 
Mathematics, Statistics and Medical Informatics
Mathematics, Statistics and Medical InformaticsMathematics, Statistics and Medical Informatics
Mathematics, Statistics and Medical InformaticsAsli Yazagan
 
Mn ltss projection model
Mn ltss projection modelMn ltss projection model
Mn ltss projection modelsoder145
 
COVID-19 Update (Summary): September 14, 2020
COVID-19 Update (Summary): September 14, 2020COVID-19 Update (Summary): September 14, 2020
COVID-19 Update (Summary): September 14, 2020Steve Shafer
 
assignment of statistics 2.pdf
assignment of statistics 2.pdfassignment of statistics 2.pdf
assignment of statistics 2.pdfSyedDaniyalKazmi2
 
Disseminating Scientific Papers via Twitter: Practical Insights and Research ...
Disseminating Scientific Papers via Twitter: Practical Insights and Research ...Disseminating Scientific Papers via Twitter: Practical Insights and Research ...
Disseminating Scientific Papers via Twitter: Practical Insights and Research ...SC CTSI at USC and CHLA
 
COVID-19 Analysis: July 21, 2020
COVID-19 Analysis: July 21, 2020COVID-19 Analysis: July 21, 2020
COVID-19 Analysis: July 21, 2020Steve Shafer
 
Beyond Fact Checking — Modelling Information Change in Scientific Communication
Beyond Fact Checking — Modelling Information Change in Scientific CommunicationBeyond Fact Checking — Modelling Information Change in Scientific Communication
Beyond Fact Checking — Modelling Information Change in Scientific CommunicationIsabelle Augenstein
 
climate and health webinar presentations_2.2023.pdf
 climate and health webinar presentations_2.2023.pdf climate and health webinar presentations_2.2023.pdf
climate and health webinar presentations_2.2023.pdfEhsan Larik
 
COVID-19 Analysis: July 17, 2020
COVID-19 Analysis: July 17, 2020COVID-19 Analysis: July 17, 2020
COVID-19 Analysis: July 17, 2020Steve Shafer
 
COVID-19 Update (Summary): September 8, 2020
COVID-19 Update (Summary): September 8, 2020COVID-19 Update (Summary): September 8, 2020
COVID-19 Update (Summary): September 8, 2020Steve Shafer
 
COVID-19 Analysis: July 18, 2020
COVID-19 Analysis: July 18, 2020COVID-19 Analysis: July 18, 2020
COVID-19 Analysis: July 18, 2020Steve Shafer
 

Similar to Correcting for missing data, measurement error and confounding (20)

The statistics of the coronavirus
The statistics of the coronavirusThe statistics of the coronavirus
The statistics of the coronavirus
 
Algorithm based medicine
Algorithm based medicineAlgorithm based medicine
Algorithm based medicine
 
The absence of a gold standard: a measurement error problem
The absence of a gold standard: a measurement error problemThe absence of a gold standard: a measurement error problem
The absence of a gold standard: a measurement error problem
 
The future of science and business - a UM Star Lecture
The future of science and business - a UM Star LectureThe future of science and business - a UM Star Lecture
The future of science and business - a UM Star Lecture
 
COVID-19 related prediction models for diagnosis and prognosis - a living sys...
COVID-19 related prediction models for diagnosis and prognosis - a living sys...COVID-19 related prediction models for diagnosis and prognosis - a living sys...
COVID-19 related prediction models for diagnosis and prognosis - a living sys...
 
Twitter Improves Seasonal Influenza Prediction
Twitter Improves Seasonal Influenza PredictionTwitter Improves Seasonal Influenza Prediction




Correcting for missing data, measurement error and confounding

  • 1. Correcting for missing data, measurement error and confounding Maarten van Smeden, PhD University Medical Center Utrecht Julius Center for Health Sciences and Primary Care The Netherlands Twitter: @MvanSmeden Email: M.vanSmeden@umcutrecht.nl 30 November 2020 Methods meeting Epidemiology methods group, UMC Utrecht I have no conflicts of interest to declare
  • 3. Rationale
      • Confounding -> correlation is not causation
      • Measurement error & missing data -> correlation is not always correlation
      • In causal epidemiologic research we often see all three … but we rarely try to “correct” for all three
  • 4. There is no shortage of methods (a non-exhaustive list of statistical correction strategies)
      • Confounding: multivariable adjustments; weighting; matching; instrumental variable analysis; RANDOMIZATION (!); “Bayesian approaches”
      • Missing data: multiple imputation; weighting; full information maximum likelihood; last observation carried forward; missing indicator methods; “Bayesian approaches”
      • Measurement error: regression calibration; weighting; multiple imputation for ME; SIMEX; full information maximum likelihood; “Bayesian approaches”
  • 5. Outline
      • Confounding (1 slide)
      • Missing data (2 slides)
      • Measurement error (many slides)
      • How to “solve” all three? (a couple more slides)
      • What about prediction? (if I have time left)
  • 6. Confounding
      • A: treatment (Tx; 1 for treated, 0 for not treated)
      • Y: outcome (1 for death, 0 for survival)
      • Potential outcomes: Y^{a=1} is the outcome under Tx, Y^{a=0} the outcome under no Tx; we usually observe either Y^{a=0} or Y^{a=1} for an individual
      • Randomized trials: Y^a ⊥ A (unconditional exchangeability)
      • Observational studies aim for Y^a ⊥ A | L (conditional exchangeability), with L the confounding variables -> no unmeasured confounding
      • (Additional causal assumptions: positivity, consistency, SUTVA, …)
      • More info: Hernán & Robins, Causal Inference: What If
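
The exchangeability conditions on the confounding slide can be illustrated with a small simulation. This is a minimal sketch under assumed numbers (a single binary confounder L, a constant risk difference of -0.10, and all probabilities below are hypothetical, not taken from the talk). It compares the crude risk difference with one standardized over L.

```python
import numpy as np

rng = np.random.default_rng(2020)
n = 200_000

# Hypothetical data-generating mechanism: L confounds the A-Y relation.
L = rng.binomial(1, 0.5, n)                       # confounder (e.g. disease severity)
A = rng.binomial(1, np.where(L == 1, 0.8, 0.2))   # sicker patients are treated more often

# Potential outcomes: Tx lowers the risk of death by 0.10 for everyone.
p_y0 = np.where(L == 1, 0.6, 0.2)                 # risk of death under no Tx
Y0 = rng.binomial(1, p_y0)                        # Y^{a=0}
Y1 = rng.binomial(1, p_y0 - 0.10)                 # Y^{a=1}
Y = np.where(A == 1, Y1, Y0)                      # only one potential outcome is observed

true_rd = Y1.mean() - Y0.mean()
crude_rd = Y[A == 1].mean() - Y[A == 0].mean()

# Standardization over L, which relies on conditional exchangeability given L
std_rd = sum(
    (Y[(A == 1) & (L == l)].mean() - Y[(A == 0) & (L == l)].mean()) * (L == l).mean()
    for l in (0, 1)
)
print(f"true {true_rd:.3f} | crude {crude_rd:.3f} | standardized over L {std_rd:.3f}")
```

In this setup the crude contrast has the wrong sign because treated patients are sicker, while the L-standardized contrast is close to the true risk difference.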
  • 7. Missing data
      • Missing values are observations/records which were:
          – never collected (either by design or not)
          – lost accidentally
          – wrongly collected and so deleted (measurement error?)
      • Usually we distinguish between three types of missing data:
          – MCAR: the probability that data are missing does NOT depend on the values of observed or missing data
          – MAR: the probability that data are missing depends on the values of the observed data, but does NOT depend on the values of the missing data
          – MNAR: the probability that data are missing depends on the values of the missing data
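
To make the three missingness mechanisms concrete, the sketch below generates each of them for one variable and reports the complete-case mean. The variables `x` and `y` and all coefficients are hypothetical choices made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 100_000
x = rng.normal(size=n)               # always observed
y = 0.5 * x + rng.normal(size=n)     # variable that gets missing values; true mean is 0

def expit(z):
    return 1 / (1 + np.exp(-z))

# Probability that y is missing under each mechanism (assumed coefficients)
p_miss = {
    "MCAR": np.full(n, 0.3),         # depends on nothing
    "MAR": expit(-1 + 1.5 * x),      # depends on the observed x only
    "MNAR": expit(-1 + 1.5 * y),     # depends on the value of y itself
}

cc_mean = {}
for name, p in p_miss.items():
    observed = rng.random(n) > p
    cc_mean[name] = y[observed].mean()
    print(f"{name}: complete-case mean of y = {cc_mean[name]:.3f}")
```

Only under MCAR does the complete-case mean stay near the true value of 0 here. Under MAR the observed `x` carries the information needed to correct it (for example by imputation); under MNAR it does not.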
  • 8. Beware of mindless imputation. Source: Hughes et al., IJE, 2019, doi: 10.1093/ije/dyz032
  • 9. Personal observations (I may be biased): causal inference epidemiology
      • Confounding is on center stage in analyses and discussion
      • Missing data often cannot be ignored (especially for higher %): is performing multiple imputation becoming mainstream?
      • Measurement error is the elephant in the room: it ends up in the discussion section (not the methods), lots of misconceptions!
      • (Note: the three are not independent, e.g. measurement error can result in problems with confounding)
  • 10. Measurement error: “Errors in reading, calculating or recording a numerical value. The difference between observed values of a variable recorded under similar conditions and some fixed true value.” The Cambridge Dictionary of Statistics (4th ed), ISBN: 9780521766999
  • 11. img: https://bit.ly/2T9UnRt
  • 12. Measurement of systolic blood pressure
      • Measurement error due to: white coat effect [1]; non-adherence to measurement protocol [2]; fallibility of measurement instruments [3]; …
      • Measurement error varies with: number of BP measurements taken [4]; gender [4]; circadian rhythm; …
      • doi: [1] 10.1370/afm.1211; [2] 10.3399/096016407782604965; [3] 10.2147/MDER.S141599; [4] 10.3109/08037051.2014.986952
  • 13. Example: circadian rhythm. doi: 10.1111/j.1552-6909.2000.tb02771.x
  • 14. Imprecision of medical measurements. doi: 10.1136/bmj.m149
  • 15. Measurement error: a long list • Blood pressure • Dietary intake • Smoking status • Air pollution • BMI • Physical activity • Vaccination status • Social class • Carotid intima media thickness • Thyroid hormone levels • Glucose levels • Cholesterol levels • Income • Family history • Mental health history • Education level • “Intelligence” • Respiratory rates • Medication use • Sedentary hours • Vitamin use • Immigration status • Age at first intercourse • Age at menopause • ICD coding • Symptoms • Date of symptom onset • Medication use • Visceral adipose tissue • Angina class • Heart rate • Grip and pinch strength • Cough frequency • Infant height • Gestational age • Disease specific mortality • …
  • 16. Measurement error mentioned, journals of epidemiology: Jurek et al. 2006 [1]: 61% (N = 35); Brakenhoff et al. 2018 [2]: 56% (N = 198); Shaw et al. 2019 [3]: 80% (N = 65). doi: [1] 10.1007/s10654-006-9083-0; [2] 10.1016/j.jclinepi.2018.02.02; [3] 10.1016/j.annepidem.2018.09.001
  • 17. Measurement error mentioned, journals of general medicine: Brakenhoff et al. 2018 [2]: 25% (N = 57). doi: [2] 10.1016/j.jclinepi.2018.02.02
  • 18. Measurement error “corrections” applied, journals of epidemiology: Jurek et al. 2006 [1]: 2% (N = 1); Brakenhoff et al. 2018 [2]: 4% (N = 13); Shaw et al. 2019 [3]: 6% (N = 5). doi: [1] 10.1007/s10654-006-9083-0; [2] 10.1016/j.jclinepi.2018.02.02; [3] 10.1016/j.annepidem.2018.09.001
  • 19. Measurement error “corrections” applied, journals of general medicine: Brakenhoff et al. 2018 [2]: 0% (N = 0). doi: [2] 10.1016/j.jclinepi.2018.02.02
  • 23. Five myths about measurement error
      • Myth 1: measurement error can be compensated for by large numbers of observations
      • Myth 2: the exposure effect is underestimated when variables are measured with error
      • Myth 3: exposure measurement error is nondifferential if measurements are taken without knowledge of the outcome
      • Myth 4: measurement error can be prevented but not mitigated in epidemiological data analyses
      • Myth 5: certain types of epidemiological research are unaffected by measurement error
  • 25. Types of measurement error (courtesy: Linda Nab)
      • Classical (random) measurement error: measurements fluctuate around their true value
      • Systematic measurement error: measurements are consistently wrong in a particular direction
      • Differential measurement error: measurements are consistently wrong in a particular direction, varying per group
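
The three error types can be written as simple generating models. The sketch below is illustrative only: the true variable (a blood-pressure-like `x`), the binary `group`, and every coefficient are assumptions, not values from the talk.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
group = rng.binomial(1, 0.5, n)      # e.g. outcome status or treatment arm
x = rng.normal(120, 15, n)           # true value (think systolic BP in mmHg)

x_classical = x + rng.normal(0, 10, n)                  # random scatter around the truth
x_systematic = 5 + 1.1 * x + rng.normal(0, 10, n)       # consistently off in one direction
x_differential = x + 8 * group + rng.normal(0, 10, n)   # size of the error depends on group

for name, xs in [("classical", x_classical),
                 ("systematic", x_systematic),
                 ("differential", x_differential)]:
    err = xs - x
    print(f"{name:12s} mean error {err.mean():6.2f} | "
          f"group 0: {err[group == 0].mean():6.2f} | group 1: {err[group == 1].mean():6.2f}")
```

Classical error averages out to zero, systematic error has a non-zero mean that is the same in both groups, and differential error has a mean that differs between groups.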
  • 26. Classical measurement error
  • 29. Triple whammy of measurement error
      • Bias
      • Increased imprecision
      • Masked functional relations
      • Always weaker effects?
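
A short simulation can show why a larger sample does not compensate for measurement error (Myth 1). It is a sketch under assumed values: a single exposure with classical error, a linear outcome model, and unit variances. In that setting the naive slope converges to the true slope times var(X) / (var(X) + var(U)), regardless of n.

```python
import numpy as np

rng = np.random.default_rng(42)

def slope(x, y):
    """OLS slope of y on a single covariate x."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

true_beta, sd_x, sd_u = 1.0, 1.0, 1.0
expected = true_beta * sd_x**2 / (sd_x**2 + sd_u**2)   # attenuated slope, 0.5 here

estimates = {}
for n in (1_000, 100_000, 1_000_000):
    x = rng.normal(0, sd_x, n)                   # true exposure
    y = true_beta * x + rng.normal(0, 1, n)      # outcome
    x_star = x + rng.normal(0, sd_u, n)          # exposure with classical error
    estimates[n] = slope(x_star, y)
    print(f"n = {n:>9,}: slope using x* = {estimates[n]:.3f} (large-sample value {expected:.2f})")
```

The estimate becomes more precise as n grows, but it settles on the attenuated value rather than the true slope of 1.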
  • 30. Example: classical measurement error. doi: 10.1371/journal.pone.0192298
  • 31. Example: classical measurement error. Second Manifestations of ARTerial disease (SMART) cohort. Diagram labels: effect of interest; confounder with error; outcome. doi: 10.1371/journal.pone.0192298
  • 32. Example: classical measurement error. % bias in hazard ratio for SBP (multivariable Cox regression model). doi: 10.1371/journal.pone.0192298
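
The SMART example concerns a confounder measured with error. The sketch below is not a reanalysis of that cohort: it swaps the Cox model for a linear model and uses made-up coefficients, to show that adjusting for an error-prone confounder leaves residual confounding, so the exposure effect can be over- or underestimated (Myth 2).

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200_000

def adjusted_effect(conf_sign):
    """OLS coefficient of exposure a, adjusting for an error-prone confounder."""
    l = rng.normal(size=n)                                   # true confounder
    a = 0.7 * l + rng.normal(size=n)                         # exposure depends on l
    y = 1.0 * a + conf_sign * 1.0 * l + rng.normal(size=n)   # true effect of a is 1.0
    l_star = l + rng.normal(size=n)                          # confounder with classical error
    design = np.column_stack([np.ones(n), a, l_star])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]

over = adjusted_effect(+1)    # confounder raises the outcome
under = adjusted_effect(-1)   # confounder lowers the outcome
print(f"true effect 1.00 | adjusted for L*: {over:.2f} (positive confounding), "
      f"{under:.2f} (negative confounding)")
```

With these assumed values the adjusted estimate lands near 1.28 in one scenario and near 0.72 in the other, so the direction of the bias depends on the direction of the confounding.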
  • 36. Randomized trials unaffected? Excerpt from doi: 10.1186/s13063-018-2954-3
  • 37. Randomized controlled trials. doi: 10.1002/sim.8359
  • 39. Randomized controlled trials (doi: 10.1002/sim.8359)
      • Classical (random) measurement error: unbiased Tx effect estimator; increased Type-II error; nominal Type-I error
      • Systematic measurement error: possibly biased Tx effect estimator; Type-II error affected; Type-I error generally nominal
      • Differential measurement error: possibly biased Tx effect estimator; Type-II error affected; Type-I error not nominal
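
The classical-error entry for randomized trials can be checked by simulation. The sketch assumes the error sits on a continuous outcome in a two-arm trial analysed with a simple z-test; the effect size, arm size, and error variance are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(11)
n_per_arm, effect, n_trials = 100, 0.4, 2_000

def run_trials(error_sd):
    """Return the mean estimated Tx effect and the power (two-sided z-test, alpha = 0.05)."""
    tx = effect + rng.normal(size=(n_trials, n_per_arm))   # true outcomes, treated arm
    ct = rng.normal(size=(n_trials, n_per_arm))            # true outcomes, control arm
    tx = tx + rng.normal(0, error_sd, tx.shape)            # classical error on the outcome
    ct = ct + rng.normal(0, error_sd, ct.shape)
    diff = tx.mean(axis=1) - ct.mean(axis=1)
    se = np.sqrt(tx.var(axis=1, ddof=1) / n_per_arm + ct.var(axis=1, ddof=1) / n_per_arm)
    return diff.mean(), np.mean(np.abs(diff / se) > 1.96)

est0, power0 = run_trials(error_sd=0.0)
est1, power1 = run_trials(error_sd=1.0)
print(f"no error:        mean estimate {est0:.3f}, power {power0:.2f}")
print(f"classical error: mean estimate {est1:.3f}, power {power1:.2f}")
```

Both scenarios recover the assumed effect of 0.4 on average, but the trials with a noisy outcome reject the null less often, which is the increased Type-II error listed on the slide.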
  • 41. Triple whammy of measurement error
      • Bias (usually the target for measurement error “corrections”)
      • Increased imprecision
      • Masked functional relations
  • 43. Measurement error corrections: replicates study. Study sample (Y*): standard measurements, replicated
  • 44. Measurement error corrections: external validation set. Study sample (Y*): standard measurements. External validation set: standard measurements + validated measurements
  • 45. Measurement error corrections: internal validation set. Study sample (Y*): standard measurements. Internal validation set: standard measurements + validated measurements
  • 46. Simulation study
      • OLS regression: Y = a0 + a1·A + b1·L1 + … + bp·Lp + e, e ~ N(0, s)
      • a1: effect of primary interest
      • A, L ~ multivariate normal with mean vector 0 and a correlation matrix with equal pairwise correlations
      • Random measurement error on A, generating a new A*
      • Missing data (MAR) on L1
      • True values: a0 = 0, a1 = 10, and b1 = b2 = … = bp based on the total confounding effect (crude minus adjusted)
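
One dataset from a setup like the one described on the simulation slide could be generated as follows. Only a0 = 0 and a1 = 10 come from the slide. The sample size, number of confounders, correlation, confounder coefficients, error variance, and the missingness model are all placeholder assumptions.

```python
import numpy as np

rng = np.random.default_rng(2020)
n, p, rho = 500, 3, 0.3            # assumed sample size, number of L's, pairwise correlation
a0, a1, b = 0.0, 10.0, 2.0         # a0, a1 from the slide; b is an assumed common coefficient

# (A, L1..Lp) ~ multivariate normal, mean 0, equal pairwise correlations
corr = np.full((p + 1, p + 1), rho)
np.fill_diagonal(corr, 1.0)
AL = rng.multivariate_normal(np.zeros(p + 1), corr, size=n)
A, L = AL[:, 0], AL[:, 1:]

Y = a0 + a1 * A + L @ np.full(p, b) + rng.normal(0, 1, n)
A_star = A + rng.normal(0, 0.5, n)           # random measurement error on A

# MAR: missingness in L1 depends only on the fully observed Y (assumed mechanism)
p_miss = 1 / (1 + np.exp(-(-1.0 + 0.1 * Y)))
L_obs = L.copy()
L_obs[rng.random(n) < p_miss, 0] = np.nan

print(f"share of L1 missing: {np.isnan(L_obs[:, 0]).mean():.2f}")
```

An analyst would see `Y`, `A_star`, `L_obs`, and the true `A` only in a validation subset.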
  • 47. Simulation factors: 100,000 generated datasets by random draws from simulation factors
  • 49. Sequential models
      • MIME: multiple imputation for measurement error. Multiply impute both A (only observed in a subset) and the missing values in L1 using full conditional specification (Y, A, A*, L), followed by OLS using A and L as covariates (Rubin’s rules)
      • MIRC: multiple imputation and regression calibration
          1. Impute missing values in L1
          2. In the subset: OLS for A given A*, L
          3. For the entire set: Arc = E(A | A*, L)
          4. For each multiply imputed set: OLS using Arc and L as covariates, and adjust standard errors (RC)
          5. Combine using Rubin’s rules
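
The regression calibration part of MIRC (steps 2 to 4) can be sketched as below. This is a simplified illustration, not the implementation used in the simulation study: it has a single fully observed confounder, so the imputation step, the standard error adjustment, and Rubin's rules are left out, and all data-generating values other than a1 = 10 are assumed.

```python
import numpy as np

rng = np.random.default_rng(5)
n, n_val = 10_000, 2_000

L = rng.normal(size=n)                       # confounder (complete here, so no imputation)
A = 0.5 * L + rng.normal(size=n)             # true exposure
Y = 10 * A + 2 * L + rng.normal(size=n)      # outcome, true a1 = 10
A_star = A + rng.normal(size=n)              # error-prone exposure, observed for everyone
val = np.arange(n) < n_val                   # internal validation subset where A is observed

def ols(X, y):
    """OLS coefficients (intercept first) of y on the columns of X."""
    X = np.column_stack([np.ones(len(y)), X])
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Step 2: calibration model for A given A* and L, fitted in the validation subset
g = ols(np.column_stack([A_star[val], L[val]]), A[val])
# Step 3: A_rc = E(A | A*, L) for the entire set
A_rc = g[0] + g[1] * A_star + g[2] * L
# Step 4 (point estimate only): outcome model with A_rc in place of A
naive = ols(np.column_stack([A_star, L]), Y)[1]
rc = ols(np.column_stack([A_rc, L]), Y)[1]
print(f"true a1 = 10 | naive (A*): {naive:.2f} | regression calibration: {rc:.2f}")
```

With these values the naive coefficient is roughly halved, while the calibrated one is close to 10. Valid standard errors still need to account for the estimated calibration model, which is what the "adjust standard errors (RC)" step refers to.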
  • 50. Simultaneous models
      • Conditional submodels: Y | A, L (primary analysis model); A* | Y, A, L; A | L; L1 | L2, …, Lp
      • Estimated simultaneously with: MCMC (Bayes, uninformative priors) or full information maximum likelihood (FIML, structural equation model)
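
The four conditional submodels multiply to a joint density by the chain rule. The factorization below is a sketch in the slide's notation. The parameter vector θ and the integral form of the observed-data likelihood are added here for illustration, and any simplification of the error model (for example nondifferential error, where A* depends on A only) is an extra assumption, not something stated on the slide.

```latex
\begin{align*}
p(Y, A^{*}, A, L_1 \mid L_2,\dots,L_p;\theta)
  &= \underbrace{p(Y \mid A, L)}_{\text{analysis model}}
     \;\underbrace{p(A^{*} \mid Y, A, L)}_{\text{measurement error model}}
     \;\underbrace{p(A \mid L)}_{\text{exposure model}}
     \;\underbrace{p(L_1 \mid L_2,\dots,L_p)}_{\text{missing covariate model}} \\[4pt]
\mathcal{L}_{\text{obs}}(\theta)
  &= \prod_{i} \int p\bigl(Y_i, A^{*}_i, A_i, L_{1i} \mid L_{2i},\dots,L_{pi};\theta\bigr)\,
     \mathrm{d}(\text{unobserved } A_i, L_{1i})
\end{align*}
```

FIML maximizes the observed-data likelihood directly, whereas the Bayesian approach samples the unobserved A and L1 values together with θ.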
  • 52. What does this mean?
      • Simple setting (OLS, 1 covariate with missing data, 1 covariate with measurement error, internal validation): “full adjustment” approaches work really well, even with N as small as 100
      • Differences especially in rMSE, nearly no bias
      • The Bayesian approach seems most promising (for its frequentist properties!): least bias, easy to expand to other link functions, multivariate missing data and measurement error
  • 53. Measurement error models are not new. doi: 10.2307/1422689
  • 55. Exceptions?
      • Measurement error in prognostic factors in an RCT
          – Same argument as for missing data (e.g. see White and Thompson, Stat Med 2005)
      • Special case of measurement error in a confounder
          – e.g. confounding by indication, where indication was based on the confounder with error
  • 56. Nab et al., Epidemiology, 2020, doi: 10.1097/EDE.0000000000001239
  • 57. Sensitivity analysis tool: https://lindanab.shinyapps.io/SensitivityAnalysis/ Preprint: https://arxiv.org/abs/1912.05800
  • 58. Exceptions?
      • Measurement error in prognostic factors in an RCT
          – Same argument as for missing data (e.g. see White and Thompson, Stat Med 2005)
      • Special case of measurement error in a confounder
          – e.g. confounding by indication, where indication was based on the confounder with error
      • Prediction models BUT…
  • 59. Measurement heterogeneity
  • 60. Measurement: are labels the new oil? https://twitter.com/DrHughHarvey/status/1230218991026819077