Common statistical pitfalls & errors
in biomedical research (a top-5 list)
Evangelos I. Kritsotakis
Assoc. Prof. of Biostatistics, Med. School, University of Crete
Honorary Senior Lecturer, ScHARR, University of Sheffield
e.kritsotakis@uoc.gr
10.06.2023
Outline and disclaimer
Top-5 list of common statistical pitfalls leading to errors, related to:
• Normality
• Time confounding
• Linearity
• Clustering
• Calibration
• This is a personal view based on my experience as a reader, reviewer, and editor of medical journals;
o it might be incomplete and biased, but hopefully it will be useful.
• These problems are well known to statisticians and methodologists, but they continue to appear in medical journals.
NORMALITY: Who is afraid of non-normal data?
Data from the HELAS cohort of emergency laparotomies (figure panels: serum albumin, blood urea nitrogen).
• It makes sense to summarize the data with the median and IQR (rather than mean ± SD).
• Most researchers would apply a non-parametric test (e.g. Mann-Whitney U-test).
• But the t-test will work fine in this situation!
• In fact, it is more appropriate and informative to use the t-test than non-parametric tests.
NORMALITY: Who is afraid of non-normal data?
The t-test, and thus linear regression, are NOT afraid of non-normal data!
http://onlinestatbook.com/stat_sim/sampling_dist/index.html
http://www.youtube.com/watch?v=tHU0_-Jzg34
• The t-test assumes Normality per group, so that sample means are Normally distributed,
but
• by the central limit theorem, the sample means approach the Normal distribution as the sample size increases, regardless of the distribution of the original observations.
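The central limit theorem argument can be checked with a small simulation. The sketch below is illustrative only and is not from the slides: it assumes an exponential distribution as the skewed population, and the function names `sample_means` and `skewness` are hypothetical. It shows how the skewness of the sample mean shrinks as the per-group sample size grows.

```python
import random
import statistics

def sample_means(n, replications=2000, seed=1):
    """Means of `replications` samples of size n from a right-skewed
    exponential population (an assumed example distribution)."""
    rng = random.Random(seed)
    return [statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
            for _ in range(replications)]

def skewness(values):
    """Moment-based sample skewness (close to 0 for a symmetric distribution)."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean([((v - m) / sd) ** 3 for v in values])

# Skewness of the sampling distribution of the mean for growing sample sizes
for n in (2, 25, 200):
    print(f"n = {n:3d}: skewness of sample means = {skewness(sample_means(n)):.2f}")
```

The raw observations stay heavily skewed at every sample size; only the distribution of the means becomes more symmetric, which is what the t-test relies on.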
NORMALITY: Who is afraid of non-normal data?
The t-test, and thus also linear regression, are NOT afraid of non-normal data!
Rules of thumb for the t-test:
• n < 25 per group: the data must be normally distributed to use the t-test.
• n > 25 per group: with no extreme outliers, the t-test can handle moderately skewed distributions.
• n > 200 per group: the t-test is robust to heavily skewed distributions.
When should you use a non-parametric test?
• n < 25 per group (as it is very difficult to confirm normality)
Eur J Endocrinol 2020;183(2):L1-L3.
Please DO NOT perform statistical tests for normality (e.g. Kolmogorov–Smirnov or Shapiro–Wilk tests)!
NORMALITY: Applying non-parametrics in large samples - PITFALL
Parametric vs. non-parametric tests:
t-test vs. Wilcoxon-Mann-Whitney test
Rejection rates (p < 0.05) of the WMW and t-tests after 10 000 replications.
Data drawn at random from skewed gamma distributions (skewness coefficient = 3), with equal means and medians, $SD_1 = 1.1 \times SD_2$.
BMC Med Res Methodol 2012;12:78.
FOLLOW UP TIME: frequently variable and/or incomplete
• Patients entering a trial may have different lengths of follow-up.
• Not all patients will experience the event of interest by the end of data collection.
• Times to the outcome event (endpoint) are incomplete (right censored).
[Figure: prognostic study design and patient follow-up, with markers for censoring and event occurrence; S = short, M = medium, L = long serial time. Source: Otolaryngol Head Neck Surg. 2010.]
FOLLOW UP TIME: ignoring variable follow ups is an error!
[Figure: time to pain relief (hours) for patients on Drug A and Drug B; R = relief of pain.]
• Pain relief proportions are ¾ (75%) for both drugs, but drug A is preferable.
• Times to event should not be ignored!
• One solution is to use (average) incidence rates:
$IR_A = 3/12 = 0.25$ and $IR_B = 3/18 \approx 0.17$ events per person-hour.
• Compare using standard Poisson or negative binomial regression models.
• This assumes constant rates and no censoring.
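The incidence-rate arithmetic can be reproduced in a few lines. This is a minimal sketch using the event counts and person-hours from the slide (3 events in 12 person-hours for drug A, 3 events in 18 person-hours for drug B); the function name `incidence_rate` and the rate ratio are added here for illustration.

```python
def incidence_rate(events, person_time):
    """Average incidence rate: events per unit of person-time."""
    return events / person_time

ir_a = incidence_rate(events=3, person_time=12)  # drug A
ir_b = incidence_rate(events=3, person_time=18)  # drug B
rate_ratio = ir_a / ir_b                         # relief rate of A relative to B

print(f"IR_A = {ir_a:.2f}, IR_B = {ir_b:.2f} events per person-hour")
print(f"Rate ratio (A vs B) = {rate_ratio:.2f}")
```

Both drugs have the same proportion with relief, yet the rate ratio shows that relief arrives about 1.5 times as fast on drug A.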
FOLLOW UP TIME: ignoring censoring is an error!
Naïve suggestions:
A. Use complete data only, excluding patients with incomplete follow-up (too pessimistic!).
B. Assume censored patients survived until the end of the study (too optimistic!).
Solution:
C. Account for censoring with survival analysis methods: Kaplan-Meier, Cox regression, etc.
1-year survival:
A) 27%
B) 47%
C) 41%
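The three approaches can be compared on a toy data set. The sketch below uses hypothetical follow-up data (not the data behind the percentages above) and a hand-written Kaplan-Meier product-limit estimator; `kaplan_meier` and `naive_estimates` are illustrative names, not functions from any library.

```python
def kaplan_meier(data, horizon):
    """Kaplan-Meier survival probability at `horizon`.
    data: list of (time, event) pairs; event = 1 for death, 0 for censoring."""
    survival = 1.0
    event_times = sorted({t for t, e in data if e == 1 and t <= horizon})
    for t in event_times:
        at_risk = sum(1 for time, _ in data if time >= t)
        deaths = sum(1 for time, e in data if time == t and e == 1)
        survival *= 1 - deaths / at_risk
    return survival

def naive_estimates(data, horizon):
    """Approach A (drop incompletely followed patients) and
    approach B (treat censored patients as survivors)."""
    deaths = sum(1 for t, e in data if e == 1 and t <= horizon)
    complete = [(t, e) for t, e in data if e == 1 or t >= horizon]
    return 1 - deaths / len(complete), 1 - deaths / len(data)

# Hypothetical follow-up in months: 4 deaths, 3 early censorings, 3 alive at 12 months
data = [(2, 1), (3, 0), (4, 1), (5, 0), (6, 1),
        (8, 0), (9, 1), (12, 0), (12, 0), (12, 0)]

surv_a, surv_b = naive_estimates(data, horizon=12)
surv_c = kaplan_meier(data, horizon=12)
print(f"A: {surv_a:.2f}  B: {surv_b:.2f}  C (Kaplan-Meier): {surv_c:.2f}")
```

On these toy data the ordering matches the slide: approach A gives the lowest survival estimate, approach B the highest, and the Kaplan-Meier estimate falls between them.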
TIME DEPENDENT EFFECTS: e.g. non-proportional hazards
Kaplan-Meier survival curves showing the probabilities of remaining infection free.
Piecewise Cox model to estimate vaccine efficacy:
VE = 59% (95%CI 31% to 75%; P = 0.001) during first 9 weeks
VE = -17% (95%CI -76% to 23%; P = 0.460) during last 6 weeks
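As a reading aid, and assuming the usual definition of vaccine efficacy from a Cox model, VE = (1 − HR) × 100% (an assumption; the slide does not state the formula), the reported estimates correspond to the following hazard ratios:

$$HR = 1 - \frac{VE}{100}: \qquad HR_{\text{first 9 weeks}} = 1 - 0.59 = 0.41, \qquad HR_{\text{last 6 weeks}} = 1 - (-0.17) = 1.17$$

A hazard ratio above 1 in the later period is what the negative VE point estimate means, although its confidence interval is compatible with no effect.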
TIME TRENDS: over time, things may change anyway! - PITFALL
One measure before and after intervention (group-level data).
Accounting for time trends may tell a different story!
TIME TRENDS: the interrupted time series model
Res Synth Methods 2021; 12(1):106-117
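Segmented regression: $Y_t = \beta_0 + \beta_1 t + \beta_2 X_t + \beta_3 (t - t_0) X_t$
Figure annotations: $t_0$ = time of the intervention; $\beta_1$ = pre-intervention slope; $\beta_1 + \beta_3$ = post-intervention slope; $\beta_2$ = level change at $t_0$.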
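To see how the four coefficients of the segmented regression shape the fitted line, the sketch below evaluates the model and the counterfactual (the pre-intervention trend continued). The coefficient values are hypothetical, and coding X_t as 1 from t0 onwards is an assumed convention; the function names are illustrative.

```python
def its_prediction(t, t0, b0, b1, b2, b3):
    """Fitted value of Y_t = b0 + b1*t + b2*X_t + b3*(t - t0)*X_t."""
    x = 1 if t >= t0 else 0  # X_t: 0 before the intervention, 1 from t0 onwards (assumed coding)
    return b0 + b1 * t + b2 * x + b3 * (t - t0) * x

def counterfactual(t, b0, b1):
    """Projected value had the pre-intervention trend simply continued."""
    return b0 + b1 * t

# Hypothetical coefficients, for illustration only
b0, b1, b2, b3, t0 = 10.0, 0.5, -3.0, -0.25, 12

for t in (11, 12, 13, 24):
    fitted = its_prediction(t, t0, b0, b1, b2, b3)
    print(f"t = {t:2d}: fitted = {fitted:.2f}, counterfactual = {counterfactual(t, b0, b1):.2f}")
```

At t0 the gap between the fitted and counterfactual values equals the level change b2, and after t0 the fitted line rises by b1 + b3 per time unit instead of b1.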
TIME TRENDS: ITS Example (1)
Carbapenem-focused antimicrobial stewardship intervention, Jan 2020 – Dec 2020, University Hospital of Heraklion
Treatments per 100 hospital admissions:
• Level change: IRR 0.63 (95% CI 0.50–0.80), P < 0.001
• Trend change: IRR 1.02 (95% CI 1.00–1.04), P = 0.117
Quarterly data on hospital consumption of carbapenems:
• Level change: −4.9 DDD/100 PD (95% CI −7.3 to −2.6), P = 0.007
J Antimicrob Chemother 2023;78(4):1000-1008.
TIME TRENDS: ITS Example (2)
Impact of SARS-CoV-2 preventive measures against healthcare-associated infections
from multidrug-resistant ESKAPEE pathogens (PAGNH + VENIZELEIO):
• Pre-COVID-19 period (3/2019 – 2/2020): 1.06 infections per 1,000 patient-days.
• COVID-19 period (3/2020 – 2/2021): 1.11 infections per 1,000 patient-days.
• IRR = 1.05 (overall), P = 0.58.
Figure annotations (two ITS panels): IRR = 0.46 (level drop); IRR = 0.44 (level drop).
Antibiotics 2023; 12(7):1088
LINEARITY: non-linear relationships are common - PITFALL
[Figure: probability P plotted against the linear predictor ΣbX.]
For the odds of binary outcome Y, the logistic regression model is:
loge(odds of Y) = b0 + b1X1 + b2X2 + b3X3 + … (linearity in logit)
or, equivalently:
$$\text{Probability of } Y = \frac{1}{1 + e^{-(b_0 + b_1 X_1 + b_2 X_2 + b_3 X_3 + \dots)}}$$
• Non-linear probability model.
• Log-linear odds model.
• Measure of effect is the Odds Ratio (OR).
• Assumes that a 1-unit increase in a covariate X has the same effect (OR) on the outcome across the entire range of the covariate's values. This is a very strong assumption and should be checked for continuous variables!
• Use cubic splines or fractional polynomials.
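The difference between "linear in the logit" and "non-linear in probability" can be made concrete with a small sketch. The coefficients b0 and b1 below are hypothetical, and the helper names `probability` and `odds` are illustrative.

```python
import math

def probability(log_odds):
    """Inverse logit: convert log-odds to a probability."""
    return 1 / (1 + math.exp(-log_odds))

def odds(p):
    """Odds corresponding to probability p."""
    return p / (1 - p)

b0, b1 = -4.0, 0.5  # hypothetical coefficients, for illustration only

for x in (0, 4, 8, 12):
    p_now = probability(b0 + b1 * x)
    p_next = probability(b0 + b1 * (x + 1))
    print(f"X = {x:2d}: P = {p_now:.3f}, change in P for +1 unit = {p_next - p_now:.3f}, "
          f"OR = {odds(p_next) / odds(p_now):.3f}")
```

The odds ratio for a 1-unit increase is the same at every value of X (it equals exp(b1)), while the change in probability varies along the curve. The model forces that constant OR on the data, which is why the assumption needs checking.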
LINEARITY: visualizing the effects before modelling
• HELAS cohort of emergency laparotomy patients in Greece
• Outcome: 30-day post-operative death
• Covariate: Age
• Logistic regression model: loge(odds of death) = b0 + b1 × AGE
OR = 1.75 (95% CI 1.47–2.09) per 10-year increase in age (P < 0.001),
i.e. the odds of death after emergency laparotomy (EL) increase by 75% for each additional 10 years of age across the entire range of ages (linearity).
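Under the linearity assumption, the reported OR can be rescaled to other age differences. A short worked example (arithmetic only; the 1-year and 20-year figures are derived here and are not reported in the source):

$$b_1 = \frac{\ln(1.75)}{10} \approx 0.056 \text{ per year}, \qquad OR_{1\ \text{year}} = 1.75^{1/10} \approx 1.06, \qquad OR_{20\ \text{years}} = 1.75^{2} \approx 3.06$$

If the true age effect is not linear in the logit, these rescaled values can be misleading for parts of the age range.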
World J Surg. 2023 Jan;47(1):130-139.
LINEARITY: visualizing the effects before modelling
• HELAS cohort of emergency laparotomy patients in Greece
• Outcome: 30-day post-operative death
• Covariate: BMI
World J Surg. 2023 Jan;47(1):130-139.
CLUSTERING: within-groups correlation - PITFALL
• Clustering occurs when data within a cluster tend to be ‘more alike’ (‘intra-cluster correlation’).
• By design:
o longitudinal studies with repeated measurements (clusters = patients),
o data compiled across multiple experiments (clusters = trials),
o meta-analysis of different studies (clusters = studies),
o multicenter studies,
o cluster-randomized controlled trials,
o cluster sampling in cross-sectional surveys.
• By nature:
o subjects clustered within centers (surgeons, clinics, hospitals);
o clustering by surgeon or therapist delivering the intervention.
CLUSTERING: ignoring within-groups correlation
• Many statistical tests and models require independent data. Applying them to clustered data produces a false sense of precision and a higher chance of Type I error, so incorrect conclusions may be drawn.
• Data within a cluster do not contribute completely independent information, so the “effective” sample size is less than the total number of observations.
The color of each data point represents the cluster to which it belongs
J Neurosci 2010;30(32):10601-8
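One common way to quantify the loss of information is the design effect, 1 + (m − 1) × ICC, for clusters of equal size m. This formula is a standard approximation brought in here for illustration and does not appear on the slides; the function name and the numbers are hypothetical.

```python
def effective_sample_size(n_total, cluster_size, icc):
    """Approximate effective sample size under equal cluster sizes,
    using the design effect 1 + (m - 1) * ICC."""
    design_effect = 1 + (cluster_size - 1) * icc
    return n_total / design_effect

# Hypothetical study: 1000 observations in 50 clusters of 20
for icc in (0.0, 0.05, 0.5, 1.0):
    n_eff = effective_sample_size(n_total=1000, cluster_size=20, icc=icc)
    print(f"ICC = {icc:.2f}: effective sample size = {n_eff:.0f}")
```

With no intra-cluster correlation all 1000 observations count, while with perfect correlation only the 50 clusters carry independent information.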
CLUSTERING: Consequences of ignoring clustering
J Neurosci 2010;30(32):10601-8
CLUSTERING: methods to account for intra-cluster correlation
• ‘Fixed effects’ method: add one binary predictor variable for each cluster in a regression / ANOVA model (using one cluster as the reference).
o Simplest method, but requires a small number of clusters.
o Results are strictly applicable only to the particular set of clusters.
o Cannot be used in designs such as cluster RCTs.
• ‘Random effects’ model (aka mixed or multilevel model):
o conditional (cluster-specific) estimate of effect, for an individual changing exposure level within a specified cluster,
o estimate of the between-cluster variability itself.
• ‘Generalized estimating equations’ (GEEs):
o population-average (marginal) effect, for an individual moving from one exposure level to another, regardless of cluster.
CLUSTERING: multilevel models
1. Random intercepts model
𝑌𝑖𝑗 = 𝛽0𝑗 + 𝛽1 ⋅ 𝑋𝑖𝑗 +𝑒𝑖𝑗
𝛽0𝑗 = 𝛾00 + 𝑢0𝑗
2. Random slopes model
𝑌𝑖𝑗 = 𝛽0 + 𝛽1𝑗 ⋅ 𝑋𝑖𝑗 + 𝑒𝑖𝑗
𝛽1𝑗 = 𝛾10 + 𝑢1𝑗
3. Random intercepts and slopes
𝑌𝑖𝑗 = 𝛽0𝑗 + 𝛽1𝑗 ⋅ 𝑋𝑖𝑗 + 𝑒𝑖𝑗
𝛽0𝑗 = 𝛾00 + 𝑢0𝑗
𝛽1𝑗 = 𝛾10 + 𝑢1𝑗
Patient: i
Cluster: j
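The random intercepts model can be illustrated by simulating data from it and recovering the intra-cluster correlation. This is a sketch under stated assumptions: all parameter values are hypothetical, the slope is treated as known when removing the covariate effect, and the ICC is estimated with a simple one-way ANOVA formula rather than by fitting a mixed model.

```python
import random
import statistics

def simulate_random_intercepts(n_clusters=200, cluster_size=10, gamma00=5.0,
                               beta1=2.0, sd_u=1.0, sd_e=1.0, seed=42):
    """Simulate Y_ij = beta_0j + beta_1 * X_ij + e_ij with beta_0j = gamma_00 + u_0j."""
    rng = random.Random(seed)
    rows = []
    for j in range(n_clusters):
        beta0j = gamma00 + rng.gauss(0, sd_u)  # cluster-specific intercept
        for _ in range(cluster_size):
            x = rng.random()
            rows.append((j, x, beta0j + beta1 * x + rng.gauss(0, sd_e)))
    return rows

def anova_icc(rows, beta1):
    """One-way ANOVA estimate of the ICC after removing the (known) covariate effect."""
    clusters = {}
    for j, x, y in rows:
        clusters.setdefault(j, []).append(y - beta1 * x)
    k = len(clusters)
    m = len(next(iter(clusters.values())))  # equal cluster sizes assumed
    grand = statistics.fmean([v for vals in clusters.values() for v in vals])
    means = {j: statistics.fmean(vals) for j, vals in clusters.items()}
    msb = m * sum((mu - grand) ** 2 for mu in means.values()) / (k - 1)
    msw = sum((v - means[j]) ** 2
              for j, vals in clusters.items() for v in vals) / (k * (m - 1))
    return (msb - msw) / (msb + (m - 1) * msw)

rows = simulate_random_intercepts()
icc = anova_icc(rows, beta1=2.0)
print(f"Estimated ICC = {icc:.2f} (true value 0.50 for sd_u = sd_e = 1)")
```

In practice the slope is unknown and the model would be fitted with mixed-model software; the simulation only shows that the shared intercept u_0j is what makes observations from the same cluster correlated.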
CALIBRATION: Clinical Prediction Models
Predictive models: obtain a system (set of variables + model) that estimates the risk of the outcome.
Aim is the use in NEW patients: it should work ‘tomorrow’, not now (validation).
https://riskcalculator.facs.org/RiskCalculator/PatientInfo.jsp
CALIBRATION: Assessing clinical prediction models
• Discrimination
– Ability of the model to rank subjects according to the risk of the outcome event.
– Trade-off between sensitivity and specificity.
– Assessed graphically with a receiver operating characteristic (ROC) curve and numerically by the area under the curve (AUC = c-index).
• Calibration
– Agreement between risk predictions from the model and observed risks of outcome.
– Assessed graphically with calibration plots.
– Assessed numerically with the calibration slope (ideal slope = 1) and the calibration intercept, or calibration-in-the-large (ideal CITL = 0).
[Figure: calibration plot annotated with slope = 1.05, CITL = 0.00.]
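Both ideas can be illustrated on a handful of predictions. The sketch below uses hypothetical predicted risks and outcomes. It computes the c-index as the proportion of concordant event/non-event pairs, and a crude mean-calibration check (observed event proportion minus mean predicted risk). The latter is a simplification of the CITL named above, which is normally estimated on the logit scale; the calibration slope is not computed here.

```python
def c_index(predicted, outcomes):
    """Proportion of (event, non-event) pairs in which the event has the
    higher predicted risk; ties count as half."""
    events = [p for p, y in zip(predicted, outcomes) if y == 1]
    non_events = [p for p, y in zip(predicted, outcomes) if y == 0]
    score = sum(1.0 if e > n else 0.5 if e == n else 0.0
                for e in events for n in non_events)
    return score / (len(events) * len(non_events))

def mean_calibration(predicted, outcomes):
    """Observed event proportion minus mean predicted risk (0 is ideal)."""
    return sum(outcomes) / len(outcomes) - sum(predicted) / len(predicted)

# Hypothetical predicted risks and observed outcomes (1 = event)
predicted = [0.05, 0.10, 0.20, 0.30, 0.40, 0.60, 0.70, 0.90]
outcomes = [0, 0, 0, 1, 0, 1, 1, 1]

auc = c_index(predicted, outcomes)
gap = mean_calibration(predicted, outcomes)
print(f"c-index = {auc:.4f}, observed minus predicted = {gap:.4f}")
```

Here discrimination is high (15 of 16 pairs are ranked correctly), yet the model predicts about 41% events on average against 50% observed, so good ranking does not guarantee accurate risk levels.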
CALIBRATION: Overfitting – PITFALL
Overfitting = [image: ‘Public enemy number one’ badge]
Source: https://retrobadge.co.uk/retrobadge/slogans-sayings-badges/public-enemy-number-one-small-retro-badge/
Overfitting = What you see is not what you get!
“Idiosyncrasies in the data are fitted rather than generalizable patterns. A model may hence not be applicable to new patients, even when the setting of application is very similar to the development setting”
Steyerberg, 2009, Springer, ISBN 978-0-387-77244-8.
CALIBRATION: Overfitting – PITFALL
• Typical calibration plot with overfitting:
Source: Maarten van Smeden
• Discrimination (e.g. AUC) may not be affected, but:
– Low risks are underestimated
– High risks are overestimated
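Why discrimination can survive overfitting while calibration does not can be sketched with a simple recalibration on the logit scale. The predicted risks and the slope of 0.7 below are hypothetical; the idea, assumed here for illustration, is that an overfitted model shows a calibration slope below 1 in new patients, and applying that slope pulls extreme predictions back toward the middle.

```python
import math

def logit(p):
    """Log-odds of probability p."""
    return math.log(p / (1 - p))

def expit(z):
    """Inverse logit."""
    return 1 / (1 + math.exp(-z))

def recalibrate(risks, slope, intercept=0.0):
    """Apply a calibration intercept and slope on the logit scale."""
    return [expit(intercept + slope * logit(p)) for p in risks]

overfitted = [0.02, 0.10, 0.50, 0.90, 0.98]  # hypothetical, too extreme
shrunk = recalibrate(overfitted, slope=0.7)  # hypothetical calibration slope < 1

for before, after in zip(overfitted, shrunk):
    print(f"{before:.2f} -> {after:.3f}")
```

Because the transformation is monotone, the ranking of patients (and hence the AUC) is unchanged; only the risk levels move, with low predictions raised and high predictions lowered.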
CALIBRATION: Prognostic prediction after EL in the HELAS cohort
J Trauma Acute Care Surg 2023;94(6):847-856.
Good discrimination (high AUC or C-statistic value) does not necessarily coincide with good calibration.
RECOMMENDED READINGS: Short lists by others
• van Smeden M. A Very Short List of Common Pitfalls in Research Design, Data Analysis, and Reporting. PRiMER. 2022;6:26. PMID: 36119906.
• Riley RD, Cole TJ, Deeks J, et al. On the 12th Day of Christmas, a Statistician Sent to Me . . . BMJ. 2022;379:e072883. PMID: 36593578.
• Makin TR, Orban de Xivry JJ. Ten common statistical mistakes to watch out for when writing or reviewing a manuscript. Elife. 2019;8:e48175. PMID: 31596231.
• Strasak AM, Zaman Q, Pfeiffer KP, Göbel G, Ulmer H. Statistical errors in medical research - a review of common pitfalls. Swiss Med Wkly. 2007;137(3-4):44-49.
• Borg DN, Lohse KR, Sainani KL. Ten Common Statistical Errors from All Phases of Research, and Their Fixes. PM R. 2020;12(6):610-614. doi:10.1002/pmrj.12395
And an all-time classic:
• Altman DG. The scandal of poor medical research. BMJ. 1994;308(6924):283-284.

More Related Content

What's hot

4 Ppt Dengue By Maria Niaz
4 Ppt Dengue By Maria Niaz4 Ppt Dengue By Maria Niaz
4 Ppt Dengue By Maria Niaz
Zahoor Ahmed
 

What's hot (11)

Cells and organs of the immune system.pptx
Cells and organs of the immune system.pptxCells and organs of the immune system.pptx
Cells and organs of the immune system.pptx
 
Check cell, Preparation and Importance.pptx
Check cell, Preparation and Importance.pptxCheck cell, Preparation and Importance.pptx
Check cell, Preparation and Importance.pptx
 
CDC-XM / Complement dependent cytotoxicity -crossmatch
CDC-XM / Complement dependent cytotoxicity -crossmatchCDC-XM / Complement dependent cytotoxicity -crossmatch
CDC-XM / Complement dependent cytotoxicity -crossmatch
 
17 ketosteroids
17 ketosteroids17 ketosteroids
17 ketosteroids
 
Identification of pathogenic bacteria in clinical microbiology
Identification of pathogenic bacteria in clinical microbiologyIdentification of pathogenic bacteria in clinical microbiology
Identification of pathogenic bacteria in clinical microbiology
 
10 step marketing plan biorad D10
10 step marketing plan biorad D10 10 step marketing plan biorad D10
10 step marketing plan biorad D10
 
Isolation and identification of salmonella &e.coli
Isolation and identification of salmonella &e.coliIsolation and identification of salmonella &e.coli
Isolation and identification of salmonella &e.coli
 
Basic QC Statistics - Improving Laboratory Performance Through Quality Contro...
Basic QC Statistics - Improving Laboratory Performance Through Quality Contro...Basic QC Statistics - Improving Laboratory Performance Through Quality Contro...
Basic QC Statistics - Improving Laboratory Performance Through Quality Contro...
 
Experiment modelling of Auto-immune diseases
Experiment modelling of Auto-immune diseasesExperiment modelling of Auto-immune diseases
Experiment modelling of Auto-immune diseases
 
4 Ppt Dengue By Maria Niaz
4 Ppt Dengue By Maria Niaz4 Ppt Dengue By Maria Niaz
4 Ppt Dengue By Maria Niaz
 
Ch06
Ch06Ch06
Ch06
 

Similar to Common statistical pitfalls & errors in biomedical research (a top-5 list)

Clinical Trials Versus Health Outcomes Research: SAS/STAT Versus SAS Enterpri...
Clinical Trials Versus Health Outcomes Research: SAS/STAT Versus SAS Enterpri...Clinical Trials Versus Health Outcomes Research: SAS/STAT Versus SAS Enterpri...
Clinical Trials Versus Health Outcomes Research: SAS/STAT Versus SAS Enterpri...
cambridgeWD
 
Clinical Trials Versus Health Outcomes Research: SAS/STAT Versus SAS Enterpri...
Clinical Trials Versus Health Outcomes Research: SAS/STAT Versus SAS Enterpri...Clinical Trials Versus Health Outcomes Research: SAS/STAT Versus SAS Enterpri...
Clinical Trials Versus Health Outcomes Research: SAS/STAT Versus SAS Enterpri...
cambridgeWD
 
Measuring clinical utility: uncertainty in Net Benefit
Measuring clinical utility: uncertainty in Net BenefitMeasuring clinical utility: uncertainty in Net Benefit
Measuring clinical utility: uncertainty in Net Benefit
Laure Wynants
 
Extending A Trial’s Design Case Studies Of Dealing With Study Design Issues
Extending A Trial’s Design Case Studies Of Dealing With Study Design IssuesExtending A Trial’s Design Case Studies Of Dealing With Study Design Issues
Extending A Trial’s Design Case Studies Of Dealing With Study Design Issues
nQuery
 
NY Prostate Cancer Conference - A. Vickers - Session 1: Traditional statistic...
NY Prostate Cancer Conference - A. Vickers - Session 1: Traditional statistic...NY Prostate Cancer Conference - A. Vickers - Session 1: Traditional statistic...
NY Prostate Cancer Conference - A. Vickers - Session 1: Traditional statistic...
European School of Oncology
 
Practical Methods To Overcome Sample Size Challenges
Practical Methods To Overcome Sample Size ChallengesPractical Methods To Overcome Sample Size Challenges
Practical Methods To Overcome Sample Size Challenges
nQuery
 

Similar to Common statistical pitfalls & errors in biomedical research (a top-5 list) (20)

Clinical Trials Versus Health Outcomes Research: SAS/STAT Versus SAS Enterpri...
Clinical Trials Versus Health Outcomes Research: SAS/STAT Versus SAS Enterpri...Clinical Trials Versus Health Outcomes Research: SAS/STAT Versus SAS Enterpri...
Clinical Trials Versus Health Outcomes Research: SAS/STAT Versus SAS Enterpri...
 
Clinical Trials Versus Health Outcomes Research: SAS/STAT Versus SAS Enterpri...
Clinical Trials Versus Health Outcomes Research: SAS/STAT Versus SAS Enterpri...Clinical Trials Versus Health Outcomes Research: SAS/STAT Versus SAS Enterpri...
Clinical Trials Versus Health Outcomes Research: SAS/STAT Versus SAS Enterpri...
 
Measuring clinical utility: uncertainty in Net Benefit
Measuring clinical utility: uncertainty in Net BenefitMeasuring clinical utility: uncertainty in Net Benefit
Measuring clinical utility: uncertainty in Net Benefit
 
unmatched case control studies
unmatched case control studiesunmatched case control studies
unmatched case control studies
 
Calibration of risk prediction models: decision making with the lights on or ...
Calibration of risk prediction models: decision making with the lights on or ...Calibration of risk prediction models: decision making with the lights on or ...
Calibration of risk prediction models: decision making with the lights on or ...
 
Quantitative Synthesis I
Quantitative Synthesis IQuantitative Synthesis I
Quantitative Synthesis I
 
Extending A Trial’s Design Case Studies Of Dealing With Study Design Issues
Extending A Trial’s Design Case Studies Of Dealing With Study Design IssuesExtending A Trial’s Design Case Studies Of Dealing With Study Design Issues
Extending A Trial’s Design Case Studies Of Dealing With Study Design Issues
 
ISCB 2023 Sources of uncertainty b.pptx
ISCB 2023 Sources of uncertainty b.pptxISCB 2023 Sources of uncertainty b.pptx
ISCB 2023 Sources of uncertainty b.pptx
 
Developing and validating statistical models for clinical prediction and prog...
Developing and validating statistical models for clinical prediction and prog...Developing and validating statistical models for clinical prediction and prog...
Developing and validating statistical models for clinical prediction and prog...
 
2014-10-22 EUGM | WEI | Moving Beyond the Comfort Zone in Practicing Translat...
2014-10-22 EUGM | WEI | Moving Beyond the Comfort Zone in Practicing Translat...2014-10-22 EUGM | WEI | Moving Beyond the Comfort Zone in Practicing Translat...
2014-10-22 EUGM | WEI | Moving Beyond the Comfort Zone in Practicing Translat...
 
Stats.pptx
Stats.pptxStats.pptx
Stats.pptx
 
Avoid overfitting in precision medicine: How to use cross-validation to relia...
Avoid overfitting in precision medicine: How to use cross-validation to relia...Avoid overfitting in precision medicine: How to use cross-validation to relia...
Avoid overfitting in precision medicine: How to use cross-validation to relia...
 
NY Prostate Cancer Conference - A. Vickers - Session 1: Traditional statistic...
NY Prostate Cancer Conference - A. Vickers - Session 1: Traditional statistic...NY Prostate Cancer Conference - A. Vickers - Session 1: Traditional statistic...
NY Prostate Cancer Conference - A. Vickers - Session 1: Traditional statistic...
 
Sampling distributions
Sampling distributionsSampling distributions
Sampling distributions
 
Lemeshow samplesize
Lemeshow samplesizeLemeshow samplesize
Lemeshow samplesize
 
Projecting ‘time to event’ outcomes in technology assessment: an alternative ...
Projecting ‘time to event’ outcomes in technology assessment: an alternative ...Projecting ‘time to event’ outcomes in technology assessment: an alternative ...
Projecting ‘time to event’ outcomes in technology assessment: an alternative ...
 
Practical Methods To Overcome Sample Size Challenges
Practical Methods To Overcome Sample Size ChallengesPractical Methods To Overcome Sample Size Challenges
Practical Methods To Overcome Sample Size Challenges
 
Critical Appraisal - Quantitative SS.pptx
Critical Appraisal - Quantitative SS.pptxCritical Appraisal - Quantitative SS.pptx
Critical Appraisal - Quantitative SS.pptx
 
Medical Statistics used in Oncology
Medical Statistics used in OncologyMedical Statistics used in Oncology
Medical Statistics used in Oncology
 
Ideal induction therapy for newly diagnosed AML. Do we have a consensus?
Ideal induction therapy for newly diagnosed AML. Do we have a consensus?Ideal induction therapy for newly diagnosed AML. Do we have a consensus?
Ideal induction therapy for newly diagnosed AML. Do we have a consensus?
 

Recently uploaded

Dehradun Call Girls Service {8854095900} ❤️VVIP ROCKY Call Girl in Dehradun U...
Dehradun Call Girls Service {8854095900} ❤️VVIP ROCKY Call Girl in Dehradun U...Dehradun Call Girls Service {8854095900} ❤️VVIP ROCKY Call Girl in Dehradun U...
Dehradun Call Girls Service {8854095900} ❤️VVIP ROCKY Call Girl in Dehradun U...
Janvi Singh
 
👉 Guntur Call Girls Service Just Call 🍑👄7427069034 🍑👄 Top Class Call Girl Ser...
👉 Guntur Call Girls Service Just Call 🍑👄7427069034 🍑👄 Top Class Call Girl Ser...👉 Guntur Call Girls Service Just Call 🍑👄7427069034 🍑👄 Top Class Call Girl Ser...
👉 Guntur Call Girls Service Just Call 🍑👄7427069034 🍑👄 Top Class Call Girl Ser...
chaddageeta79
 
Female Call Girls Jodhpur Just Call Dipal 🥰8250077686🥰 Top Class Call Girl Se...
Female Call Girls Jodhpur Just Call Dipal 🥰8250077686🥰 Top Class Call Girl Se...Female Call Girls Jodhpur Just Call Dipal 🥰8250077686🥰 Top Class Call Girl Se...
Female Call Girls Jodhpur Just Call Dipal 🥰8250077686🥰 Top Class Call Girl Se...
Dipal Arora
 
Female Call Girls Sri Ganganagar Just Call Dipal 🥰8250077686🥰 Top Class Call ...
Female Call Girls Sri Ganganagar Just Call Dipal 🥰8250077686🥰 Top Class Call ...Female Call Girls Sri Ganganagar Just Call Dipal 🥰8250077686🥰 Top Class Call ...
Female Call Girls Sri Ganganagar Just Call Dipal 🥰8250077686🥰 Top Class Call ...
Dipal Arora
 
Physiologic Anatomy of Heart_AntiCopy.pdf
Physiologic Anatomy of Heart_AntiCopy.pdfPhysiologic Anatomy of Heart_AntiCopy.pdf
Physiologic Anatomy of Heart_AntiCopy.pdf
MedicoseAcademics
 

Recently uploaded (20)

Dehradun Call Girls Service {8854095900} ❤️VVIP ROCKY Call Girl in Dehradun U...
Dehradun Call Girls Service {8854095900} ❤️VVIP ROCKY Call Girl in Dehradun U...Dehradun Call Girls Service {8854095900} ❤️VVIP ROCKY Call Girl in Dehradun U...
Dehradun Call Girls Service {8854095900} ❤️VVIP ROCKY Call Girl in Dehradun U...
 
Face and Muscles of facial expression.pptx
Face and Muscles of facial expression.pptxFace and Muscles of facial expression.pptx
Face and Muscles of facial expression.pptx
 
👉 Guntur Call Girls Service Just Call 🍑👄7427069034 🍑👄 Top Class Call Girl Ser...
👉 Guntur Call Girls Service Just Call 🍑👄7427069034 🍑👄 Top Class Call Girl Ser...👉 Guntur Call Girls Service Just Call 🍑👄7427069034 🍑👄 Top Class Call Girl Ser...
👉 Guntur Call Girls Service Just Call 🍑👄7427069034 🍑👄 Top Class Call Girl Ser...
 
See it and Catch it! Recognizing the Thought Traps that Negatively Impact How...
See it and Catch it! Recognizing the Thought Traps that Negatively Impact How...See it and Catch it! Recognizing the Thought Traps that Negatively Impact How...
See it and Catch it! Recognizing the Thought Traps that Negatively Impact How...
 
HISTORY, CONCEPT AND ITS IMPORTANCE IN DRUG DEVELOPMENT.pptx
HISTORY, CONCEPT AND ITS IMPORTANCE IN DRUG DEVELOPMENT.pptxHISTORY, CONCEPT AND ITS IMPORTANCE IN DRUG DEVELOPMENT.pptx
HISTORY, CONCEPT AND ITS IMPORTANCE IN DRUG DEVELOPMENT.pptx
 
Test bank for critical care nursing a holistic approach 11th edition morton f...
Test bank for critical care nursing a holistic approach 11th edition morton f...Test bank for critical care nursing a holistic approach 11th edition morton f...
Test bank for critical care nursing a holistic approach 11th edition morton f...
 
Call Girls in Lucknow Just Call 👉👉8875999948 Top Class Call Girl Service Avai...
Call Girls in Lucknow Just Call 👉👉8875999948 Top Class Call Girl Service Avai...Call Girls in Lucknow Just Call 👉👉8875999948 Top Class Call Girl Service Avai...
Call Girls in Lucknow Just Call 👉👉8875999948 Top Class Call Girl Service Avai...
 
Intro to disinformation and public health
Intro to disinformation and public healthIntro to disinformation and public health
Intro to disinformation and public health
 
Female Call Girls Jodhpur Just Call Dipal 🥰8250077686🥰 Top Class Call Girl Se...
Female Call Girls Jodhpur Just Call Dipal 🥰8250077686🥰 Top Class Call Girl Se...Female Call Girls Jodhpur Just Call Dipal 🥰8250077686🥰 Top Class Call Girl Se...
Female Call Girls Jodhpur Just Call Dipal 🥰8250077686🥰 Top Class Call Girl Se...
 
Female Call Girls Sri Ganganagar Just Call Dipal 🥰8250077686🥰 Top Class Call ...
Female Call Girls Sri Ganganagar Just Call Dipal 🥰8250077686🥰 Top Class Call ...Female Call Girls Sri Ganganagar Just Call Dipal 🥰8250077686🥰 Top Class Call ...
Female Call Girls Sri Ganganagar Just Call Dipal 🥰8250077686🥰 Top Class Call ...
 
TEST BANK For Porth's Essentials of Pathophysiology, 5th Edition by Tommie L ...
TEST BANK For Porth's Essentials of Pathophysiology, 5th Edition by Tommie L ...TEST BANK For Porth's Essentials of Pathophysiology, 5th Edition by Tommie L ...
TEST BANK For Porth's Essentials of Pathophysiology, 5th Edition by Tommie L ...
 
ABO Blood grouping in-compatibility in pregnancy
ABO Blood grouping in-compatibility in pregnancyABO Blood grouping in-compatibility in pregnancy
ABO Blood grouping in-compatibility in pregnancy
 
Top 10 Most Beautiful Russian Pornstars List 2024
Top 10 Most Beautiful Russian Pornstars List 2024Top 10 Most Beautiful Russian Pornstars List 2024
Top 10 Most Beautiful Russian Pornstars List 2024
 
Physiologic Anatomy of Heart_AntiCopy.pdf
Physiologic Anatomy of Heart_AntiCopy.pdfPhysiologic Anatomy of Heart_AntiCopy.pdf
Physiologic Anatomy of Heart_AntiCopy.pdf
 
VIP ℂall Girls Arekere Bangalore 6378878445 WhatsApp: Me All Time Serviℂe Ava...
VIP ℂall Girls Arekere Bangalore 6378878445 WhatsApp: Me All Time Serviℂe Ava...VIP ℂall Girls Arekere Bangalore 6378878445 WhatsApp: Me All Time Serviℂe Ava...
VIP ℂall Girls Arekere Bangalore 6378878445 WhatsApp: Me All Time Serviℂe Ava...
 
Drug development life cycle indepth overview.pptx
Drug development life cycle indepth overview.pptxDrug development life cycle indepth overview.pptx
Drug development life cycle indepth overview.pptx
 
Shazia Iqbal 2024 - Bioorganic Chemistry.pdf
Shazia Iqbal 2024 - Bioorganic Chemistry.pdfShazia Iqbal 2024 - Bioorganic Chemistry.pdf
Shazia Iqbal 2024 - Bioorganic Chemistry.pdf
 
ANATOMY AND PHYSIOLOGY OF REPRODUCTIVE SYSTEM.pptx
ANATOMY AND PHYSIOLOGY OF REPRODUCTIVE SYSTEM.pptxANATOMY AND PHYSIOLOGY OF REPRODUCTIVE SYSTEM.pptx
ANATOMY AND PHYSIOLOGY OF REPRODUCTIVE SYSTEM.pptx
 
Physicochemical properties (descriptors) in QSAR.pdf
Physicochemical properties (descriptors) in QSAR.pdfPhysicochemical properties (descriptors) in QSAR.pdf
Physicochemical properties (descriptors) in QSAR.pdf
 
ANATOMY AND PHYSIOLOGY OF RESPIRATORY SYSTEM.pptx
ANATOMY AND PHYSIOLOGY OF RESPIRATORY SYSTEM.pptxANATOMY AND PHYSIOLOGY OF RESPIRATORY SYSTEM.pptx
ANATOMY AND PHYSIOLOGY OF RESPIRATORY SYSTEM.pptx
 

Common statistical pitfalls & errors in biomedical research (a top-5 list)

  • 1. Common statistical pitfalls & errors in biomedical research (a top-5 list) Evangelos I. Kritsotakis Assoc. Prof. of Biostatistics, Med. School, University of Crete Honorary Senior Lecturer, ScHARR, University of Sheffield e.kritsotakis@uoc.gr 10.06.2023
  • 2. Outline and disclaimer Top-5 list of common statistical pitfalls leading to errors, related to:  Normality  Time confounding  Linearity  Clustering  Calibration  This is a personal view based on my experience as a reader, reviewer, and editor of medical journals, o might be incomplete and biased, but hopefully will be useful.  These problems are well known to statisticians and methodologists, but they continue to appear in medical journals.
  • 3.  Makes sense to summarize the data with median and IQR (rather than mean ± SD).  Most researchers would apply a non-parametric test (e.g. Mann-Whitney U-test).  But the t-test will work fine in this situation!  In fact, is more appropriate and informative to use the t-test than non-parametrics. NORMALITY: Who is afraid of non-normal data? Data from the HELAS cohort of emergency laparotomies: serum albumin blood urea nitrogen
  • 4. NORMALITY: Who is afraid of non-normal data? The t-test, and thus linear regression, are NOT afraid of non-normal data! http://onlinestatbook.com/stat_sim/sampling_dist/index.html http://www.youtube.com/watch?v=tHU0_-Jzg34  t-test assumes Normality per group, so that sample means are Normally distributed. but  By the central limit theorem, the sample means will approximate to the Normal distribution when the sample size increases, regardless of the distribution of the original observations
  • 5. NORMALITY: Who is afraid of non-normal data? The t-test, and thus also linear regression, are NOT afraid of non-normal data! Rules of thumb for the t-test:  n < 25 per group, the data must be normally distributed to use the t-test.  n > 25 per group, no extreme outliers, can handle moderately skewed distributions  n > 200 per group, t-test robust to heavily skewed distributions When should you use a non-parametric test? • n < 25 per group (as it is very difficult to confirm normality) Eur J Endocrinol 2020;183(2):L1-L3. Please DO NOT perform statistical tests for normality ! (e.g. Kolmogorov–Smirnov or Shapiro–Wilk tests)
  • 6. NORMALITY: Applying non-parametrics in large samples - PITFALL Parametric vs. non-parametric tests: t-test vs. Wilcoxon-Mann-Whitney test Rejection rates (p < 0.05) of the WMW and t-tests after 10 000 replications Data drawn at random from skewed gamma distributions (Skewness coef. = 3), with equal means and medians, 𝑆𝐷1 = 1.1 × 𝑆𝐷2 BMC Med Res Methodol 2012;12:78.
  • 7. FOLLOW UP TIME: frequently variable and/or incomplete • Patients entering a trial my have different times of follow up. • Not all patients will experience the event of interest by end of data collection. • Times to outcome event (endpoint) are incomplete (right censored). Prognostic study design Patient follow up Otolaryngol Head Neck Surg. 2010 = censoring = event occurrence S = short serial time M = medium L = long.
  • 8. FOLLOW UP TIME: ignoring variable follow ups is an error! R R R R R R Time (hours)  Time (hours)  Drug A Drug B R = relief of pain 1 2 8 3 2 8 5 • Pain relief proportions are ¾ (75%) for both drugs, but drug A is preferable. • Times to event should not be ignored ! • One solution is to use (average) incidence rates: • Compare using standard Poisson or negative Binomial regression models. • This assumes constant rates and no censoring. 𝐼𝑅𝐴 = 3 12 = 0.25 𝐼𝑅𝐵 = 3 18 = 0.17 events per person−hour
  • 9. FOLLOW UP TIME: ignoring censoring is an error! Naïve suggestions: A. Use complete data, exclude patients with incomplete follow up (too pessimistic!). B. Assume censored patients, survived until end of study (too optimistic). Solution: C. Account for censoring with survival analysis methods: Kaplan-Meier, Cox regression, etc 1-year survival: B) 47% C) 41% A) 27%
  • 10. TIME DEPENDENT EFFECTS: e.g. non-proportional hazards Kaplan-Meier survival curves showing the probabilities of remaining infection free. Piecewise Cox model to estimate vaccine efficacy: VE = 59% (95%CI 31% to 75%; P = 0.001) during first 9 weeks VE = -17% (95%CI -76% to 23%; P = 0.460) during last 6 weeks
  • 11. TIME TRENDS: over time, things may change anyway! - PITFALL One measure before and after intervention (group level data) ? ? Accounting for time trends may tell a different story! ?
  • 12. TIME TRENDS: the interrupted time series model Res Synth Methods 2021; 12(1):106-117 Segmented regression: 𝑌𝑡 = 𝛽0 + 𝛽1 ∙ 𝑡 + 𝛽2 ∙ 𝑋𝑡 + 𝛽3 𝑡 − 𝑡0 𝑋𝑡 𝒕𝟎 𝛽1 𝛽1 + 𝛽3 𝛽2
  • 13. TIME TRENDS: ITS Example (1) Carbapenem-focused antimicrobial stewardship intervention, Jan 2020 – Dec 2020, University Hospital of Heraklion Treatments per 100 hospital admissions:  Level change IRR 0.63 (95%CI 0.50–0.80), P < 0.001,  Trend change IRR 1.02 (95%CI 1.00–1.04), P = 0.117 Quarterly data on hospital consumption of carbapenems:  Level change: −4.9 DDD/100 PD (95%CI −7.3 to −2.6); P = 0.007 J Antimicrob Chemother 2023;78(4):1000-1008.
  • 14. TIME TRENDS: ITS Example (2) Impact of SARS-CoV-2 preventive measures against healthcare-associated infections from multidrug-resistant ESKAPEE pathogens (PAGNH + VENIZELEIO): • Pre-COVID-19 period (3/2019 to 2/2020): 1.06 infections per 1,000 patient-days. • COVID-19 period (3/2020 to 2/2021): 1.11 infections per 1,000 patient-days. • Overall IRR = 1.05, P = 0.58. [Figure panels: IRR = 0.46 (level drop); IRR = 0.44 (level drop).] Antibiotics 2023; 12(7):1088
  • 15. LINEARITY: non-linear relationships are common - PITFALL For the odds of a binary outcome Y, the logistic regression model is: $\log_e(\text{odds of } Y) = b_0 + b_1 X_1 + b_2 X_2 + b_3 X_3 + \dots$ (linearity in the logit) or, equivalently: $\text{Probability of } Y = \dfrac{1}{1 + e^{-(b_0 + b_1 X_1 + b_2 X_2 + b_3 X_3 + \dots)}}$. [Figure: probability P plotted against the linear predictor ΣbX.] • Non-linear probability model. • Log-linear odds model. • Measure of effect is the odds ratio (OR). • Assumes that a 1-unit increase in a covariate X has the same effect (OR) on the outcome across the entire range of the covariate's values. This is a very strong assumption and should be checked for continuous variables! • To relax it, use cubic splines or fractional polynomials.
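The "same OR across the entire range" assumption follows directly from the model formula, and a short calculation makes it visible. The coefficients below are hypothetical, and the helper names are illustrative.

```python
import math


def predicted_probability(b0, coefs, xs):
    """Inverse logit of the linear predictor b0 + sum(b_k * x_k)."""
    linear_predictor = b0 + sum(b * x for b, x in zip(coefs, xs))
    return 1 / (1 + math.exp(-linear_predictor))


def odds(p):
    """Convert a probability to odds."""
    return p / (1 - p)


# Hypothetical coefficients for a single continuous covariate.
b0, b1 = -4.0, 0.08

# Odds ratio for a 1-unit increase, evaluated at different starting values.
odds_ratios = [
    odds(predicted_probability(b0, [b1], [x + 1]))
    / odds(predicted_probability(b0, [b1], [x]))
    for x in (20, 50, 80)
]

# Risk differences for the same 1-unit increase are not constant.
risk_differences = [
    predicted_probability(b0, [b1], [x + 1]) - predicted_probability(b0, [b1], [x])
    for x in (20, 50, 80)
]

print(odds_ratios)
print(risk_differences)
```

Every odds ratio equals exp(b1) regardless of the starting value, while the change in probability differs along the curve. If the true relationship is not linear in the logit, a single OR cannot describe it, which is where splines or fractional polynomials come in.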
  • 16. LINEARITY: visualizing the effects before modelling • HELAS cohort of emergency laparotomy (EL) patients in Greece • Outcome: 30-day post-operative death • Covariate: Age • Logistic regression model: $\log_e(\text{odds of death}) = b_0 + b_1 \times \text{AGE}$ • OR = 1.75 (95% CI 1.47–2.09) per 10-year increase in age (P < 0.001), i.e. the odds of death after EL increase by 75% for each 10 additional years of age across the entire range of ages (linearity). World J Surg. 2023 Jan;47(1):130-139.
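A quick way to appreciate how strong the linearity assumption is: rescale the reported OR to other age differences. The sketch uses only the OR of 1.75 per 10 years quoted above. The function name and the chosen age gaps are illustrative.

```python
import math

or_per_10_years = 1.75  # reported on the slide

# Under linearity in the logit, log-odds scale linearly with age.
b1_per_year = math.log(or_per_10_years) / 10


def odds_ratio_for(delta_years: float) -> float:
    """Odds ratio implied by the linear model for an age difference."""
    return math.exp(b1_per_year * delta_years)


or_per_year = odds_ratio_for(1)
or_per_20_years = odds_ratio_for(20)  # equals 1.75 ** 2
or_80_vs_40 = odds_ratio_for(40)      # equals 1.75 ** 4

print(f"per year: {or_per_year:.3f}")
print(f"per 20 years: {or_per_20_years:.2f}")
print(f"80 vs 40 years: {or_80_vs_40:.2f}")
```

The linear model forces the same multiplicative step everywhere, so a 40-year age gap implies more than a nine-fold difference in odds whether it is 20 versus 60 or 50 versus 90. Plotting the observed relationship first shows whether that is plausible.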
  • 17. LINEARITY: visualizing the effects before modelling • HELAS cohort of emergency laparotomy patients in Greece • Outcome: 30-day post-operative death • Covariate: BMI World J Surg. 2023 Jan;47(1):130-139.
  • 18. CLUSTERING: within-groups correlation - PITFALL • Clustering occurs when data within a cluster tend to be 'more alike' ('intra-cluster correlation'). • By design: longitudinal studies with repeated measurements (clusters = patients); data compiled across multiple experiments (clusters = trials); meta-analysis of different studies (clusters = studies); multicenter studies; cluster-randomized controlled trials; cluster sampling in cross-sectional surveys. • By nature: subjects clustered within centers (surgeons, clinics, hospitals); clustering by surgeon or therapist delivering the intervention.
  • 19. CLUSTERING: ignoring within-groups correlation • Many statistical tests and models require independent data. Applying them to clustered data produces a false sense of precision and a higher chance of Type I error, so incorrect conclusions may be drawn. • Data within a cluster do not contribute completely independent information, so the “effective” sample size is less than the total number of observations. [Figure: the color of each data point represents the cluster to which it belongs.] J Neurosci 2010;30(32):10601-8
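The loss of information can be quantified with the standard design-effect approximation, 1 + (m − 1) × ICC, for clusters of equal size m. This formula is not on the slide and is brought in here as an assumption. The study dimensions and the ICC below are hypothetical.

```python
def design_effect(cluster_size: float, icc: float) -> float:
    """Design effect 1 + (m - 1) * ICC for clusters of equal size m."""
    if not 0 <= icc <= 1:
        raise ValueError("icc must be between 0 and 1")
    return 1 + (cluster_size - 1) * icc


def effective_sample_size(n_total: int, cluster_size: float, icc: float) -> float:
    """Number of independent observations the clustered sample is worth."""
    return n_total / design_effect(cluster_size, icc)


# Hypothetical study: 10 clusters of 20 observations, ICC = 0.10.
deff = design_effect(20, 0.10)
n_eff = effective_sample_size(200, 20, 0.10)

print(f"design effect = {deff:.2f}, effective n = {n_eff:.1f} of 200")
```

Even a modest intra-cluster correlation shrinks 200 observations to roughly 69 independent ones in this example, which is why analyses that ignore clustering report standard errors that are too small.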
  • 20. CLUSTERING: Consequences of ignoring clustering J Neurosci 2010;30(32):10601-8
  • 21. CLUSTERING: methods to account for intra-cluster correlation • 'Fixed effect' method: add one binary predictor variable for each cluster in a regression / ANOVA model (using one cluster as the reference). o Simplest method, but requires a small number of clusters. o Results are strictly applicable only to the particular set of clusters. o Cannot be used in designs such as cluster RCTs. • 'Random effects' model (aka mixed or multilevel): o 'conditional' (cluster-specific) estimate of effect, for an individual changing exposure level within a specified cluster; o estimate of the between-cluster variability itself. • 'Generalized estimating equations' (GEEs): o 'marginal' (population-average) effect, for an individual moving from one exposure level to another, regardless of cluster.
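  • 22. CLUSTERING: multilevel models (patient: i; cluster: j) 1. Random intercepts model: $Y_{ij} = \beta_{0j} + \beta_1 \cdot X_{ij} + e_{ij}$, with $\beta_{0j} = \gamma_{00} + u_{0j}$. 2. Random slopes model: $Y_{ij} = \beta_0 + \beta_{1j} \cdot X_{ij} + e_{ij}$, with $\beta_{1j} = \gamma_{10} + u_{1j}$. 3. Random intercepts and slopes: $Y_{ij} = \beta_{0j} + \beta_{1j} \cdot X_{ij} + e_{ij}$, with $\beta_{0j} = \gamma_{00} + u_{0j}$ and $\beta_{1j} = \gamma_{10} + u_{1j}$.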
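The random intercepts model can be made concrete by simulating from it and recovering the intra-cluster correlation. This is a sketch under stated assumptions: all parameter values are hypothetical, the ICC is estimated with a simple one-way ANOVA estimator for balanced clusters rather than a fitted mixed model, and the known exposure effect is subtracted by hand to keep the code short.

```python
import random


def simulate_random_intercepts(n_clusters, cluster_size, gamma00, beta1,
                               sd_u, sd_e, seed=0):
    """Simulate Y_ij = (gamma00 + u_0j) + beta1 * X_ij + e_ij."""
    rng = random.Random(seed)
    rows = []  # (cluster j, x_ij, y_ij)
    for j in range(n_clusters):
        u0j = rng.gauss(0, sd_u)        # cluster-level deviation
        for _ in range(cluster_size):
            x = rng.choice([0, 1])      # binary exposure
            e = rng.gauss(0, sd_e)      # patient-level error
            rows.append((j, x, gamma00 + u0j + beta1 * x + e))
    return rows


def anova_icc(groups):
    """One-way ANOVA estimate of the ICC for equally sized clusters."""
    k = len(groups)
    m = len(groups[0])
    grand = sum(sum(g) for g in groups) / (k * m)
    means = [sum(g) / m for g in groups]
    msb = m * sum((mu - grand) ** 2 for mu in means) / (k - 1)
    msw = sum((y - mu) ** 2
              for g, mu in zip(groups, means) for y in g) / (k * (m - 1))
    return (msb - msw) / (msb + (m - 1) * msw)


beta1 = 2.0
rows = simulate_random_intercepts(200, 10, gamma00=5.0, beta1=beta1,
                                  sd_u=1.0, sd_e=1.0, seed=42)

# For simplicity, remove the known exposure effect before estimating the ICC.
groups = {}
for j, x, y in rows:
    groups.setdefault(j, []).append(y - beta1 * x)

icc_hat = anova_icc(list(groups.values()))
true_icc = 1.0 ** 2 / (1.0 ** 2 + 1.0 ** 2)  # var(u) / (var(u) + var(e))

print(f"estimated ICC = {icc_hat:.2f}, true ICC = {true_icc:.2f}")
```

In practice the exposure effect and the variance components would be estimated jointly with mixed-model software. The simulation only illustrates how the cluster-level term $u_{0j}$ induces correlation among patients in the same cluster.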
  • 23. CALIBRATION: Clinical Prediction Models Aim: obtain a system (set of variables + model) that estimates the risk of the outcome. Predictive models are intended for use in NEW patients: the model should work ‘tomorrow’, not just now (validation). https://riskcalculator.facs.org/RiskCalculator/PatientInfo.jsp
  • 24. CALIBRATION: Assessing clinical prediction models • Discrimination – Ability of the model to rank subjects according to the risk of the outcome event. – Trade-off between sensitivity and specificity. – Assessed graphically with a receiver operating characteristic (ROC) curve and numerically by the area under the curve (AUC = c-index). • Calibration – Agreement between risk predictions from the model and observed risks of the outcome. – Assessed graphically with calibration plots. – Assessed numerically with the calibration slope (ideal slope = 1) and the calibration intercept, or calibration-in-the-large (ideal CITL = 0). [Figure: calibration plot with slope = 1.05 and CITL = 0.00.]
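Discrimination as "ability to rank" has a direct computational reading: the c-index is the share of event/non-event pairs in which the patient with the event received the higher predicted risk. The sketch below uses hypothetical predictions. The function name `c_index` is illustrative.

```python
def c_index(predicted, observed):
    """Share of (event, non-event) pairs ranked correctly; ties count 1/2."""
    events = [p for p, y in zip(predicted, observed) if y == 1]
    non_events = [p for p, y in zip(predicted, observed) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    score = 0.0
    for pe in events:
        for pn in non_events:
            if pe > pn:
                score += 1.0
            elif pe == pn:
                score += 0.5
    return score / (len(events) * len(non_events))


# Hypothetical predicted risks and observed outcomes (1 = event).
predicted = [0.10, 0.40, 0.35, 0.80]
observed = [0, 0, 1, 1]
auc = c_index(predicted, observed)  # 3 of 4 pairs are concordant -> 0.75

# Halving every risk keeps the ranking, and therefore the AUC, unchanged.
shrunk = [0.5 * p for p in predicted]
auc_shrunk = c_index(shrunk, observed)

print(auc, auc_shrunk)
```

Because the AUC depends only on ranks, halving every predicted risk leaves it untouched even though the predictions are now clearly miscalibrated. This is why discrimination and calibration have to be assessed separately.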
  • 25. CALIBRATION: Overfitting – PITFALL Overfitting = [image: 'Public enemy number one' badge; source: https://retrobadge.co.uk/retrobadge/slogans-sayings-badges/public-enemy-number-one-small-retro-badge/] Overfitting = What you see is not what you get! “Idiosyncrasies in the data are fitted rather than generalizable patterns. A model may hence not be applicable to new patients, even when the setting of application is very similar to the development setting” (Steyerberg, 2009, Springer, ISBN 978-0-387-77244-8).
  • 27. CALIBRATION: Overfitting – PITFALL • Typical calibration plot with overfitting (source: Maarten van Smeden). • Discrimination (e.g. AUC) may not be affected, but: low risks are underestimated; high risks are overestimated.
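The overfitting pattern described here shows up numerically as a calibration slope below 1. The sketch below estimates the slope by regressing the observed outcome on the logit of the predicted risk, using a hand-written Newton-Raphson fit so that it runs without external libraries. The data are hypothetical, and the function name `recalibration_fit` is an assumption. The intercept it returns comes from the same two-parameter fit and is not the CITL, which is estimated with the slope fixed at 1.

```python
import math


def logit(p):
    """Log-odds of a probability."""
    return math.log(p / (1 - p))


def recalibration_fit(predicted, observed, n_iter=50, tol=1e-10):
    """Fit logit(P(Y=1)) = a + b * logit(predicted) by Newton-Raphson.

    Returns (a, b); b is the calibration slope.
    """
    lp = [logit(p) for p in predicted]
    a, b = 0.0, 1.0
    for _ in range(n_iter):
        fitted = [1 / (1 + math.exp(-(a + b * x))) for x in lp]
        # Score vector (gradient of the log-likelihood).
        g_a = sum(y - f for y, f in zip(observed, fitted))
        g_b = sum((y - f) * x for y, f, x in zip(observed, fitted, lp))
        # Information matrix (negative Hessian), 2 x 2.
        w = [f * (1 - f) for f in fitted]
        h_aa = sum(w)
        h_ab = sum(wi * x for wi, x in zip(w, lp))
        h_bb = sum(wi * x * x for wi, x in zip(w, lp))
        det = h_aa * h_bb - h_ab ** 2
        step_a = (h_bb * g_a - h_ab * g_b) / det
        step_b = (h_aa * g_b - h_ab * g_a) / det
        a, b = a + step_a, b + step_b
        if abs(step_a) < tol and abs(step_b) < tol:
            break
    return a, b


# Hypothetical overfitted model: predicts 10% where 1 in 4 have the event,
# and 90% where 3 in 4 have the event.
predicted = [0.1] * 4 + [0.9] * 4
observed = [0, 0, 0, 1, 0, 1, 1, 1]
intercept, slope = recalibration_fit(predicted, observed)

print(f"calibration slope = {slope:.2f}")
```

In this toy example the predictions are too extreme in both directions (10% predicted against 25% observed, 90% against 75%), and the estimated slope comes out at about 0.5, well below the ideal value of 1.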
  • 28. CALIBRATION: Prognostic prediction after EL in the HELAS cohort J Trauma Acute Care Surg 2023;94(6):847-856. Good discrimination (high AUC or C-statistic value) does not necessarily coincide with good calibration.
  • 29. RECOMMENDED READINGS: Short lists by others • van Smeden M. A Very Short List of Common Pitfalls in Research Design, Data Analysis, and Reporting. PRiMER. 2022;6:26. PMID: 36119906. • Riley RD, Cole TJ, Deeks J, et al. On the 12th Day of Christmas, a Statistician Sent to Me . . . BMJ. 2022;379:e072883. PMID: 36593578. • Makin TR, Orban de Xivry JJ. Ten common statistical mistakes to watch out for when writing or reviewing a manuscript. Elife. 2019;8:e48175. PMID: 31596231. • Strasak AM, Zaman Q, Pfeiffer KP, Göbel G, Ulmer H. Statistical errors in medical research - a review of common pitfalls. Swiss Med Wkly. 2007;137(3-4):44-49. • Borg DN, Lohse KR, Sainani KL. Ten Common Statistical Errors from All Phases of Research, and Their Fixes. PM R. 2020;12(6):610-614. doi:10.1002/pmrj.12395 And an all-time classic: • Altman DG. The scandal of poor medical research. BMJ. 1994;308(6924):283-284.