This study re-analyzed data from the Cochrane Library to evaluate methods for estimating between-study heterogeneity in meta-analyses. The researchers downloaded 3,845 RevMan files (one per Cochrane review), which yielded 57,397 meta-analyses. They evaluated methods for estimating the between-study variance (tau-squared) using simulated and real Cochrane data. The non-parametric bootstrap version of the DerSimonian-Laird method performed best overall, especially at detecting heterogeneity, while Bayesian estimators did well for very small meta-analyses. However, the standard DerSimonian-Laird method failed to detect high between-study heterogeneity in over 50% of small meta-analyses. The study highlights limitations in commonly used methods for accounting for heterogeneity in meta-analyses.
Internal 2014 - Cochrane data
1. Background
Methods
Results
So what?
A re-analysis of the Cochrane Library data
the dangers of unobserved heterogeneity in meta-analyses
Evan Kontopantelis (1,2), David Springate (1,3), David Reeves (1,3)
1 NIHR School for Primary Care Research
2 Centre for Health Informatics, Institute of Population Health
3 Centre for Biostatistics, Institute of Population Health
Centre for Biostatistics, 10 Feb 2014
Kontopantelis A re-analysis of the Cochrane Library data
Meta-analysis
Synthesising existing evidence to answer clinical questions
Relatively young and dynamic field of research
Activity reflects the importance of MA and potential to
provide conclusive answers
Individual Patient Data meta-analysis is the best option,
but considerable cost and access to patient data required
When original data unavailable, evidence combined in a
two stage process
retrieving the relevant summary effect statistics
using MA model to calculate the overall effect estimate $\hat{\mu}$
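The second stage of the two-stage process can be sketched as follows. This is a minimal illustration of inverse-variance fixed-effect pooling, not the authors' code; the function name `pool_fixed_effect` and the input numbers are hypothetical.

```python
import math

def pool_fixed_effect(effects, variances):
    """Inverse-variance fixed-effect pooled estimate with a 95% CI."""
    weights = [1.0 / v for v in variances]          # w_i = 1 / sigma_i^2
    total_w = sum(weights)
    mu_hat = sum(w * y for w, y in zip(weights, effects)) / total_w
    se = math.sqrt(1.0 / total_w)                   # standard error of mu_hat
    return mu_hat, (mu_hat - 1.96 * se, mu_hat + 1.96 * se)

# Hypothetical summary statistics retrieved in stage one.
effects = [0.10, 0.30, 0.50]
variances = [0.04, 0.04, 0.04]
mu_hat, ci = pool_fixed_effect(effects, variances)
```

With equal variances the pooled estimate is simply the mean of the study effects; unequal variances shift it towards the more precise studies.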
Heterogeneity estimate
or between-study variance estimate $\hat{\tau}^2$
Model selection depends on the heterogeneity estimate
If present usually a random-effects approach is selected
But a fixed-effects model may be chosen for theoretical or
practical reasons
Different approaches for combining study results
Inverse variance
Mantel-Haenszel
Peto
Meta-analysis methods
Inverse variance: fixed- or random-effects & continuous or
dichotomous outcome
DerSimonian-Laird, moment based estimator
Also: ML, REML, PL, Biggerstaff-Tweedie,
Follmann-Proschan, Sidik-Jonkman
Mantel-Haenszel: fixed-effect & dichotomous outcome
odds ratio, risk ratio or risk difference
different weighting scheme
low events numbers or small studies
Peto: fixed-effect & dichotomous outcome
Peto odds ratio
small intervention effects or very rare events
if $\hat{\tau}^2 > 0$, only modelled through inverse variance weighting
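As an illustration of the "different weighting scheme" mentioned above, the standard Mantel-Haenszel pooled odds ratio can be sketched as below. The function name and the 2x2 tables are hypothetical; each table is (events treated, non-events treated, events control, non-events control).

```python
def mantel_haenszel_or(tables):
    """Mantel-Haenszel pooled odds ratio over a list of 2x2 tables (a, b, c, d)."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return numerator / denominator

# Two hypothetical trials with 100 patients per arm.
tables = [(10, 90, 5, 95), (20, 80, 10, 90)]
or_mh = mantel_haenszel_or(tables)
```

Because each study contributes cross-products rather than the reciprocal of an estimated variance, the pooled estimate stays defined in sparse tables where a study-level log odds ratio variance is unstable.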
Random-effects (RE) models
Accurate $\hat{\tau}^2$ important performance driver
Large $\hat{\tau}^2$ leads to wider CIs
Zero $\hat{\tau}^2$ reduces all methods to fixed-effect
Three main approaches to estimating:
DerSimonian-Laird ($\hat{\tau}^2_{DL}$)
Maximum Likelihood ($\hat{\tau}^2_{ML}$)
Restricted Maximum Likelihood ($\hat{\tau}^2_{REML}$)
Many methods use one of these but vary in estimating $\mu$
In practice, $\hat{\tau}^2_{DL}$ computed and heterogeneity quantified and reported using Cochran's $Q$, $I^2$ or $H^2$
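The quantities named above can be sketched with the usual moment-based formulas: Cochran's Q from the fixed-effect weights, the DerSimonian-Laird estimate truncated at zero, H² = Q/(k−1) and I² = (Q−(k−1))/Q. The function name and the inputs are hypothetical, and the sketch assumes at least two studies.

```python
def dl_heterogeneity(effects, variances):
    """DerSimonian-Laird tau^2 with Cochran's Q, H^2 and I^2 (needs k >= 2)."""
    k = len(effects)
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    mu_fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(w * (y - mu_fixed) ** 2 for w, y in zip(weights, effects))
    c = sum_w - sum(w * w for w in weights) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)      # negative estimates are set to zero
    h2 = q / (k - 1)
    i2 = max(0.0, (q - (k - 1)) / q) if q > 0 else 0.0
    return {"Q": q, "tau2": tau2, "H2": h2, "I2": i2}

# Hypothetical three-study meta-analysis with equal within-study variances.
result = dl_heterogeneity([0.0, 0.5, 1.0], [0.04, 0.04, 0.04])
```

The truncation at zero is what produces the zero estimates discussed later in the talk: whenever Q falls below its degrees of freedom, the estimate is reported as exactly zero.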
Random or fixed?
two ‘schools’ of thought
Fixed-effect (FE)
‘what is the average result of trials conducted to date’?
assumption-free
Random-effects (RE)
‘what is the true treatment effect’?
various assumptions
normally distributed trial effects
varying treatment effect across populations although findings
limited since based on observed studies only
more conservative; findings potentially more generalisable
Researchers reassured when $\hat{\tau}^2 = 0$
FE often used when low heterogeneity detected
Simples! (sort of)
[Flowchart: choosing a meta-analysis model, from Start]
Decision points: Outcome(s) continuous? Outcome(s) dichotomous? Outcome(s) time-to-event? Combining dichotomous and continuous outcomes (transform dichotomous outcomes to SMD; 'Feeling adventurous?') Fixed-effect by conviction? Detected heterogeneity? Rare events? Very rare events? Bayesian? Non-zero prior?
Method families: Inverse Variance weighting methods (IV), Mantel-Haenszel methods (MH), Peto methods (P)
End points: Fixed-effect IV model; Random-effects IV model; Fixed-effect MH true model; Random-effects MH-IV hybrid model; Fixed-effect Peto true model; Fixed-effect Peto O-E true model
$\tau^2$ estimation (not Bayesian): DL, DL2, DLb, VC, VC2, ML, REML, PL
$\tau^2$ estimation (Bayesian): BP, MVa, MVb with a non-zero prior; B0 otherwise
Cochrane Database of Systematic Reviews
Richest resource of meta-analyses in the world
Fifty-four active groups responsible for organising, advising
on and publishing systematic reviews
Authors obliged to use RevMan and submit the data and
analyses file along with the review, contributing to the
creation of a vast data resource
RevMan offers quite a few fixed-effect choices but only the
DerSimonian-Laird random-effects method has been
implemented to quantify and account for heterogeneity
hidden data
Software options
RevMan
Easy to use
Streamlined and ‘idiot-proof’
Limited model options
Data manipulation generally not possible
MetaEasy for data collection and some manipulation
Stata offers quite a few packages with advanced options
and model choices: metan, metaan, metabias etc
R similarly very well supported
Data
Analyses
‘Real’ Data
Cochrane Database of Systematic Reviews
Python code to crawl Wiley website for RevMan files
Downloaded 3,845 relevant RevMan files (of 3,984
available in Aug 2012) and imported in Stata
Each file a systematic review
Within each file, various research questions might have
been posed
investigated across various relevant outcomes?
variability in intervention or outcome?
‘Real’ Data
Cochrane Database of Systematic Reviews
[Diagram: structure of one review in the Cochrane database]
Review CD000006
Group: Pregnancy and Childbirth
Review name: Absorbable suture materials for primary repair of episiotomy and second degree tears
  Meta-analysis 1: Synthetic sutures versus catgut
    Outcome 1.1: Short-term pain: pain at day 3 or less (women experiencing any pain)
      Main 1.1.0: k=10
      Subgroup 1.1.1: Standard synthetic; k=9
      Subgroup 1.1.2: Fast absorbing; k=1
    Outcome 1.9: Dyspareunia - at 3 months postpartum
      Main 1.9.0: k=6
      Subgroup 1.9.1: Standard synthetic; k=5
      Subgroup 1.9.2: Fast absorbing; k=1
  Meta-analysis 2: Fast-absorbing synthetic versus standard absorbable synthetic material
    Outcome 2.1: Short-term pain: at 3 days or less
      Main 2.1.0: k=3
    Outcome 2.11: Maternal satisfaction: satisfied with repair at 12 months
      Main 2.11.0: k=1
  Meta-analysis 3: Glycerol impregnated catgut (softgut) versus chromic catgut
    Outcome 3.1: Short-term pain: pain at 3 days or less
      Main 3.1.0: k=1
    Outcome 3.8: Dyspareunia at 6 - 12 months
      Main 3.8.0: k=1
  Meta-analysis 4: Monofilament versus standard polyglycolic sutures
    Outcome 4.1: Short-term pain: mean pain scores at 3 days
      Main 4.1.0: k=1
    Outcome 4.4: Wound problems at 8 - 12 weeks: women seeking professional help for problem with perineal repair
      Main 4.4.0: k=1
Simulated Data
Generated effect size $Y_i$ and within-study variance estimates $\hat{\sigma}_i^2$ for each simulated meta-analysis study
Distribution for $\hat{\sigma}_i^2$ based on the $\chi^2_1$ distribution
For $Y_i$ (where $Y_i = \theta_i + e_i$)
assumed $e_i \sim N(0, \hat{\sigma}_i^2)$
various distributional scenarios for $\theta_i$: normal, moderate and extreme skew-normal, uniform, bimodal
three $\tau^2$ values to capture low ($I^2 = 15.1\%$), medium ($I^2 = 34.9\%$) and large ($I^2 = 64.1\%$) heterogeneity
For each distributional assumption and $\tau^2$ value, 10,000 meta-analysis cases simulated
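One simulated meta-analysis under the normal scenario could be generated roughly as below. This is a sketch under stated assumptions: the scaling factor `scale` and the lower bound `floor` on the within-study variances are illustrative choices that do not come from the slides, and only normally distributed $\theta_i$ are covered.

```python
import random

def simulate_meta_analysis(k, mu, tau2, rng, scale=0.25, floor=0.009):
    """Return k (Y_i, sigma2_i) pairs with theta_i ~ N(mu, tau2)."""
    studies = []
    for _ in range(k):
        # A chi-square(1) draw is the square of a standard normal draw.
        sigma2 = max(scale * rng.gauss(0.0, 1.0) ** 2, floor)  # assumed scale/floor
        theta = rng.gauss(mu, tau2 ** 0.5)                # true study effect
        y = theta + rng.gauss(0.0, sigma2 ** 0.5)         # Y_i = theta_i + e_i
        studies.append((y, sigma2))
    return studies

rng = random.Random(42)
sim = simulate_meta_analysis(k=5, mu=0.0, tau2=0.1, rng=rng)
```

Repeating the call 10,000 times per scenario, and swapping the draw for `theta` to a skew-normal, uniform or bimodal distribution, would mirror the design described above.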
The questions
Investigate the potential bias when assuming $\hat{\tau}^2 = 0$
Compare the performance of $\tau^2$ estimators in various scenarios
Present the distribution of $\hat{\tau}^2$ derived from all meta-analyses in the Cochrane Library
Present details on the number of meta-analysed studies, model selection and zero $\hat{\tau}^2$
Assess the sensitivity of results and conclusions using alternative models
Between-study variance estimators
frequentist, more or less
DerSimonian-Laird
one-step ($\hat{\tau}^2_{DL}$)
two-step ($\hat{\tau}^2_{DL2}$)
non-parametric bootstrap ($\hat{\tau}^2_{DLb}$)
minimum $\hat{\tau}^2_{DL} = 0.01$ assumed ($\hat{\tau}^2_{DLi}$)
Variance components
one-step ($\hat{\tau}^2_{VC}$)
two-step ($\hat{\tau}^2_{VC2}$)
Iterative
Maximum likelihood ($\hat{\tau}^2_{ML}$)
Restricted maximum likelihood ($\hat{\tau}^2_{REML}$)
Profile likelihood ($\hat{\tau}^2_{PL}$)
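A non-parametric bootstrap of the DerSimonian-Laird estimator can be sketched as follows. The slides do not spell out the resampling scheme, so this assumes the common form: resample the k studies with replacement, compute the truncated one-step DL estimate on each resample, and average. Function names, the number of resamples and the seed are hypothetical.

```python
import random

def dl_tau2(effects, variances):
    """One-step DerSimonian-Laird estimate, truncated at zero (needs k >= 2)."""
    k = len(effects)
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    mu_fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    q = sum(w * (y - mu_fixed) ** 2 for w, y in zip(weights, effects))
    c = sum_w - sum(w * w for w in weights) / sum_w
    return max(0.0, (q - (k - 1)) / c)

def dl_bootstrap_tau2(effects, variances, n_boot=1000, seed=0):
    """Mean of DL estimates over study-level bootstrap resamples (assumed scheme)."""
    rng = random.Random(seed)
    k = len(effects)
    total = 0.0
    for _ in range(n_boot):
        idx = [rng.randrange(k) for _ in range(k)]   # sample studies with replacement
        total += dl_tau2([effects[i] for i in idx], [variances[i] for i in idx])
    return total / n_boot

tau2_dlb = dl_bootstrap_tau2([0.0, 0.5, 1.0, 1.5], [0.04] * 4, n_boot=200)
```

Averaging over resamples returns zero only if every resample yields zero, which is one plausible reason a bootstrap version reports fewer zero estimates than the one-step estimator.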
Between-study variance estimators
Bayesian
Sidik and Jonkman model error variance
crude ratio estimates used as a-priori values ($\hat{\tau}^2_{MVa}$)
VC estimator used to inform a-priori values with minimum value of 0.01 ($\hat{\tau}^2_{MVb}$)
Rukhin
prior between-study variance zero ($\hat{\tau}^2_{B0}$)
prior between-study variance non-zero and fixed ($\hat{\tau}^2_{BP}$)
Assessment criteria
in the 10,000 meta-analysis cases for each simulation scenario
Average bias & average absolute bias in $\hat{\tau}^2$
Percentage of zero $\hat{\tau}^2$
Coverage probability for the effect estimate
Type I error
proportion of 95% CIs for the overall effect estimate that
contain the true overall effect θi
Error-interval estimation for the effect
quantifies accuracy of estimation of the error-interval
around the point estimate
ratio of estimated confidence interval for the effect,
compared to the interval based on the true $\tau^2$
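Three of these criteria (bias, percentage of zero estimates and coverage) reduce to simple averages over the simulated cases. The sketch below shows that bookkeeping on toy numbers; the function name and all inputs are hypothetical, and the error-interval ratio is left out.

```python
def summarise_performance(tau2_hats, cis, true_tau2, true_mu):
    """Average (absolute) bias, % zero estimates and % coverage over simulated cases."""
    n = len(tau2_hats)
    bias = sum(t - true_tau2 for t in tau2_hats) / n
    abs_bias = sum(abs(t - true_tau2) for t in tau2_hats) / n
    pct_zero = 100.0 * sum(1 for t in tau2_hats if t == 0) / n
    # Coverage: share of 95% CIs for the overall effect containing the true value.
    coverage = 100.0 * sum(1 for lo, hi in cis if lo <= true_mu <= hi) / n
    return {"bias": bias, "abs_bias": abs_bias,
            "pct_zero": pct_zero, "coverage": coverage}

# Three toy simulated cases with true tau^2 = 0.1 and true overall effect 0.
summary = summarise_performance(
    tau2_hats=[0.0, 0.1, 0.3],
    cis=[(-1.0, 1.0), (0.5, 1.0), (-0.2, 0.2)],
    true_tau2=0.1,
    true_mu=0.0,
)
```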
Method performance
Cochrane data
Which method?
Performance not affected much by effects’ distribution
Absolute bias
B0 (k ≤ 3) and ML
Coverage
MVa-BP (k ≤ 3) and DLb
Error-interval estimation and detecting heterogeneity
DLb
DLb seems best method overall, especially in detecting heterogeneity
Detection appears to be a big problem: DL failed to detect high $\tau^2$ for over 50% of small meta-analyses
Bayesian methods did well for very small MAs
Meta-analyses numbers
Of the 3,845 files, 2,801 had identified relevant studies and contained any data
98,615 analyses extracted, of which 57,397 were meta-analyses
32,005 were overall meta-analyses
25,392 were subgroup meta-analyses
Estimation of an overall effect
Peto method in 4,340 (7.6%)
Mantel-Haenszel in 33,184 (57.8%)
Inverse variance in 19,873 (34.6%)
random-effects more prevalent in inverse variance methods
and larger meta-analyses
34% of meta-analyses on 2 studies (53% k ≤ 3)!
Meta-analyses by Cochrane group
[Figure 1: All meta-analyses, including single-study and subgroup meta-analyses. One stacked bar per Cochrane review group, from Pregnancy and Childbirth to Public Health; segments: single study, fixed-effect model (by choice or necessity), random-effects model; y-axis: number of meta-analyses, 0 to 14,000.]
Model selection by meta-analysis size
[Figure: Model selection by number of available studies (% of random-effects meta-analyses). Stacked bars: fixed-effect (by choice or necessity) and random-effects; y-axis: number of meta-analyses, 0 to 20,000. Random-effects percentage labels, in extraction order: 16, 20, 24, 25, 27, 29, 32, 31, 31, 33, 33, 34, 30, 38, 30, 30, 32, 33, 35, 37, 38.]
Meta-analyses by method choice
[Figure 2: Model selection by number of available studies (and % of random-effects meta-analyses). Bars: Peto (FE), Inverse Variance (FE), Inverse Variance (RE), Mantel-Haenszel (FE), Mantel-Haenszel (RE); x-axis: number of studies in meta-analysis (2, 3, 4, 5, 6-9, 10+); y-axis: number of meta-analyses, 0 to 12,000. Percentage labels, in extraction order: 21, 27, 31, 37, 41, 51 and 15, 19, 22, 22, 27, 30.]
Note: in many cases fixed-effect models were used when heterogeneity was detected
Comparing Cochrane data with simulated
To assess the validity of a homogeneity assumption we compared the percentage of zero $\hat{\tau}^2_{DL}$ in real and simulated data
Calculated $\hat{\tau}^2_{DL}$ for all Cochrane meta-analyses
Percentage of zero $\hat{\tau}^2_{DL}$ was lower in the real data than in the low and moderate heterogeneity simulated data
Suggests that mean true between-study variance is higher
than generally assumed but fails to be detected; especially
for small meta-analyses
Comparing Cochrane data with simulated
[Figure 3: Comparison of zero between-study variance estimate rates in the Cochrane Library data and in simulations, using the DerSimonian-Laird method. x-axis: number of studies in meta-analysis (2, 3, 4, 5, 10, 20); y-axis: % of zero $\hat{\tau}^2$ estimates with DerSimonian-Laird, 0 to 100; series: observed, true $I^2$ = 15%, true $I^2$ = 35%, true $I^2$ = 64%.]
Note: normal distribution of the effects assumed in the simulations (more extreme distributions produced similar results).
Reanalysing the Cochrane data
We applied all methods to all 57,397 meta-analyses to assess $\hat{\tau}^2$ distributions and the sensitivity of the results and conclusions
For simplicity discuss differences between standard methods and DLb; not a perfect method but one that performed well overall
As in simulations, DLb identifies more heterogeneous meta-analyses; $\hat{\tau}^2_{DL} = 0$ for 50.5% & $\hat{\tau}^2_{DLb} = 0$ for 31.2%
Distributions of $\hat{\tau}^2$ agree with the hypothesised $\chi^2_1$
Distributions for $\hat{\tau}^2$
[Figure: histograms of non-zero estimates only, four panels. x-axis: $\tau^2$ estimate, 0 to 0.5; y-axis: # of meta-analyses; series: DL, DLb, VC, ML, REML, B0, VC2, DL2.]
Zero estimates (%)
Panel             DL    DLb   VC    REML  ML    B0    VC2   DL2
Inverse Variance  44.9  29.6  48.9  45.4  62.2  49.2  44.3  45.3
Mantel-Haenszel   54.2  32.7  58.8  55.6  75.0  59.6  53.9  55.5
Peto & O-E        50.8  27.3  54.2  51.4  70.0  54.8  49.6  51.0
All methods       50.7  31.2  55.0  51.7  70.2  55.6  50.2  51.6
Non-convergence (%)
Panel             ML    REML
Inverse Variance  0.7   1.4
Mantel-Haenszel   1.3   1.9
Peto & O-E        0.6   1.0
All methods       1.0   1.6
Changes in results and conclusions
Inverse variance with DLb
when $\hat{\tau}^2_{DL} > 0$ but ignored, conclusions change for 19.1% of analyses
in overwhelming majority of changes, effects stopped being statistically significant
Findings were similar for Mantel-Haenszel and Peto methods, although the validity of the inverse variance weighting in these (which is a prerequisite for the use of random-effects models) warrants further investigation
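The mechanism behind these changes can be illustrated with a toy inverse-variance example: adding a non-zero between-study variance to every study's variance widens the confidence interval, so an effect that is significant under a fixed-effect model may stop being so. The data and the value `tau2=0.07` are hypothetical, and the normal-approximation interval is an assumption of this sketch.

```python
import math

def pooled_ci(effects, variances, tau2=0.0):
    """Inverse-variance pooled estimate and 95% CI; tau2 = 0 gives fixed-effect."""
    weights = [1.0 / (v + tau2) for v in variances]   # random-effects weights
    sum_w = sum(weights)
    mu_hat = sum(w * y for w, y in zip(weights, effects)) / sum_w
    se = math.sqrt(1.0 / sum_w)
    return mu_hat, mu_hat - 1.96 * se, mu_hat + 1.96 * se

def is_significant(lower, upper):
    """True when the 95% CI excludes zero."""
    return lower > 0 or upper < 0

effects = [-0.05, 0.45, 0.05, 0.55]      # hypothetical study effects
variances = [0.02, 0.02, 0.02, 0.02]

_, fe_lo, fe_hi = pooled_ci(effects, variances)               # heterogeneity ignored
_, re_lo, re_hi = pooled_ci(effects, variances, tau2=0.07)    # heterogeneity modelled
conclusion_changes = is_significant(fe_lo, fe_hi) != is_significant(re_lo, re_hi)
```

In this toy case the point estimate is the same under both models (0.25) and only the interval width differs, which matches the direction of change reported above.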
Changes in results and conclusions
e.g. inverse variance analyses
Does the RevMan DerSimonian-Laird random-effects method say heterogeneity is present?
No: analysis with bootstrap DL rarely changes conclusions (although heterogeneity estimates are higher and heterogeneity is found in around 20% more meta-analyses). Conclusions change for 0.9% of analyses
Yes: is the estimated heterogeneity 'ignored' by the authors and a fixed-effect model chosen?
  No: analysis with bootstrap DL rarely changes conclusions. Conclusions change for 2.4% of analyses
  Yes: analysis with bootstrap DL makes a difference in 1 in 5 analyses (as would analysis with standard DL, but to a smaller extent). Conclusions change for 19.1% of analyses
Summary
Relevant and future work
Findings
Methods often fail to detect $\tau^2$ in small MA
Even when $\hat{\tau}^2 > 0$, often ignored
Mean true heterogeneity higher than assumed or estimated; but standard method fails to detect it
Non-parametric DerSimonian-Laird bootstrap seems best method overall, especially in detecting heterogeneity
Bayesian estimators MVa (Sidik-Jonkman) and BP (Rukhin) performed very well when k ≤ 3
19-21% of statistical conclusions change when $\hat{\tau}^2_{DL} > 0$ but ignored
Conclusions
Detecting and accurately estimating $\hat{\tau}^2$ in a small MA is very difficult; yet for 53% of Cochrane MAs, k ≤ 3
$\hat{\tau}^2 = 0$ assumed to lead to a more reliable meta-analysis and high $\hat{\tau}^2$ is alarming and potentially prohibitive
Estimates of zero heterogeneity should also be a concern
since heterogeneity is likely present but undetected
Bootstrapped DL leads to a small improvement but
problem largely remains, especially for very small MAs
Caution against ignoring heterogeneity when detected
For full generalisability, random-effects essential?
Effect sizes in Randomised Controlled Trials
Most large treatment effects emerge from small studies,
and when additional trials are performed, the effect sizes
become typically much smaller
Well validated large effects are uncommon and pertain to
nonfatal outcomes
Original Contribution
Empirical Evaluation of Very Large Treatment Effects of Medical Interventions
Tiago V. Pereira, PhD; Ralph I. Horwitz, MD; John P. A. Ioannidis, MD, DSc
Most effective interventions in health care confer modest, incremental benefits. Randomized trials, the gold standard to evaluate medical interventions, are ideally conducted under the principle of equipoise: the compared groups are not perceived to have a clear advantage; thus, very large treatment effects are usually not anticipated. However, very large treatment effects are observed occasionally in some trials. These effects may include both anticipated and un-
Context: Most medical interventions have modest effects, but occasionally some clinical trials may find very large effects for benefits or harms.
Objective: To evaluate the frequency and features of very large effects in medicine.
Data Sources: Cochrane Database of Systematic Reviews (CDSR, 2010, issue 7).
Study Selection: We separated all binary-outcome CDSR forest plots with comparisons of interventions according to whether the first published trial, a subsequent trial (not the first), or no trial had a nominally statistically significant (P < .05) very large effect (odds ratio [OR], ≥ 5). We also sampled randomly 250 topics from each group for further in-depth evaluation.
Data Extraction: We assessed the types of treatments and outcomes in trials with very large effects, examined how often large-effect trials were followed up by other trials on the same topic, and how these effects compared against the effects of the respective meta-analyses.
Results: Among 85 002 forest plots (from 3082 reviews), 8239 (9.7%) had a significant very large effect in the first published trial, 5158 (6.1%) only after the first published trial, and 71 605 (84.2%) had no trials with significant very large effects. Nominally significant very large effects typically appeared in small trials with median number of events: 18 in first trials and 15 in subsequent trials. Topics with very
Publication bias
Publication bias was present in a substantial proportion of
large meta-analyses that were recently published in four
major medical journals (BMJ, JAMA, Lancet, and PLOS
Medicine between 2008 and 2012).
Publication Bias in Recent Meta-Analyses
Michal Kicinski*
Department of Science, Hasselt University, Hasselt, Belgium
Abstract
Introduction: Positive results have a greater chance of being published and outcomes that are statistically significant
have a greater chance of being fully reported. One consequence of research underreporting is that it may influence
the sample of studies that is available for a meta-analysis. Smaller studies are often characterized by larger effects in
published meta-analyses, which can be possibly explained by publication bias. We investigated the association
between the statistical significance of the results and the probability of being included in recent meta-analyses.
Methods: For meta-analyses of clinical trials, we defined the relative risk as the ratio of the probability of including
statistically significant results favoring the treatment to the probability of including other results. For meta-analyses of
other studies, we defined the relative risk as the ratio of the probability of including biologically plausible statistically
significant results to the probability of including other results. We applied a Bayesian selection model for meta-
analyses that included at least 30 studies and were published in four major general medical journals (BMJ, JAMA,
Lancet, and PLOS Medicine) between 2008 and 2012.
Results: We identified 49 meta-analyses. The estimate of the relative risk was greater than one in 42 meta-analyses,
greater than two in 16 meta-analyses, greater than three in eight meta-analyses, and greater than five in four meta-
analyses. In 10 out of 28 meta-analyses of clinical trials, there was strong evidence that statistically significant results
favoring the treatment were more likely to be included. In 4 out of 19 meta-analyses of observational studies, there
was strong evidence that plausible statistically significant outcomes had a higher probability of being included.
Future work
Look for publication bias
Examine factors that predict large effect sizes and
significant findings (e.g. subanalyses)
Is model choice (FE or RE) driven by the results? (i.e.
‘hope’ for a significant finding?)
Update our Stata metaan command to include the
Bayesian methods (DLb already added)
34. Appendix Thank you!
A Re-Analysis of the Cochrane Library Data: The Dangers
of Unobserved Heterogeneity in Meta-Analyses
Evangelos Kontopantelis (1,2,3)*, David A. Springate (1,2), David Reeves (1,2)
1 Centre for Primary Care, NIHR School for Primary Care Research, Institute of Population Health, University of Manchester, Manchester, United Kingdom, 2 Centre for
Biostatistics, Institute of Population Health, University of Manchester, Manchester, United Kingdom, 3 Centre for Health Informatics, Institute of Population Health,
University of Manchester, Manchester, United Kingdom
Abstract
Background: Heterogeneity has a key role in meta-analysis methods and can greatly affect conclusions. However, true levels
of heterogeneity are unknown and often researchers assume homogeneity. We aim to: a) investigate the prevalence of
unobserved heterogeneity and the validity of the assumption of homogeneity; b) assess the performance of various meta-
analysis methods; c) apply the findings to published meta-analyses.
Methods and Findings: We accessed 57,397 meta-analyses, available in the Cochrane Library in August 2012. Using
simulated data we assessed the performance of various meta-analysis methods in different scenarios. The prevalence of a
zero heterogeneity estimate in the simulated scenarios was compared with that in the Cochrane data, to estimate the
degree of unobserved heterogeneity in the latter. We re-analysed all meta-analyses using all methods and assessed the
sensitivity of the statistical conclusions. Levels of unobserved heterogeneity in the Cochrane data appeared to be high,
especially for small meta-analyses. A bootstrapped version of the DerSimonian-Laird approach performed best in both
detecting heterogeneity and in returning more accurate overall effect estimates. Re-analysing all meta-analyses with this
new method we found that in cases where heterogeneity had originally been detected but ignored, 17–20% of the
statistical conclusions changed. Rates were much lower where the original analysis did not detect heterogeneity or took it
into account, between 1% and 3%.
Conclusions: When evidence for heterogeneity is lacking, standard practice is to assume homogeneity and apply a simpler
fixed-effect meta-analysis. We find that assuming homogeneity often results in a misleading analysis, since heterogeneity is
very likely present but undetected. Our new method represents a small improvement but the problem largely remains,
especially for very small meta-analyses. One solution is to test the sensitivity of the meta-analysis conclusions to assumed
moderate and large degrees of heterogeneity. Equally, whenever heterogeneity is detected, it should not be ignored.
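The sensitivity check suggested in the conclusions could be sketched as below. It assumes the usual relation $I^2 = \tau^2 / (\tau^2 + s^2)$, with $s^2$ the "typical" within-study variance, to turn an assumed $I^2$ into a $\tau^2$; the assumed $I^2$ levels of 0.35 and 0.64 echo the moderate and large simulation scenarios, while the function names and data are hypothetical.

```python
import math

def typical_within_variance(variances):
    """'Typical' within-study variance s^2 used in the definition of I^2."""
    weights = [1.0 / v for v in variances]
    k = len(weights)
    sum_w = sum(weights)
    return (k - 1) * sum_w / (sum_w ** 2 - sum(w * w for w in weights))

def tau2_from_i2(variances, i2):
    """Between-study variance implied by an assumed I^2 (0 <= i2 < 1)."""
    return typical_within_variance(variances) * i2 / (1.0 - i2)

def random_effects_ci(effects, variances, tau2):
    """Random-effects inverse-variance 95% CI for a given tau^2."""
    weights = [1.0 / (v + tau2) for v in variances]
    sum_w = sum(weights)
    mu_hat = sum(w * y for w, y in zip(weights, effects)) / sum_w
    half_width = 1.96 * math.sqrt(1.0 / sum_w)
    return mu_hat - half_width, mu_hat + half_width

effects = [0.10, 0.30, 0.50]       # hypothetical meta-analysis
variances = [0.04, 0.04, 0.04]
sensitivity = {i2: random_effects_ci(effects, variances, tau2_from_i2(variances, i2))
               for i2 in (0.0, 0.35, 0.64)}
```

If the interval for the assumed moderate or large heterogeneity crosses the null while the homogeneous one does not, the conclusion is sensitive to the homogeneity assumption.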
Citation: Kontopantelis E, Springate DA, Reeves D (2013) A Re-Analysis of the Cochrane Library Data: The Dangers of Unobserved Heterogeneity in Meta-
Analyses. PLoS ONE 8(7): e69930. doi:10.1371/journal.pone.0069930
Editor: Tim Friede, University Medical Center Göttingen, Germany
Received February 20, 2013; Accepted June 13, 2013; Published July 26, 2013
Copyright: © 2013 Kontopantelis et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits
unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: EK was partly supported by a National Institute for Health Research (NIHR) School for Primary Care Research fellowship in primary health care. The
This project was supported by the School for Primary Care Research
which is funded by the National Institute for Health Research (NIHR).
The views expressed are those of the author(s) and not necessarily
those of the NHS, the NIHR or the Department of Health.
Comments, suggestions: e.kontopantelis@manchester.ac.uk