# A simple confidence interval for meta-analysis


STATISTICS IN MEDICINE
Statist. Med. 2002; 21:3153–3159 (DOI: 10.1002/sim.1262)

Kurex Sidik¹ and Jeffrey N. Jonkman²,∗,†

¹ Biometrics Research, Wyeth Research, CN 8000, Princeton, NJ 08543-8000, U.S.A.
² Department of Mathematics and Statistics, Mississippi State University, Mississippi State, MS 39762, U.S.A.

SUMMARY

In the context of a random effects model for meta-analysis, a number of methods are available to estimate confidence limits for the overall mean effect. A simple and commonly used method is the DerSimonian and Laird approach. This paper discusses an alternative simple approach for constructing the confidence interval, based on the t-distribution. This approach has improved coverage probability compared to the DerSimonian and Laird method. Moreover, it is easy to calculate, and unlike some methods suggested in the statistical literature, no iterative computation is required. Copyright © 2002 John Wiley & Sons, Ltd.

KEY WORDS: coverage probability; random effects model; simulation study; t-distribution; weighted estimation

1. INTRODUCTION

In the random effects model for meta-analysis, one important objective is to calculate an interval estimate for the overall mean effect across studies. Recently, Brockwell and Gordon [1] presented a summary and carried out comparisons of the existing methods for the interval estimate in terms of coverage probability. They found that the confidence intervals from all the methods they studied generally had coverage probabilities below the specified nominal level. In particular, the commonly used DerSimonian and Laird random effects method [2] produced coverage probabilities below the nominal level. Although the profile likelihood interval of Hardy and Thompson [3] led to the highest coverage probabilities, the profile likelihood approach is not simple computationally, and involves an iterative calculation, as does the simple likelihood method [1].

In this paper we present a simple approach for constructing a 100(1 − α) per cent confidence interval for the overall mean effect in the random effects model. The interval is defined using pivotal inference based on the t-distribution. It is simple because, like the DerSimonian and Laird method, no iterative computation is needed. Moreover, we find that this simple approach has a better coverage probability than the DerSimonian and Laird method.

∗ Correspondence to: Jeffrey N. Jonkman, Department of Mathematics and Statistics, Mississippi State University, Mississippi State, MS 39762, U.S.A.
† E-mail: jonkman@math.msstate.edu

Received June 2001; Accepted December 2001

Note that the aim of this paper is to present a brief discussion of the simple interval in comparison with the commonly used DerSimonian and Laird random effects method. The scope of this paper is purely in the non-Bayesian statistical framework. Therefore, we will only cite a few papers relevant to the study. For a more complete summary, and for comparisons of the different methods of obtaining an interval estimate, we refer to the paper by Brockwell and Gordon [1]. For more detailed reviews of the various methods in meta-analysis, we refer the reader to the book by Hedges and Olkin [4], the brief note by Olkin [5], and the paper by Normand [6].

2. METHOD

For k independent studies in meta-analysis, let the random variable $y_i$ be an effect size estimate from the ith study. Some commonly used measures of effect size are mean difference, standardized mean difference, risk difference, relative risk and odds ratio. The odds ratio is frequently of particular interest in retrospective or case control studies. The random effects model can be defined as follows:

$$y_i = \mu + \delta_i + \varepsilon_i, \qquad i = 1, \ldots, k$$

where $\delta_i$ represents the random error across the studies, $\varepsilon_i$ represents the random error within a study, and $\delta_i$ and $\varepsilon_i$ are assumed to be independent. Furthermore, it is assumed that $\delta_i \sim N(0, \tau^2)$ and $\varepsilon_i \sim N(0, \sigma_i^2)$. The parameter $\tau^2$ is a measure of the heterogeneity between studies and is sometimes called the heterogeneity variance. Therefore, the individual effect size $y_i$ in the random effects model for meta-analysis is normally distributed with mean $\mu$ and variance $\sigma_i^2 + \tau^2$.

Consider construction of a confidence interval for the overall mean effect $\mu$ in the random effects model. There are several methods for obtaining an interval estimate for this model [1]. A simple and commonly used method is the DerSimonian and Laird approach [2]. The crucial step of this method is to estimate the heterogeneity variance $\tau^2$.
DerSimonian and Laird's [2] estimator, $\hat\tau^2$, is given by

$$\hat\tau^2 = \max\left\{0,\ \frac{\sum_{i=1}^{k} w_i (y_i - \hat\mu)^2 - (k-1)}{\sum_{i=1}^{k} w_i - \sum_{i=1}^{k} w_i^2 \big/ \sum_{i=1}^{k} w_i}\right\} \qquad (1)$$

where $\hat\mu$ is the weighted estimate of the mean effect, given by $\hat\mu = \sum_{i=1}^{k} w_i y_i \big/ \sum_{i=1}^{k} w_i$. Here, the weight $w_i = 1/\sigma_i^2$ is assumed to be known. Usually the sample estimate $\hat\sigma_i^2$ is substituted in place of $\sigma_i^2$, so that $w_i = 1/\hat\sigma_i^2$ is used in practice. Using the estimate $\hat\tau^2$, a 100(1 − α) per cent approximate confidence interval for $\mu$ based on the DerSimonian and Laird random effects method is given by

$$\hat\mu_{\hat w} \pm z_{\alpha/2} \sqrt{\frac{1}{\sum_{i=1}^{k} \hat w_i}} \qquad (2)$$

where $\hat w_i = 1/(\hat\sigma_i^2 + \hat\tau^2)$ and $\hat\mu_{\hat w} = \sum_{i=1}^{k} \hat w_i y_i \big/ \sum_{i=1}^{k} \hat w_i$. Note that $z_{\alpha/2}$ is the α/2 upper quantile of the standard normal distribution.
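
The two steps in (1) and (2) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `dl_interval` and the example data are hypothetical (they do not come from the paper), the within-study variances passed in play the role of the estimates $\hat\sigma_i^2$, and only the standard library is used.

```python
from statistics import NormalDist


def dl_interval(y, var_within, alpha=0.05):
    """Return (tau2_hat, mu_hat, (lower, upper)) following equations (1) and (2)."""
    k = len(y)
    if k < 2:
        raise ValueError("at least two studies are needed")
    # Weights w_i = 1 / sigma_i^2, with estimated variances used in practice.
    w = [1.0 / v for v in var_within]
    sum_w = sum(w)
    mu_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum_w
    # Equation (1): moment estimator of tau^2, truncated at zero.
    q = sum(wi * (yi - mu_fixed) ** 2 for wi, yi in zip(w, y))
    denom = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / denom)
    # Random effects weights 1 / (sigma_i^2 + tau^2) and the weighted mean.
    w_hat = [1.0 / (v + tau2) for v in var_within]
    sum_w_hat = sum(w_hat)
    mu_hat = sum(wi * yi for wi, yi in zip(w_hat, y)) / sum_w_hat
    # Equation (2): mu_hat +/- z_{alpha/2} * sqrt(1 / sum of weights).
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half = z * (1.0 / sum_w_hat) ** 0.5
    return tau2, mu_hat, (mu_hat - half, mu_hat + half)


if __name__ == "__main__":
    # Hypothetical log odds ratios and within-study variances, for illustration only.
    y = [0.10, 0.30, 0.35, 0.65, 0.45, 0.15]
    v = [0.03, 0.03, 0.05, 0.01, 0.05, 0.02]
    tau2, mu_hat, (lo, hi) = dl_interval(y, v)
    print(f"tau2={tau2:.4f}  mu={mu_hat:.4f}  95% CI=({lo:.4f}, {hi:.4f})")
```

Note that the width of this interval depends on k only through the sum of the weights; the normal quantile is the same whatever the number of studies.
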

In this paper, we consider an alternative simple interval estimate for $\mu$ in the random effects model. Under the assumptions that $y_i \sim N(\mu, \sigma_i^2 + \tau^2)$ and that $\hat w_i$ is the correct weight for each i, we have

$$Z_w = \frac{\hat\mu_{\hat w} - \mu}{\sqrt{1 \big/ \sum_{i=1}^{k} \hat w_i}} \sim N(0, 1) \qquad \text{and} \qquad Q_w = \sum_{i=1}^{k} \hat w_i (y_i - \hat\mu_{\hat w})^2 \sim \chi^2_{(k-1)}$$

It can be shown that $Z_w$ and $Q_w$ are independent (see the proof in the Appendix). Hence

$$\frac{\sqrt{\sum_{i=1}^{k} \hat w_i}\,(\hat\mu_{\hat w} - \mu)}{\sqrt{\sum_{i=1}^{k} \hat w_i (y_i - \hat\mu_{\hat w})^2 / (k-1)}} \sim t_{(k-1)} \qquad (3)$$

Now, using this pivotal quantity, we may construct an approximate 100(1 − α) per cent confidence interval for $\mu$ as follows:

$$\hat\mu_{\hat w} \pm t_{k-1,\alpha/2} \sqrt{\frac{\sum_{i=1}^{k} \hat w_i (y_i - \hat\mu_{\hat w})^2}{(k-1) \sum_{i=1}^{k} \hat w_i}} \qquad (4)$$

where $t_{k-1,\alpha/2}$ is the α/2 upper quantile from the t-distribution with k − 1 degrees of freedom. Notice that, under the assumption of known weights, $Q_w \big/ [(k-1) \sum_{i=1}^{k} \hat w_i]$ is an unbiased estimator of the variance of $\hat\mu_{\hat w}$.

Strictly speaking, the normal distribution of $Z_w$ and the chi-square distribution of $Q_w$ shown above require the assumption that the weights, and hence the true variance $\sigma_i^2 + \tau^2$ of $y_i$ (i = 1, …, k), are known. Nevertheless, we relax this requirement by substituting the estimate $\hat\sigma_i^2 + \hat\tau^2$ in place of the true variance. Although using the estimated values may raise problems for the validity of the distributional assumptions, it is the common practice in statistical inference based on weighted estimation. To be specific, it is usually assumed that weights are known, even though the actual applied weights are functions of statistics, and then one obtains the estimate and confidence interval based on normal or chi-square distributions, as in the case of the DerSimonian and Laird method.

3. SIMULATION

To compare the simple confidence interval based on the t-distribution with the DerSimonian and Laird interval in terms of coverage probability, we performed a simulation study of meta-analysis for the random effects model.
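
Before the simulation settings, the interval in (4) can be sketched in code. This is a minimal illustration, not the authors' implementation: the function name `t_interval` and the example numbers are hypothetical. The heterogeneity estimate `tau2` is passed in (for instance the DerSimonian and Laird estimate from (1)), and the t critical value is supplied by the caller because the Python standard library has no t quantile function; 2.776 is the usual table value for 4 degrees of freedom at α = 0.05.

```python
def t_interval(y, var_within, tau2, t_crit):
    """Return (mu_hat, (lower, upper)) for interval (4).

    t_crit is the upper alpha/2 quantile of the t-distribution with
    len(y) - 1 degrees of freedom, supplied by the caller.
    """
    k = len(y)
    if k < 2:
        raise ValueError("at least two studies are needed")
    # Random effects weights 1 / (sigma_i^2 + tau^2).
    w_hat = [1.0 / (v + tau2) for v in var_within]
    sum_w = sum(w_hat)
    mu_hat = sum(wi * yi for wi, yi in zip(w_hat, y)) / sum_w
    # Q_w: weighted sum of squared deviations about the weighted mean.
    q_w = sum(wi * (yi - mu_hat) ** 2 for wi, yi in zip(w_hat, y))
    # Half-width: t * sqrt(Q_w / ((k - 1) * sum of weights)).
    half = t_crit * (q_w / ((k - 1) * sum_w)) ** 0.5
    return mu_hat, (mu_hat - half, mu_hat + half)


if __name__ == "__main__":
    # Hypothetical data for k = 5 studies; tau2 = 0.04 is an assumed estimate.
    y = [0.10, 0.30, 0.35, 0.65, 0.45]
    v = [0.03, 0.03, 0.05, 0.01, 0.05]
    mu_hat, (lo, hi) = t_interval(y, v, tau2=0.04, t_crit=2.776)
    print(f"mu={mu_hat:.4f}  95% CI=({lo:.4f}, {hi:.4f})")
```

As a sanity check on the formula, when all weights are equal the half-width reduces to $t \cdot s/\sqrt{k}$, the ordinary one-sample t interval for the mean of the $y_i$.
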
Throughout the study, the overall mean effect $\mu$ is fixed at 0.5 and the error probability of the confidence interval, α, is set at 0.05. We use only one value for $\mu$ because the t-distribution interval based on the pivotal quantity in (3) and the DerSimonian and Laird interval are both invariant to a location shift. Four different values of $\tau^2$ are used: 0.03, 0.05, 0.08 and 0.1. For each $\tau^2$, twelve different values of k are considered, that is, k = 5(5)60 (k = 5, 10, …, 60). The number of simulation runs for the meta-analysis of k studies is 10 000. The simulation data for each run are generated in terms of the most popular measure of effect size in meta-analysis, the log of the odds ratio. That is, the generated effect size $y_i$ is interpreted as a log odds ratio. For given k, the within-study variance $\sigma_i^2$ is generated using the method of Brockwell and Gordon [1]. Specifically, a value is generated from a chi-square distribution with one degree of freedom, which is then scaled by 1/4 and restricted to an interval between 0.009 and 0.6. This results in a bimodal distribution of $\sigma_i^2$, with the modes at each end of the distribution. As noted by Brockwell and Gordon, values generated in this way are consistent with a typical distribution of $\hat\sigma_i^2$ for log odds ratios encountered in practice. For binary outcomes, the within-study variance decreases with increasing sample size, so large values of $\sigma_i^2$ (close to 0.6) represent small trials included in the meta-analysis, and small values of $\sigma_i^2$ represent large trials. The effect size $y_i$ for i = 1, …, k is generated from a normal distribution with mean $\mu$ and variance $\sigma_i^2 + \tau^2$.

For each simulation of the meta-analysis, the confidence intervals based on the t-distribution and the DerSimonian and Laird method are calculated. The numbers of intervals containing the true $\mu$ are recorded for both methods. The proportion of intervals containing the true $\mu$ (out of the 10 000 runs) serves as the simulation estimate of the true coverage probability.

The results of the simulation study are presented in Figure 1. From the plots, it can be seen that the coverage probabilities of the interval based on the t-distribution are larger than the coverage probabilities of the interval using the DerSimonian and Laird method for each $\tau^2$ and all values of k. Although the coverage probabilities of the confidence interval from the t-distribution, like those of other methods [1], are below the nominal level of 95 per cent, they are higher than those of the commonly applied interval based on the DerSimonian and Laird method, particularly when k is small.
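
The simulation design described above can be sketched roughly as follows. Several details are assumptions rather than facts from the paper: the function name `simulate_coverage` is hypothetical; "restricted" is read here as clipping values to [0.009, 0.6] (redrawing out-of-range values would be another reading); the variances are redrawn in every run and then treated as known in the analysis; the number of runs is reduced from 10 000 to keep the sketch fast; and the t critical values are hard-coded from a standard table for three values of k only.

```python
import random
from statistics import NormalDist

# Two-sided 95% t critical values for k - 1 degrees of freedom (table values).
T_CRIT_95 = {5: 2.776, 10: 2.262, 20: 2.093}
Z_95 = NormalDist().inv_cdf(0.975)


def simulate_coverage(k, tau2, mu=0.5, runs=2000, seed=1):
    """Return (DL coverage, t-interval coverage) estimated over `runs` meta-analyses."""
    rng = random.Random(seed)
    t_crit = T_CRIT_95[k]
    hits_dl = hits_t = 0
    for _ in range(runs):
        # Within-study variances: chi-square(1) / 4, clipped to [0.009, 0.6] (assumed).
        var = [min(0.6, max(0.009, rng.gauss(0.0, 1.0) ** 2 / 4.0)) for _ in range(k)]
        # Effect sizes y_i ~ N(mu, sigma_i^2 + tau^2).
        y = [rng.gauss(mu, (v + tau2) ** 0.5) for v in var]
        # DerSimonian and Laird estimate of tau^2, equation (1).
        w = [1.0 / v for v in var]
        sw = sum(w)
        mu_f = sum(a * b for a, b in zip(w, y)) / sw
        q = sum(a * (b - mu_f) ** 2 for a, b in zip(w, y))
        tau2_hat = max(0.0, (q - (k - 1)) / (sw - sum(a * a for a in w) / sw))
        # Random effects weights and weighted mean, shared by both intervals.
        wh = [1.0 / (v + tau2_hat) for v in var]
        swh = sum(wh)
        mu_hat = sum(a * b for a, b in zip(wh, y)) / swh
        qw = sum(a * (b - mu_hat) ** 2 for a, b in zip(wh, y))
        half_dl = Z_95 * (1.0 / swh) ** 0.5                   # interval (2)
        half_t = t_crit * (qw / ((k - 1) * swh)) ** 0.5       # interval (4)
        hits_dl += abs(mu_hat - mu) <= half_dl
        hits_t += abs(mu_hat - mu) <= half_t
    return hits_dl / runs, hits_t / runs


if __name__ == "__main__":
    for k in sorted(T_CRIT_95):
        dl, td = simulate_coverage(k, tau2=0.05)
        print(f"k={k:2d}  DL coverage={dl:.3f}  t coverage={td:.3f}")
```

With only a few thousand runs the estimates are noisy, so this sketch is suited to checking the direction of the comparison rather than reproducing the plotted values.
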
These coverage results suggest that the simple confidence interval based on the t-distribution is an improvement compared to the existing simple confidence interval based on DerSimonian and Laird's method.

The coverage probabilities may be affected by the ratio of the heterogeneity between studies ($\tau^2$) to the within-study variances ($\sigma_i^2$). In the simulation study, the two most extreme values of $\tau^2$ are 0.03 and 0.1. Therefore the ratio $\tau^2/\sigma_i^2$ ranges between 0.05 and 3.33 when $\tau^2 = 0.03$, and between 0.167 and 11.11 when $\tau^2 = 0.1$. That is, the between-study variance may be as much as 11 times the smallest within-study variance, or as little as one-twentieth of the largest within-study variance.

We also investigated the expected lengths of the confidence intervals from the two methods for the four values of $\tau^2$ and the twelve values of k in the simulation. We found that the simulation estimate of the expected length of the interval for the t-distribution ranged between 0.158 and 1.003, with an average of 0.339. The estimated expected length of the interval using the DerSimonian and Laird method ranged between 0.155 and 0.733, with an average of 0.305.

4. DISCUSSION

In this paper we have discussed a simple way to construct a confidence interval for the overall mean effect in a random effects model for meta-analysis. The interval is based on the t-distribution and requires no iterative computation. Moreover, we found that the confidence interval has higher coverage probability than the commonly used interval based on DerSimonian and Laird's method, particularly when the number of studies is small or moderate. Furthermore, our interval given in (4) takes sample size into account by incorporating the meta-analysis sample size, k, in its computation, unlike the interval given in (2). Therefore, we recommend the interval based on the t-distribution as an alternative approximate interval for the overall mean effect in a random effects model for meta-analysis.

Figure 1. Coverage probabilities (vertical axis) against the number of studies k (horizontal axis) for the intervals from the t-distribution (TD) and DerSimonian and Laird (DL) methods: (a) $\tau^2 = 0.03$; (b) $\tau^2 = 0.05$; (c) $\tau^2 = 0.08$; (d) $\tau^2 = 0.1$.

A drawback of this interval is that it results in coverage probabilities that are lower than the nominal level. However, it has been previously shown that intervals using other methods also have coverage probability below the specified nominal level [1]. Although Brockwell and Gordon [1] found that the coverage probability of the interval using the profile likelihood method is quite close to its specified level, they noted that it lacks the simplicity of the other intervals. The low coverage probabilities for the interval based on the t-distribution may be attributed to the fact that estimated values of $\tau^2$ and $\sigma_i^2$ are used in place of the true values, and therefore the pivotal quantity given in (3) does not follow an exact t-distribution. Nevertheless, we think that the simple interval based on the t-distribution is useful as a quick way to construct an approximate confidence interval for the mean effect in a preliminary study.

We have illustrated our interval by using the log-odds ratio to measure effect size. Engels et al. [7] observed that different metrics for effect size assign different weights to the studies in a summary meta-analysis. We cannot be certain that our results would be the same for an absolute measure such as a risk difference, but we do note that Engels et al. [7] studied 125 meta-analyses and concluded that the choice of metric for treatment effect did not appear to be of crucial importance.

Finally, it should be noted that we have only considered an interval estimate for the mean effect in the context of a random effects model for meta-analysis. However, the simple interval described in this paper also can be applied in the fixed effects model. Since it is assumed that $\tau^2 = 0$ for the fixed effects model, the interval given in (4) may be applied to construct an approximate 100(1 − α) per cent interval for the overall mean effect by using $w_i = 1/\hat\sigma_i^2$ instead of $\hat w_i = 1/(\hat\sigma_i^2 + \hat\tau^2)$.

APPENDIX

To prove the independence of $Z_w$ and $Q_w$, we use matrix notation. We assume that $\mathbf{Y} \sim MN(\mathbf{U}, \mathbf{V})$, where $\mathbf{Y} = (y_1 \cdots y_k)^T$, $\mathbf{U} = \mu \mathbf{1}_{k \times 1}$, and $\mathbf{V} = \mathrm{diag}\{\hat w_i^{-1}\}_{k \times k}$, and we assume that $\hat w_i$ is the correct weight for each i. Using matrix notation, we have

$$\left(\sum_{i=1}^{k} \hat w_i\right)^{1/2} \hat\mu_{\hat w} = \mathbf{A}\mathbf{Y} \qquad \text{and} \qquad \sum_{i=1}^{k} \hat w_i (y_i - \hat\mu_{\hat w})^2 = \mathbf{Y}^T \mathbf{B} \mathbf{Y}$$

where $\mathbf{A} = \left(\sum_{i=1}^{k} \hat w_i\right)^{-1/2} \mathbf{1}^T \mathbf{V}^{-1}$ and $\mathbf{B} = \mathbf{V}^{-1} - \mathbf{V}^{-1} \mathbf{1} (\mathbf{1}^T \mathbf{V}^{-1} \mathbf{1})^{-1} \mathbf{1}^T \mathbf{V}^{-1}$. Note that $\mathbf{B}\mathbf{V}$ is an idempotent matrix. Now we note that

$$\begin{aligned}
\mathbf{A}\mathbf{V}\mathbf{B} &= \left(\sum_{i=1}^{k} \hat w_i\right)^{-1/2} \mathbf{1}^T \mathbf{V}^{-1} \mathbf{V} \left[\mathbf{V}^{-1} - \mathbf{V}^{-1} \mathbf{1} (\mathbf{1}^T \mathbf{V}^{-1} \mathbf{1})^{-1} \mathbf{1}^T \mathbf{V}^{-1}\right] \\
&= \left(\sum_{i=1}^{k} \hat w_i\right)^{-1/2} \left[\mathbf{1}^T \mathbf{V}^{-1} - \mathbf{1}^T \mathbf{V}^{-1} \mathbf{1} (\mathbf{1}^T \mathbf{V}^{-1} \mathbf{1})^{-1} \mathbf{1}^T \mathbf{V}^{-1}\right] \\
&= \left(\sum_{i=1}^{k} \hat w_i\right)^{-1/2} \left[\mathbf{1}^T \mathbf{V}^{-1} - \mathbf{1}^T \mathbf{V}^{-1}\right] \\
&= \mathbf{0}
\end{aligned}$$

Therefore the linear form $\left(\sum_{i=1}^{k} \hat w_i\right)^{1/2} \hat\mu_{\hat w}$ and the quadratic form $\sum_{i=1}^{k} \hat w_i (y_i - \hat\mu_{\hat w})^2$ are independent by Graybill's theorem 4.5.2 [8], and hence $Z_w$ and $Q_w$ are independent.

ACKNOWLEDGEMENTS

We are grateful to Dr Thomas Copenhaver and Dr Sharon-Lise Normand for their support of our research endeavours, and to the two referees whose comments helped improve the manuscript.

REFERENCES

1. Brockwell SE, Gordon IR. A comparison of statistical methods for meta-analysis. Statistics in Medicine 2001; 20:825–840.
2. DerSimonian R, Laird N. Meta-analysis in clinical trials. Controlled Clinical Trials 1986; 7:177–188.
3. Hardy RJ, Thompson SG. A likelihood approach to meta-analysis with random effects. Statistics in Medicine 1996; 15:619–629.
4. Hedges LV, Olkin I. Statistical Methods for Meta-Analysis. Academic Press: New York, 1985.
5. Olkin I. Meta-analysis: methods for combining independent studies. Editor's introduction. Statistical Science 1992; 7:226.
6. Normand S-LT. Tutorial in biostatistics, meta-analysis: formulating, evaluating, combining, and reporting. Statistics in Medicine 1999; 18:321–359.
7. Engels EA, Schmid HC, Terrin N, Olkin I, Lau J. Heterogeneity and statistical significance in meta-analysis: an empirical study of 125 meta-analyses. Statistics in Medicine 2000; 19:1707–1728.
8. Graybill FA. Theory and Application of the Linear Model. Duxbury Press: Massachusetts, 1976.