Upcoming SlideShare
×

# Intro models 043012

389 views

Published on

0 Likes
Statistics
Notes
• Full Name
Comment goes here.

Are you sure you want to Yes No
• Be the first to comment

• Be the first to like this

Views
Total views
389
On SlideShare
0
From Embeds
0
Number of Embeds
1
Actions
Shares
0
6
0
Likes
0
Embeds 0
No embeds

No notes for slide

### Intro models 043012

1. 1. PART 3010203 Fixed-Effect Versus Random-Effects0405 Models0607080910111213141516171819202122232425262728293031323334353637383940414243 9th February 2009 10:03 Wiley/ITMA Page 59 p03
2. 2. CHAPTER 10010203 Overview040506070809 Introduction10 Nomenclature1112131415 INTRODUCTION1617 Most meta-analyses are based on one of two statistical models, the fixed-effect18 model or the random-effects model.19 Under the fixed-effect model we assume that there is one true effect size (hence20 the term fixed effect) which underlies all the studies in the analysis, and that all21 differences in observed effects are due to sampling error. While we follow the22 practice of calling this a fixed-effect model, a more descriptive term would be a23 common-effect model. In either case, we use the singular (effect) since there is only24 one true effect.25 By contrast, under the random-effects model we allow that the true effect could26 vary from study to study. For example, the effect size might be higher (or lower) in27 studies where the participants are older, or more educated, or healthier than in others,28 or when a more intensive variant of an intervention is used, and so on. Because studies29 will differ in the mixes of participants and in the implementations of interventions,30 among other reasons, there may be different effect sizes underlying different studies.31 If it were possible to perform an infinite number of studies (based on the inclusion32 criteria for our analysis), the true effect sizes for these studies would be distributed33 about some mean. The effect sizes in the studies that actually were performed are34 assumed to represent a random sample of these effect sizes (hence the term random35 effects). Here, we use the plural (effects) since there is an array of true effects.36 In the chapters that follow we discuss the two models and show how to compute a37 summary effect using each one. Because the computations for a summary effect are38 not always intuitive, it helps to keep in mind that the summary effect is nothing39 more than the mean of the effect sizes, with more weight assigned to the more40 precise studies. We need to consider what we mean by the more precise studies and4142 Introduction to Meta-Analysis M. Borenstein, L. V. Hedges, J. P. T. Higgins, H. R. Rothstein43 Ó 2009, John Wiley & Sons, Ltd 8th February 2009 08:20 Wiley/ITMA Page 61 c10
3. 3. 62 Fixed-Effect Versus Random-Effects Models01 True Observed02 effect effect0304 Study0506 Combined0708 Figure 10.1 Symbols for true and observed effects.091011 how this translates into a study weight (this depends on the model), but not lose12 track of the fact that we are simply computing a weighted mean.1314 NOMENCLATURE1516 Throughout this Part we distinguish between a true effect size and an observed17 effect size. A study’s true effect size is the effect size in the underlying population,18 and is the effect size that we would observe if the study had an infinitely large19 sample size (and therefore no sampling error). A study’s observed effect size is the20 effect size that is actually observed.21 In the schematics we use different symbols to distinguish between true effects and22 observed effects. For individual studies we use a circle for the former and a square23 for the latter (see Figure 10.1). For summary effects we use a triangle for the former24 and a diamond for the latter.2526 Worked examples2728 In meta-analysis the same formulas apply regardless of the effect size being used.29 To allow the reader to work with an effect size of their choosing, we have separated30 the formulas (which are presented in the following chapters) from the worked31 examples (which are presented in Chapter 14). There, we provide a worked example32 for the standardized mean difference, one for the odds ratio, and one for33 correlations.34 The reader is encouraged to select one of the worked examples and follow the35 details of the computations while studying the formulas. The three datasets and all36 computations are available as Excel spreadsheets on the book’s web site.37383940414243 8th February 2009 08:20 Wiley/ITMA Page 62 c10
4. 4. CHAPTER 11010203 Fixed-Effect Model040506070809 Introduction10 The true effect size Impact of sampling error11 Performing a fixed-effect meta-analysis1213141516 INTRODUCTION17 In this chapter we introduce the fixed-effect model. We discuss the assumptions of18 this model, and show how these are reflected in the formulas used to compute a19 summary effect, and in the meaning of the summary effect.2021 THE TRUE EFFECT SIZE2223 Under the fixed-effect model we assume that all studies in the meta-analysis share a24 common (true) effect size. Put another way, all factors that could influence the25 effect size are the same in all the studies, and therefore the true effect size is the26 same (hence the label fixed) in all the studies. We denote the true (unknown) effect27 size by theta ()28 In Figure 11.1 the true overall effect size is 0.60 and this effect (represented by a29 triangle) is shown at the bottom. The true effect for each study is represented by a30 circle. Under the definition of a fixed-effect model the true effect size for each study31 must also be 0.60, and so these circles are aligned directly above the triangle.3233 IMPACT OF SAMPLING ERROR3435 Since all studies share the same true effect, it follows that the observed effect size36 varies from one study to the next only because of the random error inherent in each37 study. If each study had an infinite sample size the sampling error would be zero and38 the observed effect for each study would be the same as the true effect. If we were to39 plot the observed effects rather than the true effects, the observed effects would40 exactly coincide with the true effects.4142 Introduction to Meta-Analysis M. Borenstein, L. V. Hedges, J. P. T. Higgins, H. R. Rothstein43 Ó 2009, John Wiley Sons, Ltd 9th February 2009 08:19 Wiley/ITMA Page 63 c11
5. 5. 64 Fixed-Effect Versus Random-Effects Models01 Study 10203 Study 2040506 Study 30708 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 1.1 1.209 θ1011 Figure 11.1 Fixed-effect model – true effects.1213 ε1 Study 1141516 ε2 Study 2171819 ε320 Study 32122 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 1.1 1.2 θ232425 Figure 11.2 Fixed-effect model – true effects and sampling error.2627 In practice, of course, the sample size in each study is not infinite, and so there is28 sampling error and the effect observed in the study is not the same as the true effect.29 In Figure 11.2 the true effect for each study is still 0.60 (as depicted by the circles)30 but the observed effect (depicted by the squares) differs from one study to the next.31 In Study 1 the sampling error (e1) is À0.20, which yields an observed effect (Y1) of32 Y1 ¼ 0:60 À 0:20 ¼ 0:40:3334 In Study 2 the sampling error (e2) is 0.10, which yields an observed effect (Y2) of35 Y2 ¼ 0:60 þ 0:10 ¼ 0:70:3637 In Study 3 the sampling error (e3) is À0.10, which yields an observed effect (Y3) of38 Y3 ¼ 0:60 À 0:10 ¼ 0:50:3940 More generally, the observed effect Yi for any study is given by the population mean41 plus the sampling error in that study. That is,4243 Yi ¼ þ ei : ð11:1Þ 9th February 2009 08:19 Wiley/ITMA Page 64 c11
6. 6. Chapter 11: Fixed-Effect Model 650102 Study 103040506 Study 2070809 Study 3101112 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 1.1 1.213 θ1415 Figure 11.3 Fixed-effect model – distribution of sampling error.16171819 While the error in any given study is random, we can estimate the sampling20 distribution of the errors. In Figure 11.3 we have placed a normal curve about the21 true effect size for each study, with the width of the curve being based on the22 variance in that study. In Study 1 the sample size was small, the variance large,23 and the observed effect is likely to fall anywhere in the relatively wide range of24 0.20 to 1.00. By contrast, in Study 2 the sample size was relatively large, the25 variance is small, and the observed effect is likely to fall in the relatively narrow26 range of 0.40 to 0.80. (The width of the normal curve is based on the square root of27 the variance, or standard error).282930 PERFORMING A FIXED-EFFECT META-ANALYSIS3132 In an actual meta-analysis, of course, rather than starting with the population effect33 and making projections about the observed effects, we work backwards, starting34 with the observed effects and trying to estimate the population effect. In order to35 obtain the most precise estimate of the population effect (to minimize the variance)36 we compute a weighted mean, where the weight assigned to each study is the37 inverse of that study’s variance. Concretely, the weight assigned to each study in38 a fixed-effect meta-analysis is39 1 Wi ¼ ; ð11:2Þ40 VYi4142 where VYi is the within-study variance for study (i). The weighted mean (M) is then43 computed as 9th February 2009 08:19 Wiley/ITMA Page 65 c11
7. 7. 66 Fixed-Effect Versus Random-Effects Models01 X k02 Wi Yi i¼103 M¼ ; ð11:3Þ Xk04 Wi05 i¼10607 that is, the sum of the products WiYi (effect size multiplied by weight) divided by the08 sum of the weights.09 The variance of the summary effect is estimated as the reciprocal of the sum of10 the weights, or 111 VM ¼ k ð11:4Þ X12 Wi13 i¼114 and the estimated standard error of the summary effect is then the square root of the15 variance,16 pﬃﬃﬃﬃﬃﬃﬃ17 SEM ¼ VM : ð11:5Þ18 Then, 95% lower and upper limits for the summary effect are estimated as1920 LLM ¼ M À 1:96 Â SEM ð11:6Þ21 and22 ULM ¼ M þ 1:96 Â SEM : ð11:7Þ2324 Finally, a Z-value to test the null hypothesis that the common true effect is zero25 can be computed using M26 Z¼ : ð11:8Þ27 SEM28 For a one-tailed test the p-value is given by2930 p ¼ 1 À FðÆjZjÞ; ð11:9Þ31 where we choose ‘þ’ if the difference is in the expected direction and ‘–’ otherwise,32 and for a two-tailed test by33 h i34 p ¼ 2 1 À ðFðjZjÞÞ ; ð11:10Þ3536 where F(Z) is the standard normal cumulative distribution. This function is tabled37 in many introductory statistics books, and is implemented in Excel as the function38 ¼NORMSDIST(Z).3940 Illustrative example4142 We suggest that you turn to a worked example for the fixed-effect model before43 proceeding to the random-effects model. A worked example for the standardized 9th February 2009 08:19 Wiley/ITMA Page 66 c11
8. 8. Chapter 11: Fixed-Effect Model 6701 mean difference (Hedges’ g) is on page 87, a worked example for the odds ratio is on02 page 92, and a worked example for correlations is on page 97.030405 SUMMARY POINTS0607 Under the fixed-effect model all studies in the analysis share a common true08 effect.09 The summary effect is our estimate of this common effect size, and the null10 hypothesis is that this common effect is zero (for a difference) or one (for a11 ratio).12 All observed dispersion reflects sampling error, and study weights are13 assigned with the goal of minimizing this within-study error.141516171819202122232425262728293031323334353637383940414243 9th February 2009 08:19 Wiley/ITMA Page 67 c11
9. 9. 01020304050607080910111213141516171819202122232425262728293031323334353637383940414243 9th February 2009 08:19 Wiley/ITMA Page 68 c11
10. 10. CHAPTER 12010203 Random-Effects Model040506070809 Introduction10 The true effect sizes Impact of sampling error11 Performing a random-effects meta-analysis12131415161718 INTRODUCTION19 In this chapter we introduce the random-effects model. We discuss the assumptions20 of this model, and show how these are reflected in the formulas used to compute a21 summary effect, and in the meaning of the summary effect.222324 THE TRUE EFFECT SIZES2526 The fixed-effect model, discussed above, starts with the assumption that the true27 effect size is the same in all studies. However, in many systematic reviews this28 assumption is implausible. When we decide to incorporate a group of studies in a29 meta-analysis, we assume that the studies have enough in common that it makes30 sense to synthesize the information, but there is generally no reason to assume that31 they are identical in the sense that the true effect size is exactly the same in all the32 studies.33 For example, suppose that we are working with studies that compare the propor-34 tion of patients developing a disease in two groups (vaccinated versus placebo). If35 the treatment works we would expect the effect size (say, the risk ratio) to be similar36 but not identical across studies. The effect size might be higher (or lower) when the37 participants are older, or more educated, or healthier than others, or when a more38 intensive variant of an intervention is used, and so on. Because studies will differ in39 the mixes of participants and in the implementations of interventions, among other40 reasons, there may be different effect sizes underlying different studies.4142 Introduction to Meta-Analysis M. Borenstein, L. V. Hedges, J. P. T. Higgins, H. R. Rothstein43 Ó 2009, John Wiley Sons, Ltd 9th February 2009 08:21 Wiley/ITMA Page 69 c12
11. 11. 70 Fixed-Effect Versus Random-Effects Models010203 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 1.1 1.204 μ05 Figure 12.1 Random-effects model – distribution of true effects.060708 Or, suppose that we are working with studies that assess the impact of an09 educational intervention. The magnitude of the impact might vary depending on10 the other resources available to the children, the class size, the age, and other11 factors, which are likely to vary from study to study.12 We might not have assessed these covariates in each study. Indeed, we might13 not even know what covariates actually are related to the size of the effect.14 Nevertheless, logic dictates that such factors do exist and will lead to variations in15 the magnitude of the effect.16 One way to address this variation across studies is to perform a random-effects17 meta-analysis. In a random-effects meta-analysis we usually assume that the true18 effects are normally distributed. For example, in Figure 12.1 the mean of all true19 effect sizes is 0.60 but the individual effect sizes are distributed about this mean, as20 indicated by the normal curve. The width of the curve suggests that most of the true21 effects fall in the range of 0.50 to 0.70.222324 IMPACT OF SAMPLING ERROR25 Suppose that our meta-analysis includes three studies drawn from the distribution of26 studies depicted by the normal curve, and that the true effects (denoted 1, 2, and27 3) in these studies happen to be 0.50, 0.55 and 0.65 (see Figure 12.2).28 If each study had an infinite sample size the sampling error would be zero and the29 observed effect for each study would be the same as the true effect for that study.3031 Study 13233 Study 23435 Study 33637383940 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 1.1 1.241 μ42 Figure 12.2 Random-effects model – true effects.43 9th February 2009 08:21 Wiley/ITMA Page 70 c12
12. 12. Chapter 12: Random-Effects Model 7101 Y3 θ302 Study 3 ε3030405 ζ306070809 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 1.1 1.210 μ1112 Figure 12.3 Random-effects model – true and observed effect in one study.1314 If we were to plot the observed effects rather than the true effects, the observed15 effects would exactly coincide with the true effects.16 Of course, the sample size in any study is not infinite and therefore the sampling error17 is not zero. If the true effect size for a study is i, then the observed effect for that study18 will be less than or greater than i because of sampling error. For example, consider19 Study 3 in Figure 12.2. This study is the subject of Figure 12.3, where we consider the20 factors that control the observed effect. The true effect for Study 3 is 0.50 but the21 sampling error for this study is –0.10, and the observed effect for this study is 0.40.22 This figure also highlights the fact that the distance between the overall mean and23 the observed effect in any given study consists of two distinct parts: true variation in24 effect sizes ( i) and sampling error (ei). In Study 3 the total distance from to Y3 is25 À0.20. The distance from to 3 (0.60 to 0.50) reflects the fact that the true effect26 size actually varies from one study to the next, while the distance from 3 to Y327 (0.5 to 0.4) is sampling error.28 More generally, the observed effect Yi for any study is given by the grand mean,29 the deviation of the study’s true effect from the grand mean, and the deviation of the30 study’s observed effect from the study’s true effect. That is,31 Yi ¼ þ i þ ei : ð12:1Þ3233 Therefore, to predict how far the observed effect Yi is likely to fall from in any34 given study we need to consider both the variance of i and the variance of i.35 The distance from (the triangle) to each i (the circles) depends on the standard36 deviation of the distribution of the true effects across studies, called (tau) (or 2 for37 its variance). The same value of 2 applies to all studies in the meta-analysis, and in38 Figure 12.4 is represented by the normal curve at the bottom, which extends roughly39 from 0.50 to 0.70.40 The distance from i to Yi depends on the sampling distribution of the sample41 effects about i. This depends on the variance of the observed effect size from42 each study,V i , and so will vary from one study to the next. In Figure 12.4 the Y43 curve for Study 1 is relatively wide while the curve for Study 2 is relatively narrow. 9th February 2009 08:21 Wiley/ITMA Page 71 c12
13. 13. 72 Fixed-Effect Versus Random-Effects Models0102 Study 103040506 Study 207080910 Study 311121314 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 1.1 1.215 μ16 Figure 12.4 Random-effects model – between-study and within-study variance.171819 PERFORMING A RANDOM-EFFECTS META-ANALYSIS20 In an actual meta-analysis, of course, rather than start with the population effect and21 make projections about the observed effects, we start with the observed effects and22 try to estimate the population effect. In other words our goal is to use the collection23 of Yi to estimate the overall mean, . In order to obtain the most precise estimate of24 the overall mean (to minimize the variance) we compute a weighted mean, where25 the weight assigned to each study is the inverse of that study’s variance.26 To compute a study’s variance under the random-effects model, we need to know27 both the within-study variance and 2, since the study’s total variance is the sum of28 these two values. Formulas for computing the within-study variance were presented29 in Part 3. A method for estimating the between-studies variance is given here so that30 we can proceed with the worked example, but a full discussion of this method is31 deferred to Part 4, where we shall pursue the issue of heterogeneity in some detail.323334 Estimating tau-squared35 The parameter 2 (tau-squared) is the between-studies variance (the variance of the36 effect size parameters across the population of studies). In other words, if we37 somehow knew the true effect size for each study, and computed the variance of38 these effects sizes (across an infinite number of studies), this variance would be 2.39 One method for estimating 2 is the method of moments (or the DerSimonian and40 Laird) method, as follows. We compute41 Q À df42 T2 ¼ ; ð12:2Þ C43 9th February 2009 08:21 Wiley/ITMA Page 72 c12
14. 14. Chapter 12: Random-Effects Model 7301 where02 !203 X k Wi Yi04 X k i¼1 Q¼ Wi Yi2 À ; ð12:3Þ05 i¼1 Xk06 Wi i¼10708 df ¼ k À 1; ð12:4Þ0910 where k is the number of studies, and11 X X Wi212 C¼ Wi À X : ð12:5Þ13 Wi141516 Estimating the mean effect size1718 In the fixed-effect analysis each study was weighted by the inverse of its variance.19 In the random-effects analysis, too, each study will be weighted by the inverse of its20 variance. The difference is that the variance now includes the original (within-21 studies) variance plus the estimate of the between-studies variance, T 2. In keeping22 with the book’s convention, we use 2 to refer to the parameter and T 2 to refer to the23 sample estimate of that parameter.24 To highlight the parallel between the formulas here (random effects) and those in25 the previous chapter (fixed effect) we use the same notations but add an asterisk (*)26 to represent the random-effects version. Under the random-effects model the weight27 assigned to each study is28 1 WiÃ ¼ Ã ð12:6Þ29 VYi3031 where V Ãi is the within-study variance for study i plus the between-studies variance, Y32 T2. That is, Ã33 V Yi ¼ VYi þ T 2 :3435 The weighted mean, M*, is then computed as36 X k37 WiÃ Yi i¼138 MÃ ¼ ð12:7Þ39 Xk WiÃ40 i¼14142 that is, the sum of the products (effect size multiplied by weight) divided by the sum43 of the weights. 9th February 2009 08:21 Wiley/ITMA Page 73 c12
15. 15. 74 Fixed-Effect Versus Random-Effects Models01 The variance of the summary effect is estimated as the reciprocal of the sum of02 the weights, or03 1 VM Ã ¼ k ð12:8Þ04 X Ã05 Wi06 i¼107 and the estimated standard error of the summary effect is then the square root of the08 variance,09 pﬃﬃﬃﬃﬃﬃﬃﬃﬃ SEMÃ ¼ VM Ã : ð12:9Þ1011 The 95% lower and upper limits for the summary effect would be computed as12 LLMÃ ¼ M Ã À 1:96 Â SEM Ã ; ð12:10Þ1314 and15 ULMÃ ¼ M Ã þ 1:96 Â SEM Ã : ð12:11Þ1617 Finally, a Z-value to test the null hypothesis that the mean effect is zero could be18 computed using19 MÃ20 ZÃ ¼ : ð12:12Þ SEM Ã2122 For a one-tailed test the p-value is given by23 pÃ ¼ 1 À FðÆjZ Ã jÞ; ð12:13Þ2425 where we choose ‘þ’ if the difference is in the expected direction or ‘–’ otherwise,26 and for a two-tailed test by27 pÃ ¼ 2½1 À ðFðjZ Ã jÞÞ; ð12:14Þ28 where F(Z*) is the standard normal cumulative distribution. This function is tabled29 in many introductory statistics books, and is implemented in Excel as the function30 ¼NORMSDIST(Z*).3132 Illustrative example33 As before, we suggest that you turn to one of the worked examples in the next34 chapter before proceeding with this discussion.353637 SUMMARY POINTS3839 Under the random-effects model, the true effects in the studies are assumed to40 have been sampled from a distribution of true effects.41 The summary effect is our estimate of the mean of all relevant true effects, and42 the null hypothesis is that the mean of these effects is 0.0 (equivalent to a ratio43 of 1.0 for ratio measures). 9th February 2009 08:21 Wiley/ITMA Page 74 c12
16. 16. Chapter 12: Random-Effects Model 7501 Since our goal is to estimate the mean of the distribution, we need to take02 account of two sources of variance. First, there is within-study error in03 estimating the effect in each study. Second (even if we knew the true mean04 for each of our studies), there is variation in the true effects across studies.05 Study weights are assigned with the goal of minimizing both sources of06 variance.07080910111213141516171819202122232425262728293031323334353637383940414243 9th February 2009 08:21 Wiley/ITMA Page 75 c12
17. 17. 01020304050607080910111213141516171819202122232425262728293031323334353637383940414243 9th February 2009 08:21 Wiley/ITMA Page 76 c12
18. 18. CHAPTER 13010203 Fixed-Effect Versus Random-Effects0405 Models0607080910 Introduction11 Definition of a summary effect12 Estimating the summary effect13 Extreme effect size in a large study or a small study14 Confidence interval15 The null hypothesis16 Which model should we use?17 Model should not be based on the test for heterogeneity18 Concluding remarks192021222324 INTRODUCTION25 In Chapter 11 and Chapter 12 we introduced the fixed-effect and random-26 effects models. Here, we highlight the conceptual and practical differences27 between them.28 Consider the forest plots in Figures 13.1 and 13.2. They include the same six29 studies, but the first uses a fixed-effect analysis and the second a random-effects30 analysis. These plots provide a context for the discussion that follows.313233 DEFINITION OF A SUMMARY EFFECT34 Both plots show a summary effect on the bottom line, but the meaning of this35 summary effect is different in the two models. In the fixed-effect analysis we36 assume that the true effect size is the same in all studies, and the summary37 effect is our estimate of this common effect size. In the random-effects analysis38 we assume that the true effect size varies from one study to the next, and that39 the studies in our analysis represent a random sample of effect sizes that could404142 Introduction to Meta-Analysis M. Borenstein, L. V. Hedges, J. P. T. Higgins, H. R. Rothstein43 Ó 2009, John Wiley Sons, Ltd 9th February 2009 07:07 Wiley/ITMA Page 77 c13
19. 19. 78 Fixed-Effect Versus Random-Effects Models01 Impact of Intervention (Fixed effect)0203 Std Diff Relative Standardized mean difference (g) (g) Weight and 95% confidence interval0405 Carroll 0.10 12%06 Grant 0.28 13%07 Peck 0.37 8%08 Donat 0.66 39%09 Stewart 0.46 10%1011 Young 0.19 18%12 Summary 0.41 100%1314 –1.0 –0.5 0.0 0.5 1.01516 Figure 13.1 Fixed-effect model – forest plot showing relative weights.1718 Impact of Intervention (Random effects)1920 Std Diff Relative Standardized mean difference (g) with (g) Weight 95% confidence and prediction intervals2122 Carroll 0.10 16%23 Grant 0.28 16%24 Peck 0.37 13%2526 Donat 0.66 23%27 Stewart 0.46 14%28 Young 0.19 18%29 Summary 0.36 100%3031 –1.0 –0.5 0.0 0.5 1.03233 Figure 13.2 Random-effects model – forest plot showing relative weights.3435 have been observed. The summary effect is our estimate of the mean of these36 effects.373839 ESTIMATING THE SUMMARY EFFECT40 Under the fixed-effect model we assume that the true effect size for all studies41 is identical, and the only reason the effect size varies between studies is42 sampling error (error in estimating the effect size). Therefore, when assigning43 9th February 2009 07:07 Wiley/ITMA Page 78 c13
20. 20. Chapter 13: Fixed-Effect Versus Random-Effects Models 7901 weights to the different studies we can largely ignore the information in the02 smaller studies since we have better information about the same effect size in03 the larger studies.04 By contrast, under the random-effects model the goal is not to estimate one true05 effect, but to estimate the mean of a distribution of effects. Since each study06 provides information about a different effect size, we want to be sure that all these07 effect sizes are represented in the summary estimate. This means that we cannot08 discount a small study by giving it a very small weight (the way we would in09 a fixed-effect analysis). The estimate provided by that study may be imprecise, but10 it is information about an effect that no other study has estimated. By the same11 logic we cannot give too much weight to a very large study (the way we might in12 a fixed-effect analysis). Our goal is to estimate the mean effect in a range of13 studies, and we do not want that overall estimate to be overly influenced by any14 one of them.15 In these graphs, the weight assigned to each study is reflected in the size of the16 box (specifically, the area) for that study. Under the fixed-effect model there is a17 wide range of weights (as reflected in the size of the boxes) whereas under the18 random-effects model the weights fall in a relatively narrow range. For example,19 compare the weight assigned to the largest study (Donat) with that assigned to the20 smallest study (Peck) under the two models. Under the fixed-effect model Donat is21 given about five times as much weight as Peck. Under the random-effects model22 Donat is given only 1.8 times as much weight as Peck.232425 EXTREME EFFECT SIZE IN A LARGE STUDY OR A SMALL STUDY26 How will the selection of a model influence the overall effect size? In this example27 Donat is the largest study, and also happens to have the highest effect size. Under28 the fixed-effect model Donat was assigned a large share (39%) of the total weight29 and pulled the mean effect up to 0.41. By contrast, under the random-effects model30 Donat was assigned a relatively modest share of the weight (23%). It therefore had31 less pull on the mean, which was computed as 0.36.32 Similarly, Carroll is one of the smaller studies and happens to have the smallest33 effect size. Under the fixed-effect model Carroll was assigned a relatively small34 proportion of the total weight (12%), and had little influence on the summary effect.35 By contrast, under the random-effects model Carroll carried a somewhat higher36 proportion of the total weight (16%) and was able to pull the weighted mean toward37 the left.38 The operating premise, as illustrated in these examples, is that whenever 2 is39 nonzero, the relative weights assigned under random effects will be more balanced40 than those assigned under fixed effects. As we move from fixed effect to random41 effects, extreme studies will lose influence if they are large, and will gain influence42 if they are small.43 9th February 2009 07:07 Wiley/ITMA Page 79 c13
21. 21. 80 Fixed-Effect Versus Random-Effects Models01 CONFIDENCE INTERVAL02 Under the fixed-effect model the only source of uncertainty is the within-study03 (sampling or estimation) error. Under the random-effects model there is this same04 source of uncertainty plus an additional source (between-studies variance).05 It follows that the variance, standard error, and confidence interval for the summary06 effect will always be larger (or wider) under the random-effects model than under07 the fixed-effect model (unless T 2 is zero, in which case the two models are the08 same). In this example, the standard error is 0.064 for the fixed-effect model, and09 0.105 for the random-effects model.10111213 Fixed-effect model1415 Effect size16 and 95% confidence interval17 Study A1819 Study B20 Study C21 Study D2223 Summary2425 –1.0 –0.5 0.0 0.5 1.02627 Figure 13.3 Very large studies under fixed-effect model.2829 Random-effects model3031 Effect size and 95% confidence interval3233 Study A34 Study B3536 Study C37 Study D3839 Summary40 –1.0 –0.5 0.0 0.5 1.04142 Figure 13.4 Very large studies under random-effects model.43 9th February 2009 07:07 Wiley/ITMA Page 80 c13
22. 22. Chapter 13: Fixed-Effect Versus Random-Effects Models 8101 Consider what would happen if we had five studies, and each study had an02 infinitely large sample size. Under either model the confidence interval for the03 effect size in each study would have a width approaching zero, since we know04 the effect size in that study with perfect precision. Under the fixed-effect05 model the summary effect would also have a confidence interval with a width06 of zero, since we know the common effect precisely (Figure 13.3). By con-07 trast, under the random-effects model the width of the confidence interval08 would not approach zero (Figure 13.4). While we know the effect in each09 study precisely, these effects have been sampled from a universe of possible10 effect sizes, and provide only an estimate of the mean effect. Just as the error11 within a study will approach zero only as the sample size approaches infinity,12 so too the error of these studies as an estimate of the mean effect will13 approach zero only as the number of studies approaches infinity.14 More generally, it is instructive to consider what factors influence the standard15 error of the summary effect under the two models. The following formulas are16 based on a meta-analysis of means from k one-group studies, but the conceptual17 argument applies to all meta-analyses. The within-study variance of each mean18 depends on the standard deviation (denoted ) of participants’ scores and the19 sample size of each study (n). For simplicity we assume that all of the studies20 have the same sample size and the same standard deviation (see Box 13.1 for21 details).22 Under the fixed-effect model the standard error of the summary effect is given by23 rﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃ 224 SEM ¼ : ð13:1Þ25 kÂn26 It follows that with a large enough sample size the standard error will approach zero,27 and this is true whether the sample size is concentrated on one or two studies, or28 dispersed across any number of studies.29 Under the random-effects model the standard error of the summary effect is30 given by31 rﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃ32 2 2 SEM ¼ þ : ð13:2Þ33 kÂn k3435 The first term is identical to that for the fixed-effect model and, again, with a36 large enough sample size, this term will approach zero. By contrast, the second37 term (which reflects the between-studies variance) will only approach zero as the38 number of studies approaches infinity. These formulas do not apply exactly in39 practice, but the conceptual argument does. Namely, increasing the sample size40 within studies is not sufficient to reduce the standard error beyond a certain point41 (where that point is determined by 2 and k). If there is only a small number of42 studies, then the standard error could still be substantial even if the total n is in the43 tens of thousands or higher. 9th February 2009 07:07 Wiley/ITMA Page 81 c13
23. 23. 82 Fixed-Effect Versus Random-Effects Models010203 BOX 13.1 FACTORS THAT INFLUENCE THE STANDARD ERROR OF THE04 SUMMARY EFFECT.0506 To illustrate the concepts with some simple formulas, let us consider a meta-07 analysis of studies with the very simplest design, such that each study08 comprises a single sample of n observations with standard deviation . We09 combine estimates of the mean in a meta-analysis. The variance of each10 estimate is11 212 VYi ¼ n1314 so the (inverse-variance) weight in a fixed-effect meta-analysis is1516 1 n Wi ¼ ¼17 2 =n 21819 and the variance of the summary effect under the fixed-effect model the standard20 error is given by2122 1 1 2 VM ¼ ¼ ¼ :23 X k k Â n=2 kÂn24 Wi i¼12526 Therefore under the fixed-effect model the (true) standard error of the summary27 mean is given by2829 rﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃ 230 SEM ¼ :31 kÂn3233 Under the random-effects model the weight awarded to each study is34 135 W Ãi ¼ À 2 Á36 n þ 23738 and the (true) standard error of the summary mean turns out to be39 rﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃ40 2 2 SEM Ã ¼ þ :41 kÂn k4243 9th February 2009 07:07 Wiley/ITMA Page 82 c13
24. 24. Chapter 13: Fixed-Effect Versus Random-Effects Models 8301 THE NULL HYPOTHESIS02 Often, after computing a summary effect, researchers perform a test of the03 null hypothesis. Under the fixed-effect model the null hypothesis being tested04 is that there is zero effect in every study. Under the random-effects model the05 null hypothesis being tested is that the mean effect is zero. Although some may06 treat these hypotheses as interchangeable, they are in fact different, and it is07 imperative to choose the test that is appropriate to the inference a researcher wishes08 to make.091011 WHICH MODEL SHOULD WE USE?12 The selection of a computational model should be based on our expectation about13 whether or not the studies share a common effect size and on our goals in perform-14 ing the analysis.151617 Fixed effect1819 It makes sense to use the fixed-effect model if two conditions are met. First, we20 believe that all the studies included in the analysis are functionally identical.21 Second, our goal is to compute the common effect size for the identified population,22 and not to generalize to other populations.23 For example, suppose that a pharmaceutical company will use a thousand patients to24 compare a drug versus placebo. Because the staff can work with only 100 patients at a25 time, the company will run a series of ten trials with 100 patients in each. The studies26 are identical in the sense that any variables which can have an impact on the outcome27 are the same across the ten studies. Specifically, the studies draw patients from a28 common pool, using the same researchers, dose, measure, and so on (we assume that29 there is no concern about practice effects for the researchers, nor for the different30 starting times of the various cohorts). All the studies are expected to share a common31 effect and so the first condition is met. The goal of the analysis is to see if the drug32 works in the population from which the patients were drawn (and not to extrapolate to33 other populations), and so the second condition is met, as well.34 In this example the fixed-effect model is a plausible fit for the data and meets the35 goal of the researchers. It should be clear, however, that this situation is relatively36 rare. The vast majority of cases will more closely resemble those discussed imme-37 diately below.38 Random effects3940 By contrast, when the researcher is accumulating data from a series of studies that41 had been performed by researchers operating independently, it would be unlikely42 that all the studies were functionally equivalent. Typically, the subjects or inter-43 ventions in these studies would have differed in ways that would have impacted on 9th February 2009 07:07 Wiley/ITMA Page 83 c13
25. 25. 84 Fixed-Effect Versus Random-Effects Models01 the results, and therefore we should not assume a common effect size. Therefore, in02 these cases the random-effects model is more easily justified than the fixed-effect03 model.04 Additionally, the goal of this analysis is usually to generalize to a range of05 scenarios. Therefore, if one did make the argument that all the studies used an06 identical, narrowly defined population, then it would not be possible to extrapolate07 from this population to others, and the utility of the analysis would be severely limited.0809 A caveat10 There is one caveat to the above. If the number of studies is very small, then the11 estimate of the between-studies variance ( 2) will have poor precision. While the12 random-effects model is still the appropriate model, we lack the information needed13 to apply it correctly. In this case the reviewer may choose among several options,14 each of them problematic.15 One option is to report the separate effects and not report a summary effect.16 The hope is that the reader will understand that we cannot draw conclusions17 about the effect size and its confidence interval. The problem is that some readers18 will revert to vote counting (see Chapter 28) and possibly reach an erroneous19 conclusion.20 Another option is to perform a fixed-effect analysis. This approach would yield a21 descriptive analysis of the included studies, but would not allow us to make22 inferences about a wider population. The problem with this approach is that (a)23 we do want to make inferences about a wider population and (b) readers will make24 these inferences even if they are not warranted.25 A third option is to take a Bayesian approach, where the estimate of 2 is based on26 data from outside of the current set of studies. This is probably the best option, but the27 problem is that relatively few researchers have expertise in Bayesian meta-analysis.28 Additionally, some researchers have a philosophical objection to this approach.29 For a more general discussion of this issue see When does it make sense to30 perform a meta-analysis in Chapter 40.3132 MODEL SHOULD NOT BE BASED ON THE TEST FOR HETEROGENEITY3334 In the next chapter we will introduce a test of the null hypothesis that the between-35 studies variance is zero. This test is based on the amount of between-studies36 variance observed, relative to the amount we would expect if the studies actually37 shared a common effect size.38 Some have adopted the practice of starting with a fixed-effect model and then39 switching to a random-effects model if the test of homogeneity is statistically40 significant. This practice should be strongly discouraged because the decision to41 use the random-effects model should be based on our understanding of whether or42 not all studies share a common effect size, and not on the outcome of a statistical test43 (especially since the test for heterogeneity often suffers from low power). 9th February 2009 07:07 Wiley/ITMA Page 84 c13
26. 26. Chapter 13: Fixed-Effect Versus Random-Effects Models 8501 If the study effect sizes are seen as having been sampled from a distribution of02 effect sizes, then the random-effects model, which reflects this idea, is the logical one03 to use. If the between-studies variance is substantial (and statistically significant) then04 the fixed-effect model is inappropriate. However, even if the between-studies var-05 iance does not meet the criterion for statistical significance (which may be due simply06 to low power) we should still take account of this variance when assigning weights. If07 T 2 turns out to be zero, then the random-effects analysis reduces to the fixed-effect08 analysis, and so there is no cost to using this model.09 On the other hand, if one has elected to use the fixed-effect model a priori but the10 test of homogeneity is statistically significant, then it would be important to revisit11 the assumptions that led to the selection of a fixed-effect model.1213 CONCLUDING REMARKS1415 Our discussion of differences between the fixed-model and the random-effects16 model focused largely on the computation of a summary effect and the confidence17 intervals for the summary effect. We did not address the implications of the18 dispersion itself. Under the fixed-effect model we assume that all dispersion in19 observed effects is due to sampling error, but under the random-effects model we20 allow that some of that dispersion reflects real differences in effect size across21 studies. In the chapters that follow we discuss methods to quantify that dispersion22 and to consider its substantive implications.23 Although throughout this book we define a fixed-effect meta-analysis as assum-24 ing that every study has a common true effect size, some have argued that the fixed-25 effect method is valid without making this assumption. The point estimate of the26 effect in a fixed-effect meta-analysis is simply a weighted average and does not27 strictly require the assumption that all studies estimate the same thing. For simpli-28 city and clarity we adopt a definition of a fixed-effect meta-analysis that does29 assume homogeneity of effect.303132 SUMMARY POINTS3334 A fixed-effect meta-analysis estimates a single effect that is assumed to be35 common to every study, while a random-effects meta-analysis estimates the36 mean of a distribution of effects.37 Study weights are more balanced under the random-effects model than under the38 fixed-effect model. Large studies are assigned less relative weight and small39 studies are assigned more relative weight as compared with the fixed-effect40 model.41 The standard error of the summary effect and (it follows) the confidence42 intervals for the summary effect are wider under the random-effects model43 than under the fixed-effect model. 9th February 2009 07:07 Wiley/ITMA Page 85 c13
27. 27. 86 Fixed-Effect Versus Random-Effects Models0102 The selection of a model must be based solely on the question of which model03 fits the distribution of effect sizes, and takes account of the relevant source(s)04 of error. When studies are gathered from the published literature, the random-05 effects model is generally a more plausible match.06 The strategy of starting with a fixed-effect model and then moving to a07 random-effects model if the test for heterogeneity is significant is a mistake,08 and should be strongly discouraged.0910111213141516171819202122232425262728293031323334353637383940414243 9th February 2009 07:07 Wiley/ITMA Page 86 c13