By
Dr Utpal Sharma
PG Student, Department of Community Medicine
Gauhati Medical College
ANOVA AND META-ANALYSIS
AN OVERVIEW
Introduction
 Any data set has variability
 Variability exists within groups…
 and between groups
 The question ANOVA lets us answer: is this variability
significant, or merely due to chance?
Sir R A Fisher
Definitions
 ANOVA: analysis of variation in an experimental outcome
and especially of a statistical variance in order to determine
the contributions of given factors or variables to the variance.
 Compares the means of groups of independent observations
 ANOVA does not compare variances. We use variance-like
quantities to study the equality or non-equality of population
means.
 Can compare more than two groups
 Variance: the square of the standard deviation
Rationale of ANOVA
 The ANOVA technique extends what an independent-samples
t test can do to multiple means.
 If more than two means are compared, repeated t tests will
lead to a higher Type I error rate.
 A better approach is to consider all means in one null
hypothesis—that is, examining the plausibility of the null
hypothesis with a single statistical test.
 Apart from saving time and energy, researchers can exercise
a better control of the probability of falsely declaring
significant differences among means.
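A minimal sketch of why repeated t tests inflate the Type I error rate: it computes the chance of at least one false positive when every pair of k group means is tested at α = 0.05. It assumes the pairwise tests are independent, which is a simplification (tests that share a group are correlated), and the function name is illustrative only.

```python
from math import comb


def familywise_error_rate(n_groups: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive across all pairwise t tests.

    Simplifying assumption: the pairwise tests are independent.
    """
    n_tests = comb(n_groups, 2)  # number of distinct pairs of groups
    return 1 - (1 - alpha) ** n_tests


for k in (3, 4, 5):
    print(f"{k} groups, {comb(k, 2)} tests: {familywise_error_rate(k):.3f}")
```

Under this assumption the rate is already about 0.14 for three groups and about 0.40 for five, well above the nominal 0.05 of each individual test.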
Cont…
 ANOVA or F test is associated with three
assumptions
 Normal distribution
 Variances of dependent variable are equal in all populations
 Random samples; observations independently selected from
their respective populations.
$\sigma_1^2 = \sigma_2^2 = \sigma_3^2 = \sigma_4^2 = \dots$
One-Way ANOVA
 The one-way analysis of variance is used to test the claim that
three or more population means are equal
 The response/dependent variable is the variable we're
comparing
 The factor/independent variable is the variable being used to
define the groups
 It is called "one-way" because each value is classified in
exactly one way
 Examples include comparisons by gender, race, political party,
color, etc.
Cont…
For a sample containing K independent groups
 ANOVA tests the null hypothesis, which says that the means are all
equal
H0: μ1 = μ2 = … = μK (that is, "the group means are all equal")
 The alternative hypothesis is that at least one of the means is
different
H1: μi ≠ μj for some i, j (that is, "the group means are not all equal")
ANOVA doesn't test whether one mean is less than another,
only whether they're all equal or at least one is different.
Cont….
 Variation
 Variation is the sum of the squares (SS) of the deviations
between a value and the mean of the value
 SST: The total variability of the dependent variable.
 SSB: The variability between each group relative to the grand mean
 SSW: The variability within each group relative to the group mean.
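$SST = \sum (X - \bar{X})^2$ ; $SST = SSB + SSW$
$SSB = \sum N_G (\bar{X}_G - \bar{X})^2$
$SSW = \sum (X_1 - \bar{X}_1)^2 + \sum (X_2 - \bar{X}_2)^2 + \dots + \sum (X_k - \bar{X}_k)^2$
Here $\bar{X}$ is the grand mean, $\bar{X}_G$ is the mean of group G, and $N_G$ is the size of group G.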
Sum of Squares is abbreviated by SS and often followed by a
variable in parentheses such as SS(B) or SS(W) so we know
which sum of squares we’re talking about
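The three sums of squares can be sketched in a few lines. The function below is an illustrative helper (its name and the toy data are assumptions, not part of the slides); it computes SST, SSB and SSW directly from the definitions and lets you check that SST = SSB + SSW.

```python
from statistics import mean


def sums_of_squares(groups):
    """Return (SST, SSB, SSW) for a list of groups of observations."""
    all_values = [x for group in groups for x in group]
    grand_mean = mean(all_values)
    # Total: every value against the grand mean
    sst = sum((x - grand_mean) ** 2 for x in all_values)
    # Between: each group mean against the grand mean, weighted by group size
    ssb = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within: every value against its own group mean
    ssw = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return sst, ssb, ssw


# Hypothetical toy data, chosen only to keep the arithmetic easy to follow
toy_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
sst, ssb, ssw = sums_of_squares(toy_groups)
print(f"SST={sst:.2f}  SSB={ssb:.2f}  SSW={ssw:.2f}")
```

For the toy data the within-group part is 6 and the between-group part is 26, which add up to the total of 32.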
Cont….
 Degrees of Freedom, df
 A degree of freedom occurs for each value that can vary before
the rest of the values are predetermined
 The df is often one less than the number of values (N-1)
 Variances (Mean of the Squares)
 The variances are abbreviated MS, often with an accompanying
variable, as in MS(B) or MS(W)
 They are an average squared deviation from the mean and are
found by dividing the variation by the degrees of freedom
Variance (MS) = variation (SS) / df
Cont..
 F test statistic
 An F test statistic is the ratio of two sample variances
 MS(B) and MS(W) are two sample variances, and
their ratio gives F.
 The F test statistic has an F distribution with df(B)
numerator df and df(W) denominator df
 The p-value is the area to the right of the test statistic
F = MS(B) / MS(W)
ANOVA -Example
 The statistics classroom is divided into three rows: front,
middle, and back
 We want to see if the students further away did worse on the
exams
 A random sample of the students in each row was taken
 The score for those students on the second exam was
recorded
 Front: 82, 83, 97, 93, 55, 67, 53
 Middle: 83, 78, 68, 61, 77, 54, 69, 51, 63
 Back: 38, 59, 55, 66, 45, 52, 52, 61
Cont….
 The summary statistics for the grades of each row are
shown in the table below

Row          Front    Middle   Back
Sample size  7        9        8
Mean         75.71    67.11    53.50
Variance     310.90   119.86   80.29
St. Dev      17.63    10.95    8.96

 Now, here is the basic one-way ANOVA table (still to be filled in)

Source    SS    df    MS    F    p
Between
Within
Total
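The summary statistics in the table can be reproduced from the raw exam scores listed for the three rows. This is a small sketch using Python's standard `statistics` module; the variable names are illustrative.

```python
from statistics import mean, stdev, variance

# Exam scores as listed for each row of the classroom
scores = {
    "Front": [82, 83, 97, 93, 55, 67, 53],
    "Middle": [83, 78, 68, 61, 77, 54, 69, 51, 63],
    "Back": [38, 59, 55, 66, 45, 52, 52, 61],
}

# variance() and stdev() use the sample (n - 1) denominator
summary = {
    row: {"n": len(v), "mean": mean(v), "variance": variance(v), "sd": stdev(v)}
    for row, v in scores.items()
}

for row, s in summary.items():
    print(f"{row:<7} n={s['n']}  mean={s['mean']:.2f}  "
          f"variance={s['variance']:.2f}  sd={s['sd']:.2f}")
```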
Cont…
 Grand Mean
$\bar{x} = \dfrac{n_1\bar{x}_1 + n_2\bar{x}_2 + \dots + n_k\bar{x}_k}{n_1 + n_2 + \dots + n_k}$
 In our example
$\bar{x} = \dfrac{7(75.71) + 9(67.11) + 8(53.5)}{7 + 9 + 8} = 65.08$
 The Between Group Variation for our example is SS(B) = 1902
$SS(B) = 7(75.71 - 65.08)^2 + 9(67.11 - 65.08)^2 + 8(53.5 - 65.08)^2 \approx 1902$
(1901.5 when the unrounded means are used)
Cont…
 The within group variation for our example is 3386
SS(W) = 6(310.9) + 8(119.86) + 7(80.29) = 3386
 Degrees of freedom
 The between group df is one less than the number of groups
 We have three groups, so df(B) = 2
 The within group df is the sum of the individual df's of each group
 The sample sizes are 7, 9, and 8
 df(W) = 6 + 8 + 7 = 21
 The total df is one less than the total sample size
 df(Total) = 24 - 1 = 23
Cont….
 Variance (mean of squares)
 MS(B)= 1902 / 2 = 951.0
 MS(W)= 3386 / 21 = 161.2
 MS(T)= 5288 / 23 = 229.9
 Now computing ANOVA
Source     SS     df    MS      F     p
Between    1902   2     951.0   5.9
Within     3386   21    161.2
Total      5288   23    229.9
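As a cross-check, the ANOVA table can be recomputed from the raw exam scores. The sketch below uses only the standard library; variable names are illustrative.

```python
from statistics import mean

# Exam scores as listed for each row of the classroom
scores = {
    "Front": [82, 83, 97, 93, 55, 67, 53],
    "Middle": [83, 78, 68, 61, 77, 54, 69, 51, 63],
    "Back": [38, 59, 55, 66, 45, 52, 52, 61],
}

all_scores = [x for v in scores.values() for x in v]
grand_mean = mean(all_scores)
k, n_total = len(scores), len(all_scores)

ss_between = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in scores.values())
ss_within = sum((x - mean(v)) ** 2 for v in scores.values() for x in v)
df_between, df_within = k - 1, n_total - k
ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_stat = ms_between / ms_within

print(f"{'Source':<8}{'SS':>8}{'df':>4}{'MS':>8}{'F':>6}")
print(f"{'Between':<8}{ss_between:>8.0f}{df_between:>4}{ms_between:>8.1f}{f_stat:>6.1f}")
print(f"{'Within':<8}{ss_within:>8.0f}{df_within:>4}{ms_within:>8.1f}")
print(f"{'Total':<8}{ss_between + ss_within:>8.0f}{n_total - 1:>4}")
```

Because this works from unrounded values, the mean squares can differ from the slide's figures in the last digit (the slide divides the already rounded sums of squares); F still rounds to 5.9.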
Cont…
 $P(F_{2,21} > 5.9) = 0.009$
 There is enough evidence to support the claim that there is a
difference in the mean scores of the front, middle, and back
rows in class.
 ANOVA doesn't tell us which row is different; we need to
look at confidence intervals or run post hoc tests to determine
that.
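The p-value can be checked without statistical tables in this particular case. When the numerator df is exactly 2, the right-tail area of the F distribution has a closed form, $(1 + 2F/df_W)^{-df_W/2}$. The sketch below relies on that special case only; for other numerator df a statistics library (for example `scipy.stats.f.sf`) would be the usual route. The function name is an assumption for illustration.

```python
def f_right_tail_df1_is_2(f_value: float, df_within: int) -> float:
    """P(F > f_value) for an F distribution with 2 numerator df.

    Valid ONLY when the numerator df is 2 (three groups in one-way ANOVA).
    """
    return (1 + 2 * f_value / df_within) ** (-df_within / 2)


p_value = f_right_tail_df1_is_2(5.9, 21)
print(f"p = {p_value:.4f}")
```

The result rounds to 0.009, in agreement with the value quoted on the slide.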
Meta analysis
What is meta analysis ?
 "Meta-analysis is a statistical technique for combining the
results of independent, but similar, studies to obtain an
overall estimate of treatment effect."
 Margaliot, Zvi, and Kevin C. Chung. "Systematic Reviews: A Primer for Plastic Surgery
Research." PRS Journal 120/7 (2007), p. 1840
 Quantitative approach for systematically combining results of
previous research to arrive at conclusions about the body of
research.
 Each study produces a different estimate of the magnitude of the effect.
 Meta-analysis combines the effects from all studies to give an
overall mean effect and other important statistics
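One common way to obtain an overall mean effect is fixed-effect, inverse-variance pooling, in which each study is weighted by 1/SE². The slides do not say which pooling method is intended, so this is a sketch under that assumption, and the three studies are made-up numbers used only for illustration.

```python
from math import sqrt

# Hypothetical studies: (effect estimate, standard error)
studies = [(0.30, 0.10), (0.10, 0.20), (0.50, 0.25)]

# Inverse-variance weights: more precise studies count for more
weights = [1 / se ** 2 for _, se in studies]
pooled_effect = sum(w * effect for (effect, _), w in zip(studies, weights)) / sum(weights)
pooled_se = sqrt(1 / sum(weights))

# Approximate 95% confidence interval, assuming normality
ci_low = pooled_effect - 1.96 * pooled_se
ci_high = pooled_effect + 1.96 * pooled_se

print(f"pooled effect = {pooled_effect:.3f} (95% CI {ci_low:.3f} to {ci_high:.3f})")
```

The pooled estimate sits closest to the first study because its small standard error gives it the largest weight.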
The crack......
 Quantitative : numbers
 Systematic : methodical
 Combining: putting together
 Previous research: what's already done
 Conclusions: new knowledge
Systematic reviews
 "A review that is conducted according to clearly
stated, scientific research methods, and is
designed to minimize biases and errors inherent to
traditional, narrative reviews."
 Systematic reviews minimize bias.
 A systematic review is a more scientific method
because specific protocols are used to determine
which studies will be included in the review.
Systematic review vs. meta-analysis
Systematic Reviews
 Identify and critique relevant research studies
 Discuss factors that may explain heterogeneity
 Synthesize the knowledge
Meta-analyses
 Identify relevant research studies using a defined protocol
 Statistically test study heterogeneity and investigate
explanatory variables
 Statistically summarize results to obtain an overall estimate of
treatment effect
Four Steps of Meta Analysis
 Identifying studies
 Determining eligibility of studies
o Inclusion: which ones to keep
o Exclusion: which ones to throw out
 Abstract Data from the studies
 Analyzing data in the studies statistically
Identifying studies
 Being methodical: Defining the Research Question
 Performing the literature search
 List of popular databases to search
 Pubmed/Medline
 Embase
 Cochrane Review/Trials Register
 Other strategies ....
 Hand search (in the library...)
 Personal references, and emails
 Web, e.g. Google Scholar search (http://scholar.google.com)
 Selection of the studies
Eligibility of studies
 Should be determined in advance, to reduce investigator bias
 Cannot include all studies
 Keep the ones with
 high levels of evidence
 good quality
 check with QUOROM (Quality of Reporting of Meta-analyses)
guidelines
 Usually, MA is done with RCTs
 Case series and case reports are definitely out
The QUOROM guidelines for reporting a meta-analysis request that
investigators provide a flow diagram of the selection process.
Quality Control in MA: QUOROM Table
Cont…
Selection problems are major problems
Criteria include but are not limited to:
 Types of studies included (case control, cohort, etc)
 Years of publication covered
 Languages
 Restrictions on sample size
 Definition of disease, exposures
 Confounders that must be measured
 Dose response categories similar
The issue….
 Studies that failed to yield significant results often remain unpublished.
 If there is a substantial number of such studies, evaluation of the
overall significance level may be unduly optimistic.
 The result is a biased sample: a sample of only those publications
reporting statistically significant results
 This bias inflates the probability of making a Type I error
FILE DRAWER PHENOMENON
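The file drawer effect can be illustrated with a small simulation. The sketch assumes a treatment with no true effect, 2000 studies of equal precision, and "publication" only of significantly positive results (z > 1.96); all of these settings are illustrative assumptions, not figures from the slides.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is repeatable

TRUE_EFFECT = 0.0      # assumption: the treatment truly does nothing
STANDARD_ERROR = 0.2   # assumption: every study has the same precision
N_STUDIES = 2000

estimates = [random.gauss(TRUE_EFFECT, STANDARD_ERROR) for _ in range(N_STUDIES)]
# "Published" means significantly positive at the conventional z > 1.96 cut-off;
# everything else stays in the file drawer
published = [e for e in estimates if e / STANDARD_ERROR > 1.96]

print(f"all studies:       n={len(estimates)}, mean effect={mean(estimates):+.3f}")
print(f"published studies: n={len(published)}, mean effect={mean(published):+.3f}")
```

Every published study here is a false positive, so a meta-analysis restricted to them would report a clear effect where none exists.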
Checking for publication bias
 Funnel plots display the studies included in the meta-analysis
in a plot of effect size against sample size
 Smaller studies show more chance variability, so the expected picture
is one of a symmetrical inverted funnel
 An asymmetric plot suggests that the meta-analysis may have
missed some trials, usually smaller studies showing no effect
 Asymmetry could also occur if small studies tend to have
larger effect sizes
Interpretation…
Funnel plot
Abstract the data
 Data to be extracted from each study should be determined in
the design phase and……
 A standardized form is to be constructed to record the data.
 Examples of data commonly extracted
 Study design, descriptions of study groups, diagnostic
information, treatments, length of follow-up evaluation, and
outcome measures.
What should be abstracted from
articles?
Should at least include:
 Type of study
 Source of cases/controls or cohort
 Measures of association
 Confidence intervals
 Number of observations
 Confounders adjusted for, if any
Plan of Action
ARE THE STUDIES ELIGIBLE FOR MA (STEP I)?
NO: DISCARD
YES: ABSTRACT THE DATA, THEN ENTER INTO A SPECIFIED FORMAT
Analyzing data in the studies
statistically
 Clinical trials present results as the frequency of some outcome in the
intervention groups and the control group.
 Meta-analysis usually summarizes these results as a ratio of the frequency of
events in the intervention group to that in the control group.
 The most common summary measures of effect size are the odds ratio
(OR) and the standardized mean difference (d); relative risk (RR) and
number needed to treat (NNT) are used less often
 Separate methods are used for combining effect sizes and other outcome
measures such as risk difference or hazard ratio
Categories of d: 0.2 - small, 0.5 - medium, 0.8 - large (Cohen, 1977).
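For a single trial, the ratio-type measures named above can be computed from the event counts in each arm. The sketch below uses hypothetical counts (15 of 100 events in the intervention group, 30 of 100 in the control group); the function name is illustrative.

```python
def effect_measures(events_treated, n_treated, events_control, n_control):
    """Relative risk, odds ratio, risk difference and NNT from event counts."""
    risk_treated = events_treated / n_treated
    risk_control = events_control / n_control
    odds_treated = events_treated / (n_treated - events_treated)
    odds_control = events_control / (n_control - events_control)
    risk_difference = risk_treated - risk_control
    return {
        "RR": risk_treated / risk_control,
        "OR": odds_treated / odds_control,
        "RD": risk_difference,
        "NNT": 1 / abs(risk_difference),  # undefined if the risks are equal
    }


# Hypothetical trial, for illustration only
result = effect_measures(15, 100, 30, 100)
for name, value in result.items():
    print(f"{name} = {value:.3f}")
```

With these counts the relative risk is 0.5 while the odds ratio is about 0.41, a reminder that the two measures are not interchangeable when events are common.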
Examples
 Smith and Glass (1977) synthesized the results from 400
controlled evaluations of psychotherapy and counselling to
determine whether psychotherapy "works".
 They coded and systematically analyzed each study for the
kind of experimental and control treatments used and the
results obtained.
 They were able to show that, on average, the typical
psychotherapy client was better off than 75% of the untreated
"control" individuals.
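A statement such as "better off than 75% of controls" can be translated to and from a standardized mean difference if outcomes are assumed to be normally distributed with equal spread in both groups. That assumption is not stated on the slide; the sketch below simply shows the conversion with the standard library.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

# Effect size d at which the average treated person sits at the
# 75th percentile of the control distribution
d_for_75th_percentile = standard_normal.inv_cdf(0.75)

# The reverse direction: percentile of controls exceeded for a given d
percentile_for_large_d = standard_normal.cdf(0.8)

print(f"d for the 75th percentile: {d_for_75th_percentile:.2f}")
print(f"percentile for d = 0.8:    {percentile_for_large_d:.1%}")
```

Under this assumption the 75% figure corresponds to a d of roughly 0.67, between Cohen's "medium" and "large" benchmarks.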
Examples….
 Iaffaldano and Muchinsky (1985) found from their
meta-analysis that overall there is only a slight relationship
between workers' job satisfaction and the quality of their
performance.
 Jenkins (1986) tracked down 28 published studies
measuring the impact of financial incentives on
workplace performance.
Only 57% of these found a positive effect on performance, and the
overall effect was minimal.
Thank you


Editor's Notes

  • #5 The null hypothesis examined by the independent-samples t test is that two population means are equal. Repeated t tests lead to a higher Type I error rate (the experiment-wise α level) than the α level set for each t test.
  • #30 The funnel plot has some limitations; for example, it can sometimes be difficult to detect asymmetry by eye. To help with this, formal statistical methods have been developed to test for funnel plot asymmetry, such as Egger's regression test.
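The idea behind Egger's regression can be sketched as follows: regress each study's standardized effect (effect / SE) on its precision (1 / SE) and look at the intercept, which should be near zero when the funnel is symmetric. This sketch computes only the intercept; the formal test also needs a standard error and t test for that intercept, which is omitted here. The function name and both data sets are hypothetical.

```python
def egger_intercept(effects, standard_errors):
    """OLS intercept of (effect / SE) regressed on (1 / SE)."""
    y = [e / se for e, se in zip(effects, standard_errors)]
    x = [1 / se for se in standard_errors]
    x_bar, y_bar = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar


# Hypothetical standard errors: larger SE means a smaller study
ses = [0.05, 0.1, 0.2, 0.3, 0.4]

# Same effect in every study: no small-study effect, intercept near 0
symmetric = egger_intercept([0.3] * 5, ses)
# Smaller studies report larger effects: intercept clearly above 0
asymmetric = egger_intercept([0.2, 0.25, 0.4, 0.6, 0.8], ses)

print(f"intercept, symmetric data:  {symmetric:.3f}")
print(f"intercept, asymmetric data: {asymmetric:.3f}")
```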