More than two groups: ANOVA
and Chi-square
First, recent news…
- Researchers found a nine-fold increase in the risk of developing Parkinson's in individuals exposed in the workplace to certain solvents…
The data…
Table 3. Solvent Exposure Frequencies and Adjusted Pairwise Odds Ratios in PD–Discordant Twins, n = 99 Pairs
Which statistical test?

Outcome variable: binary or categorical (e.g. fracture, yes/no)

Are the observations correlated?

Independent:
- Chi-square test: compares proportions between two or more groups
- Relative risks: odds ratios or risk ratios
- Logistic regression: multivariate technique used when outcome is binary; gives multivariate-adjusted odds ratios

Correlated:
- McNemar's chi-square test: compares binary outcome between correlated groups (e.g., before and after)
- Conditional logistic regression: multivariate regression technique for a binary outcome when groups are correlated (e.g., matched data)
- GEE modeling: multivariate regression technique for a binary outcome when groups are correlated (e.g., repeated measures)

Alternatives to the chi-square test if sparse cells (some cells <5):
- Fisher's exact test: compares proportions between independent groups
- McNemar's exact test: compares proportions between correlated groups
Comparing more than two groups…
Continuous outcome (means)

Outcome variable: continuous (e.g. pain scale, cognitive function)

Are the observations independent or correlated?

Independent:
- T-test: compares means between two independent groups
- ANOVA: compares means between more than two independent groups
- Pearson's correlation coefficient (linear correlation): shows linear correlation between two continuous variables
- Linear regression: multivariate regression technique used when the outcome is continuous; gives slopes

Correlated:
- Paired t-test: compares means between two related groups (e.g., the same subjects before and after)
- Repeated-measures ANOVA: compares changes over time in the means of two or more groups (repeated measurements)
- Mixed models/GEE modeling: multivariate regression techniques to compare changes over time between two or more groups; gives rate of change over time

Alternatives if the normality assumption is violated (and small sample size): non-parametric statistics
- Wilcoxon signed-rank test: non-parametric alternative to the paired t-test
- Wilcoxon rank-sum test (= Mann-Whitney U test): non-parametric alternative to the t-test
- Kruskal-Wallis test: non-parametric alternative to ANOVA
- Spearman rank correlation coefficient: non-parametric alternative to Pearson's correlation coefficient
ANOVA example

Mean micronutrient intake from the school lunch by school:

                 S1 (a), n=28   S2 (b), n=25   S3 (c), n=21   P-value (d)
Calcium (mg)  Mean   117.8          158.7          206.5          0.000
              SD      62.4           70.5           86.2
Iron (mg)     Mean     2.0            2.0            2.0          0.854
              SD       0.6            0.6            0.6
Folate (μg)   Mean    26.6           38.7           42.6          0.000
              SD      13.1           14.5           15.1
Zinc (mg)     Mean     1.9            1.5            1.3          0.055
              SD       1.0            1.2            0.4

a School 1 (most deprived; 40% subsidized lunches).
b School 2 (medium deprived; <10% subsidized).
c School 3 (least deprived; no subsidization, private school).
d ANOVA; significant differences are highlighted in bold (P<0.05).
FROM: Gould R, Russell J, Barker ME. School lunch menus and 11 to 12 year old children's food choice in three secondary schools in England: are the nutritional standards being met? Appetite. 2006 Jan;46(1):86-92.
ANOVA
(ANalysis Of VAriance)
- Idea: for two or more groups, test the difference between means, for quantitative, normally distributed variables.
- Just an extension of the t-test (an ANOVA with only two groups is mathematically equivalent to a t-test).
One-Way Analysis of Variance
- Assumptions (same as the t-test):
  - Normally distributed outcome
  - Equal variances between the groups
  - Groups are independent
Hypotheses of One-Way ANOVA

H0: μ1 = μ2 = μ3 = ⋯
H1: Not all of the population means are the same
ANOVA
- It's like this: if I have three groups to compare:
  - I could do three pairwise t-tests, but this would increase my type I error.
  - So, instead, I want to look at the pairwise differences "all at once."
  - To do this, I can recognize that variance is a statistic that lets me look at more than one difference at a time…
The "F-test"

F = Variability between groups / Variability within groups

Is the difference in the means of the groups more than background noise (= variability within groups)?

Recall, we have already used an "F-test" to check for equality of variances: if F >> 1 (indicating unequal variances), use unpooled variance in a t-test.

The numerator summarizes the mean differences between all groups at once; the denominator is analogous to the pooled variance from a t-test.
The F-distribution
- The F-distribution is a continuous probability distribution that depends on two parameters n and m (numerator and denominator degrees of freedom, respectively):
http://www.econtools.com/jevons/java/Graphics2D/FDist.html
The F-distribution
- A ratio of variances follows an F-distribution:

H0: σ²between = σ²within
Ha: σ²between ≠ σ²within

F = s²between / s²within ~ F(n, m)

- The F-test tests the hypothesis that two variances are equal.
- F will be close to 1 if the sample variances are equal.
How to calculate ANOVAs by hand…

Treatment 1   Treatment 2   Treatment 3   Treatment 4
y11           y21           y31           y41
y12           y22           y32           y42
y13           y23           y33           y43
y14           y24           y34           y44
y15           y25           y35           y45
y16           y26           y36           y46
y17           y27           y37           y47
y18           y28           y38           y48
y19           y29           y39           y49
y1,10         y2,10         y3,10         y4,10

n = 10 obs./group
k = 4 groups
The group means:

ȳ_i· = (1/10) Σ_{j=1}^{10} y_ij, for each group i = 1, 2, 3, 4
The (within) group variances:

s_i² = Σ_{j=1}^{10} (y_ij − ȳ_i·)² / (10 − 1), for each group i = 1, 2, 3, 4
Sum of Squares Within (SSW), or Sum of Squares Error (SSE):

SSW = Σ_{i=1}^{4} Σ_{j=1}^{10} (y_ij − ȳ_i·)²
    = Σ_{j=1}^{10} (y_1j − ȳ1·)² + Σ_{j=1}^{10} (y_2j − ȳ2·)² + Σ_{j=1}^{10} (y_3j − ȳ3·)² + Σ_{j=1}^{10} (y_4j − ȳ4·)²

(the sum of squared deviations of each observation from its own group mean; "SSE" for chance error)
Sum of Squares Between (SSB), or Sum of Squares Regression (SSR)

Overall mean of all 40 observations ("grand mean"):

ȳ·· = Σ_{i=1}^{4} Σ_{j=1}^{10} y_ij / 40

SSB = 10 × Σ_{i=1}^{4} (ȳ_i· − ȳ··)²

SSB measures the variability of the group means compared to the grand mean (the variability due to the treatment).
Total Sum of Squares (TSS):

TSS = Σ_{i=1}^{4} Σ_{j=1}^{10} (y_ij − ȳ··)²

The squared difference of every observation from the overall mean (the numerator of the variance of Y!).
Partitioning of Variance

Σ_{i=1}^{4} Σ_{j=1}^{10} (y_ij − ȳ_i·)² + 10 × Σ_{i=1}^{4} (ȳ_i· − ȳ··)² = Σ_{i=1}^{4} Σ_{j=1}^{10} (y_ij − ȳ··)²

SSW + SSB = TSS
ANOVA Table

Source of variation              | d.f.  | Sum of squares                                                  | Mean Sum of Squares | F-statistic                | p-value
Between (k groups)               | k−1   | SSB (sum of squared deviations of group means from grand mean)  | SSB/(k−1)           | [SSB/(k−1)] / [SSW/(nk−k)] | Go to F(k−1, nk−k) chart
Within (n individuals per group) | nk−k  | SSW (sum of squared deviations of observations from their group mean) | s² = SSW/(nk−k) |                            |
Total variation                  | nk−1  | TSS (sum of squared deviations of observations from grand mean) |                     |                            |

TSS = SSB + SSW
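The partition TSS = SSB + SSW can be checked numerically; a minimal Python sketch (the small data set here is made up purely for illustration):

```python
import numpy as np

# hypothetical data: k = 3 groups, n = 5 observations each
groups = [np.array([4.0, 5.0, 6.0, 5.5, 4.5]),
          np.array([7.0, 8.0, 6.5, 7.5, 8.5]),
          np.array([5.0, 6.0, 7.0, 6.5, 5.5])]

allobs = np.concatenate(groups)
grand = allobs.mean()                                        # grand mean

ssw = sum(((g - g.mean()) ** 2).sum() for g in groups)       # within groups
ssb = sum(len(g) * (g.mean() - grand) ** 2 for g in groups)  # between groups
tss = ((allobs - grand) ** 2).sum()                          # total

print(ssw + ssb, tss)   # the two totals agree: SSW + SSB = TSS
```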
ANOVA = t-test

Source of variation | d.f.  | Sum of squares                                          | Mean Sum of Squares  | F-statistic | p-value
Between (2 groups)  | 1     | SSB (the squared difference in means times n/2)         | SSB/1                | SSB/s²p     | Go to F(1, 2n−2) chart; notice the values are just (t_{2n−2})²
Within              | 2n−2  | SSW (equivalent to the numerator of the pooled variance)| Pooled variance s²p  |             |
Total variation     | 2n−1  | TSS                                                     |                      |             |

Recall the pooled-variance t-statistic for two groups of size n:

t = (X̄ − Ȳ) / √(s²p/n + s²p/n), where s²p = [(n − 1)s²X + (n − 1)s²Y] / (2n − 2)

For two groups of size n with sample means X̄ and Ȳ, the grand mean is (X̄ + Ȳ)/2, so

SSB = n[X̄ − (X̄ + Ȳ)/2]² + n[Ȳ − (X̄ + Ȳ)/2]² = (n/2)(X̄ − Ȳ)²

and therefore

F = SSB / s²p = (X̄ − Ȳ)² / [s²p (2/n)] = [ (X̄ − Ȳ) / √(s²p/n + s²p/n) ]² = t²
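The two-group equivalence F = t² is easy to verify numerically; a sketch using scipy (not part of the lecture, which uses SAS) on the first two treatment columns from the example data:

```python
from scipy import stats

t1 = [60, 67, 42, 67, 56, 62, 64, 59, 72, 71]
t2 = [50, 52, 43, 67, 67, 59, 67, 64, 63, 65]

# two-sample t-test with pooled variance (equal_var=True)
t, p_t = stats.ttest_ind(t1, t2, equal_var=True)

# one-way ANOVA on the same two groups
f_stat, p_f = stats.f_oneway(t1, t2)

print(t ** 2, f_stat)   # F equals t squared; the two p-values match as well
```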
Example
Treatment 1 Treatment 2 Treatment 3 Treatment 4
60 inches 50 48 47
67 52 49 67
42 43 50 54
67 67 55 67
56 67 56 68
62 59 61 65
64 67 61 65
59 64 60 56
72 63 59 60
71 65 64 65
Step 1) Calculate the sum of squares between groups:

Mean for group 1 = 62.0
Mean for group 2 = 59.7
Mean for group 3 = 56.3
Mean for group 4 = 61.4
Grand mean = 59.85

SSB = [(62 − 59.85)² + (59.7 − 59.85)² + (56.3 − 59.85)² + (61.4 − 59.85)²] × n per group = 19.65 × 10 = 196.5
Step 2) Calculate the sum of squares within groups:

(60 − 62)² + (67 − 62)² + (42 − 62)² + (67 − 62)² + (56 − 62)² + (62 − 62)² + (64 − 62)² + (59 − 62)² + (72 − 62)² + (71 − 62)² + (50 − 59.7)² + (52 − 59.7)² + (43 − 59.7)² + (67 − 59.7)² + (67 − 59.7)² + (59 − 59.7)² + … (sum of 40 squared deviations) = 2060.6
Step 3) Fill in the ANOVA table

Source of variation | d.f. | Sum of squares | Mean Sum of Squares | F-statistic | p-value
Between             | 3    | 196.5          | 65.5                | 1.14        | .344
Within              | 36   | 2060.6         | 57.2                |             |
Total               | 39   | 2257.1         |                     |             |

INTERPRETATION of ANOVA:
How much of the variance in height is explained by treatment group?
R² = "Coefficient of Determination" = SSB/TSS = 196.5/2257.1 ≈ 9%
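The hand calculation above can be reproduced in software; a sketch using scipy's one-way ANOVA (the lecture itself uses SAS) on the four treatment columns:

```python
from scipy.stats import f_oneway

t1 = [60, 67, 42, 67, 56, 62, 64, 59, 72, 71]
t2 = [50, 52, 43, 67, 67, 59, 67, 64, 63, 65]
t3 = [48, 49, 50, 55, 56, 61, 61, 60, 59, 64]
t4 = [47, 67, 54, 67, 68, 65, 65, 56, 60, 65]

f_stat, p = f_oneway(t1, t2, t3, t4)
print(round(f_stat, 2), round(p, 3))  # matches the table: F = 1.14, p ≈ .34
```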
Coefficient of Determination

R² = SSB / (SSB + SSE) = SSB / TSS

The amount of variation in the outcome variable (dependent variable) that is explained by the predictor (independent variable).
Beyond one-way ANOVA
Often, you may want to test more than 1
treatment. ANOVA can accommodate
more than 1 treatment or factor, so long
as they are independent. Again, the
variation partitions beautifully!
TSS = SSB1 + SSB2 + SSW
ANOVA example

Table 6. Mean micronutrient intake from the school lunch by school

                 S1 (a), n=25   S2 (b), n=25   S3 (c), n=25   P-value (d)
Calcium (mg)  Mean   117.8          158.7          206.5          0.000
              SD      62.4           70.5           86.2
Iron (mg)     Mean     2.0            2.0            2.0          0.854
              SD       0.6            0.6            0.6
Folate (μg)   Mean    26.6           38.7           42.6          0.000
              SD      13.1           14.5           15.1
Zinc (mg)     Mean     1.9            1.5            1.3          0.055
              SD       1.0            1.2            0.4

a School 1 (most deprived; 40% subsidized lunches).
b School 2 (medium deprived; <10% subsidized).
c School 3 (least deprived; no subsidization, private school).
d ANOVA; significant differences are highlighted in bold (P<0.05).
Answer

Step 1) Calculate the sum of squares between groups:

Mean for School 1 = 117.8
Mean for School 2 = 158.7
Mean for School 3 = 206.5
Grand mean = 161

SSB = [(117.8 − 161)² + (158.7 − 161)² + (206.5 − 161)²] × 25 per group = 98,113
Answer

Step 2) Calculate the sum of squares within groups:

S.D. for S1 = 62.4
S.D. for S2 = 70.5
S.D. for S3 = 86.2

Therefore, the sum of squares within is:
(24)[62.4² + 70.5² + 86.2²] = 391,066
Answer

Step 3) Fill in your ANOVA table

Source of variation | d.f. | Sum of squares | Mean Sum of Squares | F-statistic | p-value
Between             | 2    | 98,113         | 49,056              | 9           | <.05
Within              | 72   | 391,066        | 5,431               |             |
Total               | 74   | 489,179        |                     |             |
R² = 98,113/489,179 = 20%

School explains 20% of the variance in lunchtime calcium intake in these kids.
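When a paper reports only group means, SDs, and n (as in the table), the F statistic can be reconstructed from those summaries; a sketch for the calcium row, using the slide's rounded figures (so the sums of squares may differ slightly from the slide's hand totals):

```python
means = [117.8, 158.7, 206.5]   # calcium means for the three schools
sds = [62.4, 70.5, 86.2]        # calcium SDs
n = 25                          # per group, as in the answer slides
k = len(means)

grand = sum(means) / k          # with equal group sizes, the grand mean
                                # is the average of the group means
ssb = sum(n * (m - grand) ** 2 for m in means)      # between groups
ssw = sum((n - 1) * s ** 2 for s in sds)            # within groups

f_stat = (ssb / (k - 1)) / (ssw / (k * n - k))
print(round(ssb), round(ssw), round(f_stat, 1))
```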
ANOVA summary
- A statistically significant ANOVA (F-test) only tells you that at least two of the groups differ, but not which ones differ.
- Determining which groups differ (when it's unclear) requires more sophisticated analyses to correct for the problem of multiple comparisons…
Question: Why not just do 3 pairwise t-tests?
- Answer: because, at an error rate of 5% for each test, you have an overall chance of up to 1 − (.95)³ = 14% of making a type-I error (if all 3 comparisons were independent).
- If you wanted to compare 6 groups, you'd have to do 6C2 = 15 pairwise t-tests, which would give you a high chance of finding something significant just by chance (if all tests were independent with a type-I error rate of 5% each); probability of at least one type-I error = 1 − (.95)¹⁵ = 54%.
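The familywise-error arithmetic above is easy to check; a quick sketch:

```python
# probability of at least one type-I error across m independent
# tests, each run at significance level alpha
def familywise(alpha, m):
    return 1 - (1 - alpha) ** m

print(round(familywise(0.05, 3), 2))    # 3 comparisons  -> 0.14
print(round(familywise(0.05, 15), 2))   # 15 comparisons -> 0.54
```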
Recall: Multiple comparisons

How to correct for multiple comparisons post hoc…
• Bonferroni correction (adjusts by the most conservative amount; assuming all tests are independent, divide α by the number of tests)
• Tukey (adjusts p)
• Scheffé (adjusts p)
• Holm/Hochberg (gives a p-cutoff beyond which results are not significant)
Procedures for Post Hoc Comparisons

If your ANOVA test identifies a difference between group means, then you must identify which of your k groups differ.

If you did not specify the comparisons of interest ("contrasts") ahead of time, then you have to pay a price for making all kC2 pairwise comparisons to keep the overall type-I error rate at α.

Alternately, run a limited number of planned comparisons (making only those comparisons that are most important to your research question). (This limits the number of tests you make.)
1. Bonferroni

Obtained P-value | Original Alpha | # tests | New Alpha | Significant?
.001             | .05            | 5       | .010      | Yes
.011             | .05            | 4       | .013      | Yes
.019             | .05            | 3       | .017      | No
.032             | .05            | 2       | .025      | No
.048             | .05            | 1       | .050      | Yes

For example, to make a Bonferroni correction, divide your desired alpha cut-off level (usually .05) by the number of comparisons you are making. This assumes complete independence between comparisons, which is way too conservative.
2/3. Tukey and Scheffé
- Both methods increase your p-values to account for the fact that you've done multiple comparisons, but are less conservative than Bonferroni (let the computer calculate them for you!).
- SAS options in PROC GLM:
  - adjust=tukey
  - adjust=scheffe
4/5. Holm and Hochberg
- Arrange all the resulting p-values (from the T = kC2 pairwise comparisons) in order from smallest (most significant) to largest: p1 to pT.
Holm
1. Start with p1 and compare it to the Bonferroni threshold (α/T). If p1 < α/T, then p1 is significant; continue to step 2. If not, then there are no significant p-values; stop here.
2. If p2 < α/(T−1), then p2 is significant; continue to step 3. If not, then p2 through pT are not significant; stop here.
3. If p3 < α/(T−2), then p3 is significant; continue to step 4. If not, then p3 through pT are not significant; stop here.
Repeat the pattern…
Hochberg
1. Start with the largest (least significant) p-value, pT, and compare it to α. If it's significant, so are all the remaining p-values; stop here. If it's not significant, go to step 2.
2. If pT−1 < α/2, then pT−1 is significant, as are all remaining smaller p-values; stop here. If not, then pT−1 is not significant; go to step 3 (comparing pT−2 to α/3, and so on).
Repeat the pattern…

Note: Holm and Hochberg should give you the same results. Use Holm if you anticipate few significant comparisons; use Hochberg if you anticipate many significant comparisons.
Practice Problem
A large randomized trial compared an experimental drug and 9 other standard
drugs for treating motion sickness. An ANOVA test revealed significant
differences between the groups. The investigators wanted to know if the
experimental drug (“drug 1”) beat any of the standard drugs in reducing total
minutes of nausea, and, if so, which ones. The p-values from the pairwise t-tests (comparing drug 1 with drugs 2-10) are below.

a. Which differences would be considered statistically significant using a Bonferroni correction? A Holm correction? A Hochberg correction?

Drug 1 vs. drug… | 2   | 3  | 4   | 5   | 6    | 7    | 8   | 9    | 10
p-value          | .05 | .3 | .25 | .04 | .001 | .006 | .08 | .002 | .01
Answer

Bonferroni makes the new α value = α/9 = .05/9 = .0056; therefore, using Bonferroni, the new drug is only significantly different from standard drugs 6 and 9.

Arrange the p-values:

Drug:    6     9     7     10   5    2    8    4    3
p-value: .001  .002  .006  .01  .04  .05  .08  .25  .3

Holm: .001 < .0056; .002 < .05/8 = .00625; .006 < .05/7 = .007; .01 > .05/6 = .0083; therefore, the new drug is only significantly different from standard drugs 6, 9, and 7.

Hochberg: .3 > .05; .25 > .05/2; .08 > .05/3; .05 > .05/4; .04 > .05/5; .01 > .05/6; .006 < .05/7; therefore, drugs 7, 9, and 6 are significantly different.
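These step-down (Holm) and step-up (Hochberg) rules are easy to automate; a plain-Python sketch (not from the lecture) that reproduces the answer above:

```python
alpha = 0.05
drugs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
pvals = [.05, .3, .25, .04, .001, .006, .08, .002, .01]

pairs = sorted(zip(pvals, drugs))          # p-values in ascending order
T = len(pairs)

# Holm (step-down): reject while the i-th smallest p < alpha/(T - i)
holm = []
for i, (p, d) in enumerate(pairs):
    if p < alpha / (T - i):
        holm.append(d)
    else:
        break

# Hochberg (step-up): find the largest index i with p(i) <= alpha/(T - i),
# then reject that p-value and all smaller ones
hochberg = []
for i in range(T - 1, -1, -1):
    if pairs[i][0] <= alpha / (T - i):
        hochberg = [d for _, d in pairs[: i + 1]]
        break

print(sorted(holm), sorted(hochberg))   # both give drugs 6, 7, 9
```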
Practice problem
īŽ b. Your patient is taking one of the standard drugs that was
shown to be statistically less effective in minimizing
motion sickness (i.e., significant p-value for the
comparison with the experimental drug). Assuming that
none of these drugs have side effects but that the
experimental drug is slightly more costly than your
patient’s current drug-of-choice, what (if any) other
information would you want to know before you start
recommending that patients switch to the new drug?
Answer
- The magnitude of the reduction in minutes of nausea.
- With a large enough sample size, a 1-minute difference could be statistically significant, but it's obviously not clinically meaningful and you probably wouldn't recommend a switch.
Continuous outcome (means)

Outcome variable: continuous (e.g. pain scale, cognitive function)

Are the observations independent or correlated?

Independent:
- T-test: compares means between two independent groups
- ANOVA: compares means between more than two independent groups
- Pearson's correlation coefficient (linear correlation): shows linear correlation between two continuous variables
- Linear regression: multivariate regression technique used when the outcome is continuous; gives slopes

Correlated:
- Paired t-test: compares means between two related groups (e.g., the same subjects before and after)
- Repeated-measures ANOVA: compares changes over time in the means of two or more groups (repeated measurements)
- Mixed models/GEE modeling: multivariate regression techniques to compare changes over time between two or more groups; gives rate of change over time

Alternatives if the normality assumption is violated (and small sample size): non-parametric statistics
- Wilcoxon signed-rank test: non-parametric alternative to the paired t-test
- Wilcoxon rank-sum test (= Mann-Whitney U test): non-parametric alternative to the t-test
- Kruskal-Wallis test: non-parametric alternative to ANOVA
- Spearman rank correlation coefficient: non-parametric alternative to Pearson's correlation coefficient
Non-parametric ANOVA

Kruskal-Wallis one-way ANOVA (just an extension of the Wilcoxon rank-sum (Mann-Whitney U) test for 2 groups; based on ranks)

Proc NPAR1WAY in SAS
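As a sketch of an alternative to SAS's PROC NPAR1WAY, the same test is available in scipy; here applied to the four treatment groups from the earlier example:

```python
from scipy.stats import kruskal

t1 = [60, 67, 42, 67, 56, 62, 64, 59, 72, 71]
t2 = [50, 52, 43, 67, 67, 59, 67, 64, 63, 65]
t3 = [48, 49, 50, 55, 56, 61, 61, 60, 59, 64]
t4 = [47, 67, 54, 67, 68, 65, 65, 56, 60, 65]

# rank-based analogue of the one-way ANOVA F-test above
h, p = kruskal(t1, t2, t3, t4)
print(round(h, 2), round(p, 3))
```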
Binary or categorical outcomes (proportions)

Outcome variable: binary or categorical (e.g. fracture, yes/no)

Are the observations correlated?

Independent:
- Chi-square test: compares proportions between two or more groups
- Relative risks: odds ratios or risk ratios
- Logistic regression: multivariate technique used when outcome is binary; gives multivariate-adjusted odds ratios

Correlated:
- McNemar's chi-square test: compares binary outcome between correlated groups (e.g., before and after)
- Conditional logistic regression: multivariate regression technique for a binary outcome when groups are correlated (e.g., matched data)
- GEE modeling: multivariate regression technique for a binary outcome when groups are correlated (e.g., repeated measures)

Alternatives to the chi-square test if sparse cells (some cells <5):
- Fisher's exact test: compares proportions between independent groups
- McNemar's exact test: compares proportions between correlated groups
Chi-square test: for comparing proportions (of a categorical variable) between >2 groups

I. Chi-Square Test of Independence

When both your predictor and outcome variables are categorical, they may be cross-classified in a contingency table and compared using a chi-square test of independence.

A contingency table with R rows and C columns is an R x C contingency table.
Example
- Asch, S.E. (1955). Opinions and social pressure. Scientific American, 193, 31-35.
The Experiment
- A Subject volunteers to participate in a "visual perception study."
- Everyone else in the room is actually a conspirator in the study (unbeknownst to the Subject).
- The "experimenter" reveals a pair of cards…
The Task Cards
Standard line Comparison lines
A, B, and C
The Experiment
- Everyone goes around the room and says which comparison line (A, B, or C) is correct; the true Subject always answers last, after hearing all the others' answers.
- The first few times, the 7 "conspirators" give the correct answer.
- Then, they start purposely giving the (obviously) wrong answer.
- 75% of Subjects tested went along with the group's consensus at least once.
Further Results
- In a further experiment, group size (number of conspirators) was altered from 2 to 10.
- Does the group size alter the proportion of subjects who conform?
The Chi-Square test

Conformed? | Number of group members
           | 2    4    6    8    10
Yes        | 20   50   75   60   30
No         | 80   50   25   40   70

Apparently, conformity is less likely with fewer or with more group members…
īŽ 20 + 50 + 75 + 60 + 30 = 235
conformed
īŽ out of 500 experiments.
īŽ Overall likelihood of conforming =
235/500 = .47
Calculating the expected, in general
- Null hypothesis: the variables are independent.
- Recall that under independence: P(A)*P(B) = P(A&B).
- Therefore, calculate the marginal probability of B and the marginal probability of A. Multiply P(A)*P(B)*N to get the expected cell count.
Expected frequencies if no association between group size and conformity…

Conformed? | Number of group members
           | 2    4    6    8    10
Yes        | 47   47   47   47   47
No         | 53   53   53   53   53

- Do observed and expected differ more than expected due to chance?
Chi-Square test

χ² = Σ (observed − expected)² / expected

Degrees of freedom = (rows − 1)(columns − 1) = (2 − 1)(5 − 1) = 4
χ²4 = (20 − 47)²/47 + (50 − 47)²/47 + (75 − 47)²/47 + (60 − 47)²/47 + (30 − 47)²/47
    + (80 − 53)²/53 + (50 − 53)²/53 + (25 − 53)²/53 + (40 − 53)²/53 + (70 − 53)²/53 ≈ 85
The Chi-Square distribution: the sum of squared normal deviates

χ²_df = Σ_{i=1}^{df} Z², where Z ~ Normal(0,1)

The expected value and variance of a chi-square:
E(x) = df
Var(x) = 2(df)
Chi-Square test

χ² = Σ (observed − expected)² / expected

Degrees of freedom = (rows − 1)(columns − 1) = (2 − 1)(5 − 1) = 4

χ²4 = (20 − 47)²/47 + (50 − 47)²/47 + (75 − 47)²/47 + (60 − 47)²/47 + (30 − 47)²/47
    + (80 − 53)²/53 + (50 − 53)²/53 + (25 − 53)²/53 + (40 − 53)²/53 + (70 − 53)²/53 ≈ 85

Rule of thumb: if the chi-square statistic is much greater than its degrees of freedom, this indicates statistical significance. Here 85 >> 4.
Chi-square example: recall data…

                        Brain tumor   No brain tumor   Total
Own a cell phone             5              347         352
Don't own a cell phone       3               88          91
Total                        8              435         453

p̂(tumor | cell phone) = 5/352 = .014
p̂(tumor | no phone)   = 3/91  = .033
Pooled p̂              = 8/453 = .018

Z = (.014 − .033) / √[(.018)(.982)(1/352 + 1/91)] = −.019 / .0156 = −1.22
Same data, but use the Chi-square test

            Brain tumor   No brain tumor   Total
Own              5              347         352
Don't own        3               88          91
Total            8              435         453

p̂(cell phone) = 352/453 = .777; p̂(tumor) = 8/453 = .018
Expected in cell a = .777 × .018 × 453 = 6.3; 1.7 in cell c; 345.7 in cell b; 89.3 in cell d

χ²1 = (5 − 6.3)²/6.3 + (3 − 1.7)²/1.7 + (347 − 345.7)²/345.7 + (88 − 89.3)²/89.3 ≈ 1.48 (using unrounded expected counts)

df = (R − 1)(C − 1) = 1; NS

Note: 1.22² = 1.48, so the chi-square statistic is just the square of the Z statistic.

The expected value in cell c is 1.7, so technically we should use a Fisher's exact test here! Next term…
Caveat

When the sample size is very small in any cell (expected value < 5), Fisher's exact test is used as an alternative to the chi-square test.
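Since the brain-tumor table above has an expected count below 5 in one cell, the recommended alternative is straightforward in software; a sketch (scipy's fisher_exact reports the sample odds ratio alongside the exact p-value):

```python
from scipy.stats import fisher_exact

table = [[5, 347],   # own a cell phone:   tumor / no tumor
         [3, 88]]    # don't own:          tumor / no tumor

odds_ratio, p = fisher_exact(table)
print(round(odds_ratio, 2), round(p, 2))
```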
Binary or categorical outcomes (proportions)

Outcome variable: binary or categorical (e.g. fracture, yes/no)

Are the observations correlated?

Independent:
- Chi-square test: compares proportions between two or more groups
- Relative risks: odds ratios or risk ratios
- Logistic regression: multivariate technique used when outcome is binary; gives multivariate-adjusted odds ratios

Correlated:
- McNemar's chi-square test: compares binary outcome between correlated groups (e.g., before and after)
- Conditional logistic regression: multivariate regression technique for a binary outcome when groups are correlated (e.g., matched data)
- GEE modeling: multivariate regression technique for a binary outcome when groups are correlated (e.g., repeated measures)

Alternatives to the chi-square test if sparse cells (np < 5):
- Fisher's exact test: compares proportions between independent groups
- McNemar's exact test: compares proportions between correlated groups
 
Historical philosophical, theoretical, and legal foundations of special and i...
Historical philosophical, theoretical, and legal foundations of special and i...Historical philosophical, theoretical, and legal foundations of special and i...
Historical philosophical, theoretical, and legal foundations of special and i...jaredbarbolino94
 
call girls in Kamla Market (DELHI) 🔝 >āŧ’9953330565🔝 genuine Escort Service 🔝✔ī¸âœ”ī¸
call girls in Kamla Market (DELHI) 🔝 >āŧ’9953330565🔝 genuine Escort Service 🔝✔ī¸âœ”ī¸call girls in Kamla Market (DELHI) 🔝 >āŧ’9953330565🔝 genuine Escort Service 🔝✔ī¸âœ”ī¸
call girls in Kamla Market (DELHI) 🔝 >āŧ’9953330565🔝 genuine Escort Service 🔝✔ī¸âœ”ī¸9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxEPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxRaymartEstabillo3
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxpboyjonauth
 
Roles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceRoles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceSamikshaHamane
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxthorishapillay1
 
Painted Grey Ware.pptx, PGW Culture of India
Painted Grey Ware.pptx, PGW Culture of IndiaPainted Grey Ware.pptx, PGW Culture of India
Painted Grey Ware.pptx, PGW Culture of IndiaVirag Sontakke
 
Capitol Tech U Doctoral Presentation - April 2024.pptx
Capitol Tech U Doctoral Presentation - April 2024.pptxCapitol Tech U Doctoral Presentation - April 2024.pptx
Capitol Tech U Doctoral Presentation - April 2024.pptxCapitolTechU
 
Full Stack Web Development Course for Beginners
Full Stack Web Development Course  for BeginnersFull Stack Web Development Course  for Beginners
Full Stack Web Development Course for BeginnersSabitha Banu
 
Final demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptxFinal demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptxAvyJaneVismanos
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxiammrhaywood
 
History Class XII Ch. 3 Kinship, Caste and Class (1).pptx
History Class XII Ch. 3 Kinship, Caste and Class (1).pptxHistory Class XII Ch. 3 Kinship, Caste and Class (1).pptx
History Class XII Ch. 3 Kinship, Caste and Class (1).pptxsocialsciencegdgrohi
 
How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17Celine George
 
Crayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon ACrayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon AUnboundStockton
 
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdfEnzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdfSumit Tiwari
 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTiammrhaywood
 
Types of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxTypes of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxEyham Joco
 

Recently uploaded (20)

Meghan Sutherland In Media Res Media Component
Meghan Sutherland In Media Res Media ComponentMeghan Sutherland In Media Res Media Component
Meghan Sutherland In Media Res Media Component
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)
 
OS-operating systems- ch04 (Threads) ...
OS-operating systems- ch04 (Threads) ...OS-operating systems- ch04 (Threads) ...
OS-operating systems- ch04 (Threads) ...
 
Historical philosophical, theoretical, and legal foundations of special and i...
Historical philosophical, theoretical, and legal foundations of special and i...Historical philosophical, theoretical, and legal foundations of special and i...
Historical philosophical, theoretical, and legal foundations of special and i...
 
call girls in Kamla Market (DELHI) 🔝 >āŧ’9953330565🔝 genuine Escort Service 🔝✔ī¸âœ”ī¸
call girls in Kamla Market (DELHI) 🔝 >āŧ’9953330565🔝 genuine Escort Service 🔝✔ī¸âœ”ī¸call girls in Kamla Market (DELHI) 🔝 >āŧ’9953330565🔝 genuine Escort Service 🔝✔ī¸âœ”ī¸
call girls in Kamla Market (DELHI) 🔝 >āŧ’9953330565🔝 genuine Escort Service 🔝✔ī¸âœ”ī¸
 
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxEPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptx
 
Roles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceRoles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in Pharmacovigilance
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptx
 
Painted Grey Ware.pptx, PGW Culture of India
Painted Grey Ware.pptx, PGW Culture of IndiaPainted Grey Ware.pptx, PGW Culture of India
Painted Grey Ware.pptx, PGW Culture of India
 
Capitol Tech U Doctoral Presentation - April 2024.pptx
Capitol Tech U Doctoral Presentation - April 2024.pptxCapitol Tech U Doctoral Presentation - April 2024.pptx
Capitol Tech U Doctoral Presentation - April 2024.pptx
 
Full Stack Web Development Course for Beginners
Full Stack Web Development Course  for BeginnersFull Stack Web Development Course  for Beginners
Full Stack Web Development Course for Beginners
 
Final demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptxFinal demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptx
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
 
History Class XII Ch. 3 Kinship, Caste and Class (1).pptx
History Class XII Ch. 3 Kinship, Caste and Class (1).pptxHistory Class XII Ch. 3 Kinship, Caste and Class (1).pptx
History Class XII Ch. 3 Kinship, Caste and Class (1).pptx
 
How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17
 
Crayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon ACrayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon A
 
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdfEnzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
 
Types of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxTypes of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptx
 

lecture12.ppt

  • 1. More than two groups: ANOVA and Chi-square
  • 2. First, recent news… RESEARCHERS FOUND A NINE-FOLD INCREASE IN THE RISK OF DEVELOPING PARKINSON'S IN INDIVIDUALS EXPOSED IN THE WORKPLACE TO CERTAIN SOLVENTS…
  • 3. The dataâ€Ļ Table 3. Solvent Exposure Frequencies and Adjusted Pairwise Odds Ratios in PD–Discordant Twins, n = 99 Pairsa
  • 4. Which statistical test? Outcome variable: binary or categorical (e.g., fracture yes/no). Are the observations correlated?
  Independent: Chi-square test (compares proportions between two or more groups); relative risks (odds ratios or risk ratios); logistic regression (multivariate technique used when the outcome is binary; gives multivariate-adjusted odds ratios).
  Correlated: McNemar's chi-square test (compares a binary outcome between correlated groups, e.g., before and after); conditional logistic regression (multivariate regression technique for a binary outcome when groups are correlated, e.g., matched data); GEE modeling (multivariate regression technique for a binary outcome when groups are correlated, e.g., repeated measures).
  Alternatives to the chi-square test if cells are sparse: Fisher's exact test (compares proportions between independent groups when there are sparse data, some cells <5); McNemar's exact test (compares proportions between correlated groups when there are sparse data, some cells <5).
  • 5. Comparing more than two groups…
  • 6. Continuous outcome (means). Outcome variable: continuous (e.g., pain scale, cognitive function). Are the observations independent or correlated?
  Independent: T-test (compares means between two independent groups); ANOVA (compares means between more than two independent groups); Pearson's correlation coefficient (shows linear correlation between two continuous variables); linear regression (multivariate regression technique used when the outcome is continuous; gives slopes).
  Correlated: Paired t-test (compares means between two related groups, e.g., the same subjects before and after); repeated-measures ANOVA (compares changes over time in the means of two or more groups, repeated measurements); mixed models/GEE modeling (multivariate regression techniques to compare changes over time between two or more groups; gives rate of change over time).
  Non-parametric alternatives if the normality assumption is violated (and small sample size): Wilcoxon signed-rank test (non-parametric alternative to the paired t-test); Wilcoxon rank-sum test (= Mann-Whitney U test; non-parametric alternative to the t-test); Kruskal-Wallis test (non-parametric alternative to ANOVA); Spearman rank correlation coefficient (non-parametric alternative to Pearson's correlation coefficient).
  • 7. ANOVA example. Mean micronutrient intake from the school lunch by school:
  Calcium (mg): mean 117.8 (SD 62.4) in S1; 158.7 (70.5) in S2; 206.5 (86.2) in S3; P = 0.000
  Iron (mg): mean 2.0 (SD 0.6) in all three schools; P = 0.854
  Folate (Îŧg): mean 26.6 (13.1); 38.7 (14.5); 42.6 (15.1); P = 0.000
  Zinc (mg): mean 1.9 (1.0); 1.5 (1.2); 1.3 (0.4); P = 0.055
  S1 = School 1 (most deprived; 40% subsidized lunches), n = 28; S2 = School 2 (medium deprived; <10% subsidized), n = 25; S3 = School 3 (least deprived; no subsidization, private school), n = 21. P-values from ANOVA; significant differences (P<0.05) highlighted in bold in the original. FROM: Gould R, Russell J, Barker ME. School lunch menus and 11 to 12 year old children's food choice in three secondary schools in England-are the nutritional standards being met? Appetite. 2006 Jan;46(1):86-92.
  • 8. ANOVA (ANalysis Of VAriance). Idea: for two or more groups, test the difference between means, for quantitative, normally distributed variables. Just an extension of the t-test (an ANOVA with only two groups is mathematically equivalent to a t-test).
  • 9. One-Way Analysis of Variance. Assumptions, same as the t-test: normally distributed outcome; equal variances between the groups; groups are independent.
  • 10. Hypotheses of One-Way ANOVA: H0: Îŧ1 = Îŧ2 = Îŧ3. H1: Not all of the population means are the same.
  • 11. ANOVA. It's like this: if I have three groups to compare, I could do three pairwise t-tests, but this would increase my type I error. So, instead, I want to look at the pairwise differences "all at once." To do this, I can recognize that variance is a statistic that lets me look at more than one difference at a time…
  • 12. The "F-test": F = variability between groups / variability within groups. Is the difference in the means of the groups more than background noise (= variability within groups)? The numerator summarizes the mean differences between all groups at once; the denominator is analogous to the pooled variance from a t-test. Recall, we have already used an "F-test" to check for equality of variances: if F >> 1 (indicating unequal variances), use the unpooled variance in a t-test.
  • 13. The F-distribution: a continuous probability distribution that depends on two parameters n and m (numerator and denominator degrees of freedom, respectively): http://www.econtools.com/jevons/java/Graphics2D/FDist.html
  • 14. The F-distribution: a ratio of variances follows an F-distribution: F = σ²between/σ²within ~ Fn,m. The F-test tests the hypothesis that two variances are equal: H0: σ²between = σ²within; Ha: σ²between ≠ σ²within. F will be close to 1 if the sample variances are equal.
  • 15. How to calculate ANOVAs by hand… Data layout: k = 4 treatment groups with n = 10 observations per group; yij denotes the j-th observation in treatment group i. The group means: ȳi· = (1/10) ÎŖj=1..10 yij, for i = 1, …, 4. The (within) group variances: s²i = ÎŖj=1..10 (yij − ȳi·)² / (10 − 1).
  • 16. Sum of Squares Within (SSW), or Sum of Squares Error (SSE): add up the squared deviations of every observation from its own group mean: SSW = ÎŖi=1..4 ÎŖj=1..10 (yij − ȳi·)² = ÎŖj (y1j − ȳ1·)² + ÎŖj (y2j − ȳ2·)² + ÎŖj (y3j − ȳ3·)² + ÎŖj (y4j − ȳ4·)².
  • 17. Sum of Squares Between (SSB), or Sum of Squares Regression (SSR): the variability of the group means compared to the grand mean (the variability due to the treatment). Overall mean of all 40 observations ("grand mean"): ȳ·· = (1/40) ÎŖi=1..4 ÎŖj=1..10 yij. SSB = 10 × ÎŖi=1..4 (ȳi· − ȳ··)².
  • 18. Total Sum of Squares (TSS): the squared difference of every observation from the overall mean (the numerator of the variance of Y): TSS = ÎŖi=1..4 ÎŖj=1..10 (yij − ȳ··)².
  • 19. Partitioning of variance: ÎŖiÎŖj (yij − ȳi·)² + 10 × ÎŖi (ȳi· − ȳ··)² = ÎŖiÎŖj (yij − ȳ··)², that is, SSW + SSB = TSS.
  • 20. ANOVA Table:
  Source of variation | d.f. | Sum of squares | Mean sum of squares | F-statistic | p-value
  Between (k groups) | k−1 | SSB (sum of squared deviations of group means from grand mean) | SSB/(k−1) | F = [SSB/(k−1)] / [SSW/(nk−k)] | go to the F(k−1, nk−k) chart
  Within (n individuals per group) | nk−k | SSW (sum of squared deviations of observations from their group means) | s² = SSW/(nk−k) | |
  Total variation | nk−1 | TSS (sum of squared deviations of observations from grand mean) | | |
  TSS = SSB + SSW
  • 21. ANOVA = t-test. With only two groups (each of size n), the ANOVA table collapses to the t-test:
  Between (2 groups): d.f. = 1; SSB = n(X̄ − Ȳ)²/2 (the squared difference in means, scaled by n); go to the F(1, 2n−2) chart and notice the values are just (t2n−2)².
  Within: d.f. = 2n−2; SSW is equivalent to the numerator of the pooled variance, so SSW/(2n−2) = s²p.
  Total variation: d.f. = 2n−1; TSS.
  The algebra: F = SSB/s²p = n(X̄ − Ȳ)²/(2s²p) = [(X̄ − Ȳ)/(sp√(2/n))]² = t², the square of the pooled two-sample t-statistic.
  • 22. Example: heights (inches) in four treatment groups, n = 10 per group:
  Treatment 1: 60, 67, 42, 67, 56, 62, 64, 59, 72, 71
  Treatment 2: 50, 52, 43, 67, 67, 59, 67, 64, 63, 65
  Treatment 3: 48, 49, 50, 55, 56, 61, 61, 60, 59, 64
  Treatment 4: 47, 67, 54, 67, 68, 65, 65, 56, 60, 65
  • 23. Example, Step 1) calculate the sum of squares between groups (same data as the previous slide): Mean for group 1 = 62.0; mean for group 2 = 59.7; mean for group 3 = 56.3; mean for group 4 = 61.4; grand mean = 59.85. SSB = [(62 − 59.85)² + (59.7 − 59.85)² + (56.3 − 59.85)² + (61.4 − 59.85)²] × n per group = 19.65 × 10 = 196.5
  • 24. Example, Step 2) calculate the sum of squares within groups (each observation minus its own group mean, squared): (60 − 62)² + (67 − 62)² + (42 − 62)² + (67 − 62)² + (56 − 62)² + (62 − 62)² + (64 − 62)² + (59 − 62)² + (72 − 62)² + (71 − 62)² + (50 − 59.7)² + (52 − 59.7)² + (43 − 59.7)² + (67 − 59.7)² + (67 − 59.7)² + (59 − 59.7)² + … (sum of 40 squared deviations) = 2060.6
  • 25. Step 3) Fill in the ANOVA table:
  Source of variation | d.f. | Sum of squares | Mean sum of squares | F-statistic | p-value
  Between | 3 | 196.5 | 65.5 | 1.14 | .344
  Within | 36 | 2060.6 | 57.2 | |
  Total | 39 | 2257.1 | | |
  • 26. Step 3) Fill in the ANOVA table:
  Source of variation | d.f. | Sum of squares | Mean sum of squares | F-statistic | p-value
  Between | 3 | 196.5 | 65.5 | 1.14 | .344
  Within | 36 | 2060.6 | 57.2 | |
  Total | 39 | 2257.1 | | |
  INTERPRETATION of ANOVA: How much of the variance in height is explained by treatment group? R² = "coefficient of determination" = SSB/TSS = 196.5/2257.1 ≈ 9%
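The filled-in table can be double-checked by machine. This is an illustrative sketch only, not part of the original deck (which uses SAS); it assumes Python with scipy is available:

```python
from scipy import stats

# Heights (inches) for the four treatment groups from the example
t1 = [60, 67, 42, 67, 56, 62, 64, 59, 72, 71]
t2 = [50, 52, 43, 67, 67, 59, 67, 64, 63, 65]
t3 = [48, 49, 50, 55, 56, 61, 61, 60, 59, 64]
t4 = [47, 67, 54, 67, 68, 65, 65, 56, 60, 65]

# One-way ANOVA: F = MSB/MSW with d.f. = (3, 36)
f_stat, p_value = stats.f_oneway(t1, t2, t3, t4)
print(round(f_stat, 2), round(p_value, 3))  # F ≈ 1.14, p ≈ .344

# R^2 = SSB/TSS, recomputed from the raw data
all_obs = t1 + t2 + t3 + t4
grand = sum(all_obs) / len(all_obs)           # 59.85
tss = sum((y - grand) ** 2 for y in all_obs)  # 2257.1
ssb = sum(len(g) * (sum(g) / len(g) - grand) ** 2
          for g in (t1, t2, t3, t4))          # 196.5
print(round(ssb / tss, 2))                    # ≈ 0.09
```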
  • 27. Coefficient of Determination: R² = SSB/(SSB + SSE) = SSB/SST. The amount of variation in the outcome variable (dependent variable) that is explained by the predictor (independent variable).
  • 28. Beyond one-way ANOVA Often, you may want to test more than 1 treatment. ANOVA can accommodate more than 1 treatment or factor, so long as they are independent. Again, the variation partitions beautifully! TSS = SSB1 + SSB2 + SSW
  • 29. ANOVA example. Table 6. Mean micronutrient intake from the school lunch by school (simplified here to n = 25 per school):
  Calcium (mg): mean 117.8 (SD 62.4) in S1; 158.7 (70.5) in S2; 206.5 (86.2) in S3; P = 0.000
  Iron (mg): mean 2.0 (SD 0.6) in all three schools; P = 0.854
  Folate (Îŧg): mean 26.6 (13.1); 38.7 (14.5); 42.6 (15.1); P = 0.000
  Zinc (mg): mean 1.9 (1.0); 1.5 (1.2); 1.3 (0.4); P = 0.055
  S1 = School 1 (most deprived; 40% subsidized lunches); S2 = School 2 (medium deprived; <10% subsidized); S3 = School 3 (least deprived; no subsidization, private school). P-values from ANOVA; significant differences (P<0.05) highlighted in bold in the original. FROM: Gould R, Russell J, Barker ME. School lunch menus and 11 to 12 year old children's food choice in three secondary schools in England-are the nutritional standards being met? Appetite. 2006 Jan;46(1):86-92.
  • 30. Answer, Step 1) calculate the sum of squares between groups: Mean for School 1 = 117.8; mean for School 2 = 158.7; mean for School 3 = 206.5; grand mean = 161. SSB = [(117.8 − 161)² + (158.7 − 161)² + (206.5 − 161)²] × 25 per group ≈ 98,545
  • 31. Answer Step 2) calculate the sum of squares within groups: S.D. for S1 = 62.4 S.D. for S2 = 70.5 S.D. for S3 = 86.2 Therefore, sum of squares within is: (24)[ 62.42 + 70.5 2+ 86.22]=391,066
  • 32. Answer, Step 3) Fill in your ANOVA table:
  Source of variation | d.f. | Sum of squares | Mean sum of squares | F-statistic | p-value
  Between | 2 | 98,545 | 49,272 | 9.1 | <.05
  Within | 72 | 391,067 | 5,431 | |
  Total | 74 | 489,612 | | |
  R² = 98,545/489,612 ≈ 20%. School explains 20% of the variance in lunchtime calcium intake in these kids.
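Because only the group means, SDs, and group sizes are published, the whole table can be rebuilt from summary statistics alone. A minimal illustrative sketch (it assumes scipy for the F distribution, standing in for the F table used in the deck):

```python
from scipy.stats import f as f_dist

means = [117.8, 158.7, 206.5]   # mean calcium intake (mg) per school
sds   = [62.4, 70.5, 86.2]      # standard deviations per school
n, k  = 25, 3                   # observations per group, number of groups

grand = sum(means) / k                           # 161.0 (equal n, so unweighted)
ssb = n * sum((m - grand) ** 2 for m in means)   # between-group sum of squares
ssw = (n - 1) * sum(s ** 2 for s in sds)         # within: (n-1)*s^2, summed

df_b, df_w = k - 1, k * n - k                    # 2 and 72
f_stat = (ssb / df_b) / (ssw / df_w)             # ≈ 9
p = f_dist.sf(f_stat, df_b, df_w)                # upper-tail probability
print(round(f_stat, 1), p < 0.05)
```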
  • 33. ANOVA summary: A statistically significant ANOVA (F-test) only tells you that at least two of the groups differ, but not which ones. Determining which groups differ (when it's unclear) requires more sophisticated analyses to correct for the problem of multiple comparisons…
  • 34. Question: Why not just do 3 pairwise t-tests? Answer: because, at an error rate of 5% per test, you have an overall chance of up to 1 − (.95)³ = 14% of making a type-I error (if all 3 comparisons were independent). If you wanted to compare 6 groups, you'd have to do 6C2 = 15 pairwise t-tests, which would give you a high chance of finding something significant just by chance (if all tests were independent with a type-I error rate of 5% each); probability of at least one type-I error = 1 − (.95)šâĩ = 54%.
  • 36. Correction for multiple comparisons. How to correct for multiple comparisons post hoc… Bonferroni correction (adjusts p by the most conservative amount; assuming all tests are independent, divide p by the number of tests); Tukey (adjusts p); Scheffé (adjusts p); Holm/Hochberg (give a p-cutoff beyond which results are not significant).
  • 37. Procedures for Post Hoc Comparisons. If your ANOVA test identifies a difference between group means, then you must identify which of your k groups differ. If you did not specify the comparisons of interest ("contrasts") ahead of time, then you have to pay a price for making all kC2 pairwise comparisons, to keep the overall type-I error rate at Îą. Alternatively, run a limited number of planned comparisons (making only those comparisons that are most important to your research question); this limits the number of tests you make.
  • 38. 1. Bonferroni. To make a Bonferroni correction, divide your desired alpha cut-off level (usually .05) by the number of comparisons you are making:
  Obtained p-value | Original alpha | # tests | New alpha | Significant?
  .001 | .05 | 5 | .010 | Yes
  .011 | .05 | 4 | .013 | Yes
  .019 | .05 | 3 | .017 | No
  .032 | .05 | 2 | .025 | No
  .048 | .05 | 1 | .050 | Yes
  Bonferroni assumes complete independence between comparisons, which is way too conservative.
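The table's arithmetic is one division per row; a purely illustrative Python sketch (not from the deck) that reproduces its "Significant?" column:

```python
# (obtained p-value, # tests) pairs from the table above
rows = [(.001, 5), (.011, 4), (.019, 3), (.032, 2), (.048, 1)]
alpha = 0.05

verdicts = []
for p, n_tests in rows:
    new_alpha = alpha / n_tests          # Bonferroni-adjusted cut-off
    verdicts.append("Yes" if p < new_alpha else "No")
print(verdicts)  # ['Yes', 'Yes', 'No', 'No', 'Yes']
```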
  • 39. 2/3. Tukey and Scheffé. Both methods increase your p-values to account for the fact that you've done multiple comparisons, but are less conservative than Bonferroni (let the computer calculate for you!). SAS options in PROC GLM: adjust=tukey; adjust=scheffe.
  • 40. 4/5. Holm and Hochberg. Arrange all the resulting p-values (from the T = kC2 pairwise comparisons) in order from smallest (most significant) to largest: p1 to pT.
  • 41. Holm:
  1. Start with p1 and compare it to the Bonferroni p (= Îą/T). If p1 < Îą/T, then p1 is significant; continue to step 2. If not, then there are no significant p-values and stop here.
  2. If p2 < Îą/(T−1), then p2 is significant; continue to step 3. If not, then p2 through pT are not significant and stop here.
  3. If p3 < Îą/(T−2), then p3 is significant; continue. If not, then p3 through pT are not significant and stop here.
  Repeat the pattern…
  • 42. Hochberg:
  1. Start with the largest (least significant) p-value, pT, and compare it to Îą. If it's significant, so are all the remaining p-values; stop here. If it's not significant, go to step 2.
  2. If pT−1 < Îą/2, then pT−1 is significant, as are all remaining smaller p-values; stop here. If not, then pT−1 is not significant; go to step 3.
  Repeat the pattern (comparing the i-th largest p-value to Îą/i)…
  Note: Holm and Hochberg should give you the same results. Use Holm if you anticipate few significant comparisons; use Hochberg if you anticipate many significant comparisons.
  • 43. Practice Problem A large randomized trial compared an experimental drug and 9 other standard drugs for treating motion sickness. An ANOVA test revealed significant differences between the groups. The investigators wanted to know if the experimental drug (“drug 1”) beat any of the standard drugs in reducing total minutes of nausea, and, if so, which ones. The p-values from the pairwise ttests (comparing drug 1 with drugs 2-10) are below. a. Which differences would be considered statistically significant using a Bonferroni correction? A Holm correction? A Hochberg correction? Drug 1 vs. drug â€Ļ 2 3 4 5 6 7 8 9 10 p-value .05 .3 .25 .04 .001 .006 .08 .002 .01
  • 44. Answer Bonferroni makes new Îą value = Îą/9 = .05/9 =.0056; therefore, using Bonferroni, the new drug is only significantly different than standard drugs 6 and 9. Arrange p-values: 6 9 7 10 5 2 8 4 3 .001 .002 .006 .01 .04 .05 .08 .25 .3 Holm: .001<.0056; .002<.05/8=.00625; .006<.05/7=.007; .01>.05/6=.0083; therefore, new drug only significantly different than standard drugs 6, 9, and 7. Hochberg: .3>.05; .25>.05/2; .08>.05/3; .05>.05/4; .04>.05/5; .01>.05/6; .006<.05/7; therefore, drugs 7, 9, and 6 are significantly different.
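The Holm step-down walk-through in the answer above can be scripted directly. A small illustrative sketch (pure Python; the drug labels follow the practice problem):

```python
def holm_significant(pvals, alpha=0.05):
    """Holm step-down: sort p-values ascending and compare the i-th smallest
    (0-based) to alpha/(T - i); stop at the first comparison that fails."""
    ordered = sorted(pvals.items(), key=lambda kv: kv[1])
    T = len(ordered)
    significant = []
    for i, (label, p) in enumerate(ordered):
        if p < alpha / (T - i):
            significant.append(label)
        else:
            break
    return significant

# p-values for drug 1 vs. drugs 2-10 from the practice problem
pvals = {2: .05, 3: .3, 4: .25, 5: .04, 6: .001, 7: .006, 8: .08, 9: .002, 10: .01}
print(holm_significant(pvals))  # [6, 9, 7] -- drugs 6, 9, and 7
```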
  • 45. Practice problem: b. Your patient is taking one of the standard drugs that was shown to be statistically less effective in minimizing motion sickness (i.e., a significant p-value for the comparison with the experimental drug). Assuming that none of these drugs have side effects, but that the experimental drug is slightly more costly than your patient's current drug-of-choice, what (if any) other information would you want to know before you start recommending that patients switch to the new drug?
  • 46. Answer: The magnitude of the reduction in minutes of nausea. With a large enough sample size, a 1-minute difference could be statistically significant, but it's obviously not clinically meaningful, and you probably wouldn't recommend a switch.
  • 47. Continuous outcome (means). Outcome variable: continuous (e.g., pain scale, cognitive function). Are the observations independent or correlated?
  Independent: T-test (compares means between two independent groups); ANOVA (compares means between more than two independent groups); Pearson's correlation coefficient (shows linear correlation between two continuous variables); linear regression (multivariate regression technique used when the outcome is continuous; gives slopes).
  Correlated: Paired t-test (compares means between two related groups, e.g., the same subjects before and after); repeated-measures ANOVA (compares changes over time in the means of two or more groups, repeated measurements); mixed models/GEE modeling (multivariate regression techniques to compare changes over time between two or more groups; gives rate of change over time).
  Non-parametric alternatives if the normality assumption is violated (and small sample size): Wilcoxon signed-rank test (non-parametric alternative to the paired t-test); Wilcoxon rank-sum test (= Mann-Whitney U test; non-parametric alternative to the t-test); Kruskal-Wallis test (non-parametric alternative to ANOVA); Spearman rank correlation coefficient (non-parametric alternative to Pearson's correlation coefficient).
  • 48. Non-parametric ANOVA: the Kruskal-Wallis one-way ANOVA (just an extension of the Wilcoxon rank-sum (Mann-Whitney U) test for 2 groups; based on ranks). PROC NPAR1WAY in SAS.
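Applied to the height data from the earlier ANOVA example, the Kruskal-Wallis test reaches the same conclusion. Illustrative sketch only (scipy stands in here for the deck's PROC NPAR1WAY):

```python
from scipy.stats import kruskal

t1 = [60, 67, 42, 67, 56, 62, 64, 59, 72, 71]
t2 = [50, 52, 43, 67, 67, 59, 67, 64, 63, 65]
t3 = [48, 49, 50, 55, 56, 61, 61, 60, 59, 64]
t4 = [47, 67, 54, 67, 68, 65, 65, 56, 60, 65]

# The H statistic is compared to a chi-square distribution with k - 1 = 3 d.f.
h_stat, p_value = kruskal(t1, t2, t3, t4)
print(round(h_stat, 2), round(p_value, 3))  # not significant, as with ANOVA
```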
  • 49. Binary or categorical outcomes (proportions). Outcome variable: binary or categorical (e.g., fracture yes/no). Are the observations correlated?
  Independent: Chi-square test (compares proportions between two or more groups); relative risks (odds ratios or risk ratios); logistic regression (multivariate technique used when the outcome is binary; gives multivariate-adjusted odds ratios).
  Correlated: McNemar's chi-square test (compares a binary outcome between correlated groups, e.g., before and after); conditional logistic regression (multivariate regression technique for a binary outcome when groups are correlated, e.g., matched data); GEE modeling (multivariate regression technique for a binary outcome when groups are correlated, e.g., repeated measures).
  Alternatives to the chi-square test if cells are sparse: Fisher's exact test (compares proportions between independent groups when there are sparse data, some cells <5); McNemar's exact test (compares proportions between correlated groups when there are sparse data, some cells <5).
  • 50. Chi-square test for comparing proportions (of a categorical variable) between >2 groups. I. Chi-Square Test of Independence: when both your predictor and outcome variables are categorical, they may be cross-classified in a contingency table and compared using a chi-square test of independence. A contingency table with R rows and C columns is an R × C contingency table.
  • 51. Example: Asch, S.E. (1955). Opinions and social pressure. Scientific American, 193, 31-35.
  • 52. The Experiment: A Subject volunteers to participate in a "visual perception study." Everyone else in the room is actually a conspirator in the study (unbeknownst to the Subject). The "experimenter" reveals a pair of cards…
  • 53. The Task Cards: a standard line, and comparison lines A, B, and C.
  • 54. The Experiment: Everyone goes around the room and says which comparison line (A, B, or C) is correct; the true Subject always answers last, after hearing all the others' answers. The first few times, the 7 "conspirators" give the correct answer. Then, they start purposely giving the (obviously) wrong answer. 75% of Subjects tested went along with the group's consensus at least once.
  • 55. Further Results: In a further experiment, group size (number of conspirators) was varied from 2 to 10. Does the group size alter the proportion of subjects who conform?
  • 56. The Chi-Square test. Conformed, by number of group members:
  Group size: 2, 4, 6, 8, 10
  Yes: 20, 50, 75, 60, 30
  No: 80, 50, 25, 40, 70
  Apparently, conformity is less likely with fewer or with more group members…
  • 57.
- 20 + 50 + 75 + 60 + 30 = 235 conformed, out of 500 experiments.
- Overall likelihood of conforming = 235/500 = .47
  • 58. Calculating the expected, in general
- Null hypothesis: the variables are independent.
- Recall that under independence: P(A)*P(B) = P(A&B)
- Therefore, calculate the marginal probability of B and the marginal probability of A. Multiply P(A)*P(B)*N to get the expected cell count.
  • 59. Expected frequencies if no association between group size and conformity…

Conformed?   Number of group members
              2     4     6     8    10
Yes          47    47    47    47    47
No           53    53    53    53    53
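The expected-count rule from the previous slide (expected = P(A)*P(B)*N, equivalently row total * column total / N) can be sketched for the conformity table in a few lines of stdlib Python. This is a minimal sketch added for illustration, not part of the original slides:

```python
# Expected cell counts under independence: E = row_total * col_total / N
observed = {
    "Yes": [20, 50, 75, 60, 30],   # conformed, by group size 2, 4, 6, 8, 10
    "No":  [80, 50, 25, 40, 70],   # did not conform
}

cols = range(5)
row_totals = {r: sum(v) for r, v in observed.items()}                 # 235, 265
col_totals = [sum(observed[r][j] for r in observed) for j in cols]    # 100 each
N = sum(row_totals.values())                                          # 500

expected = {
    r: [row_totals[r] * col_totals[j] / N for j in cols]
    for r in observed
}

print(expected["Yes"])  # [47.0, 47.0, 47.0, 47.0, 47.0]
print(expected["No"])   # [53.0, 53.0, 53.0, 53.0, 53.0]
```

This reproduces the table of 47s and 53s above: since all five column totals are 100, the expected counts are constant across group sizes.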
  • 60.
- Do observed and expected differ more than expected due to chance?
  • 61. Chi-Square test

chi-square = sum over all cells of (observed - expected)^2 / expected

Degrees of freedom = (rows - 1)*(columns - 1) = (2 - 1)*(5 - 1) = 4

chi-square (4 df) = (20-47)^2/47 + (50-47)^2/47 + (75-47)^2/47 + (60-47)^2/47 + (30-47)^2/47
                  + (80-53)^2/53 + (50-53)^2/53 + (25-53)^2/53 + (40-53)^2/53 + (70-53)^2/53
                  = 1980/47 + 1980/53 ≈ 79.5
  • 62. The Chi-Square distribution: is a sum of squared normal deviates

chi-square (df) = sum from i=1 to df of Z_i^2, where Z ~ Normal(0,1)

The expected value and variance of a chi-square:
E(x) = df
Var(x) = 2(df)
  • 63. Chi-Square test

chi-square = sum over all cells of (observed - expected)^2 / expected

Degrees of freedom = (rows - 1)*(columns - 1) = (2 - 1)*(5 - 1) = 4

chi-square (4 df) = (20-47)^2/47 + (50-47)^2/47 + (75-47)^2/47 + (60-47)^2/47 + (30-47)^2/47
                  + (80-53)^2/53 + (50-53)^2/53 + (25-53)^2/53 + (40-53)^2/53 + (70-53)^2/53 ≈ 79.5

Rule of thumb: if the chi-square statistic is much greater than its degrees of freedom, this indicates statistical significance. Here 79.5 >> 4.
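The whole calculation on this slide, expected counts, the statistic, and the degrees of freedom, fits in a few lines of stdlib Python (a minimal sketch added for illustration; scipy.stats.chi2_contingency should give the same statistic for this table, plus a p-value):

```python
observed = [
    [20, 50, 75, 60, 30],  # conformed
    [80, 50, 25, 40, 70],  # did not conform
]

n = sum(sum(row) for row in observed)
row_tot = [sum(row) for row in observed]
col_tot = [sum(row[j] for row in observed) for j in range(len(observed[0]))]

# Pearson chi-square: sum of (O - E)^2 / E over all cells, E = row*col/n
chi2 = sum(
    (observed[i][j] - row_tot[i] * col_tot[j] / n) ** 2
    / (row_tot[i] * col_tot[j] / n)
    for i in range(len(observed))
    for j in range(len(observed[0]))
)
df = (len(observed) - 1) * (len(observed[0]) - 1)

print(round(chi2, 1), df)  # 79.5 4 -- far beyond the rule-of-thumb threshold of df
```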
  • 65. Same data, but use Chi-square test

              Brain tumor   No brain tumor
Own                5              347          352
Don’t own          3               88           91
                   8              435          453

p(tumor) = 8/453 = .018; p(cell phone) = 352/453 = .777
.018*.777 = .014; Expected in cell a = .014*453 = 6.3; 345.7 in cell b; 1.7 in cell c; 89.3 in cell d

chi-square (1 df) = (8-6.3)^2/6.3 + (3-1.7)^2/1.7 + (89.3-88)^2/89.3 + (347-345.7)^2/345.7 = 1.48, NS
(note: 1.22^2 = 1.48)

df = (R-1)*(C-1) = 1*1 = 1

Expected value in cell c = 1.7, so technically should use a Fisher’s exact here! Next term…
  • 66. Caveat
When the sample size is very small in any cell (expected value <5), Fisher’s exact test is used as an alternative to the chi-square test.
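For a 2x2 table, Fisher’s exact p-value can be computed directly from the hypergeometric distribution with nothing but the standard library. A sketch (added for illustration; it uses the common two-sided convention of summing all tables no more probable than the observed one):

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]:
    sum the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed one."""
    r1, r2, c1 = a + b, c + d, a + c
    n = r1 + r2
    denom = comb(n, c1)

    def pmf(x):  # P(cell a = x) given fixed margins
        return comb(r1, x) * comb(r2, c1 - x) / denom

    p_obs = pmf(a)
    return sum(
        pmf(x)
        for x in range(max(0, c1 - r2), min(r1, c1) + 1)
        if pmf(x) <= p_obs * (1 + 1e-9)   # tolerance for float ties
    )

# Classic lady-tasting-tea style table: p = 34/70
print(round(fisher_exact_2x2(3, 1, 1, 3), 3))  # 0.486
```

Because every possible table is enumerated exactly, this remains valid when expected cell counts fall below 5, which is precisely where the chi-square approximation breaks down.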
  • 67. Binary or categorical outcomes (proportions)
Outcome variable: binary or categorical (e.g. fracture, yes/no)
Are the observations correlated?
- Independent: Chi-square test (compares proportions between two or more groups); Relative risks (odds ratios or risk ratios); Logistic regression (multivariate technique used when the outcome is binary; gives multivariate-adjusted odds ratios)
- Correlated: McNemar’s chi-square test (compares a binary outcome between correlated groups, e.g., before and after); Conditional logistic regression (multivariate regression technique for a binary outcome when groups are correlated, e.g., matched data); GEE modeling (multivariate regression technique for a binary outcome when groups are correlated, e.g., repeated measures)
- Alternatives to the chi-square test if cells are sparse: Fisher’s exact test (compares proportions between independent groups when there are sparse data, expected counts np <5); McNemar’s exact test (compares proportions between correlated groups when there are sparse data, expected counts np <5)