Multiple comparison - Descriptive Statistic

Comparing Several Means
(One-Way ANOVA)

Lesson Outline
1. Descriptive Statistics
2. The Problem of Multiple Comparisons
3. Analysis of Variance
4. Post Hoc Comparisons
5. The Equal Variance Assumption
6. Introduction to Nonparametric Tests

Illustrative Example: Data
Pets as moderators of a stress response. This chapter follows the analysis of data
from a study in which heart rates (bpm) of participants were monitored after
being exposed to a psychological stressor. Participants were randomized to one
of three groups:
• Group 1 - monitored in the presence of pet dog
• Group 2 - monitored in the presence of human friend
• Group 3 - monitored with neither dog nor human friend present

SPSS Data Table
• Most computer programs require data in two columns
• One column is for the explanatory variable (group)
• One column is for the response variable (hrt_rate)
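
The two-column layout can be sketched in Python as a list of (group, hrt_rate) pairs. This is only an illustration: the values below are hypothetical and are not the study data, and `summarize_by_group` is a helper name invented for this sketch.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical rows in (group, hrt_rate) layout.
# These values are made up for illustration; they are NOT the study data.
rows = [
    (1, 70.0), (1, 75.0), (1, 72.0),
    (2, 90.0), (2, 94.0), (2, 89.0),
    (3, 80.0), (3, 85.0), (3, 84.0),
]

def summarize_by_group(rows):
    """Return {group: (n, mean, sd)} from (group, response) pairs."""
    by_group = defaultdict(list)
    for group, value in rows:
        by_group[group].append(value)
    return {g: (len(v), mean(v), stdev(v)) for g, v in sorted(by_group.items())}

summary = summarize_by_group(rows)
for g, (n, m, s) in summary.items():
    print(f"group {g}: n={n}, mean={m:.2f}, sd={s:.2f}")
```

Keeping one row per observation makes it easy to split the response by the explanatory variable before any inferential step.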

Data are described and explored before moving to inferential
calculations
Here are summary statistics by group:
Descriptive Statistics

John Tukey taught us the
importance of exploratory data
analysis (EDA)
EDA techniques that apply:
Stemplots
Boxplots
Dotplots
Exploring Group Differences

The Problem of Multiple
Comparisons

• Consider a comparison of three groups. There are three
possible t tests when considering three groups:
1. H0: μ1 = μ2 versus Ha: μ1 ≠ μ2
2. H0: μ1 = μ3 versus Ha: μ1 ≠ μ3
3. H0: μ2 = μ3 versus Ha: μ2 ≠ μ3
• However, we do not perform separate t tests without
modification → this would identify too many random
differences
The Problem of Multiple Comparisons

• Family-wise error rate = probability of at least one false rejection
of H0
• Assume three null hypotheses are true:
At α = 0.05, Pr(retain all three H0s) = (1 − 0.05)³ = 0.857.
Therefore, Pr(reject at least one) = 1 − 0.857 = 0.143; this is the
family-wise error rate.
• The family-wise error rate is much greater than intended. This is
“The Problem of Multiple Comparisons”
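
The arithmetic can be checked with a short Python sketch. It assumes, as the calculation on this slide implicitly does, that the tests are independent; the function name is an invention for this example.

```python
from itertools import combinations

def familywise_error_rate(alpha, n_tests):
    """P(at least one false rejection) over n_tests independent tests,
    assuming every null hypothesis is true."""
    return 1 - (1 - alpha) ** n_tests

groups = [1, 2, 3]
pairs = list(combinations(groups, 2))      # (1, 2), (1, 3), (2, 3)
fwer = familywise_error_rate(0.05, len(pairs))
print(len(pairs), round(fwer, 3))          # 3 0.143
```

The number of pairwise comparisons is k(k − 1)/2, so the family-wise error rate grows quickly as groups are added.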

Mitigating the Problem of Multiple Comparisons
Two-step approach:
1. Test for overall significance using a technique called
“Analysis of Variance”
2. Do post hoc comparison on individual groups

• One-way ANalysis Of VAriance (ANOVA)
• Categorical explanatory variable
• Quantitative response variable
• Test group means for a significant difference
• Statistical hypotheses
• H0: μ1 = μ2 = … = μk
• Ha: at least one μi differs
• Method: compare variability between groups to variability
within groups (F statistic)
Analysis of Variance

R. A. Fisher
(1890-1962)
The F in the F statistic stands for
“Fisher”
Analysis of Variance

• Variability of group means around the grand mean → provides a
“signal” of group difference
• Based on a statistic called the Mean Square Between (MSB)
• Notation
• SSB ≡ sum of squares between
• dfB ≡ degrees of freedom between
• k ≡ number of groups
• x-bar ≡ grand mean
• x-bari ≡ mean of group i
Variability Between Groups

Mean Square Between: Formula
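• Sum of Squares Between [Groups]: $SSB = \sum_{i=1}^{k} n_i (\bar{x}_i - \bar{x})^2$, where $n_i$ is the sample size of group i
• Degrees of Freedom Between: $df_B = k - 1$
• Mean Square Between: $MSB = SSB / df_B$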

Mean Square Between: Graphically

Variability Within Groups
• Variability of data points within groups → quantifies random “noise”
• Based on a statistic called the Mean Square Within (MSW)
• Notation
• SSW ≡ sum of squares within
• dfW ≡ degrees of freedom within
• N ≡ sample size, all groups combined
• ni ≡ sample size, group i
• s²i ≡ variance of group i

Mean Square Within: Formula
• Mean Square Within: $MSW = SSW / df_W$
• Sum of Squares Within: $SSW = \sum_{i=1}^{k} (n_i - 1) s_i^2$
• Degrees of Freedom Within: $df_W = N - k$

Mean Square Within: Graphically

The F statistic and ANOVA table
• Data are arranged to form an ANOVA table
• F statistic is the ratio of the MSB to MSW
$F_{stat} = \frac{MSB}{MSW} = \frac{1193.843}{84.793} = 14.08$
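
As a minimal sketch of how the pieces fit together, the Python function below computes MSB, MSW, and the F statistic from raw data. The toy groups are hypothetical (not the "pets" data); the last line only re-checks the ratio of the two mean squares quoted on this slide.

```python
def one_way_anova(groups):
    """Return (MSB, MSW, F) for a list of groups of numeric observations."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Between-group sum of squares: group means around the grand mean.
    ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-group sum of squares: observations around their own group mean.
    ssw = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    msb = ssb / (k - 1)
    msw = ssw / (n_total - k)
    return msb, msw, msb / msw

toy = [[1, 2, 3], [2, 3, 4], [6, 7, 8]]    # hypothetical data
msb, msw, f_stat = one_way_anova(toy)
print(msb, msw, f_stat)                    # 21.0 1.0 21.0

# Ratio of the mean squares reported for the "pets" data.
print(round(1193.843 / 84.793, 2))         # 14.08
```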

Fstat and P-value
• The Fstat has numerator and denominator degrees of freedom: df1
and df2 respectively (corresponding to dfB and dfW)
• Convert Fstat to P-value with a computer program or Table D
• The P-value corresponds to the area in the right tail beyond the Fstat
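
For the special case of three groups (numerator df = 2), the right-tail area has a simple closed form, so it can be sketched without a statistics library. This shortcut is an assumption of the example and does not apply to other numerator df; in general one would use software such as `scipy.stats.f.sf`.

```python
def f_right_tail_df1_is_2(f_stat, df2):
    """Right-tail area P(F > f_stat) for an F distribution with 2 and df2 df.

    Uses the closed form (1 + 2*f/df2) ** (-df2/2), valid ONLY when df1 == 2.
    """
    return (1 + 2 * f_stat / df2) ** (-df2 / 2)

# Fstat = 14.08 with 2 and 42 df, as in the illustrative example.
p_value = f_right_tail_df1_is_2(14.08, 42)
print(f"{p_value:.6f}")
```

The result is on the order of 0.00002, which agrees with the P-value SPSS reports for the illustrative data.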

Table D (“F Table”)
• The F table has limited listings for df2.
• You often must round down to the next available df2 (rounding down is
preferable because it gives a conservative estimate).
• Wedge the Fstat between listings to find the approximate P-value

ANOVA Example (Summary)
A. Hypotheses: H0: μ1 = μ2 = μ3 vs. Ha: at least one μi differs
B. Statistics: Fstat = 14.08 with 2 and 42 degrees of freedom
C. P-value = .000021 (via SPSS), providing highly significant
evidence against H0; conclude that heart rates (an
indicator of the effects of stress) differed among the groups
D. Significance level (optional): Results are significant at α =
.00005

Computation
Because of the complexity of computations, ANOVA statistics
are often calculated by computer

ANOVA and the t test (Optional)
ANOVA for two groups is equivalent to the equal
variance (pooled) t test (§12.4)
• Both address H0: μ1 = μ2
• dfW = df for t test = N − 2
• $MSW = s^2_{pooled}$
• $F_{stat} = (t_{stat})^2$
• $F_{1,\,df_2,\,\alpha} = (t_{df,\,1-\alpha/2})^2$
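
The equivalence can be checked numerically. The sketch below uses two small hypothetical groups and computes the pooled t statistic and the two-group ANOVA F statistic separately; the function names are inventions for this example.

```python
from math import sqrt

def pooled_t(a, b):
    """Equal-variance (pooled) t statistic and pooled variance for two groups."""
    n1, n2 = len(a), len(b)
    m1, m2 = sum(a) / n1, sum(b) / n2
    ss1 = sum((x - m1) ** 2 for x in a)
    ss2 = sum((x - m2) ** 2 for x in b)
    s2_pooled = (ss1 + ss2) / (n1 + n2 - 2)
    se = sqrt(s2_pooled * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, s2_pooled

def anova_f_two_groups(a, b):
    """One-way ANOVA F statistic and MSW for exactly two groups."""
    n1, n2 = len(a), len(b)
    m1, m2 = sum(a) / n1, sum(b) / n2
    grand = (sum(a) + sum(b)) / (n1 + n2)
    msb = n1 * (m1 - grand) ** 2 + n2 * (m2 - grand) ** 2   # dfB = 1
    ssw = sum((x - m1) ** 2 for x in a) + sum((x - m2) ** 2 for x in b)
    msw = ssw / (n1 + n2 - 2)
    return msb / msw, msw

a, b = [1, 2, 3], [2, 3, 4]                # hypothetical data
t_stat, s2_pooled = pooled_t(a, b)
f_stat, msw = anova_f_two_groups(a, b)
print(round(t_stat ** 2, 6), round(f_stat, 6), s2_pooled, msw)
```

Both routes pool the within-group variability the same way, which is why the squared t statistic and the F statistic coincide.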

Many post hoc comparison procedures exist. We
cover the LSD and Bonferroni methods.
SPSS Post Hoc Comparison Procedures

Do after a significant ANOVA to protect against the
problem of multiple comparisons
A. Hypotheses. H0: μi = μj vs. Ha: μi ≠ μj
for each group i and j
B. Test statistic
$t_{stat} = \frac{\bar{x}_i - \bar{x}_j}{SE_{\bar{x}_i - \bar{x}_j}}$, where $SE_{\bar{x}_i - \bar{x}_j} = \sqrt{MSW \left( \frac{1}{n_i} + \frac{1}{n_j} \right)}$ and $df = df_{Within} = N - k$
C. P-value. Use t table or software
Least Significant Difference (LSD) Procedure

For the “pets” illustrative data, we test H0: μ1 = μ2 by
hand. The other tests will be done by computer.
A. Hypotheses. H0: μi = μj vs. Ha: μi ≠ μj
for each group i and j
B. Test statistic
$SE_{\bar{x}_1 - \bar{x}_2} = \sqrt{MSW \left( \frac{1}{n_1} + \frac{1}{n_2} \right)} = \sqrt{84.793 \left( \frac{1}{15} + \frac{1}{15} \right)} = 3.362$
$t_{stat} = \frac{\bar{x}_1 - \bar{x}_2}{SE_{\bar{x}_1 - \bar{x}_2}} = \frac{73.483 - 91.325}{3.362} = -5.31$ with $df = df_{Within} = N - k = 45 - 3 = 42$
C. P-value. Use t table or software
Least Significant Difference (LSD) Procedure
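
The hand calculation can be reproduced with a few lines of Python. This is a sketch from the summary statistics quoted on this slide (group means, MSW, and group sizes); `lsd_t_statistic` is a name invented for the example.

```python
from math import sqrt

def lsd_t_statistic(mean_i, mean_j, msw, n_i, n_j):
    """Return (t statistic, standard error) for comparing two group means,
    using the ANOVA mean square within as the pooled variance estimate."""
    se = sqrt(msw * (1 / n_i + 1 / n_j))
    return (mean_i - mean_j) / se, se

# Group 1 vs. Group 2 of the "pets" data, values as given on the slide.
t_stat, se = lsd_t_statistic(73.483, 91.325, 84.793, 15, 15)
df_within = 45 - 3                          # N - k
print(round(se, 3), round(t_stat, 2), df_within)   # 3.362 -5.31 42
```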

Results for illustrative “pets” data.
LSD Procedure, SPSS

95% Confidence Interval, Mean
Difference, LSD Method
$(1 - \alpha)100\%$ CI for $\mu_i - \mu_j$ = $(\bar{x}_i - \bar{x}_j) \pm t_{N-k,\,1-\alpha/2} \sqrt{MSW \left( \frac{1}{n_i} + \frac{1}{n_j} \right)}$

Comparing Group 1 to Group 2:
95% CI, LSD Method, Example
95% CI for $\mu_1 - \mu_2$ = $(\bar{x}_1 - \bar{x}_2) \pm t_{N-k,\,1-\alpha/2} \sqrt{MSW \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}$
= $(73.483 - 91.325) \pm t_{42,\,.975} \sqrt{84.793 \left( \frac{1}{15} + \frac{1}{15} \right)}$
= $-17.842 \pm (2.021)(3.362)$
= $(-24.6, -11.0)$
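
A small Python sketch of the interval calculation follows. The critical value is passed in as an argument rather than computed, so the value 2.021 used in this example is taken directly from the slide; the function name is hypothetical.

```python
from math import sqrt

def mean_difference_ci(mean_i, mean_j, msw, n_i, n_j, t_crit):
    """Confidence interval for mu_i - mu_j given a t critical value."""
    diff = mean_i - mean_j
    margin = t_crit * sqrt(msw * (1 / n_i + 1 / n_j))
    return diff - margin, diff + margin

# Group 1 vs. Group 2, with the critical value used on the slide.
lower, upper = mean_difference_ci(73.483, 91.325, 84.793, 15, 15, t_crit=2.021)
print(round(lower, 1), round(upper, 1))     # -24.6 -11.0
```

Because the interval excludes 0, it is consistent with rejecting H0: μ1 = μ2 at α = 0.05.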


The Bonferroni procedure is instituted by multiplying the P-value
from the LSD procedure by the number of post hoc
comparisons “c”.
Bonferroni Procedure
A. Hypotheses. H0: μ1 = μ2 against Ha: μ1 ≠ μ2
B. Test statistic. Same as for the LSD method.
C. P-value. The LSD method produced P = .0000039
(two-tailed). Since there were three post hoc
comparisons, PBonf = 3 × .0000039 = .000012.
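
The adjustment is a one-line calculation, sketched here in Python. Capping the result at 1 is an addition of this sketch (it is not stated on the slide), made because an adjusted P-value is still a probability.

```python
def bonferroni_p(p_value, n_comparisons):
    """Bonferroni-adjusted P-value: multiply by the number of comparisons,
    capped at 1 (the cap is an assumption of this sketch)."""
    return min(1.0, p_value * n_comparisons)

p_bonf = bonferroni_p(0.0000039, 3)         # LSD P-value from the slide, c = 3
print(f"{p_bonf:.6f}")                      # 0.000012
print(bonferroni_p(0.40, 3))                # 1.0
```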

Let c represent the number of post hoc comparisons.
Comparing Group 1 to Group 2:
Bonferroni Procedure
$(1 - \alpha)100\%$ CI for $\mu_i - \mu_j$ = $(\bar{x}_i - \bar{x}_j) \pm t_{N-k,\,1-\frac{\alpha}{2c}} \sqrt{MSW \left( \frac{1}{n_i} + \frac{1}{n_j} \right)}$
95% CI for $\mu_1 - \mu_2$ = $(73.483 - 91.325) \pm t_{42,\,.9917} \sqrt{84.793 \left( \frac{1}{15} + \frac{1}{15} \right)}$
= $-17.842 \pm (2.51)(3.362)$
= $(-26.3, -9.4)$
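
The only change from the LSD interval is the quantile level, 1 − α/(2c). The sketch below computes that level for c = 3 and then the interval, taking the critical value 2.51 from the slide rather than computing it; the function name is hypothetical.

```python
from math import sqrt

def bonferroni_quantile_level(alpha, n_comparisons):
    """Cumulative probability at which to look up the t critical value."""
    return 1 - alpha / (2 * n_comparisons)

level = bonferroni_quantile_level(0.05, 3)
se = sqrt(84.793 * (1 / 15 + 1 / 15))       # same SE as the LSD method
diff = 73.483 - 91.325
t_crit = 2.51                               # t critical value used on the slide
lower, upper = diff - t_crit * se, diff + t_crit * se
print(round(level, 4), round(lower, 1), round(upper, 1))   # 0.9917 -26.3 -9.4
```

A larger critical value (2.51 rather than 2.021) is what widens the Bonferroni interval relative to the LSD interval.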



P-values from the Bonferroni method are higher and its confidence
intervals are broader than those from the LSD method, reflecting its
conservative approach
Bonferroni Procedure, SPSS

• Conditions for ANOVA:
1. Sampling independence
2. Normal sampling distributions of mean
3. Equal variance within population groups
• Let us focus on condition 3, since conditions 1 and 2 are
covered elsewhere.
• Equal variance is called homoscedasticity. (Unequal variance
= heteroscedasticity).
• Homoscedasticity allows us to pool group variances to form
the MSW
The Equal Variance Assumption

1. Graphical exploration. Compare spreads visually
with side-by-side plots.
2. Descriptive statistics. If a group’s standard
deviation is more than twice that of another, be
alerted to possible heteroscedasticity
3. Test variances. A statistical test can be applied
(next slide).
Assessing “Equal Variance”
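
The rule of thumb in item 2 can be sketched as a quick check in Python. The groups below are hypothetical, and the threshold of 2 simply encodes the "more than twice" guideline from this slide.

```python
from statistics import stdev

def sd_ratio_flag(groups, threshold=2.0):
    """Return (largest SD / smallest SD, True if the ratio exceeds threshold)."""
    sds = [stdev(g) for g in groups]
    ratio = max(sds) / min(sds)
    return ratio, ratio > threshold

# Hypothetical data: the second group is far more spread out than the first.
ratio, flagged = sd_ratio_flag([[10, 12, 14], [10, 20, 30]])
print(ratio, flagged)                       # 5.0 True
```

A flag here is only an alert to look more closely, not a formal test.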

A. Hypotheses.
H0: $\sigma^2_1 = \sigma^2_2 = \dots = \sigma^2_k$
Ha: at least one $\sigma^2_i$ differs
B. Test statistic. Test is performed by computer. The test
statistic is a particular type of Fstat based on the rank
transformed deviations (see p. 283 for details).
C. P-value. The Fstat is converted to a P-value by the
computational program. Interpretation of P is routine:
small P → evidence against H0, suggesting
heteroscedasticity.
Levene’s Test of Variances
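
To give a feel for the idea, here is a minimal Python sketch of one common textbook form of Levene's test: a one-way ANOVA F statistic computed on the absolute deviations of each observation from its group mean. That choice of deviation is an assumption of this sketch; the exact variant used by a given program may differ, and library routines such as `scipy.stats.levene` offer several centering options. The data are hypothetical.

```python
def levene_f(groups):
    """F statistic from a one-way ANOVA on |x - group mean|
    (one common form of Levene's test; assumed here for illustration)."""
    devs = []
    for g in groups:
        m = sum(g) / len(g)
        devs.append([abs(x - m) for x in g])
    k = len(devs)
    n_total = sum(len(d) for d in devs)
    grand = sum(sum(d) for d in devs) / n_total
    means = [sum(d) / len(d) for d in devs]
    ssb = sum(len(d) * (m - grand) ** 2 for d, m in zip(devs, means))
    ssw = sum((z - m) ** 2 for d, m in zip(devs, means) for z in d)
    return (ssb / (k - 1)) / (ssw / (n_total - k))

equal_spread = levene_f([[1, 2, 3], [11, 12, 13]])     # same spread
unequal_spread = levene_f([[1, 2, 3], [0, 10, 20]])    # different spread
print(round(equal_spread, 3), round(unequal_spread, 3))
```

Groups with the same spread give an F statistic near 0, while groups with different spreads push it upward.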

A. H0: $\sigma^2_1 = \sigma^2_2 = \sigma^2_3$ versus Ha: at least one $\sigma^2_i$ differs
B. SPSS output (below). Fstat = 0.059 with 2 and 42 df
C. P = 0.943. Very weak evidence against H0 → retain
assumption of homoscedasticity
Levene’s Test – Example (“pets” data)

• Stay descriptive. Use summary statistics and EDA
methods to compare groups.
• Remove outliers, if appropriate (p. 287).
• Mathematically transform the data to compensate
for heteroscedasticity (e.g., a long right tail can be
pulled in with a log transform).
• Use robust non-parametric methods.
Analyzing Groups with Unequal Variance
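
The effect of a log transform on unequal spreads can be illustrated with a short Python sketch. The two groups are hypothetical and were chosen so that spread grows in proportion to the level of the data, which is the situation a log transform handles well.

```python
from math import log
from statistics import stdev

# Hypothetical data: the second group is the first multiplied by 10.
groups = {"low": [1.0, 2.0, 4.0], "high": [10.0, 20.0, 40.0]}

raw_sds = {name: stdev(v) for name, v in groups.items()}
log_sds = {name: stdev([log(x) for x in v]) for name, v in groups.items()}

print(round(raw_sds["high"] / raw_sds["low"], 2))   # 10.0
print(round(log_sds["high"] / log_sds["low"], 2))   # 1.0
```

On the raw scale the standard deviations differ tenfold; on the log scale they match, because multiplying by a constant becomes adding a constant.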

Intro to Nonparametric
Methods

• Many nonparametric procedures are based on
rank transformed data (“rank tests”). Here are
examples:
Intro to Nonparametric Methods

• Let us explore the Kruskal-Wallis test as an example
of a non-parametric test
• The Kruskal-Wallis test is the non-parametric
analogue of one-way ANOVA.
• It does not require Normality or Equal Variance
conditions for inference.
• It is based on rank transformed data and seeing if
the mean ranks in groups differ significantly.
The Kruskal-Wallis Test
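
A minimal Python sketch of the rank-based statistic follows. It pools the data, assigns ranks (midranks for tied values), and computes the usual H statistic from the rank sums; it applies no tie correction, which is a simplification of this sketch, and the data are hypothetical. Library routines such as `scipy.stats.kruskal` handle ties and the P-value.

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H from pooled ranks (midranks for ties, no tie correction)."""
    pooled = sorted(x for g in groups for x in g)
    # Midrank of a value = average of the 1-based positions it occupies.
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2
        i = j
    n_total = len(pooled)
    rank_term = sum(sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups)
    return 12 / (n_total * (n_total + 1)) * rank_term - 3 * (n_total + 1)

h_stat = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])   # hypothetical
print(round(h_stat, 3))                     # 7.2
```

H is compared with a chi-square distribution with k − 1 degrees of freedom when group sizes are not too small.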

• The K-W hypothesis can be stated in terms of the mean
or the median (depending on assumptions made about
population shapes). Let us use the latter.
• Let Mi ≡ the median of population i
• There are k groups
• H0: M1 = M2 = … = Mk
• Ha: at least one Mi differs
The Kruskal-Wallis Test

Alcohol and income. Data from a survey on alcohol
consumption and income are presented.
Kruskal-Wallis, Example

We wish to test whether the means differ significantly
but find graphical and hypothesis testing evidence
that the population variances are unequal.
Kruskal-Wallis, Example
Test of Homogeneity of Variances
Alcohol consumption
Levene Statistic    df1    df2    Sig.
10.874              4      708    .000

A. Hypotheses.
H0: M1 = M2 = M3 = M4 = M5
vs.
Ha: at least one Mi differs
B. Test statistic. Some computer programs use a chi-square
statistic based upon a Normal approximation. SPSS
derives a chi-square statistic = 7.793 with 4 df
(next slide)
Kruskal-Wallis Test, Example, cont.

P = 0.099, providing marginally significant evidence
against H0.
Kruskal-Wallis Test, Example, cont.
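
The reported P-value can be checked from the chi-square statistic. For an even number of degrees of freedom the right-tail area has a closed form, which the Python sketch below uses; that restriction is an assumption of the sketch, and general software handles any df.

```python
from math import exp, factorial

def chi2_right_tail_even_df(x, df):
    """P(chi-square > x) for a positive EVEN df, via the closed form
    exp(-x/2) * sum_{j < df/2} (x/2)^j / j!."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = x / 2
    return exp(-half) * sum(half ** j / factorial(j) for j in range(df // 2))

# Chi-square statistic = 7.793 with 4 df, as reported by SPSS.
p_value = chi2_right_tail_even_df(7.793, 4)
print(round(p_value, 3))                    # 0.099
```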
