1 ANOVA.ppt

Analysis of Continuous
Outcome Data
Comparison of Several Means

The t-tests
A. One sample t-test:
– used to compare the estimate of a sample
with a hypothesized population mean to
see if the sample is significantly different
– Hypothesis: Ho: μ = μ0 Vs HA: μ ≠ μ0
– Where μ0 is the hypothesized mean value

One sample t-test
• Example: the distance covered by marathon
runners until physiological stress occurs, recorded
together with whether or not they used a drug
Mean 15.59
Standard deviation 2.43
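
The one-sample test statistic is the difference between the sample mean and the hypothesized mean, divided by the standard error. A minimal sketch follows; the `distances` values and `mu0=15.0` are hypothetical and are not the data behind the summary figures above.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """Return (t, df) for testing Ho: mu = mu0 against HA: mu != mu0."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)       # sample SD, n - 1 in the denominator
    se = sd / math.sqrt(n)              # standard error of the mean
    return (mean - mu0) / se, n - 1

# Hypothetical distances, for illustration only
distances = [14.2, 16.8, 15.1, 17.3, 13.9, 15.6, 16.0, 14.7]
t_stat, df = one_sample_t(distances, mu0=15.0)
print(f"t = {t_stat:.3f} on {df} degrees of freedom")
```

The computed t is then compared with the critical value of the t distribution on n - 1 degrees of freedom.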

The t-tests
B. Paired t- test
– Each observation in one sample has one and only
one mate in the other sample, so the two samples
are dependent on each other
– For example, the paired measurements can be
Before and after (BA) measurements (e.g. before and after an
intervention),
Repeated measurements (e.g. using digital and analog
apparatus), or
Measurements from two dependent data sources (e.g. data
from the mother and father of a respondent)

Paired t- test
Example: The blood pressure (BP) of 10 mothers was
measured before and after taking a new drug
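
A paired t-test is a one-sample t-test on the within-pair differences. The sketch below assumes hypothetical BP readings for 10 mothers; the original measurements are not given here.

```python
import math
import statistics

def paired_t(before, after):
    """Return (t, df) for Ho: mean of (after - before) = 0."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)      # SD of the differences
    return mean_d / (sd_d / math.sqrt(n)), n - 1

# Hypothetical systolic BP values, for illustration only
before = [140, 152, 138, 147, 160, 155, 149, 143, 158, 151]
after = [135, 147, 136, 140, 151, 150, 146, 141, 149, 147]
t_stat, df = paired_t(before, after)
print(f"t = {t_stat:.3f} on {df} degrees of freedom")
```

A negative t here means the readings fell after the drug; significance is judged against the t distribution on n - 1 degrees of freedom.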

C. Two independent samples t-test
– Used to compare two unrelated or independent
groups
• Assumptions include:
The variances of the dependent variable in the two
populations are equal
The dependent variable is normally distributed
within each population
The data are independent
• Hypothesis: Ho: μt = μc Vs HA: μt ≠ μc ,
– Where μt and μc are the population mean of
treatment and control(placebo) groups respectively
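
Under the equal-variance assumption the two sample variances are pooled into one estimate. The sketch below shows that calculation; the `drug` and `no_drug` values are hypothetical.

```python
import math
import statistics

def pooled_t(x, y):
    """Return (t, df) for the equal-variance two-sample t-test."""
    n1, n2 = len(x), len(y)
    v1, v2 = statistics.variance(x), statistics.variance(y)
    # Pooled variance: weighted average of the two sample variances
    sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return (statistics.mean(x) - statistics.mean(y)) / se, n1 + n2 - 2

# Hypothetical distances for two groups of runners
drug = [18.1, 17.4, 16.9, 19.0, 17.8]
no_drug = [15.2, 14.8, 16.1, 13.9, 15.5]
t_stat, df = pooled_t(drug, no_drug)
print(f"t = {t_stat:.3f} on {df} degrees of freedom")
```

Swapping the two groups only changes the sign of t, which is why the two-sided test looks at its absolute value.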

Two independent samples t-test….
Example:
– Do the marathon runners grouped by their
drug intake status differ in their average
distance coverage before they feel any
physiological stress?

Since tcal = -3.741 < -2.12 (equivalently, P-value = 0.002 < α = 0.05)
Decision: We reject Ho

Extension of two independent samples t-test
• In the case of two independent samples t-
test,
– We have one continuous dependent variable
(interval/ratio data) and;
– One nominal or ordinal independent variable
with only two categories
• What if there are more than two categories
for the independent variable?

Extension of two independent samples t-test
• Are the birth weights of children in different
geographical regions the same?
• Are the responses of patients to different
medications and placebo different?
• Do people in different age groups have
different proportions of body fat?

One-way Analysis of variance
(one-way ANOVA)
• All the above research questions have one
common characteristic:
– Two variables(one categorical and one
quantitative)
• Main question: Are the averages of the
quantitative variable across the groups the same?
• Because there is only one categorical
independent variable, which has two or
more categories (groups), the analysis is
called one-way ANOVA

One-way Analysis of variance (1-way ANOVA)
• Also called Completely Randomized
Design
• Experimental units (subjects) are assigned
randomly to treatments/groups.
• Here subjects are assumed to be
homogeneous

1-way ANOVA…
• When the concern is to compare the means of
two normal distributions, the two-sample
t-test for independent samples is appropriate.
• Frequently, the means of more than two
distributions are compared
• The one-way analysis of variance is suitable for
deciding whether differences exist between the
means of more than two groups

One-Way ANOVA…
• ANOVA is a generalization of the Student’s t-test
• The null hypothesis is
Ho: μ1 = μ2 = μ3 = ... = μk
or, in words: there is no difference between two or more population
means (usually at least three); equivalently, there is no difference
between a number of treatments
• HA: at least one group mean is different
• where μk is the true mean of the kth group

One-Way ANOVA…
• Allows us to test whether the mean of at
least one of the groups differs significantly
from some other groups
• The result is sometimes reported as an F statistic, and the
design is also called a Completely Randomized Design (CRD)

The hypothesis to be tested for k means (k > 2)
Ho: μ1 = μ2 = μ3 = ... = μk
HA: At least one population mean is not
equal to another population mean
• If we reject the null hypothesis, we conclude that
at least one population mean is not equal to
another population mean
• Other methods are needed to determine which
population means are different

Definitions
• Response variable is the variable of interest to
be measured in the experiment
• A treatment is something that researchers
administer to experimental units
– E.g. a doctor treats a patient with a skin condition with
different creams to see which is most effective
• Treatment level refers to the amount, magnitude,
or category of a treatment
• Factor is a variable whose levels are set by the
experimenter

Definitions
• An experimental unit is the object on which the
response to the factors is observed or measured
• One-Way Analysis of Variance
–Only one source of variation, or factor, is
investigated
• Two-Way Analysis of Variance
–Two factors are analyzed in the study

Examples
One-way analysis of variance
Social class levels (I to V) and blood
pressure
Type of sickle cell anaemia (3 Types) and
haemoglobin levels
Examine lipid levels between different ethnic
groups

Why not just use lots of t-tests?
• Practically time consuming
• Conducting multiple t-tests can also lead to severe
inflation of the Type I error rate (false positives) and
is not recommended
• ANOVA, by contrast, tests for differences
among several means without inflating the Type I
error rate, which would otherwise lead to false significant findings
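
The inflation can be made concrete. The sketch below assumes the tests are independent, which is only an approximation for pairwise t-tests that share groups; under that assumption the chance of at least one false positive in m tests at level α is 1 - (1 - α)^m.

```python
def n_pairs(k):
    """Number of pairwise comparisons among k groups."""
    return k * (k - 1) // 2

def familywise_error(alpha, m):
    """P(at least one Type I error) in m independent tests at level alpha."""
    return 1 - (1 - alpha) ** m

for k in (3, 4, 5):
    m = n_pairs(k)
    print(f"{k} groups: {m} t-tests, familywise error = {familywise_error(0.05, m):.3f}")
```

With only 3 groups the error rate is already close to 0.14 rather than 0.05.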

One way ANOVA…
• ANOVA uses data from all groups at a time to
estimate standard errors
– Avoids multiple t-tests and spurious significant results

Assumptions of One Way ANOVA
• The data are normally distributed or the samples
have come from normally distributed populations
and are independent.
• The variance is the same in each group to be
compared (equal variance).
• Moderate departures from normality may be
safely ignored, but the effect of unequal
standard deviations may be serious.
• In the latter case, transforming the data may be
useful.

Reminder of variance around a mean
$$\text{Variance} = \frac{\sum (x - \bar{x})^2}{n - 1}$$

Notations
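$X_{ij}$: observation of the $i$th factor level for the $j$th individual
$X_{i.} = \sum_{j=1}^{n_i} X_{ij}$: total of the observations for the $i$th factor level
$\bar{X}_{..} = \frac{1}{n_T} \sum_{i=1}^{k} \sum_{j=1}^{n_i} X_{ij}$: overall mean for all responses
$\bar{X}_{i.} = \frac{1}{n_i} \sum_{j=1}^{n_i} X_{ij}$: sample mean of the $i$th factor level
Here $k$ is the number of factor levels, $n_i$ the number of observations at level $i$, and $n_T = \sum_{i=1}^{k} n_i$ the total number of observations.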

Total Sum of squares
Total sum of squares (SST) – the
numerator of the equation for the
variance
$$SST = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (X_{ij} - \bar{X}_{..})^2$$

Partitioning the SST
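$$
\begin{aligned}
SST &= \sum_{i=1}^{k} \sum_{j=1}^{n_i} (X_{ij} - \bar{X}_{..})^2 \\
    &= \sum_{i=1}^{k} \sum_{j=1}^{n_i} \left( (X_{ij} - \bar{X}_{i.}) + (\bar{X}_{i.} - \bar{X}_{..}) \right)^2 \\
    &= \sum_{i=1}^{k} \sum_{j=1}^{n_i} (X_{ij} - \bar{X}_{i.})^2 + \sum_{i=1}^{k} n_i (\bar{X}_{i.} - \bar{X}_{..})^2 \\
SST &= SSW + SSB
\end{aligned}
$$
where
$$SSB = \sum_{i=1}^{k} n_i (\bar{X}_{i.} - \bar{X}_{..})^2, \qquad SSW = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (X_{ij} - \bar{X}_{i.})^2$$
The cross-product term drops out because $\sum_{j=1}^{n_i} (X_{ij} - \bar{X}_{i.}) = 0$ within every group.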

One-Way ANOVA
• We test the equality of means among groups
by using the variance
• The difference between variation within
groups and variation between groups may
help us to compare the means
• If the two are about equal, the observed
differences among the group means are likely due
to chance rather than a real difference
• Note that:
• Total Variability = variability between +
Variability within
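
The identity Total = Between + Within can be checked numerically. The sketch below uses three small hypothetical groups and computes each sum of squares directly from its definition.

```python
import statistics

def sums_of_squares(groups):
    """Return (SST, SSB, SSW) for a list of groups of observations."""
    all_values = [x for g in groups for x in g]
    grand_mean = statistics.mean(all_values)
    # Total: every observation against the overall mean
    sst = sum((x - grand_mean) ** 2 for x in all_values)
    # Between: each group mean against the overall mean, weighted by group size
    ssb = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Within: every observation against its own group mean
    ssw = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    return sst, ssb, ssw

# Hypothetical data, for illustration only
groups = [[4.0, 5.0, 6.0], [7.0, 8.0, 9.0, 10.0], [1.0, 2.0, 3.0]]
sst, ssb, ssw = sums_of_squares(groups)
print(f"SST = {sst}, SSB = {ssb}, SSW = {ssw}, SSB + SSW = {ssb + ssw}")
```

Here most of the total variability lies between the groups, which is the pattern that produces a large F.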

Example…
Using the F distribution table, F(2, 19, 0.05) = 3.52. Since Fcalc = 3.71 > Ftab = 3.52, we reject Ho.

Degrees of Freedom
What is it?
• Statisticians use the terms "degrees of
freedom" to describe the number of values
in the final calculation of a statistic that are
free to vary
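
For one-way ANOVA these counts can be written out explicitly. With $k$ groups and $n_T$ observations in total (symbols chosen here for illustration), the usual bookkeeping is:

$$df_{between} = k - 1, \qquad df_{within} = n_T - k, \qquad df_{total} = n_T - 1 = df_{between} + df_{within}$$

For example, 3 groups with 22 observations in total give 2, 19, and 21 degrees of freedom.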

In SPSS
• Analyze – Compare means – One way Anova

Analysis of Variance table in SPSS
Descriptives (rcfol)
Group | N | Mean | Std. Deviation | Std. Error | 95% CI for Mean, Lower Bound | 95% CI for Mean, Upper Bound | Minimum | Maximum
1.00 | 8 | 316.6250 | 58.71709 | 20.75963 | 267.5363 | 365.7137 | 243.00 | 392.00
2.00 | 9 | 256.4444 | 37.12180 | 12.37393 | 227.9101 | 284.9788 | 206.00 | 309.00
3.00 | 5 | 278.0000 | 33.75648 | 15.09636 | 236.0858 | 319.9142 | 241.00 | 328.00
Total | 22 | 283.2273 | 51.28439 | 10.93387 | 260.4890 | 305.9655 | 206.00 | 392.00

ANOVA (rcfol)
Source | Sum of Squares | df | Mean Square | F | Sig.
Between Groups | 15515.766 | 2 | 7757.883 | 3.711 | .044
Within Groups | 39716.097 | 19 | 2090.321 | |
Total | 55231.864 | 21 | | |
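
The ANOVA table can be rebuilt from the group sizes, means, and standard deviations in the Descriptives output. The sketch below does so; the function name and dictionary keys are illustrative and are not SPSS output.

```python
# (n, mean, standard deviation) for each group, from the Descriptives table
groups = [
    (8, 316.6250, 58.71709),
    (9, 256.4444, 37.12180),
    (5, 278.0000, 33.75648),
]

def anova_from_summary(groups):
    """One-way ANOVA quantities computed from per-group summary statistics."""
    k = len(groups)
    n_total = sum(n for n, _, _ in groups)
    grand_mean = sum(n * m for n, m, _ in groups) / n_total
    ssb = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups)
    ssw = sum((n - 1) * sd ** 2 for n, _, sd in groups)   # (n - 1) * variance
    df_between, df_within = k - 1, n_total - k
    msb, msw = ssb / df_between, ssw / df_within
    return {"ssb": ssb, "ssw": ssw, "df_between": df_between,
            "df_within": df_within, "msb": msb, "msw": msw, "F": msb / msw}

result = anova_from_summary(groups)
print(f"SSB = {result['ssb']:.1f}, SSW = {result['ssw']:.1f}, F = {result['F']:.3f}")
```

The values agree with the SPSS table up to rounding of the inputs.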

Conclusion
• At least one of the group means differs from the
others by more than we would expect by
sampling variation

Pair-wise comparisons of group means
post hoc tests or multiple comparisons
• The ANOVA test tells us only whether there is a
statistically significant difference among the group
means, but
• It doesn't tell us which particular groups are
significantly different
• To identify them, we use either a priori (pre-
planned) or post hoc tests if the F-test is significant,
p<0.05

Multiple comparison testing
• Bonferroni test and Bonferroni t test
• Tukey Test
• Dunnett’s test for multiple comparisons against a
single control group
• Student-Newman-Keuls Test for all pairwise
comparisons
• Duncan
There is debate in the statistics world about which
method is best

Bonferroni method or Modified t-test (Steps)
I. Find tcalc for the pairs of groups of interest
(to be compared)
II. The modified t-test is based on the
pooled estimate of variance from all the
groups (which is the residual variance in
the ANOVA table), not just from the pair
being considered.

Bonferroni method or Modified t-test (Steps)
III. If we perform k pairwise comparisons, then
we should multiply the P value obtained
from each test by k; that is, we calculate
P' = kP with the restriction that P' cannot
exceed 1.
Where k is the number of
possible comparisons
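
Step III amounts to one line of arithmetic per comparison. The sketch below applies it to a list of unadjusted P values; the function name is illustrative.

```python
def bonferroni_adjust(p_values):
    """Multiply each P value by the number of comparisons, capping at 1."""
    k = len(p_values)
    return [min(1.0, k * p) for p in p_values]

# Unadjusted P values for three pairwise comparisons
raw = [0.014, 0.1625, 0.425]
adjusted = bonferroni_adjust(raw)
for p, p_adj in zip(raw, adjusted):
    print(f"P = {p:.4f} -> P' = {p_adj:.4f}")
```

The cap matters for the last value, where 3 x 0.425 would otherwise exceed 1.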

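Bonferroni method or Modified t-test…
• Returning to the red cell folate data given
above, the residual standard deviation is
√2090.321 = 45.72.
(a) Comparing groups I and II
t = (316.6 - 256.4) / (45.72 x √(1/8+1/9))
= 60.2/22.2 = 2.71 on 19 degrees of freedom.
The corresponding P-value = 0.014 and the
corrected P value is P' = 0.014x3 = 0.042
Groups I and II are different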

Bonferroni method or Modified t-test
(b) Comparing groups I and III
The corresponding P value = 0.1625 and
The corrected P value is P' = 0.1625x3
= 0.4875
Group I and III are not different

Bonferroni method or Modified t-test
(c) Comparing Groups II and III
t = (278 - 256.4) / (45.72 x √(1/5+1/9))
= 21.6/25.5 = 0.85 on 19 degrees of freedom.
The corresponding P value = 0.425 and the
corrected P value is P' = 1.00
Groups II and III are not different

Therefore, the main explanation for the
difference between the groups that was
identified in the ANOVA is the
difference between groups I and II.
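
The modified t statistics can be recomputed from the group sizes, group means, and the within-groups mean square. The sketch below stops at t and its standard error, because the standard library has no t-distribution P values; the labels I, II, III follow the text.

```python
import math

# Residual SD = square root of the within-groups mean square (ANOVA table)
residual_sd = math.sqrt(2090.321)

# Group label -> (n, mean)
groups = {"I": (8, 316.6250), "II": (9, 256.4444), "III": (5, 278.0000)}

def modified_t(a, b):
    """Return (t, standard error) for groups a and b using the pooled residual SD."""
    n1, m1 = groups[a]
    n2, m2 = groups[b]
    se = residual_sd * math.sqrt(1 / n1 + 1 / n2)
    return (m1 - m2) / se, se

for a, b in [("I", "II"), ("I", "III"), ("III", "II")]:
    t, se = modified_t(a, b)
    print(f"{a} vs {b}: t = {t:.2f}, SE = {se:.2f}")
```

Each t is referred to the t distribution on the residual degrees of freedom (19 here), not on the degrees of freedom of the two groups alone.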

Multiple Comparisons
Dependent Variable: rcfol
Bonferroni
(I) treatg | (J) treatg | Mean Difference (I-J) | Std. Error | Sig. | 95% CI Lower Bound | 95% CI Upper Bound
1.00 | 2.00 | 60.18056* | 22.21594 | .042 | 1.8614 | 118.4998
1.00 | 3.00 | 38.62500 | 26.06443 | .464 | -29.7969 | 107.0469
2.00 | 1.00 | -60.18056* | 22.21594 | .042 | -118.4998 | -1.8614
2.00 | 3.00 | -21.55556 | 25.50141 | 1.000 | -88.4995 | 45.3884
3.00 | 1.00 | -38.62500 | 26.06443 | .464 | -107.0469 | 29.7969
3.00 | 2.00 | 21.55556 | 25.50141 | 1.000 | -45.3884 | 88.4995
*. The mean difference is significant at the .05 level.
Example of Bonferroni t statistic for
multiple comparisons in SPSS

ANOVA - a recapitulation
• ANOVA is a parametric test, examining whether
the means differ between 2 or more populations
• It generates a test statistic F, which can be
thought of as a signal-to-noise ratio. Thus large
values of F indicate a high degree of pattern
within the data and imply rejection of Ho
• It is thus similar to the t test; in fact ANOVA on
2 groups is equivalent to a t test [F = t²]
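
For two groups the F statistic equals the square of the equal-variance two-sample t statistic. The sketch below checks this on hypothetical data, computing both statistics from their definitions.

```python
import math
import statistics

def pooled_t(x, y):
    """Equal-variance two-sample t statistic."""
    n1, n2 = len(x), len(y)
    sp2 = ((n1 - 1) * statistics.variance(x)
           + (n2 - 1) * statistics.variance(y)) / (n1 + n2 - 2)
    return (statistics.mean(x) - statistics.mean(y)) / math.sqrt(sp2 * (1 / n1 + 1 / n2))

def one_way_f(groups):
    """One-way ANOVA F statistic: between mean square over within mean square."""
    all_values = [v for g in groups for v in g]
    grand_mean = statistics.mean(all_values)
    ssb = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    ssw = sum((v - statistics.mean(g)) ** 2 for g in groups for v in g)
    return (ssb / (len(groups) - 1)) / (ssw / (len(all_values) - len(groups)))

# Hypothetical data, for illustration only
x = [12.1, 14.3, 13.8, 15.0, 12.9]
y = [10.2, 11.7, 9.8, 12.4, 10.9, 11.1]
t = pooled_t(x, y)
f = one_way_f([x, y])
print(f"t = {t:.4f}, t squared = {t * t:.4f}, F = {f:.4f}")
```

The agreement holds for any two groups, so the two tests give the same P value.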

One way ANOVA’s limitation
• This technique is only applicable when there is one
treatment factor
• Note that this single factor can have 3, 4, …,
many levels
• Thus a nutrition trial on children's weight gain with 4
different feeding styles could be analyzed this way,
but a trial of BOTH nutrition and mothers' health
status could not
