University of Santo Tomas
College of Fine Arts and Design
One Way ANOVA
A research paper submitted in partial fulfillment of the requirements for the course
College Statistics
Submitted by:
Cuerpo, David
Dela Cruz, Jose
Reynon, Ralph
Submitted to:
Prof. Crisencio M. Paner
March, 2012
Manila, Philippines
One-way ANOVA, or one-way analysis of variance, is a hypothesis-testing technique in
statistics used to test the equality of two or more population (or treatment) means by examining
the variances of the samples taken. ANOVA allows one to determine whether the
differences between the samples are simply due to random error (sampling error) or whether
there is a systematic treatment effect that causes the mean in one group to differ from the mean
in another. ANOVA is based on comparing the variation between the samples
to the variation within each sample. If the between variation is much larger than the within
variation, that is evidence that the population means are not all equal. If the between and within variations
are approximately the same size, there is no significant difference between the sample
means.
In this photocopied article from the Journal of Chemical Education, one-way
ANOVA is used to determine whether the three standardization methods
(external calibration curve, standard addition and internal standard) are statistically
different in determining the concentration of the three paraffin analytes.
Analysis of Results: One-way ANOVA was performed using Microsoft Excel's statistical
macro ANOVA. The null hypothesis is H0: µ1 = µ2 = µ3, that is, the three methods give the
same mean concentration. If the results do not support the
null hypothesis, the Tukey multiple-comparison method will be used to obtain
confidence intervals for the differences between means to show where the experimental
design or execution failed.
Table 1. Data Matrix for One-Way ANOVA
Analyte   Standard Curve (ppm)   Internal Standard (ppm)   Standard Addition (ppm)
C18       179.3                  176.4                     135.3
C20       169.4                  162.7                     142.8
C22       176.1                  172.2                     151.1
Table 2. ANOVA Summary
Groups              Count   Sum     Average   Variance
Standard Curve      3       524.8   174.933   25.523
Internal Standard   3       511.3   170.433   49.263
Standard Addition   3       429.2   143.067   62.463
Table 3. ANOVA Results
Source of Variation   SS       df   MS       F        P value   F(crit)
Between Groups        1784.7   2    892.33   19.505   0.0024    5.143
Within Groups         274.5    6    45.75
Total                 2059.2   8
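Note: The null hypothesis is rejected because F (19.505) > F(crit) (5.143) and the P value (0.0024) < α = 0.05.
The results in Table 3 therefore show that the three standardization methods do not give statistically
equivalent means, which points to an error in the experimental design or execution.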
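The figures in Tables 2 and 3 can be recomputed from the raw data in Table 1. The following is a minimal Python sketch using only the standard library; it is not the Excel macro used in the article, and the variable names are illustrative.

```python
from statistics import mean, variance

# Table 1 data (ppm), one list per standardization method
groups = {
    "Standard Curve": [179.3, 169.4, 176.1],
    "Internal Standard": [176.4, 162.7, 172.2],
    "Standard Addition": [135.3, 142.8, 151.1],
}

k = len(groups)                                  # number of groups
n_total = sum(len(g) for g in groups.values())   # total sample size N
grand_mean = sum(sum(g) for g in groups.values()) / n_total

# Between-group and within-group sums of squares
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
ss_within = sum((len(g) - 1) * variance(g) for g in groups.values())

# Mean squares and the F statistic
ms_between = ss_between / (k - 1)
ms_within = ss_within / (n_total - k)
f_stat = ms_between / ms_within

print(f"SS(B) = {ss_between:.1f}, SS(W) = {ss_within:.1f}, F = {f_stat:.3f}")
```

Run as written, the sketch should agree with Table 3 to the precision shown there: SS between groups of about 1784.7, SS within groups of 274.5 and F of about 19.50.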
Assumptions
1) The populations from which the samples were obtained must be normally or
approximately normally distributed.
2) The samples must be independent.
3) The variances of the populations must be equal.
Hypotheses - The null hypothesis is that all population means are equal; the
alternative hypothesis is that at least one mean is different. In the following, lowercase
letters apply to the individual samples and capital letters apply to the entire set
collectively. That is, n is one of many sample sizes, but N is the total sample size.
Grand Mean - The grand mean of a set of samples is the total of all the
data values divided by the total sample size. This requires that you have
all of the sample data available to you, which is usually the case, but not
always. It turns out that all that is necessary to perform a one-way analysis of
variance are the number of samples, the sample means, the sample variances, and the
sample sizes.
Another way to find the grand mean is to find the weighted average of
the sample means. The weight applied is the sample size.
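As a small check of the two routes to the grand mean, the sketch below uses the raw values of Table 1 and the counts and averages of Table 2. It is only an illustration; the variable names are made up for this example.

```python
# Sample sizes and sample means taken from Table 2
sizes = [3, 3, 3]
means = [174.933, 170.433, 143.067]

# All nine raw values from Table 1
values = [179.3, 169.4, 176.1, 176.4, 162.7, 172.2, 135.3, 142.8, 151.1]

# Route 1: total of all data values divided by the total sample size
grand_mean_raw = sum(values) / len(values)

# Route 2: weighted average of the sample means, weighted by sample size
grand_mean_weighted = sum(n * m for n, m in zip(sizes, means)) / sum(sizes)

print(round(grand_mean_raw, 3), round(grand_mean_weighted, 3))
```

The two results agree up to the rounding of the averages in Table 2, which is why the raw data are not strictly needed.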
Total Variation - The total variation (not variance) is
the sum of the squares of the differences of each
data value from the grand mean.
There are two sources of variation: the between group variation and the within group variation. The whole idea
behind the analysis of variance is to compare the between group variance with the
within group variance by taking their ratio. If the variance due to the differences between the samples is
much larger than the variance that appears within each group, that is
evidence that the means are not all the same.
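In symbols, the split of the total variation can be written as follows. The notation is an assumption made here for illustration: x_{ij} is the j-th value in sample i, n_i is the size of sample i, \bar{x}_{GM} is the grand mean, and SS(T) is a label for the total variation that parallels the SS(B) and SS(W) labels used for the two parts.

```latex
SS(T) = \sum_{i=1}^{k} \sum_{j=1}^{n_i} \left( x_{ij} - \bar{x}_{GM} \right)^2
      = SS(B) + SS(W)
```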
Between Group Variation - The variation due to the
differences between the samples is denoted SS(B) for
Sum of Squares Between groups. If the sample means are close to each other (and
therefore to the grand mean), this will be small. There are k samples involved with one
data value for each sample (the sample mean), so there are k-1 degrees of freedom.
The variance due to the differences between the samples is denoted MS(B) for Mean
Square Between groups. This is the between group variation divided by its degrees of
freedom. It is also denoted by s_b^2.
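Written out, the between-group quantities take the form below. The symbols n_i and \bar{x}_i (size and mean of sample i) and \bar{x}_{GM} (grand mean) are assumed here for illustration and are not shown in the text above.

```latex
SS(B) = \sum_{i=1}^{k} n_i \left( \bar{x}_i - \bar{x}_{GM} \right)^2,
\qquad
MS(B) = \frac{SS(B)}{k - 1}
```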
Within Group Variation - The variation due to differences within
individual samples is denoted SS(W) for Sum of Squares Within
groups. Each sample is considered independently; no interaction between samples is
involved. The degrees of freedom is equal to the sum of the individual degrees of
freedom for each sample. Since each sample has degrees of freedom equal to one less
than its sample size, and there are k samples, the total degrees of freedom is k less
than the total sample size: df = N - k.
The variance due to the differences within individual samples is denoted MS(W) for
Mean Square Within groups. This is the within group variation divided by its degrees of
freedom. It is also denoted by s_w^2. It is the weighted average of the variances (weighted
with the degrees of freedom).
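The within-group quantities can be written in the same way. Here s_i^2 is assumed to stand for the variance of sample i and n_i for its size; dividing by N - k = \sum (n_i - 1) is what makes MS(W) a weighted average of the sample variances.

```latex
SS(W) = \sum_{i=1}^{k} (n_i - 1)\, s_i^2,
\qquad
MS(W) = \frac{SS(W)}{N - k}
      = \frac{\sum_{i=1}^{k} (n_i - 1)\, s_i^2}{\sum_{i=1}^{k} (n_i - 1)}
```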
F test statistic - Recall that an F variable is the ratio of two independent chi-square
variables divided by their respective degrees of freedom. Also recall
that the F test statistic is the ratio of two sample variances. That is exactly
what we have here: the F test statistic is found by dividing
the between group variance by the within group variance. The degrees of
freedom for the numerator are the degrees of freedom for the between
group (k-1) and the degrees of freedom for the denominator are the degrees of freedom
for the within group (N-k).
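Because only the sample sizes, means and variances are needed, the F statistic can be sketched as a small function of those summary values. The function name f_from_summary is hypothetical, and the inputs are the Count, Average and Variance columns of Table 2.

```python
def f_from_summary(sizes, means, variances):
    """Return (F, df_between, df_within) from per-sample summary statistics."""
    k = len(sizes)
    n_total = sum(sizes)
    # Grand mean as the size-weighted average of the sample means
    grand_mean = sum(n * m for n, m in zip(sizes, means)) / n_total
    # Between-group and within-group sums of squares
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means))
    ss_within = sum((n - 1) * v for n, v in zip(sizes, variances))
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Count, Average and Variance columns of Table 2
f_stat, df_b, df_w = f_from_summary(
    sizes=[3, 3, 3],
    means=[174.933, 170.433, 143.067],
    variances=[25.523, 49.263, 62.463],
)
print(round(f_stat, 2), df_b, df_w)
```

The result should match the F value in Table 3 up to the rounding of the summary statistics, with 2 and 6 degrees of freedom.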
Other Sample Problems:
A manager wishes to determine whether the mean times required to complete a certain task differ for
the three levels of employee training. He randomly selected 10 employees with each of the three levels
of training (Beginner, Intermediate and Advanced). Do the data provide sufficient evidence to indicate
that the mean times required to complete a certain task differ for at least two of the three levels of
training? The data is summarized in the table.
Level of Training   n    Mean   s²
Advanced            10   24.2   21.54
Intermediate        10   27.1   18.64
Beginner            10   30.2   17.76
Ha: The mean times required to complete a certain task differ for at least two of the three levels of
training.
Ho: The mean times required to complete a certain task do not differ among the three levels of training
(µB = µI = µA).
Assumptions: The samples were drawn independently and randomly from the three populations. The
time required to complete the task is normally distributed for each of the three levels of training. The
populations have equal variances.
Test Statistic: F = MS(Treatments) / MS(Error)
RR: F > F(0.05; 2, 27) = 3.35 or p-value < 0.05
Calculations:
SS(Treatments) = 10(24.2 - 27.1667)² + 10(27.1 - 27.1667)² + 10(30.2 - 27.1667)² = 180.067
SS(Error) = 9(21.54) + 9(18.64) + 9(17.76) = 521.46
Source       df   SS        MS       F
Treatments   2    180.067   90.033   4.662
Error        27   521.46    19.313
Total        29   701.527
Decision: Reject Ho.
Conclusion: There is sufficient evidence to indicate that the mean times required to complete a
certain task differ for at least two of the three levels of training.
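The decision can be checked numerically without a table. The sketch below assumes a significance level of 0.05 (the level implied by the t.0083 value used in the pairwise comparisons) and relies on a closed form for the F distribution that holds only when the numerator has 2 degrees of freedom, as it does here.

```python
# Values taken from the ANOVA table of the training example
ms_treatments = 90.033
ms_error = 19.313
df_error = 27
alpha = 0.05  # assumed significance level

f_stat = ms_treatments / ms_error

# Closed form valid only for 2 numerator degrees of freedom:
# P(F > f) = (1 + 2 f / df_error) ** (-df_error / 2)
p_value = (1 + 2 * f_stat / df_error) ** (-df_error / 2)

# Inverting the same formula gives the critical value at level alpha
f_crit = (df_error / 2) * (alpha ** (-2 / df_error) - 1)

print(f"F = {f_stat:.3f}, p = {p_value:.4f}, F(crit) = {f_crit:.3f}")
print("Reject Ho" if f_stat > f_crit else "Do not reject Ho")
```

Under these assumptions the p-value comes out near 0.018 and the critical value near 3.35, so F = 4.662 falls in the rejection region, in line with the decision stated above.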
Which pairs of means differ?
The Bonferroni Test is done for all possible pairs of means.
Decision rule: Reject Ho if the interval does not contain 0.
c = number of pairs = p(p-1)/2 = 3(2)/2 = 3, where p = 3 is the number of means
t.0083 = 2.554, where .0083 = α/(2c) = .05/6 and the degrees of freedom are 27
(This value is not in the t table; it was obtained from a computer program.)
Since t.010 < t.0083 < t.005 (2.473 < t.0083 < 2.771), use t.005 when using a table. If you reject the null
hypothesis with t = 2.771, you will also reject it with t.0083.
There is sufficient evidence to indicate that the mean response time for the advanced level of training is
less than the mean response time for the beginning level. There is not sufficient evidence to indicate
that the mean response time for the intermediate level differs from the mean response time of either of
the other two levels.
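The pairwise conclusions above can be checked with a short sketch. It assumes the usual Bonferroni interval for a difference of two means, (mean_i - mean_j) ± t · sqrt(MS(Error) · (1/n_i + 1/n_j)), which is not written out in the text; the t value, MS(Error) and the sample means are the ones quoted in the example.

```python
from itertools import combinations
from math import sqrt

means = {"Advanced": 24.2, "Intermediate": 27.1, "Beginner": 30.2}
n = 10              # employees per training level
ms_error = 19.313   # MS for Error from the ANOVA table
t_crit = 2.554      # t.0083 with 27 df, as quoted in the text

# Half-width shared by all pairs because the sample sizes are equal
margin = t_crit * sqrt(ms_error * (1 / n + 1 / n))

# One interval per pair of training levels
intervals = {}
for a, b in combinations(means, 2):
    diff = means[a] - means[b]
    intervals[(a, b)] = (diff - margin, diff + margin)

for (a, b), (low, high) in intervals.items():
    verdict = "differ" if low > 0 or high < 0 else "no evidence of a difference"
    print(f"{a} - {b}: ({low:.2f}, {high:.2f}) -> {verdict}")
```

With these inputs the half-width is about 5.02, so only the Advanced - Beginner interval excludes 0, which agrees with the conclusion stated above.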
REFERENCES
Barrows, R. D. (2007). Quantitative Comparison of Three Standardization Methods Using a One-Way
ANOVA for Multiple Mean Comparisons. Journal of Chemical Education, 84, 839-841.
http://cba.ualr.edu/smartstat/topics/anova/example.pdf (accessed 3/17/2013, 10:00 AM)
http://www2.fiu.edu/~howellip/exanova.htm (accessed 3/17/2013, 11:00 AM)
http://en.wikipedia.org/wiki/One-way_analysis_of_variance (accessed 3/16/2013, 8:00 AM)