Anova ONE WAY

Analysis of variance: one-way ANOVA

• It is used to estimate and compare the effects of different treatments on a response variable.
• We do this by estimating and comparing the treatment means.
• It applies when we compare more than two groups based on one factor.
E.g. suppose a car oil company would like to compare the effects of three oil types, A, B and C, on the mileage of midsize cars, and the company randomly selects 5 cars for each oil type from 1,000 cars.
There is only one factor: oil type.

Assumptions of one-way ANOVA
1. Constant variance: the populations of values of the response variable associated with the treatments have equal variances.
2. Normality: the populations of values of the response variable associated with the treatments all have normal distributions.
3. Independence: the samples of experimental units associated with the treatments are randomly selected, independent samples.

Testing for significant differences between treatment means
• Suppose the oil company obtains the following results.
• We have n = 15 observations and p = 3 treatments.

Observation   Oil type A   Oil type B   Oil type C
1             34.0         35.3         33.3
2             35.0         36.5         34.0
3             34.3         36.4         34.7
4             35.5         37.0         33.0
5             35.8         37.6         34.9
Sum           174.6        182.8        169.9
Mean          34.92        36.56        33.98

Overall mean of all 15 observations: 35.153
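The sums and means in the table can be checked with a few lines of Python. This is only a sketch: the dictionary `mileage` and the helper names `treatment_means` and `overall_mean` are illustrative choices, not part of the original slides.

```python
# Mileage observations from the table, keyed by oil type.
mileage = {
    "A": [34.0, 35.0, 34.3, 35.5, 35.8],
    "B": [35.3, 36.5, 36.4, 37.0, 37.6],
    "C": [33.3, 34.0, 34.7, 33.0, 34.9],
}


def treatment_means(data):
    """Return the sample mean of each treatment."""
    return {name: sum(values) / len(values) for name, values in data.items()}


def overall_mean(data):
    """Return the mean of all observations pooled together."""
    pooled = [x for values in data.values() for x in values]
    return sum(pooled) / len(pooled)


means = treatment_means(mileage)
grand = overall_mean(mileage)
for name, mean in means.items():
    print(f"oil type {name}: mean = {mean:.2f}")
print(f"overall mean = {grand:.3f}")
```

The three treatment means and the overall mean are the quantities that the sums of squares in the following steps are built from.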

Step 1: Determine the null hypothesis and alternative hypothesis
H0: µ1 = µ2 = µ3
Ha: at least two of µ1, µ2, µ3 differ
To test H0, compare the between-treatment variability to the within-treatment variability.
• Between-treatment variability is the variability of the sample means from sample to sample.
• Within-treatment variability is the variability of the measurements (that is, the values) within each sample.

• Step 2: To compare the between-treatment and within-treatment variability numerically, we define sums of squares and mean squares.
The treatment sum of squares (SST) measures the variability of the sample treatment means:
$SST = \sum_{i=1}^{p} n_i (\bar{x}_i - \bar{x})^2$
So SST = 5(34.92 - 35.153)^2 + 5(36.56 - 35.153)^2 + 5(33.98 - 35.153)^2
= 17.0493
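As a check on the SST arithmetic, the formula can be evaluated directly from the raw data. This is a minimal sketch; the function name `treatment_sum_of_squares` is hypothetical and chosen only for illustration.

```python
# Mileage observations for the three oil types.
mileage = {
    "A": [34.0, 35.0, 34.3, 35.5, 35.8],
    "B": [35.3, 36.5, 36.4, 37.0, 37.6],
    "C": [33.3, 34.0, 34.7, 33.0, 34.9],
}


def treatment_sum_of_squares(data):
    """SST: sum over treatments of n_i * (treatment mean - overall mean)^2."""
    pooled = [x for values in data.values() for x in values]
    grand = sum(pooled) / len(pooled)
    return sum(
        len(values) * (sum(values) / len(values) - grand) ** 2
        for values in data.values()
    )


sst = treatment_sum_of_squares(mileage)
print(f"SST = {sst:.4f}")
```

Each term weights the squared distance of a treatment mean from the overall mean by that treatment's sample size, so SST grows as the treatment means spread apart.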

• Step 2 (continued): The error sum of squares (SSE) measures the within-treatment variability:
$SSE = \sum_{i=1}^{p} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2$
We compute SSE by summing the squared differences between each observed value and its own treatment mean.
So SSE = (34.0 - 34.92)^2 + (35.0 - 34.92)^2 + … + (34.9 - 33.98)^2
= 8.028
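The SSE calculation can be sketched the same way, summing squared deviations within each sample. The function name `error_sum_of_squares` is an illustrative assumption.

```python
# Mileage observations for the three oil types.
mileage = {
    "A": [34.0, 35.0, 34.3, 35.5, 35.8],
    "B": [35.3, 36.5, 36.4, 37.0, 37.6],
    "C": [33.3, 34.0, 34.7, 33.0, 34.9],
}


def error_sum_of_squares(data):
    """SSE: squared deviation of every value from its own treatment mean."""
    total = 0.0
    for values in data.values():
        mean = sum(values) / len(values)
        total += sum((x - mean) ** 2 for x in values)
    return total


sse = error_sum_of_squares(mileage)
print(f"SSE = {sse:.3f}")
```

Unlike SST, this quantity never looks at the overall mean: it only measures how much the cars within one oil type differ from each other.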

• Step 3: We define a sum of squares that measures the total amount of variability in the observed values of the response, the total sum of squares (SSTO):
SSTO = SST + SSE
= 17.0493 + 8.028 = 25.0773
• Using the treatment and error sums of squares, we next define two mean squares:
• The treatment mean square is MST = SST/(p-1)
= 17.0493/(3-1)
= 8.525
• The error mean square is MSE = SSE/(n-p)
= 8.028/(15-3)
= 0.669
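The following sketch checks that the total sum of squares computed directly from the data matches SST + SSE, and then forms the two mean squares. All variable names are illustrative.

```python
# Mileage observations for the three oil types.
mileage = {
    "A": [34.0, 35.0, 34.3, 35.5, 35.8],
    "B": [35.3, 36.5, 36.4, 37.0, 37.6],
    "C": [33.3, 34.0, 34.7, 33.0, 34.9],
}

pooled = [x for values in mileage.values() for x in values]
n = len(pooled)    # total number of observations (15)
p = len(mileage)   # number of treatments (3)
grand = sum(pooled) / n

# Between-treatment and within-treatment sums of squares.
sst = sum(len(v) * (sum(v) / len(v) - grand) ** 2 for v in mileage.values())
sse = sum((x - sum(v) / len(v)) ** 2 for v in mileage.values() for x in v)

# Total sum of squares, computed directly from the overall mean.
ssto = sum((x - grand) ** 2 for x in pooled)

mst = sst / (p - 1)  # treatment mean square
mse = sse / (n - p)  # error mean square

print(f"SSTO = {ssto:.4f}, SST + SSE = {sst + sse:.4f}")
print(f"MST = {mst:.3f}, MSE = {mse:.3f}")
```

Note the parentheses around the degrees of freedom: SST is divided by (p-1) and SSE by (n-p), not by p or n alone.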

• Then we calculate the F value.
Define the F statistic
$F = \frac{MST}{MSE} = \frac{SST/(p-1)}{SSE/(n-p)}$
and its p-value to be the area under the F curve with p-1 and n-p degrees of freedom to the right of F.
We reject H0 at significance level α if either of the following conditions holds:
1. F > Fα
2. p-value < α

• It follows that
F = 8.525/0.669 = 12.74
To test H0 at the 0.05 level of significance, we use F.05 with p-1 = 3-1 = 2 numerator and n-p = 15-3 = 12 denominator degrees of freedom. From the table we get F.05 = 3.89.
So we have F = 12.74 > F.05 = 3.89.
Therefore, we reject H0 at the 0.05 level of significance.
In other words, we conclude that at least two of oil types A, B and C have different effects on mean mileage.
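The decision rule F > Fα can be sketched as a small function. The sums of squares and the critical value 3.89 are taken from the worked example; the function name `one_way_f_test` and its parameters are hypothetical, and the critical value is passed in from the table rather than computed.

```python
def one_way_f_test(sst, sse, n, p, f_critical):
    """Return (F, reject) for a one-way ANOVA F test.

    f_critical is the table value F_alpha with (p - 1, n - p) degrees
    of freedom; it is supplied by the caller, not computed here.
    """
    mst = sst / (p - 1)   # treatment mean square
    mse = sse / (n - p)   # error mean square
    f_stat = mst / mse
    return f_stat, f_stat > f_critical


# Values from the oil mileage example.
f_stat, reject = one_way_f_test(sst=17.0493, sse=8.028, n=15, p=3, f_critical=3.89)
print(f"F = {f_stat:.2f}, reject H0: {reject}")
```

Rejecting H0 only says that some treatment means differ; it does not say which ones or by how much.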

Pairwise comparison
• If the one-way ANOVA F test says that at least two treatment means differ, we estimate how large the differences are.
• We do this by comparing treatment means two at a time.
• In our example we might estimate the pairwise difference µa-µb, which is the change in mean mileage achieved by changing from oil type B to A.
• There are two approaches to calculating intervals for pairwise differences:
1. Individual: a 100(1-α) percent confidence interval for each pairwise difference µi-µh is
$(\bar{x}_i - \bar{x}_h) \pm t_{\alpha/2} \sqrt{MSE \left( \frac{1}{n_i} + \frac{1}{n_h} \right)}$
where tα/2 is based on n-p degrees of freedom.

2. Simultaneous confidence intervals: such intervals make us 100(1-α) percent confident that all of the pairwise differences are simultaneously contained in their respective intervals. There are several kinds, but the Tukey formula is the most commonly used:
$(\bar{x}_i - \bar{x}_h) \pm q_{\alpha} \sqrt{\frac{MSE}{m}}$
where qα is the upper α percentage point of the studentized range for p and (n-p) from the table, and m is the common sample size of each treatment.
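For the individual approach (item 1), a sketch of the interval for µb-µa in the oil mileage data might look like the following. The value t.025 = 2.179 for 12 degrees of freedom is an assumption taken from a standard t table, not from the slides, and the function name `individual_interval` is hypothetical.

```python
import math


def individual_interval(mean_i, mean_h, n_i, n_h, mse, t_crit):
    """Individual confidence interval for mu_i - mu_h.

    t_crit is t_{alpha/2} with n - p degrees of freedom, read from a t table.
    """
    half_width = t_crit * math.sqrt(mse * (1 / n_i + 1 / n_h))
    diff = mean_i - mean_h
    return diff - half_width, diff + half_width


# Oil type B versus oil type A, with MSE = 0.669 from the worked example.
# t_crit = 2.179 is an assumed table value for alpha/2 = 0.025 and 12 df.
low, high = individual_interval(36.56, 34.92, 5, 5, mse=0.669, t_crit=2.179)
print(f"individual 95% interval for mu_b - mu_a: [{low:.3f}, {high:.3f}]")
```

An individual interval gives 95% confidence for this one difference only; it makes no claim about all pairs holding at once.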

• E.g. in the oil mileage example we are comparing p = 3 treatments, each with sample size m = 5, for a total of n = 15.
MSE = 0.669 and q.05 = 3.77 from the table, corresponding to p = 3 and n-p = 12.
The Tukey simultaneous 95% confidence interval for µb-µa is
[(36.56 - 34.92) ± 3.77 √(0.669/5)]
= [0.261, 3.019]
This interval makes us simultaneously 95% confident that changing from oil type A to oil type B increases mean mileage by between 0.261 and 3.019 mpg.
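The same Tukey formula can be applied to every pair of oil types at once. This is a sketch using the values from the example (MSE = 0.669, q.05 = 3.77, m = 5); the function name `tukey_intervals` is illustrative.

```python
import math
from itertools import combinations


def tukey_intervals(means, mse, m, q_crit):
    """Tukey simultaneous intervals for all pairwise differences.

    means maps treatment name to sample mean, m is the common sample
    size, and q_crit is the studentized range table value q_alpha.
    """
    half_width = q_crit * math.sqrt(mse / m)
    intervals = {}
    for first, second in combinations(means, 2):
        # Difference is taken as (second - first), e.g. mu_B - mu_A.
        diff = means[second] - means[first]
        intervals[(second, first)] = (diff - half_width, diff + half_width)
    return intervals


means = {"A": 34.92, "B": 36.56, "C": 33.98}
intervals = tukey_intervals(means, mse=0.669, m=5, q_crit=3.77)
for (hi_name, lo_name), (low, high) in intervals.items():
    print(f"mu_{hi_name} - mu_{lo_name}: [{low:.3f}, {high:.3f}]")
```

An interval that contains zero means the data do not show a difference between that pair of oil types at the simultaneous 95% level.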
