Analysis of Variance
Chapter 15
15.1 Introduction
• Analysis of variance compares two or more
populations of interval data.
• Specifically, we are interested in determining
whether differences exist between the population
means.
• The procedure works by analyzing the sample
variance.
• The analysis of variance is a procedure that
tests whether differences exist
between two or more population means.
• To do this, the technique analyzes the sample
variances.
15.2 One Way Analysis of
Variance
• Example 15.1
– An apple juice manufacturer is planning to develop a new
product: a liquid concentrate.
– The marketing manager has to decide how to market the
new product.
– Three strategies are considered
• Emphasize convenience of using the product.
• Emphasize the quality of the product.
• Emphasize the product’s low price.
One Way Analysis of Variance
• Example 15.1 - continued
– An experiment was conducted as follows:
• In three cities an advertising campaign was launched.
• In each city only one of the three characteristics
(convenience, quality, and price) was emphasized.
• The weekly sales were recorded for twenty weeks
following the beginning of the campaigns.
Convenience Quality Price
529 804 672
658 630 531
793 774 443
514 717 596
663 679 602
719 604 502
711 620 659
606 697 689
461 706 675
529 615 512
498 492 691
663 719 733
604 787 698
495 699 776
485 572 561
557 523 572
353 584 469
557 634 581
542 580 679
614 624 532
Each column lists the weekly sales recorded in one city (see file Xm15-01).
• Solution
– The data are interval
– The problem objective is to compare sales in three
cities.
– We hypothesize that the three population means are
equal
One Way Analysis of Variance
H0: µ1 = µ2= µ3
H1: At least two means differ
To build the statistic needed to test the
hypotheses use the following notation:
• Solution
Defining the Hypotheses
Independent samples are drawn from k populations (treatments).
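Sample (treatment)   1            2            ...   k
Observations         x_{11}       x_{12}       ...   x_{1k}
                     x_{21}       x_{22}       ...   x_{2k}
                     ...          ...                ...
                     x_{n_1,1}    x_{n_2,2}    ...   x_{n_k,k}
Sample size          n_1          n_2          ...   n_k
Sample mean          \bar{x}_1    \bar{x}_2    ...   \bar{x}_k
x_{11} is the first observation of the first sample; x_{22} is the second observation of the second sample.
X is the “response variable”.
The values of the variable are called “responses”.
Notation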
Terminology
• In the context of this problem…
Response variable – weekly sales
Responses – actual sale values
Experimental unit – weeks in the three cities when we
record sales figures.
Factor – the criterion by which we classify the populations
(the treatments). In this problem the factor is the marketing
strategy.
Factor levels – the population (treatment) names. In this
problem the factor levels are the marketing strategies.
Two types of variability are employed when
testing for the equality of the population
means
The rationale of the test statistic
Graphical demonstration:
Employing two types of variability
[Figure: two panels, each plotting three samples (Treatment 1, Treatment 2, Treatment 3) with sample means \bar{x}_1 = 10, \bar{x}_2 = 15, \bar{x}_3 = 20.]
A small variability within the samples makes it easier
to draw a conclusion about the population means.
The sample means are the same as before,
but the larger within-sample variability
makes it harder to draw a conclusion
about the population means.
The rationale behind the test statistic – I
• If the null hypothesis is true, we would expect all
the sample means to be close to one another
(and as a result, close to the grand mean).
• If the alternative hypothesis is true, at least
some of the sample means would differ.
• Thus, we measure variability between sample
means.
• The variability between the sample means is
measured as the sum of squared distances
between each mean and the grand mean.
This sum is called the
Sum of Squares for Treatments
SST
In our example treatments are
represented by the different
advertising strategies.
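Variability between sample means
$$SST = \sum_{j=1}^{k} n_j (\bar{x}_j - \bar{\bar{x}})^2$$
where k is the number of treatments, n_j is the size of sample j, \bar{x}_j is the mean of sample j, and \bar{\bar{x}} is the grand mean.
Sum of squares for treatments (SST)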
Note: When the sample means are close to
one another, their distance from the grand
mean is small, leading to a small SST. Thus,
large SST indicates large variation between
sample means, which supports H1.
• Solution – continued
Calculate SST
$$\bar{x}_1 = 577.55, \quad \bar{x}_2 = 653.00, \quad \bar{x}_3 = 608.65$$
$$SST = \sum_{j=1}^{k} n_j (\bar{x}_j - \bar{\bar{x}})^2 = 20(577.55 - 613.07)^2 + 20(653.00 - 613.07)^2 + 20(608.65 - 613.07)^2 = 57,512.23$$
The grand mean is calculated by
$$\bar{\bar{x}} = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2 + \dots + n_k \bar{x}_k}{n_1 + n_2 + \dots + n_k}$$
Sum of squares for treatments (SST)
Is SST = 57,512.23 large enough to
reject H0 in favor of H1?
See next.
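The SST arithmetic above can be checked with a few lines of Python. This is a minimal sketch that uses only the sample sizes and sample means quoted in the example; the variable names are illustrative.

```python
# Sum of squares for treatments (SST) for Example 15.1,
# computed from the sample sizes and sample means quoted above.
sizes = [20, 20, 20]
means = [577.55, 653.00, 608.65]  # convenience, quality, price

# Grand mean: weighted average of the sample means.
grand_mean = sum(n * m for n, m in zip(sizes, means)) / sum(sizes)

# SST: squared distance of each sample mean from the grand mean,
# weighted by the sample size.
sst = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means))

print(f"grand mean = {grand_mean:.2f}")  # 613.07
print(f"SST = {sst:.2f}")                # 57512.23
```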
Sum of squares for treatments (SST)
• Large variability within the samples weakens the
“ability” of the sample means to represent their
corresponding population means.
• Therefore, even though sample means may
markedly differ from one another, SST must be
judged relative to the “within samples variability”.
The rationale behind test statistic – II
• The variability within samples is measured by
adding all the squared distances between
observations and their sample means.
This sum is called the
Sum of Squares for Error
SSE
In our example this is the
sum of all squared differences
between sales in city j and the
sample mean of city j (over all
the three cities).
Within samples variability
• Solution – continued
Calculate SSE
Sum of squares for errors (SSE)
$$s_1^2 = 10,775.00, \quad s_2^2 = 7,238.11, \quad s_3^2 = 8,670.24$$
$$SSE = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (x_{ij} - \bar{x}_j)^2 = (n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + (n_3 - 1)s_3^2$$
$$= (20 - 1)(10,775.00) + (20 - 1)(7,238.11) + (20 - 1)(8,670.24) \approx 506,983.50$$
Is SST = 57,512.23 large enough
relative to SSE = 506,983.50 to reject
the null hypothesis that specifies that
all the means are equal?
Sum of squares for errors (SSE)
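As a quick check of the SSE calculation, the sketch below rebuilds SSE from the sample variances reported in the Excel summary (10775.00, 7238.11, 8670.24). Because those variances are rounded to two decimals, the result differs from the slide's 506,983.50 in the first decimal place.

```python
# SSE as a weighted sum of sample variances: sum of (n_j - 1) * s_j^2.
sizes = [20, 20, 20]
variances = [10775.00, 7238.11, 8670.24]  # rounded values from the Excel summary

sse = sum((n - 1) * s2 for n, s2 in zip(sizes, variances))

print(f"SSE = {sse:.2f}")  # 506983.65 with the rounded variances
```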
To perform the test we need to calculate
the mean squares as follows:
The mean sum of squares
Calculation of MST - Mean Square for Treatments
$$MST = \frac{SST}{k - 1} = \frac{57,512.23}{3 - 1} = 28,756.12$$
Calculation of MSE - Mean Square for Error
$$MSE = \frac{SSE}{n - k} = \frac{506,983.50}{60 - 3} = 8,894.45$$
Calculation of the test statistic
$$F = \frac{MST}{MSE} = \frac{28,756.12}{8,894.45} = 3.23$$
with the following degrees of freedom:
\nu_1 = k - 1 and \nu_2 = n - k
Required Conditions:
1. The populations tested
are normally distributed.
2. The variances of all the
populations tested are
equal.
And finally the hypothesis test:
H0: µ1 = µ2 = … = µk
H1: At least two means differ
Test statistic: F = MST / MSE
R.R.: F > F_{α, k-1, n-k}
The F test rejection region
The F test
Ho: µ1 = µ2= µ3
H1: At least two means differ
Test statistic F= MST/ MSE= 3.23
R.R.: F > F_{α, k-1, n-k} = F_{0.05, 3-1, 60-3} ≈ 3.15
Since 3.23 > 3.15, there is sufficient evidence
to reject H0 in favor of H1 and conclude that at least two
of the mean sales differ.
$$F = \frac{MST}{MSE} = \frac{28,756.12}{8,894.45} = 3.23$$
[Figure: density curve of the F distribution, horizontal axis from 0 to 4.]
• Use Excel to find the p-value
– fx Statistical FDIST(3.23,2,57) = .0467
The F test p- value
p Value = P(F>3.23) = .0467
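The whole one-way test can be reproduced from the raw data of Example 15.1 with the standard library alone. The sketch below is an illustration, not the textbook's procedure: the function names are made up, and the p-value uses a closed form of the F upper tail that holds only when the numerator degrees of freedom equal 2, which is the case here (k - 1 = 2).

```python
# One-way ANOVA for Example 15.1 (weekly sales, file Xm15-01).
convenience = [529, 658, 793, 514, 663, 719, 711, 606, 461, 529,
               498, 663, 604, 495, 485, 557, 353, 557, 542, 614]
quality = [804, 630, 774, 717, 679, 604, 620, 697, 706, 615,
           492, 719, 787, 699, 572, 523, 584, 634, 580, 624]
price = [672, 531, 443, 596, 602, 502, 659, 689, 675, 512,
         691, 733, 698, 776, 561, 572, 469, 581, 679, 532]


def one_way_anova(samples):
    """Return SST, SSE, F and the two degrees of freedom."""
    k = len(samples)
    n = sum(len(s) for s in samples)
    grand_mean = sum(sum(s) for s in samples) / n
    means = [sum(s) / len(s) for s in samples]
    # Between-treatments variability.
    sst = sum(len(s) * (m - grand_mean) ** 2 for s, m in zip(samples, means))
    # Within-treatments variability.
    sse = sum((x - m) ** 2 for s, m in zip(samples, means) for x in s)
    f_stat = (sst / (k - 1)) / (sse / (n - k))
    return sst, sse, f_stat, k - 1, n - k


def f_upper_tail_df1_equals_2(f_value, df2):
    """P(F > f_value) for an F(2, df2) variable; valid only when df1 == 2."""
    return (df2 / (df2 + 2.0 * f_value)) ** (df2 / 2.0)


sst, sse, f_stat, df1, df2 = one_way_anova([convenience, quality, price])
p_value = f_upper_tail_df1_equals_2(f_stat, df2)
print(f"SST = {sst:.2f}, SSE = {sse:.2f}")
print(f"F = {f_stat:.2f} with ({df1}, {df2}) d.f., p-value = {p_value:.4f}")
```

For other numerator degrees of freedom the closed form does not apply, and a statistics library or the Excel FDIST function used above would be needed.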
Excel single factor ANOVA
SS(Total) = SST + SSE
Anova: Single Factor
SUMMARY
Groups Count Sum Average Variance
Convenience 20 11551 577.55 10775.00
Quality 20 13060 653.00 7238.11
Price 20 12173 608.65 8670.24
ANOVA
Source of Variation SS df MS F P-value F crit
Between Groups 57512 2 28756 3.23 0.0468 3.16
Within Groups 506984 57 8894
Total 564496 59
Xm15-01.xls
15.3 Analysis of Variance
Experimental Designs
• Several elements may distinguish between one
experimental design and others.
– The number of factors.
• Each characteristic investigated is called a factor.
• Each factor has several levels.
[Figure: a one-way ANOVA has a single factor whose levels define the treatments (Treatment 1, Treatment 2, Treatment 3); a two-way ANOVA has two factors (Factor A and Factor B), each with several levels, and a response is recorded for each combination of levels.]
• Groups of matched observations are formed into
blocks, in order to remove the effects of
“unwanted” variability.
• By doing so we improve the chances of
detecting the variability of interest.
Independent samples or blocks
• Fixed effects
– If all possible levels of a factor are included in our analysis we
have a fixed effect ANOVA.
– The conclusion of a fixed effect ANOVA applies only to the
levels studied.
• Random effects
– If the levels included in our analysis represent a random
sample of all the possible levels, we have a random-effect
ANOVA.
– The conclusion of the random-effect ANOVA applies to all the
levels (not only those studied).
Models of Fixed and Random Effects
• In some ANOVA models the test statistic of the fixed
effects case may differ from the test statistic of the
random effect case.
• Fixed and random effects - examples
– Fixed effects - The advertisement Example (15.1): All the
levels of the marketing strategies were included
– Random effects - To determine if there is a difference in the
production rate of 50 machines, four machines are randomly
selected and their production recorded.
Models of Fixed and Random Effects.
15.4 Randomized Blocks (Two-way)
Analysis of Variance
• The purpose of designing a randomized block
experiment is to reduce the within-treatments
variation thus increasing the relative amount of
between treatment variation.
• This helps in detecting differences between the
treatment means more easily.
[Figure: observations under Treatments 1 to 4 grouped into Block 1, Block 2, and Block 3.]
Block all the observations with some
commonality across treatments
Randomized Blocks
                 Treatment
Block            1              2              ...   k              Block mean
1                x_{11}         x_{12}         ...   x_{1k}         \bar{x}[B]_1
2                x_{21}         x_{22}         ...   x_{2k}         \bar{x}[B]_2
...
b                x_{b1}         x_{b2}         ...   x_{bk}         \bar{x}[B]_b
Treatment mean   \bar{x}[T]_1   \bar{x}[T]_2   ...   \bar{x}[T]_k
• The total sum of squares is partitioned into three
sources of variation
– Treatments
– Blocks
– Within samples (Error)
SS(Total) = SST + SSB + SSE
where SST is the sum of squares for treatments, SSB is the sum of squares for blocks, and SSE is the sum of squares for error.
Recall: for the independent samples design we have
SS(Total) = SST + SSE
Partitioning the total variability
Calculating the sums of squares
• Formulas for the calculation of the sums of squares
Here x_{ij} is the observation in block i under treatment j, \bar{x}[T]_j is the mean of treatment j, \bar{x}[B]_i is the mean of block i, and \bar{\bar{x}} is the grand mean.
$$SS(Total) = (x_{11} - \bar{\bar{x}})^2 + (x_{21} - \bar{\bar{x}})^2 + \dots + (x_{12} - \bar{\bar{x}})^2 + (x_{22} - \bar{\bar{x}})^2 + \dots + (x_{1k} - \bar{\bar{x}})^2 + (x_{2k} - \bar{\bar{x}})^2 + \dots$$
$$SST = b(\bar{x}[T]_1 - \bar{\bar{x}})^2 + b(\bar{x}[T]_2 - \bar{\bar{x}})^2 + \dots + b(\bar{x}[T]_k - \bar{\bar{x}})^2$$
$$SSB = k(\bar{x}[B]_1 - \bar{\bar{x}})^2 + k(\bar{x}[B]_2 - \bar{\bar{x}})^2 + \dots + k(\bar{x}[B]_b - \bar{\bar{x}})^2$$
Calculating the sums of squares
• Formula for the sum of squares for error
With x_{ij} the observation in block i under treatment j, \bar{x}[T]_j the mean of treatment j, \bar{x}[B]_i the mean of block i, and \bar{\bar{x}} the grand mean:
$$SSE = (x_{11} - \bar{x}[T]_1 - \bar{x}[B]_1 + \bar{\bar{x}})^2 + (x_{21} - \bar{x}[T]_1 - \bar{x}[B]_2 + \bar{\bar{x}})^2 + \dots$$
$$+ (x_{12} - \bar{x}[T]_2 - \bar{x}[B]_1 + \bar{\bar{x}})^2 + (x_{22} - \bar{x}[T]_2 - \bar{x}[B]_2 + \bar{\bar{x}})^2 + \dots$$
$$+ (x_{1k} - \bar{x}[T]_k - \bar{x}[B]_1 + \bar{\bar{x}})^2 + (x_{2k} - \bar{x}[T]_k - \bar{x}[B]_2 + \bar{\bar{x}})^2 + \dots$$
To perform hypothesis tests for treatments and blocks we
need
• Mean square for treatments
• Mean square for blocks
• Mean square for error
Mean Squares
$$MST = \frac{SST}{k - 1}, \quad MSB = \frac{SSB}{b - 1}, \quad MSE = \frac{SSE}{n - k - b + 1}$$
Test statistics for the randomized block
design ANOVA
Test statistic for treatments: F = MST / MSE
Test statistic for blocks: F = MSB / MSE
• Testing the mean responses for treatments
F > Fα,k-1,n-k-b+1
• Testing the mean response for blocks
F> Fα,b-1,n-k-b+1
The F test rejection regions
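The randomized block partition can be sketched in a few lines of Python. The 3 x 3 table below is made-up toy data chosen so the sums of squares are easy to verify by hand; the function name and the dictionary keys are illustrative only.

```python
def randomized_block_anova(table):
    """table[i][j] is the response in block i under treatment j."""
    b = len(table)        # number of blocks (rows)
    k = len(table[0])     # number of treatments (columns)
    n = b * k
    grand = sum(sum(row) for row in table) / n
    treat_means = [sum(table[i][j] for i in range(b)) / b for j in range(k)]
    block_means = [sum(row) / k for row in table]

    sst = b * sum((m - grand) ** 2 for m in treat_means)
    ssb = k * sum((m - grand) ** 2 for m in block_means)
    sse = sum((table[i][j] - treat_means[j] - block_means[i] + grand) ** 2
              for i in range(b) for j in range(k))
    ss_total = sum((x - grand) ** 2 for row in table for x in row)

    mse = sse / (n - k - b + 1)
    return {
        "SST": sst, "SSB": ssb, "SSE": sse, "SS(Total)": ss_total,
        "F_treatments": (sst / (k - 1)) / mse,
        "F_blocks": (ssb / (b - 1)) / mse,
    }


# Hypothetical data: 3 blocks (rows) by 3 treatments (columns).
toy = [[1, 2, 3],
       [2, 4, 6],
       [3, 6, 9]]
result = randomized_block_anova(toy)
print(result)  # SST = 24, SSB = 24, SSE = 4, SS(Total) = 52, both F = 12
```

The identity SS(Total) = SST + SSB + SSE holds for the toy data (52 = 24 + 24 + 4), which is a useful sanity check on any implementation.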
• Example 15.2
– Are there differences in the effectiveness of cholesterol
reduction drugs?
– To answer this question the following experiment was
organized:
• 25 groups of men with high cholesterol were matched by age
and weight. Each group consisted of 4 men.
• Each person in a group received a different drug.
• The cholesterol level reduction in two months was recorded.
– Can we infer from the data in Xm15-02 that there are
differences in mean cholesterol reduction among the four
drugs?
Randomized Blocks ANOVA - Example
• Solution
– Each drug can be considered a treatment.
– Each 4 records (per group) can be blocked, because
they are matched by age and weight.
– This procedure eliminates the variability in
cholesterol reduction related to different
combinations of age and weight.
– This helps detect differences in the mean cholesterol
reduction attributed to the different drugs.
Randomized Blocks ANOVA - Example
In the ANOVA table, “Rows” are the blocks (d.f. = b - 1 = 24, F = MSB / MSE) and “Columns” are the treatments (d.f. = k - 1 = 3, F = MST / MSE).
Conclusion: At the 5% significance level there is sufficient evidence
to infer that the mean cholesterol reductions of at least
two drugs differ.
Randomized Blocks ANOVA - Example
ANOVA
Source of Variation SS df MS F P-value F crit
Rows 3848.7 24 160.36 10.11 0.0000 1.67
Columns 196.0 3 65.32 4.12 0.0094 2.73
Error 1142.6 72 15.87
Total 5187.2 99
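The mean squares and F statistics in the table follow directly from the sums of squares and the degrees of freedom. The sketch below recomputes them from the rounded SS values printed in the table, so the last digit can differ slightly from the Excel output (for example, 196.0 / 3 gives 65.33 rather than 65.32).

```python
# Randomized block ANOVA table for Example 15.2, rebuilt from the SS column.
b, k = 25, 4          # 25 blocks (groups of men), 4 treatments (drugs)
n = b * k

ss = {"blocks": 3848.7, "treatments": 196.0, "error": 1142.6}
df = {"blocks": b - 1, "treatments": k - 1, "error": n - k - b + 1}

# Mean square = sum of squares / degrees of freedom.
ms = {source: ss[source] / df[source] for source in ss}

f_treatments = ms["treatments"] / ms["error"]
f_blocks = ms["blocks"] / ms["error"]

print(df)                                   # blocks 24, treatments 3, error 72
print(f"F (treatments) = {f_treatments:.2f}")
print(f"F (blocks) = {f_blocks:.2f}")
```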
Analysis of Variance
Chapter 15 - continued
15.5 Two-Factor Analysis of Variance
• Example 15.3
– Suppose in Example 15.1, two factors are to be
examined:
• The effects of the marketing strategy on sales.
– Emphasis on convenience
– Emphasis on quality
– Emphasis on price
• The effects of the selected media on sales.
– Advertise on TV
– Advertise in newspapers
• Solution
– We may attempt to analyze combinations of levels, one
from each factor using one-way ANOVA.
– The treatments will be:
• Treatment 1: Emphasize convenience and advertise on TV
• Treatment 2: Emphasize convenience and advertise in
newspapers
• …………………………………………………………………….
• Treatment 6: Emphasize price and advertise in newspapers
Attempting one-way ANOVA
• Solution
– The hypotheses tested are:
H0: µ1= µ2= µ3= µ4= µ5= µ6
H1: At least two means differ.
Attempting one-way ANOVA
– In each one of six cities sales are recorded for ten
weeks.
– In each city a different combination of marketing
emphasis and media usage is employed.
City       City 1        City 2        City 3    City 4      City 5   City 6
Strategy   Convenience   Convenience   Quality   Quality     Price    Price
Media      TV            Newspaper     TV        Newspaper   TV       Newspaper
See file Xm15-03.
• The p-value = .0452.
• We conclude that there is evidence that differences
exist in the mean weekly sales among the six cities.
Attempting one-way ANOVA
• These results raise some questions:
– Are the differences in sales caused by the different
marketing strategies?
– Are the differences in sales caused by the different
media used for advertising?
– Are there combinations of marketing strategy and
media that interact to affect the weekly sales?
Interesting questions – no answers
• The current experimental design cannot provide
answers to these questions.
• A new experimental design is needed.
Two-way ANOVA (two factors)
[Figure: 2 x 3 layout of cells. Factor A: Marketing strategy (Convenience, Quality, Price). Factor B: Advertising media (TV, Newspapers). TV row: City 1, City 3, City 5 sales. Newspapers row: City 2, City 4, City 6 sales.]
Are there differences in the mean sales
caused by different marketing strategies?
Test whether mean sales of “Convenience”, “Quality”,
and “Price” significantly differ from one another.
H0: µConv. = µQuality = µPrice
H1: At least two means differ
Calculations are based on the sum of squares for factor A, SS(A).
Two-way ANOVA (two factors)
[Figure: 2 x 3 layout of city sales by marketing strategy (Factor A) and advertising media (Factor B), comparing the TV row with the Newspapers row.]
Are there differences in the mean sales
caused by different advertising media?
Test whether mean sales of “TV” and “Newspapers”
significantly differ from one another.
H0: µTV = µNewspapers
H1: The means differ
Calculations are based on the sum of squares for factor B, SS(B).
Two-way ANOVA (two factors)
[Figure: 2 x 3 layout of city sales by marketing strategy (Factor A) and advertising media (Factor B), highlighting a single cell (City 3: Quality, TV).]
Are there differences in the mean sales
caused by interaction between marketing
strategy and advertising medium?
Test whether mean sales of certain cells
are different than the level expected.
Calculations are based on the sum of squares for
interaction, SS(AB).
Graphical description of the possible
relationships between factors A and B.
[Figure: four panels plotting mean response against the levels of factor A (1, 2, 3), with one line for each level of factor B (Level 1, Level 2).]
– Difference between the levels of factor A; no difference between the levels of factor B (the lines for levels 1 and 2 of factor B coincide).
– Difference between the levels of factor A, and difference between the levels of factor B; no interaction.
– No difference between the levels of factor A; difference between the levels of factor B.
– Interaction.
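Sums of squares
$$SS(A) = rb \sum_{i=1}^{a} (\bar{x}[A]_i - \bar{\bar{x}})^2 = (10)(2)\{(\bar{x}_{conv.} - \bar{\bar{x}})^2 + (\bar{x}_{quality} - \bar{\bar{x}})^2 + (\bar{x}_{price} - \bar{\bar{x}})^2\}$$
$$SS(B) = ra \sum_{j=1}^{b} (\bar{x}[B]_j - \bar{\bar{x}})^2 = (10)(3)\{(\bar{x}_{TV} - \bar{\bar{x}})^2 + (\bar{x}_{Newspaper} - \bar{\bar{x}})^2\}$$
$$SS(AB) = r \sum_{i=1}^{a} \sum_{j=1}^{b} (\bar{x}[AB]_{ij} - \bar{x}[A]_i - \bar{x}[B]_j + \bar{\bar{x}})^2$$
$$SSE = \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{r} (x_{ijk} - \bar{x}[AB]_{ij})^2$$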
F tests for the Two-way ANOVA
• Test for the difference between the levels of the main
factors A and B
F = MS(A) / MSE, where MS(A) = SS(A)/(a-1)
F = MS(B) / MSE, where MS(B) = SS(B)/(b-1)
MSE = SSE/(n-ab)
Rejection region: F > Fα,a-1,n-ab for factor A; F > Fα,b-1,n-ab for factor B
• Test for interaction between factors A and B
F = MS(AB) / MSE, where MS(AB) = SS(AB)/[(a-1)(b-1)]
Rejection region: F > Fα,(a-1)(b-1),n-ab
Required conditions:
1. The response distribution is normal.
2. The treatment variances are equal.
3. The samples are independent.
• Example 15.3 – continued (Xm15-03)
F tests for the Two-way ANOVA
Media Convenience Quality Price
TV 491 677 575
TV 712 627 614
TV 558 590 706
TV 447 632 484
TV 479 683 478
TV 624 760 650
TV 546 690 583
TV 444 548 536
TV 582 579 579
TV 672 644 795
Newspaper 464 689 803
Newspaper 559 650 584
Newspaper 759 704 525
Newspaper 557 652 498
Newspaper 528 576 812
Newspaper 670 836 565
Newspaper 534 628 708
Newspaper 657 798 546
Newspaper 557 497 616
Newspaper 474 841 587
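The two-way sums of squares can be computed directly from the data listed above. The following is a minimal sketch using only the standard library; the dictionary layout and variable names are illustrative, with a = 3 strategies (factor A), b = 2 media (factor B), and r = 10 replicates per cell.

```python
# Two-way ANOVA sums of squares for Example 15.3 (file Xm15-03).
data = {
    ("TV", "Convenience"): [491, 712, 558, 447, 479, 624, 546, 444, 582, 672],
    ("TV", "Quality"): [677, 627, 590, 632, 683, 760, 690, 548, 579, 644],
    ("TV", "Price"): [575, 614, 706, 484, 478, 650, 583, 536, 579, 795],
    ("Newspaper", "Convenience"): [464, 559, 759, 557, 528, 670, 534, 657, 557, 474],
    ("Newspaper", "Quality"): [689, 650, 704, 652, 576, 836, 628, 798, 497, 841],
    ("Newspaper", "Price"): [803, 584, 525, 498, 812, 565, 708, 546, 616, 587],
}
media = ["TV", "Newspaper"]                      # factor B levels
strategies = ["Convenience", "Quality", "Price"]  # factor A levels
a, b, r = len(strategies), len(media), 10
n = a * b * r


def mean(values):
    values = list(values)
    return sum(values) / len(values)


grand = mean(x for cell in data.values() for x in cell)
mean_a = {s: mean(x for m in media for x in data[(m, s)]) for s in strategies}
mean_b = {m: mean(x for s in strategies for x in data[(m, s)]) for m in media}
mean_ab = {key: mean(cell) for key, cell in data.items()}

ss_a = r * b * sum((mean_a[s] - grand) ** 2 for s in strategies)
ss_b = r * a * sum((mean_b[m] - grand) ** 2 for m in media)
ss_ab = r * sum((mean_ab[(m, s)] - mean_a[s] - mean_b[m] + grand) ** 2
                for m in media for s in strategies)
sse = sum((x - mean_ab[key]) ** 2 for key, cell in data.items() for x in cell)
ss_total = sum((x - grand) ** 2 for cell in data.values() for x in cell)

mse = sse / (n - a * b)
f_a = (ss_a / (a - 1)) / mse
f_b = (ss_b / (b - 1)) / mse
f_ab = (ss_ab / ((a - 1) * (b - 1))) / mse

print(f"SS(A) = {ss_a:.1f}, SS(B) = {ss_b:.1f}, SS(AB) = {ss_ab:.1f}, SSE = {sse:.1f}")
print(f"F(A) = {f_a:.2f}, F(B) = {f_b:.2f}, F(AB) = {f_ab:.2f}")
```

The printed values can be compared against the Excel two-factor output, where factor B (media) appears as the “Sample” row and factor A (strategy) as the “Columns” row.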
• Example 15.3 – continued
– Test of the difference in mean sales between the three marketing
strategies
H0: µconv. = µquality = µprice
H1: At least two mean sales are different
F tests for the Two-way ANOVA
ANOVA
Source of Variation SS df MS F P-value F crit
Sample 13172.0 1 13172.0 1.42 0.2387 4.02
Columns 98838.6 2 49419.3 5.33 0.0077 3.17
Interaction 1609.6 2 804.8 0.09 0.9171 3.17
Within 501136.7 54 9280.3
Total 614757.0 59
Factor A = Marketing strategies (the “Columns” row of the table); Factor B = Advertising media (the “Sample” row)
F = MS(Marketing strategy)/MSE = 5.33
Fcritical = Fα,a-1,n-ab = F.05,3-1,60-(3)(2)= 3.17; (p-value = .0077)
– At 5% significance level there is evidence to infer that
differences in weekly sales exist among the marketing
strategies.
F tests for the Two-way ANOVA
MS(A)/MSE
• Example 15.3 - continued
– Test of the difference in mean sales between the two
advertising media
H0: µTV = µNewspaper
H1: The two mean sales differ
F tests for the Two-way ANOVA
Factor B = Advertising media
F = MS(Media)/MSE = 1.42
Fcritical = Fα,b-1,n-ab = F.05,2-1,60-(3)(2) = 4.02 (p-value = .2387)
– At 5% significance level there is insufficient evidence to infer
that differences in weekly sales exist between the two
advertising media.
F tests for the Two-way ANOVA
MS(B)/MSE
• Example 15.3 - continued
– Test for interaction between factors A and B
H0: Factors A and B do not interact to affect the mean weekly sales
H1: Factors A and B do interact to affect the mean weekly sales
F tests for the Two-way ANOVA
Interaction AB = Marketing*Media
F = MS(Marketing*Media)/MSE = .09
Fcritical = Fα,(a-1)(b-1),n-ab = F.05,(3-1)(2-1),60-(3)(2)= 3.17 (p-value= .9171)
– At 5% significance level there is insufficient evidence to infer
that the two factors interact to affect the mean weekly sales.
MS(AB)/MSE
F tests for the Two-way ANOVA
15.7 Multiple Comparisons
• When the null hypothesis is rejected, it may be
desirable to find which mean(s) is (are) different,
and at what ranking order.
• Three statistical inference procedures, geared at
doing this, are presented:
– Fisher’s least significant difference (LSD) method
– Bonferroni adjustment
– Tukey’s multiple comparison method
• Two means are considered different if the difference
between the corresponding sample means is larger than
a critical number. Then, the larger sample mean is
believed to be associated with a larger population
mean.
• Conditions common to all the methods here:
– The ANOVA model is the one way analysis of variance
– The conditions required to perform the ANOVA are satisfied.
– The experiment is fixed-effect
Fisher Least Significant Difference (LSD) Method
• This method builds on the equal variances t-test of the
difference between two means.
• The test statistic is improved by using MSE rather than s_p^2.
• We can conclude that µi and µj differ (at the α significance
level) if |\bar{x}_i - \bar{x}_j| > LSD, where
$$LSD = t_{\alpha/2} \sqrt{MSE \left( \frac{1}{n_i} + \frac{1}{n_j} \right)}, \quad d.f. = n - k$$
Experimentwise Type I error rate (αE)
(the effective Type I error)
• Fisher’s method may result in an increased probability of
committing a Type I error.
• The experimentwise Type I error rate is the probability of
committing at least one Type I error when each comparison is made at significance level α. It
is calculated by
$$\alpha_E = 1 - (1 - \alpha)^C$$
where C is the number of pairwise comparisons, i.e.
C = k(k-1)/2.
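A short numerical illustration of the experimentwise rate, assuming k = 3 populations compared pairwise at α = .05 as in Example 15.1 (the function name is illustrative):

```python
def experimentwise_error_rate(alpha, k):
    """Probability of at least one Type I error over all pairwise comparisons."""
    c = k * (k - 1) // 2          # number of pairwise comparisons
    return 1 - (1 - alpha) ** c


rate = experimentwise_error_rate(0.05, 3)
print(f"{rate:.4f}")  # 0.1426
```

With three comparisons the effective Type I error rate is already close to three times the per-comparison level.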
• The Bonferroni adjustment determines the required Type I error
probability per pairwise comparison (α), to secure a pre-determined
overall αE.
• The procedure:
– Compute the number of pairwise comparisons (C)
[C = k(k-1)/2], where k is the number of populations.
– Set α = αE/C, where αE is the desired probability of making at
least one Type I error (called the experimentwise Type I error).
– We can conclude that µi and µj differ (at the αE/C significance
level) if
$$|\bar{x}_i - \bar{x}_j| > t_{\alpha_E/(2C)} \sqrt{MSE \left( \frac{1}{n_i} + \frac{1}{n_j} \right)}, \quad d.f. = n - k$$
Bonferroni Adjustment
|\bar{x}_1 - \bar{x}_2| = |577.55 - 653.0| = 75.45
|\bar{x}_1 - \bar{x}_3| = |577.55 - 608.65| = 31.10
|\bar{x}_2 - \bar{x}_3| = |653.0 - 608.65| = 44.35
• Example 15.1 - continued
– Rank the effectiveness of the marketing strategies
(based on mean weekly sales).
– Use the Fisher’s method, and the Bonferroni adjustment method
• Solution (the Fisher’s method)
– The sample mean sales were 577.55, 653.0, 608.65.
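– Then,
$$LSD = t_{\alpha/2} \sqrt{MSE \left( \frac{1}{n_i} + \frac{1}{n_j} \right)} = t_{.05/2} \sqrt{8894(1/20 + 1/20)} \approx 59.71$$
– Only the difference between the first and second sample means (75.45) exceeds 59.71, so the significant difference is between µ1 and µ2.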
Fisher and Bonferroni Methods
• Solution (the Bonferroni adjustment)
– We calculate C=k(k-1)/2 to be 3(2)/2 = 3.
– We set α = .05/3 = .0167, thus t.0167/2,60-3 = 2.467 (Excel).
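$$t_{\alpha/2} \sqrt{MSE \left( \frac{1}{n_i} + \frac{1}{n_j} \right)} = 2.467 \sqrt{8894(1/20 + 1/20)} = 73.54$$
Again, the significant difference is between µ1 and µ2.
|\bar{x}_1 - \bar{x}_2| = |577.55 - 653.0| = 75.45
|\bar{x}_1 - \bar{x}_3| = |577.55 - 608.65| = 31.10
|\bar{x}_2 - \bar{x}_3| = |653.0 - 608.65| = 44.35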
Fisher and Bonferroni Methods
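The Fisher and Bonferroni comparisons for Example 15.1 can be sketched as follows. The standard library has no t-distribution quantile, so the critical values are supplied by hand: 2.467 is the value quoted above for the Bonferroni case, while 2.0025 for t_{.025,57} is an assumed table value not given in the example. Small differences from the slide's 59.71 and 73.54 come from rounding of the critical values.

```python
from itertools import combinations
from math import sqrt

means = {"Convenience": 577.55, "Quality": 653.00, "Price": 608.65}
mse = 8894
n_per_sample = 20


def significant_pairs(t_critical):
    """Return the critical difference and the pairs whose means differ by more."""
    threshold = t_critical * sqrt(mse * (1 / n_per_sample + 1 / n_per_sample))
    pairs = [(first, second) for first, second in combinations(means, 2)
             if abs(means[first] - means[second]) > threshold]
    return threshold, pairs


# Fisher's LSD at alpha = .05 (t critical value is an assumed table value).
lsd, fisher_pairs = significant_pairs(2.0025)
# Bonferroni: alpha = .05 / 3 = .0167, t critical value quoted in the example.
bonferroni_threshold, bonferroni_pairs = significant_pairs(2.467)

print(f"LSD = {lsd:.2f}: {fisher_pairs}")
print(f"Bonferroni threshold = {bonferroni_threshold:.2f}: {bonferroni_pairs}")
```

Both methods flag only the Convenience vs. Quality pair, in line with the conclusion stated above.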
• The test procedure:
– Find a critical number ω as follows:
$$\omega = q_{\alpha}(k, \nu) \sqrt{\frac{MSE}{n_g}}$$
k = the number of samples
ν =degrees of freedom = n - k
ng = number of observations per sample
(recall, all the sample sizes are the same)
α = significance level
qα(k,ν) = a critical value obtained from the studentized range table
Tukey Multiple Comparisons
If the sample sizes are not extremely different, we can use the
above procedure with n_g calculated as the harmonic mean of
the sample sizes:
$$n_g = \frac{k}{1/n_1 + 1/n_2 + \dots + 1/n_k}$$
• Select a pair of means. Calculate the difference
between the larger and the smaller mean, \bar{x}_{max} - \bar{x}_{min}.
• If \bar{x}_{max} - \bar{x}_{min} > ω, there is sufficient evidence to
conclude that µmax > µmin.
• Repeat this procedure for each pair of samples.
Rank the means if possible.
Tukey Multiple Comparisons
• Example 15.1 - continued. We had three populations
(three marketing strategies).
k = 3,
Sample sizes were equal: n1 = n2 = n3 = 20,
ν = n - k = 60 - 3 = 57,
MSE = 8894.
Take q.05(3,60) from the table.
$$\omega = q_{\alpha}(k, \nu) \sqrt{\frac{MSE}{n_g}} = q_{.05}(3, 57) \sqrt{\frac{8894}{20}} = 71.70$$
Population       Mean
Sales - City 1   577.55
Sales - City 2   653.00
Sales - City 3   608.65
City 1 vs. City 2: 653 - 577.55 = 75.45
City 1 vs. City 3: 608.65 - 577.55 = 31.1
City 2 vs. City 3: 653 - 608.65 = 44.35
Only the City 1 vs. City 2 difference exceeds ω = 71.70.
Tukey Multiple Comparisons
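Tukey's critical number can be sketched in the same way. The studentized range value is not available in the standard library, so q.05(3,60) = 3.40 is entered by hand as an assumed table value; the function names are illustrative.

```python
from itertools import combinations
from math import sqrt


def harmonic_mean_size(sizes):
    """n_g: harmonic mean of the sample sizes (equals n when all sizes are n)."""
    return len(sizes) / sum(1 / n for n in sizes)


def tukey_omega(q_critical, mse, sizes):
    """Critical number omega = q * sqrt(MSE / n_g)."""
    return q_critical * sqrt(mse / harmonic_mean_size(sizes))


Q_05_3_60 = 3.40  # assumed value from a studentized range table
omega = tukey_omega(Q_05_3_60, 8894, [20, 20, 20])

means = {"City 1": 577.55, "City 2": 653.00, "City 3": 608.65}
tukey_pairs = [(first, second) for first, second in combinations(means, 2)
               if abs(means[first] - means[second]) > omega]

print(f"omega = {omega:.2f}")  # 71.70
print(tukey_pairs)             # [('City 1', 'City 2')]
```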
Excel – Tukey and Fisher LSD method
Xm15-01
Fisher’s LSD (α = .05)
Multiple Comparisons
Treatment     Treatment   Difference   LSD (Alpha = 0.05)   Omega (Alpha = 0.05)
Convenience   Quality     -75.45       59.72                71.70
Convenience   Price       -31.1        59.72                71.70
Quality       Price       44.35        59.72                71.70
Bonferroni adjustment (α = .05/3 = .0167)
Multiple Comparisons
Treatment     Treatment   Difference   LSD (Alpha = 0.0167)   Omega (Alpha = 0.05)
Convenience   Quality     -75.45       73.54                  71.70
Convenience   Price       -31.1        73.54                  71.70
Quality       Price       44.35        73.54                  71.70

TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
 
Food safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfFood safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdf
 
Single or Multiple melodic lines structure
Single or Multiple melodic lines structureSingle or Multiple melodic lines structure
Single or Multiple melodic lines structure
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptx
 
Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)
 
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
 

Anova (1)

  • 2. 15.1 Introduction • Analysis of variance compares two or more populations of interval data. • Specifically, we are interested in determining whether differences exist between the population means. • The procedure works by analyzing the sample variance.
  • 3. • The analysis of variance is a procedure that tests to determine whether differences exist between two or more population means. • To do this, the technique analyzes the sample variances. 15.2 One Way Analysis of Variance
  • 4. • Example 15.1 – An apple juice manufacturer is planning to develop a new product, a liquid concentrate. – The marketing manager has to decide how to market the new product. – Three strategies are considered: • Emphasize convenience of using the product. • Emphasize the quality of the product. • Emphasize the product’s low price. One Way Analysis of Variance
  • 5. • Example 15.1 - continued – An experiment was conducted as follows: • In three cities an advertisement campaign was launched. • In each city only one of the three characteristics (convenience, quality, and price) was emphasized. • The weekly sales were recorded for twenty weeks following the beginning of the campaigns. One Way Analysis of Variance
  • 6. One Way Analysis of Variance: weekly sales under each strategy (see file Xm15-01)
      Convnce  Quality  Price
      529      804      672
      658      630      531
      793      774      443
      514      717      596
      663      679      602
      719      604      502
      711      620      659
      606      697      689
      461      706      675
      529      615      512
      498      492      691
      663      719      733
      604      787      698
      495      699      776
      485      572      561
      557      523      572
      353      584      469
      557      634      581
      542      580      679
      614      624      532
  • 7. • Solution – The data are interval – The problem objective is to compare sales in three cities. – We hypothesize that the three population means are equal One Way Analysis of Variance
  • 8. Defining the Hypotheses • Solution H0: µ1 = µ2 = µ3 H1: At least two means differ. To build the statistic needed to test the hypotheses, use the following notation:
  • 9. Notation Independent samples are drawn from k populations (treatments). Sample $j$ ($j = 1, 2, \dots, k$) consists of the observations $x_{1j}, x_{2j}, \dots, x_{n_j j}$, with sample size $n_j$ and sample mean $\bar{x}_j$. For example, $x_{11}$ is the first observation of the first sample and $x_{22}$ is the second observation of the second sample. X is the “response variable”. The variable’s values are called “responses”.
  • 10. Terminology • In the context of this problem… Response variable – weekly sales. Responses – actual sale values. Experimental unit – weeks in the three cities when we record sales figures. Factor – the criterion by which we classify the populations (the treatments). In this problem the factor is the marketing strategy. Factor levels – the population (treatment) names. In this problem the factor levels are the marketing strategies.
  • 11. The rationale of the test statistic Two types of variability are employed when testing for the equality of the population means.
  • 13. [Figure: two dot plots of Treatment 1, Treatment 2, and Treatment 3, each with sample means $\bar{x}_1 = 10$, $\bar{x}_2 = 15$, $\bar{x}_3 = 20$.] First panel: A small variability within the samples makes it easier to draw a conclusion about the population means. Second panel: The sample means are the same as before, but the larger within-sample variability makes it harder to draw a conclusion about the population means.
  • 14. The rationale behind the test statistic – I • If the null hypothesis is true, we would expect all the sample means to be close to one another (and as a result, close to the grand mean). • If the alternative hypothesis is true, at least some of the sample means would differ. • Thus, we measure variability between sample means.
  • 15. Variability between sample means • The variability between the sample means is measured as the sum of squared distances between each mean and the grand mean. This sum is called the Sum of Squares for Treatments (SST). In our example, treatments are represented by the different advertising strategies.
  • 16. Sum of squares for treatments (SST) $SST = \sum_{j=1}^{k} n_j(\bar{x}_j - \bar{\bar{x}})^2$, where there are $k$ treatments, $n_j$ is the size of sample $j$, $\bar{x}_j$ is the mean of sample $j$, and $\bar{\bar{x}}$ is the grand mean. Note: When the sample means are close to one another, their distance from the grand mean is small, leading to a small SST. Thus, large SST indicates large variation between sample means, which supports H1.
  • 17. Sum of squares for treatments (SST) • Solution – continued Calculate SST. The sample means are $\bar{x}_1 = 577.55$, $\bar{x}_2 = 653.00$, $\bar{x}_3 = 608.65$. The grand mean is calculated by $\bar{\bar{x}} = \frac{n_1\bar{x}_1 + n_2\bar{x}_2 + \dots + n_k\bar{x}_k}{n_1 + n_2 + \dots + n_k} = 613.07$. Then $SST = \sum_{j=1}^{k} n_j(\bar{x}_j - \bar{\bar{x}})^2 = 20(577.55 - 613.07)^2 + 20(653.00 - 613.07)^2 + 20(608.65 - 613.07)^2 = 57{,}512.23$.
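The SST arithmetic above can be checked with a few lines of Python. This is only a sketch using the sample sizes and means quoted on the slide; the variable names are illustrative and not part of the original material.

```python
# Sample sizes and sample means reported for Example 15.1.
sizes = [20, 20, 20]
means = [577.55, 653.00, 608.65]

# Grand mean: the size-weighted average of the sample means.
grand_mean = sum(n * m for n, m in zip(sizes, means)) / sum(sizes)

# SST: size-weighted squared distances of the sample means from the grand mean.
sst = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means))

print(round(grand_mean, 2))  # 613.07
print(round(sst, 2))         # 57512.23
```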
  • 18. Is SST = 57,512.23 large enough to reject H0 in favor of H1? See next. Sum of squares for treatments (SST)
  • 19. • Large variability within the samples weakens the “ability” of the sample means to represent their corresponding population means. • Therefore, even though sample means may markedly differ from one another, SST must be judged relative to the “within samples variability”. The rationale behind test statistic – II
  • 20. Within samples variability • The variability within samples is measured by adding all the squared distances between observations and their sample means. This sum is called the Sum of Squares for Error (SSE). In our example this is the sum of all squared differences between sales in city j and the sample mean of city j (over all the three cities).
  • 21. Sum of squares for errors (SSE) • Solution – continued Calculate SSE. The sample variances are $s_1^2 = 10{,}775.00$, $s_2^2 = 7{,}238.11$, $s_3^2 = 8{,}670.24$. $SSE = \sum_{j=1}^{k}\sum_{i=1}^{n_j}(x_{ij} - \bar{x}_j)^2 = (n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + (n_3 - 1)s_3^2 = (20 - 1)10{,}775.00 + (20 - 1)7{,}238.11 + (20 - 1)8{,}670.24 = 506{,}983.50$ (the last figure uses the unrounded variances).
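As a rough check of the SSE calculation, the sketch below pools the sample variances reported in the Excel summary for Xm15-01 (10775.00, 7238.11, 8670.24). Because those variances are rounded to two decimals, the result differs from 506,983.50 by a few tenths.

```python
# Sample sizes and rounded sample variances from the Excel summary output.
sizes = [20, 20, 20]
variances = [10775.00, 7238.11, 8670.24]

# SSE pools the within-sample variation: sum of (n_j - 1) * s_j^2.
sse = sum((n - 1) * s2 for n, s2 in zip(sizes, variances))

print(round(sse, 2))  # close to 506,983.50, up to rounding of the variances
```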
  • 22. Is SST = 57,512.23 large enough relative to SSE = 506,983.50 to reject the null hypothesis that specifies that all the means are equal? Sum of squares for errors (SSE)
  • 23. The mean sum of squares To perform the test we need to calculate the mean squares as follows: Calculation of MST (Mean Square for Treatments): $MST = \frac{SST}{k-1} = \frac{57{,}512.23}{3-1} = 28{,}756.12$. Calculation of MSE (Mean Square for Error): $MSE = \frac{SSE}{n-k} = \frac{506{,}983.50}{60-3} = 8{,}894.45$.
  • 24. Calculation of the test statistic $F = \frac{MST}{MSE} = \frac{28{,}756.12}{8{,}894.45} = 3.23$ with the following degrees of freedom: $\nu_1 = k - 1$ and $\nu_2 = n - k$. Required Conditions: 1. The populations tested are normally distributed. 2. The variances of all the populations tested are equal.
  • 25. The F test rejection region And finally the hypothesis test: H0: µ1 = µ2 = … = µk H1: At least two means differ Test statistic: $F = \frac{MST}{MSE}$ R.R.: $F > F_{\alpha,\,k-1,\,n-k}$
  • 26. The F test H0: µ1 = µ2 = µ3 H1: At least two means differ Test statistic: $F = \frac{MST}{MSE} = \frac{28{,}756.12}{8{,}894.45} = 3.23$ R.R.: $F > F_{\alpha,\,k-1,\,n-k} = F_{0.05,\,3-1,\,60-3} \approx 3.15$ Since 3.23 > 3.15, there is sufficient evidence to reject H0 in favor of H1, and argue that at least one of the mean sales is different from the others.
  • 27. The F test p-value [Figure: density curve of the F distribution with the p-value region marked.] • Use Excel to find the p-value – fx Statistical FDIST(3.23,2,57) = .0467 p-value = P(F > 3.23) = .0467
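Outside Excel, the p-value for this example can be sketched with the standard library alone. The shortcut below is not from the slides: it relies on the fact that when the numerator degrees of freedom equal 2 (as here, k - 1 = 2), the upper tail of the F distribution has the closed form P(F > f) = (1 + 2f/ν2)^(-ν2/2). The function names are hypothetical, and for other numerator degrees of freedom a statistics library would be needed.

```python
def f_upper_tail_df1_2(f, df2):
    """P(F > f) for an F distribution with 2 numerator and df2 denominator d.f."""
    return (1.0 + 2.0 * f / df2) ** (-df2 / 2.0)


def f_critical_df1_2(alpha, df2):
    """Critical value F_{alpha, 2, df2}, found by inverting the tail formula."""
    return (df2 / 2.0) * (alpha ** (-2.0 / df2) - 1.0)


# Sums of squares from Example 15.1 (k = 3 treatments, n = 60 observations).
sst, sse, k, n = 57512.23, 506983.50, 3, 60
mst = sst / (k - 1)
mse = sse / (n - k)
f_stat = mst / mse

p_value = f_upper_tail_df1_2(f_stat, n - k)
f_crit = f_critical_df1_2(0.05, n - k)

# Should agree with the Excel output (F = 3.23, p = 0.0468, F crit = 3.16).
print(round(f_stat, 2), round(p_value, 4), round(f_crit, 2))
```

The p-value here uses the unrounded F statistic, which is why it matches the 0.0468 in the Excel ANOVA table rather than the value obtained by plugging in the rounded 3.23.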
  • 28. Excel single factor ANOVA (Xm15-01.xls) SS(Total) = SST + SSE
      Anova: Single Factor
      SUMMARY
      Groups        Count  Sum    Average  Variance
      Convenience   20     11551  577.55   10775.00
      Quality       20     13060  653.00   7238.11
      Price         20     12173  608.65   8670.24
      ANOVA
      Source of Variation  SS      df  MS     F     P-value  F crit
      Between Groups       57512   2   28756  3.23  0.0468   3.16
      Within Groups        506984  57  8894
      Total                564496  59
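The Excel table can be reproduced from the raw weekly sales listed earlier in the transcript. The following is a minimal sketch using only Python's `statistics` module; the dictionary layout and variable names are illustrative choices, not part of the original workbook.

```python
from statistics import mean, variance

# Weekly sales from Xm15-01, one list per marketing strategy.
sales = {
    "Convenience": [529, 658, 793, 514, 663, 719, 711, 606, 461, 529,
                    498, 663, 604, 495, 485, 557, 353, 557, 542, 614],
    "Quality": [804, 630, 774, 717, 679, 604, 620, 697, 706, 615,
                492, 719, 787, 699, 572, 523, 584, 634, 580, 624],
    "Price": [672, 531, 443, 596, 602, 502, 659, 689, 675, 512,
              691, 733, 698, 776, 561, 572, 469, 581, 679, 532],
}

k = len(sales)                               # number of treatments
n = sum(len(x) for x in sales.values())      # total number of observations
grand_mean = sum(sum(x) for x in sales.values()) / n

# Between-treatments and within-treatments sums of squares.
sst = sum(len(x) * (mean(x) - grand_mean) ** 2 for x in sales.values())
sse = sum((len(x) - 1) * variance(x) for x in sales.values())

mst = sst / (k - 1)
mse = sse / (n - k)
f_stat = mst / mse

# Summary rows: count, sum, average, variance for each group.
for name, x in sales.items():
    print(name, len(x), sum(x), round(mean(x), 2), round(variance(x), 2))

# ANOVA rows: SS, df, MS for each source, then the F statistic.
print("Between", round(sst), k - 1, round(mst), round(f_stat, 2))
print("Within", round(sse), n - k, round(mse))
print("Total", round(sst + sse), n - 1)
```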
  • 29. 15.3 Analysis of Variance Experimental Designs • Several elements may distinguish between one experimental design and others. – The number of factors. • Each characteristic investigated is called a factor. • Each factor has several levels.
  • 30. [Figure: One-way ANOVA (single factor): responses are grouped by treatment, where Treatment 1, Treatment 2, and Treatment 3 are the levels of the single factor. Two-way ANOVA (two factors): responses are grouped by the levels of Factor A and the levels of Factor B.]
  • 31. • Groups of matched observations are formed into blocks, in order to remove the effects of “unwanted” variability. • By doing so we improve the chances of detecting the variability of interest. Independent samples or blocks
  • 32. • Fixed effects – If all possible levels of a factor are included in our analysis we have a fixed effect ANOVA. – The conclusion of a fixed effect ANOVA applies only to the levels studied. • Random effects – If the levels included in our analysis represent a random sample of all the possible levels, we have a random-effect ANOVA. – The conclusion of the random-effect ANOVA applies to all the levels (not only those studied). Models of Fixed and Random Effects
  • 33. Models of Fixed and Random Effects • In some ANOVA models the test statistic of the fixed effects case may differ from the test statistic of the random effect case. • Fixed and random effects - examples – Fixed effects - The advertisement Example (15.1): All the levels of the marketing strategies were included. – Random effects - To determine if there is a difference in the production rate of 50 machines, four machines are randomly selected and their production recorded.
  • 34. 15.4 Randomized Blocks (Two-way) Analysis of Variance • The purpose of designing a randomized block experiment is to reduce the within-treatments variation thus increasing the relative amount of between treatment variation. • This helps in detecting differences between the treatment means more easily.
  • 35. Randomized Blocks [Figure: observations under Treatment 1 to Treatment 4 grouped into Block 1, Block 2, and Block 3.] Block all the observations with some commonality across treatments.
  • 36. Randomized Blocks Block all the observations with some commonality across treatments.
      Block           Treatment 1      Treatment 2      ...  Treatment k      Block mean
      1               $x_{11}$         $x_{12}$         ...  $x_{1k}$         $\bar{x}[B]_1$
      2               $x_{21}$         $x_{22}$         ...  $x_{2k}$         $\bar{x}[B]_2$
      ...
      b               $x_{b1}$         $x_{b2}$         ...  $x_{bk}$         $\bar{x}[B]_b$
      Treatment mean  $\bar{x}[T]_1$   $\bar{x}[T]_2$   ...  $\bar{x}[T]_k$
  • 37. Partitioning the total variability • The sum of squares total is partitioned into three sources of variation – Treatments – Blocks – Within samples (Error) SS(Total) = SST + SSB + SSE, where SST is the sum of squares for treatments, SSB is the sum of squares for blocks, and SSE is the sum of squares for error. Recall: for the independent samples design we have SS(Total) = SST + SSE.
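  • 38. Calculating the sums of squares • Formulas for the calculation of the sums of squares, with blocks $i = 1, \dots, b$ in rows, treatments $j = 1, \dots, k$ in columns, block means $\bar{x}[B]_i$, treatment means $\bar{x}[T]_j$, and grand mean $\bar{\bar{x}}$: $SST = b(\bar{x}[T]_1 - \bar{\bar{x}})^2 + b(\bar{x}[T]_2 - \bar{\bar{x}})^2 + \dots + b(\bar{x}[T]_k - \bar{\bar{x}})^2$; $SSB = k(\bar{x}[B]_1 - \bar{\bar{x}})^2 + k(\bar{x}[B]_2 - \bar{\bar{x}})^2 + \dots + k(\bar{x}[B]_b - \bar{\bar{x}})^2$; $SS(Total) = (x_{11} - \bar{\bar{x}})^2 + (x_{21} - \bar{\bar{x}})^2 + \dots + (x_{12} - \bar{\bar{x}})^2 + (x_{22} - \bar{\bar{x}})^2 + \dots + (x_{1k} - \bar{\bar{x}})^2 + (x_{2k} - \bar{\bar{x}})^2 + \dots = \sum_{i=1}^{b}\sum_{j=1}^{k}(x_{ij} - \bar{\bar{x}})^2$.
  • 39. Calculating the sums of squares • Formulas for the calculation of the sums of squares (same layout and notation): $SST = b(\bar{x}[T]_1 - \bar{\bar{x}})^2 + b(\bar{x}[T]_2 - \bar{\bar{x}})^2 + \dots + b(\bar{x}[T]_k - \bar{\bar{x}})^2$; $SSB = k(\bar{x}[B]_1 - \bar{\bar{x}})^2 + k(\bar{x}[B]_2 - \bar{\bar{x}})^2 + \dots + k(\bar{x}[B]_b - \bar{\bar{x}})^2$; $SSE = (x_{11} - \bar{x}[T]_1 - \bar{x}[B]_1 + \bar{\bar{x}})^2 + (x_{21} - \bar{x}[T]_1 - \bar{x}[B]_2 + \bar{\bar{x}})^2 + \dots + (x_{12} - \bar{x}[T]_2 - \bar{x}[B]_1 + \bar{\bar{x}})^2 + (x_{22} - \bar{x}[T]_2 - \bar{x}[B]_2 + \bar{\bar{x}})^2 + \dots + (x_{1k} - \bar{x}[T]_k - \bar{x}[B]_1 + \bar{\bar{x}})^2 + (x_{2k} - \bar{x}[T]_k - \bar{x}[B]_2 + \bar{\bar{x}})^2 + \dots = \sum_{i=1}^{b}\sum_{j=1}^{k}(x_{ij} - \bar{x}[T]_j - \bar{x}[B]_i + \bar{\bar{x}})^2$.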
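To see how these formulas fit together, here is a small sketch that computes SST, SSB, SSE, and SS(Total) for a randomized block layout. The data are made up purely for illustration (4 blocks, 3 treatments) and do not come from the slides; the point is that the three components add up to SS(Total).

```python
# Hypothetical data: each row is a block (b = 4), each column a treatment (k = 3).
data = [
    [10.0, 12.0, 15.0],
    [8.0, 11.0, 12.0],
    [14.0, 15.0, 19.0],
    [9.0, 9.0, 13.0],
]
b, k = len(data), len(data[0])

grand = sum(sum(row) for row in data) / (b * k)
treat_means = [sum(row[j] for row in data) / b for j in range(k)]
block_means = [sum(row) / k for row in data]

# Sums of squares for treatments, blocks, total, and error.
sst = b * sum((m - grand) ** 2 for m in treat_means)
ssb = k * sum((m - grand) ** 2 for m in block_means)
ss_total = sum((x - grand) ** 2 for row in data for x in row)
sse = sum(
    (data[i][j] - treat_means[j] - block_means[i] + grand) ** 2
    for i in range(b)
    for j in range(k)
)

# Mean squares and F statistics, with n = b * k observations.
n = b * k
mst = sst / (k - 1)
msb = ssb / (b - 1)
mse = sse / (n - k - b + 1)
print("F (treatments):", mst / mse)
print("F (blocks):", msb / mse)
print("Partition holds:", abs(ss_total - (sst + ssb + sse)) < 1e-9)
```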
  • 40. Mean Squares To perform hypothesis tests for treatments and blocks we need • Mean square for treatments: $MST = \frac{SST}{k-1}$ • Mean square for blocks: $MSB = \frac{SSB}{b-1}$ • Mean square for error: $MSE = \frac{SSE}{n-k-b+1}$
  • 41. Test statistics for the randomized block design ANOVA Test statistic for treatments: $F = \frac{MST}{MSE}$ Test statistic for blocks: $F = \frac{MSB}{MSE}$
  • 42. • Testing the mean responses for treatments F > Fα,k-1,n-k-b+1 • Testing the mean response for blocks F> Fα,b-1,n-k-b+1 The F test rejection regions
  • 43. • Example 15.2 – Are there differences in the effectiveness of cholesterol reduction drugs? – To answer this question the following experiment was organized: • 25 groups of men with high cholesterol were matched by age and weight. Each group consisted of 4 men. • Each person in a group received a different drug. • The cholesterol level reduction in two months was recorded. – Can we infer from the data in Xm15-02 that there are differences in mean cholesterol reduction among the four drugs? Randomized Blocks ANOVA - Example
  • 44. • Solution – Each drug can be considered a treatment. – Each 4 records (per group) can be blocked, because they are matched by age and weight. – This procedure eliminates the variability in cholesterol reduction related to different combinations of age and weight. – This helps detect differences in the mean cholesterol reduction attributed to the different drugs. Randomized Blocks ANOVA - Example
  • 45. Randomized Blocks ANOVA - Example
      ANOVA
      Source of Variation   SS      df          MS      F                  P-value  F crit
      Rows (Blocks)         3848.7  24 (b - 1)  160.36  10.11 (MSB / MSE)  0.0000   1.67
      Columns (Treatments)  196.0   3 (k - 1)   65.32   4.12 (MST / MSE)   0.0094   2.73
      Error                 1142.6  72          15.87
      Total                 5187.2  99
      Conclusion: At 5% significance level there is sufficient evidence to infer that the mean cholesterol reductions gained by at least two drugs are different.
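The degrees of freedom and F ratios in this table follow directly from the formulas for the randomized block design. The sketch below recomputes them from the rounded sums of squares shown in the Excel output, so the last digit of a mean square can differ slightly from the table.

```python
# Example 15.2: k = 4 drugs (treatments), b = 25 matched groups (blocks).
k, b = 4, 25
n = k * b

# Degrees of freedom for treatments, blocks, and error.
df_treat, df_block, df_error = k - 1, b - 1, n - k - b + 1

# Rounded sums of squares taken from the ANOVA table.
ss_block, ss_treat, ss_error = 3848.7, 196.0, 1142.6

ms_block = ss_block / df_block
ms_treat = ss_treat / df_treat
mse = ss_error / df_error

f_treat = ms_treat / mse   # compared with F crit = 2.73 in the table
f_block = ms_block / mse   # compared with F crit = 1.67 in the table

print(df_treat, df_block, df_error)
print(round(f_treat, 2), round(f_block, 2))
```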
  • 47. 15.5 Two-Factor Analysis of Variance - • Example 15.3 – Suppose in Example 15.1, two factors are to be examined: • The effects of the marketing strategy on sales. – Emphasis on convenience – Emphasis on quality – Emphasis on price • The effects of the selected media on sales. – Advertise on TV – Advertise in newspapers
  • 48. Attempting one-way ANOVA • Solution – We may attempt to analyze combinations of levels, one from each factor, using one-way ANOVA. – The treatments will be: • Treatment 1: Emphasize convenience and advertise on TV • Treatment 2: Emphasize convenience and advertise in newspapers • ……………………………………………………………………. • Treatment 6: Emphasize price and advertise in newspapers
  • 49. • Solution – The hypotheses tested are: H0: µ1= µ2= µ3= µ4= µ5= µ6 H1: At least two means differ. Attempting one-way ANOVA
  • 50. Attempting one-way ANOVA • Solution – In each one of six cities sales are recorded for ten weeks. – In each city a different combination of marketing emphasis and media usage is employed.
      City      City1    City2    City3    City4    City5  City6
      Emphasis  Convnce  Convnce  Quality  Quality  Price  Price
      Media     TV       Paper    TV       Paper    TV     Paper
  • 51. Attempting one-way ANOVA • Solution (Xm15-03, with the six city treatments: Convnce/TV, Convnce/Paper, Quality/TV, Quality/Paper, Price/TV, Price/Paper) • The p-value = .0452. • We conclude that there is evidence that differences exist in the mean weekly sales among the six cities.
  • 52. Interesting questions – no answers • These results raise some questions: – Are the differences in sales caused by the different marketing strategies? – Are the differences in sales caused by the different media used for advertising? – Are there combinations of marketing strategy and media that interact to affect the weekly sales?
  • 53. • The current experimental design cannot provide answers to these questions. • A new experimental design is needed. Two-way ANOVA (two factors)
  • 54. Two-way ANOVA (two factors) Factor A: Marketing strategy. Factor B: Advertising media.
      Media       Convenience   Quality       Price
      TV          City 1 sales  City 3 sales  City 5 sales
      Newspapers  City 2 sales  City 4 sales  City 6 sales
      Are there differences in the mean sales caused by different marketing strategies?
  • 55. Two-way ANOVA (two factors) Test whether mean sales of “Convenience”, “Quality”, and “Price” significantly differ from one another. H0: µConv. = µQuality = µPrice H1: At least two means differ Calculations are based on the sum of squares for factor A, SS(A).
  • 56. Two-way ANOVA (two factors) Same layout (Factor A: Marketing strategy in columns; Factor B: Advertising media in rows; TV: City 1, City 3, City 5 sales; Newspapers: City 2, City 4, City 6 sales). Are there differences in the mean sales caused by different advertising media?
  • 57. Two-way ANOVA (two factors) Test whether mean sales of “TV” and “Newspapers” significantly differ from one another. H0: µTV = µNewspapers H1: The means differ Calculations are based on the sum of squares for factor B, SS(B).
  • 58. Two-way ANOVA (two factors) Same layout, with one cell highlighted (City 3 sales: TV, Quality). Are there differences in the mean sales caused by interaction between marketing strategy and advertising medium?
  • 59. Two-way ANOVA (two factors) Test whether mean sales of certain cells are different from the level expected. Calculations are based on the sum of squares for interaction, SS(AB).
  • 60. Graphical description of the possible relationships between factors A and B.
  • 61. [Figure: four panels plotting mean response against the levels of factor A (1, 2, 3), with one line for level 1 of factor B and one line for level 2 of factor B.] Panel captions: (1) Difference between the levels of factor A; no difference between the levels of factor B (the lines for levels 1 and 2 of factor B coincide). (2) Difference between the levels of factor A, and difference between the levels of factor B; no interaction. (3) No difference between the levels of factor A; difference between the levels of factor B. (4) Interaction.
  • 62. Sums of squares $SS(A) = rb\sum_{i=1}^{a}(\bar{x}[A]_i - \bar{\bar{x}})^2 = (10)(2)\{(\bar{x}_{conv.} - \bar{\bar{x}})^2 + (\bar{x}_{quality} - \bar{\bar{x}})^2 + (\bar{x}_{price} - \bar{\bar{x}})^2\}$; $SS(B) = ra\sum_{j=1}^{b}(\bar{x}[B]_j - \bar{\bar{x}})^2 = (10)(3)\{(\bar{x}_{TV} - \bar{\bar{x}})^2 + (\bar{x}_{Newspaper} - \bar{\bar{x}})^2\}$; $SS(AB) = r\sum_{i=1}^{a}\sum_{j=1}^{b}(\bar{x}[AB]_{ij} - \bar{x}[A]_i - \bar{x}[B]_j + \bar{\bar{x}})^2$; $SSE = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{r}(x_{ijk} - \bar{x}[AB]_{ij})^2$.
  • 63. F tests for the Two-way ANOVA • Test for the difference between the levels of the main factors A and B: $F = \frac{MS(A)}{MSE}$ and $F = \frac{MS(B)}{MSE}$, where $MS(A) = \frac{SS(A)}{a-1}$, $MS(B) = \frac{SS(B)}{b-1}$, and $MSE = \frac{SSE}{n-ab}$. Rejection regions: $F > F_{\alpha,\,a-1,\,n-ab}$ and $F > F_{\alpha,\,b-1,\,n-ab}$. • Test for interaction between factors A and B: $F = \frac{MS(AB)}{MSE}$, where $MS(AB) = \frac{SS(AB)}{(a-1)(b-1)}$. Rejection region: $F > F_{\alpha,\,(a-1)(b-1),\,n-ab}$.
  • 64. Required conditions: 1. The response distribution is normal. 2. The treatment variances are equal. 3. The samples are independent.
  • 65. F tests for the Two-way ANOVA • Example 15.3 – continued (Xm15-03)
      Media      Convenience  Quality  Price
      TV         491          677      575
      TV         712          627      614
      TV         558          590      706
      TV         447          632      484
      TV         479          683      478
      TV         624          760      650
      TV         546          690      583
      TV         444          548      536
      TV         582          579      579
      TV         672          644      795
      Newspaper  464          689      803
      Newspaper  559          650      584
      Newspaper  759          704      525
      Newspaper  557          652      498
      Newspaper  528          576      812
      Newspaper  670          836      565
      Newspaper  534          628      708
      Newspaper  657          798      546
      Newspaper  557          497      616
      Newspaper  474          841      587
  • 66. F tests for the Two-way ANOVA • Example 15.3 – continued – Test of the difference in mean sales between the three marketing strategies H0: µconv. = µquality = µprice H1: At least two mean sales are different
      ANOVA
      Source of Variation                          SS        df  MS       F     P-value  F crit
      Sample                                       13172.0   1   13172.0  1.42  0.2387   4.02
      Columns (Factor A: Marketing strategies)     98838.6   2   49419.3  5.33  0.0077   3.17
      Interaction                                  1609.6    2   804.8    0.09  0.9171   3.17
      Within                                       501136.7  54  9280.3
      Total                                        614757.0  59
  • 67. • Example 15.3 – continued – Test of the difference in mean sales between the three marketing strategies H0: µconv. = µquality = µprice H1: At least two mean sales are different F = MS(Marketing strategy)/MSE = 5.33 Fcritical = Fα,a-1,n-ab = F.05,3-1,60-(3)(2)= 3.17; (p-value = .0077) – At 5% significance level there is evidence to infer that differences in weekly sales exist among the marketing strategies. F tests for the Two-way ANOVA MS(A)/MSE
  • 68. F tests for the Two-way ANOVA • Example 15.3 - continued – Test of the difference in mean sales between the two advertising media H0: µTV = µNewspaper H1: The two mean sales differ Factor B = Advertising media (the “Sample” row of the ANOVA table: SS 13172.0, df 1, MS 13172.0, F 1.42, P-value 0.2387, F crit 4.02).
  • 69. F tests for the Two-way ANOVA • Example 15.3 - continued – Test of the difference in mean sales between the two advertising media H0: µTV = µNewspaper H1: The two mean sales differ F = MS(B)/MSE = MS(Media)/MSE = 1.42 Fcritical = Fα,b-1,n-ab = F.05,2-1,60-(3)(2) = 4.02 (p-value = .2387) – At 5% significance level there is insufficient evidence to infer that differences in weekly sales exist between the two advertising media.
  • 70. F tests for the Two-way ANOVA • Example 15.3 - continued – Test for interaction between factors A and B H0: µTV*conv. = µTV*quality = … = µnewsp.*price H1: At least two means differ Interaction AB = Marketing*Media (the “Interaction” row of the ANOVA table: SS 1609.6, df 2, MS 804.8, F 0.09, P-value 0.9171, F crit 3.17).
  • 71. • Example 15.3 - continued – Test for interaction between factor A and B H0: µTV*conv. = µTV*quality =…=µnewsp.*price H1: At least two means differ F = MS(Marketing*Media)/MSE = .09 Fcritical = Fα,(a-1)(b-1),n-ab = F.05,(3-1)(2-1),60-(3)(2)= 3.17 (p-value= .9171) – At 5% significance level there is insufficient evidence to infer that the two factors interact to affect the mean weekly sales. MS(AB)/MSE F tests for the Two-way ANOVA
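The three F tests of Example 15.3 can be traced back to the raw data in Xm15-03. The sketch below applies the two-way sums-of-squares formulas with a = 3 strategies, b = 2 media, and r = 10 weeks per cell; the data structures and names are illustrative. If the values were transcribed faithfully, the output should be close to the Excel table (SS(A) ≈ 98838.6, SS(B) ≈ 13172.0, SS(AB) ≈ 1609.6).

```python
# Weekly sales per cell: (medium, strategy) -> 10 observations.
cells = {
    ("TV", "Convenience"): [491, 712, 558, 447, 479, 624, 546, 444, 582, 672],
    ("TV", "Quality"): [677, 627, 590, 632, 683, 760, 690, 548, 579, 644],
    ("TV", "Price"): [575, 614, 706, 484, 478, 650, 583, 536, 579, 795],
    ("Newspaper", "Convenience"): [464, 559, 759, 557, 528, 670, 534, 657, 557, 474],
    ("Newspaper", "Quality"): [689, 650, 704, 652, 576, 836, 628, 798, 497, 841],
    ("Newspaper", "Price"): [803, 584, 525, 498, 812, 565, 708, 546, 616, 587],
}
strategies = ["Convenience", "Quality", "Price"]   # factor A
media = ["TV", "Newspaper"]                        # factor B
a, b, r = len(strategies), len(media), 10
n = a * b * r


def avg(values):
    values = list(values)
    return sum(values) / len(values)


grand = avg(x for v in cells.values() for x in v)
mean_a = {s: avg(x for m in media for x in cells[(m, s)]) for s in strategies}
mean_b = {m: avg(x for s in strategies for x in cells[(m, s)]) for m in media}
mean_ab = {key: avg(v) for key, v in cells.items()}

ss_a = r * b * sum((mean_a[s] - grand) ** 2 for s in strategies)
ss_b = r * a * sum((mean_b[m] - grand) ** 2 for m in media)
ss_ab = r * sum(
    (mean_ab[(m, s)] - mean_a[s] - mean_b[m] + grand) ** 2
    for m in media
    for s in strategies
)
sse = sum((x - mean_ab[key]) ** 2 for key, v in cells.items() for x in v)
ss_total = sum((x - grand) ** 2 for v in cells.values() for x in v)

mse = sse / (n - a * b)
f_a = (ss_a / (a - 1)) / mse
f_b = (ss_b / (b - 1)) / mse
f_ab = (ss_ab / ((a - 1) * (b - 1))) / mse

print("SS(A), SS(B), SS(AB):", round(ss_a, 1), round(ss_b, 1), round(ss_ab, 1))
print("F(A), F(B), F(AB):", round(f_a, 2), round(f_b, 2), round(f_ab, 2))
```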
  • 72. 15.7 Multiple Comparisons • When the null hypothesis is rejected, it may be desirable to find which mean(s) is (are) different, and at what ranking order. • Three statistical inference procedures, geared at doing this, are presented: – Fisher’s least significant difference (LSD) method – Bonferroni adjustment – Tukey’s multiple comparison method
  • 73. • Two means are considered different if the difference between the corresponding sample means is larger than a critical number. Then, the larger sample mean is believed to be associated with a larger population mean. • Conditions common to all the methods here: – The ANOVA model is the one way analysis of variance – The conditions required to perform the ANOVA are satisfied. – The experiment is fixed-effect 15.7 Multiple Comparisons
  • 74. Fisher Least Significant Difference (LSD) Method • This method builds on the equal variances t-test of the difference between two means. • The test statistic is improved by using MSE rather than $s_p^2$. • We can conclude that µi and µj differ (at the α significance level) if $|\bar{x}_i - \bar{x}_j| > LSD$, where $LSD = t_{\alpha/2}\sqrt{MSE\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}$, d.f. = n - k.
  • 75. Experimentwise Type I error rate (αE) (the effective Type I error) • The Fisher’s method may result in an increased probability of committing a Type I error. • The experimentwise Type I error rate is the probability of committing at least one Type I error when each comparison is made at significance level α. It is calculated by $\alpha_E = 1 - (1 - \alpha)^C$, where C is the number of pairwise comparisons (i.e., C = k(k-1)/2). • The Bonferroni adjustment determines the required Type I error probability per pairwise comparison (α), to secure a pre-determined overall αE.
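A quick numerical illustration of the formula above, as a sketch with hypothetical function names: for k = 3 populations tested pairwise at α = .05, the experimentwise rate is already about 14%, and the Bonferroni adjustment divides the target rate by the number of comparisons.

```python
def num_pairwise_comparisons(k):
    """C = k(k-1)/2 pairwise comparisons among k populations."""
    return k * (k - 1) // 2


def experimentwise_alpha(alpha, k):
    """alpha_E = 1 - (1 - alpha)^C, the formula given on the slide."""
    return 1 - (1 - alpha) ** num_pairwise_comparisons(k)


def bonferroni_alpha(alpha_e, k):
    """Per-comparison alpha that secures an overall rate of about alpha_e."""
    return alpha_e / num_pairwise_comparisons(k)


print(num_pairwise_comparisons(3))             # 3
print(round(experimentwise_alpha(0.05, 3), 4)) # 0.1426
print(round(bonferroni_alpha(0.05, 3), 4))     # 0.0167
```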
  • 76. Bonferroni Adjustment • The procedure: – Compute the number of pairwise comparisons (C) [C = k(k-1)/2], where k is the number of populations. – Set α = αE/C, where αE is the true probability of making at least one Type I error (called experimentwise Type I error). – We can conclude that µi and µj differ (at the αE/C significance level) if $|\bar{x}_i - \bar{x}_j| > t_{\alpha_E/(2C)}\sqrt{MSE\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}$, d.f. = n - k.
  • 77. Fisher and Bonferroni Methods • Example 15.1 - continued – Rank the effectiveness of the marketing strategies (based on mean weekly sales). – Use the Fisher’s method, and the Bonferroni adjustment method. • Solution (the Fisher’s method) – The sample mean sales were 577.55, 653.0, 608.65. – Then, $|\bar{x}_1 - \bar{x}_2| = |577.55 - 653.0| = 75.45$, $|\bar{x}_1 - \bar{x}_3| = |577.55 - 608.65| = 31.10$, $|\bar{x}_2 - \bar{x}_3| = |653.0 - 608.65| = 44.35$, and $t_{.05/2}\sqrt{MSE\left(\frac{1}{n_i} + \frac{1}{n_j}\right)} = t_{.05/2}\sqrt{8894(1/20 + 1/20)} \approx 59.71$. Only the difference between samples 1 and 2 exceeds 59.71, so the significant difference is between µ1 and µ2.
  • 78. Fisher and Bonferroni Methods • Solution (the Bonferroni adjustment) – We calculate C = k(k-1)/2 to be 3(2)/2 = 3. – We set α = .05/3 = .0167, thus $t_{.0167/2,\,60-3} = 2.467$ (Excel). $t_{\alpha/2}\sqrt{MSE\left(\frac{1}{n_i} + \frac{1}{n_j}\right)} = 2.467\sqrt{8894(1/20 + 1/20)} = 73.54$. The differences are, as before, $|\bar{x}_1 - \bar{x}_2| = 75.45$, $|\bar{x}_1 - \bar{x}_3| = 31.10$, $|\bar{x}_2 - \bar{x}_3| = 44.35$. Again, the significant difference is between µ1 and µ2.
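Both comparisons can be sketched in a few lines. The critical t values are inputs here rather than computed: 2.467 is the Excel value quoted on the slide, while 2.0025 for $t_{.025}$ with 57 d.f. is an approximate table value assumed for this illustration. With 2.467 the threshold comes out near 73.57, slightly above the slide's 73.54, which presumably used an unrounded t value.

```python
from itertools import combinations
from math import sqrt

# Sample means, MSE, and common sample size from Example 15.1.
means = {"Convenience": 577.55, "Quality": 653.00, "Price": 608.65}
mse, n_per_group = 8894.0, 20

# Critical t values with 57 d.f.: assumed table value and the slide's Excel value.
t_fisher = 2.0025
t_bonferroni = 2.467


def critical_difference(t_crit):
    """t * sqrt(MSE * (1/n_i + 1/n_j)) for two samples of equal size."""
    return t_crit * sqrt(mse * (1 / n_per_group + 1 / n_per_group))


def significant_pairs(threshold):
    """Pairs whose absolute difference in sample means exceeds the threshold."""
    return [
        (first, second)
        for first, second in combinations(means, 2)
        if abs(means[first] - means[second]) > threshold
    ]


lsd = critical_difference(t_fisher)
bonferroni = critical_difference(t_bonferroni)

print(round(lsd, 2), significant_pairs(lsd))
print(round(bonferroni, 2), significant_pairs(bonferroni))
```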
  • 79. Tukey Multiple Comparisons • The test procedure: – Find a critical number ω as follows: $\omega = q_\alpha(k, \nu)\sqrt{\frac{MSE}{n_g}}$, where k = the number of samples, ν = degrees of freedom = n - k, $n_g$ = number of observations per sample (recall, all the sample sizes are the same), α = significance level, and $q_\alpha(k, \nu)$ = a critical value obtained from the studentized range table.
  • 80. Tukey Multiple Comparisons • Select a pair of means. Calculate the difference between the larger and the smaller mean, $\bar{x}_{max} - \bar{x}_{min}$. • If $\bar{x}_{max} - \bar{x}_{min} > \omega$, there is sufficient evidence to conclude that µmax > µmin. • Repeat this procedure for each pair of samples. Rank the means if possible. If the sample sizes are not extremely different, we can use the above procedure with $n_g$ calculated as the harmonic mean of the sample sizes: $n_g = \frac{k}{1/n_1 + 1/n_2 + \dots + 1/n_k}$.
  • 81. Tukey Multiple Comparisons • Example 15.1 - continued We had three populations (three marketing strategies). k = 3. Sample sizes were equal: n1 = n2 = n3 = 20, ν = n - k = 60 - 3 = 57, MSE = 8894. $\omega = q_\alpha(k, \nu)\sqrt{\frac{MSE}{n_g}} = q_{.05}(3, 57)\sqrt{\frac{8894}{20}} = 71.70$. Take q.05(3,60) from the table.
      Population       Mean
      Sales - City 1   577.55
      Sales - City 2   653
      Sales - City 3   608.65
      Differences $\bar{x}_{max} - \bar{x}_{min}$ to compare with ω: City 1 vs. City 2: 653 - 577.55 = 75.45. City 1 vs. City 3: 608.65 - 577.55 = 31.1. City 2 vs. City 3: 653 - 608.65 = 44.35.
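The Tukey calculation can be sketched as follows. The studentized range value q = 3.40 for $q_{.05}(3, 60)$ is assumed here as a typical table entry (the slides only say to take it from the table), and the helper names are hypothetical. The harmonic-mean helper covers the unequal-sample-size case mentioned on the previous slide.

```python
from math import sqrt


def harmonic_mean_size(sizes):
    """n_g = k / (1/n_1 + ... + 1/n_k); equals the common size when all are equal."""
    return len(sizes) / sum(1 / n for n in sizes)


def tukey_omega(q_crit, mse, n_g):
    """omega = q * sqrt(MSE / n_g)."""
    return q_crit * sqrt(mse / n_g)


means = {"City 1": 577.55, "City 2": 653.00, "City 3": 608.65}
n_g = harmonic_mean_size([20, 20, 20])
q_crit = 3.40  # assumed table value for q_.05(3, 60)
omega = tukey_omega(q_crit, 8894.0, n_g)

print(round(omega, 2))  # 71.7
names = list(means)
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        diff = abs(means[names[i]] - means[names[j]])
        print(names[i], "vs.", names[j], round(diff, 2), diff > omega)
```

Under these assumptions only the City 1 vs. City 2 difference (75.45) exceeds ω, which matches the Fisher and Bonferroni conclusions.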
  • 82. Excel – Tukey and Fisher LSD method (Xm15-01)
      Fisher’s LSD (α = .05)
      Multiple Comparisons
      Treatment    Treatment  Difference  LSD (Alpha = 0.05)    Omega (Alpha = 0.05)
      Convenience  Quality    -75.45      59.72                 71.70
      Convenience  Price      -31.1       59.72                 71.70
      Quality      Price      44.35       59.72                 71.70
      Bonferroni adjustments (α = .05/3 = .0167)
      Multiple Comparisons
      Treatment    Treatment  Difference  LSD (Alpha = 0.0167)  Omega (Alpha = 0.05)
      Convenience  Quality    -75.45      73.54                 71.70
      Convenience  Price      -31.1       73.54                 71.70
      Quality      Price      44.35       73.54                 71.70