Zipped Material - Hypothesis & Estimation Test for Population Variances.zip
Hypothesis & Estimation Test for Population Variances - Sample Problems.pdf
Sample Problems
1) A random sample of 20 values was selected from a population, and the sample standard
deviation was computed to be 360. Based on this sample result, compute a 95% confidence interval
estimate for the true population standard deviation.
No statistical method exists for developing a confidence interval estimate for a population standard deviation
directly. Instead, we must first convert to variances. Thus, we get a sample variance equal to
$s^2 = 360^2 = 129{,}600$
Now we compute the interval estimate using:
$$\frac{(n-1)s^2}{\chi^2_U} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_L}$$
where
$s^2$ = sample variance
$n$ = sample size
$\chi^2_L$ = Lower Critical Value
$\chi^2_U$ = Upper Critical Value
The denominators come from the chi-square distribution with n -1 degrees of freedom. In an application in
which the sample size is n = 20 and the desired confidence level is 95%, from the chi-square table in Appendix
G we get the critical value
$\chi^2_U = \chi^2_{0.025} = 32.8523$
Likewise, we get:
$\chi^2_L = \chi^2_{0.975} = 8.9065$
Now given that the sample variance computed from the sample of n = 20 values is
$s^2 = 360^2 = 129{,}600$
Then, we construct the 95% confidence interval as follows:
$$\frac{(20-1)\,129{,}600}{32.8523} \le \sigma^2 \le \frac{(20-1)\,129{,}600}{8.9065}$$
$$74{,}953.7 \le \sigma^2 \le 276{,}472.2$$
Thus, at the 95% confidence level, we conclude that the population variance will fall in the range 74,953.7 to
276,472.2. By taking the square root, we can convert to an interval estimate of the population standard
deviation as the interval 273.78 to 525.81.
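The arithmetic in Problem 1 can be checked with a short script. This is a minimal sketch, assuming the two critical values are read from the chi-square table as quoted above; the function name `variance_interval` is illustrative and not part of the original material.

```python
import math

# Values taken from the worked solution above.
n = 20                 # sample size
s = 360.0              # sample standard deviation
chi2_upper = 32.8523   # table value, right-tail area 0.025, 19 df
chi2_lower = 8.9065    # table value, right-tail area 0.975, 19 df

def variance_interval(n, s, chi2_lower, chi2_upper):
    """Return (low, high) bounds of the confidence interval for the variance."""
    numerator = (n - 1) * s ** 2
    return numerator / chi2_upper, numerator / chi2_lower

var_low, var_high = variance_interval(n, s, chi2_lower, chi2_upper)
# The interval for the standard deviation is the square root of each bound.
sd_low, sd_high = math.sqrt(var_low), math.sqrt(var_high)

print(f"variance: {var_low:.1f} to {var_high:.1f}")
print(f"std dev:  {sd_low:.2f} to {sd_high:.2f}")
```

Dividing by the larger critical value gives the lower bound, which is why the upper critical value appears on the left side of the interval.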
2) Given the following null and alternative hypotheses
H0: $\sigma^2 = 100$
HA: $\sigma^2 \ne 100$
a) Test when n = 27, s = 9, and $\alpha = 0.10$; state the decision rule.
b) Test when n = 17, s = 6, and $\alpha = 0.05$; state the decision rule.
a)
H0: $\sigma^2 = 100$, HA: $\sigma^2 \ne 100$
This is a two‐tailed test.
The random sample consists of n = 27 observations. The sample variance is $s^2 = 9^2 = 81$. The test statistic is
$$\chi^2 = \frac{(n-1)s^2}{\sigma^2} = \frac{(27-1)\,81}{100} = 21.06$$
Because $\chi^2 = 21.06 < \chi^2_{0.05} = 38.8851$ and because $\chi^2 = 21.06 > \chi^2_{0.95} = 15.3792$, do not reject the null hypothesis
based on these sample data.
Based on the sample data and the hypothesis test conducted, we do not reject the null hypothesis at the α =
0.10 level of significance: there is insufficient evidence that the population variance differs from 100.
b)
H0: $\sigma^2 = 100$, HA: $\sigma^2 \ne 100$
This is a two‐tailed test.
The random sample consists of n = 17 observations. The sample variance is $s^2 = 6^2 = 36$. The test statistic is
$$\chi^2 = \frac{(n-1)s^2}{\sigma^2} = \frac{(17-1)\,36}{100} = 5.76$$
Because $\chi^2 = 5.76 < \chi^2_{0.975} = 6.9077$, we reject the null hypothesis.
Based on the sample data and the hypothesis test conducted, we reject the null hypothesis at the α = 0.05
level of significance and conclude the population variance is different from 100.
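Both parts of Problem 2 follow the same two-tailed pattern, which can be sketched in a few lines. The helper names below are hypothetical. The critical values are typed in from the solution, except the upper critical value for part b (28.8454), which the solution does not quote and which is assumed here from a standard chi-square table.

```python
def chi_square_statistic(n, s, sigma0_sq):
    """Test statistic (n - 1) * s^2 / sigma0^2 for a hypothesized variance."""
    return (n - 1) * s ** 2 / sigma0_sq

def two_tailed_decision(stat, lower_crit, upper_crit):
    """Reject H0 when the statistic falls outside [lower_crit, upper_crit]."""
    if stat < lower_crit or stat > upper_crit:
        return "reject H0"
    return "do not reject H0"

# Part a: n = 27, s = 9, alpha = 0.10; both critical values are quoted above.
stat_a = chi_square_statistic(27, 9, 100)
decision_a = two_tailed_decision(stat_a, 15.3792, 38.8851)

# Part b: n = 17, s = 6, alpha = 0.05. The lower value 6.9077 is quoted above;
# the upper value 28.8454 (16 df, right-tail area 0.025) is an assumption
# taken from a standard table, not from the original solution.
stat_b = chi_square_statistic(17, 6, 100)
decision_b = two_tailed_decision(stat_b, 6.9077, 28.8454)

print(f"a) chi-square = {stat_a:.2f}: {decision_a}")
print(f"b) chi-square = {stat_b:.2f}: {decision_b}")
```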
3) A manager is interested in determining if the population standard deviation has dropped below
130. Based on a sample of n = 20 items selected randomly from the population, conduct the
appropriate hypothesis test at a 0.05 significance level. The sample standard deviation is
105.
H0: $\sigma^2 \ge 16{,}900$
HA: $\sigma^2 < 16{,}900$
If $\chi^2 < \chi^2_{0.95} = 10.1170$, reject the null hypothesis
Otherwise, do not reject the null hypothesis
The random sample of n = 20 items provided a sample standard deviation equal to 105. Thus, the
test statistic is:
$$\chi^2 = \frac{(n-1)s^2}{\sigma^2} = \frac{(20-1)\,11{,}025}{16{,}900} = 12.39$$
Since $\chi^2 = 12.39 > 10.1170$, do not reject the null hypothesis.
Based on the test results, the data do not present sufficient evidence to justify concluding that the population
standard deviation has dropped below 130.
4) The following sample data have been collected for the purpose of testing whether a population
standard deviation is equal to 40. Conduct the appropriate hypothesis test using $\alpha = 0.05$.
318 255 323 325 334
354 266 308 321 297
316 272 346 266 309
H0: $\sigma^2 = 1{,}600$
HA: $\sigma^2 \ne 1{,}600$
The decision rule is:
If the test statistic $\chi^2 > \chi^2_{0.025} = 26.1189$, reject the null hypothesis
If the test statistic $\chi^2 < \chi^2_{0.975} = 5.6287$, reject the null hypothesis
Otherwise, do not reject the null hypothesis
We first compute the sample variance as follows:
$$s^2 = \frac{\sum (x - \bar{x})^2}{n-1} = 916.52$$
The test statistic is a chi‐square value computed as follows:
$$\chi^2 = \frac{(n-1)s^2}{\sigma^2} = \frac{(15-1)\,916.52}{1{,}600} = 8.02$$
Because $\chi^2 = 8.02 > 5.6287$ and because $\chi^2 = 8.02 < 26.1189$, we do not reject the null hypothesis.
Based on these sample data, we do not have any reason to conclude that the population variance is not 1,600.
Therefore, we also do not have sufficient evidence to conclude that the population standard deviation is not
40.
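Problem 4 starts from raw data, so the sample variance has to be computed before the test statistic. A minimal sketch using only the Python standard library follows; the critical values are the table values from the decision rule above, and the variable names are illustrative.

```python
import statistics

data = [318, 255, 323, 325, 334,
        354, 266, 308, 321, 297,
        316, 272, 346, 266, 309]

sigma0 = 40                            # hypothesized standard deviation
n = len(data)
s_sq = statistics.variance(data)       # sample variance (divides by n - 1)
chi_sq = (n - 1) * s_sq / sigma0 ** 2  # chi-square test statistic, 14 df

lower_crit, upper_crit = 5.6287, 26.1189   # from the decision rule above
reject = chi_sq < lower_crit or chi_sq > upper_crit

print(f"s^2 = {s_sq:.2f}, chi-square = {chi_sq:.2f}, reject H0: {reject}")
```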
5) Given the following null and alternative hypotheses
H0: $\sigma^2 = 50$
HA: $\sigma^2 \ne 50$
a) Test when n = 12, s = 9, and $\alpha = 0.10$; state the decision rule.
b) Test when n = 19, s = 6, and $\alpha = 0.05$; state the decision rule.
a)
The random sample consists of n = 12 observations. The sample variance is $s^2 = 9^2 = 81$. The test statistic is
$$\chi^2 = \frac{(n-1)s^2}{\sigma^2} = \frac{(12-1)\,81}{50} = 17.82$$
Because $\chi^2 = 17.82 < \chi^2_{0.05} = 19.6751$ and because $\chi^2 = 17.82 > \chi^2_{0.95} = 4.5748$, do not reject the null hypothesis based on
these sample data.
Based on the sample data and the hypothesis test conducted, we do not reject the null hypothesis at the α =
0.10 level of significance: there is insufficient evidence that the population variance differs from 50.
b)
The decision rule is:
If the test statistic $\chi^2 > \chi^2_{0.025} = 31.5264$, reject the null hypothesis
If the test statistic $\chi^2 < \chi^2_{0.975} = 8.2307$, reject the null hypothesis
Otherwise, do not reject the null hypothesis
The random sample consists of n = 19 observations. The sample variance is $s^2 = 6^2 = 36$. The test statistic is
$$\chi^2 = \frac{(n-1)s^2}{\sigma^2} = \frac{(19-1)\,36}{50} = 12.96$$
Because $\chi^2 = 12.96 < \chi^2_{0.025} = 31.5264$ and because $\chi^2 = 12.96 > \chi^2_{0.975} = 8.2307$, we do not reject the null hypothesis.
Based on the sample data and the hypothesis test conducted, we do not reject the null hypothesis at the α = 0.05
level of significance: there is insufficient evidence that the population variance differs from 50.
6) Suppose a random sample of 22 items produces a sample standard deviation of 16.
a) Use the sample results to develop a 90% confidence interval estimate for the population variance.
b) Use the sample results to develop a 95% confidence interval estimate for the population variance.
a)
The confidence interval estimate for the population variance, $\sigma^2$, is computed using Equation 11‐2 shown
below:
$$\frac{(n-1)s^2}{\chi^2_U} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_L}$$
For a 90% confidence interval we find the following values for $\chi^2_L$ and $\chi^2_U$ with n‐1 = 22‐1 = 21 degrees of
freedom:
$\chi^2_L = \chi^2_{0.95} = 11.5913$ and $\chi^2_U = \chi^2_{0.05} = 32.6706$
The confidence interval is calculated using Equation 11‐2:
$$\frac{(22-1)\,16^2}{32.6706} \le \sigma^2 \le \frac{(22-1)\,16^2}{11.5913}$$
b)
The confidence interval estimate for the population variance, $\sigma^2$, is computed using Equation 11‐2. For a 95%
confidence interval we find the following values for $\chi^2_L$ and $\chi^2_U$ with n‐1 = 22‐1 = 21 degrees of freedom:
$\chi^2_L = \chi^2_{0.975} = 10.2829$ and $\chi^2_U = \chi^2_{0.025} = 35.4789$
The confidence interval is calculated using Equation 11‐2:
$$\frac{(22-1)\,16^2}{35.4789} \le \sigma^2 \le \frac{(22-1)\,16^2}{10.2829}$$
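The two intervals in Problem 6 are left as unevaluated fractions; the sketch below evaluates them. It assumes the table values quoted in the solution, and the function name is illustrative only.

```python
def variance_interval(n, s, chi2_lower, chi2_upper):
    """Bounds (n-1)s^2/chi2_upper and (n-1)s^2/chi2_lower for the variance."""
    numerator = (n - 1) * s ** 2
    return numerator / chi2_upper, numerator / chi2_lower

n, s = 22, 16

# 90% interval: table values for 21 df with right-tail areas 0.95 and 0.05.
low_90, high_90 = variance_interval(n, s, 11.5913, 32.6706)

# 95% interval: table values for 21 df with right-tail areas 0.975 and 0.025.
low_95, high_95 = variance_interval(n, s, 10.2829, 35.4789)

print(f"90% CI for the variance: {low_90:.2f} to {high_90:.2f}")
print(f"95% CI for the variance: {low_95:.2f} to {high_95:.2f}")
```

The 95% interval comes out wider than the 90% interval, as expected when the confidence level is raised with the same data.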
7) Historical data indicate that the standard deviation of a process is 6.3. A recent sample of size
a) 28 produced a variance of 66.2. Test to determine if the variance has increased using a
significance level of 0.05.
b) 8 produced a variance of 9.02. Test to determine if the variance has decreased using a
significance level of 0.025. Use the test statistic approach.
c) 18 produced a variance of 62.9. Test to determine if the variance has changed using a significance
level of 0.10.
a)
H0: $\sigma^2 \le 39.69$
HA: $\sigma^2 > 39.69$
$$\chi^2 = \frac{(n-1)s^2}{\sigma^2} = \frac{(28-1)\,66.2}{(6.3)^2} = 45.03$$
p‐value = $P(\chi^2 \ge 45.03)$. Therefore, 0.01 < p‐value < 0.025.
Since p‐value < α, reject H0.
b)
Step 1: The parameter of interest is the population variance, $\sigma^2$,
Step 2: H0: $\sigma^2 \ge 39.69$, HA: $\sigma^2 < 39.69$,
Step 3: α = 0.025,
Step 4: $\chi^2 = \frac{(n-1)s^2}{\sigma^2} = \frac{(8-1)\,9.02}{(6.3)^2} = 1.591$,
Step 5: The critical value is obtained from a $\chi^2$ distribution with n - 1 = 7 degrees of freedom, i.e., 1.6899,
Step 6: Since the test statistic = 1.591 < the critical value = 1.6899, reject the null hypothesis,
Step 7: Conclude that the variance has decreased.
c)
Step 1: The parameter of interest is the population variance, $\sigma^2$,
Step 2: H0: $\sigma^2 = 39.69$, HA: $\sigma^2 \ne 39.69$.
Step 3: α = 0.10
Step 4: $\chi^2 = \frac{(n-1)s^2}{\sigma^2} = \frac{(18-1)\,62.9}{(6.3)^2} = 26.94$. Because the test is two‐tailed and the statistic lies in the upper tail,
p‐value = $2 \times P(s^2 \ge 62.9) = 2 \times P(\chi^2 \ge 26.94)$.
From Minitab:
Cumulative Distribution Function
Chi‐Square with 17 DF
x P( X <= x )
26.94 0.941046
So $P(\chi^2 \ge 26.94) = 1 - 0.941046 = 0.058954$ and p‐value = 2(0.058954) = 0.1179.
Step 5: α = 0.10 < p‐value = 0.1179
Step 6: Fail to reject H0,
Step 7: Conclude that there is not enough evidence to conclude
that the variance has changed.
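The Minitab cumulative probability used in part c can be reproduced without statistical software. The sketch below is one possible approach, not the method used in the original solution: it evaluates the chi-square CDF through the standard series for the regularized lower incomplete gamma function, then doubles the upper-tail area for the two-tailed test. The function name `chi2_cdf` is hypothetical.

```python
import math

def chi2_cdf(x, df):
    """P(X <= x) for a chi-square variable with df degrees of freedom.

    Uses the series gamma(a, t) = t^a e^-t * sum_k t^k / (a (a+1) ... (a+k))
    with a = df / 2 and t = x / 2, divided by Gamma(a).
    """
    if x <= 0:
        return 0.0
    a, t = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 0
    while term > 1e-15 * total:
        k += 1
        term *= t / (a + k)
        total += term
    return total * math.exp(-t + a * math.log(t) - math.lgamma(a))

cdf_value = chi2_cdf(26.94, 17)      # value reported by Minitab: 0.941046
upper_tail = 1.0 - cdf_value
p_value = 2.0 * upper_tail           # two-tailed test, statistic in upper tail
alpha = 0.10

print(f"P(X <= 26.94) = {cdf_value:.6f}")
print(f"two-tailed p-value = {p_value:.4f}, reject H0: {p_value < alpha}")
```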
8) Examine the sample obtained from a normally distributed population:
5.2 10.4 5.1 2.1 4.8 15.5 10.2
8.7 2.8 4.9 4.7 13.4 15.6 14.5
a) Calculate the variance.
b) Calculate the probability that a randomly chosen sample would produce a sample variance at
least as large as that produced in part (a) if the population variance was equal to 20.
c) What is the statistical term used to describe the probability calculated in part (b)?
d) Conduct a hypothesis test to determine if the population variance is larger than 15.3. Use a
significance level equal to 0.05.
a) $s^2 = \frac{\sum (x - \bar{x})^2}{n-1} = 23.297$
b) $\chi^2 = \frac{(n-1)s^2}{\sigma^2} = \frac{(14-1)\,23.297}{20} = 15.14$, so $P(s^2 \ge 23.297) = P(\chi^2 \ge 15.14) \approx 0.30$. Therefore, p‐value ≈ 0.30.
c) This is the p‐value: the probability, computed assuming the null hypothesis is true, of obtaining a sample
result at least as extreme as the one observed.
d) Using the seven step procedure outlined in the chapter:
Step 1: The parameter of interest is the population variance, $\sigma^2$,
Step 2: H0: $\sigma^2 \le 15.3$, HA: $\sigma^2 > 15.3$.
Step 3: α = 0.05
Step 4: Since $\chi^2 = \frac{(n-1)s^2}{\sigma^2} = \frac{(14-1)\,23.297}{15.3} = 19.79$, $P(\chi^2 \ge 19.79) \approx 0.10$. Therefore, p‐value ≈ 0.10.
Step 5: α = 0.05 < p‐value ≈ 0.10,
Step 6: Fail to reject H0,
Step 7: Conclude that there is not enough evidence to conclude that the variance is larger than 15.3.
9) Given the following null and alternative hypotheses
H0: $\sigma_1^2 \le \sigma_2^2$
HA: $\sigma_1^2 > \sigma_2^2$
and the following sample information
Sample 1              Sample 2
$n_1 = 13$            $n_2 = 21$
$s_1^2 = 1450$        $s_2^2 = 1320$
a) If $\alpha = 0.05$, state the decision rule for the hypothesis.
b) Test the hypothesis and indicate whether the null hypothesis should be rejected.
a) Using the F Distribution Table: If the calculated F > 2.278, reject H0, otherwise do not reject H0
b) F = 1450/1320 = 1.0985
Since 1.0985 < 2.278 do not reject H0.
10) Given the following null and alternative hypotheses
H0: $\sigma_1^2 \le \sigma_2^2$
HA: $\sigma_1^2 > \sigma_2^2$
and the following sample information
Sample 1              Sample 2
$n_1 = 21$            $n_2 = 12$
$s_1^2 = 345.7$       $s_2^2 = 745.2$
a) If $\alpha = 0.01$, state the decision rule for the hypothesis (you need to pay attention to the alternative
hypothesis to construct this decision rule.)
b) Test the hypothesis and indicate whether the null hypothesis should be rejected.
a) Using the F Distribution Table: If the calculated F > 3.858, reject H0, otherwise do not reject H0
b) F = 345.7/745.2 = 0.46390
Since 0.46390 < 3.858 do not reject H0
11) Find the appropriate critical F‐value, from the F Distribution Table, for each of the following:
a) D1=16,
b) D1=5,
c) D1=16,
a) Using the F Distribution Table: F = 3.619
b) Using the F Distribution Table: F = 3.106
c) Using the F Distribution Table: F = 3.051
12) Given the following null and alternative hypotheses
H0: $\sigma_1^2 = \sigma_2^2$
HA: $\sigma_1^2 \ne \sigma_2^2$
and the following sample information
a) If $\alpha = 0.02$, state the decision rule for the hypothesis.
b) Test the hypothesis and indicate whether the null hypothesis should be rejected.
a) If the calculated F > 4.405, reject H0, otherwise do not reject H0
b) $F = 33^2/15^2 = 4.84$. Since 4.84 > 4.405, reject H0
13) Consider the following two independently chosen samples:
Sample 1 Sample 2
12.1 10.5
13.4 9.5
11.7 8.2
10.7 7.8
14.0 11.1
Use a significance level of 0.05 for testing the hypothesis that $\sigma_1^2$
Step 3: α = 0.05,
Step 4: The critical value is obtained from the F-distribution. F = 6.388. Reject H0 if F > 6.388.
Step 5: $s_1^2 = \frac{\sum (x - \bar{x})^2}{n-1} = \frac{7.028}{4} = 1.757$ and $s_2^2 = \frac{\sum (x - \bar{x})^2}{n-1} = \frac{8.108}{4} = 2.027$. The test statistic is
$F = \frac{s_2^2}{s_1^2} = \frac{2.027}{1.757} = 1.154$,
Step 6: Since F = 1.154 < 6.388 = $F_{0.05}$, fail to reject H0.
Step 7: Based on these sample data, there is not sufficient evidence to conclude that population
one’s variance is larger than population two’s variance.
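Step 5 of Problem 13 can be reproduced directly from the two samples. This is a minimal sketch using the standard library; it follows the solution in placing the larger sample variance (sample 2) in the numerator, and it takes the critical value 6.388 from Step 4 as given.

```python
import statistics

sample_1 = [12.1, 13.4, 11.7, 10.7, 14.0]
sample_2 = [10.5, 9.5, 8.2, 7.8, 11.1]

var_1 = statistics.variance(sample_1)   # sample variance, divides by n - 1
var_2 = statistics.variance(sample_2)

# As in Step 5, the ratio is formed with the larger variance on top.
f_stat = var_2 / var_1
f_crit = 6.388                          # table value quoted in Step 4 (4 and 4 df)

decision = "reject H0" if f_stat > f_crit else "fail to reject H0"
print(f"s1^2 = {var_1:.3f}, s2^2 = {var_2:.3f}, F = {f_stat:.3f}: {decision}")
```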
14) You are given two random samples with the following information:
Item Sample 1 Sample 2
1 19.6 21.3
2 22.1 17.4
3 19.5 19.0
4 20.0 21.2
5 21.5 20.1
6 20.2 23.5
7 17.9 18.9
8 23.0 22.4
9 12.5 14.3
10 19.0 17.8
Based on these samples, test at $\alpha = 0.10$ whether the true difference in population variances is equal
to zero.
$s_1 = 2.8975$, $s_2 = 2.7033$
H0: $\sigma_1^2 = \sigma_2^2$
HA: $\sigma_1^2 \ne \sigma_2^2$
Using the F Distribution Table with D1 = 9 and D2 = 9: If the calculated F > 3.179, reject H0, otherwise do not
reject H0.
$F = 2.8975^2 / 2.7033^2 = 1.1488$
Since 1.1488 < 3.179 do not reject H0
• Examples of derived distributions are the t distribution and the chi-square distribution
Chi-Square Distribution
[Figure: chi-square curve with right-tail area a]
Example
• Using a chi-square distribution with 12 df, determine $P(\chi^2 > 18.5494)$ and $P(\chi^2 \le 6.30380)$
[Figure: chi-square curve with right-tail area .1 beyond 18.5494]
Example
• $P(\chi^2 > 18.5494) = .1$
• For $\chi^2 = 6.30380$, the area to the right is .900. As the total area is 1, the area to the left of 6.30380 is 1 - .900 = .1
• $P(6.30380 \le \chi^2 \le 18.5494) = 1 - .1 - .1 = .8$
• 80% of the time a $\chi^2$ value with 12 df will be between 6.30380 and 18.5494
Example
• Using the previous example, determine a and b that satisfy $P(a < \chi^2 < b) = .95$ so that an equal area occurs in each tail.
• a = $\chi^2$ value whose left tailed area is .025 = $\chi^2$ value whose area to the right is .975 = 4.40
• b = $\chi^2$ value whose right tailed area is .025 = 23.3
Confidence Interval for $\sigma^2$
• A sample of size n = 13 results in 12 df. From the previous example, it follows that
• $P(4.40 < \chi^2 < 23.3) = .95$
• $P(12s^2/23.3 < \sigma^2 < 12s^2/4.40) = .95$
• A 95% CI for $\sigma^2$ is $12s^2/23.3$ to $12s^2/4.40$
Example
• A random sample of the weights (in pounds) of 15 bags was obtained: 51.2, 47.5, 50.8, 51.5, 49.5, 51.1, 51.3, 50.7, 46.7, 49.2, 52.1, 48.3, 51.6, 49.2, 51.5
• Mean = 50.15, and s = 1.651. Determine a 90% CI for $\sigma^2$ and $\sigma$
• $(15-1)(1.651)^2 / \chi^2_{.05,\,14}$ to $(15-1)(1.651)^2 / \chi^2_{.95,\,14}$
• = $(14)(1.651)^2 / 23.7$ to $(14)(1.651)^2 / 6.57$
• = 1.61 to 5.81 for $\sigma^2$
• √1.61 to √5.81 = 1.27 to 2.41 for $\sigma$
Hypothesis Testing for Variance & Standard Deviation
Example
• Continuing from the previous example, H0: $\sigma \le .5$ and Ha: $\sigma > .5$
• Test statistic: $\chi^2 = (15-1)s^2/(.5)^2 = 14s^2/.25$
• Reject H0 if $\chi^2 > \chi^2_{.1,\,14} = 21.1$
• $\chi^2 = (15-1)(1.651)^2/(.5)^2 = 152.6$
• As 152.6 > 21.1 we reject H0.
Estimation & Hypothesis Tests for Two Population Variances
Assumptions
• The samples are independent
[Figure: Population 1 and Population 2, with means µ1 and µ2]
F Distribution
• Look at the ratio of sample variances: $F = s_1^2 / s_2^2$
[Figure: F distribution for the ratio of the Population 1 and Population 2 sample variances]
Two-Tail Test
• $F = s_1^2 / s_2^2$
• Reject H0 if $F > F_{\alpha/2,\,v_1,\,v_2}$ (right tail) or if $F < F_{1-\alpha/2,\,v_1,\,v_2}$ (left tail)
• where $v_1 = n_1 - 1$ and $v_2 = n_2 - 1$
One-Tail Test
• $F = s_1^2 / s_2^2$
• Right-tail alternative: Reject H0 if $F > F_{\alpha,\,v_1,\,v_2}$
• Left-tail alternative: Reject H0 if $F < F_{1-\alpha,\,v_1,\,v_2}$
• where $v_1 = n_1 - 1$ and $v_2 = n_2 - 1$
Finding Right Tail F Values
[Figure: F curve with right-tail area .10 beyond F = 2.67]
Using The F-Statistic Table
V2 \ V1      1       2       6       7       8       9
1        39.86   49.50   58.20   58.91   59.44   59.86
2         8.53    9.00    9.33    9.35    9.37    9.38
6         3.78    3.46    3.05    3.01    2.98    2.96
7         3.59    3.26    2.83    2.78    2.75    2.72
8         3.46    3.11    2.67    2.62    2.59    2.56
9         3.29    3.01    2.55    2.51    2.47    2.44
Finding Left Tail F Values
[Figure: F curve with left-tail area .10 below F = .336]
Example
• The test statistic is $F = s_1^2 / s_2^2$
• The df are v1 = 25 - 1 = 24, v2 = 20 - 1 = 19. We find $F_{.05,\,24,\,19} = 2.11$.
• The test of H0 versus Ha will be: reject H0 if F > 2.11
• The computed F value F* = $(1.21)^2/(.72)^2$ = 2.82
• As 2.82 > 2.11, we reject H0
Confidence Interval for $\sigma_1^2 / \sigma_2^2$
• The F-values are $F_R = F_{.025,\,v_1,\,v_2}$ and $F_L = 1 / F_{.025,\,v_2,\,v_1}$
• The confidence interval is $(s_1^2/s_2^2) / F_R$ to $(s_1^2/s_2^2) / F_L$
[Figure: F curve with area .025 in each tail, bounded by FL and FR]
Example
• Determine a 95% CI for $\sigma_1^2 / \sigma_2^2$
• n1 = 25, s1 = 1.21, n2 = 20, s2 = .72
• $F_R = F_{.025,\,24,\,19} = 2.45$
• $F_L = 1 / F_{.025,\,19,\,24} = 1/2.33 = .43$
• CI = $[(1.21)^2/(.72)^2] / 2.45$ to $[(1.21)^2/(.72)^2] / .43$ = 1.15 to 6.57
• Determine a 95% CI for $\sigma_1 / \sigma_2$
• √1.15 to √6.57 = 1.07 to 2.56
Hypothesis & Estimation Test for Population Variances - Notes.pdf
Hypothesis Tests & Estimation for Population Variances
When we are considering the mean of a particular random variable or population, we are in effect trying to estimate what is occurring on the average. For example, a worker involved with a production process that manufactures 2 inch nails has been told that these nails are in fact 2 inches long, on the average. Now, suppose half of the nails produced are 1 inch long and the other half are 3 inches; the report was no doubt accurate, since on the average the nails are 2 inches long. However, what was missing in the report was the amount of variation in the production process. If the variation were 0, then every nail would be exactly 2 inches long. In practice, there is some degree of variation present in any production process. Here we are concerned not only with the mean length $\mu$ of the population of nails, but also with the variance $\sigma^2$ or standard deviation $\sigma$ of the lengths of the nails. If the variance is too large, the process is not operating correctly and needs to be adjusted. This is a key element in statistical quality improvement: a process is in control if it is consistent and contains only random variation. In the inference procedures for a population variance and standard deviation that we discuss below, we will assume that the population of interest is normally distributed. The hypothesis testing procedures and the confidence intervals for the variance are sensitive to departures from the normal population; i.e., the tests of hypotheses are not that robust.
Confidence Interval for the Variance & Standard Deviation
The point estimate of a population variance is the sample variance; i.e., we use $s^2$ of a sample to estimate the variance $\sigma^2$ of a population. When we construct a confidence interval for $\mu$ using a small sample, we use the t distribution. This is a derived distribution, as we use it to describe the behavior of a particular test statistic. We do not use this type of distribution to describe a population; the t random variable just offers us a method of testing and constructing confidence intervals for the mean of a normal population when the standard deviation is unknown and is replaced by its estimate. The chi‐square ($\chi^2$) distribution is another derived distribution, which allows us to determine confidence intervals and perform hypothesis tests on the variance and the standard deviation of a normal population. The shape of the distribution is as given below.
[Figure: chi-square curve]
Similar to all continuous distributions, for a chi‐square distribution, a probability corresponds to an area under a curve. Also, the shape of the chi‐square curve (like that of the t distribution) depends on the sample size n; this will be specified by the corresponding degrees of freedom (df). When using the chi‐square distribution to construct a confidence interval or perform a hypothesis test on a population variance or standard deviation, the degrees of freedom will be given by df = (n ‐ 1). Let $\chi^2_{a,\,df}$ be the $\chi^2$ value whose area to the right is a, using the proper df.
Let’s consider an example. Use a chi‐square curve with 12 df to determine $P(\chi^2 > 18.5494)$ and $P(\chi^2 \le 6.30380)$.
The values of the chi‐square distribution are given in the appendix in the text. From the table, $P(\chi^2 > 18.5494) = .1$. This can be written as $\chi^2_{.1,\,12} = 18.5494$. For $\chi^2 = 6.30380$, the chi‐square table tells us that the area to the right of 6.30380 is .900. The total area is 1, so the area to the left of 6.30380 is 1 ‐ .900 = .1, and $P(\chi^2 \le 6.30380) = .1$. We can say that $P(6.30380 \le \chi^2 \le 18.5494) = 1 - .1 - .1 = .8$. So, 80% of the time a $\chi^2$ value with 12 df will be between 6.30380 and 18.5494.
Using the above example, we need to determine a and b so as to satisfy $P(a < \chi^2 < b) = .95$ with df = 12. Choose a and b so that an equal area occurs in each tail.
[Figure: chi-square curve with area .025 in each tail, bounded by a and b]
Using the chi‐square table,
a = $\chi^2$ value whose left‐tailed area is .025 = $\chi^2$ value whose area to the right is .975 = 4.40, and
b = $\chi^2$ value whose right‐tailed area is .025 = 23.3.
To derive a confidence interval for $\sigma^2$, we need to examine the sampling distribution of $s^2$. If we repeatedly obtained a random sample from a normal population with a mean of $\mu$ and variance $\sigma^2$, calculated the sample variance $s^2$, and drew a histogram of these $s^2$ values, we would see that the shape of the histogram depends on the sample size n and the value of $\sigma^2$ but not on the value of the population mean $\mu$. The values of n and $\sigma^2$, along with the random variable $s^2$, can be combined to define a chi‐square random variable, given by
$$\chi^2 = \frac{(n-1)s^2}{\sigma^2}$$
This is a chi‐square distribution with (n - 1) df. So, the sampling distribution for $s^2$ can be defined using the chi‐square distribution as given in the equation above.
Suppose a sample of size n = 13 results in 12 df. From the example we saw earlier, it then follows that
$P(4.40 < \chi^2 < 23.3) = .95$
$P(4.40 < 12s^2/\sigma^2 < 23.3) = .95$
$P(12s^2/23.3 < \sigma^2 < 12s^2/4.40) = .95$
The parameter $\sigma^2$ is bounded between two limits defined by a random variable $s^2$. This means that a 95% confidence interval for $\sigma^2$ is $12s^2/23.3$ to $12s^2/4.40$.
In general, a $(1-\alpha)\cdot 100\%$ confidence interval for $\sigma^2$ is
$(n-1)s^2 / \chi^2_{\alpha/2,\,n-1}$ to $(n-1)s^2 / \chi^2_{1-\alpha/2,\,n-1}$
The corresponding confidence interval for $\sigma$ is
$\sqrt{(n-1)s^2 / \chi^2_{\alpha/2,\,n-1}}$ to $\sqrt{(n-1)s^2 / \chi^2_{1-\alpha/2,\,n-1}}$
Let us consider an example. A certain brand of potash fertilizer comes in 10, 25, and 50 pound bags. The fertilizer firm’s supervisor is concerned about the variation in the weight of the 50 pound bags because the firm has recently bought a new mechanical packaging device. A random sample of the weights of 15 bags (in pounds) was obtained as given below.
51.2, 47.5, 50.8, 51.5, 49.5, 51.1, 51.3, 50.7, 46.7, 49.2, 52.1, 48.3, 51.6, 49.2, 51.5
For these data, the mean = 50.15, and the standard deviation = 1.651. We need to determine a 90% CI for $\sigma^2$ and $\sigma$. The weights of the bags come from a normal population.
The corresponding 90% CI for $\sigma^2$ is
$(15-1)(1.651)^2 / \chi^2_{.05,\,14}$ to $(15-1)(1.651)^2 / \chi^2_{.95,\,14}$
= $(14)(1.651)^2 / 23.7$ to $(14)(1.651)^2 / 6.57$
= 1.61 to 5.81
The 90% CI for $\sigma$ would be √1.61 to √5.81 = 1.27 to 2.41
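The fertilizer example can be reproduced from the raw weights. The sketch below is illustrative only: it computes the sample mean and standard deviation with the standard library and then applies the interval formula, using the table values 23.7 and 6.57 quoted above as the critical values.

```python
import math
import statistics

weights = [51.2, 47.5, 50.8, 51.5, 49.5, 51.1, 51.3, 50.7,
           46.7, 49.2, 52.1, 48.3, 51.6, 49.2, 51.5]

n = len(weights)
mean = statistics.mean(weights)
s = statistics.stdev(weights)          # sample standard deviation (n - 1)

chi2_05, chi2_95 = 23.7, 6.57          # table values for 14 df quoted above

var_low = (n - 1) * s ** 2 / chi2_05   # lower bound uses the larger value
var_high = (n - 1) * s ** 2 / chi2_95
sd_low, sd_high = math.sqrt(var_low), math.sqrt(var_high)

print(f"mean = {mean:.2f}, s = {s:.3f}")
print(f"90% CI for the variance: {var_low:.2f} to {var_high:.2f}")
print(f"90% CI for the std dev:  {sd_low:.2f} to {sd_high:.2f}")
```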
Hypothesis Testing for the Variance and Standard Deviation
Continuing from the previous example, based on earlier production tests, the management is convinced that the average weight of all the bags being produced is in fact 50 pounds. However, the production supervisor has been informed that at least 95% of the bags produced must be within 1 pound of the specified weight (50 pounds). Using a significance level of alpha = .1, what can we conclude? Assume a normal distribution for the bag weights.
We know that for a normal population, 95% of the observations will lie within two standard deviations of the mean. So, if two standard deviations are equivalent to 1 pound, then the supervisor is being told that sigma must be no more than .5 pound. Is there any evidence to conclude that this is not the case, i.e., that sigma is larger than .5 pound?
The hypotheses to be tested are H0: $\sigma \le .5$ and Ha: $\sigma > .5$
The test statistic is $\chi^2 = (15-1)s^2/(.5)^2 = 14s^2/.25$, which has a chi‐square distribution with 14 degrees of freedom.
The rejection region is: reject H0 if $\chi^2 > \chi^2_{.1,\,14} = 21.1$
The computed value using the sample data is: $\chi^2 = (15-1)(1.651)^2/(.5)^2 = 152.6$
As 152.6 > 21.1 we reject H0.
We conclude that sigma is larger than .5 pound. The bagging procedure has far too much variation in the weight of the bags produced.
Comparing the Variances of two normal populations using independent samples
Here, we focus our attention on the variations of populations. When estimating and testing sigma 1 and sigma 2, we are not concerned about mu 1 and mu 2; they might be equal, or they might not, and they are irrelevant to this test procedure.
When testing for population means using small, independent samples, we must consider the population standard deviations (variances). Based on our belief that sigma 1 does or does not equal sigma 2, we select our corresponding test statistic for testing the means mu 1 and mu 2. An appropriate approach would be to test sigma 1 and sigma 2 using one set of samples and then obtain another set of samples independently of the first to test the means. When looking at the variances, we use the ratio of the sample variances, $s_1^2$ and $s_2^2$, to derive a test of hypothesis and construct confidence intervals. We do this because $s_1^2 / s_2^2$ has a recognizable distribution when sigma 1 squared and sigma 2 squared are in fact equal. So, we define $F = s_1^2 / s_2^2$. If we were to obtain sets of two samples repeatedly, calculate F for each set, and make a histogram of these ratios, the shape of this histogram would resemble the F distribution.
The shape does not resemble a normal curve. There are many F curves depending upon the sample sizes n1 and n2. The shape of the F curve becomes more symmetric as the sample sizes n1 and n2 increase. A point to note here is that the F statistic is highly sensitive to the assumption of normal populations. For large datasets we need to examine the shape of the sample data when using the F statistic.
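There are two samples here, one from each population, and we need to specify both the sample sizes. The degrees of freedom are: v1 = df for numerator = (n1 ‐ 1), and v2 = df for denominator = (n2 ‐ 1). The F statistic follows an F distribution with v1 and v2 degrees of freedom provided sigma 1 squared = sigma 2 squared (sigma 1 = sigma 2). Now, what happens when sigma 1 is not equal to sigma 2? Suppose that sigma 1 > sigma 2; then we would expect s1 (the estimate for sigma 1) to be larger than s2 (the estimate for sigma 2). We would then see that $s_1^2 > s_2^2$, or $F = s_1^2 / s_2^2 > 1$. Similarly, if sigma 1 < sigma 2, then we would expect the F value to be < 1.
Hypothesis Testing for sigma 1 = sigma 2
The two tailed and one tailed hypothesis tests are:
Two tailed: H0: sigma 1 = sigma 2 versus Ha: sigma 1 ≠ sigma 2. Reject H0 if $F > F_{\alpha/2,\,v_1,\,v_2}$ or if $F < F_{1-\alpha/2,\,v_1,\,v_2}$.
One tailed (right tail): H0: sigma 1 ≤ sigma 2 versus Ha: sigma 1 > sigma 2. Reject H0 if $F > F_{\alpha,\,v_1,\,v_2}$.
One tailed (left tail): H0: sigma 1 ≥ sigma 2 versus Ha: sigma 1 < sigma 2. Reject H0 if $F < F_{1-\alpha,\,v_1,\,v_2}$.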
Finding the Right Tail F Values
The hypotheses can be written in terms of the standard deviations (sigma 1 and sigma 2) or the variances (sigma 1 squared and sigma 2 squared); if sigma 1 > sigma 2, then sigma 1 squared > sigma 2 squared. Suppose we want to know which F value has a right tail area of .10 using 6 and 8 degrees of freedom. Let the F value whose right tail area is ‘a’, where the degrees of freedom are v1 and v2, be $F_{a,\,v_1,\,v_2}$. Using the F table, $F_{.10,\,6,\,8} = 2.67$.
Finding the Left Tail F Values
To determine the left tail F values, use
F value (df = v1, v2) having a left tail area of ‘a’ = 1 / [F value (df = v2, v1) having a right tail area of ‘a’].
We know that the F value having a right tail area of .10 is 2.67 where the df are 6 and 8. For this curve, what F value has a left tail area equal to .10? Switch the df to 8 and 6. Using the F table, find the F value having a right tail area of .10 where the df are now v1 = 8 and v2 = 6. This value is 2.98. So, for the F curve with 6 and 8 df, the value having a left tail area of .10 is 1/2.98 = .336. Since the area to the left of .336 is .10, the area to the right of this value is .90; so $.336 = F_{1-.10,\,6,\,8} = F_{.9,\,6,\,8}$.
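The reciprocal rule for left tail values is easy to get backwards, so a small sketch may help. The function name is hypothetical, and the only input is the right tail table value looked up with the degrees of freedom swapped, as described above.

```python
def left_tail_f(right_tail_f_swapped_df):
    """Left-tail F value for (v1, v2) df, given the right-tail F value
    for the same tail area looked up with the df swapped to (v2, v1)."""
    return 1.0 / right_tail_f_swapped_df

# Right-tail area .10 with df (8, 6), read from the F table in the notes.
f_right_8_6 = 2.98

# Left-tail area .10 with df (6, 8).
f_left_6_8 = left_tail_f(f_right_8_6)
print(f"F value with left tail area .10 and df (6, 8): {f_left_6_8:.3f}")
```

Only right tail areas are tabulated, which is why the lookup has to be done on the curve with the degrees of freedom reversed.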
Let us consider an example. A firm is considering the purchase of some new equipment that
would be used to fill one quart containers with a radiator additive. The firm has narrowed
its choices to two brands, brand 1 and brand 2. Brand 1 is less expensive than brand 2, but
the firm suspects that the contents delivered by the brand 1 equipment have more variation
than those of the brand 2 equipment. The firm realizes that it needs to use a container
slightly larger than one quart so as to allow for heat expansion and overfill of its product.
The firm's production department obtained data on the performance of both brands for a
sample of 25 containers using brand 1 and 20 containers using brand 2. Using an alpha of .05,
test the firm's suspicions. The mean and the standard deviation measurements are in fluid
ounces.
Brand 1: n1 = 25, x1bar = 31.8, s1 = 1.21; Brand 2: n2 = 20, x2bar = 32.1, s2 = .72.
We want to determine if one standard deviation (or variance) is larger than the other, so
this is a one tailed test. The firm's suspicion is that sigma 1 is larger than sigma 2, so
this statement is the alternate hypothesis.
The hypotheses are: Ho: sigma 1 <= sigma 2; Ha: sigma 1 > sigma 2
The test statistic is F = s1^2 / s2^2
The df are v1 = 25 - 1 = 24, v2 = 20 - 1 = 19. We find F .05, 24, 19 = 2.11.
The test of Ho versus Ha will be: reject Ho if F > 2.11
The computed F value F* = (1.21)^2 / (.72)^2 = 2.82
As 2.82 > 2.11, we reject Ho.
So, the sample data support the firm's belief that the variation in the containers filled by
brand 1 exceeds that of the containers filled by brand 2.
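The arithmetic of this one tailed test can be checked with a short sketch. It assumes the hypotheses are Ho: sigma 1 <= sigma 2 against Ha: sigma 1 > sigma 2 (the form implied by the firm's suspicion); the function name variance_ratio_test is hypothetical, and the critical value 2.11 is the F table value quoted above.

```python
def variance_ratio_test(s1, s2, f_critical):
    """One-tailed F test: reject Ho when s1^2 / s2^2 exceeds the critical value."""
    f_star = s1 ** 2 / s2 ** 2
    return f_star, f_star > f_critical


# s1 and s2 are the sample standard deviations for brand 1 and brand 2.
# F .05, 24, 19 = 2.11 is taken from the F table, not computed here.
f_star, reject = variance_ratio_test(s1=1.21, s2=0.72, f_critical=2.11)
print(round(f_star, 2), reject)  # 2.82 True
```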
CI for sigma 1 squared / sigma 2 squared
Consider an F curve with v1 and v2 df. To construct a 95% CI for sigma 1 squared / sigma 2
squared, we find both the left tailed and the right tailed values. Let FL and FR denote the
left and the right tailed F values respectively.
FR = F .025, 24, 19 = 2.45
FL = 1 / F .025, 19, 24 = 1/2.33 = .43
The 95% CI for sigma 1 squared / sigma 2 squared
= [(1.21)^2 / (.72)^2] / 2.45 to [(1.21)^2 / (.72)^2] / .43
= 1.15 to 6.57 (a ratio of variances, so it carries no units)
Determine a 95% CI for sigma 1 / sigma 2:
√1.15 to √6.57 = 1.07 to 2.56
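As a rough check of the interval above, the following sketch divides the sample variance ratio by the two tail values. The function name variance_ratio_ci is hypothetical; FR = 2.45 and FL = .43 are the table-based values quoted in the text.

```python
import math


def variance_ratio_ci(s1, s2, f_right, f_left):
    """CI for sigma1^2 / sigma2^2: (ratio / FR, ratio / FL)."""
    ratio = s1 ** 2 / s2 ** 2
    return ratio / f_right, ratio / f_left


low, high = variance_ratio_ci(1.21, 0.72, f_right=2.45, f_left=0.43)
print(round(low, 2), round(high, 2))  # 1.15 6.57

# Square roots give the interval for sigma1 / sigma2.
sd_low, sd_high = math.sqrt(low), math.sqrt(high)
print(round(sd_low, 2), round(sd_high, 2))  # 1.07 2.56
```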
Zipped Chapter 11 Material.zip
Chapter 15 and Chapter 11 Lecture Power Point Slides.pdf
Pc -- proportion in the central region who prefer present plan
Pw -- proportion in the west coast who prefer present plan
• The null and alternate hypotheses are:
Ho: Pn = Ps = Pc = Pw
Ha: Pn, Ps, Pc, Pw are not all equal
Chi-Square as a Test of Independence
Combined proportion who prefer present method =
(68 + 75 + 57 + 79)/(100 + 120 + 90 + 110) = .6643.
The Chi-Square Statistic
Calculating the Chi-Square statistic
The Chi-Square Distribution
The Chi-Square distribution
Using the Chi-Square Test
Test the hypotheses:
Ho: Pn = Ps = Pc = Pw
Ha: Pn, Ps, Pc, Pw are not all equal
The sample chi-square value of 2.764 that we calculated earlier falls within the acceptance
region. Hence, we accept the null hypothesis.
Using the Chi-Square Test
Contingency Tables with more than two rows:
Test the hypotheses:
Ho: length of stay and type of insurance are independent
Ha: length of stay depends on the type of insurance
Let A = the event that a stay corresponds to someone whose insurance covers less than 25%
of the costs
Let B = the event a stay lasts less than 5 days
P(first cell) = P(A and B) = (180/660)(110/660) = 1/22
As 1/22 is the expected proportion in the first cell, the expected frequency in that cell is
(1/22)(660) = 30 observations
In general, fe = [(RT)(CT)]/n, where RT is the row total and CT is the column total
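The expected-frequency rule (row total times column total, divided by n) can be sketched as follows. The function name expected_frequency is hypothetical; the totals 180, 110 and 660 are the ones used in the first-cell calculation above. Fractions are used so the 1/22 proportion comes out exactly.

```python
from fractions import Fraction


def expected_frequency(row_total, column_total, n):
    """Expected cell count under independence: (RT)(CT) / n."""
    return Fraction(row_total * column_total, n)


first_cell = expected_frequency(180, 110, 660)
proportion = first_cell / 660
print(first_cell)   # 30
print(proportion)   # 1/22
```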
Using the Chi-Square Test
Contingency Tables with more than two rows:
As the sample chi-square value of 24.315 is not within the acceptance region, we reject the
null hypothesis.
Precautions in using the Chi-Square Test
- Use large sample sizes
- Use carefully collected data
Chi-Square as a Test of Goodness of Fit
Function of a goodness of fit test
Calculating observed and expected frequencies
Stating the hypotheses
Ho: A binomial distribution with p = .40 is a good description of the interview process
H1: A binomial distribution with p = .40 is not a good description of the interview process
The grand mean is a weighted average of the sample means, using the relative sample sizes
as the weights = (5/16)(17) + (5/16)(21) + (6/16)(19) = 304/16 = 19
ANOVA
We can state the hypotheses as follows:
Ho: µ1 = µ2 = µ3
H1: µ1, µ2, µ3 are not all equal
Assumptions made in Anova
Steps in Anova
1) Determine one estimate of the population variance from the variance among the sample
means
2) Determine a second estimate of the population variance from the variance within the
samples
3) Compare the two estimates; if they are approximately equal in value, accept the null
hypothesis
ANOVA
Calculating the variance among the sample means:
Finding the first estimate of the population variance
Finding the variance among the sample means
Finding the population variance using the variance among the sample means
Estimating the between column variance
ANOVA
Calculating the variance within the samples:
Finding the second estimate of the population variance
Sample variance
Estimate of within column variance
ANOVA
The F test:
F = (first estimate of the population variance based on the variance among the sample
means) / (second estimate of the population variance based on the variances within the
samples)
F = between column variance / within column variance
= 20/14.769 = 1.354 (this is the F ratio)
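The grand mean and the between column variance can be reproduced from the sample sizes (5, 5, 6) and sample means (17, 21, 19) given in the slides. The sketch below assumes the usual formula, the sum of nj(xbar j - grand mean)^2 divided by (k - 1), which does reproduce the value 20 used above. The within column variance 14.769 is taken from the slides as given, because the raw observations are not reproduced here.

```python
sizes = [5, 5, 6]      # sample sizes for the three training methods
means = [17, 21, 19]   # sample means for the three training methods

n_total = sum(sizes)
# Grand mean: sample means weighted by relative sample size (304/16).
grand_mean = sum(n * m for n, m in zip(sizes, means)) / n_total

# Between column variance: weighted squared deviations over (k - 1).
between = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means)) / (len(sizes) - 1)

within = 14.769        # within column variance quoted in the slides
f_ratio = between / within
print(grand_mean, between, round(f_ratio, 3))  # 19.0 20.0 1.354
```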
Interpreting the F ratio
ANOVA
The F distribution
ANOVA
Using the F distribution
Number of degrees of freedom in the numerator of the F ratio = (number of samples - 1)
Number of degrees of freedom in the denominator of the F ratio = Σ(nj - 1) = nT - k
Using the F table
Testing the hypotheses
F = between column variance / within column variance
= 20/14.769 = 1.354 (this is the F ratio)
ANOVA
Using the F distribution
Number of degrees of freedom in the numerator of the F ratio = (3 - 1) = 2
Number of degrees of freedom in the denominator of the F ratio = (5 - 1) + (5 - 1) + (6 - 1)
= 13
Suppose we want to test at the .05 level the hypothesis that there is no difference among
the three training methods.
ANOVA
Using the F distribution
From the F table, the value we get is 3.81. This value of 3.81 sets the upper limit of the
acceptance region. As the calculated sample value for F of 1.354 lies within the acceptance
region, we would accept the null hypothesis.
ANOVA
Precautions in using the F Test
- Use large sample sizes
- Control all factors except the one being studied
- A test for one factor
ANOVA - Example
The manufacturer of a tape recorder decides to include four alkaline batteries along with
their product. Two battery suppliers are considered, brand 1 and brand 2. The supervisor
wants to know if the average lifetimes of the two brands are the same. Each of ten batteries
is connected to a test device that places a small drain on the battery power and records the
battery lifetime. The following data (in hours) was collected.
Brand 1: 43, 48, 38, 41, 51
Brand 2: 30, 26, 37, 31, 34
ANOVA - Example
Within Sample Variation
x1bar = 44.2 (s1 = 5.26)
x2bar = 31.6 (s2 = 4.16)
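The sample means and standard deviations quoted for the two battery brands can be checked from the raw lifetimes with Python's standard statistics module. This is only a verification sketch; the variable names are illustrative.

```python
from statistics import mean, stdev

brand_1 = [43, 48, 38, 41, 51]   # lifetimes in hours
brand_2 = [30, 26, 37, 31, 34]

# stdev() is the sample standard deviation (divides by n - 1).
summary = {
    "Brand 1": (round(mean(brand_1), 1), round(stdev(brand_1), 2)),
    "Brand 2": (round(mean(brand_2), 1), round(stdev(brand_2), 2)),
}
for name, (xbar, s) in summary.items():
    print(name, xbar, s)
```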
Between Sample Variation
x1bar is larger than x2bar
Measuring Variation
SS (total) = SS (between) + SS (within)
= SS (factor) + SS (error)
Deriving the Sum of Squares
SS(factor) = T1^2/n1 + T2^2/n2 + ... + Tk^2/nk - T^2/n
ANOVA - Example (contd)
The ANOVA Table
Source   df      SS           MS           F
Factor   k - 1   SS(factor)   MS(factor)   MS(factor)/MS(error)
Error    n - k   SS(error)    MS(error)
Total    n - 1   SS(total)
MS(factor) = SS(factor)/(k - 1)
MS(error) = SS(error)/(n - k)
F = MS(factor)/MS(error)
Example
Since alpha = .05, F .05, 3, 20 = 3.10.
As F* = 10.75 > 3.10, we reject Ho.
Tukey's Multiple Comparison Test
Q = [maximum (Xbar i) - minimum (Xbar i)] / sqrt(MS(error) / nr)
where
1. Maximum Xbar i and minimum Xbar i are the largest and smallest sample means
2. MS(error) is the pooled sample variance
3. nr is the number of replicates in each sample
Example
There is no evidence of a difference between the brand 1 and the brand 2 populations or
between the brand 3 and the brand 4 populations
Sample Problems - Chapter 11.pdf
Sample Problems
1)
A start-up cell phone applications company is interested in determining whether household
incomes are different for subscribers to 3 different service providers. A random sample of
25 subscribers to each of the 3 service providers was taken, and the annual household income
for each subscriber was recorded. The partially completed ANOVA table for the analysis is
shown here:
ANOVA
Source of Variation   SS              df   MS   F
Between Groups        2,949,085,157
Within Groups
Total                 9,271,678,090
a. Complete the ANOVA table by filling in the missing sums of squares, the degrees of
freedom for each source, the mean square, and the calculated F-test statistic.
b. Based on the sample results, can the start-up firm conclude that there is a difference in
household incomes for subscribers to the 3 service providers? You may assume normal
distributions and equal variances. Conduct your test at the α=0.10 level of significance.
Be sure to state a critical F-statistic, a decision rule, and a conclusion.
a. The calculations for the completed ANOVA table below are:
Between groups df = k-1, where k is the number of service providers = 3-1 = 2
Within groups df = nt - k, where nt = 25 subscribers * 3 service providers = 75;
75 - 3 = 72
SSW = SST-SSB = 9,271,678,090 - 2,949,085,157 = 6,322,592,933
MSB = 2,949,085,157/2 = 1,474,542,579
MSW = 6,322,592,933/72 = 87,813,791
F = 1,474,542,579/87,813,791 = 16.79
ANOVA
Source of Variation   SS              df   MS              F
Between Groups        2,949,085,157   2    1,474,542,579   16.79
Within Groups         6,322,592,933   72   87,813,791
Total                 9,271,678,090   74
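The table completion above follows a fixed recipe, which the sketch below mirrors for a one-way design. The function name complete_one_way_table and its dictionary keys are hypothetical; the inputs are the SSB, SST, k and nt values from this problem.

```python
def complete_one_way_table(ssb, sst, k, n_total):
    """Fill in a one-way ANOVA table from SSB, SST, k groups and n_total observations."""
    ssw = sst - ssb                      # within-groups sum of squares
    df_b, df_w = k - 1, n_total - k      # degrees of freedom
    msb, msw = ssb / df_b, ssw / df_w    # mean squares
    return {"SSW": ssw, "dfB": df_b, "dfW": df_w,
            "dfT": n_total - 1, "MSB": msb, "MSW": msw, "F": msb / msw}


table = complete_one_way_table(ssb=2_949_085_157, sst=9_271_678_090, k=3, n_total=75)
print(table["SSW"], table["dfB"], table["dfW"], table["dfT"])
print(round(table["F"], 2))  # 16.79
```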
b.
Ho: µ1 = µ2 = µ3
HA: Not all populations have the same mean
F = MSB/MSW = 1,474,542,579/87,813,791 = 16.79
Because the F test statistic = 16.79 > Fα = 2.3778, we reject the null hypothesis based on
these sample data.
2)
An analyst is interested in testing whether 4 populations have equal means. The following
sample data have been collected from populations that are assumed to be normally
distributed with equal variances:
Sample 1   Sample 2   Sample 3   Sample 4
9          12         8          17
6          16         8          15
11         16         12         17
14         12         7          16
14         9          10         13
Conduct the appropriate test using a significance level equal to 0.05.
H0: µ1 = µ2 = µ3 = µ4
HA: not all µj are equal
The means for each sample are: x1bar = 10.8, x2bar = 13.0, x3bar = 9.0, x4bar = 15.6
The grand mean is 12.1
The F critical value from the F-distribution for α = 0.05 and with 3 and 16 degrees of
freedom is 3.239. Thus, the decision rule is:
If the test statistic F > 3.239, reject the null hypothesis, otherwise do not reject.
The samples are independent and the data level is ratio. Because the sample sizes are equal,
the ANOVA test is robust to the normality and equal variance assumptions. Hartley's F max
test can be used to check the equal variance assumption. The samples are too small to check
the normality assumption.
H0: σ1^2 = σ2^2 = σ3^2 = σ4^2
HA: not all population variances are equal
The sample variances are: s1^2 = 11.7, s2^2 = 9.0, s3^2 = 4.0, s4^2 = 2.8
The test statistic for the F max test is:
F max = s^2 max / s^2 min = 11.7 / 2.8 = 4.18
The critical value from the F max table with c = 4 and v = 4 degrees of freedom for
α = 0.05 is 20.6.
Because F = 4.18 < 20.6, do not reject the null hypothesis.
Thus, there is no evidence to suggest that population variances are not equal.
The required calculations can be computed manually or by using Excel or Minitab. We get
MSB = 40.6, MSW = 6.875, and F = 40.6/6.875 = 5.905.
Since the test statistic = F = 5.905 > 3.239, reject the null hypothesis.
Also, using the p-value approach, because p-value = 0.0065 < 0.05, we reject the null
hypothesis.
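For readers without Excel or Minitab, the F statistic and the F max statistic for this problem can be reproduced from the raw data with a short sketch. The function name one_way_anova is hypothetical, and the critical values (3.239 and 20.6) still come from the tables cited above.

```python
from statistics import mean, variance

samples = [
    [9, 6, 11, 14, 14],     # Sample 1
    [12, 16, 16, 12, 9],    # Sample 2
    [8, 8, 12, 7, 10],      # Sample 3
    [17, 15, 17, 16, 13],   # Sample 4
]


def one_way_anova(groups):
    """Return (SSB, SSW, F) for a one-way ANOVA on a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n_total
    ssb = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ssw = sum((len(g) - 1) * variance(g) for g in groups)
    return ssb, ssw, (ssb / (k - 1)) / (ssw / (n_total - k))


ssb, ssw, f_stat = one_way_anova(samples)
variances = [variance(g) for g in samples]
f_max = max(variances) / min(variances)   # Hartley's F max statistic
print(round(f_stat, 3), round(f_max, 2))  # 5.905 4.18
```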
3)
A manager is interested in testing whether three populations of interest have equal
population means. Simple random samples of size 10 were selected from each population. The
following ANOVA table and related statistics were computed:
ANOVA: Single Factor
Summary
Groups     Count   Sum      Average   Variance
Sample 1   10      507.18   50.72     35.06
Sample 2   10      405.79   40.58     30.08
Sample 3   10      487.64   48.76     23.13
ANOVA
Source           SS        df   MS       F      p-value   F-crit
Between Groups   578.78    2    289.39   9.84   0.0006    3.354
Within Groups    794.36    27   29.42
Total            1373.14   29
a. State the appropriate null and alternative hypotheses.
b. Conduct the appropriate test of the null hypothesis assuming that the populations have
equal variances and the populations are normally distributed. Use a 0.05 level of
significance.
c. If warranted, use the Tukey-Kramer procedure for the multiple comparisons to determine
which populations have different means. (Assume α=0.05.)
a. The appropriate null and alternative hypotheses are:
H0: µ1 = µ2 = µ3
HA: not all µj are equal
b. The one-way ANOVA test is appropriate for testing the null and alternative hypotheses.
All the information needed is supplied in the table.
Using the F test approach, because F = 9.84 > critical F = 3.35, we reject the null
hypothesis and conclude that the population means are not all equal.
Using the p-value approach, because p-value = 0.0006 < 0.05, we reject the null hypothesis
and conclude that the population means are not all equal.
c. Because the null hypothesis has been rejected and we conclude that not all population
means are equal, we can now apply the Tukey-Kramer method to determine which means are
different. We start by calculating the Tukey-Kramer critical range value using:
Critical Range = q(1-α) * sqrt[(MSW/2)(1/ni + 1/nj)]
The value for q0.95 with k = 3 and nT - k = 30-3 = 27 degrees of freedom is found in
Appendix J as approximately 3.50. Because the sample sizes are equal in this situation, we
need only compute one critical range value shown as follows:
Critical Range = 3.50 * sqrt[(29.42/2)(1/10 + 1/10)] = 6.0
We now compare all the possible contrasts of differences between sample means to the
Tukey-Kramer critical range value.
Contrast                                          Significant?
|x1bar - x2bar| = |50.72 - 40.58| = 10.14 > 6.0   Yes
|x1bar - x3bar| = |50.72 - 48.76| = 1.96 < 6.0    No
|x2bar - x3bar| = |40.58 - 48.76| = 8.18 > 6.0    Yes
Thus, we conclude µ1 > µ2 and µ3 > µ2. Thus, the mean for population 2 is less than the
means for the other two populations. However, the sample data do not provide sufficient
evidence to conclude that the means for populations one and three are different.
4)
Respond to each of the following questions using this partially completed one-way ANOVA
table:
Source of Variation   SS     df    MS   F-ratio
Between Samples       1745
Within Samples               240
Total                 6504   246
a. How many different populations are being considered in this analysis?
b. Fill in the ANOVA table with the missing values.
c. State the appropriate null and alternative hypotheses.
d. Based on the analysis of variance F-test, what conclusion should be reached regarding
the null hypothesis? Test using a significance level of 0.01.
a. dfB + dfW = dfT, so dfB = dfT - dfW = 246 - 240 = 6 = k - 1, and k = 7 = number of
populations.
b.
Source            SS      df    MS        F
Between Samples   1,745   6     290.833   14.667
Within Samples    4,759   240   19.829
Total             6,504   246
c. H0: μ1 = μ2 = μ3 = μ4 = μ5 = μ6 = μ7
HA: At least two population means are different
d.
F critical = 2.8778 (Minitab); from text table use F6,200 = 2.893
Since 14.667 > 2.8778 reject Ho and conclude that at least two population means are
different.
5)
Respond to each of the following questions using this partially completed one-way ANOVA
table:
Source of Variation   SS    df   MS   F-ratio
Between Samples             3
Within Samples        405
Total                 888   31
a. How many different populations are being considered in this analysis?
b. Fill in the ANOVA table with the missing values.
c. State the appropriate null and alternative hypotheses.
d. Based on the analysis of variance F-test, what conclusion should be reached regarding
the null hypothesis? Test using α=0.05.
a. dfB = 3 = k - 1, so k = 4 = number of populations.
b.
Source            SS    df   MS       F
Between Samples   483   3    161      11.1309
Within Samples    405   28   14.464
Total             888   31
c. H0: μ1 = μ2 = μ3 = μ4
HA: At least two population means are different
d.
F critical = 2.9467 (Minitab); from text table use F3, 24 = 3.009
Since 11.1309 > 2.9467 reject Ho and conclude that at least two population means are
different.
6)
Given the following sample data:
Item   Group 1   Group 2   Group 3   Group 4
1      20.9      28.2      17.8      21.2
2      27.2      26.2      15.9      23.9
3      26.6      21.6      18.4      19.5
4      22.1      29.7      20.2      17.4
5      25.3      30.3      14.1
6      30.1      25.9
7      23.8
a. Based on the computation for the within and between sample variation, develop the ANOVA
table and test the appropriate null hypothesis using α=0.05. Use the p-value approach.
b. If warranted, use the Tukey-Kramer procedure to determine which populations have
different means. Use α =0.05.
Anova: Single Factor
SUMMARY
Groups    Count   Sum     Average    Variance
Group 1   7       176     25.14286   10.00286
Group 2   6       161.9   26.98333   10.12567
Group 3   5       86.4    17.28      5.517
Group 4   4       82      20.5       7.553333
ANOVA
Source of Variation   SS         df   MS         F          P-value    F crit
Between Groups        315.9324   3    105.3108   12.20025   0.000136   3.159911
Within Groups         155.3735   18   8.63186
Total                 471.3059   21
a.
H0: μ1 = μ2 = μ3 = μ4
HA: At least two population means are different
Since p-value = 0.000136 < 0.05 reject H0 and conclude that at least two population means
are different.
b. Tukey-Kramer Critical Range = 4.00 * sqrt[(8.632/2)(1/ni + 1/nj)]
Pair     Critical Range   Difference in Means   Significant?
1 vs 2   4.72             1.84                  No
1 vs 3   5.137            7.863                 Yes
1 vs 4   5.475            4.643                 No
2 vs 3   4.968            9.703                 Yes
2 vs 4   5.318            6.483                 Yes
3 vs 4   5.691            3.22                  No
7)
Examine the 3 samples obtained independently from 3 populations:
Item   Group 1   Group 2   Group 3
1      14        17        17
2      13        16        14
3      12        16        15
4      15        18        16
5      16                  14
6                          16
a. Conduct a one-way analysis of variance on the data. Use alpha=0.05.
b. If warranted, use the Tukey-Kramer procedure to determine which populations have
different means. Use an experiment-wide error rate of 0.05.
a.
H0: µ1 = µ2 = µ3
HA: At least two population means are different.
Because k - 1 = 2 and nT - k = 12, F0.05 = 3.885. The decision rule is: if the calculated
F > F0.05 = 3.885, reject HO, or if the p-value < α = 0.05, reject HO.
If we assume that the populations are normally distributed, Hartley's Fmax test can be used
to test whether the three populations have equal variances. The sample variances are
s1^2 = Σ(x - xbar)^2 / (n - 1) = 10 / (5 - 1) = 2.50, s2^2 = 0.916, and s3^2 = 1.467. The
test statistic is Fmax = s^2 max / s^2 min = 2.50 / 0.916 = 2.729. From Appendix I, the
critical value for alpha = 0.05, k = 3, and n - 1 = 4 is 15.5. Because 2.729 < 15.5, we
conclude that the population variances could be equal.
Since F = 5.03 > 3.885, we reject HO.
We conclude there is sufficient evidence to indicate that at least two of the population
means differ.
b. Use the Tukey-Kramer test to determine which populations have different means.
To construct the critical ranges:
Critical range = q * sqrt[(MSW/2)(1/ni + 1/nj)]
For n1 = 5 and n2 = 4, critical range = 3.77 * sqrt[(1.67/2)(1/5 + 1/4)] = 2.311;
for n1 = 5 and n3 = 6, critical range = 3.77 * sqrt[(1.67/2)(1/5 + 1/6)] = 2.086;
and for n2 = 4 and n3 = 6, critical range = 3.77 * sqrt[(1.67/2)(1/4 + 1/6)] = 2.224
The contrasts are |x1bar - x2bar| = 2.75 > 2.311,
|x1bar - x3bar| = 1.33 < 2.086, and
|x2bar - x3bar| = 1.42 < 2.224
Therefore, we can infer that population 1 and population 2 have different means. However,
no other differences are supported by these sample data.
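With unequal sample sizes, each pair of groups gets its own Tukey-Kramer critical range. The sketch below recomputes the three ranges and contrasts for this problem; the function name critical_range is hypothetical, and q = 3.77 and MSW = 1.67 are the values quoted in the solution (q is a studentized range table look-up, not computed here).

```python
import math
from itertools import combinations
from statistics import mean

groups = {
    1: [14, 13, 12, 15, 16],
    2: [17, 16, 16, 18],
    3: [17, 14, 15, 16, 14, 16],
}
Q, MSW = 3.77, 1.67   # quoted in the solution above


def critical_range(q, msw, n_i, n_j):
    """Tukey-Kramer critical range for two groups of sizes n_i and n_j."""
    return q * math.sqrt((msw / 2) * (1 / n_i + 1 / n_j))


results = {}
for i, j in combinations(groups, 2):
    cr = critical_range(Q, MSW, len(groups[i]), len(groups[j]))
    diff = abs(mean(groups[i]) - mean(groups[j]))
    # Store (absolute difference, critical range, significant?)
    results[(i, j)] = (round(diff, 2), round(cr, 3), diff > cr)
    print(i, "vs", j, *results[(i, j)])
```

Only the 1 vs 2 contrast exceeds its critical range, which matches the conclusion stated above.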
8)
A study was conducted to determine if differences in new textbook prices exist between
on-campus bookstores, off-campus bookstores, and Internet bookstores. To control for
differences in textbook prices that might exist across disciplines, the study randomly
selected 12 textbooks and recorded the price of each of the 12 books at each of the 3
retailers. You may assume normality and equal-variance assumptions have been met. The
partially completed ANOVA table based on the study's findings is shown here:
ANOVA
Source of Variation   SS        df   MS   F
Textbooks             16624
Retailer              2.4
Error
Total                 17477.6
a. Complete the ANOVA table by filling in the missing sum of squares, the degrees of freedom
for each source, the mean square, and the calculated F-test statistic for each possible
hypothesis test.
b. Based on the study's findings, was it correct to block for differences in textbooks?
Conduct the appropriate test at the α=0.10 level of significance.
c. Based on the study's findings, can it be concluded that there is a difference in the
average price of textbooks across the 3 retail outlets? Conduct the appropriate hypothesis
test at the α=0.10 level of significance.
a. The calculations for the completed ANOVA table below are:
Textbooks (blocks) df = b-1 = 12-1 = 11
Retailer df = k-1 = 3-1 = 2
Error df = (k-1)(b-1) = 11(2) = 22
Total df = nt - 1, where nt = (12 textbooks) * (3 retailers) = 36; 36 - 1 = 35
SSW (error) = SST-SSBL-SSB = 17,477.6 - 16,624 - 2.4 = 851.2
MSBL (Textbooks) = 16,624/11 = 1,511.3
MSB (Retailer) = 2.4/2 = 1.2
MSW (error) = 851.2/22 = 38.7
F (textbooks) = 1,511.3/38.7 = 39.05
F (Retailer) = 1.2/38.7 = 0.031
ANOVA
Source of Variation   SS        df   MS       F
Textbooks             16,624    11   1511.3   39.05
Retailer              2.4       2    1.2      0.031
Error                 851.2     22   38.7
Total                 17477.6   35
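The randomized block table can be completed with the same kind of recipe as a one-way table, with one extra source for blocks. In this sketch the function name complete_block_table and its dictionary keys are hypothetical. Because it does not round the mean squares before dividing, the block F comes out near 39.06 rather than the 39.05 obtained above from rounded mean squares.

```python
def complete_block_table(ss_blocks, ss_treat, ss_total, b, k):
    """Fill in a randomized block ANOVA table for b blocks and k treatments."""
    ss_error = ss_total - ss_blocks - ss_treat
    df_blocks, df_treat = b - 1, k - 1
    df_error = df_blocks * df_treat
    ms_blocks = ss_blocks / df_blocks
    ms_treat = ss_treat / df_treat
    ms_error = ss_error / df_error
    return {
        "SSE": ss_error, "df_blocks": df_blocks, "df_treat": df_treat,
        "df_error": df_error, "df_total": b * k - 1,
        "F_blocks": ms_blocks / ms_error, "F_treat": ms_treat / ms_error,
    }


# 12 textbooks (blocks) priced at 3 retailers (treatments).
table = complete_block_table(ss_blocks=16624, ss_treat=2.4, ss_total=17477.6, b=12, k=3)
print(round(table["SSE"], 1), table["df_error"], table["df_total"])
print(round(table["F_blocks"], 2), round(table["F_treat"], 3))
```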
b.
Test to determine whether blocking is effective. Twelve textbooks were used to evaluate the
prices at the three types of retail outlets. These constitute the blocks. The null and
alternative hypotheses are:
Ho: µ1 = µ2 = µ3 = … = µ12
HA: Not all block means are equal.
As shown in the ANOVA table from part (a), the F test statistic for this hypothesis test is
the F for blocks (textbooks) = 39.05.
Using Excel's FINV function with α = 0.10 and 11 and 22 degrees of freedom, Fα=0.10 = 1.88.
Since F = 39.05 > Fα=0.10 = 1.88, reject the null hypothesis. This means that based on these
sample data we can conclude that blocking is effective.
c. We have three types of retail outlets (on-campus, off-campus, and Internet). The
appropriate null and alternative hypotheses are:
Ho: µOn = µOff = µI
HA: Not all populations have the same mean
As shown in the ANOVA table from part (a), the F test statistic for this null hypothesis is
0.031.
Using Excel's FINV function with α = 0.10 and 2 and 22 degrees of freedom, Fα=0.10 = 2.56.
Since F = 0.031 < Fα=0.10 = 2.56, do not reject the null hypothesis. Thus, based on these
sample data we cannot conclude that there is a difference in textbook prices at the three
different types of retail outlets.
9)
The following data were collected for a randomized block analysis of variance design with 4
populations and 8 blocks.
          Group 1   Group 2   Group 3   Group 4
Block 1   56        44        57        84
Block 2   34        30        38        50
Block 3   50        41        48        52
Block 4   19        17        21        30
Block 5   33        30        35        38
Block 6   74        72        78        79
Block 7   33        24        27        33
Block 8   56        44        56        71
a. State the appropriate null and alternative hypotheses for the treatments and determine
whether blocking is necessary.
b. Construct the appropriate ANOVA table.
c. Using a significance level equal to 0.05, can you conclude that blocking was necessary in
this case? Use a test-statistic approach.
d. Based on the data and a significance level equal to 0.05, is there a difference in
population means for the 4 groups? Use a p-value approach.
e. If you found that a difference exists in part d, use the LSD approach to determine which
populations have different means.
a. H0: μ1 = μ2 = μ3 = μ4
HA: At least two population means are different
H0: μb1 = μb2 = μb3 = μb4 = μb5 = μb6 = μb7 = μb8
HA: Not all block means are equal
b.
ANOVA
Source of Variation  SS        df  MS        F         P-value   F crit
Blocks               9123.375   7  1303.339  46.87669  2.08E-11  2.487582
Groups               1158.625   3  386.2083  13.8906   3.26E-05  3.072472
Error                583.875   21  27.80357
Total                10865.88  31
c.
Since F = 46.87669 > F crit = 2.487582, reject H0 and conclude that there is an indication that
blocking was necessary.
d.
Since p‐value = 0.0000326 < 0.05, reject H0 and conclude that at least two means are different.
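As a cross-check on the ANOVA table in part b, the sums of squares can be recomputed from the raw data. The following is a minimal Python sketch, not part of the original solution: the function name `randomized_block_anova` and the dictionary keys are illustrative choices. Critical values and p-values are not computed here; they would still come from an F table or from software such as Excel's FINV.

```python
# Randomized block ANOVA for the Problem 9 data.
# Rows are blocks, columns are groups (treatments).
data = [
    [56, 44, 57, 84],
    [34, 30, 38, 50],
    [50, 41, 48, 52],
    [19, 17, 21, 30],
    [33, 30, 35, 38],
    [74, 72, 78, 79],
    [33, 24, 27, 33],
    [56, 44, 56, 71],
]


def randomized_block_anova(rows):
    """Return sums of squares, mean squares and F ratios (hypothetical helper)."""
    b = len(rows)        # number of blocks
    k = len(rows[0])     # number of groups
    grand = sum(sum(r) for r in rows) / (b * k)
    block_means = [sum(r) / k for r in rows]
    group_means = [sum(r[j] for r in rows) / b for j in range(k)]

    ss_total = sum((x - grand) ** 2 for r in rows for x in r)
    ss_blocks = k * sum((m - grand) ** 2 for m in block_means)
    ss_groups = b * sum((m - grand) ** 2 for m in group_means)
    ss_error = ss_total - ss_blocks - ss_groups

    ms_blocks = ss_blocks / (b - 1)
    ms_groups = ss_groups / (k - 1)
    ms_error = ss_error / ((b - 1) * (k - 1))
    return {
        "ss_blocks": ss_blocks,
        "ss_groups": ss_groups,
        "ss_error": ss_error,
        "ss_total": ss_total,
        "ms_error": ms_error,
        "f_blocks": ms_blocks / ms_error,
        "f_groups": ms_groups / ms_error,
    }


result = randomized_block_anova(data)
for name, value in result.items():
    print(f"{name}: {value:.5f}")
```

The F ratios can then be compared with the critical values 2.487582 (blocks) and 3.072472 (groups) listed in the table.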
e.
Least Significant Difference (LSD) = 5.48280844
Comparison  Mean Difference  Absolute Mean Difference  Significant
G1 v. G2      6.625            6.625                   YES
G1 v. G3     -0.625            0.625                   NO
G1 v. G4    -10.25            10.25                    YES
G2 v. G3     -7.25             7.25                    YES
G2 v. G4    -16.875           16.875                   YES
G3 v. G4     -9.625            9.625                   YES
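The LSD comparison in part e can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the original workbook: the group means are computed from the Problem 9 data table, MSW = 27.80357 is taken from the ANOVA table, and the t value 2.0796 (α/2 = 0.025, 21 degrees of freedom) is hard-coded from the t table instead of being computed. The variable names are illustrative.

```python
from itertools import combinations
from math import sqrt

# Group means computed from the Problem 9 data (column averages over 8 blocks).
group_means = {"G1": 44.375, "G2": 37.75, "G3": 45.0, "G4": 54.625}
ms_error = 27.80357   # MSW from the ANOVA table
n_blocks = 8          # b, the number of observations behind each group mean
t_crit = 2.0796       # assumed table value for alpha/2 = 0.025 and 21 df

# LSD = t * sqrt(MSW * 2 / b)
lsd = t_crit * sqrt(ms_error * 2 / n_blocks)

comparisons = {}
for (name_a, mean_a), (name_b, mean_b) in combinations(group_means.items(), 2):
    diff = mean_a - mean_b
    comparisons[f"{name_a} v. {name_b}"] = (diff, abs(diff) > lsd)

print(f"LSD = {lsd:.4f}")
for pair, (diff, significant) in comparisons.items():
    print(f"{pair}: {diff:8.3f}  {'YES' if significant else 'NO'}")
```

Because the t value is rounded to four decimals, the LSD printed here agrees with 5.48280844 only to about four significant digits.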
10)
The following ANOVA table and accompanying information are the result of a randomized block
ANOVA test.
Source of Variation  SS      df  MS       F      P-value  F crit
Rows                 199899   7  28557.0  112.8  0.0000   2.488
Columns              11884    3  3961.3   15.7   0.0000   3.073
Error                5317    21  253.2
Total                217100  31
a. How many blocks were used in this study?
b. How many populations are involved in this test?
c. Test to determine if blocking is effective using an alpha level equal to 0.05.
d. Test the main hypothesis of interest using α = 0.05.
e. If warranted, conduct an LSD test with α = 0.05 to determine which population means are
different.
a. There were 8 blocks used (the Rows degrees of freedom are b – 1 = 7).
b. There were 4 populations involved in the study (the Columns degrees of freedom are k – 1 = 3).
c. The hypothesis to be tested is:
H0: μb1 = μb2 = … = μb8 (blocking is not effective)
HA: Not all block means are equal (blocking is effective)
The hypothesis is tested by computing the test statistic F ratio as follows:
F = MSBL/MSW = 28,557/253.19 = 112.79
The critical F value from the F‐distribution for α = 0.05 and D1 = 7 and D2 = 21 degrees of
freedom is approximately 2.5. Therefore the decision rule is:
If test statistic F > critical F = 2.5, reject the null hypothesis
Otherwise, do not reject the null hypothesis
Because F = 112.79 > critical F = 2.5, we reject the null hypothesis and conclude that blocking is effective.
d. The main hypothesis is:
H0: μ1 = μ2 = μ3 = μ4
HA: At least two population means are different
This hypothesis is tested using an F test with the test statistic computed as follows:
F = MSB/MSW = 3,961.33/253.19 = 15.65
The critical F value from the F‐distribution for alpha = 0.05 and D1 = 3 and D2 = 21 degrees of
freedom is approximately 3.0. Therefore the decision rule is:
If test statistic F > critical F = 3.0, reject the null hypothesis
Otherwise, do not reject the null hypothesis
Because F = 15.65 > critical F = 3.0, we reject the null hypothesis and conclude that the four
populations do not have the same mean.
e. Because the primary null hypothesis was rejected in part d, we can now use the Least Significant
Difference test to determine which populations have different means.
We can use the following steps to do this:
Step 1: Compute the LSD statistic.
The LSD statistic is computed as:
LSD = t(α/2) · sqrt(MSW · 2/b) = 2.0796 · sqrt(253.19 · 2/8) = 16.55
Note, the t value for α/2 = 0.025 and (k‐1)(b‐1) = (3)(7) = 21 degrees of freedom from the
t‐distribution table is 2.0796.
Step 2: Compute the sample means from each population.
The sample means are provided in the table (not reproduced here).
Step 3: Form all possible contrasts by finding the absolute differences between all pairs of sample means.
Compare these to the LSD statistic.
Absolute Difference  Comparison  Conclusion
14.5                 < 16.55     No significant difference
35                   > 16.55     Means differ
17.5                 > 16.55     Means differ
20.5                 > 16.55     Means differ
32                   > 16.55     Means differ
52.5                 > 16.55     Means differ
11)
The following sample data were recently collected in the course of conducting a randomized block
analysis of variance. Based on these sample data, what conclusions should be reached about
blocking effectiveness and about the means of the 3 populations involved? Test using a significance
level equal to 0.05.
Block  Sample 1  Sample 2  Sample 3
1      30        40        40
2      50        70        50
3      60        40        70
4      40        40        30
5      80        70        90
6      20        10        10
The sample data are those given in the problem statement.
The following sums of squares values are computed:
SST = 9,000 SSB = 33.33 SSBL = 7,866.67 SSW = 1,100
The completed ANOVA table is:
Source           SS        df  MS        F
Between Blocks   7,866.67   5  1,573.33  14.3
Between Samples  33.33      2  16.67     0.1515
Within           1,100     10  110
Total            9,000     17
The hypothesis to be tested is:
H0: μb1 = μb2 = … = μb6 (blocking is not effective)
HA: Not all block means are equal (blocking is effective)
The hypothesis is tested by computing the test statistic F ratio as follows:
F = MSBL/MSW = 1,573.33/110 = 14.3
The critical F value from the F‐distribution for alpha = 0.05 and D1 = 5 and D2 = 10 degrees of
freedom is 3.326. Therefore the decision rule is:
If test statistic F > critical F = 3.326, reject the null hypothesis
Otherwise, do not reject the null hypothesis
Because F = 14.3 > critical F = 3.326, we reject the null hypothesis and conclude that blocking is effective.
Recall that the main hypothesis is:
H0: μ1 = μ2 = μ3
HA: Not all populations have the same mean
This hypothesis is tested using an F test with the test statistic computed as follows:
F = MSB/MSW = 16.67/110 = 0.1515
The critical F value from the F‐distribution for alpha = 0.05 and D1 = 2 and D2 = 10 degrees of
freedom is 4.103. Therefore the decision rule is:
If test statistic F > critical F = 4.103, reject the null hypothesis
Otherwise, do not reject the null hypothesis
Because F = 0.1515 < critical F = 4.103, we do not reject the null hypothesis and conclude that the
three populations may have the same mean value.
12)
A randomized complete block design is carried out, resulting in the following statistics:
Source          x̅1      x̅2      x̅3      x̅4
Primary Factor  237.15  315.15  414.01  612.52
Block           363.57  382.22  438.33
SST = 364428
a. Determine if blocking was effective for this design.
b. Using a significance level of 0.05, produce the relevant ANOVA and determine if the average
responses of the factor levels are equal to each other.
c. If you discovered that there were differences among the average responses of the factor levels,
use the LSD approach to determine which populations have different means.
a.
HO: μb1 = μb2 = μb3
HA: at least two blocks have different means
SST = 364,428
The grand mean is the weighted average of the sample means, Σ ni x̅i / nT. Using it, the sums of
squared deviations of the factor-level means and of the block means give SSB = 236905.90 and
SSBl = 12113.60.
SSW = SST – (SSB + SSBl) = 364428 – (236905.90 + 12113.60) = 115408.50.
MSB = SSB/(k – 1) = 236905.90/(4 – 1) = 78968.63
MSBl = SSBl/(b – 1) = 12113.60/2 = 6056.8
MSW = SSW/[(k – 1)(b – 1)] = 115408.50/[(3)(2)] = 19234.75
                 SS         df  MS        F-ratio
Between Blocks   12113.60    2  6056.8    0.3149
Between Samples  236905.90   3  78968.63  4.1055
Within Samples   115408.50   6  19234.75
Total            364428     11
F = MSBl/MSW = 6056.8/19234.75 = 0.3149. F0.05 = 5.143.
Since F = 0.3149 < F0.05 = 5.143, fail to reject HO. Conclude that blocking was not effective for
this design.
b.
Conduct the main hypothesis test to determine whether the treatments have equal means.
HO: μ1 = μ2 = μ3 = μ4
HA: at least two factor levels have different means
F = MSB/MSW = 78968.63/19234.75 = 4.1055. F0.05 = 4.757.
Since F = 4.1055 < F0.05 = 4.757, do not reject HO. Conclude that the average responses associated
with the factor’s levels may be equal to each other.
c. Since we did not reject the null hypothesis in part b, this part is not necessary.
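Because Problem 12 supplies only the factor-level means, the block means, and SST, the sums of squares have to be rebuilt from those summaries. The sketch below is one way to do that in Python; it is an illustration, not the original solution's worksheet. It assumes a balanced design (every factor level appears once in each block) and takes the grand mean as the average of the four factor-level means. The variable names are illustrative.

```python
# Summary statistics given in Problem 12.
factor_means = [237.15, 315.15, 414.01, 612.52]
block_means = [363.57, 382.22, 438.33]
sst = 364428.0

k = len(factor_means)   # number of factor levels
b = len(block_means)    # number of blocks

# Balanced design assumed: grand mean is the plain average of the level means.
grand = sum(factor_means) / k

# Each level mean is based on b observations, each block mean on k observations.
ssb = b * sum((m - grand) ** 2 for m in factor_means)
ssbl = k * sum((m - grand) ** 2 for m in block_means)
ssw = sst - (ssb + ssbl)

msb = ssb / (k - 1)
msbl = ssbl / (b - 1)
msw = ssw / ((k - 1) * (b - 1))

f_blocks = msbl / msw
f_factor = msb / msw

print(f"SSB = {ssb:.2f}, SSBl = {ssbl:.2f}, SSW = {ssw:.2f}")
print(f"F (blocks) = {f_blocks:.4f}, F (factor) = {f_factor:.4f}")
```

The two F ratios are then compared with the tabled critical values 5.143 (2 and 6 degrees of freedom) and 4.757 (3 and 6 degrees of freedom).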
13)
Consider the following data from a two‐factor experiment:
           Factor A
Factor B   Level 1  Level 2  Level 3
Level 1    43       25       37
           49       26       45
Level 2    50       27       46
           53       31       48
a. Determine if there is interaction between factor A and factor B. Use the p‐value approach and a
significance level of 0.05.
b. Does the average response vary among the levels of factor A? Use the test‐statistic approach and
a significance level of 0.05.
c. Determine if there are differences in the average response between the levels of factor B. Use the
p‐value approach and a significance level of 0.05.
a. HO: interaction between Factor A and Factor B does not exist
HA: the interaction terms have different response averages.
P‐value = 0.854 > α = 0.05. Therefore, fail to reject HO. Conclude that there is not sufficient
evidence to determine that there is interaction between Factor A and Factor B.
b. HO: all levels of Factor A have the same mean response
HA: at least two levels have different mean response
F = MSA/MSW = 47.10. F0.05 = 5.143. Since F = 47.10 > F0.05 = 5.143, reject HO. Conclude that there is
sufficient evidence to indicate that at least two of the Factor A response variable averages differ.
c. HO: the two levels of Factor B have the same response average
HA: the two levels of Factor B have different response averages.
P‐value = 0.039 < α = 0.05. Therefore, reject HO. Conclude that there is sufficient evidence to
determine that the two mean responses of Factor B differ.
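The F ratios behind these conclusions can be reproduced from the Problem 13 data with a short two-factor ANOVA (with replication) sketch. This is an illustrative Python version, not the software output the solution relied on; the function name `two_factor_anova` and the nested-list layout are assumptions made for the example. It returns F ratios only. Converting them to p-values would need an F distribution function, which is deliberately left out here.

```python
# cells[i][j] holds the replicate observations at level i of Factor B
# and level j of Factor A (Problem 13 data).
cells = [
    [[43, 49], [25, 26], [37, 45]],
    [[50, 53], [27, 31], [46, 48]],
]


def mean(values):
    return sum(values) / len(values)


def two_factor_anova(cells):
    """Return F ratios for Factor A, Factor B and the AB interaction."""
    n_b = len(cells)            # levels of Factor B (rows)
    n_a = len(cells[0])         # levels of Factor A (columns)
    r = len(cells[0][0])        # replications per cell (balanced design)

    grand = mean([x for row in cells for cell in row for x in cell])
    a_means = [mean([x for row in cells for x in row[j]]) for j in range(n_a)]
    b_means = [mean([x for cell in row for x in cell]) for row in cells]

    ss_a = n_b * r * sum((m - grand) ** 2 for m in a_means)
    ss_b = n_a * r * sum((m - grand) ** 2 for m in b_means)
    ss_cells = r * sum((mean(cell) - grand) ** 2 for row in cells for cell in row)
    ss_ab = ss_cells - ss_a - ss_b
    ss_error = sum((x - mean(cell)) ** 2
                   for row in cells for cell in row for x in cell)

    df_a, df_b = n_a - 1, n_b - 1
    df_ab = df_a * df_b
    df_error = n_a * n_b * (r - 1)
    ms_error = ss_error / df_error
    return {
        "F_A": (ss_a / df_a) / ms_error,
        "F_B": (ss_b / df_b) / ms_error,
        "F_AB": (ss_ab / df_ab) / ms_error,
    }


f_ratios = two_factor_anova(cells)
for name, value in f_ratios.items():
    print(f"{name} = {value:.4f}")
```

The Factor A ratio can be compared directly with the critical value F0.05 = 5.143 used in part b.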
14)
Examine the following two‐factor analysis of variance table:
Source          SS       df  MS     F‐Ratio
Factor A        162.79    4
Factor B                     28.12
AB Interaction  262.31   12
Error
Total           1298.74  84
a. Complete the analysis of variance table.
b. Determine if interaction exists between factor A and factor B. Use α = 0.05.
c. Determine if the levels of Factor A have equal means. Use a significance level of 0.05.
d. Does the ANOVA table indicate that the levels of factor B have equal means? Use a significance
level of 0.05.
a. MSA = SSA/(a – 1) = 162.79/4 = 40.698.
Since (a – 1)(b – 1) = 12 and (a – 1) = 4, (b – 1) = 3.
SSB = MSB(b – 1) = 28.12(3) = 84.35.
MSAB = SSAB/[(a – 1)(b – 1)] = 262.31/12 = 21.859.
SSE = SST – SSA – SSB – SSAB = 1298.74 – 162.79 – 84.35 – 262.31 = 789.29.
Also, d.f. for Error = d.f.T – d.f.A – d.f.B – d.f.AB = 84 – 4 – 3 – 12 = 65.
MSE = 789.29/65 = 12.143.
Then F-ratio for Factor A = MSA/MSE = 40.698/12.143 = 3.35.
Then F-ratio for Factor B = MSB/MSE = 28.12/12.143 = 2.316.
Then F-ratio for the AB interaction = MSAB/MSE = 21.859/12.143 = 1.800.
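The same bookkeeping can be written as a short Python sketch, which makes the order of the steps explicit: degrees of freedom first, then the missing sums of squares, then mean squares and F ratios. It assumes, as the worked solution does, that 28.12 is the mean square for Factor B. The variable names are illustrative. Note that 28.12 × 3 evaluates to 84.36, so SSB and SSE differ from the hand-rounded figures above by 0.01.

```python
# Known entries from the partially completed table in Problem 14.
ss_a, df_a = 162.79, 4
ms_b = 28.12                 # assumed to be the Factor B mean square
ss_ab, df_ab = 262.31, 12
ss_total, df_total = 1298.74, 84

# (a - 1)(b - 1) = 12 and (a - 1) = 4, so (b - 1) = 3.
df_b = df_ab // df_a
ss_b = ms_b * df_b

# Error terms follow by subtraction.
df_error = df_total - df_a - df_b - df_ab
ss_error = ss_total - ss_a - ss_b - ss_ab

ms_a = ss_a / df_a
ms_ab = ss_ab / df_ab
ms_error = ss_error / df_error

f_a = ms_a / ms_error
f_b = ms_b / ms_error
f_ab = ms_ab / ms_error

print(f"df_B = {df_b}, df_Error = {df_error}")
print(f"SSB = {ss_b:.2f}, SSE = {ss_error:.2f}, MSE = {ms_error:.3f}")
print(f"F_A = {f_a:.3f}, F_B = {f_b:.3f}, F_AB = {f_ab:.3f}")
```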
b. HO: Interaction between Factor A and Factor B does not exist,
HA: Interaction between Factor A and Factor B does exist,
F = MSAB/MSE = 1.800.
Note that F = 1.800 < (F12,100, 0.05 = 1.850 < F12,65, 0.05 < F12,50, 0.05 = 1.952). Therefore,
fail to reject HO. Conclude that there is not sufficient evidence to indicate interaction exists
between Factor A and Factor B.
c. HO: all levels of Factor A have the same mean response,
HA: at least two levels of Factor A have different mean responses,
F = MSA/MSE = 3.35. Note that F4,100, 0.05 = 2.463 < F4,65, 0.05 < F4,50, 0.05 = 2.557 < F = 3.35.
Therefore, reject HO. Conclude that there is sufficient evidence to indicate that at least two
levels of Factor A have different mean responses.
d. HO: all levels of Factor B have the same mean response,
HA: at least two levels of Factor B have different mean responses,
F = MSB/MSE = 2.316.
Note that F = 2.316 < (F3,100, 0.05 = 2.696 < F3,65, 0.05 < F3,50, 0.05 = 2.790). Therefore, fail
to reject HO. Conclude that there is not sufficient evidence to indicate that at least two levels
of Factor B have different mean responses.
15)
           Factor A
Factor B   Level 1  Level 2  Level 3
Level 1    33       30       21
           31       42       30
           35       36       30
Level 2    23       30       21
           32       27       33
           27       25       18
a. Based on the sample data, do factors A and B have significant interaction? State the appropriate
null and alternative hypotheses and test using a significance level of 0.05.
b. Based on these sample data, can you conclude that the levels of factor A have equal means? Test
using a significance level of 0.05.
c. Do the data indicate that the levels of factor B have different means? Test using a significance
level equal to 0.05.
a. H0: Factors A and B do not interact
HA: Factors A and B do interact
ANOVA
Source of Variation  SS        df  MS        F         P-value   F crit
Factor B             150.2222   1  150.2222  5.753191  0.033605  4.747221
Factor A             124.1111   2  62.05556  2.376596  0.135052  3.88529
Interaction          24.11111   2  12.05556  0.461702  0.640953  3.88529
Within               313.3333  12  26.11111
Total                611.7778  17
Since 0.4617 < 3.8853, do not reject H0; there is not sufficient evidence that Factors A and B
interact.
b. H0: μα1 = μα2 = μα3
HA: Not all means are equal
Since 2.3766 < 3.8853, do not reject H0; the Factor A means may be equal.
c. H0: μβ1 = μβ2
HA: Not all means are equal
Since 5.7532 > 4.7472, reject H0 and conclude that not all means are equal.
16)
Consider the following partially completed two‐factor analysis of variance table, which is an
outgrowth of a study in which factor A has 4 levels and factor B has 3 levels. The number of
replications was 11 in each cell.
Source of Variation SS df MS F‐Ratio
Factor A 345.1 4
Factor B 28.12
AB Interaction 1123.2 12
Error 256.7
Total 1987.3 84
a. Complete the analysis of variance table.
b. Based on the sample data, can you conclude that the 2 factors have significant interaction? Test
using a significance level equal to 0.05.
c. Based on the sample data, should you conclude that the means for factor A differ across the 4
levels or the means for factor B differ across the 3 levels? Discuss.
d. Considering the outcome of part b, determine what can be said concerning the differences of the
levels of factors A and B. Use a significance level of 0.10 for any hypothesis tests required.
Provide a rationale for your response to this question.
a.
ANOVA
Source of Variation  SS      df   MS        F‐ratio
Factor A             345.1     3  115.0333  53.77483
Factor B             262.3     2  131.15    61.30892
AB Interaction       1123.2    6  187.2     87.51071
Error                256.7   120  2.139167
Total                1987.3  131
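The degrees of freedom in this completed table follow from the design stated in the problem (4 levels of A, 3 levels of B, 11 replications per cell), and the Factor B sum of squares follows by subtraction. A minimal Python sketch of that reasoning is shown below. The dictionary layout and names are illustrative, and the sums of squares for A, AB, Error and Total are taken from the completed table.

```python
a_levels, b_levels, reps = 4, 3, 11

# Sums of squares taken from the completed table; SSB is found by subtraction.
ss = {"A": 345.1, "AB": 1123.2, "Error": 256.7, "Total": 1987.3}
ss["B"] = ss["Total"] - ss["A"] - ss["AB"] - ss["Error"]

# Degrees of freedom for a balanced two-factor design with replication.
df = {
    "A": a_levels - 1,
    "B": b_levels - 1,
    "AB": (a_levels - 1) * (b_levels - 1),
    "Error": a_levels * b_levels * (reps - 1),
    "Total": a_levels * b_levels * reps - 1,
}

ms = {src: ss[src] / df[src] for src in ("A", "B", "AB", "Error")}
f_ratio = {src: ms[src] / ms["Error"] for src in ("A", "B", "AB")}

for src in ("A", "B", "AB", "Error"):
    f_text = f"{f_ratio[src]:.5f}" if src in f_ratio else ""
    print(f"{src:6s} SS={ss[src]:8.1f} df={df[src]:4d} MS={ms[src]:10.6f} F={f_text}")
```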
b. H0: Factors A and B do not interact
HA: Factors A and B do interact
Using Excel’s FINV function, the critical F for 0.05 significance and 6 and 120 degrees of freedom
is equal to 2.1750.
If F > 2.1750 reject H0, otherwise do not reject H0
87.5107 > 2.1750, so reject H0 and conclude that Factors A and B do interact.
c.
Since interaction is present, Factor B must be tested with a One‐Way ANOVA at a given level of
Factor A, or Factor A should be tested at a given level of Factor B. Note, students should state
that the lack of the raw data makes determining the data for Factor B at a given level of Factor A
impossible, and likewise makes determining the data for Factor A at a given level of Factor B
impossible.
d. One other possible approach is to ignore Factor B and the interaction of Factor A with Factor B.
This of course will make it harder to detect any differences in the mean factor levels of Factor A
if there are actual differences in the average levels of Factor B and the interaction terms.
SSEone-way = SST – SSA = 1987.3 – 345.1 = 1642.2.
Analysis of Variance
Source    DF   SS      MS      F
Factor A    3  345.1   115.03  8.966
Error     128  1642.2  12.83
Total     131  1987.3
H0: μ1 = μ2 = μ3 = μ4
HA: Not all means are equal
Using Excel’s FINV function, the critical F for 0.10 significance and 3 and 128 degrees of freedom
is equal to 2.1271.
If F > 2.1271 reject H0, otherwise do not reject H0
8.966 > 2.1271, so reject H0 and conclude that the levels of Factor A do not have equal means.
We could also ignore Factor A and the interaction of Factor A with Factor B. This of course will
make it harder to detect any differences in the mean factor levels of Factor B if there are actual
differences in the average levels of Factor A and the interaction terms.
SSEone-way = SST – SSB = 1987.3 – 262.3 = 1725.
Analysis of Variance
Source    DF   SS      MS      F
Factor B    2  262.3   131.15  9.809
Error     129  1725    13.37
Total     131  1987.3
H0: μ1 = μ2 = μ3
HA: Not all means are equal
Using Excel’s FINV function, the critical F for 0.10 significance and 2 and 129 degrees of freedom
is equal to 2.3442.
If F > 2.3442 reject H0, otherwise do not reject H0
9.809 > 2.3442, so reject H0 and conclude that the levels of Factor B do not have equal means.
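The two pooled one-way tables above use the same step: everything other than the factor of interest is folded into the error term. A small Python sketch of that step follows. The helper name `pooled_one_way_f` is hypothetical, and the critical values are the ones quoted above rather than computed.

```python
def pooled_one_way_f(ss_factor, df_factor, ss_total, df_total):
    """F ratio when all other sources are pooled into error (hypothetical helper)."""
    ss_error = ss_total - ss_factor
    df_error = df_total - df_factor
    return (ss_factor / df_factor) / (ss_error / df_error)


# Factor A alone: Factor B and the AB interaction are pooled into error.
f_a = pooled_one_way_f(345.1, 3, 1987.3, 131)
# Factor B alone: Factor A and the AB interaction are pooled into error.
f_b = pooled_one_way_f(262.3, 2, 1987.3, 131)

print(f"Factor A: F = {f_a:.3f} (critical value quoted above: 2.1271)")
print(f"Factor B: F = {f_b:.3f} (critical value quoted above: 2.3442)")
```

Carrying full precision gives about 9.808 for Factor B, which differs from 9.809 only because the table rounds the error mean square to 13.37.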
17)
A two‐factor experiment yielded the following data:
           Factor A
Factor B   Level 1  Level 2  Level 3
Level 1    375      402      395
           390      396      390
Level 2    335      336      320
           342      338      331
Level 3    302      485      351
           324      455      346
a. Determine if there is interaction between factor A and factor B. Use the p‐value and a
significance level of 0.05.
b. Given your findings in part a, determine any significant differences among the response means of
the levels of factor A for level 1 of factor B.
c. Repeat part b at levels 2 and 3 of factor B, respectively.
a.
HO: AB interaction does not exist,
HA: AB interaction does exist
F = MSAB/MSW = 39.63. F0.05 = 3.633.
Since F = 39.63 > F0.05 = 3.633, reject HO. Conclude that there is sufficient evidence to indicate
that interaction exists between Factors A and B.
b. Since interaction exists, it is futile to conduct inference on the response means associated with
the levels of Factor A and Factor B. Therefore, conduct a one-way analysis of variance using only
those values for Factor A associated with level one of Factor B.
HO: μA1 = μA2 = μA3 @ Level 1 of Factor B,
HA: at least two levels of Factor A have different mean responses @ Level 1 of Factor B
F = MSAB1/MSW = 2.90. F0.05 = 9.552.
Since F = 2.90 < F0.05 = 9.552, fail to reject HO. Conclude that there is not sufficient evidence to
indicate that at least two levels of Factor A have different mean responses @ Level 1 of Factor B.
c. Factor B at level 2:
HO: μA1 = μA2 = μA3 @ Level 2 of Factor B,
HA: at least two levels of Factor A have different mean responses @ Level 2 of Factor B
F = MSAB2/MSW = 3.49. F0.05 = 9.552.
Since F = 3.49 < F0.05 = 9.552, fail to reject HO. Conclude that there is not sufficient evidence to
indicate that at least two levels of Factor A have different mean responses @ Level 2 of Factor B.
Factor B at level 3:
HO: μA1 = μA2 = μA3 @ Level 3 of Factor B,
HA: at least two levels of Factor A have different mean responses @ Level 3 of Factor B
F = MSAB3/MSW = 57.73. F0.05 = 9.552.
Since F = 57.73 > F0.05 = 9.552, reject HO. Conclude that there is sufficient evidence to indicate
that at least two levels of Factor A have different mean responses @ Level 3 of Factor B.
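The three level-by-level tests in parts b and c are ordinary one-way ANOVAs run on one row of the data at a time. The following Python sketch shows that approach; the function name `one_way_f` and the dictionary layout are illustrative, and the critical value 9.552 (2 and 3 degrees of freedom, α = 0.05) is the one quoted in the solution rather than computed.

```python
# Problem 17 data: for each level of Factor B, one list of replicates
# per level of Factor A.
data_by_b_level = {
    1: [[375, 390], [402, 396], [395, 390]],
    2: [[335, 342], [336, 338], [320, 331]],
    3: [[302, 324], [485, 455], [351, 346]],
}


def one_way_f(groups):
    """One-way ANOVA F ratio for a list of groups (hypothetical helper)."""
    n = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (len(groups) - 1)
    ms_within = ss_within / (n - len(groups))
    return ms_between / ms_within


F_CRIT = 9.552  # quoted critical value for 2 and 3 df at alpha = 0.05

f_by_level = {level: one_way_f(groups) for level, groups in data_by_b_level.items()}
for level, f_value in f_by_level.items():
    decision = "reject HO" if f_value > F_CRIT else "fail to reject HO"
    print(f"Level {level} of Factor B: F = {f_value:.2f} -> {decision}")
```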
Chapter 11 Power Point Slides.pdf
Chapter 11
Analysis of Variance
Chapter Contents
11.1 Overview of ANOVA
11.2 One-Factor ANOVA (Completely Randomized Model)
11.3 Multiple Comparisons
11.4 Tests for Homogeneity of Variances
11.5 Two-Factor ANOVA without Replication (Randomized Block Model)
11.6 Two-Factor ANOVA with Replication (Full Factorial Model)
11.7 Higher Order ANOVA Models (Optional)
11-1
Chapter Learning Objectives (LO’s)
LO11-1: Use basic ANOVA terminology correctly.
LO11-2: Recognize from data format when one-factor ANOVA is appropriate.
LO11-3: Interpret sums of squares and calculations in an ANOVA table.
LO11-4: Use Excel or other software for ANOVA calculations.
LO11-5: Use a table or Excel to find critical values for the F distribution.
LO11-6: Explain the assumptions of ANOVA and why they are important.
11-2
LO11-7: Understand and perform Tukey's test for paired means.
LO11-8: Use Hartley's test for equal variances in c treatment groups.
LO11-9: Recognize from data format when two-factor ANOVA is needed.
LO11-10: Interpret main effects and interaction effects in two-factor ANOVA.
LO11-11: Recognize the need for experimental design and GLM (optional).
11-3
11.1 Overview of ANOVA
LO11-1: Use basic ANOVA terminology correctly.
• Analysis of variance (ANOVA) is a comparison of means.
• ANOVA allows one to compare more than two means simultaneously.
• Proper experimental design efficiently uses limited data to draw the strongest possible inferences.
11-4
The Goal: Explaining Variation
• ANOVA seeks to identify sources of variation in a numerical dependent variable Y (the response
variable).
• Variation in Y about its mean is explained by one or more categorical independent variables (the
factors) or is unexplained (random error).
11-5
The Goal: Explaining Variation
• Each possible value of a factor or combination of factors is a treatment.
• We test to see if each factor has a significant effect on Y using (for example) the hypotheses:
H0: μ1 = μ2 = μ3 = μ4 (the means are the same for all four plants)
H1: Not all the means are equal
• The test uses the F distribution.
• If we cannot reject H0, we conclude that observations within each treatment have a common mean.
11-6
One-Factor ANOVA Example
[Figure not reproduced]
11-7
The Goal: Explaining Variation
• For example, a one-factor ANOVA would test the hypothesis that the length of hospital stay (LOS)
is affected by Type of Fracture:
Length of stay = f(type of fracture). See Figure 11.3 (slide # 10).
• A two-factor ANOVA would test the hypothesis that the length of hospital stay (LOS) is affected by
Type of Fracture and Age Group:
Length of stay = f(type of fracture, age group)
• We can also test for interaction between factors.
• Another Example: Paint quality is a major concern of car makers. A key characteristic of paint is
its viscosity, a continuous numerical variable. Viscosity is to be tested for dependence on
application temperature (low, medium, high), as illustrated in Figure 11.3 (slide # 10).
11-8
The Goal: Explaining Variation
Figure 11.3 [figure not reproduced]
11-9
LO11-6: Explain the assumptions of ANOVA and why they are important.
ANOVA Assumptions
• Analysis of Variance assumes that the
- observations on Y are independent,
- populations being sampled are normal,
- populations being sampled have equal variances.
• ANOVA is somewhat robust to departures from normality and equal variance assumptions.
11-10
ANOVA Calculations
• Software (e.g., Excel, MegaStat, MINITAB, SPSS) can be used to analyze data.
• Large samples increase the power of the test, but power also depends on the degree of variation in Y.
• Lowest power would be in a small sample with high variation in Y.
11-11
11.2 One-Factor ANOVA (Completely Randomized Model)
LO11-2: Recognize from data format when one-factor ANOVA is appropriate.
Data Format
• A one-factor ANOVA only compares the means of c groups (treatments or factor levels).
• Consider the format for a one-factor ANOVA with c treatments, denoted A1, A2, …, Ac.
Table 11.1 [table not reproduced]
11-12
Data Format
• Sample sizes within each treatment do not need to be equal (i.e., balanced).
• The total number of observations is equal to n = n1 + n2 + … + nc
Hypothesis to Be Tested
H0: μ1 = μ2 = … = μc
H1: Not all the means are equal
• ANOVA tests all means simultaneously and so does not inflate the type I error.
11-13
One-Factor ANOVA as a Linear Model
• An equivalent way to express the one-factor model is to say that treatment j came from a
population with a common mean (μ) plus a treatment effect (Aj) plus random error (εij):
yij = μ + Aj + εij,  j = 1, 2, …, c and i = 1, 2, …, nj
• Random error is assumed to be normally distributed with zero mean and the same variance for all
treatments.
11-14
One-Factor ANOVA as a Linear Model
• A fixed effects model only looks at what happens to the response for particular levels of the
factor.
H0: A1 = A2 = … = Ac = 0
H1: Not all Aj are zero
11-15
Group Means
• The mean of each group is calculated as:
ȳj = (1/nj) Σi yij
• The overall sample mean (grand mean) can be calculated as:
ȳ = (1/n) Σj Σi yij = (1/n) Σj nj ȳj
11-16
Partitioned Sum of Squares
• For a given observation yij, the following relationship must hold:
(yij – ȳ) = (ȳj – ȳ) + (yij – ȳj)
11-17
Partitioned Sum of Squares
• This relationship is true for sums of squared deviations, yielding partitioned sum of squares:
Σj Σi (yij – ȳ)² = Σj nj (ȳj – ȳ)² + Σj Σi (yij – ȳj)²
• Simply put, SST = SSA + SSE
11-18
Partitioned Sum of Squares
• SSA and SSE are used to test the hypothesis of equal treatment means by dividing each sum of
squares by its degrees of freedom to adjust for group size.
• These ratios are called Mean Squares (MSA and MSE).
• The resulting test statistic is F = MSA/MSE.
11-19
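The partition SST = SSA + SSE and the resulting F statistic can be illustrated numerically. The Python sketch below uses a small made-up data set with unequal group sizes, to show that the identity does not require a balanced design. The data values and variable names are hypothetical and do not come from the slides; the degrees of freedom used are c – 1 for treatments and n – c for error.

```python
# Hypothetical one-factor data with unequal sample sizes (not from the slides).
groups = {
    "A1": [12.0, 15.0, 11.0],
    "A2": [18.0, 20.0, 17.0, 21.0],
    "A3": [14.0, 13.0],
}

n = sum(len(values) for values in groups.values())   # total observations
c = len(groups)                                       # number of treatments

grand_mean = sum(sum(values) for values in groups.values()) / n
group_means = {name: sum(values) / len(values) for name, values in groups.items()}

# Total, treatment (between) and error (within) sums of squares.
sst = sum((y - grand_mean) ** 2 for values in groups.values() for y in values)
ssa = sum(len(values) * (group_means[name] - grand_mean) ** 2
          for name, values in groups.items())
sse = sum((y - group_means[name]) ** 2
          for name, values in groups.items() for y in values)

# Mean squares: each sum of squares divided by its degrees of freedom.
msa = ssa / (c - 1)
mse = sse / (n - c)
f_stat = msa / mse

print(f"SST = {sst:.4f}, SSA + SSE = {ssa + sse:.4f}")
print(f"MSA = {msa:.4f}, MSE = {mse:.4f}, F = {f_stat:.4f}")
```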
LO11-3: Interpret sums of squares and calculations in an ANOVA table.
Partitioned Sum of Squares
11-20
Partitioned Sum of Squares
• The ANOVA calculations are mathematically simple but involve tedious sums.
• One can use Excel’s one-factor ANOVA menu using Data Analysis to analyze data.
11-21
Test Statistic
• The F distribution describes the ratio of two variances.
• The F statistic is the ratio of the variance due to treatments (MSA) to the variance due to error
(MSE).
11-22
Test Statistic
• When F is near zero, then there is little difference among treatments and we would not expect to
reject the hypothesis of equal treatment means.
Decision Rule
• F cannot be negative and has no upper limit.
• For ANOVA, the F test is a right-tailed test.
• Use Appendix F or Excel (or other appropriate software) to obtain the critical value.
11-23