2. Definitions
- Experiment: a planned scientific inquiry designed to
investigate one or more populations under several
treatments and/or levels
e.g.:
- Experimental Design: The plan of the experiment,
which specifies the treatment conditions (independent
variables) and what is to be measured (dependent
variables).
- Treatment(s): the various conditions (processes,
techniques, operations) which distinguish the populations
of interest.
e.g.:
3. Definitions
- Factor: when several aspects are studied in a single
experiment, each is called a factor (independent variable).
The different categories within a factor are called the
levels of the factor.
- Control: A group of subjects which does not receive the
experimental treatment but in all other respects is treated
in the same way as the experimental group.
The term is also used for the “standard treatment” included in the
experiment so that there is a reference value to which
other treatments may be compared.
- Placebo: An inactive substance or dummy
treatment administered to a control group to compare its
effects with those of a real substance, drug or treatment.
4. Definitions
- EU (Experimental Unit): the smallest entity receiving a
single treatment. It may be a single entity or a group.
- Experimental Error: Uncontrolled sources of
variability in the results which occur randomly during
the experiment. Much of this error is due to individual
differences among subjects.
5. - ANOVA (Analysis of Variance):
• A statistical procedure which allows the
comparison of the means of three or more groups,
using their variability, in order to examine
whether significant differences exist anywhere in
the data.
• It is the process of subdividing the total variability
of the experimental observations into portions
attributable to recognized sources of variation.
6. - Mean Separation (Multiple comparisons):
If the null hypothesis is rejected, then at least one
mean is significantly different from at least one
other mean.
- LSD (Least Significant Difference): a value based
on the standard error that distinguishes “statistically”
similar means from non-similar means.
9. Example. A manufacturer of paper used for making
grocery bags is interested in improving the tensile strength
of the product. Product engineering thinks that tensile
strength is a function of the hardwood concentration in the
pulp and that the range of hardwood concentrations of
practical interest is between 5 and 20%. A team of
engineers responsible for the study decides to investigate
four levels of hardwood concentration: 5%, 10%, 15%, and
20%. They decide to make up six test specimens at each
concentration level, using a pilot plant. All 24 specimens
are tested on a laboratory tensile tester, in random order.
CRD
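The random test order described in this example can be sketched in Python. This is only an illustration under assumptions: the function name `crd_run_order` and the fixed seed are hypothetical choices made here for reproducibility, not part of the original study.

```python
import random

def crd_run_order(levels, replicates, seed=None):
    """Return a randomized test order for a completely randomized design.

    Each (level, replicate) pair is one specimen; the whole list is shuffled
    so that every specimen is equally likely to occupy any test position.
    """
    rng = random.Random(seed)
    specimens = [(level, rep)
                 for level in levels
                 for rep in range(1, replicates + 1)]
    rng.shuffle(specimens)
    return specimens

# Four hardwood concentrations (%), six specimens each, as in the example.
order = crd_run_order([5, 10, 15, 20], replicates=6, seed=1)
for position, (concentration, rep) in enumerate(order, start=1):
    print(f"run {position:2d}: {concentration:2d}% hardwood, specimen {rep}")
```

Shuffling the full list of 24 specimens, rather than shuffling within each concentration, is what makes the design completely randomized.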
12. Where does the variation come from?
CRD
ANOVA
13. One-Way ANOVA Partitions Total Variation
Total variation is partitioned into two parts:
• Variation due to treatment, also called:
– Sum of Squares Among
– Sum of Squares Between
– Sum of Squares Treatment (SST)
– Among Groups Variation
• Variation due to random sampling, also called:
– Sum of Squares Within
– Sum of Squares Error (SSE)
– Within Groups Variation
14. Total Variation
[Figure: Response, X, plotted for Group 1, Group 2 and Group 3, with the overall mean X̄]
$$SS_{Total} = (X_{11} - \bar{X})^2 + (X_{21} - \bar{X})^2 + \cdots + (X_{ij} - \bar{X})^2$$
18. The design model
$$Y_{ij} = \mu + \tau_i + \varepsilon_{ij}, \qquad i = 1, 2, \ldots, a; \quad j = 1, 2, \ldots, n$$
where Yij is a random variable denoting the (ij)th
observation, µ is a parameter common to all treatments
called the overall mean, τi is a parameter associated with
the ith treatment called the ith treatment effect, and εij
is a random error component.
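To make the roles of µ, τi and εij concrete, the model can be simulated. The sketch below is illustrative only: the parameter values are hypothetical, and drawing the errors from a normal distribution with common standard deviation `sigma` is an assumption made here.

```python
import random

def simulate_crd(mu, tau, n, sigma, seed=None):
    """Simulate y[i][j] = mu + tau[i] + eps_ij with eps_ij ~ Normal(0, sigma).

    Returns one list of n observations per treatment.
    """
    rng = random.Random(seed)
    return [[mu + t + rng.gauss(0.0, sigma) for _ in range(n)] for t in tau]

# Hypothetical parameter values, chosen only for illustration.
mu = 15.0
tau = [-5.0, 0.0, 1.0, 4.0]   # treatment effects, chosen to sum to zero
data = simulate_crd(mu, tau, n=6, sigma=2.5, seed=0)

for i, row in enumerate(data, start=1):
    sample_mean = sum(row) / len(row)
    print(f"treatment {i}: sample mean = {sample_mean:.2f}, "
          f"model mean = {mu + tau[i - 1]:.2f}")
```

Each sample mean scatters around µ + τi; the scatter is the experimental error that the ANOVA separates from the treatment effect.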
20. Completely Randomized Design
This is an example of a completely randomized single-
factor experiment with four levels of the factor.
The levels of the factor are called treatments, and each
treatment has six observations or replicates.
This figure indicates that changing the hardwood
concentration has an effect on tensile strength; specifically,
higher hardwood concentrations produce higher observed
tensile strength.
22. Analysis of Variance
Suppose we wish to compare a different levels (treatments) of a
single factor, where a denotes the number of levels.
The response for each of the a treatments is a random variable.
Let yij represent the jth
observation taken under treatment i.
We initially consider the case in which there are an equal
number of observations, n, on each treatment.
23. Analysis of Variance
We are interested in testing the equality of the a
treatment means µ1, µ2, ..., µa. We find that this is
equivalent to testing the hypotheses
$$H_0: \tau_1 = \tau_2 = \cdots = \tau_a = 0$$
$$H_a: \tau_i \neq 0 \text{ for at least one } i$$
If the null hypothesis is true, each observation consists of
the overall mean µ plus a realization of the random error
component εij, and changing the levels of the factor has no
effect on the mean response.
24. Analysis of Variance
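The total sum of squares is
$$SS_{Total} = \sum_{i=1}^{a}\sum_{j=1}^{n} y_{ij}^2 - \frac{y_{..}^2}{N}$$
The treatment sum of squares is
$$SS_{Treatment} = \sum_{i=1}^{a} \frac{y_{i.}^2}{n} - \frac{y_{..}^2}{N}$$
The error sum of squares is
$$SS_{Error} = SS_{Total} - SS_{Treatment}$$
where y_{i.} is the total of the observations under treatment i,
y_{..} is the grand total of all observations, and N = an.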
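The sums of squares on this slide can be computed directly from the treatment totals and the grand total. The following is a minimal sketch, assuming a balanced layout (equal n per treatment); the function name and the toy data are hypothetical and are not the paper-strength measurements.

```python
def anova_sums_of_squares(groups):
    """Shortcut formulas for a balanced one-way layout.

    groups: list of a lists, each holding the n observations of one treatment.
    Returns (ss_total, ss_treatment, ss_error).
    """
    a = len(groups)
    n = len(groups[0])
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes equal n per treatment")
    N = a * n
    grand_total = sum(sum(g) for g in groups)       # y..
    correction = grand_total ** 2 / N               # y..^2 / N
    ss_total = sum(y * y for g in groups for y in g) - correction
    ss_treatment = sum(sum(g) ** 2 for g in groups) / n - correction
    ss_error = ss_total - ss_treatment
    return ss_total, ss_treatment, ss_error

# Hypothetical toy data (a = 3 treatments, n = 3 observations each).
toy = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
ss_total, ss_treatment, ss_error = anova_sums_of_squares(toy)
print(f"SS_Total = {ss_total}, SS_Treatment = {ss_treatment}, SS_Error = {ss_error}")
```

The error sum of squares is obtained by subtraction, exactly as on the slide, so only two sums have to be accumulated from the data.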
25. Analysis of Variance
The ANOVA partitions the total variability in the sample
data into two component parts.
Then, the test of the hypothesis is based on a
comparison of two independent estimates of the population
variance.
The total variability in the data is described by the total
sum of squares.
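The two component parts can be written out explicitly. The identity below is the standard decomposition for a balanced single-factor experiment; the symbols \bar{y}_{i.} (average of treatment i) and \bar{y}_{..} (grand average) are notation introduced here for illustration.

```latex
\sum_{i=1}^{a}\sum_{j=1}^{n}\left(y_{ij}-\bar{y}_{..}\right)^{2}
  = n\sum_{i=1}^{a}\left(\bar{y}_{i.}-\bar{y}_{..}\right)^{2}
  + \sum_{i=1}^{a}\sum_{j=1}^{n}\left(y_{ij}-\bar{y}_{i.}\right)^{2}
```

The left side is the total sum of squares, the first term on the right is the treatment (between-groups) sum of squares, and the second term is the error (within-groups) sum of squares.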
26. Analysis of Variance
We can show that if the null hypothesis H0 is true, the
ratio
$$F = \frac{SS_{Treatment}/(a-1)}{SS_E/[a(n-1)]} = \frac{MS_{Treatment}}{MS_E}$$
has an F-distribution with a - 1 and a(n - 1) degrees of
freedom.
If the null hypothesis is false, the expected value of
MSTreatment is greater than σ².
We would reject H0 if F > Fα,a-1,a(n-1).
27. Analysis of Variance
Example. In the paper tensile strength experiment,
we can use the ANOVA to test the hypothesis that
different hardwood concentrations do not affect the
mean tensile strength of the paper.
The hypotheses are
H0: τ1= τ2= τ3= τ4= 0
Ha: τi ≠ 0 for at least one i
31. Analysis of Variance
The typical ANOVA table for CRD
Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square   F
Treatment             SSTreatment      a-1                  MSTreatment   MSTreatment / MSError
Error                 SSError          a(n-1)               MSError
Total                 SSTotal          an-1
32. Analysis of Variance
Source of Variation      Sum of Squares   Degrees of Freedom   Mean Square   F
Hardwood concentration   382.79           3                    127.60        19.60
Error                    130.17           20                   6.51
Total                    512.96           23
33. Analysis of Variance
SSError = SSTotal – SSTreatment
= 512.96 – 382.79 = 130.17
From the ANOVA results, we reject H0 if F > FTable.
F = 127.60 / 6.51 = 19.60
F0.01, 3, 20 = 4.94
Since 19.60 > 4.94, we reject H0 and conclude that
hardwood concentration affects the mean strength of
the paper.
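The arithmetic on this slide can be reproduced from the two sums of squares alone. This is a small sketch; the function name `one_way_anova` is hypothetical, and the critical value is the tabulated figure quoted on the slide rather than something computed here.

```python
def one_way_anova(ss_total, ss_treatment, a, n):
    """Build the one-way ANOVA quantities for a treatments with n replicates."""
    ss_error = ss_total - ss_treatment
    df_treatment = a - 1
    df_error = a * (n - 1)
    ms_treatment = ss_treatment / df_treatment
    ms_error = ss_error / df_error
    return {
        "ss_error": ss_error,
        "df_treatment": df_treatment,
        "df_error": df_error,
        "ms_treatment": ms_treatment,
        "ms_error": ms_error,
        "F": ms_treatment / ms_error,
    }

# Sums of squares from the hardwood concentration example.
table = one_way_anova(ss_total=512.96, ss_treatment=382.79, a=4, n=6)
F_CRITICAL = 4.94  # F(0.01; 3, 20), tabulated value quoted on the slide

print(f"SS_Error     = {table['ss_error']:.2f}")
print(f"MS_Treatment = {table['ms_treatment']:.2f}")
print(f"MS_Error     = {table['ms_error']:.2f}")
print(f"F            = {table['F']:.2f}")
print("reject H0" if table["F"] > F_CRITICAL else "fail to reject H0")
```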
35. Multiple Comparisons
When the null hypothesis is rejected in the ANOVA, we
know that some of the treatment or factor level means are
different.
However, the ANOVA doesn’t identify which means are
different.
Methods for investigating this issue are called multiple
comparisons methods.
One such method is Fisher’s least significant difference (LSD) method.
36. Multiple Comparisons
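The pair of means µi and µj is declared significantly different if
$$|\bar{y}_{i.} - \bar{y}_{j.}| > LSD$$
where LSD, the least significant difference, is
$$LSD = t_{\alpha/2,\,a(n-1)} \sqrt{\frac{2\,MS_E}{n}}$$
If the sample sizes are different in each treatment, the
LSD is
$$LSD = t_{\alpha/2,\,N-a} \sqrt{MS_E\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}$$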
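As a quick consistency check, the unequal-sample-size expression reduces to the equal-sample-size one when every treatment has n observations, since then N = an, so N - a = a(n - 1), and 1/n + 1/n = 2/n:

```latex
n_i = n_j = n,\quad N = an
\;\Longrightarrow\;
t_{\alpha/2,\,N-a}\sqrt{MS_E\left(\frac{1}{n_i}+\frac{1}{n_j}\right)}
  = t_{\alpha/2,\,a(n-1)}\sqrt{\frac{2\,MS_E}{n}}
```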
37. Multiple Comparisons
Ex. Apply the Fisher LSD method to the hardwood concentration
experiment. There are a = 4 treatments and n = 6 replicates, MSE = 6.51 and,
at the 95% confidence level, t0.025,20 = 2.086. The treatment means are
$$\bar{y}_{1.} = 10.00, \quad \bar{y}_{2.} = 15.67, \quad \bar{y}_{3.} = 17.00, \quad \bar{y}_{4.} = 21.17$$
The value of LSD is
$$LSD = t_{0.025,20}\sqrt{\frac{2\,MS_E}{n}} = 2.086\sqrt{\frac{2(6.51)}{6}} = 3.07$$
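The LSD value for this example can be checked numerically. The sketch below uses the general (unequal sample size) form, which covers the balanced case as well; the function name `fisher_lsd` is hypothetical, and the t value is the tabulated one quoted above rather than computed.

```python
import math

def fisher_lsd(t_crit, ms_error, n_i, n_j):
    """Least significant difference for comparing two treatment means."""
    return t_crit * math.sqrt(ms_error * (1.0 / n_i + 1.0 / n_j))

# Values from the example: t(0.025, 20) = 2.086, MSE = 6.51, n = 6 per treatment.
lsd = fisher_lsd(t_crit=2.086, ms_error=6.51, n_i=6, n_j=6)
print(f"LSD = {lsd:.2f}")
```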
39. Multiple Comparisons
Therefore, any pair of treatment averages that differs by more
than 3.07 implies that the corresponding pair of treatment means
are different.
The comparisons among the observed treatment averages are
4 vs. 1 = 21.17 – 10.00 = 11.17 > 3.07
4 vs. 2 = 21.17 – 15.67 = 5.50 > 3.07
4 vs. 3 = 21.17 – 17.00 = 4.17 > 3.07
3 vs. 1 = 17.00 – 10.00 = 7.00 > 3.07
3 vs. 2 = 17.00 – 15.67 = 1.33 < 3.07
2 vs. 1 = 15.67 – 10.00 = 5.67 > 3.07
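The six pairwise comparisons can also be generated programmatically. This is a small sketch using the treatment averages and the LSD value from the example; the function name `lsd_comparisons` is hypothetical.

```python
from itertools import combinations

# Treatment averages and LSD value taken from the example.
means = {1: 10.00, 2: 15.67, 3: 17.00, 4: 21.17}
LSD = 3.07

def lsd_comparisons(means, lsd):
    """Compare every pair of treatment averages against the LSD.

    Returns tuples (higher_label, lower_label, difference, is_significant).
    """
    results = []
    for i, j in combinations(sorted(means), 2):
        diff = abs(means[i] - means[j])
        results.append((j, i, diff, diff > lsd))
    return results

for high, low, diff, significant in lsd_comparisons(means, LSD):
    verdict = "different" if significant else "not different"
    print(f"{high} vs. {low}: {diff:.2f} -> {verdict}")
```

With a treatments there are a(a - 1)/2 pairs, which is why four concentration levels give six comparisons.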
40. Multiple Comparisons
From this analysis, we see that there are significant
differences between all pairs of means except 2 and 3.
This implies that 10 and 15% hardwood concentration
produce approximately the same tensile strength and that
all other concentration levels tested produce different
tensile strengths.
42. • The randomized complete block design (RCBD) divides the
group of experimental units into n homogeneous groups of size t.
• These homogeneous groups are called
blocks.
• The treatments are then randomly
assigned to the experimental units in
each block - one treatment to a unit in
each block.
RCBD
43. Example 1:
• Suppose we are interested in how weight gain
(Y) in rats is affected by Source of protein (Beef,
Cereal, and Pork) and by Level of Protein (High
or Low).
• There are a total of t = 3 × 2 = 6 treatment
combinations of the two factors (Beef-High
Protein, Cereal-High Protein, Pork-High Protein,
Beef-Low Protein, Cereal-Low Protein, and
Pork-Low Protein).
44. • Suppose we have available to us a total of N =
60 experimental rats to which we are going to
apply the different diets based on the t = 6
treatment combinations.
• Prior to the experimentation the rats were divided
into n = 10 homogeneous groups of size 6.
• The grouping was based on factors that had
previously been ignored (for example, initial weight,
appetite, etc.).
• Within each of the 10 blocks a rat is randomly
assigned a treatment combination (diet).
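The within-block randomization for this example can be sketched as follows. The function name `rcbd_assignment`, the diet labels and the seed are assumptions made here for illustration; the original slides do not describe how the randomization was actually carried out.

```python
import random

def rcbd_assignment(treatments, n_blocks, seed=None):
    """Randomly assign every treatment once within each block.

    Returns a list of blocks; plan[b][k] is the treatment given to
    unit k + 1 of block b + 1.
    """
    rng = random.Random(seed)
    plan = []
    for _ in range(n_blocks):
        order = list(treatments)
        rng.shuffle(order)        # independent randomization in each block
        plan.append(order)
    return plan

sources = ["Beef", "Cereal", "Pork"]
levels = ["High", "Low"]
diets = [f"{source}-{level}" for level in levels for source in sources]  # t = 6

plan = rcbd_assignment(diets, n_blocks=10, seed=42)
for block_number, block in enumerate(plan, start=1):
    print(f"block {block_number:2d}: {', '.join(block)}")
```

Unlike the completely randomized design, the shuffle is restricted to each block, so every diet appears exactly once among the six rats of a block.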
45. • The weight gain after a fixed period is
measured for each of the test animals
and is tabulated on the next slide:
47. Example 2:
• The following experiment compares the effect of
four different chemicals (A, B, C and D) in producing
water resistance (y) in textiles.
• A strip of material, randomly selected from
each bolt, is cut into four pieces (samples),
and the pieces are randomly assigned to
receive one of the four chemical
treatments.
48. • This process is replicated three times
producing a Randomized Block (RB)
design.
• Moisture resistance (y) was measured
for each of the samples. (Low readings
indicate low moisture penetration.)
• The data is given in the diagram and table
on the next slide.
51. The Model for a Randomized Block
Experiment
$$y_{ij} = \mu + \tau_i + \beta_j + \varepsilon_{ij}, \qquad i = 1, 2, \ldots, t; \quad j = 1, 2, \ldots, b$$
yij = the observation in the jth block receiving the ith treatment
µ = overall mean
τi = the effect of the ith treatment
βj = the effect of the jth block
εij = random error
52. The ANOVA Table for a Randomized Block
Experiment
Source   S.S.   d.f.         M.S.   F         p-value
Treat    SST    t-1          MST    MST/MSE
Block    SSB    b-1          MSB    MSB/MSE
Error    SSE    (t-1)(b-1)   MSE
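The entries of this table can be computed from a t × b array of observations with one observation per cell. The sketch below is a minimal illustration with t treatments and b blocks; the function name `rcbd_anova` and the toy data are hypothetical and are not taken from either example in these slides.

```python
def rcbd_anova(y):
    """ANOVA quantities for a randomized block experiment.

    y[i][j] is the observation for treatment i in block j.
    """
    t, b = len(y), len(y[0])
    grand_mean = sum(sum(row) for row in y) / (t * b)
    treat_means = [sum(row) / b for row in y]
    block_means = [sum(y[i][j] for i in range(t)) / t for j in range(b)]

    ss_total = sum((y[i][j] - grand_mean) ** 2 for i in range(t) for j in range(b))
    ss_treat = b * sum((m - grand_mean) ** 2 for m in treat_means)
    ss_block = t * sum((m - grand_mean) ** 2 for m in block_means)
    ss_error = ss_total - ss_treat - ss_block

    df_treat, df_block = t - 1, b - 1
    df_error = df_treat * df_block
    ms_error = ss_error / df_error
    return {
        "ss_treat": ss_treat, "ss_block": ss_block, "ss_error": ss_error,
        "df_treat": df_treat, "df_block": df_block, "df_error": df_error,
        "ms_error": ms_error,
        "F_treat": (ss_treat / df_treat) / ms_error,
        "F_block": (ss_block / df_block) / ms_error,
    }

# Hypothetical toy data: 3 treatments (rows) in 3 blocks (columns).
toy = [[4.0, 6.0, 8.0],
       [5.0, 9.0, 10.0],
       [9.0, 12.0, 18.0]]
result = rcbd_anova(toy)
for name, value in result.items():
    print(f"{name:9s} = {value}")
```

Removing the block sum of squares from the error term is what distinguishes this analysis from the one-way ANOVA of the completely randomized design.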
53. • A randomized block experiment is assumed
to be a two-factor experiment.
• The factors are blocks and treatments.
• There is one observation per cell. It is
assumed that there is no interaction between
blocks and treatments.
• The degrees of freedom for the interaction are
used to estimate error.
54. The ANOVA Table for Diet Experiment
Source   S.S.        d.f.   M.S.        F       p-value
Block    5992.4167   9      665.82407   9.52    0.00000
Diet     4572.8833   5      914.57667   13.08   0.00000
Error    3147.2833   45     69.93963
55. The ANOVA Table for Textile Experiment
Source   Sum of Squares   d.f.   Mean Square   F       Tail Prob.
Blocks   7.17167          2      3.5858        40.21   0.0003
Chem     5.20000          3      1.7333        19.44   0.0017
Error    0.53500          6      0.0892
56. • If the treatments are defined in terms of two
or more factors, the treatment Sum of
Squares can be split (partitioned) into:
– Main Effects
– Interactions
57. The ANOVA Table for Diet Experiment
with terms for the main effects and interaction
between Level of Protein and Source of Protein
Before partitioning the Diet sum of squares:
Source   S.S.        d.f.   M.S.        F       p-value
Block    5992.4167   9      665.82407   9.52    0.00000
Diet     4572.8833   5      914.57667   13.08   0.00000
Error    3147.2833   45     69.93963
After partitioning Diet into Source, Level and their interaction (SL):
Source   S.S.        d.f.   M.S.        F       p-value
Block    5992.4167   9      665.82407   9.52    0.00000
Source   882.23333   2      441.11667   6.31    0.00380
Level    2680.0167   1      2680.0167   38.32   0.00000
SL       1010.6333   2      505.31667   7.23    0.00190
Error    3147.2833   45     69.93963
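The partition can be verified from the tabulated values: the Source, Level and SL sums of squares and degrees of freedom should add up to those of Diet, and each F ratio is the component mean square divided by the error mean square. The sketch below only re-does this arithmetic with the numbers from the tables; the variable names are hypothetical.

```python
MS_ERROR = 69.93963                      # error mean square (45 d.f.)
DIET = {"ss": 4572.8833, "df": 5}        # unpartitioned treatment term

components = {
    "Source": {"ss": 882.23333, "df": 2},
    "Level":  {"ss": 2680.0167, "df": 1},
    "SL":     {"ss": 1010.6333, "df": 2},
}

ss_sum = sum(c["ss"] for c in components.values())
df_sum = sum(c["df"] for c in components.values())
print(f"sum of component S.S. = {ss_sum:.4f} (Diet S.S. = {DIET['ss']:.4f})")
print(f"sum of component d.f. = {df_sum} (Diet d.f. = {DIET['df']})")

f_ratios = {}
for name, c in components.items():
    mean_square = c["ss"] / c["df"]
    f_ratios[name] = mean_square / MS_ERROR
    print(f"{name:6s}: M.S. = {mean_square:.4f}, F = {f_ratios[name]:.2f}")
```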