1. College of Agriculture
Department of Horticultural Science
------------------------------------------------------------------------------------
AHO305 Experimental Design and Analysis
Horticulture Department Unit Guide
Third Year - First Semester, 2017
Noura Kka
2. How to begin
Come to as many classes as possible. Don’t rely on the provided
class notes alone; the classes provide an interpretation of the
concepts, which may be difficult to grasp from written notes. Keep
up with your revision and don’t leave it until the last minute.
Information and practice problems will be provided on a regular
basis.
Ask for help if you are struggling with the unit material.
3. If you need help
Please feel free to come and talk to me to get helpful feedback on
your progress, or if you are struggling in any way.
If you wish to see me then I am usually available after class or you
can arrange an appointment. It is best to email me to set up a
meeting time. Please use my direct email address for this purpose.
Noura Kka: email nkka@outlook.com.au
4. ABOUT THIS UNIT
The basic principles of experimental design and analysis will be
examined in this unit, and the topics covered will span the concepts and
examples of experimental design. This unit will cover simple and
factorial experiments, one-way ANOVA, the completely randomized design
(CRD), the randomized complete block (RCB) design, missing data, and the
Latin square (LS) design. Additionally, students will develop skills
relevant to their research area.
5. Your Unit Learning Outcomes (the student will be able to):
1. Describe the major characteristics of a scientific experiment.
2. Calculate variance and standard deviation from a data set.
3. Perform ANOVA to determine whether means are significantly
different.
4. Explain the difference between CRD, RCB, and LS.
5. Select and plan the correct design to conduct
experiments in the laboratory, greenhouse and field.
6. ASSESSING YOUR ACHIEVEMENT OF THE UNIT LEARNING OUTCOMES
Items Theoretical Practical Total
During semester 27 13 40
Final exam 40 20 60
Total 67 33 100
To obtain a pass in the unit, students must submit and pass at least 4
of the 5 practical class assessments. In addition, there will be two
theoretical and practical tests during the semester.
7. Attendance
Attendance for this class is mandatory. Attendance will be
confirmed with evaluation sheets. Each unexcused absence will
result in the lowering of your final grade by one grade.
8. Student Conduct
Students are expected to respect the rights of others in the class.
Cell phones and other electronic equipment should be turned off
prior to the beginning of class. Use of these items during class time,
or any other unwarranted classroom disruption, will result in your
immediate excusal from class for the remainder of the period.
You may bring drinks to class. Please finish any meals before class
begins. The use of tobacco products during class time is strictly
prohibited.
9. Week 1 Experimental Principles and Basic Statistics
Introduction
Researchers use experiments to answer questions. Typical questions
might be:
- What is the relationship between salinity and mineral nutrition of
horticultural crops?
- What is the difference between local and imported seeds of
tomato, and how can we improve the germination rate of the local cultivar
in Iraq?
- How does drought at different growth stages of potato affect
tuber yield?
- What is the effect of drought and cultivar on growth parameters,
yield and yield components of potato?
- How much vitamin C is in onion?
10. Why Experiment?
Consider the local tomato cultivar example mentioned above.
Researchers need to know how they can improve the germination
rate of the local cultivar. They are investigating three different plant
hormones: gibberellin, abscisic acid and salicylic acid. Which of
these hormones works best? How can they compare the effects of
the three hormones?
11. Experiments help us answer questions. What is so special about
experiments? Consider that:
1. Experiments allow us to set up a direct comparison between the
treatments of interest.
2. We can design experiments to minimize any bias in the comparison.
3. We can design experiments so that the error in the comparison is
small.
4. Most important, we are in control of experiments, and having that
control allows us to make stronger inferences about the nature of
differences that we see in the experiment.
12. Terms and Concepts
Treatments are the different procedures we want to compare.
Experimental unit is the smallest unit receiving a certain treatment.
13. Responses are outcomes that we observe after applying a treatment
to an experimental unit.
14. Factors combine to form treatments. For example, in a study of the effect of
temperature and nitrogen fertilizer on the yield of eggplant, temperature and
nitrogen fertilizer are the factors.
15. Control treatment is a “standard” treatment that is used as a baseline
or basis of comparison for the other treatments.
16. Randomization refers to the allocation of treatments to plots in such a
way that, within a specific experimental design, units are not
discriminated for or against.
17. Replication is the repetition of the same treatment on several
plots or experimental units. It rests on the idea that experiments of the
same nature, conducted under similar conditions, should yield similar
results.
18. Experimental error is a measure of the difference between two units
treated alike. Or it is the random variation present in all experimental
results.
19. A variable is any characteristic, number, or quantity that can be
measured or counted. The unit on which measurements are
made may be a whole plot, a small area of a plot, a single
plant, a stem, a leaf, etc.
Variables can be described in different ways, according
to how they are studied, measured, and presented.
21. Week 2: Hypothesis Testing and The Completely Randomised Design
Steps for Planning, Conducting and Analyzing an Experiment
1. Recognition and statement of the problem
2. Choice of factors, levels, and ranges
3. Selection of the response variable(s)
4. Choice of design
5. Conducting the experiment
6. Statistical analysis
7. Drawing conclusions and making recommendations
22. Hypothesis Testing
A hypothesis, in statistics, is a statement about a
population, typically expressed in terms of some
specific numerical value of a population parameter.
23. In hypothesis testing, there are certain steps one must follow:
1. Set up two competing hypotheses.
2. Set a level of significance, called alpha (α).
3. Calculate a test statistic.
4. Calculate the probability value (p-value), or find the rejection region.
5. Make a test decision about the null hypothesis.
6. State an overall conclusion.
24. The Null and Alternative Hypothesis
We usually set the hypothesis that one wants to conclude
as the alternative hypothesis,
- Something is going on,
- There is a difference from the traditional state of affairs.
- The alternative hypothesis, H1, usually represents what
we want to check or what we suspect is really going on.
25. There are three types of alternative hypotheses:
1. The population parameter is not equal to a certain value. Referred
to as a "two-tailed test".
H1: p≠p0, or H1: μ≠μ0
2. The population parameter is less than a certain value. Referred to
as a "left-tailed test".
H1: p<p0, or H1: μ<μ0
3. The population parameter is greater than a certain value. Referred
to as a "right-tailed test".
H1: p>p0, or H1: μ>μ0
26. The null hypothesis states that the population parameter is equal to that
certain value.
- Nothing special is going on,
- No differences from the traditional state of affairs,
- No relationship
The null hypothesis in each case would be:
H0: p=p0, or H0: μ=μ0
27. Choosing the Null and Alternative Hypothesis
If the conditions necessary to conduct the hypothesis test are
satisfied, then we can use the formulas below to calculate the
appropriate test statistic from our sample data. These
assumptions and test statistics are as follows:
28. Test of One Proportion:
$$z = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0 (1 - p_0)}{n}}}$$
Test of One Mean:
Z-Test: use it when you test a population mean and you know σ.
$$z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$$
T-Test: use it when you test a population mean, you don’t know σ, and the sample size is small (less than 30).
$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$
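The three test statistics on this slide can be sketched in a few lines of Python. The function names and the sample numbers are hypothetical and chosen only so the arithmetic is easy to follow; they do not come from the unit material.

```python
import math

def z_one_proportion(p_hat, p0, n):
    """z statistic for a test of one proportion."""
    return (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

def z_one_mean(x_bar, mu0, sigma, n):
    """z statistic for a test of one mean when sigma is known."""
    return (x_bar - mu0) / (sigma / math.sqrt(n))

def t_one_mean(x_bar, mu0, s, n):
    """t statistic for a test of one mean when sigma is unknown."""
    return (x_bar - mu0) / (s / math.sqrt(n))

# Hypothetical sample summaries (assumed values, for illustration only)
z_p = z_one_proportion(p_hat=0.6, p0=0.5, n=100)
z_m = z_one_mean(x_bar=52.0, mu0=50.0, sigma=4.0, n=16)
t_m = t_one_mean(x_bar=52.0, mu0=50.0, s=5.0, n=25)
print(z_p, z_m, t_m)
```

The t statistic is then compared with a t table using n - 1 degrees of freedom, while the z statistics are compared with the standard normal table.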
29. The Logic of Hypothesis Testing
How do we decide whether to reject the null hypothesis?
▪ If the sample data are consistent with the null hypothesis, then
we do not reject it.
▪ If the sample data are inconsistent with the null hypothesis,
but consistent with the alternative, then we reject the null
hypothesis and conclude that the alternative hypothesis is
true.
30. Type I and Type II Errors and the Setting Up of
Hypotheses
How do we determine whether to reject the null
hypothesis? It begins with the level of significance α, which is
the probability of a Type I error.
What is a Type I error and what is a Type II error?
31. When doing hypothesis testing, two types of mistakes may
be made and we call them Type I error and Type II error.
Decision             Reality
                     H0 is true      H0 is false
Reject H0            Type I error    Correct
Fail to reject H0    Correct         Type II error
32. If we reject H0 when H0 is true, we commit a Type I
error. The probability of Type I error is denoted by: α.
If we fail to reject H0 when H0 is false, we commit a
Type II error. The probability of Type II error is
denoted by: β.
33. Comparing the P-Value Approach to the Rejection Region Approach
1. Using the rejection region approach, you need to look up the critical
value in the table every time you are given a different α value.
2. In addition to being used to reject or not reject H0 by comparison with
α, the p-value also gives us some idea of the strength of
the evidence against H0.
"The results are statistically significant" - when p-value < α
"The results are not statistically significant" - when p-value > α
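As a minimal sketch of the two approaches, the following Python snippet applies both to a two-tailed z-test, using the standard library's `statistics.NormalDist` in place of a printed table. The function name and the z value of 2.5 are assumptions made for illustration.

```python
from statistics import NormalDist

def two_tailed_z_decision(z, alpha=0.05):
    """Return the p-value, the critical value, and both test decisions."""
    std_normal = NormalDist()  # standard normal: mean 0, sd 1
    # p-value approach: probability of a statistic at least this extreme
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    # rejection region approach: critical value for the chosen alpha
    z_critical = std_normal.inv_cdf(1 - alpha / 2)
    reject_by_p_value = p_value < alpha
    reject_by_region = abs(z) > z_critical
    return p_value, z_critical, reject_by_p_value, reject_by_region

# Hypothetical test statistic (assumed value)
p_value, z_critical, by_p, by_region = two_tailed_z_decision(z=2.5, alpha=0.05)
print(p_value, z_critical, by_p, by_region)
```

Both approaches lead to the same decision; the p-value additionally shows how far the result is from the chosen α.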
34. The Completely Randomised Design
Appropriate use of completely randomized designs
1. Under conditions where the experimental material is
homogeneous, e.g., laboratory or growth chamber experiments.
2. Where a fraction of the experimental units is likely to be
destroyed or fail to respond.
3. In small experiments where there is a small number of degrees
of freedom.
35. Advantages of completely randomized designs
1. Complete flexibility is allowed - any number of treatments and
replicates may be used.
2. Relatively easy statistical analysis, even with variable replicates
and variable experimental errors for different treatments.
3. Analysis remains simple when data are missing.
4. Provides the maximum number of degrees of freedom for
error for a given number of experimental units and treatments.
36. Disadvantages of completely randomized designs
1. Relatively low accuracy due to lack of restrictions which
allows environmental variation to enter experimental
error.
2. Not suited for large numbers of treatments because a
relatively large amount of experimental material is needed
which increases the variation.
37. Randomization and Layout
STEP 1. Determine the total number of experimental plots (n) as the
product of the number of treatments (t) and the number of
replications (r); that is, n = (r)(t). For our example, n = (5)(4) = 20.
STEP 2. Assign a plot number to each experimental plot in any
convenient manner; for example, consecutively from 1 to n. For our
example, the plot numbers 1,..., 20 are assigned to the 20
experimental plots as shown in Figure 1.
STEP 3. Assign the treatments to the experimental plots by any of
the following randomization schemes (see reference (4), pages 7-12,
or reference (5), Chapter 9, pages 102-105, for more details).
38. A. By table of random numbers.
B. By drawing lots. The steps involved are:
Treatment label:   D   B   A   B   C   A   D   C   B   D
Sequence:          1   2   3   4   5   6   7   8   9  10
Treatment label:   D   A   A   B   B   C   D   C   C   A
Sequence:         11  12  13  14  15  16  17  18  19  20
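Drawing lots for a completely randomized design can also be imitated in software. The sketch below is one possible way to do it in Python, assuming 4 treatments and 5 replications as in the example; the function name `crd_layout` and the seed are hypothetical.

```python
import random

def crd_layout(treatments, replications, seed=None):
    """Randomly assign each treatment to `replications` plots (CRD)."""
    rng = random.Random(seed)  # a seed makes the layout reproducible
    # One label per plot: every treatment repeated `replications` times
    labels = [trt for trt in treatments for _ in range(replications)]
    rng.shuffle(labels)        # the software equivalent of drawing lots
    # Plot numbers run consecutively from 1 to n = r * t
    return {plot: trt for plot, trt in enumerate(labels, start=1)}

layout = crd_layout(["A", "B", "C", "D"], replications=5, seed=1)
for plot, trt in layout.items():
    print(plot, trt)
```

Because the shuffle is applied to the whole list of 20 labels, there is no restriction on where a treatment may fall, which is what distinguishes the CRD from blocked designs.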
39. ANOVA table
Source of Variation   Degrees of Freedom (d.f.)   Sum of Squares (SS)   Mean Square (MS)   Computed F                Tabular F (5%, 1%)
Treatment             t-1                         Treatment SS          Treatment MS       Treatment MS / Error MS
Experimental error    t(r-1)                      Error SS              Error MS
Total                 tr-1                        Total SS
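The entries of the CRD ANOVA table can be computed as in the following Python sketch. It assumes equal replication, uses the usual correction-factor formulas, and runs on small made-up data; the function name and the data are hypothetical. The tabular F values still have to be looked up in an F table.

```python
import math

def crd_anova(data):
    """One-way ANOVA for a CRD with equal replication.

    data maps each treatment label to its list of observations.
    """
    t = len(data)                                  # number of treatments
    r = len(next(iter(data.values())))             # replications per treatment
    values = [x for obs in data.values() for x in obs]
    grand_total = sum(values)
    cf = grand_total ** 2 / (t * r)                # correction factor
    total_ss = sum(x ** 2 for x in values) - cf
    treatment_ss = sum(sum(obs) ** 2 for obs in data.values()) / r - cf
    error_ss = total_ss - treatment_ss
    treatment_ms = treatment_ss / (t - 1)
    error_ms = error_ss / (t * (r - 1))
    grand_mean = grand_total / (t * r)
    return {
        "treatment_ss": treatment_ss,
        "error_ss": error_ss,
        "total_ss": total_ss,
        "treatment_ms": treatment_ms,
        "error_ms": error_ms,
        "computed_f": treatment_ms / error_ms,
        "cv_percent": 100 * math.sqrt(error_ms) / grand_mean,
    }

# Hypothetical data: 3 treatments, 3 replications each
result = crd_anova({"A": [4, 5, 6], "B": [7, 8, 9], "C": [10, 11, 12]})
print(result)
```

The computed F is then compared with the tabular F for t-1 and t(r-1) degrees of freedom.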
40. - Compare the computed F-value with the tabular F-values, and decide on the
significance of the difference among treatments using the following rules:
1. If the computed F-value is larger than the tabular F-value at the 1% level of
significance, the treatment difference is said to be highly significant. Such a result
is generally indicated by placing two asterisks on the computed F-value in the
analysis of variance.
2. If the computed F-value is larger than the tabular F-value at the 5% level of
significance but smaller than or equal to the tabular F-value at the 1% level of
significance, the treatment difference is said to be significant. Such a result is
indicated by placing one asterisk on the computed F-value in the analysis of
variance.
3. If the computed F-value is smaller than or equal to the tabular F-value at the
5% level of significance, the treatment difference is said to be nonsignificant. Such
a result is indicated by placing ns on the computed F-value in the analysis of
variance.
41. - Compute the coefficient of variation (CV):
$$CV = \frac{\sqrt{\text{Error MS}}}{\text{Grand mean}} \times 100$$
The CV indicates the degree of precision with which the treatments are
compared and is a good index of the reliability of the experiment.
The CV varies greatly with the type of experiment, the crop grown,
and the characters measured.
43. The …………………… states that the population parameter is equal to that certain value. H0: μ=μ0
The …………………… states that the population parameter is not equal to that certain value. H1: μ≠μ0
If we ………… when H0 is true, we commit a Type I error. The probability of …….. is denoted by: ……...
If we ………… when H0 is false, we commit a ………... The probability of Type II error is denoted by: ……………….
"The results are statistically …………………………." - when p-value < α
"The results are …………………………………." - when p-value > α
ANOVA table
Source of Variation   ………     ………            ………            Computed F   Tabular F (5%, 1%)
Treatment             ………     Treatment SS   Treatment MS   ………
Experimental error    t(r-1)  Error SS       ………
Total                 tr-1    Total SS
If the computed F-value is smaller than or equal to the tabular F-value at the 5% level of significance, the
treatment difference is said to …………………… .
44. Week 3: Linear Model and Multiple Comparison
Linear model
Linear models are used to predict the behavior of complex systems or to
analyze experimental, financial, and biological data.
A linear model clearly specifies a response variable, also termed the dependent
variable and designated Y, and one or more predictor variables, also
termed the independent variables or covariates and designated X1,
X2, etc.
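45. General form of the statistical models:
response variable = model + error
Mathematical (linear or ANOVA) model: the value Yij of any experimental unit can be rewritten as
$$Y_{ij} = \bar{Y}_{..} + (\bar{Y}_{i.} - \bar{Y}_{..}) + (Y_{ij} - \bar{Y}_{i.}), \quad i = 1, 2, 3, \dots, t; \; j = 1, 2, 3, \dots, r$$
where
General mean = $\bar{Y}_{..}$
Treatment effect = $\bar{Y}_{i.} - \bar{Y}_{..}$
Error = $Y_{ij} - \bar{Y}_{i.}$
Examples:
$$Y_{11} = 12.06 + (11.5 - 12.06) + (8 - 11.5) = 12.06 + (-0.56) + (-3.5) = 8$$
$$Y_{12} = 12.06 + (11.5 - 12.06) + (14 - 11.5) = 12.06 + (-0.56) + 2.5 = 14$$
$$Y_{36} = 12.06 + (15 - 12.06) + (13 - 15) = 12.06 + 2.94 + (-2) = 13$$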
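The decomposition of an observation into general mean, treatment effect and error can be checked numerically. The short Python sketch below uses the numbers that appear on this slide (general mean 12.06, treatment mean 11.5, observation 8); the function name `decompose` is a hypothetical helper.

```python
def decompose(y_ij, treatment_mean, grand_mean):
    """Split one observation into general mean + treatment effect + error."""
    treatment_effect = treatment_mean - grand_mean
    error = y_ij - treatment_mean
    return grand_mean, treatment_effect, error

# Observation 8 from a treatment with mean 11.5; general mean 12.06
parts = decompose(y_ij=8, treatment_mean=11.5, grand_mean=12.06)
print(parts)        # general mean, treatment effect, error
print(sum(parts))   # adds back up to the observation (up to rounding)
```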
46. Multiple Comparisons
Pair comparison is the simplest and most commonly used
comparison in agricultural research. There are two types:
- Planned pair comparison, in which the specific pair of
treatments to be compared was identified before the start of the
experiment.
- Unplanned pair comparison, in which no specific comparison
is chosen in advance. Instead, every possible pair of treatment
means is compared to identify pairs of treatments that are
significantly different.
47. Brief comments are included below for some test procedures.
Tukey’s HSD test (honestly significant difference)
Student–Newman–Keuls (SNK) test
Ryan’s test
Peritz’s test
Scheffe’s test
Fisher’s Protected Least Significant Difference test (LSD test)
Least Significant Increase (LSI)
Least Significant Difference
Duncan's Multiple Range Test (DMRT)
Dunnett’s test
48. Least Significant Difference
The procedure provides a single LSD value, at a prescribed
level of significance.
The LSD test is most appropriate for making planned pair
comparisons but is not valid for comparing all possible pairs
of means, especially when the number of treatments is
large.
The LSD test must be used only when the F test for treatment
effect is significant and the number of treatments is not too
large (fewer than six).
49. The procedure for applying the LSD test to compare any two treatments involves the
following steps:
Step 1. Compute the LSD value at the α level of significance:
$$LSD_{\alpha} = t_{\alpha,\, df_{error}} \sqrt{\frac{2\, MSE}{r}}$$
where r is the number of replications.
Step 2. Find the t-critical value for α and the error degrees of freedom. Make sure you are using the degrees of freedom
of error from your results (ANOVA).
Step 3. Insert the given values, the MSE from your results and the t-critical value from Step 2, into the least
significant difference formula.
Step 4. Compute the mean difference between the ith and the jth treatment as:
$$d_{ij} = \bar{Y}_i - \bar{Y}_j$$
Step 5. Compare Step 1 and Step 4. If $|d_{ij}| \geq LSD_{\alpha}$, then you can reject the null hypothesis, that is,
the two treatments are significantly different at the α level of significance.
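The LSD steps can be sketched in Python as follows. The sketch assumes equal replication and the usual form LSD = t × sqrt(2 × MSE / r); the t-critical value is passed in by hand because it comes from a t table. All numbers and names here are hypothetical.

```python
import math
from itertools import combinations

def lsd_value(mse, r, t_critical):
    """Least significant difference for two means with r replications each."""
    return t_critical * math.sqrt(2 * mse / r)

def lsd_pairs(means, lsd):
    """Compare every pair of treatment means against the LSD value."""
    results = {}
    for (a, mean_a), (b, mean_b) in combinations(means.items(), 2):
        diff = abs(mean_a - mean_b)
        results[(a, b)] = (diff, diff >= lsd)  # True means significant
    return results

# Hypothetical inputs: MSE and r from an ANOVA, t-critical from a t table
lsd = lsd_value(mse=8.0, r=4, t_critical=2.0)
comparisons = lsd_pairs({"T1": 20.0, "T2": 23.0, "T3": 25.0}, lsd)
for pair, (diff, significant) in comparisons.items():
    print(pair, diff, significant)
```

In practice only the planned pairs would be passed to the comparison, in line with the guidance that the LSD test is most appropriate for planned comparisons.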
50. Duncan's Multiple Range Test (DMRT)
This stepwise test was historically popular. However, it does not control the
Type I error rate at a known level.
The procedure for applying the DMRT is similar to that for the LSD test;
however, unlike the LSD test, the DMRT requires computation of a series of
values, each corresponding to a specific set of pair comparisons.
51. Step 1: Rank the treatments from highest to lowest mean. Then compare the highest with the lowest mean.
Step 2: Look up the table of significant studentized ranges (SSR), using the number of treatments and the degrees of freedom for the error term.
Step 3: Compute the standard error of a treatment mean, $s_{\bar{x}} = \sqrt{MSE / r}$.
Step 4: Multiply the standard error (Step 3) by the SSR values (Step 2) to obtain the least significant ranges (LSR).
Step 5: Compare the first (largest) LSR with the difference between the highest and lowest means. If the difference is significant,
continue down your table, changing the LSR as you go. Once you have a non-significant result, you can stop
at that point.
52. Dunnett’s test
This is a modified t test designed specifically for comparing each
group to a control group. Under this scenario, there are fewer
comparisons than when comparing all pairs of group means, so Dunnett’s
test is more powerful than other multiple comparison tests in this
situation.
53. Step 1: Look up the tDunnett critical value in the Dunnett critical value table. You’ll need:
- Your chosen alpha level (usually 5%),
- Degrees of freedom for treatments from the ANOVA (dft), on the top of the table,
- Degrees of freedom for error from the ANOVA (dfE), in the left-hand column of the table.
Plug the value into the formula
$$D_{Dunnett} = t_{Dunnett} \sqrt{\frac{2\, MSE}{n}}$$
where n is the number of observations per group.
Step 2: Find the mean square error (MSE) in the ANOVA source table. Plug that value
into the above formula.
Step 3: Find the differences between each treatment mean and the control group mean. If a
difference is greater than DDunnett, then that difference is significant.
54. Unequal Replication
A comparison between equal and unequal replicates
- Equal CRD
ANOVA table
Source of Variation (S.O.V)   d.f.      Sum of Squares (SS)          Mean Square (MS)   Computed F   Tabular F (5%, 1%)
Treatment                     t-1       tSS = ΣXi.²/r - C.F.         MSt = tSS/dft      MSt/MSE
Experimental error            t(r-1)    ESS = TSS - tSS              MSE = ESS/dfe
Total                         tr-1      TSS = ΣXij² - C.F.
$$C.F. = \frac{(\sum X_{ij})^2}{tr} = \frac{X_{..}^2}{tr}$$
- Unequal CRD
ANOVA table
Source of Variation (S.O.V)   d.f.      Sum of Squares (SS)          Mean Square (MS)   Computed F   Tabular F (5%, 1%)
Treatment                     t-1       tSS = Σ(Xi.²/ri) - c.f.      MSt = tSS/dft      MSt/MSE
Experimental error            Σri - t   Error SS                     MSE = ESS/dfe
Total                         Σri - 1   Total SS
$$c.f. = \frac{(\sum X_{ij})^2}{\sum r_i}$$
55. Randomized complete block (RCB) designs
The RCBD is the standard design for agricultural experiments where
similar experimental units are grouped into blocks or replicates.
56. Example 1
Example of a Randomized Complete Block (RCB) design with three replicates of two
treatments and one control.
57. Example 2
Aerial view of an irrigation trial design in a 10-acre vineyard using an RCB design that ensures each zone receives all
treatments. Different colors represent different irrigation treatments.
58. Different portions of a cluster from which representative berry samples should be
collected.
62. Randomization Procedure for RCBD
• Each replicate is randomized separately.
• Each treatment has the same probability of being assigned to a
given experimental unit within a replicate.
• Each treatment must appear at least once per replicate.
Replicate 1: B C D A
Replicate 2: B A C D
Replicate 3: A D B C
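The RCBD randomization rule (each replicate randomized separately, every treatment present in every replicate) can be sketched in Python as below. The function name `rcbd_layout`, the treatment labels and the seed are assumptions for illustration.

```python
import random

def rcbd_layout(treatments, replicates, seed=None):
    """Randomize the treatment order independently within each replicate."""
    rng = random.Random(seed)
    layout = []
    for _ in range(replicates):
        block = list(treatments)   # every treatment appears once per replicate
        rng.shuffle(block)         # separate randomization for this replicate
        layout.append(block)
    return layout

layout = rcbd_layout(["A", "B", "C", "D"], replicates=3, seed=7)
for number, block in enumerate(layout, start=1):
    print(f"Replicate {number}:", " ".join(block))
```

Unlike the CRD, the shuffle here is restricted to one replicate at a time, so no replicate can end up missing a treatment.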
64. What is the difference between the ANOVA tables?
ANOVA table (CRD)
Source of Variation   Degrees of Freedom (d.f.)   Sum of Squares (SS)   Mean Square (MS)   Computed F                Tabular F (5%, 1%)
Treatment             t-1                         Treatment SS          Treatment MS       Treatment MS / Error MS
Experimental error    t(r-1)                      Error SS              Error MS
Total                 tr-1                        Total SS
ANOVA table (RCBD)
Source of Variation   Degrees of Freedom (d.f.)   Sum of Squares (SS)   Mean Square (MS)   Computed F                Tabular F (5%, 1%)
Treatment             t-1                         Trt SS                Treatment MS       Treatment MS / Error MS
Replicate             r-1                         Rep SS                Rep MS             Rep MS / Error MS
Experimental error    (t-1)(r-1)                  Error SS              Error MS
Total                 tr-1                        Total SS
65. Advantages of the RCBD
1. Generally more precise than the CRD.
2. No restriction on the number of treatments or replicates.
3. Some treatments may be replicated more times than
others.
4. Missing plots are easily estimated.
5. Whole treatments or entire replicates may be deleted
from the analysis.
66. Disadvantages of the RCBD
1. Error df is smaller than that for the CRD (problem with a
small number of treatments).
2. If there is a large variation between experimental units
within a block, a large error term may result (this may be
due to too many treatments).
3. If there are missing data, a RCBD experiment may be less
efficient than a CRD.
67. Note: If the F-test for replicate is significant, this indicates that the
precision of the experiment has been increased by using an RCBD
instead of a CRD.
68. ANOVA table RCBD
Step 1. Calculate the correction factor (CF).
$$C.F. = \frac{(\sum X_{ij})^2}{tr} = \frac{X_{..}^2}{tr} = \frac{G^2}{tr}$$
Step 2. Calculate the Total SS.
$$\text{Total SS} = \sum X_{ij}^2 - C.F.$$
Step 3. Calculate the Replicate SS (Rep SS).
$$\text{Rep SS} = \frac{\sum X_{.j}^2}{t} - C.F. = \frac{X_{.1}^2 + X_{.2}^2 + \dots}{t} - C.F.$$
Step 4. Calculate the Treatment SS (Trt SS).
$$\text{Trt SS} = \frac{\sum X_{i.}^2}{r} - C.F. = \frac{X_{1.}^2 + X_{2.}^2 + \dots}{r} - C.F.$$
Step 5. Calculate the Error SS.
$$\text{Error SS} = \text{Total SS} - \text{Rep SS} - \text{Trt SS}$$
Step 6. Complete the ANOVA Table
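Steps 1 to 6 for the RCBD can be followed in code. The Python sketch below assumes a complete table with treatments as rows and replicates as columns, and obtains the Error SS by subtracting both the Replicate SS and the Treatment SS from the Total SS. The function name and the data are hypothetical.

```python
def rcbd_anova(table):
    """ANOVA for an RCBD; table[i][j] is treatment i in replicate j."""
    t = len(table)                 # number of treatments
    r = len(table[0])              # number of replicates
    grand_total = sum(sum(row) for row in table)
    cf = grand_total ** 2 / (t * r)                       # Step 1
    total_ss = sum(x ** 2 for row in table for x in row) - cf   # Step 2
    rep_totals = [sum(row[j] for row in table) for j in range(r)]
    rep_ss = sum(total ** 2 for total in rep_totals) / t - cf   # Step 3
    trt_ss = sum(sum(row) ** 2 for row in table) / r - cf       # Step 4
    error_ss = total_ss - rep_ss - trt_ss                 # Step 5
    rep_ms = rep_ss / (r - 1)
    trt_ms = trt_ss / (t - 1)
    error_ms = error_ss / ((t - 1) * (r - 1))
    return {                                              # Step 6
        "total_ss": total_ss,
        "rep_ss": rep_ss,
        "trt_ss": trt_ss,
        "error_ss": error_ss,
        "f_rep": rep_ms / error_ms,
        "f_trt": trt_ms / error_ms,
    }

# Hypothetical data: 3 treatments (rows) x 3 replicates (columns)
result = rcbd_anova([[10, 12, 14], [13, 15, 14], [16, 18, 20]])
print(result)
```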
69. Step 7. Look up the tabular F-values for Rep and Trt.
Step 8. Make conclusions.
Compare the computed F-value with the tabular F-values, and decide on the
significance of the difference among treatments using the same rules as for the CRD.
70. Step 9. Calculate Coefficient of Variation (CV).
Step 10. Calculate LSD’s if necessary
• There is no need to calculate an LSD for replicate since you generally are not interested in
comparing differences between replicate means.
• If the F-test for treatment is non-significant, one would not calculate the F-protected
LSD. However, if the F-test for treatment is significant, the LSD is calculated with the same
steps as for the CRD.
71. Missing Values in a Randomised Block Design (RBD)
• Reasons for missing data include:
1. A test tube may break in the laboratory.
2. A plot may be flooded, grazed, or vandalized.
3. Plants may be destroyed by pests or disease.
4. Some measurements may be lost.
5. Illogical data may be present because an experimental
unit accidentally received the wrong treatment.
72. Missing values estimation
The Yates method is a classical approach developed by Yates in
1933.
The purpose of missing values estimation is
minimizing error variance.
This approach consists of three stages,
1- Estimation of missing values.
2- Replacement of the missing values with prediction values.
3- Analysis of the complete data.
73. - The formula for a single missing value is
$$\hat{y}_{ij} = \frac{rB + tT - G}{(r - 1)(t - 1)}$$
where:
r = number of replicates
t = number of treatments
B = total of observed values of the replication that contains the missing data
T = total of observed values of the treatment that contains the missing data
G = grand total of all observed values (Y..)
74. - If the number of missing values is more than one, we iterate the
formula above using a starting value (approximated value):
$$\hat{y}_{ij}^{(0)} = \frac{\bar{y}_{i.} + \bar{y}_{.j}}{2}$$
ȳi. = mean of observed values of the treatment that contains the missing data.
ȳ.j = mean of observed values of the replication that contains the missing
data.
75. Example: The value for a missing value can be estimated by using the
following steps:
Step 1. Estimate the missing value for the missed value of the fourth
treatment (100 kg seed/ha) in replication two using the formula for a
single missing value.
Seed yield, kg/ha
Treatment, N kg/ha   R1      R2      R3      R4      Treat. total (Yi.)
T1 (25)              5113    5398    5307    4678    20496
T2 (50)              5346    5952    4719    4264    20281
T3 (75)              5272    5713    5483    4749    21217
T4 (100)             5164    m       4986    4410    14560
T5 (125)             4804    4848    4432    4748    18832
T6 (150)             5254    4542    4919    4098    18813
Rep. total (Y.j)     30953   26453   29846   26947   114199
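Step 1 can be checked with a short Python sketch. It assumes the usual single-missing-value formula for a randomized block design, (rB + tT - G) / ((r - 1)(t - 1)), and plugs in the totals from the table above: B = 26453 (replication 2), T = 14560 (treatment T4) and G = 114199. The function name is hypothetical.

```python
def yates_missing_value(r, t, block_total, treatment_total, grand_total):
    """Estimate a single missing value in a randomized block design."""
    numerator = r * block_total + t * treatment_total - grand_total
    return numerator / ((r - 1) * (t - 1))

estimate = yates_missing_value(
    r=4,                    # replications
    t=6,                    # treatments
    block_total=26453,      # observed total of replication 2
    treatment_total=14560,  # observed total of treatment T4
    grand_total=114199,     # grand total of all observed values
)
print(round(estimate))
```

The rounded estimate agrees with the value of 5,265 that is used in the adjusted treatment SS calculation later in the example.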
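79. Step 4. Compute the adjusted treatment SS
$$\text{Adjusted tSS} = \text{tSS} - \frac{[B_j - (t - 1)\,\hat{y}_{ij}]^2}{t(t - 1)}$$
$$\text{Adjusted tSS} = 1{,}140{,}501 - \frac{[26{,}453 - (6 - 1)(5{,}265)]^2}{6(6 - 1)} = 1{,}139{,}955$$
Bj = total of observed values of the replication that contains the missing data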
80. Step 5. Complete the adjusted ANOVA
ANOVA table
S.O.V.   df                    SS          MS        F cal.    F tab. (0.05)
Treat.   t-1 = 5               1,139,955   227,991   2.07 ns   2.96
Block    r-1 = 3               2,188,739   729,580
Error    (t-1)(r-1)-1 = 14     1,540,726   110,052
Total    (tr-1)-1 = 22         4,869,966
- Remember that for each missing value in the experiment, you lose one degree of freedom
from error and total.
82. Step 7. Compute the mean difference between the treatments:
               T6        T5        T4        T2        T1        T3
               4703.25   4708      4956.25   5070.25   5124      5304.25
T3   5304.25   601*      596.25*   348       234       180.25    0
T1   5124      420.75    416       167.75    53.75     0
T2   5070.25   367       362.25    114       0
T4   4956.25   253       248.25    0
T5   4708      4.75      0
T6   4703.25   0
Step 8. Compare Step 6 and Step 7. If the mean difference between the
treatments ≥ LSDα then you can reject the null hypothesis or you say the two
treatments are significantly different at the 0.05 level of significance.
L.S.D. (0.05) = 551
83. Conclusion: The differences between T3 and T6
and between T3 and T5 are statistically significant. However,
the differences between the other treatments are
not statistically significant. Therefore, we reject the
null hypothesis.
T3≠T6, T3≠T5.
84. Latin Square Design
The Latin Square Design is an extension of the Randomized
Block Design to include two blocking factors, each with the
same number of levels as the primary treatment factor.
86. Advantage:
Allows the experimenter to control two sources of variation
Disadvantages:
- The experiment becomes very large if the number of treatments is
large
- The statistical analysis is complicated by missing plots and
mis-assigned treatments
- df Error is small if there are only a few treatments
Note: Use LS only when you have more than four but fewer than ten
treatments, so that there is a minimum of 12 df for error.
87. Randomization and Layout
Step 1 First row in alphabetical order A B C D E
Step 2 Subsequent rows - shift letters one position
Step 3 Randomize the order of the rows: e.g., 2 4 1 3 5
Step 4 Finally, randomize the order of the columns: e.g., 4 3 5 1 2
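The four randomization steps for a Latin square can be sketched in Python as follows: build the cyclic square, then shuffle the rows and then the columns. The function name `latin_square` and the seed are assumptions for illustration.

```python
import random

def latin_square(treatments, seed=None):
    """Build a randomized Latin square from a list of treatment labels."""
    rng = random.Random(seed)
    n = len(treatments)
    # Steps 1-2: first row in order, each later row shifted one position
    square = [[treatments[(i + j) % n] for j in range(n)] for i in range(n)]
    # Step 3: randomize the order of the rows
    rng.shuffle(square)
    # Step 4: randomize the order of the columns
    column_order = list(range(n))
    rng.shuffle(column_order)
    return [[row[c] for c in column_order] for row in square]

square = latin_square(["A", "B", "C", "D", "E"], seed=3)
for row in square:
    print(" ".join(row))
```

Shuffling whole rows and whole columns keeps the defining property intact: each treatment still appears exactly once in every row and every column.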
89. ANOVA table
Source of variation   Degrees of freedom (r = c = t)   Sums of squares (SS)   Mean square (MS)     F
Rows (R)              r-1                              SSR                    SSR/(r-1)            MSR/MSE
Columns (C)           c-1                              SSC                    SSC/(r-1)            MSC/MSE
Treatments (Tr)       t-1                              SSTr                   SSTr/(r-1)           MSTr/MSE
Error (E)             (r-1)(r-2)                       SSE                    SSE/((r-1)(r-2))
Total (Tot)           r²-1                             SSTot
90. Example
- Are the blocks for shading represented by the row or columns?
- Is the gradient due to irrigation accounted for by rows or columns?
92.         C1      C2      C3      C4      C5      Sum R   R Mean
R1          33.8    33.7    30.4    32.7    24.4    155     31
R2          37      28.8    33.5    34.6    33.4    167.3   33.46
R3          35.8    35.6    36.9    26.7    35.1    170.1   34.02
R4          33.2    37.1    37.4    38.1    34.1    179.9   35.98
R5          34.8    39.1    32.7    37.4    36.4    180.4   36.08
Sum C       174.6   174.3   170.9   169.5   163.4   852.7   34.108
C Mean      34.92   34.86   34.18   33.9    32.68
Treatment   R1      R2      R3      R4      R5      Sum Tr  Tr Mean
A           33.8    34.6    36.9    37.1    36.4    178.8   35.76
B           33.7    33.5    35.1    38.1    34.8    175.2   35.04
C           32.7    33.4    35.8    37.4    39.1    178.4   35.68
D           30.4    37      35.6    34.1    37.4    174.5   34.9
E           24.4    28.8    26.7    33.2    32.7    145.8   29.16
Total                                               852.7   34.108
93. ANOVA
Source of variation   Degrees of freedom   Sums of squares (SS)   Mean square (MS)   F
Rows (R)              4                    87.40                  21.85              7.13**
Columns (C)           4                    16.56                  4.14               1.35
Treatments (Tr)       4                    155.89                 38.97              12.71**
Error (E)             12                   36.80                  3.07
Total (Tot)           24                   296.66
- Differences among treatment means were highly significant
- (Row) Blocking by irrigation effect was useful in reducing experimental error
- (Column) Distance from shade did not appear to have a significant effect
94. Relative Efficiency
The relative efficiency of a LS design as compared to a CRD:
$$RE(LS:CRD) = \frac{MSR + MSC + (t - 1)MSE}{(t + 1)MSE} = \frac{21.85 + 4.14 + (5 - 1)(3.07)}{(5 + 1)(3.07)} = 2.08$$
95. When the error d.f. in the LS analysis of variance is less than 20, the
R.E. value should be multiplied by the adjustment factor k defined as:
$$k = \frac{(df_{EA} + 1)(df_{EB} + 3)}{(df_{EB} + 1)(df_{EA} + 3)}$$
$$k = \frac{(12 + 1)(20 + 3)}{(20 + 1)(12 + 3)} = \frac{299}{315} = 0.949$$
(2.08 × 0.949) × 100 = 197.4%, i.e. a gain in efficiency of about 97% over the CRD
CRD would require 1.97 × 5 = 9.85 or 10 reps
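The relative efficiency and its adjustment factor can be reproduced with the Python sketch below, using the mean squares from the Latin square ANOVA (MSR = 21.85, MSC = 4.14, MSE = 3.07, t = 5) and error degrees of freedom of 12 for the LS and 20 for the CRD. The function names are hypothetical, and the code works with unrounded intermediate values, so the last digit of the adjusted result can differ slightly from a hand calculation that rounds at each step.

```python
def re_ls_vs_crd(msr, msc, mse, t):
    """Relative efficiency of a Latin square compared with a CRD."""
    return (msr + msc + (t - 1) * mse) / ((t + 1) * mse)

def adjustment_factor(df_error_a, df_error_b):
    """Adjustment factor k for small error degrees of freedom.

    df_error_a: error d.f. of the design used (here the Latin square)
    df_error_b: error d.f. of the design it is compared with
    """
    return ((df_error_a + 1) * (df_error_b + 3)) / (
        (df_error_b + 1) * (df_error_a + 3)
    )

re = re_ls_vs_crd(msr=21.85, msc=4.14, mse=3.07, t=5)
k = adjustment_factor(df_error_a=12, df_error_b=20)
adjusted_percent = re * k * 100
print(round(re, 2), round(k, 3), round(adjusted_percent, 1))
```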
96. The relative efficiency of a LS design as compared to a RCBD can be
computed in two ways:
- When rows are considered as the blocks of the RCB design:
$$RE = \frac{MSC + (t - 1)MSE}{t \cdot MSE} = \frac{4.14 + (5 - 1)(3.07)}{5 \times 3.07} = 1.07$$
$$k = \frac{(12 + 1)(16 + 3)}{(16 + 1)(12 + 3)} = \frac{247}{255} = 0.97$$
(1.07 × 0.97) × 100 = 103.79%
3% gain in efficiency by adding columns
97. - When columns are considered as the blocks of the RCB design, the relative
efficiency is computed as:
$$RE = \frac{MSR + (t - 1)MSE}{t \cdot MSE} = \frac{21.85 + (5 - 1)(3.07)}{5 \times 3.07} = 2.22$$
$$k = \frac{(12 + 1)(16 + 3)}{(16 + 1)(12 + 3)} = \frac{247}{255} = 0.97$$
(2.22 × 0.97) × 100 = 215%, i.e. a 115% gain in efficiency by adding rows
98. The missing data in a Latin square design
- If only one plot is missing, you could use the following formula:
$$\hat{Y}_{ij(k)} = \frac{t(R_i + C_j + T_k) - 2G}{(t - 1)(t - 2)}$$
Where:
• t = number of treatments
• Ri = total of observed values of the row that
contains the missing data
• Cj = total of observed values of the column that
contains the missing data
• Tk = total of observed values of the treatment that
contains the missing data
• G = grand total of the available observations
99. EXAMPLE
       C1      C2      C3      C4      Sum
R1     A (8)   C (11)  D (2)   B (8)   29
R2     C (7)   A       B (2)   D (4)   13
R3     D (3)   B (9)   A (7)   C (9)   28
R4     B (4)   D (5)   C (9)   A (3)   21
Sum    22      25      20      24      91
Treatment totals:
A 18
B 23
C 36
D 14
100. Step 1. Estimate the missing value, using the preceding formula
$$\hat{Y}_{ij(k)} = \frac{t(R_i + C_j + T_k) - 2G}{(t - 1)(t - 2)}$$
$$\hat{Y}_{22(A)} = \frac{4(13 + 25 + 18) - 2(91)}{(4 - 1)(4 - 2)} = \frac{42}{6} = 7$$
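The same estimate can be verified with a few lines of Python. The function name `ls_missing_value` is a hypothetical helper; the inputs are the totals from the example (row 2 total 13, column 2 total 25, treatment A total 18, grand total 91, t = 4).

```python
def ls_missing_value(t, row_total, column_total, treatment_total, grand_total):
    """Estimate a single missing value in a Latin square design."""
    numerator = t * (row_total + column_total + treatment_total) - 2 * grand_total
    return numerator / ((t - 1) * (t - 2))

estimate = ls_missing_value(
    t=4, row_total=13, column_total=25, treatment_total=18, grand_total=91
)
print(estimate)
```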
101. Step 2. Replace the missing value by its estimate and carry out the analysis of variance of
the augmented data set based on the standard procedure.
       C1      C2      C3      C4      Sum
R1     A (8)   C (11)  D (2)   B (8)   29
R2     C (7)   A (7)   B (2)   D (4)   20
R3     D (3)   B (9)   A (7)   C (9)   28
R4     B (4)   D (5)   C (9)   A (3)   21
Sum    22      32      20      24      98
Treatment totals:
A 25
B 23
C 36
D 14
102.     df   SS       MS      Cal-F      Tab-F (0.05)
R        3    16.25    5.42    1.38 n.s   4.76
C        3    20.75    6.92    1.77 n.s
Tr       3    61.25    20.42   5.21 *
E        6    23.50    3.92
Total    15   121.75
Step 3: Make the following modifications to the analysis of variance obtained in Step 2:
* Subtract one from both the total and error d.f. For our example, the
total d.f. of 15 becomes 14 and the error d.f. of 6 becomes 5.
104. Step 5. Complete the adjusted ANOVA for the above data
Source   df   SS       MS      Cal-F       Tab-F (0.05)
R        3    16.25    5.42    1.15 n.s    5.41
C        3    20.75    6.92    1.47 n.s
Tr       3    61.22    20.41   4.34 n.s
E        5    23.53    4.71
Total    14   121.75
Conclusion: According to the F-test, the difference between treatments is not statistically significant.
Therefore, we do not reject the null hypothesis.
105. FACTORIAL EXPERIMENTS
An experiment in which the treatments consist of all possible
combinations of the selected levels of two or more factors is referred to
as a factorial experiment.
In other words, factorials combine the levels of two or more factors to create
the treatments.
106. Advantages of Factorial Arrangements
1. Provides estimates of interactions.
2. Possible increase in precision due to hidden replication.
Disadvantages of Factorial Arrangements
1. Some treatment combinations may be of little interest.
2. Experimental error may become large with a large number of treatments.
3. Interpretation may be difficult (especially for three-way or higher-order interactions).
107. Some examples of factorial treatment structure
- Example 1: Consider an experiment to investigate the effect of
phosphorus (P) and potassium (K) fertilizers on the yield of a crop. The
two levels of P could be zero and 20 kg/ha. The two levels of K could be
zero and 15 kg/ha.
• P zero = P0
• P 20 kg/ha = P1
• K zero = K0
• K 15 kg/ha = K1
The four treatment combinations:
        P0       P1
K0      K0 P0    K0 P1
K1      K1 P0    K1 P1
108. Simple effects, main effects and interactions
The simple effect of a factor is the difference between its two levels at a given
level of the other factor.
Simple effect of P at K0 = K0 P1 - K0 P0
Simple effect of P at K1 = K1 P1 - K1 P0
Simple effect of K at P0 = K1 P0 - K0 P0
Simple effect of K at P1 = K1 P1 - K0 P1
109. Main effect: An estimate of the effect of a factor averaged over the levels
of the other factor, that is, independent of any other factors.
Main effect of P = (simple effect of P at K0 + simple effect of P at K1) / 2
Main effect of K = (simple effect of K at P0 + simple effect of K at P1) / 2
110. Interaction: An interaction occurs when the effect of an independent factor
on the response depends upon the level of another independent factor.
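The interaction of P = Simple effect of P at K0 - Simple effect of P at K1
The interaction of K = Simple effect of K at P0 - Simple effect of K at P1
Both expressions reduce to the same contrast, K0 P1 - K0 P0 - K1 P1 + K1 P0, so they give the same value.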
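The definitions of simple effects, main effects and interaction can be tried out on numbers. The yields below are hypothetical values invented for illustration (they are not data from this unit), and the variable names are assumptions.

```python
# Hypothetical mean yields (t/ha) of the four P x K treatment combinations.
yields = {
    ("K0", "P0"): 2.0,
    ("K0", "P1"): 3.0,
    ("K1", "P0"): 2.5,
    ("K1", "P1"): 5.0,
}

# Simple effects: difference between two levels of one factor
# at a fixed level of the other factor.
simple_p_at_k0 = yields[("K0", "P1")] - yields[("K0", "P0")]
simple_p_at_k1 = yields[("K1", "P1")] - yields[("K1", "P0")]
simple_k_at_p0 = yields[("K1", "P0")] - yields[("K0", "P0")]
simple_k_at_p1 = yields[("K1", "P1")] - yields[("K0", "P1")]

# Main effects: average of the two simple effects.
main_p = (simple_p_at_k0 + simple_p_at_k1) / 2
main_k = (simple_k_at_p0 + simple_k_at_p1) / 2

# Interaction, using the definition on the slide (difference of simple effects).
interaction_p = simple_p_at_k0 - simple_p_at_k1
interaction_k = simple_k_at_p0 - simple_k_at_p1

print(main_p, main_k, interaction_p, interaction_k)
```

With these made-up yields the two interaction values coincide, which illustrates that the interaction does not depend on which factor is taken first.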
112. - Example 2: A horticulturist is interested in the impact of water loss due to transpiration
on the yield of tomato plants.
- The levels of shading would be reductions of 0, ¼, ½, and ¾ in the normal sunlight that
the plots naturally receive.
- Plant development would be divided into three stages: stage I, stage II, and stage III.
Provide the factor-level combinations (treatments) to be used in a completely randomized
experiment with a 3 × 4 factorial treatment structure.
113. The 3 × 4 factor-level combinations result in 12 treatments
Treatment      1    2    3    4    5    6    7    8    9    10   11   12
Growth stage   I    I    I    I    II   II   II   II   III  III  III  III
Shading        0    1/4  1/2  3/4  0    1/4  1/2  3/4  0    1/4  1/2  3/4
114. The 12 treatment combinations
A     a1                          a2                          a3
B     b1    b2    b3    b4        b1    b2    b3    b4        b1    b2    b3    b4
ab    a1b1  a1b2  a1b3  a1b4      a2b1  a2b2  a2b3  a2b4      a3b1  a3b2  a3b3  a3b4
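The treatment list of a factorial is simply the Cartesian product of the factor levels. A minimal Python sketch for the 3 × 4 shading example (the variable names are illustrative assumptions):

```python
from itertools import product

growth_stages = ["I", "II", "III"]      # factor A, 3 levels
shading = ["0", "1/4", "1/2", "3/4"]    # factor B, 4 levels

# Every level of A is combined with every level of B.
treatments = list(product(growth_stages, shading))

for number, (stage, shade) in enumerate(treatments, start=1):
    print(number, stage, shade)
```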
115. - The number of replicates used was 3.
- The number of experimental units was 3 × 4 × 3 = 36.
The observed value (yijk) is denoted by abr:
A     B     r1        r2        r3
a1    b1    a1b1r1    a1b1r2    a1b1r3
      b2    a1b2r1    a1b2r2    a1b2r3
      b3    a1b3r1    a1b3r2    a1b3r3
      b4    a1b4r1    a1b4r2    a1b4r3
a2    b1    a2b1r1    a2b1r2    a2b1r3
      b2    a2b2r1    a2b2r2    a2b2r3
      b3    a2b3r1    a2b3r2    a2b3r3
      b4    a2b4r1    a2b4r2    a2b4r3
a3    b1    a3b1r1    a3b1r2    a3b1r3
      b2    a3b2r1    a3b2r2    a3b2r3
      b3    a3b3r1    a3b3r2    a3b3r3
      b4    a3b4r1    a3b4r2    a3b4r3
117. - Example 3: A group of researchers examined the effect of the interaction
between water (Factor A, two levels) and age of seed (Factor B, five levels)
on germination rate. The number of replicates used was 3. The number of
germinated seeds was the measured response. Test the effects of the two
factors and of their interaction using the F-test.
118. The 2 × 5 factor-level combinations result in 10 treatments
Treatments
Factor 1 2 3 4 5 6 7 8 9 10
Water level (A) I I I I I II II II II II
Age of seeds (B) 1 2 3 4 5 1 2 3 4 5
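119. Randomization and layout for a 2 × 5 factorial experiment in a completely randomized design
- The number of replicates used was 3.
- The number of experimental units would be 2 × 5 × 3 = 30.
Each number in the layout below is a treatment number (1 to 10), and each appears three times: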
3 1 10 6 7 6
2 9 5 1 9 2
8 4 8 4 4 8
7 10 10 7 9 5
3 5 1 6 3 2
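A layout of this kind can be generated by shuffling the 30 plot labels. The sketch below is one possible way to do it, not necessarily how the layout above was produced; the function name, the 5 × 6 field shape and the seed are assumptions.

```python
import random


def crd_layout(n_treatments, n_reps, n_rows, n_cols, seed=None):
    """Randomly assign treatments to plots in a completely randomized design."""
    if n_rows * n_cols != n_treatments * n_reps:
        raise ValueError("field size must equal treatments x replicates")
    rng = random.Random(seed)
    # Each treatment number appears once per replicate.
    plots = [trt for trt in range(1, n_treatments + 1) for _ in range(n_reps)]
    rng.shuffle(plots)  # no restriction on randomization in a CRD
    return [plots[r * n_cols:(r + 1) * n_cols] for r in range(n_rows)]


layout = crd_layout(n_treatments=10, n_reps=3, n_rows=5, n_cols=6, seed=2017)
for row in layout:
    print(" ".join(f"{trt:2d}" for trt in row))
```

Fixing the seed only makes the layout reproducible; a different seed gives a different, equally valid randomization.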
120. Linear model
$$y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}$$
where i = 1, 2 (a levels of factor A); j = 1, 2, 3, 4, 5 (b levels of factor B); k = 1, 2, 3 (r replicates).
μ is the overall mean.
αi is the main effect of factor A at level i.
βj is the main effect of factor B at level j.
(αβ)ij is the interaction effect of A and B in the ij treatment combination.
ɛijk is the random error, with expected value zero.
121. The number of germinated seeds was:
Water (ml)   Age of seeds (years)   r1    r2    r3    Sum ab
I            1                      11    9     6     26
             2                      7     16    17    40
             3                      9     19    35    63
             4                      13    35    28    76
             5                      20    37    45    102
II           1                      8     3     3     14
             2                      1     7     3     11
             3                      5     9     9     23
             4                      1     10    9     20
             5                      11    15    25    51
Sum r                               86    160   180   426
122. Interaction between Water Level and Age of Seeds (two-way table of totals)
         Age of seeds (years)
Water    1     2     3     4     5     Sum a
I        26    40    63    76    102   307
II       14    11    23    20    51    119
Sum b    40    51    86    96    153   426
123. The Analysis of Variance (ANOVA) for CRD Factorials
Step 1. Calculate the correction factor (C.F.).
$$C.F. = \frac{(\sum X_{ijk})^2}{abr} = \frac{G^2}{abr} = \frac{426^2}{2 \times 5 \times 3} = 6049.2$$
Step 2. Calculate the Total SS.
$$SST = \sum X_{ijk}^2 - C.F. = (11^2 + 7^2 + \dots + 9^2 + 25^2) - 6049.2 = 3902.8$$
Step 3. Calculate SS factor A and B (SSA) (SSB)
125. Step 4. Calculate the AB SS (SSAB)
$$SSAB = \frac{\sum X_{ij.}^2}{r} - C.F. - SSA - SSB$$
$$SSAB = \frac{26^2 + 40^2 + \dots + 20^2 + 51^2}{3} - 6049.2 - SSA - SSB = 2708.133 - 1178.133 - 1321.133 = 208.867$$
Step 5. Calculate the Error SS
$$SSE = SST - SSA - SSB - SSAB$$
Step 6. Complete the ANOVA Table
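Steps 1 to 6 can be reproduced for the germination data with the Python sketch below. The formulas for SSA and SSB are the usual ones (marginal totals squared, divided by the number of observations per total, minus C.F.); the function name and data layout are assumptions made for illustration.

```python
# Germination counts: (water level, age of seeds) -> values for r1, r2, r3.
data = {
    ("I", 1): [11, 9, 6],   ("I", 2): [7, 16, 17],  ("I", 3): [9, 19, 35],
    ("I", 4): [13, 35, 28], ("I", 5): [20, 37, 45],
    ("II", 1): [8, 3, 3],   ("II", 2): [1, 7, 3],   ("II", 3): [5, 9, 9],
    ("II", 4): [1, 10, 9],  ("II", 5): [11, 15, 25],
}


def factorial_crd_anova(data):
    """Two-factor factorial ANOVA in a CRD with equal replication."""
    a_levels = sorted({a for a, _ in data})
    b_levels = sorted({b for _, b in data})
    a, b = len(a_levels), len(b_levels)
    r = len(next(iter(data.values())))
    values = [y for cell in data.values() for y in cell]

    cf = sum(values) ** 2 / (a * b * r)                      # Step 1
    sst = sum(y * y for y in values) - cf                    # Step 2
    a_tot = [sum(sum(c) for (ai, _), c in data.items() if ai == lv)
             for lv in a_levels]
    b_tot = [sum(sum(c) for (_, bj), c in data.items() if bj == lv)
             for lv in b_levels]
    ssa = sum(x * x for x in a_tot) / (b * r) - cf           # Step 3
    ssb = sum(x * x for x in b_tot) / (a * r) - cf
    cells = sum(sum(c) ** 2 for c in data.values()) / r - cf
    ssab = cells - ssa - ssb                                 # Step 4
    sse = sst - ssa - ssb - ssab                             # Step 5

    ss = {"A": ssa, "B": ssb, "AB": ssab, "Error": sse, "Total": sst}
    df = {"A": a - 1, "B": b - 1, "AB": (a - 1) * (b - 1),
          "Error": a * b * (r - 1), "Total": a * b * r - 1}
    ms = {k: ss[k] / df[k] for k in ("A", "B", "AB", "Error")}
    f = {k: ms[k] / ms["Error"] for k in ("A", "B", "AB")}   # Step 6
    return ss, df, ms, f


ss, df, ms, f = factorial_crd_anova(data)
for source in ("A", "B", "AB"):
    print(source, df[source], round(ss[source], 3), round(f[source], 2))
```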
126. ANOVA table
Source of Variation   Degree of Freedom (df)   Sum of Squares (SS)   Mean Square (MS)   Computed F   Tabular F (5%, 1%)
Treatment
  A                   a-1                      SSA                   SSA/dfa            MSA/MSE
  B                   b-1                      SSB                   SSB/dfb            MSB/MSE
  AB                  (a-1)(b-1)               SSAB                  SSAB/dfab          MSAB/MSE
Error                 ab(r-1)                  SSE                   SSE/dfe
Total                 abr-1                    SST
127. ANOVA table
Source of Variation   Degree of Freedom (df)   Sum of Squares (SS)   Mean Square (MS)   Computed F   Tabular F (5%)
Treatment
  A                   2-1 = 1                  1178.133              1178.133           19.72 *      4.35
  B                   5-1 = 4                  1321.133              330.2833           5.53 *       2.87
  AB                  (2-1)(5-1) = 4           208.867               52.21675           0.87 n.s     2.87
Error                 2*5(3-1) = 20            1194.667              59.73335
Total                 2*5*3-1 = 29             3902.8
Conclusion: The computed F values for water (19.72) and age of seeds (5.53) exceed the tabular F values,
so there is evidence that both factors change germination, and H0 is rejected for both main effects.
The computed F for the interaction (0.87) is below the tabular value, so the interaction is not significant.
128. Example of ANOVA for RCBD Factorial (two factors)
ANOVA table
Source of Variation   Degree of Freedom (df)   Sum of Squares (SS)   Mean Square (MS)   Computed F   Tabular F (5%, 1%)
Rep                   r-1                      SSR                   SSR/dfr            MSR/MSE
Treatment
  A                   a-1                      SSA                   SSA/dfa            MSA/MSE
  B                   b-1                      SSB                   SSB/dfb            MSB/MSE
  AB                  (a-1)(b-1)               SSAB                  SSAB/dfab          MSAB/MSE
Error                 (ab-1)(r-1)              SSE                   SSE/dfe
Total                 abr-1                    SST
129. Example of a Latin Square with a 3x2 Factorial Arrangement
The size of the Latin Square would be 6 (a × b = 3 × 2 treatment combinations).
Note that r = c = ab.
The ANOVA table would be as follows:
Source of Variation   Degree of Freedom (df)
Row                   ab-1 = 5
Column                ab-1 = 5
Treatment
  A                   a-1 = 2
  B                   b-1 = 1
  AB                  (a-1)(b-1) = 2
Error                 (ab-1)(ab-2) = 20
Total                 (ab)^2 - 1 = 35
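The degrees of freedom in this table can be checked with a few lines of Python. The function name is a hypothetical helper; the formulas are the ones listed in the table.

```python
def latin_square_factorial_df(a, b):
    """Degrees of freedom for an a x b factorial laid out in a Latin square."""
    t = a * b  # number of treatment combinations = rows = columns
    df = {
        "Row": t - 1,
        "Column": t - 1,
        "A": a - 1,
        "B": b - 1,
        "AB": (a - 1) * (b - 1),
        "Error": (t - 1) * (t - 2),
    }
    df["Total"] = t * t - 1
    # The component d.f. must add up to the total d.f.
    assert sum(v for k, v in df.items() if k != "Total") == df["Total"]
    return df


df_ls = latin_square_factorial_df(3, 2)
print(df_ls)
```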
130. Example of ANOVA for RCBD Factorial (three factors)
Example of a RCBD with a 4x3x2 Arrangement
Given there are 5 replicates, the ANOVA would look as follows:
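The degrees of freedom for such a layout follow from the general rules (each main effect has levels minus one d.f., each interaction the product of the d.f. of its factors, and the RCBD error has (treatments - 1)(replicates - 1) d.f.). The sketch below applies those rules; it is not a reproduction of the slide's table, and the factor names A, B, C with 4, 3 and 2 levels are an assumption about which factor has which number of levels.

```python
from itertools import combinations
from math import prod


def rcbd_factorial_df(levels, reps):
    """Degrees of freedom for a factorial experiment in an RCBD.

    levels: dict mapping factor name to its number of levels.
    reps:   number of replicates (blocks).
    """
    n_treatments = prod(levels.values())
    df = {"Rep": reps - 1}
    names = list(levels)
    # Main effects first, then two-way, three-way, ... interactions.
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            df["".join(combo)] = prod(levels[name] - 1 for name in combo)
    df["Error"] = (n_treatments - 1) * (reps - 1)
    df["Total"] = n_treatments * reps - 1
    return df


df_rcbd = rcbd_factorial_df({"A": 4, "B": 3, "C": 2}, reps=5)
for source, value in df_rcbd.items():
    print(source, value)
```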
131. Split-plot designs
Split-plot designs can take many forms. The most common is a randomized complete block
design for a two-factor experiment in which the randomization of the treatment
combinations within each block is restricted.
132. In a split-plot design, one of the factors is assigned to the main plot. The assigned
factor is called the main-plot factor.
The main plot is divided into subplots to which the second factor, called the
subplot factor, is assigned.
Thus, each main plot becomes a block for the subplot treatments (i.e., the levels
of the subplot factor).
133. With a split-plot design, the precision for the measurement of the effects of
the main-plot factor is sacrificed to improve that of the subplot factor.
134. Uses of split plot designs
1. To measure one factor more precisely than the others.
2. The nature of the factors necessitates that one be applied to
larger plots than the other.
3. One factor may require a large quantity of material, whilst another
factor can be applied to smaller quantities.
4. In experiments where successive observations are made on the
same plot over a period of time. For example, if factor A is
planting date or harvest date, then all the plots within each main
plot are planted or harvested at the same time.
135. Disadvantages of split plot designs:
1. The main-plot treatments are measured with less precision than the subplot
treatments.
2. The analysis is more complex than that for a factorial experiment in a randomized
complete block design.
136. Randomization
- Levels of the whole-plot factor are randomly assigned to the main
plots, using a different randomization for each block (for an RCBD)
- Levels of the subplots are randomly assigned within each main plot
using a separate randomization for each main plot.
137. Example: a two-factor experiment involving six levels of nitrogen (main-plot
treatments, a) and four cabbage varieties (subplot treatments, b) in three
replications.
Step 1. Divide the experimental area into r = 3 blocks, each of which is further divided into
a = 6 main plots.
Step 2. Randomly assign the 6 nitrogen treatments to the 6 main plots in each of the 3
blocks.
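The randomization rules for a split-plot design in an RCBD (a separate randomization of the main-plot factor in each block, then a separate randomization of the subplot factor within each main plot) can be sketched in Python. The labels N1 to N6 and V1 to V4, the function name and the seed are hypothetical choices for illustration.

```python
import random


def split_plot_layout(main_levels, sub_levels, n_blocks, seed=None):
    """Randomize a split-plot design arranged in randomized complete blocks."""
    rng = random.Random(seed)
    layout = []
    for _ in range(n_blocks):
        # New randomization of the main-plot factor for every block.
        mains = list(main_levels)
        rng.shuffle(mains)
        block = []
        for main in mains:
            # New randomization of the subplot factor for every main plot.
            subs = list(sub_levels)
            rng.shuffle(subs)
            block.append((main, subs))
        layout.append(block)
    return layout


nitrogen = [f"N{i}" for i in range(1, 7)]   # a = 6 main-plot treatments
varieties = [f"V{i}" for i in range(1, 5)]  # b = 4 subplot treatments
layout = split_plot_layout(nitrogen, varieties, n_blocks=3, seed=1)

for number, block in enumerate(layout, start=1):
    print(f"Block {number}")
    for main, subs in block:
        print(" ", main, ":", " ".join(subs))
```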
141. Note:
✓ A split-plot arrangement of treatments could be used in a CRD or
Latin Square, as well as in an RCBD
✓ Could extend the same principles to include another factor in a
split-split plot (3-way factorial)
✓ Could add another factor without an additional split (3-way
factorial, split-plot arrangement of treatments)
142. Homework
- Draw a layout of randomization of the following example:
✓ An onion breeder wanted to determine the effect of planting
date on the yield of four varieties of onion.
✓ Two factors:
✓Planting date (Oct 15, Nov 1, Nov 15)
✓Variety (V1, V2, V3, V4)
✓ Because of the machinery involved, planting dates were assigned
to the main plots
✓ Used a Randomized Block Design with 3 blocks