1. INTRODUCTION
1.1. Definitions of BIOMETRY
• "Bio" means life and "metry" means measurement; therefore, literally, biometry is the measurement of life.
• But, scientifically, biometry is also defined as the application of statistical methods to the solution of biological problems.
• Biometry can also be called biometrics, biostatistics, or biomathematics.
• Biometry is always associated with research or experiments.
2. Research
• Research is a systematic plan/approach to a subject matter or problem, undertaken to investigate new facts/knowledge or to verify earlier/past research results/hypotheses.
Therefore, biometry is:
• the main tool in the research system to solve a particular problem, or
• to investigate new facts/knowledge or to verify past research work.
• Research methodology is the procedure that we follow in the research.
• Biometry is the main part of the research methodology in an experiment.
3. 1.2. Definition of Basic Terminologies in Biometry
• Treatment: something that is to be applied to the experimental materials.
• Experimental Material/Unit: an animal, plant, part of either, or other thing to which the treatment is going to be applied in order to solve the problem.
• Selection of experimental material should be random and independent, not purposive/systematic, in an experiment.
• Experimental Design: the area of statistics concerned with designing an investigation mechanism that best meets the study objectives as well as the assumptions for statistical inference/analysis.
4. 1.2. Definition of Basic Terminologies in Biometry…
• Statistics is the subject that deals with the collection, organization,
presentation, analysis and interpretation of numerical data.
Variables: measurable attributes, as these typically vary over time or between individuals.
Variables can be discrete (taking values from a finite or countable set), continuous (having a continuous distribution function), or neither.
E.g., temperature is a continuous variable and the number of legs of animals is a discrete variable.
• Variables can also be classified as independent and dependent variables, the latter being expected to vary in value in response to changes in the former.
5. 1.2. Definition of Basic Terminologies in Biometry…
• In other words, an independent variable is presumed to potentially
affect a dependent one.
• Population: the totality of individual observations, which can exist anywhere in the world or at least within a specific area or limiting space and time.
• A (statistical) population is the complete set of possible measurements for which inferences/assumptions are to be made. The population represents the target of an investigation, and the objective of the investigation is to draw conclusions about the population; hence we sometimes call it the target population.
6. • Sample: A sample from a population is the set of measurements that are
actually collected in the course of an investigation.
• It should be selected using some pre-defined sampling technique, in such a way that it represents the population very well.
• Parameter: Characteristic or measure obtained from a population.
• Statistic: Characteristic or measure obtained from a sample.
• Sampling: The process of selecting and taking a sample from the population.
Hypothesis: is a working assumption, a proposed explanation for a state of
affairs.
This is also one way of making inference about population parameter,
where the investigator has prior notion about the value of the parameter.
7. • Null Hypothesis
- It is the hypothesis to be tested
- It is the hypothesis of equality, or the hypothesis of no difference.
- Usually denoted by Ho
• Alternative Hypothesis
-It is the hypothesis available when the null
hypothesis has to be rejected.
- It is the hypothesis of difference.
- Usually denoted by H1 or Ha.
Types of errors during hypothesis Testing
- Testing hypothesis is based on sample data which may involve
sampling and non-sampling errors.
- The following table gives a summary of possible results of any
hypothesis test:
8. Possible results of a hypothesis test:

                      When Ho is true     When Ho is false
  Reject Ho           Type I error        Correct decision
  Do not reject Ho    Correct decision    Type II error
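The decision table above can be illustrated with a short simulation (a sketch, not part of the original notes; the sample sizes and α = 0.05 are assumptions). Both samples are drawn from the same population, so Ho is true and every rejection is a Type I error; the observed rejection rate should be close to α.

```python
import math
import random

random.seed(1)  # fixed seed so the simulation is reproducible

def z_test_rejects(x, y, sigma=1.0):
    """Two-sample z-test with known sigma; True if Ho (equal means) is rejected at alpha = 0.05."""
    n, m = len(x), len(y)
    se = sigma * math.sqrt(1 / n + 1 / m)
    z = (sum(x) / n - sum(y) / m) / se
    return abs(z) > 1.96  # two-sided critical value for alpha = 0.05

# Both samples come from the SAME normal population, so Ho is true:
trials = 2000
false_rejections = sum(
    z_test_rejects([random.gauss(10, 1) for _ in range(20)],
                   [random.gauss(10, 1) for _ in range(20)])
    for _ in range(trials)
)
print("Type I error rate:", false_rejections / trials)  # should be near 0.05
```

A Type II error could be simulated the same way by drawing the two samples from populations with genuinely different means and counting the failures to reject.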
Test of significance
• It is a device to find out whether or not an observed pair of means (or percentages, proportions, variances, etc.) differ significantly from each other.
• It is a common feature that even repeated samples drawn from the same population would give rise to mean values which differ from each other, merely due to chance factors.
9. • So, it is necessary to formulate some test criterion on which to base our judgment: whether an observed difference between two means is a chance difference, thereby implying that they arise from the same population, or whether the difference is significant, thereby asserting that the means belong to two different populations.
• All tests of significance have the same underlying logic, expressed in terms of probability: how frequently a difference of the observed magnitude or more can arise purely due to chance factors, i.e., the risk taken in concluding that there is a real difference.
10. • The level of risk is called the level of significance and its
complement is the level of confidence.
• The level of significance ( LS) specifies the level of risk involved in
any judgment.
• But the main question is how small the magnitude of this risk should be before one concludes that the mean difference is produced by non-random effects or unknown factors.
• Should the LS be 20%, 10%, 5%, or 1%? It depends entirely on the amount of risk one is prepared to take.
• The LS is guided by the nature and the needs of the problem.
• But, in practice, the rule of thumb is to stick to the 5% and 1% levels.
11. 1.3. Elements of Experimentation
• In agricultural research, the key question to be answered is generally expressed as a statement of hypothesis.
• This hypothesis has to be verified or disproved through experimentation.
• A hypothesis is usually suggested by past research experiences or observations, or by theoretical considerations using the literature.
Example 1: Suppose there is a livestock breeder in a given area. There is a shortage of milk for the human population of that area. This is the major problem. Then, the animal breeder hypothesizes/assumes that cross-breeding local animals with exotic breeds gives animals that are more productive in terms of milk than pure local breeds.
This is called a null hypothesis. The aim of the null hypothesis is to increase milk yield to solve the problem faced, and it should therefore be verified or disproved by research.
However, the crossbred animals, due to susceptibility to disease/environmental stress, may produce less milk. In this case, the null hypothesis is disproved. This is called the alternative hypothesis.
12. On the other hand, the crossbred animals, due to their high genetic potential, could produce more milk than the local animals.
• In this case, the null hypothesis is verified against the alternative hypothesis.
• Example 2: There is a problem of feed for animals, and always low productivity of animals in terms of milk, meat and draft power. Therefore, the null hypothesis says that the introduction of improved forage crops into the farming system could solve the feed shortage problem for livestock, which can increase milk, meat and draft power.
Then, a forage breeder introduces a large number of forage species, evaluates them in terms of dry matter yield (DMY), and finds that the forage species are higher yielding than the native pasture.
The null hypothesis is verified, not disproved, in this case.
13. Once a hypothesis is framed, the next step is to design a procedure for its verification. This is the Experimental Procedure or Research Methodology, which usually consists of four phases:
• Selecting the appropriate materials to test the hypothesis
• Specifying the characters to measure
• Selecting the procedure/design to measure those characters
• Specifying the procedure/method of analyzing the characters to determine whether the measurements made support the hypothesis or not.
• In general, the first two phases are fairly easy for a subject matter specialist to specify.
• On the other hand, the procedures/design regarding how measurements are to be made, and how to prove or disprove a hypothesis, depend heavily on techniques developed by statisticians.
14. • The procedures, and how measurements can prove/disprove the hypothesis, generally require the idea of experimentation. This is what we call the design of experiments.
• The design of experiments has 3 essential components:
• Estimate of error
• Control of error
• Proper interpretation of the results obtained (verified or disproved)
1. Estimate of Error
15. Exotic breed (A)
Local breed /indigenous (B)
We need to compare the two cattle breeds in terms of their milk yield. Breeds A and B have received the same management and are housed side by side. Milk yield is measured, and the higher yield is judged better. The difference in milk yield of the two breeds could be attributed to breed differences. But this is not necessarily true: even if the same breed had been housed in both houses, the milk yields could differ. Other factors, such as climatic factors (temperature) and damage by disease and insects, also affect milk yield.
16. • Therefore, a satisfactory evaluation of the two cattle
breeds must involve a procedure that can separate breed
difference from other sources of variation. That is
therefore, the animal breeder must be able to design an
experiment that allows them to decide whether the milk
yield difference observed is caused by breed difference
or by other factors. In this case, we are able to estimate
the exact experimental error in livestock research.
• The difference among experimental plots/materials treated alike (similarly) is called Experimental Error. This error is the primary basis for deciding whether an observed difference is real or just due to chance. Clearly, every experiment must be designed to have a measure of the experimental error. It is unavoidable, but it should be reduced to a minimum in the experiment.
17. METHODS TO REDUCE EXPERIMENTAL ERROR
• Increase the size of experiment either through provision of
more replicates or by inclusion of additional treatments.
• Refine or improve the experimental techniques/procedures.
• Have uniformity in the application of treatments, such as equal spreading of fertilizer, recording data on the same day, similar housing, similar feeding, etc.
• Control should be exercised over external influences so that all treatments produce their effects under comparable conditions, e.g., protecting against disease, etc.
18. 1.1. Replication
• It is the repetition of treatments in an experiment. At least two
plots/experimental materials of the same breed/variety are needed to
determine the difference among plots/experimental materials treated
alike.
• Experimental error can be measured if there are at least two
plots treated the same or receiving the same treatment.
Thus, to obtain a measure of experimental error, replication
is needed.
• The advantage of replication in an experiment is that the precision of error estimation increases and the error variance is reduced and easily estimated.
19. Functions of Replication
• Provides an estimate of experimental error
– Because it provides several observations on experimental units receiving the same treatment. For an experiment in which each treatment appears only once, no estimate of experimental error is possible.
• Improves the precision or accuracy of an experiment
– As the number of replicates increases, the estimates of population means, as observed treatment means, become more precise.
• Increases the scope of inference/conclusion of the experiments
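The precision gain from replication can be made concrete: the standard error of a treatment mean is √(MSE/r), so each doubling of the number of replicates shrinks it by a factor of √2. A minimal sketch, using an assumed error mean square of 0.18:

```python
import math

MSE = 0.18  # assumed error mean square, for illustration only

for r in (2, 4, 8, 16):
    se = math.sqrt(MSE / r)  # standard error of a treatment mean with r replicates
    print(f"r = {r:2d}  SE of treatment mean = {se:.4f}")
```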
20. 1.2. Randomization
• Randomization ensures that no treatment is consistently favored or discriminated against by being placed under the best or unfavorable conditions, thereby avoiding bias.
– It means that each variety/breed of animal will have an equal chance of being assigned to any experimental plot.
– It also ensures independence among observations, which is a necessary condition for the validity of the assumptions underlying significance tests and confidence intervals.
• Randomization can be done by using random numbers, a lottery system, or a coin system. Thus, the experimental error of the difference will be reduced if treatments are assigned randomly and independently.
21. Purposes of Randomization
• To eliminate bias:
– It ensures that no treatment is favored or discriminated against through systematic assignment to units in a design.
• To ensure independence among the observations.
– This is necessary to provide valid significance tests.
22. Plot:      P1  P2  P3  P4  P5  P6  P7  P8
    Treatment: A   B   A   B   A   B   A   A

The above table indicates that the treatments are not randomly assigned.
23. 2. Control of Error
• The ability of the experiment to detect existing differences among treatments/experimental materials increases as the size of the experimental error decreases.
– A good experiment should incorporate all possible means of
minimizing the experimental error.
• Three commonly used techniques for controlling
experimental error in agricultural research are as
follows:
– Blocking
– Proper plot technique
– Proper data analysis
24. 1. Blocking
• Putting experimental units that are as similar as possible together in the same group is generally referred to as blocking; each such group is a block.
• By assigning all treatments/experimental plots to each block separately and independently, variation among blocks can be measured and removed from the experimental error. Reduction in experimental error is usually achieved with the use of proper blocking techniques in different experimental designs.
2. Proper plot technique
• For all experiments, it is absolutely essential that, except for the treatments, all other factors be maintained uniformly for all experimental units. For example, for a forage variety trial where the treatments consist solely of the test varieties, it is required that all other factors, such as soil nutrients, solar energy, temperature, plant population, pest incidence and the other infinite environmental factors, be maintained as uniformly as possible for all plots in the experiment. This is primarily a concern of good plot technique.
3. Proper data analysis
• In cases where blocking alone may not be able to achieve adequate control of experimental error, proper data analysis can help greatly. Covariance analysis is most commonly used for this purpose.
25. 3. Proper Interpretation of Results of an Experiment
• After estimating and controlling experimental error, the
result of experiment must be interpreted properly according
to the situation of the environment and conditions of the
experiment for practical application.
– For example, the DMY of the forage variety must be reported based on the environmental conditions where the study is conducted, including climatic data (temperature, rainfall, others), soil fertility and type, topography, and others, as much as possible.
26. Analysis of Variance (ANOVA)
• It is the partitioning of the sources of variation in an experiment into different components due to known and unknown factors.
• The variation in an experiment caused by unknown factors is calculated, and this variance is called the error variance or experimental error.
• But the variation in an experiment caused by known factors is due to the treatment effect.
28. Chapter 2. Common Designs of Agricultural Experiments
• Experiments in which only a single factor varies while all others are kept constant are called single-factor experiments.
• In single factor experiments, the treatments consist solely of the different levels of
the single variable factor. However, all other factors in the experiment are applied
uniformly to all plots at a single prescribed level. For example, most forage crop
variety trials are single factor experiments in which the single factor is the variety
and the factor levels (i.e. treatments) are the different varieties. Only the variety
planted differs from one experimental plot to another and all management factors,
such as fertilizer, insect control and water management, are applied uniformly.
• There are two groups of experimental designs that are applicable to a single-factor experiment. One group is the family of complete block designs, which is suited for experiments with a small number of treatments and is characterized by blocks, each of which contains at least one complete set of treatments.
• The other group is the family of incomplete block designs, which is suited for experiments with a large number of treatments and is characterized by blocks, each of which contains only a fraction of the treatments to be tested. There are three complete block designs, namely the completely randomized design (CRD), the randomized complete block design (RCBD), and the Latin square design (LSD).
29. Completely Randomized Design (CRD)
• The Completely Randomized Design is the least used, but can be helpful in situations where the experimental area is believed to be uniform and few treatments are examined.
• It is one where the treatments are assigned completely at random, so that each experimental unit has the same chance of receiving any one of the treatments. Any difference among experimental units receiving the same treatment is considered experimental error.
Advantage of CRD
• Is only appropriate for experiments with homogenous experimental units, such as laboratory
experiments where environmental effects are relatively easy to control.
• Can still be used where there is a missing plot in an experiment.
• Has flexibility in that the number of replications (repeats) for each treatment can vary; treatments can have equal or unequal numbers of replications.
• Easy for analysis.
Disadvantage of CRD
• Treatment means in an experiment cannot be estimated with the same precision when the numbers of replications differ among treatments.
• For field experiments, where there is generally large variation among experimental plots, it is rarely used.
30. Randomization and layout of CRD
• The treatments are allocated (assigned) in complete
randomization. All the treatments have got equal chance to be
assigned to any of the experimental units in the experiments.
Steps for randomization and layout in CRD
1. Determine the total number of experimental plots (n) as the product of the number of treatments (t) and the number of replications (r). E.g., if the number of treatments is four (A, B, C, D) and the number of replications (r) is five, the number of experimental plots will be n = t × r = 4 × 5 = 20 units.
2. Assign a plot number to each experimental plot (unit) in any
convenient manner.
3. Assign the treatments to the experimental plots by the use of any of the randomization methods: a table of random numbers, drawing cards/playing cards, or drawing lots using pieces of paper.
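The three steps can be sketched in a few lines of Python, with random.shuffle standing in for a table of random numbers or drawing lots (the treatment names and replicate count follow the example above):

```python
import random
from collections import Counter

random.seed(7)  # fixed seed so the layout is reproducible

treatments = ["A", "B", "C", "D"]
r = 5
n = len(treatments) * r            # Step 1: n = t x r = 4 x 5 = 20 plots

plots = list(range(1, n + 1))      # Step 2: number the plots 1..20
assignment = treatments * r        # each treatment appears r times...
random.shuffle(assignment)         # Step 3: ...in completely random order

layout = dict(zip(plots, assignment))
for plot in plots:
    print(plot, layout[plot])

print(Counter(assignment))         # every treatment still appears exactly r times
```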
32. Analysis of variance (ANOVA) of CRD
• There are two sources of variation among the n observations obtained from a CRD trial. One is the treatment variation; the other is the experimental error variation.
• The treatment difference is said to be real if the treatment variation is sufficiently larger than the experimental error.
33. Step 1. Outline of the ANOVA table for CRD

Source of variation   Degrees of freedom   Sum of squares   Mean squares   Computed F   Tabular F (5%)   Tabular F (1%)
Treatment             t - 1                SSt              MSt            MSt/MSE
Experimental error    t(r - 1)             ESS              MSE
Total                 tr - 1               TSS              MST
34. Step 2. Calculate the correction factor: CF = (GT)²/n
Step 3. Calculate the total sum of squares: TSS = ΣYij² − CF
Step 4. Calculate the treatment sum of squares: SSt = (ΣTi²)/r − CF
Step 5. Calculate the error sum of squares: ESS = TSS − SSt
Step 6. Calculate the mean sum of squares: MST = TSS/DFT, MSt = SSt/DFt, MSE = ESS/DFE
Step 7. Calculate the computed F value: Fcalc = MSt/MSE
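Steps 2–7 translate directly into code. The sketch below implements them for an equal-replication CRD, with the data supplied as one list of observations per treatment (the function name is our own, not from the notes):

```python
def crd_anova(groups):
    """One-way (CRD) ANOVA for equal replication.

    groups: a list of lists, one list of observations per treatment.
    Returns (TSS, SSt, ESS, MSt, MSE, F).
    """
    t = len(groups)                                    # number of treatments
    r = len(groups[0])                                 # replications per treatment
    n = t * r
    GT = sum(sum(g) for g in groups)                   # grand total

    CF  = GT ** 2 / n                                  # Step 2: correction factor
    TSS = sum(y * y for g in groups for y in g) - CF   # Step 3: total SS
    SSt = sum(sum(g) ** 2 for g in groups) / r - CF    # Step 4: treatment SS
    ESS = TSS - SSt                                    # Step 5: error SS
    MSt = SSt / (t - 1)                                # Step 6: mean squares
    MSE = ESS / (t * (r - 1))
    F   = MSt / MSE                                    # Step 7: computed F
    return TSS, SSt, ESS, MSt, MSE, F
```

The computed F is then compared against the tabulated F for (t − 1, t(r − 1)) degrees of freedom, as in Step 8.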
35. Step 8. Take the tabulated F from the F-distribution table at the 5% and 1% levels of significance, using the treatment degrees of freedom (read from left to right across the top of the table) and the error degrees of freedom (read from top to bottom):
F0.05 (DFt, DFE) and F0.01 (DFt, DFE)
After calculating the F value, we can conclude three things
1. If the computed F value is greater than the tabular F value at 1%
level of significance, the treatment difference (variation) is said
to be highly significant. Such a result is generally indicated by
placing two asterisks on the computed F value in the ANOVA
table.
2. If the computed F value is larger than the tabular F value at 5%
level of significance but smaller than or equal to the tabular F
value at 1 % level of significance, the treatment difference is said
to be significant. Such a result is indicated by placing one asterisk
on the computed F value in the ANOVA table.
3. If the computed F value is smaller than or equal to the tabular F value at the 5% level of significance, the treatment difference is said to be non-significant. Such a result is indicated by placing "NS" on the computed F value in the ANOVA table.
36. • The usual accepted level of significance for most agricultural
research/animal science field experiments is at 5 % level of probability. But,
most laboratory experiments use 1% or below this value to detect the
treatment differences and measures experimental error.
• If the result is non-significant, do not proceed further, because a non-significant F test in the ANOVA indicates the failure of the experiment to detect any difference among treatments.
• The failure to detect differences among the treatments based on a non-significant F test could be the result of either a very small or nil treatment difference, or a very large experimental error, or both.
• If the difference among treatments is very small and the experimental error large, the trial may be repeated and efforts made to reduce the experimental error so that the difference among treatments can be detected.
• On the other hand, if the treatment difference and the experimental error are both very small, the difference among the treatments is probably too small to be of any economic value, so no additional trials are required.
37. • If the F test in the ANOVA shows significant difference
among treatments either at 5 % or 1 % probability level,
continue to calculate the following:
Step 9. Calculate the standard error of the means: SE = √(2MSE/r)
Step 10. Calculate the Least Significant Difference (LSD)
• LSD0.05 = t(0.05, error d.f.) × SE
• LSD0.01 = t(0.01, error d.f.) × SE
38. • Step 11. Calculate the Coefficient of Variation: CV (%) = (√MSE / grand mean) × 100
• The CV value is generally placed below the analysis of
variance (ANOVA) table.
Example: An animal scientist performed a study on
weight gain on 8 randomly selected animals using 3
diets. The following weight gains were recorded. It is
assumed that the 8 animals are similar in all other
aspects.
Solution
• Step 1. Determine the total number of experimental
plots (n), as the product of number of treatments (3)
and the number of replication (8). The number of
experimental plots will be n (tr) or 3 x 8 = 24 units.
• Step 2. Assign a plot number to each experimental plot (unit) in any convenient manner.
40. • Step 3. Assign the treatments to the experimental
plots by the use of any of the randomization
methods.
Plot number with assigned treatment (weight gain, kg):

 1 C(14.15)   2 A(14.20)   3 B(12.85)   4 A(14.30)   5 B(13.65)   6 C(13.90)
 7 B(13.40)   8 B(14.20)   9 C(13.65)  10 A(15.00)  11 C(13.60)  12 A(14.60)
13 C(13.20)  14 B(12.75)  15 A(14.55)  16 B(13.35)  17 C(13.20)  18 C(14.05)
19 A(15.15)  20 B(12.50)  21 C(13.80)  22 B(12.80)  23 A(14.60)  24 A(14.55)
41. Step 4. Organizing the weight gain data (kg)

Repeat   Diet A    Diet B    Diet C    Total     Mean
1        14.20     12.85     14.15     41.20     13.73
2        14.30     13.65     13.90     41.85     13.95
3        15.00     13.40     13.65     42.05     14.02
4        14.60     14.20     13.60     42.40     14.13
5        14.55     12.75     13.20     40.50     13.50
6        15.15     13.35     13.20     41.70     13.90
7        14.60     12.50     14.05     41.15     13.72
8        14.55     12.80     13.80     41.15     13.72
Sum      116.95    105.50    109.55    332.00
Mean     14.619    13.188    13.694              13.833
42. Step 5. Calculate the correction factor: CF = (GT)²/n = (332)²/24 = 110224/24 = 4592.667
Step 6. Calculate the total sum of squares:
TSS = ΣYij² − CF = [(14.20)² + (14.30)² + … + (13.80)²] − 4592.667 = 12.2683
Step 7. Calculate the treatment sum of squares:
SSt = (ΣTi²)/r − CF = [(116.95)² + (105.50)² + (109.55)²]/8 − 4592.667 = 8.4277
Step 8. Calculate the error sum of squares:
ESS = TSS − SSt = 12.2683 − 8.4277 = 3.8407
43. • Step 9. Calculate the mean sum of squares (MSS):
MST = TSS/DFT = 12.2683/23 = 0.533
MSt = SSt/DFt = 8.4277/2 = 4.21385
MSE = ESS/DFE = 3.8407/21 = 0.18289
Step 10. Calculate the computed F value:
Fcalc = MSt/MSE = 4.21385/0.18289 = 23.04035
Step 11. Take the tabulated F from the F table at the 5% and 1% levels of significance, using the treatment degrees of freedom (read from left to right) and the error degrees of freedom (read from top to bottom):
F0.05 (2, 21) = 3.47 and F0.01 (2, 21) = 5.78
44. Step 12. Complete ANOVA table

Source of variation   Degrees of freedom   Sum of squares   Mean squares   Computed F    Tabular F (5%)   Tabular F (1%)
Diet                  2                    8.4277           4.21385        23.04035**    3.47             5.78
Error                 21                   3.8407           0.18289
Total                 23                   12.2683          0.533
45. Therefore, the differences among the weight gain means of the diets are highly significant. Since a significant difference is found among the diets, the following should be computed.
Step 13. Standard error of the means: SE(M) = √(2MSE/r) = √(2 × 0.18289/8) = ±0.2138
Then, each diet mean would be reported plus or minus the SE(M) value.
Example: Diet A = 14.619 ± 0.2138
Step 14. Least Significant Difference (LSD)
LSD helps to separate similar treatment means in the experiment.
LSD0.05 = t(0.05, 21) × SE(M) = 2.080 × 0.2138 = 0.4447
LSD0.01 = t(0.01, 21) × SE(M) = 2.831 × 0.2138 = 0.6053
Therefore, a difference between two treatment means must exceed 0.6053 to be significant at the 1% level and 0.4447 at the 5% level. Thus, group together the treatment means having a difference below or equal to 0.6053 or 0.4447.
A         B         C
14.619    13.188    13.694
• This indicates that all diets are statistically different at the 5% level (at the 1% level, B and C do not differ, since 13.694 − 13.188 = 0.506 < 0.6053).
46. Step 15. Coefficient of Variation (CV %): it indicates
the degree of precision with which the treatments
are compared and is a good index of the reliability
of the experiment. It expresses the experimental
error as percentage of the mean, thus the higher
the CV, the lower is the reliability of the experiment
and vice versa. CV = (√MSE / grand mean) × 100 = (0.4277/13.833) × 100 = 3.1%
Step 16. Data interpretation and conclusion of ANOVA
There was a significant (p<0.01) difference among the
diets tested on body weight gain of the animal in the study
period.
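The worked example can be checked by recomputing the same quantities directly from the Step 4 data table (a verification sketch; small rounding differences against the hand-computed figures are expected):

```python
import math

diet_A = [14.20, 14.30, 15.00, 14.60, 14.55, 15.15, 14.60, 14.55]
diet_B = [12.85, 13.65, 13.40, 14.20, 12.75, 13.35, 12.50, 12.80]
diet_C = [14.15, 13.90, 13.65, 13.60, 13.20, 13.20, 14.05, 13.80]
groups = [diet_A, diet_B, diet_C]

t, r = len(groups), len(groups[0])
n = t * r
GT = sum(sum(g) for g in groups)                  # grand total = 332.00

CF  = GT ** 2 / n                                 # correction factor
TSS = sum(y * y for g in groups for y in g) - CF  # total SS
SSt = sum(sum(g) ** 2 for g in groups) / r - CF   # treatment SS
ESS = TSS - SSt                                   # error SS
MSt = SSt / (t - 1)
MSE = ESS / (t * (r - 1))
F   = MSt / MSE

SE = math.sqrt(2 * MSE / r)                       # standard error of the means
CV = math.sqrt(MSE) / (GT / n) * 100              # coefficient of variation (%)

print(round(SSt, 4), round(ESS, 4), round(F, 2), round(SE, 4), round(CV, 1))
```

This reproduces SSt = 8.4277, F ≈ 23.04, SE ≈ 0.2138 and CV ≈ 3.1% from the slides.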
47. Unequal replication in CRD
• Since the computational procedure for the CRD is not overly
complicated when the number of replication differs among
treatments, the CRD is commonly used for studies where the
experimental material makes it difficult to use an equal number of
replications for all treatments.
• Some examples of these cases are:-
• Animal feeding experiments where the number of animal for each
feed is not the same.
• Experiments for comparing body length of different species of
insect caught in an insect trap
• Experiments that are originally set up with an equal number of replications, but in which some experimental units are likely to be lost or destroyed during experimentation for some reason.
• Example: An animal scientist wants to perform a study on weight change of randomly selected animals using five diets. Due to a lack of animals (experimental material), treatment 2 is replicated 3 times and the other four treatments are each replicated 4 times. The following weight change data were organized according to treatments.
48. Weight change data organized by treatment

Treatment     Repeat 1   Repeat 2   Repeat 3   Repeat 4   Total   Mean
M1            19         16         16         20         71      17.75
M2            26         -          27         22         75      25.00
M3            22         25         24         20         91      22.75
M4            27         25         20         28         100     25.00
M5            28         29         30         31         118     29.50
Grand total                                               455
Grand mean                                                        23.95
49. Step 1. Calculate the correction factor: CF = (GT)²/n = (455)²/19 = 207025/19 = 10896.0526
Step 2. Calculate the total sum of squares:
TSS = ΣYij² − CF = [(19)² + (16)² + … + (31)²] − 10896.0526 = 374.95
Step 3. Calculate the treatment sum of squares:
SSt = Σ(Ti²/ri) − CF = [(71)²/4 + (75)²/3 + (91)²/4 + (100)²/4 + (118)²/4] − 10896.0526 = 290.45
Step 4. Calculate the error sum of squares:
ESS = TSS − SSt = 374.95 − 290.45 = 84.50
50. Step 5. Calculate the mean sum of squares (MSS):
MST = TSS/DFT = 374.95/18 = 20.83
MSt = SSt/DFt = 290.45/4 = 72.6125
MSE = ESS/DFE = 84.50/14 = 6.0357
Step 6. Calculate the computed F value:
Fcalc = MSt/MSE = 72.6125/6.0357 = 12.0305
Step 7. Take the tabulated F from the F table at the 5% and 1% levels of significance, using the treatment degrees of freedom (read from left to right) and the error degrees of freedom (read from top to bottom):
F0.05 (4, 14) = 3.11 and F0.01 (4, 14) = 5.03
51. Step 8. Complete ANOVA table

Source of variation   Degrees of freedom   Sum of squares   Mean squares   Computed F   Tabular F (5%)   Tabular F (1%)
Diet                  4                    290.45           72.6125        12.0305**    3.11             5.03
Error                 14                   84.50            6.0357
Total                 18                   374.95           20.83
52. • Therefore, the differences among the treatments
are highly significant. Since a significant level is
found out among treatments the following should
be computed.
Step 9. Standard error of the means SE(M)
There are two standard errors, one for comparisons between equally repeated treatments and one for comparisons involving the unequally repeated treatment:
• SE for the equal repeats (r = 4) = √(2MSE/4) = √(2 × 6.0357/4) = 1.737
• SE involving the unequal repeats (r = 3) = √(2MSE/3) = √(2 × 6.0357/3) = 2.0059
Then, each treatment mean would be reported plus or minus the SE(M) value.
53. Step 10. Calculate the Least Significant Difference (LSD)
To compare the means of treatments that have equal repeats:
LSD0.05 = t(0.05, 14) × 1.737 = 2.145 × 1.737 = 3.7258
LSD0.01 = t(0.01, 14) × 1.737 = 2.977 × 1.737 = 5.171
To compare the means of equal and unequal repeats:
LSD0.05 = t(0.05, 14) × 2.0059 = 2.145 × 2.0059 = 4.3026
LSD0.01 = t(0.01, 14) × 2.0059 = 2.977 × 2.0059 = 5.9715
The two LSD values, for treatments with equal repeats and with unequal repeats, at the 1% level of significance are 5.171 and 5.9715, respectively.
Mean separation among treatments
55. Step 12. Data interpretation and conclusion
• There was a significant difference at the 1% level in body weight changes among the unequally repeated treatments in the feeding trial.
• Treatments 5 and 1 are statistically different from the others. However, treatments 4, 3 and 2 are not significantly different from each other; the differences are only numerical.
• Treatment 5 gave the highest body weight change, followed by treatments 4 and 2.
• The lowest body weight gain was obtained by treatment 1 in the study.
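As with the equal-replication example, the unequal-replication ANOVA can be verified directly from the data table, using the Σ(Ti²/ri) form of the treatment sum of squares (a verification sketch):

```python
data = {
    "M1": [19, 16, 16, 20],
    "M2": [26, 27, 22],          # only 3 repeats for this treatment
    "M3": [22, 25, 24, 20],
    "M4": [27, 25, 20, 28],
    "M5": [28, 29, 30, 31],
}

n  = sum(len(v) for v in data.values())       # 19 observations in total
t  = len(data)                                # 5 treatments
GT = sum(sum(v) for v in data.values())       # grand total = 455

CF  = GT ** 2 / n                                            # correction factor
TSS = sum(y * y for v in data.values() for y in v) - CF      # total SS
SSt = sum(sum(v) ** 2 / len(v) for v in data.values()) - CF  # sum of Ti^2 / ri
ESS = TSS - SSt                                              # error SS
MSt = SSt / (t - 1)
MSE = ESS / (n - t)                                          # error df = 19 - 5 = 14
F   = MSt / MSE

print(round(TSS, 2), round(SSt, 2), round(ESS, 2), round(F, 2))
```

This reproduces TSS = 374.95, SSt = 290.45, ESS = 84.50 and F ≈ 12.03 from the slides.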
56. Randomized Complete Block Design (RCBD)
• The RCBD is one of the most widely used experimental designs
in agricultural research.
• The design is especially suited for field experiments where the
number of treatments is not large and when the experimental
area has a known or suspected source of variation in one
direction.
• The distinguishing feature of the RCBD is the presence of blocks
of equal size, each of which contains all the treatments.
Therefore, within each block, the conditions are as homogeneous
as possible, but between blocks, large differences may exist.
• It is the simplest block design.
• In animal studies, to achieve the uniformity within blocks,
animals may be classified on the basis of age, weight, litter size,
or other characteristics that will provide a basis for grouping for
more uniformity within blocks.
57. The advantage of RCBD
• Provide more accurate results due to grouping/blocking
• When data from some individual units/treatments are missing, the missing plot technique developed by Yates enables the available results to be fully utilized.
• Complete flexibility. Can have any number of treatments and
blocks.
• Relatively easy statistical analysis even with missing data.
• Allows calculation of unbiased error for specific treatments.
The disadvantage of RCBD
• Not suitable for large numbers of treatments because blocks
become too large.
• Not suitable when complete block contains considerable
variability.
• Interactions between block and treatment effects increase error.
58. • Blocking technique in RCBD
• The primary purpose of blocking in RCB design is to
reduce experimental error. Blocking is done by
grouping the homogeneous experimental units in to
blocks such that variability within each block is
minimized and variability among blocks is
maximized and only the variation within a block
becomes part of the experimental error. Blocking in
RCBD is effective when the experimental area has a
predictable pattern of variability. After identifying
the specific source of variability to be used as the
basis for blocking, the size and shape of the blocks
must be selected to maximize variability among
blocks.
59. Guidelines for making maximum variability among
blocks (for forage experiments)
• When the variability gradient is unidirectional (if there is
one gradient) use long and narrow blocks.
• When the pattern of variability is not predictable, blocks
should be as square as possible.
• Note that all operations on the experimental plots in RCBD should ideally be completed at the same time; if that is not possible, the operation should be completed for all plots of the same block at one time, and the remaining blocks can be handled at another time.
60. Randomization and layout
• The randomization process for a RCB design is
applied separately and independently to each block,
not overall. Thus this is sometimes called a
restricted randomization. In RCBD, blocking is equal
to replication. We have to assign treatments to plots within blocks using one of the randomization methods. For example, if replications = 4 and treatments = 6 (A, B, C, D, E, F), we have four blocks and must prepare six similar experimental plots per block.

Replication I (Block I)
Plot:      1  2  3  4  5  6
Treatment: A  C  F  D  B  E
61.
Replication II (Block II)
Plot:      1  2  3  4  5  6
Treatment: C  F  D  B  E  A

Replication III (Block III)
Plot:      1  2  3  4  5  6
Treatment: F  D  B  E  A  C

Replication IV (Block IV)
Plot:      1  2  3  4  5  6
Treatment: B  E  A  C  F  D
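The restricted randomization described above can be sketched in Python (treatment labels from the example; the seed is an arbitrary choice for reproducibility):

```python
import random

random.seed(42)  # arbitrary seed so the layout is reproducible

treatments = ["A", "B", "C", "D", "E", "F"]
n_blocks = 4

# Restricted randomization: shuffle the treatments separately and
# independently within each block, so every block contains each
# treatment exactly once.
layout = []
for block in range(1, n_blocks + 1):
    order = treatments[:]          # copy so the master list is untouched
    random.shuffle(order)
    layout.append(order)
    print(f"Block {block}: plots 1-6 ->", " ".join(order))
```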
62. ANOVA of RCBD
• There are three sources of variability in RCBD: treatment, replication (block) and experimental error. Note that the RCBD has one more source of variation than the CRD because of the addition of replication, which corresponds to the variability among blocks.

Outline of the ANOVA of RCBD

Source of variation   Degrees of   Sum of    Mean      Computed F   Tabular F value
                      freedom      squares   squares   value        at 5 %   at 1 %
Replication           r-1          RSS       RMS       RMS/EMS
Treatment             t-1          SSt       MSt       MSt/EMS
Experimental error    (t-1)(r-1)   ESS       EMS
Total                 tr-1         TSS
63. • The model of the experiment in RCB design is:
• Yij = µ + ti + bj + eij, where µ is the overall mean, bj is the block (replication) effect, ti is the treatment effect and eij is the random error effect.
• After this, we can follow all the steps used for CRD.
• In addition to the steps in CRD, the replication sum of squares, the replication mean square and the F value calculated for replication should be incorporated.
64. Example: The organized data in RCBD is given below.

Treatment     Block 1   Block 2   Block 3   Treatment total (Ti)   Treatment mean
1             4.5       3.5       3.0       11.0                   3.7
2             2.0       1.8       1.9       5.7                    1.9
3             3.0       2.8       2.5       8.3                    2.8
4             8.0       6.0       5.0       19.0                   6.3
5             9.0       8.0       7.0       24.0                   8.0
6             5.0       4.5       4.0       13.5                   4.5
7             4.0       3.0       3.8       10.8                   3.6
8             1.5       1.0       1.4       3.9                    1.3
Block total   37.0      30.6      28.6      96.2 (grand total)     32.1
Block mean    4.63      3.83      3.58      4.01 (grand mean)
65. Step 1. Calculate correction factor (CF)
CF = (GT)²/n = (96.20)²/24 = 385.60
Step 2. Calculate total sum of squares (TSS)
TSS = ΣYij² - CF = [(4.5)² + (2.0)² + … + (1.4)²] - 385.60 = 114.74
Step 3. Calculate treatment sum of squares (SSt)
SSt = (ΣTi²)/r - CF = [(11.0)² + … + (3.9)²]/3 - 385.60 = 105.6
Step 4. Calculate replication sum of squares (RSS)
RSS = (ΣRj²)/t - CF = [(37.0)² + … + (28.6)²]/8 - 385.60 = 4.82
66. Step 5. Calculate error sum of squares (ESS)
ESS = TSS - SSt - RSS = 114.74 - 105.6 - 4.82 = 4.36
Step 6. Calculate mean squares (MS)
RMS = RSS/DFR = 4.82/2 = 2.41, MSt = SSt/DFt = 105.6/7 = 15.08
MSE = ESS/DFE = 4.36/14 = 0.31
Step 7. Calculate F calculated (Fcalc)
FcalcR = RMS/MSE = 2.41/0.31 = 7.77
Fcalct = MSt/MSE = 15.08/0.31 = 48.65
Step 8. Take the F tabulated values from the F-distribution table at the 5 % and 1 % levels of significance, using the error degrees of freedom (down the table) and the treatment or replication degrees of freedom (across the table).
F0.05 (14, 2) = 3.74 and F0.01 (14, 2) = 6.51 (for replication)
F0.05 (14, 7) = 2.77 and F0.01 (14, 7) = 4.28 (for treatment)
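The sums of squares in Steps 1 to 5 can be checked with a short Python sketch; the small discrepancies in the second decimal come from rounding in the worked example:

```python
# Data from the RCBD example: 8 treatments (rows) x 3 blocks (columns).
data = [
    [4.5, 3.5, 3.0],
    [2.0, 1.8, 1.9],
    [3.0, 2.8, 2.5],
    [8.0, 6.0, 5.0],
    [9.0, 8.0, 7.0],
    [5.0, 4.5, 4.0],
    [4.0, 3.0, 3.8],
    [1.5, 1.0, 1.4],
]
t, r = len(data), len(data[0])
gt = sum(sum(row) for row in data)                    # grand total
cf = gt**2 / (t * r)                                  # correction factor
tss = sum(y**2 for row in data for y in row) - cf     # total SS
sst = sum(sum(row)**2 for row in data) / r - cf       # treatment SS
blocks = [sum(data[i][j] for i in range(t)) for j in range(r)]
rss = sum(b**2 for b in blocks) / t - cf              # replication SS
ess = tss - sst - rss                                 # error SS
print(round(cf, 2), round(tss, 2), round(sst, 2), round(rss, 2), round(ess, 2))
```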
67. Step 9. Complete the ANOVA table

Source of variation   DF   SS       MS      Computed F   Tab F 5 %   Tab F 1 %
Replication           2    4.82     2.41    7.77**       3.74        6.51
Treatment             7    105.6    15.08   48.65**      2.77        4.28
Error                 14   4.36     0.31
Total                 23   114.74
68. • Therefore, the differences among the treatments are highly significant. Since a significant difference is found among treatments, the following should be computed.
Step 10. Calculate the standard error of a mean difference
SE(d) = √(2MSE/r) = √(2 x 0.31/3) = 0.45
Then, each treatment mean would be considered plus or minus the SE value.
Step 11. Calculate the Least Significant Difference (LSD)
LSD0.05 = t(0.05, error df) x SE(d) = 2.145 x 0.45 = 0.965
LSD0.01 = t(0.01, error df) x SE(d) = 2.977 x 0.45 = 1.3396
The two LSD values for treatments at the 5 % and 1 % levels of significance are 0.965 and 1.3396, respectively.
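Steps 10 and 11 can be sketched in Python; because the worked example rounds SE(d) to 0.45 before multiplying, the LSD values here differ slightly in the third decimal:

```python
import math

mse, r = 0.31, 3                 # error mean square and replications from the ANOVA
t_05, t_01 = 2.145, 2.977        # tabular t at 14 error df (5 %, 1 %)

# Standard error of the difference between two treatment means
# (equal replication): SE(d) = sqrt(2*MSE/r).
se_d = math.sqrt(2 * mse / r)
lsd_05 = t_05 * se_d
lsd_01 = t_01 * se_d
print(round(se_d, 2), round(lsd_05, 3), round(lsd_01, 3))
```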
69. Mean separation among treatments

Treatment   No. of repeats   Mean value
T5          3                8.00
T4          3                6.33
T6          3                4.50
T1          3                3.70
T7          3                3.60
T3          3                2.77
T2          3                1.90
T8          3                1.30
70. Step 12. Calculate CV = 14 %
Step 13. Data interpretation and conclusion
• There was a significant difference among treatments at the 5 % level of probability.
• Among the treatments tested in the study period, T5 gave the highest milk yield, followed by T4.
• The lowest milk yields were obtained with treatments 8 and 2.
71. Relative efficiency of blocking in RCBD
• Blocking maximizes the difference among blocks, leaving the difference among plots of the same block as small as possible. Thus, the result of every RCBD experiment should be examined to see how well this objective has been achieved. We can determine the magnitude of the reduction in experimental error due to blocking (replication) by computing the relative efficiency (R.E.) as:

R.E. = [(r-1)MSB + r(t-1)MSE] / [(rt-1)MSE]

• where r is the number of replications, t is the number of treatments, MSB is the block mean square and MSE is the error mean square in RCBD. If the error df is less than 20, the R.E. value should be multiplied by the adjustment factor K, defined as:

K = {[(r-1)(t-1) + 1][t(r-1) + 3]} / {[(r-1)(t-1) + 3][t(r-1) + 1]}
  = {[EDF of RCBD + 1][EDF of CRD + 3]} / {[EDF of RCBD + 3][EDF of CRD + 1]}

• Then, the adjusted R.E. = (K)(R.E.). If the adjusted R.E. is more than 1, blocking is efficient; otherwise it is not.
72. Example: The relative efficiency of the above example in randomized complete block design is calculated as follows:

R.E. = [(r-1)MSB + r(t-1)MSE] / [(rt-1)MSE] = [(3-1)2.41 + 3(8-1)0.31] / [(3x8-1)0.31]
     = [(2)2.41 + (21)0.31] / [(23)0.31] = (4.82 + 6.51)/7.13 = 11.33/7.13 = 1.59

Since the error df is 14, which is less than 20, we have to calculate the adjustment factor (K):

K = [EDF of RCBD + 1][EDF of CRD + 3] / ([EDF of RCBD + 3][EDF of CRD + 1])
  = [14 + 1][16 + 3] / ([14 + 3][16 + 1]) = (15)(19)/((17)(17)) = 285/289 = 0.99

• The adjusted R.E. = (K)(R.E.) = (0.99)(1.59) = 1.57
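The relative-efficiency calculation can be reproduced in Python, assuming the formulas quoted above:

```python
# Relative efficiency of blocking in the RCBD example, with the
# small-degrees-of-freedom adjustment factor K.
r, t = 3, 8            # replications (blocks) and treatments
msb, mse = 2.41, 0.31  # block and error mean squares from the ANOVA

re = ((r - 1) * msb + r * (t - 1) * mse) / ((r * t - 1) * mse)

edf_rcbd = (r - 1) * (t - 1)   # error df under RCBD
edf_crd = t * (r - 1)          # error df the same trial would have under CRD
k = ((edf_rcbd + 1) * (edf_crd + 3)) / ((edf_rcbd + 3) * (edf_crd + 1))

adjusted_re = k * re
print(round(re, 2), round(k, 2), round(adjusted_re, 2))
```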
73. • Since the R.E. value is greater than one, blocking is efficient. That is, classifying the experimental material into different blocks in the RCBD was able to control sources of variation in the experiment and to reduce the experimental error, so it is better to use RCBD for this type of trial rather than CRD.

RCBD with one missing data (value)
• For this design, the ANOVA procedure with one missing value is the same as the ANOVA procedure without a missing value; the only difference is estimating the missing value using the formula developed by Yates. The formula for one missing value in RCBD is:

X1 = [r·bj1 + t·Tj1 - GT1] / [(r-1)(t-1)] = [r·bj1 + t·Tj1 - GT1] / DFE

• where r is the number of replications, X1 is the missing value, bj1 is the total of the block (replication) containing the missing plot, t is the number of treatments, Tj1 is the total of the treatment containing the missing value and GT1 is the grand total of the available observations.
74. Example: Use the previous data as an experiment with one missing value.

The organized data

Treatment     B1     B2     B3     Treatment total (Ti)   Treatment mean
1             4.5    X1     3.0    7.5                    3.75
2             2.0    1.8    1.9    5.7                    1.90
3             3.0    2.8    2.5    8.3                    2.80
4             8.0    6.0    5.0    19.0                   6.30
5             9.0    8.0    7.0    24.0                   8.00
6             5.0    4.5    4.0    13.5                   4.50
7             4.0    3.0    3.8    10.8                   3.60
8             1.5    1.0    1.4    3.9                    1.30
Block total   37.0   27.1   28.6   92.7 (grand total)
Block mean    4.63   3.87   3.58
75. A. Find the missing value.
B. Check whether there is a significant difference between treatments or not.
C. Interpret the result.
Solution
Calculating the missing value using the Yates formula:
X1 = [r·bj1 + t·Tj1 - GT1]/DFE = [3(27.10) + 8(7.5) - 92.70]/14 = (141.30 - 92.70)/14 = 48.6/14 = 3.47
Correction (bias adjustment) term for the treatment sum of squares:
CT = [bj1 + t·Tj1 - GT1]² / [t(t-1)(r-1)²] = [27.10 + 8(7.5) - 92.70]²/[8(8-1)(3-1)²] = 0.14
Substituting the estimate (X1 = 3.47 ≈ 3.5):
Treatment 1 total = 4.5 + 3.5 + 3.0 = 11.0, treatment 1 mean = 3.7
Block 2 total = 27.10 + 3.5 = 30.6, block 2 mean = 3.83
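The Yates estimate and the bias-correction term can be reproduced in Python:

```python
# Yates' estimate of a single missing value in an RCBD, applied to the
# example above (value missing for treatment 1 in block 2).
r, t = 3, 8
b_miss = 27.10   # total of the block containing the missing plot
t_miss = 7.5     # total of the treatment containing the missing plot
g_miss = 92.70   # grand total of the available observations

x1 = (r * b_miss + t * t_miss - g_miss) / ((r - 1) * (t - 1))

# Bias-correction term subtracted from the treatment sum of squares.
ct = (b_miss + t * t_miss - g_miss) ** 2 / (t * (t - 1) * (r - 1) ** 2)
print(round(x1, 2), round(ct, 2))
```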
76. • Once you obtain the estimated missing value, substitute it and calculate the ANOVA as usual using the RCB design procedure until the ANOVA table is complete.
• After this, the treatment mean square and the F calculated should be recomputed and the adjusted ANOVA set out.
• If the adjusted F calculated shows a significant difference, go on to the further steps.
77. Latin Square Design (LSD)
• The major feature of LSD is its capacity to handle two known sources of variation simultaneously by blocking.
• The two-directional blocking in LSD is commonly referred to as row blocking and column blocking. This is accomplished by ensuring that every treatment occurs exactly once in each row block and once in each column block, which makes it possible to estimate variation among row blocks as well as column blocks and to remove it from the experimental error. Therefore, to use LSD under field conditions, a researcher should know or suspect that there are two fertility trends running perpendicular to each other across the study plots.
• It is also important in animal science, e.g. to study weight gain in piglets, which is significantly affected by both litter membership and initial weight.
• The number of replications is fixed: it equals the number of treatments, and the total number of observations (plots) is t² in LSD. Analysis of a Latin square is very similar to that of RCBD, with only one more source of variation in the model.
78. Advantages
• Allows for control of two extraneous sources of
variation.
• The experimental error is small because of double
blocking.
• The analysis is relatively quite simple.
Disadvantages
• Requires t2 experimental units to study treatments (not
flexible)
• Best suited for t in the range 5 ≤ t ≤ 10.
• The effect of each treatment on the response must be
approximately the same across rows and columns.
• Implementation problems.
• Missing data causes major analysis problems.
79. Randomization and Layout
• Randomization of treatments in LSD is started first
in rows and secondly in columns. This is the rule of
thumb. Arrange all rows, columns and treatments
independently at random using different techniques.
• Each treatment should appear once in each row and
in each column.
• Example: if we have 5 treatments and the experimental material has two sources of variation, the layout of a 5 x 5 LS design is:
80.
     C1   C2   C3   C4   C5
R1   E    D    C    B    A
R2   C    E    B    A    D
R3   A    C    E    D    B
R4   B    A    D    E    C
R5   D    B    A    C    E
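One common way to build such a layout is to start from a cyclic square and then permute whole rows and whole columns, both of which preserve the Latin property. A small Python sketch (the seed is arbitrary, and a full randomization would also permute the treatment labels):

```python
import random

random.seed(7)  # arbitrary seed for a reproducible layout

t = 5
treatments = ["A", "B", "C", "D", "E"]

# Start from a cyclic (standard) square, then shuffle rows and columns.
square = [[treatments[(i + j) % t] for j in range(t)] for i in range(t)]
random.shuffle(square)                        # permute rows
cols = list(range(t))
random.shuffle(cols)                          # permute columns
square = [[row[j] for j in cols] for row in square]

for row in square:
    print(" ".join(row))
```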
81. ANOVA of Latin Square Design
There are 4 sources of variation in LSD, 2 more than in CRD and 1 more than in RCBD.

Outline of the ANOVA of LSD

Source of variation   DF           SS    MS    Computed F   Tabular F (5 %, 1 %)
Row                   t-1          RSS   RMS   RMS/EMS
Column                t-1          CSS   CMS   CMS/EMS
Treatment             t-1          SSt   MSt   MSt/EMS
Experimental error    (t-1)(t-2)   ESS   EMS
Total                 t²-1         TSS
82. • Latin square linear model: Yijk = µ + Ri + Cj + Tk + Eijk, where
• Yijk = observation on the unit in the ith row and jth column receiving the kth treatment.
• µ = the general mean common to all experimental units.
• Ri = the effect of level i of the row blocking factor.
• Cj = the effect of level j of the column blocking factor.
• Tk = the effect of level k of the treatment factor, a fixed effect.
• Eij(k) = component of random variation associated with observation ij(k).
The ANOVA procedure and mean squares
• Step 1. Organize the data according to treatments.
• Step 2. Calculate the correction factor (CF): CF = (GT)²/t²
• Step 3. Calculate the total sum of squares (TSS): TSS = ΣYijk² - CF
• Step 4. Calculate the row sum of squares (RSS): RSS = (ΣRi²)/t - CF
83. Step 5. Calculate the column sum of squares (CSS): CSS = (ΣCj²)/t - CF
Step 6. Calculate the treatment sum of squares (SSt): SSt = (ΣTk²)/t - CF
Step 7. Calculate the error sum of squares (ESS): ESS = TSS - SSt - RSS - CSS
Step 8. Calculate the mean squares: RMS = RSS/DFR, CMS = CSS/DFC, MSt = SSt/DFt, MSE = ESS/DFE
Step 9. Calculate F calculated: FcalcR = RMS/MSE, FcalcC = CMS/MSE, Fcalct = MSt/MSE
Step 10. Take the F tabulated values from the F-distribution table at the 5 % and 1 % levels of significance, using the error degrees of freedom (down the table) together with the row, column and treatment degrees of freedom (across the table) independently:
F0.05 (DFE, DFR) and F0.01 (DFE, DFR) (for row)
F0.05 (DFE, DFC) and F0.01 (DFE, DFC) (for column)
F0.05 (DFE, DFt) and F0.01 (DFE, DFt) (for treatment)
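The sum-of-squares partition in Steps 2 to 7 can be sketched as a small Python function. With purely additive data (row + column + treatment effects and no noise) the error sum of squares should be essentially zero, which makes a convenient check:

```python
# Latin-square sums of squares. `square` holds treatment indices and
# `y` holds observations, both indexed [row][column].
def latin_square_ss(y, square, t):
    gt = sum(sum(row) for row in y)
    cf = gt**2 / (t * t)
    tss = sum(v**2 for row in y for v in row) - cf
    rss = sum(sum(row)**2 for row in y) / t - cf
    css = sum(sum(y[i][j] for i in range(t))**2 for j in range(t)) / t - cf
    trt_totals = [0.0] * t
    for i in range(t):
        for j in range(t):
            trt_totals[square[i][j]] += y[i][j]
    sst = sum(tot**2 for tot in trt_totals) / t - cf
    ess = tss - rss - css - sst
    return tss, rss, css, sst, ess

# Additive demo data: y = row effect + column effect + treatment effect.
t = 4
square = [[(i + j) % t for j in range(t)] for i in range(t)]  # cyclic design
y = [[2.0 * i + 0.5 * j + 3.0 * square[i][j] for j in range(t)] for i in range(t)]
tss, rss, css, sst, ess = latin_square_ss(y, square, t)
print(round(ess, 10))
```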
84. Step 11. Complete the ANOVA table

Source of variation   DF   SS   MS   Computed F   Tabular F (5 %, 1 %)
Row
Column
Treatment
Experimental error
Total
85. If a significant difference is found among treatments, the following should be computed.
Step 12. Calculate the standard error (SE)
• Then, each treatment mean would be considered plus or minus the SE value.
Step 13. Calculate the Least Significant Difference (LSD)
LSD helps to separate similar treatment means in the experiment.
• LSD0.05 = t(0.05, error df) x SE
• LSD0.01 = t(0.01, error df) x SE
Step 14. Calculate the CV.
Step 15. Data interpretation and conclusion of the ANOVA.
86. Example: An experiment was conducted to determine
the effect of diets A, B, C, D and a Control E on liver
cholesterol in sheep.
• Recognized sources of variation were body weight and age.
• The researcher randomly selected a 5 x 5 Latin square and randomly arranged the treatments; the results of the experiment follow.
87. The liver cholesterol data

Weight group (Row)   Age group (Column)
                     I         II        III       IV        V         Total
1                    A(3.38)   B(3.37)   D(3.04)   C(3.27)   E(2.44)   15.49
2                    D(3.70)   E(2.88)   B(3.35)   A(3.46)   C(3.34)   16.73
3                    C(3.58)   D(3.56)   A(3.69)   E(2.67)   B(3.51)   17.01
4                    E(3.32)   A(3.71)   C(3.74)   B(3.81)   D(3.41)   17.72
5                    B(3.48)   C(3.91)   E(3.27)   D(3.74)   A(3.64)   17.61
Total                17.46     17.43     17.09     16.95     16.34
Grand total 84.56
Grand mean 3.38
88. • Step 1. Organize the data according to diets

Treatment (diet)   Total    Mean
A                  17.87    3.57
B                  17.52    3.50
C                  17.41    3.48
D                  17.18    3.44
E (control)        14.58    2.92
89. Step 2. Calculate correction factor (CF)
CF = (GT)²/t² = (84.56)²/25 = 7150.39/25 = 286.02
Step 3. Calculate total sum of squares (TSS)
TSS = ΣYijk² - CF = [(3.38)² + (3.37)² + … + (3.64)²] - 286.02 = 2.9666
Step 4. Calculate row sum of squares (RSS)
RSS = (ΣRi²)/t - CF = [(15.49)² + (16.73)² + (17.01)² + (17.72)² + (17.61)²]/5 - 286.02 = 0.874
Step 5. Calculate column sum of squares (CSS)
CSS = (ΣCj²)/t - CF = [(17.46)² + (17.43)² + (17.09)² + (16.95)² + (16.34)²]/5 - 286.02 = 0.1656
Step 6. Calculate treatment sum of squares (SSt)
SSt = (ΣTk²)/t - CF = [(17.87)² + (17.52)² + (17.41)² + (17.18)² + (14.58)²]/5 - 286.02 = 1.5589
Step 7. Calculate error sum of squares (ESS)
ESS = TSS - SSt - RSS - CSS = 2.9666 - 1.5589 - 0.874 - 0.1656 = 0.3681
90. Step 8. Calculate mean squares
RMS = RSS/DFR = 0.874/4 = 0.2185, CMS = CSS/DFC = 0.1656/4 = 0.0414
MSt = SSt/DFt = 1.5589/4 = 0.3897, MSE = ESS/DFE = 0.3681/12 = 0.0307
Step 9. Calculate F calculated
FcalcR = RMS/MSE = 0.2185/0.0307 = 7.12, FcalcC = CMS/MSE = 0.0414/0.0307 = 1.35
Fcalct = MSt/MSE = 0.3897/0.0307 = 12.69
Step 10. Take the F tabulated values from the F-distribution table at the 5 % and 1 % levels of significance, using the error degrees of freedom together with the row, column and treatment degrees of freedom independently:
F0.05 (DFE, DFR) = 3.26 and F0.01 (DFE, DFR) = 5.41 (for row)
F0.05 (DFE, DFC) = 3.26 and F0.01 (DFE, DFC) = 5.41 (for column)
F0.05 (DFE, DFt) = 3.26 and F0.01 (DFE, DFt) = 5.41 (for treatment)
91. Step 11. Complete the ANOVA table

Source of variation   DF   SS       MS       Computed F   Tab F 5 %   Tab F 1 %
Weight (row)          4    0.8740   0.2185   7.12**       3.26        5.41
Age (column)          4    0.1656   0.0414   1.35 NS      3.26        5.41
Diet (treatment)      4    1.5589   0.3897   12.69**      3.26        5.41
Experimental error    12   0.3681   0.0307
Total                 24   2.9666
92. Since a significant difference is found among treatments, the following should be computed.
Step 12. Calculate the standard error: SE = √(2MSE/t) = √(2 x 0.0307/5) = 0.11
• Then, each diet mean would be considered plus or minus the SE value.
Step 13. Calculate the Least Significant Difference (LSD)
• LSD helps to separate similar treatment means in the experiment.
• LSD0.05 = t(0.05, error df) x SE = 2.179 x 0.11 = 0.2397
• LSD0.01 = t(0.01, error df) x SE = 3.055 x 0.11 = 0.3360
94. Step 14. Calculate CV = 5.18 %
Step 15. Data interpretation and conclusion of ANOVA
• There is highly significant (p<0.01) difference
between the control diet (E) and the other (A, B, C,
D) diets.
95. Split Plot Design
• It is specifically suited for a two factor experiment that has
more treatments than can be accommodated by a complete
block design.
• Split plot design is a factorial experiment with two factors
having different levels each. It is applicable for two factors
when the effect of one factor is stronger than the other
factor, but the two factors are important as a whole in the
experiment. This is the best quality of split plot design.
• In split plot design, one of the factors is assigned to the main
plot and the assigned factor is called the main plot factor. The
main plot is divided into sub-plots to which the second factor,
called the sub-plot factor, is assigned. Thus, each main plot becomes a block for the sub-plot treatments (the levels of the sub-plot factor).
96. • In a split plot design, more emphasis or interest is put on the sub-plot treatments, on the assumption that the sub-plot factor influences the entire experiment more than the main plot factor. That is why the precision for measuring the effect of the main plot factor is sacrificed to improve that of the sub-plot factor.
• Measurement of the main effect of the sub-plot factor is more precise than that obtainable with a RCB design. On the other hand, measurement of the effect of the main plot treatment is less precise than that obtainable with a RCBD.
• Because of the importance of the sub-plot factor in the entire experiment, the sub-plot treatments are assigned to the sub-plots, so that a higher degree of precision can be achieved in the experiment.
97. Advantage
• In situations where it is known that larger differences will occur for some
factors than others.
• Where greater precision is desired for some factors than others.
• Since sub-unit variance is generally less than whole unit variance, the sub-
unit treatment factor and the interaction are generally tested with greater
sensitivity.
• Allows for experiments with a factor requiring relatively large amounts of
experimental material (whole units) along with a factor requiring relatively
little experimental material (sub -unit) in the same experiment.
• If an experiment is designed to study one factor, a second factor may be
included at very little cost.
• It is the design (univariate) for experiments involving repeated measures on
the same experimental unit (whole unit), while the repeated measures in
time are the sub-unit.
Disadvantage
• Analysis is complicated by the presence of two experimental error
variances, which leads to several different SE for comparisons.
• High variance and few replications of whole plot units frequently lead to
poor sensitivity on the whole unit factor.
98. Randomization and layout
• There are two separate randomization processes in a split-plot
design, one for the main plot, and the other for the sub plot
treatments. In each replication, main plot treatments are first
randomly assigned to the main plots followed by a randomly
assignment of the sub plot treatments within each main plot.
Example: Show the layout and randomization of a split plot design with 6 levels of nitrogen (N) as the main plot factor and 4 forage varieties as the sub-plot factor in 3 replications.
Step 1. Divide the experimental area into r = 3 blocks.
Step 2. Divide each block into 6 main plots.
Step 3. Sub-divide each main plot into 4 sub-plots. Then,
Step 4. First assign the six N levels (main plot factor) in each replication randomly and independently,
Step 5. Assign the sub-plot treatments in each main plot randomly and independently.
99. • The total number of treatment combinations in the experiment is the product of the levels of the two factors and, if replicated, that product is multiplied by r: 6 x 4 x 3 = 72.
• Each main plot treatment is evaluated r times, whereas each sub-plot treatment is tested r times the number of main plot treatments; this is the primary reason for the greater precision of the sub-plot treatments relative to the main plot treatments. The assignment of a factor to the main or sub-plot depends on many considerations, such as the degree of precision required for the factor, the relative size of its main effect and the management practices of the experiment.
• If greater precision is required for a particular factor, assign it to the sub-plots; if the factor is easier to manage on large plots, assign it to the main plots.
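The two-stage randomization described in Steps 1 to 5 can be sketched in Python (hypothetical level labels; the seed is arbitrary):

```python
import random

random.seed(3)  # arbitrary seed for a reproducible layout

n_levels = ["N1", "N2", "N3", "N4", "N5", "N6"]   # main plot factor
varieties = ["V1", "V2", "V3", "V4"]              # sub-plot factor
n_blocks = 3

# Two-stage randomization: main plot treatments are randomized within
# each block, then sub-plot treatments within each main plot.
layout = []
for block in range(n_blocks):
    mains = n_levels[:]
    random.shuffle(mains)
    block_plan = []
    for main in mains:
        subs = varieties[:]
        random.shuffle(subs)
        block_plan.append((main, subs))
    layout.append(block_plan)

for b, block_plan in enumerate(layout, start=1):
    for main, subs in block_plan:
        print(f"Block {b} | main plot {main}: sub-plots {' '.join(subs)}")
```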
103. ANOVA of Split Plot Design
• The ANOVA of a split plot design is divided into a main plot part and a sub-plot part.

Outline of the ANOVA for a split plot design

Source of variation    DF
Replication            r-1
Main plot factor (A)   a-1
Error (a)              (r-1)(a-1)
Sub-plot factor (B)    b-1
A x B (interaction)    (a-1)(b-1)
Error (b)              a(r-1)(b-1)
Total                  rab-1
104. Example: The dry matter forage yield data of forage
varieties with 6 levels of seed rates (SR) and four
levels of sowing dates (SD) in a split plot design
with 3 replications is given below. Show the
necessary steps, interpret and conclude the data.
Replication I
SR4 SR3 SR1 SR6 SR5 SR2
SD2(7.1) SD1(6.1) SD2(3.9) SD3(5.6) SD4(1.4) SD1(5.4)
SD3(5.8) SD2(6.0) SD1(4.4) SD2(6.2) SD3(7.1) SD4(5.2)
SD1(6.5) SD4(4.5) SD3(3.5) SD1(8.5) SD2(7.7) SD2(6.5)
SD4(2.8) SD3(6.2) SD4(4.1) SD4(2.2) SD1(7.3) SD3(4.8)
110. Step 5. Calculate mean squares
RMS = RSS/(r-1) = 1.08/2 = 0.54
A (SR) MS = ASS/(a-1) = 22.561/5 = 4.5122
EMS(a) = ESS(a)/[(r-1)(a-1)] = 9.432/10 = 0.9432
B (SD) MS = BSS/(b-1) = 90.773/3 = 30.258
AB (SRxSD) MS = ABSS/[(a-1)(b-1)] = 77.193/15 = 5.146
EMS(b) = ESS(b)/[a(r-1)(b-1)] = 4.161/36 = 0.116
111. Step 6. Calculate F calculated
FcalcR = RMS/EMS(a) = 0.54/0.9432 = 0.573
Fcalc(A) = AMS/EMS(a) = 4.5122/0.9432 = 4.784
Fcalc(B) = BMS/EMS(b) = 30.258/0.116 = 260.8
Fcalc(AxB) = ABMS/EMS(b) = 5.146/0.116 = 44.362
Step 7. Take the F tabulated values from the F-distribution table at both the 0.05 and 0.01 levels of significance.
• F0.05 (10, 2) = 4.10 and F0.01 (10, 2) = 7.56 (for replication)
• F0.05 (10, 5) = 3.33 and F0.01 (10, 5) = 5.64 (for seed rate)
• F0.05 (36, 3) = 2.86 and F0.01 (36, 3) = 4.38 (for sowing date)
• F0.05 (36, 15) = 1.96 and F0.01 (36, 15) = 2.58 (for the seed rate x sowing date interaction)
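Steps 5 and 6 can be checked in Python, assuming the sums of squares quoted above; note that the slides round EMS(b) to 0.116, so the F values for the sub-plot factor and the interaction differ slightly here:

```python
# Mean squares and F ratios for the split-plot example.
r, a, b = 3, 6, 4   # replications, seed rates (main), sowing dates (sub)
r_ss, a_ss, e_ss_a = 1.08, 22.561, 9.432
b_ss, ab_ss, e_ss_b = 90.773, 77.193, 4.161

rms = r_ss / (r - 1)
ams = a_ss / (a - 1)
emsa = e_ss_a / ((r - 1) * (a - 1))
bms = b_ss / (b - 1)
abms = ab_ss / ((a - 1) * (b - 1))
emsb = e_ss_b / (a * (r - 1) * (b - 1))

# Two error terms: main plot effects are tested against error (a),
# sub-plot and interaction effects against error (b).
f_rep = rms / emsa
f_a = ams / emsa
f_b = bms / emsb
f_ab = abms / emsb
print(round(f_rep, 3), round(f_a, 3), round(f_b, 1), round(f_ab, 2))
```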
112. Step 8. Complete the ANOVA table

Source of variation    DF   SS       MS       Computed F   Tab F 5 %   Tab F 1 %
Replication            2    1.08     0.54     0.573 NS     4.10        7.56
Main plot factor (A)   5    22.561   4.5122   4.784*       3.33        5.64
Error (a)              10   9.432    0.9432
Sub-plot factor (B)    3    90.773   30.258   260.8**      2.86        4.38
A x B (interaction)    15   77.193   5.146    44.362**     1.96        2.58
Error (b)              36   4.161    0.116
Total                  71   205.2
113. Step 9. Calculate standard errors
• SE for SR = √(2·EMS(a)/rb) = √(2 x 0.9432/12) = ±0.396
• SE for SD = √(2·EMS(b)/ra) = √(2 x 0.116/18) = ±0.1135
• SE for SDxSR (two sub-plot means at the same main plot level) = √(2·EMS(b)/r) = √(2 x 0.116/3) = ±0.278
Step 10. Calculate LSD
• LSD for SR = SE for SR x t(0.05, 10) = 0.396 x 2.228 = 0.882
• LSD for SD = SE for SD x t(0.05, 36) = 0.1135 x 2.021 = 0.229
• LSD for SD = SE for SD x t(0.01, 36) = 0.1135 x 2.704 = 0.3069
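The standard errors and LSDs in Steps 9 and 10 can be reproduced in Python, assuming the usual split-plot SE formulas (√(2·EMS(a)/rb) for main plot means and √(2·EMS(b)/ra) for sub-plot means):

```python
import math

r, a, b = 3, 6, 4          # replications, main plot levels, sub-plot levels
emsa, emsb = 0.9432, 0.116  # the two error mean squares from the ANOVA

se_sr = math.sqrt(2 * emsa / (r * b))   # SE for seed rate (main plot) means
se_sd = math.sqrt(2 * emsb / (r * a))   # SE for sowing date (sub-plot) means

lsd_sr_05 = se_sr * 2.228    # t(0.05, 10), error (a) df
lsd_sd_05 = se_sd * 2.021    # t(0.05, 36) as used in the example
lsd_sd_01 = se_sd * 2.704    # t(0.01, 36) as used in the example
print(round(se_sr, 3), round(se_sd, 4), round(lsd_sr_05, 3), round(lsd_sd_05, 3))
```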
114. • After this, we can check whether the different levels of seed rate differ significantly by comparing the mean difference of each pair of levels with the LSD value.
• We can likewise check whether the different levels of sowing date differ significantly by comparing the mean difference of each pair of levels with the LSD value, at both levels of significance.
116. Chapter 3. Factorial Experiments
• A factorial experiment is a treatment arrangement in which the treatments consist of all combinations of the levels of two or more factors. Note that it is an arrangement of treatments, not a design; the factorial approach can be used with a variety of designs.
Factor – refers to a set of related treatments.
• Levels of a factor – refer to the different states or components making up a factor. They may be quantitative or qualitative, e.g. forage varieties.
• Biological organisms are simultaneously exposed to many growth factors during their lifetime. An organism's response to any single factor may vary with the levels of the other factors. The result of a single factor experiment is applicable only to the particular levels at which the other factors were maintained in the trial. Because of this, single factor experiments are often criticized for their narrowness.
117. • Many experiments involve the study of the effects of two or more factors. When the response to the factor of interest is expected to differ under different levels of the other factors, avoid single factor experiments and consider instead a factorial experiment, which involves two or more variable factors simultaneously. In a factorial experiment all possible combinations of the levels of the factors are investigated. If there are a levels of factor A and b levels of factor B, each replicate or complete trial contains all ab treatment combinations.
• The effect of a factor is defined as the change in response produced by a change in the level of the factor. This is frequently called the main effect because it refers to the primary factors of interest in the experiment. For example, consider an experiment with two factors A and B, say acaricide and concentrate feeding, on the milk yield of cows, each with two levels (a0 and a1 for factor A, b0 and b1 for factor B). The four treatment combinations are expressed as a0b0, a1b0, a0b1 and a1b1.
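For the 2 x 2 case, the main effects and the interaction can be computed directly from the four combination means; the milk-yield numbers below are made up purely for illustration:

```python
# Main effects and interaction for a 2x2 factorial, using the a0b0 ... a1b1
# notation above. The response values are hypothetical.
y = {"a0b0": 10.0, "a1b0": 14.0, "a0b1": 12.0, "a1b1": 20.0}

# Main effect of A: average change in response when A moves from a0 to a1.
effect_a = ((y["a1b0"] - y["a0b0"]) + (y["a1b1"] - y["a0b1"])) / 2
# Main effect of B, by symmetry.
effect_b = ((y["a0b1"] - y["a0b0"]) + (y["a1b1"] - y["a1b0"])) / 2
# Interaction: half the difference between the effect of A at b1 and at b0.
interaction = ((y["a1b1"] - y["a0b1"]) - (y["a1b0"] - y["a0b0"])) / 2

print(effect_a, effect_b, interaction)
```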
118. • The minimum number of factors in a factorial experiment is two, with two levels of each factor; this is called a 2 x 2 factorial experiment.
• The term complete factorial experiment is sometimes used when the treatments include all possible combinations of the selected levels of the variable factors.
• The term incomplete factorial experiment is used when only a fraction (portion) of all the possible combinations is tested in the experiment. There is no limit to the number of factors tested in a factorial experiment unless the researcher has problems of cost and space for experimentation.
• When we increase the number of factors and the number of levels, the possible combinations become too numerous, making it difficult to undertake the research and to obtain accurate and precise results. In this situation, we can use an incomplete factorial experiment by selecting some of the best combinations from each factor and testing them under an appropriate experimental design.
119. Advantages
• Gives an opportunity to evaluate the effects of two or more factors simultaneously.
• More precision on each factor than with single factor experimentation (due to "hidden replication").
• Possible to estimate the interaction effect.
• Good for exploratory work where we wish to find the most important factor or the optimal level of a factor (or combination of levels of more than one factor).
• Saves experimental resources (labor, land, inputs).
• Saves time (the time required for the combined experiment is less than for each single experiment).
Disadvantages
• When several treatment combinations are involved, the execution of the experiment and the statistical analysis become more complex.
• With a number of factors each at several levels, the experiment can become very large and it may be difficult to ensure homogeneity of the experimental units. This leads to high experimental error.
120. Types of Factorial Experiments
• There are different types of factorial experiments in livestock, forage and pasture crop research.
Symmetrical factorial experiment
• All factors have the same number of levels; it is designated pⁿ, where n is the number of factors and p is the number of levels of each factor. E.g. 2³ indicates three factors with two levels each (a 2 x 2 x 2 factorial experiment).
Asymmetrical factorial experiment
• The factors do not all have the same number of levels, because in practice it is often difficult to have all factors at equal numbers of levels. E.g. 3 x 4 x 2 indicates three factors: factor one with 3 levels, factor two with 4 levels and factor three with 2 levels.
Mixed factorial experiment
• A combination of the symmetrical and asymmetrical cases: some factors have equal numbers of levels and others have different numbers of levels. E.g. 2² x 3 (2 x 2 x 3) indicates three factors, of which two have two levels each and the third has three levels.
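Enumerating all treatment combinations of such a factorial is a one-liner with itertools.product; the factor labels below are hypothetical:

```python
from itertools import product

# Treatment combinations of a mixed 2 x 2 x 3 factorial experiment.
factor_a = ["a0", "a1"]
factor_b = ["b0", "b1"]
factor_c = ["c0", "c1", "c2"]

combinations = list(product(factor_a, factor_b, factor_c))
for combo in combinations:
    print("".join(combo))
print(len(combinations), "treatment combinations")
```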
121. Chapter 4. Comparison between Treatment Means
• There are many ways to compare the means of treatments
tested in an experiment. Only those comparisons helpful in
answering the experimental objectives should be thoroughly
examined. Comparisons can be classified into pair comparisons and group comparisons.
Paired comparisons
• Paired comparison is the simplest and most commonly used
comparison in agricultural research. There are two types:
• Planned pair comparison- in which the specific pair of
treatments to be compared was identified before the start of
the experiment. A common example is comparison of the
control treatment with each of the other treatments.
• Unplanned pair comparison- in which no specific pair of
treatments to be compared was chosen before the start of
the experiment. Instead, every possible pair of treatment
means is compared to identify pairs of treatments that are
significantly different.
122. • The two most commonly used test procedures for pair comparisons in
agricultural research are the least significant difference (LSD) test which is
suited for a planned pair comparison, and Duncan’s multiple range test
(DMRT) which is applicable to an unplanned pair comparison. Other test
procedures, such as the honestly significant difference (HSD) test and the
Student-Newman-Keuls multiple range test can be also used.
1. Least Significant Difference Test (LSD)
• The least significant difference test is the simplest and the most commonly
used procedure for making pair comparisons. The procedure provides
a single LSD value, at a prescribed level of significance, which serves as the
boundary between significant and non-significant differences between any
pair of treatment means. That is, two treatments are declared significantly
different at a prescribed level of significance if their difference exceeds the
computed LSD value; otherwise they are not significantly different. The LSD
test is most appropriate for making planned pair comparisons but, strictly
speaking, is not valid for comparing all possible pairs of means, especially
when the number of treatments is large.
123. • The procedure for applying the LSD test to compare any two
treatments, say the ith and the jth treatments, is as follows:
• Step 1. Compute the mean difference between the ith and the jth
treatments as:
• Difference = mean of treatment i – mean of treatment j
• Step 2. Compute the LSD value at the α level of significance as:
• LSDα = (tα)SE(d), where SE(d) is the standard error of the mean
difference and tα is the tabular t value (at the α level of significance
and with n = error degrees of freedom).
• Step 3. Compare the mean difference computed in step 1 with the
LSD value computed in step 2: if the absolute value of the
mean difference is greater than the LSD value at the α level of
significance, the ith and jth treatments are significantly
different; otherwise they are not significantly different.
• In applying the foregoing procedure, it is important that the
appropriate standard error of the mean difference for the
treatment pair being compared is correctly identified. This task is
affected by the experimental design used, the number of
replications of the two treatments being compared, and the
specific type of means to be compared.
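The three steps can be sketched in Python. This is a minimal illustration assuming a design with equal replication, so that SE(d) = √(2·MSE/r); the tabular t value is supplied by the caller from a t table at the error degrees of freedom, and all numbers in the example are hypothetical:

```python
import math

def lsd_test(mean_i, mean_j, mse, r, t_tab):
    """LSD pair comparison, assuming equal replication of both treatments.

    mean_i, mean_j : the two treatment means being compared
    mse            : error mean square from the ANOVA
    r              : number of replications per treatment
    t_tab          : tabular t value at the chosen alpha level and the
                     error degrees of freedom (looked up by the user)
    Returns (difference, lsd, significant).
    """
    se_d = math.sqrt(2.0 * mse / r)    # standard error of the mean difference
    lsd = t_tab * se_d                 # Step 2: the LSD value
    diff = mean_i - mean_j             # Step 1: the mean difference
    return diff, lsd, abs(diff) > lsd  # Step 3: compare |diff| with LSD

# Hypothetical numbers: means 25.0 and 21.0, MSE = 6.0, r = 4 replications,
# tabular t(0.05, error df) taken as 2.179 purely for illustration.
diff, lsd, significant = lsd_test(25.0, 21.0, 6.0, 4, 2.179)
print(round(diff, 2), round(lsd, 2), significant)  # → 4.0 3.77 True
```

Because |4.0| exceeds the computed LSD of about 3.77, this hypothetical pair would be declared significantly different at the chosen level.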
124. 2. Duncan’s Multiple Range Test
• For experiments that require the evaluation of all
possible pairs of treatment means, the LSD test is
usually not suitable. This is especially true when the
total number of treatments is large. In such cases,
DMRT is useful.
• The procedure for applying the DMRT is similar to the
LSD test: DMRT involves the computation of numerical
boundaries that allow the classification of the
difference between any two treatments means as
significant or non-significant. However, unlike the LSD
test, in which only a single value is required for any pair
comparison at a prescribed level of significance, the
DMRT requires computation of a series of values, each
corresponding to a specific set of pair comparisons.
125. Group Comparison
• For group comparison, more than two treatments are
involved in each comparison. There are four types of
comparison: between-group comparison, within-group
comparison, trend comparison, and factorial comparison.
• Between-group comparison
• In this comparison, treatments are classified into s (where
s ≥ 2) meaningful groups, each group consisting of one or more
treatments, and the aggregate mean of each group is
compared to that of the others.
• Within-group comparison
• This comparison is primarily designed to compare treatments
belonging to a subset of all the treatments tested. This subset
generally corresponds to a group of treatments used in the
between-group comparison. In some instances, the subset of
treatments in which the within-group comparison is to be
made may be selected independently of the between-group
comparison.
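A between-group comparison ultimately reduces to comparing aggregate group means; a minimal sketch with hypothetical treatment means:

```python
# Compare the aggregate mean of treatments {A, B} against that of {C, D, E}.
# All treatment means below are hypothetical, for illustration only.
means = {"A": 20.0, "B": 22.0, "C": 30.0, "D": 28.0, "E": 32.0}

group1 = ["A", "B"]
group2 = ["C", "D", "E"]

mean1 = sum(means[t] for t in group1) / len(group1)  # aggregate mean of group 1
mean2 = sum(means[t] for t in group2) / len(group2)  # aggregate mean of group 2
print(mean1, mean2, mean2 - mean1)  # → 21.0 30.0 9.0
```

In a full analysis this difference would be tested against its standard error, which depends on the design and replication, as with the pair comparisons above.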
126. Chapter 5. Correlation and Regression Analysis
• Because agricultural research focuses primarily on the
behaviour of biological organisms in a specified environment,
the associations usually evaluated in livestock research are:
association between response variables, association between
responses and treatments, and association between responses
and the environment.
• Both correlation and regression analysis have a number of
uses:
• To identify the relationships among variables that could affect the
response of the experimental units to a treatment in an experiment.
• To understand the association of different variables with animal
performance in an experiment.
• To predict animal performance from different variables, so that it is
possible to adjust the amount of treatment applied to the experimental units.
127. Correlation analysis
• Discovering and measuring the magnitude and direction of the
relationship between two or more variables is called correlation analysis. It is a
measure of the degree to which variables vary together, or a measure of the
intensity of the association or goodness of fit between different variables in
an experiment.
• Suppose you have two continuous variables X and Y. If a change in one
variable is accompanied by a change in the other variable, variable X is said to be
correlated with variable Y, or vice versa. Correlation
between two or more variables does not require that one variable be
designated dependent and the other independent; either
variable can play either role. The correlation procedures can
be classified according to the number of variables involved and the form of
the functional relationship between the variables involved in the experiment.
• The procedure is termed simple if only two variables are involved and
multiple otherwise. The procedure is termed linear if the form of the
underlying relationship is linear and non-linear otherwise. Thus, correlation
analysis can be classified into four types.
1. Simple linear correlation
2. Multiple linear correlation
3. Simple non-linear correlation
4. Multiple non-linear correlation
128. • Correlation analysis is usually expressed using an index called the coefficient of
correlation, symbolized by “r” in the case of a sample and “ρ” in the case of a
population. The values of the coefficient of correlation range between −1 and 1,
inclusive (−1 ≤ r ≤ 1). It tells us only the magnitude, degree, and direction of
association of the variables in an experiment. For r > 0, the two variables have a
positive correlation, and for r < 0, the two variables have a negative correlation.
• A positive correlation means that changes in the two variables move in the
same direction (as values of one variable increase, increasing values of the other
variable are observed, and as values of one variable decrease, decreasing values of
the other variable are observed).
• A negative correlation means that as values of one variable increase, decreasing
values of the other variable are observed or vice versa. The value r = 1 or r = –1
indicates an ideal or perfect linear relationship, and r = 0 means that there is no
linear association.
• The coefficient of correlation is unit free. It is not affected by a change of the origin, the scale,
or both in an experiment. The coefficient of correlation (r) is used under certain
assumptions, such as: the variables are continuous random variables and are
normally distributed, the relationship between the variables is linear, and the pairs of
observations are independent of each other.
• The strength of the relationship is measured by the coefficient of
determination (r²), which shows the amount of change in one variable accounted
for by the second variable.
• A positive correlation can be used as a selection criterion in animal breeding,
and it helps decide to what extent the variables are used in an experiment.
129. Example: From a research which is conducted in Horro
Guduru Wollega Goats, the following weight and heart girth
data are taken. Calculate linear correlation coefficient and
coefficient of determination.
Heart girth (x)    Body weight (y)    XiYi
70                 25                 1750
67                 22                 1474
73                 32                 2336
73                 32                 2336
65                 20                 1300
74                 31                 2294
73                 31                 2263
68                 27                 1836
Total  ∑Xi = 563   ∑Yi = 220          ∑XiYi = 15,589
Mean   X̄ = 70.4    Ȳ = 27.5
130. Solution
• SSX = ∑Xi² − (∑Xi)²/n = [(70)² + (67)² + … + (68)²] − (563)²/8 = 39701 − 39621.13 = 79.87
• SSY = ∑Yi² − (∑Yi)²/n = [(25)² + (22)² + … + (27)²] − (220)²/8 = 6208 − 6050 = 158
• SPXY (sum of cross products) = ∑XiYi − (∑Xi × ∑Yi)/n = [1750 + 1474 + … + 1836] − (563 × 220)/8 = 15589 − 15482.5 = 106.5
• rXY (correlation coefficient) = SPXY / √(SSX × SSY) = 106.5 / √(79.87 × 158) = 106.5 / 112.34 = 0.95
• r² (coefficient of determination) = 0.95² = 0.8988 = 89.88%.
This shows that 89.88% of the variation in body weight (y) is accounted
for by heart girth (x).
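The computation can be checked with a short Python sketch. Note that carrying full precision gives r² ≈ 0.8987; the slide's 89.88% comes from squaring the rounded value r = 0.95:

```python
import math

girth = [70, 67, 73, 73, 65, 74, 73, 68]   # heart girth (x)
weight = [25, 22, 32, 32, 20, 31, 31, 27]  # body weight (y)
n = len(girth)

# Corrected sums of squares and the sum of cross products
ss_x = sum(x * x for x in girth) - sum(girth) ** 2 / n
ss_y = sum(y * y for y in weight) - sum(weight) ** 2 / n
sp_xy = sum(x * y for x, y in zip(girth, weight)) - sum(girth) * sum(weight) / n

r = sp_xy / math.sqrt(ss_x * ss_y)                      # correlation coefficient
print(round(ss_x, 2), round(ss_y, 2), round(sp_xy, 2))  # → 79.88 158.0 106.5
print(round(r, 2), round(r * r, 4))                     # → 0.95 0.8987
```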
131. Regression analysis
• It is often of interest to determine how changes of values of
some variables influence the change of values of other
variables. For example, how alteration of air temperature
affects feed intake, or how increasing the protein level in a
feed affects daily gain. In both the first and the second
example, the relationship between variables can be described
with a function, a function of temperature to describe feed
intake, or a function of protein level to describe daily gain. A
function that explains such relationships is called a regression
function and analysis of such problems and estimation of the
regression function is called regression analysis. Regression
includes a set of procedures designed to study statistical
relationships among variables in a way in which one variable
is defined as dependent upon others defined as independent
variables. By using regression, the cause-consequence
relationship between the independent and dependent
variables can be determined. In the examples above, feed
intake and daily gain are dependent variables, and
temperature and protein level are independent variables.
132. • Regression analysis describes the effect of one or more
variables (designated as independent) on a single
variable (designated as the dependent variable) by
expressing the latter as a function of the former.
• For this analysis, it is important to clearly distinguish
between the dependent and independent variable.
• The regression analysis tells us the cause and effect or
the magnitude of relationship between variables in an
experiment. The regression procedures can be
classified according to the number of variables involved
and the form of the functional relationship between
variables involved in the experiment. The procedure is
termed simple if only two variables are involved and
multiple otherwise. The procedure is termed linear if
the form of the underlying relationship is linear and
non-linear otherwise. Thus, regression analysis can be
classified into four types.
133. 1. Simple linear regression
2. Multiple linear regression
3. Simple non-linear regression
4. Multiple non-linear regression
• The functional form of the linear relationship between a dependent
variable Y and an independent variable X is represented by the equation:
Y = α + βX, where α is the intercept of the line on the Y axis and β, the linear
regression coefficient, is the slope of the line, or the amount of change in Y
for each unit change in X. Where there is more than one independent
variable, say k independent variables (X1, X2, X3, …, Xk), the simple linear
functional form Y = α + βX can be extended to the multiple
linear functional form Y = α + β1X1 + β2X2 + … + βkXk, where α is the
intercept (the value of Y when all X’s are zero) and βi (i = 1, …, k), the
regression coefficient associated with the independent variable Xi, represents
the amount of change in Y for each unit change in Xi. The two main
applications of regression analysis are:
1. Estimation of a function of dependency between variables
2. Prediction of future measurements or means of the dependent variable
using new measurements of the independent variable(s).
134. • Example: From a research which is conducted in
Horro Guduru Wollega Goats, the following weight
and heart girth data are taken.
Heart girth (x)    Body weight (y)    XiYi
70                 25                 1750
67                 22                 1474
73                 32                 2336
73                 32                 2336
65                 20                 1300
74                 31                 2294
73                 31                 2263
68                 27                 1836
Total  ∑Xi = 563   ∑Yi = 220          ∑XiYi = 15,589
Mean   X̄ = 70.4    Ȳ = 27.5
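Using these data, the least-squares estimates are β = SPXY/SSX and α = Ȳ − β·X̄; a minimal Python sketch (the numbers below are computed from the data above rather than taken from the source, which gives no worked solution for this example):

```python
girth = [70, 67, 73, 73, 65, 74, 73, 68]   # heart girth (x), independent
weight = [25, 22, 32, 32, 20, 31, 31, 27]  # body weight (y), dependent
n = len(girth)

ss_x = sum(x * x for x in girth) - sum(girth) ** 2 / n                            # 79.875
sp_xy = sum(x * y for x, y in zip(girth, weight)) - sum(girth) * sum(weight) / n  # 106.5

b = sp_xy / ss_x                           # slope: change in weight per cm of girth
a = sum(weight) / n - b * sum(girth) / n   # intercept: Ybar - b * Xbar
print(round(b, 2), round(a, 2))            # → 1.33 -66.33
```

This gives the fitted line Ŷ ≈ −66.33 + 1.33X, i.e. each additional centimetre of heart girth predicts roughly 1.33 kg more body weight in these data.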
136. Chapter 6. Problem Data
• Analysis of variance is valid for use only if the basic
research data satisfy certain conditions. Some of those
conditions are implied, others are specified. In field
experiments, for example, it is implied that all plots are
grown successfully and all necessary data are taken and
recorded. In addition, it is specified that the data satisfy
all the mathematical assumptions underlying the
analysis of variance.
• We use the term problem data for any set of data that
does not satisfy the implied or the stated conditions for
a valid analysis of variance. The two groups of problem
data that are commonly encountered in agriculture
research are:
• Missing Data
• Data that violate some assumptions of the analysis of variance
137. Missing Data
• A missing data situation occurs whenever a valid observation is not
available for any one of the experimental units. Occurrence of missing data
results in two major difficulties: loss of information and non-applicability of
the standard analysis of variance.
Common causes of missing data
• Even though data gathering in field experiments is usually done with
extreme care, numerous factors beyond the researcher’s control can
contribute to missing data.
• Improper treatment- is declared when an experiment has one or more
experimental plots that do not receive the intended treatment. Non
application, application of an incorrect dose, and wrong timing of
application are common causes of improper treatment.
• Destruction of Experimental Animals- most field experiments aim for a
perfect stand in all experimental plots but that is not always achieved.
Diseases are common causes of the destruction of experimental animals. It
is extremely important, however, to carefully examine a stand-deficient
plot before declaring missing data. The destruction of experimental animals
must not be the result of the treatment effect. An incorrect declaration of
missing data can easily lead to an incorrect conclusion.
138. • Loss of Harvested Samples- Many forage characters cannot
be conveniently recorded, either in the field or immediately
after harvest. Harvested samples may require additional
processing before the required data can be measured.
• Illogical Data- in contrast to the cases of missing data where
the problem is recognized before data recorded, illogical data
are usually recognized after the data have been recorded and
transcribed. Data may be considered illogical if their values
are too extreme to be considered within the logical range of
the normal behavior of the experimental materials. However,
only illogical data resulting from some kind of error can be
considered as missing. Common errors resulting in illogical
data are misread observation, incorrect transcription, and
improper application of the sampling technique or the
measuring instrument.
139. Missing Data formula technique
• When an experiment has one or more observation
missing, the standard computational procedures of
the analysis of variance for the various designs (as
we have seen in chapter two of this lecture note),
except CRD, no longer apply.
• In such cases, either the missing data formula
technique or the analysis of covariance technique
should be applied.
140. • In the missing data formula technique, an estimate of a
single missing observation is provided through an
appropriate formula according to the experimental
design used. This estimate is used to replace the
missing data and the augmented data set is then
subjected, with some slight modifications, to the
standard analysis of variance. We emphasize here that
an estimate of the missing data obtained through the
missing data formula technique does not supply any
additional information to the incomplete set of data.
Once the data is lost, no amount of statistical
manipulation can retrieve it. What the procedure
attempts to do is to allow the researcher to complete
the analysis of variance in the usual manner (i.e., as the
data were complete) without resorting to the more
complex procedures needed for incomplete data sets.
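For a randomized complete block design, the standard single-missing-value estimate is X = (rB + tT − G) / ((r − 1)(t − 1)), where r and t are the numbers of blocks and treatments, B and T are the totals of the observed values in the affected block and treatment, and G is the grand total of observed values. A minimal sketch with hypothetical totals:

```python
def estimate_missing_rcbd(r, t, block_total, treatment_total, grand_total):
    """Missing-plot estimate for one missing value in an RCBD.

    r, t            : numbers of blocks (replications) and treatments
    block_total     : total of observed values in the block with the missing unit
    treatment_total : total of observed values for the affected treatment
    grand_total     : total of all observed values
    """
    return (r * block_total + t * treatment_total - grand_total) / ((r - 1) * (t - 1))

# Hypothetical example: 4 blocks, 5 treatments, B = 60, T = 45, G = 280.
x = estimate_missing_rcbd(4, 5, 60, 45, 280)
print(round(x, 2))  # → 15.42
```

As the text stresses, this estimate only completes the data set for computation; the error degrees of freedom are reduced by one for each value estimated this way.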
141. CHAPTER 7
Checking the Assumptions of ANOVA and
Transformation of Data
The Assumptions of ANOVA
• Normal distribution
• Independence
• Homogeneity of variance
Data that violate the Assumptions of ANOVA
• The usual interpretation of the analysis of variance is
valid only when certain mathematical assumptions
concerning the data are met. Failure to meet one or
more of the assumptions affects both the level of
significance and the sensitivity of the F test in the
analysis of variance. Thus, any drastic departure from
one or more of the assumptions must be corrected
before the analysis of variance is applied.
142. Common Violations in Agricultural Experiments
• The assumptions underlying the analysis of variance are reasonably
satisfied for most experimental data in agricultural research, but there are
certain types of experiment that are notorious for frequent violations of
these assumptions.
• Non additive Effects – the effects of two factors, say treatment and
replication, are said to be additive if the effect of one factor remains
constant over all levels of the other factor. In other words, if the treatment
effect remains constant for all replications and the replication effect
remains constant for all treatments, then the effects of treatment and
replication are additive. A common departure from the assumptions of
additive in agricultural experiments is one where the effects are
multiplicative instead of additive. Two factors are said to have multiplicative
effects if their effects are additive only when expressed in terms of
percentages.
• Non independence of Errors – The assumption of independence of
experimental errors requires that the error of an observation is not related
to or dependent upon that of another. This assumption is usually assured
with the use of proper randomization (i.e., treatments are assigned at
random to experimental units). However, in a systematic design, where the
experimental units are arranged systematically instead of at random, the
assumption of independence of errors is usually violated.
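Proper randomization of the kind described above takes only a couple of lines; a minimal sketch assigning hypothetical treatments at random to the units of a completely randomized design:

```python
import random

treatments = ["T1", "T2", "T3", "T4"]  # hypothetical treatment labels
r = 3                                  # replications per treatment

layout = treatments * r                # each treatment appears r times
random.shuffle(layout)                 # random assignment to the 12 units
print(layout)                          # e.g. ['T3', 'T1', 'T4', ...]
```

A systematic arrangement (e.g. always T1, T2, T3, T4 in order) would instead risk confounding treatment effects with positional trends, violating the independence assumption.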
143. Data Transformation
• Data transformation is the most appropriate remedial
measure for variance heterogeneity where the variance
and the mean are functionally related. With this
technique, the original data are converted into a new
scale, resulting in a new data set that is expected to
satisfy the condition of homogeneity of variance.
Because a common transformation scale is applied to
all observations, the comparative values between
treatments are not altered and comparisons between
them remain valid.
• The appropriate data transformation to be used
depends on the specific type of relationship between
the variance and the mean. The most commonly used
transformations for data in agriculture research are:
144. • Logarithmic Transformation- it is most appropriate for data
where the standard deviation is proportional to the mean or
where the effects are multiplicative. These conditions are
generally found in data that are whole numbers and cover a
wide range of values.
• To transform a data set into the logarithmic scale, simply take
the logarithm of each and every component of the data set.
• Square-Root Transformation–it is appropriate for data
consisting of small whole numbers and percentage data
where the range is between 0 and 30% or between 70 and
100%. If most of the values in the data set are small (e.g. less
than 10), especially with zeros present, √(X + 0.5) should be
used instead of √X, where X is the original data.
• Arc-Sine Transformation/Angular Transformation- it is appropriate for
data on proportions, data obtained from a count, and data expressed
as decimal fractions or percentages.
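All three transformations can be applied with the Python standard library alone; a minimal sketch (the log(X + 1) variant is a common guard when zeros are present, an assumption not stated on the slide):

```python
import math

counts = [0, 3, 8, 12, 45, 160]          # small whole numbers over a wide range
proportions = [0.05, 0.20, 0.50, 0.95]   # percentage data as decimal fractions

log_scale = [math.log10(x + 1) for x in counts]    # log transform; +1 guards zeros
sqrt_scale = [math.sqrt(x + 0.5) for x in counts]  # square-root transform, zeros present
arcsine_scale = [math.degrees(math.asin(math.sqrt(p)))
                 for p in proportions]             # angular (arc-sine) transform

print([round(v, 2) for v in arcsine_scale])  # → [12.92, 26.57, 45.0, 77.08]
```

The ANOVA is then run on the transformed scale; means are back-transformed only for presentation.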