An experimental design consists of a careful description of how a particular hypothesis
can be experimentally tested. This requires: (a) an explicit specification of the treatment
factors to be tested; (b) the specific range of values over which these treatment factors
will be tested; (c) the manner in which observations will be generated, recorded, and
reported; and (d) the criteria that will be used to evaluate the degree to which the
observations appear to support the hypothesis or to be inconsistent with the hypothesis.
Some basic concepts of design of experiments
Treatments are the different procedures we want to compare. These could be different
kinds or amounts of fertilizer in agronomy, different long distance rate structures in
marketing, or different temperatures in a reactor vessel in chemical engineering.
Experimental units are the things to which we apply the treatments. These could be
plots of land receiving fertilizer, groups of customers receiving different rate structures,
or batches of feedstock processing at different temperatures.
Responses are outcomes that we observe after applying a treatment to an experimental
unit. That is, the response is what we measure to judge what happened in the
experiment; we often have more than one response. Responses for the above examples
might be nitrogen content or biomass of corn plants, profit by customer group, or yield
and quality of the product per ton of raw material.
Randomization is the use of a known, understood probabilistic mechanism for the
assignment of treatments to units. Other aspects of an experiment can also be
randomized: for example, the order in which units are evaluated for their responses.
Experimental Error is the random variation present in all experimental results.
Different experimental units will give different responses to the same treatment, and it
is often true that applying the same treatment over and over again to the same unit will
result in different responses in different trials. Experimental error does not refer to
conducting the wrong experiment or dropping test tubes.
Measurement units (or response units) are the actual objects on which the response is
measured. These may differ from the experimental units. For example, consider the
effect of different fertilizers on the nitrogen content of corn plants. Different field plots
are the experimental units, but the measurement units might be a subset of the corn
plants on the field plot, or a sample of leaves, stalks, and roots from the field plot.
Control has several different uses in design. First, an experiment is controlled because
we as experimenters assign treatments to experimental units. Otherwise, we would
have an observational study.
Second, a control treatment is a "standard" treatment that is used as a baseline or basis
of comparison for the other treatments. This control treatment might be the treatment in
common use, or it might be a null treatment (no treatment at all). For example, a study
of new pain killing drugs could use a standard pain killer as a control treatment, or a
study on the efficacy of fertilizer could give some fields no fertilizer at all.
Controllable input factors, or x factors, are those input parameters that can be modified
in an experiment or process. For example, in cooking rice, these factors include the
quantity and quality of the rice and the quantity of water used for boiling.
Uncontrollable input factors are those parameters that cannot be changed. In the
rice-cooking example, this may be the temperature in the kitchen. These factors need to be
recognized to understand how they may affect the response.
An experimental framework consists of a structural
environment within which a user can systematically vary one or more key structural
features (treatment factors).
Basic Principles of Experimental Designs
The basic principles of experimental designs are randomization, replication and local
control. These principles make a valid test of significance possible.
Randomization
The first principle of an experimental design is randomization, which is a random
process of assigning treatments to the experimental units. The random process implies
that every possible allotment of treatments has the same probability. An experimental
unit is the smallest division of the experimental material, and a treatment means an
experimental condition whose effect is to be measured and compared. The purpose of
randomization is to remove bias and other sources of extraneous variation which are
not controllable. Another advantage of randomization (accompanied by replication) is
that it forms the basis of any valid statistical test. Hence, the treatments must be
assigned at random to the experimental units. Randomization is usually done by
drawing numbered cards from a well-shuffled pack of cards, by drawing numbered
balls from a well-shaken container or by using tables of random numbers.
Replication
The second principle of an experimental design is replication, which is a repetition of
the basic experiment. In other words, it is a complete run for all the treatments to be
tested in the experiment. In all experiments, some kind of variation is introduced
because of the fact that the experimental units such as individuals or plots of land in
agricultural experiments cannot be physically identical. The effect of this type of
variation can be reduced by using a number of experimental units. We therefore perform the
experiment more than once, i.e., we repeat the basic experiment. An individual
repetition is called a replicate. The number, the shape and the size of replicates depend
upon the nature of the experimental material.
A replication is used to:
I. Secure a more accurate estimate of the experimental error, a term which
represents the differences that would be observed if the same treatments were
applied several times to the same experimental units.
II. Decrease the experimental error and thereby increase precision, which is a
measure of the variability of the experimental error; and
III. Obtain a more precise estimate of the mean effect of a treatment.
Local Control
It has been observed that not all extraneous sources of variation are removed by
randomization and replication. This necessitates a refinement of the experimental
technique. In other words, we need to choose a design in such a manner that as many
extraneous sources of variation as possible are brought under control. For this purpose, we make
use of local control, a term referring to the amount of balancing, blocking and grouping
of the experimental units. Balancing means that the treatments should be assigned to
the experimental units in such a way that the result is a balanced arrangement of the
treatments. Blocking means that like experimental units should be collected together to
form a relatively homogeneous group. In a complete block design, a block is also a replicate. The main purpose of
the principle of local control is to increase the efficiency of an experimental design by
decreasing the experimental error. The point to remember here is that the term local
control should not be confused with the word control. The word control in
experimental design is used for a group of units that receives no treatment (or only the
standard treatment) and serves as the baseline when
we need to find out the effectiveness of other treatments through comparison.
Completely Randomized Design
In a completely randomized design, there is only one primary factor under
consideration in the experiment. The test subjects are assigned to treatment levels of the
primary factor at random.
Sample layout
[Figure: sample layout in which different colors represent different treatments. There are
4 treatments (A-D) with 4 replications (1-4) each.]
Description of the Design
⢠Simplest design to use.
⢠Design can be used when experimental units are essentially homogeneous. -
Because of the homogeneity requirement, it may be difficult to use this design for
field experiments.
⢠The CRD is best suited for experiments with a small number of treatments.
Randomization Procedure
I. Treatments are assigned to experimental units completely at random.
II. Every experimental unit has the same probability of receiving any treatment.
III. Randomization is performed using a random number table, a computer program,
etc.
Example of Randomization
Given that you have 4 treatments (A, B, C, and D) and 5 replicates, how many experimental
units would you have? (4 treatments × 5 replicates = 20 experimental units.)
Note that there is no "blocking" of experimental units into replicates.
Every experimental unit has the same probability of receiving any treatment.
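The randomization described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the function name `crd_assign` and the fixed seed are assumptions made here for reproducibility.

```python
import random
from collections import Counter

def crd_assign(treatments, replicates, seed=None):
    """Assign treatments to experimental units completely at random."""
    rng = random.Random(seed)  # the seed only makes this sketch reproducible
    # One label per experimental unit: every treatment appears `replicates` times.
    labels = [t for t in treatments for _ in range(replicates)]
    # A uniform shuffle makes every allotment of labels to units equally likely.
    rng.shuffle(labels)
    # Units are numbered from 1; units are not grouped into blocks or replicates.
    return list(enumerate(labels, start=1))

layout = crd_assign(["A", "B", "C", "D"], replicates=5, seed=1)
print(len(layout))                    # number of experimental units
print(Counter(t for _, t in layout))  # how often each treatment is used
for unit, treatment in layout:
    print(unit, treatment)
```

Because the labels are shuffled as a whole, each unit has the same chance of receiving any treatment while each treatment still appears exactly five times.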
Advantages of a CRD
I. Very flexible design (i.e. number of treatments and replicates is only limited by
the available number of experimental units).
II. Statistical analysis is simple compared to other designs.
III. Loss of information due to missing data is small compared to other designs due
to the larger number of degrees of freedom for the error source of variation.
Disadvantages
I. If experimental units are not homogeneous and you fail to minimize this
variation using blocking, there may be a loss of precision.
II. Usually the least efficient design unless experimental units are homogeneous.
III. Not suited for a large number of treatments.
Fixed vs. Random Effects
I. The choice of labeling a factor as a fixed or random effect will affect how you
construct the F-test.
II. This will become more important later in the course when we discuss
interactions.
Fixed Effect
I. All treatments of interest are included in your experiment.
II. You cannot make inferences to a larger experiment.
Example 1: An experiment is conducted at Fargo and Grand Forks, ND. If location is
considered a fixed effect, you cannot make inferences toward a larger area (e.g. the
central Red River Valley).
Example 2: An experiment is conducted using four rates (e.g. ½ X, X, 1.5 X, 2 X) of a
herbicide to determine its efficacy to control weeds. If rate is considered a fixed effect,
you cannot make inferences about what may have occurred at any rates not used in the
experiment (e.g. ¼ X, 1.25 X, etc.).
Random Effect
I. Treatments are a sample of the population to which you can make inferences.
II. You can make inferences toward a larger population using the information from
the analyses.
Example 1: An experiment is conducted at Fargo and Grand Forks, ND. If location is
considered a random effect, you can make inferences toward a larger area (e.g. you
could use the results to state what might be expected to occur in the central Red River
Valley).
Example 2: An experiment is conducted using four rates (e.g. ½ X, X, 1.5 X, 2 X) of a
herbicide to determine its efficacy to control weeds. If rate is considered a random
effect, you can make inferences about what may have occurred at rates not used in the
experiment (e.g. ¼ X, 1.25 X, etc.).
A completely randomized design (CRD) is the simplest design for comparative
experiments, as it uses only two basic principles of experimental designs:
randomization and replication. Its power is best understood in the context of
agricultural experiments (for which it was initially developed), and it will be discussed
from that perspective, but true experimental designs, where feasible, are useful in the
social sciences and in medical experiments.
In a completely randomized design, treatment levels or combinations are assigned to
experimental units at random. This is typically done by listing the treatment levels or
treatment combinations and assigning a random number to each. By sorting on the
random number, one can produce a random order for application of the treatments to
experimental units.
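The sort-on-a-random-number procedure just described might look like the following sketch. The treatment levels and the name `random_run_order` are hypothetical and chosen only for illustration.

```python
import random

def random_run_order(items, seed=None):
    """Order treatment levels (or combinations) by sorting on random numbers."""
    rng = random.Random(seed)
    # Attach a random number to each listed treatment or treatment combination.
    keyed = [(rng.random(), item) for item in items]
    # Sorting on the random number yields a random order of application.
    keyed.sort(key=lambda pair: pair[0])
    return [item for _, item in keyed]

levels = ["low", "medium", "high"]  # hypothetical treatment levels
runs = [(level, rep) for level in levels for rep in (1, 2)]
order = random_run_order(runs, seed=7)
for position, (level, rep) in enumerate(order, start=1):
    print(position, level, rep)
```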
Mathematical Problem
Suppose the following table represents the sales figures of the 3 new menu items in
18 restaurants (6 restaurants per item) after a week of test marketing. At the .05 level of
significance, test whether the mean sales volumes for the 3 new menu items are all equal.
Item 1 Item 2 Item 3
22 52 15
42 33 24
44 8 19
52 47 18
45 43 34
37 32 39
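As a sketch of the calculation behind this test, the one-way analysis of variance for a CRD can be computed directly from the table. The code below uses only the data shown above; the function name `one_way_anova` is an assumption of this sketch, and it reports the F statistic rather than a p-value because the Python standard library has no F distribution.

```python
sales = {
    "Item 1": [22, 42, 44, 52, 45, 37],
    "Item 2": [52, 33, 8, 47, 43, 32],
    "Item 3": [15, 24, 19, 18, 34, 39],
}

def one_way_anova(groups):
    """Return sums of squares, degrees of freedom and F for a CRD."""
    values = [y for g in groups.values() for y in g]
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    # Between-treatment variation: group means around the grand mean.
    ss_treat = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                   for g in groups.values())
    # Total variation: every observation around the grand mean.
    ss_total = sum((y - grand_mean) ** 2 for y in values)
    ss_error = ss_total - ss_treat
    df_treat, df_error = k - 1, n - k
    f_stat = (ss_treat / df_treat) / (ss_error / df_error)
    return ss_treat, ss_error, df_treat, df_error, f_stat

ss_treat, ss_error, df_treat, df_error, f_stat = one_way_anova(sales)
print(ss_treat, ss_error)   # about 763 and 2219
print(df_treat, df_error)   # 2 and 15
print(round(f_stat, 2))     # about 2.58
```

An F of about 2.58 on (2, 15) degrees of freedom lies below the tabulated 5% critical value of roughly 3.68, so the hypothesis of equal mean sales is not rejected at the .05 level.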
Interpretation
Since the p-value of 0.11 is greater than the .05 significance level, we do not reject the
null hypothesis that the mean sales volumes of the new menu items are all equal.
Randomized Block Designs
The Randomized Block Design is research design's equivalent to stratified random
sampling. Like stratified sampling, randomized block designs are constructed to reduce
noise or variance in the data (see classifying the Experimental Designs). How do they
do it? They require that the researcher divide the sample into relatively homogeneous
subgroups or blocks (analogous to "strata" in stratified sampling). Then, the
experimental design you want to implement is implemented within each block or
homogeneous subgroup. The key idea is that the variability within each block is less
than the variability of the entire sample. Thus each estimate of the treatment effect
within a block is more efficient than estimates across the entire sample. And, when we
pool these more efficient estimates across blocks, we should get an overall more
efficient estimate than we would without blocking.
Here, we can see a simple example. Let's assume that we originally intended to conduct
a simple posttest-only randomized experimental design. But, we recognize that our
sample has several intact or homogeneous subgroups. For instance, in a study of college
students, we might expect that students are relatively homogeneous with respect to
class or year. So, we decide to block the sample into four groups: freshman, sophomore,
junior, and senior. If our hunch is correct, that the variability within class is less than the
variability for the entire sample, we will probably get more powerful estimates of the
treatment effect within each block (see the discussion on Statistical Power). Within each
of our four blocks, we would implement the simple post-only randomized experiment.
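Running the same randomized experiment inside each block can be sketched as follows. The student identifiers, the two arm names and the function `randomize_within_blocks` are hypothetical; the only idea taken from the text is that randomization is carried out separately within each class year.

```python
import random

def randomize_within_blocks(blocks, arms=("program", "comparison"), seed=None):
    """Randomly assign members of each block to the arms, block by block."""
    rng = random.Random(seed)
    assignment = {}
    for block, members in blocks.items():
        shuffled = list(members)
        rng.shuffle(shuffled)  # random order within this block only
        # Deal the shuffled members to the arms in turn so each block is balanced.
        for i, person in enumerate(shuffled):
            assignment[person] = (block, arms[i % len(arms)])
    return assignment

# Hypothetical sample: four students per class year.
blocks = {
    "freshman": ["f1", "f2", "f3", "f4"],
    "sophomore": ["s1", "s2", "s3", "s4"],
    "junior": ["j1", "j2", "j3", "j4"],
    "senior": ["r1", "r2", "r3", "r4"],
}
assignment = randomize_within_blocks(blocks, seed=11)
for person, (block, arm) in sorted(assignment.items()):
    print(person, block, arm)
```

Every block ends up with the same number of program and comparison cases, so class year cannot be confounded with the treatment.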
Notice a couple of things about this strategy. First, to an external observer, it may not be
apparent that you are blocking. You would be implementing the same design in each
block. And, there is no reason that the people in different blocks need to be segregated
or separated from each other. In other words, blocking doesn't necessarily affect
anything that you do with the research participants. Instead, blocking is a strategy for
grouping people in your data analysis in order to reduce noise -- it is an analysis
strategy. Second, you will only benefit from a blocking design if you are correct in your
hunch that the blocks are more homogeneous than the entire sample is. If you are
wrong -- if different college-level classes aren't relatively homogeneous with respect to
your measures -- you will actually be hurt by blocking (you'll get a less powerful
estimate of the treatment effect). How do you know if blocking is a good idea? You
need to consider carefully whether the groups are relatively homogeneous. If you are
measuring political attitudes, for instance, is it reasonable to believe that freshmen are
more like each other than they are like sophomores or juniors? Would they be more
homogeneous with respect to measures related to drug abuse? Ultimately the decision
to block involves judgment on the part of the researcher.
So how does blocking work to reduce noise in the data? To see how it works, you have
to begin by thinking about the non-blocked study. The figure shows the pretest-posttest
distribution for a hypothetical pre-post randomized experimental design. We use the 'X'
symbol to indicate a program group case and the 'O' symbol for a comparison group
member. You can see that for any specific pretest value, the program group tends to
outscore the comparison group by about 10 points on the posttest. That is, there is about
a 10-point posttest mean difference.
Now, let's consider an example where we divide the sample into three relatively
homogeneous blocks. To see what happens graphically, we'll use the pretest measure to
block. This will assure that the groups are very homogeneous. Let's look at what is
happening within the third block. Notice that the mean difference is still the same as it
was for the entire sample -- about 10 points within each block. But also notice that the
variability of the posttest is much less than it was for the entire sample. Remember that
the treatment effect estimate is a signal-to-noise ratio. The signal in this case is the mean
difference. The noise is the variability. The two figures show that we haven't changed
the signal in moving to blocking -- there is still about a 10-point posttest difference. But,
we have changed the noise -- the variability on the posttest is much smaller within each
block than it is for the entire sample. So, the treatment effect will have less noise for the
same signal.
It should be clear from the graphs that the blocking design in this case will yield the
stronger treatment effect. But this is true only because we did a good job assuring that
the blocks were homogeneous. If the blocks weren't homogeneous -- their variability
was as large as the entire sample's -- we would actually get worse estimates than in the
simple randomized experimental case. We'll see how to analyze data from a
randomized block design in the Statistical Analysis of the Randomized Block Design.
Advantages of RBD
I. RBD is more precise than CRD when the blocks are homogeneous.
II. The amount of information obtained in RBD is greater than in CRD.
III. RBD is flexible, and its statistical analysis is simple and easy.
IV. Even if some values are missing, the analysis can still be done by using the missing
plot technique.
Disadvantages of RBD
I. When the number of treatments is increased, the block size will increase.
II. If the block size is large, maintaining homogeneity within blocks is difficult; hence, when
the number of treatments is large, this design may not be suitable.
Mathematical Problem
We wish to determine whether or not four different tips produce different readings on a
hardness testing machine. An experiment such as this might be part of a gauge
capability study. The machine operates by pressing the tip into a metal test coupon, and
from the depth of the resulting depression, the hardness of the coupon can be
determined. The experimenter has decided to obtain four observations for each tip.
There is only one factor, tip type, and a completely randomized single-factor design
would consist of randomly assigning each one of the 4 × 4 = 16 runs to an experimental
unit, that is, a metal coupon, and observing the hardness reading that results.
However, if the metal coupons differ slightly in their hardness, as might happen if they
are taken from ingots that are produced in different heats, the experimental units will
contribute to the variability observed in the hardness data. As a result, the experimental
error will reflect both random error and variability between coupons. We would like to
remove the variability between coupons from the experimental error. A design that
would accomplish this requires the experimenter to test each tip once on each of four
coupons. This design is called a randomized complete block design. Each block contains
all the treatments. Within a block, the order in which the four tips are tested is randomly
determined.
The test data are:
Type of Tip   Test Coupon   Hardness
1             1             9.3
2             1             9.4
3             1             9.2
4             1             9.7
1             2             9.4
2             2             9.3
3             2             9.4
4             2             9.6
1             3             9.6
2             3             9.8
3             3             9.5
4             3             10.0
1             4             10.0
2             4             9.9
3             4             9.7
4             4             10.2
Calculate and interpret the data.
Solution
Steps of calculation:
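One way to carry out the calculation is the standard randomized complete block ANOVA, sketched below in Python from the hardness data listed above. The function name `rcbd_anova` and the dictionary layout are assumptions of this sketch; only the F statistic is computed, since the standard library has no F distribution.

```python
# hardness[coupon] lists the readings for tips 1 to 4 on that coupon (block).
hardness = {
    1: [9.3, 9.4, 9.2, 9.7],
    2: [9.4, 9.3, 9.4, 9.6],
    3: [9.6, 9.8, 9.5, 10.0],
    4: [10.0, 9.9, 9.7, 10.2],
}

def rcbd_anova(table):
    """Partition the total sum of squares into tips, coupons and error."""
    rows = list(table.values())      # one row per block
    b, a = len(rows), len(rows[0])   # number of blocks, number of treatments
    grand = sum(map(sum, rows)) / (a * b)
    treat_means = [sum(r[j] for r in rows) / b for j in range(a)]
    block_means = [sum(r) / a for r in rows]
    ss_treat = b * sum((m - grand) ** 2 for m in treat_means)
    ss_block = a * sum((m - grand) ** 2 for m in block_means)
    ss_total = sum((y - grand) ** 2 for r in rows for y in r)
    ss_error = ss_total - ss_treat - ss_block
    df_treat, df_block = a - 1, b - 1
    df_error = df_treat * df_block
    f_treat = (ss_treat / df_treat) / (ss_error / df_error)
    return {"ss_treat": ss_treat, "ss_block": ss_block, "ss_error": ss_error,
            "df": (df_treat, df_block, df_error), "f_treat": f_treat}

result = rcbd_anova(hardness)
print(round(result["ss_treat"], 3))  # tips: about 0.385
print(round(result["ss_block"], 3))  # coupons: about 0.825
print(round(result["ss_error"], 3))  # error: about 0.08
print(round(result["f_treat"], 2))   # about 14.44 on (3, 9) degrees of freedom
```

The F of about 14.44 is well above the tabulated 5% critical value of roughly 3.86 for (3, 9) degrees of freedom, which suggests that tip type affects the mean hardness reading. The coupon sum of squares (0.825) is large relative to the error sum of squares (0.08); that is the variability the blocking removed from the experimental error.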
Latin Square Design
Latin Square Designs are probably not used as much as they should be - they are very
efficient designs. Latin square designs allow for two blocking factors. In other words,
these designs are used to simultaneously control (or eliminate) two sources of nuisance
variability. For instance, if you had a plot of land the fertility of this land might change
in both directions, North -- South and East -- West due to soil or moisture gradients. So,
both rows and columns can be used as blocking factors. However, you can use Latin
squares in lots of other settings. As we shall see, Latin squares can be used as much as
the RCBD in industrial experimentation as well as other experiments.
Whenever you have two blocking factors, a Latin square design will allow you
to remove the variation from these two sources from the error variation. So, suppose we
had a plot of land; we might block it in columns and rows, i.e. each row is a
level of the row factor, and each column is a level of the column factor. We can remove
the variation from our measured response in both directions if we consider both rows
and columns as factors in our design.
The Latin Square Design gets its name from the fact that we can write it as a square
with Latin letters to correspond to the treatments. The treatment factor levels are the
Latin letters in the Latin square design. The number of rows and columns has to
correspond to the number of treatment levels. So, if we have four treatments then we
would need to have four rows and four columns in order to create a Latin square. This
gives us a design where each of the treatments appears exactly once in each row and once
in each column.
Pros and cons of Latin square design
The advantages of Latin square designs are:
⢠They handle the case when we have several nuisance factors and we either
cannot
⢠Combine them into a single factor or we wish to keep them separate.
⢠They allow experiments with a relatively small number of runs.
The disadvantages are
⢠The number of levels of each blocking variable must equal the number of levels
⢠of the treatment factor.
⢠The Latin square model assumes that there are no interactions between the
⢠Blocking variables or between the treatment variable and the blocking variable.
An example of Latin square design
Actually, in many cases, Latin squares are necessary because each combination of
levels of the two blocking factors can be combined with only one treatment, not all of them. The
following example taken from Mead et al. (2003) illustrates this. Example 1: An
experiment to investigate the effects of various dietary starch levels on milk production
was conducted on four cows. The four diets, T1, T2, T3, and T4 (in order of increasing
starch equivalent), were each fed for three weeks to each cow, and the total yield of milk in
the third week of each period was recorded (the third week was used to minimize carry-over
effects from treatments administered in a previous period). That is, the trial
lasted 12 weeks since each cow received each treatment, and each treatment required
three weeks. The investigator felt strongly that time period effects might be important
(i.e. earlier periods in the experiment might influence milk yields differently compared
to later periods). Hence, the investigator wanted to block on both cow and period.
However, each cow cannot possibly receive more than one treatment during the same
time period; that is, all possible cow-period blocking combinations could not logically
be considered. To start the randomization for a Latin square that accommodates these
types of concerns, let's choose at random one of the 4 standard Latin squares for
a = 4 treatments.
The two blocking variables in a Latin square design are often generically labeled as row
and column blocking variables. In this example, cow is identified as the column variable
and period as the row variable. Standard Latin squares are Latin squares in which
elements of the first row and first column are arranged alphabetically by treatment
category (i.e. the letters in the square above denote different treatments). There are a
number of standard Latin squares that might exist for different values of a (i.e. the total
number of treatments). For each value of a (the size of the square), there are a
large number of different a by a squares that have the Latin square property that each
letter (treatment group label) appears once in each row and once in each column. As
with randomized block designs, in order to make the analysis of data from a design
statistically valid, we must choose one design randomly from a larger set of possible
Latin squares.
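A rough sketch of this idea in Python: start from one standard square, permute its rows and columns at random, and confirm that the result still has the Latin square property. The cyclic construction and the function names are assumptions made for illustration. Permuting rows and columns of a single square is a common simplification and does not by itself sample uniformly from all Latin squares.

```python
import random
import string

def cyclic_latin_square(a):
    """Standard square: first row and first column are in alphabetical order."""
    letters = string.ascii_uppercase[:a]
    return [[letters[(i + j) % a] for j in range(a)] for i in range(a)]

def randomize_square(square, seed=None):
    """Randomly permute the rows and the columns of a Latin square."""
    rng = random.Random(seed)
    a = len(square)
    rows = rng.sample(range(a), a)  # random row order
    cols = rng.sample(range(a), a)  # random column order
    return [[square[r][c] for c in cols] for r in rows]

def is_latin(square):
    """Check that each letter appears once in each row and each column."""
    a = len(square)
    symbols = set(square[0])
    rows_ok = all(len(row) == a and set(row) == symbols for row in square)
    cols_ok = all({row[j] for row in square} == symbols for j in range(a))
    return len(symbols) == a and rows_ok and cols_ok

standard = cyclic_latin_square(4)
design = randomize_square(standard, seed=3)
for row in design:
    print(" ".join(row))
print(is_latin(design))
```

In the cow example, rows would correspond to periods, columns to cows, and the letters to the four diets.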
Mathematical Problem
Suppose you want to analyze the productivity of 5 kinds of fertilizer, 5 kinds of tillage,
and 5 kinds of seed. The data are organized in a Latin square design, as follows:
Treatment A Treatment B Treatment C Treatment D Treatment E
Fertilizer 1 "A42" "C47" "B55" "D51" "E44"
Fertilizer 2 "E45" "B54" "C52" "A44" "D50"
Fertilizer 3 "C41" "A46" "C52" "A44" "D50"
Fertilizer 4 "B56" "D52" "E49" "C50" "A43"
Fertilizer 5 "D47" "E49" "A45" "B54" "C46"
The three factors are fertilizer (rows, fertilizer 1 to 5), tillage (columns, treatment A to E)
and seed (the letter in each cell, A to E). The numbers are the productivity in cwt / year.
Calculate and interpret the data.
Solution
Steps of calculation:
Create a data frame in R with these data:
Look at the significance of the F-tests:
I. The difference between groups for fertilizer is not significant (p-value
> 0.1);
II. The difference between groups for tillage is significant (p-value
< 0.05);
III. The difference between groups for seed is highly significant (p-value <
0.001).
Factorial Experiments
In practice, the response of a biological organism to the factor of interest is expected to differ
under different levels of other factors. For example, the yield of wheat varieties may differ under
different rates of fertilizer application, spacing and irrigation schedules. When the effects of
several factors are investigated simultaneously in a single experiment, the experiment is
known as a factorial experiment. In factorial experiments, the treatments consist of all possible
combinations of the selected levels of two or more factors. Factorial experiments are
advantageous over single-factor experiments in the following ways. In factorial experiments we can
evaluate the combined effect of two or more factors when they are used simultaneously, i.e. we can
study the individual effects of each factor as well as their interactions in one experiment. The
factorial approach also results in a considerable saving of experimental resources.
Simple and Main Effects
The simple effect of a factor is the difference between its responses at a fixed level of the other
factors. The mean of the simple effects of a factor is called the main effect of the factor.
Interaction Effects
When the effect of one factor changes as the level of the other factor changes, i.e. the factors are not
independent, the dependence of the factors in their responses is known as interaction. The
interaction is measured as the mean of the differences between the simple effects of the factors.
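A small worked example may make these definitions concrete. The cell means below are hypothetical values for a 2 × 2 factorial (factor A at levels a0 and a1, factor B at levels b0 and b1), and the interaction is taken as half the difference between the two simple effects of a factor, which is one common convention for the 2 × 2 case.

```python
# Hypothetical mean responses for each treatment combination of a 2 x 2 factorial.
means = {
    ("a0", "b0"): 20.0, ("a1", "b0"): 24.0,
    ("a0", "b1"): 26.0, ("a1", "b1"): 36.0,
}

# Simple effects of A: change from a0 to a1 at a fixed level of B.
simple_a_at_b0 = means[("a1", "b0")] - means[("a0", "b0")]
simple_a_at_b1 = means[("a1", "b1")] - means[("a0", "b1")]

# Simple effects of B: change from b0 to b1 at a fixed level of A.
simple_b_at_a0 = means[("a0", "b1")] - means[("a0", "b0")]
simple_b_at_a1 = means[("a1", "b1")] - means[("a1", "b0")]

# Main effect: mean of the simple effects of the factor.
main_a = (simple_a_at_b0 + simple_a_at_b1) / 2
main_b = (simple_b_at_a0 + simple_b_at_a1) / 2

# Interaction: half the difference between the simple effects.
# The same value results whether it is computed from A or from B.
interaction_from_a = (simple_a_at_b1 - simple_a_at_b0) / 2
interaction_from_b = (simple_b_at_a1 - simple_b_at_a0) / 2

print(simple_a_at_b0, simple_a_at_b1, main_a)  # 4.0 10.0 7.0
print(simple_b_at_a0, simple_b_at_a1, main_b)  # 6.0 12.0 9.0
print(interaction_from_a, interaction_from_b)  # 3.0 3.0
```

If the two simple effects of A were equal, the interaction would be zero and the main effect alone would describe how A acts.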
Split Plot Design
The split plot design is specifically suited for a two-factor experiment that has more treatments
than can be accommodated by a complete block design. In this design, one of the factors is
assigned to the main plot (the main plot factor). The main plot is divided into subplots to which the
second factor, called the subplot factor, is assigned. With a split plot design, the precision for the
measurement of the effects of the main plot factor is sacrificed to improve that of the subplot
factor. Measurement of the main effect of the subplot factor and its interaction with the
main plot factor is more precise than that obtainable with RCBD.
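The two-stage structure of a split plot layout can be sketched as follows. The factor levels (irrigation as the main plot factor, variety as the subplot factor), the number of blocks and the function name are hypothetical; the sketch only illustrates that randomization happens once for the main plots and then again, independently, within each main plot.

```python
import random

def split_plot_layout(main_levels, sub_levels, n_blocks, seed=None):
    """Return (block, main plot level, subplot order) for every main plot."""
    rng = random.Random(seed)
    layout = []
    for block in range(1, n_blocks + 1):
        # Stage 1: randomize the main plot factor over the main plots of the block.
        mains = rng.sample(main_levels, len(main_levels))
        for main in mains:
            # Stage 2: randomize the subplot factor separately in each main plot.
            subs = rng.sample(sub_levels, len(sub_levels))
            layout.append((block, main, subs))
    return layout

irrigation = ["I1", "I2", "I3"]       # hypothetical main plot factor levels
variety = ["V1", "V2", "V3", "V4"]    # hypothetical subplot factor levels
layout = split_plot_layout(irrigation, variety, n_blocks=3, seed=5)
for block, main, subs in layout:
    print(block, main, subs)
```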
The following considerations are required for assigning main plot and subplot factors-
I. Degree of precision.
II. Relative size of main effects.
III. Management practices.
Strip Plot Design
Strip plot design is specifically suited for a two-factor experiment in which the desired precision
for measuring the interaction effect between two factors is higher than that for measuring the
main effect of either one of the two factors.
This is accomplished with the use of three plot sizes:
I. Vertical-strip plot for the first factor, the vertical factor.
II. Horizontal-strip plot for the second factor, the horizontal factor.
III. Intersection plot for the interaction between the two factors.
The vertical-strip plot and the horizontal-strip plot are always perpendicular to each other. In
strip-plot design, the degree of precision associated with the main effects of both factors is
sacrificed in order to improve the precision of the interaction effect.
Two-Factor Experiments
When the response to the factor of interest is expected to differ under different levels of the other
factors, we use a factorial experimental design to handle two or more
variable factors simultaneously. Two-factor factorial designs, split-plot designs and strip-plot
designs come under this category of experiments.
Three or More - Factor Experiments
A two-factor experiment can be expanded to include a third factor, a three-factor experiment to
include a fourth factor, and so on. Increasing the number of factors has two
important consequences: there is a rapid increase in the number of treatments to be tested, and there is an
increase in the number and type of interaction effects, e.g. a four-factor experiment has 11
interaction effects (six two-factor, four three-factor and one four-factor).
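Both consequences can be checked by counting. In the sketch below, `levels` is a hypothetical list giving the number of levels of each factor: the number of treatments is the product of the levels, and the number of interactions of order m among k factors is the binomial coefficient C(k, m).

```python
from math import comb, prod

def factorial_size(levels):
    """Count treatment combinations and interaction effects of each order."""
    k = len(levels)
    treatments = prod(levels)  # all combinations of the factor levels
    # An m-factor interaction exists for every choice of m of the k factors.
    interactions = {m: comb(k, m) for m in range(2, k + 1)}
    return treatments, interactions

for levels in ([2, 2], [2, 2, 2], [2, 2, 2, 2], [3, 3, 3, 3]):
    treatments, interactions = factorial_size(levels)
    print(len(levels), "factors:", treatments, "treatments,",
          sum(interactions.values()), "interactions", interactions)
```

With two levels per factor, going from two to four factors raises the number of treatments from 4 to 16 and the number of interaction effects from 1 to 11.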
Choice of Experimental Designs
The choice of design depends on the objective of the experiment and the number and nature of
the treatments under study. Another important consideration for the choice of any experimental
design is the availability of resources. Some considerations under which the different designs
are appropriate are as follows:
I. CRD is appropriate when the experimental material is limited and homogeneous, such
as the soil in the pot experiments.
II. RBD is appropriate when the fertility gradient of the field is in one direction.
III. LSD can replace RBD when the fertility gradient is in two directions instead of one.
IV. When there are several factors with different levels to be studied simultaneously with
the same precision, a factorial scheme may be adopted.
V. When the factors are such that some of them require large plots, like irrigation, sowing
dates etc., and may be studied with different precision, a split plot design may be used.
VI. An incomplete block design may be used when there are more treatments than plots in a block,
so that a small block size can be maintained even if the number of treatments is very large.