A Practical Guide to
Design of Experiments (DOE)
for Assay Developers

Daniel Joelsson

© 2010 Daniel Joelsson All rights reserved
Introduction

Why write this book? Those of you that know me probably aren't surprised. I've been a strong advocate for a more systematic approach to assay development for many years, to the point of annoyance for many of my colleagues. I feel that using Design of Experiments (DOE) in assay development is the most efficient way to develop quality assays in the least amount of time possible. Then why aren't scientists using DOE more widely? I think part of the problem is a lack of understanding of what DOE really is and why it's useful. I looked for a good introduction to DOE for assay development scientists and I couldn't find one. Most of the texts out there are engineering and process focused. Therefore, I'm writing my own guide for us scientists and assay developers.

I should state up front that I'm not a statistician, I'm a scientist. This guide is not intended for statisticians and I will not try to turn you into one either. However, some basic understanding of statistics is important to fully appreciate DOE. I will also assume that you are familiar with basic biology and biochemistry techniques. My background is primarily in the design of bioassays and immunoassays for vaccines and biotherapeutics, and most of my examples will come out of those disciplines. I will do my best to explain any statistical topics in a relevant and pragmatic way, but in case you want more, these are two resources that I have found to be excellent texts on statistics for scientists:

The Biostatistics Cookbook - The Most User-Friendly Guide for the Bio/Medical Scientist by Seth Michelson and Timothy Schofield

Biometry: The Principles and Practices of Statistics in Biological Research by Robert Sokal and F. James Rohlf

This book is meant to be a guide for beginning to intermediate DOE users, slanted completely towards scientists who do assay development. If you would like a more in-depth discussion of DOE and the statistics behind it, Design and Analysis of Experiments by Douglas C. Montgomery is a great resource to get you started.

While I will try to teach you enough about DOE to allow you to design and analyze your own experiments, a statistician familiar with assay development can be a valuable resource. What this book will give you is an understanding of the concepts and language of DOE so that you can easily communicate with your statistics friends. One of my goals is to increase the overlap in understanding on both sides. You might want to give a copy of this book to your statistician as well, especially if they are not familiar with the pitfalls of assay design.
Before we start, a few thanks are in order. Thanks to Tim Schofield for schooling me on statistics from the first day we met. Thanks to Edith Senderak for all of the mentoring and collaborations over the years. Thanks to Joe Pigeon for sparking my initial interest in DOE as well as introducing me to split-plot designs (which, as you will see, are perfect for developing assays in 96-well plates). Thanks to all of my colleagues for trusting me enough to allow me to help with their DOE designs.

This book is being published under a Creative Commons license. You are free to distribute this work to anyone you think would be interested, free of charge. You may not use any portion of this book for any commercial purpose without prior permission. You may create derivatives of this work, as long as these derivatives adhere to the same license restrictions. For complete license information, please visit: http://creativecommons.org/licenses/by-nc-sa/3.0/
Chapter 1 - Screening Designs

1.1 One Factor At a Time (OFAT) and Interactions

Think back to your early years of scientific training for a moment. If your experience was anything like mine, you were first taught to do experiments using the scientific method. It went something like this: first generate a hypothesis based on past knowledge, design an experiment to test the hypothesis, analyze the data, and then refine the hypothesis. Repeat until we get to the answer. This is the correct way to do science. Unfortunately, it is in the second step where we usually learn some bad habits. We were told to vary one factor at a time (OFAT) and hold everything else constant. At first glance, OFAT makes a lot of sense, but it can be misleading.

Let's look at a simple assay development experiment. Imagine that you're trying to develop an ELISA to detect a bacterial contaminant in a sample from a process development study for a new biotherapeutic. Two of the variables you assume to be important are the amount of capture antibody you add to your plate and the time you incubate your sample. Consistent with the OFAT approach, you start out by testing two different antibody concentrations, 0.2 ug/ml and 1.0 ug/ml, while keeping the incubation time constant at one hour. You then run the assay and faithfully record the results in your lab notebook (Table 1.1).

Table 1.1
Antibody concentration   Incubation time   Response (O.D.)
0.2 ug/ml                1 hour            0.31
1.0 ug/ml                1 hour            0.97

Great! Obviously, adding more antibody increases the response (it increased from 0.31 to 0.97). Your initial hypothesis that adding more antibody would be beneficial was correct. Good to know! What if you increased it even further? Let's find out what happens (Table 1.2).
Table 1.2
Antibody concentration   Incubation time   Response (O.D.)
0.2 ug/ml                1 hour            0.31
1.0 ug/ml                1 hour            0.97
1.5 ug/ml                1 hour            0.95

Ok, it appears that adding even more antibody does not further increase the signal. So you conclude that the response has been saturated somewhere around 1.0 ug/ml. Since the conditions for the first variable have been optimized, let's look at the second variable, incubation time. Again, you do a simple experiment, increasing the incubation time to two hours. Since you already know that 1.0 ug/ml of antibody is ideal, you only need to run one experiment with 1.0 ug/ml of antibody for two hours of incubation. The results are shown in Table 1.3.

Table 1.3
Antibody concentration   Incubation time   Response (O.D.)
0.2 ug/ml                1 hour            0.31
1.0 ug/ml                1 hour            0.97
1.5 ug/ml                1 hour            0.95
1.0 ug/ml                2 hours           1.72

Even better! Clearly two hours is better than one. What if you increase the incubation time even further? You try three hours and see what happens (Table 1.4).

Table 1.4
Antibody concentration   Incubation time   Response (O.D.)
0.2 ug/ml                1 hour            0.31
1.0 ug/ml                1 hour            0.97
1.5 ug/ml                1 hour            0.95
1.0 ug/ml                2 hours           1.72
1.0 ug/ml                3 hours           1.74

Again, increasing the variable did not have a further effect on the response. So you conclude that the optimum settings for this assay are 1.0 ug/ml of antibody and a two-hour incubation.
Since you've now tested and optimized both variables, you're done, right? Well, maybe not. There is one experiment that was not performed: 0.2 ug/ml of antibody incubated for two hours. Let's run that experiment and see what happens (Table 1.5).

Table 1.5
Antibody concentration   Incubation time   Response (O.D.)
0.2 ug/ml                1 hour            0.31
1.0 ug/ml                1 hour            0.97
1.5 ug/ml                1 hour            0.95
1.0 ug/ml                2 hours           1.72
1.0 ug/ml                3 hours           1.74
0.2 ug/ml                2 hours           2.62

Whoa! The response is now much higher than what we previously considered optimal. There are now all kinds of experiments we should probably run to make sure we have all the answers we need. What happens if we use less than 0.2 ug/ml? What about testing other incubation times with 0.2 ug/ml?

This example highlights one of the greatest weaknesses of OFAT experiments: you are very likely to miss an interaction between two or more of the variables. An interaction occurs when the effects of two variables are not completely independent of each other; i.e., the response from one variable depends on the level of the other. Since the OFAT approach (and traditional scientific training) assumes that all variables are additive, it doesn't encourage us to test for interactions. The only way to find them is by luck. Unfortunately, interactions like these happen all the time in the real world.

For scientists trained in linear thinking, interactions can sometimes be hard to visualize. There's a specific graph called an interaction plot that makes it slightly easier (Figures 1.1 and 1.2). In an interaction plot where the two variables have no interaction, the two response lines will be perfectly parallel (Figure 1.1). The two lines represent the response due to Factor A at the two different levels of Factor B. Since the lines are perfectly parallel, the effect of A and B is completely additive. When you increase Factor B, it shifts the entire Factor A line upwards by the same amount. This is the kind of relationship we assume is always in place in an OFAT experimental design.

If the lines of the interaction plot intersect (or are simply non-parallel), there's an interaction between the two variables (Figure 1.2). Increasing Factor B no longer just moves the Factor A line higher. Instead, at the lower level of B, the response in Factor A actually decreases from low to high. Clearly the effects of the two variables are not simply additive. This is exactly the situation we observed in the assay development example above.
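The interaction in the ELISA example can be put into numbers directly from the four "corner" runs. The following is my own minimal sketch (not code from this book), using the standard contrast formulas for a two-level, two-factor experiment:

```python
# Effect estimates from the four "corner" runs of the ELISA example
# (antibody concentration x incubation time). Responses are keyed by
# (antibody level, time level), with -1 = low and +1 = high.
responses = {
    (-1, -1): 0.31,  # 0.2 ug/ml, 1 hour
    (+1, -1): 0.97,  # 1.0 ug/ml, 1 hour
    (-1, +1): 2.62,  # 0.2 ug/ml, 2 hours
    (+1, +1): 1.72,  # 1.0 ug/ml, 2 hours
}

def effect(signs):
    """Average response at the + level minus the average at the - level."""
    return sum(s * y for s, y in zip(signs, responses.values())) / 2

a_signs = [a for (a, b) in responses]        # antibody main effect
b_signs = [b for (a, b) in responses]        # time main effect
ab_signs = [a * b for (a, b) in responses]   # interaction: product of signs

print(f"antibody effect:    {effect(a_signs):+.2f}")
print(f"time effect:        {effect(b_signs):+.2f}")
print(f"interaction effect: {effect(ab_signs):+.2f}")
```

The antibody main effect averages out to nearly nothing, while the interaction term is large and negative: exactly the crossing lines of Figure 1.2, and something the OFAT runs alone could never have revealed.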
Figure 1.1 - Interaction plot for two factors without an interaction

Figure 1.2 - Interaction plot for two factors with an interaction

1.2 Factorial experiments - foundations of DOE

I hope I convinced you in the previous section that interactions between variables can be an important factor in the success of your experiments, and that the OFAT approach makes it hard to find them unless you're willing to do a lot of work. This section will introduce some of the basics of Design of Experiments (DOE). DOE is a more systematic approach to experimentation than the OFAT approach. The goal is to test all of the variables in a system in a set of multi-factorial experiments, allowing us to optimize all of them and find interactions at the same time.

The assay development example in the last section made it clear that it would have been a good idea to run all four combinations of the two variables right away. This type of experiment is called a factorial experiment. A picture of the four different experiments (or "runs") shows how we start exploring the "design space" of our experimental system (Figure 1.3). As you can see, we have covered the four corners of the space. The runs can also be shown in a table such as Table 1.6. During each run, a factor can take on a low value (depicted as "-") or a high value (depicted as "+"). With two factors there are four such combinations possible. By performing all the possible combinations of the factors, we will be able to tell not only if each of the factors is important (i.e., changing them has an effect on the response), but also if any interactions are present.

Figure 1.3 Pictorial depiction of a 2^2 factorial experiment.

Table 1.6
Run #   Factor A   Factor B
Run 1   +          +
Run 2   -          +
Run 3   +          -
Run 4   -          -

Let's expand the same thinking to three factors (Figure 1.4 and Table 1.7). With three factors we have to complete eight runs to cover every combination of high and low for all three factors. But again, you get a lot of information out of those runs: you will know if any or all of the factors influence the response, if any two-factor interactions exist (AxB, AxC, BxC), and if there's a three-factor interaction (AxBxC).
Figure 1.4 Pictorial depiction of a 2^3 factorial experiment

Table 1.7
Run #   Factor A   Factor B   Factor C
1       +          +          +
2       -          +          +
3       +          -          +
4       -          -          +
5       +          +          -
6       -          +          -
7       +          -          -
8       -          -          -

This exercise can be continued for four factors and so on. Performing eight runs for three factors might seem reasonable, but as you may have figured out already, as we add factors, the number of runs increases exponentially (Table 1.8). Since most assay systems contain more than three or four factors, factorial experiments quickly become too large to feasibly perform. With this in mind, it's easy to see how OFAT is still widely practiced. Factorial experiments rapidly become unwieldy as the number of factors goes up.
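Run tables like Table 1.7 can be generated mechanically: a two-level full factorial is simply every combination of "-" and "+" across the factors. A short sketch (the function name is mine, not from the book):

```python
from itertools import product

def full_factorial(n_factors):
    """All 2^n combinations of low ("-") and high ("+") for n factors."""
    return list(product("-+", repeat=n_factors))

# Eight runs for three factors, as in Table 1.7.
for run, levels in enumerate(full_factorial(3), start=1):
    print(run, *levels)

# The exponential growth in run count falls out directly (cf. Table 1.8).
for k in range(1, 11):
    print(k, "factors ->", len(full_factorial(k)), "runs")
```

Enumerating the runs this way also makes the problem obvious: the list doubles with every factor you add.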
Table 1.8
Number of factors   Number of runs
1                   2
2                   4
3                   8
4                   16
5                   32
6                   64
7                   128
8                   256
9                   512
10                  1024

But, since OFAT experiments make it very difficult to find those elusive interactions, we need an alternative approach. Luckily, DOE provides that in the form of fractional factorials, the topic of the next section. However, before I get to that topic, I want to address a common question: how come we only use two levels (high and low) for each factor? Obviously, it would be risky to assume that any response is perfectly linear between two points. It might be that the response has a complex shape in that space. In the example in Figure 1.5, the optimum response is actually somewhere between our low and high settings.

This is where DOE differs from traditional experimentation. The goal of these factorial experiments is not to optimize the response completely, but to screen for the few factors that actually affect the response. These factors are then carried forward in a set of optimization experiments. Not all factors are likely to be important in every system, so we should do fairly low resolution experiments to identify the ones that truly matter. As you will see, screening experiments have relatively fewer runs per factor than optimization experiments. We can afford to include more factors in the initial experiments to make sure that we don't miss any of the critical few.
Figure 1.5 - Non-linear response

1.3 Fractional Factorials

In the last section, we learned that factorial experiments are useful designs to pick up both main and interaction effects in our experimental system. But for more than a few factors, they quickly become too large to be feasible. In most assay development experiments, we can easily identify ten or more factors that could be important (reagent concentrations, incubation temperatures, incubation times, etc.). We need a different approach. There is a set of DOE designs called fractional factorials that meet this need nicely.

Let's go back to a simple three-factor factorial experiment (Figure 1.6). In this experiment, we have three factors. Since we are doing screening experiments to identify the important factors, we will only be testing two levels per factor. Now imagine taking the three-dimensional space in Figure 1.6 and condensing it down to two dimensions (Figure 1.7). In effect, we are now looking at the front of the cube so that we can't see the different levels of factor C any longer. We can take this thinking one step further and compress the resulting square into a single dimension (Figure 1.8).
Figure 1.6 Three factor screening experiment.

Figure 1.7 Three factor screening experiment compressed into two dimensions.

Figure 1.8 Three factor screening experiment compressed into one dimension.

When we look at our experimental space this way, something interesting happens. It becomes clear that we have actually run four replicates of each of the two levels for factor A. This property of factorial designs is called hidden replication. While each of the eight experimental runs has a different combination of levels of all three factors, each factor's level is actually replicated four times.
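The hidden replication described above is easy to check by projecting the eight runs onto one factor at a time, as a quick sketch of mine (not from the book) shows:

```python
from itertools import product

# Full 2^3 factorial: eight runs over factors A, B, C coded as -1 / +1.
runs = list(product([-1, +1], repeat=3))

# "Hidden replication": project the eight runs onto any single factor and
# each of its two levels appears four times.
for factor in range(3):
    low = sum(1 for r in runs if r[factor] == -1)
    high = sum(1 for r in runs if r[factor] == +1)
    print(f"factor {'ABC'[factor]}: {low} runs at low, {high} runs at high")
```

The count is four at each level for every factor, which is exactly what the compressed views in Figures 1.7 and 1.8 show pictorially.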
Note that it was completely arbitrary that we chose factor A for this analysis. The same thing happens if we compress the design for either factor B or C. This is a useful characteristic of factorial designs (and fractional factorial designs, as we will see in a little bit) called rotatability. Because the design is perfectly symmetrical, we can in effect assign any factor to be A, B, or C and it doesn't matter. We can get away with this because instead of using the actual numbers for the setting of each factor, we are going to use coded values of -1 or +1. This might seem confusing for now, but bear with me. It will make more sense when we discuss how we analyze these designs. What it does for us now is to scale the levels of each factor to have the same distance, thus making the design rotatable.

There is also a second characteristic of these designs worth noting. The reason we could compress the figure down into a single dimension is a property called orthogonality. Notice how the axes for the three factors in Figure 1.6 are at 90 degree angles to each other? This means that when we compress the design down to a single dimension, the effects of all the other factors cancel out, allowing us to estimate just the effect of the one variable we want. This is what we mean by a multi-variable experiment. Unlike in an OFAT study, we truly can vary all of our variables at the same time and still be able to distinguish the effects of each.

Looking for hidden replication in our experimental space showed us that each factor was actually replicated four times. Since we probably don't need that many replicates to tell if the response changes from the low to the high setting, let's eliminate some of these extra runs and see what happens. One of the possible ways of doing this is shown in Figure 1.9. As you can see, we have eliminated half of the runs. The amazing thing is that we are still doing two replicates of each level for all three factors. With only four runs we can estimate the effects of three variables with two replicates at each setting. That's a pretty good return on your experimental investment.
Figure 1.9 Three factor fractional factorial experiment (half-fraction).

We could try to eliminate another two runs (Figure 1.10), but this time we have pushed things too far. Now we can't estimate factor C any longer. For three factors, we can only eliminate half of the runs and still have a viable design. However, for designs with more than three factors, we can eliminate more than half of the runs and still estimate all of the main effects. In fact, the more factors we have, the more runs we can eliminate.

Figure 1.10 Three factor fractional factorial (quarter fraction).
1.4 Statistical Power and Aliasing in Fractional Factorials

You're probably thinking that if we just eliminated half of our work, there has to be a catch. You're right. An immediate impact is that we have lowered the number of replicates per data point from four to two. This has the effect of lowering the statistical power of our design. We would need a larger effect in the response when going from low to high in order to see it. But that might be a trade-off you are willing to make, especially if you only care about large (relative to the experimental system) effects. How do you know how much of a trade-off you are making? That's a function of the underlying variability of the system you are looking at. If the variability is large and the effect you expect is small, you will need more replicates. Power calculations are described thoroughly in most statistics books (see the introduction to this book for sources), but you probably won't have to calculate power by hand. All of the specialized DOE software on the market will do this calculation for you. I'll talk more about these packages when I discuss the analysis of screening designs in the next chapter. For now, let's assume that you have enough power to estimate the effects you're expecting to see.

The other penalty we take when eliminating runs from our design is that we create what is termed aliasing of effects. In the discussion of full factorial experiments, I explained the concept of an interaction between two factors. A consequence of fractional factorials is the confounding of interactions with main effects, or with each other. Let me explain what that means in a little more detail. Let's look back at our three-factor fractional factorial in Figure 1.9. Recall from our discussion in section 1.1 that if we wanted to estimate an interaction between factors A and B, we would have to run all four of the "corners" of the square diagram. Unfortunately, in our fractional factorial, two of those corners are run at the lower level of factor C and the other two are run at the higher level of factor C. The design is no longer orthogonal with respect to the interaction AB and factor C. The same is the case for the interactions BC and AC. In the nomenclature of DOE, we would say that the main effect of A is aliased with the interaction BC. The notation we use to describe this relationship is as follows:

A = A + BC
B = B + AC
C = C + AB

The effect we observe due to factor A is a combination of A and the interaction BC, and we can't tell how much is contributed by each.
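This aliasing can be demonstrated in a few lines. The sketch below (mine, not the book's) builds the half-fraction by keeping the runs where the product of the three factor signs is +1 — what statisticians call the defining relation I = ABC, which matches the runs retained in Figure 1.9 — and then shows that the sign column for A is identical to the column for the BC interaction:

```python
from itertools import product

full = list(product([-1, +1], repeat=3))  # full 2^3 factorial

# Half-fraction: keep the runs where A*B*C = +1 (defining relation I = ABC).
half = [r for r in full if r[0] * r[1] * r[2] == +1]
print(half)  # four runs; each level of each factor still replicated twice

# Aliasing: within the half-fraction, the sign column for A is identical to
# the sign column for the BC interaction, so their contrasts are confounded.
col_A = [a for a, b, c in half]
col_BC = [b * c for a, b, c in half]
print(col_A == col_BC)
```

The same check confirms B = B + AC and C = C + AB: every main-effect column coincides with the column of the two-factor interaction it is aliased with.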
This relationship can also be described using the notation in Table 1.9. Since this is not a statistics book, I will take some liberties in explaining this table. For a more complete discussion, please see the book by Montgomery listed in the introduction or any other statistics-based DOE book. Table 1.9 is similar to the run tables in previous sections, with some differences. Instead of run # in the first column, it is now labeled "treatment combination". I have also added columns for all of the possible interactions and one labeled "I", which stands for identity.

While this table may not make a lot of sense right now, there are a few things we can take away from it. In Chapter 2, we will learn how to build a mathematical model that relates the settings of the individual factors to the level of the output response. In that model, each factor will be preceded by a constant. The value of that constant will be solved for by using the + and - signs in each column. For now, look carefully at the table and you will notice that each column has a unique pattern. We can therefore solve for the constant for each term in the model. In the bottom half of the table, I have set apart the combinations that we eliminated in Figure 1.9.

Let's take a closer look at the remaining pluses and minuses in the top of the table. If we say that these signs represent coefficients in an equation for estimating the effects, then you will notice that the coefficients in column A are the same as those in column BC (in the top half of the table). The same goes for column B with AC, and column C with AB. Column ABC is identical to column I, which means that we are no longer able to estimate the three-factor interaction.

Table 1.9
Treatment
combination   I   A   B   C   AB   AC   BC   ABC
a             +   +   -   -   -    -    +    +
b             +   -   +   -   -    +    -    +
c             +   -   -   +   +    -    -    +
abc           +   +   +   +   +    +    +    +
--- runs eliminated in Figure 1.9 ---
ab            +   +   +   -   +    -    -    -
ac            +   +   -   +   -    +    -    -
bc            +   -   +   +   -    -    +    -
(1)           +   -   -   -   +    +    +    -

Whether you're more comfortable thinking about aliases graphically or by using the table, the end result is the same. When we reduce the number of runs in a factorial, the number of aliases increases, and with it we lose the ability to estimate some of the effects in the experiment.
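The sign columns in Table 1.9 are exactly the coefficients used to estimate each effect: multiply each run's response by the column's sign, sum, and divide by half the number of runs. Here is a sketch of that contrast calculation on a full 2^3 factorial; the response values are hypothetical numbers I made up for illustration, not data from this book:

```python
from itertools import product

# Full 2^3 sign table: each run is (A, B, C); interaction columns are
# elementwise products of the main-effect columns.
runs = list(product([-1, +1], repeat=3))

# Hypothetical responses for the eight treatment combinations (illustrative).
ys = [0.30, 0.35, 0.90, 0.95, 0.32, 0.36, 1.60, 1.66]

def estimate(signs, ys):
    """Contrast estimate: sign-weighted sum of responses over half the runs."""
    return sum(s * y for s, y in zip(signs, ys)) / (len(ys) / 2)

columns = {
    "A":   [a for a, b, c in runs],
    "B":   [b for a, b, c in runs],
    "C":   [c for a, b, c in runs],
    "AB":  [a * b for a, b, c in runs],
    "AC":  [a * c for a, b, c in runs],
    "BC":  [b * c for a, b, c in runs],
    "ABC": [a * b * c for a, b, c in runs],
}
for name, signs in columns.items():
    print(f"{name:>3}: {estimate(signs, ys):+.3f}")
```

Because every column in the full factorial has a unique sign pattern, each effect gets its own independent estimate; in the half-fraction, aliased columns share a pattern and therefore share one estimate.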
As you might have noticed, we could just as easily have eliminated the other half of the runs in the experiment. The aliasing structure then takes on a negative relationship (i.e., A = A - BC, etc.), but the principle is the same: we can no longer estimate the effects independently.

1.5 Other Screening Designs

While fractional factorial designs are the easiest screening designs to understand due to their simple aliasing structures, they are not the only designs available if you are looking to identify important factors. In fact, there are whole families of designs that aim to minimize the number of runs while still being able to identify main effects. I'll discuss one of the more popular ones briefly here, and get to the more advanced designs in Chapter 4. Plackett-Burman designs are a family of two-level screening designs that use the smallest number of runs possible for situations where you have 11, 15, 19, 23, 27, or 31 factors. In fact, you only need one more run than the number of factors you have. These designs have extremely complex aliasing structures.
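In practice your DOE software will build these designs for you, but the construction itself is simple: a single generator row is cyclically shifted, and a final row of all minuses is appended. A sketch of mine for the 12-run, 11-factor case, assuming the standard published Plackett-Burman generator for N = 12:

```python
# 12-run Plackett-Burman design for 11 factors, built by cyclically shifting
# the published N = 12 generator row and appending a row of all -1s.
generator = [+1, +1, -1, +1, +1, +1, -1, -1, -1, +1, -1]

design = []
for shift in range(11):
    design.append(generator[-shift:] + generator[:-shift])  # cyclic rotation
design.append([-1] * 11)  # final run: every factor at its low level

# Every column is balanced (six highs, six lows) and any two columns are
# orthogonal, which is what lets all 11 main effects be estimated.
for j in range(11):
    assert sum(row[j] for row in design) == 0
for i in range(11):
    for j in range(i + 1, 11):
        assert sum(r[i] * r[j] for r in design) == 0
print("12 runs; 11 balanced, mutually orthogonal columns")
```

Twelve runs for eleven factors is remarkably economical; the price, as the expansion below makes painfully clear, is the aliasing structure.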
Here's an example of the aliasing structure for the main effect of factor A in an 11-factor design:

[A] = A - 0.333 * BC - 0.333 * BD - 0.333 * BE + 0.333 * BF - 0.333 * BG - 0.333 * BH
+ 0.333 * BJ + 0.333 * BK - 0.333 * BL + 0.333 * CD - 0.333 * CE - 0.333 * CF
+ 0.333 * CG - 0.333 * CH + 0.333 * CJ - 0.333 * CK - 0.333 * CL + 0.333 * DE
+ 0.333 * DF - 0.333 * DG - 0.333 * DH - 0.333 * DJ - 0.333 * DK - 0.333 * DL
- 0.333 * EF - 0.333 * EG - 0.333 * EH - 0.333 * EJ + 0.333 * EK + 0.333 * EL
- 0.333 * FG + 0.333 * FH - 0.333 * FJ - 0.333 * FK - 0.333 * FL + 0.333 * GH
- 0.333 * GJ + 0.333 * GK - 0.333 * GL - 0.333 * HJ - 0.333 * HK + 0.333 * HL
- 0.333 * JK + 0.333 * JL - 0.333 * KL - 0.333 * BCD + 0.333 * BCE - 0.333 * BCF
+ 0.333 * BCG + 0.333 * BCH + 0.333 * BCJ + 0.333 * BCK - 0.333 * BCL + 0.333 * BDE
+ 0.333 * BDF + 0.333 * BDG - 0.333 * BDH - 0.333 * BDJ + 0.333 * BDK + 0.333 * BDL
+ 0.333 * BEF - 0.333 * BEG + 0.333 * BEH - 0.333 * BEJ + 0.333 * BEK - 0.333 * BEL
- 0.333 * BFG + 0.333 * BFH + 0.333 * BFJ + 0.333 * BFK + 0.333 * BFL - 0.333 * BGH
+ 0.333 * BGJ + 0.333 * BGK + 0.333 * BGL + 0.333 * BHJ - 0.333 * BHK + 0.333 * BHL
+ 0.333 * BJK + 0.333 * BJL - 0.333 * BKL + 0.333 * CDE + 0.333 * CDF + 0.333 * CDG
+ 0.333 * CDH + 0.333 * CDJ - 0.333 * CDK + 0.333 * CDL - 0.333 * CEF - 0.333 * CEG
+ 0.333 * CEH + 0.333 * CEJ - 0.333 * CEK + 0.333 * CEL + 0.333 * CFG + 0.333 * CFH
- 0.333 * CFJ + 0.333 * CFK + 0.333 * CFL + 0.333 * CGH + 0.333 * CGJ + 0.333 * CGK
- 0.333 * CGL - 0.333 * CHJ - 0.333 * CHK - 0.333 * CHL + 0.333 * CJK + 0.333 * CJL
+ 0.333 * CKL + 0.333 * DEF - 0.333 * DEG - 0.333 * DEH + 0.333 * DEJ + 0.333 * DEK
+ 0.333 * DEL + 0.333 * DFG + 0.333 * DFH - 0.333 * DFJ + 0.333 * DFK - 0.333 * DFL
+ 0.333 * DGH - 0.333 * DGJ - 0.333 * DGK + 0.333 * DGL + 0.333 * DHJ + 0.333 * DHK
- 0.333 * DHL + 0.333 * DJK + 0.333 * DJL - 0.333 * DKL + 0.333 * EFG - 0.333 * EFH
+ 0.333 * EFJ + 0.333 * EFK - 0.333 * EFL + 0.333 * EGH + 0.333 * EGJ + 0.333 * EGK
+ 0.333 * EGL - 0.333 * EHJ + 0.333 * EHK + 0.333 * EHL - 0.333 * EJK + 0.333 * EJL
+ 0.333 * EKL + 0.333 * FGH + 0.333 * FGJ - 0.333 * FGK - 0.333 * FGL + 0.333 * FHJ
- 0.333 * FHK + 0.333 * FHL - 0.333 * FJK + 0.333 * FJL + 0.333 * FKL - 0.333 * GHJ
+ 0.333 * GHK + 0.333 * GHL + 0.333 * GJK - 0.333 * GJL + 0.333 * GKL + 0.333 * HJK
+ 0.333 * HJL + 0.333 * HKL - 0.333 * JKL

As you can see, you really don't want to try to figure out by hand whether any of these interactions are confounded with the main effect. But if you look closely, you can also see that the main effect of A is only aliased with interactions of other factors. Thus, you can use these designs to estimate all of the main effects without a problem. That being said, if you think you will have any interactions at all, you're better off running a few more runs in a fractional factorial. During assay development, it's rare not to have any interactions. In my experience, these types of designs are most often used in robustness experiments where you expect very few of the factors to be significant. In those cases, a follow-up experiment can be used to further investigate the presence of interactions after the Plackett-Burman design has been used to eliminate most of the factors from consideration.

1.6 How to pick a design - blocking, resolution, and power

Let's say you have an experimental system in mind. You have identified the factors you want to investigate. How do you get started? First, we have to decide on a low and a high level setting for each of your factors. This is where some of the "art" of DOE comes into play, and why you always need a subject matter expert involved in the design phase.
The best guidance I can give you is to set the levels of your factors aggressively, but not too aggressively. Helpful, huh? Let's break down that statement. What exactly does it mean to set your levels aggressively? Imagine that you have a response that increases as you move from the low to the high level of one of the factors (Figure 1.11).
Figure 1.11 - Picking the correct levels for a factor.

In a screening design, you will usually run just two levels of a factor (and sometimes a center point). If you picked the two levels shown in red in Figure 1.11, you may not be able to detect a change in the response. Instead, it makes more sense to pick the two green levels; you would probably be able to detect the difference between them. Remember that the point is not to optimize the settings of the factor, just to identify the factors that are actually impacting your system. Also notice how the response continues to change outside of the green levels. You don't want to pick levels that sit on the edge of the area you have explored in the past. Select levels aggressively, but not too aggressively.

Another problem that often occurs in DOE designs for assay development is that you have to spread the testing across several operators, days, lots of reagents, etc., and you're worried that these changes will affect your responses. DOE designs can take care of those problems using a concept called blocking. Blocking essentially adds another factor for each of these "nuisance variables" to your model. The difference between these factors and your "regular" ones is that the blocking factors are not analyzed for significance. Instead, any effects due to them are subtracted from the other responses so that they don't mask the effects of the factors you are really interested in. Each blocking variable uses up one available factor that you can estimate, so you need to make sure you have enough power in your design. If your design has a complicated aliasing structure, blocking makes it even worse, since you've now added yet another factor. Be judicious in your design and keep your number of blocks low.

Resolution is another concept that is important to understand when picking a screening design.
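To make the blocking idea concrete, here is a minimal Python sketch (standard library only; the two-block assignment via the ABC interaction is the classic textbook scheme, not an example taken from this book). It splits a full 2^3 factorial into two blocks — say, two days or two operators — by confounding the block with the three-factor interaction ABC, so the block "uses up" only that one estimable effect:

```python
from itertools import product

# Full 2^3 factorial at coded levels -1/+1. Assign each run to a
# block based on the sign of the ABC interaction, so the block is
# confounded with ABC rather than with any main effect or
# two-factor interaction.
runs = list(product([-1, 1], repeat=3))
for a, b, c in runs:
    block = 1 if a * b * c == 1 else 2
    print(f"A={a:+d}  B={b:+d}  C={c:+d}  ->  block {block}")
```

Within each block, every factor still appears at both levels equally often, so a day-to-day shift moves a whole block up or down and can be subtracted out without distorting the main effects or two-factor interactions.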
Common factorials can be categorized into one of four types: resolution III, resolution IV, resolution V and higher, and saturated designs. These categories tell you
what the aliasing structures of these designs are. In a resolution III design, the main effects are aliased with two-factor interactions. In a resolution IV design, two-factor interactions are aliased with other two-factor interactions, and main effects are aliased with three-factor interactions. In a resolution V design, two-factor interactions are aliased with three-factor interactions, and main effects are aliased with four-factor interactions.1 A saturated design sits at the other extreme: every run is used to estimate a main effect, so the main effects are aliased with strings of interactions (the Plackett-Burman designs from the previous section are an example). How then do you use this information? If you're interested in estimating the main effects and you're worried about aliasing them with two-factor interactions, you would pick a resolution IV design. However, if you expect that two-factor interactions are rare, you may be able to get away with a resolution III design. Most DOE software has a handy table to help you select designs. Figure 1.12 is an example of such a table.

Figure 1.12 - Table with possible fractional factorial designs, by number of factors

1. There's an easy way to remember these aliasing relationships, called the finger rule. If you hold up the same number of fingers as the name of the resolution (i.e. three fingers for a resolution III design), you will be able to split them into groups that show the aliasing structure. Three fingers can be divided into a pair and a single finger; therefore, the main effects (the single finger) are aliased with two-factor interactions (the pair). This also works with resolution IV and V designs. Try it and see for yourself.
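To see resolution III aliasing concretely, here is a small Python sketch (standard library only; the -1/+1 coding and the generator C = AB are the conventional textbook construction, assumed rather than taken from the book's examples). It builds the half-fraction 2^(3-1) and confirms that the column for main effect C is identical to the column for the AB interaction:

```python
from itertools import product

# Full factorial in A and B at coded levels -1/+1, then generate C
# from the defining relation C = A*B. The four resulting runs are
# the 2^(3-1) half-fraction, a resolution III design.
runs = [(a, b, a * b) for a, b in product([-1, 1], repeat=2)]

for a, b, c in runs:
    print(f"A={a:+d}  B={b:+d}  C={c:+d}")

# In every run, the C column equals the elementwise product of the
# A and B columns, so C is aliased with the AB interaction:
assert all(c == a * b for a, b, c in runs)
```

Because the two columns are indistinguishable, the "C effect" you estimate is really the sum of the C main effect and the AB interaction — exactly the main-effect/two-factor-interaction aliasing that defines a resolution III design.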
If you study the table, you'll see that the lower the resolution, the fewer runs you need to perform to estimate the effects you care about. Intuitively, this makes sense: the more information you have (runs), the less confounding (aliasing) you have in the results. The table is also useful in determining how fractionated each design is. For example, a 2^5 full factorial has 32 runs. The first half fraction of this factorial has 16 runs, has resolution V, and is denoted 2^(5-1). A quarter fraction is denoted 2^(5-2), has 8 runs, and is a resolution III design. By studying the table, you can quickly familiarize yourself with how this nomenclature works.

You might also notice that some designs have the same resolution and number of factors, yet one has only half as many runs as the other (2^(7-2) and 2^(7-3) are one such pair). How is that possible, and why would you ever run the design with more runs? It comes down to statistical power. Power is what gives you the confidence that you actually detected a response. The way the analysis works, the more power you have, the more confident you can be that you're not missing a significant effect. In a low-power design, the effect of a factor has to be much higher than the variability in the measurement in order to be detected.

To complicate matters, another way to increase power is to add replicates of a design. Replication decreases the uncertainty, and thus the variability, of your responses. It's not always clear whether it's better to run a less fractionated design or just replicate a more fractionated one. If you find yourself in this conundrum, your best bet is to talk to a statistician. In most cases, I would personally choose the less fractionated design without replication, since it includes runs with more combinations of all the different factors, and I therefore have more confidence in the analysis of interactions.
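As a back-of-the-envelope illustration of why more runs mean more power, here is a sketch (assuming independent runs with a common standard deviation sigma; the 0.05 O.D. value is invented for illustration). In a balanced two-level design, an effect estimate is the difference between the means of the N/2 high-level runs and the N/2 low-level runs, so its standard error is 2*sigma/sqrt(N):

```python
import math

def effect_standard_error(sigma: float, n_runs: int) -> float:
    """SE of an effect estimate in a balanced two-level design.

    The effect is (mean of the n/2 high-level runs) minus
    (mean of the n/2 low-level runs), so its variance is
    sigma^2/(n/2) + sigma^2/(n/2) = 4 * sigma^2 / n.
    """
    return 2 * sigma / math.sqrt(n_runs)

sigma = 0.05  # assumed run-to-run standard deviation (invented, O.D. units)
for n in (8, 16, 32):
    print(f"{n:2d} runs -> effect SE = {effect_standard_error(sigma, n):.4f}")
```

Whether the extra runs come from a less fractionated design or from replicating a more fractionated one, the standard error shrinks like 1/sqrt(N) — which is why both routes buy power, and why the choice between them comes down to the aliasing considerations discussed above.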
To summarize, when picking a design, you have to consider how many factors you have, whether you need to add blocks to the design, what resolution you need in your aliasing structure, and how many runs you are willing to perform. Once you have made those decisions, you can use a DOE software package to check whether you will have enough power to detect the differences you expect in your factors. If you don't, you can either switch to a higher resolution design or add more replicates to your current design.

1.7 Response Variables - what to measure?

Once you have your design, there's one last decision to make before you actually execute your runs: what to measure. Usually, you have at least one measure in mind when you start thinking about your experiments. In assay development, some common examples include signal strength, dynamic range, curve parameters (such as slope), replicate variability, background, signal to noise, etc.
It may seem like it would be a lot of work to optimize all of those responses, but DOE makes it extremely easy. As you will see in the next chapter, once you do your runs, the analysis is essentially free. So I would encourage you to think up front about measuring as many things as possible, even if you don't intend to analyze them right away. I've seen many examples where a response you didn't think would matter suddenly becomes important. If you have the data, the analysis is usually much less painful than having to go back and generate more data.

Another reason for measuring as many parameters as possible is that some responses may contradict others. For example, optimizing only for signal strength may also increase background; the optimum setting for both may be a compromise. You would not know that unless you analyze both responses at the same time. As you will see in the next chapter, most DOE software packages have optimization algorithms that will help you find the optimum settings across all your responses.
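The book doesn't commit to a particular algorithm here, but one common approach in DOE packages is the desirability function: each response is mapped onto a 0-to-1 desirability scale, and the factor settings that maximize the geometric mean of the desirabilities are chosen. Here is a minimal sketch (the signal and background targets are invented for illustration):

```python
def desirability_max(y, low, high):
    """'Larger is better' response: 0 at or below low, 1 at or above high."""
    return min(max((y - low) / (high - low), 0.0), 1.0)

def desirability_min(y, low, high):
    """'Smaller is better' response: 1 at or below low, 0 at or above high."""
    return min(max((high - y) / (high - low), 0.0), 1.0)

def overall(signal, background):
    # Geometric mean of the individual desirabilities: a setting that
    # scores 0 on any one response is vetoed outright.
    d_signal = desirability_max(signal, low=0.5, high=2.0)      # invented O.D. targets
    d_background = desirability_min(background, low=0.05, high=0.3)
    return (d_signal * d_background) ** 0.5

# Compare two hypothetical factor settings:
print(overall(signal=1.9, background=0.25))  # strong signal, but high background
print(overall(signal=1.4, background=0.08))  # compromise setting
```

With these (made-up) targets, the compromise setting scores higher overall even though its raw signal is weaker — which is exactly the kind of trade-off you would miss by optimizing one response at a time.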
