A Practical Guide to
Design of Experiments (DOE)
for Assay Developers

Daniel Joelsson

© 2010 Daniel Joelsson All rights reserved
Introduction
Why write this book? Those of you that know me probably aren't surprised. I've been a
strong advocate for a more systematic approach to assay development for many years, to
the point of annoyance for many of my colleagues. I feel that using Design of Experiments
(DOE) in assay development is the most efficient way to develop quality assays in the least
amount of time possible. Then why aren't scientists using DOE more widely? I think part of
the problem is the lack of understanding of what DOE really is and why it's useful. I looked
for a good introduction to DOE for assay development scientists and I couldn't find one. Most
of the texts out there are engineering and process focused. Therefore, I'm writing my own
guide for us scientists and assay developers.
I should state up front that I'm not a statistician, I'm a scientist. This guide is not
intended for statisticians and I will not try to turn you into one either. However, some
basic understanding of statistics is important to fully appreciate DOE. I will also make
some assumptions that you are familiar with basic biology and biochemistry techniques.
My background is primarily in the design of bioassays and immunoassays for vaccines and
biotherapeutics. Most of my examples will come out of those disciplines.
I will do my best to explain any statistical topics in a relevant and pragmatic way, but in case
you want more, these are two resources that I have found to be excellent texts on statistics
for scientists:
The Biostatistics Cookbook - The Most User-Friendly Guide for the Bio/Medical Scientist by
Seth Michelson and Timothy Schofield
Biometry: The Principles and Practices of Statistics in Biological Research by Robert Sokal
and F. James Rohlf
This book is meant to be a guide for beginning to intermediate DOE users, and slanted
completely towards scientists that do assay development. If you would like a more in depth
discussion about DOE and the statistics behind it, Design & Analysis of Experiments by
Douglas C. Montgomery is a great resource to get you started.
While I will try to teach you enough about DOE to allow you to design and analyze your
own experiments, a statistician familiar with assay development can be a valuable resource.
What this book will give you is an understanding of the concepts and language of DOE so that
you can easily communicate with your statistics friends. One of my goals is to increase the
overlap in understanding on both sides. You might want to give a copy of this book to your
statistician as well, especially if they are not familiar with the pitfalls of assay design.

Before we start, a few thanks are in order. Thanks to Tim Schofield for schooling me on
statistics from the first day we met. Thanks to Edith Senderak for all of the mentoring and
collaborations over the years. Thanks to Joe Pigeon for sparking my initial interest in DOE
as well as introducing me to split-plot designs (which you will see are perfect for developing
assays in 96 well plates). Thanks to all of my colleagues for trusting me enough to allow me
to help with their DOE designs.
This book is being published under a Creative Commons license. You are free to distribute
this work to anyone you think would be interested, free of charge. You may not use any
portion of this book for any commercial purpose without prior permission. You may create
derivatives of this work, as long as these derivatives adhere to the same licence restrictions.
For complete license information, please visit:
http://creativecommons.org/licenses/by-nc-sa/3.0/

Chapter 1 - Screening Designs
1.1 One Factor At a Time (OFAT) and Interactions
Think back to your early years of scientific training for a moment. If your experience was
anything like mine, you were first taught to do experiments using the scientific method. It
went something like this: first generate a hypothesis based on past knowledge, design an
experiment to test the hypothesis, analyze the data and then refine the hypothesis. Repeat
until we get to the answer.
This is the correct way to do science. Unfortunately, it is in the second step where we usually
learn some bad habits. We were told to vary one factor at a time (OFAT) and hold everything
else constant. At first glance, OFAT makes a lot of sense, but it can be misleading. Let's
look at a simple assay development experiment.
Imagine that you're trying to develop an ELISA to detect a bacterial contaminant in a sample
from a process development study for a new biotherapeutic. Two of the variables you
assume to be important are the amount of capture antibody you add to your plate and the
time you incubate your sample. Consistent with the OFAT approach, you start out by testing
two different antibody concentrations: 0.2 ug/ml and 1.0 ug/ml while keeping the incubation
time constant at one hour. You then run the assay and faithfully record the results in your
lab notebook (Table 1.1).
Table 1.1
Antibody concentration   Incubation time   Response (O.D.)
0.2 ug/ml                1 hour            0.31
1.0 ug/ml                1 hour            0.97

Great! Obviously, adding more antibody increases the response (it increased from 0.31 to
0.97). Your initial hypothesis that adding more antibody would be beneficial was correct.
Good to know! What if you increased it even further? Let's find out what happens (Table
1.2).

Table 1.2
Antibody concentration   Incubation time   Response (O.D.)
0.2 ug/ml                1 hour            0.31
1.0 ug/ml                1 hour            0.97
1.5 ug/ml                1 hour            0.95

Ok, it appears that adding even more antibody does not further increase the signal. So you
conclude that the response has been saturated somewhere around 1.0 ug/ml.
Since the conditions for the first variable have been optimized, let's look at the second
variable, incubation time. Again, you do a simple experiment increasing the incubation time
to two hours. Since you already know that 1.0 ug/ml of antibody is ideal, you only need to
run one experiment with 1.0 ug/ml of antibody for two hours of incubation. The results are
shown in Table 1.3.
Table 1.3
Antibody concentration   Incubation time   Response (O.D.)
0.2 ug/ml                1 hour            0.31
1.0 ug/ml                1 hour            0.97
1.5 ug/ml                1 hour            0.95
1.0 ug/ml                2 hours           1.72

Even better! Clearly two hours is better than one. What if you increase the incubation time
even further? You try three hours and see what happens (Table 1.4).
Table 1.4
Antibody concentration   Incubation time   Response (O.D.)
0.2 ug/ml                1 hour            0.31
1.0 ug/ml                1 hour            0.97
1.5 ug/ml                1 hour            0.95
1.0 ug/ml                2 hours           1.72
1.0 ug/ml                3 hours           1.74

Again, increasing the variable did not have a further effect on the response. So you conclude
that the optimum settings for this assay are 1.0 ug/ml of antibody and a two-hour incubation.

Since you've now tested and optimized both variables, you're done, right? Well, maybe not.
There is one experiment that was not performed: 0.2 ug/ml of antibody incubated for two
hours. Let's run that experiment and see what happens (Table 1.5).
Table 1.5
Antibody concentration   Incubation time   Response (O.D.)
0.2 ug/ml                1 hour            0.31
1.0 ug/ml                1 hour            0.97
1.5 ug/ml                1 hour            0.95
1.0 ug/ml                2 hours           1.72
1.0 ug/ml                3 hours           1.74
0.2 ug/ml                2 hours           2.62

Whoa! The response is now much higher than what we previously considered optimal.
There are now all kinds of experiments we should probably run to make sure we have all the
answers we need. What happens if we use less than 0.2 ug/ml? What about testing other
incubation times with 0.2 ug/ml?
This example highlights one of the greatest weaknesses of OFAT experiments. You are
very likely to miss an interaction between two or more of the variables. An interaction
occurs when the effects of two variables are not completely independent of each other; i.e.
the response to one variable depends on the level of the other. Since the OFAT
approach (and traditional scientific training) assumes that the effects of all variables are
additive, it doesn't encourage us to test for interactions. The only way to find them is by
luck. Unfortunately, interactions like these happen all the time in the real world.
For scientists trained in linear thinking, interactions can sometimes be hard to visualize.
There's a specific graph called an interaction plot that makes it slightly easier (Figures 1.1
and 1.2). In an interaction plot where the two variables have no interaction the two response
lines will be perfectly parallel (Figure 1.1). The two lines represent the response due to
Factor A at the two different levels of Factor B. Since the lines are perfectly parallel, the
effect of A and B is completely additive. When you increase Factor B it shifts the entire Factor
A line upwards by the same amount. This is the kind of relationship we assume is always in
place in an OFAT experimental design.
If the lines of the interaction plot intersect (or are simply non-parallel), there's an interaction
between the two variables (Figure 1.2). Increasing Factor B no longer just moves the Factor
A line higher. Instead, at the lower concentration of B, the response in Factor A actually
decreases from low to high. Clearly the effects of the two variables are not simply additive.
This is exactly the situation we observed in the assay development example above.
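
To put a number on this, the four corner results from Table 1.5 (0.2 and 1.0 ug/ml, one and two hours) can be turned into effect estimates. What follows is a minimal Python sketch rather than part of the original analysis. The coded -1/+1 labels and the variable names are illustrative assumptions, and an "effect" is taken in the usual sense: the average response at the high level minus the average response at the low level.

```python
# Corner results from Table 1.5, keyed by coded levels
# (antibody: -1 = 0.2 ug/ml, +1 = 1.0 ug/ml; time: -1 = 1 hour, +1 = 2 hours).
runs = {
    (-1, -1): 0.31,
    (+1, -1): 0.97,
    (-1, +1): 2.62,
    (+1, +1): 1.72,
}

def effect(sign_of):
    """Average response where sign_of(levels) is +1 minus the average where it is -1."""
    high = [y for levels, y in runs.items() if sign_of(levels) > 0]
    low = [y for levels, y in runs.items() if sign_of(levels) < 0]
    return sum(high) / len(high) - sum(low) / len(low)

antibody_effect = effect(lambda lv: lv[0])
time_effect = effect(lambda lv: lv[1])
# The interaction column is the product of the two coded factor columns.
interaction_effect = effect(lambda lv: lv[0] * lv[1])

print(f"antibody effect:    {antibody_effect:+.2f}")
print(f"time effect:        {time_effect:+.2f}")
print(f"interaction effect: {interaction_effect:+.2f}")
```

An interaction estimate that is large compared with the main effects is the numerical counterpart of non-parallel lines in an interaction plot.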

Figure 1.1 - Interaction plot for two factors without an interaction

Figure 1.2 - Interaction plot for two factors with an interaction

1.2 Factorial experiments - foundations of DOE
I hope I convinced you in the previous section that interactions between variables can be an
important factor to the success of your experiments and that the OFAT approach makes it
hard to find them unless you're willing to do a lot of work.
This section will introduce some of the basics of Design of Experiments (DOE). DOE is a more
systematic approach to experimentation than the OFAT approach. The goal is to test all of
the variables in a system in a set of multi-factorial experiments, allowing us to optimize all of
them and find interactions at the same time.
The assay development example in the last section made it clear that it would have been
a good idea to run all four combinations of the two variables right away. This type of
experiment is called a factorial experiment. A picture of the four different experiments (or

"runs") shows how we start exploring the "design space" of our experimental system (Figure
1.3). As you can see, we have covered the four corners of the space.
The runs can also be shown in a table such as Table 1.6. During each run, a factor can take
on a low value (depicted as "-") or a high value (depicted as "+"). With two factors there are
four such combinations possible. By performing all the possible combinations of the factors,
we will be able to tell not only if each of the factors is important (i.e. changing it has an
effect on the response), but also if any interactions are present.

Figure 1.3 Pictorial depiction of a 2^2 factorial experiment.
Table 1.6
Run #    Factor A   Factor B
Run 1    +          +
Run 2    -          +
Run 3    +          -
Run 4    -          -

Let's expand the same thinking to three factors (Figure 1.4 and Table 1.7). With three factors
we have to complete eight runs to run every combination of high and low for all three factors.
But again, you get a lot of information out of those runs: you will know if any or all of the
factors influence the response, if any two-factor interactions exist (AxB, AxC, BxC), and if
there's a three-factor interaction (AxBxC).

Figure 1.4 Pictorial depiction of a 2^3 factorial experiment
Table 1.7
Run #   Factor A   Factor B   Factor C
1       +          +          +
2       -          +          +
3       +          -          +
4       -          -          +
5       +          +          -
6       -          +          -
7       +          -          -
8       -          -          -

This exercise can be continued for four factors and so on. Performing eight runs for three
factors might seem reasonable, but as you may have figured out already, as we add factors,
the number of runs increases exponentially (Table 1.8). Since most assay systems contain
more than three or four factors, factorial experiments quickly become too large to feasibly
perform. With this in mind, it's easy to see how OFAT is still widely practiced. Factorial
experiments rapidly become unwieldy as the number of factors goes up.
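
For readers who like to see the run lists generated rather than written out by hand, here is a small Python sketch. It is an illustration only: the function name `full_factorial` is made up for this example, the coded values -1 and +1 stand for the "-" and "+" in the run tables, and the order in which the runs print will not necessarily match the order of the tables.

```python
from itertools import product

def full_factorial(factor_names):
    """Return every combination of high (+1) and low (-1) for the given factors."""
    return [dict(zip(factor_names, levels))
            for levels in product((+1, -1), repeat=len(factor_names))]

# All eight runs for three factors.
names = ["A", "B", "C"]
design = full_factorial(names)
for run_number, run in enumerate(design, start=1):
    print(run_number, " ".join("+" if run[name] > 0 else "-" for name in names))

# The run count doubles with every factor added.
for k in range(1, 11):
    print(k, "factors ->", len(full_factorial([f"X{i}" for i in range(k)])), "runs")
```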

Table 1.8
Number of factors   Number of runs
1                   2
2                   4
3                   8
4                   16
5                   32
6                   64
7                   128
8                   256
9                   512
10                  1024

But, since OFAT experiments make it very difficult to find those elusive interactions, we need
an alternative approach. Luckily DOE provides that in the form of fractional factorials, the
topic of the next section.
However, before I get to that topic, I want to address a common question. How come we
only use two levels (high and low) for each factor? Obviously, it would be risky to make the
assumption that any response is perfectly linear between two points. It might be that the
response has a complex shape in that space.
In the example in Figure 1.5, the optimum response is actually somewhere between our low
and high settings. This is where DOE differs from traditional experimentation. The goal of
these factorial experiments is not to optimize the response completely, but to screen for the
few factors that actually affect the response. These factors are then carried forward in a set
of optimization experiments.
Not all factors are likely to be important in every system, so we should do fairly low
resolution experiments to identify the ones that truly matter. As you will see, screening
experiments have relatively fewer runs per factor than optimization experiments. We can
afford to include more factors in the initial experiments to make sure that we don't miss any
of the critical few.

Figure 1.5 - Non linear response

1.3 Fractional Factorials
In the last section, we learned that factorial experiments are useful designs to pick up both
main and interaction effects in our experimental system. But for more than a few factors,
they quickly become too large to be feasible. In most assay development experiments,
we can easily identify ten or more factors that could be important (reagent concentrations,
incubation temperatures, incubation times etc.). We need a different approach. There is a
set of DOE designs called fractional factorials that meet this need nicely.
Let's go back to a simple three factor factorial experiment (Figure 1.6). In this experiment,
we have three factors. Since we are doing screening experiments to identify the important
factors, we will only be testing two levels per factor. Now imagine taking the three
dimensional space in Figure 1.6 and condensing it down to two dimensions (Figure 1.7). In
effect, we are now looking at the front of the cube so that we can't see the different levels of
factor C any longer. We can take this thinking one step further and compress the resulting
square into a single dimension (Figure 1.8).

Figure 1.6 Three factor screening experiment.

Figure 1.7 Three factor screening experiment compressed into two dimensions.

Figure 1.8 Three factor screening experiment compressed into one dimension.
When we look at our experimental space this way, something interesting happens. It
becomes clear that we have actually run four replicates of each of the two levels for factor
A. This property of factorial designs is called hidden replication. While each of the eight
experimental runs has a different combination of levels of all three factors, each factor's
level is actually replicated four times.
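
Hidden replication is easy to check by counting. The sketch below is illustrative Python, with made-up variable names and coded -1/+1 levels. It "compresses" a three-factor full factorial onto one factor at a time by ignoring the other two and counts how many runs sit at each level.

```python
from itertools import product
from collections import Counter

# The eight runs of a 2^3 full factorial, as (A, B, C) coded levels.
design = list(product((-1, +1), repeat=3))

# Look at one factor at a time and ignore the settings of the other two.
replicates = {
    name: Counter(run[index] for run in design)
    for index, name in enumerate("ABC")
}

for name, counts in replicates.items():
    print(f"Factor {name}: {counts[-1]} runs at low, {counts[+1]} runs at high")
```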

Note that it was completely arbitrary that we chose factor A for this analysis. The same thing
happens if we compress the design for either factor B or C. This is a useful characteristic
of factorial designs (and fractional factorial designs as we will see in a little bit) called
rotatability. Because the design is perfectly symmetrical, we can in effect assign any factor
to be A, B, or C and it doesn't matter. We can get away with this because instead of using
the actual numbers for the setting for each factor, we are going to use coded numbers of -1
or +1. This might seem confusing for now, but bear with me. It will make more sense when
we discuss how we analyze these designs. What it does for us now is to scale the levels of
each factor to have the same distance, thus making the design rotatable.
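
The coding itself is just a linear rescaling. As a hedged illustration (the function name `to_coded` is invented here, and the settings are borrowed from the ELISA example in section 1.1), the usual conversion subtracts the midpoint of the range and divides by half the range:

```python
def to_coded(actual, low, high):
    """Map an actual factor setting onto the coded scale where low = -1 and high = +1."""
    midpoint = (high + low) / 2
    half_range = (high - low) / 2
    return (actual - midpoint) / half_range

# Antibody concentration in ug/ml, incubation time in hours.
print(to_coded(0.2, low=0.2, high=1.0))   # low antibody setting
print(to_coded(1.0, low=0.2, high=1.0))   # high antibody setting
print(to_coded(1.5, low=1.0, high=2.0))   # a time halfway between one and two hours
```

After coding, a step from low to high is the same size for every factor, whatever its original units.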
There is also a second characteristic of these designs worth noting. The reason we could
compress the figure down into a single dimension is because of a property called
orthogonality. Notice how the axes for the three factors in Figure 1.6 are at 90 degree angles
to each other? This means that when we compress the design down to a single dimension,
the effects of all the other factors cancel out, allowing us to estimate just the effect of the one
variable we want. This is what we mean by a multi-variable experiment. Unlike in an OFAT
study, we truly can vary all of our variables at the same time and still be able to distinguish
effects from each.
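
In the coded -1/+1 notation, orthogonality has a simple arithmetic meaning: multiply two factor columns together run by run, add up the products, and the total is zero. The following Python sketch checks this for a three-factor full factorial. The names are illustrative, not taken from any DOE package.

```python
from itertools import combinations, product

# Coded 2^3 full factorial: each run is an (A, B, C) tuple of -1/+1 values.
design = list(product((-1, +1), repeat=3))
columns = {name: [run[i] for run in design] for i, name in enumerate("ABC")}

def dot(u, v):
    """Sum of the run-by-run products of two coded columns."""
    return sum(x * y for x, y in zip(u, v))

# Two columns are orthogonal when this sum is zero.
for first, second in combinations("ABC", 2):
    print(first, second, dot(columns[first], columns[second]))
```

A zero sum means that at the high level of one factor, the other factor sits at its high and low levels equally often, which is why its effect cancels out.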
Looking for hidden replication in our experimental space showed us that each factor was
actually replicated four times. Since we probably don't need that many replicates to tell if
the response changes from the low to the high setting, let's eliminate some of these extra
runs and see what happens.
One of the possible ways of doing this is shown in figure 1.9. As you can see, we have
eliminated half of the runs. The amazing thing is that we are still doing two replicates of each
level for all three factors. With only four runs we can estimate the effects of three variables
with two replicates at each setting. That's a pretty good return for your experimental
investment.
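
Here is a short Python sketch of one way to pick such a half. It assumes the runs kept are those where the product of the three coded levels is +1. That is a common textbook choice, and it corresponds to the combinations a, b, c and abc in the sign table in section 1.4, but treat the selection rule and the variable names as assumptions of this example.

```python
from itertools import product
from collections import Counter

full = list(product((-1, +1), repeat=3))                      # all eight (A, B, C) runs
half = [run for run in full if run[0] * run[1] * run[2] == +1]

print("runs kept:", half)
for index, name in enumerate("ABC"):
    counts = Counter(run[index] for run in half)
    print(f"Factor {name}: {counts[-1]} runs at low, {counts[+1]} runs at high")
```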

Figure 1.9 Three factor fractional factorial experiment (half-fraction).
We could try and eliminate another two runs (Figure 1.10), but this time we have pushed
things too far. Now we can't estimate factor C any longer. For three factors we can only
eliminate half of the runs and still have a viable design. However, for designs with
more than three factors we can eliminate more than half of the runs and still estimate all of
the main effects. In fact, the more factors we have, the more runs we can eliminate.

Figure 1.10 Three factor fractional factorial (quarter fraction).

1.4 Statistical Power and Aliasing in Fractional Factorials
You're probably thinking that if we just eliminated half of our work, there has to be a catch.
You're right. An immediate impact is that we have lowered the number of replicates per data
point from four to two. This has the outcome of lowering the statistical power of our design.
We would need a larger effect in the response when going from low to high in order to see it.
But that might be a trade-off you are willing to take, especially if you only care about large
(relative to the experimental system) effects.
How do you know how much of a trade-off you are making? That's a function of the
underlying variability of the system you are looking at. If the variability is large and the
effect you expect is small, you will need more replicates. Power calculations are described
thoroughly in most statistics books (see the introduction to this book for sources), but you
probably won't have to calculate it by hand. All of the specialized DOE software on the
market will do this calculation for you. I'll talk more about these packages when I discuss
the analysis of screening designs in the next chapter. For now, let's assume that you have
enough power to estimate the effects you're expecting to see.
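
To get a feel for the trade-off without any special software, here is a rough Python sketch. It is deliberately simplified and is not how DOE packages do the calculation: it assumes the run-to-run standard deviation (`sigma`) is known and uses a normal approximation, whereas real packages typically work from a t-distribution with the design's error degrees of freedom. The function name and the example numbers are invented for illustration.

```python
from statistics import NormalDist

def approximate_power(effect, sigma, total_runs, alpha=0.05):
    """Normal-approximation power for detecting a two-level factor effect.

    effect: true difference between the high-level and low-level averages
    sigma: run-to-run standard deviation (assumed known)
    total_runs: runs in the design, half of them at each level of the factor
    """
    # Each average is based on total_runs / 2 results.
    standard_error = 2 * sigma / total_runs ** 0.5
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)
    # Chance that the estimate clears the two-sided threshold in the direction
    # of the true effect (the tiny opposite tail is ignored).
    return NormalDist().cdf(abs(effect) / standard_error - z_critical)

for runs in (8, 4):
    power = approximate_power(effect=0.5, sigma=0.2, total_runs=runs)
    print(f"{runs} runs: power is roughly {power:.2f}")
```

With the same effect size and variability, the eight-run design has more power than the four-run design.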
The other penalty we take when eliminating runs from our design is to create what is termed
aliasing of effects. In the discussion of full factorial experiments, I explained the concept of
an interaction between two factors. A consequence of fractional factorials is the confounding
of interactions with main effects, or with each other.
Let me explain what that means in a little more detail. Let's look back at our three-factor
fractional factorial in figure 1.9 again. Recall in our discussion in section 1.1 that if we
wanted to estimate an interaction between factors A and B we would have to run all four of
the "corners" of the square diagram. Unfortunately, in our fractional factorial, two of those
corners are run at the lower level of factor C and the other two are run at the higher level of
factor C. The design is no longer orthogonal with respect to interaction AB and factor C. The
same is the case for the interactions BC and AC. In the nomenclature of DOE we would say
that the main effect in A is aliased with the interaction BC. The notation we use to describe
this relationship is as follows:
A = A + BC
B = B + AC
C = C + AB
The effect we observe due to factor A is a combination of A and the interaction BC and we
can't tell how much is contributed by each.
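
A small simulation makes the point concrete. In the Python sketch below, everything about the "true" system is hypothetical: a main effect of A of 0.40 and a BxC interaction of 0.25 were picked only for illustration. The half fraction is assumed to be the runs where the product of the three coded levels is +1.

```python
from itertools import product

full = list(product((-1, +1), repeat=3))
half = [run for run in full if run[0] * run[1] * run[2] == +1]

# Hypothetical "true" system: a main effect of A plus a B x C interaction.
TRUE_A = 0.40    # change in response going from low A to high A
TRUE_BC = 0.25   # size of the B x C interaction effect

def response(a, b, c):
    return 1.0 + (TRUE_A / 2) * a + (TRUE_BC / 2) * b * c

def estimated_effect_of_a(runs):
    high = [response(*run) for run in runs if run[0] > 0]
    low = [response(*run) for run in runs if run[0] < 0]
    return sum(high) / len(high) - sum(low) / len(low)

# In the half fraction the A column and the B*C column are identical...
print([run[0] for run in half] == [run[1] * run[2] for run in half])
# ...so the estimate for A absorbs the interaction.
print("full factorial:", round(estimated_effect_of_a(full), 2))
print("half fraction: ", round(estimated_effect_of_a(half), 2))
```

The full factorial recovers the effect of A on its own, while the half fraction returns the sum of A and BC.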

This relationship can also be described using the notation in table 1.9. Since this is not a
statistics book, I will take some liberties with explaining this table. For a more complete
discussion please see the book by Montgomery listed in the introduction or any other
statistics-based DOE book.
Table 1.9 is similar to the run tables in previous sections, with some differences. Instead
of run # in the first column, it is now labeled "treatment combination". I have also added
columns for all of the possible interactions and one labeled "I" which stands for identity.
While this table may not make a lot of sense right now, there are a few things we can take
away from it.
In chapter 2, we will learn how to build a mathematical model that relates the settings of
the individual factors to the level of the output response. In that model, each factor will be
preceded by a constant. The value of that constant will be solved by using the + and - signs
in each column. For now, look carefully at the table and you will notice that each column has
a unique pattern. We can therefore solve for the constant for each term in the model.
On the bottom half of the table, I have shaded in grey the combinations that we eliminated
in figure 1.9. Let's take a closer look at the remaining pluses and minuses in the top of
the table. If we said that these signs represent coefficients in an equation for estimating the
effects, then you will notice that the coefficients in column A are the same as those in column
BC (in the top half of the table). The same goes for column B with AC and column C with
AB. Column ABC is identical to column I, which means that we are no longer able to estimate
a three factor interaction.
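Table 1.9
Treatment combination   I   A   B   C   AB   AC   BC   ABC
a                       +   +   -   -   -    -    +    +
b                       +   -   +   -   -    +    -    +
c                       +   -   -   +   +    -    -    +
abc                     +   +   +   +   +    +    +    +
-----------------------------------------------------------
ab                      +   +   +   -   +    -    -    -
ac                      +   +   -   +   -    +    -    -
bc                      +   -   +   +   -    -    +    -
(1)                     +   -   -   -   +    +    +    -
The four rows below the dashed line (ab, ac, bc and (1)) are the shaded combinations.
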
Whether you're more comfortable thinking about aliases graphically or by using the table, the
end result is the same. When we reduce the number of runs in a factorial, the number of
aliases increases and with it we lose the ability to estimate some of the effects independently.

As you might have noticed, we could just as easily have eliminated the other half of the runs
in the experiments. The aliasing structure now takes on a negative relationship (i.e. A = A - BC etc.), but the principle is the same. We can no longer estimate the effects independently.
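
The sign flip can be verified directly. This Python sketch assumes the "other half" is the set of runs where the product of the three coded levels is -1, and it checks that each main effect column is the exact negative of the interaction column it is aliased with. Names and layout are illustrative.

```python
from itertools import product

full = list(product((-1, +1), repeat=3))
other_half = [run for run in full if run[0] * run[1] * run[2] == -1]

# Each entry: factor name, its column index, and the two columns of its aliased interaction.
pairs = (("A", 0, (1, 2)), ("B", 1, (0, 2)), ("C", 2, (0, 1)))

for name, main_index, (i, j) in pairs:
    main = [run[main_index] for run in other_half]
    negated_interaction = [-(run[i] * run[j]) for run in other_half]
    print(name, "column equals minus its aliased interaction:", main == negated_interaction)
```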

1.5 Other Screening Designs
While fractional factorial designs are the easiest screening designs to understand due to their
simple aliasing structures, they are not the only designs available if you are looking to identify
important factors. In fact, there are whole families of designs that aim to minimize the
number of runs while still being able to identify main effects. I'll discuss one of the more
popular ones in brief detail here, and get to the more advanced designs in Chapter 4.
Plackett-Burman designs are a family of two-level screening designs that allow you to use
the fewest runs possible for situations where you have 11, 19, 23, 27, or 31
factors. In fact, you only need one more run than the number of factors you have.
These designs have extremely complex aliasing structures. Here's an example of the aliasing
structure for the main effect of factor A in an 11 factor design:
[A] = A - 0.333 * BC - 0.333 * BD - 0.333 * BE + 0.333 * BF - 0.333 * BG
- 0.333 * BH + 0.333 * BJ + 0.333 * BK - 0.333 * BL + 0.333 * CD - 0.333 * CE
- 0.333 * CF + 0.333 * CG - 0.333 * CH + 0.333 * CJ - 0.333 * CK - 0.333 * CL
+ 0.333 * DE + 0.333 * DF - 0.333 * DG - 0.333 * DH - 0.333 * DJ - 0.333 * DK
- 0.333 * DL - 0.333 * EF - 0.333 * EG - 0.333 * EH - 0.333 * EJ + 0.333 * EK
+ 0.333 * EL - 0.333 * FG + 0.333 * FH - 0.333 * FJ - 0.333 * FK - 0.333 * FL
+ 0.333 * GH - 0.333 * GJ + 0.333 * GK - 0.333 * GL - 0.333 * HJ - 0.333 * HK
+ 0.333 * HL - 0.333 * JK + 0.333 * JL - 0.333 * KL - 0.333 * BCD + 0.333 * BCE
- 0.333 * BCF + 0.333 * BCG + 0.333 * BCH + 0.333 * BCJ + 0.333 * BCK
- 0.333 * BCL + 0.333 * BDE + 0.333 * BDF + 0.333 * BDG - 0.333 * BDH
- 0.333 * BDJ + 0.333 * BDK + 0.333 * BDL + 0.333 * BEF - 0.333 * BEG
+ 0.333 * BEH - 0.333 * BEJ + 0.333 * BEK - 0.333 * BEL - 0.333 * BFG
+ 0.333 * BFH + 0.333 * BFJ + 0.333 * BFK + 0.333 * BFL - 0.333 * BGH
+ 0.333 * BGJ + 0.333 * BGK + 0.333 * BGL + 0.333 * BHJ - 0.333 * BHK
+ 0.333 * BHL + 0.333 * BJK + 0.333 * BJL - 0.333 * BKL + 0.333 * CDE
+ 0.333 * CDF + 0.333 * CDG + 0.333 * CDH + 0.333 * CDJ - 0.333 * CDK
+ 0.333 * CDL - 0.333 * CEF - 0.333 * CEG + 0.333 * CEH + 0.333 * CEJ
- 0.333 * CEK + 0.333 * CEL + 0.333 * CFG + 0.333 * CFH - 0.333 * CFJ
+ 0.333 * CFK + 0.333 * CFL + 0.333 * CGH + 0.333 * CGJ + 0.333 * CGK
- 0.333 * CGL - 0.333 * CHJ - 0.333 * CHK - 0.333 * CHL + 0.333 * CJK
+ 0.333 * CJL + 0.333 * CKL + 0.333 * DEF - 0.333 * DEG - 0.333 * DEH
+ 0.333 * DEJ + 0.333 * DEK + 0.333 * DEL + 0.333 * DFG + 0.333 * DFH
- 0.333 * DFJ + 0.333 * DFK - 0.333 * DFL + 0.333 * DGH - 0.333 * DGJ

- 0.333 * DGK + 0.333 * DGL + 0.333 * DHJ + 0.333 * DHK - 0.333 * DHL
+ 0.333 * DJK + 0.333 * DJL - 0.333 * DKL + 0.333 * EFG - 0.333 * EFH
+ 0.333 * EFJ + 0.333 * EFK - 0.333 * EFL + 0.333 * EGH + 0.333 * EGJ
+ 0.333 * EGK + 0.333 * EGL - 0.333 * EHJ + 0.333 * EHK + 0.333 * EHL
- 0.333 * EJK + 0.333 * EJL + 0.333 * EKL + 0.333 * FGH + 0.333 * FGJ
- 0.333 * FGK - 0.333 * FGL + 0.333 * FHJ - 0.333 * FHK + 0.333 * FHL
- 0.333 * FJK + 0.333 * FJL + 0.333 * FKL - 0.333 * GHJ + 0.333 * GHK
+ 0.333 * GHL + 0.333 * GJK - 0.333 * GJL + 0.333 * GKL + 0.333 * HJK
+ 0.333 * HJL + 0.333 * HKL - 0.333 * JKL
As you can see, you really don't want to try to figure out if any of these interactions are
confounded with the main effect. But, if you look closely, you can also see that the main
effect of A is only aliased with interactions of other factors. Thus, you can use these designs
to estimate all of the main effects without a problem. That being said, if you think you will
have any interactions at all, you're better off running a few more runs in a fractional factorial.
During assay development, it's rare to not have any interactions. In my experience, these
types of designs are most often used in robustness experiments where you expect very
few of the factors to be significant. In those cases, a follow up experiment can be used to
further investigate the presence of interactions after the Plackett-Burman design was used to
eliminate most of the factors from consideration.
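
For the curious, the 12-run design for 11 factors can be built in a few lines. The Python sketch below uses the generating row that is commonly tabulated for this design, shifts it cyclically to make 11 runs, and adds a final run with every factor at its low level. Treating the first column as factor A is an arbitrary choice made for this example, so the individual signs need not match the listing above, which depends on how a given software package assigns columns.

```python
from itertools import combinations

# Commonly tabulated generating row for the 12-run Plackett-Burman design.
GENERATOR = [+1, +1, -1, +1, +1, +1, -1, -1, -1, +1, -1]

def plackett_burman_12():
    """Build the design: 11 cyclic shifts of the generator plus a row of all lows."""
    n = len(GENERATOR)
    rows = [[GENERATOR[(col - shift) % n] for col in range(n)] for shift in range(n)]
    rows.append([-1] * n)
    return rows

design = plackett_burman_12()
columns = list(zip(*design))

# Every pair of factor columns should be orthogonal (products sum to zero).
print(all(sum(x * y for x, y in zip(columns[i], columns[j])) == 0
          for i, j in combinations(range(11), 2)))

# Partial aliasing of factor A (column 0) with each two-factor interaction
# of the other ten factors: (1/12) * sum of A * X * Y over the runs.
alias = {
    (i, j): sum(a * x * y for a, x, y in zip(columns[0], columns[i], columns[j])) / 12
    for i, j in combinations(range(1, 11), 2)
}
print(sorted({round(abs(value), 3) for value in alias.values()}))
```

The second calculation shows where the repeated 0.333 coefficients come from: each two-factor interaction among the other factors is partially, not fully, aliased with the main effect.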

1.6 How to pick a design - blocking, resolution, and power
Let's say you have an experimental system in mind. You have identified the factors you want
to investigate. How do you get started?
First, we have to decide on a low and high level setting for each of your factors. This is where
some of the "art" of DOE comes into play and why you always need a subject matter expert
involved in the design phase. The best guidance I can give you is to set the levels of your
factors aggressively, but not too aggressively. Helpful, huh?
Let's break down that statement. What exactly does it mean to set your levels aggressively?
Imagine that you have a response that increases as you move from low to high of one of the
factors (figure 1.11).

Figure 1.11. Picking the correct levels for a factor.
In a screening design, you will usually run just two levels of a factor (and sometimes a center
point). If you picked the two levels that are shown in red in Figure 1.11 you may not be
able to detect a change in the response. Instead, it makes more sense to pick the two green
levels. You would probably be able to detect the difference between them. Remember that
the point is not to optimize the settings of the factor, just to identify the factors that are
actually impacting your system. Also notice how the response continues outside of the green
levels. You don't want to pick levels that are on an edge of the area you have explored in
the past. Select levels aggressively, but not too aggressively.
Another problem that often occurs in DOE designs for assay development is that you have to
spread the testing across several operators, days, lots of reagents, etc. and you're worried
that these changes will affect your responses. DOE designs can take care of those problems
for us using a concept called blocking. Blocking essentially adds another factor for each of
these "nuisance variables" to your model. The difference between these factors and your
"regular" ones is that these factors are not analyzed for significance. Instead, any effects due
to them are subtracted from the responses so that they don't mask the effects of the
factors you are really interested in. Each blocking variable uses up one available factor
that you can estimate, so you need to make sure you have enough power in your design. If
your design has a complicated aliasing structure, blocking makes it even worse since you've
now added yet another factor. Be judicious in your design to keep your number of blocks
low.
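
Here is a hedged Python sketch of the idea for a three-factor full factorial split across two days. It uses one standard textbook arrangement, which is an assumption of this example rather than something prescribed above: runs are assigned to days by the sign of A x B x C. The simulated factor contributions and the size of the day-to-day shift are made up.

```python
from itertools import product

full = list(product((-1, +1), repeat=3))

def block_of(run):
    """Assign runs to two blocks (for example two days) using the sign of A*B*C."""
    return 1 if run[0] * run[1] * run[2] > 0 else 2

def simulated_response(run, day_shift):
    a, b, c = run
    baseline = 1.0 + 0.30 * a + 0.10 * b          # hypothetical factor contributions
    return baseline + (day_shift if block_of(run) == 2 else 0.0)

def effect(index, day_shift):
    high = [simulated_response(r, day_shift) for r in full if r[index] > 0]
    low = [simulated_response(r, day_shift) for r in full if r[index] < 0]
    return sum(high) / len(high) - sum(low) / len(low)

for shift in (0.0, 0.5):
    print("day shift", shift,
          "-> effect of A:", round(effect(0, shift), 3),
          "effect of B:", round(effect(1, shift), 3))
```

The estimates for A and B do not change when the day shift is added, because each day contains the high and low level of every factor equally often. The price is that the day difference is now indistinguishable from the three-factor interaction.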
Resolution is another concept that is important to understand when picking a screening
design. Common factorials can be categorized into one of four types: resolution III,
resolution IV, resolution V and higher, and full factorial designs. These categories tell you
what the aliasing structures of these designs are. In a resolution III design, the main effects
are aliased with two-factor interactions. In a resolution IV design, two-factor interactions
are aliased with other two-factor interactions and main effects are aliased with three-factor
interactions. In a resolution V design, two-factor interactions are aliased with three-factor
interactions, and main effects are aliased with four-factor interactions. A full factorial design
has no aliasing at all (i.e. all interactions can be estimated). [1]
How then do you use this information? If you're interested in estimating the main effects and
you're worried about aliasing them with two-factor interactions, you would pick a resolution
IV design. However, if you expect that two-factor interactions are rare, you may be able
to get away with a resolution III design. Most DOE software has a handy table to help you
select designs. Figure 1.12 is an example of such a table.
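
The resolution rules reduce to one line of arithmetic: in a design of resolution R, an effect involving k factors is aliased with effects involving R - k or more factors. The Python sketch below simply prints that rule for the three common cases. The function name is made up for this illustration.

```python
def lowest_aliased_order(resolution, effect_order):
    """Lowest order of effect that an effect of the given order is aliased with.

    effect_order: 1 for a main effect, 2 for a two-factor interaction, and so on.
    """
    return resolution - effect_order

ROMAN = {3: "III", 4: "IV", 5: "V"}
for resolution in (3, 4, 5):
    for order in (1, 2):
        partner = lowest_aliased_order(resolution, order)
        print(f"Resolution {ROMAN[resolution]}: {order}-factor effects are aliased "
              f"with {partner}-factor (or higher) effects")
```

Here a "1-factor effect" is a main effect, so the first line of output restates that main effects in a resolution III design are aliased with two-factor interactions.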

Figure 1.12 - Table with possible fractional factorial designs (arranged by number of factors)

1. There's an easy way to remember these aliasing relationships called the finger rule. If you
hold up as many fingers as the resolution number (i.e. three fingers for a resolution
III design), you can split them into groups that show the aliasing structure. Three
fingers can be divided into a pair and a single finger. Therefore, main effects (the single
finger) are aliased with two-factor interactions (the pair). This also works with resolution IV
and V designs. Try it and see for yourself.

If you study the table, you'll see that the lower the resolution, the fewer runs you need to
perform, at the cost of more aliasing. Intuitively, this makes sense. The more
information you have (runs), the less confounding (aliasing) you have in the results.
The table is also useful in determining how fractionated each design is. For example, a 2^5
full factorial has 32 runs. The half fraction of this factorial has 16 runs, is resolution V, and
is denoted 2^(5-1). A quarter fraction is denoted 2^(5-2), has 8 runs and is a resolution III
design. By studying the table you can quickly familiarize yourself with how this nomenclature
works.
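The nomenclature is easy to check for yourself. The sketch below builds the three five-factor designs mentioned above from a smaller full factorial plus generator columns, then counts the runs. The function name and the specific generators (E = ABCD for the half fraction, D = AB and E = AC for the quarter fraction) are assumptions for illustration.

```python
from itertools import product
from math import prod


def fractional_factorial(n_base, generators):
    """Build a two-level design as a list of runs (tuples of -1/+1).

    n_base     -- number of base factors that get a full factorial
    generators -- one tuple of base-factor indices per added factor; the
                  added factor's level is the product of those columns
    """
    design = []
    for base in product((-1, 1), repeat=n_base):
        extra = tuple(prod(base[i] for i in idx) for idx in generators)
        design.append(base + extra)
    return design


full = fractional_factorial(5, [])                    # 2^5
half = fractional_factorial(4, [(0, 1, 2, 3)])        # 2^(5-1), E = ABCD
quarter = fractional_factorial(3, [(0, 1), (0, 2)])   # 2^(5-2), D = AB, E = AC

print(len(full), len(half), len(quarter))  # 32 16 8
```

All three designs still have five factor columns; only the number of runs changes with the size of the fraction.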
You might also notice that some designs have the same resolution and number of factors, yet one
only has half as many runs as the other (2^(7-2) and 2^(7-3) is such an example). How
is that possible, and why would you ever run the design with more runs? It comes down
to statistical power. Power is the probability that you will detect an effect of a given size
when it is really there. The more power you have, the more confident you
can be that you're not missing a significant effect. In a low power design, the effect
of a factor has to be much larger than the variability in the measurement in order to be
detected.
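A rough calculation shows how the number of runs drives power. The sketch below is an approximation under stated assumptions: the measurement standard deviation (sigma) is treated as known, the test is a two-sided z-test at alpha = 0.05, and the effect is the difference between the mean response at the high and low level of a factor. Real DOE software uses the t-distribution and your model's degrees of freedom, so expect its numbers to be somewhat lower. The function name is my own.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(effect, sigma, n_runs, alpha=0.05):
    """Approximate power to detect a factor effect in a two-level design.

    Assumes sigma is known (normal approximation). The effect is estimated as
    the difference of two means of n_runs / 2 observations each, so its
    standard error is 2 * sigma / sqrt(n_runs).
    """
    normal = NormalDist()
    se = 2 * sigma / sqrt(n_runs)
    z_crit = normal.inv_cdf(1 - alpha / 2)
    shift = abs(effect) / se
    return normal.cdf(shift - z_crit) + normal.cdf(-shift - z_crit)


# Same seven factors and resolution, but 16 runs versus 32 runs.
for runs in (16, 32):
    print(runs, round(approx_power(effect=1.0, sigma=1.0, n_runs=runs), 2))
```

With an effect the same size as the measurement noise, doubling the runs from 16 to 32 raises the approximate power noticeably, which is the reason to consider the larger design.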
To complicate matters, another way to increase power is to add replicates of a design.
Replication decreases the uncertainty in your estimated effects. It's not always
clear whether it's better to run a less fractionated design or just replicate a more fractionated
one. If you find yourself in this conundrum, your best bet is to talk to a statistician. In
most cases, I would personally choose the less fractionated design without replication, since
it includes runs with more combinations of all the different factors. I therefore feel that I have
more confidence in the analysis of interactions.
To summarize, when picking a design, you have to consider how many factors you have, whether
you need to add blocks to the design, what resolution you need in your aliasing structure, and
how many runs you are willing to perform. Once you have made those decisions, you can use
a DOE software package to decide if you will have enough power to detect the differences you
expect in your factors. If you don't have enough, you can either switch to a less fractionated
design or add more replicates to your current design.

1.7 Response Variables - what to measure?
Once you have your design, there's one last decision to make before you actually execute
your runs: what to measure. Usually, you have at least one measure in mind when you start
thinking about your experiments. In assay development, some common examples include
signal strength, dynamic range, curve parameters (such as slope), replicate variability,
background, signal to noise, etc.
It may seem like it would be a lot of work to optimize all of those responses, but DOE makes
it extremely easy. As you will see in the next chapter, once you do your runs, analysis is
essentially free. So I would encourage you to think up front about measuring as many things
as possible even if you don't intend to analyze them right away. I've seen many examples
where a response you didn't think would matter suddenly becomes important. If you have
the data, the analysis is usually much less painful than having to go back and generate more
data.
Another reason for measuring as many parameters as possible is that some responses
may conflict with others. For example, optimizing only for signal strength may increase
background. The best setting for both may be a compromise. You
would not know that unless you analyze both responses at the same time. As you will see
in the next chapter, most DOE software packages have optimization algorithms that will help
you find the optimum settings across all your responses.
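To illustrate the idea of a compromise, here is a minimal sketch of a desirability function, one common way of combining several responses into a single score. I am assuming this approach for illustration only; it is not a description of what any particular package does. Each response is scaled to a value between 0 (unacceptable) and 1 (ideal), and the overall score is the geometric mean. The candidate settings, response values and acceptable ranges below are made-up numbers, not real assay data.

```python
def larger_is_better(y, low, high):
    """Desirability from 0 to 1 for a response you want to maximize."""
    if y <= low:
        return 0.0
    if y >= high:
        return 1.0
    return (y - low) / (high - low)


def smaller_is_better(y, low, high):
    """Desirability from 0 to 1 for a response you want to minimize."""
    return 1.0 - larger_is_better(y, low, high)


def overall_desirability(scores):
    """Geometric mean, so any single unacceptable response gives zero."""
    result = 1.0
    for score in scores:
        result *= score
    return result ** (1.0 / len(scores))


# Hypothetical predicted responses: setting -> (signal, background)
candidates = {
    "low": (1.0, 0.05),
    "mid": (1.8, 0.10),
    "high": (2.2, 0.40),
}

scores = {}
for name, (signal, background) in candidates.items():
    d_signal = larger_is_better(signal, low=0.5, high=2.5)
    d_background = smaller_is_better(background, low=0.05, high=0.5)
    scores[name] = overall_desirability([d_signal, d_background])

best_overall = max(scores, key=scores.get)
best_signal_only = max(candidates, key=lambda name: candidates[name][0])
print(best_overall, best_signal_only)
```

In this made-up example the setting with the strongest signal is not the one with the best overall score, because its background is too high.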

DoE for assay_developers_chp1_rev-1.0

  • 1. A Practical Guide to Design of Experiments (DOE) for Assay Developers Daniel Joelsson © 2010 Daniel Joelsson All rights reserved
  • 2. Introduction Why write this book? Those of you that know me probably aren't surprised. I've been a strong advocate for a more systematic approach to assay development for many years, to the point of annoyance for many of my colleagues. I feel that using Design of Experiments (DOE) in assay development is the most efficient way to develop quality assays in the least amount of time possible. Then why aren't scientists using DOE more widely? I think part of the problem is the lack of understanding of what DOE really is and why it's useful. I looked for a good introduction to DOE for assay development scientists and I couldn't find one. Most of the texts out there are engineering and process focused. Therefore, I'm writing my own guide for us scientists and assay developers. I should state up front that I'm not a statistician, I'm a scientist. This guide is not intended for statisticians and I will not try to turn you into one either. However, some basic understanding of statistics is important to fully appreciate DOE. I will also make some assumptions that you are familiar with basic biology and biochemistry techniques. My background is primarily in the design of bioassays and immunoassays for vaccines and biotherapeutics. Most of my examples will come out of those disciplines. I will do my best to explain any statistical topics in a relevant and pragmatic way, but in case you want more, these are two resources that I have found to be excellent texts on statistics for scientists: The Biostatistics Cookbook - The Most User-Friendly Guide for the Bio/Medical Scientist by Seth Michelson and Timothy Schofield Biometry: The Principles and Practices of Statistics in Biological Research by Robert Sokal and F. James Rohlf This book is meant to be a guide for beginning to intermediate DOE users, and slanted completely towards scientists that do assay development. 
If you would like a more in-depth discussion of DOE and the statistics behind it, Design & Analysis of Experiments by Douglas C. Montgomery is a great resource to get you started. While I will try to teach you enough about DOE to allow you to design and analyze your own experiments, a statistician familiar with assay development can be a valuable resource. What this book will give you is an understanding of the concepts and language of DOE so that you can easily communicate with your statistics friends. One of my goals is to increase the overlap in understanding on both sides. You might want to give a copy of this book to your statistician as well, especially if they are not familiar with the pitfalls of assay design.
  • 3. Before we start, a few thanks are in order. Thanks to Tim Schofield for schooling me on statistics from the first day we met. Thanks to Edith Senderak for all of the mentoring and collaborations over the years. Thanks to Joe Pigeon for sparking my initial interest in DOE as well as introducing me to split-plot designs (which you will see are perfect for developing assays in 96 well plates). Thanks to all of my colleagues for trusting me enough to allow me to help with their DOE designs. This book is being published under a Creative Commons license. You are free to distribute this work to anyone you think would be interested, free of charge. You may not use any portion of this book for any commercial purpose without prior permission. You may create derivatives of this work, as long as these derivatives adhere to the same license restrictions. For complete license information, please visit: http://creativecommons.org/licenses/by-nc-sa/3.0/
  • 4. Chapter 1 - Screening Designs
1.1 One Factor At a Time (OFAT) and Interactions
Think back to your early years of scientific training for a moment. If your experience was anything like mine, you were first taught to do experiments using the scientific method. It went something like this: first generate a hypothesis based on past knowledge, design an experiment to test the hypothesis, analyze the data and then refine the hypothesis. Repeat until we get to the answer. This is the correct way to do science. Unfortunately, it is in the second step where we usually learn some bad habits. We were told to vary one factor at a time (OFAT) and hold everything else constant. At first glance, OFAT makes a lot of sense, but it can be misleading. Let's look at a simple assay development experiment. Imagine that you're trying to develop an ELISA to detect a bacterial contaminant in a sample from a process development study for a new biotherapeutic. Two of the variables you assume to be important are the amount of capture antibody you add to your plate and the time you incubate your sample. Consistent with the OFAT approach, you start out by testing two different antibody concentrations, 0.2 ug/ml and 1.0 ug/ml, while keeping the incubation time constant at one hour. You then run the assay and faithfully record the results in your lab notebook (Table 1.1).

Table 1.1
Antibody concentration    Incubation time    Response (O.D.)
0.2 ug/ml                 1 hour             0.31
1.0 ug/ml                 1 hour             0.97

Great! Obviously, adding more antibody increases the response (it increased from 0.31 to 0.97). Your initial hypothesis that adding more antibody would be beneficial was correct. Good to know! What if you increased it even further? Let's find out what happens (Table 1.2).
  • 5. Table 1.2
Antibody concentration    Incubation time    Response (O.D.)
0.2 ug/ml                 1 hour             0.31
1.0 ug/ml                 1 hour             0.97
1.5 ug/ml                 1 hour             0.95

OK, it appears that adding even more antibody does not further increase the signal. So you conclude that the response has been saturated somewhere around 1.0 ug/ml. Since the conditions for the first variable have been optimized, let's look at the second variable, incubation time. Again, you do a simple experiment increasing the incubation time to two hours. Since you already know that 1.0 ug/ml of antibody is ideal, you only need to run one experiment with 1.0 ug/ml of antibody for two hours of incubation. The results are shown in Table 1.3.

Table 1.3
Antibody concentration    Incubation time    Response (O.D.)
0.2 ug/ml                 1 hour             0.31
1.0 ug/ml                 1 hour             0.97
1.5 ug/ml                 1 hour             0.95
1.0 ug/ml                 2 hours            1.72

Even better! Clearly two hours is better than one. What if you increase the incubation time even further? You try three hours and see what happens (Table 1.4).

Table 1.4
Antibody concentration    Incubation time    Response (O.D.)
0.2 ug/ml                 1 hour             0.31
1.0 ug/ml                 1 hour             0.97
1.5 ug/ml                 1 hour             0.95
1.0 ug/ml                 2 hours            1.72
1.0 ug/ml                 3 hours            1.74

Again, increasing the variable did not have a further effect on the response. So you conclude that the optimum settings for this assay are 1.0 ug/ml of antibody and a two hour incubation.
  • 6. Since you've now tested and optimized both variables, you're done, right? Well, maybe not. There is one experiment that was not performed: 0.2 ug/ml of antibody incubated for two hours. Let's run that experiment and see what happens (Table 1.5).

Table 1.5
Antibody concentration    Incubation time    Response (O.D.)
0.2 ug/ml                 1 hour             0.31
1.0 ug/ml                 1 hour             0.97
1.5 ug/ml                 1 hour             0.95
1.0 ug/ml                 2 hours            1.72
1.0 ug/ml                 3 hours            1.74
0.2 ug/ml                 2 hours            2.62

Whoa! The response is now much higher than what we previously considered optimal. There are now all kinds of experiments we should probably run to make sure we have all the answers we need. What happens if we use less than 0.2 ug/ml? What about testing other incubation times with 0.2 ug/ml? This example highlights one of the greatest weaknesses of OFAT experiments. You are very likely to miss an interaction between two or more of the variables. An interaction occurs when the effects of two variables are not independent of each other; i.e. the response to one variable depends on the level of the other. Since the OFAT approach (and traditional scientific training) assumes that all variables are additive, it doesn't encourage us to test for interactions. The only way to find them is by luck. Unfortunately, interactions like these happen all the time in the real world. For scientists trained in linear thinking, interactions can sometimes be hard to visualize. There's a specific graph called an interaction plot that makes it slightly easier (Figures 1.1 and 1.2). In an interaction plot where the two variables have no interaction, the two response lines will be perfectly parallel (Figure 1.1). The two lines represent the response due to Factor A at the two different levels of Factor B. Since the lines are perfectly parallel, the effects of A and B are completely additive. When you increase Factor B it shifts the entire Factor A line upwards by the same amount.
This is the kind of relationship we assume is always in place in an OFAT experimental design. If the lines of the interaction plot intersect (or are simply non-parallel), there's an interaction between the two variables (Figure 1.2). Increasing Factor B no longer just moves the Factor A line higher. Instead, at the lower concentration of B, the response in Factor A actually decreases from low to high. Clearly the effects of the two variables are not simply additive. This is exactly the situation we observed in the assay development example above.
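To put a number on the interaction in the ELISA example, here is a minimal sketch in Python using the four corner results from Tables 1.1 to 1.5. The dictionary layout and the function name are my own, and dividing the difference of the two conditional effects by two is one common convention rather than the only one.

```python
# Responses (O.D.) from Tables 1.1-1.5, keyed by (antibody in ug/ml, incubation in hours).
responses = {
    (0.2, 1): 0.31,
    (1.0, 1): 0.97,
    (0.2, 2): 2.62,
    (1.0, 2): 1.72,
}

def antibody_effect(hours):
    """Change in response when antibody goes from 0.2 to 1.0 ug/ml at a fixed incubation time."""
    return responses[(1.0, hours)] - responses[(0.2, hours)]

effect_at_1h = antibody_effect(1)   # what the OFAT experiment measured
effect_at_2h = antibody_effect(2)   # what the OFAT experiment never looked at

# If the factors were purely additive, the two effects would be equal and this would be zero.
interaction = (effect_at_2h - effect_at_1h) / 2

print(f"Antibody effect at 1 hour:  {effect_at_1h:+.2f}")
print(f"Antibody effect at 2 hours: {effect_at_2h:+.2f}")
print(f"Interaction (half the difference): {interaction:+.2f}")
```

The antibody effect flips sign between the two incubation times, which is the numerical version of the crossing lines in an interaction plot.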
  • 7. Figure 1.1 - Interaction plot for two factors without an interaction
Figure 1.2 - Interaction plot for two factors with an interaction
1.2 Factorial experiments - foundations of DOE
I hope I convinced you in the previous section that interactions between variables can be important to the success of your experiments and that the OFAT approach makes it hard to find them unless you're willing to do a lot of work. This section will introduce some of the basics of Design of Experiments (DOE). DOE is a more systematic approach to experimentation than the OFAT approach. The goal is to test all of the variables in a system in a set of multi-factorial experiments, allowing us to optimize all of them and find interactions at the same time. The assay development example in the last section made it clear that it would have been a good idea to run all four combinations of the two variables right away. This type of experiment is called a factorial experiment. A picture of the four different experiments (or
  • 8. "runs") shows how we start exploring the "design space" of our experimental system (Figure 1.3). As you can see, we have covered the four corners of the space. The runs can also be shown in a table such as Table 1.6. During each run, a factor can take on a low value (depicted as "-") or a high value (depicted as "+"). With two factors there are four such combinations possible. By performing all the possible combinations of the factors, we will be able to tell not only if each of the factors is important (i.e. changing it has an effect on the response), but also if any interactions are present.

Figure 1.3 Pictorial depiction of a 2^2 factorial experiment.

Table 1.6
Run #    Factor A    Factor B
Run 1    +           +
Run 2    -           +
Run 3    +           -
Run 4    -           -

Let's expand the same thinking to three factors (Figure 1.4 and Table 1.7). With three factors we have to complete eight runs to run every combination of high and low for all three factors. But again, you get a lot of information out of those runs: you will know if any or all of the factors influence the response, if any two-factor interactions exist (AxB, AxC, BxC), and if there's a three-factor interaction (AxBxC).
  • 9. Figure 1.4 Pictorial depiction of a 2^3 factorial experiment

Table 1.7
Run #    Factor A    Factor B    Factor C
1        +           +           +
2        -           +           +
3        +           -           +
4        -           -           +
5        +           +           -
6        -           +           -
7        +           -           -
8        -           -           -

This exercise can be continued for four factors and so on. Performing eight runs for three factors might seem reasonable, but as you may have figured out already, as we add factors, the number of runs increases exponentially (Table 1.8). Since most assay systems contain more than three or four factors, factorial experiments quickly become too large to feasibly perform. With this in mind, it's easy to see why OFAT is still widely practiced. Factorial experiments rapidly become unwieldy as the number of factors goes up.
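The run tables above are mechanical enough to generate. As a rough sketch (the function name `full_factorial` is mine, not part of any DOE package), Python's `itertools.product` can list every high/low combination in the same order as Table 1.7 and count the runs for any number of factors:

```python
from itertools import product

def full_factorial(n_factors):
    """Every combination of high (+1) and low (-1), with Factor A changing fastest."""
    # product() varies its last position fastest, so each tuple is reversed
    # to make the first factor the one that alternates on every run.
    return [levels[::-1] for levels in product((+1, -1), repeat=n_factors)]

runs = full_factorial(3)
print("Run  A B C")
for number, levels in enumerate(runs, start=1):
    print(f"{number:>3}  " + " ".join("+" if v > 0 else "-" for v in levels))

# Number of runs needed for 1 to 10 two-level factors: doubles with every added factor.
run_counts = {k: len(full_factorial(k)) for k in range(1, 11)}
print(run_counts)
```

Each extra factor doubles the table, which is why ten factors already need 1024 runs.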
  • 10. Table 1.8
Number of factors    Number of runs
1                    2
2                    4
3                    8
4                    16
5                    32
6                    64
7                    128
8                    256
9                    512
10                   1024

But, since OFAT experiments make it very difficult to find those elusive interactions, we need an alternative approach. Luckily DOE provides that in the form of fractional factorials, the topic of the next section. However, before I get to that topic, I want to address a common question. How come we only use two levels (high and low) for each factor? Obviously, it would be risky to make the assumption that any response is perfectly linear between two points. It might be that the response has a complex shape in that space. In the example in Figure 1.5, the optimum response is actually somewhere between our low and high settings. This is where DOE differs from traditional experimentation. The goal of these factorial experiments is not to optimize the response completely, but to screen for the few factors that actually affect the response. These factors are then carried forward in a set of optimization experiments. Not all factors are likely to be important in every system, therefore we should do fairly low resolution experiments to identify the ones that truly matter. As you will see, screening experiments have relatively fewer runs per factor than optimization experiments. We can afford to include more factors in the initial experiments to make sure that we don't miss any of the critical few.
  • 11. Figure 1.5 - Non linear response
1.3 Fractional Factorials
In the last section, we learned that factorial experiments are useful designs to pick up both main and interaction effects in our experimental system. But for more than a few factors, they quickly become too large to be feasible. In most assay development experiments, we can easily identify ten or more factors that could be important (reagent concentrations, incubation temperatures, incubation times etc.). We need a different approach. There is a set of DOE designs called fractional factorials that meet this need nicely. Let's go back to a simple three factor factorial experiment (Figure 1.6). In this experiment, we have three factors. Since we are doing screening experiments to identify the important factors, we will only be testing two levels per factor. Now imagine taking the three dimensional space in Figure 1.6 and condensing it down to two dimensions (Figure 1.7). In effect, we are now looking at the front of the cube so that we can't see the different levels of factor C any longer. We can take this thinking one step further and compress the resulting square into a single dimension (Figure 1.8).
  • 12. Figure 1.6 Three factor screening experiment.
Figure 1.7 Three factor screening experiment compressed into two dimensions.
Figure 1.8 Three factor screening experiment compressed into one dimension.
When we look at our experimental space this way, something interesting happens. It becomes clear that we have actually run four replicates of each of the two levels for factor A. This property of factorial designs is called hidden replication. While each of the eight experimental runs has a different combination of levels of all three factors, each factor's level is actually replicated four times.
  • 13. Note that it was completely arbitrary that we chose factor A for this analysis. The same thing happens if we compress the design for either factor B or C. This is a useful characteristic of factorial designs (and fractional factorial designs as we will see in a little bit) called rotatability. Because the design is perfectly symmetrical, we can in effect assign any factor to be A, B, or C and it doesn't matter. We can get away with this because instead of using the actual numbers for the setting for each factor, we are going to use coded numbers of -1 or +1. This might seem confusing for now, but bear with me. It will make more sense when we discuss how we analyze these designs. What it does for us now is to scale the levels of each factor to span the same distance, thus making the design rotatable. There is also a second characteristic of these designs worth noting. The reason we could compress the figure down into a single dimension is because of a property called orthogonality. Notice how the axes for the three factors in Figure 1.6 are at 90 degree angles to each other? This means that when we compress the design down to a single dimension, the effects of all the other factors cancel out, allowing us to estimate just the effect of the one variable we want. This is what we mean by a multi-variable experiment. Unlike in an OFAT study, we truly can vary all of our variables at the same time and still be able to distinguish effects from each. Looking for hidden replication in our experimental space showed us that each factor was actually replicated four times. Since we probably don't need that many replicates to tell if the response changes from the low to the high setting, let's eliminate some of these extra runs and see what happens. One of the possible ways of doing this is shown in Figure 1.9. As you can see, we have eliminated half of the runs. The amazing thing is that we are still doing two replicates of each level for all three factors.
With only four runs we can estimate the effects of three variables with two replicates at each setting. That's a pretty good return for your experimental investment.
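Hidden replication and orthogonality are easy to check by counting. The sketch below is only an illustration: the helper names are mine, and I'm assuming the retained half-fraction is the set of runs where the coded product A*B*C equals +1 (the figure may show the other half, which behaves the same way).

```python
from itertools import product

FACTORS = "ABC"
full = list(product((-1, +1), repeat=3))                      # all 8 runs, coded -1/+1
half = [run for run in full if run[0] * run[1] * run[2] == +1]  # assumed half-fraction

def replicates_per_level(runs):
    """For each factor, count the runs at its low level and at its high level."""
    return {
        name: (sum(r[i] == -1 for r in runs), sum(r[i] == +1 for r in runs))
        for i, name in enumerate(FACTORS)
    }

def is_orthogonal(runs):
    """True if every pair of factor columns has a zero dot product."""
    return all(
        sum(r[i] * r[j] for r in runs) == 0
        for i in range(3) for j in range(i + 1, 3)
    )

print("Full factorial:", replicates_per_level(full), "orthogonal:", is_orthogonal(full))
print("Half fraction: ", replicates_per_level(half), "orthogonal:", is_orthogonal(half))
```

Both designs keep every factor balanced and every pair of factor columns orthogonal; the half-fraction just has two hidden replicates per level instead of four.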
  • 14. Figure 1.9 Three factor fractional factorial experiment (half-fraction).
We could try to eliminate another two runs (Figure 1.10), but this time we have pushed things too far. Now we can't estimate factor C any longer. For three factors we can only eliminate half of the runs and still have a viable design. However, for designs with more than three factors we can eliminate more than half of the runs and still estimate all of the main effects. In fact, the more factors we have, the more runs we can eliminate.
Figure 1.10 Three factor fractional factorial (quarter fraction).
  • 15. 1.4 Statistical Power and Aliasing in Fractional Factorials You're probably thinking that if we just eliminated half of our work, there has to be a catch. You're right. An immediate impact is that we have lowered the number of replicates per data point from four to two. This has the outcome of lowering the statistical power of our design. We would need a larger effect in the response when going from low to high in order to see it. But that might be a trade-off you are willing to take, especially if you only care about large (relative to the experimental system) effects. How do you know how much of a trade-off you are making? That's a function of the underlying variability of the system you are looking at. If the variability is large and the effect you expect is small, you will need more replicates. Power calculations are described thoroughly in most statistics books (see the introduction to this book for sources), but you probably won't have to calculate it by hand. All of the specialized DOE software on the market will do this calculation for you. I'll talk more about these packages when I discuss the analysis of screening designs in the next chapter. For now, let's assume that you have enough power to estimate the effects you're expecting to see. The other penalty we take when eliminating runs from our design is to create what is termed aliasing of effects. In the discussion of full factorial experiments, I explained the concept of an interaction between two factors. A consequence of fractional factorials is the confounding of interactions with main effects, or with each other. Let me explain what that means in a little more detail. Let's look back at our three-factor fractional factorial in figure 1.9 again. Recall in our discussion in section 1.1 that if we wanted to estimate an interaction between factors A and B we would have to run all four of the "corners" of the square diagram. 
Unfortunately, in our fractional factorial, two of those corners are run at the lower level of factor C and the other two are run at the higher level of factor C. The design is no longer orthogonal with respect to interaction AB and factor C. The same is the case for the interactions BC and AC. In the nomenclature of DOE we would say that the main effect in A is aliased with the interaction BC. The notation we use to describe this relationship is as follows:

A = A + BC
B = B + AC
C = C + AB

The effect we observe due to factor A is a combination of A and the interaction BC and we can't tell how much is contributed by each.
  • 16. This relationship can also be described using the notation in Table 1.9. Since this is not a statistics book, I will take some liberties with explaining this table. For a more complete discussion please see the book by Montgomery listed in the introduction or any other statistics-based DOE book. Table 1.9 is similar to the run tables in previous sections, with some differences. Instead of run # in the first column, it is now labeled "treatment combination". I have also added columns for all of the possible interactions and one labeled "I" which stands for identity. While this table may not make a lot of sense right now, there are a few things we can take away from it. In chapter 2, we will learn how to build a mathematical model that relates the settings of the individual factors to the level of the output response. In that model, each factor will be preceded by a constant. The value of that constant will be solved by using the + and - signs in each column. For now, look carefully at the table and you will notice that each column has a unique pattern. We can therefore solve for the constant for each term in the model. On the bottom half of the table, I have shaded in grey the combinations that we eliminated in Figure 1.9. Let's take a closer look at the remaining pluses and minuses in the top of the table. If we say that these signs represent coefficients in an equation for estimating the effects, then you will notice that the coefficients in column A are the same as those in column BC (in the top half of the table). The same goes for column B with AC and column C with AB. Column ABC is identical to column I, which means that we are no longer able to estimate a three factor interaction.
Table 1.9
Treatment Combination    I    A    B    C    AB   AC   BC   ABC
a                        +    +    -    -    -    -    +    +
b                        +    -    +    -    -    +    -    +
c                        +    -    -    +    +    -    -    +
abc                      +    +    +    +    +    +    +    +
ab                       +    +    +    -    +    -    -    -
ac                       +    +    -    +    -    +    -    -
bc                       +    -    +    +    -    -    +    -
(1)                      +    -    -    -    +    +    +    -
(The bottom four rows, ab through (1), are the shaded combinations eliminated in Figure 1.9.)

Whether you're more comfortable thinking about aliases graphically or by using the table, the end result is the same. When we reduce the number of runs in a factorial, the number of aliases increases, and with it we lose the ability to estimate some of the effects in the experiment independently.
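The sign table can be rebuilt from the coded factor settings alone, since each interaction column is just the product of its factor columns. Here is a small sketch of that idea (the function names are mine, and the half-fraction is taken to be the runs with A*B*C = +1, i.e. treatment combinations a, b, c and abc):

```python
from itertools import product
from math import prod

TERMS = ["I", "A", "B", "C", "AB", "AC", "BC", "ABC"]

def sign_table(runs):
    """Build the +1/-1 column for every model term from coded (A, B, C) settings."""
    table = {term: [] for term in TERMS}
    for run in runs:
        level = dict(zip("ABC", run))
        for term in TERMS:
            # "I" is the identity column (always +1); other terms multiply their factors.
            table[term].append(1 if term == "I" else prod(level[f] for f in term))
    return table

def alias_groups(table):
    """Group terms whose sign columns are identical and so cannot be told apart."""
    groups = {}
    for term, column in table.items():
        groups.setdefault(tuple(column), []).append(term)
    return sorted(groups.values())

full = list(product((-1, +1), repeat=3))
half = [run for run in full if prod(run) == +1]   # treatment combinations a, b, c, abc

print("Full factorial:", alias_groups(sign_table(full)))
print("Half fraction: ", alias_groups(sign_table(half)))
```

In the full factorial every term gets its own group. In the half-fraction the columns pair up as A with BC, B with AC, C with AB, and I with ABC, matching the alias relationships described above.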
  • 17. As you might have noticed, we could just as easily have eliminated the other half of the runs in the experiments. The aliasing structure now takes on a negative relationship (i.e. A = A - BC etc.), but the principle is the same. We can no longer estimate the effects independently.
1.5 Other Screening Designs
While fractional factorial designs are the easiest screening designs to understand due to their simple aliasing structures, they are not the only designs available if you are looking to identify important factors. In fact, there are whole families of designs that aim to minimize the number of runs while still being able to identify main effects. I'll discuss one of the more popular ones in brief detail here, and get to the more advanced designs in Chapter 4. Plackett-Burman designs are a family of two-level screening designs that allow you to use the fewest runs possible for situations where you have 11, 19, 23, 27, or 31 factors. In fact, you only need to run one more run than the number of factors you have. These designs have extremely complex aliasing structures.
Here's an example of the aliasing structure for the main effect of factor A in an 11 factor design: [A] = A - 0.333 * BC - 0.333 * BD - 0.333 * BE + 0.333 * BF - 0.333 * BG - 0.333 * BH + 0.333 * BJ + 0.333 * BK - 0.333 * BL + 0.333 * CD - 0.333 * CE - 0.333 * CF + 0.333 * CG - 0.333 * CH + 0.333 * CJ - 0.333 * CK - 0.333 * CL + 0.333 * DE + 0.333 * DF - 0.333 * DG - 0.333 * DH - 0.333 * DJ - 0.333 * DK - 0.333 * DL - 0.333 * EF - 0.333 * EG - 0.333 * EH - 0.333 * EJ + 0.333 * EK + 0.333 * EL - 0.333 * FG + 0.333 * FH - 0.333 * FJ - 0.333 * FK - 0.333 * FL + 0.333 * GH - 0.333 * GJ + 0.333 * GK - 0.333 * GL - 0.333 * HJ - 0.333 * HK + 0.333 * HL - 0.333 * JK + 0.333 * JL - 0.333 * KL - 0.333 * BCD + 0.333 * BCE - 0.333 * BCF + 0.333 * BCG + 0.333 * BCH + 0.333 * BCJ + 0.333 * BCK - 0.333 * BCL + 0.333 * BDE + 0.333 * BDF + 0.333 * BDG - 0.333 * BDH - 0.333 * BDJ + 0.333 * BDK + 0.333 * BDL + 0.333 * BEF - 0.333 * BEG + 0.333 * BEH - 0.333 * BEJ + 0.333 * BEK - 0.333 * BEL - 0.333 * BFG + 0.333 * BFH + 0.333 * BFJ + 0.333 * BFK + 0.333 * BFL - 0.333 * BGH + 0.333 * BGJ + 0.333 * BGK + 0.333 * BGL + 0.333 * BHJ - 0.333 * BHK + 0.333 * BHL + 0.333 * BJK + 0.333 * BJL - 0.333 * BKL + 0.333 * CDE + 0.333 * CDF + 0.333 * CDG + 0.333 * CDH + 0.333 * CDJ - 0.333 * CDK + 0.333 * CDL - 0.333 * CEF - 0.333 * CEG + 0.333 * CEH + 0.333 * CEJ - 0.333 * CEK + 0.333 * CEL + 0.333 * CFG + 0.333 * CFH - 0.333 * CFJ + 0.333 * CFK + 0.333 * CFL + 0.333 * CGH + 0.333 * CGJ + 0.333 * CGK - 0.333 * CGL - 0.333 * CHJ - 0.333 * CHK - 0.333 * CHL + 0.333 * CJK + 0.333 * CJL + 0.333 * CKL + 0.333 * DEF - 0.333 * DEG - 0.333 * DEH + 0.333 * DEJ + 0.333 * DEK + 0.333 * DEL + 0.333 * DFG + 0.333 * DFH - 0.333 * DFJ + 0.333 * DFK - 0.333 * DFL + 0.333 * DGH - 0.333 * DGJ © 2010 Daniel Joelsson All rights reserved
  • 18. - 0.333 * DGK + 0.333 * DGL + 0.333 * DHJ + 0.333 * DHK - 0.333 * DHL + 0.333 * DJK + 0.333 * DJL - 0.333 * DKL + 0.333 * EFG - 0.333 * EFH + 0.333 * EFJ + 0.333 * EFK - 0.333 * EFL + 0.333 * EGH + 0.333 * EGJ + 0.333 * EGK + 0.333 * EGL - 0.333 * EHJ + 0.333 * EHK + 0.333 * EHL - 0.333 * EJK + 0.333 * EJL + 0.333 * EKL + 0.333 * FGH + 0.333 * FGJ - 0.333 * FGK - 0.333 * FGL + 0.333 * FHJ - 0.333 * FHK + 0.333 * FHL - 0.333 * FJK + 0.333 * FJL + 0.333 * FKL - 0.333 * GHJ + 0.333 * GHK + 0.333 * GHL + 0.333 * GJK - 0.333 * GJL + 0.333 * GKL + 0.333 * HJK + 0.333 * HJL + 0.333 * HKL - 0.333 * JKL As you can see, you really don't want to try to figure out if any of these interactions are confounded with the main effect. But, if you look closely, you can also see that the main effect of A is only aliased with interactions of other factors. Thus, you can use these designs to estimate all of the main effects without a problem. That being said, if you think you will have any interactions at all, you're better off running a few more runs in a fractional factorial. During assay development, it's rare to not have any interactions. In my experience, these types of designs are most often used in robustness experiments where you expect very few of the factors to be significant. In those cases, a follow up experiment can be used to further investigate the presence of interactions after the Plackett-Burman design was used to eliminate most of the factors from consideration. 1.6 How to pick a design - blocking, resolution, and power Let's say you have an experimental system in mind. You have identified the factors you want to investigate. How do you get started? First, we have to decide on a low and high level setting for each of your factors. This is where some of the "art" of DOE comes into play and why you always need a subject matter expert involved in the design phase. 
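If you're curious where the 0.333 coefficients come from, they can be reproduced for the two-factor interactions with a few lines of Python. This is a sketch under assumptions: it builds the 12-run design from the commonly published generator row by cyclic shifting, which may order runs and columns differently from your software, so individual signs need not match the listing above. The function names are mine.

```python
from itertools import combinations

# Commonly published generator row for the 12-run Plackett-Burman design (11 factors).
GENERATOR = [+1, +1, -1, +1, +1, +1, -1, -1, -1, +1, -1]

def plackett_burman_12():
    """11 cyclic shifts of the generator row plus a final run with every factor low."""
    rows = [GENERATOR[-shift:] + GENERATOR[:-shift] for shift in range(11)]
    rows.append([-1] * 11)
    return rows

design = plackett_burman_12()
columns = list(zip(*design))   # one tuple of 12 coded settings per factor

def alias_coefficient(main, i, j):
    """How strongly the interaction of factors i and j leaks into the main effect estimate."""
    return sum(columns[main][r] * columns[i][r] * columns[j][r] for r in range(12)) / 12

# Main effect columns are mutually orthogonal, so main effects do not alias each other.
orthogonal = all(
    sum(a * b for a, b in zip(columns[i], columns[j])) == 0
    for i, j in combinations(range(11), 2)
)

# Factor A (index 0) against all 45 two-factor interactions among the other ten factors.
coeffs = [alias_coefficient(0, i, j) for i, j in combinations(range(1, 11), 2)]
print("Main effects orthogonal:", orthogonal)
print("Distinct |coefficients|:", sorted({round(abs(c), 3) for c in coeffs}))
```

Every one of those interactions is partially aliased with A at a strength of one third, which is why nobody tries to untangle them by hand.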
The best guidance I can give you is to set the levels of your factors aggressively, but not too aggressively. Helpful, huh? Let's break down that statement. What exactly does it mean to set your levels aggressively? Imagine that you have a response that increases as you move from low to high of one of the factors (Figure 1.11).
  • 19. Figure 1.11. Picking the correct levels for a factor.
In a screening design, you will usually run just two levels of a factor (and sometimes a center point). If you picked the two levels that are shown in red in Figure 1.11 you may not be able to detect a change in the response. Instead, it makes more sense to pick the two green levels. You would probably be able to detect the difference between them. Remember that the point is not to optimize the settings of the factor, just to identify the factors that are actually impacting your system. Also notice how the response continues outside of the green levels. You don't want to pick levels that are on an edge of the area you have explored in the past. Select levels aggressively, but not too aggressively. Another problem that often occurs in DOE designs for assay development is that you have to spread the testing across several operators, days, lots of reagents, etc. and you're worried that these changes will affect your responses. DOE designs can take care of those problems for us using a concept called blocking. Blocking essentially adds another factor for each of these "nuisance variables" to your model. The difference between these factors and your "regular" ones is that these factors are not analyzed for significance. Instead, any effects due to them are subtracted out of the response so that they don't mask the effects of the factors you are really interested in. Each blocking variable uses up one available factor that you can estimate, so you need to make sure you have enough power in your design. If your design has a complicated aliasing structure, blocking makes it even worse since you've now added yet another factor. Be judicious in your design to keep your number of blocks low. Resolution is another concept that is important to understand when picking a screening design.
Common factorials can be categorized into one of four types: resolution III, resolution IV, resolution V and higher, and full factorial designs. These categories tell you what the aliasing structures of these designs are. In a resolution III design, the main effects are aliased with two-factor interactions. In a resolution IV design, two-factor interactions are aliased with other two-factor interactions and main effects are aliased with three-factor interactions. In a resolution V design, two-factor interactions are aliased with three-factor interactions, and main effects are aliased with four-factor interactions. A full factorial design has no aliasing at all (i.e. all interactions can be estimated). [1]

How then do you use this information? If you're interested in estimating the main effects and you're worried about aliasing them with two-factor interactions, you would pick a resolution IV design. However, if you expect that two-factor interactions are rare, you may be able to get away with a resolution III design.

Most DOE software has a handy table to help you select designs. Figure 1.12 is an example of such a table.

Figure 1.12 - Table with possible fractional factorial designs (by number of factors)

1. There's an easy way to remember these aliasing relationships called the finger rule. If you hold up as many fingers as the resolution number (i.e. three fingers for a resolution III design), you can split them into groups that show the aliasing structure. Three fingers can be divided into a pair and a single finger. Therefore, the main effects (the single finger) are aliased with two-factor interactions (the pair). This also works with resolution IV and V designs. Try it and see for yourself.
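To make the aliasing relationships above concrete, here is a minimal Python sketch that builds small two-level fractions and lists which effects share the same contrast column. The function names (`fractional_factorial`, `alias_groups`) and the choice of generators (C = AB, D = ABC, E = ABCD) are my own assumptions for illustration. They are common textbook choices, not something taken from any particular DOE package.

```python
from itertools import combinations, product
from math import prod

def fractional_factorial(base, generators):
    """Return a list of runs, each a dict of factor name -> -1 or +1.

    base: names of the factors run as a full factorial, e.g. "AB".
    generators: extra factor -> base factors whose product defines it,
                e.g. {"C": "AB"} means the C column equals A times B.
    (Hypothetical helper, written for this illustration only.)
    """
    runs = []
    for levels in product((-1, 1), repeat=len(base)):
        run = dict(zip(base, levels))
        for new_factor, word in generators.items():
            run[new_factor] = prod(run[f] for f in word)
        runs.append(run)
    return runs

def alias_groups(runs, max_order=2):
    """Group effects (up to max_order factors) whose contrast columns coincide."""
    factors = sorted(runs[0])
    groups = {}
    for order in range(1, max_order + 1):
        for term in combinations(factors, order):
            column = tuple(prod(run[f] for f in term) for run in runs)
            # A column and its negative carry the same information,
            # so both are filed under one key.
            key = max(column, tuple(-x for x in column))
            groups.setdefault(key, []).append("".join(term))
    return sorted(g for g in groups.values() if len(g) > 1)

res3 = fractional_factorial("AB", {"C": "AB"})        # 2^(3-1), resolution III
res4 = fractional_factorial("ABC", {"D": "ABC"})      # 2^(4-1), resolution IV
res5 = fractional_factorial("ABCD", {"E": "ABCD"})    # 2^(5-1), resolution V

print(alias_groups(res3))  # main effects paired with two-factor interactions
print(alias_groups(res4))  # two-factor interactions paired with each other
print(alias_groups(res5))  # nothing aliased among main effects and two-factor interactions
```

In this sketch only effects up to two factors are compared, so the resolution V fraction shows no aliasing at that level. Raising `max_order` to 3 would reveal its two-factor interactions paired with three-factor interactions, which matches the finger rule.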
If you study the table, you'll see that the lower the resolution, the fewer runs you need to perform to estimate the effects you care about. Intuitively, this makes sense. The more information you have (runs), the less confounding (aliasing) you have in the results. The table is also useful in determining how fractionated each design is. For example, a 2^5 full factorial has 32 runs. The half fraction of this factorial has 16 runs, is resolution V, and is denoted 2^(5-1). A quarter fraction is denoted 2^(5-2), has 8 runs and is a resolution III design. By studying the table you can quickly familiarize yourself with how this nomenclature works.

You might also notice that some designs have the same resolution and number of factors, yet one has only half as many runs as the other (2^(7-2) and 2^(7-3) are such an example). How is that possible, and why would you ever run the design with more runs? It comes down to statistical power. Power is the probability that you will detect an effect that is really there. The more power you have, the more confident you can be that you're not missing a significant effect. In a low power design, the effect of a factor has to be much larger than the variability in the measurement in order to be detected.

To complicate matters, another way to increase power is to add replicates of a design. Replication decreases the uncertainty in your measured responses, and thus in the effects you estimate from them. It's not always clear whether it's better to run a less fractionated design or just replicate a more fractionated one. If you find yourself in this conundrum, your best bet is to talk to a statistician. In most cases, I would personally choose the less fractionated design without replication, since it includes runs with more combinations of all the different factors. I therefore feel that I have more confidence in the analysis of interactions.
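One way to see why both extra runs and replicates help is to look at how precisely an effect is estimated. The sketch below is not a full power calculation. It assumes independent runs with a common standard deviation `sigma` (a hypothetical value here) and a balanced two-level design, in which case the standard error of an effect (mean at the high level minus mean at the low level) is 2 * sigma / sqrt(total runs). The function names are my own.

```python
import random
from math import sqrt
from statistics import stdev

def effect_standard_error(sigma, total_runs):
    """Standard error of a two-level effect when half of total_runs
    sit at each level and the runs are independent."""
    return 2 * sigma / sqrt(total_runs)

def simulated_effect_sd(total_runs, sigma, true_effect=1.0, trials=2000, seed=1):
    """Check the formula by simulation: estimate the effect many times
    and return the spread (standard deviation) of those estimates."""
    rng = random.Random(seed)
    half = total_runs // 2
    estimates = []
    for _ in range(trials):
        low = [rng.gauss(-true_effect / 2, sigma) for _ in range(half)]
        high = [rng.gauss(true_effect / 2, sigma) for _ in range(half)]
        estimates.append(sum(high) / half - sum(low) / half)
    return stdev(estimates)

sigma = 1.0  # assumed run-to-run standard deviation (hypothetical)

se_16 = effect_standard_error(sigma, 16)                 # 2^(7-3), run once
se_32_unreplicated = effect_standard_error(sigma, 32)    # 2^(7-2), run once
se_32_replicated = effect_standard_error(sigma, 2 * 16)  # 2^(7-3), run twice

print(se_16, se_32_unreplicated, se_32_replicated)
print(simulated_effect_sd(16, sigma))  # should land close to se_16
```

Under these assumptions, the 32-run design and the replicated 16-run design estimate a main effect with the same precision, and both beat a single 16-run design. The formula alone therefore does not settle the choice between them. What differs is which factor combinations are actually run, which is the consideration raised above.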
To summarize, when picking a design, you have to consider how many factors you have, whether you need to add blocks to the design, what resolution you need in your aliasing structure, and how many runs you are willing to perform. Once you have made those decisions, you can use a DOE software package to decide if you will have enough power to detect the differences you expect in your factors. If you don't have enough, you can either switch to a less fractionated design or add more replicates to your current design.

1.7 Response Variables - what to measure?

Once you have your design, there's one last decision to make before you actually execute your runs: what to measure. Usually, you have at least one measure in mind when you start thinking about your experiments. In assay development, some common examples include signal strength, dynamic range, curve parameters (such as slope), replicate variability, background, signal to noise, etc.
It may seem like it would be a lot of work to optimize all of those responses, but DOE makes it extremely easy. As you will see in the next chapter, once you do your runs, analysis is essentially free. So I would encourage you to think up front about measuring as many things as possible, even if you don't intend to analyze them right away. I've seen many examples where a response you didn't think would matter suddenly becomes important. If you have the data, the analysis is usually much less painful than having to go back and generate more data.

Another reason for measuring as many parameters as possible is that some responses may conflict with others. For example, optimizing only for signal strength may increase background. The best setting for both may be a compromise. You would not know that unless you analyze both responses at the same time. As you will see in the next chapter, most DOE software packages have optimization algorithms that will help you find the optimum settings across all your responses.
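As a rough illustration of how such a compromise can be found, the sketch below uses a desirability function, one common approach to multi-response optimization. Each response is rescaled to a score between 0 and 1, and the scores are combined with a geometric mean. I am not claiming this is what any particular software package does. The candidate settings, the predicted signal and background values, and the acceptable ranges are all made up for the example.

```python
from math import prod

def larger_is_better(value, worst, best):
    """Score 0 at or below `worst`, 1 at or above `best`, linear in between."""
    return min(1.0, max(0.0, (value - worst) / (best - worst)))

def smaller_is_better(value, best, worst):
    """Score 1 at or below `best`, 0 at or above `worst`, linear in between."""
    return min(1.0, max(0.0, (worst - value) / (worst - best)))

def overall_desirability(scores):
    """Geometric mean, so one unacceptable response (score 0) vetoes a setting."""
    return prod(scores) ** (1 / len(scores))

# Hypothetical model predictions at three candidate settings of one factor.
candidates = {
    "low":  {"signal": 0.8, "background": 0.05},
    "mid":  {"signal": 1.6, "background": 0.12},
    "high": {"signal": 2.0, "background": 0.40},
}

def score(responses):
    d_signal = larger_is_better(responses["signal"], worst=0.5, best=2.0)
    d_background = smaller_is_better(responses["background"], best=0.05, worst=0.45)
    return overall_desirability([d_signal, d_background])

best_by_signal = max(candidates, key=lambda name: candidates[name]["signal"])
best_overall = max(candidates, key=lambda name: score(candidates[name]))

print(best_by_signal)  # the setting you would pick on signal alone
print(best_overall)    # the compromise once background is considered too
```

With these invented numbers, the setting that maximizes signal alone is not the one with the best combined score, which is exactly the situation described above.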