A Practical Guide to
Design of Experiments (DOE)
for Assay Developers

Daniel Joelsson

© 2010 Daniel Joelsson All rights reserved
Introduction

Why write this book? Those of you that know me probably aren't surprised. I've been a strong advocate for a more systematic approach to assay development for many years, to the point of annoyance for many of my colleagues. I feel that using Design of Experiments (DOE) in assay development is the most efficient way to develop quality assays in the least amount of time possible. Then why aren't scientists using DOE more widely? I think part of the problem is a lack of understanding of what DOE really is and why it's useful. I looked for a good introduction to DOE for assay development scientists and I couldn't find one. Most of the texts out there are engineering and process focused. Therefore, I'm writing my own guide for us scientists and assay developers.

I should state up front that I'm not a statistician, I'm a scientist. This guide is not intended for statisticians and I will not try to turn you into one either. However, some basic understanding of statistics is important to fully appreciate DOE. I will also assume that you are familiar with basic biology and biochemistry techniques. My background is primarily in the design of bioassays and immunoassays for vaccines and biotherapeutics, and most of my examples will come out of those disciplines. I will do my best to explain any statistical topics in a relevant and pragmatic way, but in case you want more, these are two resources that I have found to be excellent texts on statistics for scientists:

The Biostatistics Cookbook - The Most User-Friendly Guide for the Bio/Medical Scientist by Seth Michelson and Timothy Schofield

Biometry: The Principles and Practices of Statistics in Biological Research by Robert Sokal and F. James Rohlf

This book is meant to be a guide for beginning to intermediate DOE users, slanted completely towards scientists who do assay development. If you would like a more in-depth discussion of DOE and the statistics behind it, Design and Analysis of Experiments by Douglas C. Montgomery is a great resource to get you started.

While I will try to teach you enough about DOE to allow you to design and analyze your own experiments, a statistician familiar with assay development can be a valuable resource. What this book will give you is an understanding of the concepts and language of DOE so that you can easily communicate with your statistics friends. One of my goals is to increase the overlap in understanding on both sides. You might want to give a copy of this book to your statistician as well, especially if they are not familiar with the pitfalls of assay design.
Before we start, a few thanks are in order. Thanks to Tim Schofield for schooling me on statistics from the first day we met. Thanks to Edith Senderak for all of the mentoring and collaborations over the years. Thanks to Joe Pigeon for sparking my initial interest in DOE as well as introducing me to split-plot designs (which, as you will see, are perfect for developing assays in 96-well plates). Thanks to all of my colleagues for trusting me enough to allow me to help with their DOE designs.

This book is being published under a Creative Commons license. You are free to distribute this work to anyone you think would be interested, free of charge. You may not use any portion of this book for any commercial purpose without prior permission. You may create derivatives of this work, as long as these derivatives adhere to the same license restrictions. For complete license information, please visit: http://creativecommons.org/licenses/by-nc-sa/3.0/
Chapter 1 - Screening Designs

1.1 One Factor At a Time (OFAT) and Interactions

Think back to your early years of scientific training for a moment. If your experience was anything like mine, you were first taught to do experiments using the scientific method. It went something like this: first generate a hypothesis based on past knowledge, design an experiment to test the hypothesis, analyze the data, and then refine the hypothesis. Repeat until we get to the answer. This is the correct way to do science. Unfortunately, it is in the second step where we usually learn some bad habits. We were told to vary one factor at a time (OFAT) and hold everything else constant. At first glance, OFAT makes a lot of sense, but it can be misleading.

Let's look at a simple assay development experiment. Imagine that you're trying to develop an ELISA to detect a bacterial contaminant in a sample from a process development study for a new biotherapeutic. Two of the variables you assume to be important are the amount of capture antibody you add to your plate and the time you incubate your sample. Consistent with the OFAT approach, you start out by testing two different antibody concentrations, 0.2 ug/ml and 1.0 ug/ml, while keeping the incubation time constant at one hour. You then run the assay and faithfully record the results in your lab notebook (Table 1.1).

Table 1.1
Antibody concentration   Incubation time   Response (O.D.)
0.2 ug/ml                1 hour            0.31
1.0 ug/ml                1 hour            0.97

Great! Obviously, adding more antibody increases the response (it increased from 0.31 to 0.97). Your initial hypothesis that adding more antibody would be beneficial was correct. Good to know! What if you increased it even further? Let's find out what happens (Table 1.2).
Table 1.2
Antibody concentration   Incubation time   Response (O.D.)
0.2 ug/ml                1 hour            0.31
1.0 ug/ml                1 hour            0.97
1.5 ug/ml                1 hour            0.95

Ok, it appears that adding even more antibody does not further increase the signal. So you conclude that the response has been saturated somewhere around 1.0 ug/ml. Since the conditions for the first variable have been optimized, let's look at the second variable, incubation time. Again, you do a simple experiment, increasing the incubation time to two hours. Since you already know that 1.0 ug/ml of antibody is ideal, you only need to run one experiment with 1.0 ug/ml of antibody for two hours of incubation. The results are shown in Table 1.3.

Table 1.3
Antibody concentration   Incubation time   Response (O.D.)
0.2 ug/ml                1 hour            0.31
1.0 ug/ml                1 hour            0.97
1.5 ug/ml                1 hour            0.95
1.0 ug/ml                2 hours           1.72

Even better! Clearly two hours is better than one. What if you increase the incubation time even further? You try three hours and see what happens (Table 1.4).

Table 1.4
Antibody concentration   Incubation time   Response (O.D.)
0.2 ug/ml                1 hour            0.31
1.0 ug/ml                1 hour            0.97
1.5 ug/ml                1 hour            0.95
1.0 ug/ml                2 hours           1.72
1.0 ug/ml                3 hours           1.74

Again, increasing the variable did not have a further effect on the response. So you conclude that the optimum settings for this assay are 1.0 ug/ml of antibody and a two-hour incubation.
Since you've now tested and optimized both variables, you're done, right? Well, maybe not. There is one experiment that was not performed: 0.2 ug/ml of antibody incubated for two hours. Let's run that experiment and see what happens (Table 1.5).

Table 1.5
Antibody concentration   Incubation time   Response (O.D.)
0.2 ug/ml                1 hour            0.31
1.0 ug/ml                1 hour            0.97
1.5 ug/ml                1 hour            0.95
1.0 ug/ml                2 hours           1.72
1.0 ug/ml                3 hours           1.74
0.2 ug/ml                2 hours           2.62

Whoa! The response is now much higher than what we previously considered optimal. There are now all kinds of experiments we should probably run to make sure we have all the answers we need. What happens if we use less than 0.2 ug/ml? What about testing other incubation times with 0.2 ug/ml?

This example highlights one of the greatest weaknesses of OFAT experiments: you are very likely to miss an interaction between two or more of the variables. An interaction occurs when the effects of two variables are not completely independent of each other; i.e., the response from one variable depends on the level of the other. Since the OFAT approach (and traditional scientific training) assumes that all variables are additive, it doesn't encourage us to test for interactions. The only way to find them is by luck. Unfortunately, interactions like these happen all the time in the real world.

For scientists trained in linear thinking, interactions can sometimes be hard to visualize. There's a specific graph called an interaction plot that makes it slightly easier (Figures 1.1 and 1.2). In an interaction plot where the two variables have no interaction, the two response lines will be perfectly parallel (Figure 1.1). The two lines represent the response due to Factor A at the two different levels of Factor B. Since the lines are perfectly parallel, the effect of A and B is completely additive. When you increase Factor B, it shifts the entire Factor A line upwards by the same amount. This is the kind of relationship we assume is always in place in an OFAT experimental design.

If the lines of the interaction plot intersect (or are simply non-parallel), there's an interaction between the two variables (Figure 1.2). Increasing Factor B no longer just moves the Factor A line higher. Instead, at the lower level of B, the response in Factor A actually decreases from low to high. Clearly the effects of the two variables are not simply additive. This is exactly the situation we observed in the assay development example above.
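The interaction in the ELISA example can be put into numbers directly from the four "corner" runs. The following is my own minimal sketch (not code from this book), using the standard contrast formulas for a two-level, two-factor experiment:

```python
# Effect estimates from the four "corner" runs of the ELISA example
# (antibody concentration x incubation time). Responses are keyed by
# (antibody level, time level), with -1 = low and +1 = high.
responses = {
    (-1, -1): 0.31,  # 0.2 ug/ml, 1 hour
    (+1, -1): 0.97,  # 1.0 ug/ml, 1 hour
    (-1, +1): 2.62,  # 0.2 ug/ml, 2 hours
    (+1, +1): 1.72,  # 1.0 ug/ml, 2 hours
}

def effect(signs):
    """Average response at the + level minus the average at the - level."""
    return sum(s * y for s, y in zip(signs, responses.values())) / 2

a_signs = [a for (a, b) in responses]        # antibody main effect
b_signs = [b for (a, b) in responses]        # time main effect
ab_signs = [a * b for (a, b) in responses]   # interaction: product of signs

print(f"antibody effect:    {effect(a_signs):+.2f}")
print(f"time effect:        {effect(b_signs):+.2f}")
print(f"interaction effect: {effect(ab_signs):+.2f}")
```

The antibody main effect averages out to nearly nothing, while the interaction term is large and negative: exactly the crossing lines of Figure 1.2, and something the OFAT runs alone could never have revealed.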
Figure 1.1 - Interaction plot for two factors without an interaction

Figure 1.2 - Interaction plot for two factors with an interaction

1.2 Factorial experiments - foundations of DOE

I hope I convinced you in the previous section that interactions between variables can be an important factor in the success of your experiments, and that the OFAT approach makes it hard to find them unless you're willing to do a lot of work. This section will introduce some of the basics of Design of Experiments (DOE). DOE is a more systematic approach to experimentation than the OFAT approach. The goal is to test all of the variables in a system in a set of multi-factorial experiments, allowing us to optimize all of them and find interactions at the same time.

The assay development example in the last section made it clear that it would have been a good idea to run all four combinations of the two variables right away. This type of experiment is called a factorial experiment. A picture of the four different experiments (or "runs") shows how we start exploring the "design space" of our experimental system (Figure 1.3). As you can see, we have covered the four corners of the space. The runs can also be shown in a table such as Table 1.6. During each run, a factor can take on a low value (depicted as "-") or a high value (depicted as "+"). With two factors there are four such combinations possible. By performing all the possible combinations of the factors, we will be able to tell not only if each of the factors is important (i.e., changing them has an effect on the response), but also if any interactions are present.

Figure 1.3 Pictorial depiction of a 2^2 factorial experiment.

Table 1.6
Run #   Factor A   Factor B
Run 1   +          +
Run 2   -          +
Run 3   +          -
Run 4   -          -

Let's expand the same thinking to three factors (Figure 1.4 and Table 1.7). With three factors we have to complete eight runs to cover every combination of high and low for all three factors. But again, you get a lot of information out of those runs: you will know if any or all of the factors influence the response, if any two-factor interactions exist (AxB, AxC, BxC), and if there's a three-factor interaction (AxBxC).
Figure 1.4 Pictorial depiction of a 2^3 factorial experiment

Table 1.7
Run #   Factor A   Factor B   Factor C
1       +          +          +
2       -          +          +
3       +          -          +
4       -          -          +
5       +          +          -
6       -          +          -
7       +          -          -
8       -          -          -

This exercise can be continued for four factors and so on. Performing eight runs for three factors might seem reasonable, but as you may have figured out already, as we add factors, the number of runs increases exponentially (Table 1.8). Since most assay systems contain more than three or four factors, factorial experiments quickly become too large to feasibly perform. With this in mind, it's easy to see how OFAT is still widely practiced. Factorial experiments rapidly become unwieldy as the number of factors goes up.
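Run tables like Table 1.7 can be generated mechanically: a two-level full factorial is simply every combination of "-" and "+" across the factors. A short sketch (the function name is mine, not from the book):

```python
from itertools import product

def full_factorial(n_factors):
    """All 2^n combinations of low ("-") and high ("+") for n factors."""
    return list(product("-+", repeat=n_factors))

# Eight runs for three factors, as in Table 1.7.
for run, levels in enumerate(full_factorial(3), start=1):
    print(run, *levels)

# The exponential growth in run count falls out directly (cf. Table 1.8).
for k in range(1, 11):
    print(k, "factors ->", len(full_factorial(k)), "runs")
```

Enumerating the runs this way also makes the problem obvious: the list doubles with every factor you add.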
Table 1.8
Number of factors   Number of runs
1                   2
2                   4
3                   8
4                   16
5                   32
6                   64
7                   128
8                   256
9                   512
10                  1024

But, since OFAT experiments make it very difficult to find those elusive interactions, we need an alternative approach. Luckily, DOE provides that in the form of fractional factorials, the topic of the next section. However, before I get to that topic, I want to address a common question: how come we only use two levels (high and low) for each factor? Obviously, it would be risky to assume that any response is perfectly linear between two points. It might be that the response has a complex shape in that space. In the example in Figure 1.5, the optimum response is actually somewhere between our low and high settings.

This is where DOE differs from traditional experimentation. The goal of these factorial experiments is not to optimize the response completely, but to screen for the few factors that actually affect the response. These factors are then carried forward in a set of optimization experiments. Not all factors are likely to be important in every system, so we should do fairly low resolution experiments to identify the ones that truly matter. As you will see, screening experiments have relatively fewer runs per factor than optimization experiments. We can afford to include more factors in the initial experiments to make sure that we don't miss any of the critical few.
Figure 1.5 - Non-linear response

1.3 Fractional Factorials

In the last section, we learned that factorial experiments are useful designs to pick up both main and interaction effects in our experimental system. But for more than a few factors, they quickly become too large to be feasible. In most assay development experiments, we can easily identify ten or more factors that could be important (reagent concentrations, incubation temperatures, incubation times, etc.). We need a different approach. There is a set of DOE designs called fractional factorials that meet this need nicely.

Let's go back to a simple three-factor factorial experiment (Figure 1.6). In this experiment, we have three factors. Since we are doing screening experiments to identify the important factors, we will only be testing two levels per factor. Now imagine taking the three-dimensional space in Figure 1.6 and condensing it down to two dimensions (Figure 1.7). In effect, we are now looking at the front of the cube so that we can't see the different levels of factor C any longer. We can take this thinking one step further and compress the resulting square into a single dimension (Figure 1.8).
Figure 1.6 Three factor screening experiment.

Figure 1.7 Three factor screening experiment compressed into two dimensions.

Figure 1.8 Three factor screening experiment compressed into one dimension.

When we look at our experimental space this way, something interesting happens. It becomes clear that we have actually run four replicates of each of the two levels for factor A. This property of factorial designs is called hidden replication. While each of the eight experimental runs has a different combination of levels of all three factors, each factor's level is actually replicated four times.
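The hidden replication described above is easy to check by projecting the eight runs onto one factor at a time, as a quick sketch of mine (not from the book) shows:

```python
from itertools import product

# Full 2^3 factorial: eight runs over factors A, B, C coded as -1 / +1.
runs = list(product([-1, +1], repeat=3))

# "Hidden replication": project the eight runs onto any single factor and
# each of its two levels appears four times.
for factor in range(3):
    low = sum(1 for r in runs if r[factor] == -1)
    high = sum(1 for r in runs if r[factor] == +1)
    print(f"factor {'ABC'[factor]}: {low} runs at low, {high} runs at high")
```

The count is four at each level for every factor, which is exactly what the compressed views in Figures 1.7 and 1.8 show pictorially.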
Note that it was completely arbitrary that we chose factor A for this analysis. The same thing happens if we compress the design for either factor B or C. This is a useful characteristic of factorial designs (and fractional factorial designs, as we will see in a little bit) called rotatability. Because the design is perfectly symmetrical, we can in effect assign any factor to be A, B, or C and it doesn't matter. We can get away with this because instead of using the actual numbers for the setting of each factor, we are going to use coded values of -1 or +1. This might seem confusing for now, but bear with me. It will make more sense when we discuss how we analyze these designs. What it does for us now is to scale the levels of each factor to have the same distance, thus making the design rotatable.

There is also a second characteristic of these designs worth noting. The reason we could compress the figure down into a single dimension is a property called orthogonality. Notice how the axes for the three factors in Figure 1.6 are at 90 degree angles to each other? This means that when we compress the design down to a single dimension, the effects of all the other factors cancel out, allowing us to estimate just the effect of the one variable we want. This is what we mean by a multi-variable experiment. Unlike in an OFAT study, we truly can vary all of our variables at the same time and still be able to distinguish the effects of each.

Looking for hidden replication in our experimental space showed us that each factor was actually replicated four times. Since we probably don't need that many replicates to tell if the response changes from the low to the high setting, let's eliminate some of these extra runs and see what happens. One of the possible ways of doing this is shown in Figure 1.9. As you can see, we have eliminated half of the runs. The amazing thing is that we are still doing two replicates of each level for all three factors. With only four runs we can estimate the effects of three variables with two replicates at each setting. That's a pretty good return on your experimental investment.
Figure 1.9 Three factor fractional factorial experiment (half-fraction).

We could try to eliminate another two runs (Figure 1.10), but this time we have pushed things too far. Now we can't estimate factor C any longer. For three factors, we can only eliminate half of the runs and still have a viable design. However, for designs with more than three factors, we can eliminate more than half of the runs and still estimate all of the main effects. In fact, the more factors we have, the more runs we can eliminate.

Figure 1.10 Three factor fractional factorial (quarter fraction).
1.4 Statistical Power and Aliasing in Fractional Factorials

You're probably thinking that if we just eliminated half of our work, there has to be a catch. You're right. An immediate impact is that we have lowered the number of replicates per data point from four to two. This has the effect of lowering the statistical power of our design. We would need a larger effect in the response when going from low to high in order to see it. But that might be a trade-off you are willing to make, especially if you only care about large (relative to the experimental system) effects. How do you know how much of a trade-off you are making? That's a function of the underlying variability of the system you are looking at. If the variability is large and the effect you expect is small, you will need more replicates. Power calculations are described thoroughly in most statistics books (see the introduction to this book for sources), but you probably won't have to calculate power by hand. All of the specialized DOE software on the market will do this calculation for you. I'll talk more about these packages when I discuss the analysis of screening designs in the next chapter. For now, let's assume that you have enough power to estimate the effects you're expecting to see.

The other penalty we take when eliminating runs from our design is that we create what is termed aliasing of effects. In the discussion of full factorial experiments, I explained the concept of an interaction between two factors. A consequence of fractional factorials is the confounding of interactions with main effects, or with each other. Let me explain what that means in a little more detail. Let's look back at our three-factor fractional factorial in Figure 1.9. Recall from our discussion in section 1.1 that if we wanted to estimate an interaction between factors A and B, we would have to run all four of the "corners" of the square diagram. Unfortunately, in our fractional factorial, two of those corners are run at the lower level of factor C and the other two are run at the higher level of factor C. The design is no longer orthogonal with respect to the interaction AB and factor C. The same is the case for the interactions BC and AC. In the nomenclature of DOE, we would say that the main effect of A is aliased with the interaction BC. The notation we use to describe this relationship is as follows:

A = A + BC
B = B + AC
C = C + AB

The effect we observe due to factor A is a combination of A and the interaction BC, and we can't tell how much is contributed by each.
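This aliasing can be demonstrated in a few lines. The sketch below (mine, not the book's) builds the half-fraction by keeping the runs where the product of the three factor signs is +1 — what statisticians call the defining relation I = ABC, which matches the runs retained in Figure 1.9 — and then shows that the sign column for A is identical to the column for the BC interaction:

```python
from itertools import product

full = list(product([-1, +1], repeat=3))  # full 2^3 factorial

# Half-fraction: keep the runs where A*B*C = +1 (defining relation I = ABC).
half = [r for r in full if r[0] * r[1] * r[2] == +1]
print(half)  # four runs; each level of each factor still replicated twice

# Aliasing: within the half-fraction, the sign column for A is identical to
# the sign column for the BC interaction, so their contrasts are confounded.
col_A = [a for a, b, c in half]
col_BC = [b * c for a, b, c in half]
print(col_A == col_BC)
```

The same check confirms B = B + AC and C = C + AB: every main-effect column coincides with the column of the two-factor interaction it is aliased with.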
This relationship can also be described using the notation in Table 1.9. Since this is not a statistics book, I will take some liberties in explaining this table. For a more complete discussion, please see the book by Montgomery listed in the introduction or any other statistics-based DOE book. Table 1.9 is similar to the run tables in previous sections, with some differences. Instead of run # in the first column, it is now labeled "treatment combination". I have also added columns for all of the possible interactions and one labeled "I", which stands for identity.

While this table may not make a lot of sense right now, there are a few things we can take away from it. In Chapter 2, we will learn how to build a mathematical model that relates the settings of the individual factors to the level of the output response. In that model, each factor will be preceded by a constant. The value of that constant will be solved for by using the + and - signs in each column. For now, look carefully at the table and you will notice that each column has a unique pattern. We can therefore solve for the constant for each term in the model. In the bottom half of the table, I have set apart the combinations that we eliminated in Figure 1.9.

Let's take a closer look at the remaining pluses and minuses in the top of the table. If we say that these signs represent coefficients in an equation for estimating the effects, then you will notice that the coefficients in column A are the same as those in column BC (in the top half of the table). The same goes for column B with AC, and column C with AB. Column ABC is identical to column I, which means that we are no longer able to estimate the three-factor interaction.

Table 1.9
Treatment
combination   I   A   B   C   AB   AC   BC   ABC
a             +   +   -   -   -    -    +    +
b             +   -   +   -   -    +    -    +
c             +   -   -   +   +    -    -    +
abc           +   +   +   +   +    +    +    +
--- runs eliminated in Figure 1.9 ---
ab            +   +   +   -   +    -    -    -
ac            +   +   -   +   -    +    -    -
bc            +   -   +   +   -    -    +    -
(1)           +   -   -   -   +    +    +    -

Whether you're more comfortable thinking about aliases graphically or by using the table, the end result is the same. When we reduce the number of runs in a factorial, the number of aliases increases, and with it we lose the ability to estimate some of the effects in the experiment.
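The sign columns in Table 1.9 are exactly the coefficients used to estimate each effect: multiply each run's response by the column's sign, sum, and divide by half the number of runs. Here is a sketch of that contrast calculation on a full 2^3 factorial; the response values are hypothetical numbers I made up for illustration, not data from this book:

```python
from itertools import product

# Full 2^3 sign table: each run is (A, B, C); interaction columns are
# elementwise products of the main-effect columns.
runs = list(product([-1, +1], repeat=3))

# Hypothetical responses for the eight treatment combinations (illustrative).
ys = [0.30, 0.35, 0.90, 0.95, 0.32, 0.36, 1.60, 1.66]

def estimate(signs, ys):
    """Contrast estimate: sign-weighted sum of responses over half the runs."""
    return sum(s * y for s, y in zip(signs, ys)) / (len(ys) / 2)

columns = {
    "A":   [a for a, b, c in runs],
    "B":   [b for a, b, c in runs],
    "C":   [c for a, b, c in runs],
    "AB":  [a * b for a, b, c in runs],
    "AC":  [a * c for a, b, c in runs],
    "BC":  [b * c for a, b, c in runs],
    "ABC": [a * b * c for a, b, c in runs],
}
for name, signs in columns.items():
    print(f"{name:>3}: {estimate(signs, ys):+.3f}")
```

Because every column in the full factorial has a unique sign pattern, each effect gets its own independent estimate; in the half-fraction, aliased columns share a pattern and therefore share one estimate.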
As you might have noticed, we could just as easily have eliminated the other half of the runs in the experiment. The aliasing structure then takes on a negative relationship (i.e., A = A - BC, etc.), but the principle is the same: we can no longer estimate the effects independently.

1.5 Other Screening Designs

While fractional factorial designs are the easiest screening designs to understand due to their simple aliasing structures, they are not the only designs available if you are looking to identify important factors. In fact, there are whole families of designs that aim to minimize the number of runs while still being able to identify main effects. I'll discuss one of the more popular ones briefly here, and get to the more advanced designs in Chapter 4. Plackett-Burman designs are a family of two-level screening designs that use the smallest number of runs possible for situations where you have 11, 15, 19, 23, 27, or 31 factors. In fact, you only need one more run than the number of factors you have. These designs have extremely complex aliasing structures.
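In practice your DOE software will build these designs for you, but the construction itself is simple: a single generator row is cyclically shifted, and a final row of all minuses is appended. A sketch of mine for the 12-run, 11-factor case, assuming the standard published Plackett-Burman generator for N = 12:

```python
# 12-run Plackett-Burman design for 11 factors, built by cyclically shifting
# the published N = 12 generator row and appending a row of all -1s.
generator = [+1, +1, -1, +1, +1, +1, -1, -1, -1, +1, -1]

design = []
for shift in range(11):
    design.append(generator[-shift:] + generator[:-shift])  # cyclic rotation
design.append([-1] * 11)  # final run: every factor at its low level

# Every column is balanced (six highs, six lows) and any two columns are
# orthogonal, which is what lets all 11 main effects be estimated.
for j in range(11):
    assert sum(row[j] for row in design) == 0
for i in range(11):
    for j in range(i + 1, 11):
        assert sum(r[i] * r[j] for r in design) == 0
print("12 runs; 11 balanced, mutually orthogonal columns")
```

Twelve runs for eleven factors is remarkably economical; the price, as the expansion below makes painfully clear, is the aliasing structure.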
Here's an example of the aliasing structure for the main effect of factor A in an 11-factor design:

[A] = A - 0.333 * BC - 0.333 * BD - 0.333 * BE + 0.333 * BF - 0.333 * BG - 0.333 * BH
+ 0.333 * BJ + 0.333 * BK - 0.333 * BL + 0.333 * CD - 0.333 * CE - 0.333 * CF
+ 0.333 * CG - 0.333 * CH + 0.333 * CJ - 0.333 * CK - 0.333 * CL + 0.333 * DE
+ 0.333 * DF - 0.333 * DG - 0.333 * DH - 0.333 * DJ - 0.333 * DK - 0.333 * DL
- 0.333 * EF - 0.333 * EG - 0.333 * EH - 0.333 * EJ + 0.333 * EK + 0.333 * EL
- 0.333 * FG + 0.333 * FH - 0.333 * FJ - 0.333 * FK - 0.333 * FL + 0.333 * GH
- 0.333 * GJ + 0.333 * GK - 0.333 * GL - 0.333 * HJ - 0.333 * HK + 0.333 * HL
- 0.333 * JK + 0.333 * JL - 0.333 * KL - 0.333 * BCD + 0.333 * BCE - 0.333 * BCF
+ 0.333 * BCG + 0.333 * BCH + 0.333 * BCJ + 0.333 * BCK - 0.333 * BCL + 0.333 * BDE
+ 0.333 * BDF + 0.333 * BDG - 0.333 * BDH - 0.333 * BDJ + 0.333 * BDK + 0.333 * BDL
+ 0.333 * BEF - 0.333 * BEG + 0.333 * BEH - 0.333 * BEJ + 0.333 * BEK - 0.333 * BEL
- 0.333 * BFG + 0.333 * BFH + 0.333 * BFJ + 0.333 * BFK + 0.333 * BFL - 0.333 * BGH
+ 0.333 * BGJ + 0.333 * BGK + 0.333 * BGL + 0.333 * BHJ - 0.333 * BHK + 0.333 * BHL
+ 0.333 * BJK + 0.333 * BJL - 0.333 * BKL + 0.333 * CDE + 0.333 * CDF + 0.333 * CDG
+ 0.333 * CDH + 0.333 * CDJ - 0.333 * CDK + 0.333 * CDL - 0.333 * CEF - 0.333 * CEG
+ 0.333 * CEH + 0.333 * CEJ - 0.333 * CEK + 0.333 * CEL + 0.333 * CFG + 0.333 * CFH
- 0.333 * CFJ + 0.333 * CFK + 0.333 * CFL + 0.333 * CGH + 0.333 * CGJ + 0.333 * CGK
- 0.333 * CGL - 0.333 * CHJ - 0.333 * CHK - 0.333 * CHL + 0.333 * CJK + 0.333 * CJL
+ 0.333 * CKL + 0.333 * DEF - 0.333 * DEG - 0.333 * DEH + 0.333 * DEJ + 0.333 * DEK
+ 0.333 * DEL + 0.333 * DFG + 0.333 * DFH - 0.333 * DFJ + 0.333 * DFK - 0.333 * DFL
+ 0.333 * DGH - 0.333 * DGJ - 0.333 * DGK + 0.333 * DGL + 0.333 * DHJ + 0.333 * DHK
- 0.333 * DHL + 0.333 * DJK + 0.333 * DJL - 0.333 * DKL + 0.333 * EFG - 0.333 * EFH
+ 0.333 * EFJ + 0.333 * EFK - 0.333 * EFL + 0.333 * EGH + 0.333 * EGJ + 0.333 * EGK
+ 0.333 * EGL - 0.333 * EHJ + 0.333 * EHK + 0.333 * EHL - 0.333 * EJK + 0.333 * EJL
+ 0.333 * EKL + 0.333 * FGH + 0.333 * FGJ - 0.333 * FGK - 0.333 * FGL + 0.333 * FHJ
- 0.333 * FHK + 0.333 * FHL - 0.333 * FJK + 0.333 * FJL + 0.333 * FKL - 0.333 * GHJ
+ 0.333 * GHK + 0.333 * GHL + 0.333 * GJK - 0.333 * GJL + 0.333 * GKL + 0.333 * HJK
+ 0.333 * HJL + 0.333 * HKL - 0.333 * JKL

As you can see, you really don't want to try to figure out by hand whether any of these interactions are confounded with the main effect. But if you look closely, you can also see that the main effect of A is only aliased with interactions of other factors. Thus, you can use these designs to estimate all of the main effects without a problem. That being said, if you think you will have any interactions at all, you're better off running a few more runs in a fractional factorial. During assay development, it's rare not to have any interactions. In my experience, these types of designs are most often used in robustness experiments where you expect very few of the factors to be significant. In those cases, a follow-up experiment can be used to further investigate the presence of interactions after the Plackett-Burman design has been used to eliminate most of the factors from consideration.

1.6 How to pick a design - blocking, resolution, and power

Let's say you have an experimental system in mind. You have identified the factors you want to investigate. How do you get started? First, we have to decide on a low and a high level setting for each of your factors. This is where some of the "art" of DOE comes into play, and why you always need a subject matter expert involved in the design phase.
The best guidance I can give you is to set the levels of your factors aggressively, but not too aggressively. Helpful, huh? Let's break down that statement. What exactly does it mean to set your levels aggressively? Imagine that you have a response that increases as you move from the low to the high level of one of the factors (Figure 1.11).
Figure 1.11 - Picking the correct levels for a factor.

In a screening design, you will usually run just two levels of a factor (and sometimes a center point). If you picked the two levels shown in red in Figure 1.11, you may not be able to detect a change in the response. Instead, it makes more sense to pick the two green levels; you would probably be able to detect the difference between them. Remember that the point is not to optimize the settings of the factor, just to identify the factors that are actually impacting your system. Also notice how the response continues to change outside of the green levels. You don't want to pick levels that sit on the edge of the area you have explored in the past. Select levels aggressively, but not too aggressively.

Another problem that often occurs in DOE designs for assay development is that you have to spread the testing across several operators, days, lots of reagents, etc., and you're worried that these changes will affect your responses. DOE designs can take care of those problems using a concept called blocking. Blocking essentially adds another factor for each of these "nuisance variables" to your model. The difference between these factors and your "regular" ones is that the blocking factors are not analyzed for significance. Instead, any effects due to them are subtracted from the other responses so that they don't mask the effects of the factors you are really interested in. Each blocking variable uses up one available factor that you can estimate, so you need to make sure you have enough power in your design. If your design has a complicated aliasing structure, blocking makes it even worse, since you've now added yet another factor. Be judicious in your design and keep your number of blocks low.

Resolution is another concept that is important to understand when picking a screening design.
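To make the blocking idea concrete, here is a minimal Python sketch (standard library only; the two-block assignment via the ABC interaction is the classic textbook scheme, not an example taken from this book). It splits a full 2^3 factorial into two blocks — say, two days or two operators — by confounding the block with the three-factor interaction ABC, so the block "uses up" only that one estimable effect:

```python
from itertools import product

# Full 2^3 factorial at coded levels -1/+1. Assign each run to a
# block based on the sign of the ABC interaction, so the block is
# confounded with ABC rather than with any main effect or
# two-factor interaction.
runs = list(product([-1, 1], repeat=3))
for a, b, c in runs:
    block = 1 if a * b * c == 1 else 2
    print(f"A={a:+d}  B={b:+d}  C={c:+d}  ->  block {block}")
```

Within each block, every factor still appears at both levels equally often, so a day-to-day shift moves a whole block up or down and can be subtracted out without distorting the main effects or two-factor interactions.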
Common factorials can be categorized into one of four types: resolution III, resolution IV, resolution V and higher, and saturated designs. These categories tell you
what the aliasing structures of these designs are. In a resolution III design, the main effects are aliased with two-factor interactions. In a resolution IV design, two-factor interactions are aliased with other two-factor interactions, and main effects are aliased with three-factor interactions. In a resolution V design, two-factor interactions are aliased with three-factor interactions, and main effects are aliased with four-factor interactions.1 A saturated design sits at the other extreme: every run is used to estimate a main effect, so the main effects are aliased with strings of interactions (the Plackett-Burman designs from the previous section are an example). How then do you use this information? If you're interested in estimating the main effects and you're worried about aliasing them with two-factor interactions, you would pick a resolution IV design. However, if you expect that two-factor interactions are rare, you may be able to get away with a resolution III design. Most DOE software has a handy table to help you select designs. Figure 1.12 is an example of such a table.

Figure 1.12 - Table with possible fractional factorial designs, by number of factors

1. There's an easy way to remember these aliasing relationships, called the finger rule. If you hold up the same number of fingers as the name of the resolution (i.e. three fingers for a resolution III design), you will be able to split them into groups that show the aliasing structure. Three fingers can be divided into a pair and a single finger; therefore, the main effects (the single finger) are aliased with two-factor interactions (the pair). This also works with resolution IV and V designs. Try it and see for yourself.
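To see resolution III aliasing concretely, here is a small Python sketch (standard library only; the -1/+1 coding and the generator C = AB are the conventional textbook construction, assumed rather than taken from the book's examples). It builds the half-fraction 2^(3-1) and confirms that the column for main effect C is identical to the column for the AB interaction:

```python
from itertools import product

# Full factorial in A and B at coded levels -1/+1, then generate C
# from the defining relation C = A*B. The four resulting runs are
# the 2^(3-1) half-fraction, a resolution III design.
runs = [(a, b, a * b) for a, b in product([-1, 1], repeat=2)]

for a, b, c in runs:
    print(f"A={a:+d}  B={b:+d}  C={c:+d}")

# In every run, the C column equals the elementwise product of the
# A and B columns, so C is aliased with the AB interaction:
assert all(c == a * b for a, b, c in runs)
```

Because the two columns are indistinguishable, the "C effect" you estimate is really the sum of the C main effect and the AB interaction — exactly the main-effect/two-factor-interaction aliasing that defines a resolution III design.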
If you study the table, you'll see that the lower the resolution, the fewer runs you need to perform to estimate the effects you care about. Intuitively, this makes sense: the more information you have (runs), the less confounding (aliasing) you have in the results. The table is also useful in determining how fractionated each design is. For example, a 2^5 full factorial has 32 runs. The first half fraction of this factorial has 16 runs, has resolution V, and is denoted 2^(5-1). A quarter fraction is denoted 2^(5-2), has 8 runs, and is a resolution III design. By studying the table, you can quickly familiarize yourself with how this nomenclature works.

You might also notice that some designs have the same resolution and number of factors, yet one has only half as many runs as the other (2^(7-2) and 2^(7-3) are one such pair). How is that possible, and why would you ever run the design with more runs? It comes down to statistical power. Power is what gives you the confidence that you actually detected a response. The way the analysis works, the more power you have, the more confident you can be that you're not missing a significant effect. In a low-power design, the effect of a factor has to be much higher than the variability in the measurement in order to be detected.

To complicate matters, another way to increase power is to add replicates of a design. Replication decreases the uncertainty, and thus the variability, of your responses. It's not always clear whether it's better to run a less fractionated design or just replicate a more fractionated one. If you find yourself in this conundrum, your best bet is to talk to a statistician. In most cases, I would personally choose the less fractionated design without replication, since it includes runs with more combinations of all the different factors, and I therefore have more confidence in the analysis of interactions.
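As a back-of-the-envelope illustration of why more runs mean more power, here is a sketch (assuming independent runs with a common standard deviation sigma; the 0.05 O.D. value is invented for illustration). In a balanced two-level design, an effect estimate is the difference between the means of the N/2 high-level runs and the N/2 low-level runs, so its standard error is 2*sigma/sqrt(N):

```python
import math

def effect_standard_error(sigma: float, n_runs: int) -> float:
    """SE of an effect estimate in a balanced two-level design.

    The effect is (mean of the n/2 high-level runs) minus
    (mean of the n/2 low-level runs), so its variance is
    sigma^2/(n/2) + sigma^2/(n/2) = 4 * sigma^2 / n.
    """
    return 2 * sigma / math.sqrt(n_runs)

sigma = 0.05  # assumed run-to-run standard deviation (invented, O.D. units)
for n in (8, 16, 32):
    print(f"{n:2d} runs -> effect SE = {effect_standard_error(sigma, n):.4f}")
```

Whether the extra runs come from a less fractionated design or from replicating a more fractionated one, the standard error shrinks like 1/sqrt(N) — which is why both routes buy power, and why the choice between them comes down to the aliasing considerations discussed above.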
To summarize, when picking a design, you have to consider how many factors you have, whether you need to add blocks to the design, what resolution you need in your aliasing structure, and how many runs you are willing to perform. Once you have made those decisions, you can use a DOE software package to check whether you will have enough power to detect the differences you expect in your factors. If you don't, you can either switch to a higher resolution design or add more replicates to your current design.

1.7 Response Variables - what to measure?

Once you have your design, there's one last decision to make before you actually execute your runs: what to measure. Usually, you have at least one measure in mind when you start thinking about your experiments. In assay development, some common examples include signal strength, dynamic range, curve parameters (such as slope), replicate variability, background, signal to noise, etc.
It may seem like it would be a lot of work to optimize all of those responses, but DOE makes it extremely easy. As you will see in the next chapter, once you do your runs, the analysis is essentially free. So I would encourage you to think up front about measuring as many things as possible, even if you don't intend to analyze them right away. I've seen many examples where a response you didn't think would matter suddenly becomes important. If you have the data, the analysis is usually much less painful than having to go back and generate more data.

Another reason for measuring as many parameters as possible is that some responses may contradict others. For example, optimizing only for signal strength may also increase background; the optimum setting for both may be a compromise. You would not know that unless you analyze both responses at the same time. As you will see in the next chapter, most DOE software packages have optimization algorithms that will help you find the optimum settings across all your responses.
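The book doesn't commit to a particular algorithm here, but one common approach in DOE packages is the desirability function: each response is mapped onto a 0-to-1 desirability scale, and the factor settings that maximize the geometric mean of the desirabilities are chosen. Here is a minimal sketch (the signal and background targets are invented for illustration):

```python
def desirability_max(y, low, high):
    """'Larger is better' response: 0 at or below low, 1 at or above high."""
    return min(max((y - low) / (high - low), 0.0), 1.0)

def desirability_min(y, low, high):
    """'Smaller is better' response: 1 at or below low, 0 at or above high."""
    return min(max((high - y) / (high - low), 0.0), 1.0)

def overall(signal, background):
    # Geometric mean of the individual desirabilities: a setting that
    # scores 0 on any one response is vetoed outright.
    d_signal = desirability_max(signal, low=0.5, high=2.0)      # invented O.D. targets
    d_background = desirability_min(background, low=0.05, high=0.3)
    return (d_signal * d_background) ** 0.5

# Compare two hypothetical factor settings:
print(overall(signal=1.9, background=0.25))  # strong signal, but high background
print(overall(signal=1.4, background=0.08))  # compromise setting
```

With these (made-up) targets, the compromise setting scores higher overall even though its raw signal is weaker — which is exactly the kind of trade-off you would miss by optimizing one response at a time.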
