Your SlideShare is downloading. ×
DOE - Design Of Experiments For Dummies – A Beginner’s Guide
Upcoming SlideShare
Loading in...5

Thanks for flagging this SlideShare!

Oops! An error has occurred.

Saving this for later? Get the SlideShare app to save on your phone or tablet. Read anywhere, anytime – even offline.
Text the download link to your phone
Standard text messaging rates apply

DOE - Design Of Experiments For Dummies – A Beginner’s Guide


Published on

DOE - Design Of Experiments For Dummies – A Beginner’s Guide

DOE - Design Of Experiments For Dummies – A Beginner’s Guide

Published in: Business
  • Be the first to comment

No Downloads
Total Views
On Slideshare
From Embeds
Number of Embeds
Embeds 0
No embeds

Report content
Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

No notes for slide


  • 1. Design Of Experiments For Dummies – A Beginner’s Guide By Willy Vandenbrande | Published: 21 Feb 07 Despite all the efforts by specialists in quality and statistics, Design Of Experiments (DOE) is still not applied as widely as it could and should be. When I talk to people who had some kind of introduction to DOE, I notice that they are reluctant or even scared to apply it. To them, DOE deals with very complex tests with many factors at various levels and it requires in depth knowledge of at least two textbooks full of statistics to understand it and to be able to use it. In addition to that, the continuous fight between classicists, Taguchi adepts and Shainin enthusiasts does not make it easier for the interested newcomer to the field. As far as this ridiculous fighting is concerned I refer to reference (1). More interesting is the comment about the complexity of the subject. I dare say that 90% of all tests performed in industry are one factor two level experiments. For the specialists among you: 21 designs. We just want to know how the system, product or process will react if one factor is changed from one level to another level. In most cases the first level is the original, presently used level of the factor, so we already have some knowledge about the behavior of the system at this setting. Simple, you say? Do it all the time? Undoubtedly, but how often is it done correctly? I know from experience that in industrial experimentation it does not need to be complex to be messed up! And there are two ways of messing it up: the first one is not being able to draw any conclusion from the test, the second one, probably even worse, is drawing wrong conclusions from the test. Where do we go wrong? If we divide the experimentation process in the following four phases: setting-up the experiment, executing the tests, analyzing the results and drawing conclusions, we see things going wrong in all phases, but generally most damage is done before the tests start. A wrongly set-up experiment cannot be saved, not even by the most advanced of statistical software programs. We need to think more up front and use basic rules from DOE to avoid problems. We are talking here about experiments that are so simple that the calculations and the analyses to be made will be very straightforward. It will never be a problem for anyone to make these calculations and to understand the results. What we need to be sure of is that these results contain the answers to our questions! To illustrate the many ways to mess up an experiment, we will use a simple example. I call it: quot;the hunt for the red tomatoquot; and it stars Sam, the friendly amateur gardener who is a keen tomato grower. His local supplier tells him he now has a great new fertilizer called quot;the red
  • 2. tomatoquot; that is 20% more expensive than the current one, quot;tomato loverquot;, but that is guaranteed to give him a lot more tomatoes by the end of the season. The offer is tempting, but Sam decides to test the product this season, before eventually changing over for his entire crop the next season. He knows a few things about experimentation and will follow some basic DOE rules. Rule number one: write down the questions you would like to see answered by the experiment. In Six Sigma terminology: define the problem and one aspect of problem definition is quantification. If you want to compare results, you have to know on what bases the results will be compared, so you need to quantify the output. In this case the question Sam wants to see answered by his experiment is: does the quot;red tomatoquot; fertilizer increase my tomato harvest by at least 20% in weight? Rule number two: don’t forget that characteristics that are not part of the study also need to fulfill requirements. It is not very interesting to solve one problem or improve on one aspect and at the same create three new problems or deteriorate on several other characteristics. Sam would not really be happy if as a result of changing fertilizer he would now have 20 % more, but badly tasting, tomatoes. So he would certainly evaluate the taste, but might also look at the size of the tomatoes, or the size distribution, the color, etc. Although we do not perform the experiment to improve on these characteristics, we at least have to make sure that they stay at an acceptable level. So at the end of the experiment we need to measure and evaluate them! Rule number three: make sure to have a reliable measurement system. It is not the purpose of this article to give a course on Measurement Systems Analysis (MSA), but you must be aware of the importance of the variation introduced by the measurement system and have to keep it at a minimum. Weighing systems are generally pretty good, but if Sam wants to be sure he should perform an analysis of the measurement system to see if it is adequate. For more info on MSA, see reference (2). Rule number four: use statistics and statistical principles upfront. One question that always needs to be answered before starting a test is: quot;how big should the sample size be?quot; Don’t forget that we will use a sample result to draw conclusions for the behavior of the population. Table 1 shows the two types of wrong conclusions we may draw when comparing two means, and the associated risks. In order to determine the appropriate sample size we need to determine the α- and β- risk (or their counterparts: the significance level (1- α) and the power (1- β) of the test), the distribution and the standard deviation of the population and the minimum difference we want to be able to detect between the two populations. This requires several assumptions.
  • 3. Decision taken based on the experiment Real (but unknown) situation There is a difference There is no difference There is a difference between the means. OK Type I error (α-risk) significance level = 1- α There is no difference between the means Type II error (β-risk) Power = 1- β OK Table 1: The two types of errors If we want to compare the means of normally distributed populations with equal variance (we can use the historical standard deviation of the current process as an approximation) Table 2 can be used as a guideline for determining sample sizes. It is based on a two sample t-test and contains the values for one sided comparison (is the new setting better?) and two sided comparison (is there any difference between old and new setting?). Despite its limiting assumptions it is useful in many cases. Note that if you want to detect small differences the sample sizes increase drastically. For other cases (different significance and power levels, other comparisons, etc) we refer to (3) and (4). Minimum difference (in standard deviation units) One sided comparison Two sided comparison 0.5 70 86 1.0 18 23 1.5 9 11 2.0 6 7 2.5 4 5 3.0 3 4 Table 2: Selecting sample sizes when comparing two means of normal distributions. Assumptions: α-risk = 0.05 (significance level= 95%) • β- risk = maximum 0.10 (power = minimum 90 %) • Based on t-test for two means with equal variance. •
  • 4. The data from last year's harvest are shown in table 3. Sam only wants to change over to the new fertilizer if it gives him an average yield of at least 20% extra (one sided comparison). This is 0.95 kg or 2.3 standard deviations. Looking at table 2 he decides to treat 6 plants with quot;tomato loverquot; and 6 plants with quot;the red tomatoquot;. This should put him on the safe side when he draws his conclusions. Tomato plant number Yield (kg) 1 4.94 2 4.11 3 4.64 4 4.43 5 4.86 6 5.02 7 4.68 8 4.85 9 4.51 10 5.14 11 4.80 12 4.65 13 4.57 14 4.88 15 4.26 16 4.47 17 4.71 18 4.84 19 5.51 20 5.09
  • 5. 21 4.33 22 4.85 23 4.63 24 4.53 25 5.52 26 4.85 27 4.82 28 3.78 29 5.37 30 5.58 Average 4.77 Standard Deviation 0.41 Table 3: Results of last year’s harvest Rule number five: beware of known enemies. Figure 1 gives you an idea of Sam’s garden and the location he has reserved for his tomato plants. One thing is immediately obvious: because of the tree, some of the plants will receive a lot more sunshine than others. Now suppose Sam treats 6 plants in the shade with quot;tomato loverquot; and 6 plants in the sun with quot;the red tomatoquot; and that at the end of the season he sees that the quot;red tomatoquot; plants clearly yield more tomatoes. What has he proven now? That the fertilizer is superior or that tomato plants in the sun yield more tomatoes? No one knows, because the effects of the two factors have been mixed. Sam has three options. He can place all test plants in the sun, he can place all test plants in the shade or he can place half of them in the sun and half of them in the shade. He chooses this third option because in this way the conclusions he can take from the test have a bigger validity. It allows him to compare the effect of the fertilizer over various conditions of sunshine hours. In DOE this is called quot;blockingquot;. For every known enemy we have to develop a strategy: will we keep it constant for the test or use it as a block factor?
  • 6. Figure 1: Schematic representation of Sam's garden Figure 2: A known enemy: amount of sunshine As a result of this rule Sam has marked twelve test spots, six in the shady area, six in the sunny region (see figure 2). In both areas he now has to select which three plants he will treat with quot;the red tomatoquot;. Rule number six: beware of unknown enemies. Gardens are mysterious places. They may hold all sorts of differences that we are not aware of: small changes in soil composition, effect of wind, ground water levels, etc. All these factors may or may not influence the result of our test. The only way Sam can defend himself against
  • 7. these enemies is by setting up his experiment in such a way that these factors are distributed randomly, by chance, over his experiment. Because of the blocking (see rule number four) he has to randomize within each block. This can be done in a very simple way because there are only two levels to consider. Sam takes three black and three red playing cards, shuffles them and at each test location within the block he has his daughter pick one card. If it is a black card, he will treat that plant with quot;tomato loverquot;, if it is a red card he will treat it with quot;the red tomatoquot;. The two of them have a lot of fun and figure 3 shows the result of their work. Figure 3: selecting test plants with playing cards Note that this is a randomization in location, in many industrial tests, randomization in time is needed. This means that the sequence of executing the tests has to be decided by chance within each block. Rule number seven: beware of what goes on during testing. Sam will not have too much problems with this, he does it all himself and has full control of what goes on during testing. He may want to instruct his daughter not to touch anything in the tomato garden. Or not to be overenthusiastic and to start watering some of the plants. With industrial experiments this is a different story. There is no end to what can go wrong during testing. In many cases the people performing the tests have not been part of the team that designed it, they have no idea what it is about or sometimes even why it is done. So keep these two golden rules in mind: 1. He who communicates is king 2. Be where it happens when it happens.
  • 8. Rule number eight: analyze the results statistically. The tomato season has come to an end; Sam has weighted all tomatoes and has put the test results together in table 4. Test number quot;Tomato loverquot; treated (kg) quot;The red tomatoquot; treated (kg) 1 4.46 6.17 2 4.65 5.11 3 5.19 4.76 4 5.97 5.21 5 4.23 4.62 6 4.49 5.43 Average 4.83 5.22 Standard Deviation0.64 0.55 Table 4: results of the tests In rule number four we saw that he only wants to change if the new fertilizer brings him at least 20% or 0.97 kg more tomatoes. In other words: is the observed average difference greater than 0.97kg? From the data it is clear that this is not the case (the observed difference is only 0.39 kg). Statistically we test the null hypothesis that the means are equal versus the alternative that the difference between the means is larger than 0.97 kg. This is done with a t-test. We refer to (5) for an overall discussion on the t-test. If we use the assumption of unknown but equal variances, the formulas to be used in this case are: Click to enlarge Using the data of table 4 we find s = 0.60 and t0=-1.69 with 10 degrees of freedom. The critical value at the 0.05 level is 1.812, so only if the calculated t-statistic were > 1.812 could we say with 95% confidence that the new fertilizer gives more than 20% increase in yield. This is clearly not the case!
  • 9. If the result was positive, Sam would still have to analyze all the other characteristics that need to fulfill minimum requirements. Rule number nine: present the results graphically. All this is correct and for those who know statistics it shows that there is not a significant difference between the two fertilizers. However, not all people involved in the experiment are knowledgeable of statistics. Sam’s daughter for instance! That’s why graphical presentation of results is so important in communicating. Both figures 4 and 5 show that there is definitely not a 20% higher tomato harvest with the new fertilizer. Moreover, if you look at figure 5 it is also clear that there is a lot of overlap on the individual test results. This is an indication that the difference in mean that can be observed, could be due to chance and has no great significance. Figure 4: boxplot of the yield of tomatoes per fertilizer
  • 10. Figure 5: Individual value plot of the yield of tomatoes per fertilizer Actually, in most cases the graphical output will tell the whole story. Only when there is some doubt left, the correct numbers may be needed to take a final decision. Conclusion There is no such thing as a quot;simplequot; experiment. No matter how simple it may look, you need to take several rules into account if you want to be able to draw correct conclusions out of your tests. Don’t forget that it is equally expensive to run a bad or a good experiment. The only difference is that the good experiment has a return on investment.