# ANOVA & EXPERIMENTAL DESIGNS


1. By Viswanth Reddy S., Department of Pharmacology, Gokaraju Rangaraju College of Pharmacy
2. Contents: analysis of variance (ANOVA); experimental designs (CRD, RCBD, LSD); applications of biostatistics.
3. ANOVA is mainly used to compare the means of three or more samples while taking into account the variation within each sample. The technique was first developed by R. A. Fisher and was used extensively in agricultural experiments. The analysis of variance is a method for estimating the contribution made by each factor to the total variation. The total variation splits into two components: (1) variation within the samples and (2) variation between the samples.
4. There are two classifications of the analysis of variance. When the data are classified on the basis of one factor, the analysis is known as one-way ANOVA; when the data are classified on the basis of two factors, it is known as two-way ANOVA. The technique is similar in both cases. In one-factor analysis, however, the total variance is divided into two parts only: (1) variance between samples and (2) variance within samples. The variance within samples is the residual variance.
5. In two-factor analysis, the total variance is divided into three parts: variance due to factor one, variance due to factor two, and residual variance.

    Procedure for calculating the F-statistic: the t-test is used to compare the means of two samples, whereas the F-test is used to compare the means of three or more samples. In the layout used here, the treatments are shown in columns and the replicates in rows. We have to find out whether the variation between treatments and between replicates is significant and, if so, at what level of significance. For this purpose we calculate the F-statistic, which is a ratio of variances. The detailed procedure is as follows:
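6. Data layout, with treatments in columns and replicates in rows:

    | Replicate | Treatment 1 | Treatment 2 | Treatment 3 | Row total |
    |---|---|---|---|---|
    | 1 | $X_{11}$ | $X_{21}$ | $X_{31}$ | $\sum X_{R1}$ |
    | 2 | $X_{12}$ | $X_{22}$ | $X_{32}$ | $\sum X_{R2}$ |
    | 3 | $X_{13}$ | $X_{23}$ | $X_{33}$ | $\sum X_{R3}$ |
    | Column total | $\sum X_{C1}$ | $\sum X_{C2}$ | $\sum X_{C3}$ | Grand total $G$ |

    The quantities needed are:

    - $A = \sum X^2 = \sum X_{C1}^2 + \sum X_{C2}^2 + \sum X_{C3}^2$ (sum of the squared observations)
    - $B = \dfrac{(\sum X_{C1})^2}{n_{c1}} + \dfrac{(\sum X_{C2})^2}{n_{c2}} + \dfrac{(\sum X_{C3})^2}{n_{c3}}$ (from the column totals)
    - $C = \dfrac{(\sum X_{R1})^2}{n_{r1}} + \dfrac{(\sum X_{R2})^2}{n_{r2}} + \dfrac{(\sum X_{R3})^2}{n_{r3}}$ (from the row totals)
    - $D = \text{C.F.} = \dfrac{(\sum X)^2}{n} = \dfrac{G^2}{n}$ (correction factor)

    From these:

    - Total sum of squares $= A - D$
    - Between-treatments sum of squares $= B - D$
    - Between-rows sum of squares $= C - D$
    - Residual sum of squares $= (A - D) - [(B - D) + (C - D)]$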
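7. ANOVA table (c treatments, r rows):

    | Source of variation | Degrees of freedom (d.f.) | Sum of squares (SS) | Mean square (MS) |
    |---|---|---|---|
    | Between treatments | $c-1$ | $B-D$ | $(B-D)/(c-1)$ |
    | Between rows | $r-1$ | $C-D$ | $(C-D)/(r-1)$ |
    | Residual | $(c-1)(r-1)$ | $(A-D)-[(B-D)+(C-D)]$ | $\{(A-D)-[(B-D)+(C-D)]\}/[(c-1)(r-1)]$ |
    | Total | $cr-1$ | $A-D$ | |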
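8. One-way classification. Data layout, with treatments in columns and replicates in rows:

    | Replicate | Treatment 1 | Treatment 2 | Treatment 3 |
    |---|---|---|---|
    | 1 | $X_{11}$ | $X_{21}$ | $X_{31}$ |
    | 2 | $X_{12}$ | $X_{22}$ | $X_{32}$ |
    | 3 | $X_{13}$ | $X_{23}$ | $X_{33}$ |
    | Column total | $\sum X_{C1}$ | $\sum X_{C2}$ | $\sum X_{C3}$ |

    The column totals add up to the grand total $G$.

    1. Find the total of the squared observations: $A = \sum X^2 = \sum X_{C1}^2 + \sum X_{C2}^2 + \sum X_{C3}^2$.
    2. Square each column total and divide it by the number of observations in that column ($n_{c1}, n_{c2}, n_{c3}$, etc.): $B = \dfrac{(\sum X_{C1})^2}{n_{c1}} + \dfrac{(\sum X_{C2})^2}{n_{c2}} + \dfrac{(\sum X_{C3})^2}{n_{c3}}$.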
9. One-way classification, continued:

    3. Find the grand total: $\sum X = \sum X_{C1} + \sum X_{C2} + \sum X_{C3} = G$.
    4. Square the grand total and divide it by the number of observations $n$ to obtain the correction factor: $D = \text{C.F.} = (\sum X)^2/n = G^2/n$.
    5. Calculate the F value: $F$ = between-treatments mean square / residual mean square.

    | Source of variation | Degrees of freedom (d.f.) | Sum of squares (SS) | Mean square (MS) | F value |
    |---|---|---|---|---|
    | Between treatments | $c-1$ | $B-D$ | $(B-D)/(c-1)$ | $\dfrac{(B-D)/(c-1)}{(A-B)/[c(r-1)]}$ |
    | Residual | $c(r-1)$ | $A-B$ | $(A-B)/[c(r-1)]$ | |
    | Total | $cr-1$ | $A-D$ | | |
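The one-way procedure can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `one_way_anova` and the sample data are hypothetical, and the variables `A`, `B` and `D` follow the labels used in the steps above.

```python
def one_way_anova(columns):
    """One-way ANOVA from raw totals.

    columns: one list of observations per treatment (hypothetical input format).
    A, B and D follow the labels used in the slides.
    """
    n = sum(len(col) for col in columns)          # total number of observations
    c = len(columns)                              # number of treatments
    grand_total = sum(sum(col) for col in columns)

    A = sum(x * x for col in columns for x in col)          # sum of squared observations
    B = sum(sum(col) ** 2 / len(col) for col in columns)    # from column totals
    D = grand_total ** 2 / n                                # correction factor

    ss_total = A - D
    ss_between = B - D
    ss_residual = A - B

    df_between = c - 1
    df_residual = n - c            # equals c(r - 1) when every column has r values

    ms_between = ss_between / df_between
    ms_residual = ss_residual / df_residual

    return {
        "ss_total": ss_total,
        "ss_between": ss_between,
        "ss_residual": ss_residual,
        "df_between": df_between,
        "df_residual": df_residual,
        "F": ms_between / ms_residual,
    }


# Made-up data: three treatments, three replicates each.
result = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(result)
```

The computed F value is then compared with the tabulated critical value of F for `df_between` and `df_residual` degrees of freedom.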
10. In one-way classification we studied the influence of one factor. In two-way classification we study the influence of two factors, and the data are classified on the basis of two criteria. For example, the yield of different varieties of wheat may also be affected by the application of different fertilizers, so analysis of variance can be used to test the effects of these two factors simultaneously. The calculation in two-factor analysis is much the same as in the one-way case, with an additional calculation based on the rows: in one-way classification only the columns are taken into consideration, whereas in two-way analysis both columns and rows are considered.
11. Two-way classification. Data layout, with treatments in columns and replicates in rows:

    | Replicate | Treatment 1 | Treatment 2 | Treatment 3 | Row total |
    |---|---|---|---|---|
    | 1 | $X_{11}$ | $X_{21}$ | $X_{31}$ | $\sum X_{R1}$ |
    | 2 | $X_{12}$ | $X_{22}$ | $X_{32}$ | $\sum X_{R2}$ |
    | 3 | $X_{13}$ | $X_{23}$ | $X_{33}$ | $\sum X_{R3}$ |
    | Column total | $\sum X_{C1}$ | $\sum X_{C2}$ | $\sum X_{C3}$ | Grand total $G$ |

    - $A = \sum X^2 = \sum X_{C1}^2 + \sum X_{C2}^2 + \sum X_{C3}^2$
    - $B = \dfrac{(\sum X_{C1})^2}{n_{c1}} + \dfrac{(\sum X_{C2})^2}{n_{c2}} + \dfrac{(\sum X_{C3})^2}{n_{c3}}$
    - $C = \dfrac{(\sum X_{R1})^2}{n_{r1}} + \dfrac{(\sum X_{R2})^2}{n_{r2}} + \dfrac{(\sum X_{R3})^2}{n_{r3}}$
    - $D = \text{C.F.} = \dfrac{(\sum X)^2}{n} = \dfrac{G^2}{n}$

    From these:

    - Total sum of squares $= A - D$
    - Between-treatments sum of squares $= B - D$
    - Between-rows sum of squares $= C - D$
    - Residual sum of squares $= (A - D) - [(B - D) + (C - D)]$
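12. ANOVA table for the two-way classification, where $\text{RMS} = \{(A-D)-[(B-D)+(C-D)]\}/[(c-1)(r-1)]$ is the residual mean square:

    | Source of variation | Degrees of freedom (d.f.) | Sum of squares (SS) | Mean square (MS) | F value |
    |---|---|---|---|---|
    | Between treatments | $c-1$ | $B-D$ | $(B-D)/(c-1)$ | $\dfrac{(B-D)/(c-1)}{\text{RMS}}$ |
    | Between rows | $r-1$ | $C-D$ | $(C-D)/(r-1)$ | $\dfrac{(C-D)/(r-1)}{\text{RMS}}$ |
    | Residual | $(c-1)(r-1)$ | $(A-D)-[(B-D)+(C-D)]$ | RMS | |
    | Total | $cr-1$ | $A-D$ | | |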
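A minimal Python sketch of the two-way calculation (one observation per cell) may help to check hand computations. The function name `two_way_anova` and the data are hypothetical; `A`, `B`, `C` and `D` follow the labels used in the slides, and every column is assumed to hold `r` observations and every row `c` observations.

```python
def two_way_anova(table):
    """Two-way ANOVA without replication.

    table[j][i] is the observation for replicate (row) j and treatment (column) i.
    """
    r = len(table)           # number of rows (replicates)
    c = len(table[0])        # number of columns (treatments)
    n = r * c

    column_totals = [sum(row[i] for row in table) for i in range(c)]
    row_totals = [sum(row) for row in table]
    grand_total = sum(row_totals)

    A = sum(x * x for row in table for x in row)     # sum of squared observations
    B = sum(t * t for t in column_totals) / r        # each column has r observations
    C = sum(t * t for t in row_totals) / c           # each row has c observations
    D = grand_total ** 2 / n                         # correction factor

    ss_total = A - D
    ss_treatments = B - D
    ss_rows = C - D
    ss_residual = ss_total - (ss_treatments + ss_rows)

    ms_treatments = ss_treatments / (c - 1)
    ms_rows = ss_rows / (r - 1)
    ms_residual = ss_residual / ((c - 1) * (r - 1))

    return {
        "ss_total": ss_total,
        "ss_treatments": ss_treatments,
        "ss_rows": ss_rows,
        "ss_residual": ss_residual,
        "F_treatments": ms_treatments / ms_residual,
        "F_rows": ms_rows / ms_residual,
    }


# Made-up data: 3 replicates (rows) x 3 treatments (columns).
result = two_way_anova([[2, 4, 6],
                        [3, 5, 7],
                        [5, 6, 10]])
print(result)
```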
13. A statistical design is a plan for the collection and analysis of data. It mainly deals with the following parameters. The selection of an efficient design requires careful planning before the data are collected and analysed.

    [Figure: example field layouts with treatments A to D assigned to plots]
14. Purpose of randomization:

    - To eliminate bias
    - To ensure independence among observations
    - Required for valid significance tests and interval estimates

    [Figure: pairs of plots with low and high fertility, each pair holding an old and a new variety.] In each pair of plots, although replicated, the new variety is consistently assigned to the plot with the higher fertility level.
15. Replication: the repetition of a treatment in an experiment.

    [Figure: plot layout in which treatments A to D each appear three times]
16. Examples: a physician wants to know whether a newly invented drug will be beneficial in the treatment of a particular disease; a farmer wants to know whether a new type of fertilizer will give him better yields. In each case the investigation is framed in terms of a suitable hypothesis. There are many types of experimental designs; the most important are as follows:
17. The main experimental designs:

    - Completely randomized design (CRD)
    - Randomized complete block design (RCBD)
    - Latin square design (LSD)
18. In a completely randomized design (CRD), the treatments are assigned completely at random, so that each experimental unit has the same chance of receiving any one treatment. The design is suitable only when the experimental material is homogeneous (for example, laboratory experiments and greenhouse studies). It is not suitable for heterogeneous material (for example, field experiments).
19. Advantages and disadvantages of the CRD.

    Advantages:

    - Simple and easy
    - Provides the maximum number of degrees of freedom

    Disadvantages:

    - Only suitable for a small number of treatments and for homogeneous experimental material
    - Low precision if the plots are not uniform

    [Figure: plot layout with treatments A to D assigned at random]
20. The CRD is the simplest and least restrictive design. Every plot is equally likely to be assigned to any treatment.

    [Figure: plot layout with treatments A to D assigned at random]
21. Example: we have an experiment to test three varieties (the top lines from Oregon, Washington and Idaho) to find which grows best in our area. Here t = 3 and r = 4, giving 12 plots.

    [Figure: 12 numbered plots, with variety A assigned to four of them]
22. Layout of a CRD: the step-by-step procedure for randomization and layout of a CRD is given for a field experiment with five treatments and four replications. Determine the total number of experimental units (n) as the product of the number of treatments and the number of replications: n = r × t = 4 × 5 = 20. The entire experimental material is divided into n experimental units, so with five treatments and four replications we need 20 experimental units. The 20 units are numbered as follows:
23. The units are numbered 1 to 20:

    | | | | | |
    |---|---|---|---|---|
    | 1 | 2 | 3 | 4 | 5 |
    | 6 | 7 | 8 | 9 | 10 |
    | 11 | 12 | 13 | 14 | 15 |
    | 16 | 17 | 18 | 19 | 20 |

    Assign the treatments to the experimental units using three-digit random numbers selected from a random number table. The random numbers are written down in order and ranked: the lowest random number receives rank 1 and the highest rank goes to the largest number. These ranks correspond to the unit numbers. The first set of r units is allotted to treatment T1, the next set of r units to treatment T2, the next set of r units to T3, and so on.
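24. Random numbers, their ranks and the treatments allotted (four units per treatment):

    | Treatment | Random numbers | Ranks (unit numbers) |
    |---|---|---|
    | T1 | 937, 149, 908, 361 | 17, 02, 15, 07 |
    | T2 | 953, 749, 180, 951 | 19, 13, 04, 18 |
    | T3 | … | 01, 08, 14, 16 |
    | T4 | … | 06, 09, 10, 12 |
    | T5 | 957, 157, 571, 226 | 20, 03, 11, 05 |

    The random numbers for T3 and T4 are not available; their unit numbers are those shown in the final layout.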
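25. Final layout:

    | Unit | 1 | 2 | 3 | 4 | 5 |
    |---|---|---|---|---|---|
    | Treatment | T3 | T1 | T5 | T2 | T5 |
    | Unit | 6 | 7 | 8 | 9 | 10 |
    | Treatment | T4 | T1 | T3 | T4 | T4 |
    | Unit | 11 | 12 | 13 | 14 | 15 |
    | Treatment | T5 | T4 | T2 | T3 | T1 |
    | Unit | 16 | 17 | 18 | 19 | 20 |
    | Treatment | T3 | T1 | T2 | T2 | T5 |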
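The rank-based randomization for a CRD can be sketched as follows. This is an illustrative sketch only: the function names `rank_numbers`, `crd_layout` and `draw_random_numbers` are hypothetical, and drawing the numbers with Python's `random` module stands in for the printed random number table, which is an assumption rather than the method on the slides.

```python
import random


def rank_numbers(numbers):
    """Return the rank of each number (1 = smallest), in the original order."""
    order = sorted(range(len(numbers)), key=lambda i: numbers[i])
    ranks = [0] * len(numbers)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def crd_layout(random_numbers, treatments, r):
    """Map unit number -> treatment using the rank method.

    The ranks of the first r random numbers give the units for the first
    treatment, the ranks of the next r numbers give the units for the
    second treatment, and so on.
    """
    if len(random_numbers) != r * len(treatments):
        raise ValueError("need exactly r * t random numbers")
    if len(set(random_numbers)) != len(random_numbers):
        raise ValueError("random numbers must be distinct")
    ranks = rank_numbers(random_numbers)
    layout = {}
    for position, unit in enumerate(ranks):
        layout[unit] = treatments[position // r]
    return dict(sorted(layout.items()))


def draw_random_numbers(n, seed=None):
    """Draw n distinct numbers from 000 to 999 (stand-in for a random number table)."""
    return random.Random(seed).sample(range(1000), n)


# Small worked example: 3 treatments, 2 replications, 6 units.
example = crd_layout([937, 149, 908, 361, 953, 749], ["T1", "T2", "T3"], r=2)
print(example)   # {1: 'T1', 2: 'T2', 3: 'T3', 4: 'T2', 5: 'T1', 6: 'T3'}

# Full-size layout as on the slides: 5 treatments, 4 replications, 20 units.
numbers = draw_random_numbers(20, seed=0)
full = crd_layout(numbers, ["T1", "T2", "T3", "T4", "T5"], r=4)
```

Distinct random numbers are required here so that every unit receives exactly one rank; a repeated number would have to be discarded and redrawn.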
26. Analysis of variance: there are two sources of variation among the observations obtained from a CRD trial: (1) treatment variation and (2) experimental error. The relative size of the two is used to indicate whether the observed differences among the treatments are real or due to chance.
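27. Calculations, where $T_i$ is the total of treatment $i$ and $r$ is the number of replications:

    1. Correction factor: $\text{C.F.} = (GT)^2/n$
    2. Total sum of squares: $\text{Total SS} = \sum X^2 - \text{C.F.}$
    3. Treatment sum of squares: $\text{TSS} = \sum T_i^2/r - \text{C.F.}$
    4. Error sum of squares: $\text{ESS} = \text{Total SS} - \text{TSS}$

    These results are summarized in the ANOVA table, and the mean squares and F are calculated.

    | Source of variation | d.f. | SS | MS | F |
    |---|---|---|---|---|
    | Treatments | $t-1$ | TSS | $\text{TMS} = \text{TSS}/(t-1)$ | TMS/EMS |
    | Error | $n-t$ | ESS | $\text{EMS} = \text{ESS}/(n-t)$ | |
    | Total | $n-1$ | Total SS | | |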
28. The randomized complete block design (RCBD) is one of the most widely used experimental designs in agricultural research. It is also used extensively in biology, medicine, the social sciences and business research. The experimental material is grouped into homogeneous subgroups, commonly termed blocks. Since each block contains the entire set of treatments, a block is equivalent to a replication.
29. Example: in field experiments, soil fertility is an important characteristic that influences crop response. When the treatments are applied at random to relatively homogeneous units within each block and replicated over all the blocks, the design is known as a randomized block design (RBD). The design divides the experimental units into n homogeneous groups of size t, called blocks. The treatments are then randomly assigned to the experimental units in each block, one treatment to a unit in each block.
30. Advantages and disadvantages of the RCBD.

    Advantages:

    - The design has been shown to be more efficient or accurate than the CRD for most types of experimental work. Removing the between-blocks sum of squares from the residual sum of squares usually reduces the error mean square.
    - Flexibility: the design itself places no formal limit on the number of treatments that can be included.

    Disadvantages:

    - In practice it is not well suited to a large number of treatments, because when the block size is large it may be difficult to maintain homogeneity within blocks. Consequently the error increases.
31. Layout of an RCBD: suppose the experiment is to be conducted on 4 blocks of land, each having 5 plots. We consider five treatments, each replicated 4 times. The whole experimental area is divided into 4 relatively homogeneous blocks, and each block into five plots or units. The treatments are allocated at random to the units within each block.

    | Block | Plot 1 | Plot 2 | Plot 3 | Plot 4 | Plot 5 |
    |---|---|---|---|---|---|
    | 1 | A | E | B | D | C |
    | 2 | E | D | C | B | A |
    | 3 | C | B | A | E | D |
    | 4 | A | D | E | C | B |
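A layout of this kind can be generated with a short script. The sketch below is an assumption-laden illustration: `rcbd_layout` is a hypothetical helper, and Python's pseudo-random generator replaces whatever randomization device is used in practice. Each block receives its own independent random ordering of the full treatment set.

```python
import random


def rcbd_layout(treatments, n_blocks, seed=None):
    """Return one random ordering of all treatments per block."""
    rng = random.Random(seed)
    # sample() without replacement gives a random permutation of the treatments
    return [rng.sample(treatments, len(treatments)) for _ in range(n_blocks)]


# Five treatments in four blocks, as in the layout above.
layout = rcbd_layout(["A", "B", "C", "D", "E"], n_blocks=4, seed=1)
for block_number, block in enumerate(layout, start=1):
    print(f"Block {block_number}: {' '.join(block)}")
```

Because the randomization is restricted to within blocks, every treatment appears exactly once in every block, which is what makes a block equivalent to a replication.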
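32. The ANOVA table for a randomized block experiment (t treatments, r blocks):

    | Source of variation | d.f. | S.S. | M.S.S. | F |
    |---|---|---|---|---|
    | Treatments | $t-1$ | SST | $\text{SST}/(t-1)$ | $\dfrac{\text{SST}/(t-1)}{\text{SSE}/[(t-1)(r-1)]}$ |
    | Blocks | $r-1$ | SSB | $\text{SSB}/(r-1)$ | $\dfrac{\text{SSB}/(r-1)}{\text{SSE}/[(t-1)(r-1)]}$ |
    | Error | $(t-1)(r-1)$ | SSE | $\text{SSE}/[(t-1)(r-1)]$ | |
    | Total | $rt-1$ | Total SS | | |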
33. By comparing the variance ratio for treatments with the critical value of F, we can find out whether the treatments differ significantly. This conclusion holds irrespective of the differences due to blocks.
34. A Latin square experiment is assumed to be a three-factor experiment. The factors are rows, columns and treatments. It is assumed that there is no interaction between rows, columns and treatments; the degrees of freedom for the interactions are used to estimate error. Latin square designs differ from randomized complete block designs in that the experimental units are grouped in blocks in two different ways, that is, by rows and by columns. A requirement of the Latin square is that the number of treatments, the number of rows and the number of columns (replications) must be equal; therefore, the total number of experimental units must be a perfect square. For example, if there are 4 treatments,
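35. Latin square designs: selected Latin squares.

    | 3 × 3 | 4 × 4 | 4 × 4 | 4 × 4 | 4 × 4 |
    |---|---|---|---|---|
    | ABC | ABCD | ABCD | ABCD | ABCD |
    | BCA | BADC | BCDA | BDAC | BADC |
    | CAB | CDBA | CDAB | CADB | CDAB |
    | | DCAB | DABC | DCBA | DCBA |

    | 5 × 5 | 6 × 6 |
    |---|---|
    | ABCDE | ABCDEF |
    | BAECD | BFDCAE |
    | CDAEB | CDEFBA |
    | DEBAC | DAFECB |
    | ECDBA | ECABFD |
    | | FEBADC |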
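The defining property of these squares, each treatment appearing exactly once in every row and every column, is easy to check mechanically. The helper below is a hypothetical sketch (the name `is_latin_square` is not from the slides) that takes the rows as strings.

```python
def is_latin_square(rows):
    """Check that every symbol occurs exactly once in each row and each column."""
    n = len(rows)
    if n == 0:
        return False
    symbols = set(rows[0])
    if len(symbols) != n:
        return False
    # every row must be a permutation of the same n symbols
    for row in rows:
        if len(row) != n or set(row) != symbols:
            return False
    # every column must contain all n symbols
    for j in range(n):
        if {row[j] for row in rows} != symbols:
            return False
    return True


print(is_latin_square(["ABCDE", "BAECD", "CDAEB", "DEBAC", "ECDBA"]))  # True
print(is_latin_square(["ABC", "BCA", "CBA"]))                          # False
```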
36. The layout of an LSD is shown below for an experiment with five treatments A, B, C, D and E. The selected 5 × 5 plan is:

    | | | | | |
    |---|---|---|---|---|
    | A | B | C | D | E |
    | B | A | E | C | D |
    | C | D | A | E | B |
    | D | E | B | A | C |
    | E | C | D | B | A |

    The plan is then randomized with the help of a table of random numbers. First select five three-digit random numbers and rank them:

    | Random number | Sequence | Rank |
    |---|---|---|
    | 628 | 1 | 3 |
    | 846 | 2 | 4 |
    | 475 | 3 | 2 |
    | 902 | 4 | 5 |
    | 452 | 5 | 1 |
37. Use the rank to represent the existing row number of the selected plan and the sequence to represent the row number of the new plan. Thus the third row of the selected plan (rank = 3) becomes the first row (sequence = 1) of the new plan, and so on:

    | | | | | |
    |---|---|---|---|---|
    | C | D | A | E | B |
    | D | E | B | A | C |
    | B | A | E | C | D |
    | E | C | D | B | A |
    | A | B | C | D | E |

    The columns are randomized in the same way, using the same procedure as for the rows. The five random numbers selected are:

    | Random number | Sequence | Rank |
    |---|---|---|
    | 792 | 1 | 4 |
    | 032 | 2 | 1 |
    | 947 | 3 | 5 |
    | 293 | 4 | 3 |
    | 196 | 5 | 2 |
38. The rank is now used to represent the column number of the plan obtained above, and the sequence represents the column number of the final plan. In this way the fourth column of the above plan becomes the first column of the final plan, the first becomes the second, the fifth becomes the third, the third becomes the fourth and the second becomes the fifth. The final plan, which becomes the layout of the design, is as follows:

    | Row number | Column 1 | Column 2 | Column 3 | Column 4 | Column 5 |
    |---|---|---|---|---|---|
    | 1 | E | C | B | A | D |
    | 2 | A | D | C | B | E |
    | 3 | C | B | D | E | A |
    | 4 | B | E | A | D | C |
    | 5 | D | A | E | C | B |
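The row and column randomization can be reproduced with a short Python sketch. The function names `rank_numbers`, `permute_rows` and `permute_columns` are hypothetical; the starting plan and the random numbers are the ones given on the slides (032 is written as 32 because Python does not allow leading zeros in integer literals).

```python
def rank_numbers(numbers):
    """Return the rank of each number (1 = smallest), in the original order."""
    order = sorted(range(len(numbers)), key=lambda i: numbers[i])
    ranks = [0] * len(numbers)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def permute_rows(square, random_numbers):
    """New row k (sequence k) is the old row whose number equals the k-th rank."""
    return [square[rank - 1] for rank in rank_numbers(random_numbers)]


def permute_columns(square, random_numbers):
    """New column k (sequence k) is the old column whose number equals the k-th rank."""
    order = rank_numbers(random_numbers)
    return [[row[rank - 1] for rank in order] for row in square]


plan = [list("ABCDE"), list("BAECD"), list("CDAEB"), list("DEBAC"), list("ECDBA")]

after_rows = permute_rows(plan, [628, 846, 475, 902, 452])
final_plan = permute_columns(after_rows, [792, 32, 947, 293, 196])

for row in final_plan:
    print(" ".join(row))
# E C B A D
# A D C B E
# C B D E A
# B E A D C
# D A E C B
```

Permuting whole rows and then whole columns keeps the Latin square property intact, which is why the randomization is done this way rather than cell by cell.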
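39. Analysis of variance for the LSD, where $n$ is the number of treatments (equal to the number of rows and of columns, so that there are $n^2$ observations), and $R$, $C$ and $T$ are the row, column and treatment totals:

    - $\text{C.F.} = (GT)^2/n^2$
    - $\text{Total SS} = \sum X^2 - \text{C.F.}$
    - $\text{Row SS} = \frac{1}{n}\sum R^2 - \text{C.F.}$
    - $\text{Column SS} = \frac{1}{n}\sum C^2 - \text{C.F.}$
    - $\text{Treatment SS} = \frac{1}{n}\sum T^2 - \text{C.F.}$
    - $\text{Error SS} = \text{Total SS} - \text{Row SS} - \text{Column SS} - \text{Treatment SS}$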
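40. The ANOVA table for a Latin square experiment:

    | Source | d.f. | SS | M.S. | F |
    |---|---|---|---|---|
    | Treatments | $n-1$ | TSS | TMS | TMS/EMS |
    | Rows | $n-1$ | RSS | RMS | RMS/EMS |
    | Columns | $n-1$ | CSS | CMS | CMS/EMS |
    | Error | $(n-1)(n-2)$ | ESS | EMS | |
    | Total | $n^2-1$ | Total SS | | |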
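As a rough check on the Latin square sums of squares, the calculation can be sketched in Python. The function `latin_square_anova` and the 3 × 3 data set are hypothetical; the correction factor is taken as the squared grand total divided by the total number of observations, which is n² when n is the number of treatments.

```python
def latin_square_anova(layout, values):
    """ANOVA for an n x n Latin square.

    layout[i][j] is the treatment label in row i, column j;
    values[i][j] is the corresponding observation.
    """
    n = len(layout)
    grand_total = sum(sum(row) for row in values)
    cf = grand_total ** 2 / (n * n)                  # correction factor

    total_ss = sum(x * x for row in values for x in row) - cf

    row_totals = [sum(row) for row in values]
    column_totals = [sum(values[i][j] for i in range(n)) for j in range(n)]
    treatment_totals = {}
    for i in range(n):
        for j in range(n):
            label = layout[i][j]
            treatment_totals[label] = treatment_totals.get(label, 0) + values[i][j]

    row_ss = sum(t * t for t in row_totals) / n - cf
    column_ss = sum(t * t for t in column_totals) / n - cf
    treatment_ss = sum(t * t for t in treatment_totals.values()) / n - cf
    error_ss = total_ss - row_ss - column_ss - treatment_ss

    ems = error_ss / ((n - 1) * (n - 2))             # error mean square
    return {
        "total_ss": total_ss,
        "row_ss": row_ss,
        "column_ss": column_ss,
        "treatment_ss": treatment_ss,
        "error_ss": error_ss,
        "F_treatments": (treatment_ss / (n - 1)) / ems,
        "F_rows": (row_ss / (n - 1)) / ems,
        "F_columns": (column_ss / (n - 1)) / ems,
    }


# Made-up 3 x 3 example.
layout = ["ABC", "BCA", "CAB"]
values = [[1, 2, 3],
          [2, 4, 3],
          [6, 2, 4]]
result = latin_square_anova(layout, values)
print(result)
```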
41. Advantages and disadvantages of the LSD.

    Advantages:

    - Controls more variation than CR or RCB designs because of the two-way stratification, which results in a smaller mean square for error.
    - Simple analysis of data.
    - Analysis is simple even with missing plots.

    Disadvantages:

    - The number of treatments is limited to the number of replicates, which seldom exceeds 10.
    - With fewer than 5 treatments, the df for controlling random variation is relatively large and the df for error is small.
42. Applications of biostatistics in pharmacy:

    - Public health, including epidemiology, health services research, nutrition, environmental health, and healthcare policy and management.
    - Design and analysis of clinical trials in medicine.
    - Population genetics and statistical genetics, in order to link variation in genotype with variation in phenotype. This has been used in agriculture to improve crops and farm animals (animal breeding). In biomedical research, this work can assist in finding candidates for gene alleles that can cause or influence predisposition to disease in human genetics.
    - Analysis of genomics data, for example from microarray or proteomics experiments, often concerning diseases or disease stages.
    - Ecology and ecological forecasting.
    - Biological sequence analysis.
    - Systems biology, for gene network inference or pathway analysis.
    - Statistical methods are beginning to be integrated into medical informatics, public health informatics, bioinformatics and computational biology.
43. Clinical trials test whether a new treatment, diagnostic or vaccine works. Ideally a clinical trial should include all patients. Is that practically possible? No. We therefore test the new treatment, diagnostic or vaccine on a representative sample of the population. Statistics allows us to draw conclusions about the likely effect on the population using data from the sample. But always remember: statistics can never prove or disprove a hypothesis; it only suggests accepting or rejecting the hypothesis on the basis of the available evidence.