# Anova

One-way and two-way analysis of variance, with a case study.

### Anova

1. 1. Analysis of Variance<br />
2. 2. Introduction <br />Analysis of variance compares two or more populations of interval data.<br />Specifically, we are interested in determining whether differences exist between the population means.<br />The procedure works by analyzing the sample variance.<br />
3. 3. One Way Analysis of Variance <br />The analysis of variance is a procedure that tests whether differences exist between two or more population means.<br />To do this, the technique analyzes the sample variances.<br />
4. 4. One Way Analysis of Variance <br />Example<br />An apple juice manufacturer is planning to develop a new product: a liquid concentrate.<br />The marketing manager has to decide how to market the new product.<br />Three strategies are considered:<br />Emphasize the convenience of using the product.<br />Emphasize the quality of the product.<br />Emphasize the product’s low price.<br />
5. 5. One Way Analysis of Variance <br />Example continued<br />An experiment was conducted as follows:<br />In three cities an advertising campaign was launched.<br />In each city only one of the three characteristics (convenience, quality, and price) was emphasized.<br />The weekly sales were recorded for twenty weeks following the beginning of the campaigns.<br />
6. 6. One Way Analysis of Variance<br />[Table: weekly sales in each of the three cities; data not recovered]<br />
7. 7. One Way Analysis of Variance <br />Solution<br />The data are interval<br />The problem objective is to compare sales in three cities.<br />We hypothesize that the three population means are equal<br />
8. 8. Defining the Hypotheses<br /><ul><li>Solution</li></ul>$H_0: \mu_1 = \mu_2 = \mu_3$<br />$H_1$: At least two means differ<br />To build the statistic needed to test the hypotheses, use the following notation:<br />
9. 9. Notation<br />Independent samples are drawn from $k$ populations (treatments).<br />Sample $j$ ($j = 1, 2, \ldots, k$) consists of the observations $x_{1j}, x_{2j}, \ldots, x_{n_j j}$, with sample size $n_j$ and sample mean $\bar{x}_j$.<br />For example, $x_{11}$ is the first observation of the first sample and $x_{22}$ is the second observation of the second sample.<br />$X$ is the “response variable”.<br />The variable’s values are called “responses”.<br />
10. 10. Terminology<br />In the context of this problem…<br />Response variable – weekly sales.<br />Responses – actual sale values.<br />Experimental unit – weeks in the three cities when we record sales figures.<br />Factor – the criterion by which we classify the populations (the treatments). In this problem the factor is the marketing strategy.<br />Factor levels – the population (treatment) names. In this problem the factor levels are the marketing strategies.<br />
11. 11. The rationale of the test statistic<br />Two types of variability are employed when testing for the equality of the population means.<br />
12. 12. Graphical demonstration:<br />Employing two types of variability<br />
13. 13. [Figure: two dot plots of Treatments 1, 2, and 3 with the same sample means but different within-sample spread]<br />A small variability within the samples makes it easier to draw a conclusion about the population means.<br />The sample means are the same as before, but the larger within-sample variability makes it harder to draw a conclusion about the population means.<br />
14. 14. The rationale behind the test statistic – I <br />If the null hypothesis is true, we would expect all the sample means to be close to one another (and as a result, close to the grand mean).<br />If the alternative hypothesis is true, at least some of the sample means would differ.<br />Thus, we measure variability between sample means. <br />
15. 15. Variability between sample means<br /><ul><li>The variability between the sample means is measured as the sum of squared distances between each mean and the grand mean.</li></ul>This sum is called the Sum of Squares for Treatments (SST).<br />In our example, treatments are represented by the different advertising strategies.<br />
16. 16. Sum of squares for treatments (SST)<br />$SST = \sum_{j=1}^{k} n_j (\bar{x}_j - \bar{\bar{x}})^2$<br />where there are $k$ treatments, $n_j$ is the size of sample $j$, $\bar{x}_j$ is the mean of sample $j$, and $\bar{\bar{x}}$ is the grand mean.<br />Note: When the sample means are close to one another, their distance from the grand mean is small, leading to a small SST. Thus, a large SST indicates large variation between sample means, which supports $H_1$.<br />
17. 17. Sum of squares for treatments (SST)<br />Solution – continued: calculate SST<br />The grand mean is calculated by $\bar{\bar{x}} = \dfrac{n_1\bar{x}_1 + n_2\bar{x}_2 + \cdots + n_k\bar{x}_k}{n_1 + n_2 + \cdots + n_k}$<br />$SST = 20(577.55 - 613.07)^2 + 20(653.00 - 613.07)^2 + 20(608.65 - 613.07)^2 = 57{,}512.23$<br />
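The SST arithmetic can be checked with a few lines of Python. This is a minimal sketch: the variable names are illustrative, the sample means and sizes are the ones quoted on the slide, and the grand mean is assumed to be the size-weighted average of the sample means.

```python
# Sample means of weekly sales for the three cities, as quoted on the slide.
sample_means = [577.55, 653.00, 608.65]
sample_sizes = [20, 20, 20]  # twenty weeks recorded in each city

# Grand mean: size-weighted average of the sample means.
n_total = sum(sample_sizes)
grand_mean = sum(n * m for n, m in zip(sample_sizes, sample_means)) / n_total

# SST: sum over treatments of n_j * (mean_j - grand_mean)^2.
sst = sum(n * (m - grand_mean) ** 2 for n, m in zip(sample_sizes, sample_means))

print(f"grand mean = {grand_mean:.2f}")
print(f"SST = {sst:,.2f}")
```

Working from the unrounded grand mean rather than 613.07 is what makes the result agree with the slide's figure to two decimals.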
18. 18. The rationale behind the test statistic – II<br />Large variability within the samples weakens the “ability” of the sample means to represent their corresponding population means.<br />Therefore, even though sample means may markedly differ from one another, SST must be judged relative to the “within samples variability”.<br />
19. 19. Within samples variability<br />The variability within samples is measured by adding all the squared distances between observations and their sample means.<br />This sum is called the Sum of Squares for Error (SSE).<br />In our example this is the sum of all squared differences between sales in city j and the sample mean of city j (over all three cities).<br />
20. 20. Sum of squares for errors (SSE)<br />Solution – continued: calculate SSE<br />$SSE = (n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + (n_3 - 1)s_3^2$<br />$= (20 - 1)(10{,}774.44) + (20 - 1)(7{,}238.61) + (20 - 1)(8{,}670.24)$<br />$= 506{,}983.50$<br />
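The SSE formula can be sketched the same way. The sample variances below are the rounded values printed on the slide, so the computed total lands within about one unit of the slide's 506,983.50 rather than matching it exactly; the small gap is presumably rounding in the quoted variances. Variable names are illustrative.

```python
# Sample sizes and sample variances of weekly sales, as quoted on the slide.
sample_sizes = [20, 20, 20]
sample_variances = [10774.44, 7238.61, 8670.24]

# SSE: sum over samples of (n_j - 1) * s_j^2.
sse = sum((n - 1) * s2 for n, s2 in zip(sample_sizes, sample_variances))

print(f"SSE = {sse:,.2f}")
```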
21. 21. The mean sum of squares<br />To perform the test we need to calculate the mean squares as follows:<br />Calculation of MST, the Mean Square for Treatments: $MST = \dfrac{SST}{k - 1}$<br />Calculation of MSE, the Mean Square for Error: $MSE = \dfrac{SSE}{n - k}$<br />
22. 22. Calculation of the test statistic<br />$F = \dfrac{MST}{MSE}$<br />with the following degrees of freedom: $\nu_1 = k - 1$ and $\nu_2 = n - k$<br />Required conditions:<br />1. The populations tested are normally distributed.<br />2. The variances of all the populations tested are equal.<br />
23. 23. The F test rejection region<br />And finally, the hypothesis test:<br />$H_0: \mu_1 = \mu_2 = \cdots = \mu_k$<br />$H_1$: At least two means differ<br />Test statistic: $F = \dfrac{MST}{MSE}$<br />Rejection region: $F > F_{\alpha,\,k-1,\,n-k}$<br />
24. 24. The F test<br />$H_0: \mu_1 = \mu_2 = \mu_3$<br />$H_1$: At least two means differ<br />Test statistic: $F = MST/MSE = 3.23$<br />Since 3.23 > 3.15, there is sufficient evidence to reject $H_0$ in favor of $H_1$ and conclude that at least one of the mean sales differs from the others.<br />
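The steps from the sums of squares to the decision can be sketched as follows. The inputs are the SST and SSE values reported on the slides, with k = 3 treatments and n = 60 observations (three cities, twenty weeks each); the critical value 3.15 is the one quoted on the slide, not computed here.

```python
# Sums of squares reported on the slides.
sst = 57512.23
sse = 506983.50

k = 3   # number of treatments (marketing strategies)
n = 60  # total observations: 3 cities x 20 weeks

# Mean squares: each sum of squares divided by its degrees of freedom.
mst = sst / (k - 1)
mse = sse / (n - k)

# Test statistic and decision at the critical value quoted on the slide.
f_stat = mst / mse
f_critical = 3.15
reject_h0 = f_stat > f_critical

print(f"MST = {mst:,.2f}, MSE = {mse:,.2f}, F = {f_stat:.2f}")
print("reject H0" if reject_h0 else "do not reject H0")
```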
25. 25. single factor ANOVA<br />SS(Total) = SST + SSE<br />
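The identity SS(Total) = SST + SSE can be verified numerically. The data below are hypothetical toy values chosen only for illustration (the slide's sales data were not recovered); the check itself holds for any one-way layout.

```python
# Hypothetical toy data: three treatments, three observations each.
samples = [
    [10.0, 12.0, 14.0],
    [20.0, 19.0, 21.0],
    [15.0, 16.0, 11.0],
]

all_obs = [x for s in samples for x in s]
grand_mean = sum(all_obs) / len(all_obs)
means = [sum(s) / len(s) for s in samples]

# Total variation: squared distance of every observation from the grand mean.
ss_total = sum((x - grand_mean) ** 2 for x in all_obs)
# Between-treatments variation.
sst = sum(len(s) * (m - grand_mean) ** 2 for s, m in zip(samples, means))
# Within-treatments variation.
sse = sum((x - m) ** 2 for s, m in zip(samples, means) for x in s)

print(f"SS(Total) = {ss_total:.4f}, SST + SSE = {sst + sse:.4f}")
```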
26. 26. Models of Fixed and Random Effects<br />Fixed effects<br />If all possible levels of a factor are included in our analysis, we have a fixed-effect ANOVA.<br />The conclusion of a fixed-effect ANOVA applies only to the levels studied.<br />Random effects<br />If the levels included in our analysis represent a random sample of all the possible levels, we have a random-effect ANOVA.<br />The conclusion of the random-effect ANOVA applies to all the levels (not only those studied).<br />
27. 27. Models of Fixed and Random Effects<br />In some ANOVA models the test statistic of the fixed-effects case may differ from the test statistic of the random-effects case.<br />Fixed and random effects – examples<br />Fixed effects – the advertisement example. All the levels of the marketing strategies were included.<br />Random effects – to determine if there is a difference in the production rate of 50 machines, four machines are randomly selected and their production recorded.<br />
28. 28. Two Way Analysis of Variance<br />
29. 29. [Figure: one-way ANOVA (single factor; response plotted for Treatments 1, 2, and 3, the three levels of the factor) compared with two-way ANOVA (two factors; response plotted for Factor A at levels 1, 2, and 3 and Factor B at levels 1 and 2)]<br />
30. 30. Two-Factor Analysis of Variance – Example<br />Suppose that in the example, two factors are to be examined:<br />The effects of the marketing strategy on sales:<br />Emphasis on convenience<br />Emphasis on quality<br />Emphasis on price<br />The effects of the selected media on sales:<br />Advertise on TV<br />Advertise in newspapers<br />
31. 31. Attempting one-way ANOVA<br />Solution<br />We may attempt to analyze combinations of levels, one from each factor, using one-way ANOVA.<br />The treatments will be:<br />Treatment 1: Emphasize convenience and advertise on TV<br />Treatment 2: Emphasize convenience and advertise in newspapers<br />…………………………………………………………………….<br />Treatment 6: Emphasize price and advertise in newspapers<br />
32. 32. Attempting one-way ANOVA<br />Solution<br />The hypotheses tested are:<br />$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5 = \mu_6$<br />$H_1$: At least two means differ.<br />
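The six treatments are simply every pairing of one strategy with one medium. A minimal sketch, with illustrative list names, enumerates them in the same order the slide uses:

```python
from itertools import product

# Levels of the two factors described in the example.
strategies = ["convenience", "quality", "price"]
media = ["TV", "newspapers"]

# Each treatment is one (strategy, medium) combination: 3 x 2 = 6 in total.
treatments = list(product(strategies, media))

for i, (strategy, medium) in enumerate(treatments, start=1):
    print(f"Treatment {i}: emphasize {strategy}, advertise via {medium}")
```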
33. 33. Attempting one-way ANOVA<br />Solution<br />
34. 34. In each of the six cities, sales are recorded for ten weeks.<br />
35. 35. In each city a different combination of marketing emphasis and media usage is employed.<br />[Table: weekly sales by city; data not recovered. City 1: convenience, TV; City 2: convenience, newspaper; City 3: quality, TV; City 4: quality, newspaper; City 5: price, TV; City 6: price, newspaper]<br />
36. 36. Attempting one-way ANOVA<br />Solution<br />The p-value = .0452.<br />We conclude that there is evidence that differences exist in the mean weekly sales among the six cities.<br />
37. 37. Interesting questions – no answers<br />These results raise some questions:<br />Are the differences in sales caused by the different marketing strategies?<br />Are the differences in sales caused by the different media used for advertising?<br />Are there combinations of marketing strategy and media that interact to affect the weekly sales?<br />
38. 38. Two-way ANOVA (two factors)<br />The current experimental design cannot provide answers to these questions.<br />A new experimental design is needed.<br />
39. 39. Two-way ANOVA (two factors)<br />Factor A: Marketing strategy (columns: Convenience, Quality, Price)<br />Factor B: Advertising media (rows: TV, Newspapers)<br />TV: City 1 sales (Convenience), City 3 sales (Quality), City 5 sales (Price)<br />Newspapers: City 2 sales (Convenience), City 4 sales (Quality), City 6 sales (Price)<br />Are there differences in the mean sales caused by different marketing strategies?<br />
40. 40. Two-way ANOVA (two factors)<br />Test whether the mean sales of “Convenience”, “Quality”, and “Price” significantly differ from one another.<br />$H_0: \mu_{\text{Conv.}} = \mu_{\text{Quality}} = \mu_{\text{Price}}$<br />$H_1$: At least two means differ<br />Calculations are based on the sum of squares for factor A, SS(A).<br />
41. 41. Two-way ANOVA (two factors)<br />Factor A: Marketing strategy (columns: Convenience, Quality, Price)<br />Factor B: Advertising media (rows: TV, Newspapers)<br />TV: City 1 sales, City 3 sales, City 5 sales<br />Newspapers: City 2 sales, City 4 sales, City 6 sales<br />Are there differences in the mean sales caused by different advertising media?<br />
42. 42. Two-way ANOVA (two factors)<br />Test whether the mean sales of “TV” and “Newspapers” significantly differ from one another.<br />$H_0: \mu_{\text{TV}} = \mu_{\text{Newspapers}}$<br />$H_1$: The means differ<br />Calculations are based on the sum of squares for factor B, SS(B).<br />
43. 43. Two-way ANOVA (two factors)<br />Factor A: Marketing strategy (columns: Convenience, Quality, Price)<br />Factor B: Advertising media (rows: TV, Newspapers)<br />TV: City 1 sales, City 3 sales, City 5 sales<br />Newspapers: City 2 sales, City 4 sales, City 6 sales<br />Example cell: Quality and TV (City 3)<br />Are there differences in the mean sales caused by interaction between marketing strategy and advertising medium?<br />
44. 44. Two-way ANOVA (two factors)<br />Test whether the mean sales of certain cells are different from the level expected.<br />Calculations are based on the sum of squares for interaction, SS(AB).<br />
45. 45. Sums of squares<br />
46. 46. F tests for the Two-way ANOVA<br />Test for the difference between the levels of the main factors A and B:<br />$F = \dfrac{MS(A)}{MSE}$ and $F = \dfrac{MS(B)}{MSE}$, where $MS(A) = \dfrac{SS(A)}{a-1}$, $MS(B) = \dfrac{SS(B)}{b-1}$, and $MSE = \dfrac{SSE}{n-ab}$<br />Rejection regions: $F > F_{\alpha,\,a-1,\,n-ab}$ for factor A and $F > F_{\alpha,\,b-1,\,n-ab}$ for factor B<br />Test for interaction between factors A and B:<br />$F = \dfrac{MS(AB)}{MSE}$, where $MS(AB) = \dfrac{SS(AB)}{(a-1)(b-1)}$<br />Rejection region: $F > F_{\alpha,\,(a-1)(b-1),\,n-ab}$<br />
47. 47. Required conditions:<br />The response distribution is normal.<br />The treatment variances are equal.<br />The samples are independent.<br />
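The three F ratios can be sketched in plain Python for a balanced design (the same number of observations in every cell). The sums-of-squares formulas used here are the standard balanced-design ones and are an assumption, since the slide's own formulas were not recovered; the function name `two_way_anova` and the toy data are hypothetical.

```python
def two_way_anova(cells):
    """F ratios for a balanced two-way layout.

    cells[i][j] is the list of observations for level i of factor A
    and level j of factor B; every cell must have the same length.
    """
    a, b, r = len(cells), len(cells[0]), len(cells[0][0])
    n = a * b * r

    def mean(xs):
        return sum(xs) / len(xs)

    grand = mean([x for row in cells for cell in row for x in cell])
    mean_a = [mean([x for cell in row for x in cell]) for row in cells]
    mean_b = [mean([x for row in cells for x in row[j]]) for j in range(b)]
    mean_ab = [[mean(cell) for cell in row] for row in cells]

    ss_a = r * b * sum((m - grand) ** 2 for m in mean_a)
    ss_b = r * a * sum((m - grand) ** 2 for m in mean_b)
    ss_ab = r * sum(
        (mean_ab[i][j] - mean_a[i] - mean_b[j] + grand) ** 2
        for i in range(a) for j in range(b)
    )
    sse = sum(
        (x - mean_ab[i][j]) ** 2
        for i in range(a) for j in range(b) for x in cells[i][j]
    )

    mse = sse / (n - a * b)
    return {
        "F_A": (ss_a / (a - 1)) / mse,
        "F_B": (ss_b / (b - 1)) / mse,
        "F_AB": (ss_ab / ((a - 1) * (b - 1))) / mse,
    }


# Hypothetical toy data: 3 levels of A, 2 levels of B, 2 observations per cell.
toy = [
    [[10.0, 12.0], [14.0, 16.0]],
    [[20.0, 22.0], [24.0, 26.0]],
    [[30.0, 32.0], [34.0, 36.0]],
]
result = two_way_anova(toy)
print(result)
```

The toy cell means are purely additive (each row adds 10, each column adds 4), so the interaction F ratio comes out as zero while both main-effect ratios are large.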
48. 48. F tests for the Two-way ANOVA<br />
49. 49. F tests for the Two-way ANOVA<br />Example – continued<br />Factor A: Marketing strategies<br />Test of the difference in mean sales between the three marketing strategies<br />$H_0: \mu_{\text{conv.}} = \mu_{\text{quality}} = \mu_{\text{price}}$<br />$H_1$: At least two mean sales are different<br />
50. 50. F tests for the Two-way ANOVA<br />Example – continued<br />Test of the difference in mean sales between the three marketing strategies<br />$H_0: \mu_{\text{conv.}} = \mu_{\text{quality}} = \mu_{\text{price}}$<br />$H_1$: At least two mean sales are different<br />$F = MS(A)/MSE = MS(\text{Marketing strategy})/MSE = 5.33$<br />$F_{\text{critical}} = F_{\alpha,\,a-1,\,n-ab} = F_{.05,\,3-1,\,60-(3)(2)} = 3.17$ (p-value = .0077)<br />At the 5% significance level there is evidence to infer that differences in weekly sales exist among the marketing strategies.<br />
51. 51. F tests for the Two-way ANOVA<br />Example – continued<br />Factor B: Advertising media<br />Test of the difference in mean sales between the two advertising media<br />$H_0: \mu_{\text{TV}} = \mu_{\text{Newspaper}}$<br />$H_1$: The two mean sales differ<br />
52. 52. F tests for the Two-way ANOVA<br />Example – continued<br />Test of the difference in mean sales between the two advertising media<br />$H_0: \mu_{\text{TV}} = \mu_{\text{Newspaper}}$<br />$H_1$: The two mean sales differ<br />$F = MS(B)/MSE = MS(\text{Media})/MSE = 1.42$<br />$F_{\text{critical}} = F_{\alpha,\,b-1,\,n-ab} = F_{.05,\,2-1,\,60-(3)(2)} = 4.02$ (p-value = .2387)<br />At the 5% significance level there is insufficient evidence to infer that differences in weekly sales exist between the two advertising media.<br />
53. 53. F tests for the Two-way ANOVA<br />Example – continued<br />Interaction AB: Marketing*Media<br />Test for interaction between factors A and B<br />$H_0: \mu_{\text{TV*conv.}} = \mu_{\text{TV*quality}} = \cdots = \mu_{\text{newsp.*price}}$<br />$H_1$: At least two means differ<br />
54. 54. F tests for the Two-way ANOVA<br />Example – continued<br />Test for interaction between factors A and B<br />$H_0: \mu_{\text{TV*conv.}} = \mu_{\text{TV*quality}} = \cdots = \mu_{\text{newsp.*price}}$<br />$H_1$: At least two means differ<br />$F = MS(AB)/MSE = MS(\text{Marketing*Media})/MSE = .09$<br />$F_{\text{critical}} = F_{\alpha,\,(a-1)(b-1),\,n-ab} = F_{.05,\,(3-1)(2-1),\,60-(3)(2)} = 3.17$ (p-value = .9171)<br />At the 5% significance level there is insufficient evidence to infer that the two factors interact to affect the mean weekly sales.<br />
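The three decisions in the example follow one rule: reject the null hypothesis when F exceeds its critical value. A short sketch using only the F statistics and critical values reported on the slides (the dictionary keys are illustrative labels):

```python
# (F statistic, critical value at the 5% level), as reported on the slides.
tests = {
    "marketing strategy (A)": (5.33, 3.17),
    "advertising media (B)": (1.42, 4.02),
    "interaction (AB)": (0.09, 3.17),
}

# Reject H0 whenever the statistic falls in the rejection region F > F_critical.
decisions = {name: f > crit for name, (f, crit) in tests.items()}

for name, reject in decisions.items():
    print(f"{name}: {'reject H0' if reject else 'do not reject H0'}")
```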
55. 55. Jyothimon C<br />M.Tech Technology Management<br />University of Kerala<br />Send your feedback and queries to<br />jyothimonc@yahoo.com <br />