Like this presentation? Why not share!

4,597

Published on

No Downloads

Total Views

4,597

On Slideshare

0

From Embeds

0

Number of Embeds

5

Shares

0

Downloads

0

Comments

0

Likes

4

No embeds

No notes for slide

- 1. 1 Research Methods in Health Chapter 8. Statistical Methods 2 ANOVA Young Moon Chae, Ph.D. Graduate School of Public Health Yonsei University, Korea ymchae@yuhs.ac
- 2. 2 Table of Contents • One way ANOVA • Multiple comparison • Repeated measure ANOVA • ANCOVA
- 3. 3 One-Way ANOVA
- 4. 4 ANOVA When to use it • Analysis of variance (ANOVA) is the most commonly used technique for comparing the means of groups of measurement data. There are lots of different experimental designs that can be analyzed with different kinds of ANOVA • In a one-way ANOVA (also known as a single-classification ANOVA), there is one measurement variable and one nominal variable. Null hypothesis • The statistical null hypothesis is that the means of the measurement variable are the same for the different categories of data; the alternative hypothesis is that they are not all the same.
- 5. 5 Rationale for ANOVA (1) • We have at least 3 means to test, e.g., H0: m1 = m2 = m3. • Could take them 2 at a time, but really want to test all 3 (or more) at once. • Instead of using a mean difference, we can use the variance of the group means about the grand mean over all groups. • Logic is just the same as for the t-test. Compare the observed variance among means (observed difference in means in the t-test) to what we would expect to get by chance.
- 6. 6 ANOVA Assumptions • Data in each group are a random sample from some population. • Observations within groups are independent. • Samples are independent. • Underlying populations normally distributed. • Underlying populations have the same variance – This can formally tested with Bartlett’s test • What might happen (why would it be a problem) if the assumption of {normality, equality of error, independence of error} turned out to be false? -> Use non-parametric statistics or use data transformation
- 7. 7 Multiple comparisons • When we carry out an ANOVA on k treatments, we test H0 : μ1 = · · · = μk versus Ha : H0 is false • Assume we reject the null hypothesis, i.e. we have some evidence that not all treatment means are equal. Then we could for example be interested in which ones are the same, and which ones differ. • For this, we might have to carry out some more hypothesis tests. • This procedure is referred to as multiple comparisons.
- 8. 8 Types of multiple comparisons • There are two different types of multiple comparisons procedures: • Sometimes we already know in advance what questions we want to answer. Those comparisons are called planned (or a priori) comparisons. • Sometimes we do not know in advance what questions we want to answer, and the judgment about which group means will be studied the same depends on the ANOVA outcome. Those comparisons are called unplanned (or a posteriori) comparisons. -Planned comparisons: adjust for just those tests that are planned. -Unplanned comparisons: adjust for all possible comparisons.
- 9. 9 Independence of planned comparison å = j jjcc 021 Comparison A1 A2 A3 A4 1 -1/3 1 -1/3 -1/3 2 -1/2 0 -1/2 1 3 1/2 1/2 -1/2 -1/2 0)1*3/1()2/1*3/1( )0*1()2/1*3/1(21 =-+-- ++--=åj jj cc 3/26/4)2/1*3/1()2/1*3/1( )2/1*1()2/1*3/1(31 ==--+-- ++-=åj jj cc One and two are orthogonal; one and three are not There are J-1 orthogonal comparisons. Use only what you need. .
- 10. 10 Example 1 • We previously investigated whether the mean blood coagulation times for animals receiving different diets (A, B, C or D) were the same. • Imagine A is the standard diet, and we wish to compare each of diets B, C, D to diet A. → planned comparisons! • After inspecting the treatment means, we find that A and D look similar, and B and C look similar, but A and D are quite different from B and C. We might want to formally test the hypothesis → unplanned comparisons!
- 11. 11 Example 2 • A plant physiologist recorded the length of pea sections grown in tissue culture with auxin present. The purpose of the experiment was to investigate the effects of various sugars on growth. Four different treatments were used, plus one control (no sugar): No sugar 2% glucose 2% fructose 1% glucose + 1% fructose 2% sucrose • The investigator wants to answer three specific questions: - Does the addition of sugars have an effect on the lengths of the pea sections? - Are there differences between the pure sugar treatments and the mixed sugar treatment? - Are there differences among the pure sugar treatments? Planned comparisons!
- 12. 12 ANOVA Table
- 13. 13 Bonferroni Correction • Suppose we have 10 treatment groups, and so 45 pairs. • If we perform 45 t-tests at the significance level = 0.05, we’d expect to reject 5% × 45 ≈ 2 of them, even if all of the means were the same. • Let = Pr(reject at least one pairwise test | all μ’s the same) ≤ (no. tests) × Pr(reject test #1 | μ’s the same) • The Bonferroni correction: Use ′ = /(no. tests) as the significance level for each test. • For example, with 10 groups and so 45 pairwise tests, we’d use ′ = 0.05 / 45 ≈ 0.0011 for each test.
- 14. 14 Post Hoc Tests • Given a significant F, where are the mean differences? • Often do not have planned comparisons. • Usually compare pairs of means. • There are many methods of post hoc (after the fact) tests.
- 15. 15 Scheffé • Can use for any contrast. Follows same calculations, but uses different critical values. • Instead of comparing the test statistic to a critical value of t, use: )ˆvar(. ˆ y y est t = aFJ )1(S -= Where the F comes from the overall F test (J-1 and N-J df).
- 16. 16 Scheffé (2) Source SS df MS F Cells (A1- A4) 219 3 73 12.17 Error 72 12 6 Total 291 15 5.3)22*5(.)28*5(. )25*5(.)18*5(.ˆ1 -=-- +=y 2/3 4 )5.()5.(5.5. 6)ˆ(. 2222 = -+-++ =yVarest 86.2 2247.1 5.3 )ˆvar(. ˆ -= - == y y est t 24.349.3)14()1(S =-=-= aFJ 49.3)12,3,05.( ==aF (Data from earlier problem.) The comparison is not significant because |-2.86|<3.24.
- 17. 17 Paired comparisons •Newman Keuls and Tukey HSD depend on q, the studentized range statistic. • Suppose we have J independent sample means and we find the largest and the smallest. nMS yy q error / minmax - = MS error comes from the ANOVA we did to get the J means. The n refers to sample size per cell. If two cells are unequal, use 2n1n2/(n1+n2). The sampling distribution of q depends on k, the number of means covered by the range (max-min), and on v, the degrees of freedom for MSerror.
- 18. 18 Tukey HSD HSD = honestly significant difference. For HSD, use k = J, the number of groups in the study. Choose alpha, and find the df for error. Look up the value qα. Then find the value: n MS qHSD error a= Compare HSD to the absolute value of the difference between all pairs of means. Any difference larger than HSD is significant.
- 19. 19 HSD 2 Grp -> 1 2 3 4 5 M -> 63 82 80 77 70 Source SS df MS F p Grps 2942.4 4 725.6 4.13 <.05 Error 9801.0 55 178.2 K = 5 groups; n=12 per group, v has 55 df. Tabled value of q with alpha =.05 is 3.98. 34.15 12 2.178 98.3 === n MS qHSD error a Group 1 5 4 3 2 1 63 0 7 14 17* 19* 5 70 0 7 10 12 4 77 0 3 5 3 80 0 2 2 82 0
- 20. 20 Comparing Post Hoc Tests •The Newman-Keuls found 3 significant differences in our example. The HSD found 2 differences. •If we had used the Bonferroni approach, we would have found an interval of 15.91 required for significance (and therefore the same two significant as HSD). Thus, power descends from the Newman-Keuls to the HSD to the Bonferroni. • The type I error rates go just the opposite, the lowest to Bonferroni, then HSD and finally Newman-Keuls. Do you want to be liberal or conservative in your choice of tests? Type I error vs Power.
- 21. 21 Repeated Measures ANOVA When to Use Repeated Measures ANOVA • Repeated measures ANOVA is used when all members of a random sample are measured under a number of different conditions. As the sample is exposed to each condition in turn, the measurement of the dependent variable is repeated. Using a standard ANOVA in this case is not appropriate because the data violate the ANOVA assumption of independence. • This approach is used for several reasons. - Some research hypotheses require repeated measures. Longitudinal research, for example, measures each sample member at each of several ages. In this case, age would be a repeated factor. - When sample members are difficult to recruit, repeated measures designs are economical because each member is measured under all conditions.
- 22. 22 Statistical Terminology Used in this Document • A sample member is called a subject. • When a dependent variable is measured repeatedly for all sample members across a set of conditions, this set of conditions is called a within-subjects factor. The conditions that constitute this type of factor are called trials. • When a dependent variable is measured on independent groups of sample members, where each group is exposed to a different condition, the set of conditions is called a between-subjects factor. The conditions that constitute this factor type are called groups. • When an analysis has both within-subjects factors and between subjects factors, it is called a repeated measures ANOVA with between-subjects factors
- 23. 23 Example • Suppose that, as a health researcher, you want to examine the impact of dietary habit and exercise on pulse rate. To investigate these issues, you collect a sample of individuals and group them according to their dietary preferences: meat eaters and vegetarians. You then divide each diet category into three groups, randomly assigning each group to one of three types of exercise: aerobic stair climbing, racquetball, and weight training. So far, then, your design has two between- subjects (grouping) factors: dietary preference and exercise type. • Suppose that, in addition to these between-subjects factors, you want to include a single within-subjects factor in the analysis. Each subject's pulse rate will be measured at three levels of exertion: after warm-up exercises, after jogging, and after running. Thus, intensity (of exertion) is the within-subjects factor in this design. The order of these three measurements will be randomly assigned for each subject
- 24. 24 Research Questions : Within-Subjects Main Effect • Does intensity influence pulse rate? (Does mean pulse rate change across the trials for intensity?) This is the test for a within-subjects main effect of intensity. Between-Subjects Main Effects • Does dietary preference influence pulse rate? (Do vegetarians have different mean pulse rates than meat eaters?) This is the test for a between-subjects main effect of dietary preference. • Does exercise type influence pulse rate? (Are there differences in mean pulse rates between stair climbers, racquetball players, and weight trainers?) This is the test for a between-subjects main effect of exercise type. Between-Subjects Interaction Effect • Does the influence of exercise type on pulse rate depend on dietary preference? (Does the pattern of differences between mean pulse rates for exercise-type groups change for each dietary-preference group?) This is the test for a between-subjects interaction of exercise type by dietary preference.
- 25. 25 Results • Diet: With a p value less than .0001, you have a statistically significant effect. You can therefore conclude that a statistically significant difference exists between vegetarians and meat eaters on their overall pulse rates. In other words, there is a main effect for diet. The cell means (not shown here) show that meat eaters experience higher pulse rates than vegetarians. • Exercise type: It is non-significant: F(2, 144) = .31, p=.7341. Thus, you can conclude that the type of exercise has no statistically significant effect on overall mean pulse rates. • The test of the DIET BY EXERTYPE interaction also shows a non-significant result (F(2, 144) = .52, p=.594). This suggests that dietary preferences and type of exercise do not combine to influence the overall average pulse rate. • When an interaction effect is significant, the pattern of cell means must be examined to determine the meaning not only of the interaction, but also the meaning of any main effects involved in the interaction.
- 26. 26 Analyzing Continuous and Categorical IVs Simultaneously Analysis of Covariance
- 27. 27 (cont.) When to use it • The purpose of ANCOVA is to compare two or more linear regression lines. It is a way of comparing the Y variable among groups while statistically controlling for variation in Y caused by variation in the X variable. Null hypotheses • Two null hypotheses are tested in an ANCOVA. The first is that the slopes of the regression lines are all the same. If this hypothesis is not rejected, the second null hypothesis is tested: that the Y-intercepts of the regression lines are all the same. • Although the most common use of ANCOVA is for comparing two regression lines, it is possible to compare three or more regressions. If their slopes are all the same, it is then possible to do planned or unplanned comparisons of Y-intercepts, similar to the planned or unplanned comparisons of means in an ANOVA
- 28. 28 ANCOVA (GLM): Example The General Linear Model (GLM) approach is used to ANCOVA to determine whether MCAT scores are significantly different among medical students who had different types of undergraduate majors, when adjusted for year of matriculation.
- 29. 29 • Dependent variable § nmtot1: MCAT total (most recent) • Fixed factor § bmaj2: Undergraduate major 1 = Biology/Chemistry 2 = Other science/health 3 = Other • Covariate § matyr: Year of matriculation ANCOVA (GLM): cont.
- 30. 30 ANCOVA (GLM) Output: : Descriptive Statistics
- 31. 31 ANCOVA (GLM) Output: : Descriptive Statistics (continued)
- 32. 32 ANCOVA (GLM) Output: Tests of Between-Subjects Effects
- 33. 33 ANCOVA (GLM): Estimated Marginal Means: Undergraduate Major
- 34. 34 ANCOVA (GLM): Post Hoc Tests: Undergraduate Major
- 35. 35 ANCOVA (GLM): Profile Plot: Gender*Resloc

Be the first to comment