
ANOVA II

Describes the design, assumptions, and interpretations for one-way ANOVA, one-way repeated measures ANOVA, factorial ANOVA, SPANOVA, ANCOVA, and MANOVA. More info: http://en.wikiversity.org/wiki/Survey_research_and_design_in_psychology/Lectures/ANOVA_II



Lecture 10: ANOVA Part 2. Survey Research & Design in Psychology, James Neill, 2009.
Overview (parametric tests of mean differences):
- One-sample t-test
- Independent samples t-test
- Paired samples t-test
- One-way ANOVA
- One-way repeated measures ANOVA
- Factorial ANOVA
- Mixed design ANOVA
- ANCOVA
- MANOVA
- Repeated measures MANOVA
Major assumptions:
- Normally distributed variables
- Homogeneity of variance
In general, ANOVA is robust to violation of its assumptions, particularly with large cell sizes, but don't be complacent.
Why a t-test or ANOVA?
- A t-test or ANOVA is used to determine whether a sample of scores is from the same population as another sample of scores.
- These are inferential tools for examining differences between group means.

t-tests
- An inferential statistical test used to determine whether two sets of scores come from the same population.
- Is the difference between two sample means 'real' or due to chance?
Use of t in t-tests
- Question: Is t large enough that it is unlikely the two samples have come from the same population?
- Decision: Is t larger than the critical value for t? (See t tables; depends on the critical α and N.)

Ye good ol' normal distribution: 68% of scores fall within ±1 SD, 95% within ±2 SD, and 99.7% within ±3 SD of the mean.
Use of t in t-tests
- t reflects the ratio of between-groups differences to within-groups variability.
- Is t large enough that it is unlikely the two samples have come from the same population?
- Decision: Is t larger than the critical value for t? (See t tables; depends on the critical α and N.)

One-tail vs. two-tail tests
- A two-tailed test rejects the null hypothesis if the obtained t-value is extreme in either direction.
- A one-tailed test rejects the null hypothesis if the obtained t-value is extreme in one direction (you choose: too high or too low).
- One-tailed tests are more powerful than two-tailed tests (all of α is placed in one tail), but they can only detect differences in the chosen direction.
Single sample t-test
- Compares one group (a sample) with a fixed, pre-existing value (e.g., population norms).
- e.g., Does a sample of university students who sleep on average 6.5 hours per day (SD = 1.3) differ significantly from the recommended 8 hours of sleep?
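Outside SPSS, this kind of single-sample comparison takes a couple of lines; here is a minimal Python sketch using scipy, with an invented `sleep_hours` sample standing in for real data.

```python
import numpy as np
from scipy import stats

# Hypothetical sample of average nightly sleep hours for university students
sleep_hours = np.array([6.0, 7.5, 5.5, 6.5, 7.0, 6.0, 8.0, 5.0, 6.5, 7.0])

# One-sample t-test against the fixed, pre-existing value (the recommended 8 hours)
t, p = stats.ttest_1samp(sleep_hours, popmean=8.0)
print(f"t = {t:.2f}, p = {p:.3f}")
```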
Independent groups t-test
- Compares mean scores on the same variable across different populations (groups), e.g.:
  - Do males and females differ in IQ?
  - Do Americans vs. non-Americans differ in their approval of George Bush?

Assumptions (independent samples t-test)
- IV is ordinal / categorical (e.g., gender).
- DV is interval / ratio (e.g., self-esteem).
- Homogeneity of variance: if variances are unequal (Levene's test), an adjustment is made (see the sketch after this list).
- Normality: t-tests are robust to modest departures from normality; consider the Mann-Whitney U test if skewness is severe.
- Independence of observations (one participant's score is not dependent on any other participant's score).
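A minimal sketch of the same test in Python with scipy, using two invented score vectors; Levene's test guides whether the Welch-adjusted version (the "equal variances not assumed" row in SPSS) is used.

```python
import numpy as np
from scipy import stats

# Hypothetical self-esteem scores for two independent groups
males = np.array([31, 35, 28, 33, 30, 36, 29, 34])
females = np.array([33, 38, 35, 36, 39, 34, 37, 36])

# Levene's test for homogeneity of variance
lev_stat, lev_p = stats.levene(males, females)

# If variances look unequal, use Welch's adjustment (equal_var=False)
t, p = stats.ttest_ind(males, females, equal_var=(lev_p > .05))
print(f"Levene p = {lev_p:.3f}, t = {t:.2f}, p = {p:.3f}")
```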
Example: Do males and females differ in memory recall?

Paired samples t-test
- Same participants, with repeated measures.
- Data are sampled within subjects, e.g.:
  - pre- vs. post-treatment ratings
  - different factors, e.g., voters' approval ratings of candidate X vs. candidate Y.

Assumptions (paired samples t-test)
- DV must be measured at interval or ratio level.
- The population of difference scores must be normally distributed (robust to violation with larger samples).
- Independence of observations (one participant's score is not dependent on any other participant's score).
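A minimal paired-samples sketch in Python with scipy, using invented pre/post ratings; the test is equivalent to a one-sample t-test on the within-subject difference scores.

```python
import numpy as np
from scipy import stats

# Hypothetical pre- and post-treatment ratings from the same participants
pre = np.array([5.1, 4.8, 6.0, 5.5, 4.9, 5.7, 6.1, 5.0])
post = np.array([5.8, 5.2, 6.4, 5.9, 5.5, 6.0, 6.5, 5.4])

# Paired samples t-test on the repeated measures
t, p = stats.ttest_rel(pre, post)
print(f"t = {t:.2f}, p = {p:.3f}")
```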
Example: Do females' memory recall scores change over time?

Assumptions
- IV is ordinal / categorical (e.g., gender).
- DV is interval / ratio (e.g., self-esteem).
- Homogeneity of variance: if variances are unequal, an adjustment is made (Levene's test).
- Normality: often violated without consequence:
  - look at histograms
  - look at skewness
  - look at kurtosis

SPSS output: Independent samples t-test (same-sex relations)
SPSS output: Independent samples t-test (opposite-sex relations)
What is ANOVA? (Analysis of Variance)
- An extension of a t-test.
- A way to test for differences between the means (Ms) of:
  - more than 2 groups, or
  - more than 2 times or variables.
- Levels of measurement:
  - DV: metric
  - IV: categorical

Introduction to ANOVA
- Single DV, with 1 or more IVs.
- IVs are discrete (categorical).
- Are there differences in the central tendency of the groups?
- Inferential: Could the observed differences be due to chance?
- Follow-up tests: Which of the Ms differ?
- Effect size: How large are the differences?
F test
- ANOVA partitions the 'sums of squares' (variance from the mean) into:
  - explained variance (between groups), and
  - unexplained variance (within groups), also called error variance.
- F is the ratio of explained to unexplained variance.
- The p value for F is the probability that the observed mean differences between groups could be attributable to chance.
- F is equivalent to the MLR test of the significance of R.

F is the ratio of between-groups to within-groups variance.
Assumptions – one-way ANOVA
The DV must be:
- measured at interval or ratio level
- normally distributed in all groups of the IV (robust to violations of this assumption if Ns are large and approximately equal, e.g., >15 cases per group)
- of approximately equal variance across all groups of the IV (homogeneity of variance).
Observations must also be independent. (A minimal code sketch follows.)
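A minimal one-way between-groups ANOVA sketch in Python with statsmodels, using an invented long-format data frame (`age_group`, `loc`); the resulting table shows how the sums of squares are partitioned into between-groups and residual (within-groups) components.

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Hypothetical long-format data: one row per participant
df = pd.DataFrame({
    "age_group": ["20-25"] * 5 + ["40-45"] * 5 + ["60-65"] * 5,
    "loc": [3.2, 2.8, 3.5, 3.0, 2.9,
            3.6, 3.9, 3.4, 3.8, 3.7,
            4.1, 4.4, 3.9, 4.2, 4.5],
})

# One-way between-groups ANOVA: F = MS_between / MS_within
model = ols("loc ~ C(age_group)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))
```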
Example: one-way between-groups ANOVA
Does Locus of Control (LOC) differ across age groups?
- 20–25 year-olds
- 40–45 year-olds
- 60–65 year-olds

η² = SS_between / SS_total = 395.433 / 3092.983 = 0.128. Eta-squared is expressed as a percentage: 12.8% of the total variance in Control is explained by differences in Age.

Which age groups differ in their mean Control scores? (Post hoc tests.) Conclusion: group 0 differs from group 2, and group 1 differs from group 2.

One-way ANOVA: Are there differences in Satisfaction levels between students who get different Grades?
Assumptions – repeated measures ANOVA
- Sphericity: the variance of the population difference scores for any two conditions should be the same as the variance of the population difference scores for any other two conditions (Mauchly's test of sphericity).
  - Note: this assumption is commonly violated; however, the multivariate test (provided by default in SPSS output) does not require sphericity and may be used as an alternative.
  - When the two sets of results are consistent, this is not of major concern; when they are discrepant, it is better to go with MANOVA.
- Normality

Example: repeated measures ANOVA
- Does LOC vary over a period of 12 months?
- LOC measures obtained over 3 intervals: baseline, 6-month follow-up, and 12-month follow-up.
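A minimal repeated measures sketch in Python using statsmodels' AnovaRM with an invented balanced long-format data frame (`subject`, `time`, `loc`); note that AnovaRM reports uncorrected univariate results, so sphericity (Mauchly's test) still needs to be checked separately.

```python
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# Hypothetical long-format data: LOC for each participant at each occasion
df = pd.DataFrame({
    "subject": [s for s in range(1, 7) for _ in range(3)],
    "time": ["baseline", "6 months", "12 months"] * 6,
    "loc": [3.1, 3.3, 3.6, 2.9, 3.0, 3.2, 3.4, 3.6, 3.8,
            3.0, 3.2, 3.5, 3.3, 3.5, 3.7, 2.8, 3.0, 3.3],
})

# One-way repeated measures ANOVA (no Greenhouse-Geisser correction applied here)
res = AnovaRM(data=df, depvar="loc", subject="subject", within=["time"]).fit()
print(res)
```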
Output: mean LOC scores (with 95% CIs) across the 3 measurement occasions; descriptive statistics; Mauchly's test of sphericity; tests of within-subject effects.

One-way repeated measures ANOVA: Do Satisfaction levels vary between the Education, Teaching, Social and Campus aspects of university life? (Output: tests of within-subject effects.)
Follow-up tests
- Post hoc: compares every possible combination.
- Planned: compares specific combinations.
(Do one or the other, not both.)

Post hoc tests
- Control the Type I error rate.
- e.g., Scheffé, Bonferroni, Tukey's HSD, or Student-Newman-Keuls.
- Keep the experiment-wise error rate to a fixed limit.
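A minimal post hoc sketch in Python with statsmodels' Tukey HSD, reusing the invented LOC-by-age data from the one-way ANOVA sketch above; every pair of groups is compared while the family-wise error rate is held at alpha.

```python
import pandas as pd
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Hypothetical LOC scores by age group (same structure as the one-way ANOVA sketch)
df = pd.DataFrame({
    "age_group": ["20-25"] * 5 + ["40-45"] * 5 + ["60-65"] * 5,
    "loc": [3.2, 2.8, 3.5, 3.0, 2.9,
            3.6, 3.9, 3.4, 3.8, 3.7,
            4.1, 4.4, 3.9, 4.2, 4.5],
})

# Tukey's HSD: all pairwise comparisons at a fixed experiment-wise error rate
print(pairwise_tukeyhsd(endog=df["loc"], groups=df["age_group"], alpha=0.05))
```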
Planned contrasts
- Need the hypothesis before you start.
- Specify contrast coefficients to weight the comparisons (e.g., the first two groups vs. the last one).
- Each contrast is tested at the critical α.
Two-way ANOVA: Are there differences in Satisfaction levels across Gender and Age?
Two-way ANOVA: Are there differences in LOC across Gender and Age?

Example: two-way (factorial) ANOVA
- Main effect 1: Do LOC scores differ by Age?
- Main effect 2: Do LOC scores differ by Gender?
- Interaction: Is the relationship between Age and LOC moderated by Gender? (Does any relationship between Age and LOC vary as a function of Gender?)
Factorial designs test main effects and interactions. In this example there are two main effects (Age and Gender) and one interaction (Age x Gender) potentially explaining variance in the DV (LOC).
Data structure
- IVs: Age, recoded into 3 groups; Gender, dichotomous (2 levels).
- DV: Locus of Control (LOC); low scores = more internal, high scores = more external.
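A minimal factorial ANOVA sketch in Python with statsmodels, using an invented balanced data frame with both IVs; the table reports the two main effects and the Age x Gender interaction.

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Hypothetical data: LOC by Age group (3 levels) and Gender (2 levels), 4 cases per cell
df = pd.DataFrame({
    "age": ["20-25", "20-25", "40-45", "40-45", "60-65", "60-65"] * 4,
    "gender": (["male"] * 6 + ["female"] * 6) * 2,
    "loc": [3.2, 3.0, 3.6, 3.5, 4.0, 4.2,
            3.1, 2.9, 3.8, 3.6, 4.3, 4.1,
            3.3, 3.1, 3.5, 3.7, 4.1, 4.4,
            3.0, 3.2, 3.7, 3.5, 4.2, 4.0],
})

# Two-way (factorial) ANOVA: two main effects plus the interaction
model = ols("loc ~ C(age) * C(gender)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))
```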
Plots: LOC by Age and Gender; Age main effect; Gender main effect; Age x Gender interaction.
Mixed design ANOVA (SPANOVA)
- It is very common for factorial designs to have within-subject (repeated) measures on some, but not all, of their treatment factors.
- Since such experiments mix between-subjects and within-subjects factors, they are said to be of mixed design.
- Common practice is to select two samples of subjects (e.g., males/females, winners/losers),
- then take some repeated measures on each group.
- Example: males and females are tested for recall of a written passage with three different line spacings.
- This experiment has two factors:
  - Between-subjects (B/W): Gender (male or female)
  - Within-subjects (W/I): Spacing (narrow, medium, wide)
- Gender varies between subjects; Spacing varies within subjects.
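A minimal mixed-design sketch, assuming the third-party pingouin package is available (its `mixed_anova` function handles one between-subjects and one within-subjects factor); the data frame and column names are invented.

```python
import pandas as pd
import pingouin as pg

# Hypothetical recall scores: Gender is between-subjects, Spacing is within-subjects
df = pd.DataFrame({
    "subject": [s for s in range(1, 7) for _ in range(3)],
    "gender": ["male"] * 9 + ["female"] * 9,
    "spacing": ["narrow", "medium", "wide"] * 6,
    "recall": [10, 12, 14, 9, 11, 13, 11, 12, 15,
               12, 14, 17, 11, 13, 16, 13, 15, 18],
})

# Mixed (split-plot) ANOVA: between effect, within effect, and their interaction
aov = pg.mixed_anova(data=df, dv="recall", within="spacing",
                     between="gender", subject="subject")
print(aov)
```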
Design
- If A is Gender and B is Spacing, the reading experiment is of the type A x (B),
- signifying a mixed design with repeated measures on factor B.

Design
- With three treatment factors, two mixed designs are possible,
- with either one or two repeated-measures factors:
  - A x B x (C), or
  - A x (B x C).
Assumptions (mixed design ANOVA)
- Random selection
- Normality
- Homogeneity of variance
- Sphericity
- Homogeneity of inter-correlations
Sphericity
- The variance of the population difference scores for any two conditions should be the same as the variance of the population difference scores for any other two conditions.
- Tested by Mauchly's Test of Sphericity.
- If Mauchly's W statistic is significant (p < .05), the assumption of sphericity is violated.
- The obtained F ratio must then be evaluated against new degrees of freedom calculated from the Greenhouse-Geisser or Huynh-Feldt epsilon values.

Homogeneity of inter-correlations
- The pattern of inter-correlations among the various levels of the repeated-measures factor(s) should be consistent from level to level of the between-subjects factor(s).
- The assumption is tested using Box's M statistic; homogeneity is present when the M statistic is not significant (p > .001).
Mixed ANOVA (split-plot ANOVA): Do Satisfaction levels vary between Genders for the Education and Teaching aspects?

ANCOVA: Does Education Satisfaction differ between people who are 'not coping', 'just coping' and 'coping well'?
What is ANCOVA?
- Analysis of covariance.
- An extension of ANOVA, using 'regression' principles.
- Assesses the effect of one variable (IV) on another variable (DV) after controlling for a third variable (CV).
Why use ANCOVA?
- Removes variance associated with the covariate (CV) from the DV error (unexplained variance) term.
- Increases the power of the F-test.
- You may not be able to achieve experimental control over a variable (e.g., via randomisation), but you can measure it and statistically control for its effect.

Why use ANCOVA?
- Adjusts group means to what they would have been if all participants had scored identically on the CV.
- The differences between participants on the CV are removed, allowing focus on the remaining variation in the DV due to the IV.
- Make sure the hypothesis (or hypotheses) is/are clear.
Assumptions of ANCOVA
- As per ANOVA:
  - Normality
  - Homogeneity of variance (use Levene's test)
- Independence of observations.
- Independence of the IV and CV.
- Multicollinearity: if there is more than one CV, they should not be highly correlated; eliminate highly correlated CVs.
- Reliability of CVs: they should be measured without error; only use reliable CVs.
- Linearity between the CV and DV: check via scatterplot and correlation.
Assumptions of ANCOVA: homogeneity of regression
- Estimate the regression of the DV on the CV within each group.
- DV scores and means are adjusted to remove the linear effects of the CV.
- Assumes the slopes of the regression lines between the CV and DV are equal for each level of the IV; if not, do not proceed with ANCOVA.
- Check via scatterplot with lines of best fit (see the sketch below).
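One common way to check homogeneity of regression is to fit the IV x CV interaction and confirm it is non-significant; a minimal sketch with statsmodels and invented data and column names.

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Hypothetical data: achievement (DV), teaching method (IV), motivation (CV)
df = pd.DataFrame({
    "method": ["A"] * 6 + ["B"] * 6,
    "motivation": [3, 5, 2, 4, 6, 5, 4, 2, 6, 3, 5, 4],
    "achievement": [55, 68, 50, 62, 75, 70, 60, 48, 78, 55, 70, 63],
})

# Homogeneity of regression: a significant C(method):motivation interaction
# means the CV-DV slopes differ across groups, so ANCOVA should not proceed
check = ols("achievement ~ C(method) * motivation", data=df).fit()
print(sm.stats.anova_lm(check, typ=2))
```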
ANCOVA example
- Does Teaching Method affect Academic Achievement after controlling for motivation?
- IV = teaching method
- DV = academic achievement
- CV = motivation
- Experimental design: assume students are randomly allocated to the different teaching methods. (A code sketch follows.)
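A minimal ANCOVA sketch in Python with statsmodels for this teaching-method example, using invented data; entering the covariate in the model removes its variance from the error term before the IV is tested.

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Hypothetical data: teaching method (IV), motivation (CV), achievement (DV)
df = pd.DataFrame({
    "method": ["lecture"] * 6 + ["workshop"] * 6,
    "motivation": [3, 5, 2, 4, 6, 5, 4, 2, 6, 3, 5, 4],
    "achievement": [55, 68, 50, 62, 75, 70, 63, 50, 80, 58, 73, 66],
})

# ANCOVA: test the effect of method on achievement, controlling for motivation
ancova = ols("achievement ~ motivation + C(method)", data=df).fit()
print(sm.stats.anova_lm(ancova, typ=2))
```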
ANCOVA example 1 (diagram): Academic Achievement (DV), Teaching Method (IV), Motivation (CV).

ANCOVA example
- A one-way ANOVA shows a non-significant effect of teaching method (IV) on academic achievement (DV).
- An ANCOVA is used to adjust for differences in motivation.
- F has gone from 1 to 5 and is now significant, because the error term (unexplained variance) was reduced by including motivation as a CV.
ANCOVA & hierarchical MLR
- ANCOVA is similar to hierarchical regression: it assesses the impact of an IV on a DV while controlling for a 3rd variable.
- ANCOVA is more commonly used when the IV is categorical.

ANCOVA & hierarchical MLR
- Does teaching method affect achievement after controlling for motivation?
  - IV = teaching method
  - DV = achievement
  - CV = motivation
- We could perform hierarchical MLR, with Motivation at step 1 and Teaching Method at step 2.
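A minimal hierarchical-regression sketch of the same question in Python with statsmodels (invented data, as above): motivation enters at step 1, teaching method at step 2, and the R² change at step 2 is the variance uniquely explained by teaching method after controlling for motivation.

```python
import pandas as pd
from statsmodels.formula.api import ols

# Hypothetical data, as in the ANCOVA sketch above
df = pd.DataFrame({
    "method": ["lecture"] * 6 + ["workshop"] * 6,
    "motivation": [3, 5, 2, 4, 6, 5, 4, 2, 6, 3, 5, 4],
    "achievement": [55, 68, 50, 62, 75, 70, 63, 50, 80, 58, 73, 66],
})

# Step 1: covariate only; Step 2: add the IV
step1 = ols("achievement ~ motivation", data=df).fit()
step2 = ols("achievement ~ motivation + C(method)", data=df).fit()

print(f"R2 step 1 = {step1.rsquared:.3f}")
print(f"R2 step 2 = {step2.rsquared:.3f}, delta R2 = {step2.rsquared - step1.rsquared:.3f}")

# F test of the R2 change (step 2 vs. step 1)
f_value, p_value, df_diff = step2.compare_f_test(step1)
print(f"F change = {f_value:.2f}, p = {p_value:.3f}")
```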
ANCOVA & hierarchical MLR (output)
1. Motivation is a significant predictor of achievement.
2. Teaching method is a significant predictor of achievement after controlling for motivation.
ANCOVA example
- Does employment status affect well-being after controlling for age?
  - IV = employment status
  - DV = well-being
  - CV = age
- Quasi-experimental design: participants were not randomly allocated to 'employment status'.
- ANOVA: a significant effect of employment status.
- ANCOVA: employment status remains significant after controlling for the effect of age.
Summary of ANCOVA
- Use ANCOVA in survey research when you can't randomly allocate participants to conditions (e.g., quasi-experiments), or to control for extraneous variables.
- ANCOVA allows us to statistically control for one or more covariates.
Summary of ANCOVA
- Decide which variable(s) are the IV, DV and CV(s).
- Check the assumptions:
  - normality
  - homogeneity of variance (Levene's test)
  - linearity between CV and DV (scatterplot)
  - homogeneity of regression (scatterplot; compare the slopes of the regression lines)
- Results: does the IV affect the DV after controlling for the effect of the CV?
MANOVA: Multivariate Analysis of Variance

MANOVA: overview
- A generalisation of ANOVA to situations where there are several dependent variables.
- e.g., a researcher is interested in the effect of different types of treatment on several types of anxiety:
  - test anxiety
  - sport anxiety
  - speaking anxiety
- The IV could be 3 different anxiety interventions: systematic desensitisation, autogenic training, and a waiting-list control.
- MANOVA is used to ask whether the three anxiety measures vary overall as a function of the different treatments.
MANOVA: overview
ANOVA tests whether mean differences among groups on a single DV are likely to have occurred by chance. MANOVA tests whether mean differences among groups on a combination of DVs are likely to have occurred by chance.
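A minimal MANOVA sketch of this three-DV example in Python with statsmodels, using invented data; `mv_test()` reports the multivariate statistics, including Wilks' lambda.

```python
import pandas as pd
from statsmodels.multivariate.manova import MANOVA

# Hypothetical data: three anxiety DVs and one treatment IV with 3 levels
df = pd.DataFrame({
    "treatment": ["desensitisation"] * 4 + ["autogenic"] * 4 + ["control"] * 4,
    "test_anx": [10, 12, 11, 9, 14, 15, 13, 14, 18, 17, 19, 16],
    "sport_anx": [8, 9, 7, 8, 10, 11, 9, 10, 12, 13, 11, 12],
    "speak_anx": [11, 12, 10, 11, 13, 14, 12, 13, 16, 15, 17, 16],
})

# MANOVA: do the three anxiety measures, in combination, differ by treatment?
fit = MANOVA.from_formula("test_anx + sport_anx + speak_anx ~ treatment", data=df)
print(fit.mv_test())
```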
MANOVA advantages over ANOVA
1. By measuring several DVs instead of only one, the researcher improves the chance of discovering what it is that changes as a result of different treatments and their interactions.
   - e.g., desensitisation may have an advantage over relaxation training or a control, but only on test anxiety; the effect is missed if test anxiety is not one of your DVs.
2. When there are several (likely correlated) DVs, there is protection against inflated Type I error due to multiple tests.
3. When responses to two or more DVs are considered in combination, group differences that are not apparent on any single DV can become apparent.
- The best choice is a set of DVs uncorrelated with one another, because each then measures a separate aspect of the influence of the IVs.
- When there is little correlation among DVs, univariate F is acceptable.
- Unequal cell sizes and missing data are problematic for MANOVA.
- Reduced power can mean a non-significant multivariate effect but one or more significant univariate Fs.
- When cell sizes are > 30, violations of the assumptions of normality and equal variances are of little concern.
- Equal cell sizes are preferred (but not essential); smallest-to-largest cell-size ratios beyond about 1:1.5 may cause problems.
- MANOVA is sensitive to violations of univariate and multivariate normality. Test each group or level of the IV (e.g., using the split-file option in SPSS).
- Multivariate outliers, which affect normality, can be identified using Mahalanobis distance (via the Regression sub-menu).
- Linearity: linear relationships among all pairs of DVs are assumed; within-cell scatterplots should be inspected to test this assumption.
- Homogeneity of regression: the relationships between covariates and DVs in one group are assumed to be the same as in the other groups; necessary if stepdown analyses are required.
- Homogeneity of variance-covariance matrices: similar to the assumption of homogeneity of variance for individual DVs. Box's M test is used for this assumption and should be non-significant (p > .001).
- Multicollinearity and singularity: when correlations among DVs are high, problems of multicollinearity exist.
Wilks' lambda (λ)
- Several multivariate statistics are available to test the significance of main effects and interactions; Wilks' lambda is one such statistic.
Effect size: eta-squared (η²)
- Analogous to R² from regression.
- η² = SS_between / SS_total = SS_B / SS_T
- = the proportion of variance in Y explained by X.
- = a non-linear correlation coefficient.
- Ranges between 0 and 1.
- Interpret as for r² or R²; a rule of thumb: .01 is small, .06 medium, .14 large.
Effect size: eta-squared (η²)
- The eta-squared column in SPSS F-table output is actually partial eta-squared (ηp²).
- η² is not provided by SPSS; calculate it separately (see the sketch below).
- The R² at the bottom of SPSS F-tables is the linear effect, as per MLR; however, if an IV has 3 or more non-interval levels, this will not equate to η².
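A minimal sketch of calculating η² and partial η² by hand from a statsmodels ANOVA table (invented factorial data, as in the earlier sketch): η² divides each effect's SS by the total SS, while partial η² divides it by the effect SS plus the error SS.

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Hypothetical balanced two-way design (as in the factorial ANOVA sketch)
df = pd.DataFrame({
    "age": ["20-25", "40-45", "60-65"] * 8,
    "gender": ["male", "female"] * 12,
    "loc": [3.2, 3.0, 3.9, 3.5, 4.0, 4.2, 3.1, 2.9, 3.8,
            3.6, 4.3, 4.1, 3.3, 3.1, 3.5, 3.7, 4.1, 4.4,
            3.0, 3.2, 3.7, 3.5, 4.2, 4.0],
})

model = ols("loc ~ C(age) * C(gender)", data=df).fit()
table = sm.stats.anova_lm(model, typ=2)

ss_error = table.loc["Residual", "sum_sq"]
ss_total = table["sum_sq"].sum()

# eta-squared = SS_effect / SS_total; partial eta-squared = SS_effect / (SS_effect + SS_error)
table["eta_sq"] = table["sum_sq"] / ss_total
table["partial_eta_sq"] = table["sum_sq"] / (table["sum_sq"] + ss_error)
print(table)
```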
Results: writing up ANOVA
- Establish clear hypotheses.
- Test the assumptions, especially level of measurement, normality, univariate and multivariate outliers, homogeneity of variance, and N.
- Present the descriptive statistics (text/table).
- Consider presenting a figure to illustrate the data.
- Report the F results (table/text) and the direction of any effects.
- Consider power, effect sizes and confidence intervals.
- Conduct planned or post hoc tests as appropriate.
- State whether or not the results support the hypothesis (hypotheses).
