
Multiple regression presentation


Published in: Education


  1. Multiple Regression and Correlation. Dr. Carlo Magno
  2. - Y = a + bX (bivariate correlation)
     - y = b1x1 + b2x2 + ... + bnxn + c (multiple correlation)
  3. - Multiple regression: the association between a criterion variable and two or more predictor variables (Aron & Aron, 2003).
     - The multiple correlation coefficient is R.
     - Uses two or more variables to predict a criterion variable.
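The prediction equation y = b1x1 + b2x2 + ... + c can be fitted by ordinary least squares. A minimal sketch in Python with NumPy; the data are made up for illustration (the variable names and true coefficients are assumptions, not from the slides):

```python
import numpy as np

# Hypothetical data: two predictors and one criterion variable.
rng = np.random.default_rng(0)
x1 = rng.normal(size=50)
x2 = rng.normal(size=50)
y = 2.0 * x1 - 1.0 * x2 + 0.5 + rng.normal(scale=0.3, size=50)

# Design matrix with a column of ones for the constant c
X = np.column_stack([x1, x2, np.ones_like(x1)])
coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
b1, b2, c = coeffs
print(b1, b2, c)  # estimates close to the true values 2.0, -1.0, 0.5
```

With only modest noise and 50 cases, the recovered coefficients sit close to the values used to generate the data.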
  4. Onwuegbuzie, A. J., Bailey, P., & Daley, C. E. (2000). Cognitive, affective, personality, and demographic predictors of foreign-language achievement. The Journal of Educational Research, 94, 3-15.
     [Diagram: predictors of Foreign Language Achievement - Cognitive (academic achievement, study habits, expectation), Affective (perception, anxiety), Personality (cooperativeness, competitiveness), Demographic (gender, age)]
  5. Espin, C., Shin, J., Deno, S. L., Skare, S., Robinson, S., & Brenner, B. (2000). Identifying indicators of written expression proficiency for middle school students. The Journal of Special Education, 34, 140-153.
     [Diagram: indicators of Written Expression Proficiency - words written, words correct, characters, sentences, characters/word, words/sentence, correct word sequences, incorrect word sequences, correct minus incorrect word sequences, mean length of correct word sequences]
  6. Results
     - Regression coefficient (b) / beta weight: the distinct contribution of a variable, excluding any overlap with other predictor variables; the unstandardized simple regression coefficient.
     - Standardized regression coefficient: the independent and dependent variables are converted to z-scores before doing the regression; indicates which independent variable has the most effect on the dependent variable.
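The point of standardization can be seen by z-scoring all variables before fitting: the resulting slopes are comparable across predictors even when the raw variables are on very different scales. A sketch with hypothetical data (names and values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(size=n)
x2 = 10 * rng.normal(size=n)       # deliberately on a much larger scale
y = 1.0 * x1 + 0.1 * x2 + rng.normal(scale=0.5, size=n)

def zscore(v):
    return (v - v.mean()) / v.std()

# Regress z-scored y on z-scored predictors; the slopes are the
# standardized (beta) weights, directly comparable across predictors.
Z = np.column_stack([zscore(x1), zscore(x2)])
betas, *_ = np.linalg.lstsq(Z, zscore(y), rcond=None)
print(betas)  # the two betas come out similar despite raw slopes of 1.0 vs 0.1
```

Here the raw coefficients (1.0 vs 0.1) look wildly different, but after standardization both predictors contribute about equally, which is what the beta weights reveal.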
  7. Results
     - Multiple correlation coefficient (R): the correlation between the criterion variable and all the predictor variables taken together.
     - Squared multiple correlation coefficient (R^2): the percent of variance in the dependent variable explained collectively by all of the independent variables.
     - Adjusted R^2: assesses the goodness of fit of a regression equation; how well the predictors (regressors), taken together, explain the variation in the dependent variable, corrected for the number of predictors.
     - R^2adj = 1 - (1 - R^2)(N - 1)/(N - n - 1), where N is the number of observations and n the number of predictors.
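R, R^2, and adjusted R^2 all fall out of one least-squares fit. A self-contained sketch (N is the number of observations, n the number of predictors; the demo data are invented):

```python
import numpy as np

def r_squared_stats(X, y):
    """R, R^2 and adjusted R^2 for a least-squares fit (X without a constant column)."""
    n_obs, n_pred = X.shape
    A = np.column_stack([X, np.ones(n_obs)])        # add the constant term
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    ss_res = np.sum(resid ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    r2 = 1 - ss_res / ss_tot
    # Adjusted R^2 penalizes extra predictors
    r2_adj = 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_pred - 1)
    return np.sqrt(r2), r2, r2_adj

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = X @ np.array([1.0, 2.0]) + rng.normal(scale=0.5, size=100)
R, r2, r2_adj = r_squared_stats(X, y)
print(r2, r2_adj)  # r2_adj is slightly below r2
```

Adjusted R^2 is always at or below R^2, and the gap grows as more predictors are added relative to the number of observations.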
  8. R^2adj guidelines:
     - above 75%: very good
     - 50-75%: good
     - 25-50%: fair
     - below 25%: poor and perhaps unacceptable
     R^2adj values above 90% are rare in psychological data.
  9. - Residual: the deviation of a particular point from the regression line (its predicted value).
     - t-tests: used to assess the significance of individual b coefficients.
     - F test: used to test the significance of R as a whole,
       F = (R^2/k) / ((1 - R^2)/(n - k - 1)).
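The overall F test is simple arithmetic on R^2 once n (observations) and k (predictors) are known. A sketch; the numbers plugged in below are purely illustrative:

```python
import numpy as np

def overall_f(r2, n, k):
    """Overall-regression F statistic: F = (R^2/k) / ((1 - R^2)/(n - k - 1))."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))

# e.g. R^2 = 0.52 with 3 predictors and 16 cases
print(overall_f(0.52, 16, 3))  # about 4.33, on df = (3, 12)
```

The resulting value is compared against the F distribution with (k, n - k - 1) degrees of freedom to decide significance.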
  10. Considerations in using multiple regression:
      - The units (usually people) observed should be a random sample from some well-defined population.
      - The dependent variable should be measured on an interval, continuous scale.
      - The independent variables should be measured on interval scales.
  11. Considerations in using multiple regression:
      - The distributions of all the variables should be normal.
      - The relationships between the dependent variable and the independent variables should be linear.
      - Although the independent variables can be correlated, there must be no perfect (or near-perfect) correlations among them, a situation called multicollinearity.
  12. Considerations in using multiple regression:
      - There must be no interactions (in the ANOVA sense) between independent variables.
      - A rule of thumb for testing b coefficients is to have N >= 104 + m, where m is the number of independent variables.
  13. Reporting regression results:
      "The data were analyzed by multiple regression, using as regressors age, income, and gender. The regression was a rather poor fit (R^2adj = 40%), but the overall relationship was significant (F(3,12) = 4.32, p < 0.05). With other variables held constant, depression scores were negatively related to age and income, decreasing by 0.16 for every extra year of age, and by 0.09 for every extra pound per week income. Women tended to have higher scores than men, by 3.3 units. Only the effect of income was significant (t(12) = 3.18, p < 0.01)."
  14. Partial Correlation
      - In its squared form, it is the percent of variance in the dependent variable uniquely and jointly attributable to the given independent variable when the other variables in the equation are controlled.
  15. Stepwise Regression
      - y = β0 + β1x1 + β2x2 + β3x3 + β4x4 + β5x5 + β6x6 + β7x7 + β8x8 + β9x9 + β10x10 + β11x11 + β12x12 + β13x13 + β14x14 + ε
      - Choose a subset of the independent variables which "best" explains the dependent variable.
  16. Stepwise Regression
      1) Forward Selection
      - Start by choosing the independent variable which explains the most variation in the dependent variable.
      - Choose a second variable which explains the most residual variation, and then recalculate the regression coefficients.
      - Continue until no variables "significantly" explain residual variation.
  17. Stepwise Regression
      2) Backward Selection
      - Start with all the variables in the model, and drop the least "significant", one at a time, until you are left with only "significant" variables.
      3) Mixture of the two
      - Perform a forward selection, but drop variables which are no longer "significant" after the introduction of new variables.
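Forward selection can be sketched as a greedy loop over R^2 improvements. Real packages test the significance of each R^2 change; to keep this sketch self-contained it stops when the improvement falls below a fixed threshold, which is an assumption for illustration only:

```python
import numpy as np

def forward_select(X, y, threshold=0.01):
    """Greedy forward selection: repeatedly add the predictor that most
    increases R^2, stopping when the improvement drops below `threshold`."""
    def r2(cols):
        A = np.column_stack([X[:, cols], np.ones(len(y))])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        return 1 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)

    chosen, best = [], 0.0
    while True:
        candidates = [j for j in range(X.shape[1]) if j not in chosen]
        if not candidates:
            return chosen
        scores = {j: r2(chosen + [j]) for j in candidates}
        j_best = max(scores, key=scores.get)
        if scores[j_best] - best < threshold:
            return chosen
        chosen.append(j_best)
        best = scores[j_best]

# Demo: four candidate predictors, only columns 0 and 2 actually matter
rng = np.random.default_rng(3)
X = rng.normal(size=(200, 4))
y = 3 * X[:, 0] + 2 * X[:, 2] + rng.normal(scale=0.5, size=200)
print(forward_select(X, y))  # picks only the two informative predictors
```

Backward selection would run the same loop in reverse, starting from all columns and dropping the one whose removal hurts R^2 least.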
  18. Hierarchical Regression
      - The researcher determines the order of entry of the variables.
      - F-tests are used to compute the significance of each added variable (or set of variables) to the explanation reflected in R^2.
      - An alternative to comparing betas for purposes of assessing the importance of the independent variables.
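The F test for a researcher-ordered block of predictors compares the R^2 of the fuller model with that of the reduced model. A sketch of the standard R^2-change F statistic (the numbers in the demo are invented):

```python
def r2_change_f(r2_full, r2_reduced, n, k_full, k_reduced):
    """F test for the increment in R^2 when a block of predictors is added:
    F = ((R2_full - R2_reduced)/(k_full - k_reduced)) / ((1 - R2_full)/(n - k_full - 1)),
    with df = (k_full - k_reduced, n - k_full - 1)."""
    num = (r2_full - r2_reduced) / (k_full - k_reduced)
    den = (1 - r2_full) / (n - k_full - 1)
    return num / den

# e.g. adding a third predictor raises R^2 from .40 to .50 with n = 103
print(r2_change_f(0.5, 0.4, 103, 3, 2))  # about 19.8
```

A large value here says the added block explains variance beyond what the earlier-entered variables already accounted for.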
  19. Categorical Regression
      - Used when there is a combination of nominal, ordinal, and interval-level independent variables.