
# My regression lecture mk3 (uploaded to web ct)

Published in: Technology, Economy & Finance

1. SIMPLE AND MULTIPLE REGRESSION. Chris Stiff, [email_address]
2. LEARNING OBJECTIVES
   - In this lecture you will learn:
     - What simple and multiple regression mean
     - The rationale behind these forms of analysis
     - How to conduct simple bivariate and multiple regression analyses using SPSS
     - How to interpret the results of a regression analysis
3. REGRESSION
   - What is regression?
   - Regression is similar to correlation in that both assess the relationship between two variables
   - Regression is used to predict values of an outcome variable (Y) from one or more predictor variables (X)
   - Predictors must be either continuous, or categorical with ONLY two categories
4. SIMPLE REGRESSION
   - Simple regression involves a single predictor variable and an outcome variable
   - It examines changes in the outcome variable as the predictor variable changes
   - Other names:
     - Outcome = dependent, endogenous or criterion variable
     - Predictor = independent, exogenous or explanatory variable
5. SIMPLE REGRESSION
   - The relationship between the two variables can be expressed mathematically by the line of best fit
   - Usually expressed as:
     - Y = a + bX
     - Outcome = Intercept + (Coefficient × Predictor)
6. SIMPLE REGRESSION
   - Where:
     - Y = outcome (e.g., amount of stupid behaviour)
     - a = intercept/constant (the average amount of stupid behaviour if nothing is drunk)
     - b = the increase in the outcome produced by a unit increase in the predictor; the gradient of the line
     - X = predictor (e.g., amount of alcohol drunk)
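The a and b of the line of best fit can also be worked out by hand. A minimal least-squares sketch in Python; the alcohol/behaviour numbers are invented purely for illustration:

```python
# Least-squares fit of Y = a + bX, using the standard formulas:
#   b = cov(X, Y) / var(X),   a = mean(Y) - b * mean(X)

def fit_line(x, y):
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    # slope: change in the outcome per unit change in the predictor
    num = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    den = sum((xi - mx) ** 2 for xi in x)
    b = num / den
    # intercept: predicted outcome when the predictor is zero
    a = my - b * mx
    return a, b

# invented data: units of alcohol drunk vs. rated stupid behaviour
alcohol = [0, 2, 4, 6, 8]
behaviour = [1, 3, 4, 8, 9]

a, b = fit_line(alcohol, behaviour)
print(f"behaviour = {a:.2f} + {b:.2f} * alcohol")
```

SPSS's Linear Regression procedure estimates a and b the same way (ordinary least squares), just with significance tests attached.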
7. LINE OF BEST FIT (scatter plot: amount of alcohol vs. stupid behaviour, with the fitted line)
8. LINE OF BEST FIT, POOR EXAMPLE (scatter plot: number of pairs of socks vs. stupid behaviour; no meaningful line can be fitted)
9. SIMPLE REGRESSION USING SPSS
   - Analyze > Regression > Linear
10. (screenshot: the SPSS Linear Regression dialog)
11. SPSS OUTPUT
12. SPSS OUTPUT
   - R = the correlation between amount drunk and stupid behaviour
   - R square = the proportion of variance in the outcome (behaviour) accounted for by the predictor (amount drunk)
   - Adjusted R square = takes into account the sample size and the number of predictor variables
13. THE R²
   - R² increases with the inclusion of more predictor variables in a regression model
     - Commonly reported
   - The adjusted R², however, only increases when the new predictor(s) improve the model more than would be expected by chance
     - The adjusted R² will always be equal to, or less than, R²
     - Particularly useful during the variable-selection stage of model building
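The adjustment has a simple closed form: adjusted R² = 1 − (1 − R²)(n − 1)/(n − p − 1), for n observations and p predictors. A sketch, with invented R² and n values:

```python
def adjusted_r_squared(r2, n, p):
    """Adjusted R-squared: penalises R-squared for the number of
    predictors p relative to the sample size n."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Adding a predictor always nudges raw R-squared up, but the adjusted
# value only rises if the improvement beats what chance would give.
one_predictor = adjusted_r_squared(0.56, 20, 1)
two_predictors = adjusted_r_squared(0.57, 20, 2)  # barely better raw R-squared
print(one_predictor, two_predictors)
```

Here the second model's raw R² is higher, yet its adjusted R² is lower: the tiny improvement did not justify the extra predictor.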
14. SPSS OUTPUT
15. SPSS OUTPUT
   - Beta = the standardised regression coefficient: the number of standard deviations the outcome variable changes per standard-deviation increase in the predictor variable, all other things held constant
16. REPORTING THE RESULTS OF SIMPLE REGRESSION
   - β = .74, t(18) = 4.74, p < .001, R² = .56
   - (beta value; t value with its associated df and p; R square)
17. GENERATING df AND t
   - df = n − p − 1
     - where n is the number of observations,
     - p is the number of predictors, and the extra −1 accounts for the constant
     - NB: this is for regression; df is calculated differently for other tests!
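The df rule can be sketched directly; the second check below works backwards from the F(2, 42) reported later in the lecture, which implies 45 observations:

```python
def regression_df(n, p):
    # residual degrees of freedom: n observations minus the p predictor
    # coefficients minus 1 for the constant
    return n - p - 1

# the simple-regression example reported t(18): 20 observations, 1 predictor
assert regression_df(20, 1) == 18
# the multiple-regression example reported F(2, 42): 2 predictors, so n = 45
assert regression_df(45, 2) == 42
```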
18. ASSUMPTIONS OF SIMPLE REGRESSION
   - The outcome variable should be measured at the interval level
   - When plotted, the data should show a linear trend
19. SUMMARY OF SIMPLE REGRESSION
   - Used to predict the outcome variable from a predictor variable
   - Used when there is one predictor variable and one outcome variable
   - The relationship must be linear
20. MULTIPLE REGRESSION
   - Multiple regression is used when there is more than one predictor variable
   - Two major uses of multiple regression:
     - Prediction
     - Causal analysis
21. USES OF MULTIPLE REGRESSION
   - Multiple regression can be used to examine:
     - How well a set of variables predicts an outcome
     - Which variable in a set is the best predictor of the outcome
     - Whether a predictor variable still predicts the outcome when another variable is controlled for
22. MULTIPLE REGRESSION: EXAMPLE
   - What might predict exam performance (grade)? Candidate predictors: attendance at lectures, books read, motivation
23. MULTIPLE REGRESSION USING SPSS
   - Analyze > Regression > Linear
24. (screenshot: the SPSS Linear Regression dialog with several predictors entered)
25. MULTIPLE REGRESSION: SPSS OUTPUT
26. MULTIPLE REGRESSION: SPSS OUTPUT
27. MULTIPLE REGRESSION: SPSS OUTPUT
   - For the overall model: F(2, 42) = 12.153, p < .001
28. MULTIPLE REGRESSION: SPSS OUTPUT
   - Number of books read is a significant predictor: b = .33, t(42) = 2.24, p < .05
   - Lectures attended is a significant predictor: b = .36, t(42) = 2.41, p < .05
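The same kind of fit can be reproduced outside SPSS with ordinary least squares. A NumPy sketch with invented lectures/books/grade data (the numbers, and hence the coefficients, are not those from the slides):

```python
import numpy as np

# invented data: lectures attended and books read vs. exam grade
lectures = np.array([5, 10, 12, 15, 18, 20, 8, 16], dtype=float)
books = np.array([1, 2, 4, 3, 6, 8, 2, 5], dtype=float)
grade = np.array([40, 52, 60, 58, 72, 80, 45, 68], dtype=float)

# design matrix: a column of ones for the intercept, then the predictors
X = np.column_stack([np.ones_like(grade), lectures, books])

# least-squares coefficients: intercept, b_lectures, b_books
coeffs, *_ = np.linalg.lstsq(X, grade, rcond=None)

# R-squared: 1 - residual variation / total variation
resid = grade - X @ coeffs
r2 = 1 - (resid @ resid) / np.sum((grade - grade.mean()) ** 2)
print("coefficients:", coeffs)
print("R squared:", r2)
```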
29. MAJOR TYPES OF MULTIPLE REGRESSION
   - There are different types of multiple regression:
     - Theory-based model building:
       - Standard multiple regression (Enter)
       - Hierarchical multiple regression (Block entry)
     - Statistical model building:
       - Sequential multiple regression (Forward, Backward, Stepwise)
30. STANDARD MULTIPLE REGRESSION
   - The most common method: all the predictor variables are entered into the analysis simultaneously (i.e., Enter)
   - Used to examine how much:
     - the outcome variable is explained by the set of predictor variables as a group
     - variance in the outcome variable is explained by a single predictor (its unique contribution)
31. EXAMPLE
   - The different methods of regression and their associated outputs will be illustrated using:
     - Outcome variable:
       - Essay mark
     - Predictor variables:
       - Number of lectures attended (out of 20)
       - Motivation of the student (on a scale from 0 to 100)
       - Number of course books read (from 0 to 10)
32. ENTER OUTPUT
33. ENTER OUTPUT
   - R square = the proportion of variance in the outcome accounted for by the predictor variables
   - Adjusted R square = takes into account the sample size and the number of predictor variables
34. ENTER OUTPUT
35. ENTER OUTPUT
   - Beta = the standardised regression coefficient: shows the degree to which the predictor variable predicts the outcome variable, all other things held constant
36. HIERARCHICAL MULTIPLE REGRESSION
   - a.k.a. sequential regression
   - Predictor variables are entered in a prearranged order of steps (i.e., block entry)
   - Can examine how much variance a predictor accounts for when others are already in the model
37. (screenshot: the SPSS dialog with predictors entered in blocks)
38. Don't forget to choose the R squared change option from the Statistics menu
39. BLOCK ENTRY OUTPUT
40. BLOCK ENTRY OUTPUT
   - NB: this will be in one long line in the output!
41. BLOCK ENTRY OUTPUT
42. BLOCK ENTRY OUTPUT
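The R squared change that SPSS reports for block entry is just the difference in R² between the nested models. A sketch with invented data, adding books on top of lectures:

```python
import numpy as np

def r_squared(X, y):
    """R-squared for an ordinary least-squares fit of y on design matrix X."""
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coeffs
    return 1 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

# invented data echoing the lecture example
lectures = np.array([5, 10, 12, 15, 18, 20, 8, 16], dtype=float)
books = np.array([1, 2, 4, 3, 6, 8, 2, 5], dtype=float)
grade = np.array([40, 52, 60, 58, 72, 80, 45, 68], dtype=float)

ones = np.ones_like(grade)
# block 1: lectures only; block 2: lectures plus books
r2_block1 = r_squared(np.column_stack([ones, lectures]), grade)
r2_block2 = r_squared(np.column_stack([ones, lectures, books]), grade)
print("R squared change:", r2_block2 - r2_block1)
```

Because the models are nested, the change can never be negative; the question SPSS's F-change test answers is whether it is bigger than chance would produce.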
43. STATISTICAL MULTIPLE REGRESSION
   - a.k.a. sequential techniques
44. STATISTICAL MULTIPLE REGRESSION
   - a.k.a. sequential techniques
   - Relies on SPSS selecting which predictor variables to include in a model
   - Three types:
     - Forward selection
     - Backward selection
     - Stepwise selection
45.
   - Forward: starts with no variables in the model, tries them all, includes the best predictor, repeats
   - Backward: starts with ALL variables, removes the lowest contributor, repeats
   - Stepwise: a combination; starts like Forward, but after each iteration checks that all included variables are still making a contribution (like Backward)
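Forward selection can be sketched as a greedy loop. SPSS decides each step with significance tests; a fixed R²-gain threshold is used here only to keep the sketch short, and the toy data are invented:

```python
import numpy as np

def r_squared(X, y):
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coeffs
    return 1 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

def forward_select(predictors, y, min_gain=0.01):
    """Greedy forward selection: repeatedly add the candidate that raises
    R-squared the most; stop when no gain beats min_gain."""
    chosen, remaining = [], dict(predictors)
    current = 0.0
    while remaining:
        # try each remaining candidate on top of the predictors already chosen
        trials = {}
        for name in remaining:
            cols = [np.ones_like(y)] + [predictors[c] for c in chosen + [name]]
            trials[name] = r_squared(np.column_stack(cols), y)
        best = max(trials, key=trials.get)
        if trials[best] - current < min_gain:
            break
        chosen.append(best)
        current = trials[best]
        del remaining[best]
    return chosen, current

# toy data: y depends only on x1, so only x1 should be selected
x1 = np.array([1, 2, 3, 4, 5, 6], dtype=float)
x2 = np.array([3, 1, 4, 1, 5, 9], dtype=float)
y = 2 * x1
print(forward_select({"x1": x1, "x2": x2}, y))
```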
46. SUMMARY OF MODEL SELECTION TECHNIQUES
   - Theory-based:
     - Enter: all predictors entered together (standard)
     - Block entry: predictors entered in groups (hierarchical)
   - Statistically based:
     - Forward: variables are entered into the model based on their statistical significance
     - Backward: variables are removed from the model based on their statistical significance
     - Stepwise: variables are moved in and out of the model based on their statistical significance
47. ASSUMPTIONS OF REGRESSION
   - Linearity
     - The relationship between the outcome and the predictors must be linear
       - Check: violations can be assessed using a scatter plot
   - Independence
     - Values on the outcome variable must be independent
       - i.e., each value comes from a different participant
   - Homoscedasticity
     - At each level of the predictor variable, the variance of the residual terms should be equal (i.e., all data points should be roughly equally close to the line of best fit)
       - Can indicate whether all the data are drawn from the same sample
   - Normality
     - Residuals/errors should be normally distributed
       - Check: violations using histograms (e.g., outliers)
   - Multicollinearity
     - Predictor variables should not be highly correlated
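The multicollinearity assumption is commonly checked with variance inflation factors (VIFs): regress each predictor on the others and compute 1/(1 − R²). A sketch with invented data; the often-quoted cut-off of around 10 is only a rule of thumb:

```python
import numpy as np

def vif(predictors, j):
    """Variance inflation factor of column j: regress predictor j on the
    remaining predictors and return 1 / (1 - R-squared)."""
    y = predictors[:, j]
    others = np.delete(predictors, j, axis=1)
    X = np.column_stack([np.ones(len(y)), others])
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coeffs
    r2 = 1 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return 1.0 / (1.0 - r2)

# invented predictors; columns: lectures, books, motivation
P = np.array([[5, 1, 90],
              [10, 2, 40],
              [12, 4, 55],
              [15, 3, 70],
              [18, 6, 60],
              [20, 8, 30]], dtype=float)

for j, name in enumerate(["lectures", "books", "motivation"]):
    print(name, "VIF:", vif(P, j))  # values near 1 mean little overlap
```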
48. OTHER IMPORTANT ISSUES
   - Regression as covered here is for continuous/interval predictors, or categorical predictors with ONLY two categories
     - More than two categories are possible (via dummy coding)
   - The outcome must be continuous/interval
   - Sample size
     - Multiple regression needs a relatively large sample size
     - Some authors suggest between 10 and 20 participants per predictor variable
     - Others argue for 50 more cases than the number of predictors
       - to be sure one is not capitalising on chance effects
49. OUTCOMES
   - So, what is regression?
   - This lecture has:
     - introduced the different types of regression
     - detailed how to conduct and interpret regression using SPSS
     - described the underlying assumptions of regression
     - outlined the data types and sample sizes needed for regression
     - outlined the major limitations of a regression analysis
50. REFERENCES
   - Allison, P. D. (1999). Multiple regression: A primer. Thousand Oaks: Pine Forge Press.
   - Clark-Carter, D. (2004). Quantitative psychological research: A student's handbook. Hove: Psychology Press.
   - Coolican, H. (2004). Research methods and statistics in psychology (4th ed.). Oxon: Hodder Arnold.
   - George, D., & Mallery, P. (2005). SPSS for Windows step by step (5th ed.). Boston: Pearson.
   - Field, A. (2002). Discovering statistics using SPSS for Windows. London: Sage Publications.
   - Pallant, J. (2002). SPSS survival manual. Buckingham: Open University Press.
   - http://www.statsoft.com/textbook/stmulreg.html#aassumption