My regression lecture mk3 (uploaded to web ct) – Presentation Transcript

    • SIMPLE AND MULTIPLE REGRESSION Chris Stiff [email_address]
    • LEARNING OBJECTIVES
      • In this lecture you will learn:
        • What simple and multiple regression mean.
        • The rationale behind these forms of analyses
        • How to conduct a simple bivariate and multiple regression analyses using SPSS
        • How to interpret the results of a regression analysis
    • REGRESSION
      • What is regression?
      • Regression is similar to correlation in the sense that both assess the relationship between two variables
      • Regression is used to predict values of an outcome variable (y) from one or more predictor variables (x)
      • Predictors must either be continuous or categorical with ONLY two categories
    • SIMPLE REGRESSION
      • Simple regression involves a single predictor variable and an outcome variable
      • Examines changes in the outcome variable as a function of a predictor variable
      • Other names:
        • Outcome = dependent, endogenous or criterion variable.
        • Predictor = independent, exogenous or explanatory variable.
    • SIMPLE REGRESSION
      • The relationship between two variables can be expressed mathematically by the slope of line of best fit.
      • Usually expressed as
      • Y = a + bX
      • Outcome = Intercept + (Coefficient × Predictor)
    • SIMPLE REGRESSION
      • Where:
      • Y = Outcome (e.g., amount of stupid behaviour)
      • a = Intercept/constant (average amount of stupid behaviour if nothing is drunk)
      • b = Unit increment in the outcome that is explained by a unit increase in the predictor – line gradient
      • X = Predictor (e.g., amount of alcohol drunk)
    • LINE OF BEST FIT (scatterplot: amount of alcohol vs. stupid behaviour)
    • LINE OF BEST FIT – POOR EXAMPLE (scatterplot: number of pairs of socks vs. stupid behaviour – no clear relationship?)
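    As a minimal sketch of the same fit outside SPSS (the alcohol/behaviour numbers below are invented for illustration), Python's statsmodels recovers the intercept a, slope b and R square that the SPSS output on the following slides reports:

        import numpy as np
        import statsmodels.api as sm

        # Hypothetical data: units of alcohol drunk (X) and a stupid-behaviour score (Y)
        alcohol   = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=float)
        behaviour = np.array([1, 2, 2, 4, 4, 6, 5, 8, 8, 9], dtype=float)

        # Y = a + bX: add_constant() supplies the intercept term a
        X = sm.add_constant(alcohol)
        fit = sm.OLS(behaviour, X).fit()

        print(fit.params)     # [a, b]: intercept and slope
        print(fit.rsquared)   # R square, as in the SPSS model summary
        print(fit.summary())  # full table: coefficients, t, p, R square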
    • SIMPLE REGRESSION USING SPSS
      • Analyze → Regression → Linear
    • SPSS OUTPUT
    • SPSS OUTPUT
      • R = correlation between amount drunk and stupid behaviour
      • R square = proportion of variance in the outcome (behaviour) accounted for by the predictor (amount drunk)
      • Adjusted R square = takes into account the sample size and the number of predictor variables
    • THE R²
      • R² increases with the inclusion of more predictor variables in a regression model
        • Commonly reported
      • The adjusted R², however, only increases when the new predictor(s) improve the model more than would be expected by chance
        • The adjusted R² will always be equal to, or less than, R²
        • Particularly useful during the variable-selection stage of model building
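    The adjustment can be written out directly. A sketch of the standard formula, adjusted R² = 1 - (1 - R²)(n - 1)/(n - p - 1) for n observations and p predictors (the example numbers are invented):

        def adjusted_r_squared(r2, n, p):
            """Adjusted R square for n observations and p predictors."""
            return 1 - (1 - r2) * (n - 1) / (n - p - 1)

        # Extra predictors raise R square but can lower the adjusted value:
        print(adjusted_r_squared(0.56, 20, 1))  # ~0.536 with one predictor
        print(adjusted_r_squared(0.57, 20, 3))  # ~0.489: barely-better R square, worse adjusted fit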
    • SPSS OUTPUT
    • SPSS OUTPUT
      • Beta = standardised regression coefficient; shows the degree to which a one standard deviation increase in the predictor variable produces a standard deviation change in the outcome variable, all other things constant
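    The standardised beta can be reproduced by z-scoring both variables before fitting; a sketch reusing the hypothetical arrays from the earlier code:

        from scipy.stats import zscore
        import statsmodels.api as sm

        # alcohol and behaviour are the invented arrays from the simple-regression sketch
        Xz = sm.add_constant(zscore(alcohol))
        beta = sm.OLS(zscore(behaviour), Xz).fit().params[1]
        print(beta)  # standardised slope; for simple regression this equals Pearson's r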
    • REPORTING THE RESULTS OF SIMPLE REGRESSION
      • β = .74, t(18) = 4.74, p < .001, R² = .56
        • (in order: beta value; t value with its associated df and p; R square)
    • GENERATING DF AND T
      • df = n - p - 1
        • Where n is the number of observations and
        • p is the number of predictor variables (the extra 1 accounts for the constant)
        • NB This is for regression, df can be calculated differently for other tests!
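    Worked example (with an assumed, hypothetical n): for the simple regression reported above, one predictor and n = 20 observations would give df = 20 - 1 - 1 = 18, matching the t(18) in the write-up.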
    • ASSUMPTIONS OF SIMPLE REGRESSION
      • Outcome variable should be measured at interval level
      • When plotted, the data should have a linear trend
    • SUMMARY OF SIMPLE REGRESSION
      • Used to predict the outcome variable from a predictor variable
      • Used when one predictor variable and one outcome variable
      • The relationship must be linear
    • MULTIPLE REGRESSION
      • Multiple regression is used when there is more than one predictor variable
      • Two major uses of multiple regression:
        • Prediction
        • Causal analysis
    • USES OF MULTIPLE REGRESSION
      • Multiple regression can be used to examine the following:
        • How well a set of variables predict an outcome
        • Which variable in a set of variables is the best predictor of the outcome
        • Whether a predictor variable still predicts the outcome when another variable is controlled for.
    • MULTIPLE REGRESSION – EXAMPLE: What might predict exam performance? (Diagram: attendance at lectures, books read and motivation pointing to exam performance/grade.)
    • MULTIPLE REGRESSION USING SPSS
      • Analyze → Regression → Linear
    • MULTIPLE REGRESSION: SPSS OUTPUT
    • MULTIPLE REGRESSION: SPSS OUTPUT
    • MULTIPLE REGRESSION: SPSS OUTPUT
      • For the overall model: F(2, 42) = 12.153, p < .001
    • MULTIPLE REGRESSION: SPSS OUTPUT
      • Number of books read is a significant predictor: b = .33, t(42) = 2.24, p < .05
      • Lectures attended is a significant predictor: b = .36, t(42) = 2.41, p < .05
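    A sketch of the same two-predictor model in Python; the student data and column names below are invented for illustration:

        import pandas as pd
        import statsmodels.formula.api as smf

        # Hypothetical student data (values and column names are assumptions)
        df = pd.DataFrame({
            "grade":      [55, 62, 48, 70, 66, 58, 74, 51, 68, 60],
            "books":      [2, 4, 1, 6, 5, 3, 7, 2, 5, 4],
            "lectures":   [10, 14, 8, 18, 15, 12, 19, 9, 16, 13],
            "motivation": [60, 72, 50, 85, 78, 65, 90, 55, 80, 70],
        })

        # Two-predictor model, mirroring the F(2, 42) output above
        model = smf.ols("grade ~ books + lectures", data=df).fit()
        print(model.fvalue, model.f_pvalue)  # overall model F and p
        print(model.summary())               # per-predictor b, t, p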
    • MAJOR TYPES OF MULTIPLE REGRESSION
      • There are different types of multiple regression:
        • Standard multiple regression
          • Enter
        • Hierarchical multiple regression
          • Block entry
        • Sequential multiple regression
          • Forward
          • Backward
          • Stepwise
      • (Standard and hierarchical = theory-based model building; forward, backward and stepwise = statistical model building)
    • STANDARD MULTIPLE REGRESSION
      • Most common method. All the predictor variables are entered into the analysis simultaneously (i.e., enter)
      • Used to examine how much:
        • An outcome variable is explained by a set of predictor variables as a group
        • Variance in the outcome variable is explained by a single predictor (unique contribution).
    • EXAMPLE
      • The different methods of regression and their associated outputs will be illustrated using:
        • Outcome variable
          • Essay mark
        • Predictor variables
          • Number of lectures attended (out of 20)
          • Motivation of student (on a scale from 0 – 100)
          • Number of course books read (from 0 – 10)
      (Diagram as before: attendance at lectures, books read and motivation predicting exam performance/grade.)
    • ENTER OUTPUT
    • ENTER OUTPUT
      • R square = proportion of variance in the outcome accounted for by the predictor variables
      • Adjusted R square = takes into account the sample size and the number of predictor variables
    • ENTER OUTPUT
    • ENTER OUTPUT
      • Beta = standardised regression coefficient; shows the degree to which the predictor variable predicts the outcome variable, all other things constant
    • HIERARCHICAL MULTIPLE REGRESSION
      • aka sequential regression
      • Predictor variables entered in a prearranged order of steps (i.e., block entry)
      • Can examine how much variance is accounted for by a predictor when others are already in the model
    • Don’t forget to choose the R squared change option from the Statistics dialog
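    Block entry amounts to fitting nested models and comparing their R square. A sketch using the hypothetical df from the earlier code, with lectures in block 1 and books added in block 2:

        import statsmodels.formula.api as smf

        block1 = smf.ols("grade ~ lectures", data=df).fit()          # block 1
        block2 = smf.ols("grade ~ lectures + books", data=df).fit()  # block 2

        print(block2.rsquared - block1.rsquared)  # R square change

        # F test on the change (the 'Sig. F Change' column in the SPSS output)
        f_stat, p_value, df_diff = block2.compare_f_test(block1)
        print(f_stat, p_value)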
    • BLOCK ENTRY OUTPUT
    • BLOCK ENTRY OUTPUT NB – this will be in one long line in the output!
    • BLOCK ENTRY OUTPUT
    • BLOCK ENTRY OUTPUT
    • STATISTICAL MULTIPLE REGRESSION
      • aka sequential techniques
      • Relies on SPSS selecting which predictor variables to include in a model
      • Three types:
        • Forward selection
        • Backward selection
        • Stepwise selection
      • Forward → starts with no variables in the model, tries them all, includes the best predictor, repeats
      • Backward → starts with ALL variables, removes the lowest contributor, repeats
      • Stepwise → a combination: starts as Forward, but after each iteration checks that all included variables are still making a contribution (like Backward)
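    A hand-rolled forward-selection loop makes the logic concrete. This is only a sketch of the idea, not SPSS's exact criterion (SPSS uses F-based entry thresholds; here the candidate with the lowest p value under .05 is added each round):

        import statsmodels.formula.api as smf

        def forward_select(data, outcome, candidates, alpha=0.05):
            """Greedy forward selection: repeatedly add the best p < alpha predictor."""
            remaining, selected = list(candidates), []
            while remaining:
                best_p, best_var = alpha, None
                for var in remaining:
                    formula = f"{outcome} ~ {' + '.join(selected + [var])}"
                    p = smf.ols(formula, data=data).fit().pvalues[var]
                    if p < best_p:
                        best_p, best_var = p, var
                if best_var is None:  # nothing left clears the entry criterion
                    break
                selected.append(best_var)
                remaining.remove(best_var)
            return selected

        # e.g., with the invented data from the earlier sketch:
        # forward_select(df, "grade", ["books", "lectures", "motivation"])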
    • SUMMARY OF MODEL SELECTION TECHNIQUES
      • Theory based
        • Enter - all predictors entered together (standard)
        • Block entry – predictors entered in groups (hierarchical)
      • Statistical based
        • Forward – variables are entered into the model based on their statistical significance
        • Backward – variables are removed from the model based on their statistical significance
        • Stepwise – variables are moved in and out of the model based on their statistical significance
    • ASSUMPTIONS OF REGRESSION
      • Linearity
        • Relationship between the dependent and predictors must be linear
          • check: assess violations using a scatterplot
      • Independence
        • Values on outcome variables must be independent
          • i.e., each value comes from a different participant
      • Homoscedasticity
        • At each level of the predictor variable, the variance of the residual terms should be equal (i.e., all data points should be about equally close to the line of best fit)
          • Violations can indicate that not all data are drawn from the same sample
      • Normality
        • Residuals/errors should be normally distributed
          • check: assess violations using histograms (e.g., for outliers)
      • Multicollinearity
        • Predictor variables should not be highly correlated
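    Several of these checks can be run numerically. A sketch against the fitted model from the earlier multiple-regression code: residual normality via a histogram, multicollinearity via variance inflation factors (VIF):

        import numpy as np
        from statsmodels.stats.outliers_influence import variance_inflation_factor

        # Normality: residuals should be roughly bell-shaped around zero
        counts, bin_edges = np.histogram(model.resid, bins=10)
        print(counts)

        # Multicollinearity: VIF per predictor (common rule of thumb: worry above ~10)
        exog = model.model.exog  # design matrix; column 0 is the intercept
        for i, name in enumerate(model.model.exog_names):
            if name != "Intercept":
                print(name, variance_inflation_factor(exog, i))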
    • OTHER IMPORTANT ISSUES
      • Regression in this case is for continuous/interval predictors, or categorical predictors with ONLY two categories
        • More than two categories are possible via dummy coding (see the sketch after this list)
      • Outcome must be continuous/interval
      • Sample Size
        • Multiple regression needs a relatively large sample size
        • Some authors suggest using between 10 and 20 participants per predictor variable
        • Others argue the sample should be 50 cases more than the number of predictors
          • to be sure that one is not capitalising on chance effects
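    Dummy coding turns a k-category predictor into k - 1 binary variables, as noted in the list above. A minimal sketch with an invented three-level variable:

        import pandas as pd

        cats = pd.DataFrame({"faculty": ["arts", "science", "law", "science", "arts"]})

        # k = 3 categories -> k - 1 = 2 dummy columns; 'arts' becomes the reference level
        dummies = pd.get_dummies(cats["faculty"], prefix="faculty", drop_first=True)
        print(dummies)  # columns: faculty_law, faculty_science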
    • OUTCOMES
      • So – what is regression?
      • This lecture has:
        • introduced the different types of regression
        • detailed how to conduct and interpret regression using SPSS
        • described the underlying assumptions of regression
        • outlined the data types and sample sizes needed for regression
        • outlined the major limitation of a regression analysis
    • REFERENCES
      • Allison, P. D. (1999). Multiple regression: A primer. Thousand Oaks, CA: Pine Forge Press.
      • Clark-Carter, D. (2004). Quantitative psychological research: A student’s handbook. Hove: Psychology Press.
      • Coolican, H. (2004). Research methods and statistics in psychology (4th ed.). Oxon: Hodder Arnold.
      • George, D., & Mallery, P. (2005). SPSS for Windows step by step (5th ed.). Boston: Pearson.
      • Field, A. (2002). Discovering statistics using SPSS for Windows. London: Sage Publications.
      • Pallant, J. (2002). SPSS survival manual. Buckingham: Open University Press.
      • http://www.statsoft.com/textbook/stmulreg.html#aassumption