2. Introduction
• Ordinary Least Squares (OLS) is a linear least squares method for
estimating the unknown parameters (i.e. the coefficients, or betas, and
the intercept, or constant) in a linear regression model.
• When analysing your data with linear regression, part of the process
involves checking that the data can actually be analysed using linear
regression.
• This check matters because linear regression is only appropriate if your
data "passes" the six assumptions required for it to give you a valid
result.
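As a concrete reference point, here is a minimal sketch of an OLS fit in Python with statsmodels; the data and the variable names (hours_studied, exam_score) are invented for illustration, not taken from these slides.

# A minimal, illustrative OLS fit; "hours_studied" and "exam_score" are
# made-up names, not variables from the slides.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
hours_studied = rng.uniform(0, 10, 100)                      # continuous predictor
exam_score = 50 + 4 * hours_studied + rng.normal(0, 5, 100)  # continuous outcome

X = sm.add_constant(hours_studied)   # adds the constant (intercept) column
model = sm.OLS(exam_score, X).fit()  # estimates the intercept and coefficient
print(model.params)                  # the unknown parameters OLS has estimated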
3. Assumptions
• Continuous variable test: your dependent and independent variables
should be measured at the continuous level.
• Normality
• Linearity
• The problem with outliers
• Homoscedasticity
• Absence of multicollinearity.
4. Normality
• In order to make valid inferences from your regression, the residuals
of the regression should follow a normal distribution.
• The residuals are simply the error terms, or the differences between
the observed value of the dependent variable and the predicted
value.
• If we examine a normal probability-probability (P-P) plot, we can
determine whether the residuals are normally distributed: if they are, they
will conform to the diagonal normality line indicated in the plot.
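A minimal sketch of this check in Python, assuming a statsmodels fit on made-up data; the residuals of a correctly specified model should hug the diagonal line.

# Sketch: P-P plot of the OLS residuals; normally distributed residuals
# fall along the 45-degree line.
import numpy as np
import statsmodels.api as sm
import matplotlib.pyplot as plt
from statsmodels.graphics.gofplots import ProbPlot

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, 100)
y = 2 + 3 * x + rng.normal(0, 1, 100)
model = sm.OLS(y, sm.add_constant(x)).fit()

pp = ProbPlot(model.resid, fit=True)  # standardize residuals against a normal
pp.ppplot(line="45")                  # points should follow the diagonal
plt.show()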
5. Homoscedasticity
• Homoscedasticity refers to whether the residuals are equally spread
across all predicted values, or whether they bunch together at some
values and spread far apart at others.
• Your data is homoscedastic if it looks somewhat like a shotgun blast
of randomly distributed data.
• The opposite of homoscedasticity is heteroscedasticity, where you
might find a cone or fan shape in your data. You check this
assumption by plotting the predicted values and residuals on a
scatterplot.
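A sketch of this check in Python on made-up data: it reproduces the predicted-versus-residual scatterplot (SPSS's *ZPRED versus *ZRESID). The Breusch-Pagan test at the end is not mentioned in the slides; it is added here as one common formal complement to the visual check.

# Sketch: standardized predicted values against standardized residuals;
# a patternless cloud suggests homoscedasticity, a cone or fan shape
# suggests heteroscedasticity.
import numpy as np
import statsmodels.api as sm
import matplotlib.pyplot as plt
from statsmodels.stats.diagnostic import het_breuschpagan

rng = np.random.default_rng(2)
x = rng.uniform(0, 10, 100)
y = 2 + 3 * x + rng.normal(0, 1, 100)
model = sm.OLS(y, sm.add_constant(x)).fit()

zpred = (model.fittedvalues - model.fittedvalues.mean()) / model.fittedvalues.std()
zresid = (model.resid - model.resid.mean()) / model.resid.std()
plt.scatter(zpred, zresid)
plt.axhline(0, linestyle="--")
plt.xlabel("Standardized predicted value (*ZPRED)")
plt.ylabel("Standardized residual (*ZRESID)")
plt.show()

# Optional formal check (an addition, not from the slides):
lm_stat, lm_pvalue, f_stat, f_pvalue = het_breuschpagan(model.resid, model.model.exog)
print(f"Breusch-Pagan p-value: {lm_pvalue:.3f}")  # small p suggests heteroscedasticity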
6. Linearity
• Linearity means that the independent variables in the regression have
a straight-line relationship with the outcome (dependent) variable.
• Note: If your residuals are normally distributed and homoscedastic,
you do not have to worry about linearity.
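One way to eyeball this assumption in Python (a technique not described in the slides) is with partial-regression plots, which show each predictor's relationship with the outcome after adjusting for the other predictors, so departures from a straight line stand out. A minimal sketch on made-up data:

# Sketch: partial-regression plots, one panel per predictor; look for
# straight-line patterns in each panel.
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf
import matplotlib.pyplot as plt

rng = np.random.default_rng(3)
df = pd.DataFrame({"x1": rng.uniform(0, 10, 100),
                   "x2": rng.uniform(0, 10, 100)})
df["y"] = 1 + 2 * df["x1"] + 0.5 * df["x2"] + rng.normal(0, 1, 100)

model = smf.ols("y ~ x1 + x2", data=df).fit()
sm.graphics.plot_partregress_grid(model)  # one panel per predictor
plt.show()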
7. Multicollinearity
• Multicollinearity refers to when your predictor (independent, mediator,
moderator) variables are highly correlated with each other.
• This assumption is only relevant for a multiple linear regression (multiple
independent variables).
• If you are performing a simple linear regression (one independent
variable), you can skip this assumption.
• Multicollinearity can be checked in two ways: correlation coefficients
(correlation analysis) and variance inflation factor (VIF) values.
• To check it using correlation coefficients, include all the independent
variables (including mediating and moderating variables) into a correlation
matrix and look for coefficients with magnitudes of .80 or higher.
• VIF values below 10.00 are acceptable; ideally they should be below
5.00.
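A sketch of both checks in Python on made-up data in which x3 is deliberately built to be almost a copy of x1: a correlation matrix to scan for coefficients at or above .80, and VIF values to compare against the 10.00 and 5.00 cut-offs.

# Sketch: correlation-matrix and VIF checks for multicollinearity.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(4)
X = pd.DataFrame({"x1": rng.normal(size=100),
                  "x2": rng.normal(size=100)})
X["x3"] = 0.9 * X["x1"] + rng.normal(scale=0.3, size=100)  # collinear with x1

print(X.corr().round(2))   # check 1: off-diagonal magnitudes of .80+ are a red flag

exog = sm.add_constant(X)  # VIF is computed with the constant term included
for i, name in enumerate(exog.columns):
    if name != "const":    # check 2: flag VIF above 10.00 (or 5.00 to be strict)
        print(name, round(variance_inflation_factor(exog.values, i), 2))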
8. Procedure
• Open Linear Regression
• Insert your independent variables and dependent variable
• Click Statistics
• Check Estimates, Model Fit and Collinearity Diagnostics
• Click Plots
• Insert your predicted values (*ZPRED) in the X box, and your residual values
(*ZRESID) in the Y box
• Check the Normal Probability Plot
• (A scripted equivalent of these steps appears after slide 11.)
9. • Insert your independent and dependent variables in the Linear
Regression boxes.
10. Three Items to Check
• 1. Estimates
• 2. Model fit
• 3. Collinearity diagnostics
11. Click Plots
• Insert *ZRESID into the Y box
• Insert *ZPRED into the X box
• Check the Normal probability plot
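For readers working outside SPSS, here is a hedged sketch of a scripted equivalent of the dialog steps in slides 8-11, stringing together the checks sketched earlier; the dataset and variable names are hypothetical.

# Sketch: scripted equivalent of the SPSS Linear Regression dialog steps.
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf
import matplotlib.pyplot as plt
from statsmodels.graphics.gofplots import ProbPlot
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(5)
df = pd.DataFrame({"x1": rng.normal(size=100), "x2": rng.normal(size=100)})
df["y"] = 1 + 2 * df["x1"] + 3 * df["x2"] + rng.normal(size=100)

# "Insert your independent variables and dependent variable"
model = smf.ols("y ~ x1 + x2", data=df).fit()

# "Estimates" and "Model fit": coefficient table, R-squared, ANOVA F-test
print(model.summary())

# "Collinearity diagnostics": VIF for each predictor
exog = sm.add_constant(df[["x1", "x2"]])
for i, name in enumerate(exog.columns):
    if name != "const":
        print(name, round(variance_inflation_factor(exog.values, i), 2))

# "Plots": *ZPRED in the X box, *ZRESID in the Y box, plus the normal
# probability plot of the residuals
zpred = (model.fittedvalues - model.fittedvalues.mean()) / model.fittedvalues.std()
zresid = (model.resid - model.resid.mean()) / model.resid.std()
plt.scatter(zpred, zresid)
plt.xlabel("*ZPRED")
plt.ylabel("*ZRESID")
ProbPlot(model.resid, fit=True).ppplot(line="45")
plt.show()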