The document discusses the assumptions and properties of ordinary least squares (OLS) estimators in linear regression analysis. It notes that OLS estimators are the best linear unbiased estimators (BLUE) if the assumptions of the linear regression model are met. Specifically, the model assumes that the errors have zero mean and constant variance, are uncorrelated, and are normally distributed. Violation of the constant-variance assumption is known as heteroscedasticity. The document outlines how heteroscedasticity affects the properties of OLS estimators and their use in applications such as econometrics.
2. (Centre for Knowledge Transfer) institute
In simple linear regression or multiple linear regression, we make some basic assumptions about the error term.
Assumptions:
1. Error has zero mean
2. Error has constant variance
3. Errors are uncorrelated
4. Errors are normally distributed
https://www.youtube.com/watch?v=iIUq0SqBSH0
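The four assumptions above can be checked in a quick simulation. Below is a minimal sketch (synthetic data; all parameter values are hypothetical, not from the slides) that generates errors satisfying the assumptions, fits a simple linear regression by least squares, and confirms that the fitted residuals have zero mean:

```python
import numpy as np

# Simulate y = b0 + b1*x + e with errors that satisfy the assumptions:
# zero mean, constant variance, uncorrelated, normally distributed.
rng = np.random.default_rng(0)
n = 1_000
x = rng.uniform(0, 10, n)
e = rng.normal(0, 2.0, n)          # zero mean, constant std dev (2.0)
y = 1.5 + 0.8 * x + e              # true b0 = 1.5, b1 = 0.8 (illustrative)

# Closed-form simple-regression estimates
b1_hat = np.cov(x, y, bias=True)[0, 1] / np.var(x)
b0_hat = y.mean() - b1_hat * x.mean()

residuals = y - (b0_hat + b1_hat * x)
print(b0_hat, b1_hat)
# With an intercept in the model, OLS residuals have mean zero by construction
print(residuals.mean())
```

With the assumptions satisfied, the estimates land close to the true coefficients, and the residual mean is zero up to floating-point error regardless of the sample drawn.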
Introduction to Properties of OLS Estimators
Linear regression models have several applications in real life. In econometrics, the Ordinary Least Squares (OLS) method is widely used to estimate the parameters of a linear regression model.
For OLS estimates to be valid, several assumptions are made when running linear regression models.
A1. The linear regression model is “linear in parameters.”
A2. There is a random sampling of observations.
A3. The conditional mean of the error term, given the regressors, should be zero.
A4. There is no multi-collinearity (or perfect collinearity).
A5. Spherical errors: There is homoscedasticity and no auto-correlation
A6. (Optional) Error terms should be normally distributed.
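Under assumptions A1–A5, the OLS estimator has the familiar closed form beta_hat = (X'X)^(-1) X'y, which is BLUE by the Gauss-Markov theorem. Here is a hedged sketch on synthetic data (all numbers illustrative) computing that estimator via the normal equations:

```python
import numpy as np

# Synthetic design matrix: intercept column plus two random regressors.
rng = np.random.default_rng(1)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
beta_true = np.array([2.0, -1.0, 0.5])      # illustrative true parameters
y = X @ beta_true + rng.normal(0, 1.0, n)   # spherical errors (A5)

# Normal equations: solve (X'X) beta = X'y
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_hat)
```

In practice one would use `np.linalg.lstsq` (or a statistics package) rather than forming X'X explicitly, but the normal equations make the textbook formula visible.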
Homoscedasticity vs Heteroscedasticity
Assumption 2 above (constant error variance) is known as homoscedasticity, and the violation of this assumption is known as heteroscedasticity.
• In simple terms, heteroscedasticity is the condition in which the variance of the error term (or residual term) in a regression model varies across observations.
• In a residual scatter plot, homoscedastic data points are equally scattered around the fitted line, while heteroscedastic data points are not (for example, the scatter fans out as the predictor grows).
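The contrast can be seen in a short simulation. This sketch (synthetic data, hypothetical parameters) fits a line to a homoscedastic and a heteroscedastic sample and compares the residual spread for small versus large x:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 2_000
x = np.sort(rng.uniform(1, 10, n))

y_homo = 3 + 2 * x + rng.normal(0, 1.0, n)    # constant error variance
y_het = 3 + 2 * x + rng.normal(0, 0.5 * x)    # error std dev grows with x

def residual_spread_ratio(y):
    # Fit y = b0 + b1*x by least squares, then compare the residual
    # standard deviation in the upper half of x to the lower half.
    b1 = np.cov(x, y, bias=True)[0, 1] / np.var(x)
    b0 = y.mean() - b1 * x.mean()
    r = y - (b0 + b1 * x)
    half = len(x) // 2
    return r[half:].std() / r[:half].std()

print(residual_spread_ratio(y_homo))  # near 1: equal scatter
print(residual_spread_ratio(y_het))   # well above 1: fanning scatter
```

A ratio near 1 corresponds to the equally scattered (homoscedastic) picture; a ratio well above 1 is the fan shape typical of heteroscedasticity.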
Possible reasons for heteroscedasticity arising:
• It often occurs in data sets that have a large range between the largest and smallest observed values, i.e. when there are outliers.
• When the model is not correctly specified.
• When observations are mixed with different measures of scale.
• When an incorrect transformation of the data is used to perform the regression.
• Skewness in the distribution of a regressor, among other possible sources.
Effects of Heteroscedasticity:
As mentioned above, one of the assumptions of linear regression (assumption 2) is that there is no heteroscedasticity. Breaking this assumption means that the OLS (Ordinary Least Squares) estimators are no longer the Best Linear Unbiased Estimators (BLUE): their variance is no longer the lowest among all linear unbiased estimators.
• Estimators are no longer best/efficient.
• Hypothesis tests (such as the t-test and F-test) are no longer valid, because the estimated covariance matrix of the regression coefficients is inconsistent.
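The covariance-matrix problem can be made concrete. The sketch below (synthetic heteroscedastic data; the HC0 formula is White's standard heteroscedasticity-consistent estimator, not something specific to these slides) compares the classical OLS standard errors, which wrongly assume constant variance, with robust standard errors that remain valid:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 5_000
x = rng.uniform(1, 10, n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 2.0 * x + rng.normal(0, 0.2 * x**2)  # error variance grows with x

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
r = y - X @ beta_hat

# Classical standard errors: sigma^2 (X'X)^{-1}, assumes homoscedasticity
sigma2 = r @ r / (n - 2)
se_classical = np.sqrt(np.diag(sigma2 * XtX_inv))

# White HC0 robust: (X'X)^{-1} X' diag(r_i^2) X (X'X)^{-1}
meat = X.T @ (X * (r**2)[:, None])
se_robust = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))

print(se_classical)
print(se_robust)
```

Here the coefficient estimates themselves are still unbiased, but the classical slope standard error understates the robust one, which is why t- and F-tests based on the classical covariance matrix are misleading under heteroscedasticity.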
Example:
Consider a bank that wants to predict a customer's exposure at default.
The bank can take the exposure at default as the dependent variable, with several independent variables such as customer-level characteristics, credit history, type of loan, mortgage, etc.
The bank can simply run an OLS regression and obtain the estimates to see which factors are important in determining a customer's exposure at default.
OLS estimators are easy to use and understand.
They are also available in various statistical software packages and are used extensively.
OLS regressions form the building blocks of econometrics.
Any econometrics class will start with the assumptions of OLS regression.
It is one of the favorite interview questions for jobs and university admissions.
Building on OLS, and by relaxing its assumptions, several different models have arisen, such as GLMs (generalized linear models), general linear models, heteroscedastic models, multi-level regression models, etc.
Research in Economics and Finance is heavily driven by Econometrics.
OLS is the building block of Econometrics.
However, in real life there are issues, like reverse causality, which render OLS irrelevant or inappropriate.
OLS can still be used to investigate the issues that exist in cross-sectional data.
Even when the OLS method cannot be used for the final regression, it is used to identify the problems, the issues, and the potential fixes.
Summary
Linear regression is important and widely used, and OLS estimation is the most prevalent technique.
In this session, the properties of OLS estimators were discussed because it is the most widely used
estimation technique.
OLS estimators are BLUE (i.e. they are linear, unbiased and have the least variance among the class of
all linear and unbiased estimators).
Amidst all this, one should not forget that the Gauss-Markov theorem (i.e. that the OLS estimators are BLUE) holds only if the assumptions of OLS are satisfied.
Each assumption made while studying OLS adds restrictions to the model, but at the same time allows us to make stronger statements about OLS.
So, whenever you are planning to use a linear regression model using OLS, always check for the OLS
assumptions.
If the OLS assumptions are satisfied, then life becomes simpler, for you can directly use OLS for the
best results – thanks to the Gauss-Markov theorem!
https://www.youtube.com/watch?v=OCwZyYH14uw