This document provides an overview of simple linear regression models. It defines regression analysis as concerned with studying the dependence of a dependent variable on one or more independent variables. The major objectives of regression are to estimate mean dependent variable values based on independent variable values, test hypotheses about the relationship, and predict/forecast dependent variable values. It describes estimating regression coefficients using the ordinary least squares method and the properties of least squares estimators established by the Gauss-Markov theorem. Examples are provided to demonstrate estimating and interpreting a simple linear regression model.
3. BASIC ECONOMETRICS
Simple Linear Regression Model
Simple linear Regression Model
Basic Econometrics
The term 'regression' was introduced by Francis Galton, and Galton's law of universal
regression was confirmed by his friend, Karl Pearson. The modern interpretation of
regression is quite different from their analysis.
In the modern interpretation of regression, we may say that:
'Regression analysis is concerned with the study of the dependence of one variable
(the dependent variable) on one or more other variables (the explanatory or
independent variables), with a view to estimating and/or predicting the mean or
average value of the former in terms of the known or fixed values of the latter.'
That is, the major objectives of regression analysis are:
1. To estimate the mean value of the dependent variable, given the values of the
independent variables.
2. To test hypotheses, suggested by the underlying economic theory, about the
nature of the dependence.
3. To predict or forecast the mean value of the dependent variable, given the values of
the independent variables.
We can get an idea of the type of relationship by looking at what is called
a scatter diagram.
Scatter Diagram
A scatter diagram shows the pairs of actual observations. We usually plot the dependent
variable against an explanatory variable to see whether a pattern emerges. If the pattern
of the scatter indicates a linear relationship between X and Y, the relationship can be
described by a straight line through the points. This line is known as the line of
regression, or the line of best fit, and we use a linear regression model.
The above diagram shows that expenditure on food is a direct (increasing) function of
income. The dots plotting the pairs of observations resemble a linear shape (a straight
line). The points do not lie exactly on a straight line but are scattered around a
hypothetical straight line.
In the diagram below, annual sales seem to be inversely related to the price of the
commodity: the dots for the pairs of observations are scattered around a
(hypothetical) straight line that is negatively sloped.
Example
Draw a scatter diagram and the regression line for the following data on the heights of
fathers (X) and sons (Y).
Father (X)   Son (Y)
63           66
65           68
66           65
67           67
67           69
68           70
[Figure: scatter diagram of heights of sons (Y, dependent variable) against heights of fathers (X, independent variable), with a fitted linear trend line]
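As a rough check on the trend line in the figure, the sketch below fits a straight line to the six father and son heights with the usual least-squares formulas (covered later in these notes). It assumes the plotted trend line is the least-squares line; the function name `fit_line` is only illustrative.

```python
# Heights of fathers (X) and sons (Y) from the example table.
father = [63, 65, 66, 67, 67, 68]
son = [66, 68, 65, 67, 69, 70]


def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


intercept, slope = fit_line(father, son)
print(f"Son = {intercept:.2f} + {slope:.3f} * Father")
```

For these data the fitted slope is 0.625 and the intercept 26.25, so within this small sample a father who is one unit taller is associated with a son who is, on average, 0.625 units taller.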
Therefore, a simple linear regression model (SLRM) means that:
There are only two variables, one dependent and one independent, and
The relation between the dependent and independent variables is linear in the
parameters (it may or may not be linear in the variables).
Different statistical estimation procedures, e.g., the method of maximum likelihood, the
principle of least squares and the method of moments, can be employed to estimate the
parameters of the model. The method of maximum likelihood requires knowledge of
the distribution of Y, whereas the method of moments and the principle of least squares do
not. Regression analysis is a tool for determining the values of the parameters given data
on Y and X1, X2, ..., Xk. Before going into the process of estimation, it is better to have
an idea of some important terms such as the population regression function, the sample
regression function and the significance of the stochastic disturbance term, which are
frequently used in the analysis of regression models.
The Least Squares Method
The aim is to find the line that keeps the deviations of the data from it, both positive
and negative, as small as possible. This is done by squaring the deviations and minimising
their sum. The least-squares line is therefore the 'best' line in the sense that it
minimises the squared error in the direction of the variable being predicted.
"The method of least squares is the automobile of modern statistical analysis; despite its
limitations, occasional accidents and incidental pollution, it and its numerous
variations, extensions and related conveyances carry the bulk of statistical analysis and
are known and valued by all." Stephen M. Stigler
Simple linear Regression Model by Ordinary Least Square Method
Simple Regression Line by OLS
•The relationship seems to be 'linear', so it can be captured by the equation of a
straight line ($Y = a + bX$).
•We may need to predict Y when the value of X is given.
•We capture the relation by writing a 'simple regression equation':
$Y = a + bX + e$ OR $Y = \beta_0 + \beta_1 X + e$
Residual: Note that we have added $e$, which is called the error term or residual.
We add it because the actual values do not lie exactly on a straight line but
may be scattered around it. The residual $e$ captures this difference.
Even after we estimate the parameters 'a' and 'b', the fitted line does not reproduce
the values of the dependent variable exactly.
The difference between the actual and the fitted value is called the error term or residual.
The difference between the actual values of Y and the predicted values of Y is called
the residual.
$\sum e^2 = \text{minimum}$
The sum of the residuals ($\sum e$) cannot serve as the criterion: the predicted values of Y
lie both above and below the actual values, so positive and negative residuals cancel, and
for the least-squares line $\sum e$ is always zero. So $e^2$ is taken as the criterion and
$\sum e^2$ is minimised.
Regression Explained
Population Regression Function (PRF)
The group of individuals or items under study is known as the population.
In statistics, the population is the aggregate of facts or objects, animate or inanimate,
under study in any statistical investigation. A Population Regression Function (PRF) can be
defined as the average value of the dependent variable for a given value of the independent variable.
In other words, the PRF describes how the average value of the dependent variable varies with
the given value of the explanatory variable.
The population regression equation is an assumed equation that could, in principle, be estimated from the whole population.
Because the whole population may not be available or observed, we use samples to estimate the parameters $\beta_0$ and $\beta_1$.
$Y_i = \beta_0 + \beta_1 X_i + e_i$
Here $Y_i$ is the dependent or explained variable, $\beta_0$ and $\beta_1$ are the parameters that we need to estimate,
and $X_i$ is the independent or explanatory variable.
The Sample Regression Function (SRF)
In practice, it is not always possible to study the whole population. We then have to rely on sample studies, which bring
sampling-related problems of their own. Therefore, our task is to estimate the PRF on the basis of the sample information.
For this, we randomly select some of the Y values corresponding to fixed values of X from the given population.
Many different samples could be drawn from the population in this way, but in practice we work with a single sample
and use it to estimate the PRF.
When we plot the sample data on consumption expenditure on graph paper, we obtain Figure 1.2.
Sample Data on Consumption Expenditure
Weekly Family Income   Weekly Family Consumption Expenditure
10                     7
12                     8
15                     11
18                     13
20                     15
23                     17
25                     18
28                     20
30                     21
35                     24
Figure 1.2 Sample Regression Line (SRL)
The SRF can be expressed as
$Y_i = \hat{\beta}_0 + \hat{\beta}_1 X_i + \hat{u}_i$ (1)
Our objective is then to estimate the PRF, $Y_i = \beta_0 + \beta_1 X_i + u_i$, on the basis of the SRF.
Here the SRF is the estimator of the PRF.
To conclude, we can say that the primary objective of regression analysis is to estimate
the PRF on the basis of the SRF. We may have to draw as many samples as possible to reduce the sampling fluctuations,
so that the SRF approximates the PRF more closely.
The method of ordinary least squares is attributed to Carl Friedrich Gauss, a German
mathematician.
To understand this method, we first explain the least-squares principle.
Recall the two-variable PRF, now writing the intercept as $\beta_1$ and the slope as $\beta_2$:
$Y_i = \beta_1 + \beta_2 X_i + u_i$ (1)
However, the PRF is not directly observable. We estimate it from the SRF:
$Y_i = \hat{\beta}_1 + \hat{\beta}_2 X_i + \hat{u}_i = \hat{Y}_i + \hat{u}_i$ (2)
where $\hat{Y}_i$ is the estimated (conditional mean) value of $Y_i$.
First, express (2) as
$\hat{u}_i = Y_i - \hat{Y}_i = Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i$ (3)
which shows that the $\hat{u}_i$ (the residuals) are simply the differences between the actual
and estimated Y values.
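Minimising the sum of squared residuals with respect to the two estimates gives two normal equations,
where $\hat{B}_1$ and $\hat{B}_2$ denote the estimates of the intercept and the slope:
$\sum Y = n\hat{B}_1 + \hat{B}_2 \sum X$ (1)
$\sum XY = \hat{B}_1 \sum X + \hat{B}_2 \sum X^2$ (2)
Note: in the formulae on the slide below, B0 stands for B1 and B1 stands for B2.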
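The two normal equations can be solved for the two estimates. The short derivation below is a sketch of that step, writing the intercept estimate as $\hat{B}_1$ and the slope estimate as $\hat{B}_2$ and using $\bar{X}$ and $\bar{Y}$ for the sample means. Dividing (1) by $n$ gives the intercept in terms of the slope; substituting this into (2) and rearranging gives the slope.

```latex
\begin{aligned}
\text{From (1):}\quad & \bar{Y} = \hat{B}_1 + \hat{B}_2 \bar{X}
  \;\Longrightarrow\; \hat{B}_1 = \bar{Y} - \hat{B}_2 \bar{X}, \\
\text{Into (2):}\quad & \sum XY = \left(\bar{Y} - \hat{B}_2 \bar{X}\right)\sum X + \hat{B}_2 \sum X^2, \\
\text{so}\quad & \hat{B}_2
  = \frac{\sum XY - \bar{Y}\sum X}{\sum X^2 - \bar{X}\sum X}
  = \frac{n\sum XY - \sum X \sum Y}{n\sum X^2 - \left(\sum X\right)^2}.
\end{aligned}
```

These are the expressions used in the worked example later in the notes.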
THE CLASSICAL LINEAR REGRESSION MODEL:
THE ASSUMPTIONS UNDERLYING THE METHOD OF LEAST SQUARES
Assumption 1: Linear regression model (linear in the parameters)
Assumption 2: X values are fixed in repeated sampling
Assumption 3: Zero mean value of $u_i$: $E(u_i \mid X_i) = 0$
Assumption 4: Homoscedasticity or equal variance of $u_i$: $\mathrm{Var}(u_i \mid X_i) = \sigma^2$
[vs. heteroscedasticity]
Assumption 5: No autocorrelation between the disturbances: $\mathrm{Cov}(u_i, u_j \mid X_i, X_j) = 0$
for $i \neq j$ [vs. correlation, positive or negative]
Assumption 6: Zero covariance between $u_i$ and $X_i$:
$\mathrm{Cov}(u_i, X_i) = E(u_i X_i) = 0$
Assumption 7: The number of observations n must be
greater than the number of parameters to be estimated
Assumption 8: Variability in X values. They must
not all be the same
Assumption 9: The regression model is correctly
specified
Assumption 10: There is no perfect multicollinearity
between the Xs
Properties of Least-squares estimators: THE GAUSS-MARKOV THEOREM
As noted earlier, given the assumptions of the classical linear regression model, the
least-squares estimates possess some ideal or optimum properties.
These properties are contained in the well-known Gauss–Markov theorem. To
understand this theorem, we need to consider the best linear unbiasedness property
of an estimator.
An estimator, say the OLS estimator $\hat{\beta}_2$,
is said to be a best linear unbiased estimator (BLUE) of $\beta_2$ if the following hold:
1. It is linear, that is, a linear function of a random variable, such as the dependent
variable Y in the regression model.
2. It is unbiased, that is, its average or expected value, $E(\hat{\beta}_2)$, is equal to the true
value, $\beta_2$.
3. It has minimum variance in the class of all such linear unbiased estimators; an
unbiased estimator with the least variance is known as an efficient estimator.
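In the regression context it can be proved that the OLS estimators are BLUE.
This is the gist of the famous Gauss–Markov theorem, which can be stated as follows:
Given the assumptions of the classical linear regression model, the least-squares estimators,
in the class of unbiased linear estimators, have minimum variance; that is, they are BLUE.
Although known as the Gauss–Markov theorem, the least-squares approach of Gauss
(1821) antedates the minimum-variance approach of Markov (1900).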
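The unbiasedness part of the BLUE property can be illustrated with a small repeated-sampling experiment. The sketch below is only an illustration under assumed values: the true intercept, true slope, error standard deviation and the X values are all hypothetical and do not come from the slides. X is held fixed across samples, as the classical assumptions require, and only the disturbances are redrawn. It does not demonstrate the minimum-variance part of the theorem.

```python
import random

random.seed(0)  # fixed seed so the experiment is repeatable

# Hypothetical population parameters (assumptions for illustration only).
TRUE_INTERCEPT, TRUE_SLOPE, SIGMA = 1.0, 2.0, 1.0
X = list(range(20))  # X values fixed in repeated sampling

x_bar = sum(X) / len(X)
s_xx = sum((x - x_bar) ** 2 for x in X)


def ols_slope(y):
    """OLS slope estimate for the fixed X values and one sample of Y."""
    y_bar = sum(y) / len(y)
    return sum((x - x_bar) * (yi - y_bar) for x, yi in zip(X, y)) / s_xx


# Draw many samples from the same PRF and estimate the slope each time.
slopes = []
for _ in range(2000):
    y = [TRUE_INTERCEPT + TRUE_SLOPE * x + random.gauss(0.0, SIGMA) for x in X]
    slopes.append(ols_slope(y))

mean_slope = sum(slopes) / len(slopes)
print(f"average of {len(slopes)} slope estimates: {mean_slope:.3f}")
```

Individual estimates vary from sample to sample, but their average should sit very close to the true slope of 2.0, which is what unbiasedness means.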
SIMPLE LINEAR REGRESSION MODEL ESTIMATION
Example:
Consider the following example, where X = income (thousand rupees)
and Y = expenditure on food items (thousand rupees).
Observation   X     Y     XY     X²
1             25    20    500    625
2             30    24    720    900
3             35    32    1120   1225
4             40    33    1320   1600
5             45    36    1620   2025
Totals        175   145   5280   6375
(∑X = 175, ∑Y = 145, ∑XY = 5280, ∑X² = 6375)
$\sum Y = n\hat{B}_1 + \hat{B}_2 \sum X$ (1)
$\sum XY = \hat{B}_1 \sum X + \hat{B}_2 \sum X^2$ (2)
$\bar{X} = \sum X / n = 175/5 = 35$
$\bar{Y} = \sum Y / n = 145/5 = 29$
$\hat{B}_2 = \dfrac{n\sum XY - \sum X \sum Y}{n\sum X^2 - (\sum X)^2} = \dfrac{5(5280) - (175)(145)}{5(6375) - (175)^2} = \dfrac{1025}{1250} = 0.82$
$\hat{B}_1 = \bar{Y} - \hat{B}_2 \bar{X} = 29 - 0.82 \times 35 = 0.3$
$\hat{B}_2 = 0.82$ means that a one-unit change in X (income) brings, on average, a 0.82-unit change in Y
(expenditure on food).
Since one unit is a thousand rupees, an increase of 1,000 rupees in income is associated with an
increase of about 820 rupees in expenditure on food items.
Hence the regression of Y on X is
$\hat{Y} = \hat{B}_1 + \hat{B}_2 X$
$\hat{Y} = 0.3 + 0.82X$
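The hand calculation can be reproduced in a few lines. The sketch below recomputes the column totals from the raw data and then applies the solved normal equations; the variable names (`b1` for the intercept, `b2` for the slope) are illustrative and follow the B₁, B₂ notation of the example.

```python
# Income (X) and food expenditure (Y), both in thousand rupees.
X = [25, 30, 35, 40, 45]
Y = [20, 24, 32, 33, 36]

n = len(X)
sum_x = sum(X)
sum_y = sum(Y)
sum_xy = sum(x * y for x, y in zip(X, Y))
sum_x2 = sum(x * x for x in X)

# Slope and intercept from the solved normal equations.
b2 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
b1 = sum_y / n - b2 * sum_x / n

print("totals:", sum_x, sum_y, sum_xy, sum_x2)
print("slope b2 =", round(b2, 2), " intercept b1 =", round(b1, 2))
```

The totals should match the table (175, 145, 5280, 6375), and the estimates should agree with the values 0.82 and 0.3 obtained by hand.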
Trend Values and Errors:
We can substitute the values of X into the estimated regression equation to find the
trend values.
Observation   X     Y     Trend value Ŷ   Residual e = Y − Ŷ   Squared residual e²
1             25    20    20.8            -0.8                 0.64
2             30    24    24.9            -0.9                 0.81
3             35    32    29.0            3.0                  9.00
4             40    33    33.1            -0.1                 0.01
5             45    36    37.2            -1.2                 1.44
Totals        175   145   145             0                    11.9
(∑X = 175, ∑Y = ∑Ŷ = 145, ∑e = 0, ∑e² = 11.9)
The first trend value is computed as $\hat{Y} = 0.3 + 0.82(25) = 20.8$, and so on.
If you change the values of B₁ and B₂ and recompute the sum of squared errors,
the new value will be larger than the value obtained here (the least sum of squared errors).
The Error Term
We assume that the errors are normally distributed with zero mean and constant variance:
$e \sim N(0, \sigma^2)$
As you must have noticed while estimating the regression parameters,
$\sum_{i=1}^{n} e_i = 0$
Also, you can easily verify that
$\sum_{i=1}^{n} e_i X_i = 0$
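These properties can be checked numerically for the income and food expenditure example. The sketch below takes the estimates 0.3 and 0.82 as given, computes the residuals, and then nudges the coefficients by a few arbitrary amounts (the step sizes are illustrative choices) to confirm that each alternative line has a larger sum of squared errors.

```python
X = [25, 30, 35, 40, 45]
Y = [20, 24, 32, 33, 36]


def sse(b1, b2):
    """Sum of squared residuals for the line Y = b1 + b2*X."""
    return sum((y - (b1 + b2 * x)) ** 2 for x, y in zip(X, Y))


b1, b2 = 0.3, 0.82  # estimates from the worked example
residuals = [y - (b1 + b2 * x) for x, y in zip(X, Y)]

sum_e = sum(residuals)                                  # should be ~0
sum_ex = sum(e * x for e, x in zip(residuals, X))       # should be ~0
best = sse(b1, b2)                                      # least sum of squares

# Nudge either coefficient away from the OLS values and recompute the SSE.
perturbed = [sse(b1 + d1, b2 + d2)
             for d1 in (-0.5, 0.0, 0.5)
             for d2 in (-0.05, 0.0, 0.05)
             if (d1, d2) != (0.0, 0.0)]

print("sum of residuals:", round(abs(sum_e), 6))
print("sum of residual * X:", round(abs(sum_ex), 6))
print("minimum SSE:", round(best, 2))
print("every perturbed line is worse:", all(p > best for p in perturbed))
```

Up to floating-point rounding, both sums are zero and the minimum sum of squared errors equals the 11.9 shown in the table.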
And since we obtained the regression equation by a minimisation process,
$\sum_{i=1}^{n} e_i^2$ is minimum.
•The error term may represent the influence of variables NOT included in the model (missing variables).
•Even if we are able to include all the variables or determinants of the dependent variable, some randomness will remain
in the error, as human behaviour is not 100% rational and predictable.
•e may represent 'measurement error': when data are collected, we may round some values, observe values in ranges,
or measure some variables inaccurately.
STANDARD ERROR
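Standard Error of Estimate/Standard Error of Regression
The standard error of the estimate is a measure of the accuracy of predictions.
It is the standard deviation of the errors and is defined as
$\hat{\sigma}_e = \sqrt{\dfrac{\sum e_i^2}{N - K}}$
where N is the number of observations and K is the number of estimated parameters
(K = 2 in simple linear regression).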
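As an illustration, the standard error of regression for the income and food expenditure example can be computed as follows. This is a sketch that takes the estimates 0.3 and 0.82 as given and uses K = 2 (intercept and slope).

```python
import math

X = [25, 30, 35, 40, 45]
Y = [20, 24, 32, 33, 36]
b1, b2 = 0.3, 0.82  # estimates from the worked example

n = len(Y)  # number of observations (N)
k = 2       # number of estimated parameters (K): intercept and slope

# Residual sum of squares, then the standard error of regression.
rss = sum((y - (b1 + b2 * x)) ** 2 for x, y in zip(X, Y))
ser = math.sqrt(rss / (n - k))

print(f"RSS = {rss:.2f}, SER = {ser:.3f} thousand rupees")
```

With a residual sum of squares of 11.9 and 3 degrees of freedom, the standard error comes to roughly 1.99, measured in the units of Y (thousand rupees).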
COEFFICIENT OF DETERMINATION-GOODNESS OF FIT
R² is called the coefficient of determination.
It gives the contribution made by the regression in explaining the variation in the dependent variable.
It is worked out as the ratio of the explained (regression) sum of squares, ESS, to the total sum of
squares, TSS, where RSS is the residual sum of squares:
TSS = ESS + RSS
We can show the goodness of fit of a regression line through a graph (Figure 1) and from that we can
calculate the value of r².
Figure 1 Goodness of fit of estimated regression line
There are two lines in Figure 1: a horizontal line placed at
the average response, $\bar{y}$, and a shallow-sloped estimated regression line, $\hat{y}$. From Figure 1, the
sums of squares are calculated as follows.
Explained Sum of Squares (ESS) quantifies how far the estimated sloped regression line, $\hat{y}_i$, is from the horizontal
"no relationship" line, the sample mean $\bar{y}$:
$ESS = \sum (\hat{y}_i - \bar{y})^2$
Residual Sum of Squares (RSS) quantifies how much the data points, $y_i$, vary around the estimated regression
line, $\hat{y}_i$:
$RSS = \sum (y_i - \hat{y}_i)^2$
Total Sum of Squares (TSS) quantifies how much the data points, $y_i$, vary around their mean, $\bar{y}$:
$TSS = \sum (y_i - \bar{y})^2$
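TSS = $\sum (Y_i - \bar{Y})^2$ = Total Sum of Squares
ESS = $\sum (\hat{Y}_i - \bar{Y})^2$ = Explained Sum of Squares
RSS = $\sum \hat{u}_i^2$ = Residual Sum of Squares
Dividing TSS = ESS + RSS through by TSS:
$1 = \dfrac{ESS}{TSS} + \dfrac{RSS}{TSS}$; or
$1 = r^2 + \dfrac{RSS}{TSS}$; or $r^2 = 1 - \dfrac{RSS}{TSS}$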
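$r^2 = \dfrac{\text{Explained Variation}}{\text{Total Variation}}$
So the above can be written as
$R^2 = \dfrac{\text{Total Variation} - \text{Unexplained Variation}}{\text{Total Variation}} = 1 - \dfrac{\text{Unexplained Variation}}{\text{Total Variation}}$
or
$R^2 = 1 - \dfrac{\sum e^2}{\sum (Y - \bar{Y})^2}$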
PROPERTIES OF r²
$r^2 = ESS/TSS$
is the coefficient of determination; it measures the proportion or
percentage of the total variation in Y explained by the regression
model.
$0 \le r^2 \le 1$;
$r = \pm\sqrt{r^2}$ is the sample correlation coefficient, taking the sign of the estimated slope.
Some properties of r
FORMULA
$r^2 = 1 - \dfrac{N\sum e^2}{N\sum Y^2 - (\sum Y)^2}$
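The different expressions for r² can be checked against each other on the income and food expenditure example. The sketch below takes the estimates 0.3 and 0.82 as given and computes r² three ways: as ESS/TSS, as 1 − RSS/TSS, and with the computational formula above. The variable names are illustrative.

```python
X = [25, 30, 35, 40, 45]
Y = [20, 24, 32, 33, 36]
b1, b2 = 0.3, 0.82  # estimates from the worked example

n = len(Y)
y_bar = sum(Y) / n
y_hat = [b1 + b2 * x for x in X]

tss = sum((y - y_bar) ** 2 for y in Y)                # total variation
ess = sum((yh - y_bar) ** 2 for yh in y_hat)          # explained variation
rss = sum((y - yh) ** 2 for y, yh in zip(Y, y_hat))   # unexplained variation

r2_ratio = ess / tss
r2_resid = 1 - rss / tss
r2_formula = 1 - n * rss / (n * sum(y * y for y in Y) - sum(Y) ** 2)

print(f"TSS = {tss:.1f}, ESS = {ess:.1f}, RSS = {rss:.1f}")
print(f"r2 = {r2_ratio:.4f}, {r2_resid:.4f}, {r2_formula:.4f}")
```

All three routes give an r² of about 0.934, so roughly 93 percent of the variation in food expenditure in this sample is explained by income.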
CONFIDENCE INTERVAL
In the confidence-interval approach, we construct an interval that has a specified probability of
including within its limits the true value of the unknown parameter. If the null-hypothesized value
lies inside the confidence interval, H0 is not rejected, whereas if it lies outside this interval,
H0 can be rejected.
Hypothesis Testing
Hypothesis testing is an important way to make statistical inferences about a population parameter:
we use it to make decisions about the parameter's value.
The null hypothesis is usually a hypothesis of equality between a population parameter and a
specified value; e.g., a null hypothesis may state that the population mean is equal to zero.
The alternative hypothesis is effectively the opposite of the null hypothesis (e.g., the population
mean return is not equal to zero).
Thus, the two are mutually exclusive: only one can be true, and one of them will
always be true.
There are mainly two ways of proceeding with the testing of a hypothesis.
The rejection region method
To decide between two competing claims, we can conduct a hypothesis test as follows.
Express the claim about a specific value for the population parameter of interest as a null hypothesis, denoted
H0. The null hypothesis needs to be in the form "parameter = some hypothesized value", for example, H0: E(Y)
= 255.
Express the alternative claim as an alternative hypothesis, denoted H1. The alternative hypothesis
can be in a lower-tail form, for example, H1: E(Y) < 255, or an upper-tail form, for example, H1:
E(Y) > 255, or a two-tail form, for example, H1: E(Y) ≠ 255. The alternative hypothesis, also
sometimes called the research hypothesis, is what we would like to demonstrate to be the case, and
needs to be stated before looking at the data.
Calculate a test statistic based on the assumption that the null hypothesis is true. For testing a univariate
population mean, the relevant test statistic is t-statistic.
Under the assumption that the null hypothesis is true, this test statistic will have a particular probability
distribution. For testing a univariate population mean, this t-statistic has a t-distribution with n−1 degrees of
freedom.
We would therefore expect it to be "close" to zero (if the null hypothesis is true). Conversely, if it is far from
zero, then we might begin to doubt the null hypothesis:
For an upper-tail test, a t-statistic that is positive and far from zero would then lead us to favor the alternative
hypothesis (a t-statistic that was far from zero but negative would favor neither hypothesis and the test would
be inconclusive).
For a two-tail test, any t-statistic that is far from zero (positive or negative) would lead us to favor the
alternative hypothesis.
The significance level dictates the critical value(s) for the test, beyond which an observed t-statistic leads to
rejection of the null hypothesis in favor of the alternative. This region, which leads to rejection of the null
hypothesis, is called the rejection region. For example, for a significance level of 5%
For an upper-tail test, the critical value is the 95th percentile of the t-distribution with n−1 degrees of
freedom; reject the null in favor of the alternative if the t- statistic is greater than this.
For a lower-tail test, the critical value is the 5th percentile of the t-distribution with n−1 degrees of freedom;
reject the null in favor of the alternative if the t-statistic is less than this.
For a two-tail test, the two critical values are the 2.5th and the 97.5th percentiles of the t-distribution with
n−1 degrees of freedom; reject the null in favor of the alternative if the t-statistic is less than the 2.5th
percentile or greater than the 97.5th percentile.
t test
The t test is usually used to conduct hypothesis tests on the regression coefficients (βs) obtained from simple
linear regression. A statistic based on the t distribution is used to test the two-sided hypothesis that the true
slope, $\beta_1$, equals some constant value, $\beta_{1,0}$. (In this section $\beta_1$ denotes the slope and
$\beta_0$ the intercept.) The statements for the hypothesis test are expressed as:
$H_0: \beta_1 = \beta_{1,0}$
$H_1: \beta_1 \neq \beta_{1,0}$
The test statistic used for this test is:
$T_0 = \dfrac{\hat{\beta}_1 - \beta_{1,0}}{se(\hat{\beta}_1)}$
where $\hat{\beta}_1$ is the least-squares estimate of $\beta_1$, and $se(\hat{\beta}_1)$ is its standard error.
The value of $se(\hat{\beta}_1)$ can be calculated as follows:
$se(\hat{\beta}_1) = \sqrt{\dfrac{\sum e_i^2 / (n-2)}{\sum (x_i - \bar{x})^2}}$
The test statistic, $T_0$, follows a t distribution with (n−2) degrees of freedom, where n is the total number of observations.
The null hypothesis, $H_0$, is not rejected if the calculated value of the test statistic is such that:
$-t_{\alpha/2,\,n-2} < T_0 < t_{\alpha/2,\,n-2}$
where $t_{\alpha/2,\,n-2}$ and $-t_{\alpha/2,\,n-2}$ are the critical values for the two-sided hypothesis. $t_{\alpha/2,\,n-2}$ is the
percentile of the t distribution corresponding to a cumulative probability of (1−α/2), and α is the significance level.
If the value of $\beta_{1,0}$ is zero, then the hypothesis tests for the
significance of regression. In other words, the test indicates whether
the fitted regression model is significant in explaining
variations in the observations or whether you are trying to impose a
regression model when no true relationship exists
between x and Y. Failure to reject $H_0: \beta_1 = 0$ suggests that the data provide no
evidence of a linear relationship between x and Y.
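The test of significance of regression can be sketched on the income and food expenditure example used earlier. The code below is an illustration under two stated assumptions: the significance level is taken to be 5%, and the critical value 3.182 is the tabulated two-sided 5% value of the t distribution with n − 2 = 3 degrees of freedom, entered by hand rather than computed.

```python
import math

X = [25, 30, 35, 40, 45]
Y = [20, 24, 32, 33, 36]

n = len(X)
x_bar = sum(X) / n
y_bar = sum(Y) / n
s_xx = sum((x - x_bar) ** 2 for x in X)

# Least-squares estimates of the slope and the intercept.
slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) / s_xx
intercept = y_bar - slope * x_bar

# Standard error of the slope: sqrt( [sum e^2 / (n - 2)] / sum (x - x_bar)^2 ).
rss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(X, Y))
se_slope = math.sqrt(rss / (n - 2) / s_xx)

# Test H0: slope = 0 against H1: slope != 0.
HYPOTHESISED_SLOPE = 0.0
T_CRIT = 3.182  # tabulated t value for alpha/2 = 0.025 and 3 degrees of freedom
t0 = (slope - HYPOTHESISED_SLOPE) / se_slope
reject_h0 = abs(t0) > T_CRIT

print(f"se(slope) = {se_slope:.4f}, T0 = {t0:.2f}, reject H0: {reject_h0}")
```

Here the test statistic is about 6.5, well outside the interval from −3.182 to 3.182, so the null hypothesis of a zero slope is rejected at the 5% level for this sample.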
CASE STUDY 3
You have obtained a sample of 14,925 individuals from the Current
Population Survey (CPS) and are interested in the relationship
between average hourly earnings and years of education. The
regression yields the following result:
$\widehat{ahe} = -4.58 + 1.71 \times educ$,
$R^2 = 0.182$, SER = 9.30
DISCUSSION
a) Interpret the coefficients and the regression R².
b) Is the effect of education on earnings large?
c) Why should education matter in the determination of earnings? Do the results
suggest that there is a guarantee for average hourly earnings to rise for everyone
as they receive an additional year of education? Do you think that the relationship
between education and average hourly earnings is linear?
d) Interpret the measure SER. What is its unit of measurement?
a. On average, a person with one more year of education earns $1.71 more per hour. There is no meaning
attached to the intercept; it just determines the height of the regression line. The regression R² of
0.182 means that years of education explain about 18 percent of the variation in average hourly earnings.
b. The difference between a high school graduate and a college graduate is four years of education.
Hence a college graduate will earn almost $7 more per hour, on average ($6.84 to be precise). If you
assume that there are 2,000 working hours per year, then the average salary difference would be
close to $14,000 (actually $13,680). Depending on how much you have spent for an additional year of
education and how much income you have forgone, this does not seem particularly large.
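The arithmetic in part (b) can be reproduced from the estimated equation. In the sketch below, the figures of 12 and 16 years of education for a high school and a college graduate are assumptions made for illustration (the source only states that the difference is four years), and the function name `predicted_ahe` is hypothetical.

```python
# Estimated regression from the case study: ahe = -4.58 + 1.71 * educ.
INTERCEPT = -4.58
SLOPE = 1.71
HOURS_PER_YEAR = 2000  # working hours per year, as assumed in the answer


def predicted_ahe(educ_years):
    """Predicted average hourly earnings (dollars) for given years of education."""
    return INTERCEPT + SLOPE * educ_years


# Assumed schooling levels: 12 years (high school), 16 years (college).
gap_hourly = predicted_ahe(16) - predicted_ahe(12)
gap_annual = gap_hourly * HOURS_PER_YEAR

print(f"hourly gap: ${gap_hourly:.2f}, annual gap: ${gap_annual:,.0f}")
```

Because the model is linear, the gap depends only on the four-year difference and not on the particular schooling levels chosen, so any pair of values four years apart gives the same $6.84 per hour.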
c. In general, you would expect to find a positive relationship between years of education and
average hourly earnings. Education is considered investment in human capital. If this were not the
case, then it would be a puzzle as to why there are students in the econometrics course — surely
they are not there to just “find themselves” (which would be quite expensive in most cases).
However, if you consider education as an investment and you wanted to see a return on it, then the
relationship will most likely not be linear. For example, a constant percent return would imply an
exponential relationship whereby the additional year of education would bring a larger increase in
average hourly earnings at higher levels of education. The results do not suggest that there is a
guarantee for earnings to rise for everyone as they become more educated since the regression R2
does not equal 1. Instead the result holds “on average.”
d. The typical prediction error is $9.30. Since the measure is related to the deviation of the actual and fitted
values, the unit of measurement must be the same as that of the dependent variable, which is in dollars here.