2. 2
Why study Econometrics?
• Rare in economics (and many other areas without labs!) to have
experimental data
• Need to use nonexperimental, or observational, data to make
inferences
• Important to be able to apply economic theory to real world data
3. Elements of econometrics
• An empirical analysis uses data to test a theory or to estimate a
relationship
• A formal economic model can be tested
• Theory may be ambiguous as to the effect of some policy change –
can use econometrics to evaluate the program
4. 4
Types of Data – Cross Sectional
• Cross-sectional data as a random sample
• Each observation is a new farmer, firm, etc. with information at a
point in time
• If the data is not a random sample, we have a sample-selection
problem
5. 5
Types of Data – Panel
• Can pool random cross sections and treat them much like a normal cross
section. Will just need to account for time differences.
• Can follow the same random individual observations over time –
known as panel data or longitudinal data
6. 6
Types of Data – Time Series
• Time series data has a separate observation for each time period – e.g.
prices
• Since not a random sample, different problems to consider
• Trends and seasonality will be important
7. 7
The Question of Causality
• Simply establishing a relationship between variables is not sufficient
• In economics, and in most of the work we do, we strive to show that the
effect is causal
• If we truly control for enough other variables, then the estimated
ceteris paribus effect can often be considered to be causal
• Can be difficult to establish causality
8. 8
Example: Returns to Education (Wooldridge
textbook example)
• A model of human capital investment implies getting more education
should lead to higher earnings
• In the simplest case, this implies an equation like
Earnings = β₀ + β₁·education + u
9. 9
Example: (continued)
• The estimate of β₁ is the return to education, but
can it be considered causal?
• While the error term, u, includes other factors
affecting earnings, want to control for as much as
possible
• Some things are still unobserved, which can
threaten establishing causality
10. 10
Simple linear regression
• y = β₀ + β₁x + u; we typically refer to y as the
• Dependent Variable, or
• Left-Hand Side Variable, or
• Explained Variable, or
• Regressand
11. 11
Terminology, continued
• In the simple linear regression of y on x, we typically refer to x as the
• Independent Variable, or
• Right-Hand Side Variable, or
• Explanatory Variable, or
• Regressor, or
• Covariate, or
• Control Variable
12. 12
A Simple Assumption
• The average value of u, the error term, in the population is 0. That is,
• E(u) = 0
• This is not a restrictive assumption, since we can always use β₀ to
normalize E(u) to 0
13. 13
Zero Conditional Mean (very important
condition)
• We need to make a crucial assumption about how u and x are related
• We want it to be the case that knowing something about x does not
give us any information about u, so that they are completely
unrelated. That is, that
• E(u|x) = E(u) = 0, which implies
• E(y|x) = β₀ + β₁x
14. 14
[Figure: E(y|x) = β₀ + β₁x as a linear function of x; for any value of x (e.g. x₁, x₂), the distribution of y, f(y), is centered about E(y|x).]
15. 15
Ordinary Least Squares
• Basic idea of regression is to estimate the population parameters
from a sample
• Let {(xi,yi): i=1, …,n} denote a random sample of size n from the
population
• For each observation in this sample, it will be the case that
• yᵢ = β₀ + β₁xᵢ + uᵢ
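To make this concrete, here is a minimal Python sketch (not from the text; the values β₀ = 1, β₁ = 0.5 and the error distribution are illustrative assumptions) that draws a random sample {(xᵢ, yᵢ)} satisfying yᵢ = β₀ + β₁xᵢ + uᵢ with E(u|x) = 0:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical population parameters, chosen only for illustration
beta0, beta1 = 1.0, 0.5
n = 200

x = rng.uniform(0, 10, size=n)   # regressor
u = rng.normal(0, 2, size=n)     # error term with E(u|x) = 0
y = beta0 + beta1 * x + u        # y_i = beta0 + beta1*x_i + u_i
```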
16. 16
[Figure: population regression line E(y|x) = β₀ + β₁x, sample data points (x₁, y₁), …, (x₄, y₄), and the associated error terms u₁, …, u₄.]
17. Gauss Markov assumptions
• Assumption 1 – linearity in parameters: the population equation is linear
in the (unknown) parameters.
• Assumption 2 – random sampling: the sample is a random sample from the population
• Assumption 3 – there is variation in x
• Assumption 4 – zero conditional mean: E(u|x) = 0
18. 18
Deriving OLS Estimates
• To derive the OLS estimates we need to realize that our main
assumption of E(u|x) = E(u) = 0 also implies that
• Cov(x,u) = E(xu) = 0
• Why? Remember from basic probability that Cov(X,Y) = E(XY) –
E(X)E(Y)
19. 19
Deriving OLS continued
• We can write our 2 restrictions just in terms of x, y, β₀ and β₁, since u
= y – β₀ – β₁x
• E(y – β₀ – β₁x) = 0
• E[x(y – β₀ – β₁x)] = 0
• These are called moment restrictions
20. 20
Deriving OLS using M.O.M.
• The method of moments approach to estimation implies imposing
the population moment restrictions on the sample moments
• What does this mean? Recall that for E(X), the mean of a population
distribution, a sample estimator of E(X) is simply the arithmetic mean
of the sample
21. 21
More Derivation of OLS
• We want to choose values of the parameters that will ensure that
the sample versions of our moment restrictions are true
• The sample versions are as follows:
(1/n) Σᵢ (yᵢ – β̂₀ – β̂₁xᵢ) = 0
(1/n) Σᵢ xᵢ(yᵢ – β̂₀ – β̂₁xᵢ) = 0
(sums run over i = 1, …, n)
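Because the two sample moment conditions are linear in β̂₀ and β̂₁, they can be solved directly as a 2×2 linear system. A minimal sketch, on simulated data with illustrative (assumed) parameter values:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 200)
y = 1.0 + 0.5 * x + rng.normal(0, 2, 200)   # illustrative data
n = x.size

# Sample moment conditions rearranged as a linear system in (b0, b1):
#   sum(y)  = n*b0       + b1*sum(x)
#   sum(xy) = b0*sum(x)  + b1*sum(x^2)
A = np.array([[n,       x.sum()],
              [x.sum(), (x**2).sum()]])
b = np.array([y.sum(), (x * y).sum()])

b0_hat, b1_hat = np.linalg.solve(A, b)
print(b0_hat, b1_hat)
```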
22. 22
More Derivation of OLS
• Given the definition of a sample mean, and
properties of summation, we can rewrite the first
condition as follows
ȳ = β̂₀ + β̂₁x̄, or
β̂₀ = ȳ – β̂₁x̄
23. 23
More Derivation of OLS
Σᵢ xᵢ(yᵢ – (ȳ – β̂₁x̄) – β̂₁xᵢ) = 0
Σᵢ xᵢ(yᵢ – ȳ) = β̂₁ Σᵢ xᵢ(xᵢ – x̄)
Σᵢ (xᵢ – x̄)(yᵢ – ȳ) = β̂₁ Σᵢ (xᵢ – x̄)²
24. 24
So the OLS estimated slope is
β̂₁ = Σᵢ (xᵢ – x̄)(yᵢ – ȳ) / Σᵢ (xᵢ – x̄)²,
provided that Σᵢ (xᵢ – x̄)² > 0
25. 25
Summary of OLS slope estimate
• The slope estimate is the sample covariance between x and y divided
by the sample variance of x
• If x and y are positively correlated, the slope will be positive
• If x and y are negatively correlated, the slope will be negative
• Only need x to vary in our sample
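A short sketch of the closed-form estimates on illustrative simulated data (assumed values, not from the text): the slope is the sample covariance of x and y over the sample variance of x, and the intercept follows from the sample means:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 200)
y = 1.0 + 0.5 * x + rng.normal(0, 2, 200)   # illustrative data

# Slope: sample covariance of (x, y) divided by sample variance of x
b1_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean())**2)
b0_hat = y.mean() - b1_hat * x.mean()       # intercept from the first moment condition

print(b0_hat, b1_hat)
print(np.polyfit(x, y, 1))                  # library fit returns [slope, intercept]
```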
26. 26
More OLS
• Intuitively, OLS is fitting a line through the sample points such that
the sum of squared residuals is as small as possible, hence the term
least squares
• The residual, û, is an estimate of the error term, u, and is the
difference between the fitted line (sample regression function) and
the sample point
27. 27
[Figure: sample regression line ŷ = β̂₀ + β̂₁x, sample data points, and the associated residuals û₁, …, û₄.]
28. 28
Minimizing the sum of squared residuals
• Given the intuitive idea of fitting a line, we can set up a formal
minimization problem
• That is, we want to choose our parameters such that we minimize
the following:
Σᵢ ûᵢ² = Σᵢ (yᵢ – β̂₀ – β̂₁xᵢ)²
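One way to see that the closed-form estimates really do minimize this criterion is to minimize the sum of squared residuals numerically. A sketch using scipy.optimize.minimize on illustrative simulated data (the data-generating values are assumptions, not from the text):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 200)
y = 1.0 + 0.5 * x + rng.normal(0, 2, 200)   # illustrative data

def ssr(params):
    b0, b1 = params
    resid = y - b0 - b1 * x
    return np.sum(resid**2)                 # sum of squared residuals

res = minimize(ssr, x0=[0.0, 0.0])          # numerical minimization over (b0, b1)
print(res.x)                                # ~ the closed-form OLS estimates
```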
29. 29
Alternate approach, continued
• If one uses calculus to solve the minimization problem for the two
parameters, you obtain the following first-order conditions, which are
the same as those we obtained before, multiplied by n
Σᵢ (yᵢ – β̂₀ – β̂₁xᵢ) = 0
Σᵢ xᵢ(yᵢ – β̂₀ – β̂₁xᵢ) = 0
30. 30
Algebraic Properties of OLS
• The sum of the OLS residuals is zero
• Thus, the sample average of the OLS residuals is zero as well
• The sample covariance between the regressors and the OLS residuals
is zero
• The OLS regression line always goes through the point of sample means, (x̄, ȳ)
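These algebraic properties are easy to verify numerically. A minimal sketch on illustrative simulated data (assumed values, not from the text):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 200)
y = 1.0 + 0.5 * x + rng.normal(0, 2, 200)   # illustrative data

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean())**2)
b0 = y.mean() - b1 * x.mean()
u_hat = y - b0 - b1 * x                     # OLS residuals

print(np.isclose(u_hat.sum(), 0))                # residuals sum to zero
print(np.isclose((x * u_hat).sum(), 0))          # zero sample covariance with x
print(np.isclose(b0 + b1 * x.mean(), y.mean()))  # line passes through (x-bar, y-bar)
```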
32. 32
More terminology
• We can think of each observation as being made up of an explained part
and an unexplained part, yᵢ = ŷᵢ + ûᵢ. We then define the following:
• Σᵢ (yᵢ – ȳ)² is the total sum of squares (SST)
• Σᵢ (ŷᵢ – ȳ)² is the explained sum of squares (SSE)
• Σᵢ ûᵢ² is the residual sum of squares (SSR)
• Then SST = SSE + SSR
33. 33
Proof that SST = SSE + SSR
Σᵢ (yᵢ – ȳ)² = Σᵢ [(yᵢ – ŷᵢ) + (ŷᵢ – ȳ)]²
= Σᵢ [ûᵢ + (ŷᵢ – ȳ)]²
= Σᵢ ûᵢ² + 2 Σᵢ ûᵢ(ŷᵢ – ȳ) + Σᵢ (ŷᵢ – ȳ)²
= SSR + 2 Σᵢ ûᵢ(ŷᵢ – ȳ) + SSE
and we know that Σᵢ ûᵢ(ŷᵢ – ȳ) = 0, so SST = SSE + SSR
34. 34
Goodness-of-Fit
• How do we think about how well our sample regression line fits our
sample data?
• Can compute the fraction of the total sum of squares (SST) that is
explained by the model; call this the R-squared of the regression
• R² = SSE/SST = 1 – SSR/SST
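A short sketch computing SST, SSE, SSR and the two equivalent R-squared expressions on illustrative simulated data (assumed values, not from the text):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 200)
y = 1.0 + 0.5 * x + rng.normal(0, 2, 200)   # illustrative data

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean())**2)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x
u_hat = y - y_hat

SST = np.sum((y - y.mean())**2)             # total sum of squares
SSE = np.sum((y_hat - y.mean())**2)         # explained sum of squares
SSR = np.sum(u_hat**2)                      # residual sum of squares

print(np.isclose(SST, SSE + SSR))           # decomposition holds
print(SSE / SST, 1 - SSR / SST)             # two equivalent R-squared formulas
```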
35. 35
Unbiasedness of OLS
• Assume the population model is linear in parameters: y = β₀ + β₁x + u
• Assume we can use a random sample of size n, {(xᵢ, yᵢ): i = 1, 2, …, n},
from the population model. Thus we can write the sample model as
yᵢ = β₀ + β₁xᵢ + uᵢ
• Assume E(u|x) = 0 and thus E(uᵢ|xᵢ) = 0
• Many of the problems we will face in econometric estimation arise from
violations of this assumption
• Assume there is variation in the xᵢ
36. 36
Unbiasedness of OLS (cont)
• In order to think about unbiasedness, we need to rewrite our
estimator in terms of the population parameter
• Start with a simple rewrite of the formula as
β̂₁ = Σᵢ (xᵢ – x̄)yᵢ / sₓ², where sₓ² ≡ Σᵢ (xᵢ – x̄)²
37. 37
Unbiasedness of OLS (cont)
Σᵢ (xᵢ – x̄)yᵢ = Σᵢ (xᵢ – x̄)(β₀ + β₁xᵢ + uᵢ)
= β₀ Σᵢ (xᵢ – x̄) + β₁ Σᵢ (xᵢ – x̄)xᵢ + Σᵢ (xᵢ – x̄)uᵢ
38. 38
Unbiasedness of OLS (cont)
Σᵢ (xᵢ – x̄) = 0 and Σᵢ (xᵢ – x̄)xᵢ = Σᵢ (xᵢ – x̄)²,
so the numerator can be rewritten as β₁sₓ² + Σᵢ (xᵢ – x̄)uᵢ, and thus
β̂₁ = β₁ + (1/sₓ²) Σᵢ (xᵢ – x̄)uᵢ
39. 39
Unbiasedness Summary
• The OLS estimates of β₁ and β₀ are unbiased
• Proof of unbiasedness depends on assumptions – if any assumption
fails, then OLS is not necessarily unbiased
• Remember unbiasedness is a description of the estimator – in a given
sample we may be “near” or “far” from the true parameter
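Unbiasedness is a statement about the sampling distribution, so it can be illustrated with a Monte Carlo experiment: average β̂₁ over many samples drawn from the same population. A sketch with illustrative (assumed) true parameters:

```python
import numpy as np

rng = np.random.default_rng(0)
beta0, beta1 = 1.0, 0.5                     # illustrative "true" parameters
n, reps = 50, 5000

x = rng.uniform(0, 10, n)                   # keep the x values fixed across replications
b1_hats = np.empty(reps)
for r in range(reps):
    u = rng.normal(0, 2, n)                 # new errors in each replication
    y = beta0 + beta1 * x + u
    b1_hats[r] = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean())**2)

print(b1_hats.mean())                       # close to beta1 = 0.5
```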
40. 40
Variance of the OLS Estimators
• Now we know that the sampling distribution of our estimate is
centered around the true parameter
• Want to think about how spread out this distribution is
• Much easier to think about this variance under an additional
assumption, so
• Assume Var(u|x) = σ² (homoskedasticity)
41. 41
Variance of OLS (cont)
• Var(u|x) = E(u²|x) – [E(u|x)]²
• E(u|x) = 0, so σ² = E(u²|x) = E(u²) = Var(u)
• Thus σ² is also the unconditional variance, called the error variance
• σ, the square root of the error variance, is called the standard
deviation of the error
• Can say: E(y|x) = β₀ + β₁x and Var(y|x) = σ²
42. 42
[Figure: homoskedastic case – f(y|x) has the same spread at every value of x (e.g. x₁, x₂), centered about E(y|x) = β₀ + β₁x.]
43. 43
[Figure: heteroskedastic case – the spread of f(y|x) changes with x (e.g. x₁, x₂, x₃), centered about E(y|x) = β₀ + β₁x.]
44. 44
Variance of OLS (cont)
Var(β̂₁) = Var(β₁ + (1/sₓ²) Σᵢ dᵢuᵢ), where dᵢ = xᵢ – x̄
= (1/sₓ²)² Var(Σᵢ dᵢuᵢ)
= (1/sₓ²)² Σᵢ dᵢ² Var(uᵢ)
= (1/sₓ²)² Σᵢ dᵢ² σ²
= σ² (1/sₓ²)² Σᵢ dᵢ²
= σ² (1/sₓ²)² sₓ²
= σ²/sₓ²
45. 45
Variance of OLS Summary
• The larger the error variance, σ², the larger the variance of the slope
estimate
• The larger the variability in the xi, the smaller the variance of the
slope estimate
• As a result, a larger sample size should decrease the variance of the
slope estimate
• Problem that the error variance is unknown
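The formula Var(β̂₁) = σ²/Σᵢ(xᵢ – x̄)² can likewise be checked by simulation, holding the xᵢ fixed and redrawing the errors. A sketch with illustrative (assumed) values:

```python
import numpy as np

rng = np.random.default_rng(0)
beta0, beta1, sigma = 1.0, 0.5, 2.0         # illustrative values
n, reps = 50, 20000

x = rng.uniform(0, 10, n)                   # fixed regressor values
sx2 = np.sum((x - x.mean())**2)

b1_hats = np.empty(reps)
for r in range(reps):
    y = beta0 + beta1 * x + rng.normal(0, sigma, n)
    b1_hats[r] = np.sum((x - x.mean()) * (y - y.mean())) / sx2

print(b1_hats.var())                        # simulated Var(beta1-hat)
print(sigma**2 / sx2)                       # theoretical sigma^2 / sum((x - x-bar)^2)
```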
46. 46
Estimating the Error Variance
• We don’t know what the error variance, σ², is, because we don’t
observe the errors, uᵢ
• What we observe are the residuals, ûi
• We can use the residuals to form an estimate of the error variance
47. 47
Error Variance Estimate (cont)
ûᵢ = yᵢ – β̂₀ – β̂₁xᵢ
= (β₀ + β₁xᵢ + uᵢ) – β̂₀ – β̂₁xᵢ
= uᵢ – (β̂₀ – β₀) – (β̂₁ – β₁)xᵢ
Then, an unbiased estimator of σ² is
σ̂² = (1/(n – 2)) Σᵢ ûᵢ² = SSR/(n – 2)
48. 48
Error Variance Estimate (cont)
σ̂ = √σ̂² = standard error of the regression
Recall that sd(β̂₁) = σ/sₓ
If we substitute σ̂ for σ, we then have the standard error of β̂₁:
se(β̂₁) = σ̂ / [Σᵢ (xᵢ – x̄)²]^½
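Putting the last two slides together, here is a minimal sketch (illustrative simulated data, not from the text) that computes σ̂² = SSR/(n – 2), the standard error of the regression, and se(β̂₁):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 200)
y = 1.0 + 0.5 * x + rng.normal(0, 2, 200)   # illustrative data
n = x.size

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean())**2)
b0 = y.mean() - b1 * x.mean()
u_hat = y - b0 - b1 * x                     # residuals

sigma2_hat = np.sum(u_hat**2) / (n - 2)     # SSR / (n - 2), unbiased for sigma^2
se_b1 = np.sqrt(sigma2_hat / np.sum((x - x.mean())**2))   # se(beta1-hat)

print(np.sqrt(sigma2_hat))                  # standard error of the regression
print(se_b1)                                # standard error of the slope estimate
```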