Advanced econometrics and Stata
L5-6 (1) Hypothesis testing, multi-regression
Dr. Chunxia Jiang
Business School, University of Aberdeen, UK
Beijing, 17-26 Nov 2019
 Topics and schedule
Sessions plan
Evening —
L1-2 Introduction to Econometrics and Stata
Evening —
L3-4 Data, single regression
Morning —
L5-6 Hypothesis testing, Multi-regression , Violation of assumptions
Afternoon Exercises and practice
Morning —
L7-8 Time series models
Evening —
L9-10 Panel data models & Endogeneity
Morning Exercises and practice
Afternoon L11-12 Frontier1 SFA
Evening L13-14: Frontier2 DEA
Evening L15-16 DID
Morning Revision
Afternoon Exam
Review: Data and simple regression
• Basic data analysis: Summary statistics
  ◦ One variable:
    - Mean or average value
    - Minimum and maximum value
    - Mode and median
    - Variance and standard deviation
  ◦ Two variables:
    - Covariance
    - Correlation
    - Cross-plot (or scattergram or scatter plot)
• Single regression/multivariate regression analysis
  ◦ How do we tell that OLS is a good estimator of the PRF?
  ◦ Assumptions
  ◦ R-square
Preview:
• Statistical inference
  ◦ Derivation of the variance and standard error of our coefficient estimates
  ◦ How to use the variance and standard error to assess our model and test hypotheses
• Hypothesis testing: null and alternative hypothesis
  ◦ Testing hypotheses for single parameters
  ◦ Testing joint hypotheses (for two or more parameters)
• Dummy variables
• Functional form
Statistical inference
• We want to know how good $\hat{\alpha}$ and $\hat{\beta}$ are as estimates of the population parameters, $\alpha$ and $\beta$.
  ◦ How reliable are our estimates?
  ◦ How reliable is the least squares estimation procedure?
• We need to know the nature of the variables in our regression model: which variables are random and which variables are deterministic.
Let's look again at the assumptions of the CLRM:
Assumptions of the Classical Linear Regression Model (CLRM)
• (1) The regression model is linear in the parameters.
• (2) X values are fixed in repeated sampling.
• (3) The number of observations must be greater than the number of parameters to be estimated.
• (4) There must be variability in the X values.
• (5) The explanatory variable X is uncorrelated with the error term: $Cov(u_i, X_i) = 0$
• (6) There is no perfect multicollinearity.
• (7) Given the value of X, the expected value of the error term is zero: $E(u \mid X) = 0$
• (8) The variance of the error term is constant (homoscedasticity): $var(u_i \mid X_i) = \sigma^2$
• (9) There is no correlation between two error terms (no autocorrelation): $cov(u_i, u_j \mid X_i, X_j) = 0$ for $i \neq j$
• (10) The disturbance term must be normally distributed: $u_t \sim N(0, \sigma^2)$
• (11) The model is correctly specified.
What is random and what is non-random in our regression model?
• Random variables: $\hat{\alpha}$, $\hat{\beta}$, $u$, $Y$
• Non-random variables: $X$
• To establish the properties of the estimators $\hat{\alpha}$ and $\hat{\beta}$ we need to know:
  ◦ The mean (expected value)
  ◦ The variance and covariance
  ◦ The probability distribution of the error term, $\hat{\alpha}$ and $\hat{\beta}$
Distribution of the error term & coefficient estimates
• Assumption: the error term is normally distributed with mean 0 and variance $\sigma^2$:
$u_i \sim N(0, \sigma^2)$
• Important property of the normal distribution: any linear function of normally distributed variables is itself normally distributed.
• Therefore alpha hat and beta hat are also normally distributed:
$\hat{\alpha} \sim N(?, ?)$
$\hat{\beta} \sim N(?, ?)$
• We now need to find their mean and variance.
Residual: $\hat{u}_t = y_t - \hat{y}_t = y_t - \hat{\alpha} - \hat{\beta} x_t$
Mean of alpha hat and beta hat
• Under the CLRM assumptions, the expected values of the OLS estimators of alpha and beta are:
$E(\hat{\alpha}) = \alpha$
$E(\hat{\beta}) = \beta$
• The expected values of $\hat{\alpha}$ and $\hat{\beta}$ are equal to the true parameter values. This means that the estimators are unbiased.
A ‘little story’ about repeated sampling
• Let's imagine that we want to estimate the average height of our class. Our class is our population. From this population we draw a random sample of 2 students, we measure their height and we take the average: this is our first estimate, $\hat{\beta}_1$.
• We then randomly draw another sample of 2 students and obtain a different estimate, call it $\hat{\beta}_2$.
• We carry on drawing different samples and each time obtain a slightly different estimate for the average height of our class. That is why our coefficient estimates are random variables.
• We assume that this random variable is normally distributed.
• Using our sample we will produce a value of $\hat{\beta}$ that is likely to be close to the population mean.
• By randomly drawing the sample we are very likely to be very close to the population mean.
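The repeated-sampling story can be simulated in a few lines of Python. This is only a sketch under stated assumptions: the class of 30 students and their heights are made-up data, and the helper name `sample_means` is hypothetical. It draws many samples of 2 students and shows that the estimates differ from sample to sample while their average stays close to the population mean.

```python
import random
import statistics

def sample_means(population, sample_size, n_samples, seed=0):
    """Draw n_samples random samples and return the mean of each one."""
    rng = random.Random(seed)
    return [statistics.mean(rng.sample(population, sample_size))
            for _ in range(n_samples)]

# Hypothetical class of 30 students, heights in cm (made-up data).
rng = random.Random(42)
heights = [round(rng.gauss(170, 8), 1) for _ in range(30)]
population_mean = statistics.mean(heights)

# Each estimate is the average height of 2 randomly drawn students.
estimates = sample_means(heights, sample_size=2, n_samples=5000)

print("population mean:       ", round(population_mean, 2))
print("first two estimates:   ", estimates[:2])
print("mean of all estimates: ", round(statistics.mean(estimates), 2))
print("spread of estimates:   ", round(statistics.stdev(estimates), 2))
```

The spread of the estimates is what the variance (or standard error) of the estimator measures.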
...and what we actually do
• In reality we do not use repeated sampling, we only have 1 sample. We could be very unlucky and choose a sample that is very far from the population mean. But there is a high probability that our random sample will give a value of $\hat{\beta}$ which is very close to the real population parameter.
• We recognise that there is always a margin of error in our estimates. This is represented by the variance (or by the standard error) of $\hat{\beta}$.
Variance of alpha hat and beta hat
• If the assumptions of the CLRM are correct it is possible to derive the variance of the estimators alpha hat and beta hat and obtain the following expressions:
$var(\hat{\alpha}) = \sigma^2 \dfrac{\sum x_i^2}{N \sum (x_i - \bar{x})^2}$
$var(\hat{\beta}) = \dfrac{\sigma^2}{\sum (x_i - \bar{x})^2}$
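As a rough illustration of these two expressions, the sketch below computes the OLS estimates and their variances for a tiny made-up data set. It assumes that the unknown $\sigma^2$ is replaced by the residual-based estimate $\sum \hat{u}_t^2 / (N - 2)$ used later in the slides; the function name `simple_ols` and the data are hypothetical.

```python
import math

def simple_ols(x, y):
    """OLS for y = alpha + beta * x + u, with variances of both estimators."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)            # sum of (x_i - x_bar)^2
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = y_bar - beta * x_bar
    residuals = [yi - alpha - beta * xi for xi, yi in zip(x, y)]
    sigma2_hat = sum(u ** 2 for u in residuals) / (n - 2)  # estimate of sigma^2
    var_beta = sigma2_hat / sxx
    var_alpha = sigma2_hat * sum(xi ** 2 for xi in x) / (n * sxx)
    return {
        "alpha": alpha, "beta": beta,
        "var_alpha": var_alpha, "var_beta": var_beta,
        "se_alpha": math.sqrt(var_alpha), "se_beta": math.sqrt(var_beta),
    }

# Made-up data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
fit = simple_ols(x, y)
for name, value in fit.items():
    print(f"{name:>9}: {value:.4f}")
```

Dividing each estimate by its standard error gives the t ratio used in the hypothesis tests that follow.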
Distribution of alpha hat and beta hat
• Wrapping up what we have said so far:
$u_i \sim N(0, \sigma^2)$
$\hat{\alpha} \sim N\left(\alpha,\ \sigma^2 \dfrac{\sum x_i^2}{N \sum (x_i - \bar{x})^2}\right)$
$\hat{\beta} \sim N\left(\beta,\ \dfrac{\sigma^2}{\sum (x_i - \bar{x})^2}\right)$
Standardised normal distribution
• Let's now go back to the distribution of our coefficient beta hat:
$\hat{\beta} \sim N\left(\beta,\ \dfrac{\sigma^2}{\sum (x_i - \bar{x})^2}\right)$
• A more convenient way to represent this distribution is by constructing a variable Z obtained by subtracting the mean and dividing by the standard error (standardised normal distribution):
$Z = \dfrac{\hat{\beta} - \beta}{\sqrt{\sigma^2 / \sum (x_i - \bar{x})^2}} \sim N(0, 1)$
• SE (standard error): the standard deviation of the sampling distribution of the estimator, or an estimate of that standard deviation
The t-distribution
• Remember: we do not know the true variance of the error term. We have to substitute it with its estimate.
• We use the variance of the residuals $\hat{u}_t$:
$\hat{\sigma}^2 = \dfrac{\sum \hat{u}_t^2}{N - 2}$
• By doing so we obtain a random variable with a slightly different probability distribution.
• We now have a t-distribution with N-2 degrees of freedom:
$t = \dfrac{\hat{\beta} - \beta}{\sqrt{\hat{\sigma}^2 / \sum (x_i - \bar{x})^2}} \sim t_{(N-2)}$
Or more simply:
$t = \dfrac{\hat{\beta} - \beta}{se(\hat{\beta})} \sim t_{(N-2)}$
$t = \dfrac{\hat{\alpha} - \alpha}{se(\hat{\alpha})} \sim t_{(N-2)}$
Degrees of freedom
• It is the total number of observations in the sample (n) less the number of independent (linear) constraints or restrictions put on them.
• It is the number of independent observations out of a total of n observations.
• The general rule is df = n - number of parameters estimated
From the standardised normal distribution to the t distribution
$\hat{\beta} \sim N\left(\beta,\ \dfrac{\sigma^2}{\sum (x_i - \bar{x})^2}\right)$
$Z = \dfrac{\hat{\beta} - \beta}{\sqrt{\sigma^2 / \sum (x_i - \bar{x})^2}} \sim N(0, 1)$
$t = \dfrac{\hat{\beta} - \beta}{\sqrt{\hat{\sigma}^2 / \sum (x_i - \bar{x})^2}} \sim t_{(N-2)}$
Very important for hypothesis testing!
Probability density function of a t distribution
• It is a symmetrical distribution with zero mean, and it is typically flatter than the normal distribution. As the degrees of freedom increase, the t distribution approaches the normal distribution.
• The area under the curve measures the probability of a certain event occurring.
Testing hypotheses
• We now want to use all this information to carry out hypothesis testing.
• We derive our hypotheses from the theory.
• For example, in the CAPM model, the theory suggests that the intercept should be equal to zero:
$R_{jt} - R_{ft} = \alpha + \beta (R_{mt} - R_{ft}) + u_{jt}$
where $R_{jt} - R_{ft}$ is the excess return of fund j at time t and $\alpha$ is the expected risk-adjusted return.
• Here are the steps to follow to carry out hypothesis testing:
How to carry out hypothesis testing: using the t test
• 1. Draw up the null hypothesis (H0)
• 2. Draw up the alternative hypothesis (H1 or HA). This will determine the type of test to carry out (one-tailed or two-tailed test)
• 3. Compute your t statistic (or t ratio)
• 4. Choose your significance level
• 5. Find the critical value in the table
• 6. Compare your t statistic to the critical value and decide the outcome of the test.
Null Hypothesis
• We can set up hypotheses on any parameter of our model.
• Step 1: Draw up the null hypothesis (H0)
• In the case of the CAPM model this will be: $H_0: \alpha = 0$
$R_{jt} - R_{ft} = \alpha + \beta (R_{mt} - R_{ft}) + u_{jt}$
where $R_{jt} - R_{ft}$ is the excess return of fund j at time t and $\alpha$ is the expected risk-adjusted return.
Alternative Hypothesis
• The alternative is trickier
• Usually the alternative is just that the null is wrong:
◦ H1: α ≠ 0 (fund managers earn non-zero risk-adjusted excess returns)
• But sometimes it is more specific:
◦ H1: α < 0 (fund managers underperform)
◦ H1: α > 0 (fund managers outperform)
Construct your t statistic
• Our primary interest is in testing the following null hypothesis:
$H_0: \beta = 0$
• Is the unknown population parameter equal to zero?
• In order to test this hypothesis we first compute the t-statistic or t-ratio:
$t = \dfrac{\hat{\beta} - \beta}{se(\hat{\beta})}$, which under $H_0$ becomes $t = \dfrac{\hat{\beta} - 0}{se(\hat{\beta})}$
• The t ratio measures how many estimated standard deviations beta hat is away from zero.
• How far is this value from zero?
Choose your significance level
• We want to test: $H_0: \beta = 0$
• Against the alternative $H_1$, for example $H_1: \beta > 0$ for a one-tailed test.
We have to choose a significance level: the probability of rejecting H0 when it is true.
Usually we choose the 5% significance level (5% of the time we make the error of rejecting H0 when it is true).
But we might also want to use the 10% and the 1% significance levels.
One-tailed test/one-sided test
• In order to reject H0 in favour of H1 we need a sufficiently large value of t.
• How large?
• Look at the t tables. Find the critical value corresponding to the chosen significance level and number of degrees of freedom.
• If t > critical value we reject H0
• If t < critical value we cannot reject H0
T distribution and critical values
Suppose we have to find the critical value for the 5% significance level and 20 degrees of freedom. Use your tables.
Example: the CAPM model
Regression Statistics
R Square            0.928
Adjusted R Square   0.903
Standard Error      3.179
Observations        5

               Coefficients   Standard Error   t Stat
Intercept      -1.737         4.114            -0.422
X Variable 1    1.642         0.265             6.200

Dependent variable (Y): excess return on portfolio XXX
Independent variable (X): excess return on the market portfolio
We are testing whether alpha (the intercept) = 0, against the alternative that it is > 0
• This is a very small sample with only 5 observations and 3 degrees of freedom. The critical value from the t distribution with 3 degrees of freedom at the 5% significance level is 3.182 for a two-tailed test and 2.353 for a one-tailed test. In both cases we cannot reject the null hypothesis.
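The steps of the t test can be sketched in Python using the CAPM regression output above. The function name `t_test` and its `tail` argument are illustrative and not part of the lecture; the estimate, standard error and the two critical values (3.182 and 2.353) are the ones quoted in the example.

```python
def t_test(estimate, std_error, critical_value, hypothesised=0.0, tail="two"):
    """Return the t statistic and whether H0 is rejected.

    tail: "two" for H1: parameter != hypothesised,
          "right" for H1: parameter > hypothesised,
          "left" for H1: parameter < hypothesised.
    """
    t_stat = (estimate - hypothesised) / std_error
    if tail == "two":
        reject = abs(t_stat) > critical_value
    elif tail == "right":
        reject = t_stat > critical_value
    elif tail == "left":
        reject = t_stat < -critical_value
    else:
        raise ValueError("tail must be 'two', 'right' or 'left'")
    return t_stat, reject

# Intercept of the CAPM example: alpha hat and its standard error.
alpha_hat, se_alpha = -1.737, 4.114

# 5% critical values for 3 degrees of freedom, as quoted in the example.
t_two, reject_two = t_test(alpha_hat, se_alpha, 3.182, tail="two")
t_one, reject_one = t_test(alpha_hat, se_alpha, 2.353, tail="right")

print(f"t statistic: {t_two:.3f}")
print("two-tailed test rejects H0:", reject_two)
print("one-tailed test rejects H0:", reject_one)
```

Both tests fail to reject H0: alpha = 0, in line with the conclusion stated in the example.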
Test of significance
• The test we have just seen is very common in econometric analysis
• We are usually interested in checking that the variables included in our model are relevant
• We carry out a test of significance.
• For this test we always have:
  ◦ H0: beta = 0
  ◦ H1: beta ≠ 0
• It is always a two-sided test
An example
• A simple model: explaining how the demand for computers is related to personal income.
• Y = number of PCs per 100 persons
• X = per capita income (in $)
• We use cross-sectional data for 34 countries

Country     PCs (per 100 persons)   Per capita income ($)
Argentina   8.2                     11410
China       2.76                    4980
Canada      48.7                    30040
India       0.72                    2880
Results from our OLS regression analysis
$\hat{Y}_i = -6.5833 + 0.0018 X_i$
Standard errors: (2.7437) (0.00014)
$R^2 = 0.829$
Interpretation of the results:
• There is a positive relationship between the number of computers and per capita income. If income increases by $1,000, the demand for computers goes up by about 2 units per 100 persons.
• The intercept has no meaningful interpretation in this case.
• The estimated R2 is very high. It suggests that 83% of the variation in the demand for computers is explained by per capita income.
• Is the estimated coefficient significantly different from zero?
The t ratio = 0.0018/0.00014 = 12.857
Find the critical value for the 5% significance level.
Confidence intervals (or interval estimates)
• We can also use interval estimates to carry out our statistical inference.
• In this case we compare our hypothesised value for beta with a range of values derived from our estimates.
• If the hypothesised value of beta is within that range we cannot reject the null hypothesis at the chosen level of significance.
..or more simply
$\hat{\beta} \pm 1.96 \times \text{standard error}(\hat{\beta})$
1.96 is the 5% critical value for a two-tailed test, with over 120 degrees of freedom.
We could also define a 99% confidence interval:
$\hat{\beta} \pm 2.576 \times \text{standard error}(\hat{\beta})$
2.576 is the 1% critical value.
Let's construct a confidence interval for an example
            OLS estimate   Standard error   T stat.   Critical value
Intercept   .283           (.104)           2.721     1.960 (5%)
Exper       .004           (.001)           4.00      1.960 (5%)

$0.004 - 1.960 \times 0.001 \le \beta \le 0.004 + 1.960 \times 0.001$
Parameter estimate +/- (critical value × standard error) = [0.002, 0.006]
Does the value ‘0’ lie within that interval? No
We reject the null hypothesis that beta = 0
We could also say that an extra year of experience will increase wages by between 0.002 and 0.006.
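The interval above can be reproduced with a short Python sketch. The helper name `confidence_interval` is hypothetical; the estimate (0.004), standard error (0.001) and critical values (1.960 and 2.576) are taken from the slides.

```python
def confidence_interval(estimate, std_error, critical_value):
    """Return (lower, upper) = estimate -/+ critical_value * std_error."""
    margin = critical_value * std_error
    return estimate - margin, estimate + margin

# Coefficient on Exper from the example table.
beta_hat, se_beta = 0.004, 0.001

lower_95, upper_95 = confidence_interval(beta_hat, se_beta, 1.960)
lower_99, upper_99 = confidence_interval(beta_hat, se_beta, 2.576)

# H0: beta = 0 is rejected when 0 lies outside the interval.
zero_inside_95 = lower_95 <= 0.0 <= upper_95

print(f"95% interval: [{lower_95:.5f}, {upper_95:.5f}]")
print(f"99% interval: [{lower_99:.5f}, {upper_99:.5f}]")
print("0 inside the 95% interval:", zero_inside_95)
```

Rounded to three decimals, the 95% interval is the [0.002, 0.006] reported on the slide, and zero lies outside it.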
Probability values
• Another way of carrying out hypothesis testing is based on the use of probability values
• Compute the t statistic
• Check in the tables the probability of obtaining a value of the test statistic as large as or larger than that obtained in the example.
• This probability is called the p value (probability value).
• This is the lowest significance level at which a null hypothesis can be rejected.
Probability values: in practice
• Probability values are usually reported with the regression output in most computer software.
• To reject the null hypothesis we want the p-value to be quite small.
Relationship between significance level and probability level:
- We choose the significance level before running the regression. In practice, economists use the 10%, the 5% or the 1% significance level.
- The probability value gives us the exact lowest significance level at which the null hypothesis can be rejected, that is, the probability of making a type I error if we reject.
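A p-value can be approximated in Python without tables when the degrees of freedom are large. The sketch below assumes the standard normal approximation to the t distribution (the same approximation behind the 1.960 critical value used in the slides), for which the two-sided p-value is erfc(|t|/√2). The function name is hypothetical, and the t statistics are those of the confidence-interval example (2.721 and 4.00).

```python
import math

def two_sided_p_normal(t_stat):
    """Two-sided p-value under the standard normal approximation."""
    return math.erfc(abs(t_stat) / math.sqrt(2.0))

p_intercept = two_sided_p_normal(2.721)
p_exper = two_sided_p_normal(4.00)

for name, p in [("Intercept", p_intercept), ("Exper", p_exper)]:
    verdict = "reject H0" if p < 0.05 else "cannot reject H0"
    print(f"{name}: p = {p:.5f} -> {verdict} at the 5% level")
```

With few degrees of freedom the normal approximation understates the p-value, so the exact t distribution reported by the econometric package should be used instead.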
Summary
• We have looked at three possible ways of carrying out hypothesis testing.
• All three ways of carrying out the hypothesis test lead to the same conclusion.
• Computer output usually provides all the necessary information to easily carry out hypothesis testing.
Hypothesis testing
• Individual coefficient (t-test)
• Significance of all coefficients (F-test)
• Test for restriction (F-test)
• Test for stability over time (F-test)
• The normality test is a Chi-square test to see whether the error term is distributed normally (we don't discuss it here)
T-test example: model of R&D expenditure, n = 32
• R&D is a function of sales and the profit margin:
• $\log(rd) = -4.38 + 1.084 \log(sales) + 0.0217\, prof.marg.$
  Standard errors: (.47) (.060) (.0218)
• R-square = 0.918
Testing the overall significance of the sample regression
• The null hypothesis is that all the slope coefficients are equal to zero. The alternative hypothesis is that at least one of these coefficients is not equal to zero.
$H_0: \beta_2 = \beta_3 = \dots = \beta_k = 0$
• F-test
$F = \dfrac{ESS/df_1}{RSS/df_2} = \dfrac{ESS/(k-1)}{RSS/(n-k)} = \dfrac{R^2/(k-1)}{(1-R^2)/(n-k)}$
• If $F > F_c(k-1, n-k)$, reject H0, where $F_c(k-1, n-k)$ is the critical value, e.g. the critical value at the 5% level or the 1% level, with the first degree of freedom df1 = (k-1) and the second degree of freedom df2 = (n-k). k is the number of regressors including the intercept. n is the number of observations in the regression.
The overall significance example:
• Expectations-augmented Phillips curve for the US, 1970-82 (n = 13)
• Actual inflation rate Y (%), unemployment rate X2 (%), and expected inflation rate X3 (%)
$\hat{Y}_t = 7.1933 - 1.3925 X_{2t} + 1.4700 X_{3t}$
Standard errors: (1.5958) (0.3050) (0.1758)
$R^2 = 0.8766$
$F = \dfrac{R^2/(k-1)}{(1-R^2)/(n-k)} = \dfrac{0.8766/2}{(1-0.8766)/10} = 35.52$
• $F_{0.05}(2,10) = 4.10$ and $F_{0.01}(2,10) = 7.56$. The calculated F-value is greater than the critical value at 1%, so we reject the null hypothesis. We conclude that the coefficients are jointly significantly different from zero.
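The F statistic for overall significance is easy to recompute from $R^2$. The sketch below uses the Phillips curve figures from the slide ($R^2 = 0.8766$, n = 13, k = 3); the function name `overall_f` is hypothetical.

```python
def overall_f(r_squared, n, k):
    """F = [R^2 / (k - 1)] / [(1 - R^2) / (n - k)].

    n: number of observations; k: regressors including the intercept.
    """
    df1 = k - 1
    df2 = n - k
    return (r_squared / df1) / ((1.0 - r_squared) / df2)

f_stat = overall_f(0.8766, n=13, k=3)
print(f"F({3 - 1}, {13 - 3}) = {f_stat:.2f}")
```

The result is then compared with the tabulated critical value for (k-1, n-k) degrees of freedom at the chosen significance level.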
Testing for an additional variable
• To test whether the coefficient on an additional variable is significantly different from zero
Model 1: $Y_t = \beta_1 + \beta_2 X_{2t} + e_t$, with $R^2_{old}$ and $ESS_{old}$
Model 2: $Y_t = \beta_1 + \beta_2 X_{2t} + \beta_3 X_{3t} + e_t$, with $R^2_{new}$ and $ESS_{new}$
$F = \dfrac{(ESS_{new} - ESS_{old}) / \text{number of new variables}}{RSS_{new} / (n - k_{new})} = \dfrac{(R^2_{new} - R^2_{old}) / df_1}{(1 - R^2_{new}) / df_2}$
• Example: the R2 from model 1 is 0.9978, the R2 from model 2 is 0.9988, n = 15, $k_{new}$ = 3, df1 = 1, df2 = 12.
$F = \dfrac{(R^2_{new} - R^2_{old})/1}{(1 - R^2_{new})/12} = \dfrac{(0.9988 - 0.9978)/1}{(1 - 0.9988)/12} = 10.3978$
• $F > F_{0.01}(1,12) = 9.33$, so we reject the null hypothesis that $\beta_3 = 0$ at the 1% significance level.
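The same calculation can be sketched in Python; the function name `incremental_f` is hypothetical. One caveat: with the rounded $R^2$ values shown on the slide the formula gives about 10.0, not the reported 10.3978, presumably because the reported figure was computed from unrounded $R^2$ values. Either number exceeds the 1% critical value of 9.33.

```python
def incremental_f(r2_new, r2_old, n, k_new, n_added):
    """F test for adding n_added variables to a regression.

    k_new is the number of parameters (including the intercept)
    in the larger model.
    """
    df1 = n_added
    df2 = n - k_new
    return ((r2_new - r2_old) / df1) / ((1.0 - r2_new) / df2)

f_added = incremental_f(0.9988, 0.9978, n=15, k_new=3, n_added=1)
critical_1pct = 9.33  # F_0.01(1, 12), as quoted on the slide

print(f"F(1, 12) = {f_added:.4f}")
print("reject H0: beta_3 = 0 at 1%:", f_added > critical_1pct)
```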
Testing linear equality restriction
• Model 1: Unrestricted model, a double-log linear function:
$\ln Y_t = \beta_1 + \beta_2 \ln X_{2t} + \beta_3 \ln X_{3t} + e_t$
• Model 2: Restricted model
• If we assume constant returns to scale, then $\beta_2 + \beta_3 = 1$. By imposing this, we have:
$\ln Y_t = \beta_1 + (1 - \beta_3) \ln X_{2t} + \beta_3 \ln X_{3t} + e_t$
$\ln Y_t = \beta_1 + \ln X_{2t} + \beta_3 (\ln X_{3t} - \ln X_{2t}) + e_t$
$\ln Y_t - \ln X_{2t} = \beta_1 + \beta_3 \ln(X_{3t}/X_{2t}) + e_t$
$\ln(Y_t/X_{2t}) = \beta_1 + \beta_3 \ln(X_{3t}/X_{2t}) + e_t$ (restricted model)
Testing linear equality restriction
• Hypothesis testing
$H_0: \beta_2 + \beta_3 = 1$
$H_1: \beta_2 + \beta_3 \neq 1$
• Run regressions of both the unrestricted and restricted models and obtain the R2's or the RSS's, then calculate the F value:
$F = \dfrac{(RSS_R - RSS_{UR})/m}{RSS_{UR}/(n-k)} = \dfrac{(R^2_{UR} - R^2_R)/m}{(1 - R^2_{UR})/(n-k)}$
• Where $RSS_R$ = RSS of the restricted model, $RSS_{UR}$ = RSS of the unrestricted model, $R^2_R$ = R2 of the restricted model, $R^2_{UR}$ = R2 of the unrestricted model, and m = number of restrictions.
Testing linear equality restriction: example
• The Cobb-Douglas production function for Taiwanese agriculture, 1958-72.
• Unrestricted model
$\ln \hat{Y}_t = -3.3384 + 1.4988 \ln X_{2t} + 0.4899 \ln X_{3t}$
t = (-1.36) (2.78) (4.80)
$R^2_{UR} = 0.8890$, df = 12
• Restricted model
$\ln(\hat{Y}_t/X_{2t}) = 1.7086 + 0.6129 \ln(X_{3t}/X_{2t})$
t = (4.11) (6.57)
$R^2_R = 0.8489$, m = 1
• F?
Testing linear equality restriction: example
$F = \dfrac{(R^2_{UR} - R^2_R)/m}{(1 - R^2_{UR})/(n-k)} = \dfrac{(0.8890 - 0.8489)/1}{(1 - 0.8890)/12} = 4.36$
$F_{0.05}(1,12) = 4.75$, so $F < F_{0.05}(1,12)$
• We cannot reject the null hypothesis that $\beta_2 + \beta_3 = 1$ at the 5% significance level. However, the F value and the critical value are close, so we have to be careful about making a decision. For example, we can reject the hypothesis at the 10% level.
Testing for structural stability (Chow test, use of F-value again)
• This is to test whether the same model Yt = f(Xt) can be used for different time periods. If not, there is a structural break in the model.
• There are three ways in which the same model may fail to work for different time periods:
  ◦ The intercept may be different
  ◦ The coefficients may be different
  ◦ Both the intercept and the coefficients are different
• Hence, it is important that we test for the structural stability of the model.
Testing for structural stability (Chow test, use of F-value again)
• Suppose we break the entire data period into two sub-periods, then run the model for each of the two sub-periods and for the entire data period to obtain the RSS's from three different regressions.
  ◦ $RSS_{all}$ = RSS from the regression that uses all the observations from the entire data period; the number of observations is n = n1 + n2.
  ◦ $RSS_1$ = RSS from the regression that uses only the observations of the first sub-period; the number of observations is n1.
  ◦ $RSS_2$ = RSS from the regression that uses only the observations of the second sub-period; the number of observations is n2.
  ◦ $RSS_{1+2} = RSS_1 + RSS_2$.
  ◦ k = number of regressors including the intercept.
$F = \dfrac{(RSS_{all} - RSS_{1+2})/k}{RSS_{1+2}/(n - 2k)}$
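The Chow statistic follows directly from the three residual sums of squares. The sketch below implements the formula on the slide; the function name `chow_f` and all the numbers passed to it are hypothetical values chosen only to show the arithmetic.

```python
def chow_f(rss_all, rss_1, rss_2, n_1, n_2, k):
    """Chow test: F = [(RSS_all - RSS_1+2) / k] / [RSS_1+2 / (n - 2k)]."""
    rss_split = rss_1 + rss_2      # RSS_1+2
    n = n_1 + n_2                  # observations in the full period
    return ((rss_all - rss_split) / k) / (rss_split / (n - 2 * k))

# Hypothetical residual sums of squares and sample sizes (not from the lecture).
f_chow = chow_f(rss_all=120.0, rss_1=40.0, rss_2=50.0, n_1=20, n_2=20, k=2)
print(f"Chow F(2, 36) = {f_chow:.2f}")
```

The statistic has k and n - 2k degrees of freedom; a value above the critical F value points to a structural break.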
Testing for structural stability
• The null hypothesis is no structural change
• If the F value does not exceed the critical F value, we do not reject the null hypothesis of parameter stability
• If the F value exceeds the critical F value, we reject the hypothesis of parameter stability
Looking at the p-values for the F test
• We can also look at the p-value: the probability, if the null hypothesis is true, of obtaining a test statistic at least as large as the one computed from the estimated coefficients and their standard errors.
• These are computed by the econometric package.
• For a 5% significance level, you reject H0 if the p-value < 0.05.
Example
      Source |       SS       df       MS              Number of obs =    6763
-------------+------------------------------           F(  3,  6759) =  644.53
       Model |  357.752575     3  119.250858           Prob > F      =  0.0000
    Residual |  1250.54352  6759  .185019014           R-squared     =  0.2224
-------------+------------------------------           Adj R-squared =  0.2221
       Total |  1608.29609  6762  .237843255           Root MSE      =  .43014

------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          jc |   .0666967   .0068288     9.77   0.000     .0533101    .0800833
        univ |   .0768762   .0023087    33.30   0.000     .0723504    .0814021
       exper |   .0049442   .0001575    31.40   0.000     .0046355    .0052529
       _cons |   1.472326   .0210602    69.91   0.000     1.431041     1.51361
------------------------------------------------------------------------------
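To see where the summary figures in this output come from, the following Python sketch recomputes them from the sums of squares and degrees of freedom in the table. Variable names are illustrative; all input numbers are copied from the Stata output.

```python
import math

# Sums of squares and degrees of freedom from the Stata table.
ss_model, df_model = 357.752575, 3
ss_resid, df_resid = 1250.54352, 6759
n_obs = 6763

ms_model = ss_model / df_model            # mean square of the model
ms_resid = ss_resid / df_resid            # mean square of the residual
f_stat = ms_model / ms_resid              # F(3, 6759)
r_squared = ss_model / (ss_model + ss_resid)
adj_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / df_resid
root_mse = math.sqrt(ms_resid)

# t ratio for the coefficient on jc: estimate divided by standard error.
t_jc = 0.0666967 / 0.0068288

print(f"F         = {f_stat:.2f}")
print(f"R-squared = {r_squared:.4f}")
print(f"Adj R-sq  = {adj_r_squared:.4f}")
print(f"Root MSE  = {root_mse:.5f}")
print(f"t (jc)    = {t_jc:.2f}")
```

Each figure matches the value reported by Stata up to rounding.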
Introducing qualitative factors in our analysis
• What do we mean by qualitative factors?
• Examples:
  ◦ Do men earn more than women?
  ◦ Do students who live on campus get higher scores in their exams compared to students who live off campus?
• How do we account for these factors?
• Dummy variable = a binary variable (0/1) where 1 indicates that a particular event occurs (e.g. living on campus) or a particular feature is present (e.g. being beautiful).
Dummy variables
• The name of the dummy indicates how it is constructed.
  ◦ A dummy called ‘female’ equals 1 when the person is female and 0 otherwise.
  ◦ A dummy called ‘campus’ equals 1 when the student lives on campus and 0 otherwise.
• Different types of dummies
  ◦ Shift/intercept dummies
  ◦ Slope dummies: a dummy interacted with X
  ◦ Interaction dummies: the product of two dummy variables
Case 1: intercept or shift dummy
• Let's start with the simple case of a single dummy explanatory variable:
$wage_i = \alpha + \delta_0 female_i + \beta_1 education_i + u_i$
• Although there are 2 genders, we only need one dummy variable. In this case we are using a dummy called ‘female’. Remember that female + male = 1.
• The quality or feature that we set equal to zero is called the base group or the benchmark group (or reference group).
• Example: men are the base group in the previous case.
• We cannot include a dummy for male and one for female because this will cause a problem of perfect collinearity
Model with a dummy variable
$wage_i = \alpha + \delta_0 female_i + \beta_1 education_i + u_i$
When a particular worker is male, our equation will look like:
$wage_i = \alpha + \delta_0 \times 0 + \beta_1 education_i + u_i$
This shows that alpha is the intercept for male workers.
The intercept for female workers is alpha + delta.
After estimating our equation we may find the following results for $\delta_0$:
• Not significantly different from zero
• Has a negative sign
• Has a positive sign
Graphical analysis
[Figure: wage plotted against education, with intercepts $\alpha$ and $\alpha + \delta_0$]
In this case women are discriminated against relative to men.
Hypothesis testing
• We could simply carry out a significance test to test whether the dummy is significantly different from zero (two-tailed/sided test):
$H_0: \delta_0 = 0$
$H_1: \delta_0 \neq 0$
• Or we might want to test the null against the alternative that there is discrimination against women (one-tailed/sided test):
$H_0: \delta_0 = 0$
$H_1: \delta_0 < 0$
Possible outcomes of the test for the wage example
$wage_i = \alpha + \delta_0 female_i + \beta_1 education_i + u_i$

Possible results for $\delta_0$ | Interpretation
Negative and significantly different from zero | Women earn less than men, given the same level of education. We find evidence of discrimination against women.
Positive and significantly different from zero | Women earn more than men, given the same level of education. We find evidence of discrimination against men.
Not significantly different from zero | There is no significant difference between male and female hourly earnings. We do not find any evidence of discrimination.
Interaction dummies: interacting dummies with explanatory variables
• We can also interact dummy variables with other explanatory variables that are not dummy variables to test whether there are differences in slope.
• For example, we want to estimate whether the returns to education, in terms of wages, are the same for men and women.
Graphical representation
[Figure: wage ($ per hour) against education (years, axis marked at 9 and 18), with separate intercepts for female and for male workers]
How do we formulate this model for OLS estimation?
$\log(wage_i) = \alpha + \delta_0 female_i + \beta_1 educ_i + \delta_1 female_i \times educ_i + u_i$
We could have different situations according to the coefficient estimates.
Example: let's assume that all variables are statistically significant.
$\log(wage_i) = 0.389 - 0.227 female_i + 0.082 educ_i - 0.0056 female_i \times educ_i$
• 0.082 is the return to education for men.
• Women's returns to education are 0.56 percentage points lower than men's.
• To find the returns to education for women, take the difference between the two coefficients: 0.082 - 0.0056 = 0.0764.
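The returns to education implied by the estimated equation can be checked with a short Python sketch. The coefficient values are the ones on the slide; the signs are an assumption where the slide text is garbled (the negative interaction term follows from the statement that women's returns are lower, the negative sign on female is assumed). The function name is hypothetical.

```python
def predicted_log_wage(female, educ):
    """Fitted log(wage) from the slide's estimates.

    female: 1 for women, 0 for men; educ: years of education.
    The signs on the female terms are assumed (see the note above).
    """
    return 0.389 - 0.227 * female + 0.082 * educ - 0.0056 * female * educ

# Return to one extra year of education = change in fitted log(wage).
return_men = predicted_log_wage(0, 13) - predicted_log_wage(0, 12)
return_women = predicted_log_wage(1, 13) - predicted_log_wage(1, 12)

print(f"return to education, men:   {return_men:.4f}")
print(f"return to education, women: {return_women:.4f}")
print(f"difference (women - men):   {return_women - return_men:.4f}")
```

Because the model is in logs, 0.082 and 0.0764 read as roughly 8.2% and 7.6% higher wages per extra year of education.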
Exercise 5.1
Exercise 5.2
Test bank for advanced assessment interpreting findings and formulating diffe...robinsonayot
 
Current scenario of Energy Retail utilities market in UK
Current scenario of Energy Retail utilities market in UKCurrent scenario of Energy Retail utilities market in UK
Current scenario of Energy Retail utilities market in UKParas Gupta
 
najoomi asli amil baba kala jadu expert rawalpindi bangladesh uk usa
najoomi asli amil baba kala jadu expert rawalpindi bangladesh uk usanajoomi asli amil baba kala jadu expert rawalpindi bangladesh uk usa
najoomi asli amil baba kala jadu expert rawalpindi bangladesh uk usamazhshah570
 
Webinar on E-Invoicing for Fintech Belgium
Webinar on E-Invoicing for Fintech BelgiumWebinar on E-Invoicing for Fintech Belgium
Webinar on E-Invoicing for Fintech BelgiumFinTech Belgium
 
cost-volume-profit analysis.ppt(managerial accounting).pptx
cost-volume-profit analysis.ppt(managerial accounting).pptxcost-volume-profit analysis.ppt(managerial accounting).pptx
cost-volume-profit analysis.ppt(managerial accounting).pptxazadalisthp2020i
 
The Pfandbrief Roundtable 2024 - Covered Bonds
The Pfandbrief Roundtable 2024 - Covered BondsThe Pfandbrief Roundtable 2024 - Covered Bonds
The Pfandbrief Roundtable 2024 - Covered BondsNeil Day
 
Significant AI Trends for the Financial Industry in 2024 and How to Utilize Them
Significant AI Trends for the Financial Industry in 2024 and How to Utilize ThemSignificant AI Trends for the Financial Industry in 2024 and How to Utilize Them
Significant AI Trends for the Financial Industry in 2024 and How to Utilize Them360factors
 
GIFT City Overview India's Gateway to Global Finance
GIFT City Overview  India's Gateway to Global FinanceGIFT City Overview  India's Gateway to Global Finance
GIFT City Overview India's Gateway to Global FinanceGaurav Kanudawala
 
black magic removal amil baba in pakistan karachi islamabad america canada uk...
black magic removal amil baba in pakistan karachi islamabad america canada uk...black magic removal amil baba in pakistan karachi islamabad america canada uk...
black magic removal amil baba in pakistan karachi islamabad america canada uk...batoole333
 
amil baba in australia amil baba in canada amil baba in london amil baba in g...
amil baba in australia amil baba in canada amil baba in london amil baba in g...amil baba in australia amil baba in canada amil baba in london amil baba in g...
amil baba in australia amil baba in canada amil baba in london amil baba in g...israjan914
 
Black magic specialist in Saudi Arabia (Kala jadu expert in UK) Bangali Amil ...
Black magic specialist in Saudi Arabia (Kala jadu expert in UK) Bangali Amil ...Black magic specialist in Saudi Arabia (Kala jadu expert in UK) Bangali Amil ...
Black magic specialist in Saudi Arabia (Kala jadu expert in UK) Bangali Amil ...batoole333
 
Q1 2024 Conference Call Presentation vF.pdf
Q1 2024 Conference Call Presentation vF.pdfQ1 2024 Conference Call Presentation vF.pdf
Q1 2024 Conference Call Presentation vF.pdfAdnet Communications
 
Lion One Corporate Presentation May 2024
Lion One Corporate Presentation May 2024Lion One Corporate Presentation May 2024
Lion One Corporate Presentation May 2024Adnet Communications
 
Bank of Tomorrow White Paper For Reading
Bank of Tomorrow White Paper For ReadingBank of Tomorrow White Paper For Reading
Bank of Tomorrow White Paper For ReadingNghiaPham100
 
Shrambal_Distributors_Newsletter_May-2024.pdf
Shrambal_Distributors_Newsletter_May-2024.pdfShrambal_Distributors_Newsletter_May-2024.pdf
Shrambal_Distributors_Newsletter_May-2024.pdfvikashdidwania1
 
20240419-SMC-submission-Annual-Superannuation-Performance-Test-–-design-optio...
20240419-SMC-submission-Annual-Superannuation-Performance-Test-–-design-optio...20240419-SMC-submission-Annual-Superannuation-Performance-Test-–-design-optio...
20240419-SMC-submission-Annual-Superannuation-Performance-Test-–-design-optio...Henry Tapper
 
Famous Kala Jadu, Kala ilam specialist in USA and Bangali Amil baba in Saudi ...
Famous Kala Jadu, Kala ilam specialist in USA and Bangali Amil baba in Saudi ...Famous Kala Jadu, Kala ilam specialist in USA and Bangali Amil baba in Saudi ...
Famous Kala Jadu, Kala ilam specialist in USA and Bangali Amil baba in Saudi ...mazhshah570
 
Pitch-deck CopyFinancial and MemberForex.ppsx
Pitch-deck CopyFinancial and MemberForex.ppsxPitch-deck CopyFinancial and MemberForex.ppsx
Pitch-deck CopyFinancial and MemberForex.ppsxFuadS2
 

Recently uploaded (20)

Abortion pills in Dammam Saudi Arabia | +966572737505 |Get Cytotec
Abortion pills in Dammam Saudi Arabia | +966572737505 |Get CytotecAbortion pills in Dammam Saudi Arabia | +966572737505 |Get Cytotec
Abortion pills in Dammam Saudi Arabia | +966572737505 |Get Cytotec
 
Test bank for advanced assessment interpreting findings and formulating diffe...
Test bank for advanced assessment interpreting findings and formulating diffe...Test bank for advanced assessment interpreting findings and formulating diffe...
Test bank for advanced assessment interpreting findings and formulating diffe...
 
Current scenario of Energy Retail utilities market in UK
Current scenario of Energy Retail utilities market in UKCurrent scenario of Energy Retail utilities market in UK
Current scenario of Energy Retail utilities market in UK
 
najoomi asli amil baba kala jadu expert rawalpindi bangladesh uk usa
najoomi asli amil baba kala jadu expert rawalpindi bangladesh uk usanajoomi asli amil baba kala jadu expert rawalpindi bangladesh uk usa
najoomi asli amil baba kala jadu expert rawalpindi bangladesh uk usa
 
Webinar on E-Invoicing for Fintech Belgium
Webinar on E-Invoicing for Fintech BelgiumWebinar on E-Invoicing for Fintech Belgium
Webinar on E-Invoicing for Fintech Belgium
 
cost-volume-profit analysis.ppt(managerial accounting).pptx
cost-volume-profit analysis.ppt(managerial accounting).pptxcost-volume-profit analysis.ppt(managerial accounting).pptx
cost-volume-profit analysis.ppt(managerial accounting).pptx
 
The Pfandbrief Roundtable 2024 - Covered Bonds
The Pfandbrief Roundtable 2024 - Covered BondsThe Pfandbrief Roundtable 2024 - Covered Bonds
The Pfandbrief Roundtable 2024 - Covered Bonds
 
Significant AI Trends for the Financial Industry in 2024 and How to Utilize Them
Significant AI Trends for the Financial Industry in 2024 and How to Utilize ThemSignificant AI Trends for the Financial Industry in 2024 and How to Utilize Them
Significant AI Trends for the Financial Industry in 2024 and How to Utilize Them
 
GIFT City Overview India's Gateway to Global Finance
GIFT City Overview  India's Gateway to Global FinanceGIFT City Overview  India's Gateway to Global Finance
GIFT City Overview India's Gateway to Global Finance
 
black magic removal amil baba in pakistan karachi islamabad america canada uk...
black magic removal amil baba in pakistan karachi islamabad america canada uk...black magic removal amil baba in pakistan karachi islamabad america canada uk...
black magic removal amil baba in pakistan karachi islamabad america canada uk...
 
amil baba in australia amil baba in canada amil baba in london amil baba in g...
amil baba in australia amil baba in canada amil baba in london amil baba in g...amil baba in australia amil baba in canada amil baba in london amil baba in g...
amil baba in australia amil baba in canada amil baba in london amil baba in g...
 
Black magic specialist in Saudi Arabia (Kala jadu expert in UK) Bangali Amil ...
Black magic specialist in Saudi Arabia (Kala jadu expert in UK) Bangali Amil ...Black magic specialist in Saudi Arabia (Kala jadu expert in UK) Bangali Amil ...
Black magic specialist in Saudi Arabia (Kala jadu expert in UK) Bangali Amil ...
 
Q1 2024 Conference Call Presentation vF.pdf
Q1 2024 Conference Call Presentation vF.pdfQ1 2024 Conference Call Presentation vF.pdf
Q1 2024 Conference Call Presentation vF.pdf
 
Lion One Corporate Presentation May 2024
Lion One Corporate Presentation May 2024Lion One Corporate Presentation May 2024
Lion One Corporate Presentation May 2024
 
Obat Penggugur Kandungan Aman Bagi Ibu Menyusui 087776558899
Obat Penggugur Kandungan Aman Bagi Ibu Menyusui  087776558899Obat Penggugur Kandungan Aman Bagi Ibu Menyusui  087776558899
Obat Penggugur Kandungan Aman Bagi Ibu Menyusui 087776558899
 
Bank of Tomorrow White Paper For Reading
Bank of Tomorrow White Paper For ReadingBank of Tomorrow White Paper For Reading
Bank of Tomorrow White Paper For Reading
 
Shrambal_Distributors_Newsletter_May-2024.pdf
Shrambal_Distributors_Newsletter_May-2024.pdfShrambal_Distributors_Newsletter_May-2024.pdf
Shrambal_Distributors_Newsletter_May-2024.pdf
 
20240419-SMC-submission-Annual-Superannuation-Performance-Test-–-design-optio...
20240419-SMC-submission-Annual-Superannuation-Performance-Test-–-design-optio...20240419-SMC-submission-Annual-Superannuation-Performance-Test-–-design-optio...
20240419-SMC-submission-Annual-Superannuation-Performance-Test-–-design-optio...
 
Famous Kala Jadu, Kala ilam specialist in USA and Bangali Amil baba in Saudi ...
Famous Kala Jadu, Kala ilam specialist in USA and Bangali Amil baba in Saudi ...Famous Kala Jadu, Kala ilam specialist in USA and Bangali Amil baba in Saudi ...
Famous Kala Jadu, Kala ilam specialist in USA and Bangali Amil baba in Saudi ...
 
Pitch-deck CopyFinancial and MemberForex.ppsx
Pitch-deck CopyFinancial and MemberForex.ppsxPitch-deck CopyFinancial and MemberForex.ppsx
Pitch-deck CopyFinancial and MemberForex.ppsx
 

Advanced Econometrics L5-6.pptx

  • 1. Advanced econometrics and Stata: L5-6 (1) Hypothesis testing, multi-regression. Dr. Chunxia Jiang, Business School, University of Aberdeen, UK. Beijing, 17-26 Nov 2019
  • 2. Topics and schedule: sessions plan
      ◦ Evening: L1-2 Introduction to Econometrics and Stata
      ◦ Evening: L3-4 Data, single regression
      ◦ Morning: L5-6 Hypothesis testing, multi-regression, violation of assumptions
      ◦ Afternoon: Exercises and practice
      ◦ Morning: L7-8 Time series models
      ◦ Evening: L9-10 Panel data models & endogeneity
      ◦ Morning: Exercises and practice
      ◦ Afternoon: L11-12 Frontier 1: SFA
      ◦ Evening: L13-14 Frontier 2: DEA
      ◦ Evening: L15-16 DID
      ◦ Morning: Revision
      ◦ Afternoon: Exam
  • 3. Review: Data and simple regression
      ◦ Basic data analysis: summary statistics
      ◦ One variable: mean or average value; minimum and maximum value; mode and median; variance and standard deviation
      ◦ Two variables: covariance; correlation; cross-plot (or scattergram or scatter plot)
      ◦ Single regression/multivariate regression analysis
      ◦ How do we tell that OLS is a good estimator of the PRF? Assumptions; R-square
  • 4. Preview
      ◦ Statistical inference
      ◦ Derivation of the variance and standard error of our coefficient estimates
      ◦ How to use the variance and standard error to assess our model and test hypotheses
      ◦ Hypothesis testing: null and alternative hypothesis
      ◦ Testing hypotheses for single parameters
      ◦ Testing joint hypotheses (for two or more parameters)
      ◦ Dummy variables
      ◦ Functional form
  • 5. Statistical inference
      ◦ We want to know how good $\hat{\alpha}$ and $\hat{\beta}$ are as estimates of the population parameters, $\alpha$ and $\beta$.
      ◦ How reliable are our estimates?
      ◦ How reliable is the least squares estimation procedure?
      ◦ We need to know the nature of the variables in our regression model: which variables are random and which variables are deterministic.
      ◦ Let's look again at the assumptions of the CLRM.
  • 6. Assumptions of the Classical Linear Regression Model (CLRM)
      ◦ (1) The regression model is linear in the parameters.
      ◦ (2) X values are fixed in repeated sampling.
      ◦ (3) The number of observations must be greater than the number of parameters to be estimated.
      ◦ (4) There must be variability in the X values.
      ◦ (5) The explanatory variable X is uncorrelated with the error term: $\mathrm{Cov}(u_i, X_i) = 0$
      ◦ (6) There is no perfect multicollinearity.
      ◦ (7) Given the value of X, the expected value of the error term is zero: $E(u \mid X) = 0$
      ◦ (8) The variance of the error term is constant (homoscedasticity): $\mathrm{var}(u_i \mid X_i) = \sigma^2$
      ◦ (9) There is no correlation between two error terms (no autocorrelation): $\mathrm{cov}(u_i, u_j \mid X_i, X_j) = 0$
      ◦ (10) The disturbance term must be normally distributed: $u_t \sim N(0, \sigma^2)$
      ◦ (11) The model is correctly specified.
  • 7. What is random and what is non-random in our regression model?
      ◦ Random variables: $u$ and $Y$
      ◦ Non-random variables: $X$
      ◦ To establish the properties of the estimators $\hat{\alpha}$ and $\hat{\beta}$ we need to know:
      ◦ The mean (expected value)
      ◦ The variance and covariance
      ◦ The probability distribution of the error term, $\hat{\alpha}$ and $\hat{\beta}$
  • 8. Distribution of the error term & coefficient estimates
      ◦ Assumption: the error term is normally distributed with mean 0 and variance $\sigma^2$: $u_i \sim N(0, \sigma^2)$
      ◦ Important property of the normal distribution: any linear function of normally distributed variables is itself normally distributed.
      ◦ Therefore alpha hat and beta hat are also normally distributed: $\hat{\alpha} \sim N(?, ?)$, $\hat{\beta} \sim N(?, ?)$
      ◦ We now need to find their mean and variance.
      ◦ Residuals: $\hat{u}_t = y_t - \hat{y}_t = y_t - \hat{\alpha} - \hat{\beta} x_t$
  • 9. Mean of alpha hat and beta hat
      ◦ Under the CLRM assumptions, the expected values of the OLS estimators of alpha and beta are: $E(\hat{\alpha}) = \alpha$ and $E(\hat{\beta}) = \beta$
      ◦ The expected values of $\hat{\alpha}$ and $\hat{\beta}$ are equal to the true parameter values. This means that the estimators are unbiased.
  • 10. A 'little story' about repeated sampling
      ◦ Let's imagine that we want to estimate the average height of our class. Our class is our population. From this population we draw a random sample of 2 students, we measure their height and we take the average: this is our first estimate, $\hat{\beta}_1$.
      ◦ We then randomly draw another sample of 2 students and we obtain a different estimate; call it $\hat{\beta}_2$.
      ◦ We carry on drawing different samples, and each time we obtain a slightly different estimate for the average height of our class. That is why our coefficient estimates are random variables.
      ◦ We assume that this random variable is normally distributed.
      ◦ Using our sample we will produce a value of $\hat{\beta}$ that is very close to the population mean.
      ◦ By randomly drawing a sample we are very likely to be very close to the population mean.
  • 11. ...and what we actually do
      ◦ In reality we do not use repeated sampling; we only have 1 sample. We could be very unlucky and choose a sample that is very far from the population mean. But there is a high probability that our random sample will give a value of $\hat{\beta}$ which is very close to the real population parameter.
      ◦ We recognise that there is always a margin of error in our estimates. This is represented by the variance (or by the standard error) of $\hat{\beta}$.
  • 12. Variance of alpha hat and beta hat
      ◦ If the assumptions of the CLRM hold, it is possible to derive the variances of the estimators alpha hat and beta hat and obtain the following expressions:
        $\mathrm{var}(\hat{\alpha}) = \sigma^2 \frac{\sum x_i^2}{N \sum (x_i - \bar{x})^2}$
        $\mathrm{var}(\hat{\beta}) = \frac{\sigma^2}{\sum (x_i - \bar{x})^2}$
  • 13. Distribution of alpha hat and beta hat
      ◦ Wrapping up what we have said so far:
        $u_i \sim N(0, \sigma^2)$
        $\hat{\alpha} \sim N\left(\alpha, \; \sigma^2 \frac{\sum x_i^2}{N \sum (x_i - \bar{x})^2}\right)$
        $\hat{\beta} \sim N\left(\beta, \; \frac{\sigma^2}{\sum (x_i - \bar{x})^2}\right)$
  • 14. Standardised normal distribution
      ◦ Let's now go back to the distribution of our coefficient beta hat: $\hat{\beta} \sim N\left(\beta, \; \frac{\sigma^2}{\sum (x_i - \bar{x})^2}\right)$
      ◦ A more convenient way to represent this distribution is by constructing a variable Z, obtained by subtracting the mean and dividing by the standard error (standardised normal distribution): $Z = \frac{\hat{\beta} - \beta}{\sqrt{\sigma^2 / \sum (x_i - \bar{x})^2}} \sim N(0, 1)$
      ◦ SE: the standard deviation of the sampling distribution of the estimator, or an estimate of that standard deviation.
  • 15. The t-distribution
      ◦ Remember: we do not know the true variance of the error term, $\sigma^2$. We have to replace it with its estimate.
      ◦ We use the variance of the residuals $\hat{u}_t$: $\hat{\sigma}^2 = \frac{\sum \hat{u}_t^2}{N - 2}$
      ◦ By doing so we obtain a random variable with a slightly different probability distribution.
      ◦ We now have a t-distribution with N-2 degrees of freedom: $t = \frac{\hat{\beta} - \beta}{\sqrt{\hat{\sigma}^2 / \sum (x_i - \bar{x})^2}} \sim t_{(N-2)}$
      ◦ Or more simply: $t = \frac{\hat{\alpha} - \alpha}{se(\hat{\alpha})} \sim t_{(N-2)}$ and $t = \frac{\hat{\beta} - \beta}{se(\hat{\beta})} \sim t_{(N-2)}$
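The variance formulas and the residual-based estimate of the error variance can be traced through by hand on a tiny data set. The following is a minimal sketch in plain Python; the function name `simple_ols` and the four data points are illustrative assumptions, not part of the lecture.

```python
import math

def simple_ols(x, y):
    """OLS for y = alpha + beta*x + u, with CLRM standard errors (hypothetical helper)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)                      # sum (x_i - x_bar)^2
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = y_bar - beta * x_bar
    residuals = [yi - alpha - beta * xi for xi, yi in zip(x, y)]
    sigma2_hat = sum(u ** 2 for u in residuals) / (n - 2)         # N - 2 degrees of freedom
    se_beta = math.sqrt(sigma2_hat / sxx)
    se_alpha = math.sqrt(sigma2_hat * sum(xi ** 2 for xi in x) / (n * sxx))
    return {
        "alpha": alpha, "beta": beta,
        "sigma2_hat": sigma2_hat,
        "se_alpha": se_alpha, "se_beta": se_beta,
        "t_beta": beta / se_beta,                                 # t ratio for H0: beta = 0
    }

# Made-up data, for illustration only
result = simple_ols([1, 2, 3, 4], [1, 3, 2, 4])
print(result)
```

With only four observations there are 2 degrees of freedom, so the t ratio would be compared with a t(2) critical value.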
  • 16. Degrees of freedom
      ◦ It is the total number of observations in the sample (n) less the number of independent (linear) constraints or restrictions put on them.
      ◦ It is the number of independent observations out of a total of n observations.
      ◦ The general rule is df = n - number of parameters estimated.
  • 17. From the standardised normal distribution to the t distribution
        $\hat{\beta} \sim N\left(\beta, \; \frac{\sigma^2}{\sum (x_i - \bar{x})^2}\right)$
        $Z = \frac{\hat{\beta} - \beta}{\sqrt{\sigma^2 / \sum (x_i - \bar{x})^2}} \sim N(0, 1)$
        $t = \frac{\hat{\beta} - \beta}{\sqrt{\hat{\sigma}^2 / \sum (x_i - \bar{x})^2}} \sim t_{(N-2)}$
      ◦ Very important for hypothesis testing!
  • 18. Probability density function of a t distribution
      ◦ The t distribution is symmetric with zero mean and is generally flatter than the normal distribution. As the degrees of freedom increase, the t distribution approaches the normal distribution.
      ◦ The area under the curve measures the probability of a certain event occurring.
  • 19. Testing hypotheses
      ◦ We now want to use all this information to carry out hypothesis testing.
      ◦ We derive our hypotheses from the theory.
      ◦ For example, in the CAPM model, the theory suggests that the intercept should be equal to zero: $R_{jt} - R_{ft} = \alpha + \beta (R_{mt} - R_{ft}) + u_{jt}$, where $R_{jt} - R_{ft}$ is the excess return of fund j at time t and $\beta (R_{mt} - R_{ft})$ is the expected risk adjusted return.
      ◦ The next slide lists the steps to follow to carry out hypothesis testing.
  • 20. How to carry out hypothesis testing: using the t test
      ◦ 1. Draw up the null hypothesis (H0).
      ◦ 2. Draw up the alternative hypothesis (H1 or HA). This will determine the type of test to carry out (one-tailed or two-tailed test).
      ◦ 3. Compute your t statistic (or t ratio).
      ◦ 4. Choose your significance level.
      ◦ 5. Find the critical value in the table.
      ◦ 6. Compare your t statistic to the critical value and decide the outcome of the test.
  • 21. Null hypothesis
      ◦ We can set up hypotheses on any parameter of our model.
      ◦ Step 1: Draw up the null hypothesis (H0).
      ◦ In the case of the CAPM model, $R_{jt} - R_{ft} = \alpha + \beta (R_{mt} - R_{ft}) + u_{jt}$, this will be H0: α = 0.
  • 22. Alternative hypothesis
      ◦ The alternative is more tricky.
      ◦ Usually the alternative is just that the null is wrong: H1: α ≠ 0 (fund managers earn non-zero risk adjusted excess returns).
      ◦ But sometimes it is more specific: H1: α < 0 (fund managers underperform) or H1: α > 0 (fund managers outperform).
  • 23. Construct your t statistic
      ◦ Our primary interest is in testing the following null hypothesis: $H_0: \beta = 0$
      ◦ Is the unknown population parameter equal to zero?
      ◦ In order to test this hypothesis we first compute the t statistic or t ratio: $t = \frac{\hat{\beta} - \beta}{se(\hat{\beta})} = \frac{\hat{\beta} - 0}{se(\hat{\beta})}$
      ◦ The t ratio measures how many estimated standard deviations beta hat is away from zero.
      ◦ How far is this value from zero?
  • 24. Choose your significance level
      ◦ We want to test $H_0: \beta = 0$ against the alternative $H_1: \beta > 0$.
      ◦ We have to choose a significance level: the probability of rejecting H0 when it is true.
      ◦ Usually we choose the 5% significance level (5% of the time we make the error of rejecting H0 when it is true). But we might also want to use the 10% and the 1% significance levels.
  • 25. One-tailed test/one-sided test
      ◦ In order to reject H0 in favour of H1 we need a sufficiently large value of t.
      ◦ How large?
      ◦ Look at the t tables. Find the critical value corresponding to the chosen significance level and number of degrees of freedom.
      ◦ If t > critical value, we reject H0.
      ◦ If t < critical value, we cannot reject H0.
  • 26. t distribution and critical values
      ◦ Suppose we have to find the critical value for the 5% significance level and 20 degrees of freedom. Use your tables.
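When printed tables are not to hand, a critical value can be approximated numerically. The sketch below is one possible approach using only the Python standard library: it integrates the t density with Simpson's rule and finds the quantile by bisection. The function names are illustrative assumptions; a statistics package would normally do this in one call.

```python
import math

def t_pdf(t, df):
    """Density of the Student t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log(1 + t * t / df))

def t_upper_tail(x, df, steps=1000):
    """P(T > x) for x >= 0, using Simpson's rule on [0, x] (steps must be even)."""
    if x == 0:
        return 0.5
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def t_critical(alpha, df, two_tailed=False):
    """Critical value c such that the rejection region has total probability alpha."""
    tail = alpha / 2 if two_tailed else alpha
    lo, hi = 0.0, 50.0
    for _ in range(50):                 # bisection on the decreasing tail probability
        mid = (lo + hi) / 2
        if t_upper_tail(mid, df) > tail:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

one_tailed_20 = t_critical(0.05, 20)                   # 5%, 20 df, one-tailed
two_tailed_20 = t_critical(0.05, 20, two_tailed=True)  # 5%, 20 df, two-tailed
print(round(one_tailed_20, 3), round(two_tailed_20, 3))
```

The same functions can be called with 3 degrees of freedom to check values for very small samples.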
  • 27. Example: the CAPM model
      ◦ Dependent variable (Y): excess return on portfolio XXX
      ◦ Independent variable (X): excess return on the market portfolio
      ◦ Regression statistics:
        R Square             0.928
        Adjusted R Square    0.903
        Standard Error       3.179
        Observations         5
      ◦ Coefficient estimates:
                        Coefficients   Standard Error   t Stat
        Intercept       -1.737         4.114            -0.422
        X Variable 1     1.642         0.265             6.200
      ◦ We are testing whether alpha (the intercept) = 0, against the alternative that it is > 0.
  • 28. Example: the CAPM model
      ◦ This is a very small sample with only 5 observations and 3 degrees of freedom. The critical value from the t distribution with 3 degrees of freedom at the 5% significance level is 3.182 for a two-tailed test and 2.353 for a one-tailed test. The t statistic for the intercept (-0.422) is smaller than both in absolute value, so in both cases we cannot reject the null hypothesis.
  • 29. Test of significance
      ◦ The test we have just seen is very common in econometric analysis.
      ◦ We are usually interested in checking that the variables included in our model are relevant.
      ◦ We carry out a test of significance.
      ◦ For this test we always have H0: β = 0 and H1: β ≠ 0.
      ◦ It is always a two-sided test.
  • 30. An example
      ◦ A simple model: explaining how the demand for computers is related to personal income.
      ◦ Y = number of PCs per 100 persons
      ◦ X = per capita income (in $)
      ◦ We use cross-sectional data for 34 countries, for example:
        Country      PCs (per 100 persons)   Per capita income ($)
        Argentina    8.2                     11410
        China        2.76                    4980
        Canada       48.7                    30040
        India        0.72                    2880
  • 31. Results from our OLS regression analysis
      ◦ $\hat{Y}_i = -6.5833 + 0.0018 X_i$, with standard errors (2.7437) and (0.00014); $R^2 = 0.829$
      ◦ Interpretation of the results:
      ◦ There is a positive relationship between the number of computers and per capita income. If income increases by $1,000, the demand for computers goes up by about 2 units per 100 persons.
      ◦ The intercept has no meaningful interpretation in this case.
      ◦ The estimated R2 is very high. It suggests that 83% of the variation in the demand for computers is explained by per capita income.
      ◦ Is the estimated coefficient significantly different from zero? The t ratio = 0.0018/0.00014 = 12.857. Find the critical value for the 5% significance level.
  • 32. Confidence intervals (or interval estimates)
      ◦ We can also use interval estimates to carry out our statistical inference.
      ◦ In this case we compare our hypothesised value for beta with a range of values derived from our estimates.
      ◦ If the hypothesised value of beta is within that range, we cannot reject the null hypothesis at the chosen level of significance.
  • 33. ...or more simply
      ◦ A 95% confidence interval: $\hat{\beta} \pm 1.96 \times \text{standard error}(\hat{\beta})$, where 1.96 is the 5% critical value for a two-tailed test with over 120 degrees of freedom.
      ◦ We could also define a 99% confidence interval: $\hat{\beta} \pm 2.576 \times \text{standard error}(\hat{\beta})$, where 2.576 is the 1% critical value.
  • 34. Let's construct a confidence interval for an example
                     OLS estimate   Standard error   t stat.   Critical value
        Intercept    0.283          (0.104)          2.721     1.960 (5%)
        Exper        0.004          (0.001)          4.00      1.960 (5%)
      ◦ Parameter estimate +/- (critical value × standard error): $0.004 - 1.960 \times 0.001 \le \beta \le 0.004 + 1.960 \times 0.001$, that is, [0.002, 0.006].
      ◦ Does the value 0 lie within that interval? No.
      ◦ We reject the null hypothesis that beta = 0.
      ◦ We could also say that an extra year of experience will increase wages by between 0.002 and 0.006.
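The interval calculation above is simple enough to script. A minimal sketch follows, using the Exper estimate and standard error from the table; the function name `confidence_interval` is an illustrative assumption.

```python
def confidence_interval(estimate, std_error, critical_value=1.960):
    """Return (lower, upper) = estimate -/+ critical_value * std_error."""
    half_width = critical_value * std_error
    return estimate - half_width, estimate + half_width

# Exper coefficient from the example: estimate 0.004, standard error 0.001
lower, upper = confidence_interval(0.004, 0.001)

# The null beta = 0 is rejected when 0 falls outside the interval
reject_null = not (lower <= 0.0 <= upper)
print(round(lower, 3), round(upper, 3), reject_null)
```

Passing `critical_value=2.576` instead gives the 99% interval mentioned earlier.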
  • 35. Probability values
      ◦ Another way of carrying out hypothesis testing is based on the use of probability values.
      ◦ Compute the t statistic.
      ◦ Check in the tables the probability of obtaining a value of the test statistic as large as or larger than that obtained in the example.
      ◦ This probability is called the p value (probability value).
      ◦ This is the lowest significance level at which a null hypothesis can be rejected.
  • 36. Probability values in practice
      ◦ Probability values are usually reported with the regression output in most computer software.
      ◦ To reject the null hypothesis we want the p-value to be quite small.
      ◦ Relationship between significance level and probability value:
      ◦ We choose the significance level before running the regression. In practice, economists use the 10%, 5% or 1% significance level.
      ◦ The probability value tells us, for each test, the exact lowest significance level at which the null hypothesis can be rejected (the probability of making a type I error).
  • 37. Summary
      ◦ We have looked at three possible ways of carrying out hypothesis testing.
      ◦ All three ways of carrying out the test lead to the same conclusion.
      ◦ Computer output usually provides all the necessary information to easily carry out hypothesis testing.
  • 38. Hypothesis testing
      ◦ Individual coefficient (t-test)
      ◦ Significance of all coefficients (F-test)
      ◦ Test for restriction (F-test)
      ◦ Test for stability over time (F-test)
      ◦ The normality test is a chi-square test of whether the error term is normally distributed (we don't discuss it here).
  • 39. t-test example: model of R&D expenditure, n = 32
      ◦ R&D is a function of sales and the profit margin:
        log(rd) = -4.38 + 1.084*log(sales) + 0.0217*prof.marg.
                  (.47)   (.060)             (.0218)
      ◦ R-square = 0.918 (standard errors in parentheses)
  • 40. Testing the overall significance of the sample regression
      ◦ The null hypothesis is that all the slope coefficients are equal to zero: $H_0: \beta_2 = \beta_3 = \dots = \beta_k = 0$. The alternative hypothesis is that at least one of these coefficients is not equal to zero.
      ◦ F-test: $F = \frac{ESS/df_1}{RSS/df_2} = \frac{ESS/(k-1)}{RSS/(n-k)} = \frac{R^2/(k-1)}{(1-R^2)/(n-k)}$
      ◦ If F > Fc(k-1, n-k), reject H0, where Fc(k-1, n-k) is the critical value (e.g. at the 5% or 1% level) with first degrees of freedom df1 = k-1 and second degrees of freedom df2 = n-k. Here k is the number of regressors including the intercept and n is the number of observations in the regression.
  • 41. The overall significance: example
      ◦ Expectations-augmented Phillips curve for the US, 1970-82 (n = 13)
      ◦ Actual inflation rate Y (%), unemployment rate X2 (%), and expected inflation rate X3 (%)
      ◦ $\hat{Y}_t = 7.1933 - 1.3925 X_{2t} + 1.4700 X_{3t}$, with standard errors (1.5958), (0.3050) and (0.1758); $R^2 = 0.8766$
      ◦ $F = \frac{R^2/(k-1)}{(1-R^2)/(n-k)} = \frac{0.8766/2}{(1-0.8766)/10} = 35.52$
      ◦ F0.05(2,10) = 4.10 and F0.01(2,10) = 7.56. The calculated F value is greater than the critical value at 1%, so we reject the null hypothesis. We conclude that the coefficients are jointly significantly different from zero.
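The overall F statistic can be reproduced directly from R-squared. A minimal sketch, using the Phillips curve figures quoted above; the function name `overall_f` is an illustrative assumption.

```python
def overall_f(r_squared, n, k):
    """F = [R^2/(k-1)] / [(1-R^2)/(n-k)], with k counting the intercept."""
    return (r_squared / (k - 1)) / ((1 - r_squared) / (n - k))

# Phillips curve example: R^2 = 0.8766, n = 13 observations, k = 3 parameters
f_stat = overall_f(0.8766, n=13, k=3)
print(round(f_stat, 2))
```

The result is then compared with the F critical value for (k-1, n-k) = (2, 10) degrees of freedom at the chosen significance level.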
  • 42. Testing for an additional variable
      ◦ To test whether the coefficient on an additional variable is significantly different from zero:
        Model 1: $Y_t = \beta_1 + \beta_2 X_{2t} + e_t$, giving $R^2_{old}$ and $ESS_{old}$
        Model 2: $Y_t = \beta_1 + \beta_2 X_{2t} + \beta_3 X_{3t} + e_t$, giving $R^2_{new}$ and $ESS_{new}$
        $F = \frac{(ESS_{new} - ESS_{old})/\text{number of new variables}}{RSS_{new}/(n - k_{new})} = \frac{(R^2_{new} - R^2_{old})/df_1}{(1 - R^2_{new})/df_2}$
      ◦ Example: the R2 from model 1 is 0.9978, the R2 from model 2 is 0.9988, n = 15, k_new = 3, df1 = 1, df2 = 12.
        $F = \frac{(R^2_{new} - R^2_{old})/1}{(1 - R^2_{new})/12} = \frac{0.9988 - 0.9978}{(1 - 0.9988)/12} = 10.3978$
      ◦ The rounded R2 values shown give approximately 10.0; the reported 10.3978 presumably uses unrounded R2 values.
      ◦ F > F0.01(1,12) = 9.33, so we reject the null hypothesis that β3 = 0 at the 1% significance level.
  • 43. Testing linear equality restriction
      ◦ Model 1: unrestricted model, a double-log linear function: $\ln Y_t = \beta_1 + \beta_2 \ln X_{2t} + \beta_3 \ln X_{3t} + e_t$
      ◦ Model 2: restricted model. If we assume constant returns to scale, then $\beta_2 + \beta_3 = 1$. Imposing this, we have:
        $\ln Y_t = \beta_1 + (1 - \beta_3) \ln X_{2t} + \beta_3 \ln X_{3t} + e_t$
        $\ln Y_t = \beta_1 + \ln X_{2t} + \beta_3 (\ln X_{3t} - \ln X_{2t}) + e_t$
        $\ln Y_t - \ln X_{2t} = \beta_1 + \beta_3 (\ln X_{3t} - \ln X_{2t}) + e_t$
        $\ln(Y_t / X_{2t}) = \beta_1 + \beta_3 \ln(X_{3t} / X_{2t}) + e_t$ (restricted model)
  • 44. Testing linear equality restriction
      ◦ Hypothesis testing: $H_0: \beta_2 + \beta_3 = 1$ against $H_1: \beta_2 + \beta_3 \neq 1$
      ◦ Run regressions of both the unrestricted and restricted models and obtain the R2's or the RSS's, then calculate the F value:
        $F = \frac{(RSS_R - RSS_{UR})/m}{RSS_{UR}/(n-k)} = \frac{(R^2_{UR} - R^2_R)/m}{(1 - R^2_{UR})/(n-k)}$
      ◦ Here RSS_R = RSS of the restricted model, RSS_UR = RSS of the unrestricted model, R2_R = R2 of the restricted model, R2_UR = R2 of the unrestricted model, m = number of restrictions, and k = number of parameters in the unrestricted model.
  • 45. Testing linear equality restriction: example
      ◦ The Cobb-Douglas production function for Taiwanese agriculture, 1958-72.
      ◦ Unrestricted model: $\ln \hat{Y}_t = -3.3384 + 1.4988 \ln X_{2t} + 0.4899 \ln X_{3t}$, with t ratios (-1.36), (2.78) and (4.80); $R^2_{UR} = 0.8890$, df = 12
      ◦ Restricted model: $\ln(\hat{Y}_t / X_{2t}) = 1.7086 + 0.6129 \ln(X_{3t} / X_{2t})$, with t ratios (4.11) and (6.57); $R^2_R = 0.8489$, m = 1
      ◦ F?
  • 46. Testing linear equality restriction: example
      ◦ $F = \frac{(R^2_{UR} - R^2_R)/m}{(1 - R^2_{UR})/(n-k)} = \frac{0.8890 - 0.8489}{(1 - 0.8890)/12} = 4.36$
      ◦ $F_{0.05}(1,12) = 4.75$, so $F < F_{0.05}(1,12)$.
      ◦ We cannot reject the null hypothesis that β2 + β3 = 1 at the 5% significance level. However, the F value and the critical value are close, so we have to be careful about the decision. For example, we can reject the hypothesis at the 10% level.
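Both the additional-variable test and the linear-restriction test use the same R-squared form of the F statistic, so one helper covers them. This is a minimal sketch with an illustrative function name (`restriction_f`). Because it uses the rounded R-squared values printed on the slides, it returns roughly 10.0 and 4.34 rather than the 10.3978 and 4.36 quoted there; the test conclusions are unchanged.

```python
def restriction_f(r2_unrestricted, r2_restricted, df2, m=1):
    """F = [(R2_UR - R2_R)/m] / [(1 - R2_UR)/df2], where df2 = n - k."""
    return ((r2_unrestricted - r2_restricted) / m) / ((1 - r2_unrestricted) / df2)

# Additional-variable example: R2_old = 0.9978, R2_new = 0.9988, df2 = 12
f_additional = restriction_f(0.9988, 0.9978, df2=12)

# Constant-returns-to-scale example: R2_UR = 0.8890, R2_R = 0.8489, df2 = 12
f_crs = restriction_f(0.8890, 0.8489, df2=12)

# Critical values quoted on the slides
print(f_additional > 9.33)   # compare with F0.01(1,12)
print(f_crs > 4.75)          # compare with F0.05(1,12)
```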
  • 47. Testing for structural stability (Chow test, use of the F value again)
      ◦ This tests whether the same model Yt = f(Xt) can be used for different time periods. If not, there is a structural break in the model.
      ◦ There are three ways in which the same model may fail to work for different time periods:
      ◦ The intercept may be different.
      ◦ The coefficients may be different.
      ◦ Both the intercept and the coefficients may be different.
      ◦ Hence, it is important that we test for the structural stability of the model.
  • 48. Testing for structural stability (Chow test, use of the F value again)
      ◦ Suppose we split the entire data period into two sub-periods, then run the model for each of the two sub-periods and for the entire data period to obtain the RSS's from three different regressions.
      ◦ RSS_all = RSS from the regression that uses all the observations from the entire data period; the number of observations is n = n1 + n2.
      ◦ RSS_1 = RSS from the regression that uses only the observations of the first sub-period; the number of observations is n1.
      ◦ RSS_2 = RSS from the regression that uses only the observations of the second sub-period; the number of observations is n2.
      ◦ RSS_1+2 = RSS_1 + RSS_2.
      ◦ k = number of regressors including the intercept.
        $F = \frac{(RSS_{all} - RSS_{1+2})/k}{RSS_{1+2}/(n - 2k)}$
  • 49. Testing for structural stability
      ◦ The null hypothesis is no structural change.
      ◦ If the F value does not exceed the critical F value, we do not reject the null hypothesis of parameter stability.
      ◦ If the F value exceeds the critical F value, we reject the hypothesis of parameter stability.
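The Chow statistic only needs the three residual sums of squares and the sample sizes. A minimal sketch follows; the function name `chow_f` and all the numbers passed to it are hypothetical, chosen only to show the arithmetic.

```python
def chow_f(rss_all, rss_1, rss_2, n1, n2, k):
    """Chow F = [(RSS_all - RSS_1+2)/k] / [RSS_1+2/(n - 2k)], with n = n1 + n2."""
    rss_12 = rss_1 + rss_2
    n = n1 + n2
    return ((rss_all - rss_12) / k) / (rss_12 / (n - 2 * k))

# Hypothetical values: pooled RSS 120, sub-period RSS 40 and 50,
# 15 observations in each sub-period, k = 2 (intercept and one slope)
f_chow = chow_f(rss_all=120.0, rss_1=40.0, rss_2=50.0, n1=15, n2=15, k=2)
print(round(f_chow, 3))
```

The statistic is compared with the F critical value for (k, n - 2k) degrees of freedom, here (2, 26).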
  • 50. Looking at the p-values for the F test
      • We can also look at the p-value: the probability, if the null hypothesis is true, of obtaining a test statistic at least as extreme as the one observed. Equivalently, it is the smallest significance level at which we can reject the null hypothesis.
      • The p-values are computed by the econometric package.
      • For a 5% significance level, you reject H0 if p-value < 0.05.
  • 51. Example

          Source |       SS       df       MS              Number of obs =    6763
    -------------+------------------------------           F(  3,  6759) =  644.53
           Model |  357.752575     3  119.250858           Prob > F      =  0.0000
        Residual |  1250.54352  6759  .185019014           R-squared     =  0.2224
    -------------+------------------------------           Adj R-squared =  0.2221
           Total |  1608.29609  6762  .237843255           Root MSE      =  .43014

    ------------------------------------------------------------------------------
           lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
              jc |   .0666967   .0068288     9.77   0.000     .0533101    .0800833
            univ |   .0768762   .0023087    33.30   0.000     .0723504    .0814021
           exper |   .0049442   .0001575    31.40   0.000     .0046355    .0052529
           _cons |   1.472326   .0210602    69.91   0.000     1.431041     1.51361
    ------------------------------------------------------------------------------
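As a rough check on how the Stata output above fits together, the following Python sketch recomputes the overall F statistic and the t statistics from the printed numbers. All inputs are copied from the output; small differences in the last digit can arise because the printed values are rounded.

```python
# Recompute F and t statistics from the Stata regression output.

ms_model = 119.250858   # Model mean square (SS / df)
ms_resid = 0.185019014  # Residual mean square (SS / df)
f_value = ms_model / ms_resid  # overall F(3, 6759)

# name: (coefficient, standard error), copied from the coefficient table
estimates = {
    "jc": (0.0666967, 0.0068288),
    "univ": (0.0768762, 0.0023087),
    "exper": (0.0049442, 0.0001575),
    "_cons": (1.472326, 0.0210602),
}
t_values = {name: coef / se for name, (coef, se) in estimates.items()}

p_value_f = 0.0000            # "Prob > F" as printed by Stata
reject_h0 = p_value_f < 0.05  # decision rule at the 5% level

print(f"F = {f_value:.2f}")
for name, t in t_values.items():
    print(f"{name:>6}: t = {t:.2f}")
print("Reject H0 that all slope coefficients are zero:", reject_h0)
```

Each t statistic is simply the coefficient divided by its standard error, and the F statistic is the ratio of the model and residual mean squares.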
  • 52. Introducing qualitative factors in our analysis
      • What do we mean by qualitative factors?
      • Examples:
          • Do men earn more than women?
          • Do students who live on campus get higher exam scores than students who live off campus?
      • How do we account for these factors?
      • Dummy variable = a binary variable (0/1), where 1 indicates that a particular event occurs (e.g. the student lives on campus) or a particular feature is present (e.g. the person is female).
  • 53. Dummy variables
      • The name of the dummy indicates how it is constructed.
          • A dummy called ‘female’ equals 1 when the person is female and 0 otherwise.
          • A dummy called ‘campus’ equals 1 when the student lives on campus and 0 otherwise.
      • Different types of dummies
          • Shift/intercept dummies
          • Slope dummies: a dummy interacted with an explanatory variable X
          • Interaction dummies: the product of two dummy variables
  • 54. Case 1: intercept or shift dummy
      • Let’s start with the simple case of a single dummy explanatory variable:
        $wage_i = \alpha + \delta_0 female_i + \beta_1 education_i + u_i$
      • Although there are 2 genders, we only need one dummy variable. In this case we use a dummy called ‘female’. Remember that female + male = 1.
      • The category that we set equal to zero is called the base group or benchmark group (or reference group).
      • Example: men are the base group in this case.
      • We cannot include both a dummy for male and a dummy for female together with the intercept, because female + male = 1 would cause perfect collinearity.
  • 55. Model with a dummy variable
      • $wage_i = \alpha + \delta_0 female_i + \beta_1 education_i + u_i$
      • When a particular worker is male ($female_i = 0$), our equation will look like:
        $wage_i = \alpha + \delta_0 \cdot 0 + \beta_1 education_i + u_i$
      • This shows that $\alpha$ is the intercept for male workers. The intercept for female workers is $\alpha + \delta_0$.
      • After estimating our equation, we may find one of the following results for $\hat{\delta}_0$:
          • Not significantly different from zero
          • Has a negative sign
          • Has a positive sign
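To illustrate how the intercept dummy shifts the regression line, here is a minimal Python sketch that fits the model by OLS using the normal equations. The data are made up and noise-free (wage = 2.0 - 1.5 female + 0.5 education), and the helper names `solve` and `ols` are hypothetical; in practice one would use a statistics package rather than this hand-written solver.

```python
# OLS with an intercept (shift) dummy on made-up, noise-free data.

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(rows, y):
    """OLS via the normal equations (X'X) b = X'y; rows include the constant."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

# Hypothetical sample: 5 men and 5 women with 8 to 16 years of education.
data = [(female, educ) for female in (0, 1) for educ in (8, 10, 12, 14, 16)]
rows = [[1.0, float(female), float(educ)] for female, educ in data]
wage = [2.0 - 1.5 * female + 0.5 * educ for female, educ in data]

alpha, delta0, beta1 = ols(rows, wage)
intercept_male = alpha             # female = 0
intercept_female = alpha + delta0  # female = 1

print(f"alpha = {alpha:.3f}, delta0 = {delta0:.3f}, beta1 = {beta1:.3f}")
print(f"intercept (men) = {intercept_male:.3f}, intercept (women) = {intercept_female:.3f}")
```

Both groups share the same slope on education; only the intercept differs, by the estimated dummy coefficient.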
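  • 56. Graphical analysis
      • [Figure: wage (vertical axis) plotted against education (horizontal axis) for men and women.]
      • In this case ($\delta_0 < 0$), women earn less than men at every level of education, i.e. there is discrimination against women.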
  • 57. Hypothesis testing
      • We could simply carry out a significance test to check whether the dummy coefficient is significantly different from zero (two-tailed/two-sided test):
        $H_0: \delta_0 = 0$, $H_1: \delta_0 \neq 0$
      • Or we might want to test the null against the alternative that there is discrimination against women (one-tailed/one-sided test):
        $H_0: \delta_0 = 0$, $H_1: \delta_0 < 0$
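The difference between the two tests can be sketched in Python. The estimate and standard error below are hypothetical, the function name `t_test_decisions` is made up for this illustration, and the 5% critical values (1.96 two-tailed, 1.645 one-tailed) assume a large sample so that the normal approximation applies.

```python
# Two-tailed versus one-tailed t test on a dummy coefficient.

def t_test_decisions(coef, se, crit_two_tailed=1.96, crit_one_tailed=1.645):
    """Return the t statistic and the 5% decisions for both alternatives."""
    t = coef / se
    return {
        "t": t,
        # H1: delta0 != 0, reject when |t| is large
        "reject_two_tailed": abs(t) > crit_two_tailed,
        # H1: delta0 < 0, reject only when t is sufficiently negative
        "reject_one_tailed": t < -crit_one_tailed,
    }

# Hypothetical estimate of delta0 and its standard error.
result = t_test_decisions(coef=-0.9, se=0.5)

print(f"t = {result['t']:.2f}")
print("Reject H0 against delta0 != 0:", result["reject_two_tailed"])
print("Reject H0 against delta0 < 0: ", result["reject_one_tailed"])
```

With these numbers the same estimate is not significant in the two-tailed test but is significant in the one-tailed test, because the one-tailed test places the whole 5% rejection region on the negative side.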
  • 58. Possible outcomes of the test for the wage example
      • Model: $wage_i = \alpha + \delta_0 female_i + \beta_1 education_i + u_i$
      • Possible results for $\hat{\delta}_0$ and their interpretation:
          • Negative and significantly different from zero: women earn less than men, given the same level of education. We find evidence of discrimination against women.
          • Positive and significantly different from zero: women earn more than men, given the same level of education. We find evidence of discrimination against men.
          • Not significantly different from zero: there is no significant difference between male and female hourly earnings. We do not find any evidence of discrimination.
  • 59. Interaction dummies: interact dummies with explanatory variables
      • We can also interact dummy variables with explanatory variables that are not dummy variables, to test whether there are differences in slope.
      • For example, we may want to test whether the returns to education, in terms of wages, are the same for men and women.
  • 60. Graphical representation
      • [Figure: wage ($ per hour) plotted against education (years, axis marked at 9 and 18), with separate lines for women and men; labels: "Intercept for female", "Intercept for male".]
  • 61. How do we formulate this model for OLS estimation?
      • $\log(wage_i) = \alpha + \delta_0 female_i + \beta_1 educ_i + \delta_1 female_i \cdot educ_i + u_i$
      • We could have different situations according to the coefficient estimates.
      • Example: let’s assume that all variables are statistically significant.
        $\widehat{\log(wage_i)} = 0.389 - 0.227\, female_i + 0.082\, educ_i - 0.0056\, female_i \cdot educ_i$
      • The coefficient on educ (0.082) is the return to education for men: about 8.2% per additional year.
      • Women’s returns to education are 0.56 percentage points lower than men’s.
      • To obtain the returns to education for women, add the interaction coefficient to the education coefficient: 0.082 - 0.0056 = 0.0764, or about 7.6% per additional year.
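The interpretation of the interaction model can be checked with a few lines of Python. The coefficient magnitudes are those reported on the slide; the negative signs on female and on the interaction term are an assumption here, chosen to be consistent with the slide's statement that women's returns to education are lower. The function names are hypothetical.

```python
# Returns to education and the gender gap implied by
# log(wage) = alpha + delta0*female + beta1*educ + delta1*female*educ

ALPHA = 0.389
DELTA0 = -0.227   # intercept shift for women (sign assumed negative)
BETA1 = 0.082     # return to education for men
DELTA1 = -0.0056  # difference in returns, women minus men (sign assumed negative)

def return_to_education(female):
    """Slope of log(wage) with respect to education: beta1 + delta1 * female."""
    return BETA1 + DELTA1 * female

def log_wage_gap(educ):
    """Predicted log(wage) of women minus men at a given level of education."""
    return DELTA0 + DELTA1 * educ

print(f"Return to education, men:   {return_to_education(0):.4f}")
print(f"Return to education, women: {return_to_education(1):.4f}")
print(f"Log wage gap at 12 years of education: {log_wage_gap(12):.4f}")
```

Because the slope differs by gender, the predicted wage gap is no longer constant: it depends on the level of education at which it is evaluated.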