Bule Hora University
College of Business and Economics
Department of Economics
Course Title: Econometrics Theory and Application
Course Code: MLSCM 2033
Credit Hr: 3 ECTC: 5
Aschalew Shiferaw
Dummy Variable Model
• In regression analysis, the dependent variable may also be influenced by variables that are qualitative in nature (in addition to quantitative variables).
• Such variables include sex, marital status, job category, region, season, etc.
• We quantify such variables by artificially assigning values to them (for example, assigning 0 and 1 to sex, where 0 indicates male and 1 indicates female) and then including them in the regression equation together with the other independent variables. Such variables are called dummy variables.
Dummy Variable Model
• ANOVA model
• This model involves only dummy variables as explanatory variables.
• Example: Consider the following model:
$$Y_i = \alpha_1 + \alpha_2 D_i + u_i \qquad \ldots (1)$$
where $Y_i$ = annual starting salary of an employee and
$$D_i = \begin{cases} 1 & \text{for male} \\ 0 & \text{for female} \end{cases}$$
Dummy Variable Model
• Under the usual assumptions of the CLRM, the mean salary for a female employee is:
$$E(Y_i \mid D_i = 0) = E(\alpha_1 + \alpha_2(0) + u_i) = \alpha_1 + E(u_i) = \alpha_1$$
• and the mean salary for a male employee is:
$$E(Y_i \mid D_i = 1) = E(\alpha_1 + \alpha_2(1) + u_i) = \alpha_1 + \alpha_2 + E(u_i) = \alpha_1 + \alpha_2$$
since $E(u_i) = 0$.
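The result above can be checked numerically. The following is a minimal sketch on simulated data (the salary figures, sample size and seed are hypothetical, not from the slides): regressing $Y$ on a constant and a single dummy should return the female group mean as the intercept and the male-female difference in means as the dummy coefficient.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: D = 1 for male, 0 for female; salaries are simulated.
D = rng.integers(0, 2, size=200)
Y = 20.0 + 3.0 * D + rng.normal(0.0, 1.0, size=200)

# OLS of Y on a constant and the dummy: Y_i = a1 + a2 * D_i + u_i
X = np.column_stack([np.ones(D.size), D.astype(float)])
a1_hat, a2_hat = np.linalg.lstsq(X, Y, rcond=None)[0]

mean_female = Y[D == 0].mean()   # sample analogue of E(Y | D = 0)
mean_male = Y[D == 1].mean()     # sample analogue of E(Y | D = 1)

print("intercept        :", a1_hat, " female mean:", mean_female)
print("intercept + slope:", a1_hat + a2_hat, " male mean:", mean_male)
```

In this setup the two printed pairs agree up to floating-point error, which is the sample counterpart of $E(Y_i \mid D_i = 0) = \alpha_1$ and $E(Y_i \mid D_i = 1) = \alpha_1 + \alpha_2$.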
Dummy Variable Model
• ANCOVA models
• Regression models in most economic research involve quantitative explanatory variables in addition to dummy variables. Such models are known as analysis of covariance (ANCOVA) models.
• Example: Consider the following model:
$$Y_i = \alpha_1 + \alpha_2 D_i + \alpha_3 X_i + u_i \qquad \ldots (2)$$
• where $Y_i$ = annual starting salary of an employee, $X_i$ = years of work experience, and
$$D_i = \begin{cases} 1 & \text{for male} \\ 0 & \text{for female} \end{cases}$$
Dummy Variable Model
• Assuming that $E(u_i) = 0$, we can see that the mean salary of a female employee is:
$$E(Y_i \mid D_i = 0, X_i) = E(\alpha_1 + \alpha_2(0) + \alpha_3 X_i + u_i) = \alpha_1 + \alpha_3 X_i + E(u_i) = \alpha_1 + \alpha_3 X_i$$
• And the mean salary for a male employee is:
$$E(Y_i \mid D_i = 1, X_i) = E(\alpha_1 + \alpha_2(1) + \alpha_3 X_i + u_i) = (\alpha_1 + \alpha_2) + \alpha_3 X_i + E(u_i) = (\alpha_1 + \alpha_2) + \alpha_3 X_i$$
Remark: If we use two dummy variables (one each for male and female), our model becomes:
$$Y_i = \alpha_1 + \alpha_2 D_{1i} + \alpha_3 D_{2i} + \alpha_4 X_i + u_i \qquad \ldots (3)$$
where
$$D_{1i} = \begin{cases} 1 & \text{for male} \\ 0 & \text{for female} \end{cases} \quad \text{and} \quad D_{2i} = \begin{cases} 1 & \text{for female} \\ 0 & \text{for male} \end{cases}$$
Cont…
• In model (3) it can clearly be seen that $D_{1i} + D_{2i} = 1$ for every observation, that is, there is perfect multicollinearity between $D_{1i}$ and $D_{2i}$ (their sum reproduces the constant term). Consequently, the model cannot be estimated.
• Thus, the number of dummies should be one less than the number of categories. For example, if a variable has four categories, we should construct only three dummy variables.
• Note
1. The category that is assigned the value 0 is referred to as the base category or the benchmark category, and all comparisons are made with reference to this category.
2. The coefficient attached to the dummy variable (e.g. $\alpha_2$ in model (2)) is referred to as the differential intercept coefficient. It tells us by how much the value of the intercept term of the category that is assigned the value 1 differs from that of the base category.
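The dummy variable trap described above can be illustrated with a small rank check. This is a sketch on made-up data (the eight observations and the variable names are hypothetical): with a constant and both dummies, the design matrix has four columns but only rank three, so the normal equations have no unique solution.

```python
import numpy as np

# Hypothetical sample of 8 employees.
male = np.array([1, 0, 1, 1, 0, 0, 1, 0], dtype=float)    # D1: 1 = male
female = 1.0 - male                                        # D2: 1 = female
x = np.array([2, 5, 1, 7, 3, 4, 6, 8], dtype=float)       # years of experience
const = np.ones_like(male)

# Model (3): constant + both dummies + X  -> D1 + D2 equals the constant column.
X_trap = np.column_stack([const, male, female, x])
# Model (2): constant + one dummy + X     -> no exact linear dependence.
X_ok = np.column_stack([const, male, x])

rank_trap = np.linalg.matrix_rank(X_trap)
rank_ok = np.linalg.matrix_rank(X_ok)

print("columns:", X_trap.shape[1], "rank:", rank_trap)   # rank below column count
print("columns:", X_ok.shape[1], "rank:", rank_ok)       # full column rank
```

Dropping one dummy (or, alternatively, dropping the constant) restores full column rank, which is why the number of dummies is kept one less than the number of categories.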
Cont…
• Dummy variable models are also used if one has to take care of seasonal factors. For example, if we have quarterly data on consumption (C) and income (Y), we fit the regression model:
$$C_i = \beta_0 + \beta_1 D_{1i} + \beta_2 D_{2i} + \beta_3 D_{3i} + \beta_4 Y_i + u_i \qquad \ldots (4)$$
where $D_1$, $D_2$ and $D_3$ are seasonal dummies defined by:
$$D_{1i} = \begin{cases} 1 & \text{for quarter I} \\ 0 & \text{otherwise} \end{cases}, \quad D_{2i} = \begin{cases} 1 & \text{for quarter II} \\ 0 & \text{otherwise} \end{cases} \quad \text{and} \quad D_{3i} = \begin{cases} 1 & \text{for quarter III} \\ 0 & \text{otherwise} \end{cases}$$
• In equation (4), the constant term ($\beta_0$) is the intercept for the base category (quarter IV). The intercept terms for quarters I, II and III are $(\beta_0 + \beta_1)$, $(\beta_0 + \beta_2)$ and $(\beta_0 + \beta_3)$, respectively.
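As a rough illustration of equation (4), the sketch below builds the three quarterly dummies and recovers the quarter-specific intercepts. All numbers (coefficients, income range, noise level, seed) are hypothetical and the data are simulated; only the dummy construction mirrors the slide.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 40                                         # 10 years of quarterly data
quarter = np.tile([1, 2, 3, 4], n // 4)
income = rng.uniform(100.0, 200.0, size=n)

# Seasonal dummies; quarter IV is the base category (all three dummies are 0).
D1 = (quarter == 1).astype(float)
D2 = (quarter == 2).astype(float)
D3 = (quarter == 3).astype(float)

# Hypothetical "true" coefficients b0..b4 used only to simulate consumption.
true_coef = np.array([10.0, 2.0, -1.0, 3.0, 0.8])
X = np.column_stack([np.ones(n), D1, D2, D3, income])
consumption = X @ true_coef + rng.normal(0.0, 0.5, size=n)

b0, b1, b2, b3, b4 = np.linalg.lstsq(X, consumption, rcond=None)[0]

# Quarter-specific intercepts implied by the fitted model.
intercepts = {"I": b0 + b1, "II": b0 + b2, "III": b0 + b3, "IV": b0}
print(intercepts, "income slope:", b4)
```

Each differential intercept ($b_1$, $b_2$, $b_3$) measures the shift relative to quarter IV, while the income slope is common to all quarters.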
Test of Structural Stability
• Suppose we are interested in estimating a simple saving function that relates domestic household savings (S) to gross domestic product (Y) for a certain country. Suppose further that, at a certain point in time, a series of economic reforms was introduced.
• The hypothesis here is that such reforms might have considerably influenced the savings-income relationship, that is, the relationship between savings and income might be different in the post-reform period as compared to that in the pre-reform period.
• If this hypothesis is true, then we say a structural change has happened. How do we check whether this is so?
Test of Structural Stability
1. Chow’s test
One approach to testing for the presence of structural change (structural instability) is Chow’s test. The steps involved in this procedure are as follows:
a) Estimate the regression equation
$$S_i = \alpha + \beta Y_i + u_i, \quad i = 1, 2, \ldots, n \qquad \ldots (5)$$
for the whole period (pre-reform plus post-reform periods) and find the error sum of squares ($ESS_R$).
b) Estimate equation (5) using the available data in the pre-reform period (say, of size $n_1$), that is, estimate the model:
$$S_i = \alpha_1 + \beta_1 Y_i + u_i, \quad i = 1, 2, \ldots, n_1$$
and find the error sum of squares ($ESS_1$).
c) Estimate equation (5) using the available data in the post-reform period (say, of size $n_2$), that is, estimate the model:
$$S_i = \alpha_2 + \beta_2 Y_i + u_i, \quad i = 1, 2, \ldots, n_2$$
and find the error sum of squares ($ESS_2$).
d) Calculate:
$$ESS_{UR} = ESS_1 + ESS_2$$
e) Calculate the Chow test statistic:
$$F = \frac{(ESS_R - ESS_{UR})/k}{ESS_{UR}/(n_1 + n_2 - 2k)}$$
where $ESS_R$ is the error sum of squares for the whole period and $k$ is the number of estimated regression coefficients.
Test of Structural Stability
f) Decision rule: Reject the null hypothesis of identical intercepts and slopes for the pre-reform and post-reform periods, that is,
$$H_0: \alpha_1 = \alpha_2 \ \text{and} \ \beta_1 = \beta_2,$$
if
$$F > F_{\alpha}(k,\ n_1 + n_2 - 2k),$$
where $F_{\alpha}(k,\ n_1 + n_2 - 2k)$ is the critical value from the F-distribution with $k$ (in our case $k = 2$) and $n_1 + n_2 - 2k$ degrees of freedom for a given significance level $\alpha$.
Note that rejecting $H_0$ (the null hypothesis) means there is a structural change.
Test of Structural Stability
Illustrative Example:
1. The following data are on domestic household savings (S) and gross domestic product (Y) for India for the period 1980 to 2002.

Year   Savings     GDP           Year   Savings     GDP
1980   27136       401128        1992   162906      737791.6
1981   31355       425072.8      1993   193621.3    781345
1982   34368       438079.5      1994   251463.4    838031
1983   38587       471742.2      1995   298747.3    899563
1984   46063       492077.3      1996   317260.7    970082
1985   54167       513990        1997   352178      1016595
1986   58951       536256.7      1998   374659      1082748
1987   72908       556777.8      1999   468681      1148368
1988   87913       615098.4      2000   495986      1198592
1989   106979      656331.2      2001   535185      1267833
1990   131340      692871.5      2002   597697      1318321
1991   143908      701863.2

$ESS_1 = 644994361.865$ (error sum of squares in the pre-reform period, $n_1 = 12$)
$ESS_2 = 2736652790.434$ (error sum of squares in the post-reform period, $n_2 = 11$)
$ESS_R = 13937337067.461$ (error sum of squares for the whole period)
We then have:
$$ESS_{UR} = ESS_1 + ESS_2 = 3381647152.299$$
Test of Structural Stability
• The test statistic is:
$$F = \frac{(ESS_R - ESS_{UR})/k}{ESS_{UR}/(n_1 + n_2 - 2k)} = \frac{(13937337067.461 - 3381647152.299)/2}{3381647152.299/(12 + 11 - 2(2))} = 29.65$$
• The tabulated value from the F-distribution with 2 and 19 degrees of freedom at the 5% level of significance is 3.52.
• Decision: Since the calculated value of F exceeds the tabulated value, we reject the null hypothesis of identical intercepts and slopes for the pre-reform and post-reform periods at the 5% level of significance. Thus, we can conclude that there is a structural change.
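The arithmetic of the Chow statistic can be reproduced in a few lines. The sketch below uses only the error sums of squares, sample sizes and critical value quoted on the slides; the function name `chow_f` is a hypothetical helper introduced here for illustration.

```python
def chow_f(ess_r, ess_1, ess_2, n1, n2, k):
    """Chow F statistic from restricted and sub-period error sums of squares."""
    ess_ur = ess_1 + ess_2                      # unrestricted ESS
    numerator = (ess_r - ess_ur) / k
    denominator = ess_ur / (n1 + n2 - 2 * k)
    return numerator / denominator

# Values quoted in the illustrative example (India, 1980 to 2002).
F = chow_f(
    ess_r=13937337067.461,   # whole period
    ess_1=644994361.865,     # pre-reform, n1 = 12
    ess_2=2736652790.434,    # post-reform, n2 = 11
    n1=12, n2=11, k=2,
)

F_CRIT_5PCT = 3.52           # tabulated F(2, 19) at the 5% level, as quoted above
reject_h0 = F > F_CRIT_5PCT

print(round(F, 2), reject_h0)
```

With these inputs the statistic rounds to 29.65, well above the tabulated value, so the null hypothesis of stable parameters is rejected.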
Test of Structural Stability
• Drawbacks
• Chow’s test does not tell us whether the difference (change) is in the slope only, in the intercept only, or in both the intercept and the slope.
2. Using dummy variables
Write the saving function as:
$$S_t = \alpha_0 + \beta_1 D_t + \beta_2 Y_t + \beta_3 (D_t Y_t) + u_t, \quad t = 1, 2, \ldots, n \qquad \ldots (6)$$
where $S_t$ is household saving at time $t$, $Y_t$ is GDP at time $t$, and
$$D_t = \begin{cases} 0 & \text{pre-reform period} \\ 1 & \text{post-reform period} \end{cases}$$
Test of Structural Stability
• Here $\beta_3$ is the differential slope coefficient, indicating by how much the slope coefficient of the post-reform saving function differs from the slope coefficient of the pre-reform saving function. Observe that
$$E(S_t \mid D_t = 0, Y_t) = \alpha_0 + \beta_2 Y_t$$
$$E(S_t \mid D_t = 1, Y_t) = (\alpha_0 + \beta_1) + (\beta_2 + \beta_3) Y_t$$
• If $\beta_1$ and $\beta_3$ are both statistically significant as judged by the t-test, then the pre-reform and post-reform regressions differ in both the intercept and the slope.
• However, if only $\beta_1$ is statistically significant, then the pre-reform and post-reform regressions differ only in the intercept (meaning the marginal propensity to save (MPS) is the same for the pre-reform and post-reform periods).
Test of Structural Stability
• Similarly, if only $\beta_3$ is statistically significant, then the two regressions differ only in the slope (MPS).
Estimating equation (6) by OLS using the above data yields the following results.

              Unstandardized Coefficients    Standardized Coefficients
              B               Std. Error     Beta                        t         Sig.
(constant)    -1336990.916    21258.997                                  -6.289    0.000
Dt            -228628.947     30775.092      -0.641                      -7.429    0.000
Yt            0.375           0.039          0.596                       9.717     0.000
DtYt          0.339           0.044          1.003                       7.674     0.000
Test of Structural Stability
• We can see that both the differential intercept coefficient ($\hat{\beta}_1 = -228628.947$, p-value < 0.001) and the differential slope coefficient ($\hat{\beta}_3 = 0.339$, p-value < 0.001) are statistically significant. Thus the savings-income relationship is different for the two periods.
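A sketch of how equation (6) could be estimated on the data from the illustrative example is given below. Two things are assumptions rather than statements from the slides: the post-reform period is taken to start in 1992 (inferred from $n_1 = 12$ and $n_2 = 11$), and the helper `ols` with its column rescaling is introduced here only for numerical stability. No claim is made that the output reproduces the reported table digit for digit.

```python
import numpy as np

year = np.arange(1980, 2003)
savings = np.array([
    27136, 31355, 34368, 38587, 46063, 54167, 58951, 72908, 87913, 106979,
    131340, 143908, 162906, 193621.3, 251463.4, 298747.3, 317260.7, 352178,
    374659, 468681, 495986, 535185, 597697,
])
gdp = np.array([
    401128, 425072.8, 438079.5, 471742.2, 492077.3, 513990, 536256.7,
    556777.8, 615098.4, 656331.2, 692871.5, 701863.2, 737791.6, 781345,
    838031, 899563, 970082, 1016595, 1082748, 1148368, 1198592, 1267833,
    1318321,
])

# Assumption: D = 0 for 1980-1991 (12 years), D = 1 for 1992-2002 (11 years).
D = (year >= 1992).astype(float)

def ols(X, y):
    """OLS coefficients and error sum of squares (columns rescaled for stability)."""
    scale = np.abs(X).max(axis=0)
    coef = np.linalg.lstsq(X / scale, y, rcond=None)[0] / scale
    resid = y - X @ coef
    return coef, float(resid @ resid)

# Equation (6): S_t = a0 + b1*D_t + b2*Y_t + b3*(D_t*Y_t) + u_t
X6 = np.column_stack([np.ones(year.size), D, gdp, D * gdp])
(a0, b1, b2, b3), ess_dummy = ols(X6, savings)

# Separate sub-period regressions, as in steps (b) and (c) of Chow's test.
pre, post = D == 0, D == 1
coef_pre, ess_pre = ols(np.column_stack([np.ones(pre.sum()), gdp[pre]]), savings[pre])
coef_post, ess_post = ols(np.column_stack([np.ones(post.sum()), gdp[post]]), savings[post])

print("pre-reform  intercept, slope:", a0, b2)
print("post-reform intercept, slope:", a0 + b1, b2 + b3)
```

Because the dummy enters both additively and interacted with income, the fitted pre-reform and post-reform lines coincide with the two separate regressions, and the error sum of squares of equation (6) equals $ESS_1 + ESS_2$. The advantage over Chow's test is that the t-statistics on $b_1$ and $b_3$ indicate whether the intercept, the slope, or both have shifted.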
Limited Dependent variable model
• The dependent variable in a regression equation may simply represent a discrete choice, assuming only a limited number of values.
• Models involving dependent variables of this kind are called limited (discrete) dependent variable models (also called qualitative response models).
• Examples of choices:
A) Whether to work or not
B) Whether to attend formal education or not
C) Choice of occupation
D) Which brand of consumer durable good to purchase
Limited Dependent variable model
• In the above situations, the variables are discrete valued. In examples (a) and (b) the variables are binary (having only two possible values), whereas the variables in (c) and (d) are multinomial (having more than two but a finite number of distinct values).
• In such cases, instead of standard regression models, we apply different methods of modeling and analyzing discrete data.
• Example
Suppose the choice is whether to work or not. The discrete dependent variable we are working with will assume only two values, 0 and 1:
$$Y_i = \begin{cases} 1 & \text{if the } i\text{th individual is working/seeking work} \\ 0 & \text{if the } i\text{th individual is not working} \end{cases}$$
Limited Dependent variable model
• There are four approaches to developing a probability
model for a binary response variable:
1. The linear probability model (LPM)
2. The logit model
3. The probit model
4. The tobit model
The Linear Probability Model (LPM)
• To fix ideas, consider the following regression model:
$$Y_i = \beta_1 + \beta_2 X_i + u_i \qquad \ldots (*)$$
• where X = family income and Y = 1 if the family owns a house and 0 if it does not own a house.
Model (*) looks like a typical linear regression model, but because the regressand is binary, or dichotomous, it is called a linear probability model (LPM).
• This is because the conditional expectation of $Y_i$ given $X_i$, $E(Y_i \mid X_i)$, can be interpreted as the conditional probability that the event will occur given $X_i$, that is, $\Pr(Y_i = 1 \mid X_i)$.
The Linear Probability Model (LPM)
Thus, in our example, $E(Y_i \mid X_i)$ gives the probability that a family whose income is the given amount $X_i$ owns a house.
The justification of the name LPM for models like Eq. (*) can be seen as follows:
• Assuming $E(u_i) = 0$, as usual (to obtain unbiased estimators), we obtain
$$E(Y_i \mid X_i) = \beta_1 + \beta_2 X_i \qquad \ldots (**)$$
• Now, if $P_i$ = probability that $Y_i = 1$ (that is, the event occurs), and $(1 - P_i)$ = probability that $Y_i = 0$ (that is, the event does not occur), the variable $Y_i$ has the following (probability) distribution:

Yi       Probability
0        1 − Pi
1        Pi
Total    1

• That is, $Y_i$ follows the Bernoulli probability distribution.
The Linear Probability Model (LPM)
• Now, by the definition of mathematical expectation, we obtain:
$$E(Y_i) = 0(1 - P_i) + 1(P_i) = P_i \qquad \ldots (***)$$
• Comparing Eq. (**) with Eq. (***), we can equate
$$E(Y_i \mid X_i) = \beta_1 + \beta_2 X_i = P_i \qquad \ldots (****)$$
• that is, the conditional expectation of the model (*) can, in fact, be interpreted as the conditional probability of $Y_i$. In general, the expectation of a Bernoulli random variable is the probability that the random variable equals 1.
• In passing, note that if there are n independent trials, each with a probability p of success and probability (1 − p) of failure, and X represents the number of successes in these trials, then X is said to follow the binomial distribution. The mean of the binomial distribution is np and its variance is np(1 − p). The term success is defined in the context of the problem.
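The interpretation of the fitted LPM as a probability can be sketched on simulated data. Everything in the example (income range, the linear probability used to generate ownership, the seed) is hypothetical; it only illustrates that OLS on a 0/1 regressand yields fitted values that play the role of $P_i$.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical data: family income X and house ownership Y (1 = owns, 0 = does not).
income = rng.uniform(5.0, 40.0, size=500)
p_true = np.clip(-0.1 + 0.03 * income, 0.0, 1.0)     # assumed P(Y = 1 | X)
owns = rng.binomial(1, p_true).astype(float)         # Y_i ~ Bernoulli(P_i)

# LPM: regress the binary Y on a constant and income by OLS.
X = np.column_stack([np.ones(income.size), income])
b1, b2 = np.linalg.lstsq(X, owns, rcond=None)[0]

p_hat = b1 + b2 * income       # fitted values, read as estimated probabilities
print("slope:", b2)
print("mean of Y:", owns.mean(), " mean fitted probability:", p_hat.mean())
print("fitted range:", p_hat.min(), p_hat.max())
```

The average fitted value equals the sample share of owners, mirroring $E(Y_i) = P_i$. Nothing in OLS restricts the fitted range to the unit interval, which is worth checking in the printed minimum and maximum.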
Logit and Probit
• Logit and probit models are the most widely used models in practice for estimating the functional relationship between a qualitative dependent variable and the independent variables.
• Logit and probit models also belong to the family of generalized linear models (GLM).
• If the dependent variable is a latent variable that is not observed directly, or if the observed dependent variable is binary, the model cannot be estimated by ordinary least squares (OLS).
• Instead, maximum likelihood estimation is used, which requires assumptions about the distribution of the errors.
• Often, the choice is between normal errors in the probit model and logistic errors in the logit model.
Con’t…
• Limited dependent variable models are generally divided into two groups: censored and truncated regression models.
• The Tobit model is also known as the censored regression model.
• When the dependent variable is censored, least squares estimates are biased. Therefore, when the dependent variable is censored, the Tobit model allows us to obtain consistent and asymptotically efficient estimators.
• The Tobit model, which covers cases where the dependent variable has a lower or upper limit, was first used by Tobin to analyze household expenditure on durable consumer goods, taking into account the fact that expenditure cannot be negative.
Logit Model
• In the logit regression model, none of the assumptions involved in linear regression analysis (a linear relationship with the dependent variable, normally distributed independent variables, a normally distributed error term, no relationship between error term values, etc.) is required.
• Therefore, it provides researchers with considerable flexibility and has become a preferred method.
• A general linear regression model can be written as in Equation 1, where $y_i$ is the dependent variable and the $x$s are the independent variables.
$$y_i = \alpha + \beta_1 x_{1i} + \beta_2 x_{2i} + \cdots + \beta_k x_{ki} + \varepsilon_i \qquad (1)$$
Logit Model
• In the above model, $\alpha$ is the constant term and the $\beta$s are regression coefficients. This model can be estimated by classical OLS when the dependent variable is continuous. However, logit or probit regression methods are used in cases where the dependent variable is discrete.
• The logit model can be used to model the probability of a particular class or event with two states.
• Suppose that $y_i^*$ is the unobservable (latent) variable, ranging between $-\infty$ and $+\infty$, that generates the observed variable $y_i$.
• Values of $y_i^*$ above the threshold are recorded as $y_i = 1$, and values at or below the threshold are recorded as $y_i = 0$.
• The latent variable $y_i^*$ is assumed to depend linearly on the observed $x_i$ through the structural model. $y_i^*$ is linked to the observed binary variable $y_i$ by the measurement equation in Equation 2:
Logit Model
• Thus, when the dependent variable $y_i$ takes the values “0” and “1”, the model is called the binary logit model. The probability that the dependent variable equals “1” is expressed by Equation 3:
• In this model, $P_i$ is the probability that the $i$th individual makes a particular choice, given the value of the independent variable $x_i$. Thus $P_i$ takes values between “0” and “1”. The equations given in Equation 4 and Equation 5 can be written here:
Logit Model
• The $\alpha$ and $\beta$ parameters of the logit function cannot be estimated directly by OLS, and Equation 6 is used to estimate the model:
• If the ratio of equations (3) and (6) is taken, Equation 7 is obtained. It is also called the odds or odds ratio (OR).
• Variables whose OR values are close to 1 are not factors that have an important effect on the change in $y$.
• An OR value greater than 1 is interpreted to mean that the factor is an important risk factor, provided that the coefficient is significant.
• An OR value close to zero indicates that the factor is also important, provided that the coefficient is significant, but that it is a negative factor that causes $y$ to take low values.
• Equation 8 can be written by taking the natural logarithm (base $e$) of this model:
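The relationship between the probability, the odds and the log-odds can be sketched numerically. The example assumes the standard logistic form $P = 1/(1 + e^{-(\alpha + \beta x)})$, which is the textbook logit specification and is not copied from the (missing) equations on the slides; the parameter values are hypothetical.

```python
import math

def logistic(z):
    """Standard logistic function, mapping any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

alpha, beta = -2.0, 0.5          # hypothetical parameters
x = 3.0

z = alpha + beta * x
p = logistic(z)                  # probability that y = 1
odds = p / (1.0 - p)             # odds in favour of y = 1
log_odds = math.log(odds)        # the logit, linear in x

# Odds ratio for a one-unit increase in x.
p_next = logistic(alpha + beta * (x + 1.0))
odds_ratio = (p_next / (1.0 - p_next)) / odds

print(p, odds, log_odds, odds_ratio, math.exp(beta))
```

Under this form the log-odds equals $\alpha + \beta x$ and the odds ratio for a unit change in $x$ equals $e^{\beta}$: an OR of 1 corresponds to $\beta = 0$, an OR above 1 to $\beta > 0$, and an OR between 0 and 1 to $\beta < 0$.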
Logit Model
• $L_i$ is the logarithm of the odds ratio (the log-odds) and is linear with respect to both $x_i$ and the parameters. $L_i$ is called the “logit”, which gives the logit model its name. This model is a semi-logarithmic function, and the logit model is one of the best-known generalized linear models.
• The parameters of the model cannot be estimated by applying OLS to $L_i$: when $P_i = 1$ and $P_i = 0$ are substituted into the logit $L_i$, the values $\ln(1/0)$ and $\ln(0/1)$ are obtained, which are undefined.
• Thus, estimates of the parameters in the $L_i$ function cannot be found by OLS, but these parameters can be estimated by the maximum likelihood (ML) method.
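Maximum likelihood estimation of a binary logit can be sketched with a few Newton-Raphson steps. This is one common way to maximize the logit log-likelihood, not necessarily the procedure used by any particular package; the data are simulated and the "true" parameters, sample size and function name `fit_logit_ml` are hypothetical.

```python
import numpy as np

def fit_logit_ml(X, y, n_iter=25):
    """Newton-Raphson maximization of the binary logit log-likelihood."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))          # fitted P(y = 1)
        score = X.T @ (y - p)                          # gradient of log-likelihood
        info = X.T @ (X * (p * (1.0 - p))[:, None])    # negative Hessian
        beta = beta + np.linalg.solve(info, score)
    return beta

# Simulated data with hypothetical true parameters (alpha, beta) = (-0.5, 1.0).
rng = np.random.default_rng(4)
n = 5000
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
true_beta = np.array([-0.5, 1.0])
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(X @ true_beta)))).astype(float)

beta_hat = fit_logit_ml(X, y)
p_hat = 1.0 / (1.0 + np.exp(-(X @ beta_hat)))
score_at_hat = X.T @ (y - p_hat)     # should be numerically zero at the maximum

print(beta_hat)
```

Individual 0/1 outcomes enter the likelihood directly, so the undefined quantities $\ln(1/0)$ and $\ln(0/1)$ never have to be evaluated.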
Logit Model
• However, the following points should be taken into consideration in research using the logit model.
• All appropriate independent variables should be included in the model: Failure to include some variables in the model may cause the error term to grow and the model to be inadequate.
• All unsuitable independent variables should be excluded: Inclusion of causally inappropriate variables can complicate the model.
• Each individual should be observed only once, and there should be no repeated measurements.
• The measurement error in the independent variables must be small, and there should be no missing data. Errors can lead to bias in the estimated coefficients and to inadequacy of the model.
• There should be no multicollinearity between the independent variables: The independent variables must not be interrelated.
• There should be no extreme values: As with linear regression, extreme values can significantly affect the result.
Logit model
• In the logit model, the coefficients cannot be directly interpreted as the effect of a change in the independent variables on the expected value of the dependent variable.
• For this reason, OR values or marginal effects can be calculated in applications. Furthermore, the sign of a coefficient indicates the direction of the relationship between the independent variable and the probability of occurrence of the event.
Probit Model
• In the linear probability model, which is one of the qualitative choice models for a dependent variable that can take two values, the most obvious problem is that the predicted probabilities can fall outside the range “0” to “1”.
• One of the models used to solve this problem is the probit model.
• This model is nonlinear in the coefficients and keeps the probabilities between “0” and “1”. When the dependent variable $y_i$ is binary, $P_i$ is expressed in Equation 9:
• Here $\Phi$ is the cumulative distribution function of the standard normal distribution and $\beta$ denotes the coefficients estimated by maximum likelihood.
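A small sketch of the probit probability follows, assuming the form $P_i = \Phi(\beta_0 + \beta_1 x_i)$ with $\Phi$ the standard normal CDF. The coefficient values are hypothetical, and $\Phi$ is computed from the error function in the Python standard library.

```python
import math

def std_normal_cdf(z):
    """Standard normal CDF, Phi(z) = P(Z <= z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

beta0, beta1 = -1.0, 0.4         # hypothetical probit coefficients

def probit_prob(x):
    """P(y = 1 | x) under the assumed probit specification."""
    return std_normal_cdf(beta0 + beta1 * x)

xs = [-10.0, 0.0, 2.5, 10.0, 50.0]
probs = [probit_prob(x) for x in xs]
print(probs)
```

However extreme the value of $x$, the probability stays within the unit interval and rises with $x$ when $\beta_1 > 0$, in contrast to the fitted values of the linear probability model.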
Probit model
• The probit model assumes that the underlying (latent) dependent variable is normally distributed, whereas the logit model assumes that it follows the logistic distribution.
• Therefore, the tails of the logit cumulative distribution function are heavier than those of the probit model.
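The difference in tail behaviour can be checked directly. In the sketch below the logistic distribution is rescaled to unit variance (scale $\sqrt{3}/\pi$) so that it is comparable with the standard normal; this rescaling is a choice made here for illustration and is not taken from the slides.

```python
import math

def normal_cdf(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def logistic_cdf(z, scale=1.0):
    """Logistic CDF with location 0 and the given scale."""
    return 1.0 / (1.0 + math.exp(-z / scale))

# A logistic variable with scale s has variance (pi * s)^2 / 3,
# so s = sqrt(3) / pi gives unit variance.
unit_var_scale = math.sqrt(3.0) / math.pi

tail_points = (-2.0, -3.0, -4.0)
tails = [(z, logistic_cdf(z, unit_var_scale), normal_cdf(z)) for z in tail_points]
for z, p_logit, p_probit in tails:
    print(z, p_logit, p_probit)
```

At each of these points the logistic CDF puts more probability in the tail than the normal CDF, which is what is meant by the logit curve approaching 0 and 1 more slowly.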
Probit model
• Although these two models give similar results, it is not possible to compare the estimated population coefficients of the two models directly.
• A model is needed in which $P_i$ does not fall outside the range “0” to “1” and in which the relationship between $P_i$ and $x_i$ is curvilinear, with increases in $x_i$ also increasing $P_i$. A model with these two features is illustrated in Fig. 2:
Figure 2. Logit and probit cumulative distribution
Probit Model
• The probit model uses the cumulative normal distribution function and is also called the “normit model” in the literature. The probit probability model based on the normal cumulative distribution function can be represented by Equation 10:
• where $x_i$ is observable but $y_i^*$ is not observable.
• As in the logit model, $y_i = 1$ if $y_i^* > 0$ and $y_i = 0$ if $y_i^* \le 0$. The threshold value $\tau$ used when assigning the outcome $y_i$ is generally taken as “0”, although another value can be used instead of zero.
• Since $y_i^*$ cannot be observed directly, it can be said that the event occurs if $y_i^*$ exceeds the threshold value and does not occur otherwise (Equation 11).
Probit model
• The probability that the event occurs (that is, that the threshold is less than or equal to the index) is calculated from the standardized cumulative distribution function under the assumption of normality. If the cumulative normal distribution function is defined as $\Phi(z) = P(Z \le z)$ for the standard normal variable $Z$, then Equation 12 and Equation 13 are expressed as follows:
• The variable $Z$ here is a standardized normal variable with a mean of “0” and a variance of “1”. Thus, the model can be represented by Equation 14:
• In this model, $F^{-1}$ is the inverse of the normal cumulative distribution function. The following assumptions can be stated for the probit model.
Cont’d
• $y_i \in \{0, 1\}$, $i = 1, 2, \ldots, n$
• $P_i = P(Y = 1 \mid x_i) = \Phi(\beta x_i)$ (unit normal cumulative distribution function)
• $y_1, y_2, \ldots, y_n$ are statistically independent
• There is no exact or near-exact multicollinearity among the $x_i$'s
• Binary probit models can be estimated by weighted least squares (WLS), maximum likelihood (ML) or the iterative minimum chi-square method. In addition, the $R^2$ of the probit model does not give us any idea as to whether the functional form of the model is well chosen.
Tobit Model
• A sample in which information about the dependent variable is available only for some observations is known as a censored sample.
• This model is also counted among the limited dependent variable models because the range of the dependent variable is limited. When the dependent variable is censored, the regression model is expressed in Equation 15:
• This model is called the “Tobit model”. $y_i^*$ is the latent variable and $\tau$ is the censoring point. $y_i^*$ is observed for values greater than $\tau$ and censored otherwise (Equation 16):
• In the traditional Tobit model, $\tau = 0$ in Equation 16, so observations whose $y_i^*$ is at or below zero take the value zero. That is, it is expressed as Equation 17:
Tobit Model
• If the dependent variable is censored, having a lower limit and/or an upper limit, then the least squares estimators of the regression parameters are biased and inconsistent.
• We can apply an alternative estimation procedure, which is called Tobit.
• Tobit is a maximum likelihood procedure that recognizes that we have data of two sorts:
1. The limit observations (y = 0)
2. The non-limit observations (y > 0)
• The two types of observations, the limit observations and those that are positive, are generated by the latent variable y* crossing or not crossing the zero threshold.
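The bias of least squares under censoring can be seen in a short simulation. The sketch below is illustrative only: the latent model, its coefficients, the sample size and the seed are all hypothetical, and it shows the OLS problem rather than the Tobit estimator itself.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5000

# Hypothetical latent model: y* = -0.5 + 2.0 * x + e, with e ~ N(0, 1).
x = rng.normal(size=n)
y_star = -0.5 + 2.0 * x + rng.normal(size=n)

# Observed variable is censored from below at zero: y = max(y*, 0).
y = np.maximum(y_star, 0.0)

X = np.column_stack([np.ones(n), x])
slope_latent = np.linalg.lstsq(X, y_star, rcond=None)[0][1]   # uses unobservable y*
slope_censored = np.linalg.lstsq(X, y, rcond=None)[0][1]      # uses censored y

share_limit = np.mean(y == 0.0)    # share of limit observations (y = 0)
print(slope_latent, slope_censored, share_limit)
```

OLS on the censored variable gives a slope that is pulled toward zero relative to the latent slope, and collecting more data does not remove the gap. Tobit addresses this by treating limit and non-limit observations differently in the likelihood.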
Differences and similarities among Logit, Probit and Tobit
• The most commonly used of these choice models are the logit and probit models.
• Logit and probit analyses are very similar, and the probability estimates obtained are close to each other.
• However, while the logit model uses log-odds, the probit model uses the cumulative normal distribution.
• The structural models of logit, probit and Tobit are similar, but their measurement models are different.
• In the Tobit model, the observed values of the dependent variable are known when $y_i^* > \tau$.
• In the probit and logit models, we know only that $y$ is “1” when $y_i^* > \tau$. When the latent variable is at or below the threshold ($\tau$), its value cannot be known and $y$ is set to zero.
• More information is available in the Tobit model. Therefore, the coefficient estimates obtained from the Tobit model are expected to be more efficient than those obtained from the probit model.
Multinomial logit model
• We are often faced with choices involving more than two alternatives.
• These are called multinomial choice situations.
• If you are shopping for a laundry detergent, which one do you choose? OMO, DURU, POPULAR, SPECIAL BRIGHT, and so on.
• If you enroll in the business school, will you major in economics, marketing, management, finance, or accounting?
Multinomial logit model
• Multinomial logistic regression is used to predict categorical placement in, or the probability of category membership on, a dependent variable based on multiple independent variables.
• The independent variables can be either dichotomous (i.e., binary) or continuous (i.e., interval or ratio in scale).
• Multinomial logistic regression is a simple extension of binary logistic regression that allows for more than two categories of the dependent or outcome variable.
• Like binary logistic regression, multinomial logistic regression uses maximum likelihood estimation to evaluate the probability of categorical membership.
• The multinomial logistic model assumes that data are case specific; that is, each independent variable has a single value for each case.
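How category probabilities arise in a multinomial logit can be sketched as follows. The example assumes the standard specification in which each category $j$ has its own coefficient vector and the base category's coefficients are normalized to zero; the coefficient values, the three-category setup and the function name `mnl_probs` are hypothetical.

```python
import numpy as np

def mnl_probs(x, B):
    """Category probabilities for one case.

    x : (K,) vector of case-specific regressors (including a constant).
    B : (J, K) coefficient matrix; the base category's row is all zeros.
    """
    v = B @ x
    v = v - v.max()              # subtract the maximum for numerical stability
    e = np.exp(v)
    return e / e.sum()

# Hypothetical example: 3 categories, regressors = (constant, income).
B = np.array([
    [0.0, 0.0],      # base category
    [0.5, -0.2],     # category 2 relative to base
    [-1.0, 0.3],     # category 3 relative to base
])
x = np.array([1.0, 2.0])
p = mnl_probs(x, B)

# With only two categories the model reduces to the binary logit.
B2 = np.array([[0.0, 0.0], [0.5, -0.2]])
p2 = mnl_probs(x, B2)
binary_logit = 1.0 / (1.0 + np.exp(-(B2[1] @ x)))

print(p, p.sum(), p2[1], binary_logit)
```

The probabilities are positive and sum to one across categories, and the two-category case coincides with the binary logit probability, consistent with multinomial logit being an extension of binary logistic regression.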
 
How Automation is Driving Efficiency Through the Last Mile of Reporting
How Automation is Driving Efficiency Through the Last Mile of ReportingHow Automation is Driving Efficiency Through the Last Mile of Reporting
How Automation is Driving Efficiency Through the Last Mile of ReportingAggregage
 
High Class Call Girls Nashik Maya 7001305949 Independent Escort Service Nashik
High Class Call Girls Nashik Maya 7001305949 Independent Escort Service NashikHigh Class Call Girls Nashik Maya 7001305949 Independent Escort Service Nashik
High Class Call Girls Nashik Maya 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
TEST BANK For Corporate Finance, 13th Edition By Stephen Ross, Randolph Weste...
TEST BANK For Corporate Finance, 13th Edition By Stephen Ross, Randolph Weste...TEST BANK For Corporate Finance, 13th Edition By Stephen Ross, Randolph Weste...
TEST BANK For Corporate Finance, 13th Edition By Stephen Ross, Randolph Weste...ssifa0344
 
00_Main ppt_MeetupDORA&CyberSecurity.pptx
00_Main ppt_MeetupDORA&CyberSecurity.pptx00_Main ppt_MeetupDORA&CyberSecurity.pptx
00_Main ppt_MeetupDORA&CyberSecurity.pptxFinTech Belgium
 
Bladex Earnings Call Presentation 1Q2024
Bladex Earnings Call Presentation 1Q2024Bladex Earnings Call Presentation 1Q2024
Bladex Earnings Call Presentation 1Q2024Bladex
 
Independent Lucknow Call Girls 8923113531WhatsApp Lucknow Call Girls make you...
Independent Lucknow Call Girls 8923113531WhatsApp Lucknow Call Girls make you...Independent Lucknow Call Girls 8923113531WhatsApp Lucknow Call Girls make you...
Independent Lucknow Call Girls 8923113531WhatsApp Lucknow Call Girls make you...makika9823
 
The Economic History of the U.S. Lecture 22.pdf
The Economic History of the U.S. Lecture 22.pdfThe Economic History of the U.S. Lecture 22.pdf
The Economic History of the U.S. Lecture 22.pdfGale Pooley
 
Booking open Available Pune Call Girls Shivane 6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Shivane  6297143586 Call Hot Indian Gi...Booking open Available Pune Call Girls Shivane  6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Shivane 6297143586 Call Hot Indian Gi...Call Girls in Nagpur High Profile
 
Best VIP Call Girls Noida Sector 18 Call Me: 8448380779
Best VIP Call Girls Noida Sector 18 Call Me: 8448380779Best VIP Call Girls Noida Sector 18 Call Me: 8448380779
Best VIP Call Girls Noida Sector 18 Call Me: 8448380779Delhi Call girls
 
Dividend Policy and Dividend Decision Theories.pptx
Dividend Policy and Dividend Decision Theories.pptxDividend Policy and Dividend Decision Theories.pptx
Dividend Policy and Dividend Decision Theories.pptxanshikagoel52
 
02_Fabio Colombo_Accenture_MeetupDora&Cybersecurity.pptx
02_Fabio Colombo_Accenture_MeetupDora&Cybersecurity.pptx02_Fabio Colombo_Accenture_MeetupDora&Cybersecurity.pptx
02_Fabio Colombo_Accenture_MeetupDora&Cybersecurity.pptxFinTech Belgium
 

Recently uploaded (20)

The Economic History of the U.S. Lecture 20.pdf
The Economic History of the U.S. Lecture 20.pdfThe Economic History of the U.S. Lecture 20.pdf
The Economic History of the U.S. Lecture 20.pdf
 
Dharavi Russian callg Girls, { 09892124323 } || Call Girl In Mumbai ...
Dharavi Russian callg Girls, { 09892124323 } || Call Girl In Mumbai ...Dharavi Russian callg Girls, { 09892124323 } || Call Girl In Mumbai ...
Dharavi Russian callg Girls, { 09892124323 } || Call Girl In Mumbai ...
 
03_Emmanuel Ndiaye_Degroof Petercam.pptx
03_Emmanuel Ndiaye_Degroof Petercam.pptx03_Emmanuel Ndiaye_Degroof Petercam.pptx
03_Emmanuel Ndiaye_Degroof Petercam.pptx
 
Russian Call Girls In Gtb Nagar (Delhi) 9711199012 💋✔💕😘 Naughty Call Girls Se...
Russian Call Girls In Gtb Nagar (Delhi) 9711199012 💋✔💕😘 Naughty Call Girls Se...Russian Call Girls In Gtb Nagar (Delhi) 9711199012 💋✔💕😘 Naughty Call Girls Se...
Russian Call Girls In Gtb Nagar (Delhi) 9711199012 💋✔💕😘 Naughty Call Girls Se...
 
VIP Kolkata Call Girl Jodhpur Park 👉 8250192130 Available With Room
VIP Kolkata Call Girl Jodhpur Park 👉 8250192130  Available With RoomVIP Kolkata Call Girl Jodhpur Park 👉 8250192130  Available With Room
VIP Kolkata Call Girl Jodhpur Park 👉 8250192130 Available With Room
 
Malad Call Girl in Services 9892124323 | ₹,4500 With Room Free Delivery
Malad Call Girl in Services  9892124323 | ₹,4500 With Room Free DeliveryMalad Call Girl in Services  9892124323 | ₹,4500 With Room Free Delivery
Malad Call Girl in Services 9892124323 | ₹,4500 With Room Free Delivery
 
Commercial Bank Economic Capsule - April 2024
Commercial Bank Economic Capsule - April 2024Commercial Bank Economic Capsule - April 2024
Commercial Bank Economic Capsule - April 2024
 
Call US 📞 9892124323 ✅ Kurla Call Girls In Kurla ( Mumbai ) secure service
Call US 📞 9892124323 ✅ Kurla Call Girls In Kurla ( Mumbai ) secure serviceCall US 📞 9892124323 ✅ Kurla Call Girls In Kurla ( Mumbai ) secure service
Call US 📞 9892124323 ✅ Kurla Call Girls In Kurla ( Mumbai ) secure service
 
VIP Kolkata Call Girl Serampore 👉 8250192130 Available With Room
VIP Kolkata Call Girl Serampore 👉 8250192130  Available With RoomVIP Kolkata Call Girl Serampore 👉 8250192130  Available With Room
VIP Kolkata Call Girl Serampore 👉 8250192130 Available With Room
 
How Automation is Driving Efficiency Through the Last Mile of Reporting
How Automation is Driving Efficiency Through the Last Mile of ReportingHow Automation is Driving Efficiency Through the Last Mile of Reporting
How Automation is Driving Efficiency Through the Last Mile of Reporting
 
High Class Call Girls Nashik Maya 7001305949 Independent Escort Service Nashik
High Class Call Girls Nashik Maya 7001305949 Independent Escort Service NashikHigh Class Call Girls Nashik Maya 7001305949 Independent Escort Service Nashik
High Class Call Girls Nashik Maya 7001305949 Independent Escort Service Nashik
 
TEST BANK For Corporate Finance, 13th Edition By Stephen Ross, Randolph Weste...
TEST BANK For Corporate Finance, 13th Edition By Stephen Ross, Randolph Weste...TEST BANK For Corporate Finance, 13th Edition By Stephen Ross, Randolph Weste...
TEST BANK For Corporate Finance, 13th Edition By Stephen Ross, Randolph Weste...
 
00_Main ppt_MeetupDORA&CyberSecurity.pptx
00_Main ppt_MeetupDORA&CyberSecurity.pptx00_Main ppt_MeetupDORA&CyberSecurity.pptx
00_Main ppt_MeetupDORA&CyberSecurity.pptx
 
Bladex Earnings Call Presentation 1Q2024
Bladex Earnings Call Presentation 1Q2024Bladex Earnings Call Presentation 1Q2024
Bladex Earnings Call Presentation 1Q2024
 
Independent Lucknow Call Girls 8923113531WhatsApp Lucknow Call Girls make you...
Independent Lucknow Call Girls 8923113531WhatsApp Lucknow Call Girls make you...Independent Lucknow Call Girls 8923113531WhatsApp Lucknow Call Girls make you...
Independent Lucknow Call Girls 8923113531WhatsApp Lucknow Call Girls make you...
 
The Economic History of the U.S. Lecture 22.pdf
The Economic History of the U.S. Lecture 22.pdfThe Economic History of the U.S. Lecture 22.pdf
The Economic History of the U.S. Lecture 22.pdf
 
Booking open Available Pune Call Girls Shivane 6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Shivane  6297143586 Call Hot Indian Gi...Booking open Available Pune Call Girls Shivane  6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Shivane 6297143586 Call Hot Indian Gi...
 
Best VIP Call Girls Noida Sector 18 Call Me: 8448380779
Best VIP Call Girls Noida Sector 18 Call Me: 8448380779Best VIP Call Girls Noida Sector 18 Call Me: 8448380779
Best VIP Call Girls Noida Sector 18 Call Me: 8448380779
 
Dividend Policy and Dividend Decision Theories.pptx
Dividend Policy and Dividend Decision Theories.pptxDividend Policy and Dividend Decision Theories.pptx
Dividend Policy and Dividend Decision Theories.pptx
 
02_Fabio Colombo_Accenture_MeetupDora&Cybersecurity.pptx
02_Fabio Colombo_Accenture_MeetupDora&Cybersecurity.pptx02_Fabio Colombo_Accenture_MeetupDora&Cybersecurity.pptx
02_Fabio Colombo_Accenture_MeetupDora&Cybersecurity.pptx
 

Dummy variable model

  • 1. Bule Hora University College of Business and Economics Department of Economics Course Title: Econometrics Theory and Application Course Code: MLSCM 2033 Credit Hr: 3 ECTC: 5 Aschalew Shiferaw
  • 2. Dummy Variable Model • In regression analysis, the dependent variable may also be influenced by variables that are qualitative in nature (in addition to quantitative variables). • Such variables include sex, marital status, job category, region, season, etc. • We quantify such variables by artificially assigning values to them (for example, assigning 0 and 1 to sex, where 0 indicates male and 1 indicates female) and then including them in the regression equation together with the other independent variables. Such variables are called dummy variables.
  • 3. Dummy Variable Model • ANOVA model • This model involves only dummy variables as explanatory variables. • Example: Consider the following model: $Y_i = \alpha_1 + \alpha_2 D_i + u_i \quad (1)$ where $Y_i$ = annual starting salary of an employee, and $D_i = 1$ for male, $D_i = 0$ for female.
  • 4. Dummy Variable Model • Under the usual assumptions of the CLRM (in particular $E(u_i) = 0$), the mean salary for a female employee is: $E(Y_i \mid D_i = 0) = E(\alpha_1 + \alpha_2(0) + u_i) = \alpha_1 + E(u_i) = \alpha_1$, and the mean salary for a male employee is: $E(Y_i \mid D_i = 1) = E(\alpha_1 + \alpha_2(1) + u_i) = \alpha_1 + \alpha_2 + E(u_i) = \alpha_1 + \alpha_2$.
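The result on this slide can be checked numerically: in the ANOVA model, the OLS intercept equals the mean of the base category and the dummy coefficient equals the difference in group means. The following is a minimal sketch; the salary figures are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical starting salaries (illustrative numbers, not from the slides).
salary = np.array([20.0, 22.0, 24.0, 18.0, 19.0, 17.0])
male = np.array([1, 1, 1, 0, 0, 0])  # D_i = 1 for male, 0 for female

# Design matrix [1, D_i] for Y_i = a1 + a2 * D_i + u_i
X = np.column_stack([np.ones_like(salary), male])
a1, a2 = np.linalg.lstsq(X, salary, rcond=None)[0]

mean_female = salary[male == 0].mean()
mean_male = salary[male == 1].mean()
print(a1, mean_female)      # intercept vs. mean salary of the base category
print(a1 + a2, mean_male)   # intercept + dummy coefficient vs. male mean
```

Here `a1` plays the role of $\alpha_1$ and `a2` the role of $\alpha_2$ in model (1).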
  • 5. Dummy Variable Model • ANCOVA models • Regression models in most economic research involve quantitative explanatory variables in addition to dummy variables. Such models are known as analysis of covariance (ANCOVA) models. • Example: Consider the following model: $Y_i = \alpha_1 + \alpha_2 D_i + \alpha_3 X_i + u_i \quad (2)$ • where $Y_i$ = annual starting salary of an employee, $X_i$ = years of work experience, and $D_i = 1$ for male, $D_i = 0$ for female.
  • 6. Dummy Variable Model • Assuming that $E(u_i) = 0$, we can see that the mean salary of a female employee is: $E(Y_i \mid D_i = 0) = E(\alpha_1 + \alpha_2(0) + \alpha_3 X_i + u_i) = \alpha_1 + \alpha_3 X_i + E(u_i) = \alpha_1 + \alpha_3 X_i$ • And the mean salary for a male employee is: $E(Y_i \mid D_i = 1) = E(\alpha_1 + \alpha_2(1) + \alpha_3 X_i + u_i) = (\alpha_1 + \alpha_2) + \alpha_3 X_i + E(u_i) = (\alpha_1 + \alpha_2) + \alpha_3 X_i$ Remark: If we use two dummy variables (one each for male and female), our model becomes: $Y_i = \alpha_1 + \alpha_2 D_{1i} + \alpha_3 D_{2i} + \alpha_4 X_i + u_i \quad (3)$ where $D_{1i} = 1$ for male and 0 for female, and $D_{2i} = 1$ for female and 0 for male.
  • 7. Cont… • In model (3) it can clearly be seen that $D_{1i} + D_{2i} = 1$ for every observation, that is, there is perfect multicollinearity between $D_{1i}$ and $D_{2i}$ (together with the intercept). Consequently, the model cannot be estimated. • Thus, the number of dummies should be one less than the number of categories. For example, if a variable has four categories, we should construct only three dummy variables. • Note 1. The category that is assigned the value 0 is referred to as the base category or the benchmark category, and all comparisons are made with reference to this category. 2. The coefficient attached to the dummy variable (e.g. $\alpha_2$ in model (2)) is referred to as the differential intercept coefficient. It tells us by how much the value of the intercept term of the category that is assigned the value 1 differs from that of the base category.
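The reason model (3) cannot be estimated can be seen from the rank of the design matrix. The sketch below uses a small hypothetical sample; the variable names are illustrative and not taken from the slides.

```python
import numpy as np

male = np.array([1, 0, 1, 0, 1, 0])
female = 1 - male                       # D_2i = 1 - D_1i
experience = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
const = np.ones(6)

X_trap = np.column_stack([const, male, female, experience])  # model (3)
X_ok = np.column_stack([const, male, experience])            # model (2)

rank_trap = np.linalg.matrix_rank(X_trap)
rank_ok = np.linalg.matrix_rank(X_ok)
print(rank_trap, X_trap.shape[1])  # rank is below the number of columns
print(rank_ok, X_ok.shape[1])      # full column rank
```

Because the two dummies sum to the constant column, the four-column matrix has rank three, so the normal equations have no unique solution. Dropping one dummy restores full column rank.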
  • 8. Cont… • Dummy variable models are also used if one has to take care of seasonal factors. For example, if we have quarterly data on consumption (C) and income (Y), we fit the regression model: $C_i = \alpha_0 + \alpha_1 D_{1i} + \alpha_2 D_{2i} + \alpha_3 D_{3i} + \alpha_4 Y_i + u_i \quad (4)$ where $D_{1i}$, $D_{2i}$ and $D_{3i}$ are seasonal dummies defined by: $D_{1i} = 1$ for quarter I and 0 otherwise; $D_{2i} = 1$ for quarter II and 0 otherwise; $D_{3i} = 1$ for quarter III and 0 otherwise. • In equation (4), the constant term ($\alpha_0$) is the intercept for the base category (quarter IV). The intercept terms for quarters I, II and III are $(\alpha_0 + \alpha_1)$, $(\alpha_0 + \alpha_2)$ and $(\alpha_0 + \alpha_3)$, respectively.
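As a small illustration of how the seasonal dummies in equation (4) could be built in practice, the sketch below constructs three dummies from a hypothetical quarter index, leaving quarter IV as the base category.

```python
import numpy as np

quarter = np.array([1, 2, 3, 4, 1, 2, 3, 4])  # hypothetical quarterly index

# One dummy per quarter except the base category (quarter IV).
D = np.column_stack([(quarter == q).astype(float) for q in (1, 2, 3)])
print(D)
```

Rows belonging to quarter IV contain only zeros, so their intercept is the constant term alone.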
  • 9. Test of Structural Stability • Suppose we are interested in estimating a simple saving function that relates domestic household savings (S) to gross domestic product (Y) for a certain country. Suppose further that, at a certain point in time, a series of economic reforms was introduced. • The hypothesis here is that such reforms might have considerably influenced the savings-income relationship, that is, the relationship between saving and income might be different in the post-reform period compared to the pre-reform period. • If this hypothesis is true, then we say a structural change has happened. How do we check whether this is so?
  • 10. Test of Structural Stability 1. Chow's test One approach for testing the presence of structural change (structural instability) is by means of Chow's test. The steps involved in this procedure are as follows: a) Estimate the regression equation $S_i = \alpha + \beta Y_i + \varepsilon_i, \quad i = 1, 2, \dots, n \quad (5)$ for the whole period (pre-reform plus post-reform periods) and find the error sum of squares ($ESS_R$). b) Estimate equation (5) using the available data in the pre-reform period (say, of size $n_1$), that is, estimate the model: $S_i = \alpha_1 + \beta_1 Y_i + \varepsilon_{1i}, \quad i = 1, 2, \dots, n_1$
  • 11. Test of Structural Stability and find the error sum of squares ($ESS_1$). c) Estimate equation (5) using the available data in the post-reform period (say, of size $n_2$), that is, estimate the model: $S_i = \alpha_2 + \beta_2 Y_i + \varepsilon_{2i}, \quad i = 1, 2, \dots, n_2$ and find the error sum of squares ($ESS_2$). d) Calculate: $ESS_{UR} = ESS_1 + ESS_2$ e) Calculate the Chow test statistic: $F = \dfrac{(ESS_R - ESS_{UR})/K}{ESS_{UR}/(n_1 + n_2 - 2K)}$ where $ESS_R$ is the error sum of squares for the whole period and $K$ is the number of estimated regression coefficients.
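Steps (a) to (e) can be sketched as a short routine. This is a minimal illustration under stated assumptions, not a reference implementation: the helper names `ess` and `chow_f` are hypothetical, and the data are synthetic with a deliberate shift after the twelfth observation.

```python
import numpy as np

def ess(y, x):
    """Error sum of squares from an OLS fit of y on a constant and x."""
    X = np.column_stack([np.ones_like(x), x])
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta
    return float(resid @ resid)

def chow_f(y, x, n1, k=2):
    """Chow F statistic with the break after the first n1 observations."""
    ess_r = ess(y, x)                                # restricted: whole period
    ess_ur = ess(y[:n1], x[:n1]) + ess(y[n1:], x[n1:])  # ESS_1 + ESS_2
    df = len(y) - 2 * k                              # n1 + n2 - 2K
    return ((ess_r - ess_ur) / k) / (ess_ur / df)

# Synthetic data (an assumption for illustration): the line shifts after obs 12.
rng = np.random.default_rng(0)
x = np.arange(1.0, 24.0)
y = np.where(np.arange(23) < 12, 1.0 + 0.2 * x, -5.0 + 0.8 * x)
y = y + rng.normal(scale=0.1, size=23)

F = chow_f(y, x, n1=12)
print(F)
```

The resulting statistic is compared with the critical value of the F-distribution with $K$ and $n_1 + n_2 - 2K$ degrees of freedom.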
  • 12. Test of Structural Stability f) Decision rule: Reject the null hypothesis of identical intercept and slope for the pre-reform and post-reform periods, that is, $H_0: \alpha_1 = \alpha_2,\ \beta_1 = \beta_2$, if $F > F_\alpha(K, n_1 + n_2 - 2K)$, where $F_\alpha(K, n_1 + n_2 - 2K)$ is the critical value from the F-distribution with $K$ (in our case $K = 2$) and $n_1 + n_2 - 2K$ degrees of freedom for a given significance level $\alpha$. Note that rejecting $H_0$ (the null hypothesis) means there is a structural change.
  • 13. Test of Structural Stability Illustrative Example: 1. The following data are on domestic household saving (S) and gross domestic product (Y) for India for the period 1980 to 2002.
      Year   Savings     GDP           Year   Savings     GDP
      1980   27136       401128        1992   162906      737791.6
      1981   31355       425072.8      1993   193621.3    781345
      1982   34368       438079.5      1994   251463.4    838031
      1983   38587       471742.2      1995   298747.3    899563
      1984   46063       492077.3      1996   317260.7    970082
      1985   54167       513990        1997   352178      1016595
      1986   58951       536256.7      1998   374659      1082748
      1987   72908       556777.8      1999   468681      1148368
      1988   87913       615098.4      2000   495986      1198592
      1989   106979      656331.2      2001   535185      1267833
      1990   131340      692871.5      2002   597697      1318321
      1991   143908      701863.2
    $ESS_1 = 644994361.865$ (error sum of squares in the pre-reform period, $n_1 = 12$); $ESS_2 = 2736652790.434$ (error sum of squares in the post-reform period, $n_2 = 11$); $ESS_R = 13937337067.461$ (error sum of squares for the whole period). We then have: $ESS_{UR} = ESS_1 + ESS_2 = 3381647152.299$
  • 14. Test of Structural Stability • The test statistic is: $F = \dfrac{(ESS_R - ESS_{UR})/K}{ESS_{UR}/(n_1 + n_2 - 2K)} = \dfrac{(13937337067.461 - 3381647152.299)/2}{3381647152.299/(12 + 11 - 2(2))} = 29.65$ The tabulated value from the F-distribution with 2 and 19 degrees of freedom at the 5% level of significance is 3.52. • Decision: Since the calculated value of F exceeds the tabulated value, we reject the null hypothesis of identical intercept and slopes for the pre-reform and post-reform periods at the 5% level of significance. Thus, we can conclude that there is a structural change.
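The arithmetic of the illustrative example can be reproduced directly from the three error sums of squares reported on the slides. The variable names below are illustrative.

```python
ess_1 = 644994361.865     # pre-reform period, n1 = 12
ess_2 = 2736652790.434    # post-reform period, n2 = 11
ess_r = 13937337067.461   # whole period
n1, n2, k = 12, 11, 2

ess_ur = ess_1 + ess_2
f_stat = ((ess_r - ess_ur) / k) / (ess_ur / (n1 + n2 - 2 * k))
print(round(ess_ur, 3), round(f_stat, 2))
```

The computed statistic is far above the tabulated 5% critical value of 3.52 quoted on the slide.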
  • 15. Test of Structural Stability • Drawbacks • Chow's test does not tell us whether the difference (change) is in the slope only, in the intercept only, or in both the intercept and the slope. 2. Using dummy variables Write the saving function as: $S_t = \beta_0 + \beta_1 D_t + \beta_2 Y_t + \beta_3 (D_t Y_t) + u_t, \quad t = 1, 2, \dots, n \quad (6)$ where $S_t$ is household saving at time $t$, $Y_t$ is GDP at time $t$, and $D_t = 0$ for the pre-reform period and $D_t = 1$ for the post-reform period.
  • 16. Test of Structural Stability • Here $\beta_3$ is the differential slope coefficient, indicating how much the slope coefficient of the post-reform saving function differs from the slope coefficient of the pre-reform saving function. Observe that $E(S_t \mid D_t = 0, Y_t) = \beta_0 + \beta_2 Y_t$ and $E(S_t \mid D_t = 1, Y_t) = (\beta_0 + \beta_1) + (\beta_2 + \beta_3) Y_t$. • If $\beta_1$ and $\beta_3$ are both statistically significant as judged by the t-test, then the pre-reform and post-reform regressions differ in both the intercept and the slope. • However, if only $\beta_1$ is statistically significant, then the pre-reform and post-reform regressions differ only in the intercept (meaning the marginal propensity to save (MPS) is the same for the pre-reform and post-reform periods).
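The two conditional expectations above imply that the single regression (6) reproduces the two sub-period regressions. The sketch below checks this on synthetic data; all numbers and names are assumptions made for illustration.

```python
import numpy as np

def ols(X, y):
    """OLS coefficients via least squares."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

rng = np.random.default_rng(1)
income = np.linspace(10.0, 32.0, 23)        # hypothetical Y_t
D = (np.arange(23) >= 12).astype(float)     # 0 = pre-reform, 1 = post-reform
saving = (2.0 + 0.3 * income + D * (-4.0 + 0.25 * income)
          + rng.normal(scale=0.2, size=23))

# Pooled regression (6): S_t = b0 + b1*D_t + b2*Y_t + b3*(D_t*Y_t) + u_t
X = np.column_stack([np.ones(23), D, income, D * income])
b0, b1, b2, b3 = ols(X, saving)

# Separate regressions on each sub-period
pre, post = D == 0, D == 1
a_pre, s_pre = ols(np.column_stack([np.ones(pre.sum()), income[pre]]), saving[pre])
a_post, s_post = ols(np.column_stack([np.ones(post.sum()), income[post]]), saving[post])

print(b0, a_pre, b2, s_pre)               # pre-reform intercept and slope
print(b0 + b1, a_post, b2 + b3, s_post)   # post-reform intercept and slope
```

The advantage of the pooled form is that the t-tests on `b1` and `b3` show directly whether the intercept, the slope, or both have changed.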
  • 17. Test of Structural Stability • Similarly, if only $\beta_3$ is statistically significant, then the two regressions differ only in the slope (MPS). Estimating equation (6) by OLS using the above data yields the following results.
      Variable     B (unstandardized)   Std. Error   Beta (standardized)   t        Sig.
      (constant)   -1336990.916         21258.997                          -6.289   0.000
      Dt           -228628.947          30775.092    -0.641                -7.429   0.000
      Yt           0.375                0.039        0.596                 9.717    0.000
      DtYt         0.339                0.044        1.003                 7.674    0.000
  • 18. Test of Structural Stability • We can see that both the differential intercept coefficient, $\hat{\beta}_1 = -228628.947$ (p-value < 0.001), and the differential slope coefficient, $\hat{\beta}_3 = 0.339$ (p-value < 0.001), are statistically significant. Thus the saving-income relationship for the two periods is different.
  • 19. Limited Dependent variable model • The dependent variable in a regression equation may simply represent a discrete choice, assuming only a limited number of values. • Models involving dependent variables of this kind are called limited (discrete) dependent variable models (also called qualitative response models). • Examples of choices: A) whether to work or not B) whether to attend formal education or not C) choice of occupation D) which brand of consumer durable goods to purchase
  • 20. Limited Dependent variable model • In the above situations, the variables are discrete valued. In examples (a) and (b) the variables are binary (having only two possible values), whereas the variables in (c) and (d) are multinomial (having more than two but a finite number of distinct values). • In such cases, instead of standard regression models, we apply different methods of modeling and analyzing discrete data. • Example: Suppose the choice is whether to work or not. The discrete dependent variable we are working with will assume only two values, 0 and 1: $Y_i = 1$ if the $i$th individual is working/seeking work, and $Y_i = 0$ if the $i$th individual is not working.
  • 21. Limited Dependent variable model • There are four approaches to developing a probability model for a binary response variable: 1. The linear probability model (LPM) 2. The logit model 3. The probit model 4. The tobit model
  • 22. The Linear Probability Model (LPM) • To fix ideas, consider the following regression model: $Y_i = \beta_1 + \beta_2 X_i + u_i \quad (*)$ • where X = family income and Y = 1 if the family owns a house and 0 if it does not own a house. Model (*) looks like a typical linear regression model, but because the regressand is binary, or dichotomous, it is called a linear probability model (LPM). • This is because the conditional expectation of $Y_i$ given $X_i$, $E(Y_i \mid X_i)$, can be interpreted as the conditional probability that the event will occur given $X_i$, that is, $\Pr(Y_i = 1 \mid X_i)$.
  • 23. The Linear Probability Model (LPM) Thus, in our example, $E(Y_i \mid X_i)$ gives the probability of a family owning a house when its income is the given amount $X_i$. The justification of the name LPM for models like Eq. (*) can be seen as follows: • Assuming $E(u_i) = 0$, as usual (to obtain unbiased estimators), we obtain $E(Y_i \mid X_i) = \beta_1 + \beta_2 X_i \quad (**)$ • Now, if $P_i$ = probability that $Y_i = 1$ (that is, the event occurs), and $(1 - P_i)$ = probability that $Y_i = 0$ (that is, the event does not occur), the variable $Y_i$ has the following (probability) distribution:
      Y_i     Probability
      0       1 - P_i
      1       P_i
      Total   1
    • That is, $Y_i$ follows the Bernoulli probability distribution.
  • 24. The Linear Probability Model (LPM) • Now, by the definition of mathematical expectation, we obtain: $E(Y_i) = 0(1 - P_i) + 1(P_i) = P_i \quad (***)$ • Comparing Eq. (**) with Eq. (***), we can equate $E(Y_i \mid X_i) = \beta_1 + \beta_2 X_i = P_i \quad (****)$ • that is, the conditional expectation of the model (*) can, in fact, be interpreted as the conditional probability of $Y_i$. In general, the expectation of a Bernoulli random variable is the probability that the random variable equals 1. • In passing, note that if there are n independent trials, each with a probability p of success and probability (1 − p) of failure, and X represents the number of successes in these trials, then X is said to follow the binomial distribution. The mean of the binomial distribution is np and its variance is np(1 − p). The term success is defined in the context of the problem.
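A small numerical sketch may help here. It fits model (*) by OLS on a hypothetical sample of ten families and reads the fitted values as probabilities; the data are invented for illustration.

```python
import numpy as np

income = np.arange(10.0)                            # hypothetical X_i = 0, 1, ..., 9
owns = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1.0])   # Y_i = 1 if the family owns a house

X = np.column_stack([np.ones(10), income])
b1, b2 = np.linalg.lstsq(X, owns, rcond=None)[0]
p_hat = b1 + b2 * income      # fitted E(Y_i | X_i), read as Pr(Y_i = 1 | X_i)
print(p_hat.round(3))
```

In this sample the fitted value for the lowest income is negative and the one for the highest income exceeds one, which is the well-known weakness of the LPM that the logit and probit models are designed to avoid.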
  • 25. Logit and Probit • Logit and probit models are the most widely used models in practice for estimating the functional relationship between a binary dependent variable and the independent variables. • Logit and probit models also belong to the family of generalized linear models (GLM). • If the latent variable is unobserved or the dependent variable is binary, the model cannot be estimated using ordinary least squares (OLS). • Instead, maximum likelihood estimation is used, which requires assumptions about the distribution of the errors. • Often, the choice is between normal errors in the probit model and logistic errors in the logit model.
  • 26. Con’t… • Limited dependent variable models are generally divided into two groups: censored and truncated regression models. • The Tobit model is also known as the censored regression model. • When the dependent variable is censored, least squares estimates are biased. Therefore, when censoring is applied to the dependent variable, the Tobit model allows us to derive consistent and asymptotically efficient estimators. • The Tobit model, in which the dependent variable has a lower or upper limit, was first used by Tobin to analyze household expenditure on durable consumer goods, taking into account the fact that expenditure cannot be negative.
  • 27. Logit Model • In the logit regression model, none of the assumptions involved in linear regression analysis (a linear relationship with the dependent variable, independent variables drawn from a normal distribution, a normally distributed error term, no relationship between error term values, etc.) is required. • Therefore, it provides researchers with considerable flexibility and has become a preferred method. • A general linear regression model can be written as in Equation 1, where $y_i$ is the dependent variable and $x_1, \dots, x_k$ are the independent variables. $y_i = \alpha + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + \varepsilon_i \quad (1)$
  • 28. Logit Model • In the above model, $\alpha$ is the constant term and the $\beta$ are regression coefficients. This model can be estimated by classical OLS when the dependent variable is continuous. However, logit or probit regression methods are used in cases where the dependent variable is discrete. • The logit model can be used to model the probability of a particular class or event with two states. • Suppose that behind the observed variable $y_i$ there is an unobservable or latent variable $y_i^*$ that ranges between $-\infty$ and $+\infty$. • Values of $y_i^*$ greater than a threshold are recorded as $y_i = 1$, and values less than or equal to the threshold are recorded as $y_i = 0$. • The latent variable $y_i^*$ is assumed to be linearly dependent on the observed $x_i$ through the structural model. $y_i^*$ is connected to the observed binary variable $y_i$ by the measurement equation in Equation 2:
  • 31. Logit Model • Thus, when the dependent variable $y_i$ takes the values “0” and “1”, the model is called the binary logit model. The probability that the dependent variable is “1” is expressed by Equation 3: • In this model, $P_i$ expresses the probability that the $i$th individual makes a particular choice, given the independent variable $x_i$. Thus $P_i$ takes values between “0” and “1”. The equations given in Equation 4 and Equation 5 can be written here:
  • 32. Logit Model • To determine the logit function, the $\alpha$ and $\beta$ parameters cannot be directly estimated by OLS, and Equation 6 is used to estimate the model: • If the ratio of equations (3) and (6) is taken,
  • 33. Logit Model Equation 7 is obtained. It is also called the odds or odds ratio (OR). • Variables whose OR values are close to 1 are not factors that have a significant effect on the change of $y$. • For OR values greater than 1, the factor is interpreted as an important risk factor, provided that the coefficient is significant. • Values close to zero also indicate that the factor is important, provided that the coefficient is significant, but that it is a negative factor that causes $y$ to take low values. • Equation 8 can be written by taking the natural logarithm (base $e$) of this model:
  • 34. Logit Model • $L_i$ is the logarithm of the odds ratio and is linear with respect to both $x_i$ and the parameters. Here $L_i$ is called the “logit model”. This model is a semi-logarithmic function. The logit model is one of the best known models among generalized linear models. • When $P_i = 1$ and $P_i = 0$ are substituted into the logit $L_i$, the values $\ln(1/0)$ and $\ln(0/1)$ are obtained, which are undefined. • Estimates of the parameters in the $L_i$ function therefore cannot be found by OLS, but these parameters can be estimated by the maximum likelihood (ML) method.
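Maximum likelihood estimation of the logit parameters can be sketched with a few Newton-Raphson steps. The code assumes the standard logistic form $P_i = 1/(1 + e^{-(\alpha + \beta x_i)})$, since the slide's own equations are not reproduced in this transcript; the function name `fit_logit` and the simulated data are hypothetical.

```python
import numpy as np

def fit_logit(X, y, iters=25):
    """Newton-Raphson maximum likelihood for P_i = 1 / (1 + exp(-X_i b))."""
    b = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ b))
        grad = X.T @ (y - p)                          # score vector
        hess = X.T @ (X * (p * (1.0 - p))[:, None])   # negative Hessian
        b = b + np.linalg.solve(hess, grad)
    return b

rng = np.random.default_rng(2)
x = rng.normal(size=500)
X = np.column_stack([np.ones(500), x])
p_true = 1.0 / (1.0 + np.exp(-(-0.5 + 1.0 * x)))   # assumed true alpha and beta
y = (rng.uniform(size=500) < p_true).astype(float)

b_hat = fit_logit(X, y)
p_hat = 1.0 / (1.0 + np.exp(-X @ b_hat))
logit = np.log(p_hat / (1.0 - p_hat))              # L_i = ln(P_i / (1 - P_i))
print(b_hat)
```

At the solution the score vector is numerically zero, and the fitted logit is exactly linear in the regressors, as described above.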
  • 35. Logit Model • However, the following points should be taken into consideration in research using the logit model. • All appropriate independent variables should be included in the model: failure to include some variables may cause the error term to grow and the model to be inadequate. • All unsuitable independent variables should be excluded: inclusion of causally inappropriate variables can complicate the model. • Each individual should be observed only once; there should be no repeated measurements. • The measurement error in the independent variables must be small and there should be no missing data: errors can lead to bias in the estimated coefficients and to an inadequate model. • There should be no multicollinearity between the independent variables: the independent variables must not be interrelated. • There should be no extreme values: as with linear regression, extreme values can significantly affect the result.
  • 36. Logit model • In the logit model, the coefficients cannot be directly interpreted as the effect of a change in the independent variables on the expected value of the dependent variable. • For this reason, OR values or marginal effects can be calculated in applications. Furthermore, the sign of a coefficient indicates the direction of the relationship between the independent variable and the probability of occurrence of the event.
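The two quantities mentioned above can be computed from a fitted coefficient in a few lines. The sketch assumes the standard logistic form for $P_i$ and uses hypothetical coefficient values.

```python
import math

alpha, beta = -0.5, 0.8      # hypothetical logit coefficients
x = 1.0

def prob(x):
    """P(y = 1 | x) under the logistic form."""
    return 1.0 / (1.0 + math.exp(-(alpha + beta * x)))

def odds(x):
    return prob(x) / (1.0 - prob(x))

odds_ratio = math.exp(beta)                          # change in odds per unit of x
marginal_effect = beta * prob(x) * (1.0 - prob(x))   # dP/dx evaluated at this x
print(odds_ratio, odds(x + 1.0) / odds(x), marginal_effect)
```

The odds ratio is the same at every value of x, whereas the marginal effect depends on where it is evaluated.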
  • 37. Probit Model • In the linear probability model, which is one of the qualitative choice models for dependent variables that can take two values, the most obvious problem is that the predicted probability values can fall outside the range “0” to “1”. • One of the models used to solve this problem is the probit model. • This model is nonlinear in the coefficients and keeps the probabilities between “0” and “1”. When the dependent variable $y_i$ is binary, $P_i$ is expressed in Equation 9: • Here ϕ is the cumulative distribution function of the standard normal distribution and $\beta$ are the coefficients estimated by maximum likelihood.
  • 38. Probit model • The probit model assumes that the underlying (latent) dependent variable is normally distributed, whereas the logit model assumes that it follows the logistic distribution. • As a result, the tails of the logit cumulative distribution function are wider than those of the probit model.
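The difference in tail behaviour can be seen by evaluating the two cumulative distribution functions at a few points. This sketch uses only the standard library; the helper names are illustrative.

```python
import math

def logistic_cdf(z):
    return 1.0 / (1.0 + math.exp(-z))

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

for z in (-4.0, -2.0, 0.0, 2.0, 4.0):
    print(z, round(logistic_cdf(z), 5), round(normal_cdf(z), 5))
```

Both functions equal 0.5 at zero, but far from zero the logistic function approaches 0 and 1 more slowly than the normal one.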
  • 39. Probit model • Although these two models give similar results, it is not possible to directly compare the estimated population coefficients of the two models. • A suitable model must keep $P_i$ within the range “0” to “1” and make the relationship between $P_i$ and $x_i$ curvilinear, so that increases in $x_i$ also increase $P_i$. The illustration of a model with these two features is given in Fig. 2: Figure 2. Logit and probit cumulative distribution
  • 40. Probit Model • The probit model utilizes the cumulative normal distribution function and is also called the “normit model” in the literature. The probit probability model based on the normal cumulative distribution function can be represented by Equation 10: • where $x_i$ is observable but $y_i^*$ is not observable. • As in the logit model, if $y_i^* > 0$ then $y_i = 1$, and if $y_i^* \le 0$ then $y_i = 0$. The threshold value τ used when assigning the outcome $y_i$ is generally taken as “0”, although another value can be used instead of zero. • Since the latent variable $y_i^*$ behind $y_i$ cannot be observed directly, it can be said that the event occurs if $y_i^*$ exceeds the threshold value and does not occur otherwise (Equation 11).
  • 41. Probit model • The probability that $y_i^*$ is less than or equal to the threshold is calculated from the standardized cumulative distribution function under the assumption of normality. If the cumulative normal distribution function ϕ(Z) is defined as ϕ(z) = P(Z ≤ z) for the standard normal variable Z, then Equation 12 and Equation 13 are expressed as follows: • The variable Z here is a standardized normal variable with a mean of “0” and a variance of “1”. Thus, the model can be represented by Equation 14: • In this model, $F^{-1}$ is the inverse of the normal cumulative distribution function. The following assumptions can be stated for the probit model.
  • 42. Cont’d • 𝑦𝑖∈{0,1}, 𝑖=1,2,…,𝑛 • 𝑃𝑖=𝐸(𝑌|𝑥𝑖)=𝑃(𝑌=1|𝑥𝑖)=ϕ(𝛽𝑥𝑖), where ϕ is the unit normal cumulative distribution function • 𝑦1,𝑦2,…,𝑦𝑛 are statistically independent • There is no exact multicollinearity among the 𝑥𝑖's • Binary probit models can be estimated by WLSM (Weighted Least Squares Method), ML (Maximum Likelihood Method), or the minimum chi-square method, which can be computed iteratively with WLSM. In addition, the 𝑅2 coefficient in the probit model does not indicate whether the functional form of the model is well chosen.
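Maximum likelihood estimation of a binary probit model can be sketched as follows. This is only an illustration under stated assumptions: the data are made up, the model has one regressor plus an intercept, and a coarse grid search stands in for the iterative optimizers that statistical software normally uses.

```python
from math import log
from statistics import NormalDist

STD_NORMAL = NormalDist()


def probit_log_likelihood(beta0, beta1, xs, ys):
    """Log-likelihood of a probit model with an intercept and one regressor."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = STD_NORMAL.cdf(beta0 + beta1 * x)
        # Keep p strictly inside (0, 1) so the logarithms are defined.
        p = min(max(p, 1e-12), 1.0 - 1e-12)
        total += y * log(p) + (1 - y) * log(1.0 - p)
    return total


# Made-up data for illustration; the outcomes are not perfectly separated by x.
xs = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
ys = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]

# Coarse grid search over both coefficients, from -3.0 to 3.0 in steps of 0.1.
grid = [i / 10 for i in range(-30, 31)]
best = max(
    ((b0, b1) for b0 in grid for b1 in grid),
    key=lambda b: probit_log_likelihood(b[0], b[1], xs, ys),
)

print("grid maximizer (beta0, beta1):", best)
print("log-likelihood:", probit_log_likelihood(best[0], best[1], xs, ys))
```

The log-likelihood relies on the independence assumption listed above: the joint probability of the sample is the product of the individual probabilities, so its logarithm is a sum.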
  • 43. Tobit Model • A sample in which information on the dependent variable is available only for some observations is known as a censored sample. • Because the dependent variable is limited, this model is classed among the limited dependent variable models. When the dependent variable is censored, the regression model is expressed as in Equation 15: • This model is called the “Tobit model”. 𝑦𝑖∗ is the latent variable and τ is the censoring point. 𝑦𝑖 is observed for values of 𝑦𝑖∗ greater than τ and censored otherwise (Equation 16): • In the traditional Tobit model, τ=0 in Equation 16, so observations with 𝑦𝑖∗ at or below zero take the value zero. That is, the model is expressed as Equation 17:
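The traditional Tobit model with censoring at zero is commonly written as sketched below. This is an assumed textbook form consistent with the description on this slide, not a reproduction of the slide's own Equations 15 to 17; the normality of the error term with variance σ² is the usual Tobit assumption.

```latex
% Latent regression (assumed form)
y_i^{*} = \beta x_i + \varepsilon_i, \qquad \varepsilon_i \sim N(0, \sigma^{2})

% Observation rule for the traditional Tobit model, \tau = 0 (assumed form)
y_i =
\begin{cases}
y_i^{*} & \text{if } y_i^{*} > 0 \\
0 & \text{if } y_i^{*} \le 0
\end{cases}
```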
  • 44. Tobit Model • If the dependent variable is censored, having a lower limit and/or an upper limit, then the least squares estimators of the regression parameters are biased and inconsistent. • We can apply an alternative estimation procedure, which is called Tobit. • Tobit is a maximum likelihood procedure that recognizes that we have data of two sorts: 1. The limit observations (y = 0) 2. The non-limit observations (y > 0) • The two types of observations, the limit observations and those that are positive, are generated by the latent variable y* either crossing the zero threshold or not crossing it.
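The bias of least squares under censoring can be seen in a small simulation. The sketch below is illustrative only: the true intercept (1.0), true slope (2.0), sample size and random seed are all assumptions, and the helper `ols_slope` is a hypothetical name for a plain simple-regression slope.

```python
import random


def ols_slope(xs, ys):
    """Least squares slope of y on x in a simple regression with an intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    return s_xy / s_xx


rng = random.Random(0)  # fixed seed so the run is reproducible
n = 5000
xs = [rng.uniform(-2.0, 2.0) for _ in range(n)]

# Latent variable: y* = 1 + 2x + e, with e standard normal (assumed values).
latent = [1.0 + 2.0 * x + rng.gauss(0.0, 1.0) for x in xs]

# Observed variable: censored from below at zero.
censored = [max(0.0, y_star) for y_star in latent]

slope_latent = ols_slope(xs, latent)      # close to the true slope of 2
slope_censored = ols_slope(xs, censored)  # pulled toward zero by the limit observations

print("slope using latent y*:  ", round(slope_latent, 3))
print("slope using censored y: ", round(slope_censored, 3))
```

Increasing the sample size does not remove the gap between the two slopes, which is what inconsistency means here.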
  • 45. Differences and similarities among Logit, Probit and Tobit • Among these choice models, the logit and probit models are the most commonly used. • Logit and probit analyses are very similar, and the probability estimates they produce are close to each other. • However, the logit model works with log-odds (the logarithm of the odds), while the probit model uses the cumulative normal distribution. • The structural models of logit, probit and Tobit are similar, but the way the dependent variable is observed differs. • In the Tobit model, the value of the dependent variable is observed when 𝑦𝑖∗>τ. • In the probit and logit models, 𝑦 takes the value “1” only if 𝑦𝑖∗>τ; if the latent value is at or below the threshold τ, it cannot be observed and 𝑦 is set to zero. • The Tobit model therefore uses more information, so its coefficient estimates are expected to be more efficient than those obtained from the probit model.
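The difference in what each model observes from the same latent variable can be sketched as follows. The function names are hypothetical, and recording a censored Tobit observation at the threshold value itself is an assumption (with τ=0 this gives the usual zero).

```python
def observe_tobit(y_star, tau=0.0):
    """Tobit: the actual value is seen above the threshold, the limit value otherwise."""
    return y_star if y_star > tau else tau


def observe_binary(y_star, tau=0.0):
    """Logit/probit: only whether the latent value exceeds the threshold is seen."""
    return 1 if y_star > tau else 0


# Made-up latent values, for illustration only.
latent_values = [-1.2, 0.3, 2.5]

tobit_observed = [observe_tobit(v) for v in latent_values]
binary_observed = [observe_binary(v) for v in latent_values]

print(tobit_observed)
print(binary_observed)
```

The binary outcome cannot distinguish 0.3 from 2.5, while the Tobit outcome can, which is the extra information referred to above.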
  • 46. Multinomial logit model • We are often faced with choices involving more than two alternatives. • These are called multinomial choice situations. • If you are shopping for a laundry detergent, which one do you choose: OMO, DURU, POPULAR, SPECIAL BRIGHT, or another brand? • If you enroll in the business school, will you major in economics, marketing, management, finance, or accounting?
  • 47. Multinomial logit model • Multinomial logistic regression is used to predict the category of a dependent variable, or the probability of membership in each category, based on multiple independent variables. • The independent variables can be either dichotomous (i.e., binary) or continuous (i.e., interval or ratio in scale). • Multinomial logistic regression is a simple extension of binary logistic regression that allows for more than two categories of the dependent or outcome variable. • Like binary logistic regression, multinomial logistic regression uses maximum likelihood estimation to evaluate the probability of categorical membership. • The multinomial logistic model assumes that data are case specific; that is, each independent variable has a single value for each case.
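The category probabilities of a multinomial logit model can be computed as in the sketch below. It is a minimal illustration, not an estimation routine: the coefficient values are made up, the function name is hypothetical, and a single case-specific regressor is assumed, with one category serving as the base whose coefficients are fixed at zero.

```python
from math import exp


def multinomial_logit_probabilities(x, coefficients):
    """Return P(category) for one case with regressor value x.

    coefficients maps each category to (intercept, slope);
    the base category is the one with (0.0, 0.0).
    """
    indexes = {cat: a + b * x for cat, (a, b) in coefficients.items()}
    # Subtract the largest index before exponentiating, for numerical stability.
    largest = max(indexes.values())
    weights = {cat: exp(v - largest) for cat, v in indexes.items()}
    total = sum(weights.values())
    return {cat: w / total for cat, w in weights.items()}


# Made-up coefficients for three majors; "economics" is the base category.
coefficients = {
    "economics": (0.0, 0.0),
    "marketing": (0.5, -0.2),
    "finance": (-0.3, 0.4),
}

probabilities = multinomial_logit_probabilities(1.0, coefficients)
for category, probability in probabilities.items():
    print(category, round(probability, 4))
```

With only two categories the same formula reduces to the binary logit probability, which is the sense in which the multinomial model extends binary logistic regression.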