2. • In theory, linear regression is built on quantitative data, that is, on variables
measured on a ratio scale.
• In practice, however, qualitative variables often play a crucial role in the
overall analysis.
• These nominal scale variables are known as indicator variables, categorical
variables, qualitative variables, or dummy variables.
3. • In regression analysis the dependent variable, or regressand, is frequently influenced not
only by ratio scale variables but also by variables that are essentially qualitative, or
nominal scale, in nature, such as sex, race, color, religion, nationality, geographical
region, political upheavals, and party affiliation.
• For example, holding all other factors constant, female workers are found to earn less
than their male counterparts or nonwhite workers are found to earn less than whites.
• This pattern may result from sex or racial discrimination, but whatever the reason,
qualitative variables such as sex and race seem to influence the regressand and clearly
should be included among the explanatory variables, or the regressors.
4. • Since such variables usually indicate the presence or absence of a “quality” or an
attribute, we can “quantify” such attributes by constructing artificial variables
that take on values of 1 or 0, 1 indicating the presence of that attribute and 0
indicating the absence of that attribute.
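A minimal sketch of this 1/0 coding in Python. The sample values and the name `female` are made up for illustration only.

```python
# Hypothetical sample; the attribute being coded is "female".
sex = ["female", "male", "male", "female", "male"]

# 1 marks the presence of the attribute, 0 its absence.
female = [1 if s == "female" else 0 for s in sex]

print(female)  # [1, 0, 0, 1, 0]
```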
DUMMY VARIABLE MODELS
• ANOVA (only qualitative regressors)
• ANCOVA (both qualitative and quantitative regressors)
6. DUMMY VARIABLE TRAP
• In the earlier example, to distinguish the four categories, we used
only three dummy variables: D2, D3, and D4.
• Suppose instead that we include a dummy variable for every category
together with an intercept, and write the model as:
Y = β0 + β2D2 + β3D3 + β4D4 + β5D5
7. • If we tried to run this regression, the software would be unable to
produce unique estimates.
• The reason is that with a dummy variable for each category or group
and also an intercept, we have a case of perfect collinearity, that is,
an exact linear relationship among the regressors: for every observation
the dummies sum to 1, which reproduces the intercept column.
8. • The solution: If a qualitative variable has m categories, introduce only
(m − 1) dummy variables.
• There is also a way to circumvent this trap by introducing as many
dummy variables as the number of categories of that variable,
provided we do not introduce the intercept in such a model.
Y = β2D2 + β3D3 + β4D4 + β5D5
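The trap can be checked numerically. The sketch below is only an illustration under assumed data: the category labels A to D and the helper names `dummies` and `rank` are hypothetical. It compares the column rank of three designs for a variable with m = 4 categories.

```python
from fractions import Fraction

def dummies(values, categories):
    """One 0/1 column per category, returned row by row."""
    return [[1 if v == c else 0 for c in categories] for v in values]

def rank(matrix):
    """Rank by Gaussian elimination in exact rational arithmetic."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

# Hypothetical qualitative variable with m = 4 categories.
cats = ["A", "B", "C", "D"]
obs = ["A", "B", "C", "D", "A", "C", "D", "B"]

full = dummies(obs, cats)               # m dummies, no intercept
trap = [[1] + row for row in full]      # intercept + m dummies
safe = [[1] + row[1:] for row in full]  # intercept + (m - 1) dummies

print(rank(trap), "of", len(trap[0]))   # 4 of 5: perfect collinearity
print(rank(safe), "of", len(safe[0]))   # 4 of 4
print(rank(full), "of", len(full[0]))   # 4 of 4
```

Only the design with an intercept and all m dummies is rank deficient. Dropping either one dummy or the intercept restores full column rank.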
9. ANOVA Models with Two Qualitative Variables
• In regression analysis the dependent variable, or regressand, is
frequently influenced by several qualitative variables at the same time.
• These variables can be gender, race, color, religion, nationality,
geographical region, etc.
Yi = β1 + β2D2i + β3D3i + μi
• Here D2i and D3i are the dummy variables for the two qualitative variables.
10. Example
• We take a data set on life satisfaction by Marital Status and
Gender
• Dependent variable- Life satisfaction
• Independent variable (Qualitative)- Marital Status
• Independent variable (Qualitative)- Gender
12. The ANCOVA Models
• Regression models containing a mixture of quantitative and
qualitative variables are called analysis of covariance (ANCOVA)
models
• They provide a method of statistically controlling the effects of
quantitative regressors, called covariates or control variables
• These regression models are used in most economic research
Yi = β1 + β2D2i + β3D3i + β4Xi + μi
13. Example
• We take a data set on life satisfaction by Marital Status and
Income
• Dependent variable- Life satisfaction
• Independent variable (Qualitative)- Marital Status
• Independent variable (Quantitative)- Income
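A rough sketch of how such an ANCOVA regression could be estimated. Everything here is assumed for illustration: the data are made up and noise-free, marital status is taken to have three categories (single as the base, with D2 = married and D3 = divorced), and `ols` is a small hypothetical helper that solves the normal equations using only the standard library.

```python
from fractions import Fraction

def ols(X, y):
    """OLS coefficients from the normal equations (X'X) b = X'y.

    Uses Gauss-Jordan elimination in exact rational arithmetic; it stops
    with StopIteration if the regressors are perfectly collinear.
    """
    X = [[Fraction(v) for v in row] for row in X]
    y = [Fraction(v) for v in y]
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))]
         for i in range(k)]
    for c in range(k):
        p = next(i for i in range(c, k) if A[i][c] != 0)
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for i in range(k):
            if i != c:
                factor = A[i][c]
                A[i] = [a - factor * b for a, b in zip(A[i], A[c])]
    return [float(A[i][k]) for i in range(k)]

# Hypothetical data: (marital status, income, life satisfaction).
rows = [("single", 2, 4.0), ("single", 4, 5.0),
        ("married", 3, 6.5), ("married", 5, 7.5),
        ("divorced", 2, 3.0), ("divorced", 6, 5.0)]

# Columns: intercept, D2 (married), D3 (divorced), X (income).
X = [[1, int(s == "married"), int(s == "divorced"), inc] for s, inc, _ in rows]
y = [sat for _, _, sat in rows]

b = ols(X, y)
print(b)  # [3.0, 2.0, -1.0, 0.5]
```

The data were generated without noise from β1 = 3, β2 = 2, β3 = -1, β4 = 0.5, so the estimates recover those values exactly. The dummy coefficients are differences from the base category at any given income.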
15. Chow Test
• A Chow test is a statistical test of whether the coefficients
of two regression models fitted to different data sets are
equal.
• The Chow test is typically used in econometrics with
time series data to determine whether there is a structural break in the
data at some point.
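The usual Chow statistic compares the residual sum of squares (RSS) of one pooled regression with that of two separate regressions: F = [(RSS_pooled − RSS_1 − RSS_2) / k] / [(RSS_1 + RSS_2) / (n1 + n2 − 2k)], where k is the number of parameters per regression. The sketch below applies it to a simple regression (k = 2). The data and the function names `rss` and `chow_f` are hypothetical.

```python
def rss(x, y):
    """Residual sum of squares from a simple regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))

def chow_f(x1, y1, x2, y2, k=2):
    """Chow F statistic for two sub-samples, k parameters per regression."""
    rss_pooled = rss(x1 + x2, y1 + y2)     # restricted: one regression
    rss_split = rss(x1, y1) + rss(x2, y2)  # unrestricted: two regressions
    df = len(x1) + len(x2) - 2 * k
    return ((rss_pooled - rss_split) / k) / (rss_split / df)

# Made-up data in which the slope roughly doubles in the second period.
x1, y1 = [1, 2, 3, 4, 5], [2.1, 2.9, 4.2, 4.8, 6.0]
x2, y2 = [6, 7, 8, 9, 10], [9.0, 11.2, 12.9, 15.1, 17.0]

f_stat = chow_f(x1, y1, x2, y2)
print(f_stat)  # compare with the critical value of F(k, n1 + n2 - 2k)
```

A value well above the critical F value suggests that the two periods do not share the same coefficients.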
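19. Dummy Variable Alternative to the Chow Test
The four possibilities are:
1) Coincident regressions: same intercept and same slope
2) Parallel regressions: different intercepts, same slope
3) Concurrent regressions: same intercept, different slopes
4) Dissimilar regressions: different intercepts and different slopes
24. Yt = α1 + α2Dt + β1Xt + β2(DtXt) + μt
• Here Dt = 1 for observations in one period and 0 in the other; α2 is the
differential intercept and β2 is the differential slope coefficient.
Dummy Variable Alternative to the Chow Test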
25. EXAMPLE
• Belize population growth rate structural break
• Determining if there was a structural break in Belize’s population
growth rate in 1978
• Data used: 1970-1992
28. • Y = cyclical + trend + seasonal + random
• Removing seasonal effect from time series data using dummy
variables
29. Deseasonalise US unemployment rate
• Quarterly unemployment rate: 1947 Q4 – 2022 Q2
• Unemp = b1 + b2*q2 + b3*q3 + b4*q4 + ui
• The benchmark category is the q1 unemployment rate
• Regress unemployment on the dummy variables q2, q3, q4
• Calculate the residuals
• Add the mean of unemployment to the residuals
• The resulting time series is free from seasonal effects
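The steps above can be sketched in a few lines. With an intercept and the dummies q2, q3, q4 as the only regressors, the fitted value for each observation is the mean of its own quarter, so the residual is the observation minus that quarterly mean. The series below is made up (it is not the US unemployment data) and the function name `deseasonalise` is hypothetical.

```python
from statistics import mean

def deseasonalise(series, quarters):
    """Residuals from a regression on quarterly dummies, plus the overall mean."""
    overall = mean(series)
    # Fitted values of the dummy regression are the quarter-specific means.
    q_mean = {q: mean(v for v, qq in zip(series, quarters) if qq == q)
              for q in set(quarters)}
    return [v - q_mean[q] + overall for v, q in zip(series, quarters)]

# Hypothetical quarterly series covering two years.
unemp = [6.0, 5.0, 4.0, 5.0, 7.0, 6.0, 5.0, 6.0]
quarters = [1, 2, 3, 4, 1, 2, 3, 4]

adjusted = deseasonalise(unemp, quarters)
print(adjusted)  # [5.0, 5.0, 5.0, 5.0, 6.0, 6.0, 6.0, 6.0]
```

In this toy series the within-year pattern disappears and only the rise from the first year to the second remains.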
30. • An interaction effect occurs when the effect of one variable
on an outcome depends on the level of another variable.
• In other words, the relationship between two variables
changes depending on the values of one or both of those
variables.
INTERACTION EFFECTS USING
DUMMY VARIABLES
31. INTERACTION EFFECTS USING
DUMMY VARIABLES
When we include interaction terms between dummy variables in a
regression model, we are examining whether the effect of one
categorical variable on the outcome depends on the level of
another categorical variable.
32. • Suppose we want to examine the relationship between gender and
income, and we suspect that the relationship may differ depending
on whether the person is married or not.
• We can create two dummy variables, one for gender and one for
marital status, where male is coded as 1 and female is coded as 0, and
married is coded as 1 and unmarried is coded as 0.
EXAMPLE
33. • We can then include an interaction term between these two dummy
variables in a regression model predicting income:
• income = β0 + β1*gender + β2*marital_status +
β3*(gender*marital_status) + ε
• where β0 is the intercept (mean income of unmarried females), β1 is the
effect of gender on income for unmarried people, β2 is the effect of marital
status on income for females, β3 is the interaction effect between gender and
marital status on income, and ε is the error term.
EXAMPLE
34. If the interaction effect β3 is significant, it means that the
effect of gender on income differs depending on whether the
person is married or not. In other words, the relationship
between gender and income is moderated by marital status.
EXAMPLE
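With two dummies and their interaction the model has one parameter per group, so the OLS estimates can be read directly off the four group means. The sketch below uses made-up incomes (in thousands) and the coding from the example: male = 1, female = 0, married = 1, unmarried = 0. The helper name `cell_mean` is hypothetical.

```python
from statistics import mean

# Hypothetical data as (gender, married, income).
data = [(0, 0, 30.0), (0, 0, 34.0), (1, 0, 36.0), (1, 0, 40.0),
        (0, 1, 33.0), (0, 1, 35.0), (1, 1, 48.0), (1, 1, 52.0)]

def cell_mean(g, m):
    """Mean income of the group with gender g and marital status m."""
    return mean(y for gg, mm, y in data if (gg, mm) == (g, m))

b0 = cell_mean(0, 0)                   # unmarried females
b1 = cell_mean(1, 0) - b0              # gender gap among the unmarried
b2 = cell_mean(0, 1) - b0              # marriage gap among females
b3 = cell_mean(1, 1) - cell_mean(1, 0) - cell_mean(0, 1) + b0  # interaction

print(b0, b1, b2, b3)  # 32.0 6.0 2.0 10.0
```

Here the gender gap is β1 = 6 among unmarried people and β1 + β3 = 16 among married people. That difference of 10 is the interaction effect.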