1. University of South Florida
Discrete Choice Model
Dr. Shivendu
2. Agenda
5/24/2022
Discrete choice models
Linear Probability Models
Logit Models
Probit Models
Quiz 7: Based on Class 8 Readings
Statistical Analysis III: Logistic Regression
Chap 18 and 19 of DAU_SAS
SAS Assignment 8 posted: Due before class 9
3. Type of Discrete Response Models
• Qualitative dichotomy (e.g., vote/not vote type variables)- We equate "no" with zero and
"yes" with 1. However, these are qualitative choices and the coding of 0-1 is arbitrary. We
could equally well code "no" as 1 and "yes" as zero.
• LPM
• Probit
• Logit
• CLASS 9: Revised Syllabus
• Multinomial Models: Qualitative multichotomy (e.g., occupational choice by an individual)-
Let 0 be a clerk, 1 an engineer, 2 an attorney, 3 a politician, 4 a college professor, and 5 other.
Here the codings are mere categories and the numbers have no real meaning.
• Ordinal Models: Rankings (e.g., opinions about a politician's job performance)- Strongly
approve (5), approve (4), don't know (3), disapprove (2), strongly disapprove (1). The values
that are chosen are not quantitative, but merely an ordering of preferences or opinions. The
difference between outcomes is not necessarily the same from 5 to 4 as it is from 2 to 1.
• Count outcomes: count models (e.g., Poisson); censored outcomes: Censored Regression or Tobit Models
4. Binary Response Models
• Y variable has only two outcomes, which we can abstract as two values, 0 and 1
• We start with the thinking that the outcome Y depends on a set of X variables
• This is similar to linear regression model conceptualization
5. Categorical Response Variables
Examples:
• Whether or not a person smokes: Y = {Non-smoker, Smoker} (binary response)
• Success of a medical treatment: Y = {Dies, Survives} (binary response)
• Opinion poll responses: Y = {Disagree, Neutral, Agree} (ordinal response)
6. OLS and Binary Y Variable
• Problem: OLS regression wasn’t really designed for dichotomous dependent
variables
• Two possible outcomes (typically labeled 0 & 1)
• What kinds of problems come up?
• Linearity assumption doesn’t hold up
• Error distribution is not normal
• The model offers nonsensical predicted values
• Instead of predicting pass (1) or fail (0), the regression line might predict -.5.
7. Example: Height predicts Gender
Y = Gender (0=Male 1=Female)
X = Height (inches)
Try an ordinary linear regression
> regmodel=lm(Gender~Hgt,data=Pulse)
> summary(regmodel)
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 7.343647 0.397563 18.47 <2e-16 ***
Hgt -0.100658 0.005817 -17.30 <2e-16 ***
9. The Linear Probability Model (LPM)
P(Yi = 1) = β1 + β2X2i + ... + βKXKi + ei
•Solution #1: Use OLS regression anyway!
•Dependent variable = the probability that Y = 1 (as opposed to 0)
• In previous example, Y = 1 for Female; Y = 0 for Male
•We'll assume that the probability changes as a linear function of the independent variable, height:
P(Yi = 1) = Σj βjXji + ei, summing over j = 1, ..., K
•Note: This assumption may not be appropriate
10. Linear Probability Model (LPM)
• The LPM may yield reasonable results
• Often good enough to get a “crude look” at your data
• Results tend to be better if data is well behaved
• Ex: If there are decent numbers of cases in each category of the dependent variable.
• Interpretation:
• Coefficients (b) reflect the increase in probability of Y=1 for each unit change in X
• Constant (a) reflects the base probability of Y=1 if all X variables are zero
• Significance tests are done; but may not be trustworthy due to OLS assumption violations.
11. LPM Example: Own a gun?
• Stata OLS output:
. regress gun male educ income south liberal
Source | SS df MS Number of obs = 850
-------------+------------------------------ F( 5, 844) = 17.86
Model | 18.3727851 5 3.67455703 Prob > F = 0.0000
Residual | 173.628391 844 .205720843 R-squared = 0.0957
-------------+------------------------------ Adj R-squared = 0.0903
Total | 192.001176 849 .226149796 Root MSE = .45356
------------------------------------------------------------------------------
gun | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
male | .1637871 .0314914 5.20 0.000 .1019765 .2255978
educ | -.0153661 .00525 -2.93 0.004 -.0256706 -.0050616
income | .0379628 .0071879 5.28 0.000 .0238546 .0520711
south | .1539077 .0420305 3.66 0.000 .0714111 .2364043
liberal | -.0313841 .011572 -2.71 0.007 -.0540974 -.0086708
_cons | .13901 .1027844 1.35 0.177 -.0627331 .3407531
------------------------------------------------------------------------------
Interpretation: Each additional year of education decreases probability of gun
ownership by .015. What about other vars?
12. LPM Example: Own a gun?
P(Yi = 1) = β1 + β2Malei + β3Educi + β4Inci + β5Southi + β6Liberali
• OLS results can yield predicted probabilities
• Just plug in values of constant, X's into linear equation
• Ex: A conservative, poor, southern male:
P(Y = 1) = .139 + .16(1) - .015(12) + .038(6) + .15(1) - .03(0)
P(Y = 1) = .501
(Stata OLS output repeated from slide 11 for reference.)
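The plug-in calculation above can be checked with a short script (a sketch using the rounded coefficients from the slide; with full-precision coefficients the result is approximately .50):

```python
# LPM prediction: plug X values into the fitted linear equation.
# Rounded coefficients taken from the OLS output above.
coefs = {"_cons": 0.139, "male": 0.16, "educ": -0.015,
         "income": 0.038, "south": 0.15, "liberal": -0.03}

def lpm_predict(male, educ, income, south, liberal):
    return (coefs["_cons"] + coefs["male"] * male + coefs["educ"] * educ
            + coefs["income"] * income + coefs["south"] * south
            + coefs["liberal"] * liberal)

# A conservative, poor, southern male:
p = lpm_predict(male=1, educ=12, income=6, south=1, liberal=0)
print(round(p, 3))  # 0.497 with rounded coefficients; slide reports .501 from full precision
```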
13. LPM Example: Own a gun?
• Predicted probability for a female PhD student
• Highly educated northern liberal female:
P(Y = 1) = .139 + .16(0) - .015(20) + .038(4) + .15(0) - .03(7)
P(Y = 1) = .139 - 0 - .30 + .15 - 0 - .21 = -.22
• Note: the predicted "probability" is negative, which makes no sense
(Stata OLS output repeated from slide 11 for reference.)
14. LPM: Weaknesses
• Model yields nonsensical predicted values
• Probabilities should always fall between 0 and 1.
• Assumptions of OLS regression are violated
• Linearity
• Homoskedasticity (equal error variance across values of X): violated, since error variance is low when the predicted probability is near 0 or 1 and high at intermediate values.
• Normality of error distribution
• Coefficients (b) are not biased; but not “best” (i.e., lowest possible sampling variance)
• Variances & Standard errors will be inaccurate
• Hypothesis tests (t-tests, f-tests) can’t be trusted
15. Logistic Regression
•Better Alternative: Logistic Regression
• Also called “Logit”
• A non-linear form of regression that works well for binary or dichotomous
dependent variables
• Other non-linear formulations also work (e.g., probit)
•Based on “odds” rather than probability
• Rather than model P(Y=1), we model “log odds” of Y=1
• “Logit” refers to the natural log of an odds…
• Logistic regression is regression for a logit
• Rather than a simple variable “Y” (OLS)
• Or a probability (the Linear Probability Model).
16. Probability & Odds
•Probability of event A defined as p(A):
p(A) = (number of outcomes in which A occurs) / (total number of outcomes)
• Example: Coin Flip… probability of "heads"
• 1 outcome is "heads", 2 total possible outcomes
• P("heads") = 1 / 2 = .5
• Odds of A = Number of outcomes that are A, divided by number of outcomes that are not A
• Odds of "heads" = 1 / 1 = 1.0
• Also equivalent to: probability of event over probability of it not happening: p/(1-p) = (.5 / (1-.5)) = 1.0
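The probability/odds conversions on this slide amount to two one-line functions (a minimal sketch):

```python
# Converting between probability and odds, as in the coin-flip example.
def odds(p):
    return p / (1 - p)

def prob(odds_value):
    return odds_value / (1 + odds_value)

print(odds(0.5))   # 1.0 -- "heads" at even odds
print(prob(1.0))   # 0.5
```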
17. Logistic Regression
•We can convert a probability to odds:
oddsi = pi / (1 - pi)
• "Logit" = natural log (ln) of an odds
• Natural log means base "e", not base 10
– We can model a logit as a function of independent variables:
logit(pi) = Li = ln(pi / (1 - pi)) = Σj βjXji, summing over j = 1, ..., K
• Just as we model Y or a probability (the LPM)
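As a quick sketch, the logit and its inverse can be computed directly (function names here are illustrative):

```python
import math

# logit: probability -> log odds; its inverse maps any real number back into (0, 1).
def logit(p):
    return math.log(p / (1 - p))

def inv_logit(L):
    return 1 / (1 + math.exp(-L))

print(logit(0.5))                # 0.0 -- even odds
print(round(inv_logit(2.0), 3))  # 0.881
```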
18. The Logit Curve
• Note: The logistic curve keeps the predicted probability between 0 and 1 (the logit itself ranges over all real values)
• From Knoke et al. p. 300
19. Logistic Regression
•Note: We can solve for "p" and reformulate the model:
P(Y = 1) = e^(Σj βjXji) / (1 + e^(Σj βjXji)) = 1 / (1 + e^(-Σj βjXji))
• Why model this rather than a probability?
– Because it is a useful non-linear transformation
• It always generates Ps between 0 and 1, regardless of the values of X variables
• Note: probit transformation has similar effect.
20. Logistic Regression: Estimation
L̂i = β1 + β2X2i + ... + βKXKi
• Estimation: We can model the logit
• Solution requires Maximum Likelihood Estimation (MLE)
• In OLS there was an algebraic solution
• Here, we allow the computer to "search" for the best values of coefficients ("a" and "b"s) to fit observed data.
21. OLS estimation vs MLE
• In OLS, estimated parameters are obtained from an algebraic equation
• MLE: Maximum Likelihood Estimation is a technique to find the most likely parameters, i.e., the parameter values that best explain the observed data
22. What is MLE?
• The maximum likelihood method is also based on a model and on a distribution.
• The model, P(X | p), gives the probability of an event X dependent on model parameters p.
• The likelihood of the parameters given the data is the probability of observing X given p.
• The maximum likelihood method consists of optimizing the likelihood function:
• the goal is to estimate the parameters p which make it most likely to observe the data X.
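The "search" that MLE performs can be sketched with a toy logistic model fit by gradient ascent on the log-likelihood (a sketch only; the data are simulated, and real software uses faster algorithms such as Newton-Raphson):

```python
import math
import random

def sigmoid(z):
    z = max(-35.0, min(35.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(beta, X, y):
    """Sum of ln P(y_i | x_i, beta) under a logistic model."""
    ll = 0.0
    for xi, yi in zip(X, y):
        p = sigmoid(beta[0] + beta[1] * xi)
        ll += yi * math.log(p) + (1 - yi) * math.log(1 - p)
    return ll

def fit_logit(X, y, lr=0.01, steps=3000):
    """MLE by gradient ascent: 'search' for the betas that maximize the likelihood."""
    b0 = b1 = 0.0
    for _ in range(steps):
        g0 = g1 = 0.0
        for xi, yi in zip(X, y):
            resid = yi - sigmoid(b0 + b1 * xi)  # gradient of the log-likelihood
            g0 += resid
            g1 += resid * xi
        b0 += lr * g0
        b1 += lr * g1
    return b0, b1

# Toy data: Y = 1 becomes more likely as x grows
random.seed(1)
X = [i / 10 for i in range(40)]
y = [1 if x + random.gauss(0, 1) > 2 else 0 for x in X]
b0, b1 = fit_logit(X, y)
print(b1 > 0)  # the recovered slope has the right (positive) sign
```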
23. Logistic Regression: Estimation
• Properties of Maximum Likelihood Estimation
• “Consistent, efficient and asymptotically normal as N approaches infinity.” Large N = better!
• Rules of thumb regarding sample size
• N > 500 = fine; N < 100 can be worrisome
• Results aren’t necessarily wrong if N<100;
• But it is a possibility; and hard to know when problems crop up
• Higher N is needed if data are problematic due to:
• Multicollinearity
• Limited variation in dependent variable.
24. Logistic Regression
•Benefits of Logistic regression:
• You can now effectively model probability as a function of X variables
• You don’t have to worry about violations of OLS assumptions
• Predictions fall between 0 and 1
•Downsides
• You lose the “simple” interpretation of linear coefficients
• In a linear model, effect of each unit change in X on Y is consistent
• In a non-linear model, the effect isn’t consistent…
• Also, you can’t compute some stats (e.g., R-square).
25. Logistic Regression Example
• Stata output for gun ownership:
. logistic gun male educ income south liberal, coef
Logistic regression Number of obs = 850
LR chi2(5) = 89.53
Prob > chi2 = 0.0000
Log likelihood = -502.7251 Pseudo R2 = 0.0818
------------------------------------------------------------------------------
gun | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
male | .7837017 .156764 5.00 0.000 .4764499 1.090954
educ | -.0767763 .0254047 -3.02 0.003 -.1265686 -.026984
income | .2416647 .0493794 4.89 0.000 .1448828 .3384466
south | .7363169 .1979038 3.72 0.000 .3484327 1.124201
liberal | -.1641107 .0578167 -2.84 0.005 -.2774294 -.0507921
_cons | -2.28572 .6200443 -3.69 0.000 -3.500984 -1.070455
------------------------------------------------------------------------------
• Note: Results aren’t that different from LPM
• We’re dealing with big effects, large sample…
• But, predicted probabilities & SEs will be better.
26. Interpreting Coefficients
•Raw coefficients (βs) show the effect of a 1-unit change in X on the log odds of Y=1
• Positive coefficients make “Y=1” more likely
• Negative coefficients mean “less likely”
• But, effects are not linear
• Effect of unit change on p(Y=1) isn’t same for all values of X!
• Rather, Xs have a linear effect on the “log odds”
• But, it is hard to think in units of “log odds”, so we need to do further calculations
• NOTE: log-odds interpretation doesn’t work on Probit!
27. Interpreting Coefficients
•Best way to interpret logit coefficients is to exponentiate them
• This converts from “log odds” to simple “odds”
• Exponentiation = opposite of natural log
• On calculator use the "e^x" or "inverse ln" function
• Exponentiated coefficients are called odds ratios
• An odds ratio of 3.0 indicates odds are 3 times higher for each unit change in X
• Or, you can say the odds increase “by a factor of 3”.
• An odds ratio of .5 indicates odds decrease by ½ for each unit change in X.
• Odds ratios < 1 indicate negative effects.
28. Interpreting Coefficients
•Example: Do you drink coffee?
• Y=1 indicates coffee drinkers; Y=0 indicates no coffee
• Key independent variable: Year in grad program
• Observed “raw” coefficient: b = 0.67
• A positive effect… each year increases log odds by .67
• But how big is it really?
• Exponentiation: e^.67 = 1.95
• Odds increase multiplicatively by 1.95
• If a person’s initial odds were 2.0 (2:1), an extra year of school would
result in: 2.0*1.95 = 3.90
• The odds nearly DOUBLE for each unit change in X
• Net of other variables in the model…
29. Interpreting Coefficients
•Exponentiated coefficients (“odds ratios”) operate
multiplicatively
• Effect on odds is found by multiplying coefficients
• e^b of 1.0 means that a variable has no effect
• Multiplying anything by 1.0 results in same value
• e^b > 1.0 means that the variable has a positive effect on the odds of "Y=1"
• e^b < 1.0 means that the variable has a negative effect
•Hint: Papers may present results as “raw” coefficients
or odds ratios
• It is important to be aware of what you’re looking at
• If all coeffs are positive, they might be odds ratios!
30. Interpreting Coefficients
•To further aid interpretation, we can: convert
exponentiated coefficients to % change in odds
• Calculate: (exponentiated coef - 1)*100%
• Ex: (e^.67 – 1) * 100% = (1.95 – 1) * 100% = 95%
• Interpretation: Every unit change in X (year of school) increases the odds of
coffee drinking by 95%
•What about a 2-point change in X?
• Is it 2 * 95%? No!!! You must multiply odds ratios:
• (1.95 * 1.95 – 1) * 100% = (3.80 – 1) * 100 = +280%
• 3-point change = (1.95 * 1.95 * 1.95 – 1) * 100%
• N-point change = (OR^n – 1) * 100%
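These percent-change calculations are easy to script (a sketch; note that using the exact e^.67 = 1.9542 rather than the rounded 1.95 gives 282% for the 2-point change instead of the slide's 280%):

```python
import math

# Percent change in odds implied by a raw logit coefficient b
# for an n-unit change in X: (OR**n - 1) * 100, where OR = e^b.
def pct_change_in_odds(b, n=1):
    return (math.exp(b) ** n - 1) * 100

b = 0.67  # coffee example from the slides
print(round(pct_change_in_odds(b), 0))     # 95.0 -- one extra year of school
print(round(pct_change_in_odds(b, 2), 0))  # 282.0 -- two extra years (slide: ~280%)
```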
31. Interpreting Coefficients
•What is the effect of a 1-unit decrease in X?
• No, you can’t flip sign… it isn’t -95%
• You must invert odds ratios to see opposite effect
• Additional year in school = (1.95 – 1) * 100% = +95%
• One year less: (1/1.95 – 1)*100 =(.512 -1)*100= -48.7%
•What is the effect of two variables together?
• To combine odds ratios you must multiply
• Ex: Have a mean advisor; b = 1.2; OR = e^1.2 = 3.32
• Effect of 1 additional year AND mean advisor:
• (1.95 * 3.32 – 1)*100 = (6.47 – 1) * 100% = 547% increase in odds of coffee
drinking…
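Inverting and combining odds ratios can likewise be checked numerically (a sketch; exact exponentials give -48.8% and 549%, slightly different from the slide's -48.7% and 547% because the slide rounds the odds ratios first):

```python
import math

OR_year = math.exp(0.67)    # ~1.95, extra year of grad school (slide example)
OR_advisor = math.exp(1.2)  # ~3.32, mean advisor (slide example)

# One unit DECREASE: invert the odds ratio (don't just flip the sign of the %).
print(round((1 / OR_year - 1) * 100, 1))          # -48.8 -> ~49% lower odds
# Two effects together: multiply the odds ratios.
print(round((OR_year * OR_advisor - 1) * 100, 0))  # 549.0 -> ~549% higher odds
```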
32. Interpreting Coefficients
•Gun ownership: Effect of education?
(Stata logit output repeated from slide 25 for reference.)
• Educ: (e^-.077 - 1)*100% = -7.4%, i.e., 7.4% lower odds per year
• Also: Male: (e^.78 - 1)*100% = +118% -- more than double!
33. Raw Coefs vs. Odds ratios
• It is common to present results either way:
. logistic gun male educ income south liberal, coef
------------------------------------------------------------------------------
gun | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
male | .7837017 .156764 5.00 0.000 .4764499 1.090954
educ | -.0767763 .0254047 -3.02 0.003 -.1265686 -.026984
income | .2416647 .0493794 4.89 0.000 .1448828 .3384466
south | .7363169 .1979038 3.72 0.000 .3484327 1.124201
liberal | -.1641107 .0578167 -2.84 0.005 -.2774294 -.0507921
_cons | -2.28572 .6200443 -3.69 0.000 -3.500984 -1.070455
------------------------------------------------------------------------------
. logistic gun male educ income south liberal
------------------------------------------------------------------------------
gun | Odds Ratio Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
male | 2.189562 .3432446 5.00 0.000 1.610347 2.977112
educ | .926097 .0235272 -3.02 0.003 .8811137 .9733768
income | 1.273367 .0628781 4.89 0.000 1.155904 1.402767
south | 2.08823 .4132686 3.72 0.000 1.416845 3.077757
liberal | .848648 .049066 -2.84 0.005 .7577291 .9504762
------------------------------------------------------------------------------
Can you see the relationship? Negative coefficients yield odds ratios below 1.0!
34.-35. Interpreting Coefficients (slides 26-27 repeated verbatim)
36. Predicted Probabilities
•To determine predicted probabilities, first compute the predicted logit value:
L̂i = β1 + β2X2i + ... + βKXKi
• Then, plug logit values back into the P formula:
P(Y = 1) = e^(L̂i) / (1 + e^(L̂i)) = 1 / (1 + e^(-L̂i))
37. Predicted Probabilities: Own a gun?
•Predicted probability for a female PhD student
• Highly educated northern liberal female:
L̂i = -2.28 + .78(0) - .077(20) + .24(4) + .73(0) - .16(7) = -4.0
P(Y = 1) = e^(-4.0) / (1 + e^(-4.0)) = .018 / 1.018 = .017
(Stata logit output repeated from slide 25 for reference.)
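The two-step calculation (logit, then probability) can be reproduced from the full-precision Stata coefficients (a sketch; the tiny difference from the slide's .017 comes from rounding the intermediate values):

```python
import math

# Predicted probability from the fitted logit model (coefficients from the slide 25 output).
coefs = {"_cons": -2.28572, "male": 0.7837017, "educ": -0.0767763,
         "income": 0.2416647, "south": 0.7363169, "liberal": -0.1641107}

def predicted_prob(male, educ, income, south, liberal):
    L = (coefs["_cons"] + coefs["male"] * male + coefs["educ"] * educ
         + coefs["income"] * income + coefs["south"] * south
         + coefs["liberal"] * liberal)
    return math.exp(L) / (1 + math.exp(L))

# Highly educated northern liberal female:
print(round(predicted_prob(male=0, educ=20, income=4, south=0, liberal=7), 3))  # 0.018
```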
38. The Logit Curve
• Effect of log odds on probability = nonlinear!
• From Knoke et al. p. 300
39. Predicted Probabilities
•Important point: Substantive effect of a variable on
predicted probability differs depending on values of other
variables
• If probability is already high (or low), variable changes may matter less…
• Suppose a 1-point change in X doubles the odds…
• Effect isn’t substantively consequential if probability (Y=1) is already very high
• Ex: 20:1 odds = .95 probability; 40:1 odds = .975 probability
• Change in probability is only .025
• Effect matters a lot for cases with probabilities near .5
• 1:1 odds = .5 probability. 2:1 odds = .67 probability
• Change in probability is nearly .2!
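The odds-to-probability conversions in this example are one line of code (a minimal sketch):

```python
# Doubling the odds matters little when the probability is already high,
# but a lot near p = .5 (the slide's 20:1 vs 1:1 comparison).
def prob_from_odds(o):
    return o / (1 + o)

print(round(prob_from_odds(20), 3), round(prob_from_odds(40), 3))  # 0.952 0.976
print(round(prob_from_odds(1), 3), round(prob_from_odds(2), 3))    # 0.5 0.667
```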
40. Logit Example: Own a gun?
L̂i = -2.28 + .78(0) - .077(22) + .24(4) + .73(0) - .16(7) = -4.16
•Predicted probability of gun ownership for a female PhD student is very low: P=.017
• Two additional years of education lowers probability from .017 to .015 – not a big effect
• Additional unit change can't have a big effect – because probability can't go below zero
• It would matter much more for a southern male…
P(Y = 1) = e^(-4.16) / (1 + e^(-4.16)) = .0156 / 1.0156 = .0153
41. Predicted Probabilities
•Predicted probabilities are a great way to make findings
accessible to a reader
• Often people make bar graphs of probabilities
• 1. Show predicted probabilities for real cases
• Ex: probability of civil war for Ghana vs. Sweden
• 2. Show probabilities for “hypothetical” cases that exemplify key
contrasts in your data
• Ex: Guns: Southern male vs. female PhD student
• 3. Show how a change in critical independent variable would affect
predicted probability
• Ex: Guns: What would happen to southern male who went and got a PhD?
42. Marginal Change in Logit
•Issue: How to best capture effect size in non-linear models?
• % Change in odds ratios for 1-unit change in X
• Change in actual probability for 1-unit change in X
• Either for hypothetical cases or an actual case
•Another option: marginal change
• The actual slope of the curve at a specific point
• Again, can be computed for real or hypothetical cases
• Recall from calculus: derivatives are slopes...
• So, a marginal change is just a derivative.
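For logistic regression the marginal change has a closed form: dP/dX_k = b_k * p * (1 - p), so the same coefficient implies different marginal effects at different predicted probabilities. A sketch, using the education coefficient from the gun model:

```python
# Marginal change: the slope of the logistic curve at a given point.
# dP/dX_k = b_k * p * (1 - p)
def marginal_effect(b, p):
    return b * p * (1 - p)

b_educ = -0.0768  # education coefficient from the gun model
print(round(marginal_effect(b_educ, 0.5), 4))    # -0.0192 near p = .5
print(round(marginal_effect(b_educ, 0.017), 4))  # -0.0013 for the female PhD student
```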
43. Sensitivity / Specificity of Prediction
• Sensitivity: Of gun owners, what proportion were correctly predicted to own a gun?
• Specificity: Of non-gun owners, what proportion did we correctly predict?
• Choosing a different probability cutoff affects those values
• If we reduce the cutoff to P > .4, we'll catch a higher proportion of gun owners
• But we'll also misclassify more non-gun owners as owners, i.e., more false positives.
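Sensitivity and specificity at a chosen cutoff can be computed from predicted probabilities (a sketch with hypothetical data; lowering the cutoff raises sensitivity at the cost of specificity):

```python
# Sensitivity and specificity of classifications made at a probability cutoff.
def classify(probs, actual, cutoff=0.5):
    pred = [1 if p > cutoff else 0 for p in probs]
    tp = sum(1 for pr, a in zip(pred, actual) if pr == 1 and a == 1)
    tn = sum(1 for pr, a in zip(pred, actual) if pr == 0 and a == 0)
    sens = tp / sum(actual)                  # correct among actual owners
    spec = tn / (len(actual) - sum(actual))  # correct among non-owners
    return sens, spec

probs  = [0.9, 0.6, 0.45, 0.42, 0.3, 0.2]  # hypothetical predicted probabilities
actual = [1,   1,   1,    0,    0,   0]
print(classify(probs, actual))               # cutoff .5: sensitivity 2/3, specificity 1.0
print(classify(probs, actual, cutoff=0.4))   # cutoff .4: sensitivity 1.0, specificity 2/3
```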
44. Hypothesis tests
•Testing hypotheses using logistic regression
• H0: There is no effect of year in grad program on coffee drinking
• H1: Year in grad school is associated with coffee
• Or, one-tail test: Year in school increases probability of coffee
• MLE estimation yields standard errors… like OLS
• Test statistic: 2 options; both yield same results
• z = b/SE… analogous to the t-test in OLS regression (see the z column in the Stata output)
• Wald test (Chi-square, 1 df); essentially the square of z
• Reject H0 if Wald or z > critical value
• Or if p-value less than alpha (usually .05).
45. Model Fit: Likelihood Ratio Tests
• MLE computes a likelihood for the model
• “Better” models have higher likelihoods
• Log likelihood is typically a negative value, so “better” means a less negative value… -100 > -1000
• Log likelihood ratio test: Allows comparison of any two nested models
• One model must be a subset of vars in other model
• You can’t compare totally unrelated models!
• Models must use the exact same sample.
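The likelihood-ratio statistic itself is simple to compute (a sketch; the intercept-only log likelihood of about -547.49 is implied by the LR chi2(5) = 89.53 reported in the Stata output, not shown directly):

```python
# Likelihood-ratio test for two nested models fit on the same sample:
# LR = 2 * (LL_full - LL_restricted), chi-square with df = number of dropped parameters.
def lr_statistic(ll_full, ll_restricted):
    return 2 * (ll_full - ll_restricted)

# Gun model: Stata reports LL = -502.7251 for the full model.
print(round(lr_statistic(-502.7251, -547.49), 1))  # ~89.5, matching LR chi2(5)
```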
46. Model Fit: Pseudo R-Square
•Pseudo R-square
• “A descriptive measure that indicates roughly the proportion of
observed variation accounted for by the… predictors.” Knoke et al, p.
313
Logistic regression Number of obs = 850
LR chi2(5) = 89.53
Prob > chi2 = 0.0000
Log likelihood = -502.7251 Pseudo R2 = 0.0818
(Odds-ratio output repeated from slide 33 for reference.)
Model explains roughly 8% of variation in Y
47. Assumptions & Problems
• Assumption: Independent random sample
• Serial correlation or clustering violate assumptions; bias SE estimates and hypothesis tests
• Multicollinearity: High correlation among independent variables causes
problems
• Unstable, inefficient estimates
• Watch for coefficient instability, check VIF/tolerance
• Remove unneeded variables or create indexes of related variables.
48. Assumptions & Problems
• Outliers/Influential cases
• Unusual/extreme cases can distort results, just like OLS
49. Assumptions & Problems
•Insufficient variance: You need cases for both values of
the dependent variable
• Extremely rare (or common) events can be a problem
• Suppose N=1000, but only 3 are coded Y=1
• Estimates won’t be great
•Also: Maximum likelihood estimates cannot be
computed if any independent variable perfectly predicts
the outcome (Y=1)
• Ex: Suppose taking sociology classes drove all students to drink coffee, so there is no variation within that group…
• In that case, you cannot include a dummy variable for taking sociology classes in the model.
50. Assumptions & Problems
• Model specification / Omitted variable bias
• Just like any regression model, it is critical to include appropriate variables in the model
• Omission of important factors or ‘controls’ will lead to misleading results.
51. Probit
• Probit models are an alternative to logistic regression
• Involves a different non-linear transformation
• Generally yields results very similar to logit models
• Logit coefficients are approximately 1.6 times larger than the corresponding probit coefficients
• For ‘garden variety’ analyses, there is little reason to prefer either logit or probit
• But, probit has advantages in some circumstances
• Ex: Multinomial models that violate the IIA assumption (to be discussed later).
52. Takeaway
• LPM models are easy to interpret and work well if independent variables do not take extreme
values
• Logit models are the backbone of binary choice models, and their coefficients should be interpreted carefully.