2. Discrete Choice Models
The regression models we have studied so far had a continuous
dependent variable (e.g. sales of a product, house prices, etc.)
o Predictor variables could be continuous or discrete (dummy variables)
Often the phenomenon of interest (i.e. our dependent
variable) is discrete
o Vote or not
o Customer acquisition/defection
o Buy/no-buy
o Click on a banner ad
o Survive/Don’t survive
With discrete outcomes, we are predicting probabilities (of, say,
customer defection).
Properties of probabilities?
3. Why Regression does not work
• With binary or categorical dependent variables,
standard regression analysis is not appropriate
• Example
• Binary dependent variable y, coded zero for non-purchases and
one for purchases
• x is a continuous metric, say price
$$ y = c_0 + c_1 x + \varepsilon, \qquad y = \begin{cases} 0 & \text{for non-purchases} \\ 1 & \text{for purchases} \end{cases} $$
• Problems
o The error terms are heteroskedastic (the variance of the dependent
variable differs across values of the independent variables)
o Does not meet the assumptions of standard OLS regression
o Predictions often fall below zero or above one
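The out-of-range problem can be seen in a small sketch. The data below are synthetic and made up for illustration (the price range, sample size and "true" purchase probabilities are assumptions, not from the slides); the fit is plain least squares of the 0/1 purchase variable on price.

```python
import numpy as np

# Synthetic, made-up data: price x and a 0/1 purchase indicator y.
rng = np.random.default_rng(0)
price = rng.uniform(1.0, 10.0, size=200)

# Assumed "true" purchase probability that falls with price (illustrative only).
true_p = 1.0 / (1.0 + np.exp(-(4.0 - 1.0 * price)))
purchase = (rng.uniform(size=200) < true_p).astype(float)

# Ordinary least squares fit of y = c0 + c1 * x.
X = np.column_stack([np.ones_like(price), price])
coef, *_ = np.linalg.lstsq(X, purchase, rcond=None)

# Predict on a grid of prices, some beyond the sampled range.
grid = np.array([0.0, 1.0, 5.0, 10.0, 15.0])
pred = coef[0] + coef[1] * grid

for x, p in zip(grid, pred):
    flag = "" if 0.0 <= p <= 1.0 else "  <- not a valid probability"
    print(f"price={x:5.1f}  predicted y={p:6.3f}{flag}")
```

Nothing in the straight line keeps the prediction inside [0, 1], so for high enough prices the fitted "probability" of purchase turns negative.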
4. Discrete choice models
Generalize the regression model for the situations
where y is a non-metric variable
o a binary (0-1) variable or
o an ordinal variable (like a questionnaire item assuming the
values completely disagree, disagree, neither, agree,
completely agree) or
o a categorical variable (for example a nominal variable
recording the preferred Brand).
The right-hand side variables can be discrete or continuous
Similar to linear regression but interpretations are different
$$ y_i = c_0 + c_1 x_i + \varepsilon_i $$
6. Logit: We want our predictions to be a probability
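Solution: instead of estimating the linear model for y directly,
we estimate the model
$$ \ln\frac{p(y=1)}{1 - p(y=1)} = c_0 + c_1 x_1 + c_2 x_2 + \dots + c_n x_n $$
which, after rearranging, equals
$$ p(y=1) = \frac{e^{c_0 + c_1 x_1 + \dots + c_n x_n}}{1 + e^{c_0 + c_1 x_1 + \dots + c_n x_n}} $$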
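A minimal sketch of the two forms of the logit model, to check that they are inverses of each other and that the predicted value always stays between 0 and 1. The coefficients c0 and c1 and the values of x1 are hypothetical, chosen only for illustration.

```python
import math

def log_odds(p):
    """ln(p / (1 - p)) for a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

def logistic(z):
    """e^z / (1 + e^z): maps a real number z to a value between 0 and 1."""
    return math.exp(z) / (1.0 + math.exp(z))

# Hypothetical coefficients and a single predictor x1 (not from the slides).
c0, c1 = 2.0, -0.5

for x1 in (-20.0, 0.0, 4.0, 20.0):
    z = c0 + c1 * x1          # linear index c0 + c1 * x1
    p = logistic(z)           # predicted p(y = 1)
    print(f"x1={x1:6.1f}  index={z:6.1f}  p(y=1)={p:.4f}  log-odds={log_odds(p):6.2f}")
```

However extreme the linear index becomes, p(y=1) only approaches 0 or 1, and taking the log-odds of p recovers the index.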
8. Obesity Trends* Among U.S. Adults
BRFSS, 1990
(*BMI ≥30, or ~ 30 lbs. overweight for 5’ 4” person)
[Map legend: No Data, <10%, 10%–14%]
9. Obesity Trends* Among U.S. Adults
BRFSS, 1991
[Map legend: No Data, <10%, 10%–14%, 15%–19%]
10. Obesity Trends* Among U.S. Adults
BRFSS, 1992
[Map legend: No Data, <10%, 10%–14%, 15%–19%]
11. Obesity Trends* Among U.S. Adults
BRFSS, 1993
[Map legend: No Data, <10%, 10%–14%, 15%–19%]
12. Obesity Trends* Among U.S. Adults
BRFSS, 1994
[Map legend: No Data, <10%, 10%–14%, 15%–19%]
13. Obesity Trends* Among U.S. Adults
BRFSS, 1995
[Map legend: No Data, <10%, 10%–14%, 15%–19%]
14. Obesity Trends* Among U.S. Adults
BRFSS, 1996
[Map legend: No Data, <10%, 10%–14%, 15%–19%]
15. Obesity Trends* Among U.S. Adults
BRFSS, 1997
[Map legend: No Data, <10%, 10%–14%, 15%–19%, ≥20%]
16. Obesity Trends* Among U.S. Adults
BRFSS, 1998
[Map legend: No Data, <10%, 10%–14%, 15%–19%, ≥20%]
17. Obesity Trends* Among U.S. Adults
BRFSS, 1999
[Map legend: No Data, <10%, 10%–14%, 15%–19%, ≥20%]
18. Obesity Trends* Among U.S. Adults
BRFSS, 2000
[Map legend: No Data, <10%, 10%–14%, 15%–19%, ≥20%]
19. Obesity Trends* Among U.S. Adults
BRFSS, 2001
[Map legend: No Data, <10%, 10%–14%, 15%–19%, 20%–24%, ≥25%]
20. Obesity Trends* Among U.S. Adults
BRFSS, 2002
[Map legend: No Data, <10%, 10%–14%, 15%–19%, 20%–24%, ≥25%]
21. Obesity Trends* Among U.S. Adults
BRFSS, 2003
[Map legend: No Data, <10%, 10%–14%, 15%–19%, 20%–24%, ≥25%]
22. Obesity Trends* Among U.S. Adults
BRFSS, 2004
[Map legend: No Data, <10%, 10%–14%, 15%–19%, 20%–24%, ≥25%]
23. Obesity Trends* Among U.S. Adults
BRFSS, 2005
[Map legend: No Data, <10%, 10%–14%, 15%–19%, 20%–24%, 25%–29%, ≥30%]
24. Obesity Trends* Among U.S. Adults
BRFSS, 2006
[Map legend: No Data, <10%, 10%–14%, 15%–19%, 20%–24%, 25%–29%, ≥30%]
25. Obesity Trends* Among U.S. Adults
BRFSS, 2007
[Map legend: No Data, <10%, 10%–14%, 15%–19%, 20%–24%, 25%–29%, ≥30%]
26. Obesity Trends* Among U.S. Adults
BRFSS, 2008
[Map legend: No Data, <10%, 10%–14%, 15%–19%, 20%–24%, 25%–29%, ≥30%]
27. Obesity Trends* Among U.S. Adults
BRFSS, 2009
[Map legend: No Data, <10%, 10%–14%, 15%–19%, 20%–24%, 25%–29%, ≥30%]
28. Obesity Trends* Among U.S. Adults
BRFSS, 2010
[Map legend: No Data, <10%, 10%–14%, 15%–19%, 20%–24%, 25%–29%, ≥30%]
30. BRFSS Data at CDC
• Behavioral Risk Factor Surveillance System
(BRFSS) at CDC:
World’s largest survey
• Monthly telephone interviews of adults aged 18 or
older living in households
• 50 states, the District of Columbia, Puerto Rico,
Guam and the Virgin Islands
Pooled data for 2001-2010.
Approximately 3 million observations.
We are using a random sample from 2006-2010
38. Interpretation
For continuous variables: change in the probability of being
obese for a 1-unit change in the variable. In our case, only
AGE is continuous. Increasing AGE by 1 year lowers the
probability of being obese by 0.1 percentage points. Small effect, but see later.
For dummy variables: change in probability compared to the
reference category. A person with “No High School” has a 7
percentage point higher probability of being obese than a person with
a college degree (holding everything else fixed).
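The two kinds of effect can be sketched as follows, assuming a logit model. The coefficient values and variable names (`const`, `age`, `no_high_school`) are hypothetical placeholders, NOT the estimates from the BRFSS sample; they are chosen only to show how the calculation works.

```python
import math

def logistic(z):
    """Logit link: maps the linear index z to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients, NOT estimates from the BRFSS data.
coef = {"const": -1.0, "age": -0.005, "no_high_school": 0.35}

def prob_obese(age, no_high_school):
    z = (coef["const"]
         + coef["age"] * age
         + coef["no_high_school"] * no_high_school)
    return logistic(z)

age = 45.0
p_ref = prob_obese(age, 0)   # reference category (college degree)
p_nhs = prob_obese(age, 1)   # "No High School" dummy switched on

# Continuous variable: marginal effect dp/d(age) = c_age * p * (1 - p).
me_age = coef["age"] * p_ref * (1.0 - p_ref)
# Sanity check: actual change in probability when AGE rises by one year.
diff_age = prob_obese(age + 1.0, 0) - p_ref

# Dummy variable: difference in probability versus the reference category.
me_dummy = p_nhs - p_ref

print(f"AGE marginal effect:   {100 * me_age:+.3f} percentage points per year")
print(f"AGE one-year change:   {100 * diff_age:+.3f} percentage points")
print(f"No High School effect: {100 * me_dummy:+.2f} percentage points")
```

Both effects depend on where they are evaluated (here a 45-year-old in the reference category), because the logit curve is not a straight line; reported marginal effects are typically computed at the sample means or averaged over the sample.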