SlideShare a Scribd company logo
1 of 47
U N I V E R S I T Y O F S O U T H F L O R I D A //
Discrete Choice Model
Dr. Shivendu
Agenda
5/24/2022 2
Discrete choice models: Multiple Choices
• Multinomial Models
• Ordinal Logit Models
• Censored Regression or Count Data Models: Tobit Models
Quiz 8: Based on Class 9 Readings
Class 9_SAS_Module
Statistical Analysis IV: Non-Parametric Procedures
• Chap 27 and 28 of DAU_SAS
• SAS Assignment 9 posted: Due before class 10
Examples of multinomial choice (polytomous) situations:
1.Choice of a laundry detergent: Tide, Cheer, Arm & Hammer, Wisk, etc.
2.Choice of a major: economics, marketing, management, finance or
accounting.
3.Choices after graduating from high school: not going to college, going
to a private 4-year college, a public 4 year-college, or a 2-year college.
Slide16-3
Principles of Econometrics, 3rd Edition
The explanatory variable xi is individual specific but does not change
across alternatives. Example age of the individual.
The dependent variable is nominal
Slide16-4
Principles of Econometrics, 3rd Edition
Examples of multinomial choice situations:
1. It is key that there are more than 2 choices
2. It is key that there is no meaningful ordering to them. Otherwise, we would
want to use that information (with an ordered probit or ordered logit)
Slide16-5
Principles of Econometrics, 3rd Edition
In essence this model is like a set of simultaneous individual binomial/binary
logistic regressions
With appropriate weighting, since the different comparisons between different
pairs of categories would generally involve different numbers of
observations
Slide16-6
Principles of Econometrics, 3rd Edition
Principles of Econometrics, 3rd Edition Slide16-7
   
1
12 22 13 23
1
, 1
1 exp exp
i
i i
p j
x x
 
     
 
   
12 22
2
12 22 13 23
exp
, 2
1 exp exp
i
i
i i
x
p j
x x
 
 
     
 
   
13 23
3
12 22 13 23
exp
, 3
1 exp exp
i
i
i i
x
p j
x x
 
 
     
 
individual chooses alternative
ij
p P i j

An interesting feature of the odds ratio (16.21) is that the odds of choosing
alternative j rather than alternative 1 does not depend on how many alternatives
there are in total. There is the implicit assumption in logit models that the odds
between any pair of alternatives is independent of irrelevant alternatives (IIA).
Principles of Econometrics, 3rd Edition Slide16-8
 
 
 
1 2
1
exp 2,3
1
ij
i
j j i
i i
p
P y j
x j
P y p

    

 
 
1
2 1 2
exp 2,3
ij i
j j j i
i
p p
x j
x

    

• There is the implicit assumption in logit models that the odds between any pair of
alternatives is independent of irrelevant alternatives (IIA)
One way to state the assumption
• If choice A is preferred to choice B out of the choice set {A,B}, then
introducing a third alternative X, thus expanding that choice set to
{A,B,X}, must not make B preferable to A.
• which kind of makes sense 
Slide16-9
Principles of Econometrics, 3rd Edition
IIA assumption
• There is the implicit assumption in logit models that the odds between any pair of
alternatives is independent of irrelevant alternatives (IIA)
In the case of the multinomial logit model, the IIA implies that adding
another alternative or changing the characteristics of a third alternative
must not affect the relative odds between the two alternatives
considered.
This is not realistic for many real life applications involving similar
(substitute) alternatives.
Slide16-10
Principles of Econometrics, 3rd Edition
IIA assumption
This is not realistic for many real-life applications with similar
(substitute) alternatives
Examples:
• Beethoven/Debussy versus another of Beethoven’s Symphonies
(Debreu 1960; Tversky 1972)
• Bicycle/Pony (Luce and Suppes 1965)
• Red Bus/Blue Bus (McFadden 1974).
• Black slacks, jeans, shorts versus blue slacks (Hoffman, 2004)
• Etc.
Slide16-11
Principles of Econometrics, 3rd Edition
IIA assumption
Red Bus/Blue Bus (McFadden 1974).
• Imagine commuters first face a decision between two modes of transportation: car and red bus
• Suppose that a consumer chooses between these two options with equal probability, 0.5, so that
the odds ratio equals 1.
• Now add a third mode, blue bus. Assuming bus commuters do not care about the color of the
bus (they are perfect substitutes), consumers are expected to choose between bus and car still
with equal probability, so the probability of car is still 0.5, while the probabilities of each of the
two bus types should go down to 0.25
• However, this violates IIA: for the odds ratio between car and red bus to be preserved, the new
probabilities must be: car 0.33; red bus 0.33; blue bus 0.33
• Te IIA axiom does not mix well with perfect substitutes 
IIA assumption
We can test this assumption with a Hausman-McFadden test which
compares a logistic model with all the choices with one with restricted
choices.
IIA assumption
Model for three categories
Need k-1 generalized logits to represent a dependent
variable with k categories
Meaning of the regression coefficients
A positive regression coefficient for logit j means that higher
values of the independent variable are associated with
greater chances of response category j, compared to
the reference category.
Solve for the probabilities
so
So
Three linear equations in 3 unknowns
Solution
In general, solve k equations in k
unknowns
General Solution
Using the solution, one can
 Calculate the probability of obtaining the
observed data as a function of the regression
coefficients: Get maximum likelihood estimates
(beta-hat values)
 From maximum likelihood estimates, get tests
and confidence intervals
 Using beta-hat values in Lj, estimate
probabilities of category membership for any
set of x values.
Multinominal Logistic Regression in SAS
https://support.sas.com/resources/papers/proceedings12/427-2012.pdf
Multinominal Logistic Regression in SAS
• For a good example and implementation, see:
•https://stats.idre.ucla.edu/sas/dae/multinomiallogistic-regression/
•For example and more statistical theory, see
• https://support.sas.com/resources/papers/proceedings12/427-2012.pdf
5/24/2022 23
 Extensions have arisen to deal with this issue
 The multinomial probit and the mixed logit are alternative models for nominal outcomes that
relax IIA, by allowing correlation among the errors (to reflect similarity among options)
 but these models often have issues and assumptions themselves 
 IIA can also be relaxed by specifying a hierarchical model, ranking the choice alternatives.
The most popular of these is called the McFadden’s nested logit model, which allows
correlation among some errors, but not all (e.g. Heiss 2002)
 Generalized extreme value and multinomial probit models possess another property, the
Invariant Proportion of Substitution (Steenburgh 2008), which itself also suggests similarly
counterintuitive real-life individual choice behavior
 The multinomial probit has serious computational disadvantages too, since it involves
calculating multiple (one less than the number of categories) integrals. With integration by
simulation this problem is being ameliorated now…
IIA assumption
Combining Categories
Consider testing whether two categories could be combined
If none of the independent variables really explain the odds of choosing choice A versus B, you
should merge them
Multinomial Logit versus Probit
Computational issues make the Multinomial Probit very rare
Advantage: it does not need IIA 
Example: choice between three types (J = 3) of soft drinks, say Pepsi,
7-Up and Coke Classic.
Let yi1, yi2 and yi3 be dummy variables that indicate the choice made by individual i. The price facing
individual i for brand j is PRICEij.
Variables like price are to be individual and alternative specific, because they vary from individual
to individual and are different for each choice the consumer might make
Another example: of mode of transportation choice: time from home to work using train, car, or bus.
For more details and example in implementing conditional logit in SAS, see
https://stats.idre.ucla.edu/sas/faq/how-do-i-do-a-conditional-logit-model-analysis-in-sas-9-1/
Principles of Econometrics, 3rd Edition Slide16-27
Logit Models for Ordinal Responses
• Response variable is ordinal (categorical with natural ordering)
• Predictor variable(s) can be numeric or qualitative (dummy
variables)
• Labeling the ordinal categories from 1 (lowest level) to c (highest),
can obtain the cumulative probabilities:
c
j
j
Y
P
Y
P
j
Y
P ,
,
1
)
(
)
1
(
)
( 
 






Logistic Regression for Ordinal Response
• The odds of falling in category j or below:
1
)
(
1
,
,
1
)
(
)
(






c
Y
P
c
j
j
Y
P
j
Y
P

• Logit (log odds) of cumulative probabilities are modeled as linear
functions of predictor variable(s):
  1
,
,
1
)
(
)
(
log
)
(
logit 












 c
j
X
j
Y
P
j
Y
P
j
Y
P j 


This is called the proportional odds model, and assumes the effect of X is the
same for each cumulative probability
Example - Urban Renewal Attitudes
• Response: Attitude toward urban renewal project (Negative (Y=1),
Moderate (Y=2), Positive (Y=3))
• Predictor Variable: Respondent’s Race (White, Nonwhite)
• Contingency Table:
AttitudeRace White Nonwhite
Negative (Y=1) 101 106
Moderate (Y=2) 91 127
Positive (Y=3) 170 190
SPSS Output
• Note that SPSS fits the model in the following form:
  1
,
,
1
)
(
)
(
log
)
(
logit 












 c
j
X
j
Y
P
j
Y
P
j
Y
P j 


r E
7
2
3
1
0
7
8
5
4
0
1
0
0
1
1
3
0
1
3
3
0
0 a
.
.
0
.
.
.
[ A
[ A
T
[ R
[ R
L
m
E
a
d f
S i g
r B
r B
d e
L
T
a
Note that the race variable is not significant (or even close).
Fitted Equation
• The fitted equation for each group/category:
165
.
0
0
165
.
0
)
Nonwhite
|
2
(
)
Nonwhite
|
2
(
logit
:
te
Mod/Nonwhi
or
Neg
166
.
0
)
001
.
0
(
165
.
0
)
White
|
2
(
)
White
|
2
(
logit
:
Mod/White
or
Neg
027
.
1
)
0
(
027
.
1
)
Nonwhite
|
1
(
)
Nonwhite
|
1
(
logit
:
onwhite
Negative/N
026
.
1
)
001
.
0
(
027
.
1
)
White
|
1
(
)
White
|
1
(
logit
:
hite
Negative/W


















































Y
P
Y
P
Y
P
Y
P
Y
P
Y
P
Y
P
Y
P
For each group, the fitted probability of falling in that set of categories is eL/(1+eL)
where L is the logit value (0.264,0.264,0.541,0.541)
Inference for Regression Coefficients
• If  = 0, the response (Y) is independent of X
• Z-test can be conducted to test this (estimate divided by its standard error)
• Most software will conduct the Wald test, with the statistic being the z-statistic
squared, which has a chi-squared distribution with 1 degree of freedom under
the null hypothesis
• Odds ratio of increasing X by 1 unit and its confidence interval are obtained by
raising e to the power of the regression coefficient and its upper and lower
bounds
Example - Urban Renewal Attitudes
• Z-statistic for testing for race differences:
Z=0.001/0.133 = 0.0075 (recall model estimates -)
• Wald statistic: .000 (P-value=.993)
• Estimated odds ratio: e.001 = 1.001
• 95% Confidence Interval: (e-.260,e.263)=(0.771,1.301)
• Interval contains 1, odds of being in a given category or below is same for
whites as nonwhites
r E
7
2
3
1
0
7
8
5
4
0
1
0
0
1
1
3
0
1
3
3
0
0 a
.
.
0
.
.
.
[ A
[ A
T
[ R
[ R
L
m
E
a
d f
S i g
r B
r B
d e
L
T
a
Ordinal Predictors
• Creating dummy variables for ordinal categories treats them as if
nominal
• To make an ordinal variable, create a new variable X that models the
levels of the ordinal variable
• Setting depends on assignment of levels (simplest form is to let
X=1,...,c for the categories which treats levels as being equally
spaced)
Censored Regression or Count Data Models:
Tobit Models
3
7
INTRODUCTION
• Data on y is censored if for part of the range of y
we observe only that y is in that range, rather than
observing the exact value of y.
e.g. income is top-coded at $75,000 per year.
• Data on y is truncated if for part of the range of y we
do not observe y at all.
e.g. people with income above $75,000 per year are
excluded from the sample.
3
8
• Meaningful policy analysis requires extrapolationfrom
the restricted sample to the population as a whole.
• But running regressions on censored or truncated data,
without controlling for censoring or truncation, leads
to inconsistent parameter estimates.
3
9
• We focus on the normal, with censoring or truncation
at zero.
e.g. annual hours worked, and annual expenditure on
automobiles.
• The class of models presented in this chapter is called
limited dependent variable models or latent variable
models. Econometricians also use the terminology
tobit models or generalized tobit models.
4
0
• Censoring can arise for distributions other than the
normal.
• For e.g. count data treatment is similar to here except
different distributions.
• For duration data, e.g. the length of a spell of unem-
ployment, a separate treatment of censoring is given
there due to different censoring mechanism (random)
to that considered here.
4
1
Many Models for Censored Data
• Tobit model: MLE, NLS and Heckman 2-step.
• Sample selectivity model, a generalization of Tobit.
• Semiparametric estimation.
• Structural economic models for censored choice.
• Simultaneous equation models
4
2
TOBIT
MODEL
• Interest lies in a latent dependent variable y*
y* = X'β + c.
• This variable is only partially observed.
• In censored regression we observe
y =
*
y if y >0
0 if y <=0
• In truncated regression we observe
y = y* if y* >0
.
• For censored and truncated data, linear regression is
inappropriate.
4
4
• The standard estimators require stochasticassumptions
about the distribution of c and hence y*.
• The Tobit model assumes normality of the error term.
• For consistent estimates, OLS is not good. We should
use MLE
Tobit in SAS
• For example, and using SAS to estimate Tobit regression, see
• https://stats.idre.ucla.edu/sas/dae/tobit-analysis/
Takeaway
• Multinominal and ordered logit models are increasingly being used in consumer product
choice research as well as in practice.
• Tobit models are useful when data is count data or duration data. These are often used in
specific contexts only.
5/24/2022 46
Going forward
• No more face-to-face class. Session 10 and 11 will be only on Teams.
• Final exam is on 4th May from 8.30 am to 11.00 am and will be on Teams through proctorio.
5/24/2022 47

More Related Content

Similar to Discrete Choice Model - Part 2

What's the question?
What's the question? What's the question?
What's the question? jemille6
 
Predicting an Applicant Status Using Principal Component, Discriminant and Lo...
Predicting an Applicant Status Using Principal Component, Discriminant and Lo...Predicting an Applicant Status Using Principal Component, Discriminant and Lo...
Predicting an Applicant Status Using Principal Component, Discriminant and Lo...inventionjournals
 
International Journal of Computational Engineering Research (IJCER)
International Journal of Computational Engineering Research (IJCER) International Journal of Computational Engineering Research (IJCER)
International Journal of Computational Engineering Research (IJCER) ijceronline
 
Creating an Explainable Machine Learning Algorithm
Creating an Explainable Machine Learning AlgorithmCreating an Explainable Machine Learning Algorithm
Creating an Explainable Machine Learning AlgorithmBill Fite
 
Explainable Machine Learning
Explainable Machine LearningExplainable Machine Learning
Explainable Machine LearningBill Fite
 
Production_planning_of_a_furniture_manufacturing_c.pdf
Production_planning_of_a_furniture_manufacturing_c.pdfProduction_planning_of_a_furniture_manufacturing_c.pdf
Production_planning_of_a_furniture_manufacturing_c.pdfMehnazTabassum20
 
Heuristics for the Maximal Diversity Selection Problem
Heuristics for the Maximal Diversity Selection ProblemHeuristics for the Maximal Diversity Selection Problem
Heuristics for the Maximal Diversity Selection ProblemIJMER
 
Rich routing problems arising in supply chain management (2012)
Rich routing problems arising in supply chain management (2012)Rich routing problems arising in supply chain management (2012)
Rich routing problems arising in supply chain management (2012)siska_suryanto
 
L3 1b
L3 1bL3 1b
L3 1bNBER
 
165662191 chapter-03-answers-1
165662191 chapter-03-answers-1165662191 chapter-03-answers-1
165662191 chapter-03-answers-1BookStoreLib
 
165662191 chapter-03-answers-1
165662191 chapter-03-answers-1165662191 chapter-03-answers-1
165662191 chapter-03-answers-1Firas Husseini
 
[CIKM 2014] Deviation-Based Contextual SLIM Recommenders
[CIKM 2014] Deviation-Based Contextual SLIM Recommenders[CIKM 2014] Deviation-Based Contextual SLIM Recommenders
[CIKM 2014] Deviation-Based Contextual SLIM RecommendersYONG ZHENG
 
0 Model Interpretation setting.pdf
0 Model Interpretation setting.pdf0 Model Interpretation setting.pdf
0 Model Interpretation setting.pdfLeonardo Auslender
 
Financial Benchmarking Of Transportation Companies In The New York Stock Exc...
Financial Benchmarking Of Transportation Companies  In The New York Stock Exc...Financial Benchmarking Of Transportation Companies  In The New York Stock Exc...
Financial Benchmarking Of Transportation Companies In The New York Stock Exc...ertekg
 
Penalized Regressions with Different Tuning Parameter Choosing Criteria and t...
Penalized Regressions with Different Tuning Parameter Choosing Criteria and t...Penalized Regressions with Different Tuning Parameter Choosing Criteria and t...
Penalized Regressions with Different Tuning Parameter Choosing Criteria and t...CSCJournals
 
Bag Jacobs Ead Model Ccl Irmc 6 10
Bag Jacobs Ead Model Ccl Irmc 6 10Bag Jacobs Ead Model Ccl Irmc 6 10
Bag Jacobs Ead Model Ccl Irmc 6 10Michael Jacobs, Jr.
 

Similar to Discrete Choice Model - Part 2 (20)

4 5-basic economic concepts to pharmacy
4 5-basic economic concepts to pharmacy4 5-basic economic concepts to pharmacy
4 5-basic economic concepts to pharmacy
 
What's the question?
What's the question? What's the question?
What's the question?
 
Predicting an Applicant Status Using Principal Component, Discriminant and Lo...
Predicting an Applicant Status Using Principal Component, Discriminant and Lo...Predicting an Applicant Status Using Principal Component, Discriminant and Lo...
Predicting an Applicant Status Using Principal Component, Discriminant and Lo...
 
International Journal of Computational Engineering Research (IJCER)
International Journal of Computational Engineering Research (IJCER) International Journal of Computational Engineering Research (IJCER)
International Journal of Computational Engineering Research (IJCER)
 
Statsci
StatsciStatsci
Statsci
 
StatsModelling
StatsModellingStatsModelling
StatsModelling
 
Creating an Explainable Machine Learning Algorithm
Creating an Explainable Machine Learning AlgorithmCreating an Explainable Machine Learning Algorithm
Creating an Explainable Machine Learning Algorithm
 
Explainable Machine Learning
Explainable Machine LearningExplainable Machine Learning
Explainable Machine Learning
 
Production_planning_of_a_furniture_manufacturing_c.pdf
Production_planning_of_a_furniture_manufacturing_c.pdfProduction_planning_of_a_furniture_manufacturing_c.pdf
Production_planning_of_a_furniture_manufacturing_c.pdf
 
Heuristics for the Maximal Diversity Selection Problem
Heuristics for the Maximal Diversity Selection ProblemHeuristics for the Maximal Diversity Selection Problem
Heuristics for the Maximal Diversity Selection Problem
 
Rich routing problems arising in supply chain management (2012)
Rich routing problems arising in supply chain management (2012)Rich routing problems arising in supply chain management (2012)
Rich routing problems arising in supply chain management (2012)
 
L3 1b
L3 1bL3 1b
L3 1b
 
165662191 chapter-03-answers-1
165662191 chapter-03-answers-1165662191 chapter-03-answers-1
165662191 chapter-03-answers-1
 
165662191 chapter-03-answers-1
165662191 chapter-03-answers-1165662191 chapter-03-answers-1
165662191 chapter-03-answers-1
 
[CIKM 2014] Deviation-Based Contextual SLIM Recommenders
[CIKM 2014] Deviation-Based Contextual SLIM Recommenders[CIKM 2014] Deviation-Based Contextual SLIM Recommenders
[CIKM 2014] Deviation-Based Contextual SLIM Recommenders
 
0 Model Interpretation setting.pdf
0 Model Interpretation setting.pdf0 Model Interpretation setting.pdf
0 Model Interpretation setting.pdf
 
MSQA Thesis
MSQA ThesisMSQA Thesis
MSQA Thesis
 
Financial Benchmarking Of Transportation Companies In The New York Stock Exc...
Financial Benchmarking Of Transportation Companies  In The New York Stock Exc...Financial Benchmarking Of Transportation Companies  In The New York Stock Exc...
Financial Benchmarking Of Transportation Companies In The New York Stock Exc...
 
Penalized Regressions with Different Tuning Parameter Choosing Criteria and t...
Penalized Regressions with Different Tuning Parameter Choosing Criteria and t...Penalized Regressions with Different Tuning Parameter Choosing Criteria and t...
Penalized Regressions with Different Tuning Parameter Choosing Criteria and t...
 
Bag Jacobs Ead Model Ccl Irmc 6 10
Bag Jacobs Ead Model Ccl Irmc 6 10Bag Jacobs Ead Model Ccl Irmc 6 10
Bag Jacobs Ead Model Ccl Irmc 6 10
 

More from Michael770443

Discrete Choice Model
Discrete Choice ModelDiscrete Choice Model
Discrete Choice ModelMichael770443
 
Categorical Data and Statistical Analysis
Categorical Data and Statistical AnalysisCategorical Data and Statistical Analysis
Categorical Data and Statistical AnalysisMichael770443
 
Analysis of Variance
Analysis of VarianceAnalysis of Variance
Analysis of VarianceMichael770443
 
Segmentation: Clustering and Classification
Segmentation: Clustering and ClassificationSegmentation: Clustering and Classification
Segmentation: Clustering and ClassificationMichael770443
 
Introduction to Statistical Methods
Introduction to Statistical MethodsIntroduction to Statistical Methods
Introduction to Statistical MethodsMichael770443
 
Overview of Statistical Concepts
Overview of Statistical ConceptsOverview of Statistical Concepts
Overview of Statistical ConceptsMichael770443
 

More from Michael770443 (9)

Discrete Choice Model
Discrete Choice ModelDiscrete Choice Model
Discrete Choice Model
 
Categorical Data and Statistical Analysis
Categorical Data and Statistical AnalysisCategorical Data and Statistical Analysis
Categorical Data and Statistical Analysis
 
Analysis of Variance
Analysis of VarianceAnalysis of Variance
Analysis of Variance
 
Classification
ClassificationClassification
Classification
 
Segmentation: Clustering and Classification
Segmentation: Clustering and ClassificationSegmentation: Clustering and Classification
Segmentation: Clustering and Classification
 
Regression Analysis
Regression AnalysisRegression Analysis
Regression Analysis
 
Linear Regression
Linear RegressionLinear Regression
Linear Regression
 
Introduction to Statistical Methods
Introduction to Statistical MethodsIntroduction to Statistical Methods
Introduction to Statistical Methods
 
Overview of Statistical Concepts
Overview of Statistical ConceptsOverview of Statistical Concepts
Overview of Statistical Concepts
 

Recently uploaded

Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfsanyamsingh5019
 
Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsanshu789521
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Sapana Sha
 
Solving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxSolving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxOH TEIK BIN
 
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991RKavithamani
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Educationpboyjonauth
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeThiyagu K
 
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptxContemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptxRoyAbrique
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxiammrhaywood
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactdawncurless
 
Concept of Vouching. B.Com(Hons) /B.Compdf
Concept of Vouching. B.Com(Hons) /B.CompdfConcept of Vouching. B.Com(Hons) /B.Compdf
Concept of Vouching. B.Com(Hons) /B.CompdfUmakantAnnand
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdfssuser54595a
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentInMediaRes1
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Krashi Coaching
 
mini mental status format.docx
mini    mental       status     format.docxmini    mental       status     format.docx
mini mental status format.docxPoojaSen20
 
Science 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its CharacteristicsScience 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its CharacteristicsKarinaGenton
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingTechSoup
 
MENTAL STATUS EXAMINATION format.docx
MENTAL     STATUS EXAMINATION format.docxMENTAL     STATUS EXAMINATION format.docx
MENTAL STATUS EXAMINATION format.docxPoojaSen20
 

Recently uploaded (20)

Staff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSDStaff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSD
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdf
 
Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha elections
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
 
Solving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxSolving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptx
 
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Education
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and Mode
 
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptxContemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
 
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impact
 
Concept of Vouching. B.Com(Hons) /B.Compdf
Concept of Vouching. B.Com(Hons) /B.CompdfConcept of Vouching. B.Com(Hons) /B.Compdf
Concept of Vouching. B.Com(Hons) /B.Compdf
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media Component
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
 
mini mental status format.docx
mini    mental       status     format.docxmini    mental       status     format.docx
mini mental status format.docx
 
Science 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its CharacteristicsScience 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its Characteristics
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy Consulting
 
MENTAL STATUS EXAMINATION format.docx
MENTAL     STATUS EXAMINATION format.docxMENTAL     STATUS EXAMINATION format.docx
MENTAL STATUS EXAMINATION format.docx
 

Discrete Choice Model - Part 2

  • 1. U N I V E R S I T Y O F S O U T H F L O R I D A // Discrete Choice Model Dr. Shivendu
  • 2. Agenda 5/24/2022 2 Discrete choice models: Multiple Choices • Multinomial Models • Ordinal Logit Models • Censored Regression or Count Data Models: Tobit Models Quiz 8: Based on Class 9 Readings Class 9_SAS_Module Statistical Analysis IV: Non-Parametric Procedures • Chap 27 and 28 of DAU_SAS • SAS Assignment 9 posted: Due before class 10
  • 3. Examples of multinomial choice (polytomous) situations: 1.Choice of a laundry detergent: Tide, Cheer, Arm & Hammer, Wisk, etc. 2.Choice of a major: economics, marketing, management, finance or accounting. 3.Choices after graduating from high school: not going to college, going to a private 4-year college, a public 4 year-college, or a 2-year college. Slide16-3 Principles of Econometrics, 3rd Edition
  • 4. The explanatory variable xi is individual specific but does not change across alternatives. Example age of the individual. The dependent variable is nominal Slide16-4 Principles of Econometrics, 3rd Edition
  • 5. Examples of multinomial choice situations: 1. It is key that there are more than 2 choices 2. It is key that there is no meaningful ordering to them. Otherwise, we would want to use that information (with an ordered probit or ordered logit) Slide16-5 Principles of Econometrics, 3rd Edition
  • 6. In essence this model is like a set of simultaneous individual binomial/binary logistic regressions With appropriate weighting, since the different comparisons between different pairs of categories would generally involve different numbers of observations Slide16-6 Principles of Econometrics, 3rd Edition
  • 7. Principles of Econometrics, 3rd Edition Slide16-7     1 12 22 13 23 1 , 1 1 exp exp i i i p j x x               12 22 2 12 22 13 23 exp , 2 1 exp exp i i i i x p j x x                 13 23 3 12 22 13 23 exp , 3 1 exp exp i i i i x p j x x             individual chooses alternative ij p P i j 
  • 8. An interesting feature of the odds ratio (16.21) is that the odds of choosing alternative j rather than alternative 1 does not depend on how many alternatives there are in total. There is the implicit assumption in logit models that the odds between any pair of alternatives is independent of irrelevant alternatives (IIA). Principles of Econometrics, 3rd Edition Slide16-8       1 2 1 exp 2,3 1 ij i j j i i i p P y j x j P y p            1 2 1 2 exp 2,3 ij i j j j i i p p x j x       
  • 9. • There is the implicit assumption in logit models that the odds between any pair of alternatives is independent of irrelevant alternatives (IIA) One way to state the assumption • If choice A is preferred to choice B out of the choice set {A,B}, then introducing a third alternative X, thus expanding that choice set to {A,B,X}, must not make B preferable to A. • which kind of makes sense  Slide16-9 Principles of Econometrics, 3rd Edition IIA assumption
  • 10. • There is the implicit assumption in logit models that the odds between any pair of alternatives is independent of irrelevant alternatives (IIA) In the case of the multinomial logit model, the IIA implies that adding another alternative or changing the characteristics of a third alternative must not affect the relative odds between the two alternatives considered. This is not realistic for many real life applications involving similar (substitute) alternatives. Slide16-10 Principles of Econometrics, 3rd Edition IIA assumption
  • 11. This is not realistic for many real-life applications with similar (substitute) alternatives Examples: • Beethoven/Debussy versus another of Beethoven’s Symphonies (Debreu 1960; Tversky 1972) • Bicycle/Pony (Luce and Suppes 1965) • Red Bus/Blue Bus (McFadden 1974). • Black slacks, jeans, shorts versus blue slacks (Hoffman, 2004) • Etc. Slide16-11 Principles of Econometrics, 3rd Edition IIA assumption
  • 12. Red Bus/Blue Bus (McFadden 1974). • Imagine commuters first face a decision between two modes of transportation: car and red bus • Suppose that a consumer chooses between these two options with equal probability, 0.5, so that the odds ratio equals 1. • Now add a third mode, blue bus. Assuming bus commuters do not care about the color of the bus (they are perfect substitutes), consumers are expected to choose between bus and car still with equal probability, so the probability of car is still 0.5, while the probabilities of each of the two bus types should go down to 0.25 • However, this violates IIA: for the odds ratio between car and red bus to be preserved, the new probabilities must be: car 0.33; red bus 0.33; blue bus 0.33 • Te IIA axiom does not mix well with perfect substitutes  IIA assumption
  • 13. We can test this assumption with a Hausman-McFadden test which compares a logistic model with all the choices with one with restricted choices. IIA assumption
  • 14. Model for three categories Need k-1 generalized logits to represent a dependent variable with k categories
  • 15. Meaning of the regression coefficients A positive regression coefficient for logit j means that higher values of the independent variable are associated with greater chances of response category j, compared to the reference category.
  • 16. Solve for the probabilities so So
  • 17. Three linear equations in 3 unknowns
  • 19. In general, solve k equations in k unknowns
  • 21. Using the solution, one can  Calculate the probability of obtaining the observed data as a function of the regression coefficients: Get maximum likelihood estimates (beta-hat values)  From maximum likelihood estimates, get tests and confidence intervals  Using beta-hat values in Lj, estimate probabilities of category membership for any set of x values.
  • 22. Multinominal Logistic Regression in SAS https://support.sas.com/resources/papers/proceedings12/427-2012.pdf
  • 23. Multinominal Logistic Regression in SAS • For a good example and implementation, see: •https://stats.idre.ucla.edu/sas/dae/multinomiallogistic-regression/ •For example and more statistical theory, see • https://support.sas.com/resources/papers/proceedings12/427-2012.pdf 5/24/2022 23
  • 24.  Extensions have arisen to deal with this issue  The multinomial probit and the mixed logit are alternative models for nominal outcomes that relax IIA, by allowing correlation among the errors (to reflect similarity among options)  but these models often have issues and assumptions themselves   IIA can also be relaxed by specifying a hierarchical model, ranking the choice alternatives. The most popular of these is called the McFadden’s nested logit model, which allows correlation among some errors, but not all (e.g. Heiss 2002)  Generalized extreme value and multinomial probit models possess another property, the Invariant Proportion of Substitution (Steenburgh 2008), which itself also suggests similarly counterintuitive real-life individual choice behavior  The multinomial probit has serious computational disadvantages too, since it involves calculating multiple (one less than the number of categories) integrals. With integration by simulation this problem is being ameliorated now… IIA assumption
  • 25. Combining Categories Consider testing whether two categories could be combined If none of the independent variables really explain the odds of choosing choice A versus B, you should merge them
  • 26. Multinomial Logit versus Probit Computational issues make the Multinomial Probit very rare Advantage: it does not need IIA 
  • 27. Example: choice between three types (J = 3) of soft drinks, say Pepsi, 7-Up and Coke Classic. Let yi1, yi2 and yi3 be dummy variables that indicate the choice made by individual i. The price facing individual i for brand j is PRICEij. Variables like price are to be individual and alternative specific, because they vary from individual to individual and are different for each choice the consumer might make Another example: of mode of transportation choice: time from home to work using train, car, or bus. For more details and example in implementing conditional logit in SAS, see https://stats.idre.ucla.edu/sas/faq/how-do-i-do-a-conditional-logit-model-analysis-in-sas-9-1/ Principles of Econometrics, 3rd Edition Slide16-27
  • 28. Logit Models for Ordinal Responses • Response variable is ordinal (categorical with natural ordering) • Predictor variable(s) can be numeric or qualitative (dummy variables) • Labeling the ordinal categories from 1 (lowest level) to c (highest), can obtain the cumulative probabilities: c j j Y P Y P j Y P , , 1 ) ( ) 1 ( ) (         
  • 29. Logistic Regression for Ordinal Response • The odds of falling in category j or below: 1 ) ( 1 , , 1 ) ( ) (       c Y P c j j Y P j Y P  • Logit (log odds) of cumulative probabilities are modeled as linear functions of predictor variable(s):   1 , , 1 ) ( ) ( log ) ( logit               c j X j Y P j Y P j Y P j    This is called the proportional odds model, and assumes the effect of X is the same for each cumulative probability
  • 30. Example - Urban Renewal Attitudes • Response: Attitude toward urban renewal project (Negative (Y=1), Moderate (Y=2), Positive (Y=3)) • Predictor Variable: Respondent’s Race (White, Nonwhite) • Contingency Table: AttitudeRace White Nonwhite Negative (Y=1) 101 106 Moderate (Y=2) 91 127 Positive (Y=3) 170 190
  • 31. SPSS Output • Note that SPSS fits the model in the following form:   1 , , 1 ) ( ) ( log ) ( logit               c j X j Y P j Y P j Y P j    r E 7 2 3 1 0 7 8 5 4 0 1 0 0 1 1 3 0 1 3 3 0 0 a . . 0 . . . [ A [ A T [ R [ R L m E a d f S i g r B r B d e L T a Note that the race variable is not significant (or even close).
  • 32. Fitted Equation • The fitted equation for each group/category: 165 . 0 0 165 . 0 ) Nonwhite | 2 ( ) Nonwhite | 2 ( logit : te Mod/Nonwhi or Neg 166 . 0 ) 001 . 0 ( 165 . 0 ) White | 2 ( ) White | 2 ( logit : Mod/White or Neg 027 . 1 ) 0 ( 027 . 1 ) Nonwhite | 1 ( ) Nonwhite | 1 ( logit : onwhite Negative/N 026 . 1 ) 001 . 0 ( 027 . 1 ) White | 1 ( ) White | 1 ( logit : hite Negative/W                                                   Y P Y P Y P Y P Y P Y P Y P Y P For each group, the fitted probability of falling in that set of categories is eL/(1+eL) where L is the logit value (0.264,0.264,0.541,0.541)
  • 33. Inference for Regression Coefficients • If  = 0, the response (Y) is independent of X • Z-test can be conducted to test this (estimate divided by its standard error) • Most software will conduct the Wald test, with the statistic being the z-statistic squared, which has a chi-squared distribution with 1 degree of freedom under the null hypothesis • Odds ratio of increasing X by 1 unit and its confidence interval are obtained by raising e to the power of the regression coefficient and its upper and lower bounds
  • 34. Example - Urban Renewal Attitudes • Z-statistic for testing for race differences: Z=0.001/0.133 = 0.0075 (recall model estimates -) • Wald statistic: .000 (P-value=.993) • Estimated odds ratio: e.001 = 1.001 • 95% Confidence Interval: (e-.260,e.263)=(0.771,1.301) • Interval contains 1, odds of being in a given category or below is same for whites as nonwhites r E 7 2 3 1 0 7 8 5 4 0 1 0 0 1 1 3 0 1 3 3 0 0 a . . 0 . . . [ A [ A T [ R [ R L m E a d f S i g r B r B d e L T a
  • 35. Ordinal Predictors • Creating dummy variables for ordinal categories treats them as if nominal • To make an ordinal variable, create a new variable X that models the levels of the ordinal variable • Setting depends on assignment of levels (simplest form is to let X=1,...,c for the categories which treats levels as being equally spaced)
  • 36. Censored Regression or Count Data Models: Tobit Models
  • 37. 3 7 INTRODUCTION • Data on y is censored if for part of the range of y we observe only that y is in that range, rather than observing the exact value of y. e.g. income is top-coded at $75,000 per year. • Data on y is truncated if for part of the range of y we do not observe y at all. e.g. people with income above $75,000 per year are excluded from the sample.
  • 38. 3 8 • Meaningful policy analysis requires extrapolationfrom the restricted sample to the population as a whole. • But running regressions on censored or truncated data, without controlling for censoring or truncation, leads to inconsistent parameter estimates.
  • 39. 3 9 • We focus on the normal, with censoring or truncation at zero. e.g. annual hours worked, and annual expenditure on automobiles. • The class of models presented in this chapter is called limited dependent variable models or latent variable models. Econometricians also use the terminology tobit models or generalized tobit models.
  • 40. 4 0 • Censoring can arise for distributions other than the normal. • For e.g. count data treatment is similar to here except different distributions. • For duration data, e.g. the length of a spell of unem- ployment, a separate treatment of censoring is given there due to different censoring mechanism (random) to that considered here.
  • 41. 4 1 Many Models for Censored Data • Tobit model: MLE, NLS and Heckman 2-step. • Sample selectivity model, a generalization of Tobit. • Semiparametric estimation. • Structural economic models for censored choice. • Simultaneous equation models
  • 42. 4 2 TOBIT MODEL • Interest lies in a latent dependent variable y* y* = X'β + c. • This variable is only partially observed. • In censored regression we observe y = * y if y >0 0 if y <=0 • In truncated regression we observe y = y* if y* >0 .
  • 43. • For censored and truncated data, linear regression is inappropriate.
  • 44. 4 4 • The standard estimators require stochasticassumptions about the distribution of c and hence y*. • The Tobit model assumes normality of the error term. • For consistent estimates, OLS is not good. We should use MLE
  • 45. Tobit in SAS • For example, and using SAS to estimate Tobit regression, see • https://stats.idre.ucla.edu/sas/dae/tobit-analysis/
  • 46. Takeaway • Multinominal and ordered logit models are increasingly being used in consumer product choice research as well as in practice. • Tobit models are useful when data is count data or duration data. These are often used in specific contexts only. 5/24/2022 46
  • 47. Going forward • No more face-to-face class. Session 10 and 11 will be only on Teams. • Final exam is on 4th May from 8.30 am to 11.00 am and will be on Teams through proctorio. 5/24/2022 47