AN INTRODUCTION TO DISCRETE
CHOICE MODELLING
Tony Fowkes
Visiting Reader
Institute for Transport Studies
University of Leeds
Internal Seminar, ITS, 07/04/16
WHAT DO YOU THINK OF BRITISH TV?
• How good are the BBC channels? – Think
of a number!
Specifically, how ‘satisfied’ are you with the BBC
channels (BBC1, BBC2 & BBC4)?
We will be dealing with comparisons, so any
number will do for now. Write down 100 if
you can think of nothing better.
WHAT DO YOU THINK OF BRITISH TV?
• Relative to the number you gave for the
BBC channels, how good do you think the
ITV offering is (ITV1 – ITV4)?
If you think one is twice as good as another,
you might give it twice the number.
Be guided by how often you watch ITV
channels as against BBC channels.
WHAT DO YOU THINK OF BRITISH TV?
• Now give me a third number for how good
you think all the other channels are.
WHAT DO YOU THINK OF BRITISH TV?
• Lastly, taking the total time you spend
watching all channels in a typical week as
100%, please write down the 3
percentages of time you typically spend
watching each of the channel groups.
You do not need to be too exact, and if you
don’t watch TV in a typical week, choose a
non-typical one.
So, we have been able to measure
shares (also known as proportions,
probabilities and, if multiplied by 100,
percentages).
But we want to model the shares, so
that we understand how they vary from
one person to another, and over time
as things change. That will allow us to
make predictions.
HOW MIGHT WE RELATE THE
VIEWING % FIGURES TO THE
SATISFACTION NUMBERS?
• Each person will have used a different
(and, to the analyst, unknown) scale when
selecting their satisfaction numbers, but
we might try to guess (FOR EACH
PERSON) the proportion of time they
spend watching each of the 3 groups of
channels.
A SHARE MODEL
The simplest way of looking at this problem
is to try to form a simple ‘share model’.
Let Hi denote the hours spent watching
channel i, Si satisfaction with channel i and
Pi denote the share of the hours watched for
channel i in the total. Then:
PBBC = HBBC/(HBBC+HITV+HELSE)
A SHARE MODEL
If hours watched are proportional to
Satisfaction, then:
PBBC = SBBC/(SBBC+SITV+SELSE)
BUT – is Usage always proportional to
Satisfaction?
CONSIDER YOUR JOURNEY HOME
FROM THE UNIVERSITY
• If you had the choice of two alternative
routes, one of which is three times as
good as the other, would you ever willingly
choose the worse route?
• P1 = S1/(S1+S2) = 100/(100+300) = 0.25
Seems like we need a better share model.
TRY USING EXPONENTIALS
P1 = Exp(S1)/[Exp(S1)+Exp(S2)]
= Exp(100)/[Exp(100)+Exp(300)] ≈ 1.4 × 10^-87
Rather too extreme, but we can define a
Utility (U) as a function of the S values,
e.g. U = θS
Let θ = 0.05 (just to try it)
P1 = Exp(5)/[Exp(5)+Exp(15)] ≈ 0.00005
By changing θ we can get sensible Ps
BACK TO THE TV EXAMPLE
If you had given S1=100, S2=80, S3=160;
then with θ=0.01 (just as an example),
PBBC = Exp(1)/[Exp(1)+Exp(0.8)+Exp(1.6)]
= 0.27
PITV = 0.22
PELSE = 0.50
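The TV figures above can be reproduced in a few lines of Python. This is a minimal sketch, assuming the satisfaction numbers and θ = 0.01 quoted on the slide; the function name logit_shares is hypothetical and is not part of the seminar material.

```python
import math

def logit_shares(satisfaction, theta):
    """Return logit shares exp(theta*S_i) / sum_k exp(theta*S_k)."""
    # Utility is assumed to be linear in satisfaction: U = theta * S
    utilities = {name: theta * s for name, s in satisfaction.items()}
    # Subtracting the largest utility leaves the shares unchanged
    # but avoids overflow when theta * S is large
    u_max = max(utilities.values())
    weights = {name: math.exp(u - u_max) for name, u in utilities.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

scores = {"BBC": 100, "ITV": 80, "ELSE": 160}
shares = logit_shares(scores, theta=0.01)
for name, p in shares.items():
    print(f"P_{name} = {p:.2f}")
# P_BBC = 0.27
# P_ITV = 0.22
# P_ELSE = 0.50
```

The printed shares sum to 0.99 rather than 1 only because each is rounded to two decimal places.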
THE SCALE FACTOR
We call θ the SCALE FACTOR, and it is a
crucial parameter that has to be estimated
when calibrating a Discrete Choice
forecasting model.
The scale factor determines the relative
weight we give to the deterministic part of
the model compared to everything else
(the unknown residual or ‘error’ term).
The Scale Factor Problem
Logit Models consist of 2 parts:
U = Deterministic part + Random error
U = ΩV + ε
where Ω (the scale factor, written θ earlier) ‘scales’ the expression we use for V
to the scale of the random error.
Suppose V = β0 + β1X1 + β2X2
Then ΩV = Ωβ0 + Ωβ1X1 + Ωβ2X2
And so the modelled coefficients are estimates of
Ωβ0, Ωβ1, Ωβ2
Why does the scale factor problem matter?
• For attribute valuation, such as ‘value of time’, it
doesn’t matter since the scale factors cancel
• For mode choice forecasting it does matter,
unless the errors are the correct size. This may
well be the case for RP, but will not be the case
for SP, where the errors are likely to be greater
than real errors due to the hypothetical nature of
the experiment. That will mean that the formula
for P will overstate small probabilities and
understate the probability of the dominant mode.
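Both bullet points can be illustrated numerically. The sketch below assumes hypothetical cost and time coefficients and a hypothetical train/coach pair; none of these numbers are estimates from the seminar. It shows the scale factor cancelling in the value-of-time ratio while a smaller scale, as might arise from noisier SP data, pulls the forecast share of the better mode towards one half.

```python
import math

# Hypothetical 'true' coefficients: cost (per pound) and time (per minute)
alpha, beta = -0.10, -0.02

def value_of_time(omega):
    # Estimation recovers omega*alpha and omega*beta, never alpha or beta alone,
    # but the scale factor cancels in the ratio
    return (omega * beta) / (omega * alpha)

def p_first(omega, v1, v2):
    # Binary logit probability of alternative 1 with scaled utilities
    return 1.0 / (1.0 + math.exp(omega * (v2 - v1)))

# Illustrative alternatives: train (30 pounds, 120 min), coach (15 pounds, 240 min)
v_train = alpha * 30 + beta * 120
v_coach = alpha * 15 + beta * 240

for omega in (1.0, 0.5):  # 0.5 stands in for a noisier (SP-like) error scale
    print(omega, value_of_time(omega), round(p_first(omega, v_train, v_coach), 3))
```

The value of time is the same at both scales, but the forecast share for the train falls from about 0.71 to about 0.61 when the scale is halved.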
Probability P varies with Ω
Pi = exp(ΩVi)/∑k exp(ΩVk)
As Ω → 0, Pi → 1/K, where K is the number of alternatives,
i.e. complete ignorance – the toss of a coin when K = 2.
As Ω increases, the deterministic part of the model
explains more of what is going on – good.
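The effect of the scale factor can be seen by sweeping it over a range of values. This is a sketch only: the three utilities are illustrative numbers chosen for the example, and the function name mnl_probabilities is hypothetical.

```python
import math

def mnl_probabilities(v, omega):
    """P_i = exp(omega*V_i) / sum_k exp(omega*V_k) for a list of utilities."""
    scaled = [omega * vi for vi in v]
    m = max(scaled)  # shift for numerical stability; the shares are unaffected
    weights = [math.exp(s - m) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]

v = [1.0, 0.8, 1.6]  # illustrative utilities, not estimated values
for omega in (0.0, 0.5, 1.0, 5.0, 50.0):
    probs = mnl_probabilities(v, omega)
    print(omega, [round(p, 3) for p in probs])
```

At a scale of zero every alternative gets an equal share; as the scale grows, the probability concentrates on the alternative with the highest utility.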
How can the Binary Logit model be derived?
P1 = Prob(U1 > U2)
= Prob(ΩV1+ε1 > ΩV2+ε2)
= Prob(ε1 = h AND ε2 ≤ h + ΩV1 - ΩV2), taken over all values of h
Assume independent Gumbel distributions for the ε’s.
Cumulative F(ε) = exp(-exp(-ε))
Density fn. dF(ε) = exp(-ε) exp(-exp(-ε)) dε
P1 = ∫ from minus infinity to plus infinity of
dF(h) F(h + ΩV1 - ΩV2), which on substitution gives
∫ exp(-h) exp(-exp(-h)) exp(-exp(-h + ΩV2 - ΩV1)) dh
which, after some tricky but conventional
manipulation gives:
P1 = 1/(1+exp(ΩV2-ΩV1))
Or
P1 = exp(ΩV1)/[exp(ΩV1) + exp(ΩV2)]
which is the Binary Logit model.
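The ‘tricky but conventional manipulation’ can be sketched as follows, starting from the integrand above. The shorthand a = exp(ΩV2 - ΩV1) and the substitution variable t are introduced here for convenience and do not appear in the original slides.

```latex
\begin{align*}
P_1 &= \int_{-\infty}^{\infty} e^{-h}\, e^{-e^{-h}}\, e^{-a e^{-h}}\, dh,
      \qquad a = e^{\Omega V_2 - \Omega V_1} \\
    &= \int_{-\infty}^{\infty} e^{-h}\, e^{-(1+a) e^{-h}}\, dh \\
    &= \int_{0}^{\infty} e^{-(1+a) t}\, dt
      \qquad (t = e^{-h},\; dt = -e^{-h}\, dh) \\
    &= \frac{1}{1+a}
     = \frac{1}{1 + e^{\Omega V_2 - \Omega V_1}}
     = \frac{e^{\Omega V_1}}{e^{\Omega V_1} + e^{\Omega V_2}} .
\end{align*}
```

The last equality follows from multiplying numerator and denominator by exp(ΩV1).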
Multinomial Logit Model (MNL)
• This brings us back to where we started, a
three way choice of TV channels. For
more than 2 choices we use a Multinomial
Logit model
P1 = exp(U1)/(exp(U1) + exp(U2) + …)
Problem with the MNL model
• A theoretical, and sometimes important
problem with MNL is the Red Bus – Blue
Bus problem, which arises from the
Independence of Irrelevant Alternatives
property.
• This can be avoided by using various
Nested Logits, Mixed Logit, Cascetta’s C-
Logit, or Fowkes & Toner’s Flat Logit.
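The Red Bus – Blue Bus problem is usually explained with a small numerical example, sketched below. The utilities are illustrative assumptions (car and bus taken to be equally attractive), and the helper name mnl is hypothetical.

```python
import math

def mnl(utilities):
    """Multinomial logit shares for a dict of alternative -> utility."""
    m = max(utilities.values())
    weights = {k: math.exp(u - m) for k, u in utilities.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}

# Car and red bus are assumed equally attractive
before = mnl({"car": 0.0, "red_bus": 0.0})
# Add a blue bus that is identical to the red bus in everything but colour
after = mnl({"car": 0.0, "red_bus": 0.0, "blue_bus": 0.0})

print(before)  # car and red bus share the market equally
print(after)   # MNL gives all three alternatives an equal share
# IIA: the car / red bus ratio is the same before and after
print(before["car"] / before["red_bus"], after["car"] / after["red_bus"])
```

Intuition says the car share should stay at one half, with the two buses splitting the other half. MNL instead takes share from the car, because the ratio of any two probabilities does not depend on which other alternatives are present.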
THE DETERMINISTIC PART
Here we seek to model Utility.
The current terminology we use is to regard
the 3 channel groups as 3
ALTERNATIVES, each described by a set
of ATTRIBUTES, each set to a particular
LEVEL.
Examples of ALTERNATIVES,
ATTRIBUTES and ATTRIBUTE LEVELS
Our Alternatives are BBC, ITV, ELSE
Important ATTRIBUTES might be:
(i) Availability
(ii) Cost
(iii) Variety of programmes
(iv) Quality of programmes
Possible attribute LEVELS for Availability
might be:
a) Freeview
b) Satellite
c) High Definition
d) On Demand
Possible attribute LEVELS for Variety might
be:
(a) Very good choice
(b) Good choice
(c) Average
(d) Poor range of programmes
(e) Very limited range of programmes
(f) Only phone-in shows
Possible attribute LEVELS for Quality might
be:
(a) International top quality
(b) Not bad for a national network
(c) Has occasional good programmes
(d) Only repeats
(e) Only phone-in shows
(f) Ant ‘n’ Dec
Transport Applications
In Transport there are many occasions
where we model Alternatives by their
Generalised Cost, GC:
eg. GC = αC + βT
Or, more generally,
GC = αC + β1T1 + β2T2... + βnTn
Excerpt from A Gray (1977)
“For the UK, the generalised cost concept was
perhaps invented by Quarmby in the famous 1967
article about modal choice, based on some earlier
work by Warner (1962) in the United States. In
Quarmby’s article the concept was described as
‘disutility’ and referred to a linear combination of
the time and money costs of a journey”.
VALUE OF TIME
In passing we note that the RATIO OF the
coefficient of the nth type of time (Tn) TO
the coefficient of cost is called the value of
the nth type of time, ie
VOT(n) = βn /α
This has kept some of us employed for a
good part of our working lives.
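As a worked illustration, generalised cost and the implied values of time can be computed as below. The coefficients, the two types of time (in-vehicle and walking) and the journey figures are all hypothetical values chosen for the example, not estimates from any study.

```python
def generalised_cost(cost, times, alpha, betas):
    """GC = alpha*C + beta_1*T_1 + ... + beta_n*T_n."""
    if len(times) != len(betas):
        raise ValueError("need one coefficient per type of time")
    return alpha * cost + sum(b * t for b, t in zip(betas, times))

def values_of_time(alpha, betas):
    """VOT(n) = beta_n / alpha for each type of time."""
    return [b / alpha for b in betas]

# Hypothetical coefficients: cost in pounds, times in minutes
alpha = 1.0
betas = [0.10, 0.25]          # in-vehicle time, walking time

gc = generalised_cost(cost=30.0, times=[120.0, 10.0], alpha=alpha, betas=betas)
vots = values_of_time(alpha, betas)
print(gc)    # generalised cost in the units of the cost coefficient
print(vots)  # pounds per minute for each type of time
```

With these numbers a minute of walking is valued at two and a half times a minute of in-vehicle time, which is the kind of relative valuation the ratio is meant to expose.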
WHAT IS THE VALUE OF TIME?
It is just the exchange rate (for a person, a
sample, or a population) between money
and spending extra time in an activity. It
has 2 parts.
There is always something we can do with
time so the Resource VOT is always +ve.
Usually more important is the (dis)utility of
the activity concerned. Most activities
have a –ve utility from time reduction, but
in transport they are mostly +ve.
Binary Choice
Let us estimate a model for 2 Alternatives: 1
& 2 (just 2, so we say “Binary”)
Suppose the Alternatives only differ in terms
of measured Generalised Cost.
We need to observe P1, the proportion
choosing Alternative 1 for various levels of
difference in GC between the Alternatives.
The Binary Logit Model
A Linear expression for P1 is not
satisfactory.
(eg. P1 has to lie between zero and one).
• A linear expression for
ln(P1/(1-P1))
seems much more satisfactory
Put this “logit” (or ‘log-odds’) equal to
difference in Generalised Cost, GC1-GC2
Equation for the Binary Logit Model
Ln(P1/(1-P1)) = GC1-GC2
P1/(1-P1) = exp(GC1-GC2)
P1 = exp(GC1-GC2) - P1.exp(GC1-GC2)
P1(1+exp(GC1-GC2)) = exp(GC1-GC2)
P1 = exp(GC1-GC2)/(1+exp(GC1-GC2))
P1 = exp(GC1)/[exp(GC1)+exp(GC2)]
Excerpt from D McFadden (2001)
“In 1965, a graduate student asked me how she
might analyze her thesis data in freeway routing
choices by the California Department of
Highways. This led me to consider the problem of
economic choice among discrete alternatives. The
problem was to devise a computationally tractable
model of economic decision making that yielded
choice probabilities for each alternative in a finite
feasible set. It was natural to think of highway
department decision-makers as maximizing
preferences that varied from one bureaucrat to
another.
“I drew on a classical psychological study of
perception, Thurstone’s Law of comparative
Judgment. In this theory, the perceived level of a
stimulus equals its objective level plus a random
error. The probability that one object is judged
higher than a second is the probability that this
alternative has the higher perceived stimulus.
When the perceived stimuli are interpreted as
levels of satisfaction, or utility, this can be
interpreted as a model for economic choice in
which utility levels are random, and observed
choices pick out the alternative that has the
highest realized utility level. This connection
was made in the 1950’s by the economist Jacob
Marschak, who called this the random utility
maximization hypothesis, abbreviated to RUM.
“Another psychologist I relied on was Duncan Luce,
who in 1959 introduced an axiom that simplified
experimental collection of psychological choice data by
allowing choice probabilities for many alternatives to
be inferred from choices between pairs of alternatives.
Marschak showed that choice probabilities satisfying
Luce’s axiom were consistent with the RUM
hypothesis.
I proposed an econometric version of the Luce model
in which the utilities of alternatives depended on their
measured attributes, such as construction cost, route
length, and areas of parklands and open space taken. I
called this a conditional or multinomial logit model, and
developed a computer program to estimate it.”
DALY-ZACHARY-WILLIAMS
THEOREM
Andrew Daly & Stan Zachary (1976) and
Huw Williams (1977) added significantly to
Discrete Choice theory, particularly
providing a set of conditions that
Generalised Extreme Value models need
to meet in order to be a probability choice
model.
Williams also related the concept of
Consumer Surplus to Discrete Choice
Model parameters.
Revealed Preference Analysis
Key References
1. P Samuelson (1938). Econometrica.
Observing a consumer to have chosen one alternative
and, by so doing, have rejected a second alternative.
2. K Lancaster (1966). Journal of Political Economy.
Utility for a commodity determined by the
characteristics of that commodity. Then a small step
to modelling utility as a sum of ‘part-worths’ of these
characteristics individually.
3. D McFadden (1974). In: Zarembka (ed), Frontiers of
Econometrics.
‘Conditional Logit Analysis of Qualitative Choice
Behaviour’
Revealed Preference Data
TRAVELLERS ARE OBSERVED TO CHOOSE AN
OPTION (HAVING CERTAIN CHARACTERISTICS) IN
PREFERENCE TO ANOTHER OPTION (HAVING OTHER
CHARACTERISTICS)
e.g. Traveller chooses train with cost £30 and travel time 2
hours in preference to coach costing £15 and taking 4
hours.
EITHER Requires ‘Engineering’ data on costs, times,
etc. (Possibly from fare manuals, timetables
or modelled)
OR Requires traveller to report the costs and
times of both the chosen and rejected modes.
Problems with Revealed Preference Data
– Self justification bias in reported data
– Many choices ‘dominated’
– Cost and time differences between modes may
be correlated
– Habit/inertia effects
– Respondent may not be able to give satisfactory
data about the alternative mode
Generally need very large samples
Transfer Price Data
TRAVELLERS ARE ASKED DIRECTLY FOR A
MEASURE OF UTILITY DIFFERENCE BETWEEN
TWO TRAVEL ALTERNATIVES
by questions such as:
‘How much would the cost of your chosen alternative
have to rise in order for you to switch to your rejected
alternative?’
Problems with Transfer Price Data
– Policy response bias
– Unconstrained response bias
– Self justification bias
– Requires data about the rejected alternative,
which may only be known very inexactly
– Respondent may not understand or be able to
relate to question
Stated Preference Data
TRAVELLERS ARE PRESENTED WITH A SET OF
HYPOTHETICAL TRAVEL CHOICES, EACH WITH
ITS OWN CHARACTERISTICS (e.g. Cost, Travel
time, etc), AND ASKED TO
- MAKE A CHOICE
- RANK ALTERNATIVES
- RATE ALTERNATIVES
THE CRUCIAL REQUIREMENT IS THAT THE
ABOVE INCORPORATE IMPLICIT TRADE-OFFS
Advantages of Stated Preference
– Can represent situations that do not yet exist
– No problem of reporting error/bias
– Can ‘design in’ interesting trade offs
– Can ensure low correlation between
characteristic differences
– Can ask ‘many’ choices of each individual
– Avoids requirement for ‘confidential’ information
Problems with Stated Preference Data
– Response not rooted in an actual choice
– Questions may be difficult to understand
– Respondents may refuse to ‘play games’
– Relatively unimportant characteristics may be
ignored
– Design is (very?) difficult
– Scale factor problem

More Related Content

What's hot

Analytic hierarchy process
Analytic hierarchy processAnalytic hierarchy process
Analytic hierarchy process
Ujjwal 'Shanu'
 
Analytic Hierarchy Process AHP
Analytic Hierarchy Process AHPAnalytic Hierarchy Process AHP
Analytic Hierarchy Process AHP
adcom2015
 

What's hot (20)

Decision Making Using the Analytic Hierarchy Process (AHP); A Step by Step A...
Decision Making Using the Analytic Hierarchy Process (AHP);  A Step by Step A...Decision Making Using the Analytic Hierarchy Process (AHP);  A Step by Step A...
Decision Making Using the Analytic Hierarchy Process (AHP); A Step by Step A...
 
Unit.4.integer programming
Unit.4.integer programmingUnit.4.integer programming
Unit.4.integer programming
 
Transportation models
Transportation modelsTransportation models
Transportation models
 
Transportation and Assignment
Transportation and AssignmentTransportation and Assignment
Transportation and Assignment
 
Stated preference methods and analysis
Stated preference methods and analysisStated preference methods and analysis
Stated preference methods and analysis
 
Linear programming graphical method (feasibility)
Linear programming   graphical method (feasibility)Linear programming   graphical method (feasibility)
Linear programming graphical method (feasibility)
 
Different kind of distance and Statistical Distance
Different kind of distance and Statistical DistanceDifferent kind of distance and Statistical Distance
Different kind of distance and Statistical Distance
 
Operations Research - Sensitivity Analysis
Operations Research - Sensitivity AnalysisOperations Research - Sensitivity Analysis
Operations Research - Sensitivity Analysis
 
Decision theory
Decision theoryDecision theory
Decision theory
 
Multinomial Logistic Regression Analysis
Multinomial Logistic Regression AnalysisMultinomial Logistic Regression Analysis
Multinomial Logistic Regression Analysis
 
Data envelopment analysis
Data envelopment analysisData envelopment analysis
Data envelopment analysis
 
Data Envelopment Analysis
Data Envelopment AnalysisData Envelopment Analysis
Data Envelopment Analysis
 
Transportation
TransportationTransportation
Transportation
 
Queueing Theory and its BusinessS Applications
Queueing Theory and its BusinessS ApplicationsQueueing Theory and its BusinessS Applications
Queueing Theory and its BusinessS Applications
 
Analytic hierarchy process
Analytic hierarchy processAnalytic hierarchy process
Analytic hierarchy process
 
Queuing theory
Queuing theoryQueuing theory
Queuing theory
 
Queueing theory
Queueing theoryQueueing theory
Queueing theory
 
Transportation model
Transportation modelTransportation model
Transportation model
 
Analytic Hierarchy Process AHP
Analytic Hierarchy Process AHPAnalytic Hierarchy Process AHP
Analytic Hierarchy Process AHP
 
Application of Univariate, Bi-variate and Multivariate analysis Pooja k shetty
Application of Univariate, Bi-variate and Multivariate analysis Pooja k shettyApplication of Univariate, Bi-variate and Multivariate analysis Pooja k shetty
Application of Univariate, Bi-variate and Multivariate analysis Pooja k shetty
 

Similar to An Introduction to Discrete Choice Modelling

DTINGLEYWarning.docx
DTINGLEYWarning.docxDTINGLEYWarning.docx
DTINGLEYWarning.docx
sagarlesley
 
1. This question is on the application of the Binomial option
1. This question is on the application of the Binomial option1. This question is on the application of the Binomial option
1. This question is on the application of the Binomial option
AbbyWhyte974
 
1. This question is on the application of the Binomial option
1. This question is on the application of the Binomial option1. This question is on the application of the Binomial option
1. This question is on the application of the Binomial option
SantosConleyha
 

Similar to An Introduction to Discrete Choice Modelling (20)

Monte Carlo Tree Search in 2014 (MCMC days in Marseille)
Monte Carlo Tree Search in 2014 (MCMC days in Marseille)Monte Carlo Tree Search in 2014 (MCMC days in Marseille)
Monte Carlo Tree Search in 2014 (MCMC days in Marseille)
 
Take it to the Limit: quantitation, likelihood, modelling and other matters
Take it to the Limit: quantitation, likelihood, modelling and other mattersTake it to the Limit: quantitation, likelihood, modelling and other matters
Take it to the Limit: quantitation, likelihood, modelling and other matters
 
Stratified sampling and resampling for approximate Bayesian computation
Stratified sampling and resampling for approximate Bayesian computationStratified sampling and resampling for approximate Bayesian computation
Stratified sampling and resampling for approximate Bayesian computation
 
DTINGLEYWarning.docx
DTINGLEYWarning.docxDTINGLEYWarning.docx
DTINGLEYWarning.docx
 
Travelling Salesman Problem using Partical Swarm Optimization
Travelling Salesman Problem using Partical Swarm OptimizationTravelling Salesman Problem using Partical Swarm Optimization
Travelling Salesman Problem using Partical Swarm Optimization
 
ANOVA.pptx
ANOVA.pptxANOVA.pptx
ANOVA.pptx
 
OR 14 15-unit_2
OR 14 15-unit_2OR 14 15-unit_2
OR 14 15-unit_2
 
NCE, GANs & VAEs (and maybe BAC)
NCE, GANs & VAEs (and maybe BAC)NCE, GANs & VAEs (and maybe BAC)
NCE, GANs & VAEs (and maybe BAC)
 
CDT 22 slides.pdf
CDT 22 slides.pdfCDT 22 slides.pdf
CDT 22 slides.pdf
 
1. This question is on the application of the Binomial option
1. This question is on the application of the Binomial option1. This question is on the application of the Binomial option
1. This question is on the application of the Binomial option
 
1. This question is on the application of the Binomial option
1. This question is on the application of the Binomial option1. This question is on the application of the Binomial option
1. This question is on the application of the Binomial option
 
Variational Bayes: A Gentle Introduction
Variational Bayes: A Gentle IntroductionVariational Bayes: A Gentle Introduction
Variational Bayes: A Gentle Introduction
 
Probability Distributions
Probability Distributions Probability Distributions
Probability Distributions
 
Distributed lag model koyck
Distributed lag model koyckDistributed lag model koyck
Distributed lag model koyck
 
Machine learning mathematicals.pdf
Machine learning mathematicals.pdfMachine learning mathematicals.pdf
Machine learning mathematicals.pdf
 
probability assignment help (2)
probability assignment help (2)probability assignment help (2)
probability assignment help (2)
 
Presentation socg
Presentation socgPresentation socg
Presentation socg
 
Ch4 slides
Ch4 slidesCh4 slides
Ch4 slides
 
Analysis of Variance
Analysis of Variance Analysis of Variance
Analysis of Variance
 
Change Point Analysis
Change Point AnalysisChange Point Analysis
Change Point Analysis
 

More from Institute for Transport Studies (ITS)

Social networks, activities, and travel - building links to understand behaviour
Social networks, activities, and travel - building links to understand behaviourSocial networks, activities, and travel - building links to understand behaviour
Social networks, activities, and travel - building links to understand behaviour
Institute for Transport Studies (ITS)
 
Rail freight in Japan - track access
Rail freight in Japan - track accessRail freight in Japan - track access
Rail freight in Japan - track access
Institute for Transport Studies (ITS)
 

More from Institute for Transport Studies (ITS) (20)

Transport Projects Aimed at Fostering Economic Growth – experience in the UK ...
Transport Projects Aimed at Fostering Economic Growth – experience in the UK ...Transport Projects Aimed at Fostering Economic Growth – experience in the UK ...
Transport Projects Aimed at Fostering Economic Growth – experience in the UK ...
 
BA Geography with Transport Studies at the University of Leeds
BA Geography with Transport Studies at the University of LeedsBA Geography with Transport Studies at the University of Leeds
BA Geography with Transport Studies at the University of Leeds
 
Highways Benchmarking - Accelerating Impact
Highways Benchmarking - Accelerating ImpactHighways Benchmarking - Accelerating Impact
Highways Benchmarking - Accelerating Impact
 
Using telematics data to research traffic related air pollution
Using telematics data to research traffic related air pollutionUsing telematics data to research traffic related air pollution
Using telematics data to research traffic related air pollution
 
Masters Dissertation Posters 2017
Masters Dissertation Posters 2017Masters Dissertation Posters 2017
Masters Dissertation Posters 2017
 
Institute for Transport Studies - Masters Open Day 2017
Institute for Transport Studies - Masters Open Day 2017Institute for Transport Studies - Masters Open Day 2017
Institute for Transport Studies - Masters Open Day 2017
 
London's Crossrail Scheme - its evolution, governance, financing and challenges
London's Crossrail Scheme  - its evolution, governance, financing and challengesLondon's Crossrail Scheme  - its evolution, governance, financing and challenges
London's Crossrail Scheme - its evolution, governance, financing and challenges
 
Secretary of State Visit
Secretary of State VisitSecretary of State Visit
Secretary of State Visit
 
Business model innovation for electrical vehicle futures
Business model innovation for electrical vehicle futuresBusiness model innovation for electrical vehicle futures
Business model innovation for electrical vehicle futures
 
A clustering method based on repeated trip behaviour to identify road user cl...
A clustering method based on repeated trip behaviour to identify road user cl...A clustering method based on repeated trip behaviour to identify road user cl...
A clustering method based on repeated trip behaviour to identify road user cl...
 
Cars cars everywhere
Cars cars everywhereCars cars everywhere
Cars cars everywhere
 
Annual Review 2015-16 - University of leeds
Annual Review 2015-16 - University of leedsAnnual Review 2015-16 - University of leeds
Annual Review 2015-16 - University of leeds
 
Social networks, activities, and travel - building links to understand behaviour
Social networks, activities, and travel - building links to understand behaviourSocial networks, activities, and travel - building links to understand behaviour
Social networks, activities, and travel - building links to understand behaviour
 
Rail freight in Japan - track access
Rail freight in Japan - track accessRail freight in Japan - track access
Rail freight in Japan - track access
 
Real time traffic management - challenges and solutions
Real time traffic management - challenges and solutionsReal time traffic management - challenges and solutions
Real time traffic management - challenges and solutions
 
Proportionally fair scheduling for traffic light networks
Proportionally fair scheduling for traffic light networksProportionally fair scheduling for traffic light networks
Proportionally fair scheduling for traffic light networks
 
Capacity maximising traffic signal control policies
Capacity maximising traffic signal control policiesCapacity maximising traffic signal control policies
Capacity maximising traffic signal control policies
 
Bayesian risk assessment of autonomous vehicles
Bayesian risk assessment of autonomous vehiclesBayesian risk assessment of autonomous vehicles
Bayesian risk assessment of autonomous vehicles
 
Agent based car following model for heterogeneities of platoon driving with v...
Agent based car following model for heterogeneities of platoon driving with v...Agent based car following model for heterogeneities of platoon driving with v...
Agent based car following model for heterogeneities of platoon driving with v...
 
A new theory of lane selection on highways
A new theory of lane selection on highwaysA new theory of lane selection on highways
A new theory of lane selection on highways
 

Recently uploaded

VIP Independent Call Girls in Mira Bhayandar 🌹 9920725232 ( Call Me ) Mumbai ...
VIP Independent Call Girls in Mira Bhayandar 🌹 9920725232 ( Call Me ) Mumbai ...VIP Independent Call Girls in Mira Bhayandar 🌹 9920725232 ( Call Me ) Mumbai ...
VIP Independent Call Girls in Mira Bhayandar 🌹 9920725232 ( Call Me ) Mumbai ...
dipikadinghjn ( Why You Choose Us? ) Escorts
 
( Jasmin ) Top VIP Escorts Service Dindigul 💧 7737669865 💧 by Dindigul Call G...
( Jasmin ) Top VIP Escorts Service Dindigul 💧 7737669865 💧 by Dindigul Call G...( Jasmin ) Top VIP Escorts Service Dindigul 💧 7737669865 💧 by Dindigul Call G...
( Jasmin ) Top VIP Escorts Service Dindigul 💧 7737669865 💧 by Dindigul Call G...
dipikadinghjn ( Why You Choose Us? ) Escorts
 
call girls in Sant Nagar (DELHI) 🔝 >༒9953056974 🔝 genuine Escort Service 🔝✔️✔️
call girls in Sant Nagar (DELHI) 🔝 >༒9953056974 🔝 genuine Escort Service 🔝✔️✔️call girls in Sant Nagar (DELHI) 🔝 >༒9953056974 🔝 genuine Escort Service 🔝✔️✔️
call girls in Sant Nagar (DELHI) 🔝 >༒9953056974 🔝 genuine Escort Service 🔝✔️✔️
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
VIP Call Girl in Mumbai 💧 9920725232 ( Call Me ) Get A New Crush Everyday Wit...
VIP Call Girl in Mumbai 💧 9920725232 ( Call Me ) Get A New Crush Everyday Wit...VIP Call Girl in Mumbai 💧 9920725232 ( Call Me ) Get A New Crush Everyday Wit...
VIP Call Girl in Mumbai 💧 9920725232 ( Call Me ) Get A New Crush Everyday Wit...
dipikadinghjn ( Why You Choose Us? ) Escorts
 

Recently uploaded (20)

VIP Independent Call Girls in Mira Bhayandar 🌹 9920725232 ( Call Me ) Mumbai ...
VIP Independent Call Girls in Mira Bhayandar 🌹 9920725232 ( Call Me ) Mumbai ...VIP Independent Call Girls in Mira Bhayandar 🌹 9920725232 ( Call Me ) Mumbai ...
VIP Independent Call Girls in Mira Bhayandar 🌹 9920725232 ( Call Me ) Mumbai ...
 
Diva-Thane European Call Girls Number-9833754194-Diva Busty Professional Call...
Diva-Thane European Call Girls Number-9833754194-Diva Busty Professional Call...Diva-Thane European Call Girls Number-9833754194-Diva Busty Professional Call...
Diva-Thane European Call Girls Number-9833754194-Diva Busty Professional Call...
 
8377087607, Door Step Call Girls In Kalkaji (Locanto) 24/7 Available
8377087607, Door Step Call Girls In Kalkaji (Locanto) 24/7 Available8377087607, Door Step Call Girls In Kalkaji (Locanto) 24/7 Available
8377087607, Door Step Call Girls In Kalkaji (Locanto) 24/7 Available
 
Top Rated Pune Call Girls Sinhagad Road ⟟ 6297143586 ⟟ Call Me For Genuine S...
Top Rated  Pune Call Girls Sinhagad Road ⟟ 6297143586 ⟟ Call Me For Genuine S...Top Rated  Pune Call Girls Sinhagad Road ⟟ 6297143586 ⟟ Call Me For Genuine S...
Top Rated Pune Call Girls Sinhagad Road ⟟ 6297143586 ⟟ Call Me For Genuine S...
 
Call Girls in New Friends Colony Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escort...
Call Girls in New Friends Colony Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escort...Call Girls in New Friends Colony Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escort...
Call Girls in New Friends Colony Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escort...
 
falcon-invoice-discounting-unlocking-prime-investment-opportunities
falcon-invoice-discounting-unlocking-prime-investment-opportunitiesfalcon-invoice-discounting-unlocking-prime-investment-opportunities
falcon-invoice-discounting-unlocking-prime-investment-opportunities
 
( Jasmin ) Top VIP Escorts Service Dindigul 💧 7737669865 💧 by Dindigul Call G...
( Jasmin ) Top VIP Escorts Service Dindigul 💧 7737669865 💧 by Dindigul Call G...( Jasmin ) Top VIP Escorts Service Dindigul 💧 7737669865 💧 by Dindigul Call G...
( Jasmin ) Top VIP Escorts Service Dindigul 💧 7737669865 💧 by Dindigul Call G...
 
Indore Real Estate Market Trends Report.pdf
Indore Real Estate Market Trends Report.pdfIndore Real Estate Market Trends Report.pdf
Indore Real Estate Market Trends Report.pdf
 
Top Rated Pune Call Girls Aundh ⟟ 6297143586 ⟟ Call Me For Genuine Sex Servi...
Top Rated  Pune Call Girls Aundh ⟟ 6297143586 ⟟ Call Me For Genuine Sex Servi...Top Rated  Pune Call Girls Aundh ⟟ 6297143586 ⟟ Call Me For Genuine Sex Servi...
Top Rated Pune Call Girls Aundh ⟟ 6297143586 ⟟ Call Me For Genuine Sex Servi...
 
Gurley shaw Theory of Monetary Economics.
Gurley shaw Theory of Monetary Economics.Gurley shaw Theory of Monetary Economics.
Gurley shaw Theory of Monetary Economics.
 
WhatsApp 📞 Call : 9892124323 ✅Call Girls In Chembur ( Mumbai ) secure service
WhatsApp 📞 Call : 9892124323  ✅Call Girls In Chembur ( Mumbai ) secure serviceWhatsApp 📞 Call : 9892124323  ✅Call Girls In Chembur ( Mumbai ) secure service

An Introduction to Discrete Choice Modelling

  • 1. AN INTRODUCTION TO DISCRETE CHOICE MODELLING Tony Fowkes Visiting Reader Institute for Transport Studies University of Leeds Internal Seminar, ITS, 07/04/16
  • 2. WHAT DO YOU THINK OF BRITISH TV? • How good are the BBC channels? – Think of a number!
  • 3. WHAT DO YOU THINK OF BRITISH TV? • How good are the BBC channels? – Think of a number! Specifically, how ‘satisfied’ are you with the BBC channels (BBC1, BBC2 & BBC4)? We will be dealing with comparisons, so any number will do for now. Write down 100 if you can think of nothing better.
  • 4. WHAT DO YOU THINK OF BRITISH TV? • Relative to the number you gave for the BBC channels, how good do you think the ITV offering is (ITV1 – ITV4)? If you think one is twice as good as another, you might give it twice the number. Be guided by how often you watch ITV channels as against BBC channels.
  • 5. WHAT DO YOU THINK OF BRITISH TV? • Now give me a third number for how good you think all the other channels are.
  • 6. WHAT DO YOU THINK OF BRITISH TV? • Lastly, taking the total time you spend watching all channels in a typical week as 100%, please write down the 3 percentages of time you typically spend watching each of the channel groups. You do not need to be too exact, and if you don’t watch TV in a typical week, choose a non-typical one.
  • 7. So, we have been able to measure shares (also known as proportions, probabilities and, if multiplied by 100, percentages). But we want to model the shares, so that we understand how they vary from one person to another, and over time as things change. That will allow us to make predictions.
  • 8. HOW MIGHT WE RELATE THE VIEWING % FIGURES TO THE SATISFACTION NUMBERS? • Each person will have used a different (and unknown to the analyst) scale when selecting their satisfaction numbers, but we might try to guess (FOR EACH PERSON) the proportion of time they spend watching each of the 3 groups of channels.
  • 9. A SHARE MODEL The simplest way of looking at this problem is to try to form a simple ‘share model’. Let $H_i$ denote the hours spent watching channel $i$, $S_i$ the satisfaction with channel $i$, and $P_i$ the share of channel $i$ in the total hours watched. Then: $P_{BBC} = H_{BBC}/(H_{BBC}+H_{ITV}+H_{ELSE})$
  • 10. A SHARE MODEL If hours watched are proportional to Satisfaction, then: $P_{BBC} = S_{BBC}/(S_{BBC}+S_{ITV}+S_{ELSE})$ BUT – is Usage always proportional to Satisfaction?
  • 11. CONSIDER YOUR JOURNEY HOME FROM THE UNIVERSITY • If you had the choice of two alternative routes, one of which is three times as good as the other, would you ever willingly choose the worse route? • $P_1 = S_1/(S_1+S_2) = 100/(100+300) = 0.25$, so the simple share model has the worse route chosen a quarter of the time. Seems like we need a better share model.
  • 12. TRY USING EXPONENTIALS $P_1 = \exp(S_1)/[\exp(S_1)+\exp(S_2)] = \exp(100)/[\exp(100)+\exp(300)] \approx 1.4 \times 10^{-87}$ Rather too extreme, but we can define a Utility ($U$) as a function of the $S$ values, eg. $U = \theta S$ Let $\theta = 0.05$ (just to try it) $P_1 = \exp(5)/[\exp(5)+\exp(15)] \approx 0.00005$ By changing $\theta$ we can get sensible $P$s
  • 13. BACK TO THE TV EXAMPLE If you had given $S_1=100$, $S_2=80$, $S_3=160$; then with $\theta=0.01$ (just as an example), $P_{BBC} = \exp(1)/[\exp(1)+\exp(0.8)+\exp(1.6)] = 0.27$, $P_{ITV} = 0.22$, $P_{ELSE} = 0.50$
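The two share models can be compared numerically. The sketch below is illustrative only: the function names `proportional_shares` and `logit_shares` are invented for this example, and the inputs are the satisfaction scores quoted on the slide (100, 80, 160) with θ = 0.01.

```python
import math

def proportional_shares(scores):
    """Simple share model: each share is the score divided by the total."""
    total = sum(scores.values())
    return {name: s / total for name, s in scores.items()}

def logit_shares(scores, theta):
    """Exponential share model with utility U = theta * S."""
    weights = {name: math.exp(theta * s) for name, s in scores.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

# Satisfaction scores from the TV example on the slide
scores = {"BBC": 100, "ITV": 80, "ELSE": 160}

simple = proportional_shares(scores)
logit = logit_shares(scores, theta=0.01)
for name in scores:
    print(f"{name}: proportional {simple[name]:.2f}, logit {logit[name]:.2f}")
```

With these inputs the exponential model reproduces the slide's figures to two decimal places. The proportional model gives slightly different shares from the same scores.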
  • 14. THE SCALE FACTOR We call θ the SCALE FACTOR, and it is a crucial parameter that has to be estimated when calibrating a Discrete Choice forecasting model. The scale factor determines the relative weight we give to the deterministic part of the model compared to everything else (the unknown residual or ‘error’ term).
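  • 15. The Scale Factor Problem Logit Models consist of 2 parts: $U$ = Deterministic part + Random error, ie. $U = \Omega V + \varepsilon$, where $\Omega$ (the scale factor, written θ above) ‘scales’ the expression we use for $V$ to the scale of the random error. Suppose $V = \beta_0 + \beta_1 X_1 + \beta_2 X_2$ Then $\Omega V = \Omega\beta_0 + \Omega\beta_1 X_1 + \Omega\beta_2 X_2$ And so the modelled coefficients are estimates of $\Omega\beta_0$, $\Omega\beta_1$, $\Omega\beta_2$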
  • 16. Why does the scale factor problem matter? • For attribute valuation, such as ‘value of time’, it doesn’t matter since the scale factors cancel • For mode choice forecasting it does matter, unless the errors are the correct size. This may well be the case for Revealed Preference (RP) data, but will not be the case for Stated Preference (SP) data, where the errors are likely to be greater than real errors due to the hypothetical nature of the experiment. That will mean that the formula for P will overstate small probabilities and understate the probability of the dominant mode.
  • 17. Probability P varies with Ω $P_i = \exp(\Omega V_i)/\sum_k \exp(\Omega V_k)$ As $\Omega \to 0$, $P_i \to 1/K$, where $K$ is the number of alternatives, ie. complete ignorance – toss of a coin. As $\Omega$ increases, the model explains more of what is going on – good.
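A small numerical sketch of how the probabilities move with the scale factor. The function name `mnl_probabilities` and the list of Ω values are chosen here for illustration, and the TV satisfaction scores are reused as V purely as an assumption.

```python
import math

def mnl_probabilities(v, omega):
    """Logit probabilities exp(omega*V_i) / sum_k exp(omega*V_k)."""
    scaled = [omega * x for x in v]
    # Subtracting the maximum leaves the ratios unchanged and avoids overflow.
    m = max(scaled)
    weights = [math.exp(s - m) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]

v = [100, 80, 160]  # illustrative values of V (TV satisfaction scores)
for omega in (0.0, 0.001, 0.01, 0.05, 0.5):
    probs = mnl_probabilities(v, omega)
    print(omega, [round(p, 3) for p in probs])
```

At Ω = 0 every alternative gets the same probability (one over the number of alternatives). As Ω grows, the alternative with the highest V takes almost all of the probability.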
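  • 18. How can the Binary Logit model be derived? $P_1 = \mathrm{Prob}(U_1 > U_2) = \mathrm{Prob}(\Omega V_1+\varepsilon_1 > \Omega V_2+\varepsilon_2) = \mathrm{Prob}(\varepsilon_1 = h \text{ AND } \varepsilon_2 \le h + \Omega V_1 - \Omega V_2)$, taken over all values of $h$. Assume independent Gumbel distributions for the $\varepsilon$’s. Cumulative $F(\varepsilon) = \exp(-\exp(-\varepsilon))$ Density fn. $dF(\varepsilon) = \exp(-\varepsilon)\exp(-\exp(-\varepsilon))\,d\varepsilon$ $P_1 = \int_{-\infty}^{\infty} F(h + \Omega V_1 - \Omega V_2)\,dF(h)$ which on substitution gives $P_1 = \int_{-\infty}^{\infty} \exp(-h)\exp(-\exp(-h))\exp(-\exp(-h + \Omega V_2 - \Omega V_1))\,dh$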
  • 19. which, after some tricky but conventional manipulation gives: $P_1 = 1/(1+\exp(\Omega V_2-\Omega V_1))$ Or $P_1 = \exp(\Omega V_1)/[\exp(\Omega V_1) + \exp(\Omega V_2)]$ which is the Binary Logit model.
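One way the manipulation can be carried out is sketched below. It uses the substitution $t = e^{-h}$, which is a standard route but not necessarily the one used in the seminar; the symbol $a$ is shorthand introduced here.

```latex
\begin{aligned}
P_1 &= \int_{-\infty}^{\infty} e^{-h}\,\exp\!\left(-e^{-h}\right)
       \exp\!\left(-e^{-h}\,e^{\Omega V_2 - \Omega V_1}\right) dh \\
    &= \int_{-\infty}^{\infty} e^{-h}\,\exp\!\left(-(1+a)\,e^{-h}\right) dh,
       \qquad a = e^{\Omega V_2 - \Omega V_1} \\
    &= \int_{0}^{\infty} \exp\!\left(-(1+a)\,t\right) dt
       \qquad (t = e^{-h},\ dt = -e^{-h}\,dh) \\
    &= \frac{1}{1+a} = \frac{1}{1+\exp(\Omega V_2 - \Omega V_1)}
\end{aligned}
```

Multiplying the numerator and denominator of the last expression by $\exp(\Omega V_1)$ gives the second form of the Binary Logit model.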
  • 20. Multinomial Logit Model (MNL) • This brings us back to where we started, a three-way choice of TV channels. For more than 2 choices we use a Multinomial Logit model: $P_1 = \exp(U_1)/(\exp(U_1) + \exp(U_2) + \dots)$
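The link between random utility with Gumbel errors and the Multinomial Logit formula can be checked by simulation. This is a minimal sketch under stated assumptions: independent standard Gumbel errors, the deterministic utilities 1.0, 0.8 and 1.6 from the TV example, and helper names (`gumbel_draw`, `simulate_shares`, `mnl_shares`) invented for the illustration.

```python
import math
import random

def gumbel_draw(rng):
    """Draw from the standard Gumbel distribution, F(e) = exp(-exp(-e))."""
    u = rng.random()
    while u == 0.0:  # guard against log(0)
        u = rng.random()
    return -math.log(-math.log(u))

def simulate_shares(v, n_draws, seed=0):
    """Share of draws in which each alternative has the highest utility."""
    rng = random.Random(seed)
    wins = [0] * len(v)
    for _ in range(n_draws):
        utilities = [vi + gumbel_draw(rng) for vi in v]
        wins[utilities.index(max(utilities))] += 1
    return [w / n_draws for w in wins]

def mnl_shares(v):
    """Multinomial Logit formula exp(V_i) / sum_k exp(V_k)."""
    weights = [math.exp(vi) for vi in v]
    total = sum(weights)
    return [w / total for w in weights]

v = [1.0, 0.8, 1.6]  # theta * S for the TV example with theta = 0.01
simulated = simulate_shares(v, n_draws=100_000)
formula = mnl_shares(v)
print([round(s, 3) for s in simulated])
print([round(f, 3) for f in formula])
```

The simulated shares should sit close to the formula values, with the gap shrinking as the number of draws increases.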
  • 21. Problem with the MNL model • A theoretical, and sometimes important, problem with MNL is the Red Bus – Blue Bus problem, which arises from the Independence of Irrelevant Alternatives property. • This can be avoided by using various Nested Logits, Mixed Logit, Cascetta’s C-Logit, or Fowkes & Toner’s Flat Logit.
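The Red Bus – Blue Bus problem is usually illustrated with a traveller who is indifferent between car and a blue bus, after which an otherwise identical red bus is added. The sketch below follows that textbook setup. The utility values of zero and the function name `mnl_shares` are assumptions made for the illustration.

```python
import math

def mnl_shares(utilities):
    """Multinomial Logit shares for a dict of alternative -> utility."""
    weights = {k: math.exp(u) for k, u in utilities.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}

# Equal utilities: the traveller is indifferent between the alternatives.
before = mnl_shares({"car": 0.0, "blue bus": 0.0})
after = mnl_shares({"car": 0.0, "blue bus": 0.0, "red bus": 0.0})

# Independence of Irrelevant Alternatives: the car / blue bus ratio is
# unchanged by adding the red bus, so car is forced down to one third.
ratio_before = before["car"] / before["blue bus"]
ratio_after = after["car"] / after["blue bus"]

print(before)
print(after)
print(ratio_before, ratio_after)
```

MNL gives each of the three alternatives a one-third share. Intuition suggests the two buses should simply split the original bus share, leaving car at one half. That gap is what the alternative model forms are meant to address.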
  • 22. THE DETERMINISTIC PART Here we seek to model Utility. The current terminology we use is to regard the 3 channel groups as 3 ALTERNATIVES, each described by a set of ATTRIBUTES, each set to a particular LEVEL.
  • 23. Examples of ALTERNATIVES, ATTRIBUTES and ATTRIBUTE LEVELS Our Alternatives are BBC, ITV, ELSE Important ATTRIBUTES might be: (i) Availability (ii) Cost (iii) Variety of programmes (iv) Quality of programmes
  • 24. Possible attribute LEVELS for Availability might be: a) Freeview b) Satellite c) High Definition d) On Demand
  • 25. Possible attribute LEVELS for Variety might be: (a) Very good choice (b) Good choice (c) Average (d) Poor range of programmes (e) Very limited range of programmes (f) Only phone-in shows
  • 26. Possible attribute LEVELS for Quality might be: (a) International top quality (b) Not bad for a national network (c) Has occasional good programmes (d) Only repeats (e) Only phone-in shows (f) Ant ‘n’ Dec
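  • 27. Transport Applications In Transport there are many occasions where we model Alternatives by their Generalised Cost, GC: eg. $GC = \alpha C + \beta T$, where $C$ is the money cost and $T$ the time of a journey. Or, more generally, $GC = \alpha C + \beta_1 T_1 + \beta_2 T_2 + \dots + \beta_n T_n$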
  • 28. Excerpt from A Gray (1977) “For the UK, the generalised cost concept was perhaps invented by Quarmby in the famous 1967 article about modal choice, based on some earlier work by Warner (1962) in the United States. In Quarmby’s article the concept was described as ‘disutility’ and referred to a linear combination of the time and money costs of a journey”.
  • 29. VALUE OF TIME In passing we note that the RATIO OF the coefficient of the nth type of time ($T_n$) TO the coefficient of cost is called the value of the nth type of time, ie $VOT(n) = \beta_n/\alpha$ This has kept some of us employed for a good part of our working lives.
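A short sketch of the value of time as a ratio of coefficients, and of why an unknown scale factor does not affect it. The coefficient values, the units (cost in pence, time in minutes) and the scale value are all hypothetical, chosen only to make the arithmetic visible.

```python
import math

def value_of_time(beta_time, alpha_cost):
    """Value of time: time coefficient divided by cost coefficient."""
    return beta_time / alpha_cost

# Hypothetical coefficients of GC = alpha*C + beta*T
alpha = 0.02  # per penny of cost (assumed)
beta = 0.10   # per minute of time (assumed)

vot = value_of_time(beta, alpha)  # pence per minute

# Estimation recovers omega*alpha and omega*beta rather than alpha and beta,
# but the unknown scale cancels in the ratio.
omega = 3.7  # arbitrary scale factor (assumed)
vot_scaled = value_of_time(omega * beta, omega * alpha)

print(f"VOT = {vot:.2f} pence per minute")
print("Unchanged by scale:", math.isclose(vot, vot_scaled))
```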
  • 30. WHAT IS THE VALUE OF TIME? It is just the exchange rate (for a person, a sample, or a population) between money and spending extra time in an activity. It has 2 parts. There is always something we can do with time so the Resource VOT is always +ve. Usually more important is the (dis)utility of the activity concerned. Most activities have a –ve utility from time reduction, but in transport they are mostly +ve.
  • 31. Binary Choice Let us estimate a model for 2 Alternatives: 1 & 2 (just 2, so we say “Binary”) Suppose the Alternatives only differ in terms of measured Generalised Cost. We need to observe P1, the proportion choosing Alternative 1 for various levels of difference in GC between the Alternatives.
  • 32. The Binary Logit Model A linear expression for $P_1$ is not satisfactory (eg. $P_1$ has to lie between zero and one). • A linear expression for $\ln(P_1/(1-P_1))$ seems much more satisfactory. Put this “logit” (or ‘log-odds’) equal to the difference in Generalised Cost, $GC_1-GC_2$
  • 33. Equation for the Binary Logit Model $\ln(P_1/(1-P_1)) = GC_1-GC_2$ $P_1/(1-P_1) = \exp(GC_1-GC_2)$ $P_1 = \exp(GC_1-GC_2) - P_1\exp(GC_1-GC_2)$ $P_1(1+\exp(GC_1-GC_2)) = \exp(GC_1-GC_2)$ $P_1 = \exp(GC_1-GC_2)/(1+\exp(GC_1-GC_2))$ and, multiplying top and bottom by $\exp(GC_2)$, $P_1 = \exp(GC_1)/[\exp(GC_1)+\exp(GC_2)]$
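The algebra on this slide can be checked numerically. The sketch below follows the slide in setting the log-odds equal to GC1 − GC2. If GC is measured as a cost (a disutility), it would presumably enter with a negative sign, but that convention is an assumption and is not stated on the slide. The function names and test values are illustrative.

```python
import math

def logit(p):
    """Log-odds of a probability p, with 0 < p < 1."""
    return math.log(p / (1.0 - p))

def binary_logit(gc1, gc2):
    """P1 written in terms of the difference GC1 - GC2."""
    d = gc1 - gc2
    return math.exp(d) / (1.0 + math.exp(d))

def binary_logit_symmetric(gc1, gc2):
    """Equivalent form with one exponential per alternative."""
    return math.exp(gc1) / (math.exp(gc1) + math.exp(gc2))

for gc1, gc2 in [(0.5, 1.2), (2.0, 2.0), (3.0, -1.0)]:
    p1 = binary_logit(gc1, gc2)
    # p1 always lies strictly between 0 and 1, and its logit
    # recovers the difference gc1 - gc2.
    print(gc1, gc2, round(p1, 4), round(logit(p1), 4))
```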
  • 34. Excerpt from D McFadden (2001) “In 1965, a graduate student asked me how she might analyze her thesis data in freeway routing choices by the California Department of Highways. This led me to consider the problem of economic choice among discrete alternatives. The problem was to devise a computationally tractable model of economic decision making that yielded choice probabilities for each alternative in a finite feasible set. It was natural to think of highway department decision-makers as maximizing preferences that varied from one bureaucrat to another.
  • 35. “I drew on a classical psychological study of perception, Thurstone’s Law of comparative Judgment. In this theory, the perceived level of a stimulus equals its objective level plus a random error. The probability that one object is judged higher than a second is the probability that this alternative has the higher perceived stimulus. When the perceived stimuli are interpreted as levels of satisfaction, or utility, this can be interpreted as a model for economic choice in which utility levels are random, and observed choices pick out the alternative that has the highest realized utility level. This connection was made in the 1950’s by the economist Jacob Marschak, who called this the random utility maximization hypothesis, abbreviated to RUM.
  • 36. “Another psychologist I relied on was Duncan Luce, who in 1959 introduced an axiom that simplified experimental collection of psychological choice data by allowing choice probabilities for many alternatives to be inferred from choices between pairs of alternatives. Marschak showed that choice probabilities satisfying Luce’s axiom were consistent with the RUM hypothesis. I proposed an econometric version of the Luce model in which the utilities of alternatives depended on their measured attributes, such as construction cost, route length, and areas of parklands and open space taken. I called this a conditional or multinomial logit model, and developed a computer program to estimate it.”
  • 37. DALY-ZACHARY-WILLIAMS THEOREM Andrew Daly & Stan Zachary (1976) and Huw Williams (1977) added significantly to Discrete Choice theory, particularly providing a set of conditions that Generalised Extreme Value models need to meet in order to be a probability choice model. Williams also related the concept of Consumer Surplus to Discrete Choice Model parameters.
  • 38. Revealed Preference Analysis Key References 1. P Samuelson (1938). Econometrica. Observing a consumer to have chosen one alternative and, by so doing, have rejected a second alternative. 2. K Lancaster (1966). Journal of Political Economy. Utility for a commodity determined by the characteristics of that commodity. Then a small step to modelling utility as a sum of ‘part-worths’ of these characteristics individually. 3. D McFadden (1974). In: Zarembka (ed), Frontiers of Econometrics. ‘Conditional Logit Analysis of Qualitative Choice Behaviour’
  • 39. Revealed Preference Data TRAVELLERS ARE OBSERVED TO CHOOSE AN OPTION (HAVING CERTAIN CHARACTERISTICS) IN PREFERENCE TO ANOTHER OPTION (HAVING OTHER CHARACTERISTICS) e.g. Traveller chooses train with cost £30 and travel time 2 hours in preference to coach costing £15 and taking 4 hours. EITHER Requires ‘Engineering’ data on costs, times, etc. (Possibly from fare manuals, timetables or modelled) OR Requires traveller to report the costs and times of both the chosen and rejected modes.
  • 40. Problems with Revealed Preference Data – Self justification bias in reported data – Many choices ‘dominated’ – Cost and time differences between modes may be correlated – Habit/inertia effects – Respondent may not be able to give satisfactory data about the alternative mode Generally need very large samples
  • 41. Transfer Price Data TRAVELLERS ARE ASKED DIRECTLY FOR A MEASURE OF UTILITY DIFFERENCE BETWEEN TWO TRAVEL ALTERNATIVES by questions such as: ‘How much would the cost of your chosen alternative have to rise in order for you to switch to your rejected alternative?’
  • 42. Problems with Transfer Price Data – Policy response bias – Unconstrained response bias – Self justification bias – Requires data about the rejected alternative, which may only be known very inexactly – Respondent may not understand or be able to relate to question
  • 43. Stated Preference Data TRAVELLERS ARE PRESENTED WITH A SET OF HYPOTHETICAL TRAVEL CHOICES, EACH WITH ITS OWN CHARACTERISTICS (e.g. Cost, Travel time, etc), AND ASKED TO - MAKE A CHOICE - RANK ALTERNATIVES - RATE ALTERNATIVES THE CRUCIAL REQUIREMENT IS THAT THE ABOVE INCORPORATE IMPLICIT TRADE-OFFS
  • 44. Advantages of Stated Preference – Can represent situations that do not yet exist – No problem of reporting error/bias – Can ‘design in’ interesting trade-offs – Can ensure low correlation between characteristic differences – Can ask ‘many’ choices of each individual – Avoids requirement for ‘confidential’ information
  • 45. Problems with Stated Preference Data – Response not rooted in an actual choice – Questions may be difficult to understand – Respondents may refuse to ‘play games’ – Relatively unimportant characteristics may be ignored – Design is (very?) difficult – Scale factor problem