My first presentation as a PhD student in which I outline the background to my research project. This presentation was given as part of the University of Southampton Transportation Research Group seminar programme.
2. Contents
Demand models for new stations
Defining station catchments
Catchments in reality
Probabilistic station choice – discrete choice models
Next steps
3. Simple demand models
Used to forecast the number of entries and exits (Vi) at a new station:
Trip rate model - function of population of catchment:
$V_i = f(\text{population}_i)$
Trip end model - function of population plus other factors:
$V_i = f(\text{population}_i, \text{frequency}_i, \text{parking}_i, \text{jobs}_i)$
4. Spatial interaction (flow) models
Used to forecast the number of trips (T) from each origin (i) station to each destination (j) station:
$T_{ij} = f(O_i, D_j, S_{ij})$
$O_i$ – attributes of origin (e.g. population, parking, frequency)
$D_j$ – attributes of destination (e.g. number of workplaces)
$S_{ij}$ – separation between origin and destination (e.g. journey time)
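The slide leaves the function f unspecified. One common way to give a flow model a concrete shape is a gravity-type form, in which flow rises with the origin and destination attributes and falls with separation. The sketch below assumes that form; the constant `k`, the exponent `b` and all input values are made up for illustration and are not taken from this research.

```python
def gravity_flow(origin_attr, dest_attr, separation, k=1.0, b=2.0):
    """Hypothetical gravity-type flow: k * O_i * D_j / S_ij**b."""
    if separation <= 0:
        raise ValueError("separation must be positive")
    return k * origin_attr * dest_attr / separation ** b

# Made-up values: catchment population, number of workplaces, journey time (mins).
flow_near = gravity_flow(5000, 2000, separation=20)
flow_far = gravity_flow(5000, 2000, separation=40)
print(flow_near, flow_far)
```

With b = 2, doubling the journey time cuts the predicted flow to a quarter. In practice k and b would be calibrated against observed flows.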
5. Defining station catchments
Calibrate models using observed entries/exits or flows at existing stations.
But must define a catchment first.
Circular (buffer) around station:
$V_i = \alpha + \beta\,\mathrm{Pop}_i$
$V_i = \alpha + \beta\,\mathrm{Pop}_{1i} + \gamma\,\mathrm{Pop}_{2i}$
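A minimal sketch of how the single-buffer model, V_i = α + β·Pop_i, might be calibrated against existing stations. It assumes ordinary least squares as the fitting method, and the populations and entries/exits are invented so that the true values (α = 2000, β = 10) are known in advance.

```python
def fit_trip_rate(populations, entries_exits):
    """Least-squares estimates of alpha and beta in V_i = alpha + beta * Pop_i."""
    n = len(populations)
    mean_pop = sum(populations) / n
    mean_v = sum(entries_exits) / n
    sxx = sum((p - mean_pop) ** 2 for p in populations)
    sxy = sum((p - mean_pop) * (v - mean_v)
              for p, v in zip(populations, entries_exits))
    beta = sxy / sxx
    alpha = mean_v - beta * mean_pop
    return alpha, beta

# Made-up observations for four existing stations (V = 2000 + 10 * Pop).
populations = [1000, 2000, 3000, 4000]
entries_exits = [12000, 22000, 32000, 42000]

alpha, beta = fit_trip_rate(populations, entries_exits)

# Apply the calibrated model to a hypothetical new station with 2,500 residents.
forecast = alpha + beta * 2500
print(alpha, beta, forecast)
```

The two-buffer version adds a second population term, so it needs a multiple regression rather than this one-variable formula.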
7. Catchments in reality
Use origin-destination surveys.
2km circular catchments account on average for 57%
of observed trips – between 0-20% for some stations
(Blainey and Evens, 2011).
Only 53% of trip ends located within zone-based
catchments (Blainey and Preston, 2010).
47% of passengers in the Netherlands do not use their
nearest station (Debrezion et al., 2007).
8. Catchments in reality
Catchments are not discrete, they overlap,
and stations compete.
Station choice is not homogeneous within zones.
Catchments vary by access mode and station
type.
Station choice more complex than models
allow – need an alternative.
Mahmoud et al., 2014
9. Improving demand forecasting models
Include a probability-based station choice
element.
Should produce more accurate and
transferable models.
For each catchment zone calculate the
probability of each competing station being
chosen.
Allocate zonal population to each station based
on the probabilities.
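The allocation step can be sketched as follows. The zone populations, station names and probabilities are hypothetical; in the proposed approach the probabilities would come from a station choice model.

```python
def allocate_population(zone_population, station_probabilities):
    """Split one zone's population across stations by choice probability."""
    total = sum(station_probabilities.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return {station: zone_population * p
            for station, p in station_probabilities.items()}

def station_populations(zones):
    """Sum the allocated population for each station over all zones."""
    totals = {}
    for population, probabilities in zones:
        allocated = allocate_population(population, probabilities)
        for station, share in allocated.items():
            totals[station] = totals.get(station, 0.0) + share
    return totals

# Made-up zones: (population, {station: probability of being chosen}).
zones = [
    (1000, {"A": 0.5, "B": 0.5}),
    (400, {"A": 0.25, "B": 0.75}),
]
totals = station_populations(zones)
print(totals)
```

Unlike a deterministic catchment, each zone contributes population to every competing station, so the catchments overlap.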
10. Discrete choice models
Individual chooses from a finite
number of mutually exclusive
alternatives.
Individual chooses the alternative
that maximises their utility
(satisfaction).
Factor | Change | Expected effect on utility
Frequency of service
Car parking spaces
Fare
Access distance
Interchanges
Journey time
11. Discrete choice models
Station | Access distance (km) | Direct destinations | Off-peak fare to London (£) | Journey time to London (mins) | Transfers (to London) | Frequency per day (to London) | Parking spaces
Pen Mill | 0.5 | Cardiff-Weymouth | 86.00 | 206 | 1 | 8 | 25
Yeovil Junction | 2.1 | Waterloo-Exeter | 52.00 | 140 | 0 | 19 | 199
Castle Cary | 24.1 | Paddington-Penzance | 86.00 | 100 | 0 | 8 | 120
12. Discrete choice models
Actual utility an individual gains from an alternative is not
known.
Researcher tries to measure utility by identifying
attributes of the alternatives and/or the individual:
Utility = Measured utility + Unobserved utility
Measured utility = αFreq + βFare + γPkg + δDis
If we assume that the unobserved utilities of the alternatives are independent of each other and identically distributed (extreme value), then we can use logit models.
13. Logit models
Binary logit (choice of two alternatives, i and j):
$\Pr(ni) = \dfrac{e^{\text{MeasuredUtility}_{ni}}}{e^{\text{MeasuredUtility}_{ni}} + e^{\text{MeasuredUtility}_{nj}}}$
Multinomial logit (e.g. three alternatives, i, j and k):
$\Pr(ni) = \dfrac{e^{\text{MeasuredUtility}_{ni}}}{e^{\text{MeasuredUtility}_{ni}} + e^{\text{MeasuredUtility}_{nj}} + e^{\text{MeasuredUtility}_{nk}}}$
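The logit formulas translate directly into code. This is a minimal sketch: the utility values are made up, and subtracting the largest utility before exponentiating is a numerical convenience that leaves the probabilities unchanged.

```python
import math

def logit_probabilities(measured_utilities):
    """Logit choice probabilities from a dict of measured utilities."""
    # Shifting every utility by the same amount does not change the result,
    # but it avoids overflow when utilities are large.
    top = max(measured_utilities.values())
    exp_u = {alt: math.exp(u - top) for alt, u in measured_utilities.items()}
    denom = sum(exp_u.values())
    return {alt: e / denom for alt, e in exp_u.items()}

# Made-up measured utilities for three alternatives.
probs = logit_probabilities({"i": 1.0, "j": 1.0, "k": 0.0})
print(probs)
```

The same function covers the binary case when it is given two alternatives.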
14. Estimating logit models
Need to estimate the parameters in the utility function:
Measured utility = αFreq + βFare + γPkg + δDis
Collect individual-level data – usually from in-train passenger surveys.
Dependent variable is the observed choice (the station each participant
actually chose).
Parameters are estimated using maximum likelihood estimation, for example in R, Stata or LIMDEP.
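A rough sketch of what maximum likelihood estimation involves, assuming a utility function with a single attribute (access distance) and plain gradient ascent on the log-likelihood. Statistical packages use more sophisticated optimisers, and the survey records here are invented.

```python
import math

def choice_probabilities(beta, alternatives):
    """Logit probabilities for one traveller; one attribute list per station."""
    utilities = [sum(b * x for b, x in zip(beta, attrs)) for attrs in alternatives]
    top = max(utilities)
    exp_u = [math.exp(u - top) for u in utilities]
    denom = sum(exp_u)
    return [e / denom for e in exp_u]

def log_likelihood(beta, observations):
    """Sum of the log probabilities of the stations actually chosen."""
    return sum(math.log(choice_probabilities(beta, alts)[chosen])
               for alts, chosen in observations)

def gradient(beta, observations):
    """Chosen attributes minus probability-weighted attributes, summed."""
    grad = [0.0] * len(beta)
    for alts, chosen in observations:
        probs = choice_probabilities(beta, alts)
        for k in range(len(beta)):
            expected = sum(p * attrs[k] for p, attrs in zip(probs, alts))
            grad[k] += alts[chosen][k] - expected
    return grad

def estimate(observations, n_params, step=0.05, iterations=5000):
    """Gradient ascent from beta = 0 (a stand-in for a package optimiser)."""
    beta = [0.0] * n_params
    for _ in range(iterations):
        grad = gradient(beta, observations)
        beta = [b + step * g for b, g in zip(beta, grad)]
    return beta

# Made-up records: ([[access distance in km] per station], index of chosen station).
observations = [
    ([[0.5], [2.0]], 0),
    ([[1.0], [3.0]], 0),
    ([[2.0], [1.0]], 1),
    ([[0.5], [2.5]], 1),
    ([[3.0], [1.0], [4.0]], 1),
    ([[1.5], [1.0]], 0),
]
beta_hat = estimate(observations, n_params=1)
print(beta_hat, log_likelihood(beta_hat, observations))
```

Most of these invented travellers chose the nearer station, so the estimated distance parameter is negative: utility falls as access distance increases.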
15. Logit models - substitution behaviour
Independence from irrelevant alternatives (IIA).
For each pair of alternatives, the ratio of their probabilities is not affected by adding or
removing another alternative, or changing the attributes of another alternative.
Consequence – proportional substitution pattern.
Stations are located in space.
Are a-spatial choice models appropriate?
Stations A, B and C available: $\dfrac{P(A)}{P(C)} = \dfrac{0.4}{0.2} = 2$
Station B closed: $\dfrac{P(A)}{P(C)} = \dfrac{0.66}{0.33} = 2$
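The proportional substitution pattern can be checked numerically. In a logit model, removing an alternative is equivalent to rescaling the remaining probabilities so they sum to one. The sketch uses the probabilities from the example (A = 0.4, B = 0.4, C = 0.2); the function name is illustrative.

```python
def remove_alternative(probabilities, removed):
    """Rescale logit probabilities after one alternative is withdrawn."""
    remaining = {alt: p for alt, p in probabilities.items() if alt != removed}
    total = sum(remaining.values())
    return {alt: p / total for alt, p in remaining.items()}

before = {"A": 0.4, "B": 0.4, "C": 0.2}
after = remove_alternative(before, "B")

ratio_before = before["A"] / before["C"]
ratio_after = after["A"] / after["C"]
print(after, ratio_before, ratio_after)
```

Station B's passengers are shared between A and C in proportion to their existing probabilities, regardless of where the stations are located relative to one another.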
16. Next steps
Obtain and prepare data:
Transport Scotland ≈ 23,000 responses
London Travel Demand Survey 2005/06 to 2012/13 –
but rail trips a minor component.
Carry out on-train survey?
Big-data: transport timetables
Descriptive analysis – observed catchments.
Develop and validate choice models.
Incorporate choice models into trip-end, flow models.
17. References
Debrezion, G., Pels, E. and Rietveld, P. (2007) “Choice of Departure Station by
Railway Users,” European Transport, 37, 78–92.
Blainey, S. P. and Preston, J. M. (2010) “Modelling Local Rail Demand in South Wales,”
Transportation Planning and Technology, 33, 55–73.
Blainey, S. and Evens, S. (2011) “Local Station Catchments: Reconciling Theory with
Reality.” In European Transport Conference.
Mahmoud, M. S., Eng, P. and Shalaby, A. (2014) “Park-and-Ride Access Station Choice
Model for Cross-Regional Commuter Trips in the Greater Toronto and Hamilton Area
(GTHA).” In Transportation Research Board 93rd Annual Meeting.
50K Raster [TIFF geospatial data], Ordnance Survey (GB), Using: EDINA Digimap
Ordnance Survey Service, <http://edina.ac.uk/digimap>, Downloaded: April 2015.
250K Raster [TIFF geospatial data], Ordnance Survey (GB), Using: EDINA Digimap
Ordnance Survey Service, <http://edina.ac.uk/digimap>, Downloaded: April 2015.
Editor's Notes
Introduce myself, what I’m doing etc.
What I’m going to talk about in this seminar
Trip rate – no account taken of catchment characteristics, level of service
As a result, not readily transferable, model needs to be calibrated using stations with very similar characteristics
Trip-end – introduces additional variables that relate to the catchment or origin station – train frequency, parking availability
Neither model takes into account destinations served by the stations.
Introduces destinations
Sometimes attributes of destinations (though often just dummy variable)
Separation – for example journey time
Entries/exits or flows at existing stations used to calibrate the model and estimate the parameters/coefficients, can then apply the models to new situations.
Need to define a catchment, so we know from what population base trips originate.
Two main ways.
First is a circular buffer around the station. For example 2km.
For example, the number of entries/exits is equal to some constant plus a weighting applied to the catchment population.
A more sophisticated way is to have two buffers, one up to 800m and the other 800m-2km, with the population in each weighted differently, recognising that a greater proportion of trips originate from closer to the station.
Second is to base the catchment on zones.
In this example, the zones are census output areas.
Assign each output area to a specific station, based in some way on distance, for example:
straight-line
road distance
travel time.
Therefore, deterministic and discrete catchments.
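A deterministic zone-based catchment can be sketched as a nearest-station assignment. This example assumes straight-line distance and uses hypothetical planar coordinates in km; road distance or travel time would need a network model instead.

```python
import math

def nearest_station(zone_centroid, stations):
    """Name of the station closest (straight line) to a zone centroid."""
    return min(stations, key=lambda name: math.dist(zone_centroid, stations[name]))

# Hypothetical station and zone centroid coordinates (km).
stations = {"Station A": (0.0, 0.0), "Station B": (3.0, 0.0)}
zones = {"zone 1": (0.5, 0.5), "zone 2": (1.4, 0.0), "zone 3": (2.0, 1.0)}

catchments = {zone: nearest_station(xy, stations) for zone, xy in zones.items()}
print(catchments)
```

Every zone goes wholly to one station, however marginal the difference in distance, which is why these catchments are both deterministic and discrete.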
The example here is the two stations serving Yeovil: Pen Mill in the city centre, and Yeovil Junction 2km out of the town. In this case virtually all the zones covering the town are assigned to Pen Mill, and very little population to Yeovil Junction.
Unlikely to be realistic or accurate.
So, research has looked into this, mostly based on data from origin-destination surveys, so you know where each traveller started their journey, and what station they used. Can compare that with the defined catchment – how many fall within the defined catchment for the station they used.
Also, research shows that catchments are not discrete, they overlap. For example, work in Toronto plotted observed catchments for stations in the Greater Toronto area – red dots the stations, polygons indicate extent of the catchments.
Within a zone, passengers do not choose the same station.
Stations compete for passengers, improved service at a station might abstract passengers from another station.
Access mode – if walking catchment much smaller than if driving. If getting there by public transport, catchment will reflect location of bus routes.
So all this indicates that station choice much more complex than the models allow for.
Suggests that introducing a probability-based station choice element to the models would be a good idea.
More accurate and transferable models.
We could do this using a discrete choice model.
In these models an individual is assumed to choose from a finite number of mutually exclusive alternatives, and to choose the alternative that maximises their utility: the one that maximises their satisfaction and minimises their dissatisfaction.
In the case of choosing between stations, examples of factors that may affect utility are ….
This is an example of the factors that may contribute to utility for someone from Yeovil city centre choosing a station.
Three potential stations.
Can see straight away that utility may depend on destination, as they are on different lines. If going to Weymouth can get a direct service from Pen Mill the closest station. Unlikely to choose the other two.
More complex if considering a trip to London.
Pen Mill has no direct train, would have to change
Yeovil Junction has 19 direct trains a day and cost is £52
Could drive up to Castle Cary – 8 trains a day and £86, but shortest journey time (better quality trains potentially).
So, look in more detail at discrete choice models.
Measured utility – includes each attribute and a parameter that weights its contribution to utility
If the model is sufficiently well specified, then the unobserved utility becomes “white noise” and the assumptions don’t matter.
Depending on assumptions about the unobserved utility, different statistical models can be used to calculate the probability that an individual will choose each alternative.
In a binary logit model, choice between two alternatives, the probability of an individual choosing an alternative is the exponential of its measured utility divided by the sum of the exponential of measured utility of both alternatives.
For more choices, use the multinomial model, in that case divide by the sum of the exponential of utility of all alternatives.
This results in a sigmoid probability curve, shown here. The curve flattens at either extreme, so a small change in utility has a much larger effect on the probability of an alternative being chosen in the middle of the curve than at the extremes.
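The shape of the curve can be illustrated with the binary logit written in terms of the utility difference between the two alternatives, which is algebraically the same as the formula on the slide. The utility values are arbitrary.

```python
import math

def binary_logit(utility_difference):
    """Probability of choosing alternative i, given U_i - U_j (binary logit)."""
    return 1.0 / (1.0 + math.exp(-utility_difference))

# The same one-unit gain in utility, from the middle and from the upper tail.
mid_change = binary_logit(1.0) - binary_logit(0.0)
tail_change = binary_logit(5.0) - binary_logit(4.0)
print(round(mid_change, 3), round(tail_change, 3))
```

The one-unit improvement shifts the choice probability far more when the two alternatives start out evenly matched than when one already dominates.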
Need to know ultimate origin and destination – otherwise don’t know access distance, and what stations might be considered alternatives.
Once the parameters of the utility function have been estimated, they can be used to predict choice in other situations, because the measured utility can be calculated.
For example, suppose the probability of someone at origin O choosing station A is 0.4, station B is 0.4, and station C is 0.2.
If Station B closes, the ratio of the probabilities of A and C remains the same.
If we assume A and B are perfect substitutes for one another (equidistant from the individual and with the same service), we would expect (b) NOT (c).
Replace the simplistic catchment definitions with choice models.