Urban Transportation Planning for the Region of Waterloo

Yongqi Dong, Asad Ullah Malik
Term Project – Spring 2019
CIVE 640 – Urban Transportation Planning
Submitted to: Prof. Liping Fu
Submission Date: 16th August 2019

Acronyms and Variables
Trip Purposes
HBW Home-Based Work
HBS Home-Based School
HBD Home-Based Discretionary
NHB Non-Home-Based
Variables for Linear Regression Analysis:
HHLDs Total number of households per zone
RE Retail sales and service employees
MCTE Manufacturing / Construction / Trades Employees
ZE Zonal Employees
ZEE Zonal Educational Employees
ZSP Zonal Student Population
FTE Full time employees
PTE Part time employees
ZP Zonal Population
ZRE Zonal Retail Employees
Variables used for Modal Split
AUTOTT Auto travel time (minutes)
TRANSITTT Transit travel time (minutes)
AUTOGC Auto generalized costs ($)
TRANSITGC Transit generalized costs ($)
LICENSE 1 if a person has a license, else 0
TRANSPASS 1 if a person has a transit pass, else 0
RMSE Root Mean Squared Error

Table of Contents
1. Introduction
2. Exploratory Data Analysis
2.1 Household Data
2.2 Personal Data
2.3 Trip Data
2.4 Geographical Analysis
3. Trip Generation
3.1 Introduction
3.2 Categorical models
3.2.1 HBW Household Categorical Model
3.2.2 HBS Household Categorical Model
3.2.3 HBD (Discretionary) Household Categorical Model
3.2.4 NHB (Non-Home-based) Household Categorical Model
3.2.5 Household Categorical Model TG Results
3.3 Linear Regression Models
3.3.1 Production Models
3.3.1.1 HBW Zonal Production Model
3.3.1.2 HBS Zonal Production Model
3.3.1.3 HBO Zonal Production Model
3.3.2 Attraction Models
3.3.2.1 HBW Zonal Attraction Model
3.3.2.2 HBS Zonal Attraction Model
3.3.2.3 HBO Zonal Attraction Model
3.3.2.4 NHB Zonal Attraction Model
3.3.3 Adjustment Factor
3.4 Models Comparison and Discussion
4. Trip Distribution
4.1 Introduction
4.2 HBW Trip Distribution Models
4.2.1 Power Deterrence Function
4.2.2 Exponential Deterrence Function
4.2.3 Comparison of the HBW Trip Distribution Models
4.3 HBS Trip Distribution Models
4.3.1 Exponential Deterrence Function
4.3.2 Combined Deterrence Function
5. Modal Split Analysis
5.1 Introduction
5.2 Aggregate Modal Split Models
5.3 Disaggregate Modal Split Models
5.4 Statistics of Estimated Trip Records
5.5 Modal Split Analysis for All trips
5.5.1 Multinomial logit models
5.5.2 Nested logit models
5.6 Modal Split Analysis for HBW trips
5.6.1 Multinomial logit models
5.6.2 Nested logit models
5.7 Modal Split Analysis for HBD trips
5.8 Modal Split Analysis for HBS trips
5.9 Check consistency with the reality
5.10 Comparison and discussion
6. Traffic Assignment
6.1 Introduction
6.2 Link and node performance based assignment methods
6.3 Link performance function
6.4 Relative gap for measuring
7. Conclusion and perspective
References

APPENDIX
A.1 Linear Regression Production Models
A.1.1 HBW Zonal Production Model
A.1.2 HBS Zonal Production Model
A.1.3 HBO Zonal Production Model
A.1.4 NHB Zonal Production Model
A.2 Linear Regression Attraction Models
A.2.1 HBW Zonal Attraction Model
A.2.2 HBS Zonal Attraction Model
A.2.3 HBO Zonal Attraction Model
A.2.4 NHB Zonal Attraction Model
Table A1. Correlation of coefficients for model HBW_UF2_Upgrade

1. Introduction
This report develops models that describe travel behavior and transportation demand
within the Region of Waterloo. We apply the traditional four-step urban transportation
demand analysis process. Data collected by the Transportation Tomorrow Survey (TTS)
for 2011 are used as base-year data to calibrate the models. The data are first
pre-processed (cleaning, removing missing and duplicate values, etc.), then a thorough
Exploratory Data Analysis (EDA) is carried out, followed by the step-by-step development
of models for trip generation, trip distribution, and modal split. Traffic assignment is
also discussed.
This report is organized into the following six sections:
Section 2 introduces the data used and gives a detailed EDA to better understand the data.
Household, personal, trip, population, and employment data were examined. This step
provides the basis for carrying out the four steps that follow.
Section 3 describes trip generation, the first step of the four-step model. Both
categorical and linear regression models are developed for each of the four trip
purposes: home-based work (HBW), home-based school (HBS), home-based discretionary
(HBD), and non-home-based (NHB). Trip attraction results were balanced with the trip
production results in preparation for the next step.
Section 4 covers trip distribution, the second step of the four-step method.
Trip distribution analysis aims to determine how trips produced by each zone are
distributed to other zones and to develop models that can be used to estimate this
distribution. Deterrence functions for home-based work (HBW) trips and home-based
school (HBS) trips were calibrated by minimizing the root mean squared error (RMSE).
Special care was taken to find the global minimum rather than a local one. Using the
gravity model, the origin-destination matrix for the base year was produced for the two
selected trip purposes.
Section 5 presents the modal split analysis. In this section, trips between a given origin
and destination pair are divided among four modes: auto, transit, walk, and bike. The
mode choice models for each type of trip are examined through the application of utility
theory. After comparing the benefits and limitations of multinomial logit and nested
logit models, HBW trips are analyzed using the nested logit model, while multinomial
logit models are chosen for the other trip purposes. The comparisons between different
models are also discussed.
Section 6 discusses the link performance and node performance-based traffic assignment
methods.
Section 7 concludes the report and offers an outlook, noting that data-driven machine
learning methods are increasingly being applied to traffic demand modeling.

2. Exploratory Data Analysis
In this report, we use data from the 2006 and 2011 Transportation Tomorrow Survey (TTS)
for the Region of Waterloo; the household travel survey data examined here are from the
2011 TTS. In this section, household, personal, and trip data were examined, and maps,
pie charts, and other charts were produced to summarize this information.
2.1 Household Data
Figure 2.1 Dwell type
[Figure 2.1: pie chart "Dwell Type". House: 136325 (73%); Apartment: 33861 (18%);
Townhouse: 15774 (9%); Other: 311 (0%)]
Figure 2.1 shows the type of dwellings in the Region of Waterloo. It is clear that the
majority of residents live in a house while smaller fractions live in an apartment or
townhouse. A high number of houses, as opposed to apartments or townhouses, could
indicate that large suburbs are present within the Region, further indicating that a high
number of trips could be generated by these suburbs.
Figure 2.2 Number of dwellings per district
[Figure 2.2: pie chart "Number of Dwellings per district". Waterloo: 36756 (20%);
Kitchener: 82949 (44%); Cambridge: 45102 (24%); North Dumfries: 3271 (2%);
Wilmot: 6859 (4%); Wellesley: 3284 (2%); Woolwich: 8049 (4%)]
Figure 2.2 shows the distribution of households across the districts within the Region
of Waterloo. From this, it can be concluded that Waterloo, Kitchener, and Cambridge are
the three primary urban centers due to their high number of households, while North
Dumfries, Wilmot, Wellesley, and Woolwich are more rural districts. Because they have
larger populations, the three urban centers are expected to generate more trips than the
four rural districts.
Figure 2.3 Number of vehicles per household
[Figure 2.3: pie chart "Number of vehicles per household". 0 vehicles: 14318 (8%);
1 vehicle: 66330 (35%); 2 vehicles: 83173 (45%); 3+ vehicles: 22449 (12%)]
Figure 2.3 shows the number of vehicles per household for the Region, with the vast
majority of households having either 1 or 2 vehicles and small segments having either
3 or more vehicles or no vehicle at all. It is expected that the majority of trips will be
made using a personal vehicle due to the high rate of vehicle ownership among households.
Figure 2.4 Number of households with at least one person employed/studying
[Figure 2.4: pie chart "Employment (work or study) per household". Full Time: 130295 (51%);
Part Time: 43483 (17%); Home: 17090 (7%); Student: 62967 (25%)]
Figure 2.4 shows the employment status characteristics for households within the Region.
Roughly half of all households have at least one member working full-time, one quarter
have at least one member working part-time or at home, and one quarter have at least one
student. The high student proportion indicates that more walking, bicycle, or transit
trips may be made due to the lower income of students, while the majority full-time
segment indicates the potential for more personal vehicle trips due to higher income.

Figure 2.5 Trip frequency per day for households
[Figure 2.5: bar chart "Numbers of Household with different Trips per day";
x-axis: Number of Trips (0 to 15+); y-axis: Number of Households]
Figure 2.5 shows the trip frequency per day for households within the Region. The vast
majority of households tend to make 3 to 4 trips per day, with few making more than 15
trips per day or no trips at all. This indicates a significant amount of trip chaining.
2.2 Personal Data
Personal data relates to a living individual who is or can be identified either from the data
or from the data in conjunction with other information that is in, or is likely to come into,
the possession of the data controller.
Figure 2.6 Transit pass ownership and employment status
[Figure 2.6: bar chart; x-axis: Employment Status; y-axis: Transit Pass Ownership;
series: Owns a pass, Does not own a pass]
Figure 2.6 shows the relationship between transit pass ownership and employment status
for residents within the Region. This chart shows that the majority of residents who do not
own a transit pass are those who work full time, indicating that residents with constant
income may prefer a personal vehicle as their primary mode. The second largest group of
those who do not own a transit pass are underage residents, which could imply a
preference by parents to have their children ride along with them in a personal vehicle.
Another interesting note is that the largest group of transit pass owners are those who
are unemployed, indicating that a lack of income can mean fewer personal vehicle trips
and more transit trips.
Figure 2.7 Transit pass ownership and occupation
[Figure 2.7: bar chart; x-axis: Occupation; y-axis: Transit Pass Ownership;
series: Owns a Pass, Does Not Own a Pass]
Figure 2.7 shows the relationship between transit pass ownership and occupation for
residents within the Region. From this plot, it is evident that those with higher income
(assumed to be those under the “professional/management/technical” category) make up
the largest group of residents that do not own a transit pass. This could lead to the
conclusion that higher-income residents show a stronger affinity for a personal vehicle
as their mode of travel than lower earners (assumed to be those under the
“manufacturing/construction/trades” category). As in the previous plot, the unemployed
make up the largest group of transit pass owners.
Figure 2.8 Person number within the household
[Figure 2.8: pie chart; categories: 1, 2, 3, 4, 5, 6 persons per household]
Figure 2.8 shows the distribution of the number of persons per household. The majority
of households have 1 to 3 persons.
2.3 Trip Data
Figure 2.9 Trip start time during the day
Figure 2.9 shows the trip start time during the day. Two peaks are clearly visible in the
plot, representing the morning and evening rush hours, respectively. The following plots
show the distribution of trip start times separately for each trip purpose.
Figure 2.10 Start time distribution of trips with different purposes during the day
(Subfigures: (a) home-based work, (b) home-based school, (c) home-based discretionary,
(d) non-home-based, and (e) all four trip types plotted together)
Figure 2.10 visualizes the start time distribution of trips by purpose. These plots show
clearly that trips with different purposes follow different temporal patterns, a
distinction of vital importance. Home-based work and home-based school trips show
similar patterns, with two peaks in the morning (around 8:00) and the afternoon (around
15:30 to 16:00), respectively. However, quite a few home-based trips are also scattered
throughout the day between 6:00 and 23:30.

Figure 2.11 Trip distance distribution
Figure 2.11 shows the trip distance distribution. The distribution is clearly long-tailed,
with the majority of trips being short trips of no more than 20 km.
Figure 2.12 The distribution of the number of trips made by travelers
Figure 2.12 shows the distribution of the number of trips made by travelers. Most
travelers in the survey data made fewer than 5 trips.
2.4 Geographical Analysis
The previous subsections used graphs and charts to explore the household survey data
from the 2011 Transportation Tomorrow Survey (TTS). This subsection further explores
the population and employment data by Traffic Analysis Zone (TAZ). For this purpose we
used QGIS, a free and open-source cross-platform desktop geographic information system
application that supports viewing, editing, and analysis of geospatial data.
The Region of Waterloo is divided into 7 subdivisions, of which 3 are cities and 4 are
townships. The townships mainly comprise rural towns surrounded predominantly by
farms. As shown in Figure 2.13, Waterloo, Kitchener, and Cambridge are the 3 cities, and
Wellesley, Woolwich, Wilmot, and North Dumfries are the 4 townships.
Figure 2.13 Map of Region of Waterloo, showing 3 major cities and 4 townships.
Image Source: Elizabeth Siddorn, Information Technology Services, Region of Waterloo

Figure 2.14 Graduated representation of Population data for Waterloo Region for
the year 2011 generated by QGIS
Figure 2.14 shows that people tend to live in the suburbs of the cities, and Figure 2.15
shows that people work in the central business districts (CBDs) of the cities. This leads
to a pattern of daily trips from these highly populated suburbs to the central areas of the
cities, e.g., downtown Kitchener.

Figure 2.15 Graduated representation of Employment data for Region of Waterloo
for the year 2011, generated by QGIS
Figure 2.15 represents the distribution of employment in the Region of Waterloo. Darker
shades represent higher employment, and white areas represent minimal employment
opportunities. It can also be observed that the Traffic Analysis Zones (TAZs) are smaller
near the central areas of the 3 major cities, i.e., Waterloo, Kitchener, and Cambridge.
For example, the TAZs near Uptown Waterloo are small and tightly packed in comparison
to TAZs in the rural parts of the Region of Waterloo, e.g., near Elmira in Woolwich
Township. Sizing TAZs differently in this way makes the densely populated areas of the
cities easier to manage. Moreover, this differential sizing of TAZs also helps in the
planning and design of a transit network.

The brown-shaded zones in Figure 2.16 are zones from which no trips were produced,
whereas the highest numbers of trips were produced from the zones shaded in black.
The University of Waterloo and Downtown Kitchener are among the zones shaded in black,
which is expected because the former is one of the largest employers in the region and
the latter is the hub of commercial activity.
Figure 2.16 Graduated representation of Trip Production data for Region of Waterloo
for the year 2011, generated by QGIS

Figure 2.17 Graduated representation of Trip Attraction data for Region of
Waterloo for the year 2011, generated by QGIS
The town of Elmira is covered by one large zone, which creates anomalies in the data: that single
zone stands out in every measure, be it population, number of households, or trips produced or
attracted. We suggest that this Traffic Analysis Zone be subdivided in future surveys. As with
trip production, the neighbourhood of Williamsburg, Downtown Kitchener, and the University of
Waterloo attracted the majority of the trips. In Figure 2.17, trip attraction in other areas is
relatively sparse. The grey zones attract no trips, which means that these are either residential
areas or not touched at all.

Figure 2.18 Graduated representation of Households per zone for Region of
Waterloo for the year 2011, generated by QGIS
The number of households per zone in Figure 2.18 shows a pattern similar to the zonal
population in Figure 2.14. Elmira (a town on the outskirts of Waterloo) and Williamsburg
(a residential suburb of Kitchener) had the highest numbers of households in 2011. The
reason could be lower residential land cost compared with land prices near the central
urban areas. In Uptown Waterloo and Downtown Kitchener, the number of households falls
below 20 per zone, with some zones having zero households.

In Figure 2.18, the green dots (zonal centroids) represent the origins of the trips made
to the University of Waterloo. It can be seen that the majority of trips made to the
University of Waterloo were from TAZs in Waterloo and Kitchener, followed by TAZs in
Cambridge. A limited number of trips to the University of Waterloo were made from the
other 4 townships. The trips included all trip types, i.e., HBS, HBW, HBD, and NHB. This
map of origins and destination was generated to emphasize the trip attraction power of
TAZs where large employers such as the University of Waterloo are located.
Figure 2.18 TTS 2011 Trip data overlaid on the TAZ data to show the trips made
from various areas in the Region of Waterloo to University of Waterloo.

Figure 2.19 TTS 2011 Trip data overlaid on the TAZ data to show the trips made
from various areas in the Region of Waterloo to Conestoga Mall, Waterloo.
3. Trip Generation
3.1 Introduction
The purpose of this section is to present trip generation, the first step of the
four-step travel demand analysis. The purpose of trip generation analysis is to develop
trip generation expressions from survey data that may be used to convert estimates of
horizon year development patterns into zonal productions and attractions for each trip
purpose of interest [3].
Trip Generation is the general term for Trip Production and Trip Attraction. Trip
Production is defined as the home end for home-based trips and as the trip origin for
non-home-based trips; Trip Attraction is defined as the non-home end for home-based
trips and as the trip destination for non-home-based trips. In this section, two kinds of
analysis methods, i.e., the Categorical model (Cross-Classification) and the Linear
Regression model, are used to model trip generation.

Using the household, personal, trip, population and employment data summarized in the
first report, several models were developed to create trip production and attraction vectors.
Trips were first divided according to one of four purposes: home-based-work (HBW),
home-based-school (HBS), home-based-discretionary (HBD), and non-home-based
(NHB).
Table 3.1 The numbers of different type of trips
Trip Type Sum of TRIP_NUM Percentage
HBW 18063 28.14%
HBS 5679 8.85%
HBD 28308 44.11%
NHB 12133 18.90%
Total 64183
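The percentage column of Table 3.1 can be reproduced directly from the trip counts. The following is a minimal sketch; the variable names are illustrative and not taken from the report's own workflow.

```python
# Trip counts by purpose, copied from Table 3.1.
trips = {"HBW": 18063, "HBS": 5679, "HBD": 28308, "NHB": 12133}

total = sum(trips.values())

# Share of each trip purpose as a percentage of all surveyed trips.
shares = {purpose: 100.0 * n / total for purpose, n in trips.items()}

for purpose, pct in shares.items():
    print(f"{purpose}: {pct:.2f}%")
print("Total:", total)
```

The counts sum to the stated total of 64183, and the rounded shares agree with the percentages listed in the table.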
Figure 3.1 The percentages of different type of trips
For trip production, trips were either aggregated on a zonal basis or were left disaggregated
on a household basis. At the zonal level, a regression model was developed for each type
of trip (4 models).
3.2 Categorical models
The categorical model was developed at the household level, for each type of the trips (4
models).
For trip attraction, trips were aggregated on a zonal basis and a regression model was
developed for each type of trip (4 models). This yielded a total of 8 regression models and
4 categorical models for a grand total of 12 models. It must be noted that expansion factors
were applied for zonal-level models but not household-level models. Table 3.1 summarizes
the trip generation models developed.
Table 3.2 The Correlation coefficients between estimated household variables
N_Persons N_Vehicles N_Licences FT Emp PT Emp Home Emp N_Students
N_Persons 1.000 0.118 0.581 0.475 0.279 0.165 0.724
N_Vehicles 0.118 1.000 0.216 0.160 0.068 0.022 0.055
N_Licences 0.581 0.216 1.000 0.509 0.323 0.093 0.307
FT Emp 0.475 0.160 0.509 1.000 0.001 0.181 0.236
PT Emp 0.279 0.068 0.323 0.001 1.000 0.167 0.267
Home Emp 0.165 0.022 0.093 0.181 0.167 1.000 0.098
N_Students 0.724 0.055 0.307 0.236 0.267 0.098 1.000
FT Emp = Number of full-time workers in the household
PT Emp = Number of part-time workers in the household
Home Emp = Number of persons who work full or part-time at home in the household
Table 3.2 shows the correlation matrix, with highlighted cells representing the variable
pairs excluded from consideration. As shown in Table 3.2, seven independent household
variables were considered for use in developing a categorical model for each trip type.
Another variable, “Dwell Type”, was excluded because it can provide only three bins.
Several independent variable pairs were excluded because the two variables were found to
be correlated; the correlation matrix in Table 3.2 was developed to identify such pairs,
with a value of 0.4 or greater considered to be a strong correlation. In the
cross-classification model, only two independent variables are chosen for each model.
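The 0.4 screening rule can be applied mechanically to Table 3.2. The sketch below is illustrative only: the matrix values are copied from the table, while the variable names and the list-based layout are assumptions rather than the report's own Excel/Matlab workflow.

```python
# Variable order and coefficients copied from Table 3.2.
names = ["N_Persons", "N_Vehicles", "N_Licences", "FT Emp",
         "PT Emp", "Home Emp", "N_Students"]
corr = [
    [1.000, 0.118, 0.581, 0.475, 0.279, 0.165, 0.724],
    [0.118, 1.000, 0.216, 0.160, 0.068, 0.022, 0.055],
    [0.581, 0.216, 1.000, 0.509, 0.323, 0.093, 0.307],
    [0.475, 0.160, 0.509, 1.000, 0.001, 0.181, 0.236],
    [0.279, 0.068, 0.323, 0.001, 1.000, 0.167, 0.267],
    [0.165, 0.022, 0.093, 0.181, 0.167, 1.000, 0.098],
    [0.724, 0.055, 0.307, 0.236, 0.267, 0.098, 1.000],
]

THRESHOLD = 0.4  # "strong correlation" cut-off stated in the text

# Scan the upper triangle so that each pair is reported once.
strong_pairs = [
    (names[i], names[j], corr[i][j])
    for i in range(len(names))
    for j in range(i + 1, len(names))
    if abs(corr[i][j]) >= THRESHOLD
]

for a, b, r in strong_pairs:
    print(f"{a} / {b}: r = {r:.3f}")
```

With the values in Table 3.2, four pairs meet the threshold: N_Persons with N_Licences, FT Emp, and N_Students, and N_Licences with FT Emp.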
In developing the categorical models, all possible variable pairs are first explored with
exploratory plots and summary tables, using Excel pivot tables and MATLAB. The optimal
variable pair is then selected, defined as the pair that has a sufficient number of
entries in each cell and an adequate number of bins (preferably more than 3 but no more
than 6), and that yields the lowest standard deviation among all candidate pairs. In
addition, different categories should show different trip-making rates, while
trip-making rates within each category should be homogeneous.
3.2.1 HBW Household Categorical Model
For home-based work trips, N_Persons (Number of Persons Per Household) and
N_Vehicles (Number of Vehicles Per Household) are selected as the two independent
variables. Tables 3.3 to 3.5 show the results of the cross-classification analysis using
N_Persons and N_Vehicles, and Figure 3.2 shows the plot generated using the tabulated
information.

Table 3.3 HBW Cross-Classification Analysis (No. of trips)
No. of Trips (HBW)
Number of Vehicles
0 1 2 3 4+
Number of Persons
1 85 754 99 30 8
2 113 1361 3378 376 124
3 64 615 2071 946 171
4 31 646 2324 694 404
5 41 210 765 251 157
6+ 8 87 317 176 160
Table 3.4 HBW Cross-Classification Analysis (No. of households)
No. of households
Number of Vehicles
0 1 2 3 4+
Number of Persons
1 48 430 55 18 5
2 47 562 1247 144 45
3 26 267 736 259 52
4 13 256 867 196 87
5 12 85 280 71 34
6+ 4 32 109 47 33
Table 3.5 HBW Cross-Classification Analysis (No. of trips per households)
Trips per households
Number of Vehicles
0 1 2 3 4+
Number of Persons
1 1.77 1.75 1.80 1.67 1.60
2 2.40 2.42 2.71 2.61 2.76
3 2.46 2.30 2.81 3.65 3.29
4 2.38 2.52 2.68 3.54 4.64
5 3.42 2.47 2.73 3.54 4.62
6+ 2.00 2.72 2.91 3.74 4.85
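Each rate in Table 3.5 is the corresponding cell of Table 3.3 divided by the same cell of Table 3.4. The sketch below reproduces that calculation and also flags thinly populated cells. The cut-off of 30 households is an assumption chosen for illustration; the report asks for a sufficient number of entries per cell but does not state a specific value.

```python
# Row labels: persons per household; column labels: vehicles per household.
persons = ["1", "2", "3", "4", "5", "6+"]
vehicles = ["0", "1", "2", "3", "4+"]

# HBW trips per cell, copied from Table 3.3.
trips = [
    [85, 754, 99, 30, 8],
    [113, 1361, 3378, 376, 124],
    [64, 615, 2071, 946, 171],
    [31, 646, 2324, 694, 404],
    [41, 210, 765, 251, 157],
    [8, 87, 317, 176, 160],
]

# Households per cell, copied from Table 3.4.
households = [
    [48, 430, 55, 18, 5],
    [47, 562, 1247, 144, 45],
    [26, 267, 736, 259, 52],
    [13, 256, 867, 196, 87],
    [12, 85, 280, 71, 34],
    [4, 32, 109, 47, 33],
]

MIN_HOUSEHOLDS = 30  # assumed reliability cut-off, not a value from the report

# Trip rate per cell = trips / households.
rates = [[t / h for t, h in zip(t_row, h_row)]
         for t_row, h_row in zip(trips, households)]

# Cells whose rate rests on few surveyed households.
sparse = [(persons[i], vehicles[j], households[i][j])
          for i in range(len(persons))
          for j in range(len(vehicles))
          if households[i][j] < MIN_HOUSEHOLDS]

for p, row in zip(persons, rates):
    print(p, " ".join(f"{r:.2f}" for r in row))
print("Sparse cells:", sparse)
```

Under this assumed cut-off, the sparse cells are concentrated in the zero-vehicle column and in the one-person, multi-vehicle corner, so the rates there should be read with more caution than those in the well-populated cells.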

Figure 3.2 The distribution of the number of trips (HBW) made by travelers
Figure 3.2 plots the trip-making rates generated using the tabulated information.
3.2.2 HBS Household Categorical Model
For home-based school trips, N_Students (Number of Students Per Household) and
N_Vehicles (Number of Vehicles Per Household) are selected as the two independent
variables. Tables 3.6 to 3.8 show the results of the cross-classification analysis using
N_Students and N_Vehicles, and Figure 3.3 shows the plot generated using the tabulated
information.
Table 3.6 HBS Cross-Classification Analysis (No. of trips)
No. of Trips (HBS)
Number of Students
0 1 2 3 4 5+
Number of Vehicles
0 0 77 67 52 36 13
1 0 379 487 326 84 77
2 0 676 1264 608 195 93
3 0 220 302 121 68 26
4+ 0 102 122 14 33 6

Table 3.7 HBS Cross-Classification Analysis (No. of households)
No. of households
Number of Students
0 1 2 3 4 5+
Number of Vehicles
0 0 42 25 17 8 2
1 0 197 173 82 17 11
2 0 357 435 152 44 15
3 0 115 101 34 12 5
4+ 0 56 42 4 5 1
Table 3.8 HBS Cross-Classification Analysis (No. of trips per households)
Trips per households (HBS)
Number of Students
0 1 2 3 4 5+
Number of Vehicles
0 0 1.83 2.68 3.06 4.50 6.50
1 0 1.92 2.82 3.98 4.94 7.00
2 0 1.89 2.91 4.00 4.43 6.20
3 0 1.91 2.99 3.56 5.67 5.20
4+ 0 1.82 2.90 3.50 6.60 6.00
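Once calibrated, a cross-classification table is typically applied by multiplying each cell's rate by the number of households in that category and summing. The sketch below shows this for the HBS rates of Table 3.8. The zone and its household counts are entirely hypothetical, and the helper `rate` is an illustrative name; the report does not describe its own application procedure in this section.

```python
# Student-count bins with non-zero rates, in the column order of Table 3.8.
student_bins = ["1", "2", "3", "4", "5+"]

# HBS trips per household, keyed by vehicles per household (Table 3.8).
hbs_rates = {
    "0":  [1.83, 2.68, 3.06, 4.50, 6.50],
    "1":  [1.92, 2.82, 3.98, 4.94, 7.00],
    "2":  [1.89, 2.91, 4.00, 4.43, 6.20],
    "3":  [1.91, 2.99, 3.56, 5.67, 5.20],
    "4+": [1.82, 2.90, 3.50, 6.60, 6.00],
}


def rate(vehicles, students):
    """Look up the HBS trip rate; households without students make no HBS trips."""
    if students == "0":
        return 0.0
    return hbs_rates[vehicles][student_bins.index(students)]


# HYPOTHETICAL zone: households counted by (vehicles, students) category.
zone_households = {
    ("1", "1"): 100,
    ("2", "2"): 50,
    ("0", "3"): 10,
    ("2", "0"): 200,
}

# Estimated HBS productions = sum over categories of households x rate.
zone_trips = sum(n * rate(v, s) for (v, s), n in zone_households.items())
print(f"Estimated HBS trips for the hypothetical zone: {zone_trips:.1f}")
```

Because the zero-student column of Table 3.8 is zero throughout, the 200 student-free households in the example contribute nothing to the zone's HBS productions.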
Figure 3.3 The distribution of the number of trips (HBS) made by travelers

3.2.3 HBD (Discretionary) Household Categorical Model
For home-based discretionary trips, N_Persons (Number of Persons Per Household) and
N_Vehicles (Number of Vehicles Per Household) were used as the independent variables.
Tables 3.9 to 3.11 show the results of the cross-classification analysis on N_Persons and
N_Vehicles, and Figure 3.4 shows the corresponding plot.
Table 3.9 HBD Cross-Classification Analysis (No. of trips)
No. of Trips (HBD)
Number of Vehicles
0 1 2 3 4+
Number of Persons
1 413 1860 165 35 18
2 187 4719 5224 517 152
3 98 1136 2448 791 170
4 44 1060 3383 782 291
5 35 442 1303 277 173
6+ 32 213 597 305 182
Table 3.10 HBD Cross-Classification Analysis (No. of households)
No. of households
Number of Vehicles
0 1 2 3 4+
Number of Persons
1 196 814 75 13 6
2 70 1228 1441 145 44
3 32 288 633 209 48
4 16 235 749 174 72
5 11 92 255 64 29
6+ 7 40 106 44 27
Table 3.11 HBD Cross-Classification Analysis (No. of trips per households)
Trips per households (HBD)
Number of Vehicles
0 1 2 3 4+
Number of Persons
1 2.11 2.29 2.20 2.69 3.00
2 2.67 3.84 3.63 3.57 3.45
3 3.06 3.94 3.87 3.78 3.54
4 2.75 4.51 4.52 4.49 4.04
5 3.18 4.80 5.11 4.33 5.97
6+ 4.57 5.33 5.63 6.93 6.74

Figure 3.4 The distribution of the number of trips (HBD) made by travelers
Figure 3.4 shows a plot of the trip-making rates generated from the tabulated information.
3.2.4 NHB (Non-Home-based) Household Categorical Model
For NHB (Non-Home-Based) trips, N_Persons (Number of Persons Per Household) and
N_Vehicles (Number of Vehicles Per Household) were used as the independent variables.
Tables 3.12 to 3.14 show the results of the cross-classification analysis on N_Persons and
N_Vehicles, and Figure 3.5 shows the corresponding plot.
Table 3.12 NHB Cross-Classification Analysis (No. of trips)
No. of Trips (NHB)
Number of Vehicles
0 1 2 3 4+
Number of Persons
1 94 794 103 21 5
2 48 1406 2207 260 70
3 19 516 1119 331 79
4 8 419 1571 359 138
5 11 171 533 100 69
6+ 9 71 191 118 47

Table 3.13 NHB Cross-Classification Analysis (No. of households)
No. of households
Number of Vehicles
0 1 2 3 4+
Number of Persons
1 62 443 47 11 3
2 30 584 850 103 27
3 13 183 423 132 33
4 7 158 540 123 58
5 5 58 180 42 23
6+ 3 27 61 27 19
Table 3.14 NHB Cross-Classification Analysis (No. of trips per households)
Trips per households(NHB)
Number of Vehicles
0 1 2 3 4+
Number of Persons
1 1.52 1.79 2.19 1.91 1.67
2 1.60 2.41 2.60 2.52 2.59
3 1.46 2.82 2.65 2.51 2.39
4 1.14 2.65 2.91 2.92 2.38
5 2.20 2.95 2.96 2.38 3.00
6+ 3.00 2.63 3.13 4.37 2.47
Figure 3.5 The distribution of the number of trips (NHB) made by travelers
Figure 3.5 shows a plot of the trip-making rates of NHB trips per household generated from
the tabulated information.
3.2.5 Household Categorical Model TG Results
Once the trip generation rates are established using the base year’s data, the trips generated
by a traffic analysis zone i, i.e., TGi, can be calculated by multiplying the expected number
of households in each household category by the appropriate rate:
$TG_i = \sum_n r_n H_{i,n}$
where:
r_n = the average number of trips per household made by the n-th category of households;
H_{i,n} = the number of households of the n-th category in zone i.
This completes the trip generation analysis using the categorical model.
3.3 Linear Regression Models
The linear regression models were developed at the zonal level for each trip purpose,
i.e., HBW, HBS, HBO and NHB. Various independent variables, such as the zonal student
population and the total number of retail employees in a zone, were examined and compared
with each other to check which independent variable, or combination of variables, yielded
the best results, i.e., the highest R² value of the model.
When a variable is considered for a production model, trips with the production zone
matching that variable are counted, and likewise for attraction models. For example, when
the number of employees is considered as a variable for an attraction model, the number of
employees working in each zone is considered; whereas when the number of employees is
considered as a variable for a production model, the number of employees living in each
zone is considered.
3.3.1 Production Models
Refer to Appendix A.1 for the best-fit regression graphs and the tabular regression analysis
results for production models.
3.3.1.1 HBW Zonal Production Model
First, the data were segregated and arranged with the PivotTable command in MS Excel,
after which each variable was modelled individually and its R² value was recorded.
To create a better fit to the HBW trip data, the individual independent variables were then
grouped together and regressed. Linear regression analysis generated the following model for
HBW production, with the highest R² value, i.e., 0.967:
𝐻𝐵𝑊 𝑃𝑟𝑜𝑑 = 0.4931 ∗ 𝐹𝑇𝐸 + 0.4476 ∗ 𝑃𝑇𝐸 + 0.0967 ∗ 𝐻𝐻𝐿𝐷𝑆 − 0.0001 ∗ 𝑍𝑃 − 0.434
Where,
FTE Full time employees (independent variable : n_emp_ft)
PTE Part time employees (independent variable : n_emp_pt)
HHLDS Number of Households (independent variable : HHLD NO.)
ZP Zonal Population (Independent variable : _Total Population)
It should be noted that the partial regression coefficient of zonal population is almost
zero, which minimizes its contribution. However, including this variable in the model
does increase its R² value.
3.3.1.2 HBS Zonal Production Model
As logic dictates, the number of home-based school trips (with the destination being an
educational institution) produced by a zone depends on the population of students
living in that zone. This is supported by the high R² value of 0.9324 for the following
model:
𝐻𝐵𝑆 𝑃𝑟𝑜𝑑𝑢𝑐𝑡𝑖𝑜𝑛 𝑇𝑟𝑖𝑝𝑠 = 0.5202 ∗ 𝑍𝑆𝑃 + 0.3802
Where,
ZSP Zonal Student Population (independent variable : n_student)
0.5202 Partial regression coefficient
0.3802 Intercept
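A single-variable zonal model of this kind can be fitted by ordinary least squares. The following is a minimal sketch of that calculation together with R²; the zonal values are hypothetical and the function names are illustrative, so this is not the spreadsheet procedure used in the report.

```python
def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(x, y, slope, intercept):
    """Share of the variance of y explained by the fitted line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical zonal student populations and observed HBS productions.
zsp = [10, 25, 40, 60, 80]
hbs_trips = [6, 13, 22, 31, 42]

slope, intercept = fit_line(zsp, hbs_trips)
print(slope, intercept, r_squared(zsp, hbs_trips, slope, intercept))
```

Models with several independent variables, such as the HBW production model, need a multiple regression solver instead of this closed form.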
3.3.1.3 HBO Zonal Production Model
Due to the diverse nature of Home Based Other trips, many factors can affect trip
production; however, the total number of households per zone explains the trend best,
assuming that all HBO trips have the household as the basic origin terminal.
The regression results are as follows (R² = 0.9229):
𝐻𝐵𝑂 𝑃𝑟𝑜𝑑𝑢𝑐𝑡𝑖𝑜𝑛 𝑇𝑟𝑖𝑝𝑠 = 1.3004 ∗ 𝐻𝐻𝐿𝐷𝑠 − 0.22167
Where,
HHLDs Total number of households per zone
-0.22167 Intercept
3.3.1.4 NHB Zonal Production Model
Many iterations were made for this particular trip purpose. Educational employees, the
number of students per zone, marketing employees, tradesmen and zonal population were
regressed one by one to find the best model. However, a combination of Manufacturing /
Construction / Trades Employees and Retail Employees gave the highest R² value, i.e.,
0.7234.
𝑁𝐻𝐵 𝑃𝑟𝑜𝑑𝑢𝑐𝑡𝑖𝑜𝑛 𝑇𝑟𝑖𝑝𝑠 = 0.0531 ∗ 𝑅𝐸 + 0.5229 ∗ 𝑀𝐶𝑇𝐸 + 5.0742
Where,
RE Retail sales and service employees (independent variable : Occupation = S)
MCTE Manufacturing / Construction / Trades Employees (independent variable :
Occupation = M)
This is also logical: tradesmen such as plumbers have to visit many sites in a single day
to fix issues, unlike other professions that work daily at a single fixed workplace. Retail
employees also visit various retail stores or warehouses, which gives rise to
non-home-based trips.
3.3.2 Attraction Models
Refer to Appendix A.2 for the best-fit regression graphs and the tabular regression analysis
results for attraction models.
3.3.2.1 HBW Zonal Attraction Model
HBW trip attraction correlates with the total number of employees working in a
particular zone; these employees do not necessarily live in that zone (R² = 0.7339).
𝐻𝐵𝑊 𝐴𝑡𝑡𝑟𝑎𝑐𝑡𝑖𝑜𝑛 𝑇𝑟𝑖𝑝𝑠 = 0.0212 ∗ ZE + 1.3623
Where,
ZE Zonal Employees (independent variable : Total_Emp)
1.3623 Intercept
Full-time employees and work-at-home employees were also regressed for this model, but
the analysis yielded models with very low explanatory power.
3.3.2.2 HBS Zonal Attraction Model
Initially, the model had one independent variable, i.e., the education sector employees who
work in that zone, with R² = 0.4657. However, when the total number of employees and the
student population per zone were later included in the model, the R² value increased by a
small margin to 0.4676.
𝐻𝐵𝑆 𝐴𝑡𝑡𝑟𝑎𝑐𝑡𝑖𝑜𝑛 𝑇𝑟𝑖𝑝𝑠 = −0.0360 ∗ 𝑍𝐸 + 0.0542 ∗ 𝑍𝐸𝐸 + 0.1272 ∗ 𝑍𝑆𝑃 + 9.0428

Where,
ZE Zonal Employees (independent variable : Total_Emp)
ZEE Zonal Educational Employees (independent variable : edu)
ZSP Zonal Student Population (independent variable : n_student)
As mentioned earlier the regression graphs and regression analysis results are listed in the
Appendix A.2.
3.3.2.3 HBO Zonal Attraction Model
Unlike HBO Production trips, the HBO Attraction trips corresponded with the zonal retail
employees.
Table 3.15 Linear regression analysis results comparison
Independent Variable R2
Home Based Employees 0.0134
Part Time Employees 0.049
Retail Employees 0.5018
𝐻𝐵𝑂 𝐴𝑡𝑡𝑟𝑎𝑐𝑡𝑖𝑜𝑛 𝑡𝑟𝑖𝑝𝑠 = 0.1392 ∗ 𝑍𝑅𝐸 + 16.49
Where,
ZRE Zonal Retail Employees (Independent variable : ret)
0.1392 Partial coefficient for ZRE
16.49 Intercept
3.3.2.4 NHB Zonal Attraction Model
The first model was regressed with Retail Employees and Manufacturing / Construction /
Trades Employees as the independent variables, with an explanatory power of 0.6838.
After the addition of Zonal Educational Employees, the explanatory power increased
slightly to 0.6853. Therefore, the following model was chosen.
𝑁𝐻𝐵 𝐴𝑡𝑡𝑟𝑎𝑐𝑡𝑖𝑜𝑛 𝑡𝑟𝑖𝑝𝑠 = 0.0751 ∗ 𝑍𝑅𝐸 + 0.3891 ∗ 𝑀𝐶𝑇𝐸 + 0.0052 ∗ 𝑍𝐸𝐸 + 5.8996
Where,
ZRE Zonal Retail Employees (independent variable : ret)
MCTE Manufacturing / Construction / Trades Employees (independent variable :
Occupation = M)
ZEE Zonal Educational Employees (independent variable : edu)

3.3.3 Adjustment Factor
Since the trip attraction and trip production models were established independently of each
other, the total trips produced may not equal the total trips attracted. Trip production
models were chosen as the adjustment base due to their higher explanatory power. The
adjusted trip attractions were calculated using the following formula:
$TA_j^{new} = TA_j^{old} \cdot \frac{\sum_i TP_i}{\sum_j TA_j^{old}}$
Where,
TA_j^new New Trip Attraction for Zone j
TA_j^old Old Trip Attraction for Zone j
TP_i Trip Production for zone i
The ratio of total estimated production to total estimated attraction is the adjustment
factor listed in Table 3.16.
Table 3.16 Adjustment factors for different trip purposes
Trip Purpose   Total Estimated Production   Total Estimated Attraction   Adjustment Factor
HBW            6856                         6743                         1.01675812
HBS            2765                         2727                         1.013934727
HBD            13415                        13268                        1.011079289
NHB            9686                         9686                         1
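The factors in Table 3.16 follow directly from the totals, as the short sketch below shows. The per-zone attraction values in the second part are hypothetical and only illustrate how a factor is applied to individual zones.

```python
# Total estimated productions and attractions from Table 3.16.
totals = {
    "HBW": (6856, 6743),
    "HBS": (2765, 2727),
    "HBD": (13415, 13268),
    "NHB": (9686, 9686),
}


def adjustment_factor(total_production, total_attraction):
    """Factor that scales attractions so that they sum to the productions."""
    return total_production / total_attraction


def balance_attractions(attractions, productions):
    """Scale every zonal attraction by the same factor."""
    factor = adjustment_factor(sum(productions), sum(attractions))
    return [a * factor for a in attractions]


for purpose, (tp, ta) in totals.items():
    print(purpose, round(adjustment_factor(tp, ta), 6))

# Hypothetical zonal values, for illustration only.
zone_productions = [120.0, 80.0, 50.0]
zone_attractions = [100.0, 90.0, 40.0]
print(balance_attractions(zone_attractions, zone_productions))
```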
3.4 Models Comparison and Discussion
Both regression and categorical models can explain the observed variation in
trip generation, but each model has its advantages and disadvantages. The major advantage of
regression models is that they are more flexible in that they can be aggregated to the zonal
level or left disaggregated at the household level. However, categorical models should
operate at the household level to produce meaningful results.
Another advantage of regression models is that goodness-of-fit measures are available,
which are relatively easy to apply when determining the optimal model. It is much more
difficult, or perhaps not suitable, to statistically validate a categorical model. A major
disadvantage of regression models is that a linear relationship between the independent
variables and trips is assumed, when in reality some other type of relationship may best
describe the data. Another disadvantage is that a regression model also assumes that this
linear relationship applies to all data points in the model. Conversely, a categorical model
does not have this assumption built in.
Categorical models operate on a household basis. This ensures that no information is lost,
because there is no aggregation process. However, there are many issues due to the lack
of data in certain cells of the cross-classification table, especially in the cells at the extremes
of the table where data points are rare. This results in poor trip estimations being made for
these cells. Also, it is difficult to select the independent variables which should be used for
cross-classification as there are hardly any statistical measures to apply.
All in all, in our case, trips for both production and attraction will be estimated using zonal
regression models. Consistency is maintained as both models will be subject to the same
level of aggregation.
4. Trip Distribution
4.1 Introduction
Trip distribution analysis aims to determine how trips produced by each zone (origin) are
distributed to other zones (destination) and to develop models that can be used to estimate
such distribution [3].
The three types of deterrence functions used are as follows:
1. Power Function
$f(C_{ij}) = C_{ij}^{-b}$ (4.1)
2. Exponential Function
$f(C_{ij}) = e^{-b C_{ij}}$ (4.2)
3. Combined Function
$f(C_{ij}) = C_{ij}^{a} e^{-b C_{ij}}$ (4.3)
4.2 HBW Trip Distribution Models
The cost function is taken to be the auto generalized cost (GC).
4.2.1 Power Deterrence Function
First, the power function $f(C_{ij}) = C_{ij}^{-b}$ was calibrated for HBW trips. Table 4.1
lists the tested values of b, the number of iterations needed by the algorithm to calculate
the optimal values of Ai and Bj, the resulting values of Ai and Bj, and the RMSE.
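The calibration loop described here can be sketched as follows, assuming the standard doubly constrained gravity model T_ij = A_i O_i B_j D_j f(C_ij), in which the balancing factors A_i and B_j are updated alternately. The report does not state its exact algorithm, so this form, the zone data and all function names are assumptions made for illustration.

```python
import math


def power(c, b):
    return c ** (-b)


def exponential(c, b):
    return math.exp(-b * c)


def combined(c, a, b):
    return c ** a * math.exp(-b * c)


def gravity_model(productions, attractions, cost, deterrence, iterations=10):
    """Doubly constrained gravity model with alternating balancing factors."""
    n, m = len(productions), len(attractions)
    f = [[deterrence(cost[i][j]) for j in range(m)] for i in range(n)]
    a = [1.0] * n
    b = [1.0] * m
    for _ in range(iterations):
        a = [1.0 / sum(b[j] * attractions[j] * f[i][j] for j in range(m))
             for i in range(n)]
        b = [1.0 / sum(a[i] * productions[i] * f[i][j] for i in range(n))
             for j in range(m)]
    trips = [[a[i] * productions[i] * b[j] * attractions[j] * f[i][j]
              for j in range(m)] for i in range(n)]
    return trips, a, b


def rmse(estimated, observed):
    """Root mean squared error over all origin-destination cells."""
    cells = [(e - o) ** 2
             for row_e, row_o in zip(estimated, observed)
             for e, o in zip(row_e, row_o)]
    return math.sqrt(sum(cells) / len(cells))


# Hypothetical three-zone example (total productions equal total attractions).
O = [100.0, 200.0, 150.0]
D = [120.0, 180.0, 150.0]
C = [[2.0, 5.0, 8.0], [5.0, 3.0, 6.0], [8.0, 6.0, 2.0]]
observed = [[60.0, 30.0, 10.0], [40.0, 120.0, 40.0], [20.0, 30.0, 100.0]]

for b_value in (0.05, 0.10, 0.15):
    est, _, _ = gravity_model(O, D, C, lambda c: exponential(c, b_value))
    print(b_value, round(rmse(est, observed), 4))
```

Each trial value of b gives one RMSE, which is the quantity tabulated in the calibration tables.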

Table 4.1 Calibration of HBW Gravity Model with Power Function
B Iteration Ai Bj RMSE
1 10 2.56599E-05 1.269687735 13.33080336
1.1 10 3.51108E-05 1.298310777 13.15828503
1.105 10 3.56363E-05 1.300818469 13.15718502
1.1085 10 3.60086E-05 1.302583349 13.15690751
1.109 10 3.60621E-05 1.302836115 13.15690135
1.1095 10 3.61157E-05 1.303089042 13.15690359
1.11 10 3.61694E-05 1.30334213 13.15691426
1.12 10 1.308437804 1.308437804 13.15892116
1.15 10 4.07177E-05 1.324119786 13.18640295
1.2 10 4.71756E-05 1.351629071 13.31056066
1.205 10 4.78727E-05 1.354478128 13.32877655
1.21 10 4.85797E-05 1.357345524 13.34809559
1.25 10 5.46051E-05 1.380960239 13.54363657
1.3 10 6.31419E-05 1.412242836 13.89557103
1.5 10 0.000111714 1.559757442 16.55630449
While performing the iterations in Table 4.1, the graph in Figure 4.1 was plotted
simultaneously, and the value of b with the lowest RMSE was chosen, i.e., b = 1.109 with an
RMSE of 13.1569, which leads to the following function:
$f(C_{ij}) = C_{ij}^{-1.109}$ (4.4)
Figure 4.1 Trip Distribution Model optimization by Golden Section Search Method (RMSE versus exponent b)
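The figure title names the golden section search method. A minimal sketch of that method is given below, assuming that the RMSE has a single minimum in b over the bracketing interval. The quadratic objective is a hypothetical stand-in for the real RMSE(b), which would require running the gravity model for each trial value.

```python
import math


def golden_section_search(objective, lower, upper, tolerance=1e-5):
    """Minimize a unimodal function of one variable on [lower, upper]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0  # about 0.618
    a, b = lower, upper
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = objective(c), objective(d)
    while (b - a) > tolerance:
        if fc < fd:
            # Minimum lies in [a, d]; the old c becomes the new d.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = objective(c)
        else:
            # Minimum lies in [c, b]; the old d becomes the new c.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = objective(d)
    return (a + b) / 2.0


def stand_in_rmse(b):
    """Hypothetical smooth curve with its minimum at b = 1.109."""
    return 13.1569 + (b - 1.109) ** 2


best_b = golden_section_search(stand_in_rmse, 1.0, 1.5)
print(round(best_b, 3))
```

Only one new objective evaluation is needed per iteration, which matters when each evaluation is a full gravity model run.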

4.2.2 Exponential Deterrence Function
The exponential function $f(C_{ij}) = e^{-b C_{ij}}$ was calibrated by applying the same
procedure. The results are in Table 4.2.
Table 4.2 Calibration of HBW Gravity Model with Exponential Function
B Iteration Ai Bj RMSE
0.05 10 1.308437804 1.274914809 14.13961244
0.07 10 1.308437804 1.42414732 12.5297404
0.09 10 1.308437804 1.582994498 11.51887811
0.0950 10 1.308437804 1.62368912 11.40739928
0.099 10 1.308437804 1.656443598 11.36042785
0.1 10 1.308437804 1.664655853 11.35443432
0.101 10 1.308437804 1.672876579 11.35070139
0.1015 10 1.308437804 1.677230685 11.3497205
0.102 10 1.308437804 1.681105267 11.34920637
0.1025 10 1.308437804 1.685222433 11.34929055
0.103 10 1.308437804 1.689341391 11.34992499
0.105 10 1.308437804 1.705833754 11.35789913
0.15 10 1.308437804 2.065774071 13.31010996
The final b value shown in Figure 4.2 is 0.102 with an RMSE of 11.3492, which leads to the
final formula:
$f(C_{ij}) = e^{-0.102 C_{ij}}$ (4.5)
Figure 4.2 Calibration by Golden Section Search Method (RMSE versus exponent b)
4.2.3 Comparison of the HBW Trip Distribution Models
The exponential function model reproduces the observed trips better than the power function
model: its RMSE of 11.3492 is lower than the power function model's RMSE of 13.1569.
4.3 HBS Trip Distribution Models
The HBS trip distribution models were calibrated in exactly the same way as for the HBW
trips, except that the combined function was used instead of the power function, in
addition to the exponential deterrence function. Also, the cost function is taken as
50% transit generalized cost (GC) + 50% auto generalized cost (GC).
4.3.1 Exponential Deterrence Function
The results for the exponential function are in Table 4.3.
Table 4.3 Calibration Iterations
b Iteration Ai Bj RMSE
0.2 40 0.002193523 1.806107739 8.074020214
0.105 40 0.000317205 1.263984778 5.279780178
0.1 40 0.000279806 1.239136381 5.132036811
0.06 40 9.20373E-05 1.067897891 4.24209423
0.05 40 6.74611E-05 1.036365641 4.171828801
0.046 40 5.93358E-05 1.025425068 4.164185186
0.045 40 5.74406E-05 1.02284905 4.163954682
0.044 60 5.55973E-05 1.020338286 4.164361153
0.04 40 4.8721E-05 1.010964513 4.171948937
0.03 40 3.46314E-05 0.992555049 4.224594886
0 40 1.11928E-05 1 4.487014471
Again, random values for b were tried first and then a closer search was made around the
value with the lowest RMSE so far. The final b value is 0.045 with an RMSE of 4.16395,
which leads to the final formula:
$f(C_{ij}) = e^{-0.045 C_{ij}}$ (4.6)

Figure 4.3 Model Optimization by Golden Section Search Method
4.3.2 Combined Deterrence Function
The value of a was assumed to be 2, and the value of b was found through successive
iterations, whose results are tabulated below.
Table 4.4 Calibration Iterations
a b Iteration Ai Bj RMSE
2 3 10 2.40564E+15 245803992.7 14.11425
2 2 10 84130221.32 98049.94251 14.01489
2 1 10 2.975836261 51.58806788 13.11658
2 0.5 10 0.001159372 2.822307559 9.926264
2 0.2 20 6.9012E-06 1.262071105 5.191258
2 0.15 30 1.83895E-06 1.110844769 4.541762
2 0.12 30 7.53075E-07 1.036141659 4.339555
2 0.117 40 6.85661E-07 1.0300132 4.335658
2 0.115 40 6.43785E-07 1.026088736 4.334634
2 0.114 40 6.23721E-07 1.024176061 4.334572
2 0.113 40 6.04221E-07 1.022296946 4.334801
2 0.112 40 5.85269E-07 1.020451772 4.33531
2 0.11 20 5.48958E-07 1.016864738 4.337134
2 0.1 20 2.87999E-14 13761807.06 4.35979
2 0.05 20 6.33249E-08 0.998997683 4.584185
2 0.0000 10 6.98366E-09 1.473571254 4.791334
2 -0.5 10 2.91637E-21 11284596.68 5.5037
The value of b was found to be 0.114 with an RMSE of 4.3346. Thus the following
combined deterrence function was formed:
$f(C_{ij}) = C_{ij}^{2} e^{-0.114 C_{ij}}$ (4.7)
4.3.3 Comparison of HBS Trip Distribution Models
The exponential function model is slightly better than the combined function model at
reproducing the observed trips, because its RMSE of 4.16395 is lower than the RMSE
of 4.3346.
5. Modal Split Analysis
5.1 Introduction
This part presents the Modal Split Analysis step of the four-step urban transportation
demand modeling process. In this step, modal split models are developed for HBW, HBD,
and HBS trips as well as for the whole trip set.
Generally, there are three main types of modal split analysis, namely the graphical
method (diversion curves), aggregate mode choice models and disaggregate choice
models. Here, we first give a brief introduction to aggregate mode choice models
(e.g., the logistic regression method), and then focus on disaggregate choice models.
5.2 Aggregate Modal Split Models
Generally speaking, aggregate modal split modeling is similar to the regression
methodology applied for developing zonal trip generation models. It is done using zonally
aggregated data. The idea is to establish expressions that relate the proportional division of
total trip interchanges among the available modes to the attributes (competitiveness) of the
modes at the zone level.
The most popular aggregate model is the so-called logit model which is derived from the
following assumed relations:
$\ln \frac{P(a)}{P(b)} = Y$, or $\frac{P(a)}{P(b)} = e^{Y}$ (5.1)
and
$P(a) + P(b) = 1$ (5.2)
where P(a) and P(b) stand for the probabilities of two different kinds of travel mode, e.g.,
a for (car) and b for (transit), and Y represents the relative “advantage” of car as compared
to transit, and is normally assumed to be a linear function of the modal attributes such as:
$Y = a_1 + b_1 x_1 + b_2 x_2 + b_3 x_3 + \cdots + b_n x_n$ (5.3)
where $x_i$ is a selected feature which represents the modal attribute differences such as
differences in travel time, differences in travel cost, etc. Now, we have
$P(b) = \frac{1}{1 + e^{Y}}$ (5.4)
$P(a) = \frac{e^{Y}}{1 + e^{Y}}$ (5.5)
and
$\ln \frac{P(a)}{P(b)} = Y = a_1 + b_1 x_1 + b_2 x_2 + b_3 x_3 + \cdots + b_n x_n$ (5.6)
The parameters in Equation (5.6) may be estimated by linear regression analysis,
so the problem is transformed into a general linear regression problem.
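Equations (5.3) to (5.5) can be checked numerically with the short sketch below. The intercept, coefficients and attribute differences are hypothetical values chosen only to exercise the formulas.

```python
import math


def linear_advantage(intercept, coefficients, attributes):
    """Y = a_1 + b_1 x_1 + ... + b_n x_n, as in Equation (5.3)."""
    return intercept + sum(b * x for b, x in zip(coefficients, attributes))


def binary_logit(y):
    """Return (P(a), P(b)) from Equations (5.5) and (5.4)."""
    p_b = 1.0 / (1.0 + math.exp(y))
    p_a = math.exp(y) / (1.0 + math.exp(y))
    return p_a, p_b


# Hypothetical inputs: x_1 = travel time difference (min), x_2 = cost difference ($).
y = linear_advantage(0.4, [0.05, 0.2], [10.0, 1.5])
p_a, p_b = binary_logit(y)
print(round(p_a, 3), round(p_b, 3), round(math.log(p_a / p_b), 3))
```

Taking the logarithm of the ratio of the two probabilities recovers Y, which is why the observed zonal mode shares can be regressed linearly on the attribute differences.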
We should note that in the aggregate modal split model, the proportions of the different
travel modes, calculated at the zone level, are used to represent the probabilities of the
corresponding modes, and a set of attributes is used to fit a function so that the
parameters in Equation 5.6 can be estimated. The objective is to explain the
zonal deviations from the mean value of this dependent variable in terms of a set of
independent variables which describe the zones and the transportation system properties of
the competing modes. What we obtain are the probabilities of the corresponding
travel modes for trips between specific OD pairs. The aggregate logit model has clear
limitations, e.g., individual behavior is masked, which means we cannot tell which mode
travelers would choose for a trip of a given purpose. That is where the
disaggregate (discrete choice) modal split models come in.

5.3 Disaggregate Modal Split Models
Using the household survey data provided (2011 TTS data), the modal split was firstly
performed for the whole observed data set, then for trips with different purposes, i.e.,
HBW, HBD, and HBS trips separately. In each analysis, multinomial logit models are
explored and applied, followed by nested logit models.
In this section, both models utilize disaggregated data, examining mode choice on the basis
of each trip. These disaggregate mode choice models are based on the principles of utility
theory in which a deterministic component and a random error term are employed, as
shown in the equation below:
𝑢𝑖 = 𝑣𝑖 + 𝜀𝑖 (5.7)
in which u, v and ε are the utility, deterministic component and random error term of each
mode i respectively. The deterministic component accounts for variables, e.g., travel time
and cost, which can be easily quantified; while the random error term accounts for all other
factors that are difficult to quantify. Relying on the assumptions that random error terms
are identically and independently distributed and follow a Gumbel Type I distribution, the
probability of choosing a mode is described by the following formula:
$P(A_i) = \frac{e^{v_i}}{\sum_{A_j \in A} e^{v_j}}$ (5.8)
in which P(A_i) is the probability of choosing mode A_i out of the set A of all available
modes A_j, and v_i is the deterministic component of the utility function for mode i, with
v_j ranging over the deterministic components of all available modes. Users are expected
to choose the mode that provides the highest utility.
A key assumption of this model is the independence of irrelevant alternatives which states
that the ratio of the probabilities of choosing two modes is independent of the attributes of
all other alternatives. This means that the probability ratio between these two modes stays
the same no matter what changes are made to the other modes.
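Equation (5.8) and the independence of irrelevant alternatives can be illustrated with a few lines of code. The utility values below are hypothetical; the point is only that the auto-to-transit ratio is unchanged when the bike utility changes.

```python
import math


def mnl_probabilities(utilities):
    """Multinomial logit probabilities from deterministic utilities."""
    v_max = max(utilities.values())  # subtracting the maximum avoids overflow
    exp_v = {mode: math.exp(v - v_max) for mode, v in utilities.items()}
    total = sum(exp_v.values())
    return {mode: e / total for mode, e in exp_v.items()}


# Hypothetical deterministic utilities for the four modes.
base = {"auto": 1.0, "transit": -0.5, "walk": -1.5, "bike": -2.0}
improved_bike = dict(base, bike=0.0)

p_base = mnl_probabilities(base)
p_new = mnl_probabilities(improved_bike)

print(p_base["auto"] / p_base["transit"])
print(p_new["auto"] / p_new["transit"])
```

Both ratios equal exp(v_auto - v_transit), since the shared denominator cancels.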
In this section, the multinomial logit model estimates four possible modes: Auto (1), transit
(2), walk (3) and bike (4). Auto driver and passenger were placed into the same category
all as Auto (1).
As for the nested logit models, since they do not assume independence from irrelevant
alternatives, modes are classified according to certain distinguishing modal features using
a hierarchical structure. Modes are separated into subsets based on their attributes until one
of the final mode choices is reached. Organizing modes into subsets or “nests” allows
only similar modes to be compared at each step, so that a change in the attributes of one
mode mainly shifts the choice probabilities of the modes in the same nest. Two different
nest structures are explored, as shown in Figures 5.1 and 5.2.
Figure 5.1 Nested structure 1
Figure 5.2 Nested structure 2
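A two-level nested logit in the spirit of nested structure 2 can be sketched as follows. The sketch assumes the common formulation in which each nest has a parameter between 0 and 1 and a value of 1 reduces the model to the multinomial logit; estimation software may report the nest coefficient in a different parametrization. The utilities and the parameter value are hypothetical.

```python
import math


def nested_logit(utilities, nests, nest_parameter):
    """Two-level nested logit choice probabilities.

    nests maps a nest name to its list of modes;
    nest_parameter maps a nest name to a value in (0, 1].
    """
    within_sum = {}
    for nest, modes in nests.items():
        lam = nest_parameter[nest]
        within_sum[nest] = sum(math.exp(utilities[m] / lam) for m in modes)
    # exp(lambda * inclusive value) for each nest
    nest_weight = {nest: within_sum[nest] ** nest_parameter[nest] for nest in nests}
    total = sum(nest_weight.values())
    probabilities = {}
    for nest, modes in nests.items():
        lam = nest_parameter[nest]
        p_nest = nest_weight[nest] / total
        for m in modes:
            p_within = math.exp(utilities[m] / lam) / within_sum[nest]
            probabilities[m] = p_nest * p_within
    return probabilities


# Hypothetical utilities; walk and bike share the "active" nest.
v = {"auto": 1.0, "transit": -0.5, "walk": -1.5, "bike": -2.0}
structure = {"auto": ["auto"], "transit": ["transit"], "active": ["walk", "bike"]}
params = {"auto": 1.0, "transit": 1.0, "active": 0.5}

print(nested_logit(v, structure, params))
```

With a parameter below 1 for the active nest, improving the bike utility draws proportionally more from walk than from auto or transit.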
5.4 Statistics of Estimated Trip Records
Table 5.1 and 5.2 give a brief illustration of some statistics of the estimated trip records,
from the viewpoint of mode and trip purpose, respectively.

Table 5.1 Number of estimated trips by different modes
Mode Number of Estimated trips
M1_AUTO Auto Driver (31348) + Auto Passenger (7731) = 39079
M2_TRANS 2210
M3_BIKE 347
M4_WALK 1845
M5_School Bus 831
M6_Taxi 135
M7_Motorcycle 24
M8_Other 97
M9_Unknow 1
Total 44516
Table 5.2 Number of estimated trips by different trip purposes
Trip Purpose Number of Estimated trips
HBW 10847
HBS 4442
HBD 20983
NHB 8244
Total 44516
5.5 Modal Split Analysis for All trips
5.5.1 Multinomial logit models
Five different models, that is, five different sets of modal split utility functions, are
explored for the whole observed trip dataset.
The first model starts from relatively complex utility functions which include all possible
variables that might be considered significant for the choice of each mode. The functions
for the probabilities of choosing each mode are illustrated in Table 5.3.

Table 5.3 Modal split function 1 for all observed trip records
Mode Utility Function Specification
M1_AUTO V1 = A_AUTO + B1_AUTOTT * AUTOTT + B1_AUTOGC * AUTOGC + B1_TRAN_PASS * TRAN_PASS + B1_TRIP_KM * TRIP_KM
M2_TRANS V2 = A_TRANS + B2_TRANSITTT * TRANSITTT + B2_TRANSITGC * TRANSITGC + B2_LICENSE * LICENSE + B2_TRAN_PASS * TRAN_PASS
M3_WALK V3 = A_WALK + B3_LICENSE * LICENSE + B3_AGE * AGE + B3_TRAN_PASS * TRAN_PASS
M4_BIKE V4 = A_BIKE + B4_LICENSE * LICENSE + B4_AGE * AGE + B4_TRAN_PASS * TRAN_PASS
In the above utility functions, the parameters beginning with “A” are constants, while the
parameters beginning with “B” are the coefficients that describe the influence of each
attribute on the probability of choosing each mode. With four modes there should be
4-1=3 constants, i.e., one of the A_* constants should be fixed. Nevertheless, in the first
attempt all four constants were estimated to see what would happen, and the resulting
statistics confirmed that (at least) one of the four constants does need to be fixed.
The estimation of these parameters is performed by BIOGEME, and the estimation results,
which include the obtained values and statistics for each factor, are demonstrated in Table
5.4. The model has a log-likelihood of -12558.591 and an adjusted Rho-squared of 0.798,
which are used as two references in the later process to compare the goodness of different
models. From Table 5.4, we can see that many variables are not estimated as significant,
since their t-values lie within [-1, 1], and therefore should be removed. In particular,
all parameters related to TRIP_KM are not significant, which might mean that people
are less sensitive to trip distance than to the other factors.

Table 5.4 Modal split function 1 calibration statistics
To improve the model step by step, the parameters related to TRIP_KM, i.e., B1_TRIP_KM,
B2_TRIP_KM, B3_TRIP_KM and B4_TRIP_KM, as well as B1_PASS, are first removed from
the utility functions. The new modal split utility functions, UF2, are shown in Table 5.5.
M1_AUTO
M2_TRANS
V2 = A_TRANS + B2_TRANSITTT * TRANSITTT +
B2_TRANSITGC * TRANSITGC
+ B2_PASS * TRAN_PASS
M3_WALK
V3 = A_WALK + B3_LICENSE * LICENSE
+ B3_AGE * AGE + B3_PASS * TRAN_PASS
M4_BIKE
+ B4_ AGE * AGE + B4_PASS * TRAN_PASS

The estimation results, e.g., the obtained values and statistics for each factor, are
demonstrated in Table 5.6.
This time, the model has a log-likelihood of -13498.583 and an adjusted Rho-squared of
0.783. The model improves considerably in terms of the parameters’ significance; however,
B4_PASS and A_WALK are still not significant. So, in the following model, B4_PASS is
dropped and A_WALK’s value is kept fixed. The new modal split utility functions, UF3,
are shown in Table 5.7.
M1_AUTO
M2_TRANS
M3_WALK
V3 = 0 + B3_LICENSE * LICENSE
M4_BIKE
+ B4_ AGE * AGE

The estimation results, e.g., the obtained values and statistics for each factor, are
demonstrated in Table 5.8.
The model has a log-likelihood of -13499.075 and an adjusted Rho-squared of 0.783. From
Table 5.8, we can tell that the model improves again in terms of the parameters’
significance; however, B1_AUTOGC and A_BIKE are still not significant. So, in the
following model, we drop B1_AUTOGC and keep A_BIKE fixed. The new modal split
utility functions, UF4, are shown in Table 5.9.
M1_AUTO V1 = A_AUTO + B1_AUTOTT * AUTOTT
M2_TRANS
M3_WALK
M4_BIKE
+ B4_ AGE * AGE

The estimation results, e.g., the obtained values and statistics for each factor in UF4, are
demonstrated in Table 5.10. From Table 5.10, we can tell that the model again improves,
with all parameters now significant. However, the values for B1_AUTOTT and
B2_TRANSITTT are close to each other (9.02 and 9.23). The last model is thus tried with a
generic variable for travel time. The new modal split utility functions, UF5, are shown in
Table 5.11.
M1_AUTO V1 = A_AUTO + B_TT* AUTOTT
M2_TRANS
V2 = A_TRANS + B_TT * TRANSITTT +
M3_WALK
M4_BIKE
+ B4_ AGE * AGE
The estimation results, e.g., the obtained values and statistics for each factor in UF5, are
demonstrated in Table 5.12.

The model has a log-likelihood of -13500.609 and an adjusted Rho-squared of 0.783. The
log-likelihood decreased minimally and the Rho-squared did not change. From Table 5.12,
we can tell that all estimated factors remained significant.
To test whether the new, simpler model UF5 is preferable to the previous one, UF4, a
likelihood-ratio test was performed. The new model (UF5) is a restricted version of the
previous model (UF4). The likelihood ratio statistic is
LR = 2*(-13500.418-(-13500.609)) = 0.382. This LR value is compared to the critical
likelihood ratio value LR*, which comes from the χ2 table with k degrees of freedom,
where k is the difference in the number of parameters between the two models, so k equals 1
in this case. A confidence level of 95% is chosen. From the χ2 table, the resulting LR*
is 3.841.
LR = 0.382 < LR* = 3.841
Therefore, the complex model UF4 is not significantly better than the simpler model UF5.
As a result, the simpler model UF5 should be used.
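The likelihood-ratio test above can be reproduced with the following sketch. The two log-likelihoods and the critical value are taken from the text; the function names are illustrative, and the p-value helper uses the closed form for the chi-square distribution with one degree of freedom.

```python
import math


def likelihood_ratio_test(ll_restricted, ll_unrestricted, critical_value):
    """Return the LR statistic and whether the restricted model is rejected."""
    lr = 2.0 * (ll_unrestricted - ll_restricted)
    return lr, lr > critical_value


def chi2_sf_1dof(x):
    """P(chi-square with 1 degree of freedom > x) = erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(x / 2.0))


# UF5 (restricted, generic travel time) against UF4 (unrestricted).
lr, reject_uf5 = likelihood_ratio_test(-13500.609, -13500.418, 3.841)
print(round(lr, 3), reject_uf5, round(chi2_sf_1dof(lr), 3))
```

A statistic below the critical value means the restriction cannot be rejected, so the simpler model is retained.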
Finally, we get the utility function as demonstrated by UF5 in Table 5.11 when all observed
trip records are used to estimate our disaggregate model.
5.5.2 Nested logit models
Nested logit models are evaluated. Two different nested structures are explored to nest the
final multinomial logit model. As demonstrated in Figure 5.1, in the first nested structure,
we split the four modes into two subdomains, i.e., we treat Car and Transit as Non-active,
while Bike and Walk as Active. However, in the second nested structure, only walking and
biking are nested under the active category, while auto and public modes remain not nested.
Results show that the second nested structure obtains better statistics and is more
interpretable.

(1) Nested structure 1
Firstly, we nest the logit model given by the utility function UF5 in Table 5.11 using the
first nested structure illustrated in Figure 5.1. The estimation results for this nested model,
e.g., the obtained values and statistics for each factor are demonstrated in Table 5.13.
Table 5.13 Nested Model 1 calibration statistics
From Table 5.13, we can see that the coefficient of the non-active nest, i.e., AutoTransit,
is 1.00 and is flagged in red, which indicates that this nested structure is not suitable.
Also, the t-test value of Active is 0.63, indicating that this coefficient is insignificant.
Therefore, this nested structure should not be adopted.
(2) Nested structure 2
Table 5.14 Nested Model 2 calibration statistics
Next, we nest the logit model given by the utility function UF5 in Table 5.11 using
nested structure 2, illustrated in Figure 5.2. The estimation results for the second nested
model, i.e., the estimated values and statistics for each parameter, are shown in Table
5.14.
This time there is no red warning; however, from Table 5.14, the coefficient of the Active
nest is 7.24 with a t-test value of 0.47, indicating that it is insignificant. The t-test
value of B3_PASS is insignificant as well. Therefore, although this nested structure is
better than the first one, nesting the logit model does not improve the model. The nested
model is excluded as a candidate for the modal split.
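The significance screening used throughout this chapter can be sketched as a simple threshold on the absolute t-statistic. The 1.96 cutoff (two-sided test at the 95% level, large sample) is an assumption, since the report does not state the threshold it applied. The t-values are those quoted above for the Active nest coefficient.

```python
# Two-sided critical value at the 95% level for large samples (assumed threshold).
T_CRITICAL = 1.96

def is_significant(t_value, t_critical=T_CRITICAL):
    """Return True when |t| reaches the critical value."""
    return abs(t_value) >= t_critical

# t-values of the Active nest coefficient quoted in the text.
active_t = {"Nested structure 1": 0.63, "Nested structure 2": 0.47}
flags = {name: is_significant(t) for name, t in active_t.items()}
for name, ok in flags.items():
    print(f"{name}: t = {active_t[name]:.2f}, significant = {ok}")
```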
5.6 Modal Split Analysis for HBW trips
For HBW trips, the first tried model is the optimal model for all trips given by UF5. The
mode utility functions are shown again in Table 5.15.
Table 5.15 Modal split function 1 for HBW
M2_TRANS
M3_WALK
M4_BIKE
+ B4_ AGE * AGE
The resulting values and statistics for each factor are in Table 5.16.
Table 5.16 HBW modal split function 1 calibration statistics

The model has a log-likelihood of -3608.935 and an adjusted Rho-squared of 0.765. From
Table 5.16, B4_AGE is not significant, with a t-test value of -0.44. So, in the following
model, we drop B4_AGE. The new modal split utility functions, HBW_UF2, are shown in
Table 5.17.
M2_TRANS
M3_WALK
M4_BIKE V4 = 0 + B4_LICENSE * LICENSE
From Table 5.18, all parameters are significant. So, to improve the model, the
generic variable for travel time is split into B1_AUTOTT and B2_TRANSITTT to see whether
they influence the mode choice for HBW trips. The new mode utility functions HBW_UF3
are shown in Table 5.19.
The resulting values and statistics for each parameter are in Table 5.20. The model has a
log-likelihood of -3608.829 and an adjusted Rho-squared of 0.765. The log-likelihood
increased only minimally and the Rho-squared did not change.

M2_TRANS
V2 = A_TRANS + B2_ TRANSITTT * TRANSITTT
M3_WALK
From Table 5.20, all parameters remain significant. So, to test whether the new, more
complex model is better than the previous one, a likelihood-ratio test is carried out.
The previous model (HBW_UF2) is a restricted version of the new model (HBW_UF3).
The likelihood ratio statistic is LR = 2*(-3608.829 - (-3609.032)) = 0.406. This LR value
is compared to the critical likelihood ratio value LR*, which comes from the χ² table
with k degrees of freedom, where k is the difference in the number of variables
between both models, so k equals 1 in this case. The confidence level is chosen as 95%.
From the χ² table, the resulting critical value LR* is 3.841.
LR = 0.406 < LR* = 3.841
Therefore, the complex model HBW_UF3 is not significantly better than the simpler model
HBW_UF2. As a result, the simpler model HBW_UF2 should be adopted. To improve the
HBW_UF2 model, a final model is explored by adding B4_PASS; the new mode
utility functions HBW_UF2_Upgrade are shown in Table 5.21.

Table 5.21 Modal split function 2_Upgrade for HBW
M2_TRANS
M3_WALK
M4_BIKE
The estimation results, i.e., the estimated values and statistics for each parameter in
HBW_UF2_Upgrade, are shown in Table 5.22.
Table 5.22 HBW modal split function 2_Upgrade calibration statistics
The model has a log-likelihood of -3600.444 and an adjusted Rho-squared of 0.766. Both
the log-likelihood and the Rho-squared are improved. Moreover, from Table 5.22, all
parameters are significant. The newly added parameter has a t-test value of 4.13.
A likelihood-ratio test confirms the improvement, with
LR = 2*(-3600.444 - (-3609.032)) = 17.176 > 3.841
so the new model (HBW_UF2_Upgrade) is significantly better than the simpler
model HBW_UF2. As a result, HBW_UF2_Upgrade should be adopted.
Finally, we obtain the utility functions given by HBW_UF2_Upgrade in Table 5.21
to estimate our disaggregate model for HBW trips.
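The two likelihood-ratio tests in this section can be reproduced with a short sketch. It uses the log-likelihoods reported above and, for one degree of freedom, obtains the χ² tail probability from the complementary error function. The function names are illustrative.

```python
import math

CHI2_CRITICAL_DF1 = 3.841  # chi-square critical value, 1 degree of freedom, 95% level

def likelihood_ratio(ll_restricted, ll_unrestricted):
    """LR = 2 * (LL of the larger model - LL of the restricted model)."""
    return 2.0 * (ll_unrestricted - ll_restricted)

def p_value_df1(lr):
    """Upper-tail chi-square probability for 1 degree of freedom."""
    return math.erfc(math.sqrt(lr / 2.0))

ll_uf2 = -3609.032      # HBW_UF2
ll_uf3 = -3608.829      # HBW_UF3 (travel time split by mode)
ll_upgrade = -3600.444  # HBW_UF2_Upgrade (B4_PASS added)

for name, ll in [("HBW_UF3", ll_uf3), ("HBW_UF2_Upgrade", ll_upgrade)]:
    lr = likelihood_ratio(ll_uf2, ll)
    better = lr > CHI2_CRITICAL_DF1
    print(f"{name}: LR = {lr:.3f}, p = {p_value_df1(lr):.4f}, better = {better}")
```

With these inputs the statistic is 0.406 for HBW_UF3 and 17.176 for HBW_UF2_Upgrade, so only the second model clears the 3.841 threshold.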

Nested logit models are also evaluated for HBW trips. Section 5.5.2 showed that the first
nested structure is not suitable, so here we only explore the nested model based on
nested structure 2, shown in Figure 5.2. In this structure, walking and biking are nested
under the Active category, while the auto and transit modes remain un-nested.
(1) HBW nested model 1
First, we nest the logit model obtained by slightly modifying the utility function
HBW_UF2_Upgrade in Table 5.21 (adding A_BIKE), using the nested structure
illustrated in Figure 5.2. The estimation results for this nested model, i.e., the estimated
values and statistics for each parameter, are shown in Table 5.23.
Table 5.23 HBW Nested Model 1 calibration statistics
The model has a log-likelihood of -3434.358 and an adjusted Rho-squared of 0.769, both
of which are improved. However, from Table 5.23, Active, B4_LICENSE, and B4_PASS
are insignificant. So, in the following two models, we drop B4_PASS and B4_LICENSE
in turn to see whether the model improves. The second nested model for HBW, with
B4_PASS dropped, is shown in Table 5.24.
The estimation results, i.e., the estimated values and statistics for each parameter in HBW
nested model 2, are shown in Table 5.25.
The model has a log-likelihood of -3436.743 and an adjusted Rho-squared of 0.769. This
time, from Table 5.25, both Active and B4_LICENSE become significant. So, this model
is acceptable. However, a third nested model for HBW, shown in Table 5.26 with
B4_LICENSE dropped, is explored to check whether any benefit is obtained.
Table 5.24 HBW nested model 2 utility function
M2_TRANS
M3_WALK
M4_BIKE V4 = A_BIKE + B4_LICENSE * LICENSE
The estimation results, i.e., the estimated values and statistics for each parameter in HBW
nested model 3, are shown in Table 5.27.
The model has a log-likelihood of -3434.524 and an adjusted Rho-squared of 0.769; the
log-likelihood is improved and the adjusted Rho-squared is unchanged compared with the
previous Nested Model 2. Moreover, from Table 5.27, all parameters remain significant,
and the t-test value of Active improves from 2.90 to 4.63. In addition, the t-test value
of B4_PASS is better than that of B4_LICENSE. All in all, this model is much better
than Nested Model 2 and is also better than the non-nested models. Therefore, HBW
Nested Model 3 is the final model chosen for the modal split analysis of HBW trips.
Table 5.26 HBW nested model 3 utility function
M2_TRANS
M3_WALK
M4_BIKE V4 = A_BIKE + B4_PASS * TRAN_PASS
5.7 Modal Split Analysis for HBD trips
For HBD trips, the first tried model is the optimal model for all trips given by UF5. The
mode utility functions are again shown in Table 5.28.
The resulting values and statistics for each parameter are in Table 5.29. The model has a
log-likelihood of -3045.083 and an adjusted Rho-squared of 0.895. From Table 5.29,
B3_PASS and B4_AGE are not significant. So, in the following model, we drop
B3_PASS and B4_AGE. The new modal split utility functions, HBD_UF2, are shown in
Table 5.30.
Table 5.28 Modal split function 1 for HBD
M2_TRANS
M3_WALK
M4_BIKE
+ B4_ AGE * AGE
Table 5.29 HBD modal split function 1 calibration statistics
M2_TRANS
M3_WALK V3 = 0 + B3_LICENSE * LICENSE + B3_AGE * AGE

The estimation results, i.e., the estimated values and statistics for each parameter in HBD
UF2, are shown in Table 5.31.
The model has a log-likelihood of -3047.220 and an adjusted Rho-squared of 0.895; the
log-likelihood decreased a little and the adjusted Rho-squared is unchanged compared with
the previous model. Moreover, from Table 5.31, all parameters are significant
this time. To improve the HBD_UF2 model, a further model is explored by adding B4_PASS.
The new mode utility functions HBD_UF3 are shown in Table 5.32.
M2_TRANS
M4_BIKE
The estimation results, i.e., the estimated values and statistics for each parameter in HBD
UF3, are shown in Table 5.33. Compared with the previous model, the log-likelihood
increased a little and the adjusted Rho-squared is unchanged. However, from Table 5.33,
the newly added B4_PASS is not significant, with a t-test value of -0.40. Therefore, this
model is not acceptable, and the final multinomial logit model for HBD trips is the
second one, given by HBD UF2 in Table 5.30.

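Nested logit models are also evaluated for HBD trips. Section 5.5.2 showed that the first
nested structure is not suitable, so here we only explore the nested model based on
nested structure 2, shown in Figure 5.2. In this structure, walking and biking are nested
under the Active category, while the auto and transit modes remain un-nested.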
(1) HBD nested model 1
First, we nest the logit model obtained by slightly modifying the utility function HBD UF2
in Table 5.30 (adding A_BIKE), using the nested structure illustrated in Figure 5.2. The
estimation results for this nested model, i.e., the estimated values and statistics for each
parameter, are shown in Table 5.34.
Table 5.34 HBD Nested Model 1 calibration statistics
The model has a log-likelihood of -3045.120 and an adjusted Rho-squared of 0.894. From
Table 5.34, many parameters are insignificant. So, in the following model, we modify the
specification with the aim of making every parameter significant. The second nested model
for HBD, with B4_PASS added, is shown in Table 5.35.

The estimation results, i.e., the estimated values and statistics for each parameter in HBD
nested model 2, are shown in Table 5.36.
The model has a log-likelihood of -3041.532 and an adjusted Rho-squared of 0.894; the
log-likelihood improved a little and the adjusted Rho-squared is unchanged. This time, from
Table 5.36, some parameters become significant, so the model is somewhat improved.
A third nested model for HBD, shown in Table 5.37 with B4_LICENSE dropped, is then
explored to check whether any benefit is obtained.
Table 5.35 HBD nested model 2 utility function
M2_TRANS
M3_WALK
M4_BIKE

Table 5.37 HBD nested model 3 utility function
M2_TRANS
M3_WALK
M4_BIKE V4 = A_BIKE + B4_PASS * TRAN_PASS
The estimation results, i.e., the estimated values and statistics for each parameter in HBD
nested model 3, are shown in Table 5.38.
The model has a log-likelihood of -3042.317 and an adjusted Rho-squared of 0.894. From
Table 5.38, all parameters are significant this time. However, compared with
the non-nested model, whose log-likelihood is -3047.220 and adjusted Rho-squared is
0.895, the nested model is acceptable but not significantly better. Therefore, for HBD
trips, both the nested model and the non-nested model given in Table 5.30 remain
candidates.

5.8 Modal Split Analysis for HBS trips
For HBS trips, the first tried model is the optimal model for all trips given by UF5. The
mode utility functions are again shown in Table 5.39.
Table 5.39 Modal split function 1 for HBS
M2_TRANS
M3_WALK
M4_BIKE
+ B4_ AGE * AGE
Table 5.40 HBS modal split function 1 calibration statistics
From Table 5.40, B3_PASS and A_AUTO are not significant. So, in the following
model, we drop B3_PASS and keep A_AUTO fixed. The new modal split utility functions,
HBS_UF2, are shown in Table 5.41.
The estimation results, i.e., the estimated values and statistics for each parameter in HBS
UF2, are shown in Table 5.42.

M1_AUTO V1 = 0 + B_TT* AUTOTT
M2_TRANS
M3_WALK V3 = B3_LICENSE * LICENSE + B3_AGE * AGE
M4_BIKE V4 = 0 + B4_LICENSE * LICENSE + B4_AGE * AGE
The model has a log-likelihood of -3259.400 and an adjusted Rho-squared of 0.352. From
Table 5.42, all parameters are significant. We then tried to upgrade the model by adding
B4_PASS; however, B4_AGE and B4_LICENSE then became insignificant. We therefore
arrive at a model with B4_AGE and B4_LICENSE dropped and B4_PASS added. The new
modal split utility functions, HBS_UF3, are shown in Table 5.43.
M1_AUTO V1 = 0 + B_TT* AUTOTT
M2_TRANS
M4_BIKE V4 = 0 + B4_PASS * TRAN_PASS

The estimation results, i.e., the estimated values and statistics for each parameter in HBS
UF3, are shown in Table 5.44.
The model has a log-likelihood of -3257.059 and an adjusted Rho-squared of 0.353, both
slightly higher than in the previous model. Moreover, this model has fewer
parameters than the previous one. Therefore, this model is better and is selected as the final
multinomial logit model for analyzing HBS trips. However, we should note that there are
only 4442 observed records for HBS trips, which might result in some random estimation
errors.
Nested logit models are also evaluated for HBS trips. Section 5.5.2 showed that the first
nested structure is not suitable, so here we only explore the nested model based on
nested structure 2, shown in Figure 5.2. In this structure, walking and biking are nested
under the Active category, while the auto and transit modes remain un-nested. Also, because
only 4442 observed records are available for estimating HBS trips, we try only one
nested model.
HBS nested model
We nest the logit model given by the final utility function HBS UF3, described in the
previous section in Table 5.43, using the nested structure illustrated in Figure 5.2.
The estimation results for this nested model, i.e., the estimated values and statistics for
each parameter, are shown in Table 5.45.
From Table 5.45, the parameters A_AUTO and B4_PASS are insignificant. Because
B4_PASS is the only remaining parameter for the Bike mode, it should not be removed.
We therefore conclude that the nested model is not suitable for HBS trips, possibly
because there are too few observations.

Table 5.45 HBS Nested Model calibration statistics
5.9 Check consistency with the reality
After model calibration, the signs of the estimated coefficients within each logit model
should be examined for consistency with reality, i.e., to check whether any of them run
counter to rational thinking.
Here, we provide a consistency check of the final multinomial logit model for HBW trips as
an example. The utility functions are shown in Table 5.21, and the calibration result is
given in Table 5.22, which is copied below for easy reference.
With Walk and Bike as the reference modes (fixed constant), the sign of each coefficient
is examined in Table 5.46 below for auto, transit, biking and walking respectively.
The signs of many of these coefficients are open to interpretation: some accord with
reality, some are neutral, and a few might run counter to realistic thinking.
Table 5.46 provides a detailed consistency analysis for the estimated
coefficients in the HBW multinomial logit model.

Table 5.46 HBW multinomial logit model consistency analysis
Variable | Sign | Makes sense? | Rationale
A_AUTO | Positive | Yes | Shows that auto is superior
A_TRANS | Negative | Neutral | Transit might be superior to Walk and Bike but could be inferior in certain situations
B2_TRANSITGC | Negative | Yes | If the generalized cost increases, the probability of choosing that mode should decrease
B2_LICENSE | Positive | No | If a person gets a driving license, he/she may be more likely to choose the Auto mode for traveling
… | … | … | … may be more likely to choose the Auto mode for traveling
B3_AGE | Negative | Yes | As people get older, they should be less likely to choose Walk as the way of traveling
B3_PASS | Positive | Neutral | If people have a transit pass, they generally would be more likely to choose transit, but it may also increase the probability of walking
… | … | … | … may be more likely to choose the Auto mode for traveling
B4_PASS | Positive | Neutral | If people have a transit pass, they generally would be more likely to choose transit, but it may also increase the probability of riding a bike
B_TT | Positive | Neutral | Travel time should have a negative coefficient when it is considered as a travel cost; however, as trips get longer, people may be more likely to choose Transit and Auto over Walk and Bike, in which case the positive sign makes sense
In addition, the correlation between the estimated coefficients should be examined to check
whether any of them are significantly correlated. As an example, Table A.1 in
the Appendix shows the correlation of coefficients for the HBW multinomial logit
model, using the same utility functions as in the consistency analysis above.

5.10 Comparison and discussion
In general, most of the nested models are not improvements compared with the non-nested
multinomial logit models. Therefore, in this report, nested logit models were not
extensively explored, especially for HBS trips. The final model for all trips is the
multinomial logit model given by UF5 in Table 5.11; the final model for HBD is the
non-nested model in Table 5.30 (its nested model in Table 5.37 is also suitable); the
final model for HBS is the non-nested model in Table 5.43; only for HBW trips is the
final model a nested one, whose utility functions are given in Table 5.26 and whose
nested structure is illustrated in Figure 5.2.
The generalized costs of auto and transit travel, especially of auto, are relatively
insignificant factors in mode choice. The strongest factors are travel times
for auto and transit, license ownership, and age (for the bike mode). The significance of
AGE for choosing a bike is reasonable: its coefficient is negative, which indicates that
as people get older, they are less inclined to ride a bike than young people.
Most users chose auto, indicating that this is generally the
preferred method of travel, yet its availability depends on holding a driving
license. It is therefore reasonable that LICENSE has a significant impact on the overall
mode split.
6. Traffic Assignment
6.1 Introduction
Traffic assignment is the last step in four-step urban transportation demand
modeling. The objective of this step is to assign trips to specific paths using an
appropriate traffic assignment algorithm. Because the data in their current condition are
not suitable for traffic assignment, only a theoretical discussion is given here.
In this section, the advantages and disadvantages of link-performance-based and
node-performance-based methods are examined first. A link performance function is then
proposed, along with a discussion of the effects that the variables within the function have
on link volumes. Finally, the advantages and disadvantages of using the maximum relative
gap for traffic assignment are discussed.
6.2 Link and node performance based assignment methods
Traffic assignment methods can be divided into link-performance-based and
node-performance-based methods. Link performance is mainly based on link capacity and
free-flow travel time, while node performance depends on more factors, such as traffic
signal systems, freeway ramp meters, or enhanced network control of traffic.
Node-performance-based methods are therefore considerably more complicated.
For highways with many signalized intersections that involve highly complex
movements, or for congested downtowns, a node-based traffic assignment should be applied.
Node performance functions are used to calculate the delay experienced by vehicles at
signalized and unsignalized intersections. According to the Highway Capacity Manual
(HCM) 2000 [1], the following equations can be used to estimate the control delay at
unsignalized intersections (6.1) and signalized intersections (6.2):
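$$d = \frac{3600}{c_{m,x}} + 900T\left[\frac{v_x}{c_{m,x}} - 1 + \sqrt{\left(\frac{v_x}{c_{m,x}} - 1\right)^2 + \frac{\left(\frac{3600}{c_{m,x}}\right)\left(\frac{v_x}{c_{m,x}}\right)}{450T}}\right] + 5 \qquad (6.1)$$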
where
$d$ = control delay (s/veh);
$v_x$ = flow rate for movement x (veh/h);
$c_{m,x}$ = capacity of movement x (veh/h); and
$T$ = analysis time period (h) (with T = 0.25 for a 15-min period).
$$d = d_1(PF) + d_2 + d_3 \qquad (6.2)$$
where
$d$ = control delay per vehicle;
$d_1$ = uniform control delay assuming uniform arrival (s/veh);
$PF$ = uniform delay progression adjustment factor, which accounts for the effects of signal
progression;
$d_2$ = incremental delay to account for the effect of random arrival and oversaturation queue;
$d_3$ = initial queue delay, which accounts for the delay to all vehicles in the analysis
period due to the initial queue.
Detailed expressions for the above parameters can be found in HCM 2000 [1].
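As an illustration of Equation (6.1), the following sketch evaluates the control delay for a single movement at an unsignalized intersection. The function name and the example flow and capacity are hypothetical, and the code is only a direct transcription of the formula rather than a full HCM procedure.

```python
import math

def unsignalized_control_delay(v_x, c_mx, T=0.25):
    """Control delay (s/veh) for movement x following Equation (6.1).

    v_x: flow rate (veh/h); c_mx: movement capacity (veh/h); T: analysis period (h).
    """
    ratio = v_x / c_mx          # volume-to-capacity ratio of the movement
    service = 3600.0 / c_mx     # average service time per vehicle (s)
    root = math.sqrt((ratio - 1.0) ** 2 + (service * ratio) / (450.0 * T))
    return service + 900.0 * T * (ratio - 1.0 + root) + 5.0

# Hypothetical movement: 300 veh/h demand against 600 veh/h capacity.
delay = unsignalized_control_delay(300.0, 600.0)
print(f"control delay = {delay:.1f} s/veh")
```

The delay grows slowly at low flow and rises sharply as the flow rate approaches the movement capacity, which is the behaviour a node-based assignment needs to capture.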
6.3 Link performance function
Link performance functions describe the relationship between traffic flow and travel
time/cost on the road. The most widely applied formula for the average travel time on a
link is the Bureau of Public Roads (BPR) function [2]:
$$t = t_f\left[1 + \alpha\left(\frac{V}{C}\right)^{\beta}\right] \qquad (6.3)$$
where
$t$ = average travel time on the link;
$t_f$ = free-flow travel time of the link;
$V$ = flow on the link (vehicles/hour);
$C$ = capacity of the link on an hourly basis (vehicles/hour);
$\alpha$, $\beta$ = parameters that capture the quality of the traffic flow on the link.

Because the BPR performance function is not asymptotic to any capacity value,
the following link performance function (6.4) was proposed by Davidson, based on
queuing theory considerations.
$$t = t_f\left(1 + J\,\frac{V}{C - V}\right) \qquad (6.4)$$
where J is the parameter that controls the shape of the curve, and the other parameters are
defined as in the BPR function.
As long as V is well below C, flow on the link is relatively free and there is little
congestion. In the BPR function, α scales the delay and β controls how sharply travel time
rises as the flow-to-capacity ratio approaches and exceeds 1.
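The two link performance functions, Equations (6.3) and (6.4), can be compared with a small sketch. The defaults alpha = 0.15 and beta = 4 are the values commonly quoted for the BPR function, while J = 0.1 and the example link are hypothetical; none of these parameters were calibrated in this report.

```python
def bpr_travel_time(t_f, volume, capacity, alpha=0.15, beta=4.0):
    """BPR link travel time, Equation (6.3)."""
    return t_f * (1.0 + alpha * (volume / capacity) ** beta)

def davidson_travel_time(t_f, volume, capacity, j=0.1):
    """Davidson link travel time, Equation (6.4); only defined for volume < capacity."""
    if volume >= capacity:
        raise ValueError("Davidson's function requires volume < capacity")
    return t_f * (1.0 + j * volume / (capacity - volume))

# Hypothetical link: 10 min free-flow travel time, capacity of 1000 veh/h.
t_f, capacity = 10.0, 1000.0
for volume in (250.0, 500.0, 900.0):
    bpr = bpr_travel_time(t_f, volume, capacity)
    davidson = davidson_travel_time(t_f, volume, capacity)
    print(f"V = {volume:.0f}: BPR = {bpr:.2f} min, Davidson = {davidson:.2f} min")
```

The BPR function stays finite at and beyond capacity, whereas Davidson's function grows without bound as V approaches C, which is why the sketch rejects volumes at or above capacity.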
6.4 Relative gap for measuring
The relative gap is defined as the difference between the cost of the current user
equilibrium solution and the cost of the all-or-nothing solution, divided by the cost of the
current user equilibrium solution. The selection of a maximum relative gap should account
for the level of accuracy and the computational effort required for the traffic assignment.
Using a large relative gap has the benefit of requiring far less computation, as fewer
iterations are required. However, the results produced are very coarse and are therefore
unlikely to reflect reality accurately.
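The definition above can be written as a short sketch. It assumes that both costs are evaluated with the link travel times produced by the current flows; the function names, the 0.01 default threshold and the two-link example are hypothetical.

```python
def relative_gap(current_flows, aon_flows, link_times):
    """Relative gap = (current total cost - all-or-nothing total cost) / current total cost.

    All three sequences are indexed by link; link_times are the travel times
    produced by the current flows.
    """
    current_cost = sum(x * t for x, t in zip(current_flows, link_times))
    aon_cost = sum(y * t for y, t in zip(aon_flows, link_times))
    return (current_cost - aon_cost) / current_cost

def has_converged(gap, max_relative_gap=0.01):
    """Stop iterating once the gap falls to the chosen maximum relative gap."""
    return gap <= max_relative_gap

# Hypothetical two-link example: 150 trips, the all-or-nothing load uses link 0 only.
gap = relative_gap([100.0, 50.0], [150.0, 0.0], [10.0, 20.0])
print(f"relative gap = {gap:.2f}, converged = {has_converged(gap)}")
```

A looser threshold lets the assignment stop after fewer iterations, at the price of flows that are further from equilibrium.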
7. Conclusion and perspective
In this report, the whole process of four-step urban transportation demand modeling is
performed, with different models developed to carry out trip generation, trip
distribution, and modal split, together with a discussion of traffic assignment.
In the trip generation section, linear regression and categorical models are developed
separately for trip production and attraction, as well as for the different trip purposes.
The models are compared to one another using statistical measures and intuitive evaluation.
Generally, linear regression models were selected as they perform better than categorical
models.
In the trip distribution part, we developed four trip distribution models, i.e., two for
HBW and two for HBS. The data were analysed using two types of gravity models with
different deterrence functions (traditional, exponential, and combined) to describe the
impact of generalized costs on trip-making behaviour. The HBW and HBS models used
different generalized cost functions depending upon the assumed mode split. The model
coefficients were optimized with the Golden Section Search method, i.e., by searching for
the values that result in the lowest RMSE. The exponential model was selected since it
attained a lower RMSE.

In the modal split analysis, we use both multinomial and nested logit models to carry out
the modal split for HBW, HBD, and HBS trips as well as for the whole observed trip data.
The models were compared using their log-likelihood and Rho-squared values. Overall, the
multinomial logit models described the probabilities of people choosing different modes
better than the nested logit models, except for HBW trips, where a nested model was
finally selected.
Different link performance and node performance functions were discussed for traffic
assignment. If more detailed data on the road network and link performance become
available, the traffic assignment step can be analysed in more depth.
All in all, we go through the whole process of traditional four-step urban transportation
demand modeling, i.e., trip generation, trip distribution, modal split, and traffic assignment.
Nowadays, with the advancement of machine learning models and big data techniques,
many data-driven methods have arisen for urban transportation demand modeling,
and some of them perform very well. However, we believe that the four-step modeling
method should not be dismissed, and that there would be a bright future if the four-step
model could be upgraded through the joint exploration of machine learning and big
data tools.

References
[1] Special Report 209: Highway Capacity Manual. TRB, National Research Council,
Washington, D.C., 2000.
[2] BPR (1964) Traffic Assignment Manual: Bureau of Public Roads, U.S. Department of
Commerce, Washington, D.C.
[3] L. Fu, Modelling Travel Demand for Urban Transportation Planning. University of
Waterloo: Department of Civil Engineering, 2016, p. 31.

A.1 Linear Regression Production Models
A.1.1 HBW Zonal Production Model
[Figure: HBW Zonal Production Model. Scatter plots of HBW trips (y) against each candidate variable (x), with fitted trend lines:]
Full Time Employees: y = 0.666x + 0.5988, R² = 0.9522
Total Employees: y = 0.0027x + 16.095, R² = 0.0104
Zonal Population: y = 0.0127x + 0.1756, R² = 0.8032
Part Time Employees: y = 2.357x + 1.554, R² = 0.7923
Total Number of Households: y = 0.7087x - 1.1349, R² = 0.8661

Regression Statistics
Multiple R | 0.983514786
R Square | 0.967301334
Adjusted R Square | 0.966967674
Standard Error | 3.454928086
Observations | 397

ANOVA
Source | df | SS | MS | F | Significance F
Regression | 4 | 138419.0422 | 34604.76055 | 2899.064143 | 1.3427E-289
Residual | 392 | 4679.119007 | 11.93652808 | |
Total | 396 | 143098.1612 | | |

Variable | Coefficients | Standard Error | t Stat | P-value | Lower 95%
Intercept | -0.43405683 | 0.253737807 | -1.71065099 | 0.087936726 | -0.93291401
Zonal Population (ZP) | -0.00010155 | 0.000336617 | -0.30168605 | 0.763051377 | -0.00076335
Full Time Employees (FTE) | 0.493115147 | 0.016533273 | 29.82562241 | 7.0882E-103 | 0.460610169
Part Time Employees (PTE) | 0.447629311 | 0.055160107 | 8.115091456 | 6.32706E-15 | 0.339182659
No. of Households (HHLDS) | 0.096713007 | 0.021697866 | 4.457258911 | 1.08507E-05 | 0.054054262
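The regression output above can be applied to a zone as in the following sketch, which evaluates the fitted HBW production equation. The coefficients are those reported in the table; the function name and the example zone are hypothetical.

```python
# Coefficients of the HBW zonal production model, as reported in the regression output.
COEFFICIENTS = {
    "intercept": -0.43405683,
    "ZP": -0.00010155,     # zonal population
    "FTE": 0.493115147,    # full time employees
    "PTE": 0.447629311,    # part time employees
    "HHLDS": 0.096713007,  # number of households
}

def predict_hbw_productions(zp, fte, pte, hhlds, coef=COEFFICIENTS):
    """Predicted HBW trip productions for one zone."""
    return (coef["intercept"] + coef["ZP"] * zp + coef["FTE"] * fte
            + coef["PTE"] * pte + coef["HHLDS"] * hhlds)

# Hypothetical zone: 1000 residents, 100 full-time and 20 part-time employees, 50 households.
trips = predict_hbw_productions(zp=1000, fte=100, pte=20, hhlds=50)
print(f"predicted HBW productions = {trips:.2f}")
```

In this fit the ZP term has a p-value of about 0.76, so it contributes almost nothing to the prediction compared with the employment and household terms.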
