Summer Research Program UNSW
Data-driven Approach to Demand Modelling in Transport
Professor: Chen Cai
Collaborator: Hoang Nguyen
Authors: Luiza Anselmo Olinto Pavao Xavier and Marinna Pereira Pivatto
20 February 2015
Abstract
We studied the fundamentals of travel demand forecasting so that we could apply
this knowledge to help GoGet improve its service. Our work was based on analysing
GoGet data to make it useful for improving their services. Finding a suitable
mathematical model gave us the number of GoGet trips that a person with given
characteristics will make in a month. Using this, the company can estimate how many
cars it should have available.
1. Problem of the research
Nowadays, travel demand forecasts are often unreliable. They are frequently
wrong, either because the data used is not correct or because the method of
calculation is not right.
However, travel demand is important information for all transport
companies, because it is a way to estimate the number of trips that will be made in
an area at some future point in time. Forecasting starts with trip generation,
which is the number of trips that will be made. This is influenced by factors such
as the number of cars, the number of workers and the number of households.
We apply this knowledge of travel forecasting to GoGet in order to study the
demand for its service. GoGet is a car sharing service that began in 2003 in
Australia.
The company needs to improve its calculation of how many spots it should
have in each area. For this study we use the database collected from customers'
registration forms. Some data is therefore available, but the raw data alone
does not mean anything for the company. The challenge is to
transform this data into information that GoGet can use to improve its
service.
2. Solution
To solve our problem, we used the trip generation formula from the document
provided by Chen Cai. The document was an important reference that presents the main
definitions in the transport area.
We selected the example of a household-based model to calculate the number of trips
that one person with a specific job category can generate.
Y = 0.91 + 1.44 X_1 + 1.07 X_2
where Y is the number of trips per household,
X_1 is the number of workers per household,
X_2 is the number of cars per household.
This linear regression model assumes that there is a relationship between the
independent variables (workers and car ownership) and the dependent variable (trips per
household).
We adapted this model to our problem by taking Y as the number of trips per job
category. Our inputs were:
X_1 = 1
X_2 = average car ownership in each job category
For example:
A director has an average of 1.2 cars, so the number of trips for this category is:
Y = 0.91 + 1.44*1 + 1.07*1.2
Y = 3.63
One director will generate on average about 3.6, that is, almost 4 trips per day.
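The director calculation can be checked with a few lines of Python. This is only a sketch of the formula quoted above; the function and argument names are ours and do not come from the original report.

```python
def trips_per_household(workers: float, cars: float) -> float:
    """Household-based trip generation: Y = 0.91 + 1.44*X1 + 1.07*X2."""
    return 0.91 + 1.44 * workers + 1.07 * cars


# Job-category adaptation from the text: X1 = 1, X2 = average car ownership.
director_trips = trips_per_household(workers=1, cars=1.2)
print(f"{director_trips:.2f}")  # prints 3.63
```

Other job categories follow the same pattern by changing only the `cars` argument.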
Knowing that, we analysed the GoGet data and related it to this result.
The data showed us how often directors (we continue using directors as an
example) use GoGet and also the number of trips they make per month. Using these
inputs we were able to calculate the probability that a director chooses GoGet.
Y_total = Y*F*30
where Y_total is the total monthly trips per category,
Y is the number of trips per job category,
F is the frequency (how often one job category uses GoGet),
30 is the number of days per month.
So the probability of choosing GoGet is:
P(x) = Y / Y_total
By identifying the probability we were able to calculate how many trips one director
will make using GoGet.
G = Y*30*P(x)
where G is the number of trips using GoGet,
P(x) is the probability of choosing GoGet.
For example:
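The authors' worked numbers are not available at this point, so the following Python sketch stands in for them. It applies the three formulas exactly as stated. Y is taken from the director example, while the frequency F = 0.5 is a made-up value chosen only for illustration and is not taken from the GoGet data. The function name and the dictionary keys are ours.

```python
DAYS_PER_MONTH = 30


def goget_monthly_trips(y: float, f: float) -> dict:
    """Apply the three formulas from the text, in order."""
    if f <= 0:
        raise ValueError("frequency F must be positive")
    y_total = y * f * DAYS_PER_MONTH   # Y_total = Y*F*30
    p = y / y_total                    # P(x) = Y / Y_total
    g = y * DAYS_PER_MONTH * p         # G = Y*30*P(x)
    return {"y_total": y_total, "p": p, "g": g}


# Y from the director example; F = 0.5 is hypothetical, not a GoGet figure.
result = goget_monthly_trips(y=0.91 + 1.44 * 1 + 1.07 * 1.2, f=0.5)
print(result)
```

As written, P(x) simplifies to 1/(30*F), so it stays at or below 1 only when F is at least 1/30.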
After that, we needed a model to identify how many GoGet trips a person with given
characteristics such as age, income and car ownership will probably make in one
month. To solve this problem we analysed the relationship between our independent
variables (age, car ownership and income) and our dependent variable (trips using GoGet).
The analysis led to a multinomial logistic regression model, because our dependent
variable has a limited number of possible values.
3. Results
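Using the code provided by MATLAB for multinomial logistic regression:
% Predictors: average age, car ownership and income per day
X = [Avgage CarOwnership IncoDay];
% Bin the monthly trips Y into four ordered categories using the edges 0..4
prob = ordinal(Y,{'1','2','3','4'},[],[0 1 2 3 4]);
% Fit an ordinal model with separate coefficients for each category
[B,dev,stats] = mnrfit(X,prob,'model','ordinal','Interactions','on')
% Evaluate the three fitted equations for the first observation
i = 1;
x = exp(B(1,1) + B(2,1)*Avgage(i) + B(3,1)*CarOwnership(i) + B(4,1)*IncoDay(i))
y = exp(B(1,2) + B(2,2)*Avgage(i) + B(3,2)*CarOwnership(i) + B(4,2)*IncoDay(i))
z = exp(B(1,3) + B(2,3)*Avgage(i) + B(3,3)*CarOwnership(i) + B(4,3)*IncoDay(i))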
We found the following results:
B =
12.8314 10.3285 14.3284
-0.1505 -0.1821 -0.0366
-17.6728 -4.7472 -19.1064
0.0237 0.0045 0.0215
x =
0.1091
y =
0.2579
z =
4.5673
It means that the three equations of our model will be:
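The equations themselves are not reproduced at this point. A plausible reconstruction, assuming they are the x, y and z expressions from the code with the fitted columns of B substituted in, is given below. The symbols A (average age), C (car ownership) and I (income per day) are introduced here for brevity and are not the authors' notation.

```latex
\begin{align*}
x &= \exp(12.8314 - 0.1505\,A - 17.6728\,C + 0.0237\,I) \\
y &= \exp(10.3285 - 0.1821\,A - 4.7472\,C + 0.0045\,I) \\
z &= \exp(14.3284 - 0.0366\,A - 19.1064\,C + 0.0215\,I)
\end{align*}
```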
The code is an example of how many trips a director can make in one month.
Analysing the results for x, y and z, we conclude that this category will travel using GoGet
3 or more times per month, because x and y are less than 1. We also checked the reliability
of the result using the t and p statistics shown below; seven of the twelve coefficients
have p-values below 0.05.
>> stats.t
ans =
3.1928 3.0076 2.4068
-1.6417 -2.4583 -0.3832
-3.2159 -0.9729 -2.0562
2.8027 0.6066 1.7588
>> stats.p
ans =
0.0014 0.0026 0.0161
0.1007 0.0140 0.7016
0.0013 0.3306 0.0398
0.0051 0.5441 0.0786
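For readers who prefer probabilities to odds, the following Python sketch converts the printed x, y and z into one probability per trip category. It assumes that x, y and z are cumulative odds, P(category <= j) / P(category > j), which is how MATLAB documents its ordinal (cumulative logit) model. It also assumes that the bin edges [0 1 2 3 4] map the four categories to 0, 1, 2 and 3 or more trips. This is the standard conversion under those assumptions, not the authors' own procedure, and the function name is ours.

```python
def category_probabilities(cumulative_odds):
    """Convert cumulative odds P(cat <= j) / P(cat > j) into per-category probabilities."""
    cumulative = [o / (1.0 + o) for o in cumulative_odds]  # P(cat <= j)
    cumulative.append(1.0)  # the last category closes the total at 1
    probs, previous = [], 0.0
    for c in cumulative:
        probs.append(c - previous)  # P(cat = j) = P(cat <= j) - P(cat <= j-1)
        previous = c
    return probs


# x, y, z as printed in the results for the director example.
probs = category_probabilities([0.1091, 0.2579, 4.5673])
for label, p in zip(["0 trips", "1 trip", "2 trips", "3+ trips"], probs):
    print(f"{label}: {p:.3f}")
```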
To help the interpretation of the results, we made a program using Visual Basic.
[Figure: the Visual Basic code]
[Figure: the dashboard used to input the independent variables and check the results]
Our model can end up with four options for the number of trips made per month.
The analysed category can make zero, one, two, or three or more trips. The result depends
on age, income per day and car ownership. We set the range of results by analysing
the data that we had. The maximum number of trips was four and the minimum was zero. We did
not use four as a separate maximum category because only a few cases in the data ended up
with four trips. In the future the data may change and perhaps more than five trips will be
made, so to refine the model it will be necessary to increase the number of boundaries in the
MATLAB code.
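The effect of the boundaries can be sketched in Python. The edges and labels are the ones from the MATLAB call. The binning rule (each bin includes its left edge, and the top bin also includes the right-most edge) is our assumption about how the categories were formed, and the function name is ours.

```python
from bisect import bisect_right

EDGES = [0, 1, 2, 3, 4]        # bin edges from the MATLAB call
LABELS = ["1", "2", "3", "4"]  # category labels from the MATLAB call


def trip_category(trips, edges=EDGES, labels=LABELS):
    """Return the label of the bin [edges[k], edges[k+1]) containing `trips`."""
    if not edges[0] <= trips <= edges[-1]:
        raise ValueError(f"{trips} is outside the bin edges")
    k = bisect_right(edges, trips) - 1
    return labels[min(k, len(labels) - 1)]  # top bin also takes the right-most edge


print([trip_category(t) for t in [0, 1, 2, 3, 4]])  # prints ['1', '2', '3', '4', '4']

# Adding a boundary: one more edge and one more label give a five-category model.
print(trip_category(5, edges=EDGES + [5], labels=LABELS + ["5"]))  # prints 5
```

Under this rule three and four trips share the top category, which matches the decision not to treat four trips as a category of its own.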
4. Implications
The results of the model will be important for improving GoGet services. Knowing how
many trips a person with given characteristics (age, income per day and car
ownership) is expected to make in a month, it will be easier to calculate the demand and
also the number of spots needed in a given area.
Furthermore, these results could be extended to analyse the data of one specific
zone, so that the company can calculate the number of trips in each zone and the
opportunity to introduce GoGet in new regions. The model will thus provide important demand
information for the company, which it can use to better serve customers.
