Presentation at the European transport conference 2019 (Dublin);
also presented at the 6th International Conference on Models and Technologies for Intelligent Transportation Systems (MT-ITS) Krakow, Poland (2019).
Accompanying paper: https://doi.org/10.1109/MTITS.2019.8883333
Big data fusion and parametrization for strategic transport models
1. Big data fusion and parametrization for strategic transport models
Luuk Brederode – DAT.Mobility (speaker)
Mark Pots – DAT.Mobility
Ruben Fransen – TNO
Jan-Tino Brethouwer
Dublin, 2019-10-11
Previously presented at MT-ITS 2019, Krakow; the paper will be published by IEEE
2. Contents
1. Research question and motivation
2. Proposed approach
3. Solution methodology
4. Application: fusing mobile phone data in Rotterdam / The Hague region
5. Conclusions and further research
Slide 2/24
3.
4. Motivation: new data sources on mode/destination
All these data sources describe observed mode and destination choices

Data source type | Aggregation level for destination choice | Aggregation level for mode choice
Survey data (travel diaries) | Trip length frequencies per trip purpose | Modal split per trip-length bin
Mobile phone data (GSM, GPS) | OD patterns | (currently) train vs non-train
ANPR data | OD patterns | -
Bluetooth data | OD patterns | -
OD matrices from other/old transport model | OD patterns | Modal split per cell
Data on parking occupancies (garages, scan cars) | Number of departures/arrivals | Only car
WiFi sniffers | Number of departures/arrivals | Only car
IoT | ? | ?
Drones | ? | ?
… | ? | ?
5. Sub-research questions solved along the way
Data fusion (replication of the current situation)
1. How can we detect and remove inconsistencies in/between data sources?
2. How do we weigh and normalize different data sources?
3. How do we cope with the data fusion problem being (highly) underspecified?
Parameter estimation (forecasting)
4. How can we simultaneously use all data sources for parameter estimation?
5. How can we reach consistency in assumptions imposed in the fusion and estimation steps?
9. Proposed approach: more accurate parameters
Parameters based on all data
Parameters based on survey + scenario data
10. Proposed approach: consistency with application context
GSM/ANPR/… data fused taking (network-)
modelled context into account
GSM/ANPR/… data appended ignoring
(network) modelled context
11. Proposed approach: consistency in assumptions
Consistency in assumptions for fusion and
estimation
Inconsistency in assumptions between fusion
and estimation
Figure: gravity model used in fusion, estimation and application (consistent) vs. gravity model combined with correction factors (inconsistent)
12.
13. Data fusion
Solution: multi-proportional gravity model solved using the IPFP (iterative proportional fitting procedure)*
• All data sources are handled as equality or inequality constraints
• No weighting or normalization required
• Inconsistencies between data sources are detected as violated combinations of constraints
• We've generalized the model to also handle inequality constraints
• When it converges, the gravity model is guaranteed to find the solution corresponding to conditions of maximum entropy
• i.e. it finds the matrices that are most likely to occur
• i.e. its solution is unique, which removes the underspecification
*see section III.a of the paper for details
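The paper's multi-proportional model with inequality constraints is more general than can be shown here, but the core IPFP loop can be sketched in the bi-proportional (Furness) case. This is a minimal illustration, not the paper's implementation; the function name and the toy margins are assumptions. Note how the per-iteration adjustment factors that the loop tracks are exactly the diagnostic used on the next slide:

```python
import numpy as np

def furness(seed, row_targets, col_targets, max_iter=100, tol=1e-6):
    """Bi-proportional IPFP (Furness): rescale a seed matrix until its
    row and column sums match the target margins."""
    T = seed.astype(float).copy()
    for it in range(1, max_iter + 1):
        row_factors = row_targets / T.sum(axis=1)   # one factor per origin
        T *= row_factors[:, None]
        col_factors = col_targets / T.sum(axis=0)   # one factor per destination
        T *= col_factors[None, :]
        # Monitoring the largest factor deviation from 1 per iteration is
        # what reveals inconsistent constraints (factors that never settle).
        max_dev = max(abs(row_factors - 1).max(), abs(col_factors - 1).max())
        if max_dev < tol:
            return T, it
    return T, max_iter

seed = np.ones((3, 3))                      # uniform prior = maximum-entropy seed
rows = np.array([10.0, 20.0, 30.0])         # observed productions (illustrative)
cols = np.array([15.0, 15.0, 30.0])         # observed attractions (illustrative)
T, iters = furness(seed, rows, cols)
```

Starting from a uniform seed keeps the result as close to maximum entropy as the constraints allow; with consistent margins, all adjustment factors converge to 1.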
14. Detection of inconsistencies in/between data sources
Figure: normative adjustment factor per iteration in the multi-proportional gravity model using 5 data sources
Inconsistencies in modal split, origins and GSM data
Consistent datasets
To remove data inconsistencies, constraints need to be relaxed by:
• Removing conflicting data points; or
• Transforming data (aggregating, making it relative, …); or
• Applying tolerances by transforming equality constraints into pairs of inequality constraints
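The third relaxation option can be sketched as a tolerance band around the target: the equality constraint becomes a pair of inequality constraints, and the adjustment factor only scales the matrix when the current aggregate falls outside the band. This is an illustrative sketch, not the paper's code; the function name and the 5% tolerance are assumptions:

```python
def relaxed_factor(current, target, tol=0.05):
    """Adjustment factor for an equality constraint relaxed into the
    inequality pair [target*(1-tol), target*(1+tol)]: scale only when
    the current aggregate lies outside the tolerance band."""
    lower, upper = target * (1 - tol), target * (1 + tol)
    if current < lower:
        return lower / current   # enforce the lower (>=) bound
    if current > upper:
        return upper / current   # enforce the upper (<=) bound
    return 1.0                   # within tolerance: no adjustment needed

within = relaxed_factor(97.0, 100.0)   # 97 lies inside the band [95, 105]
above = relaxed_factor(120.0, 100.0)   # 120 is scaled down onto the bound 105
```

Because satisfied constraints yield a factor of exactly 1, mildly conflicting data sources stop fighting each other and the IPFP can converge.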
15. Parameter estimation
Problem formulation:
Solution method: BFGS algorithm with damped Hessian update*
• Adjoint method for gradient calculation for problems with fewer than ~3300 centroids
• Finite differences for gradient approximation for problems with more than ~3300 centroids
*see section III.b of the paper for details
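The flavor of the estimation step can be illustrated with a toy problem: fit a deterrence parameter beta so that a gravity-type matrix reproduces a given (here, synthetic) fused OD matrix, using BFGS with a finite-difference gradient. This is a deliberately simplified sketch; the cost matrix, the single-parameter deterrence function and the squared-error objective are assumptions, and the paper's actual formulation embeds a tri-proportional gravity model with balancing factors:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
cost = rng.uniform(1.0, 10.0, size=(20, 20))   # hypothetical OD travel costs
target = np.exp(-0.4 * cost)                   # stand-in for the fused OD data

def objective(params):
    """Squared error between the gravity model's deterrence matrix and
    the fused data, as a function of the deterrence parameter beta."""
    model = np.exp(-params[0] * cost)
    return float(((model - target) ** 2).sum())

# BFGS with SciPy's default finite-difference gradient approximation,
# analogous to the approach used for the larger problems; x0 is arbitrary.
res = minimize(objective, x0=[1.0], method="BFGS")
beta_hat = res.x[0]
```

With many beta parameters, finite differences require one extra objective evaluation per parameter per iteration, which is why an adjoint (or backpropagated) gradient becomes attractive.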
16.
17. Four interns did the heavy lifting
Internship Ruben Fransen (DAT.Mobility / UTwente, 2015)
Internship Mark Pots (DAT.Mobility / UTwente, 2017)
Internship Jan-Tino Brethouwer (DAT.Mobility / UTwente, 2017)
Master's thesis Mark Pots (DAT.Mobility / UTwente, 2018)
• This led to a Matlab implementation of:
• A multi-proportional gravity model that can handle constraints on any aggregation level
• Inequality constraints (on top of the conventional equality constraints)
• An efficient parameter estimation method
• Interfacing with data formats from the OmniTRANS transport planning software
18. Fusing MP (mobile phone) data in the Rotterdam/The Hague model
• 7786 zones in the model
• 1261 zones in the MP data
Figure: MP data, survey data and base-year scenario data enter the data fusion step, producing fused OD matrices per mode
19. Fusing MP data in Rotterdam/The Hague model
Normative adjustment factors per iteration for:
• Reference: survey data + productions/attractions
• Survey and MP data + productions/attractions
• Survey and relative MP data + productions/attractions

Run | #iteration loops | max cell change | max adjustment factor deviation | calculation time
Reference | 43 | 10 | 0.79% | 11 minutes
Reference (25 it) | 25 | 36 | 8% | 7 minutes
MP data added (absolute) | did not converge | - | - | -
MP data added (relative) | 25 | 45 | 11% | 10 minutes

All reported calculation times were obtained on a laptop with a Core i7-8750H CPU and 16 GB of RAM
20.
21. Proposed solution methods
• Multi-proportional gravity model with IPFP for data fusion
• Extended to support inequality constraints alongside equality constraints
• BFGS with a tri-proportional gravity model for parameter estimation
• Bi-proportional gravity model for application
22. We solved all sub-research questions :p
1. How can we detect and remove inconsistencies in/between data sources?
By monitoring max/min adjustment factors per iteration
2. How do we weigh and normalize different data sources?
Avoided by treating all data as constraints
3. How do we cope with the data fusion problem being (highly) underspecified?
By imposing conditions of maximum entropy
4. How can we simultaneously use all data sources for parameter estimation?
First fuse, then estimate
5. How can we reach consistency in assumptions imposed in the fusion and estimation steps?
By using a gravity model for both fusion and estimation
23. Further research (thanks, reviewer #2!)
• Use backpropagation on a computational graph version of the multi-proportional gravity model
• This allows the gradient to be approximated with lower computational effort for models with many betas
• Done for the bi-proportional case; looking for a student to start work on the multi-proportional case
• Some other things… (see paper)
Result: the most likely OD matrices that satisfy the observed aggregate values for row totals, column totals, matrix totals, trip length bin totals, and the structure of the MP data at the level of the MP data zoning