Managing models in the age of Open Data

Peter Davidson & Anabelle Spinoulas

Published in: Education

  1. MANAGING MODELS IN THE AGE OF OPEN DATA
  2. Key topics to cover
     • The use of spatial databases
     • Underlying principles
     • Model data – infrastructure networks + network options
     • The 4S model
     • Conclusions
  3. Why does this matter?
     • We are no longer the custodians of data – we are more like curators (collate and contextualise)
     • Need to be able to incorporate updates
     • Hierarchy of models – share across levels
     • Community of modellers – share between modellers and platforms
     • Importance of auditing, licensing and change management
     • Less tedious, more fun!
  4. GENERAL DATA MANAGEMENT
  5. How Data is Stored – Traditional
     • Developed without reference to recent developments in data management
     • Stored in proprietary formats
     • File based data/consolidated data bank
     • Data stored outside of the model (e.g. GIS) – often deeply nested folders of files on a network share
  6. How Data is Stored – Improved
     • Well established approach – Relational Database Management System (RDBMS)
     • Commercial and open source packages
     • Large data sets, spatial
     • Standardised access and analysis – SQL
     • Integrate with other systems (GIS, stats, custom)
     • Shared access, security and access control
  7. Guiding principles
     • Robustness principle – Fuzzy not brittle
       ○ Be conservative in what you do, be liberal in what you accept from others
     • Separate data and processing
       ○ BAD: Excel, GOOD: Database Queries
     • Data normalisation – never repeat data
       ○ [Every] non-key [attribute] must provide a fact about the key, the whole key, and nothing but the key
     • Unified data – Everything goes in the database
     • Metadata – Data about data
       ○ Source, context, limitations
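The normalisation and "separate data and processing" principles can be illustrated with a small database sketch. This is a minimal example using Python's built-in `sqlite3` module, not the authors' schema: the table names (`road_class`, `link`), the column names and the sample rows are all hypothetical. The default speed is stored once per road class, the `source` column carries simple metadata, and free-flow travel time is derived by a query rather than stored on every link.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a small normalised schema.

    All table names, column names and rows are hypothetical examples.
    """
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE road_class (
            class_id          INTEGER PRIMARY KEY,
            name              TEXT NOT NULL UNIQUE,
            default_speed_kmh REAL NOT NULL   -- stored once, never repeated per link
        );
        CREATE TABLE link (
            link_id   INTEGER PRIMARY KEY,
            road_name TEXT,
            class_id  INTEGER NOT NULL REFERENCES road_class(class_id),
            length_km REAL NOT NULL,
            source    TEXT NOT NULL            -- metadata: where the row came from
        );
    """)
    con.executemany("INSERT INTO road_class VALUES (?, ?, ?)",
                    [(1, "motorway", 100.0), (2, "arterial", 60.0)])
    con.executemany("INSERT INTO link VALUES (?, ?, ?, ?, ?)",
                    [(10, "M1", 1, 5.0, "osm"),
                     (11, "High St", 2, 1.5, "osm"),
                     (12, "Main Rd", 2, 3.0, "govt_centreline")])
    return con


def free_flow_minutes(con):
    """Derive travel time with a query instead of storing it on each link."""
    rows = con.execute("""
        SELECT l.link_id, 60.0 * l.length_km / c.default_speed_kmh
        FROM link AS l JOIN road_class AS c USING (class_id)
        ORDER BY l.link_id
    """).fetchall()
    return dict(rows)
```

Because the speed lives in one row of `road_class`, a single `UPDATE` on that row changes the derived time of every arterial link the next time the query runs, with no copies to keep in sync.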
  8. NETWORKS
  9. Data Sources
     • Govt street centreline data
       ○ Freely available but limited
     • Commercial products
       ○ Full routing information
       ○ License issues with derivative works
     • Crowd sourced (OpenStreetMap)
       ○ Road networks, points of interest, commercial centres, schools, airports, parking and many other elements
       ○ Good quality – but some missing/inconsistent
       ○ Can fix errors/omissions
  10. Network Geometry and Connectivity
     Traditional approach:
     • Series of links and nodes
     • Anode, Bnode and fixed number of attributes
     • Semi-automatic/semi-manual process that creates a new stand-alone artifact
     Weaknesses:
     • Cannot distinguish defaults from overrides
     • Breaks links to original data sources
     • Hard to bring in updates from external sources
     • Difficult to unify changes (node number conflicts)
  11. The Goal
     • No manual processing in network creation
     • Repeatable, automatic process
     • Share process not all data
     • Fast enough to run every time
     • “Fuzzy” enough that it can still work even if there are changes to the underlying spatial data
     • No node numbers!
  12. Creating a network from GIS layers
     • Two ways of viewing a network
       ○ Geographically (polylines in GIS)
       ○ Topologically (links + nodes in transport model)
     • Conversion between these views
     • Network connectivity from spatial join
       ○ Cannot use exact coincident points: sensitive to minor changes
       ○ Not too fuzzy or else incorrect topology
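The conversion from the geographic view to the topological view can be sketched as a tolerance-based endpoint join. The following is a minimal pure-Python illustration, not the authors' method: the function name `build_topology` and the `tolerance` parameter are assumptions, and a real spatial database would use a spatial index rather than the linear scan shown here.

```python
import math


def build_topology(polylines, tolerance):
    """Turn polylines into (a_node, b_node) links by snapping endpoints.

    Endpoints within `tolerance` of an existing node share that node.
    Node ids are list positions regenerated on every run, so nothing
    depends on stored node numbers. (Hypothetical helper.)
    """
    nodes = []  # representative (x, y) of each node
    links = []

    def node_for(pt):
        # Linear scan for clarity; a spatial index would replace this.
        for i, (nx, ny) in enumerate(nodes):
            if math.hypot(pt[0] - nx, pt[1] - ny) <= tolerance:
                return i
        nodes.append(pt)
        return len(nodes) - 1

    for line in polylines:
        links.append((node_for(line[0]), node_for(line[-1])))
    return nodes, links


# Three polylines: the second starts 2 cm away from the end of the first,
# the third starts 0.5 units away from the end of the second.
lines = [
    [(0, 0), (5, 0), (10, 0)],
    [(10.02, 0.01), (10, 10)],
    [(10, 10.5), (20, 10)],
]
_, exact_links = build_topology(lines, tolerance=0.0)   # nothing connects
_, fuzzy_links = build_topology(lines, tolerance=0.1)   # first two connect
_, loose_links = build_topology(lines, tolerance=1.0)   # all three connect
```

With a zero tolerance the small digitising offset leaves every polyline disconnected, while the largest tolerance also joins the third polyline, which may or may not be correct topology. Choosing the tolerance is the trade-off the slide describes.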
  13. Network Connection Points
  14. Adding more detailed information
     • Need to add detail to source data (lanes, capacities)
     • Common approaches – both break connection
       ○ Edit source data
       ○ Make model network and then edit
  15. Our Approach – “Link Transitions”
     • A point with a bearing (unit vector)
     • Specify start or end of an attribute change
  16. Directional Points - Link Transitions
     • Bearing allows direction
     • Better identification when position data is ambiguous - location + bearing eliminates most ambiguities
     • Remaining problems can be identified and solved through more careful coding
     • Works with named roads – consistent with the way that we think about roads
     • Robust when network changes – coordinates, added or removed links
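One way to see why location plus bearing removes most ambiguity is to match a directional point against candidate directed links. The sketch below is an illustration under stated assumptions rather than the authors' algorithm: bearings are expressed in compass degrees instead of a unit vector, links are straight segments, and the names `match_transition`, `max_dist` and `max_angle` are hypothetical.

```python
import math


def bearing_deg(p, q):
    """Compass-style bearing from p to q in degrees (0 = +y, 90 = +x)."""
    return math.degrees(math.atan2(q[0] - p[0], q[1] - p[1])) % 360.0


def angle_diff(a, b):
    """Smallest absolute difference between two bearings, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def point_segment_distance(pt, a, b):
    """Distance from pt to the straight segment a-b."""
    ax, ay = a
    bx, by = b
    px, py = pt
    dx, dy = bx - ax, by - ay
    seg_len2 = dx * dx + dy * dy
    if seg_len2 == 0:
        t = 0.0
    else:
        t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len2))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def match_transition(point, bearing, links, max_dist, max_angle):
    """Return the id of the directed link that best fits point + bearing.

    `links` maps a link id to its (start, end) coordinates. Returns None
    when no link is both close enough and aligned enough.
    """
    best = None
    for link_id, (a, b) in links.items():
        dist = point_segment_distance(point, a, b)
        turn = angle_diff(bearing, bearing_deg(a, b))
        if dist <= max_dist and turn <= max_angle:
            key = (dist, turn)
            if best is None or key < best[0]:
                best = (key, link_id)
    return None if best is None else best[1]


# Two opposing directions on the same alignment plus a crossing road.
links = {
    "EB": ((0, 0), (100, 0)),
    "WB": ((100, 0), (0, 0)),
    "NB": ((50, -50), (50, 50)),
}
```

A point at `(50, 1)` is equally close to `EB` and `WB` and sits on top of `NB`, so position alone cannot pick a link. Adding a bearing of 90, 270 or 0 degrees selects `EB`, `WB` or `NB` respectively, and the match survives small shifts in the link coordinates.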
  17. Link Transitions
  18. Option Coding
     • Option Links
     • Option Connection Points
     • Option Link Transitions
     • Option Nodes
     • All have OptionCode
     • Scenarios have hierarchical sets of OptionCodes
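The slide does not say how the hierarchy of OptionCodes is resolved, so the following is only one plausible reading: each scenario names a parent and adds its own codes, and a record is active when its `OptionCode` appears anywhere up the chain. The scenario names, the codes and both helper functions are hypothetical.

```python
def active_option_codes(scenario, scenarios):
    """Collect the OptionCodes of a scenario and all of its ancestors.

    `scenarios` maps a scenario name to (parent_name_or_None, own_codes).
    This inheritance rule is an assumption, not taken from the slides.
    """
    codes = set()
    seen = set()
    while scenario is not None:
        if scenario in seen:
            raise ValueError(f"cycle in scenario hierarchy at {scenario!r}")
        seen.add(scenario)
        parent, own_codes = scenarios[scenario]
        codes |= set(own_codes)
        scenario = parent
    return codes


def select_records(records, codes):
    """Keep the records whose OptionCode is active for the scenario."""
    return [r for r in records if r["OptionCode"] in codes]


scenarios = {
    "base": (None, {"BASE"}),
    "2031": ("base", {"BYPASS"}),
    "2031_toll": ("2031", {"TOLL"}),
}
option_links = [
    {"name": "Existing Hwy", "OptionCode": "BASE"},
    {"name": "Town Bypass", "OptionCode": "BYPASS"},
    {"name": "Toll Tunnel", "OptionCode": "TOLL"},
]
links_2031 = select_records(option_links, active_option_codes("2031", scenarios))
```

The same filter would apply unchanged to option connection points, link transitions and nodes, since the slide states that all of them carry an OptionCode.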
  19. THE 4S MODEL
  20. 4S Structure
     • Diagram labels: SIMULTANEOUS; EXPLICIT RANDOM UTILITY
     • Stochastic:
       ○ Monte Carlo methods to draw values from probability distributions
       ○ Random variable parameters
       ○ Number of slices can be varied
     • Segmented:
       ○ Comprehensive breakdown of travel markets (20 private + 40 CV segments)
       ○ Behavioural parameters vary by market segment
     • Slice:
       ○ Takes slices of the travel market across model area and through probability distributions
       ○ Very efficient – detailed networks, large models
     • Simulation:
       ○ Uses state-machine with very flexible transition rules
       ○ Simulates all aspects of travel choice
       ○ Complex public transport
       ○ Multimodal freight
       ○ Easily extended
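The "Stochastic" and "Slice" ideas can be illustrated with a toy Monte Carlo choice model. This is a generic random-utility sketch, not the 4S implementation: the lognormal value-of-time distribution, its parameters, the Gaussian error term and the function name `simulate_slices` are all assumptions made for illustration.

```python
import random


def simulate_slices(options, n_slices, seed=0):
    """Count how often each option is chosen across Monte Carlo slices.

    `options` maps a name to (travel_minutes, money_cost). Each slice draws
    its own value of time (a random parameter) and one random error per
    option, then picks the lowest generalised cost. All distributions and
    parameter values here are hypothetical.
    """
    rng = random.Random(seed)
    counts = {name: 0 for name in options}
    for _ in range(n_slices):
        value_of_time = rng.lognormvariate(3.0, 0.5)  # $/hour, assumed

        def cost(name):
            minutes, dollars = options[name]
            noise = rng.gauss(0.0, 1.0)  # explicit random utility term
            return dollars + value_of_time * minutes / 60.0 + noise

        counts[min(options, key=cost)] += 1
    return counts


options = {"toll_road": (20, 6.0), "free_road": (35, 0.0)}
shares = simulate_slices(options, n_slices=1000, seed=1)
```

Raising `n_slices` reduces the sampling noise in the shares at the price of run time, which mirrors the slide's point that the number of slices can be varied.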
  21. Key features of 4S model
     • No matrices, no skims, no zones, no centroid connectors
     • All travel is from node to node
     • Models constructed with MUCH less manual effort
     • Include all roads, all paths, timetabled transit
     • Population and employment from multiple sources
     • Multimodal with all modes assigned
     • Continuous time and simultaneous choice
     • Easily include any demand based effects and capacity constraints (not just roads and transit)
     • Much more detailed outputs (volumes by purpose)
  22. Australia wide model
     • All roads except local streets
     • Some timetabled PT
     • Walk/cycle
     • Commercial vehicles
     • Runs in under 2 hrs (500k links, 400k nodes)
  23. Detailed Australia model
     • All roads
     • Some timetabled PT
     • Walk/cycle
     • Commercial vehicles
     • Runs in under 8 hrs (2m links (2way), 1.5m nodes)
  24. NSW
  25. Central Sydney
  26. ACT
  27. Hobart
  28. Orange, NSW
  29. Great Britain
     • Excluding residential streets
     • 864k Links
     • 293,000 km
     • 3:19 hrs
  30. California
     • All Roads and paths
     • 1.9m Links
     • 509,000 km (316,000 mi)
     • 8:44 hrs
