Managing models in the age of Open Data

Key topics to cover
● The use of spatial databases
● Underlying principles
● Model data – infrastructure networks + network options
● The 4S model
● Conclusions

Why does this matter?
● We are no longer the custodians of data – we are more like curators (collate and contextualise)
● Need to be able to incorporate updates
● Hierarchy of models – share across levels
● Community of modellers – share between modellers and platforms
● Importance of auditing, licensing and change management
● Less tedious, more fun!

How Data is Stored – Traditional
● Developed without reference to recent developments in data management
● Stored in proprietary formats
● File-based data/consolidated data bank
● Data stored outside of the model (e.g. GIS) – often deeply nested folders of files on a network share

How Data is Stored – Improved
● Well-established approach – Relational Database Management System (RDBMS)
● Commercial and open source packages
● Large data sets, spatial
● Standardised access and analysis – SQL
● Integrate with other systems (GIS, stats, custom)
● Shared access, security and access control

Guiding principles
● Robustness principle – fuzzy not brittle
○ Be conservative in what you do, be liberal in what you accept from others
● Separate data and processing
○ BAD: Excel, GOOD: database queries
● Data normalisation – never repeat data
○ [Every] non-key [attribute] must provide a fact about the key, the whole key, and nothing but the key
● Unified data – everything goes in the database
● Metadata – data about data
○ Source, context, limitations
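A minimal sketch of how these principles might look in practice, using Python's built-in sqlite3 module. The table and column names (data_source, road_class, road_link, speed_override_kmh) and all the values are hypothetical and chosen only for illustration. Default speeds are stored once per road class (normalisation), each link records where it came from (metadata), and the processing is a query rather than a copy of the data.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with hypothetical, normalised tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Metadata: source, context, limitations
        CREATE TABLE data_source (
            source_id   INTEGER PRIMARY KEY,
            name        TEXT NOT NULL,
            licence     TEXT,
            limitations TEXT
        );
        -- Defaults are stored once per road class, never repeated per link
        CREATE TABLE road_class (
            class_code        TEXT PRIMARY KEY,
            default_speed_kmh REAL NOT NULL
        );
        -- A NULL override means "use the class default"
        CREATE TABLE road_link (
            link_id            INTEGER PRIMARY KEY,
            road_name          TEXT,
            class_code         TEXT NOT NULL REFERENCES road_class(class_code),
            source_id          INTEGER NOT NULL REFERENCES data_source(source_id),
            speed_override_kmh REAL
        );
    """)
    conn.execute(
        "INSERT INTO data_source VALUES (1, 'OpenStreetMap extract', 'ODbL', "
        "'Some attributes missing')"
    )
    conn.executemany(
        "INSERT INTO road_class VALUES (?, ?)",
        [("motorway", 100.0), ("residential", 50.0)],
    )
    conn.executemany(
        "INSERT INTO road_link VALUES (?, ?, ?, ?, ?)",
        [
            (1, "Example Motorway", "motorway", 1, None),   # uses the default
            (2, "Example Street", "residential", 1, 40.0),  # explicit override
        ],
    )
    return conn


def link_speeds(conn):
    """Resolve each link's speed with a query, flagging explicit overrides."""
    return conn.execute("""
        SELECT l.link_id,
               COALESCE(l.speed_override_kmh, c.default_speed_kmh) AS speed_kmh,
               l.speed_override_kmh IS NOT NULL AS is_override
        FROM road_link AS l
        JOIN road_class AS c USING (class_code)
        ORDER BY l.link_id
    """).fetchall()
```

Because the override column stays NULL unless someone sets it, defaults and overrides remain distinguishable, and changing a class default updates every link that relies on it.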

Data Sources
● Government street centreline data
○ Freely available but limited
● Commercial products
○ Full routing information
○ Licence issues with derivative works
● Crowd-sourced (OpenStreetMap)
○ Road networks, points of interest, commercial centres, schools, airports, parking and many other elements
○ Good quality – but some data missing/inconsistent
○ Can fix errors/omissions

Network Geometry and Connectivity
Traditional approach:
● Series of links and nodes
● Anode, Bnode and fixed number of attributes
● Semi-automatic/semi-manual process that creates a new stand-alone artifact
Weaknesses:
● Cannot distinguish defaults from overrides
● Breaks links to original data sources
● Hard to bring in updates from external sources
● Difficult to unify changes (node number conflicts)

The Goal
● No manual processing in network creation
● Repeatable, automatic process
● Share process, not all data
● Fast enough to run every time
● "Fuzzy" enough that it can still work even if there are changes to the underlying spatial data
● No node numbers!

Creating a network from GIS layers
● Two ways of viewing a network
○ Geographically (polylines in GIS)
○ Topologically (links + nodes in transport model)
● Conversion between these views
○ Network connectivity from spatial join
○ Cannot use exact coincident points – sensitive to minor changes
○ Not too fuzzy or else incorrect topology
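The conversion from polylines to links and nodes can be illustrated with a small sketch. It assumes polylines are lists of (x, y) tuples in a projected coordinate system and uses a brute-force search; the function name build_topology and the tolerance values are hypothetical, and a production version would more likely run as a spatial join inside the database. The node indices here are transient, regenerated on every run, and are not stored identifiers.

```python
import math


def build_topology(polylines, tolerance):
    """Derive (nodes, links) from polylines by joining endpoints that lie
    within `tolerance` of an existing node. Brute force, for illustration."""
    nodes = []  # representative (x, y) for each node
    links = []  # (from_node_index, to_node_index) for each polyline

    def node_for(point):
        # Reuse the first node found within tolerance, else create a new one
        for index, (nx, ny) in enumerate(nodes):
            if math.hypot(point[0] - nx, point[1] - ny) <= tolerance:
                return index
        nodes.append(point)
        return len(nodes) - 1

    for line in polylines:
        links.append((node_for(line[0]), node_for(line[-1])))
    return nodes, links


# Hypothetical data: the second line starts 5 mm away from the end of the first
demo_lines = [
    [(0.0, 0.0), (10.0, 0.0)],
    [(10.004, 0.003), (10.0, 10.0)],
    [(0.0, 5.0), (20.0, 5.0)],
]
```

With these lines, a tolerance of 0 leaves the first two lines disconnected, a tolerance of 0.01 joins them, and a tolerance of 6 wrongly merges the start of the third line with the start of the first, which is the "too fuzzy" failure.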

Adding more detailed information
● Need to add detail to source data (lanes, capacities)
● Common approaches – both break the connection to the source data
○ Edit source data
○ Make model network and then edit

Our Approach – “Link Transitions”
● A point with a bearing (unit vector)
● Specifies the start or end of an attribute change

Directional Points – Link Transitions
● Bearing gives the point a direction
● Better identification when position data is ambiguous – location + bearing eliminates most ambiguities
● Remaining problems can be identified and solved through more careful coding
● Works with named roads – consistent with the way that we think about roads
● Robust when network changes – coordinates, added or removed links
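One way a location plus a bearing could resolve an ambiguous position is sketched below. This is an illustration under stated assumptions, not the presenters' implementation: links are simplified to straight directed segments, and the function names, the max_dist search radius and the min_cos alignment threshold are all hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    # Project p onto the segment and clamp to its ends
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))


def match_transition(point, bearing, links, max_dist, min_cos=0.7):
    """Return the name of the nearest link whose direction agrees with the
    unit-vector bearing, or None if no link qualifies.

    links: iterable of (name, start_xy, end_xy) directed segments.
    """
    best = None
    for name, a, b in links:
        dx, dy = b[0] - a[0], b[1] - a[1]
        length = math.hypot(dx, dy)
        if length == 0:
            continue
        # Cosine of the angle between link direction and bearing
        cos_angle = (dx * bearing[0] + dy * bearing[1]) / length
        if cos_angle < min_cos:
            continue
        dist = point_segment_distance(point, a, b)
        if dist <= max_dist and (best is None or dist < best[0]):
            best = (dist, name)
    return None if best is None else best[1]


# Hypothetical crossing: position alone cannot tell these three links apart
demo_links = [
    ("Main St eastbound", (0.0, 0.0), (100.0, 0.0)),
    ("Main St westbound", (100.0, 0.0), (0.0, 0.0)),
    ("Cross Rd northbound", (50.0, -50.0), (50.0, 50.0)),
]
```

A point at (50.5, 1.0) sits within about a metre of all three demo links, so the bearing decides the match. Because the match is recomputed from geometry each time, it does not depend on link or node identifiers surviving a network update.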

Option Coding
● Option Links
● Option Connection Points
● Option Link Transitions
● Option Nodes
● All have OptionCode
● Scenarios have hierarchical sets of OptionCodes
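A small sketch of how hierarchical sets of OptionCodes might be resolved. It assumes, as one possible reading of the slide, that a scenario inherits every OptionCode of its parent scenario. The scenario names, option codes and function names are hypothetical.

```python
def resolve_option_codes(scenario, scenarios):
    """Collect the OptionCodes of a scenario and all of its ancestors.

    scenarios: dict mapping name -> (parent_name_or_None, set_of_own_codes).
    """
    codes = set()
    visited = set()
    while scenario is not None:
        if scenario in visited:
            raise ValueError(f"Cycle in scenario hierarchy at {scenario!r}")
        visited.add(scenario)
        parent, own_codes = scenarios[scenario]
        codes |= own_codes
        scenario = parent
    return codes


def active_records(records, codes):
    """Keep only option records (links, nodes, link transitions, ...)
    whose OptionCode is switched on for the scenario."""
    return [record for record in records if record["option_code"] in codes]


# Hypothetical scenario tree and option links
demo_scenarios = {
    "base": (None, {"BASE"}),
    "2031": ("base", {"BYPASS"}),
    "2031_high": ("2031", {"WIDENING"}),
}
demo_option_links = [
    {"name": "Existing road", "option_code": "BASE"},
    {"name": "Town bypass", "option_code": "BYPASS"},
    {"name": "Extra lane", "option_code": "WIDENING"},
    {"name": "Tunnel", "option_code": "TUNNEL"},
]
```

Since every option table carries the same OptionCode column, one filter of this kind could be applied uniformly to option links, connection points, link transitions and nodes.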

4S Structure
Overall: simultaneous, explicit random utility
Stochastic:
● Monte Carlo methods to draw values from probability distributions
● Random variable parameters
● Number of slices can be varied
Segmented:
● Comprehensive breakdown of travel markets (20 private + 40 CV segments)
● Behavioural parameters vary by market segment
Slice:
● Takes slices of the travel market
○ across model area
○ through probability distributions
● Very efficient – detailed networks, large models
Simulation:
● Uses state-machine with very flexible transition rules
● Simulates all aspects of travel choice
● Complex public transport
● Multimodal freight
● Easily extended
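The idea of drawing random parameters per slice can be illustrated with a toy example. This is not the 4S implementation: the two-route choice, the lognormal value-of-time distribution, its parameters and the function name simulate_slices are all assumptions made for illustration. Each slice carries an equal share of demand, draws its own value of time, and picks the route with the lowest generalised cost.

```python
import math
import random


def simulate_slices(demand, routes, n_slices, seed=0):
    """Split demand into n_slices equal slices; each slice draws a value of
    time from a lognormal distribution and takes its cheapest route.

    routes: list of (name, travel_time_minutes, toll_dollars).
    Returns a dict of route name -> assigned demand.
    """
    rng = random.Random(seed)  # seeded so that runs are repeatable
    flows = {name: 0.0 for name, _, _ in routes}
    share = demand / n_slices
    for _ in range(n_slices):
        # Assumed distribution: median value of time of 20 dollars per hour
        value_of_time = rng.lognormvariate(math.log(20.0), 0.5)
        best = min(
            routes,
            key=lambda route: route[1] / 60.0 * value_of_time + route[2],
        )
        flows[best[0]] += share
    return flows


# Hypothetical choice between a faster tolled route and a slower free route
demo_routes = [("toll road", 20.0, 5.0), ("free road", 35.0, 0.0)]
```

Raising n_slices samples the distribution more finely at the cost of run time, which is the trade-off behind a variable number of slices.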

Key features of 4S model
● No matrices, no skims, no zones, no centroid connectors
● All travel is from node to node
● Models constructed with MUCH less manual effort
● Include all roads, all paths, timetabled transit
● Population and employment from multiple sources
● Multimodal with all modes assigned
● Continuous time and simultaneous choice
● Easily include any demand-based effects and capacity constraints (not just roads and transit)
● Much more detailed outputs (volumes by purpose)

Australia wide model
All roads except local streets
Some timetabled PT
Walk/cycle
Commercial vehicles
Runs in under 2 hrs (500k links, 400k nodes)

Detailed Australia model
All roads
Some timetabled PT
Walk/cycle
Commercial vehicles
Runs in under 8 hrs (2m links (2way), 1.5m nodes)

Great Britain
Excluding residential streets
864k links
293,000 km
3:19 hrs

California
All roads and paths
1.9m links
509,000 km (316,000 mi)
8:44 hrs
