Using FME 2016 and
GTFS Datasets
Nick Ison
April 6, 2016
About Me
BCIT GIS Student
Advanced Diploma, 2016
University of Victoria
BA Geography, GIS/Geomatics Concentration, 2013
Workplace
Practicum
Jan-May, 2016
GTFS Datasets – What are they?
Quite simply –
a common format and
specification for public
transportation schedules
and their associated
geographic information
GTFS Structure
• Collection of 6-13 CSV files (.txt),
packaged into a single ZIP file
• Each CSV is a table of relevant
components of a transit system’s
scheduling, stops, routes and related
attributes
Schema:
http://lin-ear-th-inking.blogspot.ca/2011/09/data-model-diagrams-for-gtfs.html
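As a rough illustration of the structure described above (CSV tables stored as .txt files inside one ZIP), here is a minimal Python sketch that loads every table in a feed into memory. The function names and the sample rows are hypothetical and made up for this example; a real feed would be downloaded from a transit agency.

```python
import csv
import io
import zipfile


def read_gtfs_tables(zip_bytes):
    """Return {file name: list of row dicts} for every .txt table in a GTFS ZIP."""
    tables = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            if not name.endswith(".txt"):
                continue
            with archive.open(name) as raw:
                # utf-8-sig drops the byte-order mark that some feeds include.
                text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
                tables[name] = list(csv.DictReader(text))
    return tables


def build_sample_feed():
    """Build a tiny in-memory feed (made-up rows) so the sketch runs offline."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("routes.txt", "route_id,route_short_name\nR1,25\n")
        archive.writestr(
            "stops.txt",
            "stop_id,stop_name,stop_lat,stop_lon\nS1,Main St,49.26,-123.10\n",
        )
    return buffer.getvalue()


feed = read_gtfs_tables(build_sample_feed())
print(sorted(feed))
print(feed["routes.txt"][0]["route_short_name"])
```

Every value comes back as a string, so coordinates and sequence numbers need converting before use.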
Required Components
• routes.txt
• Transit routes. A route is a group of trips that are
displayed to riders as a single service.
• Toronto has 197 routes on all modes of transit
• stops.txt
• Individual locations where vehicles pick up or drop off
passengers.
• Chicago has 11449 transit stops in the CTA
Required Components
• trips.txt
• Trips for each route. A trip is a sequence of two or more
stops that occurs at a specific time.
• NYC Subways make 19425 trips per week
• stop_times.txt
• Times that a vehicle arrives at and departs from
individual stops for each trip.
• TransLink buses, trains and boats make 2.4 million scheduled stops/week
Required Components
• agency.txt
• One or more transit agencies that provide the data in
this feed.
• Vancouver has three, TransLink/CMBC, WCE and BCRTC
• calendar.txt
• Dates for service IDs using a weekly schedule. Specify
when service starts and ends, as well as days of the week
where service is available.
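To make the calendar.txt rule concrete, here is a small Python sketch that checks whether a service ID runs on a given date. It assumes the standard calendar.txt columns (monday to sunday flags, start_date and end_date as YYYYMMDD); the function names and the sample row are hypothetical, and exceptions listed in calendar_dates.txt are deliberately ignored.

```python
from datetime import date, datetime

# Column names in calendar.txt, in the order used by date.weekday() (Monday = 0).
WEEKDAY_COLUMNS = [
    "monday", "tuesday", "wednesday", "thursday",
    "friday", "saturday", "sunday",
]


def parse_gtfs_date(value):
    """GTFS stores dates as YYYYMMDD strings."""
    return datetime.strptime(value, "%Y%m%d").date()


def service_runs_on(calendar_row, day):
    """True if the service is inside its date range and flagged for that weekday."""
    start = parse_gtfs_date(calendar_row["start_date"])
    end = parse_gtfs_date(calendar_row["end_date"])
    if not start <= day <= end:
        return False
    return calendar_row[WEEKDAY_COLUMNS[day.weekday()]] == "1"


# Made-up weekday-only service for the first half of 2016.
weekday_service = {
    "service_id": "WKDY",
    "monday": "1", "tuesday": "1", "wednesday": "1", "thursday": "1",
    "friday": "1", "saturday": "0", "sunday": "0",
    "start_date": "20160101", "end_date": "20160630",
}

print(service_runs_on(weekday_service, date(2016, 4, 6)))  # a Wednesday
print(service_runs_on(weekday_service, date(2016, 4, 9)))  # a Saturday
```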
Common Optional Components
• shapes.txt
• Rules for drawing lines on a map to represent a transit
organization's routes.
• TransLink services travel 1229 different routes per week on 246 unique routes, e.g.
25 Brentwood Stn, 25 UBC, 25 Granville, 25 BCIT
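The shapes.txt rows are individual points, so drawing a line means grouping them by shape_id and ordering them by shape_pt_sequence. The Python sketch below does that and emits GeoJSON; it assumes the standard shapes.txt column names, and the function name and sample coordinates are hypothetical. It is not how FME does it, only an illustration of the idea.

```python
import json
from collections import defaultdict


def shapes_to_geojson(shape_rows):
    """Group shapes.txt rows by shape_id and build one LineString per shape."""
    points = defaultdict(list)
    for row in shape_rows:
        points[row["shape_id"]].append(
            (
                int(row["shape_pt_sequence"]),
                float(row["shape_pt_lon"]),
                float(row["shape_pt_lat"]),
            )
        )

    features = []
    for shape_id in sorted(points):
        # Sorting the tuples orders the points by their sequence number.
        ordered = sorted(points[shape_id])
        features.append(
            {
                "type": "Feature",
                "properties": {"shape_id": shape_id},
                "geometry": {
                    "type": "LineString",
                    # GeoJSON positions are [longitude, latitude].
                    "coordinates": [[lon, lat] for _, lon, lat in ordered],
                },
            }
        )
    return {"type": "FeatureCollection", "features": features}


# Made-up rows, intentionally out of order.
sample_rows = [
    {"shape_id": "A", "shape_pt_sequence": "2", "shape_pt_lat": "49.27", "shape_pt_lon": "-123.11"},
    {"shape_id": "A", "shape_pt_sequence": "1", "shape_pt_lat": "49.26", "shape_pt_lon": "-123.10"},
]

collection = shapes_to_geojson(sample_rows)
print(json.dumps(collection, indent=2))
```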
Lots of Information in Lots of Files!
What's the problem?
• Format can be difficult for the average person to access
and interpret
• Especially if looking for information involving transit data from
multiple operators
• Route Maps as KML
• Tabular Schedules in Excel
• Stop locations in GeoJSON
• Service Extent in Shapefile
• Learning Opportunity!
How can FME Help?
• FME 2016 introduced some shiny new features
that made my life a lot easier, saved time and cut my
workspace complexity in half
• Direct GTFS Format Support
• MapboxStyler
• FeatureWriter
• AttributeManager❤
Project Considerations
• Needed to be able to serve data quickly
• ‘Student’ budget
• Processing files can be expensive and time
consuming if done on the fly, so pre-processing is
essential
• Historical record accessibility
TransitDatabase.com
• Web portal to view and download desired information
• Multiple datasets
• Multiple formats (GeoJSON, SHP, KML, XLS, etc.)
• Multiple versions of the GTFS datasets
Still quite a work in progress, but the shell of the site is up
and running now
System
Overview
Route
Details
How it all works…
Overall Process
Download
GTFS from
Transit Agency
Run Main GTFS
Workspace
FTP resulting
output files to
web server
Check to see
if new file
found
This process is run overnight,
every night using FME Cloud
Average runtime for one new
GTFS file is between 2 and 20
minutes (incl. FTP uploads)
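The slides do not say how a "new file" is detected, so the following Python sketch is only one possible approach: compare a hash of the downloaded ZIP against hashes already processed, and run the workspace only when it is unseen. All names here are hypothetical, and the `process` callback stands in for running the main GTFS workspace and uploading its outputs.

```python
import hashlib


def feed_fingerprint(zip_bytes):
    """Identify a feed version by the SHA-256 hash of its ZIP contents."""
    return hashlib.sha256(zip_bytes).hexdigest()


def nightly_check(zip_bytes, seen_fingerprints, process):
    """Run `process` only for feed versions that have not been seen before."""
    fingerprint = feed_fingerprint(zip_bytes)
    if fingerprint in seen_fingerprints:
        return False  # nothing new tonight
    process(zip_bytes)
    seen_fingerprints.add(fingerprint)
    return True


# Simulate three nightly runs: the same feed twice, then an updated one.
processed = []
seen = set()
results = [
    nightly_check(b"feed version 1", seen, processed.append),
    nightly_check(b"feed version 1", seen, processed.append),
    nightly_check(b"feed version 2", seen, processed.append),
]
print(results)
```

In a real deployment the set of seen fingerprints would need to be stored somewhere persistent between runs.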
Runner Workflow
GTFS Handling
Read Files
InlineQuerier
Information
Extraction
Formatting/
Styling Output to
Desired
Format(s)
GTFS Handling Workflow
InlineQuerier
FME makes this easy!
Next Steps?
Website
• Better way of
implementing the maps
– GeoJSON and Mapbox are
limited to simple styling
FME Workflows
• Integrate with GTFS data
feeds to only run when
updates are found
• Many, many more outputs
and formats
Thank you!
Nick Ison
nickison@gmail.com
http://nickison.ca

Using FME and GTFS datasets to run TransitDatabase.com

Editor's Notes

  • #3 IT background
  • #4 Regular interaction. GTFS started out as a side project of a Google employee, who worked together with TriMet in Portland, OR to create an interchange format for their internal data. Portland became the first city to be featured in the first version of Google's "Transit Trip Planner". The format was released as the Google Transit Feed Specification and later renamed General Transit Feed Specification to emphasize community involvement in the project.
  • #6 This boils down to 6 major components.
  • #7 Stops has lat/long
  • #10 Additional optional tables include: Fare Attributes and Rules (zones, prices); Frequencies, for routes that don't operate on a set schedule (the subway in Toronto comes every 2 minutes M-F, 6:30am to 9:00am); Transfer/Connection Rules; and General Feed Metadata (versions, effective/expiry details, publisher). Usage of these varies from system to system, and some operators add additional files outside of the specification too.
  • #12 Basic gist is that there is lots of information in lots of files, but they share some fundamental commonalities between different sources. Using all these pieces together, you can start to look at the bigger picture of the transit infrastructure of an area.
  • #13 Learning Project 4-5 years ago, CRD Project Competition
  • #16 Simple web hosting package, PostgreSQL database hosted on AWS. PHP5, a little JavaScript. Plug Owen.
  • #22 Everything is really quite simple and straightforward. Lots of manhandling attribute data and styling outputs for various formats. Still needs performance tweaking: removing unnecessary fields returned, attributes, etc.
  • #24 This is the brain of the operation: query the SQLite DB for the relevant information. Probably 10x faster than using various combinations of FeatureMergers.
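
To give a feel for the kind of SQL the InlineQuerier note refers to, here is a minimal Python sketch using the standard sqlite3 module rather than FME. It loads made-up routes and trips rows into an in-memory SQLite database and joins them in one query. The table layout follows routes.txt and trips.txt; the function name and the query itself are hypothetical and are not taken from the actual workspace.

```python
import sqlite3


def trips_per_route(route_rows, trip_rows):
    """Count trips for each route with a single SQL join."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE routes (route_id TEXT, route_short_name TEXT);
        CREATE TABLE trips (trip_id TEXT, route_id TEXT);
        """
    )
    conn.executemany("INSERT INTO routes VALUES (?, ?)", route_rows)
    conn.executemany("INSERT INTO trips VALUES (?, ?)", trip_rows)
    rows = conn.execute(
        """
        SELECT r.route_short_name, COUNT(t.trip_id) AS trip_count
        FROM routes AS r
        LEFT JOIN trips AS t ON t.route_id = r.route_id
        GROUP BY r.route_id, r.route_short_name
        ORDER BY r.route_short_name
        """
    ).fetchall()
    conn.close()
    return rows


# Made-up sample data: route 25 has two trips, route 99 has none.
routes = [("R25", "25"), ("R99", "99")]
trips = [("T1", "R25"), ("T2", "R25")]

summary = trips_per_route(routes, trips)
print(summary)
```

A single join like this replaces what would otherwise be several merge steps chained together, which is the comparison the note makes.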