Presentation: Data Collection and GTFS


  1. Introduction to Data Collection & General Transit Feed Specification. University of Thessaly, 2015. GreenYourMove Project. With the contribution of the LIFE programme of the European Union - LIFE14 ENV/GR/000611
  2. The Big Picture
  3. How are GTFS feeds produced?
  4. Contents of this presentation
     - Definitions and terminology of:
       - Stops, Routes, Trips, Layers/Shapefiles, CRS, layer features, CSV files
       - OpenStreetMap
       - General Transit Feed Specification (GTFS)
       - QuantumGIS (QGIS) & OpenLayers Plugin
       - PostgreSQL, PostGIS & Shapefile and DBF Loader Exporter
       - shp2GTFS
     - Basic usage of:
       - QGIS
       - PostgreSQL & PostGIS
       - shp2GTFS
     - Conclusion and final results
  5. Definitions and Terminology
     - Stop: An object defined by a geographic point on Earth and a name/ID, usually served by a public transit route.
     - Route: A sequence of stops served one after another by a specific vehicle. A route consists of geospatial objects with IDs and is time-independent.
     - Trip: A time-dependent execution of a route. Trips are grouped by the route they belong to.
     - Layer/Shapefile: Two interconnected concepts. A shapefile is a GIS data format that describes vector features such as points, lines and polygons. When shapefiles are opened in QGIS, they are represented as layers on the map.
     - CRS: A Coordinate Reference System is a system used to locate geographical entities. We will mostly use WGS84 and WGS84/Pseudo Mercator.
     - Layer features: In QGIS we will create layers containing points and lines, which fall under the generic category of layer features.
     - CSV file: A Comma-Separated Values file is a text file in which every line represents a data record. Each record consists of one or more fields, separated by commas.
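
To make the two coordinate reference systems mentioned above more concrete, here is a minimal sketch of the standard spherical formula that maps WGS84 longitude/latitude (EPSG:4326) to WGS84/Pseudo Mercator (EPSG:3857). The function name and the sample coordinates are our own illustrative assumptions; in practice QGIS or PostGIS performs this reprojection for us.

```python
import math

# WGS84 semi-major axis in metres; Pseudo Mercator treats the Earth as a
# sphere of this radius.
EARTH_RADIUS_M = 6378137.0


def wgs84_to_pseudo_mercator(lon_deg, lat_deg):
    """Convert WGS84 lon/lat in degrees to Pseudo Mercator x/y in metres."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(
        math.tan(math.pi / 4 + math.radians(lat_deg) / 2)
    )
    return x, y


# Illustrative coordinates only (hypothetical stop location).
x, y = wgs84_to_pseudo_mercator(22.95, 39.36)
print(round(x), round(y))
```

The same stop therefore has very different numeric coordinates depending on the CRS of the layer, which is why the CRS of every layer has to be known before layers are combined.
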
  6. Definitions and Terminology: OpenStreetMap – website
     Definition: As stated on their website: “OpenStreetMap is a free geographic data. OpenStreetMap built by volunteers largely from scratch and released with an open-content license.”
     Why do we need it? It provides us with datasets about the road network. In most cases the raw data from OpenStreetMap needs to be edited in some way before it can be used in an application.
  7. Definitions and Terminology: General Transit Feed Specification (GTFS) – website
     Definition: As stated on their website: “The General Transit Feed Specification (GTFS) defines a common format for public transportation schedules and associated geographic information.”
     Why do we need it? It is the data format we will use to represent the transit network. It is the mainstream approach for schedule-based datasets. Its development is supported by big organizations, and the amount of data available in this format is far larger than in any other.
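
As a small illustration of what a GTFS file looks like, the sketch below writes a `stops.txt` table with Python's `csv` module. The four columns (`stop_id`, `stop_name`, `stop_lat`, `stop_lon`) are fields defined by the GTFS specification; the stop names and coordinates are made-up sample values, and the helper `write_stops` is a hypothetical name, not part of any existing tool.

```python
import csv
import io

# Column names taken from the GTFS definition of stops.txt.
STOP_FIELDS = ["stop_id", "stop_name", "stop_lat", "stop_lon"]


def write_stops(stops, out):
    """Write an iterable of stop dictionaries as a GTFS stops.txt table."""
    writer = csv.DictWriter(out, fieldnames=STOP_FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(stops)


# Made-up sample stops, for illustration only.
sample_stops = [
    {"stop_id": "S1", "stop_name": "Central Station",
     "stop_lat": 39.361, "stop_lon": 22.942},
    {"stop_id": "S2", "stop_name": "University",
     "stop_lat": 39.358, "stop_lon": 22.951},
]

buffer = io.StringIO()
write_stops(sample_stops, buffer)
print(buffer.getvalue())
```

A complete feed is a set of such CSV files (for example `routes.txt`, `trips.txt` and `stop_times.txt`) packaged together.
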
  8. Definitions and Terminology: QuantumGIS (QGIS) – website
     Definition: It is an open-source Geographic Information System (GIS) used for viewing, editing and analyzing geographic data.
     Why do we need it? We use QGIS for data collection. By exploiting the schedule-based nature of a transit network, we create layers that contain all the geographical information of a route. Assuming that we already know the longitude, latitude and ID of each station in a route, we can create a layer containing the stops of that route. Next, we use QGIS to create another layer with the actual path the vehicle follows. The output files are shapefiles. When we open shapefiles with a GIS system, they are represented as layers.
  9. Definitions and Terminology: PostgreSQL, PostGIS & Shapefile and DBF Loader Exporter
     Definition: PostgreSQL is an open-source, object-relational database system, while PostGIS is an extension for that database which adds support for geographic objects and allows running SQL queries on geographic content. Shapefile and DBF Loader Exporter is a simple tool for loading files into the database.
     Why do we need them? The database is where we store our data. Furthermore, it allows us to run queries to analyze and edit our data. The database and Python's modules are the two main tools that we used in our script shp2GTFS.
  10. Definitions and Terminology: shp2GTFS – not available on website yet.
      Definition: It is a Python-based script we created that enables us to transform shapefiles into a GTFS feed with less effort than any suggested approach that we are aware of.
      Why do we need it? We need to convert the data we gathered with QGIS into GTFS feeds. No tool available online let us combine the simplicity of gathering data with QGIS with a transformation of the produced layers/shapefiles into a GTFS feed.
  11. Basic Usage - QGIS
      The interface (screenshot annotations):
      - Basic toolbar: contains most of the tools we will need and use.
      - Layers box: shows the layers we have opened on our canvas.
      - Lon, Lat & CRS information
      - Canvas
  12. Basic Usage - QGIS
      From the basic toolbar we will need:
      - Add feature (Point or Line)
      - Open a map layer (plugin)
      - Select and deselect a feature
  13. Basic Usage - QGIS: About the plugin and the creation of a layer
  14. Basic Usage - QGIS
      Note that:
      - When we create and save a shapefile to a folder on our computer, several other files are created with it. We only care about the shapefile, but we want to keep the other files too, because they contain information about CRS, encoding, etc. It is enough to open and edit only the shapefile; the other files change accordingly.
      - We always have to open and work on an OpenStreetMap map layer, and not use other services like Bing Maps or Google Maps.
  15. Basic Usage – PostgreSQL - PostGIS
      We use the software through a graphical user interface called pgAdmin 3. [Image: pgAdmin's icon]
  16. Basic Usage – PostgreSQL - PostGIS
      To load files into the database we use a plugin. After entering the username, password and database name, we can connect and load shapefiles into the database.
  17. Basic Usage – PostgreSQL - PostGIS
      Note that: When installing the database software we enter a username and password. We used the same username and password on all of the PCs we used. Also, before using the loader software you should add the location of the “bin” folder in the PostgreSQL installation folder to the “Path” system variable. (The path on my computer is: C:\Program Files\PostgreSQL\9.3\bin)
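
A quick way to verify the “Path” setting described above is a small check like the following. This is only a sketch: `is_on_path` is a hypothetical helper of ours, and the folder is the example location from the slide, which will differ on other machines.

```python
import os


def is_on_path(directory, path_value=None):
    """Return True if `directory` is one of the entries of the PATH variable."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    # Normalise both sides so that trailing separators (and, on Windows,
    # letter case) do not cause false negatives.
    target = os.path.normcase(os.path.normpath(directory))
    return any(
        os.path.normcase(os.path.normpath(entry)) == target
        for entry in path_value.split(os.pathsep)
        if entry
    )


# Example location from the slide; adjust it to the local installation.
POSTGRES_BIN = r"C:\Program Files\PostgreSQL\9.3\bin"

if is_on_path(POSTGRES_BIN):
    print("PostgreSQL bin folder is on Path")
else:
    print("PostgreSQL bin folder is missing from Path")
```
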
  18. Basic Usage – shp2GTFS
      shp2GTFS is a tool we created and is still in the development phase. Its usage is based around keyfiles. Keyfiles are CSV files that contain information about the timetable of the trips and other logic that needs to be included in a GTFS feed. In practice, the script connects to the database using a Python module, retrieves the data we ask for, and combines them with the information in the keyfiles to produce the GTFS files one by one.
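
The idea of combining keyfile timetables with route data can be sketched as follows. This is not the shp2GTFS code: the keyfile layout (`trip_id`, `route_id`, `first_departure`), the `ROUTE_STOPS` dictionary standing in for the result of a database query, and all function names are hypothetical assumptions made for illustration. Only the output columns follow the GTFS definition of `stop_times.txt`.

```python
import csv
import io

# Hypothetical keyfile: one row per trip, with its route and first departure.
KEYFILE = """trip_id,route_id,first_departure
T1,R1,08:00:00
T2,R1,08:30:00
"""

# Hypothetical stand-in for the database query result: the ordered stops of
# each route, with the travel time in minutes from the first stop.
ROUTE_STOPS = {"R1": [("S1", 0), ("S2", 4), ("S3", 9)]}


def add_minutes(hhmmss, minutes):
    """Add minutes to an HH:MM:SS time; hours may exceed 23, as GTFS allows."""
    hours, mins, secs = (int(part) for part in hhmmss.split(":"))
    total = hours * 3600 + mins * 60 + secs + minutes * 60
    return f"{total // 3600:02d}:{total % 3600 // 60:02d}:{total % 60:02d}"


def build_stop_times(keyfile_text, route_stops):
    """Expand every trip in the keyfile into GTFS stop_times rows."""
    rows = []
    for trip in csv.DictReader(io.StringIO(keyfile_text)):
        stops = route_stops[trip["route_id"]]
        for sequence, (stop_id, offset) in enumerate(stops, start=1):
            time_at_stop = add_minutes(trip["first_departure"], offset)
            rows.append({
                "trip_id": trip["trip_id"],
                "arrival_time": time_at_stop,
                "departure_time": time_at_stop,
                "stop_id": stop_id,
                "stop_sequence": sequence,
            })
    return rows


stop_times = build_stop_times(KEYFILE, ROUTE_STOPS)
for row in stop_times:
    print(row)
```

Keeping the timetable in a keyfile and the geometry in the database mirrors the split described above: routes are time-independent, while trips add the time dimension.
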
  19. Basic Usage – shp2GTFS
      Note that: shp2GTFS can be run from the Windows Command Prompt and from Cygwin. Before using shp2GTFS you should set the system variable “client_encoding” to “UTF-8”. All input and output paths in the script need to be changed before using it on another computer, unless all computers have the exact same folder setup.
  20. Thank you for your attention! For more information on our project, visit: Or contact us at: