Gerald Richter, Christian Rudloff, Anita Graser
Austrian Institute of Technology, Austria
Topic: “Extraction of bicycle commuter trips from day-long GPS trajectories”
1. Extraction of bicycle commuter trips
from day-long GPS trajectories
Cycling Data Challenge 2013
Leuven, Belgium
workshop presentation
Gerald Richter¹, Christian Rudloff¹, Anita Graser¹
¹Austrian Inst. of Technology – Mobility Dept. – Dynamic Transportation Systems
G. Richter | AIT | mobility | DTS May 14, 2013 1 / 19
2. The Austrian Institute of Technology
AIT – who we are and what we do
Austria’s largest non-university research institute
AIT: 5 departments focussing on applied research topics
• Energy
• Mobility, with business units:
  - Transportation Infrastructure Technologies
  - Dynamic Transportation Systems
  - Electric Drive Technologies
  - Light Metals Technologies Ranshofen
• Safety & Security
• Health & Environment
• Foresight & Policy Development
3. Dynamic Transportation Systems
“develop efficient, safe and cost-effective multimodal
transportation solutions for transportation networks, hubs and
services”
Airports / Train Stations
Shopping Centres / Events
Multi-Modal Transportation Networks
Transport Logistics
Crowd Dynamics, Traffic Flow Modelling, Dynamic Vehicle Routing
Data Collection, Data Analysis, Simulation / Prediction, Optimisation
4. GPS measurements
and some peculiarities
Proper GPS measurement requires 4 satellites
to be visible to the device.
Measurement is a stochastic process by nature.
Positional error is Gaussian distributed
under clear-view conditions.
Additional effects arise from obstructed view
(signal shadowing, reflection by obstacles).
• outliers: sudden change in signal reception
conditions
• drift: longer phases of signal impairment,
receiver-internal error correction walking a
misguided path.
Figure labels: snap-back, true path
5. The input data
. . . hence this initial situation
some points not out of this
world
some tracks far outside the
region of interest
most likely due to GPS
initialisation phase
– fixable by bounding box
clipping
Figure: detail UK
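The bounding box clipping mentioned above can be sketched in a few lines of Python. This is only an illustration: the point layout `(t, lat, lon)`, the function name `clip_to_bbox` and the box coordinates are assumptions made for the example and are not taken from the slides.

```python
# Minimal sketch of bounding box clipping for GPS fixes.
# Assumption: each fix is a tuple (t, lat, lon) with lat/lon in degrees.
# The box values below are placeholders, not the region used by the authors.
EXAMPLE_BBOX = (49.0, 3.0, 51.0, 5.0)  # (lat_min, lon_min, lat_max, lon_max)


def clip_to_bbox(fixes, bbox=EXAMPLE_BBOX):
    """Keep only the fixes that lie inside the bounding box (edges included)."""
    lat_min, lon_min, lat_max, lon_max = bbox
    return [
        (t, lat, lon)
        for (t, lat, lon) in fixes
        if lat_min <= lat <= lat_max and lon_min <= lon <= lon_max
    ]
```

A fix recorded at (0, 0) during receiver initialisation would be dropped by this filter, while fixes inside the region of interest pass through unchanged.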
6. A simple yet efficient approach
stages of processing
Cleaning
• Outliers and unlikely points in the data are removed,
i.e. some trajectory smoothness is ensured
• Data is split into trip trajectories between stops or
activities,
i.e. a journey’s segments are identified
Mode Detection
• A training set of data is used to identify decision criteria
within a manually chosen set of variables (trip parameters).
• With those criteria, the modes of trips are detected in order to
separate bike trips from other trips
Details can be found in [1, 2, 3]
7. Cleaning the data
Steps of the data cleaning algorithm
Outliers are removed according to
• geographic location: within bounding box around area of
interest
• accessibility: reachable at realistic speeds (here ≤ 50 m/s)
• GPS drifts: points before trajectory snap-backs are deleted
until the remaining trajectory only contains realistic speeds
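The two speed-based rules above can be sketched as follows. The 50 m/s limit is from the slide; everything else is an assumption for illustration: points are tuples `(t, lat, lon)`, distances use the haversine formula, and the function names are hypothetical. The slides do not say how the two rules are combined, so they are shown as separate passes.

```python
import math

V_MAX = 50.0  # m/s, speed limit quoted on the slide


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres on a sphere of radius 6371 km."""
    r = 6371000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def speed(a, b):
    """Speed in m/s between points (t, lat, lon); inf if time does not advance."""
    dt = b[0] - a[0]
    if dt <= 0:
        return math.inf
    return haversine_m(a[1], a[2], b[1], b[2]) / dt


def drop_unreachable(points, v_max=V_MAX):
    """Outlier rule: keep a point only if it is reachable from the last kept one."""
    kept = []
    for p in points:
        if not kept or speed(kept[-1], p) <= v_max:
            kept.append(p)
    return kept


def trim_before_snapback(points, v_max=V_MAX):
    """Drift rule: at an unrealistic jump, delete preceding points until the
    speed to the new point is realistic again."""
    kept = []
    for p in points:
        while kept and speed(kept[-1], p) > v_max:
            kept.pop()
        kept.append(p)
    return kept
```

The first pass treats the jumping point as the error, the second treats the points before the jump as the error. Which one applies depends on whether the receiver has drifted or a single fix is off, and that decision is not covered by this sketch.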
Stop detection and trip separation
• A stop is detected when the trajectory does not
leave a circle of radius 30 m for at least 5
minutes.
• GPS trajectories are cut into trips at stop
points (removal of tumbleweed)
• Next trip starts when trajectory leaves
circle
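The stop rule above (30 m radius, 5 minutes) can be sketched as below. Assumptions made for the example: points are tuples `(t, x, y)` with `t` in seconds and `x`, `y` in metres in some projected coordinate system, the circle is centred on the first point of the candidate stop, and the function name `split_trips` is hypothetical.

```python
import math


def split_trips(points, radius=30.0, min_stop=300.0):
    """Cut a trajectory into trips at stops.

    A stop starts at point i if all following points stay within `radius`
    metres of point i for at least `min_stop` seconds. The trip ends at the
    stop point, the dwell points are dropped, and the next trip starts with
    the first point outside the circle.
    """
    trips, current = [], []
    i, n = 0, len(points)
    while i < n:
        t0, x0, y0 = points[i]
        j = i
        # Extend j while the trajectory stays inside the circle around point i.
        while j + 1 < n and math.hypot(points[j + 1][1] - x0, points[j + 1][2] - y0) <= radius:
            j += 1
        current.append(points[i])
        if points[j][0] - t0 >= min_stop:
            if len(current) >= 2:  # ignore trips consisting of the stop point only
                trips.append(current)
            current = []
            i = j + 1
        else:
            i += 1
    if len(current) >= 2:
        trips.append(current)
    return trips
```

The scan is quadratic in the worst case, which is acceptable for day-long tracks but would need a smarter window for much longer recordings.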
8. Unlikely points
Tumbleweed also found at
shorter stops (e.g. traffic lights)
Removed by loop detection
(look ahead 3 minutes and
find very low effective
velocities to reach a
successive trajectory point
in given time interval)
All points in loop are
replaced by one middle
point between start and
end of loop.
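A rough sketch of this loop removal is given below. Only the 3-minute look-ahead and the replacement by a middle point come from the slide. The velocity threshold `v_eff_max`, the minimum time span `min_dt`, the point layout `(t, x, y)` in seconds and metres, and the function name are assumptions chosen for illustration.

```python
import math


def collapse_loops(points, lookahead=180.0, v_eff_max=0.5, min_dt=30.0):
    """Replace tumbleweed loops by a single middle point.

    For each point, look ahead up to `lookahead` seconds. If a later point is
    reached with an effective velocity (straight-line distance / elapsed time)
    below `v_eff_max` m/s, everything from the start to the last such point is
    treated as a loop and replaced by the midpoint of start and end.
    """
    out = []
    i, n = 0, len(points)
    while i < n:
        t0, x0, y0 = points[i]
        end = None
        j = i + 1
        while j < n and points[j][0] - t0 <= lookahead:
            dt = points[j][0] - t0
            dist = math.hypot(points[j][1] - x0, points[j][2] - y0)
            if dt >= min_dt and dist / dt < v_eff_max:
                end = j  # remember the latest point that closes a loop
            j += 1
        if end is None:
            out.append(points[i])
            i += 1
        else:
            t1, x1, y1 = points[end]
            out.append(((t0 + t1) / 2, (x0 + x1) / 2, (y0 + y1) / 2))
            i = end + 1
    return out
```

With these placeholder thresholds, a 40-second wait at a traffic light collapses to one point, while steady riding at a few metres per second is left untouched.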
9. Modal Decision
principle
Classification of cycling tracks
using a decision tree
Other methodologies (logistic
regression, support vector
machines, neural networks)
show similar out-of-sample
performance
Decision trees are easy to use
and interpret
exemplary diagram:
(2-dimensional feature space)
Training data from the Vienna region with 8 different modes
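To illustrate the principle, here is a small decision tree learner in plain Python. It is a generic CART-style sketch (axis-aligned splits chosen by Gini impurity), not the authors' implementation, which the slides do not specify. The toy feature values and labels are invented for the example and have nothing to do with the Vienna training data.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(X, y):
    """Return (feature index, threshold) with the lowest weighted Gini impurity."""
    best_f, best_thr, best_score = None, None, float("inf")
    for f in range(len(X[0])):
        values = sorted({row[f] for row in X})
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = [lab for row, lab in zip(X, y) if row[f] <= thr]
            right = [lab for row, lab in zip(X, y) if row[f] > thr]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if score < best_score:
                best_f, best_thr, best_score = f, thr, score
    return best_f, best_thr


def build_tree(X, y, max_depth=3):
    """Leaves are class labels; inner nodes are (feature, threshold, left, right)."""
    majority = Counter(y).most_common(1)[0][0]
    if max_depth == 0 or len(set(y)) == 1:
        return majority
    f, thr = best_split(X, y)
    if f is None:  # no feature separates the samples
        return majority
    left = [(row, lab) for row, lab in zip(X, y) if row[f] <= thr]
    right = [(row, lab) for row, lab in zip(X, y) if row[f] > thr]
    return (
        f,
        thr,
        build_tree([r for r, _ in left], [l for _, l in left], max_depth - 1),
        build_tree([r for r, _ in right], [l for _, l in right], max_depth - 1),
    )


def predict(tree, row):
    """Walk from the root to a leaf and return its label."""
    while isinstance(tree, tuple):
        f, thr, left, right = tree
        tree = left if row[f] <= thr else right
    return tree


# Invented toy data: (max speed [km/h], share of time above some speed)
X_toy = [(6, 0.0), (7, 0.0), (28, 0.6), (32, 0.7), (60, 0.9), (90, 0.95)]
y_toy = ["walk", "walk", "cycle", "cycle", "other", "other"]
toy_tree = build_tree(X_toy, y_toy)
```

The resulting tree is a nested tuple of thresholds that can be read directly, which is the interpretability argument made on the slide.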
10. Mode Detection
algorithmic choices
For the CDC data set, a distinction was made between 3 modes:
Walking
Cycling
Other
Algorithmic separability optimisation left 3 separation variables:
maximum velocity
percentage of time over 16 km/h
maximum acceleration
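The three variables could be computed per trip roughly as follows. This is a sketch under assumptions: points are tuples `(t, x, y)` in seconds and metres, speeds are taken per segment between consecutive points, acceleration is the speed change between adjacent segments, and the function and key names are hypothetical. Only the 16 km/h threshold comes from the slide.

```python
import math


def trip_features(points, threshold_kmh=16.0):
    """Return max speed, share of time above the threshold, and max acceleration."""
    speeds, durations = [], []
    for (t0, x0, y0), (t1, x1, y1) in zip(points, points[1:]):
        dt = t1 - t0
        if dt <= 0:
            continue  # skip duplicated or out-of-order timestamps
        speeds.append(math.hypot(x1 - x0, y1 - y0) / dt)  # m/s
        durations.append(dt)
    if not speeds:
        raise ValueError("need at least two points with increasing timestamps")
    time_over = sum(d for v, d in zip(speeds, durations) if v * 3.6 > threshold_kmh)
    # Speed change between adjacent segments, divided by the time between
    # their midpoints.
    accels = [
        (v1 - v0) / ((d0 + d1) / 2.0)
        for v0, v1, d0, d1 in zip(speeds, speeds[1:], durations, durations[1:])
    ]
    return {
        "max_speed_kmh": max(speeds) * 3.6,
        "pct_time_over": 100.0 * time_over / sum(durations),
        "max_accel_ms2": max(accels, default=0.0),
    }
```

Because maxima are sensitive to single bad fixes, these features only make sense on trajectories that have already been cleaned.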
12. Bird’s eye comparison
in numbers
Comparison of no. cycle trips and trip length
                  refined   all modes   cycling
No. cycle trips       941       1,734       749
Total trip [km]     4,483       6,800     3,014
Figure: trips per day comparison wrt. total time (x-axis: date, Oct 12 2011 to Nov 23 2011; y-axis: total trip time [s]; series: diary, processed)
Figure: trips per day comparison wrt. number of trips (x-axis: date, Oct 12 2011 to Nov 23 2011; y-axis: total number of trips; series: diary, processed)
13. Comparing track densities
principle
• fewer trips were detected than in the refined data
• the algorithm is unlikely to falsely qualify tracks as cycling
• coordinate shift in the initial data along the backslash diagonal
Figure: (processed cycling trips) – (refined trips)
14. Different cyclists
Figure: processed trip scatter for all cyclists (x-axis: avg. number of pts per trip; y-axis: number of trips)
quite different profiles:
by cycling habit or
trajectory cleaning?
⇒ look at the associated
velocity profiles
Figure: speed distribution, cyclist 101 (high number of trips) (x-axis: speed [km/h]; y-axis: # GPS points)
Figure: speed distribution, cyclist 113 (high avg. number of points per trip) (x-axis: speed [km/h]; y-axis: # GPS points)
15. Cyclist differences on map
Map: cyclist 113 (high number of points per track)
Map: cyclist 101 (high number of tracks)
17. Summary & conclusions
The applied methods successfully separate useful GPS tracking data
from technological artifacts.
The methods are not overly complex, yet give a good classification
of the cycling transport mode.
Results show the periodic features of the recorded travel activity
with respect to number of trips and travel times.
The algorithm cannot identify all cycling tracks in the reference data.
Differences are most likely due to a dissimilar training set.
Low rate of false modal identification for cycling, while retaining
the substantial part of the usable tracking data.
Compared to the reference data, erratic GPS measurement errors
are removed with appreciable reliability.
TODO: Use of homologous training data (road network topology
and traffic densities) is expected to yield consistently better results.
18. Remarks
Thanks to:
CDC2013 organisers
The other contributors and the colleagues I work with
. . . a patient audience
Questions & comments to:
Gerald.Richter@ait.ac.at
Christian.Rudloff@ait.ac.at
Anita.Graser@ait.ac.at
19. References
[1] D. Bauer et al. “On Extracting Commuter Information from
GPS Motion Data”. In: Proceedings International
Workshop on Computational Transportation Science
(IWCTS08). 2008.
[2] R. Hariharan and K. Toyama. “Project Lachesis: Parsing
and Modeling Location Histories”. In: Proceedings of the
Third International Conference on GIScience. Adelphi,
MD, USA, 2004.
[3] C. Rudloff and M. Ray. “Detecting Travel Modes and
Profiling Commuter Routes Solely Based on GPS Data”.
In: TRB 89th Annual Meeting. 2010.