Detecting and visualizing the stability of
activity chains with longest common
purpose subsequences
A. C. Prelipcean¹, Y. Susilo¹, G. Gidófalvi²
¹Department of Transportation Science, KTH
²Division of Geoinformatics, KTH
acpr@kth.se
adrianprelipcean.github.io
@Adi Prelipcean
09 April 2017
Outline
This presentation will be about:
1. Sequences of activities for the study of travel behaviour
2. Data collected for visualizing activity sequences
3. Different ways of visualizing activity sequences
– Chord diagrams
– Timelines
– Sankey diagrams
– Topological networks
4. Summary and conclusions
Visualizing activity sequences
Why do we want to visualize activity sequences?
Visualizing activity sequences gives us better insights into
travel behaviour, which we use:
– to investigate the reasons and mechanisms that underlie
an individual’s travel decision-making process,
– to predict the effect of implementing new transportation
policies or changing the transportation infrastructure, or
– to understand the dynamics of transportation movement
within study areas.
Data collection for visualizing activity sequences
What is the main source of travel data?
Travel diaries summarize where, why and how a user traveled
during a defined time frame:
– The destination of a trip
– The trip’s purpose
– The means of transportation, i.e., trip legs
Img: http://soarministries.com/hp_wordpress/wp-content/uploads/2011/08/Destinations-Icon.jpg
Img: https://cdn2.vox-cdn.com/thumbor/93Yaxs7y3Tb8tzFfppyRsSn_yN8=/1020x0/cdn0.vox-cdn.com/
Img: https://d3ui957tjb5bqd.cloudfront.net/images/screenshots/products/4/42/42990/
MEILI
An open source activity travel diary collection system
The data for this case study was collected with MEILI
MEILI components: GPS collection + Web and Mobile
GIS-based interaction + Artificial Intelligence / Machine
Learning
Open source – https://github.com/Badger-MEILI/
Data collected by MEILI
How to visualize the sequences of activities?
Simple schedule patterns are easy to notice,
but the complexity increases
with the number of days and activities
A visualization should:
– Emphasize at least one
dimension of the studied
phenomenon
– Be easy to understand
independent of expertise
– Ideally, convey a message
without requiring any
interaction
Chord diagrams
An altered pie chart with
overlapped arcs that represent
the relationship between slices
Differences in the share of
activities between weekdays and
weekends are easy to notice
It is easy to interact with the
chord diagram for further
exploration of origin-destination
(OD) activity shares
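
A chord diagram is drawn from a square matrix of counts between categories. The following is a minimal sketch of how such an OD activity matrix could be built from daily activity sequences. The function names, the sample days and the choice to count only consecutive activities within a day are assumptions made for illustration, not the processing used for the slides.

```python
from collections import Counter


def od_activity_counts(days):
    """Count transitions between consecutive activities within each day."""
    counts = Counter()
    for day in days:
        # Pair every activity with the one that follows it on the same day.
        for origin, destination in zip(day, day[1:]):
            counts[(origin, destination)] += 1
    return counts


def to_matrix(counts, labels):
    """Arrange the counts as a square matrix (rows: origin, columns: destination)."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for (origin, destination), n in counts.items():
        matrix[index[origin]][index[destination]] = n
    return matrix


# Hypothetical sample data: two days of activity purposes.
days = [
    ["home", "work", "home"],
    ["home", "work", "shopping", "home"],
]
labels = ["home", "work", "shopping"]

counts = od_activity_counts(days)
matrix = to_matrix(counts, labels)
print(matrix)
```

Building one matrix from weekday sequences and another from weekend sequences would give the two inputs needed for the weekday / weekend comparison described above.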
Timeline
A common representation of a schedule
Can be color coded to hint at pattern repetition
It represents the time budget for each activity
It is difficult to visualize sequences that do not overlap in time
Can be used to visualize schedules of up to a week; with more
information it becomes cluttered
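
The time budget shown by a timeline can also be computed directly. The sketch below assumes that each activity episode is stored as a hypothetical (activity, start hour, end hour) tuple; this layout and the sample day are illustrative and do not reflect the MEILI data format.

```python
from collections import defaultdict


def time_budget(episodes):
    """Sum the hours spent on each activity over a list of episodes."""
    budget = defaultdict(float)
    for activity, start, end in episodes:
        budget[activity] += end - start
    return dict(budget)


# Hypothetical day: (activity, start hour, end hour); gaps are travel time.
day = [
    ("home", 0.0, 8.0),
    ("work", 8.5, 17.0),
    ("shopping", 17.5, 18.0),
    ("home", 18.5, 24.0),
]

budget = time_budget(day)
print(budget)
```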
Sankey diagram
Traditionally, Sankey diagrams are used to visualize flows
The distribution of time allocated per activity can be interpreted as
a flow
Represents time using discrete units (of 1 hour)
Shows the hours of the day where the schedule is constant across days
Shows the hours of the day where the schedule varies between days
A compromise that allows for an easy visualization of sequences of
activities over a long period of time while still offering some
information regarding the time of activities
https://www.ifu.com/knowtheflow/2013/from-data-to-knowledge-the-power-of-elegant-sankey-diagrams/
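
The discretization into 1-hour units can be sketched as follows. This assumes episodes stored as hypothetical (activity, start hour, end hour) tuples, labels each hour with the activity that is active at the start of that hour, and uses an assumed "travel" label for hours not covered by any activity. None of these choices are taken from the slides.

```python
def hourly_activities(episodes, default="travel"):
    """Label each of the 24 hours with the activity active at the start of the hour."""
    hours = []
    for hour in range(24):
        label = default  # assumed label for hours between activities
        for activity, start, end in episodes:
            if start <= hour < end:
                label = activity
                break
        hours.append(label)
    return hours


def constant_hours(days):
    """Return the hours of the day whose label is identical on every day."""
    hourly = [hourly_activities(day) for day in days]
    return [h for h in range(24) if len({day[h] for day in hourly}) == 1]


# Hypothetical sample data for two days.
day1 = [("home", 0, 8), ("work", 9, 17), ("home", 18, 24)]
day2 = [("home", 0, 8), ("work", 9, 16), ("shopping", 16, 17), ("home", 18, 24)]

constant = constant_hours([day1, day2])
varying = [h for h in range(24) if h not in constant]
print(constant, varying)
```

Hours in `constant` correspond to the parts of the diagram where all flows stay in one activity, while hours in `varying` are where the flows split.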
Network topology
Focuses exclusively on sequences (no
time information)
Can be color coded to represent the
frequency of OD activity pairs
Can easily clutter if visualizing
multiple OD pairs
However, the visualization can be
filtered to show only:
– irregular OD activity pairs
(home is a very central place
and activity, which is why it is
not irregularly linked with any
other activity)
– regular OD activity pairs (notice
that home is not connected to
work, shopping, pickup/dropoff
because home is central to those
activities)
– frequent OD activity pairs
(everything revolves around 3
central activities: private, home
and work)
– Longest Common Subsequences
(the longest subsequence of
activities that always occurs in
the same order is home →
pickup/dropoff → work →
private → home)
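
To illustrate the idea of a longest common subsequence of activity purposes, the sketch below applies the textbook dynamic-programming algorithm to two days. It is not necessarily the method used by the authors; the sample days are hypothetical, and handling more than two days would require an extension that is not shown here.

```python
def longest_common_subsequence(a, b):
    """Return one longest common subsequence of sequences a and b."""
    rows, cols = len(a), len(b)
    # table[i][j] holds the LCS length of a[:i] and b[:j].
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows):
        for j in range(cols):
            if a[i] == b[j]:
                table[i + 1][j + 1] = table[i][j] + 1
            else:
                table[i + 1][j + 1] = max(table[i][j + 1], table[i + 1][j])

    # Walk back through the table to recover the subsequence itself.
    result = []
    i, j = rows, cols
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            result.append(a[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return result[::-1]


# Hypothetical activity chains for two days.
day1 = ["home", "pickup/dropoff", "work", "private", "home"]
day2 = ["home", "pickup/dropoff", "work", "shopping", "private", "home"]

common = longest_common_subsequence(day1, day2)
print(" → ".join(common))
```

In this example the shopping stop on the second day is skipped, so the common subsequence equals the first day's chain.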
Summary
The interpretation of visualizations is usually subjective,
which makes it difficult to propose a single best way to
visualize activity sequences.
In the main author’s view, the network topology
visualization offers a more robust overview of sequences.
Sankey diagrams come a close second for sequence visualization
in cases where activity start / end times are needed.
Timelines are useful for comparing different days to get an
indicator of stability across day of week and hour of day,
but they quickly become overloaded with information.
Chord diagrams are good for visualizing Origin / Destination
activity matrices, but they only reveal frequency and not
sequences.
References
Source code for this presentation available at
https://github.com/Badger-MEILI/Visualizing-Activity-Sequences
Source code for the MEILI family available at
https://github.com/Badger-MEILI
Paper on sequential stability of travel behaviour
– A. C. Prelipcean, Y. Susilo, and G. Gidófalvi. “Longest common subsequences:
Identifying the stability of individuals’ travel patterns”, submitted to
Transportation Research Part B.
Papers on travel diary data collection with MEILI
– A. C. Prelipcean, G. Gidófalvi, and Y. Susilo. 2014. “Mobility Collector”,
in the Journal of Location Based Services, Volume 8, Issue 4, pages
229-255, DOI: 10.1080/17489725.2014.973917
– A. C. Prelipcean, G. Gidófalvi, and Y. Susilo. 2016. “Measures of
transport mode segmentation of trajectories”, in the International
Journal of Geographical Information Science, Volume 30, Issue 9, pages
1763-1784, DOI: 10.1080/13658816.2015.1137297
– A. C. Prelipcean, G. Gidófalvi, and Y. Susilo. “MEILI: an activity travel
diary collection, annotation and automation system”, submitted to
Journal of Urban Technology.
Inspiration
Chord Diagrams based on
http://www.delimited.io/blog/2013/12/8/chord-diagrams-in-d3 and
https://bl.ocks.org/mbostock/4062006
Timeline based on
https://developers.google.com/chart/interactive/docs/gallery/timeline
Sankey Diagrams based on
https://bost.ocks.org/mike/sankey/
Network Topology based on
http://bl.ocks.org/d3noob/5155181 and
http://bl.ocks.org/mbostock/1153292
Thank you for your attention!
Questions and Discussions
Adrian C. Prelipcean
CTO, Airmee
PhD Student in Transport Science
KTH, Royal Institute of Technology
http://adrianprelipcean.github.io/
acpr@kth.se
@Adi Prelipcean
This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.
