On the Management, Analysis and Simulation of our LifeSteps*
Yannis Theodoridis
Data Science Lab., Univ. Piraeus
www.datastories.org
(*) joint work with N. Pelekis, S.
Sideridis, P. Tampakis
Paris Descartes Univ., 15.10.2015

Motivation (1/2)
¡  The field of Mobility Data Management and Exploration*
has many success stories to narrate
¡  Data management - access methods, query
processing techniques, DBMS extensions
(the so-called, Moving Object Databases)
¡  Data exploration – trajectory data cubes,
data mining techniques (clusters, flocks,
convoys, T-patterns, etc.)
¡  … all based on the sampled spatio-temporal
coordinates (x-, y-, t-axis) of moving objects
(*) N. Pelekis, Y. Theodoridis (2014): Mobility data management and
exploration. Springer, New York.

Motivation (2/2)
¡ The emerging era revolves around two keywords:
semantic trajectories* and BIG mobility data
¡  Semantic trajectories – information
about when, where, what, why
¡  BIG mobility data - voluminous, streaming,
dispersed information about the movement of
objects (on land, sea, air)
(*) C. Parent, S. Spaccapietra, C. Renso, G. Andrienko, N. Andrienko, V.
Bogorny, M. L. Damiani, A. Gkoulalas-Divanis, J. Macedo, N. Pelekis, Y.
Theodoridis, Z. Yan (2013): Semantic trajectories modeling and
analysis. ACM Computing Surveys, 45(4).

Talk outline
¡ Background – what is a LifeStep
¡ On the management and analysis of our LifeSteps
¡ On the simulation of our LifeSteps
¡ Relevant publications

From raw (GPS-based) to semantic trajectories (Parent et al., 2013)
¡ raw mobility data series (x, y, t), e.g., GPS feeds
¡ mobility diaries: meaningful mobility tuples of type <when, where, how, what, why>
¡ Example:
¡  [~, 8am] Home (breakfast)
¡  [8am, 9am] Road (bus)
¡  [9am, 6pm] Office (work)
¡  [6pm, 6:30pm] Train (metro)
¡  [6:30pm, 7:30pm] Market (shopping)
¡  [7:30pm, 8pm] Sideway (walk)
¡  [8pm, ~] Home (relax)

Semantic trajectories consist of our LifeSteps (Pelekis et al. 2013b)
¡ (informal) Definitions:
¡  An Episode / LifeStep is a tuple modeling homogeneous movement behavior (Stop vs. Move)
¡  A Semantic Trajectory / Mobility Timeline is a sequence of Episodes / LifeSteps
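
The informal definitions above can be sketched as a small data structure. This is only an illustration, not the model of Pelekis et al.: the class name `LifeStep`, its fields, and the helper `is_valid_timeline` are assumptions, times are hours of the day, `None` stands for the open-ended "~" bound, and the metro episode is read as 6pm to 6:30pm.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class LifeStep:
    """One episode of homogeneous movement behavior (hypothetical layout)."""
    kind: str               # "stop" or "move"
    start: Optional[float]  # hour of day; None means open-ended ("~")
    end: Optional[float]
    where: str              # place or medium, e.g. "Home", "Road"
    what: str               # activity or transport mode, e.g. "work", "bus"


def is_valid_timeline(steps: List[LifeStep]) -> bool:
    """A mobility timeline is a sequence of LifeSteps contiguous in time."""
    return all(a.end == b.start for a, b in zip(steps, steps[1:]))


# The example day from the slides, as a mobility timeline.
timeline = [
    LifeStep("stop", None, 8.0, "Home", "breakfast"),
    LifeStep("move", 8.0, 9.0, "Road", "bus"),
    LifeStep("stop", 9.0, 18.0, "Office", "work"),
    LifeStep("move", 18.0, 18.5, "Train", "metro"),
    LifeStep("stop", 18.5, 19.5, "Market", "shopping"),
    LifeStep("move", 19.5, 20.0, "Sideway", "walk"),
    LifeStep("stop", 20.0, None, "Home", "relax"),
]
```

Stops and moves alternate in this example, which matches the Stop vs. Move distinction of the definition; the sketch does not enforce that alternation.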

Examples of LifeSteps & Mobility
Timelines

Challenges
¡ Drawbacks:
¡ A MOD system cannot be used as-is to support semantic
trajectories
¡ different models, querying and indexing requirements,
¡ different specs for data analytics
¡ Real semantic trajectory data (of appropriate size) are not
available nowadays.
¡ synthetic data generators should be developed (as usual)
¡ Questions (that motivate our work):
¡ Q1: what would a semantic-aware MOD look like?
¡ Q2: what would a semantic trajectory data
generator look like?

Motivation 1 – specs for a
semantic MOD
¡ (preliminary step) Activity Inference issues
¡  From spatial to activity information
¡  he/she stopped where? for what purpose?
¡ Management issues
¡  Querying semantic MODs
¡  raw vs. semantic layer
¡ Analytics issues
¡  Similarity measure
¡  Sampling, Clustering, etc.

Activity inference
¡ Activity inference
¡  Stopped in a place why? to perform which activity?
¡  Open linked data are quite useful for this purpose
¡ Our Baquara methodology (Fileto et al. 2013, 2015)
[Figure: the Baquara pipeline]
¡  Input: Textually Annotated Movement Data; Ontologies & LOD
¡  Data Pre-processing: Data Cleansing & Integration; Data Compressing; Text & KB Pre-processing
¡  Linking: Spatio-Temporal Matching; Textual Matching; Refinement & Disambiguation
¡  Output: Semantically Enriched Movement Data

Activity inference – an example
(Fileto et al. 2015)

Querying Semantic MODs
¡  Q1 type: queries involving raw data
¡  Spatio-temporal (range, NN, …), trajectory-based queries
¡  Q2 type: queries involving semantically-enriched data. Example:
¡  Find those who follow the pattern “home – office – home” Mon-Fri
¡  Q3 type: cross-over queries. Examples:
¡  Find those who cross the city center on their way from office back to home
¡  How many of them make long trips (e.g. more than 20 km) on their way from home to office? Exclude the trajectories which include intermediate stops.

Indexing Semantic MODs
¡  Hybrid indexing of spatial-temporal-textual information:
Sem3DR-tree vs. SemTB-tree (Pelekis et al. 2015b)

Querying Spatio-Temporal-
Textual Patterns (ST2P)
¡  An ST2P is a (simplified) regular expression consisting of LifeStep
objects. Formally:
Q := <p* | p is either a LifeStep ls_i or a wildcard w ∈ {>, *}>
Example:
Q = [ls1 > ls2 * ls3 > ls4]
i.e. timelines starting from ls1,
immediately followed by ls2,
then followed by any number of LifeSteps,
then followed by ls3,
and ending with ls4, which immediately follows ls3
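
A minimal sketch of how such a pattern could be matched against a timeline is given here. It is not the Sem3DR-tree / SemTB-tree evaluation of Pelekis et al. (2015b): the function `matches_st2p` is hypothetical, LifeSteps are reduced to plain labels compared by equality, `*` is assumed to match zero or more LifeSteps, and the pattern is assumed to be anchored at both ends of the timeline.

```python
from typing import Sequence


def matches_st2p(pattern: Sequence[str], timeline: Sequence[str]) -> bool:
    """Match a pattern such as ["ls1", ">", "ls2", "*", "ls3"] against a timeline.

    The pattern alternates LifeStep labels and wildcards:
    ">" means "immediately followed by", "*" means "followed, after zero
    or more other LifeSteps, by".
    """
    labels = pattern[0::2]
    wildcards = pattern[1::2]
    if not labels or len(labels) != len(wildcards) + 1:
        raise ValueError("pattern must alternate labels and wildcards")
    if any(w not in (">", "*") for w in wildcards):
        raise ValueError("wildcards must be '>' or '*'")

    def match_from(i: int, pos: int) -> bool:
        # Label i must sit exactly at timeline position pos.
        if pos >= len(timeline) or timeline[pos] != labels[i]:
            return False
        if i == len(labels) - 1:
            return pos == len(timeline) - 1  # pattern ends the timeline
        if wildcards[i] == ">":
            return match_from(i + 1, pos + 1)
        # "*": try every later position for the next label.
        return any(match_from(i + 1, nxt)
                   for nxt in range(pos + 1, len(timeline)))

    return match_from(0, 0)


Q = ["ls1", ">", "ls2", "*", "ls3", ">", "ls4"]
```

In a real semantic MOD each pattern element would carry spatial, temporal and textual predicates rather than a single label, and an index would prune candidate timelines before this kind of check.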

Semantic mobility networks
(Pelekis et al. 2013b)

Querying Semantic MODs (cont.)
¡  Q4 type: Selection queries
over a SMN
¡  “Find Alice’s mobility network for
her movement during last week;
restrict it inside region R; call the
resulting network A”
¡  Q5 type: Aggregate queries
over a SMN
¡  “Find Alice’s Facebook friends’ mobility
network for the same period; roll it up
at level 2; call the resulting network B”
¡  “Given the above two networks A and B, extract the network where Alice and her
friends perform same activities by following e.g. similar routes; call the resulting
network C”
¡  Q6 type: Cross-over queries using SMN
¡  “Find Alice’s raw trajectories conforming to network C”

[Figure: architecture of a semantic-aware MOD]
¡  Application Interface(s): Queries (MD/SMD/OLAP) in, Results (e.g. mobility timeline/network visualization) out
¡  Geodata Sources (Road network, Land Usage, POI/ROI, etc.)
¡  ETL (Extract, Transform, Load) process
¡  Construction / Cross-over Operators:
- Raw trajectory cleansing, compression, map-matching, …
- Semantic mobility timeline reconstruction (segmentation (lifesteps: meteorsteps/moves), annotation, …)
¡  Moving Object Database (MOD): Raw (e.g. GPS) Mobility Storage, MOD index (raw)
¡  Semantic Mobility Database (SMD): Semantic Mobility Storage, SMD index, SMD Cube
¡  Primitive Operators: Attribute Filtering, space / time / trajectory derivatives, Semantic OD-matrix, etc.
¡  Advanced Operators: Semantic trajectory similarity search, compression, clustering, FP mining etc.
¡  Advanced graph-based OLAP operations

Motivation 2 – specs for a
semantic trajectory generator
¡  Lack of real BIG “synchronized” raw
(i.e. GPS logs) - diaries (i.e. annotated
trips) dataset
¡  Simulate different mobility profiles -
popular behaviors of people like e.g.
¡  students in campus vs. a downtown
building,
¡  9-to-5 vs. workaholic employees, etc.
¡  Results:
¡  Hermoupolis (Pelekis et al. 2013a)
¡  Hermoupolis by-example (Pelekis et al.
2015a; 2015c)

Hermoupolis - the big picture
Motivation: lack of real semantic trajectories
following various mobility patterns
Methodology:
generate movements w.r.t. mobility
profiles
INPUT: Road network; POIs; Mobility Profiles
(~ abstract semantic trajectories)
OUTPUT: Synchronized raw and
semantic trajectories
Mobility patterns illustrated: Flocks, Swarms, Meeting Points

(parenthesis – Brinkhoff
generator)
(Hermoupolis builds on the
Brinkhoff generator for raw
trajectories)
¡  Brinkhoff methodology:
¡  generate starting points
¡  generate length of route
(depending on object class)
¡  generate destination for each
object
¡  compute the route
¡  compute the trajectory by generating a
random speed every time unit
¡  based on capacity, weather, edge
class, etc.
source: www.fh-oow.de/institute/iapg/personen/brinkhoff/generator
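
The five steps listed above can be illustrated on a toy road network. This is a rough sketch under stated assumptions, not Brinkhoff's implementation: the network `NODES`/`EDGES`, the function names, and the uniform speed model are all hypothetical, and capacity, weather and edge class are ignored.

```python
import heapq
import math
import random

# Hypothetical toy road network: node -> (x, y), undirected edges.
NODES = {"A": (0, 0), "B": (4, 0), "C": (4, 3), "D": (0, 3), "E": (8, 3)}
EDGES = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A"), ("C", "E")]
GRAPH = {n: [] for n in NODES}
for u, v in EDGES:
    GRAPH[u].append(v)
    GRAPH[v].append(u)


def edge_len(u, v):
    return math.dist(NODES[u], NODES[v])


def shortest_paths(source):
    """Dijkstra: distance and predecessor maps from source."""
    dist, prev = {source: 0.0}, {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue
        for v in GRAPH[u]:
            nd = d + edge_len(u, v)
            if nd < dist.get(v, math.inf):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    return dist, prev


def generate_trajectory(rng, mean_route_len, max_speed):
    start = rng.choice(sorted(NODES))                 # 1. starting point
    wanted = rng.uniform(0.5, 1.5) * mean_route_len   # 2. route length
    dist, prev = shortest_paths(start)
    dest = min((n for n in dist if n != start),       # 3. destination
               key=lambda n: abs(dist[n] - wanted))
    route = [dest]                                    # 4. route
    while route[-1] != start:
        route.append(prev[route[-1]])
    route.reverse()
    # 5. trajectory: draw a random speed for every time unit
    samples, t, seg, offset = [(0, *NODES[start])], 0, 0, 0.0
    while seg < len(route) - 1:
        step = rng.uniform(0.2, 1.0) * max_speed
        t += 1
        while seg < len(route) - 1 and step > 0:
            remaining = edge_len(route[seg], route[seg + 1]) - offset
            if step >= remaining:
                step -= remaining
                seg, offset = seg + 1, 0.0
            else:
                offset, step = offset + step, 0.0
        if seg == len(route) - 1:
            x, y = NODES[route[-1]]
        else:
            (x0, y0), (x1, y1) = NODES[route[seg]], NODES[route[seg + 1]]
            f = offset / edge_len(route[seg], route[seg + 1])
            x, y = x0 + f * (x1 - x0), y0 + f * (y1 - y0)
        samples.append((t, x, y))
    return route, samples
```

Calling `generate_trajectory(random.Random(0), mean_route_len=6.0, max_speed=1.5)` returns the chosen route and one (t, x, y) sample per time unit. Here `mean_route_len` and `max_speed` play the role of per-class parameters.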

Hermoupolis input
¡  spatial + temporal + semantic profiles
¡  P1) Attending school: Home – School – Home
¡  P2) Studying at university: Home – Campus – Leisure – Home
¡  P3) Working and having fun: Home – Work – Leisure – Home
¡  P4) Working and shopping: Home – Work – Mall – Home
¡  P5) Working (only): Home – Work – Home
¡  P6) Having fun (only): Home – Leisure – Home
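
As an illustration of how such profiles might be represented and instantiated, the sketch below maps each profile to its sequence of stop categories and picks a concrete POI per stop. It covers only the semantic part of a profile, not the spatial and temporal parts. The POI identifiers, the function names `instantiate` and `generate_population`, and the uniform choice of profiles and POIs are assumptions made for this example, not the Hermoupolis input format.

```python
import random
from typing import Dict, List, Tuple

# The six profiles from the slide, as sequences of stop categories.
PROFILES: Dict[str, List[str]] = {
    "P1": ["Home", "School", "Home"],
    "P2": ["Home", "Campus", "Leisure", "Home"],
    "P3": ["Home", "Work", "Leisure", "Home"],
    "P4": ["Home", "Work", "Mall", "Home"],
    "P5": ["Home", "Work", "Home"],
    "P6": ["Home", "Leisure", "Home"],
}

# Hypothetical POIs per category.
POIS: Dict[str, List[str]] = {
    "Home": ["H1", "H2", "H3", "H4"],
    "School": ["S1", "S2", "S3"],
    "Campus": ["C1", "C2"],
    "Work": ["W1", "W2", "W3"],
    "Leisure": ["L1", "L2", "L3", "L4"],
    "Mall": ["M1", "M2"],
}


def instantiate(profile_id: str, rng: random.Random) -> List[Tuple[str, str]]:
    """Turn an abstract profile into concrete (category, POI) stops.

    The same home is used at the start and at the end of the day.
    """
    home = rng.choice(POIS["Home"])
    stops = []
    for category in PROFILES[profile_id]:
        poi = home if category == "Home" else rng.choice(POIS[category])
        stops.append((category, poi))
    return stops


def generate_population(size: int, rng: random.Random):
    """Assign each object a profile (uniformly, as an assumption)."""
    return [instantiate(rng.choice(sorted(PROFILES)), rng)
            for _ in range(size)]
```

A generator in this style would then create a MOVE between each pair of consecutive stops, for instance with a network-based raw trajectory generator.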

Hermoupolis output
Generate objects moving in
Athens
¡  ... of certain population (e.g.
4 million)
¡  ... during a period (e.g. 1
week)
¡  ... belonging to a number of
population profiles (P1 to P6)

The next step …
¡ Generate-by-example
¡  Given a small real dataset,
produce a large synthetic one,
as similar as possible to the initial one
¡  The number, distribution,
characteristics etc. of population
profiles should be discovered
from the input dataset
Hermoupolis → Hermoupolis by-example

Hermoupolis by-example in action
[Figure, four panels: 1. Clustering; 2. Cluster gen.; 3. Cluster class.; 4. Hermoupolis]

Hermoupolis by-example step 1
1. Perform semantic trajectory clustering
[Figure: 1. Clustering. Cluster 1: 4 τsem; Cluster 2: 3 τsem; Cluster 3: 3 τsem; Outliers 4: 3 τsem]

2. For each cluster, find generalization(s) /
representative(s)
[Figure: 2. Cluster gen. Cluster 1: 4 τsem; Cluster 2: 3 τsem; Cluster 3: 3 τsem; Outliers 4: 3 τsem]

3. Classify clusters into equivalence classes (for
parallelism purposes)
[Figure: 3. Cluster class. 1st, 2nd and 3rd equiv. class; Cluster 1: 4 τsem; Cluster 2: 3 τsem; Cluster 3: 3 τsem; Outliers 4: 3 τsem]

4. For each equivalence class, invoke
Hermoupolis core simulator
[Figure: 4. Hermoupolis]

Research issues addressed
¡ Step 1. Clustering
¡  What is an appropriate spatial-temporal-textual
semantic trajectory similarity measure?
¡  Which clustering algorithm?
¡ Step 2. Clusters’ generalization
¡  How to create a generalized mobility profile for
each cluster?
¡ Step 3. Clusters’ classification
¡  How to classify clusters into equivalence classes?
¡ Step 4. Hermoupolis
¡  How to select PoIs?
¡  How to generate artificial STOPs and MOVEs?
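
Steps 1 and 2 can be illustrated with a deliberately simple stand-in. The sketch below does not use the similarity measure or clustering algorithm of Pelekis et al.: it assumes a normalized edit distance over activity labels in place of a spatial-temporal-textual measure, a greedy leader clustering with a fixed threshold, and the medoid as the cluster representative. All function names are hypothetical, and steps 3 and 4 are not covered.

```python
from typing import List, Sequence


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two label sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # substitute
        prev = cur
    return prev[-1]


def distance(a: Sequence[str], b: Sequence[str]) -> float:
    """Edit distance normalized to [0, 1] by the longer sequence."""
    return edit_distance(a, b) / max(len(a), len(b), 1)


def leader_clustering(trajs: List[List[str]], threshold: float):
    """Step 1 (stand-in): join the first cluster whose leader is close."""
    clusters: List[List[List[str]]] = []
    for t in trajs:
        for c in clusters:
            if distance(t, c[0]) <= threshold:
                c.append(t)
                break
        else:
            clusters.append([t])
    return clusters


def medoid(cluster: List[List[str]]) -> List[str]:
    """Step 2 (stand-in): the member closest to all the others."""
    return min(cluster, key=lambda t: sum(distance(t, o) for o in cluster))


trajs = [
    ["H", "W", "H"], ["H", "W", "L", "H"], ["H", "W", "H"],
    ["H", "C", "L", "H"], ["H", "C", "L", "H"], ["H", "C", "H"],
]
clusters = leader_clustering(trajs, threshold=0.3)
profiles = [medoid(c) for c in clusters]
```

With this toy input, `clusters` holds a work-centered group and a campus-centered group, and each medoid could serve as the abstract mobility profile passed on to the simulator.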

Hermoupolis vs. related work (1/2)
Columns: mobility features | obstacles avoidance | objects interaction | network-based | stop generation
GSTD (and variations) ✔ ✔ ✔
CENTRE ✔ ✔
G-TERD ✔
OPORTO ✔
Brinkhoff ✔ ✔ ✔
SUMO ✔ ✔ ✔
BerlinMOD ✔ ✔ ✔
ST-ACTS
MWGen ✔
GAMMA
Hermoupolis ✔ ✔ ✔ ✔ ✔

Hermoupolis vs. related work (2/2)
Columns: long time generation | pattern-aware | by-example | additional data | activities / semantics
GSTD (and variations) ✔
CENTRE
G-TERD
OPORTO
Brinkhoff
SUMO
BerlinMOD ✔ ✔ ✔
ST-ACTS ✔ ✔
MWGen
GAMMA ✔ ✔
Hermoupolis ✔ ✔ ✔ ✔ ✔

Relevant publications
¡  On the activity inference (the Baquara ontology):
¡  R. Fileto, M. Krüger, N. Pelekis, Y. Theodoridis, C. Renso (2013):
Baquara: a holistic ontological framework for movement analysis using
linked data. Proc. ER’13. (best paper award)
¡  R. Fileto, C. May, C. Renso, N. Pelekis, D. Klein, Y. Theodoridis (2015):
The Baquara2 knowledge-based framework for semantic enrichment
and analysis of movement data. Data Knowl. Eng., 98.
¡  On the management of LifeSteps (Semantic MODs):
¡  N. Pelekis, Y. Theodoridis, D. Janssens (2013b): On the management
and analysis of our LifeSteps. SIGKDD Explorations, 15(1).
¡  N. Pelekis, S. Sideridis, Y. Theodoridis (2015b): Hermessem: a semantic-
aware framework for the management and analysis of our LifeSteps.
Proc. DSAA’15.

Relevant publications (cont.)
¡  On the simulation of semantic trajectories (Hermoupolis):
¡  N. Pelekis, C. Ntrigkogias, P. Tampakis, S. Sideridis, Y. Theodoridis
(2013a): Hermoupolis: a trajectory generator for simulating
generalized mobility patterns. Proc. ECML/PKDD'13.
¡  N. Pelekis, S. Sideridis, P. Tampakis, Y. Theodoridis (2015a):
Hermoupolis: a semantic trajectory generator in the data science era.
ACM SIGSPATIAL Special Newsletter, 7(1).
¡  N. Pelekis, S. Sideridis, P. Tampakis, Y. Theodoridis (2015c):
Simulating our LifeSteps by example. Submitted.
