The presentation deals with clustering trajectories of moving objects. A k-means-like algorithm based on a Euclidean distance between piecewise linear curves is used. The main novelty of the paper is the introduction, within the clustering procedure, of a step that automatically weights the importance of sub-trajectories of the original ones. The algorithm uses an adaptive distances approach and a cluster-wise weighting. The proposed algorithm is tested on some benchmark trajectory datasets.
Presented at SIS 2019, Milan.
3. Outline
• We aim at clustering trajectories of moving objects.
• A k-means-like algorithm based on a Euclidean distance between piecewise linear curves is used. Each trajectory is decomposed into sub-trajectories.
• The importance of each sub-trajectory is automatically computed within the clustering algorithm using an adaptive distances approach.
• The proposed algorithm is tested on some benchmark trajectory datasets.
[Figures: some trajectories; examples of clustered trajectories]
19/6/2019 Sis 2019 2 / 47
4. Trajectory
A trajectory $T^j$ is a collection of ordered pairs of data $(s_i^j, t_i^j)$, $i = 1, \dots, n$, sampled at $n$ time-points, where $s_i^j$ is a spatial location (namely, a 2D or a 3D vector of spatial coordinates) and $t_i^j$ is a time-stamp. A trajectory can be enriched with other data recorded at each time-point, but we do not consider this case. Considering the order provided by the time-stamps, a trajectory $T^j$ is described as a curve in a 2D (or 3D) space.
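As a minimal illustration (ours, not from the slides), a trajectory can be stored as arrays of time-stamps and coordinates, with the time-stamps rescaled to $[0, 1]$:

```python
import numpy as np

def normalize(trajectory):
    """Rescale the time-stamps of a (t, x, y) trajectory to
    r_i = (t_i - t_0) / (t_n - t_0), so the curve is parametrized on [0, 1]."""
    t, x, y = (np.asarray(c, dtype=float) for c in trajectory)
    r = (t - t[0]) / (t[-1] - t[0])
    return r, x, y
```

For example, a trajectory sampled at times 10, 15, 30 gets normalized time-stamps 0, 0.25, 1.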
Trajectories are everywhere
Trajectories of:
• pedestrians
• animals
• vehicles
• hurricanes
• …
Sensed by:
• GPS systems
• GSM networks
• RFID and WiFi
• …
Clustering and classification have useful applications in:
• Transportation
• Urban planning
• Business
• …
5. Clustering trajectories
Clustering aims at grouping objects such that
• similar objects are grouped together
• different objects belong to different groups
Trajectory clustering looks for groups of trajectories, or of sub-trajectories, that represent a movement pattern in the data.
When are two trajectories similar?
6. Different approaches to trajectory clustering
In the literature, the following approaches represent the state of the art of trajectory clustering:
• Lee et al. [6] propose a distance between sub-trajectories, and an algorithm that implements an extension of density-based clustering for grouping sets of sub-trajectories.
• Ferreira et al. [4] estimate $k$ vector fields associated with $k$ groups of trajectories observed in a 2D space. This application is inspired by the problem of monitoring and predicting storm or hurricane paths.
• Another approach is provided by functional data analysis, where a trajectory is considered as a curve in a 2D or 3D space. Sangalli et al. [8] proposed a k-means-type algorithm using an alignment step.¹
¹We do not consider alignment in this paper.
7. What if trajectories have different time-lengths? Two choices.
Consider sub-trajectories: sub-trajectories of equal length can be selected and then compared.
• Pro: time length is preserved
• Con: computational cost can be high
Normalize lengths: time lengths are set equal to 1.
• Pros: trajectories are treated as single objects, distances are more interpretable, and the computational cost is acceptable. Averaging trajectories is possible.
• Con: if trajectories have very different time-lengths, some biases arise
9. Distances between trajectories
Some assumptions:
• We consider normalized trajectories.
• We consider trajectories as piecewise linear curves.
• For each piece, we assume a constant relative speed.
Under these assumptions, we can consider the Euclidean distance between two 2D trajectories² having the same $n$ time-stamps normalized in $[0, 1]$.
Given two normalized trajectories
$T_1 = \{\{(x_1^0, y_1^0), 0\}, \dots, \{(x_1^i, y_1^i), r_1^i\}, \dots, \{(x_1^{n_1}, y_1^{n_1}), 1\}\}$ and
$T_2 = \{\{(x_2^0, y_2^0), 0\}, \dots, \{(x_2^i, y_2^i), r_2^i\}, \dots, \{(x_2^{n_2}, y_2^{n_2}), 1\}\}$,
where $r_s^i = \frac{t_s^i - t_s^0}{t_s^{n_s} - t_s^0}$.
²The trajectory is on a plane, but the extension to 3D spaces is straightforward.
10. Euclidean distance between two trajectories i
It is possible to express the two trajectories with a common set of $r$'s by linear interpolation. Once the two trajectories are registered such that they have the same $L \in [\max(n_1, n_2), n_1 + n_2]$ normalized time-stamps, we compute the squared Euclidean distance between $T_1$ and $T_2$ as follows:
$$d_E^2(T_1, T_2) = \int_0^1 \left[ (x_1(r) - x_2(r))^2 + (y_1(r) - y_2(r))^2 \right] dr =$$
$$= \sum_{h=1}^{L} (r_h - r_{h-1}) \left\{ |\bar{x}_1(h) - \bar{x}_2(h)|^2 + |\bar{y}_1(h) - \bar{y}_2(h)|^2 + \frac{1}{3} \left[ |\hat{x}_1(h) - \hat{x}_2(h)|^2 + |\hat{y}_1(h) - \hat{y}_2(h)|^2 \right] \right\} \quad (1)$$
where:
11. Euclidean distance between two trajectories ii
• $\bar{x}_s(h) = \frac{x_s(r_h) + x_s(r_{h-1})}{2}$ and $\bar{y}_s(h) = \frac{y_s(r_h) + y_s(r_{h-1})}{2}$, for $s = 1, 2$. The point $(\bar{x}_s(h), \bar{y}_s(h))$ is the center of the segment that starts from $(x_s(r_{h-1}), y_s(r_{h-1}))$ and arrives at $(x_s(r_h), y_s(r_h))$;
• $\hat{x}_s(h) = \frac{x_s(r_h) - x_s(r_{h-1})}{2}$ and $\hat{y}_s(h) = \frac{y_s(r_h) - y_s(r_{h-1})}{2}$, for $s = 1, 2$. The pair $(\hat{x}_s(h), \hat{y}_s(h))$ collects the signed component-wise half-widths of the same segment.
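Under the assumptions above, Eq. (1) is straightforward to implement. The following sketch (an illustration of ours, not the authors' code; function names are hypothetical) registers two normalized $(r, x, y)$ trajectories on the union of their time-stamps by linear interpolation, then sums the segment-wise contributions of the centers and half-widths:

```python
import numpy as np

def register(traj1, traj2):
    """Put two normalized (r, x, y) trajectories on the common grid
    given by the union of their time-stamps, via linear interpolation."""
    r = np.union1d(traj1[0], traj2[0])
    (x1, y1), (x2, y2) = [(np.interp(r, t, x), np.interp(r, t, y))
                          for t, x, y in (traj1, traj2)]
    return r, (x1, y1), (x2, y2)

def dist2(traj1, traj2):
    """Squared Euclidean distance of Eq. (1) between two piecewise
    linear trajectories."""
    r, (x1, y1), (x2, y2) = register(traj1, traj2)
    dr = np.diff(r)
    # differences of segment centers and of signed half-widths
    mx = (x1[1:] + x1[:-1]) / 2 - (x2[1:] + x2[:-1]) / 2
    my = (y1[1:] + y1[:-1]) / 2 - (y2[1:] + y2[:-1]) / 2
    wx = (x1[1:] - x1[:-1]) / 2 - (x2[1:] - x2[:-1]) / 2
    wy = (y1[1:] - y1[:-1]) / 2 - (y2[1:] - y2[:-1]) / 2
    return float(np.sum(dr * (mx**2 + my**2 + (wx**2 + wy**2) / 3)))
```

As a sanity check, two parallel horizontal segments at vertical distance 1 give $d_E^2 = \int_0^1 1 \, dr = 1$, which the segment-wise sum reproduces exactly.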
13. Using adaptive distances
The hypothesis is that each sub-trajectory could have a different importance in the clustering process.
Adaptive distances (or weighted distances) [3]:
• We can consider a system of weights for each sub-trajectory, reflecting its importance in the clustering process.
• Using adaptive distances in a k-means algorithm, as suggested in [3], we extend the k-means algorithm for trajectories such that the system of weights is the solution of the minimization of the criterion function (or cost function) of the k-means algorithm.
• We propose a global weighting system (a weight for each sub-trajectory) and a cluster-wise one (a weight for each cluster). Note that the weights are computed by the algorithm, not provided by the user.
15. The algorithm: Initialization
0. Input: a dataset $T$ of normalized and registered trajectories, cut at some predefined normalized time-stamps, and a predefined number $K$ of clusters.
1. Initialization: set $t = 0$.
 1.1 Centers selection: randomly select $K$ trajectories and store them in $G^{(0)}$.
 1.2 Fix initial weights: fix $\Lambda^{(0)} = 1$.
 1.3 Assign: assign data to clusters according to a minimum-distance criterion, generating the initial partition of trajectories $\mathcal{P}^{(0)}$.
 1.4 Compute initial criterion: compute $W^{(0)}_{AG}$ (SKADAG) or $W^{(0)}_{AL}$ (SKADAL).
16. The algorithm: Iterative optimization
2. Repeat: set $t = t + 1$.
 2.1 Centers selection: with $\mathcal{P}^{(t-1)}$ and $\Lambda^{(t-1)}$ fixed, compute the average trajectory for each cluster and store them in $G^{(t)}$.
 2.2 Compute weights: with $\mathcal{P}^{(t-1)}$ and $G^{(t)}$ fixed, compute $\Lambda^{(t)}$ according to the constrained minimization of $W^{(t)}_{AG}$ (SKADAG) or $W^{(t)}_{AL}$ (SKADAL), using the Lagrange multiplier method.
 2.3 Assign: with $G^{(t)}$ and $\Lambda^{(t)}$ fixed, assign trajectories to clusters according to a minimum-distance criterion w.r.t. the average trajectories, and store the partition of trajectories in $\mathcal{P}^{(t)}$.
 2.4 Compute the new criterion: compute $W^{(t)}_{AG}$ (SKADAG) or $W^{(t)}_{AL}$ (SKADAL).
 2.5 Verify the stopping rule: if $W^{(t)}_{AG} < W^{(t-1)}_{AG}$ (SKADAG) or $W^{(t)}_{AL} < W^{(t-1)}_{AL}$ (SKADAL), then go to 2., else go to 3.
3. Return solution: return $\mathcal{P}^{(t)}$, $G^{(t)}$, $\Lambda^{(t)}$.
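As a rough sketch of the iterative scheme above (ours, and deliberately simplified: generic feature blocks stand in for sub-trajectories, piece distances are plain squared Euclidean, initialization is deterministic, and the weight update uses the product-to-one constraint of Diday and Govaert [3]; the exact SKADAG/SKADAL criteria are in the paper), a cluster-wise adaptive-distance k-means might look like:

```python
import numpy as np

def adaptive_kmeans(X, K, n_iter=20):
    """Cluster-wise adaptive-distance k-means sketch.
    X: (N, P, D) array -- N objects split into P pieces of dimension D
    (for trajectories, each piece would hold one sub-trajectory).
    Returns labels, centers G, and cluster-wise weights Lam
    (one weight per piece per cluster, with prod(Lam[k]) = 1)."""
    N, P, _ = X.shape
    G = X[:K].copy()            # deterministic init for the sketch
    Lam = np.ones((K, P))       # step 1.2: initial weights fixed to 1
    labels = np.zeros(N, dtype=int)
    for _ in range(n_iter):
        # piece-wise squared distances to every center: shape (N, K, P)
        d = ((X[:, None] - G[None]) ** 2).sum(-1)
        # assign step: minimum weighted distance
        labels = (d * Lam[None]).sum(-1).argmin(1)
        # centers step: average object per cluster
        for k in range(K):
            if (labels == k).any():
                G[k] = X[labels == k].mean(0)
        # weights step: per-piece dispersion Delta in each cluster; the
        # Lagrangian solution under prod(Lam[k]) = 1 is
        # Lam[k, p] = (prod_q Delta[k, q])**(1/P) / Delta[k, p]
        d = ((X[:, None] - G[None]) ** 2).sum(-1)
        for k in range(K):
            Delta = d[labels == k, k].sum(0) + 1e-12
            Lam[k] = np.prod(Delta) ** (1.0 / P) / Delta
    return labels, G, Lam
```

Pieces with small within-cluster dispersion receive large weights, so the discriminant sub-trajectories dominate the assignment, which is the intended effect of the adaptive distance.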
18. Two datasets
We apply the algorithm to two datasets.³
CROSS (road) dataset: CROSS is a dataset of 1,900 trajectories of vehicles approaching a crossroad. The trajectories are labeled into 19 different types.
LABOMNI: a dataset describing $K = 15$ sets of trajectories of 209 people in a laboratory.
³Available from http://cvrr.ucsd.edu/LISA/Datasets/TrajectoryClustering/CVRR_dataset_trajectory_clustering.zip
19. Main results: external validity indices
Table 2: CROSS and LABOMNI datasets. ARI = Adjusted Rand Index, PUR = Purity, NMI = Normalized Mutual Information.

                    CROSS (N=1,900, K=19)            LABOMNI (N=209, K=15)
Methods             ARI     PUR     NMI              ARI     PUR     NMI
K-means             0.8163  0.8389  0.9405           0.6715  0.8373  0.8248
                    cuts (0.15, 0.85)                cuts (0.005, 0.15, 0.85, 0.995)
K-means pieces      0.8210  0.8411  0.9443           0.8772  0.9234  0.9118
SKADAG              0.8192  0.8400  0.9426           0.8930  0.9330  0.9230
SKADAL              0.8200  0.8405  0.9433           0.8273  0.9139  0.8998

Note: algorithms based on sub-trajectories perform slightly better on CROSS and significantly better on LABOMNI. According to [7], CROSS has less complex trajectories than LABOMNI: CROSS contains more regular trajectories, since they show car behavior at a crossroad, while LABOMNI consists of trajectories of people walking almost freely in a laboratory.
20. LABOMNI clustering results i
In the following slides, we focus on the LABOMNI dataset.
• For each ground-truth class (top-left), we show the closest cluster (bottom-left).
• On the right, we report the respective variance function, namely
$$Var_k(r) = \frac{1}{N_k} \sum_{j \in C_k} \left[ T_j(r) - \bar{T}_k(r) \right]^2$$
For comparing results, we plot the square root of this function.
• We report the log of the relevance weights (which sum to zero because of the log transformation).
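On a common grid, the variance function reduces to a pointwise variance of the member curves around the cluster average; a minimal sketch (ours, for one coordinate, with a hypothetical array layout):

```python
import numpy as np

def variance_function(curves):
    """Pointwise variance of a cluster of curves sampled on a common grid.
    curves: (N_k, L) array, one row per member trajectory coordinate.
    Returns Var_k(r) evaluated at the L grid points."""
    mean_curve = curves.mean(axis=0)              # the average trajectory
    return ((curves - mean_curve) ** 2).mean(axis=0)
```

The slides then plot `np.sqrt(variance_function(...))` for comparison.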
46. Comments on LABOMNI
• We see that some clusters are strange! This is due to misaligned trajectories.
• While alignment is solvable in a hierarchical clustering approach (for example, distance matrices can be computed using warping), in a k-means approach it is less feasible.
47. Conclusions
• In this work
 – We presented a new k-means clustering algorithm for trajectories.
 – We observed that using sub-trajectories improves clustering results.
• In perspective
 – How many cuts, and where to cut? A possible solution is to cut everywhere, i.e., we are developing and testing a new algorithm where the cuts are continuous.
 – Alignment of trajectories in a k-means framework: recently, some proposals have been introduced for time-series.
 – Relaxing hypotheses without losing interpretability and performance.
48. References i
[1] Demšar, U., Buchin, K., Cagnacci, F., Safi, K., Speckmann, B., Van de Weghe, N., Weiskopf, D., Weibel, R.: Analysis and visualisation of movement: an interdisciplinary review. Movement Ecology, 3:5 (2015)
[2] Diday, E.: The dynamic clusters method in nonhierarchical
clustering. International Journal of Computer and Information
Sciences 2: 61 (1973) doi: 10.1007/BF00987153
[3] Diday, E. and Govaert, G.: Classification Automatique avec
Distances Adaptatives. R.A.I.R.O. Informatique Computer Science,
11 (4), 329-349 (1977)
49. References ii
[4] Ferreira, N., Klosowski, J. T., Scheidegger, C. E., Silva, C. T.: Vector Field k-Means: Clustering Trajectories by Fitting Multiple Vector Fields. Computer Graphics Forum, 32: 201-210 (2013) doi: 10.1111/cgf.12107
[5] Bian, J., Tian, D., Tang, Y., Tao, D.: A review of moving object trajectory clustering algorithms. Artificial Intelligence Review (2016) doi: 10.1007/s10462-016-9477-7
[6] Lee, J., Han, J., Whang, K.: Trajectory clustering: a
partition-and-group framework. Proceedings of the 2007 ACM
SIGMOD international conference on Management of data, pp.
593-604 (2007)
50. References iii
[7] Morris, B. T., Trivedi, M. M.: Learning Trajectory Patterns by Clustering: Experimental Studies and Comparative Evaluation. In: Proc. IEEE Inter. Conf. on Computer Vision and Pattern Recognition, Miami, Florida (2009)
[8] Sangalli, L.M., Secchi, P., Vantini, S., Vitelli, V.: K-mean alignment for curve clustering. Computational Statistics & Data Analysis, 54(5), 1219–1233 (2010)