This document provides an overview of time series data mining. It begins with an introduction to time series data and examples of time series similarity search tasks. It then discusses major time series mining tasks like indexing, clustering, classification, prediction and anomaly detection. Distance measures for time series similarity search are explained, including Dynamic Time Warping which allows for nonlinear time alignments. Dimensionality reduction techniques like Fourier analysis and discretization using Symbolic Aggregate Approximation are also summarized. The document is presented as an introduction to key concepts and techniques in time series data mining.
Time series data mining techniques
1. IT'S ABOUT TIME !!
Presented By-
P.SHANMUKHA SREENIVAS
M.MGT 1
2. AN OVERVIEW ON TIME SERIES DATA MINING
OUTLINE
1. Introduction
2. Similarity Search in Time Series Data
3. Feature-based Dimensionality Reduction
4. Discretization
5. Other Time Series Data Mining Tasks
6. Conclusions
3. Introduction
A time series is a collection of observations made sequentially in time.
Examples: financial time series, scientific time series.
[Chart: CNX IT returns, plotted from the raw values 6145.45, 6128.75, 6142.7, 6201.2, 6151.9, 6050.95, 5917.75, 5855.95, 5984, 5993.9, 5934.8, 5920.05, 5950, 5950.7, 5963.8, 6141.15, …, 6471.4, 6511.7, 6563.25, 6558.45, 6492.7, 6546.75]
4. TIME SERIES SIMILARITY SEARCH
Some examples:
- Identifying companies with similar patterns of growth.
- Determining products with similar selling patterns.
- Discovering stocks with similar movement in stock prices.
- Finding out whether a musical score is similar to one of a set of copyrighted scores.
5. Major Time Series Data Mining Tasks
• Indexing
• Clustering
• Classification
• Prediction
• Anomaly Detection
Indexing and clustering make explicit use of a distance measure
The others make implicit use of a distance measure
6. TIME SERIES SIMILARITY SEARCH
DISTANCE MEASURES
Euclidean distance
Dynamic Time Warping
Other distance measures
o Threshold query based similarity search (TQuEST)
o Minkowski Distance
7. Euclidean Distance Metric
Given two time series
Q = q1 … qn
and
C = c1 … cn
their Euclidean distance is defined as:
D(Q, C) = √( Σ_{i=1}^{n} (q_i − c_i)² )
[Figure: Q and C plotted together, with D(Q,C) indicated between them]
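For concreteness, the distance defined above can be computed in a few lines of Python (a minimal sketch; the function name is ours, not from the presentation):

```python
import math

def euclidean_distance(q, c):
    """Euclidean distance between two equal-length time series Q and C."""
    if len(q) != len(c):
        raise ValueError("series must have equal length")
    # sqrt of the sum of squared point-to-point differences
    return math.sqrt(sum((qi - ci) ** 2 for qi, ci in zip(q, c)))

print(euclidean_distance([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))  # 2.0
```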
8. What’s wrong with Euclidean Distance?
Similar sequences may be shifted and have different scales.
Normalize the time series before measuring the distance between them (Goldin and Kanellakis, 1995):
x_i′ = (x_i − μ) / σ
What if a sequence is stretched or compressed along the time axis?
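The normalization formula above (z-normalization) can be sketched in Python; the function name is ours, and pstdev is the population standard deviation σ:

```python
import statistics

def z_normalize(x):
    """Rescale a series to zero mean and unit standard deviation."""
    mu = statistics.fmean(x)       # sample mean μ
    sigma = statistics.pstdev(x)   # population standard deviation σ
    return [(xi - mu) / sigma for xi in x]
```

After this transformation, two series that differ only in offset and scale become directly comparable under the Euclidean distance.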
9. Dynamic Time Warping (Berndt et al.)
Dynamic Time Warping is a technique that finds the optimal alignment between two time series, where one time series may be “warped” non-linearly by stretching or shrinking it along its time axis.
This warping between two time series can be used to determine the similarity between them.
Fixed time axis: sequences are aligned “one to one”.
“Warped” time axis: nonlinear alignments are possible.
10. DYNAMIC TIME WARPING
[BERNDT, CLIFFORD, 1994]
Allows acceleration-deceleration of signals along the time dimension.
Basic idea
X = (x1, x2, …, xN), N ∈ ℕ; Y = (y1, y2, …, yM), M ∈ ℕ
*Data sequences should be sampled at equidistant points in time.
The algorithm starts by building the distance matrix C ∈ ℝ^(N×M) representing all pairwise distances between X and Y:
c(i, j) = ||x_i − y_j||, i ∈ [1 : N], j ∈ [1 : M]
This distance matrix is also called the local cost matrix.
Once the local cost matrix is built, the algorithm finds the alignment path which runs through the low-cost areas (“valleys”) of the accumulated cost matrix.
11. HOW IS DTW CALCULATED?
The cumulative distance γ(i, j) is built up recursively from the local distance d:
γ(i, j) = d(q_i, c_j) + min{ γ(i−1, j−1), γ(i−1, j), γ(i, j−1) }
[Figure: the warping path w through the cost matrix of the series Q and C]
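The recurrence above translates directly into a dynamic-programming table. A minimal Python sketch (names are ours; |q_i − c_j| is used as the local distance d):

```python
import math

def dtw_distance(q, c):
    """DTW distance via the recurrence
    gamma(i,j) = d(q_i, c_j) + min(gamma(i-1,j-1), gamma(i-1,j), gamma(i,j-1))."""
    n, m = len(q), len(c)
    # gamma[i][j] holds the cumulative cost; row 0 and column 0 are
    # infinite sentinels so the boundary condition gamma(1,1) = d(q_1, c_1) holds.
    gamma = [[math.inf] * (m + 1) for _ in range(n + 1)]
    gamma[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(q[i - 1] - c[j - 1])  # local cost d(q_i, c_j)
            gamma[i][j] = d + min(gamma[i - 1][j - 1],
                                  gamma[i - 1][j],
                                  gamma[i][j - 1])
    return gamma[n][m]
```

Note how stretching is handled: [1, 2, 3] and [1, 2, 2, 3] have DTW distance 0, because the repeated 2 is aligned “many to one”.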
12. CONSTRAINTS
Shanmukha Sreenivas P, DoMS
Boundary condition
The starting and ending points of the warping path must be the first and the last points of the aligned sequences, i.e. w1 = (1, 1) and wK = (N, M).
Monotonicity condition
n1 ≤ n2 ≤ … ≤ nK and m1 ≤ m2 ≤ … ≤ mK.
This condition preserves the time-ordering of points.
Step size condition
This criterion keeps the warping path from making long jumps (shifts in time) while aligning sequences,
i.e. we’ll be looking at only these values: w(i−1, j−1), w(i−1, j), w(i, j−1).
13. CONSTRAINT VISUALIZATION
a) Admissible path satisfying constraints
b) Violation of boundary condition
c) Violation of monotonicity
d) Violation of step size
14. STEP SIZE CONDITION
A global constraint constrains the indices of the warping path wk = (i, j)k such that
j − r ≤ i ≤ j + r
where r is a term defining the allowed range of warping for a given point in a sequence.
[Figures: the Sakoe-Chiba band and the Itakura parallelogram, two common choices of global constraint]
18. FORMULATION
Let D(i, j) refer to the dynamic time warping distance between the subsequences
x1, x2, …, xi and y1, y2, …, yj. Then
D(i, j) = | x_i − y_j | + min{ D(i−1, j), D(i−1, j−1), D(i, j−1) }
19. SOLUTION BY DYNAMIC PROGRAMMING
Basic implementation: O(n²), where n is the length of the sequences, since the problem must be solved for each (i, j) pair.
If a warping window is specified, then O(nw): only solve for the (i, j) pairs where | i − j | <= w.
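The O(nw) variant can be sketched by filling only the cells inside a Sakoe-Chiba band of half-width w (an illustrative Python sketch, not code from the presentation):

```python
import math

def dtw_windowed(x, y, w):
    """DTW restricted to a Sakoe-Chiba band: only cells with |i - j| <= w
    are filled, so the cost drops from O(n^2) to O(n*w)."""
    n, m = len(x), len(y)
    w = max(w, abs(n - m))  # band must be wide enough to reach cell (n, m)
    D = [[math.inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        # only visit the columns inside the band around the diagonal
        for j in range(max(1, i - w), min(m, i + w) + 1):
            d = abs(x[i - 1] - y[j - 1])
            D[i][j] = d + min(D[i - 1][j], D[i - 1][j - 1], D[i][j - 1])
    return D[n][m]
```

With w = 0 and equal-length inputs, the path is forced onto the diagonal and the result degenerates to a point-to-point (Manhattan) distance.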
20. FEATURE-BASED DIMENSIONALITY REDUCTION
• Time series databases are often extremely large. Searching directly on these data would be very complex and inefficient.
• To overcome this problem, we can use transformation methods to reduce the dimensionality of the time series.
• These transformation methods are called dimensionality reduction techniques.
21. Dimensionality Reduction: An Example of a Technique
The graphic shows a time series C with 128 points (n = 128). The raw data used to produce the graphic is also reproduced as a column of numbers (just the first 30 or so points are shown): 0.4995, 0.5264, 0.5523, 0.5761, 0.5973, 0.6153, 0.6301, 0.6420, 0.6515, 0.6596, 0.6672, 0.6751, 0.6843, 0.6954, 0.7086, 0.7240, 0.7412, 0.7595, 0.7780, 0.7956, 0.8115, 0.8247, 0.8345, 0.8407, 0.8431, 0.8423, 0.8387, …
27. DISCRETIZATION
• Discretization of a time series is transforming it into a symbolic string.
• The main benefit of this discretization is that there is an enormous wealth of existing algorithms and data structures that allow the efficient manipulation of symbolic representations.
• Lin, Keogh et al. (2003) proposed a method called Symbolic Aggregate Approximation (SAX), which allows the discretization of original time series into symbolic strings.
28. SYMBOLIC AGGREGATE APPROXIMATION (SAX) [LIN ET AL. 2003]
baabccbc
The first symbolic representation of time series that allows discretization of time series into symbolic strings.
29. HOW DO WE OBTAIN SAX
First convert the time series to PAA representation, then convert the PAA to symbols: baabccbc
[Figure: the series C and its PAA segments, each mapped to a symbol a, b or c]
30. TWO PARAMETER CHOICES
The word size, in this case 8.
The alphabet size (cardinality), in this case 3.
[Figure: the series C with its 8 PAA segments mapped to the word baabccbc over the alphabet {a, b, c}]
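With the two parameters fixed (word size and alphabet size), SAX can be sketched in a few lines of Python: z-normalize, reduce with PAA, then map each segment mean to a letter via Gaussian breakpoints. The ±0.43 breakpoints below are the standard SAX table values for a 3-letter alphabet, rounded; function names are ours, and the series length is assumed divisible by the word size:

```python
import statistics

# Breakpoints splitting N(0,1) into three equiprobable regions (a, b, c).
BREAKPOINTS_3 = [-0.43, 0.43]

def paa(x, segments):
    """Piecewise Aggregate Approximation: the mean of each of `segments`
    equal-width frames (len(x) assumed divisible by segments)."""
    frame = len(x) // segments
    return [statistics.fmean(x[i * frame:(i + 1) * frame])
            for i in range(segments)]

def sax(x, word_size, breakpoints=BREAKPOINTS_3):
    """Discretize a series into a symbolic word: z-normalize, apply PAA,
    then assign each segment mean the letter of the region it falls in."""
    mu, sigma = statistics.fmean(x), statistics.pstdev(x)
    z = [(v - mu) / sigma for v in x]
    word = ""
    for mean in paa(z, word_size):
        region = sum(mean > b for b in breakpoints)  # 0, 1 or 2
        word += "abc"[region]
    return word

print(sax([0.0, 0.0, 0.0, 0.0, 10.0, 10.0, 10.0, 10.0], 2))  # ac
```

A flat-then-high series maps to "ac"; a steadily rising series of 8 points with word size 4 maps to "aacc".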
31. Structural representations help in understanding time series through data analysis + visualization.
SAX is claimed to be a landmark representation of time series:
Symbolic, and therefore allows use of discrete data structures and their corresponding algorithms for analysis.
Also helps with visualization.
32. THANK YOU
Datasets and code used in this presentation can be found at:
www.cs.ucr.edu/~eamonn/TSDMA/index.html