1. Outline Introduction Mining by Learning Conclusion
A Unified Approach to Mining Complex Time-Series Data for Various Kinds of Patterns
Yi Wang (1), J.H. Feng (1), J.Y. Wang (1), Z.Q. Liu (2)
(1) Department of Computer Science, Tsinghua University, Beijing, 100084, China
(2) School of Creative Media, City University of Hong Kong, Hong Kong
IEEE ICDM Conference, 2007
Wang, et al Mining Complex Time-Series Data
2. Outline Introduction Mining by Learning Conclusion
1 Introduction
Aspects of Sequential Data Mining
Various Approaches or A Unified One
2 Mining by Learning
Learning the Temporal Structure as A Graph
Various Kinds of Hidden Markovian Models
Learning VLHMM
3 Conclusion
Mining Various Kinds of Patterns
Contributions
3. Outline Introduction Mining by Learning Conclusion Aspects of the Problem A Unified Approach
Aspects of Sequential Data Mining
Various Sequence Types:
univariate/multivariate,
integer (discrete)/real (continuous).
Various Mining Goals:
periodic pattern,
search-by-example,
frequent atomic pattern.
Difficulties:
uncertainty on the y-axis (e.g., noise),
uncertainty on the x-axis (e.g., time scale).
Figure annotation (periodic pattern): two periodic patterns, one with 3 realizations, the other with 2.
Figure annotation (uncertainty): matches with which? Or matches with both?
10. Outline Introduction Mining by Learning Conclusion Aspects of the Problem A Unified Approach
Various Approaches or A Unified One
Previous Research: various approaches.
Diagram: various types of sequences and difficulties; various mining algorithms; various mining results.
Our Work: a unified approach.
The Unified Approach:
learns various types of sequences by hidden Markovian models;
represents the temporal structure by a graph; and
mines various patterns by well-studied graph algorithms.
Diagram: learning hidden Markovian model; temporal structure as directed graph; graph algorithms for mining.
15. Outline Introduction Mining by Learning Conclusion Hidden Markovian Model Various HMMs VLHMM
Learning the Temporal Structure as A Graph
23. Outline Introduction Mining by Learning Conclusion Hidden Markovian Model Various HMMs VLHMM
Hidden Markov Model (HMM)
Given the number of states, S, the number of contexts is also S, since each context is a single state.
Short contexts → inaccurate modeling.
27. Outline Introduction Mining by Learning Conclusion Hidden Markovian Model Various HMMs VLHMM
Fixed nth-order Hidden Markov Model (n-HMM)
Given the number of states, S, and the context length, n, the
number of contexts is S^n.
Long contexts → accurate modeling, but inefficient learning.
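The context counts for the HMM and the n-HMM can be checked with a small sketch. It assumes states are numbered 1..S and that a context is simply a tuple of previous states; the function name `contexts` is hypothetical and not taken from the talk.

```python
from itertools import product


def contexts(num_states: int, order: int) -> list:
    """Enumerate every fixed-length context (tuple of previous states)."""
    states = range(1, num_states + 1)
    return list(product(states, repeat=order))


S = 3
first_order = contexts(S, 1)   # plain HMM: one context per state
second_order = contexts(S, 2)  # 2-HMM: one context per ordered pair of states

print(len(first_order), len(second_order))

# The count S**n grows exponentially with the context length n.
for n in range(1, 6):
    print(n, S ** n)
```

With S = 3, the count goes from 3 contexts at n = 1 to 243 at n = 5, which is the growth that makes learning with long fixed contexts inefficient.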
31. Outline Introduction Mining by Learning Conclusion Hidden Markovian Model Various HMMs VLHMM
Variable-length Hidden Markov Model (VLHMM)
Not all contexts have to be extended to a fixed length n;
Contexts have variable lengths: each is the shortest context that is still long enough
to accurately determine the next state;
Learning finds the minimum set of contexts needed for accurate modeling.
Figure: contexts over states 1, 2, 3 in three panels: HMM (three single-state contexts), n-HMM (nine two-state contexts), VLHMM (contexts of variable length).
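To make the idea of variable-length contexts concrete, here is a minimal sketch of how a state history could be mapped to a context. The context set `CONTEXTS` is made up for illustration (it is not the set shown in the talk's figure), and `match_context` is a hypothetical helper: it returns the context that is a suffix of the history.

```python
def match_context(history, contexts):
    """Return the longest context that is a suffix of the state history."""
    longest = max(map(len, contexts))
    for length in range(min(len(history), longest), 0, -1):
        suffix = tuple(history[-length:])
        if suffix in contexts:
            return suffix
    return None  # history is too short to match any context


# Hypothetical context set over states 1..3: after state 3 one step of memory
# is enough, while states 1 and 2 also need the state that preceded them.
CONTEXTS = {(3,), (1, 1), (2, 1), (3, 1), (1, 2), (2, 2), (3, 2)}

print(match_context([1, 2, 3], CONTEXTS))  # matches the short context (3,)
print(match_context([3, 2, 1], CONTEXTS))  # matches the longer context (2, 1)
```

This assumed set has 7 contexts, compared with 9 for a fixed second-order model over the same three states, and no context is a suffix of another, so every sufficiently long history matches exactly one.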
35. Outline Introduction Mining by Learning Conclusion Hidden Markovian Model Various HMMs VLHMM
Learning Variable-length Hidden Markov Model (VLHMM)
The number of contexts is unknown before learning, even when
the number of states, S, is given;
This situation is called “unknown model structure” in learning
theory, and is the most difficult of the four types of learning problems;
As the EM algorithm cannot learn the model structure, we
derived a structural-EM algorithm to learn the model;
It optimizes a Minimum-Entropy criterion to learn the
minimum set of contexts, and
optimizes the Maximum-Likelihood criterion to estimate the
model parameters.
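The following is not the structural-EM algorithm from the talk. It is only a simplified illustration of why an entropy criterion can guide context selection, under the assumption that the symbols are directly observed rather than hidden. The function `conditional_entropy` is a hypothetical helper that measures how uncertain the next symbol is given the previous `order` symbols.

```python
from collections import Counter, defaultdict
from math import log2


def conditional_entropy(seq, order):
    """Empirical entropy (bits) of the next symbol given the previous `order` symbols."""
    counts = defaultdict(Counter)
    for i in range(order, len(seq)):
        counts[tuple(seq[i - order:i])][seq[i]] += 1
    total = sum(sum(c.values()) for c in counts.values())
    h = 0.0
    for next_counts in counts.values():
        n_ctx = sum(next_counts.values())
        for n in next_counts.values():
            # weight of this (context, next) pair times its surprisal
            h -= (n / total) * log2(n / n_ctx)
    return h


seq = [1, 1, 2] * 20  # toy periodic sequence

print(conditional_entropy(seq, 1))  # positive: after a 1, the next symbol is ambiguous
print(conditional_entropy(seq, 2))  # zero: two symbols of context determine the next one
```

In this toy sequence only the context ending in 1 benefits from being extended; the context ending in 2 is already deterministic, so a variable-length model would leave it short.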
36. Outline Introduction Mining by Learning Conclusion Mining Patterns Contributions
Mining Various Kinds of Patterns
Align sequence with temporal structure
The Viterbi algorithm can set up a mapping from each element in the
sequence to a context in the graph.
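As a rough illustration of the alignment step, here is a log-space Viterbi sketch for a first-order HMM with discrete outputs. This is a simplification: in a VLHMM the graph nodes would be contexts rather than single states, and the outputs could be continuous. The toy model (states "A" and "B", symbols "x" and "y", and all probabilities) is invented for the example.

```python
from math import log


def viterbi(obs, states, log_start, log_trans, log_emit):
    """Most likely state path for `obs` under a first-order discrete HMM."""
    # best[t][s]: log-probability of the best path that ends in state s at time t
    best = [{s: log_start[s] + log_emit[s][obs[0]] for s in states}]
    back = []
    for o in obs[1:]:
        prev = best[-1]
        scores, pointers = {}, {}
        for s in states:
            p = max(states, key=lambda r: prev[r] + log_trans[r][s])
            scores[s] = prev[p] + log_trans[p][s] + log_emit[s][o]
            pointers[s] = p
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


def logs(table):
    """Convert a nested table of probabilities to log-probabilities."""
    return {k: {j: log(p) for j, p in row.items()} for k, row in table.items()}


states = ("A", "B")
log_start = {"A": log(0.5), "B": log(0.5)}
log_trans = logs({"A": {"A": 0.9, "B": 0.1}, "B": {"A": 0.1, "B": 0.9}})
log_emit = logs({"A": {"x": 0.8, "y": 0.2}, "B": {"x": 0.2, "y": 0.8}})

print(viterbi("xxyyy", states, log_start, log_trans, log_emit))
```

Each element of the returned path is the node assigned to the sequence element at the same position, which is the mapping the slide refers to.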
(Partial) Periodic Pattern
Finding cyclic paths in the graph. Many algorithms have been developed
to do this.
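One simple way to find cyclic paths in a small graph is a depth-first search, sketched below. The talk does not say which cycle-finding algorithm was used, so this choice is an assumption; the adjacency-list format and the toy graph are also invented for illustration. Each cycle is reported once, starting from its smallest node.

```python
def simple_cycles(graph):
    """Enumerate simple cycles of a small directed graph {node: [successors]}."""
    cycles = []

    def extend(start, node, path, on_path):
        for nxt in graph.get(node, ()):
            if nxt == start:
                cycles.append(path[:])  # closed a cycle back to the start node
            elif nxt > start and nxt not in on_path:
                # only visit nodes larger than start so each cycle is found once
                on_path.add(nxt)
                path.append(nxt)
                extend(start, nxt, path, on_path)
                path.pop()
                on_path.remove(nxt)

    for start in sorted(graph):
        extend(start, start, [start], {start})
    return cycles


# Toy temporal graph: a three-node loop plus a self-transition on "c".
graph = {"a": ["b"], "b": ["c"], "c": ["a", "c"]}
print(simple_cycles(graph))
```

This search can take exponential time in the worst case, so for larger graphs a dedicated method such as Johnson's algorithm for elementary circuits is the more usual choice.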
Search-by-Example
Feed the example to the Viterbi algorithm, which outputs the path that is
“most likely” given the example.
Frequent Atomic Pattern
Select those contexts that frequently appear in the training
sequence.
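Selecting frequent contexts can be sketched as a simple count over the aligned sequence. The sketch assumes the alignment has already produced one context per sequence element; the `aligned` list and the `min_support` threshold are made-up values, and `frequent_contexts` is a hypothetical helper.

```python
from collections import Counter


def frequent_contexts(aligned, min_support):
    """Contexts whose relative frequency in the aligned sequence is at least `min_support`."""
    counts = Counter(aligned)
    total = len(aligned)
    return {ctx: n / total for ctx, n in counts.items() if n / total >= min_support}


# Hypothetical alignment output: one context (tuple of states) per element.
aligned = [(1, 1), (1, 2), (3,), (1, 1), (1, 2), (3,), (2, 2), (1, 1)]
print(frequent_contexts(aligned, 0.25))
```

Here the context (2, 2) appears in only one of eight positions and is dropped, while the other three contexts meet the threshold.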
37. Outline Introduction Mining by Learning Conclusion Mining Patterns Contributions
Our Contribution
A unified framework: mining by learning
Mining from the learned temporal structure using well-studied
graph algorithms;
“Hidden” models support learning various kinds of sequences;
Probabilistic transitions (esp. self-transitions) encode
uncertainty in time scale; output p.d.f.s encode noise.
VLHMM for efficient and accurate learning and mining
Optimizing two criteria simultaneously by developing a
structural-EM algorithm;
Minimum-Entropy criterion → minimum number of parameters,
efficient and effective learning;
Maximum-Likelihood criterion → accurate learning of the
temporal structure.
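The point about self-transitions and time scale can be illustrated with a short sketch. With a self-transition probability a, the number of consecutive steps spent in a node follows a geometric distribution, so the same node can absorb both short and stretched realizations of a pattern. The function names and the value 0.8 are assumptions chosen for the example.

```python
def dwell_distribution(self_prob, max_len):
    """P(staying exactly d steps) for d = 1..max_len with self-transition probability self_prob."""
    return [self_prob ** (d - 1) * (1 - self_prob) for d in range(1, max_len + 1)]


def expected_dwell(self_prob):
    """Mean number of consecutive steps spent in the node."""
    return 1.0 / (1.0 - self_prob)


a = 0.8  # hypothetical self-transition probability
print(expected_dwell(a))
print(dwell_distribution(a, 5))
```

A higher self-transition probability gives a longer expected stay, which is how a single graph node can tolerate variation along the x-axis.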
38. Outline Introduction Mining by Learning Conclusion Mining Patterns Contributions
Thank You for Your Attention
More details and demos can be accessed online at:
http://dbgroup.cs.tsinghua.edu.cn/wangyi/vlhmm