Periodic Pattern Mining in
Time Series Databases
Ashis Kumar Chanda
Swapnil Saha
Department of Computer Science and Engineering
University of Dhaka
Topics to be covered
- Introduction
- Key Terms
- Time Series Database
- Suffix Tree Generation
- Periodic Pattern Detection
- Conclusion
Introduction
What is a time-series database?
A time-series database consists of sequences of values or events obtained from repeated measurements over time, typically at fixed time intervals (e.g., hourly, daily, weekly).
A time series is a set of observations taken at specified times.
Consider a time series involving a variable Y.
If the series takes the values y1, y2, y3, ... at times t1, t2, t3, ..., then Y can be written as a function of time: Y = F(t)
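The characteristic movements of a time series are:
- Long-term movements
- Cyclic movements
- Seasonal movements
- Irregular or random movements
We denote these movements by the variables L, C, S and I respectively.
The time-series variable is then modeled as Y = L + C + S + I (additive)
or Y = L * C * S * I (multiplicative)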
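The two ways of combining the movements can be sketched in a few lines of Python. This is only an illustration: the function name `compose` and the component values are made up for the example and do not come from the slides.

```python
def compose(trend, cyclic, seasonal, irregular, multiplicative=False):
    """Combine the four movement components point by point."""
    parts = zip(trend, cyclic, seasonal, irregular)
    if multiplicative:
        return [l * c * s * i for l, c, s, i in parts]   # Y = L*C*S*I
    return [l + c + s + i for l, c, s, i in parts]       # Y = L+C+S+I

# Hypothetical component values at four time points.
L = [10.0, 11.0, 12.0, 13.0]
C = [1.0, 0.0, -1.0, 0.0]
S = [2.0, -2.0, 2.0, -2.0]
I = [0.5, -0.5, 0.0, 0.25]

additive = compose(L, C, S, I)
print(additive)  # [13.5, 8.5, 13.0, 11.25]
```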
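- Symbol periodicity (a single symbol repeats periodically)
axy apq amn
- Sequence periodicity (a sequence of symbols repeats periodically)
abxy abpq abmn
- Segment periodicity (the whole segment repeats)
abxy abxy abxy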
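- Perfect Periodicity
abxy abpq abmn
Here ab occurs at every expected position.
When some occurrences are missing, the periodicity is measured by a confidence value:
abxy acpq abmn
Here ab occurs at 2 of the 3 expected positions (0, 4, 8), so conf(4, 0, ab) = 2/3 = 0.67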
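As a rough sketch, the confidence can be computed as the fraction of expected positions at which the pattern is actually found. The function name `confidence` and its signature are assumptions made for this example; the spaces shown on the slide are treated as visual grouping only and are left out of the strings.

```python
def confidence(series, pattern, period, st_pos):
    """Fraction of expected positions at which `pattern` actually occurs."""
    m = len(pattern)
    # Expected start positions: st_pos, st_pos + period, ... while the pattern still fits.
    expected = range(st_pos, len(series) - m + 1, period)
    hits = sum(1 for pos in expected if series[pos:pos + m] == pattern)
    return hits / len(expected) if len(expected) else 0.0

perfect = confidence("abxyabpqabmn", "ab", period=4, st_pos=0)
imperfect = confidence("abxyacpqabmn", "ab", period=4, st_pos=0)
print(perfect, round(imperfect, 2))  # 1.0 0.67
```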
- Periodicity in a Subsection of a Time Series
T = gbxy asdf abpq abmn
stPos = 8
endPos = 15
So the periodic subsection is abpq abmn, the part of T from position 8 to position 15
- Periodicity with Time Tolerance
Time-series data is not always noise free, so a pattern may appear slightly before or after its expected position.
We therefore also check a few positions on either side of each expected position.
This extra margin is known as the time tolerance (tt).
If X is a pattern with period p in T, then we check for it
at stPos, stPos+p±tt, stPos+2p±tt, ...
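The check with time tolerance can be sketched as follows. This is a simplified illustration, not the method of the cited papers: the function name, the sample series and the choice to keep the expected positions fixed at stPos + i*p are all assumptions.

```python
def matches_with_tolerance(series, pattern, period, st_pos, tt):
    """For each expected position, report whether `pattern` starts within
    tt positions of it (the first position, st_pos, is checked exactly)."""
    m = len(pattern)
    last_start = len(series) - m
    results = []
    expected = st_pos
    while expected <= last_start:
        window = 0 if expected == st_pos else tt
        lo = max(0, expected - window)
        hi = min(last_start, expected + window)
        results.append(any(series[pos:pos + m] == pattern
                           for pos in range(lo, hi + 1)))
        expected += period
    return results

# Hypothetical noisy series: a stray 'z' shifts the later occurrences of ab.
noisy = "abxyzabpqabmn"
strict = matches_with_tolerance(noisy, "ab", period=4, st_pos=0, tt=0)
tolerant = matches_with_tolerance(noisy, "ab", period=4, st_pos=0, tt=1)
print(strict)    # [True, False, False]
print(tolerant)  # [True, True, True]
```

With tt = 0 the shifted occurrences are missed; with tt = 1 each of them falls inside the window around its expected position.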
- A period in a time series may be represented by a 5-tuple
(S, p, stPos, endPos, Conf)
S = the periodic pattern (a sequence of symbols)
p = the period: the pattern is checked every p characters
stPos, endPos = the starting and ending positions of the segment in which the pattern matches
Conf = the confidence
- Suppose T = abxy acpq abdd abmn
then (ab, 4, 0, 11, 1) means:
find the pattern ab in T from position 0 to position 11, checking every 4 characters
a b x y a c p q a b d  d  a b m n
0 1 2 3 4 5 6 7 8 9 10 11
Occurrence Vector:
a b c a b b a b b a $
0 1 2 3 4 5 6 7 8 9 10
Occurrence vector of a: (0, 3, 6, 9)
Occurrence vector of ab: (0, 3, 6)
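A direct scan is enough to reproduce these occurrence vectors. The helper below is a minimal sketch with an assumed name; it does not use a suffix tree.

```python
def occurrence_vector(series, pattern):
    """Start positions of every (possibly overlapping) occurrence of `pattern`."""
    m = len(pattern)
    return [i for i in range(len(series) - m + 1) if series[i:i + m] == pattern]

series = "abcabbabba$"
occ_a = occurrence_vector(series, "a")
occ_ab = occurrence_vector(series, "ab")
print(occ_a)   # [0, 3, 6, 9]
print(occ_ab)  # [0, 3, 6]
```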
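Difference Vector:
The difference vector holds the gaps between consecutive entries of an occurrence vector.
a b c a b b a b b a $
0 1 2 3 4 5 6 7 8 9 10
Occurrence vector of a: (0, 3, 6, 9)
Difference vector: (3, 3, 3)
Occurrence vector of bb: (4, 7)
Difference vector: (3)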
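The difference vector is simply the list of gaps between consecutive occurrences. A minimal sketch, with an assumed function name:

```python
def difference_vector(occurrences):
    """Gaps between consecutive entries of an occurrence vector."""
    return [b - a for a, b in zip(occurrences, occurrences[1:])]

diff_a = difference_vector([0, 3, 6, 9])   # occurrences of a in abcabbabba$
diff_bb = difference_vector([4, 7])        # occurrences of bb in abcabbabba$
print(diff_a, diff_bb)  # [3, 3, 3] [3]
```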
How do we get a string from a transactional database?
Discretization Technique
We need to define ranges or groups over the values in the database and represent each range by a unique ASCII character.
Suppose, in our previous example:
log in is denoted by a
log out is denoted by x
before log in is denoted by b
before log out is denoted by c
after log out is denoted by d
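The mapping above can be applied with a simple lookup table. In this sketch the dictionary name, the function name and the sample event list are hypothetical; only the event-to-symbol assignments come from the slide.

```python
# Event-to-symbol assignments taken from the slide.
SYMBOLS = {
    "log in": "a",
    "log out": "x",
    "before log in": "b",
    "before log out": "c",
    "after log out": "d",
}

def discretize(events):
    """Turn a sequence of event labels into a string of symbols."""
    return "".join(SYMBOLS[event] for event in events)

# Hypothetical sequence of four consecutive observations.
observations = ["log in", "before log out", "before log out", "log out"]
encoded = discretize(observations)
print(encoded)  # accx
```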
accx acxd axdd bacx
- ‘abcabbabb$’ has the following ten suffixes. We can ignore the 10th suffix when generating the suffix tree.
1. abcabbabb$
2. bcabbabb$
3. cabbabb$
4. abbabb$
5. bbabb$
6. babb$
7. abb$
8. bb$
9. b$
10. $
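The list of suffixes can be generated with one slice per start position. A minimal sketch, with an assumed function name:

```python
def suffixes(text):
    """All suffixes of `text`, longest first."""
    return [text[i:] for i in range(len(text))]

all_suffixes = suffixes("abcabbabb$")
for number, suffix in enumerate(all_suffixes, start=1):
    print(f"{number}. {suffix}")
```

The 10th entry is the terminator `$` on its own, which is the suffix that can be left out of the tree.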
Strings:
1. abcabbabb$
[Figure: suffix tree after inserting suffix 1]
Strings:
1. abcabbabb$
2. bcabbabb$
[Figure: suffix tree after inserting suffixes 1 to 2]
Strings:
1. abcabbabb$
2. bcabbabb$
3. cabbabb$
[Figure: suffix tree after inserting suffixes 1 to 3]
Strings:
1. abcabbabb$
2. bcabbabb$
3. cabbabb$
4. abbabb$
[Figure: suffix tree after inserting suffixes 1 to 4]
Strings:
1. abcabbabb$
2. bcabbabb$
3. cabbabb$
4. abbabb$
5. bbabb$
[Figure: suffix tree after inserting suffixes 1 to 5]
Strings:
1. abcabbabb$
2. bcabbabb$
3. cabbabb$
4. abbabb$
5. bbabb$
6. babb$
[Figure: suffix tree after inserting suffixes 1 to 6]
Strings:
1. abcabbabb$
2. bcabbabb$
3. cabbabb$
4. abbabb$
5. bbabb$
6. babb$
7. abb$
[Figure: suffix tree after inserting suffixes 1 to 7]
Strings:
1. abcabbabb$
2. bcabbabb$
3. cabbabb$
4. abbabb$
5. bbabb$
6. babb$
7. abb$
8. bb$
[Figure: suffix tree after inserting suffixes 1 to 8]
Strings:
1. abcabbabb$
2. bcabbabb$
3. cabbabb$
4. abbabb$
5. bbabb$
6. babb$
7. abb$
8. bb$
9. b$
[Figure: suffix tree after inserting suffixes 1 to 9]
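The insertion of one suffix at a time can be sketched with nested dictionaries. Note the simplification: this builds an uncompressed suffix trie (one character per edge) rather than the compact suffix tree drawn in the figures, and the function names are assumptions made for the example.

```python
def build_suffix_trie(text):
    """Insert suffixes 1..n-1 of `text` one at a time into a nested-dict trie.
    The lone terminator suffix is skipped, as on the slides."""
    root = {}
    for start in range(len(text) - 1):
        node = root
        for ch in text[start:]:
            # Follow the existing edge if there is one, otherwise branch.
            node = node.setdefault(ch, {})
    return root

def contains(root, pattern):
    """True if `pattern` labels a path from the root, i.e. is a substring."""
    node = root
    for ch in pattern:
        if ch not in node:
            return False
        node = node[ch]
    return True

trie = build_suffix_trie("abcabbabb$")
print(sorted(trie))            # ['a', 'b', 'c']
print(contains(trie, "bbab"))  # True
print(contains(trie, "ac"))    # False
```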
abcabbabb$
Each leaf node holds a number that represents the starting position of its suffix.
Each intermediate node holds a number that is the length of the substring read from the root to that node.
[Figure: suffix tree for abcabbabb$ annotated with leaf positions and node lengths]
abcabbabb$
Find Occurrence Vector
[Figure: annotated suffix tree for abcabbabb$; occurrence vectors are attached to the intermediate nodes step by step: (3,6), then (0,3,6), then (4,7) and (1,5,8,4,7)]
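One way to obtain these occurrence vectors is to record, at every node, the start positions of the suffixes that pass through it. The sketch below again uses an uncompressed trie instead of a compact suffix tree; the class and function names are assumptions made for the example.

```python
class Node:
    def __init__(self):
        self.children = {}
        self.positions = []   # start positions of all suffixes passing through this node

def build(text):
    """Build a suffix trie, skipping the lone terminator suffix."""
    root = Node()
    for start in range(len(text) - 1):
        node = root
        for ch in text[start:]:
            node = node.children.setdefault(ch, Node())
            node.positions.append(start)
    return root

def occurrence_vector(root, pattern):
    """Occurrence vector of `pattern`, read from the node its path ends at."""
    node = root
    for ch in pattern:
        node = node.children.get(ch)
        if node is None:
            return []
    return node.positions

tree = build("abcabbabb$")
for pattern in ("ab", "abb", "bb", "b"):
    print(pattern, occurrence_vector(tree, pattern))
```

The positions come out in increasing order here, so b yields [1, 4, 5, 7, 8]; the slide lists the same positions in tree traversal order, (1,5,8,4,7).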
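Input: a time series of size n
Output: positions of periodic patterns
Process:
for each occurrence vector of size k
    find the candidate period p (difference between consecutive occurrences)
    for each position from 0 to k
        check each occurrence that lies a multiple of p characters later
    compute the confidence
    add to the list if the confidence is at least the threshold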
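The process above might be sketched in Python as follows. This is a simplified reading of the pseudocode under stated assumptions, not the authors' implementation: occurrence vectors come from a direct scan rather than a suffix tree, the confidence is the number of hits divided by the number of positions at which the pattern could fit, and all names are invented for the example.

```python
def occurrence_vector(series, pattern):
    m = len(pattern)
    return [i for i in range(len(series) - m + 1) if series[i:i + m] == pattern]

def detect_periods(series, pattern, threshold):
    """Return (pattern, p, st_pos, end_pos, conf) tuples with conf >= threshold."""
    occ = occurrence_vector(series, pattern)
    m = len(pattern)
    found = []
    seen = set()                        # (period, phase) pairs already examined
    for j in range(len(occ) - 1):
        p = occ[j + 1] - occ[j]         # candidate period
        st_pos = occ[j]
        if (p, st_pos % p) in seen:
            continue
        seen.add((p, st_pos % p))
        # Occurrences that lie a whole number of periods after st_pos.
        hits = [pos for pos in occ[j:] if (pos - st_pos) % p == 0]
        # Number of positions st_pos, st_pos + p, ... where the pattern still fits.
        expected = (len(series) - m - st_pos) // p + 1
        conf = len(hits) / expected
        if conf >= threshold:
            found.append((pattern, p, st_pos, hits[-1], conf))
    return found

periods = detect_periods("abcabbabb$", "ab", threshold=0.8)
print(periods)  # [('ab', 3, 0, 6, 1.0)]
```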
abcabbabb$
ab - (0,3,6)
abb - (3,6)
bb - (4,7)
b - (1,5,8,4,7)
For ab:
stPos = 0
endPos = 6
p = 3 - 0 = 3
Now check the occurrence vector of ab:
if the difference between consecutive occurrences equals p, increment the count
Check the confidence
Add to the pattern list if confidence >= Θ
abcdabcabcab$
ab - (0,4,7,10)
stPos = 0
endPos = 10
p = 4 - 0 = 4
Now check the occurrence vector of ab:
if the difference between consecutive occurrences equals p, increment the count
Only one pair of consecutive occurrences (0 and 4) is separated by p = 4 between positions 0 and 10
abcdabcabcab$
ab - (0,4,7,10)
stPos = 4
endPos = 10
p = 7 - 4 = 3
Now check the occurrence vector of ab:
if the difference between consecutive occurrences equals p, increment the count
Three occurrences of ab (at 4, 7 and 10) are separated by p = 3 between positions 4 and 10
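The two checks on abcdabcabcab$ can be reproduced by splitting the occurrence vector into runs of equal consecutive difference. This is a small illustrative sketch with an assumed function name. It counts matching differences, so a run of three occurrences shows up as two differences.

```python
def difference_runs(occ):
    """Split an occurrence vector into maximal runs of equal consecutive difference.
    Each run is reported as (p, st_pos, end_pos, matching_differences)."""
    runs = []
    j = 0
    while j < len(occ) - 1:
        p = occ[j + 1] - occ[j]
        k = j + 1
        while k < len(occ) - 1 and occ[k + 1] - occ[k] == p:
            k += 1
        runs.append((p, occ[j], occ[k], k - j))
        j = k
    return runs

runs = difference_runs([0, 4, 7, 10])   # occurrence vector of ab in abcdabcabcab$
print(runs)  # [(4, 0, 4, 1), (3, 4, 10, 2)]
```

The first run is the single difference of 4 starting at position 0; the second covers the occurrences at 4, 7 and 10 with p = 3.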
- Elfeky proposed two separate algorithms, CONV and WARP, to detect symbol and segment periodicity.
But they cannot be applied to a sub-sequence of the series, and their complexities are O(n log n) and O(n^2) respectively.
- The algorithm in Han's paper can be applied to a sub-sequence,
but it needs user input.
- In this respect, the algorithm discussed here is better than the previous ones
- Complexity O(n log n)
- Works online
References
- Periodic pattern mining using suffix tree
by Rasheed, Al-Shalalfa & Alhajj, 2011
- Effective periodic pattern mining in time series database
by Nishi, Farhan, Samiullah, Jeong
- Data Mining: Concepts and Techniques
by J. Han & M. Kamber
- Database System Concepts
by Abraham Silberschatz, Korth, Sudarshan
Questions
Thank You
