Visualizing and Discovering Non-Trivial
Patterns in Large Time Series Databases
Quan Le
HCI Lab
23rd Mar, 2015
Jessica Lin, Eamonn Keogh, Stefano Lonardi, Jeffrey P. Lankford, Donna M. Nystrom
Computer Science & Engineering Department University of California, Riverside, CA 92521
Proceedings of the 30th VLDB Conference, Toronto, Canada, 2004
Contents
- Introduction
- Background
  - Time series data mining tasks
  - Visualizing Time Series
- VizTree
- Evaluation
- Conclusion
Fig 1. Time Series Visualization
Introduction
- Data visualization techniques are very important for data analysis.
- Visualizing massive time series datasets.
- VizTree: a time series pattern discovery and visualization system based on augmented suffix trees.
  - Frequently occurring patterns (motif discovery)
  - Surprising patterns (anomaly discovery)
  - Query by content
  - Measuring the dissimilarity between any two time series
Introduction
- The U.S. Department of Defense (DoD) and The Aerospace Corp. (TAC)
- There are two major research directions:
  - Producing better techniques to mine the archival launch data from previous missions (mining stage).
  - Producing better techniques to visualize the streaming telemetry data in the hours before launch (monitoring stage).
- Dr. Ben Shneiderman of the University of Maryland: “Overview, zoom & filter, details-on-demand”.
Background
- Time Series data mining tasks
- Visualizing Time Series
Time Series data mining tasks
- Subsequence matching
- Motif discovery
- Anomaly detection
Subsequence Matching
- Sequence matching has long been divided into two categories:
whole matching and subsequence matching.
- Subsequence matching: a short query subsequence time series
is matched against longer time series by sliding it along the
longer sequence, looking for the best matching location.
- Chunking: the process where a time series is broken into individual time series by a specific period (for example, one day).
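The sliding comparison described above can be sketched as follows. This is a minimal illustration, not the method used in the paper: the function names `euclidean` and `best_match` are hypothetical, and plain Euclidean distance is assumed as the dissimilarity measure.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def best_match(query, series):
    """Slide `query` along `series` and return (offset, distance) of the closest window."""
    m = len(query)
    if m == 0 or m > len(series):
        raise ValueError("query must be non-empty and no longer than the series")
    best_offset, best_dist = 0, float("inf")
    for offset in range(len(series) - m + 1):
        d = euclidean(query, series[offset:offset + m])
        if d < best_dist:
            best_offset, best_dist = offset, d
    return best_offset, best_dist


# Illustrative data (assumed, not from the slides).
series = [0.0, 0.1, 0.0, 1.0, 2.0, 1.0, 0.0, 0.1]
query = [1.0, 2.0, 1.0]
offset, dist = best_match(query, series)
print(offset, dist)
```

The brute-force scan costs one distance computation per offset, which is fine for a sketch but slow on very long series.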
Fig 2. A weekly map of drought conditions in Texas
Time Series Motif Discovery
- A substantial body of literature has been devoted to
techniques to discover frequently recurring, overrepresented
patterns in time series.
Fig 3. Time series Subsequence Motifs Discovery
Anomaly Detection
- The problem of detecting anomalous/surprising patterns has attracted much attention.
- Keogh’s definition: a surprising pattern is one “whose frequency of occurrence differs substantially from that expected, given previously seen data”.
Fig 4. Illustration of Anomalous Series Detection (red represents anomalous time series)
Visualizing Time Series
- TimeSearcher
- Cluster and calendar-based visualization
- Spirals
TimeSearcher
- A time series exploration and visualization tool that allows users to retrieve time series by creating queries.
Fig 5. The TimeSearcher visual query interface. Users can filter out uninteresting sequences by requiring that every sequence have at least one data point within the query box.
http://www.cs.umd.edu/hcil/timesearcher/videos/TimeSearcherDemo.mp4
- Flexibility
- Specify different regions to compare.
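The query-box filter in the Fig 5 caption can be sketched as below. It follows the caption's wording (at least one data point inside the box); the actual TimeSearcher semantics may differ, and the names `in_box` and `filter_by_box` and the sample data are assumptions made for illustration.

```python
def in_box(series, t_lo, t_hi, v_lo, v_hi):
    """True if at least one sample with index in [t_lo, t_hi] has a value in [v_lo, v_hi]."""
    return any(v_lo <= v <= v_hi for v in series[t_lo:t_hi + 1])


def filter_by_box(collection, t_lo, t_hi, v_lo, v_hi):
    """Keep only the named series that pass through the query box."""
    return {
        name: s
        for name, s in collection.items()
        if in_box(s, t_lo, t_hi, v_lo, v_hi)
    }


# Illustrative collection (assumed data).
collection = {"a": [1, 2, 5, 2], "b": [1, 1, 1, 1], "c": [4, 4, 0, 0]}
kept = filter_by_box(collection, t_lo=1, t_hi=2, v_lo=3, v_hi=6)
print(sorted(kept))
```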
Cluster and Calendar-based visualization
- The time series data are chunked into sequences of day patterns.
- This visualization system displays the patterns on a calendar, with each day color-coded by the cluster that it belongs to.
Fig 6. The cluster and calendar-based visualization on employee working hours data. It shows 6 clusters, representing different working day patterns.
- Good overview
- Limited to calendar-based data
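A rough sketch of the chunk-then-label idea follows. It assumes a fixed number of samples per day and takes the cluster centroids as given; the real system derives its clusters from the data, so `chunk`, `nearest_cluster`, and the sample values are illustrative assumptions only.

```python
def chunk(series, period):
    """Split a series into consecutive chunks of `period` samples.

    A trailing partial chunk is dropped.
    """
    if period <= 0:
        raise ValueError("period must be positive")
    return [series[i:i + period] for i in range(0, len(series) - period + 1, period)]


def nearest_cluster(day, centroids):
    """Index of the centroid with the smallest squared Euclidean distance to `day`."""
    return min(
        range(len(centroids)),
        key=lambda k: sum((x - c) ** 2 for x, c in zip(day, centroids[k])),
    )


# Four samples per "day" (assumed data); centroids are supplied, not learned.
readings = [0, 0, 8, 8, 0, 0, 9, 7, 0, 0, 0, 0]
days = chunk(readings, 4)
centroids = [[0, 0, 8, 8], [0, 0, 0, 0]]
labels = [nearest_cluster(d, centroids) for d in days]
print(labels)
```

Each label would then select the color of the corresponding calendar cell.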
Spirals
- Weber developed this tool to visualize time series on spirals.
- Useful for identifying periodic structures in the data.
- Of limited use for time series that do not exhibit periodic behavior.
- Requires pixel space linear in the length of the time series.
Fig 7. The Spiral visualization approach of Weber applied to the power usage
dataset
VizTree - Motivation
Here are two sets of bit strings.
Which set was generated by a human and which by a computer?
0101100101111001101001000010001010
0110110101110000101010111011111000
1101101101111110100110010010001101
0001111001101101000101111000101101
0011011001101000000100110001001110
000011101001100101100001010010
1000100010100100010101010000101010
0010101110111101011010010111010010
1010011101010101001010010101011101
0101001010101011010101001011001011
1011110100011100001010000100111010
100011100001010101100101110101
VizTree - Motivation
Fig 8. (Left) Computer-generated random bits presented as an augmented suffix tree.
(Right) Human-constructed bits presented as an augmented suffix tree.
[Figure 8 labels: root branches 0 and 1; marked subsequences (0,1,0), (1,0,1), (0,1,1). The two sets of bit strings from the previous slide are repeated beside the trees.]
VizTree - Motivation
- The strings represented in the tree are in fact “subsequences” rather than “suffixes”.
- Real-valued time series are first converted to symbols using a time series discretization method.
- Given the same parameters, the tree has the same overall shape for any dataset.
Fig 9. VizTree Tool
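The points above can be illustrated with a small counting sketch. It enumerates every path of a fixed depth over the alphabet, so the set of paths is identical for any input and only the counts differ. The function name `subsequence_counts` is hypothetical, and a flat dictionary stands in for the tree; this is not the VizTree implementation.

```python
from collections import Counter
from itertools import product


def subsequence_counts(symbols, length, alphabet):
    """Count every sliding-window subsequence of `length` symbols.

    All len(alphabet) ** length paths are returned, including those with a
    zero count, so the result has the same keys for any input string.
    """
    counts = Counter(symbols[i:i + length] for i in range(len(symbols) - length + 1))
    paths = ("".join(p) for p in product(alphabet, repeat=length))
    return {path: counts.get(path, 0) for path in paths}


# First line of the first bit-string set shown on the motivation slide.
bits = "0101100101111001101001000010001010"
counts = subsequence_counts(bits, 3, "01")
for path, n in counts.items():
    print(path, n)
```

In a drawing of the tree, each count would set the thickness of the corresponding branch.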
Discretizing time series method
SAX (Symbolic Aggregate approXimation)
Example SAX word: baabccbc
Discretizing time series method
- Step 1: Convert the time series to PAA (Piecewise Aggregate Approximation).
- Step 2: Convert the PAA representation to symbols.
Fig 10. A summarization of the notation used
Discretizing time series method
Fig 11. A time series dataset of electrical consumption (of length 1024) is converted into an eight-symbol string “acdcbdba”. Note that the general shape of the time series is preserved, in spite of the massive amount of dimensionality reduction.
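The two-step conversion (time series to PAA, PAA to symbols) can be sketched as below. Several choices here are assumptions made to keep the example short: the series length must be divisible by the number of segments, z-normalization uses the population standard deviation, and breakpoints are the equiprobable cut points of a standard normal distribution. The function names `paa` and `sax` are illustrative.

```python
from statistics import NormalDist, mean, pstdev


def paa(series, segments):
    """Piecewise Aggregate Approximation: the mean of each equal-width frame.

    Assumes len(series) is divisible by `segments`.
    """
    n = len(series)
    if segments <= 0 or n % segments != 0:
        raise ValueError("len(series) must be a positive multiple of segments")
    width = n // segments
    return [mean(series[i:i + width]) for i in range(0, n, width)]


def sax(series, segments, alphabet="abcd"):
    """Convert a series into a word of `segments` symbols drawn from `alphabet`."""
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        raise ValueError("constant series cannot be z-normalized")
    normalized = [(x - mu) / sigma for x in series]
    size = len(alphabet)
    # Cut points that split the standard normal into equiprobable regions.
    breakpoints = [NormalDist().inv_cdf(k / size) for k in range(1, size)]
    word = []
    for value in paa(normalized, segments):
        index = sum(value > b for b in breakpoints)  # number of cut points below value
        word.append(alphabet[index])
    return "".join(word)


# Illustrative data (assumed, not the electrical consumption dataset).
print(sax([0, 0, 1, 1, 5, 5, 9, 9], segments=4))
```

Fewer segments and a smaller alphabet give a coarser word, which is what makes the resulting tree small enough to draw.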
VizTree – First Look
Fig 12. A screenshot of VizTree
Labeled regions in the screenshot:
- The parameter setting area
- The actual subsequence when the technician clicks on a branch
- The input time series
- The subsequence tree for the time series
- Zoom-in window
VizTree
- Motif Discovery
- Anomaly Detection
- Diff-Tree (Surprising Patterns)
Motif Discovery
Fig 13. Example of Motif discovery on the winding dataset. Two nearly identical
subsequences are identified, among the other motifs.
Anomaly Detection
Fig 14. Heartbeat data containing an anomaly. While the subsequence tree can be used to identify motifs, it can be used for simple anomaly detection as well.
Diff-Tree (Surprising Patterns)
Fig 15. The blue ECG data is the reference data and the green ECG data is the testing data. The resulting tree shows the difference in pattern distributions between the two datasets. The surprising patterns are ranked and shown in red.
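One simple way to compare the pattern distributions of a reference and a testing sequence is sketched below. The score used here, the difference in relative frequency of each subsequence, is an assumption chosen for illustration; the scoring in the original system may differ. The names `relative_frequencies` and `rank_surprising` and the symbol strings are hypothetical.

```python
from collections import Counter


def relative_frequencies(symbols, length):
    """Relative frequency of each sliding-window subsequence of `length` symbols."""
    total = len(symbols) - length + 1
    if total <= 0:
        raise ValueError("sequence is shorter than the subsequence length")
    counts = Counter(symbols[i:i + length] for i in range(total))
    return {pattern: c / total for pattern, c in counts.items()}


def rank_surprising(reference, test, length):
    """Rank patterns by how much their relative frequency changes from reference to test."""
    ref = relative_frequencies(reference, length)
    tst = relative_frequencies(test, length)
    scores = {p: tst.get(p, 0.0) - ref.get(p, 0.0) for p in set(ref) | set(tst)}
    # Largest absolute change first; ties are broken alphabetically.
    return sorted(scores.items(), key=lambda kv: (-abs(kv[1]), kv[0]))


# Illustrative symbol strings (assumed, standing in for discretized ECG data).
reference = "abababababab"
test = "ababccababab"
for pattern, score in rank_surprising(reference, test, 2):
    print(pattern, round(score, 3))
```

A positive score means the pattern is more frequent in the testing data than in the reference data, and a negative score means it is less frequent.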
Evaluation
- Subsequence Matching & Motif Discovery: human motion data of yoga postures
- Anomaly Detection: power consumption data
Yoga Postures – Subsequence Matching
- A model performed yoga routines in front of a green screen.
- The motion capture is transformed into a time series.
- The length of the time series is approximately 26,000.
Fig 16. The sample yoga sequence that we are interested in finding
Fig 17. Matches for the yoga sequence. The
bottom right corner shows how similar
these two subsequences are
Yoga Postures – Motif Discovery
- Identify approximate motifs by examining the subsequences represented by thick tree paths.
Fig 18. Example of Motifs discovery
on the winding dataset. Two nearly
identical subsequences are identified,
among the other motifs.
Power Consumption – Anomaly Detection
- Electricity consumption was recorded every 15 minutes in 1997.
Fig 19. Anomaly detection on power
consumption data. The anomaly
shown here is a short week during
Christmas.
Conclusion
- VizTree is proposed as a visualization framework for massive time series datasets.
- It serves both mining and monitoring purposes.
- It can process new data as they arrive.
Thank you!

Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptx
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
 
result management system report for college project
result management system report for college projectresult management system report for college project
result management system report for college project
 

[PDF] Visualizing and discovering non trivial patterns in large time-series databases

  • 1. Visualizing and Discovering Non-Trivial Patterns in Large Time Series Databases. Quan Le, HCI Lab, 23rd Mar, 2015. Jessica Lin, Eamonn Keogh, Stefano Lonardi, Jeffrey P. Lankford, Donna M. Nystrom. Computer Science & Engineering Department, University of California, Riverside, CA 92521. Proceedings of the 30th VLDB Conference, Toronto, Canada, 2004.
  • 2. Contents: Introduction; Background (Time series data mining tasks, Visualizing Time Series); VizTree; Evaluation; Conclusion. Fig 1. Time Series Visualization
  • 3. Introduction - Data visualization techniques are very important for data analysis. - The focus is on visualizing massive time series datasets. - VizTree: a time series pattern discovery and visualization system based on augmented suffix trees. It covers frequently occurring patterns (motif discovery), surprising patterns (anomaly discovery), query by content, and measuring the dissimilarity between any two time series.
  • 4. Introduction - Context: the U.S. Department of Defense (DoD) and The Aerospace Corp (TAC). - There are two major research directions: producing better techniques to mine the archival launch data from previous missions (mining stage), and producing better techniques to visualize the streaming telemetry data in the hours before launch (monitoring stage). - Dr. Ben Shneiderman of the University of Maryland: “Overview, zoom & filter, details-on-demand”.
  • 5. Background: Time series data mining tasks; Visualizing Time Series
  • 6. Time series data mining tasks: Subsequence matching; Motif discovery; Anomaly detection
  • 7. Subsequence Matching - Sequence matching has long been divided into two categories: whole matching and subsequence matching. - Subsequence matching: a short query time series is matched against a longer time series by sliding it along the longer sequence, looking for the best matching location. - Chunking: the process where a time series is broken into individual time series, for example by a specific period. Fig 2. A weekly map of drought conditions in Texas
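The sliding comparison described on slide 7 can be sketched as follows. This is a minimal illustration, not the method used in the paper: it assumes plain Euclidean distance with no normalization, and the function name `best_match` and the toy data are hypothetical.

```python
import math


def best_match(query, series):
    """Slide `query` along `series` and return (offset, distance) of the
    closest window, using plain Euclidean distance (an assumption)."""
    m = len(query)
    if m == 0 or m > len(series):
        raise ValueError("query must be non-empty and no longer than series")
    best_offset, best_dist = -1, math.inf
    for offset in range(len(series) - m + 1):
        window = series[offset:offset + m]
        dist = math.sqrt(sum((q - w) ** 2 for q, w in zip(query, window)))
        if dist < best_dist:
            best_offset, best_dist = offset, dist
    return best_offset, best_dist


# Toy data: the query shape occurs exactly once, starting at index 3.
series = [0.0, 0.1, 0.0, 1.0, 2.0, 1.0, 0.0, 0.1]
query = [1.0, 2.0, 1.0]
print(best_match(query, series))
```

The loop costs time proportional to the series length times the query length, which is one reason a summarizing structure is attractive for very long series.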
  • 8. Time Series Motif Discovery - A substantial body of literature has been devoted to techniques for discovering frequently recurring, overrepresented patterns in time series. Fig 3. Time series subsequence motif discovery
  • 9. Anomaly Detection - The problem of detecting anomalous/surprising patterns has attracted much attention. - Keogh’s definition: a surprising pattern is one “whose frequency of occurrence differs substantially from that expected, given previously seen data”. Fig 4. Illustration of anomalous series detection (red represents the anomalous time series)
  • 10. Visualizing Time Series: TimeSearcher; Cluster and Calendar-based visualization; Spirals
  • 11. TimeSearcher - A time series exploration and visualization tool that allows users to retrieve time series by creating queries. - Flexibility. - Users can specify different regions to compare. Fig 5. The TimeSearcher visual query interface. Users can filter away sequences that are not interesting by insisting that all sequences have at least one data point within the query box. http://www.cs.umd.edu/hcil/timesearcher/videos/TimeSearcherDemo.mp4
  • 12. Cluster and Calendar-based visualization - The time series data are chunked into sequences of day patterns. - This visualization system displays the patterns on a calendar, with each day color-coded by the cluster that it belongs to. - Good overview. - Limited to calendar-based data. Fig 6. The cluster and calendar-based visualization on employee working hours data. It shows 6 clusters, representing different working day patterns.
  • 13. Spirals - Weber developed this tool to visualize time series on spirals. - Helps identify periodic structures in the data. - Of limited use for time series that do not exhibit periodic behavior. - Requires pixel space linear in the length of the time series. Fig 7. The Spiral visualization approach of Weber applied to the power usage dataset
  • 14. VizTree - Motivation Here are two sets of bit strings. Which set was generated by a human and which one by a computer? Set 1: 0101100101111001101001000010001010 0110110101110000101010111011111000 1101101101111110100110010010001101 0001111001101101000101111000101101 0011011001101000000100110001001110 000011101001100101100001010010 Set 2: 1000100010100100010101010000101010 0010101110111101011010010111010010 1010011101010101001010010101011101 0101001010101011010101001011001011 1011110100011100001010000100111010 100011100001010101100101110101
  • 15. VizTree - Motivation Fig 8. (Left) Computer-generated random bits presented as an augmented suffix tree. (Right) Human-constructed bits presented as an augmented suffix tree. Branch labels visible in the figure: 0, 1, (0,1,0), (1,0,1), (0,1,1). The two sets of bit strings from slide 14 are repeated under the trees.
  • 16. VizTree - Motivation - The strings represented in the tree are in fact “subsequences” rather than “suffixes”. - The strings are obtained with a time series discretization method. - Given the same parameters, the tree has the same overall shape for any dataset. Fig 9. VizTree Tool
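A rough sketch of the counting behind such a subsequence tree, assuming the input is already a symbol string (as with the bit strings on slide 14). The function name `branch_frequencies` is hypothetical and this is not VizTree's actual implementation. Every possible word of the given depth gets a branch, which is why the overall shape depends only on the parameters; only the frequencies, which a tool could draw as branch thickness, depend on the data.

```python
from collections import Counter
from itertools import product


def branch_frequencies(symbols, alphabet, depth):
    """Relative frequency of every possible word of length `depth`,
    counted over all sliding subsequences of `symbols`."""
    if len(symbols) < depth:
        raise ValueError("symbol string is shorter than the requested depth")
    counts = Counter(symbols[i:i + depth]
                     for i in range(len(symbols) - depth + 1))
    total = sum(counts.values())
    # Words that never occur still get a branch, with frequency 0.0.
    return {"".join(word): counts["".join(word)] / total
            for word in product(alphabet, repeat=depth)}


freqs = branch_frequencies("0101100101", alphabet="01", depth=3)
for word, freq in freqs.items():
    print(word, freq)
```

Comparing the frequency tables of the two bit-string sets would show numerically what the two trees in Fig 8 show visually.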
  • 17. Discretizing time series method: SAX (Symbolic Aggregate ApproXimation). Example output string: baabccbc
  • 18. Discretizing time series method - Step 1: convert the time series to PAA (Piecewise Aggregate Approximation). - Step 2: convert the PAA to symbols. Fig 10. A summarization of the notation used
  • 19. Discretizing time series method Fig 11. A time series dataset of electrical consumption (of length 1024) is converted into an eight-symbol string “acdcbdba”. Note that the general shape of the time series is preserved, in spite of the massive amount of dimensionality reduction.
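The two discretization steps (time series to PAA, then PAA to symbols) can be sketched as below. This is a minimal sketch under stated assumptions: the series is z-normalized with the population standard deviation, the series length is a multiple of the number of segments, and the breakpoints are chosen so that each symbol is equally probable under a standard normal distribution. The function names `znorm`, `paa` and `sax` are illustrative, not taken from the slides.

```python
from statistics import NormalDist, mean, pstdev


def znorm(series):
    """Rescale to zero mean and unit variance (constant series map to 0)."""
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        return [0.0 for _ in series]
    return [(x - mu) / sigma for x in series]


def paa(series, segments):
    """Piecewise Aggregate Approximation: mean of each equal-width segment."""
    if segments <= 0 or len(series) % segments != 0:
        raise ValueError("series length must be a multiple of segments")
    size = len(series) // segments
    return [mean(series[i * size:(i + 1) * size]) for i in range(segments)]


def sax(series, segments, alphabet="abcd"):
    """Map each PAA value to a symbol using equiprobable Gaussian breakpoints."""
    a = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(k / a) for k in range(1, a)]
    word = []
    for value in paa(znorm(series), segments):
        index = sum(value > b for b in breakpoints)  # breakpoints below value
        word.append(alphabet[index])
    return "".join(word)


# Toy series: low, middle, high, middle plateaus of four points each.
series = [1, 1, 1, 1, 5, 5, 5, 5, 9, 9, 9, 9, 5, 5, 5, 5]
print(sax(series, segments=4, alphabet="abc"))
```

With 16 points reduced to 4 symbols, the word still reflects the low, middle, high, middle shape of the toy series, which is the effect Fig 11 illustrates at a larger scale.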
  • 20. VizTree - First Look Fig 12. A screenshot of VizTree. Labeled areas: the parameter setting area; the actual subsequences shown when the technician clicks on a branch; the input time series; the subsequence tree for the time series; the zoom-in window.
  • 21. VizTree: Motif Discovery; Anomaly Detection; Diff-Tree (Surprising Patterns)
  • 22. Motif Discovery Fig 13. Example of motif discovery on the winding dataset. Two nearly identical subsequences are identified, among the other motifs.
  • 23. Anomaly Detection Fig 14. Heart-beat data with an anomaly is shown. While the subsequence tree can be used to identify motifs, it can be used for simple anomaly detection as well.
  • 24. Diff-Tree (Surprising Patterns) Fig 15. The blue ECG data is the reference data and the green ECG data is the testing data. The resulting tree shows the difference in pattern distributions between the two datasets. The surprising patterns are ranked, with the most surprising one marked in red.
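One simple way to compare the pattern distributions of a reference and a test sequence is sketched below. It assumes both sequences have already been discretized into symbol strings, and it ranks words by the absolute difference in relative frequency. That scoring rule is an assumption made for illustration; the slides do not state how the Diff-Tree scores surprise. The names `word_frequencies` and `rank_surprising` are hypothetical.

```python
from collections import Counter


def word_frequencies(symbols, depth):
    """Relative frequency of each length-`depth` word in `symbols`."""
    if len(symbols) < depth:
        raise ValueError("symbol string is shorter than the requested depth")
    counts = Counter(symbols[i:i + depth]
                     for i in range(len(symbols) - depth + 1))
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def rank_surprising(reference, test, depth):
    """Words sorted by how much their frequency in `test` departs from
    `reference` (largest absolute difference first)."""
    ref = word_frequencies(reference, depth)
    tst = word_frequencies(test, depth)
    diffs = {word: tst.get(word, 0.0) - ref.get(word, 0.0)
             for word in set(ref) | set(tst)}
    return sorted(diffs.items(), key=lambda item: (-abs(item[1]), item[0]))


# Toy strings: the test data breaks the regular "ab" rhythm of the reference.
for word, diff in rank_surprising("abababab", "ababccab", depth=2):
    print(word, round(diff, 3))
```

A positive difference means the word is overrepresented in the test data and a negative one means it is underrepresented; both directions count as surprising under this rule.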
  • 25. Evaluation - Subsequence matching & motif discovery: human motion data of yoga postures. - Anomaly detection: power consumption data.
  • 26. Yoga Postures - Subsequence Matching - A model performed yoga routines in front of a green screen. - The motion capture is transformed into a time series. - The length of the time series is approximately 26,000. Fig 16. The sample yoga sequence that we are interested in finding
  • 27. Yoga Postures - Subsequence Matching Fig 17. Matches for the yoga sequence. The bottom right corner shows how similar these two subsequences are
  • 28. Yoga Postures - Motif Discovery - Identify approximate motifs by examining the subsequences represented by thick tree paths. Fig 18. Example of motif discovery on the winding dataset. Two nearly identical subsequences are identified, among the other motifs.
  • 29. Power Consumption - Anomaly Detection - Electricity consumption was recorded every 15 minutes in 1997. Fig 19. Anomaly detection on power consumption data. The anomaly shown here is a short week during Christmas.
  • 30. Conclusion - Proposed VizTree as a visualization framework for massive time series datasets. - It serves both mining and monitoring purposes. - Processing new data as it arrives.