SlideShare a Scribd company logo
1 of 24
Download to read offline
Enumerating Hub Motifs
in Time Series
Based on the Matrix Profile
1 National Institute of Advanced Industrial Science and Technology (AIST)
2 Mitsubishi Electric Corporation
3 LeapMind Inc.
5th Workshop on Mining and Learning from Time Series (MiLeTS’19)
Aug 5, 2019 - Anchorage, Alaska, USA
Genta Yoshimura1,2 Atsunori Kanemura1,3 Hideki Asoh1
Outline
1. Introduction
• Motif Enumeration
• Problems in Existing Methods
2. Method
• Novel Motif Definition: Hub Motif
• Proposed Method: HubFinder
3. Experiments
• Synthetic Data
• Human Motion Data
4. Conclusion
• Summary
2G. Yoshimura et al. Enumerating Hub Motifs in Time Series Based on the Matrix ProfileMiLeTS'19
Outline
1. Introduction
• Motif Enumeration
• Problems of Existing Methods
2. Method
• Novel Motif Definition: Hub Motif
• Proposed Method: HubFinder
3. Experiments
• Synthetic Data
• Human Motion Data
4. Conclusion
• Summary
3G. Yoshimura et al. Enumerating Hub Motifs in Time Series Based on the Matrix ProfileMiLeTS'19
Motif Enumeration from Time Series
Motif = a subsequence that occurs frequently in time series
• Finding motifs is useful for many time series mining tasks
• Classification, forecasting, segmentation, anomaly detection, …
Motif Enumeration
• Enumerate multiple motifs in order of significance
rather than fining a single motif
• Most time series include multiple patterns
• In our problem setting, motif length W is a tunable parameter
• Not variable-length motifs, but fixed-length motifs
4G. Yoshimura et al. Enumerating Hub Motifs in Time Series Based on the Matrix ProfileMiLeTS'19
Introduction
Time series
Time
Motif 1 Motif 2
W W W W
The difference of two definitions arises from
how we regard a subsequence as significant
1. Range motif
• A subsequence is significant
if there exist many subsequences
inside the sphere of radius R
2. Closest-pair motif
• A subsequence is significant
if the distance to its closest
subsequence is small
Note
• Z-normalized Euclidean Distance (ED) is used as subsequence distance
• Trivial-matches are ignored when finding neighbor subsequences
n1=6 n2=3
Existing Two Motif Definitions
5G. Yoshimura et al. Enumerating Hub Motifs in Time Series Based on the Matrix ProfileMiLeTS'19
Introduction
>significant
>significant
d1=0.63 d2=0.89
Subsequence whose length is W
R
Existing methods [Bagnall+14] require a radius parameter R
• Place spheres of radius R so as not to overlap each other
• Iteratively find the most significant subsequence as motif
and remove subsequences inside the sphere of radius R
1. SetFinder = Range motif based method
2. ScanMK = Closest-pair motif based method
Existing Motif Enumeration Methods
6G. Yoshimura et al. Enumerating Hub Motifs in Time Series Based on the Matrix ProfileMiLeTS'19
Introduction
R
R ×
argmaxi ni
argmini di
argmaxi ni
argmini di
remove
remove
remove
remove
Problems in Existing Methods
Existing methods suffer from the radius parameter R
1. It is not easy to tune R
• Appropriate parameter R changes
in accordance with the target dataset
• We cannot even know which R is appropriate
in most real applications where no ground truth is available
2. There are cases where the existing methods fail to
enumerate motifs successfully no matter how finely tune R
• Such cases can be easily made and actually occur in real datasets
Novel motif enumeration method is necessary
7G. Yoshimura et al. Enumerating Hub Motifs in Time Series Based on the Matrix ProfileMiLeTS'19
Introduction
R
R
R
?
Too small…Too large…
Outline
1. Introduction
• Motif Enumeration
• Problems of Existing Methods
2. Method
• Novel Motif Definition: Hub Motif
• Proposed Method: HubFinder
3. Experiments
• Synthetic Data
• Human Motion Data
4. Conclusion
• Summary
8G. Yoshimura et al. Enumerating Hub Motifs in Time Series Based on the Matrix ProfileMiLeTS'19
Novel Motif Definition: Hub Motif
In order to get free from the radius parameter R
1. Range motif
• A subsequence is significant
if there exist many subsequences
with in the sphere of radius R
2. Closest-pair motif
• A subsequence is significant
if the distance to its closest
subsequence is small
3. Hub motif
• A subsequence is significant
if a sum of distances from
other subsequences is small
• Looks like a wheel hub
9G. Yoshimura et al. Enumerating Hub Motifs in Time Series Based on the Matrix ProfileMiLeTS'19
Method
R
d1=0.63 d2=0.89
n1=6 n2=3
>significant
Σk dik=5.12 Σk djk=7.36
Proposed Method: HubFinder
HubFinder does not require the radius parameter R
• Motif length: W
• Number of motifs: K
HubFinder consists of two steps
1. Extract candidates for motifs using the matrix profile
2. Refine candidates into K motifs according to the hub motif significance
10G. Yoshimura et al. Enumerating Hub Motifs in Time Series Based on the Matrix ProfileMiLeTS'19
Method
Time series
Candidates
1. Extract
2. Refine
Time
W
K Motifs
・・・
K
Step 1. Extract candidates for motifs
• Time series:
• -th subsequence:
• Matrix profile:
• is z-normalized Euclidean Distance (ED)
between and its closest subsequence
except its trivial matches
• Can be computed efficiently using STOMP algorithm [Zhu+16]
• is a candidate of motifs if is a local minimum of
• Use a sliding window whose length is to detect local minima
• Extracted candidates are added to a candidate set
11G. Yoshimura et al. Enumerating Hub Motifs in Time Series Based on the Matrix ProfileMiLeTS'19
closest-pair
Time series X
Matrix profile P STOMP
Method
is a local minimum in a sliding window ⇒
Step 2. Refine candidates into K motifs
• Refine the candidate set into a motif set
• Cost function based on the hub motif definition
• Find which minimizes the cost function in greedy manner
• New candidate is added to one by one
• If , the least significant candidate is removed
MiLeTS'19 G. Yoshimura et al. Enumerating Hub Motifs in Time Series Based on the Matrix Profile 12
Method
Motif set = { }Candidate set = { }
Outline
1. Introduction
• Motif Enumeration
• Problems of Existing Methods
2. Method
• Novel Motif Definition: Hub Motif
• Proposed Method: HubFinder
3. Experiments
• Synthetic Data
• Human Motion Data
4. Conclusion
• Summary
13G. Yoshimura et al. Enumerating Hub Motifs in Time Series Based on the Matrix ProfileMiLeTS'19
Synthetic Data
Two motifs of length W=32 are arranged alternately
• Motif-1: z-normalized triangular wave + Gaussian noise
• Motif-2: z-normalized sine wave + Gaussian noise
MiLeTS'19 G. Yoshimura et al. Enumerating Hub Motifs in Time Series Based on the Matrix Profile 14
Experiments
・・・ x50 (T=9616)
Apply ScanMK, SetFinder, and HubFinder with W=32
• HubFinder succeeds in finding alternate motifs perfectly without tuning R
• Existing methods fail no matter how finely you tune R
• Existing methods are sensitive to R
Synthetic Data (Result)
MiLeTS'19 G. Yoshimura et al. Enumerating Hub Motifs in Time Series Based on the Matrix Profile 15
Experiments
58.5% (R=0.96)
69.0% (R=0.86)
100% (constant)
Purity(thelarger,thebetter)
Radius
Extracted 2nd motif and neighbors
Extracted 1st motif and neighbors
Human Motion Data
MotionSense Dataset [Malekzadeh+18]
• Collected with an iPhone 6s kept in the participant's front pocket
• Include 3D accelerometer time series of human motion
• Total of 24 participants performed several activities
• 4 activities were chosen for this study
MiLeTS'19 G. Yoshimura et al. Enumerating Hub Motifs in Time Series Based on the Matrix Profile 16
Experiments
Downstairs Upstairs
Walking Jogging
x
y
z
x
y
z
x
y
z
x
y
z
Human Motion Data (Result)
Apply ScanMK, SetFinder, and HubFinder with W=64
• Position of top-4 motifs ( ) and neighbors ( ) of participant #23
• Existing methods fail to find motif from Downstairs activity
• HubFinder successfully finds motifs from all 4 activities
MiLeTS'19 G. Yoshimura et al. Enumerating Hub Motifs in Time Series Based on the Matrix Profile 17
Experiments
Downstairs Upstairs Walking Jogging
ScanMKSetFinderhubFinder
Outline
1. Introduction
• Motif Enumeration
• Problems of Existing Methods
2. Method
• Novel Motif Definition: Hub Motif
• Proposed Method: HubFinder
3. Experiments
• Synthetic Data
• Human Motion Data
4. Conclusion
• Summary
18G. Yoshimura et al. Enumerating Hub Motifs in Time Series Based on the Matrix ProfileMiLeTS'19
Summary
• Problems in existing motif enumeration methods caused by R
• Novel hub motif definition and HubFinder algorithm
• HubFinder succeeds in finding appropriate motifs without tuning R
• Existing methods fail no matter how finely tune R
Future Work
• Remove the motif length parameter W
(Extend to variable-length motifs)
• Utilize extracted motifs for other time series mining tasks
such as classification, forecasting, segmentation, and anomaly detection
Python code is available at
https://github.com/intellygenta/HubFinder
Thank you for your attention!!
19G. Yoshimura et al. Enumerating Hub Motifs in Time Series Based on the Matrix ProfileMiLeTS'19
Conclusion
References
[Bagnall+14] A. Bagnall, J. Hills, and J. Lines,
“Finding Motif Sets in Time Series”
arXiv:1407.3685 (2014).
[Zhu+16] Y. Zhu, Z. Zimmerman, N. S. Senobari, C. C. M. Yeh,
G. Funning, A. Mueen, P. Brisk, and E. Keogh,
“Matrix Profile II: Exploiting a Novel Algorithm and GPUs to
Break the One Hundred Million Barrier for Time Series Motifs and Joins“
IEEE 16th International Conference on Data Mining (ICDM), 739–748. (2016)
[Malekzadeh+18] M. Malekzadeh, R. G. Clegg, A. Cavallaro, and H. Haddadi,
“Protecting sensory data against sensitive inferences”
Workshop on Privacy by Design in Distributed Systems (2018).
MiLeTS'19 G. Yoshimura et al. Enumerating Hub Motifs in Time Series Based on the Matrix Profile 20
Purity
• Motif enumeration is similar to the clustering task to some extent
• Find representative patterns within a dataset in unsupervised manner
• We adopt purity as evaluation metric for this study
• One of the most popular metric for the clustering
•
• Ground truth motif clusters:
• Enumerated motif clusters:
• E.g. Purity = (5 + 1) / (5 + 5) = 0.60
MiLeTS'19 G. Yoshimura et al. Enumerating Hub Motifs in Time Series Based on the Matrix Profile 21
ψ1 ψ1
∈
∈
∈
∈
∈
∈
∈
∈
∈
∈
ψ1 ψ1 ψ1ψ2 ψ2 ψ2 ψ2 ψ2
ω1
∈
∈
∈
∈
∈
∈
ω2 ω1 ω1 ω1 ω1
∈
ω1
Ground truth
Enumerated
Time Complexity
• Running times on the synthetic time series
• HubFinder is faster than the existing methods for long time series
• HubFinder does not need multiple trials for tuning R
MiLeTS'19 G. Yoshimura et al. Enumerating Hub Motifs in Time Series Based on the Matrix Profile 22
Dependency of the Number of Motifs K
Dependency of the number of motifs K on the purity for the synthetic data
• Blue and orange lines: ScanMK and SetFinder for their optimal radius R
• Gray lines: Existing methods with non-optimal radii
• HubFinder outperforms existing methods for all K and R
MiLeTS'19 G. Yoshimura et al. Enumerating Hub Motifs in Time Series Based on the Matrix Profile 23
Human Motion Data (Result for All Participants)
HubFinder outperforms existing methods in terms of purity metric
MiLeTS'19 G. Yoshimura et al. Enumerating Hub Motifs in Time Series Based on the Matrix Profile 24
ScanMK/SetFinder Purity (with the best radius R)
HubFinderPurity(withouttuningR)

More Related Content

Similar to MiLeTS'19: Enumerating Hub Motifs in Time Series Based on the Matrix Profile

Is424 g1 t9_proposal_slides
Is424 g1 t9_proposal_slidesIs424 g1 t9_proposal_slides
Is424 g1 t9_proposal_slides
Jing WANG
 
MSShin-Machine_Learning_Algorithm_in_Period_Estimation.ppt
MSShin-Machine_Learning_Algorithm_in_Period_Estimation.pptMSShin-Machine_Learning_Algorithm_in_Period_Estimation.ppt
MSShin-Machine_Learning_Algorithm_in_Period_Estimation.ppt
butest
 

Similar to MiLeTS'19: Enumerating Hub Motifs in Time Series Based on the Matrix Profile (20)

Unsupervised Learning: Clustering
Unsupervised Learning: Clustering Unsupervised Learning: Clustering
Unsupervised Learning: Clustering
 
Training machine learning k means 2017
Training machine learning k means 2017Training machine learning k means 2017
Training machine learning k means 2017
 
Masters Thesis Defense Presentation
Masters Thesis Defense PresentationMasters Thesis Defense Presentation
Masters Thesis Defense Presentation
 
Is424 g1 t9_proposal_slides
Is424 g1 t9_proposal_slidesIs424 g1 t9_proposal_slides
Is424 g1 t9_proposal_slides
 
230727_HB_JointJournalClub.pptx
230727_HB_JointJournalClub.pptx230727_HB_JointJournalClub.pptx
230727_HB_JointJournalClub.pptx
 
Moving object detection in complex scene
Moving object detection in complex sceneMoving object detection in complex scene
Moving object detection in complex scene
 
SYNOPSIS on Parse representation and Linear SVM.
SYNOPSIS on Parse representation and Linear SVM.SYNOPSIS on Parse representation and Linear SVM.
SYNOPSIS on Parse representation and Linear SVM.
 
8clst.ppt
8clst.ppt8clst.ppt
8clst.ppt
 
DM_clustering.ppt
DM_clustering.pptDM_clustering.ppt
DM_clustering.ppt
 
Physics inspired artificial intelligence/machine learning
Physics inspired artificial intelligence/machine learningPhysics inspired artificial intelligence/machine learning
Physics inspired artificial intelligence/machine learning
 
Research Inventy : International Journal of Engineering and Science
Research Inventy : International Journal of Engineering and ScienceResearch Inventy : International Journal of Engineering and Science
Research Inventy : International Journal of Engineering and Science
 
Hybrid NMF APSIPA2014 invited
Hybrid NMF APSIPA2014 invitedHybrid NMF APSIPA2014 invited
Hybrid NMF APSIPA2014 invited
 
E0442328
E0442328E0442328
E0442328
 
Machine Learning Techniques for the Smart Grid – Modeling of Solar Energy usi...
Machine Learning Techniques for the Smart Grid – Modeling of Solar Energy usi...Machine Learning Techniques for the Smart Grid – Modeling of Solar Energy usi...
Machine Learning Techniques for the Smart Grid – Modeling of Solar Energy usi...
 
Hybrid multichannel signal separation using supervised nonnegative matrix fac...
Hybrid multichannel signal separation using supervised nonnegative matrix fac...Hybrid multichannel signal separation using supervised nonnegative matrix fac...
Hybrid multichannel signal separation using supervised nonnegative matrix fac...
 
ADVANCED OPTIMIZATION TECHNIQUES META-HEURISTIC ALGORITHMS FOR ENGINEERING AP...
ADVANCED OPTIMIZATION TECHNIQUES META-HEURISTIC ALGORITHMS FOR ENGINEERING AP...ADVANCED OPTIMIZATION TECHNIQUES META-HEURISTIC ALGORITHMS FOR ENGINEERING AP...
ADVANCED OPTIMIZATION TECHNIQUES META-HEURISTIC ALGORITHMS FOR ENGINEERING AP...
 
MSShin-Machine_Learning_Algorithm_in_Period_Estimation.ppt
MSShin-Machine_Learning_Algorithm_in_Period_Estimation.pptMSShin-Machine_Learning_Algorithm_in_Period_Estimation.ppt
MSShin-Machine_Learning_Algorithm_in_Period_Estimation.ppt
 
Time-delayed collective flow diffusion models for inferring latent people flo...
Time-delayed collective flow diffusion models for inferring latent people flo...Time-delayed collective flow diffusion models for inferring latent people flo...
Time-delayed collective flow diffusion models for inferring latent people flo...
 
Large scale cell tracking using an approximated Sinkhorn algorithm
Large scale cell tracking using an approximated Sinkhorn algorithmLarge scale cell tracking using an approximated Sinkhorn algorithm
Large scale cell tracking using an approximated Sinkhorn algorithm
 
8clustering.pptx
8clustering.pptx8clustering.pptx
8clustering.pptx
 

Recently uploaded

如何办理(Dalhousie毕业证书)达尔豪斯大学毕业证成绩单留信学历认证
如何办理(Dalhousie毕业证书)达尔豪斯大学毕业证成绩单留信学历认证如何办理(Dalhousie毕业证书)达尔豪斯大学毕业证成绩单留信学历认证
如何办理(Dalhousie毕业证书)达尔豪斯大学毕业证成绩单留信学历认证
zifhagzkk
 
如何办理(UPenn毕业证书)宾夕法尼亚大学毕业证成绩单本科硕士学位证留信学历认证
如何办理(UPenn毕业证书)宾夕法尼亚大学毕业证成绩单本科硕士学位证留信学历认证如何办理(UPenn毕业证书)宾夕法尼亚大学毕业证成绩单本科硕士学位证留信学历认证
如何办理(UPenn毕业证书)宾夕法尼亚大学毕业证成绩单本科硕士学位证留信学历认证
acoha1
 
原件一样(UWO毕业证书)西安大略大学毕业证成绩单留信学历认证
原件一样(UWO毕业证书)西安大略大学毕业证成绩单留信学历认证原件一样(UWO毕业证书)西安大略大学毕业证成绩单留信学历认证
原件一样(UWO毕业证书)西安大略大学毕业证成绩单留信学历认证
pwgnohujw
 
1:1原版定制利物浦大学毕业证(Liverpool毕业证)成绩单学位证书留信学历认证
1:1原版定制利物浦大学毕业证(Liverpool毕业证)成绩单学位证书留信学历认证1:1原版定制利物浦大学毕业证(Liverpool毕业证)成绩单学位证书留信学历认证
1:1原版定制利物浦大学毕业证(Liverpool毕业证)成绩单学位证书留信学历认证
ppy8zfkfm
 
Audience Researchndfhcvnfgvgbhujhgfv.pptx
Audience Researchndfhcvnfgvgbhujhgfv.pptxAudience Researchndfhcvnfgvgbhujhgfv.pptx
Audience Researchndfhcvnfgvgbhujhgfv.pptx
Stephen266013
 
obat aborsi Bontang wa 081336238223 jual obat aborsi cytotec asli di Bontang6...
obat aborsi Bontang wa 081336238223 jual obat aborsi cytotec asli di Bontang6...obat aborsi Bontang wa 081336238223 jual obat aborsi cytotec asli di Bontang6...
obat aborsi Bontang wa 081336238223 jual obat aborsi cytotec asli di Bontang6...
yulianti213969
 
Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...
Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...
Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...
Valters Lauzums
 
1:1原版定制伦敦政治经济学院毕业证(LSE毕业证)成绩单学位证书留信学历认证
1:1原版定制伦敦政治经济学院毕业证(LSE毕业证)成绩单学位证书留信学历认证1:1原版定制伦敦政治经济学院毕业证(LSE毕业证)成绩单学位证书留信学历认证
1:1原版定制伦敦政治经济学院毕业证(LSE毕业证)成绩单学位证书留信学历认证
dq9vz1isj
 
如何办理英国卡迪夫大学毕业证(Cardiff毕业证书)成绩单留信学历认证
如何办理英国卡迪夫大学毕业证(Cardiff毕业证书)成绩单留信学历认证如何办理英国卡迪夫大学毕业证(Cardiff毕业证书)成绩单留信学历认证
如何办理英国卡迪夫大学毕业证(Cardiff毕业证书)成绩单留信学历认证
ju0dztxtn
 
如何办理(WashU毕业证书)圣路易斯华盛顿大学毕业证成绩单本科硕士学位证留信学历认证
如何办理(WashU毕业证书)圣路易斯华盛顿大学毕业证成绩单本科硕士学位证留信学历认证如何办理(WashU毕业证书)圣路易斯华盛顿大学毕业证成绩单本科硕士学位证留信学历认证
如何办理(WashU毕业证书)圣路易斯华盛顿大学毕业证成绩单本科硕士学位证留信学历认证
acoha1
 
如何办理(UCLA毕业证书)加州大学洛杉矶分校毕业证成绩单学位证留信学历认证原件一样
如何办理(UCLA毕业证书)加州大学洛杉矶分校毕业证成绩单学位证留信学历认证原件一样如何办理(UCLA毕业证书)加州大学洛杉矶分校毕业证成绩单学位证留信学历认证原件一样
如何办理(UCLA毕业证书)加州大学洛杉矶分校毕业证成绩单学位证留信学历认证原件一样
jk0tkvfv
 

Recently uploaded (20)

NOAM AAUG Adobe Summit 2024: Summit Slam Dunks
NOAM AAUG Adobe Summit 2024: Summit Slam DunksNOAM AAUG Adobe Summit 2024: Summit Slam Dunks
NOAM AAUG Adobe Summit 2024: Summit Slam Dunks
 
如何办理(Dalhousie毕业证书)达尔豪斯大学毕业证成绩单留信学历认证
如何办理(Dalhousie毕业证书)达尔豪斯大学毕业证成绩单留信学历认证如何办理(Dalhousie毕业证书)达尔豪斯大学毕业证成绩单留信学历认证
如何办理(Dalhousie毕业证书)达尔豪斯大学毕业证成绩单留信学历认证
 
Bios of leading Astrologers & Researchers
Bios of leading Astrologers & ResearchersBios of leading Astrologers & Researchers
Bios of leading Astrologers & Researchers
 
How to Transform Clinical Trial Management with Advanced Data Analytics
How to Transform Clinical Trial Management with Advanced Data AnalyticsHow to Transform Clinical Trial Management with Advanced Data Analytics
How to Transform Clinical Trial Management with Advanced Data Analytics
 
如何办理(UPenn毕业证书)宾夕法尼亚大学毕业证成绩单本科硕士学位证留信学历认证
如何办理(UPenn毕业证书)宾夕法尼亚大学毕业证成绩单本科硕士学位证留信学历认证如何办理(UPenn毕业证书)宾夕法尼亚大学毕业证成绩单本科硕士学位证留信学历认证
如何办理(UPenn毕业证书)宾夕法尼亚大学毕业证成绩单本科硕士学位证留信学历认证
 
Jual Obat Aborsi Bandung (Asli No.1) Wa 082134680322 Klinik Obat Penggugur Ka...
Jual Obat Aborsi Bandung (Asli No.1) Wa 082134680322 Klinik Obat Penggugur Ka...Jual Obat Aborsi Bandung (Asli No.1) Wa 082134680322 Klinik Obat Penggugur Ka...
Jual Obat Aborsi Bandung (Asli No.1) Wa 082134680322 Klinik Obat Penggugur Ka...
 
原件一样(UWO毕业证书)西安大略大学毕业证成绩单留信学历认证
原件一样(UWO毕业证书)西安大略大学毕业证成绩单留信学历认证原件一样(UWO毕业证书)西安大略大学毕业证成绩单留信学历认证
原件一样(UWO毕业证书)西安大略大学毕业证成绩单留信学历认证
 
1:1原版定制利物浦大学毕业证(Liverpool毕业证)成绩单学位证书留信学历认证
1:1原版定制利物浦大学毕业证(Liverpool毕业证)成绩单学位证书留信学历认证1:1原版定制利物浦大学毕业证(Liverpool毕业证)成绩单学位证书留信学历认证
1:1原版定制利物浦大学毕业证(Liverpool毕业证)成绩单学位证书留信学历认证
 
Predictive Precipitation: Advanced Rain Forecasting Techniques
Predictive Precipitation: Advanced Rain Forecasting TechniquesPredictive Precipitation: Advanced Rain Forecasting Techniques
Predictive Precipitation: Advanced Rain Forecasting Techniques
 
Sensing the Future: Anomaly Detection and Event Prediction in Sensor Networks
Sensing the Future: Anomaly Detection and Event Prediction in Sensor NetworksSensing the Future: Anomaly Detection and Event Prediction in Sensor Networks
Sensing the Future: Anomaly Detection and Event Prediction in Sensor Networks
 
Audience Researchndfhcvnfgvgbhujhgfv.pptx
Audience Researchndfhcvnfgvgbhujhgfv.pptxAudience Researchndfhcvnfgvgbhujhgfv.pptx
Audience Researchndfhcvnfgvgbhujhgfv.pptx
 
obat aborsi Bontang wa 081336238223 jual obat aborsi cytotec asli di Bontang6...
obat aborsi Bontang wa 081336238223 jual obat aborsi cytotec asli di Bontang6...obat aborsi Bontang wa 081336238223 jual obat aborsi cytotec asli di Bontang6...
obat aborsi Bontang wa 081336238223 jual obat aborsi cytotec asli di Bontang6...
 
Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...
Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...
Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...
 
Data Visualization Exploring and Explaining with Data 1st Edition by Camm sol...
Data Visualization Exploring and Explaining with Data 1st Edition by Camm sol...Data Visualization Exploring and Explaining with Data 1st Edition by Camm sol...
Data Visualization Exploring and Explaining with Data 1st Edition by Camm sol...
 
Identify Customer Segments to Create Customer Offers for Each Segment - Appli...
Identify Customer Segments to Create Customer Offers for Each Segment - Appli...Identify Customer Segments to Create Customer Offers for Each Segment - Appli...
Identify Customer Segments to Create Customer Offers for Each Segment - Appli...
 
1:1原版定制伦敦政治经济学院毕业证(LSE毕业证)成绩单学位证书留信学历认证
1:1原版定制伦敦政治经济学院毕业证(LSE毕业证)成绩单学位证书留信学历认证1:1原版定制伦敦政治经济学院毕业证(LSE毕业证)成绩单学位证书留信学历认证
1:1原版定制伦敦政治经济学院毕业证(LSE毕业证)成绩单学位证书留信学历认证
 
Genuine love spell caster )! ,+27834335081) Ex lover back permanently in At...
Genuine love spell caster )! ,+27834335081)   Ex lover back permanently in At...Genuine love spell caster )! ,+27834335081)   Ex lover back permanently in At...
Genuine love spell caster )! ,+27834335081) Ex lover back permanently in At...
 
如何办理英国卡迪夫大学毕业证(Cardiff毕业证书)成绩单留信学历认证
如何办理英国卡迪夫大学毕业证(Cardiff毕业证书)成绩单留信学历认证如何办理英国卡迪夫大学毕业证(Cardiff毕业证书)成绩单留信学历认证
如何办理英国卡迪夫大学毕业证(Cardiff毕业证书)成绩单留信学历认证
 
如何办理(WashU毕业证书)圣路易斯华盛顿大学毕业证成绩单本科硕士学位证留信学历认证
如何办理(WashU毕业证书)圣路易斯华盛顿大学毕业证成绩单本科硕士学位证留信学历认证如何办理(WashU毕业证书)圣路易斯华盛顿大学毕业证成绩单本科硕士学位证留信学历认证
如何办理(WashU毕业证书)圣路易斯华盛顿大学毕业证成绩单本科硕士学位证留信学历认证
 
如何办理(UCLA毕业证书)加州大学洛杉矶分校毕业证成绩单学位证留信学历认证原件一样
如何办理(UCLA毕业证书)加州大学洛杉矶分校毕业证成绩单学位证留信学历认证原件一样如何办理(UCLA毕业证书)加州大学洛杉矶分校毕业证成绩单学位证留信学历认证原件一样
如何办理(UCLA毕业证书)加州大学洛杉矶分校毕业证成绩单学位证留信学历认证原件一样
 

MiLeTS'19: Enumerating Hub Motifs in Time Series Based on the Matrix Profile

  • 1. Enumerating Hub Motifs in Time Series Based on the Matrix Profile 1 National Institute of Advanced Industrial Science and Technology (AIST) 2 Mitsubishi Electric Corporation 3 LeapMind Inc. 5th Workshop on Mining and Learning from Time Series (MiLeTS’19) Aug 5, 2019 - Anchorage, Alaska, USA Genta Yoshimura1,2 Atsunori Kanemura1,3 Hideki Asoh1
  • 2. Outline 1. Introduction • Motif Enumeration • Problems in Existing Methods 2. Method • Novel Motif Definition: Hub Motif • Proposed Method: HubFinder 3. Experiments • Synthetic Data • Human Motion Data 4. Conclusion • Summary 2G. Yoshimura et al. Enumerating Hub Motifs in Time Series Based on the Matrix ProfileMiLeTS'19
  • 3. Outline 1. Introduction • Motif Enumeration • Problems of Existing Methods 2. Method • Novel Motif Definition: Hub Motif • Proposed Method: HubFinder 3. Experiments • Synthetic Data • Human Motion Data 4. Conclusion • Summary 3G. Yoshimura et al. Enumerating Hub Motifs in Time Series Based on the Matrix ProfileMiLeTS'19
  • 4. Motif Enumeration from Time Series Motif = a subsequence that occurs frequently in time series • Finding motifs is useful for many time series mining tasks • Classification, forecasting, segmentation, anomaly detection, … Motif Enumeration • Enumerate multiple motifs in order of significance rather than fining a single motif • Most time series include multiple patterns • In our problem setting, motif length W is a tunable parameter • Not variable-length motifs, but fixed-length motifs 4G. Yoshimura et al. Enumerating Hub Motifs in Time Series Based on the Matrix ProfileMiLeTS'19 Introduction Time series Time Motif 1 Motif 2 W W W W
  • 5. The difference of two definitions arises from how we regard a subsequence as significant 1. Range motif • A subsequence is significant if there exist many subsequences inside the sphere of radius R 2. Closest-pair motif • A subsequence is significant if the distance to its closest subsequence is small Note • Z-normalized Euclidean Distance (ED) is used as subsequence distance • Trivial-matches are ignored when finding neighbor subsequences n1=6 n2=3 Existing Two Motif Definitions 5G. Yoshimura et al. Enumerating Hub Motifs in Time Series Based on the Matrix ProfileMiLeTS'19 Introduction >significant >significant d1=0.63 d2=0.89 Subsequence whose length is W R
  • 6. Existing methods [Bagnall+14] require a radius parameter R • Place spheres of radius R so as not to overlap each other • Iteratively find the most significant subsequence as motif and remove subsequences inside the sphere of radius R 1. SetFinder = Range motif based method 2. ScanMK = Closest-pair motif based method Existing Motif Enumeration Methods 6G. Yoshimura et al. Enumerating Hub Motifs in Time Series Based on the Matrix ProfileMiLeTS'19 Introduction R R × argmaxi ni argmini di argmaxi ni argmini di remove remove remove remove
  • 7. Problems in Existing Methods Existing methods suffer from the radius parameter R 1. It is not easy to tune R • Appropriate parameter R changes in accordance with the target dataset • We cannot even know which R is appropriate in most real applications where no ground truth is available 2. There are cases where the existing methods fail to enumerate motifs successfully no matter how finely tune R • Such cases can be easily made and actually occur in real datasets Novel motif enumeration method is necessary 7G. Yoshimura et al. Enumerating Hub Motifs in Time Series Based on the Matrix ProfileMiLeTS'19 Introduction R R R ? Too small…Too large…
  • 8. Outline 1. Introduction • Motif Enumeration • Problems of Existing Methods 2. Method • Novel Motif Definition: Hub Motif • Proposed Method: HubFinder 3. Experiments • Synthetic Data • Human Motion Data 4. Conclusion • Summary 8G. Yoshimura et al. Enumerating Hub Motifs in Time Series Based on the Matrix ProfileMiLeTS'19
  • 9. Novel Motif Definition: Hub Motif In order to get free from the radius parameter R 1. Range motif • A subsequence is significant if there exist many subsequences with in the sphere of radius R 2. Closest-pair motif • A subsequence is significant if the distance to its closest subsequence is small 3. Hub motif • A subsequence is significant if a sum of distances from other subsequences is small • Looks like a wheel hub 9G. Yoshimura et al. Enumerating Hub Motifs in Time Series Based on the Matrix ProfileMiLeTS'19 Method R d1=0.63 d2=0.89 n1=6 n2=3 >significant Σk dik=5.12 Σk djk=7.36
  • 10. Proposed Method: HubFinder HubFinder does not require the radius parameter R • Motif length: W • Number of motifs: K HubFinder consists of two steps 1. Extract candidates for motifs using the matrix profile 2. Refine candidates into K motifs according to the hub motif significance 10G. Yoshimura et al. Enumerating Hub Motifs in Time Series Based on the Matrix ProfileMiLeTS'19 Method Time series Candidates 1. Extract 2. Refine Time W K Motifs ・・・ K
  • 11. Step 1. Extract candidates for motifs • Time series: • -th subsequence: • Matrix profile: • is z-normalized Euclidean Distance (ED) between and its closest subsequence except its trivial matches • Can be computed efficiently using STOMP algorithm [Zhu+16] • is a candidate of motifs if is a local minimum of • Use a sliding window whose length is to detect local minima • Extracted candidates are added to a candidate set 11G. Yoshimura et al. Enumerating Hub Motifs in Time Series Based on the Matrix ProfileMiLeTS'19 closest-pair Time series X Matrix profile P STOMP Method is a local minimum in a sliding window ⇒
  • 12. Step 2. Refine candidates into K motifs • Refine the candidate set into a motif set • Cost function based on the hub motif definition • Find which minimizes the cost function in greedy manner • New candidate is added to one by one • If , the least significant candidate is removed MiLeTS'19 G. Yoshimura et al. Enumerating Hub Motifs in Time Series Based on the Matrix Profile 12 Method Motif set = { }Candidate set = { }
  • 13. Outline 1. Introduction • Motif Enumeration • Problems of Existing Methods 2. Method • Novel Motif Definition: Hub Motif • Proposed Method: HubFinder 3. Experiments • Synthetic Data • Human Motion Data 4. Conclusion • Summary 13G. Yoshimura et al. Enumerating Hub Motifs in Time Series Based on the Matrix ProfileMiLeTS'19
  • 14. Synthetic Data Two motifs of length W=32 are arranged alternately • Motif-1: z-normalized triangular wave + Gaussian noise • Motif-2: z-normalized sine wave + Gaussian noise MiLeTS'19 G. Yoshimura et al. Enumerating Hub Motifs in Time Series Based on the Matrix Profile 14 Experiments ・・・ x50 (T=9616)
  • 15. Apply ScanMK, SetFinder, and HubFinder with W=32 • HubFinder succeeds in finding alternate motifs perfectly without tuning R • Existing methods fail no matter how finely you tune R • Existing methods are sensitive to R Synthetic Data (Result) MiLeTS'19 G. Yoshimura et al. Enumerating Hub Motifs in Time Series Based on the Matrix Profile 15 Experiments 58.5% (R=0.96) 69.0% (R=0.86) 100% (constant) Purity(thelarger,thebetter) Radius Extracted 2nd motif and neighbors Extracted 1st motif and neighbors
  • 16. Human Motion Data MotionSense Dataset [Malekzadeh+18] • Collected with an iPhone 6s kept in the participant's front pocket • Include 3D accelerometer time series of human motion • Total of 24 participants performed several activities • 4 activities were chosen for this study MiLeTS'19 G. Yoshimura et al. Enumerating Hub Motifs in Time Series Based on the Matrix Profile 16 Experiments Downstairs Upstairs Walking Jogging x y z x y z x y z x y z
  • 17. Human Motion Data (Result) Apply ScanMK, SetFinder, and HubFinder with W=64 • Position of top-4 motifs ( ) and neighbors ( ) of participant #23 • Existing methods fail to find motif from Downstairs activity • HubFinder successfully finds motifs from all 4 activities MiLeTS'19 G. Yoshimura et al. Enumerating Hub Motifs in Time Series Based on the Matrix Profile 17 Experiments Downstairs Upstairs Walking Jogging ScanMKSetFinderhubFinder
  • 18. Outline 1. Introduction • Motif Enumeration • Problems of Existing Methods 2. Method • Novel Motif Definition: Hub Motif • Proposed Method: HubFinder 3. Experiments • Synthetic Data • Human Motion Data 4. Conclusion • Summary 18G. Yoshimura et al. Enumerating Hub Motifs in Time Series Based on the Matrix ProfileMiLeTS'19
  • 19. Summary • Problems in existing motif enumeration methods caused by R • Novel hub motif definition and HubFinder algorithm • HubFinder succeeds in finding appropriate motifs without tuning R • Existing methods fail no matter how finely tune R Future Work • Remove the motif length parameter W (Extend to variable-length motifs) • Utilize extracted motifs for other time series mining tasks such as classification, forecasting, segmentation, and anomaly detection Python code is available at https://github.com/intellygenta/HubFinder Thank you for your attention!! 19G. Yoshimura et al. Enumerating Hub Motifs in Time Series Based on the Matrix ProfileMiLeTS'19 Conclusion
  • 20. References [Bagnall+14] A. Bagnall, J. Hills, and J. Lines, “Finding Motif Sets in Time Series” arXiv:1407.3685 (2014). [Zhu+16] Y. Zhu, Z. Zimmerman, N. S. Senobari, C. C. M. Yeh, G. Funning, A. Mueen, P. Brisk, and E. Keogh, “Matrix Profile II: Exploiting a Novel Algorithm and GPUs to Break the One Hundred Million Barrier for Time Series Motifs and Joins“ IEEE 16th International Conference on Data Mining (ICDM), 739–748. (2016) [Malekzadeh+18] M. Malekzadeh, R. G. Clegg, A. Cavallaro, and H. Haddadi, “Protecting sensory data against sensitive inferences” Workshop on Privacy by Design in Distributed Systems (2018). MiLeTS'19 G. Yoshimura et al. Enumerating Hub Motifs in Time Series Based on the Matrix Profile 20
  • 21. Purity • Motif enumeration is similar to the clustering task to some extent • Find representative patterns within a dataset in unsupervised manner • We adopt purity as evaluation metric for this study • One of the most popular metric for the clustering • • Ground truth motif clusters: • Enumerated motif clusters: • E.g. Purity = (5 + 1) / (5 + 5) = 0.60 MiLeTS'19 G. Yoshimura et al. Enumerating Hub Motifs in Time Series Based on the Matrix Profile 21 ψ1 ψ1 ∈ ∈ ∈ ∈ ∈ ∈ ∈ ∈ ∈ ∈ ψ1 ψ1 ψ1ψ2 ψ2 ψ2 ψ2 ψ2 ω1 ∈ ∈ ∈ ∈ ∈ ∈ ω2 ω1 ω1 ω1 ω1 ∈ ω1 Ground truth Enumerated
  • 22. Time Complexity • Running times on the synthetic time series • HubFinder is faster than the existing methods for long time series • HubFinder does not need multiple trials for tuning R MiLeTS'19 G. Yoshimura et al. Enumerating Hub Motifs in Time Series Based on the Matrix Profile 22
  • 23. Dependency of the Number of Motifs K Dependency of the number of motifs K on the purity for the synthetic data • Blue and orange lines: ScanMK and SetFinder for their optimal radius R • Gray lines: Existing methods with non-optimal radii • HubFinder outperforms existing methods for all K and R MiLeTS'19 G. Yoshimura et al. Enumerating Hub Motifs in Time Series Based on the Matrix Profile 23
  • 24. Human Motion Data (Result for All Participants) HubFinder outperforms existing methods in terms of purity metric MiLeTS'19 G. Yoshimura et al. Enumerating Hub Motifs in Time Series Based on the Matrix Profile 24 ScanMK/SetFinder Purity (with the best radius R) HubFinderPurity(withouttuningR)