Enumerating Hub Motifs
in Time Series
Based on the Matrix Profile
1 National Institute of Advanced Industrial Science and Technology (AIST)
2 Mitsubishi Electric Corporation
3 LeapMind Inc.
5th Workshop on Mining and Learning from Time Series (MiLeTS’19)
Aug 5, 2019 - Anchorage, Alaska, USA
Genta Yoshimura (1,2), Atsunori Kanemura (1,3), Hideki Asoh (1)
Outline
1. Introduction
• Motif Enumeration
• Problems in Existing Methods
2. Method
• Novel Motif Definition: Hub Motif
• Proposed Method: HubFinder
3. Experiments
• Synthetic Data
• Human Motion Data
4. Conclusion
• Summary
G. Yoshimura et al., Enumerating Hub Motifs in Time Series Based on the Matrix Profile, MiLeTS'19
Motif Enumeration from Time Series
Motif = a subsequence that occurs frequently in time series
• Finding motifs is useful for many time series mining tasks
• Classification, forecasting, segmentation, anomaly detection, …
Motif Enumeration
• Enumerate multiple motifs in order of significance
rather than finding a single motif
• Most time series include multiple patterns
• In our problem setting, motif length W is a tunable parameter
• Not variable-length motifs, but fixed-length motifs
[Figure: a time series plotted against time, with the occurrences of Motif 1 and Motif 2 marked, each of length W]
Existing Two Motif Definitions
The two definitions differ in how they judge a subsequence to be significant
1. Range motif
• A subsequence is significant if many subsequences lie inside the sphere of radius R around it
2. Closest-pair motif
• A subsequence is significant if the distance to its closest subsequence is small
Note
• Z-normalized Euclidean Distance (ED) is used as the subsequence distance
• Trivial matches are ignored when finding neighbor subsequences
[Figure: two subsequences of length W compared under each definition. Range motif: n1=6 and n2=3 neighbors inside radius R. Closest-pair motif: d1=0.63 and d2=0.89. The first subsequence is the more significant one in both cases.]
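The two significance measures above can be sketched in a few lines of NumPy. This is a brute-force illustration, not the implementation used in the talk: the function names are hypothetical, and the exclusion zone of W/2 samples used to skip trivial matches is an assumption, since the slide does not state how trivial matches are delimited.

```python
import numpy as np

def znorm(x):
    """Z-normalize a 1-D array (flat subsequences map to all zeros)."""
    x = np.asarray(x, dtype=float)
    sd = x.std()
    return (x - x.mean()) / sd if sd > 0 else np.zeros_like(x)

def distance_matrix(series, w, excl=None):
    """Pairwise z-normalized ED between all length-w subsequences.

    Pairs at most `excl` samples apart in time (trivial matches, including
    a subsequence with itself) are set to +inf so they never count as neighbors.
    """
    series = np.asarray(series, dtype=float)
    excl = w // 2 if excl is None else excl  # assumed exclusion zone
    n = len(series) - w + 1
    subs = np.array([znorm(series[i:i + w]) for i in range(n)])
    sq = (subs ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * subs @ subs.T
    d = np.sqrt(np.maximum(d2, 0.0))  # clip tiny negative rounding errors
    idx = np.arange(n)
    d[np.abs(idx[:, None] - idx[None, :]) <= excl] = np.inf
    return d

def range_significance(d, radius):
    """n_i: number of non-trivial neighbors inside the sphere of radius R."""
    return (d <= radius).sum(axis=1)

def closest_pair_significance(d):
    """d_i: distance to the closest non-trivial subsequence."""
    return d.min(axis=1)

# Usage: four identical sine periods, so every subsequence has exact repeats.
series = np.tile(np.sin(np.linspace(0, 2 * np.pi, 32, endpoint=False)), 4)
d = distance_matrix(series, w=32)
n_i = range_significance(d, radius=0.5)
d_i = closest_pair_significance(d)
```

A larger n_i marks a more significant range motif, while a smaller d_i marks a more significant closest-pair motif.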
Existing Motif Enumeration Methods
Existing methods [Bagnall+14] require a radius parameter R
• Place spheres of radius R so that they do not overlap each other
• Iteratively pick the most significant subsequence as a motif and remove the subsequences inside its sphere of radius R
1. SetFinder = range-motif-based method (picks argmax_i n_i)
2. ScanMK = closest-pair-motif-based method (picks argmin_i d_i)
[Figure: for each method, the most significant subsequence is picked and the subsequences inside its sphere of radius R are removed; the step repeats on the remaining subsequences. Overlapping spheres are not allowed.]
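The pick-and-remove loop described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the SetFinder or ScanMK algorithms of [Bagnall+14]: it works on a precomputed distance matrix, the function name and `mode` argument are hypothetical, and the non-overlap rule is approximated by discarding everything within R of each picked motif.

```python
import numpy as np

def enumerate_with_radius(d, radius, k, mode="range"):
    """Greedy radius-based enumeration on a precomputed distance matrix d.

    mode="range":        pick argmax_i n_i (most neighbors within `radius`)
    mode="closest_pair": pick argmin_i d_i (smallest nearest-neighbor distance)
    After each pick, the motif and everything within `radius` of it are removed.
    """
    d = np.asarray(d, dtype=float)
    alive = np.ones(len(d), dtype=bool)
    motifs = []
    for _ in range(k):
        cand = np.flatnonzero(alive)
        if cand.size == 0:
            break
        # Only pairs of still-available subsequences are considered.
        masked = np.where(alive[:, None] & alive[None, :], d, np.inf)
        np.fill_diagonal(masked, np.inf)
        if mode == "range":
            score = (masked <= radius).sum(axis=1)
            best = int(cand[np.argmax(score[cand])])
        else:
            score = masked.min(axis=1)
            best = int(cand[np.argmin(score[cand])])
        motifs.append(best)
        alive &= d[best] > radius  # drop the sphere around the pick
        alive[best] = False
    return motifs

# Usage on toy 1-D points, with |a - b| standing in for the subsequence distance.
pts = np.array([0.0, 1.0, 2.0, 3.0, 50.0, 52.0, 90.0])
d = np.abs(pts[:, None] - pts[None, :])
motifs = enumerate_with_radius(d, radius=5.0, k=2, mode="range")
```

Both the scores and the removal step depend on `radius`, which is why the result can change sharply as R varies.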
Problems in Existing Methods
Existing methods suffer from the radius parameter R
1. It is not easy to tune R
• The appropriate R depends on the target dataset
• In most real applications no ground truth is available, so we cannot even know which R is appropriate
2. There are cases where the existing methods fail to enumerate motifs successfully no matter how finely R is tuned
• Such cases are easy to construct and actually occur in real datasets
A novel motif enumeration method is necessary
[Figure: candidate radii R drawn on the same data, labeled too small and too large]
Novel Motif Definition: Hub Motif
A third definition removes the need for the radius parameter R
1. Range motif
• A subsequence is significant if many subsequences lie within the sphere of radius R around it
2. Closest-pair motif
• A subsequence is significant if the distance to its closest subsequence is small
3. Hub motif
• A subsequence is significant if the sum of the distances from other subsequences is small
• Looks like a wheel hub
[Figure: the three definitions compared on two subsequences. Range motif: n1=6, n2=3 inside radius R. Closest-pair motif: d1=0.63, d2=0.89. Hub motif: Σk dik=5.12, Σk djk=7.36, so subsequence i is more significant than subsequence j.]
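A literal reading of the hub motif significance is a row sum over a distance matrix, as in the figure's Σk dik. The sketch below assumes the sum runs over all other subsequences; the slide does not say whether some of them are excluded, and the function name is hypothetical.

```python
import numpy as np

def hub_significance(d):
    """Sum of distances from every other subsequence (smaller = more significant)."""
    d = np.asarray(d, dtype=float).copy()
    np.fill_diagonal(d, 0.0)  # a subsequence's distance to itself adds nothing
    return d.sum(axis=1)

# Usage on toy 1-D points, with |a - b| standing in for the subsequence distance.
pts = np.array([0.0, 1.0, 2.0, 3.0, 10.0])
d = np.abs(pts[:, None] - pts[None, :])
scores = hub_significance(d)
hub = int(np.argmin(scores))  # index of the most hub-like point
```

No radius appears anywhere in the computation, which is the point of the definition.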
Proposed Method: HubFinder
HubFinder does not require the radius parameter R. Its parameters are:
• Motif length: W
• Number of motifs: K
HubFinder consists of two steps
1. Extract candidates for motifs using the matrix profile
2. Refine candidates into K motifs according to the hub motif significance
[Figure: pipeline from the time series to candidates of length W (1. Extract) and from the candidates to K motifs (2. Refine)]
Step 1. Extract candidates for motifs
• Time series: X
• i-th subsequence: x_i (length W)
• Matrix profile: P
• P_i is the z-normalized Euclidean Distance (ED) between x_i and its closest subsequence, excluding its trivial matches
• Can be computed efficiently using the STOMP algorithm [Zhu+16]
• x_i is a motif candidate if P_i is a local minimum of P
• Use a sliding window of fixed length to detect local minima
• Extracted candidates are added to a candidate set
[Figure: time series X and its matrix profile P (closest-pair distances) computed by STOMP; a subsequence becomes a candidate when its profile value is a local minimum within a sliding window]
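The candidate extraction step can be sketched with a brute-force matrix profile followed by a sliding-window local-minimum test. This is an illustration only: it uses an O(n²) loop as a stand-in for STOMP, the function names are hypothetical, and both the exclusion zone (W/2) and the `window` argument are assumptions because the slide does not give their values.

```python
import numpy as np

def matrix_profile_naive(series, w, excl=None):
    """Brute-force matrix profile under z-normalized ED (stand-in for STOMP)."""
    series = np.asarray(series, dtype=float)
    excl = w // 2 if excl is None else excl  # assumed trivial-match zone
    n = len(series) - w + 1
    subs = np.array([series[i:i + w] for i in range(n)])
    mu = subs.mean(axis=1, keepdims=True)
    sd = subs.std(axis=1, keepdims=True)
    sd[sd == 0] = 1.0  # avoid division by zero on flat subsequences
    subs = (subs - mu) / sd
    profile = np.empty(n)
    for i in range(n):
        dist = np.linalg.norm(subs - subs[i], axis=1)
        lo, hi = max(0, i - excl), min(n, i + excl + 1)
        dist[lo:hi] = np.inf  # ignore trivial matches
        profile[i] = dist.min()
    return profile

def local_minima(profile, window):
    """Indices whose profile value is the first minimum of the window centered on them."""
    profile = np.asarray(profile, dtype=float)
    n, half = len(profile), window // 2
    out = []
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        if np.isfinite(profile[i]) and lo + int(np.argmin(profile[lo:hi])) == i:
            out.append(i)
    return out

# Usage: plant the same pattern twice in noise and extract candidates.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
pattern = np.sin(np.linspace(0, 4 * np.pi, 16))
x[20:36] = pattern
x[120:136] = pattern
profile = matrix_profile_naive(x, w=16)
candidates = local_minima(profile, window=16)
```

For long series the quadratic loop becomes slow, which is the motivation for using STOMP in practice.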
Step 2. Refine candidates into K motifs
• Refine the candidate set into a motif set
• The cost function is based on the hub motif definition
• Find the motif set that minimizes the cost function in a greedy manner
• New candidates are added to the motif set one by one
• If the motif set then holds more than K motifs, the least significant candidate is removed
[Figure: candidates moving one by one from the candidate set into the motif set]
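The add-then-prune loop of this step can be sketched as below. The slide does not spell out the cost function, so the sketch assumes one plausible form: every subsequence pays the distance to its nearest motif in the set. That cost, the rule "remove the member whose removal leaves the lowest cost", and all function names are assumptions for illustration, not the HubFinder implementation.

```python
import numpy as np

def cost(d, motif_set):
    """Assumed cost: each subsequence pays the distance to its nearest motif."""
    if not motif_set:
        return float("inf")
    return float(d[:, motif_set].min(axis=1).sum())

def refine(d, candidates, k):
    """Greedily keep at most k motifs while visiting candidates one by one."""
    motifs = []
    for c in candidates:
        motifs.append(c)
        if len(motifs) > k:
            # Drop the member whose removal leaves the lowest cost.
            costs = [cost(d, motifs[:i] + motifs[i + 1:]) for i in range(len(motifs))]
            motifs.pop(int(np.argmin(costs)))
    return motifs

# Usage on toy 1-D points forming two tight groups.
pts = np.array([0.0, 1.0, 2.0, 100.0, 101.0, 102.0])
d = np.abs(pts[:, None] - pts[None, :])
motifs = refine(d, candidates=list(range(len(pts))), k=2)
```

On this toy input the loop ends with one motif from each group, since keeping two motifs from the same group leaves the other group far from any motif.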
Synthetic Data
Two motifs of length W=32 are arranged alternately
• Motif-1: z-normalized triangular wave + Gaussian noise
• Motif-2: z-normalized sine wave + Gaussian noise
[Figure: the two motifs alternating, repeated 50 times, giving a series of length T=9616]
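A series of this kind can be generated roughly as follows. The noise level, the filler between motif occurrences, and its length are not given on the slide, so `noise` and `gap` are hypothetical parameters and the sketch does not try to reproduce T=9616 exactly.

```python
import numpy as np

def znorm(x):
    """Z-normalize a 1-D array with non-zero variance."""
    return (x - x.mean()) / x.std()

def make_series(repeats=50, w=32, gap=64, noise=0.1, seed=0):
    """Alternate noisy triangular and sine motifs, separated by noise-only gaps."""
    rng = np.random.default_rng(seed)
    t = np.arange(w) / w
    triangle = znorm(1.0 - 2.0 * np.abs(2.0 * t - 1.0))  # one triangular period
    sine = znorm(np.sin(2.0 * np.pi * t))                 # one sine period
    pieces, starts, labels = [], [], []
    pos = 0
    for _ in range(repeats):
        for label, shape in ((1, triangle), (2, sine)):
            pieces.append(shape + rng.normal(0.0, noise, w))
            starts.append(pos)
            labels.append(label)
            pos += w
            pieces.append(rng.normal(0.0, noise, gap))  # assumed filler
            pos += gap
    return np.concatenate(pieces), np.array(starts), np.array(labels)

# Usage: the start positions and labels serve as ground truth for evaluation.
x, starts, labels = make_series()
```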
Synthetic Data (Result)
Apply ScanMK, SetFinder, and HubFinder with W=32
• HubFinder finds the alternating motifs perfectly without tuning R
• Existing methods fail no matter how finely R is tuned
• Existing methods are sensitive to R
[Figure: purity (the larger, the better) versus radius R. Annotated values: 58.5% (R=0.96), 69.0% (R=0.86), and 100% (constant). Side panels show the extracted 1st and 2nd motifs with their neighbors.]
Human Motion Data
MotionSense Dataset [Malekzadeh+18]
• Collected with an iPhone 6s kept in the participant's front pocket
• Includes 3D accelerometer time series of human motion
• A total of 24 participants performed several activities
• 4 activities were chosen for this study
[Figure: x, y, and z accelerometer traces for the four chosen activities: Downstairs, Upstairs, Walking, and Jogging]
Human Motion Data (Result)
Apply ScanMK, SetFinder, and HubFinder with W=64
• Positions of the top-4 motifs and their neighbors for participant #23
• Existing methods fail to find a motif from the Downstairs activity
• HubFinder successfully finds motifs from all 4 activities
[Figure: motif and neighbor positions across the Downstairs, Upstairs, Walking, and Jogging segments, one row each for ScanMK, SetFinder, and HubFinder]
Summary
• Problems in existing motif enumeration methods caused by R
• Novel hub motif definition and HubFinder algorithm
• HubFinder succeeds in finding appropriate motifs without tuning R
• Existing methods fail no matter how finely R is tuned
Future Work
• Remove the motif length parameter W
(Extend to variable-length motifs)
• Utilize extracted motifs for other time series mining tasks
such as classification, forecasting, segmentation, and anomaly detection
Python code is available at
https://github.com/intellygenta/HubFinder
Thank you for your attention!!
References
[Bagnall+14] A. Bagnall, J. Hills, and J. Lines, “Finding Motif Sets in Time Series,” arXiv:1407.3685 (2014).
[Zhu+16] Y. Zhu, Z. Zimmerman, N. S. Senobari, C. C. M. Yeh, G. Funning, A. Mueen, P. Brisk, and E. Keogh, “Matrix Profile II: Exploiting a Novel Algorithm and GPUs to Break the One Hundred Million Barrier for Time Series Motifs and Joins,” IEEE 16th International Conference on Data Mining (ICDM), 739–748 (2016).
[Malekzadeh+18] M. Malekzadeh, R. G. Clegg, A. Cavallaro, and H. Haddadi, “Protecting sensory data against sensitive inferences,” Workshop on Privacy by Design in Distributed Systems (2018).
Purity
• Motif enumeration is similar to the clustering task to some extent
• Both find representative patterns within a dataset in an unsupervised manner
• We adopt purity as the evaluation metric for this study
• One of the most popular metrics for clustering
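• Purity = (1/N) Σ_k max_j |ω_k ∩ ψ_j|, where N is the number of ground-truth motif occurrences
• Ground truth motif clusters: ψ1, ψ2
• Enumerated motif clusters: ω1, ω2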
• E.g. Purity = (5 + 1) / (5 + 5) = 0.60
[Figure: ten motif occurrences labeled with ground-truth clusters (five in ψ1, five in ψ2) and, below them, the enumerated clusters ω1 and ω2 to which some of them were assigned]
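The purity computation in the example can be sketched as follows. The code uses the standard clustering purity, and the division by the number of ground-truth occurrences is inferred from the (5 + 5) denominator in the example. The function name and the toy assignment are hypothetical; the assignment is only one arrangement consistent with the example's numbers.

```python
from collections import Counter

def purity(truth, enumerated):
    """Purity of enumerated clusters against ground-truth labels.

    truth[i] is the ground-truth label of occurrence i; enumerated[i] is the
    enumerated cluster it was assigned to, or None if it was not found.
    """
    overlap = {}
    for t, e in zip(truth, enumerated):
        if e is not None:
            overlap.setdefault(e, Counter())[t] += 1
    # Each enumerated cluster is credited with its best-matching ground-truth label.
    matched = sum(max(c.values()) for c in overlap.values())
    return matched / len(truth)

# Toy assignment: omega1 holds five psi1 and one psi2, omega2 holds one psi2,
# and three occurrences are not assigned to any enumerated cluster.
truth = ["psi1"] * 5 + ["psi2"] * 5
enumerated = ["omega1"] * 6 + ["omega2"] + [None] * 3
score = purity(truth, enumerated)  # (5 + 1) / (5 + 5)
```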
Time Complexity
• Running times on the synthetic time series
• HubFinder is faster than the existing methods for long time series
• HubFinder does not need multiple trials for tuning R
Dependence on the Number of Motifs K
Dependence of purity on the number of motifs K for the synthetic data
• Blue and orange lines: ScanMK and SetFinder for their optimal radius R
• Gray lines: Existing methods with non-optimal radii
• HubFinder outperforms existing methods for all K and R
Human Motion Data (Result for All Participants)
HubFinder outperforms existing methods in terms of the purity metric
[Figure: HubFinder purity (without tuning R) plotted against ScanMK/SetFinder purity (with the best radius R)]

More Related Content

Similar to MiLeTS'19: Enumerating Hub Motifs in Time Series Based on the Matrix Profile

Unsupervised Learning: Clustering
Unsupervised Learning: Clustering Unsupervised Learning: Clustering
Unsupervised Learning: Clustering
Experfy
 
Training machine learning k means 2017
Training machine learning k means 2017Training machine learning k means 2017
Training machine learning k means 2017
Iwan Sofana
 
Masters Thesis Defense Presentation
Masters Thesis Defense PresentationMasters Thesis Defense Presentation
Masters Thesis Defense Presentation
Vitor Hirota Makiyama
 
Is424 g1 t9_proposal_slides
Is424 g1 t9_proposal_slidesIs424 g1 t9_proposal_slides
Is424 g1 t9_proposal_slides
Jing WANG
 
230727_HB_JointJournalClub.pptx
230727_HB_JointJournalClub.pptx230727_HB_JointJournalClub.pptx
Moving object detection in complex scene
Moving object detection in complex sceneMoving object detection in complex scene
Moving object detection in complex scene
Kumar Mayank
 
SYNOPSIS on Parse representation and Linear SVM.
SYNOPSIS on Parse representation and Linear SVM.SYNOPSIS on Parse representation and Linear SVM.
SYNOPSIS on Parse representation and Linear SVM.
bhavinecindus
 
8clst.ppt
8clst.ppt8clst.ppt
8clst.ppt
Gurumurthy B R
 
DM_clustering.ppt
DM_clustering.pptDM_clustering.ppt
DM_clustering.ppt
nandhini manoharan
 
Physics inspired artificial intelligence/machine learning
Physics inspired artificial intelligence/machine learningPhysics inspired artificial intelligence/machine learning
Physics inspired artificial intelligence/machine learning
KAMAL CHOUDHARY
 
Research Inventy : International Journal of Engineering and Science
Research Inventy : International Journal of Engineering and ScienceResearch Inventy : International Journal of Engineering and Science
Research Inventy : International Journal of Engineering and Science
inventy
 
Hybrid NMF APSIPA2014 invited
Hybrid NMF APSIPA2014 invitedHybrid NMF APSIPA2014 invited
Hybrid NMF APSIPA2014 invited
SaruwatariLabUTokyo
 
E0442328
E0442328E0442328
E0442328
IJERA Editor
 
Machine Learning Techniques for the Smart Grid – Modeling of Solar Energy usi...
Machine Learning Techniques for the Smart Grid – Modeling of Solar Energy usi...Machine Learning Techniques for the Smart Grid – Modeling of Solar Energy usi...
Machine Learning Techniques for the Smart Grid – Modeling of Solar Energy usi...
Wilfried Elmenreich
 
Hybrid multichannel signal separation using supervised nonnegative matrix fac...
Hybrid multichannel signal separation using supervised nonnegative matrix fac...Hybrid multichannel signal separation using supervised nonnegative matrix fac...
Hybrid multichannel signal separation using supervised nonnegative matrix fac...
Daichi Kitamura
 
ADVANCED OPTIMIZATION TECHNIQUES META-HEURISTIC ALGORITHMS FOR ENGINEERING AP...
ADVANCED OPTIMIZATION TECHNIQUES META-HEURISTIC ALGORITHMS FOR ENGINEERING AP...ADVANCED OPTIMIZATION TECHNIQUES META-HEURISTIC ALGORITHMS FOR ENGINEERING AP...
ADVANCED OPTIMIZATION TECHNIQUES META-HEURISTIC ALGORITHMS FOR ENGINEERING AP...
Ajay Kumar
 
MSShin-Machine_Learning_Algorithm_in_Period_Estimation.ppt
MSShin-Machine_Learning_Algorithm_in_Period_Estimation.pptMSShin-Machine_Learning_Algorithm_in_Period_Estimation.ppt
MSShin-Machine_Learning_Algorithm_in_Period_Estimation.ppt
butest
 
Time-delayed collective flow diffusion models for inferring latent people flo...
Time-delayed collective flow diffusion models for inferring latent people flo...Time-delayed collective flow diffusion models for inferring latent people flo...
Time-delayed collective flow diffusion models for inferring latent people flo...
Shun Kojima
 
Large scale cell tracking using an approximated Sinkhorn algorithm
Large scale cell tracking using an approximated Sinkhorn algorithmLarge scale cell tracking using an approximated Sinkhorn algorithm
Large scale cell tracking using an approximated Sinkhorn algorithm
Parth Nandedkar
 
8clustering.pptx
8clustering.pptx8clustering.pptx
8clustering.pptx
DeepanshuPatel19
 

Similar to MiLeTS'19: Enumerating Hub Motifs in Time Series Based on the Matrix Profile (20)

Unsupervised Learning: Clustering
Unsupervised Learning: Clustering Unsupervised Learning: Clustering
Unsupervised Learning: Clustering
 
Training machine learning k means 2017
Training machine learning k means 2017Training machine learning k means 2017
Training machine learning k means 2017
 
Masters Thesis Defense Presentation
Masters Thesis Defense PresentationMasters Thesis Defense Presentation
Masters Thesis Defense Presentation
 
Is424 g1 t9_proposal_slides
Is424 g1 t9_proposal_slidesIs424 g1 t9_proposal_slides
Is424 g1 t9_proposal_slides
 
230727_HB_JointJournalClub.pptx
230727_HB_JointJournalClub.pptx230727_HB_JointJournalClub.pptx
230727_HB_JointJournalClub.pptx
 
Moving object detection in complex scene
Moving object detection in complex sceneMoving object detection in complex scene
Moving object detection in complex scene
 
SYNOPSIS on Parse representation and Linear SVM.
SYNOPSIS on Parse representation and Linear SVM.SYNOPSIS on Parse representation and Linear SVM.
SYNOPSIS on Parse representation and Linear SVM.
 
8clst.ppt
8clst.ppt8clst.ppt
8clst.ppt
 
DM_clustering.ppt
DM_clustering.pptDM_clustering.ppt
DM_clustering.ppt
 
Physics inspired artificial intelligence/machine learning
Physics inspired artificial intelligence/machine learningPhysics inspired artificial intelligence/machine learning
Physics inspired artificial intelligence/machine learning
 
Research Inventy : International Journal of Engineering and Science
Research Inventy : International Journal of Engineering and ScienceResearch Inventy : International Journal of Engineering and Science
Research Inventy : International Journal of Engineering and Science
 
Hybrid NMF APSIPA2014 invited
Hybrid NMF APSIPA2014 invitedHybrid NMF APSIPA2014 invited
Hybrid NMF APSIPA2014 invited
 
E0442328
E0442328E0442328
E0442328
 
Machine Learning Techniques for the Smart Grid – Modeling of Solar Energy usi...
Machine Learning Techniques for the Smart Grid – Modeling of Solar Energy usi...Machine Learning Techniques for the Smart Grid – Modeling of Solar Energy usi...
Machine Learning Techniques for the Smart Grid – Modeling of Solar Energy usi...
 
Hybrid multichannel signal separation using supervised nonnegative matrix fac...
Hybrid multichannel signal separation using supervised nonnegative matrix fac...Hybrid multichannel signal separation using supervised nonnegative matrix fac...
Hybrid multichannel signal separation using supervised nonnegative matrix fac...
 
ADVANCED OPTIMIZATION TECHNIQUES META-HEURISTIC ALGORITHMS FOR ENGINEERING AP...
ADVANCED OPTIMIZATION TECHNIQUES META-HEURISTIC ALGORITHMS FOR ENGINEERING AP...ADVANCED OPTIMIZATION TECHNIQUES META-HEURISTIC ALGORITHMS FOR ENGINEERING AP...
ADVANCED OPTIMIZATION TECHNIQUES META-HEURISTIC ALGORITHMS FOR ENGINEERING AP...
 
MSShin-Machine_Learning_Algorithm_in_Period_Estimation.ppt
MSShin-Machine_Learning_Algorithm_in_Period_Estimation.pptMSShin-Machine_Learning_Algorithm_in_Period_Estimation.ppt
MSShin-Machine_Learning_Algorithm_in_Period_Estimation.ppt
 
Time-delayed collective flow diffusion models for inferring latent people flo...
Time-delayed collective flow diffusion models for inferring latent people flo...Time-delayed collective flow diffusion models for inferring latent people flo...
Time-delayed collective flow diffusion models for inferring latent people flo...
 
Large scale cell tracking using an approximated Sinkhorn algorithm
Large scale cell tracking using an approximated Sinkhorn algorithmLarge scale cell tracking using an approximated Sinkhorn algorithm
Large scale cell tracking using an approximated Sinkhorn algorithm
 
8clustering.pptx
8clustering.pptx8clustering.pptx
8clustering.pptx
 

Recently uploaded

在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样
在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样
在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样
v7oacc3l
 
Analysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performanceAnalysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performance
roli9797
 
Natural Language Processing (NLP), RAG and its applications .pptx
Natural Language Processing (NLP), RAG and its applications .pptxNatural Language Processing (NLP), RAG and its applications .pptx
Natural Language Processing (NLP), RAG and its applications .pptx
fkyes25
 
My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.
rwarrenll
 
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
nuttdpt
 
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging DataPredictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
Kiwi Creative
 
Global Situational Awareness of A.I. and where its headed
Global Situational Awareness of A.I. and where its headedGlobal Situational Awareness of A.I. and where its headed
Global Situational Awareness of A.I. and where its headed
vikram sood
 
State of Artificial intelligence Report 2023
State of Artificial intelligence Report 2023State of Artificial intelligence Report 2023
State of Artificial intelligence Report 2023
kuntobimo2016
 
一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理
aqzctr7x
 
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Aggregage
 
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
ahzuo
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
Timothy Spann
 
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
v3tuleee
 
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
sameer shah
 
Everything you wanted to know about LIHTC
Everything you wanted to know about LIHTCEverything you wanted to know about LIHTC
Everything you wanted to know about LIHTC
Roger Valdez
 
一比一原版(Chester毕业证书)切斯特大学毕业证如何办理
一比一原版(Chester毕业证书)切斯特大学毕业证如何办理一比一原版(Chester毕业证书)切斯特大学毕业证如何办理
一比一原版(Chester毕业证书)切斯特大学毕业证如何办理
74nqk8xf
 
一比一原版(Harvard毕业证书)哈佛大学毕业证如何办理
一比一原版(Harvard毕业证书)哈佛大学毕业证如何办理一比一原版(Harvard毕业证书)哈佛大学毕业证如何办理
一比一原版(Harvard毕业证书)哈佛大学毕业证如何办理
zsjl4mimo
 
End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024
Lars Albertsson
 
Population Growth in Bataan: The effects of population growth around rural pl...
Population Growth in Bataan: The effects of population growth around rural pl...Population Growth in Bataan: The effects of population growth around rural pl...
Population Growth in Bataan: The effects of population growth around rural pl...
Bill641377
 
一比一原版(Glasgow毕业证书)格拉斯哥大学毕业证如何办理
一比一原版(Glasgow毕业证书)格拉斯哥大学毕业证如何办理一比一原版(Glasgow毕业证书)格拉斯哥大学毕业证如何办理
一比一原版(Glasgow毕业证书)格拉斯哥大学毕业证如何办理
g4dpvqap0
 

Recently uploaded (20)

在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样
在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样
在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样
 
Analysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performanceAnalysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performance
 
Natural Language Processing (NLP), RAG and its applications .pptx
Natural Language Processing (NLP), RAG and its applications .pptxNatural Language Processing (NLP), RAG and its applications .pptx
Natural Language Processing (NLP), RAG and its applications .pptx
 
My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.
 
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
 
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging DataPredictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
 
Global Situational Awareness of A.I. and where its headed
Global Situational Awareness of A.I. and where its headedGlobal Situational Awareness of A.I. and where its headed
Global Situational Awareness of A.I. and where its headed
 
State of Artificial intelligence Report 2023
State of Artificial intelligence Report 2023State of Artificial intelligence Report 2023
State of Artificial intelligence Report 2023
 
一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理
 
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
 
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
 
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
 
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
 
Everything you wanted to know about LIHTC
Everything you wanted to know about LIHTCEverything you wanted to know about LIHTC
Everything you wanted to know about LIHTC
 
一比一原版(Chester毕业证书)切斯特大学毕业证如何办理
一比一原版(Chester毕业证书)切斯特大学毕业证如何办理一比一原版(Chester毕业证书)切斯特大学毕业证如何办理
一比一原版(Chester毕业证书)切斯特大学毕业证如何办理
 
一比一原版(Harvard毕业证书)哈佛大学毕业证如何办理
一比一原版(Harvard毕业证书)哈佛大学毕业证如何办理一比一原版(Harvard毕业证书)哈佛大学毕业证如何办理
一比一原版(Harvard毕业证书)哈佛大学毕业证如何办理
 
End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024
 
Population Growth in Bataan: The effects of population growth around rural pl...
Population Growth in Bataan: The effects of population growth around rural pl...Population Growth in Bataan: The effects of population growth around rural pl...
Population Growth in Bataan: The effects of population growth around rural pl...
 
一比一原版(Glasgow毕业证书)格拉斯哥大学毕业证如何办理
一比一原版(Glasgow毕业证书)格拉斯哥大学毕业证如何办理一比一原版(Glasgow毕业证书)格拉斯哥大学毕业证如何办理
一比一原版(Glasgow毕业证书)格拉斯哥大学毕业证如何办理
 

MiLeTS'19: Enumerating Hub Motifs in Time Series Based on the Matrix Profile

  • 1. Enumerating Hub Motifs in Time Series Based on the Matrix Profile 1 National Institute of Advanced Industrial Science and Technology (AIST) 2 Mitsubishi Electric Corporation 3 LeapMind Inc. 5th Workshop on Mining and Learning from Time Series (MiLeTS’19) Aug 5, 2019 - Anchorage, Alaska, USA Genta Yoshimura1,2 Atsunori Kanemura1,3 Hideki Asoh1
  • 2. Outline 1. Introduction • Motif Enumeration • Problems in Existing Methods 2. Method • Novel Motif Definition: Hub Motif • Proposed Method: HubFinder 3. Experiments • Synthetic Data • Human Motion Data 4. Conclusion • Summary 2G. Yoshimura et al. Enumerating Hub Motifs in Time Series Based on the Matrix ProfileMiLeTS'19
  • 3. Outline 1. Introduction • Motif Enumeration • Problems of Existing Methods 2. Method • Novel Motif Definition: Hub Motif • Proposed Method: HubFinder 3. Experiments • Synthetic Data • Human Motion Data 4. Conclusion • Summary 3G. Yoshimura et al. Enumerating Hub Motifs in Time Series Based on the Matrix ProfileMiLeTS'19
  • 4. Motif Enumeration from Time Series Motif = a subsequence that occurs frequently in time series • Finding motifs is useful for many time series mining tasks • Classification, forecasting, segmentation, anomaly detection, … Motif Enumeration • Enumerate multiple motifs in order of significance rather than fining a single motif • Most time series include multiple patterns • In our problem setting, motif length W is a tunable parameter • Not variable-length motifs, but fixed-length motifs 4G. Yoshimura et al. Enumerating Hub Motifs in Time Series Based on the Matrix ProfileMiLeTS'19 Introduction Time series Time Motif 1 Motif 2 W W W W
  • 5. The difference of two definitions arises from how we regard a subsequence as significant 1. Range motif • A subsequence is significant if there exist many subsequences inside the sphere of radius R 2. Closest-pair motif • A subsequence is significant if the distance to its closest subsequence is small Note • Z-normalized Euclidean Distance (ED) is used as subsequence distance • Trivial-matches are ignored when finding neighbor subsequences n1=6 n2=3 Existing Two Motif Definitions 5G. Yoshimura et al. Enumerating Hub Motifs in Time Series Based on the Matrix ProfileMiLeTS'19 Introduction >significant >significant d1=0.63 d2=0.89 Subsequence whose length is W R
  • 6. Existing methods [Bagnall+14] require a radius parameter R • Place spheres of radius R so as not to overlap each other • Iteratively find the most significant subsequence as motif and remove subsequences inside the sphere of radius R 1. SetFinder = Range motif based method 2. ScanMK = Closest-pair motif based method Existing Motif Enumeration Methods 6G. Yoshimura et al. Enumerating Hub Motifs in Time Series Based on the Matrix ProfileMiLeTS'19 Introduction R R × argmaxi ni argmini di argmaxi ni argmini di remove remove remove remove
  • 7. Problems in Existing Methods Existing methods suffer from the radius parameter R 1. It is not easy to tune R • Appropriate parameter R changes in accordance with the target dataset • We cannot even know which R is appropriate in most real applications where no ground truth is available 2. There are cases where the existing methods fail to enumerate motifs successfully no matter how finely tune R • Such cases can be easily made and actually occur in real datasets Novel motif enumeration method is necessary 7G. Yoshimura et al. Enumerating Hub Motifs in Time Series Based on the Matrix ProfileMiLeTS'19 Introduction R R R ? Too small…Too large…
  • 8. Outline 1. Introduction • Motif Enumeration • Problems of Existing Methods 2. Method • Novel Motif Definition: Hub Motif • Proposed Method: HubFinder 3. Experiments • Synthetic Data • Human Motion Data 4. Conclusion • Summary 8G. Yoshimura et al. Enumerating Hub Motifs in Time Series Based on the Matrix ProfileMiLeTS'19
  • 9. Novel Motif Definition: Hub Motif In order to get free from the radius parameter R 1. Range motif • A subsequence is significant if there exist many subsequences with in the sphere of radius R 2. Closest-pair motif • A subsequence is significant if the distance to its closest subsequence is small 3. Hub motif • A subsequence is significant if a sum of distances from other subsequences is small • Looks like a wheel hub 9G. Yoshimura et al. Enumerating Hub Motifs in Time Series Based on the Matrix ProfileMiLeTS'19 Method R d1=0.63 d2=0.89 n1=6 n2=3 >significant Σk dik=5.12 Σk djk=7.36
  • 10. Proposed Method: HubFinder HubFinder does not require the radius parameter R • Motif length: W • Number of motifs: K HubFinder consists of two steps 1. Extract candidates for motifs using the matrix profile 2. Refine candidates into K motifs according to the hub motif significance 10G. Yoshimura et al. Enumerating Hub Motifs in Time Series Based on the Matrix ProfileMiLeTS'19 Method Time series Candidates 1. Extract 2. Refine Time W K Motifs ・・・ K
  • 11. Step 1. Extract candidates for motifs
      • Time series: X
      • Matrix profile: P, defined over the subsequences of X
      • P_i is the z-normalized Euclidean distance (ED) between the i-th subsequence and its closest subsequence, excluding trivial matches (the closest-pair distance).
      • P can be computed efficiently using the STOMP algorithm [Zhu+16].
      • The i-th subsequence is a motif candidate if P_i is a local minimum of P.
      • A sliding window is used to detect the local minima.
      • Extracted candidates are added to a candidate set.
      [Figure: time series X and its matrix profile P computed by STOMP; a local minimum within the sliding window marks a candidate.]
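Step 1 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it computes the matrix profile by brute force rather than with STOMP, and the exclusion zone of half the window length and the local-minimum window length are assumptions, since the transcript does not preserve those values.

```python
import numpy as np

def znorm(x):
    """Z-normalize a subsequence; a constant subsequence maps to zeros."""
    s = x.std()
    return (x - x.mean()) / s if s > 0 else np.zeros_like(x)

def naive_matrix_profile(series, w):
    """Brute-force O(n^2) matrix profile (STOMP gives the same result faster)."""
    n = len(series) - w + 1
    subs = np.array([znorm(series[i:i + w]) for i in range(n)])
    excl = w // 2  # ASSUMPTION: trivial-match exclusion zone of half the window
    profile = np.full(n, np.inf)
    for i in range(n):
        d = np.linalg.norm(subs - subs[i], axis=1)
        d[max(0, i - excl):min(n, i + excl + 1)] = np.inf  # skip trivial matches
        profile[i] = d.min()
    return profile

def local_minima(profile, window):
    """Indices whose profile value is the minimum inside a centred sliding window."""
    half = window // 2
    candidates = []
    for i in range(len(profile)):
        lo, hi = max(0, i - half), min(len(profile), i + half + 1)
        if np.isfinite(profile[i]) and lo + int(np.argmin(profile[lo:hi])) == i:
            candidates.append(i)
    return candidates

# Usage: plant the same pattern twice in noise and recover both positions.
rng = np.random.default_rng(0)
series = rng.normal(size=300)
pattern = np.sin(np.linspace(0, 2 * np.pi, 20))
series[50:70] = pattern
series[200:220] = pattern
profile = naive_matrix_profile(series, w=20)
candidates = local_minima(profile, window=20)
```

The two planted occurrences are each other's nearest neighbours, so the profile drops to zero at both start positions and both appear among the candidates, alongside weaker local minima caused by noise.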
  • 12. Step 2. Refine candidates into K motifs
      • Refine the candidate set into a motif set.
      • The cost function is based on the hub motif definition.
      • Find the motif set that minimizes the cost function in a greedy manner.
      • New candidates are added to the motif set one by one.
      • If the motif set then holds more than K motifs, the least significant one is removed.
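The add-one, drop-one loop of Step 2 can be sketched as follows. The cost function itself is not preserved in this transcript, so the sketch substitutes a stand-in: the total distance from every candidate to its nearest selected motif (a k-medoids-style cost). That cost, the function names, and the toy data are all assumptions; the actual HubFinder cost may differ.

```python
import numpy as np

def assumed_cost(D, motif_idx):
    """ASSUMPTION (not from the slides): sum over all candidates of the
    distance to the nearest selected motif."""
    return D[:, motif_idx].min(axis=1).sum()

def greedy_refine(D, k):
    """Add candidates one by one; whenever the set exceeds k members,
    drop the member whose removal leaves the lowest cost."""
    motifs = []
    for c in range(D.shape[0]):
        motifs.append(c)
        if len(motifs) > k:
            costs = [assumed_cost(D, motifs[:j] + motifs[j + 1:])
                     for j in range(len(motifs))]
            motifs.pop(int(np.argmin(costs)))
    return motifs

# Toy candidates: two groups of three points on a line.
points = np.array([0.0, 0.3, 0.5, 5.0, 5.2, 5.5])
D = np.abs(points[:, None] - points[None, :])
selected = greedy_refine(D, k=2)
```

With two well-separated groups and K = 2, the loop ends with one representative from each group, which is the behaviour one would expect from any cost that rewards motifs close to many candidates.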
  • 14. Synthetic Data
      Two motifs of length W=32 are arranged alternately:
      • Motif-1: z-normalized triangular wave + Gaussian noise
      • Motif-2: z-normalized sine wave + Gaussian noise
      [Figure: the alternating pair of motifs, repeated 50 times; total series length T=9616.]
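A series of this kind can be generated roughly as follows. This is a hedged sketch: the noise level, the exact wave shapes, and the back-to-back layout are assumptions. In particular, placing the motifs back to back yields 2 × 50 × 32 = 3200 samples, not the T=9616 reported on the slide, so the original layout evidently contains additional material that the transcript does not describe.

```python
import numpy as np

def znorm(x):
    """Z-normalize a non-constant sequence."""
    return (x - x.mean()) / x.std()

def make_series(w=32, repeats=50, noise_std=0.1, seed=0):
    """Alternate noisy triangular and sine motifs (noise_std is an assumption)."""
    rng = np.random.default_rng(seed)
    phase = np.arange(w) / w
    triangle = znorm(1.0 - 2.0 * np.abs(2.0 * phase - 1.0))  # one triangular period
    sine = znorm(np.sin(2.0 * np.pi * phase))                # one sine period
    pieces, labels = [], []
    for _ in range(repeats):
        for label, shape in ((1, triangle), (2, sine)):
            pieces.append(shape + rng.normal(0.0, noise_std, w))
            labels.append(label)
    return np.concatenate(pieces), labels

series, labels = make_series()
```

The returned `labels` list records which motif each length-W block came from, which is the ground truth needed later for a purity-style evaluation.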
  • 15. Synthetic Data (Result)
      Apply ScanMK, SetFinder, and HubFinder with W=32.
      • HubFinder finds the alternating motifs perfectly without tuning R.
      • The existing methods fail no matter how finely R is tuned.
      • The existing methods are sensitive to R.
      [Figure: purity (the larger, the better) versus radius R, with annotated values 58.5% (R=0.96), 69.0% (R=0.86), and 100% (constant); panels show the extracted 1st and 2nd motifs with their neighbors.]
  • 16. Human Motion Data
      MotionSense Dataset [Malekzadeh+18]
      • Collected with an iPhone 6s kept in the participant's front pocket.
      • Includes 3D accelerometer time series of human motion.
      • A total of 24 participants performed several activities.
      • 4 activities were chosen for this study.
      [Figure: x, y, z accelerometer traces for the four activities: Downstairs, Upstairs, Walking, Jogging.]
  • 17. Human Motion Data (Result)
      Apply ScanMK, SetFinder, and HubFinder with W=64.
      • The figure shows the positions of the top-4 motifs and their neighbors for participant #23.
      • The existing methods fail to find a motif in the Downstairs activity.
      • HubFinder successfully finds motifs in all 4 activities.
      [Figure: motif and neighbor positions over the Downstairs, Upstairs, Walking, and Jogging segments, one row each for ScanMK, SetFinder, and HubFinder.]
  • 19. Summary
      • Existing motif enumeration methods have problems caused by the radius parameter R.
      • We proposed a novel hub motif definition and the HubFinder algorithm.
      • HubFinder finds appropriate motifs without tuning R.
      • The existing methods fail no matter how finely R is tuned.
      Future Work
      • Remove the motif length parameter W (extend to variable-length motifs).
      • Utilize the extracted motifs for other time series mining tasks such as classification, forecasting, segmentation, and anomaly detection.
      Python code is available at https://github.com/intellygenta/HubFinder
      Thank you for your attention!!
  • 20. References
      [Bagnall+14] A. Bagnall, J. Hills, and J. Lines, “Finding Motif Sets in Time Series,” arXiv:1407.3685 (2014).
      [Zhu+16] Y. Zhu, Z. Zimmerman, N. S. Senobari, C. C. M. Yeh, G. Funning, A. Mueen, P. Brisk, and E. Keogh, “Matrix Profile II: Exploiting a Novel Algorithm and GPUs to Break the One Hundred Million Barrier for Time Series Motifs and Joins,” IEEE 16th International Conference on Data Mining (ICDM), pp. 739–748 (2016).
      [Malekzadeh+18] M. Malekzadeh, R. G. Clegg, A. Cavallaro, and H. Haddadi, “Protecting sensory data against sensitive inferences,” Workshop on Privacy by Design in Distributed Systems (2018).
  • 21. Purity
      • Motif enumeration is similar to the clustering task to some extent.
         • Both find representative patterns within a dataset in an unsupervised manner.
      • We adopt purity as the evaluation metric for this study.
         • It is one of the most popular metrics for clustering.
      • Purity compares the ground truth motif clusters with the enumerated motif clusters.
      • E.g. Purity = (5 + 1) / (5 + 5) = 0.60
      [Figure: ten subsequences labeled with ground truth clusters ψ1 and ψ2 and with enumerated clusters ω1 and ω2.]
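The formula on this slide did not survive extraction, so the sketch below uses the standard clustering definition of purity: each enumerated cluster is credited with its most frequent ground truth label, and the credits are summed and divided by the number of items. The label names and the particular assignment are assumptions, chosen to reproduce the slide's example value of 0.60.

```python
from collections import Counter

def purity(true_labels, cluster_labels):
    """Standard clustering purity: majority-label count per cluster, summed,
    divided by the total number of items."""
    per_cluster = {}
    for truth, cluster in zip(true_labels, cluster_labels):
        per_cluster.setdefault(cluster, Counter())[truth] += 1
    majority_total = sum(max(counts.values()) for counts in per_cluster.values())
    return majority_total / len(true_labels)

# Assumed assignment: omega1 holds five psi1 and four psi2 items, omega2 holds one psi2 item.
truth = ["psi1"] * 5 + ["psi2"] * 5
found = ["omega1"] * 9 + ["omega2"]
score = purity(truth, found)  # (5 + 1) / (5 + 5)
```

A purity of 1.0 means every enumerated cluster contains items from a single ground truth motif.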
  • 22. Time Complexity
      • Running times were measured on the synthetic time series.
      • HubFinder is faster than the existing methods for long time series.
      • HubFinder does not need multiple trials for tuning R.
  • 23. Dependency on the Number of Motifs K
      Dependence of purity on the number of motifs K for the synthetic data:
      • Blue and orange lines: ScanMK and SetFinder with their optimal radius R
      • Gray lines: existing methods with non-optimal radii
      • HubFinder outperforms the existing methods for all K and R.
  • 24. Human Motion Data (Result for All Participants)
      HubFinder outperforms the existing methods in terms of the purity metric.
      [Figure: HubFinder purity (without tuning R) plotted against ScanMK/SetFinder purity (with the best radius R).]