Observation 1:
The convergence of DTW and Euclidean distance results for increasing data sizes.
Observation 2:
The increasing effectiveness of lower-bounding pruning for increasing data sizes.
Accelerating Dynamic Time Warping Clustering with a Novel Admissible Pruning Strategy
Nurjahan Begum, Liudmila Ulanova, Jun Wang¹ and Eamonn Keogh
University of California, Riverside; ¹UT Dallas
Why is DTW Clustering Hard?
Motivation of DTW Clustering
Density Peaks (DP) Algorithm
Why Existing Work is not the Answer?
TADPole: Our Proposed Algorithm
How ‘good’ are TADPole Clusters?
Case Study 1: Electromagnetic Articulograph
How Effective is TADPole’s Pruning?
[Figure: hashtag time series (#kanyewest, #Michael, #MichaelJackson, #taylorswift) over 0 to 120 hours, illustrating synonym discovery and association discovery; annotation: “I’mma let you finish”.]
[Figure: Bos taurus, Hyperoodon ampullatus and Talpa europaea grouped under DTW and under ED; label: Cetartiodactyla.]
[Figure: 1-NN error rate versus size of training set (0 to 2000) for Euclidean and DTW.]
[Figure: Rand Index versus dataset size (0 to 2000) for DTW and Euclidean.]
Neither of these two observations helps!
[Figure: 13 numbered objects shown in two panels; annotations mark an object mislabeled by k-means and an outlier.]
Scalability Issue:
DTW is not a metric and is therefore very difficult to index.
Quality Issue:
We need a clustering algorithm that is insensitive to outliers.
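The claim that DTW is not a metric can be checked on a toy example. The sketch below is an illustration only: it assumes an unconstrained DTW with absolute-difference cost (the poster does not state which DTW variant is used), and the function name `dtw` and the three short series are made up for the example. It shows a triple of series for which the triangle inequality fails, which is the property that metric indexing structures rely on.

```python
def dtw(x, y):
    """Unconstrained DTW with absolute-difference cost (an assumed variant)."""
    inf = float("inf")
    n, m = len(x), len(y)
    # D[i][j] holds the cheapest cost of aligning x[:i] with y[:j].
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]


# Hypothetical toy series chosen to expose the violation.
x, y, z = [0], [1, 2], [2, 3, 3]
d_xy, d_yz, d_xz = dtw(x, y), dtw(y, z), dtw(x, z)

# Triangle inequality would require d_xz <= d_xy + d_yz.
print(d_xy, d_yz, d_xz, d_xz <= d_xy + d_yz)  # 3.0 3.0 8.0 False
```

The toy series have different lengths to keep the example tiny; the point is only that a detour through y can be cheaper than the direct alignment.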
3 steps
1. Density Calculation
2. NN within Higher Density List Calculation
3. Cluster Assignment
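The three steps can be sketched as follows. This is a minimal illustration of the Density Peaks idea on a precomputed distance matrix, not the authors' code: the function name `density_peaks`, the choice of the k objects with the largest ρ·δ as cluster centers, and the tie-breaking rules are assumptions made for the example.

```python
def density_peaks(D, dc, k):
    """D: symmetric distance matrix (list of lists), dc: cutoff, k: clusters."""
    n = len(D)
    # Step 1: local density = number of other objects closer than dc.
    rho = [sum(1 for j in range(n) if j != i and D[i][j] < dc)
           for i in range(n)]
    # Densest first; ties broken by index (an arbitrary choice).
    order = sorted(range(n), key=lambda i: (-rho[i], i))
    pos = {i: p for p, i in enumerate(order)}

    # Step 2: nearest neighbour among the objects earlier in `order`,
    # i.e. within the "higher density list" of each object.
    nn = [-1] * n
    delta = [0.0] * n
    for p, i in enumerate(order):
        if p == 0:
            delta[i] = max(D[i])  # densest object: use its largest distance
            continue
        j = min(order[:p], key=lambda c: D[i][c])
        nn[i], delta[i] = j, D[i][j]

    # Step 3: pick k centers (assumed rule: largest rho * delta), then let
    # every other object inherit the label of its higher-density neighbour.
    centers = sorted(range(n), key=lambda i: (-rho[i] * delta[i], pos[i]))[:k]
    labels = [-1] * n
    for c, i in enumerate(centers):
        labels[i] = c
    for i in order:  # neighbours are always labelled before their followers
        if labels[i] == -1:
            labels[i] = labels[nn[i]]
    return rho, nn, labels


# Toy data: two well-separated groups on a line.
pts = [0, 1, 2, 10, 11, 12]
D = [[abs(a - b) for b in pts] for a in pts]
rho, nn, labels = density_peaks(D, dc=1.5, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Because each object only copies a label from a denser neighbour, the assignment pass is a single sweep in density order; the expensive part is filling in the distances, which is where pruning matters.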
[Figure: Density Peaks example on 13 numbered objects, showing each object's local density ρ, the elements with higher density, and the cutoff distance dc. Example annotation: Item 1’s cluster label = item 3’s cluster label.]
Local density: $\rho_i = \sum_j \chi(d_{ij} - d_c)$
Pruning During Local Density Computation
[Figure: four cases A) to D) for a pair (i, j), each placing LBMatrix(i,j), Dij and UBMatrix(i,j) relative to the cutoff dc; case A) shows Dij = 0.]
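One way to read the cases in this figure is sketched below: a distance only needs to be computed when the cutoff dc falls between its lower and upper bound. This is an interpretation under stated assumptions, not the authors' implementation. The names `pruned_local_density`, `lb`, `ub` and `dist` are hypothetical, the bound matrices are taken as given (how they are obtained is outside this sketch), and the demo uses synthetic bounds around a cheap stand-in distance.

```python
def pruned_local_density(n, dc, lb, ub, dist):
    """Count neighbours within dc, calling dist(i, j) only when needed.

    lb[i][j] <= true distance <= ub[i][j] is assumed for every pair.
    Returns (rho, number_of_distance_computations).
    """
    rho = [0] * n
    computed = 0
    for i in range(n):
        for j in range(i + 1, n):  # i == j has distance 0 and is skipped
            if ub[i][j] < dc:
                within = True      # certainly inside the cutoff: pruned
            elif lb[i][j] >= dc:
                within = False     # certainly outside the cutoff: pruned
            else:
                within = dist(i, j) < dc  # bounds straddle dc: compute
                computed += 1
            if within:
                rho[i] += 1
                rho[j] += 1
    return rho, computed


# Demo with synthetic bounds (an assumption; real bounds come from elsewhere).
pts = [0, 1, 2, 10, 11, 12]
n = len(pts)
true = [[abs(a - b) for b in pts] for a in pts]
lb = [[0.8 * d for d in row] for row in true]
ub = [[1.25 * d for d in row] for row in true]

rho, computed = pruned_local_density(n, 2.2, lb, ub, lambda i, j: true[i][j])
print(rho, computed, "of", n * (n - 1) // 2)  # [2, 2, 2, 2, 2, 2] 2 of 15
```

The pruning is admissible in the sense that the resulting densities are identical to those obtained by computing every distance, as long as the bounds are valid.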
Pruning During NN Distance Calculation From Higher Density List
[Figure: four cases A) to D) for an object i and candidates j1 to j4 from its higher density list, each showing LBMatrix(i,jk), the distance Dk and UBMatrix(i,jk).]
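The same bounds can prune the nearest-neighbour search within the higher density list. The sketch below shows one standard way to do this and is not necessarily the authors' exact procedure: the smallest upper bound caps the true nearest-neighbour distance, so any candidate whose lower bound exceeds that cap (or the best distance found so far) cannot be the nearest neighbour. The names `pruned_nn`, `lb`, `ub` and `dist` are hypothetical, and the demo bounds are synthetic.

```python
def pruned_nn(i, candidates, lb, ub, dist):
    """Nearest neighbour of i among candidates, skipping hopeless ones.

    Assumes lb[i][j] <= true distance <= ub[i][j] and a non-empty list.
    Returns (nn_index, nn_distance, number_of_distance_computations).
    """
    # The true NN distance can never exceed the smallest upper bound.
    cap = min(ub[i][j] for j in candidates)
    nn, nn_dist, computed = None, float("inf"), 0
    # Visit the most promising candidates (smallest lower bound) first.
    for j in sorted(candidates, key=lambda c: lb[i][c]):
        if lb[i][j] > min(cap, nn_dist):
            break  # sorted by lb, so every remaining candidate is pruned too
        d = dist(i, j)
        computed += 1
        if d < nn_dist:
            nn, nn_dist = j, d
    return nn, nn_dist, computed


# Demo with synthetic bounds (an assumption made for the example).
pts = [0, 1, 2, 10, 11, 12]
true = [[abs(a - b) for b in pts] for a in pts]
lb = [[0.8 * d for d in row] for row in true]
ub = [[1.25 * d for d in row] for row in true]

print(pruned_nn(0, [1, 2, 3, 4, 5], lb, ub, lambda i, j: true[i][j]))
# (1, 1, 1): one distance computed instead of five
```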
[Figure: distance calculations versus number of objects (0 to 3500) for brute force and TADPole, shown as an absolute number (×10^6) and as a percentage (0 to 100).]
DP: 9 Hours
TADPole: 9 minutes
Distance Computation Ordering: Anytime TADPole
[Figure: Rand Index (0.4 to 1) versus distance computation percentage (0 to 100%) for Euclidean Distance, Oracle Order, TADPole Order and Random Order.]
Annotation: This reflects the 90% of DTW calculations that were admissibly pruned.
Annotation: This reflects the 10% of DTW calculations that were calculated in anytime ordering.
[Figure: Zoom-In of Above Figure. Rand Index (0.4 to 1) versus distance computation percentage (0 to 10%) for Oracle Order, Random Order and TADPole Order.]
[Figure: Y and Z traces (0 to 150); Rand Index (0.84 to 1) versus distance computation percentage (1 to 7) for Euclidean Distance, Oracle Order, Random Order and TADPole Order.]
Pruning: 94%
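The plots above score clusterings with the Rand Index. For reference, a minimal sketch of the standard (unadjusted) Rand Index follows; the function name `rand_index` and the toy labelings are made up for the example, and the poster does not say which implementation was used.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of object pairs on which two labelings agree.

    A pair agrees when both labelings put it in the same cluster,
    or both put it in different clusters. Needs at least two objects.
    """
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


# Cluster names do not matter, only the grouping does.
print(rand_index([0, 0, 1, 1], [1, 1, 0, 0]))  # 1.0
```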
Case Study 2: Pulsus Dataset
[Figure: pulse oximeter schematic (LED, photo detector, artery, vein) with examples labeled Healthy, Suspected Pulsus and Severe Pulsus.]
[Figure: panels A) to F) over 0 to 60 for Patient 639, Patient 523, Patient 618 and Patient 2975918; labels: Normalized Respiration Rate, Normalized Heart Rate, Power Spectral Density, Frequency.]
[Figure: PPG traces (200 to 1800) for Non-Severe Pulsus and Severe Pulsus.]
Reproducibility
All the code and datasets used in this paper are publicly available at:
www.cs.ucr.edu/~nbegu001/SpeededClusteringDTW
Pruning: 88%
