SlideShare a Scribd company logo
1 of 45
12 Dec 2019
Time Series Clustering
Yolande, Elena, & Beth
AGENDA 01
Distance Measures 02
03Prototypes
07
Data Preprocessing 04
Background
Clustering Algorithms 05
Cluster Evaluation 06
Remarks & Conclusions
BACKGROUND
What is time series?
What is time series clustering and its applications?
01
Background
• High dimensionality
• Irregular lengths
• Noise and time shifts
time (s)
variable
A time series is a collection of observations made sequentially in time.
Time Series Inputs
time (s)
Time Series Clustering Applications
Customer Segmentation Chicken Segmentation
Clustering
Algorithm
Distance
Measure
Prototype
N clusters
Time Series
Data
𝑑(𝑥, 𝑐)
𝑝(𝑥)
𝑥
DISTANCE MEASURES
Euclidean
Manhattan
Correlation-Based
Dynamic Time Warping (DTW)
Others: Canberra
Binary
Minkowski
Maximum Norm
Hamming Distance
02
1. Euclidean distance
n = number of dimensions
xi and yi = the i-th attribute (component) of x and y
𝑑(𝑥, 𝑦) =
𝑖=1
𝑛
𝑥𝑖 − 𝑦𝑖 2 𝑥
𝑦
2. Manhattan distance
𝑑 𝑚𝑎𝑛(𝑥, 𝑦) =
𝑖=1
𝑛
| (𝑥𝑖 − 𝑦𝑖) |
Also known as Manhattan length, rectilinear distance,
Minkowski's L1 distance, L1 norm, taxi cab metric, snake
distance, city block distance
3. Correlation-Based Distances
 Pearson
 Spearman
 Kendall
4. Dynamic Time Warping (DTW)
Time Series 2
TimeSeries1
Distance Matrix
1. Create matrix
2. Create the alignment
4. Dynamic Time Warping (DTW)
1. Create matrix 2. Create the alignment
Difference between DTW and Euclidean
Which Distance Measure to Use
• Type of the data
• Research questions
Criteria Euclidean DTW
Supports Time Series length differences No Yes
Supports Time Series time shifts No Yes
Computational costs Low High
Clustering
Algorithm
Distance
Measure
Prototype
N clusters
Time Series
Data
𝑑(𝑥, 𝑐)
𝑝(𝑥)
𝑥
PROTOTYPES
Mean - Median
Partition Around Medoids ( PAM) Methods
DTW Barycenter Averaging
Shape Extraction
03
1.Mean 2.Median
𝜇𝑖
𝑣
=
1
𝑁
𝑥 𝑐,𝑖
𝑣
, ∀𝑐 ∈ 𝐶 𝑚𝑖
𝑣
=
𝑥 𝑐,(𝑛−1)/2
𝑣
− 𝑥 𝑐,(𝑛+1)/2
𝑣
2
(n+1)/2
(n-1)/2
3.Partition Around Medoids (PAM)
𝑓 𝑥, 𝑐 =
𝑖=1
𝑁
𝑑𝑖𝑠𝑡(𝑥𝑖 − 𝑐𝑖)
4.Dynamic Time Warping Barycenter Averaging (DBA)
l1
l2
l3
P
l4
l5
l1 l2 l3 l4 l5
P D1 D2 D3 D4 D5
𝐷𝐵𝐴 = 𝑎𝑟𝑔 𝑚𝑖𝑛
𝑖=1
𝑁
𝐷𝑇𝑊(𝑃, 𝑙𝑖)
5.Shape Extraction
𝑆𝐸 𝑋, 𝑣𝑖 =
𝑥 𝑘∈𝑋 𝑖
∆(𝑥 𝑘 − 𝑣𝑖)2
Clustering
Algorithm
Distance
Measure
Prototype
N clusters
Time Series
Data
𝑑(𝑥, 𝑐)
𝑝(𝑥)
𝑥
Normalized
DATA PREPROCESSING
Dimensionality Reduction
Normalization
04
Aggregating
time (s)
variablevariable
time (s)
Downsampling
time (s)
variable
time (s)
variable
Normalization
𝜇 = 0
σ = 1
Input Output
Clustering
Algorithm
Distance
Measure
Prototype
N clusters
Time Series
Data
𝑑(𝑥, 𝑐)
𝑝(𝑥)
𝑥
Normalized
CLUSTERING ALGORITHMS
Partitional Clustering
Hierarchical Clustering
05
Clustering
Clustering algorithm Distance measure Prototype
Partitional
K – means / K – medoid Euclidean / Manhattan Mean / PAM
TAD Pole DTW DBA
K – shape SBD Shape Extraction
Hierarchical Agglomerative All All
Clustering AlgorithmDistance
Measure
Prototype
N clusters
Time Series
Data
Assign clusters
𝑐1
𝑐2
𝑥1
𝑑 𝟏𝟏 < 𝑑 𝟏2
Random centroid
initialization
k
𝑥 = 𝑝𝑜𝑖𝑛𝑡𝑠 , 𝑐 = 𝑐𝑒𝑛𝑡𝑟𝑜𝑖𝑑𝑠,
𝑑 = 𝑑𝑖𝑠𝑡𝑎𝑛𝑐𝑒 𝑚𝑒𝑎𝑠𝑢𝑟𝑒
𝑚𝑖𝑛 𝑥,𝑐 𝑑(𝑥, 𝑐)
𝑑(𝑥, 𝑐)
Partitional Clustering
Assign clusters
𝑐1 𝑥1
𝑥2
𝑥 = 𝑝𝑜𝑖𝑛𝑡𝑠 , 𝑐 = 𝑐𝑒𝑛𝑡𝑟𝑜𝑖𝑑𝑠, 𝑑 = 𝑑𝑖𝑠𝑡𝑎𝑛𝑐𝑒 𝑚𝑒𝑎𝑠𝑢𝑟𝑒
𝑚𝑖𝑛 𝑥,𝑐 𝑑(𝑥, 𝑐)
Random centroid
initialization
k 𝑑(𝑥, 𝑐)
Partitional Clustering
Adjust
centroids
Assign clusters
𝑐1 𝑥1
𝑥2
𝑥 = 𝑝𝑜𝑖𝑛𝑡𝑠 , 𝑐 = 𝑐𝑒𝑛𝑡𝑟𝑜𝑖𝑑𝑠, 𝑑 = 𝑑𝑖𝑠𝑡𝑎𝑛𝑐𝑒 𝑚𝑒𝑎𝑠𝑢𝑟𝑒, 𝑝 = 𝑝𝑟𝑜𝑡𝑜𝑡𝑦𝑝𝑒
𝑚𝑖𝑛 𝑥,𝑐 𝑑(𝑥, 𝑐)
Random centroid
initialization
k 𝑑(𝑥, 𝑐) 𝑝(𝑥)
𝑐𝑖 = 𝑝(𝑥)
𝑐1
𝑐2
𝑛 𝑖𝑡𝑒𝑟𝑎𝑡𝑖𝑜𝑛𝑠
Partitional Clustering
Hierarchical Clustering
• Agglomerative bottom – up
• Divisive top – down
Dendrogram
Hierarchical Clustering
PattySelmaMargeLisaEdna
Hierarchical Clustering
1 2 3
Partitional Clustering vs Hierarchical Clustering
Partitional Clustering Hierarchical Clustering
Visualization
Cluster number predefined predefined threshold
Computational
cost
low high
k = 3
CLUSTER EVALUATION
Internal metric
 Silhouette
 Davies – Bouldin (DB)
External metric
 Soft Rand
 Jaccard Coefficient
06
Silhouette Index
𝑆𝐼 =
1
𝑁
𝑖=1
𝑘
𝑥 ∈𝐶 𝑖
𝑏 𝑥, 𝐶𝑖 − 𝑎(𝑥, 𝐶𝑖)
max(𝑎 𝑥, 𝐶𝑖 , 𝑏 𝑥, 𝐶𝑖 )
𝑎 𝑥, 𝐶𝑖 =
1
𝑁𝐶 𝑖 𝑦∈𝐶 𝑖
𝑑𝑖𝑠𝑡(𝑥, 𝑦) b 𝑥, 𝐶𝑖 = 𝑚𝑖𝑛
1
𝑁 𝐶 𝑖
𝑦∈𝐶𝑖
𝑑𝑖𝑠𝑡(𝑥, 𝑦)
𝑐1
𝑥1
Intra – cluster distance Inter – cluster distance
𝑁 = 𝑛𝑢𝑚𝑏𝑒𝑟 𝑜𝑓 𝑡𝑖𝑚𝑒 𝑠𝑒𝑟𝑖𝑒𝑠 𝑖𝑛 𝑡ℎ𝑒 𝑐𝑙𝑢𝑠𝑡𝑒𝑟
Davies-Bouldin (DB)
𝐷𝐵 =
1
𝑘
𝑖=1
𝑘
min
𝑖≠𝑗
𝛼𝑖 + 𝛼𝑗
𝑑𝑖𝑠𝑡( 𝐶𝑖, 𝐶𝑗)
𝑐1
𝑐2
𝛼𝑖, 𝛼𝑗 = 𝑖𝑛𝑡𝑟𝑎 − 𝑐𝑙𝑢𝑠𝑒𝑟 𝑑𝑖𝑠𝑡𝑎𝑛𝑐𝑒
𝑐𝑖, 𝑐𝑗 = 𝑐𝑒𝑛𝑡𝑟𝑜𝑖𝑑
Soft Rand
𝑅𝐼 =
𝑎 + 𝑏
𝑎 + 𝑏 + 𝑐 + 𝑑
Clustering Ground Truth
a b c d RI
9 3 2 2 0,75
Jaccard Coefficient
Clustering Ground Truth
a b c J
9 3 2 0,643
J=
𝑎
𝑎+𝑏+𝑐
REMARKS
Elbow method
Silhouette index
GAP Statistics
Discrete Fourier Transform (DFT)
07
CONCLUSIONS
What is the take away?
07
 DTW
 Right combination of distance measure & prototype
Conclusions
Clustering algorithm Distance measure Prototype
Partitional
K – means / K – medoid Euclidean / Manhattan Mean / PAM
TAD Pole DTW DBA
K – shape SBD Shape Extraction
Hierarchical Agglomerative All All
Time series clustering presentation

More Related Content

What's hot

Nonnegative Matrix Factorization
Nonnegative Matrix FactorizationNonnegative Matrix Factorization
Nonnegative Matrix FactorizationTatsuya Yokota
 
Hierarchical clustering
Hierarchical clusteringHierarchical clustering
Hierarchical clusteringChakrit Phain
 
Lect5 principal component analysis
Lect5 principal component analysisLect5 principal component analysis
Lect5 principal component analysishktripathy
 
Clustering training
Clustering trainingClustering training
Clustering trainingGabor Veress
 
Introduction to Linear Discriminant Analysis
Introduction to Linear Discriminant AnalysisIntroduction to Linear Discriminant Analysis
Introduction to Linear Discriminant AnalysisJaclyn Kokx
 
Privacy preserving machine learning
Privacy preserving machine learningPrivacy preserving machine learning
Privacy preserving machine learningMichał Kuźba
 
Dimensionality Reduction
Dimensionality ReductionDimensionality Reduction
Dimensionality Reductionmrizwan969
 
Nonlinear component analysis as a kernel eigenvalue problem
Nonlinear component analysis as a kernel eigenvalue problemNonlinear component analysis as a kernel eigenvalue problem
Nonlinear component analysis as a kernel eigenvalue problemMichele Filannino
 
Anomaly Detection Using Generative Adversarial Network(GAN)
Anomaly Detection Using Generative Adversarial Network(GAN)Anomaly Detection Using Generative Adversarial Network(GAN)
Anomaly Detection Using Generative Adversarial Network(GAN)Asha Aher
 
Kernel density estimation (kde)
Kernel density estimation (kde)Kernel density estimation (kde)
Kernel density estimation (kde)Padma Metta
 
Support Vector Machine - How Support Vector Machine works | SVM in Machine Le...
Support Vector Machine - How Support Vector Machine works | SVM in Machine Le...Support Vector Machine - How Support Vector Machine works | SVM in Machine Le...
Support Vector Machine - How Support Vector Machine works | SVM in Machine Le...Simplilearn
 
Data preprocessing using Machine Learning
Data  preprocessing using Machine Learning Data  preprocessing using Machine Learning
Data preprocessing using Machine Learning Gopal Sakarkar
 
K Means Clustering Algorithm | K Means Clustering Example | Machine Learning ...
K Means Clustering Algorithm | K Means Clustering Example | Machine Learning ...K Means Clustering Algorithm | K Means Clustering Example | Machine Learning ...
K Means Clustering Algorithm | K Means Clustering Example | Machine Learning ...Simplilearn
 
Dimensionality reduction with UMAP
Dimensionality reduction with UMAPDimensionality reduction with UMAP
Dimensionality reduction with UMAPJakub Bartczuk
 

What's hot (20)

Nonnegative Matrix Factorization
Nonnegative Matrix FactorizationNonnegative Matrix Factorization
Nonnegative Matrix Factorization
 
Hierarchical clustering
Hierarchical clusteringHierarchical clustering
Hierarchical clustering
 
Lect5 principal component analysis
Lect5 principal component analysisLect5 principal component analysis
Lect5 principal component analysis
 
Dynamic time wrapping
Dynamic time wrappingDynamic time wrapping
Dynamic time wrapping
 
Clustering training
Clustering trainingClustering training
Clustering training
 
Introduction to Linear Discriminant Analysis
Introduction to Linear Discriminant AnalysisIntroduction to Linear Discriminant Analysis
Introduction to Linear Discriminant Analysis
 
Privacy preserving machine learning
Privacy preserving machine learningPrivacy preserving machine learning
Privacy preserving machine learning
 
Clustering
ClusteringClustering
Clustering
 
Dimensionality Reduction
Dimensionality ReductionDimensionality Reduction
Dimensionality Reduction
 
Hierarchical Clustering
Hierarchical ClusteringHierarchical Clustering
Hierarchical Clustering
 
Nonlinear component analysis as a kernel eigenvalue problem
Nonlinear component analysis as a kernel eigenvalue problemNonlinear component analysis as a kernel eigenvalue problem
Nonlinear component analysis as a kernel eigenvalue problem
 
Clustering
ClusteringClustering
Clustering
 
Anomaly Detection Using Generative Adversarial Network(GAN)
Anomaly Detection Using Generative Adversarial Network(GAN)Anomaly Detection Using Generative Adversarial Network(GAN)
Anomaly Detection Using Generative Adversarial Network(GAN)
 
Hidden Markov Model
Hidden Markov Model Hidden Markov Model
Hidden Markov Model
 
Kernel density estimation (kde)
Kernel density estimation (kde)Kernel density estimation (kde)
Kernel density estimation (kde)
 
Support Vector Machine - How Support Vector Machine works | SVM in Machine Le...
Support Vector Machine - How Support Vector Machine works | SVM in Machine Le...Support Vector Machine - How Support Vector Machine works | SVM in Machine Le...
Support Vector Machine - How Support Vector Machine works | SVM in Machine Le...
 
Data preprocessing using Machine Learning
Data  preprocessing using Machine Learning Data  preprocessing using Machine Learning
Data preprocessing using Machine Learning
 
K Means Clustering Algorithm | K Means Clustering Example | Machine Learning ...
K Means Clustering Algorithm | K Means Clustering Example | Machine Learning ...K Means Clustering Algorithm | K Means Clustering Example | Machine Learning ...
K Means Clustering Algorithm | K Means Clustering Example | Machine Learning ...
 
2.2.ppt.SC
2.2.ppt.SC2.2.ppt.SC
2.2.ppt.SC
 
Dimensionality reduction with UMAP
Dimensionality reduction with UMAPDimensionality reduction with UMAP
Dimensionality reduction with UMAP
 

Similar to Time series clustering presentation

Learning a nonlinear embedding by preserving class neibourhood structure 최종
Learning a nonlinear embedding by preserving class neibourhood structure   최종Learning a nonlinear embedding by preserving class neibourhood structure   최종
Learning a nonlinear embedding by preserving class neibourhood structure 최종WooSung Choi
 
Distributional RL via Moment Matching
Distributional RL via Moment MatchingDistributional RL via Moment Matching
Distributional RL via Moment Matchingtaeseon ryu
 
Machine learning for_finance
Machine learning for_financeMachine learning for_finance
Machine learning for_financeStefan Duprey
 
Unbiased Bayes for Big Data
Unbiased Bayes for Big DataUnbiased Bayes for Big Data
Unbiased Bayes for Big DataChristian Robert
 
An_Accelerated_Nearest_Neighbor_Search_Method_for_the_K-Means_Clustering_Algo...
An_Accelerated_Nearest_Neighbor_Search_Method_for_the_K-Means_Clustering_Algo...An_Accelerated_Nearest_Neighbor_Search_Method_for_the_K-Means_Clustering_Algo...
An_Accelerated_Nearest_Neighbor_Search_Method_for_the_K-Means_Clustering_Algo...Adam Fausett
 
Efficient anomaly detection via matrix sketching
Efficient anomaly detection via matrix sketchingEfficient anomaly detection via matrix sketching
Efficient anomaly detection via matrix sketchingHsing-chuan Hsieh
 
Event classification & prediction using support vector machine
Event classification & prediction using support vector machineEvent classification & prediction using support vector machine
Event classification & prediction using support vector machineRuta Kambli
 
PyData NYC 2015 - Automatically Detecting Outliers with Datadog
PyData NYC 2015 - Automatically Detecting Outliers with Datadog PyData NYC 2015 - Automatically Detecting Outliers with Datadog
PyData NYC 2015 - Automatically Detecting Outliers with Datadog Datadog
 
Pattern Mining in large time series databases
Pattern Mining in large time series databasesPattern Mining in large time series databases
Pattern Mining in large time series databasesJitesh Khandelwal
 
Hierarchical clustering
Hierarchical clusteringHierarchical clustering
Hierarchical clusteringishmecse13
 
Dataday Texas 2016 - Datadog
Dataday Texas 2016 - DatadogDataday Texas 2016 - Datadog
Dataday Texas 2016 - DatadogDatadog
 
2017 09-29 ndt loop closure
2017 09-29 ndt loop closure2017 09-29 ndt loop closure
2017 09-29 ndt loop closureiMorpheus ai
 
Variable neighborhood Prediction of temporal collective profiles by Keun-Woo ...
Variable neighborhood Prediction of temporal collective profiles by Keun-Woo ...Variable neighborhood Prediction of temporal collective profiles by Keun-Woo ...
Variable neighborhood Prediction of temporal collective profiles by Keun-Woo ...EuroIoTa
 
[Vldb 2013] skyline operator on anti correlated distributions
[Vldb 2013] skyline operator on anti correlated distributions[Vldb 2013] skyline operator on anti correlated distributions
[Vldb 2013] skyline operator on anti correlated distributionsWooSung Choi
 
Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo...
Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo...Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo...
Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo...MLconf
 

Similar to Time series clustering presentation (20)

Clustering
ClusteringClustering
Clustering
 
Learning a nonlinear embedding by preserving class neibourhood structure 최종
Learning a nonlinear embedding by preserving class neibourhood structure   최종Learning a nonlinear embedding by preserving class neibourhood structure   최종
Learning a nonlinear embedding by preserving class neibourhood structure 최종
 
Lec13 Clustering.pptx
Lec13 Clustering.pptxLec13 Clustering.pptx
Lec13 Clustering.pptx
 
Distributional RL via Moment Matching
Distributional RL via Moment MatchingDistributional RL via Moment Matching
Distributional RL via Moment Matching
 
Lect4
Lect4Lect4
Lect4
 
Machine learning for_finance
Machine learning for_financeMachine learning for_finance
Machine learning for_finance
 
Unbiased Bayes for Big Data
Unbiased Bayes for Big DataUnbiased Bayes for Big Data
Unbiased Bayes for Big Data
 
An_Accelerated_Nearest_Neighbor_Search_Method_for_the_K-Means_Clustering_Algo...
An_Accelerated_Nearest_Neighbor_Search_Method_for_the_K-Means_Clustering_Algo...An_Accelerated_Nearest_Neighbor_Search_Method_for_the_K-Means_Clustering_Algo...
An_Accelerated_Nearest_Neighbor_Search_Method_for_the_K-Means_Clustering_Algo...
 
Efficient anomaly detection via matrix sketching
Efficient anomaly detection via matrix sketchingEfficient anomaly detection via matrix sketching
Efficient anomaly detection via matrix sketching
 
TunUp final presentation
TunUp final presentationTunUp final presentation
TunUp final presentation
 
Event classification & prediction using support vector machine
Event classification & prediction using support vector machineEvent classification & prediction using support vector machine
Event classification & prediction using support vector machine
 
PyData NYC 2015 - Automatically Detecting Outliers with Datadog
PyData NYC 2015 - Automatically Detecting Outliers with Datadog PyData NYC 2015 - Automatically Detecting Outliers with Datadog
PyData NYC 2015 - Automatically Detecting Outliers with Datadog
 
08 clustering
08 clustering08 clustering
08 clustering
 
Pattern Mining in large time series databases
Pattern Mining in large time series databasesPattern Mining in large time series databases
Pattern Mining in large time series databases
 
Hierarchical clustering
Hierarchical clusteringHierarchical clustering
Hierarchical clustering
 
Dataday Texas 2016 - Datadog
Dataday Texas 2016 - DatadogDataday Texas 2016 - Datadog
Dataday Texas 2016 - Datadog
 
2017 09-29 ndt loop closure
2017 09-29 ndt loop closure2017 09-29 ndt loop closure
2017 09-29 ndt loop closure
 
Variable neighborhood Prediction of temporal collective profiles by Keun-Woo ...
Variable neighborhood Prediction of temporal collective profiles by Keun-Woo ...Variable neighborhood Prediction of temporal collective profiles by Keun-Woo ...
Variable neighborhood Prediction of temporal collective profiles by Keun-Woo ...
 
[Vldb 2013] skyline operator on anti correlated distributions
[Vldb 2013] skyline operator on anti correlated distributions[Vldb 2013] skyline operator on anti correlated distributions
[Vldb 2013] skyline operator on anti correlated distributions
 
Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo...
Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo...Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo...
Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo...
 

Recently uploaded

Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...dajasot375
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptxthyngster
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfgstagge
 
9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home ServiceSapana Sha
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...Suhani Kapoor
 
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls DubaiDubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls Dubaihf8803863
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdfHuman37
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...Florian Roscheck
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一fhwihughh
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPramod Kumar Srivastava
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样vhwb25kk
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationshipsccctableauusergroup
 
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...Pooja Nehwal
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]📊 Markus Baersch
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...Suhani Kapoor
 

Recently uploaded (20)

Decoding Loan Approval: Predictive Modeling in Action
Decoding Loan Approval: Predictive Modeling in ActionDecoding Loan Approval: Predictive Modeling in Action
Decoding Loan Approval: Predictive Modeling in Action
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdf
 
Call Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort ServiceCall Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort Service
 
9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
 
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls DubaiDubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
 
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
 
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships
 
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
 

Time series clustering presentation

Editor's Notes

  1. Provide quantification for the dissimilarity between two time-series The classification of objects, into clusters, requires some methods for measuring the distance or the (dis)similarity between the objects The term proximity is used to refer to either similarity or dissimilarity. Frequently, the term distance is used as a synonym for dissimilarity.
  2. Variable for Recent years have seen a surge of interest in time series clustering. Data characteristics are evolving and traditional clustering algorithms are becoming less popular in time series clustering. The most commonly used distance measures are only defined for series of equal length and are sensitive to noise, scale and time shifts Thus, many other distance measures tailored to time-series have been developed in order to overcome these limitations; other challenges associated with the structure of time-series, such as multiple variables, serial correlation each
  3. Goal is to put them all together in clusters
  4. Input in customer segmentation Mention about chicken segmentation Behavior based on purchases, bank transactions, energy, other utilities usage/consumption, social networks – who is connected to who
  5. Hierarchy of classes  dendrogram
  6. Provide quantification for the dissimilarity between two time-series The classification of objects, into clusters, requires some methods for measuring the distance or the (dis)similarity between the objects The term proximity is used to refer to either similarity or dissimilarity. Frequently, the term distance is used as a synonym for dissimilarity.
  7. https://en.wikipedia.org/wiki/Taxicab_geometry The distance between two points measured along axes at right angles. Also known as Manhattan length, rectilinear distance, Minkowski's L1 distance, L1 norm, taxi cab metric, snake distance, city block distance
  8. Correlation measures are only useful if/when the relationship between attributes is linear. So if the correlation is 0, then there is no linear relationship between the two data objects. http://cs.tsu.edu/ghemri/CS497/ClassNotes/ML/Similarity%20Measures.pdf Be ready to explain pearson and spearman
  9. When time series have different lengths One of the most used measure of the similarity between two time series Originally designed to treat automatic speech recognition Optimal global alignment between two time series, exploiting temporal distortions between them Designed especially for time series analysis Ignore shifts in time dimension Ignore speeds of two time series How is it calculated?
  10. When time series have different lengths One of the most used measure of the similarity between two time series Originally designed to treat automatic speech recognition Optimal global alignment between two time series, exploiting temporal distortions between them Designed especially for time series analysis Ignore shifts in time dimension Ignore speeds of two time series How is it calculated?
  11. https://www.datanovia.com/en/lessons/clustering-distance-measures/ For example, correlation-based distance is often used in gene expression data analysis. Correlation-based distance considers two objects to be similar if their features are highly correlated, even though the observed values may be far apart in terms of Euclidean distance.  For most clustering package, Euclidean is default. If we want to identify clusters of observations with the same overall profiles regardless of their magnitudes, then correlation-based distance If correlation, Pearson’s correlation is quite sensitive to outliers Commonly used in gene expression data analysis marketing, if we want to identify group of shoppers with the same preference in term of items, regardless of the volume of items they bought.
  12. Hierarchy of classes  dendrogram
  13. Gamma is the optimization function. A is the alignment function
  14. Hierarchy of classes  dendrogram
  15. Hierarchy of classes  dendrogram
  16. Clusters are defines beforehand
  17. Compute distance between point and centroids and keep the minimum Predict For each data point calculate the distance from both centroids and the data point is assigned to the cluster with the min distance Move centroids in the point where the is the mean distance so that they are in the center of the cluster
  18. Compute distance between point and centroids and keep the minimum Predict For each data point calculate the distance from both centroids and the data point is assigned to the cluster with the min distance Move centroids in the point where the is the mean distance so that they are in the center of the cluster
  19. Compute distance between point and centroids and keep the minimum Predict For each data point calculate the distance from both centroids and the data point is assigned to the cluster with the min distance Move centroids in the point where the is the mean distance so that they are in the center of the cluster
  20. Hierarchy of classes  dendrogram
  21. Each character has each one cluster Input = genetic code Selma + Patty  twins Lisa + Merge  mother and daughter (less similarity because the share genetic code with Homer Simpson) Selma + patty sisters of Marge Number of clusters and order of clustering
  22. A: number of time series assigned to same cluster and belong to the same class B: number of time series assigned to different cluster and belong to the different class C: number of time series assigned to different cluster and belong to the same class D: number of time series assigned to same cluster and belong to the different class