Clustering Methods via the EM Algorithm
2014.07.10
Sanghyuk Chun
Network Intelligence and Analysis Lab
Machine Learning and Unsupervised Learning
• Machine Learning
  • Training data
  • Learning model
• Unsupervised Learning
  • Training data without labels
  • Input data: $D = \{x_1, x_2, \ldots, x_N\}$
  • Most unsupervised learning problems try to find hidden structure in unlabeled data
  • Examples: Clustering, Dimensionality Reduction (PCA, LDA), …
Unsupervised Learning and Clustering
• Clustering
  • Grouping objects in such a way that objects in the same group are more similar to each other than to objects in other groups
  • Input: a set of objects (or data points) without group information
  • Output: a cluster index for each object
  • Usage: customer segmentation, image segmentation, …
[Figure: input data points → clustering algorithm → clustered output]
K-means Clustering
Introduction
Optimization
K-means Clustering
• Intuition: data points in the same cluster are closer to each other than to points in other clusters
• Goal: minimize the distance between data points in the same cluster
• Objective function:
  • $J = \sum_{n=1}^{N} \sum_{k=1}^{K} r_{nk} \| \mathbf{x}_n - \boldsymbol{\mu}_k \|^2$
  • where N is the number of data points and K is the number of clusters
  • $r_{nk} \in \{0, 1\}$ is an indicator variable describing which of the K clusters the data point $\mathbf{x}_n$ is assigned to
  • $\boldsymbol{\mu}_k$ is a prototype associated with the k-th cluster
  • Eventually $\boldsymbol{\mu}_k$ turns out to equal the center (mean) of cluster k
K-means Clustering – Optimization
• Objective function:
  • $\underset{\{r_{nk}, \boldsymbol{\mu}_k\}}{\arg\min} \sum_{n=1}^{N} \sum_{k=1}^{K} r_{nk} \| \mathbf{x}_n - \boldsymbol{\mu}_k \|^2$
• This problem can be solved through an iterative procedure:
  • Step 1: minimize J with respect to the $r_{nk}$, keeping the $\boldsymbol{\mu}_k$ fixed
  • Step 2: minimize J with respect to the $\boldsymbol{\mu}_k$, keeping the $r_{nk}$ fixed
  • Repeat Steps 1 and 2 until convergence
• Does it always converge?
Optional – Biconvex Optimization
• Biconvex optimization is a generalization of convex optimization in which the objective function and the constraint set can be biconvex
• $f(x, y)$ is biconvex if, fixing $x$, $f_x(y) = f(x, y)$ is convex over $Y$, and, fixing $y$, $f_y(x) = f(x, y)$ is convex over $X$
• One way to solve a biconvex optimization problem is to iteratively solve the corresponding convex problems
  • This does not guarantee the global optimum
  • But it always converges to a local optimum
K-means Clustering – Optimization
• $\underset{\{r_{nk}, \boldsymbol{\mu}_k\}}{\arg\min} \sum_{n=1}^{N} \sum_{k=1}^{K} r_{nk} \| \mathbf{x}_n - \boldsymbol{\mu}_k \|^2$
• Step 1: minimize J with respect to the $r_{nk}$, keeping the $\boldsymbol{\mu}_k$ fixed
  • $r_{nk} = \begin{cases} 1 & \text{if } k = \arg\min_j \| \mathbf{x}_n - \boldsymbol{\mu}_j \|^2 \\ 0 & \text{otherwise} \end{cases}$
• Step 2: minimize J with respect to the $\boldsymbol{\mu}_k$, keeping the $r_{nk}$ fixed
  • Setting the derivative with respect to $\boldsymbol{\mu}_k$ to zero gives
  • $2 \sum_n r_{nk} (\mathbf{x}_n - \boldsymbol{\mu}_k) = 0$
  • $\boldsymbol{\mu}_k = \frac{\sum_n r_{nk} \mathbf{x}_n}{\sum_n r_{nk}}$
  • $\boldsymbol{\mu}_k$ is equal to the mean of all the data points assigned to cluster k (a code sketch of the full loop follows)
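To make the two alternating steps concrete, here is a minimal NumPy sketch of K-means (the function name and initialization scheme are our own choices, not from the slides; it assumes Euclidean distance and initializes the prototypes from randomly chosen data points):

```python
import numpy as np

def kmeans(X, K, n_iters=100, seed=0):
    """Minimal K-means: alternate the two minimization steps until assignments stop changing."""
    rng = np.random.default_rng(seed)
    N = X.shape[0]
    # Initialize prototypes mu_k from K randomly chosen data points (one common choice).
    mu = X[rng.choice(N, size=K, replace=False)].astype(float)
    assign = -np.ones(N, dtype=int)
    for _ in range(n_iters):
        # Step 1: r_nk = 1 iff k = argmin_j ||x_n - mu_j||^2 (assign to nearest prototype).
        dists = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)  # shape (N, K)
        new_assign = dists.argmin(axis=1)
        if np.array_equal(new_assign, assign):
            break  # converged: assignments no longer change
        assign = new_assign
        # Step 2: mu_k = mean of the data points currently assigned to cluster k.
        for k in range(K):
            members = X[assign == k]
            if len(members) > 0:  # keep the old prototype if a cluster went empty
                mu[k] = members.mean(axis=0)
    return assign, mu

# Example usage: three well-separated blobs.
X = np.concatenate([np.random.randn(50, 2) + c for c in ([0, 0], [5, 5], [0, 5])])
labels, centers = kmeans(X, K=3)
```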
K-means Clustering – Conclusion
• Advantages of K-means clustering
  • Easy to implement (kmeans in Matlab, kcluster in Python)
  • In practice, it works well
• Disadvantages of K-means clustering
  • It can converge to a local optimum
  • Computing the Euclidean distance for every point is expensive
    • Solution: batch K-means
  • Euclidean distance is not robust to outliers
    • Solution: K-medoids algorithms (use a different metric)
Mixture of Gaussians
Mixture Model
EM Algorithm
EM for Gaussian Mixtures
Mixture of Gaussians
• Assumption: there are k components $\{c_i\}_{i=1}^{k}$
• Component $c_i$ has an associated mean vector $\mu_i$
• Each component generates data from a Gaussian with mean $\mu_i$ and covariance matrix $\Sigma_i$
[Figure: five Gaussian components with means $\mu_1, \ldots, \mu_5$]
Gaussian Mixture Model
• Represent the model as a linear combination of Gaussians
• Probability density function of a GMM:
  • $p(x) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k)$
  • $\mathcal{N}(x \mid \mu_k, \Sigma_k) = \frac{1}{(2\pi)^{d/2} |\Sigma_k|^{1/2}} \exp\left\{ -\frac{1}{2} (x - \mu_k)^\top \Sigma_k^{-1} (x - \mu_k) \right\}$
  • This is called a mixture of Gaussians, or a Gaussian Mixture Model
• Each Gaussian density is called a component of the mixture and has its own mean $\mu_k$ and covariance $\Sigma_k$
• The parameters $\pi_k$ are called mixing coefficients ($\sum_k \pi_k = 1$); a small density sketch follows
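To make the density concrete, here is a small sketch that evaluates the mixture density $p(x)$ from given mixing coefficients, means, and covariances (the helper name is ours; SciPy's multivariate_normal is assumed available):

```python
import numpy as np
from scipy.stats import multivariate_normal

def gmm_pdf(x, pis, mus, Sigmas):
    """Evaluate p(x) = sum_k pi_k N(x | mu_k, Sigma_k)."""
    return sum(pi * multivariate_normal.pdf(x, mean=mu, cov=Sigma)
               for pi, mu, Sigma in zip(pis, mus, Sigmas))

# Example: a two-component mixture in 2D with equal weights.
pis = [0.5, 0.5]
mus = [np.zeros(2), np.array([3.0, 3.0])]
Sigmas = [np.eye(2), 2.0 * np.eye(2)]
print(gmm_pdf(np.array([1.0, 1.0]), pis, mus, Sigmas))
```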
Clustering using a Mixture Model
• $p(x) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k)$, where $\sum_k \pi_k = 1$
• Input:
  • The training set $\{x_i\}_{i=1}^{N}$
  • Number of clusters: k
• Goal: model this data using a mixture of Gaussians
  • Mixing coefficients $\pi_1, \pi_2, \ldots, \pi_k$
  • Means and covariances: $\mu_1, \mu_2, \ldots, \mu_k; \Sigma_1, \Sigma_2, \ldots, \Sigma_k$
Maximum Likelihood of a GMM
• $p(x \mid G) = p(x \mid \pi_1, \mu_1, \ldots) = \sum_i p(x \mid c_i)\, p(c_i) = \sum_i \pi_i \mathcal{N}(x \mid \mu_i, \Sigma_i)$
• $p(x_1, x_2, \ldots, x_N \mid G) = \prod_i p(x_i \mid G)$
• The log-likelihood function is given by
  • $\ln p(\mathbf{X} \mid \boldsymbol{\pi}, \boldsymbol{\mu}, \boldsymbol{\Sigma}) = \sum_{n=1}^{N} \ln \sum_{k=1}^{K} \pi_k \, \mathcal{N}(\mathbf{x}_n \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k)$
• Goal: find the parameters which maximize the log-likelihood
• Problem: the maximum likelihood solution is hard to compute in closed form
• Solution: use the EM algorithm (a numerical log-likelihood sketch follows)
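While the maximizer has no closed form, the log-likelihood itself is easy to evaluate numerically. A sketch (helper name ours; SciPy assumed) that computes it stably via log-sum-exp:

```python
import numpy as np
from scipy.stats import multivariate_normal
from scipy.special import logsumexp

def gmm_log_likelihood(X, pis, mus, Sigmas):
    """ln p(X | pi, mu, Sigma) = sum_n ln sum_k pi_k N(x_n | mu_k, Sigma_k)."""
    # log_probs[n, k] = ln pi_k + ln N(x_n | mu_k, Sigma_k)
    log_probs = np.column_stack([
        np.log(pi) + multivariate_normal.logpdf(X, mean=mu, cov=Sigma)
        for pi, mu, Sigma in zip(pis, mus, Sigmas)
    ])
    # Log-sum-exp over components avoids underflow in the inner sum.
    return logsumexp(log_probs, axis=1).sum()
```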
EM (Expectation-Maximization) Algorithm
• The EM algorithm is an iterative procedure for finding the MLE
  • An expectation (E) step creates a function for the expectation of the log-likelihood evaluated using the current estimate of the parameters
  • A maximization (M) step computes parameters maximizing the expected log-likelihood found in the E step
  • These parameter estimates are then used to determine the distribution of the latent variables in the next E step
• EM always converges to one of the local optima
K-means Revisited: EM and K-means
• $\underset{\{r_{nk}, \boldsymbol{\mu}_k\}}{\arg\min} \sum_{n=1}^{N} \sum_{k=1}^{K} r_{nk} \| \mathbf{x}_n - \boldsymbol{\mu}_k \|^2$
• E-step: minimize J with respect to the $r_{nk}$, keeping the $\boldsymbol{\mu}_k$ fixed
  • $r_{nk} = \begin{cases} 1 & \text{if } k = \arg\min_j \| \mathbf{x}_n - \boldsymbol{\mu}_j \|^2 \\ 0 & \text{otherwise} \end{cases}$
• M-step: minimize J with respect to the $\boldsymbol{\mu}_k$, keeping the $r_{nk}$ fixed
  • $\boldsymbol{\mu}_k = \frac{\sum_n r_{nk} \mathbf{x}_n}{\sum_n r_{nk}}$
Latent Variable for GMM
• Let $z_k$ be a Bernoulli random variable with probability $\pi_k$
  • $p(z_k = 1) = \pi_k$, where $\sum_k z_k = 1$ and $\sum_k \pi_k = 1$
• Because z uses a 1-of-K representation, this distribution has the form
  • $p(z) = \prod_{k=1}^{K} \pi_k^{z_k}$
• Similarly, the conditional distribution of x given a particular value of z is a Gaussian
  • $p(x \mid z) = \prod_{k=1}^{K} \mathcal{N}(x \mid \mu_k, \Sigma_k)^{z_k}$
Latent Variable for GMM
• The joint distribution is given by $p(x, z) = p(z)\, p(x \mid z)$
  • $p(x) = \sum_z p(z)\, p(x \mid z) = \sum_k \pi_k \mathcal{N}(x \mid \mu_k, \Sigma_k)$
• Thus the marginal distribution of x is a Gaussian mixture of the above form
• Now we are able to work with the joint distribution instead of the marginal distribution (a sampling sketch follows)
• Graphical representation of a GMM for a set of N i.i.d. data points $\{x_n\}$ with corresponding latent variables $\{z_n\}$, where $n = 1, \ldots, N$
[Figure: plate diagram with latent $\mathbf{z}_n$ and observed $\mathbf{x}_n$ inside a plate of size N, governed by parameters $\boldsymbol{\pi}, \boldsymbol{\mu}, \boldsymbol{\Sigma}$]
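The latent-variable view also gives a generative recipe: sample $z \sim p(z)$, then $x \sim p(x \mid z)$. A minimal sketch of this ancestral sampling (function name ours; NumPy assumed):

```python
import numpy as np

def sample_gmm(n_samples, pis, mus, Sigmas, seed=0):
    """Ancestral sampling: z ~ Categorical(pi), then x ~ N(mu_z, Sigma_z)."""
    rng = np.random.default_rng(seed)
    zs = rng.choice(len(pis), size=n_samples, p=pis)   # 1-of-K latent variables
    xs = np.stack([rng.multivariate_normal(mus[z], Sigmas[z]) for z in zs])
    return xs, zs
```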
EM for Gaussian Mixtures (E-step)
• Conditional probability of z given x
• From Bayes' theorem,
  • $\gamma(z_k) \equiv p(z_k = 1 \mid \mathbf{x}) = \frac{p(z_k = 1)\, p(\mathbf{x} \mid z_k = 1)}{\sum_{j=1}^{K} p(z_j = 1)\, p(\mathbf{x} \mid z_j = 1)} = \frac{\pi_k \mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k)}{\sum_{j=1}^{K} \pi_j \mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}_j, \boldsymbol{\Sigma}_j)}$
• $\gamma(z_k)$ can also be viewed as the responsibility that component k takes for 'explaining' the observation x
EM for Gaussian Mixtures (M-step)
• Likelihood function for a GMM:
  • $\ln p(\mathbf{X} \mid \boldsymbol{\pi}, \boldsymbol{\mu}, \boldsymbol{\Sigma}) = \sum_{n=1}^{N} \ln \sum_{k=1}^{K} \pi_k \, \mathcal{N}(\mathbf{x}_n \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k)$
• Setting the derivatives of the log-likelihood with respect to the means $\mu_k$ of the Gaussian components to zero, we obtain
  • $\mu_k = \frac{1}{N_k} \sum_{n=1}^{N} \gamma(z_{nk})\, \mathbf{x}_n$, where $N_k = \sum_{n=1}^{N} \gamma(z_{nk})$
EM for Gaussian Mixtures (M-step)
• Setting the derivatives of the log-likelihood with respect to $\Sigma_k$ to zero, we obtain
  • $\boldsymbol{\Sigma}_k = \frac{1}{N_k} \sum_{n=1}^{N} \gamma(z_{nk}) (\mathbf{x}_n - \mu_k)(\mathbf{x}_n - \mu_k)^\top$
• Maximizing the likelihood with respect to the mixing coefficients $\pi_k$ using a Lagrange multiplier, we obtain
  • $\ln p(\mathbf{X} \mid \boldsymbol{\pi}, \boldsymbol{\mu}, \boldsymbol{\Sigma}) + \lambda \left( \sum_{k=1}^{K} \pi_k - 1 \right)$
  • $\pi_k = \frac{N_k}{N}$
EM for Gaussian Mixtures
• $\mu_k, \Sigma_k, \pi_k$ do not constitute a closed-form solution for the parameters of the mixture model, because the responsibilities $\gamma(z_{nk})$ depend on those parameters in a complex way:
  • $\gamma(z_{nk}) = \frac{\pi_k \mathcal{N}(\mathbf{x}_n \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k)}{\sum_{j=1}^{K} \pi_j \mathcal{N}(\mathbf{x}_n \mid \boldsymbol{\mu}_j, \boldsymbol{\Sigma}_j)}$
• In the EM algorithm for a GMM, $\gamma(z_{nk})$ and the parameters are iteratively optimized
  • In the E step, the responsibilities (the posterior probabilities) are evaluated using the current values of the parameters
  • In the M step, the means, covariances, and mixing coefficients are re-estimated using the previous results
EM for Gaussian Mixtures
• Initialize the means $\mu_k$, covariances $\Sigma_k$, and mixing coefficients $\pi_k$, and evaluate the initial value of the log-likelihood
• E step: evaluate the responsibilities using the current parameters
  • $\gamma(z_{nk}) = \frac{\pi_k \mathcal{N}(\mathbf{x}_n \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k)}{\sum_{j=1}^{K} \pi_j \mathcal{N}(\mathbf{x}_n \mid \boldsymbol{\mu}_j, \boldsymbol{\Sigma}_j)}$
• M step: re-estimate the parameters using the current responsibilities
  • $\mu_k^{\text{new}} = \frac{1}{N_k} \sum_{n=1}^{N} \gamma(z_{nk})\, \mathbf{x}_n$
  • $\boldsymbol{\Sigma}_k^{\text{new}} = \frac{1}{N_k} \sum_{n=1}^{N} \gamma(z_{nk}) (\mathbf{x}_n - \mu_k^{\text{new}})(\mathbf{x}_n - \mu_k^{\text{new}})^\top$
  • $\pi_k^{\text{new}} = \frac{N_k}{N}$
  • where $N_k = \sum_{n=1}^{N} \gamma(z_{nk})$
• Repeat the E step and M step until convergence (see the sketch below)
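Putting the E and M steps together, a compact sketch of EM for a GMM (a minimal illustration under the updates above, not a production implementation; it assumes SciPy and does not regularize covariances against degeneracy):

```python
import numpy as np
from scipy.stats import multivariate_normal

def em_gmm(X, K, n_iters=100, tol=1e-6, seed=0):
    """EM for a Gaussian mixture: alternate responsibilities (E) and parameter updates (M)."""
    rng = np.random.default_rng(seed)
    N, d = X.shape
    # Initialization: uniform weights, means at random data points, identity covariances.
    pis = np.full(K, 1.0 / K)
    mus = X[rng.choice(N, size=K, replace=False)].astype(float)
    Sigmas = np.stack([np.eye(d) for _ in range(K)])
    prev_ll = -np.inf
    for _ in range(n_iters):
        # E step: gamma[n, k] = pi_k N(x_n | mu_k, Sigma_k) / sum_j pi_j N(x_n | mu_j, Sigma_j)
        dens = np.column_stack([pis[k] * multivariate_normal.pdf(X, mus[k], Sigmas[k])
                                for k in range(K)])
        gamma = dens / dens.sum(axis=1, keepdims=True)
        # M step: closed-form updates using the current responsibilities.
        Nk = gamma.sum(axis=0)                      # effective number of points per component
        mus = (gamma.T @ X) / Nk[:, None]
        for k in range(K):
            diff = X - mus[k]
            Sigmas[k] = (gamma[:, k, None] * diff).T @ diff / Nk[k]
        pis = Nk / N
        # Convergence check on the log-likelihood (computed with the pre-update parameters).
        ll = np.log(dens.sum(axis=1)).sum()
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    return pis, mus, Sigmas, gamma
```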
Relationship between the K-means Algorithm and GMM
• We can derive the K-means algorithm as a particular limit of EM for the Gaussian Mixture Model
• Consider a Gaussian mixture model whose covariance matrices are given by $\varepsilon I$, where $\varepsilon$ is a variance parameter and $I$ is the identity matrix
• If we consider the limit $\varepsilon \to 0$, the expected complete-data log-likelihood of the GMM becomes
  • $\mathbb{E}_z[\ln p(X, Z \mid \mu, \Sigma, \pi)] \to -\frac{1}{2} \sum_n \sum_k r_{nk} \| \mathbf{x}_n - \boldsymbol{\mu}_k \|^2 + C$
• Thus we see that in this limit, maximizing the expected complete-data log-likelihood is equivalent to the K-means algorithm (numerical illustration below)
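A quick numerical illustration of this limit (a sketch under the stated $\Sigma_k = \varepsilon I$ assumption, with equal mixing coefficients): as $\varepsilon$ shrinks, the responsibilities collapse to the 1-of-K hard assignments used by K-means.

```python
import numpy as np

# One point and two component means; shared covariance eps * I.
x = np.array([1.0, 0.0])
mus = [np.zeros(2), np.array([3.0, 0.0])]

for eps in [10.0, 1.0, 0.01]:
    # With Sigma_k = eps*I and equal pi_k, gamma_k ∝ exp(-||x - mu_k||^2 / (2 eps)).
    logits = np.array([-np.sum((x - mu) ** 2) / (2 * eps) for mu in mus])
    gamma = np.exp(logits - logits.max())
    gamma /= gamma.sum()
    print(f"eps={eps:5}: responsibilities = {np.round(gamma, 4)}")
# As eps -> 0, gamma approaches the hard assignment (1, 0), matching K-means.
```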