Presentation material for the reading club on Pattern Recognition and Machine Learning by Bishop.
The sections covered are:
- K-means clustering and its application to image compression
- Introduction of latent variables
- Mixtures of Gaussians and their parameter updates via the EM algorithm
2. Today's topics
1. K-means Clustering
  1. Clustering Problem
  2. K-means Clustering
  3. Application for Image Compression
2. Mixtures of Gaussians
  1. Introduction of latent variables
  2. Problem of ML estimates
  3. EM-algorithm for Mixtures of Gaussians
(July 16, 2014, PRML 9.1-9.2, Shinichi TAMURA)
5. Clustering Problem
An unsupervised machine learning problem: divide the data into groups (= clusters) such that
✓ similar data → same group
✓ dissimilar data → different group
Minimize $\sum_{n=1}^{N} \| x_n - \mu_{k(n)} \|^2$, where $\mu_{k(n)}$ is the center of the cluster to which $x_n$ is assigned.
9. Clustering Problem
Given a data set $X = \{x_1, \dots, x_N\}$ and the number of clusters $K$, let $\mu_k$ be the cluster representatives and $r_{nk}$ the assignment indicators ($r_{nk} = 1$ if $x_n \in C_k$):
$$\text{Minimize} \quad J = \sum_{n=1}^{N} \sum_{k=1}^{K} r_{nk} \, \| x_n - \mu_k \|^2.$$
Here, $J$ is called the "distortion measure".
12. K-means Clustering
How to solve this? $\mu_k$ and $r_{nk}$ depend on each other → no closed-form solution → use an iterative algorithm!
15. K-means Clustering
Strategy: $\mu_k$ and $r_{nk}$ can't be updated simultaneously → update them one by one.
17. K-means Clustering
Update of $r_{nk}$ (assignment): since each assignment can be determined independently, $J$ is minimized by assigning each $x_n$ to the nearest $\mu_k$. Therefore,
$$r_{nk} = \begin{cases} 1 & \text{if } k = \arg\min_j \| x_n - \mu_j \|^2, \\ 0 & \text{otherwise.} \end{cases}$$
19. K-means Clustering
Update of $\mu_k$ (parameter estimation): the optimal $\mu_k$ is obtained by setting the derivative to 0:
$$\frac{\partial}{\partial \mu_k} \sum_{n=1}^{N} \sum_{k'=1}^{K} r_{nk'} \| x_n - \mu_{k'} \|^2 = 0 \iff 2 \sum_{n=1}^{N} r_{nk} (x_n - \mu_k) = 0$$
$$\therefore\ \mu_k = \frac{\sum_{n=1}^{N} r_{nk} \, x_n}{\sum_{n=1}^{N} r_{nk}} = \frac{1}{N_k} \sum_{x_n \in C_k} x_n.$$
$\mu_k$ is the mean of the cluster, and the cost function $J$ corresponds to the sum of the within-cluster variances!
25. K-means Clustering
K-means algorithm:
1. Initialize $\mu_k$ and $r_{nk}$.
2. Repeat the following two steps until convergence:
  i) Assign each $x_n$ to the closest $\mu_k$ (E step).
  ii) Update each $\mu_k$ to the mean of its cluster (M step).
A minimal sketch in code is given below.
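As a minimal sketch of the two-step scheme above (my own Python/NumPy illustration, not code from the slides; initializing the means by sampling K data points is one common choice among several):

```python
import numpy as np

def kmeans(X, K, n_iter=100, seed=0):
    """Minimal K-means sketch: alternate the E step (assignment)
    and the M step (mean update) until the means stop moving."""
    rng = np.random.default_rng(seed)
    mu = X[rng.choice(len(X), size=K, replace=False)]  # 1. initialize mu_k
    for _ in range(n_iter):
        # E step: assign each x_n to the closest mu_k
        d2 = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)  # (N, K)
        r = d2.argmin(axis=1)
        # M step: update each mu_k to the mean of its cluster
        new_mu = np.array([X[r == k].mean(axis=0) if np.any(r == k) else mu[k]
                           for k in range(K)])
        if np.allclose(new_mu, mu):  # J can no longer decrease
            break
        mu = new_mu
    return mu, r
```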
28. K-means Clustering
Convergence property: neither step ever increases $J$, so every iteration gives a result at least as good as the last. Since the number of possible assignments $r_{nk}$ is finite, the algorithm converges after finitely many iterations.
30. K-means Clustering
Demo of the algorithm (figure frames omitted: the E step and M step alternate until the assignments and means stop changing).
42. K-means Clustering
Computational cost:
E step ... comparison of every data point $x_n$ with every cluster mean $\mu_k$ → O(KN). Not good; this can be improved with k-d trees, the triangle inequality, etc.
M step ... calculation of the mean for every cluster → O(N).
47. K-means Clustering
Here, two variations will be introduced:
1. On-line version
2. General dissimilarity
49. K-means Clustering
[Variation] 1. On-line version: the case where data points are observed one at a time.
→ Apply the Robbins-Monro algorithm:
$$\mu_k^{\text{new}} = \mu_k^{\text{old}} + \eta_n \, (x_n - \mu_k^{\text{old}}),$$
where $\eta_n$ is a learning rate that decreases with the iteration. A sketch follows below.
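A minimal sketch of one on-line update (my own Python illustration; the choice $\eta_n = 1/N_k$, the reciprocal of the count of points seen so far in cluster $k$, is one standard decreasing schedule, not something the slide specifies):

```python
import numpy as np

def online_kmeans_update(mu, counts, x):
    """One Robbins-Monro style update for a single new datum x.

    mu:     (K, D) current cluster means
    counts: length-K array of points assigned so far per cluster
    """
    k = int(((mu - x) ** 2).sum(axis=1).argmin())  # nearest mean
    counts[k] += 1
    eta = 1.0 / counts[k]                          # decreasing learning rate
    mu[k] += eta * (x - mu[k])                     # mu_new = mu_old + eta (x - mu_old)
    return k
```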
55. K-means Clustering
[Variation] 2. General dissimilarity: Euclidean distance is not
✓ appropriate for categorical data, etc.
✓ robust to outliers.
→ Use a general dissimilarity measure $V(x, x')$.
E step ... no difference.
M step ... $J$ is no longer guaranteed to be easy to minimize.
59. K-means Clustering
[Variation] 2. General dissimilarity: to make the M step easy, restrict $\mu_k$ to a vector chosen from $\{x_n\}$ → a solution can be obtained with a finite number of comparisons:
$$\mu_k = \arg\min_{x_n \in C_k} \sum_{x_{n'} \in C_k} V(x_n, x_{n'}).$$
A sketch of this update (essentially the K-medoids M step) follows below.
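A minimal sketch (my own Python illustration; the L1 dissimilarity in the example is an assumption chosen to show the robustness point, and restricting the representative to cluster members is exactly the finite comparison above):

```python
import numpy as np

def medoid(cluster, V):
    """Pick the cluster member minimizing the total dissimilarity
    to the other members: a finite number of comparisons."""
    costs = [sum(V(x, y) for y in cluster) for x in cluster]
    return cluster[int(np.argmin(costs))]

# Example with the L1 distance, which is less sensitive to the outlier:
pts = np.array([[0.0, 0.0], [1.0, 0.0], [10.0, 0.0]])
print(medoid(pts, lambda x, y: float(np.abs(x - y).sum())))  # -> [1. 0.]
```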
63. Application for Image Compression
The K-means algorithm can be applied to image compression and segmentation.
Basic idea: treat similar pixels as the same one, replacing the original pixel values by the cluster centers (the palette, or code-book vectors). This is so-called "vector quantization".
69. Application for Image Compression
Demo (figure frames omitted).
72. Application for Image Compression
Compression rate:
Original image ... 24N bits (N = number of pixels, at 24 bits per pixel).
Compressed image ... 24K + N log₂K bits (K = palette size: 24 bits per code-book vector, plus a log₂K-bit index per pixel).
With N ≈ 1M and K = 10 (a 4-bit index per pixel), this is about 16.7% of the original size. The arithmetic is sketched below.
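The arithmetic as a small sketch (my own illustration; it assumes each pixel index is stored in ⌈log₂K⌉ whole bits, which is what makes K = 10 cost 4 bits per pixel):

```python
import math

def compression_ratio(n_pixels, k, bits_per_pixel=24):
    """Compressed size / original size for palette-based compression:
    24K bits for the palette plus ceil(log2 K) bits per pixel index."""
    original = bits_per_pixel * n_pixels
    compressed = bits_per_pixel * k + n_pixels * math.ceil(math.log2(k))
    return compressed / original

print(compression_ratio(1_000_000, 10))  # ~0.167, the 16.7% quoted above
```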
78. Introduction of Latent Variables
In K-means, all assignments are "all or nothing": every point assigned to a cluster is treated the same. Are these "hard" assignments appropriate?
→ We want to introduce a "soft" (probabilistic) assignment.
81. Introduction of Latent Variables
Introduce a random variable $z$ with a 1-of-K representation → it controls the unobserved "states". Once the state is determined, $x$ is drawn from the Gaussian of that state:
$$p(x \mid z_k = 1) = \mathcal{N}(x \mid \mu_k, \Sigma_k).$$
(Graphical representation: $z \to x$.)
85. Introduction of Latent Variables
Here the distribution over $x$ is
$$p(x) = \sum_z p(z)\,p(x \mid z) = \sum_{k=1}^{K} p(z_k = 1)\,p(x \mid z_k = 1) = \sum_{k=1}^{K} \pi_k \mathcal{N}(x \mid \mu_k, \Sigma_k),$$
where the sum over $z$ reduces to a sum over $k$ because $z$ has a 1-of-K representation. This is a mixture of Gaussians! A sketch of evaluating this density follows.
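A direct sketch of evaluating the mixture density (my own illustration using SciPy's multivariate normal pdf; the function name and argument layout are hypothetical):

```python
import numpy as np
from scipy.stats import multivariate_normal

def gmm_density(X, pis, mus, Sigmas):
    """p(x) = sum_k pi_k N(x | mu_k, Sigma_k), evaluated at the rows of X."""
    return sum(pi * multivariate_normal.pdf(X, mean=mu, cov=S)
               for pi, mu, S in zip(pis, mus, Sigmas))
```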
89. Introduction of Latent Variables
Estimate (or "explain") which state $x$ came from:
$$\gamma(z_k) \equiv p(z_k = 1 \mid x) = \frac{p(z_k = 1)\,p(x \mid z_k = 1)}{\sum_j p(z_j = 1)\,p(x \mid z_j = 1)} = \frac{\pi_k \mathcal{N}(x \mid \mu_k, \Sigma_k)}{\sum_j \pi_j \mathcal{N}(x \mid \mu_j, \Sigma_j)}.$$
$\gamma(z_k)$ is the posterior, $\pi_k$ the prior, and $\mathcal{N}(x \mid \mu_k, \Sigma_k)$ the likelihood. This value is also called the "responsibility". A vectorized sketch follows.
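A vectorized sketch of the responsibilities (my own illustration; the numerator is prior times likelihood, then we normalize across components):

```python
import numpy as np
from scipy.stats import multivariate_normal

def responsibilities(X, pis, mus, Sigmas):
    """gamma_{nk} = pi_k N(x_n|mu_k,Sigma_k) / sum_j pi_j N(x_n|mu_j,Sigma_j)."""
    num = np.column_stack([pi * multivariate_normal.pdf(X, mean=mu, cov=S)
                           for pi, mu, S in zip(pis, mus, Sigmas)])  # (N, K)
    return num / num.sum(axis=1, keepdims=True)
```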
96. Introduction of Latent Variables
Example of Gaussian mixtures (three-panel figure omitted: (a) samples with no state information, (b) samples coloured by true state, (c) samples coloured by responsibility).
99. Problems of ML estimates
ML estimates of mixtures of Gaussians have two problems:
i. Presence of singularities
ii. Identifiability
101. Problems of ML estimates
i) Presence of singularities: what if a mean collides with a data point, i.e. $\exists j, m:\ \mu_j = x_m$? Then the likelihood can be made arbitrarily large by letting $\sigma_j \to 0$:
$$L \propto \Big( \frac{1}{\sigma_j} + \sum_{k \neq j} p_{k,m} \Big) \prod_{n \neq m} \Big( \frac{1}{\sigma_j} \exp\Big( -\frac{(x_n - \mu_j)^2}{2\sigma_j^2} \Big) + \sum_{k \neq j} p_{k,n} \Big) \to \infty.$$
The first factor diverges ($1/\sigma_j \to \infty$), while each remaining factor stays positive: its exponential term goes to 0, but the other components' contributions $p_{k,n}$ remain > 0.
108. Problems of ML estimates
i) Presence of singularities: this doesn't occur for a single Gaussian,
$$L \propto \frac{1}{\sigma_j^N} \prod_{n \neq m} \exp\Big( -\frac{(x_n - \mu_j)^2}{2\sigma_j^2} \Big) \to 0,$$
because although $1/\sigma_j^N \to \infty$, the exponential factors go to 0 faster, so the whole product collapses. It doesn't occur in the Bayesian approach either. A hypothetical numeric illustration follows.
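A hypothetical 1-D illustration of the contrast (my own sketch with made-up data: one mixture component is pinned to a data point and its scale shrunk; the mixture log-likelihood grows without bound, while the single Gaussian's collapses):

```python
import numpy as np
from scipy.stats import norm

x = np.array([0.0, 1.0, 2.0, 3.0])
for sigma in [1.0, 0.1, 0.01, 0.001]:
    comp1 = norm.pdf(x, loc=x[0], scale=sigma)  # mu_j = x_m, sigma_j -> 0
    comp2 = norm.pdf(x, loc=1.5, scale=1.0)     # a second, fixed component
    mixture_ll = np.log(0.5 * comp1 + 0.5 * comp2).sum()
    single_ll = norm.logpdf(x, loc=x[0], scale=sigma).sum()
    print(sigma, mixture_ll, single_ll)  # mixture_ll grows, single_ll plunges
```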
114. Problems of ML estimates
ii) Identifiability: optimal solutions are not unique: given one solution, there are $K! - 1$ other equivalent solutions, obtained by permuting the component labels. This matters when interpreting the parameters, but not when the mixture is used only as a density model.
118. EM-algorithm for Gaussian Mixtures
The conditions for maximum likelihood are obtained from
$$\frac{\partial L}{\partial \mu_k} = 0, \qquad \frac{\partial L}{\partial \Sigma_k} = 0, \qquad \frac{\partial}{\partial \pi_k} \Big[ L + \lambda \Big( \sum_j \pi_j - 1 \Big) \Big] = 0,$$
where $L(\pi, \mu, \Sigma) = \sum_{n=1}^{N} \ln \sum_{k=1}^{K} \pi_k \mathcal{N}(x_n \mid \mu_k, \Sigma_k)$ and $\lambda$ is a Lagrange multiplier enforcing $\sum_j \pi_j = 1$.
119. EM-algorithm for Gaussian Mixtures
The conditions of ML:
$$\mu_k = \frac{1}{N_k} \sum_{n=1}^{N} \gamma_n(z_k)\,x_n, \qquad \Sigma_k = \frac{1}{N_k} \sum_{n=1}^{N} \gamma_n(z_k)(x_n - \mu_k)(x_n - \mu_k)^T, \qquad \pi_k = \frac{N_k}{N},$$
where $N_k = \sum_{n=1}^{N} \gamma_n(z_k)$. Note that the responsibilities $\gamma_n(z_k)$ have appeared.
121. EM-algorithm for Gaussian Mixtures
Recall that
$$\gamma_n(z_k) = \frac{\pi_k \mathcal{N}(x_n \mid \mu_k, \Sigma_k)}{\sum_j \pi_j \mathcal{N}(x_n \mid \mu_j, \Sigma_j)}.$$
The parameters appear inside the responsibilities, so there is no closed-form solution. Again, use an iterative algorithm!
125. EM-algorithm for Gaussian Mixtures
EM algorithm for Gaussian mixtures:
1. Initialize the parameters.
2. Repeat the following two steps until convergence:
  i) Calculate $\gamma_n(z_k) = \dfrac{\pi_k \mathcal{N}(x_n \mid \mu_k, \Sigma_k)}{\sum_j \pi_j \mathcal{N}(x_n \mid \mu_j, \Sigma_j)}$ (E step).
  ii) Update the parameters using the ML conditions above (M step).
A minimal sketch in code is given below.
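A minimal EM sketch for Gaussian mixtures (my own Python illustration of the E and M steps above; the small ridge added to the covariances is a practical guard against the singularities discussed earlier, not part of the slides):

```python
import numpy as np
from scipy.stats import multivariate_normal

def em_gmm(X, K, n_iter=100, seed=0):
    """Alternate the E step (responsibilities) and the M step
    (ML conditions for pi_k, mu_k, Sigma_k)."""
    rng = np.random.default_rng(seed)
    N, D = X.shape
    pis = np.full(K, 1.0 / K)                          # 1. initialize parameters
    mus = X[rng.choice(N, size=K, replace=False)]
    Sigmas = np.stack([np.cov(X.T) + 1e-6 * np.eye(D)] * K)
    for _ in range(n_iter):
        # E step: gamma_{nk} = pi_k N(x_n|mu_k,Sigma_k) / sum_j pi_j N(...)
        num = np.column_stack([pi * multivariate_normal.pdf(X, mean=mu, cov=S)
                               for pi, mu, S in zip(pis, mus, Sigmas)])
        gamma = num / num.sum(axis=1, keepdims=True)   # (N, K)
        # M step: re-estimate the parameters from the ML conditions
        Nk = gamma.sum(axis=0)
        pis = Nk / N
        mus = (gamma.T @ X) / Nk[:, None]
        for k in range(K):
            d = X - mus[k]
            Sigmas[k] = (gamma[:, k, None] * d).T @ d / Nk[k] + 1e-6 * np.eye(D)
    return pis, mus, Sigmas
```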
128. EM-algorithm for Gaussian Mixtures
Demo of the algorithm (figure frames omitted: the fitted Gaussians and responsibilities evolve over the EM iterations).
133. EM-algorithm for Gaussian Mixtures
Comparison with K-means (side-by-side figures omitted: EM for Gaussian mixtures vs. K-means clustering).
134. Today's topics
1. K-means Clustering
  1. Clustering Problem
  2. K-means Clustering
  3. Application for Image Compression
2. Mixtures of Gaussians
  1. Introduction of latent variables
  2. Problem of ML estimates
  3. EM-algorithm for Mixtures of Gaussians