Expectation Maximization and Mixture of Gaussians
“Recommend me some music!”
Discover groups of similar songs…

[Figure: My Music Collection, with songs placed by tempo: Bach Sonata #1 (bpm 60), Only my railgun (bpm 120), and a third song (bpm 125); a query song at bpm 90. The songs fall into two groups, one around bpm 60 and one around bpm 120.]
K-means: an unsupervised classifying method
1. Initialize K “means” $\mu_k$, one for each class.
   E.g. use random starting points, or choose K random points from the set.
   [Figure: data points with two initial means $\mu_1$ and $\mu_2$; K = 2]
2. Phase 1: Assign each point to the closest mean $\mu_k$
3. Phase 2: Update the means of the new clusters
   [Figure: each point carries a hard assignment, e.g. (1, 0)]
4. When the means do not change anymore, clustering is DONE.
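The four steps above can be sketched in a few lines of Python. This is a minimal one-dimensional version, not the author's implementation: the function name `kmeans_1d`, the tempo values, and the rule that an empty cluster keeps its old mean are all assumptions made for illustration.

```python
import random

def kmeans_1d(points, k, seed=0, max_iter=100):
    """Cluster 1-D points into k groups; returns (means, labels)."""
    rng = random.Random(seed)
    # Step 1: choose k random points from the set as the initial means.
    means = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Phase 1: assign each point to its closest mean.
        labels = [min(range(k), key=lambda j: abs(x - means[j])) for x in points]
        # Phase 2: update each mean to the average of its cluster.
        new_means = []
        for j in range(k):
            members = [x for x, lab in zip(points, labels) if lab == j]
            # Assumption: an empty cluster simply keeps its old mean.
            new_means.append(sum(members) / len(members) if members else means[j])
        # Step 4: stop when the means do not change anymore.
        if new_means == means:
            break
        means = new_means
    return means, labels

# Hypothetical tempos (bpm), loosely based on the music-collection example.
bpm = [58.0, 60.0, 62.0, 118.0, 120.0, 125.0]
means, labels = kmeans_1d(bpm, k=2)
print(sorted(means), labels)
```

Each point ends up with exactly one label, which is the hard assignment the figures show.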
• In K-means, a point can only have 1 class
• But what about points that lie in between groups? E.g. Jazz + Classical
The Famous “GMM”: Gaussian Mixture Model
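$p(X) = N(X \mid \mu, \Sigma)$, where $\mu$ is the mean and $\Sigma$ is the variance (the covariance matrix when X has several dimensions).

Gaussian == “Normal” distribution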
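$p(X) = N(X \mid \mu_1, \Sigma_1) + N(X \mid \mu_2, \Sigma_2)$

Example: [Figure: two Gaussians, each with its own mean and variance]

As written, this sum integrates to 2 rather than 1, which is why each Gaussian needs a weight.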
$p(X) = \pi_1 N(X \mid \mu_1, \Sigma_1) + \pi_2 N(X \mid \mu_2, \Sigma_2)$

Mixing coefficients: $\sum_{k=1}^{K} \pi_k = 1$

Example: $\pi_1 = 0.7$, $\pi_2 = 0.3$
$p(X) = \sum_{k=1}^{K} \pi_k N(X \mid \mu_k, \Sigma_k)$

Example: $K = 2$
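To make the mixture formula concrete, here is a small sketch that evaluates a one-dimensional mixture and checks numerically that it is still a valid density. The mixing coefficients 0.7 and 0.3 come from the slides; the means, variances, and function names are made up for illustration.

```python
import math

def normal_pdf(x, mu, var):
    """Density of a 1-D Gaussian N(x | mu, var), where var is the variance."""
    return math.exp(-(x - mu) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def mixture_pdf(x, pis, mus, variances):
    """p(x) = sum over k of pi_k * N(x | mu_k, var_k)."""
    return sum(p * normal_pdf(x, m, v) for p, m, v in zip(pis, mus, variances))

pis = [0.7, 0.3]            # mixing coefficients from the slide
mus = [60.0, 120.0]         # hypothetical means
variances = [25.0, 100.0]   # hypothetical variances

# Crude Riemann sum over [0, 250): the area under the mixture should be close to 1.
step = 0.1
area = sum(mixture_pdf(i * step, pis, mus, variances) * step for i in range(2500))
print(area)
```

Because the mixing coefficients sum to 1, the weighted sum of densities integrates to 1 as well.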
K-means is a classifier.
Parameter to fit to data:
    • Mean $\mu_k$

Mixture of Gaussians is a probability model. We can USE it as a “soft” classifier.
Parameters to fit to data:
    • Mean $\mu_k$
    • Covariance $\Sigma_k$
    • Mixing coefficient $\pi_k$
EM for GMM
1. Initialize means $\mu_k$
2. E Step: Assign each point to a cluster
3. M Step: Given clusters, refine mean $\mu_k$ of each cluster k
4. Stop when change in means is small

[Figure: each point carries a hard assignment, e.g. (1, 0)]
1. Initialize Gaussian* parameters: means $\mu_k$, covariances $\Sigma_k$ and mixing coefficients $\pi_k$
2. E Step: Assign each point $X_n$ an assignment score $\gamma(z_{nk})$ for each cluster k
3. M Step: Given scores, adjust $\mu_k$, $\pi_k$, $\Sigma_k$ for each cluster k
4. Evaluate likelihood. If likelihood or parameters converge, stop.

*There are K Gaussians

[Figure: each point carries a soft assignment, e.g. (0.5, 0.5)]
1. Initialize $\mu_k$, $\Sigma_k$, $\pi_k$, one for each Gaussian k

Tip! Use the K-means result to initialize:
    $\mu_k \leftarrow \mu_k$ (the K-means mean of cluster k)
    $\Sigma_k \leftarrow \mathrm{cov}(\mathrm{cluster}(k))$
    $\pi_k \leftarrow$ (number of points in k) / (total number of points)
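The initialization tip can be sketched as follows for one-dimensional data. The function name `init_from_clusters`, the sample points, and the labels are hypothetical, and the sketch assumes every cluster has at least one point.

```python
def init_from_clusters(points, labels, k):
    """Turn hard K-means clusters into starting GMM parameters (1-D case)."""
    n = len(points)
    mus, variances, pis = [], [], []
    for j in range(k):
        members = [x for x, lab in zip(points, labels) if lab == j]
        mu = sum(members) / len(members)                           # mu_k <- K-means mean
        var = sum((x - mu) ** 2 for x in members) / len(members)   # Sigma_k <- cov(cluster k)
        mus.append(mu)
        variances.append(var)
        pis.append(len(members) / n)                               # pi_k <- points in k / total
    return mus, variances, pis

# Hypothetical data: six tempos (bpm) and the labels a K-means run might return.
points = [58.0, 60.0, 62.0, 118.0, 120.0, 125.0]
labels = [0, 0, 0, 1, 1, 1]
mus, variances, pis = init_from_clusters(points, labels, k=2)
print(mus, variances, pis)
```

Starting EM from a K-means result usually means fewer iterations than a random start, since the means are already near the cluster centers.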
2. E Step: For each point $X_n$, determine its assignment score $\gamma(z_{nk})$ to each Gaussian k, where $z_{nk}$ is a latent variable.

$\gamma(z_{nk})$ is called a “responsibility”: how much is this Gaussian k responsible for this point $X_n$? Example: a point with responsibilities (.7, .3).
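The formula for the assignment score is not reproduced in this text. Assuming the deck follows the usual Gaussian mixture derivation, the responsibility is the weighted density of Gaussian k at $X_n$, normalized over all K Gaussians:

```latex
\gamma(z_{nk}) =
  \frac{\pi_k \, N(X_n \mid \mu_k, \Sigma_k)}
       {\sum_{j=1}^{K} \pi_j \, N(X_n \mid \mu_j, \Sigma_j)}
```

For each point the K responsibilities sum to 1, as in the (.7, .3) example.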
3. M Step: For each Gaussian k, update parameters using the new $\gamma(z_{nk})$

Mean of Gaussian k: computed from the points $X_n$, using the responsibility for each $X_n$.

Find the mean that “fits” the assignment scores best.
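The mean update itself is not reproduced in this text. Assuming the standard Gaussian mixture update, it is a responsibility-weighted average of the points, where $N_k$ (a symbol introduced here) is the total responsibility of Gaussian k:

```latex
N_k = \sum_{n=1}^{N} \gamma(z_{nk}), \qquad
\mu_k^{new} = \frac{1}{N_k} \sum_{n=1}^{N} \gamma(z_{nk}) \, X_n
```

With hard assignments (each $\gamma(z_{nk})$ equal to 0 or 1) this reduces to the K-means mean update.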
3. M Step (continued): Covariance matrix of Gaussian k.
   Annotation on the formula: “Just calculated this!”
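The covariance update is not reproduced in this text. Assuming the standard Gaussian mixture update, it is a responsibility-weighted covariance around the freshly updated mean (presumably the quantity the “Just calculated this!” note refers to), with $N_k$ denoting the total responsibility of Gaussian k:

```latex
\Sigma_k^{new} = \frac{1}{N_k} \sum_{n=1}^{N}
  \gamma(z_{nk}) \, (X_n - \mu_k^{new})(X_n - \mu_k^{new})^{T},
\qquad N_k = \sum_{n=1}^{N} \gamma(z_{nk})
```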
3. M Step (continued): Mixing coefficient for Gaussian k: the total responsibility assigned to Gaussian k divided by the total # of points, e.g. 105.6/200.
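Written out, and assuming the standard Gaussian mixture update, the mixing coefficient is the share of total responsibility held by Gaussian k, where $N$ is the total number of points:

```latex
\pi_k^{new} = \frac{N_k}{N}, \qquad N_k = \sum_{n=1}^{N} \gamma(z_{nk})
```

In the slide's example, $N_k = 105.6$ and $N = 200$, so $\pi_k^{new} = 0.528$. $N_k$ need not be an integer, because each point contributes only a fraction of itself to each Gaussian.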
4. Evaluate log likelihood. If likelihood or parameters converge, stop. Else go to Step 2 (E step).

Likelihood is the probability (density) of the data X under the parameters you found, $p(X \mid \theta)$. It measures how well the model fits the data.
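Putting the steps together, here is a minimal sketch of EM for a one-dimensional Gaussian mixture. It assumes the standard updates and the log likelihood $\ln p(X) = \sum_n \ln \sum_k \pi_k N(X_n \mid \mu_k, \Sigma_k)$. The function names, starting values, data, and the small variance floor are illustrative choices, not part of the slides.

```python
import math

def normal_pdf(x, mu, var):
    """Density of a 1-D Gaussian N(x | mu, var), where var is the variance."""
    return math.exp(-(x - mu) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def em_gmm_1d(xs, mus, variances, pis, tol=1e-8, max_iter=200):
    """EM for a 1-D Gaussian mixture; returns parameters and the log-likelihood trace."""
    mus, variances, pis = list(mus), list(variances), list(pis)
    k, n = len(mus), len(xs)
    trace = []
    for _ in range(max_iter):
        # E step: responsibility gamma[i][j] of Gaussian j for point i.
        gamma, log_lik = [], 0.0
        for x in xs:
            weighted = [pis[j] * normal_pdf(x, mus[j], variances[j]) for j in range(k)]
            total = sum(weighted)
            gamma.append([w / total for w in weighted])
            log_lik += math.log(total)  # log likelihood under the current parameters
        trace.append(log_lik)
        # Step 4: stop once the log likelihood has converged.
        if len(trace) > 1 and abs(trace[-1] - trace[-2]) < tol:
            break
        # M step: re-estimate each Gaussian from its responsibilities.
        for j in range(k):
            n_j = sum(g[j] for g in gamma)
            mus[j] = sum(g[j] * x for g, x in zip(gamma, xs)) / n_j
            var = sum(g[j] * (x - mus[j]) ** 2 for g, x in zip(gamma, xs)) / n_j
            variances[j] = max(var, 1e-6)  # assumption: floor keeps a Gaussian from collapsing
            pis[j] = n_j / n
    return mus, variances, pis, trace

xs = [58.0, 60.0, 62.0, 118.0, 120.0, 125.0]  # hypothetical tempos (bpm)
mus, variances, pis, trace = em_gmm_1d(xs, [50.0, 100.0], [100.0, 100.0], [0.5, 0.5])
print(mus, pis, trace[-1])
```

The log-likelihood trace should never decrease from one iteration to the next, which makes it a handy check while debugging.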
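1. Initialize parameters $\theta^{old}$
2. E Step: Evaluate $p(Z \mid X, \theta^{old})$, where Z are the hidden variables and X are the observed variables
3. M Step: Evaluate $\theta^{new}$, the parameters that maximize the function Q (an expectation built from the likelihood)
4. Evaluate log likelihood. If likelihood or parameters converge, stop. Else $\theta^{old} \leftarrow \theta^{new}$ and go to E Step.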
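The M-step equations are not reproduced in this text. Assuming the usual textbook formulation of EM, the function Q is the expected complete-data log likelihood under the posterior from the E step:

```latex
\theta^{new} = \arg\max_{\theta} \, Q(\theta, \theta^{old}), \qquad
Q(\theta, \theta^{old}) = \sum_{Z} p(Z \mid X, \theta^{old}) \, \ln p(X, Z \mid \theta)
```

For a Gaussian mixture, the posterior $p(Z \mid X, \theta^{old})$ corresponds to the responsibilities $\gamma(z_{nk})$, and maximizing Q yields the updates for $\mu_k$, $\Sigma_k$ and $\pi_k$.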
• K-means can be formulated as EM
• EM for Gaussian Mixtures
• EM for Bernoulli Mixtures
• EM for Bayesian Linear Regression
• “Expectation”: Calculate the fixed, data-dependent parameters of the function Q.
• “Maximization”: Once the parameters of Q are known, it is fully determined, so now we can maximize Q.
• We learned how to cluster data in an unsupervised manner
• Gaussian Mixture Models are useful for modeling data with “soft” cluster assignments
• Expectation Maximization is a method used when we have a model with latent variables (values we don’t know, but estimate with each step)
• My question: What other applications could use EM? How about EM of GMMs?
