GMM: Gaussian Mixture Models
Saurab Dulal
IOE, Pulchowk Campus
8/15/2014
Introduction to GMM
• Gaussian
“Gaussian is a characteristic symmetric "bell curve" shape that quickly falls off towards 0 (practically)”
• Mixture Model
“mixture model is a probabilistic model which assumes the underlying data to belong to a mixture distribution”
Introduction to GMM
• Mathematical Description of GMM
p(x) = w1 p1(x) + w2 p2(x) + w3 p3(x) + … + wn pn(x)
where p(x) = mixture density
w1, w2, …, wn = mixture weights (mixture coefficients), with wi ≥ 0 and w1 + w2 + … + wn = 1
pi(x) = component density functions
Fig: Best-fit Gaussian curve
Introduction to GMM
“The most common mixture distribution is the Gaussian (Normal) density function, in which each of the mixture components are Gaussian distributions, each with their own mean and variance parameters.”
p(x) = w1 N(x | µ1, Σ1) + w2 N(x | µ2, Σ2) + … + wn N(x | µn, Σn)
The µi are the means and the Σi are the covariance matrices of the individual components (probability density functions).
Fig: Five Gaussian components G1, …, G5 with mixture weights w1, …, w5
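The weighted sum above can be sketched in a few lines for the one-dimensional case. This is a minimal illustration, not code from the slides: the function names and the two-component parameter values are assumptions chosen only to show how the weights, means and variances combine.

```python
import math

def gaussian_pdf(x, mean, var):
    """Univariate normal density N(x | mean, var)."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def mixture_pdf(x, weights, means, variances):
    """Weighted sum of component densities; the weights must sum to 1."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("mixture weights must sum to 1")
    return sum(w * gaussian_pdf(x, m, v)
               for w, m, v in zip(weights, means, variances))

# Hypothetical two-component mixture (values chosen arbitrarily).
weights = [0.3, 0.7]
means = [0.0, 4.0]
variances = [1.0, 2.0]
density_at_zero = mixture_pdf(0.0, weights, means, variances)
```

Because the weights sum to 1 and every component integrates to 1, the mixture is itself a valid probability density.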
Figs (three slides): upper panel, component densities p(x) (Component 1 and Component 2; labelled "Component Models" on the last slide); lower panel, the resulting mixture model p(x) plotted against x.
GMM for Speaker Recognition
Motivation
• The interpretation that the Gaussian components represent some general speaker-dependent spectral shapes
• The capability of Gaussian mixtures to model arbitrary densities
Description of SR using GMM
• Speech Analysis
• Model Description
• Model Interpretation
• Maximum Likelihood Parameter Estimation
• Speaker Identification
Speech Analysis
• Linear predictive coding (LPC)
• Mel-scale filter bank (to reduce noise)
The analysis ends with the generation of cepstral coefficients x1', x2', x3', …, xn'.
A cepstrum is the result of taking the inverse Fourier transform (IFT) of the logarithm of the estimated spectrum of a signal.
Cosine transform
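The cepstrum definition above can be made concrete with a small sketch. It assumes a plain real cepstrum of a single frame (no LPC or mel filter bank stage) and uses a naive DFT from the standard library so the block stays self-contained; a real front end would use an FFT library. The function names and the `floor` parameter are illustrative assumptions.

```python
import cmath
import math

def dft(values, inverse=False):
    """Naive O(n^2) discrete Fourier transform (adequate for short frames)."""
    n = len(values)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n):
        acc = 0j
        for t, v in enumerate(values):
            acc += v * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out

def real_cepstrum(frame, floor=1e-12):
    """Inverse DFT of the log magnitude spectrum of one frame."""
    spectrum = dft(frame)
    # Clamp the magnitude so that log() never sees zero.
    log_magnitude = [math.log(max(abs(s), floor)) for s in spectrum]
    return [c.real for c in dft(log_magnitude, inverse=True)]

# A scaled unit impulse has a flat spectrum, so only c[0] is non-zero
# (up to rounding), and c[0] equals the log of the scale factor.
frame = [2.0] + [0.0] * 7
cepstrum = real_cepstrum(frame)
```

Because the log magnitude spectrum of a real signal is real and symmetric, the inverse transform reduces to a cosine transform, which is why cepstral front ends commonly use one.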
2000/05/03 11
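Model Description
Gaussian Mixture Density
$$p(\vec{x} \mid \lambda) = \sum_{i=1}^{M} p_i \, b_i(\vec{x})$$
where $\vec{x}$ is a D-dimensional random vector, the $b_i(\vec{x})$ are the component densities and the $p_i$ are the mixture weights.
Component Density
$$b_i(\vec{x}) = \frac{1}{(2\pi)^{D/2} \, |\Sigma_i|^{1/2}} \exp\left\{ -\frac{1}{2} (\vec{x} - \vec{\mu}_i)' \, \Sigma_i^{-1} \, (\vec{x} - \vec{\mu}_i) \right\}$$
with mean $\vec{\mu}_i$ and covariance matrix $\Sigma_i$.
Speaker Model
$$\lambda = \{ p_i, \vec{\mu}_i, \Sigma_i \}, \quad i = 1, \ldots, M$$
Covariance matrix options: nodal, grand or global. Nodal, diagonal covariance matrices are used here.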
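For the nodal, diagonal case the component density factorizes over dimensions, which keeps the computation simple. The sketch below is an illustration under that assumption; it works in the log domain and combines components with the log-sum-exp trick to avoid underflow. The function names and the list-based model layout are hypothetical, not taken from the slides.

```python
import math

def log_component_density(x, mean, var):
    """log b_i(x) for a D-dimensional Gaussian with diagonal covariance."""
    log_det = sum(math.log(v) for v in var)  # log |Sigma_i| for a diagonal matrix
    mahalanobis = sum((xd - md) ** 2 / vd for xd, md, vd in zip(x, mean, var))
    return -0.5 * (len(x) * math.log(2.0 * math.pi) + log_det + mahalanobis)

def log_mixture_density(x, weights, means, variances):
    """log p(x | lambda) = log sum_i p_i b_i(x), computed via log-sum-exp."""
    terms = [math.log(w) + log_component_density(x, m, v)
             for w, m, v in zip(weights, means, variances)]
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))

# Hypothetical two-component model in D = 2 dimensions.
weights = [0.5, 0.5]
means = [[0.0, 0.0], [3.0, 3.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]
log_p = log_mixture_density([0.0, 0.0], weights, means, variances)
```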
Choice of Covariance Matrix
• Nodal Covariance
One covariance matrix per Gaussian component
• Grand Covariance
One covariance matrix shared by all Gaussian components in a speaker model
• Global Covariance
A single covariance matrix shared by all speaker models
Model Interpretation
• Intuitive notion
Acoustic classes (vowels, nasals, fricatives) reflect some general speaker-dependent vocal tract configurations that are useful for characterizing speaker identity
• A GMM can form smooth approximations to arbitrarily shaped densities
• Beyond smooth approximation, it also captures the multimodal nature of the densities
ML Parameter Estimation (Maximum Likelihood)
Steps:
1. Begin with an initial model $\lambda$.
2. Estimate a new model $\bar{\lambda}$ such that the mixture density satisfies
$$p(X \mid \bar{\lambda}) \geq p(X \mid \lambda)$$
3. Repeat step 2, with $\bar{\lambda}$ as the new initial model, until a convergence threshold is reached.
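(Mixture Weights)
$$\bar{p}_i = \frac{1}{T} \sum_{t=1}^{T} p(i \mid \vec{x}_t, \lambda)$$
(Means)
$$\bar{\vec{\mu}}_i = \frac{\sum_{t=1}^{T} p(i \mid \vec{x}_t, \lambda) \, \vec{x}_t}{\sum_{t=1}^{T} p(i \mid \vec{x}_t, \lambda)}$$
(Variances)
$$\bar{\sigma}_i^2 = \frac{\sum_{t=1}^{T} p(i \mid \vec{x}_t, \lambda) \, x_t^2}{\sum_{t=1}^{T} p(i \mid \vec{x}_t, \lambda)} - \bar{\mu}_i^2$$
where $\sigma_i^2$, $x_t$ and $\mu_i$ refer to arbitrary elements of the vectors $\vec{\sigma}_i^2$, $\vec{x}_t$ and $\vec{\mu}_i$.
The a posteriori probability of component i is
$$p(i \mid \vec{x}_t, \lambda) = \frac{p_i \, b_i(\vec{x}_t)}{\sum_{k=1}^{M} p_k \, b_k(\vec{x}_t)}$$
where the numerator contains the component density and the denominator is the mixture density.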
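The iterative procedure and the re-estimation formulas can be sketched together for scalar features, which corresponds to one element of the vectors above. This is a minimal illustration under stated assumptions: the toy data, the initial model and the `var_floor` safeguard are hypothetical choices, not values from the slides.

```python
import math

def gaussian_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def log_likelihood(data, weights, means, variances):
    """log p(X | lambda) for independent observations."""
    return sum(math.log(sum(w * gaussian_pdf(x, m, v)
                            for w, m, v in zip(weights, means, variances)))
               for x in data)

def em_step(data, weights, means, variances, var_floor=1e-6):
    """One re-estimation pass: posteriors, then weights, means, variances."""
    M, T = len(weights), len(data)
    # A posteriori probability p(i | x_t, lambda) for every observation.
    post = []
    for x in data:
        joint = [w * gaussian_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
        total = sum(joint)
        post.append([j / total for j in joint])
    new_w, new_m, new_v = [], [], []
    for i in range(M):
        occ = sum(post[t][i] for t in range(T))
        mean = sum(post[t][i] * data[t] for t in range(T)) / occ
        second = sum(post[t][i] * data[t] ** 2 for t in range(T)) / occ
        new_w.append(occ / T)                            # mixture weight
        new_m.append(mean)                               # mean
        new_v.append(max(second - mean ** 2, var_floor)) # variance, floored
    return new_w, new_m, new_v

# Hypothetical toy data with two well separated clusters.
data = [-1.2, -0.8, -1.0, -0.9, -1.1, 2.9, 3.2, 3.0, 3.1, 2.8]
weights, means, variances = [0.5, 0.5], [0.0, 1.0], [1.0, 1.0]
history = [log_likelihood(data, weights, means, variances)]
for _ in range(20):
    weights, means, variances = em_step(data, weights, means, variances)
    history.append(log_likelihood(data, weights, means, variances))
```

The `history` list records the log-likelihood after every pass, so the guarantee that each new model is at least as likely as the previous one can be checked directly. A fixed iteration count stands in for the convergence threshold here; stopping when the increase in log-likelihood falls below a tolerance is a common alternative.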
Figs: EM example on anemia patients and controls, plotting red blood cell volume (x axis) against red blood cell hemoglobin concentration (y axis). Panels: the unlabeled data; EM iterations 1, 3, 5, 10, 15 and 25; the log-likelihood as a function of EM iteration; and the anemia data with labels (anemia group, control group).
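Speaker Identification
A group of S speakers, {1, 2, …, S}, is represented by the GMMs λ1, λ2, …, λS. The objective is to find the speaker model which has the maximum a posteriori probability for a given observation sequence X = {x1, …, xT}:
$$\hat{S} = \arg\max_{1 \le k \le S} \Pr(\lambda_k \mid X) = \arg\max_{1 \le k \le S} \frac{p(X \mid \lambda_k) \, \Pr(\lambda_k)}{p(X)}$$
Assuming equally likely speakers, and noting that p(X) is the same for every speaker model, this simplifies to
$$\hat{S} = \arg\max_{1 \le k \le S} p(X \mid \lambda_k)$$
Taking the logarithm and assuming independent observations gives
$$\hat{S} = \arg\max_{1 \le k \le S} \sum_{t=1}^{T} \log p(\vec{x}_t \mid \lambda_k)$$
in which
$$p(\vec{x}_t \mid \lambda_k) = \sum_{i=1}^{M} p_i \, b_i(\vec{x}_t)$$
with the weights $p_i$ and component densities $b_i$ taken from model $\lambda_k$.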
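The decision rule can be sketched as a sum of per-frame log-likelihoods followed by an argmax over speaker models. The block below assumes scalar features and equally likely speakers; the model parameters, speaker names and test frames are hypothetical values chosen for illustration.

```python
import math

def log_mixture_density(x, model):
    """log p(x | lambda) for a scalar GMM given as (weights, means, variances)."""
    weights, means, variances = model
    terms = [math.log(w) - 0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
             for w, m, v in zip(weights, means, variances)]
    peak = max(terms)  # log-sum-exp keeps the sum numerically stable
    return peak + math.log(sum(math.exp(t - peak) for t in terms))

def identify_speaker(frames, models):
    """Return the model key with the largest summed log-likelihood, plus all scores."""
    scores = {name: sum(log_mixture_density(x, model) for x in frames)
              for name, model in models.items()}
    return max(scores, key=scores.get), scores

# Hypothetical speaker models and observation sequence.
models = {
    "speaker_a": ([0.6, 0.4], [-1.0, 1.0], [0.5, 0.5]),
    "speaker_b": ([0.5, 0.5], [3.0, 5.0], [0.5, 0.5]),
}
frames = [2.8, 3.3, 4.9, 5.2, 3.1]
best, scores = identify_speaker(frames, models)
```

Summing logs rather than multiplying densities matters in practice: with hundreds of frames the product of likelihoods underflows, while the sum of logs stays well within floating-point range.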
References
• D. A. Reynolds and R. C. Rose, “Robust Text-Independent Speaker Identification Using Gaussian Mixture Speaker Models”, IEEE Trans. on Speech and Audio Processing, vol. 3, no. 1, pp. 72-83, January 1995.
• http://en.wikipedia.org/wiki/Probability_density_function
• http://crsouza.blogspot.com/2010/10/gaussian-mixture-models-and-expectation.html
• https://www.ll.mit.edu/mission/communications/ist/publications/0802_Reynolds_Biometrics-GMM.pdf
• http://statweb.stanford.edu/~tibs/stat315a/LECTURES/em.pdf
• http://eprints.pascal-network.org/archive/00008291/01/SoftAssignReconstr_ICIP2011.pdf
• http://home.deib.polimi.it/matteucc/Clustering/tutorial_html/kmeans.html

Speaker Recognition using Gaussian Mixture Model


Editor's Notes

  1. Linear predictive coding (LPC) is a tool used mostly in audio signal processing and speech processing for representing the spectral envelope of a digital speech signal in compressed form, using the information of a linear predictive model. It is one of the most powerful speech analysis techniques and one of the most useful methods for encoding good-quality speech at a low bit rate, and it provides extremely accurate estimates of speech parameters.