
# Text-Independent Speaker Verification


Presentation slides discussing the theory and empirical results of a text-independent speaker verification system I developed based on classification of MFCCs. Both minimum-distance classification and log-likelihood ratio classification using Gaussian Mixture Models are discussed.

Published in: Technology

• Analysis window w[n]; discrete STFT; weight the magnitudes of X(n, w) with a mel-scale filter bank.
• GMM of the target speaker; GMM for a collection of imposters, which we call the background model; ratio of the probability that the collection of feature vectors X is from the claimed speaker to the probability that X is not from the claimed speaker (i.e., is from the background).
• The mean-squared error (distance metric) is symmetric, so it doesn't matter which sentence was used for training and which for testing. Swapping them just transposes the output.
• Mean-squared difference between average testing and training feature vectors for each speaker
• Minimize the distances on the diagonal (matches)
• 0 false negatives, 6 false positives
• 1 false negative, 5 false positives
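
The minimum-distance notes above can be illustrated with a small sketch. It is not the project's code: the helper names, the toy 2-D feature vectors, and the threshold of 0.05 are all made up for illustration, and it assumes a claim is accepted when the distance falls below the threshold. Under the further assumption that all 8 × 8 speaker pairings count as trials, 6 errors out of 64 give roughly 91% accuracy, which would be consistent with the figures reported in the slides.

```python
def mean_vector(frames):
    """Average a list of equal-length feature vectors, component by component."""
    count = len(frames)
    return [sum(column) / count for column in zip(*frames)]


def mean_squared_difference(a, b):
    """Mean-squared difference between two equal-length vectors (symmetric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def distance_matrix(train_means, test_means):
    """Rows index the claimed (training) speaker, columns the test speaker."""
    return [[mean_squared_difference(tr, te) for te in test_means]
            for tr in train_means]


def count_errors(distances, threshold):
    """Accept a claim when the distance is below the threshold (an assumption).

    Diagonal entries are genuine trials; off-diagonal entries are imposters.
    Returns (false_negatives, false_positives, accuracy).
    """
    false_neg = false_pos = 0
    n = len(distances)
    for i in range(n):
        for j in range(n):
            accepted = distances[i][j] < threshold
            if i == j and not accepted:
                false_neg += 1
            elif i != j and accepted:
                false_pos += 1
    accuracy = 1 - (false_neg + false_pos) / (n * n)
    return false_neg, false_pos, accuracy


# Made-up 2-D "features" for three hypothetical speakers (not the project's data).
train = [[[0.0, 0.0], [0.2, 0.0]],
         [[1.0, 1.0], [1.0, 1.2]],
         [[0.1, 0.2], [0.3, 0.2]]]
test = [[[0.1, 0.1], [0.1, -0.1]],
        [[0.9, 1.1], [1.1, 1.1]],
        [[0.2, 0.1], [0.2, 0.3]]]

train_means = [mean_vector(frames) for frames in train]
test_means = [mean_vector(frames) for frames in test]
distances = distance_matrix(train_means, test_means)
false_neg, false_pos, accuracy = count_errors(distances, threshold=0.05)
```

In this toy data the first and third speakers are deliberately close together, so the two imposter pairings between them are accepted as false positives while every genuine pairing on the diagonal passes. Raising or lowering the threshold trades one kind of error for the other.
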
### Text-Independent Speaker Verification

1. Speaker Recognition. Cody A. Ray, ECES 435 Final Project, March 11, 2010
2. Speaker Recognition. Speaker Identification (Text Dependent, Text Independent); Speaker Verification (Text Dependent, Text Independent)
3. Speaker Recognition System. Block-diagram labels: Training speech, Target & Background, Feature Extraction, Feature Vector, Training, Speaker Model; Test speech, Feature Extraction, Matching, Score, Testing, Verification. Bulleted items:
    • Cepstrum
    • LPCC
    • MFCC
    • Glottal Flow Derivative
    • Deterministic Models
    • Min Distance
    • DTW
    • Stochastic Models
    • GMM
    • HMM
    • Minimum Distance
    • Maximum-Likelihood
    • Maximum a posteriori
    • Minimum-Mean-Squared Error
4. Feature Extraction. Big surprise here – MFCCs! Pipeline labels: Speech signal x[m]; Window (x[m] w[n-m]); DFT (X(n, w)); | . |; Mel-Scale Filter Bank (Emel(n, l)); Log; DCT; MFCCs. MFCC: 12 coefficients (skipping the 0th-order coefficient). 256-sample frames, 128-sample increment, Hamming window. Triangular filters in the mel domain (absolute magnitude).
5. Mel Frequency Bank
6. System 1: Minimum-Distance. Average of mel-cepstral features for test and training data
7. Minimum-Distance Classifier. Mean-squared difference between average testing and training feature vectors
8. System 2: Gaussian Mixture Model. Multivariate Normal Distribution
9. Gaussian Mixture Model
10. GMM Speaker Recognition System. Diagram labels: Target Model, Feature Vectors, Imposter 1, Imposter 2
11. Log-Likelihood Ratio
12. Experiments. 8 speakers (4 male, 4 female), 2 sentences each: "Don't ask me to carry an oily rag like that" and "She had your dark suit in greasy wash water all year". "Rag" used for training, "suit" for testing
13. Results
14. Results
15. Results. Threshold = 0.12, Accuracy = 91%
16. Results. Threshold = 0.11, Accuracy = 91%
17. Conclusions. Accuracy isn't terrible, but there is room to improve. Threshold tradeoff: false negatives vs. false positives. DON'T use a Minimum-Distance classifier for text-independent authentication systems
18. Future Work. Implement LLR classifier using a GMM library. Repeat the experiment with the GMM-based system. Compare Min-Distance and GMM results
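
The slides leave the GMM-based classifier as future work, and the transcript does not preserve the log-likelihood ratio formula itself, so the following is only a rough sketch of how such a score is commonly computed. It assumes diagonal-covariance components, a per-frame average of the ratio, and an accept threshold of 0; the model parameters and frames are hand-set, hypothetical values rather than anything trained on the project's data.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a multivariate normal with diagonal covariance."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))


def gmm_log_likelihood(x, gmm):
    """Log of the weighted sum of component densities (log-sum-exp for stability)."""
    terms = [math.log(w) + log_gaussian_diag(x, mean, var)
             for w, mean, var in gmm]
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


def log_likelihood_ratio(frames, target_gmm, background_gmm):
    """Average per-frame log-likelihood ratio of target versus background."""
    total = sum(gmm_log_likelihood(x, target_gmm)
                - gmm_log_likelihood(x, background_gmm) for x in frames)
    return total / len(frames)


# Hypothetical hand-set models: each component is (weight, mean, variance).
target_gmm = [(0.5, [0.0, 0.0], [1.0, 1.0]),
              (0.5, [1.0, 1.0], [1.0, 1.0])]
background_gmm = [(0.5, [4.0, 4.0], [1.0, 1.0]),
                  (0.5, [-4.0, 4.0], [1.0, 1.0])]

# Toy feature vectors: the claimant sits near the target model,
# the imposter near the background model.
claimant_frames = [[0.1, -0.2], [0.9, 1.1], [0.4, 0.6]]
imposter_frames = [[3.8, 4.1], [-4.2, 3.9], [4.0, 4.3]]

claimant_score = log_likelihood_ratio(claimant_frames, target_gmm, background_gmm)
imposter_score = log_likelihood_ratio(imposter_frames, target_gmm, background_gmm)

# A threshold of 0 is an arbitrary illustrative choice.
accept_claimant = claimant_score > 0.0
accept_imposter = imposter_score > 0.0
```

A real system would fit both models to MFCC vectors (for example with an expectation-maximization routine from a GMM library) and tune the threshold on held-out data, which is where the false-negative versus false-positive tradeoff from the conclusions would reappear.
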