SlideShare a Scribd company logo
Signal & Image Processing : An International Journal (SIPIJ) Vol.5, No.5, October 2014 
SPEAKER IDENTIFICATION FROM YOUTUBE 
OBTAINED DATA 
Nitesh Kumar Chaudhary1 and Shraddha Srivastav2 
1Department of Electronics & Communication Engineering, LNMIIT, Jaipur, India 
2Bharti School Of Telecommunication, IIT, Delhi, India 
ABSTRACT 
An efficient, and intuitive algorithm is presented for the identification of speakers from a long dataset (like 
YouTube long discussion, Cocktail party recorded audio or video).The goal of automatic speaker 
identification is to identify the number of different speakers and prepare a model for that speaker by 
extraction, characterization and speaker-specific information contained in the speech signal. It has many 
diverse application specially in the field of Surveillance , Immigrations at Airport , cyber security , 
transcription in multi-source of similar sound source, where it is difficult to assign transcription arbitrary. 
The most commonly speech parameterization used in speaker verification, K-mean, cepstral analysis, is 
detailed. Gaussian mixture modeling, which is the speaker modeling technique is then explained. Gaussian 
mixture models (GMM), perhaps the most robust machine learning algorithm has been introduced to 
examine and judge carefully speaker identification in text independent. The application or employment of 
Gaussian mixture models for monitoring & Analysing speaker identity is encouraged by the familiarity, 
awareness, or understanding gained through experience that Gaussian spectrum depict the characteristics 
of speaker's spectral conformational pattern and remarkable ability of GMM to construct capricious 
densities after that we illustrate 'Expectation maximization' an iterative algorithm which takes some 
arbitrary value in initial estimation and carry on the iterative process until the convergence of value is 
observed We have tried to obtained 85 ~ 95% of accuracy using speaker modeling of vector quantization 
and Gaussian Mixture model ,so by doing various number of experiments we are able to obtain 79 ~ 82% 
of identification rate using Vector quantization and 85 ~ 92.6% of identification rate using GMM modeling 
by Expectation maximization parameter estimation depending on variation of parameter. 
KEYWORDS 
MFCC, Vector Quantization (VQ), Gaussian Mixture Model (GMM), K-mean, Maximum Likelihood, 
Expectation maximization. 
1. INTRODUCTION 
Speaker identification have two categories: text-dependent and text-independent. Essentially, we 
are more interested and involved in the research of text independent speaker identification / 
verification with the reason that it doesn't impose as a necessity and demands regarding the 
utterances obtained from speaker that means there is no restriction over the words , it can be 
anything in any order, so the first basic steps involve the feature extraction from speech sample in 
the enrolment process of a speaker , extracted feature are collected in the database as a training 
data utterances. For the better accuracy, every time we are updating our training dataset while 
preparing training dataset for new utterances with previous stored data in our database, in this 
way model of each speaker is trained. Now in identification process with the help of probabilistic 
model a measurement of identification has to be done, the feature vectors of testing utterances is 
measured with feature vectors of training dataset and decision has to be made whether it belongs 
to a group of dataset or not. 
DOI : 10.5121/sipij.2014.5503 27
Signal & Image Processing : An International Journal (SIPIJ) Vol.5, No.5, October 2014 
28 
2. FEATURE EXTRACTION FROM VOCAL TRACT 
Human's vocal tract, the airway used in the production of speech is the organs above the vocal 
folds, especially the passage above the larynx, including the pharynx, mouth, and nasal cavities. 
which is formed of the oral part (pharynx, tongue, lips, and jaw) , olfactory nerves, and the nasal 
tract. When the glottal pulses signal generated by the vibration of the vocal folds passes through 
the vocal tract, it is modified. Human’s vocal tract is performing like a filter, and its frequency 
characteristics is dependent upon the resonance peak from the vocal tract and vocal tract 
configuration can be obtained from the spectral shape such as formant position and spectral 
inclination of the speech signal. These features can be obtained from the spectrogram of the 
speech signal and we are using Mel-Frequency Cepstral Coefficients (MFCC) features in speaker 
identification as it combines the advantages of the cepstrum analysis with a perceptual frequency 
scale based on critical bands. Although the speech signal is non-stationary, but can be assumed as 
stationary for a short duration of time, so analysis is done by framing the speech signal; the frame 
width is about 20−30 milliseconds, and the frames are shifted by about 10 milliseconds. 
The number of feature vector that we get from utterances is usually large so for computation and 
the number of feature vectors can diminished without lose by K-means clustering method. This 
results in a small set of vectors called as codebook vector and a codebook is obtained for each 
speaker utterances using K-means. A codebook consist of many different vectors, which depict 
the important characteristics of each speaker. Each codebook is obtained as follows: Given a set 
of training set of MFCC feature vectors of 16-point vector for each frame of the utterance, which 
represent the speaker, find a decomposition of the feature vector space. Each decoposed region 
contains a cluster of vectors, which depict the same kind of basic sound. This region is 
represented by the centroid vector, which is the vector, which causes the minimum distortion 
when vectors in the region are mapped to it. Thus, each speaker has a codebook with a number of 
centroids which is prepared with the help of K-mean Clustering. 
3. MODELING USING VECTOR QUANTIZATION AND IDENTIFICATION 
Vector quantization (VQ) is one of the simplest text-independent speaker models. VQ is often 
used for computational speed-up techniques and lightweight practical implementations. Vector
Signal & Image Processing : An International Journal (SIPIJ) Vol.5, No.5, October 2014 
quantization (VQ) is a ancient well-known quantization skills from signal processing which 
allows the modeling of probability density functions by the classification of prototype vectors. 
VQ is generally used by classifying a large se of (MFCC)feature vector dataset in small groups 
vectors having same number of points closet to the denser value i.e. means of codebook of 
training dataset 
For Identification average quantization distortion is computed for the test utterance feature 
vectors by X = {x1,x2....xT}and the reference vectors by ={r1,r2….rk} . 
Where d(.,.) is a distance measure such as the Euclidean distance ||xt-rk||. A smaller value DQ 
indicates higher likelihood for X and R originating from the same speaker. 
Above Diagram1 shows speaker identification using Vector quantization modeling, the upper 
region is enrolment process while in lower region feature has been extracted from test utterance. 
Diagram2 shows the Vector quantization distortion measurements plot when the speaker 
identification’s test has been performed for 8 speakers. For K = 16 variations of distortion 
measurement from speaker i to speaker i utterances lies between (3.4561 to 7.0723) and from 
speaker i to speaker j lies between (13.1892 to 29.9038), where i, j  (1 to 8). 
29
Signal  Image Processing : An International Journal (SIPIJ) Vol.5, No.5, October 2014 
30 
Diagram 4 
Table1. Performance of Vector quantization modeling with MFCC feature extraction 
Training Utterances No. of Clustering (K mean) Identification rate (%) 
8 Speaker utterances K = 32 82 
8 Speaker utterances K = 16 82 
8 Speaker utterances K = 8 80 
Total Number of speaker’s utterances = 75 
4. MODELING USING GAUSSIAN MIXTURE MODEL 
A Gaussian Mixture Model (GMM) is a parametric probability density function represented as a 
weighted sum of Gaussian component densities. It is generally used as a parametric model of the 
probability distribution of continuous spectrum. GMM parameters are computed by 'Expectation 
maximization' an iterative algorithm which takes some arbitrary value in initial estimation and carry on the 
iterative process until the convergence of value is observed and the whole Gaussian mixture model is 
defined by these parameters namely mean vectors (μi), covariance matrices(i) and mixture 
weights(pi) from all different components, and the weighted sum of M Component density is 
given by 
where  is a D-dimensional continuous-valued feature vector (in our case 16 dimensional), pi , 
i = 1, . . . ,M, are the mixture weights, and g(/μi,i) are component densities. Each component 
density is a D-dimensional Gaussian function of the form,
Signal  Image Processing : An International Journal (SIPIJ) Vol.5, No.5, October 2014 
GMM based speaker identification model is shown below, the upper region is extraction of 
features from training dataset and preparing a mixture model using expectation maximization 
while in lower region feature has been extracted from test utterance, and finally with GMM 
parameter and extracted features from test utterances identification measurement has been done 
by using probability generation function 
One of the powerful attributes of the GMM is its ability to form smooth approximations to 
arbitrarily shaped densities. There are several techniques available for estimating the parameters 
of a GMM i.e (μ, , p) and most popular and well-established method is maximum likelihood 
(ML) estimation. Since the equation of joint probability is nonlinear function of parameter  so 
we can’t solve it easily because number of unknown variables are more than number of equations 
, so the parameter of ML is estimated iteratively by expectation maximization . The basic idea of 
the EM algorithm is, beginning with an initial model i to estimate a new model j such that p(/ 
j)  p(/ i), where j  i . The new model then becomes the initial model for the next iteration and 
the process is repeated until some convergence threshold is reached, so by using number of 
utterances = 10 can give you a good approximated parameters (μ, , p) of GMM. 
31 
Table2. Performance of GMM with MFCC feature based identification 
Training data 
in seconds 
Testing data in 
seconds 
No. of component No. of Iterations 
for EM 
Rate of correct 
identification (%) 
6 6 2 6 78.8342 
6 6 4 6 76.6654 
12 6 4 8 83.6723 
20 6 4 10 84.4348 
30 4 to 10 4 12 87.4382 
30 4 to 10 5 12 89.5630 
60 4 to 10 6 12 92.6643 
120 4 to 10 7 14 92.1273 
Total Number of speaker’s utterances = 75
Signal  Image Processing : An International Journal (SIPIJ) Vol.5, No.5, October 2014 
32 
5. CONCLUSION 
In this paper, we have introduced a text-dependent speaker identification system, we have 
investigated that Gaussian mixture models (GMMs) have proven extremely successful for text-independent 
speaker identification for long dataset of different speakers. Identification 
performance of the Gaussian mixture speaker model is insensitive to the method of model 
initialization however we have estimated the parameter using expectation maximization and 
Identification rate is very sensitive to number of cluster and number of iteration of expectation 
maximization we have performed the experiments in Matlab R2014a (The Language of Technical 
Computing), this model is currently being evaluated on a 75 speaker's utterances and the results 
indicate that Gaussian mixture models provide a robust speaker representation for the difficult 
task of speaker identification using corrupted, unconstrained speech of cocktail party or YouTube 
dataset The models are computationally inexpensive and easily implemented on a real-time 
platform. 
REFERENCES 
[1] T.Matsui and S. Furui, “Likelihood normalization for speaker verification using a phoneme- and 
speaker-independent model,” Speech Communication, vol. 17, no. 1-2, pp. 109–116, 1995. 
[2] D. A. Reynolds, A Gaussian mixture modeling approach to text independent speaker identification, 
Ph.D. thesis, Georgia Institute of Technology, Atlanta, Ga, USA, September 1992. 
[3] D. A. Reynolds, “Speaker identification and verification using Gaussian mixture speaker models,” 
Speech Communication, vol. 17, no. 1-2, pp. 91–108, 1995. 
[4] J. Attili, M. Savic, and J. Campbell. “A TMS32020-Based Real Time, Text-Independent, Automatic 
Speaker Verification System,” In International Conference on Acoustics, Speech, and Signal 
Processing in New York, IEEE, pp. 599-602, 1988 
[5] Levinson, S . E., L. R. Rabiner, A. E. Rosenberg and J. G. Wilson, “Interactive Clustering Techniques 
for Selecting Speaker-Independent Techniques for Selecting Speaker-Independent Reference 
Templates for Isolated Word Recognition,” IEEE Trans. ASSP. Vol. 27, pp. 134-141,1979. 
[6] R. Mammone, X. Zhang, R. Ramachandran, Robust Speaker Recognition, IEEE Signal Processing 
Magazine, September 1996. 
[7] F. Soong, E. Rosenberg, B. Juang, and L. Rabiner. A Vector Quantization Approach 
[8] Campbell, J., 1997. Speaker recognition: a tutorial. Proc. IEEE 85 (9),1437–1462. 
[9] Hatch, A., Stolcke, A., Peskin, B., 2005. Combining feature sets with support vector machines: 
application to speaker recognition. In: The 2005 IEEE Workshop on Automatic Speech Recognition 
and Understanding (ASRU), November 2005, pp. 75–79. 
[10] Reynolds, D. Speaker Verification Using Adapted Gaussian Mixture Models. Digital Signal 
Processing 10.1‐3 (2000): 19‐41. Print. 
[11] K. Chen and L. Wang and H. Chi, Methods of combining multiple classifiers with different features 
and their application to text-independent speaker identification, International Journal of Pattern 
Recognition and Artificial Intelligence, Vol. 11, no. 3, pp. 417{445, 1997. 
[12] Linde, Y., A. Buzo, and R. Gray (1980). An algorithm for vector quantizer design. IEEE Transactions 
on Communications, 28(1), 84–95. 
[13] Lu, X. and J. Dang (2008). An investigation of dependencies between frequency components and 
speaker characteristics for text-independent speaker identification. Speech Communication, 50(4), 
312–322. 
[14] Madikeri, S. R. and H. A. Murthy (2011a). Mel filter bank energy-based slope feature and its 
application to speaker recognition. In Proc. National Conference on Communications, 1–4. 
[15] NIST (2004). The NIST year 2004 speaker recognition evaluation plan. 
http://www.itl.nist.gov/iad/mig/tests/sre/2004/index.html. 
[16] Sha, F. and L. K. Saul, Large margin gaussian mixture modeling for phonetic classification and 
recognition. volume 1. IEEE, 2006.
Signal  Image Processing : An International Journal (SIPIJ) Vol.5, No.5, October 2014 
33 
AUTHORS 
Nitesh Kumar Chaudhary Final Year undergraduate (Electronics  Communication) 
student at LNMIIT, jaipur (India). My Research interest includes audio Signal Processing, 
Speaker Authentication  Identification , Machine Learning (Probabilistic model). I have 
worked as a Visiting Summer Researcher at Newyork University , Summer Research 
Fellow at Indian Institute of Science, Bangalore and worked as Research Intern at Indian 
Institute Of Technology ,Delhi a world class program organised by Bharti School Of 
Telecommunication , Foundation of Innovation and Technology Transfer. 
More Info : https://sites.google.com/site/nkcniteshkumarchaudhary/ 
Shraddha Srivastav Post Doc. Researcher at Bharti School of Telecommunication, Indian Institute of 
Technology, Delhi, PhD from University College Cork, Ireland (UCC)

More Related Content

What's hot

Phonetic distance based accent
Phonetic distance based accentPhonetic distance based accent
Phonetic distance based accent
sipij
 
A Text-Independent Speaker Identification System based on The Zak Transform
A Text-Independent Speaker Identification System based on The Zak TransformA Text-Independent Speaker Identification System based on The Zak Transform
A Text-Independent Speaker Identification System based on The Zak Transform
CSCJournals
 
V4101134138
V4101134138V4101134138
V4101134138
IJERA Editor
 
Comparison of Feature Extraction MFCC and LPC in Automatic Speech Recognition...
Comparison of Feature Extraction MFCC and LPC in Automatic Speech Recognition...Comparison of Feature Extraction MFCC and LPC in Automatic Speech Recognition...
Comparison of Feature Extraction MFCC and LPC in Automatic Speech Recognition...
TELKOMNIKA JOURNAL
 
Text-Independent Speaker Verification Report
Text-Independent Speaker Verification ReportText-Independent Speaker Verification Report
Text-Independent Speaker Verification Report
Cody Ray
 
SEARCH TIME REDUCTION USING HIDDEN MARKOV MODELS FOR ISOLATED DIGIT RECOGNITION
SEARCH TIME REDUCTION USING HIDDEN MARKOV MODELS FOR ISOLATED DIGIT RECOGNITIONSEARCH TIME REDUCTION USING HIDDEN MARKOV MODELS FOR ISOLATED DIGIT RECOGNITION
SEARCH TIME REDUCTION USING HIDDEN MARKOV MODELS FOR ISOLATED DIGIT RECOGNITION
cscpconf
 
Speaker recognition systems
Speaker recognition systemsSpeaker recognition systems
Speaker recognition systems
Namratha Dcruz
 
Speaker and Speech Recognition for Secured Smart Home Applications
Speaker and Speech Recognition for Secured Smart Home ApplicationsSpeaker and Speech Recognition for Secured Smart Home Applications
Speaker and Speech Recognition for Secured Smart Home Applications
Roger Gomes
 
Text-Independent Speaker Verification
Text-Independent Speaker VerificationText-Independent Speaker Verification
Text-Independent Speaker Verification
Cody Ray
 
F EATURE S ELECTION USING F ISHER ’ S R ATIO T ECHNIQUE FOR A UTOMATIC ...
F EATURE  S ELECTION USING  F ISHER ’ S  R ATIO  T ECHNIQUE FOR  A UTOMATIC  ...F EATURE  S ELECTION USING  F ISHER ’ S  R ATIO  T ECHNIQUE FOR  A UTOMATIC  ...
F EATURE S ELECTION USING F ISHER ’ S R ATIO T ECHNIQUE FOR A UTOMATIC ...
IJCI JOURNAL
 
Speaker identification using mel frequency
Speaker identification using mel frequency Speaker identification using mel frequency
Speaker identification using mel frequency
Phan Duy
 
Environmental Sound detection Using MFCC technique
Environmental Sound detection Using MFCC techniqueEnvironmental Sound detection Using MFCC technique
Environmental Sound detection Using MFCC techniquePankaj Kumar
 
Speaker Recognition System using MFCC and Vector Quantization Approach
Speaker Recognition System using MFCC and Vector Quantization ApproachSpeaker Recognition System using MFCC and Vector Quantization Approach
Speaker Recognition System using MFCC and Vector Quantization Approach
ijsrd.com
 
Performance analysis of image compression using fuzzy logic algorithm
Performance analysis of image compression using fuzzy logic algorithmPerformance analysis of image compression using fuzzy logic algorithm
Performance analysis of image compression using fuzzy logic algorithm
sipij
 
Line Spectral Pairs Based Voice Conversion using Radial Basis Function
Line Spectral Pairs Based Voice Conversion using Radial Basis FunctionLine Spectral Pairs Based Voice Conversion using Radial Basis Function
Line Spectral Pairs Based Voice Conversion using Radial Basis Function
IDES Editor
 
SYNTHETICAL ENLARGEMENT OF MFCC BASED TRAINING SETS FOR EMOTION RECOGNITION
SYNTHETICAL ENLARGEMENT OF MFCC BASED TRAINING SETS FOR EMOTION RECOGNITIONSYNTHETICAL ENLARGEMENT OF MFCC BASED TRAINING SETS FOR EMOTION RECOGNITION
SYNTHETICAL ENLARGEMENT OF MFCC BASED TRAINING SETS FOR EMOTION RECOGNITION
csandit
 
Speaker Recognition Using Vocal Tract Features
Speaker Recognition Using Vocal Tract FeaturesSpeaker Recognition Using Vocal Tract Features
Speaker Recognition Using Vocal Tract Features
International Journal of Engineering Inventions www.ijeijournal.com
 

What's hot (19)

Phonetic distance based accent
Phonetic distance based accentPhonetic distance based accent
Phonetic distance based accent
 
A Text-Independent Speaker Identification System based on The Zak Transform
A Text-Independent Speaker Identification System based on The Zak TransformA Text-Independent Speaker Identification System based on The Zak Transform
A Text-Independent Speaker Identification System based on The Zak Transform
 
V4101134138
V4101134138V4101134138
V4101134138
 
Comparison of Feature Extraction MFCC and LPC in Automatic Speech Recognition...
Comparison of Feature Extraction MFCC and LPC in Automatic Speech Recognition...Comparison of Feature Extraction MFCC and LPC in Automatic Speech Recognition...
Comparison of Feature Extraction MFCC and LPC in Automatic Speech Recognition...
 
Text-Independent Speaker Verification Report
Text-Independent Speaker Verification ReportText-Independent Speaker Verification Report
Text-Independent Speaker Verification Report
 
SPEAKER VERIFICATION
SPEAKER VERIFICATIONSPEAKER VERIFICATION
SPEAKER VERIFICATION
 
SEARCH TIME REDUCTION USING HIDDEN MARKOV MODELS FOR ISOLATED DIGIT RECOGNITION
SEARCH TIME REDUCTION USING HIDDEN MARKOV MODELS FOR ISOLATED DIGIT RECOGNITIONSEARCH TIME REDUCTION USING HIDDEN MARKOV MODELS FOR ISOLATED DIGIT RECOGNITION
SEARCH TIME REDUCTION USING HIDDEN MARKOV MODELS FOR ISOLATED DIGIT RECOGNITION
 
Speaker recognition systems
Speaker recognition systemsSpeaker recognition systems
Speaker recognition systems
 
Speaker and Speech Recognition for Secured Smart Home Applications
Speaker and Speech Recognition for Secured Smart Home ApplicationsSpeaker and Speech Recognition for Secured Smart Home Applications
Speaker and Speech Recognition for Secured Smart Home Applications
 
Text-Independent Speaker Verification
Text-Independent Speaker VerificationText-Independent Speaker Verification
Text-Independent Speaker Verification
 
F EATURE S ELECTION USING F ISHER ’ S R ATIO T ECHNIQUE FOR A UTOMATIC ...
F EATURE  S ELECTION USING  F ISHER ’ S  R ATIO  T ECHNIQUE FOR  A UTOMATIC  ...F EATURE  S ELECTION USING  F ISHER ’ S  R ATIO  T ECHNIQUE FOR  A UTOMATIC  ...
F EATURE S ELECTION USING F ISHER ’ S R ATIO T ECHNIQUE FOR A UTOMATIC ...
 
Speaker identification using mel frequency
Speaker identification using mel frequency Speaker identification using mel frequency
Speaker identification using mel frequency
 
Environmental Sound detection Using MFCC technique
Environmental Sound detection Using MFCC techniqueEnvironmental Sound detection Using MFCC technique
Environmental Sound detection Using MFCC technique
 
Speaker Recognition System using MFCC and Vector Quantization Approach
Speaker Recognition System using MFCC and Vector Quantization ApproachSpeaker Recognition System using MFCC and Vector Quantization Approach
Speaker Recognition System using MFCC and Vector Quantization Approach
 
Performance analysis of image compression using fuzzy logic algorithm
Performance analysis of image compression using fuzzy logic algorithmPerformance analysis of image compression using fuzzy logic algorithm
Performance analysis of image compression using fuzzy logic algorithm
 
Line Spectral Pairs Based Voice Conversion using Radial Basis Function
Line Spectral Pairs Based Voice Conversion using Radial Basis FunctionLine Spectral Pairs Based Voice Conversion using Radial Basis Function
Line Spectral Pairs Based Voice Conversion using Radial Basis Function
 
SYNTHETICAL ENLARGEMENT OF MFCC BASED TRAINING SETS FOR EMOTION RECOGNITION
SYNTHETICAL ENLARGEMENT OF MFCC BASED TRAINING SETS FOR EMOTION RECOGNITIONSYNTHETICAL ENLARGEMENT OF MFCC BASED TRAINING SETS FOR EMOTION RECOGNITION
SYNTHETICAL ENLARGEMENT OF MFCC BASED TRAINING SETS FOR EMOTION RECOGNITION
 
Speaker Recognition Using Vocal Tract Features
Speaker Recognition Using Vocal Tract FeaturesSpeaker Recognition Using Vocal Tract Features
Speaker Recognition Using Vocal Tract Features
 
Kf2517971799
Kf2517971799Kf2517971799
Kf2517971799
 

Viewers also liked

Parallax Effect Free Mosaicing of Underwater Video Sequence Based on Texture ...
Parallax Effect Free Mosaicing of Underwater Video Sequence Based on Texture ...Parallax Effect Free Mosaicing of Underwater Video Sequence Based on Texture ...
Parallax Effect Free Mosaicing of Underwater Video Sequence Based on Texture ...
sipij
 
Application of parallel algorithm approach for performance optimization of oi...
Application of parallel algorithm approach for performance optimization of oi...Application of parallel algorithm approach for performance optimization of oi...
Application of parallel algorithm approach for performance optimization of oi...
sipij
 
Offline handwritten signature identification using adaptive window positionin...
Offline handwritten signature identification using adaptive window positionin...Offline handwritten signature identification using adaptive window positionin...
Offline handwritten signature identification using adaptive window positionin...
sipij
 
Robust content based watermarking algorithm using singular value decompositio...
Robust content based watermarking algorithm using singular value decompositio...Robust content based watermarking algorithm using singular value decompositio...
Robust content based watermarking algorithm using singular value decompositio...
sipij
 
Lossless image compression using new biorthogonal wavelets
Lossless image compression using new biorthogonal waveletsLossless image compression using new biorthogonal wavelets
Lossless image compression using new biorthogonal wavelets
sipij
 
Beamforming with per antenna power constraint and transmit antenna selection ...
Beamforming with per antenna power constraint and transmit antenna selection ...Beamforming with per antenna power constraint and transmit antenna selection ...
Beamforming with per antenna power constraint and transmit antenna selection ...
sipij
 
Global threshold and region based active contour model for accurate image seg...
Global threshold and region based active contour model for accurate image seg...Global threshold and region based active contour model for accurate image seg...
Global threshold and region based active contour model for accurate image seg...
sipij
 
An intensity based medical image registration using genetic algorithm
An intensity based medical image registration using genetic algorithmAn intensity based medical image registration using genetic algorithm
An intensity based medical image registration using genetic algorithm
sipij
 
A Novel Uncertainty Parameter SR ( Signal to Residual Spectrum Ratio ) Evalua...
A Novel Uncertainty Parameter SR ( Signal to Residual Spectrum Ratio ) Evalua...A Novel Uncertainty Parameter SR ( Signal to Residual Spectrum Ratio ) Evalua...
A Novel Uncertainty Parameter SR ( Signal to Residual Spectrum Ratio ) Evalua...
sipij
 
Review of ocr techniques used in automatic mail sorting of postal envelopes
Review of ocr techniques used in automatic mail sorting of postal envelopesReview of ocr techniques used in automatic mail sorting of postal envelopes
Review of ocr techniques used in automatic mail sorting of postal envelopes
sipij
 
IDENTIFICATION OF SUITED QUALITY METRICS FOR NATURAL AND MEDICAL IMAGES
IDENTIFICATION OF SUITED QUALITY METRICS FOR NATURAL AND MEDICAL IMAGESIDENTIFICATION OF SUITED QUALITY METRICS FOR NATURAL AND MEDICAL IMAGES
IDENTIFICATION OF SUITED QUALITY METRICS FOR NATURAL AND MEDICAL IMAGES
sipij
 
A combined method of fractal and glcm features for mri and ct scan images cla...
A combined method of fractal and glcm features for mri and ct scan images cla...A combined method of fractal and glcm features for mri and ct scan images cla...
A combined method of fractal and glcm features for mri and ct scan images cla...
sipij
 
A voting based approach to detect recursive order number of photocopy documen...
A voting based approach to detect recursive order number of photocopy documen...A voting based approach to detect recursive order number of photocopy documen...
A voting based approach to detect recursive order number of photocopy documen...
sipij
 
Contrast enhancement using various statistical operations and neighborhood pr...
Contrast enhancement using various statistical operations and neighborhood pr...Contrast enhancement using various statistical operations and neighborhood pr...
Contrast enhancement using various statistical operations and neighborhood pr...
sipij
 
FRONT AND REAR VEHICLE DETECTION USING HYPOTHESIS GENERATION AND VERIFICATION
FRONT AND REAR VEHICLE DETECTION USING HYPOTHESIS GENERATION AND VERIFICATIONFRONT AND REAR VEHICLE DETECTION USING HYPOTHESIS GENERATION AND VERIFICATION
FRONT AND REAR VEHICLE DETECTION USING HYPOTHESIS GENERATION AND VERIFICATION
sipij
 
Image retrieval and re ranking techniques - a survey
Image retrieval and re ranking techniques - a surveyImage retrieval and re ranking techniques - a survey
Image retrieval and re ranking techniques - a survey
sipij
 
Feature selection approach in animal classification
Feature selection approach in animal classificationFeature selection approach in animal classification
Feature selection approach in animal classification
sipij
 
ANALYSIS OF BRAIN COGNITIVE STATE FOR ARITHMETIC TASK AND MOTOR TASK USING EL...
ANALYSIS OF BRAIN COGNITIVE STATE FOR ARITHMETIC TASK AND MOTOR TASK USING EL...ANALYSIS OF BRAIN COGNITIVE STATE FOR ARITHMETIC TASK AND MOTOR TASK USING EL...
ANALYSIS OF BRAIN COGNITIVE STATE FOR ARITHMETIC TASK AND MOTOR TASK USING EL...
sipij
 
A GAUSSIAN MIXTURE MODEL BASED SPEECH RECOGNITION SYSTEM USING MATLAB
A GAUSSIAN MIXTURE MODEL BASED SPEECH RECOGNITION SYSTEM USING MATLABA GAUSSIAN MIXTURE MODEL BASED SPEECH RECOGNITION SYSTEM USING MATLAB
A GAUSSIAN MIXTURE MODEL BASED SPEECH RECOGNITION SYSTEM USING MATLAB
sipij
 

Viewers also liked (19)

Parallax Effect Free Mosaicing of Underwater Video Sequence Based on Texture ...
Parallax Effect Free Mosaicing of Underwater Video Sequence Based on Texture ...Parallax Effect Free Mosaicing of Underwater Video Sequence Based on Texture ...
Parallax Effect Free Mosaicing of Underwater Video Sequence Based on Texture ...
 
Application of parallel algorithm approach for performance optimization of oi...
Application of parallel algorithm approach for performance optimization of oi...Application of parallel algorithm approach for performance optimization of oi...
Application of parallel algorithm approach for performance optimization of oi...
 
Offline handwritten signature identification using adaptive window positionin...
Offline handwritten signature identification using adaptive window positionin...Offline handwritten signature identification using adaptive window positionin...
Offline handwritten signature identification using adaptive window positionin...
 
Robust content based watermarking algorithm using singular value decompositio...
Robust content based watermarking algorithm using singular value decompositio...Robust content based watermarking algorithm using singular value decompositio...
Robust content based watermarking algorithm using singular value decompositio...
 
Lossless image compression using new biorthogonal wavelets
Lossless image compression using new biorthogonal waveletsLossless image compression using new biorthogonal wavelets
Lossless image compression using new biorthogonal wavelets
 
Beamforming with per antenna power constraint and transmit antenna selection ...
Beamforming with per antenna power constraint and transmit antenna selection ...Beamforming with per antenna power constraint and transmit antenna selection ...
Beamforming with per antenna power constraint and transmit antenna selection ...
 
Global threshold and region based active contour model for accurate image seg...
Global threshold and region based active contour model for accurate image seg...Global threshold and region based active contour model for accurate image seg...
Global threshold and region based active contour model for accurate image seg...
 
An intensity based medical image registration using genetic algorithm
An intensity based medical image registration using genetic algorithmAn intensity based medical image registration using genetic algorithm
An intensity based medical image registration using genetic algorithm
 
A Novel Uncertainty Parameter SR ( Signal to Residual Spectrum Ratio ) Evalua...
A Novel Uncertainty Parameter SR ( Signal to Residual Spectrum Ratio ) Evalua...A Novel Uncertainty Parameter SR ( Signal to Residual Spectrum Ratio ) Evalua...
A Novel Uncertainty Parameter SR ( Signal to Residual Spectrum Ratio ) Evalua...
 
Review of ocr techniques used in automatic mail sorting of postal envelopes
Review of ocr techniques used in automatic mail sorting of postal envelopesReview of ocr techniques used in automatic mail sorting of postal envelopes
Review of ocr techniques used in automatic mail sorting of postal envelopes
 
IDENTIFICATION OF SUITED QUALITY METRICS FOR NATURAL AND MEDICAL IMAGES
IDENTIFICATION OF SUITED QUALITY METRICS FOR NATURAL AND MEDICAL IMAGESIDENTIFICATION OF SUITED QUALITY METRICS FOR NATURAL AND MEDICAL IMAGES
IDENTIFICATION OF SUITED QUALITY METRICS FOR NATURAL AND MEDICAL IMAGES
 
A combined method of fractal and glcm features for mri and ct scan images cla...
A combined method of fractal and glcm features for mri and ct scan images cla...A combined method of fractal and glcm features for mri and ct scan images cla...
A combined method of fractal and glcm features for mri and ct scan images cla...
 
A voting based approach to detect recursive order number of photocopy documen...
A voting based approach to detect recursive order number of photocopy documen...A voting based approach to detect recursive order number of photocopy documen...
A voting based approach to detect recursive order number of photocopy documen...
 
Contrast enhancement using various statistical operations and neighborhood pr...
Contrast enhancement using various statistical operations and neighborhood pr...Contrast enhancement using various statistical operations and neighborhood pr...
Contrast enhancement using various statistical operations and neighborhood pr...
 
FRONT AND REAR VEHICLE DETECTION USING HYPOTHESIS GENERATION AND VERIFICATION
FRONT AND REAR VEHICLE DETECTION USING HYPOTHESIS GENERATION AND VERIFICATIONFRONT AND REAR VEHICLE DETECTION USING HYPOTHESIS GENERATION AND VERIFICATION
FRONT AND REAR VEHICLE DETECTION USING HYPOTHESIS GENERATION AND VERIFICATION
 
Image retrieval and re ranking techniques - a survey
Image retrieval and re ranking techniques - a surveyImage retrieval and re ranking techniques - a survey
Image retrieval and re ranking techniques - a survey
 
Feature selection approach in animal classification
Feature selection approach in animal classificationFeature selection approach in animal classification
Feature selection approach in animal classification
 
ANALYSIS OF BRAIN COGNITIVE STATE FOR ARITHMETIC TASK AND MOTOR TASK USING EL...
ANALYSIS OF BRAIN COGNITIVE STATE FOR ARITHMETIC TASK AND MOTOR TASK USING EL...ANALYSIS OF BRAIN COGNITIVE STATE FOR ARITHMETIC TASK AND MOTOR TASK USING EL...
ANALYSIS OF BRAIN COGNITIVE STATE FOR ARITHMETIC TASK AND MOTOR TASK USING EL...
 
A GAUSSIAN MIXTURE MODEL BASED SPEECH RECOGNITION SYSTEM USING MATLAB
A GAUSSIAN MIXTURE MODEL BASED SPEECH RECOGNITION SYSTEM USING MATLABA GAUSSIAN MIXTURE MODEL BASED SPEECH RECOGNITION SYSTEM USING MATLAB
A GAUSSIAN MIXTURE MODEL BASED SPEECH RECOGNITION SYSTEM USING MATLAB
 

Similar to Speaker Identification From Youtube Obtained Data

A comparison of different support vector machine kernels for artificial speec...
A comparison of different support vector machine kernels for artificial speec...A comparison of different support vector machine kernels for artificial speec...
A comparison of different support vector machine kernels for artificial speec...
TELKOMNIKA JOURNAL
 
A Novel, Robust, Hierarchical, Text-Independent Speaker Recognition Technique
A Novel, Robust, Hierarchical, Text-Independent Speaker Recognition TechniqueA Novel, Robust, Hierarchical, Text-Independent Speaker Recognition Technique
A Novel, Robust, Hierarchical, Text-Independent Speaker Recognition Technique
CSCJournals
 
Speech Recognition System By Matlab
Speech Recognition System By MatlabSpeech Recognition System By Matlab
Speech Recognition System By Matlab
Ankit Gujrati
 
AN EFFICIENT SPEECH RECOGNITION SYSTEM
AN EFFICIENT SPEECH RECOGNITION SYSTEMAN EFFICIENT SPEECH RECOGNITION SYSTEM
AN EFFICIENT SPEECH RECOGNITION SYSTEM
cseij
 
Joint MFCC-and-Vector Quantization based Text-Independent Speaker Recognition...
Joint MFCC-and-Vector Quantization based Text-Independent Speaker Recognition...Joint MFCC-and-Vector Quantization based Text-Independent Speaker Recognition...
Joint MFCC-and-Vector Quantization based Text-Independent Speaker Recognition...
Ahmed Ayman
 
IRJET- Emotion recognition using Speech Signal: A Review
IRJET-  	  Emotion recognition using Speech Signal: A ReviewIRJET-  	  Emotion recognition using Speech Signal: A Review
IRJET- Emotion recognition using Speech Signal: A Review
IRJET Journal
 
Modeling Text Independent Speaker Identification with Vector Quantization
Modeling Text Independent Speaker Identification with Vector QuantizationModeling Text Independent Speaker Identification with Vector Quantization
Modeling Text Independent Speaker Identification with Vector Quantization
TELKOMNIKA JOURNAL
 
Effect of Time Derivatives of MFCC Features on HMM Based Speech Recognition S...
Effect of Time Derivatives of MFCC Features on HMM Based Speech Recognition S...Effect of Time Derivatives of MFCC Features on HMM Based Speech Recognition S...
Effect of Time Derivatives of MFCC Features on HMM Based Speech Recognition S...
IDES Editor
 
Bachelors project summary
Bachelors project summaryBachelors project summary
Bachelors project summaryAditya Deshmukh
 
A017410108
A017410108A017410108
A017410108
IOSR Journals
 
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
AIRCC Publishing Corporation
 
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
ijcsit
 
Speaker Identification & Verification Using MFCC & SVM
Speaker Identification & Verification Using MFCC & SVMSpeaker Identification & Verification Using MFCC & SVM
Speaker Identification & Verification Using MFCC & SVM
IRJET Journal
 
A novel automatic voice recognition system based on text-independent in a noi...
A novel automatic voice recognition system based on text-independent in a noi...A novel automatic voice recognition system based on text-independent in a noi...
A novel automatic voice recognition system based on text-independent in a noi...
IJECEIAES
 
Ijecet 06 09_010
Ijecet 06 09_010Ijecet 06 09_010
Ijecet 06 09_010
IAEME Publication
 
Bayesian distance metric learning and its application in automatic speaker re...
Bayesian distance metric learning and its application in automatic speaker re...Bayesian distance metric learning and its application in automatic speaker re...
Bayesian distance metric learning and its application in automatic speaker re...
IJECEIAES
 
Designing an Efficient Multimodal Biometric System using Palmprint and Speech...
Designing an Efficient Multimodal Biometric System using Palmprint and Speech...Designing an Efficient Multimodal Biometric System using Palmprint and Speech...
Designing an Efficient Multimodal Biometric System using Palmprint and Speech...
IDES Editor
 
Effect of MFCC Based Features for Speech Signal Alignments
Effect of MFCC Based Features for Speech Signal AlignmentsEffect of MFCC Based Features for Speech Signal Alignments
Effect of MFCC Based Features for Speech Signal Alignments
kevig
 
EFFECT OF MFCC BASED FEATURES FOR SPEECH SIGNAL ALIGNMENTS
EFFECT OF MFCC BASED FEATURES FOR SPEECH SIGNAL ALIGNMENTSEFFECT OF MFCC BASED FEATURES FOR SPEECH SIGNAL ALIGNMENTS
EFFECT OF MFCC BASED FEATURES FOR SPEECH SIGNAL ALIGNMENTS
ijnlc
 
International Journal of Engineering and Science Invention (IJESI)
International Journal of Engineering and Science Invention (IJESI)International Journal of Engineering and Science Invention (IJESI)
International Journal of Engineering and Science Invention (IJESI)
inventionjournals
 

Similar to Speaker Identification From Youtube Obtained Data (20)

A comparison of different support vector machine kernels for artificial speec...
A comparison of different support vector machine kernels for artificial speec...A comparison of different support vector machine kernels for artificial speec...
A comparison of different support vector machine kernels for artificial speec...
 
A Novel, Robust, Hierarchical, Text-Independent Speaker Recognition Technique
A Novel, Robust, Hierarchical, Text-Independent Speaker Recognition TechniqueA Novel, Robust, Hierarchical, Text-Independent Speaker Recognition Technique
A Novel, Robust, Hierarchical, Text-Independent Speaker Recognition Technique
 
Speech Recognition System By Matlab
Speech Recognition System By MatlabSpeech Recognition System By Matlab
Speech Recognition System By Matlab
 
AN EFFICIENT SPEECH RECOGNITION SYSTEM
AN EFFICIENT SPEECH RECOGNITION SYSTEMAN EFFICIENT SPEECH RECOGNITION SYSTEM
AN EFFICIENT SPEECH RECOGNITION SYSTEM
 
Joint MFCC-and-Vector Quantization based Text-Independent Speaker Recognition...
Joint MFCC-and-Vector Quantization based Text-Independent Speaker Recognition...Joint MFCC-and-Vector Quantization based Text-Independent Speaker Recognition...
Joint MFCC-and-Vector Quantization based Text-Independent Speaker Recognition...
 
IRJET- Emotion recognition using Speech Signal: A Review
IRJET-  	  Emotion recognition using Speech Signal: A ReviewIRJET-  	  Emotion recognition using Speech Signal: A Review
IRJET- Emotion recognition using Speech Signal: A Review
 
Modeling Text Independent Speaker Identification with Vector Quantization
Modeling Text Independent Speaker Identification with Vector QuantizationModeling Text Independent Speaker Identification with Vector Quantization
Modeling Text Independent Speaker Identification with Vector Quantization
 
Effect of Time Derivatives of MFCC Features on HMM Based Speech Recognition S...
Effect of Time Derivatives of MFCC Features on HMM Based Speech Recognition S...Effect of Time Derivatives of MFCC Features on HMM Based Speech Recognition S...
Effect of Time Derivatives of MFCC Features on HMM Based Speech Recognition S...
 
Bachelors project summary
Bachelors project summaryBachelors project summary
Bachelors project summary
 
A017410108
A017410108A017410108
A017410108
 
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
 
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
 
Speaker Identification & Verification Using MFCC & SVM
Speaker Identification & Verification Using MFCC & SVMSpeaker Identification & Verification Using MFCC & SVM
Speaker Identification & Verification Using MFCC & SVM
 
A novel automatic voice recognition system based on text-independent in a noi...
A novel automatic voice recognition system based on text-independent in a noi...A novel automatic voice recognition system based on text-independent in a noi...
A novel automatic voice recognition system based on text-independent in a noi...
 
Ijecet 06 09_010
Ijecet 06 09_010Ijecet 06 09_010
Ijecet 06 09_010
 
Bayesian distance metric learning and its application in automatic speaker re...
Bayesian distance metric learning and its application in automatic speaker re...Bayesian distance metric learning and its application in automatic speaker re...
Bayesian distance metric learning and its application in automatic speaker re...
 
Designing an Efficient Multimodal Biometric System using Palmprint and Speech...
Designing an Efficient Multimodal Biometric System using Palmprint and Speech...Designing an Efficient Multimodal Biometric System using Palmprint and Speech...
Designing an Efficient Multimodal Biometric System using Palmprint and Speech...
 
Effect of MFCC Based Features for Speech Signal Alignments
Effect of MFCC Based Features for Speech Signal AlignmentsEffect of MFCC Based Features for Speech Signal Alignments
Effect of MFCC Based Features for Speech Signal Alignments
 
EFFECT OF MFCC BASED FEATURES FOR SPEECH SIGNAL ALIGNMENTS
EFFECT OF MFCC BASED FEATURES FOR SPEECH SIGNAL ALIGNMENTSEFFECT OF MFCC BASED FEATURES FOR SPEECH SIGNAL ALIGNMENTS
EFFECT OF MFCC BASED FEATURES FOR SPEECH SIGNAL ALIGNMENTS
 
International Journal of Engineering and Science Invention (IJESI)
International Journal of Engineering and Science Invention (IJESI)International Journal of Engineering and Science Invention (IJESI)
International Journal of Engineering and Science Invention (IJESI)
 

Recently uploaded

CME397 Surface Engineering- Professional Elective
CME397 Surface Engineering- Professional ElectiveCME397 Surface Engineering- Professional Elective
CME397 Surface Engineering- Professional Elective
karthi keyan
 
AP LAB PPT.pdf ap lab ppt no title specific
AP LAB PPT.pdf ap lab ppt no title specificAP LAB PPT.pdf ap lab ppt no title specific
AP LAB PPT.pdf ap lab ppt no title specific
BrazilAccount1
 
Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
AJAYKUMARPUND1
 
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&BDesign and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Sreedhar Chowdam
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
zwunae
 
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
Amil Baba Dawood bangali
 
Nuclear Power Economics and Structuring 2024
Nuclear Power Economics and Structuring 2024Nuclear Power Economics and Structuring 2024
Nuclear Power Economics and Structuring 2024
Massimo Talia
 
Student information management system project report ii.pdf
Student information management system project report ii.pdfStudent information management system project report ii.pdf
Student information management system project report ii.pdf
Kamal Acharya
 
Standard Reomte Control Interface - Neometrix
Standard Reomte Control Interface - NeometrixStandard Reomte Control Interface - Neometrix
Standard Reomte Control Interface - Neometrix
Neometrix_Engineering_Pvt_Ltd
 
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
obonagu
 
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdfTop 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Teleport Manpower Consultant
 
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)
MdTanvirMahtab2
 
Investor-Presentation-Q1FY2024 investor presentation document.pptx
Investor-Presentation-Q1FY2024 investor presentation document.pptxInvestor-Presentation-Q1FY2024 investor presentation document.pptx
Investor-Presentation-Q1FY2024 investor presentation document.pptx
AmarGB2
 
ML for identifying fraud using open blockchain data.pptx
ML for identifying fraud using open blockchain data.pptxML for identifying fraud using open blockchain data.pptx
ML for identifying fraud using open blockchain data.pptx
Vijay Dialani, PhD
 
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdfAKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
SamSarthak3
 
DESIGN A COTTON SEED SEPARATION MACHINE.docx
DESIGN A COTTON SEED SEPARATION MACHINE.docxDESIGN A COTTON SEED SEPARATION MACHINE.docx
DESIGN A COTTON SEED SEPARATION MACHINE.docx
FluxPrime1
 
MCQ Soil mechanics questions (Soil shear strength).pdf
MCQ Soil mechanics questions (Soil shear strength).pdfMCQ Soil mechanics questions (Soil shear strength).pdf
MCQ Soil mechanics questions (Soil shear strength).pdf
Osamah Alsalih
 
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
ydteq
 
Governing Equations for Fundamental Aerodynamics_Anderson2010.pdf
Governing Equations for Fundamental Aerodynamics_Anderson2010.pdfGoverning Equations for Fundamental Aerodynamics_Anderson2010.pdf
Governing Equations for Fundamental Aerodynamics_Anderson2010.pdf
WENKENLI1
 
J.Yang, ICLR 2024, MLILAB, KAIST AI.pdf
J.Yang,  ICLR 2024, MLILAB, KAIST AI.pdfJ.Yang,  ICLR 2024, MLILAB, KAIST AI.pdf
J.Yang, ICLR 2024, MLILAB, KAIST AI.pdf
MLILAB
 

Recently uploaded (20)

CME397 Surface Engineering- Professional Elective
CME397 Surface Engineering- Professional ElectiveCME397 Surface Engineering- Professional Elective
CME397 Surface Engineering- Professional Elective
 
AP LAB PPT.pdf ap lab ppt no title specific
AP LAB PPT.pdf ap lab ppt no title specificAP LAB PPT.pdf ap lab ppt no title specific
AP LAB PPT.pdf ap lab ppt no title specific
 
Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
 
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&BDesign and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
 
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
 
Nuclear Power Economics and Structuring 2024
Nuclear Power Economics and Structuring 2024Nuclear Power Economics and Structuring 2024
Nuclear Power Economics and Structuring 2024
 
Student information management system project report ii.pdf
Student information management system project report ii.pdfStudent information management system project report ii.pdf
Student information management system project report ii.pdf
 
Standard Reomte Control Interface - Neometrix
Standard Reomte Control Interface - NeometrixStandard Reomte Control Interface - Neometrix
Standard Reomte Control Interface - Neometrix
 
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
 
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdfTop 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
 
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)
 
Investor-Presentation-Q1FY2024 investor presentation document.pptx
Investor-Presentation-Q1FY2024 investor presentation document.pptxInvestor-Presentation-Q1FY2024 investor presentation document.pptx
Investor-Presentation-Q1FY2024 investor presentation document.pptx
 
ML for identifying fraud using open blockchain data.pptx
ML for identifying fraud using open blockchain data.pptxML for identifying fraud using open blockchain data.pptx
ML for identifying fraud using open blockchain data.pptx
 
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdfAKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
 
DESIGN A COTTON SEED SEPARATION MACHINE.docx
DESIGN A COTTON SEED SEPARATION MACHINE.docxDESIGN A COTTON SEED SEPARATION MACHINE.docx
DESIGN A COTTON SEED SEPARATION MACHINE.docx
 
MCQ Soil mechanics questions (Soil shear strength).pdf
MCQ Soil mechanics questions (Soil shear strength).pdfMCQ Soil mechanics questions (Soil shear strength).pdf
MCQ Soil mechanics questions (Soil shear strength).pdf
 
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
 
Governing Equations for Fundamental Aerodynamics_Anderson2010.pdf
Governing Equations for Fundamental Aerodynamics_Anderson2010.pdfGoverning Equations for Fundamental Aerodynamics_Anderson2010.pdf
Governing Equations for Fundamental Aerodynamics_Anderson2010.pdf
 
J.Yang, ICLR 2024, MLILAB, KAIST AI.pdf
J.Yang,  ICLR 2024, MLILAB, KAIST AI.pdfJ.Yang,  ICLR 2024, MLILAB, KAIST AI.pdf
J.Yang, ICLR 2024, MLILAB, KAIST AI.pdf
 

Speaker Identification From Youtube Obtained Data

  • 1. Signal & Image Processing : An International Journal (SIPIJ) Vol.5, No.5, October 2014 SPEAKER IDENTIFICATION FROM YOUTUBE OBTAINED DATA Nitesh Kumar Chaudhary1 and Shraddha Srivastav2 1Department of Electronics & Communication Engineering, LNMIIT, Jaipur, India 2Bharti School Of Telecommunication, IIT, Delhi, India ABSTRACT An efficient, and intuitive algorithm is presented for the identification of speakers from a long dataset (like YouTube long discussion, Cocktail party recorded audio or video).The goal of automatic speaker identification is to identify the number of different speakers and prepare a model for that speaker by extraction, characterization and speaker-specific information contained in the speech signal. It has many diverse application specially in the field of Surveillance , Immigrations at Airport , cyber security , transcription in multi-source of similar sound source, where it is difficult to assign transcription arbitrary. The most commonly speech parameterization used in speaker verification, K-mean, cepstral analysis, is detailed. Gaussian mixture modeling, which is the speaker modeling technique is then explained. Gaussian mixture models (GMM), perhaps the most robust machine learning algorithm has been introduced to examine and judge carefully speaker identification in text independent. The application or employment of Gaussian mixture models for monitoring & Analysing speaker identity is encouraged by the familiarity, awareness, or understanding gained through experience that Gaussian spectrum depict the characteristics of speaker's spectral conformational pattern and remarkable ability of GMM to construct capricious densities after that we illustrate 'Expectation maximization' an iterative algorithm which takes some arbitrary value in initial estimation and carry on the iterative process until the convergence of value is observed We have tried to obtained 85 ~ 95% of accuracy using speaker modeling of vector quantization and Gaussian Mixture model ,so by doing various number of experiments we are able to obtain 79 ~ 82% of identification rate using Vector quantization and 85 ~ 92.6% of identification rate using GMM modeling by Expectation maximization parameter estimation depending on variation of parameter. KEYWORDS MFCC, Vector Quantization (VQ), Gaussian Mixture Model (GMM), K-mean, Maximum Likelihood, Expectation maximization. 1. INTRODUCTION Speaker identification have two categories: text-dependent and text-independent. Essentially, we are more interested and involved in the research of text independent speaker identification / verification with the reason that it doesn't impose as a necessity and demands regarding the utterances obtained from speaker that means there is no restriction over the words , it can be anything in any order, so the first basic steps involve the feature extraction from speech sample in the enrolment process of a speaker , extracted feature are collected in the database as a training data utterances. For the better accuracy, every time we are updating our training dataset while preparing training dataset for new utterances with previous stored data in our database, in this way model of each speaker is trained. Now in identification process with the help of probabilistic model a measurement of identification has to be done, the feature vectors of testing utterances is measured with feature vectors of training dataset and decision has to be made whether it belongs to a group of dataset or not. DOI : 10.5121/sipij.2014.5503 27
  • 2. Signal & Image Processing : An International Journal (SIPIJ) Vol.5, No.5, October 2014 28 2. FEATURE EXTRACTION FROM VOCAL TRACT Human's vocal tract, the airway used in the production of speech is the organs above the vocal folds, especially the passage above the larynx, including the pharynx, mouth, and nasal cavities. which is formed of the oral part (pharynx, tongue, lips, and jaw) , olfactory nerves, and the nasal tract. When the glottal pulses signal generated by the vibration of the vocal folds passes through the vocal tract, it is modified. Human’s vocal tract is performing like a filter, and its frequency characteristics is dependent upon the resonance peak from the vocal tract and vocal tract configuration can be obtained from the spectral shape such as formant position and spectral inclination of the speech signal. These features can be obtained from the spectrogram of the speech signal and we are using Mel-Frequency Cepstral Coefficients (MFCC) features in speaker identification as it combines the advantages of the cepstrum analysis with a perceptual frequency scale based on critical bands. Although the speech signal is non-stationary, but can be assumed as stationary for a short duration of time, so analysis is done by framing the speech signal; the frame width is about 20−30 milliseconds, and the frames are shifted by about 10 milliseconds. The number of feature vector that we get from utterances is usually large so for computation and the number of feature vectors can diminished without lose by K-means clustering method. This results in a small set of vectors called as codebook vector and a codebook is obtained for each speaker utterances using K-means. A codebook consist of many different vectors, which depict the important characteristics of each speaker. Each codebook is obtained as follows: Given a set of training set of MFCC feature vectors of 16-point vector for each frame of the utterance, which represent the speaker, find a decomposition of the feature vector space. Each decoposed region contains a cluster of vectors, which depict the same kind of basic sound. This region is represented by the centroid vector, which is the vector, which causes the minimum distortion when vectors in the region are mapped to it. Thus, each speaker has a codebook with a number of centroids which is prepared with the help of K-mean Clustering. 3. MODELING USING VECTOR QUANTIZATION AND IDENTIFICATION Vector quantization (VQ) is one of the simplest text-independent speaker models. VQ is often used for computational speed-up techniques and lightweight practical implementations. Vector
  • 3. Signal & Image Processing : An International Journal (SIPIJ) Vol.5, No.5, October 2014 quantization (VQ) is a ancient well-known quantization skills from signal processing which allows the modeling of probability density functions by the classification of prototype vectors. VQ is generally used by classifying a large se of (MFCC)feature vector dataset in small groups vectors having same number of points closet to the denser value i.e. means of codebook of training dataset For Identification average quantization distortion is computed for the test utterance feature vectors by X = {x1,x2....xT}and the reference vectors by ={r1,r2….rk} . Where d(.,.) is a distance measure such as the Euclidean distance ||xt-rk||. A smaller value DQ indicates higher likelihood for X and R originating from the same speaker. Above Diagram1 shows speaker identification using Vector quantization modeling, the upper region is enrolment process while in lower region feature has been extracted from test utterance. Diagram2 shows the Vector quantization distortion measurements plot when the speaker identification’s test has been performed for 8 speakers. For K = 16 variations of distortion measurement from speaker i to speaker i utterances lies between (3.4561 to 7.0723) and from speaker i to speaker j lies between (13.1892 to 29.9038), where i, j (1 to 8). 29
  • 4. Signal Image Processing : An International Journal (SIPIJ) Vol.5, No.5, October 2014 30 Diagram 4 Table1. Performance of Vector quantization modeling with MFCC feature extraction Training Utterances No. of Clustering (K mean) Identification rate (%) 8 Speaker utterances K = 32 82 8 Speaker utterances K = 16 82 8 Speaker utterances K = 8 80 Total Number of speaker’s utterances = 75 4. MODELING USING GAUSSIAN MIXTURE MODEL A Gaussian Mixture Model (GMM) is a parametric probability density function represented as a weighted sum of Gaussian component densities. It is generally used as a parametric model of the probability distribution of continuous spectrum. GMM parameters are computed by 'Expectation maximization' an iterative algorithm which takes some arbitrary value in initial estimation and carry on the iterative process until the convergence of value is observed and the whole Gaussian mixture model is defined by these parameters namely mean vectors (μi), covariance matrices(i) and mixture weights(pi) from all different components, and the weighted sum of M Component density is given by where is a D-dimensional continuous-valued feature vector (in our case 16 dimensional), pi , i = 1, . . . ,M, are the mixture weights, and g(/μi,i) are component densities. Each component density is a D-dimensional Gaussian function of the form,
  • 5. Signal Image Processing : An International Journal (SIPIJ) Vol.5, No.5, October 2014 GMM based speaker identification model is shown below, the upper region is extraction of features from training dataset and preparing a mixture model using expectation maximization while in lower region feature has been extracted from test utterance, and finally with GMM parameter and extracted features from test utterances identification measurement has been done by using probability generation function One of the powerful attributes of the GMM is its ability to form smooth approximations to arbitrarily shaped densities. There are several techniques available for estimating the parameters of a GMM i.e (μ, , p) and most popular and well-established method is maximum likelihood (ML) estimation. Since the equation of joint probability is nonlinear function of parameter so we can’t solve it easily because number of unknown variables are more than number of equations , so the parameter of ML is estimated iteratively by expectation maximization . The basic idea of the EM algorithm is, beginning with an initial model i to estimate a new model j such that p(/ j) p(/ i), where j i . The new model then becomes the initial model for the next iteration and the process is repeated until some convergence threshold is reached, so by using number of utterances = 10 can give you a good approximated parameters (μ, , p) of GMM. 31 Table2. Performance of GMM with MFCC feature based identification Training data in seconds Testing data in seconds No. of component No. of Iterations for EM Rate of correct identification (%) 6 6 2 6 78.8342 6 6 4 6 76.6654 12 6 4 8 83.6723 20 6 4 10 84.4348 30 4 to 10 4 12 87.4382 30 4 to 10 5 12 89.5630 60 4 to 10 6 12 92.6643 120 4 to 10 7 14 92.1273 Total Number of speaker’s utterances = 75
  • 6. Signal Image Processing : An International Journal (SIPIJ) Vol.5, No.5, October 2014 32 5. CONCLUSION In this paper, we have introduced a text-dependent speaker identification system, we have investigated that Gaussian mixture models (GMMs) have proven extremely successful for text-independent speaker identification for long dataset of different speakers. Identification performance of the Gaussian mixture speaker model is insensitive to the method of model initialization however we have estimated the parameter using expectation maximization and Identification rate is very sensitive to number of cluster and number of iteration of expectation maximization we have performed the experiments in Matlab R2014a (The Language of Technical Computing), this model is currently being evaluated on a 75 speaker's utterances and the results indicate that Gaussian mixture models provide a robust speaker representation for the difficult task of speaker identification using corrupted, unconstrained speech of cocktail party or YouTube dataset The models are computationally inexpensive and easily implemented on a real-time platform. REFERENCES [1] T.Matsui and S. Furui, “Likelihood normalization for speaker verification using a phoneme- and speaker-independent model,” Speech Communication, vol. 17, no. 1-2, pp. 109–116, 1995. [2] D. A. Reynolds, A Gaussian mixture modeling approach to text independent speaker identification, Ph.D. thesis, Georgia Institute of Technology, Atlanta, Ga, USA, September 1992. [3] D. A. Reynolds, “Speaker identification and verification using Gaussian mixture speaker models,” Speech Communication, vol. 17, no. 1-2, pp. 91–108, 1995. [4] J. Attili, M. Savic, and J. Campbell. “A TMS32020-Based Real Time, Text-Independent, Automatic Speaker Verification System,” In International Conference on Acoustics, Speech, and Signal Processing in New York, IEEE, pp. 599-602, 1988 [5] Levinson, S . E., L. R. Rabiner, A. E. Rosenberg and J. G. Wilson, “Interactive Clustering Techniques for Selecting Speaker-Independent Techniques for Selecting Speaker-Independent Reference Templates for Isolated Word Recognition,” IEEE Trans. ASSP. Vol. 27, pp. 134-141,1979. [6] R. Mammone, X. Zhang, R. Ramachandran, Robust Speaker Recognition, IEEE Signal Processing Magazine, September 1996. [7] F. Soong, E. Rosenberg, B. Juang, and L. Rabiner. A Vector Quantization Approach [8] Campbell, J., 1997. Speaker recognition: a tutorial. Proc. IEEE 85 (9),1437–1462. [9] Hatch, A., Stolcke, A., Peskin, B., 2005. Combining feature sets with support vector machines: application to speaker recognition. In: The 2005 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), November 2005, pp. 75–79. [10] Reynolds, D. Speaker Verification Using Adapted Gaussian Mixture Models. Digital Signal Processing 10.1‐3 (2000): 19‐41. Print. [11] K. Chen and L. Wang and H. Chi, Methods of combining multiple classifiers with different features and their application to text-independent speaker identification, International Journal of Pattern Recognition and Artificial Intelligence, Vol. 11, no. 3, pp. 417{445, 1997. [12] Linde, Y., A. Buzo, and R. Gray (1980). An algorithm for vector quantizer design. IEEE Transactions on Communications, 28(1), 84–95. [13] Lu, X. and J. Dang (2008). An investigation of dependencies between frequency components and speaker characteristics for text-independent speaker identification. Speech Communication, 50(4), 312–322. [14] Madikeri, S. R. and H. A. Murthy (2011a). Mel filter bank energy-based slope feature and its application to speaker recognition. In Proc. National Conference on Communications, 1–4. [15] NIST (2004). The NIST year 2004 speaker recognition evaluation plan. http://www.itl.nist.gov/iad/mig/tests/sre/2004/index.html. [16] Sha, F. and L. K. Saul, Large margin gaussian mixture modeling for phonetic classification and recognition. volume 1. IEEE, 2006.
  • 7. Signal Image Processing : An International Journal (SIPIJ) Vol.5, No.5, October 2014 33 AUTHORS Nitesh Kumar Chaudhary Final Year undergraduate (Electronics Communication) student at LNMIIT, jaipur (India). My Research interest includes audio Signal Processing, Speaker Authentication Identification , Machine Learning (Probabilistic model). I have worked as a Visiting Summer Researcher at Newyork University , Summer Research Fellow at Indian Institute of Science, Bangalore and worked as Research Intern at Indian Institute Of Technology ,Delhi a world class program organised by Bharti School Of Telecommunication , Foundation of Innovation and Technology Transfer. More Info : https://sites.google.com/site/nkcniteshkumarchaudhary/ Shraddha Srivastav Post Doc. Researcher at Bharti School of Telecommunication, Indian Institute of Technology, Delhi, PhD from University College Cork, Ireland (UCC)