International Journal of Electronic Engineering Research
ISSN 0975 - 6450 Volume 2 Number 3 (2010) pp. 377–381
© Research India Publications
http://www.ripublication.com/ijeer.htm



              Independent Speaker Recognition for
                     Native English Vowels

G.N. Kodandaramaiah¹, M.N. Giriprasad² and M. Mukunda Rao³

¹HOD, Department of Electronics and Communications Engineering,
Madanapalli Institute of Technology, Madanapalli, India
²Principal, Jawaharlal Nehru Technological University, Pulivendula, India
³Honorary Research Professor, Biomedical Sciences,
Sri Ramachandra Medical College & Research Institute, Chennai, India
E-mail: kodandramaiah@yahoo.com


                                       Abstract

   The standard method of vocal tract shape estimation has been the basis for
   many successful automatic speech recognition (ASR) systems. The analytic
   results presented here demonstrate that estimation of the vocal tract shape
   from reflection coefficients obtained by LPC analysis of speech is
   satisfactory and is related to the place of articulation of the vowels. We
   describe a "standard" approach to vowel classification based on formants,
   the perceptually distinguishable frequency components of human speech.
   Formant frequencies depend on the shape and dimensions of the vocal tract:
   the vocal tract shape is characterized by a set of formant frequencies, and
   different sounds are produced by varying that shape, which gives spoken
   speech its distinctive character. This property has been exploited in many
   speech-related applications such as speech and speaker recognition. In this
   work a Euclidean distance measure is applied to quantify the similarity or
   dissimilarity between two spoken words after each word has been quantized
   into its codebook.

   Keywords: Speech, Vocal tract, Formants, Euclidean distance.


Introduction
Fig. 1.1 shows the block diagram of independent speaker recognition for vowels.
Let S(n) be the test sample of a vowel. Its parameters, the formants F1 and F2,
are then extracted.
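    As a rough illustration of this front end, the sketch below estimates F1 and F2 from a single voiced frame by LPC analysis: autocorrelation, the Levinson-Durbin recursion, and the angles of the complex roots of the prediction polynomial. The sampling rate, window, LPC order and the 90 Hz floor are illustrative assumptions, not values taken from this paper.

    import numpy as np

    def lpc(frame, order):
        """Linear prediction coefficients via the Levinson-Durbin recursion."""
        n = len(frame)
        r = np.correlate(frame, frame, mode="full")[n - 1:n + order]
        a = np.zeros(order + 1)
        a[0] = 1.0
        err = r[0]
        for i in range(1, order + 1):
            acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
            k = -acc / err
            a_prev = a.copy()
            for j in range(1, i):
                a[j] = a_prev[j] + k * a_prev[i - j]
            a[i] = k
            err *= (1.0 - k * k)
        return a

    def estimate_formants(frame, fs, order=10):
        """Estimate formant frequencies (Hz) from the roots of the LPC polynomial."""
        a = lpc(frame * np.hamming(len(frame)), order)
        roots = np.roots(a)
        roots = roots[np.imag(roots) > 0.0]           # keep one root of each conjugate pair
        freqs = np.angle(roots) * fs / (2.0 * np.pi)  # pole angle -> frequency in Hz
        return np.sort(freqs[freqs > 90.0])           # discard near-DC poles, sort ascending

    # For a voiced test frame s_frame sampled at fs, F1 and F2 are the two lowest estimates:
    # f1, f2 = estimate_formants(s_frame, fs=8000)[:2]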

    The extracted formants are compared with the thresholds of the reference
formants. A Euclidean distance measure is applied to quantify the similarity or
dissimilarity between two spoken words after each word has been quantized into
its codebook. An unknown vowel is matched by measuring the Euclidean distance
between its feature vector (formants F1, F2) and the reference model (codebook)
of known vowel formants in the database. The goal is to find the codebook entry
with the minimum distance and thereby identify the unknown vowel (Franti et
al., 1997). In the testing (identification) session, for example, the Euclidean
distance between the feature vector (F1, F2) and the codebook of each spoken
vowel is calculated, and the vowel with the smallest average minimum distance
is picked, as shown in Eq. (1.1).

               d(x, y) = √[ Σᵢ wᵢ (xᵢ − yᵢ)² ],   i = 1, …, D                 (1.1)

where D = 2 is the number of features, xᵢ is the i-th input feature (formants
F1, F2), yᵢ is the i-th feature in the codebook (reference model), wᵢ is the
weight associated with the i-th feature, and d is the distance (recognition
score) between x and y.
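    A minimal sketch of this matching step is given below, assuming the codebook is simply a mapping from vowel labels to reference (F1, F2) pairs. The reference formant values and the test input shown are illustrative placeholders, not the codebook actually used in this work.

    import numpy as np

    def weighted_distance(x, y, w):
        """Weighted Euclidean distance of Eq. (1.1) with D = 2 features (F1, F2)."""
        x, y, w = (np.asarray(v, dtype=float) for v in (x, y, w))
        return float(np.sqrt(np.sum(w * (x - y) ** 2)))

    def match_vowel(features, codebook, w=(2.0 / 3.0, 1.0 / 3.0)):
        """Return the codebook vowel with the minimum weighted distance, plus all distances."""
        distances = {v: weighted_distance(features, ref, w) for v, ref in codebook.items()}
        return min(distances, key=distances.get), distances

    # Illustrative reference formants in Hz (not the paper's actual codebook values):
    codebook = {"/a/": (730, 1090), "/i/": (270, 2290), "/u/": (300, 870)}
    vowel, dists = match_vowel((700, 1150), codebook)   # -> "/a/"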


Decision Rule
The weights w are important when the information contained in the underlying
features is not proportional to the feature variances. In vowel recognition
based on formants F1 and F2, the two formants do not contribute uniformly.
Based on this study, the relative weights F1 = 2 and F2 = 1 are used,
normalized so that the sum of the weights is 1.0.
     We refer to classification based on this distance as Maximum Likelihood
Regression (MLR), since it rests on the Gaussian assumptions used to obtain the
parameters of the classifier. To verify that the displayed vowels produce
accurate results, the MLR computes the distance of the average features for the
given vowels. If the feature distance lies within the threshold criterion
Dᵢ(F1, F2), then Eq. (1.1) reduces to the acceptance criterion
                Dᵢ(f) < α√m                                                   (1.2)
where m is the number of features (F1 and F2) and α is an arbitrary scale
factor used for performance tuning. The vector xᵢ is then identified as the
vector yᵢ; otherwise it is rejected. If α is too small, the MLR rejects many
correct vowel samples; if it is too large, out-of-category vowels are not
rejected. In our work the threshold α=x gave optimum results.
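    Building on the match_vowel helper from the previous sketch, the hypothetical routine below shows how the acceptance test of Eq. (1.2) can be layered on top of the nearest-codebook decision; the weights and α shown are placeholders to be tuned, not the values reported by the paper.

    import numpy as np

    def classify_with_rejection(features, codebook, w=(2.0 / 3.0, 1.0 / 3.0), alpha=1.0):
        """Accept the nearest codebook vowel only if its distance satisfies Eq. (1.2)."""
        m = len(w)                                              # number of features: F1 and F2
        best_vowel, dists = match_vowel(features, codebook, w)  # from the previous sketch
        if dists[best_vowel] < alpha * np.sqrt(m):              # Di(f) < alpha * sqrt(m)
            return best_vowel                                   # within threshold: accept
        return None                                             # outside threshold: reject

    # A small alpha makes the MLR reject many correct samples; a large alpha lets
    # out-of-category vowels through, so alpha is tuned empirically.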




                      Figure 1.1: Block diagram of vowel recognition.


Results of Vowel Recognition for Male and Female Speakers
Male Speakers
Table 3.1 gives the results of male vowel recognition based on the MLR method.
Vowel /a/ achieved the best classification among the vowels. The detection
rates for vowels /u/ and /e/ are better than those for /o/ and /i/ across all
tested samples. Vowels /e/ and /i/ tend to be misclassified as each other
because of inter-speaker variation in the utterances. Fig. 3.1 plots the actual
vowel against the percentage of vowel recognition for 50 male samples.
    For vowel /a/: /a/ recognized as /a/ is 46; as /e/ is 0; as /i/ is 4; as
/o/ is 0; as /u/ is 0. Hence the percentage of correct recognition of vowel /a/
is (/a/ as /a/) × 100 ÷ (/a/ as any vowel) = 46 × 100 ÷ (46+0+4+0+0) =
46 × 100 / 50 = 92%.




             Figure 3.1: Vowel vs. % vowel recognition for male speakers.

   Table 3.1: Percentage recognition of vowels for male speakers
   (rows: actual vowel, columns: predicted vowel).

                Actual   /a/  /e/  /i/  /o/  /u/   % correct
                 /a/      46    0    4    0    0      92%
                 /e/       2   44    0    4    0      89%
                 /i/       6    0   40    0    4      80%
                 /o/       0    3    0   44    3      88%
                 /u/       3    1    1    0   45      90%
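    The percentage column follows directly from the row counts of Table 3.1; the short sketch below reproduces that arithmetic for the male-speaker confusion matrix.

    import numpy as np

    vowels = ["/a/", "/e/", "/i/", "/o/", "/u/"]
    # Rows: actual vowel, columns: predicted vowel (counts from Table 3.1, 50 samples each).
    confusion = np.array([
        [46,  0,  4,  0,  0],   # /a/
        [ 2, 44,  0,  4,  0],   # /e/
        [ 6,  0, 40,  0,  4],   # /i/
        [ 0,  3,  0, 44,  3],   # /o/
        [ 3,  1,  1,  0, 45],   # /u/
    ])

    percent_correct = 100.0 * np.diag(confusion) / confusion.sum(axis=1)
    for v, p in zip(vowels, percent_correct):
        print(f"{v}: {p:.0f}%")        # e.g. /a/: 46*100/50 = 92%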


Female Speakers
Table 3.2 gives the results of female vowel recognition based on the MLR
method. Vowel /o/ achieved the best classification among the vowels. The
detection rates for vowels /u/ and /e/ are better than those for /a/ and /i/
across all tested samples. Vowels /a/ and /i/ tend to be misclassified as each
other because of inter-speaker variation in the utterances. Fig. 3.2 shows the
percentage of vowel recognition for 40 female samples. For vowel /o/: /o/
recognized as /a/ is 0; as /e/ is 0; as /i/ is 0; as /o/ is 39; as /u/ is 1.
Hence the percentage of correct recognition of vowel /o/ is (/o/ as /o/) × 100
÷ (/o/ as any vowel) = 39 × 100 ÷ (0+0+0+39+1) = 39 × 100 / 40 = 98%.




           Figure 3.2: Vowel vs. % vowel recognition for female speakers.

   Table 3.2: Percentage recognition of vowels for female speakers
   (rows: actual vowel, columns: predicted vowel).

                Actual   /a/  /e/  /i/  /o/  /u/   % correct
                 /a/      34    4    0    0    2      85%
                 /e/       0   37    0    3    0      92%
                 /i/       0    4   34    0    2      86%
                 /o/       0    0    0   39    1      98%
                 /u/       3    0    0    0   37      94%


Conclusion
This work was an attempt to apply the standard method of vocal tract shape
estimation that has been the basis of many successful automatic speech
recognition (ASR) systems. We described a "standard" approach to vowel
classification based on formants and achieved 80 to 95 percent speaker
recognition using the Euclidean distance measure.


Acknowledgements
We would like to thank the Management and the Principal of Madanapalli
Institute of Technology and Science, Madanapalli, A.P., for their cooperation
and encouragement.


References
[1]    L.R. Rabiner and R.W. Schafer, Digital Processing of Speech Signals,
       Dorling Kindersley (India) Pvt. Ltd., licensees of Pearson Education in
       South Asia, 1978, pp. 54-101, 412-460.
[2]    T.F. Quatieri, Discrete-Time Speech Signal Processing: Principles and
       Practice, 2002, pp. 56-59.
[3]    P. Ladefoged, R. Harshman, L. Goldstein, and L. Rice, "Generating vocal
       tract shapes from formant frequencies," J. Acoust. Soc. Am., vol. 64,
       no. 4, 1978, pp. 1027-1035.
[4]    M. Bhaowal and K. Chawla, "Isolated word recognition for English
       language using LPC, VQ and HMM," pp. 2-4.
[5]    G.E. Peterson and H.L. Barney, "Control methods used in a study of the
       vowels," J. Acoust. Soc. Am., vol. 24, 1952, pp. 175-184.
[6]    P. Rose, "Long- and short-term within-speaker differences in the
       formants of Australian hello," J. Int. Phonetic Assoc., vol. 29, no. 1,
       1999, pp. 1-31.
[7]    A.A. Safiullah Bhatti and Dr. M. Sleem Miam, "Formants based analysis
       for speech recognition," IEEE, 2006.
