Advanced Computing: An International Journal ( ACIJ ), Vol.4, No.1, January 2013
DOI : 10.5121/acij.2013.4105 45
SPEAKER VERIFICATION USING ACOUSTIC AND
PROSODIC FEATURES
Utpal Bhattacharjee1
and Kshirod Sarmah2
Department of Computer Science and Engineering, Rajiv Gandhi University, Rono
Hills, Doimukh, Arunachal Pradesh, India, PIN – 791112
1
utpalbhattacharjee@rediffmail.com
2
kshirodsarmah@gmail.com
ABSTRACT
In this paper we report experiments carried out on a recently collected speaker recognition database,
the Arunachali Language Speech Database (ALS-DB), to make a comparative study of the
performance of acoustic and prosodic features for the speaker verification task. The speech database consists
of speech data recorded from 200 speakers with Arunachali languages of North-East India as mother
tongue. The collected database is evaluated using a Gaussian Mixture Model-Universal Background Model
(GMM-UBM) based speaker verification system. The acoustic feature considered in the present study is
the Mel-Frequency Cepstral Coefficients (MFCC) along with their derivatives. The performance of the system
has been evaluated for the acoustic and prosodic features individually as well as in
combination. It has been observed that the acoustic features, when considered individually, provide better
performance than the prosodic features. However, when the prosodic features are combined with the acoustic
features, the system outperforms both systems in which the features are considered
individually. There is nearly a 5% improvement in recognition accuracy with respect to the system where
the acoustic features are considered individually, and nearly a 20% improvement with respect to the system
where only the prosodic features are considered.
KEYWORDS
Speaker Verification, Multilingual, GMM-UBM, MFCC, Prosodic Feature
1. INTRODUCTION
Automatic Speaker Recognition (ASR) refers to recognizing persons from their voice. No two speakers
sound identical because their vocal tract shapes, larynx sizes and the other parts of their voice
production organs differ. ASR systems can be divided into Automatic Speaker Verification (ASV) and
Automatic Speaker Identification (ASI) systems [1, 2, 3]. Speaker verification aims to verify whether an
input speech utterance corresponds to the claimed identity. Speaker identification aims to identify an
input speech utterance by selecting one model from a set of enrolled speaker models; in some cases,
speaker verification will follow speaker identification in order to validate the identification result [4].
Speaker verification is the task of determining whether a person is who he or she claims to be
(a yes/no decision). Since it is generally assumed that impostors (speakers making false claims) are not
known to the system, it is also referred to as an open-set task [5].
The speaker verification system aims to verify whether an input speech utterance corresponds to the
claimed identity or not. A security system based on this ability has great potential in several
application domains. Speaker verification systems are typically divided into two
categories – text-dependent and text-independent [6]. In a text-dependent system, a predetermined
group of words or sentences is used to enrol the speaker into the system, and the same words or
sentences are used to verify the speaker. Text-dependent systems use an explicit verification
protocol, usually combined with pass phrases or a Personal Identification Number (PIN) as an
additional level of security. In a text-independent system, no constraints are placed on what can
be said by the speaker. It is an implicit verification process in which the verification is done while
the user is performing some other task, such as talking with a customer care executive or
registering a complaint.
State-of-the-art speaker verification systems use either an adaptive Gaussian mixture model (GMM)
[7] with a universal background model (UBM) or a support vector machine (SVM) over GMM
supervectors [8].
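In the GMM-UBM framework, the accept/reject decision is conventionally taken with an average log-likelihood ratio between the claimed speaker model and the UBM; a standard formulation (the threshold is tuned on development data) is

\Lambda(X) = \frac{1}{T}\sum_{t=1}^{T}\Big[\log p(x_t \mid \lambda_{\mathrm{spk}}) - \log p(x_t \mid \lambda_{\mathrm{UBM}})\Big] \gtrless \theta,
\qquad p(x \mid \lambda) = \sum_{i=1}^{M} w_i\, \mathcal{N}(x;\, \mu_i, \Sigma_i),

where X = \{x_1, \ldots, x_T\} is the sequence of frame-level feature vectors and M is the number of mixture components.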
Mel-frequency cepstral coefficients (MFCCs) are the most commonly used features for
speaker verification systems. Supra-segmental features such as prosody and speaking style are also
combined with the cepstral features to improve performance [9].
Prosody plays a key role in the perception of human speech. The information contained in
prosodic features is partly different from the information contained in cepstral features.
Therefore, more and more researchers from the speech recognition area are showing interest in
prosodic features. Generally, prosody means “the structure that organizes sound”. Pitch (tone),
energy (loudness) and normalized duration (rhythm) are the main components of a speaker's
prosody. Prosody varies from speaker to speaker and relies on long-term information of
speech.
Very often, prosodic features are extracted with a larger frame size than acoustic features, since
prosodic features exist over long speech segments such as syllables. The pitch and energy
contours change slowly compared to the spectrum, which implies that their variation can be
captured over a long speech segment [10].
The rest of the paper is organized as follows. Section 2 describes the details of the speaker
recognition database. Section 3 details the speaker verification system. The experimental setup,
the data used in the experiments and the results obtained are described in Section 4. The paper is
concluded in Section 5.
2. SPEAKER RECOGNITION DATABASE
In this section we describe the recently collected Arunachali Language Speech Database (ALS-DB).
Arunachal Pradesh in North East India is one of the linguistically richest and most diverse
regions in all of Asia, being home to at least thirty and possibly as many as fifty distinct
languages in addition to innumerable dialects and subdialects thereof [11]. The vast majority of
languages indigenous to modern-day Arunachal Pradesh belong to the Tibeto-Burman language
family. The majority of these in turn belong to a single branch of Tibeto-Burman, namely Tani.
Almost all Tani languages are indigenous to central Arunachal Pradesh while a handful of Tani
languages are also spoken in Tibet. Tani languages are noticeably characterized by an overall
relative uniformity, suggesting relatively recent origin and dispersal within their present-day
area of concentration. Most Tani languages are mutually intelligible with at least one other Tani
language, meaning that the area constitutes a dialect chain. In addition to these non-Indo-
European languages, the Indo-European languages Assamese, Bengali, English, Nepali and
especially Hindi are making strong inroads into Arunachal Pradesh primarily as a result of the
primary education system in which classes are generally taught by immigrant teachers from
Hindi-speaking parts of northern India. Because of the linguistic diversity of the region, English
is the only official language recognized in the state.
To study the impact of language variability on the speaker recognition task, the ALS-DB was collected in
a multilingual environment. Each speaker was recorded in three different languages – English,
Hindi and a local language, the latter belonging to one of the four major Arunachali languages –
Adi, Nyishi, Galo and Apatani. Each recording is of 4-5 minutes duration. Speech data were
recorded in parallel across four recording devices, which are listed in Table 1.
Table 1: Device type and recording specifications

Device Sl. No. | Device Type              | Sampling Rate | File Format
Device 1       | Table-mounted microphone | 16 kHz        | wav
Device 2       | Headset microphone       | 16 kHz        | wav
Device 3       | Laptop microphone        | 16 kHz        | wav
Device 4       | Portable voice recorder  | 44.1 kHz      | mp3
The speakers were recorded in reading style. The speech data collection was
done in a laboratory environment with the air conditioner, server and other equipment switched on.
The speech data was contributed by 112 male and 88 female informants chosen from the age
group 20-50 years. During recording, each subject was asked to read a story of 4-5 minutes
duration from a school book twice in each language, and the second reading was kept for the
database. Each informant participated in four recording sessions, with a gap of at least
one week between two consecutive sessions.
3. SPEAKER VERIFICATION SYSTEM
In this work, a speaker verification system has been developed using a Gaussian Mixture Model
with Universal Background Model (GMM-UBM) based modelling approach. A 39-dimensional
feature vector was used, made up of 13 Mel-frequency cepstral coefficients (MFCC) and their
first-order and second-order derivatives. The first-order derivatives were approximated over
three samples, and similarly for the second-order derivatives. The coefficients were extracted from
speech sampled at 16 kHz with 16 bits/sample resolution. A pre-emphasis filter H(z) = 1 − 0.97z⁻¹
has been applied before framing. The pre-emphasized speech signal is segmented into
frames of 20 milliseconds at a frame rate of 100 Hz. Each frame is multiplied by a
Hamming window. From the windowed frame, the FFT is computed and the magnitude
spectrum is filtered with a bank of 22 triangular filters spaced on the Mel scale. The
log-compressed filter outputs are converted to cepstral coefficients by DCT. The 0th cepstral
coefficient is not used in the cepstral feature vector since it corresponds to the energy of the
whole frame [12], and only the next 12 MFCC coefficients have been used. To capture the
time-varying nature of the speech signal, the MFCC coefficients were combined with their first-order
and second-order derivatives, giving a 39-dimensional feature vector.
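As an illustrative realization of this front-end, the Python sketch below uses the open-source librosa library; the paper does not name its feature-extraction toolchain, so the library and the exact parameter values are assumptions chosen to mirror the stated configuration (16 kHz input, pre-emphasis coefficient 0.97, 20 ms Hamming windows at a 100 Hz frame rate, 22 mel filters, c0 discarded). Since the text quotes both 13 coefficients and 12 retained coefficients, the sketch keeps 13 static coefficients after dropping c0 so that the stated 39-dimensional vector is obtained.

import numpy as np
import librosa

def extract_mfcc_39(path):
    # 16 kHz speech, 20 ms Hamming windows, 10 ms shift (100 Hz frame rate)
    y, sr = librosa.load(path, sr=16000)
    y = librosa.effects.preemphasis(y, coef=0.97)           # H(z) = 1 - 0.97 z^-1
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=14,
                                n_fft=320, hop_length=160,  # 20 ms frame, 10 ms hop
                                n_mels=22, window="hamming")
    mfcc = mfcc[1:]                                         # drop c0 (whole-frame energy)
    d1 = librosa.feature.delta(mfcc)                        # first-order derivatives
    d2 = librosa.feature.delta(mfcc, order=2)               # second-order derivatives
    return np.vstack([mfcc, d1, d2]).T                      # (frames, 39)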
Cepstral Mean Subtraction (CMS) has been applied to all features to reduce the effect of
channel mismatch. In addition, Cepstral Variance Normalization (CVN) has been applied, which
forces the feature vectors to follow a unit-variance distribution; this feature-level normalization
gives more robust results.
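A minimal per-utterance sketch of CMS followed by CVN, assuming the frame-by-feature matrix produced above, is simply mean subtraction and scaling to unit variance along the time axis:

import numpy as np

def cms_cvn(features, eps=1e-8):
    # features: (frames, dims). CMS removes the per-utterance cepstral mean;
    # CVN scales each dimension to unit variance.
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    return (features - mean) / (std + eps)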
The prosodic features typically include the pitch, intensity and normalized duration of the syllable.
However, since the combination with the acoustic stream requires frame-based features, only
pitch and intensity are selected to represent the prosodic information in the present study. Pitch
and intensity are static features as they are calculated frame by frame; they only represent the
exact value of the current frame. In order to incorporate more temporal information, their
first-order and second-order derivatives have also been included.
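One possible frame-level extraction of this prosodic stream (pitch and intensity plus their first- and second-order derivatives, i.e. a 6-dimensional vector per frame) is sketched below. The pYIN pitch tracker, the RMS measure of intensity and the pitch search range are illustrative assumptions rather than details taken from the paper; a longer analysis window is used for pitch, in line with the observation in Section 1 that prosodic features are extracted over longer segments.

import numpy as np
import librosa

def extract_prosodic(path):
    y, sr = librosa.load(path, sr=16000)
    # pitch with a longer analysis window (64 ms) but the same 10 ms hop
    f0, voiced_flag, _ = librosa.pyin(y, fmin=60, fmax=400, sr=sr,
                                      frame_length=1024, hop_length=160)
    f0 = np.nan_to_num(f0)                                   # unvoiced frames -> 0
    # intensity as frame-level RMS energy (20 ms frames, 10 ms hop)
    rms = librosa.feature.rms(y=y, frame_length=320, hop_length=160)[0]
    n = min(len(f0), len(rms))
    static = np.vstack([f0[:n], rms[:n]])                    # pitch and intensity
    d1 = librosa.feature.delta(static)                       # 1st-order derivatives
    d2 = librosa.feature.delta(static, order=2)              # 2nd-order derivatives
    return np.vstack([static, d1, d2]).T                     # (frames, 6)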
A Gaussian mixture model with 1024 Gaussian components has been used for both the UBM
and the speaker models. The UBM was created by training gender-dependent models on the data of 50 male and 50
female speakers, with 512 Gaussian components for each of the male and female models, using the
Expectation-Maximization (EM) algorithm. The final UBM was then created by pooling the
male and female models and averaging them [13]. The speaker models
were created by adapting only the mean parameters of the UBM using the maximum a posteriori
(MAP) approach with the speaker-specific data [8].
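The modelling step can be sketched as follows; this is an illustration under the stated configuration, not the authors' code. Two gender-dependent 512-component GMMs are trained with EM, pooled into a 1024-component UBM, and each speaker model is obtained by MAP adaptation of the UBM means only. scikit-learn's GaussianMixture is used for EM, and the relevance factor r = 16 is an assumed value.

import numpy as np
from sklearn.mixture import GaussianMixture

def train_gender_gmm(frames, n_components=512):
    # EM training of a gender-dependent GMM on pooled frame-level features
    gmm = GaussianMixture(n_components=n_components, covariance_type="diag",
                          max_iter=100, reg_covar=1e-4)
    gmm.fit(frames)
    return gmm

def pool_ubm(gmm_male, gmm_female):
    # pool the two gender models into a single 1024-component UBM,
    # halving (averaging) the mixture weights so that they sum to one
    ubm = GaussianMixture(n_components=1024, covariance_type="diag")
    ubm.weights_ = np.concatenate([gmm_male.weights_, gmm_female.weights_]) / 2.0
    ubm.means_ = np.vstack([gmm_male.means_, gmm_female.means_])
    ubm.covariances_ = np.vstack([gmm_male.covariances_, gmm_female.covariances_])
    ubm.precisions_cholesky_ = 1.0 / np.sqrt(ubm.covariances_)
    return ubm

def map_adapt_means(ubm, speaker_frames, relevance=16.0):
    # MAP adaptation of the mean vectors only; UBM weights and covariances are kept
    post = ubm.predict_proba(speaker_frames)         # responsibilities, (frames, 1024)
    n_i = post.sum(axis=0) + 1e-10                   # soft counts per component
    e_i = post.T @ speaker_frames / n_i[:, None]     # component-wise data means
    alpha = (n_i / (n_i + relevance))[:, None]       # data-dependent adaptation weights
    return alpha * e_i + (1.0 - alpha) * ubm.means_  # adapted speaker means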
The detection error trade-off (DET) curve has been plotted using the log-likelihood ratio between
the claimed speaker model and the UBM, and the equal error rate (EER) obtained from the DET curve
has been used as a measure of the performance of the speaker verification system. The minimum
detection cost function (minimum DCF) value has also been evaluated as an additional measure.
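A hedged sketch of how the EER and minimum DCF can be computed from the trial scores is given below; the cost parameters (C_miss = 10, C_fa = 1, P_target = 0.01) are the values commonly associated with the NIST SRE 2003 evaluation plan [14] and are assumed here rather than quoted from the paper.

import numpy as np
from sklearn.metrics import roc_curve

def eer_and_min_dcf(scores, labels, c_miss=10.0, c_fa=1.0, p_target=0.01):
    # scores: log-likelihood ratios; labels: 1 for target trials, 0 for impostor trials
    fpr, tpr, _ = roc_curve(labels, scores)
    fnr = 1.0 - tpr                                  # miss (false rejection) rate
    eer = fpr[np.nanargmin(np.abs(fnr - fpr))]       # operating point where miss = false alarm
    dcf = c_miss * fnr * p_target + c_fa * fpr * (1.0 - p_target)
    return eer, dcf.min()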
4. EXPERIMENTS AND RESULTS
All the experiments reported in this paper are carried out using the ALS-DB database described
in Section 2. An energy-based silence detector is used to identify and discard the silence frames
prior to feature extraction. Only data from the headset microphone has been considered in the
present study. All four available sessions were considered for the experiments. Each speaker
model was trained using one complete session, and the test sequences were extracted from the other
three sessions. The training set consists of speech data of length 120 seconds per speaker. The
test set consists of speech data of lengths 15 seconds, 30 seconds and 45 seconds. The test set
contains more than 3500 test segments of varying length, and each test segment is evaluated
against 11 hypothesized speakers of the same sex as the segment speaker [14].
In these experiments a single language (English, Hindi or the local language) has been considered
for training the system, and each language has been considered separately for testing the system.
Training samples of length 120 seconds from a single session were used for training
the system, and the other three sessions were used for testing the system.
Figures 1(a), (b) and (c) show the DET curves obtained by the speaker verification system for the
three languages in the speech database. The results of the experiments are summarized in
Table 2.
Figure 1(a). DET curves for the speaker verification system using the Local language as training
and testing language
Figure 1(b). DET curves for the speaker verification system using the Hindi language as training
and testing language
Figure 1(c). DET curves for the speaker verification system using the English language as
training and testing language
Table 2. EER and minimum DCF values for the speaker verification system for training with one
language and testing with each language

Training & Testing Language | Feature Vector                    | EER (%) | Recognition Rate (%) | Minimum DCF
Local                       | MFCC + ∆MFCC + ∆∆MFCC             | 6.81    | 93.19                | 0.0991
Local                       | Prosodic                          | 21.50   | 79.50                | 0.4045
Local                       | MFCC + ∆MFCC + ∆∆MFCC + Prosodic  | 4.55    | 95.45                | 0.0823
Hindi                       | MFCC + ∆MFCC + ∆∆MFCC             | 6.81    | 93.19                | 0.0968
Hindi                       | Prosodic                          | 25.00   | 75.00                | 0.4438
Hindi                       | MFCC + ∆MFCC + ∆∆MFCC + Prosodic  | 4.55    | 95.45                | 0.0927
English                     | MFCC + ∆MFCC + ∆∆MFCC             | 9.09    | 90.91                | 0.1195
English                     | Prosodic                          | 26.50   | 73.50                | 0.4632
English                     | MFCC + ∆MFCC + ∆∆MFCC + Prosodic  | 4.55    | 95.45                | 0.0823
5. CONCLUSIONS
The experiments reported in the above section establish that MFCC features along
with their time derivatives may be considered an efficient parameterization of the speech signal
for the speaker verification task. It has been observed that when MFCC with its first- and second-order
derivatives is used as the feature vector, the system gives a recognition accuracy of around
92.43%. The performance of another feature vector, namely prosody, has also been evaluated in the
present study. It has been observed that when the prosodic features are used as the feature
vector for the speaker verification system, the recognition accuracy is 76%, which is
nearly 16% below the recognition accuracy of the MFCC feature vector. However, it has been
observed that when the MFCC and prosodic features are combined, a recognition accuracy of
95.45% has been achieved. Thus, it may be concluded that when MFCC features are combined with
prosodic features, the performance of the system improves marginally, in the present study by
about 3%, without increasing the complexity of the system. Another interesting observation made in
the present study is that when the English language has been considered for training and testing the
system, a recognition accuracy of 90.91% has been achieved with MFCC features, whereas
under the same experimental conditions the recognition accuracy for the Local and Hindi languages is
93.19%. The relatively poor performance in the case of English can be explained in the context of the
linguistic scenario of Arunachal Pradesh. The people of Arunachal Pradesh, especially
the educated section, use Hindi in their day-to-day conversation even with their family members.
Therefore, they are as fluent in Hindi as in their native language, and its articulation is relatively
robust to context, like that of their native language. However, English, being a non-native language,
undergoes major deviation in articulation with context.
ACKNOWLEDGEMENT
This work has been supported by the ongoing project grant No. 12(12)/2009-ESD sponsored by
the Department of Information Technology, Government of India.
REFERENCES
[1] Furui, S., (1997) “Recent advances in speaker recognition”, Pattern Recognition Lett., 18, 859–872.
[2] Campbell, J.P., Jr., (1997) “Speaker recognition: a tutorial”, Proceedings of the IEEE, 85(9): 1437–
1462.
[3] Bimbot, F. et al, (2004), “A tutorial on text-independent speaker verification”, EURASIP Journal on
Applied Signal Processing, 4, 430–451.
[4] Park, A. and Hazen, T. J. (2002), “ASR dependent techniques for speaker identification”,
Proceedings of the International Conference on Spoken Language Processing, Denver, Colorado.
[5] Douglas A. Reynolds (2002), “An overview of Automatic Speaker Recognition Technology”, MIT
Lincoln Laboratory, 244 Wood St., Lexington, MA 02140, USA, IEEE 2002.
[6] Rosenberg, A., Delong, J., Lee, C., Juang, B. and Soong, F., (1992), "The use of cohort normalized scores
for speaker recognition," In Proc. ICSLP, pp. 599–602.
[7] Reynolds, D. A. (1995), "Robust text-independent speaker identification using Gaussian mixture speaker
models," Speech Communications, vol. 17, pp. 91-108.
[8] D. A. Reynolds, T. F. Quatieri, and R. B. Dunn (2000), "Speaker Verification Using Adapted
Gaussian Mixture Models," Digital Signal Processing, vol. 10(1–3), pp. 19-41.
[9] Haris B.C., Pradhan G., Misra A, Shukla S., Sinha R and Prasanna S.R.M. (2011), Multi-variability
Speech Database for Robust Speaker Recognition, In Proc. NCC, pp. 1-5.
[10]Shriberg, E.E., (2007), “Higher Level Features in Speaker Recognition”, In C. Muller (Ed.) Speaker
Classification I. Volume 4343 of Lecture Notes in Computer Science / Artificial Intelligence.
Springer: Heidelberg / Berlin / New York, pp. 241-259.
[11]Arunachal Pradesh, http://en.wikipedia.org/wiki/Arunachal_Pradesh.
[12]Xiaojia Z., Yang S. and DeLiang W., (2011), "Robust speaker identification using a CASA front-end",
In Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP),
pp. 5468-5471.
[13]Kleynhans N.T. and Barnard E., (2005), Language dependence in multilingual speaker verification,
In Proc. of the 16th Annual Symposium of the Pattern Recognition Association of South Africa,
Langebaan, South Africa, pp. 117-122.
[14]NIST 2003 Evaluation plan, http://www.itl.nist.gov/iad/mig/tests/sre/2003/2003-spkrec-evalplan-v2.2.

More Related Content

What's hot

Approach To Build A Marathi Text-To-Speech System Using Concatenative Synthes...
Approach To Build A Marathi Text-To-Speech System Using Concatenative Synthes...Approach To Build A Marathi Text-To-Speech System Using Concatenative Synthes...
Approach To Build A Marathi Text-To-Speech System Using Concatenative Synthes...
IJERA Editor
 
5215ijcseit01
5215ijcseit015215ijcseit01
5215ijcseit01ijcsit
 
Isolated English Word Recognition System: Appropriate for Bengali-accented En...
Isolated English Word Recognition System: Appropriate for Bengali-accented En...Isolated English Word Recognition System: Appropriate for Bengali-accented En...
Isolated English Word Recognition System: Appropriate for Bengali-accented En...
International Journal of Science and Research (IJSR)
 
International journal of signal and image processing issues vol 2015 - no 1...
International journal of signal and image processing issues   vol 2015 - no 1...International journal of signal and image processing issues   vol 2015 - no 1...
International journal of signal and image processing issues vol 2015 - no 1...
sophiabelthome
 
FORMANT ANALYSIS OF BANGLA VOWEL FOR AUTOMATIC SPEECH RECOGNITION
FORMANT ANALYSIS OF BANGLA VOWEL FOR AUTOMATIC SPEECH RECOGNITIONFORMANT ANALYSIS OF BANGLA VOWEL FOR AUTOMATIC SPEECH RECOGNITION
FORMANT ANALYSIS OF BANGLA VOWEL FOR AUTOMATIC SPEECH RECOGNITION
sipij
 
Comparison of Feature Extraction MFCC and LPC in Automatic Speech Recognition...
Comparison of Feature Extraction MFCC and LPC in Automatic Speech Recognition...Comparison of Feature Extraction MFCC and LPC in Automatic Speech Recognition...
Comparison of Feature Extraction MFCC and LPC in Automatic Speech Recognition...
TELKOMNIKA JOURNAL
 
Kc3517481754
Kc3517481754Kc3517481754
Kc3517481754
IJERA Editor
 
A Survey on Speaker Recognition System
A Survey on Speaker Recognition SystemA Survey on Speaker Recognition System
A Survey on Speaker Recognition SystemVani011
 
ANALYSIS OF SPEECH UNDER STRESS USING LINEAR TECHNIQUES AND NON-LINEAR TECHNI...
ANALYSIS OF SPEECH UNDER STRESS USING LINEAR TECHNIQUES AND NON-LINEAR TECHNI...ANALYSIS OF SPEECH UNDER STRESS USING LINEAR TECHNIQUES AND NON-LINEAR TECHNI...
ANALYSIS OF SPEECH UNDER STRESS USING LINEAR TECHNIQUES AND NON-LINEAR TECHNI...
cscpconf
 
Classification improvement of spoken arabic language based on radial basis fu...
Classification improvement of spoken arabic language based on radial basis fu...Classification improvement of spoken arabic language based on radial basis fu...
Classification improvement of spoken arabic language based on radial basis fu...
IJECEIAES
 
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
ijcsit
 
An effective evaluation study of objective measures using spectral subtractiv...
An effective evaluation study of objective measures using spectral subtractiv...An effective evaluation study of objective measures using spectral subtractiv...
An effective evaluation study of objective measures using spectral subtractiv...
eSAT Journals
 
Performance of different classifiers in speech recognition
Performance of different classifiers in speech recognitionPerformance of different classifiers in speech recognition
Performance of different classifiers in speech recognition
eSAT Publishing House
 

What's hot (14)

Approach To Build A Marathi Text-To-Speech System Using Concatenative Synthes...
Approach To Build A Marathi Text-To-Speech System Using Concatenative Synthes...Approach To Build A Marathi Text-To-Speech System Using Concatenative Synthes...
Approach To Build A Marathi Text-To-Speech System Using Concatenative Synthes...
 
5215ijcseit01
5215ijcseit015215ijcseit01
5215ijcseit01
 
Isolated English Word Recognition System: Appropriate for Bengali-accented En...
Isolated English Word Recognition System: Appropriate for Bengali-accented En...Isolated English Word Recognition System: Appropriate for Bengali-accented En...
Isolated English Word Recognition System: Appropriate for Bengali-accented En...
 
International journal of signal and image processing issues vol 2015 - no 1...
International journal of signal and image processing issues   vol 2015 - no 1...International journal of signal and image processing issues   vol 2015 - no 1...
International journal of signal and image processing issues vol 2015 - no 1...
 
FORMANT ANALYSIS OF BANGLA VOWEL FOR AUTOMATIC SPEECH RECOGNITION
FORMANT ANALYSIS OF BANGLA VOWEL FOR AUTOMATIC SPEECH RECOGNITIONFORMANT ANALYSIS OF BANGLA VOWEL FOR AUTOMATIC SPEECH RECOGNITION
FORMANT ANALYSIS OF BANGLA VOWEL FOR AUTOMATIC SPEECH RECOGNITION
 
Comparison of Feature Extraction MFCC and LPC in Automatic Speech Recognition...
Comparison of Feature Extraction MFCC and LPC in Automatic Speech Recognition...Comparison of Feature Extraction MFCC and LPC in Automatic Speech Recognition...
Comparison of Feature Extraction MFCC and LPC in Automatic Speech Recognition...
 
Kc3517481754
Kc3517481754Kc3517481754
Kc3517481754
 
Ijetcas14 390
Ijetcas14 390Ijetcas14 390
Ijetcas14 390
 
A Survey on Speaker Recognition System
A Survey on Speaker Recognition SystemA Survey on Speaker Recognition System
A Survey on Speaker Recognition System
 
ANALYSIS OF SPEECH UNDER STRESS USING LINEAR TECHNIQUES AND NON-LINEAR TECHNI...
ANALYSIS OF SPEECH UNDER STRESS USING LINEAR TECHNIQUES AND NON-LINEAR TECHNI...ANALYSIS OF SPEECH UNDER STRESS USING LINEAR TECHNIQUES AND NON-LINEAR TECHNI...
ANALYSIS OF SPEECH UNDER STRESS USING LINEAR TECHNIQUES AND NON-LINEAR TECHNI...
 
Classification improvement of spoken arabic language based on radial basis fu...
Classification improvement of spoken arabic language based on radial basis fu...Classification improvement of spoken arabic language based on radial basis fu...
Classification improvement of spoken arabic language based on radial basis fu...
 
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
 
An effective evaluation study of objective measures using spectral subtractiv...
An effective evaluation study of objective measures using spectral subtractiv...An effective evaluation study of objective measures using spectral subtractiv...
An effective evaluation study of objective measures using spectral subtractiv...
 
Performance of different classifiers in speech recognition
Performance of different classifiers in speech recognitionPerformance of different classifiers in speech recognition
Performance of different classifiers in speech recognition
 

Similar to SPEAKER VERIFICATION USING ACOUSTIC AND PROSODIC FEATURES

IMPROVING MYANMAR AUTOMATIC SPEECH RECOGNITION WITH OPTIMIZATION OF CONVOLUTI...
IMPROVING MYANMAR AUTOMATIC SPEECH RECOGNITION WITH OPTIMIZATION OF CONVOLUTI...IMPROVING MYANMAR AUTOMATIC SPEECH RECOGNITION WITH OPTIMIZATION OF CONVOLUTI...
IMPROVING MYANMAR AUTOMATIC SPEECH RECOGNITION WITH OPTIMIZATION OF CONVOLUTI...
kevig
 
MULTILINGUAL SPEECH IDENTIFICATION USING ARTIFICIAL NEURAL NETWORK
 MULTILINGUAL SPEECH IDENTIFICATION USING ARTIFICIAL NEURAL NETWORK MULTILINGUAL SPEECH IDENTIFICATION USING ARTIFICIAL NEURAL NETWORK
MULTILINGUAL SPEECH IDENTIFICATION USING ARTIFICIAL NEURAL NETWORK
ijitcs
 
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMARSYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
ijcseit
 
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMARSYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
ijcseit
 
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMARSYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
ijcseit
 
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMARSYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
ijcseit
 
Effect of Dynamic Time Warping on Alignment of Phrases and Phonemes
Effect of Dynamic Time Warping on Alignment of Phrases and PhonemesEffect of Dynamic Time Warping on Alignment of Phrases and Phonemes
Effect of Dynamic Time Warping on Alignment of Phrases and Phonemes
kevig
 
EFFECT OF DYNAMIC TIME WARPING ON ALIGNMENT OF PHRASES AND PHONEMES
EFFECT OF DYNAMIC TIME WARPING ON ALIGNMENT OF PHRASES AND PHONEMESEFFECT OF DYNAMIC TIME WARPING ON ALIGNMENT OF PHRASES AND PHONEMES
EFFECT OF DYNAMIC TIME WARPING ON ALIGNMENT OF PHRASES AND PHONEMES
kevig
 
Emotional telugu speech signals classification based on k nn classifier
Emotional telugu speech signals classification based on k nn classifierEmotional telugu speech signals classification based on k nn classifier
Emotional telugu speech signals classification based on k nn classifier
eSAT Publishing House
 
Emotional telugu speech signals classification based on k nn classifier
Emotional telugu speech signals classification based on k nn classifierEmotional telugu speech signals classification based on k nn classifier
Emotional telugu speech signals classification based on k nn classifier
eSAT Journals
 
PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...
PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...
PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...
ijma
 
PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...
PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...
PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...
ijma
 
Hindi digits recognition system on speech data collected in different natural...
Hindi digits recognition system on speech data collected in different natural...Hindi digits recognition system on speech data collected in different natural...
Hindi digits recognition system on speech data collected in different natural...
csandit
 
Effect of MFCC Based Features for Speech Signal Alignments
Effect of MFCC Based Features for Speech Signal AlignmentsEffect of MFCC Based Features for Speech Signal Alignments
Effect of MFCC Based Features for Speech Signal Alignments
kevig
 
EFFECT OF MFCC BASED FEATURES FOR SPEECH SIGNAL ALIGNMENTS
EFFECT OF MFCC BASED FEATURES FOR SPEECH SIGNAL ALIGNMENTSEFFECT OF MFCC BASED FEATURES FOR SPEECH SIGNAL ALIGNMENTS
EFFECT OF MFCC BASED FEATURES FOR SPEECH SIGNAL ALIGNMENTS
ijnlc
 
19 ijcse-01227
19 ijcse-0122719 ijcse-01227
19 ijcse-01227
Shivlal Mewada
 
IRJET- Spoken Language Identification System using MFCC Features and Gaus...
IRJET-  	  Spoken Language Identification System using MFCC Features and Gaus...IRJET-  	  Spoken Language Identification System using MFCC Features and Gaus...
IRJET- Spoken Language Identification System using MFCC Features and Gaus...
IRJET Journal
 
Broad Phoneme Classification Using Signal Based Features
Broad Phoneme Classification Using Signal Based Features  Broad Phoneme Classification Using Signal Based Features
Broad Phoneme Classification Using Signal Based Features
ijsc
 

Similar to SPEAKER VERIFICATION USING ACOUSTIC AND PROSODIC FEATURES (20)

IMPROVING MYANMAR AUTOMATIC SPEECH RECOGNITION WITH OPTIMIZATION OF CONVOLUTI...
IMPROVING MYANMAR AUTOMATIC SPEECH RECOGNITION WITH OPTIMIZATION OF CONVOLUTI...IMPROVING MYANMAR AUTOMATIC SPEECH RECOGNITION WITH OPTIMIZATION OF CONVOLUTI...
IMPROVING MYANMAR AUTOMATIC SPEECH RECOGNITION WITH OPTIMIZATION OF CONVOLUTI...
 
MULTILINGUAL SPEECH IDENTIFICATION USING ARTIFICIAL NEURAL NETWORK
 MULTILINGUAL SPEECH IDENTIFICATION USING ARTIFICIAL NEURAL NETWORK MULTILINGUAL SPEECH IDENTIFICATION USING ARTIFICIAL NEURAL NETWORK
MULTILINGUAL SPEECH IDENTIFICATION USING ARTIFICIAL NEURAL NETWORK
 
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMARSYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
 
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMARSYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
 
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMARSYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
 
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMARSYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
 
50120140501002
5012014050100250120140501002
50120140501002
 
Effect of Dynamic Time Warping on Alignment of Phrases and Phonemes
Effect of Dynamic Time Warping on Alignment of Phrases and PhonemesEffect of Dynamic Time Warping on Alignment of Phrases and Phonemes
Effect of Dynamic Time Warping on Alignment of Phrases and Phonemes
 
EFFECT OF DYNAMIC TIME WARPING ON ALIGNMENT OF PHRASES AND PHONEMES
EFFECT OF DYNAMIC TIME WARPING ON ALIGNMENT OF PHRASES AND PHONEMESEFFECT OF DYNAMIC TIME WARPING ON ALIGNMENT OF PHRASES AND PHONEMES
EFFECT OF DYNAMIC TIME WARPING ON ALIGNMENT OF PHRASES AND PHONEMES
 
Emotional telugu speech signals classification based on k nn classifier
Emotional telugu speech signals classification based on k nn classifierEmotional telugu speech signals classification based on k nn classifier
Emotional telugu speech signals classification based on k nn classifier
 
Emotional telugu speech signals classification based on k nn classifier
Emotional telugu speech signals classification based on k nn classifierEmotional telugu speech signals classification based on k nn classifier
Emotional telugu speech signals classification based on k nn classifier
 
PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...
PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...
PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...
 
PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...
PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...
PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...
 
Hindi digits recognition system on speech data collected in different natural...
Hindi digits recognition system on speech data collected in different natural...Hindi digits recognition system on speech data collected in different natural...
Hindi digits recognition system on speech data collected in different natural...
 
Effect of MFCC Based Features for Speech Signal Alignments
Effect of MFCC Based Features for Speech Signal AlignmentsEffect of MFCC Based Features for Speech Signal Alignments
Effect of MFCC Based Features for Speech Signal Alignments
 
EFFECT OF MFCC BASED FEATURES FOR SPEECH SIGNAL ALIGNMENTS
EFFECT OF MFCC BASED FEATURES FOR SPEECH SIGNAL ALIGNMENTSEFFECT OF MFCC BASED FEATURES FOR SPEECH SIGNAL ALIGNMENTS
EFFECT OF MFCC BASED FEATURES FOR SPEECH SIGNAL ALIGNMENTS
 
19 ijcse-01227
19 ijcse-0122719 ijcse-01227
19 ijcse-01227
 
IRJET- Spoken Language Identification System using MFCC Features and Gaus...
IRJET-  	  Spoken Language Identification System using MFCC Features and Gaus...IRJET-  	  Spoken Language Identification System using MFCC Features and Gaus...
IRJET- Spoken Language Identification System using MFCC Features and Gaus...
 
Broad Phoneme Classification Using Signal Based Features
Broad Phoneme Classification Using Signal Based Features  Broad Phoneme Classification Using Signal Based Features
Broad Phoneme Classification Using Signal Based Features
 
histogram-based-emotion
histogram-based-emotionhistogram-based-emotion
histogram-based-emotion
 

More from acijjournal

Jan_2024_Top_read_articles_in_ACIJ.pdf
Jan_2024_Top_read_articles_in_ACIJ.pdfJan_2024_Top_read_articles_in_ACIJ.pdf
Jan_2024_Top_read_articles_in_ACIJ.pdf
acijjournal
 
Call for Papers - Advanced Computing An International Journal (ACIJ) (2).pdf
Call for Papers - Advanced Computing An International Journal (ACIJ) (2).pdfCall for Papers - Advanced Computing An International Journal (ACIJ) (2).pdf
Call for Papers - Advanced Computing An International Journal (ACIJ) (2).pdf
acijjournal
 
cs - ACIJ (4) (1).pdf
cs - ACIJ (4) (1).pdfcs - ACIJ (4) (1).pdf
cs - ACIJ (4) (1).pdf
acijjournal
 
cs - ACIJ (2).pdf
cs - ACIJ (2).pdfcs - ACIJ (2).pdf
cs - ACIJ (2).pdf
acijjournal
 
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
acijjournal
 
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
acijjournal
 
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
acijjournal
 
4thInternational Conference on Machine Learning & Applications (CMLA 2022)
4thInternational Conference on Machine Learning & Applications (CMLA 2022)4thInternational Conference on Machine Learning & Applications (CMLA 2022)
4thInternational Conference on Machine Learning & Applications (CMLA 2022)
acijjournal
 
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
acijjournal
 
3rdInternational Conference on Natural Language Processingand Applications (N...
3rdInternational Conference on Natural Language Processingand Applications (N...3rdInternational Conference on Natural Language Processingand Applications (N...
3rdInternational Conference on Natural Language Processingand Applications (N...
acijjournal
 
4thInternational Conference on Machine Learning & Applications (CMLA 2022)
4thInternational Conference on Machine Learning & Applications (CMLA 2022)4thInternational Conference on Machine Learning & Applications (CMLA 2022)
4thInternational Conference on Machine Learning & Applications (CMLA 2022)
acijjournal
 
Graduate School Cyber Portfolio: The Innovative Menu For Sustainable Development
Graduate School Cyber Portfolio: The Innovative Menu For Sustainable DevelopmentGraduate School Cyber Portfolio: The Innovative Menu For Sustainable Development
Graduate School Cyber Portfolio: The Innovative Menu For Sustainable Development
acijjournal
 
Genetic Algorithms and Programming - An Evolutionary Methodology
Genetic Algorithms and Programming - An Evolutionary MethodologyGenetic Algorithms and Programming - An Evolutionary Methodology
Genetic Algorithms and Programming - An Evolutionary Methodology
acijjournal
 
Data Transformation Technique for Protecting Private Information in Privacy P...
Data Transformation Technique for Protecting Private Information in Privacy P...Data Transformation Technique for Protecting Private Information in Privacy P...
Data Transformation Technique for Protecting Private Information in Privacy P...
acijjournal
 
Advanced Computing: An International Journal (ACIJ)
Advanced Computing: An International Journal (ACIJ) Advanced Computing: An International Journal (ACIJ)
Advanced Computing: An International Journal (ACIJ)
acijjournal
 
E-Maintenance: Impact Over Industrial Processes, Its Dimensions & Principles
E-Maintenance: Impact Over Industrial Processes, Its Dimensions & PrinciplesE-Maintenance: Impact Over Industrial Processes, Its Dimensions & Principles
E-Maintenance: Impact Over Industrial Processes, Its Dimensions & Principles
acijjournal
 
10th International Conference on Software Engineering and Applications (SEAPP...
10th International Conference on Software Engineering and Applications (SEAPP...10th International Conference on Software Engineering and Applications (SEAPP...
10th International Conference on Software Engineering and Applications (SEAPP...
acijjournal
 
10th International conference on Parallel, Distributed Computing and Applicat...
10th International conference on Parallel, Distributed Computing and Applicat...10th International conference on Parallel, Distributed Computing and Applicat...
10th International conference on Parallel, Distributed Computing and Applicat...
acijjournal
 
DETECTION OF FORGERY AND FABRICATION IN PASSPORTS AND VISAS USING CRYPTOGRAPH...
DETECTION OF FORGERY AND FABRICATION IN PASSPORTS AND VISAS USING CRYPTOGRAPH...DETECTION OF FORGERY AND FABRICATION IN PASSPORTS AND VISAS USING CRYPTOGRAPH...
DETECTION OF FORGERY AND FABRICATION IN PASSPORTS AND VISAS USING CRYPTOGRAPH...
acijjournal
 
Detection of Forgery and Fabrication in Passports and Visas Using Cryptograph...
Detection of Forgery and Fabrication in Passports and Visas Using Cryptograph...Detection of Forgery and Fabrication in Passports and Visas Using Cryptograph...
Detection of Forgery and Fabrication in Passports and Visas Using Cryptograph...
acijjournal
 

More from acijjournal (20)

Jan_2024_Top_read_articles_in_ACIJ.pdf
Jan_2024_Top_read_articles_in_ACIJ.pdfJan_2024_Top_read_articles_in_ACIJ.pdf
Jan_2024_Top_read_articles_in_ACIJ.pdf
 
Call for Papers - Advanced Computing An International Journal (ACIJ) (2).pdf
Call for Papers - Advanced Computing An International Journal (ACIJ) (2).pdfCall for Papers - Advanced Computing An International Journal (ACIJ) (2).pdf
Call for Papers - Advanced Computing An International Journal (ACIJ) (2).pdf
 
cs - ACIJ (4) (1).pdf
cs - ACIJ (4) (1).pdfcs - ACIJ (4) (1).pdf
cs - ACIJ (4) (1).pdf
 
cs - ACIJ (2).pdf
cs - ACIJ (2).pdfcs - ACIJ (2).pdf
cs - ACIJ (2).pdf
 
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
 
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
 
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
 
4thInternational Conference on Machine Learning & Applications (CMLA 2022)
4thInternational Conference on Machine Learning & Applications (CMLA 2022)4thInternational Conference on Machine Learning & Applications (CMLA 2022)
4thInternational Conference on Machine Learning & Applications (CMLA 2022)
 
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
 
3rdInternational Conference on Natural Language Processingand Applications (N...
3rdInternational Conference on Natural Language Processingand Applications (N...3rdInternational Conference on Natural Language Processingand Applications (N...
3rdInternational Conference on Natural Language Processingand Applications (N...
 
4thInternational Conference on Machine Learning & Applications (CMLA 2022)
4thInternational Conference on Machine Learning & Applications (CMLA 2022)4thInternational Conference on Machine Learning & Applications (CMLA 2022)
4thInternational Conference on Machine Learning & Applications (CMLA 2022)
 
Graduate School Cyber Portfolio: The Innovative Menu For Sustainable Development
Graduate School Cyber Portfolio: The Innovative Menu For Sustainable DevelopmentGraduate School Cyber Portfolio: The Innovative Menu For Sustainable Development
Graduate School Cyber Portfolio: The Innovative Menu For Sustainable Development
 
Genetic Algorithms and Programming - An Evolutionary Methodology
Genetic Algorithms and Programming - An Evolutionary MethodologyGenetic Algorithms and Programming - An Evolutionary Methodology
Genetic Algorithms and Programming - An Evolutionary Methodology
 
Data Transformation Technique for Protecting Private Information in Privacy P...
Data Transformation Technique for Protecting Private Information in Privacy P...Data Transformation Technique for Protecting Private Information in Privacy P...
Data Transformation Technique for Protecting Private Information in Privacy P...
 
Advanced Computing: An International Journal (ACIJ)
Advanced Computing: An International Journal (ACIJ) Advanced Computing: An International Journal (ACIJ)
Advanced Computing: An International Journal (ACIJ)
 
E-Maintenance: Impact Over Industrial Processes, Its Dimensions & Principles
E-Maintenance: Impact Over Industrial Processes, Its Dimensions & PrinciplesE-Maintenance: Impact Over Industrial Processes, Its Dimensions & Principles
E-Maintenance: Impact Over Industrial Processes, Its Dimensions & Principles
 
10th International Conference on Software Engineering and Applications (SEAPP...
10th International Conference on Software Engineering and Applications (SEAPP...10th International Conference on Software Engineering and Applications (SEAPP...
10th International Conference on Software Engineering and Applications (SEAPP...
 
10th International conference on Parallel, Distributed Computing and Applicat...
10th International conference on Parallel, Distributed Computing and Applicat...10th International conference on Parallel, Distributed Computing and Applicat...
10th International conference on Parallel, Distributed Computing and Applicat...
 
DETECTION OF FORGERY AND FABRICATION IN PASSPORTS AND VISAS USING CRYPTOGRAPH...
DETECTION OF FORGERY AND FABRICATION IN PASSPORTS AND VISAS USING CRYPTOGRAPH...DETECTION OF FORGERY AND FABRICATION IN PASSPORTS AND VISAS USING CRYPTOGRAPH...
DETECTION OF FORGERY AND FABRICATION IN PASSPORTS AND VISAS USING CRYPTOGRAPH...
 
Detection of Forgery and Fabrication in Passports and Visas Using Cryptograph...
Detection of Forgery and Fabrication in Passports and Visas Using Cryptograph...Detection of Forgery and Fabrication in Passports and Visas Using Cryptograph...
Detection of Forgery and Fabrication in Passports and Visas Using Cryptograph...
 

Recently uploaded

CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECTCHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
jpsjournal1
 
Fundamentals of Electric Drives and its applications.pptx
Fundamentals of Electric Drives and its applications.pptxFundamentals of Electric Drives and its applications.pptx
Fundamentals of Electric Drives and its applications.pptx
manasideore6
 
14 Template Contractual Notice - EOT Application
14 Template Contractual Notice - EOT Application14 Template Contractual Notice - EOT Application
14 Template Contractual Notice - EOT Application
SyedAbiiAzazi1
 
Harnessing WebAssembly for Real-time Stateless Streaming Pipelines
Harnessing WebAssembly for Real-time Stateless Streaming PipelinesHarnessing WebAssembly for Real-time Stateless Streaming Pipelines
Harnessing WebAssembly for Real-time Stateless Streaming Pipelines
Christina Lin
 
一比一原版(Otago毕业证)奥塔哥大学毕业证成绩单如何办理
一比一原版(Otago毕业证)奥塔哥大学毕业证成绩单如何办理一比一原版(Otago毕业证)奥塔哥大学毕业证成绩单如何办理
一比一原版(Otago毕业证)奥塔哥大学毕业证成绩单如何办理
dxobcob
 
5214-1693458878915-Unit 6 2023 to 2024 academic year assignment (AutoRecovere...
5214-1693458878915-Unit 6 2023 to 2024 academic year assignment (AutoRecovere...5214-1693458878915-Unit 6 2023 to 2024 academic year assignment (AutoRecovere...
5214-1693458878915-Unit 6 2023 to 2024 academic year assignment (AutoRecovere...
ihlasbinance2003
 
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&BDesign and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Sreedhar Chowdam
 
Fundamentals of Induction Motor Drives.pptx
Fundamentals of Induction Motor Drives.pptxFundamentals of Induction Motor Drives.pptx
Fundamentals of Induction Motor Drives.pptx
manasideore6
 
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
obonagu
 
Series of visio cisco devices Cisco_Icons.ppt
Series of visio cisco devices Cisco_Icons.pptSeries of visio cisco devices Cisco_Icons.ppt
Series of visio cisco devices Cisco_Icons.ppt
PauloRodrigues104553
 
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
ydteq
 
Unbalanced Three Phase Systems and circuits.pptx
Unbalanced Three Phase Systems and circuits.pptxUnbalanced Three Phase Systems and circuits.pptx
Unbalanced Three Phase Systems and circuits.pptx
ChristineTorrepenida1
 
6th International Conference on Machine Learning & Applications (CMLA 2024)
6th International Conference on Machine Learning & Applications (CMLA 2024)6th International Conference on Machine Learning & Applications (CMLA 2024)
6th International Conference on Machine Learning & Applications (CMLA 2024)
ClaraZara1
 
Online aptitude test management system project report.pdf
Online aptitude test management system project report.pdfOnline aptitude test management system project report.pdf
Online aptitude test management system project report.pdf
Kamal Acharya
 
ACRP 4-09 Risk Assessment Method to Support Modification of Airfield Separat...
ACRP 4-09 Risk Assessment Method to Support Modification of Airfield Separat...ACRP 4-09 Risk Assessment Method to Support Modification of Airfield Separat...
ACRP 4-09 Risk Assessment Method to Support Modification of Airfield Separat...
Mukeshwaran Balu
 
Building Electrical System Design & Installation
Building Electrical System Design & InstallationBuilding Electrical System Design & Installation
Building Electrical System Design & Installation
symbo111
 
DfMAy 2024 - key insights and contributions
DfMAy 2024 - key insights and contributionsDfMAy 2024 - key insights and contributions
DfMAy 2024 - key insights and contributions
gestioneergodomus
 
Literature Review Basics and Understanding Reference Management.pptx
Literature Review Basics and Understanding Reference Management.pptxLiterature Review Basics and Understanding Reference Management.pptx
Literature Review Basics and Understanding Reference Management.pptx
Dr Ramhari Poudyal
 
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressionsKuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
Victor Morales
 
PROJECT FORMAT FOR EVS AMITY UNIVERSITY GWALIOR.ppt
PROJECT FORMAT FOR EVS AMITY UNIVERSITY GWALIOR.pptPROJECT FORMAT FOR EVS AMITY UNIVERSITY GWALIOR.ppt
PROJECT FORMAT FOR EVS AMITY UNIVERSITY GWALIOR.ppt
bhadouriyakaku
 

Recently uploaded (20)

CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECTCHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
 
Fundamentals of Electric Drives and its applications.pptx
Fundamentals of Electric Drives and its applications.pptxFundamentals of Electric Drives and its applications.pptx
Fundamentals of Electric Drives and its applications.pptx
 
14 Template Contractual Notice - EOT Application
14 Template Contractual Notice - EOT Application14 Template Contractual Notice - EOT Application
14 Template Contractual Notice - EOT Application
 
Harnessing WebAssembly for Real-time Stateless Streaming Pipelines
Harnessing WebAssembly for Real-time Stateless Streaming PipelinesHarnessing WebAssembly for Real-time Stateless Streaming Pipelines
Harnessing WebAssembly for Real-time Stateless Streaming Pipelines
 
一比一原版(Otago毕业证)奥塔哥大学毕业证成绩单如何办理
一比一原版(Otago毕业证)奥塔哥大学毕业证成绩单如何办理一比一原版(Otago毕业证)奥塔哥大学毕业证成绩单如何办理
一比一原版(Otago毕业证)奥塔哥大学毕业证成绩单如何办理
 
5214-1693458878915-Unit 6 2023 to 2024 academic year assignment (AutoRecovere...
5214-1693458878915-Unit 6 2023 to 2024 academic year assignment (AutoRecovere...5214-1693458878915-Unit 6 2023 to 2024 academic year assignment (AutoRecovere...
5214-1693458878915-Unit 6 2023 to 2024 academic year assignment (AutoRecovere...
 
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&BDesign and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
 
Fundamentals of Induction Motor Drives.pptx
Fundamentals of Induction Motor Drives.pptxFundamentals of Induction Motor Drives.pptx
Fundamentals of Induction Motor Drives.pptx
 
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
 
Series of visio cisco devices Cisco_Icons.ppt
Series of visio cisco devices Cisco_Icons.pptSeries of visio cisco devices Cisco_Icons.ppt
Series of visio cisco devices Cisco_Icons.ppt
 
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
 
Unbalanced Three Phase Systems and circuits.pptx
Unbalanced Three Phase Systems and circuits.pptxUnbalanced Three Phase Systems and circuits.pptx
Unbalanced Three Phase Systems and circuits.pptx
 
6th International Conference on Machine Learning & Applications (CMLA 2024)
6th International Conference on Machine Learning & Applications (CMLA 2024)6th International Conference on Machine Learning & Applications (CMLA 2024)
6th International Conference on Machine Learning & Applications (CMLA 2024)
 
Online aptitude test management system project report.pdf
Online aptitude test management system project report.pdfOnline aptitude test management system project report.pdf
Online aptitude test management system project report.pdf
 
ACRP 4-09 Risk Assessment Method to Support Modification of Airfield Separat...
ACRP 4-09 Risk Assessment Method to Support Modification of Airfield Separat...ACRP 4-09 Risk Assessment Method to Support Modification of Airfield Separat...
ACRP 4-09 Risk Assessment Method to Support Modification of Airfield Separat...
 
Building Electrical System Design & Installation
Building Electrical System Design & InstallationBuilding Electrical System Design & Installation
Building Electrical System Design & Installation
 
DfMAy 2024 - key insights and contributions
DfMAy 2024 - key insights and contributionsDfMAy 2024 - key insights and contributions
DfMAy 2024 - key insights and contributions
 
Literature Review Basics and Understanding Reference Management.pptx
Literature Review Basics and Understanding Reference Management.pptxLiterature Review Basics and Understanding Reference Management.pptx
Literature Review Basics and Understanding Reference Management.pptx
 
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressionsKuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
 
PROJECT FORMAT FOR EVS AMITY UNIVERSITY GWALIOR.ppt
PROJECT FORMAT FOR EVS AMITY UNIVERSITY GWALIOR.pptPROJECT FORMAT FOR EVS AMITY UNIVERSITY GWALIOR.ppt
PROJECT FORMAT FOR EVS AMITY UNIVERSITY GWALIOR.ppt
 

SPEAKER VERIFICATION USING ACOUSTIC AND PROSODIC FEATURES

  • 1. Advanced Computing: An International Journal ( ACIJ ), Vol.4, No.1, January 2013 DOI : 10.5121/acij.2013.4105 45 SPEAKER VERIFICATION USING ACOUSTIC AND PROSODIC FEATURES Utpal Bhattacharjee1 and Kshirod Sarmah2 Department of Computer Science and Engineering, Rajiv Gandhi University, Rono Hills, Doimukh, Arunachal Pradesh, India, PIN – 791112 1 utpalbhattacharjee@rediffmail.com 2 kshirodsarmah@gmail.com ABSTRACT In this paper we report the experiment carried out on recently collected speaker recognition database namely Arunachali Language Speech Database (ALS-DB)to make a comparative study on the performance of acoustic and prosodic features for speaker verification task.The speech database consists of speech data recorded from 200 speakers with Arunachali languages of North-East India as mother tongue. The collected database is evaluated using Gaussian mixture model-Universal Background Model (GMM-UBM) based speaker verification system. The acoustic feature considered in the present study is Mel-Frequency Cepstral Coefficients (MFCC) along with its derivatives.The performance of the system has been evaluated for both acoustic feature and prosodic feature individually as well as in combination.It has been observed that acoustic feature, when considered individually, provide better performance compared to prosodic features. However, if prosodic features are combined with acoustic feature, performance of the system outperforms both the systems where the features are considered individually. There is a nearly 5% improvement in recognition accuracy with respect to the system where acoustic features are considered individually and nearly 20% improvement with respect to the system where only prosodic features are considered. KEYWORDS Speaker Verification, Multilingual,GMM-UBM,MFCC, Prosodic Feature 1. INTRODUCTION Automatic Speaker Recognition (ASR)refers to recognizing persons from their voice. The sound of each speaker is not identical because their vocal tract shapes, larynx sizes and other parts of their voice production organs are different. ASR System can be divided into either Automatic Speaker Verification(ASV) or Automatic Speaker Identification (ASI) systems [1,2, 3].Speaker verification aims to verify whether an input speech corresponds to the claimed identity. Speaker identification aims to identify an input speech by selecting one model from a set of enrolled speaker models: in some cases, speaker verification will follow speaker identification in order to validate the identification result [4].Speaker Verification is the task of determining whether a person is who he or she claims to be (a yes/ no decision).Since it is generally assumed that imposter (falsely claimed speaker) are not known to the system, it is also referred to as an Open- Set task [5]. The speaker verification system aims to verify whether an input speech corresponds to the claimed identity or not. A security system based on this ability has great potential in several application domains. Speaker verification systems are typically distinguished into two categories – text-dependent and text-independent [6]. In text-dependent system, a predetermined group of words or sentences is used to enroll the speaker to the system and those words or sentences are used to verify the speaker. Text-dependent system use an explicit verification protocol, usually combined with pass phrases or Personal Identification Number (PIN) as an
additional level of security. In a text-independent system, no constraints are placed on what the speaker may say. It is an implicit verification process in which verification is performed while the user carries out some other task, such as talking with a customer-care executive or registering a complaint.

State-of-the-art speaker verification systems use either an adapted Gaussian mixture model (GMM) [7] with a universal background model (UBM) or a support vector machine (SVM) over GMM supervectors [8]. Mel-frequency cepstral coefficients (MFCCs) are the most commonly used feature vectors for speaker verification. Supra-segmental features such as prosody and speaking style are also combined with the cepstral features to improve performance [9].

Prosody plays a key role in the perception of human speech. The information contained in prosodic features is partly different from the information contained in cepstral features, and researchers in speech recognition are therefore showing increasing interest in prosodic features. Generally, prosody means "the structure that organizes sound". Pitch (tone), energy (loudness) and normalized duration (rhythm) are the main components of a speaker's prosody. Prosody varies from speaker to speaker and relies on long-term information of speech. Very often, prosodic features are extracted with a larger frame size than acoustic features, since prosodic features exist over long speech segments such as syllables. The pitch and energy contours change slowly compared to the spectrum, which implies that their variation can be captured over a long speech segment [10].

The rest of the paper is organized as follows: Section 2 describes the speaker recognition database, Section 3 details the speaker verification system, Section 4 describes the experimental setup, the data used in the experiments and the results obtained, and Section 5 concludes the paper.

2. SPEAKER RECOGNITION DATABASE

In this section we describe the recently collected Arunachali Language Speech Database (ALS-DB). Arunachal Pradesh of North-East India is one of the linguistically richest and most diverse regions in all of Asia, being home to at least thirty and possibly as many as fifty distinct languages, in addition to innumerable dialects and sub-dialects thereof [11]. The vast majority of languages indigenous to modern-day Arunachal Pradesh belong to the Tibeto-Burman language family, and the majority of these in turn belong to a single branch of Tibeto-Burman, namely Tani. Almost all Tani languages are indigenous to central Arunachal Pradesh, while a handful of Tani languages are also spoken in Tibet. Tani languages are noticeably characterized by an overall relative uniformity, suggesting relatively recent origin and dispersal within their present-day area of concentration. Most Tani languages are mutually intelligible with at least one other Tani language, meaning that the area constitutes a dialect chain. In addition to these non-Indo-European languages, the Indo-European languages Assamese, Bengali, English, Nepali and especially Hindi are making strong inroads into Arunachal Pradesh, primarily as a result of the primary education system in which classes are generally taught by immigrant teachers from Hindi-speaking parts of northern India. Because of the linguistic diversity of the region, English is the only official language recognized in the state.
To study the impact of language variability on the speaker recognition task, the ALS-DB has been collected in a multilingual environment. Each speaker is recorded in three different languages – English, Hindi and a local language, which belongs to one of the four major Arunachali languages:
Adi, Nyishi, Galo and Apatani. Each recording is of 4–5 minutes duration. Speech data were recorded in parallel across four recording devices, which are listed in Table 1.

Table 1: Device type and recording specifications

Device Sl. No.   Device Type                 Sampling Rate   File Format
Device 1         Table-mounted microphone    16 kHz          wav
Device 2         Headset microphone          16 kHz          wav
Device 3         Laptop microphone           16 kHz          wav
Device 4         Portable voice recorder     44.1 kHz        mp3

The speakers were recorded in reading style. The speech data collection was done in a laboratory environment with the air conditioner, server and other equipment switched on. The speech data were contributed by 112 male and 88 female informants chosen from the age group 20–50 years. During recording, each subject was asked to read a 4–5 minute story from a school book twice in each language, and the second reading was used for the recording. Each informant participated in four recording sessions, with a gap of at least one week between two sessions.

3. SPEAKER VERIFICATION SYSTEM

In this work, a speaker verification system has been developed using the Gaussian Mixture Model with Universal Background Model (GMM-UBM) modelling approach. A 39-dimensional feature vector was used, made up of 13 mel-frequency cepstral coefficients (MFCCs) and their first-order and second-order derivatives, both approximated over three samples. The coefficients were extracted from speech sampled at 16 kHz with 16 bits/sample resolution. A pre-emphasis filter H(z) = 1 − 0.97z^−1 has been applied before framing. The pre-emphasized speech signal is segmented into frames of 20 ms with a frame rate of 100 Hz, and each frame is multiplied by a Hamming window. From the windowed frame the FFT is computed, and the magnitude spectrum is filtered with a bank of 22 triangular filters spaced on the mel scale. The log-compressed filter outputs are converted to cepstral coefficients by the DCT. The 0th cepstral coefficient is not used in the cepstral feature vector, since it corresponds to the energy of the whole frame [12]; only the next 12 MFCC coefficients have been used. To capture the time-varying nature of the speech signal, the MFCC coefficients are combined with their first-order and second-order derivatives, giving the 39-dimensional feature vector. Cepstral Mean Subtraction (CMS) has been applied to all features to reduce the effect of channel mismatch, and Cepstral Variance Normalization (CVN) has also been applied, forcing the feature vectors to follow a unit-variance distribution at the feature level in order to obtain more robust results.

The prosodic features typically include pitch, intensity and the normalized duration of the syllable. However, as a limitation of the combination, the features have to be frame-based, and therefore only pitch and intensity are selected to represent the prosodic information in the present study. Pitch and intensity are static features, as they are calculated frame by frame and only represent the exact value of the current frame. In order to incorporate more temporal information, their first-order and second-order derivatives have also been included.
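To make the front-end concrete, the following Python sketch outlines one plausible implementation of the features described above, using librosa. It is not the authors' code: values the paper does not specify (the pitch search range, the longer prosodic frame length, and the handling of c0 relative to the stated 39 dimensions) are assumptions and are marked in the comments.

import numpy as np
import librosa

def extract_features(wav_path, sr=16000):
    y, sr = librosa.load(wav_path, sr=sr)
    y = librosa.effects.preemphasis(y, coef=0.97)        # H(z) = 1 - 0.97 z^-1
    n_fft, hop = int(0.020 * sr), int(0.010 * sr)        # 20 ms frames, 100 Hz frame rate

    # 13 cepstra from a 22-filter mel bank; c0 is dropped as described in the paper,
    # so this acoustic stack is 36-dimensional rather than the stated 39.
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13, n_fft=n_fft, hop_length=hop,
                                n_mels=22, window='hamming')[1:]
    d1 = librosa.feature.delta(mfcc, width=3)            # first-order, 3-frame approximation
    d2 = librosa.feature.delta(mfcc, width=3, order=2)   # second-order derivatives
    acoustic = np.vstack([mfcc, d1, d2])

    # CMS + CVN: zero mean and unit variance per coefficient over the utterance.
    acoustic = (acoustic - acoustic.mean(axis=1, keepdims=True)) \
               / (acoustic.std(axis=1, keepdims=True) + 1e-8)

    # Prosodic stream: pitch and intensity on a longer frame (80 ms is an assumption,
    # the paper only says "larger"), plus their first- and second-order derivatives.
    long_frame = 4 * n_fft
    f0 = librosa.yin(y, fmin=60, fmax=400, sr=sr,
                     frame_length=long_frame, hop_length=hop)
    rms = librosa.feature.rms(y=y, frame_length=long_frame, hop_length=hop)[0]
    T = min(acoustic.shape[1], len(f0), len(rms))
    pros = np.vstack([f0[:T], rms[:T]])
    pros = np.vstack([pros, librosa.feature.delta(pros, width=3),
                      librosa.feature.delta(pros, width=3, order=2)])
    return acoustic[:, :T].T, pros.T                     # (frames, dims) each

Concatenating the two returned matrices frame by frame gives a combined MFCC-plus-prosodic representation of the kind evaluated in Section 4.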
A Gaussian mixture model with 1024 Gaussian components has been used for both the UBM and the speaker models. The UBM was created by training gender-dependent models with 512 Gaussian components each, using the data of 50 male and 50 female speakers and the Expectation-Maximization (EM) algorithm; the final UBM was then created by pooling the male and female models and averaging them [13]. The speaker models were created by adapting only the mean parameters of the UBM with the speaker-specific data, using the maximum a posteriori (MAP) approach [8]. The detection error trade-off (DET) curve has been plotted using the log-likelihood ratio between the claimed model and the UBM, and the equal error rate (EER) obtained from the DET curve has been used as the measure of performance of the speaker verification system. The minimum detection cost function (minimum DCF) has also been evaluated.
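The modelling stage can be sketched as follows with scikit-learn's GaussianMixture. This is only an outline, not the authors' implementation: for brevity it trains a single diagonal-covariance UBM directly rather than pooling gender-dependent models, and the relevance factor of 16 used for MAP adaptation is a conventional choice rather than a value reported in the paper.

import copy
import numpy as np
from sklearn.mixture import GaussianMixture

def train_ubm(background_frames, n_components=1024):
    # background_frames: (N, D) array of feature vectors pooled from the background speakers.
    ubm = GaussianMixture(n_components=n_components, covariance_type='diag',
                          max_iter=50, reg_covar=1e-4, random_state=0)
    ubm.fit(background_frames)
    return ubm

def map_adapt_means(ubm, speaker_frames, relevance=16.0):
    # MAP adaptation of the UBM means only, following the adapted-GMM recipe of [8].
    gamma = ubm.predict_proba(speaker_frames)          # (T, K) responsibilities
    n_k = gamma.sum(axis=0) + 1e-10                    # soft occupation counts
    e_k = (gamma.T @ speaker_frames) / n_k[:, None]    # per-component data means
    alpha = (n_k / (n_k + relevance))[:, None]         # data-dependent adaptation weights
    model = copy.deepcopy(ubm)                         # weights and covariances stay shared
    model.means_ = alpha * e_k + (1.0 - alpha) * ubm.means_
    return model

def llr_score(model, ubm, test_frames):
    # Average frame-level log-likelihood ratio of the claimed model against the UBM.
    return float(np.mean(model.score_samples(test_frames)
                         - ubm.score_samples(test_frames)))

A trial is accepted when llr_score exceeds a decision threshold; sweeping that threshold over all trials yields DET curves of the kind shown in Figure 1.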
4. EXPERIMENTS AND RESULTS

All the experiments reported in this paper are carried out on the ALS-DB database described in Section 2. An energy-based silence detector is used to identify and discard silence frames prior to feature extraction. Only data from the headset microphone have been considered in the present study. All four available sessions were used in the experiments: each speaker model was trained on one complete session, and the test sequences were extracted from the other three sessions. The training set consists of 120 seconds of speech per speaker. The test set consists of segments of 15, 30 and 45 seconds; it contains more than 3500 test segments of varying length, and each test segment is evaluated against 11 hypothesized speakers of the same sex as the segment speaker [14]. In this experiment a single language (English, Hindi or the local language) has been considered for training the system, and each language has been considered separately for testing. A training sample of 120 seconds from a single session has been used for training, and the other three sessions have been used for testing. Figures 1(a), (b) and (c) show the DET curves for the speaker verification system obtained for the three languages in the speech database, and the results of the experiments are summarized in Table 2.

Figure 1(a). DET curves for the speaker verification system using the local language as training and testing language.
Figure 1(b). DET curves for the speaker verification system using the Hindi language as training and testing language.
Figure 1(c). DET curves for the speaker verification system using the English language as training and testing language.
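For reference, the EER and minimum DCF values of the kind reported in Table 2 below can be derived from the target and impostor score lists produced by the verifier as sketched here. The cost parameters (C_miss = 10, C_fa = 1, P_target = 0.01) follow the commonly used NIST settings and are an assumption, not values stated in the paper.

import numpy as np

def eer_and_min_dcf(target_scores, impostor_scores,
                    c_miss=10.0, c_fa=1.0, p_target=0.01):
    scores = np.concatenate([target_scores, impostor_scores])
    labels = np.concatenate([np.ones(len(target_scores)),
                             np.zeros(len(impostor_scores))])
    labels = labels[np.argsort(scores)]                 # order trials by increasing score
    # As the threshold rises, the miss rate grows and the false-alarm rate falls.
    p_miss = np.concatenate([[0.0], np.cumsum(labels) / labels.sum()])
    p_fa = np.concatenate([[1.0], 1.0 - np.cumsum(1 - labels) / (1 - labels).sum()])
    eer_idx = np.argmin(np.abs(p_miss - p_fa))
    eer = (p_miss[eer_idx] + p_fa[eer_idx]) / 2.0
    dcf = c_miss * p_miss * p_target + c_fa * p_fa * (1.0 - p_target)
    return eer, dcf.min()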
Table 2. EER and minimum DCF values for the speaker verification system trained and tested with each language

Training & Testing Language   Feature Vectors                       EER %   Recognition Rate %   Minimum DCF
Local                         MFCC + ∆MFCC + ∆∆MFCC                 6.81    93.19                0.0991
Local                         PROSODIC                              21.50   79.50                0.4045
Local                         MFCC + ∆MFCC + ∆∆MFCC + PROSODIC      4.55    95.45                0.0823
Hindi                         MFCC + ∆MFCC + ∆∆MFCC                 6.81    93.19                0.0968
Hindi                         PROSODIC                              25.00   75.00                0.4438
Hindi                         MFCC + ∆MFCC + ∆∆MFCC + PROSODIC      4.55    95.45                0.0927
English                       MFCC + ∆MFCC + ∆∆MFCC                 9.09    90.91                0.1195
English                       PROSODIC                              26.50   73.50                0.4632
English                       MFCC + ∆MFCC + ∆∆MFCC + PROSODIC      4.55    95.45                0.0823

5. CONCLUSIONS

The experiments reported above establish that MFCC features along with their time derivatives may be considered an efficient parameterization of the speech signal for the speaker verification task. When MFCCs with their first- and second-order derivatives are used as the feature vector, the system gives a recognition accuracy of around 92.43%. The performance of prosodic features has also been evaluated in the present study: when prosodic features alone are used as the feature vector, the system gives a recognition accuracy of about 76%, nearly 16% below the recognition accuracy of the MFCC feature vector. However, when MFCC and prosodic features are combined, a recognition accuracy of 95.45% is achieved. Thus it may be concluded that combining MFCC features with prosodic features improves the performance of the system marginally, in the present study by about 3%, without increasing the complexity of the system. Another interesting observation made in the present study is that when English is used for training and testing the system, a recognition accuracy of 90.91% is achieved with MFCC features, whereas under the same experimental conditions the recognition accuracy for the local language and Hindi is 93.19%. The relatively poor performance for English can be explained by the linguistic scenario of Arunachal Pradesh. The people of Arunachal Pradesh, especially the educated section, use Hindi in their day-to-day conversation even with their family members. They are therefore as fluent in Hindi as in their native language, and its articulation is relatively robust to context, like that of their native language. English, however, being a non-native language, undergoes major deviations in articulation with context.

ACKNOWLEDGEMENT

This work has been supported by the ongoing project grant No. 12(12)/2009-ESD sponsored by the Department of Information Technology, Government of India.
REFERENCES

[1] Furui, S. (1997), "Recent advances in speaker recognition", Pattern Recognition Letters, 18, 859–872.
[2] Campbell, J.P., Jr. (1997), "Speaker recognition: a tutorial", Proceedings of the IEEE, 85(9), 1437–1462.
[3] Bimbot, F. et al. (2004), "A tutorial on text-independent speaker verification", EURASIP Journal on Applied Signal Processing, 4, 430–451.
[4] Park, A. and Hazen, T.J. (2002), "ASR dependent techniques for speaker identification", Proceedings of the International Conference on Spoken Language Processing, Denver, Colorado.
[5] Reynolds, D.A. (2002), "An overview of automatic speaker recognition technology", MIT Lincoln Laboratory, 244 Wood St., Lexington, MA 02140, USA, IEEE 2002.
[6] Rosenberg, A., Delong, J., Lee, C., Juang, B. and Soong, F. (1992), "The use of cohort normalized scores for speaker recognition", In Proc. ICSLP, pp. 599–602.
[7] Reynolds, D.A. (1995), "Robust text-independent speaker identification using Gaussian mixture speaker models", Speech Communication, vol. 17, pp. 91–108.
[8] Reynolds, D.A., Quatieri, T.F. and Dunn, R.B. (2000), "Speaker verification using adapted Gaussian mixture models", Digital Signal Processing, vol. 10(1–3), pp. 19–41.
[9] Haris, B.C., Pradhan, G., Misra, A., Shukla, S., Sinha, R. and Prasanna, S.R.M. (2011), "Multi-variability speech database for robust speaker recognition", In Proc. NCC, pp. 1–5.
[10] Shriberg, E.E. (2007), "Higher level features in speaker recognition", in C. Muller (Ed.), Speaker Classification I, Volume 4343 of Lecture Notes in Computer Science / Artificial Intelligence, Springer: Heidelberg / Berlin / New York, pp. 241–259.
[11] Arunachal Pradesh, http://en.wikipedia.org/wiki/Arunachal_Pradesh.
[12] Xiaojia Z., Yang S. and DeLiang W. (2011), "Robust speaker identification using a CASA front-end", in Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on, pp. 5468–5471.
[13] Kleynhans, N.T. and Barnard, E. (2005), "Language dependence in multilingual speaker verification", In Proc. of the 16th Annual Symposium of the Pattern Recognition Association of South Africa, Langebaan, South Africa, pp. 117–122.
[14] NIST 2003 Evaluation Plan, http://www.itl.nist.gov/iad/mig/tests/sre/2003/2003-spkrec-evalplan-v2.2.