K. Ravi Kumar, V. Ambika, K. Suri Babu / International Journal of Engineering Research and Applications (IJERA) ISSN: 2248-9622 www.ijera.com
Vol. 2, Issue 5, September-October 2012, pp. 1797-1799

Emotion Identification From Continuous Speech Using Cepstral Analysis

K. Ravi Kumar (1), V. Ambika (2), K. Suri Babu (3)

(1) M.Tech (DECS), Chaitanya Engineering College, Visakhapatnam, India
(2) Dept. of ECE, Chaitanya Engineering College, Visakhapatnam, India
(3) Scientist, NSTL (DRDO), Govt. of India, Visakhapatnam, India


ABSTRACT
Emotion plays a major role in psychology, human-computer interaction, robotics, and the BPO sector. With advances in communication technology, a channel can be established across the globe within seconds. Because most communication channels are public, data transmission may not be authenticated; in such situations it is essential to recognize the speaker from the unique features of the speech before interacting. A speaker's voice modulation changes with his/her emotional state, so emotion recognition is required for applications such as telemetry, call centers, forensics, and security. In this work we deal with speaker recognition under different emotions; the basic emotions considered are angry, sad, happy, boredom, and neutral. The features are modeled using the Gamma Distribution (GD), and a database is generated from 50 speakers of both genders expressing these basic emotions, using the combined MFCC-LPC feature vector.

Keywords: MFCC, LPC, Gamma Distribution

1. INTRODUCTION
In some specific situations, such as remote medical treatment and call-center applications, it is necessary to identify the speaker along with his/her emotion. Hence this paper presents a methodology for emotion identification in continuous speech; the emotions considered are happy, angry, sad, boredom, and neutral. Much research has been directed at recognizing the emotional state of a speaker using models such as GMMs, HMMs, SVMs, and neural networks [2][3][4][5]. In this paper we consider an emotional database with 50 speakers of both genders. The data is trained, the generalized gamma distribution is used to classify the speakers' emotions, and the data is then tested with different emotions. The rest of the paper is organized as follows: Section 2 deals with feature extraction, Section 3 presents the generalized gamma mixture model, Section 4 describes our approach, and Sections 5 and 6 present the experimentation, results, and conclusions.

2. FEATURE EXTRACTION
For an effective recognition system, the features must be extracted efficiently. To achieve this, we convert the speech signals and model them using a Gamma mixture model. Every speech signal varies gradually, and over short intervals its features are fairly constant, so a long speech duration must be considered to identify the features. Features like MFCCs and LPCs are the most commonly used: the main advantage of MFCC is that it identifies features even in the presence of noise, while LPCs are preferred for extracting features at low acoustic levels. LPC and MFCC coefficients [8] are used to extract the features from the given speech.

[Figure: block diagram of MFCC feature extraction, from the input speech signal to the Mel cepstrum]
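To make this stage concrete, the sketch below extracts a compound MFCC-LPC feature matrix with librosa. This is a minimal sketch, not the authors' code: the 16 kHz rate, 256-sample frames, and 128-sample overlap follow Section 5, while the 13 MFCCs, the 12th-order LPC, and the file name `speech.wav` are illustrative assumptions.

```python
# Sketch: per-frame MFCC and LPC extraction (assumes librosa is installed).
import numpy as np
import librosa

FRAME_LEN = 256   # samples per frame (Section 5)
HOP = 128         # 128-sample overlap between frames (Section 5)
LPC_ORDER = 12    # illustrative LPC order, not specified in the paper
N_MFCC = 13       # illustrative number of cepstral coefficients

def mfcc_lpc_features(wav_path):
    """Return an (n_frames, N_MFCC + LPC_ORDER) compound MFCC-LPC matrix."""
    y, sr = librosa.load(wav_path, sr=16000)  # 16 kHz, per Section 5

    # MFCCs: one N_MFCC-dimensional vector per frame.
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=N_MFCC,
                                n_fft=FRAME_LEN, hop_length=HOP).T

    # LPCs: librosa.lpc works on a 1-D signal, so apply it frame by frame;
    # the leading coefficient is always 1 and is dropped.
    frames = librosa.util.frame(y, frame_length=FRAME_LEN, hop_length=HOP).T
    lpc = np.array([librosa.lpc(f, order=LPC_ORDER)[1:] for f in frames])

    n = min(len(mfcc), len(lpc))           # align frame counts
    return np.hstack([mfcc[:n], lpc[:n]])  # compound MFCC-LPC feature vector

features = mfcc_lpc_features("speech.wav")  # hypothetical input file
```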


3. GENERALIZED GAMMA MIXTURE MODEL
Today most research in speech processing is carried out with the Gaussian mixture model, but the main disadvantages of the GMM are that it relies exclusively on approximation and converges slowly; moreover, when a GMM is used, the speech and noise coefficients differ in magnitude [9]. For more accurate feature extraction, maximum a posteriori estimation models should be considered [10]. Hence, in this paper the generalized gamma distribution is used to classify the speech signal. The generalized gamma distribution represents the sum of n exponentially distributed random variables when both the shape and scale parameters take non-negative integer values [11]. It is defined in terms of scale and shape parameters [12]. The generalized gamma density used in the mixture is given by

f(x) = \frac{c}{b\,\Gamma(k)} \left( \frac{x-a}{b} \right)^{ck-1} \exp\!\left[ -\left( \frac{x-a}{b} \right)^{c} \right], \quad x > a,   (1)

where k and c are the shape parameters, a is the location parameter, b is the scale parameter, and \Gamma is the complete gamma function [13]. The shape and scale parameters of the generalized gamma distribution help to classify the speech signal and identify the speaker accurately.
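Equation (1) corresponds to SciPy's `gengamma` distribution, whose shape arguments `a` and `c` play the roles of k and c while `loc` and `scale` play the roles of a and b. Below is a minimal sketch of evaluating and fitting the density; the parameter values and the stand-in data are illustrative assumptions.

```python
# Sketch: the generalized gamma density of Eq. (1) via scipy.stats.gengamma.
# SciPy parameterization: pdf(x; a, c, loc, scale) with a = k (shape),
# c = c (shape), loc = a (location), scale = b (scale) in Eq. (1)'s notation.
import numpy as np
from scipy.stats import gengamma

k, c, loc, scale = 2.0, 1.5, 0.0, 1.0   # illustrative parameter values
x = np.linspace(0.01, 5.0, 200)
pdf = gengamma.pdf(x, k, c, loc=loc, scale=scale)

# Fitting: maximum-likelihood estimates of (k, c, loc, scale) from feature
# samples; `samples` stands in for a 1-D array of MFCC-LPC coefficients.
samples = gengamma.rvs(k, c, loc=loc, scale=scale, size=1000)
k_hat, c_hat, loc_hat, scale_hat = gengamma.fit(samples)
log_likelihood = gengamma.logpdf(samples, k_hat, c_hat,
                                 loc=loc_hat, scale=scale_hat).sum()
```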
                                                        dependent data; we have considered only a short
4. OUR APPROACH TO SPEAKER EMOTION RECOGNITION
For identifying emotion in continuous speech, our method uses the compound MFCC-LPC feature vector. The speaker's emotions are identified from various sets of recordings: a continuous speech sample with unknown emotion is recorded, and known emotions are used for training. Every speech sample carries one of the basic emotions (angry, happy, boredom, neutral, sad); to identify the emotional state of the speaker, the speaker's speech must be trained properly, i.e., the selection of the feature vector is crucial in emotion identification. The steps for effective identification of emotion in continuous speech are as follows (a sketch of the clustering in Step 4 is given after the list):

Step 1: Obtain the training set by recording the speech voices in .wav form.
Step 2: Identify the feature vector of these speech signals by applying the transformation-based compound feature vector MFCC-LPC.
Step 3: Generate the probability density function (PDF) of the generalized gamma distribution for the entire trained data set.
Step 4: Cluster the recorded speech of unknown emotion for testing using the fuzzy c-means clustering technique, and perform Steps 2 and 3 on it.
Step 5: Find the range of the test speech signal within the trained set.
Step 6: Calculate evaluation metrics such as Acceptance Rate (AR), False Acceptance Rate (FAR), and Missed Detection Rate (MDR) to determine the accuracy of speaker recognition.
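As referenced in the list above, here is a minimal NumPy sketch of the fuzzy c-means clustering used in Step 4. The paper does not give the cluster count or the fuzzifier, so `n_clusters=5` and `m=2.0` are assumptions.

```python
# Sketch: fuzzy c-means clustering of test-utterance feature vectors (Step 4).
import numpy as np

def fuzzy_cmeans(X, n_clusters=5, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Return (centers, membership) for data X of shape (n_samples, n_dims)."""
    rng = np.random.default_rng(seed)
    u = rng.random((len(X), n_clusters))
    u /= u.sum(axis=1, keepdims=True)       # memberships sum to 1 per sample
    for _ in range(n_iter):
        um = u ** m
        centers = um.T @ X / um.sum(axis=0)[:, None]   # weighted centroids
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-10)               # avoid division by zero
        u_new = 1.0 / d ** (2.0 / (m - 1))  # standard FCM membership update
        u_new /= u_new.sum(axis=1, keepdims=True)
        if np.abs(u_new - u).max() < tol:
            break
        u = u_new
    return centers, u
```

The resulting clusters of test features can then be passed through Steps 2 and 3 and compared against the trained set in Step 5.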
5. EXPERIMENTATION
In general, an emotion signal is always of finite range, so the infinite range must be truncated. It is therefore always advantageous to consider truncating the mixture model to a finite range; it is also clearly observed that the pitch signals along the right-side tail are more appropriate. Hence, in this paper we consider the Generalized Gamma Distribution with acted sequences of five different emotions, namely happy, sad, angry, boredom, and neutral. To test the data, 50 samples are considered and a database of audio voices is generated in .wav format. The emotion speech database covers the emotions happy, angry, sad, boredom, and neutral, and is generated from voice samples of both genders. The 50 samples were recorded using text-dependent data; only a short sentence was considered. The data is trained by extracting the MFCC and LPC voice features. The data is recorded at a sampling rate of 16 kHz; the signals were divided into frames of 256 samples with an overlap of 128 samples, and the MFCC and LPC coefficients were computed for each frame. To classify the emotions and identify the speaker appropriately, the generalized gamma distribution is used. The experimentation was conducted on the database with 10 recordings per emotion for training and 5 for testing (a condensed sketch follows). We repeated the experimentation, and an overall recognition rate above 90% was achieved. The experimentation was also conducted by changing to different emotions and testing the data with a speaker's voice; in all cases the recognition rate is above 90%.
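A condensed sketch of this protocol follows: one generalized gamma model is fitted per emotion on pooled training features, and a test utterance is assigned to the emotion whose model gives the highest likelihood. The paper does not spell out the decision rule or the data layout, so both are assumptions here.

```python
# Sketch: per-emotion generalized gamma models and maximum-likelihood
# classification; `train` / `test` map emotion labels to lists of 1-D
# feature arrays (hypothetical layout), and the max-likelihood decision
# rule is an assumption, not stated in the paper.
import numpy as np
from scipy.stats import gengamma

EMOTIONS = ["happy", "angry", "sad", "boredom", "neutral"]

def train_models(train):
    """Fit one gengamma model per emotion on pooled training features."""
    return {e: gengamma.fit(np.concatenate(train[e])) for e in EMOTIONS}

def classify(models, feats):
    """Pick the emotion whose fitted model gives the highest likelihood."""
    scores = {e: gengamma.logpdf(feats, *p).sum() for e, p in models.items()}
    return max(scores, key=scores.get)

def recognition_rate(models, test):
    """Overall acceptance rate; FAR and MDR would be computed analogously
    from the misclassified and rejected utterances."""
    hits = total = 0
    for emotion, utterances in test.items():
        for feats in utterances:
            hits += classify(models, feats) == emotion
            total += 1
    return hits / total   # the paper reports rates above 90%
```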


6. CONCLUSIONS
In this paper, a novel framework for emotion identification in continuous speech is presented, with MFCC and LPC together as the feature vectors. This work is very useful in applications such as speaker identification associated with emotional coefficients, and in practical settings such as call centers and telemedicine. The generalized gamma distribution is used for classification, and an overall recognition rate above 90% is achieved.
REFERENCES:
[1]. K. Suri Babu, Y. Srinivas, P. S. Varma, V. Nagesh, "Text dependent & gender independent speaker recognition model based on generalizations of gamma distribution," IJCA, Dec. 2011.
[2]. Xia Mao, Lijiang Chen, Liqin Fu, "Multi-level Speech Emotion Recognition Based on HMM and ANN," 2009 WRI World Congress on Computer Science and Information Engineering, pp. 225-229, March 2009.
[3]. B. Schuller, G. Rigoll, M. Lang, "Hidden Markov model-based speech emotion recognition," Proceedings of the IEEE ICASSP Conference on Acoustics, Speech and Signal Processing, vol. 2, pp. 1-4, April 2003.
[4]. Yashpalsing Chavhan, M. L. Dhore, Pallavi Yesaware, "Speech Emotion Recognition Using Support Vector Machine," International Journal of Computer Applications, vol. 1, pp. 6-9, February 2010.
[5]. Zhou Y., Sun Y., Zhang J., Yan Y., "Speech Emotion Recognition Using Both Spectral and Prosodic Features," ICIECS 2009, International Conference on Information Engineering and Computer Science, pp. 1-4, Dec. 2009.
[6]. I. Shahin, "Speaker recognition systems in the emotional environment," in Proc. IEEE ICTTA '08, Damascus, Syria, 2008.
[7]. T. Giannakopoulos, A. Pikrakis, and S. Theodoridis, "A dimensional approach to emotion recognition of speech from movies," in Proc. IEEE ICASSP '09, Taipei, Taiwan, 2009.
[8]. Md. Afzal Hossan, Sheeraz Memon, Mark A. Gregory, "A Novel Approach for MFCC Feature Extraction," RMIT University, © 2010 IEEE.
[9]. George Almpanidis and Constantine Kotropoulos, "Voice activity detection with generalized gamma distribution," IEEE ICME, 2006.
[10]. XinGuang Li et al., "Speech Recognition Based on K-means Clustering and Neural Network Ensembles," International Conference on Natural Computation, IEEE Xplore Proceedings, 2011, pp. 614-617.
[11]. Ewa Paszek, "The Gamma & Chi-square Distribution," Connexions, Module M13129.




