IJSRD - International Journal for Scientific Research & Development| Vol. 2, Issue 07, 2014 | ISSN (online): 2321-0613
All rights reserved by www.ijsrd.com 228
Audio/Speech Signal Analysis for Depression
Arpit Parikh¹, Sameena Zafar², Mukesh Saini³
¹PG Research Scholar, ²Assistant Professor and Head of Department, ³Assistant Professor
¹,²,³Department of Electronics and Communication Engineering
¹,²,³Patel Institute of Engineering & Science, RGPV University, Bhopal, India
Abstract— The word “depressed” is a common everyday
word. People might say "I am depressed" when in fact they
mean "I am fed up because I have had a row, or failed an
exam, or lost my job", etc. These ups and downs of life are
common and normal. Most people recover quite quickly.
Depression can be identified by different methods. Here we
identify depression using the MFCC (Mel Frequency Cepstral
Coefficient) method. Different parameters can be used to
distinguish depressed speech from normal speech, but
MFCC-based parameters are more applicable than the other
parameters because a depressive speech or audio signal can
contain more information in the higher energy bands than
normal speech.
Keywords: Mel Frequency Cepstral Coefficients (MFCC),
Speech Emotion Recognition (SER), Discrete Cosine
Transform (DCT), Depression
I. INTRODUCTION
Emotion recognition through speech is an area that has
increasingly attracted attention among engineers in the
fields of pattern recognition and speech/audio signal
processing in recent years. Speech recognition is a
technique that enables a device to recognize and understand
spoken words. Automatic emotion recognition focuses on
identifying the emotional state of a speaker from the voice
signal. Emotions play an extremely important role in human
life. To recognize human emotions and feelings, various
physiological signals have been widely used, because signal
acquisition by non-invasive sensors is relatively simple and
physiological responses induced by emotion are less
sensitive to social and cultural differences. Depression has
been classified using blood volume pulse, electrocardiogram
(ECG), galvanic skin response and skin temperature signals,
as well as the Gaussian Mixture Model. Speech signals have
also been recognized with the help of the Wiener filter and
the Kalman filter [1].
MFCCs (Mel Frequency Cepstral Coefficients) are
used here for the analysis of depression. The main point to
understand about speech is that the sounds generated by a
human are filtered by the shape of the vocal tract, including
the tongue, teeth, etc. This shape determines what sound
comes out. If we can determine the shape accurately, this
should give us an accurate representation of the phoneme
being produced. The shape of the vocal tract manifests
itself in the envelope of the short-time power spectrum, and
the job of MFCCs is to represent this envelope accurately.
MFCCs are a feature widely used in automatic speech and
speaker recognition. They were introduced by Davis and
Mermelstein in the 1980s and have been state-of-the-art
ever since. Prior to the introduction of MFCCs, Linear
Prediction Coefficients (LPCs) [8] and Linear Prediction
Cepstral Coefficients (LPCCs) were the main feature types
for automatic speech recognition (ASR) [5].
II. MFCC METHODOLOGY
The MFCCs are calculated in the following steps [7].
(1) Frame the signal into short frames.
(2) For each frame calculate the periodogram
estimate of the power spectrum.
(3) Apply the mel filter bank to the power spectra, sum
the energy in each filter.
(4) Take the logarithm of all filter bank energies.
(5) Take the DCT of the log filter bank energies.
(6) Keep the low-order DCT coefficients (typically the
first 12), discard the rest.
The first step is framing. The speech signal is split
up into frames, typically with a length of 10 to 30
milliseconds. The frame length is important because of the
trade-off between time and frequency resolution. If the
frame is too long, it cannot capture local spectral
properties; if it is too short, the frequency resolution
degrades. The frames overlap each other, typically by 25% to
70% of their own length. Overlapping is used to make sure
that each speech sound is approximately centered in some
frame. After the signal is split up into frames, each frame
is multiplied by a window function. A good window function
has a narrow main lobe and low side lobes. A smooth tapering
at the edges is desired to minimize discontinuities. The
most common window used in speech processing is the Hamming
window.
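The framing and windowing step can be sketched as follows. This is a minimal illustration in plain Python rather than the authors' MATLAB code; the 25 ms frame length, 50% overlap, 8 kHz sampling rate and the test tone are assumed values chosen from within the ranges given above, and the function names are hypothetical.

```python
import math

def frame_signal(signal, frame_len, hop_len):
    """Split a list of samples into overlapping frames.
    A trailing partial frame is dropped."""
    frames = []
    start = 0
    while start + frame_len <= len(signal):
        frames.append(signal[start:start + frame_len])
        start += hop_len
    return frames

def hamming(n):
    """Hamming window of length n: w[i] = 0.54 - 0.46*cos(2*pi*i/(n-1))."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]

def windowed_frames(signal, sample_rate, frame_ms=25.0, overlap=0.5):
    """Frame the signal and multiply every frame by a Hamming window."""
    frame_len = int(round(sample_rate * frame_ms / 1000.0))
    hop_len = max(1, int(round(frame_len * (1.0 - overlap))))
    window = hamming(frame_len)
    return [[s * w for s, w in zip(frame, window)]
            for frame in frame_signal(signal, frame_len, hop_len)]

# Assumed test input: 1 second of a 440 Hz tone sampled at 8 kHz
sample_rate = 8000
tone = [math.sin(2.0 * math.pi * 440.0 * t / sample_rate) for t in range(sample_rate)]
frames = windowed_frames(tone, sample_rate)
```

With these assumed settings each frame holds 200 samples and consecutive frames start 100 samples apart.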
The second step is to apply the Discrete Fourier
Transform (DFT) to each frame. The fastest way to calculate
the DFT is to use the FFT, an algorithm that can speed up
DFT calculations by a factor of hundreds [3].
$X(k) = \sum_{n=0}^{N-1} x(n) \, e^{-j 2 \pi k n / N}, \quad k = 0, 1, \ldots, N-1$ (1)
Spectral analysis shows that different timbres in
speech signals correspond to different energy distributions
over frequency. Therefore we usually perform the FFT to
obtain the magnitude frequency response of each frame.
When we perform the FFT on a frame, we assume that
the signal within the frame is periodic and continuous when
wrapping around. If this is not the case, we can still
perform the FFT, but the discontinuity between the frame's
first and last points is likely to introduce undesirable
effects in the frequency response. To deal with this
problem, we have two strategies:
(1) Multiply each frame by a Hamming window to increase its
continuity at the first and last points.
(2) Take a frame of variable size such that it always
contains an integer number of fundamental periods of the
speech signal [6].
The second strategy encounters difficulty in
practice, since identifying the fundamental period is not a
trivial problem. Moreover, unvoiced sounds do not have a
fundamental period at all. Consequently, we usually adopt
the first strategy and multiply the frame by a Hamming
window before performing the FFT.
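As a small illustration of obtaining the magnitude response of one frame, the sketch below evaluates the DFT sum directly. It is written for clarity only: in practice an FFT routine would replace the O(N^2) loop, and the 32-sample cosine frame is an assumed test input, not data from the paper.

```python
import cmath
import math

def dft(frame):
    """Direct DFT: X[k] = sum_n x[n] * exp(-j*2*pi*k*n/N)."""
    n_points = len(frame)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * n / n_points)
            for n, x in enumerate(frame))
        for k in range(n_points)
    ]

def magnitude_spectrum(frame):
    """Magnitudes of bins 0..N/2, which is sufficient for a real-valued frame."""
    spectrum = dft(frame)
    return [abs(c) for c in spectrum[: len(frame) // 2 + 1]]

# Assumed test frame: a cosine with exactly 4 cycles in 32 samples
frame = [math.cos(2.0 * math.pi * 4 * n / 32) for n in range(32)]
mags = magnitude_spectrum(frame)
peak_bin = max(range(len(mags)), key=lambda k: mags[k])  # bin 4 for this input
```

Because the test frame contains a whole number of periods, all of its energy falls into a single bin; a frame that does not wrap around cleanly would spread energy into neighbouring bins.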
We multiply the magnitude frequency response by
a set of 20 triangular bandpass filters to get the log energy
of each triangular bandpass filter. The positions of these
filters are equally spaced along the Mel frequency, which is
related to the common linear frequency f by the following
equation:
$M = 2595 \log_{10}\left(1 + \frac{f}{700}\right)$ (2)
The mel scale is based on how human hearing
perceives frequencies. It was defined by setting 1000 mels
equal to 1000 Hz as a reference point. Listeners were then
asked to adjust the physical pitch until they perceived it
as two-fold, ten-fold and half of the reference, and those
frequencies were labeled 2000 mel, 10000 mel and 500 mel
respectively. The resulting scale was called the mel scale;
it is approximately linear below 1000 Hz and logarithmic
above [4]. Mel frequency is thus roughly proportional to
the logarithm of the linear frequency, reflecting similar
effects in human subjective aural perception.
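The frequency mapping can be written as a pair of small helper functions. The sketch uses the widely used form M = 2595 log10(1 + f/700) of equation (2); the function names are hypothetical, and the inverse is obtained simply by solving that expression for f.

```python
import math

def hz_to_mel(f_hz):
    """Linear frequency in Hz to mel: M = 2595 * log10(1 + f / 700)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    """Inverse mapping: f = 700 * (10 ** (M / 2595) - 1)."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

reference_mel = hz_to_mel(1000.0)   # very close to 1000 mel, the reference point
round_trip_hz = mel_to_hz(hz_to_mel(4000.0))
```

The mapping reproduces the 1000 Hz reference point closely, and equal steps on the mel axis correspond to increasingly wide steps in Hz at higher frequencies.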
The spectrum is calculated by using a filter bank
spaced uniformly on the mel scale. Each filter has a
triangular bandpass frequency response, and the spacing as
well as the bandwidth is determined by a constant mel
frequency interval. The mel frequency warping when
calculating MFCCs is thus accomplished by a triangular,
mel-spaced filter bank. It consists of several triangular,
mel-spaced filters, and their outputs are described by:
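$Y(m) = \sum_{k=0}^{N-1} |X(k)| \, H_m(k), \quad m = 1, \ldots, M$ (3)
where $|X(k)|$ is the N-point magnitude spectrum and
$H_m(k)$ is the sampled magnitude response of the m-th
channel of an M-channel filter bank.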
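A triangular mel-spaced filter bank and its outputs can be sketched as below. This is an illustrative construction under stated assumptions, not the authors' implementation: the filters span 0 Hz to half the sampling rate, each filter peaks at 1.0, only bins 0..N/2 are used, and the 512-point FFT size and 8 kHz sampling rate are assumed values. All function names are hypothetical.

```python
import math

def hz_to_mel(f_hz):
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filter_bank(num_filters, n_fft, sample_rate):
    """Return num_filters triangular filters, each a list of n_fft//2 + 1 weights."""
    num_bins = n_fft // 2 + 1
    max_mel = hz_to_mel(sample_rate / 2.0)
    # num_filters + 2 edge frequencies, uniformly spaced on the mel axis
    edges_hz = [mel_to_hz(max_mel * i / (num_filters + 1))
                for i in range(num_filters + 2)]
    bank = []
    for m in range(1, num_filters + 1):
        left, center, right = edges_hz[m - 1], edges_hz[m], edges_hz[m + 1]
        weights = []
        for k in range(num_bins):
            f = k * sample_rate / n_fft          # frequency of FFT bin k in Hz
            if left < f <= center:               # rising slope
                weights.append((f - left) / (center - left))
            elif center < f < right:             # falling slope
                weights.append((right - f) / (right - center))
            else:
                weights.append(0.0)
        bank.append(weights)
    return bank

def filter_bank_outputs(magnitudes, bank):
    """Y(m) = sum over k of |X(k)| * H_m(k), one value per filter."""
    return [sum(x * h for x, h in zip(magnitudes, weights)) for weights in bank]

bank = mel_filter_bank(num_filters=20, n_fft=512, sample_rate=8000)
flat_spectrum = [1.0] * (512 // 2 + 1)           # assumed flat test spectrum
outputs = filter_bank_outputs(flat_spectrum, bank)
```

Adjacent triangles overlap so that each filter starts at the centre of its lower neighbour and ends at the centre of its upper neighbour, which is one common way to realize the constant mel interval described above.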
In this step, we apply the DCT to the 20 log energies
obtained from the triangular bandpass filters to obtain L
mel-scale cepstral coefficients. Compression is applied by
taking the logarithm of the filter outputs Y, and the
discrete cosine transform then yields the MFCCs c[n]
according to the following formula:
$c[n] = \sum_{m=1}^{M} \log\left(Y(m)\right) \cos\left( n \left(m - \frac{1}{2}\right) \frac{\pi}{M} \right), \quad n = 1, \ldots, L$ (4)
where M is the number of triangular bandpass
filters and L is the number of mel-scale cepstral
coefficients. Usually we set M = 20 and L = 12. Since we
have already performed the FFT, the DCT transforms the
frequency domain into a time-like domain called the
quefrency domain. The obtained features are similar to the
cepstrum and are therefore referred to as mel-scale cepstral
coefficients, or MFCCs. MFCCs alone can be used as the
feature for speech recognition. The energy within a frame
is also an important feature that can be easily obtained.
Hence we usually add the log energy as the 13th feature.
If necessary, we can add other features at this step,
including pitch, zero-crossing rate, higher-order spectral
moments, and so on.
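The logarithm and DCT step can be sketched as follows, using the cosine-sum form of equation (4) with the number of filters taken from the length of the input. The small floor inside the logarithm is an assumption added to avoid log(0) and is not taken from the paper; the input values and function names are illustrative only.

```python
import math

def mfcc_from_filter_outputs(outputs, num_coeffs=12):
    """c[n] = sum_{m=1..M} log(Y(m)) * cos(n * (m - 0.5) * pi / M), n = 1..L."""
    num_filters = len(outputs)
    # Floor at 1e-10 so that an empty band does not produce log(0) (assumption)
    log_y = [math.log(max(y, 1e-10)) for y in outputs]
    return [
        sum(log_y[m - 1] * math.cos(n * (m - 0.5) * math.pi / num_filters)
            for m in range(1, num_filters + 1))
        for n in range(1, num_coeffs + 1)
    ]

def log_frame_energy(frame):
    """Log of the summed squared samples, used as the 13th feature."""
    return math.log(max(sum(s * s for s in frame), 1e-10))

# Assumed example values: 20 filter outputs and a short frame
outputs = [1.0 + 0.1 * m for m in range(20)]
frame = [0.5, -0.5, 0.25]
features = mfcc_from_filter_outputs(outputs) + [log_frame_energy(frame)]
```

Starting the coefficient index at n = 1 leaves out the n = 0 term, which only measures the overall level of the log outputs; the separate log-energy feature carries that information instead.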
When we run the program, three buttons are
displayed in the GUI: ‘Start Record’, ‘Browse Wav file’ and
‘Recognize’. When we push the button ‘Start Record’, the
program asks us to record an audio/speech signal or voice
for 1 second. When we push the button ‘Browse Wav file’,
the program asks us to browse for a wave file on the system
instead of recording. After recording or loading the
signal, we push the button ‘Recognize’. If the features of
the voice match the depressed data, the program displays
the message ‘Depressed’; if they do not match the depressed
data, it displays the message ‘Normal’ [2].
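The matching step can be sketched as a nearest-template search. The conclusion of this paper states that a distortion measure based on minimizing the Euclidean distance is used; everything else below is an assumption made for illustration, including the function names, the template layout and the three-dimensional placeholder vectors (real feature vectors would hold the 12 or 13 values described above).

```python
import math

def euclidean_distance(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def classify(features, templates):
    """Return the label of the stored template closest to `features`."""
    best_label, _ = min(
        ((label, euclidean_distance(features, vec)) for label, vec in templates),
        key=lambda pair: pair[1],
    )
    return best_label

# Hypothetical stored templates, not data from the paper
templates = [
    ("Depressed", [1.0, 2.0, 3.0]),
    ("Normal", [4.0, 5.0, 6.0]),
]
result = classify([1.1, 2.1, 2.9], templates)
```

A single template per class is the simplest possible database; with several stored utterances per class the same function returns the label of the overall nearest one.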
III. RESULT
The result was obtained during recognition of the
speech/audio signal. The method appears to be effective and
robust enough to implement in a real-time working
environment.
Fig. 1: Output wave form of depression
IV. CONCLUSION
From the analysis of the speech/audio samples, we conclude
that the vocal properties represented by MFCCs provide the
most effective discrimination between depressed and normal
speakers. The features extracted with the MFCC algorithm
were stored in a .mat file. A distortion measure based on
minimizing the Euclidean distance was used when matching
the unknown speech signal against the speech signal
database. The experimental results were analyzed with the
help of MATLAB and indicate that the approach is efficient.
We hope this paper gives researchers a basic understanding
of the topic and motivates them to explore the field of
speech/audio recognition.
REFERENCES
[1] M.W. Huang, K. S. Cheng, “The Evaluation of
Cognitive Function with Visual Long-term Evoked
Potentials in Schizophrenic Patients”, Institute of
Biomedical Engineering National Cheng Kung
University, 2002.
[2] Arpit Parikh, Sameena Zafar, Mukesh Saini,
“Classification of Speech Emotion using MFCC”,
International Journal For Advance Research In
Engineering and Technology (IJARET) – Volume 2,
Issue IX Sep 2014
[3] Yen-Ting Chen, I-Chung Hung, Chun-Ju Hou,
Signal Analysis for Patients with Depression”,
Department of Electrical Engineering, Southern
Taiwan University, Tainan, Taiwan.
[4] Thaweesak Yingthawornsuk, PhD Thesis, “Acoustic
Analysis of Vocal Output Characteristics for Suicidal
Risk Assessment”, Graduate School of Vanderbilt
University, December 2007.
[5] D. J. France, PhD Thesis, “Acoustical properties of
speech as indicators of depression and suicidal risk”,
Vanderbilt University, August 1997.
[6] Asli Ozdas, Richard G. Shiavi, Stephen E. Silverman,
Marilyn K. Silverman, and D. Mitchell Wilkes,
“Investigation of Vocal Jitter and Glottal Flow Spectrum
as Possible Cues for Depression and Near-Term Suicidal
Risk”, IEEE Transactions on Biomedical Engineering,
Vol. 51, No. 9, September 2004.
[7] Mel-frequency cepstrum coefficients (MFCCs),
http://mirlab.org/jang/books/audiosignalprocessing/speechFeatureMfcc.asp?title=12-2%20MFCC
[8] http://practicalcryptography.com/miscellaneous/machine-learning/guide-mel-frequency-cepstral-coefficients-mfccs

Virtual Eye - Smart Traffic Navigation System
 
Ontological Model of Educational Programs in Computer Science (Bachelor and M...
Ontological Model of Educational Programs in Computer Science (Bachelor and M...Ontological Model of Educational Programs in Computer Science (Bachelor and M...
Ontological Model of Educational Programs in Computer Science (Bachelor and M...
 
Understanding IoT Management for Smart Refrigerator
Understanding IoT Management for Smart RefrigeratorUnderstanding IoT Management for Smart Refrigerator
Understanding IoT Management for Smart Refrigerator
 
DESIGN AND ANALYSIS OF DOUBLE WISHBONE SUSPENSION SYSTEM USING FINITE ELEMENT...
DESIGN AND ANALYSIS OF DOUBLE WISHBONE SUSPENSION SYSTEM USING FINITE ELEMENT...DESIGN AND ANALYSIS OF DOUBLE WISHBONE SUSPENSION SYSTEM USING FINITE ELEMENT...
DESIGN AND ANALYSIS OF DOUBLE WISHBONE SUSPENSION SYSTEM USING FINITE ELEMENT...
 
A Review: Microwave Energy for materials processing
A Review: Microwave Energy for materials processingA Review: Microwave Energy for materials processing
A Review: Microwave Energy for materials processing
 
Web Usage Mining: A Survey on User's Navigation Pattern from Web Logs
Web Usage Mining: A Survey on User's Navigation Pattern from Web LogsWeb Usage Mining: A Survey on User's Navigation Pattern from Web Logs
Web Usage Mining: A Survey on User's Navigation Pattern from Web Logs
 
APPLICATION OF STATCOM to IMPROVED DYNAMIC PERFORMANCE OF POWER SYSTEM
APPLICATION OF STATCOM to IMPROVED DYNAMIC PERFORMANCE OF POWER SYSTEMAPPLICATION OF STATCOM to IMPROVED DYNAMIC PERFORMANCE OF POWER SYSTEM
APPLICATION OF STATCOM to IMPROVED DYNAMIC PERFORMANCE OF POWER SYSTEM
 
Making model of dual axis solar tracking with Maximum Power Point Tracking
Making model of dual axis solar tracking with Maximum Power Point TrackingMaking model of dual axis solar tracking with Maximum Power Point Tracking
Making model of dual axis solar tracking with Maximum Power Point Tracking
 
A REVIEW PAPER ON PERFORMANCE AND EMISSION TEST OF 4 STROKE DIESEL ENGINE USI...
A REVIEW PAPER ON PERFORMANCE AND EMISSION TEST OF 4 STROKE DIESEL ENGINE USI...A REVIEW PAPER ON PERFORMANCE AND EMISSION TEST OF 4 STROKE DIESEL ENGINE USI...
A REVIEW PAPER ON PERFORMANCE AND EMISSION TEST OF 4 STROKE DIESEL ENGINE USI...
 
Study and Review on Various Current Comparators
Study and Review on Various Current ComparatorsStudy and Review on Various Current Comparators
Study and Review on Various Current Comparators
 
Reducing Silicon Real Estate and Switching Activity Using Low Power Test Patt...
Reducing Silicon Real Estate and Switching Activity Using Low Power Test Patt...Reducing Silicon Real Estate and Switching Activity Using Low Power Test Patt...
Reducing Silicon Real Estate and Switching Activity Using Low Power Test Patt...
 
Defending Reactive Jammers in WSN using a Trigger Identification Service.
Defending Reactive Jammers in WSN using a Trigger Identification Service.Defending Reactive Jammers in WSN using a Trigger Identification Service.
Defending Reactive Jammers in WSN using a Trigger Identification Service.
 

Recently uploaded

Instructions for Submissions thorugh G- Classroom.pptx
Instructions for Submissions thorugh G- Classroom.pptxInstructions for Submissions thorugh G- Classroom.pptx
Instructions for Submissions thorugh G- Classroom.pptx
Jheel Barad
 
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
MysoreMuleSoftMeetup
 
Sectors of the Indian Economy - Class 10 Study Notes pdf
Sectors of the Indian Economy - Class 10 Study Notes pdfSectors of the Indian Economy - Class 10 Study Notes pdf
Sectors of the Indian Economy - Class 10 Study Notes pdf
Vivekanand Anglo Vedic Academy
 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
siemaillard
 
Home assignment II on Spectroscopy 2024 Answers.pdf
Home assignment II on Spectroscopy 2024 Answers.pdfHome assignment II on Spectroscopy 2024 Answers.pdf
Home assignment II on Spectroscopy 2024 Answers.pdf
Tamralipta Mahavidyalaya
 
Additional Benefits for Employee Website.pdf
Additional Benefits for Employee Website.pdfAdditional Benefits for Employee Website.pdf
Additional Benefits for Employee Website.pdf
joachimlavalley1
 
1.4 modern child centered education - mahatma gandhi-2.pptx
1.4 modern child centered education - mahatma gandhi-2.pptx1.4 modern child centered education - mahatma gandhi-2.pptx
1.4 modern child centered education - mahatma gandhi-2.pptx
JosvitaDsouza2
 
Students, digital devices and success - Andreas Schleicher - 27 May 2024..pptx
Students, digital devices and success - Andreas Schleicher - 27 May 2024..pptxStudents, digital devices and success - Andreas Schleicher - 27 May 2024..pptx
Students, digital devices and success - Andreas Schleicher - 27 May 2024..pptx
EduSkills OECD
 
PART A. Introduction to Costumer Service
PART A. Introduction to Costumer ServicePART A. Introduction to Costumer Service
PART A. Introduction to Costumer Service
PedroFerreira53928
 
Ethnobotany and Ethnopharmacology ......
Ethnobotany and Ethnopharmacology ......Ethnobotany and Ethnopharmacology ......
Ethnobotany and Ethnopharmacology ......
Ashokrao Mane college of Pharmacy Peth-Vadgaon
 
MARUTI SUZUKI- A Successful Joint Venture in India.pptx
MARUTI SUZUKI- A Successful Joint Venture in India.pptxMARUTI SUZUKI- A Successful Joint Venture in India.pptx
MARUTI SUZUKI- A Successful Joint Venture in India.pptx
bennyroshan06
 
Unit 8 - Information and Communication Technology (Paper I).pdf
Unit 8 - Information and Communication Technology (Paper I).pdfUnit 8 - Information and Communication Technology (Paper I).pdf
Unit 8 - Information and Communication Technology (Paper I).pdf
Thiyagu K
 
The approach at University of Liverpool.pptx
The approach at University of Liverpool.pptxThe approach at University of Liverpool.pptx
The approach at University of Liverpool.pptx
Jisc
 
The French Revolution Class 9 Study Material pdf free download
The French Revolution Class 9 Study Material pdf free downloadThe French Revolution Class 9 Study Material pdf free download
The French Revolution Class 9 Study Material pdf free download
Vivekanand Anglo Vedic Academy
 
CLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCE
CLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCECLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCE
CLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCE
BhavyaRajput3
 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
siemaillard
 
TESDA TM1 REVIEWER FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
TESDA TM1 REVIEWER  FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...TESDA TM1 REVIEWER  FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
TESDA TM1 REVIEWER FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
EugeneSaldivar
 
Introduction to Quality Improvement Essentials
Introduction to Quality Improvement EssentialsIntroduction to Quality Improvement Essentials
Introduction to Quality Improvement Essentials
Excellence Foundation for South Sudan
 
Operation Blue Star - Saka Neela Tara
Operation Blue Star   -  Saka Neela TaraOperation Blue Star   -  Saka Neela Tara
Operation Blue Star - Saka Neela Tara
Balvir Singh
 
How libraries can support authors with open access requirements for UKRI fund...
How libraries can support authors with open access requirements for UKRI fund...How libraries can support authors with open access requirements for UKRI fund...
How libraries can support authors with open access requirements for UKRI fund...
Jisc
 

Recently uploaded (20)

Instructions for Submissions thorugh G- Classroom.pptx
Instructions for Submissions thorugh G- Classroom.pptxInstructions for Submissions thorugh G- Classroom.pptx
Instructions for Submissions thorugh G- Classroom.pptx
 
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
 
Sectors of the Indian Economy - Class 10 Study Notes pdf
Sectors of the Indian Economy - Class 10 Study Notes pdfSectors of the Indian Economy - Class 10 Study Notes pdf
Sectors of the Indian Economy - Class 10 Study Notes pdf
 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
 
Home assignment II on Spectroscopy 2024 Answers.pdf
Home assignment II on Spectroscopy 2024 Answers.pdfHome assignment II on Spectroscopy 2024 Answers.pdf
Home assignment II on Spectroscopy 2024 Answers.pdf
 
Additional Benefits for Employee Website.pdf
Additional Benefits for Employee Website.pdfAdditional Benefits for Employee Website.pdf
Additional Benefits for Employee Website.pdf
 
1.4 modern child centered education - mahatma gandhi-2.pptx
1.4 modern child centered education - mahatma gandhi-2.pptx1.4 modern child centered education - mahatma gandhi-2.pptx
1.4 modern child centered education - mahatma gandhi-2.pptx
 
Students, digital devices and success - Andreas Schleicher - 27 May 2024..pptx
Students, digital devices and success - Andreas Schleicher - 27 May 2024..pptxStudents, digital devices and success - Andreas Schleicher - 27 May 2024..pptx
Students, digital devices and success - Andreas Schleicher - 27 May 2024..pptx
 
PART A. Introduction to Costumer Service
PART A. Introduction to Costumer ServicePART A. Introduction to Costumer Service
PART A. Introduction to Costumer Service
 
Ethnobotany and Ethnopharmacology ......
Ethnobotany and Ethnopharmacology ......Ethnobotany and Ethnopharmacology ......
Ethnobotany and Ethnopharmacology ......
 
MARUTI SUZUKI- A Successful Joint Venture in India.pptx
MARUTI SUZUKI- A Successful Joint Venture in India.pptxMARUTI SUZUKI- A Successful Joint Venture in India.pptx
MARUTI SUZUKI- A Successful Joint Venture in India.pptx
 
Unit 8 - Information and Communication Technology (Paper I).pdf
Unit 8 - Information and Communication Technology (Paper I).pdfUnit 8 - Information and Communication Technology (Paper I).pdf
Unit 8 - Information and Communication Technology (Paper I).pdf
 
The approach at University of Liverpool.pptx
The approach at University of Liverpool.pptxThe approach at University of Liverpool.pptx
The approach at University of Liverpool.pptx
 
The French Revolution Class 9 Study Material pdf free download
The French Revolution Class 9 Study Material pdf free downloadThe French Revolution Class 9 Study Material pdf free download
The French Revolution Class 9 Study Material pdf free download
 
CLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCE
CLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCECLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCE
CLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCE
 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
 
TESDA TM1 REVIEWER FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
TESDA TM1 REVIEWER  FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...TESDA TM1 REVIEWER  FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
TESDA TM1 REVIEWER FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
 
Introduction to Quality Improvement Essentials
Introduction to Quality Improvement EssentialsIntroduction to Quality Improvement Essentials
Introduction to Quality Improvement Essentials
 
Operation Blue Star - Saka Neela Tara
Operation Blue Star   -  Saka Neela TaraOperation Blue Star   -  Saka Neela Tara
Operation Blue Star - Saka Neela Tara
 
How libraries can support authors with open access requirements for UKRI fund...
How libraries can support authors with open access requirements for UKRI fund...How libraries can support authors with open access requirements for UKRI fund...
How libraries can support authors with open access requirements for UKRI fund...
 

Audio/Speech Signal Analysis for Depression

IJSRD - International Journal for Scientific Research & Development | Vol. 2, Issue 07, 2014 | ISSN (online): 2321-0613
All rights reserved by www.ijsrd.com

Arpit Parikh (1), Sameena Zafar (2), Mukesh Saini (3)
(1) PG Research Scholar, (2) Assistant Professor and Head of Department, (3) Assistant Professor
(1,2,3) Department of Electronics and Communication Engineering, Patel Institute of Engineering & Science, RGPV University, Bhopal, India

Abstract: The word "depressed" is a common everyday word. People might say "I am depressed" when in fact they mean "I am fed up because I have had a row, or failed an exam, or lost my job", etc. These ups and downs of life are common and normal, and most people recover quite quickly. Depression can be identified by different methods. Here we identify depression using the MFCC (Mel Frequency Cepstral Coefficient) method. Different parameters are used to distinguish depressed speech from normal speech, but MFCC-based parameters are the most informative, because a depressive speech or audio signal can contain more information in the higher energy bands than normal speech.

Keywords: Mel Frequency Cepstral Coefficients (MFCC), Speech Emotion Recognition (SER), Discrete Cosine Transform (DCT), Depression

I. INTRODUCTION

Emotion recognition from speech is an area that has attracted increasing attention in recent years among engineers working in pattern recognition and speech/audio signal processing. Speech recognition is a technique that enables a device to recognize and understand spoken words. Automatic emotion recognition focuses on identifying the emotional state of a speaker from the voice signal. Emotions play an extremely important role in human life.
To recognize human emotions and feelings, various physiological signals have been widely used, because signal acquisition by non-invasive sensors is relatively simple and the physiological responses induced by emotion are less sensitive to social and cultural differences. Depression has been classified using blood volume pulse, electrocardiogram (ECG), galvanic skin response, and skin temperature signals, together with Gaussian mixture models. Speech signals have also been recognized using the Wiener filter and the Kalman filter [1]. In this work, MFCCs (Mel Frequency Cepstral Coefficients) are used to analyze depression.

The main point to understand about speech is that the sounds generated by a human are filtered by the shape of the vocal tract, including the tongue, teeth, etc. This shape determines what sound comes out. If we can determine the shape accurately, this should give us an accurate representation of the phoneme being produced. The shape of the vocal tract manifests itself in the envelope of the short-time power spectrum, and the job of MFCCs is to represent this envelope accurately.

MFCCs are a feature widely used in automatic speech and speaker recognition. They were introduced by Davis and Mermelstein in the 1980s and have been state-of-the-art ever since. Prior to the introduction of MFCCs, Linear Prediction Coefficients (LPCs) [8] and Linear Prediction Cepstral Coefficients (LPCCs) were the main feature types for automatic speech recognition (ASR) [5].

II. MFCC METHODOLOGY

MFCCs are calculated in the following steps [7]:
(1) Frame the signal into short frames.
(2) For each frame, calculate the periodogram estimate of the power spectrum.
(3) Apply the mel filter bank to the power spectra and sum the energy in each filter.
(4) Take the logarithm of all filter bank energies.
(5) Take the DCT of the log filter bank energies.
(6) Keep the lower-order DCT coefficients (typically 12) and discard the rest.

The first step is framing.
The speech signal is split into frames, typically 10 to 30 milliseconds long. The frame length matters because of the trade-off between time and frequency resolution: if the frame is too long it cannot capture local spectral properties, and if it is too short the frequency resolution degrades. The frames typically overlap each other by 25% to 70% of their own length. Overlapping ensures that each speech sound is approximately centered in some frame. After the signal is split into frames, each frame is multiplied by a window function. A good window function has a narrow main lobe and low side lobes, and a smooth taper at the edges is desired to minimize discontinuities. The most common window used in speech processing is the Hamming window.

The second step is to apply the Discrete Fourier Transform (DFT) to each frame. The fastest way to calculate the DFT is the FFT, an algorithm that can speed up DFT calculations by a factor of hundreds [3]:

$$X(k) = \sum_{t=0}^{K-1} x(t)\, e^{-j 2\pi k t / K}, \quad k = 0, 1, \ldots, K-1 \tag{1}$$

where x(t) is a frame of K samples. Spectral analysis shows that different timbres in speech signals correspond to different energy distributions over frequency. Therefore we usually perform an FFT to obtain the magnitude frequency response of each frame. When we perform an FFT on a frame, we assume that the signal within the frame is periodic and continuous when wrapping around. If this is not the case, we can still perform the FFT, but the discontinuity between the frame's first and last points is likely to introduce undesirable effects in the frequency response. There are two strategies for dealing with this problem:
(1) Multiply each frame by a Hamming window to increase its continuity at the first and last points.
(2) Take a frame of variable size such that it always contains an integer number of fundamental periods of the speech signal [6].
The second strategy encounters difficulty in practice, since identifying the fundamental period is not a trivial problem.
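As a rough illustration of the framing, windowing, and spectrum steps described above, the following Python sketch splits a signal into overlapping frames, applies a Hamming window, and computes a periodogram for each frame. It is only a sketch under stated assumptions: the function names, the 8 kHz sampling rate, the 20 ms frame length, and the 50% overlap are illustrative choices that are not taken from this paper, and a direct DFT is used in place of an FFT so that the example needs only the standard library.

```python
import cmath
import math


def frame_signal(signal, frame_len, hop_len):
    """Split a list of samples into overlapping frames (an incomplete tail is dropped)."""
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop_len):
        frames.append(signal[start:start + frame_len])
    return frames


def hamming(n_points):
    """Hamming window: w[n] = 0.54 - 0.46 * cos(2*pi*n / (N - 1))."""
    if n_points == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (n_points - 1))
            for n in range(n_points)]


def power_spectrum(frame):
    """Periodogram |X[k]|^2 / N for k = 0 .. N/2, computed with a direct DFT.

    A direct DFT is O(N^2); a real implementation would use an FFT instead.
    """
    n_points = len(frame)
    spectrum = []
    for k in range(n_points // 2 + 1):
        acc = 0j
        for n, x in enumerate(frame):
            acc += x * cmath.exp(-2j * math.pi * k * n / n_points)
        spectrum.append(abs(acc) ** 2 / n_points)
    return spectrum


# Assumed test signal: a 1 kHz tone sampled at 8 kHz (values chosen for illustration).
sample_rate = 8000
tone = [math.sin(2.0 * math.pi * 1000.0 * n / sample_rate) for n in range(800)]

frame_len = 160  # 20 ms at 8 kHz
hop_len = 80     # 50% overlap between consecutive frames
window = hamming(frame_len)

frames = frame_signal(tone, frame_len, hop_len)
windowed = [[s * w for s, w in zip(frame, window)] for frame in frames]
spectra = [power_spectrum(frame) for frame in windowed]

# The strongest bin of the first frame, converted back to a frequency in Hz.
peak_bin = max(range(len(spectra[0])), key=lambda k: spectra[0][k])
peak_hz = peak_bin * sample_rate / frame_len
```

Because the window tapers each frame toward its edges, the energy of the tone stays concentrated around a single bin rather than leaking across the spectrum.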
The second strategy also breaks down for unvoiced sounds, which have no fundamental period at all. Consequently, we usually adopt the first strategy and multiply the frame by a Hamming window before performing the FFT.

Audio/Speech Signal Analysis for Depression (IJSRD/Vol. 2/Issue 07/2014/053)

We multiply the magnitude frequency response by a set of 20 triangular bandpass filters to get the log energy of each filter. The positions of these filters are equally spaced along the mel frequency, which is related to the common linear frequency f by the following equation:

$$M = 2595 \log_{10}\left(1 + \frac{f}{700}\right) \tag{2}$$

The mel scale is based on how human hearing perceives frequencies. It was defined by setting 1000 mels equal to 1000 Hz as a reference point. Listeners were then asked to adjust the physical pitch until they perceived it as two-fold, ten-fold, and half of the reference, and those frequencies were labeled 2000 mel, 10000 mel, and 500 mel respectively. The resulting scale was called the mel scale and is approximately linear below 1000 Hz and logarithmic above [4]. Above that point, mel frequency is thus proportional to the logarithm of the linear frequency, reflecting similar effects in human subjective aural perception.

The spectrum is calculated using a filter bank spaced uniformly on the mel scale. The filter bank has a triangular bandpass frequency response, and both the spacing and the bandwidth are determined by a constant mel frequency interval. The mel frequency warping in the MFCC calculation is accomplished by this triangular, mel-spaced filter bank. It consists of several triangular, mel-spaced filters whose outputs are described by:

$$Y(i) = \sum_{k=0}^{K/2} |X(k)|\, H_i(k), \quad i = 1, \ldots, N \tag{3}$$

where |X(k)| is the K-point magnitude spectrum and H_i(k) is the sampled magnitude response of the i-th channel of an N-channel filter bank.

In this step, we apply the DCT to the 20 log energies obtained from the triangular bandpass filters to obtain L mel-scale cepstral coefficients. The filter outputs Y are first compressed with a logarithm, and the discrete cosine transform is then applied, which yields the MFCCs c[n] according to the following formula:

$$c[n] = \sum_{i=1}^{N} \log\big(Y(i)\big) \cos\left(\frac{\pi n}{N}\left(i - \frac{1}{2}\right)\right), \quad n = 1, \ldots, L \tag{4}$$

where N is the number of triangular bandpass filters and L is the number of mel-scale cepstral coefficients. Usually we set N = 20 and L = 12.
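The mel warping, triangular filtering, log compression, and DCT steps described above can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation. It assumes the common mel mapping M = 2595 log10(1 + f/700) for Equation (2) and a cosine transform of the form cos(pi n (i - 1/2) / N) for the final step. The function names, the 8 kHz sampling rate, the 512-point spectrum, the natural logarithm, and the small floor that guards against log(0) are all assumptions made for the example.

```python
import math


def hz_to_mel(f_hz):
    """Assumed form of Equation (2): M = 2595 * log10(1 + f / 700)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of the mel mapping above."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filter_bank(n_filters, n_fft, sample_rate):
    """Triangular filters over bins k = 0 .. n_fft/2, with edges equally spaced in mel."""
    n_bins = n_fft // 2 + 1
    top_mel = hz_to_mel(sample_rate / 2.0)
    # n_filters + 2 edge frequencies; filter i spans edges[i] .. edges[i + 2].
    edges_hz = [mel_to_hz(top_mel * j / (n_filters + 1))
                for j in range(n_filters + 2)]
    bank = []
    for i in range(n_filters):
        lo, mid, hi = edges_hz[i], edges_hz[i + 1], edges_hz[i + 2]
        row = []
        for k in range(n_bins):
            f = k * sample_rate / n_fft
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))   # rising slope
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def mfcc_from_magnitude(magnitude, bank, n_coeffs):
    """Filter-bank outputs, log compression, then a cosine transform."""
    n_filters = len(bank)
    floor = 1e-10  # assumption: keeps log() defined if a filter output is zero
    outputs = [max(sum(h * x for h, x in zip(row, magnitude)), floor)
               for row in bank]
    log_y = [math.log(y) for y in outputs]
    # Zero-based i here, so (i + 0.5) plays the role of (i - 1/2) for i = 1 .. N.
    return [
        sum(log_y[i] * math.cos(math.pi * n / n_filters * (i + 0.5))
            for i in range(n_filters))
        for n in range(1, n_coeffs + 1)
    ]


# Example with N = 20 filters and L = 12 coefficients, as in the text.
bank = mel_filter_bank(n_filters=20, n_fft=512, sample_rate=8000)
flat_magnitude = [1.0] * (512 // 2 + 1)  # hypothetical flat magnitude spectrum
coeffs = mfcc_from_magnitude(flat_magnitude, bank, n_coeffs=12)
```

In practice the `magnitude` argument would be the magnitude spectrum of one Hamming-windowed frame, and the function would be called once per frame to build the feature sequence.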
Since we have performed the FFT, the DCT transforms the frequency domain into a time-like domain called the quefrency domain. The obtained features are similar to the cepstrum, so they are referred to as mel-scale cepstral coefficients, or MFCCs. MFCCs alone can be used as the feature for speech recognition. The energy within a frame is also an important feature that can be easily obtained, so we usually add the log energy as the 13th feature. If necessary, other features can be added at this step, including pitch, zero-crossing rate, high-order spectral moments, and so on.

When the program runs, the GUI displays three buttons: 'Start Record', 'Browse Wav file', and 'Recognize'. Pushing 'Start Record' prompts the user to record an audio/speech signal or voice for 1 second. Pushing 'Browse Wav file' prompts the user to select a wave file (a 1-second audio/speech recording) from the system. After recording, pushing 'Recognize' compares the features of the audio signal with the depressed data: if the features of the voice match, the program displays the message 'Depressed'; if they do not match, it displays the message 'Normal' [2].

III. RESULT

The result below was obtained during recognition of a speech/audio signal. The method appears effective and robust enough to implement in a real-time working environment.

Fig. 1: Output waveform of depression

IV. CONCLUSION

From the analysis of speech/audio samples, we conclude that the vocal properties represented by MFCCs distinguish most effectively between depressed and normal speakers. The features extracted with the MFCC algorithm were stored in a .mat file. A distortion measure based on minimizing the Euclidean distance was used to match the unknown speech signal against the speech signal database. The experimental results were analyzed in MATLAB and indicate that the approach is effective.
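The matching step described above, in which an unknown signal is assigned to the stored entry at the smallest Euclidean distance, can be sketched as follows. The paper's own system was built in MATLAB with features stored in a .mat file, so this Python version is only an illustration. The function names and the three-dimensional feature vectors are hypothetical, made up for the example, and are not measured data.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(features, templates):
    """Return (label, distance) of the stored template nearest to `features`."""
    best_label, best_dist = None, float("inf")
    for label, template in templates:
        dist = euclidean_distance(features, template)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label, best_dist


# Hypothetical stored feature vectors (illustrative values, not real MFCC data).
templates = [
    ("Depressed", [1.0, 4.0, -2.0]),
    ("Normal", [3.0, 0.5, 1.0]),
]

unknown = [1.2, 3.5, -1.5]  # hypothetical features of an unknown recording
label, dist = classify(unknown, templates)
```

A real system would compare full MFCC vectors, typically averaged or aligned over many frames, and would usually store several templates per class.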
We hope this paper gives researchers a basic understanding of the topic and motivates them to explore speech/audio recognition further.

REFERENCES

[1] M. W. Huang, K. S. Cheng, "The Evaluation of Cognitive Function with Visual Long-term Evoked Potentials in Schizophrenic Patients", Institute of Biomedical Engineering, National Cheng Kung University, 2002.
[2] Arpit Parikh, Sameena Zafar, Mukesh Saini, "Classification of Speech Emotion using MFCC", International Journal For Advance Research In Engineering and Technology (IJARET), Volume 2, Issue IX, Sep 2014.
[3] Yen-Ting Chen, I-Chung Hung, Chun-Ju Hou, "Signal Analysis for Patients with Depression", Department of Electrical Engineering, Southern Taiwan University, Tainan, Taiwan.
[4] Thaweesak Yingthawornsuk, "Acoustic Analysis of Vocal Output Characteristics for Suicidal Risk Assessment", PhD Thesis, Graduate School of Vanderbilt University, December 2007.
[5] D. J. France, "Acoustical properties of speech as indicators of depression and suicidal risk", PhD Thesis, Vanderbilt University, August 1997.
[6] Asli Ozdas, Richard G. Shiavi, Stephen E. Silverman, Marilyn K. Silverman, D. Mitchell Wilkes, "Investigation of Vocal Jitter and Glottal Flow Spectrum as Possible Cues for Depression and Near-Term Suicidal Risk", IEEE Transactions on Biomedical Engineering, Vol. 51, No. 9, September 2004.
[7] Mel-frequency cepstrum coefficients (MFCCs), http://mirlab.org/jang/books/audiosignalprocessing/speechFeatureMfcc.asp?title=12-2%20MFCC
[8] http://practicalcryptography.com/miscellaneous/machine-learning/guide-mel-frequency-cepstral-coefficients-mfccs