SlideShare a Scribd company logo
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 10 Issue: 01 | Jan 2023 www.irjet.net p-ISSN: 2395-0072
© 2023, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 997
Speech Emotion Recognition Using Machine Learning
Preethi Jeevan1, Kolluri Sahaja2, Dasari Vani3, Alisha Begum4
1Professor,Dept.ofComputerScienceandEngineering,SNIST,Hyderabad-501301,India
2,3,4B.TECHScholars,Dept.ofComputerScienceandEngineeringHyderabad-501301,India
---------------------------------------------------------------------***----------------------------------------------------------------
Abstract: Speech is the ability to express thoughts and emotions through vocal soundsand gestures. Speechemotionrecognition
(SER), is the act of attempting to recognize human emotion and affective states from speech. This is capitalizing on the fact that
voice often reflects underlying emotion through tone and pitch..In this paper we analyze the emotions in recorded speech by
analyzing the acoustic features of the audio data of recordings .From the Audio data Feature Selection and Feature Extraction is
done. The Python implementation of the Librosa package was used in their extraction. Feature selection(FS) wasappliedinorder
to seek for the most relevant feature subset. We analyzed and labeled with respect to emotions and gender for six emotions viz.
neutral, happy, sad, angry, fear and disgust, the number of labels was slightly less for surprise.
Keywords: MFCC, Chroma, Mel Spectrogram, Feature selection
1. INTRODUCTION
Emotion plays a significant role in interpersonal human interactions. For interaction between human and machine use of
speech signal is the fastest and most efficient method. Emotional displays convey considerable information about the mental
state of an individual. For machine emotional detection is a very difficult task, on the other hand, it is natural for humans. So,
knowledge related to emotion is used by an emotion recognition system in such a way that there is an improvement in
communication between machine and human.
1.1 LITERATURE SURVERY
A majority of natural language processing solutions, such as voice-activated systems etc. require speech as input. Standard
procedure is to first convert this speech input to text usingAutomatic Speech Recognition (ASR) systems and then run
classification or other learning operations on the ASR text output. ASR resolves variations in speech transcriptions being
speaker independent. ASR systems produce outputs with high accuracybutenduplosinga significantamountof a speechfrom
different users using probabilistic acoustic and language models which results in information that suggests emotion from
speech. This gap has resulted in Speech-based Emotion Recognition(SER)systemsbecominganarea ofresearchinterestinthe
last few years .Three key issues for successful SER system
 Choice of a good emotional speech database.
 Extract effective features.
 Design reliable classifiers using machine learning algorithms.
The emotional feature extraction is a main issue in the SER system. A typical SER system works on extracting features such as
spectral features, pitch frequency features, formant features and energy related features from speech, following it with a
classification task to predict various classes of emotion.Speechinformationrecognizedemotionsmaybespeakerindependent
or speaker dependent. Bayesian Network model, Hidden Markov Model (HMM), Support Vector Machines (SVM), Gaussian
Mixture Model (GMM) and Multi-Classifier Fusion are few of the techniques used in traditional classification tasks.
In this work, we propose a robust technique of emotion classification usingspeechfeaturesandtranscriptions.The objectiveis
to capture emotional characteristics using speech features, along with semantic informationfromtext,andusea deeplearning
based emotion classifier to improve emotion detection accuracies. We present differentdeep network architecturestoclassify
emotion using speech features and text. The main contributions of the current work are:
1.2 PROPOSED METHOD
The speech emotion recognition system is performed as a Machine Learning (ML) model. The steps of operation are similarto
any other ML project, with supplementary fine- tuning systems to make the model function adequately.
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 10 Issue: 01 | Jan 2023 www.irjet.net p-ISSN: 2395-0072
© 2023, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 998
The fundamental action is data collection, which is of prime importance.
The model being generated will acquire from the data contributed to it and all the conclusions and decisionsthata progressed
model will produce is supervised data. The secondary action,calledasfeature engineering,isa combinationofvariousmachine
learning assignments that are performedoverthegathered data.Thesesystemsapproachthevariousdata descriptionanddata
quality problems. The third step is often explored the essence of an ML project where an algorithmic based prototype is
generated. This model uses an ML algorithm to determine about the data and instruct itself to react to any new data it is
exhibited to. The ultimate step is to estimate the functioning of the built model. Very frequently, developersreplicatethe steps
of generating a model and estimating it to analyze the performance of various algorithms. Measuring outcomeshelptochoose
the suitable ML algorithm most appropriate to the p Dataset:
English Language Dataset, namely the Toronto Emotional Speech Set (TESS) was taken into consideration. This is a dataset
which consists of 200 target words and were spoken by two women, one younger and the other older, and the phrases were
recorded in order to portray the following seven emotions happy, sad,angry,disgust,fear,surpriseandneutral state.There are
2800 files in total, in this dataset.
CREMA-D is a data set of 7,442 original clips from 91 actors. These clips were from 48 male and 43 female actors between the
ages of 20 and 74 coming from a variety of races and ethnicities (AfricanAmerica,Asian,Caucasian,Hispanic,andUnspecified).
Actors spoke from a selection of12 sentences. The sentences were presented using one of six different emotions (Anger,
Disgust, Fear, Happy, Neutral, and Sad) and four different emotion levels (Low, Medium, High, and Unspecified).
Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) dataset contains 24 professional actors(12female,
12 male), vocalizing two lexically-matched statements in a neutral North American accent. Speech emotions include calm,
happy, sad, angry, fearful, surprise, and disgust expressions. Each expression is produced at two levels of emotional intensity
(normal, strong), with an additional neutral expression.
2.SPEECH EMOTION RECOGNITION SYSTEM
2.1 Data Set & Data Visualization:
In this Speech Emotion Recognition Project, Audio File is taken from the TESS Dataset, and that will be uploaded in .wav file
format before the file upload process is validated, which relates to the file format and empty file input, and will be connect
directly to python files where the output generated in the form of Emotional Labels. Data visualization provides information
about the given audio data in redicament.
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 10 Issue: 01 | Jan 2023 www.irjet.net p-ISSN: 2395-0072
© 2023, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 999
Graphic and pictorial form. Here, initial dataset is divided into its emotional labels, and then the whole data is pictured into
spectrogram graph and wave plot diagram. Spectrogram maybea visual representationofthespectrumoffrequenciesofa sign
because it varies with time. Wave-plot is employed to plot waveform of amplitude vs time where theprimaryaxisisamplitude
and the second axis is time.
2.2 Features Selection:
In this Project mainly comprise features like Zero Crossing rate, Chroma Shift, Root Mean Square Value, Mel Spectrogram and
MFCC (Mel information retrieval in the music category; these extract the crux of given audio. The zero- crossing rate iswhena
significant change from positive to 0 Frequency Cepstral Coefficient). These are few predominantly used audio features for
emotional-related audio content, acoustic recognition, and to negative or from negative to 0 to positive. Mel Spectrogramsare
spectrograms that visual sounds on the Mel scale as against the frequency domain. The Mel Scale must be a logarithmic
transformation of a signal's frequency.
2.3 Feature extraction:
The speech signal contains a large number of parameters that reflect theemotional characteristics.Oneofthestickingpointsin
emotion recognition is what features should be used. In recent research, many common featuresareextracted,suchasenergy,
pitch, formant, and some spectrum features such as linear prediction coefficients (LPC), mel-frequency cepstrum coefficients
(MFCC), and modulation spectral features. In this work,wehaveselectedmodulationspectral featuresandMFCC,toextractthe
emotional features.
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 10 Issue: 01 | Jan 2023 www.irjet.net p-ISSN: 2395-0072
© 2023, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 1000
Mel-frequency cepstrum coefficient (MFCC) is the most used representation ofthespectral propertyofvoicesignals.Theseare
the best for speech recognition as it takes humanperceptionsensitivitywithrespectto frequenciesintoconsideration.Foreach
frame, the Fourier transform and the energy spectrum were estimated and mapped into the Mel-frequency scale. In our
research, we extract the first 12 order of the MFCC coefficients where the speech signals aresampledat16KHz.For eachorder
coefficients, we calculate the mean, variance, standard deviation, kurtosis, andskewness,andthisisfortheotherall the frames
of an utterance. Each MFCC feature vector is 60-dimensional.
3. CONVOLUTIONAL NEURAL NETWORK
Convolutional neural network (CNN) is used to train the speech data in the training set. Fig. 2 shows the accuracy of CNN
model RNN model and MLP model training set and verification set. It can be seen from Figure 2 that with the increase of
iteration times, the accuracy of training set and verification set tends to increase, especially the training set. Finally, after 200
iterations, the accuracy of the training set is over 97%, and that of the verificationsetisover80%.CNN model developedinthis
paper has high accuracy than RNN and MLP in both training set and verification set.
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 10 Issue: 01 | Jan 2023 www.irjet.net p-ISSN: 2395-0072
© 2023, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 1001
The convolution operation represents the hierarchical extraction of speech features, while the maximum pooling operation
removes redundant information in the previous layer features, and the operation is simplified. An activation layer is set after
each convolution layer, and the best activation function determinedby experiments.Twofullyconnected neurons aresetinthe
output layer to divide the speech signals into two categories. In addition, considering the complexity of the network structure,
random Dropout is set after each hidden layer to prevent the network from over-fitting in the training process. After adding
Dropout, the neurons in each hidden layer will have a certain probability of not updating the weights in the training process,
and the probability of each neuron not updating the weights is equal.
Algorithm
Step 1: The sample audio is provided as input.
Step 2: The Spectrogram and Waveform is plotted from the audio file.
Step 3: Using the LIBROSA, a python library we extract the MFCC (Mel Frequency Cepstral Coefficient) usually about 10–20.
Step 4: Remixing the data, dividing it in train and test and there after constructing a CNN model anditsfollowinglayerstotrain
the dataset.
Step 5: Predicting the human voice emotion from that trained data (sample no.- predicted value - actual value)
UML DIAGRAM:
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 10 Issue: 01 | Jan 2023 www.irjet.net p-ISSN: 2395-0072
© 2023, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 1002
5. EXPERIMENTAL RESULTS AND ANALYSIS
In this paper, the obtained models are tested in the test set following the experimental procedure. Table 1 shows the
normalized confusion matrix for SER. The recognition accuracy ofexpression recognitionachievedbytheproposedmethodon
the RAVDESS dataset is better than existing studies on SER. The average classification accuracy on RAVDESS is listed in Table
Some discriminations of 'sad' for ' neural ' fail. Perhaps surprisingly, the frequency of confusion between'fear'and'happiness'
is as high as the frequency of confusion between 'sadness' or 'disgust'- perhaps this is because fear is a true multi-faceted
emotion. Based on this, we will compare the characteristics of confounding emotions more carefully to see if there are any
differences and how to capture them. For example, translating spokenwordsintotextandtrainingthenetwork for multimodal
prediction, and combining semantics for sentiment recognition.
6. CONCLUSION
In recent years, SER technology as one of the key technologies in human-computer interaction systems, has received a lot of
attention from researchers at home and abroad for its ability to accurately recognize emotions andthusimprovethequalityof
human-computer interaction. In this paper, we propose a deep learning algorithm with fused featuresforSER.intermsofdata
processing, we quadruple-processed the RAVDESS dataset with 5760 audio. For the network structure, we constructed two
parallel convolutional neural networks (CNNs) to extractspatial featuresanda transformencodernetwork to extracttemporal
features to classify emotions from one of the eight categories. The TESS dataset that we considered is fine-tuned. Since it has
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 10 Issue: 01 | Jan 2023 www.irjet.net p-ISSN: 2395-0072
© 2023, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 1003
noiseless data, it was easy for us to classify and feature .Taking advantage of CNNs in spatial feature representation and
sequence coding transformation, we obtained an accuracy of 80.46% on the holdout test set of the RAVDESSdataset.Basedon
the analysis of the results, the recognition of emotions by converting speech into text combined with semantics is considered.
The combined Spectrogram-MFCC model results in an overall emotion detection accuracy of 73.1%, an almost 4%
improvement to the existing state-of-the-art methods. Better results are observed when speech features are used along with
speech transcriptions. The combinedSpectrogram-Text model givesa classaccuracyof69.5%andanoverall accuracyof75.1%
whereas the combined MFCC-Text model also gives a class accuracy of 69.5% but an overall accuracy of 76.1%, a 5.6% and an
almost 7% improvement over current benchmarks respectively. The proposed models can be used for emotion-related
applications such as conversational, social robots, etc. where identifying emotion and sentiment hidden in speech may play a
role in the better conversation.
REFERENCES
1. Surabhi V, Saurabh M. Speech emotion recognition: A review. International Research Journal of Engineering and
Technology (IRJET). 2016;03:313-316
2. Nicholson, J., Takahashi, K. &Nakatsu, R. Emotion Recognition in Speech Using Neural Networks. NCA 9, 290–
296(2000). https://doi.org/10.1007/s005210070006.
3. ‘Han, Kun / Yu, Dong / Tashev, Ivan (2014): “Speech emotion recognition using deep neural network ”
4. A. Rajasekhar, M. K. Hota, ―A Study of Speech, Speaker and Emotion Recognition using Mel Frequency Cepstrum
Coefficients and Support Vector Machines‖, International Conference on Communication and Signal Processing,
pp. 0114-0118, 2018.
5. Zheng, W. L., Zhu, J., Peng, Y.: EEG-based emotion classification using deep belief networks. In: IEEE International
Conference on Multimedia & Expo, pp. 1-6 (2014).
6. Parthasarathy S, Tashev I.Convolutional neural network techniques for speech emotion recognition. In: 2018 16th
International Workshop on Acoustic Signal Enhancement (IWAENC). IEEE 2018. pp. 121-125.
7. Matilda S. Emotion recognition: A survey. International Journal of Advanced Computer Research. 2015;3(1):14-19
8. Lim W, Jang D, Lee T. Speech emotion recognition using convolutional and recurrent neural networks. Asia- Pacific.
2017:1-4
9. G. Liu, W. He, B. Jin, ―Feature fusion of speech emotion recognition based on deep Learning‖, 2018 International
Conference on Network Infrastructure and Digital Content (IC-NIDC), pp. 193-197, 2018.
10. Koolagudi SG, Rao KS. Emotion recognition from speech: A review. International Journal of Speech Technology 2012.

More Related Content

Similar to Speech Emotion Recognition Using Machine Learning

A Novel, Robust, Hierarchical, Text-Independent Speaker Recognition Technique
A Novel, Robust, Hierarchical, Text-Independent Speaker Recognition TechniqueA Novel, Robust, Hierarchical, Text-Independent Speaker Recognition Technique
A Novel, Robust, Hierarchical, Text-Independent Speaker Recognition Technique
CSCJournals
 
An evolutionary optimization method for selecting features for speech emotion...
An evolutionary optimization method for selecting features for speech emotion...An evolutionary optimization method for selecting features for speech emotion...
An evolutionary optimization method for selecting features for speech emotion...
TELKOMNIKA JOURNAL
 
A Review Paper on Speech Based Emotion Detection Using Deep Learning
A Review Paper on Speech Based Emotion Detection Using Deep LearningA Review Paper on Speech Based Emotion Detection Using Deep Learning
A Review Paper on Speech Based Emotion Detection Using Deep Learning
IRJET Journal
 
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
AIRCC Publishing Corporation
 
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
ijcsit
 
IRJET- Voice based Gender Recognition
IRJET- Voice based Gender RecognitionIRJET- Voice based Gender Recognition
IRJET- Voice based Gender Recognition
IRJET Journal
 
IRJET- Analysis of Music Recommendation System using Machine Learning Alg...
IRJET-  	  Analysis of Music Recommendation System using Machine Learning Alg...IRJET-  	  Analysis of Music Recommendation System using Machine Learning Alg...
IRJET- Analysis of Music Recommendation System using Machine Learning Alg...
IRJET Journal
 
IRJET - Audio Emotion Analysis
IRJET - Audio Emotion AnalysisIRJET - Audio Emotion Analysis
IRJET - Audio Emotion Analysis
IRJET Journal
 
A survey on Enhancements in Speech Recognition
A survey on Enhancements in Speech RecognitionA survey on Enhancements in Speech Recognition
A survey on Enhancements in Speech Recognition
IRJET Journal
 
F334047
F334047F334047
International journal of signal and image processing issues vol 2015 - no 1...
International journal of signal and image processing issues   vol 2015 - no 1...International journal of signal and image processing issues   vol 2015 - no 1...
International journal of signal and image processing issues vol 2015 - no 1...
sophiabelthome
 
IRJET- Techniques for Analyzing Job Satisfaction in Working Employees – A...
IRJET-  	  Techniques for Analyzing Job Satisfaction in Working Employees – A...IRJET-  	  Techniques for Analyzing Job Satisfaction in Working Employees – A...
IRJET- Techniques for Analyzing Job Satisfaction in Working Employees – A...
IRJET Journal
 
Voice Recognition Based Automation System for Medical Applications and for Ph...
Voice Recognition Based Automation System for Medical Applications and for Ph...Voice Recognition Based Automation System for Medical Applications and for Ph...
Voice Recognition Based Automation System for Medical Applications and for Ph...
IRJET Journal
 
Voice Recognition Based Automation System for Medical Applications and for Ph...
Voice Recognition Based Automation System for Medical Applications and for Ph...Voice Recognition Based Automation System for Medical Applications and for Ph...
Voice Recognition Based Automation System for Medical Applications and for Ph...
IRJET Journal
 
SPEECH CLASSIFICATION USING ZERNIKE MOMENTS
SPEECH CLASSIFICATION USING ZERNIKE MOMENTSSPEECH CLASSIFICATION USING ZERNIKE MOMENTS
SPEECH CLASSIFICATION USING ZERNIKE MOMENTS
cscpconf
 
SPEECH BASED EMOTION RECOGNITION USING VOICE
SPEECH BASED  EMOTION RECOGNITION USING VOICESPEECH BASED  EMOTION RECOGNITION USING VOICE
SPEECH BASED EMOTION RECOGNITION USING VOICE
VamshidharSingh
 
IRJET - Threat Prediction using Speech Analysis
IRJET - Threat Prediction using Speech AnalysisIRJET - Threat Prediction using Speech Analysis
IRJET - Threat Prediction using Speech Analysis
IRJET Journal
 
IRJET- Voice Command Execution with Speech Recognition and Synthesizer
IRJET- Voice Command Execution with Speech Recognition and SynthesizerIRJET- Voice Command Execution with Speech Recognition and Synthesizer
IRJET- Voice Command Execution with Speech Recognition and Synthesizer
IRJET Journal
 
IRJET- Factoid Question and Answering System
IRJET-  	  Factoid Question and Answering SystemIRJET-  	  Factoid Question and Answering System
IRJET- Factoid Question and Answering System
IRJET Journal
 
SPEECH EMOTION RECOGNITION SYSTEM USING RNN
SPEECH EMOTION RECOGNITION SYSTEM USING RNNSPEECH EMOTION RECOGNITION SYSTEM USING RNN
SPEECH EMOTION RECOGNITION SYSTEM USING RNN
IRJET Journal
 

Similar to Speech Emotion Recognition Using Machine Learning (20)

A Novel, Robust, Hierarchical, Text-Independent Speaker Recognition Technique
A Novel, Robust, Hierarchical, Text-Independent Speaker Recognition TechniqueA Novel, Robust, Hierarchical, Text-Independent Speaker Recognition Technique
A Novel, Robust, Hierarchical, Text-Independent Speaker Recognition Technique
 
An evolutionary optimization method for selecting features for speech emotion...
An evolutionary optimization method for selecting features for speech emotion...An evolutionary optimization method for selecting features for speech emotion...
An evolutionary optimization method for selecting features for speech emotion...
 
A Review Paper on Speech Based Emotion Detection Using Deep Learning
A Review Paper on Speech Based Emotion Detection Using Deep LearningA Review Paper on Speech Based Emotion Detection Using Deep Learning
A Review Paper on Speech Based Emotion Detection Using Deep Learning
 
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
 
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
 
IRJET- Voice based Gender Recognition
IRJET- Voice based Gender RecognitionIRJET- Voice based Gender Recognition
IRJET- Voice based Gender Recognition
 
IRJET- Analysis of Music Recommendation System using Machine Learning Alg...
IRJET-  	  Analysis of Music Recommendation System using Machine Learning Alg...IRJET-  	  Analysis of Music Recommendation System using Machine Learning Alg...
IRJET- Analysis of Music Recommendation System using Machine Learning Alg...
 
IRJET - Audio Emotion Analysis
IRJET - Audio Emotion AnalysisIRJET - Audio Emotion Analysis
IRJET - Audio Emotion Analysis
 
A survey on Enhancements in Speech Recognition
A survey on Enhancements in Speech RecognitionA survey on Enhancements in Speech Recognition
A survey on Enhancements in Speech Recognition
 
F334047
F334047F334047
F334047
 
International journal of signal and image processing issues vol 2015 - no 1...
International journal of signal and image processing issues   vol 2015 - no 1...International journal of signal and image processing issues   vol 2015 - no 1...
International journal of signal and image processing issues vol 2015 - no 1...
 
IRJET- Techniques for Analyzing Job Satisfaction in Working Employees – A...
IRJET-  	  Techniques for Analyzing Job Satisfaction in Working Employees – A...IRJET-  	  Techniques for Analyzing Job Satisfaction in Working Employees – A...
IRJET- Techniques for Analyzing Job Satisfaction in Working Employees – A...
 
Voice Recognition Based Automation System for Medical Applications and for Ph...
Voice Recognition Based Automation System for Medical Applications and for Ph...Voice Recognition Based Automation System for Medical Applications and for Ph...
Voice Recognition Based Automation System for Medical Applications and for Ph...
 
Voice Recognition Based Automation System for Medical Applications and for Ph...
Voice Recognition Based Automation System for Medical Applications and for Ph...Voice Recognition Based Automation System for Medical Applications and for Ph...
Voice Recognition Based Automation System for Medical Applications and for Ph...
 
SPEECH CLASSIFICATION USING ZERNIKE MOMENTS
SPEECH CLASSIFICATION USING ZERNIKE MOMENTSSPEECH CLASSIFICATION USING ZERNIKE MOMENTS
SPEECH CLASSIFICATION USING ZERNIKE MOMENTS
 
SPEECH BASED EMOTION RECOGNITION USING VOICE
SPEECH BASED  EMOTION RECOGNITION USING VOICESPEECH BASED  EMOTION RECOGNITION USING VOICE
SPEECH BASED EMOTION RECOGNITION USING VOICE
 
IRJET - Threat Prediction using Speech Analysis
IRJET - Threat Prediction using Speech AnalysisIRJET - Threat Prediction using Speech Analysis
IRJET - Threat Prediction using Speech Analysis
 
IRJET- Voice Command Execution with Speech Recognition and Synthesizer
IRJET- Voice Command Execution with Speech Recognition and SynthesizerIRJET- Voice Command Execution with Speech Recognition and Synthesizer
IRJET- Voice Command Execution with Speech Recognition and Synthesizer
 
IRJET- Factoid Question and Answering System
IRJET-  	  Factoid Question and Answering SystemIRJET-  	  Factoid Question and Answering System
IRJET- Factoid Question and Answering System
 
SPEECH EMOTION RECOGNITION SYSTEM USING RNN
SPEECH EMOTION RECOGNITION SYSTEM USING RNNSPEECH EMOTION RECOGNITION SYSTEM USING RNN
SPEECH EMOTION RECOGNITION SYSTEM USING RNN
 

More from IRJET Journal

TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
IRJET Journal
 
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURESTUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
IRJET Journal
 
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
IRJET Journal
 
Effect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsEffect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil Characteristics
IRJET Journal
 
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
IRJET Journal
 
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
IRJET Journal
 
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
IRJET Journal
 
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
IRJET Journal
 
A REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASA REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADAS
IRJET Journal
 
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
IRJET Journal
 
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProP.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
IRJET Journal
 
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
IRJET Journal
 
Survey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemSurvey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare System
IRJET Journal
 
Review on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesReview on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridges
IRJET Journal
 
React based fullstack edtech web application
React based fullstack edtech web applicationReact based fullstack edtech web application
React based fullstack edtech web application
IRJET Journal
 
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
IRJET Journal
 
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
IRJET Journal
 
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
IRJET Journal
 
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignMultistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
IRJET Journal
 
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
IRJET Journal
 

More from IRJET Journal (20)

TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
 
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURESTUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
 
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
 
Effect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsEffect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil Characteristics
 
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
 
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
 
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
 
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
 
A REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASA REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADAS
 
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
 
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProP.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
 
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
 
Survey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemSurvey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare System
 
Review on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesReview on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridges
 
React based fullstack edtech web application
React based fullstack edtech web applicationReact based fullstack edtech web application
React based fullstack edtech web application
 
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
 
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
 
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
 
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignMultistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
 
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
 

Recently uploaded

basic-wireline-operations-course-mahmoud-f-radwan.pdf
basic-wireline-operations-course-mahmoud-f-radwan.pdfbasic-wireline-operations-course-mahmoud-f-radwan.pdf
basic-wireline-operations-course-mahmoud-f-radwan.pdf
NidhalKahouli2
 
官方认证美国密歇根州立大学毕业证学位证书原版一模一样
官方认证美国密歇根州立大学毕业证学位证书原版一模一样官方认证美国密歇根州立大学毕业证学位证书原版一模一样
官方认证美国密歇根州立大学毕业证学位证书原版一模一样
171ticu
 
Recycled Concrete Aggregate in Construction Part II
Recycled Concrete Aggregate in Construction Part IIRecycled Concrete Aggregate in Construction Part II
Recycled Concrete Aggregate in Construction Part II
Aditya Rajan Patra
 
ISPM 15 Heat Treated Wood Stamps and why your shipping must have one
ISPM 15 Heat Treated Wood Stamps and why your shipping must have oneISPM 15 Heat Treated Wood Stamps and why your shipping must have one
ISPM 15 Heat Treated Wood Stamps and why your shipping must have one
Las Vegas Warehouse
 
Literature Review Basics and Understanding Reference Management.pptx
Literature Review Basics and Understanding Reference Management.pptxLiterature Review Basics and Understanding Reference Management.pptx
Literature Review Basics and Understanding Reference Management.pptx
Dr Ramhari Poudyal
 
A review on techniques and modelling methodologies used for checking electrom...
A review on techniques and modelling methodologies used for checking electrom...A review on techniques and modelling methodologies used for checking electrom...
A review on techniques and modelling methodologies used for checking electrom...
nooriasukmaningtyas
 
Embedded machine learning-based road conditions and driving behavior monitoring
Embedded machine learning-based road conditions and driving behavior monitoringEmbedded machine learning-based road conditions and driving behavior monitoring
Embedded machine learning-based road conditions and driving behavior monitoring
IJECEIAES
 
132/33KV substation case study Presentation
132/33KV substation case study Presentation132/33KV substation case study Presentation
132/33KV substation case study Presentation
kandramariana6
 
Modelagem de um CSTR com reação endotermica.pdf
Modelagem de um CSTR com reação endotermica.pdfModelagem de um CSTR com reação endotermica.pdf
Modelagem de um CSTR com reação endotermica.pdf
camseq
 
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024
Sinan KOZAK
 
The Python for beginners. This is an advance computer language.
The Python for beginners. This is an advance computer language.The Python for beginners. This is an advance computer language.
The Python for beginners. This is an advance computer language.
sachin chaurasia
 
A SYSTEMATIC RISK ASSESSMENT APPROACH FOR SECURING THE SMART IRRIGATION SYSTEMS
A SYSTEMATIC RISK ASSESSMENT APPROACH FOR SECURING THE SMART IRRIGATION SYSTEMSA SYSTEMATIC RISK ASSESSMENT APPROACH FOR SECURING THE SMART IRRIGATION SYSTEMS
A SYSTEMATIC RISK ASSESSMENT APPROACH FOR SECURING THE SMART IRRIGATION SYSTEMS
IJNSA Journal
 
5214-1693458878915-Unit 6 2023 to 2024 academic year assignment (AutoRecovere...
5214-1693458878915-Unit 6 2023 to 2024 academic year assignment (AutoRecovere...5214-1693458878915-Unit 6 2023 to 2024 academic year assignment (AutoRecovere...
5214-1693458878915-Unit 6 2023 to 2024 academic year assignment (AutoRecovere...
ihlasbinance2003
 
Heat Resistant Concrete Presentation ppt
Heat Resistant Concrete Presentation pptHeat Resistant Concrete Presentation ppt
Heat Resistant Concrete Presentation ppt
mamunhossenbd75
 
Engineering Drawings Lecture Detail Drawings 2014.pdf
Engineering Drawings Lecture Detail Drawings 2014.pdfEngineering Drawings Lecture Detail Drawings 2014.pdf
Engineering Drawings Lecture Detail Drawings 2014.pdf
abbyasa1014
 
2008 BUILDING CONSTRUCTION Illustrated - Ching Chapter 02 The Building.pdf
2008 BUILDING CONSTRUCTION Illustrated - Ching Chapter 02 The Building.pdf2008 BUILDING CONSTRUCTION Illustrated - Ching Chapter 02 The Building.pdf
2008 BUILDING CONSTRUCTION Illustrated - Ching Chapter 02 The Building.pdf
Yasser Mahgoub
 
22CYT12-Unit-V-E Waste and its Management.ppt
22CYT12-Unit-V-E Waste and its Management.ppt22CYT12-Unit-V-E Waste and its Management.ppt
22CYT12-Unit-V-E Waste and its Management.ppt
KrishnaveniKrishnara1
 
Advanced control scheme of doubly fed induction generator for wind turbine us...
Advanced control scheme of doubly fed induction generator for wind turbine us...Advanced control scheme of doubly fed induction generator for wind turbine us...
Advanced control scheme of doubly fed induction generator for wind turbine us...
IJECEIAES
 
DEEP LEARNING FOR SMART GRID INTRUSION DETECTION: A HYBRID CNN-LSTM-BASED MODEL
DEEP LEARNING FOR SMART GRID INTRUSION DETECTION: A HYBRID CNN-LSTM-BASED MODELDEEP LEARNING FOR SMART GRID INTRUSION DETECTION: A HYBRID CNN-LSTM-BASED MODEL
DEEP LEARNING FOR SMART GRID INTRUSION DETECTION: A HYBRID CNN-LSTM-BASED MODEL
gerogepatton
 
Harnessing WebAssembly for Real-time Stateless Streaming Pipelines
Harnessing WebAssembly for Real-time Stateless Streaming PipelinesHarnessing WebAssembly for Real-time Stateless Streaming Pipelines
Harnessing WebAssembly for Real-time Stateless Streaming Pipelines
Christina Lin
 

Recently uploaded (20)

basic-wireline-operations-course-mahmoud-f-radwan.pdf
basic-wireline-operations-course-mahmoud-f-radwan.pdfbasic-wireline-operations-course-mahmoud-f-radwan.pdf
basic-wireline-operations-course-mahmoud-f-radwan.pdf
 
官方认证美国密歇根州立大学毕业证学位证书原版一模一样
官方认证美国密歇根州立大学毕业证学位证书原版一模一样官方认证美国密歇根州立大学毕业证学位证书原版一模一样
官方认证美国密歇根州立大学毕业证学位证书原版一模一样
 
Recycled Concrete Aggregate in Construction Part II
Recycled Concrete Aggregate in Construction Part IIRecycled Concrete Aggregate in Construction Part II
Recycled Concrete Aggregate in Construction Part II
 
ISPM 15 Heat Treated Wood Stamps and why your shipping must have one
ISPM 15 Heat Treated Wood Stamps and why your shipping must have oneISPM 15 Heat Treated Wood Stamps and why your shipping must have one
ISPM 15 Heat Treated Wood Stamps and why your shipping must have one
 
Literature Review Basics and Understanding Reference Management.pptx
Literature Review Basics and Understanding Reference Management.pptxLiterature Review Basics and Understanding Reference Management.pptx
Literature Review Basics and Understanding Reference Management.pptx
 
A review on techniques and modelling methodologies used for checking electrom...
A review on techniques and modelling methodologies used for checking electrom...A review on techniques and modelling methodologies used for checking electrom...
A review on techniques and modelling methodologies used for checking electrom...
 
Embedded machine learning-based road conditions and driving behavior monitoring
Embedded machine learning-based road conditions and driving behavior monitoringEmbedded machine learning-based road conditions and driving behavior monitoring
Embedded machine learning-based road conditions and driving behavior monitoring
 
132/33KV substation case study Presentation
132/33KV substation case study Presentation132/33KV substation case study Presentation
132/33KV substation case study Presentation
 
Modelagem de um CSTR com reação endotermica.pdf
Modelagem de um CSTR com reação endotermica.pdfModelagem de um CSTR com reação endotermica.pdf
Modelagem de um CSTR com reação endotermica.pdf
 
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024
 
The Python for beginners. This is an advance computer language.
The Python for beginners. This is an advance computer language.The Python for beginners. This is an advance computer language.
The Python for beginners. This is an advance computer language.
 
A SYSTEMATIC RISK ASSESSMENT APPROACH FOR SECURING THE SMART IRRIGATION SYSTEMS
A SYSTEMATIC RISK ASSESSMENT APPROACH FOR SECURING THE SMART IRRIGATION SYSTEMSA SYSTEMATIC RISK ASSESSMENT APPROACH FOR SECURING THE SMART IRRIGATION SYSTEMS
A SYSTEMATIC RISK ASSESSMENT APPROACH FOR SECURING THE SMART IRRIGATION SYSTEMS
 
5214-1693458878915-Unit 6 2023 to 2024 academic year assignment (AutoRecovere...
5214-1693458878915-Unit 6 2023 to 2024 academic year assignment (AutoRecovere...5214-1693458878915-Unit 6 2023 to 2024 academic year assignment (AutoRecovere...
5214-1693458878915-Unit 6 2023 to 2024 academic year assignment (AutoRecovere...
 
Heat Resistant Concrete Presentation ppt
Heat Resistant Concrete Presentation pptHeat Resistant Concrete Presentation ppt
Heat Resistant Concrete Presentation ppt
 
Engineering Drawings Lecture Detail Drawings 2014.pdf
Engineering Drawings Lecture Detail Drawings 2014.pdfEngineering Drawings Lecture Detail Drawings 2014.pdf
Engineering Drawings Lecture Detail Drawings 2014.pdf
 
2008 BUILDING CONSTRUCTION Illustrated - Ching Chapter 02 The Building.pdf
2008 BUILDING CONSTRUCTION Illustrated - Ching Chapter 02 The Building.pdf2008 BUILDING CONSTRUCTION Illustrated - Ching Chapter 02 The Building.pdf
2008 BUILDING CONSTRUCTION Illustrated - Ching Chapter 02 The Building.pdf
 
22CYT12-Unit-V-E Waste and its Management.ppt
22CYT12-Unit-V-E Waste and its Management.ppt22CYT12-Unit-V-E Waste and its Management.ppt
22CYT12-Unit-V-E Waste and its Management.ppt
 
Advanced control scheme of doubly fed induction generator for wind turbine us...
Advanced control scheme of doubly fed induction generator for wind turbine us...Advanced control scheme of doubly fed induction generator for wind turbine us...
Advanced control scheme of doubly fed induction generator for wind turbine us...
 
DEEP LEARNING FOR SMART GRID INTRUSION DETECTION: A HYBRID CNN-LSTM-BASED MODEL
DEEP LEARNING FOR SMART GRID INTRUSION DETECTION: A HYBRID CNN-LSTM-BASED MODELDEEP LEARNING FOR SMART GRID INTRUSION DETECTION: A HYBRID CNN-LSTM-BASED MODEL
DEEP LEARNING FOR SMART GRID INTRUSION DETECTION: A HYBRID CNN-LSTM-BASED MODEL
 
Harnessing WebAssembly for Real-time Stateless Streaming Pipelines
Harnessing WebAssembly for Real-time Stateless Streaming PipelinesHarnessing WebAssembly for Real-time Stateless Streaming Pipelines
Harnessing WebAssembly for Real-time Stateless Streaming Pipelines
 

Speech Emotion Recognition Using Machine Learning

  • 1. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 10 Issue: 01 | Jan 2023 www.irjet.net p-ISSN: 2395-0072 © 2023, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 997 Speech Emotion Recognition Using Machine Learning Preethi Jeevan1, Kolluri Sahaja2, Dasari Vani3, Alisha Begum4 1Professor,Dept.ofComputerScienceandEngineering,SNIST,Hyderabad-501301,India 2,3,4B.TECHScholars,Dept.ofComputerScienceandEngineeringHyderabad-501301,India ---------------------------------------------------------------------***---------------------------------------------------------------- Abstract: Speech is the ability to express thoughts and emotions through vocal soundsand gestures. Speechemotionrecognition (SER), is the act of attempting to recognize human emotion and affective states from speech. This is capitalizing on the fact that voice often reflects underlying emotion through tone and pitch..In this paper we analyze the emotions in recorded speech by analyzing the acoustic features of the audio data of recordings .From the Audio data Feature Selection and Feature Extraction is done. The Python implementation of the Librosa package was used in their extraction. Feature selection(FS) wasappliedinorder to seek for the most relevant feature subset. We analyzed and labeled with respect to emotions and gender for six emotions viz. neutral, happy, sad, angry, fear and disgust, the number of labels was slightly less for surprise. Keywords: MFCC, Chroma, Mel Spectrogram, Feature selection 1. INTRODUCTION Emotion plays a significant role in interpersonal human interactions. For interaction between human and machine use of speech signal is the fastest and most efficient method. Emotional displays convey considerable information about the mental state of an individual. For machine emotional detection is a very difficult task, on the other hand, it is natural for humans. So, knowledge related to emotion is used by an emotion recognition system in such a way that there is an improvement in communication between machine and human. 1.1 LITERATURE SURVERY A majority of natural language processing solutions, such as voice-activated systems etc. require speech as input. Standard procedure is to first convert this speech input to text usingAutomatic Speech Recognition (ASR) systems and then run classification or other learning operations on the ASR text output. ASR resolves variations in speech transcriptions being speaker independent. ASR systems produce outputs with high accuracybutenduplosinga significantamountof a speechfrom different users using probabilistic acoustic and language models which results in information that suggests emotion from speech. This gap has resulted in Speech-based Emotion Recognition(SER)systemsbecominganarea ofresearchinterestinthe last few years .Three key issues for successful SER system  Choice of a good emotional speech database.  Extract effective features.  Design reliable classifiers using machine learning algorithms. The emotional feature extraction is a main issue in the SER system. A typical SER system works on extracting features such as spectral features, pitch frequency features, formant features and energy related features from speech, following it with a classification task to predict various classes of emotion.Speechinformationrecognizedemotionsmaybespeakerindependent or speaker dependent. Bayesian Network model, Hidden Markov Model (HMM), Support Vector Machines (SVM), Gaussian Mixture Model (GMM) and Multi-Classifier Fusion are few of the techniques used in traditional classification tasks. In this work, we propose a robust technique of emotion classification usingspeechfeaturesandtranscriptions.The objectiveis to capture emotional characteristics using speech features, along with semantic informationfromtext,andusea deeplearning based emotion classifier to improve emotion detection accuracies. We present differentdeep network architecturestoclassify emotion using speech features and text. The main contributions of the current work are: 1.2 PROPOSED METHOD The speech emotion recognition system is performed as a Machine Learning (ML) model. The steps of operation are similarto any other ML project, with supplementary fine- tuning systems to make the model function adequately.
  • 2. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 10 Issue: 01 | Jan 2023 www.irjet.net p-ISSN: 2395-0072 © 2023, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 998 The fundamental action is data collection, which is of prime importance. The model being generated will acquire from the data contributed to it and all the conclusions and decisionsthata progressed model will produce is supervised data. The secondary action,calledasfeature engineering,isa combinationofvariousmachine learning assignments that are performedoverthegathered data.Thesesystemsapproachthevariousdata descriptionanddata quality problems. The third step is often explored the essence of an ML project where an algorithmic based prototype is generated. This model uses an ML algorithm to determine about the data and instruct itself to react to any new data it is exhibited to. The ultimate step is to estimate the functioning of the built model. Very frequently, developersreplicatethe steps of generating a model and estimating it to analyze the performance of various algorithms. Measuring outcomeshelptochoose the suitable ML algorithm most appropriate to the p Dataset: English Language Dataset, namely the Toronto Emotional Speech Set (TESS) was taken into consideration. This is a dataset which consists of 200 target words and were spoken by two women, one younger and the other older, and the phrases were recorded in order to portray the following seven emotions happy, sad,angry,disgust,fear,surpriseandneutral state.There are 2800 files in total, in this dataset. CREMA-D is a data set of 7,442 original clips from 91 actors. These clips were from 48 male and 43 female actors between the ages of 20 and 74 coming from a variety of races and ethnicities (AfricanAmerica,Asian,Caucasian,Hispanic,andUnspecified). Actors spoke from a selection of12 sentences. The sentences were presented using one of six different emotions (Anger, Disgust, Fear, Happy, Neutral, and Sad) and four different emotion levels (Low, Medium, High, and Unspecified). Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) dataset contains 24 professional actors(12female, 12 male), vocalizing two lexically-matched statements in a neutral North American accent. Speech emotions include calm, happy, sad, angry, fearful, surprise, and disgust expressions. Each expression is produced at two levels of emotional intensity (normal, strong), with an additional neutral expression. 2.SPEECH EMOTION RECOGNITION SYSTEM 2.1 Data Set & Data Visualization: In this Speech Emotion Recognition Project, Audio File is taken from the TESS Dataset, and that will be uploaded in .wav file format before the file upload process is validated, which relates to the file format and empty file input, and will be connect directly to python files where the output generated in the form of Emotional Labels. Data visualization provides information about the given audio data in redicament.
  • 3. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 10 Issue: 01 | Jan 2023 www.irjet.net p-ISSN: 2395-0072 © 2023, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 999 Graphic and pictorial form. Here, initial dataset is divided into its emotional labels, and then the whole data is pictured into spectrogram graph and wave plot diagram. Spectrogram maybea visual representationofthespectrumoffrequenciesofa sign because it varies with time. Wave-plot is employed to plot waveform of amplitude vs time where theprimaryaxisisamplitude and the second axis is time. 2.2 Features Selection: In this Project mainly comprise features like Zero Crossing rate, Chroma Shift, Root Mean Square Value, Mel Spectrogram and MFCC (Mel information retrieval in the music category; these extract the crux of given audio. The zero- crossing rate iswhena significant change from positive to 0 Frequency Cepstral Coefficient). These are few predominantly used audio features for emotional-related audio content, acoustic recognition, and to negative or from negative to 0 to positive. Mel Spectrogramsare spectrograms that visual sounds on the Mel scale as against the frequency domain. The Mel Scale must be a logarithmic transformation of a signal's frequency. 2.3 Feature extraction: The speech signal contains a large number of parameters that reflect theemotional characteristics.Oneofthestickingpointsin emotion recognition is what features should be used. In recent research, many common featuresareextracted,suchasenergy, pitch, formant, and some spectrum features such as linear prediction coefficients (LPC), mel-frequency cepstrum coefficients (MFCC), and modulation spectral features. In this work,wehaveselectedmodulationspectral featuresandMFCC,toextractthe emotional features.
  • 4. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 10 Issue: 01 | Jan 2023 www.irjet.net p-ISSN: 2395-0072 © 2023, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 1000 Mel-frequency cepstrum coefficient (MFCC) is the most used representation ofthespectral propertyofvoicesignals.Theseare the best for speech recognition as it takes humanperceptionsensitivitywithrespectto frequenciesintoconsideration.Foreach frame, the Fourier transform and the energy spectrum were estimated and mapped into the Mel-frequency scale. In our research, we extract the first 12 order of the MFCC coefficients where the speech signals aresampledat16KHz.For eachorder coefficients, we calculate the mean, variance, standard deviation, kurtosis, andskewness,andthisisfortheotherall the frames of an utterance. Each MFCC feature vector is 60-dimensional. 3. CONVOLUTIONAL NEURAL NETWORK Convolutional neural network (CNN) is used to train the speech data in the training set. Fig. 2 shows the accuracy of CNN model RNN model and MLP model training set and verification set. It can be seen from Figure 2 that with the increase of iteration times, the accuracy of training set and verification set tends to increase, especially the training set. Finally, after 200 iterations, the accuracy of the training set is over 97%, and that of the verificationsetisover80%.CNN model developedinthis paper has high accuracy than RNN and MLP in both training set and verification set.
  • 5. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 10 Issue: 01 | Jan 2023 www.irjet.net p-ISSN: 2395-0072 © 2023, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 1001 The convolution operation represents the hierarchical extraction of speech features, while the maximum pooling operation removes redundant information in the previous layer features, and the operation is simplified. An activation layer is set after each convolution layer, and the best activation function determinedby experiments.Twofullyconnected neurons aresetinthe output layer to divide the speech signals into two categories. In addition, considering the complexity of the network structure, random Dropout is set after each hidden layer to prevent the network from over-fitting in the training process. After adding Dropout, the neurons in each hidden layer will have a certain probability of not updating the weights in the training process, and the probability of each neuron not updating the weights is equal. Algorithm Step 1: The sample audio is provided as input. Step 2: The Spectrogram and Waveform is plotted from the audio file. Step 3: Using the LIBROSA, a python library we extract the MFCC (Mel Frequency Cepstral Coefficient) usually about 10–20. Step 4: Remixing the data, dividing it in train and test and there after constructing a CNN model anditsfollowinglayerstotrain the dataset. Step 5: Predicting the human voice emotion from that trained data (sample no.- predicted value - actual value) UML DIAGRAM:
  • 6. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 10 Issue: 01 | Jan 2023 www.irjet.net p-ISSN: 2395-0072 © 2023, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 1002 5. EXPERIMENTAL RESULTS AND ANALYSIS In this paper, the obtained models are tested in the test set following the experimental procedure. Table 1 shows the normalized confusion matrix for SER. The recognition accuracy ofexpression recognitionachievedbytheproposedmethodon the RAVDESS dataset is better than existing studies on SER. The average classification accuracy on RAVDESS is listed in Table Some discriminations of 'sad' for ' neural ' fail. Perhaps surprisingly, the frequency of confusion between'fear'and'happiness' is as high as the frequency of confusion between 'sadness' or 'disgust'- perhaps this is because fear is a true multi-faceted emotion. Based on this, we will compare the characteristics of confounding emotions more carefully to see if there are any differences and how to capture them. For example, translating spokenwordsintotextandtrainingthenetwork for multimodal prediction, and combining semantics for sentiment recognition. 6. CONCLUSION In recent years, SER technology as one of the key technologies in human-computer interaction systems, has received a lot of attention from researchers at home and abroad for its ability to accurately recognize emotions andthusimprovethequalityof human-computer interaction. In this paper, we propose a deep learning algorithm with fused featuresforSER.intermsofdata processing, we quadruple-processed the RAVDESS dataset with 5760 audio. For the network structure, we constructed two parallel convolutional neural networks (CNNs) to extractspatial featuresanda transformencodernetwork to extracttemporal features to classify emotions from one of the eight categories. The TESS dataset that we considered is fine-tuned. Since it has
  • 7. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 10 Issue: 01 | Jan 2023 www.irjet.net p-ISSN: 2395-0072 © 2023, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 1003 noiseless data, it was easy for us to classify and feature .Taking advantage of CNNs in spatial feature representation and sequence coding transformation, we obtained an accuracy of 80.46% on the holdout test set of the RAVDESSdataset.Basedon the analysis of the results, the recognition of emotions by converting speech into text combined with semantics is considered. The combined Spectrogram-MFCC model results in an overall emotion detection accuracy of 73.1%, an almost 4% improvement to the existing state-of-the-art methods. Better results are observed when speech features are used along with speech transcriptions. The combinedSpectrogram-Text model givesa classaccuracyof69.5%andanoverall accuracyof75.1% whereas the combined MFCC-Text model also gives a class accuracy of 69.5% but an overall accuracy of 76.1%, a 5.6% and an almost 7% improvement over current benchmarks respectively. The proposed models can be used for emotion-related applications such as conversational, social robots, etc. where identifying emotion and sentiment hidden in speech may play a role in the better conversation. REFERENCES 1. Surabhi V, Saurabh M. Speech emotion recognition: A review. International Research Journal of Engineering and Technology (IRJET). 2016;03:313-316 2. Nicholson, J., Takahashi, K. &Nakatsu, R. Emotion Recognition in Speech Using Neural Networks. NCA 9, 290– 296(2000). https://doi.org/10.1007/s005210070006. 3. ‘Han, Kun / Yu, Dong / Tashev, Ivan (2014): “Speech emotion recognition using deep neural network ” 4. A. Rajasekhar, M. K. Hota, ―A Study of Speech, Speaker and Emotion Recognition using Mel Frequency Cepstrum Coefficients and Support Vector Machines‖, International Conference on Communication and Signal Processing, pp. 0114-0118, 2018. 5. Zheng, W. L., Zhu, J., Peng, Y.: EEG-based emotion classification using deep belief networks. In: IEEE International Conference on Multimedia & Expo, pp. 1-6 (2014). 6. Parthasarathy S, Tashev I.Convolutional neural network techniques for speech emotion recognition. In: 2018 16th International Workshop on Acoustic Signal Enhancement (IWAENC). IEEE 2018. pp. 121-125. 7. Matilda S. Emotion recognition: A survey. International Journal of Advanced Computer Research. 2015;3(1):14-19 8. Lim W, Jang D, Lee T. Speech emotion recognition using convolutional and recurrent neural networks. Asia- Pacific. 2017:1-4 9. G. Liu, W. He, B. Jin, ―Feature fusion of speech emotion recognition based on deep Learning‖, 2018 International Conference on Network Infrastructure and Digital Content (IC-NIDC), pp. 193-197, 2018. 10. Koolagudi SG, Rao KS. Emotion recognition from speech: A review. International Journal of Speech Technology 2012.