Emotion Recognition based on Speech and EEG Using Machine
Learning Techniques
Hanzala Javed: 2022-MSCS-17 Muhammad Sarfraz: 2022-MSCS-48
Abstract:
This research investigates the synergy of speech and EEG data for emotion recognition using machine learning, with a
focus on Artificial Neural Networks (ANNs). The study achieves a notable 97 percent accuracy in discerning emotions,
employing a diverse dataset encompassing various emotional states. The integrated analysis of speech and EEG data
enhances the model's robustness, capturing both physiological and vocal cues. The ANN architecture is meticulously
designed for effective feature extraction and representation learning, demonstrating its efficacy in discerning subtle
emotional nuances. The high accuracy attained underscores the potential of this approach in human-computer interaction,
affective computing, and mental health monitoring. This research contributes to the evolving field of emotion recognition,
offering a novel multimodal approach with practical implications for creating more intuitive and responsive systems in
various applications.
Key Words:
Emotion Recognition
Speech Analysis
EEG Signals
Machine Learning
Artificial Neural Networks (ANNs)
Introduction:
Emotion recognition has emerged as a pivotal area of research within the broader scope of artificial intelligence,
contributing significantly to human-computer interaction and affective computing. Understanding and accurately
interpreting human emotions through various modalities, such as speech and electroencephalogram (EEG) signals, hold
immense potential for applications ranging from healthcare to human-machine interfaces. This introduction seeks to
provide a comprehensive overview of the literature landscape in emotion recognition, drawing upon key studies that have
explored diverse modalities and methodologies.
The exploration of emotion recognition using audio features has been a prominent focus in recent research [1]. Aouani et
al. delved into the intricacies of leveraging Mel Frequency Cepstral Coefficients (MFCC), Zero Crossing Rate (ZCR),
Harmonic to Noise Rate (HNR), and Teager Energy Operator (TEO) for identifying emotions [1]. The utilization of an
Auto-Encoder for feature selection and a Support Vector Machine (SVM) as a classifier underscored the
effectiveness of this two-step approach. The study conducted experiments on the Ryerson Multimedia Laboratory (RML)
dataset, highlighting the potential for advancements in the field [1].
A review by Maithri et al. spanning the years 2016 to 2021 meticulously analyzed state-of-the-art models in automated
emotion recognition, emphasizing the dominance of deep learning techniques and their impressive performance metrics,
particularly in controlled environments [4]. The review identified a critical gap in the literature regarding models tailored
for dynamic, uncontrolled settings, emphasizing the need for future research to address this limitation [4].
Speech emotion recognition was addressed by Issa et al., who introduced a novel architecture utilizing a one-dimensional
Convolutional Neural Network (CNN) and various audio features [6]. The proposed model outperformed existing
frameworks and set a new state-of-the-art, emphasizing the complexity of speech emotion recognition and suggesting
avenues for further research [6].
Building upon the rich foundation laid by these studies, the present research aims to contribute to the field of emotion
recognition based on speech and EEG signals. Leveraging machine learning techniques, specifically an Artificial Neural
Network (ANN) model, our study achieved a remarkable 97 percent accuracy. This paper aims to elucidate the
methodology, experimental setup, and findings, further enhancing our understanding of emotion recognition and its
potential applications.
Literature Survey:
[1] Aouani et al. explored emotion recognition using audio features, specifically a 39-coefficient set of
Mel Frequency Cepstral Coefficients (MFCC) together with Zero Crossing Rate (ZCR), Harmonic to Noise
Rate (HNR), and Teager Energy Operator (TEO). The authors propose a two-step approach, first utilizing
an auto-encoder to select relevant parameters from the initially extracted features and then employing
a Support Vector Machine (SVM) as the classifier. The experiments are conducted on the
Ryerson Multimedia Laboratory (RML) dataset. The paper concludes by presenting the performance of
the proposed systems, emphasizing the fusion of HNR with widely recognized emotion features. The use
of auto-encoder dimension reduction is highlighted for improving identification rates. The authors
suggest future research directions, including the exploration of different feature types, application on
larger datasets, alternative methods for feature dimension reduction, and potential incorporation of
audiovisual data to enhance emotion recognition rates. The study positions its findings as effective
compared to other emotion recognition systems, underscoring the potential for further advancements
in the field.
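To make this feature pipeline concrete, the sketch below extracts 39 MFCC-based coefficients (13 MFCCs plus deltas and delta-deltas), ZCR, and TEO from one speech clip using librosa. The file path, the 16 kHz sampling rate, and the frame-level averaging are our assumptions for illustration, and HNR (which librosa does not provide directly) is omitted; this is not Aouani et al.'s exact pipeline.

```python
# Minimal sketch, assuming a mono speech clip at a hypothetical path.
import numpy as np
import librosa

def extract_features(path, n_mfcc=13):
    y, sr = librosa.load(path, sr=16000)  # 16 kHz resampling is an assumption
    # 13 MFCCs plus deltas and delta-deltas give 39 coefficients, as in [1]
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    mfcc39 = np.vstack([mfcc,
                        librosa.feature.delta(mfcc),
                        librosa.feature.delta(mfcc, order=2)])
    zcr = librosa.feature.zero_crossing_rate(y)
    # Teager Energy Operator: psi[x(n)] = x(n)^2 - x(n-1) * x(n+1)
    teo = y[1:-1] ** 2 - y[:-2] * y[2:]
    # Average frame-level features into one fixed-length vector per clip
    return np.concatenate([mfcc39.mean(axis=1), zcr.mean(axis=1), [teo.mean()]])
```

A feature matrix assembled this way could then pass through an auto-encoder for dimension reduction before classification with an SVM (e.g., sklearn.svm.SVC).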
[2] Wang et al. address the critical aspect of Automatic Emotion Recognition (AER) for enhancing
Human–Machine Interactions (HMI) by introducing a novel multi-modal emotion database, MED4,
comprising EEG, photoplethysmography, speech, and facial images. Conducted in two environmental
conditions, a research lab and an anechoic chamber, the study employs four baseline algorithms to
assess AER methods and presents two fusion strategies at feature and decision levels. Results indicate
that EEG signals outperform speech signals in emotion recognition, and their fusion significantly
improves accuracy. The paper emphasizes the robustness of AER in noisy environments and the
database's availability for global research collaboration. The conclusion discusses the unique
contributions of MED4, including its multi-modality and multi-environmental design, the effectiveness of
EEG signals, the impact of environmental noise, and the potential for future research directions,
highlighting the paper's significance in advancing the field of AER. Moreover, the paper evaluates the
impact of variable-length EEG on AER performance, finding it to outperform other methods, particularly
in recognizing happy emotions. The study also explores single-modality emotion recognition based on
speech and EEG signals, revealing that EEG signals exhibit high accuracy, especially in identifying happy
emotions, while speech signals are more effective in recognizing neutral and angry emotions. The
analysis of environmental noise indicates a more stable performance using EEG signals across different
environments compared to speech signals, emphasizing the reliability of EEG in suboptimal acoustic
conditions.
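The two fusion strategies evaluated in [2] can be sketched as follows; the classifier choice, the array shapes, and the 0.6 EEG weight are illustrative assumptions rather than MED4 specifics.

```python
# Hedged sketch of feature-level vs. decision-level fusion of EEG and speech.
import numpy as np
from sklearn.linear_model import LogisticRegression

def feature_level_fusion(X_eeg, X_speech, y):
    # Concatenate per-sample modality features, then train a single classifier
    X = np.hstack([X_eeg, X_speech])  # shape: (n_samples, d_eeg + d_speech)
    return LogisticRegression(max_iter=1000).fit(X, y)

def decision_level_fusion(p_eeg, p_speech, w_eeg=0.6):
    # Weighted average of class-probability matrices from two unimodal models;
    # weighting EEG more reflects [2]'s finding that EEG is the stronger cue
    return w_eeg * p_eeg + (1.0 - w_eeg) * p_speech

# Fused prediction: decision_level_fusion(clf_e.predict_proba(Xe),
#                                         clf_s.predict_proba(Xs)).argmax(axis=1)
```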
[3] Jafari et al. investigate the role of Deep Learning (DL) techniques in emotion recognition from
Electroencephalogram (EEG) signals, acknowledging emotions' crucial influence on human decision-
making and mental states. It discusses the challenges associated with EEG-based emotion recognition,
such as signal variability and the lack of a universal processing standard. The study emphasizes the
advantages of EEG signals in terms of temporal resolution and ease of recording. The conclusion reviews
recent research efforts employing DL models for emotion recognition, especially focusing on diverse
emotions and associated psychological conditions. The paper presents a comprehensive literature
review, covering the period from 2016 to 2023, discussing the challenges faced, summarizing studies on
DL techniques in EEG-based emotion recognition, and proposing future research directions. The
comprehensive review positions the paper as a valuable resource for understanding the current state,
challenges, and potential advancements in the field of emotion recognition using EEG signals and DL
techniques. Furthermore, the paper provides a systematic review of the literature, categorizing articles
based on DL techniques, EEG signal processing steps, and challenges encountered in emotion
recognition. The thorough exploration of challenges, DL methods, and potential future directions
enhances the paper's contribution to the field, offering insights for researchers and practitioners aiming
to develop more robust and effective systems for emotion recognition from EEG signals.
[4] Maithri et al. provide a review that delves into the realm of automated emotion recognition (ER)
methodologies spanning the years 2016 to 2021, with a specific emphasis on electroencephalogram
(EEG), facial, and speech signals. A meticulous analysis of state-of-the-art models reveals a conspicuous
upswing in the adoption of deep learning techniques for ER. Notably, these approaches have showcased
impressive performance metrics, particularly within controlled environments, signifying a discernible
trend in the landscape. The comprehensive summary meticulously categorizes the diverse
methodologies employed, the modalities considered (EEG, facial, and speech signals), and the
corresponding performance metrics, underscoring the dominance of deep learning in achieving
heightened accuracy and efficiency. Despite the notable successes in controlled environments, the
review accentuates a critical lacuna in the literature: the scarcity of models tailored for dynamic,
uncontrolled settings. These scenarios, marked by subject movements and sudden shifts between
expressions, present a substantial challenge for existing automated ER systems. The review critically
underscores the imperative for future research to address this gap. Developing and refining automated
ER systems capable of operating effectively in real-time, unpredictable scenarios becomes crucial for
their broader practical deployment. Such advancements have far-reaching implications across diverse
domains, including healthcare, e-learning, and surveillance, where the practical application of ER
technologies often involves uncontrolled and dynamic settings. Bridging this gap not only enhances the
robustness of automated ER systems but also expands their real-world applicability, ensuring their
efficacy in capturing the nuances of human emotions in various contexts.
[5] Yu and Wang delve into the crucial realm of emotion recognition, emphasizing its pivotal role in
artificial intelligence and human-computer interaction. Focusing on electroencephalogram (EEG) signals,
directly generated by the central nervous system and intimately linked with human emotions, the paper
reviews recent advancements in emotion recognition methodologies. It covers various aspects, including
emotion induction, EEG preprocessing, feature extraction, and emotion classification. The paper
critically compares the strengths and weaknesses of these methods while highlighting the existing
challenges in current research methodologies. The conclusion underscores the foundational importance
of emotion recognition in human-computer emotion interaction and its broad application value in
enhancing various aspects of human life. Notably, with the continuous progress in brain-computer
interface technology and the development of artificial intelligence, emotion recognition based on EEG
signals emerges as a promising avenue, garnering extensive attention. The paper emphasizes the impact
of EEG signal acquisition and preprocessing on classification accuracy and notes the successful
integration of deep learning techniques, particularly neural networks, in advancing emotion recognition
models within the domain of brain-computer interfaces. The research direction highlighted in the
conclusion underscores the evolving landscape of emotion classification through the integration of deep
learning and EEG signals.
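As one small example of the EEG preprocessing step discussed in this review, the sketch below band-pass filters a raw channel into a broad emotion-relevant band; the 128 Hz sampling rate and 4-45 Hz band edges are illustrative assumptions, not settings prescribed by [5].

```python
# Zero-phase Butterworth band-pass for one EEG channel (illustrative settings).
from scipy.signal import butter, filtfilt

def bandpass_eeg(signal, fs=128.0, low=4.0, high=45.0, order=4):
    nyq = fs / 2.0
    b, a = butter(order, [low / nyq, high / nyq], btype="band")
    return filtfilt(b, a, signal)  # forward-backward filtering avoids phase distortion
```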
[6] Issa et al. address the challenging task of speech emotion recognition through the introduction
of a novel architecture utilizing a one-dimensional Convolutional Neural Network (CNN). The proposed
model extracts diverse audio features, including mel-frequency cepstral coefficients, chromagram, mel-
scale spectrogram, Tonnetz representation, and spectral contrast features, from raw sound files. The
datasets employed for evaluation encompass the Ryerson Audio-Visual Database of Emotional Speech
and Song (RAVDESS), Berlin (EMO-DB), and Interactive Emotional Dyadic Motion Capture (IEMOCAP).
The study employs an incremental method for refining the initial model, resulting in enhanced
classification accuracy. Unlike some prior approaches, the proposed framework operates directly with
raw sound data, avoiding the need for conversion to visual representations. In conclusion, the paper
emphasizes the complexity of speech emotion recognition, addressing the key challenges of feature
extraction and classification. The proposed one-dimensional deep CNN, combined with a variety of
audio features, outperforms existing frameworks for RAVDESS and IEMOCAP, setting a new state-of-the-
art. For EMO-DB, the paper presents an incremental set of models to enhance performance, achieving
competitive results compared to prior works in terms of generality, simplicity, and applicability. The
authors acknowledge the potential for further research, suggesting exploration of alternative features or
the integration of auxiliary neural networks for high-level feature extraction. Additionally,
comprehensive data augmentation techniques and the incorporation of additional layers of Long Short-
Term Memory (LSTM) are identified as potential avenues for improving classification accuracy. The
paper also highlights the significance of the order of stacking sound features and proposes it as a subject
for future investigation, reflecting a commitment to ongoing refinement and optimization in the field of
speech emotion recognition.
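The feature stack and 1D CNN pattern described in [6] might look roughly like the sketch below; the layer sizes, the mean-pooling of frame-level features into one vector per clip, and the eight-class output are our simplifications for illustration, not the architecture reported by Issa et al.

```python
# Hedged sketch of the five-family feature stack and a small 1D CNN.
import numpy as np
import librosa
from tensorflow import keras

def stacked_features(path):
    # One fixed-length vector per clip from the feature families named in [6];
    # averaging over frames is our simplification of the original pipeline
    y, sr = librosa.load(path, sr=None)
    return np.concatenate([
        librosa.feature.mfcc(y=y, sr=sr, n_mfcc=40).mean(axis=1),             # 40
        librosa.feature.chroma_stft(y=y, sr=sr).mean(axis=1),                 # 12
        librosa.feature.melspectrogram(y=y, sr=sr, n_mels=128).mean(axis=1),  # 128
        librosa.feature.tonnetz(y=y, sr=sr).mean(axis=1),                     # 6
        librosa.feature.spectral_contrast(y=y, sr=sr).mean(axis=1),           # 7
    ])  # 193 dimensions in total

def build_1d_cnn(input_dim=193, n_classes=8):
    # Inputs must be reshaped to (n_samples, input_dim, 1) before training
    return keras.Sequential([
        keras.layers.Input(shape=(input_dim, 1)),
        keras.layers.Conv1D(64, 5, activation="relu"),
        keras.layers.MaxPooling1D(4),
        keras.layers.Conv1D(128, 5, activation="relu"),
        keras.layers.GlobalAveragePooling1D(),
        keras.layers.Dropout(0.3),
        keras.layers.Dense(n_classes, activation="softmax"),
    ])
```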
[7] Alhalaseh and Alasasfeh explore the development of an automated model for identifying emotions based
on EEG signals. Addressing the challenges of using brain signals for emotion recognition due to their
inherent instability, the study proposes a novel approach employing empirical mode
decomposition/intrinsic mode functions (EMD/IMF) and variational mode decomposition (VMD) for
signal processing. Distinct from previous works, the paper focuses on the application of EMD/IMFs and
VMD, which are not commonly utilized in emotion recognition literature. The feature extraction stage
utilizes entropy and Higuchi's fractal dimension (HFD), and in the classification stage four methods
are employed: naïve Bayes, k-nearest neighbor (k-NN), convolutional neural network (CNN), and decision
tree (DT). The study evaluates the proposed model using the DEAP database and various
performance metrics, achieving a remarkable 95.20% accuracy with the CNN-based method.
The conclusion highlights the significance of advancements in sensor and signal recording technologies,
enabling the utilization of signals extracted from human organs for condition identification. Categorizing
emotions from EEG signals remains a complex application, aiming to discern a person's emotional
state from inherently unstable signals. The proposed model, encompassing signal processing, feature
extraction, and classification stages, utilizes innovative techniques like EMD/IMFs and VMD. The study
underscores the superiority of the CNN classifier in terms of accuracy and runtime, outperforming other
classifiers and demonstrating favorable results in comparison to existing literature. The comprehensive
evaluation considers metrics such as accuracy, precision, recall, and F1-measure, consistently
showcasing CNN's superior performance in EEG signal categorization. The paper emphasizes the
potential of this research for advancing emotion recognition systems and contributes unique insights
through its novel signal processing approach and classifier performance analysis.
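Higuchi's fractal dimension, one of the features used in [7], admits a compact implementation; the sketch below follows Higuchi's standard curve-length formulation, with kmax chosen as an illustrative default rather than a value taken from the paper.

```python
# Higuchi fractal dimension of a 1D signal (e.g., one EEG channel segment).
import numpy as np

def higuchi_fd(x, kmax=10):
    x = np.asarray(x, dtype=float)
    n = len(x)
    lk = []
    for k in range(1, kmax + 1):
        lengths = []
        for m in range(k):
            idx = np.arange(m, n, k)          # indices of the subsampled curve
            if len(idx) < 2:
                continue
            curve_len = np.abs(np.diff(x[idx])).sum()
            # Higuchi (1988) normalization for the curve length L_m(k)
            lengths.append(curve_len * (n - 1) / ((len(idx) - 1) * k) / k)
        lk.append(np.mean(lengths))
    ks = np.arange(1, kmax + 1)
    # HFD is the slope of log L(k) versus log(1/k)
    slope, _ = np.polyfit(np.log(1.0 / ks), np.log(lk), 1)
    return slope
```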
[8] Pan et al. introduce a novel approach, Deep-Emotion, for Multimodal Emotion
Recognition (MER) using facial expressions, speech, and Electroencephalogram (EEG). The authors
identify existing challenges in emotion recognition, including effective utilization of different modalities
and real-time detection in the context of increasing computing power demands. The proposed Deep-
Emotion framework comprises three branches: facial, speech, and EEG, each utilizing specialized neural
networks for feature extraction. The facial branch employs an improved GhostNet neural network to
address overfitting issues and enhance classification accuracy. The speech branch introduces a
lightweight fully convolutional neural network (LFCNN) for efficient speech emotion feature extraction.
For the EEG branch, the authors propose a tree-like Long Short-Term Memory (tLSTM) model capable of
fusing multi-stage features. Decision-level fusion is then adopted to integrate recognition results from
the three branches, ensuring comprehensive and accurate performance. The authors conduct extensive
experiments on CK+, EMO-DB, and MAHNOB-HCI datasets, demonstrating the superior performance of
the Deep-Emotion method. This paper represents the first attempt to combine facial expressions,
speech, and EEG for MER. The improved GhostNet for facial expressions, LFCNN for speech signals, and
tLSTM for EEG signals contribute to enhanced accuracy and robustness. The study also introduces an
optimal weight distribution search algorithm for decision-level fusion, further improving reliability. The
experimental results validate the feasibility of the proposed method across multiple public datasets,
suggesting the potential for future enhancements, particularly in refining dynamic weight allocation for
improved overall algorithm robustness.
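A decision-level fusion weight search in the spirit of [8] can be sketched as a grid search over convex weights for the three branch outputs on a validation set; the grid step and the accuracy objective are our assumptions, not the authors' exact algorithm.

```python
# Hedged sketch: grid search for convex fusion weights over three branches.
import itertools
import numpy as np

def search_fusion_weights(p_face, p_speech, p_eeg, y_val, step=0.05):
    best_w, best_acc = None, -1.0
    grid = np.arange(0.0, 1.0 + 1e-9, step)
    for w1, w2 in itertools.product(grid, repeat=2):
        if w1 + w2 > 1.0 + 1e-9:
            continue                     # keep the weights on the simplex
        w3 = 1.0 - w1 - w2
        fused = w1 * p_face + w2 * p_speech + w3 * p_eeg
        acc = float((fused.argmax(axis=1) == y_val).mean())
        if acc > best_acc:
            best_w, best_acc = (w1, w2, w3), acc
    return best_w, best_acc
```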
[9] Houssein et al. present a comprehensive literature review on emotion recognition, with a
specific focus on methods utilizing multi-channel Electroencephalogram (EEG) signals in Brain-Computer
Interfaces (BCIs). Affective computing, a subset of artificial intelligence, is highlighted for its role in
detecting, interpreting, and mimicking human emotions. The authors emphasize the limitations of
traditional modalities such as facial expressions, speech, and behavior, which may be influenced by
conscious or unconscious social masking, and advocate for the efficacy of physiological signals,
particularly EEG, in providing more accurate and objective emotion recognition. The review covers the
period from 2015 to 2021 and encompasses over 195 publications. The authors explore EEG-based BCI
emotion recognition approaches, detailing the entire process, including data collection, preprocessing,
feature extraction, feature selection, classification, and performance evaluation. Emphasis is placed on
the real-time responsiveness and authenticity of EEG signals, which react to emotional changes, making
them a reliable source for emotion recognition. The paper extensively surveys EEG feature extraction
techniques, feature selection/dimensionality reduction methods, and various machine and deep
learning classification techniques, including k-nearest neighbor, support vector machine, decision tree,
artificial neural network, convolutional and recurrent neural networks with long short-term memory.
The review delves into EEG rhythms associated with emotions and the intricate relationship between
distinct brain areas and emotional states. The authors discuss challenges and future research directions
in EEG-based emotion recognition, anticipating resolution of current obstacles and envisioning diverse
applications. The paper aims to provide valuable insights for researchers, especially those new to the
field, offering a snapshot of the current state of research in emotion-oriented EEG feature recognition
and categorization. The study acknowledges funding from the Science, Technology & Innovation Funding
Authority (STDF) in cooperation with the Egyptian Knowledge Bank (EKB).
Research Methodology:
The research methodology involved leveraging a dataset sourced from Kaggle, comprising EEG signals
and speech data, for the purpose of emotion recognition. The EEG signals were preprocessed,
addressing artifacts and segmenting into relevant time intervals, while the speech data underwent noise
reduction, feature extraction, and normalization. Subsequently, both sets of preprocessed features were
integrated to form a comprehensive dataset. An Artificial Neural Network (ANN) was designed to
process this combined data, with careful consideration given to input layer configuration to
accommodate the multidimensional nature of the features. The model was trained and validated using
appropriate datasets, and performance evaluation was conducted using metrics such as accuracy,
precision, and recall. Results were analyzed and discussed in the context of research objectives,
emphasizing potential applications and future directions. Figure 1 depicts the flow diagram of the
research methodology, illustrating the sequential steps from data collection to model evaluation. The
diagram serves as a visual representation of the systematic approach employed in the study.
Figure 1
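To make the model description concrete, a minimal sketch of such an ANN on the fused feature vector is given below; the layer widths, dropout rate, and optimizer settings are illustrative assumptions, as the exact configuration is not reproduced here.

```python
# Minimal sketch of a feed-forward ANN on concatenated EEG + speech features.
from tensorflow import keras

def build_ann(d_eeg, d_speech, n_classes):
    # Input is the fused (concatenated) per-sample feature vector
    inputs = keras.layers.Input(shape=(d_eeg + d_speech,))
    x = keras.layers.Dense(256, activation="relu")(inputs)
    x = keras.layers.Dropout(0.3)(x)                       # assumed rate
    x = keras.layers.Dense(128, activation="relu")(x)
    outputs = keras.layers.Dense(n_classes, activation="softmax")(x)
    model = keras.Model(inputs, outputs)
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# Typical training call on the fused dataset (illustrative):
# model.fit(np.hstack([X_eeg, X_speech]), y, validation_split=0.2, epochs=50)
```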
Results and Discussions:
The integration of Speech and EEG data for Emotion Recognition, utilizing an Artificial Neural Network
(ANN) model, demonstrated significant success, achieving a notable 97 percent accuracy on the Kaggle
dataset. The results showcase the model's proficiency in accurately classifying diverse emotional states,
highlighting its robustness in capturing both vocal and physiological cues. Precision, recall, and F1-score
metrics further validate the model's effectiveness across various emotions. The confusion matrix,
illustrated in Figure 2, provides a visual representation of the model's performance on individual
emotion classes, emphasizing its capability to discern subtle nuances. Specifically sourced from Kaggle,
the dataset's diversity contributes to the generalizability of the model across a wide range of emotional
expressions. These findings underscore the potential of the proposed multimodal approach in real-world
applications such as human-computer interaction and affective computing. The achieved high accuracy
substantiates the efficacy of leveraging Kaggle's EEG signals and speech data, showcasing the success of
the applied machine learning techniques in emotion recognition.
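For reference, the reported metrics can be computed with scikit-learn as in the sketch below, where y_true and y_pred stand in for the held-out labels and the model's predictions.

```python
# Sketch of the evaluation: accuracy, per-class precision/recall/F1,
# and the confusion matrix.
from sklearn.metrics import (accuracy_score, classification_report,
                             confusion_matrix)

def evaluate(y_true, y_pred, class_names):
    print("accuracy:", accuracy_score(y_true, y_pred))
    print(classification_report(y_true, y_pred, target_names=class_names))
    print(confusion_matrix(y_true, y_pred))
```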
References:
[1] Aouani, H., & Ayed, Y. B. (2020). Speech emotion recognition with deep learning. Procedia Computer
Science, 176, 251-260.
[2] Wang, Q., Wang, M., Yang, Y., & Zhang, X. (2022). Multi-modal emotion recognition using EEG and
speech signals. Computers in Biology and Medicine, 149, 105907.
[3] Jafari, M., Shoeibi, A., Khodatars, M., Bagherzadeh, S., Shalbaf, A., García, D. L., ... & Acharya, U. R.
(2023). Emotion recognition in EEG signals using deep learning methods: A review. Computers in Biology
and Medicine, 107450.
[4] Maithri, M., Raghavendra, U., Gudigar, A., Samanth, J., Barua, P. D., Murugappan, M., ... & Acharya,
U. R. (2022). Automated emotion recognition: Current trends and future perspectives. Computer Methods
and Programs in Biomedicine, 215, 106646.
[5] Yu, C., & Wang, M. (2022). Survey of emotion recognition methods using EEG information. Cognitive
Robotics, 2, 132-146.
[6] Issa, D., Demirci, M. F., & Yazici, A. (2020). Speech emotion recognition with deep convolutional
neural networks. Biomedical Signal Processing and Control, 59, 101894.
[7] Alhalaseh, R., & Alasasfeh, S. (2020). Machine-learning-based emotion recognition system using EEG
signals. Computers, 9(4), 95.
[8] Pan, J., Fang, W., Zhang, Z., Chen, B., Zhang, Z., & Wang, S. (2023). Multimodal emotion recognition
based on facial expressions, speech, and EEG. IEEE Open Journal of Engineering in Medicine and
Biology.
[9] Houssein, E. H., Hammad, A., & Ali, A. A. (2022). Human emotion recognition from EEG-based brain–
computer interface using machine learning: a comprehensive review. Neural Computing and
Applications, 34(15), 12527-12557.

More Related Content

Similar to Emotion Recognition based on Speech and EEG Using Machine Learning Techniques.docx

ASERS-LSTM: Arabic Speech Emotion Recognition System Based on LSTM Model
ASERS-LSTM: Arabic Speech Emotion Recognition System Based on LSTM ModelASERS-LSTM: Arabic Speech Emotion Recognition System Based on LSTM Model
ASERS-LSTM: Arabic Speech Emotion Recognition System Based on LSTM Modelsipij
 
Signal & Image Processing : An International Journal
Signal & Image Processing : An International Journal Signal & Image Processing : An International Journal
Signal & Image Processing : An International Journal sipij
 
ANALYSING SPEECH EMOTION USING NEURAL NETWORK ALGORITHM
ANALYSING SPEECH EMOTION USING NEURAL NETWORK ALGORITHMANALYSING SPEECH EMOTION USING NEURAL NETWORK ALGORITHM
ANALYSING SPEECH EMOTION USING NEURAL NETWORK ALGORITHMIRJET Journal
 
An Approach of Human Emotional States Classification and Modeling from EEG
An Approach of Human Emotional States Classification and Modeling from EEGAn Approach of Human Emotional States Classification and Modeling from EEG
An Approach of Human Emotional States Classification and Modeling from EEGCSCJournals
 
Recognition of emotional states using EEG signals based on time-frequency ana...
Recognition of emotional states using EEG signals based on time-frequency ana...Recognition of emotional states using EEG signals based on time-frequency ana...
Recognition of emotional states using EEG signals based on time-frequency ana...IJECEIAES
 
Emotion Recognition through Speech Analysis using various Deep Learning Algor...
Emotion Recognition through Speech Analysis using various Deep Learning Algor...Emotion Recognition through Speech Analysis using various Deep Learning Algor...
Emotion Recognition through Speech Analysis using various Deep Learning Algor...IRJET Journal
 
IRJET- Facial Expression Recognition System using Neural Network based on...
IRJET-  	  Facial Expression Recognition System using Neural Network based on...IRJET-  	  Facial Expression Recognition System using Neural Network based on...
IRJET- Facial Expression Recognition System using Neural Network based on...IRJET Journal
 
Convolutional neural network with binary moth flame optimization for emotion ...
Convolutional neural network with binary moth flame optimization for emotion ...Convolutional neural network with binary moth flame optimization for emotion ...
Convolutional neural network with binary moth flame optimization for emotion ...IAESIJAI
 
Novel Methodologies for Classifying Gender and Emotions Using Machine Learnin...
Novel Methodologies for Classifying Gender and Emotions Using Machine Learnin...Novel Methodologies for Classifying Gender and Emotions Using Machine Learnin...
Novel Methodologies for Classifying Gender and Emotions Using Machine Learnin...BRIGHT WORLD INNOVATIONS
 
Facial emotion recognition using enhanced multi-verse optimizer method
Facial emotion recognition using enhanced multi-verse optimizer methodFacial emotion recognition using enhanced multi-verse optimizer method
Facial emotion recognition using enhanced multi-verse optimizer methodIJECEIAES
 
Enhancing speaker verification accuracy with deep ensemble learning and inclu...
Enhancing speaker verification accuracy with deep ensemble learning and inclu...Enhancing speaker verification accuracy with deep ensemble learning and inclu...
Enhancing speaker verification accuracy with deep ensemble learning and inclu...IJECEIAES
 
Speech emotion recognition using 2D-convolutional neural network
Speech emotion recognition using 2D-convolutional neural  networkSpeech emotion recognition using 2D-convolutional neural  network
Speech emotion recognition using 2D-convolutional neural networkIJECEIAES
 
A Review Paper on Speech Based Emotion Detection Using Deep Learning
A Review Paper on Speech Based Emotion Detection Using Deep LearningA Review Paper on Speech Based Emotion Detection Using Deep Learning
A Review Paper on Speech Based Emotion Detection Using Deep LearningIRJET Journal
 
IRJET- Prediction of Human Facial Expression using Deep Learning
IRJET- Prediction of Human Facial Expression using Deep LearningIRJET- Prediction of Human Facial Expression using Deep Learning
IRJET- Prediction of Human Facial Expression using Deep LearningIRJET Journal
 
Survey of Various Approaches of Emotion Detection Via Multimodal Approach
Survey of Various Approaches of Emotion Detection Via Multimodal ApproachSurvey of Various Approaches of Emotion Detection Via Multimodal Approach
Survey of Various Approaches of Emotion Detection Via Multimodal ApproachIRJET Journal
 
Sentiment analysis by deep learning approaches
Sentiment analysis by deep learning approachesSentiment analysis by deep learning approaches
Sentiment analysis by deep learning approachesTELKOMNIKA JOURNAL
 
Contextual Emotion Recognition Using Transformer-Based Models
Contextual Emotion Recognition Using Transformer-Based ModelsContextual Emotion Recognition Using Transformer-Based Models
Contextual Emotion Recognition Using Transformer-Based ModelsIRJET Journal
 
IRJET - Audio Emotion Analysis
IRJET - Audio Emotion AnalysisIRJET - Audio Emotion Analysis
IRJET - Audio Emotion AnalysisIRJET Journal
 
Detection of speech under stress using spectral analysis
Detection of speech under stress using spectral analysisDetection of speech under stress using spectral analysis
Detection of speech under stress using spectral analysiseSAT Journals
 

Similar to Emotion Recognition based on Speech and EEG Using Machine Learning Techniques.docx (20)

ASERS-LSTM: Arabic Speech Emotion Recognition System Based on LSTM Model
ASERS-LSTM: Arabic Speech Emotion Recognition System Based on LSTM ModelASERS-LSTM: Arabic Speech Emotion Recognition System Based on LSTM Model
ASERS-LSTM: Arabic Speech Emotion Recognition System Based on LSTM Model
 
Signal & Image Processing : An International Journal
Signal & Image Processing : An International Journal Signal & Image Processing : An International Journal
Signal & Image Processing : An International Journal
 
ANALYSING SPEECH EMOTION USING NEURAL NETWORK ALGORITHM
ANALYSING SPEECH EMOTION USING NEURAL NETWORK ALGORITHMANALYSING SPEECH EMOTION USING NEURAL NETWORK ALGORITHM
ANALYSING SPEECH EMOTION USING NEURAL NETWORK ALGORITHM
 
An Approach of Human Emotional States Classification and Modeling from EEG
An Approach of Human Emotional States Classification and Modeling from EEGAn Approach of Human Emotional States Classification and Modeling from EEG
An Approach of Human Emotional States Classification and Modeling from EEG
 
Recognition of emotional states using EEG signals based on time-frequency ana...
Recognition of emotional states using EEG signals based on time-frequency ana...Recognition of emotional states using EEG signals based on time-frequency ana...
Recognition of emotional states using EEG signals based on time-frequency ana...
 
Emotion Recognition through Speech Analysis using various Deep Learning Algor...
Emotion Recognition through Speech Analysis using various Deep Learning Algor...Emotion Recognition through Speech Analysis using various Deep Learning Algor...
Emotion Recognition through Speech Analysis using various Deep Learning Algor...
 
IRJET- Facial Expression Recognition System using Neural Network based on...
IRJET-  	  Facial Expression Recognition System using Neural Network based on...IRJET-  	  Facial Expression Recognition System using Neural Network based on...
IRJET- Facial Expression Recognition System using Neural Network based on...
 
Convolutional neural network with binary moth flame optimization for emotion ...
Convolutional neural network with binary moth flame optimization for emotion ...Convolutional neural network with binary moth flame optimization for emotion ...
Convolutional neural network with binary moth flame optimization for emotion ...
 
Novel Methodologies for Classifying Gender and Emotions Using Machine Learnin...
Novel Methodologies for Classifying Gender and Emotions Using Machine Learnin...Novel Methodologies for Classifying Gender and Emotions Using Machine Learnin...
Novel Methodologies for Classifying Gender and Emotions Using Machine Learnin...
 
Facial emotion recognition using enhanced multi-verse optimizer method
Facial emotion recognition using enhanced multi-verse optimizer methodFacial emotion recognition using enhanced multi-verse optimizer method
Facial emotion recognition using enhanced multi-verse optimizer method
 
Enhancing speaker verification accuracy with deep ensemble learning and inclu...
Enhancing speaker verification accuracy with deep ensemble learning and inclu...Enhancing speaker verification accuracy with deep ensemble learning and inclu...
Enhancing speaker verification accuracy with deep ensemble learning and inclu...
 
Speech emotion recognition using 2D-convolutional neural network
Speech emotion recognition using 2D-convolutional neural  networkSpeech emotion recognition using 2D-convolutional neural  network
Speech emotion recognition using 2D-convolutional neural network
 
Affective analysis in machine learning using AMIGOS with Gaussian expectatio...
Affective analysis in machine learning using AMIGOS with  Gaussian expectatio...Affective analysis in machine learning using AMIGOS with  Gaussian expectatio...
Affective analysis in machine learning using AMIGOS with Gaussian expectatio...
 
A Review Paper on Speech Based Emotion Detection Using Deep Learning
A Review Paper on Speech Based Emotion Detection Using Deep LearningA Review Paper on Speech Based Emotion Detection Using Deep Learning
A Review Paper on Speech Based Emotion Detection Using Deep Learning
 
IRJET- Prediction of Human Facial Expression using Deep Learning
IRJET- Prediction of Human Facial Expression using Deep LearningIRJET- Prediction of Human Facial Expression using Deep Learning
IRJET- Prediction of Human Facial Expression using Deep Learning
 
Survey of Various Approaches of Emotion Detection Via Multimodal Approach
Survey of Various Approaches of Emotion Detection Via Multimodal ApproachSurvey of Various Approaches of Emotion Detection Via Multimodal Approach
Survey of Various Approaches of Emotion Detection Via Multimodal Approach
 
Sentiment analysis by deep learning approaches
Sentiment analysis by deep learning approachesSentiment analysis by deep learning approaches
Sentiment analysis by deep learning approaches
 
Contextual Emotion Recognition Using Transformer-Based Models
Contextual Emotion Recognition Using Transformer-Based ModelsContextual Emotion Recognition Using Transformer-Based Models
Contextual Emotion Recognition Using Transformer-Based Models
 
IRJET - Audio Emotion Analysis
IRJET - Audio Emotion AnalysisIRJET - Audio Emotion Analysis
IRJET - Audio Emotion Analysis
 
Detection of speech under stress using spectral analysis
Detection of speech under stress using spectral analysisDetection of speech under stress using spectral analysis
Detection of speech under stress using spectral analysis
 

Recently uploaded

Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Delhi Call girls
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxJohnnyPlasten
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxfirstjob4
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023ymrp368
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAroojKhan71
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Valters Lauzums
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxolyaivanovalion
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...amitlee9823
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz1
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxolyaivanovalion
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionfulawalesam
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusTimothy Spann
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfRachmat Ramadhan H
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...amitlee9823
 
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...shivangimorya083
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...amitlee9823
 

Recently uploaded (20)

Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptx
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptx
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptx
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interaction
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
 
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
 

Emotion Recognition based on Speech and EEG Using Machine Learning Techniques.docx

  • 1. Emotion Recognition based on Speech and EEG Using Machine Learning Techniques Hanzala Javed: 2022-MSCS-17 Muhammad Sarfraz: 2022-MSCS-48 Abstract: This research investigates the synergy of speech and EEG data for emotion recognition using machine learning, with a focus on Artificial Neural Networks (ANNs). The study achieves a notable 97 percent accuracy in discerning emotions, employing a diverse dataset encompassing various emotional states. The integrated analysis of speech and EEG data enhances the model's robustness, capturing both physiological and vocal cues. The ANN architecture is meticulously designed for effective feature extraction and representation learning, demonstrating its efficacy in discerning subtle emotional nuances. The high accuracy attained underscores the potential of this approach in human-computer interaction, affective computing, and mental health monitoring. This research contributes to the evolving field of emotion recognition, offering a novel multimodal approach with practical implications for creating more intuitive and responsive systems in various applications. Key Words: Emotion Recognition Speech Analysis EEG Signals Machine Learning Artificial Neural Networks (ANNs) Introduction: Emotion recognition has emerged as a pivotal area of research within the broader scope of artificial intelligence, contributing significantly to human-computer interaction and affective computing. Understanding and accurately interpreting human emotions through various modalities, such as speech and electroencephalogram (EEG) signals, hold immense potential for applications ranging from healthcare to human-machine interfaces. This introduction seeks to provide a comprehensive overview of the literature landscape in emotion recognition, drawing upon key studies that have explored diverse modalities and methodologies. The exploration of emotion recognition using audio features has been a prominent focus in recent research [1]. Aouani et al. delved into the intricacies of leveraging Mel Frequency Cepstral Coefficients (MFCC), Zero Crossing Rate (ZCR), Harmonic to Noise Rate (HNR), and Teager Energy Operator (TEO) for identifying emotions [1]. The utilization of an Auto-Encoder for feature selection and Support Vector Harmonic Machines (SVM) as a classifier underscored the effectiveness of this two-step approach. The study conducted experiments on the Ryerson Multimedia Laboratory (RML) dataset, highlighting the potential for advancements in the field [1]. A review by Maithri et al. spanning the years 2016 to 2021 meticulously analyzed state-of-the-art models in automated emotion recognition, emphasizing the dominance of deep learning techniques and their impressive performance metrics, particularly in controlled environments [4]. The review identified a critical gap in the literature regarding models tailored for dynamic, uncontrolled settings, emphasizing the need for future research to address this limitation [4].
  • 2. Speech emotion recognition was addressed by Issa et al., who introduced a novel architecture utilizing a one-dimensional Convolutional Neural Network (CNN) and various audio features [6]. The proposed model outperformed existing frameworks and set a new state-of-the-art, emphasizing the complexity of speech emotion recognition and suggesting avenues for further research [6]. Building upon the rich foundation laid by these studies, the present research aims to contribute to the field of emotion recognition based on speech and EEG signals. Leveraging machine learning techniques, specifically an Artificial Neural Network (ANN) model, our study achieved a remarkable 97 percent accuracy. This paper aims to elucidate the methodology, experimental setup, and findings, further enhancing our understanding of emotion recognition and its potential applications. Literature Survey: [1] Aouani, H et al., explored emotion recognition using audio features, specifically employing 39 coefficients of Mel Frequency Cepstral Coefficients (MFCC), Zero Crossing Rate (ZCR), Harmonic to Noise Rate (HNR), and Teager Energy Operator (TEO). The authors propose a two-step approach, first utilizing Auto-Encoder for selecting relevant parameters from the initially extracted features and then employing Support Vector Harmonic Machines (SVM) as a classifier method. The experiments are conducted on the Ryerson Multimedia Laboratory (RML) dataset. The paper concludes by presenting the performance of the proposed systems, emphasizing the fusion of HNR with widely recognized emotion features. The use of auto-encoder dimension reduction is highlighted for improving identification rates. The authors suggest future research directions, including the exploration of different feature types, application on larger datasets, alternative methods for feature dimension reduction, and potential incorporation of audiovisual data to enhance emotion recognition rates. The study positions its findings as effective compared to other emotion recognition systems, underscoring the potential for further advancements in the field. [2] Wang, Q., addresses the critical aspect of Automatic Emotion Recognition (AER) for enhancing Human–Machine Interactions (HMI) by introducing a novel multi-modal emotion database, MED4, comprising EEG, photo plethysmography, speech, and facial images. Conducted in two environmental conditions, a research lab and an anechoic chamber, the study employs four baseline algorithms to assess AER methods and presents two fusion strategies at feature and decision levels. Results indicate that EEG signals outperform speech signals in emotion recognition, and their fusion significantly improves accuracy. The paper emphasizes the robustness of AER in noisy environments and the database's availability for global research collaboration. The conclusion discusses the unique contributions of MED4, including its multi-modality and multi-environmental design, the effectiveness of EEG signals, the impact of environmental noise, and the potential for future research directions, highlighting the paper's significance in advancing the field of AER. Moreover, the paper evaluates the impact of variable-length EEG on AER performance, finding it to outperform other methods, particularly in recognizing happy emotions. 
The study also explores single-modality emotion recognition based on speech and EEG signals, revealing that EEG signals exhibit high accuracy, especially in identifying happy emotions, while speech signals are more effective in recognizing neutral and angry emotions. The analysis of environmental noise indicates a more stable performance using EEG signals across different environments compared to speech signals, emphasizing the reliability of EEG in suboptimal acoustic conditions
  • 3. [3] Jafari, M. et al., investigates the role of Deep Learning (DL) techniques in emotion recognition from Electroencephalogram (EEG) signals, acknowledging emotions' crucial influence on human decision- making and mental states. It discusses the challenges associated with EEG-based emotion recognition, such as signal variability and the lack of a universal processing standard. The study emphasizes the advantages of EEG signals in terms of spatial resolution and ease of recording. The conclusion reviews recent research efforts employing DL models for emotion recognition, especially focusing on diverse emotions and associated psychological conditions. The paper presents a comprehensive literature review, covering the period from 2016 to 2023, discussing the challenges faced, summarizing studies on DL techniques in EEG-based emotion recognition, and proposing future research directions. The comprehensive review positions the paper as a valuable resource for understanding the current state, challenges, and potential advancements in the field of emotion recognition using EEG signals and DL techniques. Furthermore, the paper provides a systematic review of the literature, categorizing articles based on DL techniques, EEG signal processing steps, and challenges encountered in emotion recognition. The thorough exploration of challenges, DL methods, and potential future directions enhances the paper's contribution to the field, offering insights for researchers and practitioners aiming to develop more robust and effective systems for emotion recognition from EEG signals. [4] Maithri, M et al., provided a review delves into the realm of automated emotion recognition (ER) methodologies spanning the years 2016 to 2021, with a specific emphasis on electroencephalogram (EEG), facial, and speech signals. A meticulous analysis of state-of-the-art models reveals a conspicuous upswing in the adoption of deep learning techniques for ER. Notably, these approaches have showcased impressive performance metrics, particularly within controlled environments, signifying a discernible trend in the landscape. The comprehensive summary meticulously categorizes the diverse methodologies employed, the modalities considered (EEG, facial, and speech signals), and the corresponding performance metrics, underscoring the dominance of deep learning in achieving heightened accuracy and efficiency. Despite the notable successes in controlled environments, the review accentuates a critical lacuna in the literature— the scarcity of models tailored for dynamic, uncontrolled settings. These scenarios, marked by subject movements and sudden shifts between expressions, present a substantial challenge for existing automated ER systems. The review critically underscores the imperative for future research to address this gap. Developing and refining automated ER systems capable of operating effectively in real-time, unpredictable scenarios becomes crucial for their broader practical deployment. Such advancements have far-reaching implications across diverse domains, including healthcare, e-learning, and surveillance, where the practical application of ER technologies often involves uncontrolled and dynamic settings. Bridging this gap not only enhances the robustness of automated ER systems but also expands their real-world applicability, ensuring their efficacy in capturing the nuances of human emotions in various contexts. 
[5] Yu, C et al., delves into the crucial realm of emotion recognition, emphasizing its pivotal role in artificial intelligence and human-computer interaction. Focusing on electroencephalogram (EEG) signals, directly generated by the central nervous system and intimately linked with human emotions, the paper reviews recent advancements in emotion recognition methodologies. It covers various aspects, including emotion induction, EEG preprocessing, feature extraction, and emotion classification. The paper critically compares the strengths and weaknesses of these methods while highlighting the existing challenges in current research methodologies. The conclusion underscores the foundational importance of emotion recognition in human-computer emotion interaction and its broad application value in
  • 4. enhancing various aspects of human life. Notably, with the continuous progress in brain-computer interface technology and the development of artificial intelligence, emotion recognition based on EEG signals emerges as a promising avenue, garnering extensive attention. The paper emphasizes the impact of EEG signal acquisition and preprocessing on classification accuracy and notes the successful integration of deep learning techniques, particularly neural networks, in advancing emotion recognition models within the domain of brain-computer interfaces. The research direction highlighted in the conclusion underscores the evolving landscape of emotion classification through the integration of deep learning and EEG signals. [6] Issa, D et al., addresses the challenging task of speech emotion recognition through the introduction of a novel architecture utilizing a one-dimensional Convolutional Neural Network (CNN). The proposed model extracts diverse audio features, including mel-frequency cepstral coefficients, chromagram, mel- scale spectrogram, Tonnetz representation, and spectral contrast features, from raw sound files. The datasets employed for evaluation encompass the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS), Berlin (EMO-DB), and Interactive Emotional Dyadic Motion Capture (IEMOCAP). The study employs an incremental method for refining the initial model, resulting in enhanced classification accuracy. Unlike some prior approaches, the proposed framework operates directly with raw sound data, avoiding the need for conversion to visual representations. In conclusion, the paper emphasizes the complexity of speech emotion recognition, addressing the key challenges of feature extraction and classification. The proposed one-dimensional deep CNN, combined with a variety of audio features, outperforms existing frameworks for RAVDESS and IEMOCAP, setting a new state-of-the- art. For EMO-DB, the paper presents an incremental set of models to enhance performance, achieving competitive results compared to prior works in terms of generality, simplicity, and applicability. The authors acknowledge the potential for further research, suggesting exploration of alternative features or the integration of auxiliary neural networks for high-level feature extraction. Additionally, comprehensive data augmentation techniques and the incorporation of additional layers of Long Short- Term Memory (LSTM) are identified as potential avenues for improving classification accuracy. The paper also highlights the significance of the order of stacking sound features and proposes it as a subject for future investigation, reflecting a commitment to ongoing refinement and optimization in the field of speech emotion recognition. [7] Alhalaseh, R et al., explores the development of an automated model for identifying emotions based on EEG signals. Addressing the challenges of using brain signals for emotion recognition due to their inherent instability, the study proposes a novel approach employing empirical mode decomposition/intrinsic mode functions (EMD/IMF) and variational mode decomposition (VMD) for signal processing. Distinct from previous works, the paper focuses on the application of EMD/IMFs and VMD, which are not commonly utilized in emotion recognition literature. 
[7] Alhalaseh and Alasasfeh explore an automated model for identifying emotions from EEG signals. Addressing the inherent instability of brain signals, the study applies empirical mode decomposition/intrinsic mode functions (EMD/IMF) and variational mode decomposition (VMD) for signal processing, techniques not commonly used in the emotion recognition literature. The feature extraction stage relies on entropy and Higuchi's fractal dimension (HFD), and the classification stage compares four methods: naïve Bayes, k-nearest neighbor (k-NN), convolutional neural network (CNN), and decision tree (DT). Evaluated on the DEAP database with a range of performance metrics, the model achieves 95.20% accuracy with the CNN-based method. The authors note that advances in sensor and signal recording technologies make it possible to use signals extracted from human organs for condition identification, and that categorizing emotions from EEG signals is a complex application aimed at discerning a person's emotional state and flagging potential issues. The CNN classifier proves superior in both accuracy and runtime, outperforming the other classifiers and comparing favorably with existing literature across accuracy, precision, recall, and F1-measure. The paper's contribution lies in its novel signal processing approach and its systematic classifier performance analysis.
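Of the two features used in [7], Higuchi's fractal dimension is the less familiar; the sketch below implements the standard Higuchi algorithm in NumPy. The parameter kmax=10 is an assumed default, not a value taken from the paper.

```python
import numpy as np

def higuchi_fd(x: np.ndarray, kmax: int = 10) -> float:
    """Higuchi's fractal dimension of a 1-D signal (standard formulation)."""
    n = len(x)
    lk = []
    for k in range(1, kmax + 1):
        lengths = []
        for m in range(k):
            idx = np.arange(m, n, k)              # sub-series x[m], x[m+k], ...
            if len(idx) < 2:
                continue
            # Normalized curve length of this sub-series.
            curve = np.abs(np.diff(x[idx])).sum() * (n - 1)
            lengths.append(curve / (len(idx) - 1) / (k * k))
        lk.append(np.mean(lengths))
    # The dimension is the slope of log L(k) against log(1/k).
    k_vals = np.arange(1, kmax + 1)
    slope, _ = np.polyfit(np.log(1.0 / k_vals), np.log(lk), 1)
    return slope

# Example: a smooth sine wave should give a dimension close to 1.
print(higuchi_fd(np.sin(np.linspace(0, 8 * np.pi, 1000))))
```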
[8] Pan et al. introduce Deep-Emotion, a novel approach to Multimodal Emotion Recognition (MER) using facial expressions, speech, and electroencephalogram (EEG). The authors identify persistent challenges in emotion recognition, including the effective use of different modalities and real-time detection under growing demands on computing power. The Deep-Emotion framework comprises three branches, each with a specialized neural network for feature extraction: the facial branch employs an improved GhostNet to address overfitting and enhance classification accuracy; the speech branch introduces a lightweight fully convolutional neural network (LFCNN) for efficient speech emotion feature extraction; and the EEG branch uses a tree-like Long Short-Term Memory (tLSTM) model capable of fusing multi-stage features. Decision-level fusion then integrates the recognition results from the three branches. Extensive experiments on the CK+, EMO-DB, and MAHNOB-HCI datasets demonstrate the method's superior performance, and the paper represents the first attempt to combine facial expressions, speech, and EEG for MER. An optimal weight distribution search algorithm for the decision-level fusion further improves reliability, and the authors point to refined dynamic weight allocation as a route to greater overall robustness.
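Decision-level fusion of the kind used in [8] can be illustrated as a weighted combination of per-branch class probabilities. The brute-force grid search below is only a stand-in for the authors' optimal weight distribution search algorithm, whose details are not reproduced here.

```python
import numpy as np
from itertools import product

def fuse_decisions(p_face, p_speech, p_eeg, weights):
    """Weighted decision-level fusion of per-branch class-probability matrices."""
    w = np.asarray(weights, dtype=float)
    return w[0] * p_face + w[1] * p_speech + w[2] * p_eeg

def search_weights(branch_probs, y_true, step=0.1):
    """Brute-force search for the weight triple maximizing validation accuracy."""
    best_w, best_acc = (1 / 3, 1 / 3, 1 / 3), 0.0
    grid = np.arange(0.0, 1.0 + step, step)
    for w1, w2 in product(grid, grid):
        w3 = 1.0 - w1 - w2
        if w3 < 0:                         # the weights must sum to one
            continue
        fused = fuse_decisions(*branch_probs, weights=(w1, w2, w3))
        acc = float(np.mean(fused.argmax(axis=1) == y_true))
        if acc > best_acc:
            best_acc, best_w = acc, (w1, w2, w3)
    return best_w, best_acc
```

Here each probability matrix would be the softmax output of one branch on a held-out validation set, with one row per sample and one column per emotion class.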
[9] Houssein et al. present a comprehensive literature review of emotion recognition methods that use multi-channel electroencephalogram (EEG) signals in Brain-Computer Interfaces (BCIs). Affective computing, a subset of artificial intelligence, is highlighted for its role in detecting, interpreting, and mimicking human emotions. The authors stress the limitations of traditional modalities such as facial expressions, speech, and behavior, which may be distorted by conscious or unconscious social masking, and advocate physiological signals, particularly EEG, as a more accurate and objective basis for emotion recognition. Covering the period from 2015 to 2021 and more than 195 publications, the review details the entire EEG-based BCI pipeline, including data collection, preprocessing, feature extraction, feature selection, classification, and performance evaluation. Emphasis is placed on the real-time responsiveness and authenticity of EEG signals, which react to emotional changes and therefore provide a reliable source for emotion recognition. The paper extensively surveys EEG feature extraction techniques, feature selection and dimensionality reduction methods, and machine and deep learning classifiers, including k-nearest neighbor, support vector machines, decision trees, artificial neural networks, and convolutional and recurrent neural networks with long short-term memory.
The review also examines the EEG rhythms associated with emotions and the intricate relationship between distinct brain areas and emotional states, discusses challenges and future research directions, and aims to offer researchers, especially those new to the field, a snapshot of the current state of emotional-oriented EEG feature recognition and categorization. Funding is acknowledged from The Science, Technology & Innovation Funding Authority (STDF) in cooperation with The Egyptian Knowledge Bank (EKB).

Research Methodology:
The research methodology leveraged a dataset sourced from Kaggle comprising EEG signals and speech data for emotion recognition. The EEG signals were preprocessed by removing artifacts and segmenting them into relevant time intervals, while the speech data underwent noise reduction, feature extraction, and normalization. Both sets of preprocessed features were then integrated into a single dataset. An Artificial Neural Network (ANN) was designed to process this combined data, with the input layer configured to accommodate the multidimensional nature of the fused features. The model was trained and validated on appropriate splits of the data, and performance was evaluated using accuracy, precision, and recall. Results were analyzed in the context of the research objectives, with emphasis on potential applications and future directions. Figure 1 depicts the flow diagram of this methodology, illustrating the sequential steps from data collection to model evaluation.

Figure 1
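As a minimal illustration of the pipeline just described, the following Keras sketch performs feature-level fusion of EEG and speech feature matrices, normalizes them, and trains a small ANN classifier. The feature dimensionalities, layer sizes, dropout rate, and training hyperparameters are illustrative assumptions, and random arrays stand in for the preprocessed Kaggle data; the exact architecture used in the study is not specified here.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from tensorflow import keras

# Placeholder data: random arrays in place of the preprocessed Kaggle features.
rng = np.random.default_rng(0)
eeg_X    = rng.normal(size=(500, 64))     # assumed EEG feature matrix
speech_X = rng.normal(size=(500, 193))    # assumed speech feature matrix
y        = rng.integers(0, 7, size=500)   # assumed labels for 7 emotion classes

X = np.hstack([eeg_X, speech_X])          # feature-level fusion
X = StandardScaler().fit_transform(X)     # normalization step
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

model = keras.Sequential([
    keras.Input(shape=(X.shape[1],)),     # input sized to the fused vector
    keras.layers.Dense(256, activation="relu"),
    keras.layers.Dropout(0.3),
    keras.layers.Dense(128, activation="relu"),
    keras.layers.Dense(7, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(X_tr, y_tr, validation_split=0.1, epochs=15, batch_size=32)
```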
Results and Discussions:
The integration of speech and EEG data for emotion recognition using an Artificial Neural Network (ANN) model proved highly effective, achieving 97 percent accuracy on the Kaggle dataset. The model classifies diverse emotional states accurately, demonstrating robustness in capturing both vocal and physiological cues, and precision, recall, and F1-score metrics further validate its effectiveness across emotions. The confusion matrix, shown in Figure 2, visualizes performance on the individual emotion classes and highlights the model's capacity to discern subtle nuances. The diversity of the Kaggle dataset contributes to the model's generalizability across a wide range of emotional expressions. These findings underscore the potential of the proposed multimodal approach in real-world applications such as human-computer interaction and affective computing, and the high accuracy substantiates the efficacy of the applied machine learning techniques on combined EEG and speech data.
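The reported per-class metrics and confusion matrix can be reproduced for any trained classifier along the following lines; the snippet assumes the model and held-out split from the training sketch above.

```python
from sklearn.metrics import classification_report, confusion_matrix

y_pred = model.predict(X_te).argmax(axis=1)   # softmax outputs -> class ids
print(classification_report(y_te, y_pred))    # per-class precision, recall, F1
print(confusion_matrix(y_te, y_pred))         # rows: true class, columns: predicted
```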
References:
[1] Aouani, H., & Ayed, Y. B. (2020). Speech emotion recognition with deep learning. Procedia Computer Science, 176, 251-260.
[2] Wang, Q., Wang, M., Yang, Y., & Zhang, X. (2022). Multi-modal emotion recognition using EEG and speech signals. Computers in Biology and Medicine, 149, 105907.
[3] Jafari, M., Shoeibi, A., Khodatars, M., Bagherzadeh, S., Shalbaf, A., García, D. L., ... & Acharya, U. R. (2023). Emotion recognition in EEG signals using deep learning methods: A review. Computers in Biology and Medicine, 107450.
[4] Maithri, M., Raghavendra, U., Gudigar, A., Samanth, J., Barua, P. D., Murugappan, M., ... & Acharya, U. R. (2022). Automated emotion recognition: Current trends and future perspectives. Computer Methods and Programs in Biomedicine, 215, 106646.
[5] Yu, C., & Wang, M. (2022). Survey of emotion recognition methods using EEG information. Cognitive Robotics, 2, 132-146.
[6] Issa, D., Demirci, M. F., & Yazici, A. (2020). Speech emotion recognition with deep convolutional neural networks. Biomedical Signal Processing and Control, 59, 101894.
[7] Alhalaseh, R., & Alasasfeh, S. (2020). Machine-learning-based emotion recognition system using EEG signals. Computers, 9(4), 95.
[8] Pan, J., Fang, W., Zhang, Z., Chen, B., Zhang, Z., & Wang, S. (2023). Multimodal emotion recognition based on facial expressions, speech, and EEG. IEEE Open Journal of Engineering in Medicine and Biology.
[9] Houssein, E. H., Hammad, A., & Ali, A. A. (2022). Human emotion recognition from EEG-based brain-computer interface using machine learning: A comprehensive review. Neural Computing and Applications, 34(15), 12527-12557.