MASTER'S THESIS
To obtain the Master's degree in:
Advanced Engineering of Robotized Systems
and Artificial Intelligence
Presented by: Bouzidi Amir
Emotions prediction for augmented EEG signals using VAE
and Convolutional Neural Networks (CNN) combined with LSTM
Presented on: July 4th, 2021
Graduation committee:
President: Mr. Sayadi Mounir
Technical advisors: Ms. Fourati Rahma and Mr. Yangui Maher
Report advisor: Ms. Ammar Boudour
Member: Ms. Selmani Anissa
Academic year: 2020/2021
Abstract (translated from the Arabic)
This graduation project consists of creating an intelligent system for emotion prediction using EEG signals augmented by means of a VAE and convolutional neural networks combined with LSTM. The goal of the project is to improve medical diagnosis, in particular by facilitating the detection of emotions and thus allowing patients suffering from psychological disorders and anxiety to move quickly to the treatment stage. This work helps both the medical staff and the patient. In the first stage, we produced new data via conditional variational autoencoders (cVAE) to provide a sufficient amount of data; second, we used convolutional neural networks combined with the long short-term memory (LSTM) technique to predict emotions precisely; finally, we created an easy, simple-to-use interface for users for medical diagnosis.
Résumé (translated from the French)
Our end-of-study project consists of developing an intelligent system for predicting emotions using EEG signals augmented with a VAE and convolutional neural networks combined with LSTM. The objective of the project is to improve medical diagnosis, in particular by facilitating the detection of emotions and by applying rapid treatment to patients, especially those suffering from trauma and psychological anxiety. This work helps both the medical staff and the patient. As a first step, we generated new data via cVAEs to provide a sufficient quantity of data. Second, we used CNNs combined with the LSTM technique to predict emotions. Finally, we created an ergonomic, easy-to-access interface for users for medical diagnosis.
Abstract
Our end-of-study project aims to develop an intelligent emotion prediction system using EEG signals augmented by means of a VAE and convolutional neural networks combined with LSTM. The aim of the project is to improve medical diagnosis, in particular by facilitating the detection of emotions and applying quick treatment to patients, especially those suffering from trauma and anxiety. This work helps both the medical staff and the patient. As a first step, we generated new data via cVAEs to provide a sufficient amount of data. Second, we used CNNs combined with the LSTM technique to predict emotions. Finally, we created an ergonomic, easy-to-access interface for users in medical diagnosis.
Keywords (translated from the Arabic): EEG, CNN, RNN, VAE, LSTM, Python, prediction, emotion recognition, semi-supervised learning, deep learning, machine learning, datasets.
Mots clés (translated from the French): EEG, CNN, RNN, VAE, LSTM, Python, prediction, emotion recognition, semi-supervised learning, deep learning, machine learning, datasets.
Key-words: EEG, CNN, RNN, VAE, LSTM, Python, prediction, emotion recognition, semi-supervised learning, deep learning, machine learning, datasets.
Emotions prediction for augmented EEG signals using VAE and
Convolutional Neural Networks (CNN) combined with LSTM
Dedication (translated from the Arabic)

In the name of Allah, the Most Gracious, the Most Merciful.
Dedicated to the late engineer Mohamed Zouari, may Allah have mercy on him.
Dedicated to our teachers, and to everyone who taught us, instructed us, or showed us kindness.

What follows are a few brief words on the virtue of beneficial knowledge and on the contribution of Islamic civilization to all of humanity.

Knowledge holds great value and standing in Islam; this is as clear and evident as the sun at the heart of the sky in the texts of the two revelations, the Quran and the Sunnah, and in the traditions of the righteous predecessors. We are not merely the nation of "Read"; we are the nation of "Read in the name of your Lord who created". Our reading of the universe is guided by Allah's support, and learning the sciences should serve the benefit of Muslims in particular and of humanity in general. The West has acknowledged this contribution: the French physician Gustave Le Bon writes in his book on Islamic civilization that the contribution of the Arabs and Muslims in the field of civilization was not limited to themselves; they had a profound impact on both East and West, which are indebted to them for their civilization.

In recognition of the standing of knowledge and scholars, Allah called the scholars to witness to His oneness, saying: "Allah bears witness that there is no deity except Him, and so do the angels and those of knowledge, upholding justice; there is no deity except Him, the Almighty, the Wise." (Al Imran: 18).

Because of the importance of knowledge, Allah commanded His Messenger to ask for more of it, saying: "And say: My Lord, increase me in knowledge." As the Austrian thinker Leopold Weiss states in his book (Islam at the Crossroads): we do not exaggerate if we say that the modern scientific age in which we live was inaugurated not in the cities of Europe, but in the Islamic centres of Damascus, Baghdad, Cairo and Cordoba.

The French poet Anatole France says in his book (La Vie en Fleur): in Islam, neither science nor religion ever turned its back on the other; rather, religion was an incentive to science. Western civilization is indebted to Islamic civilization to such a degree that we cannot understand the former without knowing the latter.

In conclusion, the worldly sciences, such as medicine, physics, mathematics, engineering and others, benefit humankind; Muslims profit from them and need them, and learning them is without doubt a noble deed. Whoever learns them in order to benefit Muslims and to achieve their self-sufficiency and independence from others is engaged in worship, for the Prophet, peace be upon him, said: "Actions are but by intentions."

1442 AH
2021 AD
Contents
List of Figures 6
List of Tables 1
1 General Introduction 2
2 Review of EEG-based emotion recognition 4
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
2.2 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
2.2.1 Electroencephalography . . . . . . . . . . . . . . . . . 4
2.2.2 Brain waves . . . . . . . . . . . . . . . . . . . . . . . . 6
2.2.3 Anxiety disorder . . . . . . . . . . . . . . . . . . . . . 8
2.2.4 Worldwide and Tunisia statistics . . . . . . . . . . . . 8
2.3 Ways of emotion detection using machine learning . . . . . . . 9
2.3.1 Facial Recognition . . . . . . . . . . . . . . . . . . . . 9
2.3.2 Speech Recognition . . . . . . . . . . . . . . . . . . . . 10
2.3.3 Body Gestures and Movements . . . . . . . . . . . . . 10
2.3.4 Motor Behavioural Patterns . . . . . . . . . . . . . . . 11
2.3.5 Biosignals . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.4 Existing applications for anxiety’s detection and treatment . . 14
2.4.1 Wysa: Depression and anxiety therapy chatbot . . . . 14
2.4.2 Daylio: . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.4.3 Headspace: . . . . . . . . . . . . . . . . . . . . . . . . 16
2.4.4 Calm: . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.4.5 AntiStress, Relaxing, Anxiety Stress Relief Game: . . . 17
2.4.6 Shine: Calm Anxiety Stress: . . . . . . . . . . . . . . . 18
2.5 Available anxiety elicitation-based datasets . . . . . . . . . . . 19
2.5.1 Problem of imbalanced dataset . . . . . . . . . . . . . 21
2.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
3 EEG data augmentation using CVAE 22
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
3.2 The autoencoder . . . . . . . . . . . . . . . . . . . . . . . . . 22
3.3 Variational AutoEncoder (VAE) . . . . . . . . . . . . . . . . . 24
3.4 The gap between AE and VAE . . . . . . . . . . . . . . . . . 26
3.5 Conditional Variational Autoencoder . . . . . . . . . . . . . . 27
3.6 Coding in COLAB . . . . . . . . . . . . . . . . . . . . . . . . 29
3.7 Evaluation of the generation process . . . . . . . . . . . . . . 29
3.8 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
4 Emotion recognition using recurrent CNN 32
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
4.2 Convolutional Neural Network . . . . . . . . . . . . . . . . . . 32
4.2.1 CNN principle . . . . . . . . . . . . . . . . . . . . . . . 32
4.2.2 CNN applications . . . . . . . . . . . . . . . . . . . . . 33
4.2.3 CNN Architecture . . . . . . . . . . . . . . . . . . . . 34
4.2.4 Difference between Conv1D and Conv2D . . . . . . . . 35
4.3 Recurrent Neural Network . . . . . . . . . . . . . . . . . . . . 35
4.3.1 Definition . . . . . . . . . . . . . . . . . . . . . . . . . 35
4.3.2 RNN Applications . . . . . . . . . . . . . . . . . . . . 36
4.3.3 Long Short Term Memory layer LSTM . . . . . . . . . 36
4.4 The proposed architecture for anxiety states recognition . . . . 37
4.5 Anxiety states recognition results . . . . . . . . . . . . . . . . 38
4.5.1 Experimental setup . . . . . . . . . . . . . . . . . . . . 38
4.5.2 Classification results without data augmentation . . . . 39
4.5.3 Classification results with data augmentation . . . . . 42
4.6 Graphical user interface . . . . . . . . . . . . . . . . . . . . . 43
4.7 The general methodology of our work . . . . . . . . . . . . . . 44
4.8 Project schedule . . . . . . . . . . . . . . . . . . . . . . . . . . 44
4.9 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
5 Conclusion and Future Work 46
Netography 51
List of Figures
2.1 Measuring the electrical activity using fixated electrodes on
an EEG cap . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.2 The five EEG signals and their associated activities . . . . . . 7
2.3 Happiness index for 2015-2017 . . . . . . . . . . . . . . . . . . 8
2.4 Extracting facial features . . . . . . . . . . . . . . . . . . . . . 9
2.5 Extracting speech features . . . . . . . . . . . . . . . . . . . . 10
2.6 Emotion recognition using hand movements . . . . . . . . . . 11
2.7 ER using walking behavioural features . . . . . . . . . . . . . 12
2.8 Emotion recognition using biosignals . . . . . . . . . . . . . . 13
2.9 Screenshot from the Penguin chatbot . . . . . . . . . . . . . . 14
2.10 Screenshots from Daylio . . . . . . . . . . . . . . . . . . . . . 15
2.11 Screenshots from Headspace . . . . . . . . . . . . . . . . . . . 16
2.12 Mindful days follow-up schedule from Calm . . . . . . . . . . 17
2.13 Many relaxing and coloring games to treat anxiety . . . . . . . 18
2.14 Screenshots from Shine app . . . . . . . . . . . . . . . . . . . 19
2.15 Gap in results between balanced and imbalanced datasets . . 21
3.1 The autoencoder architecture . . . . . . . . . . . . . . . . . . 23
3.2 The latent space regularization problem . . . . . . . . . . . . . 23
3.3 The variational autoencoder model . . . . . . . . . . . . . . . 25
3.4 The optimisation of the variational autoencoder model . . . . 26
3.5 The reparameterization trick . . . . . . . . . . . . . . . . . . 26
3.6 Real difference between AE and VAEs on MNIST dataset . . . 27
3.7 The network of the CVAE. Enc and Dec denote the encoder and
the decoder; the figure also shows one real sample, its real label, a
generated sample, the mean value, the standard deviation, and the
resampled noise. 28
3.8 The architecture of a Conditional Variational Autoencoder . . 28
3.9 t-SNE representation of real and generated data . . . . . . . . 30
3.10 Topographical map of real and generated data . . . . . . . . . 31
4.1 Charles Camiel looks into the camera for a facial recognition
test at Logan International Airport in Boston . . . . . . . . . 33
4.2 CNN architecture . . . . . . . . . . . . . . . . . . . . . . . . . 34
4.3 Conv2D kernel sliding . . . . . . . . . . . . . . . . . . . . . . 35
4.4 Gap between RNNs (left) and Feedforward Neural Networks (right) 36
4.5 Inside the LSTM cell . . . . . . . . . . . . . . . . . . . . . . . 37
4.6 CNN-LSTM architecture for anxiety states recognition . . . . 38
4.7 CNN LSTM architecture description . . . . . . . . . . . . . . 39
4.8 Training and validation loss . . . . . . . . . . . . . . . . . . . 40
4.9 Training and validation accuracy of CNN-LSTM on DASPS
dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
4.10 Confusion matrix of CNN-LSTM on DASPS dataset . . . . . . 41
4.11 Uploading data to GUI by therapist . . . . . . . . . . . . . . . 42
4.12 GUI showing the EEG brain map in panel (a) and the emotion
prediction in panel (b) . . . . . . . . . . . . . . . . . . . . . . 43
4.13 Basic steps of our methodology . . . . . . . . . . . . . . . . . 44
4.14 GANTT chart . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
Acknowledgments
In the Name of Allah, the Most Beneficent, the Most Merciful
The Prophet Mohammad, peace be upon him said:
”Allah does not thank the person who does not thank people.”
• I would first like to thank my thesis advisor, Ms. Boudour
Ammar, PhD, Eng. and Assistant Professor at the National
Engineering School of Sfax, for her great support, help and
precious advice during this internship.
• I would also like to express my gratitude to Doctor Rahma Fourati,
member of the REGIM Lab at ENIS; her office door was al-
ways open whenever I ran into a trouble spot or had a ques-
tion about my research or writing. She consistently allowed
this report to be my own work, but steered me in the right
direction whenever she thought I needed it.
• I would like to thank the industrial head of CISEN Com-
puter, Mr. Maher Yangui, the IT engineer who gave me
the opportunity to accomplish this internship with their es-
teemed company. I would also like to thank the administra-
tive staff of UVT for their coordination and support
during my studies.
• Finally, I must express my very profound gratitude to my
parents: Mokhtar and Baya and to my brothers: Chedi and
Jemil, to my friends and colleagues for providing me with
unfailing support and continuous encouragement throughout
my years of study and through the process of researching
and writing this thesis. This accomplishment would not have
been possible without them. Thank you.
Amir Bouzidi
Chapter 1
General Introduction
Affective computing is the study and development of systems and devices
that can recognize, interpret, process, and simulate human affects. It is an
interdisciplinary field spanning computer science, psychology, and cognitive
science. The machine should interpret the emotional state of humans and
adapt its behavior to them, giving an appropriate response for those emo-
tions.
Affective computing technologies sense the emotional state of a user (via
sensors, microphone, cameras or software logic). They respond by perform-
ing specific predefined product/service features, such as changing a quiz or
recommending a set of videos to fit the mood of the learner. The more com-
puters we have in our lives the more we’re going to want them to be socially
smart. We don't want them to bother us with unimportant information. That
kind of common-sense reasoning requires an understanding of the person’s
emotional state.
A major area in affective computing is the design of computational de-
vices that either exhibit innate emotional capabilities or are
capable of convincingly simulating emotions. A more practical approach,
based on current technological capabilities, is the simulation of emotions in
conversational agents in order to enrich and facilitate interactivity between
human and machine. While human emotions are often associated with surges
in hormones and other neuropeptides, emotions in machines might be asso-
ciated with abstract states associated with progress (or lack of progress) in
autonomous learning systems. In this view, affective emotional states cor-
respond to time-derivatives in the learning curve of an arbitrary learning
system. Two major categories describe emotions in machines: emotional
speech and facial affect detection.
Anxiety is a kind of negative emotion. In this work, the goal is to
design an application for the visualization of EEG signals, the inspection of
the topographical map and the recognition of the anxiety level.
The desktop application is useful for the therapist as a first diagnosis to
choose the convenient technique to continue the therapy. In other words,
some people do not share their thoughts or their mental state. In such cases,
inspecting the EEG signals gives access to the inner state of the
human mind. The current work proposes two contributions:
• The content generation of data EEG signals in order to enrich the
dataset to guarantee an efficient training of a deep neural network.
• The recognition of anxiety states with a recurrent neural network com-
posed of convolutional layers followed by a Long Short-Term Memory
(LSTM) layer to capture spatio-temporal features in clean EEG signals.
The rest of this thesis is composed of four chapters. These are organised
as follows:
• Chapter Two: This chapter presents the theoretical background and
literature review on EEG-based emotion recognition as well as EEG-
based Anxiety levels recognition. It also provides a background on EEG
signals, including EEG rhythms, analysis techniques of EEG signals
and the role of EEG as a modality for emotion recognition. In addition,
the chapter covers the techniques used by machine learning
for emotion recognition (facial recognition, speech recognition, body
gestures and movements, motor behavioural patterns and biosignals),
followed by an overview of the existing affective benchmarks, the
available datasets, and the framework and steps of this project.
• Chapter three: This chapter introduces data projection with autoen-
coders, followed by an explanation of the AE architecture, vari-
ational autoencoders and the gap between AE and VAE. Then, we
choose the conditional variational autoencoder and write the code on
COLAB. Finally, we discuss the metrics used to evaluate the generation process.
• Chapter four: This chapter presents the proposed model for the
recognition of emotion states. We discuss the advantages of CNNs, their
principle, architecture and real-world applications, then describe
the difference between 1D CNNs and the 2D CNN used in our code, followed by
a brief explanation of recurrent neural networks (RNNs) and their impor-
tance for continuous signals such as EEG. We then present the LSTM
technique used in our code, followed by an analysis
of the code; finally, we present the graphical user interface and the
GANTT chart for the whole project.
• Chapter five: This chapter presents the obtained results, summariz-
ing this experience and its contribution to future work. The thesis
ends with a conclusion that provides a summary of our contributions,
outlines the conclusions and the limitations of this research, and
suggests several directions for future research.
Chapter 2
Review of EEG-based emotion
recognition
2.1 Introduction
The mechanisms that regulate our physiological and mental processes behave
in a coupled way in which there is a remarkable inter-dependency. Mental
processes are responsible for changes of the physiological state in our body.
Conversely, changes in bodily functions also lead to different thoughts,
behaviours and emotions. In this chapter, we will present the EEG modality
along with its brain waves; we will then discuss the phenomenon of anxiety
in Tunisia and worldwide, followed by a presentation of the latest machine
learning techniques used for emotion recognition; finally, we will intro-
duce the mobile applications most used in our community for treating anxiety
and identifying emotions, with an overview of the elicitation techniques for
anxious states.
2.2 Preliminaries
2.2.1 Electroencephalography
Electroencephalography, also known as EEG, is the study of the brain func-
tions reflected by the brain's electrical activity, and it is considered one
of the basic tools to image brain functioning. Our thoughts are generated
through a network of neurons that send signals to each other with the help
of electrical currents. Electrodes of an EEG headset placed on the scalp
are used to collect the brain's electrical signals. In addition, a
conductive paste is used to improve the conduction of the electrical signals.
The EEG headset used in this study is an elastic cap similar to a bathing
cap with the electrodes mounted to the cap. The electrodes are mounted
systematically on the cap using the international 10-20 system for electrode
placement to ensure that the data can be collected from identical positions
across all the respondents. These electrodes detect the electrical changes of
thousands of synchronized neurons simultaneously. The voltage fluctuations
measured by this kind of sensor are very low, typically in the microvolt range.
The signals are digitized and sent to an amplifier, where they are
amplified.
Once amplified, the signals are sent to a computer, where
they can be recorded. Using different methodologies, the signals can be
represented as a vector array or matrix array for data processing utilities
[Cong et al., 2015]. In addition, various maps of the brain activity can be
generated, with a fast temporal resolution.
Figure 2.1: Measuring the electrical activity using fixated electrodes on an
EEG cap
[Web 2]
The main drawback of EEG is its spatial resolution: it is difficult to
tell whether the signals measured by the electrodes were generated
near the surface or in deeper regions of the brain, so it is hard to localize
where in the brain the electrical activity is coming from. The cost of EEG
systems depends on several factors: first, the number of electrodes on the
headset; second, the quality of the amplifier; and third, the sampling
rate, measured in Hz.
One of the major advantages of using EEG is the fact that it has excel-
lent temporal resolution, meaning that it can measure in fine detail events
happening in real-time. According to neuroscience news, researchers believe
that it takes about 17 milliseconds for the brain to form a representation of a
human face making EEG the perfect candidate, as EEG can capture activity
at a time scale down to milliseconds [ScienceDaily, 2018].
2.2.2 Brain waves
The development of technologies such as virtual reality and wearable devices,
combined with an understanding of the physiological responses to emotional states, can serve
a wide range of valuable applications in diverse domains:
• Medicine: Rehabilitation (help with monitoring), companionship (enhance re-
alism), counseling (the client's emotional state), health care (the patient's feel-
ings about treatment, especially for deaf, mute and blind patients).
• E-learning: Adjust the online presentation of an online tutor, detect
the state of the learner, improve tutoring systems.
• Monitoring: Detect the driver's state and warn him, an ATM not dis-
pensing money when the customer is scared, improve call-center systems (detect and
prioritize angry customers via their voices).
• Entertainment: Recognize mood and emotions of the users and sat-
isfy their needs with the right content (Movies and music recommen-
dations).
• Law: Deeper discovery in depositions: improve investigation tools
for dealing with criminals, suspects and witnesses.
• Marketing: Measure the impact of ads, improve advertising plans, optimize rec-
ommendation systems and thus improve the user shopping experience and increase
sales.
Brain waves represent the regularly recurring wave forms that are similar
in shape and duration [Steriade, 2005]. There are five main EEG frequency
bands: Delta, theta, alpha, beta and gamma which reflect the different brain
states [NeuroSky, 2009].
Brain waves and the functions of each EEG band are described below:
• Delta waves (0.1-3 Hz): Appear in dreamless sleep and unconscious
states.
• Theta waves (4-7 Hz): Observed in different states such as intuitive,
creative, imaginary and drowsy states.
• Alpha waves (8-12 Hz): The first EEG waves to be discovered, by
Berger [Niedermeyer and da Silva, 2005]. Alpha waves appear in
relaxed, tranquil and conscious states, but not during drowsiness.
Alpha waves become attenuated in several situations, such as opening the eyes,
hearing sounds, anxiety or attention.
• Beta waves (12-30 Hz): Observed in the active state and anxious think-
ing. There are three different bands of beta waves:
– Low beta waves (12-15 Hz): Appear in relaxed yet focused and
integrated cases.
– Mid-range beta waves (16-20 Hz): Appear in thinking and aware-
ness of self and surroundings.
– High beta (21-30 Hz): Observed in alertness and agitation states.
• Gamma waves (30-100 Hz): Observed in higher mental activity such
as in processing information and learning.
The EEG oscillations of the same frequency may have different functions, as
depicted in Figure 2.2: for example, delta oscillations can be normal or abnor-
mal depending on the state, being normal during slow-wave sleep and clearly
signalling abnormality during the awake state [Freeman and Quiroga, 2012].
Figure 2.2: The five EEG signals and their associated activities
[Web 3]
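To make these frequency bands concrete, the following minimal sketch isolates each band from a raw signal with a Butterworth band-pass filter. It is an illustrative example only, not the pipeline used in this work; the sampling rate, the synthetic signal and the exact band edges are assumptions.

import numpy as np
from scipy.signal import butter, filtfilt

def bandpass(signal, low_hz, high_hz, fs, order=4):
    # Band-pass filter a 1-D signal between low_hz and high_hz (in Hz).
    nyquist = fs / 2.0
    b, a = butter(order, [low_hz / nyquist, high_hz / nyquist], btype="band")
    return filtfilt(b, a, signal)  # zero-phase filtering

# Assumed example: 10 s of synthetic "EEG" sampled at 128 Hz.
fs = 128
eeg = np.random.randn(10 * fs)  # stand-in for one real EEG channel

# Gamma is truncated to 45 Hz here to stay below the Nyquist frequency (64 Hz).
bands = {"delta": (0.5, 3), "theta": (4, 7), "alpha": (8, 12),
         "beta": (12, 30), "gamma": (30, 45)}
filtered = {name: bandpass(eeg, lo, hi, fs) for name, (lo, hi) in bands.items()}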
2.2.3 Anxiety disorder
While the origins of the field may be traced as far back as early philosoph-
ical inquiries into emotion ("affect" is, basically, a synonym for "emotion"),
the more modern branch of computer science originated with Rosalind Pi-
card’s book [Picard, 1997] on affective computing. The motivation for the
research is the ability to simulate empathy.
According to [Web 4], anxiety refers to multiple mental and physiological
phenomena, including a person’s conscious state of worry over a future un-
wanted event, or fear of an actual situation. Anxiety and fear are closely
related; some scholars view anxiety as a uniquely human emotion and fear as
common to nonhuman species. Another distinction often made between fear
and anxiety is that fear is an adaptive response to realistic threat, whereas
anxiety is a diffuse emotion, sometimes an unreasonable or excessive reaction
to current or future perceived threat.
2.2.4 Worldwide and Tunisia statistics
According to ADAA [Web 5], anxiety disorders are the most common men-
tal illness in the United States, affecting 40 million adults in the country aged
18 and older, or 18.1% of the population, every year. Anxiety disorders are
highly treatable, yet only 36.9% of those suffering receive treatment. Anxi-
ety disorders affect 25.1% of children between 13 and 18 years old. Research
shows that untreated children with anxiety disorders are at higher risk to per-
form poorly in school, miss out on important social experiences, and engage
in substance abuse. The WHO reports that anxiety disorders are the most
common mental disorders worldwide with specific phobia, major depressive
disorder and social phobia being the most common anxiety disorders. The
figure below quoted from the UN Sustainable Development Solutions Net-
work Report of 2018 shows that Tunisia ranked 111th in the happiness index.
This result reflects the deterioration of the mental health situation within
Tunisian society and the need for optimizing existing solutions.
Figure 2.3: Happiness index for 2015-2017
[Web 6]
2.3 Ways of emotion detection using machine
learning
Different approaches exist for emotion recognition through machine learning.
Here are the most recently used techniques for emotion identification:
2.3.1 Facial Recognition
Facial recognition based on machine learning (ML) is a widely used method
for detecting emotions. It takes advantage of the fact that our face charac-
teristics fluctuate dramatically in response to our emotions. When we are
happy, for example, our lips expand upwards from both ends. Similarly, when
we are excited, we elevate our brows.
Facial Recognition is a valuable emotion detection technology in which
pixels of critical facial regions are evaluated to characterize facial expressions
using facial landmarks, machine learning and deep learning. Eyes, nose, lips,
jaw, eyebrows, mouth, and other facial landmarks are employed in emotion
detection using machine learning. While a distinct facial landmark may
be present in two separate emotions, a detailed analysis of the combination of
different landmarks using artificial intelligence via machine learning can help
distinguish between similar-appearing but unique emotions. For example,
while elevated eyebrows can indicate astonishment, they can also be a sign
of worry. Higher brows with raised lip boundaries, on the other hand, would
signal a joyful surprise rather than anxiety. Face recognition can be used to
detect emotions in surveillance and healthcare.
Figure 2.4: Extracting facial features
[Web 8]
2.3.2 Speech Recognition
Speech feature extraction and voice activity detection are required for emo-
tion identification using speech recognition. The method entails utilizing
machine learning to analyze speech parameters such as tone, energy, pitch,
formant frequency, and so on, and determining emotions based on changes
in these features.
Because voice signals can be obtained quickly and cheaply, ML-based
emotion identification via speech, also known as Speech Emotion Recogni-
tion (SER), is very popular. A good audio database, effective feature extrac-
tion, and the deployment of trustworthy classifiers employing ML techniques
and Natural Language Processing (NLP) are all required for speech emotion
recognition using machine learning.
Both feature extraction and feature selection are critical for reliable find-
ings. Then raw data is classified into a certain emotion class, based on the
features retrieved from the data, using various classification techniques such as the
Gaussian Mixture Model (GMM), Hidden Markov Model (HMM), Support
Vector Machine (SVM), Neural Networks (NN), and Recurrent Neural Networks
(RNN).
Major application areas for SER are audio surveillance, e-learning, clinical
studies, banking, entertainment, call-centers, gaming, and many more. For
example, emotion detection in e-learning helps understand students’ emo-
tions and modify the teaching techniques accordingly [Web 7].
Figure 2.5: Extracting speech features
[Web 9]
2.3.3 Body Gestures and Movements
With the help of machine learning, analyzing body movements and gestures
can also aid in emotion identification. With changes in emotions, our bodily
movements, posture, and gestures alter dramatically. This is why, based on
a mix of hand/arm gestures and body movements, we can usually infer a
person’s basic mood. A clenched fist with an alert stance, for example, is an
indication of rage.
Every shift in human mood is followed by a succession of gestures and
changes in body movement. With the use of proper Machine learning clas-
sifier algorithms and gesture sensors like Microsoft Kinect, OpenKinect, and
OpenNI, analyzing a combination of various gestures and body motions can
provide excellent insights into emotion recognition.
The process of emotion detection through body gestures and movements
involves the extraction of regions in relevant body parts, for example the
hands, to obtain a hand region mask. Contour analysis is then performed
on this region, which produces contours and convexity defects, and these
are used for classification. Five extended fingers imply open hands, and no
extended finger implies a fist.
Figure 2.6: Emotion recognition using hand movements
[Web 10]
2.3.4 Motor Behavioural Patterns
The changes in a person’s behavioral patterns with muscle tension, strength,
coordination, and frequency can also help characterize changes in their emo-
tional state when using the correct machine learning algorithms. As a result,
these are useful factors for machine learning-based emotion identification.
A cheerful state, for example, is shown by symmetric up and down hand
gestures. This method leverages the fact that our body muscles react sig-
nificantly to the changes in our emotional state, as a reflex action. While
we may not even be aware of how prominent these changes are, these mo-
tor behavioral changes, if recorded and analyzed properly through machine
learning techniques, act as great indicators for emotion detection.
Figure 2.7: ER using walking behavioural features
([Randhavane, 2020])
2.3.5 Biosignals
Emotion detection through biosignals is the process of analyzing biologi-
cal changes occurring with emotion changes. Biosignals include heart rate,
temperature, pulse, respiration, perspiration, skin conductivity, electrical im-
pulses in the muscles, and brain activity. For example, a rapidly increasing
heart rate indicates a state of stress or anxiety [Web 7].
These biosignals, also known as physiological signals, aid in gaining
knowledge on human physiological states. The problem is that a single biosig-
nal is insufficient because it can convey a variety of emotions. As a result,
several biosignals from various areas of the body are combined and examined
as a whole. These biosignal combinations are then analyzed using machine learning
techniques such as convolutional neural networks (CNN) and classification
algorithms such as regression tree, support vector machine, linear discrimi-
nant analysis, and Naive Bayes, among others. This technology is practical
since it is now possible to record and analyze biosignals using smart wear-
able devices. More complicated biosignals are also recorded for healthcare
purposes using electroencephalography (EEG), electrocardiography (ECG),
and electromyography (EMG).
Figure 2.8: Emotion recognition using biosignals
[Web 11]
In conclusion, the greatest results in emotion detection using machine learn-
ing may be obtained by combining two or more of these approaches. By
expanding the number of users, learning ability improves, and the obtained
data from these strategies enhances the results. EEG, as a breakthrough ma-
chine learning methodology for understanding emotions, has the potential to
be a game-changing method for treating millions of individuals all over the
world.
2.4 Existing applications for anxiety’s detec-
tion and treatment
2.4.1 Wysa: Depression and anxiety therapy chatbot
Wysa is a free, emotionally intelligent chatbot that uses AI to react to the
user's expressions of emotion. Talk to the cute penguin or use its
free mindfulness exercises for effective anxiety relief, depression and stress
management. With one of the highest ratings among health care apps (4.8/5) and
over 1 million downloads, Wysa obtained the ORCHA prize for best stress
apps (ORCHA: a British organization for testing and reviewing health
apps) and the Editor's Choice WMHD (World Mental Health Day) award, both for 2019.
Figure 2.9: Screenshot from the Penguin chatbot
[Web 12]
If the patient is dealing with stress, anxiety and depression or coping
with low self-esteem, then talking to Wysa can help them relax and get
unstuck: it is empathetic, helpful, and will never judge. Users can overcome
their mental health obstacles through empathetic conversation and a free
CBT-based (Cognitive Behavioral Therapy) technique. The app is used around
the clock and trusted by 1,000,000 people.
For extra support, people can get guidance from a real human coach,
a skilled psychologist who will take them through advanced coaching
sessions tailored to their needs.
Users can vent and talk through things or just reflect on their day with the AI
chatbot, and practice CBT and DBT (Dialectical Behavior Therapy) techniques
to build resilience in a fun way using one of 40 conversational coaching tools,
which help in dealing with stress, anxiety, depression, panic attacks, worry,
loss, or conflict, and in managing anxious thoughts and anxiety: deep breathing,
techniques for observing thoughts, visualization, and tension relief [Web 12].
In brief, this application gathers data through messages received from
users. AI algorithms improve their answers from the huge database.
2.4.2 Daylio:
Daylio is a highly flexible tool which we can use to track whatever we
want: exercise, meditation, eating, and gratitude, with a fitness goal buddy,
mental health coach, and food log. This program looks after mental,
emotional, and physical well-being. Self-care is essential for a better mood
and less worry. With a 4.6/5 rating from 320,000 users, Daylio has surpassed 10
million downloads.
Figure 2.10: Screenshots from Daylio
[Web 13]
This application is built on three principles:
1. Reach happiness and self-improvement by being mindful of our days.
2. Validate our hunches: how does our new hobby influence our life?
3. Form a new habit in an obstacle-free environment, with no learning curve.
Finally, this app does not gather any data as mentioned in its Google Play
page: “We don’t send your data to our servers. We don’t have access to your
entries. Also, any other third-party app can’t read your data.”
2.4.3 Headspace:
With over 10 million downloads and a 4.6/5 rating from 200 thousand users, this
piece of software is among the most popular mental health care applications.
Headspace is a mindfulness app that users may utilize in their daily lives.
Learn meditation and mindfulness methods from world-class experts and
build tools to help patients focus, breathe, stay calm, and create balance in
their lives, whether they require stress relief or sleep assistance.
Figure 2.11: Screenshots from Headspace
[Web 15]
The users will learn how to deal with tension and anxiety, as well as how
to calm their minds.
• Stress & anxiety meditation: Managing anxiety, letting go of stress
• Falling asleep & waking up meditation: Sleep, restlessness
• Work & productivity meditation: Finding focus, prioritization, pro-
ductivity, creativity and student’s meditations.
• Movement & sports meditation: Motivation, focus, training, competi-
tion, recovery
• Physical health mindfulness training: Mindful eating, pain manage-
ment, pregnancy, coping with cancer. [Web 14]
In brief, Headspace does not gather any data; it does not use chatbots or
surveys to collect user data.
2.4.4 Calm:
This app is a popular choice for meditation and sleep. With guided medita-
tions, Sleep Stories, breathing programs, masterclasses, and calming music,
millions of individuals enjoy reduced tension, anxiety, and more peaceful
sleep. Top psychiatrists, therapists, and mental health professionals have
endorsed the app, according to the creators.
Guided meditation sessions are available in lengths of 3, 5, 10, 15, 20 or
25 minutes, so users can choose the perfect length to fit their schedule.
Calming anxiety, managing stress, deep sleep, focus and concentration,
relationships, breaking habits, happiness, gratitude, self-esteem, body scan,
loving-kindness, forgiveness, non-judgment, commuting to work or school,
mindfulness at College, mindfulness at work, walking meditation, and calm
kids are just a few of the topics covered. [Web 16]
Figure 2.12: Mindful days follow-up schedule from Calm
[Web 17]
In sum, Calm concentrates on relaxing music, sleep stories and breathing
programs; the app does not use chatbots or gather any data to improve its
algorithms.
2.4.5 AntiStress, Relaxing, Anxiety Stress Relief
Game:
With more than 5 million installs and a 4.2/5 rating from 53 thousand users,
the app provides users with relaxation through satisfying games that are designed
around great concepts and full of relaxation toys, letting users create fun,
loving moments in a hectic routine. This 2021 relaxing game with color
therapy is for all ages. Users just need to download it and plunge into it for
unlimited fun and relaxation.
Figure 2.13: Many relaxing and coloring games to treat anxiety
[Web 19]
The app contains realistic 3D brain exercise and relaxation, different
mind-freshening toys, high-quality relaxing sounds to release stress, a realistic
experience of releasing stress in minutes, smooth controls to play with the 3D
fidget toys, and different relaxation toy missions [Web 18].
In brief, this application does not gather any data or information from
users.
2.4.6 Shine: Calm Anxiety Stress:
This application has more than 100 thousand downloads and a 4.8/5 rat-
ing. The application helps users learn a new self-care strategy every day,
get support from a diverse community, and access an audio library of 800+
original meditations, bedtime stories, and calming sounds to help patients
shift their mindset or mood, plus meditations specific to the mental health
challenges faced by members of marginalized groups [Web 20].
Topics include: Black well-being, calming anxiety, reducing stress, con-
fidence, growth, improving sleep, focus, burnout, forgiveness, self-love, mo-
tivation, creativity, finding joy, managing work frustrations, strengthening
relationships and creating healthy habits.
Figure 2.14: Screenshots from Shine app
[Web 21]
To summarize, Shine does not collect any information or data from its
users; it is just a classically programmed app [Web 20].
2.5 Available anxiety elicitation-based
datasets
Anxiety affects human capabilities and behavior as much as it affects pro-
ductivity and quality of life. It is considered to be the main cause of
depression and suicide. Anxious states are detectable by specialists by
virtue of their acquired cognition and skills. There is a need for non-
invasive, reliable techniques that perform the complex task of anxiety de-
tection. Several works [García-Martínez et al., 2017], [Arsalan et al., 2019],
and [Zhang et al., 2020] have been proposed to recognize anxious states. There is
no consensus about either the elicitation of anxious states or the
labels, which makes existing works very different and difficult to compare.
• Recently, a new dataset known as "DASPS" for anxiety level recog-
nition ([Baghdadi et al., 2020]), acquired from a low-cost portable EEG
device (EMOTIV EPOC) with 14 channels, was released. The EEG recordings
were taken from 23 participants. DASPS is characterized by a ther-
apeutic elicitation which triggers different levels of anxiety in partici-
pants by self-recall of stressful situations. To assign labels at two and
four levels, the Hamilton score was taken from a questionnaire filled in
before and after the experiment.
• In the same context, Arsalan et al. ([Arsalan et al., 2019]) carried out a
psychological experiment on 28 participants by recording EEG signals
using a low-cost portable EEG device (MUSE: a wearable brain-
sensing headband that measures brain activity via 4 elec-
troencephalography (EEG) sensors). Preparing an oral presentation was used
as the stressful activity to trigger perceived mental stress. Three sessions were
recorded: pre-activity, when participants are in a resting position;
activity, when they prepare the presentation; and post-activity, for the
public oral presentation. Arsalan et al. showed that only the pre-activity
EEG recordings are well correlated with two and three stress levels, re-
spectively. In the classification task, only pre-activity EEG signals are
considered.
• Anxiety disorder is also recognized through the Healthy Brain Network (HBN)
dataset [Alexander et al., 2017], launched by the American Institute of
Child Psychology, which includes data collected from children and ado-
lescents (ages 5 to 17) in New York City. HBN was proposed to di-
agnose and intervene in the mental health of minors. The dataset
also contains eye movements and large EEG recordings. Zhang et al.
[Zhang et al., 2020] selected 92 subjects (where 45 children are consid-
ered anxious and 47 children normal) to conduct experiments ac-
cording to the Screen for Child Anxiety Related Disorders (SCARED)
scale. Zhang et al. [Zhang et al., 2020] extracted PSD (Power Spectral
Density) features from the Gamma band and transformed them using a
newly proposed Group Sparse Canonical Correlation Analysis (GSCCA)
to achieve 82.70% accuracy with an SVM classifier.
2.5.1 Problem of imbalanced dataset
First, let us explain clearly what a balanced dataset is. Consider flower pic-
tures as positive values and tree pictures as negative values; in a balanced
dataset, the numbers of positive and negative values are approximately the
same, whereas in an imbalanced dataset there is a very large difference between
the positive and negative values. Using an imbalanced dataset will produce
wrong learning and, finally, wrong classification results. For this reason, we
must generate a sufficient amount of new data for each category before
beginning the training and testing steps, in order to obtain precise results
with high accuracy; a hypothetical sketch of such a check follows the figure below.
Figure 2.15: Gap in results between balanced and imbalanced datasets
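As a purely hypothetical illustration of this check, the sketch below counts the samples per class and computes how many synthetic samples each class would need in order to match the majority class; the label array is invented for the example.

from collections import Counter
import numpy as np

# Invented labels: 0 = light, 1 = moderate, 2 = severe anxiety, 3 = normal.
y = np.array([0] * 120 + [1] * 80 + [2] * 30 + [3] * 10)

counts = Counter(y.tolist())
target = max(counts.values())  # balance every class up to the majority class
needed = {cls: target - n for cls, n in sorted(counts.items())}
print("class counts:", dict(counts))
print("samples to generate per class:", needed)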
2.6 Conclusion
The EEG-based emotion recognition task is crucial for human daily life,
especially for maintaining mental health and catching warning signs at an
early stage, before the situation worsens. In this chapter, we detailed several
concepts related to the EEG-based emotion recognition task. The specificities
of EEG signals, including how the anxiety state is represented, were explained.
We briefly surveyed the machine learning methods used for emotion recognition,
then mentioned the most famous existing applications for anxiety detection
and treatment, and finally presented the available EEG datasets
from several scientific experiments, such as the DASPS dataset.
In the next chapter, a data augmentation step is proposed in order to
provide sufficient data for the training of the neural network.
Chapter 3
EEG data augmentation using
CVAE
3.1 Introduction
Unsupervised learning is modeling the underlying structure or distribution
in the data in order to learn more about it. These methods are called unsupervised
learning because, unlike supervised learning, there are no correct answers and
there is no teacher. Algorithms are left to their own devices to discover and
present the interesting structure in the data.
The variational autoencoder, as a generative model, is based on unsupervised
learning: it learns the structure of the input data with the aim of generating
new data similar to the original real data.
3.2 The autoencoder
Autoencoders (AE) are a family of neural networks for which the output
data is the same as the input data. They work by compressing the
input into a latent-space representation, and then reconstructing the output
from this representation. The general idea of autoencoders is simple: it
consists of setting an encoder and a decoder as neural networks and learning
the best encoding-decoding scheme using an iterative optimization process.
The search for the encoder and decoder that minimize the reconstruction error is
done by gradient descent over the parameters of these networks. Figure 3.1
depicts the encoder and decoder of an autoencoder network.
Notice that gradient descent is an optimization algorithm for finding
a local minimum of a differentiable function. Gradient descent is used to
find the values of a function's parameters (coefficients) that minimize a cost
function as far as possible.
Notice also that dimensionality reduction refers to techniques that reduce the
number of input variables in a dataset.
The more complex the architecture is, the higher the dimensionality reduction
the autoencoder can achieve while keeping the reconstruction loss
low.
Figure 3.1: The autoencoder architecture
[Web 22]
Second, most of the time, the final purpose of dimensionality reduction
is not only to reduce the number of dimensions of the data, but to do so
while keeping the major part of the data struc-
ture information in the reduced representations. For these two reasons, the
dimension of the latent space and the "depth" of the autoencoder (which define
the degree and the quality of compression) have to be carefully controlled and
adjusted depending on the final purpose of the dimensionality reduction.
Figure 3.2: The latent space regularization problem
[Web 23]
The regularity of the latent space of autoencoders is a difficult point that
depends on the distribution of the data in the initial space, the dimension
of the latent space and the architecture of the encoder. The high degree of
freedom of the autoencoder, which makes it possible to encode and decode with
no information loss (despite the low dimensionality of the latent space), leads
to severe overfitting.
CHAPTER 3. EEG DATA AUGMENTATION USING CVAE 24
According to Figure 3.2, we can notice that the problem of the autoen-
coder's latent space regularity is much more general and needs special atten-
tion. Indeed, the autoencoder is not trained to enforce such an organisation:
it is solely trained to encode and decode with as little loss as
possible, no matter how the latent space is organised.
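To make the idea concrete, here is a minimal autoencoder sketch in Keras; the layer sizes and the flattened input dimension are arbitrary assumptions for illustration, not the architecture used later in this work.

import tensorflow as tf
from tensorflow.keras import layers, Model

input_dim, latent_dim = 1920, 32  # assumed: flattened EEG segment -> 32-D code

inputs = tf.keras.Input(shape=(input_dim,))
encoded = layers.Dense(256, activation="relu")(inputs)    # encoder
code = layers.Dense(latent_dim, activation="relu")(encoded)
decoded = layers.Dense(256, activation="relu")(code)      # decoder
outputs = layers.Dense(input_dim, activation="linear")(decoded)

autoencoder = Model(inputs, outputs)
autoencoder.compile(optimizer="adam", loss="mse")  # reconstruction error
# Training reconstructs the input itself: autoencoder.fit(x, x, epochs=50)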
3.3 Variational AutoEncoder (VAE)
Variational autoencoders (VAEs) are a deep learning technique for learning
latent representations. They have also been used to draw images and gener-
ate new data in semi-supervised learning. The VAE is a generative model that
estimates the Probability Density Function (PDF) of the training data. The
fundamental property that separates it from standard autoencoders,
and makes it so useful for generative modeling, is that its latent space
is, by design, continuous, allowing easy random sampling and interpolation.
In brief, the VAE has more parameters to tune, which gives significant control over
how we want to model our latent distribution and therefore yields meaningful
outputs of high quality.
The VAE training objective is to maximize the likelihood of the training
data as described by equation 3.1, according to the model shown in Figure
3.3, where x is the input, z is the latent vector or the hidden representation
and θ represents the network parameters.
Figure 3.3: The variational autoencoder model
[Web 23]

\[
p_\theta(x) = \int p_\theta(z)\, p_\theta(x \mid z)\, dz \tag{3.1}
\]

The choice of the output distribution is mainly Gaussian, i.e. $p(x \mid z; \theta) = \mathcal{N}(x \mid f(z; \theta), \sigma^2 I)$, where $f(z; \theta)$ is modeled using a neural network and $\sigma$ is a hyperparameter that multiplies the identity matrix $I$.

The formula for $p_\theta(x)$ is intractable because it requires exponential time to compute, as it needs to be evaluated over all configurations of latent variables. To solve this problem, an additional encoder network $q_\phi(z \mid x)$ is defined to approximate the true posterior $p_\theta(z \mid x)$. The marginal likelihood of an individual data point can then be rewritten as follows:

\[
\log p_\theta(x^{(i)}) = D_{KL}\left(q_\phi(z \mid x^{(i)}) \,\|\, p_\theta(z \mid x^{(i)})\right) + \mathcal{L}(\theta, \phi, x^{(i)}) \tag{3.2}
\]

The first term of equation 3.2 is the KL (Kullback-Leibler) divergence between the approximate posterior and the true posterior; the KL divergence intuitively measures how similar two distributions are. The second term is the variational lower bound on the marginal likelihood of data point $i$. Since the Kullback-Leibler divergence is always greater than or equal to zero, minimizing it is equivalent to maximizing the variational lower bound, and equation 3.2 can be rewritten as follows:

\[
\log p_\theta(x^{(i)}) \geq \mathcal{L}(\theta, \phi, x^{(i)}) \tag{3.3}
\]

\[
\mathcal{L}(\theta, \phi, x^{(i)}) = \mathbb{E}_{z \sim q_\phi(z \mid x^{(i)})}\left[\log p_\theta(x^{(i)} \mid z)\right] - D_{KL}\left[q_\phi(z \mid x^{(i)}) \,\|\, p_\theta(z)\right] \tag{3.4}
\]

The loss function for this network (the negative of equation 3.4) consists of two terms: the first penalizes the reconstruction error and the second encourages the learned distribution $q_\phi(z \mid x^{(i)})$ to be similar to the true prior distribution $p_\theta(z)$.
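Expressed in code, the two terms of the (negated) lower bound of equation 3.4 could be computed as in the sketch below, assuming a Gaussian approximate posterior, a standard-normal prior, and encoder outputs z_mean and z_log_var; this is an illustration, not the exact loss used in our experiments.

import tensorflow as tf

def vae_loss(x, x_reconstructed, z_mean, z_log_var):
    # Reconstruction term of eq. 3.4 (here a squared-error surrogate).
    reconstruction = tf.reduce_mean(
        tf.reduce_sum(tf.square(x - x_reconstructed), axis=-1))
    # KL term: D_KL(q(z|x) || N(0, I)) in closed form for Gaussians.
    kl = -0.5 * tf.reduce_mean(
        tf.reduce_sum(1 + z_log_var - tf.square(z_mean) - tf.exp(z_log_var),
                      axis=-1))
    return reconstruction + kl  # negative ELBO, to be minimized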
The VAE architecture is presented in Figure 3.4, where the encoder model
learns a mapping from $x$ to $z$ and the decoder model learns a mapping from
$z$ back to $x$. The encoder outputs two vectors describing the mean $\mu_{z|x}$
and the standard deviation $\sigma_{z|x}$ of the latent state distributions. The de-
coder generates a latent vector by sampling from these distributions
and proceeds to produce a reconstruction of the original input. Using the
backpropagation technique to optimize the loss is not directly feasible because the
sampling process is random. To solve this problem, a "reparameterization
trick" is used, which consists of randomly sampling $\epsilon$ from a standard normal
distribution, multiplying it by the standard deviation $\sigma_{z|x}$ and adding the
mean $\mu_{z|x}$ to the result, as described in Figure 3.5.
Figure 3.4: The optimisation of the variational autoencoder model
[Web 23]
Figure 3.5: The reparameterization trick
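In practice, the reparameterization trick is often implemented as a small sampling layer; the following is a sketch in Keras, assuming the encoder predicts the log-variance rather than the standard deviation directly.

import tensorflow as tf
from tensorflow.keras import layers

class Sampling(layers.Layer):
    # Draws z = mean + std * epsilon, with epsilon ~ N(0, I).
    def call(self, inputs):
        z_mean, z_log_var = inputs
        epsilon = tf.random.normal(shape=tf.shape(z_mean))
        # exp(0.5 * log_var) recovers the standard deviation.
        return z_mean + tf.exp(0.5 * z_log_var) * epsilon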
3.4 The gap between AE and VAE
An autoencoder accepts input, compresses it, and then recreates the original
input. This is an unsupervised technique because all we need is the original
data, without any labels of known, correct results. The two main uses of
an autoencoder are to compress data to two (or three) dimensions so it can
be graphed, and to compress and decompress images or documents, which
removes noise in the data. A variational autoencoder assumes that the source
data has some sort of underlying probability distribution (such as a Gaussian)
and then attempts to find the parameters of that distribution. Implementing
a variational autoencoder is much more challenging than implementing an
autoencoder. The main use of a variational autoencoder is to generate
new data related to the original source data; exactly what the
additional data is good for is hard to say in general. A variational autoencoder is a
generative system, and serves a similar purpose to a generative adversarial
network (although GANs work quite differently) [Web 24]. To conclude, if we
want precise control over our latent representations and what we would like
them to represent, then the VAE is the right choice [Web 23].
Figure 3.6: Real difference between AE and VAEs on MNIST dataset
[Web 25]
3.5 Conditional Variational Autoencoder
The only problem with generating data using variational autoencoders is that
we do not have any control over what sort of data they generate. To explain
the principle: when we train a VAE on the EEG dataset and try
to produce new signals by feeding $z \sim \mathcal{N}(0, 1)$ into the decoder, it
generates random outputs. If we train the decoder well, we will have
better signal quality, but we will still have no control over precisely which
EEG it will produce; we cannot decide exactly what we want to get
in the output.
Figure 3.7: The network of the CVAE. Enc and Dec denote the encoder and
the decoder; the figure also shows one real sample, its real label, a generated
sample, the mean value, the standard deviation, and the resampled noise.
For this, we should change our VAE architecture. Given an input $y$ (the label
of the EEG), we want our generative model to produce an output $x$ (the EEG). Thus,
the VAE process is modified as follows: given the observation $y$, $z$ is
drawn from the prior distribution $p_\theta(z \mid y)$, and the output $x$ is produced from
the distribution $p_\theta(x \mid y, z)$. Note that, for the simple VAE, the prior is $p_\theta(z)$ and
the output is produced by $p_\theta(x \mid z)$.
Therefore, the encoder tries to learn $q_\phi(z \mid x, y)$, which is equivalent to
learning a hidden representation of the data $x$, i.e. encoding $x$ into the hidden
representation conditioned on $y$. The decoder tries to learn $p_\theta(x \mid z, y)$,
decoding the hidden representation back to input space conditioned on $y$.
The graphical model is shown below.
Figure 3.8: The architecture of a Conditional Variational Autoencoder
[Web 26]
In this method, we aim to generate data of a specific category. As
shown in Figure 3.8, to control the generated category, an extra label $y$ is
added to the encoder and decoder. First, we feed the training data point and
the corresponding label to the encoder; second, we concatenate the hidden
representation with the corresponding label and feed it to the decoder to
train the network. Third, we can generate data with a specific label by
feeding the decoder with noise sampled from the Gaussian distribution
together with the assigned label.
3.6 Coding in COLAB
Our code in COLAB is divided into 4 parts:
1. Create a sampling layer.
2. Define the standalone encoder model.
3. Define the standalone decoder model.
4. Define the CVAE as a Model with a custom training step. The generation
process is performed with the sampling layer followed by the decoder;
we generated different samples, with labels encoded as 4-bit one-hot vectors.
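A minimal sketch of the encoder/decoder conditioning and of the generation step is given below; the layer sizes, signal shape and variable names are assumptions for illustration, not the exact values of our COLAB notebook. During training, the sampling layer sketched in Section 3.3 would sit between the encoder outputs and the decoder.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

latent_dim = 8              # assumed latent size
n_classes = 4               # anxiety levels, one-hot ("4 bits")
sig_len, n_ch = 1920, 14    # assumed EEG segment shape (samples x channels)

# Encoder: EEG segment + label -> (z_mean, z_log_var).
x_in = keras.Input(shape=(sig_len, n_ch))
y_in = keras.Input(shape=(n_classes,))
h = layers.Flatten()(x_in)
h = layers.Concatenate()([h, y_in])        # condition on the label
h = layers.Dense(128, activation="relu")(h)
z_mean = layers.Dense(latent_dim, name="z_mean")(h)
z_log_var = layers.Dense(latent_dim, name="z_log_var")(h)
encoder = keras.Model([x_in, y_in], [z_mean, z_log_var], name="encoder")

# Decoder: latent vector + label -> reconstructed EEG segment.
z_in = keras.Input(shape=(latent_dim,))
yd_in = keras.Input(shape=(n_classes,))
d = layers.Concatenate()([z_in, yd_in])    # condition on the label again
d = layers.Dense(128, activation="relu")(d)
d = layers.Dense(sig_len * n_ch)(d)
x_out = layers.Reshape((sig_len, n_ch))(d)
decoder = keras.Model([z_in, yd_in], x_out, name="decoder")

# Generation: sample z ~ N(0, I) and pick the desired class label.
z = np.random.normal(size=(1, latent_dim)).astype("float32")
label = np.eye(n_classes, dtype="float32")[[2]]   # e.g. the third class
fake_eeg = decoder.predict([z, label])
print(fake_eeg.shape)       # (1, 1920, 14)
```

The design point is that the label is concatenated twice, once at the encoder input and once at the decoder input, which is what gives us control over the class of the generated sample.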
3.7 Evaluation of the generation process
It is important to ensure that the generated samples are of high quality,
in other words that they are realistic and diverse. A lack of diversity among
the generated samples is an indicator of mode collapse, meaning that the
generator has collapsed into producing only a limited number of modes of the
real data. For this reason, we use several qualitative and quantitative metrics
to evaluate the quality of the samples generated by the VAE in terms of
diversity and similarity with the real samples.
Visualization:
Before we start, we have to define t-SNE: t-Distributed Stochastic Neighbor
Embedding (t-SNE) is an unsupervised, non-linear technique primarily
used for data exploration and for visualizing high-dimensional data. In
simpler terms, t-SNE gives a feel or intuition of how the data is arranged in a
high-dimensional space.
Figure 3.9: t-SNE representation of real and generated data
Now, we visually inspect the quality of the artificial samples by mapping
the generated and real samples into two dimensions using t-SNE and their
temporal distribution. t-SNE is applied to map the high-dimensional real
(training) and generated EEG samples into 2-D space. Figure 3.9 displays a
2D plot of the anxiety classes in the latent space. It can be seen that the
t-SNE embeddings of the real and generated samples are similar; for each
class, the real and generated samples have similar distributions. Besides
comparing the generated samples with the training samples, it is interesting
to compare them with the test samples. In fact, the similarity between the
generated and the test samples explains the classification improvement made
by the augmentation. The training set includes the samples from all subjects
excluding the target subject, the generated set includes the generated samples
for the target subject, and the test set includes the second half of the target
subject's samples that were not seen during training. Thus, there is no overlap
between the training, test, and generated sets. The results verified that the
generated samples were indeed realistic and diverse.
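A minimal sketch of this inspection step, assuming the real and generated trials have been flattened into feature vectors (the arrays below are random placeholders, not our data):

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

real_eeg = np.random.randn(100, 200)   # placeholder for flattened real trials
gen_eeg = np.random.randn(100, 200)    # placeholder for flattened generated trials

# Embed real and generated samples jointly so distances are comparable.
X = np.vstack([real_eeg, gen_eeg])
emb = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)

n = len(real_eeg)
plt.scatter(emb[:n, 0], emb[:n, 1], s=10, label="real")
plt.scatter(emb[n:, 0], emb[n:, 1], s=10, label="generated")
plt.legend()
plt.title("t-SNE of real vs generated EEG samples")
plt.show()
```

If the two point clouds overlap rather than forming separate islands, the generated samples occupy the same regions of feature space as the real ones, which is the visual evidence of realism and diversity we look for.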
Figure 3.10: Topographical map of real and generated data
To evaluate the quality of the generated data, we plotted the topograph-
ical map of the generated data and the real data. In general, the signals are
similar.
3.8 Conclusion
In summary, due to the lack of a sufficient amount of data for good machine-
learning prediction, we used variational autoencoders to generate new data.
An autoencoder is a neural network that is trained to attempt to copy its
input to its output. The problem with plain variational autoencoders is that
we have no control over what sort of data they generate, so we decided to use
Conditional VAEs as a solution to improve the outputs.
Chapter 4
Emotion recognition using
recurrent CNN
4.1 Introduction
The problem of classifying multi-channel Electroencephalogram (EEG)
time series consists in assigning their representation to one of a fixed
number of classes. This is a fundamental task in many health-
care applications, including anxiety detection ([Baghdadi et al., 2017],
[Fourati et al., 2020] and [Baghdadi and Aribi, 2019]), epileptic seizures pre-
diction [Tsiouris et al., 2018] and also affective computing applications such
as EEG-based emotion recognition ([Fourati et al., 2020]). The problem
has been tackled by a wealth of different approaches, spanning from
the signal decomposition techniques of EEG signals to the feature ex-
traction and feature selection algorithms as highlighted in the surveys
[Baghdadi et al., 2016], [Movahedi et al., 2017], [Mahmud et al., 2018], and
[García-Martínez et al., 2019].
Representation learning, or feature learning [Bengio et al., 2013], consists
in automatically discovering the relevant representations for a classification
or detection task directly from raw data. Consequently, laborious handcrafted
features are no longer needed, since representation learning permits both
learning the features and using them to perform a specific task.
In this chapter, we focus on EEG representation learning for anxiety
states recognition using recurrent convolutional neural network in a subject-
independent context.
4.2 Convolutional Neural Network
4.2.1 CNN principle
Convolutional Neural Network (CNN) is a deep neural network originally
designed for image analysis. Recently, it was discovered that the CNN also
has an excellent capacity for sequence data analysis, such as natural language
processing. A CNN always contains two basic operations, namely convolution
and pooling. The convolution operation, using multiple filters, is able to
extract features (e.g., edges) from the data set, through which the corresponding
spatial information can be preserved. The pooling operation, also called
sub-sampling, is used to reduce the dimensionality of the feature maps produced
by the convolution operation. Max pooling and average pooling are the most
common pooling operations used in CNNs. Due to the complexity of CNNs,
ReLU is the common choice of activation function for transferring the gradient
in training by backpropagation [Jitendra Verma, 2020].
4.2.2 CNN applications
Convolutional neural networks (CNNs) are most often utilized for classifica-
tion and computer vision tasks such as pedestrian and object detection for
self-driving cars, face recognition on social media or for securing mobile phones,
image analysis in healthcare (detecting tumours and diseases), quality in-
spection in manufacturing, airport security, improving results in search
engines, recommender systems (as in YouTube, Amazon, Facebook, etc.),
emotion recognition, and stock and currency value prediction.
Figure 4.1: Charles Camiel looks into the camera for a facial recognition test
at Logan International Airport in Boston
[Web 28]
4.2.3 CNN Architecture
In general, a CNN architecture consists of four kinds of layers: the convolutional
layer, the pooling layer, the dense (fully connected) layer and the output layer.
• Convolutional layer: The convolutional layer is the backbone of any
CNN model. This layer is where the image is scanned pixel by pixel
and a feature map is created to support later classification.
• Pooling layer: Pooling is also known as down-sampling of the data,
reducing the overall dimensions of the images. The information from
each feature map produced by a convolutional layer is reduced to only
the most necessary data. The process of creating convolutional layers
and applying pooling may be repeated several times.
• Fully connected input layer: This is also known as flattening
the images. The outputs of the last layer are flattened into a single
vector so that they can be used as the input data for the
upcoming layer.
• Fully connected layer: After the feature analysis has been done and
it is time for computation, this layer assigns (initially random) weights
to the inputs and predicts a suitable label.
• Fully connected output layer: This is the final layer of the CNN
model, which contains the results of the labels determined for the
classification and assigns a class to the images.
Figure 4.2: CNN architecture
[Web 29]
4.2.4 Difference between Conv1D and Conv2D
We want to show the difference between Conv1D and Conv2D because
in our code we relied on Conv2D.
• For Conv1D, the kernel moves in only one direction. The input
and output data of Conv1D are 2-dimensional (time and a variable Y).
It is mainly used for time-series data like audio, text, acceleration, etc.
• For Conv2D, the kernel slides along two dimensions of the data,
i.e., it moves in 2 directions (height and width). The input and output
data of Conv2D are 3-dimensional (height, width and depth). It is
mainly used for image data. The kernel matrix can extract spatial
features from the data; it detects edges, colour distribution, etc.
Figure 4.3: Conv2D kernel sliding
[Web 30]
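The shape difference can be checked directly in Keras; the shapes below are illustrative, not those of our model:

```python
import tensorflow as tf
from tensorflow.keras import layers

# Conv1D: input (batch, time, channels); the kernel moves along time only.
x1 = tf.random.normal((1, 128, 14))              # 128 samples, 14 EEG channels
y1 = layers.Conv1D(filters=8, kernel_size=3)(x1)
print(y1.shape)                                   # (1, 126, 8)

# Conv2D: input (batch, height, width, depth); the kernel moves in 2 directions.
x2 = tf.random.normal((1, 14, 128, 1))            # channels x time as a 2D "image"
y2 = layers.Conv2D(filters=8, kernel_size=(3, 3))(x2)
print(y2.shape)                                   # (1, 12, 126, 8)
```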
4.3 Recurrent Neural Network
4.3.1 Definition
A recurrent neural network (RNN) is a type of artificial neural network which
uses sequential data or time-series data. Like feedforward networks and CNNs,
RNNs utilize training data to learn. They are distinguished by their "memory",
as they take information from prior inputs to influence the current input
and output. While traditional deep neural networks assume that inputs
and outputs are independent of each other, the output of a recurrent neural
network depends on the prior elements within the sequence. While future
events would also be helpful in determining the output of a given sequence,
unidirectional recurrent neural networks cannot account for these events in
their predictions.
4.3.2 RNN Applications
RNNs are commonly used for ordinal or temporal problems, such as language
translation, natural language processing (NLP), speech recognition, and im-
age captioning; they are incorporated into popular applications such as Siri,
voice search, and Google Translate.
Figure 4.4: Difference between RNNs (left) and feedforward neural networks (right)
[Web 31]
4.3.3 Long Short Term Memory layer LSTM
The problem with RNNs is that, as time passes and they are fed more
and more new data, they start to "forget" the earlier data they have
seen, in what is called the vanishing gradient problem, so we need some sort of
long-term memory, which is just what LSTMs provide. The core concept
of the LSTM is the cell state and its various gates. The LSTM is a type of cell in
a recurrent neural network used to process sequences of data in applications
such as handwriting recognition, machine translation, and image captioning.
LSTMs address the vanishing gradient problem that occurs when train-
ing RNNs on long data sequences by maintaining history in an internal
memory state based on new input and context from previous cells in the
RNN. The cell state, as illustrated in Figure 4.5, acts as a transport highway
that transfers relevant information all the way down the sequence chain. It
can be seen as the "memory" of the network. The cell state, in theory, can
carry relevant information throughout the processing of the sequence, so
even information from earlier time steps can make its way to later time
steps, reducing the effects of short-term memory. As the cell state goes on its
journey, information is added to or removed from it via gates. The
latter are small neural networks that decide which information is allowed
on the cell state; during training, the gates learn which information is
relevant to keep or forget.
Figure 4.5: Inside the LSTM cell
[Web 32]
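For reference, the standard LSTM gate equations (as commonly given in the literature, with σ the sigmoid, ⊙ the element-wise product, and [h_{t-1}, x_t] the concatenation of the previous hidden state and the current input) are:

```latex
\begin{align*}
f_t &= \sigma(W_f\,[h_{t-1}, x_t] + b_f) && \text{forget gate} \\
i_t &= \sigma(W_i\,[h_{t-1}, x_t] + b_i) && \text{input gate} \\
\tilde{C}_t &= \tanh(W_C\,[h_{t-1}, x_t] + b_C) && \text{candidate cell state} \\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t && \text{cell state update} \\
o_t &= \sigma(W_o\,[h_{t-1}, x_t] + b_o) && \text{output gate} \\
h_t &= o_t \odot \tanh(C_t) && \text{hidden state}
\end{align*}
```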
4.4 The proposed architecture for anxiety
states recognition
In this section, we detail the components of Figure 4.6. The preprocessed EEG
signals are fed directly to the CNN-LSTM model. A convolutional block is
composed of a convolutional layer followed by a pooling layer. This block is
responsible for the spatial encoding of the EEG time series. Actually, there is a
relation between channels, so we did not perform a 1D convolution, since its
kernel would be convolved with each channel separately. Instead, a 2D
convolution is performed, which operates on the channels together. The
encoded CNN features are then fed to an LSTM layer for temporal parsing.
Each LSTM cell encodes the CNN features along the time axis and forwards
them to the next cell. The LSTM produces the last output activation, which
is then classified with a softmax layer. Our architecture is thus a
spatio-temporal processing of the EEG time series.
Figure 4.6: CNN-LSTM architecture for anxiety states recognition
4.5 Anxiety states recognition results
4.5.1 Experimental setup
We chose Colab because it allows anybody to write and execute arbitrary
Python code through the browser, and it is especially well suited to machine
learning and data analysis. The main advantages of using this environment
are: free access to GPUs, zero configuration required, and easy access and
sharing with other users.
Our model comprises four convolutional blocks and a recurrent head,
organized as follows (a minimal Keras sketch is given after this list):
• Input layer, Conv2D layer, batch normalization, leaky ReLU activation
function, Max pooling 2D layer and dropout.
• Conv2D layer, batch normalization, leaky ReLU activation function,
Max pooling 2D layer
Figure 4.7: CNN LSTM architecture description
• Conv2D layer, batch normalization, leaky ReLU activation function,
Max pooling 2D layer
• Conv2D layer, batch normalization, leaky ReLU activation function,
Max pooling 2D layer
• Flattening layer, reshape, LSTM, dropout, first dense layer and second
dense layer (Output).
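The following is a minimal Keras sketch of this CNN-LSTM stack; the filter counts, pooling sizes, dropout rates and input shape are assumptions for illustration, not the exact thesis hyperparameters.

```python
from tensorflow.keras import layers, models

n_ch, n_samples, n_classes = 14, 1920, 4   # assumed: 14 channels, 15 s at 128 Hz

inp = layers.Input(shape=(n_ch, n_samples, 1))
x = inp
for filters in (16, 32, 64, 128):          # four convolutional blocks
    x = layers.Conv2D(filters, (3, 3), padding="same")(x)
    x = layers.BatchNormalization()(x)
    x = layers.LeakyReLU()(x)
    x = layers.MaxPooling2D(pool_size=(1, 2))(x)   # pool along time only
x = layers.Dropout(0.3)(x)

# Rearrange the CNN feature maps so the LSTM parses them along the time axis.
x = layers.Permute((2, 1, 3))(x)           # (time, channels, filters)
x = layers.Reshape((x.shape[1], -1))(x)    # (time, channels * filters)
x = layers.LSTM(64)(x)                     # keep only the last output activation
x = layers.Dropout(0.3)(x)
x = layers.Dense(32, activation="relu")(x)
out = layers.Dense(n_classes, activation="softmax")(x)

model = models.Model(inp, out)
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```

Pooling only along the time axis preserves the channel dimension, so the spatial relation between electrodes survives all four blocks before the LSTM performs the temporal parsing.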
4.5.2 Classification results without data augmentation
To prove the usefulness of the data augmentation approach, we first classify
only the original EEG signals.
The training and validation loss curves are depicted in Figure 4.8.
• The red curve refers to the training loss which is the error on the
training data set.
• The blue curve refers to the validation loss which is the error after
running the validation set of data through the trained network.
Figure 4.8: Training and validation loss
• If validation loss > training loss, we can call it overfitting.
• If validation loss < training loss, we can call it underfitting.
While there are some fluctuations in the validation curve, the training step
ends with a small gap between training loss and validation loss.
Figure 4.9 presents the training and validation accuracy curve. The train-
ing set is used to train the model, while the validation set is only used to
evaluate the model’s performance.
Training accuracy is the accuracy we get if we apply the model to the
training data, while validation or testing accuracy is the accuracy on unseen
data. Validation accuracy is lower than training accuracy because the EEG
training data is something with which the model is already familiar, whereas
the validation data is a collection of data points that are new to the model.
The fluctuations in the validation loss curve are also present in the validation
accuracy curve, but the training and validation accuracy curves still remain
close to each other.
In the field of machine learning and specifically the problem of statistical
classification, a confusion matrix, also known as an error matrix, is a specific
table layout that allows visualization of the performance of an algorithm.
Each row of the matrix represents the instances in an actual class while
each column represents the instances in a predicted class, or vice versa –
Figure 4.9: Training and validation accuracy of CNN-LSTM on DASPS
dataset
both variants are found in the literature. The name stems from the fact
that it makes it easy to see whether the system is confusing two classes (i.e.
commonly mislabeling one as another). It is a special kind of contingency
table, with two dimensions (”actual” and ”predicted”), and identical sets of
”classes” in both dimensions (each combination of dimension and class is a
variable in the contingency table).
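A minimal sketch of computing and plotting such a matrix with scikit-learn, using placeholder labels for the four anxiety states (not our actual test results):

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay

states = ["normal", "light", "moderate", "severe"]
y_true = np.random.randint(0, 4, size=200)   # placeholder ground-truth labels
y_pred = np.random.randint(0, 4, size=200)   # placeholder model predictions

# Rows are actual classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_pred)
ConfusionMatrixDisplay(cm, display_labels=states).plot()
plt.show()
```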
Figure 4.10: Confusion matrix of CNN-LSTM on DASPS dataset
When showing the test data to the trained model, the CNN-LSTM
achieves 89.96% accuracy. According to Figure 4.10, the model achieves its
highest accuracy on the normal anxiety state, and its lowest accuracy on the
light anxiety state. There is no confusion between severe trials and light
trials; the highest confusion occurs between moderate trials and normal
trials.
4.5.3 Classification results with data augmentation
To begin this part, we should discuss DASPS (A Database for Anxious States
based on a Psychological Stimulation). DASPS is a database that comprises
EEG signals for detecting anxiety levels. The electroencephalogram (EEG)
signals of 23 subjects were captured during fear elicitation using face-to-
face psychological cues. This work is innovative not only in making EEG
data available to the affective computing community, but also in the design
of a psychological stimulation protocol that provides comfortable conditions
for participants in direct interaction with the therapist, as well as in the use
of a wireless EEG cap with fewer channels, namely only 14 dry electrodes.
The raw EEG data obtained from the 23 individuals is stored in .edf files
in the database, which also contains preprocessed data in .mat format.
The researchers offered a MATLAB script for segmenting each EEG signal
into six segments, one for each of the six scenarios.
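A minimal sketch of loading one preprocessed trial with SciPy; the file name and variable key below are assumptions about the .mat layout, not the documented DASPS schema:

```python
import scipy.io as sio

mat = sio.loadmat("S01preprocessed.mat")   # hypothetical file name
print(mat.keys())                           # inspect the available variables

data = mat.get("data")                      # hypothetical key holding the EEG array
if data is not None:
    print("EEG array shape:", data.shape)   # e.g. trials x channels x samples
```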
Figure 4.11: Uploading data to GUI by therapist
4.6 Graphical user interface
A GUI (graphical user interface) is a system of interactive visual components
for computer software. A GUI displays objects that convey information, and
represent actions that can be taken by the user. The objects change color,
size, or visibility when the user interacts with them.
Figure 4.12: GUI showing EEG brain map in side a and emotion prediction
in side b
In Figure 4.11, the therapist can upload the recorded EEG signals (a
pre-processed trial saved as .mat file).
Then, the system loads the trial and plots the topographical map, which
helps visualize the activated brain regions, as illustrated in side a of Figure
4.12.
The last interface classifies the anxiety state of the subject. The trained
CNN-LSTM model is saved and then called for every new trial to classify.
The anxiety state is depicted in side b of Figure 4.12.
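A minimal Tkinter sketch of this upload-and-classify flow; the model path, the .mat variable key and the preprocessing are placeholders rather than our exact implementation:

```python
import tkinter as tk
from tkinter import filedialog

import numpy as np
import scipy.io as sio
from tensorflow import keras

STATES = ["normal", "light", "moderate", "severe"]
model = keras.models.load_model("cnn_lstm_anxiety.h5")   # hypothetical path

def upload_and_classify():
    # Let the therapist pick a preprocessed trial saved as a .mat file.
    path = filedialog.askopenfilename(filetypes=[("MAT files", "*.mat")])
    if not path:
        return
    trial = sio.loadmat(path)["data"]        # hypothetical key: channels x samples
    x = trial[np.newaxis, ..., np.newaxis]   # add batch and depth dimensions
    pred = STATES[int(np.argmax(model.predict(x)))]
    result_label.config(text=f"Predicted anxiety state: {pred}")

root = tk.Tk()
root.title("Anxiety state recognition")
tk.Button(root, text="Upload EEG trial (.mat)",
          command=upload_and_classify).pack(padx=20, pady=10)
result_label = tk.Label(root, text="No trial loaded yet")
result_label.pack(padx=20, pady=10)
root.mainloop()
```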
4.7 The general methodology of our work
In the chart below, we summarize our framework in 3 basic steps:
1. The first step consists in gathering EEG signals and collecting the original
data set; however, the amount of available data is very small, which is why
we proceed to the second step: generating new data.
2. The second step consists in generating new data using the Conditional
Variational Autoencoder model.
3. The final step consists in predicting the emotions using convolutional
neural networks along with Long Short-Term Memory cells.
Figure 4.13: Basic steps of our methodology
4.8 Project schedule
A Gantt chart is a type of bar chart that illustrates a project schedule,
named after its inventor, Henry Gantt (1861–1919), who designed such a
chart around the years 1910–1915. Modern Gantt charts also show the de-
pendency relationships between activities and the current schedule status.
Figure 4.14: GANTT chart
In our Gantt chart (see Figure 4.14), we elaborated 7 basic tasks:
• Firstly, planning outlines for the whole internship with the professors.
• Secondly, beginning to write chapter 1: Introduction.
• Thirdly, in parallel with chapter 1, beginning the practical work on our
machine learning project; more details are given in the project architecture.
• Fourthly, starting to write chapter 2 at the beginning of April: back-
ground and literature review of EEG-based emotion recognition.
• Fifthly, starting to write chapter 3, discussing in depth the internal
architectures used: VAEs for generating new data and LSTM for
classifying the emotions.
• Sixthly, at the end of April, creating the graphical user interface
using Tkinter.
• Seventhly and lastly, writing chapter 4, discussing the results and
future optimizations of this project.
4.9 Conclusion
In conclusion, in this chapter we have talked about convolutional neural
networks (principle, applications and architecture). Then, we discussed
recurrent neural networks as a primary solution for predicting emotions.
In the next stage, we decided to switch to the LSTM technique, which is
more suitable for continuous signals such as EEG, voice and text. Finally,
we analysed our code on COLAB, interpreted the results, and presented
our graphical user interface for the therapist.
Chapter 5
Conclusion and Future Work
This work, "Emotions prediction for augmented EEG signals using VAE and
Convolutional Neural Networks CNN combined with LSTM", is based on two
main ideas: generating new data, and classifying emotions using both the
original and the newly produced data. For the first idea, we used conditional
variational autoencoders to generate new, high-quality EEG data; for the
second, we used convolutional neural networks combined with the LSTM
technique to classify the output level of anxiety (normal, light, moderate and
severe). The code is written in Python using Google COLAB. Finally, we
wrapped this code in a graphical user interface so that it can be used by
regular users and therapists.
This work has enabled us to know the patient's feelings without relying
on many traditional questionnaires, surveys and tests, taking into account the
psychological and emotional state of the person, especially for those who
suffer from psychological trauma, depression and negative feelings. It helps
both the patient and the doctor by offering high accuracy in emotion recog-
nition and fast diagnosis. It allows the treatment process to be sped up
without causing embarrassment to the patient, taking into account the
psychological aspect of children, the disabled, and deaf and mute people,
who find it difficult to express their emotions after negative events in their
lives such as the death of a loved one, divorce, school failure or family
problems. It also helps doctors in refugee camps in war zones to diagnose
psychological conditions as quickly as possible, especially for those who suffer
severe trauma after bloody events. This work is an important step towards
improving mental health care on two levels: diagnosis and treatment. It saves
time and makes it easier to understand the situations of millions of patients
around the world with the least amount of documents and questionnaires.
This work can be improved and developed in coordination with official
authorities, by cooperating with medical universities, doctors' clinics and the
ministry of health to improve the database, and by optimizing other points
through interaction with psychiatrists, therapists, the ministry of women
and children, sociologists, pediatricians and other organizations.
Following the machine learning life cycle, this work can be improved in fu-
ture projects using large amounts of data from many hospitals. We can also
use generative adversarial networks (GANs) to generate more high-quality
EEG signals, which would improve the prediction results when using new
inputs from new patients. In such an application, improving our algorithms
to produce better prediction results is a necessary step.
Bibliography
[Alexander et al., 2017] Alexander, L. M., Escalera, J., Ai, L., Andreotti,
C., Febre, K., Mangone, A., Vega-Potler, N., Langer, N., Alexander, A.,
Kovacs, M., et al. (2017). An open resource for transdiagnostic research in
pediatric mental health and learning disorders. Scientific data, 4:170181.
[Arsalan et al., 2019] Arsalan, A., Majid, M., Butt, A. R., and Anwar, S. M.
(2019). Classification of perceived mental stress using a commercially avail-
able eeg headband. IEEE journal of biomedical and health informatics,
23(6):2257–2264.
[Baghdadi and Aribi, 2019] Baghdadi, A. and Aribi, Y. (2019). Effectiveness
of dominance for anxiety vs anger detection. In 2019 Fifth International
Conference on Advances in Biomedical Engineering (ICABME), pages 1–4.
IEEE.
[Baghdadi et al., 2016] Baghdadi, A., Aribi, Y., and Alimi, A. M. (2016).
A survey of methods and performances for eeg-based emotion recognition.
In International Conference on Hybrid Intelligent Systems, pages 164–174.
Springer.
[Baghdadi et al., 2017] Baghdadi, A., Aribi, Y., and Alimi, A. M. (2017).
Efficient human stress detection system based on frontal alpha asymmetry.
In International Conference on Neural Information Processing, pages 858–
867. Springer.
[Baghdadi et al., 2020] Baghdadi, A., Aribi, Y., Fourati, R., Halouani, N.,
Siarry, P., and Alimi, A. (2020). Psychological stimulation for anxious
states detection based on eeg-related features. Journal of Ambient Intelli-
gence and Humanized Computing, pages 1–15.
[Bengio et al., 2013] Bengio, Y., Courville, A., and Vincent, P. (2013). Rep-
resentation learning: A review and new perspectives. IEEE Transactions
on Pattern Analysis and Machine Intelligence, 35(8):1798–1828.
[Cong et al., 2015] Cong, F., Lin, Q.-H., Kuang, L.-D., Gong, X.-F.,
Astikainen, P., and Ristaniemi, T. (2015). Tensor decomposition of eeg
signals: a brief review. Journal of neuroscience methods, 248:59–69.
[Fourati et al., 2020] Fourati, R., Ammar, B., Sanchez-Medina, J., and Al-
imi, A. M. (2020). Unsupervised learning in reservoir computing for eeg-
based emotion recognition. IEEE Transactions on Affective Computing,
to be published. doi:10.1109/TAFFC.2020.2982143.
[Freeman and Quiroga, 2012] Freeman, W. and Quiroga, R. Q. (2012). Imag-
ing brain function with EEG: advanced temporal and spatial analysis of
electroencephalographic signals. Springer Science & Business Media.
[García-Martínez et al., 2019] García-Martínez, B., Martínez-Rodrigo, A.,
Alcaraz, R., and Fernández-Caballero, A. (2019). A review on nonlin-
ear methods using electroencephalographic recordings for emotion recog-
nition. IEEE Transactions on Affective Computing, to be published.
doi:10.1109/TAFFC.2018.2890636.
[García-Martínez et al., 2017] García-Martínez, B., Martínez-Rodrigo, A.,
Zangróniz, R., Pastor, J. M., and Alcaraz, R. (2017). Symbolic analy-
sis of brain dynamics detects negative stress. Entropy, 19(5):196.
[Jitendra Verma, 2020] Jitendra Verma, Sudip Paul, P. J. (2020). Computa-
tional Intelligence and Its Applications in Healthcare. Elsevier.
[Mahmud et al., 2018] Mahmud, M., Kaiser, M. S., Hussain, A., and Vas-
sanelli, S. (2018). Applications of deep learning and reinforcement learning
to biological data. IEEE Transactions on Neural Networks and Learning
Systems, 29(6):2063–2079.
[Movahedi et al., 2017] Movahedi, F., Coyle, J. L., and Sejdić, E. (2017).
Deep belief networks for electroencephalography: A review of recent con-
tributions and future outlooks. IEEE Journal of Biomedical and Health
Informatics, 22(3):642–652.
[NeuroSky, 2009] NeuroSky, I. (2009). Brain wave signal (eeg) of neurosky,
inc. Last accessed June 30, 2020.
[Niedermeyer and da Silva, 2005] Niedermeyer, E. and da Silva, F. L. (2005).
Electroencephalography: basic principles, clinical applications, and related
fields. Lippincott Williams & Wilkins.
[Picard, 1997] Picard, R. (1997). Affective Computing. The MIT Press.
[Randhavane, 2020] Randhavane, T. (2020). Identifying emotions from walk-
ing using affective and deep features. ARXIV, pages 1–15.
[ScienceDaily, 2018] ScienceDaily (2018). University of toronto, mind-
reading algorithm uses eeg data to reconstruct images based on what we
perceive: New technique using eeg shows how our brains perceive faces.
Last accessed June 30, 2020.
[Steriade, 2005] Steriade, M. (2005). Cellular substrates of brain rhythms.
Electroencephalography: Basic principles, clinical applications, and related
fields, 5:31–83.
[Tsiouris et al., 2018] Tsiouris, K. M., Pezoulas, V. C., Zervakis, M., Konit-
siotis, S., Koutsouris, D. D., and Fotiadis, D. I. (2018). A long short-term
memory deep learning network for the prediction of epileptic seizures using
eeg signals. Computers in Biology and Medicine, 99:24–37.
[Zhang et al., 2020] Zhang, X., Pan, J., Shen, J., Din, Z. U., Li, J., Lu,
D., Wu, M., and Hu, B. (2020). Fusing of electroencephalogram and
eye movement with group sparse canonical correlation analysis for anxiety
detection. IEEE Transactions on Affective Computing, to be published.
doi:10.1109/TAFFC.2020.2981440.
Netography (Visited in May 2021)
[Web 1]: https://www.bbvaopenmind.com/en/technology/digital-
world/what-is-affective-computing/
[Web 2]: https://www.researchgate.net/figure/Sketch-of-how-to-record-an-
Electroencephalogram-An-EEG-allows-measuring-the-electrical
[Web 3]: https://www.sciencedirect.com/topics/agricultural-and-biological-
sciences
[Web 4]: https://oxfordmedicine.com/view/10.1093/9780195173642.001.0001/
[Web 5]: https://adaa.org/understanding-anxiety/
[Web 6]: https://worldhappiness.report/archive/
[Web 7]: https://bluewhaleapps.com/blog/implementing-machine-learning-
for-emotion-detection
[Web 8]: https://glengilmore.medium.com/facial-recognition-ai-will-use-
your-facial-expressions-to-judge-creditworthiness-b0e9a9ac4174
[Web 9]: https://www.thesaurabh.com/images/posts/about-
me/projects/speech-features-for-emotion-recognition/speech-features-
for-emotion-recognition-header.jpg
[Web 10]: https://ai.googleblog.com/2019/08/on-device-real-time-hand-
tracking-with.html
[Web 11]: https://medium.com/@mindpass2050/biosignals-as-dynamic-
biometrics-d93c3455e895
[Web 12]: https://www.wysa.io/
[Web 13]: https://play.google.com/store/apps/details?id=net.daylio
[Web 14]: https://www.headspace.com/
[Web 15]: https://play.google.com/store/apps/id=com.getsomeheadspace
[Web 16]: https://www.calm.com/
[Web 17]: https://play.google.com/store/apps/
[Web 18]: https://play.google.com/store/apps/details?id=com.relextro.anti.stress
[Web 20]: https://www.theshineapp.com/
[Web 21]: https://play.google.com/store/apps/details?id=com.shinetext.shine
[Web 22]: https://www.researchgate.net/figure/Schematic-overview-of-
convolutional-autoencoder-CAE-and-an-example-reconstruction
[Web 23]: https://towardsdatascience.com/understanding-variational-
autoencoders-vaes-f70510919f73
[Web 24]: https://jamesmccaffrey.wordpress.com/2018/07/03/the-
difference-between-an-autoencoder-and-a-variational-autoencoder/
[Web 25]: https://www.slideshare.net/andersonljason/variational-
autoencoders-for-image-generation
[Web 26]: https://www.researchgate.net/figure/Variational-AutoEncoder-
VAE-architecture
[Web 27]: https://pubmed.ncbi.nlm.nih.gov/23663147/
[Web 28]: https://www.npr.org/sections/alltechconsidered/2017/06/26/534131967/facial-
recognition-may-boost-airport-security-but-raises-privacy-worries
[Web 29]: https://medium.com/@codyorazymbet/a-quick-introduction-to-
cnn-layers-3b598e9d9963
[Web 30]: https://www.programmersought.com/article/63014851723/
[Web 31]: https://medium.com/analytics-vidhya/cnn-vs-rnn-vs-ann-
analyzing-3-types-of-neural-networks-in-deep-learning-f3fa1249589d
[Web 32]: https://www.researchgate.net/figure/LSTM-cell-structure
More Related Content

What's hot

Emotion recognition using facial expressions and speech
Emotion recognition using facial expressions and speechEmotion recognition using facial expressions and speech
Emotion recognition using facial expressions and speechLakshmi Sarvani Videla
 
Artificial Neural Network seminar presentation using ppt.
Artificial Neural Network seminar presentation using ppt.Artificial Neural Network seminar presentation using ppt.
Artificial Neural Network seminar presentation using ppt.Mohd Faiz
 
Analysis and Classification of ECG Signal using Neural Network
Analysis and Classification of ECG Signal using Neural NetworkAnalysis and Classification of ECG Signal using Neural Network
Analysis and Classification of ECG Signal using Neural NetworkZHENG YAN LAM
 
Project Report Distance measurement system
Project Report Distance measurement systemProject Report Distance measurement system
Project Report Distance measurement systemkurkute1994
 
EEG Signal processing
EEG Signal processing EEG Signal processing
EEG Signal processing DikshaKalra9
 
Artifacts in EEG - Recognition and differentiation
Artifacts in EEG - Recognition and differentiationArtifacts in EEG - Recognition and differentiation
Artifacts in EEG - Recognition and differentiationRahul Kumar
 
SUMSEM-2021-22_ECE6007_ETH_VL2021220701295_Reference_Material_I_04-07-2022_EE...
SUMSEM-2021-22_ECE6007_ETH_VL2021220701295_Reference_Material_I_04-07-2022_EE...SUMSEM-2021-22_ECE6007_ETH_VL2021220701295_Reference_Material_I_04-07-2022_EE...
SUMSEM-2021-22_ECE6007_ETH_VL2021220701295_Reference_Material_I_04-07-2022_EE...BharathSrinivasG
 
Use of eeg signal for mental analysis of a Person
Use of eeg signal for mental analysis of a PersonUse of eeg signal for mental analysis of a Person
Use of eeg signal for mental analysis of a PersonDipesh Pandey
 
A hybrid classification model for eeg signals
A hybrid classification model for  eeg signals A hybrid classification model for  eeg signals
A hybrid classification model for eeg signals Aboul Ella Hassanien
 
EEG basic to practice 1
EEG basic to practice 1EEG basic to practice 1
EEG basic to practice 1Mohamed Mahdy
 
Face Liveness Detection for Biometric Antispoofing Applications using Color T...
Face Liveness Detection for Biometric Antispoofing Applications using Color T...Face Liveness Detection for Biometric Antispoofing Applications using Color T...
Face Liveness Detection for Biometric Antispoofing Applications using Color T...rahulmonikasharma
 
EEG artifacts 2
EEG artifacts  2EEG artifacts  2
EEG artifacts 2DGIST
 
Independent Component Analysis
Independent Component AnalysisIndependent Component Analysis
Independent Component AnalysisTatsuya Yokota
 

What's hot (20)

Eeg basics.drjma
Eeg basics.drjmaEeg basics.drjma
Eeg basics.drjma
 
Emotion recognition using facial expressions and speech
Emotion recognition using facial expressions and speechEmotion recognition using facial expressions and speech
Emotion recognition using facial expressions and speech
 
Artificial Neural Network seminar presentation using ppt.
Artificial Neural Network seminar presentation using ppt.Artificial Neural Network seminar presentation using ppt.
Artificial Neural Network seminar presentation using ppt.
 
Analysis and Classification of ECG Signal using Neural Network
Analysis and Classification of ECG Signal using Neural NetworkAnalysis and Classification of ECG Signal using Neural Network
Analysis and Classification of ECG Signal using Neural Network
 
Basics of EEG
Basics of EEGBasics of EEG
Basics of EEG
 
Project Report Distance measurement system
Project Report Distance measurement systemProject Report Distance measurement system
Project Report Distance measurement system
 
EEG Signal processing
EEG Signal processing EEG Signal processing
EEG Signal processing
 
Artifacts in EEG - Recognition and differentiation
Artifacts in EEG - Recognition and differentiationArtifacts in EEG - Recognition and differentiation
Artifacts in EEG - Recognition and differentiation
 
Neuromorphic computing
Neuromorphic computingNeuromorphic computing
Neuromorphic computing
 
SUMSEM-2021-22_ECE6007_ETH_VL2021220701295_Reference_Material_I_04-07-2022_EE...
SUMSEM-2021-22_ECE6007_ETH_VL2021220701295_Reference_Material_I_04-07-2022_EE...SUMSEM-2021-22_ECE6007_ETH_VL2021220701295_Reference_Material_I_04-07-2022_EE...
SUMSEM-2021-22_ECE6007_ETH_VL2021220701295_Reference_Material_I_04-07-2022_EE...
 
Use of eeg signal for mental analysis of a Person
Use of eeg signal for mental analysis of a PersonUse of eeg signal for mental analysis of a Person
Use of eeg signal for mental analysis of a Person
 
A hybrid classification model for eeg signals
A hybrid classification model for  eeg signals A hybrid classification model for  eeg signals
A hybrid classification model for eeg signals
 
EEG basic to practice 1
EEG basic to practice 1EEG basic to practice 1
EEG basic to practice 1
 
Biometric Security Systems ppt
Biometric Security Systems pptBiometric Security Systems ppt
Biometric Security Systems ppt
 
Advance fMRI (Fast fMRI)
Advance fMRI (Fast fMRI)Advance fMRI (Fast fMRI)
Advance fMRI (Fast fMRI)
 
Face Liveness Detection for Biometric Antispoofing Applications using Color T...
Face Liveness Detection for Biometric Antispoofing Applications using Color T...Face Liveness Detection for Biometric Antispoofing Applications using Color T...
Face Liveness Detection for Biometric Antispoofing Applications using Color T...
 
EEG artifacts 2
EEG artifacts  2EEG artifacts  2
EEG artifacts 2
 
Bionic lens report
Bionic lens reportBionic lens report
Bionic lens report
 
Independent Component Analysis
Independent Component AnalysisIndependent Component Analysis
Independent Component Analysis
 
Brain Chips_main.pptx
Brain Chips_main.pptxBrain Chips_main.pptx
Brain Chips_main.pptx
 

Similar to Emotions prediction for augmented EEG signals using VAE and Convolutional Neural Networks CNN combined with LSTM

Deep Learning for Health Informatics
Deep Learning for Health InformaticsDeep Learning for Health Informatics
Deep Learning for Health InformaticsJason J Pulikkottil
 
Neural Networks on Steroids
Neural Networks on SteroidsNeural Networks on Steroids
Neural Networks on SteroidsAdam Blevins
 
Trade-off between recognition an reconstruction: Application of Robotics Visi...
Trade-off between recognition an reconstruction: Application of Robotics Visi...Trade-off between recognition an reconstruction: Application of Robotics Visi...
Trade-off between recognition an reconstruction: Application of Robotics Visi...stainvai
 
Low Power Context Aware Hierarchical System Design
Low Power Context Aware Hierarchical System DesignLow Power Context Aware Hierarchical System Design
Low Power Context Aware Hierarchical System DesignHoopeer Hoopeer
 
Keraudren-K-2015-PhD-Thesis
Keraudren-K-2015-PhD-ThesisKeraudren-K-2015-PhD-Thesis
Keraudren-K-2015-PhD-ThesisKevin Keraudren
 
RY_PhD_Thesis_2012
RY_PhD_Thesis_2012RY_PhD_Thesis_2012
RY_PhD_Thesis_2012Rajeev Yadav
 
A Seminar Report On NEURAL NETWORK
A Seminar Report On NEURAL NETWORKA Seminar Report On NEURAL NETWORK
A Seminar Report On NEURAL NETWORKSara Parker
 
Project report on Eye tracking interpretation system
Project report on Eye tracking interpretation systemProject report on Eye tracking interpretation system
Project report on Eye tracking interpretation systemkurkute1994
 
gemes_daniel_thesis
gemes_daniel_thesisgemes_daniel_thesis
gemes_daniel_thesisDaniel Gemes
 
Im-ception - An exploration into facial PAD through the use of fine tuning de...
Im-ception - An exploration into facial PAD through the use of fine tuning de...Im-ception - An exploration into facial PAD through the use of fine tuning de...
Im-ception - An exploration into facial PAD through the use of fine tuning de...Cooper Wakefield
 
Head_Movement_Visualization
Head_Movement_VisualizationHead_Movement_Visualization
Head_Movement_VisualizationHongfu Huang
 
46260004 blue-brain-seminar-report
46260004 blue-brain-seminar-report46260004 blue-brain-seminar-report
46260004 blue-brain-seminar-reportvishnuchitiki
 

Similar to Emotions prediction for augmented EEG signals using VAE and Convolutional Neural Networks CNN combined with LSTM (20)

edc_adaptivity
edc_adaptivityedc_adaptivity
edc_adaptivity
 
Deep Learning for Health Informatics
Deep Learning for Health InformaticsDeep Learning for Health Informatics
Deep Learning for Health Informatics
 
thesis
thesisthesis
thesis
 
Neural Networks on Steroids
Neural Networks on SteroidsNeural Networks on Steroids
Neural Networks on Steroids
 
Diplomarbeit
DiplomarbeitDiplomarbeit
Diplomarbeit
 
Trade-off between recognition an reconstruction: Application of Robotics Visi...
Trade-off between recognition an reconstruction: Application of Robotics Visi...Trade-off between recognition an reconstruction: Application of Robotics Visi...
Trade-off between recognition an reconstruction: Application of Robotics Visi...
 
Thesis small
Thesis smallThesis small
Thesis small
 
Low Power Context Aware Hierarchical System Design
Low Power Context Aware Hierarchical System DesignLow Power Context Aware Hierarchical System Design
Low Power Context Aware Hierarchical System Design
 
Keraudren-K-2015-PhD-Thesis
Keraudren-K-2015-PhD-ThesisKeraudren-K-2015-PhD-Thesis
Keraudren-K-2015-PhD-Thesis
 
main
mainmain
main
 
exjobb Telia
exjobb Teliaexjobb Telia
exjobb Telia
 
RY_PhD_Thesis_2012
RY_PhD_Thesis_2012RY_PhD_Thesis_2012
RY_PhD_Thesis_2012
 
A Seminar Report On NEURAL NETWORK
A Seminar Report On NEURAL NETWORKA Seminar Report On NEURAL NETWORK
A Seminar Report On NEURAL NETWORK
 
Project report on Eye tracking interpretation system
Project report on Eye tracking interpretation systemProject report on Eye tracking interpretation system
Project report on Eye tracking interpretation system
 
gemes_daniel_thesis
gemes_daniel_thesisgemes_daniel_thesis
gemes_daniel_thesis
 
phd-thesis
phd-thesisphd-thesis
phd-thesis
 
Im-ception - An exploration into facial PAD through the use of fine tuning de...
Im-ception - An exploration into facial PAD through the use of fine tuning de...Im-ception - An exploration into facial PAD through the use of fine tuning de...
Im-ception - An exploration into facial PAD through the use of fine tuning de...
 
Head_Movement_Visualization
Head_Movement_VisualizationHead_Movement_Visualization
Head_Movement_Visualization
 
mscthesis
mscthesismscthesis
mscthesis
 
46260004 blue-brain-seminar-report
46260004 blue-brain-seminar-report46260004 blue-brain-seminar-report
46260004 blue-brain-seminar-report
 

Recently uploaded

Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Neo4j
 
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024BookNet Canada
 
Bluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdfBluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdfngoud9212
 
Science&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdfScience&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdfjimielynbastida
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraDeakin University
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxnull - The Open Security Community
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
costume and set research powerpoint presentation
costume and set research powerpoint presentationcostume and set research powerpoint presentation
costume and set research powerpoint presentationphoebematthew05
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 

Recently uploaded (20)

Vulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptxVulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
 
Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024
 
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
 
Bluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdfBluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdf
 
Science&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdfScience&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdf
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning era
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
costume and set research powerpoint presentation
costume and set research powerpoint presentationcostume and set research powerpoint presentation
costume and set research powerpoint presentation
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort ServiceHot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
 

Emotions prediction for augmented EEG signals using VAE and Convolutional Neural Networks CNN combined with LSTM

  • 1. MASTER’S THESIS To obtain the Master degree in : Advanced Engineering of Robotized Systems and Artificial Intelligence Presented by: Bouzidi Amir Emotions prediction for augmented EEG signals using VAE and Convolutional Neural Networks CNN combined with LSTM Presented in : July, 4th 2021 Graduation committee : President : Mr. Sayadi Mounir Technical advisor : Ms. Fourati Rahma and Mr. Yangui Maher Report advisor : Ms. Ammar Boudour Member : Ms. Selmani Anissa Academic year: 2020/2021
  • 2. ‫اﻟﺨﻼﺻﺔ‬ ‫ﺑﺎﺳﺘﺨﺪام‬ ‫اﻟﻤﻌﺰزة‬ EEG ‫إﺷﺎرات‬ ‫ﺑﺈﺳﺘﻌﻤﺎل‬ ‫ﺑﺎﻟﻌﻮاﻃﻒ‬ ‫ﻟﻠﺘﻨﺒﺆ‬ ‫ذﻛﻲ‬ ‫ﻧﻈﺎم‬ ‫إﻧﺸﺎء‬ ‫ﻋﻦ‬ ‫ﻋﺒﺎرة‬ ‫ﻫﺬا‬ ‫اﻟﺪروس‬ ‫ﺧﺘﻢ‬ ‫ﻣﺸﺮوع‬ ‫ﺗﺴﻬﻴﻞ‬‫ﺧﺎﺻﺔ‬‫و‬‫اﻟﻄﺒﻲ‬‫اﻟﺘﺸﺨﻴﺺ‬‫ﺗﺤﺴﻴﻦ‬‫ﻫﻮ‬‫اﻟﻤﺸﺮوع‬‫اﻟﻬﺪف‬.LSTM‫ﺑﻮاﺳﻄﺔ‬‫اﻹﻟﺘﻔﺎﻓﻴﺔ‬‫اﻟﻌﺼﺒﻴﺔ‬‫اﻟﺸﺒﻜﺎت‬‫و‬VAE ‫و‬ ‫اﻟﻨﻔﺴﻴﺔ‬ ‫اﻹﺿﻄﺮاﺑﺎت‬ ‫ﻣﻦ‬ ‫ﻳﻌﺎﻧﻮن‬ ‫اﻟﺬﻳﻦ‬ ‫و‬ ‫ﻟﻠﻤﺮﺿﻰ‬ ‫ﺑﺴﺮﻋﺔ‬ ‫اﻟﻌﻼج‬ ‫ﻟﻤﺮﺣﻠﺔ‬ ‫اﻟﻤﺮور‬ ‫ﺑﺎﻟﺘﺎﻟﻲ‬ ‫و‬ ‫اﻟﻌﻮاﻃﻒ‬ ‫اﻛﺘﺸﺎف‬ ‫أﺟﻬﺰة‬‫ﻋﺒﺮ‬‫ﺟﺪﻳﺪة‬‫ﺑﻴﺎﻧﺎت‬‫أﻧﺘﺠﻨﺎ‬،‫اﻷوﻟﻰ‬‫اﻟﻤﺮﺣﻠﺔ‬‫ﻓﻲ‬ ‫ﺳﻮاء‬‫ﺣﺪ‬‫ﻋﻠﻰ‬‫واﻟﻤﺮﻳﺾ‬‫اﻟﻄﺒﻲ‬‫اﻟﻄﺎﻗﻢ‬‫ﻳﺴﺎﻋﺪ‬‫اﻟﻌﻤﻞ‬‫ﻫﺬا‬ ‫اﻟﻘﻠﻖ‬ ‫اﻟﻌﺼﺒﻴﺔ‬ ‫اﻟﺸﺒﻜﺎت‬ ‫اﺳﺘﺨﺪﻣﻨﺎ‬ ‫ًﺎ‬‫ﻴ‬‫وﺛﺎﻧ‬ ، ‫اﻟﺒﻴﺎﻧﺎت‬ ‫ﻣﻦ‬ ‫ﻛﺎﻓﻴﺔ‬ ‫ﻛﻤﻴﺔ‬ ‫ﻟﺘﻮﻓﻴﺮ‬ cVAE ‫اﻟﺸﺮﻃﻴﺔ‬ ‫اﻟﻤﺘﻐﻴﺮة‬ ‫اﻟﺘﻠﻘﺎﺋﻲ‬ ‫اﻟﺘﺸﻔﻴﺮ‬ ‫و‬ ‫ﺳﻬﻠﺔ‬ ‫واﺟﻬﺔ‬ ‫أﻧﺸﺄﻧﺎ‬ ‫ًا‬‫ﺮ‬‫وأﺧﻴ‬ ، ‫ﺑﺎﻟﻌﻮاﻃﻒ‬ ‫ﻟﻠﺘﻨﺒﺆ‬ ‫ﺑﺎﻟﻀﺒﻂ‬ LSTM ‫اﻟﻤﺪى‬ ‫ﻃﻮﻳﻠﺔ‬ ‫اﻟﺬاﻛﺮة‬ ‫ﺗﻘﻨﻴﺔ‬ ‫ﻓﻲ‬ ‫ﻣﺘﻤﺜﻠﺔ‬ ، ‫اﻟﺘﻼﻓﻴﻔﻴﺔ‬ ‫اﻟﻄﺒﻲ‬‫ﻟﻠﺘﺸﺨﻴﺺ‬‫ﻻﺳﺘﺨﺪاﻣﻬﺎ‬‫ﻟﻠﻤﺴﺘﺨﺪﻣﻴﻦ‬‫اﻹﺳﺘﻌﻤﺎل‬‫ﺑﺴﻴﻄﺔ‬ Resumé Notre projet de fin d’étude consiste à développer un système intelligent de prédiction des émotions en utilisant les signaux EEG augmentés à l'aide de VAE et de réseaux de neurones convolutifs combiné avec LSTM. L’objectif du projet est d'améliorer le diagnostic médical, en facilitant notamment la détection des émotions et en appliquant un traitement rapide des patients, ceux qui souffrent du traumatisme et de l'anxiété psychologique. Ce travail aide aussi bien le personnel médical que le patient. Dans une première étape, nous avons généré de nouvelles données via des cVAEs pour fournir une quantité suffisante de données. En deuxième lieu, nous avons utilisé les CNNs combiné avec la technique LSTM pour prédire les émotions. Enfin, nous avons créé une interface ergonomique à simple accès pour les utilisateurs au diagnostic médical. Abstract Our end-of-study project aims to develop an intelligent emotion prediction system using augmented EEG signals using VAE and convolutional neural networks combined with LSTM. The aim of the project is to improve medical diagnosis, in particular by facilitating the detection of emotions and applying quick treatment to patients, those suffering from trauma and anxiety. This work helps both the medical staff and the patient. In a first step, we generated new data via cVAEs to provide a sufficient amount of data. Second, we used CNNs combined with the LSTM technique to predict emotions. Finally, we have created an ergonomic, easy-to-access interface for users in medical diagnosis. ‫اﻟﺨﺎﺿﻊ‬ ‫ﺷﺒﻪ‬ ‫اﻟﺘﻌﻠﻢ‬ ، ‫اﻟﻤﺸﺎﻋﺮ‬ ‫ﻋﲆ‬ ‫اﻟﺘﻌﺮف‬ ، ‫اﻟﺘﻨﺒﺆ‬ ، ‫ﺑﺎﻳﺜﻮن‬ ، EEG ، CNN ، RNN ، VAE ، LSTM :‫اﻟﻤﻔﺎﺗﻴﺢ‬ .‫اﻟﺒﻴﺎﻧﺎت‬ ‫ﻣﺠﻤﻮﻋﺎت‬ ، ‫اﻵﻟﻲ‬ ‫اﻟﺘﻌﻠﻢ‬ ، ‫اﻟﻌﻤﻴﻖ‬ ‫اﻟﺘﻌﻠﻢ‬ ، ‫ﻟﻺﺷﺮاف‬ Mots clés: EEG, CNN, RNN , VAE, LSTM, Python, prédiction, reconnaissance des émotion apprentissage semi supervisé, apprentissage profond, machine learning, datasets Key-words: EEG, CNN, RNN , VAE, LSTM, Python, prediction, emotion recognition, semi- .supervised learning, deep learning, machine learning, Datasets Emotions prediction for augmented EEG signals using VAE and Convolutional Neural Networks CNN combined with LSTM
  • 3. ‫اﻟﺮﺣﻴﻢ‬ ‫اﻟﺮﺣﻤﺎن‬ ‫ﷲ‬ ‫ﺑﺴﻢ‬ ‫ﷲ‬ ‫رﺣﻤﻪ‬ ‫اﻟﺰواري‬ ‫ﻣﺤﻤﺪ‬ ‫اﻟﻤﻬﻨﺪس‬ ‫ﻟﻠﺮاﺣﻞ‬ ‫إﻫﺪاء‬ ‫ﻋﻠﻴﻨﺎ‬ ‫ﻓﻀﻞ‬ ‫ﻟﻪ‬ ‫ﻛﺎن‬ ‫ﻣﻦ‬ ‫و‬ ‫درﺳﻨﺎ‬ ‫و‬ ‫ﻋﻠﻤﻨﺎ‬ ‫ﻣﻦ‬ ‫ﻟﻜﻞ‬ ‫و‬ ‫ﻷﺳﺎﺗﺬﺗﻨﺎ‬ ‫إﻫﺪاء‬ ‫اﻟﺤﻀﺎرة‬ ‫ﻓﻀﻞ‬ ‫ﻓﻲ‬ ‫و‬ ‫اﻟﻨﺎﻓﻊ‬ ‫اﻟﻌﻠﻢ‬ ‫ﻓﻀﻞ‬ ‫ﻓﻲ‬ ‫ﻣﺨﺘﺼﺮة‬ ‫ﻛﻠﻤﺎت‬ ‫ﻓﻬﺬه‬ ‫ﺑﻌﺪ‬ ‫أﻣﺎ‬ ‫ﺟﻤﻌﺎء‬ ‫اﻟﺒﺸﺮﻳﺔ‬ ‫ﻋﲆ‬ ‫اﻹﺳﻼﻣﻴﺔ‬ ‫ﻬﻮر‬ ُ ‫ﻛﻈ‬ ٌ‫ﺮ‬ ِ ‫وﻇﺎﻫ‬ ٌ‫واﺿﺢ‬ ‫وﻫﺬا‬ ،‫اﻹﺳﻼم‬ ‫ﻓﻲ‬ ٌ‫ﻛﺒﻴﺮة‬ ٌ‫ﻨﺰﻟﺔ‬ َ‫وﻣ‬ ٌ‫ﻗﻴﻤﺔ‬ ‫ﻟﻠﻌﻠﻢ‬ ‫إن‬ ‫آﺛﺎر‬ ‫ﻓﻲ‬ ‫و‬ ‫ﻨﺔ‬‫اﻟﺴ‬ ‫و‬ ‫اﻟﻘﺮآن‬ - ِ ‫ﻴﻦ‬َ‫ﻴ‬‫اﻟﻮﺣ‬ ‫ﻧﺼﻮص‬ ‫ﻓﻲ‬ ‫ﻤﺎء‬‫اﻟﺴ‬ ‫ﻛﺒﺪ‬ ‫ﻓﻲ‬ ‫ﻤﺲ‬‫اﻟﺸ‬ َ ‫ﻚ‬‫َﺑ‬‫ر‬ ِ ‫ﻢ‬ ْ ‫ﺎﺳ‬ِ‫ﺑ‬ ْ ‫َأ‬‫ﺮ‬ ْ ‫﴿اﻗ‬ ‫أﻣﺔ‬ ‫ﻧﺤﻦ‬ ‫ﺑﻞ‬ ‫ﻓﻘﻂ‬ ﴾ ْ ‫َأ‬‫ﺮ‬ ْ ‫﴿اﻗ‬ ‫أﻣﺔ‬ ‫ﻟﺴﻨﺎ‬ ‫ﻓﻨﺤﻦ‬ ‫ﺎ‬ ً ‫أﻳﻀ‬ ‫اﻟﺼﺎﻟﺢ‬ ‫ﻠﻒ‬‫اﻟﺴ‬ ‫أن‬ ‫ﻳﺠﺐ‬ ‫ﻻ‬ ‫اﻟﻌﻠﻮم‬ ‫ﺗﻌﻠﻢ‬ ‫و‬ ‫ﷲ‬ ‫ﻣﻦ‬ ‫ﺑﻤﻌﻴﺔ‬ ‫ﻣﺴﺪدة‬ ‫ﻟﻠﻜﻮن‬ ‫ﻓﻘﺮاءﺗﻨﺎ‬ ﴾ َ ‫ﻖ‬َ‫ﻠ‬َ‫ﺧ‬ ‫ي‬ ِ‫ﺬ‬‫اﻟ‬ ‫ﺑﻔﻀﻞ‬ ‫اﻟﻐﺮب‬ ‫ﺷﻬﺪ‬ ‫ﻗﺪ‬ ‫و‬ ‫ﻋﺎﻣﺔ‬ ‫اﻟﺒﺸﺮﻳﺔ‬ ‫و‬ ‫ﺧﺎﺻﺔ‬ ‫اﻟﻤﺴﻠﻤﻴﻦ‬ ‫ﻟﻔﺎﺋﺪة‬ ‫إﻻ‬ ‫ﻳﻜﻮن‬ ‫)ﻋﺎﻟﻤﻴﺔ‬ ‫ﻛﺘﺎب‬ ‫ﻓﻰ‬ ‫ﻟﻮﺑﻮن‬ ‫ﺟﻮﺳﺘﺎف‬ ‫اﻟﻔﺮﻧﺴﻲ‬ ‫اﻟﻄﺒﻴﺐ‬ ‫ﻳﻘﻮل‬ ‫ﺣﻴﺚ‬ ‫اﻟﻤﺴﻠﻤﻴﻦ‬ ‫اﻟﺤﻀﺎرة‬ ‫ﻣﻴﺪان‬ ‫ﻓﻲ‬ ‫واﻟﻤﺴﻠﻤﻴﻦ‬ ‫اﻟﻌﺮب‬ ‫ﻓﻀﻞ‬ ‫ﻳﻘﺘﺼﺮ‬ ‫ﻟﻢ‬ :(‫اﻻﺳﻼﻣﻴﺔ‬ ‫اﻟﺤﻀﺎرة‬ ‫ﻟﻬﻢ‬ ‫ﻣﺪﻳﻨﺎن‬ ‫ﻓﻬﻤﺎ‬ ‫واﻟﻐﺮب‬ ‫اﻟﺸﺮق‬ ‫ﻓﻲ‬ ‫اﻟﺒﺎﻟﻎ‬ ‫اﻷﺛﺮ‬ ‫ﻟﻬﻢ‬ ‫ﻛﺎن‬ ‫ﻓﻘﺪ‬ ‫أﻧﻔﺴﻬﻢ‬ ‫ﻋﲆ‬ ‫ﺗﻤﺪﻧﻬﻢ‬ ‫ﻓﻲ‬ ‫ﻓﻘﺎل‬ ‫وﺣﺪاﻧﻴﺘﻪ‬ ‫ﻋﲆ‬ ‫اﻟﻌﻠﻤﺎء‬ ‫ﷲ‬ ‫اﺳﺘﺸﻬﺪ‬ ‫اﻟﻌﻠﻤﺎء‬ ‫و‬ ‫اﻟﻌﻠﻢ‬ ‫ﺑﻤﻘﺎم‬ ً‫وﺗﻨﻮﻳﻬﺎ‬ ‫ﻻ‬ ‫ﺑﺎﻟﻘﺴﻂ‬ ً‫ﻗﺎﺋﻤﺎ‬ ‫اﻟﻌﻠﻢ‬ ‫وأوﻟﻮا‬ ‫واﻟﻤﻼﺋﻜﺔ‬ ‫ﻫﻮ‬ ‫إﻻ‬ ‫إﻟﻪ‬ ‫ﻻ‬ ‫أﻧﻪ‬ ‫ﷲ‬ ‫ﺷﻬﺪ‬ ) : ‫ﺳﺒﺤﺎﻧﻪ‬ 18/‫ﻋﻤﺮان‬ ‫آل‬ ( ‫اﻟﺤﻜﻴﻢ‬ ‫اﻟﻌﺰﻳﺰ‬ ‫ﻫﻮ‬ ‫إﻻ‬ ‫إﻟﻪ‬ . ‫زدﻧﻲ‬ ‫رب‬ ‫﴿وﻗﻞ‬ : ‫ﻓﻘﺎل‬ ‫ﻣﻨﻪ‬ ‫اﻟﻤﺰﻳﺪ‬ ‫ﻳﻄﻠﺐ‬ ‫أن‬ ‫رﺳﻮﻟﻪ‬ ‫ﷲ‬ ‫أﻣﺮ‬ ‫اﻟﻌﻠﻢ‬ ‫وﻷﻫﻤﻴﺔ‬ ‫ﻋﲆ‬ ‫)اﻹﺳﻼم‬ ‫ﻛﺘﺎﺑﻪ‬ ‫ﻓﻲ‬ ‫ﻓﺎﻳﺲ‬ ‫ﻟﻴﻮﺑﻮﻟﺪ‬ ‫اﻟﻨﻤﺴﺎوي‬ ‫اﻟﻤﻔﻜﺮ‬ ‫ﻳﺼﺮح‬ ‫ﻛﻤﺎ‬ ﴾ً‫ﻋﻠﻤﺎ‬ ‫ﻧﻌﻴﺶ‬ ‫اﻟﺬي‬ ‫اﻟﺤﺪﻳﺚ‬ ‫اﻟﻌﻠﻤﻲ‬ ‫اﻟﻌﺼﺮ‬ ‫إن‬ ‫ﻗﻠﻨﺎ‬ ‫إذ‬ ‫ﻧﺒﺎﻟﻎ‬ ‫ﻟﺴﻨﺎ‬ : (‫اﻟﻄﺮق‬ ‫ﻣﻔﺘﺮق‬ ‫و‬ ‫دﻣﺸﻖ‬ ‫ﻓﻲ‬ ‫اﻹﺳﻼﻣﻴﺔ‬ ‫اﻟﻤﺮاﻛﺰ‬ ‫ﻓﻲ‬ ‫وﻟﻜﻦ‬ ، ‫أوروﺑﺎ‬ ‫ﻣﺪن‬ ‫ﻓﻲ‬ ‫ﻦ‬ ّ ‫ﺪﺷ‬ُ‫ﻳ‬ ‫ﻟﻢ‬ ، ‫ﻓﻴﻪ‬ ‫ﻗﺮﻃﺒﺔ‬ ‫و‬ ‫اﻟﻘﺎﻫﺮة‬ ‫و‬ ‫ﺑﻐﺪاد‬ ‫اﻹﺳﻼم‬ ‫ﻓﻲ‬ :(‫اﻟﺠﻤﻴﻠﺔ‬ ‫)اﻟﺤﻴﺎة‬ ‫ﻛﺘﺎﺑﻪ‬ ‫ﻓﻲ‬ ‫ﻓﺮاﻧﺲ‬ ‫أﻧﺎﺗﻮل‬ ‫اﻟﻔﺮﻧﺴﻲ‬ ‫اﻟﺸﺎﻋﺮ‬ ‫ﻳﻘﻮل‬ ‫وإن‬ ، ‫اﻟﻌﻠﻢ‬ ‫ﻋﲆ‬ ً‫ﺑﺎﻋﺜﺎ‬ ‫اﻟﺪﻳﻦ‬ ‫ﻛﺎن‬ ‫ﺑﻞ‬ ، ‫ﻟﻶﺧﺮ‬ ‫ﻇﻬﺮه‬ ‫واﻟﺪﻳﻦ‬ ‫اﻟﻌﻠﻢ‬ ‫ﻣﻦ‬ ‫ﻛﻞ‬ ّ ‫ﻮل‬ُ‫ﻳ‬ ‫ﻟﻢ‬ ‫ﻣﻌﻬﺎ‬ ‫ﻧﻌﺠﺰ‬ ‫درﺟﺔ‬ ‫إﱃ‬ ‫ﻛﺜﻴﺮ‬ ‫ﺑﺸﻲء‬ ‫اﻹﺳﻼﻣﻴﺔ‬ ‫ﻟﻠﺤﻀﺎرة‬ ‫ﻣﺪﻳﻨﺔ‬ ‫اﻟﻐﺮﺑﻴﺔ‬ ‫اﻟﺤﻀﺎرة‬ ‫اﻟﺜﺎﻧﻴﺔ‬ ‫ﻣﻌﺮﻓﺔ‬ ‫ﺗﺘﻢ‬ ‫ﻟﻢ‬ ‫إذا‬ ‫اﻷوﱃ‬ ‫ﻓﻬﻢ‬ ‫ﻋﻦ‬ ‫و‬ ،‫ﺎت‬‫ﻳﺎﺿﻴ‬ّ‫ﺮ‬‫اﻟ‬ ‫و‬ ‫اﻟﻔﻴﺰﻳﺎء‬ ‫و‬ ّ ‫ﺐ‬ ّ ‫ﻛﺎﻟﻄ‬ ،‫اﻟﺪﻧﻴﻮﻳﺔ‬ ‫اﻟﻌﻠﻮم‬ ‫إن‬ ‫ﻧﻘﻮل‬ ‫اﻟﺨﺘﺎم‬ ‫ﻓﻲ‬ ‫و‬ ‫ﺤﺘﺎﺟﻮن‬َ‫ﻳ‬ ‫و‬ ‫اﻟﻤﺴﻠﻤﻮن‬ ‫ﻣﻨﻬﺎ‬ ُ‫ﻳﺴﺘﻔﻴﺪ‬ ‫و‬ ‫اﻹﻧﺴﺎن‬ ‫ﻳﻨﻔﻊ‬ ‫ﻣﻤﺎ‬ ‫ﻏﻴﺮﻫﺎ‬ ‫و‬ ‫اﻟﻬﻨﺪﺳﺔ‬ ‫ﻧﻔﻊ‬ ‫ﻷﺟﻞ‬ ‫ﻤﻬﺎ‬‫ﺗﻌﻠ‬ ‫ﻦ‬ َ‫ﻓﻤ‬ ،‫اﻟﻜﺮﻳﻤﺔ‬ ‫اﻷﻋﻤﺎل‬ ‫ﻣﻦ‬ ‫ﻬﺎ‬ َ‫ﻤ‬‫ﺗﻌﻠ‬ ‫أن‬ ‫ﺷﻚ‬ ‫ﻻ‬ - ‫ﻬﺎ‬ْ‫ﻴ‬‫إﻟ‬ ‫ﻟﻐﻴﺮﻫﻢ‬ ‫اﻟﺤﺎﺟﺔ‬ ‫ﻋﻦ‬ ‫إﺳﺘﻐﻨﺎءﻫﻢ‬ ‫و‬ ‫اﻟﺬاﺗﻲ‬ ‫إﻛﺘﻔﺎءﻫﻢ‬ ‫ﺗﺤﻘﻴﻖ‬ ‫و‬ ‫اﻟﻤﺴﻠﻤﻴﻦ‬ ‫ﺎت‬‫ﻴ‬‫ﺑﺎﻟﻨ‬ ‫اﻷﻋﻤﺎل‬ ‫ﻤﺎ‬‫إﻧ‬ :‫ﻢ‬‫وﺳﻠ‬ ‫ﻋﻠﻴﻪ‬ ‫ﷲ‬ ‫ﺻﲆ‬ ‫ﻪ‬ِ‫ﻟ‬‫ﻟﻘﻮ‬ ‫ﻋﺒﺎدة‬ ‫ﻓﻲ‬ ‫ﻓﻬﻮ‬ ٔ ‫ـ‬‫ـ‬‫ـ‬‫ـ‬‫ـ‬‫ـ‬‫ـ‬ 1442 2021
Contents

List of Figures
List of Tables
1 General Introduction
2 Review of EEG-based emotion recognition
  2.1 Introduction
  2.2 Preliminaries
    2.2.1 Electroencephalography
    2.2.2 Brain waves
    2.2.3 Anxiety disorder
    2.2.4 Worldwide and Tunisia statistics
  2.3 Ways of emotion detection using machine learning
    2.3.1 Facial Recognition
    2.3.2 Speech Recognition
    2.3.3 Body Gestures and Movements
    2.3.4 Motor Behavioural Patterns
    2.3.5 Biosignals
  2.4 Existing applications for anxiety’s detection and treatment
    2.4.1 Wysa: Depression and anxiety therapy chatbot
    2.4.2 Daylio
    2.4.3 Headspace
    2.4.4 Calm
    2.4.5 AntiStress, Relaxing, Anxiety Stress Relief Game
    2.4.6 Shine: Calm Anxiety Stress
  2.5 Available anxiety elicitation-based datasets
    2.5.1 Problem of imbalanced dataset
  2.6 Conclusion
3 EEG data augmentation using CVAE
  3.1 Introduction
  3.2 The autoencoder
  3.3 Variational AutoEncoder (VAE)
  3.4 The gap between AE and VAE
  3.5 Conditional Variational Autoencoder
  3.6 Coding in COLAB
  3.7 Evaluation of the generation process
  3.8 Conclusion
4 Emotion recognition using recurrent CNN
  4.1 Introduction
  4.2 Convolutional Neural Network
    4.2.1 CNN principle
    4.2.2 CNN applications
    4.2.3 CNN Architecture
    4.2.4 Difference between Conv1D and Conv2D
  4.3 Recurrent Neural Network
    4.3.1 Definition
    4.3.2 RNN Applications
    4.3.3 Long Short Term Memory layer LSTM
  4.4 The proposed architecture for anxiety states recognition
  4.5 Anxiety states recognition results
    4.5.1 Experimental setup
    4.5.2 Classification results without data augmentation
    4.5.3 Classification results with data augmentation
  4.6 Graphical user interface
  4.7 The general methodology of our work
  4.8 Project schedule
  4.9 Conclusion
5 Conclusion and Future Work
Netography
List of Figures

2.1 Measuring the electrical activity using fixated electrodes on an EEG cap
2.2 The five EEG signals and their associated activities
2.3 Happiness index for 2015-2017
2.4 Extracting facial features
2.5 Extracting speech features
2.6 Emotion recognition using hand movements
2.7 ER using walking behavioural features
2.8 Emotion recognition using biosignals
2.9 Screenshot from the Penguin chatbot
2.10 Screenshots from Daylio
2.11 Screenshots from Headspace
2.12 Mindful days follow-up schedule from Calm
2.13 Many relaxing and coloring games to treat anxiety
2.14 Screenshots from Shine app
2.15 GAP results between balanced and imbalanced dataset
3.1 The autoencoder architecture
3.2 The latent space regularization problem
3.3 The variational autoencoder model
3.4 The optimisation of the variational autoencoder model
3.5 The reparameterization trick
3.6 Real difference between AE and VAEs on MNIST dataset
3.7 The network of the CVAE
3.8 The architecture of a Conditional Variational Autoencoder
3.9 t-SNE representation of real and generated data
3.10 Topographical map of real and generated data
4.1 Charles Camiel looks into the camera for a facial recognition test at Logan International Airport in Boston
4.2 CNN architecture
4.3 Conv2D kernel sliding
4.4 GAP between RNNs (L) and Feedforward Neural Networks (R)
4.5 Inside the LSTM cell
4.6 CNN-LSTM architecture for anxiety states recognition
4.7 CNN-LSTM architecture description
4.8 Training and validation loss
4.9 Training and validation accuracy of CNN-LSTM on DASPS dataset
4.10 Confusion matrix of CNN-LSTM on DASPS dataset
4.11 Uploading data to GUI by therapist
4.12 GUI showing EEG brain map in side a and emotion prediction in side b
4.13 Basic steps of our methodology
4.14 GANTT chart
Acknowledgments

In the Name of Allah, the Most Beneficent, the Most Merciful
The Prophet Mohammad, peace be upon him, said: "Allah does not thank the person who does not thank people."

• I would first like to thank my thesis advisor, Ms. Boudour Ammar, PhD, Engineer and Assistant Professor at the National Engineering School of Sfax, for her great support, help and precious advice during this internship.
• I would also like to express my gratitude to Doctor Rahma Fourati, member of the REGIM Lab at ENIS; her office door was always open whenever I ran into a trouble spot or had a question about my research or writing. She consistently allowed this report to be my own work, but steered me in the right direction whenever she thought I needed it.
• I would like to thank the industrial head of CISEN Computer, Mr. Maher Yangui, the IT engineer who gave me the opportunity to accomplish this internship with their esteemed company. I would also like to thank the administrative staff of UVT for their coordination and support during my studies.
• Finally, I must express my very profound gratitude to my parents, Mokhtar and Baya, to my brothers, Chedi and Jemil, and to my friends and colleagues for providing me with unfailing support and continuous encouragement throughout my years of study and through the process of researching and writing this thesis. This accomplishment would not have been possible without them. Thank you.

Amir Bouzidi
Chapter 1
General Introduction

Affective computing is the study and development of systems and devices that can recognize, interpret, process, and simulate human affects. It is an interdisciplinary field spanning computer science, psychology, and cognitive science. The machine should interpret the emotional state of humans and adapt its behavior to them, giving an appropriate response to those emotions.

Affective computing technologies sense the emotional state of a user (via sensors, microphones, cameras or software logic). They respond by performing specific predefined product/service features, such as changing a quiz or recommending a set of videos to fit the mood of the learner. The more computers we have in our lives, the more we are going to want them to be socially smart: we do not want a computer to bother us with unimportant information. That kind of common-sense reasoning requires an understanding of the person's emotional state.

A major area in affective computing is the design of computational devices proposed to exhibit either innate emotional capabilities or the capacity to convincingly simulate emotions. A more practical approach, based on current technological capabilities, is the simulation of emotions in conversational agents in order to enrich and facilitate interactivity between human and machine. While human emotions are often associated with surges in hormones and other neuropeptides, emotions in machines might be associated with abstract states associated with progress (or lack of progress) in autonomous learning systems. In this view, affective emotional states correspond to time-derivatives in the learning curve of an arbitrary learning system. Two major categories describe emotions in machines: emotional speech and facial affect detection.

Anxiety is a kind of negative emotion. In this work, the goal is to design an application for the visualization of EEG signals, the inspection of the topographical map and the recognition of the anxiety level. The desktop application is useful for the therapist as a first diagnosis to choose the convenient technique to continue the therapy. In other words, some people do not share their thoughts or their mental state; in such a case, the inspection of the EEG signals allows access to the inner state of the human mind. The current work proposes two contributions:
• The generation of new EEG signals in order to enrich the dataset and guarantee an efficient training of a deep neural network.
• The recognition of anxiety states with a recurrent neural network composed of convolutional layers followed by a Long Short-Term Memory (LSTM) to capture spatio-temporal features in clean EEG signals.

The rest of this thesis is composed of four chapters, organised as follows:

• Chapter Two presents the theoretical background and literature review on EEG-based emotion recognition as well as EEG-based anxiety level recognition. It also provides a background on EEG signals, including EEG rhythms, analysis techniques of EEG signals and the role of EEG as a modality for emotion recognition. In addition, the chapter covers the machine learning techniques used for emotion recognition (facial recognition, speech recognition, body gestures and movements, motor behavioural patterns and biosignals), followed by an overview of the existing affective benchmarks, the available datasets, and the framework and steps of this project.
• Chapter Three introduces data projection with autoencoders, followed by an explanation of the AE architecture, variational autoencoders and the gap between AE and VAE. We then choose the conditional variational autoencoder and write the corresponding code on COLAB. Finally, we discuss the metrics used to evaluate our machine learning model.
• Chapter Four presents the proposed model for the recognition of emotion states. We discuss the advantages of CNNs, their principle, architecture and real-world applications, then describe the difference between 1D and 2D convolutions (our code uses 2D), followed by a brief explanation of recurrent neural networks (RNNs) and their importance for continuous signals such as EEG. We then present the LSTM technique used in our code and an analysis of that code, and finally the graphical user interface and the GANTT chart of the whole project.
• Chapter Five presents the obtained results, summarizing this experience and the contributions for future works.

The thesis ends with a conclusion that provides a summary of our contributions, outlines the conclusions and limitations of this research, and suggests several directions for future research.
Chapter 2
Review of EEG-based emotion recognition

2.1 Introduction

The mechanisms that regulate our physiological and mental processes behave in a coupled way, with a remarkable inter-dependency. Mental processes are responsible for changes of the physiological state of our body; conversely, changes in bodily functions also lead to different thoughts, behaviours and emotions. In this chapter, we present the EEG modality along with its brain waves, then discuss the phenomenon of anxiety in Tunisia and worldwide, followed by a presentation of the latest machine learning techniques used for emotion recognition. Finally, we introduce the most used mobile applications in our community for treating anxiety and identifying emotions, with an overview of the elicitation techniques of anxious states.

2.2 Preliminaries

2.2.1 Electroencephalography

Electroencephalography, also known as EEG, is the study of the brain functions reflected by the brain's electrical activity, and it is considered one of the basic tools to image brain functioning. Our thoughts are generated through a network of neurons that send signals to each other with the help of electrical currents. Electrodes of an EEG headset placed on the scalp collect the brain's electrical signals; in addition, a conductive paste is used to improve the conduction of the electrical signals. The EEG headset used in this study is an elastic cap, similar to a bathing cap, with the electrodes mounted on it. The electrodes are placed systematically using the international 10-20 system for electrode placement, to ensure that the data are collected from identical positions across all respondents. These electrodes detect the electrical changes of thousands of synchronized neurons simultaneously.
The voltage fluctuations measured by these sensors are very low, typically in microvolts. The signals are digitized and sent to an amplifier, where they are amplified; once amplified, they are sent to a computer, where they can be recorded. Using different methodologies, the signals can be represented as a vector array or matrix array for data processing purposes [Cong et al., 2015]. In addition, various maps of the brain activity can be generated, with a fast temporal resolution.

Figure 2.1: Measuring the electrical activity using fixated electrodes on an EEG cap [Web 2]

The main drawback of EEG is its spatial resolution: it is difficult to tell whether the signals measured by the electrodes were generated near the surface or in deeper regions of the brain, so it is hard to figure out where in the brain the electrical activity is coming from. The cost of EEG systems depends on several factors: firstly, the number of electrodes on the headset; secondly, the quality of the amplifier; and thirdly, the sampling rate, measured in Hz.

One of the major advantages of using EEG is the fact that it has excellent temporal resolution, meaning that it can measure in fine detail events happening in real time. According to Neuroscience News, researchers believe that it takes about 17 milliseconds for the brain to form a representation of a human face, making EEG the perfect candidate, as EEG can capture activity at a time scale down to milliseconds [ScienceDaily, 2018].
2.2.2 Brain waves

The development of technologies such as virtual reality and wearable devices, together with an understanding of the physiological responses to emotional states, can serve a wide range of valuable applications in diverse domains:

• Medicine: rehabilitation (help with monitoring), companionship (enhanced realism), counseling (tracking the client's emotional state), health care (the patient's feeling about a treatment, especially for deaf, dumb and blind patients).
• E-learning: adjusting the presentation of an online tutor, detecting the state of the learner, improving tutoring systems.
• Monitoring: detecting the driver's state and warning him, ATMs not dispensing money when the user is scared, improving call-center systems (detecting and prioritizing angry customers via their voices).
• Entertainment: recognizing the mood and emotions of users and satisfying their needs with the right content (movie and music recommendations).
• Law: deeper discovery of depositions, improved investigation tools for dealing with criminals, suspects and witnesses.
• Marketing: measuring the impact of ads, improving advertising plans, optimizing recommendation systems to satisfy the user's shopping experience and so increase sales.

Brain waves are the regularly recurring wave forms that are similar in shape and duration [Steriade, 2005]. There are five main EEG frequency bands: delta, theta, alpha, beta and gamma, which reflect the different brain states [NeuroSky, 2009]. The brain waves and the functions of each EEG band are described below:

• Delta waves (0.1-3 Hz): appear in dreamless sleep and unconscious states.
• Theta waves (4-7 Hz): observed in different states such as intuitive, creative, imaginary and drowsy states.
• Alpha waves (8-12 Hz): the first EEG waves to be discovered, by Berger [Niedermeyer and da Silva, 2005]. Alpha waves appear in relaxed, tranquil and conscious states, but not during drowsiness. Alpha waves become attenuated in several situations, such as opening the eyes, hearing sounds, anxiety or attention.
• Beta waves (12-30 Hz): observed in active states and anxious thinking. There are three different bands of beta waves:
  – Low beta waves (12-15 Hz): appear in relaxed yet focused and integrated states.
  – Mid-range beta waves (16-20 Hz): appear in thinking and awareness of self and surroundings.
  – High beta waves (21-30 Hz): observed in alertness and agitation states.
• Gamma waves (30-100 Hz): observed in higher mental activity such as processing information and learning.

EEG oscillations of the same frequency may have different functions, as depicted in Figure 2.2: for example, delta oscillations can be normal or abnormal depending on the state, being normal during slow-wave sleep but clearly signalling abnormality during the awake state [Freeman and Quiroga, 2012].

Figure 2.2: The five EEG signals and their associated activities [Web 3]
2.2.3 Anxiety disorder

While the origins of the field may be traced as far back as early philosophical inquiries into emotion ("affect" is, basically, a synonym for "emotion"), the more modern branch of computer science originated with Rosalind Picard's book [Picard, 1997] on affective computing. The motivation for the research is the ability to simulate empathy.

According to [Web 4], anxiety refers to multiple mental and physiological phenomena, including a person's conscious state of worry over a future unwanted event, or fear of an actual situation. Anxiety and fear are closely related; some scholars view anxiety as a uniquely human emotion and fear as common to nonhuman species. Another distinction often made between fear and anxiety is that fear is an adaptive response to a realistic threat, whereas anxiety is a diffuse emotion, sometimes an unreasonable or excessive reaction to a current or future perceived threat.

2.2.4 Worldwide and Tunisia statistics

According to the ADAA [Web 5], anxiety disorders are the most common mental illness in the United States, affecting 40 million adults aged 18 and older, or 18.1% of the population, every year. Anxiety disorders are highly treatable, yet only 36.9% of those suffering receive treatment. Anxiety disorders affect 25.1% of children between 13 and 18 years old. Research shows that untreated children with anxiety disorders are at higher risk of performing poorly in school, missing out on important social experiences, and engaging in substance abuse. The WHO reports that anxiety disorders are the most common mental disorders worldwide, with specific phobia, major depressive disorder and social phobia being the most common. The figure below, quoted from the UN Sustainable Development Solutions Network Report of 2018, shows that Tunisia ranked 111th in the happiness index. This result reflects the deterioration of the mental health situation within Tunisian society and the need to optimize existing solutions.

Figure 2.3: Happiness index for 2015-2017 [Web 6]
2.3 Ways of emotion detection using machine learning

Different approaches exist for emotion recognition through machine learning. The most recent techniques used for emotion identification are the following:

2.3.1 Facial Recognition

Facial recognition based on machine learning (ML) is a widely used method for detecting emotions. It takes advantage of the fact that our facial characteristics fluctuate dramatically in response to our emotions. When we are happy, for example, our lips expand upwards from both ends; similarly, when we are excited, we raise our eyebrows. Facial recognition is a valuable emotion detection technology in which the pixels of critical facial regions are evaluated to characterize facial expressions using facial landmarks, machine learning and deep learning. Eyes, nose, lips, jaw, eyebrows, mouth, and other facial landmarks are employed in emotion detection using machine learning. While a distinct facial landmark may be present in two separate emotions, a detailed analysis of the combination of different landmarks using machine learning can help distinguish between similar-appearing but distinct emotions. For example, while elevated eyebrows can indicate astonishment, they can also be a sign of worry; raised brows with raised lip boundaries, on the other hand, would signal a joyful surprise rather than anxiety. Face recognition can be used to detect emotions in surveillance and healthcare.

Figure 2.4: Extracting facial features [Web 8]
2.3.2 Speech Recognition

Speech feature extraction and voice activity detection are required for emotion identification using speech recognition. The method entails utilizing machine learning to analyze speech parameters such as tone, energy, pitch, formant frequency, and so on, and determining emotions based on changes in these features.

Because voice signals can be obtained quickly and cheaply, ML-based emotion identification via speech, also known as Speech Emotion Recognition (SER), is very popular. A good audio database, effective feature extraction, and the deployment of trustworthy classifiers employing ML techniques and Natural Language Processing (NLP) are all required for speech emotion recognition using machine learning.

Both feature extraction and feature selection are critical for reliable findings. Raw data is then classified into a certain emotion class based on the features retrieved from it, using various classification techniques such as the Gaussian Mixture Model (GMM), Hidden Markov Model (HMM), Support Vector Machine (SVM), Neural Networks (NN), and Recurrent Neural Networks (RNN).

Major application areas for SER are audio surveillance, e-learning, clinical studies, banking, entertainment, call-centers, gaming, and many more. For example, emotion detection in e-learning helps understand students' emotions and modify the teaching techniques accordingly [Web 7].

Figure 2.5: Extracting speech features [Web 9]

2.3.3 Body Gestures and Movements

With the help of machine learning, analyzing body movements and gestures can also aid in emotion identification. With changes in emotions, our bodily movements, posture, and gestures alter dramatically.
This is why, based on a mix of hand/arm gestures and body movements, we can usually infer a person's basic mood. A clenched fist with an alert stance, for example, is an indication of rage. Every shift in human mood is followed by a succession of gestures and changes in body movement. With the use of proper machine learning classifier algorithms and gesture sensors like Microsoft Kinect, OpenKinect, and OpenNI, analyzing a combination of various gestures and body motions can provide excellent insights into emotion recognition.

The process of emotion detection through body gestures and movements involves the extraction of regions in relevant body parts, for example extracting a hand region mask from the hands. Contour analysis is then performed on this region, which produces contours and convexity defects; these are used for classification. Five extended fingers imply an open hand, and no extended finger implies a fist.

Figure 2.6: Emotion recognition using hand movements [Web 10]

2.3.4 Motor Behavioural Patterns

The changes in a person's behavioral patterns with muscle tension, strength, coordination, and frequency can also help characterize changes in their emotional state when using the correct machine learning algorithms. As a result, these are useful factors for machine learning-based emotion identification.
A cheerful state, for example, is shown by symmetric up-and-down hand gestures. This method leverages the fact that our body muscles react significantly to changes in our emotional state, as a reflex action. While we might not even be aware of how prominent these changes are, these motor behavioral changes, if recorded and analyzed properly through machine learning techniques, act as great indicators for emotion detection.

Figure 2.7: ER using walking behavioural features [Randhavane, 2020]

2.3.5 Biosignals

Emotion detection through biosignals is the process of analyzing the biological changes occurring with emotion changes. Biosignals include heart rate, temperature, pulse, respiration, perspiration, skin conductivity, electrical impulses in the muscles, and brain activity. For example, a rapidly increasing heart rate indicates a state of stress or anxiety [Web 7].

These biosignals, also known as physiological signals, aid in gaining knowledge about human physiological states. The problem is that a single biosignal is insufficient, because it can convey a variety of emotions. As a result, several biosignals from various areas of the body are combined and examined as a whole.
These biosignal combinations are then analyzed using machine learning techniques such as convolutional neural networks (CNNs) and classification algorithms such as regression trees, support vector machines, linear discriminant analysis, and Naive Bayes, among others. This technology is practical since it is now possible to record and analyze biosignals using smart wearable devices. More complicated biosignals are also recorded for healthcare purposes using electroencephalography (EEG), electrocardiography (ECG), and electromyography (EMG).

Figure 2.8: Emotion recognition using biosignals [Web 11]

In conclusion, the greatest results in emotion detection using machine learning may be obtained by combining two or more of these approaches. As the number of users grows, the learning ability improves, and the data obtained from these strategies enhances the results. EEG, as a breakthrough machine learning modality for understanding emotions, has the potential to be a game-changing method for treating millions of individuals all over the world.
2.4 Existing applications for anxiety's detection and treatment

2.4.1 Wysa: Depression and anxiety therapy chatbot

Wysa is an emotionally intelligent chatbot that uses AI to react to the user's expressed emotions, for free. Users can talk to the cute penguin or use its free mindfulness exercises for effective anxiety relief, depression and stress management. With the highest rating among health care apps (4.8/5) and over 1 million downloads, Wysa obtained the ORCHA prize for best stress apps (ORCHA: a British organization for testing and reviewing health apps) and the Editor's Choice WMHD (World Mental Health Day), both for 2019.

Figure 2.9: Screenshot from the Penguin chatbot [Web 12]

If patients are dealing with stress, anxiety and depression, or coping with low self-esteem, then talking to Wysa can help them relax and get unstuck; it is empathetic, helpful, and will never judge. Users can overcome their mental health obstacles through empathetic conversation and free CBT (Cognitive Behavioral Therapy) based techniques. It is used around the clock and trusted by 1,000,000 people. For extra support, people can get guidance from a real human coach, a skilled psychologist who takes them through advanced coaching sessions adapted to their needs.

Users can vent and talk through things, or just reflect on their day with the AI chatbot, and practice CBT and DBT (Dialectical Behavior Therapy) techniques
to build resilience in a fun way, using one of 40 conversational coaching tools which help in dealing with stress, anxiety, depression, panic attacks, worry, loss, or conflict, and in managing anxious thoughts through deep breathing, techniques for observing thoughts, visualization, and tension relief [Web 12]. In brief, this application gathers data through the messages received from users, and AI algorithms improve their answers from this huge database.

2.4.2 Daylio

Daylio is a highly flexible tool which we can use to track whatever we want: exercise, meditation, eating, and gratitude, acting as a fitness goal buddy, mental health coach, and food log. This program looks after mental, emotional, and physical well-being; self-care is essential for a better mood and less worry. With a 4.6/5 rating from 320,000 users, Daylio has surpassed 10 million downloads.

Figure 2.10: Screenshots from Daylio [Web 13]

This application is built on three principles:
1. Reach happiness and self-improvement by being mindful of our days.
2. Validate our hunches: how does our new hobby influence our life?
3. Form a new habit in an obstacle-free environment with no learning curve.

Finally, this app does not gather any data, as mentioned on its Google Play page: "We don't send your data to our servers. We don't have access to your entries. Also, any other third-party app can't read your data."
2.4.3 Headspace

With over 10 million downloads and a 4.6/5 rating from 200 thousand users, this piece of software is among the most popular mental health care applications. Headspace is a mindfulness app that users may utilize in their daily lives. Users learn meditation and mindfulness methods from world-class experts and build tools to help them focus, breathe, stay calm, and create balance in their lives, whether they require stress relief or sleep assistance.

Figure 2.11: Screenshots from Headspace [Web 15]

Users will learn how to deal with tension and anxiety, as well as how to calm their minds:

• Stress & anxiety meditation: managing anxiety, letting go of stress.
• Falling asleep & waking up meditation: sleep, restlessness.
• Work & productivity meditation: finding focus, prioritization, productivity, creativity and student meditations.
• Movement & sports meditation: motivation, focus, training, competition, recovery.
• Physical health mindfulness training: mindful eating, pain management, pregnancy, coping with cancer [Web 14].

In brief, Headspace does not gather any data; it does not use chatbots or surveys to collect user data.
2.4.4 Calm

This app is a popular choice for meditation and sleep. With guided meditations, Sleep Stories, breathing programs, masterclasses, and calming music, millions of individuals enjoy reduced tension and anxiety and more peaceful sleep. Top psychiatrists, therapists, and mental health professionals have endorsed the app, according to its creators.

Guided meditation sessions are available in lengths of 3, 5, 10, 15, 20 or 25 minutes, so users can choose the perfect length to fit their schedule. Calming anxiety, managing stress, deep sleep, focus and concentration, relationships, breaking habits, happiness, gratitude, self-esteem, body scan, loving-kindness, forgiveness, non-judgment, commuting to work or school, mindfulness at college, mindfulness at work, walking meditation, and calm kids are just a few of the topics covered [Web 16].

Figure 2.12: Mindful days follow-up schedule from Calm [Web 17]

In sum, Calm concentrates on relaxing music, sleep stories and breathing programs; the app does not use chatbots or gather any data to improve its algorithms.

2.4.5 AntiStress, Relaxing, Anxiety Stress Relief Game

With more than 5 million installs and a 4.2/5 rating from 53 thousand users, the app provides users relaxation with satisfying games that are designed
with great concepts and full of relaxation toys, which users can enjoy to create fun moments in their hectic routine. This 2021 relaxing game with color therapy is for all ages; users just need to download it and plunge into it for unlimited fun and relaxation.

Figure 2.13: Many relaxing and coloring games to treat anxiety [Web 19]

The app contains: realistic 3D brain exercise and relaxation, different mind-freshness toys, high-quality relaxing sounds to release stress, a realistic experience of releasing stress in minutes, smooth controls to play with the 3D fidget toys, and different relaxation toy missions [Web 18]. In brief, this application does not gather any data or information from users.

2.4.6 Shine: Calm Anxiety Stress

This application has more than 100 thousand downloads and a 4.8/5 rating. The application helps users learn a new self-care strategy every day, get support from a diverse community, and access an audio library of 800+ original meditations, bedtime stories, and calming sounds to help patients
shift their mindset or mood, plus meditations specific to the mental health challenges faced by members of marginalized groups [Web 20].

Topics include: Black well-being, calming anxiety, reducing stress, confidence, growth, improving sleep, focus, burnout, forgiveness, self-love, motivation, creativity, finding joy, managing work frustrations, strengthening relationships and creating healthy habits.

Figure 2.14: Screenshots from Shine app [Web 21]

To summarize, Shine does not collect any information or data from its users; it is just a classically programmed app [Web 20].

2.5 Available anxiety elicitation-based datasets

Anxiety affects human capabilities and behavior as much as it affects productivity and quality of life. It is considered to be a main cause of depression and suicide. Anxious states are detectable by specialists by virtue of their acquired cognition and skills, but there is a need for non-invasive, reliable techniques that perform the complex task of anxiety detection. Several works, such as [García-Martínez et al., 2017], [Arsalan et al., 2019],
and [Zhang et al., 2020], have been proposed to recognize anxious states. There is no consensus either about the elicitation of anxious states or about the labels, which makes existing works very different and difficult to compare.

• Recently, a new dataset known as DASPS for anxiety level recognition [Baghdadi et al., 2020], recorded with a low-cost portable EEG device (EMOTIV EPOC) with 14 channels, was released. The EEG recordings were taken from 23 participants. DASPS is characterized by a therapeutic elicitation which triggers different levels of anxiety in participants by self-recall of stressful situations. To assign labels at two and four levels, the Hamilton score was taken from a questionnaire filled in before and after the experiment.
• In the same context, Arsalan et al. [Arsalan et al., 2019] carried out a psychological experiment on 28 participants by recording EEG signals using a low-cost portable EEG device (MUSE: a wearable brain-sensing headband that measures brain activity via 4 electroencephalography EEG sensors). Preparing an oral presentation was used as the stressful activity to trigger perceived mental stress. Three sessions were recorded: pre-activity, when participants were in a resting position; activity, when they prepared the presentation; and post-activity, the public oral presentation. Arsalan et al. showed that only the pre-activity EEG recordings are well correlated with two and three stress levels, respectively; in the classification task, only pre-activity EEG signals are considered.
• Anxiety disorder is recognized through the Healthy Brain Network (HBN) dataset [Alexander et al., 2017], launched by the American Institute of Child Psychology, which includes data collected from children and adolescents (ages 5 to 17) in New York City. HBN was proposed to diagnose and intervene in the mental health of minors. The dataset also contains eye movements and large EEG recordings. Zhang et al. [Zhang et al., 2020] selected 92 subjects (45 children considered anxious and 47 considered normal) to conduct experiments according to the Screen for Child Anxiety Related Disorders (SCARED) scale. They extracted PSD (Power Spectral Density) features from the gamma band and transformed them using a newly proposed Group Sparse Canonical Correlation Analysis (GSCCA) to achieve 82.70% accuracy with an SVM classifier.
2.5.1 Problem of imbalanced dataset

First, let us explain clearly what a balanced dataset is. Consider flower pictures as positive values and tree pictures as negative values: in a balanced dataset, the numbers of positive and negative values are approximately the same, whereas in an imbalanced dataset there is a very large difference between the numbers of positive and negative values. Using an imbalanced dataset will produce wrong learning and, finally, wrong classification results. For this reason, we must generate a sufficient amount of new data for each category before beginning the training and testing steps, in order to obtain precise results with high accuracy.

Figure 2.15: GAP results between balanced and imbalanced dataset

2.6 Conclusion

The EEG-based emotion recognition task is very crucial for human daily life, especially to maintain mental health and to catch information at an early stage, before a situation deteriorates. In this chapter, we detailed several concepts related to the EEG-based emotion recognition task. The specificities of EEG signals, where the representation of the anxiety state is explained, were presented, together with a brief review of the machine learning methods used for emotion recognition. We then mentioned the most famous existing applications for anxiety detection and treatment, and finally presented the available EEG datasets from several scientific experiments, such as the DASPS dataset. In the next chapter, a data augmentation step is proposed in order to provide sufficient data for the training of the neural network.
Chapter 3
EEG data augmentation using CVAE

3.1 Introduction

Unsupervised learning consists in modeling the underlying structure or distribution of the data in order to learn more about it. These methods are called unsupervised because, unlike in supervised learning, there are no correct answers and there is no teacher: algorithms are left to their own devices to discover and present the interesting structure in the data. The variational autoencoder, as a generative model, is based on unsupervised learning of the structure of the input data, with the aim of generating new data similar to the original real data.

3.2 The autoencoder

Autoencoders (AE) are a family of neural networks for which the produced output data is the same as the input data. They work by compressing the input into a latent-space representation, and then reconstructing the output from this representation. The general idea of autoencoders is simple: set up an encoder and a decoder as neural networks, and learn the best encoding-decoding scheme using an iterative optimization process. The search for the encoder and decoder that minimize the reconstruction error is done by gradient descent over the parameters of these networks. Figure 3.1 depicts the encoder and decoder of an autoencoder network.

Note that gradient descent is an optimization algorithm for finding a local minimum of a differentiable function; it is used to find the values of a function's parameters (coefficients) that minimize a cost function as far as possible. Note also that dimensionality reduction refers to techniques that reduce the number of input variables in a dataset. The more complex the architecture is, the higher the dimensionality reduction the autoencoder can achieve while keeping the reconstruction loss low.
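To make the encoder-decoder idea concrete, here is a minimal sketch of a dense autoencoder in Keras, the framework used for our COLAB code. The flattened 14 x 128 input shape and the layer sizes are illustrative assumptions, not the exact architecture used in this work.

```python
# Minimal dense autoencoder sketch in Keras.
# The input dimension (14 channels x 128 samples, flattened) and the
# latent size are illustrative assumptions, not the thesis architecture.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

input_dim = 14 * 128   # hypothetical flattened EEG segment
latent_dim = 32        # size of the latent-space representation

# Encoder: compress the input into the latent representation
inputs = keras.Input(shape=(input_dim,))
h = layers.Dense(256, activation="relu")(inputs)
z = layers.Dense(latent_dim, activation="relu")(h)

# Decoder: reconstruct the input from the latent representation
h_dec = layers.Dense(256, activation="relu")(z)
outputs = layers.Dense(input_dim, activation="linear")(h_dec)

# The autoencoder is trained so that its output matches its input
autoencoder = keras.Model(inputs, outputs)
autoencoder.compile(optimizer="adam", loss="mse")

# Placeholder data: the same array is both input and target
x = np.random.randn(100, input_dim).astype("float32")
autoencoder.fit(x, x, epochs=5, batch_size=16, verbose=0)
```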
Figure 3.1: The autoencoder architecture [Web 22]

Second, most of the time, the final purpose of dimensionality reduction is not only to reduce the number of dimensions of the data, but to do so while keeping the major part of the data structure information in the reduced representations. For these two reasons, the dimension of the latent space and the "depth" of the autoencoder (which define the degree and the quality of the compression) have to be carefully controlled and adjusted depending on the final purpose of the dimensionality reduction.

Figure 3.2: The latent space regularization problem [Web 23]

The regularity of the latent space of an autoencoder is a difficult point that depends on the distribution of the data in the initial space, the dimension of the latent space and the architecture of the encoder. The high degree of freedom of the autoencoder, which makes it possible to encode and decode with no information loss (despite the low dimensionality of the latent space), leads to severe overfitting.
According to Figure 3.2, we can notice that the problem of the autoencoder's latent space regularity is much more general and needs special attention. Indeed, the autoencoder is not trained to enforce such an organisation: it is solely trained to encode and decode with as little loss as possible, no matter how the latent space is organised.

3.3 Variational AutoEncoder (VAE)

Variational autoencoders (VAEs) are a deep learning technique for learning latent representations. They have also been used to draw images or to generate new data in semi-supervised learning. The VAE is a generative model that estimates the Probability Density Function (PDF) of the training data. The fundamental property that separates VAEs from standard autoencoders, and makes them so useful for generative modeling, is that their latent spaces are, by design, continuous, allowing easy random sampling and interpolation. In brief, the VAE has more parameters to tune, which gives significant control over how we model the latent distribution, and therefore yields meaningful, high-quality outputs.

The VAE training objective is to maximize the likelihood of the training data, as described by equation 3.1, according to the model shown in Figure 3.3, where x is the input, z is the latent vector (the hidden representation) and θ represents the network parameters:

\[ p_\theta(x) = \int p_\theta(z)\, p_\theta(x \mid z)\, dz \tag{3.1} \]

The choice of the output distribution is mainly Gaussian, i.e. p(x|z; θ) = N(x | f(z; θ), σ²·I), where f(z; θ) is modeled using a neural network and σ is a hyperparameter that multiplies the identity matrix I. The formula for pθ(x) is intractable, because it requires exponential time to compute, as it needs to be evaluated over all configurations of the latent variables. To solve this problem, an additional encoder network qφ(z|x) is defined to approximate the true posterior pθ(z|x). The marginal likelihood of an individual data point can then be rewritten as follows:

\[ \log p_\theta(x^{(i)}) = D_{KL}\big(q_\phi(z \mid x^{(i)}) \,\|\, p_\theta(z \mid x^{(i)})\big) + \mathcal{L}(\theta, \phi, x^{(i)}) \tag{3.2} \]

The first term of equation 3.2 is the KL (Kullback-Leibler) divergence between the approximate posterior and the true posterior, a quantity which intuitively measures how similar two distributions are. The second term is the variational lower bound on the marginal likelihood of data point i.
Figure 3.3: The variational autoencoder model [Web 23]

Since the Kullback-Leibler divergence is always greater than or equal to zero, minimizing it is equivalent to maximizing the variational lower bound, and equation 3.2 can be rewritten as follows:

\[ \log p_\theta(x^{(i)}) \geq \mathcal{L}(\theta, \phi, x^{(i)}) \tag{3.3} \]

\[ \mathcal{L}(\theta, \phi, x^{(i)}) = \mathbb{E}_{z}\big[\log p_\theta(x^{(i)} \mid z)\big] - D_{KL}\big(q_\phi(z \mid x^{(i)}) \,\|\, p_\theta(z)\big) \tag{3.4} \]

The loss function of this network (equation 3.4) consists of two terms: the first penalizes the reconstruction error, and the second encourages the learned distribution qφ(z|x) to be similar to the true prior distribution pθ(z).

The VAE architecture is presented in Figure 3.4, where the encoder model learns a mapping from x to z and the decoder model learns a mapping from z back to x. The encoder output is constrained to two vectors describing the mean µz|x and the variance σz|x of the latent state distributions. The decoder generates a latent vector by sampling from these defined distributions and proceeds to develop a reconstruction of the original input. Using the backpropagation technique to optimize the loss is not directly feasible, because the sampling process is random. To solve this problem, a "reparameterization trick" is used, which consists in randomly sampling from a standard normal distribution, multiplying the sample by the standard deviation σz|x and adding the mean µz|x to the result, as described in Figure 3.5.
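As a concrete illustration, the reparameterization trick can be written as a small custom Keras layer. The sketch below follows the standard formulation z = µ + exp(0.5 · log σ²) · ε with ε ~ N(0, I); it is in the spirit of the sampling layer of our COLAB code (part 1 of Section 3.6), but it is not its verbatim implementation.

```python
# Minimal sketch of a sampling layer implementing the reparameterization
# trick: z = mean + exp(0.5 * log_var) * epsilon, with epsilon ~ N(0, I).
# In the spirit of our COLAB sampling layer; the exact code may differ.
import tensorflow as tf
from tensorflow.keras import layers

class Sampling(layers.Layer):
    """Uses (z_mean, z_log_var) to sample z from N(z_mean, exp(z_log_var))."""

    def call(self, inputs):
        z_mean, z_log_var = inputs
        batch = tf.shape(z_mean)[0]
        dim = tf.shape(z_mean)[1]
        # The random noise is drawn independently of the learned parameters,
        # so gradients can flow through z_mean and z_log_var.
        epsilon = tf.random.normal(shape=(batch, dim))
        return z_mean + tf.exp(0.5 * z_log_var) * epsilon
```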
Figure 3.4: The optimisation of the variational autoencoder model [Web 23]

Figure 3.5: The reparameterization trick

3.4 The gap between AE and VAE

An autoencoder accepts an input, compresses it, and then recreates the original input. This is an unsupervised technique, because all we need is the original data, without any labels of known, correct results. The two main uses of an autoencoder are to compress data to two (or three) dimensions so it can be graphed, and to compress and decompress images or documents, which removes noise in the data.
A variational autoencoder assumes that the source data has some sort of underlying probability distribution (such as a Gaussian) and then attempts to find the parameters of that distribution. Implementing a variational autoencoder is much more challenging than implementing an autoencoder. The one main use of a variational autoencoder is to generate new data related to the original source data; exactly what the additional data is good for is hard to say in general. A variational autoencoder is a generative system, and serves a similar purpose as a generative adversarial network (although GANs work quite differently) [Web 24]. To conclude, if we want precise control over our latent representations and what we would like them to represent, then we should choose the VAE [Web 23].

Figure 3.6: Real difference between AE and VAEs on MNIST dataset [Web 25]

3.5 Conditional Variational Autoencoder

The only problem with generating data using variational autoencoders is that we do not have any control over what sort of data they generate. To explain the principle: when we train a VAE with the EEG dataset and try to produce new signals by feeding Z ∼ N(0, 1) into the decoder, it will generate random outputs. If we train the decoder well, we will have better signal quality, but we will still have no control over which EEG signal precisely it will produce; we cannot decide exactly what we want to get in the output.
Figure 3.7: The network of the CVAE. Here, the symbols denote one real sample, its real label, a generated sample, the mean value, the standard deviation, the resampled noise, the encoder (Enc), and the decoder (Dec), respectively.

For this, we should change our VAE architecture. Given an input Y (the label of the EEG), we want our generative model to produce the output X (the EEG). Thus, the VAE process is modified as follows: given an observation y, z is drawn from the prior distribution pθ(z|y), and the output x is produced from the distribution pθ(x|y, z). Note that, for the simple VAE, the prior is pθ(z) and the output is produced by pθ(x|z). Therefore, the encoder part tries to learn qφ(z|x, y), which is equivalent to learning the hidden representation of the data X, i.e. encoding X into the hidden representation conditioned on y. The decoder part tries to learn pθ(x|z, y), which decodes the hidden representation back to the input space conditioned on y. The graphical model is shown below.

Figure 3.8: The architecture of a Conditional Variational Autoencoder [Web 26]

In this method, we aim to generate data with a specific category.
As shown in Figure 3.8, to control the generated category, an extra label Y is added to the encoder and the decoder. Firstly, we feed the training data point and the corresponding label to the encoder; secondly, we concatenate the hidden representation with the corresponding label and feed it to the decoder to train the network; thirdly, we can generate data with a specific label by feeding the decoder with noise sampled from the Gaussian distribution together with the assigned label.

3.6 Coding in COLAB

Our code in COLAB is divided into 4 parts:

1. Create a sampling layer.
2. Define the standalone encoder model.
3. Define the standalone decoder model.
4. Define the CVAE as a Model with a custom training step.

The generation process is performed with the sampling layer followed by the decoder; we generated different samples, with the labels encoded on 4 bits. A condensed sketch of how the label conditioning fits into the encoder and decoder is given below.
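The following is a minimal sketch of parts 1-3 above: the label is concatenated with the input on the encoder side and with the latent vector on the decoder side, reusing the Sampling layer sketched earlier. The layer sizes and the flattened input shape are illustrative assumptions; only the 4-bit label encoding follows the description above.

```python
# Minimal CVAE skeleton in Keras: the label is concatenated with the
# input (encoder side) and with the latent vector (decoder side).
# Layer sizes are illustrative; the 4-bit label follows the text above.
from tensorflow import keras
from tensorflow.keras import layers

input_dim, label_dim, latent_dim = 14 * 128, 4, 32  # assumed shapes

# Encoder: q(z | x, y)
x_in = keras.Input(shape=(input_dim,), name="eeg")
y_in = keras.Input(shape=(label_dim,), name="label")
h = layers.Concatenate()([x_in, y_in])
h = layers.Dense(256, activation="relu")(h)
z_mean = layers.Dense(latent_dim, name="z_mean")(h)
z_log_var = layers.Dense(latent_dim, name="z_log_var")(h)
z = Sampling()([z_mean, z_log_var])  # the sampling layer sketched earlier
encoder = keras.Model([x_in, y_in], [z_mean, z_log_var, z], name="encoder")

# Decoder: p(x | z, y)
z_in = keras.Input(shape=(latent_dim,))
y_dec = keras.Input(shape=(label_dim,))
d = layers.Concatenate()([z_in, y_dec])
d = layers.Dense(256, activation="relu")(d)
x_out = layers.Dense(input_dim, activation="linear")(d)
decoder = keras.Model([z_in, y_dec], x_out, name="decoder")
```

Training (part 4) would then wrap the encoder and decoder in a keras.Model with a custom train_step combining the reconstruction loss with the KL term of equation 3.4.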
3.7 Evaluation of the generation process

It is important to ensure that the generated samples are of high quality; in other words, that they are realistic and diverse. A lack of diversity among the generated samples is an indicator of mode collapse, meaning that the generator has collapsed into generating only limited modes of the real data. We therefore use several qualitative and quantitative metrics to evaluate the quality of the samples generated by the VAE in terms of diversity and similarity with the real samples.

Visualization: Before we start, we have to define t-SNE. t-Distributed Stochastic Neighbor Embedding (t-SNE) is an unsupervised, non-linear technique primarily used for data exploration and for visualizing high-dimensional data; in simpler terms, t-SNE gives a feel or intuition of how the data is arranged in a high-dimensional space.

Figure 3.9: t-SNE representation of real and generated data

We visually inspect the quality of the artificial samples by mapping the generated and real samples into two dimensions using t-SNE and the temporal distribution: t-SNE is applied to map the high-dimensional real (training) and generated EEG samples into 2-D space. Figure 3.9 displays a 2D plot of the anxiety classes in the latent space. It can be seen that the t-SNE embeddings of the real and generated samples are similar; in addition, real and generated rest samples have similar distributions. Besides comparing the generated samples with the training samples, it is interesting to compare them with the test samples: in fact, the similarity between the generated and the test samples explains the classification improvement brought by the augmentation.

The training set includes the samples from all subjects excluding the target subject, the generated set includes the generated samples for the target subject, and the test set includes the second half of the target subject's samples, which were not seen during training. Thus, no overlap exists between the training, test, and generated sets. The results verified that the generated samples were indeed realistic and diverse.
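As an indication of how such a 2-D map can be produced, the following is a hedged sketch using scikit-learn's TSNE; the arrays x_real and x_gen are hypothetical placeholders for the flattened real and generated segments, not our actual data loading code.

```python
# Sketch: project real and generated EEG segments to 2-D with t-SNE.
# `x_real` and `x_gen` are hypothetical placeholder arrays.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

x_real = np.random.randn(200, 14 * 128)  # placeholder for real samples
x_gen = np.random.randn(200, 14 * 128)   # placeholder for generated samples

# Embed both sets jointly so their 2-D coordinates are comparable
embedded = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(
    np.vstack([x_real, x_gen])
)

n = len(x_real)
plt.scatter(embedded[:n, 0], embedded[:n, 1], s=8, label="real")
plt.scatter(embedded[n:, 0], embedded[n:, 1], s=8, label="generated")
plt.legend()
plt.title("t-SNE of real vs generated EEG segments")
plt.show()
```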
Figure 3.10: Topographical map of real and generated data

To evaluate the quality of the generated data, we plotted the topographical maps of the generated data and of the real data; in general, the signals are similar.

3.8 Conclusion

In summary, because a sufficient amount of data is needed for good machine learning prediction, we used variational autoencoders to generate new data. An autoencoder is a neural network that is trained to attempt to copy its input to its output. The problem with variational autoencoders is that we do not have any control over what sort of data they generate, so we decided to use conditional VAEs as a solution to improve the outputs.
Chapter 4
Emotion recognition using recurrent CNN

4.1 Introduction

The problem of classifying multi-channel electroencephalogram (EEG) time series consists in assigning their representation to one of a fixed number of classes. This is a fundamental task in many healthcare applications, including anxiety detection ([Baghdadi et al., 2017], [Fourati et al., 2020] and [Baghdadi and Aribi, 2019]) and epileptic seizure prediction [Tsiouris et al., 2018], and also in affective computing applications such as EEG-based emotion recognition ([Fourati et al., 2020]). The problem has been tackled by a wealth of different approaches, spanning from signal decomposition techniques for EEG signals to feature extraction and feature selection algorithms, as highlighted in the surveys [Baghdadi et al., 2016], [Movahedi et al., 2017], [Mahmud et al., 2018], and [García-Martínez et al., 2019].

Representation learning, or feature learning [Bengio et al., 2013], consists in automatically discovering the relevant representations for a classification or detection task directly from raw data. Consequently, laborious handcrafted features are no longer needed, since representation learning permits both learning the features and using them to perform a specific task. In this chapter, we focus on EEG representation learning for anxiety states recognition using a recurrent convolutional neural network in a subject-independent context.

4.2 Convolutional Neural Network

4.2.1 CNN principle

A Convolutional Neural Network (CNN) is a deep neural network originally designed for image analysis. Recently, it was discovered that the CNN also has an excellent capacity for sequence data analysis, for example in natural language processing. A CNN always contains two basic operations, namely convolution and pooling.
The convolution operation, using multiple filters, is able to extract features (edges) from the dataset, through which their corresponding spatial information can be preserved. The pooling operation, also called sub-sampling, is used to reduce the dimensionality of the feature maps produced by the convolution operation; max pooling and average pooling are the most common pooling operations used in CNNs. Due to the complexity of CNNs, ReLU is the common choice of activation function for transferring the gradient during training by backpropagation [Jitendra Verma, 2020].

4.2.2 CNN applications

Convolutional neural networks (CNNs) are most often utilized for classification and computer vision tasks, such as pedestrian and object detection for self-driving cars, face recognition on social media or for securing mobile phones, image analysis in healthcare (detecting tumours and diseases), quality inspection in manufacturing, security in airports, improving results in search engines, recommender systems (as in YouTube, Amazon and Facebook, etc.), emotion recognition, and stock and currency value prediction.

Figure 4.1: Charles Camiel looks into the camera for a facial recognition test at Logan International Airport in Boston [Web 28]
4.2.3 CNN Architecture

In general, a CNN architecture consists of four kinds of layers: the convolutional layer, the pooling layer, the dense layer and the output layer; a minimal code sketch of such a stack is given after Figure 4.2.

• Convolutional layer: The convolutional layer is the backbone of any CNN model. It is the layer where the image is scanned pixel by pixel and a feature map is created for the later classification stages.
• Pooling layer: Pooling, also known as down-sampling, reduces the overall dimensions of the data. The information from each convolutional layer is reduced to only the most necessary data. The process of stacking convolutional layers and applying pooling may be repeated several times.
• Fully connected input layer: This is also known as flattening. The outputs of the last convolutional block are flattened into a single vector so that they can serve as the input to the next layer.
• Fully connected layer: Once feature extraction is done, this layer starts from randomly initialized weights, adjusts them during training, and predicts a suitable label.
• Fully connected output layer: This is the final layer of the CNN; it holds the scores for the classification labels and assigns a class to each image.

Figure 4.2: CNN architecture [Web 29]
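As a concrete illustration of the layer types just listed, here is a minimal Keras sketch. The input shape, filter counts and class count are placeholder assumptions, not values taken from our model.

from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(64, 64, 1)),                # placeholder input shape
    layers.Conv2D(16, (3, 3), activation="relu"),   # convolutional layer: builds feature maps
    layers.MaxPooling2D((2, 2)),                    # pooling layer: down-sampling
    layers.Conv2D(32, (3, 3), activation="relu"),   # conv + pool may repeat several times
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),                               # fully connected input layer (flattening)
    layers.Dense(64, activation="relu"),            # fully connected layer
    layers.Dense(4, activation="softmax"),          # output layer: one score per class
])
model.summary()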
4.2.4 Difference between Conv1D and Conv2D

We highlight the difference between Conv1D and Conv2D because our code relies on Conv2D; the differing shape conventions are illustrated in the snippet after Figure 4.3.

• With Conv1D, the kernel moves in only one direction. The input and output data of Conv1D are 2-dimensional (time and variables). It is mainly used for time series data such as audio, text and acceleration signals.
• With Conv2D, the kernel slides along two dimensions of the data, i.e. it moves in two directions (height and width). The input and output data of Conv2D are 3-dimensional (height, width and depth). It is mainly used for image data. The kernel matrix can extract spatial features from the data, detecting edges, colour distribution, etc.

Figure 4.3: Conv2D kernel sliding [Web 30]
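The following short snippet shows the two input conventions in Keras. The 14-channel dimension echoes the EEG cap used later in this chapter, but the sizes are otherwise arbitrary assumptions.

import numpy as np
from tensorflow.keras import layers

# Conv1D: the kernel slides along one axis; input is (time, variables).
x1 = np.random.randn(1, 100, 14).astype("float32")     # 1 sample, 100 time steps, 14 channels
print(layers.Conv1D(8, kernel_size=5)(x1).shape)       # -> (1, 96, 8)

# Conv2D: the kernel slides along two axes; input is (height, width, depth).
x2 = np.random.randn(1, 14, 100, 1).astype("float32")  # channels x time treated as a 2D map
print(layers.Conv2D(8, kernel_size=(3, 5))(x2).shape)  # -> (1, 12, 96, 8)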
4.3 Recurrent Neural Network

4.3.1 Definition

A recurrent neural network (RNN) is a type of artificial neural network that works on sequential data or time series data. Like feedforward networks and CNNs, RNNs use training data to learn. They are distinguished by their "memory": they take information from prior inputs to influence the current input and output. While traditional deep neural networks assume that inputs and outputs are independent of each other, the output of a recurrent neural network depends on the prior elements within the sequence. Although future events would also be helpful in determining the output of a given sequence, unidirectional recurrent neural networks cannot account for these events in their predictions.

4.3.2 RNN Applications

RNNs are commonly used for ordinal or temporal problems, such as language translation, natural language processing (NLP), speech recognition and image captioning; they are incorporated into popular applications such as Siri, voice search and Google Translate.

Figure 4.4: Gap between RNNs (left) and feedforward neural networks (right) [Web 31]

4.3.3 Long Short-Term Memory layer (LSTM)

The problem with RNNs is that, as time passes and they are fed more and more new data, they start to "forget" the previous data they have seen; this is known as the vanishing gradient problem. We therefore need some sort of long-term memory, which is just what LSTMs provide. The core concept of LSTM is the cell state and its various gates. An LSTM is a type of cell in a recurrent neural network used to process sequences of data in applications such as handwriting recognition, machine translation and image captioning. LSTMs address the vanishing gradient problem that occurs when training RNNs on long data sequences by maintaining history in an internal memory state based on the new input and the context from previous cells in the RNN.
The cell state, as illustrated in Figure 4.5, acts as a transport highway that transfers relevant information all the way down the sequence chain. It can be seen as the "memory" of the network. The cell state can, in theory, carry relevant information throughout the processing of the sequence, so even information from earlier time steps can make its way to later time steps, reducing the effects of short-term memory. As the cell state goes on its journey, information gets added to or removed from it via gates. The gates are themselves small neural networks that decide which information is allowed onto the cell state; during training, they learn which information is relevant to keep or forget.

Figure 4.5: Inside the LSTM cell [Web 32]
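For reference, the standard LSTM gate equations (the generic textbook formulation, not notation specific to this thesis) make the gating mechanism concrete. With input $x_t$, previous hidden state $h_{t-1}$, previous cell state $c_{t-1}$, sigmoid $\sigma$ and element-wise product $\odot$:

\begin{aligned}
f_t &= \sigma(W_f [h_{t-1}, x_t] + b_f) &&\text{(forget gate: what to discard from the cell state)}\\
i_t &= \sigma(W_i [h_{t-1}, x_t] + b_i) &&\text{(input gate: what new information to store)}\\
\tilde{c}_t &= \tanh(W_c [h_{t-1}, x_t] + b_c) &&\text{(candidate cell state)}\\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t &&\text{(cell state update)}\\
o_t &= \sigma(W_o [h_{t-1}, x_t] + b_o) &&\text{(output gate)}\\
h_t &= o_t \odot \tanh(c_t) &&\text{(new hidden state)}
\end{aligned}

The forget and input gates decide what is removed from and added to the cell state $c_t$, while the output gate decides which part of the cell state is exposed as the hidden state $h_t$.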
4.4 The proposed architecture for anxiety states recognition

In this section, we detail the components of Figure 4.6. The preprocessed EEG signals are fed directly to the CNN-LSTM model. A convolutional block is composed of a convolutional layer followed by a pooling layer; this block is responsible for the spatial encoding of the EEG time series. Since there is a relation between channels, we did not perform a 1D convolution, in which the kernel would be convolved with each channel separately. Instead, a 2D convolution is performed, which allows the channels to be processed together. The encoded CNN features are then fed to an LSTM layer for temporal parsing. Each LSTM cell encodes the CNN features along the time axis and forwards them to the next cell. The LSTM produces the last output activation, which is then classified with a softmax layer. Our architecture thus performs a spatio-temporal processing of the EEG time series.

Figure 4.6: CNN-LSTM architecture for anxiety states recognition

4.5 Anxiety states recognition results

4.5.1 Experimental setup

We chose Colab because it allows anybody to write and execute arbitrary Python code through the browser, and it is especially well suited to machine learning and data analysis. The main advantages of this environment are free access to GPUs, zero configuration, and easy access and sharing with other users. Our model contains four convolutional blocks, as follows (a code sketch of this stack follows the list):

• Input layer, Conv2D layer, batch normalization, leaky ReLU activation function, 2D max pooling layer and dropout.
• Conv2D layer, batch normalization, leaky ReLU activation function, 2D max pooling layer.
Figure 4.7: CNN-LSTM architecture description

• Conv2D layer, batch normalization, leaky ReLU activation function, 2D max pooling layer.
• Conv2D layer, batch normalization, leaky ReLU activation function, 2D max pooling layer.
• Flattening layer, reshape, LSTM, dropout, a first dense layer and a second dense layer (output).
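The following Keras sketch shows how such a stack can be assembled. The input shape (14 channels by 1920 samples), filter counts, kernel sizes and LSTM size are illustrative assumptions, not the exact hyperparameters of our trained model.

from tensorflow.keras import layers, models

def conv_block(x, filters, dropout=None):
    # Conv2D -> batch normalization -> leaky ReLU -> 2D max pooling (-> dropout)
    x = layers.Conv2D(filters, (3, 3), padding="same")(x)
    x = layers.BatchNormalization()(x)
    x = layers.LeakyReLU()(x)
    x = layers.MaxPooling2D((2, 2), padding="same")(x)
    if dropout:
        x = layers.Dropout(dropout)(x)
    return x

inp = layers.Input(shape=(14, 1920, 1))     # channels x time for one trial (assumed shape)
x = conv_block(inp, 16, dropout=0.3)        # block 1 (with dropout)
x = conv_block(x, 32)                       # blocks 2 to 4
x = conv_block(x, 64)
x = conv_block(x, 128)
x = layers.Flatten()(x)
x = layers.Reshape((-1, 128))(x)            # back to (time steps, features) for the LSTM
x = layers.LSTM(64)(x)                      # temporal parsing of the CNN features
x = layers.Dropout(0.3)(x)
x = layers.Dense(32, activation="relu")(x)  # first dense layer
out = layers.Dense(4, activation="softmax")(x)  # normal / light / moderate / severe

model = models.Model(inp, out)
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])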
4.5.2 Classification results without data augmentation

To prove the usefulness of the data augmentation approach, we first classify only the original EEG signals. The training and validation loss curves are depicted in Figure 4.8.

• The red curve refers to the training loss, which is the error on the training data set.
• The blue curve refers to the validation loss, which is the error after running the validation data set through the trained network.

Figure 4.8: Training and validation loss

• If the validation loss is much greater than the training loss, we call it overfitting.
• If the validation loss is much lower than the training loss, we call it underfitting.

While there are some fluctuations in the validation curve, training ends with a small gap between the training loss and the validation loss. Figure 4.9 presents the training and validation accuracy curves. The training set is used to train the model, while the validation set is only used to evaluate the model's performance. Training accuracy is the accuracy obtained when the model is applied to the training data, while validation (or testing) accuracy is the accuracy on unseen data. The validation accuracy is lower than the training accuracy because the model is already familiar with the EEG training data, whereas the validation data is a collection of data points that are new to the model. The fluctuations in the validation loss curve are also present in the validation accuracy curve; nevertheless, the training and validation accuracy curves remain close to each other.

In the field of machine learning, and specifically for the problem of statistical classification, a confusion matrix, also known as an error matrix, is a specific table layout that allows visualization of the performance of an algorithm. Each row of the matrix represents the instances of an actual class while each column represents the instances of a predicted class, or vice versa; both variants are found in the literature.
Figure 4.9: Training and validation accuracy of CNN-LSTM on the DASPS dataset

The name stems from the fact that the matrix makes it easy to see whether the system is confusing two classes (i.e. commonly mislabelling one as another). It is a special kind of contingency table, with two dimensions ("actual" and "predicted") and identical sets of "classes" in both dimensions (each combination of dimension and class is a variable in the contingency table).

Figure 4.10: Confusion matrix of CNN-LSTM on the DASPS dataset

When the test data is shown to the trained model, the CNN-LSTM achieves 89.96% accuracy. According to Figure 4.10, the model achieves its highest accuracy on the normal anxiety state, while the lowest accuracy is obtained for the light anxiety state. There is no confusion between severe and light trials. The highest confusion occurs between moderate and normal trials.
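For illustration, a confusion matrix like the one in Figure 4.10 can be computed and displayed with scikit-learn; the labels and predictions below are toy placeholders, not our experimental outputs.

import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay

classes = ["normal", "light", "moderate", "severe"]
y_true = [0, 0, 1, 2, 3, 3, 2, 1]   # actual anxiety states (toy data)
y_pred = [0, 2, 1, 0, 3, 3, 2, 1]   # model predictions (toy data)

cm = confusion_matrix(y_true, y_pred)  # rows: actual classes, columns: predicted classes
ConfusionMatrixDisplay(cm, display_labels=classes).plot()
plt.show()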
4.5.3 Classification results with data augmentation

To begin this part, we should present DASPS (a Database for Anxious States based on a Psychological Stimulation). DASPS comprises EEG signals for detecting anxiety levels: the electroencephalogram (EEG) signals of 23 subjects were captured during fear elicitation using face-to-face psychological cues. This work is innovative not only in making EEG data available to the affective computing community, but also in the design of a psychological stimulation protocol that provides comfortable conditions for participants in direct interaction with the therapist, as well as in the use of a wireless EEG cap with few channels, namely only 14 dry electrodes. The raw EEG data obtained from the 23 individuals is stored in .edf files in the database, which also contains preprocessed data in .mat format. The researchers provide a MATLAB script for segmenting each EEG signal into six segments, one for each of the six scenarios.

Figure 4.11: Uploading data to the GUI by the therapist
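As an illustration of how such a preprocessed trial can be inspected in Python, the sketch below uses SciPy. The file name and the variable key are assumptions, since the actual keys depend on how the DASPS .mat files were exported.

import scipy.io as sio

mat = sio.loadmat("S01preprocessed.mat")            # hypothetical file name
print([k for k in mat if not k.startswith("__")])   # inspect the available variables
eeg = mat["data"]                                   # assumed variable key
print(eeg.shape)  # expected to reflect trials x 14 channels x samples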
4.6 Graphical user interface

A GUI (graphical user interface) is a system of interactive visual components for computer software. A GUI displays objects that convey information and represent actions that can be taken by the user; the objects change colour, size or visibility when the user interacts with them.

Figure 4.12: GUI showing the EEG brain map in side (a) and the emotion prediction in side (b)

In Figure 4.11, the therapist can upload the recorded EEG signals (a preprocessed trial saved as a .mat file). The system then loads the trial and plots the topographical map, which helps in visualizing the activated brain regions, as illustrated in side (a) of Figure 4.12. The last interface consists in classifying the anxiety state of the subject: the trained CNN-LSTM model is saved and then called for every new trial to be classified. The anxiety state is depicted in side (b) of Figure 4.12.
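A minimal Tkinter sketch of this upload-and-classify flow is given below. The model file name, the .mat variable key and the reshaping step are assumptions for illustration; the real interface also plots the topographical map, which is omitted here.

import tkinter as tk
from tkinter import filedialog

import numpy as np
import scipy.io as sio
from tensorflow.keras.models import load_model

LABELS = ["normal", "light", "moderate", "severe"]
model = load_model("cnn_lstm_dasps.h5")  # hypothetical saved-model path

def upload_and_classify():
    # Let the therapist pick a preprocessed trial saved as a .mat file.
    path = filedialog.askopenfilename(filetypes=[("MAT files", "*.mat")])
    trial = sio.loadmat(path)["data"]            # assumed variable key
    x = trial.reshape(1, *trial.shape, 1)        # add batch and depth axes for Conv2D
    state = LABELS[int(np.argmax(model.predict(x)))]
    result_var.set("Predicted anxiety state: " + state)

root = tk.Tk()
root.title("EEG anxiety state prediction")
tk.Button(root, text="Upload trial (.mat)", command=upload_and_classify).pack(pady=10)
result_var = tk.StringVar(value="No trial loaded yet")
tk.Label(root, textvariable=result_var).pack(pady=10)
root.mainloop()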
4.7 The general methodology of our work

In the chart below, we summarize our framework in three basic steps:

1. The first step consists in gathering EEG signals and collecting the original data set; however, the amount of available data is very small, which is why we proceed to the second step: generating new data.
2. The second step consists in generating new data using the Conditional Variational Autoencoder model.
3. The final step consists in predicting the emotions using convolutional neural networks combined with Long Short-Term Memory cells.

Figure 4.13: Basic steps of our methodology

4.8 Project schedule

A Gantt chart is a type of bar chart that illustrates a project schedule, named after its inventor, Henry Gantt (1861–1919), who designed such a chart around the years 1910–1915. Modern Gantt charts also show the dependency relationships between activities and the current schedule status.

Figure 4.14: Gantt chart
In our Gantt chart (see Figure 4.14), we laid out seven basic tasks:

• First, planning the outline of the whole internship with the professors.
• Second, starting to write Chapter 1: Introduction.
• Third, in parallel with Chapter 1, starting the practical machine learning work; more details are given in the architecture of the project.
• Fourth, starting to write Chapter 2 at the beginning of April: background and literature review of EEG-based emotion recognition.
• Fifth, starting to write Chapter 3, covering in depth the internal architectures used: VAEs for generating new data and LSTM for classifying the emotions.
• Sixth, towards the end of April, creating the graphical user interface using Tkinter.
• Seventh and last, starting to write Chapter 4, covering the results and future optimizations for this project.

4.9 Conclusion

In conclusion, in this chapter we have discussed convolutional neural networks (principle, applications and architecture). We then discussed recurrent neural networks as a primary solution for predicting emotions, before moving to the LSTM technique, which is more suitable for continuous signals such as EEG, voice and text for translation. Finally, we ran our code on Colab, interpreted the results, and presented our graphical user interface for therapists.
Chapter 5
Conclusion and Future Work

This work, "Emotions prediction for augmented EEG signals using VAE and Convolutional Neural Networks combined with LSTM", is based on two main ideas: generating new data, and classifying emotions using both the original and the newly produced data. For the first idea, we used conditional variational autoencoders to generate new, high-quality EEG data; for the second, we used convolutional neural networks combined with the LSTM technique to classify the level of anxiety (normal, light, moderate or severe). The code is written in Python on Google Colab. Finally, we wrapped this code in a graphical user interface so that it can be used by ordinary users and therapists.

This work has enabled us to assess the patient's feelings without relying on many traditional questionnaires, surveys and tests, while taking into account the psychological and emotional state of the person, especially for those who suffer from psychological trauma, depression and negative feelings. It helps both the patient and the doctor, with high accuracy in emotion recognition and fast diagnosis. It speeds up the treatment process and avoids embarrassing the patient, taking into account the psychological situation of children, the disabled, and deaf and mute people, who find it difficult to express their emotions after a negative life event such as the death of a loved one, divorce, school failure or family problems. It also helps doctors in refugee camps in war zones to diagnose psychological conditions as quickly as possible, especially for those who suffer severe trauma after violent events. This work is an important step towards improving mental health care on two levels, diagnosis and treatment: it saves time and makes it easier to understand the situations of millions of patients around the world with a minimal amount of documents and questionnaires. This work can be improved and developed in coordination with official authorities, by cooperating with medical universities, doctors' clinics and the Ministry of Health to improve the database, and by interacting with psychiatrists, therapists, the Ministry of Women and Children, sociologists, paediatricians and other organizations.

Following the machine learning life cycle, this work can be improved in future projects using big data from many hospitals.
We could also use Generative Adversarial Networks (GANs) to generate more high-quality EEG signals, which would improve the prediction results when new inputs from new patients are used. In such an application, improving our algorithms is a necessary step towards better prediction results.
Bibliography

[Alexander et al., 2017] Alexander, L. M., Escalera, J., Ai, L., Andreotti, C., Febre, K., Mangone, A., Vega-Potler, N., Langer, N., Alexander, A., Kovacs, M., et al. (2017). An open resource for transdiagnostic research in pediatric mental health and learning disorders. Scientific Data, 4:170181.

[Arsalan et al., 2019] Arsalan, A., Majid, M., Butt, A. R., and Anwar, S. M. (2019). Classification of perceived mental stress using a commercially available EEG headband. IEEE Journal of Biomedical and Health Informatics, 23(6):2257–2264.

[Baghdadi and Aribi, 2019] Baghdadi, A. and Aribi, Y. (2019). Effectiveness of dominance for anxiety vs anger detection. In 2019 Fifth International Conference on Advances in Biomedical Engineering (ICABME), pages 1–4. IEEE.

[Baghdadi et al., 2016] Baghdadi, A., Aribi, Y., and Alimi, A. M. (2016). A survey of methods and performances for EEG-based emotion recognition. In International Conference on Hybrid Intelligent Systems, pages 164–174. Springer.

[Baghdadi et al., 2017] Baghdadi, A., Aribi, Y., and Alimi, A. M. (2017). Efficient human stress detection system based on frontal alpha asymmetry. In International Conference on Neural Information Processing, pages 858–867. Springer.

[Baghdadi et al., 2020] Baghdadi, A., Aribi, Y., Fourati, R., Halouani, N., Siarry, P., and Alimi, A. (2020). Psychological stimulation for anxious states detection based on EEG-related features. Journal of Ambient Intelligence and Humanized Computing, pages 1–15.

[Bengio et al., 2013] Bengio, Y., Courville, A., and Vincent, P. (2013). Representation learning: A review and new perspectives. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(8):1798–1828.

[Cong et al., 2015] Cong, F., Lin, Q.-H., Kuang, L.-D., Gong, X.-F., Astikainen, P., and Ristaniemi, T. (2015). Tensor decomposition of EEG signals: a brief review. Journal of Neuroscience Methods, 248:59–69.
[Fourati et al., 2020] Fourati, R., Ammar, B., Sanchez-Medina, J., and Alimi, A. M. (2020). Unsupervised learning in reservoir computing for EEG-based emotion recognition. IEEE Transactions on Affective Computing, to be published. doi:10.1109/TAFFC.2020.2982143.

[Freeman and Quiroga, 2012] Freeman, W. and Quiroga, R. Q. (2012). Imaging Brain Function with EEG: Advanced Temporal and Spatial Analysis of Electroencephalographic Signals. Springer Science & Business Media.

[García-Martínez et al., 2019] García-Martínez, B., Martínez-Rodrigo, A., Alcaraz, R., and Fernández-Caballero, A. (2019). A review on nonlinear methods using electroencephalographic recordings for emotion recognition. IEEE Transactions on Affective Computing, to be published. doi:10.1109/TAFFC.2018.2890636.

[García-Martínez et al., 2017] García-Martínez, B., Martínez-Rodrigo, A., Zangróniz, R., Pastor, J. M., and Alcaraz, R. (2017). Symbolic analysis of brain dynamics detects negative stress. Entropy, 19(5):196.

[Jitendra Verma, 2020] Jitendra Verma, Sudip Paul, P. J. (2020). Computational Intelligence and Its Applications in Healthcare. Elsevier.

[Mahmud et al., 2018] Mahmud, M., Kaiser, M. S., Hussain, A., and Vassanelli, S. (2018). Applications of deep learning and reinforcement learning to biological data. IEEE Transactions on Neural Networks and Learning Systems, 29(6):2063–2079.

[Movahedi et al., 2017] Movahedi, F., Coyle, J. L., and Sejdić, E. (2017). Deep belief networks for electroencephalography: A review of recent contributions and future outlooks. IEEE Journal of Biomedical and Health Informatics, 22(3):642–652.

[NeuroSky, 2009] NeuroSky, Inc. (2009). Brain wave signal (EEG) of NeuroSky, Inc. Last accessed June 30, 2020.

[Niedermeyer and da Silva, 2005] Niedermeyer, E. and da Silva, F. L. (2005). Electroencephalography: Basic Principles, Clinical Applications, and Related Fields. Lippincott Williams & Wilkins.

[Picard, 1997] Picard, R. (1997). Affective Computing. The MIT Press.

[Randhavane, 2020] Randhavane, T. (2020). Identifying emotions from walking using affective and deep features. arXiv, pages 1–15.
[ScienceDaily, 2018] ScienceDaily (2018). University of Toronto: Mind-reading algorithm uses EEG data to reconstruct images based on what we perceive: New technique using EEG shows how our brains perceive faces. Last accessed June 30, 2020.

[Steriade, 2005] Steriade, M. (2005). Cellular substrates of brain rhythms. In Electroencephalography: Basic Principles, Clinical Applications, and Related Fields, 5:31–83.

[Tsiouris et al., 2018] Tsiouris, K. M., Pezoulas, V. C., Zervakis, M., Konitsiotis, S., Koutsouris, D. D., and Fotiadis, D. I. (2018). A long short-term memory deep learning network for the prediction of epileptic seizures using EEG signals. Computers in Biology and Medicine, 99:24–37.

[Zhang et al., 2020] Zhang, X., Pan, J., Shen, J., Din, Z. U., Li, J., Lu, D., Wu, M., and Hu, B. (2020). Fusing of electroencephalogram and eye movement with group sparse canonical correlation analysis for anxiety detection. IEEE Transactions on Affective Computing, to be published. doi:10.1109/TAFFC.2020.2981440.
Netography (visited in May 2021)

[Web 1]: https://www.bbvaopenmind.com/en/technology/digital-world/what-is-affective-computing/
[Web 2]: https://www.researchgate.net/figure/Sketch-of-how-to-record-an-Electroencephalogram-An-EEG-allows-measuring-the-electrical
[Web 3]: https://www.sciencedirect.com/topics/agricultural-and-biological-sciences
[Web 4]: https://oxfordmedicine.com/view/10.1093/9780195173642.001.0001/
[Web 5]: https://adaa.org/understanding-anxiety/
[Web 6]: https://worldhappiness.report/archive/
[Web 7]: https://bluewhaleapps.com/blog/implementing-machine-learning-for-emotion-detection
[Web 8]: https://glengilmore.medium.com/facial-recognition-ai-will-use-your-facial-expressions-to-judge-creditworthiness-b0e9a9ac4174
[Web 9]: https://www.thesaurabh.com/images/posts/about-me/projects/speech-features-for-emotion-recognition/speech-features-for-emotion-recognition-header.jpg
[Web 10]: https://ai.googleblog.com/2019/08/on-device-real-time-hand-tracking-with.html
[Web 11]: https://medium.com/@mindpass2050/biosignals-as-dynamic-biometrics-d93c3455e895
[Web 12]: https://www.wysa.io/
[Web 13]: https://play.google.com/store/apps/details?id=net.daylio
[Web 14]: https://www.headspace.com/
[Web 15]: https://play.google.com/store/apps/id=com.getsomeheadspace
[Web 16]: https://www.calm.com/
[Web 17]: https://play.google.com/store/apps/
[Web 18]: https://play.google.com/store/apps/details?id=com.relextro.anti.stress
[Web 20]: https://www.theshineapp.com/
[Web 21]: https://play.google.com/store/apps/details?id=com.shinetext.shine
[Web 22]: https://www.researchgate.net/figure/Schematic-overview-of-convolutional-autoencoder-CAE-and-an-example-reconstruction
[Web 23]: https://towardsdatascience.com/understanding-variational-autoencoders-vaes-f70510919f73
[Web 24]: https://jamesmccaffrey.wordpress.com/2018/07/03/the-difference-between-an-autoencoder-and-a-variational-autoencoder/
[Web 25]: https://www.slideshare.net/andersonljason/variational-autoencoders-for-image-generation
[Web 26]: https://www.researchgate.net/figure/Variational-AutoEncoder-VAE-architecture
[Web 27]: https://pubmed.ncbi.nlm.nih.gov/23663147/
[Web 28]: https://www.npr.org/sections/alltechconsidered/2017/06/26/534131967/facial-recognition-may-boost-airport-security-but-raises-privacy-worries
[Web 29]: https://medium.com/@codyorazymbet/a-quick-introduction-to-cnn-layers-3b598e9d9963
[Web 30]: https://www.programmersought.com/article/63014851723/
[Web 31]: https://medium.com/analytics-vidhya/cnn-vs-rnn-vs-ann-analyzing-3-types-of-neural-networks-in-deep-learning-f3fa1249589d
[Web 32]: https://www.researchgate.net/figure/LSTM-cell-structure