Speech Emotion Recognition
Guided by:
Mrs. R.K. Patole
111707049-Pragya Sharma
141807005-Kanchan Itankar
141807008-Saniya Shaikh
141807009-Triveni Vyavahare
Aim
Speech Emotion
Recognition using
Machine Learning
Literature Review
[1] Speech Emotion Recognition using Neural Network and MLP Classifier (Jerry
Joy, Aparna Kannan, Shreya Ram, S. Rama)
● MLP Classifier
● 5 features extracted: MFCC, Contrast, Mel Spectrogram Frequency, Chroma
and Tonnetz
● Accuracy: 70.28%
[2] Voice Emotion Recognition using CNN and Decision Tree (Navya Damodar,
Vani H Y, Anusuya M A.)
● Decision Tree, CNN
● MFCCs extracted
● Accuracy: 72% (CNN), 63% (Decision Tree)
Objective
● To build a model that recognizes emotion from speech using the librosa and
sklearn libraries and the RAVDESS dataset.
● To present a model that classifies the emotion expressed in speech, based on a
neural-network MLP classifier and acoustic features such as
Mel Frequency Cepstral Coefficients (MFCC). The model is trained to
classify eight different emotions (calm, happy, fearful, disgust, angry, neutral,
surprised, sad).
Applications
Business Marketing
Suicide prevention
Voice Assistant
Motivation
● For human beings, speech is among the most natural ways to express ourselves. We depend
on it so much that we recognize its importance only when resorting to other forms of
communication, such as emails and text messages, where we often use emojis to express the emotions
associated with the messages. As emotions play a vital role in communication, their detection
and analysis are important in today’s digital world of remote
communication.
● Emotion detection is a challenging task because emotions are subjective, and there is no
consensus on how to measure them. We define a Speech Emotion Recognition
system as a collection of methodologies that process and classify speech signals to detect
the emotions embedded in them.
Introduction
● Human-machine interaction is now widely used in many applications. One medium
of interaction is speech, and one of the main challenges in human-machine interaction is detecting
emotion from speech.
● Emotion can play an important role in decision making. Emotion can also be detected from different
physiological signals. If emotion can be recognized reliably from speech, a system can
act accordingly. Emotion is identified by extracting features or
characteristics from the speech, and training on a large speech database is needed to
make the system accurate.
● An emotional speech dataset (RAVDESS) is selected, emotion-specific features are extracted
from its recordings, and finally an MLP classification model is used to recognize the emotions.
System Block Diagram
Methodology
Preprocessing → Feature Extraction → Classification
1. Preprocessing
The removal of unwanted noise from the speech signal.
➢ Silence removal
➢ Background noise removal
➢ Windowing
➢ Normalization
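The preprocessing steps listed above can be sketched roughly as follows. This is a minimal illustration on a synthetic signal, assuming NumPy is available; the function names, the 0.02 silence threshold, and the 400-sample frame with a 160-sample hop are illustrative assumptions, not values taken from the slides. Background noise removal is not covered by this sketch.

```python
import numpy as np

def peak_normalize(signal):
    """Scale the signal so its largest absolute sample is 1.0."""
    peak = np.max(np.abs(signal))
    return signal / peak if peak > 0 else signal

def trim_silence(signal, threshold=0.02):
    """Drop leading and trailing samples whose magnitude is below threshold."""
    loud = np.flatnonzero(np.abs(signal) >= threshold)
    if loud.size == 0:
        return signal[:0]
    return signal[loud[0]:loud[-1] + 1]

def frame_and_window(signal, frame_length=400, hop_length=160):
    """Split the signal into overlapping frames and apply a Hann window to each."""
    if len(signal) < frame_length:
        return np.empty((0, frame_length))
    n_frames = 1 + (len(signal) - frame_length) // hop_length
    window = np.hanning(frame_length)
    frames = np.stack([
        signal[i * hop_length:i * hop_length + frame_length]
        for i in range(n_frames)
    ])
    return frames * window

# Synthetic stand-in for a recording: silence, a 220 Hz tone, silence (16 kHz).
sr = 16000
t = np.arange(int(0.5 * sr)) / sr
tone = 0.3 * np.sin(2 * np.pi * 220 * t)
silence = np.zeros(int(0.1 * sr))
raw = np.concatenate([silence, tone, silence])

clean = trim_silence(peak_normalize(raw))
frames = frame_and_window(clean)
print(len(raw), len(clean), frames.shape)
```

Normalizing before trimming makes the silence threshold relative to the loudest sample, so the same threshold behaves consistently across quiet and loud recordings.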
2. Feature Extraction
● Extract features from the audio file
● Used to identify how we speak
➢ Pitch
➢ Loudness
➢ Rhythm, etc.
Dataset
Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) dataset.
● [3] The RAVDESS dataset has recordings of 24 actors (12 male and 12 female),
numbered from 01 to 24, speaking in a North American accent.
● All emotional expressions are uttered at two levels of intensity, normal and strong,
except for the ‘neutral’ emotion, which is produced only at normal intensity. The
portion of RAVDESS that we use contains 60 trials for each of the 24 actors,
making 1440 files in total.
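The file count can be checked with a short sketch. It assumes the seven-field filename convention documented with the RAVDESS release (modality, vocal channel, emotion, intensity, statement, repetition, actor) and that each actor records two statements with two repetitions each; those details come from the dataset documentation, not from these slides. The helper names are hypothetical.

```python
from itertools import product

# Emotion codes as documented with the dataset (third filename field).
EMOTIONS = {
    "01": "neutral", "02": "calm", "03": "happy", "04": "sad",
    "05": "angry", "06": "fearful", "07": "disgust", "08": "surprised",
}

def parse_ravdess_name(filename):
    """Decode the label fields of a name like '03-01-06-01-02-01-12.wav'."""
    stem = filename.rsplit(".", 1)[0]
    _modality, _channel, emotion, intensity, statement, repetition, actor = stem.split("-")
    return {
        "emotion": EMOTIONS[emotion],
        "intensity": "normal" if intensity == "01" else "strong",
        "statement": int(statement),
        "repetition": int(repetition),
        "actor": int(actor),
    }

def trials_per_actor():
    """Count emotion x intensity x statement x repetition combinations."""
    count = 0
    for emotion, intensity, _statement, _repetition in product(
        EMOTIONS, ("01", "02"), ("01", "02"), ("01", "02")
    ):
        if emotion == "01" and intensity == "02":
            continue  # neutral is recorded at normal intensity only
        count += 1
    return count

info = parse_ravdess_name("03-01-06-01-02-01-12.wav")
total_files = trials_per_actor() * 24
print(info["emotion"], trials_per_actor(), total_files)
```

Seven emotions at two intensities plus neutral at one gives 15 expression types; with two statements and two repetitions that is 60 trials per actor, and 60 × 24 = 1440.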
[1] Training process workflow
[1] Testing process workflow
3. Classification
● Match the extracted features with the corresponding emotions
Multilayer Perceptron
Multi-Layer Perceptron Classifier
● A multilayer perceptron (MLP) is a class of feedforward
artificial neural network (ANN).
● An MLP consists of at least three layers of nodes: an input
layer, a hidden layer and an output layer.
● MLPs are suitable for classification problems
where inputs are assigned a class or label.
Building the MLP classifier involves the following steps:
1. Initialise the MLP classifier.
2. Train the neural network.
3. Predict.
4. Calculate the accuracy.
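The four steps above map onto scikit-learn roughly as shown below. This is a sketch on synthetic data from make_classification, not on RAVDESS features; the hidden layer size, iteration limit, class count and split ratio are illustrative assumptions rather than the settings used in this project.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for per-file feature vectors (NOT real audio features).
X, y = make_classification(
    n_samples=400, n_features=40, n_informative=20,
    n_classes=4, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# 1. Initialise the MLP classifier (one hidden layer; sizes are assumptions).
model = MLPClassifier(hidden_layer_sizes=(100,), max_iter=500, random_state=0)

# 2. Train the neural network.
model.fit(X_train, y_train)

# 3. Predict labels for the held-out samples.
y_pred = model.predict(X_test)

# 4. Calculate the accuracy.
accuracy = accuracy_score(y_test, y_pred)
print(f"accuracy: {accuracy:.2%}")
```

With real data, X would hold one feature vector per audio file and y the emotion label of each file.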
Multi-Layer Perceptron Classifier
Fig:- Multi-Layer Perceptron Classifier
Feature Extraction
From the audio data we extracted three key features, which are used in this work:
● MFCC (Mel Frequency Cepstral Coefficients)
● Mel Spectrogram
● Chroma
MFCC (Mel Frequency Cepstral Coefficients)
Mel Spectrogram
A Fast Fourier Transform is computed on overlapping windowed segments of the signal,
which gives the spectrogram. The Mel spectrogram is a spectrogram whose amplitudes
are mapped onto the Mel frequency scale.
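To make the Mel mapping concrete, here is a small sketch of one common Hz-to-Mel conversion, the HTK-style formula mel = 2595 · log10(1 + f / 700). Libraries may default to a different variant, so treat the exact formula as an assumption; the function names and the 0 to 8000 Hz range are illustrative.

```python
import math

def hz_to_mel(f_hz):
    """Convert a frequency in Hz to Mels (HTK-style formula)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_band_centers(f_min, f_max, n_bands):
    """Centre frequencies (Hz) of n_bands spaced equally on the Mel scale."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_bands + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_bands)]

centers = mel_band_centers(0.0, 8000.0, 10)
for c in centers:
    print(f"{c:8.1f} Hz")
```

Bands that are equally spaced in Mels sit close together at low frequencies and spread out at high frequencies, which roughly mirrors how listeners perceive pitch differences.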
Chroma
A Chroma vector is typically a 12-element feature vector indicating how much energy of
each pitch class is present in the signal in a standard chromatic scale.
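A simplified sketch of how a 12-element chroma vector can be built: each frequency is mapped to its nearest equal-tempered note (with A4 = 440 Hz as reference) and its energy is added to that note's pitch class. The input list of (frequency, energy) peaks is made up for illustration; a real pipeline would work on every bin of a short-time spectrum.

```python
import math

PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def pitch_class(f_hz):
    """Index 0..11 of the pitch class nearest to f_hz (A4 = 440 Hz = MIDI 69)."""
    midi = 69 + 12 * math.log2(f_hz / 440.0)
    return round(midi) % 12

def chroma_vector(peaks):
    """Sum energy per pitch class and normalize so the vector sums to 1."""
    chroma = [0.0] * 12
    for freq_hz, energy in peaks:
        chroma[pitch_class(freq_hz)] += energy
    total = sum(chroma)
    return [c / total for c in chroma] if total > 0 else chroma

# Hypothetical spectral peaks: three octaves of A plus a middle C.
peaks = [(220.0, 1.0), (440.0, 2.0), (880.0, 1.0), (261.63, 1.0)]
chroma = chroma_vector(peaks)
for name, value in zip(PITCH_CLASSES, chroma):
    print(f"{name:2s} {value:.2f}")
```

Because octaves fold onto the same pitch class, the three A notes all contribute to a single element of the vector.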
MFCC Chroma
Accuracy
Classification Matrix
Confusion Matrix
1.angry
2.calm
3.disgust
4.fearful
5.happy
6.neutral
7.sad
8.surprised
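For reference, a confusion matrix over these eight labels, and the accuracy derived from it, can be computed as in the sketch below. The true and predicted labels here are a made-up toy example, not results from this project.

```python
EMOTIONS = ["angry", "calm", "disgust", "fearful",
            "happy", "neutral", "sad", "surprised"]

def confusion_matrix(y_true, y_pred, labels):
    """Rows are true labels, columns are predicted labels."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for truth, guess in zip(y_true, y_pred):
        matrix[index[truth]][index[guess]] += 1
    return matrix

def accuracy(matrix):
    """Fraction of samples on the diagonal (correct predictions)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0

# Toy labels for illustration only.
y_true = ["calm", "calm", "neutral", "happy", "surprised", "angry"]
y_pred = ["calm", "neutral", "calm", "happy", "happy", "angry"]

matrix = confusion_matrix(y_true, y_pred, EMOTIONS)
acc = accuracy(matrix)
print(f"accuracy: {acc:.2%}")
```

Off-diagonal cells show which pairs of emotions are mistaken for each other, for example a true "calm" sample predicted as "neutral".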
Conclusion
● The proposed model achieved an accuracy of 66.67%.
● Calm was the best-identified emotion.
● The model confuses similar emotions, such as calm and neutral or happy and surprised.
● We tested the model on our own voice recording of the sentence “Dogs are sitting by the door”, and it
identified the emotion correctly.
Future Work
● The system could take into consideration multiple speakers from different geographic locations
speaking with different accents.
● Although a standard feedforward MLP is a powerful tool for classification problems, CNN and
RNN models could be trained on larger datasets with more computational power and compared
with it.
● Studies show that people with autism can have difficulty expressing their emotions
explicitly. Image-based speech processing in real time could be of great assistance.
References
[1] Jerry Joy, Aparna Kannan, Shreya Ram, S. Rama. Speech Emotion Recognition using Neural
Network and MLP Classifier. IJESC, April 2020.
[2] Navya Damodar, Vani H Y, Anusuya M A. Voice Emotion Recognition using CNN and
Decision Tree. International Journal of Innovative Technology and Exploring Engineering
(IJITEE), October 2019.
[3] RAVDESS Dataset: https://zenodo.org/record/1188976#.X5r20ogzZPZ
[4] MLP/CNN/RNN Classification:
https://machinelearningmastery.com/when-to-use-mlp-cnn-and-rnn-neural-networks/
[5] MFCC: https://medium.com/prathena/the-dummys-guide-to-mfcc-aceab2450fd

Speech emotion recognition

  • 1. Speech Emotion Recognition Guided by: Mrs. R.K. Patole 111707049-Pragya Sharma 141807005-Kanchan Itankar 141807008-Saniya Shaikh 141807009-Triveni Vyavahare
  • 3. [1] Speech Emotion Recognition using Neural Network and MLP Classifier (Jerry Joy, Aparna Kannan, Shreya Ram, S. Rama) ● MLP Classifier ● 5 features extracted- MFCC, Contrast, Mel Spectrograph Frequency, Chroma and Tonnetz ● Accuracy 70.28% [2]Voice Emotion Recognition using CNN and Decision Tree (Navya Damodar, Vani H Y, Anusuya M A.) ● Decision Tree , CNN ● MFCCs extracted ● Accuracy 72% CNN, 63% Decision Tree Literature Review
  • 4. ● To build a model to recognize emotion from speech using the librosa and sklearn libraries and the RAVDESS dataset. ● To present a classification model of emotion elicited by speeches based on deep neural networks MLP Classification based on acoustic features such as Mel Frequency Cepstral Coefficient (MFCC). The model has been trained to classify eight different emotions (calm, happy, fearful, disgust, angry, neutral, surprised,sad). Objective
  • 6. ● As human beings speech is amongst the most natural way to express ourselves. We depend so much on it that we recognize its importance when resorting to other communication forms like emails and text messages where we often use emojis to express the emotions associated with the messages. As emotions play a vital role in communication, the detection and analysis of the same is of vital importance in today’s digital world of remote communication. ● Emotion detection is a challenging task, because emotions are subjective. There is no common consensus on how to measure them. We define a Speech Emotions Recognition system as a collection of methodologies that process and classify speech signals to detect emotions embedded in them. Motivation
  • 7. Introduction
    ● Human-machine interaction is now widely used in many applications, and speech is one of its media. A main challenge in human-machine interaction is detecting emotion from speech.
    ● Emotion can play an important role in decision making. It can also be detected from various physiological signals. If emotion can be recognized reliably from speech, a system can act accordingly. Emotion is identified by extracting features (characteristics) from the speech, and making the system accurate requires training on a large speech database.
    ● An emotional speech dataset, RAVDESS, is selected. Emotion-specific features are then extracted from its recordings, and finally an MLP classification model is used to recognize the emotions.
  • 10. 1. Preprocessing
    The removal of unwanted noise from the speech signal.
    ➢ Silence removal
    ➢ Background noise removal
    ➢ Windowing
    ➢ Normalization
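    Two of the preprocessing steps above, normalization and silence removal, can be sketched in a few lines. This is a minimal illustration on a plain list of samples, not the project's code: the function names, the peak-normalization scheme and the `threshold` value of 0.02 are assumptions made for the example, and a real pipeline would more likely rely on an audio library such as librosa.

```python
def peak_normalize(samples):
    """Scale samples so the largest absolute value becomes 1.0."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # empty or all-zero input: nothing to scale
    return [s / peak for s in samples]


def trim_silence(samples, threshold=0.02):
    """Drop leading and trailing samples whose magnitude is below threshold.

    The threshold is an assumed value chosen for illustration.
    """
    loud = [i for i, s in enumerate(samples) if abs(s) >= threshold]
    if not loud:
        return []
    return samples[loud[0]:loud[-1] + 1]


# Toy signal: near-silent edges around a short burst of speech-like samples
signal = [0.0, 0.001, 0.2, -0.5, 0.25, 0.0, 0.0]
cleaned = trim_silence(peak_normalize(signal))
print(cleaned)
```

    Normalizing before trimming makes the threshold relative to the loudest sample, so quiet and loud recordings are trimmed consistently.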
  • 11. 2. Feature Extraction
    ● Extract features from the audio file
    ● These features describe how we speak:
    ➢ Pitch
    ➢ Loudness
    ➢ Rhythm, etc.
  • 12. Dataset
    Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS).
    ● [3] The RAVDESS dataset has recordings of 24 actors (12 male and 12 female), numbered 01 to 24, speaking in a North American accent.
    ● All emotional expressions are uttered at two levels of intensity, normal and strong, except for the ‘neutral’ emotion, which is produced only at normal intensity. The portion of RAVDESS that we use contains 60 trials for each of the 24 actors, making 1440 files in total.
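    The 60-trials-per-actor figure can be checked with a short calculation, and the emotion label of each recording can be read from its file name. The sketch below relies on details taken from the RAVDESS documentation rather than from the slides: the seven-field file name, the emotion codes, two statements, two repetitions, and odd-numbered actors being male. The helper names are hypothetical.

```python
# Emotion codes as documented for RAVDESS file names (third field)
EMOTIONS = {
    "01": "neutral", "02": "calm", "03": "happy", "04": "sad",
    "05": "angry", "06": "fearful", "07": "disgust", "08": "surprised",
}


def parse_ravdess_name(filename):
    """Split a RAVDESS file name into its labelled fields."""
    stem = filename.rsplit(".", 1)[0]
    (modality, channel, emotion, intensity,
     statement, repetition, actor) = stem.split("-")
    return {
        "emotion": EMOTIONS[emotion],
        "intensity": "normal" if intensity == "01" else "strong",
        "statement": statement,
        "repetition": repetition,
        "actor": int(actor),
        # Odd actor numbers are male, even are female (per the dataset docs)
        "actor_sex": "male" if int(actor) % 2 == 1 else "female",
    }


def trials_per_actor(statements=2, repetitions=2):
    """Count speech trials for one actor."""
    expressive = 7 * 2 * statements * repetitions  # 7 emotions x 2 intensities
    neutral = 1 * 1 * statements * repetitions     # neutral: normal intensity only
    return expressive + neutral


info = parse_ravdess_name("03-01-06-01-02-01-12.wav")
total_files = trials_per_actor() * 24
print(info["emotion"], trials_per_actor(), total_files)
```

    With these assumptions the count is 56 expressive trials plus 4 neutral trials per actor, which matches the 60 trials and 1440 files stated above.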
  • 15. 3. Classification
    ● Match the features with the corresponding emotions
    Multilayer Perceptron
  • 16. Multi-Layer Perceptron Classifier
    ● A multilayer perceptron (MLP) is a class of feedforward artificial neural network (ANN).
    ● An MLP consists of at least three layers of nodes: an input layer, a hidden layer and an output layer.
    ● MLPs are suitable for classification problems, where each input is assigned a class or label.
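    To make the three-layer structure concrete, here is a minimal sketch of a forward pass through an MLP with one hidden layer, written with the standard library only. The weights, the ReLU and softmax activations, the layer sizes and the two class labels are all hypothetical choices for illustration. The slides do not state the project's architecture, and a trained model would learn its weights from data.

```python
import math


def relu(values):
    """Rectified linear activation applied element-wise."""
    return [max(0.0, v) for v in values]


def softmax(values):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dense(inputs, weights, biases):
    """One fully connected layer: weights holds one row per output node."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def mlp_forward(features, hidden_w, hidden_b, out_w, out_b):
    """Input layer -> hidden layer (ReLU) -> output layer (softmax)."""
    hidden = relu(dense(features, hidden_w, hidden_b))
    return softmax(dense(hidden, out_w, out_b))


# Hypothetical hand-picked weights: 3 input features, 2 hidden nodes, 2 classes
hidden_w = [[1.0, 0.0, -1.0], [0.0, 1.0, 0.0]]
hidden_b = [0.0, 0.0]
out_w = [[1.0, 0.0], [0.0, 1.0]]
out_b = [0.0, 0.0]

labels = ["calm", "happy"]
probs = mlp_forward([2.0, 1.0, 0.5], hidden_w, hidden_b, out_w, out_b)
predicted = labels[probs.index(max(probs))]
print(predicted, probs)
```

    The class with the highest output probability is taken as the assigned label, which is how an MLP maps a feature vector to a class.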
  • 17. Building the MLP classifier involves the following steps:
    1. Initialising the MLP classifier.
    2. Training the neural network.
    3. Prediction.
    4. Accuracy calculation.
  • 18. Multi-Layer Perceptron Classifier
    [Figure: Multi-Layer Perceptron Classifier]
  • 19. Feature Extraction
    From the audio data we extracted three key features, which are used in this work:
    ● MFCC (Mel Frequency Cepstral Coefficients)
    ● Mel Spectrogram
    ● Chroma
  • 20. MFCC (Mel Frequency Cepstral Coefficients)
  • 21. Mel Spectrogram
    A Fast Fourier Transform is computed on overlapping windowed segments of the signal, which gives the spectrogram. A Mel spectrogram is a spectrogram that depicts amplitude with its frequencies mapped onto the Mel scale.
    Chroma
    A chroma vector is typically a 12-element feature vector indicating how much energy of each pitch class is present in the signal, using the standard chromatic scale.
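    The two mappings behind these features can be sketched without any audio library. The example below uses one common form of the Mel formula (the HTK-style variant, other variants exist) and builds a 12-element chroma vector from a list of spectral peaks. The function names, the 440 Hz reference for A4 and the example peaks are assumptions for illustration, and librosa's own implementations differ in detail.

```python
import math


def hz_to_mel(freq_hz):
    """Convert frequency in Hz to mels (HTK-style formula)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def pitch_class_index(freq_hz, a4_hz=440.0):
    """Nearest pitch class (0 = C ... 11 = B) in 12-tone equal temperament."""
    midi_note = round(69 + 12 * math.log2(freq_hz / a4_hz))
    return midi_note % 12


def chroma_vector(peaks):
    """Sum energy per pitch class from (frequency_hz, energy) pairs.

    Returns 12 values normalized to sum to 1 (all zeros if there is no energy).
    """
    chroma = [0.0] * 12
    for freq_hz, energy in peaks:
        chroma[pitch_class_index(freq_hz)] += energy
    total = sum(chroma)
    return [c / total for c in chroma] if total > 0 else chroma


# Hypothetical spectral peaks: two A notes an octave apart and one middle C
chroma = chroma_vector([(440.0, 1.0), (880.0, 1.0), (261.63, 2.0)])
print(round(hz_to_mel(1000.0)), PITCH_CLASSES[pitch_class_index(440.0)], chroma)
```

    Notes an octave apart fall into the same chroma bin, which is why the vector has only 12 elements regardless of the frequency range.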
  • 26. Conclusion
    ● The proposed model achieved an accuracy of 66.67%.
    ● Calm was the best-identified emotion.
    ● The model confuses similar emotions, such as calm with neutral and happy with surprised.
    ● We tested the model on a recording of our own voice speaking the sentence “Dogs are sitting by the door”, and it identified the emotion correctly.
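    Accuracy and pairwise confusions of the kind reported above can be computed from the true and predicted labels. The sketch below is illustrative only: the label lists are made up for the example and are not the project's test results, and the helper names are hypothetical.

```python
from collections import Counter


def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def confusion_counts(true_labels, predicted_labels):
    """Count each (true, predicted) pair; mismatched pairs are confusions."""
    return Counter(zip(true_labels, predicted_labels))


# Made-up labels for illustration, not results from the project
y_true = ["calm", "calm", "neutral", "neutral", "happy", "happy", "surprised", "sad"]
y_pred = ["calm", "calm", "calm", "neutral", "happy", "surprised", "surprised", "sad"]

acc = accuracy(y_true, y_pred)
counts = confusion_counts(y_true, y_pred)
mistakes = {pair: n for pair, n in counts.items() if pair[0] != pair[1]}
print(f"accuracy = {acc:.2%}", mistakes)
```

    Listing only the mismatched pairs shows directly which emotions the model mixes up, such as neutral predicted as calm.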
  • 27. Future Work
    ● The system could take into account multiple speakers from different geographic locations speaking with different accents.
    ● Although a standard feedforward MLP is a powerful tool for classification problems, CNN and RNN models could be trained on larger datasets with more computational power and compared against it.
    ● Studies show that people with autism have difficulty expressing their emotions explicitly. Image-based speech processing in real time could be of great assistance.
  • 28. References
    [1] Jerry Joy, Aparna Kannan, Shreya Ram, S. Rama. Speech Emotion Recognition using Neural Network and MLP Classifier. IJESC, April 2020.
    [2] Navya Damodar, Vani H Y, Anusuya M A. Voice Emotion Recognition using CNN and Decision Tree. International Journal of Innovative Technology and Exploring Engineering (IJITEE), October 2019.
    [3] RAVDESS Dataset: https://zenodo.org/record/1188976#.X5r20ogzZPZ
    [4] MLP/CNN/RNN Classification: https://machinelearningmastery.com/when-to-use-mlp-cnn-and-rnn-neural-networks/
    [5] MFCC: https://medium.com/prathena/the-dummys-guide-to-mfcc-aceab2450fd