1. Ajeenkya D. Y. Patil School of Engineering
Department of Computer Engineering
BE Project Semester-II A.Y. 2022-23
Project Group ID :
Team Members :
1. Name YASH BAHIRAT PRN No. B190884375
2. Name ABHISHEK YADAV PRN No. B190884374
3. Name PIYUSH SONAWANE PRN No. B190884349
4. Name VIPUL KANDGE PRN No. B190884273
Project Title : AUTOMATIC DEPRESSION LEVEL DETECTION THROUGH
VISUAL INPUTS
Project Guide : Prof. Manisha Wasnik
Area of Project : Machine Learning
2. Contents
Problem Statement
Motivation
Objectives
Introduction to Domain
Literature Review (Tabular Comparison of at least 6 Latest Research Papers)
Project Feasibility (Project Achievability Timewise, Costwise, Technologywise)
Scope of Project
Mathematical Model
Software Hardware Requirements
System Architecture
Algorithm
Structural Diagrams (Class, Object Diagrams etc) as per standard UML Formats
Behavioral Diagrams (use case, sequence, activity, etc) as per standard UML Formats
Advantages, Limitations, Applications
Conclusion
Future Work
References in Standard Format (At least 15, starting with Base Paper and in Year wise Decreasing Order)
3. Problem Statement
This work focuses on feature development for depression detection. It
investigates how to build a detection system that extracts features from text
and speech and detects emotions from facial expressions. It aims to
discover which features best discriminate between depression
levels.
4. Motivation
The main motivation of this research is to make the man-machine interface
more flexible and easier for the user. Depression is a common mental
illness and a leading cause of disability worldwide, and it may lead to suicide.
Globally, more than 300 million people are estimated to suffer from depression
every year [1]. Generally, depression is diagnosed face to face against clinical
depression criteria. However, at the early stages of depression, 70% of patients
do not consult a doctor, which may allow their condition to progress to advanced
stages. Human experts have privileged knowledge that codes the facial,
text and audio features.
5. Objectives
• To predict mood level or activity as a score with a class label.
• To implement the test model successfully, trained on the training set with a
deep learning approach.
• To run the proposed system with the highest achievable accuracy.
• To detect stress in a person, which is the main goal of this project.
• To help reduce suicides.
• To detect stress through speech analysis by computing the mean and the
standard deviation of each MFC coefficient, instead of using every coefficient
value as a separate feature, which can lead to very large
feature sets.
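The last objective can be illustrated with a minimal sketch. It assumes the MFCCs are already available as a list of per-frame coefficient vectors; the function name `summarize_mfcc`, the data layout and the toy numbers are hypothetical and not taken from the project.

```python
from statistics import mean, pstdev


def summarize_mfcc(frames):
    """Collapse per-frame MFCC vectors into one fixed-length feature vector.

    frames: list of frames, each a list of n_coeffs floats (assumed layout).
    Returns [mean_0, ..., mean_{n-1}, std_0, ..., std_{n-1}].
    """
    if not frames:
        raise ValueError("need at least one frame")
    per_coeff = list(zip(*frames))  # transpose: one tuple per coefficient
    means = [mean(c) for c in per_coeff]
    stds = [pstdev(c) for c in per_coeff]  # population standard deviation
    return means + stds


# Toy input: 3 frames x 2 coefficients (made-up numbers, not real MFCCs).
toy_frames = [[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]]
features = summarize_mfcc(toy_frames)
print(len(features))  # 2 means + 2 standard deviations
```

Whatever the number of frames, the result has 2 x n_coeffs entries, which is how this summary keeps the feature set small.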
6. Introduction to Domain
• Machine Learning:
Machine learning (ML) is a type of artificial intelligence (AI) that allows
software applications to become more accurate at predicting outcomes without
being explicitly programmed to do so. Machine learning algorithms use
historical data as input to predict new output values. The machine learning
process begins with observations or data, such as examples, direct experience
or instruction. It looks for patterns in data so it can later make inferences based
on the examples provided. The primary aim of ML is to allow computers to
learn autonomously without human intervention or assistance and adjust
actions accordingly. ML has proven valuable because it can solve problems at a
speed and scale that cannot be duplicated by the human mind alone. With
massive amounts of computational ability behind a single task or multiple
specific tasks, machines can be trained to identify patterns in and relationships
between input data and automate routine processes.
7. Literature review
Sr. No. 1
Paper: Depression Detection Using Machine Learning Techniques on Twitter Data
Year: 2021 (IEEE)
Authors: Kuhaneswaran A/L Govindasamy, Naveen Palanichamy
Description: The proposed research work aims to detect depression in users from the
data they share on social media. The Twitter data is fed into two
different types of classifiers: Naïve Bayes and a hybrid model,
NBTree. The results are compared on accuracy to
determine the best algorithm for detecting depression. The results show that both
algorithms perform equally, achieving the same accuracy.

Sr. No. 2
Paper: Depression Detection by Analyzing Social Media Posts of User
Year: 2019 (IEEE)
Authors: Nafiz Al Asad, Md. Appel Mahmud Pranto, Sadia Afreen, Md. Maynul Islam
Description: This paper proposes a model that takes a username and analyzes the user's
social media posts to determine the level of vulnerability to depression.
The machine learning model is trained to classify the depression criteria
into six ranges (Considered Normal, Mild, Moderate, Borderline, Severe,
Extreme). The verdict is "depressed" when the percentage is above borderline
(above 55%). The collected tweets and Facebook posts are analyzed by the
model, which labels the user as depressed or non-depressed.

Sr. No. 3
Paper: Recognition of Audio Depression Based on Convolutional Neural Network and
Generative Antagonism Network Model
Year: 2020 (IEEE)
Authors: Zhiyong Wang, Longxi Chen, Lifeng Wang, and Guangqiang Diao
Description: This paper proposes an audio depression recognition method based on a
convolutional neural network and a generative antagonism network model. First,
the data set is preprocessed: the long silent segments are removed
and the rest is spliced into a new audio file. Then the features of the speech signal,
such as Mel-scale Frequency Cepstral Coefficients (MFCCs), short-term
energy and spectral entropy, are extracted based on an audio difference
normalization algorithm.
8. Literature review
Sr. No. 4
Paper: The Detection of Depression Using Multimodal Models Based on Text and
Voice Quality Features
Year: 2021 (IEEE)
Authors: Hanadi Solieman, Evgenii A. Pustozerov
Description: The authors created a word-level text analysis model using Natural Language
Processing (NLP) techniques, and a voice quality analysis model on the tense-to-breathy
dimension. The text analysis model achieved its best performance with an F1-score
of 0.8 (0.42) for non-depressed (depressed) individuals, while the voice
quality model scored 0.76 (0.38). The result is two models
that could be implemented in a system for the diagnosis of depression.

Sr. No. 5
Paper: A Depression Recognition Method for College Students Using Deep
Integrated Support Vector Algorithm
Year: 2021 (IEEE)
Authors: Yan Ding, Xuemei Chen, Qiming Fu, and Shan Zhong
Description: This study uses text-level mining of Sina Weibo data to detect
depression among college students. First, the text information of college student
users on Sina Weibo is collected and turned into input data that can be
used for machine learning. Deep neural networks are used for feature extraction. A
deep integrated support vector machine (DISVM)
algorithm is introduced to classify the input data and finally recognize
depression. DISVM makes the recognition model more stable and improves the
accuracy of depression diagnosis to a certain extent. Simulation experiments verify
that the proposed depression recognition scheme can detect potential depression
patients in the college student population through Sina Weibo data.

Sr. No. 6
Paper: Diagnosis of Depressive Disorder Model on Facial Expression Based on Fast R-CNN
Year: 2022 (MDPI)
Authors: Young-Shin Lee and Won-Hyung Park
Description: In this study, the model was limited to the field of depressive disorder diagnosis
assistance systems, but smartphones have a greater potential for EMI. When the
level of detection through smartphones is further improved and insight into digitally
identified expressions increases, it can be integrated to communicate personalized
treatment recommendations through AI.
9. Project Feasibility
• Feasibility analysis is an essential step before a system is built for general use.
• In the early phases of software integration, it is important to check whether the
project is workable in its intended environment.
1. Technical Feasibility
• Technical feasibility is the study of the hardware components and technical specifications
the system needs. The analysis tells the organisation and the users of this
program which resources are required.
• The proposed work is technically feasible to build with the chosen programming language,
with application development backed by an SQL
database.
2. Economical Feasibility
• Economic feasibility evaluates the expected cost to determine whether the project stays
within the projected budget and whether it offers a reasonable return on
investment.
• It is only necessary to determine whether the cost of the project is likely to fall within the
target budget or be justified by the benefit.
10. Project Feasibility
3. Operational Feasibility
• Operational feasibility assesses the willingness of the organisation to support the
system.
• This is probably the hardest kind of feasibility to gauge.
4. Time Feasibility
• As with cost, it is important to make a rough estimate of the project plan
in order to determine whether the entire system can be completed within the
required timeframe.
• The schedule is already listed in the project completion tasks and
in the Gantt chart.
11. Scope of Project
The project reflects the essential characteristics of depression as a category of mental
illness and the fact that depression is an important concern in
public health care. Classification plays a major role in determining
the kind of help a depressed person needs. People with suicidal
thoughts also need to be identified and helped according to their condition.
12. Mathematical Model
Let S be the whole system: S = {I, P, O}
where I = input, P = procedure, O = output.
Input (I):
I = {face, speech and text data}
where the dataset contains facial images, speech, and tweets as text.
Procedure (P):
P = {using I, the system performs its operations and calculates the prediction}
Output (O):
O = {the system detects whether the person is depressed or not}
15. System Explanation
• Input audio speech and text (data set): The first step of the system is to input the audio
speech and text.
• Pre-processing: The second phase of the system enhances the quality of the
input audio speech and text signals. It may include silence removal, pre-emphasis, noise
removal, windowing, and removal of unwanted pauses.
• Feature extraction: Feature extraction involves
the analysis of the speech signal and text. The speech signal contains a large amount of hidden
information that reflects emotional characteristics. This is an important phase
of the system, because extracting relevant and significant features heavily affects the final
recognition. Some of the features extracted by various researchers are MFCC (Mel-Frequency
Cepstral Coefficients), LFPC (Log Frequency Power Coefficients), pitch, energy, and voice
quality.
• Classification: The fourth step is the main step of the system. The audio speech
and text are classified into different emotions, based on the extracted features, using a
CNN classifier. The system then decides whether the person's audio indicates stress
or not.
• Facial expression: A depressed facial expression has the same characteristics as a sad expression,
such as upward-slanted eyebrows, but the main difference is that no major frown
is involved. A sad face may also have lowered eyes looking downward, showing a helpless,
dejected mood. On the other hand, a depressed person can present a face that shows no sign of depression.
This is a case of concealed depression: the depressed face may not be a
sad face, and the person may instead put on a happy face to conceal depression. Individual
persons are classified as neutral or negative, based on a curated word list, to detect depressive
tendencies.
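The audio pre-processing steps named in the system explanation (pre-emphasis, windowing, silence removal) can be sketched as follows. This is only an illustration under stated assumptions: the pre-emphasis factor 0.97, the Hamming window, the energy threshold, the frame sizes and all function names are hypothetical choices, not values taken from the project.

```python
import math


def pre_emphasis(signal, alpha=0.97):
    """y[n] = x[n] - alpha * x[n-1]; boosts the higher frequencies."""
    if not signal:
        return []
    return [signal[0]] + [signal[n] - alpha * signal[n - 1]
                          for n in range(1, len(signal))]


def frame_signal(signal, frame_len, hop):
    """Split the signal into overlapping frames, dropping the ragged tail."""
    return [signal[s:s + frame_len]
            for s in range(0, len(signal) - frame_len + 1, hop)]


def drop_silence(frames, threshold):
    """Keep only frames whose mean energy exceeds the (assumed) threshold."""
    return [f for f in frames if sum(x * x for x in f) / len(f) > threshold]


def hamming(frame):
    """Taper the frame edges with a Hamming window."""
    n = len(frame)
    return [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
            for i, x in enumerate(frame)]


# Toy signal: 40 silent samples followed by 40 samples of a sine tone.
signal = [0.0] * 40 + [math.sin(2 * math.pi * 0.1 * n) for n in range(40)]
emphasized = pre_emphasis(signal)
frames = frame_signal(emphasized, frame_len=20, hop=10)
voiced = drop_silence(frames, threshold=1e-3)
windowed = [hamming(f) for f in voiced]
print(len(frames), len(voiced))
```

In this toy run the three all-silent frames are discarded and only the frames that overlap the tone are windowed and passed on.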
16. Algorithm
Convolutional Neural Network:
Convolutional Neural Networks (CNNs) are specialized for image and video recognition.
A CNN is mainly used in image analysis tasks
such as image recognition, object detection and segmentation. There are four types of layers in
a Convolutional Neural Network:
1) Convolutional layer: In a typical neural
network, each input neuron is connected to every neuron in the next hidden layer. In a CNN, only a
small region of the input-layer neurons connects to each hidden-layer neuron.
2) Pooling layer: The pooling layer is used to reduce the dimensionality of the feature
map.
The hidden layers of a CNN contain multiple activation and pooling layers.
3) Flatten: Flattening converts the data into a 1-dimensional array for input to
the next layer. The output of the convolutional layers is flattened to create a single long
feature vector.
4) Fully connected layer: Fully connected layers form the last few layers in the network.
The input to the fully connected layer is the output
from the final pooling or convolutional layer, which is flattened and then fed into
the fully connected layer.
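A minimal sketch of one forward pass through the four layer types may help. It is written in plain Python for readability rather than with a deep learning library, and the kernel, dense weights, toy image and two-class output are hypothetical values chosen for illustration, not the project's trained model.

```python
import math


def conv2d(image, kernel):
    """Valid 2-D convolution (cross-correlation, as CNN libraries usually do)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)] for i in range(out_h)]


def relu(fmap):
    """Activation: negative responses are set to zero."""
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool(fmap, size=2):
    """Non-overlapping max pooling; shrinks each side by a factor of `size`."""
    return [[max(fmap[i + a][j + b] for a in range(size) for b in range(size))
             for j in range(0, len(fmap[0]) - size + 1, size)]
            for i in range(0, len(fmap) - size + 1, size)]


def flatten(fmap):
    """Turn the 2-D feature map into one long feature vector."""
    return [v for row in fmap for v in row]


def dense(vector, weights, biases):
    """Fully connected layer: every output sees every input."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, biases)]


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Toy 5x5 "image" with a bright vertical stripe (made-up data).
image = [[0.0, 0.0, 1.0, 1.0, 0.0] for _ in range(5)]
kernel = [[-1.0, 1.0], [-1.0, 1.0]]           # hypothetical edge detector
pooled = max_pool(relu(conv2d(image, kernel)))  # 5x5 -> 4x4 -> 2x2
vector = flatten(pooled)
weights = [[0.5, 0.0, 0.5, 0.0], [0.0, 0.5, 0.0, 0.5]]  # hypothetical weights
probs = softmax(dense(vector, weights, [0.0, 0.0]))
print(pooled, probs)
```

Each convolution output depends only on a 2x2 patch of the input, which is the local connectivity described in item 1, while the dense layer connects every flattened value to every output.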
17. Algorithm
Mel-Frequency Cepstral Coefficients: MFCCs are popular features extracted
from speech signals for use in recognition tasks. In the source-filter model of
speech, MFCCs are understood to represent the filter (vocal tract). The
frequency response of the vocal tract is relatively smooth, whereas the source
of voiced speech can be modeled as an impulse train. The MFCC feature
extraction technique basically includes windowing the signal, applying the
DFT, taking the log of the magnitude, and then warping the frequencies on a
Mel scale, followed by applying the inverse DCT. MFCCs are commonly used
as features in speech recognition systems, such as systems that can
automatically recognize numbers spoken into a telephone. MFCCs are also
increasingly finding uses in music information retrieval applications such as
genre classification, audio similarity measures, etc.
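The extraction steps can be sketched for a single frame as below. This is a simplified illustration and not the project's implementation: it follows the ordering commonly used in practice (mel filterbank on the power spectrum, then the log, then a type-II DCT), which is an assumption that differs slightly from the wording above, and the filter count, coefficient count, sample rate and function names are all hypothetical.

```python
import cmath
import math


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10 ** (m / 2595.0) - 1.0)


def power_spectrum(frame):
    """Hamming-window the frame, then take the one-sided DFT power spectrum."""
    n = len(frame)
    win = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
           for i, x in enumerate(frame)]
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(win))) ** 2 / n
            for k in range(n // 2 + 1)]


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose centres are evenly spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2)
    mels = [top * i / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int((n_fft + 1) * mel_to_hz(m) / sample_rate) for m in mels]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = bins[j - 1], bins[j], bins[j + 1]
        filt = [0.0] * (n_fft // 2 + 1)
        for k in range(lo, mid):          # rising edge
            filt[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):          # falling edge
            filt[k] = (hi - k) / (hi - mid)
        bank.append(filt)
    return bank


def dct2(values, n_out):
    """Unnormalised type-II DCT, keeping the first n_out coefficients."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / n)
                for i, v in enumerate(values)) for k in range(n_out)]


def mfcc(frame, sample_rate, n_filters=8, n_coeffs=5):
    spec = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    # Small constant avoids log(0) for empty or silent bands.
    log_energies = [math.log(sum(f * p for f, p in zip(filt, spec)) + 1e-10)
                    for filt in bank]
    return dct2(log_energies, n_coeffs)


sample_rate = 8000  # assumed sample rate for the toy frame
frame = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(64)]
coeffs = mfcc(frame, sample_rate)
print(len(coeffs))
```

The direct DFT here costs O(N^2) per frame and is only meant to keep the sketch self-contained; a real system would normally use an FFT routine from an audio or numerical library.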
18. Algorithm
Support Vector Machine: In machine learning, support-vector machines (SVMs,
also support-vector networks) are supervised learning models with associated learning algorithms
that analyze data used for classification and regression analysis.
Given a set of training examples, each marked as belonging to one or the other of two categories,
an SVM training algorithm builds a model that assigns new examples to one category or the other,
making it a non-probabilistic binary linear classifier (although methods such as Platt scaling exist
to use SVM in a probabilistic classification setting). An SVM model is a representation of the
examples as points in space, mapped so that the examples of the separate categories are divided by
a clear gap that is as wide as possible. New examples are then mapped into that same space and
predicted to belong to a category based on the side of the gap on which they fall.
Two types of SVM:
Linear SVM: Linear SVM is used for linearly separable data. If a dataset can be
classified into two classes by a single straight line, the data is termed linearly
separable, and the classifier used is called a linear
SVM classifier.
Non-linear SVM: Non-linear SVM is used for data that is not linearly separable. If a
dataset cannot be classified by a straight line, the data is termed non-linear,
and the classifier used is called a non-linear SVM classifier.
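The linear case can be illustrated with a small sketch that trains a linear SVM by subgradient descent on the regularised hinge loss. The learning rate, regularisation strength, epoch count, function names and toy points are assumptions made for illustration; a real project would more likely rely on an existing SVM library.

```python
import random


def train_linear_svm(points, labels, lam=0.01, lr=0.1, epochs=200, seed=0):
    """Minimise lam/2 * ||w||^2 + hinge loss; labels must be +1 or -1."""
    rng = random.Random(seed)
    w = [0.0] * len(points[0])
    b = 0.0
    order = list(range(len(points)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = points[i], labels[i]
            margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
            # The regularisation term shrinks w, which widens the gap.
            w = [wj * (1 - lr * lam) for wj in w]
            if margin < 1:
                # Point is inside the gap or misclassified: push it outward.
                w = [wj + lr * y * xj for wj, xj in zip(w, x)]
                b += lr * y
    return w, b


def predict(w, b, x):
    """Classify by the side of the separating line on which x falls."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Toy, linearly separable data (made-up points, two features each).
points = [[2.0, 2.0], [3.0, 3.0], [2.0, 3.0],
          [-2.0, -2.0], [-3.0, -3.0], [-2.0, -3.0]]
labels = [1, 1, 1, -1, -1, -1]
w, b = train_linear_svm(points, labels)
print(predict(w, b, [4.0, 4.0]), predict(w, b, [-4.0, -4.0]))
```

For data that a straight line cannot separate, the usual approach is to swap the dot product for a kernel function, which this sketch does not attempt.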
24. Advantages, Limitations and Application
Advantages:
• Early detection is critical for rapid intervention, which can potentially
reduce the escalation of the disorder.
• It also helps determine how we handle stress, relate to others, and make
healthy choices.
Limitations:
In machine learning, there’s something called the “No Free Lunch” theorem.
In a nutshell, it states that no one algorithm works best for every problem, and
it’s especially relevant for supervised learning.
Application:
• Identifying signs of depression in individuals.
• Depression detection is useful because it shows that the words used by
non-depressed and depressed people may differ.
• Helping you determine whether you or someone you know may be
experiencing, or be at risk of developing, depression.
25. Conclusion
In conclusion, the system will detect whether a person is depressed or not using
audio and text datasets. The deep learning technique used is the CNN algorithm,
and the machine learning technique used is SVM. Together, the CNN and
SVM algorithms detect whether a person is depressed or not.
26. Future work
In the future, we will be able to use more models to analyze tweets and
other social media outlets, along with emails, to detect mental health
issues other than depression, such as PTSD, stress and anxiety. At
present, the desktop application runs locally. In the future, this
application can be hosted on a website and accessed over the internet. The
current application is essentially a screening test before consulting a doctor. In
the future, a video consultation with a doctor can be arranged if the user is
detected to be depressed.
27. References
1 Cohn, J.; Kruez, T.; Matthew, I.; Yang, Y.; Nguyen, M.; Padilla, M.; Zhou, F.; Torre, F.
Detecting Depression from Facial Actions and Vocal Prosody. ACII 2009,
978-1-4244-4799-2/09.
2 Hasan, M.; Rundensteiner, E.; Agu, E. EMOTEX: Detecting emotions in Twitter
messages. Presented at the 2014 ASE Bigdata Socialcom Cybersecurity Conference,
Stanford University.
3 Hadjipavlou, G.; Hernandez, C.; Ogrodniczuk, J. Psychotherapy in
Contemporary Psychiatric Practice. Can J Psychiatry. 2015, 60(6): 294-300.
4 Wang, X.; Zhang, C.; Ji, Y.; Sun, L.; Wu, L.; Bao, Z. A Depression Detection
Model Based on Sentiment Analysis in Micro-blog Social Network. pp. 201-213. In
Proceedings of Springer-Verlag Berlin Heidelberg 2013.
5 Paul, M. J.; Dredze, M. You are What You Tweet: Analyzing Twitter for Public
Health. In Proceedings of ICWSM 2011.
6 Nadeem, M.; Horn, M.; Coppersmith, G.; Sen, S. Identifying depression on
Twitter. arXiv preprint 2016, arXiv:1607.07384.
7 Mikolov, T.; Sutskever, I.; Chen, K.; Corrado, G.; Dean, J. Distributed
Representations of Words and Phrases and their Compositionality. Advances in
Neural Information Processing Systems 26, NIPS 2013.