1. “Suicide is not a blot on anyone’s
name; it is a tragedy.”
– Kay Redfield Jamison
2. “A Comparative Study on Suicidal Ideation
Detection Using Machine Learning”
Presented by
Gulam Murshed Robin (ASH1801033M)
Sadia Tamanna (BKH1801044F)
Under the Supervision of
A. R. M. Mahmudul Hasan Rana
Assistant Professor
Dept of CSTE
5.
6) Methods to do Suicidal Analysis
7) Planning of creating Bangla Dataset
8) Expected Output of the research
9) Works to be done
6. What is Suicide and why do people attempt it?
Suicide is the act of intentionally causing one's own death.
While there are many factors that can influence a person's decision to
commit suicide, the most common factors are given below:
1) Mental Illness
2) Traumatic Stress
3) Substance Use and Impulsivity
4) Loss or Fear of Loss
5) Social Isolation
7. So what type of work is “Suicidal Ideation Detection”?
Sentiment Analysis
Suicidal Analysis
Sentiment refers to the emotion expressed in a text. Sentiment Analysis means identifying that emotion from a given text.
Suicidal Analysis falls under the category of Sentiment Analysis.
8. What is the objective of our work?
• To design an efficient Suicidal Analysis method.
• To analyze existing suicidal analysis algorithms and their pros and cons.
• To create a Bangla dataset and preprocess the data.
9. Is there any related work that has been done already?
• Suicide and Depression Identification with Unsupervised Label Correction[1]
• Ensemble Classifier Model for Twitter Sentiment Analysis[2]
• Sentiment Analysis for Bengali[3]
• Sentiment Analysis for Hindi language using unsupervised lexicon method[4]
• Recurrent Neural networks for Bangla Sentiment Analysis[5]
• Deep Learning model for Punjabi tweet sentiment analysis[6]
• Deep Learning model for Bengali tweet sentiment analysis[7]
10. What should be the procedure of suicidal analysis?
Raw Data ---> Data Preprocessing ---> Apply Algorithm ---> Calculate Accuracy
Data Preprocessing: Add Contraction, Tokenization, Stop Word Removal, Split Text
Figure: Data Preprocessing & Data Cleaning[8]
Md. Rafi said, “I'm going to kill myself”.
[Md] [.] [Rafi] [said] [,] [“] [I’m] [going] [to] [kill] [myself] [”] [.]
[Md.] ---> [Mohammad]
[Mohammad] [Rafi] [said] [,] [“] [I’m] [going] [to] [”] [.]
[{kill},{myself}]
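The preprocessing steps on this slide could be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the pipeline from [8]; the `EXPANSIONS` and `STOP_WORDS` tables and all function names are hypothetical and far smaller than anything a real system would need.

```python
import re

# Hypothetical, tiny lookup tables for illustration only.
EXPANSIONS = {"Md.": "Mohammad", "I'm": "I am"}
STOP_WORDS = {"i", "am", "to", "said", "going"}


def expand(text):
    """Replace known abbreviations and contractions with their long form."""
    for short, full in EXPANSIONS.items():
        text = text.replace(short, full)
    return text


def tokenize(text):
    """Split text into word tokens and single punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", text)


def remove_stop_words(tokens):
    """Keep alphabetic tokens that are not in the stop-word list."""
    return [t for t in tokens if t.isalpha() and t.lower() not in STOP_WORDS]


def preprocess(text):
    """Expand, tokenize, then drop punctuation and stop words."""
    return remove_stop_words(tokenize(expand(text)))


tokens = preprocess('Md. Rafi said, "I\'m going to kill myself".')
print(tokens)
```

Expansion runs before tokenization here so that "Md." is not split into two tokens first; a real Bangla pipeline would need language-specific tokenization and stop-word lists.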
11. Why use Unsupervised Learning?
Below are some of the main reasons that make Unsupervised Learning important:
• Unsupervised learning is helpful for finding useful insights from the data.
• Unsupervised learning resembles the way a human learns to think from experience, which makes it closer to real AI.
• Unsupervised learning works on unlabeled and uncategorized data, which makes it more widely applicable.
• In the real world, we do not always have input data with the corresponding output, so we need unsupervised learning to handle such cases.
12. Working of Unsupervised Learning
The working of unsupervised learning can be understood from the diagram below:
Here, we take unlabeled input data, which means it is not categorized and no corresponding outputs are given. This unlabeled input data is fed to the machine learning model in order to train it. The model first interprets the raw data to find hidden patterns in it and then applies a suitable algorithm.
15. K-means Clustering:
K-means is an iterative clustering algorithm that improves the grouping of the data at every iteration. Initially, the desired number of clusters, k, is selected, and the data points are then clustered into k groups. A larger k means smaller groups with more granularity. A lower k means larger groups with less granularity.
The output of the algorithm is a group of “labels.” It assigns each data point to one of the k groups. In k-means clustering, each group is defined by a centroid. The centroids are like the heart of the cluster: each one captures the points closest to it and adds them to its cluster.
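The assign-and-update loop described above can be sketched in plain Python. This is a minimal illustration under simplifying assumptions: the first k points serve as the initial centroids, the iteration count is fixed, and the sample points are made up. The function names are hypothetical.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=10):
    """Return (labels, centroids) after a fixed number of iterations."""
    # Naive initialization (an assumption): use the first k points.
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its closest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Made-up 2-D points forming two visible groups.
points = [(1, 1), (1.5, 2), (8, 8), (9, 9.5), (1, 0.5), (8.5, 9)]
labels, centroids = kmeans(points, k=2)
print(labels)
```

With text data, each point would be a numeric feature vector derived from a document rather than a raw 2-D coordinate.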
Hierarchical Clustering
Hierarchical clustering is an algorithm which builds a hierarchy of clusters. It begins with every data point assigned to a cluster of its own. At each step, the two closest clusters are merged into one cluster. The algorithm ends when there is only one cluster left.
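The merge procedure can be sketched as follows. This is a minimal illustration that assumes one-dimensional values and single linkage (the distance between two clusters is the distance between their closest members); the slide does not say which linkage is intended, and the function name and sample values are hypothetical.

```python
def agglomerate(values):
    """Merge the two closest clusters until one is left.

    Returns the merge history as (cluster_a, cluster_b, distance) tuples,
    where clusters are lists of indices into `values`.
    """
    clusters = [[i] for i in range(len(values))]
    history = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(
                    abs(values[i] - values[j])
                    for i in clusters[a]
                    for j in clusters[b]
                )
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        history.append((clusters[a], clusters[b], d))
        merged = clusters[a] + clusters[b]
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)]
        clusters.append(merged)
    return history


values = [1.0, 1.5, 5.0, 5.25, 9.0]
history = agglomerate(values)
for left, right, distance in history:
    print(left, right, distance)
```

Reading the history from top to bottom gives the hierarchy: early merges join near neighbours, and the last merge joins everything into one cluster.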
K-Nearest Neighbors
K-nearest neighbors is one of the simplest machine learning classifiers. It differs from other machine learning techniques in that it doesn’t produce a model. It is a simple algorithm which stores all available cases and classifies new instances based on a similarity measure.
It works very well when a meaningful distance can be defined between examples. It is slow when the training set is large and the distance calculation is nontrivial, because every stored case must be compared with the new instance.
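A minimal sketch of this store-and-compare idea, assuming Euclidean distance as the similarity measure and a simple majority vote among the k closest stored cases. The training points, class names, and function names are made up for illustration.

```python
from collections import Counter


def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest stored cases."""
    neighbors = sorted(train, key=lambda item: squared_distance(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Made-up labeled cases: (feature vector, class label).
train = [
    ((1, 1), "class_0"), ((1, 2), "class_0"), ((2, 1), "class_0"),
    ((8, 8), "class_1"), ((9, 8), "class_1"), ((8, 9), "class_1"),
]

print(knn_predict(train, (2, 2)))
print(knn_predict(train, (7, 8)))
```

Note that there is no training step: all the work happens at prediction time, which is why the method slows down as the stored set grows.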
16. Principal Components Analysis
Suppose your data lies in a high-dimensional space. You select a basis for that space and keep only the most important scores of that basis, for example the top 200. Each vector of this basis is known as a principal component. The subset you select constitutes a new space which is small in size compared to the original space. It maintains as much of the complexity of the data as possible.
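To make the idea concrete, the sketch below finds only the first principal component of a small made-up 2-D dataset and projects the data onto it. It assumes power iteration on the covariance matrix, which is one simple way to obtain the leading component (a library routine would normally be used instead), and it assumes the starting vector is not orthogonal to that component. The function name is hypothetical.

```python
def first_principal_component(rows, steps=100):
    """Return (unit direction, scores) for the leading principal component."""
    n = len(rows)
    dims = len(rows[0])
    # Center each column on its mean.
    means = [sum(col) / n for col in zip(*rows)]
    centered = [[x - m for x, m in zip(row, means)] for row in rows]
    # Sample covariance matrix.
    cov = [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(dims)]
        for i in range(dims)
    ]
    # Power iteration: repeatedly multiply by the covariance and normalize.
    # Assumes the start vector is not orthogonal to the leading component.
    vec = [1.0] * dims
    for _ in range(steps):
        vec = [sum(cov[i][j] * vec[j] for j in range(dims)) for i in range(dims)]
        norm = sum(v * v for v in vec) ** 0.5
        vec = [v / norm for v in vec]
    # Scores: coordinates of each centered row along the component.
    scores = [sum(c * v for c, v in zip(row, vec)) for row in centered]
    return vec, scores


rows = [(2.0, 1.0), (3.0, 2.0), (4.0, 2.0), (5.0, 3.0)]
direction, scores = first_principal_component(rows)
print(direction)
print(scores)
```

Each 2-D row is reduced to a single score, which is the dimensionality reduction described above in its smallest form.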
17. How do we plan to create our own Bangla Dataset?
Most of our data will be collected from Twitter and Reddit. Along with that, we will try to collect as much data as we can from people around us using Google Forms. After collecting the raw data, we will manually label it. Finally, we will preprocess and clean the data before applying the machine learning algorithms.
18. What output do we expect from this research?
• We will effectively mine the implicit emotion in the text, which can help enterprises or organizations to make effective decisions.
• We will also try to break the correlation between cyberbullying and suicide.
• We will analyze the factors that lead a person to attempt suicide, i.e. health factors, environmental factors and historical factors.
• Finally, we will develop a Bangla dataset, enrich it and design an efficient suicidal analysis method using six different machine learning approaches.
19. Work that needs to be done
We will implement the algorithms mentioned earlier. Along with that, we will develop the Bangla Dataset. We will also examine the accuracy after applying the different algorithms to existing datasets. Our work might give law and order agencies the opportunity to help people with suicidal thoughts and take the necessary steps to prevent suicide.
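Comparing algorithms by accuracy could look roughly like the sketch below. The labels, predictions, and model names are invented for illustration and are not results from this work.

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the true labels."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Invented labels and predictions, for illustration only.
true_labels = [1, 0, 1, 1, 0, 0, 1, 0]
predictions = {
    "model_a": [1, 0, 1, 0, 0, 0, 1, 1],
    "model_b": [1, 0, 1, 1, 0, 1, 1, 0],
}

scores = {name: accuracy(true_labels, preds) for name, preds in predictions.items()}
best = max(scores, key=scores.get)
print(scores, best)
```

On imbalanced data, accuracy alone can be misleading, so precision and recall are often reported alongside it.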
20. References
[1] Onan, A., Korukoğlu, S., & Bulut, H. (2016). A multiobjective weighted voting ensemble classifier based on differential evolution algorithm for text sentiment classification. Expert Systems with Applications, 62, 1-16.
[2] Hassan, A., Abbasi, A., & Zeng, D. (2013, September). Twitter sentiment analysis: A bootstrap ensemble framework. In 2013 International Conference on Social Computing (pp. 357-364). IEEE.
Sarkar, K. (2019). Sentiment polarity detection in Bengali tweets using deep convolutional neural networks. Journal of Intelligent Systems, 28(3), 377-386.
[3] Mittal, N., Agarwal, B., Chouhan, G., Bania, N., & Pareek, P. (2013, October). Sentiment analysis of Hindi reviews based on negation and discourse relation. In Proceedings of the 11th Workshop on Asian Language Resources (pp. 45-50).
[4] Rani, S., & Kumar, P. (2019). Deep learning based sentiment analysis using convolution neural network. Arabian Journal for Science and Engineering, 44(4), 3305-3314.
[5] Joshi, A., Balamurali, A. R., & Bhattacharyya, P. (2010). A fall-back strategy for sentiment analysis in Hindi: a case study. Proceedings of the 8th ICON.
[6] Saleena, N. (2018). An ensemble classification system for Twitter sentiment analysis. Procedia Computer Science, 132, 937-946.
[7] Da Silva, N. F., Hruschka, E. R., & Hruschka Jr, E. R. (2014). Tweet sentiment analysis with classifier ensembles. Decision Support Systems, 66, 170-179.
[8] Islam, K. I., Islam, M., & Amin, M. R. (2020). Sentiment analysis in Bengali via transfer learning using multi-lingual BERT. arXiv preprint arXiv:2012.07538.