Sentiment analysis is a fundamental part of Natural Language
Processing. There are numerous works on this topic in English and other
languages. However, it is still a comparatively new practice in Bangla. The
absence of a suitable Bangla corpus is the primary obstacle for sentiment
analysis tasks in Bangla. Long Short-Term Memory (LSTM) is a
common technique for resolving sentiments from a dataset containing a large
amount of text data, whereas Gated Recurrent Unit (GRU) is very efficient for
datasets with a small amount of text data. In this manuscript, we present a
5-layered GRU neural network model, each layer comprising 48 neurons, and
apply the model to an existing Bangla corpus. We implemented 10-fold
cross-validation and repeated the same process three times. Each
time, we considered the averages of the ten validation accuracies and losses and
compared the results with the state-of-the-art published outcome (77.85%
highest accuracy) for Bi-directional LSTM (Bi-LSTM). The highest accuracy
for our model was 78.41%, while the lowest accuracy was 76.34%.
Sentiment Analysis of Bengali text using Gated Recurrent Neural Network
1. 4th INTERNATIONAL CONFERENCE ON INNOVATIVE COMPUTING
& COMMUNICATION (ICICC-2021)
20-21 February 2021.
Sentiment Analysis of Bangla Text using Gated Recurrent
Neural Network
Nasif Alvi
Kamrul Hasan Talukder
Abdul Hasib Uddin
Presented by:
Abdul Hasib Uddin
Khulna University, Khulna, Bangladesh
2. 1. INDEX:
Abstract
Introduction
Literature Review
Proposed Methodology
Result and Discussion
Comparative Analysis
Conclusion and Future Work
3. 2. ABSTRACT:
Sentiment analysis is a fundamental part of Natural Language Processing.
Numerous works exist on this topic in English and other languages.
However, it is still a comparatively new practice in Bangla.
It is difficult to find a suitable Bangla corpus for sentiment analysis tasks.
Long Short-Term Memory (LSTM) is a common technique for resolving sentiments
from a dataset containing a large amount of text data.
However, Gated Recurrent Unit (GRU) is very efficient for datasets with a small
amount of text data.
We present a 5-layered GRU neural network model,
with each layer comprising 48 neurons,
applied to an existing Bangla corpus.
We used 10-fold cross-validation and repeated the same process three times.
Each time, we considered the averages of the ten validation accuracies and losses and
compared the results with the state-of-the-art published outcome (77.85% highest
accuracy) for Bi-directional LSTM (Bi-LSTM).
The highest accuracy for our model is 78.41%, while the lowest accuracy is 76.34%.
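To make the gating idea behind GRU concrete, here is a minimal pure-Python sketch of a single GRU time step. It is only an illustration under stated assumptions: the slides do not say which framework or gate convention was used, so the update rule below follows one common formulation, and the helper names (`gru_step`, `zero_params`), the input size of 8, and the all-zero weights are hypothetical. Only the layer width of 48 units comes from the slides.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(matrix, vector):
    # Plain matrix-vector product on nested lists.
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]

def gru_step(x, h_prev, params):
    """One GRU time step (one common convention, assumed here).

    z  = sigmoid(W_z x + U_z h_prev + b_z)        update gate
    r  = sigmoid(W_r x + U_r h_prev + b_r)        reset gate
    hc = tanh(W_h x + U_h (r * h_prev) + b_h)     candidate state
    h  = (1 - z) * h_prev + z * hc                new hidden state
    """
    def affine(gate, hidden):
        wx = matvec(params["W_" + gate], x)
        uh = matvec(params["U_" + gate], hidden)
        return [a + b + c for a, b, c in zip(wx, uh, params["b_" + gate])]

    z = [sigmoid(v) for v in affine("z", h_prev)]
    r = [sigmoid(v) for v in affine("r", h_prev)]
    gated = [ri * hi for ri, hi in zip(r, h_prev)]
    h_cand = [math.tanh(v) for v in affine("h", gated)]
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h_prev, h_cand)]

def zero_params(input_dim, units):
    # Hypothetical initializer: all-zero weights, used only for the demo.
    params = {}
    for gate in ("z", "r", "h"):
        params["W_" + gate] = [[0.0] * input_dim for _ in range(units)]
        params["U_" + gate] = [[0.0] * units for _ in range(units)]
        params["b_" + gate] = [0.0] * units
    return params

UNITS = 48       # layer width stated in the slides
INPUT_DIM = 8    # assumed embedding size, not given in the slides

h_new = gru_step([0.0] * INPUT_DIM, [1.0] * UNITS, zero_params(INPUT_DIM, UNITS))
```

With all-zero weights both gates evaluate to 0.5 and the candidate state is 0, so each entry of the previous state is simply halved. A trained layer would instead learn the weights so that the update gate decides, per unit, how much of the old state to keep.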
4. 3. INTRODUCTION:
Sentiment Analysis can be referred to as “Opinion Mining”.
Micro-blogging and social platforms such as Twitter, YouTube, and Facebook are very
popular for social connections.
The main objective is to extract and identify the sentiment from a text.
It is applied to reviews and social media for a variety of applications.
Little research has been performed on Bangla text.
5. 4. LITERATURE REVIEW :
Hoque et al. [1]
Used Machine Learning approaches along with Doc2vec.
Used a corpus of seven thousand Bangla sentences.
Randomly chose 80% of the data for training and the remaining 20% for testing.
Achieved highest accuracy using Bi-directional LSTM.
Uddin et al. [2]
Depression detection using Gated Recurrent Unit.
Collected Bangla data from Twitter.
Utilized GRU sizes of 64, 128, 256, 512, and 1024 for this analysis.
6. 4. LITERATURE REVIEW (Continued):
Hossain et al. [3]
Sentiment analysis on restaurant surveys.
Proposed joint model with CNN-LSTM.
Data consisting of 1,000 food reviews.
Sharfuddin et al. [4]
Sentiment classification of Bangla text using RNN with Bi-LSTM.
10,000 comments from Facebook, consisting of 5,000 negative comments and 5,000
positive comments.
Removed all the symbols from the text.
Tripto et al. [5]
3-way and 5-way classification of sentiments.
Analysed Bangla, Romanized Bangla, and English comments from YouTube.
Highest accuracy was 65.97%.
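As an illustration of the kind of symbol removal mentioned for Sharfuddin et al. [4], the sketch below keeps only characters from the Unicode Bengali block (U+0980 to U+09FF) plus whitespace. This is an assumption made for illustration: the slides do not describe the exact cleaning rules used in [4], and the function name `strip_symbols` is hypothetical.

```python
import re

# Anything outside the Unicode Bengali block (U+0980-U+09FF) or whitespace
# is treated as a "symbol" here. This is an assumed rule, not the one from [4].
_NON_BANGLA = re.compile(r"[^\u0980-\u09FF\s]")
_SPACES = re.compile(r"\s+")

def strip_symbols(text):
    """Replace non-Bangla characters with spaces, then collapse whitespace."""
    cleaned = _NON_BANGLA.sub(" ", text)
    return _SPACES.sub(" ", cleaned).strip()

sample = strip_symbols("আমি ভালো আছি!!! :) 100%")
```

Note that this rule also drops Latin letters, ASCII digits, emoji, and joiner characters such as U+200C and U+200D, which may or may not be desirable depending on the corpus.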
8. 7. RESULTS & DISCUSSION:
Fig 1. Validation accuracy and validation loss of 10-fold cross-validation in
three runs
9. 7. RESULTS & DISCUSSION (Continued):
Fig 2. Graphical view of (a) average accuracy and (b) average loss in three runs
10. 8. COMPARATIVE ANALYSIS:
Metric | Our system | Hoque et al. [1]
Highest accuracy | 78.41% | 77.85%
Lowest accuracy | 76.34% | 59.21%
11. 9. CONCLUSION & FUTURE WORK :
We applied 10-fold cross-validation three times, shuffling the dataset
each time, to obtain a more reliable result.
We achieved the highest average accuracy of 78.41%.
More preprocessing techniques and other feature extraction methods can
be applied to get better results.
Other classification algorithms can be deployed to compare our results.
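The evaluation protocol summarized above (10-fold cross-validation, repeated three times with a fresh shuffle, averaging validation accuracy and loss per run) could be sketched roughly as follows. The function names, the fixed seed, and the `dummy_evaluate` callback are hypothetical placeholders. In a real experiment the callback would train the GRU model on the training indices and return its validation accuracy and loss.

```python
import random
from statistics import mean

def k_fold_indices(n_samples, k, rng):
    """Shuffle sample indices and deal them into k roughly equal folds."""
    indices = list(range(n_samples))
    rng.shuffle(indices)
    return [indices[i::k] for i in range(k)]

def repeated_cross_validation(n_samples, evaluate_fold, k=10, runs=3, seed=0):
    """Return one (average accuracy, average loss) pair per run."""
    rng = random.Random(seed)
    run_averages = []
    for _ in range(runs):
        folds = k_fold_indices(n_samples, k, rng)  # reshuffled on every run
        accuracies, losses = [], []
        for i, val_idx in enumerate(folds):
            # Every fold except the i-th one is used for training.
            train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
            acc, loss = evaluate_fold(train_idx, val_idx)
            accuracies.append(acc)
            losses.append(loss)
        run_averages.append((mean(accuracies), mean(losses)))
    return run_averages

def dummy_evaluate(train_idx, val_idx):
    # Placeholder only: returns fixed numbers instead of training a model.
    return 0.75, 0.5

results = repeated_cross_validation(100, dummy_evaluate)
```

Reporting the highest and lowest of the per-run averages then corresponds to taking the maximum and minimum over the first element of each pair in `results`.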
12. 10. REFERENCES :
[1] M. T. Hoque, A. Islam, E. Ahmed, K. A. Mamun and M. N. Huda, "Analyzing Performance of Different Machine
Learning Approaches With Doc2vec for Classifying Sentiment of Bengali Natural Language," International Conference
on Electrical, Computer and Communication Engineering (ECCE), Cox's Bazar, Bangladesh, 2019, pp. 1-5, doi:
10.1109/ECACE.2019.8679272.
[2] A. H. Uddin, D. Bapery and A. S. Mohammad Arif, "Depression Analysis of Bangla Social Media Data using
Gated Recurrent Neural Network," 1st International Conference on Advances in Science, Engineering and Robotics
Technology (ICASERT), Dhaka, Bangladesh, 2019, pp. 1-6, doi: 10.1109/ICASERT.2019.8934455.
[3] N. Hossain, M. R. Bhuiyan, Z. N. Tumpa and S. A. Hossain, "Sentiment Analysis of Restaurant Reviews using
Combined CNN-LSTM," ICCCNT, Kharagpur, India, 2020, pp. 1-5, doi: 10.1109/ICCCNT49239.2020.9225328.
[4] A. Aziz Sharfuddin, M. Nafis Tihami and M. Saiful Islam, "A Deep Recurrent Neural Network with BiLSTM
model for Sentiment Classification," International Conference on Bangla Speech and Language Processing (ICBSLP),
Sylhet, 2018, pp. 1-4, doi: 10.1109/ICBSLP.2018.8554396.
[5] N. Irtiza Tripto and M. Eunus Ali, "Detecting Multilabel Sentiment and Emotions from Bangla YouTube
Comments," ICBSLP, Sylhet, 2018, pp. 1-6, doi: 10.1109/ICBSLP.2018.8554875.