SlideShare a Scribd company logo
1 of 3
Download to read offline
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 04 | Apr 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 3749
Derogatory Comment Classification
Sudhanshu Chaurasia1, KMR Dayaasaagar2, Jayesh Girdhar3 ,Dhanesh.S4,Sunil Shelke5
1,2,3,4 UG Student, Dept. of Computer Engineering, Pillai College of Engineering, New Panvel, India. 5 Assistant
Professor, Dept. of Information Technology, Pillai College of Engineering, New Panvel, India.
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract - Social media platforms have become a facet for
people’s opinion and reviews, wherein people share their
opinions on a wide variety of topics. However, people exploit
this facility to take a dig at those with whom they don't find
their opinions match with. People use this facility to post
harmful, racial, gender biased, threatfulcomments. Asaresult,
social media platforms are quickly becoming indispensable.
They often struggle to facilitate conversation, effectively
forcing many communities to shut down user comments. This
motivates us to look into this problemandbuildamodelwhich
will detect and classify derogatory comments. We will collect
the dataset from Kaggle and experiment with the help of deep
learning approaches like Naïve-Bayes, SVM, LSTM and BERT.
algorithms. Thereby, we will classify the derogatory
comments. We will compare these algorithms and conclude
which algorithm is more effective.
Key Words: Language recognition, multilingual,
offensive word, NBSVM
1. INTRODUCTION
In today’s current society, there is a big problem when it
comes to online toxicity. Internet isanopendiscussingspace
for everyone to freely express their opinions. With the
massive increase in social interactions on online social
networks, there has also been an increase of hateful
activities that exploit such infrastructure. This toxicitytends
to negatively impact how a lot of people tend to engage in
conversation and deters some from engaging in online
conversation entirely.
As a result, online platforms tend to struggle to effectively
facilitate conversations, leading many communities to limit
or completely shut down user comments. Harassment and
abuse are discouraging people from sharing their ideas and
disturbing the internet environment. Platforms struggle to
effectively facilitate conversations, leading many
communities to limit or completely shut down user
comments if it’s toxic. Motivated by thisproblem,we want to
build a technology that detects and classifies the derogatory
comment. For our model, the input will be a comment. We
will use three different modelstopredictscoresfor“toxicity”
(which is our target), “severe toxicity”, “obscenity”, “identity
attack”, “insult”, “threat”. Then Support Vector Machine
(SVM) models are often used as baselines for other methods
in text categorization .In this case Naive Bayes -SVM (NVB-
SVM) provides more accuracy for further classification.
2. Literature Survey
A. Base-Line Model:Naive NVB-SVM was developed by Sida
Wang and Christopher Manning. Bayes (NB) and Support
Vector Machine (SVM) models areoften usedasbaselinesfor
other methods in text categorization .In this case Naive
Bayes -SVM (NVB-SVM) provides more accuracy for further
classification.
B. BERT: which stands for Bidirectional Encoder
Representations transformers. Bidirectional Encoder
Representations from Transformers (BERT) is a NLP model
that was designed to pretrain deep bidirectional
representations from unlabeled text and, after that, be fine-
tuned using labeled text for different NLP tasks [7]. That
way, with BERT model,wecancreatestate-of-the-artmodels
for many different NLP tasks [7]. We can see the results
obtained by BERT in different NLP tasks at [7].
C. Toxic Comment Detection using LSTM This papers
authors have developed a LSTM model to classify speech as
hate or not with an accuracy of 95% accuracy. LSTM stands
for long short-term memory and is an improvement over a
normal RNN. The model not only classifies a given sentence
as toxic or non-toxic but also gives the percentage of toxicity
or non-toxicity of the given sentence.
D. Research on Text Classification Based on CNN and
LSTM: In this paper a new model is developed by combining
a LSTM and CNN which outperforms the performance of
both the algorithms individually
2. 1 Summary of Related Work
The summary of methods used in literature is given in Table
1.
Literature MODEL Accuracy
Sida Wang, Baselines and
Bigrams: Simple, Good
Sentiment and Topic
Classification,
2020
NVB-SVM 87%
Hong Fu, Social Media Toxicity
Classification Using Deep
Learning ,2021
BERT 98%
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 04 | Apr 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 3750
Krishna Dubey, ToxicComment
Detection using LSTM ,2020
LSTM 95%
2019, YuandongLuanResearch
on Text Classification Based on
CNN and LSTM
LSTM-
CNN
98%
3. Proposed Architecture
Our research is aimed at derogatory terms in a text that the
user provides as input. Because social media has grown in
popularity in recent years, there are people who engage in
some type of online material (for example, articles) aimed
towards a person or a group, etc. So, using natural language
processing or machine learning methods, our program will
evaluate the text and refer back to the dataset to determine
the language of the input text as well as whether or not it
contains any derogatory terms.
3.1 System Architecture
The proposed system architecture is given in Figure 1
Fig. 1 Proposed system architecture
B. Block 1 Description: upload the data set which has been
taken from Kaggle . It containsWikipedia commentswhichis
classified into Obscene, severely toxic, threat, identity hate
and insult.
C. Block 2 Description: We will splitthedatasetintotraining
and testing sets . we will split the dataset into 60% for
training and 40% for testing and can be increased if there is
a demand for increasing the accuracy.
D. Block 3 Description: This datasetcontainsdata which has
special, characters, useless words and links etc. We will be
filtering them out .
E. Block 4 Description: Tokenization is an important step
when working with text data. We will use this process to
split the sentences into smaller units called tokens.
F. Block 5 Description: We will use Naïve Bayes-SVM model
to create a base-line model so as to improve the accuracy in
detecting and classifying the comments.
G. Block 6 and 7 Description: We will feed the output of the
base line model to each of these models separately. BERT
and LSTM will be able to classify and predict derogatory
comments with higher accuracy rate.
H.Block 8 Description: We will compare the results
generated by LSTM and BERT models in detecting
derogatory comments .
3. Requirement Analysis
The implementation detail is given in this section.
3.1 Software
We will be using streamlit for making the web app. We will
be using python for developing the classification model. We
will be required to import different libraries to make the
classification models .
3.2 Hardware
We will be requiring Intel core i-5 generationastrainingand
testing requires a high-performance processor. We will be
requiring GeForce 1650 or above.
3.3 Dataset and Parameters
The Conversation AI team, a research initiative founded by
Jigsaw and Google are working on tools to help improve
online conversation. One area of focus is the study of
negative online behaviors, like toxic comments (i.e.
comments that are rude, disrespectful or otherwise likely to
make someone leave a discussion). So far they’ve built a
range of publicly available models served through the
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 04 | Apr 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 3751
Perspective API, including toxicity. But the current models
still make errors, and they don’t allow users to select which
types of toxicity they’re interested in finding (e.g. some
platforms may be fine with profanity, but not with other
types of toxic content). A multi-headed model that’s capable
of detecting different types of toxicity likethreats,obscenity,
insults, and identity-based hate is developed. An example of
a data source by Jigsaw is given in Fig. 3.1.
Fig. 3.1 Kaggle Toxic dataset by jigsaw
3. CONCLUSIONS
To sum up our work, we implemented two machine learning
models, namely, LSTM and BERT. We utilized information
from data visualization to preprocess data. BERT can
produce state-of-the-art work with only one output layer.
We trained BERT with only one epoch and got results better
than the LSTM model.
For future work, we would like to try waystofixtheproblem
of imbalanced data. BERT model brought us surprisingly
good results and we would like to explore it if we have more
time.
ACKNOWLEDGMENT
It is our privilege to express our sincerest regards to our
supervisor Dr Sunil Shelke for the valuable inputs, able
guidance, encouragement, whole-hearted cooperation and
constructive criticism throughout the duration of this work.
We deeply express our sincere thanks our Head of the
Department Dr. Sharvari Govilkar and our Principal Dr.
Sandeep M. Joshi for encouraging and allowing us to
presenting this work.
REFERENCES
1.Sida Wang, Baselines and Bigrams: Simple, Good
Sentiment and Topic Classification,2020
2. Hong Fu, Social Media Toxicity Classification Using
DeepLearning,2021
3. Krishna Dubey, Toxic Comment Detection usingLSTM,
Third International Conference on Advances in
Electronics , 2020
4. Yuandong Luan Research on Text Classification Based
on CNN and LSTM,International Conference on Artificial
Intelligence and Computer Applications,2019

More Related Content

Similar to Derogatory Comment Classification

Sentiment Analysis for Sarcasm Detection using Deep Learning
Sentiment Analysis for Sarcasm Detection using Deep LearningSentiment Analysis for Sarcasm Detection using Deep Learning
Sentiment Analysis for Sarcasm Detection using Deep LearningIRJET Journal
 
IRJET - YouTube Spam Comments Detection
IRJET - YouTube Spam Comments DetectionIRJET - YouTube Spam Comments Detection
IRJET - YouTube Spam Comments DetectionIRJET Journal
 
Neural Network Based Context Sensitive Sentiment Analysis
Neural Network Based Context Sensitive Sentiment AnalysisNeural Network Based Context Sensitive Sentiment Analysis
Neural Network Based Context Sensitive Sentiment AnalysisEditor IJCATR
 
Combining Lexicon based and Machine Learning based Methods for Twitter Sentim...
Combining Lexicon based and Machine Learning based Methods for Twitter Sentim...Combining Lexicon based and Machine Learning based Methods for Twitter Sentim...
Combining Lexicon based and Machine Learning based Methods for Twitter Sentim...IRJET Journal
 
IRJET - Mobile Chatbot for Information Search
 IRJET - Mobile Chatbot for Information Search IRJET - Mobile Chatbot for Information Search
IRJET - Mobile Chatbot for Information SearchIRJET Journal
 
IRJET - E-Assistant: An Interactive Bot for Banking Sector using NLP Process
IRJET -  	  E-Assistant: An Interactive Bot for Banking Sector using NLP ProcessIRJET -  	  E-Assistant: An Interactive Bot for Banking Sector using NLP Process
IRJET - E-Assistant: An Interactive Bot for Banking Sector using NLP ProcessIRJET Journal
 
A Research Paper on HUMAN MACHINE CONVERSATION USING CHATBOT
A Research Paper on HUMAN MACHINE CONVERSATION USING CHATBOTA Research Paper on HUMAN MACHINE CONVERSATION USING CHATBOT
A Research Paper on HUMAN MACHINE CONVERSATION USING CHATBOTIRJET Journal
 
IRJET- College Enquiry Chatbot System(DMCE)
IRJET-  	  College Enquiry Chatbot System(DMCE)IRJET-  	  College Enquiry Chatbot System(DMCE)
IRJET- College Enquiry Chatbot System(DMCE)IRJET Journal
 
Language and Offensive Word Detection
Language and Offensive Word DetectionLanguage and Offensive Word Detection
Language and Offensive Word DetectionIRJET Journal
 
Twitter Text Sentiment Analysis: A Comparative Study on Unigram and Bigram Fe...
Twitter Text Sentiment Analysis: A Comparative Study on Unigram and Bigram Fe...Twitter Text Sentiment Analysis: A Comparative Study on Unigram and Bigram Fe...
Twitter Text Sentiment Analysis: A Comparative Study on Unigram and Bigram Fe...IRJET Journal
 
IRJET - Twitter Sentiment Analysis using Machine Learning
IRJET -  	  Twitter Sentiment Analysis using Machine LearningIRJET -  	  Twitter Sentiment Analysis using Machine Learning
IRJET - Twitter Sentiment Analysis using Machine LearningIRJET Journal
 
SentimentAnalysisofTwitterProductReviewsDocument.pdf
SentimentAnalysisofTwitterProductReviewsDocument.pdfSentimentAnalysisofTwitterProductReviewsDocument.pdf
SentimentAnalysisofTwitterProductReviewsDocument.pdfDevinSohi
 
Hate Speech Identification Using Machine Learning
Hate Speech Identification Using Machine LearningHate Speech Identification Using Machine Learning
Hate Speech Identification Using Machine LearningIRJET Journal
 
IRJET- Conversational Assistant based on Sentiment Analysis
IRJET- Conversational Assistant based on Sentiment AnalysisIRJET- Conversational Assistant based on Sentiment Analysis
IRJET- Conversational Assistant based on Sentiment AnalysisIRJET Journal
 
IRJET - Artificial Conversation Entity for an Educational Institute
IRJET - Artificial Conversation Entity for an Educational InstituteIRJET - Artificial Conversation Entity for an Educational Institute
IRJET - Artificial Conversation Entity for an Educational InstituteIRJET Journal
 
IRJET- A Pragmatic Supervised Learning Methodology of Hate Speech Detection i...
IRJET- A Pragmatic Supervised Learning Methodology of Hate Speech Detection i...IRJET- A Pragmatic Supervised Learning Methodology of Hate Speech Detection i...
IRJET- A Pragmatic Supervised Learning Methodology of Hate Speech Detection i...IRJET Journal
 
AGGRESSION DETECTION USING MACHINE LEARNING MODEL
AGGRESSION DETECTION USING MACHINE LEARNING MODELAGGRESSION DETECTION USING MACHINE LEARNING MODEL
AGGRESSION DETECTION USING MACHINE LEARNING MODELIRJET Journal
 
E-Mail Spam Detection Using Supportive Vector Machine
E-Mail Spam Detection Using Supportive Vector MachineE-Mail Spam Detection Using Supportive Vector Machine
E-Mail Spam Detection Using Supportive Vector MachineIRJET Journal
 
Twitter Sentiment Analysis: An Unsupervised Approach
Twitter Sentiment Analysis: An Unsupervised ApproachTwitter Sentiment Analysis: An Unsupervised Approach
Twitter Sentiment Analysis: An Unsupervised ApproachIRJET Journal
 
Analyzing Sentiment Of Movie Reviews In Bangla By Applying Machine Learning T...
Analyzing Sentiment Of Movie Reviews In Bangla By Applying Machine Learning T...Analyzing Sentiment Of Movie Reviews In Bangla By Applying Machine Learning T...
Analyzing Sentiment Of Movie Reviews In Bangla By Applying Machine Learning T...Andrew Parish
 

Similar to Derogatory Comment Classification (20)

Sentiment Analysis for Sarcasm Detection using Deep Learning
Sentiment Analysis for Sarcasm Detection using Deep LearningSentiment Analysis for Sarcasm Detection using Deep Learning
Sentiment Analysis for Sarcasm Detection using Deep Learning
 
IRJET - YouTube Spam Comments Detection
IRJET - YouTube Spam Comments DetectionIRJET - YouTube Spam Comments Detection
IRJET - YouTube Spam Comments Detection
 
Neural Network Based Context Sensitive Sentiment Analysis
Neural Network Based Context Sensitive Sentiment AnalysisNeural Network Based Context Sensitive Sentiment Analysis
Neural Network Based Context Sensitive Sentiment Analysis
 
Combining Lexicon based and Machine Learning based Methods for Twitter Sentim...
Combining Lexicon based and Machine Learning based Methods for Twitter Sentim...Combining Lexicon based and Machine Learning based Methods for Twitter Sentim...
Combining Lexicon based and Machine Learning based Methods for Twitter Sentim...
 
IRJET - Mobile Chatbot for Information Search
 IRJET - Mobile Chatbot for Information Search IRJET - Mobile Chatbot for Information Search
IRJET - Mobile Chatbot for Information Search
 
IRJET - E-Assistant: An Interactive Bot for Banking Sector using NLP Process
IRJET -  	  E-Assistant: An Interactive Bot for Banking Sector using NLP ProcessIRJET -  	  E-Assistant: An Interactive Bot for Banking Sector using NLP Process
IRJET - E-Assistant: An Interactive Bot for Banking Sector using NLP Process
 
A Research Paper on HUMAN MACHINE CONVERSATION USING CHATBOT
A Research Paper on HUMAN MACHINE CONVERSATION USING CHATBOTA Research Paper on HUMAN MACHINE CONVERSATION USING CHATBOT
A Research Paper on HUMAN MACHINE CONVERSATION USING CHATBOT
 
IRJET- College Enquiry Chatbot System(DMCE)
IRJET-  	  College Enquiry Chatbot System(DMCE)IRJET-  	  College Enquiry Chatbot System(DMCE)
IRJET- College Enquiry Chatbot System(DMCE)
 
Language and Offensive Word Detection
Language and Offensive Word DetectionLanguage and Offensive Word Detection
Language and Offensive Word Detection
 
Twitter Text Sentiment Analysis: A Comparative Study on Unigram and Bigram Fe...
Twitter Text Sentiment Analysis: A Comparative Study on Unigram and Bigram Fe...Twitter Text Sentiment Analysis: A Comparative Study on Unigram and Bigram Fe...
Twitter Text Sentiment Analysis: A Comparative Study on Unigram and Bigram Fe...
 
IRJET - Twitter Sentiment Analysis using Machine Learning
IRJET -  	  Twitter Sentiment Analysis using Machine LearningIRJET -  	  Twitter Sentiment Analysis using Machine Learning
IRJET - Twitter Sentiment Analysis using Machine Learning
 
SentimentAnalysisofTwitterProductReviewsDocument.pdf
SentimentAnalysisofTwitterProductReviewsDocument.pdfSentimentAnalysisofTwitterProductReviewsDocument.pdf
SentimentAnalysisofTwitterProductReviewsDocument.pdf
 
Hate Speech Identification Using Machine Learning
Hate Speech Identification Using Machine LearningHate Speech Identification Using Machine Learning
Hate Speech Identification Using Machine Learning
 
IRJET- Conversational Assistant based on Sentiment Analysis
IRJET- Conversational Assistant based on Sentiment AnalysisIRJET- Conversational Assistant based on Sentiment Analysis
IRJET- Conversational Assistant based on Sentiment Analysis
 
IRJET - Artificial Conversation Entity for an Educational Institute
IRJET - Artificial Conversation Entity for an Educational InstituteIRJET - Artificial Conversation Entity for an Educational Institute
IRJET - Artificial Conversation Entity for an Educational Institute
 
IRJET- A Pragmatic Supervised Learning Methodology of Hate Speech Detection i...
IRJET- A Pragmatic Supervised Learning Methodology of Hate Speech Detection i...IRJET- A Pragmatic Supervised Learning Methodology of Hate Speech Detection i...
IRJET- A Pragmatic Supervised Learning Methodology of Hate Speech Detection i...
 
AGGRESSION DETECTION USING MACHINE LEARNING MODEL
AGGRESSION DETECTION USING MACHINE LEARNING MODELAGGRESSION DETECTION USING MACHINE LEARNING MODEL
AGGRESSION DETECTION USING MACHINE LEARNING MODEL
 
E-Mail Spam Detection Using Supportive Vector Machine
E-Mail Spam Detection Using Supportive Vector MachineE-Mail Spam Detection Using Supportive Vector Machine
E-Mail Spam Detection Using Supportive Vector Machine
 
Twitter Sentiment Analysis: An Unsupervised Approach
Twitter Sentiment Analysis: An Unsupervised ApproachTwitter Sentiment Analysis: An Unsupervised Approach
Twitter Sentiment Analysis: An Unsupervised Approach
 
Analyzing Sentiment Of Movie Reviews In Bangla By Applying Machine Learning T...
Analyzing Sentiment Of Movie Reviews In Bangla By Applying Machine Learning T...Analyzing Sentiment Of Movie Reviews In Bangla By Applying Machine Learning T...
Analyzing Sentiment Of Movie Reviews In Bangla By Applying Machine Learning T...
 

More from IRJET Journal

TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...IRJET Journal
 
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURESTUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTUREIRJET Journal
 
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...IRJET Journal
 
Effect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsEffect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsIRJET Journal
 
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...IRJET Journal
 
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...IRJET Journal
 
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...IRJET Journal
 
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...IRJET Journal
 
A REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASA REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASIRJET Journal
 
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...IRJET Journal
 
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProP.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProIRJET Journal
 
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...IRJET Journal
 
Survey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemSurvey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemIRJET Journal
 
Review on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesReview on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesIRJET Journal
 
React based fullstack edtech web application
React based fullstack edtech web applicationReact based fullstack edtech web application
React based fullstack edtech web applicationIRJET Journal
 
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...IRJET Journal
 
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.IRJET Journal
 
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...IRJET Journal
 
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignMultistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignIRJET Journal
 
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...IRJET Journal
 

More from IRJET Journal (20)

TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
 
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURESTUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
 
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
 
Effect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsEffect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil Characteristics
 
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
 
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
 
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
 
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
 
A REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASA REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADAS
 
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
 
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProP.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
 
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
 
Survey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemSurvey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare System
 
Review on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesReview on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridges
 
React based fullstack edtech web application
React based fullstack edtech web applicationReact based fullstack edtech web application
React based fullstack edtech web application
 
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
 
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
 
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
 
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignMultistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
 
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
 

Recently uploaded

Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxJoão Esperancinha
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)Suman Mia
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxupamatechverse
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSSIVASHANKAR N
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024hassan khalil
 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxDeepakSakkari2
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxpranjaldaimarysona
 
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...RajaP95
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130Suhani Kapoor
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024Mark Billinghurst
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Dr.Costas Sachpazis
 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSKurinjimalarL3
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxAsutosh Ranjan
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )Tsuyoshi Horigome
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...Soham Mondal
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130Suhani Kapoor
 

Recently uploaded (20)

Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
 
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
 
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptx
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024
 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptx
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptx
 
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptx
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
 
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptxExploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
 

Derogatory Comment Classification

  • 1. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 04 | Apr 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 3749 Derogatory Comment Classification Sudhanshu Chaurasia1, KMR Dayaasaagar2, Jayesh Girdhar3 ,Dhanesh.S4,Sunil Shelke5 1,2,3,4 UG Student, Dept. of Computer Engineering, Pillai College of Engineering, New Panvel, India. 5 Assistant Professor, Dept. of Information Technology, Pillai College of Engineering, New Panvel, India. ---------------------------------------------------------------------***--------------------------------------------------------------------- Abstract - Social media platforms have become a facet for people’s opinion and reviews, wherein people share their opinions on a wide variety of topics. However, people exploit this facility to take a dig at those with whom they don't find their opinions match with. People use this facility to post harmful, racial, gender biased, threatfulcomments. Asaresult, social media platforms are quickly becoming indispensable. They often struggle to facilitate conversation, effectively forcing many communities to shut down user comments. This motivates us to look into this problemandbuildamodelwhich will detect and classify derogatory comments. We will collect the dataset from Kaggle and experiment with the help of deep learning approaches like Naïve-Bayes, SVM, LSTM and BERT. algorithms. Thereby, we will classify the derogatory comments. We will compare these algorithms and conclude which algorithm is more effective. Key Words: Language recognition, multilingual, offensive word, NBSVM 1. INTRODUCTION In today’s current society, there is a big problem when it comes to online toxicity. Internet isanopendiscussingspace for everyone to freely express their opinions. With the massive increase in social interactions on online social networks, there has also been an increase of hateful activities that exploit such infrastructure. This toxicitytends to negatively impact how a lot of people tend to engage in conversation and deters some from engaging in online conversation entirely. As a result, online platforms tend to struggle to effectively facilitate conversations, leading many communities to limit or completely shut down user comments. Harassment and abuse are discouraging people from sharing their ideas and disturbing the internet environment. Platforms struggle to effectively facilitate conversations, leading many communities to limit or completely shut down user comments if it’s toxic. Motivated by thisproblem,we want to build a technology that detects and classifies the derogatory comment. For our model, the input will be a comment. We will use three different modelstopredictscoresfor“toxicity” (which is our target), “severe toxicity”, “obscenity”, “identity attack”, “insult”, “threat”. Then Support Vector Machine (SVM) models are often used as baselines for other methods in text categorization .In this case Naive Bayes -SVM (NVB- SVM) provides more accuracy for further classification. 2. Literature Survey A. Base-Line Model:Naive NVB-SVM was developed by Sida Wang and Christopher Manning. Bayes (NB) and Support Vector Machine (SVM) models areoften usedasbaselinesfor other methods in text categorization .In this case Naive Bayes -SVM (NVB-SVM) provides more accuracy for further classification. B. BERT: which stands for Bidirectional Encoder Representations transformers. Bidirectional Encoder Representations from Transformers (BERT) is a NLP model that was designed to pretrain deep bidirectional representations from unlabeled text and, after that, be fine- tuned using labeled text for different NLP tasks [7]. That way, with BERT model,wecancreatestate-of-the-artmodels for many different NLP tasks [7]. We can see the results obtained by BERT in different NLP tasks at [7]. C. Toxic Comment Detection using LSTM This papers authors have developed a LSTM model to classify speech as hate or not with an accuracy of 95% accuracy. LSTM stands for long short-term memory and is an improvement over a normal RNN. The model not only classifies a given sentence as toxic or non-toxic but also gives the percentage of toxicity or non-toxicity of the given sentence. D. Research on Text Classification Based on CNN and LSTM: In this paper a new model is developed by combining a LSTM and CNN which outperforms the performance of both the algorithms individually 2. 1 Summary of Related Work The summary of methods used in literature is given in Table 1. Literature MODEL Accuracy Sida Wang, Baselines and Bigrams: Simple, Good Sentiment and Topic Classification, 2020 NVB-SVM 87% Hong Fu, Social Media Toxicity Classification Using Deep Learning ,2021 BERT 98%
  • 2. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 04 | Apr 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 3750 Krishna Dubey, ToxicComment Detection using LSTM ,2020 LSTM 95% 2019, YuandongLuanResearch on Text Classification Based on CNN and LSTM LSTM- CNN 98% 3. Proposed Architecture Our research is aimed at derogatory terms in a text that the user provides as input. Because social media has grown in popularity in recent years, there are people who engage in some type of online material (for example, articles) aimed towards a person or a group, etc. So, using natural language processing or machine learning methods, our program will evaluate the text and refer back to the dataset to determine the language of the input text as well as whether or not it contains any derogatory terms. 3.1 System Architecture The proposed system architecture is given in Figure 1 Fig. 1 Proposed system architecture B. Block 1 Description: upload the data set which has been taken from Kaggle . It containsWikipedia commentswhichis classified into Obscene, severely toxic, threat, identity hate and insult. C. Block 2 Description: We will splitthedatasetintotraining and testing sets . we will split the dataset into 60% for training and 40% for testing and can be increased if there is a demand for increasing the accuracy. D. Block 3 Description: This datasetcontainsdata which has special, characters, useless words and links etc. We will be filtering them out . E. Block 4 Description: Tokenization is an important step when working with text data. We will use this process to split the sentences into smaller units called tokens. F. Block 5 Description: We will use Naïve Bayes-SVM model to create a base-line model so as to improve the accuracy in detecting and classifying the comments. G. Block 6 and 7 Description: We will feed the output of the base line model to each of these models separately. BERT and LSTM will be able to classify and predict derogatory comments with higher accuracy rate. H.Block 8 Description: We will compare the results generated by LSTM and BERT models in detecting derogatory comments . 3. Requirement Analysis The implementation detail is given in this section. 3.1 Software We will be using streamlit for making the web app. We will be using python for developing the classification model. We will be required to import different libraries to make the classification models . 3.2 Hardware We will be requiring Intel core i-5 generationastrainingand testing requires a high-performance processor. We will be requiring GeForce 1650 or above. 3.3 Dataset and Parameters The Conversation AI team, a research initiative founded by Jigsaw and Google are working on tools to help improve online conversation. One area of focus is the study of negative online behaviors, like toxic comments (i.e. comments that are rude, disrespectful or otherwise likely to make someone leave a discussion). So far they’ve built a range of publicly available models served through the
  • 3. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 04 | Apr 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 3751 Perspective API, including toxicity. But the current models still make errors, and they don’t allow users to select which types of toxicity they’re interested in finding (e.g. some platforms may be fine with profanity, but not with other types of toxic content). A multi-headed model that’s capable of detecting different types of toxicity likethreats,obscenity, insults, and identity-based hate is developed. An example of a data source by Jigsaw is given in Fig. 3.1. Fig. 3.1 Kaggle Toxic dataset by jigsaw 3. CONCLUSIONS To sum up our work, we implemented two machine learning models, namely, LSTM and BERT. We utilized information from data visualization to preprocess data. BERT can produce state-of-the-art work with only one output layer. We trained BERT with only one epoch and got results better than the LSTM model. For future work, we would like to try waystofixtheproblem of imbalanced data. BERT model brought us surprisingly good results and we would like to explore it if we have more time. ACKNOWLEDGMENT It is our privilege to express our sincerest regards to our supervisor Dr Sunil Shelke for the valuable inputs, able guidance, encouragement, whole-hearted cooperation and constructive criticism throughout the duration of this work. We deeply express our sincere thanks our Head of the Department Dr. Sharvari Govilkar and our Principal Dr. Sandeep M. Joshi for encouraging and allowing us to presenting this work. REFERENCES 1.Sida Wang, Baselines and Bigrams: Simple, Good Sentiment and Topic Classification,2020 2. Hong Fu, Social Media Toxicity Classification Using DeepLearning,2021 3. Krishna Dubey, Toxic Comment Detection usingLSTM, Third International Conference on Advances in Electronics , 2020 4. Yuandong Luan Research on Text Classification Based on CNN and LSTM,International Conference on Artificial Intelligence and Computer Applications,2019