SlideShare a Scribd company logo
1 of 5
Download to read offline
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 10 Issue: 08 | Aug 2023 www.irjet.net p-ISSN: 2395-0072
© 2023, IRJET | Impact Factor value: 8.226 | ISO 9001:2008 Certified Journal | Page 758
Automatic Prediction and Countering of Communal Tweets using
Machine Learning Techniques
R. Durga1, V. Pavithra2
1Student, M.E, Dept. of Computer Science and Engineering, T.J.S Engineering College, Tamil Nadu, India
2Assistant Professor, Dept. of Computer Science and Engineering, T.J.S Engineering College, Tamil Nadu, India
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract - Social networks such as Twitter and Facebook
are seriously affected by abusive and offensive content. A lot
of research has been carried out in recent years for
automatic identification of different types of hate speech.
Hate speech can come under several categories such as
gender, ethnicity, nationality, racial groups and religions of
the targeted groups. This work explores methods which aim
to counter potentially harmful communal tweets which are
pointed towards religious groups such as Hindus, Sikhs,
Muslims, Jains, Christians, Buddhists. Communal tweets are
largely posted during calamities and taking advantage of
such situations, hatred and misinformation are propagated
in the affected region which may result in serious riot and
violence. Considering the potentially adverse effects of
communal tweets, a classifier to identify communal tweets
and non communal tweets is developed using the Support
Vector Machine algorithm which performs better than
existing approaches. Users who post communal tweets are
identified. Anti communal tweets are identified from the non
communal tweets using a classifier and by finding the
appropriate anti communal tweet to a communal tweet, this
system counters communal tweets by making use of such
anti communal tweets posted by some users. The proposed
system will be really helpful to reduce the increasing
communal venom in society by blocking the users who
repeat posting such communal tweets. It also changes the
negative opinion of communal tweeters by promoting anti
communal contents.
Key Words: Anti communal tweets, Classifier, Communal
tweets, Religious groups, Users
1. INTRODUCTION
In recent years, there is an increase in propagation of
communal speech on social media and the urgent need for
effective countermeasures are needed. Twitter is a social
broadcast network that enables people to publicly share
brief messages instantly around the world. This message
brings a variety of people with different voices, ideas and
perspectives. Twitter prevents and filters only abusive
contents and offensive words from the tweets but not the
communal contents. Using this, communal tweets are
pointed towards certain religions, racial communities,
politicians or certain groups. Communal tweets are posted
during times of calamities or emergency situations. Hence,
analyzing the communal tweets on twitter during
calamities is focused. While looking through these tweets,
it has been observed that a large amount of communal
tweets are posted rather than anti communal tweets. A
calamity generally affects the morale of the masses making
them vulnerable. Often, taking advantage of such
situations, hatred and misinformation are propagated in
the affected region, which may result in serious
deterioration of law and order.
A detailed analysis of communal tweets during disaster
situations are detected and countered. This system
characterizes users who initiate and promote communal
tweets and also suggest a way to counter such communal
content with anti communal tweets that asks users not to
spread communal hatred. Considering the adverse effects
of communal tweets, a Support Vector Machine Classifier
is developed to automatically separate the communal
tweets from the non communal ones. After identifying
communal tweets, the nature of communal tweets and the
users who post them are identified. Initiators, who initiate
the communal tweets are identified and propagators who
retweet the communal tweets posted by initiators or copy
the contents of other initiators and post their own tweet
with minor changes are also identified. Apart from the
communal tweets, non communal tweets are analyzed to
automatically identify anti communal tweets for
countering. This will dissuade people from posting
communal contents. Hence, a convincing way to counter
the communal tweetis is to counter them with anti
communal tweets to the initiators and propagators who
post communal tweets. A real time system is also
developed that automatically collects tweets from the user
and counters with anti communal tweets if the posted
tweet is communal and blocks the users who repeat
posting the communal tweets through which communal
venom in the society can be reduced.
2. EXISTING SYSTEM
It has been observed that offensive tweets are often
posted during calamities to which the attackers are
affiliated. However, it is quite surprising that in certain
geographical regions such as the Indian subcontinent,
communal tweets are posted even during natural disasters
such as floods and earthquakes and also during man-made
disasters. Several studies have attempted to identify
online content that is potentially hate speech or offensive
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 10 Issue: 08 | Aug 2023 www.irjet.net p-ISSN: 2395-0072
© 2023, IRJET | Impact Factor value: 8.226 | ISO 9001:2008 Certified Journal | Page 759
in nature by a supervised bag-of-words (BOW) model to
classify racist content in web pages. Along with words,
context features are also incorporated to improve the
classification accuracy. Offensive content in Youtube
comments using obscenities, profanities and pejorative
terms as features with appropriate weightage. Similarly,
cyberbullying was identified by Dinakar et al. using
features like parts-of-speech tags, profane words, words
with negative connotations and so on. More recently,
Burnap and Williams proposed a hate term and
dependence feature-based model to identify hate speeches
posted during a disaster event called the Woolwich attack.
Alsaedi et al. proposed a classification and clustering
based technique to predict disruptive events like riot.
Burnap and Williams proposed a model to detect cyber
hate on Twitter across multiple protected characteristics
such as race, disability, sex, etc. The most prevailing hate
speeches are targeted toward certain races, while religion-
induced hate speech is very sparse. Hence, a general
purpose hate speech identifier may fail to capture all the
nuances of a rare category like religion-based hate speech,
especially, when tweets from the rare categories are
posted in huge volumes for a short period of time. We
actually demonstrate in this paper that the classifier
proposed in can hardly capture communal tweets.
Consequently, in recent times, researchers focus on more
granular levels of hate speech detection in Twitter. For
example, Chaudhry tried to track racism in Twitter and
Burnap and Williams detected religious hate speeches
posted during the Woolwich attack. On the contrary, this
paper focuses on Twitter and it has been widely
demonstrated that the standard Natural Language
Processing based methodologies which have been
developed for formally written text, do not work well for
short, informal tweets. Hence, new methodologies are
necessary to deal with communal and noisy content
posted on Twitter.
3. PROPOSED WORK
Detailed analysis of communal tweets posted during
calamity situations such as automatic identification of
such tweets, analyzing the users who post such tweets and
also suggest a way to counter such content. In this paper,
we try to identify communal tweets, characterize users
initiating or promoting such contents and counter such
communal tweets with anti communal posts that ask users
not to spread communal venom. Support Vector Machine
classifier is proposed to extract communal tweets and this
classifier can be directly used over any future event
without further training. Second, users are classified into
two categories, one is originators who post a tweet and the
other one is propagators who retweets the content of
originators. We not only rely on retweets but also explore
similarities between tweets, their timestamps in order to
identify initiators and propagators more accurately. Rest
of the analysis is performed on these modified groups of
users. Apart from that, temporal patterns of the identified
set of communal users are analyzed to understand their
outraging phenomenon. Third, another Support Vector
classifier is proposed to detect anti communal tweets and
such tweets can be used to neutralize the effect of
communal tweets. A real time system is developed to
automatically identify communal and anti communal
tweets posted during and counter with anti communal
tweet to the communal tweet and also block the user if
communal tweets are posted by them again.
4. METHODOLOGY
Fig -1: Proposed Methodology
4.1 Preprocessing
The dataset is collected in two ways, first the implicit
dataset is given from the recent disaster event in the
admin page and second, to develop the real time service
which automatically collects tweets from the user in the
user page. The tweets are preprocessed using stopwords
removal and stemming techniques. Dictionary based
approach is used to remove stopwords. Articles,
prepositions, pronouns, conjunctions, etc are stop words
and do not add much information to the text. Examples of
a few stop words are “the”, “a”, “an”, “so”, “what”. Removal
of stop words reduces the dataset size and reduces the
training time due to the fewer number of tokens involved
in the training. Stemming is the process of reducing
inflected words to their word stem, base or root form,
generally written in a word form.
4.2 Communal Tweet Classification
Support Vector Machine Classifier is used to classify the
tweets into communal and non communal tweets. This is
the algorithm which gives high accuracy for text
classification. Support Vector Machine algorithm creates
the best line or decision boundary that can segregate n-
dimensional space into classes so that we can easily put
the new data point in the correct category in the future.
This best decision boundary is called a hyperplane. SVM
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 10 Issue: 08 | Aug 2023 www.irjet.net p-ISSN: 2395-0072
© 2023, IRJET | Impact Factor value: 8.226 | ISO 9001:2008 Certified Journal | Page 760
chooses the extreme points or vectors that help in creating
the hyperplane. A model that can accurately identify
whether it is a communal or a non communal tweet should
be created by using the Support Vector Machine algorithm.
We will first train our model with lots of communal and
non communal tweets using communal keywords so that
it can learn about which is communal and non communal
tweet and then we test it with a new tweet. On the basis of
the support vectors, it will classify. So as the support
vector creates a decision boundary between these two
data (communal and non communal) and chooses extreme
cases (support vectors), it will see the extreme case of
communal and non communal.
4.3 Identifying Anti Communal Tweets
After classifying communal tweets and non communal
tweets, anti communal tweets from the non communal
tweets are identified using the Support Vector Machine
Classifier which has been trained using anti communal
contents which try to dissuade people from posting
communal tweets.
4.4 Finding suitable Anti Communal Tweet
Anti communal tweet which is suitable to counter
communal tweet is identified through a training model.
This anti communal tweet is used to counter the initiator
or propagator with such anti communal tweets for those
who post the communal tweet.
Table -1: Sample Communal and Anti Communal Tweets
Communal Tweets Anti Communal Tweets
Soul Vultures, Evangelical
Vultures
Tears and blood have no
religion. All they know is only
pain. It is not just in your
country. It is everywhere.
Huh, it’s Muslims behind
attack, Islam attacks
Paris
Muslims are not terrorists.
This is not Islam. Nothing to
do with Islam.
4.5 User Identification
Users who posted the communal tweets are found. We
have two categories of users, one is initiators who post a
tweet and the other one is propagators who retweet the
posted tweet of some user or post a tweet by modifying an
already posted tweet. All the users who posted communal
tweets are stored in a database so if a user name repeats
for posting a communal tweet then the user will be
blocked.
4.6 Countering Communal Tweet
Anti communal tweet which is suitable to counter the
communal tweet is identified and countered to the
initiator or propagator who posted the communal tweet.
The various outputs are shown below,
Fig -2: Tweets Dataset
Fig -3: Admin Page to upload Implicit Dataset
Fig -4: Uploaded Dataset
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 10 Issue: 08 | Aug 2023 www.irjet.net p-ISSN: 2395-0072
© 2023, IRJET | Impact Factor value: 8.226 | ISO 9001:2008 Certified Journal | Page 761
Fig -5: Classified Communal and Non Communal Tweet
Fig -6: Classified Anti Communal Tweet
Fig -7: User Login
Fig -8: New User Registration
Fig -9: Real Time User’s Tweet Page
5. CONCLUSION
This proposed system counters communal tweets. Support
Vector Machine classifier is used to filter out communal
tweets. Anti communal tweets are found from the non
communal contents and that is used to counter the
communal tweets. A real time service is developed with
user and admin login where admin can upload the tweet
dataset and see all the classified communal, non
communal and anti communal tweets and the users who
posted the communal tweets. Tweets are collected from
dataset and also through real time service in which users
post tweets in real time which can be used directly in the
future to identify and counter the communal tweets with
anti communal tweets and also the user will be blocked if
the user posts communal tweet again. It gives 91%
precision, 95% recall, 93% F1 Score and 92% of overall
accuracy. The accuracy of the classifier in the proposed
system is comparatively higher than the existing system.
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 10 Issue: 08 | Aug 2023 www.irjet.net p-ISSN: 2395-0072
© 2023, IRJET | Impact Factor value: 8.226 | ISO 9001:2008 Certified Journal | Page 762
6. FUTURE ENHANCEMENTS
● The proposed communal tweet classifier can be
used as an early warning signal to identify
communal tweets, and then celebrities, political
personalities can be made aware of the situation
and requested to post anti communal tweets so
that such tweets get higher exposure. Anti
Communal content needs to be promoted via
mentioning popular celebrities, political persons.
Our real time system can be used by the
Government in taking decisions regarding
filtering communal content, promoting anti
communal content etc.
● The number of distinct anti communal tweets is
much less in number due to the availability of a
small number of anti communal tweets. In the
future, the proposed set of lexicons will be
enlarged.
● Tweets are known to be informally written and
noisy in nature, containing misspellings,
abbreviations, smiley etc. In the future, these
variations will be handled to improve our
classifiers.
● It would be effective if the posted communal
tweets are blocked. It would be more effective if
the user is intimated or given warning before
posting a communal tweet or the system should
not allow users posting communal tweets.
REFERENCES
[1] P. Burnap and M. L. Williams, “Cyber hate speech on
Twitter: An application of machine classification and
statistical modeling for policy and decision making,”
Policy Internet, vol. 7, no. 2, pp. 223–242, 2015.
[2] L. A. Silva, M. Mondal, D. Correa, F. Benevenuto, and I.
Weber, “Analyzing the targets of hate in online social
media,” in Proc. ICWSM, Mar. 2016, pp. 687–690.
[3] M. Mondal, L. A. Silva, and F. Benevenuto, “A
measurement study of hate speech in social media,” in
Proc. ACM HT, 2017, pp. 85–94.
[4] N. Djuric, J. Zhou, R. Morris, M. Grbovic, V.
Radosavljevic, and N. Bhamidipati, “Hate speech
detection with comment embeddings,” in Proc. WWW,
2015, pp. 29–30.
[5] W. Magdy, K. Darwish, N. Abokhodair, A. Rahimi, and
T. Baldwin, “#ISISisNotIslam or #DeportAllMuslims?:
Predicting unspoken views,” in Proc. ACM Web Sci.,
2016, pp. 95–106.
[6] K. Rudra, A. Sharma, N. Ganguly, and S. Ghosh,
“Characterizing communal microblogs during disaster
events,” in Proc. IEEE/ACM ASONAM, Aug. 2016, pp.
96–99.
[7] E. Chandrasekharan, M. Samory, A. Srinivasan, and E.
Gilbert, “The bag of communities: Identifying abusive
behavior online with preexisting Internet data,” in
Proc. ACM CHI, 2017, pp. 3175–3187.
[8] Ika Alfina, Rio Mulia, Mohamad Ivan Fanany, and Yudo
Ekanata, “Hate Speech Detection in the Indonesian
Language: A Dataset and Preliminary Study”,
10.1109/ICACSIS.2017.8355039, 2017.
[9] K. Rudra, S. Banerjee, N. Ganguly, P. Goyal, M. Imran,
and P. Mitra, “Summarizing situational tweets in crisis
scenario,” in Proc. 27th ACM Conf. Hypertext Social
Media (HT), 2016, pp. 137–147.

More Related Content

Similar to Automatic Prediction and Countering of Communal Tweets using Machine Learning Techniques

Social Sensor for Real Time Event Detection
Social Sensor for Real Time Event DetectionSocial Sensor for Real Time Event Detection
Social Sensor for Real Time Event Detection
IJERA Editor
 
[IJET V2I4P9] Authors: Praveen Jayasankar , Prashanth Jayaraman ,Rachel Hannah
[IJET V2I4P9] Authors: Praveen Jayasankar , Prashanth Jayaraman ,Rachel Hannah[IJET V2I4P9] Authors: Praveen Jayasankar , Prashanth Jayaraman ,Rachel Hannah
[IJET V2I4P9] Authors: Praveen Jayasankar , Prashanth Jayaraman ,Rachel Hannah
IJET - International Journal of Engineering and Techniques
 
Iaetsd real time event detection and alert system using sensors
Iaetsd real time event detection and alert system using sensorsIaetsd real time event detection and alert system using sensors
Iaetsd real time event detection and alert system using sensors
Iaetsd Iaetsd
 

Similar to Automatic Prediction and Countering of Communal Tweets using Machine Learning Techniques (20)

IRJET- Identification of Prevalent News from Twitter and Traditional Media us...
IRJET- Identification of Prevalent News from Twitter and Traditional Media us...IRJET- Identification of Prevalent News from Twitter and Traditional Media us...
IRJET- Identification of Prevalent News from Twitter and Traditional Media us...
 
IRJET - Social Media Intelligence Tools
IRJET -  	  Social Media Intelligence ToolsIRJET -  	  Social Media Intelligence Tools
IRJET - Social Media Intelligence Tools
 
E05742630
E05742630E05742630
E05742630
 
Social Sensor for Real Time Event Detection
Social Sensor for Real Time Event DetectionSocial Sensor for Real Time Event Detection
Social Sensor for Real Time Event Detection
 
FRAMEWORK FOR ANALYZING TWITTER TO DETECT COMMUNITY SUSPICIOUS CRIME ACTIVITY
FRAMEWORK FOR ANALYZING TWITTER TO DETECT COMMUNITY SUSPICIOUS CRIME ACTIVITYFRAMEWORK FOR ANALYZING TWITTER TO DETECT COMMUNITY SUSPICIOUS CRIME ACTIVITY
FRAMEWORK FOR ANALYZING TWITTER TO DETECT COMMUNITY SUSPICIOUS CRIME ACTIVITY
 
32 99-1-pb
32 99-1-pb32 99-1-pb
32 99-1-pb
 
IRJET - Implementation of Twitter Sentimental Analysis According to Hash Tag
 IRJET - Implementation of Twitter Sentimental Analysis According to Hash Tag IRJET - Implementation of Twitter Sentimental Analysis According to Hash Tag
IRJET - Implementation of Twitter Sentimental Analysis According to Hash Tag
 
AGGRESSION DETECTION USING MACHINE LEARNING MODEL
AGGRESSION DETECTION USING MACHINE LEARNING MODELAGGRESSION DETECTION USING MACHINE LEARNING MODEL
AGGRESSION DETECTION USING MACHINE LEARNING MODEL
 
[IJET V2I4P9] Authors: Praveen Jayasankar , Prashanth Jayaraman ,Rachel Hannah
[IJET V2I4P9] Authors: Praveen Jayasankar , Prashanth Jayaraman ,Rachel Hannah[IJET V2I4P9] Authors: Praveen Jayasankar , Prashanth Jayaraman ,Rachel Hannah
[IJET V2I4P9] Authors: Praveen Jayasankar , Prashanth Jayaraman ,Rachel Hannah
 
Iaetsd real time event detection and alert system using sensors
Iaetsd real time event detection and alert system using sensorsIaetsd real time event detection and alert system using sensors
Iaetsd real time event detection and alert system using sensors
 
Predicting the future with social media (Twitter y Box Office)
Predicting the future with social media (Twitter y Box Office)Predicting the future with social media (Twitter y Box Office)
Predicting the future with social media (Twitter y Box Office)
 
F017433947
F017433947F017433947
F017433947
 
Twitter: Social Network Or News Medium?
Twitter: Social Network Or News Medium?Twitter: Social Network Or News Medium?
Twitter: Social Network Or News Medium?
 
Twitter: Social Network Or News Medium?
Twitter: Social Network Or News Medium?Twitter: Social Network Or News Medium?
Twitter: Social Network Or News Medium?
 
Social_Media_Overview_Spring_2023_Technology_Committee_Letter_to.pdf
Social_Media_Overview_Spring_2023_Technology_Committee_Letter_to.pdfSocial_Media_Overview_Spring_2023_Technology_Committee_Letter_to.pdf
Social_Media_Overview_Spring_2023_Technology_Committee_Letter_to.pdf
 
IRJET - Detecting Spiteful Accounts in Social Network
IRJET - Detecting Spiteful Accounts in Social NetworkIRJET - Detecting Spiteful Accounts in Social Network
IRJET - Detecting Spiteful Accounts in Social Network
 
Analyzing-Threat-Levels-of-Extremists-using-Tweets
Analyzing-Threat-Levels-of-Extremists-using-TweetsAnalyzing-Threat-Levels-of-Extremists-using-Tweets
Analyzing-Threat-Levels-of-Extremists-using-Tweets
 
Classification Methods for Spam Detection in Online Social Network
Classification Methods for Spam Detection in Online Social NetworkClassification Methods for Spam Detection in Online Social Network
Classification Methods for Spam Detection in Online Social Network
 
IRJET- Twitter Spammer Detection
IRJET- Twitter Spammer DetectionIRJET- Twitter Spammer Detection
IRJET- Twitter Spammer Detection
 
Classification of Sentiment Analysis on Tweets Based on Techniques from Machi...
Classification of Sentiment Analysis on Tweets Based on Techniques from Machi...Classification of Sentiment Analysis on Tweets Based on Techniques from Machi...
Classification of Sentiment Analysis on Tweets Based on Techniques from Machi...
 

More from IRJET Journal

More from IRJET Journal (20)

TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
 
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURESTUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
 
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
 
Effect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsEffect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil Characteristics
 
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
 
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
 
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
 
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
 
A REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASA REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADAS
 
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
 
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProP.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
 
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
 
Survey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemSurvey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare System
 
Review on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesReview on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridges
 
React based fullstack edtech web application
React based fullstack edtech web applicationReact based fullstack edtech web application
React based fullstack edtech web application
 
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
 
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
 
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
 
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignMultistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
 
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
 

Recently uploaded

AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes ---  Unit 3.pdfAKTU Computer Networks notes ---  Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdf
ankushspencer015
 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Christo Ananth
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Christo Ananth
 

Recently uploaded (20)

Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptx
 
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
 
UNIT-II FMM-Flow Through Circular Conduits
UNIT-II FMM-Flow Through Circular ConduitsUNIT-II FMM-Flow Through Circular Conduits
UNIT-II FMM-Flow Through Circular Conduits
 
chapter 5.pptx: drainage and irrigation engineering
chapter 5.pptx: drainage and irrigation engineeringchapter 5.pptx: drainage and irrigation engineering
chapter 5.pptx: drainage and irrigation engineering
 
Water Industry Process Automation & Control Monthly - April 2024
Water Industry Process Automation & Control Monthly - April 2024Water Industry Process Automation & Control Monthly - April 2024
Water Industry Process Automation & Control Monthly - April 2024
 
AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes ---  Unit 3.pdfAKTU Computer Networks notes ---  Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdf
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations
 
University management System project report..pdf
University management System project report..pdfUniversity management System project report..pdf
University management System project report..pdf
 
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
 
(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7
(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7
(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7
 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
 
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
 
Vivazz, Mieres Social Housing Design Spain
Vivazz, Mieres Social Housing Design SpainVivazz, Mieres Social Housing Design Spain
Vivazz, Mieres Social Housing Design Spain
 
NFPA 5000 2024 standard .
NFPA 5000 2024 standard                                  .NFPA 5000 2024 standard                                  .
NFPA 5000 2024 standard .
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
 
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordCCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
 
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
 
Booking open Available Pune Call Girls Pargaon 6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Pargaon  6297143586 Call Hot Indian Gi...Booking open Available Pune Call Girls Pargaon  6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Pargaon 6297143586 Call Hot Indian Gi...
 

Automatic Prediction and Countering of Communal Tweets using Machine Learning Techniques

  • 1. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 10 Issue: 08 | Aug 2023 www.irjet.net p-ISSN: 2395-0072 © 2023, IRJET | Impact Factor value: 8.226 | ISO 9001:2008 Certified Journal | Page 758 Automatic Prediction and Countering of Communal Tweets using Machine Learning Techniques R. Durga1, V. Pavithra2 1Student, M.E, Dept. of Computer Science and Engineering, T.J.S Engineering College, Tamil Nadu, India 2Assistant Professor, Dept. of Computer Science and Engineering, T.J.S Engineering College, Tamil Nadu, India ---------------------------------------------------------------------***--------------------------------------------------------------------- Abstract - Social networks such as Twitter and Facebook are seriously affected by abusive and offensive content. A lot of research has been carried out in recent years for automatic identification of different types of hate speech. Hate speech can come under several categories such as gender, ethnicity, nationality, racial groups and religions of the targeted groups. This work explores methods which aim to counter potentially harmful communal tweets which are pointed towards religious groups such as Hindus, Sikhs, Muslims, Jains, Christians, Buddhists. Communal tweets are largely posted during calamities and taking advantage of such situations, hatred and misinformation are propagated in the affected region which may result in serious riot and violence. Considering the potentially adverse effects of communal tweets, a classifier to identify communal tweets and non communal tweets is developed using the Support Vector Machine algorithm which performs better than existing approaches. Users who post communal tweets are identified. Anti communal tweets are identified from the non communal tweets using a classifier and by finding the appropriate anti communal tweet to a communal tweet, this system counters communal tweets by making use of such anti communal tweets posted by some users. The proposed system will be really helpful to reduce the increasing communal venom in society by blocking the users who repeat posting such communal tweets. It also changes the negative opinion of communal tweeters by promoting anti communal contents. Key Words: Anti communal tweets, Classifier, Communal tweets, Religious groups, Users 1. INTRODUCTION In recent years, there is an increase in propagation of communal speech on social media and the urgent need for effective countermeasures are needed. Twitter is a social broadcast network that enables people to publicly share brief messages instantly around the world. This message brings a variety of people with different voices, ideas and perspectives. Twitter prevents and filters only abusive contents and offensive words from the tweets but not the communal contents. Using this, communal tweets are pointed towards certain religions, racial communities, politicians or certain groups. Communal tweets are posted during times of calamities or emergency situations. Hence, analyzing the communal tweets on twitter during calamities is focused. While looking through these tweets, it has been observed that a large amount of communal tweets are posted rather than anti communal tweets. A calamity generally affects the morale of the masses making them vulnerable. Often, taking advantage of such situations, hatred and misinformation are propagated in the affected region, which may result in serious deterioration of law and order. A detailed analysis of communal tweets during disaster situations are detected and countered. This system characterizes users who initiate and promote communal tweets and also suggest a way to counter such communal content with anti communal tweets that asks users not to spread communal hatred. Considering the adverse effects of communal tweets, a Support Vector Machine Classifier is developed to automatically separate the communal tweets from the non communal ones. After identifying communal tweets, the nature of communal tweets and the users who post them are identified. Initiators, who initiate the communal tweets are identified and propagators who retweet the communal tweets posted by initiators or copy the contents of other initiators and post their own tweet with minor changes are also identified. Apart from the communal tweets, non communal tweets are analyzed to automatically identify anti communal tweets for countering. This will dissuade people from posting communal contents. Hence, a convincing way to counter the communal tweetis is to counter them with anti communal tweets to the initiators and propagators who post communal tweets. A real time system is also developed that automatically collects tweets from the user and counters with anti communal tweets if the posted tweet is communal and blocks the users who repeat posting the communal tweets through which communal venom in the society can be reduced. 2. EXISTING SYSTEM It has been observed that offensive tweets are often posted during calamities to which the attackers are affiliated. However, it is quite surprising that in certain geographical regions such as the Indian subcontinent, communal tweets are posted even during natural disasters such as floods and earthquakes and also during man-made disasters. Several studies have attempted to identify online content that is potentially hate speech or offensive
  • 2. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 10 Issue: 08 | Aug 2023 www.irjet.net p-ISSN: 2395-0072 © 2023, IRJET | Impact Factor value: 8.226 | ISO 9001:2008 Certified Journal | Page 759 in nature by a supervised bag-of-words (BOW) model to classify racist content in web pages. Along with words, context features are also incorporated to improve the classification accuracy. Offensive content in Youtube comments using obscenities, profanities and pejorative terms as features with appropriate weightage. Similarly, cyberbullying was identified by Dinakar et al. using features like parts-of-speech tags, profane words, words with negative connotations and so on. More recently, Burnap and Williams proposed a hate term and dependence feature-based model to identify hate speeches posted during a disaster event called the Woolwich attack. Alsaedi et al. proposed a classification and clustering based technique to predict disruptive events like riot. Burnap and Williams proposed a model to detect cyber hate on Twitter across multiple protected characteristics such as race, disability, sex, etc. The most prevailing hate speeches are targeted toward certain races, while religion- induced hate speech is very sparse. Hence, a general purpose hate speech identifier may fail to capture all the nuances of a rare category like religion-based hate speech, especially, when tweets from the rare categories are posted in huge volumes for a short period of time. We actually demonstrate in this paper that the classifier proposed in can hardly capture communal tweets. Consequently, in recent times, researchers focus on more granular levels of hate speech detection in Twitter. For example, Chaudhry tried to track racism in Twitter and Burnap and Williams detected religious hate speeches posted during the Woolwich attack. On the contrary, this paper focuses on Twitter and it has been widely demonstrated that the standard Natural Language Processing based methodologies which have been developed for formally written text, do not work well for short, informal tweets. Hence, new methodologies are necessary to deal with communal and noisy content posted on Twitter. 3. PROPOSED WORK Detailed analysis of communal tweets posted during calamity situations such as automatic identification of such tweets, analyzing the users who post such tweets and also suggest a way to counter such content. In this paper, we try to identify communal tweets, characterize users initiating or promoting such contents and counter such communal tweets with anti communal posts that ask users not to spread communal venom. Support Vector Machine classifier is proposed to extract communal tweets and this classifier can be directly used over any future event without further training. Second, users are classified into two categories, one is originators who post a tweet and the other one is propagators who retweets the content of originators. We not only rely on retweets but also explore similarities between tweets, their timestamps in order to identify initiators and propagators more accurately. Rest of the analysis is performed on these modified groups of users. Apart from that, temporal patterns of the identified set of communal users are analyzed to understand their outraging phenomenon. Third, another Support Vector classifier is proposed to detect anti communal tweets and such tweets can be used to neutralize the effect of communal tweets. A real time system is developed to automatically identify communal and anti communal tweets posted during and counter with anti communal tweet to the communal tweet and also block the user if communal tweets are posted by them again. 4. METHODOLOGY Fig -1: Proposed Methodology 4.1 Preprocessing The dataset is collected in two ways, first the implicit dataset is given from the recent disaster event in the admin page and second, to develop the real time service which automatically collects tweets from the user in the user page. The tweets are preprocessed using stopwords removal and stemming techniques. Dictionary based approach is used to remove stopwords. Articles, prepositions, pronouns, conjunctions, etc are stop words and do not add much information to the text. Examples of a few stop words are “the”, “a”, “an”, “so”, “what”. Removal of stop words reduces the dataset size and reduces the training time due to the fewer number of tokens involved in the training. Stemming is the process of reducing inflected words to their word stem, base or root form, generally written in a word form. 4.2 Communal Tweet Classification Support Vector Machine Classifier is used to classify the tweets into communal and non communal tweets. This is the algorithm which gives high accuracy for text classification. Support Vector Machine algorithm creates the best line or decision boundary that can segregate n- dimensional space into classes so that we can easily put the new data point in the correct category in the future. This best decision boundary is called a hyperplane. SVM
  • 3. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 10 Issue: 08 | Aug 2023 www.irjet.net p-ISSN: 2395-0072 © 2023, IRJET | Impact Factor value: 8.226 | ISO 9001:2008 Certified Journal | Page 760 chooses the extreme points or vectors that help in creating the hyperplane. A model that can accurately identify whether it is a communal or a non communal tweet should be created by using the Support Vector Machine algorithm. We will first train our model with lots of communal and non communal tweets using communal keywords so that it can learn about which is communal and non communal tweet and then we test it with a new tweet. On the basis of the support vectors, it will classify. So as the support vector creates a decision boundary between these two data (communal and non communal) and chooses extreme cases (support vectors), it will see the extreme case of communal and non communal. 4.3 Identifying Anti Communal Tweets After classifying communal tweets and non communal tweets, anti communal tweets from the non communal tweets are identified using the Support Vector Machine Classifier which has been trained using anti communal contents which try to dissuade people from posting communal tweets. 4.4 Finding suitable Anti Communal Tweet Anti communal tweet which is suitable to counter communal tweet is identified through a training model. This anti communal tweet is used to counter the initiator or propagator with such anti communal tweets for those who post the communal tweet. Table -1: Sample Communal and Anti Communal Tweets Communal Tweets Anti Communal Tweets Soul Vultures, Evangelical Vultures Tears and blood have no religion. All they know is only pain. It is not just in your country. It is everywhere. Huh, it’s Muslims behind attack, Islam attacks Paris Muslims are not terrorists. This is not Islam. Nothing to do with Islam. 4.5 User Identification Users who posted the communal tweets are found. We have two categories of users, one is initiators who post a tweet and the other one is propagators who retweet the posted tweet of some user or post a tweet by modifying an already posted tweet. All the users who posted communal tweets are stored in a database so if a user name repeats for posting a communal tweet then the user will be blocked. 4.6 Countering Communal Tweet Anti communal tweet which is suitable to counter the communal tweet is identified and countered to the initiator or propagator who posted the communal tweet. The various outputs are shown below, Fig -2: Tweets Dataset Fig -3: Admin Page to upload Implicit Dataset Fig -4: Uploaded Dataset
  • 4. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 10 Issue: 08 | Aug 2023 www.irjet.net p-ISSN: 2395-0072 © 2023, IRJET | Impact Factor value: 8.226 | ISO 9001:2008 Certified Journal | Page 761 Fig -5: Classified Communal and Non Communal Tweet Fig -6: Classified Anti Communal Tweet Fig -7: User Login Fig -8: New User Registration Fig -9: Real Time User’s Tweet Page 5. CONCLUSION This proposed system counters communal tweets. Support Vector Machine classifier is used to filter out communal tweets. Anti communal tweets are found from the non communal contents and that is used to counter the communal tweets. A real time service is developed with user and admin login where admin can upload the tweet dataset and see all the classified communal, non communal and anti communal tweets and the users who posted the communal tweets. Tweets are collected from dataset and also through real time service in which users post tweets in real time which can be used directly in the future to identify and counter the communal tweets with anti communal tweets and also the user will be blocked if the user posts communal tweet again. It gives 91% precision, 95% recall, 93% F1 Score and 92% of overall accuracy. The accuracy of the classifier in the proposed system is comparatively higher than the existing system.
  • 5. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 10 Issue: 08 | Aug 2023 www.irjet.net p-ISSN: 2395-0072 © 2023, IRJET | Impact Factor value: 8.226 | ISO 9001:2008 Certified Journal | Page 762 6. FUTURE ENHANCEMENTS ● The proposed communal tweet classifier can be used as an early warning signal to identify communal tweets, and then celebrities, political personalities can be made aware of the situation and requested to post anti communal tweets so that such tweets get higher exposure. Anti Communal content needs to be promoted via mentioning popular celebrities, political persons. Our real time system can be used by the Government in taking decisions regarding filtering communal content, promoting anti communal content etc. ● The number of distinct anti communal tweets is much less in number due to the availability of a small number of anti communal tweets. In the future, the proposed set of lexicons will be enlarged. ● Tweets are known to be informally written and noisy in nature, containing misspellings, abbreviations, smiley etc. In the future, these variations will be handled to improve our classifiers. ● It would be effective if the posted communal tweets are blocked. It would be more effective if the user is intimated or given warning before posting a communal tweet or the system should not allow users posting communal tweets. REFERENCES [1] P. Burnap and M. L. Williams, “Cyber hate speech on Twitter: An application of machine classification and statistical modeling for policy and decision making,” Policy Internet, vol. 7, no. 2, pp. 223–242, 2015. [2] L. A. Silva, M. Mondal, D. Correa, F. Benevenuto, and I. Weber, “Analyzing the targets of hate in online social media,” in Proc. ICWSM, Mar. 2016, pp. 687–690. [3] M. Mondal, L. A. Silva, and F. Benevenuto, “A measurement study of hate speech in social media,” in Proc. ACM HT, 2017, pp. 85–94. [4] N. Djuric, J. Zhou, R. Morris, M. Grbovic, V. Radosavljevic, and N. Bhamidipati, “Hate speech detection with comment embeddings,” in Proc. WWW, 2015, pp. 29–30. [5] W. Magdy, K. Darwish, N. Abokhodair, A. Rahimi, and T. Baldwin, “#ISISisNotIslam or #DeportAllMuslims?: Predicting unspoken views,” in Proc. ACM Web Sci., 2016, pp. 95–106. [6] K. Rudra, A. Sharma, N. Ganguly, and S. Ghosh, “Characterizing communal microblogs during disaster events,” in Proc. IEEE/ACM ASONAM, Aug. 2016, pp. 96–99. [7] E. Chandrasekharan, M. Samory, A. Srinivasan, and E. Gilbert, “The bag of communities: Identifying abusive behavior online with preexisting Internet data,” in Proc. ACM CHI, 2017, pp. 3175–3187. [8] Ika Alfina, Rio Mulia, Mohamad Ivan Fanany, and Yudo Ekanata, “Hate Speech Detection in the Indonesian Language: A Dataset and Preliminary Study”, 10.1109/ICACSIS.2017.8355039, 2017. [9] K. Rudra, S. Banerjee, N. Ganguly, P. Goyal, M. Imran, and P. Mitra, “Summarizing situational tweets in crisis scenario,” in Proc. 27th ACM Conf. Hypertext Social Media (HT), 2016, pp. 137–147.