Computer Engineering Department
V.V.P Engineering College
Made By: Jejani Yasmin (170470702004)
Guided By: Prof. Maulik Dhamecha
Co-Guided By: Prof. Sagar Virani
Introduction
Motivation
Challenges
Literature Survey
Objective
Problem Statement
Proposed System
Conclusion
References
What Is Hate Speech?
• Speech that attacks a person or group on the basis of
attributes such as race, religion, ethnic origin, national
origin, sex, disability, sexual orientation, or gender
identity.
• It is intended to insult, offend, or intimidate a person or
group.
• Hate speech is a crime in many jurisdictions.
• Social networking sites promote free speech, not hate
speech.
• Hate speech is a worldwide problem.
• The spread of the internet and the rapid growth of social
networking sites.
• The anonymity provided by online social networking.
• It also affects business and causes serious real-life
conflicts such as murder and suicide.
• Social media must be maintained as a viable medium of
communication.
• No single word in a statement clearly marks it as hateful.
• Online social networks are full of ironic and joking
content that may sound offensive but in reality
is not.
• During election periods, detecting hate speech is especially
challenging.
• The proposed system is required to provide better
accuracy.
• Example: I hate seeing them losing every time, it's just
unfair.
• Example: If we want the opinion of a woman, we'll ask
you dear... for now keep quiet [3].
• A clustering algorithm divides data into groups according
to their similarity [6].
• Classification is a data mining function that assigns the items
in a collection to target classes [7].
• Hate speech is detected by classifying text with machine
learning.
• Algorithms: the usual suspects
Decision tree,
Naïve Bayes,
Random forest,
Support vector machine
• Hybrid algorithms for data mining are a logical
combination of multiple pre-existing techniques to
enhance performance and provide better results [11].
• The hybrid approach combines clustering and
classification to classify hate speech in order to
improve classification accuracy [4].
• The proposed system modifies the hybrid approach so
that the clustering step refines the data passed to
classification, in order to improve accuracy.
Title: Automated Cyberbullying Detection using Clustering Appearance Patterns [2]
Author & Journal: 2017 IEEE. Walisa Romsaiyud, Kodchakorn na Nakornphanom, Pimpaka Prasertsilp, Piyaporn Nurarak, Pirom Konglerd
Literature: The algorithm includes two main methods:
• partitioning the entire dataset into clusters
• capturing each partition through the frequency of
words with a multinomial-model feature vector, and using
the probability of words occurring in a document to
predict the eight classes.
Remark: Future work is to study further how to improve
computation time and cost on different data types from
many datasets.
Reference: Walisa Romsaiyud, Kodchakorn na Nakornphanom, Pimpaka Prasertsilp,
Piyaporn Nurarak, Pirom Konglerd (2017 IEEE) “Automated Cyberbullying Detection using
Clustering Appearance Patterns”
Title: Hate Speech Detection in the Indonesian Language: A Dataset and Preliminary Study [1]
Author & Journal: 2017 IEEE. Ika Alfina, Rio Mulia, Mohamad Ivan Fanany, and Yudo Ekanata
Literature:
• Feature extraction uses word n-grams (n=1, n=2),
character n-grams (n=3, n=4), and negative sentiment.
• Classification is performed using Naive Bayes, SVM,
Bayesian logistic regression, and random forest decision tree.
Remarks:
• An F-measure of 93.5% was achieved when using the word
n-gram feature with random forest decision tree.
• Results also show that the word n-gram feature outperformed
the character n-gram feature.
Reference: Ika Alfina, Rio Mulia, Mohamad Ivan Fanany, and Yudo Ekanata (ICACSIS 2017) “Hate Speech
Detection in the Indonesian Language: A Dataset and Preliminary Study”
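The word and character n-gram features mentioned in this entry can be illustrated with a minimal sketch. It is not the feature extractor used in [1]; the function names and the sample sentence are assumptions chosen for illustration, and tokenization is plain whitespace splitting.

```python
def word_ngrams(text, n):
    """Return the word n-grams (as tuples) of a whitespace-tokenized text."""
    tokens = text.lower().split()
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def char_ngrams(text, n):
    """Return the character n-grams of a text, spaces included."""
    s = text.lower()
    return [s[i:i + n] for i in range(len(s) - n + 1)]


sample = "keep quiet for now"            # hypothetical example sentence
unigrams = word_ngrams(sample, 1)        # n = 1
bigrams = word_ngrams(sample, 2)         # n = 2
trigrams = char_ngrams("hate", 3)        # n = 3
fourgrams = char_ngrams("hate", 4)       # n = 4
```

In practice the n-grams of each tweet would be counted and turned into a feature vector before being passed to a classifier.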
Title: Hate Speech on Twitter: A Pragmatic Approach to Collect Hateful and Offensive Expressions and Perform Hate Speech Detection [3]
Author & Journal: 2018 IEEE. Hajime Watanabe, Mondher Bouazizi, and Tomoaki Ohtsuki
Literature:
• The approach is based on unigrams and patterns that are
automatically collected from the training set. These
patterns and unigrams are later used, among others, as
features to train a machine learning algorithm.
• Binary and ternary classification reach accuracies of
87.4% and 78.4%, respectively.
Remarks: Results show that J48 outperforms SVM.
Reference: Hajime Watanabe, Mondher Bouazizi, and Tomoaki Ohtsuki (2018 IEEE) “Hate
Speech on Twitter: A Pragmatic Approach to Collect Hateful and Offensive Expressions and Perform Hate
Speech Detection”
Title: Combining Classification and Clustering for Tweet Sentiment Analysis [4]
Author & Journal: 2014 IEEE Brazilian Conference on Intelligent Systems. Luiz F. S. Coletta, Nádia F. F. da Silva, Eduardo R. Hruschka, Estevam R. Hruschka Jr.
Literature:
• An SVM is combined with a cluster ensemble.
• Similar instances in the same cluster are more likely to share the
same class.
• Results are better than using the SVM alone.
Remarks:
• Investigating “good” data partitions to compose a cluster
ensemble deserves attention.
Reference: Luiz F. S. Coletta, Nádia F. F. da Silva, Eduardo R. Hruschka, Estevam R. Hruschka Jr.
(2014 IEEE Brazilian Conference on Intelligent Systems) “Combining Classification and Clustering for Tweet
Sentiment Analysis”
Title: Combining Clustering with Classification: A Technique to Improve Classification Accuracy [5]
Author & Journal: 2016. Yaswanth Kumar Alapati et al. / International
Journal of Computer Science Engineering (IJCSE)
Literature:
• Clustering is applied prior to classification on real-life
data.
• Clustering uses k-means and hierarchical clustering;
classification uses Naive Bayes and a neural network. The
accuracy of all combinations is compared.
Remarks: Results show that clustering prior to classification gives
better accuracy.
Reference: Yaswanth Kumar Alapati et al. / International Journal of Computer Science Engineering (IJCSE).
“Combining Clustering with Classification: A Technique to Improve Classification Accuracy”
Title | Algorithm | Result
Automated Cyberbullying Detection using Clustering Appearance Patterns [2] | k-means and Naïve Bayes | The main objective of the paper is to partition abusive messages from big data streaming using k-means clustering and Naïve Bayes.
An Improved Malicious Behavior Detection Via k-Means and Decision Tree [12] | k-means and decision tree to detect malicious behaviour | KMDT detected more malicious behaviours accurately, in contrast to the individual and differently combined methods.
Combining Classification and Clustering for Tweet Sentiment Analysis [4] | SVM with k-medoids | An SVM classifier combined with cluster ensembles can offer better accuracy than a stand-alone SVM. The algorithm is named C^3E-SL, and the clustering algorithm used is k-medoids.
Title | Algorithm | Result
Improving Classification in Data mining using Hybrid algorithm [11] | k-means and decision tree as the hybrid algorithm | This approach solves the issue of burdening the decision tree with large datasets by dividing the data samples into clusters.
Classification using Latent Dirichlet Allocation with Naive Bayes Classifier to detect Cyber Bullying in Twitter [13] | LDA and Naïve Bayes | LDA is used to identify key terms that serve as a feature vector, and provides better accuracy with Naïve Bayes.
• The proposed approach is based on clustering and
classification to give better accuracy in detecting hate
speech.
• The proposed system modifies the hybrid algorithm so
that only the required data goes to the classification
stage.
• Find the high-swear words that clearly define hate speech.
A Hatebase dictionary is also available on data.world,
so I modified it and kept only the swear words.
• Use clustering and classification techniques to detect
hate speech.
• Clustering is used as refinement for classification.
• Implement the hybrid approach to detect hate speech and
provide better results.
[Flow diagram: Input (Tweets) -> Pre-processing -> Feature Extraction -> Clustering. High swear word tweets -> Hate speech; extremely positive tweets -> Not hate speech; other tweets -> Classification -> Hate speech / Not hate speech. Finally, compare accuracy and precision.]
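The three-way split shown in the flow diagram can be sketched as follows. This is a simplified, rule-based stand-in for the clustering stage, not the clustering itself; the lexicon contents, the `swear_threshold` parameter, and the function name `route_tweet` are all assumptions made for illustration.

```python
# Hypothetical mini-lexicons; a real system would load much larger lists.
SWEAR_WORDS = {"idiot", "scum", "trash"}
POSITIVE_WORDS = {"love", "great", "happy", "thanks"}
NEGATIVE_WORDS = {"hate", "unfair", "bad"}


def route_tweet(tokens, swear_threshold=2):
    """Assign a tokenized tweet to one of the three groups in the flow diagram."""
    swear = sum(t in SWEAR_WORDS for t in tokens)
    positive = sum(t in POSITIVE_WORDS for t in tokens)
    negative = sum(t in NEGATIVE_WORDS for t in tokens)
    if swear >= swear_threshold:
        return "hate"        # high swear word tweet: labeled directly
    if positive > 0 and swear == 0 and negative == 0:
        return "not_hate"    # only positive words: labeled directly
    return "classify"        # everything else goes on to the classifier


decisions = [
    route_tweet(["you", "idiot", "scum"]),
    route_tweet(["love", "this", "great", "day"]),
    route_tweet(["i", "hate", "seeing", "them", "losing"]),
]
```

Only the tweets routed to "classify" reach the classifier, which is the refinement the proposed system relies on to reduce the classifier's workload.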
Step 1: Collect the Twitter data.
Step 2: Preprocess the tweets (remove URLs,
tokenization, lemmatization; for example,
'caring' is lemmatized to 'care').
Step 3: Feature extraction.
Bag of words
N-grams
Sentiment-based features using positive and
negative lexicons
Swear-word lexicons
Step 4: Apply clustering to the entire dataset to
partition the data into clusters:
Cluster 1: tweets with a high number of swear words, which clearly places
them under hate speech.
Cluster 2: tweets that contain only positive words, classified as
non-hate speech.
Cluster 3: the remaining tweets.
Step 5: Perform classification on cluster 3.
Step 6: Output hate speech and non-hate speech.
Step 7: Compare the accuracy and precision.
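Steps 2 and 3 can be sketched with the Python standard library alone. This is a minimal illustration under stated assumptions: the lemma table is a toy lookup standing in for a real lemmatizer, the lexicons hold a few hypothetical entries, and the names `preprocess` and `extract_features` are not from the slides.

```python
import re
from collections import Counter

URL_RE = re.compile(r"https?://\S+")
TOKEN_RE = re.compile(r"[a-z']+")

# Toy lemma table standing in for a real lemmatizer (assumption).
LEMMAS = {"caring": "care", "losing": "lose", "tweets": "tweet"}
# Hypothetical lexicon entries, for illustration only.
SWEAR_WORDS = {"idiot", "scum"}
POSITIVE_WORDS = {"love", "great"}
NEGATIVE_WORDS = {"hate", "unfair"}


def preprocess(tweet):
    """Step 2: remove URLs, lowercase, tokenize, lemmatize by table lookup."""
    text = URL_RE.sub(" ", tweet.lower())
    return [LEMMAS.get(tok, tok) for tok in TOKEN_RE.findall(text)]


def extract_features(tokens):
    """Step 3: bag of words, word bigrams, and lexicon-based counts."""
    return {
        "bag_of_words": Counter(tokens),
        "bigrams": Counter(zip(tokens, tokens[1:])),
        "positive": sum(t in POSITIVE_WORDS for t in tokens),
        "negative": sum(t in NEGATIVE_WORDS for t in tokens),
        "swear": sum(t in SWEAR_WORDS for t in tokens),
    }


tokens = preprocess("Caring people hate losing https://example.com/x")
features = extract_features(tokens)
```

The resulting feature dictionary is what the clustering and classification steps would consume; a production system would likely use an established NLP library for tokenization and lemmatization instead of these lookups.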
Criterion | Naive Bayes | K-Nearest Neighbor | Decision Trees
Accuracy in general | Average | Good | Good
Speed of learning | Excellent | Excellent | V. good
Speed of classification | Excellent | Average | Excellent
Tolerance to missing values | Excellent | Average | V. good
Tolerance to irrelevant attributes | Good | Good | V. good
Tolerance to noise | V. good | Average | Good
Attempts for incremental learning | Excellent | Excellent | Good
Explanation ability / transparency of knowledge / classification | Excellent | Good | Excellent
Support for multi-class classification | Naturally extended | Excellent | Excellent
• According to [9], random forest provides all the benefits of
a decision tree, gives better results on large datasets,
helps avoid the overfitting problem, and also handles missing
values in the dataset.
• It classifies more instances correctly than a decision tree [9].
• Random forests provide information about the
importance of a variable and also the proximity of the data
points to one another [8].
• Real-time Twitter data is very large, so we
need a clustering algorithm that gives good results
on large datasets.
• According to [6], a comparison of clustering algorithms shows
that k-means is better for large datasets, while hierarchical
clustering gives better results for small datasets.
• K-means is one of the simplest and easiest algorithms, which is
why k-means was chosen for clustering the data.
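To show why k-means counts as one of the simplest clustering algorithms, here is a minimal pure-Python sketch of its two alternating steps (assign each point to the nearest centroid, then move each centroid to the mean of its members). The 2-D points and the choice of initial centroids are hypothetical; a real run would use the extracted tweet features and a library implementation.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, centroids, max_iter=100):
    """Alternate assignment and centroid update until the centroids stop moving."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append(tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                updated.append(old)  # an empty cluster keeps its centroid
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


# Hypothetical feature vectors, e.g. (swear count, positive count) per tweet.
points = [(3, 0), (4, 0), (3, 1), (0, 3), (0, 4), (1, 3)]
labels, centroids = kmeans(points, centroids=[points[0], points[3]])
```

The initial centroids are passed in explicitly here to keep the example deterministic; common implementations pick them randomly or with a seeding heuristic.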
[Figure: majority voting over the individual predictions gives the final class]
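Majority voting, the step by which a random forest turns the predictions of its individual trees into a final class, can be sketched as below. The three "trees" are hand-written rules standing in for trained decision trees, and the feature names are assumptions for illustration.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most frequent label (ties go to the label seen first)."""
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical "trees": simple rules over a feature dict, not trained models.
trees = [
    lambda f: "hate" if f["swear"] >= 1 else "not_hate",
    lambda f: "hate" if f["negative"] > f["positive"] else "not_hate",
    lambda f: "hate" if f["swear"] + f["negative"] >= 2 else "not_hate",
]


def forest_predict(features):
    """Collect one vote per tree and return the majority class."""
    return majority_vote([tree(features) for tree in trees])


final_class = forest_predict({"swear": 1, "negative": 1, "positive": 0})
```

In a trained random forest each tree is fitted on a bootstrap sample with a random subset of features, so the votes differ for a different reason than in this hand-written example, but the aggregation step is the same.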
• Hate speech detection uses clustering together with
classification.
• Clustering is used to refine the data for classification.
• The hybrid approach is used to provide better accuracy in
hate speech detection.
[1] Ika Alfina, Rio Mulia, Mohamad Ivan Fanany, Yudo Ekanata (ICACSIS
2017) “Hate Speech Detection in the Indonesian Language: A Dataset and
Preliminary Study”
[2] Walisa Romsaiyud, Kodchakorn na Nakornphanom, Pimpaka
Prasertsilp, Piyaporn Nurarak, Pirom Konglerd (2017 IEEE) “Automated
Cyberbullying Detection using Clustering Appearance Patterns”
[3] Hajime Watanabe, Mondher Bouazizi, Tomoaki
Ohtsuki (2018 IEEE Access) “Hate Speech on Twitter: A Pragmatic Approach
to Collect Hateful and Offensive Expressions and Perform Hate Speech
Detection”
[4] Luiz F. S. Coletta, Nádia F. F. da Silva, Eduardo R. Hruschka,
Estevam R. Hruschka Jr. (2014 IEEE Brazilian Conference on Intelligent Systems)
“Combining Classification and Clustering for Tweet Sentiment Analysis”
[5] Yaswanth Kumar Alapati et al. (International Journal of Computer Science
Engineering (IJCSE)) “Combining Clustering with Classification: A Technique to
Improve Classification Accuracy”
[6] Osama Abu Abbas (The International Arab Journal of Information Technology)
“Comparisons Between Data Clustering Algorithms”
[7] Omkar Ardhapure, Gayatri Patil, Disha Udani, Kamlesh Jetha (IJRET)
“Comparative Study of Classification Algorithm for Text
Based Categorization”
[8] Prajwala T R (International Journal of Advanced Research in Computer and
Communication Engineering, Vol. 4, Issue 1, January 2015) “A Comparative Study
on Decision Tree and Random Forest Using R Tool”
[9] Jehad Ali, Rehanullah Khan, Nasir Ahmad, Imran Maqsood (IJCSI
International Journal of Computer Science Issues, Vol. 9, Issue 5, No 3,
September 2012) “Random Forests and Decision Trees”
[10] Hamed Jelodar, Yongli Wang, Chi Yuan, Xia Feng “Latent
Dirichlet Allocation (LDA) and Topic modeling: models, applications, a survey”
[11] Akanksha Ahlawat, Bharti Suri (2016 IEEE) “Improving Classification in
Data mining using Hybrid algorithm”
[12] Warusia Yassin, Siti Rahayu, Faizal Abdollah and Hazlin Zin ((IJACSA)
International Journal of Advanced Computer Science and Applications, Vol. 7,
No. 12, 2016) “An Improved Malicious Behavior Detection Via k-Means and
Decision Tree”
[13] K. Nalini and L. Jaba Sheela (Indian Journal of Science and Technology,
Vol 9(28), DOI: 10.17485/ijst/2016/v9i28/93825, July 2016) “Classification
using Latent Dirichlet Allocation with Naive Bayes Classifier to detect Cyber
Bullying in Twitter”
[14] Niyati Aggrawal (Computer Reviews Journal 2018) “Detection of Offensive
Tweets: A Comparative Study”
[15] Timothy Pratama, Ayu Purwarianti (2017 IEEE) “Topic Classification and
Clustering on Indonesian Complaint Tweets for Bandung Government using
Supervised and Unsupervised Learning”
[16] Paula Fortuna (INESC TEC), Sérgio Nunes (INESC TEC and Faculty
of Engineering, University of Porto) (ACM Computing Surveys, July 2018) “A
Survey on Automatic Detection of Hate Speech in Text”
[17] Naufal Riza Fatahillah, Pulut Suryati, Cosmas Haryawan (2017
International Conference on Sustainable Information Engineering and
Technology (SIET)) “Implementation Of Naive Bayes Classifier Algorithm On
Social Media (Twitter) To The Teaching Of Indonesian Hate Speech”
[18] Pete Burnap and Matthew L. Williams (Policy & Internet, 7:2) “Cyber Hate
Speech on Twitter: An Application of Machine Classification and Statistical
Modeling for Policy and Decision Making”
[19] Jiawei Han, Micheline Kamber, Jian Pei, “Data Mining: Concepts and
Techniques”
Call Girl Lucknow Gauri 🔝 8923113531 🔝 🎶 Independent Escort Service Lucknow
 
Russian Call Girls in Chandigarh Ojaswi ❤️🍑 9907093804 👄🫦 Independent Escort ...
Russian Call Girls in Chandigarh Ojaswi ❤️🍑 9907093804 👄🫦 Independent Escort ...Russian Call Girls in Chandigarh Ojaswi ❤️🍑 9907093804 👄🫦 Independent Escort ...
Russian Call Girls in Chandigarh Ojaswi ❤️🍑 9907093804 👄🫦 Independent Escort ...
 
Call Girls in Mohali Surbhi ❤️🍑 9907093804 👄🫦 Independent Escort Service Mohali
Call Girls in Mohali Surbhi ❤️🍑 9907093804 👄🫦 Independent Escort Service MohaliCall Girls in Mohali Surbhi ❤️🍑 9907093804 👄🫦 Independent Escort Service Mohali
Call Girls in Mohali Surbhi ❤️🍑 9907093804 👄🫦 Independent Escort Service Mohali
 
Call Girl Dehradun Aashi 🔝 7001305949 🔝 💃 Independent Escort Service Dehradun
Call Girl Dehradun Aashi 🔝 7001305949 🔝 💃 Independent Escort Service DehradunCall Girl Dehradun Aashi 🔝 7001305949 🔝 💃 Independent Escort Service Dehradun
Call Girl Dehradun Aashi 🔝 7001305949 🔝 💃 Independent Escort Service Dehradun
 

ashu ppt final.pptx

  • 1. Computer Engineering Department, V.V.P. Engineering College. Made by: Jejani Yasmin (170470702004). Guided by: Prof. Maulik Dhamecha. Co-guided by: Prof. Sagar Virani.
  • 3. What Is Hate Speech?
      • Speech that attacks a person or group on the basis of attributes such as race, religion, ethnic origin, national origin, sex, disability, sexual orientation, or gender identity.
      • It is intended to insult, offend, or intimidate a person or group.
      • In many jurisdictions, hate speech is a crime.
      • Social networking sites promote free speech, not hate speech.
  • 4.
      • Hate speech is a worldwide problem.
      • The internet has spread and social networking sites have grown rapidly.
      • Online social networks provide anonymity.
      • Hate speech also affects business and can cause serious real-life harm such as murder and suicide.
      • Detecting it helps maintain social media as a viable medium of communication.
  • 5.
      • No single word in a statement clearly marks it as hateful.
      • Online social networks are full of ironic and joking content that may sound offensive but in reality is not.
      • Detecting hate speech is especially challenging at election time.
      • The proposed system needs to provide better accuracy.
      • Example: "I hate seeing them loosing every time it's just unfair."
      • Example: "if we want the opinion of a women, we'll ask you dear...for now keep quite" [3].
  • 6.
  • 7.
      • A clustering algorithm divides data into groups according to similarity [6].
      • Classification is a data mining function that assigns the items in a collection to target classes [7].
      • Hate speech is detected using machine-learning classifiers.
      • Algorithms, the usual suspects: decision tree, Naïve Bayes, random forest, support vector machine.
  • 8.
      • Hybrid algorithms for data mining are a logical combination of multiple pre-existing techniques to enhance performance and provide better results [11].
      • The hybrid approach uses clustering together with classification to classify hate speech, in order to improve classification accuracy [4].
      • The proposed system modifies the hybrid approach so that the clustering step refines the data passed to classification, to improve accuracy.
  • 9. Title: Automated Cyberbullying Detection using Clustering Appearance Patterns [2]
      Authors & venue: Walisa Romsaiyud, Kodchakorn na Nakornphanom, Pimpaka Prasertsilp, Piyaporn Nurarak, Pirom Konglerd (IEEE, 2017)
      Literature: The algorithm includes two main methods:
      • partitioning the entire dataset into clusters;
      • capturing each partition by word frequency with a multinomial-model feature vector, and deriving the probability of words occurring in a document to predict the eight classes.
      Remark: Future work is to study how to improve computation time and cost on different data types from many datasets.
  • 10. Title: Hate Speech Detection in the Indonesian Language: A Dataset and Preliminary Study [1]
      Authors & venue: Ika Alfina, Rio Mulia, Mohamad Ivan Fanany, Yudo Ekanata (ICACSIS 2017, IEEE)
      Literature:
      • Feature extraction uses word n-grams (n = 1, n = 2), character n-grams (n = 3, n = 4), and negative sentiment.
      • Classification is performed using Naive Bayes, SVM, Bayesian logistic regression, and random forest decision tree.
      Remarks:
      • An F-measure of 93.5% was achieved using the word n-gram feature with random forest decision tree.
      • Results also show that the word n-gram feature outperformed the character n-gram feature.
  • 11. Title: Hate Speech on Twitter: A Pragmatic Approach to Collect Hateful and Offensive Expressions and Perform Hate Speech Detection [3]
      Authors & venue: Hajime Watanabe, Mondher Bouazizi, Tomoaki Ohtsuki (IEEE Access, 2018)
      Literature:
      • The approach is based on unigrams and patterns that are automatically collected from the training set. These patterns and unigrams are later used, among others, as features to train a machine learning algorithm.
      • Binary and ternary classification reach accuracies of 87.4% and 78.4%, respectively.
      Remarks: Results show that J48 outperforms SVM.
  • 12. Title: Combining Classification and Clustering for Tweet Sentiment Analysis [4]
      Authors & venue: Luiz F. S. Coletta, Nádia F. F. da Silva, Eduardo R. Hruschka, Estevam R. Hruschka Jr. (2014 IEEE Brazilian Conference on Intelligent Systems)
      Literature:
      • An SVM is combined with a cluster ensemble.
      • Similar instances in the same cluster are more likely to share the same class.
      • Results are better than using the SVM alone.
      Remarks:
      • How to find "good" data partitions to compose a cluster ensemble deserves attention.
  • 13. Title: Combining Clustering with Classification: A Technique to Improve Classification Accuracy [5]
      Authors & venue: Yaswanth Kumar Alapati et al., International Journal of Computer Science Engineering (IJCSE), 2016
      Literature:
      • Clustering is applied prior to classification on real-life data.
      • Clustering uses k-means and hierarchical clustering; classification uses Naive Bayes and a neural network. The accuracy of all combinations is compared.
      Remarks: Results show that clustering prior to classification gives better accuracy.
  • 14. Comparison by title, algorithm, and result:
      • Title: Automated Cyberbullying Detection using Clustering Appearance Patterns [2]
        Algorithm: k-means and Naïve Bayes.
        Result: The main objective of the paper is to partition abusive messages from big-data streams using k-means clustering and Naïve Bayes.
      • Title: An Improved Malicious Behaviour Detection via k-Means and Decision Tree [12]
        Algorithm: k-means and decision tree, to detect malicious behaviour.
        Result: KMDT detected more malicious behaviours accurately than the individual and differently combined methods.
      • Title: Combining Classification and Clustering for Tweet Sentiment Analysis [4]
        Algorithm: SVM with k-medoids.
        Result: An SVM classifier combined with cluster ensembles can offer better accuracy than a stand-alone SVM. The algorithm is named C³E-SL, and the clustering algorithm used is k-medoids.
  • 15. Comparison by title, algorithm, and result (continued):
      • Title: Improving Classification in Data Mining using Hybrid Algorithm [11]
        Algorithm: k-means and decision tree as the hybrid algorithm.
        Result: The approach avoids burdening the decision tree with large datasets by dividing the data samples into clusters.
      • Title: Classification using Latent Dirichlet Allocation with Naive Bayes Classifier to Detect Cyber Bullying in Twitter [13]
        Algorithm: LDA and Naïve Bayes.
        Result: LDA is used to identify the key terms that form the feature vector, and gives better accuracy with Naïve Bayes.
  • 16.
      • The proposed approach combines clustering and classification to give better accuracy in detecting hate speech.
      • The proposed system modifies the hybrid algorithm so that only the data that requires it goes on to the classification stage.
      • Find the strong swear words that clearly indicate hate speech. A Hatebase dictionary is also available on data.world; I modified it to keep only the swear words.
  • 17.
      • Use clustering and classification techniques to detect hate speech.
      • Use clustering as a refinement step for classification.
      • Implement the hybrid approach to detect hate speech with better results.
  • 18.
  • 19. [Flowchart of the proposed system. Boxes: Input (tweets); Pre-processing; Clustering; Feature extraction; High-swear-word tweets; Extremely positive tweets; Other tweets; Not hate speech; Hate speech; Classification; Compare accuracy and precision.]
  • 20. Step 1: Take the Twitter data.
      Step 2: Preprocess the tweets (remove URLs, tokenize, lemmatize; for example, lemmatizing 'caring' gives 'care').
      Step 3: Extract features:
      • bag of words;
      • n-grams;
      • sentiment-based features using positive and negative lexicons;
      • swear-word lexicons.
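Steps 2 and 3 above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the presenter's implementation: the `LEMMAS` lookup table and the `POSITIVE`, `NEGATIVE`, and `SWEAR` lexicons are tiny hypothetical stand-ins for a real lemmatizer and real lexicons, and the function names are made up for this example.

```python
import re
from collections import Counter

# Hypothetical toy resources: a real system would use a proper lemmatizer
# and full lexicons; these tiny tables only illustrate the idea.
LEMMAS = {"caring": "care", "losing": "lose", "hated": "hate"}
POSITIVE = {"love", "care", "great", "good"}
NEGATIVE = {"hate", "unfair", "bad"}
SWEAR = {"idiot", "scum"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
TOKEN_RE = re.compile(r"[a-z']+")


def preprocess(tweet):
    """Step 2: strip URLs, lowercase, tokenize, and lemmatize by lookup."""
    text = URL_RE.sub(" ", tweet.lower())
    tokens = TOKEN_RE.findall(text)
    return [LEMMAS.get(tok, tok) for tok in tokens]


def ngrams(tokens, n):
    """Return word n-grams as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extract_features(tokens):
    """Step 3: bag of words, bigrams, and lexicon counts."""
    return {
        "bow": Counter(tokens),
        "bigrams": Counter(ngrams(tokens, 2)),
        "positive": sum(tok in POSITIVE for tok in tokens),
        "negative": sum(tok in NEGATIVE for tok in tokens),
        "swear": sum(tok in SWEAR for tok in tokens),
    }


tokens = preprocess("Caring people are great! http://example.com/x")
features = extract_features(tokens)
print(tokens)
print(features["bigrams"])
```

The lexicon counts are kept as separate numeric features so that a later clustering step can separate tweets dominated by swear words from tweets that contain only positive words.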
  • 21. Step 4: Apply clustering to the entire dataset to partition it into clusters:
      • Cluster 1: tweets containing strong swear words, which clearly mark them as hate speech.
      • Cluster 2: tweets containing only positive words, which are classified as non-hate speech.
      • Cluster 3: the remaining tweets.
      Step 5: Perform classification on cluster 3.
      Step 6: Output hate speech and non-hate speech.
      Step 7: Compare accuracy and precision.
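Steps 4 to 7 above can be sketched as a small routing pipeline. This is only an illustration under stated assumptions: the `route` function is a rule-based stand-in for the clustering step (the slides use a clustering algorithm there), the lexicons and `SWEAR_THRESHOLD` are hypothetical, and `keyword_classifier` is a placeholder for whatever trained classifier handles cluster 3.

```python
# Hypothetical toy lexicons and threshold; a rule-based stand-in for the
# clustering step, which the slides perform with a clustering algorithm.
SWEAR = {"idiot", "scum"}
POSITIVE = {"love", "great", "good", "care"}
NEGATIVE = {"hate", "unfair", "bad"}
SWEAR_THRESHOLD = 2  # assumed cut-off for "high swear word" tweets


def route(tokens):
    """Step 4: assign a tweet to one of the three groups."""
    swear = sum(tok in SWEAR for tok in tokens)
    positive = sum(tok in POSITIVE for tok in tokens)
    negative = sum(tok in NEGATIVE for tok in tokens)
    if swear >= SWEAR_THRESHOLD:
        return "hate"          # cluster 1: labelled without classification
    if positive > 0 and swear == 0 and negative == 0:
        return "not_hate"      # cluster 2: only positive words
    return "classify"          # cluster 3: needs the classifier


def hybrid_predict(tweets, classifier):
    """Steps 5-6: classify only cluster 3, then merge all labels."""
    labels = []
    for tokens in tweets:
        group = route(tokens)
        labels.append(classifier(tokens) if group == "classify" else group)
    return labels


def accuracy(truth, predicted):
    """Fraction of tweets whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(truth, predicted)) / len(truth)


def precision(truth, predicted, positive_label="hate"):
    """Fraction of tweets predicted as hate speech that really are."""
    flagged = [t for t, p in zip(truth, predicted) if p == positive_label]
    return flagged.count(positive_label) / len(flagged) if flagged else 0.0


def keyword_classifier(tokens):
    """Placeholder for a trained model such as Naive Bayes or random forest."""
    return "hate" if "hate" in tokens else "not_hate"


tweets = [
    ["you", "idiot", "scum"],
    ["i", "love", "this", "great", "team"],
    ["i", "hate", "seeing", "them", "lose"],
    ["what", "time", "is", "the", "match"],
]
truth = ["hate", "not_hate", "not_hate", "not_hate"]
predicted = hybrid_predict(tweets, keyword_classifier)
print(predicted)
print(accuracy(truth, predicted), precision(truth, predicted))
```

The third tweet resembles the "I hate seeing them..." example from the challenges slide: the placeholder classifier flags it because of the word "hate", which is exactly the kind of false positive that lowers precision and that a better classifier on cluster 3 is meant to avoid.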
  • 22. Criterion | Naive Bayes | K-Nearest Neighbor | Decision Trees
      Accuracy in general | Average | Good | Good
      Speed of learning | Excellent | Excellent | V. good
      Speed of classification | Excellent | Average | Excellent
      Tolerance to missing values | Excellent | Average | V. good
      Tolerance to irrelevant attributes | Good | Good | V. good
      Tolerance to noise | V. good | Average | Good
      Attempts for incremental learning | Excellent | Excellent | Good
      Explanation ability / transparency of knowledge / classification | Excellent | Good | Excellent
      Support for multi-classification | Naturally extended | Excellent | Excellent
  • 23.
      • According to [9], random forest provides all the benefits of a decision tree, gives better results on large datasets, avoids the overfitting problem, and also handles missing values in the dataset.
      • It correctly classifies more instances than a decision tree [9].
      • Random forests provide information about the importance of a variable and also the proximity of the data points to one another [8].
  • 24.
      • Real-time Twitter data is very large, so we need a clustering algorithm that gives good results on large datasets.
      • According to [6], a comparison of clustering algorithms shows that k-means is better for large datasets, while hierarchical clustering gives better results on small datasets.
      • K-means is one of the simplest algorithms, which is why it is chosen for clustering the data.
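To make the choice of k-means concrete, here is a minimal pure-Python sketch of the standard algorithm (Lloyd's iteration). It is an illustration only: the two-dimensional feature vectors (swear-word count, positive-word count) and the hand-picked initial centroids are assumptions for this example, and a real system would more likely use a library implementation on the full feature set.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    return tuple(sum(col) / len(points) for col in zip(*points))


def kmeans(points, initial_centroids, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    centroids = [tuple(c) for c in initial_centroids]
    assignment = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_assignment = [
            min(range(len(centroids)),
                key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no point changed cluster
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = mean(members)
    return assignment, centroids


# Hypothetical feature vectors: (swear-word count, positive-word count).
points = [(3, 0), (4, 0), (0, 3), (0, 4), (0, 0), (1, 1)]
assignment, centroids = kmeans(
    points, initial_centroids=[(3, 0), (0, 3), (0, 0)]
)
print(assignment)
print(centroids)
```

With these inputs the three clusters correspond to swear-heavy tweets, positive-only tweets, and the remainder, which mirrors the three clusters of the proposed system. Each iteration costs time proportional to the number of points times the number of clusters, which is one reason k-means scales to large tweet collections.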
  • 26.
      • Hate speech detection here uses clustering together with classification.
      • Clustering is used to refine the data for classification.
      • The hybrid approach is used to provide better accuracy in hate speech detection.
  • 27. [1] Ika Alfina, Rio Mulia, Mohamad Ivan Fanany, Yudo Ekanata (ICACSIS 2017) "Hate Speech Detection in the Indonesian Language: A Dataset and Preliminary Study"
      [2] Walisa Romsaiyud, Kodchakorn na Nakornphanom, Pimpaka Prasertsilp, Piyaporn Nurarak, Pirom Konglerd (2017 IEEE) "Automated Cyberbullying Detection using Clustering Appearance Patterns"
      [3] Hajime Watanabe, Mondher Bouazizi, Tomoaki Ohtsuki (2018 IEEE Access) "Hate Speech on Twitter: A Pragmatic Approach to Collect Hateful and Offensive Expressions and Perform Hate Speech Detection"
      [4] Luiz F. S. Coletta, Nádia F. F. da Silva, Eduardo R. Hruschka, Estevam R. Hruschka Jr. (2014 IEEE Brazilian Conference on Intelligent Systems) "Combining Classification and Clustering for Tweet Sentiment Analysis"
      [5] Yaswanth Kumar Alapati et al. (International Journal of Computer Science Engineering (IJCSE)) "Combining Clustering with Classification: A Technique to Improve Classification Accuracy"
  • 28. [6] Osama Abu Abbas (The International Arab Journal of Information Technology) "Comparisons Between Data Clustering Algorithms"
      [7] Omkar Ardhapure, Gayatri Patil, Disha Udani, Kamlesh Jetha (IJRET) "Comparative Study of Classification Algorithm for Text Based Categorization"
      [8] Prajwala T R (International Journal of Advanced Research in Computer and Communication Engineering, Vol. 4, Issue 1, January 2015) "A Comparative Study on Decision Tree and Random Forest Using R Tool"
      [9] Jehad Ali, Rehanullah Khan, Nasir Ahmad, Imran Maqsood (IJCSI International Journal of Computer Science Issues, Vol. 9, Issue 5, No 3, September 2012) "Random Forests and Decision Trees"
      [10] Hamed Jelodar, Yongli Wang, Chi Yuan, Xia Feng "Latent Dirichlet Allocation (LDA) and Topic Modeling: Models, Applications, a Survey"
      [11] Akanksha Ahlawat, Bharti Suri (2016 IEEE) "Improving Classification in Data Mining using Hybrid Algorithm"
  • 29. [12] Warusia Yassin, Siti Rahayu, Faizal Abdollah, Hazlin Zin (International Journal of Advanced Computer Science and Applications (IJACSA), Vol. 7, No. 12, 2016) "An Improved Malicious Behaviour Detection via k-Means and Decision Tree"
      [13] K. Nalini and L. Jaba Sheela (Indian Journal of Science and Technology, Vol 9(28), DOI: 10.17485/ijst/2016/v9i28/93825, July 2016) "Classification using Latent Dirichlet Allocation with Naive Bayes Classifier to Detect Cyber Bullying in Twitter"
      [14] Niyati Aggrawal (Computer Reviews Journal, 2018) "Detection of Offensive Tweets: A Comparative Study"
      [15] Timothy Pratama, Ayu Purwarianti (2017 IEEE) "Topic Classification and Clustering on Indonesian Complaint Tweets for Bandung Government using Supervised and Unsupervised Learning"
      [16] Paula Fortuna (INESC TEC) and Sérgio Nunes (INESC TEC and Faculty of Engineering, University of Porto) (ACM Computing Surveys, July 2018) "A Survey on Automatic Detection of Hate Speech in Text"
  • 30. [17] Naufal Riza Fatahillah, Pulut Suryati, Cosmas Haryawan (2017 International Conference on Sustainable Information Engineering and Technology (SIET)) "Implementation of Naive Bayes Classifier Algorithm on Social Media (Twitter) to the Teaching of Indonesian Hate Speech"
      [18] Pete Burnap and Matthew L. Williams (Policy & Internet, 7:2) "Cyber Hate Speech on Twitter: An Application of Machine Classification and Statistical Modeling for Policy and Decision Making"
      [19] Jiawei Han, Micheline Kamber, Jian Pei, "Data Mining: Concepts and Techniques"