A Deep Learning Approach For Hate
Speech and Offensive Language
Detection on Twitter
Presented By:
Nasim Alam
M Tech Computer
INTRODUCTION
Hate Speech
● Hate speech is speech that attacks a person or group on the basis of attributes
such as race, religion, ethnic origin, national origin, gender, disability or sexual
orientation.
● The law of some countries describes hate speech as speech, gesture or
conduct, writing, or display that incites violence or prejudicial action against a
protected group or individual on the basis of their membership of the group.
● Social media platforms like Facebook and Twitter have raised concerns about
emerging dubious activity, such as the intensity of hateful, abusive and offensive
behavior among their users.
Motivation
Potential of social media for spreading hate speech
◉ 30% internet penetration in India (World Bank, 2016)
◉ 241 million users of Facebook alone (The Next Web Report, 2017)
◉ 136 million Indians are active social media users (Yral Report, 2016)
◉ 200 million WhatsApp users in India (Mashable, 2017)
OBJECTIVE
• The main objective of this work is to develop an automated, deep learning
based approach for detecting hate speech and offensive language.
• Automated detection relies on machine learning, which may be supervised or
unsupervised. We use a supervised learning method to detect hate speech and
offensive language.
• Classify tweets into three or four classes (e.g., racist, sexist, none, both) based
on tweet sentiment and other features that a tweet exhibits.
PROJECT CONTRIBUTION
• An efficient feature extraction and selection procedure.
• A multi-layer perceptron (MLP) based model to train on and classify tweets
into hate, offensive or none.
• A dynamic CNN (DCNN) based model for classification, with GloVe embedding
vectors for feature extraction.
Literature survey
Reference | Dataset | Technique | Results
Greevy et al. (2004) | PRINCIP Corpus; size: 3 million words | Model: SVM; feature extraction: BOW, bi-gram | Precision: 92.5% (BOW); Recall: 87% (BOW); Precision: 92.5% (bi-gram); Recall: 87% (bi-gram)
Waseem and Hovy (2016) | Total annotated tweets: 16,914; sexist tweets: 3,383; racist tweets: 1,972; tweets neither racist nor sexist: 11,559 | Model: char n-grams, word n-grams | Precision: 73.89% (char n-grams); Recall: 77.75% (char n-grams); F1 score: 72.87 (char n-grams); Precision: 64.58% (word n-grams); Recall: 71.93% (word n-grams); F1 score: 64.58 (word n-grams)
Jha and Mamidi (2017) | Waseem and Hovy, 2016; size: 22,142 tweets; classes: Benevolent, Hostile, Others | Model: SVM, Seq2Seq (LSTM), FastText classifier (by Facebook AI Research); feature extraction: TF-IDF, bag of n-words | Average F1 score among classes: 0.723 (SVM), 0.74 (Seq2Seq); overall F1 score: 0.84 (FastText)
Literature survey (continued)
Reference | Dataset | Technique | Results
Park and Fung (2017) | Waseem and Hovy, 2016; Waseem, 2016; classes: Racism, Sexism and None; size: 25k tweets | Model: hybrid CNN classifier (WordCNN + CharCNN) | Precision: 0.827; Recall: 0.827; F1 score: 0.827
Davidson et al. (2017) | CrowdFlower (CF); classes: Hate, Offensive and None; size: 25k tweets | Model: linear SVM, logistic regression | Precision: 0.91; Recall: 0.90; F1 score: 0.90
Zhang et al. (2018) | 7 publicly available datasets: DT (24k), RM (2k), WZ-LS (18k), WZ-L (16k), WZ-S.amt (6k), WZ-S.exp (6k), WZ-S.gb (6k) | Model: CNN + GRU | Accuracy: DT: 0.94; RM: 0.92; WZ-L: 0.82; WZ-S.amt: 0.92; WZ-S.exp: 0.92; WZ-S.gb: 0.93
Dataset
Dataset | No. of tweets | Classes (% of tweets) | Target classes
WZ-LS | 18,595 | Racism (10.6%), Sexism (20.2%), None (68.8%) | Racism, Sexism
WZ-L | 16,093 | Racism (12.01%), Sexism (19.56%), None (68.41%) | Racism, Sexism
WZ-S.exp | 6,594 | Racism (1.2%), Sexism (11.7%), Both (0.53%), None (84.37%) | Racism, Sexism
Hate (Davidson) | 24,783 | Hate (11.6%), Offensive (76.6%), Neither (11.8%) | Hate, Offensive
A Multi-Layer perceptron (MLP) based model
● Raw text: tweets crawled from Twitter with the Tweepy API and stored in a
CSV file.
● Extensive preprocessing is applied to obtain clean text.
● Feature extraction:
○ TF-IDF feature matrix.
○ POS TF feature matrix.
○ Other features: no_of_syllables, avg_syl_per_word,
no_of_unique_words, num_mentions, is_retweets, VADER sentiment
scores (pos, neg, neutral, compound).
● These feature matrices are concatenated into one matrix.
● Logistic regression with L1 regularization selects the most important
features; the selected feature vector is then passed to an MLP network
for classification.
● The MLP consists of an input layer, three hidden layers and an output
(softmax) layer.
○ Input layer: size equal to the size of the input feature matrix;
activation: sigmoid.
○ Hidden layers: 200, 140 and 70 nodes; activation: ReLU.
○ Softmax layer: 3 or 4 output classes; activation: softmax.
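The feature extraction and concatenation steps above can be sketched in plain Python. This is a minimal illustration, not the presenter's code: the smoothed IDF variant, the whitespace tokenizer, the helper names (`tfidf_matrix`, `extra_features`, `build_feature_matrix`) and the three hand-crafted features shown are all assumptions chosen for brevity. The POS and VADER features are omitted.

```python
import math
from collections import Counter


def tfidf_matrix(docs):
    """Return (vocab, rows); rows[i][j] is the TF-IDF weight of vocab[j] in docs[i]."""
    tokenized = [d.lower().split() for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    n_docs = len(docs)
    # Document frequency: number of documents containing each token.
    df = Counter(t for toks in tokenized for t in set(toks))
    # Smoothed IDF (one common variant, assumed here).
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1.0 for t in vocab}
    rows = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1
        rows.append([counts[t] / total * idf[t] for t in vocab])
    return vocab, rows


def extra_features(text):
    """A few hand-crafted features named after those on the slide."""
    toks = text.split()
    return [
        float(len({t.lower() for t in toks})),         # no_of_unique_words
        float(sum(t.startswith("@") for t in toks)),   # num_mentions
        1.0 if toks[:1] == ["RT"] else 0.0,            # is_retweets
    ]


def build_feature_matrix(docs):
    """Concatenate the TF-IDF columns with the hand-crafted feature columns."""
    vocab, tfidf = tfidf_matrix(docs)
    return vocab, [row + extra_features(doc) for row, doc in zip(tfidf, docs)]


tweets = ["RT @user this is fine", "this is not fine at all", "@a @b hello"]
vocab, X = build_feature_matrix(tweets)
print(len(X), len(X[0]))  # rows, columns of the concatenated matrix
```

In the pipeline described above, a matrix like `X` would then go through L1-regularized logistic regression for feature selection before reaching the MLP.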
MLP based Proposed model
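As a rough illustration of the architecture listed above (hidden layers of 200, 140 and 70 ReLU units followed by a softmax output), the forward pass can be sketched without any deep learning library. The weight initialization, the feature count of 50 and the element-wise sigmoid applied to the input are assumptions; the slide does not say how the input-layer sigmoid is wired, and training is not shown.

```python
import math
import random


def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-a)) for a in v]


def relu(v):
    return [max(0.0, a) for a in v]


def softmax(v):
    m = max(v)  # subtract the max for numerical stability
    exps = [math.exp(a - m) for a in v]
    total = sum(exps)
    return [e / total for e in exps]


def dense(x, weights, biases):
    """Fully connected layer: one weight row and one bias per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def init_layer(n_in, n_out, rng):
    scale = 1.0 / math.sqrt(n_in)  # simple uniform init (an assumption)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def build_mlp(n_features, n_classes, hidden=(200, 140, 70), seed=0):
    rng = random.Random(seed)
    sizes = [n_features, *hidden, n_classes]
    return [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]


def forward(layers, x):
    x = sigmoid(x)  # one possible reading of "input layer activation: sigmoid"
    for weights, biases in layers[:-1]:
        x = relu(dense(x, weights, biases))
    weights, biases = layers[-1]
    return softmax(dense(x, weights, biases))


layers = build_mlp(n_features=50, n_classes=3)
probs = forward(layers, [0.1] * 50)
print(probs)  # three class probabilities that sum to 1
```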
A simple single layer CNN
● A sentence (a single tweet): $x_{1:n} = x_1 \oplus x_2 \oplus \dots \oplus x_n$
● All possible windows of length $h$: $\{x_{1:h}, x_{2:h+1}, \dots, x_{n-h+1:n}\}$
● We can have multiple filters (windows) of different lengths: h = 1 for unigrams, h = 2 for bigrams,
h = 3 for trigrams and so on.
● Each filter consists of randomly initialized weights. It is convolved over the sentence matrix in
overlapping windows, and the sum of the products of the filter and each window of X gives one entry of the feature map.
● A feature map $c = [c_1, c_2, \dots, c_{n-h+1}] \in \mathbb{R}^{n-h+1}$; with $m$ filters we obtain $m$
feature maps $c^{(1)}, c^{(2)}, \dots, c^{(m)}$.
● Pooling: pooling selects only the region of interest from the convolutional feature vector.
The result of max pooling is $\hat{c} = \max\{c\}$, and $\hat{c}_i$ is the pooled feature for the $i$-th filter.
● All the pooled features are concatenated into a single feature vector $z = [\hat{c}_1, \hat{c}_2, \dots, \hat{c}_m]$.
● Finally, the feature vector $z$ is passed through a softmax function for classification.
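The steps above (windowed convolution, max pooling, concatenation, softmax) can be sketched as follows. The toy sentence matrix, the three filters and the class weights are made-up numbers for illustration; a real model would learn the filter and classifier weights, and would normally add a bias and a nonlinearity, which the slide does not mention and which are left out here.

```python
import math


def convolve(sentence, filt):
    """Slide an h x d filter over an n x d sentence matrix.

    Returns the feature map with n - h + 1 entries; each entry is the sum
    of element-wise products between the filter and one window.
    """
    h, n = len(filt), len(sentence)
    return [
        sum(w * x
            for f_row, x_row in zip(filt, sentence[i:i + h])
            for w, x in zip(f_row, x_row))
        for i in range(n - h + 1)
    ]


def cnn_features(sentence, filters):
    """Max-pool each feature map and concatenate the pooled values into z."""
    return [max(convolve(sentence, f)) for f in filters]


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(z, class_weights):
    """Linear layer followed by softmax (weights here are hypothetical)."""
    return softmax([sum(w * v for w, v in zip(row, z)) for row in class_weights])


# n = 5 words, d = 2 embedding dimensions (toy values).
sentence = [[1, 0], [0, 1], [1, 1], [0, 0], [2, 1]]
filters = [
    [[1, 1]],                  # h = 1 (unigram window)
    [[1, 0], [0, 1]],          # h = 2 (bigram window)
    [[1, 0], [1, 0], [1, 0]],  # h = 3 (trigram window)
]
z = cnn_features(sentence, filters)
probs = classify(z, [[0.1, 0.0, 0.0], [0.0, 0.1, 0.0], [0.0, 0.0, 0.1]])
print(z, probs)
```

Note that each feature map has n - h + 1 entries (5, 4 and 3 here), while z always has one entry per filter regardless of sentence length.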
Word2Vec Word Embedding
Word2vec
• Word2vec is a predictive model: it uses a neural network to learn a
real-valued vector for a target word from its surrounding context
words.
• Mikolov et al. used the continuous bag-of-words (CBOW) and skip-gram
models, trained over millions of words, to represent each word in a
vector space.
GloVe
• GloVe is a semantic vector space model of language that represents
each word with a real-valued vector.
• The GloVe model uses word frequencies and a global co-occurrence count
matrix.
• Count-based models learn their vectors essentially by performing
dimensionality reduction on the co-occurrence count matrix.
• These vectors can be used as features in a variety of applications
such as information retrieval, document classification, question
answering, NER, and parsing.
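To make the idea of a co-occurrence count matrix concrete, the sketch below counts how often word pairs appear within a fixed window and compares two words by the cosine similarity of their count rows. It is a simplification and not the GloVe training procedure: plain unweighted counts, a window of 1 and a two-sentence toy corpus are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def cooccurrence_counts(sentences, window=1):
    """Count (word, context) pairs that fall within `window` positions."""
    counts = defaultdict(float)
    for sent in sentences:
        for i, word in enumerate(sent):
            lo, hi = max(0, i - window), min(len(sent), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[(word, sent[j])] += 1.0
    return counts


def count_row(word, vocab, counts):
    """Row of the co-occurrence matrix for `word`, ordered by `vocab`."""
    return [counts.get((word, ctx), 0.0) for ctx in vocab]


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


corpus = [
    ["hate", "speech", "is", "harmful"],
    ["offensive", "speech", "is", "harmful"],
]
counts = cooccurrence_counts(corpus, window=1)
vocab = sorted({w for sent in corpus for w in sent})
similarity = cosine(count_row("hate", vocab, counts),
                    count_row("offensive", vocab, counts))
print(similarity)  # words that share contexts get similar rows
```

Words that occur in the same contexts end up with similar rows, which is the property that dimensionality reduction over the count matrix preserves in the final dense vectors.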
Representation of word in vector space
Text based CNN
Text based Convolutional Neural Network operation (Source: Kim 2014)
Dynamic Convolutional Neural Network
• Wide convolution: given an input sentence, to obtain the first
layer of the DCNN we take the embedding $w_i \in \mathbb{R}^d$ for each word
in the sentence and construct the sentence matrix $\mathbf{s} \in \mathbb{R}^{d \times s}$.
• A convolutional layer in the network is obtained by convolving a
matrix of weights $\mathbf{m} \in \mathbb{R}^{d \times m}$ with the matrix of activations at the
layer below.
• Dynamic k-max pooling is a k-max pooling operation in which k is a
function of the length of the sentence and the depth of the
network. Following Kalchbrenner et al. (2014), the pooling
parameter is modelled as:
$k_i = \max\left(k_{top}, \left\lceil \frac{L - i}{L}\, s \right\rceil\right)$
where $i$ is the index of the convolutional layer to which k-max pooling is
applied, $L$ is the total number of convolutional layers in the network,
$s$ is the input sentence length and $k_{top}$ is the fixed pooling
parameter of the topmost convolutional layer.
• Folding sums every two rows of a feature map component-wise.
For a feature map of d rows, folding returns d/2 rows.
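The pooling and folding operations can be sketched in a few lines. The pooling parameter follows the formula from Kalchbrenner et al. (2014), k_i = max(k_top, ceil((L - i) / L * s)), where k_top is the fixed pooling size of the top layer; the function names and the toy inputs are illustrative assumptions.

```python
def dynamic_k(i, L, s, k_top):
    """Pooling size for conv layer i of L, given sentence length s."""
    ceil_term = ((L - i) * s + L - 1) // L  # integer ceiling of (L - i) * s / L
    return max(k_top, ceil_term)


def k_max_pool(row, k):
    """Keep the k largest values of `row`, preserving their original order."""
    top = sorted(range(len(row)), key=lambda j: row[j], reverse=True)[:k]
    return [row[j] for j in sorted(top)]


def fold(feature_map):
    """Sum every two adjacent rows component-wise: d rows become d/2 rows."""
    return [[a + b for a, b in zip(r1, r2)]
            for r1, r2 in zip(feature_map[0::2], feature_map[1::2])]


# Three conv layers, sentence of 18 words, top-layer pooling size 3.
ks = [dynamic_k(i, L=3, s=18, k_top=3) for i in (1, 2, 3)]
pooled = k_max_pool([3, 1, 5, 2, 4], k=3)
folded = fold([[1, 2], [3, 4], [5, 6], [7, 8]])
print(ks, pooled, folded)
```

Unlike plain max pooling, k-max pooling keeps the relative order of the selected values, so some positional information survives into the next layer.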
A DCNN Architecture (Source: Kalchbrenner et al. (2014) )
A DCNN based Model for Hate speech detection
● Tweets: crawled by tweet ID and saved as a CSV file of tweets and labels.
● Preprocessing of tweets:
○ Convert to lowercase; remove stop words.
○ Remove unwanted symbols and retweets.
○ Normalize words to make them meaningful.
○ Remove tokens with a document frequency below 7; this drops
sparse, less informative features.
● Word-to-vector conversion:
○ A 300-dimensional GloVe word embedding model, pre-trained by
researchers at Stanford University on the 4-billion-word Wikipedia
text corpus.
○ Embedding dimension: 100 × 300.
● Classification with the DCNN model:
○ Four Conv1D layers, each with 300 filters, with window sizes 1, 2, 3 and 4.
○ K-max pooling is applied to the output of each Conv1D layer, and the
results are merged into a single vector.
○ The merged vector is passed through dropout, a dense layer and a
softmax layer for classification.
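The preprocessing steps listed above might look roughly like this in Python. Everything specific here is an assumption: the tiny stop-word list, the regular expressions, reading "remove retweets" as stripping the leading "RT" marker, and padding to 100 tokens (taken from the 100 × 300 embedding shape). The word normalization step is not shown.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; a real pipeline would use a fuller one.
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in", "on"}


def clean_tweet(text):
    """Lowercase, strip retweet marker, URLs, mentions and symbols, drop stop words."""
    text = text.lower()
    text = re.sub(r"^rt\s+", "", text)          # leading retweet marker
    text = re.sub(r"https?://\S+", " ", text)   # URLs
    text = re.sub(r"@\w+", " ", text)           # user mentions
    text = re.sub(r"[^a-z0-9\s]", " ", text)    # remaining symbols
    return [t for t in text.split() if t not in STOP_WORDS]


def filter_rare(token_lists, min_df=7):
    """Drop tokens that appear in fewer than `min_df` tweets."""
    df = Counter(t for toks in token_lists for t in set(toks))
    return [[t for t in toks if df[t] >= min_df] for toks in token_lists]


def pad(tokens, length=100, pad_token="<pad>"):
    """Truncate or right-pad so every tweet has exactly `length` tokens."""
    return (tokens + [pad_token] * length)[:length]


tokens = clean_tweet("RT @user: This is SO bad!!! http://t.co/x")
kept = filter_rare([["bad", "so"], ["bad"], ["ok"]], min_df=2)
padded = pad(tokens, length=5)
print(tokens, kept, padded)
```

Each padded token would then be looked up in the pre-trained embedding table to build the per-tweet input matrix for the DCNN.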
A DCNN based proposed model
Results and Discussion
Datasets | SVM | MLP | CNN* | DCNN
WZ-LS | 0.73 | 0.83 | 0.82 | 0.83
WZ-L | 0.74 | 0.83 | 0.82 | 0.83
WZ-S.exp | 0.89 | 0.93 | 0.90 | 0.9283
Hate | 0.87 | 0.92 | 0.91 | 0.92
Table 1: Testing accuracy of four models on four publicly available hate and offensive
language datasets.
Performance of MLP based Model
WZ-LS
Class | Precision | Recall | F1
Racist | 0.73 | 0.73 | 0.73
Sexism | 0.77 | 0.56 | 0.65
None | 0.85 | 0.92 | 0.88
Both | 1.0 | 0.33 | 0.50
Overall | 0.83 | 0.83 | 0.82
WZ-L
Class | Precision | Recall | F1
Racist | 0.81 | 0.68 | 0.74
Sexism | 0.85 | 0.61 | 0.71
None | 0.83 | 0.93 | 0.88
Overall | 0.83 | 0.83 | 0.82
WZ-S.exp
Class | Precision | Recall | F1
Racist | 1.0 | 0.05 | 0.2
Sexism | 0.85 | 0.77 | 0.81
None | 0.95 | 0.99 | 0.97
Both | 0.0 | 0.0 | 0.0
Overall | 0.93 | 0.92 | 0.93
DT
Class | Precision | Recall | F1
Hate | 0.60 | 0.52 | 0.56
Offensive | 0.95 | 0.80 | 0.87
Neither | 0.87 | 0.91 | 0.89
Overall | 0.92 | 0.91 | 0.92
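For readers checking the tables, F1 is the harmonic mean of precision and recall, F1 = 2PR / (P + R). The short sketch below recomputes it for a few of the rows reported above; the function name is an illustrative choice, and returning 0 when both inputs are 0 is a common convention rather than something stated on the slides.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs taken from the tables above.
rows = {
    "WZ-LS Sexism": (0.77, 0.56),
    "WZ-L Racist": (0.81, 0.68),
    "DT Offensive": (0.95, 0.80),
}
for name, (p, r) in rows.items():
    print(name, round(f1_score(p, r), 2))
```

Because the harmonic mean is pulled toward the smaller value, a class with high precision but low recall (such as the minority classes here) still gets a low F1.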
Conclusion
The propagation of hate speech on social media has been increasing
significantly in recent years, and it is recognised that effective counter-measures
rely on automated data mining techniques. Our work makes two contributions
to this problem. First, we introduced a method for automatically classifying hate
speech on Twitter using deep neural network models (DCNN and MLP) that
empirically improve classification accuracy. Second, we carried out a comparative
analysis of our models on four publicly available datasets.
Future Work
We will extend this work in several ways. First, further fine-tuning of
hyperparameters may improve accuracy. Second, we will use user metadata along
with the tweets, such as number of followers, location, account age and total number
of (posted/favorited/liked) tweets. Third, we will build a hybrid model
(DCNN + MLP): tweets are passed through the DCNN model and metadata through
the MLP in parallel, the two outputs are combined, and the result is passed
through a dense layer and a softmax layer for final classification.
THANKS!
References
• Greevy E. and Smeaton A. F. "Classifying racist texts using a support vector machine". In Proceedings of the 27th Annual
International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '04), pages 468–469, New
York, NY, USA, 2004. ACM.
• Davidson T., Warmsley D., Macy M. and Weber I. "Automated hate speech detection and the problem of offensive language". In
Proceedings of the 11th Conference on Web and Social Media. AAAI, 2017.
• Lozano E., Cedeño J., Castillo G., Layedra F., Lasso H. and Vaca C. "Requiem for online harassers: Identifying racism from
political tweets". In 4th IEEE Conference on eDemocracy & eGovernment (ICEDEG), pages 154–160, 2017.
• Jha A. and Mamidi R. "When does a compliment become sexist? Analysis and classification of ambivalent sexism using
Twitter data". In 2nd Workshop on NLP and Computational Social Science, pages 7–16, 2017.
• Park J. H. and Fung P. "One-step and two-step classification for abusive language detection on Twitter". In ALW1: 1st Workshop on
Abusive Language Online, Vancouver, Canada, 2017. Association for Computational Linguistics.
• Zhang Z., Robinson D. and Tepper J. "Detecting Hate Speech on Twitter Using a Convolution-GRU Based DNN". In 15th ESWC
conference on Semantic Web, 2018.
• Waseem Z. and Hovy D. "Hateful symbols or hateful people? Predictive features for hate speech detection on Twitter". In
Proceedings of the NAACL Student Research Workshop, pages 88–93. Association for Computational Linguistics, 2016.
• Kalchbrenner N., Grefenstette E. and Blunsom P. "A Convolutional Neural Network for Modelling Sentences". arXiv:1404.2188v1
[cs.CL], 8 Apr 2014.
Comparative analysis between traditional aquaponics and reconstructed aquapon...
 
Embedded machine learning-based road conditions and driving behavior monitoring
Embedded machine learning-based road conditions and driving behavior monitoringEmbedded machine learning-based road conditions and driving behavior monitoring
Embedded machine learning-based road conditions and driving behavior monitoring
 
5214-1693458878915-Unit 6 2023 to 2024 academic year assignment (AutoRecovere...
5214-1693458878915-Unit 6 2023 to 2024 academic year assignment (AutoRecovere...5214-1693458878915-Unit 6 2023 to 2024 academic year assignment (AutoRecovere...
5214-1693458878915-Unit 6 2023 to 2024 academic year assignment (AutoRecovere...
 
Casting-Defect-inSlab continuous casting.pdf
Casting-Defect-inSlab continuous casting.pdfCasting-Defect-inSlab continuous casting.pdf
Casting-Defect-inSlab continuous casting.pdf
 

Hate speech detection

  • 1. A Deep Learning Approach For Hate Speech and Offensive Language Detection on Twitter Presented By: Nasim Alam M Tech Computer
  • 2. INTRODUCTION: Hate Speech
      ● Hate speech is speech that attacks a person or group on the basis of attributes such as race, religion, ethnic origin, national origin, gender, disability, or sexual orientation.
      ● The law of some countries describes hate speech as speech, gesture or conduct, writing, or display that incites violence or prejudicial action against a protected group or individual on the basis of their membership of the group.
      ● Social media platforms like Facebook and Twitter have raised concerns about emerging dubious activity, such as the intensity of hateful, abusive and offensive behavior among users.
  • 3. Motivation: Potential of social media for spreading hate speech
      ◉ 30% internet penetration in India (World Bank, 2016)
      ◉ 241 million users of Facebook alone (The Next Web Report, 2017)
      ◉ 136 million Indians are active social media users (Yral Report, 2016)
      ◉ 200 million WhatsApp users in India (Mashable, 2017)
  • 4. OBJECTIVE
      • The main objective of this work is to develop an automated, deep-learning-based approach for detecting hate speech and offensive language.
      • Automated detection relies on machine learning, which may be supervised or unsupervised. We use a supervised learning method to detect hate speech and offensive language.
      • Classify tweets into three or four classes (for example: racist, sexist, none, both) based on tweet sentiment and other features that a tweet demonstrates.
  • 5. PROJECT CONTRIBUTION
      • Efficient feature extraction and selection.
      • A multi-layer perceptron (MLP) based model to train on tweets and classify them as hate, offensive or none.
      • A dynamic CNN (DCNN) based model for training, with GloVe embedding vectors for feature extraction.
  • 6. Literature survey
      Reference: Greevy et al (2004)
          Dataset: PRINCIP Corpus; size: 3 million words from tweets
          Technique: Model: SVM; feature extraction: BOW, bi-gram
          Results: Precision: 92.5% (BOW), Recall: 87% (BOW); Precision: 92.5% (bi-gram), Recall: 87% (bi-gram)
      Reference: Waseem and Hovy (2016)
          Dataset: Total annotated tweets: 16,914; sexist tweets: 3,383; racist tweets: 1,972; tweets neither racist nor sexist: 11,559
          Technique: Model: char n-grams, word n-grams
          Results: Precision: 73.89%, Recall: 77.75%, F1 score: 72.87 (char n-grams); Precision: 64.58%, Recall: 71.93%, F1 score: 64.58 (word n-grams)
      Reference: Akshita et al (2016)
          Dataset: Waseem and Hovy, 2016; size: 22,142 tweets; classes: Benevolent, Hostile, Others
          Technique: Model: SVM, Seq2Seq (LSTM), FastText classifier (by Facebook AI Research); feature extraction: TF-IDF, bag of n-words
          Results: Average F1 score among classes: 0.723 (SVM), 0.74 (Seq2Seq); overall F1 score: 0.84 (FastText)
  • 7. Literature survey
      Reference: Ji Ho et al (2016)
          Dataset: Waseem and Hovy, 2016; Waseem 2016; classes: Racism, Sexism and None; size: 25k tweets
          Technique: Model: Hybrid CNN classifier (WordCNN + CharCNN)
          Results: Precision: 0.827, Recall: 0.827, F1 score: 0.827
      Reference: Davidson et al (2017)
          Dataset: CrowdFlower (CF); classes: Hate, Offensive and None; size: 25k tweets
          Technique: Model: Linear SVM, Logistic Regression
          Results: Precision: 0.91, Recall: 0.90, F1 score: 0.90
      Reference: Zhang et al (2018)
          Dataset: 7 publicly available datasets: DT (24k), RM (2k), WZ-LS (18k), WZ-L (16k), WZ-S.amt (6k), WZ-S.exp (6k), WZ-S.gb (6k)
          Technique: Model: CNN + GRU
          Results: Accuracy: DT: 0.94, RM: 0.92, WZ-L: 0.82, WZ-S.amt: 0.92, WZ-S.exp: 0.92, WZ-S.gb: 0.93
  • 8. Dataset
      Dataset         | No. of tweets | Classes (% of tweets)                                        | Target classes
      WZ-LS           | 18,595        | Racism (10.6%), Sexism (20.2%), None (68.8%)                 | Racism, Sexism
      WZ-L            | 16,093        | Racism (12.01%), Sexism (19.56%), None (68.41%)              | Racism, Sexism
      WZ-S.exp        | 6,594         | Racism (1.2%), Sexism (11.7%), Both (0.53%), None (84.37%)   | Racism, Sexism
      Hate (Davidson) | 24,783        | Hate (11.6%), Offensive (76.6%), Neither (11.8%)             | Hate, Offensive
  • 9. A Multi-Layer Perceptron (MLP) based model
      ● Raw text: tweets crawled from Twitter using the Tweepy API and stored in a CSV file.
      ● Extensive preprocessing is applied to obtain clean text.
      ● Feature extraction:
          ○ TF-IDF feature matrix.
          ○ POS TF feature matrix.
          ○ Other features: No_of_syllables, avg_syl_per_word, no_of_unique_words, num_mentions, is_retweets, and VaderSentiment scores (pos, neg, neutral, compound).
      ● These feature matrices are concatenated into one matrix.
      ● Logistic regression with L1 regularization selects the most important features; the selected feature vector is then passed to an MLP network for classification.
      ● The MLP consists of an input layer, three hidden layers and an output (softmax) layer:
          ○ Input layer: size equal to the size of the input feature matrix; activation: sigmoid.
          ○ Hidden layers: 200, 140 and 70 nodes; activation: ReLU.
          ○ Softmax layer: 3 or 4 output classes; activation: softmax.
      [Figure: MLP based proposed model]
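The layer layout on this slide (three hidden layers of 200, 140 and 70 ReLU units, followed by a softmax output) can be sketched as a forward pass. This is a minimal illustration and not the authors' implementation: the input size of 50, the random weights, and the helper names `dense`, `relu`, `softmax`, `init_layer`, `build_mlp` and `mlp_forward` are all assumptions made for the example, and the sigmoid activation listed for the input layer is left out for brevity.

```python
import math
import random

def dense(x, weights, biases):
    """Affine layer: one output value per row of `weights`."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]

def relu(x):
    return [max(0.0, v) for v in x]

def softmax(x):
    m = max(x)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]

def init_layer(n_in, n_out, rng):
    """Hypothetical initialisation: small uniform weights, zero biases."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def build_mlp(n_features, n_classes, hidden=(200, 140, 70), seed=0):
    rng = random.Random(seed)
    sizes = [n_features, *hidden, n_classes]
    return [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]

def mlp_forward(layers, x):
    # ReLU on every hidden layer, softmax on the output layer
    for weights, biases in layers[:-1]:
        x = relu(dense(x, weights, biases))
    weights, biases = layers[-1]
    return softmax(dense(x, weights, biases))

# Assumed toy input: 50 selected features, 3 output classes
layers = build_mlp(n_features=50, n_classes=3)
probs = mlp_forward(layers, [0.1] * 50)
```

With untrained random weights the class probabilities carry no meaning; the sketch only shows how the selected feature vector would flow through the layers.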
  • 10. A simple single layer CNN
      ● A sentence (a single tweet): $X_{1:n} = X_1 \oplus X_2 \oplus \dots \oplus X_n$
      ● All possible windows of length h: $\{X_{1:h}, X_{2:h+1}, \dots, X_{n-h+1:n}\}$
      ● We can have multiple filters (windows) of different lengths, such as h=1 for unigrams, h=2 for bigrams, h=3 for trigrams, and so on.
      ● Each filter consists of randomly initialized weights and is convolved over the sentence matrix in an overlapping fashion; the sum of the element-wise products of the filter and each window of X gives one entry of the feature map.
      ● A feature map is $C = [c_1, c_2, \dots, c_{n-h+1}] \in \mathbb{R}^{n-h+1}$; with multiple filters we obtain multiple feature maps $C_1, C_2, \dots, C_m$, where m is the number of filters.
      ● Pooling: pooling selects only the region of interest from the convolution feature vector. With max pooling the result is $\hat{C} = \max\{C\}$, and $\hat{C}_i$ is the pooled feature for the i-th filter.
      ● All the pooled values are concatenated into a single feature vector $Z = [\hat{C}_1, \hat{C}_2, \dots, \hat{C}_m]$.
      ● Finally, the feature vector Z is passed through a softmax function for final classification.
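The window, feature-map and max-pooling steps on this slide can be traced on a toy sentence. The sketch below is only an illustration under stated assumptions: the 5-token sentence, the 2-dimensional embeddings, the hand-picked filter weights and the function names `windows` and `feature_map` are all invented for the example, and bias terms and non-linearities are omitted.

```python
def windows(n, h):
    """Index ranges of all windows of length h over a sentence of n tokens."""
    return [(i, i + h) for i in range(n - h + 1)]

def feature_map(sentence, filt):
    """Slide a filter with h rows over the sentence matrix; one value per window."""
    h = len(filt)
    out = []
    for start, end in windows(len(sentence), h):
        total = 0.0
        for word_vec, filt_row in zip(sentence[start:end], filt):
            total += sum(x * w for x, w in zip(word_vec, filt_row))
        out.append(total)
    return out

# Toy sentence: n = 5 tokens, embedding dimension 2 (assumed values)
sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0], [2.0, 1.0]]

# One filter per window length h = 1, 2, 3 (weights chosen by hand)
filters = [
    [[1.0, 1.0]],
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]],
]

maps = [feature_map(sentence, f) for f in filters]  # each has n - h + 1 entries
z = [max(c) for c in maps]                          # max pooling, then concatenation
```

Here the feature maps have lengths 5, 4 and 3 (that is, n - h + 1 for h = 1, 2, 3), and the pooled vector `z` holds one value per filter, which is what would be fed to the softmax layer.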
  • 11. Word2Vec Word Embedding
      Word2vec
      • Word2vec is a predictive model that uses an ANN-based model to predict the real-valued vector of a target word with respect to another context word.
      • Mikolov et al. used continuous bag-of-words and skip-gram models that are trained over millions of words and represent each word in a vector space.
      GloVe
      • GloVe is a semantic vector space model of language that represents each word with a real-valued vector.
      • The GloVe model uses word frequency and a global co-occurrence count matrix.
      • Count-based models learn their vectors by essentially doing dimensionality reduction on the co-occurrence count matrix.
      • These vectors can be used as features in a variety of applications such as information retrieval, document classification, question answering, NER, and parsing.
      [Figure: Representation of words in vector space]
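To make the "global co-occurrence count matrix" concrete, the sketch below counts how often two words appear within a fixed window of each other. It is an assumption-laden illustration: the two-sentence corpus, the window size of 1 and the function name `cooccurrence_counts` are invented here, and the later steps of GloVe (fitting word vectors to these counts) are not shown.

```python
from collections import Counter

def cooccurrence_counts(sentences, window=2):
    """Count (word, context) pairs that occur within `window` tokens of each other."""
    counts = Counter()
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[(word, tokens[j])] += 1
    return counts

# Assumed toy corpus
corpus = [
    "hate speech spreads online".split(),
    "offensive speech spreads online".split(),
]
counts = cooccurrence_counts(corpus, window=1)
```

In this toy corpus `("speech", "spreads")` is counted twice and `("hate", "speech")` once, while pairs that never share a window, such as `("hate", "online")`, stay at zero.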
  • 12. Text based CNN
      [Figure: Text based Convolutional Neural Network operation (Source: Kim 2014)]
  • 13. Dynamic Convolutional Neural Network
      • Wide convolution: given an input sentence, to obtain the first layer of the DCNN we take the embedding $w_i \in \mathbb{R}^d$ for each word in the sentence and construct the sentence matrix $s \in \mathbb{R}^{d \times s}$.
      • A convolutional layer in the network is obtained by convolving a matrix of weights $m \in \mathbb{R}^{d \times m}$ with the matrix of activations at the layer below.
      • Dynamic k-max pooling is a k-max pooling operation in which k is a function of the length of the sentence and the depth of the network. Following Kalchbrenner et al. (2014), the pooling parameter is modeled as $k_i = \max\left(k_{top}, \left\lceil \frac{L - i}{L} S \right\rceil\right)$, where i is the index of the convolutional layer at which k-max pooling is applied, L is the total number of convolutional layers in the network, S is the input sentence length, and $k_{top}$ is the fixed pooling parameter of the topmost convolutional layer.
      • Folding sums every two rows of a feature map component-wise. For a feature map of d rows, folding returns d/2 rows.
      [Figure: A DCNN architecture (Source: Kalchbrenner et al. (2014))]
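The pooling and folding operations on this slide can be sketched in a few lines. The pooling parameter follows the rule from Kalchbrenner et al. (2014), k = max(k_top, ceil((L - i) / L * S)); the function names `dynamic_k`, `k_max_pool` and `fold`, and all the toy values, are assumptions made for illustration and do not come from the slides.

```python
def dynamic_k(layer, total_layers, sentence_len, k_top):
    """Pooling parameter for conv layer `layer` (1-based) out of `total_layers`."""
    numerator = (total_layers - layer) * sentence_len
    ceil_div = -(-numerator // total_layers)  # integer ceiling, avoids float rounding
    return max(k_top, ceil_div)

def k_max_pool(row, k):
    """Keep the k largest values of a row, preserving their original order."""
    top = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
    return [row[i] for i in sorted(top)]

def fold(feature_map):
    """Sum every two adjacent rows component-wise: d rows become d/2 rows."""
    return [
        [a + b for a, b in zip(r1, r2)]
        for r1, r2 in zip(feature_map[0::2], feature_map[1::2])
    ]

# Assumed setting: 3 conv layers, sentence of 18 tokens, k_top = 3
ks = [dynamic_k(i, 3, 18, 3) for i in (1, 2, 3)]
pooled = k_max_pool([1, 5, 2, 7, 3], 2)
folded = fold([[1, 2], [3, 4], [5, 6], [7, 8]])
```

For this setting the pooling sizes shrink from 12 to 6 to 3 as the layers get deeper, `k_max_pool` keeps `[5, 7]` in sentence order, and folding turns four rows into two.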
  • 14. A DCNN based Model for Hate speech detection
      ● Tweets: crawled using tweet IDs and saved as a CSV file containing tweets and labels.
      ● Preprocessing of tweets:
          ○ Convert to lowercase; remove stop words.
          ○ Remove unwanted symbols and retweets.
          ○ Normalize words to make them meaningful.
          ○ Remove tokens with a document frequency below 7, which removes sparse, less informative features.
      ● Word2vec conversion:
          ○ A 300-dimensional GloVe word embedding model, pre-trained on the 4-billion-word Wikipedia text corpus by researchers from Stanford University.
          ○ Embedding dimension: 100*300.
      ● Passed to the DCNN model for classification:
          ○ Four conv1d layers, each with 300 filters, with window sizes 1, 2, 3 and 4 respectively.
          ○ K-max pooling is performed on the output of each conv1d layer and the results are merged into one single vector.
          ○ The merged vector is passed through dropout, a dense layer and a softmax layer for classification.
      [Figure: A DCNN based proposed model]
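The preprocessing steps listed on this slide might look roughly like the sketch below. Everything specific in it is an assumption made for illustration: the tiny stop-word list, the regular expression for URLs, mentions and symbols, treating the "rt" marker as a stop word, and the helper names `clean`, `filter_rare` and `pad`. The toy example uses a document-frequency threshold of 2 and a length of 3, where the slide uses 7 and (judging by the 100*300 embedding shape) 100 tokens.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (assumption); "rt" drops the retweet marker
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "rt"}

def clean(tweet):
    """Lowercase, strip URLs, mentions and non-letter symbols, drop stop words."""
    text = tweet.lower()
    text = re.sub(r"https?://\S+|@\w+|[^a-z\s]", " ", text)
    return [tok for tok in text.split() if tok not in STOP_WORDS]

def filter_rare(docs, min_df):
    """Remove tokens that appear in fewer than `min_df` documents."""
    df = Counter(tok for doc in docs for tok in set(doc))
    return [[tok for tok in doc if df[tok] >= min_df] for doc in docs]

def pad(tokens, length, pad_token="<pad>"):
    """Truncate or right-pad a token list to a fixed length."""
    return (tokens + [pad_token] * length)[:length]

tweets = [
    "RT @user: The match is GREAT!!! http://t.co/x",
    "Great match today",
    "Great weather",
]
docs = filter_rare([clean(t) for t in tweets], min_df=2)
padded = [pad(d, 3) for d in docs]
```

After these steps each tweet is a fixed-length token sequence, and every token would then be looked up in the pre-trained embedding table to form the sentence matrix.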
  • 15. Results and Discussion
      Datasets | SVM  | MLP  | CNN* | DCNN
      WZ-LS    | 0.73 | 0.83 | 0.82 | 0.83
      WZ-L     | 0.74 | 0.83 | 0.82 | 0.83
      WZ-S.exp | 0.89 | 0.93 | 0.90 | 0.9283
      Hate     | 0.87 | 0.92 | 0.91 | 0.92
      Table 1: Testing accuracy of four models on four publicly available hate and offensive language datasets.
  • 17. Performance of MLP based Model
      (a) WZ-LS
      Class   | Precision | Recall | F1
      Racist  | 0.73      | 0.73   | 0.73
      Sexism  | 0.77      | 0.56   | 0.65
      None    | 0.85      | 0.92   | 0.88
      Both    | 1.0       | 0.33   | 0.50
      Overall | 0.83      | 0.83   | 0.82
      (b) WZ-L
      Class   | Precision | Recall | F1
      Racist  | 0.81      | 0.68   | 0.74
      Sexism  | 0.85      | 0.61   | 0.71
      None    | 0.83      | 0.93   | 0.88
      Overall | 0.83      | 0.83   | 0.82
      (c) WZ-S.exp
      Class   | Precision | Recall | F1
      Racist  | 1.0       | 0.05   | 0.2
      Sexism  | 0.85      | 0.77   | 0.81
      None    | 0.95      | 0.99   | 0.97
      Both    | 0.0       | 0.0    | 0.0
      Overall | 0.93      | 0.92   | 0.93
      (d) DT
      Class     | Precision | Recall | F1
      Hate      | 0.60      | 0.52   | 0.56
      Offensive | 0.95      | 0.80   | 0.87
      Neither   | 0.87      | 0.91   | 0.89
      Overall   | 0.92      | 0.91   | 0.92
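The F1 column in these tables is the harmonic mean of precision and recall, F1 = 2PR / (P + R). As a quick check, the sketch below recomputes F1 for the three DT classes from the precision and recall values reported above; the function name `f1_score` and the zero-division convention are assumptions of this example.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; defined as 0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# (precision, recall) pairs taken from the DT table
rows = {"Hate": (0.60, 0.52), "Offensive": (0.95, 0.80), "Neither": (0.87, 0.91)}
f1 = {name: round(f1_score(p, r), 2) for name, (p, r) in rows.items()}
```

Rounded to two decimals, this gives 0.56, 0.87 and 0.89, matching the F1 values listed for the DT classes.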
  • 18. Conclusion
      The propagation of hate speech on social media has increased significantly in recent years, and it is recognised that effective counter-measures rely on automated data mining techniques. Our work makes two contributions to this problem. First, we introduced a method for automatically classifying hate speech on Twitter using deep neural network models (DCNN and MLP) that empirically improve classification accuracy. Second, we carried out a comparative analysis of our models on four publicly available datasets.
  • 19. Future Work
      We plan to extend this work in several ways. First, further fine-tuning of hyperparameters may improve accuracy. Second, we will use user metadata along with the tweets, such as the number of followers, location, account age, and total number of posted, favorited or liked tweets. Third, we will build a hybrid model (DCNN + MLP): tweets are passed through the DCNN model and metadata through the MLP in parallel; the two outputs are then combined and passed through a dense layer and a softmax layer for final classification.
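One possible reading of the proposed hybrid model is sketched below: the feature vector from the text branch and the one from the metadata branch are concatenated, then passed through a dense layer and a softmax. This is speculative, since the slides do not say how the two outputs are combined; concatenation, the toy feature values, the identity weight matrix and the name `hybrid_classify` are all assumptions.

```python
import math

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def dense(x, weights, biases):
    """Affine layer: one output value per row of `weights`."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]

def hybrid_classify(text_features, meta_features, weights, biases):
    """Concatenate both branches (assumed merge), then dense layer + softmax."""
    combined = list(text_features) + list(meta_features)
    return softmax(dense(combined, weights, biases))

# Stand-ins for the DCNN output (2 values) and the metadata MLP output (1 value)
text_features = [0.2, 0.7]
meta_features = [1.0]
weights = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
biases = [0.0, 0.0, 0.0]
probs = hybrid_classify(text_features, meta_features, weights, biases)
```

In a trained model the dense-layer weights would be learned jointly with both branches; the identity matrix here only keeps the example easy to follow.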
  • 21. References
      • Greevy E. and Smeaton A. F. "Classifying racist texts using a support vector machine". In Proceedings of the 27th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '04), pages 468–469, New York, NY, USA, 2004. ACM.
      • Davidson T., Warmsley D., Macy M. and Weber I. "Automated hate speech detection and the problem of offensive language". In Proceedings of the 11th Conference on Web and Social Media. AAAI, 2017.
      • Lozano E., Cedeño J., Castillo G., Layedra F., Lasso H. and Vaca C. "Requiem for online harassers: Identifying racism from political tweets". In 4th IEEE Conference on eDemocracy & eGovernment (ICEDEG), pages 154–160, 2017.
      • Jha A. and Mamidi R. "When does a compliment become sexist? Analysis and classification of ambivalent sexism using Twitter data". In 2nd Workshop on NLP and Computational Social Science, pages 7–16, 2017.
      • Park H. J. and Fung P. "One-step and two-step classification for abusive language detection on Twitter". In ALW1: 1st Workshop on Abusive Language Online, Vancouver, Canada, 2017. Association for Computational Linguistics.
      • Zhang Z., Robinson D. and Tepper J. "Detecting Hate Speech on Twitter Using a Convolution-GRU Based Deep Neural Network". In 15th ESWC conference on Semantic Web, 2018.
      • Waseem Z. and Hovy D. "Hateful symbols or hateful people? Predictive features for hate speech detection on Twitter". In Proceedings of the NAACL Student Research Workshop, pages 88–93. Association for Computational Linguistics, 2016.
      • Kalchbrenner N., Grefenstette E. and Blunsom P. "A Convolutional Neural Network for Modelling Sentences". arXiv:1404.2188v1 [cs.CL], 8 Apr 2014.