Presented by:
Aman Chaudhary
Scholar no: 202112210
SENTIMENT ANALYSIS
Guided by:
Prof. Namita Tiwari
Prof. Pragati Agrawal
MANIT, Bhopal
I. INTRODUCTION
 Twitter, with over 330 million monthly active
users, has become a goldmine for organizations
and individuals who have a strong political,
social, and economic interest in maintaining and
enhancing their reputation.
 Sentiment analysis gives these organizations
the ability to survey various social media sites
in real time. Text sentiment analysis is an
automatic process for determining whether a text
segment contains objective or opinionated
content, and it can furthermore determine the
text's sentiment polarity.
I. INTRODUCTION
Cont…
 The goal of Twitter sentiment classification is to
automatically determine whether a tweet's
sentiment polarity is negative or positive.
 Most existing methods of Twitter sentiment
classification follow the approach proposed by
Pang et al. [1]
 and apply machine learning algorithms to build a
classifier from tweets with manually annotated
sentiment polarity labels. In recent years, there
has been growing interest in using deep
learning techniques, which can enhance
classification accuracy.
I. INTRODUCTION
Cont…
The rest of this presentation is organized as follows:
 Section II discusses related work on this topic.
 Section III presents the Bag-of-Words model and the Naïve Bayes classifier.
 Section IV is a discussion.
 Section V concludes the presentation.
II. RELATED WORK
 Most existing studies of Twitter sentiment analysis rely on
supervised methods [2]–[5]. Supervised
methods are based on training classifiers (such as Naive
Bayes, Support Vector Machines, or Random Forests) on
various combinations of features, such as Part-Of-Speech
(POS) tags, word N-grams, and tweet context
features such as hashtags, retweets, emoticons, capitalized
words, etc.
 The sentiment of a tweet is implicitly associated with the
semantics of its context. In this work, we present a semantic
feature for sentiment analysis: a word-vector
contextual representation of each word in a tweet, which can
capture the deep and implicit semantic relations
between the words of a tweet.
 Deep learning methods are now well established in
machine learning. They have been especially successful
for image and voice processing tasks. Recently, such
methods have begun to overtake traditional sparse, linear
models for natural language processing.
III. SENTIMENT ANALYSIS
Natural Language Toolkit (NLTK)
Data collection is required for sentiment analysis.
Suppose we are collecting the data from Twitter.
In what form is the data available to us?
Text (in most cases).
Problem: how do we convert the text data into numbers, because that
is what a computer understands?
Equivalently, we want to convert it into a d-dimensional vector.
Suppose we have converted the text data into vectors:
how do we tell whether two tweets (two texts) are similar?
Cont…
 reviews: (r1, r2, r3, ..., rN)
 vectors: (v1, v2, v3, ..., vN)
 text → [rules] → d-dim vector
Cont…
 If
similarity(r1, r2) > similarity(r1, r3),
where similarity means similarity in English (in meaning), then we want
dist(v1, v2) < dist(v1, v3).
If r1 and r2 are more similar, the distance between
their vectors should be smaller.
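As a quick numeric sketch of this idea (the vectors below are made up for illustration, not derived from actual reviews):

import numpy as np

# hypothetical 4-dimensional vectors for reviews r1, r2, r3
v1 = np.array([1, 1, 0, 1])
v2 = np.array([1, 1, 0, 0])
v3 = np.array([0, 0, 1, 1])

print(np.linalg.norm(v1 - v2))  # 1.0   -> r1 and r2 are close
print(np.linalg.norm(v1 - v3))  # ~1.73 -> r1 and r3 are farther apart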
Cont…
 We need a method to convert
text → d-dim vector
such that similar texts end up closer
geometrically.
Cont…
 Techniques used:
1. BoW (Bag of Words)
2. TF-IDF
3. Word2Vec (W2V)
4. Average Word2Vec
5. TF-IDF weighted Word2Vec
BOW
 r1: This pasta is very tasty and affordable
 r2: This pasta is not tasty and is affordable
 r3: This pasta is delicious and cheap
 r4: Pasta is tasty and pasta tastes good
 .
 .
 .
 rN: N reviews
Cont…
 1. Construct a dictionary: the set of all the
words across your reviews, e.g.
 {This, pasta, is, very, tasty, and, affordable, ...}
r1: This pasta is very tasty and affordable
Its vector representation V1 stores, at each index, the number of
times the corresponding word occurs:
word (dims 1 ... 10):  a   an   the  ...  pasta  tasty  ...  this
V1:                    0    0    0   ...    1      1    ...   1
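As a sketch, the same dictionary and count vectors can be built with scikit-learn's CountVectorizer (the slides construct them by hand; the library call is just a convenience):

from sklearn.feature_extraction.text import CountVectorizer

reviews = [
    "This pasta is very tasty and affordable",    # r1
    "This pasta is not tasty and is affordable",  # r2
    "This pasta is delicious and cheap",          # r3
    "Pasta is tasty and pasta tastes good",       # r4
]

vectorizer = CountVectorizer()              # builds the dictionary
X = vectorizer.fit_transform(reviews)       # one sparse row per review
print(vectorizer.get_feature_names_out())   # the dictionary, sorted
print(X.toarray()[0])                       # V1: word counts for r1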
Cont..
 Construct a vector of size 'd'.
 Each index corresponds to one word.
 Each word is a different dimension.
V1 is sparse:
most of the elements of the vector are zero.
This is how text is converted to a vector.
Cont…
 Length ||v1 − v2|| = √(1² + 1² + 1² + 1² + ...)
Disadvantages:
1. It does not handle small but important changes.
Example: "very tasty" and
"not tasty" differ in only one word each, so their vectors are close
even though the meanings are opposite.
Binary Bag of Words
 Also called boolean BOW.
 Record whether each word occurs or not:
if the word occurs, put 1; else put 0.
word:  w1  w2  w3  ...  wd
V:      1   0   1  ...
1: if the word occurs at least once
0: otherwise
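The same vectorizer gives boolean BOW with a single flag (a minimal sketch):

from sklearn.feature_extraction.text import CountVectorizer

reviews = ["This pasta is very tasty and affordable",
           "This pasta is not tasty and is affordable"]

# binary=True stores 1 if the word occurs at least once, else 0
vectorizer = CountVectorizer(binary=True)
X = vectorizer.fit_transform(reviews).toarray()
print(X[0])
print(X[1])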
Cont…
 ||v1-v2|| = no. of words differing
#some words like ‘and’ ,‘is’, ‘this’ is not useful
called stopwords
Pasta ,very , affordable are useful
Lemmitization: This pasta is very tasty
This is the best in New York
Break the sentense into words
Cont…
 How do you break the sentence into words?
It is language and context dependent (e.g., "New York" should stay one token).
If we remove these stop words, the BoW vector
becomes smaller and more meaningful.
Sometimes removing stop words leads to
ridiculous results (e.g., dropping "not" turns "not tasty" into
"tasty"), so we should not always remove them.
Cont…
Q. Where do we get the list of stop words?
From a predefined library in Python:
from nltk.corpus import stopwords
stop = set(stopwords.words("english"))
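A usage sketch, filtering our running example tweet with the list above:

import nltk
nltk.download("stopwords")   # one-time download of the word lists
from nltk.corpus import stopwords

stop = set(stopwords.words("english"))
tweet = "This pasta is very tasty and affordable"
tokens = [w for w in tweet.lower().split() if w not in stop]
print(tokens)   # stopwords such as 'this', 'is', 'and' are gone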
Cont…
 Steps to follow:
1. Remove the stop words
2. Convert to lower case
3. Stemming (see the sketch after this list)
4. Lemmatization
5. Account for the semantic meaning of words:
'tasty' and 'delicious' have very similar meanings;
Word2Vec takes this into consideration.
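A small sketch of steps 3 and 4 with NLTK's Porter stemmer and WordNet lemmatizer:

import nltk
nltk.download("wordnet")   # one-time download needed by the lemmatizer
from nltk.stem import PorterStemmer, WordNetLemmatizer

stemmer = PorterStemmer()
lemmatizer = WordNetLemmatizer()

print(stemmer.stem("tastes"))                   # 'tast' (crude suffix chopping)
print(lemmatizer.lemmatize("tastes", pos="v"))  # 'taste' (dictionary-based)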
Cont…
Uni-gram / Bi-gram / n-gram
Unigram:
each word is considered a dimension.
this | is | pasta | very | tasty | affordable
Bigram:
every 2 consecutive words form a dimension.
This pasta | pasta is | is very | very tasty | tasty and | and affordable
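Both variants are one parameter away in CountVectorizer (a sketch using its ngram_range option):

from sklearn.feature_extraction.text import CountVectorizer

text = ["This pasta is very tasty and affordable"]

unigrams = CountVectorizer(ngram_range=(1, 1)).fit(text)
bigrams = CountVectorizer(ngram_range=(2, 2)).fit(text)

print(unigrams.get_feature_names_out())  # each word is one dimension
print(bigrams.get_feature_names_out())   # each 2-word window is one dimension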
Cont…
 Text Pre-Processing
1. Begin by removing HTML tags.
2. Remove punctuation and the limited set of special
characters like , or . or # etc.
3. Check that the word is made up of English letters and is
not alphanumeric.
4. Check that the length of the word is greater than
2 (it has been observed that there are no
2-letter adjectives).
5. Convert the words to lower case.
6. Remove stop words.
Cont…
 7.finally stemming the word.
After which we collect the words used to describe
positive & negative reviews.
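A consolidated sketch of the seven steps above; the helper name clean_text and the exact regular expressions are our own assumptions:

import re
import nltk
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer

nltk.download("stopwords")
stop = set(stopwords.words("english"))
stemmer = PorterStemmer()

def clean_text(text):
    text = re.sub(r"<[^>]+>", " ", text)          # 1. remove HTML tags
    text = re.sub(r"[^a-zA-Z]", " ", text)        # 2-3. keep English letters only
    words = [w.lower() for w in text.split() if len(w) > 2]  # 4-5. length, lower case
    words = [w for w in words if w not in stop]   # 6. remove stop words
    return [stemmer.stem(w) for w in words]       # 7. stem each word

print(clean_text("<br/>This pasta is very tasty and affordable!"))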
Cont…
 Naïve Bayes Theorem
P(class | tweet) = P(tweet | class) · P(class) / P(tweet)
Naïve Bayes additionally assumes the word features are
conditionally independent given the class.
 1. Build a vocabulary (list of words) of all the
words resident in our training data set.
 2. Match tweet content against our vocabulary,
word by word.
 3. Build our word feature vector.
 4. Plug our feature vector into the Naive Bayes
Classifier.
Cont..
 Step 1: Building the vocabulary
 A vocabulary in Natural Language Processing is a
list of all speech segments available to the
model. In our case, this includes all the words
resident in the Training set, as the model
can make use of all of them relatively equally,
at least at this point.
 This is just creating a list of all words we have in
the Training set and breaking it into word features.
Those word features are basically a list of distinct
words, each stored together with its frequency (number
of occurrences in the set).
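A minimal sketch of this step; representing trainingData as a list of (word list, label) pairs is our assumption about the data layout:

import nltk

trainingData = [(["love", "this", "pasta"], "positive"),
                (["hate", "this", "pasta"], "negative")]

def buildVocabulary(data):
    all_words = [w for (words, label) in data for w in words]
    freq = nltk.FreqDist(all_words)   # distinct word -> frequency
    return list(freq.keys())

word_features = buildVocabulary(trainingData)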
Cont..
 Step 2: Matching tweets against our
vocabulary
 This step is crucial: we go through all the
words in our vocabulary (i.e.,
our word_features list), compare every word
against the tweet at hand, and associate a label
with the word as follows:
 label 1 (true): if the vocabulary word is resident in
the tweet;
label 0 (false): if the vocabulary word is not
resident in the tweet.
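A sketch of the matching function; the name extract_features follows NLTK convention, and word_features comes from Step 1:

def extract_features(tweet_words):
    tweet_set = set(tweet_words)
    features = {}
    for word in word_features:
        # True if the vocabulary word is resident in the tweet, else False
        features["contains(%s)" % word] = (word in tweet_set)
    return features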
Cont..
 Step 3: Building our feature vector
 Let's now call the last two functions we have
written. This builds our final feature vector, with
which we can proceed to training.
 The NLTK built-in function apply_features does
the actual feature extraction from our lists. Our
final feature vector is trainingFeatures.
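Continuing the sketch above, the call is one line; apply_features pairs each (word list, label) in trainingData with the output of extract_features:

import nltk

trainingFeatures = nltk.classify.apply_features(extract_features, trainingData)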
Cont..
 Step 4: Training the classifier
 We have finally come to the most important,
and ironically the shortest, part of our task.
Thanks to NLTK, it only takes a single function call
to train the model as a Naive Bayes Classifier,
since the latter is built into the library.
 We're almost done! All we have left is running the
classifier training code
(i.e., nltk.NaiveBayesClassifier.train()) and testing
it. Note that this code could take a few minutes to
execute.
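The training call, continuing the sketch above:

import nltk

# a single call trains the model (may take a few minutes on real data)
NBayesClassifier = nltk.NaiveBayesClassifier.train(trainingFeatures)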
Cont…
 Testing the Model
 Moment of truth! We finish up our work by
running the classifier (i.e., NBayesClassifier) on
the tweets we download from Twitter (during the
practical session) according to our search term,
taking the majority vote of the labels
returned by the classifier, and then outputting the total
positive or negative percentage (i.e., the score) of the
tweets.
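A sketch of this scoring step; searchResults (the downloaded, pre-processed tweets, one word list per tweet) is a placeholder for data fetched in the practical session, and NBayesClassifier and extract_features come from the steps above:

labels = [NBayesClassifier.classify(extract_features(tweet))
          for tweet in searchResults]

positive = 100.0 * labels.count("positive") / len(labels)
print("Positive: %.1f%%, Negative: %.1f%%" % (positive, 100.0 - positive))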
IV. DISCUSSION
 Any Doubts?
V. CONCLUSION
 We use sentiment polarity features based on a
sentiment lexicon, together with n-gram features, as the
sentiment feature vector of the tweet. Our model captures
contextual information and constructs the representation
of the text using the Naïve Bayes approach. Our model
performs on par with state-of-the-art approaches and
the baseline model. We can finally conclude that the Naïve
Bayes approach with pre-trained word vectors performs
well in the task of Twitter sentiment analysis.
References (1/3)
 [1] B. Pang, L. Lee, and S. Vaithyanathan,
‘‘Thumbs up?: Sentiment classification using
machine learning techniques,’’ in Proc. Conf.
Empirical Methods Natural Lang. Process., 2002,
pp. 79–86.
 [2] H. Saif, Y. He, and H. Alani, ‘‘Semantic
sentiment analysis of twitter,’’ in Proc. Semantic
Web-ISWC, 2012, pp. 508–514.
References (2/3)
 [3] M. Hagen, M. Potthast, M. Büchner, and B.
Stein, ‘‘Twitter sentiment detection via ensemble
classification using averaged confidence scores,’’
in Proc. Eur. Conf. Inf. Retrieval, 2015, pp. 741–
754.
 [4] Z. Jianqiang and C. Xueliang, ‘‘Combining
semantic and prior polarity for boosting twitter
sentiment analysis,’’ in Proc. IEEE Int. Conf.
Smart City/SocialCom/SustainCom (SmartCity),
Dec. 2015, pp. 832–837.
References (3/3)
 [5] S. Kiritchenko, X. Zhu, and S. M. Mohammad,
‘‘Sentiment analysis of short informal texts,’’ J.
Artif. Intell. Res., vol. 50, pp. 723–762, Mar. 2014.
 [6] N. F. F. da Silva, E. R. Hruschka, and E. R.
Hruschka, ‘‘Tweet sentiment analysis with
classifier ensembles,’’ Decision Support Syst.,
vol. 66, pp. 170–179, Oct. 2014.