Twitter Sentiment Analysis using
Python and NLTK
Presentation by:
ASHWIN PERTI,
Department of IT
Sentiment Analysis using PYTHON
The purpose of this sentiment analysis is:
● To automatically classify a tweet as a
positive tweet
OR
● a negative tweet, according to its sentiment
Sentiment Analysis using PYTHON
● The classifier needs to be trained:
● We need a list of manually classified tweets.
Positive Tweets
● I love this car
● This view is amazing
● I feel great this morning
● I am so excited about the concert
● He is my best friend
Negative Tweets
● I do not like this car
● This view is horrible
● I feel tired this morning
● I am not looking forward to the concert
● He is my enemy
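The two hand-labelled lists above can be held as (text, label) pairs and then tokenised. The sketch below is one possible representation, not necessarily the presenter's. The variable names are illustrative, and the choice to lowercase and to drop words shorter than three characters is an assumption, inferred from the feature names shown later in the deck, which contain no one- or two-letter words.

```python
# Manually classified tweets, copied from the slides.
pos_tweets = [
    ("I love this car", "positive"),
    ("This view is amazing", "positive"),
    ("I feel great this morning", "positive"),
    ("I am so excited about the concert", "positive"),
    ("He is my best friend", "positive"),
]
neg_tweets = [
    ("I do not like this car", "negative"),
    ("This view is horrible", "negative"),
    ("I feel tired this morning", "negative"),
    ("I am not looking forward to the concert", "negative"),
    ("He is my enemy", "negative"),
]

# Tokenise: lowercase, split on whitespace, keep words of 3+ characters.
# (The length filter is an assumption, not something the slides state.)
tweets = []
for text, sentiment in pos_tweets + neg_tweets:
    words = [w.lower() for w in text.split() if len(w) >= 3]
    tweets.append((words, sentiment))

print(tweets[0])  # (['love', 'this', 'car'], 'positive')
```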
Test Tweets
● TEST SET – to assess the accuracy of the
trained classifier
● I feel happy this morning. positive
● Larry is my friend. positive
● I do not like that man. negative
● My house not great. negative
● Your song annoying. negative
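Assessing the classifier on this test set comes down to counting how many predicted labels match the manual ones. The sketch below shows that calculation only. The `accuracy` helper and the `always_positive` predictor are hypothetical names introduced for illustration, and the predictor is a deliberately trivial placeholder, not the trained classifier.

```python
# Test tweets with their manual labels, as listed on the slide.
test_tweets = [
    ("I feel happy this morning", "positive"),
    ("Larry is my friend", "positive"),
    ("I do not like that man", "negative"),
    ("My house not great", "negative"),
    ("Your song annoying", "negative"),
]

def accuracy(classify, labelled_tweets):
    """Fraction of tweets whose predicted label equals the manual label."""
    correct = sum(1 for text, label in labelled_tweets if classify(text) == label)
    return correct / len(labelled_tweets)

# Placeholder predictor (NOT the trained classifier): always answers "positive".
def always_positive(text):
    return "positive"

print(accuracy(always_positive, test_tweets))  # 0.4, i.e. 2 of 5 correct
```

NLTK also provides `nltk.classify.accuracy(classifier, gold)`, which performs the same comparison for a trained classifier and a list of (featureset, label) pairs.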
CLASSIFIER
● The list of word features needs to be extracted
from the tweets.
● It is a list of every distinct word, ordered by
frequency of appearance.
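A frequency-ordered list of distinct words can be built with a simple counter. The sketch below uses `collections.Counter` from the standard library as a stand-in (`nltk.FreqDist` offers the same counting interface). The four tokenised tweets are a shortened sample of the training data, and the variable names are illustrative.

```python
from collections import Counter

# A shortened sample of the tokenised training tweets.
tweets = [
    (["love", "this", "car"], "positive"),
    (["this", "view", "amazing"], "positive"),
    (["not", "like", "this", "car"], "negative"),
    (["this", "view", "horrible"], "negative"),
]

# Count every word across all tweets, regardless of label.
counts = Counter(word for words, _ in tweets for word in words)

# Distinct words, most frequent first.
word_features = [word for word, _ in counts.most_common()]

print(word_features[0])    # 'this' appears in all four sample tweets
print(len(word_features))  # 8 distinct words
```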
CLASSIFIER – Feature Extractor
● The feature extractor decides which features are relevant.
● The one we are going to use returns a
dictionary indicating which words are contained
in the input passed.
● INPUT - tweet
classifier = nltk.NaiveBayesClassifier.train(training_set)
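A feature extractor of the kind described above can be sketched as a function from a tokenised tweet to a dictionary of `contains(word)` flags. The key format follows the output shown later in the deck. The function body, the shortened `word_features` list, and the sample tweets are assumptions made for illustration.

```python
# Shortened, illustrative feature list (the full one covers all training words).
word_features = ["this", "car", "view", "love", "amazing", "not", "like", "horrible"]

def extract_features(document):
    """Map a tokenised tweet to {'contains(word)': bool} for every known word."""
    document_words = set(document)
    return {f"contains({word})": (word in document_words) for word in word_features}

# Each training example becomes a (feature dictionary, label) pair, which is
# the shape nltk.NaiveBayesClassifier.train() expects for training_set.
tweets = [
    (["love", "this", "car"], "positive"),
    (["this", "view", "horrible"], "negative"),
]
training_set = [(extract_features(words), label) for words, label in tweets]

features = extract_features("I love this view".lower().split())
print(features["contains(love)"], features["contains(car)"])  # True False
```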
Naive Bayes Classifier
● It uses the prior probability of each label, which is
the frequency of each label in the training set, and the
contribution from each feature.
● In our case, the frequency of each label is the same
for 'positive' and 'negative'.
● The word 'amazing' appears in 1 of the 5 positive
tweets and in none of the negative tweets.
● This means that the likelihood of the 'positive' label
will be multiplied by roughly 0.2 (1/5, before smoothing)
when this word is seen as part of the input.
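The numbers quoted above can be reproduced by hand. The sketch below computes the raw, unsmoothed frequencies from the training tweets. The helper name `fraction_containing` is hypothetical, and NLTK applies smoothing by default, so the estimates it uses internally differ slightly from these raw ratios.

```python
positive = [
    "I love this car",
    "This view is amazing",
    "I feel great this morning",
    "I am so excited about the concert",
    "He is my best friend",
]
negative = [
    "I do not like this car",
    "This view is horrible",
    "I feel tired this morning",
    "I am not looking forward to the concert",
    "He is my enemy",
]

# Prior probability of a label = its share of the training set.
prior_positive = len(positive) / (len(positive) + len(negative))

def fraction_containing(word, tweets):
    """Raw (unsmoothed) share of tweets that contain the given word."""
    return sum(word in t.lower().split() for t in tweets) / len(tweets)

print(prior_positive)                            # 0.5
print(fraction_containing("amazing", positive))  # 0.2
print(fraction_containing("amazing", negative))  # 0.0
```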
CLASSIFY
● Now that we have our classifier trained,
● we can classify a tweet and
● see what the sentiment type output is.
● Our classifier is able to detect that this tweet
has a positive sentiment
● because of the word 'friend',
● which is associated with the positive tweet
● 'He is my best friend'
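The whole flow, from training data to a predicted label, can be sketched end to end as follows. The slide does not show the tweet being classified, so 'Larry is my friend' from the test set is used here as an assumption. The `tokenize` helper and its lowercase, three-character filter are likewise assumptions rather than the presenter's confirmed code.

```python
import nltk

labelled_tweets = [
    ("I love this car", "positive"),
    ("This view is amazing", "positive"),
    ("I feel great this morning", "positive"),
    ("I am so excited about the concert", "positive"),
    ("He is my best friend", "positive"),
    ("I do not like this car", "negative"),
    ("This view is horrible", "negative"),
    ("I feel tired this morning", "negative"),
    ("I am not looking forward to the concert", "negative"),
    ("He is my enemy", "negative"),
]

def tokenize(text):
    """Assumed preprocessing: lowercase, keep words of 3+ characters."""
    return [w.lower() for w in text.split() if len(w) >= 3]

# Every distinct word seen in training (order does not matter to the classifier).
word_features = sorted({w for text, _ in labelled_tweets for w in tokenize(text)})

def extract_features(document):
    document_words = set(document)
    return {f"contains({w})": (w in document_words) for w in word_features}

training_set = [
    (extract_features(tokenize(text)), label) for text, label in labelled_tweets
]
classifier = nltk.NaiveBayesClassifier.train(training_set)

tweet = "Larry is my friend"  # assumed example, taken from the test set
sentiment = classifier.classify(extract_features(tokenize(tweet)))
print(sentiment)
```

Calling `classifier.show_most_informative_features()` afterwards lists the features that separate the two labels most strongly, which is one way to check that 'friend' is pulling the prediction towards the positive label.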
print(extract_features(tweet2.split()))
{'contains(looking)': False,
 'contains(feel)': False,
 'contains(the)': False,
 'contains(excited)': False,
 'contains(about)': False,
 'contains(great)': False,
 'contains(horrible)': False,
 'contains(car)': False,
 'contains(this)': False,
 'contains(best)': False,
 'contains(friend)': True,
 'contains(concert)': False,
 'contains(forward)': False,
 'contains(view)': False,
 'contains(tired)': False,
 'contains(like)': False,
 'contains(love)': False,
 'contains(amazing)': False,
 'contains(enemy)': False,
 'contains(not)': True,
 'contains(morning)': False}
