Sentiment Analysis


Slides for my college project on sentiment analysis of cellphone reviews. The system is to be built in Python using NLTK.


  1. Sentiment Analysis from Cellphone Reviews
     Sagar Ahire | 155
     Preeti Singh | 178
  2. What is Sentiment Analysis?
     • Takes a block of text as input
     • Determines the sentiment expressed in it
     • “Sentiment” refers to whether the author’s opinion is positive or negative
  3. Disciplines Involved
     • Natural Language Processing
     • Data Mining
     • Artificial Intelligence
  4. What Sentiment Analysis is NOT
     • Does NOT use images anywhere (that is “emotion detection”)
     • Does NOT aim at evaluating the product itself, just the sentiment expressed by the reviewer
  5. Why Sentiment Analysis is challenging
     • Keywords are usually not direct
       “This phone is as modern as the one owned by Alexander Graham Bell”
     • Opinions expressed may belong to other people
       “Many people say iPhones are better than Androids”
     • Order effects
       “This could have revolutionized phones forever, but the bundled OS makes it an ultimate letdown”
     • Colloquial and domain-specific phrases
       “The phone runs a 1.2 GHz dual core processor”
  6. Project Overview
     • Aims to perform sentiment analysis on cellphone reviews
     • Rates the sentiment on a scale of 1 to 5 stars
  7. Inner Workings
     • Uses a corpus of several cellphone reviews (currently 33)
     • Trains a classifier using features, which may be:
       – Unigrams (occurrences of single words)
       – Bigrams (occurrences of pairs of adjacent words)
       – Adjectives only, etc.
     • Uses the classifier to classify unknown reviews
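The unigram and bigram features listed above can be sketched in a few lines of plain Python. This is only an illustration under assumptions: the helper names (`tokenize`, `unigram_features`, `bigram_features`), the `has(...)` key format, and the sample review are all hypothetical and not taken from the project. The "adjectives only" variant would additionally need a part-of-speech tagger and is left out.

```python
import re


def tokenize(text):
    """Lowercase the text and keep runs of letters as tokens (a simplifying assumption)."""
    return re.findall(r"[a-z]+", text.lower())


def unigram_features(tokens):
    """Map each distinct word to True, recording that it occurs in the review."""
    return {f"has({word})": True for word in set(tokens)}


def bigram_features(tokens):
    """Map each distinct pair of adjacent words to True."""
    return {f"has({a} {b})": True for a, b in zip(tokens, tokens[1:])}


# Hypothetical review text, used only to exercise the helpers.
review = "The battery life is great but the screen is dull"
tokens = tokenize(review)
uni = unigram_features(tokens)
bi = bigram_features(tokens)
```

Dictionaries of this shape (feature name mapped to a value) are the input format that NLTK's classifiers expect, so either feature set could be passed on to training as is.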
  8. Steps
  9. Why Python?
     • Less code, more productivity
     • Flexible paradigms (functional, procedural, object-oriented, all in one)
     • Fast development cycle
     • Wide range of modules
  10. Diving In…
     • Modules used:
       – Python Standard Library (random, sys, etc.)
       – nltk
     • Classifiers used:
       – Naïve Bayes
  11. Diving In… The Algorithm (Unigram Occurrences)
     1. Take the entire corpus as input
     2. Create a list ‘l’ of all documents, each labeled by its category (i.e., number of stars)
     3. Extract the ‘n’ most frequent words in the entire corpus, cleaning up duplicates and non-alphabetic words
  12. Diving In… The Algorithm (Unigram Occurrences)
     4. For every document in l:
        i. Create a dictionary d[l]
        ii. For each of the n frequent words, put a value in d[l] indicating presence or absence
     5. Divide the dictionary into a training set and a testing set
  13. Diving In… The Algorithm (Unigram Occurrences)
     6. Train a Naïve Bayes Classifier using the training set
     7. Test the classifier using the testing set and report the accuracy
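Steps 1 to 7 of the unigram algorithm might look roughly like the sketch below. It is not the project's actual code: the toy corpus, the function names, the 75/25 split, and the use of `collections.Counter` in place of an NLTK frequency distribution are all assumptions made for illustration. Only the last function touches NLTK, through `nltk.NaiveBayesClassifier.train` and `nltk.classify.accuracy`, and it imports the library lazily so the feature-building steps run without it.

```python
import random
import re
from collections import Counter

# Hypothetical stand-in for the review corpus: (text, stars) pairs (steps 1-2).
TOY_CORPUS = [
    ("great phone with a great camera", 5),
    ("good battery and good screen", 4),
    ("average phone nothing special", 3),
    ("poor battery and slow software", 2),
    ("terrible phone and terrible screen", 1),
    ("great screen and good battery", 5),
    ("slow and poor camera", 2),
    ("terrible battery", 1),
]


def words_of(text):
    """Return the purely alphabetic words of a text, lowercased."""
    return re.findall(r"\b[a-z]+\b", text.lower())


def top_words(labeled_docs, n):
    """Step 3: the n most frequent alphabetic words across the corpus."""
    counts = Counter()
    for text, _stars in labeled_docs:
        counts.update(words_of(text))
    return [word for word, _count in counts.most_common(n)]


def document_features(text, vocabulary):
    """Step 4: record presence or absence of each frequent word."""
    present = set(words_of(text))
    return {f"contains({word})": (word in present) for word in vocabulary}


def build_feature_sets(labeled_docs, n):
    """Pair each document's feature dictionary with its star label."""
    vocabulary = top_words(labeled_docs, n)
    return [(document_features(text, vocabulary), stars)
            for text, stars in labeled_docs]


def split(feature_sets, test_fraction=0.25, seed=0):
    """Step 5: shuffle, then divide into training and testing sets."""
    shuffled = list(feature_sets)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]


def train_and_evaluate(train_set, test_set):
    """Steps 6-7: train Naive Bayes and report accuracy (requires nltk)."""
    import nltk  # imported here so the steps above work without NLTK installed
    classifier = nltk.NaiveBayesClassifier.train(train_set)
    return classifier, nltk.classify.accuracy(classifier, test_set)


feature_sets = build_feature_sets(TOY_CORPUS, n=10)
train_set, test_set = split(feature_sets)
```

With NLTK installed, `train_and_evaluate(train_set, test_set)` returns the trained classifier and its accuracy on the held-out reviews. On a corpus this small the accuracy figure says very little; the point is only the shape of the pipeline.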
  14. Next Steps
     • Investigating the Maximum Entropy Classifier
     • Refining feature choice
       – Negation Tagging
       – Synonyms
     • Investigating Regression techniques
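The slides do not say how negation tagging would be done. One common approach, shown here purely as an assumption, is to append a `_NEG` suffix to every word between a negation word and the next punctuation mark, so that "good" and "not good" produce different unigram features. The negation word list, the suffix, and the function name are all illustrative choices.

```python
import re

# Illustrative choices, not taken from the slides.
NEGATIONS = {"not", "no", "never", "cannot"}
CLAUSE_END = {".", ",", ";", "!", "?", ":"}


def tag_negation(tokens):
    """Append _NEG to words that follow a negation, until the clause ends."""
    tagged = []
    negated = False
    for token in tokens:
        if token in CLAUSE_END:
            negated = False          # punctuation closes the negated span
            tagged.append(token)
        elif token.lower() in NEGATIONS or token.lower().endswith("n't"):
            negated = True           # start tagging from the next word
            tagged.append(token)
        else:
            tagged.append(token + "_NEG" if negated else token)
    return tagged


# Hypothetical review sentence, split into words and punctuation marks.
sentence = "The camera is not good, but the screen is sharp."
tokens = re.findall(r"[\w']+|[.,;!?:]", sentence)
tagged = tag_negation(tokens)
```

Here `good` becomes `good_NEG` while `sharp` is left alone, because the comma ends the negated span. The tagged tokens can then feed the same unigram feature extraction as before.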
  15. Additional Applications of Sentiment Analysis
     • Filtering of spam or abusive e-mails
     • Gauging the mood of people in a particular network
     • Government intelligence
     • Psychological evaluation
     • Recommendation systems
     • Display of ads on webpages
  16. “Sentiment is the poetry of the imagination.”
     - Alphonse de Lamartine