Sentiment Analysis

Sentiment Analysis Presentation Transcript

  • Thumbs up? Sentiment Classification using Machine Learning Techniques
    - Bo Pang and Lillian Lee
    - Shivakumar Vaithyanathan
  • What is it?
    Input – raw text on some topic
    Output – opinion (+ve, -ve or neutral)
    It is hard. Why?
    - It determines the opinion of the overall text rather than just the subject of the topic
    - Let's understand the problem
  • We know …
    Web – enormous amount of data
    Topical categorization – active research
  • Rise of blogs, forums …
    Web 2.0 is commonly associated with web applications that facilitate interactive information sharing, interoperability, user-centered design, and collaboration on the World Wide Web (source: Wikipedia)
  • Why is it interesting?
    Represents the voice of a broader audience on a particular topic
    Example: product reviews, movie reviews, book reviews
    Important to business intelligence applications
    - What do people (dis)like about the Nikon D40?
  • What this paper does
    Examines the effectiveness of applying machine learning techniques to the sentiment classification problem
    Challenging – while topics are often identifiable by keywords alone, sentiment can be expressed in a more subtle manner.
  • Dataset : Movie-Review Domain
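    Reasons:
    Large online collections of reviews are available
    Reviewers often summarize their overall sentiment with a machine-extractable rating indicator (such as a number of stars), so there is no need to hand-label data for supervised learning
    Corpus of 752 -ve and 1301 +ve reviews, with a total of 144 reviewers represented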
  • Naïve approach
    Idea: people tend to use certain words to express strong sentiments; produce a list of such words and rely on it to classify text
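The word-list idea above can be sketched as follows. This is a minimal illustration, not the method from the paper: the two word lists, the tokenizer, and the function name `classify_by_word_list` are all assumptions made for this example.

```python
import re

# Hypothetical word lists, chosen for illustration only.
POSITIVE_WORDS = {"love", "wonderful", "best", "great", "superb"}
NEGATIVE_WORDS = {"bad", "worst", "stupid", "waste", "boring"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def classify_by_word_list(text):
    """Count list hits and return the label with more hits, or 'tie'."""
    tokens = tokenize(text)
    pos = sum(t in POSITIVE_WORDS for t in tokens)
    neg = sum(t in NEGATIVE_WORDS for t in tokens)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "tie"
```

A review that contains no listed word, or equally many from each list, comes back as a tie, which is one reason hand-built lists are only a baseline.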
  • Machine Learning methods
    Let {f1, f2, …, fm} be a predefined set of m features that can appear in a document. Example: the word “still” or the bigram “really stinks”
    ni(d) – number of times fi occurs in document d
    Document vector(d) = (n1(d), n2(d), …, nm(d))
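As a rough sketch of the document vector defined above, the snippet below counts how often each feature occurs in a document. The feature list `FEATURES`, the whitespace tokenization, and the function name `document_vector` are assumptions for illustration.

```python
from collections import Counter

# Hypothetical feature set: two unigrams and one bigram.
FEATURES = ["still", "really stinks", "great"]


def document_vector(text, features=FEATURES):
    """Return (n1(d), ..., nm(d)): the count of each feature in the text."""
    tokens = text.lower().split()
    unigrams = Counter(tokens)
    bigrams = Counter(" ".join(pair) for pair in zip(tokens, tokens[1:]))
    counts = unigrams + bigrams
    # A Counter returns 0 for features that never occur.
    return [counts[f] for f in features]
```

For the text "this movie really stinks and still it really stinks" the vector is `[1, 2, 0]`.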
  • Naïve Bayes
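    Assign to a given document d the class c* = argmax_c P(c | d)
    Naïve Bayes rule: P_NB(c | d) = [ P(c) · ∏i P(fi | c)^(ni(d)) ] / P(d), with the product taken over i = 1..m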
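A Naïve Bayes classifier scores each class by its prior times the product of per-feature likelihoods, and picks the highest-scoring class. The sketch below is one possible implementation over unigram tokens. It assumes add-one smoothing, works in log space, and drops P(d) because it is the same for every class; the function names are hypothetical.

```python
import math
from collections import Counter


def train_naive_bayes(docs, labels):
    """docs: list of token lists; labels: the class of each document."""
    vocab = {t for d in docs for t in d}
    class_counts = Counter(labels)
    token_counts = {c: Counter() for c in class_counts}
    for tokens, c in zip(docs, labels):
        token_counts[c].update(tokens)
    log_prior = {c: math.log(n / len(labels)) for c, n in class_counts.items()}
    log_likelihood = {}
    for c, counts in token_counts.items():
        # Add-one smoothing (an assumption of this sketch).
        total = sum(counts.values()) + len(vocab)
        log_likelihood[c] = {t: math.log((counts[t] + 1) / total) for t in vocab}
    return log_prior, log_likelihood


def predict(tokens, log_prior, log_likelihood):
    """Return argmax over classes of log P(c) + sum of log P(token | c)."""
    def score(c):
        # Tokens never seen in training are skipped.
        return log_prior[c] + sum(
            log_likelihood[c][t] for t in tokens if t in log_likelihood[c]
        )
    return max(log_prior, key=score)
```

A token that occurs several times in the document is added several times to the sum, which corresponds to raising its likelihood to the power of its count.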
  • Maximum Entropy
    The idea is to make the fewest assumptions about the data while still being consistent with it
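The slide does not show a formula, so the following is an assumption: the exponential form commonly used for maximum entropy text classification, written with the same features and counts n_i(d) as above. Here Z(d) is a normalization term, the λ are learned feature weights, and F is a feature/class indicator function.

```latex
P_{ME}(c \mid d) = \frac{1}{Z(d)} \exp\Big( \sum_{i} \lambda_{i,c} \, F_{i,c}(d, c) \Big),
\qquad
F_{i,c}(d, c') =
\begin{cases}
1 & \text{if } n_i(d) > 0 \text{ and } c' = c \\
0 & \text{otherwise}
\end{cases}
```

A large weight λ for a feature and class means that the feature is treated as a strong indicator of that class.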
  • Support Vector Machines(SVM)
    Large-margin, non-probabilistic classifiers, in contrast to Naïve Bayes and Maximum Entropy
    Letting cj ∈ {1, −1} (corresponding to +ve, -ve) be the correct class of document dj, the solution can be written as w = Σj αj cj dj, with αj ≥ 0
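For a linear SVM, the separating hyperplane can be written as a weighted sum of training documents, w = Σj αj cj dj, and a new document is classified by which side of the hyperplane it falls on. The sketch below only illustrates that last step: the α values are made up rather than obtained by solving the optimization problem, and the bias term and function names are assumptions.

```python
def weight_vector(alphas, classes, docs):
    """Compute w = sum_j alpha_j * c_j * d_j for document vectors d_j."""
    dim = len(docs[0])
    w = [0.0] * dim
    for alpha, c, d in zip(alphas, classes, docs):
        for i in range(dim):
            w[i] += alpha * c * d[i]
    return w


def classify(w, d, bias=0.0):
    """Return 1 (+ve) or -1 (-ve) depending on the side of the hyperplane."""
    score = sum(wi * di for wi, di in zip(w, d)) + bias
    return 1 if score >= 0 else -1
```

Only documents with αj > 0 (the support vectors) contribute to w.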
  • Evaluations
    Randomly selected 700 positive, 700 negative sentiment documents
    Automatically removed rating indicators, extracted textual information from original HTML
    Added the tag NOT_ to every word between a negation word (“not”, “isn’t”) and the first punctuation mark following it.
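The negation tagging step above can be sketched as below. The slide names only “not” and “isn’t” as negation words; the third entry in `NEGATION_WORDS`, the tokenizer regex, and the function name are assumptions for illustration.

```python
import re

# "not" and "isn't" come from the slide; "didn't" is an assumed addition.
NEGATION_WORDS = {"not", "isn't", "didn't"}
PUNCTUATION = re.compile(r"^[.,:;!?]$")


def add_negation_tags(text):
    """Prefix NOT_ to each word between a negation word and the next punctuation mark."""
    tokens = re.findall(r"[\w']+|[.,:;!?]", text.lower())
    tagged, negating = [], False
    for tok in tokens:
        if PUNCTUATION.match(tok):
            negating = False  # punctuation ends the negated span
            tagged.append(tok)
        elif negating:
            tagged.append("NOT_" + tok)
        else:
            tagged.append(tok)
            if tok in NEGATION_WORDS:
                negating = True
    return tagged
```

With this tagging, "like" and "NOT_like" become distinct features, so a unigram model can tell "I like it" apart from "I didn't like it".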
  • Results
  • Conclusion
    Unigram presence information turned out to be the most effective
    The superiority of presence information in comparison to feature frequency indicates a difference between sentiment and topic categorization.
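To make the distinction between presence and frequency concrete: a frequency vector records how many times each feature occurs, while a presence vector records only whether it occurs at all. A minimal sketch, with a hypothetical function name:

```python
def to_presence(count_vector):
    """Turn feature counts into binary presence indicators (1 if count > 0)."""
    return [1 if n > 0 else 0 for n in count_vector]
```

For example, the count vector `[3, 0, 1]` becomes `[1, 0, 1]`.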