Sentiment Analysis
Natural Language Processing
Emory University

Jinho D. Choi
Sentiment Analysis
A task of identifying the sentiment of a document
(a sentence, tweet, blog post, article, etc.).
Identifying sentiments of certain aspects.
Camera: lens, resolution, price, size, brand
Reviews & Ratings
https://www.google.com/shopping
http://ratemyprofessors.com
Movie Reviews
http://www.cs.cornell.edu/people/pabo/movie-review-data/
http://www.rottentomatoes.com
the film provides some great
insight into the neurotic
mindset of all comics -- even
those who have reached the
absolute top of the game .
Positive
most of the problems with the
film don't derive from the
screenplay , but rather the
mediocre performances by most
of the actors involved
Negative
Dictionary-based Approach
Create lists of positive/negative words (phrases).
Negative: suck, terrible, awful, unwatchable, hideous
Positive: dazzling, brilliant, phenomenal, excellent, fantastic
Sentiment = |Positive words| - |Negative words|
Around 65% accuracy!
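The scoring rule above can be sketched in a few lines of Python. This is only a minimal illustration, not the method behind the 65% figure: the word lists are the ten words from the slide, and the tokenizer, the function names, and the "Neutral" label for a tie are assumptions added here.

```python
import re

# Word lists copied from the slide; a real lexicon would be far larger.
POSITIVE = {"dazzling", "brilliant", "phenomenal", "excellent", "fantastic"}
NEGATIVE = {"suck", "terrible", "awful", "unwatchable", "hideous"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens (a simplifying assumption)."""
    return re.findall(r"[a-z']+", text.lower())


def sentiment_score(text):
    """Return |positive words| - |negative words| for the text."""
    tokens = tokenize(text)
    pos = sum(1 for t in tokens if t in POSITIVE)
    neg = sum(1 for t in tokens if t in NEGATIVE)
    return pos - neg


def classify(text):
    """Map the score to a label; the tie case is an assumption, not from the slide."""
    score = sentiment_score(text)
    if score > 0:
        return "Positive"
    if score < 0:
        return "Negative"
    return "Neutral"
```

A review that contains none of the listed words scores 0, which hints at why a small hand-built dictionary has limited accuracy.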
Machine Learning Approach
N-gram models
the film provides some great insight into the neurotic mindset of all
comics -- even those who have reached the absolute top of the game .
1-gram = {the, film, provides, …}
2-gram = {the_film, film_provides, provides_some, …}
[Figure: binary feature vector with one dimension per n-gram; the dimensions for the, film, provides, the_film, film_provides, and provides_some are set to 1]
Accuracy (%): Naive Bayes (NB): 80.6, Maximum Entropy (ME): 80.8, SVM: 82.7
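As a rough sketch of how such features could be built, the Python below extracts 1-grams and 2-grams and turns a document into a binary vector. The function names and the whitespace tokenization are assumptions made for illustration; the slides do not specify an implementation, and the classifiers themselves (NB, ME, SVM) are not shown.

```python
def ngrams(tokens, n):
    """Return the n-grams of a token list, joined with '_' as on the slide."""
    return ["_".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def build_vocabulary(documents):
    """Assign a vector index to every 1-gram and 2-gram seen in the documents."""
    vocab = {}
    for tokens in documents:
        for feature in ngrams(tokens, 1) + ngrams(tokens, 2):
            vocab.setdefault(feature, len(vocab))
    return vocab


def to_binary_vector(tokens, vocab):
    """Set a dimension to 1 if its n-gram occurs in the document, else 0."""
    vector = [0] * len(vocab)
    for feature in ngrams(tokens, 1) + ngrams(tokens, 2):
        if feature in vocab:
            vector[vocab[feature]] = 1
    return vector


# Whitespace tokenization of the start of the example review (an assumption).
tokens = "the film provides some great insight".split()
vocab = build_vocabulary([tokens])
vector = to_binary_vector(tokens, vocab)
```

N-grams that were never seen when the vocabulary was built are simply ignored here; that is one possible design choice among several.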
Machine Learning Approach
N-gram models
the film provides some great insight into the neurotic mindset of all
comics -- even those who have reached the absolute top of the game .
Stop-word
1-gram = {the, film, provides, some, …}
2-gram = {the_film, film_provides, provides_some, …}
TF-IDF
Term frequency
Document frequency
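A minimal sketch of stop-word filtering and TF-IDF weighting follows. The stop-word list is a tiny hypothetical one, and the weight tf × log(N / df) is one common TF-IDF variant chosen here as an assumption; the slide names only the ingredients (term frequency and document frequency), not a specific formula.

```python
import math
from collections import Counter

# Hypothetical, very small stop-word list for illustration only.
STOP_WORDS = {"the", "a", "of", "this", "is", "some"}


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    documents is a list of token lists; weight = tf * log(N / df),
    where tf is the count of the term in the document, df the number
    of documents containing it, and N the number of documents.
    """
    n_docs = len(documents)
    df = Counter()
    for tokens in documents:
        df.update(set(tokens))  # count each term once per document
    weights = []
    for tokens in documents:
        tf = Counter(tokens)
        weights.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return weights


docs = [
    remove_stop_words("the film is great".split()),
    remove_stop_words("the film is awful".split()),
]
weights = tf_idf(docs)
```

With this variant, a term that occurs in every document (here "film") gets weight 0, while terms that distinguish the documents ("great", "awful") get positive weight.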
Challenges
I liked this movie.
I didn’t like this movie. (Negation)
I liked this movie, but not the actors. (Mixture)
Besides a few bad plots, clumsy graphics, and lousy acting, I liked this movie. (Thwart)
This movie is unreal. (Ambiguity)
Parse Tree?
Aspect-based?
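To make two of these challenges concrete, the sketch below runs a naive word-counting scorer on the example sentences. The word lists, the negator list, and the "flip the polarity of the word right after a negator" heuristic are all assumptions introduced for illustration; they are not proposed in the slides, which point toward parse trees and aspect-based analysis instead.

```python
import re

# Hypothetical mini-lexicons chosen to cover the example sentences.
POSITIVE = {"liked", "like"}
NEGATIVE = {"bad", "clumsy", "lousy"}
NEGATORS = {"not", "didn't", "never", "no"}


def tokenize(text):
    """Lowercase, normalize curly apostrophes, and keep word-like tokens."""
    return re.findall(r"[a-z']+", text.lower().replace("\u2019", "'"))


def naive_score(text):
    """Count positive words minus negative words, ignoring context."""
    tokens = tokenize(text)
    return sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)


def negation_aware_score(text):
    """Flip the polarity of a sentiment word that directly follows a negator."""
    tokens = tokenize(text)
    score = 0
    for i, token in enumerate(tokens):
        polarity = (token in POSITIVE) - (token in NEGATIVE)
        if i > 0 and tokens[i - 1] in NEGATORS:
            polarity = -polarity
        score += polarity
    return score
```

The naive scorer gives "I liked this movie." and "I didn't like this movie." the same positive score, and the one-word negation heuristic repairs only that case: the thwarted sentence still scores negative because three negative words outnumber the single "liked".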

CS571: Sentiment Analysis
