1. Introduction Related work Methodology Proposed Approach Conclusion & Future Work Reference References
Classification of Sentimental Reviews Using
Machine Learning Techniques
ICRTC-2015: 3rd International Conference On Recent Trends In
Computing
Presented At
SRM University
Delhi-NCR Campus, Ghaziabad(U.P)
Ankit Agrawal
Department of Computer Science and Engineering,
National Institute of Technology Rourkela,
Rourkela - 769008, Odisha, India
March 9, 2015
1 / 26
2. Introduction Related work Methodology Proposed Approach Conclusion & Future Work Reference References
Sentiment Analysis
Sentiment mainly refers to feelings, emotions, opinion or atti-
tude (Argamon et al., 2009).
With the rapid increase of world wide web, people often express
their sentiments over internet through social media, blogs, rat-
ing and reviews.
Business owners and advertising companies often employ sen-
timent analysis to discover new business strategies and adver-
tising campaign.
2 / 26
3. Introduction Related work Methodology Proposed Approach Conclusion & Future Work Reference References
Machine Learning Techniques
Machine leaning algorithms are used to classify and predict whether
a document represents positive or negative sentiment. Different
types of machine learning algorithms are:
Supervised algorithm uses a labeled dataset where each doc-
ument of training set is labeled with appropriate sentiment.
Unsupervised algorithm include unlabeled dataset (Singh et al.,
2007).
This study mainly concerns with supervised learning techniques on
a labeled dataset.
3 / 26
4. Introduction Related work Methodology Proposed Approach Conclusion & Future Work Reference References
Types of Sentiment Analysis
Different Types of Sentiment analysis are as follows:
Document Level: Document Level sentiment classification
aims at classifying the entire document or review as either pos-
itive or negative.
Sentence Level: Sentence level sentiment classification con-
siders the polarity of individual sentence of a document.
Aspect Level: Aspect level sentiment classification first iden-
tifies the different aspects of a corpus and then for each doc-
ument, the polarity is calculated with respect to obtained as-
pects.
Document level sentiment analysis is being considered for analysis
in this study.
4 / 26
5. Introduction Related work Methodology Proposed Approach Conclusion & Future Work Reference References
Review of Related work
Author(s) Sentiment Analysis Approach
(Pang et al.,
2002)
They have considered sentiment classification based
on categorization aspect with positive and negative
sentiments . They used three different machine learn-
ing algorithms i.e., Naive Bayes, SVM , and Maximum
Entropy classification applied over the n-gram tech-
nique.
(Turney, 2002). He presented unsupervised algorithm to classify re-
view as either recommended (positive) or not rec-
ommended (negative). The author has used Part of
Speech (POS) tagger to identify phrases which con-
tain adjectives or adverbs.
(Read, 2005). He used emotions for labeling of dataset. He used
emotions for labeling because they are independent of
time, topic and domain. He applied machine learning
classifiers on this labeled dataset.
5 / 26
6. Introduction Related work Methodology Proposed Approach Conclusion & Future Work Reference References
continue...
Author(s) Sentiment Analysis Approach
(Dave et al.,
2003).
They have used structured review for testing and
training, identifying features and score methods to de-
termine whether the reviews are positive or negative.
They used classifier to classify the sentences obtained
from web search through search query using product
name as search condition.
(Whitelaw
et al., 2005).
They have presented a sentiment classification tech-
nique on the basis of analysis and extraction of ap-
praisal groups. Appraisal group represents a set of
attribute values in task independent semantic tax-
onomies.
(Li et al.,
2011).
They have proposed various semi-supervised tech-
niques to solve the issue of shortage of labeled data
for sentiment classification . They have used under
sampling technique to deal with the problem of sen-
timent classification i.e., imbalance problem.
6 / 26
7. Introduction Related work Methodology Proposed Approach Conclusion & Future Work Reference References
Types of Classification
Binary sentiment classification: Each document or review
of the corpus is classified into two classes either positive or
negative.
Multi-class sentiment classification: Each review can be
classified into more than two classes (strong positive, positive,
neutral , negative, strong negative).
Generally, the binary classification is useful when two products
need to be compared. In this study, implementation is done with
respect to binary sentiment classification.
7 / 26
8. Introduction Related work Methodology Proposed Approach Conclusion & Future Work Reference References
Preparation of Data
The unstructured textual review data need to be converted to mean-
ingful data in order to apply machine learning algorithms.
Following methods have been used to transform textual data to
numerical vectors.
CountVectorizer: Based on the number of occurrences of a
feature in the review, a sparse matrix is created (Garreta and
Moncecchi, 2013).
Term Frequency - Inverse Document frequency (TF-IDF):
The TF-IDF score is helpful in balancing the weight between
most frequent or general words and less commonly used words.
Term frequency calculates the frequency of each token in the
review but this frequency is offset by frequency of that token
in the whole corpus (Garreta and Moncecchi, 2013). TF-IDF
value shows the importance of a token to a document in the
corpus.
8 / 26
9. Introduction Related work Methodology Proposed Approach Conclusion & Future Work Reference References
Machine Learning Algorithms Used
Naive Bayes Algorithm: It is a probabilistic classifier which uses
the properties of Bayes theorem assuming the strong independence
between the features (McCallum et al., 1998).
For a given textual review ‘d’ and for a class ‘c’ (positive,negative),
the conditional probability for each class given a review is P(c|d) .
According to Bayes theorem this quantity can be computed using
the following equation 1
P(c|d) =
P(d|c) ∗ P(c)
P(d)
(1)
9 / 26
10. Introduction Related work Methodology Proposed Approach Conclusion & Future Work Reference References
Continue...
Support Vector Machine Algorithm: SVM is a non probabilistic
binary linear classifier (Turney, 2002). SVM Model represents each
review in vectorized form as a data point in the space. This method
is used to analyze the complete vectorized data and the key idea
behind the training of model is to find a hyperplane.
The set of textual data vectors are said to be optimally separated by
hyperplane only when it is separated without error and the distance
between closest points of each class and hyperplane is maximum.
10 / 26
11. Introduction Related work Methodology Proposed Approach Conclusion & Future Work Reference References
Confusion Matrix
Confusion matrix is generated to tabulate the performance of any
classifier.
Correct Labels
Positive Negative
Positive True Positive (TP) False Positive (FP)
Negative False Negative (FN) True Negative (TN)
Table: Confusion Matrix
TP(True Positive) is the number of positive reviews that are
correctly predicted and FP(False positive) is the number of
positive reviews predicted as negative.
TN(True Negative) is number of negative reviews correctly
predicted and FN(False Negative) is number of negative re-
views predicted as positive.
11 / 26
12. Introduction Related work Methodology Proposed Approach Conclusion & Future Work Reference References
Evaluation Parameters
Precision : It gives the exactness of the classifier. It is the
ratio of number of correctly predicted positive reviews to the
total number of reviews predicted as positive.
precision =
TP
TP + FP
(2)
Recall: It measures the completeness of the classifier. It is the
ratio of number of correctly predicted positive reviews to the
actual number of positive reviews present in the corpus.
Recall =
TP
TP + FN
(3)
12 / 26
13. Introduction Related work Methodology Proposed Approach Conclusion & Future Work Reference References
Continue...
F-measure: It is the harmonic mean of precision and recall.
F-measure can have best value as 1 and worst value as 0. The
formula for calculating F-measure is given below in equation 4
FMeasure =
2 ∗ Precision ∗ Recall
Precision + Recall
(4)
Accuracy: It is one of the most common performance eval-
uation parameter and it is calculated as the ratio of number
of correctly predicted reviews to the number of total number
of reviews present in the corpus. The formula for calculating
accuracy is given as equation 5
Accuracy =
TP + TN
TP + FP + TN + FN
(5)
13 / 26
14. Introduction Related work Methodology Proposed Approach Conclusion & Future Work Reference References
Proposed Approach
Dataset
Preprocessing: Stopword, Numerical and Special character removal
Vectorization
Train using machine learning algorithm
Classification
Result
Figure: Steps to obtain the required output
14 / 26
15. Introduction Related work Methodology Proposed Approach Conclusion & Future Work Reference References
Dataset
In this study, labeled aclIMDb movie dataset (IMDb, 2006) is
considered .
Dataset contain 12500 labeled positive and negative reviews for
training of model
It also contain 12500 positive and negative reviews for testing
of model as well.
15 / 26
16. Introduction Related work Methodology Proposed Approach Conclusion & Future Work Reference References
Preprocessing
The review contains a large amount of vague information which
needed to be eliminated.
In preprocessing step, firstly, all the special characters used like
(!@) and the unnecessary blank spaces are removed.
It is observed that reviewers often repeat a particular character
of a word to give more emphasis to an expression or to make
the review trendy (Amir et al., 2014).
second step in preprocessing involves the removal of all the
stopwords of English language.
16 / 26
17. Introduction Related work Methodology Proposed Approach Conclusion & Future Work Reference References
Vectorization
CountVectorizer: It transforms the review to token count ma-
trix. First, it tokenizes the review and according to number of
occurrence of each token, a sparse matrix is created.
TF-IDF: Its value represents the importance of a word to a
document in a corpus. TF-IDF value is proportional to the
frequency of a word in a document; but it is limited by the
frequency of the word in the corpus.
Calculation of TF-IDF value : suppose a movie review contain
100 words wherein the word Awesome appears 5 times. The term
frequency (i.e., TF) for Awesome then (5 / 100) = 0.05. Again, sup-
pose there are 1 million reviews in the corpus and the word Awesome
appears 1000 times in whole corpus. Then, the inverse document
frequency (i.e., IDF) is calculated as log(1,000,000 / 1,000) = 3.
Thus, the TF-IDF value is calculated as: 0.05 * 3 = 0.15.
17 / 26
18. Introduction Related work Methodology Proposed Approach Conclusion & Future Work Reference References
Training and Classification
Naive Bayes(NB) algorithm: Using probabilistic analysis, fea-
tures are extracted from numeric vectors. These features help
in training of the Naive Bayes classifier model (McCallum et al.,
1998).
Support vector machine (SVM) algorithm: SVM plots all the
numeric vectors in space and defines decision boundaries by
hyperplanes. This hyperplane separates the vectors in two cat-
egories such that, the distance from the closest point of each
category to the hyperplane is maximum (Turney, 2002).
After training of model using above mentioned algorithms, the
12500 positive and negative reviews given for testing are classified
based on training of model.
18 / 26
19. Introduction Related work Methodology Proposed Approach Conclusion & Future Work Reference References
Result NB
The confusion matrix obtained after Naive Bayes classification is
shown in table 2 and evaluation parameters Precision, Recall and
F-Measure are shown in table 3 as follows:
Table: Confusion matrix for
Naive Bayes classifier
Correct Labels
Positive Negative
Positive 11025 1475
Negative 2612 9888
Table: Evaluation parameter for
Naive Bayes classifier
Precision Recall F-Measure
Negative 0.81 0.88 0.84
Positive 0.87 0.79 0.83
The accuracy for Naive Bayes Classifier is 0.83652
19 / 26
20. Introduction Related work Methodology Proposed Approach Conclusion & Future Work Reference References
Result SVM
The confusion matrix obtained after Support Vector Machine
classification is shown in table 4 and evaluation parameters
Precision, Recall and F-Measure are shown in table 5 as follows:
Table: Confusion matrix for
Support Vector Machine
classifier
Correct Labels
Positive Negative
Positive 10993 1507
Negative 1749 10751
Table: Evaluation parameter for
Support Vector Machine
classifier
Precision Recall F-Measure
Negative 0.86 0.88 0.87
Positive 0.88 0.86 0.87
The accuracy for Support Vector Machine classifier for unigram is
0.86976
20 / 26
21. Introduction Related work Methodology Proposed Approach Conclusion & Future Work Reference References
Comparison of Work
Table: Comparative result obtained among different literature using
IMDb dataset
Classifier
Classification Accuracy
Pang et al. (2002) Salvetti et al. (2004) Mullen and Collier (2004) Beineke et al. (2004) Matsumoto et al. (2005) Proposed approach
Naive Bayes 0.815 0.796 x 0.659 x 0.83
Support Vector Machine 0.659 x 0.86 x 0.883 0.884
The ‘x’ mark indicates that the algorithm is not considered by the author
in their respective paper.
21 / 26
22. Introduction Related work Methodology Proposed Approach Conclusion & Future Work Reference References
Conclusion
In this study, an attempt has been made to classify sentimental
movie reviews using machine learning techniques.
Two different algorithms namely Naive Bayes (NB) and Support
Vector Machine (SVM) are implemented.
It is observed that SVM classifier outperforms every other clas-
sifier in predicting the sentiment of a review.
The result obtained in this study is comparatively better than
other literatures on same dataset.
22 / 26
23. Introduction Related work Methodology Proposed Approach Conclusion & Future Work Reference References
Future Work
In this study, only two different classifiers have been imple-
mented.
In future, other classification strategies under supervised learn-
ing methodology like Maximum Entropy classifier, Stochastic
Gradient Classifier, K Nearest Neighbor and others can be con-
sidered for implementation.
Finally, comparison of results can be presented with SVM, which
is currently the best classifier, for the sentiment analysis.
23 / 26
24. Introduction Related work Methodology Proposed Approach Conclusion & Future Work Reference References
Reference I
Amir, S., Almeida, M., Martins, B., Filgueiras, J., and Silva, M. J. (2014). Tugas: Exploiting unlabelled data for
twitter sentiment analysis. SemEval 2014, page 673.
Argamon, S., Bloom, K., Esuli, A., and Sebastiani, F. (2009). Automatically determining attitude type and force
for sentiment analysis. In Human Language Technology. Challenges of the Information Society, pages 218–231.
Springer.
Beineke, P., Hastie, T., and Vaithyanathan, S. (2004). The sentimental factor: Improving review classification via
human-provided information. In Proceedings of the 42nd Annual Meeting on Association for Computational
Linguistics, page 263. Association for Computational Linguistics.
Dave, K., Lawrence, S., and Pennock, D. M. (2003). Mining the peanut gallery: Opinion extraction and semantic
classification of product reviews. In Proceedings of the 12th international conference on World Wide Web, pages
519–528. ACM.
Garreta, R. and Moncecchi, G. (2013). Learning scikit-learn: Machine Learning in Python. Packt Publishing Ltd.
IMDb (2006). Imdb, internet movie database sentiment analysis dataset.
Li, S., Wang, Z., Zhou, G., and Lee, S. Y. M. (2011). Semi-supervised learning for imbalanced sentiment classification.
In IJCAI Proceedings-International Joint Conference on Artificial Intelligence, volume 22, page 1826.
Matsumoto, S., Takamura, H., and Okumura, M. (2005). Sentiment classification using word sub-sequences and
dependency sub-trees. In Advances in Knowledge Discovery and Data Mining, pages 301–311. Springer.
McCallum, A., Nigam, K., et al. (1998). A comparison of event models for naive bayes text classification. In AAAI-98
workshop on learning for text categorization, volume 752, pages 41–48. Citeseer.
Mullen, T. and Collier, N. (2004). Sentiment analysis using support vector machines with diverse information sources.
In EMNLP, volume 4, pages 412–418.
Pang, B., Lee, L., and Vaithyanathan, S. (2002). Thumbs up?: sentiment classification using machine learning
techniques. In Proceedings of the ACL-02 conference on Empirical methods in natural language processing-
Volume 10, pages 79–86. Association for Computational Linguistics.
24 / 26
25. Introduction Related work Methodology Proposed Approach Conclusion & Future Work Reference References
Reference II
Read, J. (2005). Using emoticons to reduce dependency in machine learning techniques for sentiment classification.
In Proceedings of the ACL Student Research Workshop, pages 43–48. Association for Computational Linguistics.
Salvetti, F., Lewis, S., and Reichenbach, C. (2004). Automatic opinion polarity classification of movie. Colorado
research in linguistics, 17:2.
Singh, Y., Bhatia, P. K., and Sangwan, O. (2007). A review of studies on machine learning techniques. International
Journal of Computer Science and Security, 1(1):70–84.
Turney, P. D. (2002). Thumbs up or thumbs down?: semantic orientation applied to unsupervised classification of
reviews. In Proceedings of the 40th annual meeting on association for computational linguistics, pages 417–424.
Association for Computational Linguistics.
Whitelaw, C., Garg, N., and Argamon, S. (2005). Using appraisal groups for sentiment analysis. In Proceedings of
the 14th ACM international conference on Information and knowledge management, pages 625–631. ACM.
25 / 26
26. Introduction Related work Methodology Proposed Approach Conclusion & Future Work Reference References
Thank You!
26 / 26