Tweet sentiment analysis (Data mining)

Supervisor:
Er. Nhurendra Shakya
Presented By:
Anil Shrestha (28651)
Bijay Sahani ()
Bimal Shrestha (28658)
Deshbhakta Khanal (28661)
8/17/2017
A
MAJOR PROJECT PRESENTATION ON
TWEEZER

OVERVIEW
• Introduction
• Background
• Objectives
• Scope and applications
• Features
• System diagram
• Methodology
• Future enhancement
• References
• Results

INTRODUCTION
• Tweezer performs sentiment analysis of tweets
• Sentiment analysis is the process of determining whether tweets are positive or negative
• Also known as opinion mining

OBJECTIVES
• To evaluate a person's opinion in particular cases
• To determine the emotional tone behind a series of words
• To teach the machine to analyze various grammatical nuances
• To implement an algorithm for automatic classification of text as positive or negative

SCOPE AND APPLICATIONS
• Social networking sites, for determining people's opinions on products or brands
• Graphical reviews of recently released movies, songs and products
• Business, politics and public actions

FEATURES
• Hashtag search
• Can be trained on new tweets
• Results shown as a pie chart, bar graph and scatter plot
• Determines people's opinions about products or brands
• Print or download results as a PNG, JPEG or SVG vector image, or as a PDF document
• Pop-up images when hovering over graphs

TOOLS USED
• Python
• NLTK
• MongoDB
• PHP
• HTML
• Highcharts API
• XAMPP

METHODOLOGY
Input (Keyword)
Tweets Retrieval
Process Tweet
Get Feature Vector
Extract Feature
Naïve Bayes Classifier
Aggregating Scores
Data
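
The slides name the pipeline stages but do not show how they are implemented. The following is a minimal Python sketch of the "Process Tweet", "Get Feature Vector" and "Aggregating Scores" stages. The function names mirror the flowchart boxes, but the cleaning rules (lowercasing, the `URL` and `AT_USER` placeholders, stripping `#`) and the percentage-based aggregation are assumptions made for illustration, not the project's actual code.

```python
import re
from collections import Counter


def process_tweet(tweet):
    """Normalize a raw tweet. The cleaning rules here are assumed, not from the slides."""
    tweet = tweet.lower()
    tweet = re.sub(r"(https?://\S+|www\.\S+)", "URL", tweet)  # replace links
    tweet = re.sub(r"@\w+", "AT_USER", tweet)                 # replace mentions
    tweet = re.sub(r"#(\w+)", r"\1", tweet)                   # "#tag" becomes "tag"
    return re.sub(r"\s+", " ", tweet).strip()                 # collapse whitespace


def get_feature_vector(processed_tweet):
    """Return the lowercase word tokens; the uppercase placeholders are skipped."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", processed_tweet)


def aggregate_scores(labels):
    """Turn per-tweet labels ("+" or "-") into percentages for charting."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: 100.0 * n / total for label, n in counts.items()}


if __name__ == "__main__":
    raw = "Loved the new #Tweezer demo!! http://t.co/abc @anil"
    cleaned = process_tweet(raw)
    print(cleaned)
    print(get_feature_vector(cleaned))
    # The labels would come from the classifier; they are hard-coded here.
    print(aggregate_scores(["+", "+", "-", "+"]))
```

Percentages of this kind are the sort of values a pie chart or bar graph of the results could be drawn from.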

METHODOLOGY
Algorithm used
• Naïve Bayes method
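
The slides do not write out the decision rule. For reference, the standard multinomial Naïve Bayes rule picks the class with the highest posterior score for a tweet with words w_1, ..., w_n. The add-one (Laplace) smoothing in the second formula is a common choice and is an assumption here; the slides do not say which smoothing, if any, was used. |V| denotes the vocabulary size.

```latex
\hat{c} = \arg\max_{c \in \{+,\,-\}} \; P(c) \prod_{i=1}^{n} P(w_i \mid c),
\qquad
P(w \mid c) = \frac{\operatorname{count}(w, c) + 1}{\sum_{w' \in V} \operatorname{count}(w', c) + |V|}
```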

METHODOLOGY
You have a document and a classification:

DOC  TEXT                          CLASS
1    I loved the movies            +
2    I hated the movies            -
3    a great movies. good movies   +
4    poor acting                   -
5    great acting. a good movies   +

Ten unique words:
<I, loved, the, movies, hated, a, great, poor, acting, good>

METHODOLOGY
Convert the document into feature sets, where the attributes are possible
words, and the values are the number of times a word occurs in the given
document.

DOC  I  loved  the  movies  hated  a  great  poor  acting  good  CLASS
1    1  1      1    1       0      0  0      0     0       0     +
2    1  0      1    1       1      0  0      0     0       0     -
3    0  0      0    2       0      1  1      0     0       1     +
4    0  0      0    0       0      0  0      1     1       0     -
5    0  0      0    1       0      1  1      0     1       1     +
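
The five example documents can be turned into count vectors and used to train a small classifier. The sketch below is plain Python, not the project's NLTK-based code. It assumes a multinomial Naïve Bayes model with add-one smoothing, a whitespace tokenizer that strips periods, and that words outside the ten-word vocabulary are ignored. The test sentence is a hypothetical example.

```python
import math

# The ten unique words, in the order given in the example.
VOCAB = ["I", "loved", "the", "movies", "hated", "a", "great", "poor", "acting", "good"]

# The five labeled example documents.
TRAIN = [
    ("I loved the movies", "+"),
    ("I hated the movies", "-"),
    ("a great movies. good movies", "+"),
    ("poor acting", "-"),
    ("great acting. a good movies", "+"),
]


def feature_vector(text):
    """Count how often each vocabulary word occurs; unknown words are ignored."""
    words = [w.strip(".") for w in text.split()]
    return [words.count(term) for term in VOCAB]


def train(docs):
    """Return {label: (log prior, list of smoothed log likelihoods per word)}."""
    doc_counts, word_totals = {}, {}
    for text, label in docs:
        doc_counts[label] = doc_counts.get(label, 0) + 1
        previous = word_totals.get(label, [0] * len(VOCAB))
        word_totals[label] = [a + b for a, b in zip(previous, feature_vector(text))]
    model = {}
    for label, counts in word_totals.items():
        denominator = sum(counts) + len(VOCAB)  # add-one smoothing (assumed)
        model[label] = (
            math.log(doc_counts[label] / len(docs)),
            [math.log((c + 1) / denominator) for c in counts],
        )
    return model


def classify(text, model):
    """Pick the label with the highest log posterior score."""
    vector = feature_vector(text)
    scores = {
        label: log_prior + sum(n * lw for n, lw in zip(vector, log_likelihoods))
        for label, (log_prior, log_likelihoods) in model.items()
    }
    return max(scores, key=scores.get)


MODEL = train(TRAIN)

if __name__ == "__main__":
    print(feature_vector("a great movies. good movies"))
    print(classify("I hated the poor acting", MODEL))  # hypothetical test sentence
```

Working in log space avoids numerical underflow when many word probabilities are multiplied, which is why the scores are sums of logarithms rather than products.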

LIMITATION
• Only 25 tweets are analyzed
• Takes nearly 35 seconds to analyze and display results
• Cannot determine the neutrality of a tweet
• Does not determine the polarity of emoji/smileys ( ;), :D, :P )

FUTURE ENHANCEMENT
• Analyzing the sentiment of emoji/smileys
• Determining the neutrality of tweets
• Improving our data collection and analysis methods
• Further research into possible improvements such as more refined data and a more accurate algorithm
• Faster processing

CONCLUSION
• The project aimed to determine people's opinions
• A dataset of about 23,000 entries (positive and negative) was collected
• The dataset was normalized
• A Naïve Bayes classifier was trained on the data
• A simple user interface was designed
• Results are presented as graphs
