Investigating the Characteristics and Research Impact of Sentiments in Tweets with Links to Computer Science Research Papers

Aravind Sesagiri Raamkumar, Savitha Ganesan, Keerthana Jothiramalingam,
Muthu Kumaran Selva, Mojisola Erdt, Yin-Leng Theng
Centre for Healthy and Sustainable Cities (CHESS)
Nanyang Technological University, Singapore
ICADL 2018
November 21 2018

Background
Twitter has been used in the context of academia in different ways
1) By Journals
2) By Conferences
3) By Researchers (general use)
4) By Any User (with links to research papers)

Background
Twitter data has become very important in social media research
Twitter data for research
1. Explicitly Produced
– Tweet Content: User-mentions, URLs, Hashtags, Likes, Retweets
– Tweet Metadata
– User Metadata
– User Lists
– Public Timeline
2. Inferred Data
– Networks & Influence
– Information Accuracy & Sentiment

Problem Area
Why do we need to look at sentiments in tweets that contain links to research papers?
– Twitter metrics are mainly indicators of popularity
– Other proxy indicators of the interim quality of tweeted papers are needed
– Sentiments could potentially be used in formulating recommendations for research papers
Previous studies have found that neutral sentiment predominates in tweets with links to research papers:
– 96% of 270 tweets [1]
– 94.8% of 1,000 tweets [2]
– 81.7% of 487,610 tweets [3]

Research Objectives
1) Understand the role and nature of sentiments in tweets by starting with a qualitative analysis of tweets with non-neutral sentiments
RQ1: How are the sentiments represented in the tweets, in terms of composition, keywords and attributed aspects?
2) Compare the performance of papers with all sentiments against papers with just neutral sentiment in tweets
RQ2: How do papers with all three sentiments compare against papers with only neutral sentiments in terms of impact indicators?

Methodology - Dataset Preparation
1) The Microsoft Academic Graph (MAG) dataset was used for this study
– February 2016 version
2) Computer science (CS) papers were extracted from the MAG dataset using the CS venue entries indexed in DBLP
3) Papers published since 2012 were considered (n=53,831)
4) Data extracted for these papers
– Citation counts from Scopus
– Altmetrics data from Altmetric.com (aggregated Altmetric score)
– Altmetrics data from PlumX (views and downloads)
– Tweets from Twitter
5) 13,809 papers had 77,914 tweets
6) Bot and spam accounts were removed
7) Paper titles were removed from tweets
8) Retweets were not considered
9) Final dataset: 49,849 tweets for 12,967 associated papers
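The filtering steps 6) to 8) can be illustrated with a minimal Python sketch. The slides do not say how bots, retweets, or titles were detected, so everything here is an assumption: the record fields (`user`, `paper_id`, `text`), the `blocked_accounts` set, the "RT @" prefix test, and the case-insensitive title match are hypothetical choices for illustration only. Titles were presumably removed so that words in a paper's title do not influence the tweet's sentiment score.

```python
import re


def is_retweet(text: str) -> bool:
    # Assumption: a retweet is recognised by the conventional "RT @" prefix.
    return text.lstrip().lower().startswith("rt @")


def strip_title(text: str, title: str) -> str:
    # Remove the paper title (case-insensitive) and collapse leftover whitespace.
    cleaned = re.sub(re.escape(title), " ", text, flags=re.IGNORECASE)
    return re.sub(r"\s+", " ", cleaned).strip()


def prepare_tweets(tweets, titles, blocked_accounts):
    """Drop tweets from blocked accounts and retweets, then strip paper titles.

    tweets: list of dicts with 'user', 'paper_id', 'text' (hypothetical schema).
    titles: dict mapping paper_id to paper title.
    blocked_accounts: set of user names flagged as bot or spam accounts.
    """
    prepared = []
    for tweet in tweets:
        if tweet["user"] in blocked_accounts:
            continue
        if is_retweet(tweet["text"]):
            continue
        title = titles.get(tweet["paper_id"], "")
        text = strip_title(tweet["text"], title) if title else tweet["text"]
        prepared.append({**tweet, "text": text})
    return prepared
```

A real pipeline would likely use tweet metadata rather than a text prefix to identify retweets, and a curated or model-based list to identify bot and spam accounts.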

Methodology - Sentiment Identification and Qualitative Analysis
1) The TextBlob library was used for determining the sentiment polarity of tweets
– Default scoring scale ranges from -1 to +1
– 0.0 corresponds to neutral sentiment by default
– Sentiment score ranges were modified for better accuracy
2) The keywords representing positive and negative sentiments in each tweet, along with the corresponding paper aspect, were extracted

Sentiment category    Initial sentiment score range    Modified sentiment score range
Extremely Positive    > 0.5 and <= 1.0                 > 0.5 and <= 1.0
Positive              > 0.0 and <= 0.5                 > 0.3 and <= 0.5
Neutral               = 0.0                            >= -0.3 and <= 0.3
Negative              < 0.0 and >= -0.5                < -0.3 and >= -0.5
Extremely Negative    < -0.5 and >= -1.0               < -0.5 and >= -1.0
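The modified score ranges can be expressed as a small classifier. This is a sketch, not the authors' code: the function name `classify_polarity` is hypothetical, and the function takes a polarity score directly so that it runs without TextBlob installed. With TextBlob, such a score is available as `TextBlob(text).sentiment.polarity`.

```python
def classify_polarity(score: float) -> str:
    """Map a polarity score in [-1, 1] to a category using the modified ranges."""
    if not -1.0 <= score <= 1.0:
        raise ValueError("polarity score must lie in [-1, 1]")
    if score > 0.5:
        return "Extremely Positive"   # > 0.5 and <= 1.0
    if score > 0.3:
        return "Positive"             # > 0.3 and <= 0.5
    if score >= -0.3:
        return "Neutral"              # >= -0.3 and <= 0.3
    if score >= -0.5:
        return "Negative"             # < -0.3 and >= -0.5
    return "Extremely Negative"       # < -0.5 and >= -1.0
```

Widening the neutral band from exactly 0.0 to [-0.3, 0.3] means that weakly polar tweets are counted as neutral, which is consistent with the very high neutral share reported for the dataset.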

RQ1: Sentiment Stats for the Tweets
Sentiment             Tweet Count        Associated Paper Count    Likes Count (μ)    Retweets Count (μ)
Positive              866 (1.74%)        579                       1.15               1.03
Extremely Positive    527 (1.06%)        393                       1.65               1.58
Negative              15 (0.03%)         12                        1.73               0.93
Extremely Negative    9 (0.02%)          7                         0.78               3.33
Neutral               48,432 (97.16%)    12,791                    0.94               0.83
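As a quick consistency check, the percentage shares can be recomputed from the tweet counts reported for each sentiment category. The sketch below uses only the counts from the slide; the variable names are illustrative.

```python
# Tweet counts per sentiment category, as reported on the slide.
counts = {
    "Positive": 866,
    "Extremely Positive": 527,
    "Negative": 15,
    "Extremely Negative": 9,
    "Neutral": 48432,
}

total = sum(counts.values())

# Share of each category as a percentage of all tweets, rounded to two decimals.
shares = {label: round(100 * n / total, 2) for label, n in counts.items()}

# Tweets carrying any non-neutral sentiment.
non_neutral = total - counts["Neutral"]

for label, share in shares.items():
    print(f"{label}: {counts[label]} tweets ({share}%)")
print(f"Non-neutral: {non_neutral} of {total} tweets")
```

The counts sum to the 49,849 tweets of the final dataset, and the recomputed shares match the percentages in the table.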

RQ1: Positive Tweets
Aspect: Overall paper
Keywords used: awesome, great, interesting, fascinating, nice, new
Example tweets:
– a paper published on cloud biolinux. awesome. [URL]
– this is really good. simple rules for better figures [URL] great tips with examples

Aspect: Readership
Keywords used: good, great, interesting, nice, worth
Example tweets:
– [TH] [TH] the value of draft picks. nerdy but a great read [URL]
– looks worth a read #ploscompbio: [Paper] [URL] #oxcompbio

Aspect: Review
Keywords used: awesome, good, great, interesting, nice
Example tweets:
– adjusting confounders in ranking #biomarkers: a model-based roc approach - #awesome review [TH] [URL]
– a nice review paper about image segmentation on gpus [TH] [TH] #gpgpu [URL]

Aspect: Work
Keywords used: amazing, excellent, impressive, interesting, nice
Example tweets:
– [TH] just read article on usability testing serious games [URL] excellent work. will be sharing with my students.
– brainbrowser: distributed, web-based neurological #dataviz. impressive work via [TH] w/ [TH] and more. [URL]

Aspect: Study
Keywords used: beautiful, best, cool, interesting, nice
Example tweets:
– beautiful study on how the canary sings! relevant to sequence organization in animal behavior in general. [URL]
– cool study by browning harmer suggesting anxiety disrupts the expectancy learning process (for threat info) - [URL]

RQ1: Negative Tweets
Aspect: Overall paper
Keywords used: fool, seriously, shit, terrible, fuck
Example tweets:
– this is terrible science ignore it: [URL] the article ([URL] even cites wakefield (2002).
– #roundup has gone and fucked up. - has anybody else seen this published 4/18/13, linking roundup weed killer to... [URL]

Aspect: Study
Keywords used: bad, stupid
Example tweets:
– another bad study on narcissism social media [URL] 19 yr use twitter, 35 yr facebook. lots p surveys not comparable
– this study should be called, facebook making us stupid [URL]

Aspect: Opinion on authors
Keywords used: idiot
Example tweets:
– [TH] here is one. so you are totally [URL] why am i arguing with an idiot who thinks he speaks for science

Aspect: Paper length
Keywords used: stupid
Example tweets:
– keep it long and complicated, stupid. [URL]

Aspect: Paper title
Keywords used: horrible
Example tweets:
– [Paper]. [URL] pevzner group. cool. (but what a horribly overloaded name!)

RQ2: Comparison of Paper Groups with Impact Indicators
Mean values
Paper Group                 Citations    Usage Total    Likes Count    Retweets Count    Mendeley Readers    Altmetric Score
All sentiments (group 1)    13.52        1765.78        1.83           1.73              91.62               28.85
Only neutral (group 2)      9.09         557.51         0.66           0.56              48.77               3.97

Median values
Paper Group                 Citations    Usage Total    Likes Count    Retweets Count    Mendeley Readers    Altmetric Score
All sentiments (group 1)    5            92             0              0                 53                  6
Only neutral (group 2)      4            65.5           0              0                 30                  1.25
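The gap between the two paper groups is easier to read as a ratio of group 1 to group 2 for each indicator. The sketch below uses only the reported mean values; the dictionary names are illustrative. The median values are left out because the median likes and retweets are 0 for both groups, which makes a ratio undefined.

```python
# Mean values per indicator, as reported on the slide.
group1_means = {  # papers with all sentiments
    "Citations": 13.52,
    "Usage Total": 1765.78,
    "Likes Count": 1.83,
    "Retweets Count": 1.73,
    "Mendeley Readers": 91.62,
    "Altmetric Score": 28.85,
}
group2_means = {  # papers with only neutral tweets
    "Citations": 9.09,
    "Usage Total": 557.51,
    "Likes Count": 0.66,
    "Retweets Count": 0.56,
    "Mendeley Readers": 48.77,
    "Altmetric Score": 3.97,
}

# Ratio of group 1 to group 2 for each indicator.
ratios = {name: group1_means[name] / group2_means[name] for name in group1_means}

for name, ratio in sorted(ratios.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: group 1 mean is {ratio:.2f}x the group 2 mean")
```

These ratios are descriptive only; they say nothing about statistical significance, which the slides do not report.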

Limitations
• Only computer science research papers were considered in this study
• By the end of 2017, Twitter had increased the character limit for tweets from 140 to 280
• Twitter users can now post more descriptive content and tag more users in their tweets

Future Work
• Extend this study to other disciplines
• Ascertain the impact of Twitter’s increase in the tweet character limit
• Utilize the findings for conceptualizing recommendation techniques that use social media data for recommending research papers

CTIDF (Common Tweeter Inverse Document Frequency)
[Figure: Twitter users linked to papers in the dataset, with a seed paper and candidate papers. Labels: STAGE 1, STAGE 2. Example value shown: CTIDF = (1/3) + (1/2)]
Ranking Strategies
1) Use co-references and co-citations
2) Use citation count
3) Use sentiments (papers that have a higher ratio of positive tweets)
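The slide gives only the example value CTIDF = (1/3) + (1/2) and no formula. One reading, which is an assumption and not confirmed by the slides, is that each Twitter user who tweeted both the seed paper and the candidate paper contributes the inverse of the number of papers that user tweeted. The Python sketch below implements that reading; the function name `ctidf` and the input layout are hypothetical.

```python
from fractions import Fraction


def ctidf(seed, candidate, tweeters_by_paper):
    """Sum 1 / (papers tweeted by the user) over users common to both papers.

    tweeters_by_paper: dict mapping paper id to the set of users who tweeted it.
    This formula is an assumed interpretation of the slide's example value.
    """
    # Invert the mapping: which papers did each user tweet?
    papers_by_user = {}
    for paper, users in tweeters_by_paper.items():
        for user in users:
            papers_by_user.setdefault(user, set()).add(paper)

    # Users who tweeted both the seed paper and the candidate paper.
    common = tweeters_by_paper.get(seed, set()) & tweeters_by_paper.get(candidate, set())

    return sum((Fraction(1, len(papers_by_user[u])) for u in common), Fraction(0))


# Two common tweeters: u1 tweeted three papers, u2 tweeted two.
example = {
    "seed": {"u1", "u2"},
    "candidate": {"u1", "u2"},
    "other": {"u1", "u3"},
}
score = ctidf("seed", "candidate", example)
```

Under this reading, a user who tweets many papers contributes less to the score than a user who tweets only a few, in the same way that inverse document frequency discounts common terms.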
