Stefanie Haustein & Vincent Larivière: Astrophysicists on Twitter and other social media metrics research
1. Astrophysicists on Twitter
and other social media metrics research
Stefanie Haustein & Vincent Larivière
Canada Research Chair
on the Transformations of Scholarly Communication
École de bibliothéconomie et des sciences de l’information
2. Background: bibliometrics
• publications and citations used as proxies for research
productivity and impact
• based on studies to understand structure and norms
of science
• sociological research
• publications and scientific/academic capital
• reasons to cite
• bibliometric research
• disciplinary differences in publication and citation behavior
• delay and obsolescence patterns
Ø theoretical framework and legitimation for citation analysis
3. Background: altmetrics
• social media metrics as alternatives or complements to
citation analysis
• similar but more timely than citations
Ø predicting scientific impact?
• different, broader impact than citations
Ø measuring societal impact?
• including all research “products”
4. Background: altmetrics
• similar to bibliometrics in 1960s, little known about
meaning of social media metrics
• altmetrics are “representing very different things”
(Lin & Fenner, 2013)
• unclear what exactly they measure:
• scientific impact?
• social impact?
• “buzz”?
• all of the above?
5. Altmetrics: increasing use
• social media activity around scholarly articles grows
5% to 10% per month (Adie & Roe, 2013)
• Mendeley and Twitter largest sources for mentions of
scholarly documents
Mendeley
• 521 million bookmarks
• 2.7 million users
• 32% increase of users
from 09/2012 to 09/2013
Mendeley statistics based on monthly user counts from 10/2010 to 01/2014 on the Mendeley website accessed through the Internet Archive
6. Altmetrics: increasing use
• increase of Twitter use
• 230 million active users, 500 million tweets per day
• 39% increase of users from 09/2012 to 09/2013
• 16% of US, 3% of world population in 2013
• uptake by researchers
• 1 in 40 university faculty members in US and UK
have Twitter account (Priem, Costello, & Dzuba, 2011)
• 9% of researchers use Twitter for work (Rowlands et al., 2011)
• 80% of Digital Humanities scholars consider Twitter
relevant source of information (Bowman et al., 2013)
Twitter statistics calculated based on data from: http://www.sec.gov/Archives/edgar/data/1418091/000119312513400028/d564001ds1a.htm
and http://www.census.gov/population/international/data/
11. Research questions
Ø What kind of impact do Mendeley readers and tweets reflect?
• What is the relationship between social media activity around a
document and the bibliometric variables of these documents?
• Which topics receive the most attention on Mendeley and Twitter?
• How and to what extent do researchers use social media?
• Who is engaging with scholarly material on social media sites?
• What are the motivations behind this use?
Results of two case studies:
• Study I: in-depth analysis of astrophysicists on Twitter
• Study II: large-scale analysis of tweets and Mendeley readers of
biomedical papers
12. Study I: Astrophysicists on Twitter
Aim of this study
• in-depth analysis of astrophysicists on Twitter
• number of tweets, followers, retweets
• characteristics of tweets: RTs, @messages,
#hashtags, URLs
• relationship with scientific output
• publications
• citations
• comparison of tweet and publication content
• identify different types of conversations
Ø provide evidence of use for scholarly communication
Haustein, S., Bowman, T.D., Holmberg, K., Peters, I., & Larivière, V. (in press). Astrophysicists on Twitter: An in-depth analysis of tweeting and
scientific publication behavior. Aslib Proceedings.
13. Study I: Astrophysicists on Twitter
Data sets & methods
• 37 astrophysicists on Twitter identified by
Holmberg & Thelwall (2013)
• focus on astrophysics professors and researchers
• often bloggers, science communicators
14. Study I: Astrophysicists on Twitter
Data sets & methods
• collection of Twitter account information
Ø heterogeneous group of Twitter users
• collection and analysis of 68,232 of 289,368 tweets
• number of RTs per tweet
• % of tweets that are RTs
• % of tweets containing #hashtags, @usernames, URLs
• web searches to identify person behind account
• publications in WoS journals
• publication years: 2008-2012
• manual author disambiguation
Ø heterogeneous group of authors
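The tweet characteristics listed above (share of RTs, share of tweets with #hashtags, @usernames and URLs) could be computed along the following lines. This is a minimal sketch, not the procedure used in the study: the regular expressions and the function name `tweet_characteristics` are assumptions made for illustration, and real tweet data would need more careful parsing.

```python
import re

# Hypothetical patterns (assumptions, not taken from the study).
RETWEET = re.compile(r"^RT @\w+")    # classic "RT @user" prefix
HASHTAG = re.compile(r"#\w+")
MENTION = re.compile(r"@\w+")
URL = re.compile(r"https?://\S+")


def tweet_characteristics(tweets):
    """Return the percentage of tweets that are RTs or contain each feature."""
    total = len(tweets)
    if total == 0:
        return {"retweets": 0.0, "hashtags": 0.0, "usernames": 0.0, "urls": 0.0}
    counts = {
        "retweets": sum(1 for t in tweets if RETWEET.search(t)),
        "hashtags": sum(1 for t in tweets if HASHTAG.search(t)),
        "usernames": sum(1 for t in tweets if MENTION.search(t)),
        "urls": sum(1 for t in tweets if URL.search(t)),
    }
    return {name: 100.0 * n / total for name, n in counts.items()}


# Invented sample tweets, for illustration only.
sample = [
    "RT @nasa: New image of the #sun",
    "Giving a talk today http://example.org/talk",
    "@colleague thanks for the paper!",
    "Cloudy night, no observing",
]
shares = tweet_characteristics(sample)
```

Averaging these percentages per person within each user group would give figures comparable to the "mean share per person per group" reported in the results slides.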
15. Study I: Astrophysicists on Twitter
Data sets & methods
• grouping astrophysicists according to tweeting and
publication behavior
• analyzing differences of tweeting characteristics
between user groups
Selected astrophysicists (N=37) | tweet rarely (0.0-0.1 tweets per day) | tweet occasionally (0.1-0.9) | tweet regularly (1.2-2.9) | tweet frequently (3.7-58.2) | total (publishing activity)
do not publish (0 publications 2008-2012) | -- | -- | 1 | 5 | 6
publish occasionally (1-9) | 4 | 3 | 4 | 2 | 13
publish regularly (14-37) | -- | 5 | 5 | 3 | 13
publish frequently (46-112) | 1 | 3 | 1 | -- | 5
total (tweeting activity) | 5 | 11 | 11 | 10 | 37
16. Study I: Astrophysicists on Twitter
Data sets & methods
• comparison of tweet and publication content
• limited to 18 most frequently publishing astrophysicists
to ensure certain number of abstracts
• extraction of noun phrases from abstracts and tweets
with part-of-speech tagger
• analyzing overlap of character strings
• calculating similarity with cosine per person and overall
Selected astrophysicists (N=37) | tweet rarely (0.0-0.1 tweets per day) | tweet occasionally (0.1-0.9) | tweet regularly (1.2-2.9) | tweet frequently (3.7-58.2) | total (publishing activity)
publish regularly (14-37) | -- | 5 | 5 | 3 | 13
publish frequently (46-112) | 1 | 3 | 1 | -- | 5
total (tweeting activity) | 1 | 8 | 6 | 3 | 18
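The cosine comparison of noun phrases from abstracts and tweets can be illustrated with a small sketch. It assumes the noun phrases have already been extracted (the part-of-speech tagging step is not shown) and that raw phrase frequencies are used as vector weights. The slides do not state the weighting, so this is one plausible reading, and the function name is hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(terms_a, terms_b):
    """Cosine between two term-frequency vectors built from lists of noun phrases."""
    freq_a, freq_b = Counter(terms_a), Counter(terms_b)
    # Only phrases present in both lists contribute to the dot product.
    dot = sum(freq_a[t] * freq_b[t] for t in freq_a.keys() & freq_b.keys())
    norm_a = math.sqrt(sum(v * v for v in freq_a.values()))
    norm_b = math.sqrt(sum(v * v for v in freq_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Invented noun phrases, for illustration only.
abstract_nps = ["star", "planet", "galaxy", "star"]
tweet_nps = ["star", "talk", "day"]
cos = cosine_similarity(abstract_nps, tweet_nps)
```

Running the function once per person on that person's abstracts and tweets, and once on the pooled lists, would correspond to the "per person and overall" comparison.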
17. Study I: Astrophysicists on Twitter
Data sets & methods
• social network analysis of conversational networks
• 56,415/15,420 connections between 11,252 users
• limited to users mentioned ≥20 times:
518 users including 32 selected astrophysicists
• coding users by type
• visualization and analysis with Gephi
• OpenOrd layout
• community detection
• analyzing clusters
• user types
• hashtags and noun phrases
• visualizing term co-occurrence with VOSviewer
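Before a network like this can be loaded into a tool such as Gephi, the mention connections have to be extracted and thinned out. The sketch below shows one possible way to do that with the standard library only. The mention pattern, the function names and the decision to keep every edge that points at a sufficiently mentioned user are assumptions, not details given on the slide.

```python
import re
from collections import Counter

MENTION = re.compile(r"@(\w+)")  # hypothetical pattern for @usernames


def mention_edges(tweets_by_user):
    """Count directed (author, mentioned user) connections."""
    edges = Counter()
    for author, tweets in tweets_by_user.items():
        for text in tweets:
            for target in MENTION.findall(text):
                edges[(author.lower(), target.lower())] += 1
    return edges


def filter_by_mentions(edges, min_mentions=20):
    """Keep only edges that point at users mentioned at least min_mentions times."""
    received = Counter()
    for (_, target), weight in edges.items():
        received[target] += weight
    keep = {user for user, n in received.items() if n >= min_mentions}
    return {pair: w for pair, w in edges.items() if pair[1] in keep}


# Invented accounts and tweets, for illustration only.
tweets_by_user = {
    "astro_a": [
        "@sci_writer nice piece",
        "RT @sci_writer: eclipse tonight",
        "@student good luck",
    ],
    "astro_b": ["@astro_a see you at the meeting"],
}
edges = mention_edges(tweets_by_user)
core = filter_by_mentions(edges, min_mentions=2)
```

The threshold is lowered to 2 here only so that the toy data produces a non-empty result; the slide uses 20.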
18. Study I: Astrophysicists on Twitter
Results: correlations
• comparison of Twitter and publication activity and impact
19. Study I: Astrophysicists on Twitter
Results: characteristics
Mean share of retweets and tweets containing at least one
hashtag per person per group
20. Study I: Astrophysicists on Twitter
Results: characteristics
Mean share of tweets containing at least one user name or
URL per person per group
21. Study I: Astrophysicists on Twitter
Results: content similarity
• overall similarity between abstracts and tweets low
• cos=0.081
• 4.1% of 50,854 tweet NPs in abstracts
• 16.0% of 12,970 abstract NPs in tweets
• high Twitter coverage of most frequent abstract terms
• 97.1% of 104 most frequent noun phrases on Twitter
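The two overlap percentages are asymmetric because they use different denominators. A minimal sketch of such a calculation follows, assuming distinct noun phrases are compared by exact string match (the slides only mention "overlap of character strings", so the matching rule and the function name are assumptions).

```python
def share_found_in(source_terms, target_terms):
    """Percentage of distinct source terms that also occur among the target terms."""
    source, target = set(source_terms), set(target_terms)
    if not source:
        return 0.0
    return 100.0 * len(source & target) / len(source)


# Invented noun phrases, for illustration only.
tweet_nps = ["star", "talk", "day", "moon"]
abstract_nps = ["star", "galaxy"]

tweet_share = share_found_in(tweet_nps, abstract_nps)      # tweet NPs found in abstracts
abstract_share = share_found_in(abstract_nps, tweet_nps)   # abstract NPs found in tweets
```

The reported figures are consistent with a single shared set: 4.1% of 50,854 and 16.0% of 12,970 both come to roughly 2,080 noun phrases.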
22. Results: content similarity
• similarity varies between cos=0.096 and cos=0.037 per user
[Figure: examples per user: cos=0.096 (P=46, Tcol=2,832); cos=0.060 (P=49, Tcol=1,236); cos=0.050 (P=112, Tcol=423)]
24. Study I: Astrophysicists on Twitter
Results: conversational network
[Figure: conversational network; group sizes shown: n=68, n=88, n=40, n=180, n=30, n=109, n=3]
25. Study I: Astrophysicists on Twitter
Results: conversation network
Cluster content
• large overlap of noun phrases
• most frequent terms appear in all clusters
[Figure: term maps labeled Cluster 1 and Cluster 4; grey nodes appear in >2 of 6 clusters]
Terms shown: today, day, time, thank, year, earth, person, science, planet, thing, star, way, sun, talk, life, paper, moon, world, week, lot
26. Results: conversational network
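[Figure: conversational network with topic labels per cluster; Cluster 4 is labeled in the figure]
• meetings and conferences; traveling
• personal; time and places
• scientific careers and funding
• planets; observation
• telescopes; observation; places
• astronomy; observation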
27. Study I: Astrophysicists on Twitter
Conclusions
• Twitter and publication activity are negatively correlated
• user groups show different tweeting behavior regarding use of
hashtags, usernames, URLs and retweeting
• low similarity between abstracts and tweets
Ø Twitter activity does not reflect publication activity
• conversations mainly with science communicators and other
astrophysicists, hardly teachers, students or amateurs
Ø communication with general public through "middlemen"
• conversational clusters vary by user type but topics overlap
Ø astrophysicists are involved in various discussions
28. Study I: Astrophysicists on Twitter
Outlook
• study of Facebook group
• analysis of arXiv papers on Twitter
• survey of astrophysicists on Twitter
Survey on Twitter use by our colleague Kim Holmberg
Please participate: http://goo.gl/S7s6e6
or https://survey.abo.fi/lomakkeet/4445/lomake.html
29. Study II: Biomedical papers on Twitter and Mendeley
Aim of the study
• large-scale analysis of tweets and Mendeley readers
• Twitter and Mendeley coverage
• Twitter and Mendeley user rates
• correlation with citations
• discovering differences between:
• documents
• journals
• disciplines & specialties
Ø providing empirical framework to understand use of
biomedical papers on Twitter and Mendeley
Haustein, S., Peters, I., Sugimoto, C.R., Thelwall, M., & Larivière, V. (2014). Tweeting Biomedicine: An Analysis of Tweets and Citations in the
Biomedical Literature. Journal of the Association for Information Science and Technology. doi: 10.1002/asi.23101
Haustein, S., Larivière, V., Thelwall, M., Amyot, D., & Peters, I. (submitted). Tweets vs. Mendeley readers: How do these two social media
metrics differ? IT-Information Technology.
30. Study II: Biomedical papers on Twitter and Mendeley
Data sets & methods
• 1.4 million PubMed papers covered by WoS
• publication years: 2010-2012
• document types: articles & reviews
• matching of WoS and PubMed
• tweet counts collected by Altmetric.com
• collection based on PMID, DOI, URL
• matching WoS via PMID
• Mendeley readership data collected via API
• matching title and author names
• journal-based matching of NSF classification
31. Study II: Biomedical papers on Twitter and Mendeley
Data sets & methods
Current biases influencing correlation coefficients
Ø compare documents of similar age
Ø normalize for age differences
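Older papers have had more time to collect citations, while social media uptake has grown over time, so raw counts from different publication years are not directly comparable. The slide does not say how the normalization was done. The sketch below assumes one common approach, dividing each paper's count by the mean count of all papers from the same publication year; the function name and data layout are hypothetical.

```python
from collections import defaultdict


def age_normalized(papers):
    """papers: iterable of (paper_id, publication_year, count).

    Returns {paper_id: count divided by the mean count of that year}.
    """
    papers = list(papers)
    totals = defaultdict(lambda: [0, 0])  # year -> [sum of counts, number of papers]
    for _, year, count in papers:
        totals[year][0] += count
        totals[year][1] += 1
    normalized = {}
    for paper_id, year, count in papers:
        year_sum, year_n = totals[year]
        mean = year_sum / year_n
        normalized[paper_id] = count / mean if mean > 0 else 0.0
    return normalized


# Invented counts, for illustration only.
sample = [("a", 2010, 10), ("b", 2010, 30), ("c", 2012, 1), ("d", 2012, 3)]
scores = age_normalized(sample)
```

A score of 1.0 then means "as many counts as the average paper of the same year", regardless of the year.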
32. Study II: Biomedical papers on Twitter and Mendeley
Data sets & methods
Framework
• x-axis: coverage of specialty on platform compared to mean coverage
• y-axis: correlation between social media counts and citations
• bubble size: intensity of use based on mean social media count rate
33. Study II: Biomedical papers on Twitter and Mendeley
Results: documents
Top 10 tweeted documents:
catastrophe & topical / web & social media / curious story
scientific discovery / health implication / scholarly community
Article | Journal | C (citations) | T (tweets)
Hess et al. (2011). Gain of chromosome band 7q11 in papillary thyroid carcinomas of young patients is associated with exposure to low-dose irradiation | PNAS | 9 | 963
Yasunari et al. (2011). Cesium-137 deposition and contamination of Japanese soils due to the Fukushima nuclear accident | PNAS | 30 | 639
Sparrow et al. (2011). Google Effects on Memory: Cognitive Consequences of Having Information at Our Fingertips | Science | 11 | 558
Onuma et al. (2011). Rebirth of a Dead Belousov–Zhabotinsky Oscillator | Journal of Physical Chemistry A | -- | 549
Silverberg (2012). Whey protein precipitating moderate to severe acne flares in 5 teenaged athletes | Cutis | -- | 477
Wen et al. (2011). Minimum amount of physical activity for reduced mortality and extended life expectancy: a prospective cohort study | Lancet | 51 | 419
Kramer (2011). Penile Fracture Seems More Likely During Sex Under Stressful Situations | Journal of Sexual Medicine | -- | 392
Newman & Feldman (2011). Copyright and Open Access at the Bedside | New England Journal of Medicine | 3 | 332
Reaves et al. (2012). Absence of Detectable Arsenate in DNA from Arsenate-Grown GFAJ-1 Cells | Science | 5 | 323
Bravo et al. (2011). Ingestion of Lactobacillus strain regulates emotional behavior and central GABA receptor expression in a mouse via the vagus nerve | PNAS | 31 | 297
34. Study II: Biomedical papers on Twitter and Mendeley
Results: correlations
PubMed papers covered by Web of Science (PY=2011)
Spearman correlations between citations (C), Mendeley readers (R) and tweets (T) for all papers published in
2011 (A, n=586,600), for papers with respectively at least one citation (B, n=410,722), one Mendeley reader (C,
n=390,190) or one tweet (D, n=63,800), one Mendeley reader and one tweet (E, n=45,229) and one citation, one
Mendeley reader and one tweet (F, n=36,068). All results are significant at the 0.01 level (two-tailed).
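Spearman correlations like those in the caption rank both variables and then correlate the ranks, which suits highly skewed count data. The following sketch implements this in plain Python, with tied values receiving their average rank. The function names and the toy counts are invented for illustration; this is not the authors' code.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Pearson correlation of the rank vectors; nan if undefined."""
    n = len(x)
    if n < 2 or n != len(y):
        return float("nan")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        return float("nan")
    return cov / (sx * sy)


# Invented counts, for illustration only.
citations = [0, 3, 10, 1, 7]
tweets = [0, 1, 5, 0, 2]
rho_all = spearman(citations, tweets)

# Subset comparable to set D in the caption: papers with at least one tweet.
pairs = [(c, t) for c, t in zip(citations, tweets) if t >= 1]
rho_tweeted = spearman([c for c, _ in pairs], [t for _, t in pairs])
```

Restricting the data to papers with at least one tweet, reader or citation changes the set of ranked items, which is why the caption reports a separate coefficient for each subset.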
35. Study II: Biomedical papers on Twitter and Mendeley
Results: disciplines
PubMed papers covered by Web of Science 2010-2012
36. Results: specialties
• x-axis: coverage of specialty on platform
• y-axis: correlation between social media counts and citations
• bubble size: intensity of use based on mean social media count rate
37. Study II: Biomedical papers on Twitter and Mendeley
Conclusions
• uptake, usage intensity and correlation differ between
disciplines and specialties
Ø social media counts from different fields not directly
comparable
• citations, Mendeley readers and tweets reflect different kinds
of impact on different social groups
• Mendeley seems to mirror use of broader but still academic
audience, largely students and postdocs
• Twitter seems to reflect popularity among general public and
represents mix of societal impact, scientific discussion and buzz
Ø the numbers of Mendeley readers and tweets are two distinct
social media metrics
38. Outlook
• before applying social media counts in information
retrieval and research evaluation further research is
needed:
Ø identifying different factors influencing popularity of
scholarly documents on social media
Ø analyzing uptake and usage intensity in various
disciplines
Ø differentiating between audiences and engagements
Ø determining roles of social media in scholarly
communication
39. Survey on Twitter use by our colleague Kim Holmberg
Please participate: http://goo.gl/S7s6e6
or https://survey.abo.fi/lomakkeet/4445/lomake.html
Thank you for your attention!
Questions?
Stefanie Haustein
stefanie.haustein@umontreal.ca
@stefhaustein
Vincent Larivière
vincent.lariviere@umontreal.ca
@lariviev
40. References
Adie, E. & Roe, W. (2013). Altmetric: Enriching Scholarly Content with Article-level Discussion and Metrics. Learned
Publishing, 26(1), 11-17.
Bowman, T.D., Demarest, B., Weingart, S.B., Simpson, G.L., Larivière, V., Thelwall, M., & Sugimoto, C.R. (2013).
Mapping DH through heterogeneous communicative practices. Paper presented at Digital Humanities 2013, Lincoln,
Nebraska.
Haustein, S., Bowman, T.D., Holmberg, K., Peters, I., & Larivière, V. (in press). Astrophysicists on Twitter: An in-depth analysis of tweeting and scientific publication behavior. Aslib Proceedings.
Haustein, S., Larivière, V., Thelwall, M., Amyot, D., & Peters, I. (submitted). Tweets vs. Mendeley readers: How do
these two social media metrics differ? IT-Information Technology.
Haustein, S., Peters, I., Sugimoto, C.R., Thelwall, M., & Larivière, V. (2014). Tweeting Biomedicine: An Analysis of
Tweets and Citations in the Biomedical Literature. Journal of the Association for Information Science and
Technology. doi: 10.1002/asi.23101
Holmberg, K., & Thelwall, M. (2013). Disciplinary differences in Twitter scholarly communication. Proceedings of
ISSI 2013 – 14th International Conference of the International Society for Scientometrics and Informetrics, Vienna,
Austria (Vol. 1, pp. 567-582).
Lin, J. & Fenner, M. (2013). Altmetrics in evolution: Defining and redefining the ontology of article-level metrics.
Information Standards Quarterly, 25(2), 20-26.
Priem, J., & Costello, K. L. (2010). How and why scholars cite on Twitter. Proceedings of the 73rd Annual Meeting of
the American Society for Information Science and Technology, Pittsburgh, USA.
Rowlands, I., Nicholas, D., Russell, B., Canty, N., & Watkinson, A. (2011). Social media use in the research
workflow. Learned Publishing, 24, 183–195.