Relative Trends in Scientific Terms on Twitter

  1. 1. Relative Trends in Scientific Terms on Twitter
 Victoria Uren, Aba-Sah Dadzie
 The OAK Group, Dept. of Computer Science, The University of Sheffield

  2. 2. Introduction
 •  scientific research traditionally disseminated via journals, books, scientific conferences
 •  new form of discourse – online social media
    –  suitable forum for disseminating scientific research?
    –  do scientists engage with online social media?
    –  are there sufficient amounts of information on scientific topics?
 •  are there suitable metrics for measuring scientific impact online?
    –  between scientists?
    –  for public engagement?
 •  are these new measures comparable to formal metrics?
altmetrics11: Tracking scholarly impact on the social Web

  3. 3. Outline
 •  Aims/Introduction
 •  Related Work
 •  Experiment
    –  Data
    –  Analysis & Results
 •  Conclusions
 •  Next Steps
 •  Acknowledgements

  5. 5. Related Work
 •  Garfield, E. (from 1950s)
    –  father of scientometrics
 •  Priem et al. (2010)
    –  Scientometrics 2.0 as a new metric for measuring scholarly impact on social web
 •  Lane (2010)
    –  need to improve metrics used to measure scientific impact
 •  Michel et al. (2011)
    –  Google NGrams to analyse culture
    –  among other findings, recognised that fame for scientists is low…
 •  Cheong et al. (2009)
    –  H1N1 spike (trend) detected on Twitter during flu pandemic (May 2009)
 •  Rowe et al. (2011)
    –  influence of content and author features on prediction of active, long term discussions on social web
 •  Kinsella et al. (2011)
    –  using hyperlinked metadata to aid categorisation of topics discussed in online social media

  7. 7. Experiment
 •  exploratory experiment
    –  to determine frequency of occurrence of scientific term usage in online social media
 •  data set
    –  three sets of (scientific) terms selected from UNESCO thesaurus
    –  Google Books NGrams corpus used as a baseline
    –  300 tweets collected in each sample, using Twitter API, for selected terms
 •  frequency/usage analysis
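The frequency analysis outlined on this slide could be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names `term_counts` and `term_shares`, the whole-word, case-insensitive matching rule and the small `sample` list are assumptions made for the example. The terms and tweet texts are taken from the slides.

```python
import re
from collections import Counter

# Three of the terms selected from the UNESCO thesaurus in the slides.
TERMS = ["phosphorus", "alkalinity", "permafrost"]


def term_counts(tweets, terms=TERMS):
    """Count tweets mentioning each term (whole word, case-insensitive)."""
    counts = Counter({term: 0 for term in terms})
    for text in tweets:
        for term in terms:
            if re.search(rf"\b{re.escape(term)}\b", text, flags=re.IGNORECASE):
                counts[term] += 1
    return counts


def term_shares(counts):
    """Convert raw counts into each term's share of all matched tweets."""
    total = sum(counts.values())
    return {term: (n / total if total else 0.0) for term, n in counts.items()}


# Tweet texts quoted in the slides (timestamps omitted); a hypothetical mini-sample.
sample = [
    "Fire and Ice: Permafrost Melt Spews Combustible Methane",
    "The proper total alkalinity for your pool is 100 ppm.",
    "Vitamin D acts as an hormone and plays a controlling role in the "
    "metabolism of calcium and phosphorus",
    "@HDNinjacp go to Permafrost Its never full",
]

counts = term_counts(sample)
shares = term_shares(counts)
```

Counting matching tweets rather than raw word occurrences keeps the figures comparable with per-sample tweet totals; either choice would be defensible for an exploratory study.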

  9. 9. UNESCO Thesaurus 1Gram Terms
 Topic               Terms
 Physical Sciences   Ionization, Electromagnetism, Crystallography
 Chemical Sciences   Phosphorus, Alkalinity, Microchemistry
 Earth Sciences      Permafrost, Lithosphere, Glaciology
 •  selection criteria
    –  minimisation of noise due to polysemy
    –  avoidance of scientific terms with other common/colloquial usage
    –  terms unique to a particular topic
    –  words with a single stem
    –  1Grams only

  10. 10. Baseline Dataset – Google 1Grams
 •  obtained from Google Books NGrams corpus [1]
 •  total NGrams by year for three sets of terms
    –  2006 – 116,029
    –  2007 – 126,206
    –  2008 – 111,417
 •  annual variation by topic (of total NGrams baseline dataset)
    –  Chemical Sciences  50-60%
    –  Physical Sciences  30-40%
    –  Earth Sciences  ~10%
 •  [1] http://ngrams.googlelabs.com/datasets

  11. 11. Baseline Dataset – Google 1Grams

  12. 12. Twitter Dataset
 Sample ID   Collection Period                                              Elapsed Time (h)
 T-300-1     Tue Mar 01 20:56:43 GMT 2011 – Thu Mar 03 14:22:18 GMT 2011    41
 T-300-2     Fri Mar 04 02:35:55 GMT 2011 – Sun Mar 06 18:38:05 GMT 2011    64
 T-300-3     Mon Mar 07 20:31:11 GMT 2011 – Wed Mar 09 16:21:36 GMT 2011    44
 •  three samples collected, containing 300 consecutive tweets each
 •  ~0.003% of total tweets over collection period
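The elapsed times in the table can be checked directly from the collection timestamps. The sketch below is only an illustration: the names `STAMP_FORMAT`, `PERIODS` and `elapsed_hours` are introduced for the example, and rounding to the nearest hour is an assumption about how the slide's figures were derived.

```python
from datetime import datetime

# Timestamp layout used on the slides, e.g. "Tue Mar 01 20:56:43 GMT 2011".
# "GMT" is matched as a literal, so the parsed values are naive datetimes.
# %a and %b assume English day and month names (the default C locale).
STAMP_FORMAT = "%a %b %d %H:%M:%S GMT %Y"

# Collection periods copied from the Twitter Dataset table.
PERIODS = {
    "T-300-1": ("Tue Mar 01 20:56:43 GMT 2011", "Thu Mar 03 14:22:18 GMT 2011"),
    "T-300-2": ("Fri Mar 04 02:35:55 GMT 2011", "Sun Mar 06 18:38:05 GMT 2011"),
    "T-300-3": ("Mon Mar 07 20:31:11 GMT 2011", "Wed Mar 09 16:21:36 GMT 2011"),
}


def elapsed_hours(start, end):
    """Return the time between two slide-style timestamps, in hours."""
    begin = datetime.strptime(start, STAMP_FORMAT)
    finish = datetime.strptime(end, STAMP_FORMAT)
    return (finish - begin).total_seconds() / 3600


hours = {sample: elapsed_hours(*period) for sample, period in PERIODS.items()}

# Rounded to the nearest hour (assumed rounding rule).
rounded = {sample: round(value) for sample, value in hours.items()}
```

Under that rounding assumption the three samples come out at 41, 64 and 44 hours, matching the table.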

  14. 14. Twitter c.f. Google NGrams

  15. 15. Twitter c.f. Google NGrams
 •  higher variation in distribution for Twitter sample
    –  however largely in line with Google NGrams
 •  can Google NGrams serve as a suitable baseline?
    –  need to more closely examine variation…
 •  notable peaks in Twitter sample for three terms
    –  Permafrost (Earth Sciences)
    –  Alkalinity (Chemical Sciences)
    –  Phosphorus (Chemical Sciences)
 •  are these potential trends?

  16. 16. Twitter c.f. Google NGrams

  17. 17. Twitter c.f. Google NGrams
 •  Permafrost
    –  17% and 15% in Twitter samples (T-300-1 & 2) – c.f. 5% in G-2006-2008
    –  41 out of 113 tweets (36%) used in scientific context
    –  large number of tweets referred to
       •  online game server [1]
       •  designer case for iPhone
 •  Alkalinity
    –  none found to have scientific content
    –  mostly used in pseudo-scientific health advice
    –  peak in T-300-2 (31 out of 60 tweets – ~50%)
       •  dominated by pH measures in swimming pools & fish tanks
       •  influence probably due to collection period – weekend – engagement in leisure activities
 •  [1] http://www.everquest2.com/Permafrost
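The proportions quoted on this slide can be reproduced with a few lines of arithmetic. This is a sketch for checking the figures only. The helper `percent` and the variable names are introduced here, and the "ratio to baseline" is an illustrative measure rather than one used in the slides.

```python
def percent(part, whole):
    """Return part/whole expressed as a percentage."""
    return 100 * part / whole


# Permafrost: tweets used in a scientific context (slide reports 36%).
permafrost_scientific = percent(41, 113)

# Alkalinity: pool and fish tank tweets at the T-300-2 peak (slide reports ~50%).
alkalinity_peak = percent(31, 60)

# Permafrost share (%) in each Twitter sample against the 5% share
# in the Google NGrams baseline (G-2006-2008).
baseline_share = 5
twitter_shares = {"T-300-1": 17, "T-300-2": 15}
ratio_to_baseline = {
    sample: share / baseline_share for sample, share in twitter_shares.items()
}
```

On these numbers Permafrost appears roughly three times as often, relatively, in the two Twitter samples as in the baseline, although only about a third of those tweets are scientific in content.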

  18. 18. Example Tweets – Permafrost
 •  advert/chat
    –  @HDNinjacp go to Permafrost Its never full : Fri Mar 04 05:21:02 GMT 2011
    –  @Riffy8888 hey Could you Come to my Party birthday Party on CP March 13 Server Permafrost Dock 6:00PST : Sun Mar 06 04:37:13 GMT 2011
    –  Party Server Permafrost Dock Please Go Its An Early Birthday Party For me : Thu Mar 03 01:38:28 GMT 2011
 •  cold
    –  36 inches of permafrost still, I want to stake my bird condo b4 the squirrals knock it down again..bastids..all ofm : Sat Mar 05 01:51:48 GMT 2011
 •  science
    –  Fire and Ice: Permafrost Melt Spews Combustible Methane http://tiny.ly/be8q : Fri Mar 04 16:43:10 GMT 2011
    –  (retweeted) - Experts Monitor Methane Release from Permafrost: Over the past few years, methane levels around the world have b... http://bit.ly/hvVEJX : Wed Mar 02 12:27:25 GMT 2011
    –  RT @NetNewsBuzz: Permafrost Melt Soon Irreversible Without Major Fossil Fuel Cuts http://tinyurl.com/5w8w2oh #oil #climate #CO2 #fossilfuels : Thu Mar 03 02:57:48 GMT 2011

  19. 19. Example Tweets – Alkalinity T-300-2
 •  Chemistry Help Needed! pH, concentration of carbonate species and alkalinity... just got published: http://bit.ly/hUCpz7
    –  URL points to the question on “My Chemistry Tutor” – homework?
 •  retweeted
    –  The proper total alkalinity for your pool is 100 ppm. http://su.pr/8hrxCE : Fri Mar 04 19:02:20 GMT 2011
    –  If the Total Alkalinity in your swimming pool is low, your pH will be low. http://su.pr/8hrxCE : Fri Mar 04 20:34:11 GMT 2011
 •  spam/adverts (including retweets)
    –  @Poet_Carl_Watts: some foods create acidity or alkalinity after they're metabolized...http://ping.fm/GQTvA #KnowledgeIsPower! : Sat Mar 05 02:38:55 GMT 2011
    –  RT @CourtneyPool: Green juice, oh Liquid Emerald Elixir of Life and Alkalinity! Course through my BODY! #juicing : Sun Mar 06 18:34:29 GMT 2011

  20. 20. Twitter c.f. Google NGrams: Phosphorus
 Sample ID   Total   Legislation   Nutrition   Other Sciences   Industry   White Phosphorus
 T-300-1     129     46            16          29               4          5
 T-300-2     119     4             26          35               9          5
 T-300-3     171     12            23          37               42         19
 •  Twitter trends for Phosphorus in sample T-300-3
    –  Industry
       •  takeover of a Brazilian company by the Indian firm United Phosphorus
    –  White Phosphorus
       •  17 retweets of an emotive message (relation to Middle East wars)
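The peaks noted for sample T-300-3 can be read off the table by converting counts into per-sample shares. The sketch below is illustrative only. It assumes the column order Legislation, Nutrition, Other Sciences, Industry, White Phosphorus, and it assumes that tweets not covered by those five counts were simply left uncategorised. The names `PHOSPHORUS`, `category_shares` and `peak_sample` are introduced for the example.

```python
CATEGORIES = [
    "Legislation",
    "Nutrition",
    "Other Sciences",
    "Industry",
    "White Phosphorus",
]

# Per-sample totals and category counts, copied from the Phosphorus table.
PHOSPHORUS = {
    "T-300-1": (129, [46, 16, 29, 4, 5]),
    "T-300-2": (119, [4, 26, 35, 9, 5]),
    "T-300-3": (171, [12, 23, 37, 42, 19]),
}


def category_shares(total, counts):
    """Share of a sample's Phosphorus tweets falling in each category."""
    return {name: n / total for name, n in zip(CATEGORIES, counts)}


shares = {
    sample: category_shares(total, counts)
    for sample, (total, counts) in PHOSPHORUS.items()
}

# Tweets outside the five categories (assumed to be the remainder).
uncategorised = {
    sample: total - sum(counts) for sample, (total, counts) in PHOSPHORUS.items()
}

# Sample in which each category takes its largest share.
peak_sample = {
    name: max(shares, key=lambda sample: shares[sample][name])
    for name in CATEGORIES
}
```

With these figures both Industry and White Phosphorus take their largest share in T-300-3, which is consistent with the two trends listed on the slide.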

  21. 21. Twitter c.f. Google NGrams: Phosphorus
 •  usage largely with scientific content
    –  with relationships, among others, to legal, nutritional & economic context
    –  five main categories identified
 •  Legislation
    –  limits to use in fertiliser, soap
 •  Nutrition
    –  phosphorus content
 •  Other Science
    –  peak phosphorus, pollution
    –  discovery of arsenic replacing phosphorus in a microbe
    –  tweets about new paper on Redfield ratio in organisms
 •  Industry
    –  mergers, prices of Phosphorus-containing goods
 •  White Phosphorus
    –  use in Middle East wars

  22. 22. Example Tweets – Phosphorus
 •  Legislation
    –  RT @YarnPlayCafe: The fact that he wants to repeal the phosphorus ban and kill the Madison lakes is, by itself, enough to #killthisbill ... : Tue Mar 08 02:04:49 GMT 2011
 •  Nutrition
    –  Big, wet snowflakes drift over the farm. To warm up, I try some Horlicks, a wheat/barley/whey drink with lots of calcium & phosphorus. Mmmm. : Tue Mar 01 20:56:43 GMT 2011
    –  Vitamin D acts as an hormone and plays a controlling role in the metabolism of calcium and phosphorus : Sun Mar 06 12:12:36 GMT 2011
 •  Other Science
    –  [java] 129 : Greater Phosphorus Efficiency http://bit.ly/iehsmK #agriculture : Wed Mar 02 14:36:21 GMT 2011
 •  Industry
    –  #stocks #bse #nse Buy United Phosphorus - positive move to tap largest Latin American market; Edelweiss http://dlvr.it/JdSpV : Tue Mar 08 17:22:55 GMT 2011
    –  Enshi : Wugang develops technique to handle high-phosphorus iron ore - Steel Business Briefing (subscri http://uxp.in/30538045 : Tue Mar 08 09:33:05 GMT 2011
 •  White Phosphorus
    –  Dear America, your white phosphorus and depleted uranium can not stop the growth of Iraqs future. Iraq Will Rise. : Wed Mar 02 07:49:21 GMT 2011
    –  @Remroum so first they steal our land, now they want our "tactics" i.e. poetry? i guess the white phosphorus just isnt cutting it anymore. : Sat Mar 05 03:42:44 GMT 2011
 •  ???
    –  @p_kojo - Phosphorus Potassium - Pinocchio , Im so glad we found each other nw we can hav lots of fun :) : Sun Mar 06 13:43:10 GMT 2011

  24. 24. Conclusions – Experiment
 •  recognised challenges
    –  baseline corpus for online social media difficult to obtain
       •  very small (relatively) samples found in Twitter stream
       •  difficult to obtain representative samples
          –  more effective methods required to extract lower frequency terms
    –  difficulty reproducing experiments
    –  reliability, ethical & privacy issues – due to user-created content
 •  what is a suitable, publicly available baseline corpus?
    –  Google NGrams?
       •  different information collection methods from online social media
    –  coverage of topics may see large variation between corpora
    –  any others?
       •  Wikipedia/DBpedia? TREC?

  25. 25. Engagement with the Web?
 •  why do scientists not tweet (or engage much in other social media)?
    –  is the web not seen to enforce sufficient scientific rigour?
    –  do scientists not view the web as a potential audience?
 •  is the web audience a suitable peer reviewer?
 •  why do scientists hesitate to disseminate information online?
    –  potential for ideas to be stolen?
    –  trust – how to differentiate between valid science and pseudo-science, spam and adverts?
 •  social media largely driven by personal interest, sentiment, opinion
    –  may explain low scientific content
    –  more colloquial use of what is traditionally scientific terminology

  26. 26. Implications for Altmetrics
 •  however - some level of scientific discourse on Twitter
    –  e.g., Phosphorus identified as a potential Twitter trend
 •  online social media may still have potential to serve as an altmetric for measuring impact of science
 •  starting from scientometrics - which looks at author features, e.g.,
    –  co-citation
    –  affiliation – relationship to reputation
 •  corresponding features in online social media
    –  followers
    –  retweets – relationship to trust?

  28. 28. Next Steps
 •  replicate experiments with larger samples over longer period
    –  more detailed analysis
       •  e.g., hashtag analysis; urls within tweets
       •  focus on terms with more trending potential, e.g., nanostructures, nanosilver
 •  consider specific tweets
    –  from scientific media and journals
    –  posted during scientific conferences, congresses
 •  comparison with other independent baseline data sets
 •  compare Twitter use within different disciplines
    –  influence of interdisciplinary collaboration on use of online social media?
 •  create new benchmark data & experiments
    –  define alt-metric for scientific term usage in online social media
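The hashtag and URL analysis proposed here might start from something like the sketch below. It is not part of the reported experiment: the regular expressions are deliberately simple assumptions, the names `extract_features`, `hashtag_counts` and `url_counts` are hypothetical, and the two tweet texts are taken from the example slides.

```python
import re
from collections import Counter

# Simple patterns (assumptions): a hashtag is "#" followed by word characters,
# a URL is "http://" or "https://" followed by non-whitespace.
HASHTAG = re.compile(r"#(\w+)")
URL = re.compile(r"https?://\S+")


def extract_features(tweet):
    """Return (hashtags, urls) found in a tweet's text."""
    urls = URL.findall(tweet)
    # Strip URLs first so fragments such as "page#section" are not
    # mistaken for hashtags.
    text = URL.sub(" ", tweet)
    hashtags = [tag.lower() for tag in HASHTAG.findall(text)]
    return hashtags, urls


tweets = [
    "RT @NetNewsBuzz: Permafrost Melt Soon Irreversible Without Major Fossil "
    "Fuel Cuts http://tinyurl.com/5w8w2oh #oil #climate #CO2 #fossilfuels",
    "Greater Phosphorus Efficiency http://bit.ly/iehsmK #agriculture",
]

hashtag_counts = Counter()
url_counts = Counter()
for tweet in tweets:
    tags, urls = extract_features(tweet)
    hashtag_counts.update(tags)
    url_counts.update(urls)
```

Hashtags are lower-cased so that variants such as #CO2 and #co2 are counted together. Shortened URLs would still need resolving before links to journals or scientific media could be identified.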

  29. 29. Acknowledgements
 •  Elizabeth Cano for discussions on collection and use of data from Twitter streams
 •  V.S. Uren & A.-S. Dadzie funded by:
    –  European Commission 7th Framework Programme project SmartProducts (grant no. 231204)

  30. 30. References
 •  Garfield bib - http://garfield.library.upenn.edu/pub.html
 •  Matthew Rowe, Sofia Angeletou and Harith Alani. (2011) Predicting Discussions on the Social Semantic Web, Proc., ESWC (2) 2011: 405-420
 •  Sheila Kinsella, Mengjiao Wang, John Breslin and Conor Hayes. (2011) Improving Categorisation in Social Media using Hyperlinks to Structured Data Sources, Proc., ESWC (2) 2011: 390-404
 •  others in paper references – see http://altmetrics.org/altmetrics11/uren-v0