Mark2Cure: a crowdsourcing platform for biomedical literature annotation
Benjamin M Good, Max Nanis, Andrew I Su
The Scripps Research Institute, La Jolla, California, USA
ABSTRACT
Recent studies have shown that workers on microtasking platforms such
as Amazon’s Mechanical Turk (AMT) can, in aggregate, generate high-quality annotations of biomedical text. In addition, several recent
volunteer-based citizen science projects have demonstrated the public’s
strong desire and ability to participate in the scientific process even
without any financial incentives. Based on these observations, the
Mark2Cure initiative is developing a Web interface for engaging large
groups of people in the process of manual literature annotation. The
system will support both microtask workers and volunteers. These
workers will be directed by scientific leaders from the community to
help accomplish ‘quests’ associated with specific knowledge extraction
problems. In particular, we are working with patient advocacy groups
such as the Chordoma Foundation to identify motivated volunteers and
to develop focused knowledge extraction challenges. We are currently
evaluating the first prototype of the annotation interface using the AMT
platform.
Challenge

Can non-experts annotate disease occurrences in text better than machines?

The gold standard, the NCBI disease corpus [2]:
• 6900 disease mentions in 793 PubMed abstracts
• developed by a team of 12 annotators
• covers all sentences in a PubMed abstract
• disease mentions are categorized into Specific Disease, Disease Class, Composite Mention and Modifier categories

Use the AMT to test the concept before attempting to motivate a citizen science movement.
Objectives for Annotators

• Highlight all diseases and disease abbreviations.
  "...are associated with Huntington disease (HD)... HD patients received..."
  "The Wiskott-Aldrich syndrome (WAS)..."

• Highlight the longest span of text specific to a disease.
  "...contains the insulin-dependent diabetes mellitus locus..." and not just 'diabetes'.
  "...was initially detected in four of 33 colorectal cancer families..."

• Highlight disease conjunctions as single, long spans.
  "...the life expectancy of Duchenne and Becker muscular dystrophy patients..."
  "...a significant fraction of familial breast and ovarian cancer, but undergoes..."

• Highlight symptoms - the physical results of having a disease.
  "XFE progeroid syndrome can cause dwarfism, cachexia, and microcephaly. Patients often display learning disabilities, hearing loss, and visual impairment."

• Highlight all occurrences of disease terms.
  "Women who carry a mutation in the BRCA1 gene have an 80% risk of breast cancer by the age of 70. Individuals who have rare alleles of the VNTR also have an increased risk of breast cancer (2-4)."
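Guidelines like these produce span annotations over abstract text. A minimal sketch of how one highlight could be represented, using the four NCBI corpus categories named above (the dataclass layout and field names are illustrative assumptions, not Mark2Cure's actual storage format):

```python
# Illustrative span-based disease-mention record; the layout is an
# assumption, not Mark2Cure's actual format.
from dataclasses import dataclass

CATEGORIES = {"SpecificDisease", "DiseaseClass", "CompositeMention", "Modifier"}

@dataclass(frozen=True)
class Mention:
    pmid: str       # PubMed abstract the span comes from
    start: int      # character offset of the first highlighted character
    end: int        # character offset one past the last highlighted character
    category: str   # one of CATEGORIES

    def __post_init__(self):
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category}")
        if not 0 <= self.start < self.end:
            raise ValueError("start must be >= 0 and < end")

text = "...are associated with Huntington disease ( HD )..."
m = Mention(pmid="12345", start=23, end=41, category="SpecificDisease")
print(text[m.start:m.end])  # -> Huntington disease
```

Character offsets keep the annotation independent of tokenization, which matters for the "longest span" guideline above.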
[Figure: number of articles added to PubMed per year]

[Figures: worker instructions and examples]
Idea: People are very effective processors of text, even in areas where they aren't experts [1]. Numerous experiments have shown the public's desire to contribute to science. Let's give them an opportunity to help annotate the biomedical literature.
Approach: Citizen Science

[Figure: precision, recall and F as a function of the number of votes per annotation (0-5)]
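The precision/recall/F curve above can be produced by sweeping the minimum number of worker votes required to accept a span, then scoring accepted spans against the gold standard. A minimal sketch, assuming exact (pmid, start, end) span matching; the matching criterion is not stated on the poster:

```python
# Sketch of the vote-threshold evaluation: accept a span only if at least
# `min_votes` workers marked it, then score accepted spans against gold
# by exact match. Exact matching is an assumption.
from collections import Counter

def prf_at_threshold(worker_spans, gold_spans, min_votes):
    votes = Counter(worker_spans)  # span -> number of workers who marked it
    accepted = {s for s, n in votes.items() if n >= min_votes}
    gold = set(gold_spans)
    tp = len(accepted & gold)
    precision = tp / len(accepted) if accepted else 0.0
    recall = tp / len(gold) if gold else 0.0
    f = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return precision, recall, f

# Toy example: 3 workers agree on one span, single workers mark two more.
workers = [("a1", 0, 5), ("a1", 0, 5), ("a1", 0, 5), ("a1", 9, 14), ("a1", 20, 25)]
gold = [("a1", 0, 5), ("a1", 9, 14)]
print(prf_at_threshold(workers, gold, min_votes=3))
```

Raising the threshold trades recall for precision, which is the shape the figure reports.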
Costs
• one week each ($30)
• one month turk-specific developer time...
[Figure: Consistency with NCBI standard, Development Corpus. Bars compare:
• mturk experiment 1, minimum 3 votes per annotation
• mturk experiment 2, minimum 3 votes per annotation
• NCBO annotator (Human Disease Ontology)
• NCBI conditional random field trained on the AZ corpus (only "all" reported)]
Exp. 1 results
Testing on the 100-abstract "development set", 5 workers per abstract, $0.06 per completed abstract.
To what degree can we reproduce the NCBI disease corpus [2]?

RESULTS, 2 experiments

Consistency(A,B) = 2 * 100 * (N shared annotations) / (N(A) + N(B))
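The consistency measure is a Dice-style overlap scaled to 0-100. A minimal sketch, assuming annotations are compared as exact spans:

```python
# Sketch of the consistency measure defined above:
# Consistency(A, B) = 2 * 100 * |A ∩ B| / (|A| + |B|)
# i.e. a Dice coefficient scaled to 0-100. Exact span matching is assumed.

def consistency(a, b):
    a, b = set(a), set(b)
    if not a and not b:
        return 100.0  # convention: two empty annotation sets agree trivially
    return 2 * 100 * len(a & b) / (len(a) + len(b))

# Toy example with spans as (pmid, start, end) tuples; 2 of 6 total
# annotations are shared between the two annotators.
A = {("a1", 0, 5), ("a1", 9, 14), ("a1", 20, 25)}
B = {("a1", 0, 5), ("a1", 9, 14), ("a1", 30, 35)}
print(consistency(A, B))
```

The same function scores worker-vs-gold agreement and worker-vs-worker agreement, which makes the two experiments directly comparable.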
Identifying concepts and relationships in biomedical text enables
knowledge to be applied in computational analyses, such as gene set
enrichment evaluations, that would otherwise be impossible. As such,
there is a long and fruitful history of BioNLP projects that apply natural
language processing to address this challenge. However, the state of the
art in BioNLP still leaves much room for improvement in terms of
precision, recall and the complexity of knowledge structures that can be
extracted automatically. Expert curators are still vital to the process of
knowledge extraction but are in short supply.
Goal: structure all knowledge published as text on the same day it appears in PubMed, with expert-human-level precision and recall.
RESULTS, Comparison to concept recognition tools

Proof of Concept Experiment with AMT (work in progress)
Exp. 2 changes
• Expanded instructions with more examples
• Minor interface changes (selecting one term automatically selects all other occurrences)
Nearly identical results.

Next Steps
• Continued refinement of the annotation interface with AMT
• Experiment to compare AMT results versus volunteers
• Collaborations with disease groups such as the Chordoma Foundation to prime the flow of citizen scientist annotators
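The Exp. 2 interface change (selecting one term automatically selects all other occurrences) amounts to a whole-word search over the abstract text. A minimal sketch; the helper name is hypothetical:

```python
import re

def all_occurrences(text, selected):
    """Hypothetical helper: when a worker highlights `selected` once,
    return the character spans of every whole-word occurrence so the
    interface can auto-highlight them all (the Exp. 2 behavior)."""
    pattern = re.compile(r"\b" + re.escape(selected) + r"\b", re.IGNORECASE)
    return [(m.start(), m.end()) for m in pattern.finditer(text)]

text = ("Women who carry a mutation in the BRCA1 gene have an 80% risk of "
        "breast cancer by the age of 70. Individuals who have rare alleles "
        "of the VNTR also have an increased risk of breast cancer.")
spans = all_occurrences(text, "breast cancer")
print(len(spans))  # -> 2
```

Auto-selecting repeats directly enforces the "highlight all occurrences of disease terms" guideline, so workers cannot miss later mentions.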
AMT workers performed better than a conditional random field trained on the AZ corpus.
We are hiring! Looking for postdocs and programmers interested in crowdsourcing and bioinformatics. Contact asu@scripps.edu
REFERENCES
1. Zhai, Haijun, et al. "Web 2.0-based crowdsourcing for high-quality gold standard development in clinical natural
language processing." Journal of medical Internet research 15.4 (2013).
2. Doğan, Rezarta Islamaj, and Zhiyong Lu. "An improved corpus of disease mentions in PubMed citations."
Proceedings of the 2012 Workshop on Biomedical Natural Language Processing. Association for Computational
Linguistics, 2012.
CONTACT
Benjamin Good: bgood@scripps.edu Andrew Su: asu@scripps.edu
FUNDING
We acknowledge support from the National Institute of General Medical
Sciences (GM089820 and GM083924).