Wikidata as a FAIR knowledge graph for the life sciences
September 24, 2019
Slides: slideshare.net/andrewsu
Andrew Su, Ph.D.
@andrewsu
http://sulab.org
Q19857262
The Gene Wiki project, circa 2008
Huss, PLoS Biol, 2008
Data imported from structured databases
Summarized knowledge via crowdsourcing
Wikipedia is to text
as Wikidata is to (biomedical) data
"Provide a database of the world's knowledge that anyone can edit"
- Denny Vrandečić
Item: Q13561329 (http://www.wikidata.org/wiki/Q13561329)
• Subclass of (Property:P279): Protein (Q8054)
• Regulates (Property:P128): Neural development (Q1345738)
• Physically interacts with (Property:P129): VLDL receptor (Q1979313), Amyloid beta A4 (Q423510)
• Decreased expression in (Property:P1910): Schizophrenia (Q41112), Bipolar disorder (Q131755)
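The same statements, as identifiers only:
• Q13561329 Property:P279 Q8054
• Q13561329 Property:P128 Q1345738
• Q13561329 Property:P129 Q1979313
• Q13561329 Property:P129 Q423510
• Q13561329 Property:P1910 Q41112
• Q13561329 Property:P1910 Q131755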
https://www.wikidata.org/wiki/Special:EntityData/Q13561329.json
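The JSON served at the Special:EntityData URL above can be read programmatically. A minimal sketch, assuming the standard Wikidata entity JSON layout (an "entities" object, then the item, then "claims" keyed by property ID). The helper names `entity_data_url`, `statement_targets`, and `_item_statement` are hypothetical, and `SAMPLE` is a hand-built illustration in the shape of the real response, not live data:

```python
def entity_data_url(qid: str) -> str:
    """Build the Special:EntityData JSON URL for an item (same pattern as the slide)."""
    return f"https://www.wikidata.org/wiki/Special:EntityData/{qid}.json"


def statement_targets(entity_json: dict, qid: str, pid: str) -> list:
    """Return the item IDs that `qid` points to through property `pid`."""
    claims = entity_json["entities"][qid].get("claims", {})
    targets = []
    for statement in claims.get(pid, []):
        snak = statement["mainsnak"]
        # "novalue" / "somevalue" snaks carry no datavalue, so skip them.
        if snak.get("snaktype") != "value":
            continue
        value = snak["datavalue"]["value"]
        # Only item-valued statements have an "id" field.
        if isinstance(value, dict) and "id" in value:
            targets.append(value["id"])
    return targets


def _item_statement(pid: str, target_qid: str) -> dict:
    """Hypothetical helper: build one item-valued statement for the sample below."""
    return {
        "type": "statement",
        "mainsnak": {
            "snaktype": "value",
            "property": pid,
            "datavalue": {
                "type": "wikibase-entityid",
                "value": {"entity-type": "item", "id": target_qid},
            },
        },
    }


# Illustrative sample only; the real response contains many more fields.
SAMPLE = {
    "entities": {
        "Q13561329": {
            "id": "Q13561329",
            "claims": {
                "P279": [_item_statement("P279", "Q8054")],
                "P129": [
                    _item_statement("P129", "Q1979313"),
                    _item_statement("P129", "Q423510"),
                ],
            },
        }
    }
}

print(entity_data_url("Q13561329"))
print(statement_targets(SAMPLE, "Q13561329", "P129"))
```

To use live data instead of the sample, fetch the URL with any HTTP client and pass the decoded JSON to `statement_targets`.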
Qualifiers
References
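In the entity JSON, qualifiers and references sit next to each statement's main value. A minimal sketch, assuming the standard layout ("qualifiers" keyed by property ID, and "references" as a list of objects each holding "snaks" keyed by property ID). The function name `statement_context` and the sample statement are hypothetical; the qualifier and reference values shown are placeholders, not real Wikidata content:

```python
def statement_context(statement: dict) -> dict:
    """List which properties are used as qualifiers and in each reference."""
    qualifier_props = sorted(statement.get("qualifiers", {}).keys())
    reference_props = [
        sorted(reference.get("snaks", {}).keys())
        for reference in statement.get("references", [])
    ]
    return {"qualifiers": qualifier_props, "references": reference_props}


# Placeholder statement: datavalues of qualifier/reference snaks omitted for brevity.
SAMPLE_STATEMENT = {
    "mainsnak": {
        "snaktype": "value",
        "property": "P1910",
        "datavalue": {
            "type": "wikibase-entityid",
            "value": {"entity-type": "item", "id": "Q41112"},
        },
    },
    # Qualifier: P459 is "determination method" on Wikidata.
    "qualifiers": {"P459": [{"snaktype": "value", "property": "P459"}]},
    # Reference: P248 is "stated in", P813 is "retrieved".
    "references": [
        {
            "snaks": {
                "P248": [{"snaktype": "value", "property": "P248"}],
                "P813": [{"snaktype": "value", "property": "P813"}],
            }
        }
    ],
}

print(statement_context(SAMPLE_STATEMENT))
```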
Biomedical use cases
• ID translation (aka Rosetta Stone)
• Integrative queries
• Graph mining for drug repurposing
• Generalized back-end database for customized front-end applications
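The ID translation use case can be sketched as a SPARQL query against the Wikidata Query Service. This is a minimal sketch, not the method used in the talk: it assumes the public endpoint https://query.wikidata.org/sparql and the Wikidata properties P351 (Entrez Gene ID) and P594 (Ensembl gene ID). The function names and the User-Agent string are hypothetical, and the Entrez IDs are arbitrary examples:

```python
import json
import urllib.parse
import urllib.request

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_id_translation_query(source_pid: str, target_pid: str, source_ids: list) -> str:
    """Build a SPARQL query mapping identifiers of one type to another."""
    # json.dumps gives a double-quoted literal, which is valid SPARQL for simple IDs.
    values = " ".join(json.dumps(str(identifier)) for identifier in source_ids)
    return (
        "SELECT ?item ?source ?target WHERE {\n"
        f"  VALUES ?source {{ {values} }}\n"
        f"  ?item wdt:{source_pid} ?source ;\n"
        f"        wdt:{target_pid} ?target .\n"
        "}"
    )


def run_query(query: str) -> list:
    """Send the query to the endpoint and return the result bindings (needs network)."""
    url = WDQS_ENDPOINT + "?" + urllib.parse.urlencode({"query": query, "format": "json"})
    request = urllib.request.Request(url, headers={"User-Agent": "id-translation-sketch/0.1"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)["results"]["bindings"]


# Entrez Gene ID (P351) to Ensembl gene ID (P594); only the query text is built here.
query = build_id_translation_query("P351", "P594", ["672", "7157"])
print(query)
```

Calling `run_query(query)` would execute the lookup; it is left uncalled here so the sketch runs without network access. Swapping the two property IDs translates in the other direction.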
Small data to big data
?
Chlambase.org for the Chlamydia research community
Community-specific knowledge
Genetic mutants, gene expression, host-pathogen interactions, orthologs, ...
Key advantages of Wikidata-backed applications
• Application development can focus on the front-end interface
• Data persistence extends beyond the life of the web application / grant
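One way to picture the first advantage: with Wikidata as the back end, the application-side data layer can shrink to a thin adapter over query results. A minimal sketch, assuming results arrive in the standard SPARQL JSON results format ("head" with "vars", "results" with "bindings"). The function name `bindings_to_rows` and the sample payload, including its label text, are hypothetical:

```python
def bindings_to_rows(sparql_json: dict) -> list:
    """Flatten SPARQL JSON results into plain dicts a front end can render."""
    variables = sparql_json["head"]["vars"]
    rows = []
    for binding in sparql_json["results"]["bindings"]:
        # Unbound variables are simply absent from a binding; expose them as None.
        rows.append(
            {var: (binding[var]["value"] if var in binding else None) for var in variables}
        )
    return rows


# Hypothetical payload in the standard result shape (labels are placeholders).
SAMPLE_RESULTS = {
    "head": {"vars": ["item", "itemLabel"]},
    "results": {
        "bindings": [
            {
                "item": {"type": "uri", "value": "http://www.wikidata.org/entity/Q13561329"},
                "itemLabel": {"type": "literal", "value": "example label"},
            },
            {
                "item": {"type": "uri", "value": "http://www.wikidata.org/entity/Q8054"},
            },
        ]
    },
}

for row in bindings_to_rows(SAMPLE_RESULTS):
    print(row)
```

Because the data itself lives in Wikidata, nothing in this layer needs to be migrated or archived when the application is retired.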
Biomedical use cases
• ID translation
• Integrative queries
• Graph mining for drug repurposing
• Generalized back-end database for customized front-end applications
• Community curation of ontologies
Curated knowledge
Crowd-sourced knowledge
Wikidata Crowdsourcing Model for Enhancing Disease Ontology (DO) Curation
• GitHub issues
• GitHub pull requests
• CSVs
Wikidata Crowdsourced Edits on Disease Items
98% acceptance rate after expert curation
https://commons.wikimedia.org/wiki/File:FAIR_data_principles.jpg
Acknowledgements
• Andra Waagmeester (Micelio)
• Sabah Ul-Hasan
• Ginger Tsueng
• Mike Mayers
• Roger Tu
• Lynn Schriml (Univ. Maryland)
• Chunlei Wu (Scripps Research)
• Kevin Hybiske (Univ. Washington)
Past team members
• Ben Good
• Greg Stupp
• Tim Putman
• Sebastian Burgstaller-Muehlbacher
• Núria Queralt Rosinach
• Elvira Mitraka
• Derek Jow
• Paul Pavlidis
Funding
