Towards a biodiversity knowledge graph
TDW2017 Keynote

Published in: Science

  1. Rod Page @rdmpage http://iphylo.blogspot.com
  2. Knowledge graphs
  3. Holly Bik @hollybik Let’s rise up to unite taxonomy and technology 10.1371/journal.pbio.2002231
  4. http://ispecies.org Simple JavaScript mashup: DBpedia, GBIF, CrossRef, EOL, Open Tree of Life, TreeBASE
  5. https://doi.org/10.7717/peerj.190
  6. The Semantic web: “The future of the web… and always will be” – Peter Norvig (Google)
  7. Obstacles to building knowledge graphs
     • Technical
     • Social
  8. Obstacles to building knowledge graphs
     • Need globally unique, persistent identifiers (how to label the nodes of the graph)
     • Need to create and agree on vocabularies (how to label the edges of the graph)
     • Need to agree how to transmit the graph
     • Who stores the global graph?
  9. A new hope
     • The identifier wars are (nearly) over (DOIs FTW)
     • Lots of domain-specific vocabularies, but schema.org is “good enough” for most things
     • XML becoming a bedtime story to frighten the children, JSON is everywhere (JSON-LD FTW).
     • Wikidata
  10. Obstacles to building knowledge graphs
     • Technical
     • Social Economic
  11. Identifiers, identifiers, identifiers, identifiers
  12. How do we measure progress? before now now before Linear growth (easy) Connectivity (hard)
  13. Need network effects: One is useless, Two is “meh”, Many is better
  14. The Semantic web: “The future of the web… and always will be” – Peter Norvig (Google)
  15. The knowledge graph is already here (it’s just not evenly distributed) William Gibson @GreatDismal
  16. Google’s Knowledge Graph
  17. PREFIX wdt: <http://www.wikidata.org/prop/direct/>
      PREFIX wd: <http://www.wikidata.org/entity/>
      SELECT ?root_name ?parent_name ?child_name WHERE {
        VALUES ?root_name {"Hominini"}
        ?root wdt:P225 ?root_name .
        ?child wdt:P171+ ?root .
        ?child wdt:P171 ?parent .
        ?child wdt:P225 ?child_name .
        ?parent wdt:P225 ?parent_name .
      }
      http://biohackathon.org/d3sparql/ Toshiaki Katayama @tktym
      http://iphylo.blogspot.ca/2017/01/displaying-taxonomic-classifications.html
  18. “Citations for the sum of human knowledge” WikiCite @WikiCite
     Goal 1: Every citation in the Wikipedias should be in Wikidata
     Goal 2: Every citation should be in Wikidata (!?)
  19. Small knowledge graphs (hexastores)
  20. Very simple ontology Tom Scott @derivadow Leigh Dodds @ldodds
  21. Hexastore
     • A triple is [s, p, o]
     • Find all statements [s, ?, ?] is simple array lookup (all elements with key “s”)
     • Find all statements [?, ?, o] is slow (scan all triples)…
     • …unless we add array of [o, s, p] triples, then simple array lookup (all elements with key “o”)
     • Six variations cover all queries: [s, p, o], [s, o, p], [p, s, o], [p, o, s], [o, s, p], [o, p, s] (hence “hexastore”)
     • In-memory graph database in JavaScript (think offline apps) http://crubier.github.io/Hexastore/
  22. Xanadu, the web that wasn’t
  23. Ted Nelson: hyperlinks and hypermedia, two-way links and “transclusion” = Xanadu
     Tim Berners-Lee: HTTP, URL, HTML, one-way links = world wide web
  24. Web page, Other web page. Web linking: one way, document-level, “target” doesn’t know that it is linked to (“cited”), link can break (404)
  25. text, Work, Source, text. Xanadu linking: two way, fragment-level, “source” knows it is linked to, source content is embedded, links don’t break
  26. Xanadu
  27. A New Account of the Genus Horsfieldia (Myristicaceae), Pt 2. W J J O De Wilde. The Gardens' bulletin, Singapore 38(1): 55-144 (1985) http://biostor.org/reference/175018 Horsfieldia lancifolia BioStor @biostor_org Biodiversity Heritage Library @biodivlibrary
  28. Flora Malesiana. Series I - Seed Plants, Volume 14. Myristicaceae https://doi.org/10.3897/ab.e1141
  29. Description Description Flora Article
  30. Embedded markup (bad)… <i>Crocidura absconditus</i>, new species
     …versus annotation (good): Crocidura absconditus, new species 0 20 { [0,20], “italics” }
     (think NLM JATS XML markup versus Substance JSON used by Lens viewer https://lens.elifesciences.org/about/)
  31. @hypothes_is
  32. Annotating a scientific paper
  33. Aggregating annotations (iPhylo) http://iphylo.blogspot.co.uk/2016/06/aggregating-annotations-on-scientific_30.html Taxonomic names, specimen codes, geographic localities, references are all annotations
  34. Taxonomic databases are not lists of names… …they are lists of annotations (“this name occurs on this page”)
  35. Annotations are retrospective nanopublications. Annotating existing content (extracting “facts”). Today: Publishing “facts” as nanopublications. Stream of “facts”
  36. Social design and the knowledge graph
  37. Obstacles to building knowledge graphs
     • Technical
     • Social Economic
  38. Nico Franz @taxonbytes
  39. ORCID (person) DOI (publication) LSID (plant name) Find my papers that published new species @SandyKnapp
  40. ORCID (person) DOI (publication) LSID (plant name) #Iamataxonomist (claim/demonstrate expertise)
  41. What Sandy really wants: “What specimens that I collected have been described as new species by other people?”
     Diagram labels: specimen, plant name, publication, person, other person; collected, type for, published in, author; not the same person
  42. Knowledge graphs considered harmful (remember Impact Factors?)
  43. http://www.museum-analytics.org/
  44. Cited, linkable specimens NMNH Vertebrate Zoology Herpetology Collections 11194 CAS Herpetology Collection Catalog MCZ Herpetology Collection Herpetology Collection (University of Kansas Biodiversity Research Center) 9619 6720 5818 http://iphylo.blogspot.co.uk/2012/02/gbif-specimens-in-biostor-who-are-top.html
  45. We will need to ensure our knowledge graph is free, open, and used for good
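
Slide 9 suggests that DOIs, schema.org and JSON-LD are "good enough" building blocks for the nodes and edges listed on slide 8. As a rough illustration, the paper cited on slide 3 could be written as a JSON-LD node like the one below. The DOI, title and author name come from that slide; the choice of schema.org terms (`ScholarlyArticle`, `identifier`, `author`) and the helper name `to_jsonld` are assumptions made for this sketch, not something the talk prescribes.

```python
import json

# DOI taken from slide 3; everything about the modelling is an assumption.
doi = "10.1371/journal.pbio.2002231"

article = {
    "@context": "https://schema.org",
    "@type": "ScholarlyArticle",           # assumed schema.org type
    "@id": f"https://doi.org/{doi}",       # the DOI doubles as the node label
    "name": "Let's rise up to unite taxonomy and technology",
    "identifier": doi,
    "author": {"@type": "Person", "name": "Holly Bik"},
}


def to_jsonld(node: dict) -> str:
    """Serialise a node as a JSON-LD document (hypothetical helper)."""
    return json.dumps(node, indent=2, ensure_ascii=False)


print(to_jsonld(article))
```

The node identifier is a resolvable DOI URL, which is the sense in which a persistent identifier "labels the nodes"; the property names play the role of the shared vocabulary for edges.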
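
The SPARQL query on slide 17 returns one row per parent/child pair beneath the root taxon. A minimal sketch of how such rows might be folded into a nested classification before display is shown below. The sample rows are hand-written for illustration and were not fetched from Wikidata, the function name `build_tree` is hypothetical, and the code assumes the rows contain no cycles.

```python
from collections import defaultdict


def build_tree(rows, root):
    """Fold (parent_name, child_name) rows into a nested dict rooted at `root`."""
    children = defaultdict(list)
    for parent, child in rows:
        if child not in children[parent]:
            children[parent].append(child)

    def expand(name):
        # Leaves become empty dicts; children are sorted for stable output.
        return {child: expand(child) for child in sorted(children.get(name, []))}

    return {root: expand(root)}


# Illustrative rows only (an assumption), in (parent_name, child_name) form.
sample_rows = [
    ("Hominini", "Homo"),
    ("Hominini", "Pan"),
    ("Homo", "Homo sapiens"),
    ("Pan", "Pan troglodytes"),
    ("Pan", "Pan paniscus"),
]

tree = build_tree(sample_rows, "Hominini")
print(tree)
```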
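
The hexastore idea on slide 21 can be sketched in a few lines. The version below is a toy under stated assumptions: it uses nested dictionaries rather than the keyed arrays the slide mentions, it is not the API of the JavaScript library linked from the slide, and the class name, method names and example identifiers (`ex:paper1` and so on) are all hypothetical.

```python
from collections import defaultdict
from itertools import permutations


class Hexastore:
    """Toy in-memory triple store with one index per ordering of (s, p, o)."""

    # The six orderings: spo, sop, pso, pos, osp, ops.
    ORDERS = ["".join(p) for p in permutations("spo")]

    def __init__(self):
        # index["osp"][o][s] is the set of predicates linking s to o, etc.
        self.index = {
            order: defaultdict(lambda: defaultdict(set)) for order in self.ORDERS
        }

    def add(self, s, p, o):
        triple = {"s": s, "p": p, "o": o}
        for order in self.ORDERS:
            a, b, c = (triple[k] for k in order)
            self.index[order][a][b].add(c)

    @staticmethod
    def _as_spo(order, values):
        by_key = dict(zip(order, values))
        return (by_key["s"], by_key["p"], by_key["o"])

    def match(self, s=None, p=None, o=None):
        """Return the set of (s, p, o) triples matching a pattern; None is a wildcard."""
        pattern = {"s": s, "p": p, "o": o}
        # Bound positions go first so they become direct key lookups.
        order = sorted("spo", key=lambda k: pattern[k] is None)
        first, second, third = order
        index = self.index["".join(order)]
        results = set()
        firsts = index.keys() if pattern[first] is None else [pattern[first]]
        for a in firsts:
            by_second = index.get(a, {})
            seconds = by_second.keys() if pattern[second] is None else [pattern[second]]
            for b in seconds:
                for c in by_second.get(b, ()):
                    if pattern[third] is None or pattern[third] == c:
                        results.add(self._as_spo(order, (a, b, c)))
        return results


store = Hexastore()
store.add("ex:paper1", "author", "ex:person1")
store.add("ex:paper1", "describes", "ex:name1")
store.add("ex:paper2", "author", "ex:person1")

print(store.match(o="ex:person1"))  # a [?, ?, o] query served by an o-first index
```

Storing every triple six times trades memory for lookup speed: whichever positions of the pattern are bound, some index has them as its leading keys, so no query needs a full scan.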
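
Slide 30 contrasts markup embedded in the text with an annotation that points at a character range. The sketch below keeps the text plain and stores the italics as a separate record, then derives the embedded form on demand. It assumes the slide's [0,20] are inclusive character offsets (the species name is 21 characters long), that annotations do not overlap, and that the text needs no HTML escaping; the field names and `render_html` are hypothetical.

```python
text = "Crocidura absconditus, new species"

# Stand-off annotation: the text itself is untouched.
# Offsets are assumed to be inclusive, following the slide's [0,20].
annotations = [{"start": 0, "end": 20, "type": "italics"}]


def render_html(text, annotations):
    """Produce embedded markup from plain text plus non-overlapping annotations."""
    tags = {"italics": "i"}
    out = []
    pos = 0
    for ann in sorted(annotations, key=lambda a: a["start"]):
        tag = tags[ann["type"]]
        stop = ann["end"] + 1  # convert inclusive end to a Python slice bound
        out.append(text[pos:ann["start"]])
        out.append(f"<{tag}>{text[ann['start']:stop]}</{tag}>")
        pos = stop
    out.append(text[pos:])
    return "".join(out)


print(render_html(text, annotations))
```

Because the annotation lives outside the text, further layers (a taxonomic name, a specimen code, a locality) could be added as more records over the same string without rewriting it.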
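
The question on slide 41 is a path through the graph: from a person to the specimens they collected, to the names those specimens are types for, to the publications of those names, to authors who are not the collector. A minimal sketch over a hand-made list of triples follows. The predicate names, identifiers and data are all invented for illustration; a real graph would use ORCIDs, DOIs and LSIDs as on slides 39 and 40.

```python
# Hypothetical triples: (subject, predicate, object).
triples = [
    ("specimen:1", "collectedBy", "person:A"),
    ("specimen:1", "typeFor", "name:X"),
    ("name:X", "publishedIn", "pub:1"),
    ("pub:1", "author", "person:B"),
    ("specimen:2", "collectedBy", "person:A"),
    ("specimen:2", "typeFor", "name:Y"),
    ("name:Y", "publishedIn", "pub:2"),
    ("pub:2", "author", "person:A"),
]


def objects(triples, s, p):
    return {o for (s2, p2, o) in triples if s2 == s and p2 == p}


def subjects(triples, p, o):
    return {s for (s, p2, o2) in triples if p2 == p and o2 == o}


def types_described_by_others(triples, collector):
    """Specimens collected by `collector` that are types for names published by other people."""
    found = set()
    for specimen in subjects(triples, "collectedBy", collector):
        for name in objects(triples, specimen, "typeFor"):
            for pub in objects(triples, name, "publishedIn"):
                authors = objects(triples, pub, "author")
                # "Not the same person": the collector is not among the authors.
                if authors and collector not in authors:
                    found.add(specimen)
    return found


print(types_described_by_others(triples, "person:A"))
```

The query only works if each hop uses the same identifier on both sides, which is why the talk keeps returning to identifiers.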
