Building the Biodiversity Knowledge Graph

Slides from 4th Global Online Biodiversity Informatics Seminar https://plus.google.com/events/clvk6nd14d9fhh7e4a6oe5mt9s0

Published in: Science, Technology, Education

  1. 1. Building the Biodiversity Knowledge Graph @rdmpage http://iphylo.blogspot.com
  2. 2. • There are known knowns, things we know that we know • There are known unknowns, things we now know we don’t know • But there are also unknown unknowns, things we do not know we don’t know
  3. 3. known unknown
  4. 4. Things we don’t know that we know
  5. 5. Melissotarsus insularis
  6. 6. Melissotarsus insularis: no hit. CASENT0107663-D01, DQ176312: Melissotarsus sp. BLF m1. Melissotarsus insularis = Melissotarsus sp. BLF m1
  7. 7. We have a vast amount of “old stuff”
  8. 8. Numbers of new animal names (chart annotations: 1923, WWI, WWII)
  9. 9. We are learning new stuff
  10. 10. “New” and “old” are disconnected
  11. 11. Dark taxa http://iphylo.blogspot.co.uk/2011/04/dark-taxa-genbank-in-post-taxonomic.html
  12. 12. Mammals in GenBank Proper Linnaean names Aus sp.
  13. 13. Mammals Proper Linnaean names Aus sp.
  14. 14. “Invertebrates” BOLD
  15. 15. Challenge: linking things together (sticky data)
  16. 16. Data is good
  17. 17. More data is better…
  18. 18. …but this data is not sticky
  19. 19. Location
  20. 20. name name Tags
  21. 21. Namenname
  22. 22. Identifiers
  23. 23. Shared identifiers are sticky
  24. 24. Identifiers • Globally unique • Resolvable (for humans and machines) • Use other people’s identifiers to link things together
  25. 25. Human and machine readable machine human
  26. 26. { "author": [ { "family": "Page", "given": "Roderic D.M." } ], "container-title": "PeerJ", "reference-count": 60, "page": "e190", "deposited": { "date-parts": [ [ 2013, 11, 18 ] ], "timestamp": 1384732800000 }, "title": "BioNames: linking taxonomy, texts, and trees", "type": "journal-article", "DOI": "10.7717/peerj.190", "ISSN": [ "2167-8359" ], "URL": "http://dx.doi.org/10.7717/peerj.190" }
  27. 27. Using other people’s identifiers is hard work and scary • Hard work - you have to find their identifiers • Scary - what happens if the other person breaks their identifiers? • Solution: make identifiers easy to find, and make them robust (e.g., CrossRef and DOIs)
  28. 28. http://dx.doi.org/10.7717/peerj.190 DOI (Digital Object Identifier)
  29. 29. Biodiversity Knowledge Graph (linking things together)
  30. 30. Our questions are “paths” in this network
  31. 31. Phylogeography
  32. 32. Taxonomy
  33. 33. GenBank records from Spain
  34. 34. MESH term
  35. 35. PMID:948206
  36. 36. http://biostor.org/reference/102054
  37. 37. http://data.gbif.org/occurrences/215921922/
  38. 38. BHL and GBIF as biomedical databases http://iphylo.blogspot.co.uk/2012/03/bhl-and-gbif-as-biomedical-databases.html
  39. 39. Metrics (counting links in the knowledge graph)
  40. 40. In an attempt to live up to that increasing demand for documentation, the leadership of the Natural History Museum of Denmark has issued an order to its curatorial staff - The staff members are requested to document which publications from 2011, written entirely by external scientists, that in one way or another are based on material in the collections of the Museum. http://markmail.org/message/opv2we7fkmro2nen (TAXACOM)
  41. 41. https://twitter.com/#!/search/10.1371%252Fjournal.pone.0036881
  42. 42. https://twitter.com/edwbaker/status/205595933159858176
  43. 43. http://www.museum-analytics.org/
  44. 44. Cited, linkable specimens: NMNH Vertebrate Zoology Herpetology Collections (11194); CAS Herpetology Collection Catalog (9619); MCZ Herpetology Collection (6720); Herpetology Collection, University of Kansas Biodiversity Research Center (5818) http://iphylo.blogspot.co.uk/2012/02/gbif-specimens-in-biostor-who-are-top.html
  45. 45. Annotation (everyone can make the knowledge graph)
  46. 46. http://bionames.org/labs/bookmarklet/
  47. 47. How many people view annotation Data Fix me!
  48. 48. Annotation as fixing errors
  49. 49. Annotation as building the knowledge graph paper specimen paper sequence taxonomic name specimen cites publishes has voucher
  50. 50. OK, but if the biodiversity knowledge graph is so cool, why haven’t we made it already?
  51. 51. Open question: Who will build the biodiversity knowledge graph?
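
The slides describe questions as "paths" in the network and metrics as counting links. A minimal Python sketch of both ideas follows. It is an illustration only, not how any of the cited services work. The identifiers (`paper:A`, `specimen:S1`, and so on) are hypothetical placeholders, and although the link names (cites, publishes, has voucher) come from slide 49, which things each link joins is an assumption.

```python
from collections import defaultdict, deque

# Hypothetical triples: (subject identifier, link type, object identifier).
# Every identifier here is a placeholder, not a real record.
TRIPLES = [
    ("paper:A", "cites", "specimen:S1"),
    ("paper:B", "cites", "specimen:S1"),
    ("paper:B", "publishes", "sequence:Q1"),
    ("sequence:Q1", "has voucher", "specimen:S1"),
    ("paper:A", "publishes", "name:N1"),
]


def build_graph(triples):
    """Index triples by node, storing each link in both directions."""
    graph = defaultdict(list)
    for subj, link, obj in triples:
        graph[subj].append((link, obj))
        graph[obj].append((f"inverse of {link}", subj))
    return graph


def find_path(graph, start, goal):
    """Breadth-first search: return a shortest chain of links, or None."""
    queue = deque([(start, [start])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for link, neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, path + [link, neighbour]))
    return None


def count_incoming(triples, link_type):
    """Metric: how many links of one type point at each identifier."""
    counts = defaultdict(int)
    for _, link, obj in triples:
        if link == link_type:
            counts[obj] += 1
    return dict(counts)


graph = build_graph(TRIPLES)
# Question as a path: which taxonomic name can be reached from a sequence?
print(find_path(graph, "sequence:Q1", "name:N1"))
# Metric as a link count: how often is each specimen cited?
print(count_incoming(TRIPLES, "cites"))
```

The sketch only works because every record refers to the same specimen by the same identifier. If each source used its own label for `specimen:S1`, the path search would find nothing and the citation count would be split across the variants.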
