Ozymandias - from an atlas to a knowledge graph of living Australia

Talk given at CSIRO about biodiversity knowledge graphs

Published in: Science

  1. @rdmpage http://iphylo.blogspot.com Ozymandias - from an atlas to a knowledge graph of living Australia
  2. Why am I here? Family BHL AU at Melbourne Museum @nicolekearney Australian biodiversity data
  3. Digitise all the things Specimens, images, publications, genomes, GBIF, ALA, GenBank, BOLD, BHL, Instagram,…
  4. #GBIF1billion gbif.org
  5. #GBIF1billion
  6. #GBIF1billion #bigdata
  7. iNaturalist
  8. 361,429,888 birds @Team_eBird
  9. GenBank: the experimenter’s museum https://doi.org/10.1086/658657
  10. Carolus Linnaeus (left), the initiator of the ‘Linnean age of taxonomy’, with the morphological-descriptive assessment of some 23,000 geometrid species, achieved in 257 years. Paul D. N. Hebert (right), the initiator of the ‘Hebertian age of taxonomy’, has led to the molecular definition of some 20,000 geometrid BINs … in just 12 years. https://doi.org/10.1098/rstb.2015.0339 Classical taxonomy versus DNA barcodes
  11. Challenges to primacy of museum collections Citizen science Genomics
  12. Why do museums matter? • Collections (?) • Connections “macroscope”
  13. Link all the things #semanticweb #linkeddata #knowledgegraph
  14. Biodiversity knowledge graph Towards a biodiversity knowledge graph Research Ideas and Outcomes 2: e8767 (07 Apr 2016) doi: 10.3897/rio.2.e8767
  15. Phylogeography
  16. Taxonomy
  17. Impact of collections
  18. Incentives to build knowledge graphs: • Curiosity (connections) • Money • Metrics
  19. @hollybik Holly Bik Let’s rise up to unite taxonomy and technology (2017) https://doi.org/10.1371/journal.pbio.2002231
  20. GBIF Challenge - €34,000 in prizes
  21. Ozymandias: a knowledge graph for the Australian fauna https://ozymandias-demo.herokuapp.com @rdmpage
  22. AFD
  23. Ozymandias knowledge graph Every node has an identifier (e.g., URL, DOI, LSID) Every edge (link) comes from a controlled vocabulary (e.g., schema.org)
  24. What can we do with a knowledge graph?
  25. [Chart. Axes: Year (1750 to 2000), Number of names, Cumulative number of species. Series: species names, accepted species.] Weevils (CURCULIONOIDEA) Ozymandias: A biodiversity knowledge graph https://doi.org/10.1101/485854
  26. [Chart. Axes: Year (1758 to 2008), Number of names, Cumulative number of species. Series: species names, accepted species.] Land snails (CAMAENIDAE) Ozymandias: A biodiversity knowledge graph https://doi.org/10.1101/485854
  27. [Diagram. Axes: Date article published, Date cited article published. Labels: review, citation classic.] The “research front” in a “hot” field
  28. Citation patterns in taxonomy Taxonomy is “long data” rather than “big data” (every paper is a classic, every paper is a review) Ozymandias: A biodiversity knowledge graph https://doi.org/10.1101/485854
  29. Need access to digitised literature for both humans and machines. Will give us access to taxonomic names, descriptions, and traits. BHL (Australia), scanning Australian literature from museums, herbaria, and scientific societies right up to present day. Plazi and Biodiversity Literature Repository (Switzerland), extracting images from free AND paywall journals.
  30. Patterns in publishing
  31. Knowledge graphs are fun, but we already have ALA… • Classical aggregators merge together data based on a couple of key attributes (e.g., species names, geographical coordinates) • Doesn’t link to underlying knowledge (e.g., work by taxonomists, systematists, etc.) • Doesn’t know everything that it knows (not a graph)
  32. There are known knowns, things we know that we know. There are known unknowns, things we now know we don’t know. But there are also unknown unknowns, things we do not know we don't know. Donald Rumsfeld
  33. known unknown
  34. @nicolekearney @rdmpage What if we could make some demos to show how ALA could be better?
  35. Demo 1: Linking to the literature
  36. Adding linked literature to ALA https://ozymandias-demo.herokuapp.com/alademo.php?q=Maricoccus+brucei+Poore%2C+1994 DOIs linking to taxonomic work You can read for free!
  37. Demo 2: Using the literature to enhance ALA (unknown knowns)
  38. Publications in AFD have images that ALA doesn’t know about. ALA has no images for these weevils. Plazi has them from Zookeys paper, so they are in Ozymandias.
  39. Demo 3: Taxonomists are people too “strings to things”
  40. “D K Yeates”
  41. Originally a database of facts underlying Wikipedia, fast becoming a global database of everything…
  42. Here comes everybody… Wikidata has thousands of entries for researchers (especially taxonomists) linked to ORCIDs as well as taxonomic identifiers such as ZooBank, IPNI, and social networks such as ResearchGate.
  43. People in ALA Wikidata ORCID ResearchGate
  44. Australian taxonomy is international Wikidata query to find nationality of authors of recent publications in AFD (via Ozymandias)
  45. Knowledge graphs require globally unique identifiers for things we care about DOIs, LSIDs, etc.
  46. ORCID Yes, yet another %&$!@ identifier, but this one is actually useful…
  47. https://orcid.org/0000-0002-7101-9767
  48. @dpsSpiders David Shorthouse https://bloodhound-tracker.net @bloodhound
  49. https://bloodhound-tracker.net/0000-0003-4816-2909
  50. https://bloodhound-tracker.net/0000-0001-7729-6143
  51. Predictions • Knowledge graphs are coming to a biodiversity database near you – get ready • These knowledge graphs will rely heavily on Wikidata (already has many species, taxonomic papers, taxonomists, and more) • The biggest driver for creating knowledge graphs may be metrics (e.g., Bloodhound) rather than science (cf. citation metrics and impact factors)
  52. Link all the things
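
Slide 23 describes the graph in two rules: every node has an identifier, and every edge comes from a controlled vocabulary. A minimal sketch of that idea as subject/predicate/object triples might look like the following. The DOI and title are taken from the slides; the person URL, the person name, and all function names are hypothetical placeholders, and nothing here is claimed to reflect how Ozymandias itself stores its data.

```python
from typing import List, NamedTuple, Tuple


class Triple(NamedTuple):
    """One edge of the graph: subject --predicate--> obj."""
    subject: str
    predicate: str
    obj: str


SCHEMA = "http://schema.org/"  # controlled vocabulary named on slide 23

# The DOI and title appear in the slides; the person URL and name are
# hypothetical placeholders, not data from Ozymandias.
PAPER = "https://doi.org/10.1101/485854"
PERSON = "https://example.org/person/1"

GRAPH: List[Triple] = [
    Triple(PAPER, SCHEMA + "name", "Ozymandias: A biodiversity knowledge graph"),
    Triple(PAPER, SCHEMA + "author", PERSON),
    Triple(PERSON, SCHEMA + "name", "Example Author"),
]


def is_identifier(value: str) -> bool:
    """True if the value looks like a URL (including DOI links) or an LSID."""
    return value.startswith(("http://", "https://", "urn:lsid:"))


def uses_vocabulary(triple: Triple, vocabulary: str = SCHEMA) -> bool:
    """True if the edge label is drawn from the given vocabulary namespace."""
    return triple.predicate.startswith(vocabulary)


def outgoing(graph: List[Triple], subject: str) -> List[Tuple[str, str]]:
    """All (predicate, object) pairs leaving one node."""
    return [(t.predicate, t.obj) for t in graph if t.subject == subject]


if __name__ == "__main__":
    # Both rules from slide 23 hold for this toy graph.
    assert all(is_identifier(t.subject) for t in GRAPH)
    assert all(uses_vocabulary(t) for t in GRAPH)
    for predicate, obj in outgoing(GRAPH, PAPER):
        print(predicate, "->", obj)
```

Because the author edge points at another identifier rather than a name string, following it leads to a node that can carry its own edges, which is the "strings to things" step of slide 39.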
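
The demo link on slide 36 passes a full taxon name, including its authority, as a URL-encoded `q` parameter. As a small sketch, the same link can be rebuilt from the plain name with Python's standard library. The helper name `demo_url` is an assumption for illustration, and the code only constructs the string; it does not assume the demo site is still reachable.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Base address copied from slide 36.
ALA_DEMO = "https://ozymandias-demo.herokuapp.com/alademo.php"


def demo_url(taxon_name: str) -> str:
    """Build the slide-36 style link for a taxon name (hypothetical helper)."""
    # urlencode turns spaces into '+' and the comma into '%2C'.
    return f"{ALA_DEMO}?{urlencode({'q': taxon_name})}"


def taxon_from_url(url: str) -> str:
    """Recover the plain taxon name from such a link."""
    return parse_qs(urlsplit(url).query)["q"][0]


if __name__ == "__main__":
    url = demo_url("Maricoccus brucei Poore, 1994")
    print(url)
    print(taxon_from_url(url))
```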
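
Slide 44 mentions a Wikidata query for the nationality of authors, but the query itself is not in the transcript. The sketch below shows one way such a query could be phrased, and it is not the query used in the talk. It assumes that works are matched by DOI (Wikidata property P356), authors by P50, and that "country of citizenship" (P27) is an acceptable stand-in for nationality. The function names are hypothetical, and the code only builds the query and request URL without contacting the Wikidata Query Service.

```python
from urllib.parse import urlencode

# Public SPARQL endpoint of the Wikidata Query Service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def nationality_query(doi: str) -> str:
    """SPARQL asking for the authors of one work and their citizenship.

    Assumption: P27 (country of citizenship) approximates nationality.
    """
    if '"' in doi or "\\" in doi:
        raise ValueError("unexpected character in DOI")
    # Wikidata conventionally stores DOIs in upper case.
    return f"""
SELECT ?author ?authorLabel ?countryLabel WHERE {{
  ?work wdt:P356 "{doi.upper()}" .
  ?work wdt:P50 ?author .
  OPTIONAL {{ ?author wdt:P27 ?country . }}
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" . }}
}}
""".strip()


def request_url(query: str) -> str:
    """URL that would return JSON results if fetched."""
    return f"{WDQS_ENDPOINT}?{urlencode({'query': query, 'format': 'json'})}"


if __name__ == "__main__":
    query = nationality_query("10.1101/485854")
    print(query)
    print(request_url(query))
```

Authors that Wikidata records only as name strings (P2093) rather than as items would not be returned by this pattern, which is another instance of the "strings to things" gap from slide 39.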
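
Slides 45 to 50 lean on ORCID as a globally unique identifier for people. One property that makes it practical is that the last character of an ORCID iD is a check character computed with ISO 7064 MOD 11-2, so many typos can be caught before any lookup. The sketch below validates the three identifiers that appear on slides 47, 49, and 50; the function names are illustrative assumptions.

```python
def check_character(base_digits: str) -> str:
    """ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(value: str) -> bool:
    """Accept a bare iD or a URL ending in one, e.g. an orcid.org link."""
    identifier = value.rstrip("/").rsplit("/", 1)[-1].replace("-", "").upper()
    if len(identifier) != 16:
        return False
    if not all(ch in "0123456789" for ch in identifier[:15]):
        return False
    return check_character(identifier[:15]) == identifier[15]


if __name__ == "__main__":
    # Identifiers exactly as they appear on slides 47, 49, and 50.
    for link in (
        "https://orcid.org/0000-0002-7101-9767",
        "https://bloodhound-tracker.net/0000-0003-4816-2909",
        "https://bloodhound-tracker.net/0000-0001-7729-6143",
    ):
        print(link, is_valid_orcid(link))
```

A passing checksum only shows that the string is well formed; whether the iD is registered, and to whom, still requires a lookup.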
