Biodiversity Knowledge Graphs

1,088 views

Published in: Science

  1. Biodiversity knowledge graphs @rdmpage http://iphylo.blogspot.com
  2. Our questions are “paths” in this network
  3. http://iphylo.org/~rpage/geojson-phylogeny-demo/
  4. Phylogeography
  5. PMID:948206
  6. http://biostor.org/reference/102054
  7. http://data.gbif.org/occurrences/215921922/
  8. BHL and GBIF as biomedical databases http://iphylo.blogspot.co.uk/2012/03/bhl-and-gbif-as-biomedical-databases.html
  9. MeSH term
  10. Metrics (counting links in the knowledge graph)
  11. “In an attempt to live up to that increasing demand for documentation, the leadership of the Natural History Museum of Denmark has issued an order to its curatorial staff - The staff members are requested to document which publications from 2011, written entirely by external scientists, that in one way or another are based on material in the collections of the Museum.” http://markmail.org/message/opv2we7fkmro2nen (TAXACOM)
  12. Use of a collection
  13. Building the knowledge graph
  14. BioStor http://biostor.org
  15. BioNames http://bionames.org
  16. Material examined
  17. Species, sequences, publications
  18. Biodiversity knowledge graph(s)
  19. Implications • Sequencing is cheap • The flood of sequences is only going to increase • How much of this is relevant to biodiversity?
  20. Numbers of new animal names (figure labels: 1923, WWI, WWII)
  21. Implications • Taxonomists are working at capacity • Most taxonomic work is in the past • Compare this to exponential growth of sequencing
  22. Mammals in GenBank (figure labels: proper Linnaean names; “Aus sp.”)
  23. Mammals (figure labels: proper Linnaean names; “Aus sp.”)
  24. “Invertebrates” BOLD
  25. Dark taxa • Disconnect between taxonomy and genomics • Are “dark taxa” species we already know about or are they new diversity? • Do we need taxonomic names?
  26. 100,000 articles from http://biostor.org (BHL) (figure labels: 1923, today)
  27. Publishers of taxonomy (# articles) http://bionames.org
  28. Legacy literature • 1923 and the chilling effect of copyright • Much of the taxonomic literature is digitally “dark” • Commercial publishers control access to a lot of literature
  29. Size of Wikipedia articles on mammals (figure labels: few, large articles; many, small articles, the “long tail”)
  30. PanTHERIA (2009) (figure labels: 1923, 2003)
  31. Power law • We know a lot about a few species • For most species we know very little (even in well-known groups) • For poorly known species we need to go to the legacy literature
  32. GBIF.org 500 million records
  33. GBIF • The Global Biodiversity Information Facility is not evenly “global” • Tells us as much about sampling as distribution of diversity
  34. Flickr EOL group
  35. Crowd sourcing • Where is the “crowd”? • It’s where the iPhones are…
  36. BOLD DNA barcodes http://iphylo.org/~rpage/bold-map
  37. GenBank host records “symbiome”
  38. GenBank as a biodiversity database • GenBank is about more than genes • GenBank has a wealth of information on location and ecological interactions
  39. Implications • Phylogenetic data is not being archived (why not?) • What would make archiving a “no-brainer”?
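
Two ideas from the slides lend themselves to a small illustration: questions as “paths” in the network (slide 2) and metrics as counts of links in the knowledge graph (slide 10). The following is a minimal Python sketch under stated assumptions, not the method used in the talk. Every identifier and link in it (`publication:P1`, `specimen:S1`, and so on) is a hypothetical placeholder, and the helper names `build_graph`, `shortest_path`, and `link_counts` are invented for this example.

```python
from collections import deque

# Hypothetical toy graph: all identifiers and links are made up for illustration
# and do not correspond to real records in BioStor, GBIF, GenBank, or PubMed.
EDGES = [
    ("publication:P1", "taxon:T1"),
    ("publication:P1", "specimen:S1"),
    ("specimen:S1", "sequence:G1"),
    ("specimen:S1", "locality:L1"),
    ("sequence:G1", "publication:P2"),
]


def build_graph(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def shortest_path(graph, start, goal):
    """Breadth-first search; returns a shortest list of nodes, or None."""
    if start not in graph or goal not in graph:
        return None
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbour in sorted(graph[node]):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None


def link_counts(graph):
    """A simple metric: the number of links attached to each node."""
    return {node: len(neighbours) for node, neighbours in graph.items()}


graph = build_graph(EDGES)

# "Where was this taxon collected?" becomes a path from a taxon to a locality.
print(shortest_path(graph, "taxon:T1", "locality:L1"))

# "How much is this specimen used?" becomes a count of its links.
print(link_counts(graph)["specimen:S1"])
```

In this toy graph the question about the taxon is answered by walking taxon, publication, specimen, locality, and the use of a specimen (compare slide 12, “Use of a collection”) is approximated by how many other records link to it.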
