Biological Science Collections Tagging and Tracking presented at SPNHC

Transcript

  • 1. BiSciCol: Biological Science Collections Tracker. Tracking Biodiversity Objects to Brokering Standards.
    Brian Stucky, University of Colorado, Boulder; John Deck, University of California, Berkeley; Lukasz Ziemba, University of Florida, Gainesville; Nico Cellinese, University of Florida, Gainesville; Rob Guralnick, University of Colorado, Boulder.
    BiSciCol Team: Reed Beaman, Nico Cellinese, Jonathan Coddington, Neil Davies, John Deck, Rob Guralnick, Bryan P. Heidorn, Chris Meyer, Tom Orrell, Rich Pyle, Kate Rachwal, Brian Stucky, Rob Whitton, Lukasz Ziemba.
    Univ. Hawai’i, Univ. Arizona, Smithsonian
  • 2. • National Science Foundation funded 2010 – 2014 • Infrastructure to tag & track specimens & derivatives in cyberspace • Relies on globally unique identifiers (GUIDs) to track objects • Implements a Linked Data approach
  • 3. QUANTITY OF DATA IS FIRST LINK IN A LARGER CHAIN OF ISSUES
  • 4. Here is the problem: Lots of Data … Generates …
  • 5. Data stores:
    Taxonomic concepts: Catalog of Life, WoRMS, ITIS, EOL, GNA
    Geography: GBIF, IUCN ranges, Map of Life, WDPA
    Standards
    Genes/genomes: GenBank, TreeBASE, ToL Web, AVAToL, BOLD
    Phenotypes and traits: MorphBank, TRY, Phenoscape
  • 6. EOL, GBIF, NCBI: A Growing Constellation of Biodiversity Data and Knowledge
  • 7. How do we link all these data together?
  • 8. Borrowing from Facebook and social media… Can we track relationships for Biological Objects as well?
  • 9. A Biological Relationship Graph … (Figure labels: Taxonomic Type Filter; Class Filter; Specimens; Tissues; Sequences; Functions; Infer Relationships Across providers.)
  • 10. Moorea Biocode Example: From field collection through analysis, across multiple systems. (Figure node labels: Taxon, Taxon*n, Key, Blast, Blast*n, Biocode Event, Essig Museum Specimen, Smithsonian Tissue, GenBank Sequence, CAMERA Gut Sample Event, metagenomic Sequencing.)
  • 11. Globally unique identifiers:
    • Globally unique (mandatory)
    • Persistent (not mandatory, but very helpful)
    • Resolvable (not mandatory, but very helpful)
    Examples:
    http://example.org/urn:lsid:example.org:specimen/7217D220-836A-11DF-8395-0800200C9A66
    http://mycollection.org/specimen/JDeckSpecimen1
    http://mycollection.org/specimen/uuid=7217D220-836A-11DF-8395-0800200C9A66
    http://dx.doi.org/10.5072/FK2JW8GKM
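
The identifier properties on this slide can be illustrated with a minimal Python sketch. The helper names `mint_identifier` and `looks_resolvable` are hypothetical and are not part of BiSciCol; the base URL is the example host from the slide. The check is purely syntactic: it says whether an identifier has the form of an HTTP URL, not whether anything actually answers at that address, and it says nothing about persistence.

```python
import uuid
from urllib.parse import urlparse


def mint_identifier(base_url: str) -> str:
    """Append a random (version 4) UUID to a base URL to get a globally unique identifier."""
    return f"{base_url.rstrip('/')}/{uuid.uuid4()}"


def looks_resolvable(identifier: str) -> bool:
    """Syntactic check only: an http(s) URL with a host part could be resolved by a web client."""
    parts = urlparse(identifier)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


base = "http://mycollection.org/specimen"  # example host taken from the slide
first = mint_identifier(base)
second = mint_identifier(base)

# A bare LSID is globally unique but has no HTTP form of its own.
bare_lsid = "urn:lsid:example.org:specimen/7217D220-836A-11DF-8395-0800200C9A66"
```

Wrapping the bare LSID in a resolver URL, as in the first example on the slide, is what turns a unique name into something a browser or script can dereference.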
  • 12. Simple relationship terms: Graph relationships:
  • 13. ONE FINAL PIECE OF THE PUZZLE: GIVING BIRTH TO DATA IN THE RIGHT FORMAT FOR LINKING
  • 14. “Triplifier”: creating the format for linking biological objects. Create links from native data formats. (Figure labels: Darwin Core Archive, MySQL, KEMU.)
  • 15. QUERY AND RESULTS ACROSS LINKED DATA (Figure labels: Query, Response.)
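
As a rough illustration of what "triplifying" a native table could look like, the Python sketch below turns rows of a flat table into subject, predicate, object triples. This is not the Triplifier's actual implementation: the function `triplify`, the column names, and the `http://mycollection.org/` prefix are assumptions made for the example, and `:relatedTo` is borrowed from the lookup shown on slide 16.

```python
def triplify(rows, subject_column, link_columns, prefix):
    """Emit one (subject, ":relatedTo", object) triple per non-empty link column in each row."""
    triples = []
    for row in rows:
        subject = f"{prefix}{row[subject_column]}"
        for column in link_columns:
            value = row.get(column)
            if value:  # skip empty or missing cells
                triples.append((subject, ":relatedTo", f"{prefix}{value}"))
    return triples


# Hypothetical rows, as they might come out of a MySQL specimen table.
rows = [
    {"specimen_id": "S1", "tissue_id": "T1", "event_id": "E1"},
    {"specimen_id": "S2", "tissue_id": "", "event_id": "E1"},
]

triples = triplify(rows, "specimen_id", ["tissue_id", "event_id"], "http://mycollection.org/")
```

Both specimens end up linked to the same event identifier, which is the kind of shared node that lets separate records be joined into one graph.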
  • 16. BISCICOL – EXAMPLE SEARCH
    Client Interface: Search Scientific Name: Aedes increpitus (Run)
    Results:
      OccurrenceID1 (Aedes increpitus Dyar, 1916)
      OccurrenceID3 (Aedes vittata Theobald, 1903)
    Taxon SERVICE (ITIS / GNUB):
      http://lsid.itis.gov/urn:lsid:itis.gov:itis_tsn:126314
      http://lsid.itis.gov/urn:lsid:itis.gov:itis_tsn:126317
      http://gnub.org/8E19F1DC-74BA-47D4-A505-6498414B4CCE
    BISCICOL SERVICE LOOKUP:
      dwc:IdentificationID1 :relatedTo http://lsid.itis.gov/urn:lsid:itis.gov:itis_tsn:126314
      dwc:IdentificationID1 :relatedTo dwc:OccurrenceID1
      dwc:IdentificationID2 :relatedTo http://lsid.itis.gov/urn:lsid:itis.gov:itis_tsn:126317
      dwc:IdentificationID2 :relatedTo dwc:OccurrenceID3
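
The lookup on this slide can be traced with a small Python sketch. The four triples and the three taxon identifiers are copied from the slide; the function `occurrences_for_taxa` and the in-memory list are assumptions for illustration and do not describe how the BiSciCol service is actually implemented.

```python
RELATED_TO = ":relatedTo"
ITIS = "http://lsid.itis.gov/urn:lsid:itis.gov:itis_tsn:"

# The four triples listed under "BISCICOL SERVICE LOOKUP" on the slide.
triples = [
    ("dwc:IdentificationID1", RELATED_TO, ITIS + "126314"),
    ("dwc:IdentificationID1", RELATED_TO, "dwc:OccurrenceID1"),
    ("dwc:IdentificationID2", RELATED_TO, ITIS + "126317"),
    ("dwc:IdentificationID2", RELATED_TO, "dwc:OccurrenceID3"),
]


def occurrences_for_taxa(triples, taxon_ids):
    """Follow :relatedTo from taxon identifiers to identifications, then on to their other objects."""
    taxon_ids = set(taxon_ids)
    # Step 1: identifications that point at any of the requested taxon identifiers.
    identifications = {s for s, p, o in triples if p == RELATED_TO and o in taxon_ids}
    # Step 2: everything else those identifications point at (here, occurrences).
    return sorted(
        o for s, p, o in triples
        if p == RELATED_TO and s in identifications and o not in taxon_ids
    )


# Identifiers returned by the taxon service for the name search on the slide.
taxon_ids = [
    ITIS + "126314",
    ITIS + "126317",
    "http://gnub.org/8E19F1DC-74BA-47D4-A505-6498414B4CCE",
]

hits = occurrences_for_taxa(triples, taxon_ids)
```

With the slide's data, `hits` contains `dwc:OccurrenceID1` and `dwc:OccurrenceID3`, matching the two results shown in the client interface. The GNUB identifier matches no triple in this small sample, so it contributes nothing.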
  • 17. Working with Locations: Tracking location in space of a moving individual (whales). (Figure labels: IndividualID1; EventID1, GeoreferenceID1; EventID2, GeoreferenceID2; EventID3, GeoreferenceID3.)
  • 18. Data Impact Factor – Graph Metrics
    Graphs: [ ] GBIF Relations Graph; [X] Moorea Biocode; [X] SI MSNGR System; [+] Add New Graph
    Collectors: Gustav Paulay (102,000 direct children); Christopher Meyer (83,000 direct children); Craig Moritz (523 direct children)
    Occurrences: MBIO99999 (1024 total descendants); IMBL8888888 (723 total descendants)
    Events: Biocode10234 (4234 direct children); Expedition21234 (1023 direct children)
    Chart: Cited occurrences over time
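
The two metrics named on this slide, direct children and total descendants, can be computed over a relationship graph as in the following Python sketch. The edge list and node names are invented for illustration, and the functions are a plain graph traversal rather than BiSciCol's own code.

```python
from collections import defaultdict


def build_children(edges):
    """Index (parent, child) edges as a parent -> set-of-children mapping."""
    children = defaultdict(set)
    for parent, child in edges:
        children[parent].add(child)
    return children


def direct_children(children, node):
    """Count the objects linked directly from this node."""
    return len(children.get(node, ()))


def total_descendants(children, node):
    """Count every object reachable from this node, at any depth."""
    seen = set()
    stack = [node]
    while stack:
        current = stack.pop()
        for child in children.get(current, ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    seen.discard(node)  # a node is not its own descendant, even if the graph has a cycle
    return len(seen)


# Hypothetical chain: a collector's events lead to a specimen, a tissue, and a sequence.
edges = [
    ("collector", "event1"),
    ("collector", "event2"),
    ("event1", "specimen1"),
    ("specimen1", "tissue1"),
    ("tissue1", "sequence1"),
]
children = build_children(edges)
```

In this toy graph the collector has 2 direct children but 5 total descendants, which is the gap the slide's metrics are meant to expose: derived objects several steps away still count toward the impact of the original collecting work.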
  • 19. Why BiSciCol and Why SPNHC and Why Collaborations?
    • New era of collections digitization
      • New & derived data objects created, replicated, annotated
    • BiSciCol tackles a natural history collections preservation challenge:
      • How to follow these digital objects
      • How to link objects and derivatives back to specimens
    • BiSciCol is about community, collaborative practice
      • Commitment to standards, ontologies
      • Agreement on permanent, resolvable identifiers
      • Triplification of data sources to enhance linked data