BiSciCol: a Tagging and Tracking Infrastructure for Biological Science Collections
  • John Deck, University of California, Berkeley
  • Brian Stucky, Colorado University, Boulder
  • Nico Cellinese, University of Florida, Gainesville
  • Neil Davies, University of California, Berkeley
  • Rob Guralnick, Colorado University, Boulder
  • Chris Meyer, Smithsonian Institution
  • Tom Orrell, Smithsonian Institution
  • Richard Pyle, Bishop Museum
  • Kate Rachwal, University of Florida, Gainesville
  • Russell Watkins, University of Florida, Gainesville
Biological Science Collections Tracker (BiSciCol)
  • Working towards building an infrastructure designed to tag and track scientific collections and all of their derivatives
  • Funded by the National Science Foundation, 2010 – 2014
  • Partners: University of Florida, Colorado University, Bishop Museum, UC Berkeley, Smithsonian Institution, University of Arizona
  • Relies on globally unique identifiers (GUIDs) to track objects
  • Implements a Linked Data approach
  • Provides support for the Global Names Architecture
Outline
  • Use Cases: why this is important
  • Technical Background
      • Globally Unique Identifiers (GUIDs)
      • RDF / BiSciCol Implementation
      • Combining Graphs
      • BiSciCol Taxonomy / GNUB Integration
  • Tools and Potential Tools in Development
      • How the BiSciCol Application Works
      • Online search interface
      • Alert system
      • Annotation Network Service
  • How to Get Involved
Use Case: Notify people interested in a collecting event that identifications have been made.
  Diagram nodes: Biocode Event, Essig Museum Specimen, Smithsonian Tissue, Bishop Museum Tissue, Genbank Sequence, BOLD Barcode, identifications (Key/Person, Blast), Taxon
Use Case: For Taxon X, find specimens with recent modification dates for image or tissue samples.
  Diagram nodes: Taxon, Identification, Essig Museum Specimen, Bishop Museum Tissue, CalPhotos Image
  DateLastModified values shown: "2011-06-01", "2011-06-20"
Use Case: Map all place names/localities associated with Taxon X.
  Diagram nodes: Taxon, Identification, Specimen, Location
Use Case: Provide collection permit information and use restrictions on tissue samples from SpecimenX.
  Diagram nodes: Biocode Event, Collecting Permit, Essig Specimen, Bishop Museum Tissue, Smithsonian Tissue
Technical Background
  • Globally Unique Identifiers (GUIDs)
  • RDF / BiSciCol Implementation
  • Combining Graphs
  • BiSciCol Taxonomy / GNUB Integration
Creating Globally Unique Identifiers (GUIDs)
  • Globally unique (mandatory)
  • Persistent (not mandatory, but very helpful)
  • Resolvable (not mandatory, but very helpful)
  Pattern: Resolution/Domain + Identifier
  Examples:
  • +1-541-914-4739 (unique, at least for phones)
  • http://mycollection.org/specimen/ + JDeckSpecimen1 (a named identifier)
      http://mycollection.org/specimen/JDeckSpecimen1
  • http://mycollection.org/specimen/uuid=7217D220-836A-11DF-8395-0800200C9A66
  • http://example.org/urn:lsid:example.org:specimen/ + 7217D220-836A-11DF-8395-0800200C9A66 (opaque)
      http://example.org/urn:lsid:example.org:specimen/7217D220-836A-11DF-8395-0800200C9A66
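The "Resolution/Domain + Identifier" pattern can be sketched in a few lines of Python. This is only an illustration, not BiSciCol code: the helper name make_guid is hypothetical, the resolver prefix is the example URL from the slide, and the opaque identifier is a random (version 4) UUID rather than whatever scheme a collection actually uses.

```python
import uuid


def make_guid(resolver: str, local_id: str) -> str:
    """Join a resolution prefix (domain) and a local identifier into one GUID."""
    return resolver.rstrip("/") + "/" + local_id


# Example resolver prefix from the slide; any resolvable domain would do.
RESOLVER = "http://mycollection.org/specimen/"

# A named identifier: readable, but uniqueness depends on the naming discipline.
named_guid = make_guid(RESOLVER, "JDeckSpecimen1")

# An opaque identifier: a random UUID carries no meaning and is unique by construction.
opaque_guid = make_guid(RESOLVER, str(uuid.uuid4()).upper())

print(named_guid)
print(opaque_guid)
```

The prefix supplies resolvability, while the local part only has to be unique within that prefix.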
BiSciCol Implementation of the Resource Description Framework (RDF)
  An RDF statement: Subject, Predicate, Object (for example GUID1, Predicate, GUID2)
  relatedTo (transitive):
  • GUID1 relatedTo GUID2 and GUID2 relatedTo GUID3
  • therefore GUID1 <-> GUID2, GUID2 <-> GUID3, and GUID1 <-> GUID3
  A simple BiSciCol graph (graph = set of RDF statements):
  • GUID1, GUID2, and GUID3 are linked by relatedTo
  • each GUID has a type ("a"): Event, Specimen, Tissue
  • each GUID has a Date: "2011-05-01", "2011-06-01", "2011-06-20"
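A minimal sketch of the slide's graph as a set of (subject, predicate, object) tuples, with relatedTo followed in both directions and transitively. It uses plain Python rather than an RDF library. The pairing of types and dates with particular GUIDs is an assumption, since the slide layout does not survive in this transcript, and the function name related is hypothetical.

```python
from collections import defaultdict, deque

RELATED_TO = "relatedTo"

# A graph is a set of RDF-style statements.
# Which type and date belong to which GUID is assumed for illustration.
triples = {
    ("GUID1", RELATED_TO, "GUID2"),
    ("GUID2", RELATED_TO, "GUID3"),
    ("GUID1", "a", "Event"),
    ("GUID2", "a", "Specimen"),
    ("GUID3", "a", "Tissue"),
    ("GUID1", "Date", "2011-05-01"),
    ("GUID2", "Date", "2011-06-01"),
    ("GUID3", "Date", "2011-06-20"),
}


def related(graph, start):
    """Return every GUID connected to start through a chain of relatedTo links."""
    neighbours = defaultdict(set)
    for s, p, o in graph:
        if p == RELATED_TO:
            neighbours[s].add(o)
            neighbours[o].add(s)  # the slide draws the link as <->
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    seen.discard(start)
    return seen


print(sorted(related(triples, "GUID1")))
```

GUID1 reaches GUID3 even though no statement links them directly, which is the transitivity the slide describes.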
Combining Graphs
  • A set of institutions we are interested in, each publishing a graph as:
      • XML
      • N3/Turtle
      • Web page tags (RDFa/Microformats)
  • SPARQL, the query language for RDF, queries multiple graphs
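The idea of querying several institutional graphs as one can be illustrated without a SPARQL engine. In the sketch below each graph is a set of triples, combine takes their union, and match is a tiny stand-in for a single SPARQL triple pattern where None acts as a variable. All names and data here are hypothetical; a real deployment would issue SPARQL queries instead.

```python
def combine(*graphs):
    """Merge several graphs (sets of triples) into one graph."""
    merged = set()
    for graph in graphs:
        merged |= set(graph)
    return merged


def match(graph, s=None, p=None, o=None):
    """Return triples matching the pattern; None matches anything."""
    return sorted(
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Two hypothetical institutional graphs that share the identifier Tissue1.
museum_graph = {
    ("Specimen1", "a", "Specimen"),
    ("Specimen1", "relatedTo", "Tissue1"),
}
tissue_graph = {
    ("Tissue1", "a", "Tissue"),
    ("Tissue1", "relatedTo", "Sequence1"),
}

merged = combine(museum_graph, tissue_graph)
links = match(merged, p="relatedTo")
print(links)
```

Because both graphs use the same identifier for the tissue, the merged graph connects the specimen to the sequence without either institution knowing about the other.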
BiSciCol / Global Names Architecture Integration
  Building a framework for linking taxon concepts to GUIDs
  • Linking occurrences to taxa: GUID1 (Occurrence) relatedTo GUID2 (Identification: Key/Person/Blast) relatedTo GUID3 (Taxon)
  • Linking taxa to taxa (e.g. GNUB to myAuthority): GUID1 (Identification), GUID2 (Taxon), GUID3 (Taxon), and GUID4 (Identification), joined by relatedTo
  • The Global Names Architecture is a BiSciCol partner that is creating resolvable identifiers for all taxa through its Global Names Usage Bank service (GNUB).
  • Linking taxa to GNUB lets us search for taxon names across all systems.
  • Taxon = defined by the Darwin Core Taxon class
Tools & Potential Tools in Development
  • What the BiSciCol Application Does
  • Online search interface
  • Alert system
  • Annotation Network Service
What the BiSciCol Application Does
  http://code.google.com/p/biscicol/
  • Inputs, queried over the Internet with SPARQL:
      • XML (graph)
      • N3/Turtle (graph)
      • RDFa web tags (indexer -> graph)
  • BiSciCol Application (Java):
      • Model class: combines graph query results into a single graph (model)
      • Object class: works with graph results in memory (ancestors, siblings, descendants)
      • Render class: provides an image/text representation of results
  • Services: Search Service, Map Service
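The ancestor, sibling, and descendant operations listed for the Object class can be sketched as follows. This is a Python illustration of the idea, not the project's Java code. The class name GraphObject, the sample identifiers, and the assumption that relatedTo edges point from parent to child (event to specimen to tissue) are all hypothetical.

```python
from collections import defaultdict


class GraphObject:
    """In-memory view of directed relatedTo edges, assumed parent -> child."""

    def __init__(self, edges):
        self.children = defaultdict(set)
        self.parents = defaultdict(set)
        for parent, child in edges:
            self.children[parent].add(child)
            self.parents[child].add(parent)

    def _walk(self, start, index):
        # Depth-first traversal that follows one direction of the edges.
        found, stack = set(), [start]
        while stack:
            for nxt in index.get(stack.pop(), ()):
                if nxt not in found:
                    found.add(nxt)
                    stack.append(nxt)
        return found

    def descendants(self, guid):
        return self._walk(guid, self.children)

    def ancestors(self, guid):
        return self._walk(guid, self.parents)

    def siblings(self, guid):
        # Siblings share at least one direct parent with guid.
        sibs = set()
        for parent in self.parents.get(guid, ()):
            sibs |= self.children[parent]
        sibs.discard(guid)
        return sibs


edges = [
    ("Event1", "Specimen1"),
    ("Specimen1", "TissueA"),
    ("Specimen1", "TissueB"),
]
view = GraphObject(edges)
print(sorted(view.descendants("Event1")))
print(sorted(view.siblings("TissueA")))
```

Keeping both a parent index and a child index makes each of the three operations a simple lookup or traversal over the combined graph.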
Online Search Interface Demonstration
Alert System (Proposed)
  • Application to store identifiers:
      1) User X stores objects: JDeckSpecimen1, JDeckEvent1, etc.
      2) Schedules jobs
      3) Runs queries: recent changes, new identifications
  • BiSciCol Application services discover and traverse relationships across the graphs (XML, N3/Turtle)
  • Output: RSS feed, emails, etc.
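Since the alert system is only proposed, the following is a speculative sketch of what one scheduled job might do: start from the identifiers a user has stored, traverse relatedTo links, and report any connected object whose DateLastModified is later than the previous run. The function names, the Tissue1 identifier, and the sample dates are assumptions made for illustration.

```python
from datetime import date


def reachable(triples, start):
    """All GUIDs connected to start through relatedTo, in either direction."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for s, p, o in triples:
            if p != "relatedTo":
                continue
            other = o if s == node else s if o == node else None
            if other is not None and other not in seen:
                seen.add(other)
                stack.append(other)
    return seen


def recent_changes(triples, watched, since):
    """Return (guid, date) pairs modified after `since` among watched objects and their relatives."""
    hits = set()
    for guid in watched:
        group = reachable(triples, guid)
        for s, p, o in triples:
            if p == "DateLastModified" and s in group and date.fromisoformat(o) > since:
                hits.add((s, o))
    return sorted(hits)


# Hypothetical graph: a stored event, its specimen, and a derived tissue.
triples = [
    ("JDeckEvent1", "relatedTo", "JDeckSpecimen1"),
    ("JDeckSpecimen1", "relatedTo", "Tissue1"),
    ("JDeckSpecimen1", "DateLastModified", "2011-05-01"),
    ("Tissue1", "DateLastModified", "2011-06-20"),
]

alerts = recent_changes(triples, ["JDeckEvent1"], date(2011, 6, 1))
print(alerts)
```

The user only stored the event identifier, yet the change to the tissue is reported because the traversal discovers it through the specimen.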
Annotation Network Service (Proposed)
  Facilitating feedback from third parties
  • Generate an annotation and store it in a service
  • Relate the annotation GUID to the specimen GUID
  • The annotation is now discoverable and linked into BiSciCol
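The three steps above might look like this in outline. The service is only proposed, so everything here is hypothetical: the annotate function, the dictionary standing in for the annotation store, the example.org prefix, and the sample annotation text.

```python
import uuid


def annotate(graph, store, specimen_guid, text,
             base="http://example.org/annotation/"):
    """Mint an annotation GUID, store the annotation, and link it to the specimen."""
    annotation_guid = base + str(uuid.uuid4())
    store[annotation_guid] = text                              # step 1: generate and store
    graph.add((annotation_guid, "relatedTo", specimen_guid))   # step 2: relate the GUIDs
    return annotation_guid                                     # step 3: now discoverable by traversal


graph = set()   # stands in for a published graph of triples
store = {}      # stands in for the annotation service

specimen = "http://mycollection.org/specimen/JDeckSpecimen1"
note = annotate(graph, store, specimen, "Locality label appears to be misread.")
print(note, "->", store[note])
```

Because the annotation is just another identifier joined by relatedTo, it needs no special handling to be found by the same traversal used for specimens and tissues.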
“Create stable identifiers, link them to other stable identifiers, and put them on the web.”
How to Get Involved
  • http://biscicol.blogspot.com/
  • http://code.google.com/p/biscicol/

BiSciCol ievobio
