BiSciCol + VertNet: A Conceptual and Technical Framework for Identifying Specimens

A conceptual framework for implementing persistent, resolvable, globally unique identifiers in data aggregation frameworks, and the impact of those identifiers on data publishing and downstream linking.

Published in: Education, Technology

  • Lifecycle of a record
    1. A publisher uploads a record to IPT, which publishes it in a Darwin Core Archive.
    2. VertNet processes the Darwin Core Archive into a set of CSV files that target specific use cases.
    3. The CSV files are bulk-loaded into various datastores, including Amazon S3, CartoDB, and the Google App Engine datastore.
    4. Once in a datastore, records are served as JSON over HTTP APIs, which power apps such as the VertNet portal.
  • BiSciCol + VertNet: A Conceptual and Technical Framework for Identifying Specimens

    1. BiSciCol + VertNet: A Conceptual and Technical Framework for Identifying Specimens. iEvoBio Flash Talk 2013. Aaron Steele, University of California, Berkeley; John Deck, University of California, Berkeley; Rob Guralnick, University of Colorado, Boulder
    2. VertNet
    3. VertNet: Lifecycle of a Record
    4. BiSciCol / Identifier Review of Challenges
       • <1% DwC Triplet match between GenBank and VertNet
       • Identifiers are not awesome (not persistent, resolvable, or even globally unique)
    5. BCID Technology (from software bazaar)
       • ark:/21547/ = scheme plus name assigning authority
       • R2 = BCID group identifier; defines a common concept per dataset
       • ark:/21547/R2 = uniquely identifies a processed data instance
       • separator = _
       • suffix = 550e8400-e29b-41d4-a716-446655440000 (shown abbreviated as 550e8400-e29b...)
       • The suffix is assigned by VertNet and can be resolved by both the EZID and BCID systems using suffix passthrough.
    6. A Conceptual and Technical Framework for Identifying Specimens with (VertNet + BiSciCol)
    7. IDs in the Data Lifecycle
       • IC:CC:CN (literal)
       • ark:/21547/R2 (group)
       • ark:/21547/R2_{LocalID}
       • ark:/21547/S2_{UUID}
       • Diagram labels: Identifiers Maintained (×4), Machine Interpretation
    8. Using awesome identifiers, we can track all metadata instances from publisher to aggregator through applications.
       • Publisher: IC:CC:CN (literal), ark:/21547/R2_{LocalID}
       • Aggregator: ark:/21547/S2_{UUID}
       • Diagram labels: Machine_interpretation, VertNet, EOL, iDigBio, GenBank, Resolver, Applications, Aggregations, Source
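
The identifier anatomy on slide 5 (scheme plus name assigning authority, group identifier, separator, suffix) and the IC:CC:CN triplet on slide 7 can be illustrated with a small Python sketch. This is not the BCID or VertNet implementation. The names `Bcid`, `parse_bcid`, `dwc_triplet`, and `is_uuid` are hypothetical, the triplet values are made up, and reading IC:CC:CN as institution code, collection code, and catalog number is an assumption based on the usual Darwin Core triplet.

```python
import uuid
from typing import NamedTuple, Optional


class Bcid(NamedTuple):
    """Hypothetical container for the parts of a BCID-style ARK."""
    scheme: str            # "ark"
    naan: str              # name assigning authority number, e.g. "21547"
    group: str             # BCID group identifier, e.g. "R2"
    suffix: Optional[str]  # local ID or UUID after the separator, if any

    @property
    def group_id(self) -> str:
        # The group-level identifier, e.g. "ark:/21547/R2"
        return f"{self.scheme}:/{self.naan}/{self.group}"


def parse_bcid(identifier: str, separator: str = "_") -> Bcid:
    """Split 'ark:/<naan>/<group>[<separator><suffix>]' into its parts."""
    scheme, marker, rest = identifier.partition(":/")
    if scheme != "ark" or not marker:
        raise ValueError(f"not an ARK identifier: {identifier!r}")
    naan, slash, name = rest.partition("/")
    if not (naan and slash and name):
        raise ValueError(f"missing authority or name: {identifier!r}")
    group, found, suffix = name.partition(separator)
    return Bcid(scheme, naan, group, suffix if found else None)


def is_uuid(text: str) -> bool:
    """Return True if text parses as a UUID (as in the S2_{UUID} form)."""
    try:
        uuid.UUID(text)
        return True
    except ValueError:
        return False


def dwc_triplet(institution_code: str, collection_code: str,
                catalog_number: str) -> str:
    """Build the literal IC:CC:CN triplet (assumed field meanings)."""
    return ":".join([institution_code, collection_code, catalog_number])


parsed = parse_bcid("ark:/21547/R2_550e8400-e29b-41d4-a716-446655440000")
print(parsed.group_id)                   # ark:/21547/R2
print(parsed.suffix)                     # 550e8400-e29b-41d4-a716-446655440000
print(is_uuid(parsed.suffix))            # True
print(dwc_triplet("IC", "CC", "12345"))  # IC:CC:12345 (made-up values)
```

Splitting on the first separator keeps the group identifier intact even when the suffix is a UUID containing hyphens. Resolution through EZID or BCID is not modeled here, since the slides do not describe the resolver interface.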
