
Services and Kew's (names) data



  1. Services, and Kew’s (names) data. Nicky Nicolson, RBG Kew
  2. Outline • What we’re working on at the moment • A names backbone • Kew’s role • What “services” we have just now • … and why so few? • Considerations • Services on the names backbone & timescales
  3. The current situation… Many overlapping systems, few links
  4. … and what we’re aiming for: Authoritative data, reduced duplication, many more links
  5. Names are key to linking the data: build a “names backbone” == “an environment for the management of multiple overlapping classifications and tracking how these change over time”. Not a monolith: • Built on a layered view of the domain – clearly separating names and taxonomy • Names form the objective basis for higher layers
  6. Names backbone: a layered environment
  7. Name occurrence layer, AKA “Nomen-clutter” == any attempt at the transcription of a name.
  8. Names layer. Holds objective published facts about a name: - Orthography - Authorship - Protologue reference - Type citation - Objective synonymy
  9. Concepts layer. Hypotheses draw names together to form concepts via heterotypic synonymy
  10. Names backbone is wider than Kew • We need to draw in data curated elsewhere, both names and concepts: • Vascular plants • “Lower” plants • Mycology • ... and zoological names… Kew’s role is as a service consumer as well as a service provider
  11. What services we have at the moment. Various things for particular projects… Used by known partners... Answering specific, tactical needs. Are these really services? • Not widely advertised • Not opened up for anybody to use... Not necessarily a strategic commitment
  12. Service example: OpenUp. Name and concept checking for the data quality toolkit. • Standard message format. But: • Concepts not persistently identified • No throughput management, so not widely available
  13. i.e. a (short term) system view… Many overlapping systems, few links
  14. … rather than the long term: Authoritative data, persistently identified
  15. Short term view == unhappy man in Glasgow
  16. A long-term, sustainable service: 1. Authoritative data 2. Persistently identified 3. Standards based
  17. ... and it also needs: • Robustness / sustainability • Management of throughput • Communications with end users • Support • Help • Example code • Usage monitoring • Sharing usage logs • Terms of use
  18. Analogy with collaborative development • Technical considerations vs • Social / political considerations
  19. All this should be service accessible… …persistently identified data classes & inter-connections
  20. Services: name occurrence layer - Data input / output: DwCA - Linking and reviewing links - RSS feeds to indicate activity
  21. Services: names layer - Data input / output: TCS - Propose addition / edit of names - RSS feeds to indicate activity
  22. Services: concepts layer - Data input / output: TCS - Create classifications using names - Propose addition / edit of names to names layer - RSS feeds
  23. How the names backbone will support services. We’re working to enable service level access to the data, by: • Establishing authority • Reducing duplication • Data standards to represent well-known entities • Persistent identifiers on those well-known entities • Meaningful versioning – what changed, when • Enabling remote curation
  24. Timescales 2013. Till March: First release: familial and generic classification. April – August: Extend to name occurrence layer; extend to species – incorporate WCS; prioritise compilation process. September onwards (inc TDWG): Comms w. service providers / consumers
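
The layered view in the slides (name occurrences, names, concepts) can be illustrated with a minimal data-model sketch. This is not Kew's implementation: the class names, field names, the `resolve` helper and the identifier strings such as "name:1" are all hypothetical, chosen only to show how persistently identified records in one layer might link to the next, and how one name can sit in several overlapping concepts.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class NameOccurrence:
    """Name occurrence layer: any attempted transcription of a name."""
    occurrence_id: str              # hypothetical persistent identifier
    verbatim: str                   # the string exactly as transcribed
    name_id: Optional[str] = None   # link to the names layer, once reviewed


@dataclass(frozen=True)
class Name:
    """Names layer: objective published facts about a name."""
    name_id: str                    # hypothetical persistent identifier
    orthography: str
    authorship: str
    protologue_reference: str
    type_citation: Optional[str] = None


@dataclass
class Concept:
    """Concepts layer: a hypothesis that draws several names together."""
    concept_id: str                 # hypothetical persistent identifier
    name_ids: List[str] = field(default_factory=list)


def resolve(occurrence: NameOccurrence, concepts: List[Concept]) -> List[Concept]:
    """Return every concept that includes the name this occurrence links to.

    An unlinked occurrence resolves to nothing; a linked one may resolve to
    several concepts, because classifications are allowed to overlap.
    """
    if occurrence.name_id is None:
        return []
    return [c for c in concepts if occurrence.name_id in c.name_ids]


# Placeholder data only: "Aus bus" is a made-up name, not a real taxon.
name = Name("name:1", "Aus bus", "Author", "Placeholder protologue reference")
linked = NameOccurrence("occ:1", "Aus bus Author", name_id=name.name_id)
unlinked = NameOccurrence("occ:2", "Aus bsu")   # mis-transcription, not yet reviewed

concepts = [
    Concept("concept:1", ["name:1", "name:2"]),
    Concept("concept:2", ["name:1"]),
    Concept("concept:3", ["name:3"]),
]

matching_ids = [c.concept_id for c in resolve(linked, concepts)]
print(matching_ids)
print(resolve(unlinked, concepts))
```

Keeping the link from occurrence to name optional reflects the "linking and reviewing links" service on the name occurrence layer: a transcription can exist before anyone has matched it to a name.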