
West coast rollout


FilteredPush use of Open Annotation Ontology presented at Stanford University, April 8 2013


  1. FilteredPush. PI: James Hanken. Project team: Maureen A. Kelly, David B. Lowery, Paul J. Morris, Robert A. Morris, James A. Macklin, Bertram Ludäscher, Tianhong Song, Sven Koehler. Institutions: Harvard University Herbaria; University of California, Davis; University of Massachusetts, Boston. Former project participants: Lei Dou, Chinua Iloabachie, Timothy McPhillips, Donna Tremonte, Zhimin Wang. Funding: NSF DBI #0960535 (Production, Year 3 of 3); NSF DBI #0646266 (Prototype, Complete).
  2. Annotate What?
     • Curatorial and scientific metadata for an estimated 3 billion specimens in the world's natural history collections; about 1 billion in the U.S.
     • Digital records exist for at most about 1.5% of these (a figure that also includes non-captured observations).
  3. [image-only slide]
  4. • Ongoing programs in the U.S., Europe, and Australia aim to digitize all collections.
     • In the U.S., a 10-year, $500M effort is now underway, combining automated and semi-automated capture.
       – Issues: QC; correlations to other paper and digital resources; intentionally changing data.
  5. The Problem: Data Quality
     • Collections & occurrence data is all over the map
       – … literally (off the map!)
       – … even after “digitization” (sensu stricto)
       – … and after “digitization” (sensu lato), a.k.a. “computerization”
     • Issues:
       – Lat/long transposition, coordinate & projection issues
       – Data entry/creation, “fuzzy” data, naming issues, bit rot, data conversions and transformations, schema mappings, … (you name it)
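The coordinate problems on slide 5 lend themselves to simple automated checks. A purely illustrative sketch (not FilteredPush code): flag out-of-range decimal coordinates, and catch the classic lat/long transposition, where a "latitude" beyond ±90 would be legal as a longitude.

```python
def check_coordinates(lat, lon):
    """Return a list of QC flags for a decimal latitude/longitude pair."""
    flags = []
    if not -90 <= lat <= 90:
        flags.append("latitude out of range")
        # Heuristic: if the out-of-range "latitude" would be a legal
        # longitude and the "longitude" a legal latitude, the values
        # were probably transposed at data entry.
        if -90 <= lon <= 90 and -180 <= lat <= 180:
            flags.append("possible lat/long transposition")
    if not -180 <= lon <= 180:
        flags.append("longitude out of range")
    return flags

print(check_coordinates(42.3, -71.1))     # clean record: []
print(check_coordinates(136.86, 36.86))   # transposed pair: two flags
```

Range checks like this catch only the grossest errors; "fuzzy" data, naming issues, and projection problems need record-level and cross-collection analysis.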
  6. (1) Kvetch about data → (2) Push to interested parties → (3) Filter → (4) Change data in databases → (5) Store all assertions
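The five steps on slide 6 can be sketched as a toy publish/filter/apply loop. This is an illustration of the flow, not the FilteredPush API; all names here are made up.

```python
annotations = []   # (5) durable log of every assertion, applied or not
subscribers = []   # (filter_predicate, handler) pairs

def subscribe(predicate, handler):
    subscribers.append((predicate, handler))

def push(annotation):
    annotations.append(annotation)        # (5) store all assertions
    for predicate, handler in subscribers:
        if predicate(annotation):         # (3) filter
            handler(annotation)           # (2) push to the interested party

# A toy collection database keyed by occurrence id.
database = {"occ1": {"dwc:decimalLatitude": 136.86}}

def apply_fix(annotation):
    # (4) change data in the local database (at the curator's discretion)
    database[annotation["target"]].update(annotation["body"])

# A curator subscribes only to annotations touching latitude.
subscribe(lambda a: "dwc:decimalLatitude" in a["body"], apply_fix)

# (1) someone kvetches: latitude 136.86 is impossible; propose a correction.
push({"target": "occ1", "body": {"dwc:decimalLatitude": 36.86}})
```

The key property is that step (5) is unconditional: the assertion survives in the log even if no subscriber applies it, so later consumers can still evaluate it.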
  7. Modeling distributed annotations of mutable, actionable, distributed data
     • OA is almost adequate:
       + clear distinction between annotation model and domain models
       + provenance support
       + web document-centric features are optional
       + SpecificResource model fits data annotation well
       + use of CNT for machine-opaque resources
       + (RDF is friendly to semantic pub/sub)
  8. Modeling distributed annotations of mutable, actionable, distributed data
     • OA is almost adequate, but still needed:
       – a model of Evidence for assertions in the Body
       – a model to convey the annotator's Expectation to consuming apps
       – a model for the use of queries as Selectors
       (–) modeling “transcription” as an oa:Motivation
       (?) modeling domain associations between Target and Body
  9. :anAnnotation a oa:Annotation ;
         oa:hasTarget :nsptarget1 ;
         oa:hasBody <> ;
         oa:hasBody :invalidLatitudeText .
     :nsptarget1 a oad:AnySuchResource ;
         oa:hasSelector :findInvalidLatitudes .
     :findInvalidLatitudes a oad:SparqlQuerySelector ;
         oad:hasQuery "SELECT DISTINCT ?x WHERE { ?x a dwcFP:Occurrence . ?x dwc:hasLatitude ?lat . FILTER (?lat > 90 || ?lat < -90) }" ;
         […] .
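The point of a query-based Selector is that the annotation's targets are not enumerated up front: whatever matches the query at evaluation time is a target. A minimal sketch of that resolution step over an in-memory record set (illustrative names, not FilteredPush code); the predicate mirrors the SPARQL FILTER above.

```python
occurrences = {
    "occ1": {"dwc:hasLatitude": 95.0},    # invalid: > 90
    "occ2": {"dwc:hasLatitude": 42.0},    # valid
    "occ3": {"dwc:hasLatitude": -120.0},  # invalid: < -90
}

def find_invalid_latitudes(records):
    """Python analogue of: SELECT DISTINCT ?x WHERE { ... FILTER (?lat > 90 || ?lat < -90) }"""
    return sorted(x for x, r in records.items()
                  if r["dwc:hasLatitude"] > 90 or r["dwc:hasLatitude"] < -90)

# Resolving the selector yields the annotation's targets for this dataset.
targets = find_invalid_latitudes(occurrences)
print(targets)  # ['occ1', 'occ3']
```

Run against a different or updated dataset, the same annotation resolves to a different target set, which is what makes query selectors suit mutable, distributed data.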
  10. [Architecture diagram: FilteredPush network instances (FP Access Point, Triage, Messaging via JMS-KVP, a Knowledge store backed by Fuseki/SPARQL, MySQL, Mongo, and Fedora, an RDF Handler, Analysis via Kepler kuration workflows, and Push) connected to client tools — Specify 6 (mapper, Specify6 and Specify6-HUH drivers, SQL driver), Morphbank, and Symbiota (FP-PHP-Library) — through ClientLibrary, SparqlRules, AnnotationProcessors, the OA/OAD and dwcFP vocabularies, and FP-oauth-consumer/FP-oauth-provider components.]
  11. Who asserted photo:owner?
  12. No confusion about the provenance of the photo:owner assertion?
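Slides 11 and 12 contrast a bare photo:owner triple (asserter unknown) with an annotation that carries its own provenance. A hedged sketch of the latter as a JSON structure: the property names follow the 2013-era OA vocabulary (oa:annotatedBy, oa:annotatedAt), while the photo and person URIs are invented for illustration.

```python
import json

annotation = {
    "@id": "urn:uuid:example-annotation-1",
    "@type": "oa:Annotation",
    "oa:hasTarget": "http://example.org/photo/123",
    # The Body carries the domain claim: Alice owns the photo.
    "oa:hasBody": {"photo:owner": "http://example.org/person/alice"},
    # Provenance lives on the annotation itself, so the claim is
    # attributable to a specific agent at a specific time.
    "oa:annotatedBy": "http://example.org/person/bob",
    "oa:annotatedAt": "2013-04-08T10:00:00Z",
}

print(json.dumps(annotation, indent=2))
```

Because the asserter (Bob) is attached to the annotation rather than to the bare triple, a consumer can weigh, accept, or reject the photo:owner claim without ambiguity about its source.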
  13. END — http://wiki.filteredpush.org — video there in a few(?) weeks
  14. Specimen record annotation