Published in: Education
  1. 1. Provenance Requirements for the Next Version of RDF Christian Bizer (Freie Universität Berlin), Yolanda Gil (ISI, University of Southern California), Jun Zhao (University of Oxford), Satya Sahoo (Kno.e.sis Center, Wright State University), Paolo Missier (University of Manchester) W3C Provenance Incubator Group
  2. 2. Outline ● Background ● Gathering provenance requirements ● From both user and technical perspectives ● Along three dimensions: content, management and use ● Requirements for RDF ● Further information
  3. 3. The W3C Provenance Incubator Group ● Formed in September 2009 as part of the W3C Semantic Web Activity ● Aims to provide ● A state-of-the-art understanding, and ● A roadmap in the area of provenance for Semantic Web technologies, development, and possible standardization
  4. 4. A Definition of Web Provenance The initial sources of information used as well as any entity and process involved in producing a data item The data can be any web resource: a document, an image, a dataset, an RDF statement or a set of RDF graphs, ....
  5. 5. The Importance of Provenance
  6. 6. The Importance of Provenance
  7. 7. The Key Idea ● We require additional capabilities that the current standard RDF model does not offer ● Identity management of RDF statements ● Annotation framework ● .... ● For interoperability we require standardized vocabularies and best practices for provenance descriptions
  8. 8. Where do our requirements come from?
  9. 9. Activities So Far ● Collected >30 provenance use cases ● Defined provenance dimensions ● Content: attribution, evolution, process, entailment, etc. ● Management: publication, access, scalability, etc. ● Use: interoperability, trust, understanding, debugging, etc. ● A provenance requirement document ● Three flagship use cases
  10. 10. What are the requirements?
  11. 11. Requirements from Provenance Content: What Needs to Be Represented Requirement 1: Identity ● Ability to refer to the resource being described ● An area of an image, an RDF graph, a set of RDF graphs... ● Resolving equality Requirement 2: Evolution ● How different versions are related ● What transformations were applied ● Best practices for minting new URIs
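
As an illustration of the Evolution requirement, the sketch below records how versions relate and which transformation produced each one, then walks the chain back to the original. It is a minimal sketch, not something taken from the slides: the `ex:` URIs, the `DERIVED_FROM` mapping and the `lineage` helper are hypothetical, and the version chain is assumed to be acyclic.

```python
# Hypothetical data: each version URI maps to (previous version URI, transformation applied).
DERIVED_FROM = {
    "ex:data/v3": ("ex:data/v2", "deduplicated"),
    "ex:data/v2": ("ex:data/v1", "unit conversion"),
}


def lineage(version, derived_from):
    """Walk back from `version` to the original, returning (version, transformation) steps.

    Assumes the derivation chain contains no cycles.
    """
    steps = []
    while version in derived_from:
        previous, transformation = derived_from[version]
        steps.append((version, transformation))
        version = previous
    # The original version has no recorded transformation.
    steps.append((version, None))
    return steps


history = lineage("ex:data/v3", DERIVED_FROM)
for uri, transformation in history:
    print(uri, transformation)
```

Giving every version its own URI, as the slide's best-practice bullet suggests, is what lets each step in the chain be referred to unambiguously.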
  12. 12. Requirements from Provenance Content: What Needs to Be Represented Requirement 3: Entailment ● Represent the distinction between asserted versus inferred provenance
  13. 13. Requirements from Provenance Management Requirement 4: Publication ● Linking provenance assertions with the resource ● How to publish provenance: embed or link? ● Associate publisher’s identification (e.g., digital signature) Requirement 5: Querying ● Query formulation: may mix references to the resource and to its provenance ● Efficient query execution
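
To make the Querying requirement concrete, the sketch below shows a query that mixes references to a resource with references to its provenance: it returns only those values that were asserted in graphs published by a given source. It is a rough illustration under stated assumptions: statements are modelled as plain (subject, predicate, object, graph) tuples, and every identifier (`ex:publishedBy`, `values_from_publisher`, the sample data) is hypothetical.

```python
# Hypothetical quads: (subject, predicate, object, graph).
QUADS = [
    ("ex:paris", "ex:population", "2100000", "ex:g1"),
    ("ex:paris", "ex:population", "2200000", "ex:g2"),
    # Provenance statements describe the graphs themselves.
    ("ex:g1", "ex:publishedBy", "ex:trustedAgency", "ex:prov"),
    ("ex:g2", "ex:publishedBy", "ex:unknownBlog", "ex:prov"),
]


def values_from_publisher(quads, subject, predicate, publisher):
    """Return objects of (subject, predicate, ?o) found in graphs published by `publisher`."""
    # Step 1 (provenance side): find the graphs attributed to the publisher.
    graphs = {s for (s, p, o, g) in quads if p == "ex:publishedBy" and o == publisher}
    # Step 2 (resource side): keep only matching statements from those graphs.
    return [o for (s, p, o, g) in quads if s == subject and p == predicate and g in graphs]


result = values_from_publisher(QUADS, "ex:paris", "ex:population", "ex:trustedAgency")
print(result)
```

The two list scans are deliberately naive; the slide's "efficient query execution" bullet is about avoiding exactly this kind of full scan on large datasets.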
  14. 14. Requirements from Provenance Use ● No requirements were uncovered
  15. 15. State of the Art ● Extensions/alternatives to the RDF model ● RDF reification – Querying is cumbersome – Others ... ● Named Graphs ● OWL annotations ● RDF molecules, Temporal RDF, PaCE Model ....
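
The difference between the first two approaches can be sketched as follows. Reification describes one statement with four extra triples before any provenance can be attached, which is one reason querying becomes cumbersome; a named graph leaves the triple intact and attaches provenance to the graph name. This is a simplified model using plain tuples rather than an RDF library, and the `ex:` identifiers and helper functions are hypothetical.

```python
RDF = "rdf:"


def reify(statement_id, s, p, o):
    """Return the four triples that RDF reification uses to describe one statement."""
    return [
        (statement_id, RDF + "type", RDF + "Statement"),
        (statement_id, RDF + "subject", s),
        (statement_id, RDF + "predicate", p),
        (statement_id, RDF + "object", o),
    ]


def in_named_graph(graph_name, s, p, o):
    """Return one quad: the triple plus the name of the graph that contains it."""
    return (s, p, o, graph_name)


triple = ("ex:dataset1", "ex:temperature", "21.5")

# Reification: four bookkeeping triples, then provenance on the statement node.
reified = reify("ex:stmt1", *triple)
reified.append(("ex:stmt1", "ex:assertedBy", "ex:alice"))

# Named graph: the original triple is kept as-is; provenance is attached to the graph name.
quads = [
    in_named_graph("ex:g1", *triple),
    ("ex:g1", "ex:assertedBy", "ex:alice", "ex:provenance"),
]

print(len(reified), len(quads))
```

Note that the reified form does not itself assert the original triple, so a store typically holds it in addition to the five triples shown.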
  16. 16. State of the Art (Cont.) ● Vocabularies/ontologies to express provenance information ● The Open Provenance Model (OPM) ● Inference Web - Proof Markup Language (PML) ● The Provenance Vocabulary ● Dublin Core ● Open Archives Initiative - Object Reuse and Exchange (OAI-ORE) ● Semantic Web Publishing Vocabulary ● The SWAN-SIOC alignment ● The Changeset Vocabulary ● .......
  17. 17. Provenance Requirements to the RDF Community ● Identification ● Of any artifact, be it a resource, a single RDF statement, a set of RDF statements or Web resources ● Identity management ● Annotations of RDF graphs ● Standardized schemata, ontologies and vocabularies
  18. 18. Activities Ongoing ● Mapping key terms from various provenance-related vocabularies ● Report on the state-of-the-art in the area of provenance
  19. 19. See Also ● The incubator group: ● Provenance requirement document: uirements ● Mapping provenance-related vocabularies: e_Vocabulary_Mappings
  20. 20. Special thanks to members and invited experts of the W3C Provenance Incubator Group and UK EPSRC This work is licensed under a Creative Commons Attribution-Share Alike 3.0 License (