Reflections on Provenance Ontology Encodings


Published in: Technology, Education
  • 1. Reflections on Provenance Ontology Encodings. Li Ding (1), Jie Bao (1), James Michaelis (1), Jun Zhao (2), Deborah L. McGuinness (1). (1) Tetherless World Constellation, RPI; (2) Image Bioinformatics Research Group, Oxford
  • 2. From Provenance Vocabulary to Provenance Ontology
    • Provenance Vocabulary
      • Metadata Only
      • e.g. PREMIS, ICS-SRC, DDMS
    • Provenance Ontology
      • Metadata + Inference
      • e.g. OPM, PML, DCTerms
    • Motivation : provide guidance to understand, align and evolve existing provenance ontologies
    Source: PREMIS Data Dictionary, version 2.0
  • 3. Semantic Web Provenance Ontologies
    • Selection Criteria
      • Declarative: encoded in OWL and RDFS
      • In-Use: applied by communities
    • Selected Semantic Web Provenance Ontologies
      • Open Provenance Model (OPM - Moreau et al. 2009)
      • Proof Markup Language (PML2 - McGuinness et al. 2007)
      • Dublin Core Terms (DCTerms - 2008)
      • Provenance Vocabulary (PRV - Hartig and Zhao 2009)
      • Provenir (Sahoo et al. 2008) importing OBO-RO
    • Related Semantic Web ontologies: FOAF, WGS84, OWL-Time, Web of Trust, …
  • 4. Examples: OPM, PML and DCTerms (Sources: OPM specification v1.1; PML2 specification)
    • provenance concept correlation
    • provenance relation classification
    • complex provenance structure
  • 5. Basic Statistics
    • Small ontology
    • Within OWL/OWL2 DL expressivity
    • Not tractable (none fits any OWL 2 profile)
    • Per-ontology statistics (# of triples / # of classes / # of properties; OWL species; DL expressivity*)
      • opm: 309 / 20 / 26; OWL DL; ALCF(D)
      • pmlp: 505 / 30 / 47; OWL DL; ALCHIF(D)
      • pmlj: 207 / 8 / 21; OWL DL; ALHF(D)
      • dcterms: 857 / 22 / 55; RDFS; ALH(D)
      • prv: 304 / 14 / 17; OWL 2 DL; RI(D)
      • provenir: 136 / 8 / 2; OWL DL; ALCH
      • ro: 268 / 0 / 24; OWL Lite; ALR+HI
    * DL expressivity is computed before importing external ontologies
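The slide does not say how the triple, class and property counts were obtained. As a minimal sketch, assuming an ontology is available as a set of (subject, predicate, object) triples and that classes and properties are counted by their `rdf:type` declarations, the counts could be computed as follows. The `ex:` terms and the `basic_stats` helper are hypothetical and only serve as illustration.

```python
# Sketch only: counts triples, declared classes and declared properties
# in a toy ontology given as (subject, predicate, object) string triples.
# The counting rule (rdf:type declarations) is an assumption, not the
# authors' documented procedure.

RDF_TYPE = "rdf:type"
CLASS_TYPES = {"owl:Class", "rdfs:Class"}
PROPERTY_TYPES = {"owl:ObjectProperty", "owl:DatatypeProperty", "rdf:Property"}

# Hypothetical toy ontology (all ex: names are invented for this example).
TOY_ONTOLOGY = [
    ("ex:Agent", RDF_TYPE, "owl:Class"),
    ("ex:Artifact", RDF_TYPE, "owl:Class"),
    ("ex:wasGeneratedBy", RDF_TYPE, "owl:ObjectProperty"),
    ("ex:wasGeneratedBy", "rdfs:domain", "ex:Artifact"),
    ("ex:createdAt", RDF_TYPE, "owl:DatatypeProperty"),
]


def basic_stats(triples):
    """Return the number of distinct triples, classes and properties."""
    unique = set(triples)
    classes = {s for s, p, o in unique if p == RDF_TYPE and o in CLASS_TYPES}
    properties = {s for s, p, o in unique if p == RDF_TYPE and o in PROPERTY_TYPES}
    return {
        "triples": len(unique),
        "classes": len(classes),
        "properties": len(properties),
    }


print(basic_stats(TOY_ONTOLOGY))
```

A real measurement would parse the published OWL/RDFS files with an RDF library rather than use hand-written triples; the counting logic stays the same.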
  • 6. Semantic Analysis
    • What can be learned from these OWL encodings of provenance ontology?
    • Concept Coverage (Vocabulary)
      • What primitive provenance concepts should be supported?
      • What are the differences between the primitive concepts?
    • Concept modeling (Computation)
      • What kinds of provenance computation are captured?
      • How are computational provenance semantics modeled?
  • 7. Empirically Identified Themes (5W+H)
    • We empirically identify themes to group provenance primitives
    • Agents (Who) : Actionable entities that can take actions in an event.
    • Artifacts (Who) : Entities made by agents and involved in events.
    • Events (What) : Observable occurrence(s), execution of action(s) (potentially including the past).
    • Methods (How): Entities denoting the operations (or actions) used (or mentioned) in events.
    • Time (When): Temporal concepts, such as time and date when things were created (or updated), primarily used for annotating events.
    • Space (Where) : Geospatial concepts such as locations, GPS coordinates and regions.
    • (Why is left out …)
  • 8. Theme Coverage Analysis
    • Similarity
    • Difference
      • Agents: Immutable (OPM1.1), Taxonomy (e.g. PML2)
      • Events: complex structure (OPM1.1, PML2) vs binary relation
      • Method: declarative method (PML2, DC, PRV)
      • Time: time structure (OPM1.1, DCTerms)
      • Space: spatial property (DCTerms, Provenir)
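    • Theme coverage by ontology
      • Agents: OPM 1.1: Agent; PML 2.0: Agent; DCTerms: Agent; PRV core: Actor; Provenir (+OBO-RO): Agent
      • Artifacts: OPM 1.1: Artifact; PML 2.0: IdentifiedThing, Information; DCTerms: PhysicalResource; PRV core: Artifact; Provenir (+OBO-RO): Data
      • Events: OPM 1.1: Process, WasGeneratedBy; PML 2.0: pmlp:SourceUsage, pmlj:InferenceStep; DCTerms: ProvenanceStatement, source; PRV core: Execution; Provenir (+OBO-RO): provenir:process, ro:derives_from
      • Methods: PML 2.0: InferenceRule; DCTerms: MethodOfAccrual, Policy; PRV core: DataCreationGuide
      • Time: OPM 1.1: OTime; PML 2.0: hasCreationDateTime; DCTerms: PeriodOfTime; PRV core: performedAt; Provenir (+OBO-RO): temporal_parameter
      • Space: DCTerms: Location; Provenir (+OBO-RO): Spatial_parameter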
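The theme coverage on this slide can be held in a small lookup structure, which makes gaps easy to query. The sketch below is one possible encoding, not part of the original deck: the cell assignments follow a reading of the slide's table, with the method and space entries placed according to the "Difference" bullets, and the helper names `uncovered` and `covering` are hypothetical.

```python
# Sketch only: theme coverage per ontology as read from the slide.
# A missing key means the slide lists no primitive for that theme.

THEMES = ("agents", "artifacts", "events", "methods", "time", "space")

COVERAGE = {
    "OPM 1.1": {
        "agents": ["Agent"],
        "artifacts": ["Artifact"],
        "events": ["Process", "WasGeneratedBy"],
        "time": ["OTime"],
    },
    "PML 2.0": {
        "agents": ["Agent"],
        "artifacts": ["IdentifiedThing", "Information"],
        "events": ["pmlp:SourceUsage", "pmlj:InferenceStep"],
        "methods": ["InferenceRule"],
        "time": ["hasCreationDateTime"],
    },
    "DCTerms": {
        "agents": ["Agent"],
        "artifacts": ["PhysicalResource"],
        "events": ["ProvenanceStatement", "source"],
        "methods": ["MethodOfAccrual", "Policy"],
        "time": ["PeriodOfTime"],
        "space": ["Location"],
    },
    "PRV core": {
        "agents": ["Actor"],
        "artifacts": ["Artifact"],
        "events": ["Execution"],
        "methods": ["DataCreationGuide"],
        "time": ["performedAt"],
    },
    "Provenir (+OBO-RO)": {
        "agents": ["Agent"],
        "artifacts": ["Data"],
        "events": ["provenir:process", "ro:derives_from"],
        "time": ["temporal_parameter"],
        "space": ["Spatial_parameter"],
    },
}


def uncovered(ontology):
    """Themes for which the given ontology lists no primitive."""
    return [t for t in THEMES if not COVERAGE[ontology].get(t)]


def covering(theme):
    """Ontologies that list at least one primitive for the given theme."""
    return [name for name, themes in COVERAGE.items() if themes.get(theme)]


print(uncovered("OPM 1.1"))
print(covering("space"))
```

Keeping the primitives as plain strings reflects the slide's point that same-theme primitives are grouped, not equated: `Actor` and `Agent` share a theme without being declared equivalent.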
  • 9. Computational Model
    • Provenance computations are reflected by the use of four categories of RDFS and OWL ontology constructs
    • Provenance graph inference can be modeled by owl:TransitiveProperty or OWL2 property chain inference.
    • Construct usage across opm, pmlp, pmlj, dcterms, prv, provenir and ro (number of ontologies using each construct)
      • Concept Taxonomy: rdfs:subClassOf (6), rdfs:subPropertyOf (6), owl:disjointWith (4), owl:unionOf (3), owl:equivalentClass (1)
      • Inference on relations: owl:inverseOf (3), owl:TransitiveProperty (1)
      • Constraints: rdfs:domain / rdfs:range (6), owl:allValuesFrom (4), cardinality restriction (3)
      • Concept Reuse: owl:imports (2); reused ontologies: foaf, …; ro; owl:propertyChainAxiom (1)
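To illustrate the two inference styles named on this slide, here is a minimal sketch of what a reasoner derives from a transitive property and from a two-step property chain. It works on plain tuples instead of an OWL reasoner, and every `ex:` term, along with the function names, is a hypothetical example and is not taken from OPM, PML or any other ontology discussed here.

```python
# Sketch only: forward inference over (subject, predicate, object) triples.

def transitive_closure(triples, prop):
    """Triples entailed by treating `prop` as an owl:TransitiveProperty."""
    closure = {(s, o) for s, p, o in triples if p == prop}
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return {(a, prop, d) for a, d in closure}


def property_chain(triples, chain, implied):
    """Triples entailed by a two-step chain: x p1 y and y p2 z give x implied z."""
    p1, p2 = chain
    first = [(s, o) for s, p, o in triples if p == p1]
    second = [(s, o) for s, p, o in triples if p == p2]
    return {(x, implied, z) for x, y in first for y2, z in second if y == y2}


# Hypothetical provenance graph.
GRAPH = [
    ("ex:report", "ex:derivedFrom", "ex:table"),
    ("ex:table", "ex:derivedFrom", "ex:rawData"),
    ("ex:table", "ex:generatedBy", "ex:run1"),
    ("ex:run1", "ex:controlledBy", "ex:alice"),
]

derived = transitive_closure(GRAPH, "ex:derivedFrom")
attributed = property_chain(
    GRAPH, ("ex:generatedBy", "ex:controlledBy"), "ex:attributedTo"
)
print(sorted(derived))
print(sorted(attributed))
```

The transitive case adds the indirect derivation from `ex:report` to `ex:rawData`; the chain case links an artifact to the agent that controlled the run producing it, which a transitive property alone cannot express because two different relations are involved.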
  • 10. Conclusion
    • Findings
      • Although provenance primitives can be grouped by theme, they may not be fully interchangeable because of their semantic differences
      • The coverage on time and location themes could leverage existing ontologies such as OWL-Time
      • Not all provenance computations can be fully expressed using OWL (see Tao et al. 2010)
    • Future work
      • The results of this work will be contributed to the W3C Incubator Group