SWIB 2013
Tutorial
on
Metadata Provenance

Slides: http://bit.ly/swib13-provenance
Part 2:
Modelling provenance information
using the PROV ontology

SWIB 2013 Tutorial on Metadata Provenance

25.11.2013
2
Agenda
Modelling Provenance 1
A data model for provenance information
Introducing the PROV ontology
Extending the basic el...
A data model for provenance information

Motivation
Now that we have a handle to our data, we want to
describe its provenance in detail
by Magnus
on 2013-11-14 13:...
Options for expressing provenance
Using existing generic vocabularies
Extending/creating a domain-specific vocabulary
Usin...
Using a generic vocabulary: Dublin Core
Dublin Core Metadata Initiative (DCMI)
Element set
15 basic terms
No defined range...
Example
Namespace
Element set --> dc:
Terms --> dcterms: or dct:
ex:doc1 dct:title "A mapping from Dublin Core..." .
ex:do...
Distinction
Some terms contain only information about the
resource itself
But not how or when it was produced

→ Descripti...
Provenance in DC: Who?
Terms
Contributor
Creator
Publisher
RightsHolder

Range is dct:Agent
a resource that acts or has th...
Provenance in DC: When?
Terms
Available
Created
Date
DateAccepted
DateCopyrighted
DateSubmitted
Issued
Modified
Valid
Provenance in DC: When?
Ranges
Date range
Available, valid

Single date
All others

Dates are basic provenance information...
Provenance in DC: How?
Terms

Derivation and Replacement

IsVersionOf, hasVersion
IsFormatOf , hasFormat
References, isRef...
dcterms:provenance
Definition
“statement of any changes in ownership and custody of the
resource since its creation that a...
Summary
More than half of the DC terms deal with provenance
related information
Who?
When?
How?
What?

Missing information...
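The who/when/how grouping summarised above can be sketched in a few lines of Python. This is only an illustration: the three term sets are copied from the "Provenance in DC" slides, the statements are the ones from the Dublin Core example slide, and the function name `provenance_aspect` is made up for this sketch. Terms that appear in none of the three lists are reported as unclassified rather than as "descriptive", because the slides do not give a complete list of descriptive terms.

```python
# Term groups copied from the "Provenance in DC" slides (local names only).
WHO = {"contributor", "creator", "publisher", "rightsHolder"}
WHEN = {"available", "created", "date", "dateAccepted", "dateCopyrighted",
        "dateSubmitted", "issued", "modified", "valid"}
HOW = {"isVersionOf", "hasVersion", "isFormatOf", "hasFormat", "references",
       "isReferencedBy", "replaces", "isReplacedBy", "source", "hasPart",
       "isPartOf", "accrualMethod"}


def provenance_aspect(prop):
    """Return 'who', 'when' or 'how' for a listed term, else None.

    Hypothetical helper for this sketch, not part of any DC tooling.
    """
    local = prop.split(":", 1)[-1]  # "dct:creator" -> "creator"
    for aspect, terms in (("who", WHO), ("when", WHEN), ("how", HOW)):
        if local in terms:
            return aspect
    return None


# Statements from the Dublin Core example slide, as (subject, property, object).
statements = [
    ("ex:doc1", "dct:title", "A mapping from Dublin Core..."),
    ("ex:doc1", "dct:creator", "ex:kai"),
    ("ex:doc1", "dct:created", "2012-02-28"),
    ("ex:doc1", "dct:publisher", "ex:w3c"),
    ("ex:doc1", "dct:issued", "2012-02-29"),
    ("ex:doc1", "dct:subject", "ex:dublincore"),
    ("ex:doc1", "dct:replaces", "ex:doc2"),
    ("ex:doc1", "dct:format", "HTML"),
]

for _, prop, _ in statements:
    print(prop, "->", provenance_aspect(prop) or "not in the three lists")
```

Five of the eight example statements fall into one of the three provenance groups; title, subject and format do not.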
Extending a domain-specific vocabulary
Domain-specific vocabularies often deal with aspects
of provenance
e.g. the SWAN On...
Example: PAV module of SWAN
properties
importedBy - An entity responsible for importing the data from an
external source
i...
Example: PAV module of SWAN
properties
sourceAccessedOn - The date when the original source
has been accessed to create th...
Extending a domain-specific vocabulary
Other ontologies have similar approaches
→ Aspects, granularity and terminology dif...
Vocabularies for modelling provenance
Provenir
Published in 2009

Open Provenance Model (OPM)
Published in 2010

W3C Provenance Incubator Group (PROV-XG)
From 2009-2010
Chaired by Yolanda Gil

“Provenance XG Final Report”
http://www.w3...
W3C Provenance Incubator Group (PROV-XG)
Discussion of requirements for provenance on the web
http://www.w3.org/2005/Incub...
W3C Provenance Working Group
Active from 04/2011 to 07/2013
Co-chaired by Paul Groth and Luc Moreau

Goal
The mission of t...
W3C Provenance Working Group
Implementation of the PROV-XG recommendations
"A provenance framework should support:
the cor...
Introducing the PROV ontology

PROV Ontology (PROV-O)

http://www.w3.org/TR/prov-overview/

Entities
PROV-O allows recording the provenance of entities
Entities are all kinds of things
Physical: books, articles, re...
Activities
Model the dynamic aspects of the world
Occurs over a period of time and acts upon or with
entities
Includes con...
Agents
Bear responsibility for
an activity taking place
for the existence of an entity
for another agent's activity

Examp...
Prov-O Basic Elements

Source: http://www.w3.org/TR/prov-primer/

Starting properties
prov:wasAttributedTo
prov:wasGeneratedBy
prov:used

prov:startedAtTime
prov:endedAtTime
prov:wasInform...
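Each starting property connects particular kinds of things, for example an Entity to the Activity that generated it. The sketch below records the subject and object class of each starting property and checks a handful of statements against them. Two caveats: treating these classes as constraints is a simplification made for illustration (in RDFS, domain and range license inferences rather than validation), and the function name `check` and the sample statements are assumptions of this sketch, not taken from the slides.

```python
# (subject class, object class) for each starting property.
# None as object class means "a literal value", which is not checked here.
STARTING_PROPERTIES = {
    "prov:wasAttributedTo": ("prov:Entity", "prov:Agent"),
    "prov:wasGeneratedBy": ("prov:Entity", "prov:Activity"),
    "prov:used": ("prov:Activity", "prov:Entity"),
    "prov:startedAtTime": ("prov:Activity", None),
    "prov:endedAtTime": ("prov:Activity", None),
    "prov:wasInformedBy": ("prov:Activity", "prov:Activity"),
    "prov:wasAssociatedWith": ("prov:Activity", "prov:Agent"),
    "prov:wasDerivedFrom": ("prov:Entity", "prov:Entity"),
    "prov:actedOnBehalfOf": ("prov:Agent", "prov:Agent"),
}


def check(triples, types):
    """Return a list of human-readable problems (hypothetical helper)."""
    problems = []
    for s, p, o in triples:
        if p not in STARTING_PROPERTIES:
            continue  # not a starting property, nothing to check
        subject_class, object_class = STARTING_PROPERTIES[p]
        if subject_class not in types.get(s, set()):
            problems.append(f"{s} is not typed {subject_class} (subject of {p})")
        if object_class is not None and object_class not in types.get(o, set()):
            problems.append(f"{o} is not typed {object_class} (object of {p})")
    return problems


# Illustrative typing and statements (assumed for this sketch).
types = {
    "ex:article": {"prov:Entity"},
    "ex:draft": {"prov:Entity"},
    "ex:revise": {"prov:Activity"},
    "ex:student": {"prov:Agent"},
}
good = [
    ("ex:article", "prov:wasGeneratedBy", "ex:revise"),
    ("ex:revise", "prov:used", "ex:draft"),
    ("ex:article", "prov:wasAttributedTo", "ex:student"),
]
bad = [("ex:article", "prov:used", "ex:draft")]  # an Entity as subject of prov:used

print(check(good, types))
print(check(bad, types))
```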
Example: Provenance of a conference paper
The paper was written by a student
The final version of the paper is based on an...
Example: Entities
[Diagram: entities ex:draft, ex:article, ex:dataset, ex:book, ex:result and ex:comment, with dcterms:title values “Latest res...
Example: Relation between Entities
[Diagram: wasDerivedFrom links among ex:article, ex:comment, ex:draft, ex:book ...]
Example: Modelling the activities
[Diagram: ex:draft, ex:comment and ex:result connected to the activities ex:compose and ex:revise by wasGeneratedBy and used ...]
Example: Agents
[Diagram: ex:student and ex:prof, each with a foaf:name, and ex:prog ...]
Example: Attribution
[Diagram: wasAttributedTo, wasAssociatedWith and actedOnBehalfOf links involving ex:article, ex:student, ex:compose ...]
Recap
PROV distinguishes
Entities
Activities
Agents

Relations are
Derivation of Entities from Entities
Attribution of Ent...
Extending the basic elements of PROV

Agents and Entities
The type of Agent can be specified through subclasses
prov:Person
prov:Organization
prov:SoftwareAg...
Types of Entities
prov:Collection
Provides a structure to a group of Entities
prov:hadMember is used to describe membershi...
Types of Entities
prov:Bundle
A named set of provenance descriptions, that itself can
have provenance information associat...
Example: Bundle
[Diagram: the activity graph (ex:draft, ex:result, ex:dataset; ex:compose, ex:revise; wasGeneratedBy, used) ...]
Describing Entities
Entities can be described further by
prov:value
a literal value that represents an Entity

prov:Locati...
Derivation
The type of derivation can be specified through subproperties
prov:hadPrimarySource
Specific for first-hand rep...
Relation between Entities
Relation between Entities can be further described
prov:specializationOf
Used to link a more spe...
Broader Terms
A superproperty is introduced that relates any
influenced Entity, Activity, or Agent to any other
influencin...
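The relationship between specific and broader terms can be sketched as a small parent map: the derivation subproperties point to prov:wasDerivedFrom, which in turn points to prov:wasInfluencedBy. The code below is a minimal illustration, not a reasoner. The helper names `broader_terms` and `influencers` are invented for this sketch, only the derivation branch of the hierarchy is included, and the sample statements are one plausible reading of the example diagram rather than a transcription of it.

```python
# Child property -> parent property, limited to the derivation branch.
BROADER = {
    "prov:hadPrimarySource": "prov:wasDerivedFrom",
    "prov:wasQuotedFrom": "prov:wasDerivedFrom",
    "prov:wasRevisionOf": "prov:wasDerivedFrom",
    "prov:wasDerivedFrom": "prov:wasInfluencedBy",
}


def broader_terms(prop):
    """Return all broader properties of prop, nearest first."""
    chain = []
    while prop in BROADER:
        prop = BROADER[prop]
        chain.append(prop)
    return chain


def influencers(triples, node):
    """Return everything that influenced node, however specifically stated."""
    top = "prov:wasInfluencedBy"
    return {o for s, p, o in triples
            if s == node and (p == top or top in broader_terms(p))}


# Assumed sample statements (one reading of the example diagram).
triples = [
    ("ex:article", "prov:wasRevisionOf", "ex:draft"),
    ("ex:article", "prov:wasInfluencedBy", "ex:comment"),
    ("ex:article", "dcterms:title", "Results from INV13a"),
]

print(broader_terms("prov:wasRevisionOf"))
print(sorted(influencers(triples, "ex:article")))
```

Stating the specific property loses nothing: a consumer that only understands prov:wasInfluencedBy can still recover the general statement, which is why the more specific properties are preferred where they apply.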
Example
[Diagram: ex:article, ex:comment and ex:draft linked by wasInfluencedBy and wasRevisionOf; quoted text “The relevant question is why these data points do n...
Lifetime of an Entity
One can provide a starting and ending time of an
Entity's existence
prov:generatedAtTime
prov:invali...
Overview

Source: http://www.w3.org/TR/prov-o/
Qualifying relations in PROV

Qualifying relations
Problem: Binary relations cannot be further elaborated
But one would like to describe aspects of the ...
The PROV solution
"All problems in computer science can be solved by
another level of indirection"
(Attributed to David Wh...
Qualified Usage
[Diagram: ex:revise, ex:draft and ex:usage1 linked by used, qualifiedUsage, entity and atTime (2013-08-23 14:22:13 UST), plus ex:attribute ...]
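The indirection pattern is mechanical enough to sketch as a function: given a binary statement, emit the intermediate node, its class, and the link back to the influencing resource. The lookup table follows the "Qualified expressions" slides; the function name `qualify` and the tuple representation are assumptions of this sketch. The time value is passed through verbatim as it appears on the slide.

```python
# Unqualified property -> (qualification property, influence class, influencer property)
QUALIFIED = {
    "prov:wasGeneratedBy": ("prov:qualifiedGeneration", "prov:Generation", "prov:activity"),
    "prov:wasDerivedFrom": ("prov:qualifiedDerivation", "prov:Derivation", "prov:entity"),
    "prov:wasAttributedTo": ("prov:qualifiedAttribution", "prov:Attribution", "prov:agent"),
    "prov:used": ("prov:qualifiedUsage", "prov:Usage", "prov:entity"),
    "prov:wasInformedBy": ("prov:qualifiedCommunication", "prov:Communication", "prov:activity"),
    "prov:wasAssociatedWith": ("prov:qualifiedAssociation", "prov:Association", "prov:agent"),
    "prov:actedOnBehalfOf": ("prov:qualifiedDelegation", "prov:Delegation", "prov:agent"),
}


def qualify(subject, prop, obj, node, extra=()):
    """Expand one binary relation into its qualified form.

    node is the identifier of the intermediate resource; extra is an
    iterable of (property, value) pairs describing that resource.
    Hypothetical helper for illustration only.
    """
    qualification, influence_class, influencer = QUALIFIED[prop]
    triples = [
        (subject, prop, obj),            # keep the plain binary statement
        (subject, qualification, node),  # link to the intermediate node
        (node, "rdf:type", influence_class),
        (node, influencer, obj),         # point back at the influencing resource
    ]
    triples.extend((node, p, o) for p, o in extra)
    return triples


usage = qualify("ex:revise", "prov:used", "ex:draft", "ex:usage1",
                extra=[("prov:atTime", "2013-08-23 14:22:13 UST")])
for triple in usage:
    print(triple)
```

Keeping the unqualified statement alongside the qualified one means simple consumers can still read the binary relation.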
Qualified expressions
Unqualified Influence

Influencing Class

Entity

wasGeneratedBy

Activity

qualifiedGeneration

Ge...
Qualified expressions
Influenced
Class

Unqualified
Influence

Influencing Class
Entity
Activity
Agent

Qualification
Pro...
Qualified Derivation
[Diagram: ex:draft, ex:deriv1, ex:usage2 and ex:revise linked by wasDerivedFrom, entity, hadUsage, hadActivity, hadGeneration ...]
Roles
A role is the function of an entity or agent with respect
to an activity, in the context of a usage, generation,
inv...
Qualified Attribution
[Diagram: ex:article, ex:student and ex:attrib1 linked by wasAttributedTo, qualifiedAttribution, agent and hadRole “Primary au...
Summary
Basic and extended PROV relations are unqualified
To qualify a relation
An intermediate node is introduced
There i...
Mapping DC provenance information to PROV

Dublin Core
Remember: Many DC terms contain provenance
information
Who affected a resource
Creator, contributor, publisher...
Property ranges
Terms with dct:Agent as range
creator
contributor
publisher
rightsHolder

Property ranges
Terms with time as range
available
created
date
dateAccepted
dateCopyrighted
dateSubmitted
issued
modified...
Property ranges
Terms with another resource as range
accessRights
hasFormat
hasVersion
isFormatOf
isVersionOf
license

Direct mappings
Equivalences between PROV attributes and DC terms
Described using
rdfs:subClassOf
rdfs:subPropertyOf
ow...
Direct mappings: DC Terms
DC Term

Mapping

PROV Property

created

subPropertyOf

generatedAtTime

dateAccepted

subPrope...
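Because the direct mappings are plain rdfs:subPropertyOf statements, applying them amounts to copying each DC statement under its PROV superproperty. The sketch below does this with a lookup table taken from the "Direct mappings: DC Terms" slide and applies it to the statements from the Dublin Core example slide. The function name `infer_prov` and the tuple representation are assumptions of this sketch; a real setup would more likely load the mapping into an RDFS-aware store.

```python
# DC term -> PROV superproperties, as listed on the direct-mappings slide.
DC_TO_PROV = {
    "dct:created": ["prov:generatedAtTime"],
    "dct:dateAccepted": ["prov:generatedAtTime"],
    "dct:dateCopyrighted": ["prov:generatedAtTime"],
    "dct:dateSubmitted": ["prov:generatedAtTime"],
    "dct:issued": ["prov:generatedAtTime"],
    "dct:modified": ["prov:generatedAtTime"],
    "dct:creator": ["prov:wasAttributedTo"],
    "dct:contributor": ["prov:wasAttributedTo"],
    "dct:publisher": ["prov:wasAttributedTo"],
    "dct:rightsHolder": ["prov:wasAttributedTo"],
    "dct:source": ["prov:wasDerivedFrom"],
    "dct:hasFormat": ["prov:alternateOf"],
    "dct:isFormatOf": ["prov:alternateOf", "prov:wasDerivedFrom"],
}


def infer_prov(triples):
    """Return the PROV statements entailed by the direct mappings."""
    inferred = []
    for s, p, o in triples:
        for prov_property in DC_TO_PROV.get(p, []):
            inferred.append((s, prov_property, o))
    return inferred


# Statements from the Dublin Core example slide.
dc_statements = [
    ("ex:doc1", "dct:title", "A mapping from Dublin Core..."),
    ("ex:doc1", "dct:creator", "ex:kai"),
    ("ex:doc1", "dct:created", "2012-02-28"),
    ("ex:doc1", "dct:publisher", "ex:w3c"),
    ("ex:doc1", "dct:issued", "2012-02-29"),
    ("ex:doc1", "dct:subject", "ex:dublincore"),
    ("ex:doc1", "dct:replaces", "ex:doc2"),
    ("ex:doc1", "dct:format", "HTML"),
]

for triple in infer_prov(dc_statements):
    print(triple)
```

Note what is lost: creator and publisher both become prov:wasAttributedTo, and created and issued both become prov:generatedAtTime, so the PROV statements alone no longer say which agent or date played which part.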
Direct mappings: Generalizations
Properties generalizing PROV terms
PROV property

Mapping

DC Term

hadPrimarySource subP...
Direct mappings: classes
DC Term
dct:Agent
dct:BibliographicResource
dct:LicenseDocument
dct:LinguisticSystem
dct:Location...
Complex mappings
Defined to generate qualified PROV statements from
DC statements
Retain more information from the DC stat...
PROV refinements: subclasses
Extended Term

Relation to PROV

PROV extended Term

Publish

subClassOf

Activity

Contribut...
Complex mapping: Example

Source: http://www.w3.org/TR/2013/NOTE-prov-dc-20130430/
Complex mapping: Example
Is there no easier way?

The entity would be both
produced and used

Source: http://www.w3.org/TR...
CONSTRUCT {
?document a prov:Entity;
prov:wasAttributedTo ?agent.
?agent a prov:Agent.
_:usedEntity a prov:Entity;
prov:sp...
CONSTRUCT {
main Entity
?document a prov:Entity;
prov:wasAttributedTo ?agent.
?agent a prov:Agent.
direct mapping
_:usedEn...
Complex mappings: Cleanup
The mappings produce many blank nodes
Ideas to reduce the blank nodes:
1. Conflate properties re...
Summary
To convert existing provenance information in DC
terms, a mapping to PROV-O is provided with the
standard
It conta...
Thank you for listening.

Slides available online
http://www.slideshare.net/MagnusPfeffer/
This work is licensed under a C...
References
PROV-O: The PROV Ontology
(W3C Recommendation)
http://www.w3.org/TR/prov-o/

PROV Model Primer
(W3C Working Gro...
Metadata Provenance Tutorial Part 2: Interoperable Metadata Provenance

Tutorial held at the Semantic Web in Libraries conference in Hamburg, Germany, on November 25th, 2013. The tutorial was held together with Kai Eckert, who did Part 1.

Abstract:
When metadata is distributed, combined, and enriched as Linked Data, the tracking of its provenance becomes a hard issue. Using data encumbered with licenses that require attribution of authorship may eventually become impracticable as more and more data sets are aggregated - one of the main motivations for the call to open data under permissive licenses like CC0. Nonetheless, there are important scenarios where keeping track of provenance information becomes a necessity. A typical example is the enrichment of existing data with automatically obtained data, for instance as a result of automatic indexing. Ideally, the origins, conditions, rules and other means of production of every statement are known and can be used to put it into the right context.
Part 1 - Metadata Provenance in RDF: In RDF, the mere representation of provenance - i.e., statements about statements - is challenging. We explore the possibilities, from the unloved reification and other proposed alternative Linked Data practices through to named graphs and recent developments regarding the upcoming next version of RDF.
Part 2 - Interoperable Metadata Provenance: As with metadata itself, common vocabularies and data models are needed to express basic provenance information in an interoperable fashion. We investigate the PROV model that is currently developed by the W3C Provenance Working Group and compare it to Dublin Core as a representative of a flat, descriptive metadata schema.
We actively encourage participants to present their own use cases and open challenges at this workshop. Please contact the organizers for details.
Prior experience: The workshop is intended for participants who have mastered the basics of linked data and want to delve into expressing provenance. Besides a basic understanding of RDF, the linked data principles and the use of ontologies (like Dublin Core or Bibo) to express bibliographic metadata, no specialised knowledge is required.


  1. 1. SWIB 2013 Tutorial on Metadata Provenance Slides: http://bit.ly/swib13-provenance
  2. 2. Part 2: Modelling provenance information using the PROV ontology
  3. 3. Agenda. Modelling Provenance 1: A data model for provenance information; Introducing the PROV ontology; Extending the basic elements of PROV. Short break. Modelling Provenance 1: Qualifying relations in PROV; Mapping DC provenance information to PROV.
  4. 4. A data model for provenance information
  5. 5. Motivation. Now that we have a handle to our data, we want to describe its provenance in detail: by Magnus, on 2013-11-14 13:19, using his coffee maker, with Brazilian arabica beans, coarse grind, 8 g, brew time 3:00.
  6. 6. Options for expressing provenance: Using existing generic vocabularies; Extending/creating a domain-specific vocabulary; Using/creating a vocabulary specifically made for this purpose.
  7. 7. Using a generic vocabulary: Dublin Core. Dublin Core Metadata Initiative (DCMI). Element set: 15 basic terms; no defined ranges (--> arbitrary values possible). Terms: 55 granular terms (properties); well-defined ranges.
  8. 8. Example. Namespace: Element set --> dc: ; Terms --> dcterms: or dct:
      ex:doc1 dct:title "A mapping from Dublin Core..." .
      ex:doc1 dct:creator ex:kai .
      ex:doc1 dct:created "2012-02-28" .
      ex:doc1 dct:publisher ex:w3c .
      ex:doc1 dct:issued "2012-02-29" .
      ex:doc1 dct:subject ex:dublincore .
      ex:doc1 dct:replaces ex:doc2 .
      ex:doc1 dct:format "HTML" .
  9. 9. Distinction. Some terms contain only information about the resource itself, but not how or when it was produced → Descriptive Terms. Some terms also contain information on the creation or derivation of the resource → Provenance Terms.
  10. 10. Provenance in DC: Who? Terms: Contributor, Creator, Publisher, RightsHolder. Range is dct:Agent: a resource that acts or has the power to act. Clearly influencing creation of a resource. RightsHolder is ownership --> provenance in works of art.
  11. 11. Provenance in DC: When? Terms: Available, Created, Date, DateAccepted, DateCopyrighted, DateSubmitted, Issued, Modified, Valid.
  12. 12. Provenance in DC: When? Ranges. Date range: Available, Valid. Single date: all others. Dates are basic provenance information. Availability and validity are often inherent to the resource. But: provenance related, if active change.
  13. 13. Provenance in DC: How? Terms. Derivation and Replacement: IsVersionOf, hasVersion; IsFormatOf, hasFormat; References, isReferencedBy. Relations to other resources: Replaces, isReplacedBy; Source; HasPart, isPartOf. Processes involved in creation: accrualMethod.
  14. 14. dcterms:provenance. Definition: “statement of any changes in ownership and custody of the resource since its creation that are significant for its authenticity, integrity, and interpretation.” → "classic" provenance of works of art
  15. 15. Summary. More than half of the DC terms deal with provenance related information: Who? When? How? What? Missing information: Where? Why? (only for the specific reason of replacement)
  16. 16. Extending a domain-specific vocabulary. Domain-specific vocabularies often deal with aspects of provenance, e.g. the SWAN Ontology (Semantic Web Applications in Neuromedicine) has a module dealing with "Provenance, Authoring and Versioning (PAV)". → Aspects, granularity and terminology differ between domains. Cross-domain data exchange becomes very hard.
  17. 17. Example: PAV module of SWAN. Properties: importedBy - An entity responsible for importing the data from an external source; importedOn - The date of the import of the resource; importedFirstOn - The date of the first import of the resource; importedLastOn - The date of the last import of the resource; importedFromSource - The original source of the encoded information (PubMed, UniProt...); importedWithId - The unique identifier of the encoded information in the original source. See http://swan-ontology.googlecode.com/svn/tags/1.2/pav.owl (latest version from 2008)
  18. 18. Example: PAV module of SWAN. Properties: sourceAccessedOn - The date when the original source has been accessed to create the resource; sourceFirstAccessedOn - The date when the original source has been first accessed and verified; sourceLastAccessedOn - The date when the original source has been last accessed and verified. See http://swan-ontology.googlecode.com/svn/tags/1.2/pav.owl (latest version from 2008)
  19. 19. Extending a domain-specific vocabulary. Other ontologies have similar approaches. → Aspects, granularity and terminology differ between domains. → Cross-domain data exchange becomes very hard.
  20. 20. Vocabularies for modelling provenance: Provenir (published in 2009); Open Provenance Model (OPM) (published in 2010).
  21. 21. W3C Provenance Incubator Group (PROV-XG). From 2009-2010. Chaired by Yolanda Gil. “Provenance XG Final Report”: http://www.w3.org/2005/Incubator/prov/XGR-prov/ Overview of the existing approaches and vocabularies. Proposes a dedicated W3C Working Group. Recommendation of an initial set of terms as a basis for further discussion.
  22. 22. W3C Provenance Incubator Group (PROV-XG). Discussion of requirements for provenance on the web: http://www.w3.org/2005/Incubator/prov/wiki/User_Requirements Mapping of provenance terms from existing vocabularies to OPM: http://www.w3.org/2005/Incubator/prov/wiki/Provenance_Vocabulary_Mappings Common use case scenarios for provenance.
  23. 23. W3C Provenance Working Group. Active from 04/2011 to 07/2013. Co-chaired by Paul Groth and Luc Moreau. Goal: The mission of the Provenance Working Group [..] is to support the widespread publication and use of provenance information of Web documents, data, and resources. The Working Group will publish W3C Recommendations that define a language for exchanging provenance information among applications. Main focus on linked data and the semantic web.
  24. 24. W3C Provenance Working Group. Implementation of the PROV-XG recommendations: "A provenance framework should support: the core concepts of identifying an object, attributing the object to person or entity, and representing processing steps; accessing provenance-related information expressed in other standards; accessing provenance; the provenance of provenance; reproducibility; versioning; representing procedures; and representing derivation."
  25. 25. Introducing the PROV ontology
  26. 26. PROV Ontology (PROV-O) http://www.w3.org/TR/prov-overview/
  27. 27. Entities. PROV-O allows recording the provenance of entities. Entities are all kinds of things. Physical: books, articles, reports, ... Digital: pictures, text files, pdf documents, videos, ... Conceptual/other: abstract concepts, ideas, theories, ... Provenance information can also include references to other entities.
  28. 28. Activities. Model the dynamic aspects of the world. Occur over a period of time and act upon or with entities. Include consuming, processing, transforming, modifying, relocating, using, or generating entities. Examples: Writing a report; Translating a book; Moving an online document to a new URL; Generating web access statistics.
  29. 29. Agents. Bear responsibility for an activity taking place, for the existence of an entity, for another agent's activity. Examples: Persons and organizations; Inanimate objects; Computer programs. Caveat: One cannot describe the provenance of Agents. To do so they have to be both Agents and Entities.
  30. 30. PROV-O Basic Elements. Source: http://www.w3.org/TR/prov-primer/
  31. 31. Starting properties: prov:wasAttributedTo, prov:wasGeneratedBy, prov:used, prov:startedAtTime, prov:endedAtTime, prov:wasInformedBy, prov:wasAssociatedWith, prov:wasDerivedFrom, prov:actedOnBehalfOf.
  32. 32. Example: Provenance of a conference paper. The paper was written by a student. The final version of the paper is based on an earlier draft. A professor made some comments on the draft. The student cites prior work from a book. The paper includes a table that was generated by a program. The program used a dataset to generate the table.
  33. 33. Example: Entities. [Diagram: ex:draft, ex:article, ex:dataset, ex:book, ex:result and ex:comment; dcterms:title values “Latest results” and “Results from INV13a”]
      ex:draft a prov:Entity ; a fabio:Manuscript ; dcterms:title "Latest results" .
      ex:article a prov:Entity ; a fabio:ConferencePaper ; dcterms:title "Results from INV13a" .
      ex:dataset a prov:Entity ; a fabio:Dataset .
      ex:book a prov:Entity ; a fabio:Thesis .
      ex:result a prov:Entity ; a fabio:Table .
      ex:comment a prov:Entity ; a fabio:Review .
  34. 34. Example: Relation between Entities. [Diagram: wasDerivedFrom links among ex:article, ex:comment, ex:draft and ex:book]
  35. 35. Example: Modelling the activities. [Diagram: ex:draft, ex:comment, ex:result, ex:dataset and ex:article connected to the activities ex:compose, ex:revise and ex:analyze by wasGeneratedBy and used; a startedAtTime of 2013-10-28 12:34:05 UST is shown]
      ex:compose a prov:Activity .
      ex:revise a prov:Activity .
      ex:analyze a prov:Activity .
  36. 36. Example: Agents. [Diagram: ex:student and ex:prof, each with a foaf:name, and ex:prog]
      ex:student a prov:Agent ; a foaf:Person ; foaf:name "Will Meyer" .
      ex:prof a prov:Agent ; a foaf:Person ; foaf:name "Joe Smith" .
      ex:prog a prov:Entity ; a fabio:Script .
  37. 37. Example: Attribution. [Diagram: wasAttributedTo, wasAssociatedWith and actedOnBehalfOf links connecting ex:article, ex:comment and ex:dataset, the activities ex:compose, ex:revise and ex:analyze, and ex:student, ex:prof and ex:prog]
  38. 38. Recap. PROV distinguishes Entities, Activities, Agents. Relations are: Derivation of Entities from Entities; Attribution of Entities to Agents; Generation/Modification/Use of Entities by Activities; Association of Agents to Activities.
  39. 39. Extending the basic elements of PROV
  40. 40. Agents and Entities. The type of Agent can be specified through subclasses: prov:Person, prov:Organization, prov:SoftwareAgent. Same for type of Entity: prov:Collection, prov:Bundle, prov:Plan.
  41. 41. Types of Entities. prov:Collection: Provides a structure to a group of Entities. prov:hadMember is used to describe membership. Can be used to express the provenance of the collection itself.
  42. 42. Types of Entities. prov:Bundle: A named set of provenance descriptions that itself can have provenance information associated with it. No further subclasses provided; better left to other standards. prov:Plan: A set of actions done by (an) agent(s) to achieve a goal.
  43. 43. Example: Bundle. [Diagram: the activity graph from the earlier example (ex:draft, ex:comment, ex:result, ex:dataset, ex:article; ex:compose, ex:revise, ex:analyze; wasGeneratedBy, used, startedAtTime 2013-10-28 12:34:05 UST) shown as a prov:Bundle with wasAttributedTo ex:Magnus]
  44. 44. Describing Entities. Entities can be described further by prov:value (a literal value that represents an Entity) and prov:Location (a geographic place, or a non-geographic place such as a filesystem directory, URL, row in a table, ...).
  45. 45. Derivation. The type of derivation can be specified through subproperties. prov:hadPrimarySource: specific for first-hand reports, original works, etc. prov:wasQuotedFrom: specific for the extraction of a small part of the Entity. prov:wasRevisionOf.
  46. 46. Relation between Entities. Relation between Entities can be further described. prov:specializationOf: used to link a more specific Entity to a more general one. prov:alternateOf: used to link Entities that present aspects of the same thing, but not necessarily the same aspects or at the same time.
  47. 47. Broader Terms. A superproperty is introduced that relates any influenced Entity, Activity, or Agent to any other influencing Entity, Activity, or Agent that had an effect on its characteristics: prov:wasInfluencedBy. But: The more specific properties should be used where possible.
  48. 48. Example. [Diagram: ex:article, ex:comment, ex:draft, ex:book and ex:quotation linked by wasInfluencedBy, wasRevisionOf, wasDerivedFrom and wasQuotedFrom; a value property carries the text “The relevant question is why these data points do not fit with the basic model.”]
  49. 49. Lifetime of an Entity. One can provide a starting and ending time of an Entity's existence: prov:generatedAtTime, prov:invalidatedAtTime. The involved Activities can be linked by prov:wasGeneratedBy / prov:generated and prov:wasInvalidatedBy / prov:invalidated.
  50. 50. Overview. Source: http://www.w3.org/TR/prov-o/
  51. 51. Qualifying relations in PROV
  52. 52. Qualifying relations. Problem: Binary relations cannot be further elaborated. But one would like to describe aspects of the relation (ex:article wasInfluencedBy ex:comment), e.g. the why, when, how, where of the influence between comment and article.
  53. 53. The PROV solution. "All problems in computer science can be solved by another level of indirection" (Attributed to David Wheeler, who apparently added: "But that usually will create another problem.") Instead of using a binary relation, an intermediate class that represents the influence between two resources is used. This class can then be described by further attributes.
  54. 54. Qualified Usage. [Diagram: ex:revise used ex:draft; ex:revise qualifiedUsage ex:usage1; ex:usage1 has entity ex:draft, atTime 2013-08-23 14:22:13 UST and ex:attribute value]
      ex:usage1 a prov:Usage ;
          prov:entity ex:draft ;
          prov:atTime "2013-08-23 14:22:13 UST" ;
          ex:attribute value .
  55. 55. Qualified expressions
      Influenced Class | Unqualified Influence | Influencing Class | Qualification Property | Qualified Influence | Influencer Property
      Entity | wasGeneratedBy | Activity | qualifiedGeneration | Generation | activity
      Entity | wasDerivedFrom | Entity | qualifiedDerivation | Derivation | entity
      Entity | wasAttributedTo | Agent | qualifiedAttribution | Attribution | agent
      Activity | used | Entity | qualifiedUsage | Usage | entity
      Activity | wasInformedBy | Activity | qualifiedCommunication | Communication | activity
      Activity | wasAssociatedWith | Agent | qualifiedAssociation | Association | agent
      Agent | actedOnBehalfOf | Agent | qualifiedDelegation | Delegation | agent
  56. 56. Qualified expressions
      Influenced Class | Unqualified Influence | Influencing Class | Qualification Property | Qualified Influence | Influencer Property
      Entity, Activity, Agent | wasInfluencedBy | Entity, Activity, Agent | qualifiedInfluence | Influence | influencer
      Entity | hadPrimarySource | Entity | qualifiedPrimarySource | PrimarySource | entity
      Entity | wasQuotedFrom | Entity | qualifiedQuotation | Quotation | entity
      Entity | wasRevisionOf | Entity | qualifiedRevision | Revision | entity
      Entity | wasInvalidatedBy | Activity | qualifiedInvalidation | Invalidation | activity
      Activity | wasStartedBy | Entity | qualifiedStart | Start | entity
      Activity | wasEndedBy | Entity | qualifiedEnd | End | entity
  57. 57. Qualified Derivation. [Diagram: ex:article wasDerivedFrom ex:draft; ex:article qualifiedDerivation ex:deriv1; ex:deriv1 has entity ex:draft, hadActivity ex:revise, hadGeneration ex:gen1 and hadUsage ex:usage2]
      ex:deriv1 a prov:Derivation ;
          prov:entity ex:draft ;
          prov:hadActivity ex:revise ;
          prov:hadGeneration ex:gen1 ;
          prov:hadUsage ex:usage2 .
  58. 58. Roles. A role is the function of an entity or agent with respect to an activity, in the context of a usage, generation, invalidation, association, start, and end. Class is prov:Role. Attribute is prov:hadRole.
  59. 59. Qualified Attribution. [Diagram: ex:article wasAttributedTo ex:student; ex:article qualifiedAttribution ex:attrib1; ex:attrib1 has agent ex:student and hadRole “Primary author”]
  60. 60. Summary. Basic and extended PROV relations are unqualified. To qualify a relation: An intermediate node is introduced. There is a corresponding class for all relations. The intermediate node can be described further. Special attributes exist to connect roles and activities.
  61. 61. Mapping DC provenance information to PROV SWIB 2013 Tutorial on Metadata Provenance 25.11.2013 61
  62. 62. Dublin Core Remember: Many DC terms contain provenance information Who affected a resource Creator, contributor, publisher, etc.. How the resource was affected Access rights, license, hasFormat, etc. When the resource was affected Created, issued, dateSubmitted, etc. SWIB 2013 Tutorial on Metadata Provenance 25.11.2013 62
  63. 63. Property ranges Terms with dct:Agent as range creator contributor publisher rightsHolder SWIB 2013 Tutorial on Metadata Provenance 25.11.2013 63
  64. 64. Property ranges Terms with time as range available created date dateAccepted dateCopyrighted dateSubmitted issued modified valid SWIB 2013 Tutorial on Metadata Provenance 25.11.2013 64
  65. Property ranges
      Terms with another resource as range
          accessRights, hasFormat, hasVersion, isFormatOf, isVersionOf, license, isReferencedBy, isReplacedBy, references, replaces, rights, source
  66. Direct mappings
      Equivalences between PROV attributes and DC terms
      Described using
          rdfs:subClassOf
          rdfs:subPropertyOf
          owl:equivalentClass
  67. Direct mappings: DC Terms
      DC Term          Mapping        PROV Property                DC range
      created          subPropertyOf  generatedAtTime              time
      dateAccepted     subPropertyOf  generatedAtTime              time
      dateCopyrighted  subPropertyOf  generatedAtTime              time
      dateSubmitted    subPropertyOf  generatedAtTime              time
      issued           subPropertyOf  generatedAtTime              time
      modified         subPropertyOf  generatedAtTime              time
      creator          subPropertyOf  wasAttributedTo              Agent
      contributor      subPropertyOf  wasAttributedTo              Agent
      publisher        subPropertyOf  wasAttributedTo              Agent
      rightsHolder     subPropertyOf  wasAttributedTo              Agent
      source           subPropertyOf  wasDerivedFrom
      hasFormat        subPropertyOf  alternateOf
      isFormatOf       subPropertyOf  alternateOf, wasDerivedFrom
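A minimal sketch of what the direct mappings buy you, assuming triples are modelled as plain Python tuples: every DC statement entails the corresponding PROV statement through rdfs:subPropertyOf. The function name `infer_prov` and the example resources (`ex:doc1`, `ex:magnus`) are hypothetical; the mapping table is copied from the slide above.

```python
# Direct DC -> PROV property mappings (rdfs:subPropertyOf) from the slide.
SUB_PROPERTY_OF = {
    "dct:created": ["prov:generatedAtTime"],
    "dct:dateAccepted": ["prov:generatedAtTime"],
    "dct:dateCopyrighted": ["prov:generatedAtTime"],
    "dct:dateSubmitted": ["prov:generatedAtTime"],
    "dct:issued": ["prov:generatedAtTime"],
    "dct:modified": ["prov:generatedAtTime"],
    "dct:creator": ["prov:wasAttributedTo"],
    "dct:contributor": ["prov:wasAttributedTo"],
    "dct:publisher": ["prov:wasAttributedTo"],
    "dct:rightsHolder": ["prov:wasAttributedTo"],
    "dct:source": ["prov:wasDerivedFrom"],
    "dct:hasFormat": ["prov:alternateOf"],
    "dct:isFormatOf": ["prov:alternateOf", "prov:wasDerivedFrom"],
}


def infer_prov(triples):
    """Return the PROV triples entailed by rdfs:subPropertyOf.

    Triples whose property has no direct mapping (e.g. dct:title)
    are descriptive metadata and produce nothing.
    """
    inferred = []
    for s, p, o in triples:
        for super_prop in SUB_PROPERTY_OF.get(p, []):
            inferred.append((s, super_prop, o))
    return inferred


dc_triples = [
    ("ex:doc1", "dct:creator", "ex:magnus"),
    ("ex:doc1", "dct:issued", "2013-11-25"),
    ("ex:doc1", "dct:title", "A mapping from Dublin Core..."),
]
for triple in infer_prov(dc_triples):
    print(triple)
```

Note how much is lost: creator, contributor and publisher all collapse into wasAttributedTo, and all dates into generatedAtTime. That loss is what the complex mappings later in this part try to avoid.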
  68. Direct mappings: Generalizations
      Properties generalizing PROV terms
          PROV property     Mapping        DC Term
          hadPrimarySource  subPropertyOf  source
          wasRevisionOf     subPropertyOf  isVersionOf
      Classes generalizing PROV terms
          PROV class        Mapping        DC Term
          Location          subClassOf     LocationPeriodOrJurisdiction
  69. Direct mappings: classes
      DC Term                    Relation             PROV Term
      dct:Agent                  owl:equivalentClass  prov:Agent
      dct:BibliographicResource  rdfs:subClassOf      prov:Entity
      dct:LicenseDocument        rdfs:subClassOf      prov:Entity
      dct:LinguisticSystem       rdfs:subClassOf      prov:Plan
      dct:Location               owl:equivalentClass  prov:Location
      dct:MethodOfAccrual        rdfs:subClassOf      prov:Plan
      dct:MethodOfInstruction    rdfs:subClassOf      prov:Plan
      dct:RightsStatement        rdfs:subClassOf      prov:Entity
      dct:PhysicalResource       rdfs:subClassOf      prov:Entity
      dct:Policy                 rdfs:subClassOf      prov:Plan
      dct:ProvenanceStatement    rdfs:subClassOf      prov:Bundle
  70. Complex mappings
      Defined to generate qualified PROV statements from DC statements
      Retain more information from the DC statements
      Can be adapted to include domain-specific elements
      Provided in the form of SPARQL CONSTRUCT queries
      But: Need subclasses extending the base PROV classes to express the type of activity or role
  71. PROV refinements: subclasses
      Extended Term     Relation to PROV  PROV Term
      Publish           subClassOf        Activity
      Contribute        subClassOf        Activity
      Create            subClassOf        Activity
      RightsAssignment  subClassOf        Activity
      Modify            subClassOf        Activity
      Accept            subClassOf        Activity
      Copyright         subClassOf        Activity
      Submit            subClassOf        Activity
      Replace           subClassOf        Activity
      Publisher         subClassOf        Role
      Contributor       subClassOf        Role
      Creator           subClassOf        Role
      RightsHolder      subClassOf        Role
  72. Complex mapping: Example
      (Figure not reproduced.)
      Source: http://www.w3.org/TR/2013/NOTE-prov-dc-20130430/
  73. Complex mapping: Example
      (Figure not reproduced.)
      Is there no easier way?
          The entity would be both produced and used
      Source: http://www.w3.org/TR/2013/NOTE-prov-dc-20130430/
  74. PREFIX prov: <http://www.w3.org/ns/prov#>
      PREFIX dct:  <http://purl.org/dc/terms/>

      CONSTRUCT {
          ?document a prov:Entity ;
              prov:wasAttributedTo ?agent .
          ?agent a prov:Agent .
          _:usedEntity a prov:Entity ;
              prov:specializationOf ?document .
          _:activity a prov:Activity, prov:Publish ;
              prov:used _:usedEntity ;
              prov:wasAssociatedWith ?agent ;
              prov:qualifiedAssociation [
                  a prov:Association ;
                  prov:agent ?agent ;
                  prov:hadRole [ a prov:Publisher ]
              ] .
          _:resultingEntity a prov:Entity ;
              prov:specializationOf ?document ;
              prov:wasDerivedFrom _:usedEntity ;
              prov:wasGeneratedBy _:activity ;
              prov:wasAttributedTo ?agent .
      } WHERE {
          ?document dct:publisher ?agent .
      }
  75. Query from slide 74, annotated: translate the following type of statement, given in the WHERE clause: ?document dct:publisher ?agent.
  76. Query from slide 74, annotated: ?document is the main Entity; prov:wasAttributedTo ?agent (with ?agent a prov:Agent) is the direct mapping.
  77. Query from slide 74, annotated: _:usedEntity and _:resultingEntity are specializations of the main Entity.
  78. Query from slide 74, annotated: PROV refinement (prov:hadRole [a prov:Publisher]).
  79. Query from slide 74, annotated: the prov:qualifiedAssociation block is a qualified association to bind activity to role.
  80. Complex mappings: Cleanup
      The mappings produce many blank nodes
      Ideas to reduce the blank nodes:
      1. Conflate properties referring to the same state of the resource
         e.g. the terms publisher and issued
      2. Sort all the activities according to their logical order and share intermediate blank nodes
         e.g. publication after creation
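The second cleanup idea can be sketched as follows. This is an illustration under simplifying assumptions, not the procedure from the PROV-DC note: triples are plain tuples, the function names and the agents `ex:author` and `ex:press` are hypothetical, the activities are assumed to arrive already sorted, and the first activity is assumed to use no prior state.

```python
def chain_activities(document, steps):
    """Build PROV triples for activities given in their logical order.

    `steps` is a list of (activity_type, agent) pairs. The state generated
    by one activity is reused as the state used by the next one, instead of
    minting a separate "used" blank node per activity.
    """
    triples = []
    previous = None  # blank node for the state produced by the preceding activity
    for i, (activity_type, agent) in enumerate(steps, start=1):
        activity = f"_:activity{i}"
        result = f"_:state{i}"
        triples.append((activity, "rdf:type", activity_type))
        triples.append((activity, "prov:wasAssociatedWith", agent))
        if previous is not None:
            # Shared intermediate node: the previous result is the input here.
            triples.append((activity, "prov:used", previous))
            triples.append((result, "prov:wasDerivedFrom", previous))
        triples.append((result, "prov:specializationOf", document))
        triples.append((result, "prov:wasGeneratedBy", activity))
        previous = result
    return triples


def blank_nodes(triples):
    """Collect every blank-node label appearing as subject or object."""
    return {term for s, _, o in triples for term in (s, o) if term.startswith("_:")}


# Publication after creation, as in the slide's example.
triples = chain_activities("ex:doc1", [("prov:Create", "ex:author"),
                                       ("prov:Publish", "ex:press")])
print(sorted(blank_nodes(triples)))
```

With the states shared, the two activities need four blank nodes; mapping them independently with the pattern from slide 74 would add a separate used entity for the publication.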
  81. Summary
      To convert existing provenance information in DC terms, a mapping to PROV-O is provided with the standard
      It contains
          Direct mappings for terms and classes
          PROV-O extensions for types of activities and roles
          Complex mappings to create full PROV-O provenance information
  82. Thank you for listening.
      Slides available online: http://www.slideshare.net/MagnusPfeffer/
      This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.
  83. References
      PROV-O: The PROV Ontology (W3C Recommendation)
          http://www.w3.org/TR/prov-o/
      PROV Model Primer (W3C Working Group Note)
          http://www.w3.org/TR/prov-primer/
      This presentation is based on an earlier tutorial held at the SWIB2012 conference together with Kai Eckert.
