A Sightseeing Tour of
PROV and Some of its
Extensions
Khalid Belhajjame
LAMSADE, Université Paris-Dauphine
16/03/16 MADICS...

A Sightseeing Tour of Prov and Some of its Extensions


A short PROV tutorial that I gave at the MADICS ReProVirtuFlow WG face-to-face meeting in Orsay.

Published in: Education


  1. A Sightseeing Tour of PROV and Some of its Extensions
     Khalid Belhajjame
     LAMSADE, Université Paris-Dauphine
     16/03/16 MADICS: ReProVirtuFlow
  2. Why do we care about provenance?
     • Help explain results and outliers
     • Assess trust and quality
     • Promote systems transparency: users are able to determine whether a particular use of information is appropriate under a set of rules
     • Assist in debugging
     • Promote reuse and reproducibility
  3. A bit of History
     Provenance is not a new topic. There has been a lot of provenance work in:
     • Databases, Workflows, Information retrieval, …
     By 2009, a number of models and vocabularies for expressing provenance information already existed:
     • Open Provenance Model (OPM)
     • Proof Markup Language (PML)
     • Provenance Vocabulary
     • PREservation Metadata: Implementation Strategies (PREMIS)
     • Semantic Web Applications in Neuromedicine (SWAN) Ontology
     • Dublin Core, …
  4. A bit of History
     • 2009-2010: W3C Provenance Incubator Group
       Objective: provide a state of the art and possible recommendations for standardization efforts
     • 2011: W3C Provenance Working Group
       Objective: define a standard vocabulary, primarily for the Semantic Web
     • 2013: The W3C Provenance Working Group published a number of PROV recommendations and notes: PROV-DM, PROV-O, …
     • Since then, a number of models and vocabularies have extended PROV and/or defined mapping rules to it
  5. Family of PROV documents (figure)
  6. Family of PROV documents (figure)
  7. Provenance
     The W3C Provenance Working Group defined provenance as:
     "Provenance is defined as a record that describes the people, institutions, entities, and activities involved in producing, influencing, or delivering a piece of data or a thing."
  8. PROV…
     is not a recommendation for representing and collecting provenance information that should be adopted internally by all systems.
     • That is not realistic, and won’t happen any time soon.
     Instead, the aim is to facilitate and promote interoperability between domains and applications that adopt their own specific representations of provenance.
     • More pragmatic, and thus likely to happen.
  9. Example (figure)
  10. PROV Core Structures (figure)
  11. Entity
      • An entity is a physical, digital, conceptual, or other kind of thing with some fixed aspects; entities may be real or imaginary.
      • Example: An entity may be the document at IRI http://www.bbc.co.uk/news/science-environment-17526723, a file in a file system, a car, or an idea.
  12. Activity
      • An activity is something that occurs over a period of time and acts upon or with entities; it may include consuming, processing, transforming, modifying, relocating, using, or generating entities.
      • Example: An activity may be the publishing of a document on the Web, sending a Twitter message, extracting metadata embedded in a file, driving a car from Paris to Lyon, etc.
  13. Agent
      • An agent is something that bears some form of responsibility for an activity taking place, for the existence of an entity, or for another agent's activity.
      • Example: A site selling books on the Web and the companies hosting them can be seen as agents.
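
The three element types defined above (entity, activity, agent) can be sketched as a small Python model that prints PROV-N style statements. This is only an illustrative sketch: the `Element` class, the `ex:` prefix, and the identifiers are assumptions made up for the example, not part of the slides or of any PROV library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    """One PROV element: an entity, an activity, or an agent."""
    kind: str        # "entity", "activity" or "agent"
    identifier: str  # qualified name; the "ex:" prefix is a made-up namespace

    def to_provn(self) -> str:
        # PROV-N writes a bare element as kind(identifier)
        return f"{self.kind}({self.identifier})"


# Hypothetical identifiers, loosely based on the slide examples
article = Element("entity", "ex:newsArticle")
publishing = Element("activity", "ex:publishArticle")
bookshop = Element("agent", "ex:bookSite")

statements = [element.to_provn() for element in (article, publishing, bookshop)]
print("\n".join(statements))
```

Keeping the three types in one class with a `kind` field is a simplification for brevity; a fuller model would likely give each type its own attributes (for instance start and end times on activities).
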
  14. Usage and Generation
      • Usage is the beginning of utilizing an entity by an activity. Before usage, the activity had not begun to utilize this entity and could not have been affected by the entity.
        Example: a program beginning to read an input file
      • Generation is the completion of production of a new entity by an activity. This entity did not exist before generation and becomes available for usage after this generation.
        Example: the completed creation of a file by a program
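
Since an entity only becomes available for usage after its generation, a provenance record can be checked for usages that are timestamped too early. The sketch below assumes timestamped events and made-up identifiers (`ex:outputFile`, `ex:program1`, …); the `Event` type and the helper names are hypothetical, chosen only for this illustration.

```python
from datetime import datetime
from typing import List, NamedTuple


class Event(NamedTuple):
    relation: str    # "wasGeneratedBy" or "used"
    entity: str
    activity: str
    time: datetime


def to_provn(event: Event) -> str:
    # PROV-N argument order differs between the two relations:
    # wasGeneratedBy(entity, activity, time) but used(activity, entity, time)
    stamp = event.time.isoformat()
    if event.relation == "wasGeneratedBy":
        return f"wasGeneratedBy({event.entity}, {event.activity}, {stamp})"
    return f"used({event.activity}, {event.entity}, {stamp})"


def used_too_early(events: List[Event]) -> List[Event]:
    """Return usages whose timestamp precedes the generation of their entity."""
    generated_at = {e.entity: e.time for e in events if e.relation == "wasGeneratedBy"}
    return [
        e for e in events
        if e.relation == "used"
        and e.entity in generated_at
        and e.time < generated_at[e.entity]
    ]


# Hypothetical record: one file written by program1, then read twice
events = [
    Event("wasGeneratedBy", "ex:outputFile", "ex:program1", datetime(2016, 3, 16, 10, 5)),
    Event("used", "ex:outputFile", "ex:program2", datetime(2016, 3, 16, 10, 10)),
    Event("used", "ex:outputFile", "ex:program3", datetime(2016, 3, 16, 10, 0)),
]
suspicious = used_too_early(events)
print([to_provn(e) for e in suspicious])
```

Here only the read by `ex:program3` is flagged, because it is stamped before the file was generated.
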
  15. Derivation
      • Derivation is a transformation of an entity into another, an update of an entity resulting in a new one, or the construction of a new entity based on a pre-existing entity.
      • Example: The transformation of a relational table into a linked data set
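
Derivation statements form a graph, so the upstream lineage of an entity can be collected by walking `wasDerivedFrom` edges backwards. The sketch below is a plain graph traversal over made-up identifiers (`ex:csvExport` and `ex:report` are assumptions added to the slide's table-to-linked-data example); it is a convenience for tracing lineage, not an inference rule defined by PROV.

```python
from typing import Dict, List, Set, Tuple

# Each pair (derived, source) stands for wasDerivedFrom(derived, source).
Derivation = Tuple[str, str]


def lineage(entity: str, derivations: List[Derivation]) -> Set[str]:
    """Collect every entity reachable by following derivations backwards."""
    sources: Dict[str, List[str]] = {}
    for derived, source in derivations:
        sources.setdefault(derived, []).append(source)

    seen: Set[str] = set()
    stack = [entity]
    while stack:
        current = stack.pop()
        for source in sources.get(current, []):
            if source not in seen:  # also guards against cycles in bad data
                seen.add(source)
                stack.append(source)
    return seen


# Hypothetical derivation chain
derivations = [
    ("ex:linkedDataSet", "ex:relationalTable"),
    ("ex:relationalTable", "ex:csvExport"),
    ("ex:report", "ex:linkedDataSet"),
]
ancestors = lineage("ex:linkedDataSet", derivations)
print(sorted(ancestors))
```
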
  16. Association and Attribution
      • An activity association is an assignment of responsibility to an agent for an activity, indicating that the agent had a role in the activity.
        Example: the workflow system is responsible for the enactment of a workflow execution
      • Attribution is the ascribing of an entity to an agent.
        Example: A blog post can be attributed to an author, a mobile phone to its manufacturer.
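
Association links agents to activities, while attribution links agents to entities. A minimal sketch of how the two relations could be indexed per agent follows; the tuple layout, the `responsibilities` helper, and all `ex:` identifiers are assumptions made for this illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (relation, subject, agent) stands for relation(subject, agent) in PROV-N,
# e.g. wasAssociatedWith(activity, agent) or wasAttributedTo(entity, agent).
Statement = Tuple[str, str, str]


def responsibilities(statements: List[Statement]) -> Dict[str, Dict[str, List[str]]]:
    """Group, per agent, the activities it is associated with and the entities attributed to it."""
    index = defaultdict(lambda: {"activities": [], "entities": []})
    for relation, subject, agent in statements:
        if relation == "wasAssociatedWith":
            index[agent]["activities"].append(subject)
        elif relation == "wasAttributedTo":
            index[agent]["entities"].append(subject)
    return dict(index)


# Hypothetical statements mirroring the slide examples
statements = [
    ("wasAssociatedWith", "ex:workflowRun", "ex:workflowSystem"),
    ("wasAttributedTo", "ex:blogPost", "ex:author"),
    ("wasAttributedTo", "ex:mobilePhone", "ex:manufacturer"),
]
by_agent = responsibilities(statements)
print(by_agent["ex:workflowSystem"])
```
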
  17. PROV Core Structures (figure)
  18. W3C PROV Implementations: Preliminary Analysis (figure)
      Source: https://khalidbelhajjame.wordpress.com/2013/04/04/w3c-prov-implementations/
  19. PROV Compliant Vocabularies
      This is by no means complete …
      (diagram relating PROV, ProvONE, wfprov, wfdesc, DC, and PAV through "extends" and "mapsTo" edges)
  20. ProvONE: A PROV Extension Data Model for Scientific Workflow Provenance
      (figure with two labeled parts: Prospective provenance, Retrospective provenance)
  21. PAV ontology: provenance, authoring and versioning (figure)
  22. PAV ontology: provenance, authoring and versioning (figure)
  23. Acknowledgements
      • W3C Provenance Working Group
      • DataONE Workflow and Provenance Interest Group
      • PAV’s friends: Paolo Ciccarese, Stian Soiland-Reyes, Alasdair JG Gray, Carole Goble and Tim Clark
