Brief Introduction to Provenance
“As data becomes plentiful, verifiable truth becomes scarce”
http://go-to-hellman.blogspot.com/2010/02/named-graphs-argleton-and-truth-economy.html
For JISC KeepIt course on Digital Preservation Tools for Repository Managers
Module 3, Primer on preservation workflow, formats and characterisation
Westminster-Kingsway College, London, 2 March 2010
Provenance: example
The following excerpt and slides are taken with permission from Moreau, L.,
The Open Provenance Model: Towards Inter-operability of Provenance
Systems, http://users.ecs.soton.ac.uk/lavm/talks/iam09.pdf
Example: the provenance of a bottle of wine includes:
• Grapes from which it is made
• Where those grapes grew
• Process in the wine’s preparation
• How the wine was stored
• Between which parties the wine was transported,
e.g. producer to distributor to retailer
• Where it was auctioned
Provenance Definition
• Oxford English Dictionary:
– the fact of coming from some particular source or quarter;
origin, derivation
– the history or pedigree of a work of art, manuscript, rare
book, etc.;
– concretely, a record of the passage of an item through its
various owners.
• The provenance of a piece of data is the
process that led to that piece of data
The Science Lifecycle
Figure (adapted from David De Roure’s slides): the science lifecycle. Elements shown: scientists; experimentation; data, metadata, provenance, scripts, workflows, services, ontologies, blogs, ...; certified experimental results and analyses; preprints and metadata; peer-reviewed journal and conference papers; technical reports; reprints; local web; repositories; digital libraries; virtual learning environment; undergraduate students; graduate students; next generation researchers.
Finding the Provenance
of research outputs
across all the systems
data transited through
Open Provenance Model (OPM)
• Allows us to express all the causes of an item
• Allows for process-oriented and dataflow-oriented
views
• Based on the notion of an annotated causality
graph
Moreau, L., et al. OPM v1.00 (Dec 2007), OPM v1.01
(Jul 2008), OPM v1.1 (Dec 2009)
OPM Requirements
• To allow provenance information to be
exchanged between systems, by means of a
compatibility layer based on a shared provenance
model.
• To allow developers to build and share tools that
operate on such a provenance model.
• To define the model in a precise, technology-
agnostic manner.
• To define bindings to XML/RDF separately
• To support a digital representation of provenance
for any “thing”, whether produced by computer
systems or not
OPM Serialisation
• OPM is an abstract data model to represent past
executions and what caused data and processes to occur
• OPM can be serialised in different formats, referred to
as “technology bindings” or serialisations
• OPM XML schema
(http://openprovenance.org/model/v1.01.a)
• OPM RDF schema
• OPM OWL ontology
• Effort underway to ensure full equivalence of
representations
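The separation between the abstract model and its bindings can be sketched in a few lines. This is only an illustration under stated assumptions: the element names, attribute names, and the `http://example.org/opm-sketch#` namespace are placeholders invented for the example, and they are not the OPM XML schema or the OPM RDF schema. The point is that the same edge list feeds both serialisers.

```python
import xml.etree.ElementTree as ET

# Abstract model: (effect, label, cause) edges, independent of any format.
edges = [("A2", "wasGeneratedBy", "P1"), ("P1", "used", "A1")]


def to_xml(edges):
    """Serialise edges as XML. Tag and attribute names are hypothetical,
    not those of the OPM XML schema."""
    root = ET.Element("graph")
    for effect, label, cause in edges:
        ET.SubElement(root, "edge", {"label": label, "effect": effect, "cause": cause})
    return ET.tostring(root, encoding="unicode")


def to_triples(edges, base="http://example.org/opm-sketch#"):
    """Serialise edges as N-Triples-style lines under a placeholder
    namespace, not the OPM RDF schema."""
    return [f"<{base}{e}> <{base}{label}> <{base}{c}> ." for e, label, c in edges]


print(to_xml(edges))
for line in to_triples(edges):
    print(line)
```

Checking that both outputs carry the same edges is a small-scale version of the equivalence effort mentioned above.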
Nodes
• Artifact: Immutable piece of state, which
may have a physical embodiment in a
physical object, or a digital
representation in a computer system.
• Process: Action or series of actions
performed on or caused by artifacts, and
resulting in new artifacts.
• Agent: Contextual entity acting as a
catalyst of a process, enabling,
facilitating, controlling, affecting its
execution.
Node symbols used in the diagrams: A (artifact), P (process), Ag (agent)
Edges
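• used(R): process P used artifact A in role R
• wasGeneratedBy(R): artifact A was generated by process P in role R
• wasControlledBy(R): process P was controlled by agent Ag in role R
• wasTriggeredBy: process P2 was triggered by process P1
• wasDerivedFrom: artifact A2 was derived from artifact A1
Edge labels are in the past tense to express that they are used to describe past executions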
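The three node kinds and five edge labels can be sketched as a small data structure. This is a minimal Python sketch, not an official OPM binding: the class names, the `EDGE_RULES` table, and the `add_edge` helper are illustrative assumptions, and only the node kinds and edge labels come from the slides. Edges are stored from effect to cause, which is how the past-tense labels read.

```python
from dataclasses import dataclass, field
from typing import Optional

# Allowed (effect kind, cause kind) for each edge label.
EDGE_RULES = {
    "used": ("process", "artifact"),
    "wasGeneratedBy": ("artifact", "process"),
    "wasControlledBy": ("process", "agent"),
    "wasTriggeredBy": ("process", "process"),
    "wasDerivedFrom": ("artifact", "artifact"),
}


@dataclass(frozen=True)
class Node:
    id: str
    kind: str  # "artifact", "process" or "agent"


@dataclass(frozen=True)
class Edge:
    label: str
    effect: Node
    cause: Node
    role: Optional[str] = None  # only some labels carry a role (R)


@dataclass
class Graph:
    edges: list = field(default_factory=list)

    def add_edge(self, label, effect, cause, role=None):
        """Append an edge after checking that its endpoints have the right kinds."""
        expected = EDGE_RULES[label]
        if (effect.kind, cause.kind) != expected:
            raise ValueError(f"{label} expects {expected}, got {(effect.kind, cause.kind)}")
        edge = Edge(label, effect, cause, role)
        self.edges.append(edge)
        return edge


a = Node("A", "artifact")
p = Node("P", "process")
ag = Node("Ag", "agent")
g = Graph()
g.add_edge("used", p, a, role="input")
g.add_edge("wasControlledBy", p, ag, role="operator")
```

Rejecting an edge such as `used` from an artifact to a process at insertion time keeps later queries over the graph simple.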
Illustration
• A process “used” artifacts and
“generated” artifacts
• Edge “roles” indicate the
function of the artifact with
respect to the process (akin
to function parameters)
• Edges and nodes can be
typed
Causation chain:
• P was caused by A1 and A2
• A3 and A4 were caused by P
• Does it mean that A3 and A4
were caused by A1 and A2?
Figure: process P (type=division) used artifacts A1 and A2 in roles divisor and dividend; artifacts A3 and A4 wasGeneratedBy P in roles rest and quotient.
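The division illustration can be written down as plain edge records, which makes the causation question concrete. This is a sketch under assumptions: the tuple layout and the `causes_of` helper are invented for the example, and the pairing of roles with individual artifacts is assumed, since the figure itself does not survive in this transcript.

```python
# (effect, label, role, cause): each edge points from an effect to its cause.
edges = [
    ("P", "used", "divisor", "A1"),
    ("P", "used", "dividend", "A2"),
    ("A3", "wasGeneratedBy", "rest", "P"),
    ("A4", "wasGeneratedBy", "quotient", "P"),
]


def causes_of(node, edges):
    """Return every node reachable from `node` by following edges to their causes."""
    seen = set()
    stack = [node]
    while stack:
        current = stack.pop()
        for effect, _label, _role, cause in edges:
            if effect == current and cause not in seen:
                seen.add(cause)
                stack.append(cause)
    return seen


print(sorted(causes_of("A3", edges)))  # ['A1', 'A2', 'P']
```

Reachability only shows that A1 and A2 may have influenced A3. Whether A3 actually depends on them is not settled by used and wasGeneratedBy edges alone; a derivation is stated explicitly with a wasDerivedFrom edge.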
Time Constraints
Figure: process P is controlled by agent Ag (wasControlledBy(R), start: T2, end: T5). P used(R) an artifact at T3; that artifact wasGeneratedBy(R) an earlier process at T1. Another artifact wasGeneratedBy(R) P at T4 and was used(R) by a later process at T6.
T1<T3 (artifact must exist before being used)
T2<T3 (process must have started before using artifacts)
T3<T5 (process uses artifacts before it ends)
T2<T4 (process must have started before generating artifacts)
T4<T5 (process generates artifacts before it ends)
T4<T6 (artifact must exist before being used)
T2<T5 (process must have started before ending)
No constraint between T3 and T4
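The constraints listed on this slide can be checked mechanically. The sketch below is an assumption-laden illustration: the function name `violated_constraints` and the dictionary layout are hypothetical, and timestamps are taken to be plain comparable numbers.

```python
def violated_constraints(t):
    """Return the constraints from the slide that the timestamps in `t` violate.

    `t` maps the names 'T1'..'T6' to comparable timestamps.
    """
    rules = [
        ("T1", "T3", "artifact must exist before being used"),
        ("T2", "T3", "process must have started before using artifacts"),
        ("T3", "T5", "process uses artifacts before it ends"),
        ("T2", "T4", "process must have started before generating artifacts"),
        ("T4", "T5", "process generates artifacts before it ends"),
        ("T4", "T6", "artifact must exist before being used"),
        ("T2", "T5", "process must have started before ending"),
    ]
    return [f"{a}<{b} ({why})" for a, b, why in rules if not t[a] < t[b]]


# T3 > T4 is acceptable: there is no constraint between the two.
ok = {"T1": 1, "T2": 2, "T3": 4, "T4": 3, "T5": 5, "T6": 6}
print(violated_constraints(ok))  # []

# Here the artifact is used (T3) before the process has started (T2).
bad = {"T1": 0, "T2": 2, "T3": 1, "T4": 3, "T5": 5, "T6": 6}
print(violated_constraints(bad))
```

The first example deliberately sets T3 later than T4 to show that a process may generate one artifact before using another.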
Dublin Core Profile (draft)
• To many people, provenance is primarily
about attribution, citation, bibliographic
information
• DC provides terms to relate resources to such
information
• The DC profile aims to map the use of Dublin Core terms to
OPM concepts and graph patterns
with Simon Miles and Joe Futrelle
DC to OPM example: dc:publisher
Figure: a publish process P, an agent Ag (person, name=Luc), and artifacts A1 and A2 (state=unpublished and state=published), connected by edges labelled wasGeneratedBy, wasActionOf and wasSameResourceAs.
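One way to read this example is as a function that expands a single dc:publisher statement into a small graph pattern. The sketch below is hypothetical: the function name, node identifiers, and dictionary layout are invented, and the edge directions are an assumption, since the draft profile's details are not recoverable from this transcript. Only the labels (publish, wasGeneratedBy, wasSameResourceAs, wasActionOf, state, person, name) come from the slide.

```python
def publisher_pattern(resource, publisher_name):
    """Expand a dc:publisher statement into an OPM-style graph pattern (illustrative)."""
    unpublished = f"{resource}#unpublished"
    published = f"{resource}#published"
    nodes = {
        unpublished: {"kind": "artifact", "state": "unpublished"},
        published: {"kind": "artifact", "state": "published"},
        "publish": {"kind": "process", "type": "publish"},
        "publisher": {"kind": "agent", "type": "person", "name": publisher_name},
    }
    # Assumed directions: each edge points from an effect to its cause.
    edges = [
        (published, "wasGeneratedBy", "publish"),
        (published, "wasSameResourceAs", unpublished),
        ("publish", "wasActionOf", "publisher"),
    ]
    return nodes, edges


nodes, edges = publisher_pattern("doc1", "Luc")
for edge in edges:
    print(edge)
```

Treating the unpublished and published states as two artifacts fits the definition of an artifact as an immutable piece of state.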
What have we learned about
provenance?
• Provenance: describes and records the results of
processes on objects over time
• OPM represents provenance as XML
• OPM can be serialised in different formats
• RDF, Semantic Web
• OPM is a work in progress
By working with an open standard model that can
pass information as XML and in other standard serialisation
formats (e.g. RDF), it should be possible to build
provenance services into repository environments

More Related Content

Viewers also liked

Ch03 records management
Ch03 records managementCh03 records management
Ch03 records managementxtin101
 
Records Inventory And Appraisal
Records Inventory And AppraisalRecords Inventory And Appraisal
Records Inventory And Appraisal
Fe Angela Verzosa
 
Ch06 records management slide show part 2 with notes
Ch06 records management slide show part 2 with notesCh06 records management slide show part 2 with notes
Ch06 records management slide show part 2 with notes
francarter2
 
Introduction to archival research 2015
Introduction to archival research 2015Introduction to archival research 2015
Introduction to archival research 2015
Humphrey Southall
 
Principles of records management Mushi
Principles of records management MushiPrinciples of records management Mushi
Principles of records management Mushisylvanus mushi
 
Records inventory and appraisal
Records inventory and appraisalRecords inventory and appraisal
Records inventory and appraisal
corpuzed
 
Ch07 records management
Ch07 records managementCh07 records management
Ch07 records managementxtin101
 
Prov-O-Viz: Interactive Provenance Visualization
Prov-O-Viz: Interactive Provenance VisualizationProv-O-Viz: Interactive Provenance Visualization
Prov-O-Viz: Interactive Provenance Visualization
Rinke Hoekstra
 
Appraisal
AppraisalAppraisal
Appraisal
Sharon Pullen
 
Data Governance: Keystone of Information Management Initiatives
Data Governance: Keystone of Information Management InitiativesData Governance: Keystone of Information Management Initiatives
Data Governance: Keystone of Information Management Initiatives
Alan McSweeney
 
Behind the Gate: challenges facing archivists in academic research libraries
Behind the Gate: challenges facing archivists in academic research librariesBehind the Gate: challenges facing archivists in academic research libraries
Behind the Gate: challenges facing archivists in academic research librariesAudra Eagle Yun
 
Inventory management
Inventory managementInventory management
Inventory managementKuldeep Uttam
 
How to conduct a records and information inventory
How to conduct a records and information inventoryHow to conduct a records and information inventory
How to conduct a records and information inventory
Jesse Wilkins
 

Viewers also liked (15)

Records inventory final
Records inventory finalRecords inventory final
Records inventory final
 
Ch03 records management
Ch03 records managementCh03 records management
Ch03 records management
 
Records Inventory And Appraisal
Records Inventory And AppraisalRecords Inventory And Appraisal
Records Inventory And Appraisal
 
Ch06 records management slide show part 2 with notes
Ch06 records management slide show part 2 with notesCh06 records management slide show part 2 with notes
Ch06 records management slide show part 2 with notes
 
Introduction to archival research 2015
Introduction to archival research 2015Introduction to archival research 2015
Introduction to archival research 2015
 
Principles of records management Mushi
Principles of records management MushiPrinciples of records management Mushi
Principles of records management Mushi
 
Records inventory and appraisal
Records inventory and appraisalRecords inventory and appraisal
Records inventory and appraisal
 
Archival research
Archival researchArchival research
Archival research
 
Ch07 records management
Ch07 records managementCh07 records management
Ch07 records management
 
Prov-O-Viz: Interactive Provenance Visualization
Prov-O-Viz: Interactive Provenance VisualizationProv-O-Viz: Interactive Provenance Visualization
Prov-O-Viz: Interactive Provenance Visualization
 
Appraisal
AppraisalAppraisal
Appraisal
 
Data Governance: Keystone of Information Management Initiatives
Data Governance: Keystone of Information Management InitiativesData Governance: Keystone of Information Management Initiatives
Data Governance: Keystone of Information Management Initiatives
 
Behind the Gate: challenges facing archivists in academic research libraries
Behind the Gate: challenges facing archivists in academic research librariesBehind the Gate: challenges facing archivists in academic research libraries
Behind the Gate: challenges facing archivists in academic research libraries
 
Inventory management
Inventory managementInventory management
Inventory management
 
How to conduct a records and information inventory
How to conduct a records and information inventoryHow to conduct a records and information inventory
How to conduct a records and information inventory
 

Similar to Keepit Course 3: Provenance (and OPM), based on slides by Luc Moreau

On the need for a W3C community group on RDF Stream Processing
On the need for a W3C community group on RDF Stream ProcessingOn the need for a W3C community group on RDF Stream Processing
On the need for a W3C community group on RDF Stream Processing
PlanetData Network of Excellence
 
OrdRing 2013 keynote - On the need for a W3C community group on RDF Stream Pr...
OrdRing 2013 keynote - On the need for a W3C community group on RDF Stream Pr...OrdRing 2013 keynote - On the need for a W3C community group on RDF Stream Pr...
OrdRing 2013 keynote - On the need for a W3C community group on RDF Stream Pr...
Oscar Corcho
 
ACM DEBS 2015: Realtime Streaming Analytics Patterns
ACM DEBS 2015: Realtime Streaming Analytics PatternsACM DEBS 2015: Realtime Streaming Analytics Patterns
ACM DEBS 2015: Realtime Streaming Analytics PatternsSrinath Perera
 
DEBS 2015 Tutorial : Patterns for Realtime Streaming Analytics
DEBS 2015 Tutorial : Patterns for Realtime Streaming AnalyticsDEBS 2015 Tutorial : Patterns for Realtime Streaming Analytics
DEBS 2015 Tutorial : Patterns for Realtime Streaming Analytics
Sriskandarajah Suhothayan
 
"Data Provenance: Principles and Why it matters for BioMedical Applications"
"Data Provenance: Principles and Why it matters for BioMedical Applications""Data Provenance: Principles and Why it matters for BioMedical Applications"
"Data Provenance: Principles and Why it matters for BioMedical Applications"
Pinar Alper
 
oai-2.0-adv.ppt
oai-2.0-adv.pptoai-2.0-adv.ppt
oai-2.0-adv.ppt
Bharath Abbareddy
 
OSLC KM (Knowledge Management): elevating the meaning of data and operations ...
OSLC KM (Knowledge Management): elevating the meaning of data and operations ...OSLC KM (Knowledge Management): elevating the meaning of data and operations ...
OSLC KM (Knowledge Management): elevating the meaning of data and operations ...
CARLOS III UNIVERSITY OF MADRID
 
Oxford Common File Layout (OCFL)
Oxford Common File Layout (OCFL)Oxford Common File Layout (OCFL)
Oxford Common File Layout (OCFL)
Simeon Warner
 
Environment Canada's Data Management Service
Environment Canada's Data Management ServiceEnvironment Canada's Data Management Service
Environment Canada's Data Management Service
Safe Software
 
The Data Distribution Service Tutorial
The Data Distribution Service TutorialThe Data Distribution Service Tutorial
The Data Distribution Service Tutorial
Angelo Corsaro
 
Eureka Research Workbench: A Semantic Approach to an Open Source Electroni...
Eureka Research Workbench: A Semantic Approach to an Open Source Electroni...Eureka Research Workbench: A Semantic Approach to an Open Source Electroni...
Eureka Research Workbench: A Semantic Approach to an Open Source Electroni...
Stuart Chalk
 
Norman and McCraken, "OpenURL Implementation: Link Resolution That Users Will...
Norman and McCraken, "OpenURL Implementation: Link Resolution That Users Will...Norman and McCraken, "OpenURL Implementation: Link Resolution That Users Will...
Norman and McCraken, "OpenURL Implementation: Link Resolution That Users Will...
National Information Standards Organization (NISO)
 
Introduction to OpenSees by Frank McKenna
Introduction to OpenSees by Frank McKennaIntroduction to OpenSees by Frank McKenna
Introduction to OpenSees by Frank McKenna
openseesdays
 
BL Demo Day - July2011 - (9) IMPACT Interoperability and Evaluation Framework
BL Demo Day - July2011 - (9) IMPACT Interoperability and Evaluation FrameworkBL Demo Day - July2011 - (9) IMPACT Interoperability and Evaluation Framework
BL Demo Day - July2011 - (9) IMPACT Interoperability and Evaluation Framework
IMPACT Centre of Competence
 
OpenURL - The Rough Guide
OpenURL - The Rough GuideOpenURL - The Rough Guide
OpenURL - The Rough Guide
Tony Hammond
 
Introduction to Networking and OSI Model
Introduction to Networking and OSI ModelIntroduction to Networking and OSI Model
Introduction to Networking and OSI Model
KawtharAlsharah
 
MODELS 2019: Querying and annotating model histories with time-aware patterns
MODELS 2019: Querying and annotating model histories with time-aware patternsMODELS 2019: Querying and annotating model histories with time-aware patterns
MODELS 2019: Querying and annotating model histories with time-aware patterns
Antonio García-Domínguez
 
An Introduction to Distributed Data Streaming
An Introduction to Distributed Data StreamingAn Introduction to Distributed Data Streaming
An Introduction to Distributed Data Streaming
Paris Carbone
 
Networks
NetworksNetworks
Networks
Edward Blurock
 
Design and Implementation of A Data Stream Management System
Design and Implementation of A Data Stream Management SystemDesign and Implementation of A Data Stream Management System
Design and Implementation of A Data Stream Management System
Erdi Olmezogullari
 

Similar to Keepit Course 3: Provenance (and OPM), based on slides by Luc Moreau (20)

On the need for a W3C community group on RDF Stream Processing
On the need for a W3C community group on RDF Stream ProcessingOn the need for a W3C community group on RDF Stream Processing
On the need for a W3C community group on RDF Stream Processing
 
OrdRing 2013 keynote - On the need for a W3C community group on RDF Stream Pr...
OrdRing 2013 keynote - On the need for a W3C community group on RDF Stream Pr...OrdRing 2013 keynote - On the need for a W3C community group on RDF Stream Pr...
OrdRing 2013 keynote - On the need for a W3C community group on RDF Stream Pr...
 
ACM DEBS 2015: Realtime Streaming Analytics Patterns
ACM DEBS 2015: Realtime Streaming Analytics PatternsACM DEBS 2015: Realtime Streaming Analytics Patterns
ACM DEBS 2015: Realtime Streaming Analytics Patterns
 
DEBS 2015 Tutorial : Patterns for Realtime Streaming Analytics
DEBS 2015 Tutorial : Patterns for Realtime Streaming AnalyticsDEBS 2015 Tutorial : Patterns for Realtime Streaming Analytics
DEBS 2015 Tutorial : Patterns for Realtime Streaming Analytics
 
"Data Provenance: Principles and Why it matters for BioMedical Applications"
"Data Provenance: Principles and Why it matters for BioMedical Applications""Data Provenance: Principles and Why it matters for BioMedical Applications"
"Data Provenance: Principles and Why it matters for BioMedical Applications"
 
oai-2.0-adv.ppt
oai-2.0-adv.pptoai-2.0-adv.ppt
oai-2.0-adv.ppt
 
OSLC KM (Knowledge Management): elevating the meaning of data and operations ...
OSLC KM (Knowledge Management): elevating the meaning of data and operations ...OSLC KM (Knowledge Management): elevating the meaning of data and operations ...
OSLC KM (Knowledge Management): elevating the meaning of data and operations ...
 
Oxford Common File Layout (OCFL)
Oxford Common File Layout (OCFL)Oxford Common File Layout (OCFL)
Oxford Common File Layout (OCFL)
 
Environment Canada's Data Management Service
Environment Canada's Data Management ServiceEnvironment Canada's Data Management Service
Environment Canada's Data Management Service
 
The Data Distribution Service Tutorial
The Data Distribution Service TutorialThe Data Distribution Service Tutorial
The Data Distribution Service Tutorial
 
Eureka Research Workbench: A Semantic Approach to an Open Source Electroni...
Eureka Research Workbench: A Semantic Approach to an Open Source Electroni...Eureka Research Workbench: A Semantic Approach to an Open Source Electroni...
Eureka Research Workbench: A Semantic Approach to an Open Source Electroni...
 
Norman and McCraken, "OpenURL Implementation: Link Resolution That Users Will...
Norman and McCraken, "OpenURL Implementation: Link Resolution That Users Will...Norman and McCraken, "OpenURL Implementation: Link Resolution That Users Will...
Norman and McCraken, "OpenURL Implementation: Link Resolution That Users Will...
 
Introduction to OpenSees by Frank McKenna
Introduction to OpenSees by Frank McKennaIntroduction to OpenSees by Frank McKenna
Introduction to OpenSees by Frank McKenna
 
BL Demo Day - July2011 - (9) IMPACT Interoperability and Evaluation Framework
BL Demo Day - July2011 - (9) IMPACT Interoperability and Evaluation FrameworkBL Demo Day - July2011 - (9) IMPACT Interoperability and Evaluation Framework
BL Demo Day - July2011 - (9) IMPACT Interoperability and Evaluation Framework
 
OpenURL - The Rough Guide
OpenURL - The Rough GuideOpenURL - The Rough Guide
OpenURL - The Rough Guide
 
Introduction to Networking and OSI Model
Introduction to Networking and OSI ModelIntroduction to Networking and OSI Model
Introduction to Networking and OSI Model
 
MODELS 2019: Querying and annotating model histories with time-aware patterns
MODELS 2019: Querying and annotating model histories with time-aware patternsMODELS 2019: Querying and annotating model histories with time-aware patterns
MODELS 2019: Querying and annotating model histories with time-aware patterns
 
An Introduction to Distributed Data Streaming
An Introduction to Distributed Data StreamingAn Introduction to Distributed Data Streaming
An Introduction to Distributed Data Streaming
 
Networks
NetworksNetworks
Networks
 
Design and Implementation of A Data Stream Management System
Design and Implementation of A Data Stream Management SystemDesign and Implementation of A Data Stream Management System
Design and Implementation of A Data Stream Management System
 

More from JISC KeepIt project

EPrints Preservation: Why we need Preservation Planning
EPrints Preservation: Why we need Preservation PlanningEPrints Preservation: Why we need Preservation Planning
EPrints Preservation: Why we need Preservation Planning
JISC KeepIt project
 
Preserving repository content: practical steps for repository managers by Mig...
Preserving repository content: practical steps for repository managers by Mig...Preserving repository content: practical steps for repository managers by Mig...
Preserving repository content: practical steps for repository managers by Mig...
JISC KeepIt project
 
Update on the JISC KeepIt Repository Preservation Exemplars Project, June 2010
Update on the JISC KeepIt Repository Preservation Exemplars Project, June 2010Update on the JISC KeepIt Repository Preservation Exemplars Project, June 2010
Update on the JISC KeepIt Repository Preservation Exemplars Project, June 2010
JISC KeepIt project
 
Transforming repositories: from repository managers to institutional data man...
Transforming repositories: from repository managers to institutional data man...Transforming repositories: from repository managers to institutional data man...
Transforming repositories: from repository managers to institutional data man...
JISC KeepIt project
 
Keepit Course 5: Concluding the course
Keepit Course 5: Concluding the courseKeepit Course 5: Concluding the course
Keepit Course 5: Concluding the course
JISC KeepIt project
 
Keepit Course 5: Revision
Keepit Course 5: RevisionKeepit Course 5: Revision
Keepit Course 5: Revision
JISC KeepIt project
 
KeepIt Course 5: DRAMBORA: Risk and Trust and Data Management, by Martin Donn...
KeepIt Course 5: DRAMBORA: Risk and Trust and Data Management, by Martin Donn...KeepIt Course 5: DRAMBORA: Risk and Trust and Data Management, by Martin Donn...
KeepIt Course 5: DRAMBORA: Risk and Trust and Data Management, by Martin Donn...
JISC KeepIt project
 
Keepit Course 5: Tools for Assessing Trustworthy Repositories
Keepit Course 5: Tools for Assessing Trustworthy RepositoriesKeepit Course 5: Tools for Assessing Trustworthy Repositories
Keepit Course 5: Tools for Assessing Trustworthy Repositories
JISC KeepIt project
 
Keepit Course 5: Trust
Keepit Course 5: TrustKeepit Course 5: Trust
Keepit Course 5: Trust
JISC KeepIt project
 
Preservation Planning using Plato, by Hannes Kulovits and Andreas Rauber
Preservation Planning using Plato, by Hannes Kulovits and Andreas RauberPreservation Planning using Plato, by Hannes Kulovits and Andreas Rauber
Preservation Planning using Plato, by Hannes Kulovits and Andreas Rauber
JISC KeepIt project
 
Physical preservation with EPrints: 1 Storage, by Adam Field, David Tarrant, ...
Physical preservation with EPrints: 1 Storage, by Adam Field, David Tarrant, ...Physical preservation with EPrints: 1 Storage, by Adam Field, David Tarrant, ...
Physical preservation with EPrints: 1 Storage, by Adam Field, David Tarrant, ...
JISC KeepIt project
 
KeepIt Course 4: digital preservation recap, by Andreas Rauber, Hannes Kulovi...
KeepIt Course 4: digital preservation recap, by Andreas Rauber, Hannes Kulovi...KeepIt Course 4: digital preservation recap, by Andreas Rauber, Hannes Kulovi...
KeepIt Course 4: digital preservation recap, by Andreas Rauber, Hannes Kulovi...
JISC KeepIt project
 
KeepIt Course 4: Putting storage, format management and preservation planning...
KeepIt Course 4: Putting storage, format management and preservation planning...KeepIt Course 4: Putting storage, format management and preservation planning...
KeepIt Course 4: Putting storage, format management and preservation planning...
JISC KeepIt project
 
KeepIt Course 3: Applying Preservation Metadata to Repositories
KeepIt Course 3: Applying Preservation Metadata to RepositoriesKeepIt Course 3: Applying Preservation Metadata to Repositories
KeepIt Course 3: Applying Preservation Metadata to Repositories
JISC KeepIt project
 
Significant Properties - Where Next? (SPs part 6), by Stephen Grace and Garet...
Significant Properties - Where Next? (SPs part 6), by Stephen Grace and Garet...Significant Properties - Where Next? (SPs part 6), by Stephen Grace and Garet...
Significant Properties - Where Next? (SPs part 6), by Stephen Grace and Garet...
JISC KeepIt project
 
Supporting Significant Properties in a Working Archive (SPs part 5), by Steph...
Supporting Significant Properties in a Working Archive (SPs part 5), by Steph...Supporting Significant Properties in a Working Archive (SPs part 5), by Steph...
Supporting Significant Properties in a Working Archive (SPs part 5), by Steph...
JISC KeepIt project
 
Significant Properties, Practical 2: Stakeholder Analysis (SPs part 4), by St...
Significant Properties, Practical 2: Stakeholder Analysis (SPs part 4), by St...Significant Properties, Practical 2: Stakeholder Analysis (SPs part 4), by St...
Significant Properties, Practical 2: Stakeholder Analysis (SPs part 4), by St...
JISC KeepIt project
 
Significant Properties, Practical 1: Object Analysis (SPs part 3), by Stephen...
Significant Properties, Practical 1: Object Analysis (SPs part 3), by Stephen...Significant Properties, Practical 1: Object Analysis (SPs part 3), by Stephen...
Significant Properties, Practical 1: Object Analysis (SPs part 3), by Stephen...
JISC KeepIt project
 
InSPECT Significant Properties Framework (SPs part 2), by Stephen Grace and G...
InSPECT Significant Properties Framework (SPs part 2), by Stephen Grace and G...InSPECT Significant Properties Framework (SPs part 2), by Stephen Grace and G...
InSPECT Significant Properties Framework (SPs part 2), by Stephen Grace and G...
JISC KeepIt project
 
Introducing Significant Properties (SPs part 1), by Stephen Grace and Gareth ...
Introducing Significant Properties (SPs part 1), by Stephen Grace and Gareth ...Introducing Significant Properties (SPs part 1), by Stephen Grace and Gareth ...
Introducing Significant Properties (SPs part 1), by Stephen Grace and Gareth ...
JISC KeepIt project
 

More from JISC KeepIt project (20)

EPrints Preservation: Why we need Preservation Planning
EPrints Preservation: Why we need Preservation PlanningEPrints Preservation: Why we need Preservation Planning
EPrints Preservation: Why we need Preservation Planning
 
Preserving repository content: practical steps for repository managers by Mig...
Preserving repository content: practical steps for repository managers by Mig...Preserving repository content: practical steps for repository managers by Mig...
Preserving repository content: practical steps for repository managers by Mig...
 
Update on the JISC KeepIt Repository Preservation Exemplars Project, June 2010
Update on the JISC KeepIt Repository Preservation Exemplars Project, June 2010Update on the JISC KeepIt Repository Preservation Exemplars Project, June 2010
Update on the JISC KeepIt Repository Preservation Exemplars Project, June 2010
 
Transforming repositories: from repository managers to institutional data man...
Transforming repositories: from repository managers to institutional data man...Transforming repositories: from repository managers to institutional data man...
Transforming repositories: from repository managers to institutional data man...
 
Keepit Course 5: Concluding the course
Keepit Course 5: Concluding the courseKeepit Course 5: Concluding the course
Keepit Course 5: Concluding the course
 
Keepit Course 5: Revision
Keepit Course 5: RevisionKeepit Course 5: Revision
Keepit Course 5: Revision
 
KeepIt Course 5: DRAMBORA: Risk and Trust and Data Management, by Martin Donn...
KeepIt Course 5: DRAMBORA: Risk and Trust and Data Management, by Martin Donn...KeepIt Course 5: DRAMBORA: Risk and Trust and Data Management, by Martin Donn...
KeepIt Course 5: DRAMBORA: Risk and Trust and Data Management, by Martin Donn...
 
Keepit Course 5: Tools for Assessing Trustworthy Repositories
Keepit Course 5: Tools for Assessing Trustworthy RepositoriesKeepit Course 5: Tools for Assessing Trustworthy Repositories
Keepit Course 5: Tools for Assessing Trustworthy Repositories
 
Keepit Course 5: Trust
Keepit Course 5: TrustKeepit Course 5: Trust
Keepit Course 5: Trust
 
Preservation Planning using Plato, by Hannes Kulovits and Andreas Rauber
Preservation Planning using Plato, by Hannes Kulovits and Andreas RauberPreservation Planning using Plato, by Hannes Kulovits and Andreas Rauber
Preservation Planning using Plato, by Hannes Kulovits and Andreas Rauber
 
Physical preservation with EPrints: 1 Storage, by Adam Field, David Tarrant, ...
Physical preservation with EPrints: 1 Storage, by Adam Field, David Tarrant, ...Physical preservation with EPrints: 1 Storage, by Adam Field, David Tarrant, ...
Physical preservation with EPrints: 1 Storage, by Adam Field, David Tarrant, ...
 
KeepIt Course 4: digital preservation recap, by Andreas Rauber, Hannes Kulovi...
KeepIt Course 4: digital preservation recap, by Andreas Rauber, Hannes Kulovi...KeepIt Course 4: digital preservation recap, by Andreas Rauber, Hannes Kulovi...
KeepIt Course 4: digital preservation recap, by Andreas Rauber, Hannes Kulovi...
 
KeepIt Course 4: Putting storage, format management and preservation planning...
KeepIt Course 4: Putting storage, format management and preservation planning...KeepIt Course 4: Putting storage, format management and preservation planning...
KeepIt Course 4: Putting storage, format management and preservation planning...
 
KeepIt Course 3: Applying Preservation Metadata to Repositories
KeepIt Course 3: Applying Preservation Metadata to RepositoriesKeepIt Course 3: Applying Preservation Metadata to Repositories
KeepIt Course 3: Applying Preservation Metadata to Repositories
 
Significant Properties - Where Next? (SPs part 6), by Stephen Grace and Garet...
Significant Properties - Where Next? (SPs part 6), by Stephen Grace and Garet...Significant Properties - Where Next? (SPs part 6), by Stephen Grace and Garet...
Significant Properties - Where Next? (SPs part 6), by Stephen Grace and Garet...
 
Supporting Significant Properties in a Working Archive (SPs part 5), by Steph...
Supporting Significant Properties in a Working Archive (SPs part 5), by Steph...Supporting Significant Properties in a Working Archive (SPs part 5), by Steph...
Supporting Significant Properties in a Working Archive (SPs part 5), by Steph...
 
Significant Properties, Practical 2: Stakeholder Analysis (SPs part 4), by St...
Significant Properties, Practical 2: Stakeholder Analysis (SPs part 4), by St...Significant Properties, Practical 2: Stakeholder Analysis (SPs part 4), by St...
Significant Properties, Practical 2: Stakeholder Analysis (SPs part 4), by St...
 
Significant Properties, Practical 1: Object Analysis (SPs part 3), by Stephen...
Significant Properties, Practical 1: Object Analysis (SPs part 3), by Stephen...Significant Properties, Practical 1: Object Analysis (SPs part 3), by Stephen...
Significant Properties, Practical 1: Object Analysis (SPs part 3), by Stephen...
 
InSPECT Significant Properties Framework (SPs part 2), by Stephen Grace and G...
InSPECT Significant Properties Framework (SPs part 2), by Stephen Grace and G...InSPECT Significant Properties Framework (SPs part 2), by Stephen Grace and G...
InSPECT Significant Properties Framework (SPs part 2), by Stephen Grace and G...
 
Introducing Significant Properties (SPs part 1), by Stephen Grace and Gareth ...
Introducing Significant Properties (SPs part 1), by Stephen Grace and Gareth ...Introducing Significant Properties (SPs part 1), by Stephen Grace and Gareth ...
Introducing Significant Properties (SPs part 1), by Stephen Grace and Gareth ...
 

Keepit Course 3: Provenance (and OPM), based on slides by Luc Moreau

  • 1. Brief Introduction to Provenance
      "As data becomes plentiful, verifiable truth becomes scarce"
      http://go-to-hellman.blogspot.com/2010/02/named-graphs-argleton-and-truth-economy.html
      For JISC KeepIt course on Digital Preservation Tools for Repository Managers
      Module 3, Primer on preservation workflow, formats and characterisation
      Westminster-Kingsway College, London, 2 March 2010
  • 2. Provenance: example
      The following excerpt and slides are taken with permission from Moreau, L., The Open Provenance Model: Towards inter-operability of Provenance Systems, http://users.ecs.soton.ac.uk/lavm/talks/iam09.pdf
      Example: the provenance of a bottle of wine includes:
      • Grapes from which it is made
      • Where those grapes grew
      • Process in the wine's preparation
      • How the wine was stored
      • Between which parties the wine was transported, e.g. producer to distributor to retailer
      • Where it was auctioned
  • 3. Provenance Definition
      • Oxford English Dictionary:
        – the fact of coming from some particular source or quarter; origin, derivation
        – the history or pedigree of a work of art, manuscript, rare book, etc.;
        – concretely, a record of the passage of an item through its various owners.
      • The provenance of a piece of data is the process that led to that piece of data
  • 4. The Science Lifecycle
      Diagram labels: scientists; Local Web Repositories; Graduate Students; Undergraduate Students; Virtual Learning Environment; Technical Reports; Reprints; Peer-Reviewed Journal & Conference Papers; Preprints & Metadata; Certified Experimental Results & Analyses; experimentation; Data, Metadata, Provenance, Scripts, Workflows, Services, Ontologies, Blogs, ...; Digital Libraries; Next Generation Researchers
      Adapted from David De Roure's slides
  • 5. Finding the provenance of research outputs across all the systems the data transited through
      Diagram: the Science Lifecycle diagram from slide 4, repeated with this caption
  • 6. Open Provenance Model (OPM)
      • Allows us to express all the causes of an item
      • Allows for process-oriented and dataflow-oriented views
      • Based on the notion of an annotated causality graph
      Moreau, L., et al.: OPM v1.00 (Dec 2007), OPM v1.01 (Jul 2008), OPM v1.1 (Dec 2009)
  • 7. OPM Requirements
      • To allow provenance information to be exchanged between systems, by means of a compatibility layer based on a shared provenance model.
      • To allow developers to build and share tools that operate on such a provenance model.
      • To define the model in a precise, technology-agnostic manner.
      • To define bindings to XML/RDF separately.
      • To support a digital representation of provenance for any "thing", whether produced by computer systems or not.
  • 8. OPM Serialisation
      • OPM is an abstract data model for representing past executions and what caused data and processes to occur
      • OPM can be serialised in different formats, referred to as "technology bindings" or serialisations:
        – OPM XML schema (http://openprovenance.org/model/v1.01.a)
        – OPM RDF schema
        – OPM OWL ontology
      • Effort is underway to ensure full equivalence of the representations
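The split between an abstract model and its bindings can be sketched as follows. This is only an illustration under stated assumptions: the dictionary layout, the element names (`graph`, `artifact`, `process`, `edge`) and the role names are hypothetical and are not the OPM XML schema or any official binding. The point is that one in-memory graph can be written to two formats and read back unchanged.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical in-memory graph; this layout is an assumption, not an OPM binding.
graph = {
    "artifacts": ["A1", "A2"],
    "processes": ["P1"],
    "edges": [
        {"kind": "used", "effect": "P1", "cause": "A1", "role": "in"},
        {"kind": "wasGeneratedBy", "effect": "A2", "cause": "P1", "role": "out"},
    ],
}


def to_json(g):
    """Serialise the graph as JSON text."""
    return json.dumps(g, sort_keys=True)


def to_xml(g):
    """Serialise the graph as XML text (made-up element names)."""
    root = ET.Element("graph")
    for artifact_id in g["artifacts"]:
        ET.SubElement(root, "artifact", attrib={"id": artifact_id})
    for process_id in g["processes"]:
        ET.SubElement(root, "process", attrib={"id": process_id})
    for edge in g["edges"]:
        ET.SubElement(root, "edge", attrib=dict(edge))
    return ET.tostring(root, encoding="unicode")


def from_xml(text):
    """Rebuild the graph dictionary from the XML produced by to_xml."""
    root = ET.fromstring(text)
    return {
        "artifacts": [el.get("id") for el in root.findall("artifact")],
        "processes": [el.get("id") for el in root.findall("process")],
        "edges": [dict(el.attrib) for el in root.findall("edge")],
    }


# Both serialisations carry the same information as the abstract graph.
assert json.loads(to_json(graph)) == graph
assert from_xml(to_xml(graph)) == graph
```

A round-trip check of this kind is one simple way to test that two representations are equivalent for a given graph.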
  • 9. Nodes
      • Artifact: Immutable piece of state, which may have a physical embodiment in a physical object, or a digital representation in a computer system.
      • Process: Action or series of actions performed on or caused by artifacts, and resulting in new artifacts.
      • Agent: Contextual entity acting as a catalyst of a process, enabling, facilitating, controlling, affecting its execution.
      Diagram symbols: A (artifact), P (process), Ag (agent)
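The three node kinds can be sketched as small Python types. This is a minimal sketch, not an OPM library: the class layout, field names and the example identifiers (borrowed loosely from the wine example) are assumptions made for illustration. Frozen dataclasses are used so that an artifact, once created, cannot be changed, which mirrors "immutable piece of state".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Artifact:
    """Immutable piece of state (physical object or digital representation)."""
    id: str
    value: object = None


@dataclass(frozen=True)
class Process:
    """Action or series of actions performed on or caused by artifacts."""
    id: str


@dataclass(frozen=True)
class Agent:
    """Contextual entity that enables, controls, or affects a process."""
    id: str


# Hypothetical nodes, loosely following the wine example.
grapes = Artifact("grapes", "harvested grapes")
pressing = Process("pressing")
winemaker = Agent("winemaker")

# Changing state means creating a new artifact, not editing the old one.
juice = Artifact("juice", "pressed juice")
```

Attempting `grapes.value = "something else"` raises an error, so a later state of the same real-world thing has to be modelled as a separate artifact.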
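  • 10. Edges
      Each edge points from an effect to its cause; (R) marks edges that carry a role:
      • used(R): process P → artifact A
      • wasGeneratedBy(R): artifact A → process P
      • wasControlledBy(R): process P → agent Ag
      • wasTriggeredBy: process P2 → process P1
      • wasDerivedFrom: artifact A2 → artifact A1
      Edge labels are in the past tense to express that they are used to describe past executions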
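The five edge kinds can be sketched as a typed edge that refuses to connect the wrong kinds of node. This is an illustrative sketch: the `Node` and `Edge` classes and the `EDGE_KINDS` table are hypothetical names, and the effect-to-cause direction used here is the reading assumed for the slide's diagram.

```python
from dataclasses import dataclass
from typing import Optional

# Edge kind -> (kind of the effect node, kind of the cause node).
EDGE_KINDS = {
    "used": ("process", "artifact"),
    "wasGeneratedBy": ("artifact", "process"),
    "wasControlledBy": ("process", "agent"),
    "wasTriggeredBy": ("process", "process"),
    "wasDerivedFrom": ("artifact", "artifact"),
}


@dataclass(frozen=True)
class Node:
    id: str
    kind: str  # "artifact", "process", or "agent"


@dataclass(frozen=True)
class Edge:
    kind: str
    effect: Node
    cause: Node
    role: Optional[str] = None  # the (R) on the slide; optional in this sketch

    def __post_init__(self):
        # Reject unknown edge kinds and edges between the wrong node kinds.
        if self.kind not in EDGE_KINDS:
            raise ValueError(f"unknown edge kind: {self.kind}")
        expected = EDGE_KINDS[self.kind]
        actual = (self.effect.kind, self.cause.kind)
        if actual != expected:
            raise ValueError(f"{self.kind} expects {expected}, got {actual}")


a1 = Node("A1", "artifact")
p1 = Node("P1", "process")
ok_edge = Edge("used", effect=p1, cause=a1, role="input")
```

Swapping the two nodes, as in `Edge("used", effect=a1, cause=p1)`, raises `ValueError`, because an artifact cannot "use" a process.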
  • 11. Illustration
      • Processes "used" artifacts and "generated" artifacts
      • Edge "roles" indicate the function of the artifact with respect to the process (akin to function parameters)
      • Edges and nodes can be typed
      Causation chain:
      • P was caused by A1 and A2
      • A3 and A4 were caused by P
      • Does it mean that A3 and A4 were caused by A1 and A2?
      Diagram: process P (type=division) connected to artifacts A1, A2, A3 and A4 by the edges used(dividend), used(divisor), wasGeneratedBy(quotient) and wasGeneratedBy(rest)
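The division example can be written down as a small edge list and queried. This is a sketch under assumptions: the tuple layout and function names are hypothetical, and the pairing of roles with artifacts (A1 as dividend, A2 as divisor, A3 as quotient, A4 as rest) is a guess, since the flattened diagram does not say which artifact plays which role. The last function only lists candidate sources, because "used" plus "wasGeneratedBy" does not by itself assert that an output was derived from a given input.

```python
# Each edge: (effect, edge kind, role, cause). Role-to-artifact pairing is assumed.
EDGES = [
    ("P", "used", "dividend", "A1"),
    ("P", "used", "divisor", "A2"),
    ("A3", "wasGeneratedBy", "quotient", "P"),
    ("A4", "wasGeneratedBy", "rest", "P"),
]


def used_by(process):
    """Map role -> artifact for everything the process used."""
    return {role: cause for effect, kind, role, cause in EDGES
            if kind == "used" and effect == process}


def generated_by(process):
    """Map role -> artifact for everything the process generated."""
    return {role: effect for effect, kind, role, cause in EDGES
            if kind == "wasGeneratedBy" and cause == process}


def candidate_sources(artifact):
    """Artifacts used by the process that generated `artifact`.

    These are candidates only: the graph does not state that the
    output actually depends on each of them.
    """
    sources = set()
    for effect, kind, _role, process in EDGES:
        if kind == "wasGeneratedBy" and effect == artifact:
            sources.update(used_by(process).values())
    return sorted(sources)


print(used_by("P"))             # {'dividend': 'A1', 'divisor': 'A2'}
print(generated_by("P"))        # {'quotient': 'A3', 'rest': 'A4'}
print(candidate_sources("A3"))  # ['A1', 'A2']
```

Roles behave like named function parameters here: `used_by("P")["divisor"]` answers "which artifact was the divisor?" without relying on edge order.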
  • 12. Time Constraints
      Diagram: a process (start: T2, end: T5), linked to an agent by wasControlledBy(R), uses at T3 an artifact that was generated at T1, and generates at T4 an artifact that is used at T6
      • T1<T3 (artifact must exist before being used)
      • T2<T3 (process must have started before using artifacts)
      • T3<T5 (process uses artifacts before it ends)
      • T2<T4 (process must have started before generating artifacts)
      • T4<T5 (process generates artifacts before it ends)
      • T4<T6 (artifact must exist before being used)
      • T2<T5 (process must have started before ending)
      • No constraint between T3 and T4
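The constraints listed on this slide can be checked mechanically. The sketch below assumes timestamps are supplied as comparable numbers in a dictionary keyed "T1" to "T6"; the function name and data layout are hypothetical.

```python
# (earlier, later, reason) triples, copied from the slide.
CONSTRAINTS = [
    ("T1", "T3", "artifact must exist before being used"),
    ("T2", "T3", "process must have started before using artifacts"),
    ("T3", "T5", "process uses artifacts before it ends"),
    ("T2", "T4", "process must have started before generating artifacts"),
    ("T4", "T5", "process generates artifacts before it ends"),
    ("T4", "T6", "artifact must exist before being used"),
    ("T2", "T5", "process must have started before ending"),
]


def violated_constraints(times):
    """Return a description of every constraint the timestamps break."""
    return [f"{a}<{b} ({reason})"
            for a, b, reason in CONSTRAINTS
            if not times[a] < times[b]]


in_order = {"T1": 1, "T2": 2, "T3": 3, "T4": 4, "T5": 5, "T6": 6}
# T3 and T4 swapped: still valid, since nothing relates T3 to T4.
swapped = {"T1": 1, "T2": 2, "T3": 4, "T4": 3, "T5": 5, "T6": 6}
# Artifact "used" at T3=0, before it existed and before the process started.
broken = {"T1": 1, "T2": 2, "T3": 0, "T4": 4, "T5": 5, "T6": 6}

print(violated_constraints(in_order))     # []
print(violated_constraints(swapped))      # []
print(len(violated_constraints(broken)))  # 2
```

The `swapped` timeline shows the slide's last point in practice: a process may generate an output before it has used all of its inputs.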
  • 13. Dublin Core Profile (draft)
      • To many people, provenance is primarily about attribution, citation, bibliographic information
      • DC provides terms to relate resources to such information
      • The DC profile aims to map Dublin Core terms to OPM concepts and graph patterns
      With Simon Miles and Joe Futrelle
  • 14. DC to OPM example: dc:publisher
      Diagram labels: artifacts A1 and A2 (state=unpublished, state=published); process P (publish); agent Ag (person, name=Luc); edges wasGeneratedBy, wasSameResourceAs, wasActionOf
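One possible reading of this pattern is sketched below. It is heavily assumption-laden: the slide's diagram survives only as a list of labels, so the choice of which artifact is published, the direction of each edge, the function name and the identifier scheme are all guesses made for illustration, not the draft profile itself. The sketch expands a single dc:publisher statement into the nodes and edge labels named on the slide.

```python
def publisher_to_opm(resource_id, publisher_name):
    """Expand 'resource dc:publisher name' into a small OPM-style pattern.

    Assumed reading of the slide: a 'publish' process, which was an action
    of the publisher, generated the published state of the resource; the
    published and unpublished states are the same resource.
    """
    unpublished = f"{resource_id}#unpublished"  # hypothetical id scheme
    published = f"{resource_id}#published"
    process = f"{resource_id}#publish"
    agent = f"agent:{publisher_name}"

    nodes = {
        unpublished: {"kind": "artifact", "state": "unpublished"},
        published: {"kind": "artifact", "state": "published"},
        process: {"kind": "process", "type": "publish"},
        agent: {"kind": "agent", "type": "person", "name": publisher_name},
    }
    # (effect, edge label, cause); labels as they appear on the slide.
    edges = [
        (published, "wasGeneratedBy", process),
        (published, "wasSameResourceAs", unpublished),
        (process, "wasActionOf", agent),
    ]
    return nodes, edges


nodes, edges = publisher_to_opm("report", "Luc")
for effect, label, cause in edges:
    print(f"{effect} --{label}--> {cause}")
```

The design idea worth noting is that a single bibliographic property becomes a small graph: the attribution is recovered by following wasGeneratedBy and then wasActionOf from the published artifact.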
  • 15. What have we learned about provenance?
      • Provenance describes and records the results of processes on objects over time
      • OPM can represent provenance as XML
      • OPM can also be serialised in other formats, e.g. RDF for the Semantic Web
      • OPM is a work in progress
      By working with an open standard model that can pass information as XML and in other standard serialisation formats (e.g. RDF), it should be possible to build provenance services into repository environments