
Provenance and social science data Nicholas Car - Intro to PROV



Slides from the webinar "Provenance and social science data", presented on 15 March 2017 by Nicholas Car, Data Architect, Geoscience Australia.

FULL webinar recording: https://youtu.be/elPcKqWoOPg

2. Nicholas Car (Data Architect, Geoscience Australia): A brief introduction to data provenance and provenance standards

Published in: Data & Analytics


  1. Intro to PROV. Nicholas Car, Data Architect, nicholas.car@ga.gov.au
  2. Outline
     • What is PROV?
     • How do I use PROV: modelling
     • How do I use PROV: data management
     • How do I use PROV: with other systems
  3. What is PROV?
     • W3C Recommendation (standard)
     • Completed 2013
     • Large number of authors
     • The only international provenance standard
     • Successor to precursors: PML, OPM
     • Many precursor authors involved
     • Simpler than precursors
     • No v2 any time soon
     • Authors recommend extending the current standard
     • Seeing good adoption
  4. What is PROV?
     • A “Family of documents”:
       • PROV-OVERVIEW – documentation
       • PROV-PRIMER – tutorial
       • PROV-DM – Data Model
       • PROV-O – OWL Ontology version of DM
       • PROV-N – special Notation for DM
       • PROV-XML – XML encoding of DM
       • PROV-CONSTRAINTS – DM constraints
     • http://www.w3.org/TR/prov-overview/
  5. How do I use PROV: modelling
     Not like this: do not describe the lineage of something in the metadata document of that thing.
     [Figure labels: ISO19115 or other standardised document; provenance information contained in the document; some provenance field]
     Ref: https://geo-ide.noaa.gov/wiki/index.php?title=ISO_Lineage
  6. How do I use PROV: modelling
     Not like this: do not link a class of something to a provenance object.
     [Figure labels: Data Catalogue Vocabulary (DCAT), https://www.w3.org/TR/vocab-dcat/; Provenance; field 1; field 2; provenance]
  7. How do I use PROV: modelling
     Not like this: do not link a class of something to a provenance object.
     [Figure labels: Data Catalogue Vocabulary (DCAT), https://www.w3.org/TR/vocab-dcat/; Provenance; field 1; field 2; provenance]
     Not even by using the Dublin Core ‘provenance’ property!
  8. How do I use PROV: modelling
     Like this: model things you are interested in as either Entities, Agents or Activities and relate them to one another.
     [Figure: PROV-DM’s basic classes expressed in a PROV-O style. After https://www.w3.org/TR/prov-o/]
  9. How do I use PROV: modelling
     Like this: GA’s “process provenance model”
  10. How do I use PROV: data management
      • For humans, or systems that log things:
        • create Reports
        • store them in a document DB
        • with all the perks of a graph DB!
  11. How do I use PROV: data management
      • For humans, or systems that log things:
        • create Reports
        • store them in a document DB
        • with all the perks of a graph DB!
      [Figure: a provenance Report generation form for human use in PROMS]
  12. How do I use PROV: data management
      • For humans, or systems that log things:
        • create Reports
        • store them in a document DB
      • For catalogue-like things:
        • Add the ability to link Entities, Agents, Activities
      [Figure labels: Dataset X; Dataset Y]
  13. How do I use PROV: data management
      • For humans, or systems that log things:
        • create Reports
        • store them in a document DB
      • For catalogue-like things:
        • Add the ability to link Entities, Agents, Activities
      [Figure labels: Dataset X; Dataset Y; wasDerivedFrom; Entity Y; Entity X]
  14. How do I use PROV: data management
      • For humans, or systems that log things:
        • create Reports
        • store them in a document DB
      • For catalogue-like things:
        • Add the ability to link Entities, Agents, Activities
        • Ensure relevant properties align with PROV
      [Figure labels: Dataset X; Creator; creator]
  15. How do I use PROV: data management
      • For humans, or systems that log things:
        • create Reports
        • store them in a document DB
      • For catalogue-like things:
        • Add the ability to link Entities, Agents, Activities
        • Ensure relevant properties align with PROV
      [Figure labels: Dataset X; wasAssociatedWith; Creator; creator; Agent; Creator; hadRole]
  16. How do I use PROV: data management
      • For humans, or systems that log things:
        • create Reports
        • store them in a document DB
      • For catalogue-like things:
        • Add the ability to link Entities, Agents, Activities
        • Ensure relevant properties align with PROV
      • For databases:
        • Ensure you represent the PROV-DM
  17. How do I use PROV: data management
      • For humans, or systems that log things:
        • create Reports
        • store them in a document DB
      • For catalogue-like things:
        • Add the ability to link Entities, Agents, Activities
        • Ensure relevant properties align with PROV
      • For databases:
        • Ensure you represent the PROV-DM
        • prove it via exporting
  18. How do I use PROV: with other systems
      • PROV & Metadata System X (MSX):
        1. Full Alignment – Classify all things in MSX in PROV
           o Requires a data model for MSX
           o May have to reconsider some MSX objects
           o Can profile PROV, don’t allow everything
        2. Partial Alignment – Classify some of MSX in PROV
           o Link classified things only
           o Even link to things outside MSX
           o Need to demo valid PROV-DM
        3. Just PROV – Interpret/create PROV-only data
           o Deprecate MSX for PROV
           o Or create new data
  19. How do I use PROV: data management
      Like this: GA’s “process provenance model”, full version
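
The modelling and data management advice in the slides (classify things as Entities, Agents or Activities, link them, then show the result is valid by exporting it) might be sketched roughly as follows. This is only an illustration, not the presenter's or PROMS's implementation: the `ProvDocument` class, the `ex:` identifiers and the `export` format are all hypothetical. The relation names and the classes they connect follow PROV-DM, but only a handful of relations are covered, and the exported text is a simplified, PROV-N-like listing that has not been checked against the PROV-N grammar.

```python
from dataclasses import dataclass, field

# The three core PROV-DM classes.
ENTITY, ACTIVITY, AGENT = "entity", "activity", "agent"

# Expected (subject class, object class) for a few PROV-DM relations.
# This is a deliberately small subset chosen for illustration.
RELATIONS = {
    "used": (ACTIVITY, ENTITY),
    "wasGeneratedBy": (ENTITY, ACTIVITY),
    "wasDerivedFrom": (ENTITY, ENTITY),
    "wasAttributedTo": (ENTITY, AGENT),
    "wasAssociatedWith": (ACTIVITY, AGENT),
}


@dataclass
class ProvDocument:
    """Hypothetical in-memory store for classified things and their links."""
    nodes: dict = field(default_factory=dict)       # identifier -> PROV class
    relations: list = field(default_factory=list)   # (relation, subject, object)

    def add(self, kind, identifier):
        """Classify a thing as an entity, activity or agent."""
        if kind not in (ENTITY, ACTIVITY, AGENT):
            raise ValueError(f"unknown PROV class: {kind}")
        self.nodes[identifier] = kind
        return identifier

    def relate(self, relation, subject, obj):
        """Link two classified things, rejecting class mismatches."""
        subject_kind, object_kind = RELATIONS[relation]
        actual = (self.nodes.get(subject), self.nodes.get(obj))
        if actual != (subject_kind, object_kind):
            raise ValueError(
                f"{relation} must link an {subject_kind} to an {object_kind}"
            )
        self.relations.append((relation, subject, obj))

    def export(self):
        """Render a simplified, PROV-N-like listing (not validated PROV-N)."""
        lines = ["document"]
        lines += [f"  {kind}({ident})" for ident, kind in self.nodes.items()]
        lines += [f"  {rel}({s}, {o})" for rel, s, o in self.relations]
        lines.append("endDocument")
        return "\n".join(lines)


# Hypothetical catalogue records, classified and linked.
doc = ProvDocument()
dataset_x = doc.add(ENTITY, "ex:datasetX")
dataset_y = doc.add(ENTITY, "ex:datasetY")
processing = doc.add(ACTIVITY, "ex:processingRun")
creator = doc.add(AGENT, "ex:creator")

doc.relate("used", processing, dataset_x)
doc.relate("wasGeneratedBy", dataset_y, processing)
doc.relate("wasDerivedFrom", dataset_y, dataset_x)
doc.relate("wasAssociatedWith", processing, creator)
doc.relate("wasAttributedTo", dataset_y, creator)

text = doc.export()
print(text)
```

Keeping the class check inside `relate` means a mismatched link, such as `wasAssociatedWith` from a dataset to a creator, fails when it is recorded rather than at export time. In PROV-DM an Entity is connected to an Agent with `wasAttributedTo`, while `wasAssociatedWith` connects an Activity to an Agent.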
