Alison Callahan and Michel Dumontier
Carleton University
Ovopubs: Modular data publication with minimal provenance

With the growth of the Semantic Web as a medium for creating, consuming, mashing up and republishing data, our ability to trace any statement back to its origin is becoming ever more important. Several approaches have now been proposed to associate statements with provenance, with multiple applications in data publication, attribution and argumentation. Here, we describe the ovopub, a modular model for data publication that enables encapsulation, aggregation, integrity checking, and selective-source query answering. We describe the ovopub RDF specification, key design patterns, and their application in publishing and referring to data in the life sciences.

paper: http://arxiv.org/abs/1305.6800
presented at bio-ontologies 2013: https://sites.google.com/site/bioontologies/home

  • In order to keep a clear link back to the original data, our RDFized datasets maintain the original data provider’s record identifiers by using a URI pattern that includes a namespace, the preferred short name for a biological dataset. The life sciences registry allows for automatic conversion of any alternative namespace to the preferred namespace.
  • Ovopub: Modular data publication with minimal provenance

    1. Alison Callahan and Michel Dumontier, Carleton University. Ovopubs: Modular data publication with minimal provenance. Dumontier::Bio-ontologies 2013:Ovopubs
    2. Data publication
       • Emerging interest in publishing data on the web
       • Microdata formats (RDFa, schema.org) and formal knowledge representation languages (RDF/OWL)
       • Efforts to capture credit/provenance of assertions
         – PROV-O, OAG
         – nanopublications (data/statements - Groth, Kuth)
         – microattributions (gene variation - Patrinos et al.)
         – micropublications (discourse - Clark et al.)
    3. Nanopublication
       • A nanopublication claims to be the “smallest, unambiguous unit of thought”.
       • A nanopublication is an RDF graph that links to two or three graphs:
         – A graph containing one or more assertions
         – A graph containing the provenance for the assertion(s)
         – A graph providing information about the nanopublication
       • [Diagram: assertion, provenance, publication]
       • Problems: indirection between an assertion and its provenance; what if no provenance is provided? A nanopub graph cannot fully contain other graphs; reasoning and ease of queries across nested graphs.
    4. An ovopub is an object that contains and links to data and the ovopub’s provenance. [Diagram: data, provenance]
    5. An assertion ovopub contains one or more connected statements. This ovopub is good for capturing knowledge in the form of statements.
    6. An ovopub also links itself to its content: rdfs:member <uri>. This explicit reification enables transitive closures over graph structures.
    7. An ovopub contains and links to its own provenance
       • dc:creator <uri> (creator)
       • dc:created xsd:datetime (timestamp)
       • dc:license <uri> (license)
       • rdf:type sio:assertion-ovopub / sio:collection-ovopub (ovopub type)
    8. A collection ovopub contains one or more unconnected items
       • Item types:
         – object
         – assertion ovopub
         – collection ovopub
       • This ovopub is good for:
         – encapsulation and redistribution of selected content
         – restriction of query execution / results
    9. iRefIndex: Ovopub Case Study for Datasets, Records, Assertions
    10. Future work
       • Actively develop the nanopublication as a community standard for provenance-based data publication
         – Assess the value of directly linking assertion & provenance graphs
         – Generate (revised) nanopublications in Bio2RDF
       • Promote nanopublication-based design patterns for:
         – direct/indirect data/discourse assertions
         – Aggregation semantics
       • Use of nanopublications for scientific research
         – Evidence gathering (HyQue)
    11. Michel Dumontier
        michel_dumontier@carleton.ca
        Publications: http://dumontierlab.com
        Presentations: http://slideshare.com/micheldumontier
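
The ovopub structure described in slides 4 to 8 can be illustrated with a minimal sketch. It is not the authors' implementation or the ovopub RDF specification: triples are modelled as plain Python tuples, every `ex:` identifier and the timestamp are hypothetical, and representing a statement with `rdf:subject` / `rdf:predicate` / `rdf:object` is an assumption about what "explicit reification" looks like. Only the predicates `rdfs:member`, `dc:creator`, `dc:created`, `dc:license`, and the two `sio:` ovopub types come from the slides.

```python
# Sketch only: triples are (subject, predicate, object) tuples of strings.

def make_ovopub(uri, kind, members, creator, created, license_uri):
    """Return the triples for one ovopub: its type, its own provenance,
    and an rdfs:member link to each item it contains."""
    triples = [
        (uri, "rdf:type", kind),
        (uri, "dc:creator", creator),
        (uri, "dc:created", created),
        (uri, "dc:license", license_uri),
    ]
    triples += [(uri, "rdfs:member", member) for member in members]
    return triples


def members_closure(triples, root):
    """Everything reachable from root by following rdfs:member links."""
    direct = {}
    for s, p, o in triples:
        if p == "rdfs:member":
            direct.setdefault(s, []).append(o)
    seen, stack = set(), [root]
    while stack:
        node = stack.pop()
        for child in direct.get(node, []):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def restrict(triples, root):
    """Keep only triples whose subject is root or inside its closure."""
    scope = members_closure(triples, root) | {root}
    return [t for t in triples if t[0] in scope]


def reified(stmt, s, p, o):
    """Assumed representation of one statement as a reified resource."""
    return [(stmt, "rdf:subject", s), (stmt, "rdf:predicate", p),
            (stmt, "rdf:object", o)]


# Hypothetical example data.
graph = (
    reified("ex:stmt1", "ex:proteinA", "ex:interactsWith", "ex:proteinB")
    + reified("ex:stmt2", "ex:proteinC", "ex:interactsWith", "ex:proteinD")
    + make_ovopub("ex:ovopub1", "sio:assertion-ovopub", ["ex:stmt1"],
                  "ex:alice", "2013-01-01T00:00:00Z", "ex:license")
    + make_ovopub("ex:ovopub2", "sio:assertion-ovopub", ["ex:stmt2"],
                  "ex:bob", "2013-01-01T00:00:00Z", "ex:license")
    + make_ovopub("ex:collection1", "sio:collection-ovopub", ["ex:ovopub1"],
                  "ex:alice", "2013-01-01T00:00:00Z", "ex:license")
)

closure = members_closure(graph, "ex:collection1")
selected = restrict(graph, "ex:collection1")
```

Here `closure` contains `ex:ovopub1` and `ex:stmt1` but not `ex:ovopub2`, so `selected` holds only the content published through the collection. This is one way to read the slide's "restriction of query execution / results": a query is answered against the subgraph reachable from a chosen ovopub rather than against the whole dataset.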
