SlideShare a Scribd company logo
A document-inspired way for
tracking changes of RDF data
The case of the OpenCitations Corpus
Silvio Peroni, David Shotton, Fabio Vitali
1st Drift-a-LOD Workshop: Detection, Representation and
Management of Concept Drift in Linked Open Data

Bologna, Italy, November 20, 2016
Paper: https://w3id.org/oc/paper/occ-driftalod2016.html
The Venice analogy
• Island = 

scholarly publication
• Bridge = citation
• Current situation:
– local travel to
the next island
is permitted
– unrestricted
travel over the
entire network
of bridges
requires an
expensive
season ticket
– general
populace is
excluded
https://w3id.org/oc/paper/the-venice-analogy.html
Opening the bridges
• What – Citation data are one of the main tools used by
researchers to gain knowledge about particular topics, and
they also serve institutional goals, for example in research
assessment
• Problem – The most authoritative databases of citation data,
Scopus and Web of Science, can only be accessed by paying
significant annual access fees
– The University of Bologna pays about 6,000,000 euros per year for
accessing to digital bibliographic resources
• Solution – To create a citation database that freely and legally
makes available citation data in an open repository to assist
scholars with their academic studies and serve knowledge to
the wider public
OpenCitations
• The OpenCitations Project aims at creating an open repository of scholarly
citation data – the OpenCitations Corpus (OCC) – made available under a
Creative Commons public domain dedication to provide in RDF accurate citation
information (bibliographic references) harvested from the scholarly literature
– All scripts are released with Open Source ISC Licence and available on GitHub at
http://github.com/essepuntato/opencitations
• Currently processing papers available in the PubMedCentral Open Access subset
• As of November 20, 2016 the OCC contains 2,076,645 citation links
• Six distinct kinds of bibliographic entities
– bibliographic resources (citing/cited articles, journals, books, proceedings, etc.)
– resource embodiments (format information about bibliographic resources)
– bibliographic entries (literal textual entries occurring in the reference lists)
– responsible agents (agents having certain roles with respect to the bibliographic
resources)
– agent roles (author, editor, publisher);
– identifiers (DOI, ORCID, PubMedID, URL, etc.)
http://opencitations.net
Ingestion workflow
BEE
EuropeanPubMedCentralProcessor
Parsing the
XML source of
PubMed Central
Open Access
articles.
1
SPACIN
Producing
JSON with DOI
and bib entries.
{

"doi": "10.1590/1414-431x20154655", 

"localid": "MED-26577845", 

"curator": "BEE EuropeanPubMedCentralProcessor", 

"source": "http://www.ebi.ac.uk/europepmc/webservices/rest/PMC4678653/
fullTextXML",
"source_provider": "Europe PubMed Central", 

"pmid": "26577845", 

"pmcid": “PMC4678653",

"references": [

{

"bibentry": "Wenger, NK. Coronary heart disease: an older woman's major
health risk, BMJ, 1997, 315, 1085, 1090, DOI: 10.1136/bmj.315.7115.1085, PMID:
9366743", 

"pmid": "9366743", 

"doi": "10.1136/bmj.315.7115.1085", 

"pmcid": "PMC2127693", 

"process_entry": "True"

} …
]
}
2
For each citing/cited resource,
if an ID (DOI, PMID, PMCID) is
specified check if the resource
exists already. If it does go to 5.
store
ResourceFinder
3
GraphSet
ProvSet
DatasetHandler

Storer
Load all the statements onthe triplestore and storethem in the file system for
easy recovering.
OCC
6
If the resource doesn’t exist,
extract possible IDs from the entry
and query CrossRef and ORCID.
CrossRefProcessor

ORCIDProcessor
4
GraphEntity
New metadata resources are created.
If CrossRef/ORCID returned something, all
the related metadata will be used,
otherwise only basic metadata (IDs and
entries) will be added.
5
Issues with data
• Automatic workflow built upon external services
– Efficient, but no human check of the data extracted
– Some errors could be propagated
• Data do change in time
– Information can be incomplete (e.g. citations added
in another iteration of the ingestion workflow)
– Information can be wrong (e.g. circular citations –
paper A cites paper A)
– Information can be ambiguous (e.g. author
disambiguation)
Document-inspired data drift
• Inspiration from the Document Engineering domain
– Well-known structure for keeping track of changes in word-
processor documents, e.g. OpenOffice Writer
✦ New content added directly within the existing text and marked in
some way
✦ Removed context moved out from the actual content of the document
and placed in an auxiliary space for easy retrieving and restoration
– Two basic operations (add & remove) are enough for keeping
track of document changes
• Solution for RDF data: using PROV-O + SPARQL
UPDATE (INSERT DATA and DELETE DATA only) for
keeping track of the way entities change in time
The approach
:sp a foaf:Person ;
foaf:name "Silvio Peroni" .
:sp a prov:Entity .
:sp-snapshot-1 a prov:Entity ;
prov:specializationOf :sp .
Data Provenance
:sp a foaf:Person ;
foaf:givenName "Silvio" ;
foaf:familyName "Peroni" .
:sp-snapshot-2 a prov:Entity ;
prov:specializationOf :sp ;
prov:wasDerivedFrom :sp-snapshot-1 ;
new:hasUpdateQuery
"
" .
INSERT DATA {
:sp foaf:givenName 'Silvio' ;
foaf:familyName 'Peroni' } ;
DELETE DATA {
:sp foaf:name 'Silvio Peroni' }
Time
T1
T2
A snapshot records the composition
of the entity it specialises (i.e. the set
of statements using such entity as
subject) at a fixed point in time
Advantages
• Easy to retrieve the current statements of an entity, since they
are those currently available in the dataset
• It is possible to restore the entity to a certain snapshot si by
applying the inverse operations (i.e. deletions instead of
insertions and vice versa) of all the update queries from the most
recent snapshot sn to si+1
– For instance, to get back to the status recorded by the first snapshot of
the previous example, we have to run all the inverse operations of the
update query specified in the second snapshot:



INSERT DATA { :sp foaf:name 'Silvio Peroni' } ;

DELETE DATA { 

:sp 

foaf:givenName 'Silvio' ; 

foaf:familyName 'Peroni' }
Implementation in the OCC
• We use:
– PROV-O
– PROV-DC, an extension of PROV-O mapping it with DC
– OpenCitations Ontology (OCO), which defines oco:hasUpdateQuery
• Each entity in the OCC tracks provenance information about:
– snapshot of entity metadata (prov:Entity), a particular snapshot recording the
metadata associated with an individual entity at a particular time
– curatorial activity (prov:Activity), a curatorial activity relating to that entity
✦ creation (prov:Create), the activity of creating a new entity with statements
✦ modification (prov:Modify), the activity of adding/removing statements of an entity
✦ merging (prov:Replace), the activity of unifying the statements relating to two entites
– provenance agent (prov:Agent), a person, organisation or process, that is
involved in some way in the creation of an entity (e.g. Crossref)
– curatorial role (prov:Association), a particular role held by a provenance
agent with respect to a curatorial activity (e.g. OCC curator, metadata source)
An example
br:525205
a fabio:Expression , fabio:JournalArticle ;
dcterms:title "The Electronic Patient ..." ;
datacite:hasIdentifier
id:816997 , id:816998 , ... ;
fabio:hasPublicationYear "2016"^^xsd:gYear ;
pro:isDocumentContextFor
ar:1591190 , ar:1591191 , ... ;
frbr:embodiment re:217773 ;
frbr:part be:727446 , be:727447 , ... .
se:1 a prov:Entity ;
prov:generatedAtTime "2016-08-08T22:25:48"^^xsd:dateTime ;
prov:hadPrimarySource
<http://api.crossref.org/works/10.2196/mhealth.5331> ;
prov:specializationOf br:525205 ;
prov:wasGeneratedBy ca:1 .
ca:1 a prov:Activity, prov:Create ;
dcterms:description
"The entity 'https://w3id.org/oc/corpus/br/525205'
has been created." ;
prov:qualifiedAssociation cr:1 , cr:2 .
cr:1 a prov:Association ;
prov:agent pa:1 ;
prov:hadRole oco:occ-curator .
pa:1 a prov:Agent ;
foaf:name "SPACIN CrossrefProcessor" . …
Data Provenance
br:525205
cites:cites br:1095420 , br:1095421 , ... ;
frbr:part be:727446 , be:727447 , ... .
se:1
prov:invalidatedAtTime "2016-08-29T22:42:06"^^xsd:dateTime ;
prov:wasInvalidatedBy ca:2 .
se:2 a prov:Entity ;
prov:generatedAtTime "2016-08-29T22:42:06"^^xsd:dateTime ;
prov:hadPrimarySource <http://www.ebi.ac.uk/europepmc/
webservices/rest/PMC4911509/fullTextXML> ;
prov:specializationOf br:525205 ;
prov:wasDerivedFrom se:1 ;
prov:wasGeneratedBy ca:2 ;
oco:hasUpdateQuery "INSERT DATA {
GRAPH <https://w3id.org/oc/corpus/br/> {
br:525205
cito:cites br:1095459 , br:525205 , ... ;
frbr:part be:727491 , be:727452 , ... } }" . …
Time
T1
T2
Conclusions
• Approach for keeping track of changes in RDF data inspired
by existing implementations in the Document Engineering
domain
– PROV-O
– SPARQL UPDATE (INSERT/DELETE DATA)
• Implementation: provenance data in the OpenCitations Corpus
– PROV-DC
– OpenCitations Ontology
• Future works
– automatic tools for snapshot restoration
– handling non-automatic modification to the OpenCitations Corpus
(e.g. curation by humans)
Silvio Peroni
silvio.peroni@unibo.it - @essepuntato
Digital And Semantic Publishing Laboratory
Department of Computer Science and Engineering
Alma Mater Studiorum – Università di Bologna
Bologna, Italy
dasplab.cs.unibo.it – www.unibo.it

More Related Content

What's hot

鏈結資料在圖書館的應用20131107
鏈結資料在圖書館的應用20131107鏈結資料在圖書館的應用20131107
鏈結資料在圖書館的應用20131107
皓仁 柯
 
Multilingual presentation ifla 2013 08-19
Multilingual presentation ifla 2013 08-19Multilingual presentation ifla 2013 08-19
Multilingual presentation ifla 2013 08-19
Janifer Gatenby
 
Towards a Machine-Actionable Scholarly Communication System
Towards a Machine-Actionable Scholarly Communication SystemTowards a Machine-Actionable Scholarly Communication System
Towards a Machine-Actionable Scholarly Communication System
Herbert Van de Sompel
 
2009 0807 Lod Gmod
2009 0807 Lod Gmod2009 0807 Lod Gmod
2009 0807 Lod Gmod
Jun Zhao
 
PID Signposting Pattern
PID Signposting PatternPID Signposting Pattern
PID Signposting Pattern
Herbert Van de Sompel
 
Hiberlink: Investigating Reference Rot, December 2013
Hiberlink: Investigating Reference Rot, December 2013Hiberlink: Investigating Reference Rot, December 2013
Hiberlink: Investigating Reference Rot, December 2013
Herbert Van de Sompel
 
How Libraries Use Publisher Metadata Redux (Steven Shadle)
How Libraries Use Publisher Metadata Redux (Steven Shadle)How Libraries Use Publisher Metadata Redux (Steven Shadle)
How Libraries Use Publisher Metadata Redux (Steven Shadle)
Charleston Conference
 
Signposting Overview
Signposting OverviewSignposting Overview
Signposting Overview
Herbert Van de Sompel
 
Open Annotation Model
Open Annotation ModelOpen Annotation Model
Open Annotation Model
Paolo Ciccarese
 
FAIR Signposting: A KISS Approach to a Burning Issue
FAIR Signposting: A KISS Approach to a Burning IssueFAIR Signposting: A KISS Approach to a Burning Issue
FAIR Signposting: A KISS Approach to a Burning Issue
Herbert Van de Sompel
 
towards interoperable archives: the Universal Preprint Service initiative
towards interoperable archives:  the Universal Preprint Service initiativetowards interoperable archives:  the Universal Preprint Service initiative
towards interoperable archives: the Universal Preprint Service initiative
Herbert Van de Sompel
 
DBpedia Archive using Memento, Triple Pattern Fragments, and HDT
DBpedia Archive using Memento, Triple Pattern Fragments, and HDTDBpedia Archive using Memento, Triple Pattern Fragments, and HDT
DBpedia Archive using Memento, Triple Pattern Fragments, and HDT
Herbert Van de Sompel
 
Signposting for Repositories
Signposting for RepositoriesSignposting for Repositories
Signposting for Repositories
Martin Klein
 
A Perspective on Archiving the Scholarly Record
A Perspective on Archiving the Scholarly RecordA Perspective on Archiving the Scholarly Record
A Perspective on Archiving the Scholarly Record
Herbert Van de Sompel
 
Best Practices for Descriptive Metadata for Web Archiving
Best Practices for Descriptive Metadata for Web ArchivingBest Practices for Descriptive Metadata for Web Archiving
Best Practices for Descriptive Metadata for Web Archiving
OCLC
 
Interoperability for web based scholarship
Interoperability for web based scholarshipInteroperability for web based scholarship
Interoperability for web based scholarship
Herbert Van de Sompel
 
Web-scale Discovery Implementation with the End User in Mind (SLA 2012)
Web-scale Discovery Implementation with the End User in Mind (SLA 2012)Web-scale Discovery Implementation with the End User in Mind (SLA 2012)
Web-scale Discovery Implementation with the End User in Mind (SLA 2012)
Rafal Kasprowski
 
A Clean Slate?
A Clean Slate?A Clean Slate?
A Clean Slate?
Herbert Van de Sompel
 
Open Annotation Collaboration Introduction
Open Annotation Collaboration IntroductionOpen Annotation Collaboration Introduction
Open Annotation Collaboration Introduction
Timothy Cole
 
EDS Web-scale Panel (Preprint), 2012 Charleston Conference
EDS Web-scale Panel (Preprint), 2012 Charleston ConferenceEDS Web-scale Panel (Preprint), 2012 Charleston Conference
EDS Web-scale Panel (Preprint), 2012 Charleston Conference
Rafal Kasprowski
 

What's hot (20)

鏈結資料在圖書館的應用20131107
鏈結資料在圖書館的應用20131107鏈結資料在圖書館的應用20131107
鏈結資料在圖書館的應用20131107
 
Multilingual presentation ifla 2013 08-19
Multilingual presentation ifla 2013 08-19Multilingual presentation ifla 2013 08-19
Multilingual presentation ifla 2013 08-19
 
Towards a Machine-Actionable Scholarly Communication System
Towards a Machine-Actionable Scholarly Communication SystemTowards a Machine-Actionable Scholarly Communication System
Towards a Machine-Actionable Scholarly Communication System
 
2009 0807 Lod Gmod
2009 0807 Lod Gmod2009 0807 Lod Gmod
2009 0807 Lod Gmod
 
PID Signposting Pattern
PID Signposting PatternPID Signposting Pattern
PID Signposting Pattern
 
Hiberlink: Investigating Reference Rot, December 2013
Hiberlink: Investigating Reference Rot, December 2013Hiberlink: Investigating Reference Rot, December 2013
Hiberlink: Investigating Reference Rot, December 2013
 
How Libraries Use Publisher Metadata Redux (Steven Shadle)
How Libraries Use Publisher Metadata Redux (Steven Shadle)How Libraries Use Publisher Metadata Redux (Steven Shadle)
How Libraries Use Publisher Metadata Redux (Steven Shadle)
 
Signposting Overview
Signposting OverviewSignposting Overview
Signposting Overview
 
Open Annotation Model
Open Annotation ModelOpen Annotation Model
Open Annotation Model
 
FAIR Signposting: A KISS Approach to a Burning Issue
FAIR Signposting: A KISS Approach to a Burning IssueFAIR Signposting: A KISS Approach to a Burning Issue
FAIR Signposting: A KISS Approach to a Burning Issue
 
towards interoperable archives: the Universal Preprint Service initiative
towards interoperable archives:  the Universal Preprint Service initiativetowards interoperable archives:  the Universal Preprint Service initiative
towards interoperable archives: the Universal Preprint Service initiative
 
DBpedia Archive using Memento, Triple Pattern Fragments, and HDT
DBpedia Archive using Memento, Triple Pattern Fragments, and HDTDBpedia Archive using Memento, Triple Pattern Fragments, and HDT
DBpedia Archive using Memento, Triple Pattern Fragments, and HDT
 
Signposting for Repositories
Signposting for RepositoriesSignposting for Repositories
Signposting for Repositories
 
A Perspective on Archiving the Scholarly Record
A Perspective on Archiving the Scholarly RecordA Perspective on Archiving the Scholarly Record
A Perspective on Archiving the Scholarly Record
 
Best Practices for Descriptive Metadata for Web Archiving
Best Practices for Descriptive Metadata for Web ArchivingBest Practices for Descriptive Metadata for Web Archiving
Best Practices for Descriptive Metadata for Web Archiving
 
Interoperability for web based scholarship
Interoperability for web based scholarshipInteroperability for web based scholarship
Interoperability for web based scholarship
 
Web-scale Discovery Implementation with the End User in Mind (SLA 2012)
Web-scale Discovery Implementation with the End User in Mind (SLA 2012)Web-scale Discovery Implementation with the End User in Mind (SLA 2012)
Web-scale Discovery Implementation with the End User in Mind (SLA 2012)
 
A Clean Slate?
A Clean Slate?A Clean Slate?
A Clean Slate?
 
Open Annotation Collaboration Introduction
Open Annotation Collaboration IntroductionOpen Annotation Collaboration Introduction
Open Annotation Collaboration Introduction
 
EDS Web-scale Panel (Preprint), 2012 Charleston Conference
EDS Web-scale Panel (Preprint), 2012 Charleston ConferenceEDS Web-scale Panel (Preprint), 2012 Charleston Conference
EDS Web-scale Panel (Preprint), 2012 Charleston Conference
 

Viewers also liked

Mediatización e interracciones sociales
Mediatización e interracciones socialesMediatización e interracciones sociales
Mediatización e interracciones sociales
7121988
 
Tipo de datos de oracle base datos
Tipo de datos de oracle base datosTipo de datos de oracle base datos
Tipo de datos de oracle base datos
Paul Vega
 
Mensaje para las madres
Mensaje para las madresMensaje para las madres
Mensaje para las madres
Carlos Andres Perez Vasquez
 
Musica y baile en los afrocolombianos
Musica y baile en los afrocolombianosMusica y baile en los afrocolombianos
Musica y baile en los afrocolombianos
Carlos Andres Perez Vasquez
 
Clase tatiana adriana
Clase tatiana adrianaClase tatiana adriana
Clase tatiana adriana
tatiana-vasquez
 
Vaidas Morkevičius - Open Data for Enhancing the Quality of Research
Vaidas Morkevičius - Open Data for Enhancing the Quality of ResearchVaidas Morkevičius - Open Data for Enhancing the Quality of Research
Vaidas Morkevičius - Open Data for Enhancing the Quality of Research
Aidis Stukas
 
Saulius Gražulis The Crystalography Open Database
Saulius Gražulis  The Crystalography Open DatabaseSaulius Gražulis  The Crystalography Open Database
Saulius Gražulis The Crystalography Open Database
Aidis Stukas
 
Clasificación de invertebrados
Clasificación de invertebradosClasificación de invertebrados
Clasificación de invertebrados
Miguel Efren Torres Chamba
 
Manual empresas eco turisticas
Manual empresas eco turisticasManual empresas eco turisticas
Manual empresas eco turisticas
S. Marlene Ayala
 
Comunicacion interactiva christopher
Comunicacion interactiva christopherComunicacion interactiva christopher
Comunicacion interactiva christopher
Christopher M. D' Hoy
 
Docker Mentor Week 2016 - Medan
Docker Mentor Week 2016 - MedanDocker Mentor Week 2016 - Medan
Docker Mentor Week 2016 - Medan
Albert Suwandhi
 
Presentazione di Gruppo Vectorys
Presentazione di Gruppo VectorysPresentazione di Gruppo Vectorys
Presentazione di Gruppo Vectorys
Bechir Added
 
Group 1 fruits
Group 1 fruitsGroup 1 fruits
Group 1 fruits
Aminurrashid Ibrahim
 
Msc. hugo ortiz
Msc. hugo ortizMsc. hugo ortiz
Msc. hugo ortiz
Manuel Cisneros
 
Mantas Zimnickas - How Open is Lithuanian Government data? atviriduomenys.lt
Mantas Zimnickas - How Open is Lithuanian Government data? atviriduomenys.lt Mantas Zimnickas - How Open is Lithuanian Government data? atviriduomenys.lt
Mantas Zimnickas - How Open is Lithuanian Government data? atviriduomenys.lt
Aidis Stukas
 
Muñoz ximena camstudio
Muñoz ximena camstudioMuñoz ximena camstudio
Muñoz ximena camstudio
Ximena Muñoz
 
Sesión informativa a
Sesión informativa aSesión informativa a
Sesión informativa a
jacke04
 
Presentación de escuela indigena
Presentación de escuela indigenaPresentación de escuela indigena
Presentación de escuela indigena
Jazmin Juarez
 
Clase tatiana adriana
Clase tatiana adrianaClase tatiana adriana
Clase tatiana adriana
tatiana-vasquez
 
Clase tatiana adriana
Clase tatiana adrianaClase tatiana adriana
Clase tatiana adriana
tatiana-vasquez
 

Viewers also liked (20)

Mediatización e interracciones sociales
Mediatización e interracciones socialesMediatización e interracciones sociales
Mediatización e interracciones sociales
 
Tipo de datos de oracle base datos
Tipo de datos de oracle base datosTipo de datos de oracle base datos
Tipo de datos de oracle base datos
 
Mensaje para las madres
Mensaje para las madresMensaje para las madres
Mensaje para las madres
 
Musica y baile en los afrocolombianos
Musica y baile en los afrocolombianosMusica y baile en los afrocolombianos
Musica y baile en los afrocolombianos
 
Clase tatiana adriana
Clase tatiana adrianaClase tatiana adriana
Clase tatiana adriana
 
Vaidas Morkevičius - Open Data for Enhancing the Quality of Research
Vaidas Morkevičius - Open Data for Enhancing the Quality of ResearchVaidas Morkevičius - Open Data for Enhancing the Quality of Research
Vaidas Morkevičius - Open Data for Enhancing the Quality of Research
 
Saulius Gražulis The Crystalography Open Database
Saulius Gražulis  The Crystalography Open DatabaseSaulius Gražulis  The Crystalography Open Database
Saulius Gražulis The Crystalography Open Database
 
Clasificación de invertebrados
Clasificación de invertebradosClasificación de invertebrados
Clasificación de invertebrados
 
Manual empresas eco turisticas
Manual empresas eco turisticasManual empresas eco turisticas
Manual empresas eco turisticas
 
Comunicacion interactiva christopher
Comunicacion interactiva christopherComunicacion interactiva christopher
Comunicacion interactiva christopher
 
Docker Mentor Week 2016 - Medan
Docker Mentor Week 2016 - MedanDocker Mentor Week 2016 - Medan
Docker Mentor Week 2016 - Medan
 
Presentazione di Gruppo Vectorys
Presentazione di Gruppo VectorysPresentazione di Gruppo Vectorys
Presentazione di Gruppo Vectorys
 
Group 1 fruits
Group 1 fruitsGroup 1 fruits
Group 1 fruits
 
Msc. hugo ortiz
Msc. hugo ortizMsc. hugo ortiz
Msc. hugo ortiz
 
Mantas Zimnickas - How Open is Lithuanian Government data? atviriduomenys.lt
Mantas Zimnickas - How Open is Lithuanian Government data? atviriduomenys.lt Mantas Zimnickas - How Open is Lithuanian Government data? atviriduomenys.lt
Mantas Zimnickas - How Open is Lithuanian Government data? atviriduomenys.lt
 
Muñoz ximena camstudio
Muñoz ximena camstudioMuñoz ximena camstudio
Muñoz ximena camstudio
 
Sesión informativa a
Sesión informativa aSesión informativa a
Sesión informativa a
 
Presentación de escuela indigena
Presentación de escuela indigenaPresentación de escuela indigena
Presentación de escuela indigena
 
Clase tatiana adriana
Clase tatiana adrianaClase tatiana adriana
Clase tatiana adriana
 
Clase tatiana adriana
Clase tatiana adrianaClase tatiana adriana
Clase tatiana adriana
 

Similar to A document-inspired way for tracking changes of RDF data - The case of the OpenCitations Corpus

FDO as building block for digitization technology stacks
FDO as building block for digitization technology stacksFDO as building block for digitization technology stacks
FDO as building block for digitization technology stacks
Raul Palma
 
Introduction to Research Objects - Collaboartions Workshop 2015, Oxford
Introduction to Research Objects - Collaboartions Workshop 2015, OxfordIntroduction to Research Objects - Collaboartions Workshop 2015, Oxford
Introduction to Research Objects - Collaboartions Workshop 2015, Oxford
matthewgamble
 
Freedom for bibliographic references: OpenCitations arise
Freedom for bibliographic references: OpenCitations ariseFreedom for bibliographic references: OpenCitations arise
Freedom for bibliographic references: OpenCitations arise
University of Bologna
 
Exchange of usage metadata in a network of institutional repositories: the ...
Exchange of usage metadata in a network of institutional repositories: the ...Exchange of usage metadata in a network of institutional repositories: the ...
Exchange of usage metadata in a network of institutional repositories: the ...
Benoit Pauwels
 
Exchange of usage metadata in a network of institutional repositories: the ca...
Exchange of usage metadata in a network of institutional repositories: the ca...Exchange of usage metadata in a network of institutional repositories: the ca...
Exchange of usage metadata in a network of institutional repositories: the ca...
ULB - Bibliothèques
 
Tracking Changes through EARMARK: a Theoretical Perspective and an Implementa...
Tracking Changes through EARMARK: a Theoretical Perspective and an Implementa...Tracking Changes through EARMARK: a Theoretical Perspective and an Implementa...
Tracking Changes through EARMARK: a Theoretical Perspective and an Implementa...
University of Bologna
 
20160818 Semantics and Linkage of Archived Catalogs
20160818 Semantics and Linkage of Archived Catalogs20160818 Semantics and Linkage of Archived Catalogs
20160818 Semantics and Linkage of Archived Catalogs
andrea huang
 
DSpace-CRIS: a CRIS enhanced repository platform
DSpace-CRIS: a CRIS enhanced repository platformDSpace-CRIS: a CRIS enhanced repository platform
DSpace-CRIS: a CRIS enhanced repository platform
Andrea Bollini
 
FAIR Workflows and Research Objects get a Workout
FAIR Workflows and Research Objects get a Workout FAIR Workflows and Research Objects get a Workout
FAIR Workflows and Research Objects get a Workout
Carole Goble
 
The nature.com ontologies portal: nature.com/ontologies
The nature.com ontologies portal: nature.com/ontologiesThe nature.com ontologies portal: nature.com/ontologies
The nature.com ontologies portal: nature.com/ontologies
Tony Hammond
 
Bringing semantic publishing into TEI: ideas and pointers
Bringing semantic publishing into TEI: ideas and pointersBringing semantic publishing into TEI: ideas and pointers
Bringing semantic publishing into TEI: ideas and pointers
University of Bologna
 
Next Generation Technical Services May 2009 Calhoun
Next Generation Technical Services May 2009 CalhounNext Generation Technical Services May 2009 Calhoun
Next Generation Technical Services May 2009 Calhoun
Karen S Calhoun
 
2013 06-24 Wf4Ever: Annotating research objects (PDF)
2013 06-24 Wf4Ever: Annotating research objects (PDF)2013 06-24 Wf4Ever: Annotating research objects (PDF)
2013 06-24 Wf4Ever: Annotating research objects (PDF)
Stian Soiland-Reyes
 
2013 06-24 Wf4Ever: Annotating research objects (PPTX)
2013 06-24 Wf4Ever: Annotating research objects (PPTX)2013 06-24 Wf4Ever: Annotating research objects (PPTX)
2013 06-24 Wf4Ever: Annotating research objects (PPTX)
Stian Soiland-Reyes
 
“Open Data Web” – A Linked Open Data Repository Built with CKAN
“Open Data Web” – A Linked Open Data Repository Built with CKAN“Open Data Web” – A Linked Open Data Repository Built with CKAN
“Open Data Web” – A Linked Open Data Repository Built with CKAN
Chengjen Lee
 
RELIANCE ROHub hackathon
RELIANCE ROHub hackathonRELIANCE ROHub hackathon
RELIANCE ROHub hackathon
Raul Palma
 
20161004 “Open Data Web” – A Linked Open Data Repository Built with CKAN
20161004 “Open Data Web” – A Linked Open Data Repository Built with CKAN20161004 “Open Data Web” – A Linked Open Data Repository Built with CKAN
20161004 “Open Data Web” – A Linked Open Data Repository Built with CKAN
andrea huang
 
Author's workflow and the role of open access
Author's workflow and the role of open accessAuthor's workflow and the role of open access
Author's workflow and the role of open access
Paola Gargiulo
 
Making Use of the Linked Open Data Services for OpenAIRE (DI4R 2016 tutorial ...
Making Use of the Linked Open Data Services for OpenAIRE (DI4R 2016 tutorial ...Making Use of the Linked Open Data Services for OpenAIRE (DI4R 2016 tutorial ...
Making Use of the Linked Open Data Services for OpenAIRE (DI4R 2016 tutorial ...
OpenAIRE
 
ROHub-Argos integration
ROHub-Argos integrationROHub-Argos integration
ROHub-Argos integration
Raul Palma
 

Similar to A document-inspired way for tracking changes of RDF data - The case of the OpenCitations Corpus (20)

FDO as building block for digitization technology stacks
FDO as building block for digitization technology stacksFDO as building block for digitization technology stacks
FDO as building block for digitization technology stacks
 
Introduction to Research Objects - Collaboartions Workshop 2015, Oxford
Introduction to Research Objects - Collaboartions Workshop 2015, OxfordIntroduction to Research Objects - Collaboartions Workshop 2015, Oxford
Introduction to Research Objects - Collaboartions Workshop 2015, Oxford
 
Freedom for bibliographic references: OpenCitations arise
Freedom for bibliographic references: OpenCitations ariseFreedom for bibliographic references: OpenCitations arise
Freedom for bibliographic references: OpenCitations arise
 
Exchange of usage metadata in a network of institutional repositories: the ...
Exchange of usage metadata in a network of institutional repositories: the ...Exchange of usage metadata in a network of institutional repositories: the ...
Exchange of usage metadata in a network of institutional repositories: the ...
 
Exchange of usage metadata in a network of institutional repositories: the ca...
Exchange of usage metadata in a network of institutional repositories: the ca...Exchange of usage metadata in a network of institutional repositories: the ca...
Exchange of usage metadata in a network of institutional repositories: the ca...
 
Tracking Changes through EARMARK: a Theoretical Perspective and an Implementa...
Tracking Changes through EARMARK: a Theoretical Perspective and an Implementa...Tracking Changes through EARMARK: a Theoretical Perspective and an Implementa...
Tracking Changes through EARMARK: a Theoretical Perspective and an Implementa...
 
20160818 Semantics and Linkage of Archived Catalogs
20160818 Semantics and Linkage of Archived Catalogs20160818 Semantics and Linkage of Archived Catalogs
20160818 Semantics and Linkage of Archived Catalogs
 
DSpace-CRIS: a CRIS enhanced repository platform
DSpace-CRIS: a CRIS enhanced repository platformDSpace-CRIS: a CRIS enhanced repository platform
DSpace-CRIS: a CRIS enhanced repository platform
 
FAIR Workflows and Research Objects get a Workout
FAIR Workflows and Research Objects get a Workout FAIR Workflows and Research Objects get a Workout
FAIR Workflows and Research Objects get a Workout
 
The nature.com ontologies portal: nature.com/ontologies
The nature.com ontologies portal: nature.com/ontologiesThe nature.com ontologies portal: nature.com/ontologies
The nature.com ontologies portal: nature.com/ontologies
 
Bringing semantic publishing into TEI: ideas and pointers
Bringing semantic publishing into TEI: ideas and pointersBringing semantic publishing into TEI: ideas and pointers
Bringing semantic publishing into TEI: ideas and pointers
 
Next Generation Technical Services May 2009 Calhoun
Next Generation Technical Services May 2009 CalhounNext Generation Technical Services May 2009 Calhoun
Next Generation Technical Services May 2009 Calhoun
 
2013 06-24 Wf4Ever: Annotating research objects (PDF)
2013 06-24 Wf4Ever: Annotating research objects (PDF)2013 06-24 Wf4Ever: Annotating research objects (PDF)
2013 06-24 Wf4Ever: Annotating research objects (PDF)
 
2013 06-24 Wf4Ever: Annotating research objects (PPTX)
2013 06-24 Wf4Ever: Annotating research objects (PPTX)2013 06-24 Wf4Ever: Annotating research objects (PPTX)
2013 06-24 Wf4Ever: Annotating research objects (PPTX)
 
“Open Data Web” – A Linked Open Data Repository Built with CKAN
“Open Data Web” – A Linked Open Data Repository Built with CKAN“Open Data Web” – A Linked Open Data Repository Built with CKAN
“Open Data Web” – A Linked Open Data Repository Built with CKAN
 
RELIANCE ROHub hackathon
RELIANCE ROHub hackathonRELIANCE ROHub hackathon
RELIANCE ROHub hackathon
 
20161004 “Open Data Web” – A Linked Open Data Repository Built with CKAN
20161004 “Open Data Web” – A Linked Open Data Repository Built with CKAN20161004 “Open Data Web” – A Linked Open Data Repository Built with CKAN
20161004 “Open Data Web” – A Linked Open Data Repository Built with CKAN
 
Author's workflow and the role of open access
Author's workflow and the role of open accessAuthor's workflow and the role of open access
Author's workflow and the role of open access
 
Making Use of the Linked Open Data Services for OpenAIRE (DI4R 2016 tutorial ...
Making Use of the Linked Open Data Services for OpenAIRE (DI4R 2016 tutorial ...Making Use of the Linked Open Data Services for OpenAIRE (DI4R 2016 tutorial ...
Making Use of the Linked Open Data Services for OpenAIRE (DI4R 2016 tutorial ...
 
ROHub-Argos integration
ROHub-Argos integrationROHub-Argos integration
ROHub-Argos integration
 

More from University of Bologna

A Simplified Agile Methodology for Ontology Development
A Simplified Agile Methodology for Ontology DevelopmentA Simplified Agile Methodology for Ontology Development
A Simplified Agile Methodology for Ontology Development
University of Bologna
 
FOOD: FOod in Open Data
FOOD: FOod in Open DataFOOD: FOod in Open Data
FOOD: FOod in Open Data
University of Bologna
 
A pattern-based ontology for describing publishing workflows
A pattern-based ontology for describing publishing workflowsA pattern-based ontology for describing publishing workflows
A pattern-based ontology for describing publishing workflows
University of Bologna
 
Semantic lenses to bring digital and semantic publishing together
Semantic lenses to bring digital and semantic publishing togetherSemantic lenses to bring digital and semantic publishing together
Semantic lenses to bring digital and semantic publishing together
University of Bologna
 
Zeri e LODE
: Extracting the Zeri photo archive to Linked Open Data: formaliz...
Zeri e LODE
: Extracting the Zeri photo archive to Linked Open Data: formaliz...Zeri e LODE
: Extracting the Zeri photo archive to Linked Open Data: formaliz...
Zeri e LODE
: Extracting the Zeri photo archive to Linked Open Data: formaliz...
University of Bologna
 
Characterising citations in scholarly articles: an experiment
Characterising citations in scholarly articles: an experimentCharacterising citations in scholarly articles: an experiment
Characterising citations in scholarly articles: an experiment
University of Bologna
 
Towards the automatic identification of the nature of citations
Towards the automatic identification of the nature of citationsTowards the automatic identification of the nature of citations
Towards the automatic identification of the nature of citations
University of Bologna
 
The Live OWL Documentation Environment: a tool for the automatic generation o...
The Live OWL Documentation Environment: a tool for the automatic generation o...The Live OWL Documentation Environment: a tool for the automatic generation o...
The Live OWL Documentation Environment: a tool for the automatic generation o...
University of Bologna
 
Scholarly publishing and Linked Data: describing roles, statuses, temporal an...
Scholarly publishing and Linked Data: describing roles, statuses, temporal an...Scholarly publishing and Linked Data: describing roles, statuses, temporal an...
Scholarly publishing and Linked Data: describing roles, statuses, temporal an...
University of Bologna
 
Embedding semantic annotations within texts: the FRETTA approach
Embedding semantic annotations within texts: the FRETTA approachEmbedding semantic annotations within texts: the FRETTA approach
Embedding semantic annotations within texts: the FRETTA approach
University of Bologna
 
Dealing with Markup Semantics
Dealing with Markup SemanticsDealing with Markup Semantics
Dealing with Markup Semantics
University of Bologna
 
Handling Markup Overlaps Using OWL
Handling Markup Overlaps Using OWLHandling Markup Overlaps Using OWL
Handling Markup Overlaps Using OWL
University of Bologna
 

More from University of Bologna (12)

A Simplified Agile Methodology for Ontology Development
A Simplified Agile Methodology for Ontology DevelopmentA Simplified Agile Methodology for Ontology Development
A Simplified Agile Methodology for Ontology Development
 
FOOD: FOod in Open Data
FOOD: FOod in Open DataFOOD: FOod in Open Data
FOOD: FOod in Open Data
 
A pattern-based ontology for describing publishing workflows
A pattern-based ontology for describing publishing workflowsA pattern-based ontology for describing publishing workflows
A pattern-based ontology for describing publishing workflows
 
Semantic lenses to bring digital and semantic publishing together
Semantic lenses to bring digital and semantic publishing togetherSemantic lenses to bring digital and semantic publishing together
Semantic lenses to bring digital and semantic publishing together
 
Zeri e LODE
: Extracting the Zeri photo archive to Linked Open Data: formaliz...
Zeri e LODE
: Extracting the Zeri photo archive to Linked Open Data: formaliz...Zeri e LODE
: Extracting the Zeri photo archive to Linked Open Data: formaliz...
Zeri e LODE
: Extracting the Zeri photo archive to Linked Open Data: formaliz...
 
Characterising citations in scholarly articles: an experiment
Characterising citations in scholarly articles: an experimentCharacterising citations in scholarly articles: an experiment
Characterising citations in scholarly articles: an experiment
 
Towards the automatic identification of the nature of citations
Towards the automatic identification of the nature of citationsTowards the automatic identification of the nature of citations
Towards the automatic identification of the nature of citations
 
The Live OWL Documentation Environment: a tool for the automatic generation o...
The Live OWL Documentation Environment: a tool for the automatic generation o...The Live OWL Documentation Environment: a tool for the automatic generation o...
The Live OWL Documentation Environment: a tool for the automatic generation o...
 
Scholarly publishing and Linked Data: describing roles, statuses, temporal an...
Scholarly publishing and Linked Data: describing roles, statuses, temporal an...Scholarly publishing and Linked Data: describing roles, statuses, temporal an...
Scholarly publishing and Linked Data: describing roles, statuses, temporal an...
 
Embedding semantic annotations within texts: the FRETTA approach
Embedding semantic annotations within texts: the FRETTA approachEmbedding semantic annotations within texts: the FRETTA approach
Embedding semantic annotations within texts: the FRETTA approach
 
Dealing with Markup Semantics
Dealing with Markup SemanticsDealing with Markup Semantics
Dealing with Markup Semantics
 
Handling Markup Overlaps Using OWL
Handling Markup Overlaps Using OWLHandling Markup Overlaps Using OWL
Handling Markup Overlaps Using OWL
 

Recently uploaded

AI + Data Community Tour - Build the Next Generation of Apps with the Einstei...
AI + Data Community Tour - Build the Next Generation of Apps with the Einstei...AI + Data Community Tour - Build the Next Generation of Apps with the Einstei...
AI + Data Community Tour - Build the Next Generation of Apps with the Einstei...
Paris Salesforce Developer Group
 
一比一原版(爱大毕业证书)爱荷华大学毕业证如何办理
一比一原版(爱大毕业证书)爱荷华大学毕业证如何办理一比一原版(爱大毕业证书)爱荷华大学毕业证如何办理
一比一原版(爱大毕业证书)爱荷华大学毕业证如何办理
nedcocy
 
132/33KV substation case study Presentation
132/33KV substation case study Presentation132/33KV substation case study Presentation
132/33KV substation case study Presentation
kandramariana6
 
Software Engineering and Project Management - Software Testing + Agile Method...
Software Engineering and Project Management - Software Testing + Agile Method...Software Engineering and Project Management - Software Testing + Agile Method...
Software Engineering and Project Management - Software Testing + Agile Method...
Prakhyath Rai
 
LLM Fine Tuning with QLoRA Cassandra Lunch 4, presented by Anant
LLM Fine Tuning with QLoRA Cassandra Lunch 4, presented by AnantLLM Fine Tuning with QLoRA Cassandra Lunch 4, presented by Anant
LLM Fine Tuning with QLoRA Cassandra Lunch 4, presented by Anant
Anant Corporation
 
Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...
Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...
Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...
shadow0702a
 
Applications of artificial Intelligence in Mechanical Engineering.pdf
Applications of artificial Intelligence in Mechanical Engineering.pdfApplications of artificial Intelligence in Mechanical Engineering.pdf
Applications of artificial Intelligence in Mechanical Engineering.pdf
Atif Razi
 
Mechanical Engineering on AAI Summer Training Report-003.pdf
Mechanical Engineering on AAI Summer Training Report-003.pdfMechanical Engineering on AAI Summer Training Report-003.pdf
Mechanical Engineering on AAI Summer Training Report-003.pdf
21UME003TUSHARDEB
 
Computational Engineering IITH Presentation
Computational Engineering IITH PresentationComputational Engineering IITH Presentation
Computational Engineering IITH Presentation
co23btech11018
 
Null Bangalore | Pentesters Approach to AWS IAM
Null Bangalore | Pentesters Approach to AWS IAMNull Bangalore | Pentesters Approach to AWS IAM
Null Bangalore | Pentesters Approach to AWS IAM
Divyanshu
 
2008 BUILDING CONSTRUCTION Illustrated - Ching Chapter 02 The Building.pdf
2008 BUILDING CONSTRUCTION Illustrated - Ching Chapter 02 The Building.pdf2008 BUILDING CONSTRUCTION Illustrated - Ching Chapter 02 The Building.pdf
2008 BUILDING CONSTRUCTION Illustrated - Ching Chapter 02 The Building.pdf
Yasser Mahgoub
 
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024
Sinan KOZAK
 
一比一原版(CalArts毕业证)加利福尼亚艺术学院毕业证如何办理
一比一原版(CalArts毕业证)加利福尼亚艺术学院毕业证如何办理一比一原版(CalArts毕业证)加利福尼亚艺术学院毕业证如何办理
一比一原版(CalArts毕业证)加利福尼亚艺术学院毕业证如何办理
ecqow
 
Software Engineering and Project Management - Introduction, Modeling Concepts...
Software Engineering and Project Management - Introduction, Modeling Concepts...Software Engineering and Project Management - Introduction, Modeling Concepts...
Software Engineering and Project Management - Introduction, Modeling Concepts...
Prakhyath Rai
 
Gas agency management system project report.pdf
Gas agency management system project report.pdfGas agency management system project report.pdf
Gas agency management system project report.pdf
Kamal Acharya
 
4. Mosca vol I -Fisica-Tipler-5ta-Edicion-Vol-1.pdf
4. Mosca vol I -Fisica-Tipler-5ta-Edicion-Vol-1.pdf4. Mosca vol I -Fisica-Tipler-5ta-Edicion-Vol-1.pdf
4. Mosca vol I -Fisica-Tipler-5ta-Edicion-Vol-1.pdf
Gino153088
 
Properties Railway Sleepers and Test.pptx
Properties Railway Sleepers and Test.pptxProperties Railway Sleepers and Test.pptx
Properties Railway Sleepers and Test.pptx
MDSABBIROJJAMANPAYEL
 
Comparative analysis between traditional aquaponics and reconstructed aquapon...
Comparative analysis between traditional aquaponics and reconstructed aquapon...Comparative analysis between traditional aquaponics and reconstructed aquapon...
Comparative analysis between traditional aquaponics and reconstructed aquapon...
bijceesjournal
 
TIME TABLE MANAGEMENT SYSTEM testing.pptx
TIME TABLE MANAGEMENT SYSTEM testing.pptxTIME TABLE MANAGEMENT SYSTEM testing.pptx
TIME TABLE MANAGEMENT SYSTEM testing.pptx
CVCSOfficial
 
Advanced control scheme of doubly fed induction generator for wind turbine us...
Advanced control scheme of doubly fed induction generator for wind turbine us...Advanced control scheme of doubly fed induction generator for wind turbine us...
Advanced control scheme of doubly fed induction generator for wind turbine us...
IJECEIAES
 

Recently uploaded (20)

AI + Data Community Tour - Build the Next Generation of Apps with the Einstei...
AI + Data Community Tour - Build the Next Generation of Apps with the Einstei...AI + Data Community Tour - Build the Next Generation of Apps with the Einstei...
AI + Data Community Tour - Build the Next Generation of Apps with the Einstei...
 
一比一原版(爱大毕业证书)爱荷华大学毕业证如何办理
一比一原版(爱大毕业证书)爱荷华大学毕业证如何办理一比一原版(爱大毕业证书)爱荷华大学毕业证如何办理
一比一原版(爱大毕业证书)爱荷华大学毕业证如何办理
 
132/33KV substation case study Presentation
132/33KV substation case study Presentation132/33KV substation case study Presentation
132/33KV substation case study Presentation
 
Software Engineering and Project Management - Software Testing + Agile Method...
Software Engineering and Project Management - Software Testing + Agile Method...Software Engineering and Project Management - Software Testing + Agile Method...
Software Engineering and Project Management - Software Testing + Agile Method...
 
LLM Fine Tuning with QLoRA Cassandra Lunch 4, presented by Anant
LLM Fine Tuning with QLoRA Cassandra Lunch 4, presented by AnantLLM Fine Tuning with QLoRA Cassandra Lunch 4, presented by Anant
LLM Fine Tuning with QLoRA Cassandra Lunch 4, presented by Anant
 
Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...
Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...
Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...
 
Applications of artificial Intelligence in Mechanical Engineering.pdf
Applications of artificial Intelligence in Mechanical Engineering.pdfApplications of artificial Intelligence in Mechanical Engineering.pdf
Applications of artificial Intelligence in Mechanical Engineering.pdf
 
Mechanical Engineering on AAI Summer Training Report-003.pdf
Mechanical Engineering on AAI Summer Training Report-003.pdfMechanical Engineering on AAI Summer Training Report-003.pdf
Mechanical Engineering on AAI Summer Training Report-003.pdf
 
Computational Engineering IITH Presentation
Computational Engineering IITH PresentationComputational Engineering IITH Presentation
Computational Engineering IITH Presentation
 
Null Bangalore | Pentesters Approach to AWS IAM
Null Bangalore | Pentesters Approach to AWS IAMNull Bangalore | Pentesters Approach to AWS IAM
Null Bangalore | Pentesters Approach to AWS IAM
 
2008 BUILDING CONSTRUCTION Illustrated - Ching Chapter 02 The Building.pdf
2008 BUILDING CONSTRUCTION Illustrated - Ching Chapter 02 The Building.pdf2008 BUILDING CONSTRUCTION Illustrated - Ching Chapter 02 The Building.pdf
2008 BUILDING CONSTRUCTION Illustrated - Ching Chapter 02 The Building.pdf
 
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024
 
一比一原版(CalArts毕业证)加利福尼亚艺术学院毕业证如何办理
一比一原版(CalArts毕业证)加利福尼亚艺术学院毕业证如何办理一比一原版(CalArts毕业证)加利福尼亚艺术学院毕业证如何办理
一比一原版(CalArts毕业证)加利福尼亚艺术学院毕业证如何办理
 
Software Engineering and Project Management - Introduction, Modeling Concepts...
Software Engineering and Project Management - Introduction, Modeling Concepts...Software Engineering and Project Management - Introduction, Modeling Concepts...
Software Engineering and Project Management - Introduction, Modeling Concepts...
 
Gas agency management system project report.pdf
Gas agency management system project report.pdfGas agency management system project report.pdf
Gas agency management system project report.pdf
 
4. Mosca vol I -Fisica-Tipler-5ta-Edicion-Vol-1.pdf
4. Mosca vol I -Fisica-Tipler-5ta-Edicion-Vol-1.pdf4. Mosca vol I -Fisica-Tipler-5ta-Edicion-Vol-1.pdf
4. Mosca vol I -Fisica-Tipler-5ta-Edicion-Vol-1.pdf
 
Properties Railway Sleepers and Test.pptx
Properties Railway Sleepers and Test.pptxProperties Railway Sleepers and Test.pptx
Properties Railway Sleepers and Test.pptx
 
Comparative analysis between traditional aquaponics and reconstructed aquapon...
Comparative analysis between traditional aquaponics and reconstructed aquapon...Comparative analysis between traditional aquaponics and reconstructed aquapon...
Comparative analysis between traditional aquaponics and reconstructed aquapon...
 
TIME TABLE MANAGEMENT SYSTEM testing.pptx
TIME TABLE MANAGEMENT SYSTEM testing.pptxTIME TABLE MANAGEMENT SYSTEM testing.pptx
TIME TABLE MANAGEMENT SYSTEM testing.pptx
 
Advanced control scheme of doubly fed induction generator for wind turbine us...
Advanced control scheme of doubly fed induction generator for wind turbine us...Advanced control scheme of doubly fed induction generator for wind turbine us...
Advanced control scheme of doubly fed induction generator for wind turbine us...
 

A document-inspired way for tracking changes of RDF data - The case of the OpenCitations Corpus

  • 1. A document-inspired way for tracking changes of RDF data The case of the OpenCitations Corpus Silvio Peroni, David Shotton, Fabio Vitali 1st Drift-a-LOD Workshop: Detection, Representation and Management of Concept Drift in Linked Open Data
 Bologna, Italy, November 20, 2016 Paper: https://w3id.org/oc/paper/occ-driftalod2016.html
  • 2. The Venice analogy • Island = 
 scholarly publication • Bridge = citation • Current situation: – local travel to the next island is permitted – unrestricted travel over the entire network of bridges requires an expensive season ticket – general populace is excluded https://w3id.org/oc/paper/the-venice-analogy.html
  • 3. Opening the bridges • What – Citation data are one of the main tools used by researchers to gain knowledge about particular topics, and they also serve institutional goals, for example in research assessment • Problem – The most authoritative databases of citation data, Scopus and Web of Science, can only be accessed by paying significant annual access fees – The University of Bologna pays about 6,000,000 euros per year for accessing to digital bibliographic resources • Solution – To create a citation database that freely and legally makes available citation data in an open repository to assist scholars with their academic studies and serve knowledge to the wider public
  • 4. OpenCitations • The OpenCitations Project aims at creating an open repository of scholarly citation data – the OpenCitations Corpus (OCC) – made available under a Creative Commons public domain dedication to provide in RDF accurate citation information (bibliographic references) harvested from the scholarly literature – All scripts are released with Open Source ISC Licence and available on GitHub at http://github.com/essepuntato/opencitations • Currently processing papers available in the PubMedCentral Open Access subset • As of November 20, 2016 the OCC contains 2,076,645 citation links • Six distinct kinds of bibliographic entities – bibliographic resources (citing/cited articles, journals, books, proceedings, etc.) – resource embodiments (format information about bibliographic resources) – bibliographic entries (literal textual entries occurring in the reference lists) – responsible agents (agents having certain roles with respect to the bibliographic resources) – agent roles (author, editor, publisher); – identifiers (DOI, ORCID, PubMedID, URL, etc.) http://opencitations.net
  • 5. Ingestion workflow BEE EuropeanPubMedCentralProcessor Parsing the XML source of PubMed Central Open Access articles. 1 SPACIN Producing JSON with DOI and bib entries. {
 "doi": "10.1590/1414-431x20154655", 
 "localid": "MED-26577845", 
 "curator": "BEE EuropeanPubMedCentralProcessor", 
 "source": "http://www.ebi.ac.uk/europepmc/webservices/rest/PMC4678653/ fullTextXML", "source_provider": "Europe PubMed Central", 
 "pmid": "26577845", 
 "pmcid": “PMC4678653",
 "references": [
 {
 "bibentry": "Wenger, NK. Coronary heart disease: an older woman's major health risk, BMJ, 1997, 315, 1085, 1090, DOI: 10.1136/bmj.315.7115.1085, PMID: 9366743", 
 "pmid": "9366743", 
 "doi": "10.1136/bmj.315.7115.1085", 
 "pmcid": "PMC2127693", 
 "process_entry": "True"
 } … ] } 2 For each citing/cited resource, if an ID (DOI, PMID, PMCID) is specified check if the resource exists already. If it does go to 5. store ResourceFinder 3 GraphSet ProvSet DatasetHandler
 Storer Load all the statements onthe triplestore and storethem in the file system for easy recovering. OCC 6 If the resource doesn’t exist, extract possible IDs from the entry and query CrossRef and ORCID. CrossRefProcessor
 ORCIDProcessor 4 GraphEntity New metadata resources are created. If CrossRef/ORCID returned something, all the related metadata will be used, otherwise only basic metadata (IDs and entries) will be added. 5
  • 6. Issues with data • Automatic workflow built upon external services – Efficient, but no human check of the data extracted – Some errors could be propagated • Data do change in time – Information can be incomplete (e.g. citations added in another iteration of the ingestion workflow) – Information can be wrong (e.g. circular citations – paper A cites paper A) – Information can be ambiguous (e.g. author disambiguation)
  • 7. Document-inspired data drift • Inspiration from the Document Engineering domain – Well-known structure for keeping track of changes in word- processor documents, e.g. OpenOffice Writer ✦ New content added directly within the existing text and marked in some way ✦ Removed context moved out from the actual content of the document and placed in an auxiliary space for easy retrieving and restoration – Two basic operations (add & remove) are enough for keeping track of document changes • Solution for RDF data: using PROV-O + SPARQL UPDATE (INSERT DATA and DELETE DATA only) for keeping track of the way entities change in time
  • 8. The approach :sp a foaf:Person ; foaf:name "Silvio Peroni" . :sp a prov:Entity . :sp-snapshot-1 a prov:Entity ; prov:specializationOf :sp . Data Provenance :sp a foaf:Person ; foaf:givenName "Silvio" ; foaf:familyName "Peroni" . :sp-snapshot-2 a prov:Entity ; prov:specializationOf :sp ; prov:wasDerivedFrom :sp-snapshot-1 ; new:hasUpdateQuery " " . INSERT DATA { :sp foaf:givenName 'Silvio' ; foaf:familyName 'Peroni' } ; DELETE DATA { :sp foaf:name 'Silvio Peroni' } Time T1 T2 A snapshot records the composition of the entity it specialises (i.e. the set of statements using such entity as subject) at a fixed point in time
  • 9. Advantages • Easy to retrieve the current statements of an entity, since they are those currently available in the dataset • It is possible to restore the entity to a certain snapshot si by applying the inverse operations (i.e. deletions instead of insertions and vice versa) of all the update queries from the most recent snapshot sn to si+1 – For instance, to get back to the status recorded by the first snapshot of the previous example, we have to run all the inverse operations of the update query specified in the second snapshot:
 
 INSERT DATA { :sp foaf:name 'Silvio Peroni' } ;
 DELETE DATA { 
 :sp 
 foaf:givenName 'Silvio' ; 
 foaf:familyName 'Peroni' }
  • 10. Implementation in the OCC • We use: – PROV-O – PROV-DC, an extension of PROV-O mapping it with DC – OpenCitations Ontology (OCO), which defines oco:hasUpdateQuery • Each entity in the OCC tracks provenance information about: – snapshot of entity metadata (prov:Entity), a particular snapshot recording the metadata associated with an individual entity at a particular time – curatorial activity (prov:Activity), a curatorial activity relating to that entity ✦ creation (prov:Create), the activity of creating a new entity with statements ✦ modification (prov:Modify), the activity of adding/removing statements of an entity ✦ merging (prov:Replace), the activity of unifying the statements relating to two entites – provenance agent (prov:Agent), a person, organisation or process, that is involved in some way in the creation of an entity (e.g. Crossref) – curatorial role (prov:Association), a particular role held by a provenance agent with respect to a curatorial activity (e.g. OCC curator, metadata source)
  • 11. An example br:525205 a fabio:Expression , fabio:JournalArticle ; dcterms:title "The Electronic Patient ..." ; datacite:hasIdentifier id:816997 , id:816998 , ... ; fabio:hasPublicationYear "2016"^^xsd:gYear ; pro:isDocumentContextFor ar:1591190 , ar:1591191 , ... ; frbr:embodiment re:217773 ; frbr:part be:727446 , be:727447 , ... . se:1 a prov:Entity ; prov:generatedAtTime "2016-08-08T22:25:48"^^xsd:dateTime ; prov:hadPrimarySource <http://api.crossref.org/works/10.2196/mhealth.5331> ; prov:specializationOf br:525205 ; prov:wasGeneratedBy ca:1 . ca:1 a prov:Activity, prov:Create ; dcterms:description "The entity 'https://w3id.org/oc/corpus/br/525205' has been created." ; prov:qualifiedAssociation cr:1 , cr:2 . cr:1 a prov:Association ; prov:agent pa:1 ; prov:hadRole oco:occ-curator . pa:1 a prov:Agent ; foaf:name "SPACIN CrossrefProcessor" . … Data Provenance br:525205 cites:cites br:1095420 , br:1095421 , ... ; frbr:part be:727446 , be:727447 , ... . se:1 prov:invalidatedAtTime "2016-08-29T22:42:06"^^xsd:dateTime ; prov:wasInvalidatedBy ca:2 . se:2 a prov:Entity ; prov:generatedAtTime "2016-08-29T22:42:06"^^xsd:dateTime ; prov:hadPrimarySource <http://www.ebi.ac.uk/europepmc/ webservices/rest/PMC4911509/fullTextXML> ; prov:specializationOf br:525205 ; prov:wasDerivedFrom se:1 ; prov:wasGeneratedBy ca:2 ; oco:hasUpdateQuery "INSERT DATA { GRAPH <https://w3id.org/oc/corpus/br/> { br:525205 cito:cites br:1095459 , br:525205 , ... ; frbr:part be:727491 , be:727452 , ... } }" . … Time T1 T2
  • 12. Conclusions • Approach for keeping track of changes in RDF data inspired by existing implementations in the Document Engineering domain – PROV-O – SPARQL UPDATE (INSERT/DELETE DATA) • Implementation: provenance data in the OpenCitations Corpus – PROV-DC – OpenCitations Ontology • Future works – automatic tools for snapshot restoration – handling non-automatic modification to the OpenCitations Corpus (e.g. curation by humans)
  • 13. Silvio Peroni silvio.peroni@unibo.it - @essepuntato Digital And Semantic Publishing Laboratory Department of Computer Science and Engineering Alma Mater Studiorum – Università di Bologna Bologna, Italy dasplab.cs.unibo.it – www.unibo.it