The ADAPT Centre is funded under the SFI Research Centres Programme (Grant 13/RC/2106) and is co-funded under the European Regional Development Fund.
Peru Bhardwaj (1), Christophe Debruyne (1), Anuj Singh (1),
Declan O'Sullivan (1), Marta Bustillo (2), Timothy Keefe (3)
(1) ADAPT Centre, Trinity College Dublin
(2) James Joyce Library, University College Dublin
(3) Digital Resources and Imaging Services, The Library, Trinity College Dublin
2017-05-21 @ CONUL 2017
Facilitating User Engagement by Enriching
Library Data using Semantic Technologies
www.adaptcentre.ie
Context
Both the Library and the ADAPT Centre at Trinity College Dublin are exploring ways to
leverage and facilitate user engagement with the publication of the Library’s bibliographic
records as Linked Data.
In this talk, we present the results of enriching a Linked Data representation of
bibliographic records with a geospatial component.
We will elaborate on the process of transforming the records into RDF, the creation of
links between Linked Data datasets, and two demonstrators that use the generated
RDF.
We will identify some challenges as well as indicate some future directions.
And we will provide references, or in some cases shameless plugs, where appropriate.
1. Use URIs as names for things.
2. Use HTTP URIs so that people can look up those names.
3. When someone looks up a URI, provide useful information using the standards.
4. Include links to other URIs, so that they can discover more things.
Linked Data
Linked Data started off as an initiative called
the Linking Open Data (LOD) project.
It is a global initiative to publish and
interlink structured data on the Web
according to some "protocol" using a clever
combination of simple, standardized
technologies:
• Uniform Resource Identifiers – to name
things;
• Resource Description Framework – to
represent and relate things;
• HTTP infrastructure and HTTP URIs – to
obtain those representations.
T. Berners-Lee
https://www.w3.org/DesignIssues/LinkedData.html
Linked Data – (Non-)Information Resources
Information resources (IR) are documents – referred to by a URI – that describe non-
information resources (NIR) – possibly named with a URI – that represent things such
as cars, people, etc.
[Figure: photograph captioned “Hey girl”. Source: Georges Biard [CC BY-SA 3.0], via Wikimedia Commons]
NIR http://dbpedia.org/resource/Ryan_Gosling refers to the person in the photograph and is described by:
IR http://dbpedia.org/page/Ryan_Gosling
IR http://dbpedia.org/data/Ryan_Gosling.n3
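
The relation between a NIR and its IRs can be sketched in a few lines of Python. This is only an illustration under an assumed DBpedia-style URI layout (/resource/ for the thing, /page/ and /data/ for documents about it); the function name and the choice of N3 as the only RDF format are assumptions made for the example, not a description of how any particular server is implemented.

```python
from urllib.parse import urlsplit


def information_resource_for(nir_uri: str, accept: str) -> str:
    """Return the IR a server could redirect to (e.g. with HTTP 303) for a NIR.

    Assumption: DBpedia-style layout, where /resource/ names the thing (NIR)
    and /page/ and /data/ name documents describing it (IRs).
    """
    parts = urlsplit(nir_uri)
    prefix = "/resource/"
    if not parts.path.startswith(prefix):
        raise ValueError("not a NIR URI under the assumed layout")
    name = parts.path[len(prefix):]
    base = f"{parts.scheme}://{parts.netloc}"
    # Clients asking for HTML get the human-readable page;
    # all other clients get an RDF serialisation (here: N3).
    if "text/html" in accept:
        return f"{base}/page/{name}"
    return f"{base}/data/{name}.n3"


nir = "http://dbpedia.org/resource/Ryan_Gosling"
print(information_resource_for(nir, "text/html"))
print(information_resource_for(nir, "text/n3"))
```

The point of the split is that the NIR URI never returns a document itself; it only leads to documents about the thing.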
Uplift
We transformed both the records of the “Clarke Stained Glass Studios Collection” and
a non-exhaustive dataset of mostly Catholic Irish churches with their coordinates* into
RDF fit for Linked Data.
The process of generating RDF from non-RDF resources is called uplift.
Two approaches to uplift:
• A direct mapping.
Reflects the source’s structure, is not necessarily meaningful, and generally needs to
be manipulated afterwards.
• A declarative mapping.
Using a mapping language, declare how one generates RDF from the source.
Adopting (established) vocabularies enables one to make the RDF meaningful to a
wider community.
* Graciously provided by the Ordnance Survey Ireland (OSi)
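
To make the first approach concrete, here is a minimal Python sketch of a direct mapping. It is loosely inspired by the idea of deriving identifiers from table and column names; the base URI, the function name and the sample row are all hypothetical. Note how the predicates simply mirror the column names, which is why the output usually needs further manipulation.

```python
def direct_mapping(table, key, row, base="http://example.org/"):
    """Turn one row into (subject, predicate, object) triples.

    Subject and predicates are derived mechanically from the table name,
    the key column and the column names; NULL (None) values are skipped.
    """
    subject = f"{base}{table}/{row[key]}"
    return [
        (subject, f"{base}{table}#{column}", value)
        for column, value in row.items()
        if value is not None
    ]


# Hypothetical row, made up for illustration.
sample = {"id": 7, "name": "St. Mary's", "notes": None}
for triple in direct_mapping("church", "id", sample):
    print(triple)
```

A declarative mapping would instead state which established vocabulary term each column maps to.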
Uplift
The Clarke Stained Glass Studios Collection was transformed into RDF.
FileMaker XML files were transformed to RDF using an approach that uses mappings expressed in
XQuery to retrieve the relevant pieces of information and instantiate RDF/XML
"templates" (see Singh 2015).
XQuery mappings are furthermore represented in RDF so as to facilitate analysis,
maintenance, etc., e.g., via a GUI that fires SPARQL update queries.
• Approach was deemed feasible and produced the desired output for various cases
• Bespoke solution
• Steep learning curve for users (familiarity with multiple formalisms)
• Interface
Though the approach was sufficient, subsequent success with emerging standards led us to recommend
a different approach…
Anuj Singh. Towards Autonomic Uplift of Information.
MSc Thesis. Trinity College Dublin (2015)
Uplift
A non-exhaustive dataset of mostly Catholic Irish churches with their coordinates was
transformed using the RDB to RDF Mapping Language (R2RML), which is a W3C
Recommendation.
R2RML provides a vocabulary for declaring mappings, which are themselves stored as
RDF. This allows one to, among other things, conduct a meta-analysis of the
mappings using SPARQL, use SPARQL to update and repair mappings, and even
combine the mappings with the RDF to discover which mappings produced which
statements.
• Feasible and validated approach to produce RDF
(as demonstrated for data.geohive.ie – see Debruyne et al. 2016)
• Based on a standard
• Still a steep learning curve, but all is expressed in RDF
@prefix rr: <http://www.w3.org/ns/r2rml#> .
@prefix ont: <http://ontologies.geohive.ie/osi#> .
@prefix geo: <http://www.opengis.net/ont/geosparql#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

<#TriplesMap1>
    # What is being mapped? A logical table/view or an SQL query.
    rr:logicalTable [ rr:tableName "church" ] ;
    # How to generate and state something about the subject of those triples.
    rr:subjectMap [
        rr:template "http://data.geohive.ie/resource/church/{GUID}" ;
        rr:class ont:Church ;
        rr:class geo:Feature ;
    ] ;
    # How to generate predicates and objects.
    rr:predicateObjectMap [
        rr:predicate rdfs:label ;
        rr:objectMap [
            rr:column "eng_name_5" ;
            rr:language "en" ;
        ] ;
    ] ;
    …
R2RML – A glimpse…
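
To give a feel for what such a triples map produces, the Python sketch below mimics by hand what an R2RML processor would generate for a single row of the "church" table. It is an illustration only, not an R2RML engine: the function name and the sample row (GUID and name) are made up, and only the parts of the mapping shown on the slide are covered.

```python
from urllib.parse import quote

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def apply_triples_map(row: dict) -> list[str]:
    """Return N-Triples lines for one row, following the mapping on the slide."""
    # rr:template: substitute {GUID}; the value is percent-encoded so that
    # the resulting IRI stays valid.
    subject = "http://data.geohive.ie/resource/church/" + quote(str(row["GUID"]), safe="")
    triples = [
        # Each rr:class yields one rdf:type triple for the subject.
        f"<{subject}> <{RDF_TYPE}> <http://ontologies.geohive.ie/osi#Church> .",
        f"<{subject}> <{RDF_TYPE}> <http://www.opengis.net/ont/geosparql#Feature> .",
    ]
    # rr:column + rr:language: a language-tagged literal.
    # A NULL (None) column value yields no triple.
    label = row.get("eng_name_5")
    if label is not None:
        escaped = label.replace("\\", "\\\\").replace('"', '\\"')
        triples.append(f'<{subject}> <{RDFS_LABEL}> "{escaped}"@en .')
    return triples


# Hypothetical row; the GUID and the name are invented for illustration.
for line in apply_triples_map({"GUID": "abc-123", "eng_name_5": "St. Mary's Church"}):
    print(line)
```

Each row of the logical table thus yields one subject, one rdf:type triple per rr:class, and one triple per predicate-object map whose column is not NULL.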
Interlinking the datasets
The creation of links between Linked Data datasets is a daunting task. There are
various approaches, each with their own considerations.
1. Manually curated links are authoritative, but require a lot of effort.
2. Link discovery tools such as SILK (Jentzsch et al. 2010) and LIMES (Ngonga
Ngomo and Auer 2011), where one can use declarative approaches (e.g., SILK) or
machine learning (e.g., LIMES). The resulting links are not guaranteed to be
authoritative unless governance practices are in place.
3. SPARQL CONSTRUCT queries (see later).
Challenge: How can one elegantly deal with links created with various approaches?
One approach is to cleverly use named graphs and provenance information.
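
The idea of combining named graphs with provenance can be sketched as follows in Python. This is a toy model under stated assumptions: links are stored as quads (subject, predicate, object, graph), each linking approach writes into its own named graph, and a small provenance record per graph says how its links were produced. All URIs, the "authoritative" flag and the function name are hypothetical.

```python
# Each linking approach writes into its own named graph (hypothetical URIs).
MANUAL = "http://example.org/graph/links-manual"
TOOL = "http://example.org/graph/links-tool"

# Quads: (subject, predicate, object, graph).
quads = [
    ("http://example.org/record/1", "http://example.org/vocab/locatedIn",
     "http://example.org/church/a", MANUAL),
    ("http://example.org/record/2", "http://example.org/vocab/locatedIn",
     "http://example.org/church/b", TOOL),
]

# Provenance about each graph: how were its links produced?
provenance = {
    MANUAL: {"method": "manual curation", "authoritative": True},
    TOOL: {"method": "link discovery tool", "authoritative": False},
}


def links(quads, provenance, authoritative_only=False):
    """Return (s, p, o) links, optionally only those from authoritative graphs."""
    return [
        (s, p, o)
        for (s, p, o, graph) in quads
        if not authoritative_only or provenance[graph]["authoritative"]
    ]


print(len(links(quads, provenance)))
print(links(quads, provenance, authoritative_only=True))
```

Keeping the links apart per graph means a consumer can decide per task which sources to trust, without the links themselves having to change.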
Anja Jentzsch, Robert Isele, Christian Bizer: Silk - Generating RDF Links while
Publishing or Consuming Linked Data. ISWC Posters&Demos 2010
Axel-Cyrille Ngonga Ngomo, Sören Auer: LIMES - A Time-Efficient Approach for
Large-Scale Link Discovery on the Web of Data. IJCAI 2011: 2312-2317
Demonstrator 1: Mobile Application
Developed by Peru Bhardwaj as part of an internship in ADAPT with domain expert
input from Marta Bustillo.
One may notice that most stories around Linked Data involve either the publication
process or the development of applications on top of existing Linked Data datasets. Her
goal was to demonstrate how Linked Data can be used to create ways to engage with
TCD Library’s collections.
• Google Material Design approach
• Google Polymer library
• Geolocation API
• Google Maps API – can be replaced with OSi base maps
The project informed both TCD Library and the Ordnance Survey Ireland.
Demonstrator 2: Authoritative geospatial data
We avail of the predicates offered by standards and well-established vocabularies, for
which tools already exist. Library metadata was interlinked with OSi’s churches
dataset, which is described using GeoSPARQL.
GeoSPARQL is a standard offering both a vocabulary for geospatial data (including
topological relations) and an extension of SPARQL with spatial functions. Not everyone has access to
GeoSPARQL-triplestores, though. And access to SPARQL endpoints is often difficult to
come by (see Verborgh et al. 2016).
In (Debruyne and O’Sullivan 2016), we proposed an extension of Triple Pattern
Fragments (Verborgh et al. 2016) that supports client-side processing of GeoSPARQL
functions, allowing one to query, analyze, and explore that data using the spatial
dimension.
What are Triple Pattern Fragments (TPF)? In short, an approach to distribute the load
of computing the result of a query between a TPF client and server. This results in less
load on the server (smarter clients) at the cost of increased bandwidth.
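
The division of labour between a TPF client and server can be sketched in Python as below. This is a deliberately simplified toy: a real TPF interface also involves paging, count metadata and hypermedia controls, none of which are modelled here. The class and function names and the "ex:" data are invented for the example.

```python
# Toy dataset held by the "server"; all names are made up.
DATA = [
    ("ex:window1", "ex:locatedIn", "ex:church1"),
    ("ex:window2", "ex:locatedIn", "ex:church2"),
    ("ex:church1", "ex:inCounty", "ex:Dublin"),
    ("ex:church2", "ex:inCounty", "ex:Cork"),
]


class Server:
    """Answers single triple patterns only; None acts as a wildcard."""

    def __init__(self, data):
        self.data = data
        self.requests = 0  # counts round trips, i.e. the bandwidth cost

    def fragment(self, s=None, p=None, o=None):
        self.requests += 1
        return [
            t for t in self.data
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)
        ]


def windows_in_county(server, county):
    """Client-side join of two triple patterns:
    ?church ex:inCounty <county> .  ?window ex:locatedIn ?church .
    """
    result = []
    for church, _, _ in server.fragment(p="ex:inCounty", o=county):
        for window, _, _ in server.fragment(p="ex:locatedIn", o=church):
            result.append(window)
    return result


server = Server(DATA)
print(windows_in_county(server, "ex:Dublin"), server.requests)
```

The server only ever evaluates cheap single-pattern lookups; the join happens on the client, which is why the number of requests grows with the size of the intermediate results.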
R. Verborgh, M. Vander Sande, O. Hartig, J. Van Herwegen, L. De Vocht, B. De
Meester, G. Haesendonck, and P. Colpaert. 2016. Triple Pattern Fragments: A low-
cost knowledge graph interface for the Web. J. Web Sem. 37-38 (2016), 184–206
Demonstrator 2
Two possible approaches to extending TPF with GeoSPARQL:
A) Extending a TPF Client
• TPF server specification intact (backwards compatible)
• Possibly more network overhead
B) Extending the TPF server
• Outside server specification, but proven to be viable for substring filtering (Van
Herwegen et al. 2015)
Additional requirement: a pure JavaScript implementation
• Allows one to run the client in a browser and hence helps stakeholders
formulate GeoSPARQL queries
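
Option A amounts to evaluating spatial functions on the client after the geometries have been fetched through plain triple patterns. The sketch below illustrates that idea with a point-in-polygon test, a rough stand-in for a "within" style spatial function. It is written in Python for brevity, whereas the implementation referred to above is JavaScript; the geometries and names are hypothetical and coordinates are assumed to be already parsed into (x, y) pairs.

```python
def point_in_polygon(point, ring):
    """Ray casting: count how often a horizontal ray from the point crosses an edge."""
    x, y = point
    inside = False
    # Pair each vertex with the next one, closing the ring at the end.
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        # Does this edge straddle the ray's y-coordinate?
        if (y1 > y) != (y2 > y):
            # x-coordinate at which the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical data: a square "county" and two "churches" as points.
county = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
churches = {"ex:church1": (1.0, 1.0), "ex:church2": (5.0, 1.0)}

# The client filters locally; the server never sees a spatial function.
within = [name for name, pt in churches.items() if point_in_polygon(pt, county)]
print(within)
```

Because the filter runs on the client, the server specification stays untouched, at the price of transferring geometries that may later be discarded.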
J. Van Herwegen, L. De Vocht, R. Verborgh, E. Mannens, and R. Van de Walle. 2015.
Substring Filtering for Low-Cost Linked Data Interfaces. In The Semantic Web - ISWC
2015 - 14th International Semantic Web Conference, Bethlehem, PA, USA, October 11-
15, 2015, Proceedings, Part I (LNCS), Vol. 9366. Springer, 128–143.
Demonstrator 2
A. OSi's TPF server
B. the TCD (local) TPF server
C. the query (using our extension), and
D. the result
Runs in a browser, and one does not need to install and populate a
GeoSPARQL-enabled triplestore.
Straightforward federated querying.
Demonstrator 2
Using YASGUI (http://yasgui.org)
Similar query, only slightly tweaked to plot the border of County Dublin only once.
Challenges…
We deem that we achieved our goal of demonstrating the feasibility of publishing and
using TCD Library’s metadata as Linked Data.
During this process, however, we identified several challenges:
• Knowledge organization, and provenance. As we already mentioned, clever use of
named graphs and provenance vocabularies would inform consumers what data to
trust for particular tasks.
• Leveraging the uplift process. Through (repeated, similar) exercise(s), we did
notice that creating mappings to generate RDF is a daunting task requiring
appropriate tools and representations.
• (Authoritative) Interlinking by Librarians. Similar to uplift, we notice a lack of
methods and tools to leverage this process. Sometimes tools “as simple as”
creating skeleton mappings make the task less daunting.
(cf. Lucy McKenna, also present at CONUL 2017)
Some references…
• Christophe Debruyne, Kris McGlinn, Lorraine McNerney and Declan O'Sullivan: A
Lightweight Approach to Explore, Enrich and Use Data with a Geospatial Dimension
with Semantic Web Technologies. GeoRICH 2017
– Leveraging uplift and interlinking process
• Ademar Crotti Junior, Christophe Debruyne and Declan O'Sullivan. Juma: an Editor
that Uses a Block Metaphor to Facilitate the Creation and Editing of R2RML
Mappings. Extended Semantic Web Conference (Posters & Demos) 2017
– Representations for uplift mappings
• Christophe Debruyne, Eamonn Clinton, Declan O'Sullivan: Client-side Processing of
GeoSPARQL Functions with Triple Pattern Fragments. LDOW@WWW 2017
• Christophe Debruyne, Eamonn Clinton, Lorraine McNerney, Atul Nautiyal, Declan
O'Sullivan: Serving Ireland's Geospatial Information as Linked Data. International
Semantic Web Conference (Posters & Demos) 2016
– URI strategies, knowledge representation & organization, Linked Data Service

‘Facilitating User Engagement by Enriching Library Data using Semantic Technologies’ - Christophe Debruyne (Trinity College Dublin)


Editor's Notes

  • #9 Template of the subject URI; rr:class relates the resources to types; rr:predicate provides the predicate’s URI; mapping of literals.