Digital Enterprise Research Institute www.deri.ie
Enabling networked knowledge
Linked Logainm: Enhancing Library Metadata
using Linked Data of Irish Place Names
Nuno Lopes, Rebecca Grant, Brian Ó Raghallaigh, Eoghan Ó Carragáin, Sandra Collins, Stefan Decker
September 26, 2013
logainm.ie
The authority list of Irish place names, validated by the Placenames Branch.
Delivers a more detailed level of coverage than DBpedia or GeoNames.
The unique source of Irish-language place names.
But: not easily accessible to automated processing.
The NLI Longfield Map Collection
The Longfield Maps are a set of 1,570 surveys carried out in Ireland between 1770 and 1840.
Currently catalogued in MARC/XML.
Goal: integrate Logainm data into the NLI cataloguing workflow, using Linked Data, to enable searching for place names in Irish.
Longfield Map example
MARC/XML
<marc:datafield tag="650" ind1="" ind2="">
  <marc:subfield code="a">Land tenure</marc:subfield>
  <marc:subfield code="z">Ireland</marc:subfield>
  <marc:subfield code="z">Rathdown (Barony)</marc:subfield>
</marc:datafield>
<marc:datafield tag="650" ind1="" ind2="">
  <marc:subfield code="a">Land use surveys</marc:subfield>
  <marc:subfield code="z">Ireland</marc:subfield>
  <marc:subfield code="z">Wicklow (County)</marc:subfield>
</marc:datafield>
Approach for creating the dataset
1. Translate the Logainm database dump into RDF
2. Determine links to other datasets based on:
   Place names
   Type
   Geographical coordinates
   Hierarchy of places
3. Evaluation of generated links
4. Library catalogue enhancement
Overview of GLD
Providers:
   DBpedia (exported from Wikipedia)
   LinkedGeoData (exported from OpenStreetMap)
   GeoNames
   GeoLinkedData
   Ordnance Survey
Vocabularies:
   W3C Geo: SpatialThing
   NeoGeo: Feature vs Geometry, Spatial Relations (is_part_of)
   Most providers define their own
1. Converting Logainm dump to RDF
[Figure: XSPARQL diagram combining XML, RDF and SPARQL]
Data provided in XML
Translated to RDF using XSPARQL
Exposed using OpenLink Virtuoso
∼ 1.3M triples
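The XML-to-RDF step can be illustrated with a minimal sketch. The slides state that the translation was done with XSPARQL; the Python below is a stand-in used only to show the idea, not the authors' transformation. The XML layout (`places`, `place`, `name`, the `id`, `type` and `lang` attributes), the sample values and the `http://example.org/placetype/` namespace are all assumptions, since the slides show neither the dump schema nor the vocabulary used. Only the `http://data.logainm.ie/place/` URI prefix comes from the slides.

```python
import xml.etree.ElementTree as ET

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
PLACE_BASE = "http://data.logainm.ie/place/"   # prefix as shown in the slides
TYPE_BASE = "http://example.org/placetype/"    # hypothetical namespace

# Hypothetical dump layout with illustrative values; the real schema is not shown.
SAMPLE_DUMP = """
<places>
  <place id="283" type="barony">
    <name lang="ga">Ráth an Dúin</name>
    <name lang="en">Rathdown</name>
  </place>
</places>
"""

def escape_literal(text):
    """Escape backslashes and double quotes for an N-Triples literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')

def dump_to_ntriples(xml_text):
    """Return one N-Triples line per type statement and per language-tagged name."""
    root = ET.fromstring(xml_text.strip())
    triples = []
    for place in root.iter("place"):
        subject = "<" + PLACE_BASE + place.get("id") + ">"
        type_uri = "<" + TYPE_BASE + place.get("type") + ">"
        triples.append(f"{subject} <{RDF_TYPE}> {type_uri} .")
        for name in place.findall("name"):
            literal = escape_literal(name.text)
            lang = name.get("lang")
            triples.append(f'{subject} <{RDFS_LABEL}> "{literal}"@{lang} .')
    return triples

if __name__ == "__main__":
    for line in dump_to_ntriples(SAMPLE_DUMP):
        print(line)
```

Language-tagged labels are what make searching by the Irish form of a name possible once the triples are loaded into a store.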
Linked Logainm
[Figure: Linked Open Data cloud diagram from http://lod-cloud.net/, with legend categories Government, Media, User-generated, Publications, Life sciences, Cross-domain; labelled nodes: GeoLogainm, OCLC FAST]
2. Place name matching using Silk
1. Place name
   Names alone are ambiguous: "Island" (Cavan) matches 2,641 "Place"s in DBpedia; "Airport" (Dublin) matches 7,828.
2. Geographical location
   ∼50% of place names in logainm contain geographical information.
3. Name of the county / parent place name
4. Mapping of types from Logainm to types in other datasets

logainm.ie   DBpedia           LinkedGeoData   GeoNames
townland     Populated Place   Locality        LCTY, PPLF
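The four criteria above can be combined into a single matching decision. The sketch below is not a Silk rule and does not reproduce the authors' configuration: how the criteria were weighted is not stated in the slides. It assumes records are plain dictionaries, uses a name-similarity threshold of 0.9 and a 5 km distance limit (both arbitrary), and falls back to the parent place name when either record lacks coordinates. Only the `townland` type mapping comes from the slide; the sample coordinates in the usage note are illustrative.

```python
import math
from difflib import SequenceMatcher

# Type mapping for "townland" as given on the slide, keyed by target dataset.
TYPE_MAP = {
    "townland": {
        "dbpedia": {"Populated Place"},
        "linkedgeodata": {"Locality"},
        "geonames": {"LCTY", "PPLF"},
    }
}

def name_similarity(a, b):
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))

def is_match(source, candidate, dataset, name_threshold=0.9, max_km=5.0):
    """Decide whether a Logainm record and a candidate denote the same place."""
    # 1. Place name
    if name_similarity(source["name"], candidate["name"]) < name_threshold:
        return False
    # 4. Type mapping
    allowed = TYPE_MAP.get(source["type"], {}).get(dataset, set())
    if candidate["type"] not in allowed:
        return False
    # 2. Geographical location, when both sides have coordinates
    if source["coords"] and candidate["coords"]:
        return haversine_km(*source["coords"], *candidate["coords"]) <= max_km
    # 3. Otherwise fall back to the county / parent place name
    return source["parent"].lower() == candidate["parent"].lower()
```

For example, `is_match({"name": "Adrigole", "type": "townland", "parent": "Cork", "coords": None}, {"name": "Adrigole", "type": "Populated Place", "parent": "Cork", "coords": (51.69, -9.72)}, "dbpedia")` takes the parent-name branch because the source record has no coordinates.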
3. Silk results
Dataset             Entities (IE)   # Links   % Links
DBpedia [1]         10,715          1,552     14.5
LinkedGeoData [2]   36,237          6,611     18
GeoNames [3]        23,102          8,229     35.5

Links in other datasets
Dataset         Entities    # Links       % Links
DBpedia         873,643     653,707 [4]   74.8 [4]
LinkedGeoData   6,251,067   462,098       7.4

[1] Entities of type “Place” or “Feature”
[2] Entities of type “Node”
[3] No hierarchy info
[4] Including internal & Freebase links
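The "% Links" column is the number of links divided by the number of candidate entities. As a quick check, the sketch below recomputes it from the counts on the slide; the dictionary and function names are illustrative only.

```python
# Counts copied from the Silk results table: (entities in Ireland, links found).
SILK_RESULTS = {
    "DBpedia": (10715, 1552),
    "LinkedGeoData": (36237, 6611),
    "GeoNames": (23102, 8229),
}

def link_percentage(entities, links):
    """Share of entities that received a link, as a percentage."""
    return 100.0 * links / entities

if __name__ == "__main__":
    for dataset, (entities, links) in SILK_RESULTS.items():
        print(f"{dataset}: {link_percentage(entities, links):.1f}%")
```

The recomputed values agree with the slide to within a few tenths of a percentage point; the small differences are consistent with rounding on the slide.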
Evaluation Results
Dataset         Links   Checked        Correct
DBpedia         1,552   1,552 (100%)   98%
LinkedGeoData   6,611   500 (7.5%)     96%
GeoNames        8,229   500 (6%)       99%

The same place name can be a “town”, a “population centre” and a “townland” in logainm.ie, while DBpedia contains only one entry. For example, Adrigole (population centre) and Adrigole (townland) both correspond to http://dbpedia.org/resource/Adrigole.
The situation is similar for LinkedGeoData.
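For LinkedGeoData and GeoNames only 500 links were checked. The slides do not say how those links were selected; the sketch below assumes a simple random sample and shows how the "Correct" figure would then be computed from manual judgements. The function names and the fixed seed are illustrative choices.

```python
import random

def sample_links(links, k=500, seed=0):
    """Draw up to k links uniformly at random, without replacement."""
    links = list(links)
    if len(links) <= k:
        return links
    return random.Random(seed).sample(links, k)

def precision(judgements):
    """Fraction of checked links judged correct (True) by a reviewer."""
    return sum(judgements) / len(judgements)
```

With 6,611 generated links, `sample_links` returns 500 distinct links to review; if 480 of them are judged correct, `precision` yields 0.96.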
Longfield Map example (Updated)
<marc:datafield tag="650" ind1="" ind2="">
  <marc:subfield code="a">Land tenure</marc:subfield>
  <marc:subfield code="z">Ireland</marc:subfield>
  <marc:subfield code="z">Rathdown (Barony)</marc:subfield>
</marc:datafield>
<marc:datafield tag="650" ind1="" ind2="">
  <marc:subfield code="a">Land use surveys</marc:subfield>
  <marc:subfield code="z">Ireland</marc:subfield>
  <marc:subfield code="z">Wicklow (County)</marc:subfield>
</marc:datafield>
<!-- New field linking the record to the Logainm place URI -->
<marc:datafield tag="651" ind1="" ind2="7">
  <marc:subfield code="2">logainm.ie</marc:subfield>
  <marc:subfield code="a">Rathdown</marc:subfield>
  <marc:subfield code="0">http://data.logainm.ie/place/283</marc:subfield>
</marc:datafield>
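The record update shown above can be sketched programmatically. This is a minimal illustration using Python's standard library, not the tooling used for the NLI catalogue, which the slides do not describe. The function name `add_logainm_field` and the wrapping `marc:record` element are assumptions; the field layout (tag 651, second indicator 7, subfields 2, a and 0) follows the slide.

```python
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"
ET.register_namespace("marc", MARC_NS)

# Illustrative record containing one of the subject fields from the slide.
RECORD = f"""<marc:record xmlns:marc="{MARC_NS}">
  <marc:datafield tag="650" ind1="" ind2="">
    <marc:subfield code="a">Land tenure</marc:subfield>
    <marc:subfield code="z">Ireland</marc:subfield>
    <marc:subfield code="z">Rathdown (Barony)</marc:subfield>
  </marc:datafield>
</marc:record>"""

def add_logainm_field(record, place_name, place_uri):
    """Append a 651 field that points at a Logainm place URI; return the new field."""
    field = ET.SubElement(
        record, f"{{{MARC_NS}}}datafield", {"tag": "651", "ind1": "", "ind2": "7"}
    )
    # Subfield 2: source of the heading, a: place name, 0: authority URI.
    for code, value in (("2", "logainm.ie"), ("a", place_name), ("0", place_uri)):
        sub = ET.SubElement(field, f"{{{MARC_NS}}}subfield", {"code": code})
        sub.text = value
    return field

if __name__ == "__main__":
    record = ET.fromstring(RECORD)
    add_logainm_field(record, "Rathdown", "http://data.logainm.ie/place/283")
    print(ET.tostring(record, encoding="unicode"))
```

The existing 650 fields are left untouched, so the enhancement is additive and can be reverted by removing the 651 field.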
Demo page:
http://apps.dri.ie/locationLODer
Conclusions
Creation of a new Linked Data geographical dataset
Linking to other publicly available datasets
Enhancement of the NLI’s MARC/XML records
Future work
Improve the Silk matching rules to obtain better matches
Street-level matching
Enhancing the NLI’s cataloguing system (VuFind)
Thank you! Questions?
