Linked Data for
Digital Libraries
Uldis Bojars, Nuno Lopes, & Jodi Schneider
TPDL 2013
September 22, 2013
Valletta, Malta
...
Nuno
Digital Repository of Ireland &
DERI

Uldis
National Library of Latvia

Jodi
DERI
Schedule for the day
9:00 - Introduction of presenters, tutorial schedule, and learning outcomes
9:10 - Motivation and con...
Hands-on Activities
11:50 – 12:25
Choice of Activities….
• Data Modelling
• Data Cleaning & Structuring
• Querying (SPARQL...
Please share your expertise!
• In the room
• On paper
• Online - shared folder:
http://tinyurl.com/tpdl2013-ld-notes
– PDF...
Objectives for Today
• What is Linked Data? Why use it?
• What are some examples of
Linked Data in Digital Libraries?
• Wh...
Motivation and concepts
of Linked Data
What is Linked Data?
• Using identifiers
• to enable access
• to add structure
• to link to other stuff
Why use Linked Data?
Key technology for library data!
Representing
Publishing
Exchanging
• Powerful querying
• Ability to mix/match vocabularies
• Same technology stack as everybody else
– Findability
– Interope...
Who is using Linked Data?
Aggregators
Integrated Library Systems & OPACs
Thesauri
Repositories
What is Linked Data (redux)?
Rob Styles
Towards RDF

Subject

Predicate

Object
RDF triple

Subject

Predicate

Object
RDF graph
How Linked Data works
Reuses the existing Web infrastructure to publish your
data along with your documents:
– Using URI i...
Linked Data Principles
1. Use URIs as names for things
2. Use HTTP URIs so that people can look up
those names.
3. When so...
Data on the Web is not enough…
• We need a proper infrastructure for a real
Web of Data
– data is available on the Web
• a...
In groups of 2-3: Discuss
• How would you envision using Linked
Data? What are the opportunities?
• Is your institution alr...
Lifecycle of Linked Data
Lifecycle of Linked Data
• Find
• Explore
• Transform
• Model
• Store
• Query
• Interlink
• Publish
Semantic Web for Digital Libraries
Exploring Linked Data
(Practical Tools and Approaches)

Uldis Bojars, Nuno Lopes, & Jod...
Objectives
• Learn about Linked Data (LD) by looking
at existing data sources
• Discover tools and approaches for
explorin...
Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch.
http://lod-cloud.net/
Exploring Linked Data
• Discovering Linked Data
• Accessing RDF data
• Making sense of the data
– Validating RDF data
– Co...
RDF graph
What RDF looks like
• RDF can be expressed in a number of formats:
– some are good for machines;
some – understandable to ...
Accessing RDF data
RDF data on the Web can be found as:
• Linked Data
– follow links, request data by URI
– returned data ...
http://www.ivan-herman.net/
Discovering Linked Data
a) find it via a link in a Web page
b) have some tools alert you Linked Data is there
–
–

Tabulator
S...
RDF discovery example
• data at Ivan Herman’s page can be found via:
– finding the RDF icon (with the link to FOAF file)
–...
Making sense of the data
• Validating RDF data
– Ensures that data representation is correct

• Converting between formats...
Validating and Converting RDF
• W3C RDF validator
http://www.w3.org/RDF/Validator/

• URI debugger – “Swiss knife” of Link...
<http://www.ivan-herman.net/> a foaf:PersonalProfileDocument;
dc:creator "Ivan Herman";
dc:date "2009-06-17"^^xsd:date;
dc...
Browsing Linked Data (DBPedia):
http://live.dbpedia.org/resource/Valletta
Command Line Tools
• wget – command line network downloader
$ wget http://dbpedia.org/resource/Valletta

• curl – specify ...
Querying Linked Data
• SPARQL Protocol and RDF Query Language
• Graph Matching
• Components of a SPARQL Query:
– Prefix De...
Europeana SPARQL endpoint
http://europeana.ontotext.com/
Sample queries provided:
http://europeana.ontotext.com/sparql
http://tinyurl.com/europeana-rights-sparql
Tool catalogues: many more tools
• Collection of tools from other projects
– http://www.w3.org/2001/sw/wiki/LLDtools
– htt...
Interesting Projects
• LOCAH
a stylesheet to transform UK Archives Hub EAD to RDF/XML, and provides
examples of the proces...
Tools for Converting MARC records
• MariMba
Tool to translate MARC to RDF and Linked Data
http://mayor2.dia.fi.upm.es/oegu...
Tools for museum curators
• Karma (http://isi.edu/integration/karma/)
was used to map the records of the Smithsonian
Ameri...
Authority Linked Data
VIAF and Wikipedia case study
library links
Slide credit: Jindřich Mynarz
• Use a single, distinct name for
each person, organization, …
• Name is consistently used
throughout library systems
• Is...
http://viaf.org
VIAF
• Virtual International Authority File (viaf.org)
• Integrating authority information from
a number of national libraries
...
Wikipedia + VIAF
• How can people discover useful information
in VIAF and via VIAF?
• Linked Data eco-system – let’s explo...
http://en.wikipedia.org/wiki/Andrejs_Pumpurs
http://viaf.org/viaf/44427367/
VIAF
• Ontologies used:
– FOAF, SKOS, RDA (FRBR entities and elements),
Dublin Core, VIAF, UMBEL

• Related datasets:
– Na...
http://viaf.org/viaf/44427367/
How did VIAF get into Wikipedia?
• VIAFbot
– algorithmically matched by name, important
dates, and selected works

• “The ...
One Direction
VIAF

Slide credit: Maximilian Klein, Wikipedian in Residence at OCLC

English Wiki
Enter VIAFBot: Wikipedia Robot
VIAF

Slide credit: Maximilian Klein, Wikipedian in Residence at OCLC

English Wiki
Idea: Reciprocate
VIAF

Slide credit: Maximilian Klein, Wikipedian in Residence at OCLC

English Wiki
VIAF – summary:
– an efficient way for putting library authority data
online as linked data
– in case if the organization ...
Data Modelling
Publishing Data
• Naïve Transform
– Direct Mapping of Relational Data to RDF
See RDB2RDF

OR
• Model & Transform
– Figure ...
Model
• Describe the domain
– What are the important concepts?
– What are their properties?
– What are their relations?

•...
DC TERMS RDF Vocabulary
http://purl.org/dc/terms/
Deciding on URI patterns
•
•
•
•

Use a domain that you control
Use consistent patterns
Manage change: transparent isn’t a...
Example URI patterns
• Designing URI Sets for the UK Public Sector
• Defines patterns for
– Identifier URI
– Document URI
...
Choosing Vocabularies
• Audience & Purpose
– e.g. search engine vs. bibliographic exchange

• Domain
– Biomedical, geograp...
Finding vocabularies &
ontologies
Look at examples
Look at examples
Find examples:
Linked Open Data Cloud
Look at Publications & Lists
http://www.w3.org/2005/Incubator/lld/XGR-lldvocabdataset-20111025/
Ask the community
• Mailing lists
– LOD-LAM
– Code4Lib
– OKFN Open-Bibliography Working Group
– W3C Schema.org BibEx Commu...
Popularity
Popularity:
Semantic search engines
http://sindice.com/
Modeling spectrum:
lightweight to heavyweight

An ontology ”spectrum” (in the order of complexity).
Source: [Lassila and M...
Some popular vocabularies
• DC
• BIBO
• FOAF
• LODE (LinkedEvents)
• OAI-ORE
• SKOS
Be aware of & connect to
• Authority data
– e.g. VIAF

• Thesauri
– e.g. Agrovoc

• Linked Data is about Linking!
Modeling examples
• BIBFRAME
• British Library Data Model
• EDM
• LIBRIS
• VIAF
VIAF
• Ontologies used:
– FOAF, SKOS, RDA (FRBR entities and elements),
Dublin Core, VIAF, UMBEL

• Related datasets:
– Na...
LIBRIS Modeling
British Library Data Model - Book
@prefix blt:
@prefix rdf:
@prefix rdfs:
@prefix owl:
@prefix xsd:
@prefix dct:
@prefix isbd:
@p...
Semantic Web for Digital Libraries
Geographical LD case study
Uldis Bojars, Nuno Lopes, & Jodi Schneider
The NLI Longfield Map Collection

• Collections refer to Geographical Data in many forms…
• The Longfield Maps are a set o...
Longfield Map example

<marc:datafield tag="650" ind1="" ind2="">
<marc:subfield code="a">Land tenure</marc:subfield>
<mar...
Geographic Data Providers
• DBpedia
– Includes latitude and longitude for geographic entities
• LinkedGeoData
– Export of ...
Logainm.ie
• The authority list of Irish place
names, validated by the Place Names
Branch.
• Delivering a more detailed le...
Geo-Vocabularies
• W3C Geo (very basic)
– SpatialThing, latitude and longitude

• Most providers have defined their own
• ...
NeoGeo Overview
• Classes
– Feature (spatial:Feature)
• A geographical feature, capable of holding spatial
relations.

– G...
Relations between geometries
Properties
•
•
•
•
•

connects with (spatial:C)
overlaps (spatial:O)
is part of (spatial:P)
c...
Creating a LD Dataset
Steps:
1. Data transformation / access
•

Vocabulary assessment

2. Link Discovery
•

Evaluation of ...
Converting Logainm to RDF
~100,000 place names

~1.3M triples

http://data.logainm.ie/1
375542

Dublin

http://sws.geona
m...
Link Discovery
• Silk
– http://wifo5-03.informatik.uni-mannheim.de/bizer/silk/

• LIMES
– http://aksw.org/Projects/LIMES.h...
Rules to discover links to other
datasets
•

Rules based on:
–
–
–
–

Place names
Geographical coordinates
Name of the cou...
Longfield Map example
<marc:datafield tag="650" ind1="" ind2="">
<marc:subfield code="a">Land tenure</marc:subfield>
<marc...
Demo: Location LODer
http://apps.dri.ie/locationLODer/locationLODer
Hands-on Activities
11:50 – 12:25
Choice of Activities….
• Data Modelling
• Data Cleaning & Structuring
• Querying (SPARQL...
Semantic Web for Digital Libraries
Open Refine Exercise
Uldis Bojars, Nuno Lopes, & Jodi Schneider
Open Refine
• Useful for batch transformation of large amounts
of data
– data cleanup (misspellings, splitting multiple-va...
Exercise
• Examples from: http://freeyourmetadata.org/
• Sample Data (collection metadata from the
Sydney Powerhouse Museu...
Task 1 - Data Cleanup
1.
2.
3.
4.
5.
6.
7.
8.

Import the collection into OpenRefine
Get to know your data
Remove blank ro...
Task 2 - Data Reconciliation & RDF
Export
1.
2.
3.
4.
5.
6.
7.

Pick a column to reconcile
Pick a vocabulary to reconcile ...
Semantic Web for Digital Libraries
SPARQL Hands-on Session
Uldis Bojars, Nuno Lopes, & Jodi Schneider
SPARQL
• Query Language for RDF data
• W3C Standard
• Components of a SPARQL Query:
– Prefix Declarations
– Result type (S...
Further information
• In-Depth SPARQL tutorials
– http://www.cambridgesemantics.com/semanticuniversity/sparql-by-example
–...
SPARQL by example – Europeana Endpoint
Endpoint: http://europeana.ontotext.com/sparql
1. SPARQL Select template
2. List of...
TPDL2013 tutorial linked data for digital libraries 2013-10-22
Tutorial on Linked Data for Digital Libraries, given by me, Uldis Bojars, and Nuno Lopes in Valletta, Malta at TPDL2013 on 2013-09-22.
http://tpdl2013.upatras.gr/tut-lddl.php



This half-day tutorial is aimed at academics and practitioners interested in creating and using Library Linked Data. Linked Data has been embraced as the way to bring complex information onto the Web, enabling discoverability while maintaining the richness of the original data. This tutorial will offer participants an overview of how digital libraries are already using Linked Data, followed by a more detailed exploration of how to publish, discover and consume Linked Data. The practical part of the tutorial will include hands-on exercises in working with Linked Data and will be based on two main case studies: (1) linked authority data and VIAF; (2) place name information as Linked Data.
For practitioners, this tutorial provides a greater understanding of what Linked Data is, and how to prepare digital library materials for conversion to Linked Data. For researchers, this tutorial updates the state of the art in digital libraries, while remaining accessible to those learning Linked Data principles for the first time. For library and iSchool instructors, the tutorial provides a valuable introduction to an area of growing interest for information organization curricula. For digital library project managers, this tutorial provides a deeper understanding of the principles of Linked Data, which is needed for bespoke projects that involve data mapping and the reuse of existing metadata models.

  • USB stick OR online (Google Doc): data for the hands-on activities
  • http://www.europeana.eu/
  • “The Evergreen and Koha integrated library systems now express their record details in the schema.org vocabulary out of the box using RDFa.” http://www.coffeecode.net/archives/271-RDFa-and-schema.org-all-the-library-things.html http://koha-community.org/ http://evergreen-ils.org
  • http://aims.fao.org/aos/agrovoc/
  • http://projecthydra.org/
  • Rob Styles at Code4Lib 2008 http://code4lib.org/conference/2008/styles SemanticMARCup http://dynamicorange.com/uploads/Semantic%20Marcup.pdf
  • Using identifiersto enable accessto add structure to link to other stuff
  • http://en.wikipedia.org/wiki/SPARQL we’re running a bit ahead when mentioning
  • http://en.wikipedia.org/wiki/SPARQL
  • Flow of presentation: - show this - then show RDF Validator slides (saying we discovered the URI from (a) webpage or (b) extracted RDFa data)
  • RDF validator http://www.w3.org/RDF/Validator/rdfval?URI=http%3A%2F%2Fwww.ivan-herman.net%2Ffoaf.rdf&PARSE=Parse+URI%3A+&TRIPLES_AND_GRAPH=PRINT_BOTH&FORMAT=PNG_EMBED
  • querying too ?
  • show SPARQL query assistants (or the link Jodi found where there are a number of example queries). Was it Europeana data? Provide links to further info re. SPARQL, or just refer to the break-out session.
  • Many, many tools – best to ask other people what they can recommend
  • MARC2SKOS – XQuery utility to convert MARC/XML Authority records to MADS/RDF and SKOS resources: https://github.com/kefo/marcauth-2-madsrdf • Dublin Core to RDF crosswalk: http://dublincore.org/documents/dc-rdf/ • OAI-PMH RDFizer – converts the metadata from an OAI-PMH-capable repository to RDF: http://simile.mit.edu/wiki/OAI-PMH_RDFizer
  • www.slideshare.net/jindrichmynarz/linking-library-data/6
  • Alphabets, diacritics
  • Library Data in Wikipedia & Wikidata, Maximilian Klein, Wikipedian in Residence at OCLC http://www.slideshare.net/oclcr/viaf-data-in-wikipedia-and-wikidata Quote from http://hangingtogether.org/?p=2306
  • https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/60975/designing-URI-sets-uk-public-sector.pdf
  • http://data.libris.kb.se/open/auth/71639.n3 via http://libris.kb.se/auth/71639 and copying link patterns
  • Linked Data is lightweight: RDF vocabularies.Less focus on constraints (e.g. OWL ontologies)
  • http://data.libris.kb.se/open/auth/71639.n3 via http://libris.kb.se/auth/71639 and copying link patterns
  • Transcript of "TPDL2013 tutorial linked data for digital libraries 2013-10-22"

    1. 1. Linked Data for Digital Libraries Uldis Bojars, Nuno Lopes, & Jodi Schneider TPDL 2013 September 22, 2013 Valletta, Malta 1
    2. 2. Nuno Digital Repository of Ireland & DERI Uldis National Library of Latvia Jodi DERI
    3. 3. Schedule for the day 9:00 - Introduction of presenters, tutorial schedule, and learning outcomes 9:10 - Motivation and concepts of Linked Data 9:30 - Discuss: How would you envision using Linked Data in your institution? 9:45 - Lifecycle of Linked Data & Exploring Linked Data 10:10 - Case Study 1: Authority Data 10:30 – 11 COFFEE BREAK 11:00 - Recap 11:10 - Modelling data as Linked Data 11:30 - Case Study 2: Geographical Linked Data 11:50 - Choice of Hands-on Activities 12:25 - Conclusions
    4. 4. Hands-on Activities 11:50 – 12:25 Choice of Activities…. • Data Modelling • Data Cleaning & Structuring • Querying (SPARQL)
    5. 5. Please share your expertise! • In the room • On paper • Online - shared folder: http://tinyurl.com/tpdl2013-ld-notes – PDF of the programme – Shared notes – More materials later
    6. 6. Objectives for Today • What is Linked Data? Why use it? • What are some examples of Linked Data in Digital Libraries? • What are the best practices for exploring & creating Linked Data?
    7. 7. Motivation and concepts of Linked Data
    8. 8. What is Linked Data? • • • • Using identifiers to enable access to add structure to link to other stuff
    9. 9. Why use Linked Data?
    10. 10. Key technology for library data! Representing Publishing Exchanging
    11. 11. • Powerful querying • Ability to mix/match vocabularies • Same technology stack as everybody else – Findability – Interoperability
    12. 12. Who is using Linked Data?
    13. 13. Aggregators
    14. 14. Integrated Library Systems & OPACs
    15. 15. Thesauri
    16. 16. Repositories
    17. 17. What is Linked Data (redux)?
    18. 18. Rob Styles
    19. 19. Towards RDF Subject Predicate Object
    20. 20. RDF triple Subject Predicate Object
    21. 21. RDF graph
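The triple and graph slides above can be made concrete with a few lines of Python. This is only a minimal sketch: it assumes a graph can be modelled as a plain set of (subject, predicate, object) tuples, and the `match` helper is a hypothetical name introduced for illustration, not part of any RDF library. The three statements are taken from the RDFa Distiller output for Ivan Herman's home page shown later in this deck.

```python
# Minimal sketch: an RDF graph modelled as a set of
# (subject, predicate, object) tuples. Not a real RDF library.
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

HOME = "http://www.ivan-herman.net/"
ME = "http://www.ivan-herman.net/foaf#me"

graph = {
    (HOME, RDF_TYPE, FOAF + "PersonalProfileDocument"),
    (HOME, FOAF + "primaryTopic", ME),
    (ME, RDF_TYPE, FOAF + "Person"),
}


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Which resources are typed as foaf:Person?
people = [s for s, _, _ in match(graph, p=RDF_TYPE, o=FOAF + "Person")]
print(people)
```

Because the object of one triple (`ME`) is the subject of another, the set of triples forms a graph rather than a flat table, which is the point of the "RDF graph" slide.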
    22. 22. How Linked Data works Reuses the existing Web infrastructure to publish your data along with your documents: – Using URI identifiers – and HTTP for accessing the information
    23. 23. Linked Data Principles 1. Use URIs as names for things. 2. Use HTTP URIs so that people can look up those names. 3. When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL). 4. Include links to other URIs, so that they can discover more things. http://www.w3.org/wiki/LinkedData http://www.w3.org/DesignIssues/LinkedData
    24. 24. Data on the Web is not enough… • We need a proper infrastructure for a real Web of Data – data is available on the Web • accessible via standard Web technologies – data is interlinked over the Web – ie, data can be integrated over the Web • We need Linked Data Slide credit: Ivan Herman
    25. 25. In groups of 2-3: Discuss • How would you envision using Linked Data? What are the opportunities? • Is your institution already using Linked Data? Planning a Linked Data project?
    26. 26. Lifecycle of Linked Data
    27. 27. Lifecycle of Linked Data • • • • • • • • Find Explore Transform Model Store Query Interlink Publish
    28. 28. Semantic Web for Digital Libraries Exploring Linked Data (Practical Tools and Approaches) Uldis Bojars, Nuno Lopes, & Jodi Schneider
    29. 29. Objectives • Learn about Linked Data (LD) by looking at existing data sources • Discover tools and approaches for exploring Linked Data
    30. 30. Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/
    31. 31. Exploring Linked Data • Discovering Linked Data • Accessing RDF data • Making sense of the data – Validating RDF data – Converting between formats – Browsing Linked Data • Querying RDF data
    32. 32. RDF graph
    33. 33. What RDF looks like • RDF can be expressed in a number of formats: – some are good for machines; some – understandable to people • Common formats: – RDF/XML – common, but difficult to read – NTriples – a simple list of RDF triples – Turtle – human-readable, easier to understand • Can be represented visually
    34. 34. Accessing RDF data RDF data on the Web can be found as: • Linked Data – follow links, request data by URI – returned data can be in various RDF formats • Data dumps – download the data • SPARQL endpoints – query Linked Data (more on that later)
    35. 35. http://www.ivan-herman.net/
    36. 36. Discovering Linked Data a) find it via a link in a Web page b) have some tools alert you that Linked Data is there – Tabulator – Semantic Radar c) explore a project you heard about – and know LOD should be there d) use a registry of sources http://datahub.io/group/lodcloud e) just ask someone
    37. 37. RDF discovery example • data at Ivan Herman’s page can be found via: – finding the RDF icon (with the link to FOAF file) – letting browser tools alert you that RDF is present • RDF auto-discovery – extracting RDFa data embedded in the page • for other data sources RDF content negotiation might work
    38. 38. Making sense of the data • Validating RDF data – Ensures that data representation is correct • Converting between formats – Convert to a [more] human-readable RDF format • Browsing Linked Data – Browse the data without worrying about “reading” RDF
    39. 39. Validating and Converting RDF • W3C RDF validator http://www.w3.org/RDF/Validator/ • URI debugger – “Swiss knife” of Linked Data http://linkeddata.informatik.hu-berlin.de/uridbg/ • RDFa distiller – extracts RDF embedded in web pages http://www.w3.org/2012/pyRdfa/ • Command-line tools (we’ll return to that)
    40. 40. <http://www.ivan-herman.net/> a foaf:PersonalProfileDocument; dc:creator "Ivan Herman"; dc:date "2009-06-17"^^xsd:date; dc:title "Ivan Herman’s home page"; xhv:stylesheet <http://www.ivan-herman.net/Style/gray.css>; foaf:primaryTopic <http://www.ivan-herman.net/foaf#me> . <http://twitter.com/ivan_herman> a foaf:OnlineAccount; foaf:accountName "ivan_herman"; foaf:accountServiceHomepage <http://twitter.com/> . <http://www.ivan-herman.net/cgi-bin/rss2to1.py> a rss:channel . <http://www.ivan-herman.net/foaf#me> a dc:Agent, foaf:Person; rdfs:seeAlso <http://www.ivan-herman.net/AboutMe>, <http://www.ivan-herman.net/cgi-bin/rss2to1.py>, <http://www.ivan-herman.net/foaf.rdf>; ... Extracted from http://www.ivan-herman.net/ using RDFa Distiller
    41. 41. Browsing Linked Data (DBPedia): http://live.dbpedia.org/resource/Valletta
    42. 42. Command Line Tools • wget – command line network downloader $ wget http://dbpedia.org/resource/Valletta • curl – specify HTTP headers $ curl -L -H "Accept: text/rdf+n3" http://dbpedia.org/resource/Valletta • Redland rapper – RDF parsing and serialisation $ rapper -o turtle http://dbpedia.org/resource/Valletta
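The curl call above asks for RDF through HTTP content negotiation, and the same request can be sketched with Python's standard library. The function names `build_rdf_request` and `fetch_rdf` are hypothetical. The default media type `text/turtle` is an assumption (the slide uses `text/rdf+n3`), and whether a given server honours it is not guaranteed.

```python
import urllib.request


def build_rdf_request(uri, accept="text/turtle"):
    """Build a GET request that asks for an RDF serialisation of `uri`.

    The Accept header plays the role of curl's -H option above;
    "text/turtle" is an assumed default, pass "text/rdf+n3" to mirror the slide.
    """
    return urllib.request.Request(uri, headers={"Accept": accept})


def fetch_rdf(uri, accept="text/turtle"):
    """Send the request (urlopen follows redirects, like curl -L) and return the body."""
    with urllib.request.urlopen(build_rdf_request(uri, accept)) as response:
        return response.read().decode("utf-8")


# Only the request object is built here; nothing is sent over the network.
req = build_rdf_request("http://dbpedia.org/resource/Valletta")
print(req.get_method(), req.full_url, req.get_header("Accept"))
```

Calling `fetch_rdf("http://dbpedia.org/resource/Valletta")` would perform the actual lookup; it is left out of the runnable part so the sketch works offline.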
    43. 43. Querying Linked Data • SPARQL Protocol and RDF Query Language • Graph Matching • Components of a SPARQL Query: – Prefix Declarations – Result type (SELECT, CONSTRUCT, DESCRIBE, ASK) – Dataset – Query pattern – Solution modifiers
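The five query components listed above can be seen side by side in a small sketch. It only assembles a query string and the GET URL that the SPARQL Protocol expects (the query passed as a `query` parameter), and does not contact any server. The endpoint URL comes from the Europeana slides; the graph IRI `http://example.org/graph`, the use of `dc:title`, and the helper name `sparql_get_url` are assumptions made for illustration.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

ENDPOINT = "http://europeana.ontotext.com/sparql"  # from the Europeana slides

QUERY = """\
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?item ?title
FROM <http://example.org/graph>
WHERE { ?item dc:title ?title }
ORDER BY ?title
LIMIT 10
"""
# Line by line: prefix declaration, result type (SELECT),
# dataset (FROM, hypothetical graph IRI), query pattern (WHERE),
# solution modifiers (ORDER BY, LIMIT).


def sparql_get_url(endpoint, query):
    """Encode a query as a SPARQL Protocol GET URL (hypothetical helper)."""
    return endpoint + "?" + urlencode({"query": query})


url = sparql_get_url(ENDPOINT, QUERY)
print(url)
```

In practice the `FROM` clause is often omitted so that the endpoint's default dataset is queried; it is kept here only so that every component from the slide appears once.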
    44. 44. Europeana SPARQL endpoint http://europeana.ontotext.com/
    45. 45. Sample queries provided: http://europeana.ontotext.com/sparql
    46. 46. http://tinyurl.com/europeana-rights-sparql
    47. 47. Tool catalogues: many more tools • Collection of tools from other projects – http://www.w3.org/2001/sw/wiki/LLDtools – http://www.w3.org/2001/sw/wiki/Tools – http://semanticweb.org/wiki/Tools – http://dbpedia.org/Applications
    48. 48. Interesting Projects • LOCAH – provides a stylesheet to transform UK Archives Hub EAD to RDF/XML, with examples of the process using XSLT http://data.archiveshub.ac.uk/ead2rdf/ • AliCAT (Archival Linked-data Cataloguing) – tool for editing collection level records http://data.aim25.ac.uk/step-change/ • Axiell CALM – solution for LAM that includes Linked Data functionality, allowing archivists to tag their collections with URIs from any chosen Linked Dataset. http://www.axiell.com/calm
    49. 49. Tools for Converting MARC records • MariMba Tool to translate MARC to RDF and Linked Data http://mayor2.dia.fi.upm.es/oegupm/index.php/en/downloads/228-marimba • marcauth-2-madsrdf XQuery utility to convert MARC/XML Authority records to MADS/RDF and SKOS resources https://github.com/kefo/marcauth-2-madsrdf
    50. 50. Tools for museum curators • Karma (http://isi.edu/integration/karma/) was used to map the records of the Smithsonian American Art Museum to RDF and link them to the Web and the Linked Open Data Cloud. Demo: http://www.youtube.com/watch?v=kUIqTI56oeQ
    51. 51. Authority Linked Data VIAF and Wikipedia case study
    52. 52. library links Slide credit: Jindřich Mynarz
    53. 53. • Use a single, distinct name for each person, organization, … • Name is consistently used throughout library systems • Issues: – “Strings” not “things” – in Linked Data world we’d just use URIs
    54. 54. http://viaf.org
    55. 55. VIAF • Virtual International Authority File (viaf.org) • Integrating authority information from a number of national libraries – Linked data + links to related information • Matching authority data from multiple sources – using related bibliographic records to help matching
    56. 56. Wikipedia + VIAF • How can people discover useful information in VIAF and via VIAF? • Linked Data eco-system – let’s explore (!) – Wikipedia -> VIAF -> National Library LD • Example (Andrejs Pumpurs): – http://en.wikipedia.org/wiki/Andrejs_Pumpurs – http://viaf.org/viaf/44427367/
    57. 57. http://en.wikipedia.org/wiki/Andrejs_Pumpurs
    58. 58. http://viaf.org/viaf/44427367/
    59. 59. VIAF • Ontologies used: – FOAF, SKOS, RDA (FRBR entities and elements), Dublin Core, VIAF, UMBEL • Related datasets: – National authority data: • Germany (d-nb.info), Sweden (LIBRIS), France (idref.fr) – DBPedia
    60. 60. http://viaf.org/viaf/44427367/
    61. 61. How did VIAF get into Wikipedia? • VIAFbot – algorithmically matched by name, important dates, and selected works • “The principal benefit of VIAFbot is the interconnected structure.” -
    62. 62. One Direction VIAF Slide credit: Maximilian Klein, Wikipedian in Residence at OCLC English Wiki
    63. 63. Enter VIAFBot: Wikipedia Robot VIAF Slide credit: Maximilian Klein, Wikipedian in Residence at OCLC English Wiki
    64. 64. Idea: Reciprocate VIAF Slide credit: Maximilian Klein, Wikipedian in Residence at OCLC English Wiki
    65. 65. VIAF – summary: – an efficient way of putting library authority data online as linked data – if the organization also provides Linked Data itself, it can add links to VIAF that point back to the organization’s LD records (which may contain richer / additional information)
    66. 66. Data Modelling
    67. 67. Publishing Data • Naïve Transform – Direct Mapping of Relational Data to RDF See RDB2RDF OR • Model & Transform – Figure out how to represent data – Then transform according to the model
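The "naïve transform" option above can be illustrated with a short sketch. It is loosely in the spirit of a direct mapping from relational rows to triples (one subject per row, one predicate per column) and is not an implementation of the W3C RDB2RDF specifications. The base IRI `http://example.org/db/`, the `place` table, and the function name `direct_map` are all hypothetical.

```python
def direct_map(base, table, key, rows):
    """Turn table rows into (subject, predicate, object) triples.

    Simplified sketch: the subject IRI is built from the table name and
    primary key value, each predicate from the table and column name.
    """
    triples = []
    for row in rows:
        subject = f"{base}{table}/{key}={row[key]}"
        for column, value in row.items():
            if value is None:
                continue  # NULL cells produce no triple
            triples.append((subject, f"{base}{table}#{column}", value))
    return triples


# Hypothetical table content, for illustration only.
rows = [
    {"id": 1, "name": "Valletta", "country": "Malta"},
    {"id": 2, "name": "Dublin", "country": None},
]
triples = direct_map("http://example.org/db/", "place", "id", rows)
for triple in triples:
    print(triple)
```

The output mirrors the database schema exactly, which is why the slide contrasts it with "Model & Transform": a modelled dataset would replace these generated predicates with terms from shared vocabularies.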
    68. 68. Model • Describe the domain – What are the important concepts? – What are their properties? – What are their relations? • Choose vocabularies
    69. 69. DC TERMS RDF Vocabulary http://purl.org/dc/terms/
    70. 70. Deciding on URI patterns • • • • Use a domain that you control Use consistent patterns Manage change: transparent isn’t always best Consider what concepts are worth distinguishing
    71. 71. Example URI patterns • Designing URI Sets for the UK Public Sector • Defines patterns for – Identifier URI – Document URI – Representation URI • Identifier example: http://{domain}/id/{concept}/{reference} http://data.archiveshub.ac.uk/id/person/ncarules/skinnerbeverley1938-1999artist
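The identifier pattern above is a simple template, and filling it in can be sketched as follows. The function name `identifier_uri` is hypothetical, and the sketch assumes the reference is already URI-safe, so no percent-encoding is applied.

```python
IDENTIFIER_PATTERN = "http://{domain}/id/{concept}/{reference}"


def identifier_uri(domain, concept, reference):
    """Fill in the identifier URI pattern (assumes URI-safe inputs)."""
    return IDENTIFIER_PATTERN.format(
        domain=domain, concept=concept, reference=reference
    )


# Values read off the Archives Hub example on the slide.
uri = identifier_uri(
    "data.archiveshub.ac.uk",
    "person",
    "ncarules/skinnerbeverley1938-1999artist",
)
print(uri)
```

Keeping the pattern in one place makes it easier to follow the "use consistent patterns" advice from the previous slide.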
    72. 72. Choosing Vocabularies • Audience & Purpose – e.g. search engine vs. bibliographic exchange • Domain – Biomedical, geographical, … • Granularity • Popularity: potential for interlinking & reuse
    73. 73. Finding vocabularies & ontologies
    74. 74. Look at examples
    75. 75. Look at examples
    76. 76. Find examples: Linked Open Data Cloud
    77. 77. Look at Publications & Lists http://www.w3.org/2005/Incubator/lld/XGR-lldvocabdataset-20111025/
    78. 78. Ask the community • Mailing lists – LOD-LAM – Code4Lib – OKFN Open-Bibliography Working Group – W3C Schema.org BibEx Community Group • Domain-specific Linked Data groups & lists
    79. 79. Popularity
    80. 80. Popularity: Semantic search engines http://sindice.com/
    81. 81. Modeling spectrum: lightweight to heavyweight An ontology ”spectrum” (in the order of complexity). Source: [Lassila and McGuinness, 2001]. Image from Bojars 2009
    82. 82. Some popular vocabularies • • • • • • DC BIBO FOAF LODE (LinkedEvents) OAI-ORE SKOS
    83. 83. Be aware of & connect to • Authority data – e.g. VIAF • Thesauri – e.g. Agrovoc • Linked Data is about Linking!
    84. 84. Modeling examples • • • • • BIBFRAME British Library Data Model EDM LIBRIS VIAF
    85. 85. VIAF • Ontologies used: – FOAF, SKOS, RDA (FRBR entities and elements), Dublin Core, VIAF, UMBEL • Related datasets: – National authority data: • Germany (d-nb.info), Sweden (LIBRIS), France (idref.fr) – DBPedia
    86. 86. LIBRIS Modeling
    87. 87. British Library Data Model - Book
        @prefix blt: <http://www.bl.uk/schemas/bibliographic/blterms#> .
        @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
        @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
        @prefix owl: <http://www.w3.org/2002/07/owl#> .
        @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
        @prefix dct: <http://purl.org/dc/terms/> .
        @prefix isbd: <http://iflastandards.info/ns/isbd/elements/> .
        @prefix skos: <http://www.w3.org/2004/02/skos/core#> .
        @prefix bibo: <http://purl.org/ontology/bibo/> .
        @prefix rda: <http://rdvocab.info/ElementsGr2/> .
        @prefix bio: <http://purl.org/vocab/bio/0.1/> .
        @prefix foaf: <http://xmlns.com/foaf/0.1/> .
        @prefix event: <http://purl.org/NET/c4dm/event.owl#> .
        @prefix org: <http://www.w3.org/ns/org#> .
        @prefix geo: <http://www.w3.org/2003/01/geo/wgs84_pos#> .
        [Diagram: British Library data model for books, showing the resource with its title, identifiers, subjects, authors and other agents, places, series, and publication events. Node and edge labels are not reproduced here.]
        All properties with a range of blt:PublicationEvent can be used with blt:PublicationStartEvent and blt:PublicationEndEvent. Arrows omitted for clarity.
        Assume that most instance data will have an rdfs:label. These properties have been omitted for clarity.
        V.1.4 August 2012. Tim Hodson - tim.hodson@talis.com, Corine Deliot - Corine.Deliot@bl.uk, Alan Danskin - Alan.Danskin@bl.uk, Heather Rosie - Heather.Rosie@bl.uk, Jan Ashton - Jan.Ashton@bl.uk
        British Library Data Model http://www.bl.uk/bibliographic/pdfs/bldatamodelbook.pdf
    88. 88. Semantic Web for Digital Libraries Geographical LD case study Uldis Bojars, Nuno Lopes, & Jodi Schneider
    89. 89. The NLI Longfield Map Collection • Collections refer to geographical data in many forms… • The Longfield Maps are a set of 1,570 surveys carried out in Ireland between 1770 and 1840. • Currently catalogued in MARCXML, using data from Logainm, GeoNames and DBpedia.
    90. 90. Longfield Map example <marc:datafield tag="650" ind1="" ind2=""> <marc:subfield code="a">Land tenure</marc:subfield> <marc:subfield code="z">Ireland</marc:subfield> <marc:subfield code="z">Rathdown (Barony)</marc:subfield> </marc:datafield> <marc:datafield tag="650" ind1="" ind2=""> <marc:subfield code="a">Land use surveys</marc:subfield> <marc:subfield code="z">Ireland</marc:subfield> <marc:subfield code="z">Wicklow (County)</marc:subfield> </marc:datafield>
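The place names in a record like this sit in the `z` subfields of the 650 (subject) fields. As a minimal sketch of how they could be pulled out for later linking, the following uses only the Python standard library. The enclosing `marc:record` element and the namespace declaration are assumptions added so the fragment parses on its own, and `subject_places` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# MARC 21 "slim" XML namespace; the slide shows only the prefixed fragment.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# The two 650 fields from the slide, wrapped in an assumed <marc:record>.
RECORD = """<marc:record xmlns:marc="http://www.loc.gov/MARC21/slim">
  <marc:datafield tag="650" ind1="" ind2="">
    <marc:subfield code="a">Land tenure</marc:subfield>
    <marc:subfield code="z">Ireland</marc:subfield>
    <marc:subfield code="z">Rathdown (Barony)</marc:subfield>
  </marc:datafield>
  <marc:datafield tag="650" ind1="" ind2="">
    <marc:subfield code="a">Land use surveys</marc:subfield>
    <marc:subfield code="z">Ireland</marc:subfield>
    <marc:subfield code="z">Wicklow (County)</marc:subfield>
  </marc:datafield>
</marc:record>"""


def subject_places(record_xml):
    """Return (topic, [geographic subdivisions]) for each 650 field."""
    root = ET.fromstring(record_xml)
    results = []
    for field in root.findall("marc:datafield[@tag='650']", MARC_NS):
        topic = field.findtext(
            "marc:subfield[@code='a']", default="", namespaces=MARC_NS
        )
        places = [
            sf.text for sf in field.findall("marc:subfield[@code='z']", MARC_NS)
        ]
        results.append((topic, places))
    return results


subjects = subject_places(RECORD)
```

The place strings returned here ("Rathdown (Barony)", "Wicklow (County)") are the values that would need to be matched against an authority such as Logainm.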
    91. 91. Geographic Data Providers • DBpedia – Includes latitude and longitude for geographic entities • LinkedGeoData – Export of data from OpenStreetMap – Beyond lat/lon (areas as polygons) • GeoNames – Access data as RDF (download requires subscription) • GeoLinkedData – Spain • Ordnance Survey – UK
    92. 92. Logainm.ie • The authority list of Irish place names, validated by the Place Names Branch. • Provides a more detailed level of coverage than DBpedia or GeoNames. • Unique source of Irish-language place names. • NLI is looking to integrate Logainm data into its workflow, allowing searches for place names in Irish.
    93. 93. Geo-Vocabularies • W3C Geo (very basic) – SpatialThing, latitude and longitude • Most providers have defined their own • NeoGeo (http://geovocab.org/doc/neogeo/) – Feature vs Geometry – Spatial Relations (is_part_of)
    94. 94. NeoGeo Overview • Classes – Feature (spatial:Feature) • A geographical feature, capable of holding spatial relations. – Geometry (geom:Geometry) • Super-class of all geometrical representations (RDF, KML, GML, WKT...). • A Feature is connected to its Geometry by the geometry property (geom:geometry)
    95. 95. Relations between geometries • Properties – connects with (spatial:C) – overlaps (spatial:O) – is part of (spatial:P) – contains (spatial:Pi) – …
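To make the Feature / Geometry split and the spatial relations concrete, here is a small sketch that writes a handful of N-Triples by hand. The namespace URIs are assumptions based on the NeoGeo and W3C Geo vocabularies, and every `http://example.org/` URI and the coordinates are made up for illustration.

```python
# Namespace URIs below are assumptions; check them against the vocabularies.
SPATIAL = "http://geovocab.org/spatial#"
GEOM = "http://geovocab.org/geometry#"
WGS84 = "http://www.w3.org/2003/01/geo/wgs84_pos#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/"  # hypothetical namespace for the example resources


def triple(s, p, o, literal=False):
    """Format one N-Triples line; the object is a URI unless literal=True."""
    obj = f'"{o}"' if literal else f"<{o}>"
    return f"<{s}> <{p}> {obj} ."


barony = EX + "feature/barony"
county = EX + "feature/county"
barony_point = EX + "geometry/barony-point"

graph = [
    # Two features that can hold spatial relations.
    triple(barony, RDF_TYPE, SPATIAL + "Feature"),
    triple(county, RDF_TYPE, SPATIAL + "Feature"),
    # "is part of" relation between the features.
    triple(barony, SPATIAL + "P", county),
    # The feature points at a separate geometry resource.
    triple(barony, GEOM + "geometry", barony_point),
    triple(barony_point, RDF_TYPE, GEOM + "Geometry"),
    # Made-up coordinates, expressed with the basic W3C Geo properties.
    triple(barony_point, WGS84 + "lat", "53.2", literal=True),
    triple(barony_point, WGS84 + "long", "-6.2", literal=True),
]

print("\n".join(graph))
```

Keeping the geometry as its own resource means the same feature could later carry additional representations without changing the relations between features.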
    96. 96. Creating a LD Dataset Steps: 1. Data transformation / access • Vocabulary assessment 2. Link Discovery • Evaluation of generated links 3. Deployment • Virtuoso OpenSource
    97. 97. Converting Logainm to RDF • ~100,000 place names • ~1.3M triples • Example: http://data.logainm.ie/1375542 Dublin http://sws.geonames.org/2964574/
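A conversion of this kind boils down to turning each place record into a few triples: labels in each language plus links to other datasets. The sketch below is not the actual Logainm conversion; it assumes a hypothetical record layout (`uri`, `names`, `same_as`) and writes the Logainm URI as it reads on the slide with the line break removed, so treat its exact form as an assumption.

```python
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def escape(text):
    """Escape backslashes and double quotes for an N-Triples literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def place_to_ntriples(place):
    """Turn one place record (hypothetical layout) into N-Triples lines."""
    subject = f"<{place['uri']}>"
    lines = []
    # One language-tagged label per name, in a stable order.
    for lang, name in sorted(place["names"].items()):
        lines.append(f'{subject} <{RDFS_LABEL}> "{escape(name)}"@{lang} .')
    # One owl:sameAs link per matching resource in another dataset.
    for target in place["same_as"]:
        lines.append(f"{subject} <{OWL_SAME_AS}> <{target}> .")
    return lines


dublin = {
    "uri": "http://data.logainm.ie/1375542",  # assumed form of the slide's URI
    "names": {"en": "Dublin", "ga": "Baile Átha Cliath"},
    "same_as": ["http://sws.geonames.org/2964574/"],
}

ntriples = place_to_ntriples(dublin)
print("\n".join(ntriples))
```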
    98. 98. Link Discovery • Silk – http://wifo5-03.informatik.uni-mannheim.de/bizer/silk/ • LIMES – http://aksw.org/Projects/LIMES.html • Based on specifying rules that compare pairs of entities
    99. 99. Rules to discover links to other datasets • Rules based on: – Place names – Geographical coordinates – Name of the county / parent place name – Hierarchy of places • # entities matched: – DBpedia: 1,552 – LinkedGeoData: 6,611 – GeoNames: 8,229
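The idea of a linkage rule can be sketched in plain Python: compare every source/target pair and accept it only if the names are similar, the parent county agrees and the coordinates are close. This is not Silk or LIMES syntax. The thresholds, the record layout and the example records are all hypothetical, and string similarity is approximated here with `difflib` from the standard library.

```python
from difflib import SequenceMatcher
from math import asin, cos, radians, sin, sqrt


def name_similarity(a, b):
    """Case-insensitive similarity ratio between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance (haversine) in kilometres."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))


def is_match(source, target, min_name=0.9, max_km=5.0):
    """Hypothetical rule: similar name, same county, nearby coordinates."""
    return (
        name_similarity(source["name"], target["name"]) >= min_name
        and source["county"] == target["county"]
        and distance_km(source["lat"], source["lon"], target["lat"], target["lon"]) <= max_km
    )


def discover_links(sources, targets):
    """Compare all pairs and return (source URI, target URI) for matches."""
    return [(s["uri"], t["uri"]) for s in sources for t in targets if is_match(s, t)]


# Made-up records standing in for two datasets.
sources = [
    {"uri": "http://example.org/a/dublin", "name": "Dublin", "county": "Dublin", "lat": 53.35, "lon": -6.26},
]
targets = [
    {"uri": "http://example.org/b/dublin", "name": "dublin", "county": "Dublin", "lat": 53.33, "lon": -6.25},
    {"uri": "http://example.org/b/durrow", "name": "Durrow", "county": "Laois", "lat": 52.85, "lon": -7.40},
]

links = discover_links(sources, targets)
```

Comparing all pairs is quadratic in the number of entities, which is one reason dedicated tools are used at the scale of ~100,000 place names; generated links still need to be evaluated before publishing.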
    100. 100. Longfield Map example <marc:datafield tag="650" ind1="" ind2=""> <marc:subfield code="a">Land tenure</marc:subfield> <marc:subfield code="z">Ireland</marc:subfield> <marc:subfield code="z">Rathdown (Barony)</marc:subfield> </marc:datafield> <marc:datafield tag="650" ind1="" ind2=""> <marc:subfield code="a">Land use surveys</marc:subfield> <marc:subfield code="z">Ireland</marc:subfield> <marc:subfield code="z">Wicklow (County)</marc:subfield> </marc:datafield> <marc:datafield tag="651" ind2="7" ind1=""> <marc:subfield code="2">logainm.ie</marc:subfield> <marc:subfield code="a">Rathdown</marc:subfield> <marc:subfield code="0">http://data.logainm.ie/place/283</marc:subfield> </marc:datafield>
    101. 101. Demo: Location LODer http://apps.dri.ie/locationLODer/locationLODer
    102. 102. Hands-on Activities 11:50 – 12:25 Choice of Activities…. • Data Modelling • Data Cleaning & Structuring • Querying (SPARQL)
    103. 103. Semantic Web for Digital Libraries Open Refine Exercise Uldis Bojars, Nuno Lopes, & Jodi Schneider
    104. 104. Open Refine • Useful for batch transformation of large amounts of data – data cleanup (misspellings, splitting multiple-valued columns, …) • Linking to other databases – Freebase – Any SPARQL enabled LD • Website: http://openrefine.org/ • RDF extension: http://refine.deri.ie/
    105. 105. Exercise • Examples from: http://freeyourmetadata.org/ • Sample Data (collection metadata from the Sydney Powerhouse Museum): http://data.freeyourmetadata.org/powerhouse-museum/phm-collection.zip • Screencast: http://www.youtube.com/watch?v=NnCA1dnCT-c
    106. 106. Task 1 - Data Cleanup 1. Import the collection into OpenRefine 2. Get to know your data 3. Remove blank rows 4. Remove duplicate rows 5. Split cells with multiple values 6. Remove blank cells 7. Cluster values 8. Remove double category values
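For readers who want to see what several of these cleanup steps do to the data, here is a rough equivalent in plain Python. It is only an illustration of the operations, not how OpenRefine implements them. The column names, the `|` separator and the sample rows are assumptions, and the clustering step is left out because it needs fuzzy grouping of similar values.

```python
def clean(rows, multi_col="Categories", sep="|"):
    """Drop blank and duplicate rows, then split and dedupe a multi-valued column."""
    seen = set()
    cleaned = []
    for row in rows:
        # Remove blank rows: every cell empty or whitespace.
        if not any(value.strip() for value in row.values()):
            continue
        # Remove duplicate rows: identical in every column.
        key = tuple(sorted(row.items()))
        if key in seen:
            continue
        seen.add(key)
        # Split cells with multiple values, remove blank cells,
        # and remove values repeated within the same record.
        values = []
        for value in row[multi_col].split(sep):
            value = value.strip()
            if value and value not in values:
                values.append(value)
        cleaned.append({**row, multi_col: values})
    return cleaned


# Made-up rows imitating the shape of collection metadata.
rows = [
    {"Record ID": "1", "Categories": "Clothing|Hats||Hats"},
    {"Record ID": "1", "Categories": "Clothing|Hats||Hats"},
    {"Record ID": "", "Categories": ""},
    {"Record ID": "2", "Categories": " Maps "},
]

result = clean(rows)
```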
    107. 107. Task 2 - Data Reconciliation & RDF Export 1. Pick a column to reconcile 2. Pick a vocabulary to reconcile with 3. Tell OpenRefine about the vocabulary 4. Start the reconciliation process 5. Understanding the reconciliation results 6. Interpreting the new reconciliation results 7. Exporting RDF
    108. 108. Semantic Web for Digital Libraries SPARQL Hands-on Session Uldis Bojars, Nuno Lopes, & Jodi Schneider
    109. 109. SPARQL • Query Language for RDF data • W3C Standard • Components of a SPARQL Query: – Prefix Declarations – Result type (SELECT, CONSTRUCT, DESCRIBE, ASK) – Dataset – Query pattern – Solution modifiers
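The five components listed above map directly onto the text of a query. As a small sketch, the hypothetical helper `build_query` below assembles a SELECT query from them in that order; the graph URI and the triple pattern are made up for illustration.

```python
def build_query(prefixes, result_clause, dataset, pattern, modifiers):
    """Assemble a SPARQL query string from its five components."""
    # 1. Prefix declarations
    lines = [f"PREFIX {name}: <{uri}>" for name, uri in prefixes.items()]
    # 2. Result type (SELECT, CONSTRUCT, DESCRIBE or ASK)
    lines.append(result_clause)
    # 3. Dataset
    lines.append(f"FROM <{dataset}>")
    # 4. Query pattern
    lines.append("WHERE {")
    lines.extend("  " + triple_pattern for triple_pattern in pattern)
    lines.append("}")
    # 5. Solution modifiers
    lines.append(modifiers)
    return "\n".join(lines)


query = build_query(
    prefixes={"rdfs": "http://www.w3.org/2000/01/rdf-schema#"},
    result_clause="SELECT ?place ?label",
    dataset="http://example.org/graph/places",  # hypothetical graph URI
    pattern=["?place rdfs:label ?label ."],
    modifiers="ORDER BY ?label LIMIT 10",
)

print(query)
```

In practice the dataset clause is often omitted and the endpoint's default graph is queried instead.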
    110. 110. Further information • In-Depth SPARQL tutorials – http://www.cambridgesemantics.com/semanticuniversity/sparql-by-example – http://axel.deri.ie/presentations/20100922SPARQL1.1 Tutorial.pptx – http://web.ing.puc.cl/~marenas/talks/BNCOD13.pdf • SPARQL: – http://sparql.org/ (Jena) – http://dydra.org/
    111. 111. SPARQL by example – Europeana Endpoint Endpoint: http://europeana.ontotext.com/sparql 1. SPARQL Select template 2. List of data providers having contributed content to Europeana 3. List of provided objects with their aggregators 4. 18th century Europeana objects from France 5. Write your own
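As a starting point for exercise 2, a query can be sent to an endpoint as the `query` parameter of an HTTP GET request, as defined by the SPARQL protocol. The sketch below only builds the request URL and does not contact the server, since the endpoint may not be available. The use of `edm:dataProvider` and the EDM namespace are assumptions about how the endpoint models its data, and `request_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Endpoint as given on the slide.
ENDPOINT = "http://europeana.ontotext.com/sparql"

# Assumed modelling: aggregations carry an edm:dataProvider value.
QUERY = """PREFIX edm: <http://www.europeana.eu/schemas/edm/>
SELECT DISTINCT ?provider
WHERE {
  ?aggregation edm:dataProvider ?provider .
}
LIMIT 100"""


def request_url(endpoint, query):
    """Build a SPARQL protocol GET URL with the query percent-encoded."""
    return endpoint + "?" + urlencode({"query": query})


url = request_url(ENDPOINT, QUERY)
print(url)
```

The resulting URL can be pasted into a browser or fetched with any HTTP client; the LIMIT keeps the result small while exploring.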