Presented at Linked Open Data: current practice in libraries and archives (Cataloguing & Indexing Group in Scotland 3rd Linked Open Data Conference), Edinburgh, 18 Nov 2013
Publishing the British National Bibliography as Linked Open Data / Corine Deliot, British Library
1. Publishing the British National Bibliography as Linked Open Data
Corine Deliot
Metadata Standards Analyst
British Library
CIGS Linked Open Data Seminar
Edinburgh, 18 November 2013
2. Presentation overview
• Motivations and approach
• The modelling process and the data model
• Technical process: from MARC 21 to RDF
• Linking to external datasets
• Outcomes – datasets/platform/access
• Plans for future developments
www.bl.uk
3. Motivations
• Publishing our data for others to re-use
• Looking beyond library audiences
• Taking part in the Linked Data conversation
4. How?
• Pragmatic, bottom-up approach
• Using existing staff
• Building on existing skills
• Using existing tools as much as possible
5. Why BNB?
• General bibliography - not a unique institutional catalogue
• Consistent format - over 60 years
• Size & range of content - 3 million records on all subjects in many languages
• Control of metadata - publishable as CC0
6. The modelling process (I)
• Identify our objects of interest, i.e. what does the MARC record say about “things in the world”,
e.g. bibliographic resources, people, organizations, places, subjects, etc.
• Assign URIs to identify these objects of interest
URI pattern guidance from the UK Cabinet Office
“Designing URI Sets for the UK Public Sector”
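The URI assignment step can be sketched as follows. This is a minimal illustration, assuming the identifier shape `{base}/id/{concept}/{reference}` described in that guidance; the domain `example.org`, the function name `mint_uri` and the sample identifiers are hypothetical and are not the British Library's implementation.

```python
from urllib.parse import quote

BASE = "http://example.org"  # assumption: placeholder domain


def mint_uri(concept: str, reference: str, base: str = BASE) -> str:
    """Build an identifier URI of the form {base}/id/{concept}/{reference}."""
    # Percent-encode the reference so that spaces or punctuation in a
    # source identifier cannot produce an invalid URI.
    return f"{base}/id/{concept}/{quote(reference.strip(), safe='')}"


# Hypothetical sample values, for illustration only.
print(mint_uri("resource", "009910399"))
print(mint_uri("person", "Smith, John"))
```

Keeping the pattern in one function means every object of interest (resource, person, place, and so on) gets a URI built the same way.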
8. The modelling process (II)
Describe these objects of interest and how they relate
to each other.
Use classes and properties from existing RDF
vocabularies
Define our own classes and properties when required;
documented in the British Library Terms RDF schema
9. RDF Vocabularies
• Bibliographic Ontology
• Org: an Organisation Ontology
• Bio: a Vocabulary for Biographical Information
• OWL
• British Library Terms
• RDA
• Dublin Core
• RDF
• Event Ontology
• RDF Schema
• FOAF: Friend of a Friend
• SKOS
• ISBD
• WGS84 Geo Positioning
10. The British Library Terms RDF Schema
blt=“http://www.bl.uk/schemas/bibliographic/blterms#”
• Existing property not quite right (e.g. not granular enough)
e.g. dcterms:identifier vs blt:bnb
11. The British Library Terms RDF Schema
blt=“http://www.bl.uk/schemas/bibliographic/blterms#”
Property or class required by specific feature of the model
e.g. blt:publication and blt:PublicationEvent
(rdfs:subClassOf event:Event)
12. The British Library Terms RDF Schema
blt=“http://www.bl.uk/schemas/bibliographic/blterms#”
For pragmatic reasons, e.g. facilitate searching, inferencing
and navigating through the graph
e.g. blt:TopicLCSH and blt:TopicDDC
e.g. blt:hasCreated owl:inverseOf dcterms:creator
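To illustrate why an inverse property helps navigation, the sketch below materialises a blt:hasCreated statement for every dcterms:creator statement, so the graph can be followed from a person to their works as well as from a work to its creator. Triples are plain Python tuples; the `example.org` URIs and the function name `add_inverse` are assumptions made for the example, not part of the BNB dataset.

```python
DCTERMS_CREATOR = "http://purl.org/dc/terms/creator"
BLT_HAS_CREATED = "http://www.bl.uk/schemas/bibliographic/blterms#hasCreated"


def add_inverse(triples, prop, inverse):
    """Return the input triples plus one inverse statement per use of `prop`."""
    result = set(triples)
    for subject, predicate, obj in triples:
        if predicate == prop:
            # Swap subject and object under the inverse property.
            result.add((obj, inverse, subject))
    return result


# Hypothetical sample data.
triples = {
    ("http://example.org/id/resource/1", DCTERMS_CREATOR,
     "http://example.org/id/person/a"),
}
for triple in sorted(add_inverse(triples, DCTERMS_CREATOR, BLT_HAS_CREATED)):
    print(triple)
```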
13. The BNB data model - Books
http://www.bl.uk/bibliographic/pdfs/bldatamodelbook.pdf
15. Data Model Features (II): Publication as an event
• Usual approach:
<BibResource> dcterms:publisher <Publisher> .
<BibResource> dcterms:issued “Date” .
<BibResource> ? “Place” .
Or
<BibResource> ? <Place> .
• Event-based approach:
<BibResource> blt:publication <PublicationEvent> .
<PublicationEvent> event:place <Place> .
<PublicationEvent> event:agent <Publisher> .
<PublicationEvent> event:time <Year> .
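The event-based pattern can be sketched in a few lines. The function below produces the four statements for one resource and shows how the place of publication is then reached by following the event node. The function names and the angle-bracket placeholders are illustrative assumptions; properties are written as prefixed names rather than full URIs.

```python
def publication_event_triples(resource, event, place, publisher, year):
    """Return the four event-based publication statements for one resource."""
    return [
        (resource, "blt:publication", event),
        (event, "event:place", place),
        (event, "event:agent", publisher),
        (event, "event:time", year),
    ]


def places_of_publication(triples, resource):
    """Follow resource -> publication event -> place."""
    events = {o for s, p, o in triples
              if s == resource and p == "blt:publication"}
    return {o for s, p, o in triples
            if s in events and p == "event:place"}


# Hypothetical placeholders standing in for real URIs.
triples = publication_event_triples(
    "<BibResource>", "<PublicationEvent>", "<Place>", "<Publisher>", "<Year>"
)
print(places_of_publication(triples, "<BibResource>"))
```

Because place, agent and time all hang off the event node, none of them needs a property attached directly to the bibliographic resource.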
16. Data model features (III)
• Birth and death are modelled as biographical events
• Extensive use of foaf:focus to relate “things in the world”
(e.g. people, organizations, places) to their SKOS concepts.
e.g. “Paris”, the capital of France as a single “thing in the
world” may be the “focus” of multiple concepts belonging to
different concept schemes, e.g. thesauri (LCSH, Rameau,
etc.)
<Concept> foaf:focus <Thing in the World>
http://efoundations.typepad.com/efoundations/2011/09/things-their-conceptualisations-skos-foaffocus-modellingchoices.html
by Pete Johnston
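As a small illustration of the foaf:focus pattern, the sketch below groups concepts from different schemes by the single real-world thing they share as their focus. The identifiers in angle brackets are made-up placeholders, not real LCSH or Rameau URIs, and `concepts_by_focus` is a hypothetical helper name.

```python
from collections import defaultdict

FOAF_FOCUS = "foaf:focus"


def concepts_by_focus(triples):
    """Group concepts by the real-world thing they have as their focus."""
    grouped = defaultdict(set)
    for concept, prop, thing in triples:
        if prop == FOAF_FOCUS:
            grouped[thing].add(concept)
    return dict(grouped)


# Hypothetical sample data: two concepts, one thing in the world.
triples = [
    ("<lcsh-concept-paris>", FOAF_FOCUS, "<place-paris>"),
    ("<rameau-concept-paris>", FOAF_FOCUS, "<place-paris>"),
]
print(concepts_by_focus(triples))
```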
17. MARC to RDF Conversion Workflow
Process
• Selection
• Character set conversion
• Pre-processing
• URI generation
• Data transformation
• Create & load triples
Tools
• Catalogue Bridge Utilities
• MARC Global/MARC Report
http://www.marcofquality.com/
• Jena Eyeball
http://jena.sourceforge.net/Eyeball/
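The process steps can be sketched end to end on toy data. This is not how the tools listed above work; it is a simplified stand-in, assuming each record is a Python dict keyed by MARC tag (015 for the national bibliography number, 245 for the title), a placeholder `example.org` domain, and already-decoded Unicode text, so the character set conversion step is omitted.

```python
BASE = "http://example.org/id"  # assumption: placeholder domain


def select(records):
    """Selection: keep only records carrying a bibliography number (015)."""
    return [r for r in records if r.get("015")]


def preprocess(record):
    """Pre-processing: trim whitespace and trailing ISBD punctuation."""
    return {tag: value.strip().rstrip(" /:;,") for tag, value in record.items()}


def to_triples(record):
    """URI generation and data transformation for one record."""
    number = record["015"]
    resource = f"<{BASE}/resource/{number}>"
    triples = [(resource, "blt:bnb", f'"{number}"')]
    if "245" in record:
        title = record["245"]
        triples.append((resource, "dcterms:title", f'"{title}"'))
    return triples


def convert(records):
    """Run the steps in order and return all triples."""
    triples = []
    for record in select(records):
        triples.extend(to_triples(preprocess(record)))
    return triples


# Hypothetical sample records.
records = [
    {"015": " GBA000001 ", "245": "An example title / "},
    {"245": "Record without a number"},
]
for s, p, o in convert(records):
    print(s, p, o, ".")
```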
18. Linking to external sources (I)
To give our data broader context we linked to:
• General resources:
  • GeoNames
  • Lexvo
  • RDF Book Mashup
• Library resources:
  • LCSH
  • VIAF
  • Dewey.info
  • MARC language and country codes
19. Linking to external sources (II)
Techniques included:
• Automatic generation from record data
• Auto text match with linked data dumps
• Crosswalk matching for coded data
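Two of these techniques can be sketched briefly. The crosswalk below maps a handful of MARC language codes to ISO 639-3 codes and builds a Lexvo-style URI; the text match normalises a label before looking it up in an index built from a data dump. The three-entry table, the index contents and all function names are assumptions for illustration, and the Lexvo URI shape should be checked against the live service before use.

```python
import unicodedata

# Assumption: a tiny hand-made crosswalk from MARC language codes to ISO 639-3.
MARC_TO_ISO639_3 = {"eng": "eng", "fre": "fra", "ger": "deu"}


def language_link(marc_code):
    """Crosswalk matching: map a coded value to an external URI, or None."""
    iso = MARC_TO_ISO639_3.get(marc_code.strip().lower())
    return f"http://lexvo.org/id/iso639-3/{iso}" if iso else None


def normalise(label):
    """Fold case, accents and surrounding punctuation before comparing labels."""
    decomposed = unicodedata.normalize("NFKD", label)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(stripped.casefold().strip(" .,;:").split())


def text_match(label, dump_index):
    """Text matching: look up a normalised label in an index built from a dump."""
    return dump_index.get(normalise(label))


# Hypothetical index, as if built from the labels in a linked data dump.
dump_index = {normalise("Edinburgh"): "http://example.org/place/1"}

print(language_link("fre"))
print(text_match("  EDINBURGH. ", dump_index))
```

Unmatched values return `None` rather than a guessed link, so they can be reviewed separately.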
25. Platform change
• 2011 - initial Talis platform
• 2013 – data migration to TSO platform
http://www.tso.co.uk/our-expertise/technology/openup-platform
Tendering process
Migration of data and services over a couple of months
26. Plans for Future Developments
• Refine and extend the model
• Investigate FRBR-ization
• Link to other external sources
• GeoNames at city level
• ISNI, LC/NACO, DBpedia
• DNB bibliographic resources
• Expand scope beyond current BNB
• Improve developer support