LINKED DATA AND THE LOCAH PROJECT
Bethan Ruddock, Library and Archival Services, Mimas, University of Manchester
bethan.firstname.lastname@example.org @bethanar #ILI2011
LINKED OPEN COPAC & ARCHIVES HUB
• JISC-funded project (under JISCexpo - exposing digital content for education and research)
• September 2010 – August 2011
• Staff from Mimas, UKOLN, Eduserv
• Additional expertise from Talis, OCLC, Library of Congress
PROJECT AIMS
• Put archival and bibliographic data at the heart of the Linked Data Web, making new links between diverse content sources, enabling the free and flexible exploration of data and enabling researchers to make new connections between subjects, people, organisations and places to reveal more about our history and society.
• Make a collection of resources available on the Web as structured data, in particular linked data, where a case can be made that it would benefit teaching, learning, research, administration and/or knowledge transfer in UK higher education.
• Develop a prototype with instructional step-by-step demonstration and documentation to show how the structured content can be used by 3rd party tools and services.
• Explore and report on the opportunities and barriers in making content structured and exposed on the Web for discovery and use. Such opportunities and barriers may coalesce around licensing implications, trust, provenance, sustainability and usability.
Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/
THE DATA: COPAC
• Merged union catalogue of the holdings of over 60 UK libraries
• Over 50 million records
• Consolidated records
• MODS XML (not MARC)
A Copac consolidated record created from 5 contributed records. Lines show how contributed records match with one another.
THE DATA: ARCHIVES HUB
• Descriptions of archive collections from over 200 UK repositories
• Nearly 25,000 descriptions – collection-level and multi-level
• EAD (Encoded Archival Description)
CHALLENGES: VARIANCE
• Data from many sources – should adhere to standards: AACR2, ISAD(G)
• BUT: differences in implementation
CHALLENGES: DATA
260 $b: unknown
dct:publisher: unknown
dct:publisher definition: ‘entity responsible for making the resource available’
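One way to handle the placeholder problem on this slide, as a minimal sketch: skip values such as "unknown" instead of asserting them as a publisher. The placeholder list, the function name `publisher_triple` and the example.org URIs are assumptions made for illustration, not the LOCAH project's actual transform.

```python
import re
from typing import Optional

DCT_PUBLISHER = "http://purl.org/dc/terms/publisher"
AGENT_BASE = "http://example.org/id/agent/"  # hypothetical namespace

# Assumed placeholder values; real catalogue data may use others.
PLACEHOLDERS = {"", "unknown", "s.n.", "[s.n.]"}


def publisher_triple(record_uri: str, publisher: str) -> Optional[str]:
    """Return an N-Triples line for dct:publisher, or None for a placeholder."""
    # Trim whitespace and trailing cataloguing punctuation.
    name = publisher.strip().rstrip(",;:").strip()
    if name.lower() in PLACEHOLDERS:
        return None  # "unknown" is not an entity that made the resource available
    # Build a simple URI slug from the name (illustrative only).
    slug = re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")
    if not slug:
        return None
    return f"<{record_uri}> <{DCT_PUBLISHER}> <{AGENT_BASE}{slug}> ."
```

With this approach a record whose 260 $b is "unknown" simply gets no dct:publisher statement, which avoids claiming that an entity called "unknown" made the resource available.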
CHALLENGES: MULTIPLE SOURCES A ‘match graph’ of a consolidated Copac record
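As a rough sketch of how a match graph like this could yield consolidated records: treat each contributed record as a node, each match as an edge, and group connected nodes together. The record identifiers and the function name `consolidate` are hypothetical, and this is not claimed to be Copac's actual matching or merging procedure.

```python
from collections import defaultdict


def consolidate(record_ids, matches):
    """Group record ids into sets connected by pairwise matches."""
    neighbours = defaultdict(set)
    for a, b in matches:
        neighbours[a].add(b)
        neighbours[b].add(a)

    seen = set()
    groups = []
    for start in record_ids:
        if start in seen:
            continue
        # Walk the graph from this record to collect everything reachable.
        group, stack = set(), [start]
        while stack:
            current = stack.pop()
            if current in group:
                continue
            group.add(current)
            stack.extend(neighbours[current] - group)
        seen |= group
        groups.append(group)
    return groups


# Five hypothetical contributed records; r5 matches nothing.
records = ["r1", "r2", "r3", "r4", "r5"]
matches = [("r1", "r2"), ("r2", "r3"), ("r4", "r1")]
groups = consolidate(records, matches)
```

Records that match only indirectly (r3 and r4 here) still end up in the same group, which is one reason provenance gets harder to track once records are merged.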
LICENSING
• Data comes from contributors: not ours to redistribute!
• Concerns: provenance, trust, control
• Consulted: liaised with contributors and stakeholders
THE TECHY STUFF
Specifications required a lot of brainstorming…
Image used under a CC licence from http://www.flickr.com/photos/blankdots/4865831504/
ARCHIVES HUB MODEL
[Diagram: entity-relationship model of the Archives Hub data.
Entities: Repository (Agent), Finding Aid, EAD Document, Archival Resource, Biographical History, Place, Postcode Unit, Level, Language, Extent, Creation, Temporal Entity, Agent, Concept, Concept Scheme, Object, Person, Family, Organisation, Book, Birth, Death, Genre, Function.
Relationships: maintainedBy/maintains, administeredBy/administers, hasPart/partOf, encodedAs/encodes, accessProvidedBy/providesAccessTo, hasBiogHist/isBiogHistFor, topic/page, level, language, origination, product of, at time, associatedWith, extent, inScheme, representedBy, foaf:focus, participates in, is-a.]
Node name               MODS field         Ontology
BibliographicResource   <modscollection>   bibo

Cardinality   Property                Node              URI/literal                         Ontology
0 1           copac:creator           Creator           URI                                 dc
0 m           copac:contributor       Contributor       URI                                 copac
0 1           event:producedIn        Production Date   URI                                 event
0 1           dct:issued              Production Date   URI                                 dc
0 m           pode:publicationPlace   Place             URI                                 pode
0 m           isbd:P1016              Place             URI                                 isbd
0 m           dct:publisher           Publisher         URI                                 dc
0 1           dct:isPartOf            Series            URI                                 dc
1 m           copac:HeldBy            Institution       URI (with Institution as subject)
1 1           bibo:type               Type              URI                                 bibo
0 m           dct:subject             Subject           URI                                 dc
0 m           skos:subject            subject           URI                                 skos
0 m           dct:language            Language          URI                                 dc
1 1           hub:encodedAs           mods              URI                                 hub
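The cardinality values in this table lend themselves to a simple check before triples are emitted. The following is a sketch only: it encodes a few rows as (min, max) pairs, with `None` standing for "m" (many), and the record layout and the function name `check_cardinality` are assumptions, not the project's actual code.

```python
# A few rows from the table: property -> (min, max); None means "m" (many).
CARDINALITY = {
    "copac:creator": (0, 1),
    "copac:contributor": (0, None),
    "dct:publisher": (0, None),
    "copac:HeldBy": (1, None),
    "bibo:type": (1, 1),
    "hub:encodedAs": (1, 1),
}


def check_cardinality(record):
    """Return a list of human-readable violations for one record.

    `record` maps property names to lists of values (a hypothetical layout).
    """
    problems = []
    for prop, (low, high) in CARDINALITY.items():
        count = len(record.get(prop, []))
        if count < low:
            problems.append(f"{prop}: expected at least {low}, found {count}")
        elif high is not None and count > high:
            problems.append(f"{prop}: expected at most {high}, found {count}")
    return problems


# Hypothetical record with two creators and no holding institution.
record = {
    "copac:creator": ["agent/a", "agent/b"],
    "bibo:type": ["bibo:Book"],
    "hub:encodedAs": ["mods/1"],
}
problems = check_cardinality(record)
```

Here `problems` reports the extra creator and the missing holding institution; a real pipeline might log such records for review instead of rejecting them outright.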
WHAT NEXT?
Linking Lives
• name-based approach into the data
• integrating archival resource with other resources: DBPedia, VIAF, Copac...
• route into archives for different audiences?
• issues around trust and provenance to be explored
FINALLY…
The LOCAH data is open for use…
…please play with it!
Image used under a CC licence from http://www.flickr.com/photos/huladancer22/530743543/
@bethanar
LOCAH blog: http://email@example.com
Image used under a CC licence from http://www.flickr.com/photos/theilluminated/5386099858/