“Il n’y a pas de hors-texte” - Challenges for Archival Linked Data

Invited speaker talk given at the 'Meeting on Semantic Web and Archives, Libraries and Museums' event, Fundación Ramón Areces, Madrid, Spain. 10th April 2014.

Published in: Technology, Education

  • Generally considered to be a more accurate translation. Got me thinking about archival context, and about the process of creating linked data as somewhat like deconstruction: breaking down what we have, thinking about the things in it, then reconstructing. This process possibly problematizes the notion of archival context. The RDF model problematizes ISAD(G), archival context and document-centric ways of thinking.
  • The Archives Hub is an aggregation of archival descriptions from archive repositories across the UK. Its core data forms the basis of the linked data. Approx. 500,000 component-level descriptions.
  • Talk through the page a bit. Currently 1,495,168 statements (triples) in the LD subset.
  • ‘Every story has a beginning’
  • Mock-up of the Linking Lives interface shows the way data is brought together.
  • External data is key to linked data. We link to VIAF and through that to DBpedia. We are looking at linking to the BNB.
  • Current unfinished version of the interface.
  • Data modelling can be hard and takes time. Vocabularies can be hard. Transforming data is hard. XSLT is hard. Not many tools. Worth the investment?
  • Steep learning curve:
    - RDF and Linked Data modelling terminology
    - Lack of archive domain examples, though you now have LOCAH!
    - Certain level of expertise needed
    Dirty data:
    - ‘Joe Bloggs and others’ rather than just a name, or where the access points do not have rules or a source associated with them
    - Extent data highly variable
    Complexity:
    - “Lower level” units are interpreted in the context of the higher levels of description
    - Arguably “incomplete” without the contextual data
    Relations are asserted, e.g. member-of/component-of, but there is no requirement or expectation that data consumers will follow the links describing the relations.
    From Pete’s blog post: “In a post on the Archives Hub blog, Jane emphasised the value of the “Linked Data” approach in making things mentioned in our data into “first-class citizens”. One consequence of the multi-level approach in archival description practice is a strong sense of the importance of “context”, and that the descriptions of the “lower level” units should be read and interpreted in the context of the higher levels of description (perhaps even that they are in some sense “incomplete” without that “contextual” data). In contrast, the “Linked Data” approach typically involves exposing “bounded descriptions” of individual resources. Now, certainly, yes, those “bounded descriptions” include assertions of relationships with other resources (including the sort of part-whole/member-of/component-of relationships present here), and those links can be followed by consumers to obtain further information on the other resources – however, there is no requirement or expectation that consumers will do so. So, there is arguably a (perhaps unavoidable) element of tension between the strongly “contextual” emphasis of EAD and ISAD(G) and the “bounded descriptions” of “Linked Data”. Rather than seeing that as an insurmountable hurdle, however, I think it provides an area that the project can usefully explore and evaluate.”
  • Names are often entered into the Hub in different ways, despite the use of Rules.
  • One of the challenges of doing Linked Data is the plethora of vocabularies. It is hard to decide what we should use. Daniel Suara highlighted this.
  • But matching strings is not easy, e.g. matching subjects in the Hub with subjects in LCSH.
  • Quotes from Linking Lives Evaluation: Researchers want a clearer idea of what is covered, and they don’t always understand the results they see or why they get certain results in response to their searches. I can’t help thinking that, bearing this in mind, bringing diverse sources together may make it more difficult for users to understand and interpret results. “they remained cautious about the principle of bringing sources together”. On serendipitous searching, there was a feeling that it could potentially be useful but also that it could actually distract the researcher from what is relevant. “I think at PhD level there’s a kind of artistry to how you make your way through…I’ve certainly never come across a search engine that can do the same or be as complex as your own thinking patterns.” Whilst it could be said that it is not important for users to understand how data is pulled together under the hood, our research suggested that potential users, particularly advanced researchers, do indeed have an interest in how and why this information has been gathered together in a particular way.
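
The name-matching difficulty described in the notes above can be made concrete with a minimal sketch. It assumes a crude matching key (surname plus first forename) and an illustrative list of titles to ignore; neither is an Archives Hub rule, and `match_key`, `TITLES` and `LIFE_DATES` are hypothetical names. The input strings are the Churchill forms shown on slide 18.

```python
import re
from collections import defaultdict

# Illustrative list of titles to ignore; an assumption, not a Hub rule set.
TITLES = {"sir", "dame", "lord", "lady"}

# Life dates such as "1874-1965" or "1874- 1965".
LIFE_DATES = re.compile(r"\d{4}\s*-\s*\d{4}")


def match_key(name: str) -> tuple[str, str]:
    """Return a crude (surname, first forename) key for a personal name."""
    without_dates = LIFE_DATES.sub("", name)
    parts = [p.strip() for p in without_dates.split(",") if p.strip()]
    if not parts:
        return ("", "")
    if len(parts) > 1:
        # Inverted form: "Surname, [title,] Forenames, epithet ..."
        surname = parts[0]
        rest = [word for part in parts[1:] for word in part.split()]
    else:
        # Natural order: "[Title] Forenames Surname"
        words = parts[0].split()
        surname, rest = words[-1], words[:-1]
    forenames = [w for w in rest if w.lower() not in TITLES]
    first = forenames[0].lower() if forenames else ""
    return (surname.lower(), first)


VARIANTS = [
    "Winston Leonard Churchill",
    "Sir Winston Leonard Spencer Churchill",
    "Churchill, Sir, Winston Leonard Spencer, 1874- 1965, knight, prime minister and historian",
    "Churchill, Winston Leonard, 1874-1965, prime minister",
    "Churchill, Sir Winston, 1874- 1965, knight, statesman and historian",
]

# Group the variant strings under their shared key.
groups = defaultdict(list)
for variant in VARIANTS:
    groups[match_key(variant)].append(variant)

for key, forms in groups.items():
    print(key, len(forms))
```

All five forms fall under one key here, but a key this coarse would also merge different people who share a surname and first forename. That is one reason to compare life dates as well and to link to an authority file such as VIAF rather than rely on string matching alone.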
  • Transcript

    • 1. Meeting on Semantic Web and Archives, Libraries and Museums, Fundación Ramón Areces, Madrid, Spain. 10th April 2014. Adrian Stevenson, Senior Technical Innovations Coordinator, Mimas, University of Manchester, UK. @adrianstevenson. “Il n’y a pas de hors-texte” – Challenges for Archival Linked Data
    • 2. “Il n’y a pas de hors-texte”. ‘Of Grammatology’, Jacques Derrida, 1967
    • 3. “There is nothing outside the text”
    • 4. “There is nothing outside context”
    • 5.
    • 6.
    • 7. Deconstruction / Context • Archives Hub data in ‘Encoded Archival Description’ EAD XML form • Need to think about: – knowing what we want to say about our ‘things’ – data modelling – defining relationships – selecting vocabularies – deciding on identifiers – HTTP URIs – creating RDF XML – linking to external resources
    • 8. Archives Hub Model [data model diagram]. Entities: Archival Resource, Finding Aid, EAD Document, Biographical History, Agent (Family, Person, Organisation), Repository (Agent), Place, Concept, Concept Scheme, Genre, Function, Book, Language, Level, Object, Postcode Unit, Extent, Creation, Birth, Death, Temporal Entity. Relationships: maintainedBy/maintains, origination, associatedWith, accessProvidedBy/providesAccessTo, topic/page, hasPart/partOf, encodedAs/encodes, administeredBy/administers, hasBiogHist/isBiogHistFor, foaf:focus, is-a, level, language, inScheme, representedBy, extent, participates in, at time, product of, in.
    • 9.
    • 10. Visualisation Prototype. Using Timemap (Googlemaps and Simile). Early stages with this. Will give location and ‘extent’ of archive. Will link through to Archives Hub.
    • 11.
    • 12.
    • 13. Linking Lives • Linking Lives is a project to create an end-user interface based on Linked Data • A biographical interface, providing information about individuals that is taken from a variety of sources • Aim is to place archival descriptions within a much broader context
    • 14. Martha Beatrice Webb. Place of birth: Gloucester, England. Place of death: Liphook, Hampshire, England. Life dates: 1858-1943. Epithet: social reformer and historian. Family name: Webb. Image from: Beatrice Webb letters. Beatrice Webb (1858 - 1943). Fabian Socialist, social reformer, writer, historian, diarist. Wife, collaborator and assistant of Sidney Webb, later Lord Passfield. Together they contributed to the radical ideology first of the Liberal Party and later of the Labour Party. from: Beatrice Webb, A summer holiday in Scotland, 1884. Beatrice Webb (1858-1943), nee Potter, social reformer and diarist. Married to Sidney Webb, pioneers of social science. She was involved in many spheres of political and social activity including the Labour Party, Fabianism, social observation, investigations into poverty, development of socialism, the foundation of the National Health Service and post war welfare state, the London School of Biographical Notes Works Our Partnership My Apprenticeship The case for the factory acts Beatrice Webb’s diaries; edited by Margaret Cole The Diary Knows,_1st_Bar on_Passfield
    • 15. Why? • Telling stories • Placing archives in a global information space • External data forms part of the user interface – moving away from the silo approach • Dynamic links to other content • Extensible • An exemplar – shows what can be done
    • 16. Some Challenges / Lessons Learnt • Steep learning curve • Difficult data, URI persistence • Linking data not straightforward • Keeping data up to date • How sustainable are the data sources? • Can you track the provenance of data sources? • Are data licensing issues covered?
    • 17. Data Modelling • Steep learning curve – RDF terminology “confusing” – Lack of archival examples • Complexity – Archival description is hierarchical and multi-level – RDF may be at odds with ISAD(G)
    • 18. Hub data inconsistencies • Winston Leonard Churchill • Sir Winston Leonard Spencer Churchill • Churchill, Sir, Winston Leonard Spencer, 1874- 1965, knight, prime minister and historian • Churchill, Winston Leonard, 1874-1965, prime minister • Churchill, Sir Winston, 1874- 1965, knight, statesman and historian
    • 19. Understanding Vocabs & Ontologies
    • 20. Linking Names
    • 21. Linking Subjects
    • 22. Thoughts on What Next? • We still need more convincing use / business cases – Clear articulation of what researchers actually gain by bringing diverse data together • We still need more and better tools – But this depends on use cases • Cultural heritage not working together enough – better collaboration on things like name URIs • Coordinated consistent approach for vocabs
    • 23. Adrian Stevenson @adrianstevenson More on Linked Data at:
    • 24. This presentation is available under creative commons Non Commercial-Share Alike: