Wrapping and Unwrapping History: What’s Gained and What’s Lost



Presentation given at the 'Unlocking Sources: WW1 & Europeana' conference held at the Staatsbibliothek zu Berlin, Germany, on 31st January 2014.


Published in: Education, Technology
  • Today I'll talk about WW1 Discovery. This page is from the blog, where you can get more info. The one-liner is not all that useful. Will explore.
  • WW1 Discovery is part of a major JISC activity called Discovery. It has ceased under this name and lots is changing at Jisc, but discovery activities are still a major activity. The core of Discovery is advocating open data and licenses, very similar to Europeana. Perhaps less familiar is aggregating data: how you bring resources together. We tried to use data where it sits rather than gathering it. We have created an overlay API, the 'wrapping', based on the subject of WW1, and two user interfaces that 'unwrap' the mediated data.
  • JISC Discovery vision doc. We have IWM and V&A. The API is in area 2 of the diagram. The interfaces are a layer. This meets a still very common use case. Also mention the technical principles.
  • Lots could be said about APIs, but I will focus on the resource discovery aspects. It is about machine readability. Most info on the web is marked up for humans, which is very useful but has its limits. APIs allow machines to read effectively the same info, usually over the web, though it doesn't have to be. There are many ways of doing this. 'API' isn't a standard, though standards do exist. E.g. Twitter clients such as TweetDeck, Hootsuite and Janetter use the API. There is increasing interest in linked data and the web in the museums space. WW1 is not linked data; mention Locah and Linking Lives. Some have more of an interoperability focus, some are more proprietary.
  • More specifically, aggregating data from APIs: using APIs and helping institutions set up APIs. The Phase 1 King's College work identified institutions with good WW1 material. Unfortunately this work wasn't focussed so much on technical provision. Only a few of the identified sources had APIs: V&A and NMM. The IWM API was under the radar. Also data from other aggregators such as Europeana and Culture Grid: Great War Archive and Europeana 1914-1918. Picking certain things. Revised the project plan and tried to help data sources. The optimistic plan didn't work. Have taken data from them and set up examples at Mimas. In addition, very few data source institutions have APIs. The aim was to take data in all sorts of formats. SOLR is very popular. OpenSearch.
  • First version of the API released November 2012. There have been many subsequent revisions and we are almost there with it. Worked through the last set of bugs and fixes in late January 2013. The API is like other search services such as Google in using a query syntax. Available as XML and JSON. About 12 data sources.
  • Now onto the demonstrators. Worked with 2 suppliers. Home page here. KCL identified unexplored areas that were used as themes.
  • Can drag it around; it tries to present a nice exploratory format.
  • Tried to highlight the visual side of things. A challenge is that many things have no images.
  • Some nice stuff from NMM. Can get the usual things: description, larger image, click through to the site and see licensing info.
  • Worked with WAWWD. Also has a search.
  • Note this is where the crowdsourcing comes in. Can tell your own story. The idea is to feed this info back to data sources.
  • Click street view to get the overlay view.
  • Can place images.
  • Also has a map view. Can pull around and click through for interesting stuff.
  • Challenges include the lack of APIs available to aggregate data. Data comes in all shapes and sizes. Tried to work live with data where it is, called federated searching; decided to have another go. This means results are completely up to date, and we don't have to manage data locally, so there is less maintenance. The technical principles suggest using data in situ here. The merits are that data doesn't get stale and that, in principle, we shouldn't have the data maintenance issues that centralisers such as Archives Hub have. Mimas also does lots of centralisation, so we wanted to try a different approach. Also, most APIs are suited to querying, not harvesting. There is no valid way to relevance rank the search results of the different data sources against each other. Acknowledged that even when you do centralise and have a view of all the metadata, it is still questionable how to rank, as records come in all shapes, sizes, quality and degrees of sparsity, as Europeana appear to have found. Of course, if not using the API for cross searching, this may well not be a problem. APIs are not meant for aggregating. Historypin is good for getting missing data, but there is no good flow for feeding this data back to institutions.
  • Wrapping and Unwrapping History: What’s Gained and What’s Lost

    1. Wrapping and Unwrapping History: What’s Gained and What’s Lost Unlocking Sources: WW1 & Europeana Staatsbibliothek zu Berlin, Germany. 31st January 2014 Adrian Stevenson Senior Technical Innovations Coordinator Mimas, University of Manchester, UK @adrianstevenson
    2. ww1.discovery.ac.uk
    3. WW1 Discovery Project • Proof-of-Concept illustrating principles of the JISC Discovery initiative • Discovery about advocating ‘open’ and ‘aggregating’ • Make digital content more discoverable by people and machines www.discovery.ac.uk • Built WW1 aggregation API and discovery layers
    4. What is an API? • ‘Application Programming Interface’ • Allows machine readability of data – Typically over the Web • Provides access to content or functions for other systems • Many ways to do this – e.g. – Google, Facebook, Flickr, Twitter APIs … – OAI-PMH, Z39.50 – RDF - Linked Data, Semantic Web
    5. WW1 Discovery: How? • Aggregate and ‘wrap’ data from existing APIs – NMM, V&A, Europeana • Help others with example API – BL, Welsh Voices, Postal Museum • Formats: SOLR, RSS, OpenSearch, OAI-PMH, CSV
    6. http://ww1.discovery.ac.uk/how-to-use-our-api/
    7. Challenges • Lack of APIs • Difficulties merging data – Varied content and formats – APIs can change – Relevance ranking dubious • From Discovery ‘Technical Principles’ - “Discovery is distributed … Discovery is concerned with a plethora of information resources and services from a wide variety of sources and is prepared, where appropriate, to deal with these in situ” • Speed of API response • Lack of content – images – geo-data and time data • Content licenses not open
    8. Contact Adrian Stevenson Mimas, University of Manchester, UK adrian.stevenson@manchester.ac.uk www.mimas.ac.uk www.twitter.com/adrianstevenson www.linkedin.com/in/adrianstevenson www.slideshare.net/adrianstevenson
    9. CC License This presentation available under Creative Commons Non Commercial-Share Alike: http://creativecommons.org/licenses/by-nc/2.0/uk/
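
The 'wrapping' idea and the relevance-ranking challenge described in the notes and on the Challenges slide can be illustrated with a small sketch. This is not the WW1 Discovery API itself: the source functions, field names and record shape below are all hypothetical stand-ins, and round-robin interleaving is just one simple way to merge results when scores from different sources cannot be compared. It shows a federated search that queries each source live, maps each source's records onto a shared shape, skips a source that fails, and merges the lists without pretending to rank them against each other.

```python
from itertools import zip_longest
from typing import Callable, Dict, List, Tuple

Record = Dict[str, str]
# Hypothetical source description: (search function, title field, link field).
Source = Tuple[Callable[[str], List[dict]], str, str]


def normalise(name: str, raw: dict, title_key: str, link_key: str) -> Record:
    """Map one source-specific record onto a small shared shape."""
    return {
        "source": name,
        "title": str(raw.get(title_key, "")),
        "link": str(raw.get(link_key, "")),
    }


def interleave(result_lists: List[List[Record]]) -> List[Record]:
    """Round-robin merge: first hit from each source, then second hits, and so on."""
    merged: List[Record] = []
    for tier in zip_longest(*result_lists):
        merged.extend(rec for rec in tier if rec is not None)
    return merged


def federated_search(query: str, sources: Dict[str, Source]) -> List[Record]:
    """Query every source in situ and merge whatever comes back."""
    per_source: List[List[Record]] = []
    for name, (search, title_key, link_key) in sources.items():
        try:
            hits = search(query)
        except Exception:
            # A slow or changed API should not break the whole search.
            continue
        per_source.append([normalise(name, h, title_key, link_key) for h in hits])
    return interleave(per_source)


# Stand-in sources with different field names (assumptions, not real APIs).
def museum_a(query: str) -> List[dict]:
    return [{"name": f"{query} poster", "url": "a/1"},
            {"name": f"{query} medal", "url": "a/2"}]


def museum_b(query: str) -> List[dict]:
    return [{"label": f"{query} diary", "href": "b/1"}]


def broken(query: str) -> List[dict]:
    raise TimeoutError("source did not respond")


SOURCES: Dict[str, Source] = {
    "museum_a": (museum_a, "name", "url"),
    "museum_b": (museum_b, "label", "href"),
    "broken": (broken, "title", "link"),
}

results = federated_search("ww1", SOURCES)
for rec in results:
    print(rec["source"], rec["title"], rec["link"])
```

In a real setting each stand-in function would call a live API, so results stay up to date and nothing is stored locally, at the cost of depending on every source's response speed and availability.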