Hub Distributed Model 2009


Published in: Education, Technology

  1. 1. The Archives Hub ~ Interoperability, Spokes and the Distributed Model
  2. 2. The Hub in a Nutshell
       • Based at Mimas, University of Manchester
       • In service since 2000
       • Over 23,000 collection descriptions
       • 170 repositories
       • JISC funded
       • Management and service team at Manchester
       • Development team at Liverpool
       • Cheshire software
       • Cheshire for Archives – works with EAD descriptions
       • Distributed system
  3. 3. Content and contributors
       • Strategic aim: build and enhance content
       • Meeting the needs of the UK research community
       • Meeting the needs of the wider community
       • Archives for education and research
       Hub Workshop 2009
       Flickr cc licence: eileenaway's photostream
       The success of the Hub is a reflection of the rich content available from Hub contributors
  4. 4. Current contributors
       • Higher/Further Education
       • Consortium contributions
       • Institutions with a research agenda
       • Others on a case-by-case basis
       • We encourage institutions to contact us
       John Rylands Library, Manchester
  5. 5. Collection or lower-level…?
       • Originally funded for collection-level
       • Software/searches effective with both
       • Complementary approaches
       • Researchers ask for detail
       • Images useful at item level
       Flickr cc licence: Muffet’s photostream
       Flickr cc licence: soylentgreen23’s photostream
  6. 6. JISC Information Environment
       • … a vast and sometimes bewildering range of potential sources of electronic information. Each source of information has its own name, its own interface, features and search facilities. Little wonder, then, that many users remain unaware of their existence or fail to discover their value for their own learning, teaching or research.
       • A key challenge is therefore to achieve a managed, coherent and shared information environment that will overcome these obstacles.
       • Being able to cross-search and use customised, value added and other services will considerably simplify users’ interactions with online resources. This should encourage take-up and greatly improve means of accessing these resources.
       • … these activities need to be based on standards for the creation, access, use, preservation and interoperability of networked resources.
  7. 7. JISC Information Environment
       • Most content providers will already offer a Web site through which end-users can access their content. To be a part of the JISC-IE, content providers also need to support machine oriented interfaces to their resources.
           • Support searching using Z39.50/SRW
           • Support metadata harvesting using OAI-PMH
       Andy Powell, 5 step guide to becoming a content provider in the JISC Information Environment
  8. 8. E-GIF, open source and open standards
       • e-GIF version 6.1 (18th March 2005)
           • The e-Government Interoperability Framework (e-GIF) sets out the government’s technical policies and specifications for achieving interoperability… across the public sector.
           • There is a strategic decision to adopt XML and XSL as the core standards for data integration and management.
           • It is a pragmatic strategy that aims to reduce cost and risk for government systems while aligning them to the global Internet revolution.
       • Open Source, Open Standards and Re-Use: Government Action Plan
  9. 9. Isn’t technology brilliant?!!
       • Technical know-how
       • XML
       • Data creation/editing template
       • Web interface
       • Machine interfaces
       • Distributed model
       • Web 2.0
       • Dissemination
       = Satisfying user experience + understanding users
  10. 10. Hub Data Flow
       • Sustainable model
       • Data held as XML
       • Efficient search mechanism
       • Flexible access
       • Easy to become a Spoke
  11. 11. The Distributed Hub
       Flickr cc licence: Thomas Hawk
       The main goal of a distributed computing system is to connect users and resources in a transparent, open, and scalable way. Ideally this arrangement is drastically more fault tolerant and more powerful than many combinations of stand-alone computer systems. [Wikipedia]
       • Administration interface
       • Customisable web front-end
       • Machine-to-machine interfaces
       • Data Creation Template
       • Local control
       • Technical support locally
       • Hub team support
  12. 12. Spokes software
       • Offers a means of storing and sharing archival descriptions in XML
       • Provides machine-to-machine access to the descriptions through Z39.50 and SRU (Search/Retrieve via URL), and through OAI-PMH for harvesting records
       • Provides a customisable Web search interface
       • Is open source and based on open standards
       • Includes a data creation and editing template
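To make the SRU access mentioned on this slide more concrete, here is a minimal sketch of how a client might build a searchRetrieve request. The endpoint address `spoke.example.org` is hypothetical (the slides give no address), and the choice of SRU version 1.1 is an assumption; the parameter names (`operation`, `version`, `query`, `startRecord`, `maximumRecords`) come from the SRU standard itself.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical Spoke endpoint; the slides do not give a real address.
SPOKE_SRU_BASE = "http://spoke.example.org/sru"


def build_sru_url(cql_query, start_record=1, maximum_records=10,
                  base=SPOKE_SRU_BASE):
    """Return an SRU searchRetrieve URL for the given CQL query."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",  # assumed version, not stated in the slides
        "query": cql_query,
        "startRecord": start_record,
        "maximumRecords": maximum_records,
    }
    # urlencode percent-encodes the CQL query (spaces, quotes, '=').
    return base + "?" + urlencode(params)


url = build_sru_url('dc.title = "papers"')
print(url)
```

Because the whole request is an ordinary URL, any application that can issue an HTTP GET can search a Spoke this way, which is the point of offering SRU alongside Z39.50.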
  13. 13. Anatomy of a Spoke
       • EAD XML files
       • Cheshire indexes of EAD data
       • Web search interface
       • Direct searching access for other applications through standards-based machine-to-machine protocols … Including the central Hub!
       • Protocols shown: HTTP, Z39.50, SRU
  14. 14. Spokes indexes
       The database will provide indexes based on the following standards:

       Data standard          Data field(s)
       cql.anywhere           full text
       dc.description         unittitle, controlaccess, and scopecontent fields
       dc.title               collection title (titleproper)
       dc.creator             creator of the collection
       dc.identifier          eadid unitdate
       dc.subjects            controlaccess fields personal, family, corporate and geographic names
       bath.personalName      personal names
       bath.corporateName     corporate names
       bath.geographicName    geographic names
       bath.genreForm         genre
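As a rough illustration of how index names on this slide relate to EAD elements, the sketch below pulls a few values out of an invented EAD record. This is not how Cheshire builds its indexes; the sample record, the `INDEX_PATHS` mapping, and the choice of `persname`, `geogname` and `genreform` inside `controlaccess` as the sources for the `bath.*` indexes are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Invented sample record, not taken from the Hub.
SAMPLE_EAD = """<ead>
  <eadheader>
    <eadid>example-0001</eadid>
    <filedesc>
      <titlestmt><titleproper>Example Family Papers</titleproper></titlestmt>
    </filedesc>
  </eadheader>
  <archdesc level="collection">
    <did>
      <unittitle>Example Family Papers</unittitle>
      <unitdate>1850-1900</unitdate>
    </did>
    <controlaccess>
      <persname>Example, Ada</persname>
      <geogname>Manchester</geogname>
      <genreform>Letters</genreform>
    </controlaccess>
  </archdesc>
</ead>"""

# Index name -> element path (ElementTree's XPath subset).
# These paths are assumptions, not the Spokes configuration.
INDEX_PATHS = {
    "dc.title": ".//titleproper",
    "dc.identifier": ".//eadid",
    "bath.personalName": ".//controlaccess/persname",
    "bath.geographicName": ".//controlaccess/geogname",
    "bath.genreForm": ".//controlaccess/genreform",
}


def extract_index_terms(ead_xml):
    """Return {index name: [text values]} for one EAD description."""
    root = ET.fromstring(ead_xml)
    terms = {}
    for index_name, path in INDEX_PATHS.items():
        terms[index_name] = [
            el.text.strip() for el in root.findall(path) if el.text
        ]
    return terms


terms = extract_index_terms(SAMPLE_EAD)
for name, values in terms.items():
    print(name, values)
```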
  15. 15. Administration Interface
  16. 17. Hub Workshop 2009
  17. 18. Liverpool Spoke
  18. 19. John Rylands Spoke
  19. 20. Agreement with Spokes
  20. 21. Hosted Spokes
       • Spokes at Manchester
           • Configuration
           • Agreement between parties
       • Manchester team undertake agreed level of support
       • Institution still responsible for the data
  21. 22. Being an Archives Hub Spoke…
       • Gives you control over your own EAD files
           • Allows you to update and add new files when you need to
       • Exposes your EAD to other applications which need to cross-search the descriptions
           • Using standards-compliant methods
       • Means you benefit from using software that has been developed with the Archives Hub community
  22. 23. Collaboration & Sharing
       • Networks and communities – the National Archives Network
       • Cross-service and cross-domain collaboration
           • Copac
           • Intute
           • Digitisation Projects
       • Expand and share content
           • import/export/M2M
       • Links to other archive services
           • NRA
  23. 24. The National Archives Network
       • ‘Our vision of the future of British archives is of a flow of archival information which takes account of all the opportunities offered by digital networks and offers opportunity for exploration - historical, personal, social - to the broadest possible range of people wherever they can use it - in the home, the classroom or the office.’
       British Archives: The Way Forward (NCA, 2000)
       A comprehensive national resource discovery mechanism
  24. 25. The importance
       • ‘There can be no higher priority for archives than the creation of this collaborative electronic network, overcoming the limitations of geography, crossing the many archival sectors and creating a truly unified digital directory or encyclopaedia of British historical documents.’
       British Archives: The Way Forward (NCA, 2000)
  25. 26. National Archive Network
  26. 27. The opportunity
       • ‘Outreach has been a developing preoccupation for archives in recent years, but the arrival of the internet age provides the opportunities to take archives, as never before, to the doorstep of the community at large.’
       British Archives: The Way Forward (2000)
  27. 28. Progress of the NAN
       • Many archives took part in this drive towards a national archives network
       • … many still are taking part
       • The importance of recognised standards
       • Intention to create collection level catalogues of all substantial collections within a defined timeframe
  28. 29. Success of the NAN
       • Strands of the national archives network provide access to archives that were previously inaccessible
       • The HLF has played a major role in enabling access and online discovery
       • Users of archives have benefited enormously
       • Data standards have become of central importance
  29. 30. Shortcomings of the NAN
       • We don’t have a single national network
       • Differences in data structure; content; search capabilities; look and feel
       • Strands are not fully interoperable
       • Politics, funding and willpower may not combine in favour of this approach
       • The landscape has changed substantially since 2000 – maybe this solution is no longer appropriate?
  30. 31. The NAN today
       • Many ‘strands’
       • Only a few use EAD (support EAD export)
       • Lack of funding for a joint solution
       • Key is interoperability and machine-to-machine interfaces:
       • NAN as a community, sharing knowledge and experiences
       • NAN as a promoter of standards and facilitator for data sharing
       • NAN strands as promoters of flexible and open approaches
  31. 32. The Interoperable Hub
       The ability of software and hardware on different machines to share data
       • Content standards
       • Structural standards
       • Validation of content
       • Data Editor
       • Training and awareness
       • Contributor responsibility
       • Networking and community building
  32. 33. Hub Workshop 2009
  33. 34. Machine-to-machine interfaces
       • Web access is just one means of access to the data
       • Machine access provides flexible access, so people can set their own agendas
           • Z39.50
           • SRU
           • OAI-PMH (harvester)
       • Need to provide semantic data – properly marked up, well-structured
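The harvesting route on this slide can be sketched as follows. OAI-PMH returns large result sets in pages, each carrying a `resumptionToken` for the next request. The endpoint `spoke.example.org/oai`, the record identifiers and the sample response are invented for illustration; the verb, argument names and XML namespace are those defined by OAI-PMH 2.0.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Invented response page, shaped like an OAI-PMH ListRecords reply.
SAMPLE_PAGE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:spoke.example.org:rec1</identifier></header></record>
    <record><header><identifier>oai:spoke.example.org:rec2</identifier></header></record>
    <resumptionToken>page2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL for the first or a later page."""
    if resumption_token:
        # resumptionToken is an exclusive argument: send it without
        # metadataPrefix when asking for a follow-on page.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return (record identifiers, next resumption token or None)."""
    root = ET.fromstring(xml_text)
    identifiers = [
        el.text
        for el in root.findall(".//oai:record/oai:header/oai:identifier", OAI_NS)
    ]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    token = token_el.text if token_el is not None and token_el.text else None
    return identifiers, token


ids, next_token = parse_list_records(SAMPLE_PAGE)
next_url = list_records_url("http://spoke.example.org/oai",
                            resumption_token=next_token)
print(ids, next_url)
```

A harvester would repeat the request with each new token until a page arrives with an empty or missing `resumptionToken`.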
  34. 35. Pilot project for SRU: Genesis portal for Women’s Studies
       • Hub hosts data
       • Genesis searches the Hub using SRU
       • Implications for data – how to search just for appropriate descriptions?
       • Possible issues with search speeds
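One possible way to approach the "appropriate descriptions" question is for the portal to wrap every user query in a fixed scoping clause before sending it over SRU. The sketch below shows that idea only; it is not a description of how Genesis or the Hub actually handled it. The scope clause, the sample response and the function names are hypothetical; the response namespace is the one used by SRW/SRU 1.1.

```python
import xml.etree.ElementTree as ET

SRW_NS = {"srw": "http://www.loc.gov/zing/srw/"}

# Invented response, shaped like an SRU searchRetrieveResponse.
SAMPLE_RESPONSE = (
    '<srw:searchRetrieveResponse xmlns:srw="http://www.loc.gov/zing/srw/">'
    "<srw:version>1.1</srw:version>"
    "<srw:numberOfRecords>3</srw:numberOfRecords>"
    "</srw:searchRetrieveResponse>"
)


def scope_query(user_query, scope_clause):
    """Combine a user's CQL query with a fixed portal-wide scope clause."""
    return "({}) and ({})".format(user_query, scope_clause)


def number_of_records(response_xml):
    """Read the hit count from an SRU searchRetrieve response."""
    root = ET.fromstring(response_xml)
    el = root.find("srw:numberOfRecords", SRW_NS)
    return int(el.text) if el is not None else 0


# Hypothetical scope: only descriptions indexed under a given subject term.
scoped = scope_query('dc.title = "diaries"', 'dc.subjects any "women"')
print(scoped)
print(number_of_records(SAMPLE_RESPONSE))
```

This kind of scoping only works as well as the subject indexing in the underlying descriptions, which is why the slide flags it as an implication for the data.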
  35. 36. Persistent Identifiers
       • All Hub descriptions have their own identifiers – a unique reference
       • Gives them their own web address – can point to any description
       • Facilitates linking, e.g. from National Register of Archives
       • Enables bookmarking of content
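The link between an identifier and a stable web address can be sketched as a simple, reversible mapping. The base address `hub.example.org/description/` and the sample identifier are hypothetical; the slides do not state the Hub's actual URL pattern or identifier format.

```python
from urllib.parse import quote, unquote

# Hypothetical base address; the real Hub URL pattern is not given here.
HUB_BASE = "http://hub.example.org/description/"


def description_url(identifier, base=HUB_BASE):
    """Turn a description identifier into a bookmarkable address."""
    # safe="" also encodes '/', so the identifier stays one path segment.
    return base + quote(identifier, safe="")


def identifier_from_url(url, base=HUB_BASE):
    """Recover the identifier from an address built by description_url."""
    if not url.startswith(base):
        raise ValueError("not a description address: " + url)
    return unquote(url[len(base):])


link = description_url("gb 0001/abc")
print(link)
print(identifier_from_url(link))
```

The property that matters for linking and bookmarking is that the same identifier always yields the same address, regardless of how the description is stored or indexed behind it.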
  36. 37. Challenges (of which there are many)
       • Understanding our users
       • Encouraging item-level descriptions
       • Encouraging images/links to content
       • Which technology?
       • Perceptions of relevancy
       • Understanding Impact
       • Sustainability
       Flickr cc licence: hoodwink’s photostream
  37. 38. Moving Forward
       • Increasing content and contributors
       • Branding and new Website
       • More engagement with users / user generated content
       • Continuing to be standards-based, open and interoperable