Hull presentation to Fedora UK&I meeting, 21st March 2013


Published on

Published in: Technology
1 Like
  • Be the first to comment

No Downloads
Total views
On SlideShare
From Embeds
Number of Embeds
Embeds 0
No embeds

No notes for slide

Hull presentation to Fedora UK&I meeting, 21st March 2013

  1. 1. Towards a mature, multi-purpose repository for theinstitution…Chris Awre, Simon Lamb, Richard GreenFedora UK&I, University of Durham21st March 2013
  2. 2. An institutional repository• Repositories are infrastructure– Maintaining infrastructure requires resource, which weneed to minimise to justify costs in the long-term• Content doesn’t sit in silos– One repository facilitates cross-fertilisation of use• Integration with one system– Embedding the repository means linking to one placeFedora UK&I meeting, Durham | 21 March 2013 | 2One institution = one repository?
  3. 3. Five principles A repository should be content agnostic A repository should be (open) standards-based A repository should be scalable A repository should understand how pieces of content relateto each other A repository should be manageable with limited resourceFedora UK&I meeting, Durham | 21 March 2013 | 3
  4. 4. Five principles (leading to our implementation) Fedora is content agnostic Fedora is (open) standards-based Fedora is scalable Fedora understands how pieces of content relate to each other Fedora is manageable with limited resource– With help from the communityFedora UK&I meeting, Durham | 21 March 2013 | 4
  5. 5. HydraChange the way you think about Hull | 7 October 2009 | 2• A collaborative project between:– University of Hull– University of Virginia– Stanford University– Fedora Commons/DuraSpace– MediaShelf LLC• Unfunded (in itself)– Activity based on identification of a common need• Aim to work towards a reusable framework for multipurpose,multifunction, multi-institutional repository-enabledsolutions• Timeframe - 2008-11 (but now extended indefinitely)TextFedora UK&I meeting, Durham | 21 March 2013 | 5
  6. 6. Fedora and Hydra• Fedora can be complex in enabling its flexibility• How can the richness of the Fedora system be enabledthrough simpler interfaces and interactions?– The Hydra project has endeavoured to address this, andhas done so successfully– Not a turnkey, out of the box, solution, but a toolkit thatenables powerful use of Fedora’s capabilities throughlightweight tools• Principles can also be applied to other repository environments• Hydra ‘heads’– Single body of content, many points of access into itFedora UK&I meeting, Durham | 21 March 2013 | 6
  7. 7. HydraFedora UK&I meeting, Durham | 21 March 2013 | 7
  8. 8. Community• Conceived & executed as a collaborative, open sourceeffort from the start• Initially a joint development project betweenStanford, University of Virginia, and University ofHull• Close collaboration with DuraSpace / Partnership with MediaShelf,LLC• Complementary strengths and expertise• Community now stands at 16 partners
  9. 9. Community Hydra heads• sufia– Institutional repository head, created by Penn State• Now being adopted by Notre Dame, Northwestern and others• Avalon– Video management head being developed by Indiana andNorthwestern• Also HydraDAM, media management head based on sufia, at WGBH• DIL– Image management head created by Northwestern• Argo– Repository administrative head, created by StanfordFedora UK&I meeting, Durham | 21 March 2013 | 9
  10. 10. Hydra in Hull home page• Different views based onuser login• Multi-level security fordifferential access– Uses CAS (Shibboleth?)• Queue-based submissionworkflow– Everything deposited goesthrough QA prior topublication• Creating records usestemplates to suit contenttype• Launched September2011/March 2012Fedora UK&I meeting, Durham | 21 March 2013 | 10
  11. 11. Adapt to the contentFedora UK&I meeting, Durham | 21 March 2013 | 11
  12. 12. Organise the contentStructural setsDisplay setsFedora UK&I meeting, Durham | 21 March 2013 | 12
  13. 13. Experimental and observational data• Tiptoeing into data management• JISC History DMP project– Identified ways to encourage and facilitate the planning ofdata management• EPSRC roadmap– Highlighting ways forward to make the most of the data weproduceHistory DMPManaging data for historical researchThe Department of History at the University of Hull was the top performing department in the RAE2008 at the University. This highstanding has resulted from the broad range of research activity across the department, with over 30 full-time members of staffcontributing to the overall body of work. The Department’s focus on teams engaged in the production of databases was specificallyrecognised in feedback from the RAE2008 panel.The History DMP project has built on the existing data management best practices within the Department. It has used the experience indata management demonstrated and recognised to frame a departmental approach to data management, to enhance and build on theindividual activities that have been the basis of data management thus far. This departmental data management plan will be used tosupport future research strategy and provide a coherent platform for data sustainability in future research.The work undertaken developed an overall plan for use across he Department, and then implemented this using past, present and futureresearch activities as case studies. The role of local technical provision for data management, in the shape of the University’s digitalrepository, was explored and the system enhanced in specific ways to deliver a platform that not only manages the data, but allows forits exploitation and access as well.In brief…The focus of the project’s work on technical solutions formanaging data was on the institutional digital repository at Hull.This is based on Fedora1 and presented using Hydra2.Hydra provides a interface toolkit that enables templates (models)to be defined for different content types; a dataset model wascreated, that allows us to address dataset needs in the broadercontext of the repository as a whole. This specifically enables:Ø Addition of coverage and the translation specifically ofgeographic coverage into map displays using KML GoogleMaps files – e.g.,Ø Addition of a DOI identifier derived on submission to DataCitevia the DataCite APIØ The addition of extra fields (based on MODS) as required fordifferent future data management needsLinked data – The project translated a sample dataset into linkedopen data to highlight how this might support data managementbased around the structure inherent in datasets. This identifiedhow to effectively link out using appropriate ontologies as well asallowing linking in: the work will be a case exemplar to informfurther discussion.A three step process was taken to develop the History datamanagement plan:Ø Requirements for data management were gathered throughinterviews and focus groups.Ø These highlighted that academics were pleased thatsomeone was looking to help them: many were making do,having little idea of how to manage data, or who to ask.Ø The data management plan was developed through adistillation of questions in the DCC checklist, adapting them tosuit identified needs amongst history academicsØ The document is available at (search forhistory dmp at for related documents).Ø An online version using the DCC DMPOnline tool is inpreparation.Ø The data management plan was applied to three projects atdifferent stages of the research lifecycle: a completed project,a current project, and a project just starting. All found the planconvenient to complete and a useful way of focusing theirthoughts on data management.Ø Case studies – data management planChris Awre, Richard Green, John Nicholls, Peter Wilson, and David Starkey, University of Hull | Martin Dow, AcuityUnlimitedPlus many thanks to the members of staff and postgraduate students of the department for their inputThis work is taking place under the auspices of the JISC-funded History DMP project within the JISC Managing Research Data Programme 2011-3History DMP Project Blog: http://historydmp.wordpress.comEmail contact: or – | 2Hydra – http://projecthydra.orgCredits and Further InformationThe technologyKey to the data management plan has been raisingawareness and confidence in the capacity formanaging that data. The project has revealed thatlocal management within a repository provides astraightforward way of adding value to the data in itsown right alongside any disciplinary repository andprovides local assurance of the data being managed.Fedora UK&I meeting, Durham | 21 March 2013 | 13
  14. 14. Onward… Upgrade to Hydra 5• Hull’s original implementation was Hydra 2– We then painfully upgraded to Hydra 3• Hydra developers have since focused on streamlining theupgrade path– Also re-architecting technology stack to simplify adding ofnew functionality• We are planning an upgrade to Hydra 5 in April/May– Note - Hydra 6 is on the way• Watch this space…Fedora UK&I meeting, Durham | 21 March 2013 | 14
  15. 15. Onward… Image management• We need to add better image management to Hydra at Hull• Options• Still considering which route to followAdapt and expandour currentworkflowReview and adoptHydra communitysolutionDevelop solutionbased on IIIFspecificationFedora UK&I meeting, Durham | 21 March 2013 | 15
  16. 16. Onwards… Library search integrationStaff and studentsRepository Catalogue ArticlesHow to combine delivery?Fedora UK&I meeting, Durham | 21 March 2013 | 16
  17. 17. BlacklightFedora UK&I meeting, Durham | 21 March 2013 | 17
  18. 18. Onwards… Digital archives management• Archives colleagues took part in Mellon-funded AIMS project– An Inter-institutional Model for Stewardship of born-digitalcollectionsDeveloping practice forarchivists in dealing withborn-digital collectionsthrough events andadvocacyUsing Hydra to develop toolsthat help put the model intopracticeFedora UK&I meeting, Durham | 21 March 2013 | 18
  19. 19. Onwards… CRIS integration / E-publication• Hull has adopted the Converis research information managementsystem• Integration enabling deposit from Converis to Fedora is ongoing– Will allow research outputs to be transferred and showcased– Meet open access requirements• Larkin Press project developed a prototype for integrating Fedorawith Open Journal System– Enables book editing and compilation– Archive artefacts to repository– Print on demandFedora UK&I meeting, Durham | 21 March 2013 | 19
  20. 20. Summary• Hydra at Hull successfully, if at times painfully, launched inSeptember 2011 (user interface) / March 2012 (create,update, delete)• Hydra established, and recognised, as institutionalrepository solution• Now looking to move to the next stage– Upgrading, new collections, new functionality• Encouraged by community growth– Hydra maturing nicely…• Fedora 4?Fedora UK&I meeting, Durham | 21 March 2013 | 20
  21. 21. Thank youChris Awre – Lamb – Green – at Hull – Project –
  22. 22. Fedora and integration
  23. 23. Fedora and integration with other systems• Fedora was deliberately not designed as a silo technology– Open APIs– Modular framework• Hence, how has Fedora been embedded withinorganisational systems infrastructure?• Exchange of practice and experience aimed at supportingothers with related activities• Any gaps / limitations?– Feed ideas and use cases into Fedora 4 developmentFedora UK&I meeting, Durham | 21 March 2013 | 23
  24. 24. Integration examples (notes from discussion)• Institutional ICT infrastructure– E.g., authentication, storage, security, etc.– LDAP integration / pre-set filters in Solr / Shibboleth?– Tape storage (robotic arm)– Cloud still too expensive– Messaging - ActiveMQ• Application integration– E.g., VLE, CRIS, library, etc.– Archives catalogues (CALM) and other sources of descriptivemetadata– EPrints as separate IR? EPrints as ingest tool for Fedora (Oxford)– Upstream/downstream implications• Where does access/preservation copy sit and route?– Archivematica (though gap in getting output into Fedora)Fedora UK&I meeting, Durham | 21 March 2013 | 24
  25. 25. Fedora and data
  26. 26. Fedora and data management• A number of presentations today have already mentioneddata management– Incl. Fedora 4• Research landscape currently very focused on improvingdata management– EPSRC roadmap, funder requirements, research integrity• Small data / big data differences– What functional requirements are needed across the dataspectrum?Fedora UK&I meeting, Durham | 21 March 2013 | 26
  27. 27. Data management examples/experience (notesfrom discussion)• Commercial emphasis on such management?• CKAN / DataBank alternatives• Making the repository a viable option for data• Cleanliness/tidying of data (requirements for management)– Funders may stimulate effort?• Format issues– Very varied and often rare• Capturing metadata – consistency and validity• Are data just digital blobs like other stuff?• Work with researchers on creation of their data• DataONE / Data Conservancy• Linked data?Fedora UK&I meeting, Durham | 21 March 2013 | 27