This document discusses a project investigating the use of Archivematica for long-term digital preservation of research data. The project is a collaboration between researchers at the University of Hull and University of York. Archivematica is an open source digital preservation system that packages preservation tools. The project aims to set up local implementations of Archivematica at Hull and York and explore integrating it more broadly. Work in Phase 2 included enhancing Archivematica, spreading awareness of the project, and planning for sustainability and outreach in a potential Phase 3.
Research data spring: filling in the digital preservation gap
1. Research data spring
Filling the Digital Preservation Gap
10/12/2015
Investigating Archivematica to preserve research data for the longer term
…because digital preservation won’t just go away
2. Team
University of Hull:
• Chris Awre
• Richard Green
• Simon Wilson
University of York:
• Julie Allinson
• Jen Mitcham
3. Project aim
“…to investigate Archivematica and explore how it might be used to provide digital preservation functionality within a wider infrastructure for Research Data Management.”
4. Why?
...because we believe that digital preservation should be a key element of the infrastructure for managing research data for the long term
(From RDMSS PQQ)
“...preservation actions should ensure that data remains authentic, reliable, and usable while maintaining integrity.”
5. Why Archivematica?
...because it is open source, standards compliant, flexible and customisable, and packages a range of preservation tools together
...if you want to know more you can read our phase 1 report
7. The Archivematica development model
[Diagram: development cycle]
Artefactual develop Archivematica
-> Archivematica released as open source
-> Community of users identifies enhancements
-> Enhancements sponsored by one or more users
Labels on the diagram: RDS / Research data; UK users; COPPUL
8. Progress in phase 2
»Planning our own local implementations
›Hull
›York
»Considering above-campus option for Archivematica
»Liaising with other projects
»Phase 2 Report: now available!
9. Progress in phase 2
»Enhancing Archivematica:
»DIP regeneration
»METS parsing
»Generic search API
»Choice of checksum
»PRONOM integration
»Documentation
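To illustrate what a "choice of checksum" can mean in practice, here is a minimal sketch of computing a fixity value with a selectable algorithm. It uses only Python's standard hashlib module; the function name file_checksum and its parameters are hypothetical and are not taken from Archivematica's code.

```python
import hashlib
import io


def file_checksum(stream, algorithm="sha256", chunk_size=65536):
    """Return the hex digest of a binary stream using the named algorithm.

    Hypothetical helper for illustration only; not Archivematica's API.
    """
    digest = hashlib.new(algorithm)  # raises ValueError for unknown names
    # Read in chunks so large research data files do not need to fit in memory.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    sample = b"example research data"
    for name in ("md5", "sha1", "sha256", "sha512"):
        print(name, file_checksum(io.BytesIO(sample), name))
```

The same stream yields a different digest per algorithm, so whichever algorithm is chosen needs to be recorded alongside the value for later fixity checks.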
10. Progress in phase 2
Not all of the work we have sponsored is ‘visual’ but much of it is fundamental to the future development of Archivematica. Our work is enabling
“The Jisc work has helped to modernise some of the internal infrastructure of Archivematica”
Sarah Romkey, Artefactual Systems, 8th December 2015
14. Impact and demand
Do you see Archivematica as a possible digital preservation solution for your institution? Why?
» Yes….sounds like a pragmatic solution
» Yes! Low down learning curve and Archivematica sounds just the ticket :-)
» Possible but too early to say
» Yes - University Archivist is an advocate and want to link in collaboratively with the institution's RDM developments
» Possibly if it can integrate with Pure...
» Yes
16. Sustainability
»All developments funded in phase 2 will be incorporated into the main code base to be supported for the long term by Artefactual
–look out for these in version 1.6 (due Spring 2016)
»There are already plans to build on some of the work we have funded
–for example AIP re-ingest work from Zuse Institute
–...and more...see phase 2 report
17. Next phase
» Implement our local proofs of concept at Hull and York
» Outreach
» Paper at IDCC conference
» Presentation at UK Archivematica group meeting
» Poster at Open Repositories conference
» Poster at UK Archives Discovery Forum
» more blogs
» end of project event to disseminate our case studies
» Phase 3 project report (with assessment of success of PoCs)
18. What we will spend the money on
»Managing and funding our own internal development work
» 2 weeks' support from Artefactual Systems
» 4 new research data file signatures from The National Archives (and further engagement on generic process)
» Outreach (conference fees, travel etc)
» Putting on our own dissemination event
19. Working for other repositories
» Archivematica -> repository
› Our model: Archivematica -> Fedora/Hydra
› Unpack a DIP and create Fedora objects
– Similar model for EPrints/DSpace?
› Could just store the DIP, but this limits access options
» Repository -> Archivematica
› Push content to Archivematica from a repository for dark archiving
› Possible via DSpace, planned for Fedora/Hydra at Hull
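The "unpack a DIP and create repository objects" step can be sketched roughly as follows. This is an illustrative sketch only, not the project's implementation: it assumes the DIP carries a METS file whose fileSec lists each file under a fileGrp with a USE attribute, and it stops at producing plain records that a Fedora, EPrints or DSpace client could then turn into objects. The function name list_dip_files and the sample METS fragment are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Standard METS and XLink namespace URIs.
METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"


def list_dip_files(mets_xml: str) -> List[Dict[str, str]]:
    """Return one record per file listed in a METS fileSec.

    Hypothetical helper: a repository client would map each record
    to an object or datastream in the target repository.
    """
    root = ET.fromstring(mets_xml)
    records = []
    for group in root.iter(f"{{{METS_NS}}}fileGrp"):
        # Only direct children, so nested groups are not counted twice.
        for file_el in group.findall(f"{{{METS_NS}}}file"):
            flocat = file_el.find(f"{{{METS_NS}}}FLocat")
            if flocat is None:
                continue  # no location recorded for this file
            records.append({
                "id": file_el.get("ID"),
                "use": group.get("USE"),
                "path": flocat.get(f"{{{XLINK_NS}}}href"),
            })
    return records


# Made-up METS fragment for illustration; real DIP METS files are much richer.
SAMPLE_METS = """<mets:mets xmlns:mets="http://www.loc.gov/METS/"
                            xmlns:xlink="http://www.w3.org/1999/xlink">
  <mets:fileSec>
    <mets:fileGrp USE="access">
      <mets:file ID="file-1">
        <mets:FLocat xlink:href="objects/data.csv" LOCTYPE="OTHER"/>
      </mets:file>
    </mets:fileGrp>
  </mets:fileSec>
</mets:mets>"""

if __name__ == "__main__":
    for record in list_dip_files(SAMPLE_METS):
        print(record)
```

Keeping the parsing step separate from object creation means the same records could feed different repository back ends, which is the point of the "similar model for EPrints/DSpace?" question.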