Preserving the Fruit of Our Labor: Establishing Digital Preservation Policies and Strategies at the University of Houston Libraries. Santi Thompson, Annie Wu, Drew Krewer, Mary Manning and Rob Spragg
Paper presented at the 12th International Conference on Digital Preservation, November 2-6, 2015. University of North Carolina at Chapel Hill.
Abstract:
To develop a comprehensive digital preservation program for maintaining long-term access to the Libraries’ digital assets and align our practices with national standards and guidelines, the University of Houston (UH) Libraries formed the Digital Preservation Task Force (DPTF) to assess previous digital preservation practices and make recommendations on future efforts. This paper outlines the methodology used, including the task force’s use of existing models and evaluation criteria, to successfully generate new policies and select Archivematica as our system to process and preserve our digital assets. It concludes with recommended strategies for the implementation of the policies and preservation operations.
1. Preserving the Fruit of Our Labor:
Establishing Digital Preservation
Policies and Strategies at the
University of Houston Libraries
UH Libraries Digital Preservation Task Force
Santi Thompson (Chair)
Annie Wu (Vice Chair)
Drew Krewer
Mary Manning
Rob Spragg
3. Setting the Scene
• Inconsistent practices
• Transitional period
• Leverage expertise
4. Task Force Charge
• Define Scope
• Articulate Priorities and Policies
• Determine Resources
• Align with TDL
• Recommend Next Steps
5. Action Plan for Developing
a Digital Preservation
Program
• Organizational Infrastructure
• Technological Infrastructure
• Resources Framework
6. Digital Preservation Policy Framework
Purpose
Objectives
Mandate
Scope
Challenges
Principles
Roles and Responsibilities
Collaboration
Selection and Acquisition Criteria
Access/Use Criteria
Review Cycle (added)
Policies and Procedures
Roles and Responsibilities
Digital Assets
Digital Preservation Strategies
Technological Infrastructure
Digital Archive Operations
Platform Requirements/Procedures
Technological Infrastructure
OAIS Reference Model
Information Packages
Functional Entities
Digital Preservation Policy
8. Testing the Systems
What have others done before us?
1. Preserving (Digital) Objects with Restricted
Resources (POWRR) Project
2. The National Library of Medicine (NLM) Digital
Repository Evaluation and Selection Working Group
9. The POWRR White Paper
“From Theory to Action: ‘Good
Enough’ Digital Preservation Solutions
for Under-Resourced Cultural Heritage
Institutions”
http://commons.lib.niu.edu/bitstream/10843/13610/1/FromTheoryToAction_POWRR_WhitePaper.pdf
11. NLM Consolidated Repository
Test Plan
Prepared by the NLM Digital Repository
Evaluation and Selection Working Group
http://www.nlm.nih.gov/digitalrepository/Consolidated-DR-Testplan-Template.xls
● Ingest
● Archival Storage
● Metadata
● Additional Technical Infrastructure
13. Results
Advantages | Disadvantages
Sustained by active user community | More complex system maintenance
Supports complex archival workflow | Limited storage locations
Aligns with standards and best practices | Lacks robust reporting and notification service
14. TDL Storage Services
… Align priorities with digital preservation
standards, best practices, and Texas Digital
Library (TDL) storage services
15. Project Benchmarks
Date | Task Force Activity
April 5, 2015 | Task force released draft of final report and DP Policies to internal UH Libraries’ stakeholders for public comment
May 15, 2015 | Task force submitted final report, DP Policies, and proposed budget to Library Administration
September 16, 2015 | Approved by Library Administration; DP implementation began
19. Image Credits
● Auntie P, "Jigsaw," CC BY-NC-SA 2.0, https://flic.kr/p/8utKeQ
● Bunches and Bits {Karina}, "Reading," CC BY-NC-ND 2.0, https://flic.kr/p/7fRqZP
● Chlot’s Run, “Puzzle Time,” CC BY 2.0, https://flic.kr/p/aZ8r7P
● Scott Maxwell, “Working Together Teamwork Puzzle Concept,” CC BY-SA 2.0, https://flic.kr/p/4fUsNL
● Poppen, “OAIS Functional Entities,”
http://en.wikipedia.org/wiki/Open_Archival_Information_System#/media/File:OAIS-.gif
● Jonathan Vaughan, “Puzzle Nerd Blog”, https://puzzlenerd.wordpress.com/assembly-piece-by-piece
● Mitchelangelo, “Mixte Madness,” CC BY 2.0, https://flic.kr/p/eLxZ9i
● Horla Varlan, “Fat exclamation mark made from jigsaw puzzle pieces,” CC BY 2.0, https://flic.kr/p/7x9bSS
● ---, “Question mark made of puzzle pieces,” CC BY 2.0, https://flic.kr/p/7vB7fR
Inconsistent Practices:
Some workflows followed the process outlined in “Implementing METS, MIX, and DC for Sustaining Digital Preservation at the University of Houston Libraries” by Mingyu Chen and Michele Reilly; some objects had technical metadata generated with METS records.
Data were located on various network drives; some originals were backed up to a secure, access-restricted network drive.
Transitional Period: UH Libraries reorganized its digitization and repository units during the first six months of 2014. This was a great opportunity to create new policies and services in an area that we had not yet addressed.
Leverage Expertise: UH Libraries acquired new talent and reassigned other personnel, both of which brought new digital preservation experience and skills to the table.
Image Citation Information: https://flic.kr/p/aZ8r7P
Defined the policy’s scope and levels of preservation
Articulated digital preservation priorities by outlining current practices, identifying preservation gaps and areas for improvement, and establishing goals to address gaps – CREATION OF UH Libraries DP Policy
Determined the tools, infrastructure, and other resources needed to address unmet needs and to sustain preservation activities in the future – SELECTION OF PRESERVATION TOOL
Aligned priorities with digital preservation standards, best practices, and Texas Digital Library (TDL) storage services
Recommended roles, responsibilities, and next steps for implementing the strategy and policy
May 2014 to May 2015
Image Citation Information: https://flic.kr/p/8BPU1e
Digital Preservation Management is a workshop aimed at organizations that are looking to respond to digital preservation challenges.
Taught by Nancy McGovern and Kari Smith from MIT Libraries, though it was also developed at Cornell University Libraries.
The core of the workshop centers around the Action Plan for Developing a Digital Preservation Program, which consists of three parts: Organizational Infrastructure, Technological Infrastructure, and Resources Framework.
The organizational infrastructure offers both a general context for the program and more narrowly focused operations, processes, and short-term implementation plans.
The technological infrastructure uses the OAIS model as a framework to aid in system selection and to develop pertinent procedures surrounding specific functions of the chosen preservation system.
The resources framework outlines startup, ongoing, and contingency costs.
This is the form our digital preservation policy took after attending the workshop and following the action plan.
It follows the DPM workshop model pretty faithfully, though there are some areas we either added or omitted. I can talk more about this later if others are interested.
Added a “Review Cycle” to the Policy Framework to demonstrate a commitment to revisiting the document
Temporarily omitted a section called “Plans and Strategies”; we are developing that piece through a Digital Preservation Team, which Santi Thompson will speak about later.
Added “Content Selection Strategies for Cloud-Based Storage” under the Preservation Planning functional entity in case upcoming AV and large-scale projects dramatically affect storage priorities over time.
We have drafted the Resources Framework but have chosen to share that portion of the document with administration only.
Digital Assets category under “Policies and Procedures” contains: Quality Creation & Benchmarking, Selection and Acquisition Policies, Transfer Requirements and Deposit Guidelines, and Access and Use Policies
Functional Entities category under “Technological Infrastructure” contains: Pre-ingest, Common Services, Ingest, Archival Storage, Data Management, Administration, Preservation Planning, and Access
Instead of reinventing the wheel, we researched what others had done ahead of us. We used two resources extensively: one from the POWRR Project, the other from the NLM Digital Repository Evaluation and Selection Working Group.
From 2012 to 2014, the Digital POWRR Project, an Institute of Museum and Library Services (IMLS)-funded study, investigated, evaluated, and recommended scalable, sustainable digital preservation solutions for libraries with smaller amounts of data and/or fewer resources.
The group researched and produced a white paper that evaluated the following tools: Archivematica, Curator’s Workbench, DuraCloud, MetaArchive, Preservica, Internet Archive.
Using the POWRR Project research, we were able to narrow the field of digital preservation tools to Rosetta, Archivematica, and Preservica.
Ex Libris’s DAMS was under consideration for testing by a UH DAMS Task Force that was selecting a DAMS while we were selecting a digital preservation tool, so we also looked at Ex Libris’s Rosetta.
Ex Libris came in for a day and presented Rosetta to both task forces. However, when the DAMS Task Force ruled out Ex Libris’s tool for several reasons, including costs, the Digital Preservation Task Force also ruled it out.
This left Preservica and Archivematica. We had initial discussions with both vendors. In the end, Preservica was cost-prohibitive. Additionally, we wanted to use a METS wrapper and were dissatisfied with their proprietary wrapper.
This left Archivematica to test—but by what criteria should we evaluate it?
The NLM Digital Repository Evaluation and Selection Working Group (DRESWG) evaluated commercial systems and open source software. The Working Group developed a set of “Master Evaluation Criteria.”
Their evaluation tool tested in these areas:
· Ingest, etc. (see slide)
Additional Technical Infrastructure Requirements included testing criteria such as: can the system respond to OAI-PMH requests, does it conform to the Z39.50 and Z39.87 NISO standards, does it support Unicode, etc.
We tested Archivematica using the tool modified from NLM and afterward had Archivematica come to UH for a day’s consultation to answer remaining questions and to discuss how their tool interoperates with other content management systems and DAMS: DSpace, CONTENTdm, ArchivesSpace, DuraCloud, etc.
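A test plan organized by evaluation area, like the one described above, can be tallied with a few lines of code. The following is only a minimal sketch: the criterion wording, the 0 to 3 scale, and every score are hypothetical placeholders, not values from the NLM template or from the task force's own testing.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical criteria and placeholder scores (assumed scale: 0 = not met, 3 = fully met).
# None of these rows come from the NLM test plan or from the task force's results.
RESULTS = [
    ("Ingest", "Accepts batch submissions", 3),
    ("Ingest", "Validates file formats on ingest", 2),
    ("Archival Storage", "Writes to more than one storage location", 1),
    ("Metadata", "Records preservation events as PREMIS", 3),
    ("Additional Technical Infrastructure", "Responds to OAI-PMH requests", 2),
    ("Additional Technical Infrastructure", "Supports Unicode", 3),
]


def average_by_area(results):
    """Group scores by evaluation area and return the mean score per area."""
    by_area = defaultdict(list)
    for area, _criterion, score in results:
        by_area[area].append(score)
    return {area: mean(scores) for area, scores in by_area.items()}


def unmet(results, threshold=2):
    """List (area, criterion) pairs scoring below the threshold, for follow-up questions."""
    return [(area, criterion) for area, criterion, score in results if score < threshold]


if __name__ == "__main__":
    for area, avg in average_by_area(RESULTS).items():
        print(f"{area}: {avg:.1f}")
    print("Follow up on:", unmet(RESULTS))
```

The list returned by `unmet` is the kind of input that could feed a vendor consultation: the criteria that scored low become the remaining questions.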
Advantages
OAIS compliant; conforms to digital preservation standards and best practices
Uses open source solutions to perform digital preservation activities
Supports the ingest and technical metadata extraction of a wide array of file formats
Actively developed and has a robust, active user community
Supports versioning through the adoption of the Archival Information Collection (AIC)
Records digital preservation events and places this information into the METS record as PREMIS metadata and also in log files
Intuitive interface that makes it easy for administrators to customize rules, settings, and workflows; tracks workflow in a transparent way
Supports complex archival workflows with multiple users having access, if desired
Ability to integrate with other digital asset management systems, including CONTENTdm, Fedora, AtoM, and DuraSpace
“No-cost” solution with a pay structure for software support and/or customized features
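To illustrate what "PREMIS metadata in a METS record" looks like in practice, here is a minimal sketch that lists the preservation events embedded in a METS document. The sample XML is invented for illustration and is far simpler than a real Archivematica METS file; the PREMIS 2.x namespace is an assumption about the version in use.

```python
import xml.etree.ElementTree as ET

NS = {
    "mets": "http://www.loc.gov/METS/",
    "premis": "info:lc/xmlns/premis-v2",  # assumption: PREMIS 2.x namespace
}

# Invented sample record; event values and dates are placeholders.
SAMPLE_METS = """
<mets:mets xmlns:mets="http://www.loc.gov/METS/" xmlns:premis="info:lc/xmlns/premis-v2">
  <mets:amdSec ID="amdSec_1">
    <mets:digiprovMD ID="digiprovMD_1">
      <mets:mdWrap MDTYPE="PREMIS:EVENT">
        <mets:xmlData>
          <premis:event>
            <premis:eventType>ingestion</premis:eventType>
            <premis:eventDateTime>2015-01-01T10:00:00</premis:eventDateTime>
            <premis:eventOutcomeInformation>
              <premis:eventOutcome>pass</premis:eventOutcome>
            </premis:eventOutcomeInformation>
          </premis:event>
        </mets:xmlData>
      </mets:mdWrap>
    </mets:digiprovMD>
    <mets:digiprovMD ID="digiprovMD_2">
      <mets:mdWrap MDTYPE="PREMIS:EVENT">
        <mets:xmlData>
          <premis:event>
            <premis:eventType>fixity check</premis:eventType>
            <premis:eventDateTime>2015-01-01T10:05:00</premis:eventDateTime>
            <premis:eventOutcomeInformation>
              <premis:eventOutcome>pass</premis:eventOutcome>
            </premis:eventOutcomeInformation>
          </premis:event>
        </mets:xmlData>
      </mets:mdWrap>
    </mets:digiprovMD>
  </mets:amdSec>
</mets:mets>
"""


def list_events(mets_xml):
    """Return one dict per PREMIS event found anywhere in the METS document."""
    root = ET.fromstring(mets_xml)
    events = []
    for event in root.iterfind(".//premis:event", NS):
        events.append({
            "type": event.findtext("premis:eventType", default="", namespaces=NS),
            "when": event.findtext("premis:eventDateTime", default="", namespaces=NS),
            "outcome": event.findtext(
                "premis:eventOutcomeInformation/premis:eventOutcome",
                default="",
                namespaces=NS,
            ),
        })
    return events


if __name__ == "__main__":
    for item in list_events(SAMPLE_METS):
        print(item["when"], item["type"], item["outcome"])
```

Because the events live in a standard, open wrapper, a short script like this can read them without the preservation system itself, which is part of the appeal of METS over a proprietary wrapper.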
Disadvantages
The combination of tools can make long term system maintenance more complex. This is not a “set it and forget it” platform.
Ingest of descriptive metadata is limited to a CSV file or manual input
Lack of robust reporting and notification to assist with digital curation tasks. Additionally, error reports are sometimes cryptic. They are also generated by each micro-service, which can vary in its description of the problem and possible solution.
Can only store objects in one specified location*
Polling and self-healing are not services currently offered**
The roles for users and administrators are limited
*The task force has identified a work-around for this issue.
**This service may be offered in DuraSpace
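"Polling" here refers to periodically re-checking stored files against previously recorded checksums. As a rough illustration of the idea only (not how Archivematica or DuraCloud implement it), the sketch below builds a SHA-256 manifest for a directory and later audits the directory against it. The function names and the file name are hypothetical. It detects problems but does not "self-heal", which would require replacing a damaged file from a known-good copy.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its checksum."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def audit(root, manifest):
    """Compare files under root with the manifest: 'ok', 'changed', or 'missing'."""
    root = Path(root)
    report = {}
    for relative, expected in manifest.items():
        target = root / relative
        if not target.is_file():
            report[relative] = "missing"
        elif sha256_of(target) != expected:
            report[relative] = "changed"
        else:
            report[relative] = "ok"
    return report


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp) / "page_001.tif"  # hypothetical file name
        sample.write_bytes(b"original bytes")
        manifest = build_manifest(tmp)
        sample.write_bytes(b"corrupted bytes")  # simulate silent corruption
        print(audit(tmp, manifest))
```

Run on a schedule, an audit like this is the detection half of the service; the repair half depends on having a second copy in another location.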
The DPTF is recommending that UH Libraries select Archivematica as its DP tool.
Because we are members of the Texas Digital Library, we also have access to DuraCloud @ TDL. This will allow us to deposit some materials to the cloud. TDL will also be one of 5 nodes in the Digital Preservation Network.
Image Citation Information: https://flic.kr/p/4fUsNL
Make recommendations to stakeholders and admin RE: User Agreements with DuraCloud and Archivematica
Purchase services
Install Archivematica
Inventory of Materials to be preserved (SC & MDS)
Create "core records" for born digital materials that need to be preserved.
Could start patron digitization "core records" if decision has been made about inclusion of such work in the Digital Library/Preservation.
Library commitment to Digital Preservation
Better understanding of current risks
Identified key players in the process and have articulated their roles and responsibilities
Expanded copies and storage to off-site location
TDL + DuraCloud
Best practices
OAIS Compliant
Guided by TRAC
Image Citation Information: https://flic.kr/p/7x9bSS