This presentation was given at the 2017 Society of Mississippi Archivists' Conference. It covers the final two modules of the Library of Congress' digital preservation curriculum.
1. DIGITAL PRESERVATION –
MANAGE AND PROVIDE
ACCESS
BY MICHAEL PAULMENO
BASED ON PRESENTATIONS GIVEN AT THE 2016 DIGITAL PRESERVATION OUTREACH
AND EDUCATION WORKSHOP IN JACKSON, MS
3. MODULES
Identify - what digital content do you have?
Select - what portion of that content will be preserved?
Store - how should your content be stored for the long term?
Protect - what steps are needed to protect your digital content?
Manage - what provisions are needed for long-term management?
Provide - how should your content be made available over time?
5. PLANNING AND MANAGEMENT
• Digital preservation is an ongoing process
• Involves multiple parts
• Preservation planning (on-going)
• Self-assessment (internal process)
• Audit (external review by peers)
6. BALANCED MANAGEMENT
• Preservation programs don’t run themselves
• An effective approach will address
• Organizational requirements and objectives
• Technological opportunities and change
• Resources – funding, staff, equipment, etc.
7. THREE ELEMENTS OF ORGANIZATION
Preservation
Management
People
Policies
Technology
8. ORGANIZATIONAL ELEMENT: PEOPLE
Knowledge and skills include:
• Policy development
• Project management
• Repository/software management, programming
• Metadata management
• Legal expertise
• Marketing expertise
10. ORGANIZATIONAL ELEMENT: POLICIES
• Good policies are important
• Help deal with potential issues
• Define roles, responsibilities, goals, etc.
• Ensure compliance with standards
• Lay foundation for future growth
• Examples
• Strategic plan
• Preservation plan
12. ORGANIZATIONAL ELEMENT: POLICIES
Benefits of a preservation policy:
• Specifies institutional commitment
• Developing policy builds DP team
• Demonstrates compliance – meet requirements
• Manages expectations – message to stakeholders
• Identifies issues and challenges
• Raises awareness
• Defines roles and responsibilities
13. ORGANIZATIONAL ELEMENT: TECHNOLOGY
• Needs to be part of overall strategy
• Preservation and policy concerns should inform technology decisions
• Key concerns:
• Managing
• Selecting
• Investing
• Funding
14. MANAGING TECHNOLOGY
• Prioritize: weigh requirements to be met
• Assess: define criteria to select appropriate technology
• Plan: identify steps to meet goals
• Select: decide when to own/join/share
• Fund: allocate resources
• Monitor: look ahead, be prepared
• Evaluate: measure outcomes and success
15. PRIORITIZE AND ASSESS
• Ask yourself questions
• What are our needs?
• Do we have technical staff?
• Can we sustain funding commitments?
• Answers should inform further decisions
16. PLAN
• Develop a road map for purchases and decisions
• Create backup/disaster plans (and test them!)
• Consider all scenarios (paranoia can be a good thing)
• Think carefully; not all decisions easily undone
17. A WORD ON BACKUPS …
• No system is 100% effective
• Follow the LOCKSS principle (Lots of Copies Keeps Stuff Safe)
• Devise a disaster plan and test it regularly
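One way to test that multiple copies are actually keeping content safe is to compare checksums across them. The following is a minimal sketch in Python, not a prescribed tool: the function names `sha256_of` and `find_mismatched_copies` are hypothetical, and it assumes a checksum was recorded for each file when it was first stored.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatched_copies(expected_digest: str, copies: list[Path]) -> list[Path]:
    """Return the copies that are missing or whose checksum differs.

    `expected_digest` is assumed to be the checksum recorded at ingest.
    """
    bad = []
    for copy in copies:
        if not copy.is_file() or sha256_of(copy) != expected_digest:
            bad.append(copy)
    return bad
```

Running a check like this on a schedule, and as part of each disaster-plan test, shows whether a damaged copy can still be replaced from a good one.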
18. SELECTING TECHNOLOGY
• Many considerations
• On-premise vs. hosted
• Locally developed vs. contracted
• Turnkey vs. in-house development
• Open-source vs. proprietary
• Evaluate options and consider what is the best fit
• Read fine print
19. SELECTING TECHNOLOGY
Characteristics of sound software:
• written in a well-documented language
• usable on a wide variety of platforms
• sustained support by creators/developers
• modular in design
• supports batch processing and workflows
• licenses support secondary use
23. INVESTING IN TECHNOLOGY
• Investment and selection influence each other
• Look at funding levels, staff, and needs
• What sort of commitment is needed/desired?
24. DESIGNATED FUNDING
• Funds set aside for digital preservation
• May not be explicit (e.g., budget line item)
• Must be able to make a compelling case
• Measurable indication of intent to preserve
• Challenging to do, but important
• Over time, contributes to track record
25. MONITOR AND EVALUATE
• Is the software working?
• Do we like it?
• Does it do what we need?
• Is the vendor supportive and responsive?
• What are our options for migration?
26. USING STANDARDS
• Lots to choose from!
• Why follow one?
• Ensures long-term preservation
• Avoids haphazard planning
• Helps in communicating with others
27. DP STANDARDS
Standards emerging since the 1996 report:
• Trusted Digital Repositories, 2002
• Open Archival Information Systems (OAIS) Reference Model, 2003 plus
standardization (ISO 14721:2012)
• Preservation Metadata Implementation Strategies, 2005 plus updates
• Trustworthy Repositories Audit and Certification (TRAC), 2011
• Audit and Certification of Trustworthy Digital Repositories (ISO 16363:2012)
• NDSA Levels of Digital Preservation (LoDP), 2013
28. TRUSTED DIGITAL REPOSITORY
A TDR should have these characteristics:
• community standards (OAIS Compliance)
• commitment (Administrative Responsibility)
• management (Organizational Viability)
• resources (Financial Sustainability)
• infrastructure (Technological Suitability)
• protection and control (System Security)
• documentation (Procedural Accountability)
29. OPEN ARCHIVAL INFORMATION SYSTEM
• Framework for designing a long term
preservation system
• Definition: “Any organization or system charged with the task of preserving information over the long term and making it accessible to a class of users” (note 1).
• Conceptual design for an archive
• Consists of producers, consumers, management,
and the archive
• Each has a unique function (see right)
1. Lavoie, Brian. “Meeting the Challenges of Digital Preservation: The OAIS Reference Model.” Accessed on April 5, 2017. http://www.oclc.org/research/publications/library/2000/lavoie-oais.html.
30. MORE INFORMATION
• Lavoie, Brian. “Meeting the Challenges of Digital Preservation: The OAIS Reference
Model.” http://www.oclc.org/research/publications/library/2000/lavoie-oais.html.
• Consultative Committee for Space Data Systems. “Reference Model for an Open Archival
Information System.” https://public.ccsds.org/pubs/650x0m2.pdf
• National Digital Stewardship Alliance: http://ndsa.org/
33. MODULES
Identify - what digital content do you have?
Select - what portion of that content will be preserved?
Store - how should your content be stored for the long term?
Protect - what steps are needed to protect your digital content?
Manage - what provisions are needed for long-term management?
Provide - how should your content be made available over time?
34. LEARNING OBJECTIVES
• Describe the difference between preservation and access technologies
• Describe the value of long-term access policies
• Be aware of legal and rights management issues connected to digital
records
37. What is Long-term Access?
Preservation makes long-term access possible…
Preservation
• relies upon proven technologies to
preserve digital objects across
generations of technology
• accumulates metadata over the life
cycle to trace and preserve content
• preservation systems create new
versions of digital objects for access
to deliver as needs change over time
• purpose: ensure long-term access
• focus: future users
Access
• relies on cutting edge
technologies to provide best and
fastest access at a point in time
• selects metadata needed to use
and understand content
• access systems deliver objects
with user-oriented services to
make the objects
• purpose: provide content to users
• focus: current users
38. PROVEN VS. CUTTING EDGE TECHNOLOGIES
Proven
Reliable, well-documented, widely used
Trade-off ease of use for long term
sustainability
Ex. Hard disks, tape drives, SQL
databases, Debian Linux, C++
Cutting Edge
New, advanced, and unproven
Convenient, but potentially
unsustainable
Ex. Windows 10, CockroachDB, Dart,
Fedora
39. THE RIGHT TOOL FOR THE RIGHT JOB
Different systems for preservation and access
No one size fits all solution
Ask yourself
• Will this system still exist in 10 years? 20?
• Is the system user friendly?
• Is it fast and convenient or proven and reliable?
• Does the system require third party (i.e., vendor)
support?
40. The Right Tool for the Right Job (cont’d)
Static web pages - access
Omeka - access
Archivematica - preservation
SQL Database - preservation
41. SUSTAINABILITY
Access requires technology to be well managed and sustainable
Long term funding and expertise need to be in place
Ask yourself:
What happens if funding is cut?
How many staff know how to maintain the software?
Is the technology robust and reliable?
42. PRESERVATION AND THE CLOUD
You can’t always do it yourself
Cloud service providers offer solid alternatives to self hosting
Ask yourself:
Do we have the funding and expertise to migrate servers?
Will funding be present to sustain cloud subscriptions?
What happens if we must back out of a cloud subscription?
43. PRESERVATION AND THE CLOUD (CONT’D)
Benefits
Easy disaster preparedness
Wide range of offerings
Less on-site IT staff required
Drawbacks
Cost
Complex pricing (e.g., AWS)
Learning curve
44. REQUIREMENTS FOR PROVIDING ACCESS
Use current and known technologies
Software should be well-documented and maintained
Ask yourself
When was the program last updated?
Is it open-source or proprietary?
What is the total cost of ownership?
45. REQUIREMENTS FOR PROVIDING ACCESS
(CONT’D)
Content should be delivered completely, be intact, and well-formed
Content should be accurately described
46. REQUIREMENTS FOR PROVIDING ACCESS
(CONT’D)
Provide access in accordance with policy
Provide fair access
Everyone should be able to access materials equally
Ensure accessibility for people with disabilities
47. Managing Access
E. von Seutter Photograph Collection, PI/1985.0032, Mississippi Department of Archives and History
48. MANAGING ACCESS
• Policies managing digital access should apply to staff and
users
• Policies enable consistent, sustainable access over time. Ad
hoc decisions do not!
49. STAFF POLICY QUESTIONS
• Who is allowed to have back end access to content?
• What about preservation masters versus access copies?
50. USER POLICY QUESTIONS
• Are access policies equal for all content?
• If not, how are categories managed?
51. USER POLICY QUESTIONS
• How are exceptions/special requests handled?
• How do users request/get access?
52. ACCESS POLICIES SHOULD...
• ...address requirements for preservation systems to produce
access objects
• ...reflect and respond to new discovery/delivery issues that
emerge
• ...be written down and implemented!
53. Legal Issues
New Capitol Photograph Album, Series 317, Mississippi Department of Archives and History
54. LEGAL ISSUES
Things to think about:
Do you have the right to distribute the content?
E.g., authors’ papers versus university-produced
content
55. LEGAL ISSUES
Things to think about:
Are there access restrictions in deeds of gift or donor
agreements?
E.g., time restrictions on opening records
56. LEGAL ISSUES
Things to think about:
Is there any confidential information that must be
redacted?
E.g., Social Security numbers, medical
information, student records
57. LEGAL ISSUES
Things to think about:
Is there any material you should not provide access
to?
E.g., university blueprints
58. But I’m not a lawyer!
• Know your content to address relevant legal issues
for preservation and access
• Document and preserve your decisions about legal
issues regarding access to your content
59. But I’m not a lawyer!
• Know who your legal adviser is (or find one) and
develop a sound working relationship
• It is your legal adviser’s responsibility to help –
help them help you
60. Understand Users
Conrad Nutschan [GFDL (http://www.gnu.org/copyleft/fdl.html), CC-BY-SA-3.0 (http://creativecommons.org/licenses/by-sa/3.0/) or CC BY-SA 2.0
(http://creativecommons.org/licenses/by-sa/2.0)], via Wikimedia Commons
61. Understand Users
• Know who utilizes collection
• Understand what users expect
• Expectation driven by technologies users know and want
62. Understand Users
• How do we anticipate needs of future users?
• How should digital content be packaged for delivery over time?
• What sort of tracking instruments (surveys, Google Analytics, etc.) should we
use?
63. Understand Users
• Preservation a pathway from one generation of technology to the next
• Migration of content critical to ensuring long term access
• No one can predict the future, but planning still important
It’s important to emphasize how multiple systems are needed for access and preservation. The exact technology used will depend on an institution’s requirements.
Policies might be the least interesting part of building a new program in your institution, but they’re vital to the long term sustainability of your work. Policies enable consistent decision making and when followed, give you an ironclad paper trail and justification for your actions. In this case, you would want a basic policy apparatus to manage both your staff’s access to content on the back end and your users’ access to content on the front end.
It’s important to determine which staff members are responsible for maintaining your content (see the Manage module) and who can access all your digitized or born-digital material. Just as not everyone has stack access, not everyone should be able to get to your digital records. You also can, and probably should, draw a staff access distinction between your preservation masters and your access copies: if only a select group of staff can get to your preservation masters, it’s more acceptable to make access copies widely available for staff to use as needed.
You may need a different access mode for audio files versus image files, for example. Maybe image files are broadly web accessible but audiovisual files are only accessible on your campus. If you have separate access categories for different kinds of content, you will need to consider how you draw lines between categories; e.g., audio-only files can be accessible online but audiovisual files are on-campus only.
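If it helps to see how category-based access might be written down in a form software can enforce, here is a minimal Python sketch. The category names, the `AccessLevel` values, and the rule that staff can reach on-campus content from anywhere are all assumptions made for illustration; your own policy would define them.

```python
from enum import Enum


class AccessLevel(Enum):
    PUBLIC_WEB = "public web"
    ON_CAMPUS = "on campus only"
    STAFF_ONLY = "staff only"


# Hypothetical categories; a real policy would define its own.
ACCESS_POLICY = {
    "image": AccessLevel.PUBLIC_WEB,
    "audio": AccessLevel.PUBLIC_WEB,
    "audiovisual": AccessLevel.ON_CAMPUS,
    "preservation_master": AccessLevel.STAFF_ONLY,
}


def may_access(category: str, on_campus: bool, is_staff: bool) -> bool:
    """Decide whether a request is allowed under the written policy."""
    # Categories missing from the policy default to the most restrictive level.
    level = ACCESS_POLICY.get(category, AccessLevel.STAFF_ONLY)
    if level is AccessLevel.PUBLIC_WEB:
        return True
    if level is AccessLevel.ON_CAMPUS:
        # Assumption: staff may reach on-campus content from anywhere.
        return on_campus or is_staff
    return is_staff
```

Keeping the categories in one table makes the lines between them explicit, so a change in policy becomes a change to the table rather than an ad hoc decision.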
Note to trainers: as discussed with preservation policies in the Manage Module, access policies have benefits for developing a sound program to ensure long-term access
Another important question to consider with users is how you will handle special requests for content, anything from unprocessed material to material that’s only accessible on older media that you might have behind the scenes but not in your reading room. Make sure that your policies treat all comers with equal respect, from your own faculty to students to outside researchers. In other words, don’t get into the habit of bending the rules, or enforcing them will become that much more difficult.
The most important thing about your access policies is that they’re written down and implemented! There’s no point creating policies if you don’t use them. Policies should also be living, growing documents that respond to changes in technology or your institution. If you’re having trouble generating policies, your FAQs can be a great starting point.
When you’re considering what kinds of content you want to make available online, there are several questions you need to ask yourself to cover your legal liabilities. The first and most obvious question is: do you hold the rights to this material? If you hold copyright you can do what you want; otherwise you need to be more careful about how it’s distributed. E.g., you may not hold the rights to authors’ papers in your special collections, whereas you probably do hold the rights to distribute any university-produced content, like yearbooks.
You also need to think about whether there are any restrictions in your deed of gift or donor agreements that could prevent you from distributing content online. The classic example is restricting personal papers until a certain period of time after the donor’s death, but there may be an explicit clause in your donor agreement stipulating material can only be viewed on site.
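A time restriction of this kind can be recorded as data and checked before anything goes online. The sketch below is one hedged way to do that in Python; the function names and the idea of storing the donor’s date of death plus an embargo length in years are assumptions for illustration, not terms from any particular donor agreement.

```python
from datetime import date


def add_years(start: date, years: int) -> date:
    """Return the same calendar day `years` later."""
    try:
        return start.replace(year=start.year + years)
    except ValueError:
        # 29 February in a non-leap target year: fall back to 1 March.
        return start.replace(year=start.year + years, month=3, day=1)


def is_open(donor_death: date, embargo_years: int, today: date) -> bool:
    """True once the embargo period after the donor's death has passed."""
    return today >= add_years(donor_death, embargo_years)
```

A check like this only covers time restrictions; a clause limiting material to on-site viewing would need its own field in your records.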
Something that can come into play with your institutional records is whether there is any confidential information that needs to be redacted. The obvious example in a university is student records, but there can be Social Security numbers, personnel files, and other privileged information hiding in any number of places, such as professors’ papers.
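For text that has already been transcribed or OCR’d, a simple pattern search can flag likely Social Security numbers for review. This Python sketch is only a first pass under stated assumptions: the `redact_ssns` name is hypothetical, and the pattern catches only the hyphenated 3-2-4 digit form, so it will miss other formats and other kinds of confidential information. Human review is still needed.

```python
import re

# Matches only the hyphenated form, e.g. 123-45-6789.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def redact_ssns(text: str) -> tuple[str, int]:
    """Replace SSN-like strings and report how many were found."""
    return SSN_PATTERN.subn("[REDACTED]", text)
```

The count returned alongside the redacted text can go into your documentation of access decisions, so there is a record of what was removed and why.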
You should also consider that just because you can digitize what you want and put it online, that doesn’t mean you should. Some content is probably better kept off the internet, like university blueprints of campus buildings, which could easily be put to nefarious use.
Most of the basic questions about your legal right to make content accessible over the web can be answered by robust content management. In other words, know what you have and keep good documentation of collections coming into the archives so you know what donors expect regarding their collections. When you’re making decisions about what kind of access to provide, make sure you document what you’re doing and the logic behind it so that you have a paper trail in the event of a problem. This is also the logical place to review your standard donor agreement to see if it addresses how the donor wants to treat any future digital surrogates produced from their records.
For questions that you can’t answer, find a good legal adviser who understands copyright and intellectual property law. Having good documentation of your collections will not only help you know what to do with your collections, it will also help any legal advisers help you!