What is Digital Preservation?
Lynne M. Thomas, Northern Illinois University
CARLI Digital Preservation Forum, April 10, 2012
It is not…
• Scary
• Impossible
• Difficult to understand
• Someone else’s problem
• Anything we can’t handle… eventually
• Something we must do all by ourselves

It is…
• Necessary
• Achievable
• Collaborative
• Something we can plan for long-term
• Allied with our current skill set
• Well within our mission
• Required (if you want grant money)
PARS (ALA) Definition
“Digital preservation combines policies, strategies and actions to ensure access to reformatted and born digital content regardless of the challenges of media failure and technological change. The goal of digital preservation is the accurate rendering of authenticated content over time.”
Source: http://www.ala.org/ala/mgrps/divs/alcts/resources/preserv/defdigpres0408.cfm
Challenges
• The content of a digital object (the text itself), its context (how the text is rendered on a computer screen), and its metadata (the “title page” of the text) can easily become separated from one another.
• Digital preservation requires saving all three components and keeping them linked together in perpetuity.
• Example: an Excel file
 – Content = the text in each cell
 – Context = Excel’s (proprietary) method of display and cell order
 – Metadata = information about the file (name, version of Excel, date of creation, etc.)
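One way to picture “linking the three components together” is a small manifest stored alongside the object. The sketch below is only an illustration: the `PreservationRecord` class, its field names, and the sample values (including the file names) are hypothetical and do not come from any preservation standard or from the talk.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class PreservationRecord:
    """Links the three components of one digital object (hypothetical layout)."""
    content_path: str  # where the content itself is stored
    context: dict      # what is needed to render the content
    metadata: dict     # descriptive information about the file


def to_manifest(record):
    """Serialize the record as JSON text that can be stored next to the object."""
    return json.dumps(asdict(record), indent=2, sort_keys=True)


def from_manifest(text):
    """Rebuild the record from its JSON manifest."""
    return PreservationRecord(**json.loads(text))


# Illustrative values only, loosely following the Excel example above.
record = PreservationRecord(
    content_path="budget.csv",
    context={"original_format": "Excel workbook", "rendering_software": "Microsoft Excel"},
    metadata={"file_name": "budget.xls", "note": "sample entry"},
)
manifest = to_manifest(record)
restored = from_manifest(manifest)
```

Keeping the manifest in an open text format such as JSON means the link between content, context, and metadata does not itself depend on proprietary software.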
Challenges
• Creators of digital content often aren’t thinking long-term. That’s our job.
• I love standards; there are so many to choose from!
Misconceptions I
• Copying data to a single CD/DVD/flash drive/external hard drive is not enough.
• “Benign neglect” doesn’t work on digital objects. Data degrades unless it is continually refreshed and checked; this degradation is called “bit rot.”
• The Internet is forever… right?
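The “continually checked” part is commonly done with checksums, often called fixity checks: record a digest of each file once, then recompute it later and compare. The following is a minimal sketch using SHA-256 from Python’s standard library; the function names are made up for this illustration, and a real program would also need to store the baseline somewhere safe.

```python
import hashlib
import os


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_fixity(paths):
    """Build a baseline mapping each path to its current checksum."""
    return {path: sha256_of(path) for path in paths}


def check_fixity(baseline):
    """Return the paths that are missing or whose checksum has changed."""
    damaged = []
    for path, expected in baseline.items():
        if not os.path.exists(path) or sha256_of(path) != expected:
            damaged.append(path)
    return damaged
```

A checksum mismatch only reports that a file has changed; recovering the original still requires a second, intact copy, which is why a single CD or drive is not enough.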
Misconceptions II
• Archive.org has it all!… unless we’re talking about…
 – Copyright issues
 – ISP hopping
 – Don’t archive me! (robots.txt files)
 – Social networks, deep web subscription databases
 – Pre-1996 websites
Misconceptions III
• That’s okay, we have an Institutional Repository (IR)
 – What’s in it?
 – How is it maintained?
 – What file and data types does it cover?
 – Open source or vendor-driven?
• An IR is NOT a long-term DP solution (although it may be linked to one).

Misconceptions IV
• We need lots of money to even begin
• Campus IT will take care of it for us
• … And they know about our needs and standards
• Or we’ll have to do it all by ourselves. Without help. Or funding.
What now?
• Education
• IMLS National Leadership Grant
• Start the conversation on your campus
Education: Yourself
• Nancy McGovern’s Three-Legged Stool (2007)
 – Technology, Organization, Resources
 – Source: “A Digital Decade: Where Have We Been and Where are We Going in Digital Preservation?” RLG DigiNews 11:1 (April 15, 2007)
• Digital Preservation Management online workshop: http://www.dpworkshop.org/
• LOCKSS: Lots of Copies Keeps Stuff Safe
• Special Collections 2.0 (2009)
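The idea behind “Lots of Copies Keeps Stuff Safe” can be illustrated with a simple majority vote: keep several copies, compare their checksums, and repair any copy that disagrees with the majority. The sketch below is a toy illustration of that principle only. It is not the actual LOCKSS software or its polling protocol, and the function names and location labels are hypothetical.

```python
import hashlib
from collections import Counter


def digest(data):
    """Return the SHA-256 hex digest of a bytes object."""
    return hashlib.sha256(data).hexdigest()


def audit_copies(copies):
    """Compare copies held at several locations.

    `copies` maps a location label to the bytes stored there.
    Returns (majority_digest, sorted list of locations that disagree).
    Raises ValueError when no digest is held by more than half the copies.
    """
    digests = {location: digest(data) for location, data in copies.items()}
    winner, votes = Counter(digests.values()).most_common(1)[0]
    if votes <= len(copies) / 2:
        raise ValueError("no majority among copies; manual review needed")
    disagreeing = sorted(loc for loc, d in digests.items() if d != winner)
    return winner, disagreeing


def repair(copies):
    """Return a new mapping in which every location holds the majority copy."""
    winner, _ = audit_copies(copies)
    good = next(data for data in copies.values() if digest(data) == winner)
    return {location: good for location in copies}
```

With only two copies, a disagreement cannot be resolved by voting, which is one reason to keep at least three.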
Education: Others
• Educate your campus, your stakeholders, your IT folks. This is everyone’s problem.
• NSF data management plan requirements: no DMP, no grant $$
• Build allies and stakeholders

IMLS Grant
• NIU, ISU, WIU, Illinois Wesleyan, Chicago State
• Explore sustainable DP options for small and medium-sized institutions (or underfunded ones)
• Examine multiple options, incl. cloud storage
• White paper (late 2013 or early 2014)
Ask your campus
• How much data are we talking about?
• What kinds?
• Public or private? Who holds copyright?
• Who holds the data? (Faculty? The library? Vendors? University offices?)
• Where?
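For data that already sits on a shared drive or server, the first two questions (how much, and what kinds) can be roughed out with a short script. This is a minimal sketch that assumes the material lives in an ordinary directory tree; the `inventory` function is hypothetical, and a file extension is only a rough stand-in for the actual file format.

```python
import os
from collections import defaultdict


def inventory(root):
    """Count files and total bytes per file extension under `root`."""
    totals = defaultdict(lambda: {"files": 0, "bytes": 0})
    for folder, _subdirs, names in os.walk(root):
        for name in names:
            # Group by lowercased extension; files without one go under "(none)".
            extension = os.path.splitext(name)[1].lower() or "(none)"
            size = os.path.getsize(os.path.join(folder, name))
            totals[extension]["files"] += 1
            totals[extension]["bytes"] += size
    return dict(totals)
```

Calling `inventory` on a folder returns a dictionary keyed by extension, which can feed the conversation about storage needs and format coverage.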
When you get answers
• Create a digital collection development plan = what do we plan/hope to collect and save?
• Create a data management plan = how do we plan to save it?
 – DMPTool: https://dmp.cdlib.org/
• Work towards a long-term funding plan = how can we pay for this?

Moving forward
• Baby steps (like planning) still count. The sooner we begin, the better off we are.
• Collaboration is key; this isn’t cheap, but working together brings down costs.
• Be ready to leverage resources when they do arrive.