Removal: we have all encountered this experience.
Obsolescence: we all have 5.25-inch floppy disks we can no longer read.
Loss: unless well managed, data can and does go missing.
Funding: as administrations (government or academic) come and go, priorities change and funds are reallocated.
E-journals are vulnerable.
Libraries are allocating significant funds to e-resources. Since they do not hold a local copy, they cannot provide hands-on preservation as they did with print.
Explain who RLG and OCLC are, and their leading role in digital preservation standards development.
Third party = independent of the content creator. Examples: national libraries, Portico.
Economic model: financial support must be diverse enough to minimize risk (link to the AHDS government funding model).
Tech infrastructure: must be refreshed as technology, preservation strategies, and tools evolve.
OAIS: an ISO standard (ISO 14721).
TRAC: developed with input from RLG, OCLC, and the Center for Research Libraries, with Mellon funding. Note Portico’s participation.
The Inter-university Consortium for Political and Social Research (ICPSR) provides access to an extensive collection of downloadable data.
Portico’s approach builds from its preservation-oriented mission.
Digital Preservation: New Issues and Responsibilities
Eileen Fenton
Executive Director, Portico
Society for Scholarly Publishing
May 28-30, 2008
New Issues and Responsibilities
1. Why worry about digital preservation?
2. What digital preservation is (and is not)
3. Core requirements
4. Emerging models and case study
5. Insights from operations
1. Why Worry About Digital Preservation?
E-resources can and do disappear.
• Removal: 27 months after publication, up to 13% of online cited sources are irretrievable*
• Obsolescence: Tapes of U.S. census data from the 1960s are now inaccessible
• Loss: The location of NASA’s original moon landing recordings is (currently) unknown
• Funding: Funding for the long-standing UK Arts and Humanities Data Service was discontinued in April 2008
• Orphans: When ownership or other rights become uncertain, availability is threatened
*Dellavalle, Robert P. et al. “Information Science: Going, Going, Gone.” Science 302, no. 5646 (Oct. 31, 2003), 787-8.
1. Why Worry About Digital Preservation?
• The shift to reliance upon e-resources is accelerating.
• E-resources consume a growing portion of total library materials expenditures.
• Libraries typically license access to rather than own outright e-resources.
[Chart: Average E-Resource Expenditure as Percent of Total LME, by year, 1994-95 through 2004-05]
Mark Young and Martha Kyrillidou, ARL Statistics 2004-05 (Washington: Association of Research Libraries, 2005).
2. What Digital Preservation Is (and Is Not)
Digital preservation is not:
• Reformatting from print to digital for access surrogates or product line expansion
• Back-up or byte storage on various media
• Mirror sites or networks designed for reliable delivery
• Carried out within delivery systems
2. What Digital Preservation Is (and Is Not)
Digital preservation is:
• Active content management designed to ensure enduring usability, authenticity and accessibility over the very long term
  – See Trusted Digital Repositories: Attributes and Responsibilities. An RLG-OCLC Report, May 2002.
  – See The Preservation Management of Digital Material Handbook
3. Core Requirements for Digital Preservation
• Third party with an organizational mission to carry out preservation
• A sustainable economic model able to support preservation activities over the targeted timeframe
• Technological infrastructure able to support the selected preservation strategy and best practices
3. Core Requirements for Digital Preservation
• Clear legal rights and relationships with content providers and (eventual) users
• Compliance with digital preservation standards and best practices
  – OAIS: Open Archival Information System
  – TRAC: Trustworthy Repositories Audit and Certification
  – DRAMBORA: Digital Repository Audit Method Based on Risk Assessment
4. Emerging Models
• Models for e-journal preservation are emerging
• E-Journal Archiving Metes and Bounds: A Survey of the Landscape, published by the Council on Library and Information Resources (CLIR) in September 2006, reports on current approaches*
  – A survey of 12 e-journal initiatives
  – All efforts are described as young and requiring ongoing evaluation
*http://www.clir.org/pubs/reports/pub138/pub138.pdf
4. Emerging Models
1. National libraries
  – To support mission or legal deposit
  – Content scope and access terms vary
  – Government funded
  – Ex: National Library of the Netherlands, British Library
2. Community-supported third-party preservation archives
  – Provide a focused point of accountability
  – Costs shared across participating publishers and libraries
  – Ex: Portico, ICPSR
4. Emerging Models
3. Networked library efforts
  – Responsibility shared across a group of institutions
  – May (or may not) use specialized software
  – Ex: C/LOCKSS (Lots of Copies Keep Stuff Safe and Controlled LOCKSS), National Digital Information Infrastructure and Preservation Program (NDIIPP)
4. Case Study: Portico Mission
To preserve scholarly literature published in electronic form and to ensure that these materials remain available to future generations of scholars, researchers, and students.
4. Case Study: Portico Content Scope
In scope:
• Electronic scholarly literature, initially e-journals; other genres under active discussion
• Intellectual content including text, tables, images, supplemental files
• Limited functionality such as internal linking
Out of scope:
• Full functionality of publisher’s delivery platform
• Today’s ephemeral HTML rendition
4. Case Study: Portico Methodology: Migration and Byte Storage
• Publishers deliver “source files” (SGML, XML, PDF, etc.) to Portico.
• Portico converts proprietary source files from multiple publishers to archival formats suitable for long-term preservation.
• 7.1 million+ journal articles preserved to date; hardware systems capacity supports ingest of 1-2 million articles per month.
• Portico migrates files to new formats as technology changes.
4. Case Study: Portico Access to the Preservation Archive
• Only participating libraries and publishers may access the archive.
• Access is offered when specific trigger event conditions prevail and when titles are no longer available from the publisher or other sources.
• Trigger events initiate campus-wide access for all libraries supporting the archive, regardless of previous subscriber status.
• Libraries may rely upon the Portico archive for post-cancellation access if a publisher chooses to name Portico as one of the mechanisms to meet this obligation. For approximately 85% of preserved titles, Portico is so named.
4. Case Study: Portico Sources of Support
• Early support provided by The Andrew W. Mellon Foundation, Ithaka, JSTOR and the Library of Congress
• Ongoing support for the archive comes from the primary beneficiaries of the archive.
• Contributing publishers supply content and make an annual financial contribution ($250 to $75,000).
• More than 7,550 journals (~14M articles) from 55 publishers are committed to the archive to date.
• Libraries make an Annual Archive Support (AAS) payment based upon total library materials expenditures ($1,500 to $24,000).
• More than 430 libraries from 13 countries participate in the archive.
5. Insights from Operations: Publishers
• Publishers are developing multi-layered strategies to mitigate risk and meet library requirements.
• Cooperative interaction with archival partners is required to establish data flows and respond to questions.
• Third-party preservation archives can supply feedback regarding data consistency and standards conformance.
5. Insights from Operations: Libraries
• Libraries are actively evaluating options for meeting preservation obligations and needs.
• Multi-layered strategies to preserve a wide array of print and e-content are being developed.
• Breadth of preservation strategy varies with institutional size.
5. Insights from Operations: Preservation Archives
• Archives must be prepared to respond to complex, still-emerging e-publishing best practices.
• File usability vs. validity creates special preservation challenges.
• Gathering and communicating holdings information is critical and challenging.