1. Shift to Online: implications for preserving
scholarly communications
Daniel Dollar
Head, Collection Development & Management
Cushing/Whitney Medical Library
Yale School of Medicine
daniel.dollar@yale.edu
4. Shifting Formats
• Content
• Online journal is the version of record
• Digital backfiles (include everything)
• Scholarly Sharing
• Interlibrary loan
• Fair use
• Licensing
• Pricing models
• Usage
• COUNTER compliant
5. Shifting Formats
• Accessibility
• Easy and clear path to content with few/no clicks
• No passwords please
• OpenURL compliant
• Ownership
• Purchase content not lease it
• Perpetual or archival access rights
• Post-cancellation access to subscribed content
• Preservation
• Use trusted third-party preservation archive(s)
6. Preservation: Problem Statement
Digital preservation represents one of the grand challenges
facing higher education. Yet… the responsibility for
preservation is diffuse and the responsible parties have
been slow to identify and invest in the necessary
infrastructure. The shift from print to electronic publication
of scholarly journals is occurring at a particularly rapid
pace; the digital portion of the scholarly record is
increasingly at risk and solutions may require unique
arrangements within the academy for sharing preservation
responsibility.
Adapted from "Urgent Action Needed to Preserve Scholarly
Electronic Journals," Don Waters et al., October 2005
7. Preservation: Study
• Association of Research Libraries (ARL) and the
Council on Library and Information Resources
(CLIR) agreed on the need for a survey of online
journal archiving initiatives.
• CLIR commissioned a study, with a report issued in
September 2006 entitled E-Journal Archiving
Metes and Bounds: A Survey of the Landscape.
8. Contents
• Includes: the "who, what, when, where, why,
and how" of significant archiving programs
operated by not-for-profit organizations in the
domain of peer reviewed journal literature
published in digital form.
• Excludes: preservation efforts covering
digitized versions of print journals (e.g.,
JSTOR), library conversion projects, publisher
efforts, and initiatives in planning stages.
9. Twelve initiatives studied
• Government mandated/funded (6):
• Koninklijke Bibliotheek - e-Depot (Dutch national
deposit library)
• Kopal - DDB (National Library of Germany &
Ministry of Education & Research)
• CISTI - Csi (Canadian national science library)
• NLA-Pandora (Preserving and Accessing Networked
Documentary Resources of Australia)
• PubMed Central (National Institutes of Health-
National Library of Medicine)
• LANL-RL aDORe (Los Alamos National Laboratory
Research Library)
11. Seven indicators of viability
1. Explicit mission & necessary mandate to perform
long-term archiving
2. Negotiate all rights and responsibilities to carry out
its obligations
3. Identify exactly which titles are covered and for
whom
4. Provide a minimal set of defined services - receive,
store, verify integrity, guard against loss, be
auditable (certification)
12. Seven indicators of viability
5. Preserved information available to libraries under
clearly stated conditions
6. Organizationally sound
7. Work as part of a network
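Indicator 4 asks an archive to verify integrity and guard against loss. As a minimal sketch of what such a fixity check might look like, assume the archive records a SHA-256 digest for each object at ingest and re-computes it during audits. The function names and the manifest layout below are hypothetical and are not taken from the CLIR report or from any archive's actual implementation.

```python
import hashlib


def sha256_digest(data: bytes) -> str:
    """Return the hex SHA-256 digest of a stored object's bytes."""
    return hashlib.sha256(data).hexdigest()


def audit(manifest: dict[str, str], store: dict[str, bytes]) -> dict[str, str]:
    """Compare each object against the digest recorded at ingest.

    manifest: object id -> digest recorded at ingest (hypothetical layout)
    store:    object id -> bytes currently held by the archive
    Returns object id -> "ok", "corrupted", or "missing".
    """
    report = {}
    for object_id, recorded in manifest.items():
        if object_id not in store:
            report[object_id] = "missing"
        elif sha256_digest(store[object_id]) != recorded:
            report[object_id] = "corrupted"
        else:
            report[object_id] = "ok"
    return report
```

A report like this covers only the "verify integrity" and "be auditable" parts of the indicator; guarding against loss additionally depends on keeping replicas from which a corrupted or missing object can be restored.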
13. What about content coverage?
• Proved difficult to identify which publications
are being archived, by whom
• Not all publish lists; not all have complete, up-to-date
title lists (this is complicated)
• Not all of a publisher's titles are necessarily
included in a collection (PubMed Central has
largest number of publishers & smallest number
of titles)
• Aggregators such as Muse, BioOne, etc., add
complexity
14. Content coverage (2)
• Participation in the 12 (2006 data):
• Number of unique publishers was 128
• 91 participated in only one program
• 20 participated in 2 programs
• 17 (major) publishers are in 3 or more programs
• Lots of redundancy for STM
• Other disciplines, smaller publishers, non-Roman,
and dynamic Web publications are less well
represented and less likely to have an
archiving/preservation program
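The participation counts above can be checked with a few lines of arithmetic. The sketch below uses only the figures on this slide; the variable names are illustrative. It confirms that the three groups add up to the 128 unique publishers and shows that about 71% of them took part in only one program.

```python
# Publisher counts by number of programs joined (2006 data from the slide).
participation = {
    "one program": 91,
    "two programs": 20,
    "three or more programs": 17,
}

total_publishers = sum(participation.values())

# Share of all unique publishers in each group, as a percentage.
shares = {
    group: round(100 * count / total_publishers, 1)
    for group, count in participation.items()
}

for group, share in shares.items():
    print(f"{group}: {share}% of {total_publishers} publishers")
```

Read this way, the redundancy is concentrated in a small group: only 17 of the 128 publishers are held in three or more programs.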
15. “Minimal” set of services?
• This area of the report:
• Is the longest
• Is particularly clearly written
• Represents the area that we know least about
(much technical activity, with a long way yet to
go to assure perpetual availability)
• Identifies emerging best practices and standards
• Some areas covered: formats for ingestion, what
content is included, how to know it's all there, is it
corrupted, cost effectiveness, data migration vs.
emulation, guard against loss/backup, etc.
16. Organizational viability?
• Most of the 12 appear to have the necessary
organizational structure including:
• Commitment
• Documentation
• Adherence to standards
• Succession planning
• Good business planning, models
• Incoming revenue for support
• However, mostly a limited track record (very new)
17. Part of a network?
• Networks can be formal or informal and provide:
• Idea exchange
• Sharing of documents
• Sharing software
• Coordinating content selection
• Reciprocal storage, mirroring
• Backup if other archives fail
• Shared resources, facilities
• Some of these initiatives are communicating
productively with one or more other initiatives
18. Conclusions of the CLIR study
• Trigger events will happen
• Libraries cannot do this alone
• Current license terms for libraries are mostly inadequate
(perpetual access does not equal preservation)
• Viable options are emerging
• No single archiving program will meet all needs
• Coverage is uneven
• Much content is at risk
• Libraries can and should influence developments
• Legislation needed -- legal deposit
• All programs need greater support, transparency, etc.
19. Digital Repository Certification
Research Libraries Group and National
Archives and Records Administration
Digital Repository Certification Task
Force
Trustworthy Repositories Audit & Certification
(TRAC): Criteria and Checklist (Version 1.0
February 2007)
Center for Research Libraries (CRL) taking
on audit and certification tasks in the US
using TRAC criteria and checklist
20. Copyright and Digital Preservation
• Section 108 Study Group Report (March
2008)
• Clarify library and museum rights to preserve
digital content.
22. Medical Library >> tomorrow
• Pilot study of online
journal archiving
• Reviewed the library’s
“core” journals for
inclusion in LOCKSS
and/or Portico
• Lack of title lists and
ISSNs problematic
• Yale University Library
study to come
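The title-matching step of a pilot like this one could be sketched as follows, assuming each archive's holdings can be reduced to a set of ISSNs. All names are hypothetical and any ISSNs passed in are up to the caller; this is not the procedure the library actually used. Where a title list lacks ISSNs, a check like this would have to fall back on matching by title or publisher.

```python
def normalize_issn(issn: str) -> str:
    """Normalize an ISSN string to the hyphenated form NNNN-NNNC."""
    compact = issn.replace("-", "").strip().upper()
    return f"{compact[:4]}-{compact[4:]}"


def coverage(core: set[str], *archive_lists: set[str]) -> float:
    """Return the fraction of core titles found in at least one archive list."""
    core_norm = {normalize_issn(i) for i in core}
    if not core_norm:
        return 0.0
    archived: set[str] = set()
    for titles in archive_lists:
        archived |= {normalize_issn(i) for i in titles}
    return len(core_norm & archived) / len(core_norm)
```

Normalizing before comparing matters because title lists from different sources often write the same ISSN with or without the hyphen.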
23. Conclusions
• Online journals are the version of record
• Preservation issues are complex
• Technical
• Risks
• Costs
• Trust
• Submit scholarly published content to trusted
(certified) third-party preservation archives
• Use both LOCKSS and Portico.
24. Readings
• E-Journal Archiving Metes and Bounds: A Survey of the
Landscape. (September 2006)
• Urgent Action Needed to Preserve Scholarly Electronic
Journals. (October 15, 2005)
• Section 108 Study Group Report. Executive Summary
(March 2008)
• Trustworthy Repositories Audit & Certification: Criteria
and Checklist. (Version 1.0 February 2007)
• Bernard F. Reilly, Jr. “Center for Research Libraries’
Auditing and Certification of Digital Archives.”
Charleston Advisor (January 2008): 59-60.
• Bernard F. Reilly, Jr. “Summary of the Test Audits of
Portico and LOCKSS.” Charleston Advisor (January
2008): 61-62.
Editor's Notes
Good afternoon. I will start with some background on how the Yale Medical Library has responded to the rise of electronic resources and what that means for how we acquire and maintain access to content, and then I will look at some key documents concerning preservation.
Harvey Cushing & John Hay Whitney Medical Library (CWML), a wing of the Sterling Hall of Medicine at Yale School of Medicine: 460,000 volumes; $2.4 million collection budget. We serve the Yale-New Haven Medical Center, the Yale Schools of Medicine, Nursing, and Public Health, and a teaching hospital, Yale-New Haven Hospital. We are also part of the Yale University Library, with 600 FTE, 20+ locations, and 11 million or so items, sharing an ILS and other systems.
Morse (Periodical) Reading Room. In fall 2003, an extensive usage study found that just over half of the current print journals housed in this room received no detectable use. At the same time, these journals were receiving significant usage online. Our patrons had moved on and we needed to keep up. After a consultant's report, we completely reorganized to manage our online collections more effectively, and stopped receiving and/or keeping print journals except for a “core” list of 240 journals.
Quote from a 3-page statement issued in October 2005 after a meeting convened by the Andrew W. Mellon Foundation with academic administrators and librarians. Or, to put it another way: In an age of information abundance and rapid growth, an age of immensely ambitious digital resources, libraries neither own – nor have much assurance of long-term access to – all this glorious electronic content that we are making available and delivering to our patrons.
I recommend reading the CLIR study. It's 120 pages, but you could stop at the end of the executive summary on page 3.
Government mandated/funded (6):
KB - e-Depot (Dutch national deposit library). Started in 2000. 12 major publishers; Dutch Publishers Association, IBM.
Kopal - DDB (National Library of Germany & Ministry of Education & Research's project to accept journals under legal deposit arrangement). Started in 2004. GNL, Göttingen, IBM, and others.
CISTI - Csi (Canada's national science library; Canada's scientific infostructure). Started in 2003.
LOCKSS Alliance (Lots of Copies Keep Stuff Safe). Started in 2000. Over 200 participating institutions in 20+ countries. Informal and “unregulated.”
CLOCKSS (Controlled LOCKSS). Started in 2006. 7 libraries and 11 publishers to establish a comprehensive dark archive. Intentional and comprehensive.
OCLC-ECO: Started in 1997. Over 5,000 titles from 70 publishers; libraries can select their content.
Portico: Membership-based 3rd party "dark archive" service, includes 39? publishers, thousands of titles (2006).
Consortial implementations, providing access for library members (2):
OhioLink Electronic Journal Center: over 7,200 journals, 9.1M articles, from 100+ publishers, 85+ members. Started early 90s?
Ontario Scholars Portal: serves 20+ university libraries in Ontario; 7,300 journals.
The study group developed a list of seven indicators of an e-journal archive's viability, but these indicators are relevant to any trusted archiving program.
Content issues discovered by the study team in looking at these 12 initiatives.
These figures are two years old, but they give you a sense of where we are.
An important theme from the CLIR study is “Trust.” Do libraries trust publishers to maintain perpetual access and preserve that digital content? Do we trust other libraries? So we need “trusted” third-party archives that have gone through a widely accepted audit and certification process. This leads to the other important report to skim. (It's only 88 pages.) CRL: “Auditing and certification of digital archives project conducted test audits using the criteria and provided feedback to the development of the document.” (TRAC document) CRL “Certification of Digital Archives Project”: Testing RLG-NARA metrics through actual audits…project staff will determine the optimum set of methodologies for auditing and certification, and corresponding cost. It will develop and deliver specifications for the auditing and certification processes, and will outline a business model for the certifying agency or entity best suited to carry out those processes on a continuing basis.
Read the 14-page executive summary. We can create preservation copies of online content. It's not legal deposit as proposed by the CLIR study (that study thinks legal deposit is useful but not a silver bullet), but it's an important step in this area.
Coming back to the Yale Medical Library: across the rotunda from the Morse Reading Room is the Medical Historical Library. The historical library houses one of the world’s finest historical medical collections. The collection contains over 130,000 books, bound manuscripts, journals and pamphlets. This includes 325 incunabula, which are books printed between 1450 and 1500, and a wonderful Renaissance, Arabic and Persian manuscript collection, along with hundreds of bound manuscripts from the 16th to the 20th centuries. The Cushing/Whitney Historical Library also houses the Fry Collection of Prints and Drawings, which spans five centuries, an additional 2500 portrait engravings and over 2000 original photographs. We have an artifact collection that includes over 1,000 medical and scientific instruments and the Streeter Collection of Weights and Measures, containing several thousand items. The Preservation Librarian for the Cushing/Whitney Medical Library began in August 2005 to establish a program that will preserve and conserve these collections. The exhibit, which will be on view from March 14th to June 11th, illustrates issues of preservation and solutions that can be achieved to safeguard this priceless collection.
Yale is in the midst of a capital campaign (we still trail Harvard by a few billion…). Medical Library component: “Medical Library>>tomorrow.” “Preserve Our Past” | Collection Preservation, Conservation Lab and Digitization Project for Historical Collections. We know, especially in the printed world, that preservation is expensive… we don’t expect that to change in the online environment. Approximately 54% of the core journals were identified in one or both of these archives. LOCKSS does not have a title list, so we needed to look at participating publisher lists. The Portico list was missing ISSNs.