OCLC Research has organized a full program of work under the rubric of Shared Print collections since 2007. Past work has included analysis of the distribution and character of uniquely held books in research libraries, the state of library off-site storage, obstacles to de-accessioning print journal back files, policy frameworks for shared single and last copy repositories, and studies of the aggregate print and digital collections. Over the past year, and continuing into the year to come, we have focused our attention on managing risk in print preservation – the risks of overinvesting in aggressive strategies and the risks of perpetuating the current laissez-faire approach. We’re continuing to explore this issue in current projects, including a few that are especially pertinent to today’s discussion. I will not be discussing any of these in detail, but it’s worth calling out a few of them: a print journals preservation project that we’ve pursued in collaboration with a number of US research libraries, looking at the risks associated with current decentralized print preservation strategies as they relate to ‘at risk’ scholarly journals in the humanities; a new effort that we’ll be advancing with NYU, Hathi and ReCAP libraries, focused on service and business models for shared print and digital collections; an effort to model workflows for disclosing print archiving commitments using the MARC21 bib/LHR format; and a new project, led by my colleague Dennis Massie, that will produce a decision tree for rationalizing locally held print journal back files. We also continue to pursue a range of data-mining projects looking at aggregate collections as they relate to print and digital collection management.
A few words about our print journals preservation project, which has involved 10 partner libraries over the last year. We spent a good deal of time on our sampling and selection criteria – ultimately focusing on active scholarly journals in the humanities with limited holdings and print-only distribution. We were interested to discover, for this particular class of at-risk titles, what the current preservation status is and what the long-term prospects are likely to be. This entailed a title-by-title review of 200 journals by participating libraries, which completed detailed workforms describing the scope and condition of local holdings, documented usage trends, physical environment, and other factors. Libraries were also invited to indicate whether the at-risk titles in their local collections were likely to benefit from any long-term archiving and access commitment, or whether they would consider transferring holdings to an agent prepared to make such commitments. Our results suggest that the greatest danger facing these titles has less to do with item condition or use than with institutional ignorance about the value of the content, coupled with very sparse holdings. This combination spells almost certain death for these titles, which cannot reach the thresholds of duplication needed for long-term survivability and which are likely to be canceled. We are in the process of drafting a final report on this project and exploring a possible follow-on effort with ARL, which is interested in the economic dimensions of preservation as they affect the publishing community. How it relates to today’s conversation: it is evident that the absence of coordinated action is placing this content at even further risk; regional efforts are not likely to suffice, given limited aggregate holdings. There is a significant overlap between these titles and CRL’s holdings – but is CRL prepared to ‘own’ this responsibility on behalf of the research library community as a whole?
We’ve also very recently initiated a project that aims to characterize the service requirements for shared print and digital service providers. We’re calling this our ‘Cloud Library’ project because it is exploring a service model that is similar in some respects to ‘cloud computing’ and web-based storage and management of information assets. We’ll be working with NYU, Hathi and ReCAP to characterize what the ‘total value’ of services to a consumer library might be, based on a calculation of overlap in holdings between the collections and on the service expectations of the consuming institutions. We are explicitly focused on service models that can operate at scale – not one-off solutions customized for a private university in Manhattan. A certain amount of analysis has already been done, though much remains to be done. Preliminary results based on a May extract of Hathi data (a rapidly moving target) suggest that, on average, we can expect the overlap between Hathi and collections in large US research libraries to exceed 20%. This is a significant finding given that Hathi expects to ingest something on the order of 400K volumes each month for the next year at least. (They project 18M volumes in the next five years or so.) Only a small part of the content in Hathi is in the public domain, so it can’t serve as a surrogate for research library collections as a whole (even for preservation purposes) without some complementary shared print services. We are interested in hearing from research libraries about how they envision consuming (or contributing to) this kind of shared service model. How this fits with today’s conversation: we believe that regional print preservation and access hubs are likely to play a significant role in the future of academic collection management. CRL, the UC RLFs, and large-scale shared print repositories in other regions all represent potential shared print service providers.
Consortia may be little more than a macrocosm of the individual institution – organized around a model of self-sufficiency. Some challenges: You can’t go it alone, even if you want to (OhioLINK, UC). Preservation and access needs don’t recognize geographic or consortial boundaries (UC CRL print archiving; journals consolidation). Absence of coordination means good work at the regional level is invisible and may even come at the expense of community needs (viz. last copy policies).
We need to factor books back into the equation – they account for the lion’s share of research library collections on a title basis, duplication is thin, and the format transition is accelerating. We are not paying sufficient attention to the economics of scholarly publishing in the long tail. Small publishers represent the greatest share of producers but may be put out of business if the library community doesn’t acknowledge that their ability to move content into online formats is limited. We need to attend more closely to the dynamics of scholarly communication in managing our legacy collections; there is a lot of good, current work looking at the ‘value’ of information content as it is currently being consumed (Eigen, MESUR, Bergstrom). By ignoring that work, we run the risk of sentimentalizing our collection management practices – assuming that print preservation is inherently valuable and worthy of institutional resources. Finally, and most critically, we need to examine the challenges of print preservation in the context of larger changes in the library environment and consider which parts of the problem lend themselves to collaborative action. What aspects of print inventory management require local attention? Which can be reorganized on a regional basis? What opportunities for shared service provision exist? How can we begin to organize a response in a decentralized institutional environment? Can we assign responsibility for coordination to an existing national (or international) body, or is a new organizational entity needed?
OCLC Research: Current Work in Shared Print
Risk assessment : Print journal preservation
Where is community investment most needed?
Shared service models : Toward a ‘Cloud Library’
Use cases for large-scale shared print/digital repositories
Infrastructure : MARC 583
Recording and disclosing print archiving commitments