OCLC Research has organized a full program of work under the rubric of Shared Print collections since 2007. Past work has included analysis of the distribution and character of uniquely held books in research libraries, the state of library off-site storage, obstacles to de-accessioning print journal back files, policy frameworks for shared single and last copy repositories, and studies of the aggregate print and digital collections. Over the past year, and continuing into the year to come, we have focused our attention on managing risk in print preservation: the risks of overinvesting in aggressive strategies and the risks of perpetuating the current laissez-faire approach. We’re continuing to explore this issue in current projects, including a few that are especially pertinent to today’s discussion. I will not be discussing any of these in detail, but it’s worth calling out a few of them: a print journals preservation project that we’ve pursued in collaboration with a number of US research libraries, looking at the risks associated with current decentralized print preservation strategies as they relate to ‘at risk’ scholarly journals in the humanities; a new effort that we’ll be advancing with NYU, Hathi and ReCAP libraries, focused on service and business models for shared print and digital collections; an effort to model workflows for disclosing print archiving commitments using the MARC21 bib/LHR format; and a new project, led by my colleague Dennis Massie, that will produce a decision tree for rationalizing locally held print journal back files. We also continue to pursue a range of data-mining projects looking at aggregate collections as they relate to print and digital collection management.
A few words about our print journals preservation project, which has involved 10 partner libraries over the last year. We spent a good deal of time on our sampling and selection criteria, ultimately focusing on active scholarly journals in the humanities with limited holdings and print-only distribution. We were interested to discover, for this particular class of at-risk titles, what the current preservation status is and what the long-term prospects are likely to be. This entailed a title-by-title review of 200 journals by participating libraries, who completed detailed workforms describing the scope and condition of local holdings, documented usage trends, physical environment, and other factors. Libraries were also invited to indicate whether the at-risk titles in their local collections were likely to benefit from any long-term archiving and access commitment, or whether they would consider transferring holdings to an agent prepared to make such commitments. Our results suggest that the greatest danger facing these titles has less to do with item condition or use than with institutional ignorance about the value of the content, coupled with very sparse holdings. This combination spells almost certain death for these titles, which cannot reach the thresholds of duplication needed for long-term survivability and which are likely to be canceled. We are in the process of drafting a final report on this project and exploring a possible follow-on effort with ARL, which is interested in the economic dimensions of preservation as they affect the publishing community. How this relates to today’s conversation: it is evident that the absence of coordinated action is placing this content at even further risk; regional efforts are not likely to suffice, given limited aggregate holdings. There is a significant overlap between these titles and CRL holdings, but is CRL prepared to ‘own’ this responsibility on behalf of the research library community as a whole?
We’ve also very recently initiated a project that aims to characterize the service requirements for shared print and digital service providers. We’re calling this our ‘Cloud Library’ project because it is exploring a service model that is similar in some respects to ‘cloud computing’ and web-based storage and management of information assets. We’ll be working with NYU, Hathi and ReCAP to characterize what the ‘total value’ of services to a consumer library might be, based on a calculation of overlap in holdings between the collections and on the service expectations of the consuming institutions. We are explicitly focused on service models that can operate at scale, not one-off solutions customized for a private university in Manhattan. A certain amount of analysis has already been done, though much remains to be done. Preliminary results based on a May extract of Hathi data (a rapidly moving target) suggest that, on average, we can expect the overlap between Hathi and collections in large US research libraries to exceed 20%. This is a significant finding given that Hathi expects to ingest something on the order of 400K volumes each month for the next year at least. (They project 18M volumes in the next five years or so.) Only a small part of the content in Hathi is in the public domain, so it can’t serve as a surrogate for research library collections as a whole (even for preservation purposes) without some complementary shared print services. We are interested in hearing from research libraries about how they envision consuming (or contributing to) this kind of shared service model. How this fits with today’s conversation: we believe that regional print preservation and access hubs are likely to play a significant role in the future of academic collection management. CRL, the UC RLFs, and large-scale shared print repositories in other regions all represent potential shared print service providers.
Consortia may be little more than a macrocosm of the individual institution, organized around a model of self-sufficiency. Some challenges: you can’t go it alone, even if you want to (OhioLINK, UC); preservation and access needs don’t recognize geographic or consortial boundaries (UC CRL print archiving; journals consolidation); and the absence of coordination means good work at the regional level is invisible and may even come at the expense of community needs (viz. last copy policies).
We need to factor books back into the equation – they account for the lion’s share of research library collections on a title basis, duplication is thin, and the format transition is accelerating. We are not paying sufficient attention to the economics of scholarly publishing in the long tail. Small publishers represent the greatest share of producers but may be put out of business if the library community doesn’t acknowledge that their ability to move content into online formats is limited. We need to attend to the dynamics of scholarly communication more closely in managing our legacy collections; there is a lot of good, current work looking at the ‘value’ of information content as it is currently being consumed (Eigen, Mesur, Bergstrom). By ignoring it, we run the risk of sentimentalizing our collection management practices – assuming that print preservation is inherently valuable and worthy of institutional resources. Finally, and most critically, we need to examine the challenges of print preservation in the context of larger changes in the library environment and consider which parts of the problem lend themselves to collaborative action. What aspects of print inventory management require local attention? Which can be reorganized on a regional basis? What opportunities for shared service provision exist? How can we begin to organize a response in a decentralized institutional environment? Can we assign responsibility for coordination to an existing national (or international) body or is a new organizational entity needed?
1. OCLC Research: Current Work in Shared Print <ul><li>Risk assessment: Print journal preservation </li></ul><ul><ul><li>Where is community investment most needed? </li></ul></ul><ul><li>Shared service models: Toward a ‘Cloud Library’ </li></ul><ul><ul><li>Use cases for large-scale shared print/digital repositories </li></ul></ul><ul><li>Infrastructure: MARC 583 </li></ul><ul><ul><li>Recording and disclosing print archiving commitments </li></ul></ul><ul><li>Guidance: De-accessioning print journals </li></ul><ul><ul><li>Roadmap for collection managers </li></ul></ul><ul><li>Aggregate collections: </li></ul><ul><ul><li>OhioLINK duplication and circulation rates </li></ul></ul><ul><ul><li>Orphan works & Google Library partners </li></ul></ul><ul><ul><li>Archival collections (descriptive practice) </li></ul></ul>
2. Print Journals Preservation Project <ul><li>Qualitative assessment of current preservation model </li></ul><ul><li>What factors contribute to local commitment to preserve? </li></ul><ul><li>200 scholarly humanities journals w/ print-only distribution </li></ul><ul><li>Representative sample of US research libraries (10) </li></ul><ul><li>Holdings generally incomplete; ~15% to ~80% of pub’d vols </li></ul><ul><li>Condition generally good; 12% exhibit minor problems </li></ul><ul><li>Usage is low; 70% with no evidence of demand over 5 yrs </li></ul><ul><li>Bibliographic description generally adequate </li></ul><ul><li>Limited institutional incentive to retain/preserve </li></ul><ul><li>Commitment increases with extent of local holdings </li></ul><ul><li>Cancellations abound; negative selection pressure on survivability of small scholarly publishers </li></ul>
3. ‘Toward a Cloud Library’ <ul><li>What savings/resource reallocation is possible if local reliance on shared print & digital collections is maximized? </li></ul><ul><li>What service expectations must be met? </li></ul><ul><li>Model service requirements for shared print/digital suppliers </li></ul><ul><li>Characterize range of operational efficiencies for research libraries; approximate retail value of services </li></ul><ul><ul><li>Model consumer: NYU </li></ul></ul><ul><ul><li>Model suppliers: ReCAP shared print collection </li></ul></ul><ul><ul><li> Hathi shared digital collection </li></ul></ul><ul><li>~20% duplication between average large, North American research library collection and Hathi </li></ul><ul><li>~12% of titles in Hathi are public domain; ~2% of local print holdings (dup’n mostly in-copyright) </li></ul><ul><li>Combination of shared print and digital services is needed </li></ul>
4. Roles for Regional Consortia <ul><li>Opportunities </li></ul><ul><li>Consolidation and validation of shared print archives for regional partners </li></ul><ul><li>Load-leveling for national preservation interests, objectives </li></ul><ul><li>Characterization of regional service requirements to external providers </li></ul><ul><li>Representation in supra-regional governance structures </li></ul><ul><li>Challenges </li></ul><ul><li>Regional inventory inadequate to sustain scholarly activity or community preservation goals </li></ul><ul><li>Membership models ill-adapted to system-wide expectations and needs </li></ul><ul><li>Preservation guarantees subject to revision based on local/regional prerogatives at the expense of community preservation objectives, expectations </li></ul>
5. Gaps to be addressed in a national strategy for legacy print collections . . . <ul><li>Redress balance in format-based initiatives </li></ul><ul><ul><li>Millions of digitized pages vs. millions of digitized titles </li></ul></ul><ul><ul><li>Heterogeneity of preservation infrastructure </li></ul></ul><ul><li>Small scholarly publishers </li></ul><ul><ul><li>Limited scale, fragmented infrastructure </li></ul></ul><ul><ul><li>Print-only formats predominate; limited market for conversion </li></ul></ul><ul><li>Valuation of scholarly content in the network </li></ul><ul><ul><li>Move beyond use (circulation, page views) as measure of value </li></ul></ul><ul><ul><li>Scope, duration of disciplinary networks should guide re-evaluation of preservation requirements </li></ul></ul><ul><li>Locus of action: where is responsibility fixed? </li></ul><ul><ul><li>Regional consolidation, optimizat’n of physical inventory </li></ul></ul><ul><ul><li>De-centralized governance; absence of coordinating bodies </li></ul></ul>
6. Questions, Comments? <ul><li>OCLC Research Shared Print Program </li></ul><ul><li>www.oclc.org/programs/ourwork/collectivecoll/sharedprint/ </li></ul><ul><li>Journals Preservation Project </li></ul><ul><li>www.slideshare.net/RLGPrograms/journals-preservation-project-managing-risks-in-perilous-times </li></ul><ul><li>Cloud Library </li></ul><ul><li>www.slideshare.net/RLGPrograms/cloud-library-precipitating-change-in-library-infrastructure </li></ul><ul><li>RLG Shared Print Update, Monday 13 July, 10:00-11:30, OCLC Red Suite, Hyatt McCormick Place </li></ul>[email_address]