  • This is a model we have used to frame some discussions about library collections and operations in the past. The horizontal axis is a measure of the stewardship or curation efforts that have traditionally been needed to manage these materials in libraries. The vertical axis is a measure of how widely held the materials are in the library system: at the top are resources that are abundant in the library community, at the bottom are materials that are relatively rare. In the upper left quadrant are the materials that libraries traditionally purchased and increasingly are leasing. Below that are special collections, rare books and manuscripts. The bottom right includes research outputs and teaching materials. The upper right includes a wide variety of resources found on the Open Web – web sites, discussion lists, blogs etc. Libraries may be interested in all of these areas, but not equally. Traditionally, library acquisitions and operations have focused on the upper left quadrant: published materials in print. Licensed resources were a secondary focus. And, except in major research libraries, there was limited attention to managing rare books and manuscripts, instructional course materials, or Web archiving.
  • Increasingly, [click] we have seen this attention shift to licensed electronic materials, which are now more ubiquitous and also require less local management effort. At the same time, we have seen an intensification of interest in the curation of locally created research and teaching materials. There are a number of reasons for this, including the need to manage institutional reputation, to increase faculty and student productivity, and to improve operational efficiency. Based on what I know of Penn State’s Strategic Plan, it seems to me that this shift is especially pertinent to the University libraries.
  • There are a number of important changes in the academic library environment that we should be paying attention to. First, the shift to reliance on externally sourced, licensed content is accelerating – this is no longer just about e-journals but e-books as well. Secondly, print collections aren’t delivering the value they once did. Circulation rates have been on a steady decline for years; there is increasing attention to the long-term cost burden of acquiring and retaining low-use print books. Finally, special collections are not universally perceived to be a key part of the library’s service mission in higher education. They may contain a few items regarded as treasures by the university, but the acquisition of rare books and manuscripts is rarely viewed, or funded, as a core library function.
  • There are three main drivers I want to call out here, though one could certainly point to others. First, there is general agreement that the traditional library value proposition – acquiring and amassing a comprehensive or substantially representative physical corpus of material for local use – is no longer perceived to be relevant. Second, the nature of the scholarly record has changed and is no longer adequately captured in traditional print and licensed collections. There is increased attention to the need for managing ‘upstream’ research outputs, and traditional print operations are viewed as something of a distraction from this. Finally, and most importantly for the purposes of our discussion today, there is the impact of mass digitisation on the discoverability and perceived ‘location’ of library collections. Digitized books are no longer regarded as the property of individual libraries but instead are simply considered part of the network.
  • I mentioned that the print to electronic transition for journals is already well underway and, by some reckonings, very nearly complete. Indeed, a librarian here at Penn State was one of the first to attempt a serious assessment of the degree to which electronic distribution and digital archiving of scholarly journals could effectively replace local print copies. Bob Seeds, the head of the mathematics library, wrote several papers about how remote storage annexes were transforming the library landscape at the University. At the turn of the 21st century, the “ubiquitous question” for research libraries was whether electronic journals could replace print journals, and secondarily, how this would affect the organization of the university library. JSTOR and later Portico provided the kind of shared infrastructure that made it possible for libraries like Penn State to begin to shift attention and operations from print to electronic. A decade later, we are asking the same question about e-books and the mass-digitized library corpus. And as I’ll suggest further on, we are beginning to see some shared infrastructure emerge to support a transition in management of the monographic literature.
  • At Penn State, the university libraries have historically been viewed as underfunded when compared to peer institutions. I was interested to read (in Michael Bezilla’s history of PSU) that, in the early 1960s, the library accounted for about 1.4% of the institutional budget. For many years, one could hold that figure up as evidence of the need for increased institutional support of the university libraries. At the time, other institutions were allocating 3% or more of university expenditures to the library. Yet, when we look at aggregate spending on college and university libraries in the US over time, it would appear that an investment of less than 1.5% of total institutional spending is the norm today. This chart shows that while total institutional investment in higher education has increased dramatically in the past 30 years, proportional spending on academic libraries has been on a steady decline. If this trend continues, we can project that the university allocation to libraries will fall below 1% by about 2013. This has something to do with the increasing costs of educational infrastructure – spending on laboratories and technology has grown much more rapidly than spending on library infrastructure. So while library expenditures have increased each year, they represent a diminishing part of the university’s total spending in support of research, teaching and learning. This is a trend that is driving a certain amount of change in the academic library environment, encouraging a shift to collaborative sourcing of collections and services, increased attention to the return on library investment, and a stern focus on identifying and eliminating operational inefficiencies. Here at Penn State, there has been a tremendous effort to improve and expand the university libraries, supported in recent years by a stunningly effective development campaign and some exceptionally committed individual donors.
At the same time, there has been a serious reassessment of traditional library operations – guided by a very thoughtful Strategic Plan – and a reallocation of effort and resources in support of a new vision of library excellence. All of this is very impressive. I want to emphasize, however, that the trend toward diminished support for academic libraries is not a new phenomenon and it is not merely a knock-on effect of regional or institutional economic pressures. It is a reflection of much broader changes in the higher education environment, including funding mandates that create incentives for increased institutional attention to science and engineering, a decline in the number of students pursuing advanced degrees in the humanities, and new models of educational provisioning – including distance learning – that are no longer reliant on locally-sourced collections or infrastructure.
  • In the US, the last five years have been marked by significant growth in the for-profit education market, dominated by online universities. These institutions are not reliant on the traditional physical infrastructure of the library. Their success is forcing traditional higher education institutions to compete for students and to revitalize their institutional reputations. The core library operations associated with print-based collections do not have much relevance here. We see the impact of this shift called out in the University’s strategic plan, which acknowledges increased competition for the hearts and minds of the next generation of Penn Staters.
  • While net institutional spending on libraries has declined over the past thirty years, the total university library spending has increased sharply. This is not a reflection of growing library infrastructure – or what we might call “new library starts” – since as you can see the total number of academic libraries has remained relatively stable. There are currently about 3800 college and university libraries in the United States. What’s driving the increased spending is primarily library expenditures on materials, especially at institutions offering post-graduate (Masters and Doctoral) programs. A small number of research libraries – including Penn State – account for the majority of spending on collections. Libraries supporting doctoral-level education account for just 20% of academic libraries in the US, but are responsible for more than 70% of spending on information resources. Something’s got to give.
  • In the US, a majority of research libraries are already spending more than half of the library materials budget on licensed resources. [click] Print is no longer at the center.
  • In fact, more and more of it is at the periphery. The past 25 years have seen massive growth in off-site library storage infrastructure in the US. This is not a trend we expect to see continue.
  • So, what explains the mass migration of books from the center of campus to peripheral annexes and off-site warehouses? It’s not merely a matter of space pressures in academic libraries – as we so often say – but of priorities. If print were genuinely the engine of academic and research excellence, no university would hesitate to allocate prime real estate to it.
  • So we’ve talked about academic collections in general, and the academic library system as a whole. I want now to say a few words about library infrastructure at the institutional level, using Penn State as an exemplar. The organization of the academic library has undergone some major shifts. Traditionally, the library’s function was to amass sufficient materials to make scholarship possible. This was at first embodied in large, centrally located physical collections. Later, those collections were dispersed in specialized departmental libraries. And more recently we have seen efforts to consolidate and re-aggregate collections both physically and virtually to support supra-institutional disciplinary communities. Slide quote: “combined functions of library, repository and collaboratory” (Leo Waaijers (SURF), 2005).
  • Here is a local perspective on library infrastructure in the academy, as viewed by Fred Pattee at the start of the last century. In his course-book on American literature, first published in 1919, he acknowledged that most students of this newly emerging discipline would be hard pressed to obtain local copies of every title in the new canon. This was his justification for compiling excerpts in a textbook – literally, a book of core texts. (It’s still something of a classic – held by more than 200 libraries.) Professor Pattee knew a thing or two about the challenges of teaching and advancing research with limited library infrastructure. At the time he came to Penn State in the 1890s, the university library holdings amounted to about 7 thousand volumes. His efforts to survey the entire corpus of American literature must have entailed heroic exertions in inter-library loan and many research trips. (I was interested to find that PSU special collections contain a catalog of the BPL that Pattee drew up in the 1870s.) This image – from PSU’s digital collections – shows Pattee and his students in the Old Main Library. By the time his textbook was published, the University was equipped with a much larger library and its collections had tripled in size.
  • In many ways, the Carnegie Library that was established on the Penn State campus in 1904 was a harbinger of things to come. It was built with a gift of external funds and intended to increase the institutional reputation of the university as a center of scholarship. This is not dissimilar to the campaign in the 1980s and ‘90s to establish the Paterno library. At the same time, the Carnegie building was very much rooted in tradition, a physical embodiment of the view that a university is a collection of buildings organized around a library. The collection remained relatively small, but was heavily used.
  • In the post-WWII period, Penn State University Libraries continued to grow, both in size and in scope. Although its collections remained small compared to other research universities, its overall configuration was much like other academic libraries, increasingly resembling a warehouse of books.
  • In the 1970s and ‘80s Penn State University found new ways to leverage available infrastructure [click] to extend its collections, by joining up with larger social and technological networks, including the Research Libraries Group and the CIC. The library also found some powerful new allies and increased its collections spending, leapfrogging its way into the top tier of academic research libraries. Image from: [...]’s book, held by 119 libraries. Out of print. (Scanned by Google from U Michigan copy.)
  • Today, Penn State ranks among the top research libraries in the United States – not because of its volume count (traditionally the hallmark of greatness) but because it has aligned institutional support behind a new strategic vision. There is no more dramatic embodiment of this change than the creation of a Knowledge Commons on the first floor of the Pattee Library, which will entail a significant reconfiguration of the physical collections.
  • At Penn State and elsewhere, there is a new convergence between campus computing and the university library. The heart and the mind (or CNS…) of the university are finding common cause in the development of a new cyberinfrastructure. (Image from an article in the PSU campus newsletter promoting the upcoming CyberInfrastructure Day at PSU.)
  • In the current Penn State Strategic Plan, University IT and the Libraries share responsibility for building the University’s infrastructure for research, teaching and administration. The strength of that academic infrastructure is no longer measured with volume counts, but instead by the use of local and shared digital repositories, and the use and re-use of local teaching and research materials. What we are seeing here is a deliberate redirection of library resources and attention in support of those “lower right quadrant” priorities (locally created research and teaching materials) that I spoke of earlier.
  • I want to turn now to the issue of shared infrastructure. Specifically, the emergence of the HathiTrust, a shared digital repository developed within the CIC. This isn’t the only example of cooperatively sourced infrastructure in the higher education environment – one could point to open source platforms like SAKAI, or e-prints – but I believe it will be one of the most important for academic libraries. Over the past year, OCLC Research has studied the rising rate of duplication between titles held in the shared HathiTrust digital repository and in the academic print book collection. This scatter chart provides a simple but effective visualization of an important pattern that this project has revealed: that is, that the risks and opportunities associated with moving collection management ‘into the cloud’ are uniformly distributed across the research library community as a whole. [CLICK] This is a picture of the ARL membership (a microcosm of the larger research library community) that shows the level of duplication between individual library collections and the mass digitized book collection in Hathi. Over the course of this project, we have seen the rate of duplication between locally held print and mass digitized books increase steadily and significantly. In June 2009 an average of 20% of print titles in an academic library were duplicated in the Hathi repository; today that figure is above 30% (up to 40% for some institutions). [CLICK] In real terms, this means that the rate of digital replication is exceeding the pace of growth in print acquisitions in most academic institutions. We estimate that the rate of duplication has increased by about 8% per library in the past year. Print acquisitions typically grow at about 2% per year in research libraries. [CLICK] We project that in a year’s time, many academic libraries are liable to find themselves “underwater,” holding a massive inventory of over-valued assets. Library directors will be called to account and expected to respond to questions about how an increasingly redundant local print collection is serving the educational and research mission of the parent institution. We need to be preparing for a world in which just-in-time, print on demand delivery is an option for a large share of the retrospective book collection.
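To make the arithmetic behind these figures concrete, here is a rough sketch of a straight-line extrapolation. It is not the projection method used in the OCLC Research study; it assumes that the "about 8%" yearly increase means eight percentage points and that this pace stays constant. The function name and the 50% threshold are illustrative choices.

```python
def years_until_threshold(current_pct, annual_gain_pct, threshold_pct=50):
    """Years of constant linear growth needed for the duplication rate
    (in percentage points) to reach the threshold."""
    if current_pct >= threshold_pct:
        return 0.0
    return (threshold_pct - current_pct) / annual_gain_pct


# Figures quoted in the talk: roughly 30% duplicated on average today,
# up to 40% at some institutions, rising by about 8 points per year
# (reading "8%" as percentage points is an assumption).
for label, current in [("average library", 30), ("high-overlap library", 40)]:
    years = years_until_threshold(current, 8)
    print(f"{label}: about {years:.2f} years until half the collection is duplicated")
```

Under these assumptions the gap between the duplication rate and the halfway mark closes within a few years, which is the point of the comparison with print acquisitions growing at about 2% per year.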
  • This chart shows the rate of duplication between the HathiTrust library and the Penn State University Park libraries as of December 2010. You can see that the overlap is significant. You can also see that most of the duplicated content is still in copyright.
  • This is where the rubber meets the road. I mentioned that there has been increased attention to the long-term costs of acquiring and retaining low-use print materials. This is especially true for retrospective print collections that have been digitized. One recent study by the Dean of Libraries at the University of Michigan suggests that it costs about $4.25 per volume per year to store a book on campus, and less than a third as much to manage it off-site. This means that the Pennsylvania State University is currently spending between $750 thousand and $3.7 million each year to retain copies of books that are preserved in the HathiTrust repository, which Penn State is also paying for. The library is not accountable for these costs – they are not charged to the library budget – but is in some sense responsible for them.
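The cost range above can be unpacked with a little arithmetic. The sketch below assumes that the upper figure corresponds to keeping every duplicated volume on campus at $4.25 per year and that the lower figure corresponds to keeping the same volumes off-site. The talk does not state the volume count or the off-site rate, so both are back-calculated here and are only implied values.

```python
ON_CAMPUS_COST = 4.25      # dollars per volume per year, as quoted in the talk
UPPER_TOTAL = 3_700_000    # dollars per year, upper end of the quoted range
LOWER_TOTAL = 750_000      # dollars per year, lower end of the quoted range


def annual_retention_cost(volumes, cost_per_volume):
    """Yearly cost of retaining a number of volumes at a given unit cost."""
    return volumes * cost_per_volume


# Assumption: the upper bound is "all duplicated volumes stored on campus".
implied_volumes = UPPER_TOTAL / ON_CAMPUS_COST

# Assumption: the lower bound is "the same volumes stored off-site".
implied_offsite_cost = LOWER_TOTAL / implied_volumes

print(f"Implied duplicated volumes: {implied_volumes:,.0f}")
print(f"Implied off-site cost per volume: ${implied_offsite_cost:.2f}")
print(f"Off-site is under a third of on-campus: {implied_offsite_cost < ON_CAMPUS_COST / 3}")
```

The implied off-site rate comes out well under a third of the on-campus rate, which is consistent with the way the range is described.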
  • And in fact, the University libraries are already beginning to position the HathiTrust as part of the infrastructure supporting research, teaching and learning at Penn State. A recent article in the Library newsletter highlighted the value that the Hathi partnership is delivering to the library by increasing the visibility of the libraries’ unique resources and extending access to public domain content contributed by other HathiTrust partners. There’s a third way in which the University’s investment in this new shared infrastructure could be helpful, and that’s in achieving a rational reconfiguration of the library’s current print collection. I’ll have more to say about that in a moment. But first let’s look at a couple of examples of how the libraries are integrating Hathi into the local environment. (Newsletter, page 13.)
  • For example, links to electronic versions of local print holdings in the HathiTrust are integrated within the Cat. This is one way the libraries can make good on their commitment, in the university strategic plan, to increase use of digital repositories. This title provides a nice illustration of how the library’s investment in the HathiTrust is delivering distinctive value to the university community. This government document is available in a variety of formats: as a print book at University Park, as a licensed electronic resource from LexisNexis, and as free full text from the HathiTrust digital library. It is not generally recognized that libraries participating in the HathiTrust are doing a great deal more than sharing the cost of digital preservation – they are actively pushing the envelope on universal access to digitized library collections. As you can see, this government document – that is, *public information* – is not fully available online from Google. By contributing this content to the shared digital repository, the Penn State University Libraries have made a genuinely transformative change in the online research environment.
  • Another title PSU has contributed to Hathi. This one fits the profile of the University Libraries’ ‘collections of distinction’ … a title in meteorology that is held by fewer than 10 libraries worldwide. Paging through the digital version we can see that it was not in great demand, at least in the pre-automation period. Not surprisingly, perhaps, this volume is now managed in Penn State Libraries’ offsite annex. Again, this provides evidence that Penn State’s participation in Hathi is enabling its unique resources to reach new audiences.
  • I’ve suggested that Penn State benefits from its investment in the HathiTrust in at least three ways: its contributed content delivers more value in the larger Hathi digital aggregation; the public domain content in Hathi that Penn State doesn’t already own can now be sourced at reduced cost in ways that support new forms of scholarship; and the physical inventory that Penn State currently holds can now be managed more efficiently. If you graph the relative size and growth of each of these categories over time as I’ve done here, it is immediately apparent that the greatest value that accrues to Penn State is associated with the reconfiguration of the local collection, in view of the digital preservation and access guarantees provided by Hathi. Let’s see how that might happen.
  • In preparation for this visit, I took a closer look at the titles owned by Penn State Libraries that are now duplicated in the HathiTrust Digital Library. Knowing that you are taking a hard look at the University library collections, assessing current usage and prospective purchasing, and also planning for a major collection shifting exercise, I thought it would be useful to consider what kinds of resources are now ‘backed up’ by a digital preservation guarantee. This subject distribution is largely representative of the HathiTrust Library as a whole; that is, subject areas that are well represented in Hathi are also among the most widely duplicated in Penn State’s print collection. As you can see, the distribution favors the humanities: History and Literature account for about 20% of all the titles duplicated in the two collections. In the context of the ongoing reconfiguration of collections in the Pattee Library, this distribution is especially interesting, since it suggests that some part of the History collections that will need to be relocated from the 1st floor to accommodate the new Knowledge Commons will be searchable and in some cases fully available online from HathiTrust. This means that concerns over the possible impact on browsing the physical collections may be somewhat alleviated. More than 10,000 of the titles concerned are in the public domain and fully available online. History titles: 14,565 (13%) public domain (PD); 100,754 (87%) in copyright (IC).
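As a quick check on the History figures in the chart label, the following sketch recomputes the public domain and in-copyright shares from the two counts. Only the counts come from the slide; the variable and function names are illustrative.

```python
# Counts read from the chart label for duplicated History titles.
history_titles = {"public domain": 14_565, "in copyright": 100_754}


def shares(counts):
    """Return each category's share of the total as a whole-number percentage."""
    total = sum(counts.values())
    return {name: round(100 * n / total) for name, n in counts.items()}


print(f"Total duplicated History titles: {sum(history_titles.values()):,}")
for name, pct in shares(history_titles).items():
    print(f"{name}: {pct}%")
```

The rounded shares match the percentages shown on the slide, and the public domain count is consistent with the remark that more than 10,000 of these titles are fully available online.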
  • Here’s another way to look at it. Relatively few of the PSU owned titles duplicated in the HathiTrust Digital Library represent “at risk” or unique content. This is not likely to be material that the university libraries regard as a distinctive institutional asset. It can be sourced in print from multiple library suppliers within and beyond the CIC.
  • In practical terms, the opportunity costs of inaction can be calculated in terms of the space savings the libraries could achieve if the duplicated content were shifted offsite or even withdrawn. It amounts to 10 miles of shelving, or nearly 68,000 assignable square feet. This estimate may overstate the potential space savings and cost avoidance, since some number of the duplicated titles may already be in the Penn State University Libraries storage annex.
  • As we look to the future, it is clear that the academic library environment as a whole is changing. Here I have plotted projections for the duplication of academic print collections in the HathiTrust Digital Library for a range of academic libraries in the state of Pennsylvania. The blue and violet lines at the top of the stack represent smaller academic institutions. We predict that 50% of their library holdings will be duplicated within the coming year. At research intensive institutions, that watershed moment will occur somewhat later. At top tier institutions like Penn State, it may take another year or two before redundant print inventory begins to look less like an asset and more like a liability. But this change is coming, and we need to plan for it.
  • Achieving a sustainable reconfiguration of academic collections will be a delicate operation
