Malpas cni 2010
  • http://www.flickr.com/photos/monica_andre/348890160/sizes/m/in/set-72157594338250238/ Book Cell by Monica, nic. Photo from installation of Matej Krén’s [Matey Kranes] “Book Cell” at the Centro de Arte Moderna - Foundation Calouste Gulbenkian in Lisbon, Portugal (2006). Slovak artist who has created several monumental pieces built on books, including a tower of books (“Idiom”, created in 1998) now on permanent display at the municipal library in Prague.
  • http://www.rin.ac.uk/our-work/communicating-and-disseminating-research/e-journals-their-use-value-and-impact 10 universities; log analysis of Science Direct and Oxford Journals for search/use activity in 2007.
  • Report issued in September 2009. A response to concerns about how redundant print is constraining library innovation. Acknowledges primacy of digital formats in scholarly journal literature, argues that preservation value of print is limited and diminishing. A model built around experience of JSTOR, one of the most successful examples of format transition.
  • Changing pattern in library investment reflects a shift in scholarly attention. We are in the midst of a progressive but dramatic phase change. This is not just about price pressure that e-journals are placing on print acquisitions. Many libraries are actively replacing print journal back-files with electronic surrogates as part of a long-term strategy of replacing print with digital. Reflects an active decision to prefer electronic over print – even when the print is already owned. Systematic externalization of discovery, delivery, and management functions associated with the journal literature. But it’s not just about journals…
  • Some evidence that e-books are gaining on print formats. Net domestic sales of books published in the United States. The Association of American Publishers recently reported that e-book sales as a proportion of the total market increased dramatically in the past year and now represent more than 3% of all sales. A small percentage, but clearly a growing audience. Lots of anticipation around impact of iPad on scholarly consumption of e-books – creating a commercial market. E-book margin is growing much faster than book sales as a whole.
  • Format migration has not slowed library spending on information resources; acquisitions now account for almost 40% of academic library spending, most of it spent on licensed electronic content. Over the same thirty-year period, proportional spending on infrastructure and preservation has been on the decline. Not necessarily a reflection of diminished institutional commitments to preservation; instead a systematic and strategic externalization of operations to commercial providers, especially e-journal aggregators and publishers.
  • Part of what has made the ‘externalization’ of collection management possible is the growth of shared infrastructure. In the past four to five years, substantial progress has been made in building out a robust infrastructure for the preservation of licensed journal content. One can point to the examples of LOCKSS and CLOCKSS or, as in this diagram, to the growth of Portico. The Portico digital archive now serves more than 600 libraries and covers some 10K journal titles. Almost all of these are not only committed to the archive but already ingested. It has recently expanded the scope of its preservation remit to include e-books. More than 30K titles have been committed by commercial publishers; a small fraction (about 6%) have been ingested. I emphasize the disparity between the rate of ingest for books and journals not to call into question the value of Portico as a book archive, but to demonstrate how different the preservation circumstances are for books and journals.
  • Books are different – and the same. http://www.mtholyoke.edu/artmuseum/images/artmuseum_full/2008_10_1XL.jpg 2007 exhibit at Mount Holyoke College art museum based on 2006 book (209 library holdings; digitized by Google from a copy in the University of Michigan library). John Cox, a consultant for the Association of Learned and Professional Society Publishers (ALPSP), estimates as much as 96% of the scientific literature and up to 85% of journals in the humanities have moved from print to electronic formats. The situation for the scholarly monographic literature is quite different, despite well-publicized mass digitization efforts in the last few years. The 50M figure counts titles irrespective of format, but we know that more than 90% of titles in libraries are monographs. Lavoie (2009) estimates 85 million print book titles in WorldCat.
  • Mellon-funded project began in July 2009. Concept of externalization comes from economic theory of the firm.
  • Over the nine months of this study, the number of titles in the Hathi repository has grown by 68% (about 8% per month). If this rate of growth is sustained, we would expect to find nearly 4 million titles (and 6.5 million volumes) in the archive by June 2010. On average, about 150,000 new titles are added to the repository each month. We focus on titles as a unit of measure because monographic works dominate in the Hathi corpus; an increase in the number of titles indicates growth in the scope of coverage while growth in volume counts may simply reflect additional digitized copies of the same edition. (If Hathi were primarily archiving journals, a focus on holdings at the volume level would be more important.) Since all copies of a given manifestation are subject to the same copyright restrictions, an increase in the number of copies does not necessarily result in an increase in the online availability of the content.
  • EEBO, ECCO – some full-text primary sources available to support research/teaching in the humanities, but mostly limited to interpretive/critical apparatus in the journal literature, not scholarly monographs. Now (or soon) we can expect a critical mass of digitized books to come online, potentially displacing some core library function.
  • Speaks to relevance and scholarly value. The vast majority of titles in the Hathi archive were produced in the last 50-60 years; consequently, there is relatively little ‘outright’ public domain content, but a potentially significant corpus of orphan works. Note spike in titles from 1900; many of these (>50%) are journals with an unknown start date in the 20th century which were cataloged with a 19uu in the 008 date1 field. Most (>70%) are in copyright.
  • Limited online availability – should this stop us from acting? No reason to believe that PD content will be sufficient to support scholarship or revitalize library mission.
  • Zones of negotiation
  • The fact that titles in the public domain tend to be held by fewer libraries has important implications for a shared service business model: fewer titles, and relatively small audience of institutions seeking to replace print inventory with digital access. The preservation need may be great, but the institutional demand for (and willingness to pay for) a solution is likely to be quite limited. The public domain ‘long tail’ is both longer and narrower.
  • Despite large variations in collection size, ARL libraries can derive equal benefit from externalization of preservation functions. Columbia University’s main library collection is almost 4 times the size of Rice’s, yet the proportional space gain it might realize is not substantially different. In fact, larger research institutions have a slightly lower rate of duplication with Hathi for the simple reason that they tend to hold a greater number of scarce and unique titles.
  • “Principles of Sustainability (and where they came from)” Books, embroidery floss, screenprint. Kristian Bjornard, graphic artist, 2009 MFA thesis project from Maryland Institute College of Art
  • A set of strategic choices
  • While actively building shared storage collections that will enable wide-scale redistribution of resource, based on digitization status and aggregate demand; allow demand to drive collective preservation priorities
  • http://www.flickr.com/photos/landschaft/3658612324/ Detroit public school system book depository

    1. What to Retain: a framework for managing change in the library organization
       Constance Malpas
       Program Officer, OCLC Research
       malpasc@oclc.org
       CNI Spring Taskforce
       12 April 2010
    2. Reorganizing the Legacy Print Collection
       An illusion of imprisonment
       artfully positioned looking glass
       (It only looks infinite)
       Matej Krén “Book Cell” Installation at Centro de Arte Moderna, Lisbon (2006)
    3. Format Migration: through the Looking Glass
       Shift to digital has transformed scholarly landscape, yet academic library operations still dominated by print paradigm
       Format migration has introduced new levels of complexity into collection management as the scholarly function of print is revised
       Decisions about what to withdraw, what to retain are fraught with uncertainty about future of the library mission
       For books, especially, a fear of loss to academic reputation
    4. E-Formats: Increase in Research Productivity?
       … a correlation between e-format consumption and institutional research reputation
       Session length & gateway access
       Journal spend, use & research outcomes
       Source: (UK) Research Information Network, E-journals: their Use, Value and Impact (2009)
    5. Journals: ‘What to Withdraw’ (Ithaka, 2009)
       Framework for assessing preservation risks, proposes criteria for identifying print journals suitable for withdrawal
       • optimal number of copies (2 – 4 in dark archives)
       • reliability of digital access (quality, business continuity)
       • Image-intensive titles an excluded class (retain in print)
       Print as ‘back-stop’ to digital preservation
       Retention horizon of 20-100 years, depending on digital preservation status
       Decision support tool for JSTOR titles
    6. Investment in Academic Print Collections
       You are here
       Source: US Dept of Education, NCES, Academic Libraries Survey, 1998-2008
    7. E-book Margin is Increasing
       $169.5M in 2009
       $9.3M in 2004
       Source: Association of American Publishers
    8. Shift in Pattern of Library Investment
       Declining library investment in preservation
       Source: US Dept of Education, NCES, Academic Libraries Survey, 1977-2008
    9. Shared Infrastructure: Journals v. Books
       Margin of confidence?
       Source: Portico, Growth of Archive
    10. Dematerialization of the Scholarly Record
        Scholarly journals: ~26,000 titles in 2010
        i.e. refereed academic journals in Ulrich’s knowledge-base
        Est. 80-90% titles online (Cox, 2008)
        ARL aggregate collection: ~50M titles in 2010
        i.e. titles held by one or more ARL member library
        Est. 6-7 million (12-14%) titles digitized (extrapolated from analysis of Hathi archive and based on current estimates of 12 million volumes scanned by Google, February 2010)
        Rosamond Purcell “Foucault’s Pendulum” from Bookworm (2006)
    11. Moving Collections to the Cloud
        Premise: emergence of large scale shared print and digital repositories creates opportunity for strategic externalization* of core library operations
        • Reduce costs of preserving scholarly record
        • Enable reallocation of institutional resources
        • Model new business relationships among libraries
        * increased reliance on external infrastructure and service platforms in response to economic imperative (lower transaction costs)
    12. Methodology
        • Monthly harvest of metadata from HathiTrust repository
        • Mapped to WorldCat bibliographic and holdings data
        • Selectively enhanced to increase coverage of physical storage collections
        • Iterative analysis at title level including
          • Subject distribution (scholarly audience)
          • Copyright status (availability)
          • Distribution in print format (opportunity for rationalization)
    13. Key Findings
        • Scope of mass-digitized corpus in Hathi is already sufficient to replace at least 20-30% of most academic print collections
        • Ratio of replaceable inventory independent of collection size
        • Most content also held in trusted print repositories with preservation and access services (CRL, UC Regional Library Facilities, ReCAP, Library of Congress)
        • Distribution of resource still suboptimal for shared service model
        • If limited to titles in the public domain, shared service offering may not be sufficient to mobilize significant resources
        • Fewer titles, smaller audience: demand is low
    14. Hathi Growth Trajectory – 12 months
        Equal in scale to LoC?
        Equal in scope to very large ARLs (Columbia, Washington, etc)
        Equal in size to median ARL collection (2008)
        2016
        Data current as of February 2010
        NB: average holdings per book (title) in WorldCat = 11
    15. Hathi Trust: Subject Distribution
        N=3.2 million titles
        Humanities content (literature, history) dominates – presages shift in scholarly practice?
        Data current as of February 2010
    16. Distribution by Date of Publication
        N=3.2 million titles
        >75% of titles in repository published after 1949
        ~50% of titles published since 1976
        ~10% of titles published since 2000
        A recent corpus, hence likely to be more broadly relevant to scholars
        Data current as of February 2010
    17. Copyright Status: What Counts?
        Volumes in Hathi Library
        Titles in Hathi Library
        Optimistically, additional copyright determination on orphan works might increase yield by ~600K titles
        N=3.2M titles
        N=5.3M volumes
        Based on Hathi profile February 2010
    18. Distribution by WorldCat Library Holdings
        N=3.2 million titles
        Collective priority
        Local mandate
        Commercial viability
    19. Distribution by Holdings and Copyright Status
        Public domain titles less widely held and more likely to be very sparsely held
        NB: average holdings per book (title) in WorldCat = 11
        Data current as of February 2010
    20. How Much is Enough?
        • If limited to titles currently in the public domain, average academic research library might regain space equivalent to ~2% of local collection (based on WorldCat holdings)
        • Since public domain collections (excepting government documents) typically not growing, replacement value a ‘one time’ proposition
        • Roughly equivalent to median annual growth rate in ARL libraries (~2% based on volume count); at best, enables steady-state for a single year
        Public domain corpus inadequate to mobilize large-scale shift in library resources
    21. If Scope is Expanded to In Copyright Titles…
        Rice University (RCE)
        1.6M titles in collection
        35% duplicated in Hathi (Feb 2010)
        Columbia University (ZYU)
        4.7M titles in collection
        25% duplicated in Hathi (Feb 2010)
        A conservative estimate based on current coverage, likely to expand dramatically in next 1-5 years
        Data current as of February 2010
        *Spheres are scaled to size of institutional collection based on WorldCat holdings
    22. Why (Re) Organize Now?
        • Uncertainties about outcome of GBS settlement should not hold us back
        Many (majority?) of print books currently represented in Hathi are low-use titles for which aggregate demand can be met with reduced inventory, even without a licensed provision
        There is sufficient redundancy to enable space savings for a significant number of academic libraries; adequate scale
        By progressively increasing reliance on shared print collections, libraries create economy in which further externalization becomes possible and shared asset gains in value
        Increased confidence in long-term preservation will enable broader base of institutions to participate in licensed offering, increasing library negotiating power
    23. Recycling Some Ideas about Sustainability
        Kristian Bjornard, Principles of Sustainability (and where they come from), MFA thesis installation, Maryland Institute College of Art, 2009
    24. Common Pool Resources (CPR)
        Overexploitation of common-pool resources (‘tragedy of the commons’) is not inevitable
        Multi-institutional ownership of non-commercial assets is viable and may increase sustainability
        Cooperative governance can be modeled scientifically
        E. Ostrom, Governing the Commons (1990)
        E. Ostrom, Governing the Commons, Kindle Edition (2010)
    25. Can CPR be Applied to Libraries?
        [Yes]
        [Yes]
        E. Ostrom & C. Hess, Artifacts, Facilities, And Content: Information as a Common-pool Resource (2001)
    26. A Framework for Action
        Empower ‘rational appropriators’ (regional and national consortia) to undertake systematic redistribution and rationalization of low-use monographic collections
        efforts underway in WEST, CRL, CIC etc.
        Systematically assess carrying capacity of aggregate resource, i.e. system-wide supply/demand dynamics
        Leverage OhioLINK and other findings
        Monitor change in demand over time; enjoin participants to act as monitors
        CRL audit role might be extended
        Adopt contingent strategies for print preservation
        Embrace de-sacralization of codex
        E. Ostrom, Governing the Commons, Kindle Edition (2010)
    27. Where to Start?
        Actively seek to replace low-use print inventory with reliance on digitized and shared print collections; shift economic model toward cooperative management
        • Low-risk public domain titles; institutional risk tolerance will dictate whether regional print copy is needed
        ~370K titles in Feb ‘10; approx. 250K (67%) held by >9 libraries
        • In-copyright digitized monographs already in large-scale stores, for which there is adequate duplication to create a market for service
        e.g. 0.5M titles held in UC SRLF and by >99 libraries
    28. What to Retain (locally)
        Distinctive institutional assets that demonstrably contribute to university’s research mission
        Print monographs already digitized and in copyright, for which aggregate supply is relatively low (<10 to 25 libraries)
        ongoing demand will indicate whether long-term local stewardship is a logical choice and where relegation is advantageous
        Neither scarcity of supply (‘uniqueness’) nor present ownership are reliable indicators of scholarly value
    29. Academic print: it’s not the end . . .
        but it’s no longer the means
        Ongoing redefinition of scholarly function and value of print will entail some loss and some gain in library relevance
        “Archive of the available past” by Joguldi
        Abandoned books at the Detroit Central School Book Depository (6 May 2009)
