Cni Dec 2007 Copyright And Mass Dig For Cni

CNI Fall Task Force Presentation: Copyright and Large-scale Digitization: Implications for Access, by Merrilee Proffitt and Constance Malpas with RLG Programs

Published in: Education

  1. 1. Copyright and Large-scale Digitization: Implications for Access. Merrilee Proffitt, Constance Malpas, RLG Programs. CNI Fall Task Force, Washington, DC, 10 December 2007
  2. 2. This presentation . . .
     • Summarizes findings from conversations with RLG Programs Partners regarding copyright assessment practice
     • and considers the implications of these practices in light of:
        • What we know about the system-wide book collection (‘supply’)
        • What we can observe about need and use of that collection (‘demand’)
        • Speculations about how increased discoverability of digitized text may impact use (and management) of library print collections
  3. 3. Interviews with RLG Programs Partners
     • 8 interviewees; some (not all) engaged in mass digitization
     • All identify “high-risk materials” in order to eliminate them from the pool and focus on making as much low-risk content available as possible
     • Books, published in the US, before 1923
     • Not a lot of effort devoted to this work at this time
     • Some well-established numbers from University of Michigan on costs for “low-hanging fruit” and for identifying low-risk materials to 1963
     • Left aside are riskier materials to 1963; materials published outside of the US; materials after 1963
  4. 4. 1923-1963: How much? What’s the impact on research and teaching?
     • Based on a January 2007 snapshot of WorldCat, we can estimate that ~15% of US imprints were published between 1923 and 1963; ~2M titles
     • Independent studies at Stanford and Michigan suggest that ~30% of US imprints from this period are in copyright; up to 70% may be in the public domain
     • An optimistic scenario: ~2M × 0.70 = ~1.4M titles
     • Add to this the pre-1923 books already in the public domain, est. ~15% of US imprints; optimistically, a total of ~3.4M titles, or the volume equivalent of a mid-level ARL collection
     • Suppose we go as far as we can with this?
     • What’s the likely impact?
  5. 5. Supply: the system-wide book collection
     • Based on historical samples of monographic titles in the WorldCat database:
        • 15-20% published (anywhere) before 1923; ~10-14M titles
        • 15% published (anywhere) 1923-1963; ~10M titles
     • US imprints only (i.e., the titles for which North American libraries might reasonably expect to undertake copyright assessment efforts), based on a random sample of 1000 monographic titles:
        • 15% published before 1923: public domain
        • 15% published 1923-1963: moderate risk/effort
        • 30% published 1964-1988: high risk/effort
        • 27% published after 1989: greatest risk/effort
        • 7% ambiguous pub’n data: unknown risk/effort
  6. 6. Distribution of Content by US Copyright Regime, based on a random sample of US imprints. Books published between 1923 and 1963 are only part of the picture. Increasing risk = increased reward?
  7. 7. US imprints in 1000-record sample, by period of publication: 200 years of production, 15% of sample; 4 decades, 15% of sample; 13 yrs, 17%; 10 yrs, 19%; 18 yrs, 27%
  8. 8. ~74% of US books will require more work, other players. Optimistically, ~26% of US imprints could be made accessible with some research.
  9. 9. What’s missing from this picture? (Chart by period of publication)
  10. 10. What’s missing from this picture? (Chart: holdings for US imprints in 1000-record sample, by period of publication)
  11. 11. What’s missing from this picture? While the ratio of holdings to titles increases over time, aggregate supply dips in the period when copyright restrictions are most onerous. Median holdings per manifestation = 2. Max. holdings for a single manifestation = 737. (Chart: holdings for US imprints in 1000-record sample, by period of publication)
  12. 12. What’s missing from this picture? Books published outside of the United States. Based on a January 2007 snapshot of published print books in WorldCat, n = 48M titles. (Chart labels: Books published elsewhere; US imprints; ?)
  13. 13. Other Dimensions of Supply
     • What about holdings/availability?
        • In our sample of US imprints:
        • ~90% of titles with >50 holdings were published after 1963
        • All titles with >300 holdings were published after 1963
        • Work-level holdings may help fill the gap for titles with sparse holdings at manifestation level; mostly for teaching/learning
     • What about non-US book titles?
        • Based on a January 2007 snapshot of WorldCat:
        • US imprints account for ~30% of the global book collection; non-US publications account for ~70% of print book records in WorldCat
        • Holdings for non-US publications are relatively scarce (viz. OCLC/ARL Global Resources report, 2007)
        • Place of publication not always explicit; add’l research needed before copyright assessment can even begin
     • What about non-book materials?
        • Monographs are just one part of the scholarly record
  14. 17. Demand: What access is needed to support scholarship? Citations to US imprints (monographs only), by period of publication:
     • Lawrence and Aaronsohn (US imprints account for only 1/3 of works cited): -1922: 8 (8%); 1923-1963: 28 (29%); 1964-1977: 12 (12%); 1978-1988: 9 (9%); 1989-: 40 (41%)
     • Shakespeare the Thinker (US imprints account for less than ¼ of works cited): -1922: 1 (2%); 1923-1963: 16 (38%); 1964-1977: 12 (29%); 1978-1988: 8 (19%); 1989-: 5 (12%)
     • The First Word (almost all monographs cited published in the US; 2/3 of sources were from journal literature, not counted): -1922: 0; 1923-1963: 2 (2%); 1964-1977: 5 (6%); 1978-1988: 5 (6%); 1989-: 70 (85%)
     • Further chart percentages, unlabeled in the transcript: 21%, 4%, 13%, 9%, 52%
  15. 18. Consequences of greater discoverability of monographs: Scenario A
     • Use of print decreases:
     • Learners, teachers, and researchers turn to what’s available and useable in digital form rather than print materials; use of print collections declines
        • scope of scholarly record is defined opportunistically, based on what’s most conveniently available
     • For some fortunate scholars, greater discoverability is accompanied by greater rights to use digitized text, but availability is determined by institutional affiliation
        • inequitable access to ‘liquid text’ produces an uneven body of scholarly analysis; incentives to create new analytic tools are limited
  16. 19. Consequences of greater discoverability of monographs: Scenario B
     • Use and value of print collections increase:
     • Learners, teachers, and researchers find more materials online; because they can't get these in digital form, use of print increases.
        • Existing print copies and delivery apparatus can meet the demand. (But what about shifting models for print?)
        • Existing copies and delivery apparatus can't meet the need, and that creates an opportunity for someone to do something, despite rights restrictions, to make print or electronic forms of high-demand materials more available. Must be high-value enough to bring rights holders to the table.
        • Existing copies and delivery apparatus can't meet the need, but there isn't enough incentive for anyone to solve this problem.
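
The arithmetic behind the optimistic estimate on slide 4 and the ~26% figure on slide 8 can be sketched as follows. This is a minimal illustration, not the presenters' method: the function and variable names are hypothetical, the pre-1923 count of ~2M titles is inferred from the slide's total (~3.4M minus ~1.4M), and applying the 70% public-domain rate to the 1923-1963 share of the sample to arrive at ~26% is an assumption about how that figure was derived.

```python
# Rounded figures from slides 4, 5 and 8; all names here are illustrative.
US_TITLES_1923_1963 = 2_000_000  # ~15% of US imprints (slide 4)
US_TITLES_PRE_1923 = 2_000_000   # assumption: inferred from ~3.4M - ~1.4M
PUBLIC_DOMAIN_RATE = 0.70        # "up to 70% may be in the public domain"


def optimistic_public_domain_titles(pre_1923, mid_period, pd_rate):
    """Pre-1923 titles plus the share of 1923-1963 titles assumed public domain."""
    return pre_1923 + mid_period * pd_rate


def optimistic_accessible_share(pre_1923_pct, mid_period_pct, pd_rate):
    """Percentage of US imprints reachable under the same optimistic rate."""
    return pre_1923_pct + mid_period_pct * pd_rate


total_titles = optimistic_public_domain_titles(
    US_TITLES_PRE_1923, US_TITLES_1923_1963, PUBLIC_DOMAIN_RATE
)
# Sample shares from slide 5: 15% before 1923, 15% for 1923-1963.
accessible_pct = optimistic_accessible_share(15, 15, PUBLIC_DOMAIN_RATE)

print(f"Optimistic public-domain titles: ~{total_titles / 1e6:.1f}M")
print(f"Optimistic accessible share of US imprints: ~{accessible_pct:.1f}%")
print(f"Share needing more work: ~{100 - accessible_pct:.1f}%")
```

Under these assumptions the sketch gives roughly 3.4M titles and an accessible share of about 25.5%, which is in line with the rounded ~26% and ~74% figures on slide 8.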
