Cni Dec 2007 Copyright And Mass Dig For Cni

From NEERLG, 2 months ago

CNI Fall Task Force Presentation: Copyright and Large-scale Digiti more

Slideshow transcript

Slide 1: Copyright and Large-scale Digitization: Implications for Access
Merrilee Proffitt, Constance Malpas
RLG Programs
CNI Fall Task Force, Washington, DC
10 December 2007

Slide 2: This presentation . . .
• Summarizes findings from conversations with RLG Program Partners regarding copyright assessment practice and considers the implications of these practices in light of:
  • What we know about the system-wide book collection (‘supply’)
  • What we can observe about need and use of that collection (‘demand’)
  • Speculations about how increased discoverability of digitized text may impact use (and management) of library print collections
CNI Fall Task Force Meeting - 10 December 2007 RLG Programs 2 Copyright and Large-scale Digitization

Slide 3: Interviews with RLG Programs Partners
• 8 interviewees; some (not all) engaged in mass digitization
• All identify “high-risk materials” in order to eliminate them from the pool and focus on making as much low-risk content available as possible
  • Books, published in the US, before 1923
• Not a lot of effort devoted to this work at this time
• Some well-established numbers from University of Michigan on costs for “low-hanging fruit” and for identifying low-risk materials to 1963
• Left aside are riskier materials to 1963; materials published outside of US; materials after 1963

Slide 4: 1923-1963: How much? What’s the impact on research and teaching?
• Based on a January 2007 snapshot of WorldCat, we can estimate that ~15% of US imprints were published between 1923-1963; ~2M titles
• Independent studies at Stanford and Michigan suggest that ~30% of US imprints are in copyright; up to 70% may be in the public domain
• An optimistic scenario: ~2M * .70 = ~1.4M titles
• Add to this the pre-1923 books already in the public domain, est. ~15% of US imprints; optimistically, a total of ~3.4M titles, or the volume equivalent of a mid-level ARL collection
Suppose we go as far as we can with this? What’s the likely impact?
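The optimistic scenario on Slide 4 is simple arithmetic, and a minimal sketch of it follows. The function and variable names are hypothetical. The pre-1923 count of ~2M titles is an assumption inferred from the slide (it gives the pre-1923 share as ~15% of US imprints, the same share it equates to ~2M titles for 1923-1963, and its ~3.4M total implies that figure). The cross-check uses the WorldCat snapshot size (48M titles) and US imprint share (27%) quoted on Slide 12.

```python
def optimistic_public_domain_titles(titles_1923_1963, pct_public_domain, titles_pre_1923):
    """Return (cleared 1923-1963 titles, total accessible titles).

    Integer arithmetic keeps the rounded slide figures exact.
    """
    cleared = titles_1923_1963 * pct_public_domain // 100
    return cleared, cleared + titles_pre_1923


# Rounded figures from Slide 4; the pre-1923 count is an inferred assumption.
cleared, total = optimistic_public_domain_titles(
    titles_1923_1963=2_000_000,
    pct_public_domain=70,
    titles_pre_1923=2_000_000,
)

# Cross-check against Slide 12: 27% of 48M WorldCat print book titles are US imprints.
us_imprints = 48_000_000 * 27 // 100
share_pct = round(100 * total / us_imprints)

print(f"1923-1963 titles possibly in the public domain: {cleared:,}")
print(f"Optimistic total including pre-1923 titles: {total:,}")
print(f"Share of US imprints: ~{share_pct}%")
```

Under these assumptions the result lands near the ~26% of US imprints that Slide 8 describes as accessible "with some research".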

Slide 5: Supply: the system-wide book collection
Based on historical samples of monographic titles in the WorldCat database:
• 15-20% published (anywhere) before 1923; ~10-14M titles
• 15% published (anywhere) 1923-1963; ~10M titles
US imprints only (i.e., the titles for which North American libraries might reasonably expect to undertake copyright assessment efforts) based on a random sample of 1000 monographic titles:
• 15% published before 1923 → public domain
• 15% published 1923-1963 → moderate risk/effort
• 30% published 1964-1988 → high risk/effort
• 27% published after 1989 → greatest risk/effort
• 7% ambiguous pub’n data → unknown risk/effort

Slide 6: Distribution of Content by US Copyright Regime
based on a random sample of US imprints
Books published between 1923 – 1963 are only part of the picture
[Chart annotation: Increasing risk = increased reward?]

Slide 7: US imprints in 1000 record sample
[Bar chart: Titles in Sample by Decade of Publication, 1700s through 2000-2007. Annotated spans, in order: 200 years of production, 15% of sample; 4 decades, 15% of sample; 13 yrs, 17%; 10 yrs, 19%; 18 yrs, 27%]

Slide 8: US imprints in 1000 rec sample
[Two bar charts: Titles in Sample by Decade of Publication]
• Optimistically, ~26% of US imprints could be made accessible with some research
• ~74% of US books will require more work, other players

Slide 9: What’s missing from this picture?
[Bar chart: US imprints in 1000 rec sample; Titles in Sample by Decade of Publication]

Slide 10: What’s missing from this picture?
[Bar chart: US imprints in 1000 rec sample; Titles in Sample by Decade of Publication]
[Chart: Holdings for US imprints in 1000 record sample; Titles and Holdings by Decade of Publication]

Slide 11: What’s missing from this picture?
While holdings : titles increase over time, aggregate supply dips in the period when copyright restrictions are most onerous
• Median holdings per manifestation = 2
• Max. holdings for a single manifestation = 737
[Bar chart: US imprints in 1000 rec sample; Titles in Sample by Decade of Publication]
[Chart: Holdings for US imprints in 1000 record sample; Titles and Holdings by Decade of Publication]

Slide 12: What’s missing from this picture? Books published outside of the United States
[Pie chart: US imprints 27%; books published elsewhere 69%; unknown (?) 4%]
Based on January 2007 snapshot of published print books in WorldCat, n = 48M titles

Slide 13: Other Dimensions of Supply
• What about holdings/availability? In our sample of US imprints:
  • ~90% of titles with >50 holdings were published after 1963
  • All titles with >300 holdings were published after 1963
  • Work-level holdings may help fill the gap for titles with sparse holdings at manifestation level; mostly for teaching/learning
• What about non-US book titles? Based on a January 2007 snapshot of WorldCat:
  • US imprints account for ~30% of the global book collection; non-US publications account for ~70% of print book records in WorldCat
  • Holdings for non-US publications are relatively scarce (viz. OCLC/ARL Global Resources report, 2007)
  • Place of publication not always explicit – add’l research needed before copyright assessment can even begin
• What about non-book materials? Monographs are just one part of the scholarly record

Slide 17: Demand: What access is needed to support scholarship?
Citations to US imprints (monographs only), by period of publication; number of citations in parentheses:

Citing work                -1922      1923-1963  1964-1977  1978-1988  1989-
Lawrence and Aaronsohn     29% (28)   9% (9)     8% (8)     12% (12)   41% (40)
Shakespeare the Thinker    2% (1)     38% (16)   29% (12)   19% (8)    12% (5)
The First Word             (0)        2% (2)     6% (5)     6% (5)     85% (70)
(unlabeled row)            4%         21%        13%        9%         52%

• Lawrence and Aaronsohn: US imprints account for only 1/3 of works cited
• Shakespeare the Thinker: US imprints account for less than ¼ of works cited
• The First Word: Almost all monographs cited published in the US. 2/3 of sources were from journal literature (not counted)
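The percentages on Slide 17 can be recomputed from the citation counts that appear on the same slide; the sketch below does that as a sanity check. It assumes each count belongs to the period whose percentage it reproduces, since the extracted slide text does not preserve the column order for every row. The names `PERIODS`, `CITATION_COUNTS`, and `period_shares` are hypothetical.

```python
PERIODS = ["-1922", "1923-1963", "1964-1977", "1978-1988", "1989-"]

# Citation counts to US monographs per period, as read from Slide 17.
# Assumption: counts are matched to periods via the slide's own percentages.
CITATION_COUNTS = {
    "Lawrence and Aaronsohn": [28, 9, 8, 12, 40],
    "Shakespeare the Thinker": [1, 16, 12, 8, 5],
    "The First Word": [0, 2, 5, 5, 70],
}


def period_shares(counts):
    """Return each period's share of a work's citations as a whole percentage."""
    total = sum(counts)
    return [round(100 * count / total) for count in counts]


shares = {title: period_shares(counts) for title, counts in CITATION_COUNTS.items()}

for title, pcts in shares.items():
    row = ", ".join(f"{period}: {pct}%" for period, pct in zip(PERIODS, pcts))
    print(f"{title}: {row}")
```

Because of rounding, a row's percentages need not sum to exactly 100.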

Slide 18: Consequences of greater discoverability of monographs: Scenario A
Use of print decreases:
• Learners, teachers, and researchers turn to what’s available and useable in digital form rather than print materials; use of print collections declines
  → scope of scholarly record is defined opportunistically, based on what’s most conveniently available
• For some fortunate scholars, greater discoverability is accompanied by greater rights to use digitized text – but availability is determined by institutional affiliation
  → inequitable access to ‘liquid text’ produces an uneven body of scholarly analysis; incentives to create new analytic tools are limited

Slide 19: Consequences of greater discoverability of monographs: Scenario B
Use and value of print collections increase:
• Learners, teachers, and researchers find more materials online; because they can't get these in digital form, use of print increases.
  • Existing print copies and delivery apparatus can meet the demand. (But what about shifting models for print?)
  • Existing copies and delivery apparatus can't meet the need, and that creates an opportunity for someone to do something, despite rights restrictions, to make print or electronic forms of high-demand materials more available. Must be high-value enough to bring rights holders to the table.
  • Existing copies and delivery apparatus can't meet the need, but there isn't enough incentive for anyone to solve this problem.