1. Massively Digitizing
UC Library Collections
Google, Microsoft, and More
Learning in Retirement
Libraries – The Intersection of Tradition and
Innovation
April 10, 2008
Ivy Anderson & Heather Christenson
2. California Digital Library
Two Complementary Roles
Facilitate library collaboration across the ten
campuses of the UC system (e.g. shared collection
development)
Distinctive services emphasizing digital
stewardship, innovation in scholarly publishing,
and open-access digital collections
Three Audiences
UC libraries
Broader UC community
External constituencies and the general public
Five Programs
Collection Development and Management
(Licensed Content, Shared Print Collections, Mass
Digitization)
Bibliographic Services (Melvyl Catalog, SFX)
Preservation (Digital Preservation Repository, Web
Archiving)
Digital Special Collections (Calisphere, Online
Archive of California)
Publishing Services (eScholarship Repository,
eScholarship Editions, collaboration with UC Press)
“11th University Library”
founded 1997
Part of UC Office of the
President
3. Digitization of Library
Collections
Special Collections
Manuscripts,
archival
collections,
photographs, etc.
CDL / UC Libraries
Online Archive of
California
Calisphere
Berkeley, University of California, Bancroft Library, UCB 150, f. 252v
5. Digitization of Library
Collections
Commercial
Partnerships
EEBO: 100,000
important early
English texts
Licensed access
via ProQuest
Satans stratagems, 1648. Copy from UCLA Library
6. …and Along Came Google
Google Library Project
2005: The ‘Google Five’:
Harvard, Oxford, New
York Public Library,
Stanford, University of
Michigan
2008: 20 library partners
in 5 countries
Google Publisher Partner
Program
7. …and the Open Content
Alliance
October 2005
Founders: Internet
Archive, University of
California, U of
Toronto…
Large-scale
digitization of out-of-
copyright works only
A project of the
Internet Archive
10. So: Three Projects, One Goal
Goal: Mass digitization of library book collections
Google
In-copyright and out-of-copyright works
Available via Google search engine and Google Book
Search
Microsoft
Out-of-copyright works only
Available via Microsoft Live Search
Open Content Alliance
Out-of-copyright works only
Available (via the Internet Archive website) to any and all
search engines
Library and grant-funded
11. Why Are They Doing It?
Google’s vision: To put all the world’s
information online
Google and Microsoft: To gain market share
and competitive advantage for their search
(and online advertising) services
It’s all about Search
OCA: To put the world’s information online,
for free, forever
It’s all about the public good
12. Why Are We Doing It?
To enhance student and faculty research
To put our collections where our users are – in Google!
Mass digitization of these materials enhances access. It can make
people aware of books they may not have discovered otherwise and
lead them, through an internet search, back to our libraries
To support deeper textual analysis and research. Scholars can trace
the evolution of ideas and perform other sophisticated textual analysis
when the full text is indexed and searchable by computer, opening up
scholarship in new ways.
To fulfill our public service mission
Many books of enduring general interest – including classic works of
literature and more distinctive items such as early histories of the
settlement of California and the West – can now be read by anyone,
anywhere, anytime
To preserve and protect our collections
In earthquake- and fire-prone California, digitizing books in our
collections may also help protect the university from catastrophic loss
should disaster someday strike our libraries
27. Costs to the UC Libraries
Staffing (2-5 FTE at each of 5 locations)
Physical space & facilities
Scanning centers (where scanning
machines are housed), book processing,
queue storage (book trucks)
Costs to run campus systems
CDL servers for inventory database, digital
preservation
29. What sort of books are being
digitized?
American history
Humanities
Science
Cookbooks
Children’s books
East Asian & Pacific Rim collections
30. Where can you access the
books?
Google Book Search:
http://books.google.com/
Microsoft Live Search Books:
http://search.live.com/results.aspx?q=&scope=books
Internet Archive:
http://www.archive.org/details/university_of_california_libraries
Test version of UC Union catalog:
http://melvyl-test.cdlib.org:8164/F
31. Copyright status is a factor
Out of copyright, pre-1923
“orphan works,” 1923-1964
1965 - present
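The three date bands above can be written as a simple lookup. This is a minimal sketch that only mirrors the slide's year ranges; the function name `copyright_band` and the returned labels are illustrative assumptions, and real copyright determination depends on more than the publication year.

```python
def copyright_band(pub_year: int) -> str:
    """Map a publication year to one of the three bands named on the slide.

    Hypothetical helper: it encodes only the slide's year ranges,
    not a legal determination of copyright status.
    """
    if pub_year < 1923:
        return "out of copyright (pre-1923)"
    if pub_year <= 1964:
        return "orphan works band (1923-1964)"
    # The slide gives no status label for this band, only the date range.
    return "1965-present"


for year in (1648, 1923, 1964, 1965):
    print(year, "->", copyright_band(year))
```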
39. What’s ahead
Digital preservation – storage,
storage, storage
Copyright determination
Print on demand
41. New modes of access & critical
mass of digital books will transform
scholarship
Full-text search – a new form of book discovery
Beyond search – text mining,
computationally assisted research
Machines can interact with massive amounts
of text and provide new structures
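As a rough illustration of what full-text search over digitized books involves, the sketch below builds a tiny inverted index (a map from each word to the books that contain it). The book identifiers and sample texts are made up for the example, and the functions `build_index` and `search` are hypothetical; this is not how Google, Microsoft, or the Internet Archive implement their search.

```python
import re
from collections import defaultdict


def build_index(books: dict[str, str]) -> dict[str, set[str]]:
    """Build an inverted index: each lowercase word maps to the book ids containing it."""
    index: dict[str, set[str]] = defaultdict(set)
    for book_id, text in books.items():
        for word in re.findall(r"[a-z]+", text.lower()):
            index[word].add(book_id)
    return index


def search(index: dict[str, set[str]], query: str) -> set[str]:
    """Return the ids of books that contain every word in the query."""
    words = re.findall(r"[a-z]+", query.lower())
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Toy data, invented for illustration only.
books = {
    "book-a": "Early histories of the settlement of California",
    "book-b": "Classic works of literature",
    "book-c": "A history of California cookbooks",
}
index = build_index(books)
print(sorted(search(index, "california")))
print(sorted(search(index, "california history")))
```

The same index is a starting point for simple text mining, for example counting how many books mention a given word.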
42. Questions?
Heather Christenson, CDL Mass
Digitization Project Manager
heather.christenson@ucop.edu
Ivy Anderson, CDL Director of Collections
ivy.anderson@ucop.edu
For more information:
http://www.cdlib.org/inside/projects/massdig/