Creating Content: Smithsonian Institution Libraries' Digital Library Program
Creating Content: Smithsonian Institution Libraries' Digital Library Program. Martin R. Kalfatovic. Open World Leadership Center Washington DC Orientation Seminar / Library of Congress. September 28, 2007. Washington, DC.
Slideshow Transcript
- Slide 1: Creating Content
Smithsonian
Institution
Libraries'
Digital Library
Program
Martin R. Kalfatovic
New Media Office and
Preservation Services
Smithsonian Institution Libraries
- Slide 2: Smithsonian
Institution
Libraries
• 20 branch libraries (New York to Panama)
• 1.5 million volumes; 50,000 rare books; 500,000 trade lit items
• ~120 staff
• Web presence since 1995
• 3 million web visitors per year
• 80% from outside the Smithsonian network
- Slide 3: Overview of
Library Digitizing
• Books are unique
objects for scanning
purposes
• Differ from 2-dimensional
works (e.g. photographs)
• Differ from 3-dimensional
works (e.g. artifacts)
- Slide 4: Overview of
Library Digitizing
• The codex has been around
for over 1,600 years
• The book format (title
page, text, index, etc.)
dates from the mid-16th
century
• Web delivery of book
objects has interesting
challenges
- Slide 5: Book Digitizing
Process
• Bound materials
– Quality issues
– Protects the material
for future use
• Disbound materials
– Generally better scans
– Destructive
- Slide 6: SIL Imaging
Center
• Established 1999
• 2 digital scanning-back cameras
– BetterLight
– Phase One
• Flatbed scanners
• All Mac-based
- Slide 7: Digitizing
Vendors
• SIL has used a variety of
commercial vendors for non-
and semi-rare materials
– Kirtas Technologies
(Robotic APT 2400
Scanner)
– Preservation Resources
– PTFS, Inc.
– Thomson
– TechBooks
- Slide 8: Digitizing Partner
Internet Archive
• Scanning services
• File storage
• File delivery
• Technical development
Internet Archive Headquarters
The Presidio, San Francisco
- Slide 9: Digitizing
Standards: Page
Images
300 dpi, 24-bit color
uncompressed TIFF, or
lossless compressed
images (e.g. LZW,
JPEG2000)
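To give a feel for what the 300 dpi, 24-bit uncompressed standard means for file sizes, here is a minimal arithmetic sketch. The 6 x 9 inch page size and the function name `uncompressed_bytes` are assumptions for illustration only; the slide does not state a page size, and real TIFF files add a small amount of header data.

```python
def uncompressed_bytes(width_in, height_in, dpi=300, bits_per_pixel=24):
    """Raw pixel-data size of one scanned page, ignoring file headers."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bits_per_pixel // 8


# Hypothetical 6 x 9 inch page (assumed; not a figure from the slides).
page_bytes = uncompressed_bytes(6, 9)
print(page_bytes)         # 14580000
print(page_bytes / 1e6)   # 14.58 (decimal megabytes)
```

Lossless compression such as LZW reduces the stored size, but by an amount that depends on the page content, so it is not modeled here.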
- Slide 10: Digitizing
Standards: Page
Images
• DLF Benchmark for Faithful Digital Reproductions of Monographs and Serials
• NISO Framework for Digital Collections
- Slide 11: Digitizing
Standards: Text
Conversion
Re-keying or OCR with
correction to 99.997%
accuracy
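The 99.997% target can be read as an error budget. The sketch below assumes accuracy is measured per character, which the slide does not specify; `max_errors` and the 2,000-character page are hypothetical names and figures used only for illustration.

```python
from fractions import Fraction


def max_errors(char_count, accuracy_percent="99.997"):
    """Largest whole number of wrong characters that still meets the target."""
    # Fraction avoids floating-point rounding in the tiny error rate.
    error_rate = 1 - Fraction(accuracy_percent) / 100
    return int(char_count * error_rate)  # truncates, i.e. floors for values >= 0


print(max_errors(100_000))  # 3
print(max_errors(2_000))    # 0 (hypothetical page of about 2,000 characters)
```

Under this reading, the standard allows roughly 3 wrong characters per 100,000, so most individual pages would need to be error-free.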
- Slide 12: Digitizing
Standards: Text
Conversion
Standard mark-up schema
(e.g. flavors of XML like
TEI or structured
databases)
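As an illustration of what TEI-style mark-up for a converted text might look like, the following sketch builds a skeletal TEI document with Python's standard library. The element names and namespace are standard TEI, but the helper names (`tei`, `build_tei`), the placeholder statements, and the sample text are assumptions; this is not SIL's actual schema, and the output is not validated against the TEI schema.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
ET.register_namespace("", TEI_NS)  # serialize TEI as the default namespace


def tei(tag):
    """Return the namespace-qualified name for a TEI element."""
    return f"{{{TEI_NS}}}{tag}"


def build_tei(title, pages):
    """Build a skeletal TEI tree: a header plus one <pb/> and <p> per page."""
    root = ET.Element(tei("TEI"))
    file_desc = ET.SubElement(ET.SubElement(root, tei("teiHeader")), tei("fileDesc"))
    title_stmt = ET.SubElement(file_desc, tei("titleStmt"))
    ET.SubElement(title_stmt, tei("title")).text = title
    pub = ET.SubElement(file_desc, tei("publicationStmt"))
    ET.SubElement(pub, tei("p")).text = "Placeholder publication statement."
    src = ET.SubElement(file_desc, tei("sourceDesc"))
    ET.SubElement(src, tei("p")).text = "Placeholder source description."
    body = ET.SubElement(ET.SubElement(root, tei("text")), tei("body"))
    for number, page_text in enumerate(pages, start=1):
        ET.SubElement(body, tei("pb"), {"n": str(number)})  # page break marker
        ET.SubElement(body, tei("p")).text = page_text
    return root


doc = build_tei("Example title", ["Text of page one.", "Text of page two."])
print(ET.tostring(doc, encoding="unicode"))
```

Keeping page breaks as explicit `pb` elements is one way to tie the converted text back to the corresponding page images.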
- Slide 13: SIL Digitizing
Statistics
• Approximately
400,000 scanned
pages
• 700+ titles
• 1,100 volumes
- Slide 14: Who Is Using the
SIL Digital Library?
• Sewing machine
enthusiasts
• Researchers in Brazil
• School kids around the
country
• Lepidopterists in Peru
“Aloha. I live on the Big Island
of Hawai’i …I’ve been looking
for this text for over TWENTY
YEARS. Mahalo nui loa for all
your hard work. Reading these
pages means so much to me
and many others …”
- Slide 15: Major Projects:
Digital Editions
• History of Science
• Natural History
• History and Culture
• Art and Design
- Slide 16: Major Projects:
Trade Literature
• Trade Literature
Collections
– Over 500,000 pieces, only
a small fraction digitized
– 30,000 images from two
collections
– Among SI Libraries’ most
popular sites with over
15,000 visitors per month
- Slide 17: Major Projects:
Image Galaxy
SIL Image Galaxy
– Over 9,000 of SI Libraries’
most interesting images
– Serves as a gateway for
product development and
licensing
– Assists students and
teachers in locating
images for use in the
classroom and other
projects
- Slide 18: Major Projects:
Scholarly
Publications
Smithsonian Contributions
and Studies Series
– Collaboration with
Smithsonian Institution
Scholarly Press
– Currently over 65,000
pages online with another
80,000 in FY 2007
- Slide 19: National and
International
Partnerships
• Aluka
– African history and
culture
• Open Content
Alliance
- Slide 20: Biodiversity
Heritage Library
• 10 member libraries
• Goal: digitize corpus
of heritage taxonomic
literature (300 million
pages)
• $3 million grant as
part of Encyclopedia
of Life in hand
www.biodiversitylibrary.org
- Slide 21: Internet Archive
Scribe Scanner
• Single Scribe Machine
– Human operated
– 200 volumes per shift
per week
– ~ 70,000 pages from
a single machine per
week
– Located in the
Natural History
building and working
on BHL project
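A back-of-the-envelope sketch relates the Scribe throughput on this slide to the 300 million page goal stated on the Biodiversity Heritage Library slide. It reads the slide's figures as per machine per week; the 50 working weeks per year and the 10-machine scenario are assumptions, not figures from the presentation.

```python
PAGES_PER_MACHINE_WEEK = 70_000   # approximate figure from this slide
VOLUMES_PER_MACHINE_WEEK = 200    # from this slide
GOAL_PAGES = 300_000_000          # corpus goal from the BHL slide

pages_per_volume = PAGES_PER_MACHINE_WEEK / VOLUMES_PER_MACHINE_WEEK
machine_weeks = GOAL_PAGES / PAGES_PER_MACHINE_WEEK


def calendar_years(machines, weeks_per_year=50):
    """Years to reach the goal with `machines` running in parallel.

    weeks_per_year=50 is an assumed working year, not a source figure.
    """
    return machine_weeks / (machines * weeks_per_year)


print(pages_per_volume)               # 350.0
print(round(machine_weeks))           # 4286
print(round(calendar_years(10), 1))   # 8.6 (hypothetical 10 machines)
```

The estimate ignores downtime, non-Scribe operations, and variation in volume size, so it only indicates the order of magnitude of the effort.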
- Slide 22: BHL Scribe
Facilities
• Boston Library Consortium
(Boston Public Library)
• New York Public Library
• University of Illinois, Urbana-Champaign
• Natural History Museum
(London)
• Smithsonian Inst. Libraries
• Missouri Botanical Garden
(non-Scribe operation)
- Slide 23: Digitizing
Philosophy
• Digital Curation
– Just as libraries keep
books, so do libraries have
a mission to preserve
“born digital” material
– Digital Preservation
through assisting in the
transmission of digital
content to future
generations
- Slide 24: Smithsonian
Digital
Repository
DSpace
– Developed jointly by MIT
and HP
– Open source software
used in hundreds of
academic libraries
- Slide 25: Smithsonian
Digital
Repository
– Preserves and makes
available digital output of
scientists, researchers,
curators, historians, etc.
– Coordinated with
Smithsonian Scholarly
Bibliography to track
Smithsonian staff
publications
- Slide 26: Needs for
Enhancing the SIL
Digital Library
Program
• Petabyte storage system for source files
• Effective system for archiving of digital material (byte preservation)
• Enhanced capacity for storing/delivering web-deliverable images
• Central programming support for enhanced XML data delivery
- Slide 27: Needs for
Enhancing the SIL
Digital Library
Program
• Implementation of Web 2.0 technologies
• Focus on reuse and re-purposing of legacy data and metadata
• Enhanced service to the various Smithsonian audiences (internal and worldwide)
- Slide 28: Smithsonian
Institution
Libraries
Image Credits
• All book and material images available through the Galaxy of Images (www.sil.si.edu/imagegalaxy)
• Other images by Martin R. Kalfatovic