Digitizing Spectator - Libraries Digital Program

Transcript

  • 1. Columbia Spectator Archive: Progress Report on Phase 1. Stephen Paul Davis, Columbia University Libraries Digital Program, June 27, 2012
  • 2. The Plan
      • Partnership between Columbia Libraries / Information Services and the Spectator
      • High-quality scanning of original Spectator issues from Columbia University Archives and the Spectator Editorial Offices
      • State-of-the-art text processing (OCR) of scanned images to allow searching at the article level
      • Feature-rich online presentation
      • Permanent, long-term digital preservation
  • 3. The Players
      • The Spectator staff and board
      • University Archives
      • Libraries' Preservation & Digital Conversion Division
      • Libraries' Digital Program Division
      • Libraries' Information Technology Division
      • Digital Divide Data
      • Brechin Imaging Services
      • Digital Library Consulting (Veridian provider)
      • Cornell University Libraries [behind the scenes]
  • 4. The Context
      Columbia Libraries Digital Program's mission:
      • To carry out digitization and access projects chiefly from Columbia's rare and special collections (2002-)
      • To build and support Columbia's long-term digital preservation infrastructure (2010-)
      • To develop and support preservation of and access to born-digital archival collections (2011-)
  • 5. Columbia Libraries Digitization Program
      • Digitization Projects (Digital Scriptorium, APIS (papyrus project), John Jay Papers, Herbert Lehman Papers, etc.)
      • Digital Exhibitions (See especially: Core Curriculum: CC, Core Curriculum: LitHum, 1968: Columbia in Crisis, Varsity Show)
      • "Born-Digital" & Web Archives (Columbia University, Human Rights Organizations, etc.)
  • 6. Columbia's Technology Platforms
      Columbia University Libraries / Information Services has a:
      • robust repository infrastructure that follows
      • national and international standards and
      • "best practices" to support
      • digital publishing and
      • long-term digital preservation
  • 7. Columbia's Repository & Preservation Infrastructure: Schematic Overview
  • 8. Newspaper Access …
      • Providing flexible access to newspaper content is complicated and expensive
      • Not cost-effective for single institutions to build custom, newspaper-oriented software
      • Only two major vendors provide software optimized for newspapers
      • DL Consulting's Veridian is by far the better and more frequently chosen option for research libraries
  • 9. Spectator Stats
      Spectator run from 1877-2009:
      • Number of volumes = 155
      • Estimated no. of pages = 79,145
      • Average pages per volume = 500 (wide variation!)
      • Est. vols. requiring disbinding = 100
      • Est. vols. unable to be digitized = 10
      NB: Most volumes contain severely brittle paper; only 24 volumes have flexible paper
  • 10. Why Scan From Originals?
  • 11. Scanning from originals retains visual content 6 May 1968
  • 12. Tiny sampler of Spec Archive images
  • 13. 11 October 1956 19 February 1957
  • 14. 29 September 1959
  • 15. 3 December 1973 27 October 1961
  • 16. 2 October 1972 7 March 1974
  • 17. Challenges of Scanning from Originals
  • 18. Disbinding fragile pages
  • 19. Repairing and Conserving
  • 20. Preservation Boxing (for shipping & long-term storage)
  • 21. Phase 1 Completion
      • Prep, rehouse, digitize & encode Spec volumes for 1955-1992: completed June 15th
      • Load into Veridian Test System: June 29th
      • Design Spectator Archive website: July 15th
      • Move test system to production environment: July 30th
      • Do user testing and quality review: August 15th
      • Launch new public site: September 4th
  • 22. Demo of Test System
      • 1964: http://tinyurl.com/78hhypj
      • 1968: http://tinyurl.com/7jk6ynz
      • 1973: http://tinyurl.com/7gu55p6
      • 1983: http://tinyurl.com/7dq8zly
      • Searching "coeducation": http://tinyurl.com/7cwd95g
      • Partial content list: http://tinyurl.com/7q8w4nq
      [Note that these are all temporary links that work as of 6/28/2012 but which will stop working altogether at some point in the next few weeks.]
  • 23. Phase 2 Goals: Finish the Project! (Prep, rehouse, repair, digitize & encode Spec volumes for 1877-1954 and 1992-2009)
  • 24. Phase 2 Costs (for ca. 55,000 pages)
      • Preparation, rehousing, repair = will be covered by CU Libraries
      • Scanning of 55,000 pages = $55,000 + $5,000 contingency
      • OCR, segmentation, selective text correction = $55,000 + $5,000 contingency
      • Load into host system, license, maintenance = already covered by CU Libraries
      • Long-term preservation of master image (TIFF) files = may require additional fundraising
  • 25. Final, key points
      • The Spectator Archive project is extremely important for preservation of and access to Columbia University's history
      • This is an archival preservation project as well as an information access project
      • Columbia Libraries is making a major, long-term investment to ensure the success of this project
      • The Libraries and the Spec have made a great start, but additional funding is needed to complete the job
  • 26. Questions: Stephen Paul Davis, Director, Libraries Digital Program, Columbia University, daviss@columbia.edu
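
The figures quoted on slides 9 and 24 can be sanity-checked with a short calculation. The sketch below is illustrative only: the constant and function names are assumptions introduced here, and the totals it derives do not appear in the slides. It covers only the two Phase 2 line items that carry a dollar figure; work covered by CU Libraries and the long-term storage of master files, for which the slides give no amount, are left out.

```python
# Figures quoted in the transcript (slides 9 and 24).
TOTAL_PAGES = 79_145     # estimated pages for the 1877-2009 run
TOTAL_VOLUMES = 155      # number of volumes
PHASE2_PAGES = 55_000    # approximate Phase 2 page count

# Phase 2 line items with a stated dollar figure: (base, contingency).
# The dictionary keys are hypothetical labels, not terms from the slides.
PHASE2_ITEMS = {
    "scanning": (55_000, 5_000),
    "ocr_segmentation_correction": (55_000, 5_000),
}


def average_pages_per_volume(pages: int, volumes: int) -> float:
    """Mean pages per volume; the slides round this to about 500."""
    return pages / volumes


def phase2_totals(items: dict) -> tuple:
    """Return (base, contingency, base + contingency) in dollars."""
    base = sum(b for b, _ in items.values())
    contingency = sum(c for _, c in items.values())
    return base, contingency, base + contingency


if __name__ == "__main__":
    avg = average_pages_per_volume(TOTAL_PAGES, TOTAL_VOLUMES)
    base, contingency, total = phase2_totals(PHASE2_ITEMS)
    print(f"Average pages per volume: {avg:.1f}")
    print(f"Phase 2 base cost: ${base:,}")
    print(f"Phase 2 contingency: ${contingency:,}")
    print(f"Phase 2 total with contingency: ${total:,}")
    print(f"Base cost per page: ${base / PHASE2_PAGES:.2f}")
```

Under these assumptions the stated amounts work out to $1 per page for scanning and $1 per page for OCR, segmentation and correction, before contingency.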