Digital preservation











Usage Rights

© All Rights Reserved


    Digital preservation: Presentation Transcript

    • Digital preservation for ongoing access
      Presentation for Council, July 2008
      David Pearson, Manager, Digital Preservation Section
    • Overview
      1. We have lots of “digital stuff” in our collections and it is growing
      2. We will lose access to it unless we take action
      3. We need to manage the process of keeping it accessible and usable
      4. Solutions have to be scalable, reliable and automated
    • 1. “Digital stuff”: many collections
      Pictures, Oral History, Manuscripts, Historical Newspapers, Web sites, Sheet music, Maps, Ephemera, Books, Serials
    • How does it grow?
      1. We collect it
         – Physical carriers
         – Online
           • PANDORA web archive
           • Australian web domain harvests
      2. We create it
         – Oral history interviews
         – Photographs
         – Publications
      3. We convert it
         – Digitise our collections
    • Web Archives
      • Web sites are collected selectively
        – Individually for access via PANDORA, or
        – On a large scale via annual domain snapshots
      • No control over content creation
      • Lots of
        – File formats
        – Individual files (PANDORA ≈ 51 million, domain harvest ≈ 1.3 billion files)
        – Links
        – Software (browser, plug-ins, readers)
      • Internet content changes over time
    • Digitisation
      • Around 135,000 items digitised
      • Newspaper project = 4 million pages by 2010
      • Internally created so we can control
        – Standards
        – File formats (e.g. TIFF, JPEG, PDF)
        – Metadata
        – Workflows
      • Issues
        – Growing volume
    • Physical carriers
      • Approx. 12,000 items, growing by 1,000 a year
      Issues
      • No control over creation
      • Time lag before acquisition
      • Variety of carriers (fragile) and file formats
      • Require various hardware, software, operating systems and drivers to access
      • Labour intensive to process and transfer to safe storage (growing backlog)
    • Growth: digital collection storage
      [Chart: storage size in terabytes (axis 0 to 350), January 2003 to July 2008; labelled series include Newspapers and Australian Web Harvests]
    • Type of Digital Collections, 2008
      [Pie chart]
      – Australian Web Harvest: 40%
      – Historical Newspapers: 21%
      – Oral History: 18%
      – Pictures: 7%
      – Sheet Music: 4%
      – Pandora: 3%
      – Other: 3%
      – Maps: 2%
      – Manuscripts: 2%
    • Growth: compared to books
      [Chart: comparison of the books collection and the digital collection in “book equivalents” (millions, axis 0 to 6), year end June 2005 to 2008; series are Digital Collection (20 MB “book equivalents”) and Books Collection]
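
The “book equivalent” measure on this slide can be worked through as simple arithmetic. The sketch below assumes, as the chart legend states, 20 MB per book equivalent, and further assumes decimal units (1 TB = 1,000,000 MB). The function name and the 100 TB input are hypothetical illustrations, not figures from the presentation.

```python
MB_PER_TB = 1_000_000  # assumption: decimal units (1 TB = 1,000,000 MB)


def book_equivalents(storage_tb: float, mb_per_book: float = 20.0) -> float:
    """Convert a storage size in terabytes into "book equivalents".

    mb_per_book defaults to the 20 MB figure shown in the chart legend.
    """
    if mb_per_book <= 0:
        raise ValueError("mb_per_book must be positive")
    return storage_tb * MB_PER_TB / mb_per_book


if __name__ == "__main__":
    # Hypothetical input: 100 TB of digital collection storage.
    print(book_equivalents(100))
```

Under these assumptions every 20 TB of storage corresponds to one million book equivalents, which is why terabyte-scale growth quickly reaches the millions shown on the chart axis.
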
    • 2. Act or risk losing it
      • “Digital stuff” is dependent on technology at all stages
        – Creation/capture
        – Storage
        – Access
      • Technology changes rapidly, so software, hardware, media, file formats and operating systems become obsolete
      • Unless managed, deterioration can occur rapidly, e.g. data can be corrupted or lost in the storage or transfer process
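
Corruption or loss during storage or transfer is commonly detected with checksums. The following is a minimal sketch of that general technique, not a description of the process used by the presenter's organisation: it records a SHA-256 digest per file and later reports files that have gone missing or changed. The function names and the manifest layout are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its current digest."""
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_problems(root: Path, manifest: dict[str, str]) -> dict[str, str]:
    """Compare files under root with a stored manifest.

    Returns {relative_path: "missing" | "changed"} for every mismatch.
    """
    problems = {}
    for rel, expected in manifest.items():
        target = root / rel
        if not target.is_file():
            problems[rel] = "missing"
        elif sha256_of(target) != expected:
            problems[rel] = "changed"
    return problems
```

A manifest built when material enters storage can be re-checked after every copy or migration; an empty result from `find_problems` means every recorded file is still present and byte-identical.
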
    • Computer Museum
    • 3. Managing to keep it
      • “Not managing it” is not an option
      • We need to
        – Understand our “digital stuff” & associated risks
        – Provide safe storage & ensure integrity
        – Ensure access over time as technology changes
        – Develop & implement preservation workflows, skills, standards, & strategies for ongoing access
        – Enable content to be shared and used in different ways in the future
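
One simple first step towards understanding what a collection contains is an inventory of file types. The sketch below counts files by extension; this is an illustrative assumption only, since an extension is a weak proxy for the real format and production workflows would typically rely on signature-based identification instead. The function name is hypothetical.

```python
from collections import Counter
from pathlib import Path


def extension_inventory(root: Path) -> Counter:
    """Count files under root by lower-cased extension.

    Files without an extension are counted under "(none)".
    """
    counts = Counter()
    for path in root.rglob("*"):
        if path.is_file():
            counts[path.suffix.lower() or "(none)"] += 1
    return counts
```

Calling `extension_inventory(some_directory).most_common()` lists the most frequent extensions first, which gives a rough view of which formats dominate a collection and where obsolescence risk may be concentrated.
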
    • 4. Solutions and implications
      • Large scale automated processes
      • Original research & time to deliver the solutions
      • Reasonably long lead times
      • Audit processes and quality control monitoring are critical
      • Significant resources are required
    • Conclusions
      • We are responsible for a lot of “digital stuff”
      • If we simply collect and store it, it will become unusable in a relatively short time as technologies change
      • Maintaining the ability to access it requires a lot of good management, planning, & dedicated resources
      • We have to find and use solutions that can be applied automatically and reliably to billions of digital files