Prometheus

Invited Demo: Prometheus: Managing the Ingest of Media Carriers

Nicholas del Pozo, Digital Preservation, National Library of Australia, Parkes Place, ACT 2600 Australia, ndelpozo@nla.gov.au
Douglas Elford, Digital Preservation, National Library of Australia, Parkes Place, ACT 2600 Australia, delford@nla.gov.au
David Pearson, Digital Preservation, National Library of Australia, Parkes Place, ACT 2600 Australia, dapearso@nla.gov.au

This work is licensed under the Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 Unported license. You are free to share this work (copy, distribute and transmit) under the following conditions: attribution, non-commercial, and no derivative works. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-nd/3.0/.
DigCCurr2009, April 1-3, 2009, Chapel Hill, NC, USA

ABSTRACT
The National Library of Australia has a relatively small but important collection of digital material stored on common carriers such as floppy disks, CDs and DVDs. This includes both published material and unpublished manuscripts in digital form. In the past, preservation of the Library’s physical format digital collection has been taken care of manually, on a case-by-case basis, but this approach is insufficient to deal effectively with the increasing volume of material requiring preservation.

The Library has produced an application called Prometheus, which provides a semi-automated, scalable process for transferring data from carriers to preservation-managed digital storage. This is helping the Library to mitigate the major risks associated with storing the content on physical carriers: deterioration of the media and obsolescence of the hardware required to access them. Prometheus makes it easier to process the majority of carriers commonly encountered in the Library and to collect and manage metadata about their content. Although not perfect, Prometheus is helping the Library to save digital content before it is too late.

Keywords
Digital preservation, media carriers, National Library of Australia, obsolescence, open source software, Prometheus.

1. INTRODUCTION
The National Library of Australia has a relatively small but important collection of digital material stored on common carriers such as floppy disks, CDs and DVDs. This includes both published material and unpublished manuscripts in digital form. In the past, preservation of the Library’s physical format digital collection has been taken care of manually, on a case-by-case basis, but this approach is insufficient to deal effectively with the increasing volume of material requiring preservation.

The Library collects digital material through multiple acquisition streams and generally has little control over the physical format in which the material arrives. So, while most items fall into a small number of widely used carrier types, any long-term solution has to make provision for almost any kind of carrier, including carrier types which may not have been encountered yet. Moreover, this is a constantly growing problem; if we don’t deal with the digital materials that we have already collected, and ideally process new materials as a part of the acquisition process, accessing these carriers will soon become unmanageable, and eventually impossible.

Factors such as obsolescence and carrier degradation already make it difficult for digital preservation solutions to preserve access to digital content. Additionally, due to the potential volume and diversity of carriers and file formats, unless solutions are robust and semi-automated, the digital data that it is currently possible to preserve may not be. To avoid exacerbating the problem, it is key that solutions deal with current common carrier types as efficiently as possible, while providing access to, or a mechanism for preserving, as many older carriers as is practical.

2. PROMETHEUS
To ensure access to digital content on the most common carriers within the Library, the Digital Preservation Workflow Project produced an application called Prometheus. This application provides a semi-automated, scalable process for transferring data from carriers to preservation-managed digital storage. This is helping the Library to mitigate the major risks associated with storing the content on physical carriers: deterioration of the media and obsolescence of the technology required to access them. Prometheus makes it easier to process the majority of carriers commonly encountered in the Library and to collect and manage metadata about their content. It also provides mechanisms to accommodate special cases, such as less common media types. Additionally, the original physical arrangement of a group of media can be recorded, even in those cases where a piece of physical media cannot be processed.

Prometheus allows Library staff to link to catalogue records, create a byte-level image of the digital content, and transfer it to preservation-managed digital storage. Once the content is copied from the carrier, the integrity of the image is verified, and as much metadata as possible is harvested. Attaching a customisable ‘mini-jukebox’ (Figure 1) to a staff member’s workstation allows the accurate duplication of the content from a wider range of carrier types, such as USB thumb drives, memory cards or 3½ inch floppy disks. It also provides more reliable hardware for imaging CDs and DVDs. The digital preservation section can use Prometheus to deal with carrier types that fall outside this range, such as 5¼ inch floppy disks, SyQuest disks or hard drives.
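The image-then-verify step described above can be illustrated with a short sketch. This is only an illustration in Python and not Prometheus's own code: the function names, the 1 MiB chunk size and the choice of SHA-256 are all assumptions made for the example, and the paper does not state which checksum algorithm the Library uses. The idea is to compute a digest while the carrier is being read, then re-read the stored image and confirm that the two digests match.

```python
import hashlib
from pathlib import Path

# Assumption: read 1 MiB at a time so large carriers do not fill memory.
CHUNK_SIZE = 1024 * 1024


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of the file (or device node) at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as stream:
        for chunk in iter(lambda: stream.read(CHUNK_SIZE), b""):
            digest.update(chunk)
    return digest.hexdigest()


def image_carrier(source: Path, image: Path) -> str:
    """Copy `source` byte for byte to `image`; return the digest of the bytes read."""
    digest = hashlib.sha256()
    with open(source, "rb") as src, open(image, "wb") as dst:
        for chunk in iter(lambda: src.read(CHUNK_SIZE), b""):
            dst.write(chunk)
            digest.update(chunk)
    return digest.hexdigest()


def image_and_verify(source: Path, image: Path) -> str:
    """Image the carrier, then re-read the image and confirm the digests match."""
    expected = image_carrier(source, image)
    actual = sha256_of(image)
    if actual != expected:
        raise IOError(f"image verification failed for {image}")
    return actual
```

In this sketch `source` could be an ordinary file or, on systems that expose carriers as device nodes, the path to the device; the returned digest is the kind of value that could be stored alongside the image for later integrity checks.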
Figure 1. Library developer Snezana Mihajlovic uses a customised ‘mini-jukebox’ attached to a standard Library workstation (Photo: Douglas Elford, National Library).

The system incorporates a range of open source tools to undertake processing, including carrier imaging (dd [1], cdrdao [2]); integrity calculation and checking (Jacksum [3]); file identification (DROID [4]); and metadata extraction (JHOVE [5], NLNZ Metadata Extraction Tool [6]). These tools are deployed using Java-based web services. Moreover, Prometheus has been designed in a modular way, so that tools and services can be easily upgraded or replaced as new versions are released or better software becomes available (Figure 2).

Figure 2. General Process View.

3. THE SOFTWARE RELEASED
Prometheus was designed for the Library’s specific environment, and therefore is not an ‘out of the box’ solution. However, it may be possible for other parties to use all or some of the requirements, other documentation or components. As such, the software has been released under the GNU General Public License V3.0. The latest version of Prometheus and its documentation are available from the project website [7]. A paper was presented on this project at the IFLA World Library and Information Congress in Quebec City, Canada, in August 2008 [8].

If we wait for the perfect system to be built, for the content on many carriers it will already be too late. Experience to date suggests that even though we all share the same fundamental problem, the sheer volume and diversity of carriers, as well as varying individual collecting and business environments, make it unlikely that there will ever be a single software solution that can be used by everyone. At least for the Library, Prometheus provides a starting point to manage the ingest of, and preserve content from, problematic and sometimes idiosyncratic carriers for long-term preservation, hopefully in a way that can advantage others.

This paper is based on an earlier paper that appeared in Gateways, Dec 2008 [9].

4. ACKNOWLEDGMENTS
Our thanks to Gerard Clifton, Snezana Mihajlovic and Joseph Mok, who worked with us on version 1.0 of Prometheus, and who continue with the development work for version 1.4.

5. REFERENCES
[1] dd for Windows, at http://www.chrysocome.net/dd
[2] cdrdao, at http://cdrdao.sourceforge.net/
[3] Jacksum Java checksum utility, at http://sourceforge.net/projects/jacksum/
[4] DROID automatic file format identification tool, at http://droid.sourceforge.net/wiki/index.php/Introduction
[5] JHOVE object validation environment, at http://hul.harvard.edu/jhove/
[6] National Library of New Zealand Metadata Extraction Tool, at http://meta-extractor.sourceforge.net/
[7] Prometheus Sourceforge Website, at http://prometheus-digi.sourceforge.net/
[8] Elford, D., del Pozo, N., Mihajlovic, S., Pearson, D., Clifton, G. and Webb, C. 2008. Media Matters: developing processes for preserving digital objects on physical carriers at the National Library of Australia. In World Library and Information Congress: 74th IFLA General Conference and Council, 10-14 August 2008, Québec, Canada. www.ifla.org/IV/ifla74/papers/084-Webb-en.pdf.
[9] Pearson, D. 2008. Titans in the Library: Prometheus Unbinds At-risk Data. In Gateways Dec 2008. http://www.nla.gov.au/pub/gateways/issues/96/story02.htm.
