20110428 ARMA Amarillo Managing Your Records in 5, 50, 500 Years


This presentation at the ARMA Amarillo Spring Seminar described issues and strategies for digital preservation.

Published in: Technology
  • ERM CO M12 06/05/11 17:04 Page Civilisations throughout the world have been creating records for millennia. Of course, we are talking here about physical records. Humankind has been especially inventive in choosing media for its record keeping. But whether the records were kept using South American knotted strings, Sumerian clay tablets, Egyptian papyrus, Asian birch bark, ancient Greek vellum, contemporary paper or high-tech microfilm on a polyester substrate, they have one thing in common: the medium is chosen to preserve the information in the records it holds for as long as it is needed. To be more precise, the record-keeping medium will last as long as needed, provided it is kept in suitable conditions. Keepers of records soon learned this; many record media, for example, will quickly rot if kept in damp conditions. So the techniques of preservation started to evolve, and keepers of records devised suitable environments in which to store records. Today, these techniques are very sophisticated. They rely on three principles: choosing media and materials which are durable; storing the media in good conditions; and protecting the records against disaster. There are innumerable examples of all three techniques in use today. The next slide shows just a few, for illustration.
  • Some examples of durable media are: acid-free paper, which is specially made to last longer than most “ordinary” paper and is less likely to turn yellow and brittle with age; archival-grade polyester sleeves for photographic film and negatives, which are less likely to react chemically with the film they contain; and acetate substrate silver halide film, which is predicted to last for centuries provided it is properly processed and stored in good conditions. Even the very best durable media will only last a long time if they are stored in good conditions. Examples of such conditions are as follows. Maintaining a consistent, relatively low temperature: almost all media suffer chemical changes over long periods (typically decades); these changes, which degrade the media, occur much faster at higher temperatures, so archives and record stores maintain a steady, low temperature, though not so low as to damage the media in other ways. Maintaining a constant, low level of humidity: high humidity, and changes in humidity, promote chemical degradation. Separating the storage areas from reading and consultation areas: this protects the stored records from various pollutants and other disturbances caused by the comings and goings of readers. Measures to guard against loss by disaster include the use of fireproof safes or vaults, which protect documents against flames for a number of hours, normally in conjunction with other precautions against fire; and the use of secure underground storage in geologically stable areas. This is a complex, but well-documented, subject. It is complex because there are many special considerations: for example, cellulose microfilm can be stored safely in atmospheres where the humidity is as low as 15 percent, but polyester microfilm should only be stored in humidity over 30 percent. Fortunately, requirements are well documented, by media suppliers on the one hand and by standards institutions on the other. For example, there are at least 22 standards from the two standards bodies, ISO and ANSI, relating to the preservation of microfilm alone, and there are other standards that relate to the physical storage of paper. As the preservation of physical records is not the focus of this module, these numerous standards are not considered further.
  • Some of the same considerations apply to electronic records. So, regardless of preservation issues which are unique to electronic records, you need to store them on durable media to preserve them over long periods. The expected lifetime of the media should be a known factor in your choice of media. WORM media tend to last longer than rewritable media because the underlying data substrate is more stable, and cartridge-based media tend to last longer because they are better protected against accidental damage and environmental contaminants. Likewise, electronic media need to be stored in suitable conditions. This may mean air-conditioned and humidity-controlled environments, which tend to be the norm in larger organisations but may not be practical for small ones. And naturally, the idea of protection against disaster applies too. There is no point in taking care of electronic records only to see them lost because of a disaster. These three ideas are covered in a little more detail later. The idea of standards applies only partly, however. For the physical storage of electronic media, there are clear guidelines; the media suppliers, for example, provide instructions on storage of the media they provide. However, electronic records need much more than this basic physical preservation, and there are few standards for these other aspects of digital preservation. Those other aspects are the core of the remainder of this module.
  • On the previous slide you learned that digital preservation involves more than just the physical preservation of media. Here’s why. Almost all physical records require little or no technology to allow humans to extract the information from them. You can pick up an old record, and whether it is written or printed on clay, papyrus, paper, or anything else, you can make out what it says (understanding what it says is another matter; you will cover that later in this module). There are exceptions of course: microfilm, audio and video recordings all need some sort of technological assistance, but these are all relatively recent developments, and are in the minority. By contrast, electronic records need technology to allow humans to understand them. Actually, they require rather a lot of technology. This can include various combinations of products, including servers, disk drives, a network, a PC, a screen, an operating system, ERM system software, and so on. In other words, a combination of hardware and software.
  • 1. Data cassette story for the C-64 2. Gen: single-session vs. multisession CDs 3. WORM, DVD(!) 4. CD ubiquity – now DVD, soon Blu-Ray/UDO
  • Story here about migration through generations vs. over generations – get details!
  • Analog refers to outputting to paper or film. This is not viable for many rich media; think audio or Flash.
  • Requires keeping the original system (software, OS, data) plus writing an emulation environment where the original system will think it is running on original hardware; today’s emulator may require its own emulator to run on Win2010!
  • Software was proprietary (BCPL, a precursor of C) and only ran on BBC Micro computers. No LV-ROM readers are available (proprietary 12” discs). Media is fairly hardy, but it will fail eventually (scratches, etc.). CAMiLEON: Creative Archiving at Michigan and Leeds: Emulating the Old on the New.
  • Some of the digital preservation ideas and actions described in this module have been theoretical, uncertain, or questionable for some reason. Unfortunately, that reflects the state of the art; as suggested earlier, there is no “silver bullet”. But there are nonetheless some actions you can consider if you have a recognised, or even latent, digital preservation issue to solve. The first rule is: “Know your holdings”. You will not make the right decisions on digital preservation unless you know about the records you hold. There are five main things you need to understand: what file formats your records are in; how the mix of file formats may change in the future (to the extent that this is known or can be predicted); the physical storage conditions of the records and all their backups; how long they have to be retained; and their value or importance to the organisation (or to whoever else has an interest in them). These five points form the foundation for almost any preservation activity. Anyone wanting to make a start, or indeed wanting to confirm that no action is needed, should start here. Understanding the current file formats may be a challenge in some cases; because of the sheer volumes involved, it will often call for software tools to investigate and report on formats. Automated tools for this purpose are not yet widely available, though there are efforts led by the UK PRONOM project to document as many file formats as possible.
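The first of the five points, knowing which file formats are held, can be sketched in a few lines. The example below is only an illustration under stated assumptions: the signature table is a tiny hand-picked sample, the function names `identify` and `inventory` are hypothetical, and a real survey would rely on a maintained format registry rather than a list like this one.

```python
from collections import Counter
from pathlib import Path

# Illustrative sample of leading-byte signatures (assumption: far from complete).
SIGNATURES = [
    (b"%PDF-", "PDF"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"PK\x03\x04", "ZIP container"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "OLE2 compound file"),
]


def identify(path: Path) -> str:
    """Guess a file's format from its first bytes, falling back to its extension."""
    with path.open("rb") as handle:
        header = handle.read(16)
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    suffix = path.suffix.lower()
    return f"unknown ({suffix or 'no extension'})"


def inventory(root: Path) -> Counter:
    """Count the files under root, grouped by guessed format."""
    return Counter(identify(p) for p in root.rglob("*") if p.is_file())
```

Calling `inventory(Path("/path/to/holdings"))` returns a tally per format; the "unknown" entries show where manual investigation is still needed.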
  • The next step would be to develop a strategic approach to digital preservation. Preservation actions are rarely very urgent, but are likely to be costly in many cases, so a measured, strategic approach is called for. The strategy may involve collaborating with other organisations, or with other departments within the organisation, to spread costs; initiating research internally; and a range of other actions. The sorts of short-term actions a strategy might contain include: actions to improve physical storage conditions, both for the live sites and for backup storage locations; detailed investigation of protection against media degradation; creation and capture of metadata to record file formats for all records, or for all records that may in future require preservation actions; the adoption of standard file formats, having chosen one or several preferred file formats which are best suited to your organisation’s likely future needs (note that this might imply migrating all newly received records into preferred formats on receipt, or retrospectively migrating existing records to these preferred formats, or both); the adoption of preservation standards, including standards for processes, such as ISO/TR 18492, and for file formats, including but not limited to ISO 19005 (PDF/A) and ISO 32000 (PDF); and finally, the creation of, or subscription to, a technology watch function. The idea of a technology watch function is that it routinely scans the technology scene with a view to spotting when technologies you use may be about to become obsolete, and also to evaluate new technologies and file formats in terms of their preservation potential.
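One small part of a technology watch function can be sketched as a comparison between file format metadata and a watch list. Everything in the example below is an assumption made for illustration: the `Holding` record, the `at_risk` function, the format names and the cut-off years are all hypothetical, and a real watch list would come from ongoing monitoring rather than a hard-coded table.

```python
from dataclasses import dataclass


@dataclass
class Holding:
    record_id: str
    file_format: str
    retain_until: int  # last year the record must remain accessible


# Hypothetical watch list: format name -> last year support is expected.
WATCH_LIST = {"legacy-wordprocessor": 2015, "legacy-spreadsheet": 2013}


def at_risk(holdings, watch_list):
    """Return holdings whose retention period outlasts expected format support."""
    flagged = []
    for holding in holdings:
        cutoff = watch_list.get(holding.file_format)
        if cutoff is not None and holding.retain_until > cutoff:
            flagged.append(holding)
    return flagged
```

Records flagged this way are candidates for the migration or conversion actions in the strategy; records in formats that are not on the watch list are left alone.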

    1. 1. Jesse Wilkins, CRM April 28, 2011
    2. 2. Digital documents last forever – or five years, whichever comes first. (Jeff Rothenberg, RAND Corp.)
    3. 3.
          • The problem with digital information
          • Approaches to digital preservation
          • Strategies for long-term access
    4. 5.
          • Records are not new
          • Physical records last as long as you need them…
          • Preservation principles:
              • Durable media
              • Storage conditions
              • Disaster protection
    5. 6.
          • Durable media examples:
              • Acid-free paper
              • Silver halide microfilm
          • Storage conditions examples:
              • Controlled temperature
              • Low, constant humidity
          • Disaster protection examples:
              • Fireproof vault
          • Numerous standards apply
    6. 7.
          • Durable media examples:
              • WORM media
              • Cartridge-based media
          • Storage conditions examples:
              • Controlled temperature and humidity apply
          • Disaster protection examples:
              • Fireproof vault, offsite storage
          • Some standards apply
    7. 8.
          • Physical records need little or no technology
          • Electronic records need a lot of technology:
              • Servers and networks
              • Disk drives
              • PC and operating system
              • Monitor
              • ERM system software
          • Software and hardware evolve rapidly!
    8. 9.
          • Media deterioration
          • Hardware compatibility
          • Software compatibility
          • Security and encryption
          • A word about standards
    9. 10.
          • There are no archival-class media for storing digital information
              • Media can be damaged, scratched, stretched
              • Substrate separation – the chemical layer that stores the data separates from the media
          • And if there were, it wouldn’t matter!
    10. 11.
          • Technical obsolescence
              • 8” floppy disks, laser video discs
          • Generational changes
              • Floppy disks, CDs
          • Non-standard formats
              • ZIP drives, LS-120
          • Rapid rate of change
    11. 12.
          • Between applications
              • Microsoft Word, Corel WordPerfect
          • Between platforms
              • Word, Word for Mac
          • Between versions
              • Word 1.0, Word 2010
    12. 13.
          • Passwords can be lost
          • Some applications don’t play nicely with encrypted or protected files
          • Some applications don’t recognize security features, and ignore them
    13. 14.
          • Formal standards are agreed to by users, vendors, and industry experts, and are managed by standards organizations
              • XML, PDF
          • Ad hoc standards are controlled by vendors or smaller groups and are considered standards because they are in widespread use
              • Microsoft Word
          • Standards protect the organization!
    14. 16.
          • Analog storage
          • System archival
          • Emulation
          • Conversion
          • Migration
          • Each has its own strengths & weaknesses
    15. 17.
          • Analog storage suffers from a number of issues:
          • Search and retrieval issues
          • Storage requirements and costs
          • Data loss, particularly for rich media formats
          • Sheer volume of stuff
    16. 18.
          • Maintain copy of original hardware, software, operating system, and records
          • Still run into issues with media and hardware lifespan
          • Centralizes access to locations with older systems
          • Increasing number of systems required to ensure access to everything
          • Difficult to ensure everything is taken into account
    17. 19.
          • Virtual recreation of original environment
          • Does not require any conversion
          • Requires periodic refreshing of the emulation environment
          • Still have issues around media and, maybe, hardware to read it
          • Lots of work is being done in this area
    18. 20.
          • Move from proprietary to standard
              • HTML to XML
              • Windows bitmap to JPEG or TIFF
              • Excel to ASCII text
          • Can be labor-intensive
          • Often results in some loss of data
              • Proprietary formatting
              • Rich objects, images, formulas, etc.
    19. 21.
          • Digital media doesn’t last forever…
          • … and neither does the hardware
          • Media must be refreshed while it’s still readable
          • Very labor-intensive
          • Often results in loss of some information
              • Migration over generations often more reliable than migration through generations
    20. 22.
          • Domesday Book written in 1086
          • In 1986, BBC created interactive presentation using LaserVision LV-ROM
          • By 2002 the discs were unreadable
          • Through significant effort and the use of migration and emulation, the Domesday presentation remains available
    21. 24.
          • Know your holdings
              • Current file formats
              • Future file formats
              • Physical storage conditions
              • Retention requirements
              • Value/importance
    22. 25.
          • Develop strategy
              • Improvement of physical storage conditions
              • Investigation of media degradation actions
              • Creation of file format metadata
              • Adoption of standard file formats
              • Adoption of preservation-related standards
          • Develop a migration plan
          • Create a technology watch function
    23. 26.
          • Capture information using no compression or lossless compression
          • Use standard file and media formats
          • Select high-quality media that will last 5-10 years
          • Capture relevant metadata
    24. 27.
          • Capture information using no compression or lossless compression
          • Capture information in standard formats or formal descriptions
          • Select high-quality media and plan for migration
          • Capture relevant metadata
          • Do not use encryption or passwords on individual documents
    25. 28.
          • Capture information in standard formats or formal descriptions
          • Select high-quality media and plan for migration
          • Capture and embed relevant metadata
          • Consider converting to analog
          • Do not use encryption or passwords on the individual documents
    26. 29.
          • Digital preservation requires work
          • Ultimately a question of tradeoffs
              • Cost to preserve
              • Cost of not preserving
              • Exactly what must be preserved
          • Pursue multiple preservation strategies
          • Standards can help preservation efforts
    27. 31.
          • Jesse Wilkins, CRM
          • Director, Systems of Engagement
          • AIIM
          • +1 (303) 574-0749 direct
          • [email_address]
          • http://www.twitter.com/jessewilkins
          • http://www.linkedin.com/in/jessewilkins
          • http://www.facebook.com/jessewilkins
          • http://www.slideshare.net/jessewilkins
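The migration slide's advice that media must be refreshed while it is still readable can be sketched as a copy followed by a comparison of checksums. This is a minimal illustration, not a method described in the presentation: the function names `sha256_of` and `refresh` are hypothetical, and using SHA-256 to confirm that the copy matches the source is an assumption about how the check might be done.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def refresh(source: Path, destination: Path) -> str:
    """Copy a record to new media and confirm the copy is bit-identical."""
    destination.parent.mkdir(parents=True, exist_ok=True)
    before = sha256_of(source)
    shutil.copy2(source, destination)  # copy2 also preserves file timestamps
    after = sha256_of(destination)
    if before != after:
        raise OSError(f"copy of {source} does not match the original")
    return after
```

The returned digest could be stored with the record's other metadata so that later refresh cycles can check the file against it.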