Digital Archives in Theory and Practice

Presentation to RMLG, November 2003

Usage Rights: CC Attribution License


Presentation Transcript

  • Digital Records and Digital Archives: Preservation in theory and practice
    • Richard Davis, Digital Archives Department, University of London Computer Centre
    • http://www.ulcc.ac.uk
    • http://ndad.ulcc.ac.uk
    • [email_address]
  • What we will cover
    • Preservation options
    • Basic tasks
      • Physical preservation
      • Logical preservation
    • Metadata
    • Organisational issues
  • Why do they need special attention?
    • Digital records require an intermediary
    • They don’t have a fixed form
    • Their carriers are perishable
    • They fall outside records management regimes
    • Computers need “experts”
  • What are the advantages of digital preservation?
    • Ease of copying
    • Ease of re-use
    • No worry about which is “original”
    • Takes up less space
    • Easy to search, even without a catalogue
  • Assumptions
    • You know what digital records exist
    • You know what you want to preserve
    • You have a retention/disposal policy
    • You can separate material for preservation
    • You know what you want to do with it
  • Preservation options
    • Preserve the bits
    • Preserve the data
    • Preserve the record
    • Preserve the experience
  • Preserving the bits
    • Keep the data in exactly the same format
    • Interpretation likely to be a problem
    • Works in some contexts
    • Mostly useful as an adjunct to other strategies
  • Preserving the data
    • Keep the essential data in generic form
    • Don’t worry about presentation and context
    • Better than nothing
    • Often used for databases
    • Reduces long-term utility
  • Preserving the record
    • Keep the information and context
    • The ideal approach
    • Don’t necessarily preserve appearance
    • Balances utility against costs
  • Preserving the experience
    • Keep everything: software, information, forms, etc.
    • May require emulators or old computers
    • Expensive
    • Doesn’t support/promote re-use
    • Someone may do it - but not us!
  • Basic tasks in digital preservation
    • Protecting the media
    • Copying to new media
    • Choosing a file format
    • Migrating to new file formats
    • Managing metadata
  • Physical forms
    • Floppy disks
    • Open-reel tapes
    • Tape cartridges
    • Hard disks
    • CD-ROM
    • ZIP, JAZ, etc. disks
    • Punched cards, paper tape
  • Media lifespans
  • Refreshing media
    • The process of copying to new media
      • At end of predicted lifetime
      • At regular intervals
      • After detected failure
    • Lifetime may be number of uses, not interval
    • New media may be the same type as the old, or a different type
    • Check all copies
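The "check all copies" step can be illustrated with a fixity check. This is only a minimal sketch, assuming a SHA-256 digest was recorded for each file when it was first stored; the function names `sha256_of` and `failed_copies` are hypothetical and not part of the presentation.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def failed_copies(expected_digest: str, copies: list[Path]) -> list[Path]:
    """Return the copies that are missing or differ from the recorded digest."""
    bad = []
    for copy in copies:
        if not copy.is_file() or sha256_of(copy) != expected_digest:
            bad.append(copy)
    return bad
```

Running a check like this after every refresh, and on every stored copy rather than only the newest one, would show whether the new media holds the same bits as the old.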
  • Logical preservation
    • Selecting the right file format
    • At time of creation or accession
    • No universal solution
    • Preservation format may be different from access format
    • Should include metadata
  • Properties of preservation formats
    • Published standard
    • Stability
    • Good conversion from ingest formats
    • Good conversion to access formats
    • Good representation of structure of information
  • Long-term storage
    • Documents: Plain text, PDF, XML
    • Data tables (DBMS or spreadsheet): CSV, SQL Schema, XML Schema
    • Pictures: TIFF
    • Sound: PCM, AIFF
    • Avoid lossy compression
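For the data-table case, one possible way to produce a CSV file alongside an SQL schema is sketched below. It assumes the source is an SQLite database reachable through Python's standard `sqlite3` module; the function name `export_table` and the output file naming are illustrative assumptions, not something the presentation prescribes.

```python
import csv
import sqlite3
from pathlib import Path


def export_table(conn: sqlite3.Connection, table: str, out_dir: Path) -> None:
    """Write <table>.sql (the CREATE statement) and <table>.csv (header plus rows)."""
    # Look the table up in SQLite's catalogue first, so unknown names fail early.
    row = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = ?", (table,)
    ).fetchone()
    if row is None:
        raise ValueError(f"no such table: {table}")
    (out_dir / f"{table}.sql").write_text(row[0] + ";\n", encoding="utf-8")

    # Quote the identifier, doubling any embedded double quotes.
    quoted = '"' + table.replace('"', '""') + '"'
    cursor = conn.execute(f"SELECT * FROM {quoted}")
    with (out_dir / f"{table}.csv").open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow([column[0] for column in cursor.description])
        writer.writerows(cursor)
```

CSV on its own does not record column types and cannot distinguish NULL from an empty string, which is one reason to keep the schema file next to it.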
  • Capturing the record
    • Manual
      • Users must choose what is retained
      • User-driven conversion
    • Automatic
      • System forces capture of record copy
      • Triggers conversion to preservation format
    • Retrospective archiving is manual, by definition
    • ERMS should support automated capture
  • Automated capture
    • Email: central server captures and indexes
    • Documents: EDMS
    • Databases: capture transaction logs and/or regular snapshots
    • Web sites: treat as databases (transaction logs and/or regular snapshots)
    • Custom applications: specify requirements ab initio
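The "regular snapshots" option for databases might look like the following. This is a sketch under the assumption that the live system is an SQLite database; it uses the standard library's `Connection.backup` method, and the function name `take_snapshot` and the file naming scheme are hypothetical.

```python
import sqlite3
from datetime import datetime, timezone
from pathlib import Path


def take_snapshot(source: sqlite3.Connection, snapshot_dir: Path) -> Path:
    """Copy the live database into a new timestamped file and return its path."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S%fZ")
    target_path = snapshot_dir / f"snapshot-{stamp}.sqlite"
    target = sqlite3.connect(target_path)
    try:
        # backup() copies a consistent image even while the source is in use.
        source.backup(target)
    finally:
        target.close()
    return target_path
```

A scheduler (cron or similar) would call this at whatever interval the retention policy requires; other database systems have their own dump or backup tools that fill the same role.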
  • Migration
    • Frequency is not predictable
    • Usually driven by external factors
      • Changes in IS/IT strategy
      • Software/hardware upgrades
    • Should be automated
    • Check migration does not lose information
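One way to check that an automated migration does not lose information is a round-trip comparison: convert, read the result back, and compare with the source. The sketch below migrates CSV to a simple XML layout. The element names (`table`, `row`, `field`) and the function names are assumptions made for illustration, and it assumes every row has the same number of fields as the header.

```python
import csv
import io
import xml.etree.ElementTree as ET


def read_csv_records(csv_text: str) -> list[dict[str, str]]:
    """Parse CSV text with a header row into a list of records."""
    return list(csv.DictReader(io.StringIO(csv_text, newline="")))


def csv_to_xml(csv_text: str) -> str:
    """Migrate CSV text to XML: one <row> per record, one <field> per column."""
    root = ET.Element("table")
    for record in read_csv_records(csv_text):
        row = ET.SubElement(root, "row")
        for name, value in record.items():
            field = ET.SubElement(row, "field", name=name)
            field.text = value
    return ET.tostring(root, encoding="unicode")


def xml_to_records(xml_text: str) -> list[dict[str, str]]:
    """Read the migrated XML back into the same record structure."""
    root = ET.fromstring(xml_text)
    return [
        {field.get("name"): field.text or "" for field in row.iter("field")}
        for row in root.iter("row")
    ]


def migration_is_lossless(csv_text: str) -> bool:
    """True if the records survive the CSV -> XML -> records round trip unchanged."""
    return xml_to_records(csv_to_xml(csv_text)) == read_csv_records(csv_text)
```

A check like this only covers the properties it compares (here, field names and values); anything else judged significant, such as column order or data types, would need its own comparison.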
  • Metadata
    • Data about data
    • Not specific to digital records
    • Types of metadata:
      • Discovery
      • Access
      • Preservation
      • System
    • Embedded or external
    • Treat with same care as data itself!
  • Typical metadata
    • Author
    • Subject
    • Keywords
    • Abstract
    • Dates of creation/use/retirement
    • Access conditions
    • Retention period
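The fields listed above could be held in a small structured record and written out as an external ("sidecar") metadata file. This is a hedged sketch: the class name `RecordMetadata`, the field names, the use of ISO 8601 date strings, and the choice of JSON as the serialisation are all assumptions for illustration.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class RecordMetadata:
    author: str
    subject: str
    abstract: str
    created: str                      # date of creation, ISO 8601 (assumed convention)
    access_conditions: str
    retention_period: str
    keywords: list[str] = field(default_factory=list)
    last_used: Optional[str] = None   # date of last use, if known
    retired: Optional[str] = None     # date of retirement, if any


def to_sidecar_json(metadata: RecordMetadata) -> str:
    """Serialise the metadata as JSON text for storage beside the record."""
    return json.dumps(asdict(metadata), indent=2, sort_keys=True, ensure_ascii=False)
```

Whatever form is chosen, the sidecar file is itself a digital object and needs the same refreshing and integrity checking as the record it describes.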
  • Non-digital metadata
    • Most computer systems depend on paper records to be understood:
      • Specifications
      • Manuals
      • Reports
    • Some essential information may only be in people’s heads
    • Especially true for older systems/records
  • Non-digital metadata
  • Preservation and access
    • Preservation systems:
      • Keep information safe and secure
      • Control accessibility
      • Deliver data without interpretation
    • Access systems:
      • Mediate between user and preservation system
      • Format, select and present information
      • Enable user discovery of resources
      • Relate information to context
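The division of responsibilities between the two kinds of system can be sketched as two small classes. Everything here is a hypothetical illustration: the class names, the in-memory storage, the catalogue layout, and the assumption that records are UTF-8 text are not taken from the presentation.

```python
class PreservationStore:
    """Keeps objects safe, controls who may read them, and returns raw bytes only."""

    def __init__(self) -> None:
        self._objects: dict[str, bytes] = {}
        self._readers: dict[str, set[str]] = {}

    def deposit(self, object_id: str, data: bytes, readers: set[str]) -> None:
        if object_id in self._objects:
            raise ValueError(f"{object_id} already deposited; objects are never overwritten")
        self._objects[object_id] = bytes(data)
        self._readers[object_id] = set(readers)

    def retrieve(self, object_id: str, user: str) -> bytes:
        if user not in self._readers.get(object_id, set()):
            raise PermissionError(f"{user} may not read {object_id}")
        return self._objects[object_id]  # delivered without interpretation


class AccessSystem:
    """Mediates between user and store: discovery, selection, presentation."""

    def __init__(self, store: PreservationStore, catalogue: dict[str, dict[str, str]]) -> None:
        self._store = store
        self._catalogue = catalogue  # object_id -> {"title": ..., "context": ...}

    def search(self, term: str) -> list[str]:
        """Return the ids of catalogue entries whose title contains the term."""
        term = term.lower()
        return sorted(
            object_id
            for object_id, entry in self._catalogue.items()
            if term in entry["title"].lower()
        )

    def present(self, object_id: str, user: str) -> str:
        """Fetch the bytes, interpret them as UTF-8 text, and add title and context."""
        entry = self._catalogue[object_id]
        text = self._store.retrieve(object_id, user).decode("utf-8")
        return f"{entry['title']} ({entry['context']})\n\n{text}"
```

Keeping interpretation out of the store means the access layer can be replaced as formats and user expectations change, without touching the preserved bytes.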
  • Working with IT departments
    • Style of IT support depends on size/age/type of organisation
    • Central control is easier to work with
    • Try to be involved before records are created
    • Express needs/issues in clear, real-world terms
    • IT developers like simple, reusable formats as well
  • Hints and tips
    • Databases: may have different views
    • Beware of …
      • Password-protected files
      • Automated dates in documents
      • Dynamic documents
      • Linked documents
      • Embedded objects
      • Hybrid assemblies
  • Hybrid assemblies and embedded objects
  • And finally…
    • For now: preserve original bitwise copies and use standard formats
    • Don’t wait for all the answers before you begin
    • Make friends with IT specialists
    • Learn about other initiatives and approaches
    • Remember your Records Management training: digital isn’t that different