Introduction to Digital Preservation
William LeFurgy
2/6/2013
Session overview
• Some definitions
• Digital preservation challenges
• Preservation strategies
• Non-technical issues
   – Collaboration, institutional culture, legal issues, costs
• Digital Forensics
Definitions
• Digital stewardship, preservation, curation
   – Often used interchangeably to mean active management of digital
     information over time to ensure its accessibility; “stewardship” is
     broadest term
• Born digital
   – Information created in digital form (not digitized!)
• Digitize (scan, reformat, “reborn” digital)
   – Create a digital copy of an analog original
• Life Cycle
   – Stages that digital content moves through from creation to
     preservation to access
• Archive/Archival Store/Repository
   – System used to accept, store and access specified information with
     long term value; provides for secure, redundant and managed
     administration; protect content and ensure ongoing access
Pictures Rather than Words




Atlas of Digital Damages on Flickr, http://ow.ly/hrJ7i
The Digital Preservation Challenge
• Libraries, archives, museums and other cultural
  heritage institutions have unparalleled
  experience managing analog items
• Digital information is an existential test:
  institutions have to figure out a new way of
  doing business
• Hard, because most institutions and staff have
  limited experience dealing with digital
• Hard, too, because digital presents its own challenges: volume,
  complexity, technological dependency and obsolescence
Problem: Lots and Lots of Data
• Huge volume of digital information—and it is
  rapidly growing
• Organizations, governments and individuals
  are all information creators
• Some large chunks of this information have
  value (actual or potential) from the perspective
  of archives/libraries
• Which chunks to focus on?
Problem: Information Complexity
• Dynamic databases, websites
• Sophisticated specialty uses: CGI, CAD/CAM,
  geospatial…
• Highly specialized applications dependent on
  deep knowledge: scientific databases
Problem: Technological Dependency/Obsolescence
• Every piece of digital information depends on a stack
  of technologies working perfectly together, e.g.:
   –   File format (pdf, html, doc)
   –   Storage media (cloud, hard drive, USB drive)
   –   Application software (reader, browser, app)
   –   Operating system (Windows XP, Vista, 7)
   –   Computing device (PC, laptop, smart phone)
• Each layer of the stack is changing
• Ensuring ongoing access requires work, careful
  planning
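To make the stack idea concrete, here is a minimal Python sketch that records the five layers listed above for one object and reports which of them appear on a list of obsolete technologies. The class name `TechnologyStack`, the function `layers_at_risk`, and the idea of a caller-supplied `obsolete` set are assumptions made for illustration; the slides do not prescribe any particular data model.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class TechnologyStack:
    """The layers a digital object depends on, as listed on the slide."""
    file_format: str
    storage_media: str
    application_software: str
    operating_system: str
    computing_device: str


def layers_at_risk(stack: TechnologyStack, obsolete: set[str]) -> list[str]:
    """Return the names of layers whose value is in the obsolete set.

    The obsolete set is hypothetical input; in practice it would come
    from whatever technology watch an institution maintains.
    """
    return [layer for layer, value in asdict(stack).items() if value in obsolete]
```

If any single layer is flagged, access to the object is at risk even when the file itself is intact, which is why each layer is tracked separately.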
Osborne 1, 1982: WordStar, 5.25” floppy, CP/M
What is “Preservation”?
• What does a system need to do with information to provide for
  adequate preservation and access, now and in the future?
• Is saving the original files enough? Do they need to be
  converted/normalized?
• What metadata needs to be available?
• How important is original “look and feel” compared with
  information content?
• Answers to such questions drive strategies, approaches
Progress is Evident
• A number of initiatives are tackling the issue
  around the world
• Using some common principles, but different
  approaches
• Reasons for optimism:
  – Important elements of the issue are defined
  – Solid conceptual framework exists
  – Biggest institutions are deeply engaged
  – Extensive cooperation, sharing, open development
Mix of Institutional Strategies
• Build Institutional foundations
   – Provide mandate and policies for a preservation program
   – Trusted Digital Repositories/TRAC
• Develop Internal systems
   – Build an infrastructure (Proprietary? Open Source?)
• Use External Services
   – Pay for an existing infrastructure
• Learn by doing
   – Identify/capture content, rely on iterative improvement
• Collaborate
   – Work with others on shared approaches
• Observe and wait
Preservation Approaches
• Differences of opinion exist today
• New approaches may emerge in the future
• Three commonly accepted approaches today:
  • Bit preservation
  • Migration
  • Emulation
• Can rely on one approach or a hybrid
Approach: Bit Preservation
Capture information in its original form and
focus on maintaining data integrity: files are
kept unchanged
Advantages
• Lower cost
• Scalable, practicable
• Works well (so far)
Disadvantages
• Useful life of data unclear
• Future functionality (look and feel) at risk
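As a minimal sketch of what "maintaining data integrity" can look like in practice, the Python below records a SHA-256 checksum for every file under a directory, so later checks can show that the files really are unchanged. The function names `sha256_of` and `build_manifest`, the choice of SHA-256, and the manifest shape (relative path mapped to hex digest) are assumptions made for illustration, not something the slides specify.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }
```

The manifest would normally be stored alongside, and also apart from, the files it describes, so that damage to one does not silently invalidate the other.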
Approach: Migration
Transform/normalize data into formats and
structures that are optimal for preservation

Advantages
• Homogeneous data easier to manage, access
• Files are preserved with rich contextual metadata
• Potential to solve preservation issues once and for all
Disadvantages
• Complex ingest processes
• Loss of data, functionality
• Based on assumptions about future
• IP issues major barrier
• Scalability, practicality not proven
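One very small instance of migration is normalizing legacy text files to UTF-8 with consistent line endings. The sketch below assumes the source files are Windows-1252 text (a hypothetical choice; the real encoding has to be established per collection) and uses an illustrative function name, `normalize_text_file`. It leaves the original untouched and returns a record of what was done, since a migration is only trustworthy if it can be audited.

```python
import hashlib
from pathlib import Path


def normalize_text_file(source: Path, out_dir: Path,
                        source_encoding: str = "cp1252") -> dict[str, str]:
    """Write a UTF-8, LF-terminated copy of a legacy text file.

    The original is only read, never modified. The returned record
    notes checksums and the assumed encoding for later audit.
    """
    raw = source.read_bytes()
    # Raises UnicodeDecodeError on bytes the assumed codec cannot map.
    text = raw.decode(source_encoding)
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    out_dir.mkdir(parents=True, exist_ok=True)
    target = out_dir / (source.stem + ".utf8.txt")
    target.write_bytes(text.encode("utf-8"))
    return {
        "source": source.name,
        "source_sha256": hashlib.sha256(raw).hexdigest(),
        "source_encoding": source_encoding,
        "target": target.name,
        "target_sha256": hashlib.sha256(target.read_bytes()).hexdigest(),
    }
```

Even this simple case shows the listed disadvantages: the result depends on an assumption (the source encoding), and a wrong guess can lose or corrupt information, which is one reason to keep the original files as well.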
Approach: Emulation
Use software to mimic behavior of obsolete
systems to access and use original data

Advantages
• Look and feel preserved
• Potential to solve access issues once and for all
• No need to process original files
Disadvantages
• Complex development: may need to emulate HW, OS, applications …
• Technology a moving target: need many emulators to reflect changes
• IP issues major barrier
• Scalability, practicality not proven
• Is the emulation right?
Preferred Approaches Share Basic Ideas
 • No optimal system; iterative improvements will continue
 • Keep the original files
 • Active management essential
    – Move data to new storage media about every 5 years
    – Monitor data integrity with fixity checks
    – Ensure data remains accessible and interpretable
 • Make multiple copies and store separately
 • Modular approach to tools and services
 • Watch for changes in technology and user expectations
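The fixity checks and multiple copies mentioned above fit together naturally: each separately stored copy can be compared against the same list of recorded checksums. The following is a minimal sketch under stated assumptions. The function names (`file_digest`, `check_fixity`, `audit_copies`), the use of SHA-256, and the manifest shape (relative path mapped to expected hex digest) are illustrative choices, not taken from the slides.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_fixity(copy_root: Path, manifest: dict[str, str]) -> dict[str, list[str]]:
    """Compare one stored copy against the recorded checksums."""
    report: dict[str, list[str]] = {"ok": [], "changed": [], "missing": []}
    for rel_path, expected in sorted(manifest.items()):
        target = copy_root / rel_path
        if not target.is_file():
            report["missing"].append(rel_path)
        elif file_digest(target) == expected:
            report["ok"].append(rel_path)
        else:
            report["changed"].append(rel_path)
    return report


def audit_copies(copy_roots: list[Path],
                 manifest: dict[str, str]) -> dict[str, dict[str, list[str]]]:
    """Run the same fixity check on every separately stored copy."""
    return {str(root): check_fixity(root, manifest) for root in copy_roots}
```

When one copy reports a changed or missing file and another reports it as ok, the intact copy can be used to repair the damaged one; that repair step is left out of this sketch.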
Preferred Approaches Are Open
• Open architectures:
  – Allow adding, upgrading and swapping system
    components from different vendors and sources
  – Essential not to be locked into one approach: must be able
    to easily move data to new platform
  – Systems should support interoperability
• Open Standards:
  – Published, widely used, consensus based
  – Can include open source or commercial products
  – Key is transparent understanding of technical basis to
    enable data access, manipulation
Important Non-Technical Issues
• Collaboration: new models needed for institutions,
  communities to work together
• Institutional culture: new policies are needed, leaders need to
  integrate analog and digital management, and staff need
  new skills
• Cost: many variables; economic sustainability is an
  issue
Copyrighted, Private, Confidential
• Exceptions in U.S. copyright law for libraries
  & archives are outdated
  – 3-copy limit
• Societal norms and expectations for privacy
  are shifting
  – especially on the Internet
• Data mining and other techniques allow for
  new kinds of access and new policies
  – Social media, personal information
Digital Forensics
• Tools and approaches for protecting and
  extracting digital information
• Special relevance for all types of digital media,
  personal digital archiving
• Basic principles:
  – Acquire evidence without alteration
  – Do work in accountable, repeatable way
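The two basic principles can be illustrated in software, with the caveat that this is only a sketch: real acquisitions typically rely on hardware write blockers, as the Forensic Life Cycle slide notes. The Python below copies a source file to an image, checks that the source hash is the same before and after, checks that the image matches, and appends an audit record to a log. The function name `acquire`, the JSON-lines log format, and the record fields are all assumptions made for illustration.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def _sha256(path: Path) -> str:
    """SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def acquire(source: Path, image: Path, log_path: Path, operator: str) -> dict:
    """Copy source to image, verify both, and append an audit record."""
    before = _sha256(source)
    # Source is opened read-only; "xb" refuses to overwrite an existing image.
    with source.open("rb") as src, image.open("xb") as dst:
        for chunk in iter(lambda: src.read(1 << 20), b""):
            dst.write(chunk)
    record = {
        "time_utc": datetime.now(timezone.utc).isoformat(),
        "operator": operator,
        "source": str(source),
        "image": str(image),
        "source_sha256_before": before,
        "source_sha256_after": _sha256(source),
        "image_sha256": _sha256(image),
    }
    record["verified"] = (
        record["source_sha256_before"]
        == record["source_sha256_after"]
        == record["image_sha256"]
    )
    with log_path.open("a", encoding="utf-8") as log:
        log.write(json.dumps(record) + "\n")
    return record
```

The matching hashes address "without alteration"; the appended log entry, which records who did what and when, addresses "accountable, repeatable".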
Work with Current & Archaic Data
• Must handle current digital information from
  mobile devices, networks, live data on remote
  computers, flash media, virtual machines,
  cloud services and encrypted sources
• Also deal with older information on all
  imaginable media—8” floppy disks, punch
  cards, ancient hard drives
• Everything to do with computing is either
  obsolete or rapidly headed that way
Personal Archiving
• “Personal papers” increasingly digital
• Social media, web largely driven by personal
  creation
• Personal content characterized by highly
  inconsistent structures, formats, provenance
• High risk of incompleteness, questionable
  authenticity
Forensic Life Cycle (Partial)
• Securing and Evaluating the Scene: ensure safety, confirm computer
  equipment present, secure equipment, identify and protect evidence,
  conduct interviews
• Documenting the Scene: create a permanent record of the scene by
  means of photography and note taking, document condition and location
  of computers
• Evidence Collection: collect computer hardware and media while
  preserving evidential value, obtain analogue evidence such as passwords,
  handwritten notes, computer manuals, printouts
• Forensic Imaging and Copying: e.g. for hard drive – removal of physical
  disk from computer, digital preview and capture using physical or logical
  disk acquisition, with writeblockers, followed by return of original media
  to evidence custodian

Source: Digital Forensics and Preservation, DPC Technology Watch Report 12-03
Summary
• Digital information presents tough issues in terms of
  preservation and access
• Libraries and archives must address these issues even though
  there are no ideal solutions and some open questions
• Initiatives are underway around the world testing different
  approaches to preservation
• There are a number of significant non-technical issues
• Digital preservation is also relevant on the personal level;
  digital forensics is an emerging sub-specialty
For More Information: A Partial List
• Digital preservation: an introduction, UKOLN, http://ow.ly/hpoWr
• An Introduction to Digital Preservation, JISC Digital Media,
http://ow.ly/hpp7A
• Curation Reference Manual, Digital Curation Centre, http://ow.ly/hppeR
• Digital Preservation Handbook, Digital Preservation Coalition,
http://ow.ly/hppk2
• Digital Preservation Management Tutorial, Inter-university Consortium for
Political and Social Research, University of Michigan, http://ow.ly/hpprU
• Harnessing the Power of Digital Data for Science and Society, Report of the
Interagency Working Group on Digital Data to the Committee on Science of
the National Science and Technology Council, http://ow.ly/hppxC
• International Study on the Impact of Copyright Law on Digital Preservation,
Library of Congress, JISC, OAK Law, SURFfoundation, http://ow.ly/hppBs
• National Digital Information Infrastructure and Preservation Program,
Library of Congress, http://ow.ly/hppHP
For More Information: A Partial List (2)
• LIFE3: A Predictive Costing Tool For Digital Collections, Life Cycle Information
for E Literature, University College London Library Services and the British
Library, http://ow.ly/hpoI7
• Open Planets Foundation, http://ow.ly/htqEw
• Preserving Moving Pictures and Sound, DPC Technology Watch Report 12-01
March 2012, http://ow.ly/hoYQx
• Digital Forensics and Preservation, DPC Technology Watch Report 12-03
November 2012, http://ow.ly/hoZiW
• Digital Forensics and Born Digital Content in Cultural Heritage Collections,
http://ow.ly/hpnn3
• Library of Congress digital preservation blog, The Signal, http://ow.ly/hpq0F
• National Digital Stewardship Alliance, Digital Preservation Glossary,
http://ow.ly/hua7X
