Best Practices for Managing Born Digital Content



Webinar presented for WiLS by Emily Pfotenhauer, Recollection Wisconsin Program Manager, June 24, 2014. Based on information from the Demystifying Born Digital reports from OCLC Research and the Digital Preservation Education and Outreach (DPOE) curriculum developed by the Library of Congress.



  2. 2.  Emily Pfotenhauer Recollection Wisconsin Program Manager, WiLS 608-616-9756  Slides and links: BEST PRACTICES FOR MANAGING BORN DIGITAL CONTENT
  3. 3. al.html
  4. 4. The mission of the DPOE program of the Library of Congress is to encourage individuals and organizations to actively preserve their digital content, building on a collaborative network of instructors, contributors, and institutional partners. DIGITAL PRESERVATION OUTREACH AND EDUCATION
  5. 5. identify select store protect manage provide DPOE Modules for Managing Digital Content Over Time
  6. 6. WHAT IS DIGITAL CONTENT?  Digital content is any material that is published or distributed in a digital form, including text, data, sound recordings, photographs and images, motion pictures, and software.  Digital materials created from analog sources  Born-digital materials  Digital materials you currently have or create – or expect to have – that you want to preserve.
  7. 7.  Born-digital resources are items created and managed in digital form.  Digital photographs  Digital documents  Digital manuscripts  Harvested web content  Electronic records  Data sets  Digital art  Digital media publications  Defining “Born Digital,” Ricky Erway, OCLC Research ns/borndigital.pdf DEFINING “BORN DIGITAL”
  8. 8.  Everyone is  creating digital content  distributing digital content  using digital content  And we are responsible for managing digital content DIGITAL REALITY IN 2014
  9. 9. WHAT’S THE PROBLEM?  Increasing amounts of digital assets are arriving on our doorstep  The digital assets arrive in all file formats and on all types of media  Time sensitive: the longer we or our donors wait, the greater the chance that something will be unreadable
  10. 10. Who takes the lead? What can I do? Where do I start? Too technical (I don’t understand...) Too daunting (I don’t have time...) WHAT ARE THE CHALLENGES?
  11. 11. Digital preservation combines policies, strategies and actions to ensure access to reformatted and born digital content regardless of the challenges of media failure and technological change. The goal of digital preservation is the accurate rendering of authenticated content over time. Working group on Defining Digital Preservation ALA Annual Conference, 6/24/2007 DIGITAL PRESERVATION
  12. 12. Digital materials on physical media (CDs, flash drives, floppy disks, etc.) have been stored along with other collection materials without having been copied, preserved, or made accessible. A TYPICAL SCENARIO
  14. 14.  Do no harm  Don’t do anything that prevents future action and use  Take action  Document what you do FIRST STEPS: FOUR ESSENTIAL PRINCIPLES
  15. 15. Identifying content is a first step in planning for current and future preservation needs. Ask: what content do I have, will I have, might I have, must I have? An inventory is the best way to identify what content you have now and to raise awareness in your institution. DPOE MODULE 1: IDENTIFY
  16. 16.  Good preservation decisions are based on an understanding of the possible content to be preserved  Not all digital content can or should be preserved  Preservation requires an explicit commitment of resources WHY DO WE IDENTIFY CONTENT?
  17. 17. 1. Identify and locate existing holdings. 2. Count and describe digital media within each collection. 3. Remove media from collection (retain order with photographs or separator sheets). 4. Assign inventory number to each physical piece. 5. Record anything that is known about the hardware, operating systems, and software used to create the files. 6. Calculate total amount of data (estimate). 7. Re-house physical media in suitable storage. FIRST STEPS: CREATE AN INVENTORY
  18. 18.  Medium (6 CDs, 1 hard drive)  Format (pdfs, docs)  File Size (be consistent - MB, GB or TB)  Identifying information found on labels such as creator, title, description of contents and dates  Expected future growth, if any COUNT AND DESCRIBE
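A minimal sketch of how the count-and-describe fields above could be recorded, assuming the inventory is kept as a CSV spreadsheet. The column names, inventory numbers, and sample values are hypothetical and only illustrate the idea; they are not part of the DPOE curriculum.

```python
import csv
import io

# Hypothetical column names based on the fields listed on the slide.
FIELDS = ["inventory_number", "medium", "formats", "size_mb", "label_info", "expected_growth"]

def write_inventory(rows, handle):
    """Write inventory rows (dicts keyed by FIELDS) as CSV to an open text handle."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)

def total_size_mb(rows):
    """Estimate the total amount of data, using one consistent unit (MB)."""
    return sum(float(row["size_mb"]) for row in rows)

# Illustrative sample data only.
sample = [
    {"inventory_number": "2014-001-01", "medium": "CD", "formats": "pdf; doc",
     "size_mb": 650, "label_info": "Board minutes", "expected_growth": "none"},
    {"inventory_number": "2014-001-02", "medium": "hard drive", "formats": "jpg",
     "size_mb": 120000, "label_info": "Event photographs", "expected_growth": "none"},
]

buffer = io.StringIO()
write_inventory(sample, buffer)
print(buffer.getvalue())
print("Estimated total (MB):", total_size_mb(sample))
```

Recording sizes in a single unit makes the total in step 6 of the inventory a simple sum.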
  19. 19. Prioritize for further processing based on:  Significance and use of overall collection  Danger of loss of content (degradation) due to age or type of media  Uniqueness – not replicated elsewhere  Quantity of digital content DPOE MODULE 2: SELECT
  20. 20.  Cost: storage may be cheap, management is not…especially over time  Not all digital content may be appropriate for your organization to preserve.  Matching mission to content  Keeping delivery and access manageable and sustainable WHY SELECT CONTENT TO PRESERVE? Log jam on the St. Croix River, 1886 Wisconsin Historical Society WHi-2364
  21. 21. Ask yourself which digital content is  most significant to your organization?  most extensive?  most requested/used?  easiest?  oldest?  newest?  mandated?  at risk? SETTING PRIORITIES Postal workers sorting mail, 1955 Wisconsin Historical Society WHi-36392
  22. 22.  Communication is key, particularly when content comes from external creators  Keep content creators in the conversation: arrange a convenient time to talk with them about your preservation plans; identify a list of materials to review with them; document the results and send them a copy  Sample policy: Minds@UW INCLUDE CONTENT CREATORS
  23. 23. THEN WHAT? Steps for transferring born-digital content from media you can read in-house: 1. Use a “clean” computer. 2. Use a write blocker. 3. Insert source media. 4. Create a disk directory. 5. Copy files from media to the directory. 6. Generate a copy of the directory. 7. Generate and record a checksum. 8. Create a readme file. 9. Copy the directory to trustworthy archival storage. 10. Return the original physical media to storage. 11. Create or update any associated descriptive tool(s).
  24. 24.  Dedicated computer  Regularly scanned with up-to-date antivirus software  Non-networked STEP 1: CLEAN WORKSTATION UW-Madison Archives
  25. 25.  Prevents the computer from altering file content and metadata (e.g., date, creator)  Do not open files until after transfer STEP 2: WRITE BLOCKER
  26. 26.  Do not attempt to open any files.  Examine media for cracks, breaks, etc.  Remove any sticky notes or anything else that could become loose. STEP 3: INSERT SOURCE MEDIA
  27. 27.  Create a directory on the clean machine for the current project.  Within the directory, create sub-directories:  Master Folder (to hold the master copy of the file)  Working Folder (to hold working copies of the master copy)  Documentation Folder (to hold metadata and other information associated with the project) STEP 4: CREATE A DISK DIRECTORY
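The directory layout described above could be created with a short script. This is a sketch under assumptions: the folder names "master", "working", and "documentation" and the project name are hypothetical choices, and the demonstration runs in a temporary location rather than on a real clean workstation.

```python
import tempfile
from pathlib import Path

# Hypothetical folder names mirroring the three sub-directories on the slide.
SUBFOLDERS = ("master", "working", "documentation")

def create_project_directory(root, project_name):
    """Create the project directory and its sub-directories; return the project path."""
    project = Path(root) / project_name
    for name in SUBFOLDERS:
        (project / name).mkdir(parents=True, exist_ok=True)
    return project

# Demonstration in a temporary location.
with tempfile.TemporaryDirectory() as root:
    project = create_project_directory(root, "2014-001_example_collection")
    created = sorted(p.name for p in project.iterdir())
    print(created)
```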
  28. 28.  Copy files from the source media to the master folder  Copy files individually or in groups -OR-  Create a disk image  Disk image = single file containing an authentic copy of a disk’s contents, retaining original metadata and file system structure  After transfer from source media, make a second working copy – ok to open these files STEP 5: COPY FILES
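As a rough illustration of the file-copy option (not disk imaging, which needs dedicated tools), the sketch below copies a folder tree into a master folder and then makes a working copy. It assumes Python 3.8 or later and that the source media is already mounted behind a write blocker; the paths are stand-ins created in a temporary directory. `shutil.copy2` tries to preserve timestamps, but a plain file copy does not retain everything a disk image would.

```python
import shutil
import tempfile
from pathlib import Path

def copy_tree(source, destination):
    """Copy a folder tree, asking shutil.copy2 to preserve timestamps where it can."""
    shutil.copytree(source, destination, copy_function=shutil.copy2, dirs_exist_ok=True)

with tempfile.TemporaryDirectory() as root:
    root = Path(root)
    media = root / "source_media"   # stands in for the mounted, write-blocked media
    master = root / "master"
    working = root / "working"
    (media / "letters").mkdir(parents=True)
    (media / "letters" / "draft.txt").write_text("example content", encoding="utf-8")

    copy_tree(media, master)     # master copy: do not open these files
    copy_tree(master, working)   # working copy: fine to open these

    master_text = (master / "letters" / "draft.txt").read_text(encoding="utf-8")
    working_text = (working / "letters" / "draft.txt").read_text(encoding="utf-8")
    print(master_text == working_text)
```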
  29. 29.  Generate a copy of the disk directory information  File names  File sizes  File extensions  Dates  Store a digital copy in the project documentation folder  Print a copy to keep with the physical collection STEP 6: COPY THE DISK DIRECTORY INFO
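One way to capture the directory information listed above is to walk the master folder and write one row per file to a CSV file. This is a sketch, not a prescribed tool: the column names and the output file name are assumptions, and the "date" recorded here is the file system's last-modified time.

```python
import csv
import tempfile
from datetime import datetime, timezone
from pathlib import Path

def list_directory(folder):
    """Return one row per file: relative path, size in bytes, extension, last-modified time (UTC)."""
    folder = Path(folder)
    rows = []
    for path in sorted(folder.rglob("*")):
        if path.is_file():
            info = path.stat()
            modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc).isoformat()
            rows.append([path.relative_to(folder).as_posix(), info.st_size, path.suffix, modified])
    return rows

def save_listing(rows, csv_path):
    """Write the listing, with a header row, to a CSV file."""
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["name", "size_bytes", "extension", "modified_utc"])
        writer.writerows(rows)

with tempfile.TemporaryDirectory() as root:
    master = Path(root) / "master"
    master.mkdir()
    (master / "report.txt").write_text("hello", encoding="utf-8")

    rows = list_directory(master)
    listing_path = Path(root) / "directory_listing.csv"  # would live in the documentation folder
    save_listing(rows, listing_path)
    listing_lines = listing_path.read_text(encoding="utf-8").splitlines()
    print(listing_lines[0])
```

The resulting CSV can be stored in the documentation folder and printed for the physical collection file.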
  30. 30.  Checksums (aka “hash sums”) are created by programs running an algorithm against the contents of a file. (There are many free utilities that will perform this function for you.)  The resulting checksum is a short sequence of letters and/or numbers that, for practical purposes, uniquely identifies the contents of that file. (think “electronic fingerprint”) STEP 7: RUN CHECKSUMS Unix cksum utility
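For readers who want to see what such a utility does, here is a minimal sketch using Python's standard `hashlib` module. The choice of SHA-256 is an assumption made for this example; the slide shows the Unix `cksum` utility, which uses a different algorithm and produces a different value.

```python
import hashlib
import tempfile
from pathlib import Path

def file_checksum(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Compute a checksum of a file's contents, reading in chunks so large files fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

with tempfile.TemporaryDirectory() as root:
    sample = Path(root) / "sample.txt"
    sample.write_bytes(b"hello")
    checksum = file_checksum(sample)
    print(checksum)  # a 64-character hexadecimal string for SHA-256
```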
  31. 31.  Checksums help maintain the INTEGRITY of your collections because they will tell you if things change over time.  If two files are exactly the same, the checksums of those files will also be exactly the same (generally speaking).  If a file becomes corrupted, degraded or is changed in some way, the next time you run the utility on it, the checksum will change. WHY IS THIS A GOOD THING?
  32. 32.  Things that will NOT affect checksums: moving items from one place to another; changing the file name  Run checksums on the master files when a collection is completed  Set up a schedule to run “verify checks” periodically CHECKSUMS: THINGS TO REMEMBER
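A periodic "verify check" can be sketched as two steps: record a checksum for every master file once, then recompute and compare later. The function names and the use of SHA-256 are assumptions for illustration. Because this sketch looks files up by path, a renamed file is reported as missing even though its checksum is unchanged.

```python
import hashlib
import tempfile
from pathlib import Path

def sha256_of(path):
    """Checksum of a file's contents (the file name plays no part in the calculation)."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()

def build_manifest(folder):
    """Map each file's relative path to its checksum."""
    folder = Path(folder)
    return {p.relative_to(folder).as_posix(): sha256_of(p)
            for p in sorted(folder.rglob("*")) if p.is_file()}

def verify_manifest(folder, manifest):
    """Return (changed, missing): paths whose checksum differs, and paths no longer present."""
    folder = Path(folder)
    changed, missing = [], []
    for rel_path, expected in manifest.items():
        target = folder / rel_path
        if not target.is_file():
            missing.append(rel_path)
        elif sha256_of(target) != expected:
            changed.append(rel_path)
    return changed, missing

with tempfile.TemporaryDirectory() as root:
    master = Path(root)
    (master / "a.txt").write_text("first file", encoding="utf-8")
    (master / "b.txt").write_text("second file", encoding="utf-8")
    manifest = build_manifest(master)
    unchanged_result = verify_manifest(master, manifest)

    # Renaming does not alter the contents, so the checksum stays the same.
    (master / "b.txt").rename(master / "b_renamed.txt")
    renamed_checksum = sha256_of(master / "b_renamed.txt")

    # Editing the contents does change the checksum.
    (master / "a.txt").write_text("first file, edited", encoding="utf-8")
    later_result = verify_manifest(master, manifest)
    print(unchanged_result, later_result)
```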
  33. 33.  Leave yourself (and others) some breadcrumbs  Brief description of contents, any retention schedule, naming conventions, steps taken in transfer  Store the file in the project documentation folder and store a printout of the readme file with the physical collection materials STEP 8: CREATE A README FILE
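The readme can be a plain text file. The sketch below assembles one from the elements named above; the exact headings, the function name, and the sample values are hypothetical, and an institution would adapt them to its own documentation practice.

```python
from datetime import date

def build_readme(description, retention, naming_convention, transfer_steps, transfer_date=None):
    """Assemble the text of a plain-text readme from the elements named on the slide."""
    transfer_date = transfer_date or date.today()
    lines = [
        "README",
        f"Transfer date: {transfer_date.isoformat()}",
        f"Description of contents: {description}",
        f"Retention schedule: {retention}",
        f"Naming conventions: {naming_convention}",
        "Steps taken in transfer:",
    ]
    lines += [f"  {number}. {step}" for number, step in enumerate(transfer_steps, start=1)]
    return "\n".join(lines) + "\n"

# Illustrative values only.
readme_text = build_readme(
    description="Files copied from 6 CDs in an example collection",
    retention="Permanent",
    naming_convention="inventory number, underscore, original file name",
    transfer_steps=["Copied files to master folder", "Generated checksums"],
    transfer_date=date(2014, 6, 24),
)
print(readme_text)
```

The returned text can be written to the documentation folder and printed for the physical collection materials.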
  34. 34.  Copy the directories containing the master files and project documentation to trustworthy archival storage  Store a second copy of the files in a different physical location  May delete working files at this time STEP 9: TRANSFER TO SECURE LOCATION
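Before deleting working files, it may be worth confirming that each stored copy matches the source. The following sketch copies a folder to two destinations and compares checksums; the destination paths are temporary stand-ins for real archival storage, SHA-256 is an assumed choice, and Python 3.8 or later is assumed.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path

def folder_checksums(folder):
    """Map each file's relative path to the SHA-256 checksum of its contents."""
    folder = Path(folder)
    result = {}
    for path in sorted(folder.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256()
            with open(path, "rb") as handle:
                for chunk in iter(lambda: handle.read(1024 * 1024), b""):
                    digest.update(chunk)
            result[path.relative_to(folder).as_posix()] = digest.hexdigest()
    return result

def copy_and_verify(source, destinations):
    """Copy the source folder to each destination and report whether the checksums match."""
    expected = folder_checksums(source)
    outcome = {}
    for destination in destinations:
        shutil.copytree(source, destination, dirs_exist_ok=True)
        outcome[str(destination)] = folder_checksums(destination) == expected
    return outcome

with tempfile.TemporaryDirectory() as root:
    root = Path(root)
    project = root / "project"
    (project / "master").mkdir(parents=True)
    (project / "master" / "file.txt").write_text("archival content", encoding="utf-8")

    # Stand-ins for archival storage and a second, off-site location.
    outcome = copy_and_verify(project, [root / "archive_a", root / "archive_b"])
    all_verified = all(outcome.values())
    print(all_verified)
```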
  35. 35. STEP 10: RETURN ORIGINAL TO STORAGE  Return original source media to appropriate storage -OR-  Destroy the originals using a secure method
  36. 36.  Inventory as well as any finding aid, collection-level record and/or accession record  Include steps taken during transfer and the current location(s) of the files STEP 11: CREATE OR UPDATE ANY ASSOCIATED DESCRIPTIVE TOOL(S)
  37. 37.  Do no harm  Don’t do anything that prevents future action and use  Take action  Document what you do REVIEW: FOUR ESSENTIAL PRINCIPLES
  38. 38.  The Signal: Library of Congress digital preservation blog  Minnesota State Archives – Electronic Records Management Resources onicrecords.php  Practical E-Records blog  Digital Curation Exchange  Digital Curation Bibliography FURTHER RESOURCES
  39. 39.  Emily Pfotenhauer Recollection Wisconsin Program Manager, WiLS 608-616-9756  Slides and links: THANK YOU!