Webinar presented for WiLS by Emily Pfotenhauer, Recollection Wisconsin Program Manager, June 24, 2014. Based on information from the Demystifying Born Digital reports from OCLC Research and the Digital Preservation Outreach and Education (DPOE) curriculum developed by the Library of Congress.
2. Emily Pfotenhauer
Recollection Wisconsin Program Manager, WiLS
emily@wils.org
608-616-9756
Slides and links:
http://recollectionwisconsin.org/borndigital
BEST PRACTICES FOR MANAGING
BORN DIGITAL CONTENT
4. The mission of the DPOE program of the Library of
Congress is to encourage individuals and
organizations to actively preserve their digital
content, building on a collaborative network of
instructors, contributors, and institutional partners.
http://www.digitalpreservation.gov/education/
DIGITAL PRESERVATION
OUTREACH AND EDUCATION
6. WHAT IS DIGITAL CONTENT?
Digital content is any material that is published or
distributed in a digital form, including text, data, sound
recordings, photographs and images, motion pictures,
and software.
Digital materials created from analog sources
Born-digital materials
Digital materials you currently have or create – or expect
to have – that you want to preserve.
7. Born-digital resources are items created and managed in
digital form.
Digital photographs
Digital documents
Digital manuscripts
Harvested web content
Electronic records
Data sets
Digital art
Digital media publications
Defining “Born Digital,” Ricky Erway, OCLC Research
http://oclc.org/content/dam/research/activities/hiddencollections/borndigital.pdf
DEFINING “BORN DIGITAL”
8. Everyone is
creating digital content
distributing digital content
using digital content
And we are responsible for managing digital content
DIGITAL REALITY IN 2014
http://digitalbevaring.dk
9. WHAT’S THE PROBLEM?
Increasing amounts of digital assets are arriving on our
doorstep
The digital assets arrive in all file formats and on all types of media
Time sensitive -- the longer we wait or the longer our
donors wait, the greater the chance that something will be
unreadable
10. Who takes the lead?
What can I do?
Where do I start?
Too technical
(I don’t understand...)
Too daunting
(I don’t have time...)
WHAT ARE THE CHALLENGES?
http://digitalbevaring.dk
11. Digital preservation combines policies, strategies and
actions to ensure access to reformatted and born digital
content regardless of the challenges of media failure and
technological change. The goal of digital preservation is
the accurate rendering of authenticated content over time.
Working group on Defining Digital Preservation
ALA Annual Conference, 6/24/2007
http://www.ala.org/alcts/resources/preserv/defdigpres0408
DIGITAL PRESERVATION
12. Digital materials on
physical media (CDs,
flash drives, floppy
disks, etc.) have been
stored along with
other collection
materials without
having been copied,
preserved, or made
accessible.
A TYPICAL SCENARIO
14. Do no harm
Don’t do anything that
prevents future action and use
Take action
Document what you do
FIRST STEPS:
FOUR ESSENTIAL PRINCIPLES
15. Identifying content is a first step to planning for current
and future preservation needs
Ask: what content
do I have,
will I have,
might I have,
must I have?
An inventory is the best way to identify
what content you have now –
and raise awareness in your institution.
DPOE MODULE 1: IDENTIFY
http://digitalbevaring.dk
16. Good preservation decisions are based on an
understanding of the possible content to be preserved
Not all digital content can or should be preserved
Preservation requires an explicit commitment of
resources
WHY DO WE IDENTIFY CONTENT?
17. 1. Identify and locate existing holdings.
2. Count and describe digital media within each
collection.
3. Remove media from collection (retain order with
photographs or separator sheets).
4. Assign inventory number to each physical piece.
5. Record anything that is known about the hardware,
operating systems, and software used to create the
files.
6. Calculate total amount of data (estimate).
7. Re-house physical media in suitable storage.
FIRST STEPS: CREATE AN INVENTORY
18. Medium (6 CDs, 1 hard drive)
Format (pdfs, docs)
File Size (be consistent - MB, GB or TB)
Identifying information found on labels such as creator,
title, description of contents and dates
Expected future growth, if any
COUNT AND DESCRIBE
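The inventory fields above could be captured in a simple spreadsheet-style file. The following is a minimal sketch in Python, not part of the DPOE curriculum; the field names, the `MediaItem` class, and the inventory numbering scheme are all assumptions made for illustration.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields


@dataclass
class MediaItem:
    """One physical piece of media. All field names are hypothetical."""
    inventory_number: str          # assumed local numbering scheme
    collection: str
    medium: str                    # e.g. "CD", "hard drive"
    formats: str                   # e.g. "pdf; doc"
    size_mb: float                 # one unit throughout, here megabytes
    label_info: str                # creator, title, dates found on labels
    expected_growth: str = "none"


def write_inventory(items, out):
    """Write the inventory as CSV to any open text file object."""
    writer = csv.DictWriter(out, fieldnames=[f.name for f in fields(MediaItem)])
    writer.writeheader()
    for item in items:
        writer.writerow(asdict(item))


def total_size_mb(items):
    """Estimate the total amount of data across all inventoried media."""
    return sum(item.size_mb for item in items)
```

Keeping the size in a single unit makes the total a plain sum, which is the reason for the "be consistent" advice above.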
19. Prioritize for further processing based on:
Significance and use of overall collection
Danger of loss of content (degradation) due to age
or type of media
Uniqueness – not replicated elsewhere
Quantity of digital content
DPOE MODULE 2: SELECT
20. Cost: storage may be
cheap, management is
not…especially over
time
Not all digital content
may be appropriate for
your organization to
preserve.
Matching mission to
content
Keeping delivery and
access manageable and
sustainable
WHY SELECT CONTENT TO PRESERVE?
Log jam on the St. Croix River, 1886
Wisconsin Historical Society WHi-2364
21. Ask yourself which digital content is
most significant to your organization?
most extensive?
most requested/used?
easiest?
oldest?
newest?
mandated?
at risk?
SETTING PRIORITIES
Postal workers sorting mail, 1955
Wisconsin Historical Society WHi-36392
22. Communication is key, particularly when content
comes from external creators
Keep content creators in the conversation
Arrange a convenient time for them to talk about
your preservation plans
Identify list of materials to review with them
Document the results and send them a copy
Sample policy: Minds@UW
http://uwdcc.library.wisc.edu/minds/faq.shtml
INCLUDE CONTENT CREATORS
23. THEN WHAT?
Steps for transferring born-digital content from media you can
read in-house:
1. Use a “clean” computer.
2. Use a write blocker.
3. Insert source media.
4. Create a disk directory.
5. Copy files from media to the directory.
6. Generate a copy of the directory.
7. Generate and record a checksum.
8. Create a readme file.
9. Copy the directory to trustworthy archival storage.
10. Return the original physical media to storage.
11. Create or update any associated descriptive tool(s).
25. Prevents the computer
from altering file content
and metadata (e.g. date,
creator)
Do not open files until
after transfer
STEP 2: WRITE BLOCKER
https://www.flickr.com/photos/joncrel/6285946610/
26. Do not attempt to open any
files.
Examine media for cracks,
breaks, etc.
Remove any sticky notes or
anything else that could
become loose.
STEP 3: INSERT SOURCE MEDIA
bitcurator.net
27. Create a directory on the clean machine for the
current project.
Within the directory, create sub-directories:
Master Folder (to hold the master copy of the file)
Working Folder (to hold working copies of the master
copy)
Documentation Folder (to hold metadata and other
information associated with the project)
STEP 4: CREATE A DISK DIRECTORY
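The folder layout described above can be created by hand or with a short script. This is a minimal Python sketch; the lowercase folder names and the `create_project_dirs` helper are assumptions, not a prescribed naming convention.

```python
from pathlib import Path

# Hypothetical folder names mirroring the master, working and
# documentation folders described on the slide.
SUBFOLDERS = ("master", "working", "documentation")


def create_project_dirs(root, project_name):
    """Create the project directory and its three sub-directories.

    Safe to run more than once: existing folders are left untouched.
    """
    project = Path(root) / project_name
    for name in SUBFOLDERS:
        (project / name).mkdir(parents=True, exist_ok=True)
    return project
```

For example, `create_project_dirs("D:/transfers", "smith_papers_2014")` would create the three folders under a hypothetical `D:/transfers/smith_papers_2014` directory.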
29. Copy files from the source media to the master folder
Copy files individually or in groups
-OR-
Create a disk image
Disk image = single file containing an authentic copy of a
disk’s contents, retaining original metadata and file
system structure
After transfer from source media, make a second working
copy – ok to open these files
STEP 5: COPY FILES
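For the file-by-file option, a script can copy everything without opening any file in an application. The sketch below is one possible approach in Python, assuming the source media is mounted read-only behind a write blocker; `copy_tree` is a hypothetical helper name. It does not create a disk image, which requires dedicated imaging tools.

```python
import shutil
from pathlib import Path


def copy_tree(source, destination):
    """Copy every file under source into destination.

    shutil.copy2 copies file contents and also tries to keep
    timestamps, as far as the operating system allows.
    """
    source, destination = Path(source), Path(destination)
    copied = []
    for path in sorted(source.rglob("*")):
        if path.is_file():
            target = destination / path.relative_to(source)
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, target)
            copied.append(target)
    return copied
```

Running it once from the source media to the master folder, and a second time from the master folder to the working folder, gives the second working copy that is safe to open.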
30. Generate a copy of the disk directory information
File names
File sizes
File extensions
Dates
Store a digital copy in the project documentation folder
Print a copy to keep with the physical collection
STEP 6: COPY THE DISK DIRECTORY INFO
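The directory information listed above (names, sizes, extensions, dates) can be gathered automatically. This is a minimal Python sketch; the column names and the helper names `list_directory` and `save_listing` are assumptions for illustration.

```python
import csv
from datetime import datetime, timezone
from pathlib import Path

COLUMNS = ["name", "size_bytes", "extension", "modified"]


def list_directory(root):
    """Return one row per file: relative name, size, extension, date."""
    root = Path(root)
    rows = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            info = path.stat()
            modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
            rows.append({
                "name": path.relative_to(root).as_posix(),
                "size_bytes": info.st_size,
                "extension": path.suffix.lower(),
                "modified": modified.isoformat(),
            })
    return rows


def save_listing(rows, out_path):
    """Save the listing as a CSV file that can be stored and printed."""
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=COLUMNS)
        writer.writeheader()
        writer.writerows(rows)
```

The resulting CSV file can go in the documentation folder, and a printout of it can be kept with the physical collection.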
31. Checksums (aka “hash sums”) are created by programs
running an algorithm against the contents of a file.
(There are many free utilities that will perform this
function for you.)
The resulting checksum
is a short sequence of letters
and/or numbers that, for practical
purposes, uniquely identifies that file.
(think “electronic fingerprint”)
STEP 7: RUN CHECKSUMS
Unix cksum utility
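As an illustration of what a checksum utility does, here is a minimal Python sketch using the standard `hashlib` module. The choice of SHA-256 is an assumption made for this example; the slide shows the Unix cksum utility, which uses a different algorithm, and `file_checksum` is a hypothetical helper name.

```python
import hashlib


def file_checksum(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Return the checksum of a file as a string of hex characters.

    The file is read in chunks so that very large files do not
    have to fit in memory all at once.
    """
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Whichever algorithm is used, it should be recorded alongside the checksum so the same one can be used for later checks.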
32. Checksums help maintain the INTEGRITY of your
collections because they will tell you if things change
over time.
If two files are exactly the same, the checksums of those
files will also be exactly the same (generally speaking).
If a file becomes corrupted, degraded or is changed in
some way, the next time you run the utility on it, the
checksum will change.
WHY IS THIS A GOOD THING?
33. Things that will NOT affect checksums
Moving items from one place to another
Changing the file name
Run checksums on the master files when a collection is completed
Set up a schedule to run “verify checks” periodically
CHECKSUMS: THINGS TO REMEMBER
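A periodic "verify check" amounts to recomputing checksums and comparing them with the recorded ones. The following Python sketch is one way this could look, assuming SHA-256 and an in-memory manifest; the helper names are hypothetical, and a real workflow would save the manifest to the documentation folder.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Checksum of a file's contents only (its name is not included)."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Record a checksum for every file under root."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def verify_manifest(root, manifest):
    """Return (name, problem) pairs for missing or changed files."""
    root = Path(root)
    problems = []
    for name, expected in manifest.items():
        path = root / name
        if not path.is_file():
            problems.append((name, "missing"))
        elif sha256_of(path) != expected:
            problems.append((name, "changed"))
    return problems
```

Because the checksum depends only on file contents, a renamed or moved file still has the same checksum. This sketch looks files up by name, however, so a renamed file would be reported as missing until the manifest is updated.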
34. Leave yourself (and others) some breadcrumbs
Brief description of contents, any retention
schedule, naming conventions, steps taken in
transfer
Store the file in the project documentation folder
and store a printout of the readme file with the
physical collection materials
STEP 8: CREATE A README FILE
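A readme file can be typed by hand in any text editor; a script simply makes the layout consistent from project to project. This Python sketch is illustrative only: the section headings, the `README.txt` file name, and the `write_readme` helper are assumptions.

```python
from datetime import date
from pathlib import Path


def write_readme(doc_folder, description, retention, naming, steps, when=None):
    """Write a plain-text readme into the documentation folder."""
    when = when or date.today()
    lines = [
        f"Transfer date: {when.isoformat()}",
        f"Description of contents: {description}",
        f"Retention schedule: {retention}",
        f"Naming conventions: {naming}",
        "Steps taken in transfer:",
    ]
    # Number the steps so the order of work is clear to later readers.
    lines += [f"  {number}. {step}" for number, step in enumerate(steps, start=1)]
    path = Path(doc_folder) / "README.txt"
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return path
```

Plain text is used here because it can be read on almost any computer and prints cleanly for the copy kept with the physical materials.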
35. Copy the directories containing the master files and
project documentation to trustworthy archival storage
Store a second copy of the files in a different physical
location
May delete working files at this time
STEP 9 : TRANSFER TO SECURE LOCATION
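Copying to storage is a good moment to confirm that nothing changed in transit. The sketch below, in Python, copies a project folder to two destinations and compares checksums before anything is deleted; the use of SHA-256 and the helper names are assumptions for illustration.

```python
import hashlib
import shutil
from pathlib import Path


def checksums(root):
    """Map each file's relative path under root to its SHA-256 checksum."""
    root = Path(root)
    result = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            result[path.relative_to(root).as_posix()] = hashlib.sha256(
                path.read_bytes()  # reads the whole file; fine for a sketch
            ).hexdigest()
    return result


def copy_and_verify(source, destinations):
    """Copy source to every destination and check each copy's checksums."""
    expected = checksums(source)
    for destination in destinations:
        shutil.copytree(source, destination, dirs_exist_ok=True)
        actual = checksums(destination)
        bad = [name for name, value in expected.items() if actual.get(name) != value]
        if bad:
            raise RuntimeError(f"Copy to {destination} failed for: {bad}")
    return expected
```

Working files would only be deleted after `copy_and_verify` returns without raising an error for both storage locations. The `dirs_exist_ok` argument requires Python 3.8 or later.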
36. STEP 10: RETURN ORIGINAL TO STORAGE
Return original source media to appropriate storage
-OR-
Destroy the originals using a secure method
37. Inventory as well as any
finding aid, collection-level
record and/or accession
record
Include steps taken during
transfer and the current
location(s) of the files
STEP 11: CREATE OR UPDATE ANY
ASSOCIATED DESCRIPTIVE TOOL(S)
http://digitalbevaring.dk
38. Do no harm
Don’t do anything that
prevents future action and use
Take action
Document what you do
REVIEW:
FOUR ESSENTIAL PRINCIPLES
40. The Signal: Library of Congress digital preservation blog
http://blogs.loc.gov/digitalpreservation/
Minnesota State Archives – Electronic Records
Management Resources
http://www.mnhs.org/preserve/records/electronicrecords.php
Practical E-Records blog
http://e-records.chrisprom.com
Digital Curation Exchange
http://digitalcurationexchange.org
Digital Curation Bibliography
http://digital-scholarship.org/dcbw/dcb.htm
FURTHER RESOURCES
41. Emily Pfotenhauer
Recollection Wisconsin Program Manager, WiLS
emily@wils.org
608-616-9756
Slides and links:
http://recollectionwisconsin.org/borndigital
THANK YOU!