Getting Bits off Disks: Using open source tools to stabilize and prepare born-digital materials for long-term preservation
Sam Meister, University of Montana
Best Practices Exchange 2013, November 13, 2013
  • Sam – Overview of acquisition steps
  • Sam – discuss donor survey / site visit. Types of information captured; purpose of collecting information for appraisal / selection decision-making. Each potential acquisition is a new case to be investigated, with high potential for various content and format types. Donor survey is a tool to capture initial information to assist in determining the feasibility of acquiring materials. Caveat / disclaimer: not all acquisition scenarios will allow for use of the donor survey tool before the acquisition decision is made.
  • Current = Word document
  • Future = Web form that exports XML or tabular data, to allow for integration / interoperability with collection management / descriptive system
  • Sam – discuss feasibility assessment process. Series of questions to assist in acquisition decision-making. Analyzing a sample set of files / data may be required to determine answers. Ultimate question is resource-based: cost / benefit analysis. New content / media / format types may require new software / hardware to acquire / accession materials. Jenny: briefly review content; introduce customer to accession process; frequent repeat or reluctant customers.
  • Sam – discuss transfer process. Two basic transfer types: physical media and/or network transfer; agreement / forms. Jenny: transferring within Windows environment (using a server share to isolate files); calculating and comparing checksums; transfer agreement completed. Financial issues.
  • Current = Word document. Digital Materials Transfer document functions as appendix to deed of gift and documents details of the transfer / acquisition process.
  • Future = Web form that exports XML or tabular data, to allow for integration / interoperability with collection management / descriptive system
  • Sam – overview of accession steps
  • Sam – provide overview of current born digital workstation: media drives; use of digital forensics hardware and software; Born Digital Log (record / document accession process in Access database). Discuss disk imaging purpose / function.
  • Sam – provide overview of current born digital workstation: media drives; use of digital forensics hardware and software; Born Digital Log (record / document accession process in Access database). Discuss disk imaging purpose / function. Jenny: when we do this, why we mostly don’t.
  • Sam – born digital workstation version 2
  • Sam – born digital workstation version 2
  • Sam – 3.5 floppy drive
  • Sam – 5.25 floppy drive
  • Sam – zip drive
  • Sam – write blockers
  • Sam – overview of disk imaging steps
  • Sam – give overview of Born Digital Log to document accession process
  • Sam – discuss purpose of photographing media: documenting label text and artifact characteristics. May or may not continue this step / practice in the future.
  • Sam – 5.25 floppy drive
  • Kryoflux hardware and software as option to capture raw bitstream from unrecognized / unknown filesystems
  • Sam – overview of analysis steps
  • Sam – discuss tools used for initial analysis BitCurator
  • Sam – discuss use of fiwalk to extract / generate filesystem metadata for disk images
  • Sam – discuss use of fiwalk to extract / generate filesystem metadata for disk images
  • Sam – discuss use of fiwalk to extract / generate filesystem metadata for disk images
  • Sam – discuss processing / preparing of data / files and metadata for storage. Jenny: Currently, AIP is produced manually and stored on Windows drive. Will need to revise process with TRIM. Could make use of Archivematica, but waiting until after ERMS implementation.
  • Sam – discuss processing / preparing of data / files and metadata for storage. Archivematica.
  • Sam – Archivematica transfer steps
  • Sam – overview of Archivematica ingest steps
  • Sam – Archivematica storage of AIP
  • Sam – discuss current and potential future uses of Archivematica. Describe continued use in relation to overall digital preservation program development.
  • Sam – describe general A&D strategy. Basic steps are the same for analog and digital materials.
  • Sam – describe current and future A&D process. Current = in development. Future = dependent on decision to implement an ACMS.
  • Sam – overview of major lessons learned to date
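
The transfer notes above mention calculating and comparing checksums when files move from the donor to the archives. The following is a minimal sketch of that step, not the tooling actually used in this workflow. It assumes SHA-256 as the algorithm and two directory trees (the donor's source and the archives' copy); the function names `sha256_of`, `build_manifest`, and `compare_manifests` are hypothetical and chosen only for illustration.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> Dict[str, str]:
    """Map each file's path (relative to root) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_manifests(source: Dict[str, str], copy: Dict[str, str]) -> Dict[str, List[str]]:
    """Report files that are missing, unexpected, or changed in the copy."""
    shared = source.keys() & copy.keys()
    return {
        "missing": sorted(set(source) - set(copy)),
        "unexpected": sorted(set(copy) - set(source)),
        "changed": sorted(k for k in shared if source[k] != copy[k]),
    }
```

A transfer could be treated as verified when all three lists in the report are empty. The source manifest can also be kept alongside the transfer agreement as a record of what was received.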

    1. Getting Bits off Disks: Using open source tools to stabilize and prepare born-digital materials for long-term preservation. Sam Meister, University of Montana. Best Practices Exchange 2013, November 13, 2013.
    2. Born-Digital Workflow: Acquisition; Accession; Arrangement & Description; Discovery & Access.
    3. Acquisition; Accession
    4. Acquisition Process: Donor Survey; Feasibility Assessment; Transfer Agreement.
    5. Donor Survey: Creation; Context; Organization; Privacy & Security; Storage; Technical; Transfer Options.
    6. Current: Donor survey
    7. Future: Drupal Web Form; XML / CSV
    8. Feasibility Assessment: Do we have resources to feasibly acquire, preserve, and provide access to the digital materials?
    9. Transfer: Physical Media; Network; ARCHIVES
    10. Current
    11. Future: Drupal Web Form; XML / CSV
    12. Accession
    13. Accession Process: Disk Image Media; Initial Analysis; Produce AIP.
    14. Data Transfer. Hardware: 3.5" Floppy Drive; 5.25" Floppy Drive; Zip Drive; CD / DVD Drive; USB Write-Blocker; SATA / IDE Write-Blocker. Software: FTK Imager; Guymager; FC5025.
    15. Disk Imaging: “A single file or storage device containing the complete contents and structure representing a data storage medium or device, such as a hard drive, tape drive, floppy disk, CD/DVD/BD, or USB flash drive”
    16. Disk Imaging: Born Digital Workstation 1.0
    17. Disk Imaging: Born Digital Workstation 2.0
    18. Disk Imaging: Get Media; Write-Protect Media; Assign Identifier; Create Image; Photograph Media; Export Files; Record Characteristics; Virus Scan.
    19. FC5025 Disk Image and Browse
    20. FTK Imager
    21. Issue: Unknown / Unrecognized Filesystems
    22. Options: Kryoflux
    23. Initial Analysis: Extract Metadata; Identify Restricted Info; Identify Duplicates; Generate Reports.
    24. Initial Analysis: Hardware; Software (BitCurator, fiwalk, Bulk Extractor).
    25. “an effort to build, test, and analyze systems and software for incorporating digital forensics methods into the workflows of a variety of collecting institutions”
    26. BitCurator: fiwalk
    27. BitCurator: bulk_extractor
    28. BitCurator: Reports
    29. Produce AIP: AIP = Archival Information Package
    30. Produce AIP: Hardware; Software (Archivematica).
    31. “a free and open-source digital preservation system that is designed to maintain standards-based, long-term access to collections of digital objects”
    32. Produce AIP: Archivematica. Current: using version 0.10 on dedicated workstation (testing as virtual server). Future: install version 1.0 on server with multiple client nodes (workstations).
    33. Accession; Arrangement & Description
    34. A&D: Prepare; Develop Processing Plan; Implement Processing Plan.
    35. A&D. Current: • Integrate Born Digital materials into existing A&D process / tools (mix of Excel, Word, XMetal XML editor). Future: • Determine tools needed for reviewing content (data visualization) • Integrate Born Digital materials into collection management system
    36. Born-Digital Workflow: Acquisition; Accession; Arrangement & Description; Discovery & Access.
    37. Lessons Learned • Embrace iterative approach (use what you have and get what you need when you need it) • Capture as much metadata as possible (descriptive, structural, administrative) • Start with workflow requirements (what needs to be done) then test tools (what things will get it done) • Build flexibility into system (may not always be ideal scenarios)
    38. Open Source - Issues • May require specific IT environment (Linux) • Tools likely to change quickly • User interfaces / experience may be simple • Will need ongoing support from IT / Systems staff
    39. Open Source - Benefits • Limited initial resources needed to install and test • Provides opportunity to engage systems / IT in new areas • Designed and developed in collaboration with archival community • Direct communication channels to contribute to / modify development roadmap • Quickly build initial standards-compliant workflow
    40. Resources. FC5025 Disk Image: http://www.deviceside.com/fc5025.html. Kryoflux: http://www.kryoflux.com/. BitCurator: http://www.bitcurator.net/. Archivematica: https://www.archivematica.org/wiki/Main_Page
    41. Thanks! sam.meister@mso.umt.edu @samalanmeister
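
The Initial Analysis slides pair metadata extraction with fiwalk and duplicate identification. As a rough sketch of how the two could connect, the Python below groups files by hash digest from a DFXML-style report, the XML format fiwalk produces. It assumes the report contains `fileobject` elements with `filename` and `hashdigest` children. The sample XML is a simplified, invented fragment rather than real fiwalk output, and the function names and digest values are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from typing import Dict, List

# Invented, simplified DFXML-style sample; real reports carry many more fields.
SAMPLE_DFXML = """
<dfxml>
  <volume>
    <fileobject>
      <filename>letters/draft1.doc</filename>
      <hashdigest type="md5">11111111111111111111111111111111</hashdigest>
    </fileobject>
    <fileobject>
      <filename>letters/draft1_copy.doc</filename>
      <hashdigest type="md5">11111111111111111111111111111111</hashdigest>
    </fileobject>
    <fileobject>
      <filename>photos/scan01.tif</filename>
      <hashdigest type="md5">22222222222222222222222222222222</hashdigest>
    </fileobject>
  </volume>
</dfxml>
"""


def local_name(tag: str) -> str:
    """Strip an XML namespace prefix such as '{uri}fileobject'."""
    return tag.rsplit("}", 1)[-1]


def files_by_digest(dfxml_text: str, digest_type: str = "md5") -> Dict[str, List[str]]:
    """Group filenames by the digest of the requested type."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for element in ET.fromstring(dfxml_text).iter():
        if local_name(element.tag) != "fileobject":
            continue
        filename, digest = None, None
        for child in element:
            name = local_name(child.tag)
            if name == "filename":
                filename = child.text
            elif name == "hashdigest" and child.get("type", "").lower() == digest_type:
                digest = (child.text or "").strip().lower()
        if filename and digest:
            groups[digest].append(filename)
    return dict(groups)


def duplicates(groups: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Keep only digests shared by more than one file."""
    return {d: sorted(names) for d, names in groups.items() if len(names) > 1}


if __name__ == "__main__":
    for digest, names in duplicates(files_by_digest(SAMPLE_DFXML)).items():
        print(digest, names)
```

Matching digests only flag candidate duplicates. Whether to keep or discard a copy remains an appraisal decision made during arrangement and description.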
