Getting Bits off Disks: Using open source tools to stabilize and prepare born-digital materials for long-term preservation
  • Sam – Overview of acquisition steps
  • Sam – discuss donor survey / site visit. Types of information captured; purpose of collecting information for appraisal / selection decision-making. Each potential acquisition is a new case to be investigated, with high potential for varied content and format types. The donor survey is a tool to capture initial information to help determine the feasibility of acquiring materials. Caveat / disclaimer: not all acquisition scenarios will allow use of the donor survey before the acquisition decision is made.
  • Current = Word document
  • Future = web form that exports XML or tabular data, to allow for integration / interoperability with the collection management / descriptive system
  • Sam – discuss feasibility assessment process: a series of questions to assist in acquisition decision-making. Analyzing a sample set of files / data may be required to determine answers. The ultimate question is resource-based: cost / benefit analysis. New content / media / format types may require new software / hardware to acquire / accession materials. Jenny: briefly review content; introduce customer to accession process; frequent repeat or reluctant customers
  • Sam – discuss transfer process. Two basic transfers: physical media and/or network transfer. Agreement / forms. Jenny: transferring within Windows environment (using a server share to isolate files); calculating and comparing checksums; transfer agreement completed. Financial issues.
  • Current = Word document. The Digital Materials Transfer document functions as an appendix to the deed of gift and documents details of the transfer / acquisition process.
  • Future = web form that exports XML or tabular data, to allow for integration / interoperability with the collection management / descriptive system
  • Sam – overview of accession steps
  • Sam – provide overview of current born-digital workstation: media drives; use of digital forensics hardware and software; Born Digital Log (record / document accession process in Access database). Discuss disk imaging purpose / function.
  • Sam – provide overview of current born-digital workstation: media drives; use of digital forensics hardware and software; Born Digital Log (record / document accession process in Access database). Discuss disk imaging purpose / function. Jenny: when we do this, why we mostly don't
  • Sam – born digital workstation version 2
  • Sam – 3.5 floppy drive
  • Sam – 5.25 floppy drive
  • Sam – zip drive
  • Sam – write blockers
  • Sam – overview of disk imaging steps
  • Sam – give overview of Born digital Log to document accession process
  • Sam – discuss purpose of photographing media: documenting label text and artifact characteristics. May or may not continue this practice in the future.
  • Sam – 5.25 floppy drive
  • Kryoflux hardware and software as option to capture raw bitstream from unrecognized / unknown filesystems
  • Sam – overview of analysis steps
  • Sam – discuss tools used for initial analysis BitCurator
  • Sam – discuss use of fiwalk to extract / generate filesystem metadata for disk images
  • Sam – discuss processing / preparing data, files, and metadata for storage. Jenny: Currently, the AIP is produced manually and stored on a Windows drive. Will need to revise the process with TRIM. Could make use of Archivematica, but waiting until after ERMS implementation.
  • Sam – discuss processing / preparing data, files, and metadata for storage. Archivematica.
  • Sam – archivematica transfer steps
  • Sam – overview of archivematica ingest steps
  • Sam – archivematica storage of AIP
  • Sam – discuss current and potential future uses of Archivematica. Describe continued use in relation to overall digital preservation program development.
  • Sam – describe general A&D strategy. Basic steps are the same for analog and digital materials.
  • Sam – describe current and future A&D process. Current = in development. Future = dependent on decision to implement an ACMS.
  • Sam – overview of major lessons learned to date
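
The transfer notes above mention calculating and comparing checksums when files are moved to a server share. A minimal sketch of that idea in Python follows. It assumes SHA-256 and a plain directory-to-directory copy, and the function names (`file_checksum`, `checksum_manifest`, `compare_manifests`) are hypothetical illustrations, not the presenters' actual tooling:

```python
import hashlib
from pathlib import Path


def file_checksum(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Return the hex digest of one file, read in chunks to limit memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_manifest(root, algorithm="sha256"):
    """Map each file's path (relative to root) to its checksum."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): file_checksum(path, algorithm)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def compare_manifests(source, destination):
    """Report files that are missing, unexpected, or changed after transfer."""
    shared = set(source) & set(destination)
    return {
        "missing": sorted(set(source) - set(destination)),
        "unexpected": sorted(set(destination) - set(source)),
        "changed": sorted(n for n in shared if source[n] != destination[n]),
    }
```

In use, one manifest would be built on the donor's copy before transfer and another on the archive's copy afterwards; a transfer would be treated as verified only when all three lists returned by `compare_manifests` are empty.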

Getting Bits off Disks: Using open source tools to stabilize and prepare born-digital materials for long-term preservation Presentation Transcript

  • 1. Getting Bits off Disks: Using open source tools to stabilize and prepare born-digital materials for long-term preservation Sam Meister University of Montana Best Practices Exchange 2013 November 13, 2013
  • 2. Acquisition Accession Born-Digital Workflow Discovery & Access Arrangement & Description
  • 3. Acquisition Accession
  • 4. Acquisition Process Donor Survey Feasibility Assessment Transfer Agreement
  • 5. Donor Survey Creation Context Organization Privacy & Security Storage Technical Transfer Options
  • 6. Current Donor survey
  • 7. Future Drupal Web Form XML / CSV
  • 8. Feasibility Assessment Do we have resources to feasibly acquire, preserve, and provide access to the digital materials?
  • 9. Transfer Physical Media Network ARCHIVES
  • 10. Current
  • 11. Future Drupal Web Form XML / CSV
  • 12. Accession
  • 13. Accession Process Disk Image Media Initial Analysis Produce AIP
  • 14. Data Transfer Hardware Software 3.5 Floppy Drive 5.25 Floppy Drive Zip Drive CD / DVD Drive USB Write-Blocker SATA / IDE Write-Blocker FTK Imager Guymager FC5025
  • 15. Disk Imaging “A single file or storage device containing the complete contents and structure representing a data storage medium or device, such as a hard drive, tape drive, floppy disk, CD/DVD/BD, or USB flash drive”
  • 16. Disk Imaging Born Digital Workstation 1.0
  • 17. Disk Imaging Born Digital Workstation 2.0
  • 18. Disk Imaging Get Media Write-Protect Media Assign Identifier Create Image Photograph Media Export Files Record Characteristics Virus Scan
  • 19. FC5025 Disk Image and Browse
  • 20. FTK Imager
  • 21. Issue: Unknown / Unrecognized Filesystems
  • 22. Options: Kryoflux
  • 23. Initial Analysis Extract Metadata Identify Restricted Info Identify Duplicates Generate Reports
  • 24. Initial Analysis Hardware Software BitCurator fiwalk Bulk Extractor
  • 25. “an effort to build, test, and analyze systems and software for incorporating digital forensics methods into the workflows of a variety of collecting institutions”
  • 26. BitCurator: fiwalk
  • 27. BitCurator: bulk_extractor
  • 28. BitCurator: Reports
  • 29. Produce AIP AIP = Archival Information Package
  • 30. Produce AIP Hardware Software Archivematica
  • 31. “a free and open-source digital preservation system that is designed to maintain standards-based, long-term access to collections of digital objects”
  • 32. Produce AIP: Archivematica. Current: using version 0.10 on dedicated workstation (testing as virtual server). Future: install version 1.0 on server with multiple client nodes (workstations).
  • 33. Accession Arrangement & Description
  • 34. A&D Prepare Develop Processing Plan Implement Processing Plan
  • 35. A&D Current • Integrate Born Digital materials into existing A&D process / tools (mix of Excel, Word, XMetal XML editor) Future • Determine tools needed for reviewing content (data visualization) • Integrate Born Digital materials into collection management system
  • 36. Acquisition Accession Born-Digital Workflow Discovery & Access Arrangement & Description
  • 37. Lessons Learned • Embrace iterative approach (use what you have and get what you need when you need it) • Capture as much metadata as possible (descriptive, structural, administrative) • Start with workflow requirements (what needs to be done) then test tools (what things will get it done) • Build flexibility into system (may not always be ideal scenarios)
  • 38. Open Source - Issues • May require specific IT environment (Linux) • Tools likely to change quickly • User interfaces / experience may be simple • Will need ongoing support from IT / Systems staff
  • 39. Open Source - Benefits • Limited initial resources needed to install and test • Provides opportunity to engage systems / IT in new areas • Designed and developed in collaboration with archival community • Direct communication channels to contribute to / modify development roadmap • Quickly build initial standards-compliant workflow
  • 40. Resources FC5025 Disk Image http://www.deviceside.com/fc5025.html Kryoflux http://www.kryoflux.com/ BitCurator http://www.bitcurator.net/ Archivematica https://www.archivematica.org/wiki/Main_Page
  • 41. Thanks! sam.meister@mso.umt.edu @samalanmeister
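
Slide 23 lists extracting metadata and identifying duplicates among the initial analysis steps, and the notes describe using fiwalk to generate filesystem metadata for disk images. A rough sketch of how duplicates might be found from that kind of metadata follows. It assumes an XML report in which each `fileobject` element carries a `filename` and an MD5 `hashdigest`; the sample is a hand-written, simplified fragment rather than real fiwalk output, and `find_duplicates` is a hypothetical helper:

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hand-written, simplified sample for illustration only (not real tool output).
SAMPLE_REPORT = """<dfxml>
  <volume>
    <fileobject>
      <filename>letters/draft1.doc</filename>
      <filesize>2048</filesize>
      <hashdigest type="md5">0cc175b9c0f1b6a831c399e269772661</hashdigest>
    </fileobject>
    <fileobject>
      <filename>letters/draft1_copy.doc</filename>
      <filesize>2048</filesize>
      <hashdigest type="md5">0cc175b9c0f1b6a831c399e269772661</hashdigest>
    </fileobject>
    <fileobject>
      <filename>photos/scan01.tif</filename>
      <filesize>90112</filesize>
      <hashdigest type="md5">92eb5ffee6ae2fec3ad71c777531578f</hashdigest>
    </fileobject>
  </volume>
</dfxml>"""


def local_name(tag):
    """Strip any XML namespace so '{ns}filename' becomes 'filename'."""
    return tag.rsplit("}", 1)[-1]


def find_duplicates(xml_text, hash_type="md5"):
    """Group filenames by digest and keep only groups with more than one file."""
    by_digest = defaultdict(list)
    for element in ET.fromstring(xml_text).iter():
        if local_name(element.tag) != "fileobject":
            continue
        name, digest = None, None
        for child in element:
            tag = local_name(child.tag)
            if tag == "filename":
                name = child.text
            elif tag == "hashdigest" and child.get("type", "").lower() == hash_type:
                digest = (child.text or "").strip().lower()
        if name and digest:
            by_digest[digest].append(name)
    return {d: sorted(names) for d, names in by_digest.items() if len(names) > 1}


if __name__ == "__main__":
    for digest, names in find_duplicates(SAMPLE_REPORT).items():
        print(digest, names)
```

Matching on digests rather than filenames catches copies that were renamed or moved between folders, which is often the case on donor media.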