1. Long-Term Preservation of Digital Objects Outside of ContentDM "preserve your digital objects in double wrapped tortillas" Edward Iglesias and Wittawatt Meesangnil, Elihu Burritt Library, Central Connecticut State University
2. Part I: Rationale Basic info on CCSU Library: We use ContentDM for digital collections, but have many kinds of digital objects outside CDM to be archived. Current digital objects to be archived: ~4 TB. Primary decision: go with offsite / cloud storage.
3. We need long-term digital preservation: choices. Option A: OCLC Digital Archive. Pros: Works beautifully with ContentDM; little to no maintenance; no new expertise required; safe. Cons: Expensive. Option B: Amazon Simple Storage Service (S3). Pros: Provides reliable cloud storage; costs less. Cons: Need to manage the digital preservation process ourselves.
4. Comparison from dltj.org. If we were to go with S3, we would need to deal with the features that Amazon S3 doesn't have.
5. We use ContentDM, so why not use OCLC Digital Archive? Cost. Preservation of digital objects not loaded in ContentDM. Staff development to become experts on digital preservation on our campus.
7. OCLC Digital Archive Workflow* Metadata. Master files on physical media mailed to OCLC. Files from your local computer or network. WorldCat. Metadata and display image. Digital Archive. Digital Asset Mgt. System. *From Taylor Surface, Building Digital Preservation into the Workflow of your Digital Library, OCLC
10. What OCLC Digital Archive Does It provides Systems management Physical security Data security Data backups Disaster recovery ISO 9001 Certification by performing Automated PREMIS creation Manifest verification Fixity check (digital fingerprinting) Format verification Virus check Reports on … Storage use & growth File types Accesses & disseminations
11. Part II: Our Solution We use a local server dedicated to digital preservation + Amazon S3. Create preservation metadata (PREMIS). Use a MySQL database for management. Use BagIt for data transfer verification. Digital objects are preserved in 2 locations: local RAID hard drive and Amazon S3. Deposits are done quarterly.
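The PREMIS step named on this slide can be illustrated with a minimal sketch. The slides do not show the batch file that was actually used, so everything here is an assumption: the function name `premis_object_for` is hypothetical, SHA-256 is an assumed digest algorithm, the PREMIS 2.x namespace is an assumed version, and the format name is only guessed from the file extension. The output is a simplified PREMIS-style fragment and has not been validated against the PREMIS schema.

```python
import hashlib
import xml.etree.ElementTree as ET
from pathlib import Path

# PREMIS 2.x namespace; the version used by the presenters is not stated (assumption).
PREMIS_NS = "info:lc/xmlns/premis-v2"
ET.register_namespace("premis", PREMIS_NS)


def _q(tag: str) -> str:
    """Qualify a tag name with the PREMIS namespace."""
    return f"{{{PREMIS_NS}}}{tag}"


def premis_object_for(path: Path) -> ET.Element:
    """Build a simplified PREMIS-style <object> element for one file (hypothetical helper)."""
    data = path.read_bytes()

    obj = ET.Element(_q("object"))
    ident = ET.SubElement(obj, _q("objectIdentifier"))
    ET.SubElement(ident, _q("objectIdentifierType")).text = "filename"
    ET.SubElement(ident, _q("objectIdentifierValue")).text = path.name

    chars = ET.SubElement(obj, _q("objectCharacteristics"))
    ET.SubElement(chars, _q("compositionLevel")).text = "0"

    # Fixity: the "digital fingerprint" used later to detect corruption.
    fixity = ET.SubElement(chars, _q("fixity"))
    ET.SubElement(fixity, _q("messageDigestAlgorithm")).text = "SHA-256"
    ET.SubElement(fixity, _q("messageDigest")).text = hashlib.sha256(data).hexdigest()

    ET.SubElement(chars, _q("size")).text = str(len(data))

    # Crude format guess from the extension; real format identification needs a dedicated tool.
    fmt = ET.SubElement(chars, _q("format"))
    desig = ET.SubElement(fmt, _q("formatDesignation"))
    ET.SubElement(desig, _q("formatName")).text = path.suffix.lstrip(".").lower() or "unknown"
    return obj
```

Calling `ET.tostring(premis_object_for(Path("scan.tif")))` would serialize the fragment so it can be stored next to the digital object.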
14. Format verification. Keep track: what objects are in archive; storage use & growth; file types. BagIt! PREMIS Archival Object Archive Amazon S3 RAID1 HDD
21. Ingest (2): Create bags using BagIt. BagIt: Transferring Content for Digital Preservation http://www.digitalpreservation.gov/videos/bagit0609.html In a nutshell: put files in one directory (bag) for network transfer. Combine digital objects and preservation metadata into the same directory. Create "BagIt" checksum for validation and transfer of data. Digital Archive Server PREMIS Digital Objects BagIt! Archival Object (Bag)
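The bagging step above can be sketched with the Python standard library alone. This is not the BagIt tool used in the presentation; it only writes the layout the BagIt specification describes (a `data/` payload directory, a `bagit.txt` declaration, and a checksum manifest). The function names `sha256_of` and `make_bag`, the choice of SHA-256, and the version string `1.0` are assumptions made for illustration.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large master files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_bag(source_dir: Path, bag_dir: Path) -> Path:
    """Copy source_dir into bag_dir/data and write the BagIt tag files.

    bag_dir must not exist yet; source_dir is expected to hold both the
    digital objects and their preservation metadata.
    """
    data_dir = bag_dir / "data"
    shutil.copytree(source_dir, data_dir)  # also creates bag_dir

    # Bag declaration required by the BagIt specification.
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )

    # Payload manifest: one "<checksum>  <path relative to the bag>" line per file.
    lines = []
    for payload in sorted(p for p in data_dir.rglob("*") if p.is_file()):
        rel = payload.relative_to(bag_dir).as_posix()
        lines.append(f"{sha256_of(payload)}  {rel}\n")
    (bag_dir / "manifest-sha256.txt").write_text("".join(lines), encoding="utf-8")
    return bag_dir
```

The manifest is what makes the bag self-verifying: any later copy can be checked against it without access to the original files.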
22. Ingest (3): Update MySQL. Keep track of files that go into the archive. Data recorded: bag name, bag size, date, depositor, department, collection, bag checksum. Digital Archive Server MySQL Update archive database. Keep track: what objects are in archive; storage use & growth; file types.
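The tracking table on this slide might look like the sketch below. The presenters used MySQL; this example swaps in Python's built-in `sqlite3` module purely so that it runs without a database server. The table name, column names, and the helpers `open_archive_db`, `record_bag`, and `storage_by_collection` are all hypothetical; only the list of recorded fields comes from the slide.

```python
import sqlite3

# One row per deposited bag, mirroring the fields listed on the slide.
SCHEMA = """
CREATE TABLE IF NOT EXISTS bags (
    bag_name   TEXT PRIMARY KEY,
    size_bytes INTEGER NOT NULL,
    deposited  TEXT NOT NULL,      -- ISO 8601 date
    depositor  TEXT,
    department TEXT,
    collection TEXT,
    checksum   TEXT NOT NULL
)
"""


def open_archive_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the tracking database; SQLite stands in for MySQL here."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def record_bag(conn, bag_name, size_bytes, deposited,
               depositor, department, collection, checksum):
    """Insert one deposit record; the transaction commits on success."""
    with conn:
        conn.execute(
            "INSERT INTO bags VALUES (?, ?, ?, ?, ?, ?, ?)",
            (bag_name, size_bytes, deposited, depositor,
             department, collection, checksum),
        )


def storage_by_collection(conn):
    """Report bag count and total bytes per collection (storage use & growth)."""
    return conn.execute(
        "SELECT collection, COUNT(*), SUM(size_bytes) "
        "FROM bags GROUP BY collection ORDER BY collection"
    ).fetchall()
```

Because deposits are dated, the same table can answer growth questions by grouping on `deposited` instead of `collection`.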
23. Workflow: Archive. Copy processed digital objects (now in a bag with preservation metadata) to: local RAID1 HDD; Amazon S3. Verify BagIt checksum. Digital Archive Server Archival Object (bag) Amazon S3 RAID1 HDD
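The copy-then-verify step can be sketched as follows, assuming a bag that carries a `manifest-sha256.txt` file in the usual BagIt layout. Both destinations are treated as ordinary directories; uploading to Amazon S3 would need an S3 client, which is deliberately left out. The function names `verify_bag` and `archive_bag` are hypothetical.

```python
import hashlib
import shutil
from pathlib import Path


def verify_bag(bag_dir: Path) -> list:
    """Return the payload paths that are missing or whose checksum does not match."""
    problems = []
    manifest = bag_dir / "manifest-sha256.txt"
    for line in manifest.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        expected, rel = line.split(maxsplit=1)
        target = bag_dir / rel
        if not target.is_file():
            problems.append(rel)
        elif hashlib.sha256(target.read_bytes()).hexdigest() != expected:
            problems.append(rel)
    return problems


def archive_bag(bag_dir: Path, destinations) -> dict:
    """Copy the bag to each destination directory and verify every copy.

    Returns {copy path: list of problem files}; an empty list means the
    copy matches its manifest.
    """
    results = {}
    for dest in destinations:
        copy = Path(dest) / bag_dir.name
        shutil.copytree(bag_dir, copy)
        results[str(copy)] = verify_bag(copy)
    return results
```

Running `verify_bag` again on a schedule, not only at deposit time, is one way to turn the same manifest into a periodic fixity check.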
24. Comparisons (again) OCLC Digital Archive provides: systems management, physical security, data security, data backups, disaster recovery, ISO 9001 certification. ERIS DA: Systems management: have to manage the process ourselves; use a MySQL database to keep track of objects in the archive (use a batch file to automate the process). Physical security, data backups: keep 2 copies, on a local RAID1 hard drive (Windows Home Server) and on Amazon Simple Storage Service (S3). Data security: create PREMIS metadata. Disaster recovery: local DA server will be added to the library disaster plan; the S3 agreement does not guarantee data will not be lost (I guess neither does OCLC), but the record of S3 uptime and integrity is very solid. ISO 9001 certification: Señor Burritt certification.
25. Comparisons (cont.) Digital Archive performs: automated PREMIS creation, manifest verification, fixity check (digital fingerprinting), format verification, virus check; reports on storage use & growth, file types, accesses & disseminations. ERIS DA: Semi-automated PREMIS creation using a batch file. Manifest verification, fixity check, format verification. Virus check: use an antivirus program running on the local server (ClamAV WHS version). Reports: can tell what files are in the archive, total storage use, etc. Accesses & disseminations: can access the digital archive at any time. Linking the access file with the master file is not as seamless as OCLC DA, but usable.
27. Taylor Surface, Director, Digital Collection Services, OCLC http://www.oclc.org/us/en/multimedia/2009/files/surface.pdf
Editor's Notes
OCLC DA data backup: “At any point in time there are 6 copies of the content of the Digital Archive at offsite facilities and one copy onsite”
Not included: S3 data transfer (out) fee, which should be minimal
Just put those files on a DVD, mail them, and you’re done