Long-Term Preservation of Digital Objects Outside of ContentDM
"Preserve your digital objects in double-wrapped tortillas"
Edward Iglesias and Wittawatt Meesangnil
Elihu Burritt Library, Central Connecticut State University
Part I: Rationale
  - Basic info on CCSU Library
  - We use ContentDM for our digital collections
  - But we have many kinds of digital objects outside CDM to be archived
  - Current digital objects to be archived: ~4 TB
  - Primary decision: go with offsite / cloud storage
We need long-term digital preservation: choices
Option A: OCLC Digital Archive
  Pros:
  - Works beautifully with ContentDM
  - Little to no maintenance
  - No new expertise required
  - Safe
  Cons:
  - Expensive
Option B: Amazon Simple Storage Service (S3)
  Pros:
  - Provides reliable cloud storage
  - Costs less
  Cons:
  - Need to manage the digital preservation process ourselves
Comparison from dltj.org
  - If we were to go with S3, we would need to deal with the features that Amazon S3 doesn't have
We use ContentDM, so why not use OCLC Digital Archive?
  - Cost
  - Preservation of digital objects not loaded in ContentDM
  - Staff development to become experts on digital preservation on our campus
Cost comparison
OCLC Digital Archive Workflow*
[Diagram labels: Master files on physical media mailed to OCLC; Files from your local computer or network; Metadata; WorldCat; Metadata and display image; Digital Asset Mgt. System; Digital Archive]
*From Taylor Surface, Building Digital Preservation into the Workflow of your Digital Library, OCLC
Need to create a system that is complementary
What OCLC Digital Archive Does
It provides:
  - Systems management
  - Physical security
  - Data security
  - Data backups
  - Disaster recovery
  - ISO 9001 Certification
by performing:
  - Automated PREMIS creation
  - Manifest verification
  - Fixity check (digital fingerprinting)
  - Format verification
  - Virus check
Reports on:
  - Storage use & growth
  - File types
  - Accesses & disseminations
Part II: Our Solution
  - We use a local server dedicated to digital preservation + Amazon S3
  - Create preservation metadata (PREMIS)
  - Use a MySQL database for management
  - Use BagIt for data transfer verification
  - Digital objects are preserved in 2 locations:
      - Local RAID hard drive
      - Amazon S3
  - Deposits are done quarterly
Process System
  - Deposit: digital objects go to the Digital Archive Server
  - Ingest:
      - Create PREMIS: manifest verification, fixity check, format verification
      - BagIt!: PREMIS and digital objects are combined into the Archival Object
      - MySQL: update the archive database to keep track of what objects are in the archive, storage use & growth, and file types
  - Archive: Amazon S3 and RAID1 HDD
Workflow: Deposit
  - Users send digital objects along with administrative information
  - Files are sent to the Digital Archive server via:
      - [...] (for small transfers)
      - USB external hard drive
[Diagram labels: Digital Objects; Deposit; Digital Archive Server]
Ingest (1): Create PREMIS
  - Semi-automated PREMIS creation: manifest verification, fixity check (digital fingerprinting), format verification
  - Based on the New Zealand Prototype PREMIS Creation Tool: http://www.loc.gov/standards/premis/tools_for_premis.php
  - Use JHOVE: manifest verification, fixity check (MD5, SHA1, CRC)
  - Use DROID: format verification
  - Convert the 2 outputs to PREMIS using XSLT
  - Use an MS-DOS batch file to automate the function
[Diagram labels: Digital Archive Server; Ingest; Create PREMIS; PREMIS]
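
To make the fixity check on the Ingest (1) slide concrete, here is a minimal Python sketch that computes the three kinds of values the slide names (MD5, SHA1, CRC) for one file. It is an illustration only: the slides say the actual system gets these values from JHOVE driven by a batch file, and the function name `fixity` and the chunk size are assumptions made for this example.

```python
import hashlib
import zlib


def fixity(path, chunk_size=1 << 20):
    """Return MD5, SHA-1, and CRC32 values for one file.

    Hypothetical helper for illustration; the file is read in chunks so
    that large master files do not have to fit in memory.
    """
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    crc = 0
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
            crc = zlib.crc32(chunk, crc)  # running CRC over all chunks
    return {
        "MD5": md5.hexdigest(),
        "SHA-1": sha1.hexdigest(),
        "CRC32": format(crc & 0xFFFFFFFF, "08x"),
    }
```

Recomputing these values later and comparing them with the stored ones is what shows whether a file has changed since ingest.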


Burrito digital archive system

  • 21. Ingest (2): Create bags using BagIt
      - BagIt: Transferring Content for Digital Preservation: http://www.digitalpreservation.gov/videos/bagit0609.html
      - In a nutshell: put files in one directory (bag) for network transfer
      - Combine digital objects and preservation metadata into the same directory
      - Create "BagIt" checksum for validation and transfer of data
      - Diagram labels: Digital Archive Server; PREMIS; Digital Objects; BagIt!; Archival Object (Bag)
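
The bagging step can be sketched in a few lines of Python. This is a simplified illustration of the BagIt directory layout (a `data/` payload directory, a `bagit.txt` declaration, and an MD5 payload manifest), not the tool the presenters used; the function name `make_bag`, the choice of MD5, and the declared version 1.0 are assumptions for this example. The PREMIS file would simply be passed in alongside the digital objects.

```python
import hashlib
import shutil
from pathlib import Path


def make_bag(source_files, bag_dir):
    """Copy files into bag_dir/data and write bagit.txt plus manifest-md5.txt.

    Hypothetical helper: bag_dir must not exist yet, and file names are
    assumed to be unique. Files are read whole, which is fine for a sketch
    but not for very large masters.
    """
    bag = Path(bag_dir)
    data = bag / "data"
    data.mkdir(parents=True)
    lines = []
    for src in map(Path, source_files):
        dest = data / src.name
        shutil.copy2(src, dest)
        digest = hashlib.md5(dest.read_bytes()).hexdigest()
        # Manifest lines pair a checksum with a path relative to the bag root.
        lines.append(f"{digest}  data/{src.name}\n")
    (bag / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )
    (bag / "manifest-md5.txt").write_text("".join(sorted(lines)), encoding="utf-8")
    return bag
```

Because the manifest travels inside the bag, whoever receives the directory can recompute the checksums and confirm the transfer was complete.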
  • 22. Ingest (3): Update MySQL
      - Keep track of files that go into the archive
      - Data recorded: bag name, bag size, date, depositor, department, collection, bag checksum
      - Keeps track of: what objects are in the archive, storage use & growth, file types
      - Diagram labels: Digital Archive Server; MySQL; Update archive database
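
A tracking table with the fields listed on this slide might look like the sketch below. To keep the example self-contained it uses Python's built-in `sqlite3` module as a stand-in for MySQL; the table name, column names, and helper functions are all hypothetical, and the real schema is not shown in the slides.

```python
import sqlite3

# Columns mirror the "data recorded" list on the slide; names are assumptions.
SCHEMA = """
CREATE TABLE IF NOT EXISTS bags (
    bag_name     TEXT PRIMARY KEY,
    bag_size     INTEGER NOT NULL,   -- size in bytes
    deposit_date TEXT NOT NULL,      -- ISO 8601 date
    depositor    TEXT,
    department   TEXT,
    collection   TEXT,
    bag_checksum TEXT NOT NULL
)
"""


def open_archive_db(path=":memory:"):
    """Open (or create) the tracking database and ensure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def record_bag(conn, **fields):
    """Insert one row per ingested bag; expects all seven fields by name."""
    conn.execute(
        "INSERT INTO bags VALUES (:bag_name, :bag_size, :deposit_date, "
        ":depositor, :department, :collection, :bag_checksum)",
        fields,
    )
    conn.commit()


def storage_by_collection(conn):
    """Report the bag count and total bytes per collection."""
    return conn.execute(
        "SELECT collection, COUNT(*), SUM(bag_size) FROM bags "
        "GROUP BY collection ORDER BY collection"
    ).fetchall()
```

A query like `storage_by_collection` is one way the "storage use & growth" report could be produced from the recorded bag sizes and dates.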
  • 23. Workflow: Archive
      - Copy processed digital objects (now in a bag with preservation metadata) to:
          - Local RAID1 HDD
          - Amazon S3
      - Verify BagIt checksum
      - Diagram labels: Digital Archive Server; Archival Object (bag); Amazon S3; RAID1 HDD
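
The copy-then-verify step for the local RAID copy can be sketched as follows. The code assumes a bag that contains a `manifest-md5.txt` file with one `checksum  path` pair per line; the function names are hypothetical, and the upload to Amazon S3 is left out because the slides do not say which client was used.

```python
import hashlib
import shutil
from pathlib import Path


def verify_bag(bag_dir):
    """Return the payload paths that are missing or fail their MD5 check."""
    bag = Path(bag_dir)
    failures = []
    manifest = (bag / "manifest-md5.txt").read_text(encoding="utf-8")
    for line in manifest.splitlines():
        if not line.strip():
            continue
        expected, rel_path = line.split(maxsplit=1)
        target = bag / rel_path
        if not target.is_file():
            failures.append(rel_path)
        elif hashlib.md5(target.read_bytes()).hexdigest() != expected:
            failures.append(rel_path)
    return failures


def copy_and_verify(bag_dir, raid_dir):
    """Copy a bag into raid_dir, then verify the copy against its manifest."""
    dest = Path(raid_dir) / Path(bag_dir).name
    shutil.copytree(bag_dir, dest)  # fails if the destination already exists
    return verify_bag(dest)
```

An empty list from `copy_and_verify` means every payload file in the copy matches the manifest; anything else names the files to re-copy.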
  • 24. Comparisons (again): what OCLC Digital Archive provides vs. ERIS DA
      - Systems management: we have to manage the process ourselves; use a MySQL database to keep track of objects in the archive (use a batch file to automate the process)
      - Physical security, data backups: keep 2 copies
          - Local RAID1 hard drive (Windows Home Server)
          - Amazon Simple Storage Service (S3)
      - Data security: create PREMIS metadata
      - Disaster recovery:
          - Local DA server will be added to the library disaster plan
          - S3 agreement does not guarantee data will not be lost (I guess neither does OCLC)
          - But the record of S3 uptime and integrity is very solid
      - ISO 9001 Certification: Señor Burritt certification
  • 25. Comparisons (cont.): what OCLC Digital Archive performs vs. ERIS DA
      - Automated PREMIS creation, manifest verification, fixity check (digital fingerprinting), format verification: semi-automated PREMIS creation using a batch file (manifest verification, fixity check, format verification)
      - Virus check: use an antivirus program running on the local server (ClamAV WHS version)
      - Reports on storage use & growth, file types: can tell what files are in the archive, total storage use, etc.
      - Accesses & disseminations: can access the digital archive at any time; linking an access file with its master file is not as seamless as OCLC DA, but usable
  • 27. Taylor Surface, Director, Digital Collection Services, OCLC: http://www.oclc.org/us/en/multimedia/2009/files/surface.pdf

Editor's Notes

  1. OCLC DA data backup: “At any point in time there are 6 copies of the content of the Digital Archive at offsite facilities and one copy onsite”
  2. Not included: S3 data transfer (out) fee, which should be minimal
  3. Just put those files on a DVD, mail them, and you're done