
Archiving as a Service - A Model for the Provision of Shared Archiving Services Using Cloud Computing


Presentation held at iConference 2011

Published in: Education

  1. iConference 2011
     Archiving as a Service - A Model for the Provision of Shared Archiving Services Using Cloud Computing
     Jan Askhoj – janaskhoej[at]
     Shigeo Sugimoto – sugimoto[at]
     Mitsuharu Nagamori – nagamori[at]
     University of Tsukuba, Japan
  2. The Rise of Cloud Computing
     • Big business: the cloud computing market is reported to grow to more than $150 billion in 2013.
     • Gartner listed cloud computing as one of the most hyped technologies in 2009.
     • Many benefits: reduced cost, increased storage, no software deployment, flexibility, mobility, and freeing IT to shift its focus.
     • Cloud computing is increasingly used for content creation and storage.
     * Global Industry Analysts, 2010
  3. A Cloud Definition (One of Many)
     • Cloud computing is an abstracted, scalable platform for service delivery.
     • Cloud computing makes use of existing technologies that can be described via a layered model.
     • Access to both platform and services is available via the internet.
     • Availability, quality and number of services are offered according to agreements with a provider.
     - Vaquero et al. 2009
  4. Cloud Computing from an Archiving Perspective
     • In the cloud, archives may not have knowledge of the hardware and software used to create records. How do we document such formats?
     • Cloud providers are good at managing data and hosting software. But what if something happens?
     • There are providers of services for backup, but not for preservation.
     • Can we find and read documents created and stored in the cloud 10 years from now?
  5. I found the document... If only I knew how to access it!
  6. Object of Research
     Providing a reference model for cloud-based archiving that makes possible:
     • Offering trusted storage and long-term preservation as a cloud-based service.
     • Automatically providing preservation metadata and information packages for transfer of digital records.
     • Extending preservation to as early in the records lifecycle as possible.
  7. Current Archive Model: OAIS
     • Reference Model for an Open Archival Information System (OAIS).
     • Defines Entities, Relationships and Information Types in digital archives.
     Consultative Committee for Space Data Systems, 2002.
  8. OAIS and the Cloud
     • The OAIS Model does not cover the use of a shared platform for storage, outside the control of an archive. Such functionality overlaps with several OAIS functional entities.
     • An OAIS Archive does not cover the early stages of the document lifecycle. With a shared platform, digital objects can be immediately accessible to an archive for early preservation planning.
     • In OAIS, Digital Objects and metadata are included in information packages. If Producer and Archive share a common platform, this is not necessary.
  9. A General Layered Model for Cloud Computing Services (diagram)
     • Diagram labels: Hardware/Facilities, Connectivity, Abstraction, OS Virtualization, Data, Metadata, Content, Applications, APIs, Presentation (user facing).
     • SaaS (Software as a Service): Users access applications via user-facing software or APIs.
     • PaaS (Platform as a Service): Virtualized platform for executing applications and providing storage.
     • IaaS (Infrastructure as a Service): Hardware and infrastructure.
  10. Some Characteristics of the Layered Model
      • In a layered model, each layer offers defined services to the layers above.
      • Services are abstracted and interchangeable.
      • Benefits:
        - Makes it easy to offer and take advantage of defined levels of services.
        - Facilitates resource sharing.
        - Facilitates migration.
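The claim that abstracted, interchangeable services facilitate migration can be illustrated with a minimal Python sketch. Everything here is hypothetical and not part of the presentation: the `StorageService` interface, the two toy providers, and the `migrate` helper are assumptions chosen only to show that code written against a defined service does not care which provider sits underneath.

```python
from typing import Dict, Iterable, List, Protocol, Tuple


class StorageService(Protocol):
    """Hypothetical service that a lower layer offers to the layer above."""

    def put(self, key: str, data: bytes) -> None: ...
    def get(self, key: str) -> bytes: ...
    def keys(self) -> Iterable[str]: ...


class DictStorage:
    """One toy provider: keeps objects in a dictionary."""

    def __init__(self) -> None:
        self._items: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._items[key] = data

    def get(self, key: str) -> bytes:
        return self._items[key]

    def keys(self) -> Iterable[str]:
        return list(self._items)


class LogStorage:
    """Another toy provider: appends every write to a log; the latest write wins."""

    def __init__(self) -> None:
        self._log: List[Tuple[str, bytes]] = []

    def put(self, key: str, data: bytes) -> None:
        self._log.append((key, data))

    def get(self, key: str) -> bytes:
        for logged_key, data in reversed(self._log):
            if logged_key == key:
                return data
        raise KeyError(key)

    def keys(self) -> Iterable[str]:
        seen: List[str] = []
        for logged_key, _ in self._log:
            if logged_key not in seen:
                seen.append(logged_key)
        return seen


def migrate(source: StorageService, target: StorageService) -> int:
    """Copy every object from one provider to another; return the number copied."""
    count = 0
    for key in source.keys():
        target.put(key, source.get(key))
        count += 1
    return count
```

Because `migrate` only uses the three methods of the interface, either provider can act as source or target without changes to the calling layer.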
  11. Simple Layered Cloud Archiving System (diagram)
      • Diagram labels: Interaction Layer, Archive, Business System, Digital Object (shown twice), Storage Layer, Trusted repository (bit-level integrity).
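Bit-level integrity in a trusted repository is commonly checked by recording a digest when an object is stored and recomputing it on access. The following Python sketch assumes SHA-256 and an in-memory store; the `TrustedRepository` class and its method names are hypothetical and are not taken from the presentation.

```python
import hashlib


class TrustedRepository:
    """Hypothetical in-memory stand-in for the storage layer."""

    def __init__(self) -> None:
        # object id -> (stored bytes, SHA-256 digest recorded at ingest)
        self._objects = {}

    def store(self, object_id: str, data: bytes) -> str:
        """Store the bytes and return the digest recorded for later checks."""
        digest = hashlib.sha256(data).hexdigest()
        self._objects[object_id] = (data, digest)
        return digest

    def verify(self, object_id: str) -> bool:
        """Recompute the digest and compare it with the one recorded at ingest."""
        data, recorded = self._objects[object_id]
        return hashlib.sha256(data).hexdigest() == recorded

    def retrieve(self, object_id: str) -> bytes:
        """Return the bytes only if they still match the recorded digest."""
        if not self.verify(object_id):
            raise ValueError(f"fixity check failed for {object_id}")
        return self._objects[object_id][0]
```

A check of this kind only shows that the bits are unchanged; it says nothing about whether the object can still be rendered or understood.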
  12. Expanding the Simple Model
      • Storage does not equal preservation.
      • Information is needed to support: "Viability, Renderability, Understandability, Authenticity, and Identity of Digital Objects" (known in OAIS as an Information Package).
  13. Proposed Four Layer Model
      • Interaction Layer: User-facing Archives/Records Management Systems and Business Systems.
      • Preservation Layer: Adds preservation information. Turns Digital Objects into Information Packages for use by Archives/Records Management Systems.
      • SaaS Layer: Applications represent bit-strings as Digital Objects used by systems and users.
      • PaaS Layer: Application platform and trusted repository for storing bit-strings.
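As a rough illustration of how the lower three layers could hand data upward, here is a minimal Python sketch. All class names, method names, and the two preservation fields are assumptions made for this example; the presentation does not define an API.

```python
from dataclasses import dataclass, field


@dataclass
class DigitalObject:
    """A bit-string plus the format an application uses to interpret it."""
    object_id: str
    format_name: str
    bits: bytes


@dataclass
class InformationPackage:
    """A Digital Object together with the preservation information added to it."""
    digital_object: DigitalObject
    preservation_info: dict = field(default_factory=dict)


class PaaSLayer:
    """Stores bit-strings only; knows nothing about formats."""

    def __init__(self) -> None:
        self._store = {}

    def put(self, key: str, bits: bytes) -> None:
        self._store[key] = bits

    def get(self, key: str) -> bytes:
        return self._store[key]


class SaaSLayer:
    """Applications that represent stored bit-strings as Digital Objects."""

    def __init__(self, paas: PaaSLayer) -> None:
        self._paas = paas
        self._formats = {}

    def save(self, object_id: str, format_name: str, bits: bytes) -> None:
        self._paas.put(object_id, bits)
        self._formats[object_id] = format_name

    def load(self, object_id: str) -> DigitalObject:
        return DigitalObject(object_id, self._formats[object_id], self._paas.get(object_id))


class PreservationLayer:
    """Turns Digital Objects into Information Packages for the layer above."""

    def __init__(self, saas: SaaSLayer) -> None:
        self._saas = saas

    def package(self, object_id: str) -> InformationPackage:
        obj = self._saas.load(object_id)
        # Illustrative fields only; real preservation information would be richer.
        info = {"format": obj.format_name, "size_bytes": len(obj.bits)}
        return InformationPackage(obj, info)
```

An archive or records management system in the Interaction Layer would then call `PreservationLayer.package` rather than reading bit-strings directly, which keeps each layer dependent only on the one below it.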
  14. OAIS Information Package and the Layered Model (diagram)
      • OAIS Information Package labels: Information Package, Preservation Description Information, Information Object, Data Object, Representation Information, Digital Object, Bit Sequence, with three "1+" multiplicity marks.
      • Layered Model labels: Interaction Layer, Preservation Layer, SaaS Layer, PaaS Layer.
  15. Where does Preservation Metadata come from?
      • Business System Metadata: Generated at the time of document creation or records export.
      • Registry Information: Pre-provided (semi-static) information about registered Entities and Information Types.
      • Event Related Information: Information describing changes to Digital Objects and metadata taking place during the preservation process.
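The three sources listed above could be combined into one preservation metadata record along the following lines. This is a sketch under stated assumptions: the function names, the dictionary layout, and the sample registry entry (including its format and retention values) are invented for illustration and do not come from the presentation.

```python
from datetime import datetime, timezone

# Registry information: pre-provided, semi-static descriptions of information types.
# The entry below is made-up sample data.
REGISTRY = {
    "meeting-minutes": {"format": "PDF/A", "retention_years": 30},
}


def record_event(events: list, description: str) -> None:
    """Append an event describing a change made during the preservation process."""
    events.append({
        "time": datetime.now(timezone.utc).isoformat(),
        "description": description,
    })


def build_preservation_metadata(business_metadata: dict, registry: dict, events: list) -> dict:
    """Merge business system metadata, registry information, and event information."""
    info_type = business_metadata["information_type"]
    return {
        "business_system": dict(business_metadata),
        # An unregistered information type simply yields an empty registry section.
        "registry": dict(registry.get(info_type, {})),
        "events": list(events),
    }
```

Keeping the three sources in separate sections of the record makes it clear which values were supplied at creation time, which were looked up, and which accumulated later.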
  16. Layered Model: Applications, Information and Provided Services (diagram)
      • Layers: PaaS Layer, SaaS Layer, Preservation Layer, Interaction Layer.
      • Other diagram labels: Digital Object Type & Metadata, Bitstream Storage & API, Information Package, Archive System, Package Creator, Business Software, Storage/Hosting Platform, Application, Service, Preservation Information, Digital Object, Bit-stream, Information Type.
  17. Case Study: Japanese Government
      • Problems with system incompatibility and insufficient records management have led to a new Archives Policy and a new IT Strategy.
      • One part is a cloud computing project: the Kasumigaseki Cloud (霞が関クラウド). This is still in the early stages of planning.
      • We focus on three archiving problem areas to see how these could be resolved using our model.
  18. Current Workflow (diagram)
      • Diagram labels: Platform (three instances), Business System (several instances), Records Mgmt. System, Common Document Registration System, Registration, Record, Historic Record, Destruction (two instances), Transfer Plan, Preservation Plan, Retention Schedule, Agency Records Mgmt., Agency, National Archives, National Archive.
  19. Problem Areas
      • Lack of system integration: Individual government offices use different systems. Preparing records is a time-consuming task.
      • Lack of resources: The burden of transferring records to the National Archives lies with government agencies. The size of the National Archives of Japan (NAJ) makes it hard to provide assistance.
      • Preservation: Records held in government agency systems are not being preserved.
  20. Applying the Model
      • We assume that the Kasumigaseki Cloud will offer both a storage/hosting platform (PaaS) and software services (SaaS).
      • Added functionality in the Preservation Layer:
        - Registration
        - Harvesting
        - Preservation
        - Reporting
  21. Diagram labels:
      • Layers: PaaS Layer, Package Layer, SaaS Layer, ARM Layer.
      • Mappings: SaaS Business Systems -> Digital Objects; Platform -> Bit-sequences; Functionality -> Registration, Harvesting, Conversion, Reporting.
      • Other labels: Archive System, User Facing Systems, Transfer (two instances), Preservation Description Information, Representation Information, Package Information, Package Desc., RMS, Agency Records Mgmt., Agency, National Archives, Business System, Back-end, Transfer Plan, Preservation Plan, Retention Schedule.
  22. Benefits and Limitations in the Case Study
      • Benefits:
        - Automatic package creation, simplifying records transfer.
        - Early and consistent addition of preservation metadata.
        - Allows keeping the current workflow, but adds automation.
      • Limitations/Requirements:
        - The cloud platform must be truly trustworthy, with no unexpected change or loss of service.
        - Good export of content and metadata from SaaS business systems is needed.
        - Semantic or community-specific information still has to be provided.
  23. Concluding Remarks
      We believe our model has a number of advantages when developing a cloud archive framework:
      • Builds on OAIS model concepts and information types.
      • Adds trusted storage and preservation to early stages in the document lifecycle.
      • Simplifies archive system design by allowing organizations to choose different levels of service.
      • Current status: Work on defining information classes and properties. Designing a test system using the model.
  24. Thank you! ありがとうございました! University of Tsukuba, Japan
  25. References
      • ISO 15489-1:2001 - Information and documentation - Records management - Part 1: General. 2001.
      • Requirements for Electronic Records Management Systems. 2002.
      • Reference Model for an Open Archival Information System (OAIS). Consultative Committee for Space Data Systems, 2002.
      • Electronic Records Archives ERA Lifecycle. 2004.
      • National Archives Law. National Archives of Japan, 2007.
      • Outline of the National Archives. 2007.
      • Chan, T. Japan to build massive cloud infrastructure for e-government. Green Telecom.
      • Guenther, R. Understanding and Implementing the PREMIS Data Dictionary for Preservation Metadata. 2009.
      • Koga, T. Recent development of the government information policy in Japan. International Federation of Library Associations and Institutions, Government Information and Official Publications Section (GIOPS) Newsletter, 8, (2010), 8-11.
      • Kulovits, H., Becker, C., and Kraxner, M. Plato: A Preservation Planning Tool Integrating Preservation Action Services. 5173/2008, (2008), 413-414.
      • Okamoto, S. New Developments in Managing Records in Japan - The Establishment, Direction and Structure of the Archive Law. 2010.
      • Sugimoto, S. Ensuring the Preservation and Use of Electronic Records. (2007).
      • Vaquero, L.M., Rodero-Merino, L., and Caceres, J. A Break in the Clouds: Towards a Cloud Definition. ACM SIGCOMM Computer Communication Review 39, 1 (2009), 50-55.
      • Youseff, L., Butrico, M., and DaSilva, D. Toward a Unified Ontology of Cloud Computing. Grid Computing Environments Workshop, (2008), 1-10.