Shaman Project Hemmje


Published on

3rd Annual WePreserve Conference Nice 2008

Published in: Technology, Business


  1. 1. Securing Communication with the future
  2. 2. SHAMAN will provide a next-generation Digital Preservation (DP) Framework. It develops exemplary Application Prototypes to investigate the advantages and impacts of integrating SHAMAN’s new DP Component Technologies, as well as legacy systems, into new Application Solutions along SHAMAN’s Reference Architecture, which extends OAIS. Validation of the framework’s viability will focus on three Application Trial Domains:
      • scientific publishing in libraries and documents in governmental archives
      • digital objects used in industrial design and engineering processes
      • data resources used in e-Science applications
  3. 3. SHAMAN’s PCAs supporting Interoperability (PCA: name, coordination)
      • PCA 1: Distributed Resource Management Infrastructure Framework and Grid-based Resource Integration. Coordination: Jose Borbinha (INESC-ID)
      • PCA 2: Contextual and Multivalent Archival and Preservation Processes. Coordination: Matthias Hemmje (Univ. of Hagen)
      • PCA 3: Semantic, Constraint-based Collection Management Systems. Coordination: Jean-Pierre Chanod (XEROX RCE)
      • PCA 4: Managing Future Requirements. Coordination: Adil Hasan (Univ. of Liverpool)
  4. 4. SHAMAN’s WPs and PCAs supporting Component R&D
      • WP 1: Requirements Analysis and Identification of User Scenarios. Perla Innocenti (HATII, Univ. of Glasgow)
      • WP 2: Design and Specification of the SHAMAN Digital Preservation Framework. Milena Dobreva (U. Strathclyde, Glasgow)
      PCA1
      • WP 5: Data Grid Implementation. Martin Mois (Univ. of Hagen)
      • WP 3: Context Capturing, Representation, and Management. Claus-Peter Klas (Univ. of Hagen)
      • WP 4: Multivalent Preservation Interface and Media Engines. Paul Watry (Univ. of Liverpool)
      PCA2
      • WP 6: Harmonisation, Basic Analysis and Ingest. Jean-Pierre Chanod (Xerox RCE)
      • WP 7: Advanced Information Extraction and Knowledge Engineering. Jean-Pierre Chanod (Xerox RCE)
      PCA3
      • WP 8: Managing Shared Collections. Jens Ludwig (SUB)
      • WP 9: Interoperability with Future Environments. Paul Watry (Univ. of Liverpool)
      PCA4
      • WP 10: Maintaining Essential Properties. Jana Dittmann (Univ. of Magdeburg)
  5. 5. SHAMAN’s ISPs: Supporting Cohesion, Integration, Evaluation, and Demonstration
      • ISP1, WP 11: Document Production, Archival, Access and Reuse in the Context of Memory Institutions for Scientific and Governmental Collections. Alfred Krahnstedt (DNB)
      • ISP2, WP 12: Simple and Connected Object Production, Archival and Reuse in the Industrial Design and Engineering Domain. Andreas Hundsdörfer (InConTec)
      • ISP3, WP 13: eScience Data-Acquisition and Harmonisation Testbed. Jose Borbinha (INESC/ID)
  6. 6. SHAMAN’s Integration & Demonstration Subprojects (ISPs)
      • Foster Systematic Evolution of Project Results…
      • ISP 1: Document Production, Archival, Access and Reuse in the Context of Memory Institutions for Scientific and Governmental Collections
      • ISP 2: Simple and Connected Object Production, Archival and Reuse in the Industrial Design and Engineering Domain
      • ISP 3: eScience Data-Acquisition and Harmonisation Testbed
      • Horizontal Integration of RTD Contributions
      07.11.2008
  7. 7. Fine, that is the SHAMAN Project structure … but how will the project work?
  8. 8. Strategic R&D Impact Steering: Early Focus on Impact Drivers
  9. 9. Tactical R&D WP Steering (I): Early Operational Consequences for Requirements Analysis in WP1
  10. 10. Tactical R&D WP Steering (II): Early Operational Consequences for the Reference Architecture in WP2
  11. 11. SHAMAN’s legacy in ISP-1 (I): KOPAL (DNB, SUB)
  12. 12. SHAMAN’s legacy in ISP-1 (II): KOLIBRI (DNB, SUB)
  13. 13. SHAMAN’s legacy in ISP-1 (III): DIAS (IBM)
      [Architecture diagram. Recoverable labels: Ingest, Data Management, Archival Storage, Access, Administration, Monitoring & Logging, Preservation Planning, Preservation Processor, Preservation Manager, Permanent Access Toolbox, BusinessObjects, Query, Query Results, SIP, AIP, DIP, METS, MPEG-21, KB, Standard and Custom Metadata]
      Legend: SIP = Submission Information Package; AIP = Archival Information Package; DIP = Dissemination Information Package
  14. 14. SHAMAN’s requirements analysis in ISP-1 (IV): Scientific Publication P&R Context
      • The scientific congress publication process can make available a rich set of information to the reuse context
      • The scientific community web publishing and DL application CONGRESS ONLINE® can be extended to capture context data beyond the immediate requirements of scientific event organization
      [Diagram: Scientific Information and Publication Process. Labels: Creation, Review, Production, Publication, Acquisition/Ingest, Archival/Preservation, Distribution/Access, Interpretation, Reuse, Reflection, Interlinking]
  15. 15. Usage Scenario in ISP-2 (I): D&E Scenario and the OAIS Reference Model. Design & Engineering P&R Context
      • Multivalent approaches in preserving CAD models: migration, emulation
      • Providing ingestion and access in distributed heterogeneous industrial engineering scenarios with collaboration support
      Base diagram from: Consultative Committee for Space Data Systems (2002): Reference Model for an Open Archival Information System (OAIS); CCSDS 650.0-B-1; BLUE BOOK
  16. 16. ISP-2 (II): R&D Dimensions of the Engineering Scenario [Diagram labels: Discover, Search, Collate, Interpret, Re-present, CAD/CAE Data, Collaboration]
  17. 17. Basic Research Challenges in SHAMAN
      • Theory of Preservation that may be used to store and access potentially any type of data, based on the integration of digital library, persistent archive, and data management technologies.
      • Infrastructure for long-term preservation and reuse of data over a decades-long time span.
      • Grid-based production system that will support the virtualization of data and services across scientific, engineering, document, and media domains.
      Identifying Content and Capturing of Context; Demonstrate Distributed Ingestion
  18. 18. First Investigations for Embedding Legacy Environments and Application-Domain oriented Use Cases into a Grid-Based Preservation Infrastructure Towards SHAMAN’s Framework Infrastructure and its Reference Architecture
  19. 19. ISP 1 Scenario: Use Cases
      • Information Integration
          - Mediated search within distributed repositories
          - Transparent (read) access to legacy systems
          - How to find information on all integrated systems?
      • Distributed Ingestion
          - Local ingest processes are registered for updating a federated metadata catalogue/index
          - How to cover local and global data ingestion, such that information can still be found?
      • Managing Distributed Collections
          - Implementation of replication mechanisms
          - User requirements mediated to the underlying storage infrastructure
          - How to enable data management over legacy systems?
  20. 20. Challenges of Memory Institutions for SHAMAN within ISP1: Embedding legacy environments into a grid-based preservation infrastructure
      • Types of legacy systems: Data Grids, Institutional Repositories, Archival Systems, Access Systems, Digital Libraries
      • Requirements: Integrity, Authenticity, Search & Browse, Interpretability, Virtualization
      • Types of technology: Data Grid (iRODS/SRB), DSpace, Fedora, KOPAL, DAFFODIL
  21. 21. Use case 1: Information Integration. Search & Browse in distributed heterogeneous resources
      • Problem
          - Different access points to all resources
          - Possibly different user interfaces and different query forms
          - Heterogeneous metadata standards
      • Solution
          - One access point
          - Central user interface & query form
          - Transparent access to all legacy systems
  22. 22. Information Integration of Legacy Systems (I): Search and browse in heterogeneous environments
      [Diagram labels: DAFFODIL User Interface; iRODS Search & Browse Service; three iRODS wrappers; Knowledge Base, DSpace, Database, iRODS, Kopal]
  23. 23. Information Integration of Legacy Systems (II): Search and browse in heterogeneous environments
      [Diagram labels: DAFFODIL User Interface; Search, Browse & Integration Service; four wrappers; Knowledge Base, DSpace, Database, iRODS, Kopal]
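
The wrapper idea on slides 21 to 23 can be illustrated with a minimal Python sketch. It is not SHAMAN or DAFFODIL code: the class names (`Record`, `InMemoryWrapper`, `Mediator`), the metadata field names, and the sample data are all hypothetical, and each repository is simulated by an in-memory dictionary. The sketch only shows how one access point can fan a query out to wrappers that translate repository-specific metadata into a common record.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Record:
    source: str      # name of the repository the hit came from
    identifier: str  # repository-local identifier
    title: str       # title mapped into the common schema


class Wrapper(Protocol):
    """Anything that can answer a query with common-schema records."""

    name: str

    def search(self, query: str) -> list[Record]: ...


class InMemoryWrapper:
    """Hypothetical wrapper; a real one would call the repository's own API."""

    def __init__(self, name: str, title_field: str,
                 items: dict[str, dict[str, str]]) -> None:
        self.name = name
        self.title_field = title_field  # repository-specific metadata field
        self.items = items

    def search(self, query: str) -> list[Record]:
        needle = query.lower()
        return [
            Record(self.name, ident, meta[self.title_field])
            for ident, meta in self.items.items()
            if needle in meta[self.title_field].lower()
        ]


class Mediator:
    """Single access point that queries every registered wrapper."""

    def __init__(self, wrappers: list[Wrapper]) -> None:
        self.wrappers = wrappers

    def search(self, query: str) -> list[Record]:
        hits: list[Record] = []
        for wrapper in self.wrappers:
            hits.extend(wrapper.search(query))
        # Merge into one result list, ordered by title, then by source.
        return sorted(hits, key=lambda r: (r.title.lower(), r.source))


# Two simulated repositories with different metadata field names.
dspace = InMemoryWrapper(
    "DSpace", "dc.title", {"123/1": {"dc.title": "Grid Preservation Report"}})
kopal = InMemoryWrapper(
    "KOPAL", "title",
    {"urn:1": {"title": "Preservation of CAD Models"},
     "urn:2": {"title": "Annual Budget"}})

mediator = Mediator([dspace, kopal])
results = mediator.search("preservation")
```

The user-facing code talks only to `Mediator`; adding another legacy system means adding another wrapper, not changing the query form.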
  24. 24. Use case 2: Distributed Ingestion. Ingestion in distributed heterogeneous resources
      • How to cover local and global data ingestion, such that information can still be found?
      • Local ingestion:
          - Notification push/pull
          - Double ingestion (full object)
          - Only metadata ingestion
  25. 25. Distributed Ingestion: Parallel storage, index only, full object store
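
The three local-ingestion options on slides 24 and 25 can be contrasted in a short Python sketch. This is one possible reading of the slides, not the project's design: `IngestMode`, `FederatedCatalogue`, and the sample identifiers are hypothetical names introduced for illustration. A notification only queues the object for a later pull, metadata-only ingestion updates the federated index, and double ingestion also stores a full copy.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class IngestMode(Enum):
    NOTIFY = "notify"           # catalogue only learns that an object exists
    METADATA_ONLY = "metadata"  # catalogue indexes the metadata
    FULL_OBJECT = "full"        # catalogue indexes metadata and keeps a copy


@dataclass
class FederatedCatalogue:
    """Hypothetical federated metadata catalogue/index."""

    index: dict[str, dict[str, str]] = field(default_factory=dict)
    copies: dict[str, bytes] = field(default_factory=dict)
    pending: list[str] = field(default_factory=list)

    def register(self, repo: str, object_id: str, metadata: dict[str, str],
                 payload: bytes, mode: IngestMode) -> None:
        key = f"{repo}:{object_id}"
        if mode is IngestMode.NOTIFY:
            # Remember the object so its metadata can be pulled later.
            self.pending.append(key)
            return
        self.index[key] = dict(metadata)
        if mode is IngestMode.FULL_OBJECT:
            self.copies[key] = payload

    def find(self, term: str) -> list[str]:
        needle = term.lower()
        return sorted(
            key for key, meta in self.index.items()
            if any(needle in value.lower() for value in meta.values())
        )


catalogue = FederatedCatalogue()
catalogue.register("KOPAL", "urn:1", {"title": "Thesis on OAIS"},
                   b"pdf-bytes", IngestMode.FULL_OBJECT)
catalogue.register("DSpace", "123/4", {"title": "OAIS survey"},
                   b"", IngestMode.METADATA_ONLY)
catalogue.register("iRODS", "/zone/a.dat", {"title": "OAIS dataset"},
                   b"", IngestMode.NOTIFY)
found = catalogue.find("oais")
```

In this sketch the notified object stays invisible to `find` until something pulls its metadata, which is the trade-off the slide's question ("such that information can still be found?") points at.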
  26. 26. Resulting Technical Challenges of Managing Distributed Collections
      • Management of distributed collections in a grid environment with all legacy systems, based on policies.
      • Replication of information in worldwide distributed data and storage grids to prevent destruction, e.g., in case of disasters
      • UC1 & UC2: Wrappers & services need only be aware of their local environment
      • Here: New mediator level, aware of all repositories and their current status!
  27. 27. Managing Distributed Collections: Replicate stored objects according to user requirements
      [Diagram labels: DAFFODIL Management Interface; Policy Service (e.g. replication); four services, each over a wrapper; DSpace, Database, Kopal, iRODS]
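
Slides 26 and 27 describe a mediator level that knows all repositories and their status and replicates objects according to policy. A minimal Python sketch of such a planning step follows. Everything in it is an assumption made for illustration: the `Repository` and `ReplicationPolicy` types, the `plan_replication` function, the site labels, and the rule itself (reach a minimum number of sites first, then a minimum number of copies).

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Repository:
    name: str
    site: str                  # hypothetical location label
    online: bool = True        # current status known to the mediator
    objects: set[str] = field(default_factory=set)


@dataclass(frozen=True)
class ReplicationPolicy:
    min_copies: int  # user requirement: total number of copies
    min_sites: int   # user requirement: number of distinct sites


def plan_replication(object_id: str, repos: list[Repository],
                     policy: ReplicationPolicy) -> list[str]:
    """Return names of repositories that should receive a new copy."""
    holders = [r for r in repos if r.online and object_id in r.objects]
    candidates = [r for r in repos if r.online and object_id not in r.objects]
    sites = {r.site for r in holders}
    copies = len(holders)
    targets: list[str] = []

    # First spread the object over enough distinct sites.
    for repo in candidates:
        if len(sites) >= policy.min_sites:
            break
        if repo.site not in sites:
            targets.append(repo.name)
            sites.add(repo.site)
            copies += 1

    # Then add copies until the required count is reached.
    for repo in candidates:
        if copies >= policy.min_copies:
            break
        if repo.name not in targets:
            targets.append(repo.name)
            copies += 1

    return targets


repos = [
    Repository("KOPAL", "site-A", objects={"obj-1"}),
    Repository("DSpace", "site-A"),
    Repository("iRODS", "site-B"),
    Repository("Database", "site-C", online=False),
]
policy = ReplicationPolicy(min_copies=3, min_sites=2)
plan = plan_replication("obj-1", repos, policy)
```

The planner needs the global view (which repositories are online, where each object already lives), which is why the slides place it above the wrappers rather than inside them.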
  28. 28. Constructing SHAMAN’s Service-Oriented DP Reference Architecture [Diagram labels: Communication Protocols; ISP 1, ISP 2, ISP 3]
  29. 29. Fine. Thank you very much for your attention.
  30. 30. SHAMAN. Welcome to the future. Welcome to SHAMAN.