Integration - the heart of researcher centric research data management systems - Steve Mackey, Arkivum

Integration - the heart of researcher centric research data management systems - Steve Mackey, Arkivum - at Repository Fringe 2015

  1. 1. Integration – the heart of researcher centric research data management systems. Steve Mackey, 15 January 2015
  2. 2. Agenda
      • Who we are, what we do
      • How it works
      • RDM systems, where it fits
      • Workflows
      • Integrations
      21 October 2014
  3. 3. Archive storage with a difference
      • Flagship Arkivum100 service with 100% data integrity guarantee
      • World-wide professional indemnity insurance – Arkivum100
      • Long term contracts for enterprise data archiving
      • Fully automated and managed solution
      • Audited and certified to ISO27001
      • Data escrow, exit plan, no lock-in
  4. 4. Keeping Data Alive for 25+ Years
      • Adding media – effectively continual process
      • Monthly checks and maintenance updates
      • Annual data retrieval and integrity checks
      • Hardware refresh
      • Software migration
      • Hardware migration
      • Tape format migration – LTO n to LTO n+2
      • Support and admin staff migration
      • Change of supplier of products and services
      • 3-5 year obsolescence of servers, operating systems and software
  5. 5. Arkivum Appliance
      • CIFS/NFS presentation (integrates easily with local file systems)
      • Simple administration of user access permissions and storage allocations
      • Robust REST API for application integration
      • GUI for file ingest status, recovery pre-staging, security
      • Ingest triggered by: timeout, checksum exchange, manifest (bulk)
      • Checksum/fixity chain of custody from ingest through replication
      • Immutable (WORM)
      • Regular (6 monthly) data copy read verify
      • Offline escrow data copy (open source, self describing)
      • Data encryption throughout; keys held only by the customer
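
The checksum exchange and fixity chain of custody listed on this slide can be illustrated with a minimal sketch. This is not the Arkivum API: the function names `build_manifest` and `verify_manifest`, the manifest layout, and the choice of SHA-256 are all assumptions made for illustration. The idea is only that checksums recorded before ingest can be re-checked later against any copy.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large datasets do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Hypothetical manifest: relative file path -> checksum, recorded before ingest."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_manifest(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return the relative paths that are missing or whose checksum differs."""
    failures = []
    for rel_path, expected in manifest.items():
        candidate = root / rel_path
        if not candidate.is_file() or sha256_of(candidate) != expected:
            failures.append(rel_path)
    return failures


def demo() -> tuple[list[str], list[str]]:
    """Verify a small dataset before and after a simulated corruption."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "dataset").mkdir()
        (root / "dataset" / "readings.csv").write_text("t,value\n0,1.5\n")
        (root / "README.txt").write_text("example dataset\n")
        manifest = build_manifest(root)
        clean = verify_manifest(root, manifest)
        (root / "README.txt").write_text("tampered\n")  # simulate a damaged copy
        damaged = verify_manifest(root, manifest)
        return clean, damaged


if __name__ == "__main__":
    print(demo())
```

In this sketch the same manifest can be checked against the copy staged for ingest and against any later copy, which is the sense in which a fixity chain carries through replication.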
  6. 6. [Diagram] Arkivum Service. Labels: Original Datasets & Files; Copy for ingest; Arkivum Gateway on Appliance
  7. 7. [Diagram] Arkivum Service. Labels: Original Datasets & Files; Copy for ingest; Arkivum Gateway on Appliance; Encrypted Archive
  8. 8. [Diagram] Arkivum Service. Labels: Original Datasets & Files; Copy for ingest; Arkivum Gateway on Appliance; Encrypted Archive; Decrypted object; Validated Archive
  9. 9. [Diagram] Arkivum Service. Labels: Original Datasets & Files; Copy for ingest; Arkivum Gateway on Appliance; Validated Archive; Archive Copy 1
  10. 10. [Diagram] Arkivum/100. Labels: Original Datasets & Files; Copy for ingest; Arkivum Gateway on Appliance; Validated Archive; Archive Copy 1; Archive Copy 2
  11. 11. [Diagram] Arkivum/100. Labels: Original Datasets & Files; Copy for ingest; Arkivum Gateway on Appliance; Validated Archive; Archive Copy 1; Archive Copy 2
  12. 12. [Diagram] Arkivum/100. Labels: Original Datasets & Files; Copy for ingest; Arkivum Gateway on Appliance; Validated Archive; Archive Copy 1; Archive Copy 2; Escrow Copy
  13. 13. [Diagram] Arkivum/100. Labels: Original Datasets & Files; Arkivum Gateway on Appliance; Validated Archive; Cached Copy; Archive Copy 1; Archive Copy 2; Escrow Copy
  14. 14. [Diagram] Arkivum/100. Labels: Arkivum Gateway on Appliance; Validated Archive; Cached Copy; Archive Copy 1; Archive Copy 2; Escrow Copy
  15. 15. [Diagram] Source: http://datablog.is.ed.ac.uk/2013/12/06/the-four-quadrants-of-research-data-curation-systems/ Labels: PURE; Elements; Converis; ePrints, DSpace, Hydra; Figshare; Re3data.org; Landing pages; CKAN; Institutional storage
  16. 16. Workflows
      • RDM Workflow: the sequence of repeatable processes (steps) through which research data passes during its lifecycle, including the steps involved in its creation, curation, preservation, access and eventual disposal.
  17. 17. RDM Workflows Report
      • JISC Research Data Spring
      • A Consortial Approach to Building an Integrated RDM System – “Small and Specialist”
      • http://dx.doi.org/10.6084/m9.figshare.1476832
  18. 18. Researcher Centric Workflow
  19. 19. [Diagram] Systems: Researcher; Web browser; Local Research Data; HR system; CRIS (Elements); Figshare (Amazon); DataCite (BL); Journal; Repository (DSpace); Archive (Arkivum)
      Numbered flows:
      1. Researcher details
      2. Data files
      3. Data Description
      4. Mint DOI
      5. Data DOI
      6. Data DOI
      7. Article
      8. Data DOI
      9. Article and Article DOI
      10. Article and Article DOI
      11. Article DOI
      12. Dataset Description and Data DOI
      13. Dataset Description and Data DOI, Article DOI
      14. Data files
      15. Data is safe
      16. Data is safe
  20. 20. Why integrate?
      • Simpler and easier RDM processes from a researcher perspective, which both encourages adoption and lowers the cost of institutional support to the research base.
      • Clear and repeatable RDM processes that help ensure higher levels of quality and consistency in RDM across the research base.
      • Ability to deploy RDM as community-driven shared service(s) so that smaller institutions can ‘join forces’ to benefit from having access to a common RDM infrastructure.
      • Scaling RDM up across a large research base using automation and ‘factory’ type approaches to achieve ‘economies of scale’ and move away from RDM being a manual and labour intensive endeavour.
      • Specifically for archive layer storage this may include:
          – Confirmation of integrity of received files via checksums/fixity
          – File archive status reporting
          – Trigger for original file deletion
          – File location, data pool management
          – File recovery staging
          – Encryption key management
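
As a hedged illustration of how the archive-layer items on this slide could fit together, the sketch below gates deletion of an original file on two conditions: the archive's reported checksum matches the local one, and enough verified copies exist. The `ArchiveStatus` record, its field names, and the default of two required copies are assumptions made for this example; they are not taken from the Arkivum API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArchiveStatus:
    """Hypothetical status record an archive layer might report for one file."""
    path: str
    archive_checksum: str  # checksum computed by the archive on receipt
    verified_copies: int   # number of archive copies confirmed so far


def safe_to_delete(local_checksum: str, status: ArchiveStatus,
                   required_copies: int = 2) -> bool:
    """Allow deletion only when fixity matches and enough copies are verified."""
    return (
        status.archive_checksum == local_checksum
        and status.verified_copies >= required_copies
    )


def deletion_candidates(local_checksums: dict[str, str],
                        statuses: list[ArchiveStatus],
                        required_copies: int = 2) -> list[str]:
    """Return the paths whose originals could be removed, in sorted order."""
    return sorted(
        s.path
        for s in statuses
        if s.path in local_checksums
        and safe_to_delete(local_checksums[s.path], s, required_copies)
    )


if __name__ == "__main__":
    local = {"a.dat": "aaa", "b.dat": "bbb", "c.dat": "ccc"}
    reported = [
        ArchiveStatus("a.dat", "aaa", 2),  # matches and fully replicated
        ArchiveStatus("b.dat", "bbb", 1),  # matches but only one copy so far
        ArchiveStatus("c.dat", "xxx", 3),  # replicated but checksum mismatch
    ]
    print(deletion_candidates(local, reported))
```

Keeping the decision in one small predicate means an integrating system (a repository or CRIS, for instance) can apply the same rule regardless of how it obtains the status records.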
  21. 21. Data Archiving - Integrations
  22. 22. Questions?
