Core Trust Seal for Trustworthy Data Repositories, 2018-04-19

  1. CoreTrustSeal for Trustworthy Data Repositories John B Howard, CoreTrustSeal Board of Directors; University Librarian, University College Dublin john.b.howard@ucd.ie CTS Introduction and Workshop, Dublin, March X, 2018
  2. Workshop Outline •  Trust: What does it mean, how does it apply? •  Who are the stakeholders, what are the issues? •  Repository certification: •  Certification frameworks •  European Framework for Audit & Certification of Digital Repositories •  CoreTrustSeal •  Background •  Certification process •  Guidelines - a quick review
  3. Perspectives on Trust •  Trust in common parlance •  Repository context •  May represent a technical matter of compliance (OAIS Reference Model, or a certification framework) •  Transparency with regard to policy & operations •  Engagement with the designated community •  End-user context •  Good will, understanding the needs of the designated community; value proposition; reputation •  Transparency •  Confidence, reliability, persistence/permanence •  Endorsement: peers within the designated community, certification frameworks
  4. “Perhaps the biggest challenge in sharing data is trust: how do you create a system robust enough for scientists to trust that, if they share, their data won’t be lost, garbled, stolen or misused?”
  5. Understanding stakeholder communities •  Focus on data with regard to Open Science, with implications of scientific interest •  Most certified repositories hold observational, experimental, or simulated data and data analytics •  Prevailing definitions of data open the door to designated communities that are heterogeneous •  Current DataCite “resource types”: Collection, Dataset, Event, Film, Image, InteractiveResource, PhysicalObject, Service, Software, Sound, Text
  6. Certification affirms trustworthiness, it does not bestow trust •  Forming a strategy to engender trust in repository services •  Understand the OAIS model and how it applies to your situation •  Identify your designated community (communities) •  Evaluate your actions as a repository: how do you add value, support reliability •  Evaluate your mission with regard to the long-term usability and cite-ability of assets held, be able to demonstrate awareness of preservation challenges •  Evaluate the transparency of your documentation
  7. •  “A trusted digital repository is one whose mission is to provide reliable, long-term access to managed digital resources to its designated community, now and in the future.” - RLG/OCLC Working Group: Research Libraries Group, 2002, p. i
  8. Certification •  Certifications add value through external consideration of quality of service, based on standards and best practices •  Framework of standards at different levels
  9. Various approaches •  Task Force on Archiving of Digital Information (Commission on Preservation and Access and Research Libraries Group (RLG), 2001) •  OCLC •  TRAC Trustworthy Repositories Audit & Certification •  Consultative Committee for Space Data Systems (CCSDS) •  RAC Repositories Audit & Certification •  Digital Curation Centre (DCC) and DigitalPreservationEurope (DPE) •  DRAMBORA •  nestor AG Vertrauenswürdige Archive/Zertifizierung •  DIN 31644/nestor-Siegel •  Data Archiving and Networked Services (DANS) •  Data Seal of Approval •  Primary Trustworthy Digital Repository Authorisation Body (ISO-PTAB) •  ISO 16363
  10. European Framework for Audit & Certification of Digital Repositories (2010) •  Three different tiers of certification processes that build upon each other •  http://www.trusteddigitalrepository.eu/Trusted%20Digital%20Repository.html
  11. An Updated View •  DIN 31644 standard “Criteria for trustworthy digital archives” http://www.langzeitarchivierung.de •  ISO 16363:2012 - Audit and certification of trustworthy digital repositories http://www.iso16363.org/ •  Merged DSA/WDS Frameworks http://www.datasealofapproval.org/ https://www.icsu-wds.org/ •  Succeeded by https://www.coretrustseal.org/
  12. Formal certification: ISO 16363 •  “Trusted Digital Repository (TDR) Checklist” •  Based on Open Archival Information System (OAIS) and Trusted Repository Audit and Certification (TRAC) •  Over 100 metrics •  Test audits 2011 by PTAB (Primary Trustworthy Digital Repository Authorisation Body) •  Full external auditing process •  ISO 16919: Requirements for bodies providing audit and certification of candidate trustworthy digital repositories •  http://www.iso16363.org/
  13. Extended Certification: nestorSeal •  34 criteria written by German NESTOR-group and adopted in Germany as DIN 31644 •  Self-assessment procedure by NESTOR leads to NESTOR seal •  Review of the assessment by 2 reviewers, appointed by NESTOR •  Self-assessment and evidence on website •  2 seals acquired (DANS and DNB) •  http://www.langzeitarchivierung.de/Subsites/nestor/EN/Siegel/siegel_node.html
  14. CoreTrustSeal: DSA and WDS: Look-Alikes •  Communalities: •  Lightweight, self-assessment, community review •  Complementarity: •  Geographical spread •  Disciplinary spread
  15. Partnership Goals: •  Realizing efficiencies •  Simplifying assessment options •  Stimulating more certifications Outcomes: •  Common catalogue of requirements for core repository assessment •  Common procedures for assessment •  One new certification body, replacing DSA and WDS certification •  New, not-for-profit Stichting based in NL
  16. https://www.coretrustseal.org/
  17. Orientation to CTS Assessment Key documents: •  An Introduction to the Core Trustworthy Data Repositories Requirements •  Core Trustworthy Data Repositories Requirements •  Glossary •  CoreTrustSeal Extended Guidance v1.0 See also: Video capture of Extended Guidance Webinar
  18. CoreTrustSeal requirements •  Organisational infrastructure (6) •  Digital object management (8) •  Technology (2) •  Endorsed RDA output •  EC-recognition as an ICT technical specification •  Self-assessment (publicly available) •  Peer review •  3 year seal period •  Certification mandatory for regular members of WDS
  19. Organisational infrastructure •  R1. The repository has an explicit mission to provide access to and preserve data in its domain. •  R2. The repository maintains all applicable licenses covering data access and use and monitors compliance. •  R3. The repository has a continuity plan to ensure ongoing access to and preservation of its holdings. •  R4. The repository ensures, to the extent possible, that data are created, curated, accessed, and used in compliance with disciplinary and ethical norms. •  R5. The repository has adequate funding and sufficient numbers of qualified staff managed through a clear system of governance to effectively carry out the mission. •  R6. The repository adopts mechanism(s) to secure ongoing expert guidance and feedback (either in-house, or external, including scientific guidance, if relevant).
  20. Digital object management •  R7. The repository guarantees the integrity and authenticity of the data. •  R8. The repository accepts data and metadata based on defined criteria to ensure relevance and understandability for data users. •  R9. The repository applies documented processes and procedures in managing archival storage of the data. •  R10. The repository assumes responsibility for long-term preservation and manages this function in a planned and documented way.
  21. Technical infrastructure •  R15. The repository functions on well-supported operating systems and other core infrastructural software and is using hardware and software technologies appropriate to the services it provides to its Designated Community. •  R16. The technical infrastructure of the repository provides for protection of the facility and its data, products, services, and users.
  22. Certification procedure •  Self-assessment based on requirements •  Online tool available •  Extended guidance (document and webinar) •  Review of the self-assessment by two reviewers under the responsibility of the CTS Board •  URLs to evidence strongly encouraged •  Maturity ratings strongly encouraged •  Responses in English •  Assessments to be publicly available •  Assessment fee €1,000 •  Renewal every 3 years
  23. General points •  Be ready before you start •  Information may undergo ongoing maintenance, but should not be revised during the assessment •  Compliance level: reviewers may give a different level; when lowered, progress at renewal should be demonstrated •  Guidance is always indicative. Not every scenario can be covered in it
  24. Missing evidence •  If missing information makes a requirement impossible to judge, this should be sent back and no compliance level should be given •  Possible familiarity with or inside knowledge of the repository cannot play a role: the assessment should be judged only by evidence
  25. Understandability of documentation •  Reviewers should not be forced to go into detailed reading of long pieces of evidence •  Sections to be clearly indicated in lengthy documents •  A consistent description of the repository as a whole is helpful
  26. Confidentiality of internal documents •  Evidence that is confidential, commercially sensitive, or contains security risks cannot be accepted as such •  This evidence can however be submitted to the reviewer confidentially and should be judged equally •  In future (i.e. the next round) the evidence must be public
  27. Compliance levels •  0 – Not applicable •  1 – The repository has not considered this yet •  2 – The repository has a theoretical concept •  3 – The repository is in the implementation phase •  4 – The guideline has been fully implemented in the repository •  ... to foster the applicants’ own understanding of the current status/maturity of their repositories
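The five-point compliance scale on this slide maps naturally onto an enumeration. The following is a minimal Python sketch, not part of any CoreTrustSeal tooling: the names `ComplianceLevel` and `summarize`, and the sample ratings, are hypothetical and only illustrate how an applicant might tally its own ratings across R1 to R16.

```python
from collections import Counter
from enum import IntEnum


class ComplianceLevel(IntEnum):
    """Compliance levels 0-4 as listed on the slide (member names are illustrative)."""
    NOT_APPLICABLE = 0
    NOT_CONSIDERED = 1
    THEORETICAL_CONCEPT = 2
    IMPLEMENTATION_PHASE = 3
    FULLY_IMPLEMENTED = 4


def summarize(ratings):
    """Count how many requirements sit at each compliance level."""
    counts = Counter(ratings.values())
    return {level.name: counts.get(level, 0) for level in ComplianceLevel}


# Hypothetical self-assessment covering three of the sixteen requirements.
sample = {
    "R1": ComplianceLevel.FULLY_IMPLEMENTED,
    "R3": ComplianceLevel.IMPLEMENTATION_PHASE,
    "R10": ComplianceLevel.IMPLEMENTATION_PHASE,
}
summary = summarize(sample)
```

A tally like this can help an applicant see at a glance which requirements are still below full implementation before submitting.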
  28. Assessment Details, 0-16
  29. 0. Context •  Repository Type •  Comments should explain which roles are being fulfilled. •  Brief Description of the Repository’s Designated Community •  Definition of the Designated Community is important here (see glossary): narrow, broad? •  Where does your data come from? •  Level of Curation Performed •  A. Content distributed as deposited •  B. Basic curation – e.g., brief checking, addition of basic metadata or documentation •  C. Enhanced curation – e.g., conversion to new formats, enhancement of documentation •  D. Data-level curation – as in C above, but with additional editing of deposited data for accuracy •  Outsource Partners. •  A diagram could be useful here.
  30. I. Mission/Scope R1. The repository has an explicit mission to provide access to and preserve data in its domain. Repositories take responsibility for stewardship of digital objects, and for ensuring that materials are held in the appropriate environment for appropriate periods of time. Depositors and users must be clear that preservation of and continued access to the data is an explicit role of the repository. For this Requirement, please describe: •  Your organization’s mission in preserving and providing access to data, and include links to explicit [public] statements of this mission. •  The level of approval within the organisation that such a mission statement has received (e.g., approved public statement, roles mandated by funders, policy statement signed off by governing board). •  If data management is not referred to in the mission statement, then this requirement, as a rule, cannot have a compliance level of 3 or higher.
  31. II. Licenses R2. The repository maintains all applicable licenses covering data access and use and monitors compliance. •  Access and use conditions could be set differently: either as standard terms and conditions, or as differentiated for particular depositors or datasets. These could cover the level of curation, the liability level, the level of responsibility taken for the data, limitations on use, limits on usage environment (safe room, secure remote access), limits on types of users (approved researcher, has received training, etc.). •  The consequences if noncompliance is detected (e.g., sanctions on current or future access/use of data) should be made clear. Ideally, repositories should have a public policy in place for noncompliance. •  The minimum compliance level should be 4, if the applicant is currently providing access to data.
  32. III. Continuity of access R3. The repository has a continuity plan to ensure ongoing access to and preservation of its holdings. The level of responsibility for data should be indicated in the evidence. This information helps the reviewer to judge whether the organisation is sustainable in terms of its finances and processes; in particular the continuity of its collections and responsibilities in the case of cessation of funding. The responsibility for sustainability may not lie in the hands of the repository itself, but with a higher, overarching (or umbrella) organisation.
  33. IV. Confidentiality/Ethics R4. The repository ensures, to the extent possible, that data are created, curated, accessed, and used in compliance with disciplinary and ethical norms. •  All organisations responsible for data have an ethical duty to manage them to the level expected by the scientific practice of their designated community. For repositories holding data about individuals, businesses, or other organisations, there are furthermore obligations that the rights of the data subjects will be protected. These will be both of a legal and ethical nature. •  Disclosure of these data could also present a risk of personal harm, a breach of commercial confidentiality, or the release of critical information (e.g., the location of protected species or an archaeological site). •  Reviewers expect to see evidence that the applicant understands their legal environment and the relevant ethical practices, and has documented procedures. •  Minimum compliance level should be a 4 if the repository is currently providing access to personal data.
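The guidance for R1, R2, and R4 states three concrete compliance-level rules. A minimal Python sketch of checking a draft self-assessment against them follows. The function name `check_level_rules` and its boolean flags are hypothetical, and the integer levels follow the 0 to 4 compliance scale; this is an illustration, not how reviewers actually work.

```python
def check_level_rules(levels, mission_mentions_data_management,
                      provides_data_access, provides_personal_data_access):
    """Return warnings for the R1, R2 and R4 level rules quoted in the slides.

    `levels` maps requirement ids such as "R1" to an integer 0-4.
    All parameter names are illustrative assumptions.
    """
    warnings = []
    # R1: without data management in the mission statement, level 3+ is ruled out.
    if not mission_mentions_data_management and levels.get("R1", 0) >= 3:
        warnings.append("R1: level 3 or higher needs data management in the mission statement")
    # R2: a repository currently providing access to data should be at level 4.
    if provides_data_access and levels.get("R2", 0) < 4:
        warnings.append("R2: minimum level is 4 when providing access to data")
    # R4: a repository providing access to personal data should be at level 4.
    if provides_personal_data_access and levels.get("R4", 0) < 4:
        warnings.append("R4: minimum level is 4 when providing access to personal data")
    return warnings
```

Calling `check_level_rules({"R1": 3, "R2": 3, "R4": 2}, False, True, True)` would flag all three requirements, while a draft with level 4 throughout and data management in the mission statement returns an empty list.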
  34. V. Organizational infrastructure R5. The repository has adequate funding and sufficient numbers of qualified staff managed through a clear system of governance to effectively carry out the mission. •  The repository is hosted by a recognized institution (ensuring long-term stability and sustainability) appropriate to its Designated Community. •  The repository has sufficient funding, including staff resources, IT resources, and a budget for attending meetings when necessary. Ideally this should be for a three- to five-year period. •  The repository ensures that its staff have access to ongoing training and professional development. •  The range and depth of expertise of both the organization and its staff, including any relevant affiliations (e.g., national or international bodies), is appropriate to the mission. The description of this requirement should contain evidence describing the organisation’s governance/management decision-making processes, and the entities involved. Staff should have appropriate training in data management to ensure consistent quality standards. •  To what degree is funding structural or project-based? Can this be expressed in FTE numbers? •  How often does periodic renewal occur?
  35. VI. Expert guidance R6. The repository adopts mechanism(s) to secure ongoing expert guidance and feedback (either in-house, or external, including scientific guidance, if relevant). •  Does the repository have in-house advisers, or an external advisory committee that might be populated with technical members, data science experts, and disciplinary experts? •  How does the repository communicate with the experts for advice? •  How does the repository communicate with its Designated Community for feedback? •  This Requirement seeks to confirm that the repository has access to objective expert advice beyond that provided by skilled staff mentioned in R5 (Organisational infrastructure).
  36. VII. Data integrity and authenticity R7. The repository guarantees the integrity and authenticity of the data. •  A clear and complete context section is important for all requirements, but this is particularly the case for the long requirement 7. The organisation of the curation and the types of data will help guide the reviewer's expectations. The reviewer would benefit from a clear overview of the processes and tools used to curate the data, including the level of manual and automated practice, and how the processes, tools and practices are documented. It is most useful when the applicant responds to each bullet point separately and addresses integrity and authenticity independently, as defined in the requirement. •  Audit trails (written evidence on which actions have been performed on the data) should be elaborated on in the evidence.
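One common way to produce evidence for integrity is a fixity check whose results are recorded in an audit trail. The sketch below is an assumption made for illustration, not something the slides or CoreTrustSeal prescribe: SHA-256 as the checksum, the function names `sha256_of` and `verify_fixity`, and the JSON-lines log format are all hypothetical choices.

```python
import hashlib
import json
from datetime import datetime, timezone


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_fixity(path, expected_checksum, audit_log_path):
    """Compare a file with its stored checksum and append the result to an audit trail."""
    actual = sha256_of(path)
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "action": "fixity_check",
        "file": str(path),
        "expected": expected_checksum,
        "actual": actual,
        "ok": actual == expected_checksum,
    }
    # One JSON object per line keeps the log append-only and easy to review.
    with open(audit_log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(entry) + "\n")
    return entry["ok"]
```

Each call leaves a timestamped record of what was checked and what was found, which is the kind of written evidence of actions performed on the data that the audit-trail bullet asks applicants to describe.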
  37. VIII. Appraisal R8. The repository accepts data and metadata based on defined criteria to ensure relevance and understandability for data users. The applicant should be able to demonstrate that procedures are in place to ensure that only data appropriate to the collection policy are accepted and that they have all the necessary information and procedures and skills to ensure long-term preservation and use relevant for the designated community.
  38. IX. Documented storage procedures R9. The repository applies documented processes and procedures in managing archival storage of the data. •  The reviewer will be looking to understand each of the storage locations which support curation processes, how data are appropriately managed in each environment and that processes are in place to monitor and manage change to storage documentation. •  Can the repository recover from short-term disasters? •  Are procedures documented and standardised in such a way that different data managers, while performing the same tasks separately, will arrive at substantially the same outcome?
  39. X. Preservation plan R10. The repository assumes responsibility for long-term preservation and manages this function in a planned and documented way. The reviewer will be looking for clear managed documentation to ensure (1) a managed approach to long-term preservation, (2) continued access for data types despite format changes, and (3) sufficient documentation to support usability by the designated community. The preservation plan should be managed to ensure that changes to data technology and user requirements are handled in a stable and timely manner.
  40. XI. Data quality R11. The repository has appropriate expertise to address technical data and metadata quality and ensures that sufficient information is available for end users to make quality-related evaluations. •  The applicant should make clear in its statements that it understands the quality levels which can reasonably be expected from depositors. The statements should describe the quality assurance and improvement the repository will undertake during curation and the quality expectations of users, which may involve documentation of areas where quality thresholds have not been met.
  41. XII. Workflows R12. Archiving takes place according to defined workflows from ingest to dissemination. The reviewer is looking for evidence that the applicant takes a consistent, rigorous, documented approach to managing its activities throughout their processes and that changes to those processes are appropriately evaluated, documented, managed and implemented.
  42. XIII. Data discovery and identification R13. The repository enables users to discover the data and refer to them in a persistent way through proper citation. This should contain evidence that the curation of data and metadata is designed to support resource discovery of clearly defined and identified digital objects. It should be clear to the users of this data how it must be cited to provide appropriate academic credit and linkage between related research.
  43. XIV. Data reuse R14. The repository enables reuse of the data over time, ensuring that appropriate metadata are available to support the understanding and use of the data. The applicant should understand the needs of the designated community in terms of their research practices, technical environment, and the standards they use. Changes in technology are important, but appropriate high-quality metadata should also play an essential role and should be mentioned in the evidence provided. The latter information is critical to design curation processes which result in digital objects that meet the needs of the end user as well as generic or disciplinary standards.
  44. XV. Technical infrastructure R15. The repository functions on well-supported operating systems and other core infrastructural software and is using hardware and software technologies appropriate to the services it provides to its Designated Community. The workflows and human actors providing repository services must be supported by a technological infrastructure. If possible this should be demonstrated by using a reference model. The reviewer is looking for evidence that the applicant understands the wider ecosystem of standards, tools and technologies available for (research) data management and curation and has selected options which align with local requirements.
  45. XVI. Security R16. The technical infrastructure of the repository provides for protection of the facility and its data, products, services, and users. •  The applicant should understand the technical risks applicable to its particular service, data, users, and physical environment, and should show that it has mechanisms in place to respond to incidents. •  Evidence must focus on technical infrastructure rather than on managerial and procedural aspects of business continuity. •  In what way is the technical infrastructure determined by the repository or by its host/outsource institution?
  46. Questions