2010 CLARA Nijmegen - Data Seal of Approval tutorial
A tutorial for the participants of the CLARA summer school about the Data Seal of Approval. Philosophy and practice for quality in research data.

Published in: Education
  • Notes on the ten points of the "Quality and Simplicity" slide:
      1. reduce (restrict to the most important issues; a few guidelines will do)
      2. organize (group the guidelines in sections for producers, custodians, consumers)
      3. time (save time by a smooth assessment process)
      4. learn (use expertise in preservation)
      5. differences (reintroduce complexity in a controlled way, because sometimes it is needed)
      6. context (exploit knowledge of the community and the requirements of the users)
      7. emotion (do not make it purely bureaucratic; keep the feeling of value, enjoy good relationships with stakeholders)
      8. trust (trust by default, but know where your undo button is, even against the ones you trust)
      9. failure (learn from failures; improve the guidelines and the assessment procedures)
      10. focus: subtract what is obvious, add what is meaningful (this is not about the data in bank accounts, nor highly sensitive medical data, nor company archives, but about research data: the scientific value is protected by the guidelines)

    1. Quality and improving interoperability between language resources: trust, process, simplicity
       http://www.datasealofapproval.org/
       Dirk.Roorda@dans.knaw.nl, coordinator infrastructure at http://www.dans.knaw.nl
    2. Quality and interoperability
       • evolution
       • hard-to-fake traits
       • indicating fitness
       • promote interoperability
    3. Overview
       • Introduction and Theory
         • qualities
         • trust, simplicity
         • guidelines
       • Process and Demo
         • assessment and review
       • Discussion and Application
         • CLARIN centers
         • language resources
    4. Scientific Quality
       http://www.ploscompbiol.org/article/metrics/info:doi/10.1371/journal.pcbi.1000112
    5. Scientific quality
       • transparent
         • from producer
         • through repository
         • to consumer
       • properties to guard
         • authenticity
         • integrity
         • provenance
    6. Usage quality
       • data formats
         • usability
       • metadata
         • findability
         • intelligibility
    7. Quality control
       • by the stakeholders
         • data producers
         • data custodians
         • data consumers
       • custodians = repositories
         • substantial role for repositories
         • guidelines for producers
         • agreements for consumers
    8. Quality issues
       • metadata standards
         • CMDI and www.isocat.org
       • preferred formats
         • TEI, XML
       • referencing systems
         • persistent identifiers
       • long term preservation
         • after the live-environment has died off
       • interoperability
         • OAI-PMH
    9. Quality issues
       • search engines
         • CLARIN search and develop
       • access rights
         • comply with privacy law, copyright law
         • respect the people from whom data is obtained
       • accountability
         • for all repository operations
    10. Quality and Trust
        • imperfection lurks everywhere
        • trust works where certainty blocks
        • trust is a process
          • to greater quality
          • to better relationships
          • to more certainty
    11. Quality and Simplicity
        http://lawsofsimplicity.com/
        • reduce
        • organize
        • time
        • learn
        • differences
        • context
        • emotion
        • trust
        • failure
        • focus: subtract what is obvious, add what is meaningful
    12. Guidelines: producers
        http://www.datasealofapproval.org/
        1. The data producer deposits the research data in a data repository with sufficient information for others to assess the scientific and scholarly quality of the research data and compliance with disciplinary and ethical norms.
        2. The data producer provides the research data in formats recommended by the data repository.
        3. The data producer provides the research data together with the metadata requested by the data repository.
    13. Guidelines: consumers
        http://www.datasealofapproval.org/
        14. The data consumer complies with access regulations set by the data repository.
        15. The data consumer conforms to and agrees with any codes of conduct that are generally accepted in higher education and research for the exchange and proper use of knowledge and information.
        16. The data consumer respects the applicable licenses of the data repository regarding the use of the research data.
    14. Guidelines: repositories
        http://www.datasealofapproval.org/
        4. The data repository has an explicit mission in the area of digital archiving and promulgates it.
        5. The data repository uses due diligence to ensure compliance with legal regulations and contracts including, when applicable, regulations governing the protection of human subjects.
        6. The data repository applies documented processes and procedures for managing data storage.
        7. The data repository has a plan for long-term preservation of its digital assets.
    15. Guidelines: repositories
        http://www.datasealofapproval.org/
        8. Archiving takes place according to explicit workflows across the data life cycle.
        9. The data repository assumes responsibility from the data producers for access and availability of the digital objects.
        10. The data repository enables the users to utilize the research data and refer to them.
        11. The data repository ensures the integrity of the digital objects and the metadata.
        12. The data repository ensures the authenticity of the digital objects and the metadata.
        13. The technical infrastructure explicitly supports the tasks and functions described in internationally accepted archival standards like OAIS.
    16. Guidelines: outsourcing
        http://www.datasealofapproval.org/
        Repositories may outsource digital preservation to specialist repositories:
        • implement all guidelines except 4, 6, 7, 8 and 13
        • store a copy of the data in another repository (TDR) that
          • has acquired the DSA logo
          • by implementing each of the sixteen guidelines (including 4, 6, 7, 8 and 13)
    17. Seal of Approval
        • a repository shows it on its webpage
        • if conditions are fulfilled
        • as testified by
          • a self-assessment
          • with reviews
          • on a yearly basis
        • the exact level of compliance is transparently published under the seal
    18. Assessment and review
        • minimum requirements
        • threshold will go up as time proceeds
        • columns: score, actions taken, comments, issues
          • *: nothing done; give a reason
          • **: theoretical concept; point to initiation doc; describe main issues
          • ***: implementation phase; point to definition doc; describe main issues
          • ****: fully implemented; point to definition doc
          • N/A: not applicable; give a reason
    19. Organisation
        • repositories represented by a board
        • tools to facilitate the procedure
          • modification record
        • the DSA website links to compliant repositories
    20. CLARIN centres
        • A = provide infrastructure
          • managing the federation
        • B = provide services
          • data and webservices
        • C = provide metadata
          • harvestable metadata
        • R = respected = recognised
          • offer LRT resources in whatever form
        • E = external
          • offer non-LRT resources or services
          • identity federations
          • national libraries
    21. Group assignment
        • P(roducers)
          • invent p-guidelines for B/C centers
        • R(epositories)
          • invent r-guidelines for A/B centers
        • C(onsumers)
          • invent c-guidelines for B/C/R centers
        Suggestions for
        • assessment
        • review
        • modification record
    22. Wrap-up: P-Group
        • metadata about background information about researchers
          • who, why, publications
          • DAI
          • in IMDI it is difficult to update information, e.g. affiliation updates
          • use unique identifiers for participants in building a corpus, store records of people, and link from the metadata of resources to the records of people
        • using formats
          • depending on formats
          • formats may be standardised but not usable to researchers; I do not want to wrap my data in dead formats: the repositories should support innovation in this respect, when it is driven by researchers
    23. Wrap-up: C-group
        • goal is: finding info in a repository
        • we need:
          • overview of access rights
          • proper web-connection to the repository
          • user-friendly interface
          • low threshold for feedback for new features
        • we should be part of the chain in the design of the access tools
        GUIDELINES
        • WE WANT ALL CENTERS IN THE CHAIN THAT PROVIDE US WITH THE INFORMATION WE NEED TO OFFER US TRANSPARENCY AND VERIFIABILITY ON HOW THEIR DATA IS OBTAINED, PROCESSED AND CONTROLLED/MANAGED
        • WE WANT TOOLS WITH CLEAR COPYRIGHT PERMISSIONS THAT HAVE A
    24. Wrap-up: R-group
        • we provide infrastructure and management for data
        • we want to standardize our stuff
        • we need knowledge, the right metadata of the stuff that is coming to us
        • we want the materials in the right format, allowing for some flexibility
        • retro-archiving: we offer tools for converting legacy data, so that producers may submit raw materials
        • management of data concerning legal access
        • protect the providers, so that the providers can trust the consumers: licensing forms
        • share knowledge about services we provide with potential users: people working in the field, other repositories
        • we want a forum as an instrument for developing trust between producers and consumers: the community becomes more transparent
    25. Wrap-up: General
        • add weights to guidelines, in order to declare some guidelines more important than others
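
The star scale from the "Assessment and review" slide and the wrap-up suggestion to weight guidelines can be combined in a small calculation. The Python sketch below is only an illustration of that idea: the `Statement` record, the weight values, the weighted mean, and the `below_threshold` helper are assumptions introduced here and are not part of the Data Seal of Approval procedure.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Star levels as named on the "Assessment and review" slide.
LEVELS = {
    1: "nothing done",
    2: "theoretical concept",
    3: "implementation phase",
    4: "fully implemented",
}


@dataclass
class Statement:
    """One self-assessment entry (hypothetical record layout)."""
    guideline: int          # guideline number, 1..16
    stars: Optional[int]    # 1..4, or None for "N/A, not applicable"
    evidence: str           # a reason, or a pointer to a document


def weighted_score(statements: List[Statement],
                   weights: Dict[int, float]) -> float:
    """Weighted mean star level over the applicable guidelines.

    `weights` maps a guideline number to an importance weight
    (an assumption of this sketch); missing entries count as 1.0.
    """
    total = 0.0
    weight_sum = 0.0
    for s in statements:
        if s.stars is None:          # N/A entries are left out
            continue
        if s.stars not in LEVELS:
            raise ValueError(f"invalid star level: {s.stars}")
        w = weights.get(s.guideline, 1.0)
        total += w * s.stars
        weight_sum += w
    if weight_sum == 0:
        raise ValueError("no applicable guidelines to score")
    return total / weight_sum


def below_threshold(statements: List[Statement], minimum: int) -> List[int]:
    """Guideline numbers whose star level is under the minimum requirement."""
    return [s.guideline for s in statements
            if s.stars is not None and s.stars < minimum]


if __name__ == "__main__":
    example = [
        Statement(11, 4, "point to definition doc"),
        Statement(4, 2, "point to initiation doc"),
        Statement(5, None, "give a reason"),
    ]
    # Guideline 11 (integrity) is given three times the default weight.
    print(weighted_score(example, {11: 3.0}))
    print(below_threshold(example, minimum=3))
```

Keeping the minimum as a parameter reflects the slide's remark that the threshold will go up as time proceeds; how weights would actually be chosen was left open in the workshop.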

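The outsourcing rule on the "Guidelines: outsourcing" slide amounts to simple set arithmetic over the sixteen guideline numbers. The following Python sketch shows one possible way to check it; the function names and the `partner_has_dsa` flag are hypothetical and only serve the illustration.

```python
# All sixteen guidelines, and the five that may be outsourced
# according to the "Guidelines: outsourcing" slide.
ALL_GUIDELINES = frozenset(range(1, 17))
OUTSOURCEABLE = frozenset({4, 6, 7, 8, 13})


def required_guidelines(outsources: bool) -> frozenset:
    """Guidelines the repository must implement itself."""
    if outsources:
        return ALL_GUIDELINES - OUTSOURCEABLE
    return ALL_GUIDELINES


def missing_guidelines(implemented, outsources=False, partner_has_dsa=False):
    """Sorted guideline numbers that still need to be implemented.

    When preservation is outsourced, the partner repository must hold
    the DSA logo; otherwise the rule on the slide is not satisfied.
    """
    if outsources and not partner_has_dsa:
        raise ValueError("outsourcing requires a partner with the DSA logo")
    return sorted(required_guidelines(outsources) - set(implemented))


if __name__ == "__main__":
    done = {1, 2, 3, 5, 9, 10, 11, 12, 14, 15, 16}
    print(missing_guidelines(done, outsources=True, partner_has_dsa=True))
    print(missing_guidelines(done))
```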