
OpenAIRE webinar on Open Research Data in H2020 (OAW2016)


Presentation by Marjan Grootveld (DANS) on the Open Research Data in H2020 - OpenAIRE webinar Oct. 26 (OAW2016)

Published in: Science

  1. 1. DANS is an institute of KNAW and NWO Open Research Data in H2020 Marjan Grootveld OpenAIRE webinar, 26 October 2016
  2. 2. Who we are Open Access Infrastructure for Research in Europe
  3. 3. DANS: Data Archiving and Networked Services Institute of Dutch Academy and Research Funding Organisation (KNAW & NWO) since 2005 First predecessor dates back to 1964 (Steinmetz Foundation), Historical Data Archive 1989 Mission: promote and provide permanent access to digital research information
  4. 4. DataverseNL for short- and mid-term storage EASY: certified long-term Electronic Archiving System for self-deposit NARCIS: Gateway to scholarly information in the Netherlands Research data in context
  5. 5. Contents • Brief recap from recent OpenAIRE-EUDAT webinars • The updated Guidelines for FAIR Data Management: • F, A, I, R • Costs, data security, ethical aspects, other RDM procedures • Recommendations • Links to EC and OpenAIRE information
  6. 6. Recent webinars Introductory RDM webinar, Tony Ross-Hellauer & Sarah Jones, 26 May: • Reasons to manage data • How to manage and share data (+ how to respond to concerns about sharing) • EUDAT & OpenAIRE services Q&A document: “How to write a DMP”, Sarah Jones & Marjan Grootveld, 7/14 July: • What is a Data Management Plan and why to write it? • Example DMPs in different domains, with lots of links! • Lessons and guidance (e.g. storing =/= archiving; how to find a repository; file-naming conventions) All recordings and slides are on Research Data Services, Expertise & Technology
  7. 7. Recap: why manage data? (Not for the research funder, but for life we make data management plans) Make your research easier Stop yourself drowning in irrelevant stuff Save data for later Avoid accusations of fraud or bad science Write a data paper, connect your nano publications Share your data for re-use & get them validated in real life Get credit for it NON PECUNIAE INVESTIGATIONIS CURATORE SED VITAE FACIMUS PROGRAMMAS DATORUM PROCURATIONIS
  8. 8. Horizon 2020 infographic
  9. 9. Horizon 2020: Open Research Data Pilot The use of a Data Management Plan (DMP) is required for projects participating in the Open Research Data Pilot, detailing what data the project will generate, whether and how they will be exploited or made accessible for verification and re-use, and how they will be curated and preserved.
  10. 10. Guidelines on FAIR DM v.3 Structure of the Guidelines: 1. Background: extension of the pilot 2. DMP general definition 3. Proposal, submission and evaluation 4. RDM plans during the project life cycle 5. Support 6. Annex 1: the DMP template 1. Data summary 2. FAIR data 3. Allocation of resources 4. Data security 5. Ethical aspects 6. Other issues 7. Summary table “Fair DM at a glance”
  11. 11. What’s new? • You should develop a DMP for your project. • There is a single DMP template from start to finish. • The DMP template is inspired by the FAIR principles: research data should be findable, accessible, interoperable and re-usable (without suggesting any specific technology, standard, or implementation solution). Also explicit in the new guidelines: • From 1-1-2017 the pilot will cover all thematic areas of Horizon 2020. • Costs related to open access to research data are eligible for reimbursement during the duration of the project under the conditions defined in the Grant Agreement.
  12. 12. Good things that remain Whether a (proposed) project participates in the ORD pilot or chooses to opt out does not affect the evaluation of that project: proposals will not be penalised for opting out. Participating in the ORD pilot does not necessarily mean opening up all your research data: as open as possible, as closed as necessary. The DMP is a living document. You are not required to provide detailed answers to all the questions in the first version of the DMP (due M6). Deposit in a research data repository: a. the data needed to validate the results presented in scientific publications, including the metadata; b. any other data, including the metadata, as specified in the DMP; c. plus for a-b the documentation and the tools that are needed to validate the results, e.g. specialised software or software code, algorithms and analysis protocols (when possible, these instruments themselves).
  13. 13. DMPonline A web-based tool to help researchers write DMPs Guidance from EUDAT and OpenAIRE being added Choose your funder to get their specific template Choose any additional optional guidance
  14. 14. §2 Making data FAIR Findable – Assign persistent IDs, provide rich metadata, register in a searchable resource, ... Accessible – Retrievable by their ID using a standard protocol, metadata remain accessible even if data aren’t... Interoperable – Use formal, broadly applicable languages, use standard vocabularies, qualified references... Reusable – Rich, accurate metadata, clear licences, provenance, use of community standards...
  15. 15. EC in the Guidelines: “This template is not intended as a strict technical implementation of the FAIR principles, it is rather inspired by FAIR as a general concept.” EC Infographic: w920.png
  16. 16. Some F questions 2.1 Making data findable, including provisions for metadata • Use metadata and specify standards for metadata creation (if any). If there are no standards in your discipline describe what type of metadata will be created and how. • Search keywords • Persistent and unique identifiers such as DOI • File and folder naming conventions: see OpenAIRE-EUDAT July webinar • Versioning of the datasets and clear version numbers
  17. 17. Metadata and documentation • Metadata and documentation are needed to find and understand research data. • Think about what others would need in order to find, evaluate, understand, and reuse your data. • Get others to check the metadata to improve quality. • Use standards to enable interoperability.
  18. 18. Some A questions 2.2 Making data openly accessible: • Explain which data can’t be shared openly, if any • Specify how access will be provided in case of restrictions, e.g. through a data committee, a license, or arranged with the repository. • Will methods or software tools needed to access the data (if any) be included or documented? • Deposit the data and associated metadata, documentation and code preferably in certified repositories which support Open Access. Data Seal of Approval, ICSU World Data System, nestor seal, ISO 16363
  19. 19. Where to find a repository? More information: Zenodo:
  20. 20. File format considerations There is no clear-cut definition of a “sustainable file format”. Each archive has its own expertise, related to its designated community. Examples, by column: 4TU.ResearchData Level 1 | 4TU.ResearchData Level 2 or 3 | DANS Preferred | DANS Accepted. audio: .wav | .ra, .mp3, .wma | .wav, .flac | .aiff, .mp3, .aac. chemistry: NMR, ChemDoodle, ….pdb, .xyz. databases: delimited flat file w/DDL | .mdb, .dbf, .acdb | .sql, .siard, .csv | .mdb, .dbf, .hdf5. … video: .mp1, .mp2, .mp4, .mov … .mpg2, .mpg4, .avi, .mov .mkv
  21. 21. Interoperability Before clocks were invented, people kept time using different instruments to observe the Sun’s zenith at noon. Towns and cities set clocks based on sunsets and sunrises. Time calculation became a serious problem for people travelling by train, sometimes hundreds of miles in a day. UTC is the World's Time Standard.
  22. 22. Some I questions 2.3 Making data interoperable • Specify what data and metadata vocabularies, standards or methodologies you will follow to facilitate interoperability. • Standard vocabulary to allow inter-disciplinary interoperability or a mapping from your vocabulary to more commonly used ontologies?
  23. 23. Some R questions 2.4 Increase data re-use (through clarifying licences) • License the data to permit the widest reuse possible • Specify a data embargo, if this is needed • How long will the data remain reusable? • Describe data quality assurance processes Re-use over time
  24. 24. Licensing research data and software The EUDAT licensing wizard helps you pick a licence for data & software. You should also license Open Access data, or waive rights. Horizon 2020 Open Access guidelines point to: or
  25. 25. Keep everything? For always? When regenerating data is cheaper than archiving, don’t archive. Select what data you’ll need and want to retain. 10 years is often stated in data policies and academic codes, but data can be valuable for ages, in climatology, sociology, health sciences, astronomy, linguistics, … Look beyond minimal retention periods where relevant. “The lifetime of software is generally not as long as that of data” (Daniel Katz et al.) RDNL Selection criteria: management/selecting-research-data/ DCC How-to guide:
  26. 26. §3 Allocation of resources • What are the costs for making data FAIR in your project? • Resources for long term preservation Check the UK Data Service Costing model. Rule of thumb: 5% of the project budget is spent on RDM. The High Level Expert Group on the European Open Science Cloud recommends that “well budgeted data stewardship plans should be made mandatory and we expect that on average about 5% of research expenditure should be spent on properly managing and stewarding data”. UKDS model HLEG report df#view=fit&pagemode=none p. 19
  27. 27. §4-6 Data security • Provisions for data recovery, secure storage, transfer of sensitive data? • Safely stored in certified repositories for long term preservation and curation? Ethical aspects • Any ethical or legal issues that can impact data sharing? • Informed consent for data sharing and long term preservation included in questionnaires dealing with personal data? Which other national/funder/sectorial/departmental procedures for data management do you use (if any)?
  28. 28. Closing remarks Image “Fishbone” CC BY-NC-ND 2.0 by ttps://
  29. 29. Recommendations • Think about the desired end result and plan for this. • Involve all work packages and partners to get a coherent plan. • “Sharing” means “outside the consortium”. • Approach the DMP in whatever way best fits your project: • EC template is intended as a service, not an obligation. Read the background information and the guidance, and use it as a checklist. • More than one dataset? Describe generically what is possible and dataset-specific what is necessary. • Focus effort on datasets you’ll create rather than reuse.
  30. 30. The EC Open Research Data pilot Key sources of information • Guidelines on Open Access to Scientific Publications and Research Data in Horizon 2020 t/h2020-hi-oa-pilot-guide_en.pdf • Guidelines on Data Management in Horizon 2020 t/h2020-hi-oa-data-mgt_en.pdf • Annotated model grant agreement, clause 29.3 020-amga_en.pdf • New infographic summarising key policy points • Open Access and Data Management • issues/open-access-dissemination_en.htm
  31. 31. OpenAIRE support materials • Briefing papers, factsheets, webinars, workshops, FAQs • Information on: • Open Research Data Pilot • Creating a data management plan • Selecting a data repository • Personal data
  32. 32. DANS is an institute of KNAW and NWO Thank you! Acknowledgements: Thanks to Sarah Jones (DCC), OpenAIRE and EUDAT for slides.
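
Illustration for slide 20 (file format considerations): a minimal Python sketch of how a project might check its files against an archive's format list before deposit. It is not part of the original presentation. The extension sets are copied only from the audio and databases examples in the DANS columns of that slide, so they are assumed to be incomplete and possibly outdated; the names `DANS_EXAMPLES` and `classify_format` are hypothetical. Always consult the archive's current list.

```python
from pathlib import Path

# Extensions taken from the DANS "Preferred" and "Accepted" examples on
# slide 20 (audio and databases rows only). Assumption: illustrative,
# not a complete or current DANS format list.
DANS_EXAMPLES = {
    "preferred": {".wav", ".flac", ".sql", ".siard", ".csv"},
    "accepted": {".aiff", ".mp3", ".aac", ".mdb", ".dbf", ".hdf5"},
}


def classify_format(filename: str) -> str:
    """Return 'preferred', 'accepted', or 'unknown' for a file name."""
    # Compare extensions case-insensitively, e.g. ".WAV" equals ".wav".
    ext = Path(filename).suffix.lower()
    for status, extensions in DANS_EXAMPLES.items():
        if ext in extensions:
            return status
    return "unknown"


if __name__ == "__main__":
    for name in ["interview_01.WAV", "survey.mdb", "notes.docx"]:
        print(name, "->", classify_format(name))
```

A file reported as "unknown" is not necessarily unsuitable; it only means the format does not appear in the slide's examples and should be checked with the repository.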