20170530_Open Research Data in Horizon 2020

OpenAIRE spring webinar series

Published in: Science

  1. 1. Open Research Data in Horizon 2020. Marjan Grootveld, DANS; Tony Ross-Hellauer, University of Göttingen. OpenAIRE webinar, 30 May 2017. @MarjanGrootveld @tonyR_H @openaire_eu
  2. 2. Contents • What is OpenAIRE? • Data management: why? • The EC Guidelines for FAIR Data Management • OpenAIRE services • Summary and recommendations • Links to EC and OpenAIRE information
  3. 3. Related webinars: Introductory RDM webinar, Tony Ross-Hellauer & Sarah Jones, 26 May 2016: • Reasons to manage data • How to manage and share data (and how to respond to concerns about sharing) • EUDAT & OpenAIRE services • Q&A document: https://www.openaire.eu/public-documents?id=835&task=document.viewdoc How to write a DMP, Sarah Jones & Marjan Grootveld, 7/14 July 2016: • What is a Data Management Plan and why write one? • Example DMPs in different domains, with lots of links! • Guidance, e.g. storing =/= archiving; how to find a repository; file-naming conventions Open Research Data in H2020 and Zenodo, Marjan Grootveld & Krzysztof Nowak, 26 October 2016: • Sustainable file formats differ across domains and repositories • Funders embrace the FAIR data principles – implications for Data Management Planning? • Slides: www.slideshare.net/OpenAIRE_eu/openaire-webinar-on-open-research-data-in-h2020-oaw2016 • Q&A document: https://www.openaire.eu/public-documents?id=843&task=document.viewdoc FAIR data in Trustworthy Data Repositories, Peter Doorn & Ingrid Dillo, 12/13 December 2016: • Proposal for scoring datasets on Findability, Accessibility and Interoperability = Reusability levels • Inspired by the Data Seal of Approval criteria for Trustworthy Data Repositories • Slides: http://www.slideshare.net/EUDAT/fair-data-in-trustworthy-data-repositories-webinar-1213-december-2016 EUDAT: https://www.eudat.eu
  4. 4. WHAT IS OPENAIRE?
  5. 5. A “dual core” eInfrastructure for Open Scholarship: a Human Network and a Digital Network. Fosters the social and technical links that enable Open Science in Europe and beyond.
  6. 6. Linked Open Science. Policies and practices hand in hand for sustainable OA. Putting research in its proper context: • Intelligent discovery • Transparency and trust • Reproducibility • Monitoring & analysis
  7. 7. Who we are • 50 partners from every EU country, and beyond • In 24/7 operation since 2010 • 4 project phases to date • Legal entity in 2017. Expertise: • Open Access experts: institutional, national and international perspectives on OA policies & e-Infrastructures • Information & Computer Science experts: building efficient e-Infra technologies; state-of-the-art technologies (big data, linked data) • Legal experts: legal & policy recommendations • Data communities: best practices for data; linking to data infrastructures
  8. 8. People make the difference: local support for Europe’s diverse research landscape. Human support network • 33 expert nodes all over Europe to help with: • OA training and support • OA policy development • Technical assistance • World-wide synergies
  9. 9. Smart services for all: researchers & research communities; data providers; funders & research administrators; 3rd party service providers. Dashboards for data providers, funders and researcher communities. Open Science services for the whole research life-cycle.
  10. 10. OPEN DATA IN H2020
  11. 11. EC Open Access Mandate Progression. FP7 (2008): • 20% programme areas • Deposit in repositories • APC payments during project • ERC OA Guidelines. Horizon 2020 (2014): • 100% programme areas • Deposit in repositories • APCs during and after project • Open Data Pilot (100% from 2017)
  12. 12. Which H2020 projects are affected? Projects starting from January 2017 are by default part of the Open Data policy. If your project started earlier and stems from one of these Horizon 2020 areas, you will automatically be part of the pilot as well: • Future and Emerging Technologies • Research infrastructures (including e-Infrastructures) • Leadership in enabling and industrial technologies – Information and Communication Technologies • Nanotechnologies, Advanced Materials, Advanced Manufacturing and Processing, and Biotechnology: ‘nanosafety’ and ‘modelling’ topics • Societal Challenge: Food security, sustainable agriculture and forestry, marine and maritime and inland water research and the bioeconomy - selected topics in the calls H2020-SFS-2016/2017, H2020-BG-2016/2017, H2020-RUR-2016/2017 and H2020-BB-2016/2017, as specified in the work programme • Societal Challenge: Climate Action, Environment, Resource Efficiency and Raw materials – except raw materials • Societal Challenge: Europe in a changing world – inclusive, innovative and reflective Societies • Science with and for Society • Cross-cutting activities - focus areas – part Smart and Sustainable Cities.
  13. 13. Open Research Data policy requirements • Deposit the data underlying your scientific publications, including the metadata, documentation and tools needed to validate the results, in a research data repository. • Sharing more data is encouraged. • Write, and keep up-to-date, a Data Management Plan. • Make data “as open as possible, as closed as necessary”: opting out – fully or in part – is possible, but needs justification. http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-data-mgt_en.pdf
  14. 14. Reasons for opting out: • Participation is incompatible with the Horizon 2020 obligation to protect results that can reasonably be expected to be commercially or industrially exploited; • Participation is incompatible with the need for confidentiality in connection with security issues; • Participation is incompatible with rules on protecting personal data; • The project will not generate / collect any research data; or • There are other legitimate reasons not to take part in the Pilot.
  15. 15. EC Open Research Data Pilot Opt-out Reasons: https://open-data.europa.eu/data/dataset/open-research-data-the-uptake-of-the-pilot-in-the-first-calls-of-horizon-2020
  16. 16. The EC Open Research Data policy: key sources of information • Guidelines on Open Access to Scientific Publications and Research Data in Horizon 2020: http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-pilot-guide_en.pdf • Guidelines on Data Management in Horizon 2020: http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-data-mgt_en.pdf • Annotated model grant agreement, clause 29.3: http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/amga/h2020-amga_en.pdf • Infographic summarising key policy points: http://ec.europa.eu/research/press/2016/pdf/opendata-infographic_072016.pdf • Open Access and Data Management: http://ec.europa.eu/research/participants/docs/h2020-funding-guide/cross-cutting-issues/open-access-dissemination_en.htm
  17. 17. DATA MANAGEMENT: WHY?
  18. 18. Why manage data? Image CC-BY-NC-SA by Leo Reynolds www.flickr.com/photos/lwr/13442910354
  19. 19. Data explosion • More and more data is being created • The issue is not creating data, but being able to navigate and use it • Data management is critical to make sure data are well-organised, understandable and reusable
  20. 20. A reproducibility crisis
  21. 21. Data loss. Digital data are fragile and susceptible to loss for a wide variety of reasons: • Natural disaster • Facilities infrastructure failure • Storage failure • Server hardware/software failure • Application software failure • Format obsolescence • Human error • Malicious attack • Loss of staffing competencies • Loss of institutional commitment • Loss of financial stability • Changes in user expectations. Image CC BY-NC-SA 2.0 by Dave Hill https://www.flickr.com/photos/dmh650/4031607067
  22. 22. And there is you! • Make your research easier • Stop yourself drowning in irrelevant stuff • Save data for later • Avoid accusations of fraud or sloppy science • Write a data paper, connect your nanopublications • Share your data for re-use & get them validated in real life • Get credit for it
  23. 23. Managing & sharing data
  24. 24. Research data lifecycle: • CREATING DATA: designing research, DMPs, planning consent, locating existing data, data collection and management, capturing and creating metadata • PROCESSING DATA: entering, transcribing, checking, validating and cleaning data, anonymising data, describing data, managing and storing data • ANALYSING DATA: interpreting and deriving data, producing outputs, authoring publications, preparing for sharing • PRESERVING DATA: data storage, back-up & archiving, migrating to best format & medium, creating metadata and documentation • GIVING ACCESS TO DATA: distributing data, sharing data, controlling access, establishing copyright, promoting data • RE-USING DATA: follow-up research, new research, undertaking research reviews, scrutinising findings, teaching & learning. Ref: UK Data Archive: http://www.data-archive.ac.uk/create-manage/life-cycle
  25. 25. From the re-use perspective: What is needed in order to find, evaluate, understand, and reuse someone’s data – and to give them credit? Lifecycle stages: CREATING DATA, PROCESSING DATA, PRESERVING DATA, GIVING ACCESS TO DATA, RE-USING DATA
  26. 26. FAIR DATA MANAGEMENT
  27. 27. Making data FAIR • Findable: assign persistent IDs, provide rich metadata, register in a searchable resource, ... • Accessible: retrievable by their ID using a standard protocol; metadata remain accessible even if the data are not, ... • Interoperable: use formal, broadly applicable languages, standard vocabularies, qualified references, ... • Reusable: rich, accurate metadata, clear licences, provenance, use of community standards, ... www.force11.org/group/fairgroup/fairprinciples and http://www.nature.com/articles/sdata201618
  28. 28. EC FAIR data. The EC in the Guidelines: “This template is not intended as a strict technical implementation of the FAIR principles, it is rather inspired by FAIR as a general concept (…) without suggesting any specific technology, standard, or implementation solution”
  29. 29. Example: raw data in Zenodo
  30. 30. Some other funders that require DMPs
  31. 31. Data Management Plans. A DMP is a brief plan to define: • how the data will be created • how it will be documented • who can access it • where it will be stored • whether it will be shared • where it will be preserved. DMPs are sometimes submitted as part of grant applications, sometimes afterwards, but they are useful whenever researchers are creating data. Lifecycle stages: CREATING DATA, PROCESSING DATA, ANALYSING DATA, PRESERVING DATA, GIVING ACCESS TO DATA, RE-USING DATA
  32. 32. DMPonline: a web-based tool to help researchers write DMPs. https://dmponline.dcc.ac.uk • Choose your funder to get their specific template • Choose any additional optional guidance
  33. 33. Some “F” questions. §2.1 Making data findable, including provisions for metadata: • Use metadata and specify standards for metadata creation (if any). If there are no standards in your discipline, describe what type of metadata will be created and how. • Use search keywords • Persistent and unique identifiers such as DOI • File and folder naming conventions: see OpenAIRE-EUDAT July 2016 webinar • Versioning of the datasets and clear version numbers
  34. 34. Metadata and documentation • Metadata and documentation are needed to locate and understand research data. • Use relevant standards to enable interoperability. • Check what the long-term repository supports or expects. • Get others to check the metadata to improve quality. http://rd-alliance.github.io/metadata-directory
  35. 35. Documentation? • Code book explaining the variables • Study design • Lab journal • IPython or Jupyter notebook • Statistical queries • Software or instruments to understand or reproduce the data • Machine configurations • Consent information • Data usage licence • … In short: document and preserve everything that is needed to reproduce the study – ideally following the standard in your discipline
  36. 36. Some “A” questions. § 2.2 Making data openly accessible: • Explain which data can’t be shared openly, if any • Specify how access will be provided in case of restrictions, e.g. through a data committee, a licence, or arranged with the repository. • Will methods or software tools needed to access the data (if any) be included or documented? • Deposit the data and associated metadata, documentation and code preferably in certified repositories which support Open Access. Certifications: Data Seal of Approval, ICSU World Data System, nestor seal, ISO 16363
  37. 37. Where to find a repository? More information: https://www.openaire.eu/opendatapilot-repository Zenodo: http://www.zenodo.org Re3data.org: http://www.re3data.org
  38. 38. Keep everything? Forever? Select what data you’ll need and want to retain. Some selection criteria: • Data underlying publications • What can’t be recreated, like interviews or environmental recordings • What is potentially useful to others • What has scientific, cultural or historical value. 10 years is often stated in data policies and academic codes, but data can be valuable for ages, in climatology, sociology, health sciences, astronomy, linguistics, … Look beyond minimal retention periods where relevant. RDNL Selection criteria: http://www.researchdata.nl/en/services/data-management/selecting-research-data/ DCC How-to guide: http://www.dcc.ac.uk/resources/how-guides/appraise-select-data
  39. 39. Interoperability. Before clocks were invented, people kept time using different instruments to observe the Sun’s zenith at noon. Towns and cities set clocks based on sunsets and sunrises. Time calculation became a serious problem for people travelling by train, sometimes hundreds of miles in a day. UTC is the World's Time Standard.
  40. 40. Some “R” questions. § 2.4 Increase data re-use (through clarifying licences): • License the data to permit the widest reuse possible • Specify a data embargo, if this is needed • How long will the data remain reusable? • Describe data quality assurance processes. Re-use over time.
  41. 41. Licensing research data and software. The EUDAT licensing wizard helps you pick licences for data & software: http://ufal.github.io/public-license-selector/ You should also license Open Access data, or waive rights. Horizon 2020 Open Access guidelines point to: … or …
  42. 42. DMPlanning recommendations • Plan for the desired end result: open and reusable data. • Be specific and justify your decisions in the DMP. • Involve all work packages and partners to get a coherent plan. Also consult your Research Support staff and the repository. • Approach the DMP in whatever way best fits your project: • The EC template is intended as a service, not an obligation. Read the background information and the guidance, and use it as a checklist. • More than one dataset? Describe generically what is possible and dataset-specifically what is necessary. • Focus effort on datasets you’ll create rather than reuse.
  43. 43. Concerns about data sharing. Concerns: • inappropriate use due to misunderstanding of research purpose or parameters • security and confidentiality of sensitive data • lack of acknowledgement / credit • loss of advantage when competing for research funding
  44. 44. Concerns about data sharing. Concern: solution • inappropriate use due to misunderstanding of research purpose or parameters: metadata • security and confidentiality of sensitive data: metadata • lack of acknowledgement / credit: metadata • loss of advantage when competing for research funding: metadata
  45. 45. Concerns about data sharing. Concern: solution • inappropriate use due to misunderstanding of research purpose or parameters: provide rich Abstract, Purpose, Use Constraints and Supplemental Information where needed • security and confidentiality of sensitive data: the metadata does NOT contain the data; Use Constraints specify who may access the data and how • lack of acknowledgement / credit: specify a required data citation within the Use Constraints and the licence • loss of data insight and competitive advantage when vying for research funding: create a second, public version with a generalised Data Processing Description
  46. 46. OPENAIRE & EUDAT SERVICES
  47. 47. Zenodo.org • For all content types! • With GitHub integration! • Upload, describe, publish • Create communities!
  48. 48. Zenodo now supports versioning! Alert to show you are not on latest version. “Newer version” is linked to latest version. In “owner view“, you can now easily edit metadata of previous versions and create new versions of a record. Each version gets its own DOI. List shows last 8 versions of a record, with link to “view all versions”. We also create one DOI that represents all the differing versions.
  49. 49. Link data to publications: https://www.openaire.eu/search
  50. 50. OpenAIRE support materials: https://www.openaire.eu/opendatapilot https://www.openaire.eu/support • Briefing papers, factsheets, webinars, workshops, FAQs • Information on: • Open Research Data Pilot • Creating a data management plan • Selecting a data repository • Personal data
  51. 51. OpenAIRE Open Science Helpdesk. If you cannot find an answer to your question, please contact us through our helpdesk: https://www.openaire.eu/support/helpdesk If your question relates directly to your own country, your enquiry will be routed to your local OpenAIRE National Open Access Desk.
  52. 52. NOADs (National Open Access Desks) • 33 local experts on Open Access and Open Science • There to help you! • https://www.openaire.eu/contact-noads
  53. 53. New horizons for Open Data • Literature-Data Integration service: https://dliservice.research-infrastructures.eu • Co-chairing (with Elsevier, DataCite, etc.) to enable exchange of scholarly links across domains and platforms • Developing the forthcoming AMNESIA tool for effective anonymization of sensitive data • OpenAIRE-Connect – new project started 1st Jan 2017 • Open Science as a Service • Customize OpenAIRE administration tools for research communities’ needs
  54. 54. Advising on Open Data • Aligning/harmonizing Open Data policies across Europe via our National Open Access Desks • Providing feedback to EC on Open Data policies • E.g., 2016 OpenAIRE-EUDAT advice on revising H2020 DMP template • Until 21 June 2017, OpenAIRE is collecting feedback on the Horizon 2020 template for Data Management Plans: https://www.surveymonkey.com/r/OpenAIRE_DMP_survey • Conducting legal research into RDM and Open Data • 2013 OpenAIRE study “Safe to be Open” on research data protection https://goo.gl/opMyNd • Follow-up study by same authors coming soon! • 2017 Legal Issues in Open Data Workshop slides and recordings: https://www.openaire.eu/workshop-legal-issues-ord
  55. 55. EUDAT B2 service suite. Covering both access and deposit, from informal data sharing to long-term archiving, and addressing identification, discoverability and computability of both long-tail and big data, EUDAT’s services address the full lifecycle of research data. https://www.eudat.eu/
  57. 57. Closing remarks. Image “Fishbone” CC BY-NC-ND 2.0 by https://www.flickr.com/photos/mrjnl/
  58. 58. Summary of RDM in H2020 • Research data should be as open as possible, as closed as necessary. • In H2020 the DMP is a regular project deliverable, due by month 6. • A DMP is a living document: to be used, updated and shared. • You can use the H2020 template in DMPonline. • Deposit the data in a research data repository. Look early for a research data repository for sharing and preserving the data long term. • “Sharing” means “outside the consortium”. If (part of) your data cannot be shared with everyone, exemptions apply. • Manage and document all data FAIRly, whether they will be open or not.
  59. 59. Questions? Thanks to colleagues at OpenAIRE, EUDAT and DCC for content! www.openaire.eu @openaire_eu facebook.com/groups/openaire linkedin.com/groups/OpenAIRE-3893548 marjan.grootveld@dans.knaw.nl ross-hellauer@sub.uni-goettingen.de
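
The “F” questions on slide 33 (metadata standard, search keywords, persistent identifier, file-naming convention, clear version numbers) could be turned into a small pre-deposit check. What follows is only a minimal Python sketch, not part of the EC template or of any OpenAIRE, EUDAT or Zenodo tool. The `DatasetRecord` fields, the `findability_gaps` helper, the file-name pattern `<project>_<description>_<YYYYMMDD>_v<major>.<minor>.<ext>` and the placeholder identifier are all assumptions made for illustration; a real project would follow the convention agreed in its own DMP.

```python
import re
from dataclasses import dataclass, field
from typing import List

# Hypothetical naming convention (an assumption, not an EC requirement):
#   <project>_<description>_<YYYYMMDD>_v<major>.<minor>.<ext>
FILENAME_PATTERN = re.compile(
    r"^(?P<project>[A-Za-z0-9]+)_(?P<description>[A-Za-z0-9-]+)_"
    r"(?P<date>\d{8})_v(?P<major>\d+)\.(?P<minor>\d+)\.(?P<ext>[a-z0-9]+)$"
)


@dataclass
class DatasetRecord:
    """Illustrative description of one dataset; field names are hypothetical."""
    title: str
    filename: str
    identifier: str = ""            # persistent identifier, e.g. a DOI
    keywords: List[str] = field(default_factory=list)
    metadata_standard: str = ""     # standard used, or a description of the metadata


def findability_gaps(record: DatasetRecord) -> List[str]:
    """Return the 'F' items from slide 33 that the record does not yet cover."""
    gaps = []
    if not record.identifier.strip():
        gaps.append("persistent identifier")
    if not record.keywords:
        gaps.append("search keywords")
    if not record.metadata_standard.strip():
        gaps.append("metadata standard or description")
    if FILENAME_PATTERN.match(record.filename) is None:
        gaps.append("file name with date and version number")
    return gaps


if __name__ == "__main__":
    draft = DatasetRecord(
        title="Survey responses",
        filename="final data (2).csv",   # no date, no version number
        keywords=["survey"],
    )
    for gap in findability_gaps(draft):
        print("Missing:", gap)
```

Such a check only flags missing items; it says nothing about the quality of the metadata, which the slides suggest having others review.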
