
Overview of the data pilot and OpenAIRE tools, Elly Dijk and Marjan Grootveld (OpenAIRE workshop, Ghent, Nov.2015)


Workshop: “Sharing Research Data and Open Access to publications in H2020” - University of Ghent, Nov. 2015

Published in: Science


  1. 1. The Data Pilot and OpenAIRE tools Update on Research Data Management Elly Dijk and Marjan Grootveld Data Archiving and Networked Services (DANS)
  2. 2. Outline 1. Introduction to the Open Research Data Pilot 2. Prepare for responsible research 3. During the research project 4. Data services 5. In summary 6. Introduction to the afternoon’s ‘In Practice sessions’ 2
  3. 3. 1. Introduction to the Open Research Data Pilot Horizon 2020 Open Research Data Pilot 3
  4. 4. • Financing European research and innovation projects • Increasing the competitive position of Europe and finding solutions for societal challenges, e.g. climate change, food security, health and wellbeing, secure societies • Successor of the FP7 programme (KP7) • Period 2014 - 2020; the budget is € 80 billion • National Contact Points for the H2020 programme • http://ec.europa.eu/programmes/horizon2020 4
  5. 5. New EC guidelines: 2015 5 http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-pilot-guide_en.pdf
  6. 6. OpenAIRE supports Horizon 2020 demands 6
  7. 7. OpenAIRE support for data All information is available via https://www.openaire.eu/opendatapilot 7
  8. 8. Open Research Data Pilot • Aim: to make the research data generated by selected Horizon 2020 projects accessible with as few restrictions as possible, while at the same time protecting sensitive data from inappropriate access. • EC: information already paid for by the public should not be paid for again. • Open data is data that is free to access and reuse • Two types of data: 1. Data, including metadata, needed to validate the results in scientific publications 2. Other data, including metadata, as specified in the Data Management Plan, like raw data 8
  9. 9. Which research has to participate in the pilot? • Future and Emerging Technologies • Research infrastructures • Leadership in enabling and industrial technologies • Nanotechnologies, Advanced Materials, Advanced Manufacturing and Processing, and Biotechnology • Societal Challenge: Food security, sustainable agriculture and forestry, marine and maritime and inland water research and the bioeconomy • Societal Challenge: Climate Action, Environment, Resource Efficiency and Raw materials • Societal Challenge: Europe in a changing world – inclusive, innovative and reflective Societies • Science with and for Society • Cross-cutting activities - focus areas – part Smart and Sustainable Cities 9
  10. 10. Opting out / opting in • Opting out of the pilot is possible when motivated • And opting in is also possible
  11. 11. Reasons for total or partial opting out • Incompatible with the Horizon 2020 obligation to protect results if they can reasonably be expected to be commercially or industrially exploited; • Incompatible with the need for confidentiality in connection with security issues; • Incompatible with existing rules concerning the protection of personal data; • If the project will not generate / collect any research data; • If there are other legitimate reasons to not take part in the Pilot 11
  12. 12. Opting in • Voluntary opting in also possible • When a researcher wants to publish and share his/her data as open access • Mandate to open access of publications: Aim to deposit at the same time the research data needed to validate the results (“underlying data”) 12
  13. 13. Opt in / Opt out numbers Basis: 3,699 Horizon 2020 signed grant agreements • Calls in core areas: opt out 34.6% (149/431 proposals) • Other areas: voluntary opt in 12.5% (409/3268 proposals) Conclusion: • These numbers in the proposals for the first calls of Horizon 2020 are encouraging. • Comprehensive follow-up needed • Numbers by Daniel Spichtinger, European Commission, at OpenCon 14-11-15 13
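The percentages on this slide can be re-derived from the counts it gives. The following is a minimal Python sketch; the variable and function names are illustrative and do not come from the slides.

```python
# Counts as stated on the slide (first Horizon 2020 calls).
core_opt_out, core_total = 149, 431      # core areas: proposals opting out
other_opt_in, other_total = 409, 3268    # other areas: proposals opting in


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage, rounded to one decimal."""
    return round(100 * part / whole, 1)


core_opt_out_pct = percent(core_opt_out, core_total)
other_opt_in_pct = percent(other_opt_in, other_total)

# The two groups together should add up to the stated basis of 3,699.
total_agreements = core_total + other_total

print(core_opt_out_pct, other_opt_in_pct, total_agreements)
```

The two group totals add up to the basis of 3,699 agreements, so the slide's figures are internally consistent.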
  14. 14. Reasons for opting out Numbers by Daniel Spichtinger, European Commission, at OpenCon 14-11-15 14 Chart values (%): 17.85, 35.37, 5.32, 24.96, 7.79, 8.71 Chart labels: No data generated, IPR protection, Confidentiality, Privacy, Jeopardize main objective, Other
  15. 15. Requirements Open Data Pilot 1. Data Management Plan required within six months after project grant 2. Deposit your data in a research data repository 3. Open data is data that is free to access and reuse: Creative Commons Licence CC-BY or CC0 15
  16. 16. 16
  17. 17. 2. Prepare for responsible research Data management planning Stakeholders 17
  18. 18. Data Management Planning Video by Research Data Netherlands, http://datasupport.researchdata.nl 18
  19. 19. How to write a DMP • Template available from https://dmponline.dcc.ac.uk/ • And from a few national DMPonline sites, e.g. in Spain and Belgium. See https://www.openaire.eu/opendatapilot-dmp - Spain: http://pgd.consorciomadrono.es/ - Belgium: forthcoming 19 1
  20. 20. 20 2 3
  21. 21. 21 4
  22. 22. 22 “The DMP is not a fixed document…” Self-assigned ID
  23. 23. 23 Briefly specify • how data will be captured/created • how it will be documented • according to what standards • who will be able to access it • where it will be stored • how it will be backed up, and • where and how it will be shared and preserved long-term
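The seven points above can be treated as a checklist while a DMP is being drafted. The following is a minimal Python sketch, assuming each dataset's DMP entry is kept as a plain dictionary; the key names and the sample answers are hypothetical and are not part of the EC template or of DMPonline.

```python
# The seven points from the slide as short keys (key names are hypothetical).
DMP_POINTS = [
    "capture",               # how data will be captured/created
    "documentation",         # how it will be documented
    "standards",             # according to what standards
    "access",                # who will be able to access it
    "storage",               # where it will be stored
    "backup",                # how it will be backed up
    "sharing_preservation",  # where and how it will be shared and preserved
]


def missing_points(dataset_entry: dict) -> list:
    """Return the checklist points that have no non-empty answer yet."""
    return [p for p in DMP_POINTS if not str(dataset_entry.get(p, "")).strip()]


# Made-up draft entry: two points answered, one left blank, the rest absent.
draft = {
    "capture": "Sensor readings exported as CSV",
    "storage": "Institutional network drive",
    "backup": "",
}

print(missing_points(draft))
```

A blank answer counts as missing, so points that were started but left empty are flagged in the same way as points that were never touched.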
  24. 24. 24 ID of the dataset, assigned by PI EC guidance PI’s answer Initial DMP 5
  25. 25. Template mid-term review DMP Broad notions: the data and associated metadata should be managed in a way that allows for future reuse 25
  26. 26. Roles and responsibilities Institution RDM policy Facilities €$£ Research funders Publishers Data Availability Policy Commercial partners
  27. 27. Let’s recall the goal: • Open access to research data refers to the right to access and re-use digital research data. Openly accessible research data can typically be accessed, mined, exploited, reproduced and disseminated free of charge for the user. • The use of a Data Management Plan (DMP) is required for projects participating in the Open Research Data Pilot, detailing what data the project will generate, whether and how they will be exploited or made accessible for verification and re-use, and how they will be curated and preserved. http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-data-mgt_en.pdf 27
  28. 28. Negative intermezzo • Stored data is not in itself “curated and preserved” • Preserved (or: archived) data is not in itself findable • Findable data is not in itself accessible • Accessible data is not in itself understandable • Understandable data is not in itself usable What should be archived for long-term reuse is a package of data + context. 28
  29. 29. What should be deposited? • The data needed to validate results in scientific publications (minimally!). • The associated metadata: the dataset’s creator, title, year of publication, repository, identifier etc. • Follow a metadata standard in your line of work, or a generic standard, e.g. Dublin Core or DataCite. Standards are important for discovering and exchanging data. • The repository will assign a persistent ID to the dataset: important for discovering and citing the data. • Documentation like code books, lab journals, informed consent forms – domain-dependent, and important for understanding the data and combining them with other data sources. • Software, hardware, tools, syntax queries, machine configurations – domain-dependent, and important for really using the data. (Alternative: information about the software etc.) Basically, everything that is needed to replicate a study should be available for others. Hence the name “replication package”, although the aspiration is reuse rather than replication: more is most welcome. More data, more information in the package… and described in the DMP. 29
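The descriptive metadata named above (creator, title, year of publication, repository, identifier) can be sketched as a small record type. This is a minimal illustration in Python, not an implementation of Dublin Core or DataCite; the class, the field names and the sample values are hypothetical.

```python
from dataclasses import dataclass, asdict


@dataclass
class DatasetMetadata:
    """Descriptive metadata for one deposited dataset (illustrative only)."""
    creator: str
    title: str
    publication_year: int
    repository: str
    identifier: str = ""  # persistent ID; assigned by the repository on deposit

    def missing_fields(self) -> list:
        """Return the names of fields that are still empty."""
        return [name for name, value in asdict(self).items() if value in ("", None)]


# Made-up example record, before the repository has assigned a persistent ID.
record = DatasetMetadata(
    creator="Example, A.",
    title="Example vegetation survey",
    publication_year=2015,
    repository="Example repository",
)
print(record.missing_fields())

# After deposit, the repository's persistent ID is filled in (placeholder value).
record.identifier = "doi:10.1234/example"
print(record.missing_fields())
```

Leaving the identifier empty until deposit reflects the slide's point that the repository, not the researcher, assigns the persistent ID.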
  30. 30. https://commons.wikimedia.org/wiki/File%3ABudget_Debate_2011_(5611505228).jpg 30
  31. 31. Open Access to all data, unless… Grant Agreement, Art. 29.3, Open Access to research data: • Confidentiality and security issues can be good reasons not to publish or share – all – data. Note in the DMP* the reasons for not giving access, and deposit that part of the data under a Restricted Access regime. • E.g. when regenerating data would be cheaper than archiving, don’t archive. Spend time on selecting what data you’ll need and want to retain. Motivate your criteria in the DMP. See http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-data-mgt_en.pdf For selection criteria see https://www.openaire.eu/opendatapilot 31
  32. 32. Repository, archive, ehm? • A pilot requirement is to “deposit your data in a research data repository”: a digital archive collecting and displaying datasets and their metadata. • Select a data repository that will preserve your data, metadata and possibly tools in the long term. It is advisable to contact the repository of your choice when writing the first version of your DMP. Repositories may offer guidelines for sustainable data formats and metadata standards, as well as support for dealing with sensitive data and licensing. But how to find a repository? More in a few minutes… 32
  33. 33. 33 EC guidance PI’s answer Initial DMP 5
  34. 34. 34 Several export formats 6
  35. 35. Deliver the DMP • Send the initial DMP version to the Commission within six months. • EC: “Since DMPs are expected to mature during the project, more developed versions of the plan can be included as additional deliverables at later stages. (…) New versions of the DMP should be created whenever important changes to the project occur due to inclusion of new data sets, changes in consortium policies or external factors.” 35
  36. 36. 3. During the project Data management is part of good research 36
  37. 37. Roles and responsibilities Institution RDM policy Facilities €$£ Research funders Publishers Data Availability Policy Commercial partners
  38. 38. Linking data and publications • From a data-centric perspective publications are part of a dataset’s context. However, there is no need to include publications in the replication package: • A lot of data repositories also accept publications, and allow linking between publications and their underpinning data. • By means of smart, persistent identifiers – consistently used – linking is also possible across repositories. 38
  39. 39. Incentives 1
  40. 40. Vegetation map 1977 reused in 2015 expedition H.D. Heinemeijer & A.J. van Dijk (1977): Vegetation map Rosenbergdalen, Edgeøya, Svalbard http://sees.nl/
  41. 41. Incentives 3 Image: https://www.flickr.com/photos/dmh650/4031607067/in/gallery-wlef70-72157633022909105/ 41 Data management is a part of good research practice. RCUK Policy and Code of Conduct on the Governance of Good Research Conduct Responsible data management is part of good research. NWO – Introduction to the pilot Data Management
  42. 42. 4. Data Services Trustworthy digital repositories Find a data repository Research data in OpenAIRE 42
  43. 43. Storage and Trust • Local storage facilities during the research • Network of trustworthy digital repositories for long-term preservation of (a selection of) the data after the research is finished • Certification of digital repositories in order to establish trust • 4 certification standards available
  44. 44. Where to find a repository? In order of preference: use 1. an external data archive or repository in your research domain 2. an institutional research data repository, or your research group’s established data management facilities 3. Zenodo.org 4. or search for other data repositories at re3data.org http://www.zenodo.org/ http://www.re3data.org/ 44
  45. 45. Main criteria for choosing a data repository: • Certification as a ‘Trustworthy Digital Repository’, with an explicit ambition to keep the data available in the long term. • Matches your particular data needs: e.g. formats accepted; mixture of Open and Restricted Access. • Gives your submitted dataset a persistent and globally unique identifier: for sustainable citations – both for data and publications – and to link back to particular researchers and grants. • Provides guidance on how to cite the data that has been deposited. How to select a repository? https://www.openaire.eu/opendatapilot-repository 45
  46. 46. • re3data.org is a global registry of research data repositories from different academic disciplines • It presents repositories for the permanent storage and access of data sets • Funded by the German Research Foundation (DFG) • 2015: 1,368 reviewed repositories 46
  47. 47. 47
  48. 48. 48
  49. 49. 49
  50. 50. https://zenodo.org/ Contents 50
  51. 51. OpenAIRE2020 https://zenodo.org/features
  52. 52. OpenAIRE
  53. 53. Research data in OpenAIRE
  54. 54. HYPOX: FP7 PROJECT 52 Publications from 20 different OpenAIRE data providers 392 datasets from PANGAEA Slide from Pedro Principe, University of Minho 54
  55. 55. FP7 projects: publications + datasets HYPOX > https://www.openaire.eu/search/project?projectId=corda_______::abb5725eaf2617c39ae240b4ce1cce3e http://hypox.net; Slide from Pedro Principe, University of Minho 55
  56. 56. FP7 projects: publications + datasets HYPOX > https://www.openaire.eu/search/project?projectId=corda_______::abb5725eaf2617c39ae240b4ce1cce3e 56 Open Access funded Publications aggregated from repositories & journals Datasets from Data Repositories
  57. 57. 5. Summary • Research projects in 9 appointed Horizon 2020 areas are automatically part of the pilot, e.g. Future and emerging technologies; Nanotechnologies; Climate action; Sustainable agriculture. • Opting in / opting out is possible • Data Management Plan required within six months after project grant • Deposit the research data in a trusted research data repository • Open data is data that is free to access and reuse: Creative Commons Licence CC-BY or CC0 • 11,000 open datasets in OpenAIRE 61
  58. 58. Slogan EC: As open as possible, as closed as needed 62
  59. 59. 6. Introduction to the afternoon’s ‘In practice session’ 63
  60. 60. WP 4 Training and Support • Task 4.3. Research Data Management training and support • DANS (Data Archiving and Networked Services) is task leader • Support kit for Open Research Data Pilot: https://www.openaire.eu/opendatapilot Briefing paper: Research Data management - Support for Open Research Data Pilot OpenAIRE 2020 64
  61. 61. Programme ‘In Practice session’ 1. Situation of RDM in your country: • Introduction of you and the situation in your country regarding RDM 2. Breakout sessions ‘Feedback Briefing paper RDM’ • Section 2: Reusability and data management • Section 3: How to plan data management • Section 5: Roles and responsibilities in RDM 3. Wrap up and other questions / suggestions for future support materials 65
  62. 62. www.openaire.eu @openaire_eu facebook.com/groups/openaire linkedin.com/groups/OpenAIRE-3893548 Thank you! elly.dijk@dans.knaw.nl marjan.grootveld@dans.knaw.nl 66
