1. Facilitate Open Science Training for European Research
What are the Horizon 2020 Open Access / Open Data
Mandates? What should applicants comply with?
Martin Donnelly
Digital Curation Centre
University of Edinburgh
NCP Academy Webinar
23 June 2017
2. The Digital Curation Centre (DCC)
• UK national centre of expertise in digital preservation
and data management, est. 2004
• Principal audience has been the UK higher education
sector, but we increasingly work further afield
(continental Europe, North America, South Africa, Asia…)
• Provide guidance, training, tools (e.g. DMPonline) and
other services on all aspects of research data
management and Open Science
• Organise national and international events and webinars
(International Digital Curation Conference, Research
Data Management Forum)
• Offer tailored consultancy/training
3. Background (me)
• Academic background in cultural heritage computing…
• Which led me to work in digital preservation…
• Which led to my current involvement in research data
management and the broader topic of Open Science
• I've been involved to various degrees in the development
of early DMP resources (DCC Checklist, DMPonline,
DMPTool, book chapter on DMP…)
• Member of the original FOSTER consortium
• Also involved in consultancy, advocacy, events, training
etc., e.g. as external expert reviewer of Horizon 2020
DMPs
4. Overview
1. Timeline / context
2. Open Access to publications
3. Open Data pilot
4. Reviewing Data Management Plans
5. Links and resources
6. About the FOSTER project
7. Contact details
5. Timeline / context
• FP7 (2007-2013)
  • Open Access pilot (launched 08/08) covered c. 20% of
  budget, including Information and Communication
  Technologies; Research Infrastructures (e-Infrastructures);
  and Science in Society
  • Research data not mandated, but supported via Research
  Infrastructure projects
• Horizon 2020 / FP8 (2014-2020)
  • Open Access mandate for all published research papers
  • Initially a limited Open Data Pilot, subsequently expanded
  to cover all H2020 thematic areas
  • FAIR approach adopted (07/16)
  • Research Infrastructure projects continue
6. Benefits recap (as the EC sees it)
• Broader access to scientific publications and
data helps to:
  • build on previous research results (improved quality
  of results)
  • encourage collaboration and avoid duplication of
  effort (greater efficiency)
  • speed up innovation (faster progress to market means
  faster growth)
  • involve citizens and society (improved transparency
  of the scientific process)
9. Open Access to publications
• Open Access to scientific publications means free
online access for any user
• For the EC, "access" includes not only the right to read,
download and print, but also the right to copy,
distribute, search, link, crawl and mine
• Mandate:
  • At minimum, must ensure that any scientific, published,
  peer-reviewed journal papers can be read online, downloaded and
  printed. (Beneficiaries are also encouraged to apply this to other
  forms of publication, like books, monographs and conference
  proceedings.)
  • So far as possible, increase the usefulness of publications by
  granting rights to copy, distribute, search, link, crawl and mine
10. Two-step policy
• Step 1. Deposit in a disciplinary or institutional/
national repository
  • As soon as possible, and at latest upon publication
  • This must be done regardless of green/gold choice at Step 2
  • 6-12 month embargo permitted, depending on discipline
• Step 2. Grant Open Access
  • Either via green or green-and-gold route
  • Researcher chooses where to publish
  • APCs are eligible for reimbursement during the duration of the
  project
• There are also specific conditions re. the format of bibliographic
metadata, which must be open to aid discoverability
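The embargo timing above can be sketched as a small date calculation. This is a rough illustration rather than an official tool: the function names are hypothetical, the embargo is assumed to run from the publication date, and its length is passed in as a parameter because the slide only gives the permitted range.

```python
import calendar
from datetime import date


def add_months(start: date, months: int) -> date:
    """Return the date `months` calendar months after `start`.

    The day is clamped to the last day of the target month,
    so 31 August plus 6 months lands on the last day of February.
    """
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def latest_open_access_date(published: date, embargo_months: int) -> date:
    """Latest date by which the deposited copy should be openly accessible.

    Assumption: the embargo starts at publication and may not exceed
    12 months, the upper end of the range mentioned on the slide.
    """
    if not 0 <= embargo_months <= 12:
        raise ValueError("embargo must be between 0 and 12 months")
    return add_months(published, embargo_months)
```

For example, `latest_open_access_date(date(2017, 6, 23), 6)` gives 23 December 2017. The applicable embargo for a given discipline should be taken from the Guidelines, not from this sketch.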
12. Open Data Definitions I
• Open access to research data is "the right to access and
reuse digital research data under the terms and
conditions set out in the Grant Agreement"
• Research data is "information, in particular facts or
numbers, collected to be examined and considered as
a basis for reasoning, discussion, or calculation"
  • In a research context, examples of data include statistics, results
  of experiments, measurements, observations resulting from
  fieldwork, survey results, interview recordings and images. The
  focus is on research data that is available in digital form
• Users can normally access, mine, exploit, reproduce
and disseminate openly accessible research data free of
charge
13. Open Data Definitions II
• Types of data covered by the Open Research
Data Pilot:
  • 'underlying data' (the data needed to validate the
  results presented in scientific publications), including
  the associated metadata (i.e. metadata describing the
  research data deposited), as soon as possible
  • any other data (for instance curated data not directly
  attributable to a publication, or raw data), including
  the associated metadata, as specified and within the
  deadlines laid down in the DMP, that is, according to
  the individual judgement by each project/beneficiary
14. Open Data Definitions III
• The EC has adopted the FORCE11 "FAIR" approach to
research data management.
• This states that "One of the grand challenges of
data-intensive science is to facilitate knowledge discovery by
assisting humans and machines in their discovery of,
access to, integration and analysis of, task-appropriate
scientific data and their associated algorithms and
workflows."
• To help achieve this, (meta)data should be…
  • Findable
  • Accessible
  • Interoperable
  • Reusable
15. The FAIR Data Principles (1/4)
To be Findable:
F1. (meta)data are assigned a globally unique and
eternally persistent identifier.
F2. data are described with rich metadata.
F3. (meta)data are registered or indexed in a
searchable resource.
F4. metadata specify the data identifier.
16. The FAIR Data Principles (2/4)
To be Accessible:
A1. (meta)data are retrievable by their
identifier using a standardized communications
protocol.
A1.1. the protocol is open, free, and universally
implementable.
A1.2. the protocol allows for an authentication and
authorization procedure, where necessary.
A2. metadata are accessible, even when the data are
no longer available.
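As one illustration of A1, an HTTPS request to a DOI resolver retrieves metadata by identifier over a standardized, open protocol. The sketch below only builds such a request and does not send it. The DOI is a made-up placeholder, and using DOI content negotiation with the CSL JSON media type is an assumption about one common setup, not something the slides prescribe.

```python
from urllib.request import Request

DOI_RESOLVER = "https://doi.org/"  # public DOI resolver
CSL_JSON = "application/vnd.citationstyles.csl+json"  # metadata media type


def build_metadata_request(doi: str) -> Request:
    """Build (but do not send) an HTTP GET request for a DOI's metadata."""
    return Request(DOI_RESOLVER + doi, headers={"Accept": CSL_JSON})


# Hypothetical DOI, for illustration only.
request = build_metadata_request("10.1234/example-dataset")
```

Passing `request` to `urllib.request.urlopen` would perform the retrieval, which needs network access and a DOI that actually exists.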
17. The FAIR Data Principles (3/4)
To be Interoperable:
I1. (meta)data use a formal, accessible, shared, and
broadly applicable language for knowledge
representation.
I2. (meta)data use vocabularies that follow FAIR
principles.
I3. (meta)data include qualified references to other
(meta)data.
18. The FAIR Data Principles (4/4)
To be Re-usable:
R1. (meta)data have a plurality of accurate and
relevant attributes.
R1.1. (meta)data are released with a clear and
accessible data usage license.
R1.2. (meta)data are associated with
their provenance.
R1.3. (meta)data meet domain-relevant community
standards.
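A few of the principles above lend themselves to a simple presence check on a metadata record. The sketch below is a rough illustration only: the field names (`identifier`, `title`, `description`, `license`, `provenance`) are hypothetical rather than taken from any particular metadata standard, and passing these checks does not by itself make a dataset FAIR.

```python
from typing import Dict, List


def fair_gaps(record: Dict[str, str]) -> List[str]:
    """Return labels of the basic checks that `record` fails.

    These are crude presence tests on hypothetical field names.
    """
    checks = {
        "F1 identifier": bool(record.get("identifier")),
        "F2 descriptive metadata": bool(
            record.get("title") and record.get("description")
        ),
        "R1.1 licence": bool(record.get("license")),
        "R1.2 provenance": bool(record.get("provenance")),
    }
    return [label for label, passed in checks.items() if not passed]


# Hypothetical record, for illustration only.
example = {
    "identifier": "doi:10.1234/example-dataset",
    "title": "Example survey responses",
    "description": "Anonymised responses to a questionnaire.",
    "license": "CC-BY-4.0",
}
gaps = fair_gaps(example)  # this record has no provenance field
```

A check like this could run before deposit to catch obviously incomplete records. Principles such as I1 or R1.3 need domain knowledge and are not covered here.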
19. Data Management Plan
• The DMP should include information on:
  • the handling of research data during and after the end of the
  project
  • what data will be collected, processed and/or generated
  • which methodology and standards will be applied
  • whether data will be shared/made open access, and
  • how data will be curated and preserved (including after the end
  of the project)
• DMPs are submitted as deliverables; the first iteration is
due at month 6 (M6)
• A template and guidance are given in the Guidelines document
20. Complying with H2020 Open Data Policy (I)
• Deposit research data, preferably in a research data
repository.
  • These are online research data archives, which may be
  subject-based/thematic, institutional or centralised. Useful
  listings of repositories include the Registry of Research Data
  Repositories and Databib. The Open Access Infrastructure for
  Research in Europe (OpenAIRE) provides additional
  information and support on linking publications to underlying
  research data.
  • Some repositories, like Zenodo (an OpenAIRE and CERN
  collaboration), allow researchers to deposit both publications
  and data, while providing tools to link them. Zenodo and
  some other repositories, as well as many academic publishers,
  also facilitate linking publications and underlying data through
  persistent identifiers and data citations.
21. Complying with H2020 Open Data Policy (II)
• Take measures to enable third parties to access, mine,
exploit, reproduce and disseminate (free of charge for any
user) this research data, so far as possible.
  • One straightforward and effective way of doing this is to
  attach Creative Commons Licences (CC BY or CC0) to the data
  deposited. The EUDAT B2SHARE tool includes a built-in license
  wizard that facilitates the selection of an adequate license for
  research data.
  • At the same time, projects should provide information via the
  chosen repository about the tools available to the
  beneficiaries that are needed to validate the results, e.g.
  specialised software or software code, algorithms and analysis
  protocols. Where possible, they should provide these
  instruments themselves.
23. Reviewing DMPs
• I've been involved in two sets of EC reviews: June/July
2016, and February/March 2017
• These have been first iterations of H2020 DMPs, at the
six-month stage
• At this stage we are mainly looking for assurances that
the project team has thought things through, and is
considering data management from early in the project
lifecycle
• Reviewers can also flag up potential issues and make
suggestions/recommendations via the Project Officers
• Specific details often come later, when the data
generation actually takes place. The expectation is that
the DMP will be updated whenever something important
happens, such as a new dataset being created
25. Links and resources
• Open Access section of the H2020 Online Manual -
http://ec.europa.eu/research/participants/docs/h2020-funding-guide/cross-cutting-issues/open-access-data-management/open-access_en.htm
• Guidelines to the Rules on Open Access to Scientific Publications and Open Access to
Research Data in Horizon 2020 (v3.2, March 2017) -
http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-pilot-guide_en.pdf
• H2020 Programme Guidelines on FAIR Data Management in Horizon 2020 (v3.0, July 2016) -
http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-data-mgt_en.pdf
• Article 29.2 of the Model Grant Agreement (OA) -
http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/amga/h2020-amga_en.pdf#page=221
• Article 29.3 of the Model Grant Agreement (data) -
http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/amga/h2020-amga_en.pdf#page=222
• FORCE11 FAIR data principles - https://www.force11.org/group/fairgroup/fairprinciples
27. The project
• Phase 1 (2014-2016): Spread the Seeds of Open Science and
Open Access
  • Creation of Open Science Taxonomy
  • 2000+ training materials, categorized in the FOSTER Portal
  • More than 100 face-to-face training events in 28 countries and 25
  online courses, totalling more than 6300 participants
http://fosteropenscience.eu
28. The project
• Phase 2 (2017-2019): Let the Flowers of Open Science Bloom
• Focus on:
  • Training for the practical implementation of Open Science (face to face
  and online) including RDM and Open Data
  • Developing intermediate/advanced level/discipline-specific training
  resources in collaboration with three disciplinary communities (and
  related RIs): Life Sciences (ELIXIR), Social Sciences (CESSDA) and
  Humanities (DARIAH)
  • Update the FOSTER Portal to support moderated learning, badges and
  gamification
• In concrete terms:
  • 150 new training resources
  • Over 50 training events (outcome-oriented, providing participants with
  tangible skills) and 20 e-learning courses
  • Multi-module Open Science Toolkit
  • Trainers Network, Open Science Bootcamp, Open Science Training
  Handbook, and more…
30. Contact details
• For more information about the
FOSTER project:
  • Website: www.fosteropenscience.eu
  • Principal investigator: Eloy Rodrigues
  (eloy@sdum.uminho.pt)
  • General enquiries: Gwen Franck
  (gwen.franck@eifl.net)
  • Twitter: @fosterscience
• My contact details:
  • Email: martin.donnelly@ed.ac.uk
  • Twitter: @mkdDCC
  • Slideshare: http://www.slideshare.net/martindonnelly
This work is licensed under the
Creative Commons Attribution
2.5 UK: Scotland License.