Making Research Data Repositories
Visible – The re3data.org Registry
Frank Scholze | Karlsruhe Institute of Technology, KIT Library
Heinz Pampel, Paul Vierkant | GFZ German Research Centre for Geosciences,
LIS
LIBER 2015 | London, June 26, 2015
Background
European Commission. (2014). Horizon 2020 Annotated Model Grant Agreements. Version 1.6.2. Retrieved from
http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/amga/h2020-amga_en.pdf
• Funders' data policies
• Example: European Commission
Background
NPG (2013). Availability of data and materials. Retrieved from http://www.nature.com/authors/policies/availability.html
PLOS (2014). PLOS Editorial and Publishing Policies. Retrieved from http://www.plosone.org/static/policies.action
• Journal Data Policies
• Nature Publishing Group
• “[...] authors are required to make materials, data and
associated protocols promptly available to readers
without undue qualifications.”
• PLOS
• “PLOS journals require authors to make all data
underlying the findings described in their manuscript fully
available without restriction, with rare exception.”
Reproducibility and trust
re3data.org - Mission
• global registry of research data repositories
• covers all academic disciplines
• helps researchers, funding bodies, publishers,
libraries and scholarly institutions to find research
data repositories
• promotes a culture of sharing, increased access
and better visibility of research data
Pampel, H. et al. (2013). Making Research
Data Repositories Visible: The re3data.org
Registry. PLOS ONE, 8(11), e78080.
http://doi.org/10.1371/journal.pone.0078080
Schema
Vierkant, P., et al. (2014). Schema for the
Description of Research Data Repositories.
Version 2.2. http://doi.org/10.2312/re3.006
• 39 properties
• Version 2.2
• Based on analyses, feedback and experience
Icons
Vierkant, P., et al. (2014). Schema for the
Description of Research Data Repositories.
Version 2.2. doi:10.2312/re3.006
• The research data repository provides additional information on its service.
• The research data repository provides open/restricted/closed access to its data.
• The terms of use and licenses of the data are provided by the research data repository.
• The research data repository provides a policy.
• The research data repository uses a persistent identifier system to make its provided data persistent, unique and citable.
• The research data repository is either certified or supports a repository standard.
[Diagram labels: Research data repository; General information; Policy; Legal aspects; Technical standards; Quality standards]
Quality
Requirements:
• be run by a legal entity, such as a sustainable
institution (e.g. library, university)
• clarify access conditions to the data and repository as
well as the terms of use
• have a focus on research data
Workflow
[Screenshots: simple search box, filters, results, icons]
RDR Typology
• Institutional
• Disciplinary
• Multidisciplinary
• Project
RDR indexed by re3data
[Chart: number of indexed research data repositories per month, Aug-12 to Apr-15; y-axis from 0 to 1400]
RDR by Country
• US: 48%
• GER: 15%
• UK: 13%
• CAN: 5%
• FRAN: 4%
• JPN: 3%
• AUS: 3%
• CH: 2%
• IND: 2%
• NED: 2%
• CHN: 2%
• DEN: 1%
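The slides do not say how the country percentages were derived. As a rough sketch only, the snippet below takes the per-country counts given in the editor's notes for this slide and converts them to whole percentages that add up to 100. Two things are assumptions, not statements from the presentation: the denominator is the sum of the twelve listed counts (1355, not the 1260 registry total), and rounding uses the largest-remainder method. The function name is hypothetical.

```python
import math

# Counts per country as listed in the editor's notes for the "RDR by Country" slide.
COUNTS = {
    "US": 654, "GER": 196, "UK": 176, "CAN": 65, "FRAN": 49, "JPN": 42,
    "AUS": 45, "CH": 33, "IND": 27, "NED": 28, "CHN": 23, "DEN": 17,
}


def largest_remainder_percentages(counts):
    """Whole-number percentages that sum to 100 (largest-remainder rounding).

    Assumption: shares are relative to the sum of the listed counts.
    """
    total = sum(counts.values())
    exact = {key: 100 * value / total for key, value in counts.items()}
    result = {key: math.floor(share) for key, share in exact.items()}
    missing = 100 - sum(result.values())
    # Hand the missing points to the entries with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda key: exact[key] - result[key], reverse=True)
    for key in by_remainder[:missing]:
        result[key] += 1
    return result


if __name__ == "__main__":
    for country, pct in largest_remainder_percentages(COUNTS).items():
        print(f"{country}: {pct}%")
```

Under these assumptions the result agrees with the percentages shown on the slide; a different denominator or rounding rule would shift some values by one point.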
Icons and numbers
From a total of 1260 RDR in re3data (June 2015)
[Bar chart: number of RDR per icon aspect (Certification, Open Access, Persistent Id, All Aspects); y-axis from 0 to 1200]
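To put the bar heights in proportion, the following minimal sketch expresses each icon count as a share of the 1260 RDR in the registry. The counts come from the editor's notes for this slide; the variable and function names are illustrative assumptions.

```python
# Total number of RDR in re3data (June 2015), as stated on the slide.
TOTAL_RDR = 1260

# Counts per icon aspect, taken from the editor's notes for this slide.
ASPECT_COUNTS = {
    "Certification": 231,
    "Open Access": 1066,
    "Persistent Id": 388,
    "All Aspects": 88,
}


def shares(counts, total):
    """Return each count as a percentage of the total, rounded to one decimal."""
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


if __name__ == "__main__":
    for name, pct in shares(ASPECT_COUNTS, TOTAL_RDR).items():
        print(f"{name}: {pct}% of {TOTAL_RDR} RDR")
```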
Champions by discipline
[Bar chart: number of champion RDR per discipline (HSS, Life Sciences, Natural Sciences, Engineering); y-axis from 0 to 60]
From a total of 88 RDR (June 2015)
Integration of re3data in Guidelines
• Funder Example: European Commission
http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-pilot-guide_en.pdf
• Institutional Example: Bielefeld University
• “Directories such as the DFG-funded ‘Registry of Research Data Repositories’
form the basis for finding suitable places to publish
research data.”
Universität Bielefeld (2011): Resolution zum Forschungsdatenmanagement. https://data.uni-bielefeld.de/de/resolution
• Publisher Example: Nature Publishing Group
• “Physics, astrophysics, astronomy and geoscience databases should be registered
with re3data.org.”
Scientific Data (2013): Data policies. http://www.nature.com/sdata/data-policies
Cooperation
• Deutsche Initiative für Netzwerkinformation (DINI)
• DataCite (MoU, April 2012)
• OpenAIRE (MoU, October 2013)
• BioSharing (MoU, November 2013)
• Databib (MoU, March 2014)
• DataCite (Formal cooperation, March 2015)
Dimensions of sustainability
TECHNICAL
LEGAL
FINANCIAL
ORGANISATIONAL
Organizational sustainability
• Merger with Databib under the auspices of DataCite
• re3data.org working group within DataCite
• International Editorial Board
• Cooperation within the Research
Data Alliance (RDA) and the
research data repository community
• Community building and feedback loops during RFC
phases (e.g. re3data.org schema)
Technical sustainability
• Open interfaces
• RESTful API
• OpenSearch
• Documentation: http://www.re3data.org/api/doc
• Used e.g. by OpenAIRE
• Open metadata
• Documentation: http://www.re3data.org/schema/
• Long-term hosting commitment by KIT
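As an illustration of what reuse through the open interfaces might look like, here is a minimal Python sketch that lists repositories from an XML response. The base URL, the `/repositories` path and the element names (`repository`, `id`, `name`) are assumptions made for this example and should be checked against the API documentation linked above. The sample response is hypothetical, so the parser can be tried without network access.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Assumed base URL and path; consult http://www.re3data.org/api/doc for the
# interface that is actually offered.
API_BASE = "https://www.re3data.org/api/v1"


def parse_repository_list(xml_text):
    """Return (id, name) pairs from an XML repository listing.

    Assumes <repository> elements that carry <id> and <name> children.
    """
    root = ET.fromstring(xml_text)
    pairs = []
    for repo in root.iter("repository"):
        repo_id = repo.findtext("id", default="").strip()
        name = repo.findtext("name", default="").strip()
        pairs.append((repo_id, name))
    return pairs


def fetch_repository_list(base=API_BASE, timeout=30):
    """Download the listing over HTTP and parse it (requires network access)."""
    with urllib.request.urlopen(f"{base}/repositories", timeout=timeout) as resp:
        return parse_repository_list(resp.read().decode("utf-8"))


# Hypothetical response body with made-up identifiers and names.
SAMPLE = """<list>
  <repository><id>r3d-example-1</id><name>Example Repository A</name></repository>
  <repository><id>r3d-example-2</id><name>Example Repository B</name></repository>
</list>"""

if __name__ == "__main__":
    for repo_id, name in parse_repository_list(SAMPLE):
        print(repo_id, name)
```

Keeping the parsing separate from the download makes it possible to test the client against stored responses and to adjust the element names in one place if the real format differs.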
Legal sustainability
• Open licenses
• CC BY for the website
• CC 0 for metadata
Financial sustainability
• Technical maintenance financed by DataCite from 2016
• Further development managed by DataCite
• Further project funding
Thanks to the team
• Michael Witt
Purdue University, Distributed Data Curation Center (D2C2)
• Roland Bertelmann, Claudio Fuchs, Heinz Pampel
GFZ German Research Centre for Geosciences, Library and
Information Services (LIS)
• Maxi Kindling, Jessica Rücknagel, Peter Schirmbacher,
Paul Vierkant
Humboldt-Universität zu Berlin, Berlin School of Library and
Information Science (BLIS)
• Hans-Jürgen Goebelbecker, Gabriele Kloska, Evelyn Reuter,
Edeltraud Schnepf, Angelika Semrau, Michael Skarupianski,
Robert Ulrich
Karlsruhe Institute of Technology (KIT), KIT Library
info@re3data.org
http://re3data.org
With the exception of all photos and graphics, these slides are licensed under
the “Attribution 4.0 International (CC BY 4.0)” Licence:
http://creativecommons.org/licenses/by/4.0/

Editor's Notes

  • #8 general information (e.g. short description of the RDR, content types, keywords); responsibilities (e.g. institutions responsible for funding, content or technical issues); policies (e.g. policies of the RDR, incl. URL); legal aspects (e.g. licenses of the database and datasets); technical standards (e.g. APIs, versioning of datasets, software of the RDR); quality standards (e.g. certificates, audit processes)
  • #9 Information icons help researchers to easily identify an adequate repository for the storage and reuse of their data.
  • #19 US 654, GER 196, UK 176, CAN 65, FRAN 49, JPN 42, AUS 45, CH 33, IND 27, NED 28, CHN 23, DEN 17
  • #20 Certification 231; Open Access 1066 (access to repository, access to data, data upload); Persistent Id 388; All Aspects 88
  • #21 HSS 52 (Archaeology Data Service ads, CLARIN-ERIC); Life Sciences 21 (NeuroMorpho, cancerData); Natural Sciences 34 (astronomy dataverse network, Easy Dans, pangaea); Engineering 11 (3tudatacentrum)