Date:28 May 2012
GA-283595 PUBLIC 1 / 7
Results of OpenAIREplus Data Survey
Author: Najla Rettberg
Date: 28 May 2012
Recently launched, OpenAIREplus extends the OpenAIRE service to facilitate access to the entire
Open Access scientific production of the European Research Area. Opening up the infrastructure to
data sources from subject‐specific communities, OpenAIREplus will provide cross‐links from
publications to data and funding schemes, contributing to Open Linked Data initiatives. Working
closely with OpenAIRE, the project will establish an e‐Infrastructure to harvest, enrich and store the
metadata of Open Access scientific datasets. The project facilitates collaboration across data
infrastructures, providing information to scientists, non‐scientists as well as to providers of value‐
added services. In order to link to data sources, the project will start to identify relevant data
repositories, and use other initiatives (e.g. DataCite, re3data, DCC) to identify appropriate and
certified repositories, and in the longer term will work to get a wider picture of the repositories that
exist within its community of practice.
Aims of the Survey
Data management practices and maturity in Europe are fairly heterogeneous. Different models are
emerging and some countries and data centres are paving the way. On the other hand, some
countries have very few activities. OpenAIRE comprises a wide‐reaching network across all
European member states and beyond¹. In order to build up an infrastructure, it is appropriate that
these representatives of OpenAIRE within the member states, based at the National Open Access
Desks (NOADs), are fully informed about the burgeoning data landscape in order to support a service
for integration of data sources, via linking to the metadata of data repositories.
It was therefore fitting that one of the first activities of the project was to carry out a survey of data
practices within the OpenAIRE community. As OpenAIRE has contact desks in each member state,
results of the survey would be representative of Europe as a whole. At the very least, the survey
would also serve to stimulate interest in the data management area. Each question was accompanied by
examples of good practice.
In January 2012, the survey was sent out via email to the 32 National Open Access Desks.
Among the eight questions asked were the following:
Do you know of any resources and centres for data management within your country/region?
¹ All EC countries, plus Norway, Croatia, Iceland, Switzerland and Turkey.
Do funding bodies in your country/region have any specific policies or management
requirements in place for research data?
Do any regional Higher Education institutions/Universities have data management plans?
What data archives exist at institutional level in your country/region?
Do you know of any discipline‐specific data archives at national/international level?
Out of 33 countries surveyed, 25 responded (c.75%). The ‘missing data’ therefore accounts for c.25% of
the survey. Reasons for not completing the survey are still being interpreted; they may include lack of
time or an inability to identify resources. Poland declined to complete the survey, noting that no data
practices at all were available. It was also noted that some partners misunderstood the questions (one
partner took the term ‘data repository’ to mean a text‐based repository). It is also possible that the
survey was not completed by the people best informed about, or most skilled in, data matters.
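The response figures can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name is hypothetical, and the counts are the ones quoted in the text (33 countries surveyed, 25 responses).

```python
def response_rates(surveyed, responded):
    """Return (response %, missing %) for a survey.

    Hypothetical helper for illustration; not part of the survey itself.
    """
    if surveyed <= 0 or not 0 <= responded <= surveyed:
        raise ValueError("counts must satisfy 0 <= responded <= surveyed, surveyed > 0")
    response = 100 * responded / surveyed
    return response, 100 - response


# Counts as quoted in the report text.
response, missing = response_rates(33, 25)
print(f"responded: {response:.1f}%  missing: {missing:.1f}%")
```

With these counts, roughly three quarters of the countries responded and roughly one quarter did not.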
Results were varied in length and detail.
Some responses were fairly subjective in approach (‘I think this counts as a data repository’), which
also highlights another problem: what is the definition of a data repository?
Overall, it is possible to start building a list of both discipline specific repositories and institutional
repositories from this survey.
In turn this list could be subdivided by discipline: SSH, Life Sciences, Geosciences etc.
This list could then be compared to other similar surveys in Europe (SIM4RDM).
The survey serves to identify which countries need to focus on developing data practices and
management. It may also be possible to identify which countries have ‘no’, ‘emerging’ or
‘advanced’ data management practices.
Of note is that Italy, which answered ‘no’ to most questions, stated that, as a result of this survey,
it would carry out its own regional survey of data initiatives. There was a clear need for more
recognition in this area.
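The subdivision by discipline suggested above could be sketched as follows. The repository names are taken from the survey lists; the discipline label attached to each entry and the function name are illustrative assumptions, not classifications made by the survey.

```python
from collections import defaultdict

# (repository, discipline) pairs. Names come from the survey lists;
# the discipline labels are assumptions made for this example only.
REPOSITORIES = [
    ("Czech Social Science Data Archive (CSDA)", "SSH"),
    ("The Finnish Social Science Data Archive", "SSH"),
    ("PANGAEA (pangaea.de)", "Geosciences"),
    ("European Bioinformatics Institute", "Life Sciences"),
]


def group_by_discipline(repos):
    """Group repository names under their discipline label, sorted for stable output."""
    groups = defaultdict(list)
    for name, discipline in repos:
        groups[discipline].append(name)
    return {d: sorted(names) for d, names in sorted(groups.items())}


grouped = group_by_discipline(REPOSITORIES)
for discipline, names in grouped.items():
    print(f"{discipline}: {', '.join(names)}")
```

A structure like this would also make it straightforward to compare the list against other surveys, since each entry carries an explicit discipline label.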
Some Facts and figures
Four countries out of 25 answered ‘yes’ to ‘Do any regional Higher Education
institutions/Universities have data management plans?’ These were Austria, Slovakia, Germany
and the UK.
Six countries (Germany, Lithuania, Austria, the Netherlands, Sweden and the UK) answered that they
had, or were starting to have, data policies at funding level. The rest answered ‘no’.
The Netherlands, Denmark, Germany and the UK have examples of data archives at institutional level.
See Appendix A for full list.
The list of discipline‐specific data archives was longer, and a list can be gathered from this survey;
see Appendix B.
Of particular interest are Slovenia, Lithuania and the Czech Republic, which have a number of
interesting initiatives (many in native language only).
A range of data practices is reflected in the responses to these questions. The UK, Germany
and the Netherlands are the most advanced countries in terms of activities and initiatives.
While many countries have discipline‐specific repositories, provision for data management and its
sustainability at funding or institutional level is fairly sparse.
On the whole, it can be concluded that there is a lack of practice at institutional level within the
majority of European countries, but further investigation (i.e. a wider, more qualitative study) would
be needed to support a more scientific conclusion. This initial survey may have paved the way for
gathering further information, and stimulated the OpenAIRE community to investigate this area. It
reflects a high‐level overview of data practices in Europe and could complement existing survey results.
Appendix A: List of institutional‐level repositories
Utrecht Dataverse Network http://www.uu.nl/university/library/EN/services/dataverse/Pages/default.aspx
KUBIS Dataverse Network. https://data.kb.dk/dvn/
Open Linked Data from the Open University http://data.open.ac.uk/
The Open University Datasets are http://data.open.ac.uk/datasets/
Loughborough University‐ Centre for Renewable Energy Systems and Technology (CREST)
University of Cambridge Data Sets http://www.lib.cam.ac.uk/repository/deposit_guide/data_sets.html
University of East Anglia, Climatic Research Unit, http://www.cru.uea.ac.uk/cru/data/
University of Essex Data Archive (UKDA) http://www.data-archive.ac.uk/
University of Southampton, Ocean and earth sciences datasources and datasets
European Bioinformatics Institute http://www.ebi.ac.uk/Information/databases_sitemap.html
Cambridge Structural Database http://www.ccdc.cam.ac.uk/about_ccdc/
Unilever Cambridge Center for Molecular Informatics, Cambridge University, http://wwmm.ch.cam.ac.uk/crystaleye/ CrystalEye
WorldWideMolecularMatrix, University of Cambridge http://www.dspace.cam.ac.uk/handle/1810/724
British Atmospheric Data Centre (BADC) http://badc.nerc.ac.uk/home/index.html
MIRAGE, Middlesex Medical Image Repository http://image.mdx.ac.uk/mirage/
Edinburgh Data Share http://datashare.is.ed.ac.uk/
Economic and Social Science Data Service http://www.esds.ac.uk/
Data Specific repositories:
Edinburgh DataShare http://datashare.is.ed.ac.uk/
Queen Margaret University, Edinburgh http://edata.qmu.ac.uk/
Exeter Research Data Management Services (under development)
Appendix B: List of discipline‐specific repositories extracted from survey results
VLIZ offers the possibility to store data http://www.vliz.be
Biofresh dataportal http://data.freshwaterbiodiversity.eu/
Belgian data portal: http://data.biodiversity.be/nl/pages/search
Czech Social Science Data Archive (CSDA) http://archiv.soc.cas.cz/en/
LINDAT‐Clarin, repository: LINDAT‐Clarin ‐ Centre for Language Research Infrastructure in the Czech Republic,
The Danish Data Archive is dedicated to the acquisition, preservation and dissemination of machine‐readable
data created by researchers from the Social Sciences and the Health Sciences communities. DDA, furthermore,
has quantitative historical data materials, especially transcribed historical censuses.
Summary of archives for social and economic data: http://www.ratswd.de/eng/dat/fdz.html
Archeology (still under construction): http://it-zentrum-antike.dainst.org/
Earth and Environmental Science: http://www.pangaea.de/about/
Climate data: http://www.dkrz.de/daten-en?set_language=en, http://wdc.dlr.de/
Earth Sampling: http://www.scientificdrilling.org/front_content.php
Animal Tracking Data: http://www.movebank.org/
The Finnish Social Science Data Archive http://www.fsd.uta.fi/en/
LIDA ‐ the Lithuanian archive of humanities and social sciences data. The archive is used for acquisition,
maintenance and dissemination of empirical data of SSH. Project website: http://www.lidata.eu/. Catalogue
The new project “MIDAS ‐ The national open‐access archive of scientific information data” started in 2011.
Project coordinator is Vilnius University.
DANS – Data Archiving and Network Services
Network of European Economists Online Dataverse http://dvn.iq.harvard.edu/dvn/dv/NEEO
Norsk Samfunnsvitenskapelig Datatjeneste (Norwegian Social Science Data Services) (NSD)
http://www.nsd.uib.no/nsd/english/index.html This is a national data archive for the Social Sciences.
Swedish National Data Service (SND): collects and disseminates research data within Social Sciences, Medicine
and Humanities. http://snd.gu.se/en/start
Environmental Climate Data Sweden (ECDS): collects and disseminates research data within Environmental and
Climate research. http://www.smhi.se/ecds
Social Science Data Archives (http://adp.fdv.uni-lj.si/eng/) has the longest tradition of all discipline‐specific data
archives and is a member of European CESSDA and US ICPSR.
FidaPLUS (http://www.fidaplus.net/) is a corpus of Slovenian language.
Natural Language Server (http://nl.ijs.si/) is maintained by the Department of Knowledge Technologies of the
Jožef Stefan Institute and offers services and data resources primarily for Slovene, but also for other languages.
SI‐STAT (http://pxweb.stat.si/pxweb/Dialog/statfile2.asp) is a national statistics data portal.
Also of interest for OpenAIREplus might be DEDI (http://www.dedi.si/info/projekt-dedi)
UK Data Centers http://guides.lib.sussex.ac.uk/content.php?pid=239822&sid=1978908
Growth Data Sets http://www.bris.ac.uk/Depts/Economics/Growth/datasets.htm
Archaeology Data Service ‐ http://ads.ahds.ac.uk/ ‐ This site contains a mixture of metadata and project
archives on archaeological sites and finds. Mandated depository for archaeological work funded by the AHRC in
Arts and Humanities Data Service (AHDS) ‐ http://www.ahds.ac.uk/ ‐ no longer funded but site still exists
Data Archive ‐ http://www.data-archive.ac.uk/create-manage ‐ social and economic datasets, social sciences
Centre for Environmental Data Archival (CEDA), http://www.ceda.ac.uk/
eCrystals ‐ Southampton ‐ http://ecrystals.chem.soton.ac.uk/ ‐ This subject repository contains X‐ray
crystallography structure datasets
ShareGeo Open ‐ http://www.sharegeo.ac.uk/ ‐ Earth and Planetary Sciences; Geography and Regional Studies.
This site provides access to spatial data
Cambridge Structural Database. http://www.ccdc.cam.ac.uk/about_ccdc/ The CCDC is a non‐profit, charitable
Institution whose objectives are the general advancement and promotion of the science of chemistry and
crystallography for the public benefit.
ChemSpider http://www.chemspider.com/ A free chemical structure database providing fast text and structure
search access to over 26 million structures from hundreds of data sources. Hosted by the Royal Society of
British Atmospheric Data Centre (BADC) http://badc.nerc.ac.uk/home/index.html. From the Natural
Environment Research Council (NERC). Many datasets are openly accessible but some are restricted.
Discovery: a metadata ecology for UK education and research
USA (taken from survey)
The University of Wyoming maintains a list with discipline specific open access data archival services
(multidisciplinary, archeology, astronomy & physics, biology, chemistry, computer science, environmental
sciences, geology, geoscience and GIS, hydrology, linguistics, mathematics, medicine, social sciences).
NASA, National Space Science Data Center, Earth Observing System‐ Data Services
Statistical Science Data Sets http://www.statsci.org/datasets.html
The Open Access Directory contains a list of Data Repositories
http://oad.simmons.edu/oadwiki/Data_repositories (it lists national and international repositories in plenty of
specializations and subject areas).