Open science making_your_data_fair_slides_25-10-2018
1. How to share your (FAIR) data
Library Services
Dr. Ruth Wainman and Daniel van Strien
Research Data Management
(Library Services)
lib-researchsupport@ucl.ac.uk
www.ucl.ac.uk/research-data-management
www.ucl.ac.uk/research-it-services/research-data-service
4. Why should you share?
• Understanding why you are sharing will help you decide what to share
CC-BY 4.0
Illustration credit: Ainsley Seago.
doi:10.1371/journal.pbio.1001779.g001
5. Why Share? The Open Science Agenda
• A culture change is currently taking place in universities around the dissemination of research.
• Open science is defined as the “movement which aims to make scientific research, data and dissemination accessible to all levels of an inquiring society”.
• There are many drivers of this change, including the prevalence of the internet in our lives and the need to address the disconnect between universities and society.
• Open Science has also been made a priority by the EU Commission. It forms one of the three goals set by Commissioner Moedas for EU research and innovation policy during his mandate.
6. The Benefits of Open Science
1. The visibility of all research outputs will increase once they are open. This should lead to a citation advantage: everyone with an internet connection will have access, and users who can easily download open versions of outputs will cite them.
2. Making the underlying research data and methodology available allows individual users to replicate the results of the original authors and to spot any errors or slips. This level of transparency is good for researchers and good for research.
3. Pursuing the steps above will add to the visibility of the outputs and also allow readers to see how the text and conclusions have evolved at different stages in the process.
4. As a minimum, research data used in the publication should be made available as a supporting dataset.
5. The use of recognised identifiers and processes gives due acknowledgement to authors and external funders and improves citation analysis. It rewards all stakeholders in the research process and enriches the research landscape as a result.
Source: Open Science and its Role in the Universities: A Roadmap for Cultural Change (2018)
7. The Eight Pillars of Open Science
1. Future of Scholarly Communication
2. EOSC (European Open Science Cloud)
3. FAIR Data
4. Skills
5. Research Integrity
6. Rewards
7. Altmetrics
8. Citizen Science
Source: Open Science and its Role in the Universities: A Roadmap for Cultural Change (2018)
8. Open Science and RDM
• There has been an increasing emphasis on disseminating research data in order to give research outputs the same visibility as publications.
• However, there have been some challenges to establishing responsible RDM practices, including the requirements of GDPR.
• Research data needs to be shared as far as possible in accordance with the Findable, Accessible, Interoperable, Reusable (FAIR) data principles so that the RDM aims of the Open Science agenda can be achieved.
9. Why should you share: so others can verify your results
"We searched for errors in 107 papers in the fields of
engineering, materials and computer science, which
were based on existing small data sets (see G.
Taguchi et al. Quality Engineering Handbook, Wiley;
2004). Our search revealed an alarming number of
errors. Ten papers had one or more mistakes that
were substantial enough to affect the findings and
conclusions. There were errors that were not so
significant in almost one-third of the papers."
- Linton, Jonathan D. 2013. “Research: All Journals Need to Correct Errors.” Nature. doi:10.1038/504033d.
10. Why should you share: so others can reproduce your research?
https://blog.f1000.com/2014/04/04/reproducibility-tweetchat-recap/
"More than 70% of researchers have tried and
failed to reproduce another scientist's
experiments, and more than half have failed to
reproduce their own experiments"
- Baker, Monya. 2016. “1,500 Scientists Lift the Lid on Reproducibility.” Nature 533 (7604): 452–54. doi:10.1038/533452a.
11. Why should you share: because your data was expensive to collect
CC-BY-SA 4.0
https://commons.wikimedia.org/wiki/File:CERN_LHC.jpg
Maximilien Brice
"The Large Hadron Collider was first turned on in
August of 2008, then stopped for repairs in
September until November 2009. Taking all of those
costs into consideration, the total cost of finding the
Higgs boson ran about $13.25 billion."
https://www.forbes.com/sites/alexknapp/2012/07/05/how-much-does-it-cost-to-find-a-higgs-boson/#55a583a83948
12. Why should you share: because your data is useful for others
https://github.com/EIT-team/Stroke_EIT_Dataset
13. Why should you share: a funder requirement?
• Funder policies
“Institutional and project specific data management policies and plans should be in accordance with relevant standards and community best practice. Data with acknowledged long-term value should be preserved and remain accessible and usable for future research.”
UK Research Councils’ Common principles on data:
www.ukri.org/funding/information-for-award-holders/data-policy/common-principles-on-data-policy/
14. UCL policy
“Researchers should:
- Develop and record appropriate procedures and processes for the collection, storage, use, re-use, access, and retention of the research data…;
- Establish and document agreements for research data management when involved in a joint research project…;
- Ensure that the integrity and security of their data is maintained…”
https://www.ucl.ac.uk/isd/sites/isd/files/migrated-files/uclresearchdatapolicy.pdf
15. Legal and ethical considerations
Funders and UCL encourage data sharing where appropriate.
• Various levels of access: data accessible to all (“Open Data”); restricted to a specific audience; embargoed; closed
• Data sharing should respect copyright, GDPR and other legislation, consent forms, and funders’ expectations
E.g. ESRC’s expectations:
- Anticipate restrictions on sharing data in the grant application
- Share data to enable further scientific use “within 3 months of the grant ending”, BUT embargoes are accepted
- 3 levels of access: “open”, “safeguarded” or “controlled”
16. How to share FAIR?
By SangyaPundir [CC BY-SA 4.0 (https://creativecommons.org/licenses/by-sa/4.0)], from Wikimedia Commons
https://www.nature.com/articles/sdata201618
17. Library Services
To be Findable:
F1. (meta)data are assigned a globally unique and persistent identifier
F2. data are described with rich metadata (defined by R1 below)
F3. metadata clearly and explicitly include the identifier of the data they describe
F4. (meta)data are registered or indexed in a searchable resource
To be Accessible:
A1. (meta)data are retrievable by their identifier using a standardized communications protocol
A1.1 the protocol is open, free, and universally implementable
A1.2 the protocol allows for an authentication and authorization procedure, where necessary
A2. metadata are accessible, even when the data are no longer available
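As an illustration of F1 and A1, the sketch below turns a DOI into a URL on the public doi.org resolver and prepares an HTTPS request for machine-readable metadata instead of the landing page. The function names are hypothetical. The CSL JSON media type is one format commonly offered through DOI content negotiation; whether a particular DOI returns it depends on its registration agency, so treat that as an assumption to check.

```python
import json
import urllib.request

# Public DOI resolver, reached over HTTPS: an open, free protocol (A1, A1.1).
DOI_RESOLVER = "https://doi.org/"

# Assumption: the registration agency behind the DOI supports this media type.
CSL_JSON = "application/vnd.citationstyles.csl+json"


def doi_url(doi: str) -> str:
    """Turn a bare DOI (optionally prefixed with 'doi:') into a resolvable URL."""
    doi = doi.strip()
    if doi.lower().startswith("doi:"):
        doi = doi[4:]
    if not doi.startswith("10.") or "/" not in doi:
        raise ValueError(f"not a DOI: {doi!r}")
    return DOI_RESOLVER + doi


def metadata_request(doi: str) -> urllib.request.Request:
    """Build a request that asks the resolver for metadata, not the landing page."""
    return urllib.request.Request(doi_url(doi), headers={"Accept": CSL_JSON})


def fetch_metadata(doi: str) -> dict:
    """Send the request and parse the JSON reply (needs network access)."""
    with urllib.request.urlopen(metadata_request(doi), timeout=30) as response:
        return json.load(response)
```

For example, doi_url("doi:10.1038/533452a") gives https://doi.org/10.1038/533452a, the persistent link for the Baker (2016) article cited earlier; calling fetch_metadata on the same DOI would attempt the actual lookup.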
18. Library Services
To be Interoperable:
I1. (meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation
I2. (meta)data use vocabularies that follow FAIR principles
I3. (meta)data include qualified references to other (meta)data
To be Reusable:
R1. (meta)data are richly described with a plurality of accurate and relevant attributes
R1.1. (meta)data are released with a clear and accessible data usage license
R1.2. (meta)data are associated with detailed provenance
R1.3. (meta)data meet domain-relevant community standards
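One way to picture I1 to I3 and R1.1 to R1.2 is a small metadata record written as JSON-LD using the schema.org Dataset vocabulary. This is a minimal sketch, not a prescribed format: the helper names and the list of fields treated as required are assumptions made for illustration, and a repository or discipline standard may ask for more.

```python
import json

# Fields this sketch treats as the minimum for a reusable record (an assumption).
REQUIRED_FIELDS = ("name", "description", "identifier", "license", "creator")


def dataset_jsonld(name, description, identifier, license_url, creators,
                   derived_from=None):
    """Describe a dataset with schema.org terms (I1: formal language, I2: shared vocabulary)."""
    record = {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "identifier": identifier,   # F1: persistent identifier, e.g. a DOI URL
        "license": license_url,     # R1.1: clear usage license
        "creator": [{"@type": "Person", "name": person} for person in creators],
    }
    if derived_from:
        # I3 / R1.2: qualified reference to the data this dataset builds on
        record["isBasedOn"] = derived_from
    return record


def missing_fields(record):
    """Return the required fields that are absent or empty."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]


if __name__ == "__main__":
    example = dataset_jsonld(
        name="Example survey responses",                    # hypothetical values
        description="Anonymised responses from a hypothetical survey.",
        identifier="https://doi.org/10.1234/example",       # placeholder DOI
        license_url="https://creativecommons.org/licenses/by/4.0/",
        creators=["A. Researcher"],
    )
    print(json.dumps(example, indent=2))
    print("Missing:", missing_fields(example))
```

Running missing_fields before deposit is a cheap check that the license and identifier have not been left out.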
19. What data should you share?
• Share everything?
• Share only a selection of data?
• Raw data or processed data?
• External data that has been processed further?
• …
20. Choosing a repository
• Project-specific (for big ongoing projects and collaborations; don’t set up a repository!)
• Funder’s repository (e.g. ESRC, NERC)
• Discipline-specific: https://www.re3data.org/
• “Generic” repository: https://zenodo.org, an EU-funded repository hosted at CERN, with a standard maximum upload of 50 GB
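Before picking a repository it helps to know how large the dataset is. The sketch below adds up the files in a folder and compares the total with an upload limit. The 50 GB figure is the one quoted above; reading "GB" as 10**9 bytes and the function names are assumptions, and the repository's current documentation should be checked for the actual limit.

```python
from pathlib import Path

# 50 GB as quoted above; treating 1 GB as 10**9 bytes is an assumption.
DEFAULT_LIMIT_BYTES = 50 * 10**9


def total_size(folder: Path) -> int:
    """Sum the sizes, in bytes, of all regular files under the folder."""
    return sum(path.stat().st_size for path in folder.rglob("*") if path.is_file())


def fits_limit(folder: Path, limit: int = DEFAULT_LIMIT_BYTES) -> bool:
    """Report whether the folder's contents fit within the upload limit."""
    return total_size(folder) <= limit


if __name__ == "__main__":
    data_folder = Path(".")  # hypothetical location of the dataset
    size_gb = total_size(data_folder) / 10**9
    print(f"{size_gb:.2f} GB, fits standard upload: {fits_limit(data_folder)}")
```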
21. Choosing a repository: UCL
• Forthcoming institutional repository for research data
• Preference should be given to a discipline or funder repository if one exists
• Up to 5 TB upload
22. RPS
• Create a metadata record on RPS for data you have shared
• Get credit for your data outputs
• ORCID: identify your outputs
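An ORCID iD is 16 characters long and its last character is a check digit (ISO 7064 MOD 11-2), so a mistyped iD can often be caught before it goes into a metadata record. The sketch below is a minimal validity check under that scheme; the function names are hypothetical, and it only checks the format and check digit, not whether the iD is registered to anyone.

```python
import re


def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for digit in base_digits:
        total = (total + int(digit)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Check the shape and check digit of an iD such as 0000-0002-1825-0097."""
    compact = orcid.replace("-", "").upper()
    if not re.fullmatch(r"[0-9]{15}[0-9X]", compact):
        return False
    return orcid_check_digit(compact[:15]) == compact[15]


if __name__ == "__main__":
    # 0000-0002-1825-0097 is the sample iD ORCID uses for a fictitious researcher.
    print(is_valid_orcid("0000-0002-1825-0097"))
```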
23. Licensing outputs
Using a license = making decisions about how you share
Creative Commons (non-code): https://creativecommons.org/about/downloads/
Open source license (software/code): https://choosealicense.com/
Creative Commons Attribution 3.0 Unported License
24. Sharing software/code
• This can range from sharing code underpinning a paper…
• to large software projects
• GitHub for version control, collaboration, CI etc.
• Zenodo for long-term preservation with a DOI and a version of record
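When a GitHub repository is archived on Zenodo, a .zenodo.json file in the repository root can supply the metadata for the archived release. The sketch below writes such a file. The key names follow Zenodo's deposit metadata as the editor understands it, and the helper names, example values and license identifier are assumptions; check Zenodo's current documentation before relying on them.

```python
import json


def zenodo_metadata(title, description, creators, license_id, keywords=()):
    """Assemble deposit metadata; creators is a list of (name, affiliation) pairs."""
    return {
        "title": title,
        "description": description,
        "upload_type": "software",
        "creators": [
            {"name": name, "affiliation": affiliation}
            for name, affiliation in creators
        ],
        # Assumption: the identifier must come from the repository's license list.
        "license": license_id,
        "keywords": list(keywords),
    }


def write_zenodo_json(metadata, path=".zenodo.json"):
    """Write the metadata as UTF-8 JSON at the given path."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(metadata, handle, indent=2, ensure_ascii=False)
        handle.write("\n")


if __name__ == "__main__":
    example = zenodo_metadata(
        title="Analysis code for an example paper",       # hypothetical values
        description="Scripts underpinning the figures in a hypothetical paper.",
        creators=[("Researcher, A.", "Example University")],
        license_id="MIT",
        keywords=["research software", "reproducibility"],
    )
    write_zenodo_json(example)
```

Keeping this file under version control means the metadata travels with the code, so each archived release is described consistently.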
25. Library Services
• Research IT support: www.ucl.ac.uk/research-it-services
- Funded software development
- High performance computing
- Data storage
• Library support: www.ucl.ac.uk/library
- Librarians and resources
- Data management drop-ins