Enhance your research impact
through open science
Gareth Knight
Research Data Manager
Library & Archives Service
researchdatamanagement@lshtm.ac.uk
Open Science
A broad movement that seeks to improve the quality of
research through greater:
• Transparency: Ensure methods are clearly explained and made
available earlier
• Consistency: Common standards, tools and services are used to
perform analysis.
• Collaboration: Opportunities are available for external
contribution & collaboration on research
• Access: All resources necessary to recreate the analysis are
made available in a form that enables verification & reuse
(Summary: it’s science with the benefit of 21st century tools)
Reproducibility Crisis
Vines et al (2014) investigated data availability for 516 articles
published 2-22 years previously – the odds of a dataset being
obtainable fell by 17% per year
A 2016 Nature survey revealed 52% of 1,576 surveyed researchers
considered there to be a 'significant' reproducibility crisis in
science.
• Approx. 68% of respondents had failed to reproduce a medical experiment.
Research replication is time-consuming and expensive
• Cancer Biology: https://osf.io/e81xl/wiki/home/
• Psychological Science - https://osf.io/ezcuj/wiki/home/
Retraction Watch lists 18,000+ papers that have been retracted,
many as a result of faulty science
Vines et al (2014) https://doi.org/10.1016/j.cub.2013.11.014
Nature (2016) https://www.nature.com/news/1-500-scientists-lift-the-lid-on-reproducibility-1.19970
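To put the 17% per year figure in perspective, the sketch below compounds that annual fall in odds over several years. The baseline odds value is a hypothetical input chosen for illustration; it is not a figure from the study.

```python
def odds_multiplier(years, annual_decline=0.17):
    """Factor by which the odds shrink after `years` of compounding decline."""
    return (1 - annual_decline) ** years


def odds_to_probability(odds):
    """Convert odds, defined as p / (1 - p), back to a probability."""
    return odds / (1 + odds)


if __name__ == "__main__":
    # Hypothetical baseline: odds of 4.0, i.e. an 80% chance of obtaining
    # the dataset at year 0. This is an assumption, not a reported value.
    baseline_odds = 4.0
    for years in (2, 10, 20):
        factor = odds_multiplier(years)
        probability = odds_to_probability(baseline_odds * factor)
        print(f"{years:>2} years: odds x {factor:.3f}, probability {probability:.0%}")
```

Note that the decline applies to the odds, not directly to the probability, so the two should not be read interchangeably.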
What are the benefits of open science?
Analysis of open research practices and motivations of
583 Wellcome & 259 ESRC funded researchers:
• Improved visibility of research
• More publications
• Higher citation rate – See Piwowar & Vision (2013)
• Contribute to academic profile
• Career benefits (e.g. promotion)
• New collaborations
Van den Eynden, V. et al. (2016) Towards Open Research: Practices, experiences, barriers and
Opportunities. Wellcome Trust. https://doi.org/10.6084/m9.figshare.4055448
Piwowar HA, Vision TJ. (2013) Data reuse and the open data citation advantage.
https://doi.org/10.7717/peerj.175
Open Science by Design
Plan
Collect
Manage
Analyse
Publish
https://www.flaticon.com/free-icon/scientist_857648
Enhanced research standards
Open education resources
Open software
Citizen science & peer review opportunities
Open access
Reusable resources
Research Reproducibility
Research Objectives
Research is reviewed for many purposes:
• Verify: check the analysis to confirm that conclusions are valid
• Replicate: same methods applied in a different environment to get
the same result
• Reproduce: same methods applied with a different setup
• Reuse: same data, different research
What steps do you take to ensure research is easier to
validate/replicate/reproduce or reuse by others?
The Difference
https://xkcd.com/242/
Plan for openness from the outset
Plan
• Be aware of requirements
• Consider community engagement opportunities
• Document research protocol & publish
Data collection
• Inform participants and relevant stakeholders
• Acquire raw data in electronic form using secure systems (e.g. ODK)
Data Management
• Organise resources logically
• Ensure raw data is read only
• Assign unique IDs to relevant items
Data processing
• Automate processing activities (as far as possible) in an open format
to enable them to be re-applied
• Document activities performed to ensure an audit trail
Data analysis
• Provide opportunities for relevant individuals to contribute
• Store resources used to underpin analysis (inc. those used to
produce graphs)
Reporting
• Consider how resources can be made accessible
• Ensure resources are curated & accessible in the long-term
https://doi.org/10.1371/journal.pcbi.1003285
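Two of the data management steps on this slide, keeping raw data read only and assigning unique IDs, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the folder layout and the use of random UUIDs as identifiers are hypothetical choices, and on Windows `chmod` only toggles the read-only flag.

```python
import stat
import uuid
from pathlib import Path


def make_read_only(path):
    """Remove write permission for owner, group and others on one file."""
    mode = path.stat().st_mode
    path.chmod(mode & ~(stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH))


def protect_raw_data(raw_dir):
    """Make every file under raw_dir read-only and assign each one an ID.

    Returns a dict mapping file path to a random UUID (an assumed ID scheme).
    """
    ids = {}
    for path in sorted(Path(raw_dir).rglob("*")):
        if path.is_file():
            make_read_only(path)
            ids[str(path)] = str(uuid.uuid4())
    return ids
```

The returned mapping would need to be saved alongside the project records, otherwise the IDs are lost when the script ends.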
Openness requirements
Research practice
• Demonstrate rigour of research
Funder requirements:
• Gold vs. Green
• Publication status, research data, other outputs
Domain-specific reporting guidelines:
• For study protocol and project outputs
https://www.equator-network.org/
Journal policies:
• Transparency and Openness Promotion (TOP)
https://cos.io/our-services/top-guidelines/
• Joint Data Archiving Policy (JDAP)
https://datadryad.org//pages/jdap
https://cos.io/prereg/
Storage and organisation
• Ensure project resources are stored in a location that is
secure and available to relevant parties
• Can you find files from a project completed 10 years ago?
• Store on Secure Server or other defined location
• Adopt a consistent structure to organise & label content
• Content type (data, documents, code)
• Version (raw, processed)
• Sensitivity – store personal info in secure locale
• Create a file inventory spreadsheet
• Filename, location, content, source, sensitivity, etc.
https://xkcd.com/1459/
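The file inventory spreadsheet suggested above could be started automatically. The sketch below walks a project folder and writes one row per file to a CSV; the column names follow the slide, while the checksum and size columns are additions assumed here to help detect later changes. Content, source and sensitivity are left blank to be filled in by hand.

```python
import csv
import hashlib
from pathlib import Path

FIELDS = ["filename", "location", "size_bytes", "sha256",
          "content", "source", "sensitivity"]


def sha256_of(path, chunk_size=65536):
    """Compute a SHA-256 checksum without loading the whole file into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_inventory(project_dir):
    """Return one inventory row (a dict) for every file under project_dir."""
    root = Path(project_dir)
    rows = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rows.append({
                "filename": path.name,
                "location": str(path.parent.relative_to(root)),
                "size_bytes": path.stat().st_size,
                "sha256": sha256_of(path),
                # Descriptive fields cannot be derived; complete them manually.
                "content": "",
                "source": "",
                "sensitivity": "",
            })
    return rows


def write_inventory(rows, out_path):
    """Write the inventory rows to a CSV file that opens in a spreadsheet."""
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
```

A call such as `write_inventory(build_inventory("my_project"), "inventory.csv")` would produce the spreadsheet, where `my_project` is a placeholder folder name.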
Tidy data
Common issues:
• Column headers contain values
• Multiple variables held in 1
column.
• Variables held in both rows and
columns.
• Multiple types of observation
recorded in the same table.
Wickham applies 3rd Normal Form:
• One row for each observation
• One column for each variable
• One table for each type of
observation
• Column headers (where they are
used) should be variable names,
not values
Tidy data tools:
tidyr, dplyr, ggplot2, data.table, pandas
A set of principles to make data more consistent
https://www.jstatsoft.org/article/view/v059i10/v59i10.pdf
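As an illustration of the first common issue (column headers that contain values), the sketch below uses pandas, one of the tools listed on this slide, to reshape a small table into tidy form. The table contents and column names are made up for the example.

```python
import pandas as pd

# Hypothetical untidy table: the year values are stored in the column headers.
untidy = pd.DataFrame({
    "site": ["A", "B"],
    "2016": [12, 7],
    "2017": [15, 9],
})

# melt() moves the year headers into a "year" column, giving one row per
# observation (site, year) and one column per variable.
tidy = untidy.melt(id_vars="site", var_name="year", value_name="cases")
tidy["year"] = tidy["year"].astype(int)
tidy = tidy.sort_values(["site", "year"]).reset_index(drop=True)

print(tidy)
```

After reshaping, each row describes one site in one year, so filtering, grouping and plotting by year no longer require knowledge of the original column layout.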
Documentation & metadata
What info is needed to replicate or re-apply your analysis?
What info is needed to analyse and use your data?
User guide:
• Study design and data collection methods
• Data Analysis and Preparation
• Quality checks applied
Codebook:
• Variable type (Continuous, Ordinal, Categorical,
Missing values, censored/redacted)
• Permitted responses & their meaning (what is 1?)
• Abbreviations & phrases
• Research protocols
• Standard Operating Procedures
• Codebooks & data dictionaries
• Informed Consent form &
participant information sheet
• Questionnaires, interview
guide and other collection tools
• Data papers and other
publications
• Other relevant documents
http://www.dcc.ac.uk/resources/metadata-standards
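A codebook can also be kept in a machine-readable form so that data can be checked against it. The sketch below is one possible layout, not a standard: the variable names, codes and ranges are hypothetical, and it answers the "what is 1?" question by storing the meaning of each permitted response.

```python
# Hypothetical codebook: variable names, labels and codes are illustrative.
CODEBOOK = {
    "sex": {
        "label": "Sex of participant",
        "type": "categorical",
        "values": {1: "Male", 2: "Female", -9: "Missing"},
    },
    "age_years": {
        "label": "Age at enrolment (years)",
        "type": "continuous",
        "range": (0, 120),
    },
}


def check_record(record, codebook=CODEBOOK):
    """Return a list of problems found in one record (empty if none)."""
    problems = []
    for name, value in record.items():
        spec = codebook.get(name)
        if spec is None:
            problems.append(f"{name}: not described in the codebook")
        elif spec["type"] == "categorical":
            if value not in spec["values"]:
                problems.append(f"{name}: {value!r} is not a permitted response")
        elif spec["type"] == "continuous":
            low, high = spec["range"]
            if not low <= value <= high:
                problems.append(f"{name}: {value!r} is outside {low}-{high}")
    return problems


if __name__ == "__main__":
    print(check_record({"sex": 1, "age_years": 34}))
    print(check_record({"sex": 3, "age_years": 150, "height": 170}))
```

Keeping the codebook next to the data in this way means the same file documents the variables for readers and validates them for the analyst.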
Working with code and scripts in workflows
• Use ‘open’ programming/scripting languages not dependent upon
proprietary software
• Don’t reinvent the wheel: reuse existing code if it serves purpose
• Don’t update the source data, generate a derived file & label the version
no.
• Add a header to each code file that explains its purpose and indicates
who created it & when
• Add comments throughout code explaining purpose of functions/specific
lines (if not obvious)
• Document dependencies, including version number
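Several of the points above can be combined in a single script. The sketch below shows a file header, a helper that writes to a versioned derived file instead of overwriting the source, and a helper that records dependency versions. The author, date and file names are placeholders, and the `_v<number>` naming scheme is an assumption rather than a required convention.

```python
"""
Purpose : Derive a cleaned dataset from a raw survey export (hypothetical).
Author  : A. Researcher (placeholder)
Created : 2018-01-01 (placeholder)
Requires: Python 3.8 or later (for importlib.metadata)
"""
import sys
from importlib import metadata
from pathlib import Path


def derived_path(source, version):
    """Build a versioned output name so the source file is never overwritten."""
    source = Path(source)
    return source.with_name(f"{source.stem}_v{version}{source.suffix}")


def dependency_versions(packages):
    """Look up the installed version of each named package for the audit trail."""
    versions = {"python": sys.version.split()[0]}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


if __name__ == "__main__":
    print(derived_path("data/survey.csv", 2))
    print(dependency_versions(["pandas"]))
```

Writing the output of `dependency_versions` to a text file next to the derived data gives later users the version numbers they need to rerun the script.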
Providing access to resources
What do you make available?
• Anonymised data
• Code
• Research tools
• Workflows
When do you make it available?
• During the project lifetime
• On publication of findings
• Within 6-12 months of publication
Where do you host it?
• What platforms are appropriate to your needs?
How will access be provided?
• Open vs. controlled access
• Need a reason (participant consent, identifiable)
How will it be managed?
• Corresponding author, Data Access Committee, Data Sharing Agreement
https://www.flickr.com/photos/lwr/3897479560
https://www.flickr.com/photos/ryanr/142455033/
Data sharing principles
• Publish a description in a research catalogue
• Obtain a permanent ID to make it easy to cite
• Provide clear method to obtain files – open vs. safeguarded
• Handle access consistently (PLOS req.)
• Use recognised domain standards & vocabularies
• Common formats, e.g. STATA, CSV
• Apply clear usage licence - Creative Commons or other
• Provide documentation relevant to researchers in your field
The FAIR Guiding Principles for scientific data management and stewardship
Resource management tools
Functionality:
• Lifecycle management
• Object & version identifiers
• Workflow description standards that balance generic &
domain specific needs (E.g. DDI lifecycle, BPM variants)
Platforms:
• Electronic Lab Notebooks (RSpace, SciNote, LabArchives)
• Code hosting: myExperiment, RunMyCode, GitHub/GitLab
• Repository platforms: OSF, Data Compass
Analysis and reporting tools
Growing number of online tools allow you to
create and share interactive documents that
contain live code, data, and other resources
• R Markdown - https://rmarkdown.rstudio.com/
• Jupyter - http://jupyter.org/
• Colaboratory - https://colab.research.google.com/
• Benefits:
• Dynamic content that combines data & analysis
• Development environment - R, Python, SQL.
• Disadvantages:
• Another complex platform to host & manage
• Content will become publicly accessible
Images sourced from project webpages
In summary
Open science requires you to consider:
• Research stakeholders who will be interested in
your work
• The value of research outputs for verification and
further use
• Systems that will be used to collect, manage,
analyse and provide access to research
https://www.flickr.com/photos/keith_marshall_avery/8132240925/

More Related Content

What's hot

Data peer review workshop
Data peer review workshopData peer review workshop
Data peer review workshop
Varsha Khodiyar
 
Payton Eliminating Conflicts in Ebook Metadata
Payton Eliminating Conflicts in Ebook MetadataPayton Eliminating Conflicts in Ebook Metadata
Payton Eliminating Conflicts in Ebook Metadata
National Information Standards Organization (NISO)
 
Library resources and services for grant development
Library resources and services for grant developmentLibrary resources and services for grant development
Library resources and services for grant development
rds-wayne-edu
 
'Workshop on Smart Searching: Search Filters and Expert Topic Searches', by S...
'Workshop on Smart Searching: Search Filters and Expert Topic Searches', by S...'Workshop on Smart Searching: Search Filters and Expert Topic Searches', by S...
'Workshop on Smart Searching: Search Filters and Expert Topic Searches', by S...
CareSearch palliative care knowledge network
 
HLA PD Day EBLIP* Brisbane 2015: Smart Seaching
HLA PD Day EBLIP* Brisbane 2015: Smart SeachingHLA PD Day EBLIP* Brisbane 2015: Smart Seaching
HLA PD Day EBLIP* Brisbane 2015: Smart Seaching
Flinders Filters, Flinders University
 
NIH rigor and reproducibility.use of animals in research.msu 2.16
NIH rigor and reproducibility.use of animals in research.msu 2.16NIH rigor and reproducibility.use of animals in research.msu 2.16
NIH rigor and reproducibility.use of animals in research.msu 2.16
Michigan State University Research
 
Sbm open science committee report to the board
Sbm open science committee report to the boardSbm open science committee report to the board
Sbm open science committee report to the board
Bradford Hesse
 
Ontologies for Clinical Research - Assessment and Development
Ontologies for Clinical Research - Assessment and DevelopmentOntologies for Clinical Research - Assessment and Development
Ontologies for Clinical Research - Assessment and Development
Wolfgang Kuchinke
 
Doing research better: The role of meta‐data
Doing research better: The role of meta‐dataDoing research better: The role of meta‐data
Doing research better: The role of meta‐data
GarethKnight
 
Sharing, Reproducibility, Replication – AN NIH View
Sharing, Reproducibility, Replication – AN NIH ViewSharing, Reproducibility, Replication – AN NIH View
Sharing, Reproducibility, Replication – AN NIH View
Philip Bourne
 
Publishing perspectives on data management & future directions
Publishing perspectives on data management & future directionsPublishing perspectives on data management & future directions
Publishing perspectives on data management & future directions
ARDC
 
Peer Reviewing Data: experiences from a data journal
Peer Reviewing Data: experiences from a data journalPeer Reviewing Data: experiences from a data journal
Peer Reviewing Data: experiences from a data journal
Varsha Khodiyar
 
Research Data Management in practice, RIA Data Management Workshop Brisbane 2017
Research Data Management in practice, RIA Data Management Workshop Brisbane 2017Research Data Management in practice, RIA Data Management Workshop Brisbane 2017
Research Data Management in practice, RIA Data Management Workshop Brisbane 2017
ARDC
 
TAIR ICAR 2010 Presentation
TAIR ICAR 2010 PresentationTAIR ICAR 2010 Presentation
TAIR ICAR 2010 Presentation
Phoenix Bioinformatics
 
Digital Scholar Webinar: Recruiting Research Participants Online Using Reddit
Digital Scholar Webinar: Recruiting Research Participants Online Using RedditDigital Scholar Webinar: Recruiting Research Participants Online Using Reddit
Digital Scholar Webinar: Recruiting Research Participants Online Using Reddit
SC CTSI at USC and CHLA
 
Clicking Past Google
Clicking Past GoogleClicking Past Google
Clicking Past Google
Douglas Joubert
 
Case Study Life Sciences Data: Central for Integrative Systems Biology and Bi...
Case Study Life Sciences Data: Central for Integrative Systems Biology and Bi...Case Study Life Sciences Data: Central for Integrative Systems Biology and Bi...
Case Study Life Sciences Data: Central for Integrative Systems Biology and Bi...
sesrdm
 
Clinical Data Publishing at Scientific Data
Clinical Data Publishing at Scientific DataClinical Data Publishing at Scientific Data
Clinical Data Publishing at Scientific Data
Varsha Khodiyar
 
RDAP 16 Poster: Diving into Data: Implementing a Data Repository at the Texas...
RDAP 16 Poster: Diving into Data: Implementing a Data Repository at the Texas...RDAP 16 Poster: Diving into Data: Implementing a Data Repository at the Texas...
RDAP 16 Poster: Diving into Data: Implementing a Data Repository at the Texas...
ASIS&T
 
The journey to evidence 2 1
The journey to evidence 2 1The journey to evidence 2 1
The journey to evidence 2 1stanbridge
 

What's hot (20)

Data peer review workshop
Data peer review workshopData peer review workshop
Data peer review workshop
 
Payton Eliminating Conflicts in Ebook Metadata
Payton Eliminating Conflicts in Ebook MetadataPayton Eliminating Conflicts in Ebook Metadata
Payton Eliminating Conflicts in Ebook Metadata
 
Library resources and services for grant development
Library resources and services for grant developmentLibrary resources and services for grant development
Library resources and services for grant development
 
'Workshop on Smart Searching: Search Filters and Expert Topic Searches', by S...
'Workshop on Smart Searching: Search Filters and Expert Topic Searches', by S...'Workshop on Smart Searching: Search Filters and Expert Topic Searches', by S...
'Workshop on Smart Searching: Search Filters and Expert Topic Searches', by S...
 
HLA PD Day EBLIP* Brisbane 2015: Smart Seaching
HLA PD Day EBLIP* Brisbane 2015: Smart SeachingHLA PD Day EBLIP* Brisbane 2015: Smart Seaching
HLA PD Day EBLIP* Brisbane 2015: Smart Seaching
 
NIH rigor and reproducibility.use of animals in research.msu 2.16
NIH rigor and reproducibility.use of animals in research.msu 2.16NIH rigor and reproducibility.use of animals in research.msu 2.16
NIH rigor and reproducibility.use of animals in research.msu 2.16
 
Sbm open science committee report to the board
Sbm open science committee report to the boardSbm open science committee report to the board
Sbm open science committee report to the board
 
Ontologies for Clinical Research - Assessment and Development
Ontologies for Clinical Research - Assessment and DevelopmentOntologies for Clinical Research - Assessment and Development
Ontologies for Clinical Research - Assessment and Development
 
Doing research better: The role of meta‐data
Doing research better: The role of meta‐dataDoing research better: The role of meta‐data
Doing research better: The role of meta‐data
 
Sharing, Reproducibility, Replication – AN NIH View
Sharing, Reproducibility, Replication – AN NIH ViewSharing, Reproducibility, Replication – AN NIH View
Sharing, Reproducibility, Replication – AN NIH View
 
Publishing perspectives on data management & future directions
Publishing perspectives on data management & future directionsPublishing perspectives on data management & future directions
Publishing perspectives on data management & future directions
 
Peer Reviewing Data: experiences from a data journal
Peer Reviewing Data: experiences from a data journalPeer Reviewing Data: experiences from a data journal
Peer Reviewing Data: experiences from a data journal
 
Research Data Management in practice, RIA Data Management Workshop Brisbane 2017
Research Data Management in practice, RIA Data Management Workshop Brisbane 2017Research Data Management in practice, RIA Data Management Workshop Brisbane 2017
Research Data Management in practice, RIA Data Management Workshop Brisbane 2017
 
TAIR ICAR 2010 Presentation
TAIR ICAR 2010 PresentationTAIR ICAR 2010 Presentation
TAIR ICAR 2010 Presentation
 
Digital Scholar Webinar: Recruiting Research Participants Online Using Reddit
Digital Scholar Webinar: Recruiting Research Participants Online Using RedditDigital Scholar Webinar: Recruiting Research Participants Online Using Reddit
Digital Scholar Webinar: Recruiting Research Participants Online Using Reddit
 
Clicking Past Google
Clicking Past GoogleClicking Past Google
Clicking Past Google
 
Case Study Life Sciences Data: Central for Integrative Systems Biology and Bi...
Case Study Life Sciences Data: Central for Integrative Systems Biology and Bi...Case Study Life Sciences Data: Central for Integrative Systems Biology and Bi...
Case Study Life Sciences Data: Central for Integrative Systems Biology and Bi...
 
Clinical Data Publishing at Scientific Data
Clinical Data Publishing at Scientific DataClinical Data Publishing at Scientific Data
Clinical Data Publishing at Scientific Data
 
RDAP 16 Poster: Diving into Data: Implementing a Data Repository at the Texas...
RDAP 16 Poster: Diving into Data: Implementing a Data Repository at the Texas...RDAP 16 Poster: Diving into Data: Implementing a Data Repository at the Texas...
RDAP 16 Poster: Diving into Data: Implementing a Data Repository at the Texas...
 
The journey to evidence 2 1
The journey to evidence 2 1The journey to evidence 2 1
The journey to evidence 2 1
 

Similar to Enhance your rese​arch impact through open science

Preparing Data for Sharing: The FAIR Principles
Preparing Data for Sharing: The FAIR PrinciplesPreparing Data for Sharing: The FAIR Principles
Preparing Data for Sharing: The FAIR Principles
London School of Hygiene and Tropical Medicine
 
Research data management workshop april12 2016
Research data management workshop april12 2016 Research data management workshop april12 2016
Research data management workshop april12 2016
Rebecca Raworth, MLIS
 
Research data management workshop April 2016
Research data management workshop April 2016Research data management workshop April 2016
Research data management workshop April 2016
Rebecca Raworth, MLIS
 
Strasser "Effective data management and its role in open research"
Strasser "Effective data management and its role in open research"Strasser "Effective data management and its role in open research"
Strasser "Effective data management and its role in open research"
National Information Standards Organization (NISO)
 
Perspectives on the Role of Trustworthy Repository Standards in Data Journal ...
Perspectives on the Role of Trustworthy Repository Standards in Data Journal ...Perspectives on the Role of Trustworthy Repository Standards in Data Journal ...
Perspectives on the Role of Trustworthy Repository Standards in Data Journal ...
The University of Edinburgh
 
Effective research data management
Effective research data managementEffective research data management
Effective research data management
Catherine Gold
 
Research data life cycle
Research data life cycleResearch data life cycle
Research data life cycle
University of Arizona
 
NIH iDASH meeting on data sharing - BioSharing, ISA and Scientific Data
NIH iDASH meeting on data sharing - BioSharing, ISA and Scientific DataNIH iDASH meeting on data sharing - BioSharing, ISA and Scientific Data
NIH iDASH meeting on data sharing - BioSharing, ISA and Scientific Data
Susanna-Assunta Sansone
 
Being FAIR: FAIR data and model management SSBSS 2017 Summer School
Being FAIR:  FAIR data and model management SSBSS 2017 Summer SchoolBeing FAIR:  FAIR data and model management SSBSS 2017 Summer School
Being FAIR: FAIR data and model management SSBSS 2017 Summer School
Carole Goble
 
Preparing your data for sharing and publishing
Preparing your data for sharing and publishingPreparing your data for sharing and publishing
Preparing your data for sharing and publishing
Varsha Khodiyar
 
Open data in a big data world (Accord ICSU-IAP-ISSC-TWAS)
Open data in a big data world (Accord ICSU-IAP-ISSC-TWAS)Open data in a big data world (Accord ICSU-IAP-ISSC-TWAS)
Open data in a big data world Accord (ICSU-IAP-ISSC-TWAS)
Open data in a big data world Accord (ICSU-IAP-ISSC-TWAS)Open data in a big data world Accord (ICSU-IAP-ISSC-TWAS)
Open data in a big data world Accord (ICSU-IAP-ISSC-TWAS)
CLACSO-Latin American Council of Social Sciences, Open Access
 
Scientific Data and peer review session at Dryad event, May 2015
Scientific Data and peer review session at Dryad event, May 2015 Scientific Data and peer review session at Dryad event, May 2015
Scientific Data and peer review session at Dryad event, May 2015
Susanna-Assunta Sansone
 
Reproducible research: theory
Reproducible research: theoryReproducible research: theory
Reproducible research: theory
C. Tobin Magle
 
Using Feedback from Data Consumers to Capture Quality Information on Environm...
Using Feedback from Data Consumers to Capture Quality Information on Environm...Using Feedback from Data Consumers to Capture Quality Information on Environm...
Using Feedback from Data Consumers to Capture Quality Information on Environm...
Anusuriya Devaraju
 
Talk on Research Data Management
Talk on Research Data ManagementTalk on Research Data Management
Talk on Research Data Management
Anita de Waard
 
Metadata for Research Objects
Metadata for Research ObjectsMetadata for Research Objects
Metadata for Research Objects
seanb
 
Data Management for Research (New Faculty Orientation)
Data Management for Research (New Faculty Orientation)Data Management for Research (New Faculty Orientation)
Data Management for Research (New Faculty Orientation)
aaroncollie
 
Shareable by Design: Making Better Use of your Research
Shareable by Design: Making Better Use of your ResearchShareable by Design: Making Better Use of your Research
Shareable by Design: Making Better Use of your Research
London School of Hygiene and Tropical Medicine
 
Research methods group accelarating impact by sharing data
Research methods group  accelarating impact by sharing dataResearch methods group  accelarating impact by sharing data
Research methods group accelarating impact by sharing dataWorld Agroforestry (ICRAF)
 

Similar to Enhance your rese​arch impact through open science (20)

Preparing Data for Sharing: The FAIR Principles
Preparing Data for Sharing: The FAIR PrinciplesPreparing Data for Sharing: The FAIR Principles
Preparing Data for Sharing: The FAIR Principles
 
Research data management workshop april12 2016
Research data management workshop april12 2016 Research data management workshop april12 2016
Research data management workshop april12 2016
 
Research data management workshop April 2016
Research data management workshop April 2016Research data management workshop April 2016
Research data management workshop April 2016
 
Strasser "Effective data management and its role in open research"
Strasser "Effective data management and its role in open research"Strasser "Effective data management and its role in open research"
Strasser "Effective data management and its role in open research"
 
Perspectives on the Role of Trustworthy Repository Standards in Data Journal ...
Perspectives on the Role of Trustworthy Repository Standards in Data Journal ...Perspectives on the Role of Trustworthy Repository Standards in Data Journal ...
Perspectives on the Role of Trustworthy Repository Standards in Data Journal ...
 
Effective research data management
Effective research data managementEffective research data management
Effective research data management
 
Research data life cycle
Research data life cycleResearch data life cycle
Research data life cycle
 
NIH iDASH meeting on data sharing - BioSharing, ISA and Scientific Data
NIH iDASH meeting on data sharing - BioSharing, ISA and Scientific DataNIH iDASH meeting on data sharing - BioSharing, ISA and Scientific Data
NIH iDASH meeting on data sharing - BioSharing, ISA and Scientific Data
 
Being FAIR: FAIR data and model management SSBSS 2017 Summer School
Being FAIR:  FAIR data and model management SSBSS 2017 Summer SchoolBeing FAIR:  FAIR data and model management SSBSS 2017 Summer School
Being FAIR: FAIR data and model management SSBSS 2017 Summer School
 
Preparing your data for sharing and publishing
Preparing your data for sharing and publishingPreparing your data for sharing and publishing
Preparing your data for sharing and publishing
 
Open data in a big data world (Accord ICSU-IAP-ISSC-TWAS)
Open data in a big data world (Accord ICSU-IAP-ISSC-TWAS)Open data in a big data world (Accord ICSU-IAP-ISSC-TWAS)
Open data in a big data world (Accord ICSU-IAP-ISSC-TWAS)
 
Open data in a big data world Accord (ICSU-IAP-ISSC-TWAS)
Open data in a big data world Accord (ICSU-IAP-ISSC-TWAS)Open data in a big data world Accord (ICSU-IAP-ISSC-TWAS)
Open data in a big data world Accord (ICSU-IAP-ISSC-TWAS)
 
Scientific Data and peer review session at Dryad event, May 2015
Scientific Data and peer review session at Dryad event, May 2015 Scientific Data and peer review session at Dryad event, May 2015
Scientific Data and peer review session at Dryad event, May 2015
 
Reproducible research: theory
Reproducible research: theoryReproducible research: theory
Reproducible research: theory
 
Using Feedback from Data Consumers to Capture Quality Information on Environm...
Using Feedback from Data Consumers to Capture Quality Information on Environm...Using Feedback from Data Consumers to Capture Quality Information on Environm...
Using Feedback from Data Consumers to Capture Quality Information on Environm...
 
Talk on Research Data Management
Talk on Research Data ManagementTalk on Research Data Management
Talk on Research Data Management
 
Metadata for Research Objects
Metadata for Research ObjectsMetadata for Research Objects
Metadata for Research Objects
 
Data Management for Research (New Faculty Orientation)
Data Management for Research (New Faculty Orientation)Data Management for Research (New Faculty Orientation)
Data Management for Research (New Faculty Orientation)
 
Shareable by Design: Making Better Use of your Research
Shareable by Design: Making Better Use of your ResearchShareable by Design: Making Better Use of your Research
Shareable by Design: Making Better Use of your Research
 
Research methods group accelarating impact by sharing data
Research methods group  accelarating impact by sharing dataResearch methods group  accelarating impact by sharing data
Research methods group accelarating impact by sharing data
 

More from London School of Hygiene and Tropical Medicine

Preparing to submit your thesis at LSHTM
Preparing to submit your thesis at LSHTMPreparing to submit your thesis at LSHTM
Preparing to submit your thesis at LSHTM
London School of Hygiene and Tropical Medicine
 
Your research is more than a thesis: Make the most of research data and other...
Your research is more than a thesis: Make the most of research data and other...Your research is more than a thesis: Make the most of research data and other...
Your research is more than a thesis: Make the most of research data and other...
London School of Hygiene and Tropical Medicine
 
Information Security and GDPR
Information Security and GDPRInformation Security and GDPR
Information Security and GDPR
London School of Hygiene and Tropical Medicine
 
GDPR and Research Data Management
GDPR and Research Data ManagementGDPR and Research Data Management
GDPR and Research Data Management
London School of Hygiene and Tropical Medicine
 
Towards Open Research: practices, experiences, barriers and opportunities
Towards Open Research: practices, experiences, barriers and opportunitiesTowards Open Research: practices, experiences, barriers and opportunities
Towards Open Research: practices, experiences, barriers and opportunities
London School of Hygiene and Tropical Medicine
 
Data Journals and repositories: Getting academic credit for data sharing
Data Journals and repositories: Getting academic credit for data sharingData Journals and repositories: Getting academic credit for data sharing
Data Journals and repositories: Getting academic credit for data sharing
London School of Hygiene and Tropical Medicine
 
Crowd sourcing and high resolution satellite imagery in public health
Crowd sourcing and high resolution satellite imagery in public healthCrowd sourcing and high resolution satellite imagery in public health
Crowd sourcing and high resolution satellite imagery in public health
London School of Hygiene and Tropical Medicine
 
Determining the relationship between physical environment and weight status u...
Determining the relationship between physical environment and weight status u...Determining the relationship between physical environment and weight status u...
Determining the relationship between physical environment and weight status u...
London School of Hygiene and Tropical Medicine
 
i-Sense: an early-warning sensing systems for infectious diseases
i-Sense: an early-warning sensing systems for infectious diseasesi-Sense: an early-warning sensing systems for infectious diseases
i-Sense: an early-warning sensing systems for infectious diseases
London School of Hygiene and Tropical Medicine
 
Internet-based surveillance of illness: the FluSurvey platform
Enhance your research impact through open science

  • 1. Enhance your research impact through open science Gareth Knight Research Data Manager Library & Archives Service researchdatamanagement@lshtm.ac.uk
  • 2. Open Science A broad movement that seeks to improve the quality of research through greater: • Transparency: Ensure methods are clearly explained and made available earlier • Consistency: Common standards, tools and services are used to perform analysis • Collaboration: Opportunities are available for external contribution & collaboration on research • Access: All resources necessary to recreate the analysis are made available in a form that enables verification & reuse (Summary: it’s science with the benefit of 21st century tools)
  • 3. Reproducibility Crisis Vines et al (2014) investigated data availability for 516 articles published 2-22 years previously – odds of a dataset being obtainable fell by 17% per year. A 2016 Nature survey revealed that 52% of 1,576 surveyed researchers considered there to be a 'significant' reproducibility crisis in science. • Approx. 68% of respondents failed to reproduce a medical experiment. Research replication is time-consuming and expensive • Cancer Biology: https://osf.io/e81xl/wiki/home/ • Psychological Science: https://osf.io/ezcuj/wiki/home/ Retraction Watch lists 18,000+ papers that have been retracted, many as a result of faulty science. Vines et al (2014) https://doi.org/10.1016/j.cub.2013.11.014 Nature (2016) https://www.nature.com/news/1-500-scientists-lift-the-lid-on-reproducibility-1.19970
  • 4. What are the benefits of open science? Analysis of open research practices and motivations of 583 Wellcome & 259 ESRC funded researchers: • Improved visibility of research • More publications • Higher citation rate – See Piwowar & Vision (2013) • Contribute to academic profile • Career benefits (e.g. promotion) • New collaborations Van den Eynden, V. et al. (2016) Towards Open Research: Practices, experiences, barriers and Opportunities. Wellcome Trust. https://doi.org/10.6084/m9.figshare.4055448 Piwowar HA, Vision TJ. (2013) Data reuse and the open data citation advantage. https://doi.org/10.7717/peerj.175
  • 5. Open Science by Design Research lifecycle: Plan • Collect • Manage • Analyse • Publish. Outcomes: Enhanced research standards • Open Education Resources • Open software • Citizen Science & peer review opportunities • Open access • Reusable resources. Image: https://www.flaticon.com/free-icon/scientist_857648
  • 7. Research Objectives Research is reviewed for many purposes: • Verification: check analysis to confirm conclusions are valid • Replicate: Same methods applied to get same result, different environment • Reproduce: Same methods applied, different setup • Reuse: same data, different research What steps do you take to ensure research is easier to validate/replicate/reproduce or reuse by others? The Difference https://xkcd.com/242/
  • 8. Plan for openness from the outset Plan: Be aware of requirements • Consider community engagement opportunities • Document research protocol & publish. Data collection: Inform participants and relevant stakeholders • Acquire raw data in electronic form using secure systems (e.g. ODK). Data management: Organise resources logically • Ensure raw data is read-only • Assign unique IDs to relevant items. Data processing: Automate processing activities (as far as possible) in an open format so that they can be re-applied • Document activities performed to ensure an audit trail. Data analysis: Provide opportunities for relevant individuals to contribute • Store resources used to underpin analysis (inc. those used to produce graphs). Reporting: Consider how resources can be made accessible • Ensure resources are curated & accessible in the long term. https://doi.org/10.1371/journal.pcbi.1003285
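As one illustration of the "ensure raw data is read-only" step on the slide above, the following minimal sketch clears the write permission bits on a file. The function name `make_read_only` is hypothetical, and the behaviour assumes a POSIX-style file system (on Windows, `chmod` only toggles the read-only flag). Storage-level controls on a managed server may be a better fit in practice.

```python
import stat
from pathlib import Path


def make_read_only(path):
    """Remove write permission for owner, group and others (hypothetical helper)."""
    path = Path(path)
    current = path.stat().st_mode
    # Clear only the three write bits; read and execute bits are left untouched.
    path.chmod(current & ~(stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH))
    return path
```

A typical use would be to call `make_read_only("data/raw/survey.csv")` once the raw file has been received, so that later processing scripts have to write to a separate derived file.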
  • 9. Openness requirements Research practice • Demonstrate rigour of research Funder requirements: • Gold vs. Green • Publication status, research data, other outputs Domain-specific reporting guidelines: • For study protocol and project outputs https://www.equator-network.org/ Journal policies: • Transparency and Openness Promotion (TOP) https://cos.io/our-services/top-guidelines/ • Joint Data Archiving Policy (JDAP) https://datadryad.org//pages/jdap https://cos.io/prereg/
  • 10. Storage and organisation • Ensure project resources are stored in a location that is secure and available to relevant parties • Can you find files from a project completed 10 years ago? • Store on Secure Server or other defined location • Adopt a consistent structure to organise & label content • Content type (data, documents, code) • Version (raw, processed) • Sensitivity – store personal info in secure locale • Create a file inventory spreadsheet • Filename, location, content, source, sensitivity, etc. https://xkcd.com/1459/
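The file inventory suggested on the slide above could be started automatically and then completed by hand. The sketch below is one possible approach, not a prescribed tool: the function name `build_inventory`, the column names and the use of a SHA-256 checksum are all assumptions made for illustration.

```python
import csv
import hashlib
from pathlib import Path

# Column names are illustrative; the last three are left blank for manual entry.
FIELDS = ["filename", "location", "size_bytes", "sha256",
          "content", "source", "sensitivity"]


def build_inventory(project_dir, inventory_csv):
    """Write one CSV row per file found under project_dir and return the rows."""
    root = Path(project_dir)
    out = Path(inventory_csv).resolve()
    rows = []
    for path in sorted(root.rglob("*")):
        # Skip directories and any inventory file left by a previous run.
        if not path.is_file() or path.resolve() == out:
            continue
        rows.append({
            "filename": path.name,
            "location": path.parent.relative_to(root).as_posix(),
            "size_bytes": path.stat().st_size,
            "sha256": hashlib.sha256(path.read_bytes()).hexdigest(),
            "content": "",
            "source": "",
            "sensitivity": "",
        })
    with open(inventory_csv, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
    return rows
```

The checksum column is an addition beyond the slide's list; it gives a simple way to confirm, years later, that a file has not changed. Reading whole files into memory is fine for small projects but would need chunked hashing for very large files.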
  • 11. Tidy data A set of principles to make data more consistent. Common issues: • Column headers contain values • Multiple variables held in 1 column • Variables held in both rows and columns • Multiple types of observation recorded in the same table. Wickham applies 3rd Normal Form: • One row for each observation • One column for each variable • One table for each type of observation • Column headers (where they are used) should be variable names. Tidy data tools: tidyr, dplyr, ggplot2, data.table, pandas. https://www.jstatsoft.org/article/view/v059i10/v59i10.pdf
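To make the first common issue concrete (column headers that contain values), the sketch below reshapes a small wide table into one row per observation. The data and the helper name `melt` are invented for illustration; in practice the tools listed on the slide do this job, for example pandas provides a `melt` function. The standard library is used here only to keep the example self-contained.

```python
def melt(rows, id_columns, variable_name, value_name):
    """Turn value-bearing column headers into a variable column (illustrative helper)."""
    tidy = []
    for row in rows:
        ids = {col: row[col] for col in id_columns}
        for col, value in row.items():
            if col in id_columns:
                continue
            # One output row per (identifier, former column header) pair.
            tidy.append({**ids, variable_name: col, value_name: value})
    return tidy


# Hypothetical wide table: the year values are stored as column headers.
wide = [
    {"site": "A", "2015": 12, "2016": 15},
    {"site": "B", "2015": 7, "2016": 9},
]

tidy = melt(wide, id_columns=["site"], variable_name="year", value_name="cases")
for row in tidy:
    print(row)
```

After reshaping, each row holds one observation (a site in a year) and each column holds one variable (site, year, cases), which matches the rules listed on the slide.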
  • 12. Documentation & metadata What info is needed to replicate or re-apply your analysis? What info is needed to analyse and use your data? User guide: • Study design and data collection methods • Data Analysis and Preparation • Quality checks applied Codebook: • Variable type (Continuous, Ordinal, Categorical, Missing values, censored/redacted) • Permitted responses & their meaning (what is 1?) • Abbreviations & phrases • Research protocols • Standard Operating Procedures • Codebooks & data dictionaries • Informed Consent form & participant information sheet • Questionnaires, interview guide and other collection tools • Data papers and other publications • Other relevant documents http://www.dcc.ac.uk/resources/metadata-standards
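A codebook of the kind described on the slide above can also be kept in a machine-readable form, so that it documents the data and checks it at the same time. The sketch below is an assumption-laden illustration: the variable names, permitted values, missing-value codes and the `check_record` helper are all hypothetical.

```python
# Hypothetical codebook: variable type, label, permitted values, missing code.
CODEBOOK = {
    "age_years": {
        "type": "continuous",
        "label": "Age at enrolment (years)",
        "min": 0,
        "max": 120,
        "missing": -99,
    },
    "sex": {
        "type": "categorical",
        "label": "Sex recorded at enrolment",
        "values": {1: "Female", 2: "Male"},
        "missing": 9,
    },
}


def check_record(record, codebook):
    """Return a list of human-readable problems found in one data record."""
    problems = []
    for name, spec in codebook.items():
        if name not in record:
            problems.append(f"{name}: variable absent")
            continue
        value = record[name]
        if value == spec["missing"]:
            continue  # A declared missing-value code is always acceptable.
        if spec["type"] == "categorical" and value not in spec["values"]:
            problems.append(f"{name}: {value!r} is not a permitted response")
        elif spec["type"] == "continuous" and not spec["min"] <= value <= spec["max"]:
            problems.append(f"{name}: {value!r} is outside {spec['min']}-{spec['max']}")
    return problems
```

For example, `check_record({"age_years": 34, "sex": 3}, CODEBOOK)` reports that 3 is not a permitted response for `sex`, which answers the slide's "what is 1?" question in both directions.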
  • 13. Working with code and scripts in workflows • Use ‘open’ programming/scripting languages not dependent upon proprietary software • Don’t reinvent the wheel: reuse existing code if it serves purpose • Don’t update the source data, generate a derived file & label the version no. • Ensure a header to code files that explains their purpose and indicate who created it & when • Add comments throughout code explaining purpose of functions/specific lines (if not obvious) • Document dependencies, including version number
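Several of the points on the slide above (a header explaining purpose, a derived file with a version number rather than an edited source, documented dependencies) can be combined in one short script. The following is a sketch only: the file names, the whitespace-stripping "cleaning" step and the helper names `derive` and `describe_environment` are assumptions, and the input is assumed to be a well-formed CSV file.

```python
"""
Purpose : Derive a cleaned file from raw survey data (illustrative only).
Author  : <your name>
Created : <date>
Input   : a raw CSV file, treated as read-only
Output  : <input name>_v<version>.csv in the chosen output folder
"""
import csv
import platform
from importlib import metadata
from pathlib import Path


def derive(raw_path, out_dir, version):
    """Write a versioned derived copy; the raw file is only ever read."""
    raw_path = Path(raw_path)
    out_path = Path(out_dir) / f"{raw_path.stem}_v{version}.csv"
    if out_path.exists():
        raise FileExistsError(f"{out_path} exists; choose a new version number")
    out_path.parent.mkdir(parents=True, exist_ok=True)
    with open(raw_path, newline="", encoding="utf-8") as src, \
            open(out_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
        writer.writeheader()
        for row in reader:
            # Example processing step: strip stray whitespace from every value.
            writer.writerow({k: (v or "").strip() for k, v in row.items()})
    return out_path


def describe_environment(packages=()):
    """Record the Python version and the installed version of named packages."""
    info = {"python": platform.python_version()}
    for name in packages:
        try:
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            info[name] = "not installed"
    return info
```

Saving the output of `describe_environment(["pandas"])` next to the derived file is one lightweight way to document dependencies and their version numbers; a pinned requirements file would serve the same purpose.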
  • 14. Providing access to resources What do you make available? Anonymised data • Code • Research tools • Workflows. When do you make it available? During the project lifetime • On publication of findings • Within 6-12 months of publication. Where do you host it? What platforms are appropriate to your needs? How will access be provided? Open vs. controlled access • Controlled access needs a reason (e.g. participant consent, identifiable data). How will it be managed? Corresponding author • Data Access Committee • Data Sharing Agreement. https://www.flickr.com/photos/lwr/3897479560 https://www.flickr.com/photos/ryanr/142455033/
  • 15. Data sharing principles • Publish a description in a research catalogue • Obtain a permanent ID to make it easy to cite • Provide clear method to obtain files – open vs. safeguarded • Handle access consistently (PLOS req.) • Use recognised domain standards & vocabularies • Common formats, e.g. STATA, CSV • Apply clear usage licence - Creative Commons or other • Provide documentation relevant to researchers in your field. See: The FAIR Guiding Principles for scientific data management and stewardship
  • 16. Resource management tools Functionality: • Lifecycle management • Object & version identifiers • Workflow description standards that balance generic & domain specific needs (e.g. DDI lifecycle, BPM variants) Platforms: • Electronic Lab Notebooks (Rspace, SciNote, LabArchives) • Code hosting: My Experiment, runmycode, GitHub/GitLab • Repository platforms: OSF, Data Compass
  • 17. Analysis and reporting tools A growing number of online tools allow you to create and share interactive documents that contain live code, data, and other resources • R Markdown - https://rmarkdown.rstudio.com/ • Jupyter - http://jupyter.org/ • Colaboratory - https://colab.research.google.com/ • Benefits: • Dynamic content that combines data & analysis • Development environment - R, Python, SQL • Disadvantages: • Another complex platform to host & manage • Content will become publicly accessible. Images sourced from project webpages
  • 18. In summary Open science requires you to consider: • Research stakeholders who will be interested in your work • The value of research outputs for verification and further use • Systems that will be used to collect, manage, analyse and provide access to research https://www.flickr.com/photos/keith_marshall_avery/8132240925/