Open
Research
NISO training
fall 2024
session 3:
Reproducibility
and code sharing
October 10, 2024
Facilitated by Bianca Kramer & Jeroen Bosman
https://tinyurl.com/NISO-fall2024-session03
Course goals and structure
Course Goals
● Learn what open research
entails and why one should
pursue it
● Explore practices and tools,
getting insight into how these
are implemented and used
● Discuss open research
policies and how to support
open research in practice
Each week’s structure:
Review previous week:
● 20 min short recap, share actions
Session topic
● 25 min ‘what’ and ‘why’ (lecture)
● 25 min ‘how’ (hands-on activity)
● 20 min support, monitoring, policy
(discussion)
Home assignment
Recap session 2: Preparing open research
[Diagram with role labels: funder, researcher/institution, stakeholder, independent]
Session 2 - Home assignment
Before our next session,
formulate one or more
potential actions at your
organization to facilitate
‘preparing open research’
Please share your
actions in the slides below -
we will then discuss them
together
Types of open research & education support
● Inform, indirect support. E.g.: info on website, in LibGuides etc., curating metadata, managing repositories. Asks for: knowledge, organizing information, reliability.
● Direct support. E.g.: training, answering questions, providing workshops. Asks for: communication skills, expertise, listening, ability to inspire.
● Advise, advocate. E.g.: evidence-based opinions on what is a good choice, what is important and why. Asks for: setting priorities, knowing the patron's goals, disciplinary knowledge, vision, ability to convince.
● Develop policies. E.g.: co-developing policies on open science aspects or information strategy. Asks for: authority, role being accepted by partners, strategic thinking.
Session 3: Reproducibility and code sharing
How can open science improve reproducibility
of research and transparency of the research
process?
In this session we explore practices such as
preregistration, reproducible coding,
collaborative coding, and sharing and archiving
of code, along with some of the platforms and
tools that support them.
We will also discuss uncertainties researchers
may face regarding these practices.
[Research cycle diagram, analysis phase highlighted]
Session 3: Reproducibility and code sharing
For the analysis phase, researchers make choices
around:
● preregistration
● reproducible coding
● sharing and archiving of code
● sharing protocols, workflows
The relevance of these aspects may differ
between disciplines and research designs, but
their importance is growing.
Reproducibility
Why is it important?
Nature asked 1,576 scientists this
question as part of an online
survey. Most agree that there is a
crisis and over 70% said they'd
tried and failed to reproduce
another group's experiments.
Reproducibility
Why is it important?
1. To show evidence of the correctness of your results
2. To enable others to make use of your methods and results
Who can benefit?
● Collaborators
● Peer reviewers & journal editors
● Broad scientific community
● The public
● You!
[Diagram: verifiability & reproducibility, efficiency, transparency & accountability, relevance & stakeholder involvement]
What is reproducibility?
same data? same methodology?
Image source: Scriberia for The Turing Way https://doi.org/10.5281/zenodo.3332808.
Reproducibility
Not just for code and software!
● Computational reproducibility: sharing code, collaborative coding; notebooks (executable code + documentation; a minimal sketch follows below); archiving code
● Experiments / protocols: sharing protocols, Research Resource Identification, reporting guidelines
● Systematic reviews: standardized methodology, archiving/sharing search strings
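To make the "notebooks: executable code + documentation" idea concrete, here is a minimal sketch in Python using the percent cell format that tools such as Jupyter (via jupytext) and VS Code recognize. The measurements below are invented placeholders, not data from any of the examples in this session:

```python
# %% [markdown]
# # Snow depth summary
# Narrative documentation lives next to the code it describes,
# so the analysis can be read, rerun, and reviewed as one unit.

# %%
import statistics

# Placeholder values; a real notebook would load a shared, archived dataset.
snow_depth_cm = [132, 128, 141, 137, 125]
print(f"Mean snow depth: {statistics.mean(snow_depth_cm):.1f} cm")
```

Because the cell markers are ordinary comments, the same file also runs as a plain Python script, which keeps the documented and executable versions from drifting apart.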
Computational reproducibility
Like this?
Computational reproducibility
● ORGANIZATION: file organization, file naming, version control (Git)
● DOCUMENTATION: README file, code commenting, declaring dependencies
● AUTOMATION: code vs. point & click (Excel…), functions for repeated steps (see the sketch below)
● DISSEMINATION: archiving/sharing code (and data), software licenses, software citation
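A sketch of the automation and documentation points above, assuming Python with pandas; the file paths and column names are invented for illustration:

```python
"""Reproducible analysis step: per-group means from a CSV.

Dependencies (declared, e.g., in requirements.txt): pandas
"""
import pandas as pd


def group_means(csv_path: str, group_col: str, value_col: str) -> pd.DataFrame:
    """One documented, reusable function instead of repeated manual spreadsheet edits."""
    df = pd.read_csv(csv_path)
    return df.groupby(group_col, as_index=False)[value_col].mean()


if __name__ == "__main__":
    # Placeholder paths and columns; the whole step reruns with a single command.
    group_means("data/measurements.csv", "site", "depth_cm").to_csv(
        "output/site_means.csv", index=False
    )
```

Kept under version control, each result is traceable to an exact version of this script (the organization point), and archiving the script alongside the data with a license makes the computation repeatable by others (the dissemination point).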
Computational reproducibility
The Turing Way: a handbook for
reproducible, ethical and
collaborative data science
https://book.the-turing-way.org/
Preregistration
Preregistration improves research practice and reduces
bias by specifying in advance how data will be collected
and analyzed.
Preregistration distinguishes between confirmatory
research (hypothesis-testing) and exploratory research.
Why preregister?
● Improve the credibility of your results
● Improve your research design through earlier
planning (and review)
https://www.cos.io/prereg
Preregistration
Already common in some types of research!
Reproducibility
How to assess, monitor, and test?
● Individual assessment
● Institutional monitoring
● Replication studies
Exploring tools and practices
Choose to explore either examples of what specific reproducibility practices look like, or the extent to which these practices are being implemented:
Examples of specific practice implementations
● Preregistration
● Protocols sharing
● Data sharing
● Code sharing
● Package with multiple components
Insights into degrees of implementation
● Trials registration
● Preregistration generally
● Data sharing
● Code sharing
Exploring tools and practices: activity instructions
Examples of specific practice implementations
Go to one of the "examples" breakout groups. Group size is 4 maximum: fill group E1 first, the 5th person starts E2, etc. If you end up alone in a group, you may add yourself as a fifth person to another group.
● As a group, decide which of the practices below you want to see examples of, and click that link. You will likely have time for two at most. Scan the record and briefly look at the options of the platform as a whole. Talk aloud about what you see.
○ Preregistration: The TikTok-ization of News: Effects on (the Illusion of) Knowledge
○ Protocols sharing: Mod3D Live Cell Chambers and holders 3D printing and Assembly V.3
○ Data sharing: Hansbreen Snowpit Dataset: a long-term snow monitoring
○ Code sharing: NASA Astrobee Robot Software
○ Package with multiple components: National Survey on Research Integrity
● Try to formulate why the practice(s) help with reproducibility, and what applying them would require in terms of expertise, time, etc.
Exploring tools and practices: activity instructions
Uptake levels of practices
● Go to one of the "uptake levels" breakout groups. Group size is 4 maximum: fill group U1 first, the 5th person starts U2, etc. If you end up alone in a group, you may add yourself as a fifth person to another group.
● As a group, decide which of the practices below you want to see insights/data on, and click that link. You will likely have time for two at most. Talk aloud about what you see.
○ Clinical trials registration: ClinicalTrials.gov Trends and Charts on Registered Studies
○ (Pre)Registration: A new Open Science Indicator: measuring study registration
○ Data sharing: French Open Science monitor (scroll down to the data info)
○ Code sharing: Charité Dashboard on Responsible Research (scroll down to the code info)
● Try to summarize the level of uptake of the practice, and note whether the data are global and cover all disciplines or are limited to specific geographies and disciplines. Also try to interpret the levels found: do you think they are high or low, and what might explain them?
Discussion: monitoring, policies and support
● Specific practices
● Uptake levels
Reproducibility
How to build community - locally
We are a grassroots journal club
initiative that helps researchers create
local Open Science journal clubs at
their universities to discuss diverse
issues, papers and ideas about
improving science, reproducibility and
the Open Science movement.
Reproducibility
How to build community - nationally
“We promote training activities,
curate and disseminate best
practices, support meta-scientific
research, and coordinate efforts in
collaboration with local initiatives,
research institutes and other
stakeholder organizations.”
Home assignment
Before our next session, formulate one or
more potential actions at your organization
to facilitate ‘reproducibility and code
sharing’
These can be things that are being
considered already, or fully new ideas.
Try to use the SMART rubric - identifying
actions that are specific, measurable,
achievable, relevant and time-bound.
Formulate actions along the lines of:
“[Actor] will [action] for [audience]”
For example:
“Our graduate school will start a
ReproducibiliTEA journal club”
“The library will organize a workshop on
preregistration for ECRs”
“The library, together with the computer
science department, will host Software
Carpentry workshops”
Next week:
Open Research NISO training fall 2024
Session 4: Open data
October 17, 2024
Facilitated by Bianca Kramer & Jeroen Bosman