SlideShare a Scribd company logo
1 of 35
What is Reproducibility?
The R* Brouhaha
Professor Carole Goble
The University of Manchester, UK
Software Sustainability Institute UK
carole.goble@manchester.ac.uk
AlanTuring Institute Symposium Reproducibility, Sustainability and Preservation , 6-7 April 2016, Oxford, UK
“When I use a word," Humpty Dumpty
said in rather a scornful tone, "it means
just what I choose it to mean - neither
more nor less.”
Carroll,Through the Looking Glass
re-compute
replicate
rerun
repeat
re-examine
repurpose
recreate
reuse
restore
reconstruct review
regenerate
revise
recycle
redo
robustness
tolerance
verificationcompliancevalidation assurance
remix
Reproducibility of
Reproducibility Research
Even in Computer Science
http://www.dagstuhl.de/16041
24. – 29. January 2016, Dagstuhl Seminar 16041
Reproducibility of Data-Oriented Experiments in e-Science
Computational
Workflow
10 Simple Rules for Reproducible
Computational Research
1. For Every Result, Keep Track of How It Was
Produced
2. Avoid Manual Data Manipulation Steps
3. Archive the Exact Versions of All External
Programs Used
4. Version Control All Custom Scripts
5. Record All Intermediate Results, When Possible in
Standardized Formats
6. For Analyses That Include Randomness, Note
Underlying Random Seeds
7. Always Store Raw Data behind Plots
8. Generate Hierarchical Analysis Output, Allowing
Layers of Increasing Detail to Be Inspected
9. Connect Textual Statements to Underlying
Results
10. Provide Public Access to Scripts, Runs, and
Results
Sandve GK, Nekrutenko A,Taylor J, Hovig E (2013)Ten Simple Rules for Reproducible
Computational Research. PLoS Comput Biol 9(10): e1003285. doi:10.1371/journal.pcbi.1003285
Record
Everything
Automate
Everything
Scientific publications goals:
(i) announce a result and
(ii) convince readers that the result
is correct.
Papers in experimental science
should describe the results and
provide a clear enough protocol to
allow successful repetition and
extension.
Papers in computational science
should describe the results and
provide the complete software
development environment, data
and set of instructions which
generated the figures.
Virtual Witnessing*
*Leviathan and the Air-Pump: Hobbes, Boyle, and the
Experimental Life (1985) Shapin and Schaffer.
Jill Mesirov
David Donoho
Datasets, Data collections
Standard operating procedures
Software, algorithms
Configurations,
Tools and apps, services
Codes, code libraries
Workflows, scripts
System software
Infrastructure
Compilers, hardware
Analogy: The Lab
data science | data-driven science
1. Philosophy
2. Practice
Witnessing “Datascopes”
Input Data
Software
Output Data
Config
Parameters
Methods
techniques, algorithms,
spec. of the steps, models
Materials
datasets, parameters,
algorithm seeds
Instruments
codes, services, scripts,
underlying libraries,
workflows, , ref resources
Laboratory
sw and hw infrastructure,
systems software,
integrative platforms
computational environment
Model Driven Science – can I rerun my model?
Model Sweeps,What are the sensitivities?
“Micro” Reproducibility
“Macro” Reproducibility
Repeat, Replicate, Robust
Why the differences?
Reproduce
[CTitus Brown]
https://2016-oslo-repeatability.readthedocs.org/en/latest/repeatability-discussion.html
Computational
analyses
Productivity
Track differences
Computational
analyses?
Repeatability:
Sameness
Same result
1 Lab
1 experiment
Reproducibility:
Similarity
Similar result
> 1 Lab
> 1 experiment
“an experiment is reproducible until
another laboratory tries to repeat it.”
Alexander Kohn
reviewers want additional
work
statistician wants more
runs
analysis needs to be
repeated
post-doc leaves, student
arrives
new data, revised data
updated versions of
algorithms/codes
sample was
contaminated
Measuring Information Gain from Reproducibility
Research goal
Method/Alg.
Platform/Exec Env
Data Parameters
Input data
Actors
Information Gain
Implementation/Code
No change
Change
Don’t care
https://linkingresearch.wordpress.com/2016/02/21/dagstuhl-seminar-report-reproducibility-of-data-oriented-experiments-in-e-scienc/
http://www.dagstuhl.de/16041
Taxonomy of actions towards
improving reproducibility in
Computer Science.
https://linkingresearch.wordpress.com/2016/02/21/dagstuhl-seminar-report-reproducibility-of-data-oriented-experiments-in-e-scienc/
http://www.dagstuhl.de/16041
Practice
Methods
techniques, algorithms,
spec. of the steps, models
Materials
datasets, parameters,
algorithm seeds
Instruments
codes, services, scripts,
underlying libraries,
workflows, ref datasets
Laboratory
sw and hw infrastructure,
systems software,
integrative platforms
computational environment
Input Data
Software
Output Data
Config
Parameters
“Datascope” Entropy -> Preservation
“Replication / Reproducibility Window”
Change:
Science, methods, datasets
Questions don’t change, answers do.
Materials unavailable
one offs, streams,
stochastics, sensitivities,
licensing
scale, non-portable data
Change:
Instruments break, labs decay
Active ref datasets and services
Platforms & resources unavailable
supercomputer scale
non-portable software
“Datascope” Entropy -> Preservation
“Replication / Reproducibility Window”
Archived vs Active Environment
Isolated vs Open Distributed Ecosystem
T1 T2
Evolving Reference
Knowledge Bases
e.g. UNIPROT
Repeat harder than Reproduce?
Repeating the experiment or the set up?
• When the environment is active
“Datascope” Entropy -> Preservation
“Replication / Reproducibility Window”
Form?
Function?
Methods
techniques, algorithms,
spec. of the steps, models
Materials
datasets, parameters,
algorithm seeds
Instruments
codes, services, scripts,
underlying libraries,
workflows, ref datasets
Laboratory
sw and hw infrastructure,
systems software,
integrative platforms
computational environment
How? Preserve by Reporting, Reproduce by Reading
ProvenanceTraces, Notebooks, Rich Metadata
Archived Record
Methods
Materials
Instruments
Laboratory
How?
Preserve by Reporting, Reproduce by Reading
ProvenanceTraces, Notebooks, Rich Metadata
Archived Record
standards, common metadata
Provenance
Workflows,
Scripts
ELNs
Methods
Materials
Instruments
Laboratory
How? Preserve by Maintaining, Repairing,VMs
Reproduce by Running, Emulating, Reconstructing
Active Instrument
Methods
Materials
Instruments
Laboratory
How? Preserve by Maintaining, Repairing,VMs
Reproduce by Running, Emulating, Reconstructing
Active Instrument, Byte level
Methods
Materials
Instruments
Laboratory
Levels of Computational Reproducibility
Coverage: how
much of an
experiment is
reproducible
OriginalExperimentSimilarExperimentDifferentExperiment
Portability
Depth: how much of an experiment is available
Binaries +
Data
Source Code /
Workflow
+ Data
Binaries +
Data +
Dependencies
Source Code /
Workflow
+ Data +
Dependencies
Virtual Machine
Binaries +
Data +
Dependencies
Virtual Machine
Source Code /
Workflow
+ Data +
Dependencies
Figures +
Data
[Freire, 2014]
Minimum:
data and source
code available
under terms
that permit
inspection and
execution.
Repeatable Environments
*https://2016-oslo-repeatability.readthedocs.org/en/latest/overview-and-agenda.html
[C.Titus Brown*]
Metadata Objects : Reproducible Reporting, Exchange
Checklist
ProvenanceTracking
Versioning
Dependencies
container
provenance
dependencies
steps, features
transparency
portability
robustness
preservation
access description
available
standards
common APIs
licensing
identifiers
standards,
common metadata
change
variation sensitivity
versioning
packaging
So, What is Reproducibility? Being FAIR
So, What is Reproducibility? Being FAIR
What is Reproducibility?
Why, When, Where, Who for, Who by, How
Special thanks to
• C Titus Brown
• Juliana Freire
• David De Roure
• Stian Soiland-Reyes
• Barend Mons
• Tim Clark
• Daniel Garijo
• Wf4Ever and Research Object teams
• Dagstuhl Seminar 16041
• Force11 http://www.force11.org

More Related Content

Viewers also liked

Getting The Best Performance With PySpark
Getting The Best Performance With PySparkGetting The Best Performance With PySpark
Getting The Best Performance With PySparkSpark Summit
 
University of Edinburgh RDM Training: MANTRA & beyond
University of Edinburgh RDM Training: MANTRA & beyondUniversity of Edinburgh RDM Training: MANTRA & beyond
University of Edinburgh RDM Training: MANTRA & beyondRobin Rice
 
What does Open Science, Open Scholarship look like?
What does Open Science, Open Scholarship look like?What does Open Science, Open Scholarship look like?
What does Open Science, Open Scholarship look like?Robin Rice
 
Designing and delivering an international MOOC on Research Data Management an...
Designing and delivering an international MOOC on Research Data Management an...Designing and delivering an international MOOC on Research Data Management an...
Designing and delivering an international MOOC on Research Data Management an...Robin Rice
 
Overcoming obstacles to sharing data about human subjects
Overcoming obstacles to sharing data about human subjectsOvercoming obstacles to sharing data about human subjects
Overcoming obstacles to sharing data about human subjectsRobin Rice
 
Data Library Services at the University of Edinburgh
Data Library Services at the University of EdinburghData Library Services at the University of Edinburgh
Data Library Services at the University of EdinburghRobin Rice
 
Open data and research data management at the University of Edinburgh: polici...
Open data and research data management at the University of Edinburgh: polici...Open data and research data management at the University of Edinburgh: polici...
Open data and research data management at the University of Edinburgh: polici...Robin Rice
 
Managing active research in the University of Edinburgh
Managing active research in the University of EdinburghManaging active research in the University of Edinburgh
Managing active research in the University of EdinburghRobin Rice
 
Guiding users through data deposit
Guiding users through data depositGuiding users through data deposit
Guiding users through data depositRobin Rice
 
‘Good, better, best’? Examining the range and rationales of institutional dat...
‘Good, better, best’? Examining the range and rationales of institutional dat...‘Good, better, best’? Examining the range and rationales of institutional dat...
‘Good, better, best’? Examining the range and rationales of institutional dat...Robin Rice
 
Desenvolvimento integral da PRIMEIRA INFÂNCIA 15 argumentos DIT
Desenvolvimento integral da PRIMEIRA INFÂNCIA 15 argumentos DITDesenvolvimento integral da PRIMEIRA INFÂNCIA 15 argumentos DIT
Desenvolvimento integral da PRIMEIRA INFÂNCIA 15 argumentos DITProf. Marcus Renato de Carvalho
 
Social Media beyond Marketing
Social Media beyond MarketingSocial Media beyond Marketing
Social Media beyond MarketingNabeel Adeni
 
Keamanan informasi
Keamanan informasiKeamanan informasi
Keamanan informasiAl Amin
 
Mapping Slacktivism: Patterns of low-threshold civic participation on the int...
Mapping Slacktivism: Patterns of low-threshold civic participation on the int...Mapping Slacktivism: Patterns of low-threshold civic participation on the int...
Mapping Slacktivism: Patterns of low-threshold civic participation on the int...mysociety
 

Viewers also liked (15)

Getting The Best Performance With PySpark
Getting The Best Performance With PySparkGetting The Best Performance With PySpark
Getting The Best Performance With PySpark
 
University of Edinburgh RDM Training: MANTRA & beyond
University of Edinburgh RDM Training: MANTRA & beyondUniversity of Edinburgh RDM Training: MANTRA & beyond
University of Edinburgh RDM Training: MANTRA & beyond
 
What does Open Science, Open Scholarship look like?
What does Open Science, Open Scholarship look like?What does Open Science, Open Scholarship look like?
What does Open Science, Open Scholarship look like?
 
Designing and delivering an international MOOC on Research Data Management an...
Designing and delivering an international MOOC on Research Data Management an...Designing and delivering an international MOOC on Research Data Management an...
Designing and delivering an international MOOC on Research Data Management an...
 
Overcoming obstacles to sharing data about human subjects
Overcoming obstacles to sharing data about human subjectsOvercoming obstacles to sharing data about human subjects
Overcoming obstacles to sharing data about human subjects
 
Data Library Services at the University of Edinburgh
Data Library Services at the University of EdinburghData Library Services at the University of Edinburgh
Data Library Services at the University of Edinburgh
 
Open data and research data management at the University of Edinburgh: polici...
Open data and research data management at the University of Edinburgh: polici...Open data and research data management at the University of Edinburgh: polici...
Open data and research data management at the University of Edinburgh: polici...
 
Managing active research in the University of Edinburgh
Managing active research in the University of EdinburghManaging active research in the University of Edinburgh
Managing active research in the University of Edinburgh
 
Guiding users through data deposit
Guiding users through data depositGuiding users through data deposit
Guiding users through data deposit
 
‘Good, better, best’? Examining the range and rationales of institutional dat...
‘Good, better, best’? Examining the range and rationales of institutional dat...‘Good, better, best’? Examining the range and rationales of institutional dat...
‘Good, better, best’? Examining the range and rationales of institutional dat...
 
Desenvolvimento integral da PRIMEIRA INFÂNCIA 15 argumentos DIT
Desenvolvimento integral da PRIMEIRA INFÂNCIA 15 argumentos DITDesenvolvimento integral da PRIMEIRA INFÂNCIA 15 argumentos DIT
Desenvolvimento integral da PRIMEIRA INFÂNCIA 15 argumentos DIT
 
Social Media beyond Marketing
Social Media beyond MarketingSocial Media beyond Marketing
Social Media beyond Marketing
 
Paradigm ias academy
Paradigm ias academyParadigm ias academy
Paradigm ias academy
 
Keamanan informasi
Keamanan informasiKeamanan informasi
Keamanan informasi
 
Mapping Slacktivism: Patterns of low-threshold civic participation on the int...
Mapping Slacktivism: Patterns of low-threshold civic participation on the int...Mapping Slacktivism: Patterns of low-threshold civic participation on the int...
Mapping Slacktivism: Patterns of low-threshold civic participation on the int...
 

More from Carole Goble

The ELIXIR FAIR Knowledge Ecosystem for practical know-how: RDMkit and FAIRCo...
The ELIXIR FAIR Knowledge Ecosystem for practical know-how: RDMkit and FAIRCo...The ELIXIR FAIR Knowledge Ecosystem for practical know-how: RDMkit and FAIRCo...
The ELIXIR FAIR Knowledge Ecosystem for practical know-how: RDMkit and FAIRCo...Carole Goble
 
Can’t Pay, Won’t Pay, Don’t Pay: Delivering open science, a Digital Research...
Can’t Pay, Won’t Pay, Don’t Pay: Delivering open science,  a Digital Research...Can’t Pay, Won’t Pay, Don’t Pay: Delivering open science,  a Digital Research...
Can’t Pay, Won’t Pay, Don’t Pay: Delivering open science, a Digital Research...Carole Goble
 
RO-Crate: packaging metadata love notes into FAIR Digital Objects
RO-Crate: packaging metadata love notes into FAIR Digital ObjectsRO-Crate: packaging metadata love notes into FAIR Digital Objects
RO-Crate: packaging metadata love notes into FAIR Digital ObjectsCarole Goble
 
Research Software Sustainability takes a Village
Research Software Sustainability takes a VillageResearch Software Sustainability takes a Village
Research Software Sustainability takes a VillageCarole Goble
 
Title: Love, Money, Fame, Nudge: Enabling Data-intensive BioScience through D...
Title: Love, Money, Fame, Nudge: Enabling Data-intensive BioScience through D...Title: Love, Money, Fame, Nudge: Enabling Data-intensive BioScience through D...
Title: Love, Money, Fame, Nudge: Enabling Data-intensive BioScience through D...Carole Goble
 
FAIR Computational Workflows
FAIR Computational WorkflowsFAIR Computational Workflows
FAIR Computational WorkflowsCarole Goble
 
Open Research: Manchester leading and learning
Open Research: Manchester leading and learningOpen Research: Manchester leading and learning
Open Research: Manchester leading and learningCarole Goble
 
RDMkit, a Research Data Management Toolkit. Built by the Community for the ...
RDMkit, a Research Data Management Toolkit.  Built by the Community for the ...RDMkit, a Research Data Management Toolkit.  Built by the Community for the ...
RDMkit, a Research Data Management Toolkit. Built by the Community for the ...Carole Goble
 
FAIR Computational Workflows
FAIR Computational WorkflowsFAIR Computational Workflows
FAIR Computational WorkflowsCarole Goble
 
FAIR Computational Workflows
FAIR Computational WorkflowsFAIR Computational Workflows
FAIR Computational WorkflowsCarole Goble
 
EOSC-Life Workflow Collaboratory
EOSC-Life Workflow CollaboratoryEOSC-Life Workflow Collaboratory
EOSC-Life Workflow CollaboratoryCarole Goble
 
FAIR Computational Workflows
FAIR Computational WorkflowsFAIR Computational Workflows
FAIR Computational WorkflowsCarole Goble
 
FAIR Data Bridging from researcher data management to ELIXIR archives in the...
FAIR Data Bridging from researcher data management to ELIXIR archives in the...FAIR Data Bridging from researcher data management to ELIXIR archives in the...
FAIR Data Bridging from researcher data management to ELIXIR archives in the...Carole Goble
 
FAIR Computational Workflows
FAIR Computational WorkflowsFAIR Computational Workflows
FAIR Computational Workflows Carole Goble
 
FAIR Workflows and Research Objects get a Workout
FAIR Workflows and Research Objects get a Workout FAIR Workflows and Research Objects get a Workout
FAIR Workflows and Research Objects get a Workout Carole Goble
 
FAIRy stories: the FAIR Data principles in theory and in practice
FAIRy stories: the FAIR Data principles in theory and in practiceFAIRy stories: the FAIR Data principles in theory and in practice
FAIRy stories: the FAIR Data principles in theory and in practiceCarole Goble
 
RO-Crate: A framework for packaging research products into FAIR Research Objects
RO-Crate: A framework for packaging research products into FAIR Research ObjectsRO-Crate: A framework for packaging research products into FAIR Research Objects
RO-Crate: A framework for packaging research products into FAIR Research ObjectsCarole Goble
 
The swings and roundabouts of a decade of fun and games with Research Objects
The swings and roundabouts of a decade of fun and games with Research Objects The swings and roundabouts of a decade of fun and games with Research Objects
The swings and roundabouts of a decade of fun and games with Research Objects Carole Goble
 
How are we Faring with FAIR? (and what FAIR is not)
How are we Faring with FAIR? (and what FAIR is not)How are we Faring with FAIR? (and what FAIR is not)
How are we Faring with FAIR? (and what FAIR is not)Carole Goble
 
What is Reproducibility? The R* brouhaha and how Research Objects can help
What is Reproducibility? The R* brouhaha and how Research Objects can helpWhat is Reproducibility? The R* brouhaha and how Research Objects can help
What is Reproducibility? The R* brouhaha and how Research Objects can helpCarole Goble
 

More from Carole Goble (20)

The ELIXIR FAIR Knowledge Ecosystem for practical know-how: RDMkit and FAIRCo...
The ELIXIR FAIR Knowledge Ecosystem for practical know-how: RDMkit and FAIRCo...The ELIXIR FAIR Knowledge Ecosystem for practical know-how: RDMkit and FAIRCo...
The ELIXIR FAIR Knowledge Ecosystem for practical know-how: RDMkit and FAIRCo...
 
Can’t Pay, Won’t Pay, Don’t Pay: Delivering open science, a Digital Research...
Can’t Pay, Won’t Pay, Don’t Pay: Delivering open science,  a Digital Research...Can’t Pay, Won’t Pay, Don’t Pay: Delivering open science,  a Digital Research...
Can’t Pay, Won’t Pay, Don’t Pay: Delivering open science, a Digital Research...
 
RO-Crate: packaging metadata love notes into FAIR Digital Objects
RO-Crate: packaging metadata love notes into FAIR Digital ObjectsRO-Crate: packaging metadata love notes into FAIR Digital Objects
RO-Crate: packaging metadata love notes into FAIR Digital Objects
 
Research Software Sustainability takes a Village
Research Software Sustainability takes a VillageResearch Software Sustainability takes a Village
Research Software Sustainability takes a Village
 
Title: Love, Money, Fame, Nudge: Enabling Data-intensive BioScience through D...
Title: Love, Money, Fame, Nudge: Enabling Data-intensive BioScience through D...Title: Love, Money, Fame, Nudge: Enabling Data-intensive BioScience through D...
Title: Love, Money, Fame, Nudge: Enabling Data-intensive BioScience through D...
 
FAIR Computational Workflows
FAIR Computational WorkflowsFAIR Computational Workflows
FAIR Computational Workflows
 
Open Research: Manchester leading and learning
Open Research: Manchester leading and learningOpen Research: Manchester leading and learning
Open Research: Manchester leading and learning
 
RDMkit, a Research Data Management Toolkit. Built by the Community for the ...
RDMkit, a Research Data Management Toolkit.  Built by the Community for the ...RDMkit, a Research Data Management Toolkit.  Built by the Community for the ...
RDMkit, a Research Data Management Toolkit. Built by the Community for the ...
 
FAIR Computational Workflows
FAIR Computational WorkflowsFAIR Computational Workflows
FAIR Computational Workflows
 
FAIR Computational Workflows
FAIR Computational WorkflowsFAIR Computational Workflows
FAIR Computational Workflows
 
EOSC-Life Workflow Collaboratory
EOSC-Life Workflow CollaboratoryEOSC-Life Workflow Collaboratory
EOSC-Life Workflow Collaboratory
 
FAIR Computational Workflows
FAIR Computational WorkflowsFAIR Computational Workflows
FAIR Computational Workflows
 
FAIR Data Bridging from researcher data management to ELIXIR archives in the...
FAIR Data Bridging from researcher data management to ELIXIR archives in the...FAIR Data Bridging from researcher data management to ELIXIR archives in the...
FAIR Data Bridging from researcher data management to ELIXIR archives in the...
 
FAIR Computational Workflows
FAIR Computational WorkflowsFAIR Computational Workflows
FAIR Computational Workflows
 
FAIR Workflows and Research Objects get a Workout
FAIR Workflows and Research Objects get a Workout FAIR Workflows and Research Objects get a Workout
FAIR Workflows and Research Objects get a Workout
 
FAIRy stories: the FAIR Data principles in theory and in practice
FAIRy stories: the FAIR Data principles in theory and in practiceFAIRy stories: the FAIR Data principles in theory and in practice
FAIRy stories: the FAIR Data principles in theory and in practice
 
RO-Crate: A framework for packaging research products into FAIR Research Objects
RO-Crate: A framework for packaging research products into FAIR Research ObjectsRO-Crate: A framework for packaging research products into FAIR Research Objects
RO-Crate: A framework for packaging research products into FAIR Research Objects
 
The swings and roundabouts of a decade of fun and games with Research Objects
The swings and roundabouts of a decade of fun and games with Research Objects The swings and roundabouts of a decade of fun and games with Research Objects
The swings and roundabouts of a decade of fun and games with Research Objects
 
How are we Faring with FAIR? (and what FAIR is not)
How are we Faring with FAIR? (and what FAIR is not)How are we Faring with FAIR? (and what FAIR is not)
How are we Faring with FAIR? (and what FAIR is not)
 
What is Reproducibility? The R* brouhaha and how Research Objects can help
What is Reproducibility? The R* brouhaha and how Research Objects can helpWhat is Reproducibility? The R* brouhaha and how Research Objects can help
What is Reproducibility? The R* brouhaha and how Research Objects can help
 

Recently uploaded

Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )aarthirajkumar25
 
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisRaman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisDiwakar Mishra
 
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptxUnlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptxanandsmhk
 
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxSOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxkessiyaTpeter
 
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...jana861314
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Sérgio Sacani
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...RohitNehra6
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...Sérgio Sacani
 
Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfmuntazimhurra
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRDelhi Call girls
 
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡anilsa9823
 
GFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptxGFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptxAleenaTreesaSaji
 
Analytical Profile of Coleus Forskohlii | Forskolin .pptx
Analytical Profile of Coleus Forskohlii | Forskolin .pptxAnalytical Profile of Coleus Forskohlii | Forskolin .pptx
Analytical Profile of Coleus Forskohlii | Forskolin .pptxSwapnil Therkar
 
G9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.pptG9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.pptMAESTRELLAMesa2
 
Analytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdfAnalytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdfSwapnil Therkar
 
Types of different blotting techniques.pptx
Types of different blotting techniques.pptxTypes of different blotting techniques.pptx
Types of different blotting techniques.pptxkhadijarafiq2012
 
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.aasikanpl
 
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |aasikanpl
 
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.aasikanpl
 
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsSérgio Sacani
 

Recently uploaded (20)

Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )
 
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisRaman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
 
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptxUnlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
 
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxSOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
 
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
 
Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdf
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
 
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡
 
GFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptxGFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptx
 
Analytical Profile of Coleus Forskohlii | Forskolin .pptx
Analytical Profile of Coleus Forskohlii | Forskolin .pptxAnalytical Profile of Coleus Forskohlii | Forskolin .pptx
Analytical Profile of Coleus Forskohlii | Forskolin .pptx
 
G9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.pptG9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.ppt
 
Analytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdfAnalytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdf
 
Types of different blotting techniques.pptx
Types of different blotting techniques.pptxTypes of different blotting techniques.pptx
Types of different blotting techniques.pptx
 
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
 
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
 
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
 
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
 

What is reproducibility goble-clean

  • 1. What is Reproducibility? The R* Brouhaha Professor Carole Goble The University of Manchester, UK Software Sustainability Institute UK carole.goble@manchester.ac.uk AlanTuring Institute Symposium Reproducibility, Sustainability and Preservation , 6-7 April 2016, Oxford, UK
  • 2. “When I use a word," Humpty Dumpty said in rather a scornful tone, "it means just what I choose it to mean - neither more nor less.” Carroll,Through the Looking Glass re-compute replicate rerun repeat re-examine repurpose recreate reuse restore reconstruct review regenerate revise recycle redo robustness tolerance verificationcompliancevalidation assurance remix
  • 5. http://www.dagstuhl.de/16041 24. – 29. January 2016, Dagstuhl Seminar 16041 Reproducibility of Data-Oriented Experiments in e-Science
  • 7. 10 Simple Rules for Reproducible Computational Research 1. For Every Result, Keep Track of How It Was Produced 2. Avoid Manual Data Manipulation Steps 3. Archive the Exact Versions of All External Programs Used 4. Version Control All Custom Scripts 5. Record All Intermediate Results, When Possible in Standardized Formats 6. For Analyses That Include Randomness, Note Underlying Random Seeds 7. Always Store Raw Data behind Plots 8. Generate Hierarchical Analysis Output, Allowing Layers of Increasing Detail to Be Inspected 9. Connect Textual Statements to Underlying Results 10. Provide Public Access to Scripts, Runs, and Results Sandve GK, Nekrutenko A,Taylor J, Hovig E (2013)Ten Simple Rules for Reproducible Computational Research. PLoS Comput Biol 9(10): e1003285. doi:10.1371/journal.pcbi.1003285 Record Everything Automate Everything
  • 8. Scientific publications goals: (i) announce a result and (ii) convince readers that the result is correct. Papers in experimental science should describe the results and provide a clear enough protocol to allow successful repetition and extension. Papers in computational science should describe the results and provide the complete software development environment, data and set of instructions which generated the figures. Virtual Witnessing* *Leviathan and the Air-Pump: Hobbes, Boyle, and the Experimental Life (1985) Shapin and Schaffer. Jill Mesirov David Donoho
  • 9. Datasets, Data collections Standard operating procedures Software, algorithms Configurations, Tools and apps, services Codes, code libraries Workflows, scripts System software Infrastructure Compilers, hardware
  • 10. Analogy: The Lab data science | data-driven science 1. Philosophy 2. Practice
  • 11. Witnessing “Datascopes” Input Data Software Output Data Config Parameters Methods techniques, algorithms, spec. of the steps, models Materials datasets, parameters, algorithm seeds Instruments codes, services, scripts, underlying libraries, workflows, , ref resources Laboratory sw and hw infrastructure, systems software, integrative platforms computational environment
  • 12. Model Driven Science – can I rerun my model? Model Sweeps,What are the sensitivities?
  • 14. Repeat, Replicate, Robust Why the differences? Reproduce [CTitus Brown] https://2016-oslo-repeatability.readthedocs.org/en/latest/repeatability-discussion.html
  • 16. Computational analyses? Repeatability: Sameness Same result 1 Lab 1 experiment Reproducibility: Similarity Similar result > 1 Lab > 1 experiment
  • 17. “an experiment is reproducible until another laboratory tries to repeat it.” Alexander Kohn
  • 18. reviewers want additional work statistician wants more runs analysis needs to be repeated post-doc leaves, student arrives new data, revised data updated versions of algorithms/codes sample was contaminated
  • 19. Measuring Information Gain from Reproducibility Research goal Method/Alg. Platform/Exec Env Data Parameters Input data Actors Information Gain Implementation/Code No change Change Don’t care https://linkingresearch.wordpress.com/2016/02/21/dagstuhl-seminar-report-reproducibility-of-data-oriented-experiments-in-e-scienc/ http://www.dagstuhl.de/16041
  • 20. Taxonomy of actions towards improving reproducibility in Computer Science. https://linkingresearch.wordpress.com/2016/02/21/dagstuhl-seminar-report-reproducibility-of-data-oriented-experiments-in-e-scienc/ http://www.dagstuhl.de/16041
  • 22. Methods techniques, algorithms, spec. of the steps, models Materials datasets, parameters, algorithm seeds Instruments codes, services, scripts, underlying libraries, workflows, ref datasets Laboratory sw and hw infrastructure, systems software, integrative platforms computational environment Input Data Software Output Data Config Parameters “Datascope” Entropy -> Preservation “Replication / Reproducibility Window”
  • 23. Change: Science, methods, datasets Questions don’t change, answers do. Materials unavailable one offs, streams, stochastics, sensitivities, licensing scale, non-portable data Change: Instruments break, labs decay Active ref datasets and services Platforms & resources unavailable supercomputer scale non-portable software “Datascope” Entropy -> Preservation “Replication / Reproducibility Window” Archived vs Active Environment Isolated vs Open Distributed Ecosystem
  • 25. Repeat harder than Reproduce? Repeating the experiment or the set up? • When the environment is active
  • 26. “Datascope” Entropy -> Preservation “Replication / Reproducibility Window” Form? Function? Methods techniques, algorithms, spec. of the steps, models Materials datasets, parameters, algorithm seeds Instruments codes, services, scripts, underlying libraries, workflows, ref datasets Laboratory sw and hw infrastructure, systems software, integrative platforms computational environment
  • 27. How? Preserve by Reporting, Reproduce by Reading ProvenanceTraces, Notebooks, Rich Metadata Archived Record Methods Materials Instruments Laboratory
  • 28. How? Preserve by Reporting, Reproduce by Reading ProvenanceTraces, Notebooks, Rich Metadata Archived Record standards, common metadata Provenance Workflows, Scripts ELNs Methods Materials Instruments Laboratory
  • 29. How? Preserve by Maintaining, Repairing,VMs Reproduce by Running, Emulating, Reconstructing Active Instrument Methods Materials Instruments Laboratory
  • 30. How? Preserve by Maintaining, Repairing,VMs Reproduce by Running, Emulating, Reconstructing Active Instrument, Byte level Methods Materials Instruments Laboratory
  • 31. Levels of Computational Reproducibility Coverage: how much of an experiment is reproducible OriginalExperimentSimilarExperimentDifferentExperiment Portability Depth: how much of an experiment is available Binaries + Data Source Code / Workflow + Data Binaries + Data + Dependencies Source Code / Workflow + Data + Dependencies Virtual Machine Binaries + Data + Dependencies Virtual Machine Source Code / Workflow + Data + Dependencies Figures + Data [Freire, 2014] Minimum: data and source code available under terms that permit inspection and execution.
  • 32. Repeatable Environments *https://2016-oslo-repeatability.readthedocs.org/en/latest/overview-and-agenda.html [C.Titus Brown*] Metadata Objects : Reproducible Reporting, Exchange Checklist ProvenanceTracking Versioning Dependencies container
  • 33. provenance dependencies steps, features transparency portability robustness preservation access description available standards common APIs licensing identifiers standards, common metadata change variation sensitivity versioning packaging So, What is Reproducibility? Being FAIR
  • 34. So, What is Reproducibility? Being FAIR
  • 35. What is Reproducibility? Why, When, Where, Who for, Who by, How Special thanks to • C Titus Brown • Juliana Freire • David De Roure • Stian Soiland-Reyes • Barend Mons • Tim Clark • Daniel Garijo • Wf4Ever and Research Object teams • Dagstuhl Seminar 16041 • Force11 http://www.force11.org