Being Reproducible: SSBSS Summer School 2017

Being Reproducible:
Models, Research
Objects and R* Brouhaha
Professor Carole Goble, carole.goble@manchester.ac.uk
The University of Manchester, UK
The FAIRDOM Association Coordinator
ELIXIR-UK Head of Node
Co-lead ELIXIR Interoperability Platform
SSBSS 2017, July 17 2017, Cambridge, UK
4th International Synthetic & Systems Biology Summer School
Reproducibility
Rampancy
47/53 “landmark” publications
could not be replicated
[Begley, Ellis, Nature 483, 2012]
Retraction
http://www.nature.com/news/misconduct-is-the-main-cause-of-life-sciences-retractions-1.11507
Misconduct is the main cause
of life-sciences retractions
Zoë Corbyn
01 October 2012
Vahan Simonyan,
Center for Biologics
Evaluation and Research
Food and Drug Administration
USA
NIH Rigor and
Reproducibility
https://www.nih.gov/research-training/rigor-reproducibility
cos.io/top
http://www.acmedsci.ac.uk/policy/policy-projects/reproducibility-and-reliability-of-biomedical-research/
John P. A. Ioannidis, How to Make More Published Research True, October 21, 2014, DOI: 10.1371/journal.pmed.1001747
Reproducibility of biological experiments
is hard
for in vivo/vitro and
for in silico analysis
• OS version
• Revision of scripts
• Data analysis software versions
• Version of data files
• Command line parameters written on
a napkin
• “Black magic” only a grad student
knows
Fix with latest technologies, best
practices and willingness
[Keiichiro Ono, Scripps Institute]
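The items in the list above can be captured automatically at run time instead of on a napkin. The following is a minimal sketch, assuming a Python analysis script; the function names `file_sha256` and `capture_run_record` are hypothetical, and the revision of the scripts themselves (for example a version control commit id) is not covered here.

```python
import hashlib
import platform
import sys


def file_sha256(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def capture_run_record(data_files, argv=None):
    """Collect OS version, interpreter version, command line and data checksums."""
    argv = list(sys.argv if argv is None else argv)
    return {
        "os": platform.platform(),
        "python_version": platform.python_version(),
        "command_line": argv,
        # A checksum pins down exactly which version of each data file was used.
        "data_files": {str(path): file_sha256(path) for path in data_files},
    }
```

Saving the returned dictionary (for example as JSON) next to the results keeps the record with the output it describes.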
The first
step is to
be
FAIR
See the whole of
the previous talk…
Record All
Automate All
Contain All
Access All
Findable (Citable)
Accessible (Trackable)
Interoperable (Intelligible)
Reusable (Reproducible)
design
cherry picking data, random seed
reporting, non-independent bias, poor
positive and negative controls, dodgy
normalisation, arbitrary cut-offs,
premature data triage, un-validated
materials, improper statistical analysis,
poor statistical power, stop when “get to
the right answer”, software
misconfigurations, misapplied black box
software
reporting
incomplete reporting of software configurations, parameters & resource
versions, missed steps, missing data, vague methods, missing software
Empirical Statistical Computational
V. Stodden, IMS Bulletin (2013)
Reproducibility and reliability of biomedical
research: improving research practice
https://www.sciencenews.org/article/12-reasons-research-goes-wrong
“When I use a word,” Humpty Dumpty
said in rather a scornful tone, “it means
just what I choose it to mean - neither
more nor less.”
Carroll, Through the Looking Glass
re-compute
replicate
rerun
repeat
re-examine
repurpose
recreate
reuse
restore
reconstruct
review
regenerate
revise
recycle
redo
robustness
tolerance
verification
compliance
validation
assurance
remix
Scientific publications goals:
(i) announce a result
(ii) convince readers it's correct.
Papers in experimental science
should describe the results and
provide a clear enough protocol to
allow successful repetition and
extension.
Papers in computational science
should describe the results and
provide the complete software
development environment, data
and set of instructions which
generated the figures.
Virtual Witnessing*
*Leviathan and the Air-Pump: Hobbes, Boyle, and the
Experimental Life (1985) Shapin and Schaffer.
Jill Mesirov
David Donoho
“Micro” Reproducibility
“Macro” Reproducibility
Fixivity
Validate
Verify
Trust
Repeatability:
“Sameness”
Same result
1 Lab
1 experiment
Reproducibility:
“Similarity”
Similar result
> 1 Lab
> 1 experiment
why the differences?
https://2016-oslo-repeatability.readthedocs.org/en/latest/repeatability-discussion.html
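The "sameness" versus "similarity" distinction can be expressed as two different checks on numerical results. This is a minimal sketch under assumptions: the function names and the 5% relative tolerance are hypothetical choices for illustration, and what counts as "similar" has to be decided per experiment.

```python
import math


def is_repeat(original, rerun):
    """Repeatability as 'sameness': every value matches exactly."""
    return len(original) == len(rerun) and all(
        a == b for a, b in zip(original, rerun)
    )


def is_reproduction(original, independent, rel_tol=0.05):
    """Reproducibility as 'similarity': values agree within a tolerance."""
    return len(original) == len(independent) and all(
        math.isclose(a, b, rel_tol=rel_tol)
        for a, b in zip(original, independent)
    )
```

When `is_reproduction` holds but `is_repeat` does not, the remaining question is the one on the slide: why the differences?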
Validate
Verify
Method Reproducibility
the provision of enough detail about
study procedures and data so, in
theory or in actuality, the same
procedures could be exactly
repeated.
Result Reproducibility
(aka replicability)
obtaining the same results from the
conduct of an independent study
whose procedures are as closely
matched to the original experiment
as possible
Goodman et al., Science Translational Medicine 8 (341), 2016
Validate
Verify
What are you reproducing?
Algorithm vs its script conflation
Methods
techniques, algorithms,
spec. of the steps, models
Materials
datasets, parameters,
algorithm seeds
Instruments
codes, services, scripts,
underlying libraries,
workflows, ref datasets
Laboratory
sw and hw infrastructure,
systems software,
integrative platforms
computational environment
Productivity
Track differences
Validate
Verify
Recompute By Degrees
Fixivity - Liveness
• New/updated/deprecated methods,
datasets, services, codes, h/w
• Snapshots
Dependency – Containment
• Streams, non-portable data/software,
• 3rd party services, supercomputer access,
licensing restrictions….
• Locally contained and maintained
• External dependencies
Transparency
• Blackboxes, proprietary software,
manual steps
Robustness
• Bounds of use
• Stochastics, non-deterministics,
contexts
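For stochastic methods, one way to report robustness is to fix and record the seed of each run, then state how much the result varies across seeds. The sketch below uses a toy "experiment" (the mean of Gaussian draws); the function names, the distribution and the sample size are assumptions made purely for illustration.

```python
import random
import statistics


def noisy_measurement(seed, n_samples=1000):
    """Toy stochastic 'experiment': mean of Gaussian draws around 10.0."""
    rng = random.Random(seed)  # private generator, so the seed fully fixes the run
    draws = [rng.gauss(10.0, 2.0) for _ in range(n_samples)]
    return statistics.fmean(draws)


def spread_over_seeds(seeds):
    """Run once per seed and summarise how much the result varies."""
    results = [noisy_measurement(seed) for seed in seeds]
    return {
        "seeds": list(seeds),
        "mean": statistics.fmean(results),
        "stdev": statistics.stdev(results),
    }
```

The same seed gives the same number on every rerun (a repeat), while the spread across seeds indicates the bounds within which an independent run should be expected to land.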
https://xkcd.com/797/
Components and Dependencies
Software is typically
a compound work.
Libraries. Plug-ins.
Code fragments.
We are encouraged to
reuse and not reinvent
Combining licenses.
License compatibilities
Black boxes
• closed codes
• closed external or cloud
services
• method obscurity
• manual steps
[Thanks to Jason Scott]
The Reproducibility Window
all experiments become less reproducible over
time….
• Can’t contain everything
– Pesky Internet in a Box
• Can’t automate everything
– Pesky people intervening
• Can’t fix and fossilise everything
– Pesky science keeps changing
Results may vary
Bonus slide
At SSBSS Theodor Gescher came up with REALSCI
Robust - many runs
Environment - describe the equipment/OS
Another - done by a lab other than yours
Limits - parameters
Standards - well understood/comprehensible methods
Complete - not cherry picking
Immortal - community supported commodity systems
Mixed Central and Distributed stores:
Containment and Dependencies. Upload vs Referencing
In House Stores
External Databases
Publishing services
Model Resources
Migrations into FAIRDOMHub
For long term reproducibility
Shades of Reproducibility
Running an active instrument
Reading an archived record
Are you using
hard-wired
localhost ids?
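A quick way to answer that question is to scan scripts and metadata for references that only resolve on one machine. The following is a rough sketch; the patterns are illustrative assumptions, not an exhaustive list, and the function name is hypothetical.

```python
import re

# Patterns that usually signal a machine-specific reference (illustrative only).
LOCAL_PATTERNS = [
    re.compile(r"\blocalhost\b", re.IGNORECASE),
    re.compile(r"\b127\.0\.0\.1\b"),
    re.compile(r"file://"),
    re.compile(r"(?:/home/|/Users/|[A-Za-z]:\\)"),
]


def find_local_references(text):
    """Return (line_number, line) pairs that contain a machine-local reference."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if any(pattern.search(line) for pattern in LOCAL_PATTERNS):
            hits.append((number, line.strip()))
    return hits
```

Each hit is a candidate for replacement with a persistent, resolvable identifier before the work is archived or shared.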
Workflows
SOPs
Containers, cloud services, common services
Markup languages,
reporting guidelines and
checklists, ontologies,
catalogues
Sounds hard….
what can I do?
Catalogue
Protocol specs and sharing…
A language for specifying
experimental protocols for
biological research in a way that is
precise, unambiguous, and
understandable by both humans
and computers.
Validation Data
https://fairdomhub.org/sops/203
https://fairdomhub.org/investigations/56
Standard Operating Procedures
Quality Control
in situ reproducible models in FAIRDOM
metadata annotation against standards
validation, comparison and simulation
SBML Model simulation
Model comparison
Model versioning
Reproducing simulations
[Jacky Snoep, Dagmar Waltemath, Martin Peters, Martin Scharm]
JWS Online
Tracking versions
Tracking model versions smartly
Scharm, M., Wolkenhauer, O., & Waltemath, D. (2015). An algorithm to detect and
communicate the differences in computational models describing biological
systems. Bioinformatics, btv484
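To give a feel for what tracking differences between model versions means, here is a deliberately simple sketch that compares two versions of a parameter table. It is not the algorithm from the paper cited above, which works on the model documents themselves; the function name and the parameter names in the usage note are hypothetical.

```python
def diff_parameters(old, new):
    """Compare two versions of a model's parameter table (name -> value)."""
    added = {name: new[name] for name in new.keys() - old.keys()}
    removed = {name: old[name] for name in old.keys() - new.keys()}
    changed = {
        name: (old[name], new[name])
        for name in old.keys() & new.keys()
        if old[name] != new[name]
    }
    return {"added": added, "removed": removed, "changed": changed}
```

For example, `diff_parameters({"Vmax": 1.0, "Km": 0.5}, {"Vmax": 1.2, "Ki": 3.0})` reports `Ki` as added, `Km` as removed and `Vmax` as changed from 1.0 to 1.2.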
Model simulation in FAIRDOMHub
using JWS Online
A simulation database allows a one-click, live
figure reproduction in a FAIRDOM-SEEK
JWS model Excel data file
Dagmar Waltemath, Uni Rostock
Jacky Snoep, Uni Stellenbosch
Simulation Experiment Description Markup
Language: XML-based format for encoding
simulation setups, to ensure exchangeability and
reproducibility of simulation experiments
• which models to use in an experiment,
• modifications to apply on the models before using them,
• which simulation procedures to run on each model,
• what analysis results to output,
• and how the results should be presented.
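The five items above can be pictured as one record that travels with the model. The sketch below is a plain-Python stand-in, assuming hypothetical class and field names; it is not SED-ML syntax and does not use any SED-ML library. It only shows how a description of this shape lets gaps be detected before an experiment is shared.

```python
from dataclasses import dataclass, field


@dataclass
class SimulationExperiment:
    """Illustrative record of a simulation setup (not the SED-ML schema)."""

    models: dict                                     # model id -> where to find the model
    changes: list = field(default_factory=list)      # modifications applied before running
    simulations: dict = field(default_factory=dict)  # simulation id -> procedure settings
    tasks: list = field(default_factory=list)        # (model id, simulation id) pairs to run
    outputs: list = field(default_factory=list)      # which results to report, and how

    def problems(self):
        """List what is missing or inconsistent in the description."""
        found = []
        if not self.models:
            found.append("no model specified")
        if not self.simulations:
            found.append("no simulation procedure specified")
        if not self.outputs:
            found.append("no output specified")
        for model_id, simulation_id in self.tasks:
            if model_id not in self.models:
                found.append(f"task refers to unknown model {model_id!r}")
            if simulation_id not in self.simulations:
                found.append(f"task refers to unknown simulation {simulation_id!r}")
        return found
```

An experiment whose `problems()` list is empty names its models, procedures and outputs explicitly, which is the precondition for someone else rerunning it.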
FAIRDOMHub Journal Programme
Molecular Systems Biology
Model Technical curation for Journals
[Jacky Snoep (Stellenbosch), Dagmar Waltemath, Martin Peters, Martin Scharm (Rostock)]
* store DOI citable supplementary files on FAIRDOMHub
** model and data curation
*** reproducible clickable figures in papers using SED-ML
Cataloguing
Packaging
Penkler, G., du Toit, F., Adams,
W., Rautenbach, M., Palm, D. C.,
van Niekerk, D. D. and Snoep, J.
L. (2015), Construction and
validation of a detailed kinetic
model of glycolysis in
Plasmodium falciparum. FEBS J,
282: 1481–1511.
doi:10.1111/febs.13237
https://fairdomhub.org/investigations/56
DOI: 10.15490/seek.1.investigation.56
Snapshot
preservation
active
An “evolving manuscript” would begin with a pre-
publication, pre-peer review “beta 0.9” version of an
article, followed by the approved published article itself, [
… ] “version 1.0”.
Subsequently, scientists would update this paper with
details of further work as the area of research develops.
Versions 2.0 and 3.0 might allow for the “accretion of
confirmation [and] reputation”.
Ottoline Leyser […] assessment criteria in science revolve
around the individual. “People have stopped thinking
about the scientific enterprise”.
http://www.timeshighereducation.co.uk/news/evolving-manuscripts-the-future-of-scientific-communication/2020200.article
Packaging: CombineArchive
https://sems.uni-rostock.de/projects/combinearchive/
Scharm M, Wendland F, Peters M, Wolfien M, Theile T, Waltemath D
SEMS, University of Rostock
zip-like file with a manifest & metadata
- Bundling files - Keeping provenance
- Exchanging data - Shipping results
Bergmann, F. T., Adams, R., Moodie, S., Cooper, J., Glont, M., Golebiewski, M., ... & Olivier, B. G. (2014). COMBINE archive and OMEX format:
one file to share all information to reproduce a modeling project. BMC Bioinformatics, 15(1), 1.
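The idea of a "zip-like file with a manifest" can be sketched with the Python standard library. Note the assumptions: the `manifest.json` layout below is a simplified stand-in invented for illustration, not the manifest format defined for COMBINE archives, and the function names are hypothetical.

```python
import hashlib
import json
import zipfile


def bundle(archive_path, files):
    """Write files plus a manifest into one zip.

    `files` maps the name inside the archive to the file's bytes.
    """
    manifest = {
        name: {"sha256": hashlib.sha256(content).hexdigest(), "size": len(content)}
        for name, content in files.items()
    }
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, content in files.items():
            archive.writestr(name, content)
        archive.writestr("manifest.json", json.dumps(manifest, indent=2, sort_keys=True))
    return manifest


def verify(archive_path):
    """Return the names whose content no longer matches the manifest."""
    with zipfile.ZipFile(archive_path) as archive:
        manifest = json.loads(archive.read("manifest.json"))
        return [
            name
            for name, entry in manifest.items()
            if hashlib.sha256(archive.read(name)).hexdigest() != entry["sha256"]
        ]
```

Bundling keeps model, data and results in one shippable file, and the checksums in the manifest let a recipient confirm that nothing changed in transit.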
Standards-based metadata framework for
bundling (scattered) resources with context and citation
Packaging:
Research Objects
http://researchobject.org
Packaging:
Research Objects
Publishing
Archive
Institutional
Archive
1.Export
2.Exchange
http://researchobject.org
Manifest
Construction
Container
Manifest
Description
Packaging Platforms:
Zip files, BagIt,
Docker, Conda, Singularity
Repositories
FAIRDOMHub
Packaging:
Research Objects in a nutshell
Different
manifest
description
profiles for
different kinds of
objects
From Virtual Machines to Executable Containers
for portable execution
• Containers: everything required to make a piece of
software run is packaged into an isolated container.
• Unlike VMs, containers do not bundle a full operating
system - only libraries and settings required to make
the software work.
• Efficient, lightweight, self-contained systems
• Guarantees that software will always run the same,
regardless of where it’s deployed.
https://www.software.ac.uk/c4rr/ https://biocontainers.pro/
Biocontainers
Use commodity and community systems
Sustained platforms
Communities to drive them
Tooling and training
Spreadsheets are the Cockroaches of Science
EU FAIR Data Expert Group Consultation
https://github.com/FAIR-Data-EG/consultation/issues
Want to know more?
Go on a Software or Data Carpentry Course
https://tess.elixir-europe.org
Make software open and reusable
Software Sustainability Institute,
http://www.software.ac.uk
Goble, Better Software Better Research
IEEE Internet Computing 18(5), (2014)
DOI: 10.1109/MIC.2014.88
Jiménez RC, Kuzak M, Alhamdoosh M et al.
Four simple recommendations to
encourage best practices in research
software [version 1; referees: 3 approved].
F1000Research 2017, 6:876 (doi:
10.12688/f1000research.11407.1)
Use Common
Platforms
Get the licensing
right…
MATLAB
Mathematica….
Proprietary
software
Cloud Centralised Service
in situ reproducibility….
Galaxy
FAIRDOMHub + JWS Online
Blackbox vs
Whitebox
https://view.commonwl.org/workflows/github.com/ProteinsWebTeam/ebi-metagenomics-cwl/tree/fa86fce/workflows/rna-selector.cwl
Use and document workflows
preferably a workflow management system, Living Research Objects!
http://commonwl.org/
Workflow repository
Use a workflow – the vision!
preferably a workflow management system
preferably described using Common Workflow Language
Experimental
workflows
Event BUS Business Process Management
Taverna Knime Galaxy
Workflow
BPM layer
Workflow
Computation
Application
layer
Computing resources Databases
Effector
layer
Front-end
Web interface / Monitoring interface
Pipeline
Pilot
FAIRDOM SEEK
Workflow repository
Workflow portal
repository
launch, results
FAIRDOM
[Jean-Loup Faulon, Carole Goble]
https://hive.biochemistry.gwu.edu/htscsrs/workshop_2017
Reproducible Pipelines for Robust Regulation
BioCompute Objects
Emphasis on fixing the
pipeline so it can be
replicated, and on
reporting the
parameter space
Use an Electronic Lab Notebook
What can you do?
• Follow the 10 RACA Principles
• Take action, be imperfect
• Demand reproducibility in reviews.
• Educate your PIs and supervisors.
[Norman Morrison]
Technological Debt: Appropriate Effort
Retrospective Reusability 
What are the incentives?
[Garza] [Malone] [Resnik]
Acknowledgements
• David De Roure
• Tim Clark
• Sean Bechhofer
• Robert Stevens
• Christine Borgman
• Victoria Stodden
• Marco Roos
• Jose Enrique Ruiz del Mazo
• Oscar Corcho
• Ian Cottam
• Steve Pettifer
• Magnus Rattray
• Chris Evelo
• Katy Wolstencroft
• Robin Williams
• Pinar Alper
• C. Titus Brown
• Greg Wilson
• Kristian Garza
• Juliana Freire
• Jill Mesirov
• Simon Cockell
• Paolo Missier
• Paul Watson
• Gerhard Klimeck
• Matthias Obst
• Jun Zhao
• Daniel Garijo
• Yolanda Gil
• James Taylor
• Alex Pico
• Sean Eddy
• Cameron Neylon
• Barend Mons
• Kristina Hettne
• Stian Soiland-Reyes
• Rebecca Lawrence
• Michael Crusoe
Jon Olav Vik,
Norwegian University of Life Sciences
Maksim Zakhartsev
University of Hohenheim, Stuttgart,
Germany
Alexey Kolodkin
Siberian Branch
Russian Academy of Sciences
Tomasz Zieliński,
SynthSys Centre
University of Edinburgh, UK
Martin Peters, Martin Scharm
Systems Biology Bioinformatics
University of Rostock, Germany
Web sites
• Force11 http://www.force11.org
• TeSS https://tess.elixir-europe.org
• FAIRDOM http://www.fair-dom.org
• FAIRDOMHub http://www.fairdomhub.org
• Software Carpentry http://software-carpentry.org
• Data Carpentry http://datacarpentry.org
• Software Sustainability Institute http://www.software.ac.uk
• Rightfield http://www.rightfield.org.uk
• FAIRSharing http://www.fairsharing.org
• Common Workflow Language http://commonwl.org/
Reading List (refs also throughout)
• John P. A. Ioannidis, How to Make More Published Research True, October 21, 2014, DOI:
10.1371/journal.pmed.1001747
• Ioannidis JPA (2005) Why Most Published Research Findings Are False. PLoS Med 2(8): e124.
doi:10.1371/journal.pmed.0020124
• Steven N. Goodman, Daniele Fanelli and John P. A. Ioannidis, What does research reproducibility mean? Science
Translational Medicine 01 Jun 2016: Vol. 8, Issue 341, pp. 341ps12, DOI: 10.1126/scitranslmed.aaf5027
• Sandve GK, Nekrutenko A, Taylor J, Hovig E (2013) Ten Simple Rules for Reproducible Computational Research.
PLoS Comput Biol 9(10): e1003285. doi:10.1371/journal.pcbi.1003285
• Massimiliano Assante, Leonardo Candela, Donatella Castelli, Paolo Manghi and Pasquale Pagano, Science 2.0
Repositories: Time for a Change in Scholarly Communication, D-Lib Magazine January/February 2015, Volume 21,
Number 1/2, DOI: 10.1045/january2015-assante
• Waltemath, D., Henkel, R., Hälke, R., Scharm, M., & Wolkenhauer, O. (2013). Improving the reuse of
computational models through version control. Bioinformatics, 29(6), 742-748.
• Bergmann, F. T., Adams, R., Moodie, S., Cooper, J., Glont, M., Golebiewski, M., ... & Olivier, B. G. (2014).
COMBINE archive and OMEX format: one file to share all information to reproduce a modeling project. BMC
Bioinformatics, 15(1), 1.
• Scharm, M., Wolkenhauer, O., & Waltemath, D. (2015). An algorithm to detect and communicate the differences
in computational models describing biological systems. Bioinformatics, btv484
• http://www.reuters.com/article/2012/03/28/us-science-cancer-idUSBRE82R12P20120328
• http://www.acmedsci.ac.uk/policy/policy-projects/reproducibility-and-reliability-of-biomedical-research/
1 of 60

Recommended

Being FAIR: FAIR data and model management SSBSS 2017 Summer School by
Being FAIR:  FAIR data and model management SSBSS 2017 Summer SchoolBeing FAIR:  FAIR data and model management SSBSS 2017 Summer School
Being FAIR: FAIR data and model management SSBSS 2017 Summer SchoolCarole Goble
978 views65 slides
ROHub by
ROHubROHub
ROHubRaul Palma
775 views13 slides
Aspects of Reproducibility in Earth Science by
Aspects of Reproducibility in Earth ScienceAspects of Reproducibility in Earth Science
Aspects of Reproducibility in Earth ScienceRaul Palma
560 views17 slides
The Rhetoric of Research Objects by
The Rhetoric of Research ObjectsThe Rhetoric of Research Objects
The Rhetoric of Research ObjectsCarole Goble
2.4K views53 slides
FAIRDOM - FAIR Asset management and sharing experiences in Systems and Synthe... by
FAIRDOM - FAIR Asset management and sharing experiences in Systems and Synthe...FAIRDOM - FAIR Asset management and sharing experiences in Systems and Synthe...
FAIRDOM - FAIR Asset management and sharing experiences in Systems and Synthe...Carole Goble
459 views59 slides
FAIRy Stories by
FAIRy StoriesFAIRy Stories
FAIRy StoriesCarole Goble
1.7K views59 slides

More Related Content

What's hot

FAIR Data, Operations and Model management for Systems Biology and Systems Me... by
FAIR Data, Operations and Model management for Systems Biology and Systems Me...FAIR Data, Operations and Model management for Systems Biology and Systems Me...
FAIR Data, Operations and Model management for Systems Biology and Systems Me...Carole Goble
1.5K views47 slides
Reproducibility, Research Objects and Reality, Leiden 2016 by
Reproducibility, Research Objects and Reality, Leiden 2016Reproducibility, Research Objects and Reality, Leiden 2016
Reproducibility, Research Objects and Reality, Leiden 2016Carole Goble
1.1K views74 slides
Advances in Scientific Workflow Environments by
Advances in Scientific Workflow EnvironmentsAdvances in Scientific Workflow Environments
Advances in Scientific Workflow EnvironmentsCarole Goble
1.1K views41 slides
The Research Object Initiative: Frameworks and Use Cases by
The Research Object Initiative:Frameworks and Use CasesThe Research Object Initiative:Frameworks and Use Cases
The Research Object Initiative: Frameworks and Use CasesCarole Goble
1.7K views61 slides
Research Shared: researchobject.org by
Research Shared: researchobject.orgResearch Shared: researchobject.org
Research Shared: researchobject.orgNorman Morrison
2K views25 slides
FAIRer Research by
FAIRer ResearchFAIRer Research
FAIRer ResearchCarole Goble
1K views40 slides

What's hot(20)

FAIR Data, Operations and Model management for Systems Biology and Systems Me... by Carole Goble
FAIR Data, Operations and Model management for Systems Biology and Systems Me...FAIR Data, Operations and Model management for Systems Biology and Systems Me...
FAIR Data, Operations and Model management for Systems Biology and Systems Me...
Carole Goble1.5K views
Reproducibility, Research Objects and Reality, Leiden 2016 by Carole Goble
Reproducibility, Research Objects and Reality, Leiden 2016Reproducibility, Research Objects and Reality, Leiden 2016
Reproducibility, Research Objects and Reality, Leiden 2016
Carole Goble1.1K views
Advances in Scientific Workflow Environments by Carole Goble
Advances in Scientific Workflow EnvironmentsAdvances in Scientific Workflow Environments
Advances in Scientific Workflow Environments
Carole Goble1.1K views
The Research Object Initiative: Frameworks and Use Cases by Carole Goble
The Research Object Initiative:Frameworks and Use CasesThe Research Object Initiative:Frameworks and Use Cases
The Research Object Initiative: Frameworks and Use Cases
Carole Goble1.7K views
Research Shared: researchobject.org by Norman Morrison
Research Shared: researchobject.orgResearch Shared: researchobject.org
Research Shared: researchobject.org
Norman Morrison2K views
What is Reproducibility? The R* brouhaha (and how Research Objects can help) by Carole Goble
What is Reproducibility? The R* brouhaha (and how Research Objects can help)What is Reproducibility? The R* brouhaha (and how Research Objects can help)
What is Reproducibility? The R* brouhaha (and how Research Objects can help)
Carole Goble1.5K views
Mtsr2015 goble-keynote by Carole Goble
Mtsr2015 goble-keynoteMtsr2015 goble-keynote
Mtsr2015 goble-keynote
Carole Goble1.5K views
FAIR Software (and Data) Citation: Europe, Research Object Systems, Networks ... by Carole Goble
FAIR Software (and Data) Citation: Europe, Research Object Systems, Networks ...FAIR Software (and Data) Citation: Europe, Research Object Systems, Networks ...
FAIR Software (and Data) Citation: Europe, Research Object Systems, Networks ...
Carole Goble866 views
Research Objects, SEEK and FAIRDOM by Carole Goble
Research Objects, SEEK and FAIRDOMResearch Objects, SEEK and FAIRDOM
Research Objects, SEEK and FAIRDOM
Carole Goble1.7K views
Introduction to FAIRDOM by Carole Goble
Introduction to FAIRDOMIntroduction to FAIRDOM
Introduction to FAIRDOM
Carole Goble1.3K views
Reproducibility Using Semantics: An Overview by dgarijo
Reproducibility Using Semantics: An OverviewReproducibility Using Semantics: An Overview
Reproducibility Using Semantics: An Overview
dgarijo890 views
Reproducible and citable data and models: an introduction. by FAIRDOM
Reproducible and citable data and models: an introduction.Reproducible and citable data and models: an introduction.
Reproducible and citable data and models: an introduction.
FAIRDOM4.2K views
FAIR Data and Model Management for Systems Biology (and SOPs too!) by Carole Goble
FAIR Data and Model Management for Systems Biology(and SOPs too!)FAIR Data and Model Management for Systems Biology(and SOPs too!)
FAIR Data and Model Management for Systems Biology (and SOPs too!)
Carole Goble1.1K views
Reproducibility of model-based results: standards, infrastructure, and recogn... by FAIRDOM
Reproducibility of model-based results: standards, infrastructure, and recogn...Reproducibility of model-based results: standards, infrastructure, and recogn...
Reproducibility of model-based results: standards, infrastructure, and recogn...
FAIRDOM4K views
Better Software, Better Research by Carole Goble
Better Software, Better ResearchBetter Software, Better Research
Better Software, Better Research
Carole Goble657 views
RARE and FAIR Science: Reproducibility and Research Objects by Carole Goble
RARE and FAIR Science: Reproducibility and Research ObjectsRARE and FAIR Science: Reproducibility and Research Objects
RARE and FAIR Science: Reproducibility and Research Objects
Carole Goble1.4K views
Results Vary: The Pragmatics of Reproducibility and Research Object Frameworks by Carole Goble
Results Vary: The Pragmatics of Reproducibility and Research Object FrameworksResults Vary: The Pragmatics of Reproducibility and Research Object Frameworks
Results Vary: The Pragmatics of Reproducibility and Research Object Frameworks
Carole Goble1.3K views
FAIR data and model management for systems biology. by FAIRDOM
FAIR data and model management for systems biology.FAIR data and model management for systems biology.
FAIR data and model management for systems biology.
FAIRDOM1.6K views
Improving the Management of Computational Models -- Invited talk at the EBI by Martin Scharm
Improving the Management of Computational Models -- Invited talk at the EBIImproving the Management of Computational Models -- Invited talk at the EBI
Improving the Management of Computational Models -- Invited talk at the EBI
Martin Scharm6K views

Similar to Being Reproducible: SSBSS Summer School 2017

Research Objects for FAIRer Science by
Research Objects for FAIRer Science Research Objects for FAIRer Science
Research Objects for FAIRer Science Carole Goble
2.2K views78 slides
Reproducibility (and the R*) of Science: motivations, challenges and trends by
Reproducibility (and the R*) of Science: motivations, challenges and trendsReproducibility (and the R*) of Science: motivations, challenges and trends
Reproducibility (and the R*) of Science: motivations, challenges and trendsCarole Goble
1.8K views37 slides
Results may vary: Collaborations Workshop, Oxford 2014 by
Results may vary: Collaborations Workshop, Oxford 2014Results may vary: Collaborations Workshop, Oxford 2014
Results may vary: Collaborations Workshop, Oxford 2014Carole Goble
1.8K views52 slides
ISMB/ECCB 2013 Keynote Goble Results may vary: what is reproducible? why do o... by
ISMB/ECCB 2013 Keynote Goble Results may vary: what is reproducible? why do o...ISMB/ECCB 2013 Keynote Goble Results may vary: what is reproducible? why do o...
ISMB/ECCB 2013 Keynote Goble Results may vary: what is reproducible? why do o...Carole Goble
17.2K views90 slides
Aussois bda-mdd-2018 by
Aussois bda-mdd-2018Aussois bda-mdd-2018
Aussois bda-mdd-2018Khalid Belhajjame
410 views101 slides
The beauty of workflows and models by
The beauty of workflows and modelsThe beauty of workflows and models
The beauty of workflows and modelsmyGrid team
1.1K views63 slides

Similar to Being Reproducible: SSBSS Summer School 2017(20)

Research Objects for FAIRer Science by Carole Goble
Research Objects for FAIRer Science Research Objects for FAIRer Science
Research Objects for FAIRer Science
Carole Goble2.2K views
Reproducibility (and the R*) of Science: motivations, challenges and trends by Carole Goble
Reproducibility (and the R*) of Science: motivations, challenges and trendsReproducibility (and the R*) of Science: motivations, challenges and trends
Reproducibility (and the R*) of Science: motivations, challenges and trends
Carole Goble1.8K views
Results may vary: Collaborations Workshop, Oxford 2014 by Carole Goble
Results may vary: Collaborations Workshop, Oxford 2014Results may vary: Collaborations Workshop, Oxford 2014
Results may vary: Collaborations Workshop, Oxford 2014
Carole Goble1.8K views
ISMB/ECCB 2013 Keynote Goble Results may vary: what is reproducible? why do o... by Carole Goble
ISMB/ECCB 2013 Keynote Goble Results may vary: what is reproducible? why do o...ISMB/ECCB 2013 Keynote Goble Results may vary: what is reproducible? why do o...
ISMB/ECCB 2013 Keynote Goble Results may vary: what is reproducible? why do o...
Carole Goble17.2K views
The beauty of workflows and models by myGrid team
The beauty of workflows and modelsThe beauty of workflows and models
The beauty of workflows and models
myGrid team1.1K views
Reproducibility by Other Means: Transparent Research Objects by Timothy McPhillips
Reproducibility by Other Means: Transparent Research ObjectsReproducibility by Other Means: Transparent Research Objects
Reproducibility by Other Means: Transparent Research Objects
Timothy McPhillips310 views
Scott Edmunds: GigaScience - a journal or a database? Lessons learned from th... by GigaScience, BGI Hong Kong
Scott Edmunds: GigaScience - a journal or a database? Lessons learned from th...Scott Edmunds: GigaScience - a journal or a database? Lessons learned from th...
Scott Edmunds: GigaScience - a journal or a database? Lessons learned from th...
Is that a scientific report or just some cool pictures from the lab? Reproduc... by Greg Landrum
Is that a scientific report or just some cool pictures from the lab? Reproduc...Is that a scientific report or just some cool pictures from the lab? Reproduc...
Is that a scientific report or just some cool pictures from the lab? Reproduc...
Greg Landrum2.1K views
The Future of Research (Science and Technology) by Duncan Hull
The Future of Research (Science and Technology)The Future of Research (Science and Technology)
The Future of Research (Science and Technology)
Duncan Hull39K views
The FAIRDOM Commons for Systems Biology by FAIRDOM
The FAIRDOM Commons for Systems BiologyThe FAIRDOM Commons for Systems Biology
The FAIRDOM Commons for Systems Biology
FAIRDOM2.3K views
Open PHACTS April 2017 Science webinar Workflow tools by open_phacts
Open PHACTS April 2017 Science webinar Workflow toolsOpen PHACTS April 2017 Science webinar Workflow tools
Open PHACTS April 2017 Science webinar Workflow tools
open_phacts348 views
RDA Scholarly Infrastructure 2015 by William Gunn
RDA Scholarly Infrastructure 2015RDA Scholarly Infrastructure 2015
RDA Scholarly Infrastructure 2015
William Gunn1K views
Acs denver dirks potenzone 30 aug2011 by Rudy Potenzone
Acs denver dirks potenzone 30 aug2011Acs denver dirks potenzone 30 aug2011
Acs denver dirks potenzone 30 aug2011
Rudy Potenzone534 views
Keynote speech - Carole Goble - Jisc Digital Festival 2015 by Jisc
Keynote speech - Carole Goble - Jisc Digital Festival 2015Keynote speech - Carole Goble - Jisc Digital Festival 2015
Keynote speech - Carole Goble - Jisc Digital Festival 2015
Jisc4K views
2011 03-provenance-workshop-edingurgh by Jun Zhao
2011 03-provenance-workshop-edingurgh2011 03-provenance-workshop-edingurgh
2011 03-provenance-workshop-edingurgh
Jun Zhao392 views
Docker in Open Science Data Analysis Challenges by Bruce Hoff by Docker, Inc.
Docker in Open Science Data Analysis Challenges by Bruce HoffDocker in Open Science Data Analysis Challenges by Bruce Hoff
Docker in Open Science Data Analysis Challenges by Bruce Hoff
Docker, Inc.652 views

More from Carole Goble

The ELIXIR FAIR Knowledge Ecosystem for practical know-how: RDMkit and FAIRCo... by
The ELIXIR FAIR Knowledge Ecosystem for practical know-how: RDMkit and FAIRCo...The ELIXIR FAIR Knowledge Ecosystem for practical know-how: RDMkit and FAIRCo...
The ELIXIR FAIR Knowledge Ecosystem for practical know-how: RDMkit and FAIRCo...Carole Goble
45 views23 slides
Can’t Pay, Won’t Pay, Don’t Pay: Delivering open science, a Digital Research... by
Can’t Pay, Won’t Pay, Don’t Pay: Delivering open science,  a Digital Research...Can’t Pay, Won’t Pay, Don’t Pay: Delivering open science,  a Digital Research...
Can’t Pay, Won’t Pay, Don’t Pay: Delivering open science, a Digital Research...Carole Goble
36 views33 slides
Research Software Sustainability takes a Village by
Research Software Sustainability takes a VillageResearch Software Sustainability takes a Village
Research Software Sustainability takes a VillageCarole Goble
40 views29 slides
FAIR Computational Workflows by
FAIR Computational WorkflowsFAIR Computational Workflows
FAIR Computational WorkflowsCarole Goble
193 views29 slides
Open Research: Manchester leading and learning by
Open Research: Manchester leading and learningOpen Research: Manchester leading and learning
Open Research: Manchester leading and learningCarole Goble
143 views17 slides
RDMkit, a Research Data Management Toolkit. Built by the Community for the ... by
RDMkit, a Research Data Management Toolkit.  Built by the Community for the ...RDMkit, a Research Data Management Toolkit.  Built by the Community for the ...
RDMkit, a Research Data Management Toolkit. Built by the Community for the ...Carole Goble
710 views38 slides

Being Reproducible: SSBSS Summer School 2017

  • 1. Being Reproducible: Models, Research Objects and R* Brouhaha Professor Carole Goble, carole.goble@manchester.ac.uk The University of Manchester, UK The FAIRDOM Association Coordinator ELIXIR-UK Head of Node Co-lead ELIXIR Interoperability Platform SSBSS 2017, July 17 2017, Cambridge, UK 4th International Synthetic & Systems Biology Summer School
  • 3. 47/53 “landmark” publications could not be replicated [Begley, Ellis Nature, 483, 2012]
  • 5. Vahan Simonyan, Center for Biologics Evaluation and Research Food and Drug Administration USA
  • 7. John P. A. Ioannidis, How to Make More Published Research True, October 21, 2014, DOI: 10.1371/journal.pmed.1001747
  • 8. Reproducibility of biological experiments is hard for in vivo/vitro and for in silico analysis • OS version • Revision of scripts • Data analysis software versions • Version of data files • Command line parameters written on a napkin • “Black magic” only a grad student knows Fix with latest technologies, best practices and willingness [Keiichiro Ono, Scripps Institute] The first step is to be FAIR See the whole of the previous talk…
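The items listed on this slide (OS version, script revision, software versions, data file versions, command-line parameters) can be recorded automatically at the start of every run instead of on a napkin. The following is a minimal sketch, assuming the analysis is a Python script kept in a git repository; the function names and the layout of the record are illustrative and are not taken from the talk.

```python
import hashlib
import json
import platform
import subprocess
import sys


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def script_revision():
    """Return the current git commit hash, or None if it cannot be determined."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None  # git is missing, or this is not a git checkout
    return result.stdout.strip()


def capture_run_record(data_files, argv=None):
    """Collect the slide's checklist into one JSON-serialisable dict."""
    argv = list(sys.argv if argv is None else argv)
    return {
        "os": platform.platform(),                 # OS version
        "python": platform.python_version(),       # interpreter version
        "script_revision": script_revision(),      # revision of scripts
        "command_line": argv,                      # command-line parameters
        "data_files": {str(p): sha256_of(p) for p in data_files},  # data versions
    }


def save_run_record(record, path):
    """Write the record next to the results so the two travel together."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(record, handle, indent=2, sort_keys=True)
```

Calling `capture_run_record(["measurements.csv"])` at the top of an analysis script and saving the result alongside the outputs covers most of the list; versions of third-party analysis packages would still need to be added for a real project.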
  • 9. Record All Automate All Contain All Access All Findable (Citable) Accessible (Trackable) Interoperable (Intelligible) Reusable (Reproducible)
  • 10. Design: cherry picking data, random seed reporting, non-independent bias, poor positive and negative controls, dodgy normalisation, arbitrary cut-offs, premature data triage, un-validated materials, improper statistical analysis, poor statistical power, stop when “get to the right answer”, software misconfigurations, misapplied black box software. Reporting: incomplete reporting of software configurations, parameters & resource versions, missed steps, missing data, vague methods, missing software. Empirical, Statistical, Computational [V. Stodden, IMS Bulletin (2013)]. Reproducibility and reliability of biomedical research: improving research practice. https://www.sciencenews.org/article/12-reasons-research-goes-wrong
  • 12. “When I use a word,” Humpty Dumpty said in rather a scornful tone, “it means just what I choose it to mean - neither more nor less.” Carroll, Through the Looking Glass. re-compute replicate rerun repeat re-examine repurpose recreate reuse restore reconstruct review regenerate revise recycle redo robustness tolerance verification compliance validation assurance remix
  • 13. Scientific publication goals: (i) announce a result; (ii) convince readers it is correct. Papers in experimental science should describe the results and provide a clear enough protocol to allow successful repetition and extension. Papers in computational science should describe the results and provide the complete software development environment, data and set of instructions which generated the figures. Virtual Witnessing* *Leviathan and the Air-Pump: Hobbes, Boyle, and the Experimental Life (1985) Shapin and Schaffer. Jill Mesirov David Donoho
  • 15. Repeatability: “Sameness” Same result 1 Lab 1 experiment Reproducibility: “Similarity” Similar result > 1 Lab > 1 experiment why the differences? https://2016-oslo-repeatability.readthedocs.org/en/latest/repeatability-discussion.html Validate Verify
  • 16. Method Reproducibility: the provision of enough detail about study procedures and data so, in theory or in actuality, the same procedures could be exactly repeated. Result Reproducibility (aka replicability): obtaining the same results from the conduct of an independent study whose procedures are as closely matched to the original experiment as possible. Goodman, et al. Science Translational Medicine 8 (341) 2016 Validate Verify
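One way to read the “sameness” versus “similarity” distinction computationally is as two different checks on a set of reported numbers: an exact comparison for a rerun in the same lab, and a comparison within a stated tolerance for an independent study. This is only a sketch of that reading; the function names and the 5% default tolerance are assumptions chosen for illustration, not values from the talk.

```python
import math


def is_repeat(original, rerun):
    """Repeatability: same lab, same experiment, so expect identical values."""
    return list(original) == list(rerun)


def is_similar(original, independent, rel_tol=0.05):
    """Reproducibility: an independent result agrees within a relative tolerance."""
    original, independent = list(original), list(independent)
    if len(original) != len(independent):
        return False
    return all(
        math.isclose(a, b, rel_tol=rel_tol)
        for a, b in zip(original, independent)
    )
```

When `is_similar` fails, the interesting question is the one on the slide: why the differences?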
  • 17. What are you reproducing? Algorithm vs its script conflation Methods techniques, algorithms, spec. of the steps, models Materials datasets, parameters, algorithm seeds Instruments codes, services, scripts, underlying libraries, workflows, ref datasets Laboratory sw and hw infrastructure, systems software, integrative platforms computational environment
  • 19. Validate Verify Recompute By Degrees Fixivity - Liveness • New/updated/deprecated methods, datasets, services, codes, h/w • Snapshots Dependency – Containment • Streams, non-portable data/software, • 3rd party services, supercomputer access, licensing restrictions…. • Locally contained and maintained • External dependencies Transparency • Blackboxes, proprietary software, manual steps Robustness • Bounds of use • Stochastics, non-deterministics, contexts
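For the robustness point about stochastic and non-deterministic methods, a common practice is to run the computation under several recorded seeds and report the spread rather than a single lucky run. The sketch below assumes a hypothetical stand-in simulation (`noisy_simulation`); only the pattern of private, seeded generators and a per-seed report is the point.

```python
import random
import statistics


def noisy_simulation(seed, n_steps=1000):
    """Hypothetical stochastic model: the mean of n_steps Gaussian draws."""
    rng = random.Random(seed)  # private generator, fully determined by the seed
    draws = [rng.gauss(1.0, 0.5) for _ in range(n_steps)]
    return statistics.fmean(draws)


def robustness_report(seeds):
    """Run the simulation once per seed and summarise the variation."""
    seeds = list(seeds)
    per_seed = {seed: noisy_simulation(seed) for seed in seeds}
    values = list(per_seed.values())
    return {
        "seeds": seeds,                       # report the seeds, not just the result
        "per_seed": per_seed,
        "mean": statistics.fmean(values),
        "stdev": statistics.stdev(values),    # needs at least two seeds
    }
```

Because each run uses its own `random.Random(seed)`, rerunning with the same seed list gives the same report on the same Python version, while the standard deviation indicates how much the conclusion depends on the seed.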
  • 20. https://xkcd.com/797/ Components and Dependencies: software is typically a compound work. Libraries. Plug-ins. Code fragments. We are encouraged to reuse and not reinvent. Combining licenses. License compatibilities
  • 21. Black boxes • closed codes • closed external or cloud services • method obscurity • manual steps [Thanks to Jason Scott]
  • 22. The Reproducibility Window: all experiments become less reproducible over time…. • Can’t contain everything – Pesky Internet in a Box • Can’t automate everything – Pesky people intervening • Can’t fix and fossilise everything – Pesky science keeps changing Results may vary
  • 23. Bonus slide: at SSBSS, Theodor Gescher came up with REALSCI. Robust: many runs. Environment: describe the equipment/OS. Another: done by a lab other than yours. Limits: parameters. Standards: well understood/comprehensible methods. Complete: not cherry picking. Immortal: community supported commodity systems
  • 24. Mixed Central and Distributed stores: Containment and Dependencies. Upload vs Referencing In House Stores External Databases Publishing services Model Resources
  • 25. Mixed Central and Distributed stores: Containment and Dependencies. Upload vs Referencing In House Stores External Databases Publishing services Model Resources Migrations into FAIRDOMHub For long term reproducibility
  • 26. Shades of Reproducibility Running an active instrument Reading an archived record Are you using hard-wired localhost ids? Workflows SOPs Containers, cloud services, common services Markup languages, reporting guidelines and checklists, ontologies, catalogues Sounds hard…. what can I do? Catalogue
  • 27. Protocol specs and sharing… A language for specifying experimental protocols for biological research in a way that is precise, unambiguous, and understandable by both humans and computers.
  • 30. in situ reproducible models in FAIRDOM metadata annotation against standards validation, comparison and simulation SBML Model simulation Model comparison Model versioning Reproducing simulations [Jacky Snoep, Dagmar Waltemath, Martin Peters, Martin Scharm] JWS Online
  • 32. Tracking model versions smartly Scharm, M., Wolkenhauer, O., & Waltemath, D. (2015). An algorithm to detect and communicate the differences in computational models describing biological systems. Bioinformatics, btv484
  • 33. Model simulation in FAIRDOMHub using JWS Online
  • 34. A simulation database allows a one-click, live figure reproduction in a FAIRDOM-SEEK JWS model Excel data file Dagmar Waltemath, Uni Rostock Jacky Snoep, Uni Stellenbosch Simulation Experiment Description Markup Language: XML-based format for encoding simulation setups, to ensure exchangeability and reproducibility of simulation experiments • which models to use in an experiment, • modifications to apply on the models before using them, • which simulation procedures to run on each model, • what analysis results to output, • and how the results should be presented.
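The five things a simulation experiment description has to capture can be illustrated with a small data structure. This sketch is not SED-ML and does not use any SED-ML library; it is a hypothetical Python dataclass, serialised to JSON, whose fields simply mirror the five bullets on the slide. The file name and parameter names in the usage note are made up.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class SimulationExperiment:
    """Illustrative container mirroring the five items listed on the slide."""
    models: list                                      # which models to use
    changes: dict = field(default_factory=dict)       # modifications applied first
    simulations: list = field(default_factory=list)   # procedures to run per model
    outputs: list = field(default_factory=list)       # which results to output
    plots: list = field(default_factory=list)         # how results are presented

    def to_json(self):
        """Serialise deterministically so two descriptions can be compared."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)

    @classmethod
    def from_json(cls, text):
        """Rebuild the description from its JSON form."""
        return cls(**json.loads(text))
```

For example, `SimulationExperiment(models=["glycolysis.xml"], changes={"Vmax_HK": 2.0}, simulations=[{"type": "time_course", "end": 100}], outputs=["ATP"], plots=[{"x": "time", "y": "ATP"}])` describes one run completely enough that `from_json(to_json())` returns an equal object, which is the exchangeability property the slide is after.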
  • 36. Model Technical curation for Journals [Jacky Snoep (Stellenbosch), Dagmar Waltemath, Martin Peters, Martin Scharm (Rostock)] * store DOI citable supplementary files on FAIRDOMHub ** model and data curation *** reproducible clickable figures in papers using SED-ML
  • 37. Cataloguing Packaging Penkler, G., du Toit, F., Adams, W., Rautenbach, M., Palm, D. C., van Niekerk, D. D. and Snoep, J. L. (2015), Construction and validation of a detailed kinetic model of glycolysis in Plasmodium falciparum. FEBS J, 282: 1481–1511. doi:10.1111/febs.13237 https://fairdomhub.org/investigations/56 DOI: 10.15490/seek.1.investigation.56 Snapshot preservation active
  • 38. An “evolving manuscript” would begin with a pre-publication, pre-peer review “beta 0.9” version of an article, followed by the approved published article itself, [ … ] “version 1.0”. Subsequently, scientists would update this paper with details of further work as the area of research develops. Versions 2.0 and 3.0 might allow for the “accretion of confirmation [and] reputation”. Ottoline Leyser […] assessment criteria in science revolve around the individual. “People have stopped thinking about the scientific enterprise”. http://www.timeshighereducation.co.uk/news/evolving-manuscripts-the-future-of-scientific-communication/2020200.article
  • 39. Packaging: CombineArchive https://sems.uni-rostock.de/projects/combinearchive/ Scharm M, Wendland F, Peters M, Wolfien M, Theile T, Waltemath D. SEMS, University of Rostock. Zip-like file with a manifest & metadata - Bundling files - Keeping provenance - Exchanging data - Shipping results. Bergmann, F. T., Adams, R., Moodie, S., Cooper, J., Glont, M., Golebiewski, M., ... & Olivier, B. G. (2014). COMBINE archive and OMEX format: one file to share all information to reproduce a modeling project. BMC Bioinformatics, 15(1), 1.
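The idea of a “zip-like file with a manifest & metadata” can be sketched with the Python standard library alone. The example below does not produce a valid COMBINE archive: the `manifest.json` layout is a simplified, hypothetical stand-in for the real OMEX manifest, used only to show how bundling files with a checksummed manifest lets a recipient verify what was shipped.

```python
import hashlib
import json
import zipfile

MANIFEST_NAME = "manifest.json"  # hypothetical; the OMEX format defines its own manifest


def build_archive(archive_path, files):
    """Bundle files (a mapping of archive name -> bytes) together with a manifest."""
    manifest = {
        "entries": [
            {"location": name, "sha256": hashlib.sha256(data).hexdigest()}
            for name, data in sorted(files.items())
        ]
    }
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(MANIFEST_NAME, json.dumps(manifest, indent=2))
        for name, data in files.items():
            archive.writestr(name, data)
    return manifest


def verify_archive(archive_path):
    """Check that every file listed in the manifest is present and unmodified."""
    with zipfile.ZipFile(archive_path) as archive:
        manifest = json.loads(archive.read(MANIFEST_NAME))
        names = set(archive.namelist())
        return all(
            entry["location"] in names
            and hashlib.sha256(archive.read(entry["location"])).hexdigest()
            == entry["sha256"]
            for entry in manifest["entries"]
        )
```

A real project would use the COMBINE archive tooling referenced on the slide; the sketch only shows why a manifest matters: without it, a zip file says nothing about what its contents are supposed to be.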
  • 40. Standards-based metadata framework for bundling (scattered) resources with context and citation Packaging: Research Objects http://researchobject.org
  • 42. Manifest Construction Container Manifest Description Packaging Platforms: Zip files, BagIt, Docker, Conda, Singularity Repositories FAIRDOMHub Packaging: Research Objects in a nutshell Different manifest description profiles for different kinds of objects
  • 43. From Virtual Machines to Executable Containers for portable execution • Containers: everything required to make a piece of software run is packaged into isolated containers. • Unlike VMs, containers do not bundle a full operating system - only libraries and settings required to make the software work. • Efficient, lightweight, self-contained systems • Guarantees that software will always run the same, regardless of where it’s deployed. https://www.software.ac.uk/c4rr/ https://biocontainers.pro/ Biocontainers
  • 44. Use commodity and community systems Sustained platforms Communities to drive them Tooling and training Spreadsheets are the Cockroaches of Science
  • 45. EU FAIR Data Expert Group Consultation https://github.com/FAIR-Data-EG/consultation/issues
  • 46. Want to know more? Go on a Software or Data Carpentry Course https://tess.elixir-europe.org
  • 47. Make software open and reusable
  • 48. Software Sustainability Institute, http://www.software.ac.uk Goble, Better Software Better Research, IEEE Internet Computing 18(5), (2014) DOI: 10.1109/MIC.2014.88 Jiménez RC, Kuzak M, Alhamdoosh M et al. Four simple recommendations to encourage best practices in research software [version 1; referees: 3 approved]. F1000Research 2017, 6:876 (doi: 10.12688/f1000research.11407.1)
  • 49. Use Common Platforms Get the licensing right… MATLAB Mathematica…. Proprietary software Cloud Centralised Service in situ reproducibility…. Galaxy FAIRDOMHub + JWS Online Blackbox vs Whitebox
  • 50. https://view.commonwl.org/workflows/github.com/ProteinsWebTeam/ebi-metagenomics-cwl/tree/fa86fce/workflows/rna-selector.cwl Use and document workflows, preferably a workflow management system. Living Research Objects! http://commonwl.org/ Workflow repository
  • 51. Use a workflow – the vision! preferably a workflow management system, preferably described using Common Workflow Language. Experimental workflows Event BUS Business Process Management Taverna Knime Galaxy Workflow BPM layer Workflow Computation Application layer Computing resources Databases Effector layer Front-end Web interface / Monitoring interface Pipeline Pilot FAIRDOM SEEK Workflow repository Workflow portal repository launch, results FAIRDOM [Jean Loup Fallon, Carole Goble]
  • 52. https://hive.biochemistry.gwu.edu/htscsrs/workshop_2017 Reproducible Pipelines for Robust Regulation BioCompute Objects Emphasis on fixing the pipeline so it can be replicated, and on reporting the parameter space
  • 53. Use an Electronic Lab Notebook
  • 54. What can you do? • Follow the 10 RACA Principles • Take action, be imperfect • Demand reproducibility in reviews. • Educate your PIs and supervisors.
  • 55. [Norman Morrison] Technological Debt: Appropriate Effort Retrospective Reusability
  • 56. What are the incentives? [Garza] [Malone] [Resnik]
  • 57. Acknowledgements • David De Roure • Tim Clark • Sean Bechhofer • Robert Stevens • Christine Borgman • Victoria Stodden • Marco Roos • Jose Enrique Ruiz del Mazo • Oscar Corcho • Ian Cottam • Steve Pettifer • Magnus Rattray • Chris Evelo • Katy Wolstencroft • Robin Williams • Pinar Alper • C. Titus Brown • Greg Wilson • Kristian Garza • Juliana Freire • Jill Mesirov • Simon Cockell • Paolo Missier • Paul Watson • Gerhard Klimeck • Matthias Obst • Jun Zhao • Pinar Alper • Daniel Garijo • Yolanda Gil • James Taylor • Alex Pico • Sean Eddy • Cameron Neylon • Barend Mons • Kristina Hettne • Stian Soiland-Reyes • Rebecca Lawrence • Michael Crusoe
  • 58. Jon Olav Vik, Norwegian University of Life Science Maksim Zakhartsev, University Hohenheim, Stuttgart, Germany Alexey Kolodkin, Siberian Branch Russian Academy of Sciences Tomasz Zieliński, SynthSys Centre, University Edinburgh, UK Martin Peters, Martin Scharm, Systems Biology Bioinformatics, University of Rostock, Germany
  • 59. Web sites • Force11 http://www.force11.org • TeSS https://tess.elixir-europe.org • FAIRDOM http://www.fair-dom.org • FAIRDOMHub http://www.fairdomhub.org • Software Carpentry http://software-carpentry.org • Data Carpentry http://datacarpentry.org • Software Sustainability Institute http://www.software.ac.uk • Rightfield http://www.rightfield.org.uk • FAIRSharing http://www.fairsharing.org • Common Workflow Language http://commonwl.org/
  • 60. Reading List (refs also throughout) • John P. A. Ioannidis, How to Make More Published Research True, October 21, 2014, DOI: 10.1371/journal.pmed.1001747 • Ioannidis JPA (2005) Why Most Published Research Findings Are False. PLoS Med 2(8): e124. doi:10.1371/journal.pmed.0020124 • Steven N. Goodman, Daniele Fanelli and John P. A. Ioannidis, What does research reproducibility mean? Science Translational Medicine 01 Jun 2016: Vol. 8, Issue 341, pp. 341ps12 DOI: 10.1126/scitranslmed.aaf5027 • Sandve GK, Nekrutenko A, Taylor J, Hovig E (2013) Ten Simple Rules for Reproducible Computational Research. PLoS Comput Biol 9(10): e1003285. doi:10.1371/journal.pcbi.1003285 • Massimiliano Assante, Leonardo Candela, Donatella Castelli, Paolo Manghi and Pasquale Pagano, Science 2.0 Repositories: Time for a Change in Scholarly Communication, D-Lib Magazine January/February 2015, Volume 21, Number 1/2, DOI: 10.1045/january2015-assante • Waltemath, D., Henkel, R., Hälke, R., Scharm, M., & Wolkenhauer, O. (2013). Improving the reuse of computational models through version control. Bioinformatics, 29(6), 742-748. • Bergmann, F. T., Adams, R., Moodie, S., Cooper, J., Glont, M., Golebiewski, M., ... & Olivier, B. G. (2014). COMBINE archive and OMEX format: one file to share all information to reproduce a modeling project. BMC Bioinformatics, 15(1), 1. • Scharm, M., Wolkenhauer, O., & Waltemath, D. (2015). An algorithm to detect and communicate the differences in computational models describing biological systems. Bioinformatics, btv484 • http://www.reuters.com/article/2012/03/28/us-science-cancer-idUSBRE82R12P20120328 • http://www.acmedsci.ac.uk/policy/policy-projects/reproducibility-and-reliability-of-biomedical-research/