The document discusses computational models in systems biology and improving their reproducibility and reuse. It describes how models evolve over time through various versions as corrections and improvements are made. This can include changes to the structure and equations of the models. The author presents their work developing methods to detect, characterize, and communicate the differences between versions of computational models. This includes an algorithm to identify changes, formats to report changes, and an ontology called COMODI to semantically describe change types. The goal is to improve understanding of a model's evolution and history to increase trust and reuse of models.
The document discusses the evolution of computational models over time. It provides the example of models of the cell cycle control network in the bacterium Clostridium acetobutylicum that have been developed and expanded on since 1984. The models have become more detailed over time, incorporating new experimental findings and insights. The document also discusses how models of the core cell cycle oscillator have evolved since 1991 with the addition of new molecular components and regulatory interactions based on studies in different organisms.
M2CAT: Extracting reproducible simulation studies from model repositories usi... – Martin Scharm
Martin Scharm presents M2CAT, a workflow to extract reproducible simulation studies from model repositories. It searches the graph database Masymos for relevant models, simulations, and related data. M2CAT then uses the CombineArchive Toolkit to export the selected studies as COMBINE archive files, which provide a standard format for sharing complete simulation experiments. These archives can be explored and modified using the CombineArchive Web interface or other tools. The goal is to make the large amounts of data in repositories more usable and enable researchers to more easily reproduce and build upon existing computational studies.
This document provides an overview of standards and best practices for making computational models reusable through the use of model repositories and standard formats. It discusses the COMBINE initiative for standardizing the encoding of models and simulations. The document encourages authors to make their models and data FAIR (Findable, Accessible, Interoperable, Reusable) by using community standards for publishing, exchanging, and archiving models. Examples of open model repositories and standards-compliant tools and libraries are provided to demonstrate how authors can improve sharing and reuse of their models.
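A COMBINE archive is, at its core, a ZIP container whose manifest.xml lists every entry and its format. The sketch below shows the idea using only Python's standard zipfile module; the file names, the SBML/SED-ML content, and the exact format URIs are illustrative assumptions, not the output of any official COMBINE library.

```python
# Minimal sketch: packaging a model plus its simulation description into a
# COMBINE archive (a ZIP file whose manifest.xml lists every entry).
# File names and placeholder contents below are invented for illustration.
import zipfile

MANIFEST = """<?xml version="1.0" encoding="UTF-8"?>
<omexManifest xmlns="http://identifiers.org/combine.specifications/omex-manifest">
  <content location="." format="http://identifiers.org/combine.specifications/omex"/>
  <content location="./model.xml" format="http://identifiers.org/combine.specifications/sbml"/>
  <content location="./simulation.xml" format="http://identifiers.org/combine.specifications/sed-ml"/>
</omexManifest>
"""

def write_archive(path, model_xml, simulation_xml):
    """Bundle a model and its simulation setup into one shareable container."""
    with zipfile.ZipFile(path, "w") as omex:
        omex.writestr("manifest.xml", MANIFEST)
        omex.writestr("model.xml", model_xml)
        omex.writestr("simulation.xml", simulation_xml)

write_archive("study.omex", "<sbml/>", "<sedML/>")
```

In practice one would use a dedicated library such as the CombineArchive Toolkit mentioned above rather than hand-rolling the manifest, but the sketch shows why a single container is enough to carry a complete study.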
Applications of Computer Science in Environmental Models – IJLT EMAS
Computation is now regarded as an equal and indispensable partner, along with theory and experiment, in the advance of scientific knowledge and engineering practice. Numerical simulation enables the study of complex systems and natural phenomena that would be too expensive, too dangerous, or even impossible to study by direct experimentation. The quest for ever higher levels of detail and realism in such simulations requires enormous computational capacity and has provided the impetus for dramatic breakthroughs in computer algorithms and architectures. Thanks to these advances, computational scientists and engineers can now solve large-scale problems that were once thought intractable. Computational science and engineering (CSE) is a rapidly growing multidisciplinary area with connections to the sciences, engineering, mathematics, and computer science. CSE focuses on the development of problem-solving methodologies and robust tools for the solution of scientific and engineering problems. We believe that CSE will play an important, if not dominant, role in the future of the scientific discovery process and engineering design. Computational science is now widely used for environmental engineering calculations: the behavior of environmental engineering systems and processes can be studied computationally, yielding better understanding of, and better solutions to, environmental engineering problems.
Friday, October 15th, 2021, Sapporo, Hokkaido, Japan.
Hokkaido University ICReDD - Faculty of Medicine Joint Symposium
https://www.icredd.hokudai.ac.jp/event/5993
ICReDD (Institute for Chemical Reaction Design and Discovery)
https://www.icredd.hokudai.ac.jp
On August 29, 2005, Hurricane Katrina devastated New Orleans and t.docx – hopeaustin33688
On August 29, 2005, Hurricane Katrina devastated New Orleans and the Gulf Coast. Procter & Gamble's coffee manufacturing, with brands such as Folgers that get over half of their supply from sites in New Orleans, was severely impacted by the hurricane. Six months later there were, as a P&G executive told the New York Times, “still holes on the shelves” where P&G's brands should be. Given your new insights from supply chain management, how would you solve/avoid a situation like this?
Module 2 - Case
Supply Chain Design
Case Assignment
Welcome to the second case study for this course.
Assignment: Please read the article below (available through ProQuest), then in a 3-4 page paper discuss the article and integrated supply chains.
Assignment Expectations
The authors of the article do a pretty good job of explaining the concept, but I would like you to tell me what they mean in your own words. What is an integrated supply chain? What are the key elements/challenges in an integrated supply chain? What are the specific benefits to firms that implement superior supply chain management? Write a 3-4 page paper, using the same format as module one.
Integrated supply chains to be explored – Alan Johnson. Manufacturers' Monthly. Sydney: May 2007. It is attached.
Abstract:
"The key challenge is to integrate supply chain capabilities to provide a seamless solution from potential design through to end delivery. End users are looking for a complete supply chain where there is single accountability and responsibility for delivery," said O'Brien.
Module 2 - Background
Supply Chain Design
The following information will give you a good background on the importance of having a properly designed supply chain. Please review the information presented below to assist you with the assignments. I encourage you to surf the internet for more information on supply chain design.
Required Materials
You are not required to read all of these articles, but you may choose to read several to further your knowledge. You are encouraged to surf the internet to gather additional resources in order to research your topic more thoroughly.
Start off this module by reviewing the article below on Supply Chain Design.
Beamon, B. M. (1998). Supply Chain Design and Analysis: Models and Methods. International Journal of Production Economics, 55(3), 281-294. Accessed August 10, 2009, at http://www.sclgme.org
The ProQuest database provides the articles below concerning changing supply chains and supply chain security.
Johnson, A., (2007). Integrated supply chains to be explored. Manufacturers' Monthly. Sydney. Available in the Trident Online Library.
Rogers, D., Lockman, D., Schwerdt, G., O'Donnell, B., & Huff, R., (2004). Supply Chain Security. Material Handling Management, 59(2), 15-18. Available in the Trident Online Library.
Supply Chain Design and Analysis: Models and Methods
Benita M. Beamon
University of Washington, Industrial Engineering, Box 352650, Seattle, WA 98195
This document introduces BioPreDyn-bench, a suite of benchmark problems for dynamic modelling in systems biology. The suite contains 6 benchmark problems ranging from medium to large-scale kinetic models of organisms such as E. coli, S. cerevisiae, D. melanogaster, and human cells. For each benchmark, the document provides a description, implementations in various formats, computational results from specific solvers, and analysis. The suite aims to serve as reference test cases to evaluate and compare parameter estimation methods for dynamic models in systems biology.
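The core task those benchmarks probe is parameter estimation: finding the kinetic parameters that make a model's simulated output match observed data, usually by minimising a sum of squared residuals. The toy below illustrates that loop for a one-parameter exponential-decay model with a plain grid search; the model, data, and grid are invented here and are far simpler than any BioPreDyn-bench problem.

```python
# Toy parameter estimation: recover a decay rate k from observations of
# x(t) = x0 * exp(-k t) by minimising the sum of squared errors on a grid.
# Everything here (model, "data", true k = 0.7) is invented for illustration.
import math

def simulate(k, times, x0=10.0):
    return [x0 * math.exp(-k * t) for t in times]

def sse(k, times, data):
    """Sum of squared residuals between simulation and data."""
    return sum((m - d) ** 2 for m, d in zip(simulate(k, times), data))

times = [0.0, 0.5, 1.0, 2.0, 4.0]
data = simulate(0.7, times)                    # synthetic observations, true k = 0.7
grid = [i / 1000 for i in range(1, 2000)]      # candidate rates 0.001 .. 1.999
best = min(grid, key=lambda k: sse(k, times, data))
print(best)  # → 0.7
```

Real benchmark problems replace the grid search with gradient-based or metaheuristic optimisers, since the models have dozens to hundreds of parameters.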
The document presents a theoretical approach to engineering bioinspired systems through the design and characterization of non-coded amino acids. It discusses the characterization of arginine surrogates like c5Arg and (βPro)Arg through computational methods. It also characterizes indoline carboxylic acid and α-methyl indoline carboxylic acid as proline surrogates, analyzing their conformational preferences and energies using different theoretical methods and solvent models.
This document provides information about Bio-Modeling Systems SAS, including contact details and an overview of their services. It discusses the need to change the drug discovery paradigm to address issues like high failure rates and unreliable research. Bio-Modeling Systems provides a platform called CADI™ Discovery that uses a mechanisms-based approach and principles like negative selection to model biological complexity and address uncertainties in scientific data. The goal is to improve medical research and validation processes to enhance patient health outcomes.
Model Management in Systems Biology: Challenges – Approaches – Solutions – Martin Scharm
I gave this talk as a webinar in the FAIRDOM webinar series 2016. The recordings of the webinar are available from http://fair-dom.org/knowledgehub/webinars-2/martin-scharm/
Introduction to chemical kinetics - WT/EBI course systems biology 2018 – Nicolas Le Novère
The document discusses modelling chemical kinetics and reactions. It describes how processes can be combined in ordinary differential equations (ODEs) for deterministic simulations or transformed into propensities for stochastic simulations. It also discusses the law of mass action, enzyme kinetics including Michaelis-Menten and Hill equations, and homeostasis - maintaining stable levels through negative feedback loops.
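The deterministic ODE view described above can be made concrete with a tiny example: a substrate S consumed by an enzyme following Michaelis-Menten kinetics, dS/dt = -Vmax·S/(Km+S), integrated with an explicit Euler scheme. The parameter values and step size are invented for illustration and are not taken from the course.

```python
# Sketch of a deterministic kinetic simulation: Michaelis-Menten consumption
# of a substrate S, dS/dt = -Vmax*S/(Km+S), via explicit Euler integration.
# Vmax, Km, the initial concentration, and dt are arbitrary example values.
def michaelis_menten_rate(s, vmax=1.0, km=0.5):
    """Reaction velocity: saturates at vmax for large s, ~linear for small s."""
    return vmax * s / (km + s)

def integrate(s0, t_end, dt=0.001):
    s, t = s0, 0.0
    while t < t_end:
        s -= michaelis_menten_rate(s) * dt     # substrate is only consumed
        t += dt
    return s

s_final = integrate(s0=2.0, t_end=1.0)
print(s_final)  # below the initial 2.0, still positive
```

A stochastic treatment would instead turn the same rate law into a propensity and draw reaction events, as the summary notes; production solvers also use adaptive implicit integrators rather than fixed-step Euler.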
2. Leiviskä K (1996) Simulation in pulp and paper industry. February 1996 – Huy Nguyen
This document discusses simulation in the pulp and paper industry. It provides an overview of how mathematical models and simulation are used at different levels, including research and development, product and process design, control engineering, and process operation. Simulation is employed for tasks such as disturbance analysis, start-up and shut-down planning, real-time control and optimization, and operator training. The document also covers the basics of simulation, including model classification, selection, scope, and solution principles.
This talk was part of the 2020 Disease Map Modeling Community meeting, covering the steps towards publishing reproducible simulation studies (based on a reused model). Links to different COMBINE guidelines, tutorials and efforts. Grants: European Commission: EOSCsecretariat.eu - EOSCsecretariat.eu (831644)
Paper Annotated: SinGAN-Seg: Synthetic Training Data Generation for Medical I... – Devansh16
YouTube video: https://www.youtube.com/watch?v=Ao-19L0sLOI
SinGAN-Seg: Synthetic Training Data Generation for Medical Image Segmentation
Vajira Thambawita, Pegah Salehi, Sajad Amouei Sheshkal, Steven A. Hicks, Hugo L.Hammer, Sravanthi Parasa, Thomas de Lange, Pål Halvorsen, Michael A. Riegler
Processing medical data to find abnormalities is a time-consuming and costly task, requiring tremendous effort from medical experts. Therefore, AI has become a popular tool for the automatic processing of medical data, acting as a supportive tool for doctors. AI tools depend heavily on data for training the models. However, there are several constraints on access to large amounts of medical data for training machine learning algorithms in the medical domain, e.g., due to privacy concerns and the costly, time-consuming medical data annotation process. To address this, in this paper we present a novel synthetic data generation pipeline called SinGAN-Seg to produce synthetic medical data with the corresponding annotated ground truth masks. We show that these synthetic data generation pipelines can be used as an alternative to bypass privacy concerns and as an alternative way to produce artificial segmentation datasets with corresponding ground truth masks, avoiding the tedious medical data annotation process. As a proof of concept, we used an open polyp segmentation dataset. By training UNet++ on both the real polyp segmentation dataset and the corresponding synthetic dataset generated by the SinGAN-Seg pipeline, we show that the synthetic data can achieve performance very close to the real data when the real segmentation datasets are large enough. In addition, we show that synthetic data generated by the SinGAN-Seg pipeline improves the performance of segmentation algorithms when the training dataset is very small. Since the SinGAN-Seg pipeline is applicable to any medical dataset, it can be used with any other segmentation dataset.
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Cite as: arXiv:2107.00471 [eess.IV]
(or arXiv:2107.00471v1 [eess.IV] for this version)
Statistical modeling in pharmaceutical research and development – PV. Viji
Statistical modeling in pharmaceutical research and development , Statistical Modeling , Descriptive Versus Mechanistic Modeling , Statistical Parameters Estimation , Confidence Regions , Non Linearity at the Optimum , Sensitivity Analysis , Optimal Design , Population Modeling
LIFE EXPECTANCY PREDICTION FOR POST THORACIC SURGERY – IRJET Journal
This document discusses using machine learning algorithms to predict life expectancy after thoracic surgery. Researchers used attribute ranking and selection methods to identify the most important attributes from a dataset of patient health records. They evaluated algorithms like logistic regression and random forest on the reduced dataset. Logistic regression achieved the highest prediction accuracy of 85%. The goal was to more accurately predict mortality risk based on a patient's underlying health issues and attributes related to lung cancer.
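The classifier named above, logistic regression, maps a weighted sum of patient attributes through a sigmoid to a probability of the positive class and is fitted by minimising log-loss. The sketch below trains one by stochastic gradient descent on a tiny invented 1-D dataset; it is not the paper's data, features, or code, just the technique in miniature.

```python
# Minimal logistic regression trained by stochastic gradient descent.
# The single feature, labels, learning rate, and epoch count are invented
# example values, not taken from the study summarised above.
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(xs, ys, lr=0.5, epochs=2000):
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            p = sigmoid(w * x + b)
            w -= lr * (p - y) * x              # gradient of the log-loss in w
            b -= lr * (p - y)                  # gradient of the log-loss in b
    return w, b

xs = [0.0, 0.5, 1.0, 3.0, 3.5, 4.0]            # a made-up scaled risk attribute
ys = [0, 0, 0, 1, 1, 1]                        # 0 = low risk, 1 = high risk

w, b = train(xs, ys)

def predict(x):
    return sigmoid(w * x + b) > 0.5
```

With many attributes, w becomes a vector and the same update rule applies componentwise; attribute ranking, as in the paper, then amounts to keeping only the most informative components.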
IRJET-Automatic RBC And WBC Counting using Watershed Segmentation Algorithm – IRJET Journal
This document presents a method for automatically counting red blood cells (RBCs) and white blood cells (WBCs) using image processing techniques. It discusses the limitations of conventional manual counting methods and proposes a software-based watershed segmentation algorithm to segment and count blood cells from microscope images. The algorithm involves preprocessing the image, applying filters, segmenting cells using markers and boundaries, and counting the segmented cells. Experimental results found the automatic method took 14.43 seconds on average and achieved 94.58% accuracy, faster and more accurate than manual counting. This software-based solution provides a low-cost alternative for blood cell analysis in medical laboratories.
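The final counting step can be illustrated without any imaging library: once segmentation has produced a binary mask in which each cell is a separate blob of 1s, counting cells is counting connected components. The flood-fill sketch below is a deliberate simplification of the paper's pipeline; real watershed segmentation is needed precisely to split touching cells, which this sketch assumes are already separated.

```python
# Simplified stand-in for the counting step of cell analysis: count the
# 4-connected blobs of 1s in a binary mask via iterative flood fill.
# (The mask below is a toy example; the real pipeline builds the mask by
# preprocessing, filtering, and watershed segmentation first.)
def count_blobs(mask):
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not seen[r][c]:
                count += 1                     # found an unvisited blob
                stack = [(r, c)]               # flood-fill it so it is not recounted
                while stack:
                    y, x = stack.pop()
                    if 0 <= y < rows and 0 <= x < cols and mask[y][x] and not seen[y][x]:
                        seen[y][x] = True
                        stack += [(y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)]
    return count

mask = [
    [1, 1, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [0, 0, 1, 0, 0],
]
print(count_blobs(mask))  # → 3
```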
The Cytell Cell Imaging System is a compact, automated cell imaging system that combines digital microscopy, image cytometry and cell counting capabilities. It provides high quality cellular and subcellular image data and analysis through the use of preconfigured biological applications. These applications simplify routine assays and tasks while generating detailed experimental reports. The system allows non-experts to perform assays but also provides flexibility for experienced researchers. It captures multi-channel fluorescence and brightfield images to analyze cells in a variety of sample formats.
IRJET - IoT based Steroid Measurement in Milk Products – IRJET Journal
The document discusses an IoT-based system to measure steroid levels in milk products. It aims to address the issue of dairy farmers illegally injecting large amounts of steroids into cattle to increase milk production, which can harm human health. The proposed system uses sensors to analyze steroid levels in cattle and milk at various stages, and indicates which organs may be affected based on the chemicals detected. It analyzes the data using machine learning algorithms and displays the results through a graphical interface. The system aims to more accurately detect illegal steroid usage and protect consumer health.
An introductory workshop about machine learning in chemistry. This workshop is a set of slides and jupyter notebooks intended to give an overview of machine learning in chemistry to graduate students in chemical sciences, which was originally presented during a research trip to Ben Gurion University and the Hebrew University in Jerusalem in February 2019. Part 1 of 2.
The workshop lives at https://github.com/jpjanet/ML-chem-workshop where it is maintained in an up-to-date fashion. Notebook examples can be obtained from the GitHub page.
Economy and forecast for 2020: 3 key trends in the future – eSAT Journals
Abstract: The article deals with 3 key trends in the future and their general implications, including 3D, RFID, Business Intelligence and new managerial positions. By 2020, 3D could replace conventional mass production. The basic trends in RFID applications will be: RFID wearables, RFID on merchandise, Host Card Emulation (HCE) payment solutions, printed RFID technology, and RFID chips tracking everyone everywhere in the near future. Business intelligence will be transformed into general intelligence. The contribution covers the following topics: selected aspects of the economy and social aspects of information systems, and complex technological and human issues in today's globalized and interconnected world, and presents new results in a diffuse way. Key words: 3 key trends, 3D, RFID, Business Intelligence, Computer Feudal Monarchy, New Managerial Positions. JEL Classification: A10, A11, A19, E27, E69
Data analytics and software sensors for single use bioprocessing – Exputec
This document discusses data analytics and software sensors for single-use bioprocessing. It provides an overview of Exputec, which offers solutions to improve biopharmaceutical development and production processes using data analytics. Software sensors are described as using models to monitor process variables like substrates and metabolites, complementing hardware sensors. Challenges of data analytics include data integration from collaborators and dealing with outliers. The document advocates for using robust multivariate statistics rather than removing outliers to avoid incorrect results from data analytics.
Towards Non-invasive Labour Detection: A Free-Living Evaluation – Marco Altini
Slides of my presentation at EMBC 2018, more info on this research is available here: https://www.researchgate.net/project/Bloomlife-improving-prenatal-health-through-longitudinal-physiological-monitoring-at-large-scale?_sg=pbraocCDNc2lJd9v5GESvRhkmffW99OTeeNMkalglCirK5r-ZECp2XRy_5Otk-_B-_dlCalxvKUVtex9MkAHUPFKhHT56GfrO6h3
Statistical quality control applied to industrial and manufacturing operations. A case study regarding the use of these tools, and a description of the statistical tools used in quality control and inspection.
This document provides an introduction to mathematical modeling. It discusses key concepts such as dimensional analysis, the Buckingham Pi theorem, types of mathematical models including static vs dynamic, discrete vs continuous, deterministic vs probabilistic, and linear vs nonlinear models. Examples of mathematical modeling applications in various fields like physics, engineering, biology and combat modeling are provided. The modeling process from defining the real-world problem to model formulation, solution, evaluation and refinement is outlined. Dimensional analysis using Rayleigh, Buckingham and Bridgman methods is explained through a heat transfer example.
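The Rayleigh method named in the summary can be shown in a few lines. The simple-pendulum example below is ours, not taken from the document: assume the period T depends on length L, mass m, and gravitational acceleration g as a product of powers, then match dimensions.

```latex
% Rayleigh-method example (illustrative, not from the slides):
% period T of a simple pendulum as a function of L, m, g.
T = C\, L^{a}\, m^{b}\, g^{c}
\quad\Rightarrow\quad
[\mathrm{T}] = [\mathrm{L}]^{a}\,[\mathrm{M}]^{b}\,[\mathrm{L\,T^{-2}}]^{c}
% Equating exponents of each base dimension:
%   M:  b = 0
%   L:  a + c = 0
%   T:  -2c = 1  \Rightarrow  c = -\tfrac{1}{2},\; a = \tfrac{1}{2}
T = C\,\sqrt{L/g}
```

Dimensional analysis fixes the form up to the dimensionless constant C (which is 2π for small oscillations); the Buckingham Pi theorem generalizes this bookkeeping to problems with many variables.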
ReComp and P4@NU: Reproducible Data Science for Health – Paolo Missier
A brief overview of the ReComp project (http://recomp.org.uk) on selective recurring re-computation of complex analytics, and a brief outlook for the P4@NU project on seeking digital biomarkers for age-related metabolic diseases.
Invited to introduce Docker to the Dept. for Information and Communication Services (Informations- und Kommunikationsdienste - IuK) at the University of Rostock.
More Related Content
Similar to Improving Reproducibility and Reuse of Modelling Results in the Life Sciences
The document presents a theoretical approach to engineering bioinspired systems through the design and characterization of non-coded amino acids. It discusses the characterization of arginine surrogates like c5Arg and (βPro)Arg through computational methods. It also characterizes indoline carboxylic acid and α-methyl indoline carboxylic acid as proline surrogates, analyzing their conformational preferences and energies using different theoretical methods and solvent models.
This document provides information about Bio-Modeling Systems SAS, including contact details and an overview of their services. It discusses the need to change the drug discovery paradigm to address issues like high failure rates and unreliable research. Bio-Modeling Systems provides a platform called CADITM Discovery that uses a mechanisms-based approach and principles like negative selection to model biological complexity and address uncertainties in scientific data. The goal is to improve medical research and validation processes to enhance patient health outcomes.
Model Management in Systems Biology: Challenges – Approaches – SolutionsMartin Scharm
I gave this talk as a webinar in the FAIRDOM webinar series 2016. The recordings of the webinar are available from http://fair-dom.org/knowledgehub/webinars-2/martin-scharm/
Introduction to chemical kinetics - WT/EBI course systems biology 2018Nicolas Le Novère
The document discusses modelling chemical kinetics and reactions. It describes how processes can be combined in ordinary differential equations (ODEs) for deterministic simulations or transformed into propensities for stochastic simulations. It also discusses the law of mass action, enzyme kinetics including Michaelis-Menten and Hill equations, and homeostasis - maintaining stable levels through negative feedback loops.
2. leiviskä k (1996) simulation in pulp and paper industry. february 1996Huy Nguyen
This document discusses simulation in the pulp and paper industry. It provides an overview of how mathematical models and simulation are used at different levels, including research and development, product and process design, control engineering, and process operation. Simulation is employed for tasks such as disturbance analysis, start-up and shut-down planning, real-time control and optimization, and operator training. The document also covers the basics of simulation, including model classification, selection, scope, and solution principles.
This talk was part of the 2020 Disease Map Modeling Community meeting, covering the steps towards publishing reproducible simulation studies (based on a reused model). Links to different COMBINE guidelines, tutorials and efforts. Grants: European Commission: EOSCsecretariat.eu - EOSCsecretariat.eu (831644)
Paper Annotated: SinGAN-Seg: Synthetic Training Data Generation for Medical I... – Devansh16
YouTube video: https://www.youtube.com/watch?v=Ao-19L0sLOI
SinGAN-Seg: Synthetic Training Data Generation for Medical Image Segmentation
Vajira Thambawita, Pegah Salehi, Sajad Amouei Sheshkal, Steven A. Hicks, Hugo L.Hammer, Sravanthi Parasa, Thomas de Lange, Pål Halvorsen, Michael A. Riegler
Processing medical data to find abnormalities is a time-consuming and costly task, requiring tremendous effort from medical experts. Therefore, AI has become a popular tool for the automatic processing of medical data, acting as a supportive tool for doctors. AI tools depend heavily on data for training the models. However, there are several constraints on access to large amounts of medical data for training machine learning algorithms in the medical domain, e.g., due to privacy concerns and the costly, time-consuming medical data annotation process. To address this, in this paper we present a novel synthetic data generation pipeline called SinGAN-Seg to produce synthetic medical data with the corresponding annotated ground truth masks. We show that this synthetic data generation pipeline can be used to bypass privacy concerns and as an alternative way to produce artificial segmentation datasets with corresponding ground truth masks, avoiding the tedious medical data annotation process. As a proof of concept, we used an open polyp segmentation dataset. By training UNet++ on both the real polyp segmentation dataset and the corresponding synthetic dataset generated by the SinGAN-Seg pipeline, we show that the synthetic data can achieve performance very close to that of the real data when the real segmentation datasets are large enough. In addition, we show that synthetic data generated by the SinGAN-Seg pipeline improve the performance of segmentation algorithms when the training dataset is very small. Since the SinGAN-Seg pipeline is applicable to any medical dataset, it can be used with any other segmentation dataset.
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Cite as: arXiv:2107.00471 [eess.IV]
(or arXiv:2107.00471v1 [eess.IV] for this version)
Statistical modeling in pharmaceutical research and development – PV. Viji
Statistical modeling in pharmaceutical research and development: Statistical Modeling, Descriptive versus Mechanistic Modeling, Statistical Parameter Estimation, Confidence Regions, Non-Linearity at the Optimum, Sensitivity Analysis, Optimal Design, Population Modeling
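As an illustration of nonlinear parameter estimation with approximate uncertainty information, one can fit a simple Emax-type dose-response model and read standard errors off the estimated covariance matrix. This sketch is not taken from the presentation; the model, doses and noise level are synthetic assumptions.

```python
import numpy as np
from scipy.optimize import curve_fit

# Sketch: least-squares estimation of a nonlinear Emax model
# E(d) = Emax * d / (EC50 + d), with approximate parameter
# standard errors from the covariance matrix. Data are synthetic.

def emax_model(dose, emax, ec50):
    return emax * dose / (ec50 + dose)

rng = np.random.default_rng(0)
dose = np.array([0.1, 0.3, 1.0, 3.0, 10.0, 30.0])
true_emax, true_ec50 = 100.0, 2.0
response = emax_model(dose, true_emax, true_ec50) + rng.normal(0, 2, dose.size)

params, cov = curve_fit(emax_model, dose, response, p0=[80.0, 1.0])
stderr = np.sqrt(np.diag(cov))  # approximate standard errors
```

The diagonal of the covariance matrix gives only a local (linearised) uncertainty estimate; the "non-linearity at the optimum" topic listed above concerns exactly when such linearised confidence regions become unreliable.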
LIFE EXPECTANCY PREDICTION FOR POST THORACIC SURGERY – IRJET Journal
This document discusses using machine learning algorithms to predict life expectancy after thoracic surgery. Researchers used attribute ranking and selection methods to identify the most important attributes from a dataset of patient health records. They evaluated algorithms like logistic regression and random forest on the reduced dataset. Logistic regression achieved the highest prediction accuracy of 85%. The goal was to more accurately predict mortality risk based on a patient's underlying health issues and attributes related to lung cancer.
IRJET-Automatic RBC And WBC Counting using Watershed Segmentation Algorithm – IRJET Journal
This document presents a method for automatically counting red blood cells (RBCs) and white blood cells (WBCs) using image processing techniques. It discusses the limitations of conventional manual counting methods and proposes a software-based watershed segmentation algorithm to segment and count blood cells from microscope images. The algorithm involves preprocessing the image, applying filters, segmenting cells using markers and boundaries, and counting the segmented cells. Experimental results found the automatic method took 14.43 seconds on average and achieved 94.58% accuracy, faster and more accurate than manual counting. This software-based solution provides a low-cost alternative for blood cell analysis in medical laboratories.
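The segment-and-count idea can be sketched compactly. Note this toy uses plain thresholding plus connected-component labelling rather than the paper's full marker-based watershed (so it cannot split touching cells), and the image is synthetic.

```python
import numpy as np
from scipy import ndimage

# Simplified sketch of automated cell counting: threshold the image,
# label connected components, and count them. Watershed splitting of
# touching cells is omitted; the toy image has well-separated "cells".

image = np.zeros((20, 20))
image[2:5, 2:5] = 1.0       # cell 1
image[10:14, 4:8] = 1.0     # cell 2
image[15:18, 14:18] = 1.0   # cell 3

binary = image > 0.5                      # preprocessing: fixed threshold
labels, n_cells = ndimage.label(binary)   # connected-component labelling
# n_cells == 3
```

In the paper's pipeline, the labelled regions would instead seed watershed markers so that overlapping blood cells are separated before counting.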
The Cytell Cell Imaging System is a compact, automated cell imaging system that combines digital microscopy, image cytometry and cell counting capabilities. It provides high quality cellular and subcellular image data and analysis through the use of preconfigured biological applications. These applications simplify routine assays and tasks while generating detailed experimental reports. The system allows non-experts to perform assays but also provides flexibility for experienced researchers. It captures multi-channel fluorescence and brightfield images to analyze cells in a variety of sample formats.
IRJET - IoT based Steroid Measurement in Milk Products – IRJET Journal
The document discusses an IoT-based system to measure steroid levels in milk products. It aims to address the issue of dairy farmers illegally injecting large amounts of steroids into cattle to increase milk production, which can harm human health. The proposed system uses sensors to analyze steroid levels in cattle and milk at various stages, and indicates which organs may be affected based on the chemicals detected. It analyzes the data using machine learning algorithms and displays the results through a graphical interface. The system aims to more accurately detect illegal steroid usage and protect consumer health.
An introductory workshop about machine learning in chemistry. This workshop is a set of slides and jupyter notebooks intended to give an overview of machine learning in chemistry to graduate students in chemical sciences, which was originally presented during a research trip to Ben Gurion University and the Hebrew University in Jerusalem in February 2019. Part 1 of 2.
The workshop lives at https://github.com/jpjanet/ML-chem-workshop where it is maintained in an up-to-date fashion. Notebook examples can be obtained from the GitHub page.
Economy and forecast for 2020: 3 key trends in the future – eSAT Journals
Abstract: The article deals with 3 key trends in the future and their general implications, including 3D, RFID, Business Intelligence and new managerial positions. By 2020, 3D could replace conventional mass production. The basic trends in RFID applications will be: RFID wearables, RFID on merchandise, Host Card Emulation (HCE) payment solutions, printed RFID technology, and RFID chips tracking everyone everywhere in the near future. Business intelligence will be transformed into general intelligence. The contribution covers the following topics: selected aspects of economy and social aspects of information systems, and complex technological and human issues in today's globalized and interconnected world, presenting new results in a diffuse way. Key words: 3 key trends, 3D, RFID, Business Intelligence, Computer Feudal Monarchy, New Managerial Positions. JEL Classification: A10, A11, A19, E27, E69
Data analytics and software sensors for single use bioprocessing – Exputec
This document discusses data analytics and software sensors for single-use bioprocessing. It provides an overview of Exputec, which offers solutions to improve biopharmaceutical development and production processes using data analytics. Software sensors are described as using models to monitor process variables like substrates and metabolites, complementing hardware sensors. Challenges of data analytics include data integration from collaborators and dealing with outliers. The document advocates for using robust multivariate statistics rather than removing outliers to avoid incorrect results from data analytics.
Towards Non-invasive Labour Detection: A Free-Living Evaluation – Marco Altini
Slides of my presentation at EMBC 2018, more info on this research is available here: https://www.researchgate.net/project/Bloomlife-improving-prenatal-health-through-longitudinal-physiological-monitoring-at-large-scale?_sg=pbraocCDNc2lJd9v5GESvRhkmffW99OTeeNMkalglCirK5r-ZECp2XRy_5Otk-_B-_dlCalxvKUVtex9MkAHUPFKhHT56GfrO6h3
Statistical quality control applied to industrial and manufacturing operations. A case study regarding the use of these tools, with a description of the statistical tools used in quality control and inspection.
This document provides an introduction to mathematical modeling. It discusses key concepts such as dimensional analysis, the Buckingham Pi theorem, types of mathematical models including static vs dynamic, discrete vs continuous, deterministic vs probabilistic, and linear vs nonlinear models. Examples of mathematical modeling applications in various fields like physics, engineering, biology and combat modeling are provided. The modeling process from defining the real-world problem to model formulation, solution, evaluation and refinement is outlined. Dimensional analysis using Rayleigh, Buckingham and Bridgman methods is explained through a heat transfer example.
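The Buckingham Pi theorem mentioned above reduces to linear algebra on the dimension matrix: the number of independent dimensionless groups equals the number of variables minus the rank, and null-space vectors give the group exponents. Here is a small sketch for the simple pendulum, an assumed example chosen for illustration (the document's own worked example is a heat transfer problem).

```python
import numpy as np

# Buckingham Pi sketch for the simple pendulum: variables are the
# period t, length l, gravity g and mass m. Rows of the dimension
# matrix are the base dimensions M, L, T; columns are the exponents
# of each variable in those dimensions.
dims = np.array([
    #  t   l   g   m
    [  0,  0,  0,  1],   # M
    [  0,  1,  1,  0],   # L
    [  1,  0, -2,  0],   # T
])

# Number of independent dimensionless groups = variables - rank.
n_vars = dims.shape[1]
rank = np.linalg.matrix_rank(dims)
n_groups = n_vars - rank   # = 1 for the pendulum

# A null-space basis vector of the dimension matrix gives the
# exponents of the single Pi group.
_, _, vt = np.linalg.svd(dims)
pi = vt[-1]                # proportional to (2, -1, 1, 0)
pi = pi / pi[0] * 2        # normalise so the exponent of t is 2
# pi ≈ [2, -1, 1, 0]  →  Pi = t^2 g / l is dimensionless,
# recovering t ∝ sqrt(l/g); the mass drops out entirely.
```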
ReComp and P4@NU: Reproducible Data Science for Health – Paolo Missier
A brief overview of the ReComp project (http://recomp.org.uk) on selective recurring re-computation of complex analytics, and a brief outlook for the P4@NU project on seeking digital biomarkers for age-related metabolic diseases.
Similar to Improving Reproducibility and Reuse of Modelling Results in the Life Sciences (20)
Invited to introduce Docker to the Dept. for Information and Communication Services (Informations- und Kommunikationsdienste - IuK) at the University of Rostock.
This document discusses metadata for the COMBINE archive format. It begins by explaining that the COMBINE archive allows sharing of all information needed to reproduce a modeling project in a single file. It then provides examples of metadata for models, simulations, and results within a COMBINE archive. Finally, it proposes extending the metadata by integrating additional ontologies such as PROV and PAV to capture more detailed information about models, simulations, people, and relationships between these elements.
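To make the metadata concrete, here is an illustrative sketch of the kind of RDF annotation a COMBINE archive's metadata file attaches to an archive entry. It is hand-rolled with ElementTree rather than the official libCombine API, and the entry path and date are made-up values; only the RDF and Dublin Core namespace URIs are standard.

```python
import xml.etree.ElementTree as ET

# Sketch: a minimal RDF metadata block annotating one archive entry
# with its creation date, in the style of a COMBINE archive's
# metadata file. Path and date are illustrative.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/terms/"

rdf = ET.Element(f"{{{RDF}}}RDF")
desc = ET.SubElement(rdf, f"{{{RDF}}}Description",
                     {f"{{{RDF}}}about": "./model/model.xml"})
created = ET.SubElement(desc, f"{{{DC}}}created")
w3cdtf = ET.SubElement(created, f"{{{DC}}}W3CDTF")
w3cdtf.text = "2018-08-30T00:00:00Z"

xml_bytes = ET.tostring(rdf)
```

The extension proposed in the talk would add further `Description` blocks drawing on PROV and PAV terms (e.g. who derived which model from which), alongside the Dublin Core terms shown here.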
Characterising differences between model versions – Martin Scharm
The document discusses characterizing differences between versions of computational models. It describes a system called BiVeS that identifies changes by mapping elements between two model versions hierarchically, then evaluating the mapping to find inserts, deletes, updates and moves. An ontology called COMODI is being built to classify model changes by type, target, reason, intention and more to help understand how and why models evolve.
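The detect-and-classify step can be illustrated with a toy id-based diff. BiVeS's real algorithm maps elements hierarchically and also detects moves, which this sketch omits; the two "model versions" below are invented miniature examples.

```python
import xml.etree.ElementTree as ET

# Toy sketch of diffing two model versions: map elements by their id
# attribute, then report inserts, deletes and updates. (BiVeS's real
# mapping is hierarchical and additionally detects moved elements.)

V1 = '<model><species id="Cyclin" amount="1"/><species id="Cdc2" amount="2"/></model>'
V2 = '<model><species id="Cyclin" amount="5"/><species id="Cdc2Cyclin" amount="1"/></model>'

def index_by_id(xml_text):
    return {e.get("id"): e.attrib for e in ET.fromstring(xml_text)}

def diff(old_xml, new_xml):
    old, new = index_by_id(old_xml), index_by_id(new_xml)
    return {
        "inserts": sorted(set(new) - set(old)),
        "deletes": sorted(set(old) - set(new)),
        "updates": sorted(i for i in set(old) & set(new) if old[i] != new[i]),
    }

changes = diff(V1, V2)
# → {'inserts': ['Cdc2Cyclin'], 'deletes': ['Cdc2'], 'updates': ['Cyclin']}
```

An ontology such as COMODI would then annotate each reported change with its type, target and, where known, the reason and intention behind it.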
HandsOn: git (or version control in general...) – Martin Scharm
The document discusses an invited talk on Git and version control approaches given in Rostock, Germany in 2015. It provides an overview of topics that were covered in the talk, including commits, branches, tags, undoing changes, diffs, logs, stashing, GitHub/Bitbucket, syncing across machines, files to ignore with .gitignore, Git modules, aliases, relative paths, common problems like detached heads and merge conflicts, collaboration techniques, and using Git in the cloud. Examples and images are linked to illustrate various points. The talk aimed to provide insights into national and international practices using version control systems.
The CellML models’ walk through the repository – Martin Scharm
This document summarizes a retrospective study of the CellML model repository. It found that the repository grew from around 500 models in 2007 to over 2,500 models in 2013. Version control was implemented to track model development over time, revealing over 18,000 total versions and 15,000 differences between versions. Common operations on models in the repository included updates, deletes, inserts, and moves of components. The study provided insights into the evolution of models in the repository.
CombineArchiveWeb -- web based tool to handle files associated with modelling... – Martin Scharm
The document discusses the CombineArchive Toolkit (CAT), a web-based tool for handling and sharing files associated with modeling results. CAT allows users to create, modify, explore, and share models and data through a single CombineArchive file format. It was created by researchers at the University of Rostock's Department of Systems Biology and Bioinformatics.
Improving the Management of Computational Models -- Invited talk at the EBI – Martin Scharm
Improving the Management of Computational Models:
storage – retrieval & ranking – version control
More information and slides to download at http://sems.uni-rostock.de/2013/12/martin-visits-the-ebi/
Talk by Martin Scharm at the COMBINE meeting September 2013 in Paris.
Find more information and download the slides at http://sems.uni-rostock.de/2013/09/sems-at-the-combine-2013/
Talk at the IPK in Gatersleben to initiate collaborations with the VANTED/e!Dal crews. The slides are also available at our website http://sems.uni-rostock.de/2013/05/research-visit-at-the-ipk-gatersleben/
This document describes BiVeS and BudHat, tools for version control of computational models. BiVeS detects differences between versions of models encoded in standardized formats like SBML. It maps common elements and constructs an XML diff file. BudHat visualizes these diffs, highlighting changes in reaction networks or providing a readable report. The tools are open source and intended to extend existing model repositories with version control capabilities. A demonstration of BiVeS and BudHat in action is provided.
BiVeS & BudHat -- Version Control for Computational Models @ All hands PALs M... – Martin Scharm
Talk about version control for computational models at the All hands PALs Meeting in Heidelberg by Martin Scharm, member of the SEMS project at the University of Rostock.
The slides are also available at our website http://sems.uni-rostock.de/2012/11/sems-at-the-pals-meeting-2012/
Improving Reproducibility and Reuse of Modelling Results in the Life Sciences
1. SYSTEMS BIOLOGY BIOINFORMATICS ROSTOCK
SEMS – simulation experiment management system
Improving Reproducibility and Reuse of Modelling Results in the Life Sciences
MARTIN SCHARM
Department of Systems Biology & Bioinformatics, University of Rostock
https://sbi.uni-rostock.de/scharm
Rostock, 30 August 2018
30 August 2018 Improving Reproducibility and Reuse of Modelling Results in the Life Sciences | Martin Scharm 1
2. Outline of my talk
[Figure: three models evolving through versions (Model 1: v1–v3, Model 2: v1–v4, Model 3: v1–v2); hypotheses are corrected and improved, and models are searched, retrieved, published and shared]
A method to characterise differences in computational models (Chapters 2 and 3)
Support for shareable and reproducible simulation studies (Chapter 4)
5. Computational models evolve
an example from real life
Novak and Tyson: Numerical analysis of a comprehensive model of M-phase control in Xenopus oocyte extracts and intact embryos. In Journal of Cell Science 1993 106:1153-1168
Models evolve: I have not just witnessed changes, but induced model evolution myself.
[Figure: SBGN diagram of the reaction Cyclin + Cdc2 → Cdc2Cyclin]
8. Computational models evolve
an example from real life
June 2007
The model with id BIOMD0000000107 was published in BioModels Database. It encodes the biological system described by Novak and Tyson in 1993.
9. Computational models evolve
an example from real life
June 2007 → June 2013
Along with many other models in BioModels Database, the model was released in multiple versions between 2007 and 2013.
10. Computational models evolve
an example from real life
[Figure: the reaction as of June 2007 and June 2013, with SBML-level differences highlighted; legend: differences, modifications, inserts, deletes]
SBML code has changed. The changes, however, do not affect the graphical representation of the reaction.
11. Computational models evolve
an example from real life
[Diagram: June 2007 → June 2013 (SBML code has changed) → November 2013 (model was corrected); after the correction the reaction reads Cyclin + Cdc2 → Cdc2Cyclin; legend: differences, modifications, inserts, deletes]
When studying the model I found an error in that reaction and corrected the model.
12. Computational models evolve
an example from real life
[Diagram: June 2007 (SBML code has changed) → June 2013 → November 2013 (model was corrected) → April 2015 (no modifications); legend: differences, modifications, inserts, deletes]
BioModels Database created many more releases, but none of them affected the reaction.
13. Computational models evolve
an example from real life
[Diagram: June 2007 (SBML code has changed) → June 2013 → November 2013 (model was corrected) → April 2015 (no modifications) → latest version in BioModels; legend: differences, modifications, inserts, deletes]
In 2016 my corrections were merged into BioModels Database.
14. Computational models evolve
even after publication
Scharm, Gebhardt, Touré, Bagnacani, Salehzadeh-Yazdi, Wolkenhauer and Waltemath: Evolution of computational models in
BioModels Database and the Physiome Model Repository. In BMC Systems Biology 2018 12:53.
[Chart: number of differences (log scale, 0 to 17,425) per model, for models BIOMD0000000001 through BIOMD0000000601, across the BioModels Database releases from Apr 2005 to Jun 2017]
15. Computational models evolve
changes need to be communicated
Waltemath, Henkel, Hälke, Scharm and Wolkenhauer: Improving the Reuse of Computational Models through Version Control.
In Bioinformatics 2013 29:6 742–748.
• Requirements for a version control system for computational models:
• it must be tailored to the structure of model documents,
• changes must be transparent, and versions must be unambiguously identifiable, addressable, and accessible,
• changes must be justified.
• Only with proper difference detection at hand are users able to grasp a model’s history and to identify errors and inconsistencies.
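One way to make versions unambiguously identifiable and addressable is content-based addressing, as used by git: the identifier is derived from the bytes of the document itself. A minimal, stdlib-only sketch (the function name and the truncation to 12 characters are illustrative choices, not part of any cited tool):

```python
# Sketch: derive a stable version identifier from the model text itself.
# Two byte-identical model files always get the same identifier;
# any change, however small, yields a different one.
import hashlib

def version_id(model_text: str) -> str:
    """Content hash of a model document, truncated for readability."""
    return hashlib.sha256(model_text.encode("utf-8")).hexdigest()[:12]

v1 = version_id('<parameter name="Km1" value="0.3"/>')
v2 = version_id('<parameter name="Km1" value="0.7"/>')
assert v1 != v2  # the changed value gives a new, unambiguous version id
```

Such identifiers make every version addressable without relying on release dates or manually assigned numbers.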
17. Versioning of models
[Timeline: June 2007 → June 2013 → November 2013 → April 2015 (latest version in BioModels)]
18. Versioning of models
[Timeline: June 2007 → June 2013 → November 2013 → April 2015 (latest version in BioModels)]
track versions ✓
BioModels Database collects
Physiome Model Repository focuses on
19. Versioning of models
[Timeline: June 2007 → June 2013 → November 2013 → April 2015 (latest version in BioModels)]
track versions ✓
What happened?
BioModels Database collects
Physiome Model Repository focuses on
20. Differences in models
detection and communication
Scharm, Wolkenhauer and Waltemath: An algorithm to detect and communicate the differences in computational models describing
biological systems. In Bioinformatics 2015 32:4 563-570
A method to detect and characterise differences in versions of computational models,
including
1. an algorithm to detect changes based on the ideas of the XyDiff algorithm,
2. output formats to communicate the changes,
3. an ontology to semantically describe changes.
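The core idea of step 1 can be sketched in a few lines. This is not the BiVeS implementation, only a toy XyDiff-style comparison: elements of two model versions are mapped by their `id` attribute, and everything unmatched is reported as an insert, delete, or update:

```python
# Toy sketch of an id-based tree comparison (NOT the BiVeS algorithm):
# map elements of two XML model versions by their "id" attribute,
# then report unmatched or changed elements.
import xml.etree.ElementTree as ET

def index_by_id(root):
    """Collect all elements that carry an id attribute."""
    return {e.get("id"): e for e in root.iter() if e.get("id")}

def diff(xml_v1, xml_v2):
    old = index_by_id(ET.fromstring(xml_v1))
    new = index_by_id(ET.fromstring(xml_v2))
    ops = []
    for nid, node in old.items():
        if nid not in new:
            ops.append(("delete", nid))          # node vanished
        elif node.attrib != new[nid].attrib:
            ops.append(("update", nid))          # node changed
    for nid in new:
        if nid not in old:
            ops.append(("insert", nid))          # node is new
    return ops

# The running example from the next slides: C + D -> E becomes D + H -> E.
v1 = '<model><species id="C"/><species id="D"/><species id="E"/></model>'
v2 = '<model><species id="D"/><species id="H"/><species id="E"/></model>'
print(sorted(diff(v1, v2)))  # [('delete', 'C'), ('insert', 'H')]
```

The actual algorithm goes further: it also maps nodes without ids, propagates the mapping through the trees, and evaluates candidate mappings, as the following slides illustrate.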
21. Differences in models
detecting changes
Scharm, Wolkenhauer and Waltemath: An algorithm to detect and communicate the differences in computational models describing
biological systems. In Bioinformatics 2015 32:4 563-570
[Tree diagrams of model version 1 (nodes A, B, C, D, E, F, G) and model version 2 (nodes A, B, D, H, E, F, G), each with a list of species and a list of reactions; the reaction C + D → E becomes D + H → E]
22. Differences in models
detecting changes
Scharm, Wolkenhauer and Waltemath: An algorithm to detect and communicate the differences in computational models describing
biological systems. In Bioinformatics 2015 32:4 563-570
[Initial mapping: subtrees of both versions are matched, e.g. by attributes such as id="cdc2" or id="listOfReactions", or by annotations such as GO:0000123]
23. Differences in models
detecting changes
Scharm, Wolkenhauer and Waltemath: An algorithm to detect and communicate the differences in computational models describing
biological systems. In Bioinformatics 2015 32:4 563-570
[Initial mapping between the trees of version 1 (A, B, C, D, E, F, G) and version 2 (A, B, D, H, E, F, G)]
24. Differences in models
detecting changes
Scharm, Wolkenhauer and Waltemath: An algorithm to detect and communicate the differences in computational models describing
biological systems. In Bioinformatics 2015 32:4 563-570
[Propagation: the initial mapping is propagated to neighbouring nodes in both trees]
25. Differences in models
detecting changes
Scharm, Wolkenhauer and Waltemath: An algorithm to detect and communicate the differences in computational models describing
biological systems. In Bioinformatics 2015 32:4 563-570
[Evaluation: candidate mappings are evaluated; unmatched nodes remain as differences]
26. Differences in models
detecting changes
Scharm, Wolkenhauer and Waltemath: An algorithm to detect and communicate the differences in computational models describing
biological systems. In Bioinformatics 2015 32:4 563-570
[Communication: the detected difference is reported — species C is replaced by H in the reaction producing E]
28. Differences in models
semantic characterisation
• Good communication of model changes can increase the trust in a model.
• COMODI — COmputational MOdels DIffer.
• The COMODI ontology provides terms to characterise changes in models.
29. Differences in models
the COMODI ontology
Scharm, Waltemath, Mendes and Wolkenhauer: COMODI: an ontology to characterise differences in versions of computational models in biology.
In Journal of Biomedical Semantics 2016 7:46.
[Ontology diagram — top-level classes: Change, XmlEntity, Target, Intention, Reason; example terms: insertion, deletion, update (Change); Node, Attribute (XmlEntity); Annotations, Setup, Reaction Network, Kinetics (Target); Correction, Simplification (Intention); Typo, Curation (Reason); object properties: wasTriggeredBy, affects, hasReason, hasIntention, appliesTo]
65 classes
5 object properties
purl.uni-rostock.de/comodi
BioPortal::COMODI
30. Differences in models
an example
<parameter name="Km1" value="0.3" units="molesperlitre" />
<parameter name="Km1" value="0.7" units="molesperlitre" />
31. Differences in models
an example
<parameter name="Km1" value="0.3" units="molesperlitre" />
<parameter name="Km1" value="0.7" units="molesperlitre" />
<update>
  <attribute id="1" name="value"
             oldPath="/sbml[1]/../parameter[1]"
             oldValue="0.3" newValue="0.7"
             [...] />
</update>
difference detection
32. Differences in models
an example
<parameter name="Km1" value="0.3" units="molesperlitre" />
<parameter name="Km1" value="0.7" units="molesperlitre" />
<update>
  <attribute id="1" name="value"
             oldPath="/sbml[1]/../parameter[1]"
             oldValue="0.3" newValue="0.7"
             [...] />
</update>
difference detection
#1 a comodi:Update ;
  comodi:appliesTo comodi:XmlAttribute ;
  comodi:affects comodi:ParameterSetup ;
  comodi:hasIntention comodi:Correction ;
  comodi:hasReason comodi:MismatchWithPublication .
annotation
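An annotation like the one above can be produced mechanically once a change has been classified. A stdlib-only sketch that assembles such a COMODI statement as Turtle text; the namespace URI below is an assumption (the authoritative one resolves via purl.uni-rostock.de/comodi), and the helper function is purely illustrative:

```python
# Sketch: emit a COMODI annotation for one detected change as Turtle.
# ASSUMPTION: the namespace URI is illustrative; resolve
# purl.uni-rostock.de/comodi for the authoritative one.
COMODI_NS = "http://purl.uni-rostock.de/comodi/comodi#"

def comodi_annotation(change_id, change_type, applies_to, affects,
                      intention, reason):
    """Build one COMODI statement describing a single change."""
    return "\n".join([
        f"@prefix comodi: <{COMODI_NS}> .",
        f"<#{change_id}> a comodi:{change_type} ;",
        f"  comodi:appliesTo comodi:{applies_to} ;",
        f"  comodi:affects comodi:{affects} ;",
        f"  comodi:hasIntention comodi:{intention} ;",
        f"  comodi:hasReason comodi:{reason} .",
    ])

print(comodi_annotation("1", "Update", "XmlAttribute", "ParameterSetup",
                        "Correction", "MismatchWithPublication"))
```

Because the result is plain RDF, such annotations can be stored alongside the model and queried later, e.g. for all changes whose intention was a correction.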
33. Applications
the BiVeS framework
BiVeS is a framework implementing the method to characterise differences in
computational models.
[Architecture diagram: BiVeS comprises BiVeS-Core with the BiVeS-SBML and BiVeS-CellML modules, building on jCOMODI and xmlutils; the BiVeS-WebApp and BiVeS-WebApp-Client communicate via HTTP]
34. Applications
the BiVeS framework
BiVeS is a framework implementing the method to characterise differences in
computational models.
[Same architecture diagram, highlighting access through the Java API]
35. Applications
the BiVeS framework
BiVeS is a framework implementing the method to characterise differences in
computational models.
[Same architecture diagram, highlighting the stand-alone executable]
36. Applications
the BiVeS framework
BiVeS is a framework implementing the method to characterise differences in
computational models.
[Same architecture diagram, highlighting the web interface]
37. Applications
in the systems biology domain
SEEK and the FAIRDOMHub
Physiome Model Repository
Cardiac Electrophysiology Web Lab
38. Outline of my talk
[Diagram: models evolve through versions (Model 1: v1–v3; Model 2: v1–v4; Model 3: v1–v2) in a cycle of publish & share, search & retrieve, correct, and improve, driven by hypotheses]
A method to characterise differences in
computational models (Chapters 2 and 3)
Support for shareable and reproducible
simulation studies (Chapter 4)
40. The COMBINE archive
one file to share them all
Bergmann, Adams, Moodie, Cooper, Glont and Scharm et al.: COMBINE archive and OMEX format: one file to share all information to reproduce
a modeling project. In BMC Bioinformatics 2014 15:369.
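Technically, a COMBINE archive (OMEX) is a ZIP file containing a manifest.xml that lists every entry together with its format. A minimal stdlib-only sketch; the file names and stub contents are made up for illustration, while the format URIs follow the identifiers.org/combine.specifications convention described in the paper:

```python
# Sketch: write a minimal COMBINE archive — a ZIP with a manifest.xml
# describing each entry. File names and contents here are illustrative.
import zipfile

MANIFEST = """<?xml version="1.0" encoding="UTF-8"?>
<omexManifest xmlns="http://identifiers.org/combine.specifications/omex-manifest">
  <content location="." format="http://identifiers.org/combine.specifications/omex"/>
  <content location="./model.xml" format="http://identifiers.org/combine.specifications/sbml"/>
  <content location="./simulation.xml" format="http://identifiers.org/combine.specifications/sed-ml"/>
</omexManifest>
"""

with zipfile.ZipFile("study.omex", "w") as omex:
    omex.writestr("manifest.xml", MANIFEST)                           # the table of contents
    omex.writestr("model.xml", "<sbml><!-- the model --></sbml>")     # stub SBML model
    omex.writestr("simulation.xml", "<sedML><!-- the experiment --></sedML>")  # stub SED-ML
```

Tools such as the CombineArchive Toolkit read the manifest to tell which entry is the model and which describes the simulation experiment, so one file really can carry the whole study.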
41. The COMBINE archive
a fully featured demo
Scharm and Waltemath: A fully featured COMBINE archive of a simulation study on syncytial mitotic cycles in Drosophila embryos.
In F1000Research 2016 5:2421.
42. Applications
employing COMBINE archives
CombineArchiveWeb application – manage modelling results
SED-ML web tools – generate, modify and export simulation studies
Extracting reproducible simulation studies
The Cardiac Electrophysiology Web Lab
43. Summary
[Diagram: models evolve through versions (Model 1: v1–v3; Model 2: v1–v4; Model 3: v1–v2) in a cycle of publish & share, search & retrieve, correct, and improve, driven by hypotheses]
• Models are subject to changes.
• A novel method to identify and characterise changes:
• tailored to the structure of models,
• changes become transparent,
• changes can be justified.
• Methods and tools support the sharing and reproduction of research results.
44. Many thanks to...
SYSTEMS BIOLOGY
BIOINFORMATICS
ROSTOCK
Olaf Wolkenhauer
Ali Salehzadeh-Yazdi
Holger Hennig
Tom Theile
Sherry Freiesleben
Markus Wolfien
Peggy Sterling
Ulf Liebal
Shailendra Gupta
SEMS – simulation experiment management system
Dagmar Waltemath
Martin Peters
Tom Gebhardt
Vasundra Touré
Mariam Nassar
Fabienne Lambusch
Ron Henkel
Vivek Garg
Srijana Kayastha
Jenny Fabian
Jonathan Cooper
Stian Soiland-Reyes
Natalie Stanford
Matthew Gamble
Gary Mirams
Carole Goble
Pedro Mendes
Tommy Yu
Frank Bergmann
Olga Krebs
Falk Schreiber
CellML
SBML
COPASI
FAIRDOM
Inkscape
LaTeX
Texstudio
Protégé
Java
Maven
Docker
Debian, GNU/Linux
Stack Exchange
Git, SVN, Hg
Python, Perl, PHP
Trac, Git(Hub|Lab), Wiki*
Icedove
Iceweasel
OpenSSH
Tomcat
Eclipse
Okular
Zsh
R
...and the audience!