http://sems.uni-rostock.de
Dagmar Waltemath
September 2015 – GCB, Dortmund
Management of simulation studies
in computational biology
(e:Bio Junior Research Group SEMS)
The project goal is to improve the reproducibility
of simulation studies in computational biology.
The number of models in open repositories
is steadily increasing, and so is their complexity.
Fig.: Models in BioModels. Chelliah et al. Nucl. Acids Res. 2015;43:D542-D548
Open model repositories are rich resources of
interlinked knowledge ready to be explored.
Simulation studies comprise several
heterogeneous files.
Many studies are not reproducible and thus
not reusable.
● The human factor
● Lack of standards and tool interoperability
● Lack of data availability
● Lack of provenance information
Figs. (left to right): (A) The Economist, Trouble at the lab, 2013; (B) JISC, www.jisc.ac.uk, CC BY-NC-ND 2.0 UK; (C) Diego Delso,
Wikimedia Commons, License CC-BY-SA 3.0; (D) C. Goble, Keynote ISMB/ECCB 2013 “Results may vary”, Slideshare
Waltemath and Wolkenhauer (2015) under review
SEMS: Model management for Computational Biology
Martin Scharm, Tom Gebhardt, Mariam Nassar, Martin Peters, Vasundra Touré, Ron Henkel, Fabienne Lambusch
MASYMOS
How can we link model-related data?
● Publication
● Model
● Analysis (Simulation)
● Data (experimental and simulation)
● Results (Figures, data tables...)
Types of model-related data
MASYMOS
Integration of model-related data
● Data from open repositories
(BioModels, PMR2, BioPortal)
● Links between data belonging to a
simulation study
● Graph database for integrated
storage (Neo4j)
Explicit linking
Henkel (2015) DATABASE
MASYMOS
Can we improve model search?
Model
Publication
Annotation
Person
Simulation
Show me models by Tyson,
dealing with the Cell Cycle and
simulating concentration of cdc2!
Ron Henkel
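The example query above combines three kinds of linked data: a person, an annotation, and a simulation description. The following is a minimal sketch of how such a query could be answered over explicitly linked data. It uses a small in-memory graph rather than Neo4j, and the node labels, relation names (`DESCRIBED_IN`, `HAS_AUTHOR`, `ANNOTATED_WITH`, `SIMULATES`) and sample records are assumptions made for illustration, not the MASYMOS schema.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    """Tiny property graph: labelled nodes and typed, directed edges."""
    nodes: dict = field(default_factory=dict)  # node id -> (label, properties)
    edges: set = field(default_factory=set)    # (source id, relation, target id)

    def add_node(self, node_id, label, **props):
        self.nodes[node_id] = (label, props)

    def add_edge(self, src, rel, dst):
        self.edges.add((src, rel, dst))

    def out(self, src, rel):
        """Targets reachable from src via relation rel."""
        return {d for s, r, d in self.edges if s == src and r == rel}

    def into(self, dst, rel):
        """Sources that point to dst via relation rel."""
        return {s for s, r, d in self.edges if d == dst and r == rel}


def find_models(g, author, topic, observed):
    """Models by `author`, annotated with `topic`, with a simulation observing `observed`."""
    hits = []
    for node_id, (label, _) in g.nodes.items():
        if label != "Model":
            continue
        authors = {g.nodes[p][1]["name"]
                   for pub in g.out(node_id, "DESCRIBED_IN")
                   for p in g.out(pub, "HAS_AUTHOR")}
        topics = {g.nodes[a][1]["term"] for a in g.out(node_id, "ANNOTATED_WITH")}
        observables = {o for sim in g.into(node_id, "SIMULATES")
                       for o in g.nodes[sim][1]["observes"]}
        if author in authors and topic in topics and observed in observables:
            hits.append(node_id)
    return sorted(hits)


# Hypothetical sample data (all identifiers are made up).
g = Graph()
for model, pub, person, name, sim, obs in [
    ("model_a", "pub_a", "person_a", "Tyson", "sim_a", ["cdc2"]),
    ("model_b", "pub_b", "person_b", "Doe", "sim_b", ["cyclin"]),
]:
    g.add_node(model, "Model")
    g.add_node(pub, "Publication")
    g.add_node(person, "Person", name=name)
    g.add_node(sim, "Simulation", observes=obs)
    g.add_edge(model, "DESCRIBED_IN", pub)
    g.add_edge(pub, "HAS_AUTHOR", person)
    g.add_edge(sim, "SIMULATES", model)
g.add_node("anno_cc", "Annotation", term="cell cycle")
g.add_edge("model_a", "ANNOTATED_WITH", "anno_cc")
g.add_edge("model_b", "ANNOTATED_WITH", "anno_cc")

result = find_models(g, "Tyson", "cell cycle", "cdc2")
```

Here only `model_a` satisfies all three conditions. In a graph database the same question would be a single path pattern instead of three separate lookups.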
M2CAT
How can we export search results?
● Search (example query: "ubiquitin"): query the database for annotations, persons, simulation descriptions
● Results: retrieve information about models, simulations, figures, documentation
● Export: export the simulation study as a COMBINE archive
● Download the archive and open the study with your favourite simulation tool
● Open the archive in CAT to modify its contents and to share it with others
● Simulate a study with just a single click and enrich your studies with simulation results (API communication)
Fig.: Exporting COMBINE Archives from MASYMOS. Bergmann (2014), Scharm (2015)
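The export step can be illustrated with a minimal sketch of what a COMBINE archive is on disk: a ZIP file whose `manifest.xml` lists every entry together with a format identifier. The helper names `build_manifest` and `write_archive` and the file contents are assumptions for illustration. This is not the M2CAT or CAT implementation, and a real export would also carry metadata and the actual model files.

```python
import io
import zipfile
from xml.sax.saxutils import quoteattr

# Format identifiers used in OMEX manifests.
FORMATS = {
    "omex": "http://identifiers.org/combine.specifications/omex",
    "manifest": "http://identifiers.org/combine.specifications/omex-manifest",
    "sbml": "http://identifiers.org/combine.specifications/sbml",
    "sedml": "http://identifiers.org/combine.specifications/sed-ml",
}


def build_manifest(entries):
    """Return manifest.xml text for a list of (location, kind) pairs."""
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        f'<omexManifest xmlns="{FORMATS["manifest"]}">',
        f'  <content location="." format="{FORMATS["omex"]}"/>',
        f'  <content location="./manifest.xml" format="{FORMATS["manifest"]}"/>',
    ]
    for location, kind in entries:
        # quoteattr escapes the value and adds the surrounding quotes.
        lines.append(
            f'  <content location={quoteattr("./" + location)} format="{FORMATS[kind]}"/>'
        )
    lines.append("</omexManifest>")
    return "\n".join(lines) + "\n"


def write_archive(target, files):
    """Write a ZIP archive; `files` maps location -> (kind, bytes)."""
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("manifest.xml",
                    build_manifest([(loc, kind) for loc, (kind, _) in files.items()]))
        for loc, (_, data) in files.items():
            zf.writestr(loc, data)


# Placeholder contents; a real study would contain the exported model and simulation files.
study = {
    "model.xml": ("sbml", b"<sbml/>"),
    "simulation.sedml": ("sedml", b"<sedML/>"),
}
buffer = io.BytesIO()
write_archive(buffer, study)

with zipfile.ZipFile(buffer) as zf:
    names = zf.namelist()
    manifest = zf.read("manifest.xml").decode("utf-8")
```

Writing to an in-memory buffer keeps the example free of side effects; passing a path such as `study.omex` to `write_archive` would write the archive to disk instead.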
MASYMOS
Ranking, Comparison, Clustering
Henkel (2011) BMC SysBiol
Alm (2015) JBMS
Ranking
[Fig.: point clusters; labels: "CellCycle Models", "BioModels"]
Clustering
Search II
● Which models contain feedback loops?
● How many models contain this specific submodule?
● Which models have 'ATP' as product?
Lambusch, BA (2015), Henkel (2014) SWAT4LS
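A structural question such as "Which models have 'ATP' as product?" can be illustrated with a minimal sketch that inspects SBML-like XML directly. This is an assumption-laden stand-in: MASYMOS answers such questions over its graph database, whereas the hypothetical function `models_with_product` below simply scans each document. The two toy models are made up and are not complete, valid SBML.

```python
import xml.etree.ElementTree as ET

SBML_NS = {"sbml": "http://www.sbml.org/sbml/level2/version4"}

# Toy document with a single reaction; reactant and product are filled in per model.
TEMPLATE = """<sbml xmlns="http://www.sbml.org/sbml/level2/version4" level="2" version="4">
  <model id="{model_id}">
    <listOfSpecies>
      <species id="s1" name="ADP" compartment="c"/>
      <species id="s2" name="ATP" compartment="c"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="r1">
        <listOfReactants><speciesReference species="{reactant}"/></listOfReactants>
        <listOfProducts><speciesReference species="{product}"/></listOfProducts>
      </reaction>
    </listOfReactions>
  </model>
</sbml>"""


def models_with_product(models, species_name):
    """Return ids of models in which a species named `species_name` is a reaction product."""
    hits = []
    for model_id, sbml_text in models.items():
        root = ET.fromstring(sbml_text)
        # Map species ids to their human-readable names (fall back to the id).
        names = {s.get("id"): s.get("name", s.get("id"))
                 for s in root.iterfind(".//sbml:species", SBML_NS)}
        products = root.iterfind(".//sbml:listOfProducts/sbml:speciesReference", SBML_NS)
        if any(names.get(ref.get("species")) == species_name for ref in products):
            hits.append(model_id)
    return sorted(hits)


models = {
    "toy_synthesis": TEMPLATE.format(model_id="toy_synthesis", reactant="s1", product="s2"),
    "toy_hydrolysis": TEMPLATE.format(model_id="toy_hydrolysis", reactant="s2", product="s1"),
}
atp_producers = models_with_product(models, "ATP")
```

Matching on the species name is fragile across models; matching on semantic annotations would be more robust, which is one motivation for storing annotations alongside model structure.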
BiVeS
Can we describe a model's evolution?
Fig.: Visualisation of the evolution of an early cell cycle model. Scharm (2015)
BiVeS
Difference Detection
Fig.: BudHat – Detecting differences between model versions. Scharm (2015)
Martin Scharm
BiVeS
Difference Detection
Fig.: BiVeS algorithm and output format. Scharm (2015)
Martin Scharm
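Difference detection between two model versions can be sketched as follows. This is a deliberately simplified illustration that matches XML elements by their `id` attribute and reports inserted, deleted and updated entities. It is not the BiVeS algorithm or its output format, and the function names and the two toy model versions are assumptions.

```python
import xml.etree.ElementTree as ET


def index_by_id(xml_text):
    """Map each element's id attribute to a copy of its attributes."""
    root = ET.fromstring(xml_text)
    return {el.get("id"): dict(el.attrib)
            for el in root.iter() if el.get("id") is not None}


def diff_versions(old_xml, new_xml):
    """Classify entities as deleted, inserted or updated between two versions."""
    old, new = index_by_id(old_xml), index_by_id(new_xml)
    return {
        "deleted": sorted(old.keys() - new.keys()),
        "inserted": sorted(new.keys() - old.keys()),
        "updated": sorted(i for i in old.keys() & new.keys() if old[i] != new[i]),
    }


# Two made-up versions of a toy model.
version_1 = (
    '<model id="m">'
    '<species id="cdc2" initialAmount="1"/>'
    '<species id="cyclin" initialAmount="0"/>'
    '</model>'
)
version_2 = (
    '<model id="m">'
    '<species id="cdc2" initialAmount="2"/>'
    '<species id="cdc25" initialAmount="0"/>'
    '</model>'
)
delta = diff_versions(version_1, version_2)
```

An id-based comparison misses moved or renamed elements and changes in the document hierarchy, which is why a dedicated tool that maps whole XML trees onto each other is needed for real model versions.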
COMODI
How can we characterise changes?
How do COmputational MOdels DIffer?
Fig.: Statistics on model versions in PMR2. Scharm (2015)
COMODI
How can we characterise changes?
How do COmputational MOdels DIffer?
Fig.: Working draft of COMODI, slides presented at the 2015 e:Bio Meeting.
Standards are the basis of our work.
Fig.: COMBINE standards today. Slide courtesy M. Hucka, Slideshare.
Special issue at JIB
COMBINE standards
Summary
● Methods: Ranked Retrieval, BiVeS, Model similarity
● Tools
● Standards
[Logos: Archive, VANTED]
Reproducible Science
Thank you for your attention.
HERMES-Forschungsförderung
der Universität Rostock
Tom Gebhardt
Mariam Nassar
Martin Peters
Martin Scharm
Dagmar Waltemath
Ron Henkel (de.NBI-SYSBIO)
Fabienne Lambusch
Vasundra Touré (SBGN-ED)
@SemsProject
