Reproducibility of model-based results: standards, infrastructure, and recognition.
Sep. 24, 2015
Written and presented by Dagmar Waltemath (University of Rostock) as part of the Reproducible and Citable Data and Models Workshop in Warnemünde, Germany. September 14th - 16th 2015.
http://sems.uni-rostock.de
What is a model?
Fig.: Modeling Cellular Reprogramming Using Network-based Models. Courtesy Antonio del Sol Mesa, LCSB Luxembourg
Fig.: Modeling the cell cycle using ODE systems. Goldbeter (1991), http://www.ncbi.nlm.nih.gov/pubmed/1833774
Fig.: Modeling large-scale networks. Lee et al (2013), http://www.nature.com/articles/srep02197.
In systems biology, a computational model represents biological facts in
the computer. Often, the representation is simulated to help understand
the system's dynamic behavior.
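To make "simulating a representation" concrete, here is a minimal sketch. It assumes a hypothetical toy model (a single conversion S → P with mass-action kinetics) and made-up parameter values; none of this is taken from the models shown in the figures, and the explicit Euler method is chosen only for brevity.

```python
def simulate_conversion(s0: float, k: float, dt: float, steps: int) -> list[tuple[float, float, float]]:
    """Integrate dS/dt = -k*S and dP/dt = +k*S with the explicit Euler method.

    Returns a list of (time, S, P) tuples. All values are hypothetical.
    """
    s, p = s0, 0.0
    trajectory = [(0.0, s, p)]
    for i in range(1, steps + 1):
        flux = k * s        # mass-action rate of S -> P
        s -= flux * dt      # substrate is consumed
        p += flux * dt      # product is formed at the same rate
        trajectory.append((i * dt, s, p))
    return trajectory


trajectory = simulate_conversion(s0=10.0, k=0.5, dt=0.01, steps=1000)
t_end, s_end, p_end = trajectory[-1]
print(f"t={t_end:.1f}  S={s_end:.4f}  P={p_end:.4f}")
```

In practice a simulation tool would read the model from a standard format and use a proper ODE solver; the loop above only illustrates what "dynamic behavior" means for such a model.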
Re[usea|produci]bility challenge (2003)
Slide courtesy Mike Hucka @ 2012 Computational Cell Biology Summer School
“With greater interaction between tools, and
a common format for publications and databases, users
would be better able to spend more time on actual research
rather than on struggling with data format issues.” (SBML L1)
→ Strategies for model similarity, ranking, clustering, filtering
Fig.: Henkel et al 2010 http://www.biomedcentral.com/1471-2105/11/423/
Fig.: Schulz et al 2011 DOI: 10.1038/msb.2011.41
Fig. panel title: CellCycle Models
Fig.: Alm et al (2014) doi:10.1186/s13326-015-0014-4
→ Retrieval and archiving of simulation studies and associated files
Model-related data in the systems biology workflow
Linking model-related data
– "Give me all the files I need to run this simulation study."
– "Which are the most frequently used GO annotations in my model set?"
– "Which models contain reactions with 'ATP' as reactant and 'ADP' as product?"
– "Find good candidates for features describing my set of models."
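The ATP/ADP question can be sketched against a single SBML file using only the Python standard library. The inline model, its identifiers, and the function names below are hypothetical, and matching on the species `name` attribute is a simplification: it is not how a model management system would necessarily answer the query, and annotation-based matching would likely be more robust.

```python
import xml.etree.ElementTree as ET

# Hypothetical, deliberately incomplete SBML fragment (not a validated model).
TOY_SBML = """
<sbml xmlns="http://www.sbml.org/sbml/level2/version4" level="2" version="4">
  <model id="toy_model">
    <listOfSpecies>
      <species id="s1" name="ATP" compartment="c"/>
      <species id="s2" name="ADP" compartment="c"/>
      <species id="s3" name="Glucose" compartment="c"/>
      <species id="s4" name="G6P" compartment="c"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="r1">
        <listOfReactants>
          <speciesReference species="s3"/>
          <speciesReference species="s1"/>
        </listOfReactants>
        <listOfProducts>
          <speciesReference species="s4"/>
          <speciesReference species="s2"/>
        </listOfProducts>
      </reaction>
      <reaction id="r2">
        <listOfReactants><speciesReference species="s2"/></listOfReactants>
        <listOfProducts><speciesReference species="s1"/></listOfProducts>
      </reaction>
    </listOfReactions>
  </model>
</sbml>
""".strip()


def local(tag: str) -> str:
    """Drop the XML namespace so the sketch does not depend on the SBML level."""
    return tag.rsplit("}", 1)[-1]


def names_in(reaction: ET.Element, list_tag: str, names: dict) -> set:
    """Collect the species names referenced in one side of a reaction."""
    return {
        names.get(ref.get("species"))
        for child in reaction
        if local(child.tag) == list_tag
        for ref in child
    }


def reactions_converting(sbml_text: str, reactant: str, product: str) -> list[str]:
    """Return ids of reactions that consume `reactant` and produce `product`."""
    root = ET.fromstring(sbml_text)
    names = {el.get("id"): el.get("name") for el in root.iter() if local(el.tag) == "species"}
    hits = []
    for el in root.iter():
        if local(el.tag) != "reaction":
            continue
        if reactant in names_in(el, "listOfReactants", names) and product in names_in(el, "listOfProducts", names):
            hits.append(el.get("id"))
    return hits


matches = reactions_converting(TOY_SBML, "ATP", "ADP")
print(matches)
```

Running the same function over every file in a repository would answer "which models" rather than "which reactions"; a graph or index over linked model data would avoid re-parsing each file.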
State of affairs in 2015
● Standards:
– Support for all steps of the modeling cycle
– Support for various modeling techniques
– Still: some modeling concepts are not yet covered (→ Report of the whole-cell modeling workshop, Waltemath et al 2015 (under review))
● Infrastructures:
– Software tools export/import standards
– Open model repositories and management systems
– Education
● Recognition
COMBINE Standards
● COmputational Modeling in BIology NEtwork
● Goals:
– Avoid overlap of standardisation efforts
– Coordinate standard developments
– Coordinate meetings
– Coordinate development of procedures & tools
– Provide a common infrastructure for specification development, semantic annotation, and dissemination
● All specifications now citable and accessible in one place: Schreiber et al. (2015) http://journal.imbio.de/articles/pdf/jib-258.pdf
COMBINE Standards
● Data formats
– Community-developed representation formats for models and related data
– Format: XML, OWL, RDF/XML
● Minimum Information/Reporting guidelines:
– Minimum amount of data and information required to reproduce and interpret an experiment
– Format: human-readable specification documents
● Basis for the specification of data models and metadata
● Bio-ontologies
SBML
Fig.: SBML Level 3 Packages. Slide courtesy M. Hucka (ICSB 2014).
Lucky modelers: you should not need to worry about the details of these (XML) formats;
the tools should handle import and export! (Tool developers should, though.)
Minimum Information Guidelines
● Reporting guidelines and checklists
● Narrative description of the information necessary to reproduce a model-based result
● MIRIAM: Minimum Information Requested In the Annotation of Models
● MIASE: Minimum Information About a Simulation Experiment
● MIAPE, MIAME, … for experimental setups
MIRIAM – information to provide about a model
● Models must
– be encoded in a public machine-readable format
– be clearly linked to a single publication
– reflect the structure of the biological processes described in the reference paper (list of reactions, …)
– be instantiable in a simulation (possess initial conditions, …)
– be able to reproduce the results given in the reference paper
– contain the creator's contact details
– unambiguously identify each model constituent through annotation
You should worry about the details of the guidelines,
as they help you to check whether you provide all necessary information.
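Parts of such a checklist can be tested mechanically. The sketch below covers only two of the items (species have initial values; species carry an annotation) on a hypothetical inline SBML fragment. It is an illustration under those assumptions, not a MIRIAM validator, and it ignores that real models may set initial values through initial assignments or rules.

```python
import xml.etree.ElementTree as ET

# Namespace of the hypothetical fragment below (SBML Level 2 Version 4).
NS = {"sbml": "http://www.sbml.org/sbml/level2/version4"}

# Hypothetical, incomplete fragment: s1 passes both checks, s2 fails both.
TOY_MODEL = """
<sbml xmlns="http://www.sbml.org/sbml/level2/version4" level="2" version="4">
  <model id="toy_model">
    <listOfSpecies>
      <species id="s1" name="ATP" compartment="c" initialConcentration="1.0">
        <annotation>an RDF annotation block would go here</annotation>
      </species>
      <species id="s2" name="ADP" compartment="c"/>
    </listOfSpecies>
  </model>
</sbml>
""".strip()


def checklist_gaps(sbml_text: str) -> list[str]:
    """Report species that lack an initial value or an annotation element."""
    root = ET.fromstring(sbml_text)
    gaps = []
    for species in root.findall(".//sbml:species", NS):
        sid = species.get("id")
        has_initial = (
            species.get("initialConcentration") is not None
            or species.get("initialAmount") is not None
        )
        if not has_initial:
            gaps.append(f"{sid}: no initial value")
        if species.find("sbml:annotation", NS) is None:
            gaps.append(f"{sid}: no annotation")
    return gaps


gaps = checklist_gaps(TOY_MODEL)
for gap in gaps:
    print(gap)
```

Items such as "linked to a single publication" or "reproduces the results of the reference paper" still need a human or a simulation run to confirm.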
Bio-ontologies for model annotation
● Major ontologies
● Linking framework: RDF/XML
● Annotation scheme: used to semantically enrich model files with detailed descriptions of the underlying biological entities, mathematical concepts or algorithms used during analysis
● De facto standard: SBML annotation scheme
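As a rough illustration of the RDF/XML linking framework, the sketch below builds one annotation block of the shape commonly used in SBML files: an `rdf:Description` pointing at an element's `metaid`, a biology qualifier (`bqbiol:is`), and an `rdf:Bag` of resources. The function name and the `metaid` value are hypothetical; the resource identifier uses the URN form that appears on the slides. In a real SBML file the `annotation` element sits in the SBML namespace and is normally written by a tool rather than by hand.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
BQBIOL = "http://biomodels.net/biology-qualifiers/"

# Ask ElementTree to serialize with the conventional prefixes.
ET.register_namespace("rdf", RDF)
ET.register_namespace("bqbiol", BQBIOL)


def build_annotation(metaid: str, resources: list[str], qualifier: str = "is") -> ET.Element:
    """Build an annotation element linking `metaid` to external resources."""
    annotation = ET.Element("annotation")
    rdf = ET.SubElement(annotation, f"{{{RDF}}}RDF")
    description = ET.SubElement(
        rdf, f"{{{RDF}}}Description", {f"{{{RDF}}}about": f"#{metaid}"}
    )
    relation = ET.SubElement(description, f"{{{BQBIOL}}}{qualifier}")
    bag = ET.SubElement(relation, f"{{{RDF}}}Bag")
    for uri in resources:
        ET.SubElement(bag, f"{{{RDF}}}li", {f"{{{RDF}}}resource": uri})
    return annotation


# "metaid_pah" is a made-up metaid for a species representing an enzyme.
annotation = build_annotation("metaid_pah", ["urn:miriam:uniprot:P00439"])
xml_text = ET.tostring(annotation, encoding="unicode")
print(xml_text)
```

Because the link is a plain URI, the same mechanism works for ontology terms (for example SBO terms) and for database entries alike.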
Bio-ontologies for model annotation
Fig.: Enzymatic reaction annotated with SBO terms
– enzyme (shown twice): urn:miriam:SBO:0000014
– substrate: urn:miriam:SBO:0000015
– product: urn:miriam:SBO:0000011
– catalytic rate constant: urn:miriam:SBO:0000025
– enzymatic rate law
Bio-ontologies for model annotation
Fig. labels: Tyrosine; Phenylalanine-4-hydroxylase (shown twice); Tetrahydrobiopterin
Fig. annotations: urn:miriam:uniprot:P00439 (shown twice); urn:miriam:uniprot:Q03393; urn:miriam:uniprot:P07101
Software tool support
● Standard converters (SBML ↔ SBGN; SBML ↔ CellML, ...)
● Standard support in software
● Interoperability tools
– Cytoscape for network analysis and visualization (SBML, SBGN, BioPAX)
– The Virtual Cell for modeling (SBML, BioPAX)
– VANTED for network analysis, visualization and manipulation (SBML, SBGN)
Check the COMBINE website for details
Model management systems
Fig.: The SEEK. Wolstencroft et al (2015). doi:10.1186/s12918-015-0174-y
Model management tasks:
● Storage & Integration of data
● Search & Retrieval
● Version Control
● Provenance
Getting involved
● COMBINE user meeting → next: COMBINE 2015, Oct 11-16, Salt Lake City
● COMBINE developers meeting → next: HARMONY 2016, June 7-11, Auckland
● FAIR-DOM activities: webinars, blogs, foundries
● COMBINE activities: workshops, presentations, tutorials
● Help through specification documents, showcases, mailing lists, ...
http://co.mbine.org/ http://fair-dom.org/
Functional curation of models through virtual experiments
Fig.: Functional curation of models in the Web Lab. Cooper et al (2015) https://peerj.com/preprints/1338/ ; Cooper et al (2014) doi:10.1016/j.pbiomolbio.2014.10.001
Try out the Cardiac physiology Web Lab
So much for the theory… and in practice?
● Check for existing standards and their specifications: http://co.mbine.org
● Get involved in standard development → through the relevant mailing lists
● Problems with getting your model into the right format?
– Is it a problem with finding the appropriate format or tool? → Ask on the relevant mailing list... people are friendly and happy to help.
– Is it a tool problem? → Report it to the tool developers... who will hopefully fix it.
– Is it a problem with the lack of a standard? → Feed back into the standards community… people are friendly and happy to improve the standard.
● Follow best practices when aiming at publishing a result.
Best practices for publishing reproducible modeling results
1) Encode the model in a standard format, e.g. SBML.
2) Annotate the SBML model, following MIRIAM.
3) Publish the simulation experiment descriptions in a standard format, e.g. SED-ML. If unsure what to include, consult the MIASE guidelines.
4) Try to reproduce the results *yourself*.
5) Ask a colleague to reproduce the results.
6) If successful: Archive all steps that led to your results.
7) Disseminate model code and simulation description through an open repository.
Adapted from: Waltemath et al (2013), doi:10.1007/978-94-007-6803-1_10
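The archiving step (6) can be sketched as bundling the model and simulation description into one file together with checksums, so a colleague can confirm nothing changed. This is a generic ZIP-plus-SHA-256 illustration using only the Python standard library; the file names, the `checksums.json` layout, and the function name are assumptions and do not follow any particular community archive format.

```python
import hashlib
import json
import tempfile
import zipfile
from pathlib import Path


def archive_study(files: list[Path], archive_path: Path) -> dict[str, str]:
    """Zip the given files together with a JSON map of their SHA-256 digests."""
    checksums = {f.name: hashlib.sha256(f.read_bytes()).hexdigest() for f in files}
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for f in files:
            zf.write(f, arcname=f.name)
        # Store the digests inside the archive so they travel with the files.
        zf.writestr("checksums.json", json.dumps(checksums, indent=2, sort_keys=True))
    return checksums


# Demonstration with placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    workdir = Path(tmp)
    model = workdir / "model.xml"               # stands in for the SBML model
    experiment = workdir / "experiment.sedml"   # stands in for the SED-ML description
    model.write_text("<sbml/>")
    experiment.write_text("<sedML/>")
    archive = workdir / "study.zip"
    checksums = archive_study([model, experiment], archive)
    with zipfile.ZipFile(archive) as zf:
        archived_names = sorted(zf.namelist())

print(archived_names)
```

Depositing such a bundle in an open repository (step 7) then gives others a single, verifiable starting point for reproducing the result.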