Reproducibility of model-based results: standards, infrastructure, and recognition.

FAIRDOM
Sep. 24, 2015

  1. http://sems.uni-rostock.de Dagmar Waltemath, September 2015, Rostock-Warnemünde | dcite Reproducibility of model-based results: Standards, infrastructure and recognition
  2. What is a model? Fig.: Modeling Cellular Reprogramming Using Network-based Models. Courtesy Antonio del Sol Mesa, LCSB Luxembourg. Fig.: Modeling the cell cycle using ODE systems. Goldbeter (1991), http://www.ncbi.nlm.nih.gov/pubmed/1833774 Fig.: Modeling large-scale networks. Lee et al (2013), http://www.nature.com/articles/srep02197 In systems biology, a computational model represents biological facts in the computer. Often, the representation is simulated to help understand the system's dynamic behavior.
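To make "the representation is simulated" concrete, here is a minimal sketch that is not taken from the slides: a hypothetical one-reaction model S → P with mass-action kinetics, integrated with forward Euler. The rate constant, initial amounts, step size, and the function name `simulate` are all assumptions chosen for illustration; real studies would use a dedicated simulation tool and a proper ODE solver.

```python
def simulate(s0=10.0, p0=0.0, k=0.5, dt=0.01, t_end=10.0):
    """Integrate dS/dt = -k*S and dP/dt = k*S with forward Euler.

    All parameter values are illustrative assumptions, not taken from any
    published model.
    """
    steps = int(round(t_end / dt))
    s, p = s0, p0
    trajectory = [(0.0, s, p)]
    for i in range(1, steps + 1):
        flux = k * s      # mass-action rate of S -> P
        s -= flux * dt    # substrate is consumed
        p += flux * dt    # product is formed
        trajectory.append((i * dt, s, p))
    return trajectory


if __name__ == "__main__":
    # Print every 200th time point of the simulated time course.
    for t, s, p in simulate()[::200]:
        print(f"t={t:5.2f}  S={s:7.4f}  P={p:7.4f}")
```

Even in this toy case, reproducing the time course requires knowing the equations, the parameter values, the initial conditions, and the numerical settings, which is the information the standards discussed in the following slides aim to capture.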
  3. Re[usea|produci]bility challenge. Slide courtesy Mike Hucka @ 2012 Computational Cell Biology Summer School
  4. Re[usea|produci]bility challenge. Slide courtesy Mike Hucka @ 2012 Computational Cell Biology Summer School. “With greater interaction between tools, and a common format for publications and databases, users would be better able to spend more time on actual research rather than on struggling with data format issues.”
  5. Re[usea|produci]bility challenge (2003). Slide courtesy Mike Hucka @ 2012 Computational Cell Biology Summer School. “With greater interaction between tools, and a common format for publications and databases, users would be better able to spend more time on actual research rather than on struggling with data format issues.” (SBML L1)
  6. → Standardised model representation. Ron Henkel et al. Database 2015;2015:bau130
  7. Re[usea|produci]bility challenge (2010): Finding relevant models. Fig.: Nature Blogs: Of Schemes and Dreams (2014) Nine Worrying Stats on the Effect of Poor Scientific Data Management. Vijayalakshmi Chelliah et al. Nucl. Acids Res. 2015;43:D542-D548
  8. → Strategies for model similarity, ranking, clustering, filtering. Fig.: Henkel et al 2010 http://www.biomedcentral.com/1471-2105/11/423/ Fig.: Schulz et al 2011 DOI: 10.1038/msb.2011.41 Fig. (scatter of models with a cluster labelled "CellCycle Models"): Alm et al (2014) doi:10.1186/s13326-015-0014-4
  9. Re[usea|produci]bility challenge (2012): Reproducing published models.
  10. → Standardised simulation descriptions. Fig.: Waltemath et al (2011) doi:10.1186/1752-0509-5-198
  11. Re[usea|produci]bility challenge (2014): Linking the relevant files. Fig.: Model-related data in the systems biology workflow.
  12. → Retrieval and archiving of simulation studies and associated files. Fig.: Model-related data in the systems biology workflow. Linking model-related data. Example queries: "Give me all the files I need to run this simulation study." "Which are the most frequently used GO annotations in my model set?" "Which models contain reactions with 'ATP' as reactant and 'ADP' as product?" "Find good candidates for features describing my set of models."
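Two of the example queries on this slide can be sketched over a toy, hand-written model index. This is only an illustration under stated assumptions: the `MODELS` list, its field names, and both function names are hypothetical, and a real system would extract this information from the model files and their annotations rather than from a Python list.

```python
from collections import Counter

# Hypothetical in-memory index of three models (illustrative data only).
MODELS = [
    {"id": "model_A",
     "annotations": ["GO:0006096", "GO:0007049"],
     "reactions": [{"reactants": ["glucose", "ATP"],
                    "products": ["G6P", "ADP"]}]},
    {"id": "model_B",
     "annotations": ["GO:0007049"],
     "reactions": [{"reactants": ["cyclin"], "products": []}]},
    {"id": "model_C",
     "annotations": ["GO:0006096"],
     "reactions": [{"reactants": ["ADP", "Pi"], "products": ["ATP"]}]},
]


def models_with_reaction(models, reactant, product):
    """Return ids of models having a reaction with the given reactant and product."""
    return [m["id"] for m in models
            if any(reactant in r["reactants"] and product in r["products"]
                   for r in m["reactions"])]


def most_common_annotations(models, n=3):
    """Return the n most frequent annotation identifiers with their counts."""
    counts = Counter(a for m in models for a in m["annotations"])
    return counts.most_common(n)


if __name__ == "__main__":
    print(models_with_reaction(MODELS, "ATP", "ADP"))
    print(most_common_annotations(MODELS))
```

Matching on plain names such as 'ATP' only works if every model uses the same spelling; matching on annotation identifiers avoids that problem, which is one motivation for the semantic annotation discussed later in the talk.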
  13. State of affairs in 2015 ● Standards: – support for all steps of the modeling cycle – support of various modeling techniques – Still: some modeling concepts not yet covered (→ Report of Whole Cell modeling workshop, Waltemath et al 2015 (under review)) ● Infrastructures: – Software tools export/import standards – Open model repositories and management systems – Education ● Recognition
  14. COMBINE Standards ● COmputational Modeling in BIology NEtwork ● Goals: – Avoid overlap of standardisation efforts – Coordinate standard developments – Coordinate meetings – Coordinate development of procedures & tools – Common infrastructure for specification development, semantic annotation, and dissemination ● All specifications now citable and accessible in one place: Schreiber et al. (2015) http://journal.imbio.de/articles/pdf/jib-258.pdf
  15. COMBINE Standards Fig.: COMBINE standards today. Slide courtesy M. Hucka. http://www.slideshare.net/thehuck/a-summary-of-various-combine-standardization-activities
  16. COMBINE Standards ● Data formats – Community-developed representation formats for models and related data – Format: XML, OWL, RDF/XML ● Minimum Information/Reporting guidelines: – Minimum amount of data and information required to reproduce and interpret an experiment – Format: human-readable specification documents ● Basis for the specification of data models and metadata ● Bio-ontologies
  17. SBML Fig.: SBML Level 3 Packages. Slide courtesy M. Hucka (ICSB 2014).
  18. SBML Fig.: SBML Level 3 Packages. Slide courtesy M. Hucka (ICSB 2014). Lucky modelers: you should not need to worry about the details of these (XML) formats; the tools should handle import and export! (Tool developers should, though.)
  19. Minimum Information Guidelines ● Reporting guidelines and checklists ● Narrative description of the information necessary to reproduce a model-based result ● MIRIAM: Minimum Information about the Annotation of a Model ● MIASE: Minimum Information about a Simulation Experiment ● MIAPE, MIAME, … for experimental setups
  20. MIRIAM – information to provide about a model ● Models must – be encoded in a public machine-readable format – be clearly linked to a single publication – reflect the structure of the biological processes described in the reference paper (list of reactions, …) – be instantiable in a simulation (possess initial conditions, …) – be able to reproduce the results given in the reference paper – contain creator’s contact details – unambiguously identify each model constituent through annotation
  21. MIRIAM – information to provide about a model ● Models must – be encoded in a public machine-readable format – be clearly linked to a single publication – reflect the structure of the biological processes described in the reference paper (list of reactions, …) – be instantiable in a simulation (possess initial conditions, …) – be able to reproduce the results given in the reference paper – contain creator’s contact details – unambiguously identify each model constituent through annotation You should worry about the details of the guidelines, as they help you check whether you have provided all necessary information.
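Part of this checklist can be tracked mechanically. The sketch below is a hypothetical illustration, not a MIRIAM tool: the record layout, the field names, and the function `missing_miriam_items` are assumptions. It only checks that the corresponding information has been recorded at all; whether the model reflects the reference paper and reproduces its results still has to be verified by running and inspecting the model.

```python
# Hypothetical metadata fields mapped to the checklist item they stand for.
REQUIRED_FIELDS = {
    "format": "is encoded in a public machine-readable format",
    "reference_publication": "is linked to a single publication",
    "initial_conditions": "is instantiable in a simulation",
    "creator_contact": "contains the creator's contact details",
    "annotations": "identifies each constituent through annotation",
}


def missing_miriam_items(record):
    """Return the checklist items whose field is absent or empty in record."""
    return [description
            for field, description in REQUIRED_FIELDS.items()
            if not record.get(field)]


if __name__ == "__main__":
    example = {
        "format": "SBML",
        "reference_publication": "http://www.ncbi.nlm.nih.gov/pubmed/1833774",
        "initial_conditions": True,
        "creator_contact": "",  # deliberately left empty
        "annotations": ["urn:miriam:uniprot:P00439"],
    }
    for item in missing_miriam_items(example):
        print("Missing: model", item)
```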
  22. Bio-ontologies for model annotation ● Major ontologies ● Linking framework: RDF/XML ● Annotation scheme: used to semantically enrich model files with detailed descriptions of the underlying biological entities, mathematical concepts or algorithms used during analysis ● De facto standard: SBML annotation scheme
  23. Bio-ontologies for model annotation. Fig. labels: enzyme, enzyme, product, substrate, enzymatic rate law, catalytic rate constant. Fig. annotations: urn:miriam:SBO:0000011, urn:miriam:SBO:0000014, urn:miriam:SBO:0000014, urn:miriam:SBO:0000025, urn:miriam:SBO:0000015
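As a rough illustration of how such SBO annotations can be read back from a model file, the sketch below parses a hand-written, heavily trimmed SBML-like fragment with Python's standard XML parser. The fragment is an assumption made for this example (it is not a complete, validated SBML document and does not come from the slides), and the pairing of SBO identifiers with labels in `SBO_LABELS` is my reading of the ontology and should be checked against SBO itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed fragment: one reaction whose participants carry
# sboTerm attributes describing their role.
SBML_FRAGMENT = """
<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core" level="3" version="1">
  <model id="toy">
    <listOfReactions>
      <reaction id="r1">
        <listOfReactants>
          <speciesReference species="S" sboTerm="SBO:0000015"/>
        </listOfReactants>
        <listOfProducts>
          <speciesReference species="P" sboTerm="SBO:0000011"/>
        </listOfProducts>
        <listOfModifiers>
          <modifierSpeciesReference species="E" sboTerm="SBO:0000014"/>
        </listOfModifiers>
      </reaction>
    </listOfReactions>
  </model>
</sbml>
"""

# Assumed label lookup; verify against the Systems Biology Ontology.
SBO_LABELS = {
    "SBO:0000011": "product",
    "SBO:0000014": "enzyme",
    "SBO:0000015": "substrate",
    "SBO:0000025": "catalytic rate constant",
}


def sbo_roles(sbml_text):
    """Map each referenced species id to the label of its sboTerm."""
    root = ET.fromstring(sbml_text)
    roles = {}
    for element in root.iter():
        term = element.get("sboTerm")
        species = element.get("species")
        if term and species:
            # Fall back to the raw identifier if the label is unknown.
            roles[species] = SBO_LABELS.get(term, term)
    return roles


if __name__ == "__main__":
    print(sbo_roles(SBML_FRAGMENT))
```

In practice a dedicated SBML library would be the usual choice; the point here is only that the role of each participant is machine-readable once it is annotated.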
  24. Bio-ontologies for model annotation. Fig. labels: Tyrosine, Phenylalanine-4-hydroxylase, Phenylalanine-4-hydroxylase, Tetrahydrobiopterin. Fig. annotations: urn:miriam:uniprot:P00439, urn:miriam:uniprot:Q03393, urn:miriam:uniprot:P07101, urn:miriam:uniprot:P00439
  25. Levels of standardisation Fig.: COMBINE standards that are relevant to this workshop; adapted from Chelliah et al. (2009), DILS
  26. State of affairs in 2015 ● Standards: – support for all steps of the modeling cycle – support of various modeling techniques – Still: some modeling concepts not yet covered (→ Report of Whole Cell modeling workshop, Waltemath et al 2015 (under review)) ● Infrastructures: – Software tools export/import standards – Open model repositories and management systems – Education ● Recognition
  27. Software tool support ● Standard converters (SBML ↔ SBGN; SBML ↔ CellML, ...) ● Standard support in software ● Interoperability tools – Cytoscape for network analysis and visualization (SBML, SBGN, BioPAX) – The Virtual Cell for modeling (SBML, BioPAX) – VANTED for network analysis, visualization and manipulation (SBML, SBGN) Check the COMBINE website for details.
  28. Software tool support in SBML Fig.: Software supporting SBML. Slide courtesy M. Hucka (ICSB 2014). Also check the SBML Software Matrix.
  29. Open model repositories ● Structured, type-specific archives ● Offer download of curated, annotated, published models and associated files (visual representations, simulation descriptions, publication…) CCDB
  30. Model management systems Fig.: The SEEK. Wolstencroft et al (2015). doi:10.1186/s12918-015-0174-y Model management tasks: ● Storage & Integration of data ● Search & Retrieval ● Version Control ● Provenance
  31. Getting involved ● COMBINE user meeting → next: COMBINE 2015, Oct 11-16, Salt Lake City ● COMBINE developers meeting → next: HARMONY 2016, June 7-11, Auckland ● FAIR-DOM activities: webinars, blogs, foundries ● COMBINE activities: workshops, presentations, tutorials ● Help through specification documents, showcases, mailing lists, ... http://co.mbine.org/ http://fair-dom.org/
  32. State of affairs in 2015 ● Standards: – support for all steps of the modeling cycle – support of various modeling techniques – Still: some modeling concepts not yet covered (→ Report of Whole Cell modeling workshop, Waltemath et al 2015 (under review)) ● Infrastructures: – Open model repositories – Software tools export/import standards – Model management systems – Education ● Recognition
  33. Recognition 1) Higher visibility of research 2) Long-term availability 3) Link to other resources 4) Quality checks Fig.: Piwowar and Vision (2013) Data reuse and the open data citation advantage. PeerJ
  34. Model curation and publication in BioModels Database Fig.: Li et al (2010)
  35. Functional curation of models through virtual experiments Fig.: Functional curation of models in the Web Lab. Cooper et al (2015) https://peerj.com/preprints/1338/ ; Cooper et al (2014) doi:10.1016/j.pbiomolbio.2014.10.001 Try out the Cardiac physiology Web Lab.
  36. Enabling model version control Fig.: courtesy Martin Scharm, BudHat
  37. Enabling on-the-fly reproduction of the model-based results Fig.: Software supporting SBML and SED-ML. Waltemath et al (2011). doi:10.1186/1752-0509-5-198
  38. So much for the theory… and in practice? ● Check for existing standards and their specifications: http://co.mbine.org ● Get involved in standard development → through the relevant mailing lists ● Problems with getting your model into the right format? – Is it a problem with finding the appropriate format or tool? → Ask on the relevant mailing list... people are friendly and happy to help. – Is it a tool problem? → Complain to the tool developers... who will hopefully change it. – Is it a problem with the lack of a standard? → Feed back into the standard's community… people are friendly and happy to improve the standard. ● Follow best practices when aiming at publishing a result.
  39. Best practices for publishing reproducible modeling results 1) Encode the model in a standard format, e.g. SBML. 2) Annotate the SBML model, following MIRIAM. 3) Publish the simulation experiment descriptions in a standard format, e.g. SED-ML. If unsure what to include, consult the MIASE guidelines. 4) Try to reproduce the results *yourself*. 5) Ask a colleague to reproduce the results. 6) If successful: Archive all steps that led to your results. 7) Disseminate model code and simulation description through an open repository. Adapted from: Waltemath et al (2013), doi:10.1007/978-94-007-6803-1_10
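The archiving step (6) could look roughly like the sketch below, which bundles the files of a study into a single zip file together with a SHA-256 checksum list so that a colleague can verify they received exactly the same files. This is a generic illustration under stated assumptions, not a community archive format: the function name `build_archive`, the manifest file name `MANIFEST.sha256`, and the example file names are all hypothetical.

```python
import hashlib
import io
import zipfile


def build_archive(files):
    """Return zip bytes holding the given files plus a SHA-256 manifest.

    `files` maps archive-relative names to their byte content.
    """
    manifest_lines = []
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name in sorted(files):
            archive.writestr(name, files[name])
            digest = hashlib.sha256(files[name]).hexdigest()
            manifest_lines.append(f"{digest}  {name}")
        # One "<checksum>  <file name>" line per archived file.
        archive.writestr("MANIFEST.sha256", "\n".join(manifest_lines) + "\n")
    return buffer.getvalue()


if __name__ == "__main__":
    # Placeholder contents; real studies would read the files from disk.
    study = {
        "model.xml": b"<sbml/>",
        "experiment.sedml": b"<sedML/>",
    }
    with open("study_archive.zip", "wb") as handle:
        handle.write(build_archive(study))
```

Sorting the file names keeps the manifest stable between runs, which makes two archives of the same study easier to compare.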