What about visual diagrams?
Graphical representation of models. Today: broad variation in graphical notation used in biological diagrams • Between aut...
SBGN = Systems Biology Graphical Notation. Goal: standardize the graphical notation in diagrams of biological processes • ...
Attendees at SBML 10th Anniversary Symposium, Edinburgh, 2010. Such standards are the work of a great community
Get involved and make things better! COMBINE (Computational Modeling in Biology Network) • SBML, SBGN, BioPAX, SED-ML, Ce...
SBML: http://sbml.org; BioModels Database: http://biomodels.net/biomodels; COMBINE: http://co.mbine.org ...
I’d like your feedback! You can use this anonymous form: http://tinyurl.com/mhuckafeedback
SBML was made possible thanks to funding from: National Institute of General Medical Sciences (USA), European Molecular Biolo...
SBML (the Systems Biology Markup Language), model databases, and other resources

Tutorial given at the 2012 Computational Cell Biology Summer School at Cold Spring Harbor Laboratory, New York, USA, in August, 2012.

Published in: Technology
0 Comments
1 Like
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total views
1,118
On SlideShare
0
From Embeds
0
Number of Embeds
7
Actions
Shares
0
Downloads
22
Comments
0
Likes
1
Embeds 0
No embeds

No notes for slide

SBML (the Systems Biology Markup Language), model databases, and other resources

  1. SBML (the Systems Biology Markup Language), model databases, and other resources
     Michael Hucka, Ph.D., Department of Computing + Mathematical Sciences, California Institute of Technology, Pasadena, CA, USA
     Email: mhucka@caltech.edu, Twitter: @mhucka
     CCB 2012, August 2012, Cold Spring Harbor Laboratory, NY, USA
  2. Outline
     • General background and motivations
     • Brief summary of SBML features
     • A selection of resources for the SBML-oriented modeler
     • Annotations, connections and semantics
     • Current and upcoming developments in community standards
     • Closing
  4. Research today: experimentation, computation, cogitation
  5. The many roles of computation in biological research
     Instrument/device control, data management, data processing, database applications, statistical analysis, pattern matching, image processing, text mining, chemical structure prediction, genomic sequence analysis, proteomics, other *omics, molecular modeling, molecular dynamics, kinetic simulation, simulated evolution, phylogenetics, ... (to name only a subset)!
     Focus here: modeling and simulation
  6. What are the outcomes of modeling and simulation?
     Usually, there are at least two scientific outcomes:
     • One or more models (+ associated claims about their behaviors)
     • Publication of the results (in some form)
     Models come in many forms
  7. Models are results
     Models serve as statements of our current understanding of the phenomena being studied*
     • A computational model documents your theory in a concrete form
     A model can:
     • Reduce ambiguity in communication
     • Offer a concrete framework for adding new data and theories
     • Support direct evaluation of relationships between theories
     * Bower & Bolouri, Computational modeling of genetic and biochemical networks, MIT Press, 2001
  8. But only if the modeling results are reproducible
  9. Is it enough to describe the model & equations in a paper?
     Many models have traditionally been published this way
     Problems:
     • Errors in printing
     • Missing information
     • Dependencies on implementation
     • Outright errors
     • Can be a huge effort to recreate
  10. Is it enough to make your (software X) script available?
      It’s vital for good science:
      • Someone with access to the same software can try to run it, understand it, verify the computational results, build on them, etc.
      • Opinion: you should always do this in any case
  11. Is it enough to make your (software X) code available?
      It’s vital for good science:
      • Someone with access to the same software can try to run it, understand it, build on it, etc.
      • Opinion: you should always do this in any case
      But it’s still not ideal for communication of scientific results:
      • What if they don’t have access to that software?
      • And anyway, how will people find the model?
      • And how will people be able to relate the model to other work?
  12. Different tools, different interfaces & languages
  13. Communication is better with interoperable data formats
  15. SBML: a lingua franca for software
  16. SBML = Systems Biology Markup Language
      Format for representing computational models of biological processes
      • Data structures + usage principles + serialization to XML
      Neutral with respect to modeling framework
      • E.g., ODE, stochastic systems, etc.
      Development started in 2000, with first specification distributed in 2001
  17. The process is central
      • Called a “reaction” in SBML
      • Participants are pools of entities (species)
      Models can further include:
      • Other constants & variables
      • Unit definitions
      • Compartments
      • Annotations
      • Explicit math
      • Discontinuous events
      Basic SBML concepts are fairly simple
  18. Well-stirred compartments
      [Diagram: compartments c and n]
  19. Species pools are located in compartments
      [Diagram: compartments c and n; species protein A, protein B, gene, mRNAn, mRNAc]
  20. Reactions can involve any species anywhere
      [Same diagram]
  21. Reactions can cross compartment boundaries
      [Same diagram]
  22. Reaction/process rates can be (almost) arbitrary formulas
      [Same diagram, with rate formulas f1(x) ... f5(x) attached to the reactions]
  23. “Rules”: equations expressing relationships in addition to the reaction system
      [Same diagram, with rules g1(x), g2(x), ...]
  24. “Events”: discontinuous actions triggered by system conditions
      [Same diagram, with events added]
      Event1: when (...condition...), do (...assignments...)
      Event2: when (...condition...), do (...assignments...)
      ...
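The "when (condition), do (assignments)" pattern on slide 24 can be sketched in a few lines of Python. This is only an illustration, not SBML tooling: the species names, the rate constant, the threshold, and the fixed-step Euler loop are all assumptions made for the example. The one property taken from SBML is that an event fires when its trigger condition changes from false to true.

```python
def simulate(steps=2000, dt=0.01):
    """Euler simulation of A -> B with one event (all values hypothetical)."""
    state = {"A": 10.0, "B": 0.0}
    k = 0.5              # hypothetical rate constant for A -> B
    fired_at = []        # simulation times at which the event fired
    was_true = False     # value of the trigger at the previous step

    def trigger(s):
        # Event condition: the pool of A has dropped below a threshold.
        return s["A"] < 2.0

    def assignments(s):
        # Event action: discontinuously reset the pool of A.
        s["A"] = 10.0

    for i in range(steps):
        rate = k * state["A"]          # reaction rate as a formula of the state
        state["A"] -= rate * dt
        state["B"] += rate * dt

        is_true = trigger(state)
        if is_true and not was_true:   # fire only on a false -> true transition
            assignments(state)
            fired_at.append((i + 1) * dt)
            is_true = trigger(state)   # re-evaluate after the assignment
        was_true = is_true
    return state, fired_at


final_state, event_times = simulate()
print(len(event_times), final_state)
```

A real simulator would locate the trigger time more precisely than a fixed step allows; the sketch only shows the separation between continuous rates and discontinuous assignments.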
  25. Annotations: machine-readable semantics and links to other resources
      [Same diagram, with example annotations attached to model elements:]
      • “This is identified by GO id # ...”
      • “This is an enzymatic reaction with EC # ...”
      • “This is a transport into the nucleus ...”
      • “This compartment represents the nucleus ...”
      • “This event represents ...”
  26. Today: spatially homogeneous models
      • Metabolic network models
      • Signaling pathway models
      • Conductance-based models
      • Neural models
      • Pharmacokinetic/dynamics models
      • Infectious diseases
      Find examples in BioModels Database: http://biomodels.net/biomodels
      Coming: SBML Level 3 packages to support other types
      • E.g.: Spatially inhomogeneous models, also qualitative/logical
      Scope of SBML encompasses many types of models
  27. Model scale & complexity have been increasing
      Many significant and popular models are in SBML form
      [Slide shows the first page of: Herrgård et al., “A consensus yeast metabolic network reconstruction obtained from a community approach to systems biology”, Nature Biotech., 26:10, 2008; 2342 reactions; model available at http://www.comp-sys-bio.org/yeastnet]
  28. Comparison of SBML Levels
      SBML Level 1                         | SBML Level 2                           | SBML Level 3
      predefined math functions            | user-defined functions                 | user-defined functions
      text-string math notation            | MathML subset                          | MathML subset
      reserved namespaces for annotations  | no reserved namespaces for annotations | no reserved namespaces for annotations
      no controlled annotation scheme      | RDF-based controlled annotation scheme | RDF-based controlled annotation scheme
      no discrete events                   | discrete events                        | discrete events
      default values defined               | default values defined                 | no default values
      monolithic                           | monolithic                             | modular
  30. You want models? We got models.
  31. BioModels Database
      Stores & serves quantitative models of biological interest
      • Free, public resource
      • Models must be described in peer-reviewed publication(s)
      Hundreds of models are curated by hand
      Imports & exports models in several formats
      Figure courtesy of Camille Laibe
  32. BioModels Database: http://biomodels.net/biomodels
  33. Contents of BioModels Database
      Contents today (database data from 2012-08-10):
      • 142,000+ pathway models (converted from KEGG)
      • 400+ hand-curated quantitative models
        [Pie chart of curated models by category: signal transduction, metabolic process, multicellular organismal process, rhythmic process, cell cycle, homeostatic process, response to stimulus, cell death, localization, others (e.g., developmental process); individual shares range from 3% to 25%]
      • 400+ non-curated quantitative models
  34. How can you check that a given SBML file is valid?
  35. The Online SBML Validator
  36. The Online SBML Validator
      Find it here: http://sbml.org/Facilities/Validator
  37. Where can you find more software?
  38. Find software in the SBML Software Guide
  39. Find software in the SBML Software Guide
      Find SBML software
  40. Results of 2011 survey of SBML-compatible software
      Question: Which of the following categories best describe your software? (Check all that apply.)
      Category                                             | Responses (out of 81)
      Simulation software                                  | 42
      Analysis s/w (in addition, or instead of, simulation) | 40
      Creation/model development software                  | 31
      Visualization/display/formatting software            | 31
      Utility software (e.g., format conversion)           | 23
      Data integration and management software             | 16
      Repository or database                               | 14
      Framework or library (for use in developing s/w)     | 13
      S/w for interactive env. (e.g., MATLAB, R, ...)      | 13
      Annotation software                                  | 11
  41. What about libraries for writing SBML-compatible software?
  42. libSBML
      • Reads, writes, validates SBML
      • Can check & convert units
      • Written in portable C++
      • Runs on Linux, Mac, Windows
      • APIs for C, C++, C#, Java, Octave, Perl, Python, R, Ruby, MATLAB
      • Well documented API
      • Open-source (LGPL)
      http://sbml.org/Software/libSBML
  43. JSBML
      • Pure Java implementation
      • API is compatible with libSBML but more Java-like
      • Functionality is subset of libSBML
      • Open source (LGPL)
      http://sbml.org/Software/JSBML
  44. How can you stay informed of new developments?
  45. Resources for news, questions and discussions
  46. Resources for news, questions and discussions: front-page news
  47. Resources for news, questions and discussions: Twitter & RSS feeds
  48. Resources for news, questions and discussions: mailing lists/forums
  50. SBML itself provides syntax and only limited semantics
  51. SBML itself provides syntax and only limited semantics
      • No standard identifiers
  52. SBML itself provides syntax and only limited semantics
      • Low info content
      • No standard identifiers
  53. SBML itself provides syntax and only limited semantics
      • Low info content
      • No standard identifiers
      Raw models alone are insufficient
      Need standard schemes for machine-readable annotations:
      • Identify entities
      • Mathematical semantics
      • Links to other data resources
      • Authorship & pub. info
  54. Annotations at their simplest
      Element in the model → relationship qualifier (optional) → entity elsewhere (e.g., in a database)
  55. Annotations add meaning and connections
      Annotations can answer questions:
      • “What exactly is the process represented by equation ‘r17’?”
      • “What other identities (synonyms) does this entity have?”
      • “What role does constant ‘k3’ play in equation ‘r17’?”
      • “What organism are we talking about?”
      • ... etc. ...
      Multiple annotations on same entity are common
  56. SBML supports two annotation schemes
      SBO (Systems Biology Ontology)
      • For mathematical semantics
      • One SBML object ← one SBO term
      • Short, compact, tightly coupled but limited scope
      MIRIAM (Minimum Information Requested In the Annotation of Models)
      • For any kind of annotation
      • One SBML object ← multiple MIRIAM annotations
      • Larger, more free-form, wider scope
      Both are externalized and independent of SBML
  57. Systems Biology Ontology (SBO): http://biomodels.net/sbo
  58. Example of SBO terms in an SBML model:
      <sbml ...>
        ...
        <listOfCompartments>
          <compartment id="cell" size="1e-15" />
        </listOfCompartments>
        <listOfSpecies>
          <species compartment="cell" id="S1" initialAmount="1000" />
          <species compartment="cell" id="S2" initialAmount="0" />
        </listOfSpecies>
        <listOfParameters>
          <parameter id="k" value="0.005" sboTerm="SBO:0000339" />
        </listOfParameters>
        <listOfReactions>
          <reaction id="r1" reversible="false">
            <listOfReactants>
              <speciesReference species="S1" stoichiometry="2" sboTerm="SBO:0000010" />
            </listOfReactants>
            <listOfProducts>
              <speciesReference species="S1" stoichiometry="2" sboTerm="SBO:0000011" />
            </listOfProducts>
            <kineticLaw sboTerm="SBO:0000052">
              <math> ... </math>
              ...
      </sbml>
  59. Same listing as slide 58, with the term SBO:0000339 on parameter "k" highlighted.
  60. Same listing as slide 58, with SBO:0000339 highlighted and labeled: “forward bimolecular rate constant, continuous case”
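As a rough illustration of how software might pick up the SBO terms in the listing above, the following sketch uses only the Python standard library. It is not how libSBML or any SBO-aware tool works internally. The snippet is a cut-down, well-formed version of the slide's listing: the `<model>` wrapper and the closing tags that the slide elides are filled in here as assumptions, and the SBML namespace and `<math>` content are omitted. The only term label included is the one given on the slide.

```python
import xml.etree.ElementTree as ET

# Cut-down version of the slide's listing; closing tags and <model> are
# assumptions added to make the fragment well-formed.
SBML_SNIPPET = """
<sbml>
  <model>
    <listOfParameters>
      <parameter id="k" value="0.005" sboTerm="SBO:0000339" />
    </listOfParameters>
    <listOfReactions>
      <reaction id="r1" reversible="false">
        <listOfReactants>
          <speciesReference species="S1" stoichiometry="2" sboTerm="SBO:0000010" />
        </listOfReactants>
        <listOfProducts>
          <speciesReference species="S1" stoichiometry="2" sboTerm="SBO:0000011" />
        </listOfProducts>
        <kineticLaw sboTerm="SBO:0000052" />
      </reaction>
    </listOfReactions>
  </model>
</sbml>
"""

# Only the label stated on the slide; a real tool would query SBO itself.
KNOWN_LABELS = {
    "SBO:0000339": "forward bimolecular rate constant, continuous case",
}


def collect_sbo_terms(xml_text):
    """Return (tag, name, sboTerm) for every element carrying an sboTerm."""
    root = ET.fromstring(xml_text)
    found = []
    for element in root.iter():          # document order
        term = element.get("sboTerm")
        if term is not None:
            name = element.get("id") or element.get("species") or ""
            found.append((element.tag, name, term))
    return found


terms = collect_sbo_terms(SBML_SNIPPET)
for tag, name, term in terms:
    print(tag, name, term, KNOWN_LABELS.get(term, "(label not looked up)"))
```

Because each object carries at most one `sboTerm` attribute, a flat scan like this is enough; MIRIAM annotations, described next in the talk, need a richer structure.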
  61. Software can use SBO terms to help you work with models
      • semanticSBML
      • SBMLsqueezer
  62. MIRIAM (Minimum Information Requested In the Annotation of Models)
      Addresses 2 general areas of annotation needs:
      • Requirements for reference correspondence
      • Scheme for encoding annotations
        - Annotations for attributing model creators & sources
        - Annotations for referring to external data resources
      MIRIAM is not specific to SBML
  64. Annotations for attributing model creators and sources
      Goal: permit tracing model’s origins & people involved in its creation
      Minimal info required:
      • Name for the model
      • Citation for a description of what is being modeled & its author
      • Contact info for the model creator(s)
      • Creation date & time
      • Last modification date & time
      • Statement of the model’s terms of distribution
        - Specific terms not mandated, just a statement of the terms
  67. Annotations for external references
      Goal: link model constituents to corresponding entities in bioinformatics resources (e.g., databases, controlled vocabularies)
      • Supports:
        - Precise identification of model constituents
        - Discovery of models that concern the same thing
        - Comparison of model constituents between different models
      MIRIAM approach avoids putting data content directly in the model; instead, it points at external resources that contain the knowledge.
  68. Why might you care?
      [Screenshot: http://www.ebi.ac.uk/chebi]
  69. Why might you care?
      [Screenshot: http://www.ebi.ac.uk/chebi, entry for salicylic acid]
      Known by different names: do you want to write all of them into your model?
  70. Identifying resources has its own challenges
      For linking to data, need:
      • Globally unique, unambiguous identifiers
      • ... that are persistent despite resource changes (e.g., changed URLs)
      • ... that are maintained by the community
      Problem: different resources have different identification schemes
      • E.g.: entity “16480”
        - In ChEBI: entry 16480 is nitrous oxide
        - In PubMed: entry 16480 is the 1977 paper “Effect of gallstone-dissolution therapy on human liver structure”
        - In PubChem: entry 16480 is 1-chloro-4-isothiocyanatobenzene
  71. How do we create globally unique identifiers consistently?
      Long story short:
      • Create unique resource identifiers (URIs) by combining 2 parts:
        - namespace: identifies a dataset
        - entity identifier: identifies a datum within the dataset
      • Create registry for namespaces
        - Allows people & software to use same namespace identifiers
      • Create service for URI resolution
        - Allows people & software to take a given resource identifier and figure out what it points to
  72. Resolving resource identifiers
      MIRIAM Registry supports the creation of globally unique identifiers
      • Example MIRIAM identifier: urn:miriam:ec-code:1.1.1.1
      • Provides various data about the resource, including alternate servers
      • Provides web services
      identifiers.org is layered on top of that and provides resolvable URIs
      • Can type it in a web browser!
      • Example identifiers.org URI: http://identifiers.org/ec-code/1.1.1.1
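The "namespace + entity identifier" recipe from slides 71 and 72 can be sketched as plain string construction. The two URI shapes and the `ec-code` namespace come from the slides; the function names are made up for this example, and the namespace strings `chebi`, `pubmed`, and `pubchem` are illustrative stand-ins that have not been checked against the MIRIAM Registry.

```python
def miriam_urn(namespace, entity_id):
    """Combine a namespace and an entity identifier into a MIRIAM-style URN."""
    return f"urn:miriam:{namespace}:{entity_id}"


def identifiers_org_url(namespace, entity_id):
    """Combine the same two parts into a resolvable identifiers.org-style URL."""
    return f"http://identifiers.org/{namespace}/{entity_id}"


def split_identifiers_org_url(url):
    """Recover (namespace, entity identifier) from an identifiers.org-style URL."""
    prefix = "http://identifiers.org/"
    if not url.startswith(prefix):
        raise ValueError(f"not an identifiers.org URL: {url}")
    namespace, _, entity_id = url[len(prefix):].partition("/")
    return namespace, entity_id


# The example from slide 72.
ec_urn = miriam_urn("ec-code", "1.1.1.1")
ec_url = identifiers_org_url("ec-code", "1.1.1.1")

# The bare number 16480 from slide 70 is ambiguous; adding a namespace
# (hypothetical names here) yields three distinct identifiers.
ambiguous = [identifiers_org_url(ns, "16480") for ns in ("chebi", "pubmed", "pubchem")]

print(ec_urn)
print(ec_url)
print(ambiguous)
```

The construction is trivial on purpose: the value of the scheme lies in everyone agreeing on the namespace strings through a shared registry, not in the string format itself.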
  73. BioModels Database: example of using the annotations
  74. Annotations enable many interesting possibilities
      [Figure: semanticSBML; figure courtesy of Wolfram Liebermeister]
  75. Summary: why care about standard ways of writing annotations?
      Structured, machine-readable annotations increase your model’s utility
      • Allow more precise identification of model components
        - Understand model structure
        - Search/discover models
        - Compare models
      • Add a semantic layer that integrates knowledge into the model
        - Helps recipients understand the underlying biology
        - Allows for better reuse of models
        - Supports conversion of models from one form to another
  77. 77. Major dimensions of a computational model (concept due to Nicolas Le Novère). [Figure: three axes: model representation level (visual interpretation, biological semantics, mathematical semantics), model type, and model life-cycle]
  78. 78. What about other kinds of models?
  79. 79. SBML Level 3: Supporting more categories of models. [Diagram: Packages W, X, Y, Z layered on SBML Level 3 Core (dependencies)] An SBML Level 3 package adds constructs & capabilities. Models declare which packages they use • Applications tell users which packages they support. Package development can be decoupled
  80. 80. Level 3 package: what it enables • Hierarchical composition: models containing submodels • Flux balance constraints: flux balance analysis models • Qualitative models: Petri net models, Boolean models • Spatial: nonhomogeneous spatial models • Multicomponent species: entities with structure & state; rule-based models • Graph layout: diagrams of models • Graph rendering: diagrams of models • Distribution & ranges: nonscalar values • Annotations: richer annotation syntax • Groups: arbitrary grouping of model components • Dynamic structures: creation & destruction of model components • Arrays & sets: arrays or sets of entities
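As a rough illustration of "models declare which packages they use", the Python sketch below reads the root element of an SBML Level 3 file and lists the packages it declares. It assumes that each package is declared through an XML namespace of the form `http://www.sbml.org/sbml/level3/version1/<package>/version<N>` together with a `required` attribute in that namespace; this reflects my reading of the Level 3 convention and should be checked against the specifications. The function name and the toy document are made up for the example.

```python
import io
import xml.etree.ElementTree as ET

# Toy document: Level 3 Core plus one package (flux balance constraints, "fbc").
EXAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core"
      xmlns:fbc="http://www.sbml.org/sbml/level3/version1/fbc/version2"
      level="3" version="1" fbc:required="false">
  <model id="toy"/>
</sbml>"""

SBML_L3_PREFIX = "http://www.sbml.org/sbml/level3/"  # assumed URI pattern


def declared_packages(sbml_text):
    """Map package name -> value of its 'required' flag on the root element."""
    namespaces = {}
    root = None
    stream = io.BytesIO(sbml_text.encode("utf-8"))
    for event, payload in ET.iterparse(stream, events=("start-ns", "start")):
        if event == "start-ns":
            prefix, uri = payload
            namespaces[prefix] = uri
        else:
            root = payload  # first "start" event is the root <sbml> element
            break
    packages = {}
    for uri in namespaces.values():
        if not uri.startswith(SBML_L3_PREFIX):
            continue
        name = uri.split("/")[6]  # e.g. "core" or "fbc"
        if name == "core":
            continue
        packages[name] = root.get("{%s}required" % uri) == "true"
    return packages


if __name__ == "__main__":
    print(declared_packages(EXAMPLE))
```

An application could compare this dictionary against the packages it supports and warn the user when a package flagged as required is missing.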
  81. 81. How can we capture the simulation/analysis procedures?
  82. 82. Decroly & Goldbeter, PNAS, 1982; BIOMD0000000319 in BioModels Database. Software can’t read figure legends
  83. 83. SED-ML = Simulation Experiment Description ML. Application-independent format to capture procedures, algorithms, parameter values • Neutral format for encoding the steps to go from model to output. Can be used for • Simulation experiments encoding parametrizations & perturbations • Simulations using more than one model • Simulations using more than one method • Data manipulations to produce plot(s). libSedML project developing API library. http://www.biomodels.net/sed-ml
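To give a feel for what "encoding the steps to go from model to output" might look like, here is a minimal Python sketch that assembles the skeleton of a SED-ML-like document: one model reference, one time-course simulation, and one task tying them together. The element and attribute names follow my understanding of SED-ML Level 1 and should be treated as assumptions; the time range and number of points are invented for illustration; and the sketch omits the algorithm, data generators, and outputs that a complete file would need. The function name is hypothetical, and the code does not use libSedML.

```python
import xml.etree.ElementTree as ET


def build_sedml_sketch(model_source):
    """Build an incomplete, illustrative SED-ML-style element tree."""
    root = ET.Element("sedML", {
        "xmlns": "http://sed-ml.org/",  # assumed Level 1 namespace
        "level": "1",
        "version": "1",
    })

    # Which model to simulate (a file path or a resource identifier).
    models = ET.SubElement(root, "listOfModels")
    ET.SubElement(models, "model", {
        "id": "model1",
        "language": "urn:sedml:language:sbml",
        "source": model_source,
    })

    # How to simulate it: a uniform time course with made-up settings.
    sims = ET.SubElement(root, "listOfSimulations")
    ET.SubElement(sims, "uniformTimeCourse", {
        "id": "sim1",
        "initialTime": "0",
        "outputStartTime": "0",
        "outputEndTime": "100",
        "numberOfPoints": "1000",
    })

    # A task applies one simulation to one model.
    tasks = ET.SubElement(root, "listOfTasks")
    ET.SubElement(tasks, "task", {
        "id": "task1",
        "modelReference": "model1",
        "simulationReference": "sim1",
    })
    return root


if __name__ == "__main__":
    doc = build_sedml_sketch("model.xml")  # hypothetical file name
    print(ET.tostring(doc, encoding="unicode"))
```

Keeping the model, the simulation settings, and the task as separate, cross-referenced elements is what lets one description reuse a model with several methods, or one method with several models.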
  84. 84. What about visual diagrams?
  85. 85. Graphical representation of models. Today: broad variation in graphical notation used in biological diagrams • Between authors, between journals, even people in same group. However, standard notations would offer benefits: • Consistency = easier to read diagrams with less ambiguity • Software support: verification of correctness, translation to math
  86. 86. SBGN = Systems Biology Graphical Notation. Goal: standardize the graphical notation in diagrams of biological processes • Community-based development, à la SBML. Many groups participating. 3 sublanguages to describe different facets of a model. http://sbgn.org
  87. 87. Outline: General background and motivations • Brief summary of SBML features • A selection of resources for the SBML-oriented modeler • Annotations, connections and semantics • Current and upcoming developments in community standards • Closing
  88. 88. Attendees at SBML 10th Anniversary Symposium, Edinburgh, 2010. Such standards are the work of a great community
  89. 89. Get involved and make things better! COMBINE (Computational Modeling in Biology Network) • SBML, SBGN, BioPAX, SED-ML, CellML, NeuroML. http://co.mbine.org Upcoming meeting: August 15–19 in Toronto, Canada • Right before ICSB (International Conference on Systems Biology)
  90. 90. URLs: SBML http://sbml.org • BioModels Database http://biomodels.net/biomodels • COMBINE http://co.mbine.org • identifiers.org http://identifiers.org • MIRIAM http://biomodels.net/miriam • SED-ML http://biomodels.net/sed-ml • SBO http://biomodels.net/sbo • SBGN http://sbgn.org
  91. 91. I’d like your feedback! You can use this anonymous form: http://tinyurl.com/mhuckafeedback
  92. 92. SBML was made possible thanks to funding from: National Institute of General Medical Sciences (USA); European Molecular Biology Laboratory (EMBL); JST ERATO Kitano Symbiotic Systems Project (Japan) (to 2003); JST ERATO-SORST Program (Japan); ELIXIR (UK); Beckman Institute, Caltech (USA); Keio University (Japan); International Joint Research Program of NEDO (Japan); Japanese Ministry of Agriculture; Japanese Ministry of Educ., Culture, Sports, Science and Tech.; BBSRC (UK); National Science Foundation (USA); DARPA IPTO Bio-SPICE Bio-Computation Program (USA); Air Force Office of Scientific Research (USA); STRI, University of Hertfordshire (UK); Molecular Sciences Institute (USA)
