SBML (the Systems Biology Markup Language),
   model databases, and other resources
                     Michael Hucka, Ph.D.
       Department of Computing + Mathematical Sciences
              California Institute of Technology
                      Pasadena, CA, USA
     Email: mhucka@caltech.edu           Twitter: @mhucka

  CCB 2012, August 2012, Cold Spring Harbor Laboratory, NY, USA
Outline
 •   General background and motivations
 •   Brief summary of SBML features
 •   A selection of resources for the SBML-oriented modeler
 •   Annotations, connections and semantics
 •   Current and upcoming developments in community standards
 •   Closing
Research today: experimentation, computation, cogitation
The many roles of computation in biological research
Instrument/device control, data management, data processing,
database applications, statistical analysis, pattern matching, image
processing, text mining, chemical structure prediction, genomic
sequence analysis, proteomics, other *omics, molecular modeling,
molecular dynamics, kinetic simulation, simulated evolution,
phylogenetics, ... (to name only a subset)!
Focus here: modeling and simulation
What are the outcomes of modeling and simulation?
Usually, there are at least two scientific outcomes:
 •    One or more models (+ associated claims about their behaviors)
 •    Publication of the results (in some form)

Models come in many forms
Models are results
 Models serve as statements of our current understanding of the
 phenomena being studied*
   •   A computational model documents your theory in a concrete form
 Models can:
   •   Reduce ambiguity in communication
   •   Offer a concrete framework for adding new data and theories
   •   Support direct evaluation of relationships between theories




* Bower & Bolouri, Computational modeling of genetic and biochemical networks, MIT Press, 2001
But only if the modeling results are reproducible
Is it enough to describe the model & equations in a paper?
Many models have traditionally been published this way
Problems:
 •   Errors in printing
 •   Missing information
 •   Dependencies on implementation
 •   Outright errors
 •   Can be a huge effort to recreate
Is it enough to make your (software X) code available?
It’s vital for good science:
 •   Someone with access to the same software can try to run it,
     understand it, verify the computational results, build on them, etc.
 •   Opinion: you should always do this in any case
But it’s still not ideal for communication of scientific results:
 •   What if they don’t have access to that software?
 •   And anyway, how will people find the model?
 •   And how will people be able to relate the model to other work?
Different tools mean different interfaces & languages
Communication is better with interoperable data formats
SBML: a lingua franca for software
SBML = Systems Biology Markup Language
Format for representing computational models of biological processes
 •   Data structures + usage principles + serialization to XML
Neutral with respect to modeling framework
 •   E.g., ODE, stochastic systems, etc.


Development started in 2000, with first specification distributed in 2001
The process is central
  •   Called a “reaction” in SBML
  •   Participants are pools of entities (species)
Models can further include:
  •   Other constants & variables
  •   Compartments
  •   Explicit math
  •   Discontinuous events
  •   Unit definitions
  •   Annotations
Basic SBML concepts are fairly simple
Well-stirred compartments
[Diagram: two compartments, labeled c and n]
Species pools are located in compartments
[Diagram: compartments c and n containing the species pools gene, mRNAn, mRNAc, protein A, and protein B]
Reactions can involve any species anywhere
[Diagram: the same compartments and species, now connected by reactions]
Reactions can cross compartment boundaries
[Diagram: the same system, with reactions crossing the boundary between c and n]
Reaction/process rates can be (almost) arbitrary formulas
[Diagram: the same system, with the reactions labeled by rate formulas f1(x) through f5(x)]
“Rules”: equations expressing relationships in addition to the reaction system
[Diagram: the same system, with rules g1(x), g2(x), ... listed alongside the reactions]
“Events”: discontinuous actions triggered by system conditions
[Diagram: the same system, with events listed below it]
 •   Event1: when (...condition...) do (...assignments...)
 •   Event2: when (...condition...) do (...assignments...)
 •   ...
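As a rough illustration of the "when (condition) do (assignments)" idea, the sketch below integrates a single species with forward Euler and applies a discontinuous assignment each time a trigger condition changes from false to true. The species, rate constant, threshold, and reset value are all invented for this example, and real SBML simulators detect events far more carefully than this fixed-step loop does.

```python
def simulate(s0=10.0, k=0.5, threshold=2.0, reset=10.0, dt=0.01, steps=1000):
    """Euler-integrate dS/dt = -k*S; when S drops below `threshold`, set S = reset."""
    s = s0
    trajectory = [s]
    firings = 0
    was_true = s < threshold          # trigger state at the previous step
    for _ in range(steps):
        s += dt * (-k * s)            # continuous part: the reaction rate formula
        is_true = s < threshold       # event trigger condition
        if is_true and not was_true:  # fire only on a false -> true transition
            s = reset                 # event assignment: a discontinuous jump
            firings += 1
            is_true = s < threshold   # re-evaluate the trigger after the jump
        was_true = is_true
        trajectory.append(s)
    return trajectory, firings


trajectory, firings = simulate()
print("event fired", firings, "times; lowest recorded value:", min(trajectory))
```

Because the assignment is applied in the same step in which the trigger becomes true, the recorded trajectory never falls below the threshold in this sketch.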
Annotations: machine-readable semantics and links to other resources
[Diagram: the same system, with annotations attached to its parts]
 •   “This is identified by GO id # ...”
 •   “This is an enzymatic reaction with EC # ...”
 •   “This is a transport into the nucleus ...”
 •   “This compartment represents the nucleus ...”
 •   “This event represents ...”
Today: spatially homogeneous models
  •   Metabolic network models
  •   Signaling pathway models
  •   Conductance-based models
  •   Neural models
  •   Pharmacokinetic/dynamics models
  •   Infectious diseases
Find examples in BioModels Database: http://biomodels.net/biomodels


Coming: SBML Level 3 packages to support other types
  •   E.g.: Spatially inhomogeneous models, also qualitative/logical




      Scope of SBML encompasses many types of models
Model scale & complexity have been increasing
Many significant and popular models are in SBML form
  •   Example: Herrgård et al., “A consensus yeast metabolic network reconstruction obtained from a community approach to systems biology”, Nature Biotech., 26:10, 2008 (2342 reactions)
  •   Available in SBML form at http://www.comp-sys-bio.org/yeastnet
SBML Level 1                          SBML Level 2                             SBML Level 3
predefined math functions             user-defined functions                   user-defined functions
text-string math notation             MathML subset                            MathML subset
reserved namespaces for annotations   no reserved namespaces for annotations   no reserved namespaces for annotations
no controlled annotation scheme       RDF-based controlled annotation scheme   RDF-based controlled annotation scheme
no discrete events                    discrete events                          discrete events
default values defined                default values defined                   no default values
monolithic                            monolithic                               modular
You want models? We got models.
BioModels Database
Stores & serves quantitative models of biological interest
 •   Free, public resource
 •   Models must be described in peer-reviewed publication(s)
Hundreds of models are curated by hand
Imports & exports models in several formats




                                                    Figure courtesy of Camille Laibe
BioModels Database




http://biomodels.net/biomodels
Contents of BioModels Database
Contents today:
 •   142,000+ pathway models (converted from KEGG)
 •   400+ hand-curated quantitative models
     [Pie chart of the curated models by process: signal transduction, metabolic process, multicellular organismal process, rhythmic process, cell cycle, homeostatic process, response to stimulus, cell death, localization, others (e.g., developmental process); slices range from 3% to 25%]
 •   400+ non-curated quantitative models
Database data from 2012-08-10
How can you check that a given SBML file is valid?
The Online SBML Validator
 •   Find it here: http://sbml.org/Facilities/Validator
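The online validator applies the full set of SBML consistency rules. As a far more limited sketch of what "checking a file" can mean, the snippet below uses only the Python standard library to test two things: that the text is well-formed XML, and that every species refers to a declared compartment. The function name and the sample model are made up for illustration, and passing these two checks says nothing about full SBML validity.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Strip any '{namespace}' prefix from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]


def basic_checks(sbml_text):
    """Return a list of problems found by two very simple checks."""
    try:
        root = ET.fromstring(sbml_text)
    except ET.ParseError as err:
        return [f"not well-formed XML: {err}"]
    compartments = {e.get("id") for e in root.iter() if local(e.tag) == "compartment"}
    problems = []
    for e in root.iter():
        if local(e.tag) == "species" and e.get("compartment") not in compartments:
            problems.append(
                f"species {e.get('id')!r} refers to unknown compartment "
                f"{e.get('compartment')!r}"
            )
    return problems


# A small hand-written model (illustrative only).
GOOD = """<sbml xmlns="http://www.sbml.org/sbml/level2/version4" level="2" version="4">
  <model id="m">
    <listOfCompartments><compartment id="cell" size="1e-15"/></listOfCompartments>
    <listOfSpecies><species id="S1" compartment="cell" initialAmount="1000"/></listOfSpecies>
  </model>
</sbml>"""
# Two deliberately broken variants of the same model.
BAD_REFERENCE = GOOD.replace('compartment="cell"', 'compartment="nucleus"')
BAD_XML = GOOD.replace("</listOfSpecies>", "<listOfSpecies>")

for name, text in [("good", GOOD), ("bad reference", BAD_REFERENCE), ("bad XML", BAD_XML)]:
    print(name, "->", basic_checks(text))
```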
Where can you find more software?
Find SBML software in the SBML Software Guide
Results of 2011 survey of SBML-compatible software
Question: Which of the following categories best describe your software? (Check all that apply.)
Category                                                 Responses
Simulation software                                      42
Analysis s/w (in addition, or instead of, simulation)    40
Creation/model development software                      31
Visualization/display/formatting software                31
Utility software (e.g., format conversion)               23
Data integration and management software                 16
Repository or database                                   14
Framework or library (for use in developing s/w)         13
S/w for interactive env. (e.g., MATLAB, R, ...)          13
Annotation software                                      11
Out of 81 responses
What about libraries for writing SBML-compatible software?
libSBML
Reads, writes, validates SBML
Can check & convert units
Written in portable C++
Runs on Linux, Mac, Windows
APIs for C, C++, C#, Java, Octave,
Perl, Python, R, Ruby, MATLAB
Well documented API
Open-source (LGPL)




                  http://sbml.org/Software/libSBML
JSBML
              Pure Java implementation
              API is compatible with libSBML but
              more Java-like
              Functionality is subset of libSBML
              Open source (LGPL)




http://sbml.org/Software/JSBML
How can you stay informed of new developments?
Resources for news, questions and discussions
 •   Front-page news
 •   Twitter & RSS feeds
 •   Mailing lists/forums
SBML itself provides syntax and only limited semantics
[Figure: a raw model, labeled “No standard identifiers” and “Low info content”]
Raw models alone are insufficient
Need standard schemes for machine-readable annotations
 •   Identify entities
 •   Mathematical semantics
 •   Links to other data resources
 •   Authorship & pub. info
Annotations at their simplest
 •   Element in the model → relationship qualifier (optional) → Entity elsewhere (e.g., in a database)
Annotations add meaning and connections
Annotations can answer questions:
 •   “What exactly is the process represented by equation ‘r17’?”
 •   “What other identities (synonyms) does this entity have?”
 •   “What role does constant ‘k3’ play in equation ‘r17’?”
 •   “What organism are we talking about?”
 •   ... etc. ...
Multiple annotations on same entity are common
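One way to picture an annotation is as a small record linking a model element to an external entity, optionally with a relationship qualifier. The sketch below is only a mental model, not the SBML encoding: the class name, the `urn:example:` identifiers, and the element ids are invented, and the qualifier strings are borrowed from the BioModels.net qualifier vocabulary purely as examples.

```python
from typing import NamedTuple, Optional


class Annotation(NamedTuple):
    """One link from a model element to an entity described elsewhere."""
    element_id: str                  # id of the element inside the model
    resource: str                    # identifier of the external entity
    qualifier: Optional[str] = None  # relationship qualifier (optional)


# Hypothetical annotations; only the EC code URI appears in these slides.
ANNOTATIONS = [
    Annotation("r17", "http://identifiers.org/ec-code/1.1.1.1", "isVersionOf"),
    Annotation("r17", "urn:example:pathway:42", "isPartOf"),  # made-up resource
    Annotation("S1", "urn:example:compound:7"),               # no qualifier given
]


def annotations_for(element_id, annotations):
    """Collect every annotation attached to one model element."""
    return [a for a in annotations if a.element_id == element_id]


for a in annotations_for("r17", ANNOTATIONS):
    print(a.element_id, a.qualifier, a.resource)
```

Here the reaction 'r17' carries two annotations at once, which is the common case mentioned above.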
SBML supports two annotation schemes
SBO (Systems Biology Ontology)
 •   For mathematical semantics

 •   One SBML object ← one SBO term
 •   Short, compact, tightly coupled but limited scope
MIRIAM (Minimum Information Requested In the Annotation of Models)
 •   For any kind of annotation

 •   One SBML object ← multiple MIRIAM annotations
 •   Larger, more free-form, wider scope
Both are externalized and independent of SBML
Systems Biology Ontology (SBO)




                     http://biomodels.net/sbo
<sbml ...>
  ...
  <listOfCompartments>
    <compartment id="cell" size="1e-15" />
  </listOfCompartments>
  <listOfSpecies>
    <species compartment="cell" id="S1" initialAmount="1000" />
    <species compartment="cell" id="S2" initialAmount="0" />
  </listOfSpecies>
  <listOfParameters>
    <parameter id="k" value="0.005" sboTerm="SBO:0000339" />
  </listOfParameters>
  <listOfReactions>
    <reaction id="r1" reversible="false">
      <listOfReactants>
        <speciesReference species="S1" stoichiometry="2" sboTerm="SBO:0000010" />
      </listOfReactants>
      <listOfProducts>
        <speciesReference species="S1" stoichiometry="2" sboTerm="SBO:0000011" />
      </listOfProducts>
      <kineticLaw sboTerm="SBO:0000052">
        <math>
         ...
        </math>
      </kineticLaw>
    </reaction>
  </listOfReactions>
  ...
</sbml>
SBO:0000339 = “forward bimolecular rate constant, continuous case”
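Because `sboTerm` is an ordinary XML attribute, software can collect the terms with any XML parser. The sketch below does this with Python's standard library on a cut-down, hand-written version of the example; the model id, the namespace declaration, and the function name are assumptions added to make the fragment parseable.

```python
import xml.etree.ElementTree as ET

# Cut-down version of the slide example, completed so that it parses.
SBML = """<sbml xmlns="http://www.sbml.org/sbml/level2/version4" level="2" version="4">
  <model id="example">
    <listOfParameters>
      <parameter id="k" value="0.005" sboTerm="SBO:0000339"/>
    </listOfParameters>
    <listOfReactions>
      <reaction id="r1" reversible="false">
        <listOfReactants>
          <speciesReference species="S1" stoichiometry="2" sboTerm="SBO:0000010"/>
        </listOfReactants>
        <kineticLaw sboTerm="SBO:0000052"/>
      </reaction>
    </listOfReactions>
  </model>
</sbml>"""


def sbo_terms(text):
    """Return (element name, id or species reference, SBO term) for annotated elements."""
    found = []
    for element in ET.fromstring(text).iter():
        term = element.get("sboTerm")
        if term is not None:
            name = element.tag.rsplit("}", 1)[-1]  # drop the namespace prefix
            label = element.get("id") or element.get("species")
            found.append((name, label, term))
    return found


for name, label, term in sbo_terms(SBML):
    print(name, label, term)
```

Looking up what each term means (for example, the human-readable name of SBO:0000339) would still require the ontology itself, which this sketch does not consult.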
Software can use SBO terms to help you work with models
 •   semanticSBML
 •   SBMLsqueezer
MIRIAM (Minimum Information Requested In the Annotation of Models)
Addresses 2 general areas of annotation needs:
 •   Requirements for reference correspondence
 •   Scheme for encoding annotations
     -   Annotations for attributing model creators & sources
     -   Annotations for referring to external data resources
MIRIAM is not specific to SBML
Annotations for attributing model creators and sources
Goal: permit tracing model’s origins & people involved in its creation
Minimal info required:
  •   Name for the model
  •   Citation for a description of what is being modeled & its author
  •   Contact info for the model creator(s)
  •   Creation date & time
  •   Last modification date & time
  •   Statement of the model’s terms of distribution
      -   Specific terms not mandated, just a statement of the terms
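The list of minimal information above lends itself to a simple completeness check. The sketch below assumes the attribution data has already been gathered into a plain dictionary; the field names and the sample values are invented for this example and are not part of MIRIAM or SBML.

```python
# Hypothetical field names, one per item in the minimal-information list.
REQUIRED_FIELDS = (
    "name",
    "citation",
    "creator_contact",
    "created",
    "modified",
    "terms_of_distribution",
)


def missing_attribution(metadata):
    """Return the required fields that are absent or empty in `metadata`."""
    return [field for field in REQUIRED_FIELDS if not metadata.get(field)]


# A draft record that still lacks two of the six items.
draft = {
    "name": "Example oscillator model",
    "citation": "placeholder citation",
    "creator_contact": "modeler@example.org",
    "created": "2012-08-10T12:00:00Z",
}
print(missing_attribution(draft))
```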
Annotations for external references
Goal: link model constituents to corresponding entities in
bioinformatics resources (e.g., databases, controlled vocabularies)
 •   Supports:
     -   Precise identification of model constituents
     -   Discovery of models that concern the same thing
     -   Comparison of model constituents between different models
MIRIAM approach avoids putting data content directly in the model;
instead, it points at external resources that contain the knowledge.
Why might you care?
 •   Example: salicylic acid in ChEBI (http://www.ebi.ac.uk/chebi)
 •   Known by different names: do you want to write all of them into your model?
Identifying resources has its own challenges
For linking to data, need:
 •   Globally unique, unambiguous identifiers
 •   ... that are persistent despite resource changes (e.g., changed URLs)
 •   ... that are maintained by the community
Problem: different resources have different identification schemes
 •   E.g.: entity “16480”
     -   In ChEBI: entry 16480 is nitric oxide
     -   In PubMed: entry 16480 is the 1977 paper “Effect of gallstone-
         dissolution therapy on human liver structure”
     -   In PubChem: entry 16480 is 1-chloro-4-isothiocyanatobenzene
How do we create globally unique identifiers consistently?
Long story short:
 •   Create globally unique identifiers in the form of URIs (Uniform Resource Identifiers) by combining 2 parts:
     -   namespace: identifies a dataset
     -   entity identifier: identifies a datum within the dataset

 •   Create registry for namespaces
     -   Allows people & software to use same namespace identifiers
 •   Create service for URI resolution
     -   Allows people & software to take a given resource identifier and
         figure out what it points to
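The two-part scheme can be sketched in a few lines. The snippet below builds `urn:miriam:` style identifiers from a namespace and an entity identifier, and shows that the ambiguous number 16480 becomes three distinct URIs once a namespace is attached. The namespace labels `chebi`, `pubmed`, and `pubchem.compound` are used here as assumptions for illustration; the authoritative labels are whatever the namespace registry defines.

```python
from urllib.parse import quote


def make_urn(namespace, entity_id):
    """Combine a namespace and an entity identifier into one URN string."""
    # Percent-encode characters such as ':' so they cannot be confused
    # with the separators of the URN itself.
    return f"urn:miriam:{namespace}:{quote(entity_id, safe='')}"


# The same bare number, disambiguated by three (assumed) namespace labels.
urns = [make_urn(ns, "16480") for ns in ("chebi", "pubmed", "pubchem.compound")]
for urn in urns:
    print(urn)

print(make_urn("ec-code", "1.1.1.1"))
```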
Resolving resource identifiers
MIRIAM Registry supports the creation of globally unique identifiers
 •   Example MIRIAM identifier:
     urn:miriam:ec-code:1.1.1.1
 •   Provides various data about the
     resource, including alternate servers
 •   Provides web services


identifiers.org is layered on top of that and provides resolvable URIs
 •   Can type it in a web browser!
 •   Example identifiers.org URI:
     http://identifiers.org/ec-code/1.1.1.1
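The two example identifiers above differ only in how the same namespace and entity identifier are packaged. As a sketch, and assuming that a plain string rewrite is enough (in practice the registry and its web services are the authority on how identifiers map to locations), the conversion could look like this; the function name is made up.

```python
from urllib.parse import unquote


def urn_to_url(urn):
    """Rewrite 'urn:miriam:<namespace>:<id>' as an identifiers.org style URL."""
    prefix = "urn:miriam:"
    if not urn.startswith(prefix):
        raise ValueError(f"not a MIRIAM URN: {urn!r}")
    namespace, _, entity = urn[len(prefix):].partition(":")
    if not namespace or not entity:
        raise ValueError(f"missing namespace or entity identifier: {urn!r}")
    # Undo any percent-encoding that was applied inside the URN.
    return f"http://identifiers.org/{namespace}/{unquote(entity)}"


print(urn_to_url("urn:miriam:ec-code:1.1.1.1"))
```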
BioModels Database: example of using the annotations
Annotations enable many interesting possibilities
 •   Example: semanticSBML
Figure courtesy of Wolfram Liebermeister
Summary: why care about standard ways of writing annotations?
 Structured, machine-readable annotations increase your model’s utility
  •   Allow more precise identification of model components
      -   Understand model structure
      -   Search/discover models
      -   Compare models
  •   Add a semantic layer that integrates knowledge into the model
      -   Helps recipients understand the underlying biology
      -   Allows for better reuse of models
      -   Supports conversion of models from one form to another
Major dimensions of a computational model
 •   Model representation level: mathematical semantics, biological semantics, visual interpretation
 •   Model type: discrete stochastic entities, continuous lumped parameter, mean field approximation, state transition
 •   Model life-cycle: model creation, model annotation, model analysis, numerical results
Concept due to Nicolas Le Novère
What about other kinds of models?
SBML Level 3: Supporting more categories of models
[Diagram: Package W layered above Packages X, Y, and Z, which are layered above SBML Level 3 Core; the layering shows dependencies]


An SBML Level 3 package adds constructs & capabilities
Models declare which packages they use
 •    Applications tell users which packages they support
Package development can be decoupled
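
As a rough illustration of "models declare which packages they use": in SBML Level 3, each package used by a file is declared on the top-level sbml element through an XML namespace plus a matching "required" attribute. The sketch below uses only the Python standard library (not libSBML) to list those declarations. The embedded document and the helper name declared_packages are made up for this example, and reading the package name out of the namespace URI is an assumption about the usual URI layout rather than something a specification guarantees.

```python
import xml.etree.ElementTree as ET

# Hypothetical, minimal SBML Level 3 document that declares two packages.
SBML_DOC = """<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core"
      xmlns:comp="http://www.sbml.org/sbml/level3/version1/comp/version1"
      xmlns:layout="http://www.sbml.org/sbml/level3/version1/layout/version1"
      level="3" version="1"
      comp:required="true" layout:required="false">
  <model id="example"/>
</sbml>"""


def declared_packages(sbml_text):
    """Map each declared package name to the value of its 'required' flag."""
    root = ET.fromstring(sbml_text)
    packages = {}
    for key, value in root.attrib.items():
        # ElementTree stores namespaced attributes as "{namespace-uri}name".
        if key.startswith("{") and key.endswith("}required"):
            uri = key[1:].split("}", 1)[0]
            # Assumed URI layout: .../level3/version1/<package>/version<N>
            name = uri.rstrip("/").split("/")[-2]
            packages[name] = (value == "true")
    return packages


print(declared_packages(SBML_DOC))
```

An application could compare the returned names against the packages it supports and warn the user before loading a model whose required packages it cannot interpret.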
Level 3 package            What it enables
Hierarchical composition   Models containing submodels
Flux balance constraints   Flux balance analysis models
Qualitative models         Petri net models, Boolean models
Spatial                    Nonhomogeneous spatial models
Multicomponent species     Entities with structure & state; rule-based models
Graph layout               Diagrams of models
Graph rendering            Diagrams of models
Distribution & ranges      Nonscalar values
Annotations                Richer annotation syntax
Groups                     Arbitrary grouping of model components
Dynamic structures         Creation & destruction of model components
Arrays & sets              Arrays or sets of entities
How can we capture the simulation/analysis procedures?
(Figure: a plot from Decroly & Goldbeter, PNAS, 1982, shown next to model BIOMD0000000319 in BioModels Database, with a "?" between them)
Software can’t read figure legends
SED-ML = Simulation Experiment Description ML
Application-independent format to capture procedures, algorithms,
parameter values
 •   Neutral format for encoding the steps to go from model to output
Can be used for
 •   Simulation experiments encoding parametrizations & perturbations

 •   Simulations using more than one model
 •   Simulations using more than one method
 •   Data manipulations to produce plot(s)
libSedML project developing API library



                  http://www.biomodels.net/sed-ml
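
To make "the steps to go from model to output" more concrete, here is a minimal sketch of the kind of information such a description has to carry: which model, which simulation settings, which task combines them, and which output is produced. This is not SED-ML syntax. All class and field names are hypothetical, and the time span, method label and variable names are invented for illustration. Only the model identifier BIOMD0000000319 comes from the slides.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Model:
    id: str
    source: str          # where the model comes from


@dataclass
class Simulation:
    id: str
    method: str          # hypothetical label for the numerical method
    start: float
    end: float
    points: int


@dataclass
class Task:
    id: str
    model: str           # id of a Model
    simulation: str      # id of a Simulation


@dataclass
class Plot:
    id: str
    task: str            # id of the Task that produces the data
    x: str
    y: List[str]


@dataclass
class Experiment:
    models: List[Model] = field(default_factory=list)
    simulations: List[Simulation] = field(default_factory=list)
    tasks: List[Task] = field(default_factory=list)
    plots: List[Plot] = field(default_factory=list)

    def dangling_references(self) -> List[str]:
        """List references to ids that are not defined in the experiment."""
        model_ids = {m.id for m in self.models}
        sim_ids = {s.id for s in self.simulations}
        task_ids = {t.id for t in self.tasks}
        problems = []
        for t in self.tasks:
            if t.model not in model_ids:
                problems.append(f"task {t.id}: unknown model {t.model}")
            if t.simulation not in sim_ids:
                problems.append(f"task {t.id}: unknown simulation {t.simulation}")
        for p in self.plots:
            if p.task not in task_ids:
                problems.append(f"plot {p.id}: unknown task {p.task}")
        return problems


# Illustrative values only; nothing here is taken from the published paper.
experiment = Experiment(
    models=[Model("m1", "BIOMD0000000319")],
    simulations=[Simulation("s1", "deterministic time course", 0.0, 100.0, 1000)],
    tasks=[Task("t1", model="m1", simulation="s1")],
    plots=[Plot("p1", task="t1", x="time", y=["variable_A", "variable_B"])],
)
print(experiment.dangling_references())
```

Keeping the model, the simulation settings and the outputs as separate, cross-referenced records is what lets one description reuse a model with several methods, or one method with several models.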
What about visual diagrams?
Graphical representation of models
Today: broad variation in graphical notation used in biological diagrams
 •   Between authors, between journals, even between people in the same group
However, standard notations would offer benefits:
 •   Consistency = easier to read diagrams with less ambiguity
 •   Software support: verification of correctness, translation to math
SBGN = Systems Biology Graphical Notation
Goal: standardize the graphical notation in diagrams of biological processes
 •   Community-based development, à la SBML
Many groups participating
3 sublanguages to describe different facets of a model




                           http://sbgn.org
Outline
          General background and motivations
          Brief summary of SBML features
          A selection of resources for the SBML-oriented modeler
          Annotations, connections and semantics
          Current and upcoming developments in community standards
          Closing
Attendees at SBML 10th Anniversary Symposium, Edinburgh, 2010

Such standards are the work of a great community
Get involved and make things better!
COMBINE (Computational Modeling in Biology Network)
 •   SBML, SBGN, BioPAX, SED-ML, CellML, NeuroML




                         http://co.mbine.org

Upcoming meeting: August 15–19 in Toronto, Canada
 •   Right before ICSB (International Conference on Systems Biology)
URLs
                     SBML http://sbml.org
       BioModels Database http://biomodels.net/biomodels
                  COMBINE http://co.mbine.org
          identifiers.org http://identifiers.org
                   MIRIAM http://biomodels.net/miriam
                   SED-ML http://biomodels.net/sed-ml
                      SBO http://biomodels.net/sbo
                     SBGN http://sbgn.org
I’d like your feedback!
You can use this anonymous form:
 http://tinyurl.com/mhuckafeedback
SBML was made possible thanks to funding from:
National Institute of General Medical Sciences (USA)
European Molecular Biology Laboratory (EMBL)
JST ERATO Kitano Symbiotic Systems Project (Japan) (to 2003)
JST ERATO-SORST Program (Japan)
ELIXIR (UK)
Beckman Institute, Caltech (USA)
Keio University (Japan)
International Joint Research Program of NEDO (Japan)
Japanese Ministry of Agriculture
Japanese Ministry of Educ., Culture, Sports, Science and Tech.
BBSRC (UK)
National Science Foundation (USA)
DARPA IPTO Bio-SPICE Bio-Computation Program (USA)
Air Force Office of Scientific Research (USA)
STRI, University of Hertfordshire (UK)
Molecular Sciences Institute (USA)

SBML (the Systems Biology Markup Language), model databases, and other resources

  • 1. SBML (the Systems Biology Markup Language), model databases, and other resources Michael Hucka, Ph.D. Department of Computing + Mathematical Sciences California Institute of Technology Pasadena, CA, USA Email: mhucka@caltech.edu Twitter: @mhucka CCB 2012, August 2012, Cold Spring Harbor Laboratory, NY, USA
  • 2. General background and motivations Brief summary of SBML features Outline A selection of resources for the SBML-oriented modeler Annotations, connections and semantics Current and upcoming developments in community standards Closing
  • 3. General background and motivations Brief summary of SBML features Outline A selection of resources for the SBML-oriented modeler Annotations, connections and semantics Current and upcoming developments in community standards Closing
  • 4. Research today: experimentation, computation, cogitation
  • 5. The many roles of computation in biological research Instrument/device control, data management, data processing, database applications, statistical analysis, pattern matching, image processing, text mining, chemical structure prediction, genomic sequence analysis, proteomics, other *omics, molecular modeling, molecular dynamics, kinetic simulation, simulated evolution, phylogenetics, ... (to name only a subset)! Focus here: modeling and simulation
  • 6. What are the outcomes of modeling and simulation? Usually, there are at least two scientific outcomes: • One or more models (+ associated claims about their behaviors) • Publication of the results (in some form) Models come in many forms
  • 7. Models are results Models serve as statements of our current understanding of the phenomena being studied* • A computational model documents your theory in a concrete form Model can— • Reduce ambiguity in communication • Offer a concrete framework for adding new data and theories • Support direct evaluation of relationships between theories Bower & Bolouri, Computational modeling of genetic and biochemical networks, MIT Press, 2001
  • 8. But only if the modeling results are reproducible
  • 9. Is it enough to describe the model & equations in a paper? Many models have traditionally been published this way Problems: • Errors in printing • Missing information • Dependencies on implementation • Outright errors • Can be a huge effort to recreate
  • 10. Is it enough to make your (software X) script available? It’s vital for good science: • Someone with access to the same software can try to run it, understand it, verify the computational results, build on them, etc. • Opinion: you should always do this in any case
  • 11. Is it enough to make your (software X) code available? It’s vital for good science— • Someone with access to the same software can try to run it, understand it, build on it, etc. • Opinion: you should always do this in any case But it’s still not ideal for communication of scientific results: • What if they don’t have access to that software? • And anyway, how will people find the model? • And how will people be able to relate the model to other work?
  • 12. Different tools different interfaces & languages
  • 13. Communication is better with interoperable data formats
  • 14. General background and motivations Brief summary of SBML features Outline A selection of resources for the SBML-oriented modeler Annotations, connections and semantics Current and upcoming developments in community standards Closing
  • 15. SB ML :a fo lin rs g of ua tw fr ar an e ca
  • 16. SBML = Systems Biology Markup Language Format for representing computational models of biological processes • Data structures + usage principles + serialization to XML Neutral with respect to modeling framework • E.g., ODE, stochastic systems, etc. Development started in 2000, with first specification distributed in 2001
  • 17. The process is central • Called a “reaction” in SBML • Participants are pools of entities (species) Models can further include: • Other constants & variables • Unit definitions • Compartments • Annotations • Explicit math • Discontinuous events Basic SBML concepts are fairly simple
  • 19. Species pools are located in compartments c protein A protein B n gene mRNAn mRNAc
  • 20. Reactions can involve any species anywhere c protein A protein B n gene mRNAn mRNAc
  • 21. Reactions can cross compartment boundaries c protein A protein B n gene mRNAn mRNAc
  • 22. Reaction/process rates can be (almost) arbitrary formulas c protein A f1(x) protein B n f5(x) f2(x) gene f4(x) mRNAn f3(x) mRNAc
  • 23. “Rules”: equations expressing relationships in addition to reaction sys. g1(x) c g2(x) protein A f1(x) protein B . . . n f5(x) f2(x) gene f4(x) mRNAn f3(x) mRNAc
  • 24. “Events”: discontinuous actions triggered by system conditions g1(x) c g2(x) protein A f1(x) protein B . . . n f5(x) f2(x) gene f4(x) mRNAn f3(x) mRNAc Event1: when (...condition...), Event2: when (...condition...), ... do (...assignments...) do (...assignments...)
  • 25. Annotations: machine-readable semantics and links to other resources “This is identified “This is an enzymatic c g1(x)by GO id # ...” reaction with EC # ...” g2(x) . protein A f1(x) protein B . “This is a transport . n into the nucleus ...” “This compartment represents the nucleus ...” f5(x) f2(x) gene f4(x) mRNAn f3(x) mRNAc “This event represents ...” Event1: when (...condition...), Event2: when (...condition...), ... do (...assignments...) do (...assignments...)
  • 26. Today: spatially homogeneous models • Metabolic network models Find BioM exam ples in • Signaling pathway models http: odels Data base • Conductance-based models //bio mod els.ne t/bio • Neural models models • Pharmacokinetic/dynamics models • Infectious diseases Coming: SBML Level 3 packages to support other types • E.g.: Spatially inhomogeneous models, also qualitative/logical Scope of SBML encompasses many types of models
  • 27. Herrgård et al., Nature Biotech., 26:10, 2008 2342 reactions A consensus yeast metabolic network reconstruction © 2008 Nature Publishing Group http://www.nature.com/naturebiotechnology obtained from a community approach to systems biology Markus J Herrgård1,19,20, Neil Swainston2,3,20, Paul Dobson3,4, Warwick B Dunn3,4, K Yalçin Arga5, Mikko Arvas6, Nils Blüthgen3,7, Simon Borger8, Roeland Costenoble9, Matthias Heinemann9, Michael Hucka10, Nicolas Le Novère11, Peter Li2,3, Wolfram Liebermeister8, Monica L Mo1, Ana Paula Oliveira12, Dina Petranovic12,19, Stephen Pettifer2,3, Evangelos Simeonidis3,7, Kieran Smallbone3,13, Irena Spasić2,3, Dieter Weichart3,4, Roger Brent14, David S Broomhead3,13, Hans V Westerhoff 3,7,15, Betül Kırdar5, Merja Penttilä6, Edda Klipp8, Bernhard Ø Palsson1, Uwe Sauer9, Stephen G Oliver3,16, Pedro Mendes2,3,17, Jens Nielsen12,18 & Douglas B Kell*3,4 Genomic data allow the large-scale manual or semi-automated of their parameters. Armed with such information, it is then possible to assembly of metabolic network reconstructions, which provide provide a stochastic or ordinary differential equation model of the entire highly curated organism-specific knowledge bases. Although metabolic network of interest. An attractive feature of metabolism, for the several genome-scale network reconstructions describe purposes of modeling, is that, in contrast to signaling pathways, metabo- Saccharomyces cerevisiae metabolism, they differ in scope lism is subject to direct thermodynamic and (in particular) stoichiometric and content, and use different terminologies to describe the constraints3. Our focus here is on the first two stages of the reconstruction same chemical entities. This makes comparisons between them process, especially as it pertains to the mapping of experimental metabo- difficult and underscores the desirability of a consolidated lomics data onto metabolic network reconstructions. 
metabolic network that collects and formalizes the ‘community Besides being an industrial workhorse for a variety of biotechnological knowledge’ of yeast metabolism. We describe how we have products, S. cerevisiae is a highly developed model organism for biochemi- produced a consensus metabolic network reconstruction cal, genetic, pharmacological and post-genomic studies5. It is especially for S. cerevisiae. In drafting it, we placed special emphasis attractive because of the availability of its genome sequence6, a whole series on referencing molecules to persistent databases or using of bar-coded deletion7,8 and other9 strains, extensive experimental ’omics database-independent forms, such as SMILES or InChI strings, data10–14 and the ability to grow it for extended periods under highly con- as this permits their chemical structure to be represented trolled conditions15. The very active scientific community that works on unambiguously and in a manner that permits automated S. cerevisiae has a history of collaborative research projects that have led to reasoning. The reconstruction is readily available via a publicly substantial advances in our understanding of eukaryotic biology6,8,13,16,17. Model scale & complexity have been increasing Many significant and popular models are in SBML form accessible database and in the Systems Biology Markup Language (http://www.comp-sys-bio.org/yeastnet). It can be maintained as a resource that serves as a common denominator Furthermore, yeast metabolic physiology has been the subject of inten- sive study and most of the components of the yeast metabolic network are relatively well characterized. Taken together, these factors make yeast
  • 28. SBML Level 1 SBML Level 2 SBML Level 3 predefined math functions user-defined functions user-defined functions text-string math notation MathML subset MathML subset reserved namespaces for no reserved namespaces no reserved namespaces annotations for annotations for annotations no controlled annotation RDF-based controlled RDF-based controlled scheme annotation scheme annotation scheme no discrete events discrete events discrete events default values defined default values defined no default values monolithic monolithic modular
  • 29. General background and motivations Brief summary of SBML features Outline A selection of resources for the SBML-oriented modeler Annotations, connections and semantics Current and upcoming developments in community standards Closing
  • 30. You want models? We got models.
  • 31. BioModels Database Stores & serves quantitative models of biological interest • Free, public resource • Models must be described in peer-reviewed publication(s) Hundreds of models are curated by hand Imports & exports models in several formats Figure courtesy of Camille Laibe
  • 33. Contents of BioModels Database Contents today: • 142,000+ pathway models (converted from KEGG) • 400+ hand-curated quantitative models signal transduction 9% metabolic process 3% 3% 25% multicelullar organismal process 5% rhythmic process cell cycle 6% homeostatic process response to stimulus 8% cell death 9% 23% localization others (e.g., developmental process) 9% • 400+ non-curated quantitative models Database data from 2012-08-10
  • 34. How can you check that a given SBML file is valid?
  • 35. The Online SBML Validator
  • 36. The Online SBML Validator Find it here http://sbml.org/Facilities/Validator
  • 37. Where can you find more software?
  • 38. Find software in the SBML Software Guide
  • 39. Find software in the SBML Software Guide Find SBML software
  • 40.
  • 41. Results of 2011 survey of SBML-compatible software Question: Which of the following categories best describe your software? (Check all that apply.) Simulation software 42 Analysis s/w (in addition, or instead of, simulation) 40 Creation/model development software 31 Visualization/display/formatting software 31 Utility software (e.g., format conversion) 23 Data integration and management software 16 Repository or database 14 Framework or library (for use in developing s/w) 13 S/w for interactive env. (e.g., MATLAB, R, ...) 13 Annotation software 11 0 20 40 60 80 Out of 81 responses
  • 42. What about libraries for writing SBML-compatible software?
  • 43. libSBML • Reads, writes, validates SBML • Can check & convert units • Written in portable C++ • Runs on Linux, Mac, Windows • APIs for C, C++, C#, Java, Octave, Perl, Python, R, Ruby, MATLAB • Well-documented API • Open source (LGPL) • http://sbml.org/Software/libSBML
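As a rough illustration of the "reads ... validates" bullet, the sketch below reads a tiny SBML document from a string with libSBML's Python bindings and counts the problems reported while reading. It assumes the `python-libsbml` package is installed and returns `None` when it is not; the model text and the helper name `count_sbml_problems` are made up for this example.

```python
try:
    import libsbml  # Python bindings for libSBML (assumed: python-libsbml package)
except ImportError:
    libsbml = None

# A deliberately tiny SBML Level 3 document, written only for this sketch.
MINIMAL_SBML = """<?xml version="1.0" encoding="UTF-8"?>
<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core" level="3" version="1">
  <model id="example"/>
</sbml>
"""


def count_sbml_problems(sbml_text):
    """Return the number of problems logged while reading, or None without libSBML."""
    if libsbml is None:
        return None
    document = libsbml.readSBMLFromString(sbml_text)
    return document.getNumErrors()


problems = count_sbml_problems(MINIMAL_SBML)
if problems is None:
    print("libSBML is not installed; skipping the check")
else:
    print("problems reported while reading:", problems)
```

Reading only reports problems found while parsing; `document.checkConsistency()` can be called afterwards for fuller validation.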
  • 44. JSBML • Pure Java implementation • API is compatible with libSBML but more Java-like • Functionality is a subset of libSBML • Open source (LGPL) • http://sbml.org/Software/JSBML
  • 45. How can you stay informed of new developments?
  • 46. Resources for news, questions and discussions
  • 47. Resources for news, questions and discussions: front-page news
  • 48. Resources for news, questions and discussions: Twitter & RSS feeds
  • 49. Resources for news, questions and discussions: mailing lists/forums
  • 50. Outline: General background and motivations; Brief summary of SBML features; A selection of resources for the SBML-oriented modeler; Annotations, connections and semantics; Current and upcoming developments in community standards; Closing
  • 51. SBML itself provides syntax and only limited semantics
  • 52. SBML itself provides syntax and only limited semantics • No standard identifiers
  • 53. SBML itself provides syntax and only limited semantics • Low info content • No standard identifiers
  • 54. SBML itself provides syntax and only limited semantics • Low info content • No standard identifiers. Raw models alone are insufficient. Need standard schemes for machine-readable annotations: • Identify entities • Mathematical semantics • Links to other data resources • Authorship & pub. info
  • 55. Annotations at their simplest: element in the model → relationship qualifier (optional) → entity elsewhere (e.g., in a database)
  • 56. Annotations add meaning and connections Annotations can answer questions: • “What exactly is the process represented by equation ‘r17’?” • “What other identities (synonyms) does this entity have?” • “What role does constant ‘k3’ play in equation ‘r17’?” • “What organism are we talking about?” • ... etc. ... Multiple annotations on same entity are common
  • 57. SBML supports two annotation schemes SBO (Systems Biology Ontology) • For mathematical semantics • One SBML object ← one SBO term • Short, compact, tightly coupled but limited scope MIRIAM (Minimum Information Requested In the Annotation of Models) • For any kind of annotation • One SBML object ← multiple MIRIAM annotations • Larger, more free-form, wider scope Both are externalized and independent of SBML
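To make the "one SBO term, multiple MIRIAM annotations" contrast concrete, here is a minimal sketch that reads both kinds of annotation from a single element using only Python's standard library. The reaction fragment is hand-written for illustration: the `sboTerm` value, the `bqbiol:isVersionOf` qualifier and the second resource URI (`example-namespace/12345`) are assumptions, not taken from the slides; only the `ec-code` URI appears in the talk.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Illustrative fragment: one sboTerm attribute, two MIRIAM-style resources.
REACTION_XML = """
<reaction xmlns="http://www.sbml.org/sbml/level2/version4"
          xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
          xmlns:bqbiol="http://biomodels.net/biology-qualifiers/"
          id="r1" metaid="meta_r1" sboTerm="SBO:0000176">
  <annotation>
    <rdf:RDF>
      <rdf:Description rdf:about="#meta_r1">
        <bqbiol:isVersionOf>
          <rdf:Bag>
            <rdf:li rdf:resource="http://identifiers.org/ec-code/1.1.1.1"/>
            <rdf:li rdf:resource="http://identifiers.org/example-namespace/12345"/>
          </rdf:Bag>
        </bqbiol:isVersionOf>
      </rdf:Description>
    </rdf:RDF>
  </annotation>
</reaction>
"""


def summarize_annotations(xml_text):
    """Return (sboTerm, [(qualifier, resource URI), ...]) for one element."""
    element = ET.fromstring(xml_text)
    sbo_term = element.get("sboTerm")  # at most one SBO term per object
    resources = []
    for description in element.iter(f"{{{RDF_NS}}}Description"):
        for qualifier in description:  # e.g. bqbiol:is, bqbiol:isVersionOf
            qualifier_name = qualifier.tag.split("}")[-1]
            for item in qualifier.iter(f"{{{RDF_NS}}}li"):
                resources.append((qualifier_name, item.get(f"{{{RDF_NS}}}resource")))
    return sbo_term, resources


sbo_term, resources = summarize_annotations(REACTION_XML)
print(sbo_term)
for qualifier_name, uri in resources:
    print(qualifier_name, uri)
```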
  • 58. Systems Biology Ontology (SBO) http://biomodels.net/sbo
  • 59.
        <sbml ...>
          ...
          <listOfCompartments>
            <compartment id="cell" size="1e-15" />
          </listOfCompartments>
          <listOfSpecies>
            <species compartment="cell" id="S1" initialAmount="1000" />
            <species compartment="cell" id="S2" initialAmount="0" />
          </listOfSpecies>
          <listOfParameters>
            <parameter id="k" value="0.005" sboTerm="SBO:0000339" />
          </listOfParameters>
          <listOfReactions>
            <reaction id="r1" reversible="false">
              <listOfReactants>
                <speciesReference species="S1" stoichiometry="2" sboTerm="SBO:0000010" />
              </listOfReactants>
              <listOfProducts>
                <speciesReference species="S1" stoichiometry="2" sboTerm="SBO:0000011" />
              </listOfProducts>
              <kineticLaw sboTerm="SBO:0000052">
                <math> ... </math>
          ...
        </sbml>
  • 60. (Same listing, with sboTerm="SBO:0000339" on parameter "k" highlighted.)
  • 61. (Same listing.) SBO:0000339 = “forward bimolecular rate constant, continuous case”
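A minimal sketch of how software might pick up the SBO terms in a listing like the one above, using only Python's standard library. The fragment is a trimmed, well-formed version of the slide's example; the helper name `collect_sbo_terms` and the one-entry label table are assumptions for illustration (the single label is the one quoted on the slide).

```python
import xml.etree.ElementTree as ET

# Trimmed, well-formed version of the slide's listing.
MODEL_FRAGMENT = """
<model>
  <listOfParameters>
    <parameter id="k" value="0.005" sboTerm="SBO:0000339"/>
  </listOfParameters>
  <listOfReactions>
    <reaction id="r1" reversible="false">
      <listOfReactants>
        <speciesReference species="S1" stoichiometry="2" sboTerm="SBO:0000010"/>
      </listOfReactants>
      <kineticLaw sboTerm="SBO:0000052"/>
    </reaction>
  </listOfReactions>
</model>
"""

# Only the label quoted on the slide; a real tool would query the SBO itself.
KNOWN_LABELS = {
    "SBO:0000339": "forward bimolecular rate constant, continuous case",
}


def collect_sbo_terms(xml_text):
    """Return (tag, id-or-species, sboTerm) for every element carrying an sboTerm."""
    root = ET.fromstring(xml_text)
    found = []
    for element in root.iter():  # document order
        term = element.get("sboTerm")
        if term is not None:
            label = element.get("id") or element.get("species") or "(no id)"
            found.append((element.tag, label, term))
    return found


terms = collect_sbo_terms(MODEL_FRAGMENT)
for tag, label, term in terms:
    print(tag, label, term, KNOWN_LABELS.get(term, "(label not in table)"))
```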
  • 62. Software can use SBO terms to help you work with models (e.g., semanticSBML, SBMLsqueezer)
  • 63. MIRIAM (Minimum Information Requested In the Annotation of Models) addresses 2 general areas of annotation needs: (1) requirements for reference correspondence; (2) a scheme for encoding annotations, covering annotations for attributing model creators & sources and annotations for referring to external data resources. MIRIAM is not specific to SBML
  • 65. Annotations for attributing model creators and sources. Goal: permit tracing model’s origins & people involved in its creation. Minimal info required: • Name for the model • Citation for a description of what is being modeled & its author • Contact info for the model creator(s) • Creation date & time • Last modification date & time • Statement of the model’s terms of distribution - Specific terms not mandated, just a statement of the terms
  • 68. Annotations for external references Goal: link model constituents to corresponding entities in bioinformatics resources (e.g., databases, controlled vocabularies) • Supports: - Precise identification of model constituents - Discovery of models that concern the same thing - Comparison of model constituents between different models MIRIAM approach avoids putting data content directly in the model; instead, it points at external resources that contain the knowledge.
  • 70. Why might you care? Salicylic acid (http://www.ebi.ac.uk/chebi) is known by different names: do you want to write all of them into your model?
  • 71. Identifying resources has its own challenges. For linking to data, need: • Globally unique, unambiguous identifiers • ... that are persistent despite resource changes (e.g., changed URLs) • ... that are maintained by the community. Problem: different resources have different identification schemes • E.g.: entity “16480” - In ChEBI: entry 16480 is nitrous oxide - In PubMed: entry 16480 is the 1977 paper “Effect of gallstone-dissolution therapy on human liver structure” - In PubChem: entry 16480 is 1-chloro-4-isothiocyanatobenzene
  • 72. How do we create globally unique identifiers consistently? Long story short: • Create unique resource identifiers (URIs) by combining 2 parts: namespace (identifies a dataset) + entity identifier (identifies a datum within the dataset) • Create registry for namespaces - Allows people & software to use same namespace identifiers • Create service for URI resolution - Allows people & software to take a given resource identifier and figure out what it points to
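The two-part scheme can be sketched in a few lines: an identifier becomes unambiguous once it is paired with a namespace drawn from a shared registry. In the sketch below, the `urn:miriam:` form and the `ec-code` namespace come from the talk's own example; the other namespace strings (`chebi`, `pubmed`, `pubchem`) are illustrative stand-ins and not necessarily the spellings used by the real registry, and `make_uri` is a hypothetical helper.

```python
# Hypothetical stand-in for a namespace registry; only "ec-code" is taken
# from the talk, the other names are illustrative assumptions.
REGISTERED_NAMESPACES = {"ec-code", "chebi", "pubmed", "pubchem"}


def make_uri(namespace, entity, base="urn:miriam"):
    """Combine a registered namespace and an entity identifier into one URI."""
    if namespace not in REGISTERED_NAMESPACES:
        raise ValueError(f"unregistered namespace: {namespace!r}")
    return f"{base}:{namespace}:{entity}"


# The bare identifier "16480" is ambiguous; the namespaced forms are not.
uris = [make_uri(namespace, "16480") for namespace in ("chebi", "pubmed", "pubchem")]
for uri in uris:
    print(uri)
```

Rejecting unregistered namespaces is the point of the registry: everyone composing identifiers draws on the same list of namespace strings.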
  • 73. Resolving resource identifiers MIRIAM Registry supports the creation of globally unique identifiers • Example MIRIAM identifier: urn:miriam:ec-code:1.1.1.1 • Provides various data about the resource, including alternate servers • Provides web services identifiers.org is layered on top of that and provides resolvable URIs • Can type it in a web browser! • Example identifiers.org URI: http://identifiers.org/ec-code/1.1.1.1
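The relationship between the two example identifiers on this slide can be sketched as a plain string conversion. This is only an illustration based on those two examples, not the registry's actual resolution service; the function name `miriam_urn_to_url` is hypothetical, and the percent-decoding step assumes that entity identifiers inside a URN may be percent-encoded.

```python
from urllib.parse import unquote


def miriam_urn_to_url(urn, base="http://identifiers.org"):
    """Convert 'urn:miriam:<namespace>:<entity>' to '<base>/<namespace>/<entity>'."""
    prefix = "urn:miriam:"
    if not urn.startswith(prefix):
        raise ValueError(f"not a MIRIAM URN: {urn!r}")
    namespace, separator, entity = urn[len(prefix):].partition(":")
    if not separator or not namespace or not entity:
        raise ValueError(f"expected namespace and entity in {urn!r}")
    # Assumption: the entity part may be percent-encoded inside the URN.
    return f"{base}/{namespace}/{unquote(entity)}"


url = miriam_urn_to_url("urn:miriam:ec-code:1.1.1.1")
print(url)
```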
  • 74. BioModels Database: example of using the annotations
  • 75. Annotations enable many interesting possibilities (e.g., semanticSBML). Figure courtesy of Wolfram Liebermeister
  • 76. Summary: why care about standard ways of writing annotations? Structured, machine-readable annotations increase your model’s utility • Allow more precise identification of model components - Understand model structure - Search/discover models - Compare models • Add a semantic layer that integrates knowledge into the model - Helps recipients understand the underlying biology - Allows for better reuse of models - Supports conversion of models from one form to another
  • 77. Outline: General background and motivations; Brief summary of SBML features; A selection of resources for the SBML-oriented modeler; Annotations, connections and semantics; Current and upcoming developments in community standards; Closing
  • 78. Major dimensions of a computational model (concept due to Nicolas Le Novère) [Figure with three axes. Model representation level: visual interpretation, biological semantics, mathematical semantics. Model type: discrete, continuous, stochastic, lumped parameters, mean field approximation, state transition. Model life-cycle: model creation, model annotation, model analysis, numerical results.]
  • 79. What about other kinds of models?
  • 80. SBML Level 3: Supporting more categories of models [Diagram: Packages W, X, Y, Z layered on SBML Level 3 Core, with dependencies] • An SBML Level 3 package adds constructs & capabilities • Models declare which packages they use • Applications tell users which packages they support • Package development can be decoupled
  • 81. Level 3 package: what it enables
        - Hierarchical composition: models containing submodels
        - Flux balance constraints: flux balance analysis models
        - Qualitative models: Petri net models, Boolean models
        - Spatial: nonhomogeneous spatial models
        - Multicomponent species: entities with structure & state; rule-based models
        - Graph layout: diagrams of models
        - Graph rendering: diagrams of models
        - Distribution & ranges: nonscalar values
        - Annotations: richer annotation syntax
        - Groups: arbitrary grouping of model components
        - Dynamic structures: creation & destruction of model components
        - Arrays & sets: arrays or sets of entities
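"Models declare which packages they use" can be illustrated with a short sketch. In SBML Level 3, a package is declared on the `<sbml>` element through an XML namespace plus a `required` attribute in that namespace; the sketch below reads those attributes with Python's standard library. The header is hand-written for this example, and the two package namespace URIs follow the usual Level 3 pattern but should be treated as assumptions here; `declared_packages` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Hand-written header for illustration; package URIs are assumed examples.
SBML_HEADER = """<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core"
      xmlns:comp="http://www.sbml.org/sbml/level3/version1/comp/version1"
      xmlns:fbc="http://www.sbml.org/sbml/level3/version1/fbc/version1"
      level="3" version="1" comp:required="true" fbc:required="false">
  <model id="example"/>
</sbml>
"""


def declared_packages(sbml_text):
    """Map each declared package namespace URI to its 'required' flag."""
    root = ET.fromstring(sbml_text)
    packages = {}
    suffix = "}required"
    for name, value in root.attrib.items():
        # ElementTree spells namespaced attributes as "{namespace-uri}local-name".
        if name.startswith("{") and name.endswith(suffix):
            namespace_uri = name[1:-len(suffix)]
            packages[namespace_uri] = (value == "true")
    return packages


packages = declared_packages(SBML_HEADER)
for namespace_uri, required in packages.items():
    print(namespace_uri, "required" if required else "optional")
```

An application could compare this mapping against the packages it supports and warn the user about any unsupported package that is marked as required.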
  • 82. How can we capture the simulation/analysis procedures?
  • 83. Decroly & Goldbeter, PNAS, 1982 ? BIOMD0000000319 in BioModels Database Software can’t read figure legends
  • 84. SED-ML = Simulation Experiment Description ML. Application-independent format to capture procedures, algorithms, parameter values • Neutral format for encoding the steps to go from model to output. Can be used for • Simulation experiments encoding parametrizations & perturbations • Simulations using more than one model • Simulations using more than one method • Data manipulations to produce plot(s). libSedML project developing API library. http://www.biomodels.net/sed-ml
  • 85. What about visual diagrams?
  • 86. Graphical representation of models Today: broad variation in graphical notation used in biological diagrams • Between authors, between journals, even people in same group However, standard notations would offer benefits: • Consistency = easier to read diagrams with less ambiguity • Software support: verification of correctness, translation to math
  • 87. SBGN = Systems Biology Graphical Notation Goal: standardize the graphical notation in diagrams of biological processes • Community-based development, à la SBML Many groups participating 3 sublanguages to describe different facets of a model http://sbgn.org
  • 88. Outline: General background and motivations; Brief summary of SBML features; A selection of resources for the SBML-oriented modeler; Annotations, connections and semantics; Current and upcoming developments in community standards; Closing
  • 89. Attendees at SBML 10th Anniversary Symposium, Edinburgh, 2010 Such standards are the work of a great community
  • 90. Get involved and make things better! COMBINE (Computational Modeling in Biology Network) • SBML, SBGN, BioPAX, SED-ML, CellML, NeuroML http://co.mbine.org Upcoming meeting: August 15–19 in Toronto, Canada • Right before ICSB (International Conference on Systems Biology)
  • 91. URLs: • SBML http://sbml.org • BioModels Database http://biomodels.net/biomodels • COMBINE http://co.mbine.org • identifiers.org http://identifiers.org • MIRIAM http://biomodels.net/miriam • SED-ML http://biomodels.net/sed-ml • SBO http://biomodels.net/sbo • SBGN http://sbgn.org
  • 92. I’d like your feedback! You can use this anonymous form: http://tinyurl.com/mhuckafeedback
  • 93. SBML was made possible thanks to funding from: National Institute of General Medical Sciences (USA); European Molecular Biology Laboratory (EMBL); JST ERATO Kitano Symbiotic Systems Project (Japan) (to 2003); JST ERATO-SORST Program (Japan); ELIXIR (UK); Beckman Institute, Caltech (USA); Keio University (Japan); International Joint Research Program of NEDO (Japan); Japanese Ministry of Agriculture; Japanese Ministry of Educ., Culture, Sports, Science and Tech.; BBSRC (UK); National Science Foundation (USA); DARPA IPTO Bio-SPICE Bio-Computation Program (USA); Air Force Office of Scientific Research (USA); STRI, University of Hertfordshire (UK); Molecular Sciences Institute (USA)