SBML (the Systems Biology Markup Language)

                         Michael Hucka, Ph.D.
                          (On behalf of many people)

                         California Institute of Technology
                             Pasadena, California, USA
Tuesday, July 26, 2011                                        1
Roadmap




                         What is SBML?
                         What is the SBML community like today, and how did it get there?
                         Acknowledgments




SBML




Subject matter: computational modeling

                                                       Data



                                        Experiments




                                                      Models



     Focus on mechanistic, computational models
        •      Preferably not statistical or curve-fitting models, but dynamical
               models expressing hypothesized physical & chemical mechanisms
               -         Equations refer to identifiable processes
               -         Parameters have physical interpretations



Example: Tyson 1991 model of cell cycle




Goal: reproducible and reusable models and simulations
To achieve that, need effective means for sharing models
     Not enough simply to publish lists of equations!
     Need a software-independent format
        •      No single package answers all needs
        •      New techniques (→ new tools) are developed continuously
        •      Different packages have different niche strengths
               -         Strengths are often complementary
     Need to capture both

        •      Mathematical content of a model

        •      Semantic content of a model




SBML = Systems Biology Markup Language

     Format for representing quantitative models
        •      Defines object model + rules for its use
               -         Serialized to XML
     Neutral with respect to modeling framework
        •      ODE vs. stochastic vs. ...
     A lingua franca for software
     But: not a procedural description




Basic SBML concepts are simple

     The reaction is central: a process occurring at a given rate

        •      Participants are pools of entities (species)
                               $n_a A + n_b B \xrightarrow{f([A],[B],[P],\ldots)} n_p P$

                               $n_c C \xrightarrow{f(\ldots)} n_d D + n_e E + n_f F$

                               $\vdots$
     Models can further include:
        •      Other constants & variables               •   Unit definitions
        •      Compartments                              •   Annotations
        •      Explicit math
        •      Discontinuous events
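
     One possible reading of the reaction scheme, sketched under an ODE
     interpretation (SBML itself is neutral about the framework): each reaction
     contributes its rate, weighted by stoichiometry, to the rate of change of every
     participant. The `Reaction` class, the mass-action rate law and the constant
     k = 0.5 are illustrative assumptions, not part of SBML.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Reaction:
    reactants: Dict[str, float]                 # species id -> stoichiometry
    products: Dict[str, float]                  # species id -> stoichiometry
    rate: Callable[[Dict[str, float]], float]   # f([A],[B],...) -> reaction rate


def derivatives(reactions: List[Reaction], conc: Dict[str, float]) -> Dict[str, float]:
    """Sum stoichiometry-weighted reaction rates into d[X]/dt for each species."""
    dxdt = {species: 0.0 for species in conc}
    for r in reactions:
        v = r.rate(conc)
        for species, n in r.reactants.items():
            dxdt[species] -= n * v   # reactants are consumed
        for species, n in r.products.items():
            dxdt[species] += n * v   # products are produced
    return dxdt


# A + B -> P with an assumed mass-action rate k*[A]*[B], k = 0.5 (hypothetical).
k = 0.5
reaction = Reaction(
    reactants={"A": 1.0, "B": 1.0},
    products={"P": 1.0},
    rate=lambda c: k * c["A"] * c["B"],
)
state = {"A": 2.0, "B": 3.0, "P": 0.0}
print(derivatives([reaction], state))
```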

     Species (the pools of entities) can be anything conceptually compatible
Example of model type                                   Example model (BioModels Database)

     Signaling pathway models                           #BIOMD0000000153

     Conductance-based models                           #BIOMD0000000020
        •      “Rate rules” for temporal evolution
               of quantitative parameters

     Neural models                                      #BIOMD0000000127
        •      “Events” for discontinuous changes
               in quantitative parameters

     Pharmacokinetic/dynamics models                    #BIOMD0000000234
        •      “Species” is not required to be a
               biochemical entity

     Infectious diseases                                #MODEL1008060001

     Scope of SBML is not limited to one kind of model
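
     A rough sketch of how a "rate rule" and an "event" might combine in a simulator:
     a quantity evolves continuously under dv/dt = drive, and an event resets it when
     a trigger condition switches from false to true. The Euler stepping and all
     parameter values are illustrative assumptions; delays, priorities and other
     details of SBML event semantics are left out.

```python
from typing import List, Tuple


def simulate(t_end: float = 2.0, dt: float = 0.001, drive: float = 1.5,
             threshold: float = 1.0, reset: float = 0.0) -> Tuple[float, List[float]]:
    """Integrate dv/dt = drive; reset v each time the trigger becomes true."""
    v = 0.0
    fired_at: List[float] = []
    was_true = False
    steps = int(round(t_end / dt))
    for i in range(steps):
        v += drive * dt                  # rate rule: continuous evolution (Euler step)
        is_true = v > threshold          # event trigger condition
        if is_true and not was_true:     # fire only on a false -> true transition
            v = reset                    # event assignment: discontinuous change
            fired_at.append((i + 1) * dt)
            is_true = v > threshold      # re-evaluate the trigger after the reset
        was_true = is_true
    return v, fired_at


final_v, event_times = simulate()
print(final_v, event_times)
```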
300+ curated & annotated models in BioModels Database




SBML Software Guide




                         Find SBML software




Number of software systems supporting SBML

     [Chart: number of software systems supporting SBML per year, 2001 to 2011,
     counted in the middle of each year; vertical axis from 0 to 300]

        •      229 as of July 14

     http://sbml.org/SBML_Software_Guide
libSBML
     Reads, writes, validates SBML
            •      Hundreds of rules for helping to ensure correct SBML
     Unit checking & conversion
     Well-tested
     Core is written in portable C++
     Runs on Linux, Mac, Windows
     APIs for C, C++, C#, Java, Octave, Perl, Python, Ruby, MATLAB (some via SWIG)
     Can use Expat, libxml2, or Xerces
     Open-source under LGPL

     Latest stable version: 5.0.0
     http://sbml.org/Software/libSBML

     Developed by Sarah Keating, Frank Bergmann, Ben Bornstein, Akiya Jouraku,
     & Mike Hucka, with substantial contributions from many other people
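
     A hedged sketch of the read/validate cycle through libSBML's Python API. It
     assumes the `libsbml` module is installed (the import is guarded so the snippet
     still runs without it); the embedded model is a hypothetical toy document, and
     the exact diagnostics reported will depend on the libSBML version in use.

```python
try:
    import libsbml  # Python bindings for libSBML; may not be installed
except ImportError:
    libsbml = None

# Hypothetical toy model used only for illustration.
SBML_TEXT = """<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core" level="3" version="1">
  <model id="toy_model">
    <listOfCompartments>
      <compartment id="cell" constant="true"/>
    </listOfCompartments>
    <listOfSpecies>
      <species id="A" compartment="cell" hasOnlySubstanceUnits="false"
               boundaryCondition="false" constant="false"/>
      <species id="P" compartment="cell" hasOnlySubstanceUnits="false"
               boundaryCondition="false" constant="false"/>
    </listOfSpecies>
  </model>
</sbml>
"""


def summarize(sbml_text):
    """Read SBML from a string, run consistency checks, and summarize the result.

    Returns None when libSBML is not available.
    """
    if libsbml is None:
        return None
    doc = libsbml.readSBMLFromString(sbml_text)
    doc.checkConsistency()              # runs the validation rules, logging any failures
    model = doc.getModel()
    species = []
    if model is not None:
        species = [model.getSpecies(i).getId() for i in range(model.getNumSpecies())]
    return {
        "level": doc.getLevel(),
        "version": doc.getVersion(),
        "species": species,
        "logged_issues": doc.getNumErrors(),  # entries in the log, warnings included
    }


print(summarize(SBML_TEXT))
```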
Current state of SBML specifications

     Specification document available from http://sbml.org/Documents

     Newest: Level 3 Version 1 Core
             •      Oct. 2010


     About SBML “Levels”:
     •     Levels help manage significant restructuring of SBML architecture
     •     Levels coexist
            -     E.g., Level 2 models will remain valid and exist for a long time
     •     A Level is not solely a vertical change (i.e., more features)—there is
           horizontal change too (i.e., changes to existing elements)
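
     Because Levels coexist, software typically has to check which Level and Version
     a document declares before interpreting it. A minimal sketch using only the
     Python standard library; the two one-line documents are hypothetical examples.

```python
import xml.etree.ElementTree as ET
from typing import Tuple


def sbml_level_version(xml_text: str) -> Tuple[int, int]:
    """Return (level, version) as declared on the root <sbml> element."""
    root = ET.fromstring(xml_text)
    local_name = root.tag.rsplit("}", 1)[-1]   # strip the "{namespace}" prefix
    if local_name != "sbml":
        raise ValueError("root element is not <sbml>")
    level, version = root.get("level"), root.get("version")
    if level is None or version is None:
        raise ValueError("missing level or version attribute")
    return int(level), int(version)


L2_DOC = ('<sbml xmlns="http://www.sbml.org/sbml/level2/version4" '
          'level="2" version="4"><model id="m"/></sbml>')
L3_DOC = ('<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core" '
          'level="3" version="1"><model id="m"/></sbml>')

print(sbml_level_version(L2_DOC))
print(sbml_level_version(L3_DOC))
```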


What is the SBML community like today, and how did it get there?
What happened in the beginning?
     Circa 2000: Hamid Bolouri contacted groups having relevant software tools

        •      Organized workshops & set goal: develop interoperability

        •      Funding from Japanese agency JST (via Hiroaki Kitano & John Doyle)
        •      3 core developers worked on software infrastructure at Caltech
     Early years: focus on software infrastructure (SBW)

        •      SBML was a component, but not sole (or even primary) focus
     Eventually: SBML turned out to be more popular

        •      2 core developers remained (Finney & Hucka), focused on SBML
        •      More groups/software supported SBML

        •      Original dev. process was ad hoc, but involved constant feedback
               -         Hosted biannual workshops where intense discussions were held

What happened when SBML gained users?
     Implemented editorial board
        •      Bootstrapped with heavily-involved people (Hucka, Finney, Le Novère)
               -         After that, turned to community-based elections
        •      Editors are volunteers, serve for limited terms
     Implemented electronic polling for major decisions & voting
     Continued biannual meetings

        •      Split into forum meetings and hackathons
     Developed a somewhat more formal process

        •      http://sbml.org/Documents/SBML_Development_Process




SBML’s scope is widening to support more types of models


                                     Package Z

                         Package X                   Package Y

                                      SBML Level 3 Core

     SBML Level 3 is designed around concept of modular additions
        •      A package adds constructs & capabilities
     Models declare which packages they use

        •      Applications tell users which packages they support
     Package development can be decoupled
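
     A sketch of how "models declare which packages they use" can look in practice.
     In Level 3, a package is declared on the root element through its XML namespace
     together with a `required` attribute in that namespace. The package namespaces
     below (for "Package X" and "Package Y") are hypothetical placeholders, and the
     `can_open` policy is one possible application behavior, not a rule from the
     specification.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Set

# Hypothetical package namespaces standing in for "Package X" and "Package Y".
PKG_X = "http://example.org/sbml/level3/version1/pkgx/version1"
PKG_Y = "http://example.org/sbml/level3/version1/pkgy/version1"

DOC = (
    '<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core" '
    f'xmlns:pkgx="{PKG_X}" xmlns:pkgy="{PKG_Y}" '
    'level="3" version="1" pkgx:required="true" pkgy:required="false">'
    '<model id="m"/></sbml>'
)


def declared_packages(xml_text: str) -> Dict[str, bool]:
    """Map each declared package namespace URI to its `required` flag."""
    root = ET.fromstring(xml_text)
    suffix = "}required"
    packages = {}
    for name, value in root.attrib.items():
        # ElementTree exposes namespaced attributes as "{uri}localname".
        if name.startswith("{") and name.endswith(suffix):
            packages[name[1:-len(suffix)]] = (value == "true")
    return packages


def can_open(xml_text: str, supported: Set[str]) -> bool:
    """Accept a model unless it requires a package the application lacks."""
    return all(uri in supported
               for uri, required in declared_packages(xml_text).items()
               if required)


print(declared_packages(DOC))
print(can_open(DOC, {PKG_X}), can_open(DOC, set()))
```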

What’s happening now?
     SBML Level 3 Package specification & software development is ongoing
     Creation of COMBINE: Computational Modeling in Biology Network
        •      Goal: coordinate development of interoperable, non-overlapping
               standards covering all aspects of modeling in biology
        •      http://co.mbine.org/




Standards emerging for related but out-of-scope areas

                                   Model          Procedures     Results

     Representation format                                       SBRML
     Minimal info requirements                                   ?
     Semantics: mathematical
     Semantics: other              annotations    annotations    annotations

     [Contents of the remaining cells are not recoverable from the extracted text]
Some lessons about what we think we got right
     Start with actual stakeholders

        •      Address real needs, not perceived ones
     Don’t include the kitchen sink
        •      Smaller & simpler → easier to understand, describe, implement
     Provide transparent & inclusive process
        •      Critical to legitimacy—people must see their ideas being considered
     Engage people, constantly, in many ways

        •      Not just electronic forums, email, etc., but face-to-face
        •      Not getting responses? Find a new approach!
     Have independent leaders/organizers/shepherds

        •      Avoid the appearance of bias or agenda

Some lessons about what we definitely got wrong
     Inadequate testing before freezing/releasing
     Not managing complexity creep
          •      Feature changes between SBML versions make support harder
     Not formalizing the process sufficiently

          •      Need “Requests for Comments” procedures, voting procedures, etc.
          •      Only put most of this in place in recent years
     Underestimating how much time it takes to do everything

          •       Also: democratic, open processes move slowly




Roadmap




                         What is SBML?
                         What is the SBML community like today, and how did it get there?
                         Acknowledgments




People on SBML Team & BioModels Team

     SBML Team                     BioModels.net Team            Visionaries
     Michael Hucka                 Nicolas Le Novère             Hiroaki Kitano
     Sarah Keating                 Camille Laibe                 John Doyle
     Frank Bergmann                Nicolas Rodriguez
     Lucian Smith                  Nick Juty
     Nicolas Rodriguez             Vijayalakshmi Chelliah
     Linda Taddeo                  Michael Schubert
     Akiya Jouraku                 Lukas Endler
     Akira Funahashi               Chen Li
     Kimberley Begley              Harish Dharuri
     Bruce Shapiro                 Lu Li
     Andrew Finney                 Enuo He
     Ben Bornstein                 Mélanie Courtot
     Ben Kovitz                    Alexander Broicher
     Hamid Bolouri                 Arnaud Henry
     Herbert Sauro                 Marco Donizelli
     Jo Matthews
     Maria Schilstra
Agencies to thank for supporting SBML & BioModels.net

   National Institute of General Medical Sciences (USA)
   European Molecular Biology Laboratory (EMBL)
   ELIXIR (UK)
   Beckman Institute, Caltech (USA)
   Keio University (Japan)
   JST ERATO Kitano Symbiotic Systems Project (Japan) (to 2003)
   National Science Foundation (USA)
   International Joint Research Program of NEDO (Japan)
   JST ERATO-SORST Program (Japan)
   Japanese Ministry of Agriculture
   Japanese Ministry of Educ., Culture, Sports, Science and Tech.
   BBSRC (UK)
   DARPA IPTO Bio-SPICE Bio-Computation Program (USA)
   Air Force Office of Scientific Research (USA)
   STRI, University of Hertfordshire (UK)
   Molecular Sciences Institute (USA)
Where to find out more

     SBML                   http://sbml.org
     libSBML & JSBML        http://sbml.org/Software
     BioModels Database     http://biomodels.net/biomodels
     MIRIAM                 http://biomodels.net/miriam
     SED-ML                 http://biomodels.net/sed-ml
     SBO                    http://biomodels.net/sbo
     KiSAO                  http://www.ebi.ac.uk/compneur-srv/kisao/
     TEDDY                  http://www.ebi.ac.uk/compneur-srv/teddy/

     Thank you for your time!
Extra slides




Evolution of features took time & practical experience

     Level 1                              Level 2                                 Level 3
     predefined math functions            user-defined functions                  user-defined functions
     text-string math notation            MathML subset                           MathML subset
     reserved namespaces for annotations  no reserved namespaces for annotations  no reserved namespaces for annotations
     no controlled annotation scheme      RDF-based controlled annotation scheme  RDF-based controlled annotation scheme
     no discrete events                   discrete events                         discrete events
     default values defined               default values defined                  no default values
     monolithic                           monolithic                              modular
Level 3 package            Active?   libSBML 5 implementation?
Graph layout                 ✓
Groups                       ✓
Spatial                      ✓
Flux balance constraints     ✓
Hierarchical composition     ✓             (in progress)
Multicomponent species       ✓
Annotations                  ✓
Graph rendering              ✓
Distribution & ranges        ✓
Qualitative models           ✓
Dynamic structures
Arrays & sets

MIRIAM cross-references are simple triples

     Model element → (relationship qualifier, optional) → Entity referenced

 Format: { Data type identifier, Data item identifier, Annotation qualifier }

        •      Data type identifier (required): URI chosen from agreed-upon list
        •      Data item identifier (required): syntax & value space depends on data type
        •      Annotation qualifier (optional): controlled vocabulary term
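
     The triple structure could be modeled in code roughly as follows. This is only a
     sketch: the `CrossReference` type, the helper function and the placeholder
     identifier values are hypothetical, and no check is made against the agreed-upon
     list of data type URIs or the controlled vocabulary of qualifiers.

```python
from typing import NamedTuple, Optional


class CrossReference(NamedTuple):
    data_type: str                   # required: URI identifying the data type
    data_item: str                   # required: identifier within that data type
    qualifier: Optional[str] = None  # optional: controlled-vocabulary term


def make_cross_reference(data_type: str, data_item: str,
                         qualifier: Optional[str] = None) -> CrossReference:
    """Build a triple, enforcing that the two required parts are present."""
    if not data_type or not data_item:
        raise ValueError("data type and data item identifiers are both required")
    return CrossReference(data_type, data_item, qualifier)


# Placeholder values, for illustration only.
plain = make_cross_reference("urn:example:datatype", "ITEM-0001")
qualified = make_cross_reference("urn:example:datatype", "ITEM-0001", "is")
print(plain)
print(qualified)
```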
