Software for SBML Today
               Michael Hucka, Ph.D.
   Department of Computing + Mathematical Sciences
          California Institute of Technology
                  Pasadena, CA, USA

Email: mhucka@caltech.edu           Twitter: @mhucka


     HARMONY 2012, Maastricht, The Netherlands
SBML = Systems Biology Markup Language
Format for representing computational models of biological processes
 •   Data structures + usage principles + serialization to XML
Neutral with respect to modeling framework
 •   E.g., ODE, stochastic systems, etc.




SBML is a lingua franca for software (not humans)

The process is central
  •   Called a “reaction” in SBML
  •   Participants are pools of entities (species)
Models can further include:
  •   Other constants & variables
  •   Compartments
  •   Explicit math
  •   Discontinuous events
  •   Unit definitions
  •   Annotations

Basic SBML concepts are fairly simple

Some basics of SBML core model encoding

Well-stirred compartments
  [Diagram: two compartments, labeled c and n]

Species pools are located in compartments
  [Diagram: species pools gene, mRNAn, mRNAc, protein A, and protein B placed in compartments c and n]

Reactions can involve any species anywhere
  [Diagram: the same species pools in compartments c and n, now connected by reactions]

Reactions can cross compartment boundaries
  [Diagram: the same network, with reactions linking species in compartment n to species in compartment c]

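The ideas on the preceding slides (compartments, species pools located in compartments, and reactions that may cross compartment boundaries) can be sketched in code. The snippet below is a minimal illustration using only the Python standard library, not a validated SBML file. The element and attribute names are meant to follow SBML Level 3 Core, while the reaction id `mRNA_export`, the helper names, and all sizes and amounts are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

# Namespace of SBML Level 3 Version 1 Core.
SBML_NS = "http://www.sbml.org/sbml/level3/version1/core"


def build_model():
    """Build a tiny SBML-like tree: two compartments, two species, one reaction."""
    sbml = ET.Element("sbml", {"xmlns": SBML_NS, "level": "3", "version": "1"})
    model = ET.SubElement(sbml, "model", {"id": "example"})

    compartments = ET.SubElement(model, "listOfCompartments")
    for cid in ("c", "n"):
        ET.SubElement(compartments, "compartment",
                      {"id": cid, "size": "1", "constant": "true"})

    # Each species pool names the compartment it is located in.
    species = ET.SubElement(model, "listOfSpecies")
    for sid, home in (("mRNAn", "n"), ("mRNAc", "c")):
        ET.SubElement(species, "species", {
            "id": sid, "compartment": home, "initialAmount": "0",
            "hasOnlySubstanceUnits": "true",
            "boundaryCondition": "false", "constant": "false"})

    # One reaction (hypothetical id) moving mRNA from n to c.
    reactions = ET.SubElement(model, "listOfReactions")
    rxn = ET.SubElement(reactions, "reaction",
                        {"id": "mRNA_export", "reversible": "false", "fast": "false"})
    reactants = ET.SubElement(rxn, "listOfReactants")
    ET.SubElement(reactants, "speciesReference",
                  {"species": "mRNAn", "stoichiometry": "1", "constant": "true"})
    products = ET.SubElement(rxn, "listOfProducts")
    ET.SubElement(products, "speciesReference",
                  {"species": "mRNAc", "stoichiometry": "1", "constant": "true"})
    return sbml


def boundary_crossing_reactions(sbml):
    """Return ids of reactions whose participants sit in more than one compartment."""
    home = {s.get("id"): s.get("compartment") for s in sbml.iter("species")}
    crossing = []
    for rxn in sbml.iter("reaction"):
        used = {home[ref.get("species")] for ref in rxn.iter("speciesReference")}
        if len(used) > 1:
            crossing.append(rxn.get("id"))
    return crossing


doc = build_model()
print(ET.tostring(doc, encoding="unicode"))
print(boundary_crossing_reactions(doc))
```

Because a reaction only refers to species by id, nothing in the structure restricts its participants to a single compartment; the helper simply looks up where each participant lives.
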
Reaction/process rates can be (almost) arbitrary formulas
  [Diagram: the same network, with each reaction labeled by a rate function, f1(x) through f5(x)]

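One way to read this slide: each reaction carries its own rate formula, and the net rate of change of a species pool is the sum, over all reactions, of its stoichiometric coefficient times that reaction's rate. The sketch below illustrates this under stated assumptions. The three reactions, their rate laws, and the constants are invented for the example and are not taken from the talk.

```python
# Each reaction maps to (stoichiometry, rate function of the current state).
# Rate laws and constants are hypothetical, chosen only for illustration.
REACTIONS = {
    "transcription": ({"mRNAn": +1}, lambda x: 2.0 * x["gene"]),
    "export": ({"mRNAn": -1, "mRNAc": +1}, lambda x: 0.5 * x["mRNAn"]),
    # A saturating (non-mass-action) formula, to show rates need not be linear.
    "degradation": ({"mRNAc": -1}, lambda x: x["mRNAc"] / (1.0 + x["mRNAc"])),
}


def derivatives(state):
    """Return d(species)/dt for every species, summing stoichiometry * rate."""
    dxdt = {name: 0.0 for name in state}
    for stoichiometry, rate in REACTIONS.values():
        v = rate(state)  # evaluate this reaction's rate formula
        for species, coefficient in stoichiometry.items():
            dxdt[species] += coefficient * v
    return dxdt


state = {"gene": 1.0, "mRNAn": 4.0, "mRNAc": 1.0}
print(derivatives(state))
```

An ODE-based tool would hand a function like `derivatives` to a numerical integrator; a stochastic simulator would instead interpret the same rate formulas as propensities.
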
“Rules”: equations expressing relationships in addition to the reaction system
  [Diagram: the same network, with rules g1(x), g2(x), ... listed beside it]

“Events”: discontinuous actions triggered by system conditions
  [Diagram: the same network, with rules g1(x), g2(x), ... and rate functions f1(x) through f5(x)]
  •   Event1: when (...condition...) do (...assignments...)
  •   Event2: when (...condition...) do (...assignments...)
  •   ...

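The "when (condition) do (assignments)" pattern can be illustrated with a small simulation loop. This is a sketch under stated assumptions, not how any particular SBML simulator works: the model (a single quantity `x` growing at a constant rate), the threshold, and the fixed-step forward Euler scheme are all invented for the example. The event fires only when its condition changes from false to true, and refinements such as delayed execution are ignored.

```python
def simulate(t_end=12.0, dt=0.5):
    """Integrate dx/dt = 1 and reset x to 0 whenever x rises above 5."""
    x = 0.0
    was_true = False   # value of the trigger condition at the previous step
    fired_at = []      # times at which the event fired
    steps = int(round(t_end / dt))
    for i in range(1, steps + 1):
        x += 1.0 * dt              # continuous part (forward Euler)
        t = i * dt
        is_true = x > 5.0          # the event's condition
        if is_true and not was_true:
            x = 0.0                # the event's assignment
            fired_at.append(t)
            is_true = x > 5.0      # re-evaluate after the assignment
        was_true = is_true
    return x, fired_at


final_x, event_times = simulate()
print(final_x, event_times)
```

The assignment produces a jump in `x` that no rate formula could express, which is the sense in which events are discontinuous.
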
Annotations: machine-readable semantics and links to other resources
  [Diagram: the same network, with example annotations attached to its elements]
  •   “This is identified by GO id # ...”
  •   “This is an enzymatic reaction with EC # ...”
  •   “This is a transport into the nucleus ...”
  •   “This compartment represents the nucleus ...”
  •   “This event represents ...”

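A machine-readable annotation of this kind can be thought of as a triple: the model element, a qualifier describing the relationship, and a link to an external record. The sketch below shows that shape as a plain Python data structure rather than the RDF that SBML files actually embed. The element ids `n` and `f1`, the choice of hexokinase as the example enzyme, and the exact URI forms are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    element_id: str  # id of the annotated model element
    qualifier: str   # relationship between element and record
    resource: str    # URI of the external record


# Hypothetical annotations; GO:0005634 is the Gene Ontology term "nucleus"
# and EC 2.7.1.1 is hexokinase.
ANNOTATIONS = [
    Annotation("n", "bqbiol:is", "https://identifiers.org/GO:0005634"),
    Annotation("f1", "bqbiol:isVersionOf", "https://identifiers.org/ec-code/2.7.1.1"),
]


def resources_for(element_id):
    """Return the external records linked to one model element."""
    return [a.resource for a in ANNOTATIONS if a.element_id == element_id]


print(resources_for("n"))
```

Software that understands such links can tell that two differently named compartments in two models both represent the nucleus, without relying on the names themselves.
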
Scope of SBML encompasses many types of models
Today: spatially homogeneous models
  •   Metabolic network models
  •   Signaling pathway models
  •   Conductance-based models
  •   Neural models
  •   Pharmacokinetic/dynamics models
  •   Infectious diseases
  Find examples in BioModels Database: http://biomodels.net/biomodels

Coming: SBML Level 3 packages to support other types
  •   E.g.: spatially inhomogeneous models, and qualitative/logical models

Where to learn more: SBML.org, the SBML portal
  •   Find SBML software

SBML Software Guide, with different views (same data)
How did we gather data on the software tools?
Historically (until the mid-2000s):
 •   Word of mouth at workshops & conferences
 •   Direct contact
Mid/late 2000s to ~2010:
 •   Created an electronic survey
 •   Citation alerts (e.g., Web of Science)
2011:
 •   Expanded survey
     -   Basis of this talk

New version of the SBML software survey




General features of the survey
Online, implemented using a commercial survey website
28 questions
 •   Mix of multiple choice and fill-in-the-blank
85 responses by July 2011
 •   Removed incomplete responses
 •   81 software tools left
Avoided “corrections” to data

Purposes of the software systems
   Question: Which of the following categories best describe your software?
   (Check all that apply.)

   Category                                                Number of tools
   Simulation software                                     42
   Analysis s/w (in addition, or instead of, simulation)   40
   Creation/model development software                     31
   Visualization/display/formatting software               31
   Utility software (e.g., format conversion)              23
   Data integration and management software                16
   Repository or database                                  14   (marked “?” on the slide)
   Framework or library (for use in developing s/w)        13
   S/w for interactive env. (e.g., MATLAB, R, ...)         13
   Annotation software                                     11   (marked “Low” on the slide)

   [Bar chart: total number of software tools, axis 0 to 80, with 1/4, 1/2, and 3/4 marks]

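To relate these counts to the 81 tools that remained after incomplete responses were removed, each count can be turned into a percentage. The sketch below does that arithmetic; the counts come from the chart above, while the variable and function names are made up for this example.

```python
TOTAL_TOOLS = 81  # tools left after incomplete responses were removed

PURPOSES = {
    "Simulation software": 42,
    "Analysis s/w": 40,
    "Creation/model development software": 31,
    "Visualization/display/formatting software": 31,
    "Utility software": 23,
    "Data integration and management software": 16,
    "Repository or database": 14,
    "Framework or library": 13,
    "S/w for interactive env.": 13,
    "Annotation software": 11,
}


def share(count, total=TOTAL_TOOLS):
    """Percentage of all surveyed tools, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


for name, count in PURPOSES.items():
    print(f"{name:<45}{count:>3}  {share(count):5.1f}%")

# Respondents could check several categories, so the counts sum past 81.
print("Sum of category counts:", sum(PURPOSES.values()))
```

The category counts add up to far more than 81 because the question allowed multiple selections, so the percentages are not expected to sum to 100.
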
Mathematical frameworks
Question: Regardless of whether your software provides simulation
capabilities, what modeling frameworks does the package support when
working with SBML files?

   Framework                                 Number of tools
   Ordinary differential equations (ODE)     54
   Discrete stochastic simulation            28
   Discontinuous event handling              25
   Differential-algebraic equations (DAE)    17
   Logical/Boolean networks                  11
   Delay-differential equations (DDE)         9
   Partial differential equations (PDE)       8
   None of the above, or other framework     20   (e.g., FBA)

   [Bar chart: total number of software tools, axis 0 to 80]

SBML-specific characteristics
 Question: Which features of SBML can your software recognize and act on?

   Feature                                                Number of tools
   Species, reactions, parameters, and/or compartments    65
   Work with reaction kinetics                            48
   Work with stoichiometric relationships/maps            46
   Work with other mathematical relationships             32
   Work with conditional discontinuous events             27
   Work with time delays                                  10
   Other, or not applicable                               14

   [Bar chart: total number of software tools, axis 0 to 80]

Other supported standards
Question: Which other standards does your software support?

   Standard   Number of tools
   MIRIAM     16
   SBO        14
   SBGN       13
   BioPAX      6
   CellML      3
   SED-ML      3
   MFAML       1
   PNML        1
   SBOL        1

   [Bar chart: total # software tools supporting other standards, axis 0 to 20]
   (Warning: different scale)

Operating systems supported by the 81 tools

   Platform             Total   Only this
   Microsoft Windows    69      8
   Apple Mac OS         64      0
   Linux                58      0
   Web browser          26      7

   [Bar chart: number of software tools, axis 0 to 80]

Availability of software

   Fees for academics:          Free 98%, Fee-based 2%
   Fees for non-academics:      Free 90%, Fee-based 10%
   Is source code available?    Code available 79%, Not available 21%

Final impressions
Some pleasing results
 •   Large variety, including tools with features SBML can’t yet represent
     -   Hopefully stands as testament to SBML’s utility
 •   Nearly 80% are open source
Some disappointing results
 •   Low response turnout: 85 vs 230 tools in matrix
 •   Low support for MIRIAM
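
The figures behind the two disappointing results can be made explicit with a little arithmetic. This sketch uses only numbers stated in the talk (85 responses, 230 tools in the matrix, 81 tools retained, 16 tools supporting MIRIAM); treating each response as corresponding to one listed tool is an assumption, and the variable names are made up for the example.

```python
responses = 85        # survey responses received by July 2011
listed_tools = 230    # tools listed in the software matrix
retained = 81         # tools left after removing incomplete responses
miriam_tools = 16     # tools reporting MIRIAM support

# Assumes one response per listed tool.
response_rate = 100.0 * responses / listed_tools
miriam_share = 100.0 * miriam_tools / retained

print(f"Response rate:  {response_rate:.1f}% of listed tools")
print(f"MIRIAM support: {miriam_share:.1f}% of retained tools")
```
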




Funding acknowledgments
  National Institute of General Medical Sciences (USA)
  European Molecular Biology Laboratory (EMBL)
  ELIXIR (UK)
  Beckman Institute, Caltech (USA)
  Keio University (Japan)
  JST ERATO Kitano Symbiotic Systems Project (Japan) (to 2003)
  JST ERATO-SORST Program (Japan)
  International Joint Research Program of NEDO (Japan)
  Japanese Ministry of Agriculture
  Japanese Ministry of Educ., Culture, Sports, Science and Tech.
  BBSRC (UK)
  National Science Foundation (USA)
  DARPA IPTO Bio-SPICE Bio-Computation Program (USA)
  Air Force Office of Scientific Research (USA)
  STRI, University of Hertfordshire (UK)
  Molecular Sciences Institute (USA)

Attendees at SBML 10th Anniversary Symposium, Edinburgh, 2010

A huge thank you to the community