Modelling and Visualising Biological Systems - Falk Schreiber
 


Two topics will be discussed in this tutorial: (1) constraint-based modelling of metabolic systems using Flux Balance Analysis (FBA) and (2) standardised visual representation of cellular processes and biological networks using the Systems Biology Graphical Notation (SBGN).

Usage Rights

© All Rights Reserved


    Presentation Transcript

    • Modelling and Visualising Biological Systems Falk Schreiber Institute of Computer Science Martin Luther University Halle-Wittenberg Bioinformatics IPK Gatersleben
    • IPK Gatersleben & MLU Halle-Wittenberg
    • Outline 1. Modelling metabolism - Basics - Constraint-based modelling: FBA - Mathematical representation - Application of constraints - Example - Resources and tools 2. Visualising models and networks - Basics - Standard graphical representation - Process Description Language - Resources and tools
    • Metabolic Modelling - Comprises the reconstruction, simulation, and analysis of metabolic models - A metabolic model consists of a list of reactions and associated properties, assumed to be present in the system under investigation, along with a description of the environment within which the biological system is assumed to reside - Provides a basis for system-level analysis of metabolism for different organisms Source: http://www.hydroponicist.com/pages/p69-oxygen-air.htm
    • Methods in Metabolic Modelling (figure axes: model size, model details) - Topology: only network structure - Petri Nets: + stoichiometric constraints + thermodynamics - Flux Balance: + mass balance + capacity constraints - Kinetic: + kinetic rate laws + kinetic parameters + metabolite concentrations
    • Flux Balance Analysis - Constraint-based stoichiometric modelling approach to predict and analyse the metabolic steady state conversion rates (fluxes) - Advantages: no kinetic parameters required; quantitative predictions; applicable to large systems - Applications: prediction of optimal metabolic yields and flux distributions; prediction of phenotype/viability of knockout mutants; prediction of pathway redundancies; …
    • History of FBA
    • Principles of Flux Balance Analysis
    • Simulation: oxygen level
    • Reaction Network Formalism - R1: $A \xrightarrow{v_1} B$ - R2: $A \xrightarrow{v_2} C$ - R3: $B \xrightarrow{v_3} C$ - R4: $A_{ext} \xrightarrow{b_1} A$ - R5: $C \xrightarrow{b_2} C_{plast}$ - R6: $B \xrightarrow{b_3} B_{plast}$ - b: exchange fluxes, v: internal fluxes
    • Stoichiometric Matrix for the reactions R1: $A \xrightarrow{v_1} B$, R2: $A \xrightarrow{v_2} C$, R3: $B \xrightarrow{v_3} C$, R4: $A_{ext} \xrightarrow{b_1} A$, R5: $C \xrightarrow{b_2} C_{plast}$, R6: $B \xrightarrow{b_3} B_{plast}$ (rows: metabolites, columns: reactions; negative entries mark consumption, positive entries production)
           R1   R2   R3   R4   R5   R6
      A    -1   -1    0   +1    0    0
      B    +1    0   -1    0    0   -1
      C     0   +1   +1    0   -1    0
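As an illustration of how such a matrix can be derived mechanically from the reaction list, here is a minimal Python sketch. The dictionary layout and the function name `stoichiometric_matrix` are assumptions made for this example and do not come from the slides; the sign convention (consumption negative, production positive) is the usual one and is also assumed here.

```python
# Toy network from the slide. External pools (Aext, Cplast, Bplast) lie
# outside the system boundary, so they get no row in S.
REACTIONS = {
    "R1": {"A": -1, "B": +1},  # A -> B       (internal flux v1)
    "R2": {"A": -1, "C": +1},  # A -> C       (internal flux v2)
    "R3": {"B": -1, "C": +1},  # B -> C       (internal flux v3)
    "R4": {"A": +1},           # Aext -> A    (exchange flux b1)
    "R5": {"C": -1},           # C -> Cplast  (exchange flux b2)
    "R6": {"B": -1},           # B -> Bplast  (exchange flux b3)
}
METABOLITES = ["A", "B", "C"]


def stoichiometric_matrix(reactions, metabolites):
    """Return S as a list of rows: one row per metabolite, one column per reaction."""
    return [[stoich.get(met, 0) for stoich in reactions.values()]
            for met in metabolites]


S = stoichiometric_matrix(REACTIONS, METABOLITES)
for met, row in zip(METABOLITES, S):
    print(met, row)
```

Keeping the reactions as a mapping from metabolite to coefficient makes it easy to add or remove reactions without editing the matrix by hand.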
    • Dynamic Mass Balance (b: exchange fluxes, v: internal fluxes) - Mass balance equations in matrix form, with stoichiometric matrix S and flux vector v: $\frac{dM}{dt} = S \cdot v$
    • Steady State - Steady state assumption: $\frac{dM}{dt} = 0$, thus the steady state mass balance is $S \cdot v = 0$
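The steady state condition can be checked numerically for a candidate flux vector. The sketch below assumes the toy network from the earlier slides with the usual sign convention (consumption negative, production positive); the two flux vectors are made-up illustrative numbers, not values from the tutorial.

```python
# Stoichiometric matrix of the toy network; columns ordered (v1, v2, v3, b1, b2, b3).
S = [
    [-1, -1,  0, 1,  0,  0],  # A
    [ 1,  0, -1, 0,  0, -1],  # B
    [ 0,  1,  1, 0, -1,  0],  # C
]


def mass_balance(S, v):
    """Return dM/dt = S . v, one entry per metabolite."""
    return [sum(s_ij * v_j for s_ij, v_j in zip(row, v)) for row in S]


def is_steady_state(S, v, tol=1e-9):
    """True if every metabolite is produced exactly as fast as it is consumed."""
    return all(abs(rate) <= tol for rate in mass_balance(S, v))


balanced = [4, 6, 3, 10, 9, 1]    # hypothetical fluxes satisfying S . v = 0
unbalanced = [4, 6, 3, 10, 5, 1]  # export of C too low, so C accumulates

print(mass_balance(S, balanced), is_steady_state(S, balanced))
print(mass_balance(S, unbalanced), is_steady_state(S, unbalanced))
```

In the second vector only the export of C was changed, so the imbalance shows up in the row for C alone.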
    • Metabolic Modelling: Constraints (figure: feasible solution space spanned by FluxB and FluxC) - Mass balance: $\frac{dM}{dt} = S \cdot v = 0$ - Thermodynamic: directionality of reaction, $0 \le v_i \le \infty$ - Capacity: enzymatic capacity, nutrient availability, $\alpha_i \le v_i \le \beta_i$
    • Metabolic Modelling: Optimization (figure: feasible solution space spanned by FluxB and FluxC, with the direction of increasing Z and the optimal solution marked) - Optimization problem: maximize/minimize Z - Solved using linear programming
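To make the linear programming step concrete, here is a minimal sketch of an FBA problem for the toy network, solved with `scipy.optimize.linprog`. The choice of objective (maximise the export flux b2), the uptake limit of 10 on b1, and the irreversibility of all reactions are assumptions made for illustration only; the slides do not specify them.

```python
import numpy as np
from scipy.optimize import linprog

# Toy network; columns ordered (v1, v2, v3, b1, b2, b3), rows A, B, C.
FLUXES = ["v1", "v2", "v3", "b1", "b2", "b3"]
S = np.array([
    [-1, -1,  0, 1,  0,  0],
    [ 1,  0, -1, 0,  0, -1],
    [ 0,  1,  1, 0, -1,  0],
], dtype=float)

# Objective: maximise b2. linprog minimises, so the coefficient is negated.
c = np.zeros(len(FLUXES))
c[FLUXES.index("b2")] = -1.0

# Thermodynamic constraint: all fluxes irreversible (v >= 0, assumed).
# Capacity constraint: uptake b1 limited to 10 (assumed value).
bounds = [(0, None)] * len(FLUXES)
bounds[FLUXES.index("b1")] = (0, 10.0)

# Mass balance at steady state: S . v = 0.
result = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]), bounds=bounds, method="highs")

optimal_export = -result.fun
print("optimal b2:", optimal_export)
print(dict(zip(FLUXES, result.x.round(6))))
```

The optimal objective value is unique, but the flux distribution achieving it need not be: A can reach C directly through R2 or through R1 and R3, which is why flux variability analysis is usually run alongside FBA.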
    • Example: system of two metabolites A and B - Production constraints: $0 \le A \le 60$ and $0 \le B \le 50$ - Capacity for simultaneous production: $A + 2B \le 120$ - Objective function: $Z = 20A + 30B$ - Optimal value within the feasible set: $Z = 2100$ (figure: feasible set in the FluxA-FluxB plane, bounded by A = 60, B = 50 and A + 2B = 120, with contour lines Z = 1500 and Z = 2100)
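The example is small enough to solve by brute force. The sketch below does not use a simplex solver; it relies on the fact that a linear objective over a bounded feasible polygon attains its optimum at a corner, so it enumerates the intersections of all constraint pairs and keeps the feasible ones. The function names are illustrative.

```python
from itertools import combinations

# Each constraint is (a, b, rhs), meaning a*A + b*B <= rhs.
CONSTRAINTS = [
    (1, 0, 60),    # A <= 60
    (0, 1, 50),    # B <= 50
    (1, 2, 120),   # A + 2B <= 120
    (-1, 0, 0),    # A >= 0
    (0, -1, 0),    # B >= 0
]


def objective(a, b):
    return 20 * a + 30 * b


def feasible(a, b, tol=1e-9):
    return all(ca * a + cb * b <= rhs + tol for ca, cb, rhs in CONSTRAINTS)


def vertices():
    """Corner points of the feasible set: feasible intersections of two constraint lines."""
    points = set()
    for (a1, b1, c1), (a2, b2, c2) in combinations(CONSTRAINTS, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:          # parallel lines never intersect
            continue
        x = (c1 * b2 - c2 * b1) / det   # Cramer's rule
        y = (a1 * c2 - a2 * c1) / det
        if feasible(x, y):
            points.add((round(x, 9), round(y, 9)))
    return points


best = max(vertices(), key=lambda p: objective(*p))
print("optimum at A=%g, B=%g with Z=%g" % (best[0], best[1], objective(*best)))
```

The optimum lies at the corner where A = 60 meets A + 2B = 120, that is A = 60, B = 30, giving Z = 2100 as on the slide.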
    • Linear Programming: Types of Solutions
    • Objective Function: How to identify plausible physiological states?
      Question | Objective
      What are the biochemical production capabilities? | Maximize metabolite product
      What is the maximal growth rate and biomass yield? | Maximize growth rate
      What is the trade-off between biomass production and metabolite overproduction? | Maximize biomass production for a given metabolite production
      How energetically efficient can metabolism operate? | Minimize ATP production or minimize nutrient uptake
    • Model Simulation and Analysis - Flux balance analysis: yield / flux predictions under varying environmental conditions (multi-parameter variation) - Knockout analysis: yield / flux predictions under varying genetic backgrounds (complete, specified) - Robustness analysis: objective function sensitivity to flux variation of a specific reaction (complete, specified) - Flux variability analysis: predictions of min/max flux values (complete)
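A knockout analysis can be sketched by forcing the flux of the deleted reactions to zero and re-solving the FBA problem. The example below reuses the toy network from the earlier slides; the objective (maximise export b2), the uptake limit of 10, and the helper name `max_export` are assumptions for illustration.

```python
import numpy as np
from scipy.optimize import linprog

REACTIONS = ["R1", "R2", "R3", "R4", "R5", "R6"]  # fluxes v1, v2, v3, b1, b2, b3
S = np.array([
    [-1, -1,  0, 1,  0,  0],  # A
    [ 1,  0, -1, 0,  0, -1],  # B
    [ 0,  1,  1, 0, -1,  0],  # C
], dtype=float)


def max_export(knockouts=()):
    """Maximal flux through R5 (export of C) when the given reactions are knocked out."""
    c = np.zeros(len(REACTIONS))
    c[REACTIONS.index("R5")] = -1.0            # maximise by minimising the negative
    bounds = [(0, None)] * len(REACTIONS)      # irreversible reactions (assumed)
    bounds[REACTIONS.index("R4")] = (0, 10.0)  # uptake limit (assumed value)
    for name in knockouts:
        bounds[REACTIONS.index(name)] = (0, 0)  # a knockout carries no flux
    res = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]), bounds=bounds, method="highs")
    return -res.fun


wild_type = max_export()
single = {name: max_export([name]) for name in ("R1", "R2", "R3", "R4")}
double = max_export(["R1", "R2"])

print("wild type:", wild_type)
print("single knockouts:", single)
print("R1 + R2 knockout:", double)
```

In this toy network the single knockouts of R1, R2 or R3 leave the optimum unchanged because the two routes from A to C are redundant, while removing the uptake R4 or both routes together abolishes the export.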
    • Objective Function: Growth Objective - Metabolic demands of precursors and cofactors required for 1 g of biomass of E. coli
      Metabolite | Demand (mmol)
      ATP   | 41.2570
      NADH  | -3.5470
      NADPH | 18.2250
      G6P   | 0.2050
      F6P   | 0.0709
      R5P   | 0.8977
      E4P   | 0.3610
      T3P   | 0.1290
      3PG   | 1.4960
      PEP   | 0.5191
      PYR   | 2.8328
      AcCoA | 3.7478
      OAA   | 1.7867
      AKG   | 1.0789
      $Z = 41.2570\, v_{ATP} - 3.547\, v_{NADH} + 18.225\, v_{NADPH} + \dots$
    • Summary Flux Balance Analysis
    • Metabolism in the Hordeum vulgare Seed - FBA model of seed storage metabolism in developing endosperm of Hordeum vulgare - Size: 257 reactions, 234 metabolites - Pathways: Glyc, TCA, PPP, oxP, Ferm, Rubisco, AA, Starch, CW, and others
    • Case Study (source of images: L. Borisjuk and H. Rolletschek, IPK) - Non-invasive imaging uncovers metabolic compartmentation in the endosperm - Primary site of alanine synthesis is the central endosperm - Alanine gradients reflect local oxygen state of the endosperm - 13C-Ala gradient can be used as in vivo marker for hypoxia
    • Case Study (source of images: L. Borisjuk and H. Rolletschek, IPK) - Alanine-AT: critical branch point separating aerobic from anaerobic metabolism - Modelling purpose: to elucidate the role of alanine metabolism for seed tissues with varying oxygen supply
    • Simulation of Region-specific Metabolism Central endosperm (hypoxic region) A B Peripheral endosperm (aerobic region)
    • Regulation of Alanine-AT - Regulation of Alanine-AT in the endosperm in response to changing oxygen supply
    • Current Research Directions - Model coupling (different organs) - Multiscale modelling (different modelling methods)
    • Software Tools for FBA - CellNetAnalyzer (CNA) http://www.mpi-magdeburg.mpg.de/projects/cna/cna.html - COBRA Toolbox http://gcrg.ucsd.edu/downloads/COBRAToolbox - FBA-SimVis http://fbasimvis.ipk-gatersleben.de/
    • Model Reconstruction: Metabolic Model - 1. Model definition: organism, organ, dev. stage, pathways, model boundaries - 2. Model reconstruction & data retrieval: top-down (metabolism, pathways, reactions); integration of heterogeneous data types; data types: biochemical, physiological, genomic data; data basis: literature, databases - Missing data: data referring to closely related species/organs/dev. stages; inferred reactions: indirect, inferred from BM requirements; unknown reaction directionality: rev; unknown compartment: cytosol
    • Additional Parameters - Maximum uptake/excretion rates: literature, experimental data, approximations (e.g. related taxa) - Growth objective: biomass composition; energy requirements (growth, maintenance); literature, experimental data, approximations (e.g. dev. stage)
    • Outline 1. Modelling metabolism - Basics - Constraint-based modelling: FBA - Mathematical representation - Application of constraints - Example - Resources and tools 2. Visualising models and networks - Basics - Standard graphical representation - Process Description Language - Resources and tools
    • Question 1 – Can you Read this? A network with 10^2 nodes - Protein interaction network, source: Jeong et al. Nature, 2001
    • Question 1 – Can you Read this? A network with 10^3 nodes - Metabolic network, source: KEGG, 2012
    • Question 1 – Can you Read this? A network with 10^4 nodes - Protein interaction network, source: DIP, 2013
    • Part 1 - A network with 10^4 nodes (protein interaction network, source: DIP, 2013) - Automatic layout of large networks and circuit-boards
    • Question 2 – Can you Understand this?
    • Question 2 – Can you Understand this? Stimulates gene transcription? Associates into? Is degraded? Translocates? Reciprocal stimulation?
    • Part 2 Stimulates gene transcription? Associates into? Is degraded? Translocates? Reciprocal stimulation? Standardisation of graphical representation
    • Part 1 - A network with 10^4 nodes (protein interaction network, source: DIP, 2013) - Automatic layout of large networks and circuit-boards
    • Automatic Layout of Networks - Force-based approaches: simulate a system of physical forces (Eades. Congressus Numerantium, 1984; Fruchterman & Reingold. Software - Practice and Experience, 1991) - Layered approaches: decycling, layering, crossing reduction, coordinate assignment (Sugiyama et al. IEEE Transactions on Systems, Man and Cybernetics, 1981) - Orthogonal / grid-based approaches (Tamassia. SIAM Journal on Computing, 1987; Biedl et al. Graph Drawing, LNCS 1353, 1997)
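To illustrate the force-based idea, here is a minimal sketch in the spirit of Fruchterman & Reingold: all node pairs repel each other, connected nodes attract each other, and a decreasing "temperature" limits how far a node may move per iteration. It is a simplification written for this transcript (no frame clamping, a fixed cooling factor of 0.95, a fixed seed), not the published algorithm in full, and the function name `force_layout` is an assumption.

```python
import math
import random


def force_layout(nodes, edges, iterations=200, size=1.0, seed=0):
    """Return {node: (x, y)} computed by a simple spring-embedder iteration."""
    rng = random.Random(seed)
    pos = {n: [rng.uniform(0, size), rng.uniform(0, size)] for n in nodes}
    k = size / math.sqrt(len(nodes))  # ideal distance between nodes
    temperature = size / 10           # maximal movement per iteration

    for _ in range(iterations):
        disp = {n: [0.0, 0.0] for n in nodes}

        # Repulsive forces between every pair of nodes: f = k^2 / d
        for i, u in enumerate(nodes):
            for v in nodes[i + 1:]:
                dx, dy = pos[u][0] - pos[v][0], pos[u][1] - pos[v][1]
                dist = math.hypot(dx, dy) or 1e-9
                force = k * k / dist
                disp[u][0] += dx / dist * force
                disp[u][1] += dy / dist * force
                disp[v][0] -= dx / dist * force
                disp[v][1] -= dy / dist * force

        # Attractive forces along edges: f = d^2 / k
        for u, v in edges:
            dx, dy = pos[u][0] - pos[v][0], pos[u][1] - pos[v][1]
            dist = math.hypot(dx, dy) or 1e-9
            force = dist * dist / k
            disp[u][0] -= dx / dist * force
            disp[u][1] -= dy / dist * force
            disp[v][0] += dx / dist * force
            disp[v][1] += dy / dist * force

        # Move each node along its net force, limited by the temperature.
        for n in nodes:
            length = math.hypot(disp[n][0], disp[n][1]) or 1e-9
            step = min(length, temperature)
            pos[n][0] += disp[n][0] / length * step
            pos[n][1] += disp[n][1] / length * step

        temperature *= 0.95  # cool down so the layout settles

    return {n: (p[0], p[1]) for n, p in pos.items()}


layout = force_layout(["A", "B", "C", "D"], [("A", "B"), ("B", "C"), ("C", "D")])
for node, (x, y) in layout.items():
    print(node, round(x, 3), round(y, 3))
```

The pairwise repulsion makes each iteration quadratic in the number of nodes, which is one reason large networks need more specialised layout algorithms.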
    • Many Special Layout Algorithms - Commonly extensions of the three classes of layout algorithms: force-based, layered, orthogonal / grid-based - Examples (sources: Karp & Paley. Conf. Bioinformatics and Genome Research, 1994; Becker & Rojas. Bioinformatics, 2001; Schreiber. In Silico Biology, 2002; Genc & Dogrusoz. Graph Drawing LNCS 2912, 2004)
    • Good Network Layout - Better layouts have: fewer edge crossings; large crossing angles; straighter edges; horizontal and vertical edges; symmetrical parts shown symmetrically; … - Special layout algorithms
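One of these criteria, the number of edge crossings, is easy to measure for a straight-line drawing. The sketch below counts proper crossings with a standard orientation test; the node coordinates and function names are illustrative assumptions.

```python
from itertools import combinations


def orientation(p, q, r):
    """+1 if p, q, r turn counter-clockwise, -1 if clockwise, 0 if collinear."""
    val = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (val > 0) - (val < 0)


def segments_cross(p1, p2, p3, p4):
    """True if segments p1-p2 and p3-p4 cross in a single interior point."""
    return (orientation(p1, p2, p3) * orientation(p1, p2, p4) < 0
            and orientation(p3, p4, p1) * orientation(p3, p4, p2) < 0)


def count_crossings(pos, edges):
    """Number of crossing edge pairs in a straight-line drawing."""
    crossings = 0
    for (a, b), (c, d) in combinations(edges, 2):
        if len({a, b, c, d}) < 4:
            continue  # edges sharing an end node cannot cross properly
        if segments_cross(pos[a], pos[b], pos[c], pos[d]):
            crossings += 1
    return crossings


# The complete graph on four nodes, drawn in two different ways.
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A"), ("A", "C"), ("B", "D")]
square = {"A": (0, 0), "B": (1, 0), "C": (1, 1), "D": (0, 1)}
planar = {"A": (0, 0), "B": (2, 0), "C": (1, 2), "D": (1, 0.5)}

print("square layout:", count_crossings(square, edges))
print("planar layout:", count_crossings(planar, edges))
```

Drawing the four nodes as a square makes the two diagonals cross, whereas placing one node inside the triangle formed by the other three removes the crossing.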
    • Part 2 Standardisation of graphical representation
    • Ambiguity in Conventional Representation
    • Standardised Symbols are Important - Most English-speaking countries, Singapore, Quebec, Iran, Norway, China, Poland, Israel, USA and Canada
    • Pathway Diagrams have been Used for a Long Time - A metabolic pathway diagram: from the wall chart of Biochemical Pathways created by Gerhard Michal (1968) - Electrical circuit diagram representing cell membrane: from Hodgkin AL and Huxley AF (1952) A quantitative description of membrane current and its application to conduction and excitation in nerve. J. Physiol. 117: 500-544.
    • What is SBGN? - A way to unambiguously describe biochemical and cellular events in graphs - Limited number of symbols (~30) - Smooth learning curve - Can graphically represent quantitative models and biochemical pathways at different levels of granularity - Developed since 2006 by a growing community, part of COMBINE - Three languages: Process Descriptions (one state = one glyph); Entity Relationships (one entity = one glyph); Activity Flow (conceptual level)
    • Graph Trinity: Three Languages in One - Process Description maps: unambiguous, mechanistic, sequential, combinatorial explosion - Entity Relationships maps: unambiguous, mechanistic, non-sequential - Activity Flow maps: ambiguous, conceptual, sequential
    • Graph Trinity: Three Languages in One Process Description Entity Relationships Activity Flow
    • SBGN Process Description Language - A Process Description (PD) diagram represents all molecular processes and interactions occurring between various biochemical entities - It depicts how entities transition between forms as a result of biochemical reactions (including non-covalent modifications such as binding) - Most of the classic metabolic pathways (e.g., glycolysis and TCA cycle) in biochemistry textbooks were drawn in this approach - Though not the conventional approach for drawing signaling pathways, this approach captures the details of biochemical reactions within the pathway network and provides, in most cases, unambiguous interpretation of pathway mechanisms
    • SBGN Process Description L1 V1.2 Reference Card
    • Pools of Entities - Collection of molecules indistinguishable in some sense - Non-overlapping - Characterized by concentration
    • Entity Types - Unspecified entity - Simple chemical - Macromolecule - Nucleic acid feature
    • Material Type - Unit of information - Controlled vocabulary (SBO) - Indicates its chemical structure (physical composition) - Format: pre:label (example glyph: PhyA with mt:prot)
      Name | Label
      Non-macromolecular ion | mt:ion
      Non-macromolecular radical | mt:rad
      Ribonucleic acid | mt:rna
      Deoxyribonucleic acid | mt:dna
      Protein | mt:prot
      Polysaccharide | mt:psac
    • Conceptual Type - Unit of information - Controlled vocabulary (SBO) - Indicates its function within the context of a given PD map - Format: pre:label (example glyph: crp with ct:grr)
      Name | Label
      Gene | ct:gene
      Transcription start site | ct:tss
      Gene coding region | ct:coding
      Gene regulatory region | ct:grr
      Messenger RNA | ct:mRNA
    • Macromolecular Pools: State Variables - A pool is a set of molecules that are in some sense indistinguishable - Molecules can be in different states: (non)phosphorylated; open/closed channel; modified at some site (example glyphs: receptor R, channel Ch in states Close and Open, Kinase with state variable P@237, R with 2P)
    • Stateless and State-full Entity Types - Not all entities can have states - Stateless: simple chemicals, unspecified entity - State-full entities: macromolecule, nucleic acid feature, complex - State is defined as a combination of state values - Once defined, a state variable should always be visible (example glyph: PhyA with mt:prot and state Pr/Prf)
    • Example 1: LEC1/AFL-B3 Network Macromolecules: biochemical substances that are built up from the covalent linking of pseudo-identical units. Examples of macromolecules include proteins, nucleic acids (RNA, DNA), and polysaccharides (glycogen, cellulose, starch, etc.).
    • Complex and Multimer - Complex: represents complexes of molecules held together by non-covalent bonds - Multimers require a cardinality (example glyphs: N:2, N:5, N:3) - Can have state variables - In a multimer this means that all monomers have the same state - Use a complex if the states are not the same
    • Key Concept: Process - Process: conversion of elements of one pool to another - Special cases: non-covalent binding (association, dissociation); incompleteness (uncertain process, shown as "?"; omitted process, shown as "//")
    • LEC1/AFL-B3 Network Omitted processes are processes that are known to exist, but details are omitted from the map for the sake of clarity or parsimony. A single omitted process can represent any number of actual processes.
    • Arcs - Use of pools by a process: consumption/production; stoichiometry (optional) - Regulating the process rate: stimulation, inhibition, catalysis - Requirement for a process: necessary stimulation (arc glyphs shown: consumption, production, catalysis, stimulation, inhibition, necessary stimulation, modulation)
    • Laying out Process Arcs - A production arc can represent consumption in a reversible process - Substrates and products should come to opposite sides of the process shape (two connectors) - Regulatory arcs should come to the other two sides of the process - If the forward and backward processes are regulated separately, the process has to be split
    • LEC1/AFL-B3 Network A stimulation affects positively the flux of a process represented by the target process.
    • Sink/source: Creation and Destruction - Represents creation and destruction of entities - Shape to represent source of materials and sink of degraded entities
    • LEC1/AFL-B3 network A submap is used to encapsulate processes (including all types of nodes and edges) within one glyph. The submap hides its content from the user and displays only input terminals (or ports).
    • LEC1/AFL-B3 Factors and Maturation Gene Control
    • Environmental Influence - External influences (perturbing agent): light, temperature change, mutation/disease - System manifestation (phenotype): apoptosis (glyphs shown: perturbing agent, phenotype)
    • LEC1/AFL-B3 Factors and Maturation Gene Control The phenotype glyph represents biological processes or phenotypes that are affected or generated by a biochemical/regulatory network. Such processes can take place at different levels and are independent of the biochemical network itself.
    • Clone Marker - Each entity pool is represented only once on the map - This causes layout problems - Clone marker as visual indicator of duplication - Stateless nodes carry an unnamed marker - State-full nodes carry a named marker to simplify recognition
    • LEC1/AFL-B3 Factors and Maturation Gene Control If an EPN is duplicated on a map, it is necessary to indicate this fact by using the clone marker auxiliary unit. The purpose of this marker is to provide the reader with a visual indication that this node has been cloned, and that at least one other occurrence of the EPN can be found in the map.
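The rule can be sketched as a simple check over the glyphs of a map. The `Glyph` record and the notion that a pool is identified by label, entity class, state and compartment are assumptions made for this illustration; they are not taken from the SBGN specification or from any SBGN software.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Glyph:
    """Hypothetical, minimal record for one entity pool node drawn on a map."""
    glyph_id: str
    label: str
    entity_class: str
    state: tuple = ()       # e.g. (("P", "237"),) for a phosphorylation
    compartment: str = ""


def glyphs_needing_clone_marker(glyphs):
    """Return the ids of all glyphs whose entity pool is drawn more than once."""
    by_pool = defaultdict(list)
    for g in glyphs:
        by_pool[(g.label, g.entity_class, g.state, g.compartment)].append(g.glyph_id)
    return {gid for ids in by_pool.values() if len(ids) > 1 for gid in ids}


glyphs = [
    Glyph("g1", "ATP", "simple chemical"),
    Glyph("g2", "LEC1", "macromolecule"),
    Glyph("g3", "Kinase", "macromolecule", state=(("P", "237"),)),
    Glyph("g4", "ATP", "simple chemical"),  # same pool as g1, drawn a second time
    Glyph("g5", "Kinase", "macromolecule"),  # different state, so a different pool
]

print(sorted(glyphs_needing_clone_marker(glyphs)))
```

Only the two ATP glyphs are flagged: the two Kinase glyphs differ in state and therefore stand for different pools.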
    • Discrimination Between Knowledge Levels (figure labels: transcription factor + target gene DNA complex, transcription, translation)
    • Discrimination Between Knowledge Levels Transcription factors and target gene DNA together stimulate transcription, translation Transcription factor stimulates the transcription of several putative target genes
    • Compartments - Container to represent physical or logical structure - Free form - Visually thicker line - The same entity pools in different compartments are different - Compartments are independent - Overlap does not mean containment
    • Compartments Neuro-muscular junction
    • Logical Gates - Encoding of network logic - To simplify layout if there are many activators for the process - To include uncertain information: combination of TFs with unknown or combinatorial binding kinetics - Three main logic operations: AND (all are required); OR (any combination is required); NOT (prevents influence)
    • Strength and Weakness of SBGN-PD - Strength: easy translation into mathematical model; natural mapping to SBML; a lot of information in DBs (KEGG, Panther); timeline is easily extractable - Weakness: full explicit definition of state; combinatorial complexity; additional assumption to include uncertain information; laborious creation
    • SBGN Process Description L1 V1.2 Reference Card
    • SBGN Process Description - Entity Pool Nodes
    • SBGN Process Description - Process Nodes
    • SBGN Process Description - Connecting Arcs
    • Software Tools for SBGN - SBGN http://www.sbgn.org - SBGN-ED http://www.sbgned.org
    • Standards in Systems Biology Source: Demir et al. Nature Biotechnology, 2012.
    • High Throughput Modelling and Visualisation - Path2Models: a pipeline to produce models that combine data from different sources - 140,000 kinetic, logical and constraint-based models - Part of BioModelsDB - Path2Models team: F. Büchel, T. Czauderna, C. Chaouiya, A. Dräger, M. Glont, H. Hermjakob, M. Hucka, S. Keating, D.B. Kell, R. Keller, C. Laibe, N. Le Novère, P. Mendes, F. Mittag, M. Rall, N. Rodriguez, J. Saez-Rodriguez, F. Schreiber, M. Schubert, N. Swainston, M. van Iersel, C. Wrzodek, M. Wybrow, A. Zell