The role of machine learning in modelling the cell

  1. The Role of Machine Learning in Modelling the Cell.
     John Hawkins, ARC Centre for Complex Systems, University of Queensland, Australia
  2. Overview of Talk
     • Overview of cell biology
     • Modelling the cell
     • Subcellular localisation signals
     • Machine Learning in General
     • Neural networks
     • Feed Forward versus Recurrent
  3. Cell Biology – Quick and Dirty
     • Membrane bound Organelles
     • Nucleus
     • DNA -> RNA -> Protein
     • Transport, e.g.
         • Mitochondria
         • Peroxisome
     • Modification, e.g.
         • Disulphide Bond Formation
         • Glycosylation
  4. Cell Feedback
     • At a particular time point a set of genes will be expressed.
     • These do not remain constant; instead, the emerging picture is that:
         • There is some essential cycle of gene expression
         • With a capacity to indulge in alternative pathways of expression under external stimulus.
     • The pattern of expression is ...
  5. Modelling the cell
     • Ideally we would like to model the cell from the level of a 3D physical simulation.
     • Currently this is infeasible
     • So numerous approaches are taken to form abstractions
         • Gene Regulatory Networks
         • Differential equation models of particular pathways
         • Machine learning models of particular ...
  6. Biological Sequences
     • Many Important Biological Molecules are Polymers.
     • Thus representable as a sequence of discrete symbols.
     • Sequence $M = [m_1, m_2, \ldots, m_n]$ where:
         • DNA: $m_i \in \{A, T, G, C\}$
         • RNA: $m_i \in \{A, U, G, C\}$
         • Protein: $m_i \in \{G, A, V, L, I, P, S, C, T, M, D, E, H, K, R, N, Q, F, Y, W\}$
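The three alphabets on this slide can be written down directly as sets of symbols. The following is a minimal Python sketch; the names `ALPHABETS` and `possible_types` are illustrative and do not come from the talk.

```python
# Alphabets exactly as listed on the slide; all names here are illustrative.
DNA = set("ATGC")
RNA = set("AUGC")
PROTEIN = set("GAVLIPSCTMDEHKRNQFYW")

ALPHABETS = {"DNA": DNA, "RNA": RNA, "Protein": PROTEIN}


def possible_types(sequence):
    """Return the name of every alphabet that contains all symbols of `sequence`."""
    symbols = set(sequence.upper())
    return [name for name, alphabet in ALPHABETS.items() if symbols <= alphabet]


if __name__ == "__main__":
    for seq in ("AUGC", "ATGC", "MKWL"):
        print(seq, possible_types(seq))
```

Because A, T, G and C are also one-letter amino acid codes, a short DNA string is also a valid protein string under this check, so the alphabet alone cannot always identify the molecule type.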
  7. Information Content
     • How much information is in a linear sequence?
     • Two crucial elements to function
         • Physical/chemical properties
         • Molecular shape
     • Each residue has well known properties
     • Denaturation (Anfinsen, 1973).
     • Sequence defines arrangement of chemical properties which in turn defines folding.
  8. Biological Patterns
     • Motifs – General term for patterns
     • Numerous Definitions & Visualisations
         • PROSITE Patterns – Regular Expression
         • PROSITE Profiles – Probability Matrix
         • LOGOs
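Since PROSITE patterns are essentially regular expressions in a different notation, the common parts of the syntax can be translated mechanically. The sketch below covers only the basic elements (`-` separators, `x` for any residue, `[...]` allowed sets, `{...}` forbidden sets, `(n)` or `(n,m)` repeats, and the `<` and `>` terminal anchors). The function name and both example patterns are illustrative and are not taken from the talk or from a specific PROSITE entry.

```python
import re


def prosite_to_regex(pattern):
    """Translate a basic PROSITE-style pattern into a Python regular expression.

    Simplified sketch: it does not handle every corner of the PROSITE syntax,
    for example an anchor written inside square brackets.
    """
    regex = pattern.rstrip(".")          # patterns may end with a period
    regex = regex.replace("-", "")       # element separators carry no meaning
    regex = regex.replace("x", ".")      # x = any residue
    regex = regex.replace("{", "[^").replace("}", "]")  # {P} = anything but P
    regex = regex.replace("(", "{").replace(")", "}")   # x(2) = repeat twice
    regex = regex.replace("<", "^").replace(">", "$")   # N- / C-terminal anchors
    return regex


if __name__ == "__main__":
    # Illustrative pattern: S or T, any two residues, then D or E.
    print(prosite_to_regex("[ST]-x(2)-[DE]"))
    # Hypothetical C-terminal tripeptide pattern, anchored at the sequence end.
    c_term = prosite_to_regex("[SA]-[KR]-[LM]>")
    print(bool(re.search(c_term, "MAAGTSKL")))
```

A pattern of this kind gives a yes/no match only; the profile (probability matrix) representation on the same slide assigns a graded score instead.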
  9. Peroxisomal Localisation
     • Predominantly controlled by a C-terminal sequence called the PTS1 signal.
     • Roughly 12 residues long
     • Known dependencies between locations
  10. Nuclear Export
     • Some proteins move continuously between the nucleus and cytoplasm of the cell.
     • Either as:
         • Transporters
         • Regulators
  11. Machine Learning
     • Requires a set of examples, with
         • Raw input (sequence data), and
         • Known classes that the machine should predict
     • In essence, Function Approximation
         • Start with a general parametrised function over the input data
         • Adjust the parameters until the output of the function is a good approximation to the known classes of the examples.
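The "function approximation" view can be made concrete with a very small model. The sketch below assumes a logistic function of a weighted sum as the parametrised function and per-example gradient descent as the adjustment rule; the talk does not name either choice, and the toy examples and labels are invented purely for illustration.

```python
import math


def predict(weights, bias, features):
    """General parametrised function: a logistic squashing of a weighted sum."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def train(examples, labels, epochs=2000, rate=0.5):
    """Adjust the parameters until the outputs approximate the known classes."""
    weights = [0.0] * len(examples[0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in zip(examples, labels):
            # Gradient of the log-loss for one example is (output - label).
            error = predict(weights, bias, features) - label
            weights = [w - rate * error * x for w, x in zip(weights, features)]
            bias -= rate * error
    return weights, bias


# Invented toy data: class 1 only when both input features are present.
examples = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
labels = [0, 0, 0, 1]
weights, bias = train(examples, labels)

if __name__ == "__main__":
    for features, label in zip(examples, labels):
        print(features, label, round(predict(weights, bias, features), 3))
```

The same two steps (choose a parametrised function, then adjust its parameters against labelled examples) apply unchanged when the function is a neural network and the inputs are encoded sequences.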
  12. Bias
     • Bias is generally unavoidable
         • (Mitchell, 1980)
     • Three Sources of Bias
         • Input Encoding
         • Function Structure (Architecture)
         • Parameter adjustment algorithm (learning)
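Input encoding, the first source of bias listed, can be illustrated with a one-hot encoding of a fixed C-terminal window. The talk does not say which encoding was actually used, so this is only an assumed example; the 12-residue default echoes the approximate PTS1 length mentioned earlier, and the function name is hypothetical.

```python
# Amino acid alphabet in the order given on the Biological Sequences slide.
AMINO_ACIDS = "GAVLIPSCTMDEHKRNQFYW"


def one_hot_c_terminal(protein, window=12):
    """Encode the last `window` residues as a flat list of 0/1 values.

    Each residue becomes 20 values with a single 1, so the result has
    window * 20 entries. Raises ValueError for short sequences or for
    symbols outside the 20-letter alphabet.
    """
    if len(protein) < window:
        raise ValueError("sequence is shorter than the window")
    vector = []
    for residue in protein[-window:]:
        column = [0] * len(AMINO_ACIDS)
        column[AMINO_ACIDS.index(residue)] = 1
        vector.extend(column)
    return vector


if __name__ == "__main__":
    encoded = one_hot_c_terminal("M" * 20 + "SKL")
    print(len(encoded), sum(encoded))
```

The bias enters through the representation itself: a one-hot code treats every pair of distinct residues as equally different, whereas an encoding built from physical or chemical properties would place similar residues closer together.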
  13. Neural Networks
     • Graphical Model consisting of layers of nodes connected by weights
     • Feed forward neural networks
         • Fixed input window
         • Signal propagates in a single pass through the layers
     • Recurrent Neural Networks
         • Signal processed in parts
         • Recurrent connections maintain a memory state
         • Output generated after processing the last piece of the input signal
  14. Simple Neural Networks
     • FFNN: $O_h = S(W_1 \cdot I_1 + W_2 \cdot I_2 + b)$
     • RNN: $O_h = S(W_1 \cdot I_2 + W_2 \cdot S(W_1 \cdot I_1 + b) + b)$
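The two formulas on this slide can be evaluated directly for scalar inputs. The sketch below assumes that S is a logistic sigmoid and that the weights and inputs are single numbers; the slide specifies neither, and the function names are illustrative.

```python
import math


def S(x):
    """Squashing function; assumed here to be the logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-x))


def ffnn_unit(i1, i2, w1, w2, b):
    """Feed forward: both inputs are seen at once, each with its own weight."""
    return S(w1 * i1 + w2 * i2 + b)


def rnn_unit(inputs, w1, w2, b):
    """Recurrent: inputs arrive one at a time and share the input weight w1.

    w2 weights the previous output, which acts as the memory state. For two
    inputs this reduces to S(w1*I2 + w2*S(w1*I1 + b) + b), as on the slide.
    """
    state = S(w1 * inputs[0] + b)
    for x in inputs[1:]:
        state = S(w1 * x + w2 * state + b)
    return state


if __name__ == "__main__":
    print(ffnn_unit(1.0, 0.0, w1=0.5, w2=-0.5, b=0.1))
    print(rnn_unit([1.0, 0.0], w1=0.5, w2=-0.5, b=0.1))
    # The recurrent unit accepts any input length without new parameters.
    print(rnn_unit([1.0, 0.0, 1.0, 1.0], w1=0.5, w2=-0.5, b=0.1))
```

The practical difference is visible in the signatures: the feed forward unit needs one weight per input position, while the recurrent unit reuses the same two weights for an input of any length.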
  15. RNNs in Bioinformatics
     • Bi-Directional RNN
  16. Applications
     • We have applied these techniques to
         • Subcellular Localisation to
             • Endoplasmic Reticulum
             • Mitochondria
             • Chloroplast
             • Peroxisome
         • http://pprowler.imb.uq.edu.au
     • Working with whole genome data and wet lab biologists to use these tools for data mining.
  17. The End… ?
