Evolutionary Symbolic Discovery for Bioinformatics, Systems and Synthetic Biology


I gave this talk at the Symbolic Regression Workshop held at GECCO 2010.

Published in: Education

  1. 1. Evolutionary Symbolic Discovery for Bioinformatics, Systems and Synthetic Biology <ul><li>P. Widera, J. Bacardit, N. Krasnogor, C. Garcia-Martinez, M. Lozano </li></ul>
  2. 2. <ul><li>Symbolic Regression and Modeling are tightly linked in Bioinformatics, Systems and Synthetic Biology. </li></ul><ul><li>We explore two problems: </li></ul><ul><ul><li>Synthesis of effective energy functions for protein structure prediction (PSP) </li></ul></ul><ul><ul><li>Synthesis of effective Systems/Synthetic Biology models </li></ul></ul><ul><li>Not run-of-the-mill Symbolic Regression, however: </li></ul><ul><ul><li>a symbolic solution is sought </li></ul></ul><ul><ul><li>it must fit the available data </li></ul></ul><ul><ul><li>it must be human understandable </li></ul></ul>
  3. 3. Synthesis of effective energy functions for PSP
  4. 4.
  5. 5.
  6. 6.
  7. 7.
  8. 8. Synthesis of effective Systems/Synthetic Biology models <ul><li>P systems are an executable modeling framework that closely mimics biological reality. </li></ul><ul><li>They can be seen as programs that explicitly mimic the internal behavior of cell systems. </li></ul><ul><li>Cells (and most biologists) don’t do differential calculus! </li></ul>
  9. 9. Motivation <ul><li>Learning a program with stochastic behavior vs. learning a P system. </li></ul><ul><li>A cell is a living example of distributed computing. </li></ul>
     function f1(p1,p2,p3,p4) { if (p1<p2) and (rand<0.5) print p3 else print p4 }
     function f1(p1,p2,p3,p4) { if (p1<p2) RND print p3 RND else RND print p4 RND }
  10. 10. P Systems <ul><li>Stochastic P systems are designed for specifying and simulating cellular systems. </li></ul><ul><li>Defined by the tuple: </li></ul><ul><li>O is the alphabet of molecules. </li></ul><ul><li>L = {l_1, …, l_n} is the set of labels representing different compartments. </li></ul><ul><li>μ is the membrane structure with n ≥ 1 membranes. </li></ul>
  11. 11. P Systems <ul><li>M_{l_i}, 1 ≤ i ≤ n, is the initial configuration of membrane i, i.e., the multiset of objects over O placed inside the compartment of membrane l_i. </li></ul><ul><li>is a finite set of rules associated with compartment l_i. These rules are expressed as: </li></ul><ul><ul><li>where o_1, o_2, o'_1, o'_2 are multisets of objects over O and l ∈ L is a compartment label. </li></ul></ul>
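The tuple and the rule form on slides 10 and 11 were images and did not survive in this transcript. The block below is a reconstruction from the components the slides list, written in the notation commonly used for stochastic P systems. The symbol R for the rule sets, the rule index k, and the stochastic constant c are assumptions, since none of them appears in the surviving text. It needs `amsmath` for `\xrightarrow`.

```latex
% Reconstruction under stated assumptions, not the original slide content.
\[
  \Pi = \left(O,\; L,\; \mu,\; M_{l_1}, \dots, M_{l_n},\; R_{l_1}, \dots, R_{l_n}\right)
\]
% R_{l_i} (assumed name): the finite set of rules of compartment l_i.
% c_k^{l_i} (assumed): a stochastic constant attached to rule k.
\[
  r_k^{l_i}:\quad o_1\,[\,o_2\,]_l \;\xrightarrow{\;c_k^{l_i}\;}\; o'_1\,[\,o'_2\,]_l
\]
```

Read this way, a rule replaces the multiset o_1 outside compartment l and o_2 inside it with o'_1 and o'_2 respectively.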
  12. 12. Modular Assembly of P Systems <ul><li>Modules: sets of rules representing molecular interactions that occur often. </li></ul><ul><li>Elemental modules: degradation, complexation, unregulated gene expression, negative gene expression, etc. </li></ul><ul><li>Combining basic modules (building blocks) yields more complex modules, allowing modular and hierarchical modeling with P systems. </li></ul><ul><li>Challenge: explore the large combinatorial space of modules and their parameters. </li></ul>
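The modular assembly idea on slide 12 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the `Rule` class, the module functions, the object names such as `geneA`, and all constants are hypothetical and chosen only to show how elemental modules might be combined into a larger rule set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    reactants: tuple   # multiset of objects consumed, as a tuple of names
    products: tuple    # multiset of objects produced
    constant: float    # kinetic constant attached to the rule
    compartment: str   # label of the compartment the rule belongs to


def degradation(x, c, label):
    """Hypothetical elemental module: x is degraded inside the compartment."""
    return [Rule((x,), (), c, label)]


def complexation(x, y, c_on, c_off, label):
    """Hypothetical elemental module: x and y reversibly form a complex."""
    xy = f"{x}.{y}"
    return [Rule((x, y), (xy,), c_on, label), Rule((xy,), (x, y), c_off, label)]


def unregulated_expression(gene, c_tx, c_tl, label):
    """Hypothetical elemental module: constitutive transcription and translation."""
    rna, protein = f"rna_{gene}", f"prot_{gene}"
    return [
        Rule((gene,), (gene, rna), c_tx, label),
        Rule((rna,), (rna, protein), c_tl, label),
    ]


def combine(*modules):
    """A larger module is the concatenation of its building blocks' rules."""
    return [rule for module in modules for rule in module]


# Assemble a small model from three elemental modules (illustrative constants).
model = combine(
    unregulated_expression("geneA", 0.1, 0.5, "cell"),
    degradation("rna_geneA", 0.05, "cell"),
    degradation("prot_geneA", 0.01, "cell"),
)
n_parameters = len(model)  # one kinetic constant per rule in this sketch
```

Each choice of modules fixes a structure and each module brings its own constants, which is why the search space grows quickly with the number of modules.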
  13. 13. Experimental Setup <ul><li>Compare different evolutionary algorithms to find structure & optimise parameters (kinetic constants) in P systems. </li></ul><ul><li>Four test cases of increasing difficulty and dimension: </li></ul><ul><ul><li>TC1: Pulse generator for different initial conditions (13 parameters). </li></ul></ul><ul><ul><li>TC2: Same problem as TC1 but with larger parameter ranges. </li></ul></ul><ul><ul><li>TC3: More general pulse generator: feed-forward loop motif (18 parameters). </li></ul></ul><ul><ul><li>TC4: Bandwidth detector (34 parameters). </li></ul></ul><ul><li>Unclear which fitness function to use </li></ul>
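As a rough illustration of the parameter-optimisation half of slide 13, here is a sketch of a (1+1) evolution strategy fitting kinetic constants to a target time series. Everything here is an assumption made for the example: the slides do not say which evolutionary algorithms were compared, the `simulate` function is a deterministic stand-in rather than a stochastic P system simulator, and RMSE is just one candidate fitness function.

```python
import math
import random


def simulate(params, times):
    """Stand-in model (assumption): constant production with first-order decay.

    Closed form: x(t) = (k_prod / k_deg) * (1 - exp(-k_deg * t)).
    """
    k_prod, k_deg = params
    return [k_prod / k_deg * (1.0 - math.exp(-k_deg * t)) for t in times]


def rmse(target, candidate):
    """Root-mean-square error between two equally long series."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(target, candidate)) / len(target))


def one_plus_one_es(target, times, bounds, generations=2000, sigma=0.1, seed=0):
    """Elitist (1+1)-ES: keep the child only if it is at least as good."""
    rng = random.Random(seed)
    parent = [rng.uniform(lo, hi) for lo, hi in bounds]
    parent_fit = rmse(target, simulate(parent, times))
    for _ in range(generations):
        # Log-normal (multiplicative) mutation, clipped to the parameter range.
        child = [
            min(hi, max(lo, p * math.exp(sigma * rng.gauss(0.0, 1.0))))
            for p, (lo, hi) in zip(parent, bounds)
        ]
        child_fit = rmse(target, simulate(child, times))
        if child_fit <= parent_fit:
            parent, parent_fit = child, child_fit
    return parent, parent_fit


times = [0.5 * i for i in range(40)]
target = simulate((2.0, 0.25), times)      # target series from known constants
bounds = [(0.01, 10.0), (0.01, 10.0)]      # illustrative parameter ranges
best, best_fit = one_plus_one_es(target, times, bounds)
```

Widening `bounds`, as TC2 does relative to TC1, enlarges the search space without changing the target behaviour.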
  14. 14. Target Models
  15. 15. Average Model Fit <ul><li>Test Case 1 </li></ul><ul><li>Test Case 2 </li></ul>
  16. 16. Average Model Fit <ul><li>Test Case 3 </li></ul>For protein1, all algorithms have similar output to the target.
  17. 17. Average Model Fit <ul><li>Test Case 4 </li></ul>
  18. 18. Discussions & Conclusions <ul><li>Design of effective fitness functions: </li></ul><ul><ul><li>in both problems there is no “silver bullet” fitness function. What to do? </li></ul></ul><ul><ul><li>Besides the obvious “fit”, include robustness, sensitivity, parsimony and “semantic fit” terms </li></ul></ul><ul><li>What space is being searched? </li></ul><ul><li>CPU-hungry problems: </li></ul><ul><ul><li>partial evaluations? lazy evaluation? </li></ul></ul><ul><ul><li>Grid/Cloud/GPGPU </li></ul></ul><ul><li>Human Understandability & Plausibility </li></ul>
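One simple way to combine several of the terms named on slide 18 is a weighted sum. The sketch below is only an illustration of that idea: the function name, the choice of terms, and the weights are assumptions, and the slides do not prescribe any particular combination.

```python
def composite_fitness(fit_error, n_rules, sensitivity, weights=(1.0, 0.01, 0.1)):
    """Lower is better: weighted sum of data misfit, model size and sensitivity.

    fit_error   -- misfit to the data, e.g. an RMSE (the "fit" term)
    n_rules     -- number of rules in the model (a parsimony proxy)
    sensitivity -- how strongly the output changes under parameter perturbation
    weights     -- arbitrary illustrative weights, not values from the talk
    """
    w_fit, w_size, w_sens = weights
    return w_fit * fit_error + w_size * n_rules + w_sens * sensitivity


# Two hypothetical models that fit equally well; the smaller one scores better.
small = composite_fitness(fit_error=0.5, n_rules=4, sensitivity=0.2)
large = composite_fitness(fit_error=0.5, n_rules=12, sensitivity=0.2)
```

A weighted sum makes the trade-off explicit but also makes the result depend on the weights, which is one face of the "no silver bullet" problem raised on the slide.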
  19. 19. Acknowledgements <ul><li>Jonathan Blake </li></ul><ul><li>Claudio Lima </li></ul><ul><li>Francisco Romero-Campero </li></ul><ul><li>Karima Righetti </li></ul><ul><li>Jamie Twycross </li></ul>Members of my team working on SB 2: Integrated Environment, Machine Learning & Optimisation, Modeling & Model Checking, Molecular Micro-Biology, Stochastic Simulations. Grants: EP/E017215/1, EP/H024905/1, BB/F01855X/1, BB/D019613/1. University of Nottingham: Prof. M. Camara, Dr. S. Heeb, Dr. G. Rampioni, Prof. P. Williams. Weizmann Institute of Science: Prof. D. Lancet, Prof. I. Pilpel. This Workshop's Organisers. You for listening!
  20. 20. Any Questions? <ul><ul><li>www.synbiont.org </li></ul></ul>Become a member and have access to a large international community of Synthetic Biologists