A simple method for incorporating sequence information into directed evolution experiments


  1. A simple method for incorporating sequence information into directed evolution experiments. Kyle L. Jensen*, Hal Alper*, Curt Fischer, Gregory Stephanopoulos. Department of Chemical Engineering, Massachusetts Institute of Technology. [Diagram labels: sequence, phenotype]
  2. When screening throughput is limited, linking sequence to phenotype can help direct downstream searches. Two settings: screen based (selectable trait) and assay based (no selectable trait).
  3. Here, a P_Ltet promoter was mutated to create a library of promoter variants. Reference: Alper H., C. Fischer, E. Nevoigt, and G. Stephanopoulos, 2005. Tuning genetic control through promoter engineering. Proc. Natl. Acad. Sci. USA 102:12678-83.
  4. Sixty-nine promoter variants were created using error-prone PCR.
  5. The 69 promoter variants spanned an 800-fold range of activity. How different are the underlying, mutagenized sequences? What, on a sequence level, causes the variation? [Figure: log relative fluorescence vs. mutant number, showing the 800-fold range, with mutants split into top 50% and bottom 50% classes]
  6. Each of the 69 mutants had a unique sequence and incorporated multiple transition SNPs. [Figure: mutations across the promoter region, position [nt] vs. mutant number, alongside log relative fluorescence vs. mutant number]
  7. The effects of individual mutations were “masked” by the presence of other mutations. Some mutations have obvious effects; most do not. Just because a mutation occurs more frequently in one class, is it correlated? Is the ratio of top/bottom important? What is the statistical significance of a mutation that is distributed between the two classes? [Figure: mutation map, position [nt] vs. mutant number, with the class distribution at each position]
  8. Each individual position can be evaluated using a simple binomial distribution, assuming the positions are independent. This is the same as asking: what is the probability of getting heads in 14 of 20 coin tosses? The P-value is the probability of 14 or more heads out of 20. [Figure: class distribution vs. position [nt]]
  9. Similar analysis over the promoter region revealed 7 positions significantly correlated with activity. [Figure: class distribution vs. position [nt]]
  10. [Figure only: mutation map (position [nt] vs. mutant number), class distribution vs. position [nt], and log relative fluorescence vs. mutant number]
  11. A similar analysis can be applied to an arbitrary number of mutants and phenotypic classes. [Diagram: mutants numbered 1, 2, ..., M; phenotypic classes numbered 1, 2, 3, ..., Y; highlighted subset: mutants with mutations at “position 35”]
  12. The generalized probability of the phenotype distribution can be used to find mutation-phenotype correlations. [Equation labels: probability of a particular vector color distribution; significance of a correlation between mutations at “position 35” and the green phenotypic class; prior probability of green phenotype]
  13. In our case, we tested 8 locations, spanning a range of functions and confidences. [Figure: class distribution vs. position [nt]]
  14. 7/8 of the single-position mutants were in agreement with the predicted function.
  15. Rationally designed promoters with combinations of mutations showed predicted activity but also signs of site interaction.
  16. In summary, this simple method, based on multinomial statistics, can be used to link sequence variations to particular phenotypes.
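
The coin-toss framing on slide 8 and the position-by-position scan on slide 9 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the function names, the toy mutant data, and the choice to report both one-sided tails are hypothetical, and the null probability of 0.5 comes from the top 50% / bottom 50% split on slide 5.

```python
from math import comb


def binomial_tail(k: int, n: int, p: float = 0.5) -> float:
    """One-sided P-value: P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def scan_positions(mutants, in_top, p_top=0.5):
    """Test every mutated position for enrichment in either class.

    mutants: list of sets, each holding the mutated positions of one mutant
             (hypothetical data layout).
    in_top:  list of bools, True if that mutant is in the top class.
    Returns {position: (carriers, carriers_in_top, p_top_tail, p_bottom_tail)}.
    Positions are treated as independent, as the slides assume.
    """
    results = {}
    for pos in sorted(set().union(*mutants)):
        carriers = [top for muts, top in zip(mutants, in_top) if pos in muts]
        n, k = len(carriers), sum(carriers)
        results[pos] = (
            n,
            k,
            binomial_tail(k, n, p_top),          # enriched in the top class?
            binomial_tail(n - k, n, 1 - p_top),  # enriched in the bottom class?
        )
    return results


# Slide 8's example: 14 or more heads in 20 fair coin tosses.
print(round(binomial_tail(14, 20), 4))  # 0.0577

# Toy library (made-up data, not the 69 promoter variants).
toy_mutants = [{5, 35}, {35}, {35, 60}, {5}, {60}, {5, 60}]
toy_in_top = [True, True, True, False, False, False]
for pos, row in scan_positions(toy_mutants, toy_in_top).items():
    print(pos, row)
```

With real data, positions whose tail probability falls below a chosen cutoff would be flagged; the transcript does not state which cutoff was used.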

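The equations for the generalization on slides 11 and 12 are not in the transcript, so the following is a hedged sketch using the standard multinomial probability mass function, which may differ in detail from the formulas on the slides. It assumes that the carriers of a given mutation fall into the phenotypic classes independently, with each class's prior probability taken as given (for example, the fraction of all mutants in that class). The function names and example counts are hypothetical.

```python
from math import comb, factorial, prod


def multinomial_pmf(counts, priors):
    """Probability of exactly this vector of class counts under the null.

    counts: number of carriers of the mutation in each phenotypic class.
    priors: prior probability of each class (should sum to 1).
    """
    coeff = factorial(sum(counts))
    for c in counts:
        coeff //= factorial(c)  # stays an exact integer at every step
    return coeff * prod(p**c for p, c in zip(priors, counts))


def class_enrichment_pvalue(counts, priors, cls):
    """P(at least counts[cls] of the carriers land in class `cls`).

    Uses the fact that a single class count of a multinomial is
    binomially distributed with that class's prior probability.
    """
    n, k, p = sum(counts), counts[cls], priors[cls]
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical example: 8 mutants carry a mutation at "position 35";
# 6 are in the green class, 1 in each of two other classes.
counts = [6, 1, 1]
priors = [1 / 3, 1 / 3, 1 / 3]
print(multinomial_pmf(counts, priors))
print(class_enrichment_pvalue(counts, priors, cls=0))
```

With two equally likely classes this reduces to the coin-toss test on slide 8: `class_enrichment_pvalue([14, 6], [0.5, 0.5], 0)` is the probability of 14 or more heads in 20 tosses.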