STAR: Recombination site prediction

    STAR: Recombination site prediction - Presentation Transcript

    1. Predicting structural disruption caused by crossover: a machine learning approach. Denis C. Bauer, Talk, CIBCB 2005
    2. Outline
      • Introduction to Protein Design
      • Theory of SCHEMA
      • Our Approach
      • Results
      • Summary
    3. Protein
      • Biological Functions
        • Proteins are fundamental components of all living cells
          • Messenger Function (e.g. Hormones)
          • Catalytic Function (e.g. Enzymes)
          • Regulatory Function (e.g. Antibodies)
      • Protein Design for Industry and Medicine
        • Better adjusted
        • New function
      Introduction
    4. Protein Structure
      • Primary Structure
      • Secondary Structure
      • Tertiary Structure
      • Quaternary Structure
      Pictures from: Principles of Biochemistry, Horton, Moran, Ochs, Rawn, Scrimgeour
    5. Protein Design
      • Creating new amino acid sequences
        • Huge sequence space
        • Not every possible sequence is stable
      Solution: use sequences which already exist
      [Figure: alignment of existing amino acid sequences with gaps (Gly, Ala, Glu, Thr, Pro, Val, Asp)]
      20^100 possible amino acid sequences
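
The size of the sequence space quoted on this slide can be checked with a few lines of Python. This is only a sketch: it assumes the figure on the slide reads 20^100, i.e. 20 standard amino acids at each of 100 positions, and the function name `sequence_space_size` is made up for this illustration.

```python
# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def sequence_space_size(length: int, alphabet_size: int = len(AMINO_ACIDS)) -> int:
    """Return the number of distinct sequences of the given length."""
    if length < 0:
        raise ValueError("length must be non-negative")
    # Python integers are arbitrary precision, so the result is exact.
    return alphabet_size ** length


size = sequence_space_size(100)
print(f"{size:.3e}")  # about 1.268e+130
```

Even for a short protein of 100 residues the space is far too large to enumerate, which is why the talk starts from sequences that already exist.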
    6. Benefit of Recombination
      [Figure: sequence fragments (e.g. MKIPDELGLIFKFEAPGRVTRALSSQ…, MKIADELGEIFKFEAPGRVTRYLSSQ…, KEMHQPLTFGELENLPLLNTDKPVQAL) labelled "Better resistant to heat" and "Higher performance"; further labels: "Mayfly", "Lives where it's hot"]
      Problem: how to identify recombination sites?
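
Recombination at a single site can be illustrated as follows. This is a minimal sketch, not the procedure used in the talk: it assumes two parents that are already aligned to the same length, and it borrows two 26-residue strings that appear on the slide purely as example data (their roles in the original figure are not recoverable from the transcript). The function name `crossover` is hypothetical.

```python
def crossover(parent_a: str, parent_b: str, site: int) -> str:
    """Build a hybrid: parent_a before `site`, parent_b from `site` onwards."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must be aligned to the same length")
    if not 0 <= site <= len(parent_a):
        raise ValueError("site out of range")
    return parent_a[:site] + parent_b[site:]


# Example strings taken from the slide; treating them as parents is an assumption.
parent_a = "MKIPDELGLIFKFEAPGRVTRALSSQ"
parent_b = "MKIADELGEIFKFEAPGRVTRYLSSQ"

# One hybrid per internal crossover site.
hybrids = {site: crossover(parent_a, parent_b, site) for site in range(1, len(parent_a))}
print(hybrids[5])  # MKIPDELGEIFKFEAPGRVTRYLSSQ
```

Every internal position is a candidate site, so the open question raised on the slide is which of these sites yields a hybrid that still folds and functions.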
    7. SCHEMA
      • Research group of Prof. Frances Arnold
      • Idea: choose crossover positions where the fewest interactions are disrupted
      [Figure: SCHEMA profile]
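
The idea of counting disrupted interactions can be sketched in code. The version below is a simplified reading of SCHEMA as described in the Voigt et al. reference listed at the end of the talk: a contact between positions i and j counts as broken when the hybrid's residue pair at (i, j) occurs in none of the parents. The contact list, the toy parents, and all function names are assumptions made for illustration; in real SCHEMA the contacts come from the 3D structure.

```python
from typing import Dict, Iterable, Sequence, Tuple


def crossover(parent_a: str, parent_b: str, site: int) -> str:
    """Hybrid made of parent_a before `site` and parent_b from `site` onwards."""
    return parent_a[:site] + parent_b[site:]


def schema_disruption(hybrid: str, parents: Sequence[str],
                      contacts: Iterable[Tuple[int, int]]) -> int:
    """Count contacts whose residue pair in the hybrid is found in no parent."""
    broken = 0
    for i, j in contacts:
        pair = (hybrid[i], hybrid[j])
        if all((parent[i], parent[j]) != pair for parent in parents):
            broken += 1
    return broken


def disruption_profile(parent_a: str, parent_b: str,
                       contacts: Iterable[Tuple[int, int]]) -> Dict[int, int]:
    """Disruption for every internal single crossover site."""
    contact_list = list(contacts)
    return {
        site: schema_disruption(crossover(parent_a, parent_b, site),
                                (parent_a, parent_b), contact_list)
        for site in range(1, len(parent_a))
    }


# Toy data (hypothetical): two aligned parents and three residue contacts.
parent_a = "ACDEF"
parent_b = "GHIKL"
contacts = [(0, 4), (1, 2), (2, 3)]
profile = disruption_profile(parent_a, parent_b, contacts)
print(profile)  # {1: 1, 2: 2, 3: 2, 4: 1}
```

Sites with a low count are the preferred crossover positions under this measure; plotting the counts along the sequence gives a profile of the kind shown on the slide.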
    8. Limitations
      • 3D Structure necessary
        • Problem: hard to derive for some proteins
          • time consuming
          • expensive
      Solution: remove the dependence on the 3D structure
    9. Our approach
    10. Alternative to SCHEMA
      • SCHEMA: 3D structure information → SCHEMA algorithm → SCHEMA score
      • Alternative: sequence → prediction → SCHEMA score
      Benefit: all proteins can be processed
    11. Predicting SCHEMA Profile
      • Sequence → predictive model* → predicted SCHEMA score
      • Models
        • Support Vector Regression
        • Bidirectional Recurrent Network
        • Feed Forward Neural Network
      * Bodén, M., Yuan, Z. and Bailey, T. L. Prediction of protein continuum secondary structure with probabilistic models. Submitted.
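
Before any of these models can predict a score from sequence, the sequence has to be turned into numeric input. The slides do not state which encoding was used, so the following is only a sketch of one common choice: a one-hot encoded sliding window around each residue. The window size and the function name `encode_windows` are assumptions, and the regression models themselves are not shown.

```python
from typing import List

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def encode_windows(sequence: str, half_width: int = 2) -> List[List[int]]:
    """Return one feature vector per residue: a one-hot encoded window around it.

    Window positions that fall outside the sequence are left as all zeros.
    """
    index = {aa: k for k, aa in enumerate(AMINO_ACIDS)}
    width = 2 * half_width + 1
    features = []
    for center in range(len(sequence)):
        vector = [0] * (width * len(AMINO_ACIDS))
        for offset in range(-half_width, half_width + 1):
            pos = center + offset
            if 0 <= pos < len(sequence):
                slot = offset + half_width
                vector[slot * len(AMINO_ACIDS) + index[sequence[pos]]] = 1
        features.append(vector)
    return features


features = encode_windows("MKIPDELGLI", half_width=2)
print(len(features), len(features[0]))  # 10 100
```

Each row would then be paired with the SCHEMA score at that position as the regression target, so that a trained model can produce a profile for proteins without a known 3D structure.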
    12. Results
      Table 1: Results for all approaches. r = correlation coefficient (ideally 1), devA = root mean square error (RMSE) normalized by the standard deviation (ideally 0).
        Method    r     devA
        FFNN      0.86  0.57
        BRNN      0.88  0.52
        SVR eps   0.82  0.63
        SVR nu    0.83  0.62
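
The two measures in Table 1 can be computed as sketched below. The caption does not say which standard deviation is used for the normalization, so this sketch assumes the population standard deviation of the observed (target) scores; the function names `pearson_r` and `dev_a` are made up for this illustration.

```python
import math
from typing import Sequence


def pearson_r(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


def dev_a(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """RMSE divided by the standard deviation of the observed values (assumed)."""
    n = len(observed)
    mean_o = sum(observed) / n
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)
    std_o = math.sqrt(sum((o - mean_o) ** 2 for o in observed) / n)
    return rmse / std_o


# Toy values (hypothetical), not data from the talk.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 1.5, 3.5, 3.5]
print(round(pearson_r(observed, predicted), 3), round(dev_a(observed, predicted), 3))
```

With this normalization, always predicting the mean of the observed values gives devA = 1, so values below 1 indicate that a model explains some of the variation in the scores.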
    13. Results
    14. Results
    15. Refinements
      [Diagram: ML models use predicted input features (contact numbers, solvent accessibility) to predict the SCHEMA score. Reported CC values: 0.88, 0.88, 0.6; ensemble: 0.88]
    16. However…
      • Only a limited number of connections are considered
      • Broken connections are reconnected after recombination
    17. Summary
      • Design proteins with recombination rather than from scratch
        • Identify recombination sites
        • Idea: finding the sites where the least interactions are disrupted (SCHEMA)
      • Predicting the SCHEMA score from sequence to overcome the 3D structure limitation
      • SCHEMA too limited to be the only means for recombination site prediction
      • Future work
        • All interactions
        • Actual recombination process
    18. Acknowledgments
      • Supervisors Dr. Mikael Bodén and Dr. Ricarda Thier
      • Dr. Zheng Yuan
      • Prof. Frances Arnold’s research group
    19. Thank you
      References
      • C. A. Voigt, C. Martinez, Z.-G. Wang, S. L. Mayo, and F. H. Arnold, Protein building blocks preserved by recombination, Nat Struct Biol, vol. 9, no. 7, pp. 553-558, Jul 2002.
      • M. M. Meyer, J. J. Silberg, C. A. Voigt, J. B. Endelman, S. L. Mayo, Z.-G. Wang, and F. H. Arnold, Library analysis of SCHEMA-guided protein recombination, Protein Sci, vol. 12, no. 8, pp. 1686-1693, Aug 2003.
      • M. Bodén, Z. Yuan, and T. L. Bailey, Prediction of protein continuum secondary structure with probabilistic models, submitted.
    20. PDB 1zg4
    21. Recombination Site Identification
      • Recombination vs Mutagenesis or Design from scratch
        • Higher fraction of functional proteins
        • Higher diversity → higher chance to find a better hybrid
      • Requirement
        • Identify recombination site
        • Identify which segments are useful
        • Identify beneficial segment combinations
      • Existing methods
        • SCHEMA (Hybrid evaluation: avoid breaking connections)
        • FamClash (Hybrid evaluation: avoid changing properties of residue pairs)
        • STAR (Site suggestion according to structural compactness)
      • Known methods are too limited to be a good means for recombination site prediction
      http://www.che.caltech.edu/groups/fha/
    22. Possible approaches
      • Identify a new measure for evaluating hybrids (derived from datasets of biologically produced hybrids)
      • Include more information in the decision process
        • Sequence/Structure (SCHEMA)
        • Chemical features (FamClash)
        • Predicting important residues for structure and/or function
        • Predicting enzyme function from protein sequence
        • Substitution tolerance
        • Hydrophobic patterning
        • Surface clefts or binding sites
        • Solvent accessibility
        • Domains/motifs of parents

    The presentation was given at the CIBCB, 2005, in S…
