Following the Evolution of New Protein Folds via Protodomains

Protein evolution proceeds through genetic mechanisms, but selection acts on biological assemblies. I define a protodomain as a minimal independently evolving unit with conserved structure. Protodomain rearrangements have minimal impact on biological assemblies, so they represent a valid evolutionary path through fold space.

These slides are from my Candidacy Exam on Jan 28, 2013 at the University of California, San Diego. They discuss my current research in Philip Bourne's lab and propose research for my thesis over the next two years. An audio version is available at http://www.scivee.tv/node/57082

Published in: Health & Medicine
  • Implications not within scope (e.g. sequence comparisons). Define motifs. Acknowledge Philippe’s work.
  • Orengo, C A, Flores, T P, Taylor, W R, Thornton, J M. Identification and classification of protein fold families. Protein Eng (1993) vol. 6 (5) pp. 485-500
      1. SSAP structure comparison
      2. 150 non-redundant reps
      3. See 3 clusters by multidimensional scaling
    Holm and Sander. Protein structure comparison by alignment of distance matrices. J Mol Biol (1993) vol. 233 (1) pp. 123-38
      1. Original DALI
      2. 225 representatives
      3. Finds 3 clusters by hierarchical clustering
    Holm and Sander. Touring protein fold space with Dali/FSSP. Nucleic Acids Res (1998) vol. 26 (1) pp. 316-9
      1. Automated classification of fold space
    Holm and Sander. Mapping the protein universe. Science (1996) vol. 273 (5275) pp. 595-603
      1. Use multivariate scaling to project proteins to 2D
      2. Use 287 unique folds as input
      3. Find 5 classes
      4. DALI
      5. Updates: Holm and Sander. Touring protein fold space with Dali/FSSP. Nucleic Acids Res (1998) vol. 26 (1) pp. 316-9
    Shindyalov and Bourne. An alternative view of protein fold space. Proteins (2000) vol. 38 (3) pp. 247-60
      1. 2016 repr (using fast structural alignment), but only use 75 of them?
      2. All-v-all, but no visualization
    Hou, Jingtong, Sims, Gregory E, Zhang, Chao, Kim, Sung-Hou H. A global representation of the protein fold space. Proceedings of the National Academy of Sciences of the United States of America (2003) vol. 100 (5) pp. 2386-90
      1. 3D projection
      2. Incorporates SCOP
      3. 498 SCOP fold reprs
    Choi and Kim. Evolution of protein structural classes and protein sequence families. Proceedings of the National Academy of Sciences of the United States of America (2006) vol. 103 (38) pp. 14056-61
      1. [from Taylor] Common structural ancestors (CSAs) are estimated for protein families and the age of the CSA plotted in fold space. This shows the b/a class to be the most ancient. Although there is debate about the methods used to estimate age, the ancient nature of the b/a proteins is clear but not unexpected, as many other functional properties suggest their antiquity.
    Taylor. Evolutionary transitions in protein fold space. Curr Opin Struct Biol (2007) vol. 17 (3) pp. 354-61
      1. Concludes that attempts to embed fold space are futile.
      2. Previous attempts were able to distinguish 'class' level but failed at finding significant relationships.
      3. Contains a nice discussion about CP evolution
      4. Cites Orengo, Holm, Hou, Choi.
    Sadreyev et al. Discrete-continuous duality of protein structure space. Curr Opin Struct Biol (2009) vol. 19 (3) pp. 321-8
      1. Argues for continuous structural space with discrete evolutionary space.
      2. Using DALI z-score as metric, cluster 2000 proteins in 2D using CLANS. Find continuous space with some 'mountains' of higher density
    Daniels et al. Touring Protein Space with Matt. IEEE/ACM transactions on computational biology and bioinformatics / IEEE, ACM (2011) PREPRINT
      1. Abstract claims automated classification at superfamily/fold
  • Familiar terms but highlight subtleties
  • Should I mention ongoing Topic Page involvement?
  • Based on Uliel. Bioinformatics (1999) vol. 15 (11) pp. 930-6
  • Lee and Blaber. Experimental support for the evolution of symmetric protein architecture from a simple peptide motif. PNAS (2011) vol. 108 (1) pp. 126-30
  • Would already be detected as protodomain
  • Green: active transporters. Red: channels. Omit legend.
  • a - All alpha proteins
    b - All beta proteins
    c - Alpha and beta proteins (a/b)
    d - Alpha and beta proteins (a+b)
    e - Multi-domain proteins (alpha and beta)
    f - Membrane and cell surface proteins and peptides
    g - Small proteins
    h - Coiled coil proteins
    i - Low resolution protein structures
    j - Peptides
    k - Designed proteins
    Main cluster: 6000/7467 = 80% nodes, 83780/86878 = 96% edges

    1. FOLLOWING THE EVOLUTION OF NEW PROTEIN FOLDS VIA PROTODOMAINS. Spencer Bliven. January 28, 2013. Advancement to Candidacy Exam
    2. CATH: http://www.cathdb.info/browse/browse_hierarchy SCOP: http://scop.mrc-lmb.cam.ac.uk/scop/
    3. CONTINUITY Sadreyev, R. I., Kim, B.-H., & Grishin, N. V. (2009). Discrete-continuous duality of protein structure space. Current Opinion in Structural Biology, 19(3), 321–328. Grishin. J Struct Biol (2001) vol. 134 (2-3) pp. 167-85
    4. MODELS OF FOLD SPACE (figure labels: α, β, α/β, α+β) Orengo, Flores, Taylor, Thornton. Protein Eng (1993) vol. 6 (5) pp. 485-500. Holm and Sander. J Mol Biol (1993) vol. 233 (1) pp. 123-38. Holm and Sander. Science (1996) vol. 273 (5275) pp. 595-603. Shindyalov and Bourne. Proteins (2000) vol. 38 (3) pp. 247-60. Hou, Sims, Zhang, Kim. PNAS (2003) vol. 100 (5) pp. 2386-90. Taylor. Curr Opin Struct Biol (2007) vol. 17 (3) pp. 354-61. Sadreyev et al. Curr Opin Struct Biol (2009) vol. 19 (3) pp. 321-8
    5. BIG QUESTIONS Is fold space discrete or continuous? Where do new folds come from? What insights can we gain by studying fold space?
    6. DEFINITIONS
    7. BIOLOGICAL ASSEMBLIES Sesbania mosaic virus [1VAK]; Asymmetric Unit; Biological Assembly; Hemoglobin [1hv4]
    8. DOMAINS Compact Geometry; Independently Folding; Non-contiguous domains; Multi-chain domains. Squalene-Hopene Cyclase [1SQC]; Kunitz-type trypsin inhibitor [1r8o]
    9. FOLD Group of domains with: same major secondary structural elements; same mutual orientation; same connectivity
    10. PROTODOMAINS A protodomain is a minimal, independently evolving protein unit with a conserved structure. Defined through evolution, but usually observed as structural motif. Coined by Philippe Youkharibache
    11. PROTODOMAINS A protodomain is a minimal, independently evolving protein unit with a conserved structure. Glyoxalase I from Clostridium acetobutylicum [3HDP]; GTP binding regulator from Thermotoga maritima [1VR8]; Glyoxalase I in E. coli [1F9Z]; Pseudomonas 1,2-dihydroxynaphthalene dioxygenase [2EHZ]
    12. PROPOSAL
    13. SPECIFIC AIMS 1. Improve algorithms to identify conserved protodomains globally across the PDB. 2. Identify structurally similar and potentially homologous protodomains across fold space. 3. Integrate protodomain arrangements with domain and quaternary structure information to create a parsimonious model of fold evolution across the tree of life. 4. Apply protodomain principles to understanding the evolution of specific protein families.
    14. AIM 1 Improve algorithms to identify conserved protodomains globally across the PDB. Preliminary Research: a) Circular Permutation with CE-CP b) Symmetry with CE-Symm. Proposed Research: a) Improve CE-Symm algorithm b) Create algorithms for other types of protodomain rearrangements c) Run algorithms globally across the PDB d) Create non-redundant catalogue of protodomains
    15. CIRCULAR PERMUTATION Spencer Bliven and Andreas Prlić. Circular Permutation in Proteins. PLoS Comput Biol (2012) 8(3): e1002445.
    16. CIRCULAR PERMUTATION EVOLUTION Fission & Fusion; Permutation by Duplication
    17. CE-CP A Prlić, S Bliven, P Rose, J Jacobsen, PV Troshin, M Chapman, J Gao, CH Koh, S Foisy, R Holland, G Rimša, ML Heuer, H. Brandstätter–Müller, PE Bourne, and S Willis. BioJava: an open-source framework for bioinformatics in 2012. Bioinformatics (2012). http://www.rcsb.org/pdb/workbench/workbench.do (figure labels: N and C termini) Concanavalin A [1NLS.A] vs. Pea Lectin [1RIN.A+B]; Molybdate-binding protein [1ATG.A] vs. OpuAC [2B4L.A]; Regulator of G protein signaling 10 [2IHB.A] vs. vaccinia H1-related phosphatase [1VHR.A]
    18. DETECTING CIRCULAR PERMUTATIONS
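
The slides name CE-CP and "permutation by duplication" but do not spell out a detection procedure. As a rough illustration only, the sketch below shows a sequence-level toy analogue of the duplication idea: if one sequence is a circular permutation of another, it appears as a contiguous window inside the other sequence concatenated with itself. The function name and the example strings are hypothetical, and exact string matching stands in for the structural alignment that a real method would use.

```python
from typing import Optional


def circular_shift(a: str, b: str) -> Optional[int]:
    """Return k such that b == a[k:] + a[:k], or None if b is not a rotation of a.

    Toy analogue of detection by duplication: every rotation of `a`
    occurs as a contiguous window of length len(a) inside `a + a`.
    """
    if len(a) != len(b):
        return None
    k = (a + a).find(b)
    return k if k != -1 else None


# Hypothetical sequences: the second is the first with its termini moved.
print(circular_shift("ABCDEFG", "DEFGABC"))  # shift of 3
print(circular_shift("ABC", "ACB"))          # None, not a rotation
```

Real proteins diverge after permutation, so an exact match would be replaced by a scored alignment against the duplicated partner; the shift `k` then corresponds to the permutation site.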
    19. SYMMETRY Beta Propeller. Goodsell, D. S., & Olson, A. J. (2000). Structural symmetry and protein function. Annual Review of Biophysics and Biomolecular Structure, 29, 105–153.
    20. SYMMETRY Functionally important: protein evolution (e.g. beta-trefoil); DNA binding; allosteric regulation; cooperativity. Widespread (19% of proteins). FGF-1 [3JUT]; TATA Binding Protein [1TGH]; Hemoglobin [4HHB]
    21. SYMMETRY EVOLUTION Start with perfectly symmetric homomer; Duplications & Fusions; Symmetry lost to drift
    22. INTERMEDIATES TO BETA-TREFOIL FGF-1 [3JUT] Lee, J., & Blaber, M. (2011). Experimental support for the evolution of symmetric protein architecture from a simple peptide motif. PNAS, 108(1), 126–130.
    23. CE-SYMM WISHLIST Find alignments for all valid rotations; Refine alignments based on isomorphism constraints; Utilize crystallographic symmetry more efficiently for biological assemblies; Triose Phosphate Isomerase [8TIM]; Detect multiple axes of symmetry; 5-enol-pyruvyl shikimate-3-phosphate (EPSP) synthase [1G6S]
    24. CE-SYMM Andreas Prlić, Spencer E. Bliven, Peter W. Rose, Philippe Youkharibache, Douglas Myers-Turnbull, Philip E. Bourne. On Symmetry and Pseudo-Symmetry in Proteins. In preparation. FGF-1 [3JUT]; AmtB [3C1G]
    25. CE-SYMM FGF-1
    26. CE-SYMM FGF-1
    27. ADDITIONAL METHODS FOR DETECTING PROTODOMAINS Changes in Quaternary Structure; Protodomain searches (Douglas Myers-Turnbull); Domain Swapping
    28. AIM 2 Identify structurally similar and potentially homologous protodomains across fold space. Preliminary Research: a) All-vs-all comparison of chains & domains b) Clustering & network analysis. Proposed Research: a) Run all-vs-all comparison of protodomains b) Build protodomain similarity network c) Correlate network with existing properties: ligand binding, symmetry order, enzymatic activity, distribution across organisms, etc.
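
One simple summary of a similarity network like the one proposed in Aim 2 is how much of it falls into the largest connected component. The sketch below assumes the network is already available as a node list plus a list of undirected edges (pairs whose similarity passed some threshold chosen elsewhere); the function names and the tiny example graph are hypothetical, not taken from the slides.

```python
from collections import defaultdict


def largest_component(nodes, edges):
    """Return the set of nodes in the largest connected component."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)

    seen, best = set(), set()
    for start in nodes:
        if start in seen:
            continue
        # Depth-first traversal collecting everything reachable from start.
        component, stack = {start}, [start]
        while stack:
            current = stack.pop()
            for neighbor in adjacency[current]:
                if neighbor not in component:
                    component.add(neighbor)
                    stack.append(neighbor)
        seen |= component
        if len(component) > len(best):
            best = component
    return best


def main_cluster_coverage(nodes, edges):
    """Fraction of nodes and of edges that lie inside the largest component."""
    if not nodes:
        return 0.0, 0.0
    main = largest_component(nodes, edges)
    inside = sum(1 for u, v in edges if u in main and v in main)
    edge_fraction = inside / len(edges) if edges else 0.0
    return len(main) / len(nodes), edge_fraction


# Hypothetical five-node network with two components.
nodes = ["a", "b", "c", "d", "e"]
edges = [("a", "b"), ("b", "c"), ("d", "e")]
print(main_cluster_coverage(nodes, edges))
```

The two fractions are the same kind of node and edge coverage that a "main cluster" statistic reports; correlating components with properties such as symmetry order would be a separate step.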
    29. ALL-VS-ALL STRUCTURAL ALIGNMENT Andreas Prlić, Spencer Bliven, Peter W Rose, Wolfgang F. Bluhm, Chris Bizon, Adam Godzik, Philip E. Bourne. Precalculated Protein Structure Alignments at the RCSB PDB website. Bioinformatics (2010) vol. 26 (23) pp. 2983-2985
    30. ALL-VS-ALL STRUCTURAL ALIGNMENT Use sequence clustering to get representative chains with <40% sequence identity (currently 23410); Split into domains by SCOP or PDP; All chains and domains compared using FATCAT; Use Open Science Grid (OSG); Client/Server architecture for aggregating results … Scores …
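
To give a sense of scale, the sketch below counts the unordered pairs implied by an all-vs-all comparison of 23410 representative chains and shows one way such pairs could be split into fixed-size batches for grid jobs. It counts chains only (domains would add more pairs), and the batching scheme, function names, and chain identifiers are assumptions for illustration rather than the actual client/server design.

```python
from itertools import combinations, islice


def pair_count(n: int) -> int:
    """Number of unordered pairs among n items: n * (n - 1) / 2."""
    return n * (n - 1) // 2


def pair_batches(items, batch_size: int):
    """Yield lists of at most batch_size unordered pairs, generated lazily."""
    pairs = combinations(items, 2)
    while True:
        batch = list(islice(pairs, batch_size))
        if not batch:
            return
        yield batch


# 23410 representatives is the figure quoted on the slide.
print(pair_count(23410))  # 274002345 pairwise alignments

# Hypothetical chain identifiers, four pairs per job.
for job in pair_batches(["c1", "c2", "c3", "c4"], 4):
    print(job)
```

Generating pairs lazily keeps memory flat even at hundreds of millions of alignments, which is why the batches are produced with `islice` instead of materializing the full pair list.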
    31. NETWORK FROM TRANSPORTER CLASSIFICATION DATABASE (TCDB) Primary Active Transporters; Channels/Pores; Transmembrane Electron Carriers; Group Translocators …
    32. BETA PROPELLERS Symmetry: C4, C5, C6, C7
    33. CROSS-CLASS EXAMPLE 3GP6.A: PagP, modifies lipid A; f.4.1 (transmembrane beta-barrel). 1KT6.A: Retinol-binding protein; b.60.1 (Lipocalins)
    34. AIM 3 Integrate protodomain arrangements with domain and quaternary structure information to create a parsimonious model of fold evolution across the tree of life. Preliminary Research: a) Classification of biological assemblies by quaternary symmetry & chain stoichiometry b) Model for evolution via protodomains. Proposed Research: a) Determine the protodomain content of each biological assembly b) Identify BAs with conserved protodomain architecture but different chain architecture, or vice versa c) Integrate data with model of protodomain evolution
    35. QUATERNARY STRUCTURE Find symmetry & pseudosymmetry within biological assemblies; Functions at chain level; Can use various thresholds to determine stoichiometry (95% sequence, CE alignment, etc). Rhinovirus 2 [3DPR]: I (60,60,60,60,60); GTP Cyclohydrolase I [1A8R]: D5 (10); Hemoglobin [4HHB]: C2 (2,2)
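
The stoichiometry tuples on this slide, such as (2,2) for hemoglobin, can be read as the sizes of groups of equivalent chains under some similarity threshold. The sketch below is a minimal illustration assuming a greedy grouping at 95% sequence identity; the `identity` function is a deliberately crude stand-in (position-wise matching of equal-length strings, no alignment), and the toy sequences are hypothetical.

```python
def identity(a: str, b: str) -> float:
    """Toy identity: fraction of matching positions; 0.0 if lengths differ."""
    if len(a) != len(b):
        return 0.0
    if not a:
        return 1.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def stoichiometry(chains, threshold: float = 0.95):
    """Greedily group chains by identity to a group representative.

    Returns group sizes sorted from largest to smallest, e.g. (2, 2).
    """
    representatives, counts = [], []
    for sequence in chains:
        for i, rep in enumerate(representatives):
            if identity(sequence, rep) >= threshold:
                counts[i] += 1
                break
        else:
            # No existing group is similar enough: start a new one.
            representatives.append(sequence)
            counts.append(1)
    return tuple(sorted(counts, reverse=True))


# Hypothetical assembly with two copies each of two distinct chains.
print(stoichiometry(["AAAAAAAAAA", "CCCCCCCCCC", "AAAAAAAAAA", "CCCCCCCCCC"]))
```

Lowering the threshold, or swapping `identity` for a structural score, would merge diverged chains into one group, which is how pseudosymmetry between related chains could be exposed.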
    36. EVOLUTIONARY MODEL 1. Local Mutation 2. Protodomain fusion 3. Protodomain fission 4. Loss of Interface 5. Gain of Interface 6. New Protodomains
    37. CONNECTION TO FOLD SPACE Mostly local mutations = continuous regions; Protodomain creation & rearrangement = discrete regions; Identifying evolutionary events allows quantitative comparison of the frequencies of each mechanism; Biologically rather than geometrically motivated
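
The comparison of mechanism frequencies mentioned above could be sketched as a tally over annotated events. The code below encodes the six event types from the evolutionary model and maps them to "continuous" or "discrete" only where the slides state it (local mutation versus protodomain creation and rearrangement); interface gain and loss are left as "unspecified" because the slides do not assign them. The names and the example event list are hypothetical.

```python
from collections import Counter
from enum import Enum


class Event(Enum):
    LOCAL_MUTATION = "local mutation"
    FUSION = "protodomain fusion"
    FISSION = "protodomain fission"
    LOSS_OF_INTERFACE = "loss of interface"
    GAIN_OF_INTERFACE = "gain of interface"
    NEW_PROTODOMAIN = "new protodomain"


# Only the assignments stated on the slide; everything else is "unspecified".
REGIME = {
    Event.LOCAL_MUTATION: "continuous",
    Event.FUSION: "discrete",
    Event.FISSION: "discrete",
    Event.NEW_PROTODOMAIN: "discrete",
}


def regime_frequencies(events):
    """Return the fraction of events falling in each regime."""
    counts = Counter(REGIME.get(event, "unspecified") for event in events)
    total = sum(counts.values())
    return {regime: n / total for regime, n in counts.items()} if total else {}


# Hypothetical annotated history of ten events.
history = [Event.LOCAL_MUTATION] * 6 + [
    Event.FUSION,
    Event.FISSION,
    Event.GAIN_OF_INTERFACE,
    Event.NEW_PROTODOMAIN,
]
print(regime_frequencies(history))
```

Keeping the regime mapping separate from the event list makes it easy to test alternative assignments for the interface events without re-annotating the data.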
    38. AIM 4 Apply protodomain principles to understanding the evolution of specific protein families. Qualities: Have good structural coverage; Contain multiple members with symmetry at either domain or quaternary structure level; Contain circularly permuted members; Span a diverse set of folds. Ion Channels; Beta Propellers; AmtB [3C1G]
    39. SODIUM/ASPARTATE SYMPORTER FROM PYROCOCCUS HORIKOSHII (GLTPH) [2NXW] (figure labels: cytoplasm, Top, Side) Forrest, L. R., Krämer, R., & Ziegler, C. (2011). The structural basis of secondary active transport mechanisms. Biochimica et Biophysica Acta, 1807(2), 167–188.
    40. CONCLUSIONS Biological Assemblies are the functional unit of structure; Protodomains can rearrange without modifying the biological assembly; Separating changes in biological assembly from genetic changes can provide evolutionary perspective on fold space: Local Changes = Continuous Evolution; Protodomain rearrangements = Discrete Transitions
    41. TIMELINE
    42. PUBLICATIONS A Prlić, S Bliven, PW Rose, WF Bluhm, C Bizon, A Godzik, PE Bourne. Precalculated Protein Structure Alignments at the RCSB PDB website. Bioinformatics (2010) vol. 26 (23) pp. 2983-2985. Spencer Bliven and Andreas Prlić. Circular Permutation in Proteins. PLoS Comput Biol (2012) 8(3): e1002445. A Prlić, S Bliven, P Rose, J Jacobsen, PV Troshin, M Chapman, J Gao, CH Koh, S Foisy, R Holland, G Rimša, ML Heuer, H Brandstätter–Müller, PE Bourne, and S Willis. BioJava: an open-source framework for bioinformatics in 2012. Bioinformatics (2012). Intended: CE-Symm method; Evolutionary model & examples of protodomain evolution; Structural similarity network analysis; Use of model for specific protein family
    43. ACKNOWLEDGMENTS Committee: Philip Bourne, Milton H. Saier, Russell F. Doolittle, Michael K. Gilson, Adam Godzik. Bourne Lab/PDB: Andreas Prlić, Peter Rose, Douglas Myers-Turnbull, Lab & PDB members. Collaborators: Philippe Youkharibache, Jean-Pierre Changeux, Biojava Contributors. The lovely Christine Bliven
    44. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.
