1. Hydrophobic Residue
Patterning in β-Strands and
Implications for β-Sheet
Nucleation
Brent Wathen
Dept. of Biochemistry
Queen’s University
2. 2
Outline
• Part I: Introduction
• Proteins
• Protein Folding
• Part II: Protein Structure Prediction
• Goals, Challenges
• Techniques
• State of the Art
• Part III: Residue Patterning on β-Strands
• β-Sheet Nucleation
• Hydrophobic/Hydrophilic Patterning
5. 5
Proteins – Some Basics
• What Is a Protein?
• Linear Sequence of Amino Acids...
Part I: Introduction
6. 6
Proteins – Some Basics
• What Is a Protein?
• Linear Sequence of Amino Acids...
• What is an Amino Acid?
Part I: Introduction
8. 8
Proteins – Some Basics
• How many types of Amino Acids?
Part I: Introduction
9. 9
Proteins – Some Basics
• How many types of Amino Acids?
• 20 Naturally Occurring Amino Acids
• Differ only in SIDE CHAINS
Isoleucine Arginine Tyrosine
Part I: Introduction
10. 10
Proteins – Some Basics
• Amino Acids connect via PEPTIDE BOND
Part I: Introduction
11. 11
Proteins – Some Basics
• Backbone can swivel:
DIHEDRAL ANGLES
• 2 per Amino Acid
• Proteins can be 100’s of
Amino Acids in length!
• Lots of freedom of
movement
Part I: Introduction
13. 13
Protein Functions
• What do proteins do?
• Enzymes
• Cellular Signaling
• Antibodies
Part I: Introduction
14. 14
Protein Functions
• What do proteins do?
• Enzymes
• Cellular Signaling
• Antibodies
• WHAT DON’T THEY DO!
Part I: Introduction
15. 15
Protein Functions
• What do proteins do?
• Enzymes
• Cellular Signaling
• Antibodies
• WHAT DON’T THEY DO!
• Comes from the Greek word Proteios – PRIMARY
• Fundamental to virtually all cellular processes
Part I: Introduction
17. 17
Protein Functions
• How do proteins do so much?
• Proteins FOLD spontaneously
• Assume a characteristic 3D SHAPE
• Shape depends on particular Amino Acid
Sequence
• Shape gives SPECIFIC function
Part I: Introduction
18. 18
Protein Structure
• STRUCTURE-FUNCTION relationship
• Determining structure is often critical in
understanding what a protein does
• 2 main techniques
• X-ray crystallography
• NMR
• 0.5Å RMSD accuracy
• Both are very challenging
• Months to years of work
• Many proteins don’t yield to these methods
Part I: Introduction
19. 19
Protein Structure
• Levels of organization
• Primary Sequence
• Secondary Structure (Modular building blocks)
• α-helices
• β-sheets
• Tertiary Structure
• Quaternary Structure
• Hydrophobic/Hydrophilic Organization
• Hydrophobics ON INSIDE
• Hydrophobic Cores
Part I: Introduction
22. 22
Protein Folding
• What we DO know...
• Protein folding is FAST!!
• Typically a couple of seconds
• Folding is CONSISTENT!!
• Involves weak forces – Non-Covalent
• Hydrogen Bonding, van der Waals, Salt Bridges
• Mostly, 2-STATE systems
• VERY FEW INTERMEDIATES
• Makes it hard to study – BLACK BOX
Part I: Introduction
23. 23
Protein Folding
• What we DON’T know...
• Mechanism...?
• Forces...?
• Relative contributions?
• Hydrophobic Force thought to be critical
Part I: Introduction
24. 24
Intro Summary
• Proteins are central to all living things
• Critical to all biological studies
• Folding process is largely unknown
• Sequence-to-Structure Mapping
• Structure-Function relationship
• Determining Protein Structure Experimentally is
HARD WORK
Part I: Introduction
25. 25
Outline
• Part I: Introduction
• Proteins
• Protein Folding
• Part II: Protein Structure Prediction
• Goals, Challenges
• Techniques
• State of the Art
• Part III: Residue Patterning on β-Strands
• β-Sheet Nucleation
• Hydrophobic/Hydrophilic Patterning
26. 26
The Prediction Problem
Can we predict the final 3D protein structure
knowing only its amino acid sequence?
Part II: Structure Prediction
27. 27
The Prediction Problem
Can we predict the final 3D protein structure
knowing only its amino acid sequence?
• Studied for 4 Decades
• “Holy Grail” in Biological Sciences
• Primary Motivation for Bioinformatics
• Based on this 1-to-1 Mapping of Sequence to
Structure
• Still very much an OPEN PROBLEM
Part II: Structure Prediction
28. 28
PSP: Goals
• Accurate 3D structures. But not there yet.
• Good “guesses”
• Working models for researchers
• Understand the FOLDING PROCESS
• Get into the Black Box
• Only hope for some proteins
• 25% won’t crystallize, too big for NMR
• Best hope for novel protein engineering
• Drug design, etc.
Part II: Structure Prediction
29. 29
PSP: Major Hurdles
• Energetics
• We don’t know all the forces involved in detail
• Too computationally expensive BY FAR!
• Conformational search impossibly large
• 100 a.a. protein, 2 moving dihedrals per residue, 2 possible
positions for each dihedral: 2^200 conformations!
• Levinthal’s Paradox
• Longer than the age of the universe to search
• Proteins fold in a couple of seconds??
• Multiple-minima problem
Part II: Structure Prediction
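The conformation count on this slide can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: the sampling rate (one conformation per picosecond) and the age of the universe (about 13.8 billion years) are assumptions added here, not figures from the slides.

```python
# Levinthal-style estimate using the numbers on the slide.
N_RESIDUES = 100
DIHEDRALS_PER_RESIDUE = 2
STATES_PER_DIHEDRAL = 2

# Assumptions (not from the slides):
SAMPLES_PER_SECOND = 1e12                          # one conformation per picosecond
AGE_OF_UNIVERSE_S = 13.8e9 * 365.25 * 24 * 3600    # ~13.8 billion years in seconds


def conformation_count(n_residues, dihedrals_per_residue, states_per_dihedral):
    """Number of conformations if every dihedral is independent."""
    return states_per_dihedral ** (n_residues * dihedrals_per_residue)


total = conformation_count(N_RESIDUES, DIHEDRALS_PER_RESIDUE, STATES_PER_DIHEDRAL)
search_seconds = total / SAMPLES_PER_SECOND
ratio = search_seconds / AGE_OF_UNIVERSE_S

print(f"conformations:             {total:.3e}")
print(f"exhaustive search (s):     {search_seconds:.3e}")
print(f"multiples of universe age: {ratio:.3e}")
```

Even with only 2 states per dihedral, an exhaustive search under these assumptions takes many orders of magnitude longer than the age of the universe, which is the point of Levinthal's Paradox.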
30. 30
Tertiary Structure Prediction
• Major Techniques
• Template Modeling
• Homology Modeling
• Threading
• Template-Free Modeling
• ab initio Methods
• Physics-Based
• Knowledge-Based
Part II: Structure Prediction
31. 31
Template Modeling
• Homology Modeling
• Works with HOMOLOGS
• ~ 50% of new sequences have HOMOLOGS
• BLAST or PSI-BLAST search to find good models
• Refine:
• Molecular Dynamics
• Energy Minimization
Part II: Structure Prediction
32. 32
Template-Free Modeling
• Modeling based primarily on sequence
• May also use: Secondary Structure Prediction,
analysis of residue contacts in PDB, etc.
• Advantages:
• Can give insights into FOLDING MECHANISMS
• Adaptable: Prions, Membrane, Natively Unfolded
• Doesn’t require homologs
• Only way to model NEW FOLDS
• Useful for de novo protein design
• Disadvantages: HARD!
Part II: Structure Prediction
33. 33
Template-Free Modeling
• Physics-Based
• Use ONLY the PRIMARY SEQUENCE
• Try to model ALL FORCES
• EXTREMELY EXPENSIVE computationally
• Knowledge-Based
• Include other knowledge: SSP, PDB Analysis
• Statistical Energy Potentials
• Not so interested in folding process
• “Hot” area of research
Part II: Structure Prediction
34. 34
Template-Free Modeling
• All methods SIMPLIFY problem
• Reduced Atomic Representations
• C-α’s only; C-α + C-β; etc.
• Simplify Force Fields
• Only van der Waals; only 2-body interactions
• Reduced Conformational Searches
• Lattice Models
• Dihedral Angle Restrictions
Part II: Structure Prediction
35. 35
Template-Free Modeling
• Basic Approach:
1. Begin with an unfolded conformation
2. Make small conformational change
3. Measure energy of new conformation
Accept based on heuristic: SA, MC, etc.
4. Repeat until ending criteria reached
• Underlying Assumption:
Correct Conformation has LOWEST ENERGY
Part II: Structure Prediction
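The four-step loop above can be sketched as a Metropolis Monte Carlo search with a simulated-annealing temperature schedule. Everything specific here is an assumption for illustration: `toy_energy` is a made-up function with a known minimum (every angle at -60 degrees), not a real force field, and the step size, temperatures, and step count are arbitrary.

```python
import math
import random


def toy_energy(angles):
    """Hypothetical energy: each dihedral prefers -60 degrees. NOT a real force field."""
    return sum(1.0 - math.cos(math.radians(a + 60.0)) for a in angles)


def anneal(n_angles=10, steps=20000, t_start=2.0, t_end=0.01, seed=0):
    rng = random.Random(seed)
    # 1. Begin with an "unfolded" (random) conformation.
    angles = [rng.uniform(-180.0, 180.0) for _ in range(n_angles)]
    energy = toy_energy(angles)
    best_angles, best_energy = list(angles), energy

    for step in range(steps):
        # Geometric cooling from t_start down to t_end.
        temp = t_start * (t_end / t_start) ** (step / (steps - 1))

        # 2. Make a small conformational change to one dihedral.
        i = rng.randrange(n_angles)
        old = angles[i]
        angles[i] = (old + rng.gauss(0.0, 15.0) + 180.0) % 360.0 - 180.0

        # 3. Measure the energy and accept or reject (Metropolis criterion).
        new_energy = toy_energy(angles)
        delta = new_energy - energy
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            energy = new_energy
            if energy < best_energy:
                best_angles, best_energy = list(angles), energy
        else:
            angles[i] = old  # reject: restore the previous conformation

    # 4. Ending criterion here is simply a fixed number of steps.
    return best_angles, best_energy


best_angles, best_energy = anneal()
print(f"lowest energy found: {best_energy:.4f}")
```

The search only finds the right answer if the underlying assumption holds, namely that the correct conformation is the one with the lowest energy under the chosen energy function.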
36. 36
Diverse Efforts
• Data Mining
• Pattern Classification
• Neural Networks, HMMs, Nearest Neighbour, etc.
• Packing Algorithms
• Search Optimization
• Traveling Salesman Problem
• Contact Maps, Contact Order
• Constraint Logic, etc.
• Combinations of the above!
Part II: Structure Prediction
37. 37
ROSETTA
• Pioneered by Baker Group (U. of Washington)
• Fragment Based Method
• Guiding Assumption:
• Fragment Conformations in PDB approximate their
structural preferences
• Pre-build fragment library
• Alleviates need to do local energy calculations
• Lowest energy conformations should already be in
library
Part II: Structure Prediction
38. 38
ROSETTA
• Pre-build fragment library
• 3-mers and 9-mers
• 200 structural possibilities for each
• Build conformations from the library
• Randomly assign 3-mers, 9-mers along chain
• During conformational search, reassign a 3-mer or a
9-mer to a new conformation at random
• Score using energy function
• Adaptive: Coarse grain at first, detailed at end
• Accept changes based on Monte Carlo method
Part II: Structure Prediction
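The fragment-replacement idea can be illustrated with a toy search over backbone (phi, psi) angles. This is a sketch of the general technique only, not the Baker group's code: the fragment library is randomly generated rather than taken from the PDB, `toy_score` is a placeholder that measures distance from a made-up all-helix target, and the choice to use 9-mers in the first half of the run and 3-mers in the second is an assumption of this example.

```python
import math
import random

HELIX = (-60.0, -45.0)     # approximate (phi, psi) of an alpha-helix
STRAND = (-120.0, 130.0)   # approximate (phi, psi) of a beta-strand


def build_library(n_res, frag_len, n_frags, rng):
    """Hypothetical library: n_frags candidate fragments per window start."""
    library = {}
    for start in range(n_res - frag_len + 1):
        frags = []
        for _ in range(n_frags):
            phi, psi = rng.choice((HELIX, STRAND))
            frags.append([(phi + rng.gauss(0, 10), psi + rng.gauss(0, 10))
                          for _ in range(frag_len)])
        library[start] = frags
    return library


def toy_score(torsions, target):
    """Placeholder score: mean squared torsion deviation from a made-up target."""
    return sum((p - tp) ** 2 + (s - ts) ** 2
               for (p, s), (tp, ts) in zip(torsions, target)) / len(torsions)


def fragment_assembly(n_res=30, steps=5000, temp=50.0, seed=1):
    rng = random.Random(seed)
    target = [HELIX] * n_res
    # 200 candidates per window, as on the slide.
    libs = {9: build_library(n_res, 9, 200, rng),
            3: build_library(n_res, 3, 200, rng)}
    torsions = [STRAND] * n_res  # simplification: start from an extended chain
    start_score = score = toy_score(torsions, target)

    for step in range(steps):
        frag_len = 9 if step < steps // 2 else 3  # larger moves first (assumption)
        start = rng.randrange(n_res - frag_len + 1)
        frag = rng.choice(libs[frag_len][start])
        trial = torsions[:start] + frag + torsions[start + frag_len:]
        trial_score = toy_score(trial, target)
        delta = trial_score - score
        # Monte Carlo (Metropolis) acceptance of the fragment swap.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            torsions, score = trial, trial_score
    return torsions, start_score, score


torsions, start_score, final_score = fragment_assembly()
print(f"score: {start_score:.1f} -> {final_score:.1f}")
```

Because every move swaps in a whole pre-built fragment, the search never has to evaluate local geometry from scratch, which is the benefit the slides attribute to the fragment library.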
40. 40
State of the Art
• CASP Competition
• Critical Assessment of Structure Prediction
• Blind Competition Every 2 years
• CASP6 in 2004 - CASP7 just completed
• ~75 proteins whose structures have not been
published as yet
• Easy examples: close homologs available
• Distant homologs available
• De novo structures: no homologs known
Part II: Structure Prediction
41. 41
State of the Art
• Template Modeling
CASP6 Target 266 (green), and best model (blue)
Moult, J. (2005) Curr. Opin. Struct. Biol. 15:285-289
Part II: Structure Prediction
42. 42
State of the Art
• Template Modeling
• Alignment still not easy, and often requires multiple
templates
• Accurate core models (within 2-3Å RMSD)
• Still not good at modeling regions missing from
template
• Side-chain modeling not too good
• Molecular dynamics not able to improve models as
hoped
Part II: Structure Prediction
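The RMSD figures quoted for model accuracy can be computed as follows. This is a minimal sketch that assumes the two structures have the same atoms in the same order and are already superimposed; real comparisons first find the optimal superposition, which is not done here. The coordinates are made up.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (in the input units) between matched atoms.

    Assumes both coordinate lists are already superimposed and in the same order.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Made-up C-alpha positions (in Angstroms); the model is shifted 2 A along y.
native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model = [(x, y + 2.0, z) for (x, y, z) in native]

print(f"RMSD = {rmsd(native, model):.2f} A")
```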
43. 43
State of the Art
• Template-Free Modeling
CASP6 target 201, and best model.
Vincent, J.J. et al. (2005) Proteins 7:67-83.
Part II: Structure Prediction
44. 44
State of the Art
• Template-Free Modeling
CASP6 target 241, and 3 best models.
Vincent, J.J. et al. (2005) Proteins 7:67-83.
Part II: Structure Prediction
45. 45
State of the Art
• How Good are Current Techniques?
• CASP6 Summary:
“The disappointing results for [hard new fold] targets
suggest that the prediction community as a whole
has learned to copy well but has not really learned
how proteins fold.”
Vincent, J.J. et al. (2005) Proteins 7:67-83.
Part II: Structure Prediction
46. 46
PSP Summary
• Many diverse, creative efforts
• Progress IS being made in finding final 3D
structures
• Less so with regards to understanding folding
mechanisms
• NEEDED:
• Marriage of Creative Ideas and Increased
Resources
Part II: Structure Prediction
47. 47
Outline
• Part I: Introduction
• Proteins
• Protein Folding
• Part II: Protein Structure Prediction
• Goals, Challenges
• Techniques
• State of the Art
• Part III: Residue Patterning on β-Strands
• β-Sheet Nucleation
• Hydrophobic/Hydrophilic Patterning
48. 48
β-Sheet Basics
• Made up of β-Strands
• Diverse:
• Parallel/Antiparallel
• Edge/Interior Strands
• Typically Twisted
• Many Forms
• β-sandwiches, β-barrels, β-helices, β-propellers, etc.
• 2D? 3D?
• Less studied than helices
Part III: β-Strand Patterning
50. 50
Beta Sheet Basics
• What do we know?
• Residues:
• V, I, F, Y, W, T, C, L
• Found largely in Protein Cores
• Amphipathic Nature
Part III: β-Strand Patterning
52. 52
Theory of β-Sheet Nucleation
• Hydrophobic Zipper (HZ)
• Dill et al. (1993)
• Hydrophobic residues from different parts of
chain make initial contact
• Correct alignment of backbones
• Hydrogen bonding
• Subsequent growth via “Zipping Up”
Part III: β-Strand Patterning
53. 53
Theory of β-Sheet Nucleation
• Hydrophobic Zipper (HZ)
Dill, K.A. et al. (1993) Proc. Natl. Acad. Sci. USA 90: 1942-1946.
Part III: β-Strand Patterning
54. 54
Theory of Nucleation
• Hydrophobic Zipper (HZ)
• Once Hydrophobic “Seed” established, can
grow out in 2 directions
Part III: β-Strand Patterning
56. 56
Thought Experiment...
• What would a Beta Seed look like?
• Contain hydrophobics
• On both strands
Part III: β-Strand Patterning
57. 57
Thought Experiment...
• What would a Beta Seed look like?
• Contain hydrophobics
• On both strands
• How many?
• Will single hydrophobic on each strand be
sufficient?
Part III: β-Strand Patterning
58. 58
Thought Experiment...
• What would a Beta Seed look like?
• Contain hydrophobics
• On both strands
• How many?
• Will single hydrophobic on each strand be
sufficient?
• Single Unlikely:
• 1 Hydrophobic Residue NOT SPECIFIC ENOUGH
• Too many possible combinations
Part III: β-Strand Patterning
59. 59
Thought Experiment...
• What would a Beta Seed look like?
• Contain hydrophobics
• On both strands
• How many?
• Will single hydrophobic on each strand be
sufficient?
• Single Unlikely:
• 1 Hydrophobic Residue NOT SPECIFIC ENOUGH
• Too many possible combinations
At least 1 strand must have >1 Hydrophobic
Part III: β-Strand Patterning
60. 60
Thought Experiment...
• What hydrophobic arrangement would lead to
Beta Sheet Nucleation?
• i,i+1?
• i,i+2?
• i,i+3?
• i,i+4?
Part III: β-Strand Patterning
61. 61
Thought Experiment...
• What hydrophobic arrangement would lead to
Beta Sheet Nucleation?
• i,i+1? No, not likely: Amphipathic.
• i,i+2?
• i,i+3?
• i,i+4?
Part III: β-Strand Patterning
62. 62
Thought Experiment...
• What hydrophobic arrangement would lead to
Beta Sheet Nucleation?
• i,i+1? No, not likely: Amphipathic.
• i,i+2?
• i,i+3? No... Amphipathic.
• i,i+4?
Part III: β-Strand Patterning
63. 63
Thought Experiment...
• What hydrophobic arrangement would lead to
Beta Sheet Nucleation?
• i,i+1? No, not likely: Amphipathic.
• i,i+2?
• i,i+3? No... Amphipathic.
• i,i+4? Seems too far apart...
Part III: β-Strand Patterning
64. 64
Thought Experiment...
• What hydrophobic arrangement would lead to
Beta Sheet Nucleation?
• i,i+1? No, not likely: Amphipathic.
• i,i+2? Most likely.
• i,i+3? No... Amphipathic.
• i,i+4? Seems too far apart... Chain loop?
Part III: β-Strand Patterning
65. 65
Hypothesis
Assuming:
• Beta Sheets Nucleate by Hydrophobics (HZ)
• i,i+2 hydrophobic pairings on beta strands are
necessary for nucleation
Part III: β-Strand Patterning
66. 66
Hypothesis
Assuming:
• Sec. structures contain their nucleating residues
• Beta Sheets Nucleate by Hydrophobics (HZ)
• i,i+2 hydrophobic pairings on beta strands are
necessary for nucleation
Beta Strands contain an increased frequency of
i,i+2 hydrophobic residue pairings.
Part III: β-Strand Patterning
71. 71
Technique
• Looking for statistically significant patterns
• For any particular pattern:
1. Count how often it occurs in database
2. Randomly shuffle residues in sheets
3. Re-count how often pattern occurs
4. Repeat random shuffle and counting x1000
5. Compare initial count with avg random count
Calculate the Std Dev σ of the random counts
z-score = (initial count - avg random count) / σ
If |z-score| > 3.0, statistically significant
Part III: β-Strand Patterning
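The shuffle test above can be sketched in a few lines. Several details are assumptions of this example rather than the talk's actual implementation: residues are shuffled within each strand (the slide says "in sheets"), the result is read as a z-score as the z-Score axes in the Results slides suggest, and the strand sequences are made up.

```python
import random
import statistics

HYDROPHOBIC = frozenset("VLIFM")


def count_pairs(strands, offset, residues=HYDROPHOBIC):
    """Count i,i+offset pairs where both positions hold a residue from the set."""
    return sum(1
               for s in strands
               for i in range(len(s) - offset)
               if s[i] in residues and s[i + offset] in residues)


def shuffle_z_score(strands, offset, n_shuffles=1000, seed=0):
    rng = random.Random(seed)
    observed = count_pairs(strands, offset)           # step 1
    random_counts = []
    for _ in range(n_shuffles):                       # step 4
        shuffled = []
        for s in strands:                             # step 2 (per strand: assumption)
            chars = list(s)
            rng.shuffle(chars)
            shuffled.append("".join(chars))
        random_counts.append(count_pairs(shuffled, offset))  # step 3
    mean = statistics.mean(random_counts)             # step 5
    sigma = statistics.pstdev(random_counts)
    z = (observed - mean) / sigma if sigma > 0 else 0.0
    return observed, mean, sigma, z


# Made-up data: 20 perfectly alternating strands.
strands = ["VKVKVKV"] * 20
observed, mean, sigma, z = shuffle_z_score(strands, offset=2, n_shuffles=200)
print(f"i,i+2: observed={observed}, random mean={mean:.1f}, z={z:.1f}")
```

Shuffling keeps the residue composition of each strand fixed, so a large z-score reflects the ordering of residues rather than how many hydrophobics the strands contain.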
78. 78
Technique
• Patterns of Interest:
• Hydrophobic patterning (V L I F M)
• Hydrophilic patterning (K R E D S T N Q)
• Positions:
• i,i+1
• i,i+2
• i,i+3
• i,i+4
• Consider only strands of length >= 5 residues
Part III: β-Strand Patterning
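The pattern definitions above translate directly into a counting routine. This is a sketch under stated assumptions: strands are given as plain one-letter sequences, the function and constant names are made up for illustration, and the three example strands are invented.

```python
HYDROPHOBIC = frozenset("VLIFM")
HYDROPHILIC = frozenset("KREDSTNQ")
OFFSETS = (1, 2, 3, 4)        # i,i+1 through i,i+4
MIN_STRAND_LENGTH = 5         # shorter strands are ignored


def pair_counts(strands, residue_class):
    """Count same-class residue pairs at each offset, over long-enough strands."""
    counts = {k: 0 for k in OFFSETS}
    for strand in strands:
        if len(strand) < MIN_STRAND_LENGTH:
            continue
        for k in OFFSETS:
            for i in range(len(strand) - k):
                if strand[i] in residue_class and strand[i + k] in residue_class:
                    counts[k] += 1
    return counts


# Invented strands; "IVL" is skipped because it is shorter than 5 residues.
example = ["VKVTV", "IVL", "KEVIFQ"]
hydrophobic_counts = pair_counts(example, HYDROPHOBIC)
hydrophilic_counts = pair_counts(example, HYDROPHILIC)
print("hydrophobic:", hydrophobic_counts)   # {1: 2, 2: 3, 3: 0, 4: 1}
print("hydrophilic:", hydrophilic_counts)   # {1: 1, 2: 1, 3: 0, 4: 1}
```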
97. 97
Results
• Hydrophobics: Summary
• Where are the hydrophobic pairings??
• Not at i,i+1 or i,i+3 or i,i+4
• Barely at i,i+2
• Note:
• Moderate i,i+2 pairing: No strong aggregation
• Low i,i+4 pairing: Not Dispersed! Isolated
Part III: β-Strand Patterning
115. 115
Results
• Examine localized hydrophobic pairings...
• Summary:
• Localized i,i+2 hydrophobic pairing at NT and CT
• Disfavoured at interior positions
[Figure: z-Score of localized i,i+2 hydrophobic pairing by Pattern Location (NT, NT+1, NT+2, Central, CT-2, CT-1, CT, Avg)]
Part III: β-Strand Patterning
116. 116
Results
• Examine localized hydrophobic pairings...
• Are these patterns sense-specific?
• @ NT+1:
• Favoured for Parallel, Antiparallel
[Figure: z-Score of localized hydrophobic pairing at NT+1 by Strand Type (Parallel, Antiparallel, Mixed, Edge)]
Part III: β-Strand Patterning
117. 117
Results
• Examine localized hydrophobic pairings...
• Are these patterns sense-specific?
• @ CT:
• Favoured for Antiparallel, Mixed
• NOT PARALLEL!
[Figure: z-Score of localized hydrophobic pairing at CT by Strand Type (Parallel, Antiparallel, Mixed, Edge)]
Part III: β-Strand Patterning
118. 118
Conclusions
• Hydrophobic patterning suggests:
• Hydrophobics are located on one side of beta
sheets: AMPHIPATHIC
• Hydrophobics are CLUSTERED
• Hydrophobics aggregate at NT, CT
• Parallel Strands: @ NT only
• Antiparallel Strands: @ NT & CT
• Supports HYDROPHOBIC ZIPPER theory for
sheet nucleation
Part III: β-Strand Patterning
122. 122
Implications
• How do beta sheets nucleate?
• Antiparallel
• Nucleate at edge
• Growth is unidirectional
Part III: β-Strand Patterning
123. 123
Future Work
1. Extend this work to 2D
Both intra- and inter-strand patterning
2. Consider more complex patterning
3 residues on one strand? NT Position?
Specific residue combinations?
3. Consider patterning by beta-sheet type
Beta Helices, Barrels, Sandwiches, etc.
Part III: β-Strand Patterning
124. 124
Acknowledgements
• Dr. Jia
• Lab Members
• Dr. Qilu Ye
• Dr. Vinay Singh
• Dr. Susan Yates
• Daniel Lee
• Jimmy Zheng
• Neilin Jaffer
• Andrew Wong
• Michael Suits
• Laura van Staalduinen
• Mark Currie
• Kateryna Podzelinska
• NSERC