How aromatic amino acids
promote peptide folding
Olivier Bignucolo, Stephan Grzesiek, Simon Bernèche
Structures of
aromatic amino acids
1.- Why investigate the relation between sequence and peptide conformations?

2.- Experimental section
- Residual dipolar couplings (RDC)
- Aromatics induce some “ordering in the chain”

3.- Theoretical section
- Aromatics induce formation of a turn or an alpha-helix
- Driving force for folding

If anybody has forgotten... this is an alpha-helix
Journal of Structural Biology 179 (2012) 347–358
Why work on protein
conformational prediction?
Exponentially increasing gap
between the number of known
sequences and the number of
resolved structures
Sequences
3D structures
Intrinsically unfolded proteins
TANGUY CHOUARD 2011 | VOL 471 | NATURE | 151
The idea: reduce the complex protein folding problem to the smallest possible scale
<=> How does a single residue affect the conformation?

Experiment
Synthesis of 16 peptides of sequence: EGAAXAASS
Extraction of NMR observables
=> Residual Dipolar Couplings (RDCs)
Dipolar Coupling
N + H => oriented dipole
Similarly Cα and Hα form other oriented dipoles
D ∝ [expression shown as an image on the slide]
Thus, we get information
about an orientation
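For reference, a standard textbook form of the proportionality can be written as below. This is a sketch of the usual expression, not necessarily the exact formula drawn on the slide; the symbols (gyromagnetic ratios, internuclear distance, angle to the magnetic field) are the conventional ones and are assumptions here.

```latex
% Dipolar coupling between two spins I and S (e.g. 15N and 1H):
%   \gamma_I, \gamma_S : gyromagnetic ratios of the two nuclei
%   r_{IS}             : internuclear distance (about constant for a bonded pair)
%   \theta             : angle between the I-S vector and the external field B_0
%   \langle \cdot \rangle : average over time and over the ensemble
D_{IS} \;\propto\; \frac{\gamma_I \, \gamma_S}{r_{IS}^{3}}
        \left\langle \frac{3\cos^{2}\theta - 1}{2} \right\rangle
```

In an isotropic solution the average vanishes; a weak alignment of the molecules leaves a small residual value, which is why the measured coupling reports on bond orientation.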
Why Residual Dipolar Couplings?
 

→ Information about long range interactions
The blue arrows represent the
orientation of the N - H bond
of selected peptide bonds.
http://en.wikipedia.org/wiki/Residual_dipolar_coupling
RDC => short and long range,
quantitative information
Results from Experiment
[Figure: experimental 1H-15N and 1H-13Cα RDCs (Hz) per residue (E1 G2 A3 A4 G5 A6 A7 S8 S9), X=Gly]
Majority of substitutions => flat profile
Results from Experiment
[Figure: experimental 1H-15N and 1H-13Cα RDCs (Hz) per residue, X=Trp (E1 G2 A3 A4 W5 A6 A7 S8 S9) and X=Gly (E1 G2 A3 A4 G5 A6 A7 S8 S9)]
X=Trp: very contrasted experimental pattern
[Figure: experimental 1H-15N and 1H-13Cα RDCs (Hz) per residue, X=Trp and X=Gly]
Conclusions from the NMR spectroscopy:
- For X=Gly => even less order than other amino acids
- For X=Trp => a kink in the peptide backbone
Results from Experiment and Simulations
[Figure: 1H-15N and 1H-13Cα RDCs (Hz) per residue, X=Gly. Black: NMR. Red: values extracted from the MD simulations]
X=Gly: Predicted pattern flat, boring too ... it will be our “control”
Results from Experiment and Simulations
[Figure: experimental and simulated 1H-15N and 1H-13Cα RDCs (Hz) per residue, X=Gly and X=Trp]
Predicted pattern with X=Trp
Results from Experiment and Simulations
[Figure: experimental and simulated 1H-15N and 1H-13Cα RDCs (Hz) per residue, X=Gly and X=Trp]
Something is wrong in these figures...
Results from Experiment and Simulations
[Figure: experimental and simulated 1H-15N and 1H-13Cα RDCs (Hz) per residue, X=Gly and X=Trp]
Let’s present the data correctly!
Aromatics: large variability
Results from Experiment and Simulations
[Figure: experimental and simulated 1H-15N and 1H-13Cα RDCs (Hz) per residue, X=Gly and X=Trp]
Predicted pattern with X=Trp: large variability, some “good” snapshots
Open red squares: a “good” snapshot
Time series and the agreement to experiment
[Figure: time series (ns) over the replicated runs of the RMSD between predicted and experimental 1D_HN and 1D_CAHA RDCs (Hz) and of the radius of gyration (nm)]
Trp: the variability can also be observed within a run
RMSD = Root Mean Square Deviation
between experiment and simulations
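As a minimal sketch of the RMSD used as the agreement score, assuming the predicted and experimental RDCs are available as two equally long lists in Hz: the function name and the example values below are hypothetical and do not come from the study.

```python
import math


def rdc_rmsd(predicted, experimental):
    """Root mean square deviation (Hz) between predicted and experimental RDCs.

    Both arguments are sequences of couplings in Hz, listed in the same
    residue order (hypothetical input layout, chosen for illustration).
    """
    if len(predicted) != len(experimental) or len(predicted) == 0:
        raise ValueError("need two non-empty RDC lists of equal length")
    squared = [(p - e) ** 2 for p, e in zip(predicted, experimental)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up values, only to show the call.
predicted = [1.0, -2.0, 3.0, 0.5]
experimental = [1.5, -1.0, 2.0, 0.5]
print(rdc_rmsd(predicted, experimental))
```

A lower value means the snapshot (or the average over a run) reproduces the measured pattern more closely.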
Time series and the agreement to experiment
[Figure: time series (ns) over the replicated runs of the RMSD between predicted and experimental 1D_HN and 1D_CAHA RDCs (Hz) and of the distance between N and C termini (nm)]
Wow! The distance between the termini overlaps quite well with the agreement to experiment!!
Time series and the agreement to experiment
[Figure: time series (ns) over the replicated runs of the RMSD between predicted and experimental 1D_HN and 1D_CAHA RDCs (Hz), of the distance between N and C termini (nm) and of the radius of gyration, Rg (nm)]
Hey! The same is true for the radius of gyration
[Figure: X=Trp. Time series (ns) over the replicated runs of the RMSD between predicted and experimental RDCs (Hz), N-C distance (nm), Psi 5, H-bonds, Cα 3-4-5-6, Rg (nm) and clusters]
Many structural parameters behave similarly.
But what is important: they are all consistent with each other!
[Figure: X=Tyr. Time series (ns) over the replicated runs of the RMSD between predicted and experimental 1D_HN and 1D_CAHA RDCs (Hz), N-C distance (nm), Psi 5, H-bonds, Cα 3-4-5-6, Rg (nm) and clusters]
[Figure: the same time-series panels for X=Trp and X=Tyr shown side by side, followed by the two H-bonds panels alone]
We can extract much more information from
the intramolecular hydrogen bonding
Let’s play with the conformations of the peptide with X=Trp
One typical 100 ns simulation produces 5001 different conformations.

Step 1: For each individual conformation, count the intramolecular H-Bonds
Step 2: Sort the conformations by the number of H-Bonds: 0, 1, 2...
Step 3: For each conformation in each individual group so produced, calculate RDCs, take average, and draw the RDC pattern

You could also call this:
“Hydrogen bond clustering analysis”
or so ...
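The three steps can be sketched roughly as follows. This is only an illustration: it assumes the H-bond count and the per-residue RDC profile of every frame have already been computed elsewhere, and the function name, the input layout and the "5 or more" top bin (taken from the figure legend) are choices made for the example.

```python
from collections import defaultdict


def average_rdc_by_hbond_count(hbond_counts, rdc_profiles, max_bin=5):
    """Group frames by H-bond count and average the RDC profile per group.

    hbond_counts : one integer per frame (intramolecular H-bonds in that frame)
    rdc_profiles : one list of per-residue RDCs (Hz) per frame
    max_bin      : counts >= max_bin are pooled into one "max_bin or more" group
    Returns {n_hbonds: averaged per-residue RDC list}.
    """
    groups = defaultdict(list)
    # Steps 1 and 2: the counts are given; sort the frames into groups.
    for n_hb, profile in zip(hbond_counts, rdc_profiles):
        groups[min(n_hb, max_bin)].append(profile)
    # Step 3: average the RDCs residue by residue inside each group.
    averages = {}
    for n_hb, profiles in sorted(groups.items()):
        n_res = len(profiles[0])
        averages[n_hb] = [
            sum(p[i] for p in profiles) / len(profiles) for i in range(n_res)
        ]
    return averages


# Made-up frames with two "residues" each, only to show the call.
counts = [0, 0, 3, 3, 7]
profiles = [[1.0, 1.0], [3.0, -1.0], [10.0, -6.0], [12.0, -8.0], [20.0, -10.0]]
print(average_rdc_by_hbond_count(counts, profiles))
```

Each averaged profile can then be drawn next to the experimental pattern, one curve per H-bond group.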
Structures sorted according to numbers of intramolecular H-Bonds
[Figure: 1H-13Cα RDC (Hz) per residue (E1 G2 A3 A4 W5 A6 A7 S8 S9), averaged over the structures with 0, 1, 2, 3, 4, and 5 or more H-bonds, added one curve at a time and compared with Exp.]
0 H-Bonds: Hum, no H-Bonds => boring, flat pattern
2 H-Bonds => pattern a little bit contrasted
3 H-Bonds: Hey, now we have it!
=>
- Strong relation between H-Bonds and the RDC pattern
- More H-Bonds <=> pattern closer to the experimental one
Which intramolecular hydrogen
bonding?
The problem: in this nine-residue peptide, there are
12 H-Bond donors
24 H-Bond acceptors
=> many imaginable combinations
But next question: which H-Bonds???
Which intramolecular hydrogen
bonds are relevant? In other
words: which H-bonds are involved
in the experimental RDC pattern?
From all possible combinations,
only one is statistically
significant. Fortunately, it is
even highly significant!
Figure -- | For X = Trp, the RMSD to experiment depends on the intramolecular, helix-typical hydrogen bonding. Each point represents the average over one 100 ns long simulation (R = -0.92, p < 0.001).

The peptide forms a short helix or a turn
The solution contains a combination of short helices or turns, and extended structures
Figure -- | Representative structures of the two clusters which, if combined in proportions of 30% (A) and 66% (B) of the number of frames, produce the RDCs with the lowest RMSD to experiment. Carbon: silver, Oxygen: red, Nitrogen: blue, Hydrogen: black, secondary structure: purple (A: 3₁₀ helix, B: turn). Hydrogen bonds are highlighted.
Diagrams produced using VMD 39.
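One way such proportions could be searched for is a simple scan over the mixing weight of two cluster-averaged RDC profiles. This sketch is a simplification and not the procedure used in the study: it assumes exactly two clusters whose weights sum to 1 (the caption's 30% and 66% leave a small remainder), and all names and numbers are hypothetical.

```python
import math


def rmsd(a, b):
    """Root mean square deviation between two equally long lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def best_two_cluster_mix(rdc_a, rdc_b, rdc_exp, steps=100):
    """Scan the weight w of cluster A (cluster B gets 1 - w).

    Returns (w, score) for the weight whose population-weighted average
    RDC profile has the lowest RMSD to the experimental profile.
    """
    best_w, best_score = None, None
    for k in range(steps + 1):
        w = k / steps
        mixed = [w * a + (1.0 - w) * b for a, b in zip(rdc_a, rdc_b)]
        score = rmsd(mixed, rdc_exp)
        if best_score is None or score < best_score:
            best_w, best_score = w, score
    return best_w, best_score


# Made-up profiles: the "experiment" is built as 30% A + 70% B.
cluster_a = [10.0, -4.0, 6.0]
cluster_b = [0.0, 0.0, 0.0]
experiment = [3.0, -1.2, 1.8]
print(best_two_cluster_mix(cluster_a, cluster_b, experiment))
```

With more than two clusters the same idea needs a proper constrained fit rather than a one-dimensional scan.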
How might an aromatic side-chain induce folding propensities?
The idea:
Because its side-chain is bulky, Trp would limit the access of water molecules to neighboring carbonyl and amide groups.
Consequently, these backbone atoms would interact more with each other, leading to an increased folding propensity.

How can we prove that this is a driving force?
If this is true, one should be able to observe, when X = Trp, a reduced hydration of these particular atoms, even in conformations for which the peptide is extended.
Two unfolded, rather extended conformations.
Water: only molecules closer than 3 Å from the A6 amide are represented
Left: X = Trp
Right: X = Gly
How might an aromatic side-chain
induce folding propensities?
Figure -- | Number of water molecules in the first hydration shell of carbonyl and amide
groups. We report averages and SE over 7 (Trp) or 9 (Gly) simulations of about 100 ns each.
Localized lack of hydration => nucleation of folding
Any questions?
Dipolar Coupling
Some simplifications:
- Heteronuclear pairs
- ≃ Constant internuclear distance
- Time averaging
Dipolar coupling: [equation shown as an image on the slide]
Interaction energy: [equation shown as an image on the slide]
Hydrogen bonding between
side-chain oxygen of a Serine
and a backbone carbonyl?
Which intramolecular hydrogen
bonds are relevant? In other
words: which H-bonds are involved
in the experimental RDC pattern?
3₁₀-helix-typical hydrogen bond between backbone atoms of residues 5 and 8
+ hydrogen bond between amine group of residue 5 and side-chain alcohol of Serine 8?
A well balanced Hydrogen bonding, with
some H-Bonds all along the line??
1H-15N RDCs calculated from the pdb coordinates of two peptides with an in-house algorithm (ref.
below).


Huang J-r, Grzesiek S (2010) Ensemble Calculations of Unstructured Proteins Constrained by RDC and PRE Data: A Case Study of
Urea-Denatured Ubiquitin. Journal of the American Chemical Society 132: 694-705.
Molecular Dynamics (MD)
Working hypothesis
Chemical shifts: comparison between experimental and predicted values
RMSD = 1.79 ppm
r = 0.99
Chemical shifts of EGAAWAASS
[Figure: nitrogen and hydrogen chemical shifts per residue, axes labeled "Chemical shift (Hz)" and "Residues"]
N => r: 0.99 rmsd: 2.10 Hz
H => r: 0.89 rmsd: 0.20 Hz
Cα => r: 0.99 rmsd: 1.27 Hz
Chemical shifts of EGAAIAASS
[Figure: nitrogen and hydrogen chemical shifts per residue, axes labeled "Chemical shift (Hz)" and "Residues"]
N => r: 0.97 rmsd: 2.69 Hz
H => r: 0.61 rmsd: 0.23 Hz
Cα => r: 0.99 rmsd: 0.92 Hz
Experimental values of Cα A6 not available
What these data show and what they don’t show:

Understand the method:
Chemical shifts were calculated using SPARTA, a database system which predicts backbone chemical shifts of proteins using their structural coordinates as input. To estimate the chemical shifts of a peptide, the program searches within its database for successive triplets with sequence identical to that of the investigated peptide. It further selects structures having a similar set of Φ, Ψ and Χ1 angles as the investigated peptide, and then attributes the empirical values of the selected backbone atoms, additionally weighted according to the degree of similarity as well as some other structural information, to the triplet of the investigated peptide.

Shen Y, Bax A (2007) Protein backbone chemical shifts predicted from
searching a database for torsion angle and sequence homology. Journal
of Biomolecular NMR 38: 289-302.
RDCs: what the time series look like
Peptide with X = Trp
[Figure: 1H-15N RDC (Hz) per residue (G2 A3 A4 X5 A6 A7 S8 S9), with time series of the RDC values fluctuating between about -50 and 50 along an axis from 0 to 100]
As a consequence, we HAVE to work with averages
For comparison: time series of chemical shifts
Peptide with X = Trp
[Figure: 1HN chemical shifts (ppm) per residue (G2 A3 A4 W5 A6 A7 S8) and their time series along an axis from 0 to 100]
Method to search for relations between RDC and structure
Consider the large fluctuations from simulation to simulation as a potential source of information
→ draw or compute relations between replicated simulations and structural parameters
→ confirm the obtained results through clustering and time-series analysis
Relation to the radius of gyration
Relation to the number of intramolecular hydrogen bonds
[Figure: RMSD between pred. and exp. RDCs (Hz) plotted against the radius of gyration (Å, 6.0 to 7.5) and against the hydrogen bonds within the peptide (0 to 3)]
RDCs: comparison between experimental and predicted values
Peptide with X = Trp
Thus, the structures (when X = Trp) which best fit the experiment have:
- smaller distance between r1 and r9
- smaller radius of gyration
- higher number of intramolecular hydrogen bonds
=> Sort (cluster) the frames as a function of the number of intramolecular hydrogen bonds
We thus observe a relation: the RDC pattern is more contrasted for the structures with many intramolecular hydrogen bonds

Curiosity: can we observe anything if we sort the structures according to the number of hydrogen bonds for a peptide with a flat RDC pattern?
RDCs: comparison between experimental and predicted values
Peptide with X = Ile
[Figure: 1H-13Cα RDC (Hz) per residue (E1 G21 G22 A3 A4 I5 A6 A7 S8 S9)]
We observe the same relation for both peptides!

We could thus “design” whichever RDC pattern we wish through the selection of frames with the adequate number of intramolecular hydrogen bonds...

Next question: what actually differentiates these peptides?
[Figure: 1H-13Cα RDC (Hz) per residue (E1 G21 G22 A3 A4 X5 A6 A7 S8 S9). Exp. values, X = Ile and Exp. values, X = Trp]
The peptides differ in the probability distribution of the intramolecular hydrogen bonds
[Figure: probability distributions of the intramolecular hydrogen bonds for X = Ile and X = Trp]
X = Ile: unimodal distribution at around 1.0 - 1.5
X = Trp: multimodal distribution
Energy difference between the hydrogen bond distributions
Boltzmann estimation:

1.- MD produces about 45% of structures with 3-4 H-bonds

2.- Wish about 100%, because these are the structures that best reproduce the experimental results

ΔE = k_B·T·ln(100/45) ≈ 2 kJ·mol⁻¹

Accuracy of current force fields: 2-4 kJ·mol⁻¹ (*)


(*): Shirts MR, Pande VS (2005) Solvation free energies of amino acid
side chain analogs for common molecular mechanics water models. The
Journal of Chemical Physics 122: 134508.
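The Boltzmann estimate can be checked with a few lines. The temperature is not stated on the slide; 300 K is an assumption made here, and the function name is hypothetical. The molar gas constant is used so that the result comes out per mole.

```python
import math

# Molar gas constant in kJ/(mol K); equals k_B times Avogadro's number.
R_KJ_PER_MOL_K = 8.314462618e-3


def boltzmann_energy_gap(p_wished, p_observed, temperature_k=300.0):
    """Energy shift (kJ/mol) that would turn population p_observed into p_wished.

    Uses dE = R * T * ln(p_wished / p_observed); the default temperature
    of 300 K is an assumption, not a value taken from the slides.
    """
    return R_KJ_PER_MOL_K * temperature_k * math.log(p_wished / p_observed)


# About 45% of the frames have 3-4 H-bonds; about 100% is wished for.
delta_e = boltzmann_energy_gap(1.00, 0.45)
print(f"dE is about {delta_e:.1f} kJ/mol")
```

Under this assumption the gap comes out close to 2 kJ/mol, that is, at the lower edge of the quoted force-field accuracy.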
[Figure: probability distribution of the intramolecular hydrogen bonds, X = Trp]
One step further: identification of the relevant hydrogen bonds

Method:
Having observed that the RDC profile of the different peptides is related, at least partly, to the probability distribution of the number of intramolecular hydrogen bonds, we focused on identifying, for each peptide, which particular hydrogen bond drives (or prevents) the structural propensities toward the experimentally observed RDC pattern. Although the total observed number of intramolecular hydrogen bonds was generally small, the peptides contain between 10 (X = Pro) and 12 (X = Trp) donors and between 23 and 24 acceptors.

For the identification of the structurally relevant hydrogen bonds, we counted (using the g_hbond subprogram of GROMACS) for each frame the number of hydrogen bonds between each residue pair. We created tables of 45 columns (number of possible residue pairs for a peptide with 9 residues), and the number of rows was the number of replications (9 simulations with X = Trp, for example, representing 864 ns). The pairs forming less than 2.5% of the total number of hydrogen bonds were discarded from further analysis. In the peptide with X = Trp, for example, only 11 residue pairs sufficed to account for more than 80% of all the observed hydrogen bonds. For each of these residue pairs, we calculated:
- the correlation coefficient between the number of hydrogen bonds and the RMSD between experimental and predicted RDCs,
- the correlation coefficient between the number of hydrogen bonds and the correlation coefficient between experimental and predicted RDCs.

We performed linear regression between each of these sets of values. The selected relevant residue pairs were those for which both parameters were found statistically significant and both correlation coefficients (in absolute value) were higher than 0.6 (arbitrarily chosen cut-off). The same approach was used for the identification of the relevant hydrogen bonds at the atomic level.
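The selection described above could be sketched as follows. This is an illustration under stated assumptions, not the script used in the study: the table is represented as a dictionary from residue pair to per-simulation H-bond counts, the significance test of the regression is left out (only the 2.5% share filter and the 0.6 cut-off are applied), and every name and number is hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0  # a constant series carries no correlation information
    return sxy / math.sqrt(sxx * syy)


def select_relevant_pairs(hbonds, rmsd_per_run, corr_per_run,
                          min_share=0.025, r_cut=0.6):
    """Keep residue pairs whose H-bond count tracks the agreement to experiment.

    hbonds       : {pair: [H-bond count in run 1, run 2, ...]}
    rmsd_per_run : RMSD between predicted and experimental RDCs, per run
    corr_per_run : correlation between predicted and experimental RDCs, per run
    NOTE: no p-value is computed here; the statistical test is omitted.
    """
    total = sum(sum(counts) for counts in hbonds.values())
    selected = {}
    for pair, counts in hbonds.items():
        if total == 0 or sum(counts) / total < min_share:
            continue  # pair forms too small a share of all H-bonds
        r_rmsd = pearson(counts, rmsd_per_run)
        r_corr = pearson(counts, corr_per_run)
        if abs(r_rmsd) > r_cut and abs(r_corr) > r_cut:
            selected[pair] = (r_rmsd, r_corr)
    return selected


# Made-up table with four runs and three residue pairs.
table = {
    ("A3", "A6"): [0.1, 0.2, 0.3, 0.4],
    ("E1", "S9"): [0.2, 0.1, 0.2, 0.1],
    ("G2", "A4"): [0.001, 0.0, 0.002, 0.0],
}
print(select_relevant_pairs(table, [8.0, 7.0, 6.0, 5.0], [0.2, 0.4, 0.6, 0.8]))
```

In the made-up table only the first pair survives both filters; a real analysis would add the regression p-values before accepting a pair.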
One step further: identification of the relevant hydrogen bonds
All are typical of 3₁₀-helix and α-helix: suggests a dynamical behavior, with conformational exchange between 3₁₀ helix and α-helix
One step further: identification of the relevant hydrogen bonds
Peptide with X = Trp
Sum of the following hydrogen bonds (*):
CO of A3 with NH of A6
CO of A4 with NH of A7
CO of A4 with NH of S8
[Figure: RMSD between pred. and exp. RDCs (Hz) plotted against the sum of these hydrogen bonds; linear fit y = -7.513x + 8.0591, R² = 0.82132, p = 0.001]
(*): cut-off value reduced to 3.0 Å to be more selective
Each point: average over one 100 ns simulation
We never have too many proofs
Do time-series analyses confirm all of this?
Do they allow us to go into more detail?
[Figure: time series (ns) over the replicated runs of the RMSD between predicted and experimental 1D_HN and 1D_CAHA RDCs (Hz) and of Psi of residues 4 (dashed line) and 5 (continuous line)]
[Figure: Psi of residue 6 as a function of time (ns), X = Trp and X = Gly]
X = Trp: there is a tendency to maintain the psi value of Ala6 at about -30 (and phi between -50 and -70, not shown), which is typical of turns or helices.
This is not observed for X = Gly.
How might the side-chain of Trp increase the folding propensities of EGAAXAASS?
Hypothesis:
The bulky side-chain of Trp would limit the access of water molecules to the carbonyl and amide of the neighboring residues.
Consequently, these backbone atoms would interact more with each other, leading to an increased folding propensity.
If this is true, one should be able to observe, when X = Trp, a reduced interaction between these particular atoms and water molecules, even in conformations for which the peptide is extended.
VMD => 1 structure with Gly, extended, with 1 or 2 H2O around CO of A3 or A4, because there is space around H of Gly
VMD => 1 structure with Trp, extended, without H2O around CO of A3 or A4, because there is no space between them and the side-chain
Two extended, unfolded conformations. Look at the hydration of the A6 amide proton
Left: X = Gly
Right: X = Trp
How could we verify this hypothesis?
We compared the number of water molecules in the first
solvation shell of CO and NH groups for clusters of structures
where X = Trp which were “folded” (in the figure as Trp fold), or
extended (Trp ext), and of structures where X = Gly, which are
mainly extended.
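Counting the water molecules in a first solvation shell can be sketched as a simple distance count. This is a bare illustration, not the analysis pipeline of the study: coordinates are plain (x, y, z) tuples in nm, periodic boundary conditions are ignored, and the 0.33 nm cut-off is borrowed from the figure axis; the function names and the example frames are hypothetical.

```python
import math


def count_waters_in_shell(site, water_oxygens, cutoff_nm):
    """Number of water oxygens within cutoff_nm of one backbone atom.

    site          : (x, y, z) of the carbonyl O or amide H, in nm
    water_oxygens : list of (x, y, z) water oxygen positions, in nm
    Periodic images are NOT considered in this sketch.
    """
    return sum(1 for o in water_oxygens if math.dist(site, o) <= cutoff_nm)


def mean_shell_count(frames, cutoff_nm):
    """Average shell occupancy over frames given as (site, water_oxygens)."""
    counts = [count_waters_in_shell(site, waters, cutoff_nm)
              for site, waters in frames]
    return sum(counts) / len(counts)


# Two made-up frames around a site placed at the origin.
frames = [
    ((0.0, 0.0, 0.0), [(0.3, 0.0, 0.0), (0.5, 0.0, 0.0)]),
    ((0.0, 0.0, 0.0), [(0.1, 0.1, 0.1), (0.2, 0.2, 0.0), (1.0, 0.0, 0.0)]),
]
print(mean_shell_count(frames, cutoff_nm=0.33))
```

The same count, restricted to the "Trp fold", "Trp ext" and Gly clusters in turn, gives the three bars compared in the figure.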
[Figure: n(r) at 0.33 nm and at 0.28 nm for Trp fold., Trp ext. and Gly; panels: Carbonyl of Ala 3, Amide of Ala 6]
1.- Less H2O interacting with the CO of A3 as well as with the NH of A6 when X = Trp
2.- The tendency remains true even when the Trp-containing peptide is extended.
Figure -- | Number of water molecules in the first hydration shell of carbonyl and amide groups. We report averages and SE over 7 (Trp) or 9 (Gly) simulations of about 100 ns each. The stars refer to the results of Tukey’s Honest Significant Difference test, putting emphasis on the selected extended peptides with X=Trp compared to the peptides with X=Gly.
Bioinformatics approach
“The proteins listed in Table 1 have a distinctive amino acid composition (Table 2), as has also been described for ‘intrinsically disordered’ protein regions [9,43]. In short, they are significantly enriched in P, E, K, S and Q, and depleted in W, Y, F, C, I, L and N, compared with the average folded protein in the PDB. Dunker and colleagues term the first group disorder-promoting amino acids, and the second group order-promoting amino acids [9,43]”.

Arom fold

  • 1.
    How aromatic aminoacids promote peptide folding Olivier Bignucolo, Stephan Grzesiek, Simon Bernèche )
  • 2.
    Structures of aromatic aminoacids How aromatic amino acids promote peptide folding
  • 3.
    1.- Why investigatethe relation between sequence and peptide conformations ? 2.- Experimental sectio n - Residual dipolar couplings (RDC ) - Aromatics induce some “ordering in the chain ” 3.- Theoretical sectio n - Aromatics induce formation of a turn or a Alpha-Heli x - Driving force for folding If anybody has forgotten.. . this is a Alpha-Helix
  • 4.
    Journal of StructuralBiology 179 (2012) 347–358 Why work on protein conformational prediction? Exponentially increasing gap between the number of known sequences and the number of resolved structures Sequences 3D structures
  • 5.
    Intrinsically unfolded proteins TANGUYCHOUARD 2011 | VOL 471 | NATURE | 151
  • 6.
    The idea: reducethe complex protein folding problem to the smallest possible scal e <=> How does a single residue affect the conformation ? Experiment Synthesis of 16 peptides of sequence: EGAAXAAS S Extraction of NMR observables => Residual Dipolar Couplings (RDCs)
  • 7.
  • 8.
    Dipolar Coupling N H N +H => oriented dipole N H
  • 9.
    N H Dipolar Coupling Similarly Cαand Hα form other oriented dipoles Cα Hα
  • 10.
    N H Dipolar Coupling D ∝ Thus, weget information about an orientation
  • 11.
    Why Residual DipolarCouplings? → Information about long range interactions The blue arrows represent the orientation of the N - H bond of selected peptide bonds. http://en.wikipedia.org/wiki/Residual_dipolar_coupling RDC => short and long range, quantitative information
  • 12.
    Results from Experiment. X=Gly. [Figure: experimental 1H-15N RDC (Hz) for residues G2 to S9 and 1H-13Cα RDC (Hz) for residues E1 to S9.] Majority of substitutions => flat profile
  • 13.
    Results from Experiment. [Figure: experimental 1H-15N and 1H-13Cα RDCs (Hz) per residue, X=Trp next to X=Gly.] X=Trp: very contrasted experimental pattern
  • 14.
    [Figure: experimental 1H-15N and 1H-13Cα RDCs (Hz) per residue, X=Trp next to X=Gly.] Conclusions from the NMR spectroscopy: - For X=Gly => even less order than other amino acids - For X=Trp => a kink in the peptide backbone
  • 15.
    Results from Experiment and Simulations. X=Gly. [Figure: 1H-15N and 1H-13Cα RDCs (Hz) per residue. Black: NMR. Red: values extracted from the MD simulations.] X=Gly: predicted pattern flat, boring too ... it will be our “control”
  • 16.
    Results from Experiment and Simulations. [Figure: 1H-15N and 1H-13Cα RDCs (Hz) per residue, X=Gly and X=Trp.] Predicted pattern with X=Trp
  • 17.
    Results from Experiment and Simulations. [Figure: 1H-15N and 1H-13Cα RDCs (Hz) per residue, X=Gly and X=Trp.] Something is wrong in these figures...
  • 18.
    Results from Experiment and Simulations. [Figure: 1H-15N and 1H-13Cα RDCs (Hz) per residue, X=Gly and X=Trp.] Let’s present the data correctly! Aromatics: large variability
  • 19.
    Results from Experiment and Simulations. [Figure: 1H-15N and 1H-13Cα RDCs (Hz) per residue, X=Gly and X=Trp.] Predicted pattern with X=Trp: large variability, some “good” snapshots. Open red squares: a “good” snapshot
  • 20.
    Time series and the agreement to experiment. [Figure: RMSD between predicted and experimental 1D HN and 1D CAHA RDCs (Hz), overlaid with the radius of gyration (nm), versus time (ns).] Trp: the variability can also be observed within a run. RMSD = Root Mean Square Deviation between experiment and simulations
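A minimal sketch of the agreement measure used on these slides, assuming the RMSD is taken over the per-residue RDCs of one frame or of one averaged ensemble. The function name and the example numbers are hypothetical; how the RDCs themselves are predicted from a structure is not covered here.

```python
import math


def rdc_rmsd(predicted, experimental):
    """Root-mean-square deviation (Hz) between two equally long RDC lists."""
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("need two non-empty lists of equal length")
    squared = [(p - e) ** 2 for p, e in zip(predicted, experimental)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical RDC values (Hz), invented for illustration only.
exp_rdc = [1.0, -2.0, 3.0, -4.0]
sim_rdc = [2.0, -2.0, 1.0, -4.0]
print(f"RMSD = {rdc_rmsd(sim_rdc, exp_rdc):.3f} Hz")
```

A lower value means that the simulated pattern is closer to the NMR pattern; the time series on the slide plots this number along the trajectory.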
  • 21.
    Time series and the agreement to experiment. [Figure: RMSD between predicted and experimental 1D HN and 1D CAHA RDCs (Hz), overlaid with the distance between N and C termini (nm), versus time (ns).] Wow! The distance between the termini follows the agreement to experiment quite well!
  • 22.
    Time series and the agreement to experiment. [Figure: two panels, RMSD between predicted and experimental 1D HN and 1D CAHA RDCs (Hz) overlaid with the distance between N and C termini (nm) and with the radius of gyration, Rg (nm), versus time (ns).] Hey! The same is true for the radius of gyration
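The two structural parameters compared with the RMSD here can be sketched as follows. This is an illustration under stated assumptions: coordinates are plain (x, y, z) tuples in nm, masses default to 1, and the termini are taken as the first and last point in the list. Which atoms the authors used for the termini is not stated on the slides.

```python
import math


def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration of a set of 3D points."""
    if masses is None:
        masses = [1.0] * len(coords)  # assumption: equal weights
    total = sum(masses)
    center = [sum(m * c[i] for m, c in zip(masses, coords)) / total
              for i in range(3)]
    mean_sq = sum(m * sum((c[i] - center[i]) ** 2 for i in range(3))
                  for m, c in zip(masses, coords)) / total
    return math.sqrt(mean_sq)


def end_to_end_distance(coords):
    """Distance between the first and the last point (the two termini)."""
    return math.dist(coords[0], coords[-1])


# Hypothetical, fully extended three-point chain (nm).
chain = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
rg = radius_of_gyration(chain)
n_c = end_to_end_distance(chain)
print(rg, n_c)
```

Both numbers shrink when the chain folds back on itself, which is why they can follow the RMSD to experiment in the time series.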
  • 23.
    X=Trp. [Figure: RMSD between predicted and experimental RDCs (Hz) versus time (ns), overlaid in turn with six structural parameters: N-C distance (nm), Psi 5, H-bonds, Cα 3-4-5-6, Rg (nm) and Clusters.] Many structural parameters behave similarly. But what is important: they are all consistent with each other!
  • 24.
    [Figure: the same six panels (N-C distance (nm), Psi 5, H-bonds, Cα 3-4-5-6, Rg (nm), Clusters, each overlaid with the RMSD between predicted and experimental 1D HN and 1D CAHA RDCs (Hz) versus time (ns)) for X=Tyr and X=Trp.]
  • 25.
    [Figure: the six-panel time series for X=Trp and X=Tyr, with the two H-bonds panels repeated separately.] We can extract much more information from the intramolecular hydrogen bonding
  • 26.
    Let’s play with the conformations of the peptide with X=Trp. One typical 100 ns simulation produces 5001 different conformations. Step 1: For each individual conformation, count the intramolecular H-Bonds. Step 2: Sort the conformations by the number of H-Bonds: 0, 1, 2... Step 3: For each conformation in each group so produced, calculate the RDCs, take the average, and draw the RDC pattern. You could also call this “Hydrogen bond clustering analysis” or so ...
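The three steps above can be sketched as a small grouping routine. It assumes that the per-frame H-bond count and the per-frame RDCs have already been computed by other tools; the function name, the `cap` parameter (mirroring a "5 or more" group) and the example numbers are hypothetical.

```python
from collections import defaultdict


def average_rdcs_by_hbond_count(frames, cap=5):
    """Group frames by H-bond count and average the RDCs within each group.

    frames: iterable of (n_hbonds, [rdc per residue]) tuples.
    Frames with `cap` or more H-bonds are pooled into one group.
    Returns {n_hbonds: [mean rdc per residue]}.
    """
    groups = defaultdict(list)
    for n_hbonds, rdcs in frames:              # steps 1 and 2: count and sort
        groups[min(n_hbonds, cap)].append(rdcs)
    averages = {}
    for count, members in sorted(groups.items()):  # step 3: average per group
        n_res = len(members[0])
        averages[count] = [sum(m[i] for m in members) / len(members)
                           for i in range(n_res)]
    return averages


# Hypothetical frames with two residues each (RDCs in Hz).
frames = [
    (0, [1.0, 1.0]), (0, [3.0, 1.0]),
    (3, [-4.0, 8.0]),
    (5, [-2.0, 6.0]), (7, [-6.0, 10.0]),
]
averages = average_rdcs_by_hbond_count(frames)
for count, pattern in averages.items():
    print(count, pattern)
```

Each averaged pattern can then be drawn next to the experimental one, one curve per H-bond group.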
  • 27.
    Structures sorted according to the number of intramolecular H-Bonds. [Figure: 1H-13Cα RDC (Hz) per residue, E1 to S9 with W5; legend: 0, 1, 2, 3, 4, 5 or more H-bonds, Exp. Shown: 0 H-Bonds.] Hum, no H-Bonds => boring, flat pattern
  • 28.
    Structures sorted according to the number of intramolecular H-Bonds. [Figure: same plot. Shown: 0 and 1 H-Bonds.]
  • 29.
    Structures sorted according to the number of intramolecular H-Bonds. [Figure: same plot. Shown: 0, 1 and 2 H-Bonds.] 2 H-Bonds => pattern a little bit contrasted
  • 30.
    Structures sorted according to the number of intramolecular H-Bonds. [Figure: same plot. Shown: 0 to 3 H-Bonds.] Hey, now we have it!
  • 31.
    Structures sorted according to the number of intramolecular H-Bonds. [Figure: same plot. Shown: 0 to 4 H-Bonds.]
  • 32.
    [Figure: same plot. Shown: 0 to 5 H-Bonds.] => - Strong relation between H-Bonds and the RDC pattern - More H-Bonds <=> pattern closer to the experimental one
  • 33.
    Which intramolecular hydrogen bonding? The problem: in this nine-residue peptide, there are 12 H-Bond donors and 24 H-Bond acceptors => many imaginable combinations. But next question: which H-Bonds???
  • 34.
    Which intramolecular hydrogen bonds are relevant? In other words: which H-bonds are involved in the experimental RDC pattern?
  • 35.
    From all possible combinations, only one is statistically significant. Fortunately, it is even highly significant! Figure -- | For X = Trp, the RMSD to experiment depends on the intramolecular hydrogen bonding typical of helices. Each point represents the average over one 100 ns long simulation (R = -0.92, p < 0.001). The peptide forms a short helix or a turn
  • 36.
    The solution contains a combination of short helices or turns, and extended structures. Figure -- | Representative structures of the two clusters which, if combined in proportions of 30% (A) and 66% (B) of the number of frames, produce the RDCs with the lowest RMSD to experiment. Carbon: silver, Oxygen: red, Nitrogen: blue, Hydrogen: black, secondary structure: purple (A: 3₁₀ helix, B: turn). Hydrogen bonds are highlighted. Diagrams produced using VMD (ref. 39).
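One way to look for such proportions is a simple scan over mixing fractions. The sketch below is not the authors' procedure: it assumes only two clusters whose fractions sum to one, averaged RDCs per cluster as input, and a grid of 1% steps. All names and numbers are hypothetical.

```python
import math


def rmsd(a, b):
    """Root-mean-square deviation between two equally long lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def best_mixture(rdc_a, rdc_b, experimental, steps=100):
    """Scan fractions f of cluster A (1 - f of cluster B); return (f, rmsd)."""
    best = None
    for k in range(steps + 1):
        f = k / steps
        mixed = [f * a + (1.0 - f) * b for a, b in zip(rdc_a, rdc_b)]
        score = rmsd(mixed, experimental)
        if best is None or score < best[1]:
            best = (f, score)
    return best


# Hypothetical averaged RDCs (Hz) for a folded and an extended cluster.
folded = [10.0, -10.0]
extended = [0.0, 0.0]
measured = [3.0, -3.0]
fraction, score = best_mixture(folded, extended, measured)
print(fraction, score)
```

Because RDCs are population-weighted averages, a contrasted experimental pattern can arise from a minority of folded structures mixed with a majority of extended ones.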
  • 37.
    How might an aromatic side-chain induce folding propensities? The idea: Because its side-chain is bulky, Trp would limit the access of water molecules to neighboring carbonyl and amide groups. Consequently, these backbone atoms would interact more with each other, leading to an increased folding propensity. How can we prove that this is a driving force? If this is true, one should be able to observe, when X = Trp, a reduced hydration of these particular atoms, even in conformations for which the peptide is extended.
  • 38.
    How might an aromatic side-chain induce folding propensities? Two unfolded, rather extended conformations. Water: only molecules closer than 3 Å from the A6 amide are represented. Left: X = Trp. Right: X = Gly
  • 39.
    Figure -- | Number of water molecules in the first hydration shell of carbonyl and amide groups. We report averages and SE over 7 (Trp) or 9 (Gly) simulations of about 100 ns each. Localized lack of hydration => nucleation of folding
  • 40.
  • 41.
    Dipolar Coupling. Some simplifications: - Heteronuclear pairs - ≃ Constant internuclear distance - Time averaging. Dipolar coupling: [equation on slide]. Interaction energy: [equation on slide]. [Diagram: N-H dipole.]
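The equations of this slide did not survive in the transcript. As a reminder, and not as a reconstruction of the slide, the textbook proportionality for the dipolar coupling of an N-H pair under the three simplifications listed is given below. The symbols are assumptions: γ are the gyromagnetic ratios, r the internuclear distance, θ the angle between the N-H vector and the magnetic field, and the brackets denote time averaging. The numerical prefactor and sign depend on the convention used and are left out.

```latex
\[
  D_{NH} \;\propto\; \frac{\gamma_N \, \gamma_H}{r_{NH}^{3}}
  \left\langle \frac{3\cos^{2}\theta - 1}{2} \right\rangle
\]
```

In isotropic solution the average vanishes; a weak alignment of the molecules leaves a small, residual coupling that reports on the orientation of each N-H bond.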
  • 42.
    Which intramolecular hydrogen bonds are relevant? In other words: which H-bonds are involved in the experimental RDC pattern? Hydrogen bonding between the side-chain oxygen of a Serine and a backbone carbonyl?
  • 43.
    Which intramolecular hydrogen bonds are relevant? In other words: which H-bonds are involved in the experimental RDC pattern? The hydrogen bond typical of a 3₁₀ helix between backbone atoms of residues 5 and 8 + a hydrogen bond between the amine group of residue 5 and the side-chain alcohol of Serine 8?
  • 44.
    Which intramolecular hydrogen bonds are relevant? In other words: which H-bonds are involved in the experimental RDC pattern? A well balanced hydrogen bonding, with some H-Bonds all along the line??
  • 45.
    Work hypothesis. 1H-15N RDCs calculated from the pdb coordinates of two peptides with an in-house algorithm (ref. below). Molecular Dynamics (MD). Huang J-r, Grzesiek S (2010) Ensemble Calculations of Unstructured Proteins Constrained by RDC and PRE Data: A Case Study of Urea-Denatured Ubiquitin. Journal of the American Chemical Society 132: 694-705.
  • 46.
    Chemical shifts: comparison between experimental and predicted values. RMSD = 1.79 ppm, r = 0.99
  • 47.
    Chemical shifts: comparison between experimental and predicted values
  • 48.
    Chemical shifts: comparison between experimental and predicted values
  • 49.
    Chemical shifts: comparison between experimental and predicted values. cs of EGAAWAASS. [Figure: Nitrogen (105 to 130) and Hydrogen (7.0 to 9.0) chemical shifts per residue; axes labelled “Chemical shift (Hz)”.] N => r: 0.99, rmsd: 2.10 Hz. H => r: 0.89, rmsd: 0.20 Hz. Cα => r: 0.99, rmsd: 1.27 Hz
  • 50.
    Chemical shifts: comparison between experimental and predicted values. cs of EGAAIAASS. [Figure: Nitrogen (105 to 130) and Hydrogen (7.0 to 9.0) chemical shifts per residue; axes labelled “Chemical shift (Hz)”.] N => r: 0.97, rmsd: 2.69 Hz. H => r: 0.61, rmsd: 0.23 Hz. Cα => r: 0.99, rmsd: 0.92 Hz. Experimental values of Cα A6 not available
  • 51.
    Chemical shifts: comparison between experimental and predicted values. What these data show, and what they don’t show. Understand the method: Chemical shifts were calculated using SPARTA, a database system which predicts backbone chemical shifts of proteins using their structural coordinates as input. To estimate the chemical shifts of a peptide, the program searches within its database for successive triplets with a sequence identical to that of the investigated peptide. It further selects structures having a similar set of Φ, Ψ and Χ1 angles as the investigated peptide, and then attributes the empirical values of the selected backbone atoms, additionally weighted according to the degree of similarity as well as some other structural information, to the triplet of the investigated peptide. Shen Y, Bax A (2007) Protein backbone chemical shifts predicted from searching a database for torsion angle and sequence homology. Journal of Biomolecular NMR 38: 289-302.
  • 52.
    RDCs: what the time series look like. Peptide with X = Trp. [Figure: 1H-15N RDC (Hz) per residue (G2 to S9, X5), and two time-series panels with axes from -50 to 50 and from 0 to 100.]
  • 53.
    RDCs: what the time series look like. Peptide with X = Trp. [Figure: 1H-15N RDC (Hz) per residue (G2 to S9, X5), and two time-series panels with axes from -50 to 50 and from 0 to 100.] As a consequence, we HAVE to work with averages
  • 54.
    For comparison: time series of chemical shifts. Peptide with X = Trp. [Figure: 1HN chemical shifts (ppm) per residue (G2 to S8), and a time-series panel with axes from 100 to 130 and from 0 to 100.]
  • 55.
    Method to search for relations between RDC and structure. Consider the large fluctuations from simulation to simulation as a potential source of information → draw or compute relations between replicated simulations and structural parameters → confirm the obtained results through clustering and time-series analysis
  • 56.
    Relation to the radius of gyration. [Figure: RMSD between pred. and exp. RDCs (Hz) versus radius of gyration (Å), 6.0 to 7.5.]
  • 57.
    Relation to the radius of gyration. Relation to the number of intramolecular hydrogen bonds. [Figure: RMSD between pred. and exp. RDCs (Hz) versus radius of gyration (Å), 6.0 to 7.5, and versus hydrogen bonds within the peptide, 0 to 3.]
  • 58.
    RDCs: comparison between experimental and predicted values. Peptide with X = Trp. Thus, the structures (when X = Trp) which best fit the experiment have: - smaller distance between r1 and r9 - smaller radius of gyration - higher number of intramolecular hydrogen bonds => Sort (cluster) the frames as a function of the number of intramolecular hydrogen bonds
  • 59.
    RDCs: comparison between experimental and predicted values. Peptide with X = Trp
  • 60.
    RDCs: comparison between experimental and predicted values. Peptide with X = Trp
  • 61.
    RDCs: comparison between experimental and predicted values. Peptide with X = Trp
  • 62.
    RDCs: comparison between experimental and predicted values. Peptide with X = Trp
  • 63.
    RDCs: comparison between experimental and predicted values. Peptide with X = Trp
  • 64.
    RDCs: comparison between experimental and predicted values. Peptide with X = Trp
  • 65.
    RDCs: comparison between experimental and predicted values. Peptide with X = Trp. We thus observe a relation: the RDC pattern is more contrasted for the structures with many intramolecular hydrogen bonds. Curiosity: can we observe anything if we sort the structures according to the number of hydrogen bonds for a peptide with a flat RDC pattern?
  • 66.
    RDCs: comparison between experimental and predicted values. Peptide with X = Ile. [Figure: 1H-13Cα RDC (Hz) per residue: E1, G21, G22, A3, A4, I5, A6, A7, S8, S9.]
  • 67.
    RDCs: comparison between experimental and predicted values. Peptide with X = Ile. [Figure: 1H-13Cα RDC (Hz) per residue (E1 to S9, X5), with exp. values for X = Ile and X = Trp.] We observe the same relation for both peptides! We could thus “design” whichever RDC pattern we wish through the selection of frames with the adequate number of intramolecular hydrogen bonds... Next question: what actually differentiates these peptides?
  • 68.
    The peptides differ in the probability distribution of the intramolecular hydrogen bonds. [Figure: two histograms, X = Ile and X = Trp.] X = Ile: unimodal distribution at around 1.0 - 1.5. X = Trp: multimodal distribution
  • 69.
    Energy difference between the hydrogen bond distributions. Boltzmann estimation: 1.- MD produces about 45% of structures with 3-4 H-bonds 2.- Wish about 100%, because these are the structures that best reproduce the experimental results. ΔE = kBT∙ln(100/45) ≈ 2 kJ∙mol⁻¹. Accuracy of current force fields: 2-4 kJ∙mol⁻¹ (*). (*): Shirts MR, Pande VS (2005) Solvation free energies of amino acid side chain analogs for common molecular mechanics water models. The Journal of Chemical Physics 122: 134508. [Figure: probability distribution of the number of intramolecular hydrogen bonds, X = Trp.]
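The arithmetic behind this estimate can be checked in a few lines. The temperature is not given on the slide, so 300 K is an assumption here, and the molar gas constant is used so that the result comes out per mole. The function name is hypothetical.

```python
import math

GAS_CONSTANT = 8.314462618e-3  # kJ mol^-1 K^-1


def population_energy_gap(target_fraction, observed_fraction,
                          temperature=300.0):
    """Energy (kJ/mol) that would shift a population from the observed
    fraction to the target fraction, from the ratio of Boltzmann weights."""
    return GAS_CONSTANT * temperature * math.log(
        target_fraction / observed_fraction)


# 45% of the MD structures have 3-4 H-bonds; about 100% would be wished.
delta_e = population_energy_gap(1.00, 0.45)  # assumed T = 300 K
print(f"Delta E = {delta_e:.1f} kJ/mol")
```

The result is about 2 kJ/mol, at the lower end of the 2-4 kJ/mol accuracy quoted for current force fields, so the missing population is within the expected force-field error.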
  • 70.
    One step further: identification of the relevant hydrogen bonds. Method: Having observed that the RDC profile of the different peptides is related, at least partly, to the probability distribution of the number of intramolecular hydrogen bonds, we focused on identifying, for each peptide, which particular hydrogen bond drives (or prevents) the structural propensities toward the experimentally observed RDC pattern. Although the total observed number of intramolecular hydrogen bonds was generally small, the peptides contain between 10 (X = Pro) and 12 (X = Trp) donors and between 23 and 24 acceptors. For the identification of the structurally relevant hydrogen bonds, we counted (using the g_hbond subprogram of GROMACS) for each frame the number of hydrogen bonds between each residue pair. We created tables of 45 columns (number of possible residue pairs for a peptide with 9 residues), with one row per replicate simulation (9 simulations with X = Trp, for example, representing 864 ns). The pairs forming less than 2.5% of the total number of hydrogen bonds were discarded from further analysis. In the peptide with X = Trp, for example, only 11 residue pairs sufficed to account for more than 80% of all the observed hydrogen bonds. For each of these residue pairs, we calculated: - the correlation coefficient between the number of hydrogen bonds and the RMSD between experimental and predicted RDCs, - the correlation coefficient between the number of hydrogen bonds and the correlation coefficient between experimental and predicted RDCs. We performed linear regression between each of these sets of values. The selected relevant residue pairs were those for which both parameters were found statistically significant and both correlation coefficients (in absolute value) were higher than 0.6 (arbitrarily chosen cut-off). The same approach was used for the identification of the relevant hydrogen bonds at the atomic level.
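The selection procedure described above can be sketched as follows. This is a simplified illustration, not the authors' script: the per-run H-bond counts are assumed to be available already (for example from GROMACS output), the significance test is left out, and all function names and numbers are hypothetical. Note that 45 columns for 9 residues corresponds to unordered pairs with same-residue pairs included, which is an interpretation of the slide.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    denom = math.sqrt(sxx * syy)
    return sxy / denom if denom > 0 else 0.0


def residue_pairs(n_residues=9):
    """Unordered residue pairs, same-residue pairs included."""
    return [(i, j) for i in range(1, n_residues + 1)
            for j in range(i, n_residues + 1)]


def select_pairs(hbonds, rmsd_per_run, corr_per_run,
                 min_share=0.025, cutoff=0.6):
    """hbonds: {pair: [mean H-bond count in each run]}.
    Keeps pairs above the 2.5% share whose counts correlate (|r| > cutoff)
    with both the RMSD and the correlation to experiment."""
    total = sum(sum(counts) for counts in hbonds.values())
    selected = []
    for pair, counts in hbonds.items():
        if sum(counts) < min_share * total:
            continue  # rare pair, discarded
        r_rmsd = pearson(counts, rmsd_per_run)
        r_corr = pearson(counts, corr_per_run)
        if abs(r_rmsd) > cutoff and abs(r_corr) > cutoff:
            selected.append((pair, r_rmsd, r_corr))
    return selected


# Hypothetical data for four runs.
hbonds = {
    (3, 6): [0.1, 0.2, 0.3, 0.4],
    (1, 2): [0.3, 0.1, 0.35, 0.15],
    (8, 9): [0.001, 0.0, 0.002, 0.0],
}
rmsd_runs = [8.0, 7.0, 6.5, 5.0]   # Hz
corr_runs = [0.2, 0.4, 0.5, 0.8]
selected = select_pairs(hbonds, rmsd_runs, corr_runs)
print(len(residue_pairs()), [pair for pair, _, _ in selected])
```

With so few runs per peptide, a p-value check on each regression, as described in the method, is needed before trusting any selected pair.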
  • 71.
    One step further: identification of the relevant hydrogen bonds. All are typical of 3₁₀-helix and α-helix: suggests a dynamical behavior, with conformational exchange between 3₁₀-helix and α-helix
  • 72.
    One step further: identification of the relevant hydrogen bonds. Peptide with X = Trp. Sum of the following hydrogen bonds (*): CO of A3 with NH of A6, CO of A4 with NH of A7, CO of A4 with NH of S8. [Figure: RMSD between pred. and exp. RDCs (Hz) versus the summed hydrogen bonds (0.0 to 0.5), with linear fit y = -7.513x + 8.0591, R² = 0.82132, p = 0.001. Each point: average over one 100 ns simulation.] (*): cut-off value reduced to 3.0 Å to be more selective
  • 73.
    We can never have too many proofs. Do time-series analyses confirm all of this? Do they allow us to go into more detail?
  • 74.
    We can never have too many proofs. Do time-series analyses confirm all of this? Do they allow us to go into more detail? [Figure: Psi of residues 4 (dashed line) and 5 (continuous line), overlaid with the RMSD between predicted and experimental 1D HN and 1D CAHA RDCs (Hz), versus time (ns).]
  • 75.
    We can never have too many proofs. Do time-series analyses confirm all of this? Do they allow us to go into more detail? [Figure: Psi of residue 6 versus time (ns), X = Trp and X = Gly.] X = Trp: there is a tendency to maintain the psi value of Ala6 at about -30 (and phi between -50 and -70, not shown), which is typical of turns or helices. This is not observed for X = Gly.
  • 76.
    How might the side-chain of Trp increase the folding propensities of EGAAXAASS?
  • 77.
    How might the side-chain of Trp increase the folding propensities of EGAAXAASS? Hypothesis: The bulky side-chain of Trp would limit the access of water molecules to the carbonyl and amide of the neighboring residues. Consequently, these backbone atoms would interact more with each other, leading to an increased folding propensity. If this is true, one should be able to observe, when X = Trp, a reduced interaction between these particular atoms and water molecules even in conformations for which the peptide is extended.
  • 78.
    How might the side-chain of Trp increase the folding propensities of EGAAXAASS? Hypothesis: The bulky side-chain of Trp would limit the access of water molecules to the carbonyl and amide of the neighboring residues. Consequently, these backbone atoms would interact more with each other, leading to an increased folding propensity. If this is true, one should be able to observe, when X = Trp, a reduced interaction between these particular atoms and water molecules even in conformations for which the peptide is extended. Two extended, unfolded conformations. Look at the hydration of the A6 amide proton. Left: X = Gly. Right: X = Trp. VMD => 1 structure with Gly, extended, with 1 or 2 H2O around CO of A3 or A4, because there is space around H of Gly. VMD => 1 structure with Trp, extended, without H2O around CO of A3 or A4, because there is no space between them and the side-chain.
  • 79.
    How might the side-chain of Trp increase the folding propensities of EGAAXAASS? Hypothesis: The bulky side-chain of Trp would limit the access of water molecules to the carbonyl and amide of the neighboring residues. Consequently, these backbone atoms would interact more with each other, leading to an increased folding propensity. If this is true, one should be able to observe, when X = Trp, a reduced interaction between these particular atoms and water molecules even in conformations for which the peptide is extended. How could we verify this hypothesis? We compared the number of water molecules in the first solvation shell of CO and NH groups for clusters of structures where X = Trp which were “folded” (in the figure as Trp fold), or extended (Trp ext), and of structures where X = Gly, which are mainly extended.
  • 80.
    How might the side-chain of Trp increase the folding propensities of EGAAXAASS? Compare the number of water molecules in the first solvation shell of CO and NH groups for clusters of structures where X = Trp which were “folded” (Trp fold), or extended (Trp ext), and of structures where X = Gly, which are mainly extended. [Figure: two bar charts, n(r) at 0.33 nm and n(r) at 0.28 nm, for Trp fold., Trp ext. and Gly; panels: Carbonyl of Ala 3, Amide of Ala 6.] 1.- Less H2O interacting with the CO of A3 as well as with the NH of A6 when X = Trp 2.- Tendency remains true even when the Trp containing peptide is extended.
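Counting waters in a first solvation shell can be sketched with a plain distance cutoff. The sketch assumes coordinates in nm for one frame, uses the water oxygen as the reference atom, and ignores periodic boundary conditions; none of this is stated on the slides. The 0.33 nm cutoff is taken from the figure labels; which cutoff belongs to which group is read from the panel order and is an assumption. Function names and coordinates are hypothetical.

```python
import math


def waters_in_first_shell(site, water_oxygens, cutoff_nm):
    """Number of water oxygens within cutoff_nm of a backbone atom."""
    return sum(1 for w in water_oxygens if math.dist(site, w) <= cutoff_nm)


def mean_and_se(values):
    """Mean and standard error over replicate simulations (needs n >= 2)."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(variance / n)


# Hypothetical frame: one carbonyl oxygen and four water oxygens (nm).
carbonyl_o = (0.0, 0.0, 0.0)
waters = [(0.30, 0.0, 0.0), (0.0, 0.35, 0.0), (0.2, 0.2, 0.0), (1.0, 1.0, 1.0)]
n_shell = waters_in_first_shell(carbonyl_o, waters, cutoff_nm=0.33)

# Hypothetical per-simulation averages of the shell occupancy.
mean, se = mean_and_se([1.0, 2.0, 3.0])
print(n_shell, mean, se)
```

Averaging the per-frame counts within each simulation, and then taking mean and SE over simulations, gives bars of the kind shown for Trp fold., Trp ext. and Gly.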
  • 81.
    Figure -- | Number of water molecules in the first hydration shell of carbonyl and amide groups. We report averages and SE over 7 (Trp) or 9 (Gly) simulations of about 100 ns each. The stars refer to the results of Tukey’s Honestly Significant Difference test, putting emphasis on the selected extended peptides with X=Trp compared to the peptides with X=Gly.
  • 82.
  • 83.
  • 84.
    Bioinformatics approach. “The proteins listed in Table 1 have a distinctive amino acid composition (Table 2), as has also been described for ‘intrinsically disordered’ protein regions [9,43]. In short, they are significantly enriched in P, E, K, S and Q, and depleted in W, Y, F, C, I, L and N, compared with the average folded protein in the PDB. Dunker and colleagues term the first group disorder-promoting amino acids, and the second group order-promoting amino acids [9,43]”.