INTRODUCTION
Pseudomonas aeruginosa is an opportunistic, Gram-negative bacterium that frequently
causes life-threatening infections in burn, cancer, cystic fibrosis, and immunocompromised
patients. It is especially dangerous because of its high innate resistance and its high frequency
of acquired antimicrobial resistance, so there is considerable interest in developing a vaccine
against this pathogen. [1] Infection begins with bacterial adherence to host mucosal cells via type
IV pili on the surface of the bacteria. These pili are composed of repeating units of a structural
protein known as pilin, which contains a functional receptor binding site near the C-terminus. [1]
The high abundance of pilin on the bacterial surface makes the protein an attractive vaccine
target, as it can be more easily recognized by the immune system. Furthermore, because pilin is
crucial to the initiation of infection, patient antibodies that block the binding site could prevent
the infection.
Given the pressing need for a P. aeruginosa vaccine, the purpose of the following
experiments was to design two potential protein antigens that could be used in such a vaccine. The
first experiment used the homology modeling software Modeller [2] to create a homology
model of type IV fimbrial precursor pilin based on the known structure of truncated PAK
pilin. These proteins were chosen because both play important roles in bacterial adhesion and
because they are predicted to be homologous. Moreover, the structure of the former is currently
unknown, which necessitates the creation of a homology model. The second experiment used
Chimera and Foldit [3, 4] to redesign the truncated PAK pilin protein to be more suitable
for use in a vaccine. This meant mutating residues to improve stability around the binding
region and/or eliminating regions of the protein outright to reduce its size and thus its
production cost.
Five redesigns in total were made and run through the in lucem molecular mechanics simulation,
ilmm, [5] using the XSEDE Stampede supercomputer [6] to predict how our redesigns would
behave under simulated in vivo conditions. Data from these simulations were then used to create a
final redesign, which was also run through the ilmm simulation to assess its stability.
METHODS
Homology Model for Type IV Fimbrial Precursor Pilin
The FASTA sequence for type IV fimbrial precursor pilin was obtained from the NCBI
database [7], and the structure of truncated PAK pilin was obtained as a PDB file from the RCSB
Protein Data Bank (PDB code 1DZO). [8] These were used as inputs for Modeller to generate a
homology model of the Type IV pilin. Modeller aligns the sequence of the unknown protein with
that of the known structure and then predicts a structure for the unknown protein under the
assumption that sequence alignment correlates with structure alignment. [2] We then compared the
structures of the two pilin proteins using Chimera’s MatchMaker function. [3]
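To illustrate the sequence comparison step, the following minimal sketch performs a simple global (Needleman-Wunsch) alignment of two sequences and computes percent identity over the aligned, ungapped positions. It is not Modeller's own alignment routine. The scoring values (match, mismatch, gap) and the function names are illustrative assumptions, and a reported identity depends on both the alignment and the denominator chosen.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align strings a and b; returns the two gapped strings.

    Scoring values are illustrative assumptions, not Modeller defaults.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + pair,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        pair = match if (i > 0 and j > 0 and a[i - 1] == b[j - 1]) else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a, aligned_b):
    """Percent of identical residues among positions where neither string has a gap."""
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)
```

Calling `percent_identity(*needleman_wunsch(seq_a, seq_b))` on two FASTA sequences gives an identity under these assumed scores; a substitution-matrix-based alignment would generally give a different value.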
Redesign of Truncated PAK Pilin
Five redesigns of the truncated PAK pilin (PDB code 1DZO) were made using Chimera
and Foldit and saved as PDB files. [3, 4] In each redesign we made no changes to the binding
region of the protein, which was identified in a previous study. [9] In the first redesign we used
Foldit’s automated mutation and conformation algorithms to create a more stable protein, and
called this design Automated Foldit Design (AFD). In the second redesign, we redesigned the
protein manually, guided by our own intuition and by Foldit’s algorithms. The resulting structure
was labeled Manual Foldit Design (MFD). In the third redesign, we again relied on intuition to
mutate specific residues and change their conformations, but used Chimera instead of Foldit.
This design was labeled Intuitive
Design (ID). For the fourth and fifth designs, we first eliminated all residues in the PAK pilin
protein that we judged unnecessary for preserving the structure of the binding region, which was
identified in previous studies [10, 11], to create a PAK pilin fragment. In the first of the fragment
designs the residues in this fragment were left unchanged, and the design was labeled Fragment
Wild Type (FWT). In the second, we used Foldit to mutate residues in the fragment as we saw fit,
together with the software’s stabilization algorithms, and labeled this design Fragment Manual
Design (FMD).
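The constraint that redesigns leave the binding region untouched can be checked directly on the sequences. The sketch below is a minimal illustration using the two fragment sequences from Figure 6. It assumes that the binding region (residues 104-120 in the Figure 4 legend) corresponds to the C-terminal 17 residues of each sequence; the function names are hypothetical.

```python
# Fragment sequences as listed in Figure 6.
FWT = "KIITLTRTAADGLWKCTSDQDEQFIPKGCSR"  # Fragment Wild Type
FMD = "KITTDTRTAADGLKKCTSDQDEQFIPKGCSR"  # Fragment Manual Design

# Assumption: the binding region is the C-terminal 17 residues,
# i.e. residues 104-120 of the truncated PAK pilin (Figure 4 legend).
BINDING_REGION_LENGTH = 17


def binding_region(sequence, length=BINDING_REGION_LENGTH):
    """Return the C-terminal stretch assumed to be the binding region."""
    return sequence[-length:]


def binding_region_preserved(reference, design, length=BINDING_REGION_LENGTH):
    """True if the design has exactly the reference binding-region sequence."""
    return binding_region(reference, length) == binding_region(design, length)


def changes_outside_binding_region(reference, design, length=BINDING_REGION_LENGTH):
    """Count substituted positions outside the binding region (equal lengths only)."""
    if len(reference) != len(design):
        raise ValueError("sequences must have equal length for a position-wise comparison")
    return sum(r != d for r, d in zip(reference[:-length], design[:-length]))
```

The same check applies to the full-length designs, since each is the same length as the wild type sequence in Figure 6.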
Molecular Simulations and Final Redesign
With the five protein redesigns and the wild type structure saved as PDB files, we ran each
of them through the ilmm molecular dynamics software on the XSEDE Stampede
supercomputer. [5, 6] The ilmm software places the protein in a simulated 10Å solvation cube and
uses molecular dynamics algorithms to replicate protein behavior in solution. We used a simulation
length of 5ns over 5000 steps and, to mimic conditions in the human body, a temperature of 37°C
(310K). The simulations output several kinds of data for analysis: Define Secondary
Structure of Proteins (DSSP) plots; C-α root mean square deviation (RMSD) data, i.e. deviation
with respect to a minimum energy structure; C-α root mean square fluctuation (RMSF) data, i.e.
deviation with respect to the starting structure; and animations of protein dynamics. DSSP
plots assign secondary structure identities to residues in the sequence based on hydrogen bonding
in the protein backbone. We then made a final redesign (RD) of the pilin protein with Foldit, using
the simulation output as a guide, and ran the ilmm simulation on this redesign to assess the
success of our changes. Ultimately, RD was designed by making further mutations in
AFD, which was identified as the most stable of the initial redesigns.
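As a rough illustration of the two stability measures, the sketch below computes C-α RMSD (one value per frame, relative to a reference structure) and C-α RMSF (one value per residue, over all frames). It is not the ilmm implementation. It assumes that every frame has already been superimposed on the reference and that coordinates are plain lists of (x, y, z) tuples in Å. By convention RMSF is usually taken about the time-averaged position, which is the default here; passing the first frame as `reference` instead follows the starting-structure wording used above.

```python
import math


def ca_rmsd(frame, reference):
    """RMSD (in the input units) between one frame and a reference structure.

    Both arguments are equal-length lists of (x, y, z) C-alpha coordinates,
    assumed to be superimposed already.
    """
    if not frame or len(frame) != len(reference):
        raise ValueError("frame and reference must be non-empty and equal in length")
    total = 0.0
    for (x, y, z), (rx, ry, rz) in zip(frame, reference):
        total += (x - rx) ** 2 + (y - ry) ** 2 + (z - rz) ** 2
    return math.sqrt(total / len(frame))


def ca_rmsf(trajectory, reference=None):
    """Per-residue RMSF over a trajectory (a list of frames).

    If reference is None, fluctuations are measured about the mean position
    of each atom; otherwise about the supplied reference coordinates.
    """
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    if reference is None:
        reference = [
            tuple(sum(frame[i][k] for frame in trajectory) / n_frames for k in range(3))
            for i in range(n_atoms)
        ]
    fluctuations = []
    for i in range(n_atoms):
        total = 0.0
        for frame in trajectory:
            total += sum((frame[i][k] - reference[i][k]) ** 2 for k in range(3))
        fluctuations.append(math.sqrt(total / n_frames))
    return fluctuations
```

Plotting `ca_rmsd` against time gives a curve of the kind shown in Figure 3, and plotting `ca_rmsf` against residue number gives a profile of the kind shown in Figure 4.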
RESULTS
Homology Model for Type IV Fimbrial Precursor Pilin
Chimera MatchMaker alignment (Figures 1bc) of the homology model and the reference
PAK pilin revealed that the two structures had an overall C-α RMSD of 0.581Å despite having
a sequence identity of only 56.52%. While the reference structure had a slightly longer main alpha
helix and, in certain areas, longer beta strands, the C-α RMSDs for most residues in non-loop
regions still fell within 0.2Å (Figure 1b). However, the small alpha helix near the C-terminus of
the reference was absent in the homology model. In addition, the greatest C-α RMSDs, i.e. those
above 0.4Å, between the two structures generally occurred in loop regions where the sequences
did not align (Figures 1bc). Finally, residues that play major roles in pilin binding [11] in the
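truncated PAK pilin protein were not completely conserved in the analogous region of the
homology model, which carries the substitutions Q136M and I138T (written as PAK residue,
position, homology model residue; compare the sequences in Figure 6).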
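The differences in the binding region can be listed directly from the sequences in Figure 6. The sketch below compares the last 13 residues of the wild type PAK pilin sequence with the last 13 residues of the Type IV fimbrial pilin precursor sequence. It assumes that these C-terminal tails align without gaps and that the first tail residue is position 132, an offset chosen so that the glutamine discussed above falls at position 136. The function name is hypothetical, and the resulting list depends on both assumptions.

```python
# C-terminal 13 residues of each sequence, copied from Figure 6.
PAK_TAIL = "DQDEQFIPKGCSR"    # wild type PAK pilin
MODEL_TAIL = "TQDPMFTPKGCDN"  # Type IV fimbrial pilin precursor

# Assumption: numbering offset chosen so that Q in PAK_TAIL is residue 136.
FIRST_POSITION = 132


def list_substitutions(reference, other, first_position):
    """Return substitutions as '<reference residue><position><other residue>'.

    Assumes an ungapped, position-by-position alignment of equal-length strings.
    """
    if len(reference) != len(other):
        raise ValueError("ungapped comparison needs equal-length sequences")
    return [
        f"{ref}{first_position + offset}{alt}"
        for offset, (ref, alt) in enumerate(zip(reference, other))
        if ref != alt
    ]


substitutions = list_substitutions(PAK_TAIL, MODEL_TAIL, FIRST_POSITION)
```

A gapped alignment of the full sequences could shift these positions, so the list is best read as a quick consistency check rather than a definitive mutation table.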
Redesign of Truncated PAK Pilin: Simulation Results
The simulation C-α RMSD values for the wild type protein and the initial untruncated designs
(AFD, ID, and MFD) reached an equilibrium near the end of each simulation (Figure 3), and all
equilibrated to about the same value of 2Å. This was not the case for either of the fragment designs
or the final redesign (FWT, FMD, and RD); in fact, the RMSD of the mutated fragment (FMD)
was still increasing at the end of the simulation. Moreover, of all the untruncated designs, the final
redesign generally had the highest RMSD across all time points. In terms of residue fluctuations,
protein animations (not included) and C-α RMSF plots (Figures 4ab) indicated that loop regions
in all designs corresponded to the areas with the greatest fluctuations. This was especially apparent
at the loops near the N-termini of the designs. Moreover, the fragment designs had RMSF values
that were generally larger than those of the untruncated proteins (Figure 4b).
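One simple way to make "reached an equilibrium" more quantitative is to fit a straight line to the last part of an RMSD time series and inspect its slope. The sketch below is only an illustration of that idea, not the criterion used in this study. The 20% tail fraction and the function name are assumptions, and any slope threshold for calling a trace equilibrated would have to be chosen by the analyst.

```python
def tail_slope(times, values, fraction=0.2):
    """Least-squares slope of values vs. times over the final part of a series.

    fraction is the share of points (at least 2) taken from the end of the
    series; 0.2 is an illustrative choice, not a value from the study.
    """
    if len(times) != len(values) or len(values) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    k = max(2, int(len(values) * fraction))
    t = times[-k:]
    v = values[-k:]
    mean_t = sum(t) / k
    mean_v = sum(v) / k
    numerator = sum((ti - mean_t) * (vi - mean_v) for ti, vi in zip(t, v))
    denominator = sum((ti - mean_t) ** 2 for ti in t)
    if denominator == 0:
        raise ValueError("time points in the tail must not all be identical")
    return numerator / denominator
```

A slope close to zero over the tail is consistent with a plateau such as the one described for the untruncated designs, while a clearly positive slope corresponds to an RMSD that is still rising, as described for FMD.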
DSSP plots (Figures 5a-c, g) revealed that the untruncated designs and wild type protein
had relatively consistent secondary structures, with no significant differences between the designs
and the wild type. The only noticeable loss was that of the beta bridge in the loop region around
residue 68 in the Manual Foldit Design and wild type protein. In addition, the Intuitive
Design adopted an additional alpha helix in the region around residue 55. We tried to replicate this
helix in the final redesign by matching the sequence in the corresponding region but were
unsuccessful. By contrast, neither of the fragment designs could maintain its secondary
structure (Figures 5de). The mutated fragment lost its beta sheets and alpha helix
completely, while the wild type fragment lost only the helix.
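For readers unfamiliar with how DSSP decides that a backbone hydrogen bond exists, the sketch below evaluates the electrostatic energy criterion of the Kabsch and Sander method, on which DSSP is based: a C=O group and an N-H group are treated as hydrogen bonded when the energy is below -0.5 kcal/mol. The function names are hypothetical, and the distances in the usage note are made-up illustrative values, not measurements from our structures.

```python
# Constants of the Kabsch-Sander electrostatic hydrogen-bond model.
Q1 = 0.42          # partial charge on C and O (units of e)
Q2 = 0.20          # partial charge on N and H (units of e)
F = 332.0          # dimensional factor, kcal * Angstrom / (mol * e^2)
HBOND_CUTOFF = -0.5  # kcal/mol; more negative energies count as H-bonds


def hbond_energy(r_on, r_ch, r_oh, r_cn):
    """Electrostatic energy (kcal/mol) between a C=O group and an N-H group.

    Arguments are interatomic distances in Angstroms:
    O...N, C...H, O...H and C...N.
    """
    return Q1 * Q2 * F * (1.0 / r_on + 1.0 / r_ch - 1.0 / r_oh - 1.0 / r_cn)


def is_hbond(r_on, r_ch, r_oh, r_cn):
    """True if the pair satisfies the energy cutoff for a hydrogen bond."""
    return hbond_energy(r_on, r_ch, r_oh, r_cn) < HBOND_CUTOFF
```

For example, `is_hbond(3.0, 3.2, 2.0, 4.2)` evaluates a geometry with a short O...H distance. Secondary structure elements are then assigned from repeating patterns of such bonds, which is why loss of backbone hydrogen bonding during a simulation appears as loss of helices or sheets in a DSSP plot.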
DISCUSSION
Homology Model for Type IV Fimbrial Precursor Pilin
The data suggest that the homology model successfully recreated the major secondary
structures of the PAK pilin reference. The beta sheets and alpha helices of the reference structure
were replicated in the homology model to within 0.4Å C-α RMSD, with only minor differences in
the length of these elements, i.e. a couple of residues (Figures 1a-c). Although a small helix
near the C-terminus of the reference was not present in the homology model, the C-α RMSD for
the corresponding region of the homology model fell within 0.2Å, indicating
similarity between the two structures (Figure 1b). This region in the
homology model may in fact be a helix that went unidentified due to some quirk in Modeller. This
is supported by the conformation of the backbone in this region of the homology model, which
resembles a single-turn helix. The similarity in backbone structure between
the two proteins in this region indicates that the patient’s adaptive immune system may be able to
recognize both
using the same binding ligands. However, residues identified as host cell binding participants in
PAK pilin [9] were not fully conserved in the corresponding region of the Type IV pilin homology
model (Figures 1ac), suggesting that Type IV pilin cannot completely substitute for PAK pilin. It
should be noted that the homology model is only a prediction of the protein structure, and future
X-ray crystallography experiments are needed to determine whether this study succeeded in
establishing the structure of Type IV pilin.
Redesign of Truncated PAK Pilin and Effects on Stability
Trends in C-α RMSD suggest that while the initial untruncated designs (AFD, ID, and MFD)
could maintain a stable structure, the fragment designs could not. The untruncated designs
were all able to reach an equilibrium within 5ns (Figure 3). By contrast,
the RMSDs of both fragment designs and the final redesign (FMD, FWT, and RD) did not
equilibrate and were in general larger, signifying that these structures were much less stable than
the initial untruncated designs. The ability, or lack thereof, of the designs to maintain their
secondary structures according to DSSP plots (Figures 5a-g) further corroborates this.
Surprisingly, the DSSP plot of the Intuitive Design also revealed that the structure adopted an
alpha helix between residues 53 and 58, but attempts to recreate this in the final redesign via
sequence matching were unsuccessful.
In terms of preserving the binding region, all the untruncated designs had RMSF values
around the binding region that were lower than or equivalent to those of the wild type, suggesting
that these designs were successful at stabilizing the binding region (Figures 4ab). The DSSP plots
also indicated no major changes in secondary structure in this region (Figures 5a-c, g). By
contrast, both fragment designs had much higher C-α RMSF around the binding region than
the other designs and the wild type protein and could not maintain their major secondary structures.
This indicates that truncation had a destabilizing effect on the binding region and that the
fragment designs were therefore unsuccessful. As a result, we conclude that the removed regions of
the protein were critical to maintaining structure and recommend that future vaccine designs not be
truncated to the degree performed in this study.
CONCLUSIONS
We created a total of six antigen designs for use in a P. aeruginosa vaccine based on Type
IV fimbrial precursor pilin and truncated PAK pilin. The first was a homology model of the former
protein, while the remainder were modifications of the latter. Though the homology model
recreated the majority of the secondary structures in the PAK pilin reference, there were slight
differences in sequence around the binding region. Despite this, similarities in backbone structure
around the region mean it may be possible to create a vaccine antigen that allows the immune
system to recognize both pilin proteins on P. aeruginosa. Protein redesigns of PAK pilin had mixed
results in stabilizing the protein around the binding region: untruncated designs could stabilize
the protein while truncated ones could not. Thus we do not recommend that future redesigns of
PAK pilin truncate the protein to the degree we did. Future tests are required to determine
whether our designs are in fact more stable than the wild type PAK pilin, but we are
confident that any of our untruncated designs will be helpful in the creation of a future vaccine.
REFERENCES
1. Keizer, D. W. et al. Structure of a pilin monomer from Pseudomonas aeruginosa:
Implications for the assembly of pili. J. Biol. Chem. 276, 24186–24193 (2001).
2. Sali, Andrej. MODELLER. Computer software. MODELLER, Program for Comparative
Protein Structure Modelling by Satisfaction of Spatial Restraints. Vers. 9.17. Sali Lab,
UCSF, n.d. Web. <https://salilab.org/modeller/tutorial/basic.html>.
3. Chimera. Computer software. Vers. 1.11.2. UCSF, n.d. Web.
<http://www.cgl.ucsf.edu/chimera/>.
4. Foldit. Foldit Solve Puzzles for Science. University of Washington, n.d. Web.
<http://fold.it/portal/>.
5. Beck, DAC, McCully, ME, Alonso, DOV, Daggett, V. in lucem molecular mechanics,
University of Washington, Seattle, 2000-2017.
6. XSEDE User Portal | TACC Stampede. National Science Foundation, n.d. Web. 26 Jan.
2017. <https://portal.xsede.org/#/guest/>.
7. "Type 4 Fimbrial Precursor PilA [Pseudomonas Aeruginosa PAO1] - Protein -
NCBI." National Center for Biotechnology Information. U.S. National Library of
Medicine, n.d. Web. 09 Jan. 2017.
8. RCSB Protein Data Bank, Hazes, B., Sastry, P.A., Hayakawa, K., Read, R.J., Irvin, R.T.
"1DZO." RCSB PDB - 1DZO: Truncated PAK Pilin from Pseudomonas Aeruginosa
Structure Summary Page. N.p., n.d. Web. 09 Jan. 2017.
9. Wong, W. Y. et al. Structure-function analysis of the adherence-binding domain on the
pilin of Pseudomonas aeruginosa strains PAK and KB7. Biochemistry 34, 12963–12972
(1995).
10. Hodges, Robert S., William Paranchych, Kok K. Lee, Sastry A. Parimi, Randall T. Irvin,
and Peter C. Doig. Synthetic Pseudomonas Aeruginosa Pilin Peptide Vaccine. The
Governors of the University of Alberta, assignee. Patent US 5612036 A. 18 Mar. 1997.
Print.
11. Wong, W. Y. et al. Structure-function analysis of the adherence-binding domain on the
pilin of Pseudomonas aeruginosa strains PAK and KB7. Biochemistry 34, 12963–12972
(1995).
FIGURE LEGENDS
Figure 1
A) Reference truncated PAK pilin structure. Residues involved in host cell binding are colored
magenta, while residues that do not participate in binding but are still targets of
our vaccine design are colored cyan. The sequence was identified in previous work [10, 11]; B) Type
IV Fimbrial Precursor Pilin homology model colored on a spectrum by C-α RMSD
with respect to truncated PAK pilin: Blue = 0 Angstroms, White = 0.2 Angstroms, Red = 0.4
Angstroms; C) Homology model with blue indicating where the sequences align and red indicating
where they do not.
Figure 2
A) Fragment Wild Type (FWT); B) Fragment Manual Design (FMD); Mutations shown.
Figure 3: C-α RMSD of designs as a function of time. AFD = Automated Foldit Design; ID =
Intuitive Design; MFD = Manual Foldit Design; WT = Wild Type Protein; FMD = Fragment
Manual Design; FWT = Fragment Wild Type.
Figure 4
C-α RMSF as a function of residue position for untruncated designs (A) or all designs with adjusted
residue numbers (B). Adjusted residue number relates residue position in fragments with the
analogous position in the untruncated designs. Residues 104-120 are identified as the binding
region. AFD = Automated Foldit Design; ID = Intuitive Design; MFD = Manual Foldit Design;
WT = Wild Type Protein; FMD = Fragment Manual Design; FWT = Fragment Wild Type.
Figure 5a-g
DSSP plots of protein designs and wild type. AFD = Automated Foldit Design; ID = Intuitive
Design; MFD = Manual Foldit Design; WT = Wild Type Protein; FMD = Fragment Manual
Design; FWT = Fragment Wild Type.
Figure 6
FASTA Sequences for proteins and redesigns used in study.
Wild Type Protein Sequences
Type IV Fimbrial Pilin Precursor
ARSEGASALATINPLKTTVEESLSRGIAGSKIKIGTTASTATETYVGVEPDANKLGVIAVAI
EDSGAGDITFTFQTGTSSPKNATKVITLNRTADGVWACKSTQDPMFTPKGCDN
Wild Type PAK Pilin (WT)
GTEFARSEGASALASVNPLKTTVEEALSRGWSVKSGTGTEDATKKEVPLGVAADANKL
GTIALKPDPADGTADITLTFTMGGAGPKNKGKIITLTRTAADGLWKCTSDQDEQFIPKGC
SR
Redesigns of PAK Pilin
Automated Foldit Design (AFD)
RRERAREYGQRALNSVNRLQEVVERALRRGWAVRSGTGREDKREKKVPLGRKDDQN
RLGRIELRSDPADGRRDIRIRFRMEGAGPKNKGKVITLERESKKGRWKCTSDQDEQFIP
KGCSR
Intuitive Design (ID)
RRERARREGESALDKVNPLKTTVEEALSRGWSVKSGTGTEDATKKEVPLGVAADANK
NGTIALKPDPADGTADITLTGTMGGAAPKNKGKIITLTRESKKGLWKCTSDQDEQFIPKG
CSR
Manual Foldit Design (MFD)
GTQFARSEGESALRSVNRLKTTVEEALSRGWSVKSGTGTEDATKKEVPLGVAADANKL
GTIALKPDPADGTADITLTFTMGGAGPKNKGKIITLTRTAADGLWKCTSDQDEQFIPKGC
SR
Fragment Manual Design (FMD)
KITTDTRTAADGLKKCTSDQDEQFIPKGCSR
Fragment Wild Type (FWT)
KIITLTRTAADGLWKCTSDQDEQFIPKGCSR
Final Redesign (RD)
RRERAAEYGQRALNSVNRLQEVVERALRRGWAVRSGTGREDKREKKVPLGRKADAN
RNGRIELRSDPADGRRDIRIRFRMEGAGPKNKGKVITLERESKKGRWKCTSDQDEQFI
PKGCSR