Fold recognition and
ab initio protein
modeling
Michael Dolan
6/26/18 Source: Aza Toth
What if there is no homolog?
Computational methods for protein
structure prediction
• Homology (or “comparative”) modeling
used for proteins that have homologous protein structures deposited in
the Protein Data Bank (PDB)
• Fold recognition / threading
used to model proteins that have the same fold as proteins of known
structure, but have no homologs of known structure
• ab initio modeling
uses physical principles (an energy function), along with protein fragments, to
construct a model
Protein fold recognition
• Can be applied when homology modeling
methods provide no reliable prediction
• Attempts to identify a model fold for a given
target sequence among the known folds even
if no sequence similarity can be detected
Protein Fold Recognition
Basic premise
• Similar sequence implies similar structure, but
not all similar structures have similar
sequences
– structure is evolutionarily more conserved than
sequence
– number of unique structural folds in nature is
fairly small
Structure is conserved more than
sequence…
SCOP: Structural Classification of Proteins
Similar structures can be found among
proteins with no sequence similarity
Chap. 11 Protein Structures: Published by Eleanore Bruce
NK-lysin (1nkl) vs. bacteriocin (1e68): 3.6 Å, 5% sequence identity
There are far fewer protein folds than there is
sequence diversity
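A figure such as "5% ID" comes from counting identical residues across an alignment. The sketch below shows one common way to compute it; the function name and the choice to skip gapped columns are assumptions for illustration, since the slide does not say how its value was calculated, and other denominators are also in use.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over the aligned, gap-free columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # columns with a gap are not counted (an assumption)
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy aligned strings, not real protein sequences
print(percent_identity("ACDE", "ACDF"))
```

Two structures can superimpose closely even when this number is too low for sequence searches to detect any relationship.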
Protein Fold Recognition / Threading
Which of the known folds is likely to be
similar to the (unknown) fold of a new
protein when only its amino-acid sequence
is known?
Predicting Secondary Structure
From Primary Structure
TEAVDAATAEKVFKQYANDNGVDGEWTYDDATKTFTVTEK
TEAVDAWTVEKAFKTFANDNGVDGAWTVEKAFKTFTVTEK
Replace both sequences with an engineered peptide
Source: Minor and Kim. 1996. Nature 380:730-734
α-helix β-strand
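The engineered peptide can be located directly in the second sequence above, because it is the only long segment that occurs twice. The following is a minimal brute-force sketch; the function name is hypothetical and the approach is only practical for short sequences like these.

```python
def longest_repeated_segment(seq):
    """Return (segment, start_indices) for the longest substring seen at least twice."""
    n = len(seq)
    for length in range(n - 1, 0, -1):          # try the longest lengths first
        seen = {}
        for start in range(n - length + 1):
            seen.setdefault(seq[start:start + length], []).append(start)
        repeats = {s: p for s, p in seen.items() if len(p) > 1}
        if repeats:
            segment = max(repeats)              # deterministic pick if several tie
            return segment, repeats[segment]
    return "", []


engineered = "TEAVDAWTVEKAFKTFANDNGVDGAWTVEKAFKTFTVTEK"
segment, starts = longest_repeated_segment(engineered)
print(segment, starts)
```

For the second sequence this reports the 11-residue segment AWTVEKAFKTF at two positions (0-based indices 5 and 24). The same residues sit in two different structural contexts, which is why local sequence alone does not determine secondary structure.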
Protein Threading
• The threading method scores the "fitness" of the
query sequence in the structural environment of the
template structure.
• Sequences are fitted directly onto the backbone
coordinates of known protein structures
• Matching of sequences to backbone coordinates
is performed in 3D space, incorporating specific
pair interactions explicitly
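The idea of scoring a sequence against a structural environment can be illustrated with a deliberately simplified sketch. Everything here is an assumption for illustration: the template is reduced to a string of buried (B) and exposed (E) positions, the score only rewards hydrophobic residues in buried positions and polar residues in exposed ones, and the query is slid along the template without gaps. Real threading programs use richer environments, pair potentials, and gapped alignment.

```python
HYDROPHOBIC = set("AVILMFWC")  # crude residue classification (an assumption)


def environment_score(residue, environment):
    """+1 if the residue type fits the environment ('B' buried, 'E' exposed), else -1."""
    fits = (residue in HYDROPHOBIC) == (environment == "B")
    return 1 if fits else -1


def thread(query, template_env):
    """Slide the query along the template; return (best_score, best_offset) or None."""
    best = None
    for offset in range(len(template_env) - len(query) + 1):
        score = sum(
            environment_score(res, template_env[offset + i])
            for i, res in enumerate(query)
        )
        if best is None or score > best[0]:
            best = (score, offset)
    return best


# Toy query and toy burial pattern, not taken from any real structure
print(thread("VLKD", "EEBBEEB"))
```

Repeating this over a library of templates and ranking the best scores gives the basic shape of fold recognition: the template whose environment the query fits best is the predicted fold.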
Ab initio / de novo methods
• Build protein 3D structures from sequence
alone
– based on physical
principles
https://doi.org/10.1371/journal.pone.0032637
Protein intramolecular interactions
https://www.fastbleep.com/biology-notes/40/1175
Let’s pause and think about this
problem…
• For a protein of 200 residues and considering
only 3 backbone angles (φ, ψ, and ω)…
…there are 3^200 possibilities.
• There are estimated to be 10^82 atoms in
the universe.
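The scale of these two numbers is easier to appreciate when computed. This short sketch uses only the figures quoted above; the variable names are illustrative.

```python
import math

residues = 200
states_per_residue = 3               # simplification used on the slide
conformations = states_per_residue ** residues
atoms_in_universe = 10 ** 82         # estimate quoted on the slide

# Python integers are arbitrary precision, so both values are exact here
log_conformations = math.log10(conformations)
print(f"3^200 is about 10^{log_conformations:.1f}")
print(f"conformations per atom: about 10^{log_conformations - 82:.1f}")
```

3^200 is roughly 10^95, so there are about 10^13 conformations for every atom in the universe. Exhaustive enumeration is therefore out of the question, and ab initio methods must sample conformations selectively.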
Rosetta
Fragment-based, ab initio modeling
• Sections of a sequence are subjected to secondary structure prediction
• Assembled in 3D space, looking for lowest energy configurations
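The assembly step can be pictured as a Monte Carlo search in which short fragments are repeatedly inserted into the chain and kept when they lower the energy. The sketch below is a toy under stated assumptions and is not Rosetta's implementation: a conformation is a list of one torsion angle per residue, the fragment library is a hand-written list, and `toy_energy` is a hypothetical stand-in for a real scoring function.

```python
import math
import random


def toy_energy(angles, target=-60.0):
    """Hypothetical energy: squared deviation of each angle from a target value."""
    return sum((a - target) ** 2 for a in angles)


def fragment_assembly(n_residues, fragments, steps=2000, temperature=50.0, seed=0):
    """Metropolis search by fragment insertion; returns (best_angles, best_energy)."""
    rng = random.Random(seed)
    frag_len = len(fragments[0])           # all fragments assumed to be the same length
    angles = [180.0] * n_residues          # start from an extended chain
    energy = toy_energy(angles)
    best_angles, best_energy = list(angles), energy
    for _ in range(steps):
        frag = rng.choice(fragments)
        pos = rng.randrange(n_residues - frag_len + 1)
        trial = angles[:pos] + list(frag) + angles[pos + frag_len:]
        delta = toy_energy(trial) - energy
        # Always accept downhill moves; accept uphill moves with Boltzmann probability
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            angles, energy = trial, energy + delta
            if energy < best_energy:
                best_angles, best_energy = list(angles), energy
    return best_angles, best_energy


# Toy 3-residue fragments (angle values chosen for illustration only)
library = [(-60.0, -60.0, -60.0), (-120.0, -120.0, -120.0), (-57.0, -47.0, -60.0)]
best, e = fragment_assembly(12, library)
print(round(e, 1))
```

Restricting moves to fragments is what makes the search tractable: each insertion replaces several residues at once with a locally plausible conformation instead of exploring every angle independently.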
Fragment-based folding using Rosetta
ab initio modeling
Challenges:
– scoring function
– fast method for sampling conformations
Advantages:
– Can work for novel folds
– helps to understand the folding process
Disadvantages:
– applicable to short sequences only; monomers
– time consuming
– misleading results?
Hands on Exercises
Robetta: A web server for ab initio modeling
Rosetta: Command line suite of programs for ab
initio modeling
(see associated tutorial)
Robetta: full-chain protein structure
prediction server
http://robetta.bakerlab.org
Rosetta analysis
