THREADING MODELING METHODS
RATAN VISHWAS
ROLL NUMBER- 018COL012
ratanvishwas4@gmail.com
Noida institute of engineering and technology
M.Pharm (branch:- pharmacology )
1st year 2nd semester
2019
Introduction
Protein threading, also known as fold recognition, is a
protein modeling method for proteins that share a fold
with proteins of known structure but have no homologous
protein of known structure.
It differs from homology modeling in its targets: threading
is used when no structure of a homologous protein has been
deposited in the Protein Data Bank (PDB), whereas homology
modeling is used when such a structure is available.
Threading works by using statistical knowledge of the
relationship between the structures deposited in the PDB
and the sequence of the protein which one wishes to model.
Protein threading is based on two basic
observations:
I. The number of different folds in nature is
fairly small (approximately 1300).
II. About 90% of the new structures submitted to the
PDB in the past three years have structural folds
similar to ones already in the PDB.
Classification of protein structure
The Structural Classification of Proteins (SCOP)
database provides a detailed and comprehensive
description of the structural and evolutionary
relationships between proteins of known structure.
Proteins are classified to reflect both structural and
evolutionary relatedness.
1. Family (clear evolutionary relationship):
Proteins clustered together into families are clearly
evolutionarily related.
Generally, this means that pairwise residue identities
between the proteins are 30% or greater.
However, in some cases similar functions and structures
provide definitive evidence of common descent in the
absence of high sequence identity;
for example, many globins form a family though some
members have sequence identities of only 15%.
2. Superfamily (probable common evolutionary
origin):
Proteins that have low sequence identities, but whose
structural and functional features suggest that a
common evolutionary origin is probable, are placed
together in superfamilies.
For example, actin, the ATPase domain of the heat
shock protein, and hexokinase together form a
superfamily.
3. Fold (major structural similarity):
Proteins are defined as having a common fold if they have
the same major secondary structures in the same
arrangement and with the same topological connections.
Different proteins with the same fold often have peripheral
elements of secondary structure and turn regions that differ
in size and conformation. In some cases, these differing
peripheral regions may comprise half the structure.
Proteins placed together in the same fold category may
not have a common evolutionary origin: the structural
similarities could arise just from the physics and
chemistry of proteins favoring certain packing
arrangements and chain topologies.
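The 30% identity guideline quoted for the family level can be checked on a pair of already aligned sequences. The following is a minimal sketch, not part of SCOP itself: the function name `percent_identity`, the gap character `-`, and the choice to count only columns where both sequences have a residue are assumptions made for illustration (other conventions for the denominator exist).

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where both sequences have a residue.

    Both inputs must come from the same pairwise alignment, so they have
    equal length and use `gap` for insertions and deletions.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0   # columns with a residue in both sequences
    identical = 0  # compared columns where the residues match
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0
```

A value of 30 or more would be consistent with the family-level guideline, although, as the globin example shows, lower values do not rule out a family relationship.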
Method
A general paradigm of protein threading consists of the
following four steps:
A. The construction of a structure template database:
Select protein structures from the protein structure
databases as structural templates.
This generally involves selecting protein structures from
databases such as PDB, FSSP, SCOP, or CATH, after removing
protein structures with high sequence similarities.
PDB (Protein Data Bank)
FSSP (Families of Structurally Similar Proteins database)
SCOP (Structural Classification of Proteins database)
CATH (Class, Architecture, Topology, Homologous superfamily database)
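Removing templates with high sequence similarity, as described in step A, can be sketched as a greedy filter. This is an illustration under stated assumptions, not the procedure of any particular threading program: `naive_identity` is a hypothetical stand-in for a real alignment-based identity measure, and the 0.3 cutoff is an arbitrary example value.

```python
from typing import Callable, Dict, List


def naive_identity(a: str, b: str) -> float:
    """Fraction of identical positions in an ungapped, left-aligned comparison.

    Hypothetical placeholder: a real pipeline would align the sequences first.
    """
    if not a or not b:
        return 0.0
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / max(len(a), len(b))


def select_templates(
    sequences: Dict[str, str],
    max_identity: float = 0.3,  # assumed cutoff, chosen only for illustration
    identity: Callable[[str, str], float] = naive_identity,
) -> List[str]:
    """Keep a template only if it is not too similar to any template already kept."""
    kept: List[str] = []
    for name, seq in sequences.items():
        if all(identity(seq, sequences[other]) <= max_identity for other in kept):
            kept.append(name)
    return kept


candidates = {
    "t1": "ACDEFGHIKL",
    "t2": "ACDEFGHIKV",  # differs from t1 at one position only
    "t3": "WWWWWYYYYY",
}
representatives = select_templates(candidates)
```

Here `t2` is dropped because it is nearly identical to `t1`, leaving a non-redundant set of templates. The result of a greedy filter depends on the order in which candidates are visited.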
B. The design of the scoring function:
Design a good scoring function to measure the fitness
between target sequences and templates based on the
knowledge of the known relationships between the
structures and the sequences.
A good scoring function should combine a mutation
potential, an environment fitness potential, a pairwise
potential, secondary structure compatibility, and gap
penalties.
The quality of the scoring function is closely related to
the prediction accuracy, especially the alignment
accuracy.
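One simple way to combine the terms listed in step B is a weighted sum. The sketch below assumes that form; the class `ScoreTerms`, the function `total_score`, the unit weights, and the convention that higher scores are better (so the gap penalty is subtracted) are all illustrative choices, not taken from a specific threading program.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ScoreTerms:
    """Per-alignment values of the individual scoring terms (hypothetical container)."""
    mutation: float
    environment: float
    pairwise: float
    secondary_structure: float
    gap_penalty: float


# Assumed weights; real programs tune these against known structures.
DEFAULT_WEIGHTS: Dict[str, float] = {
    "mutation": 1.0,
    "environment": 1.0,
    "pairwise": 1.0,
    "secondary_structure": 1.0,
    "gap_penalty": 1.0,
}


def total_score(terms: ScoreTerms, weights: Optional[Dict[str, float]] = None) -> float:
    """Weighted sum of the favorable terms minus the weighted gap penalty."""
    w = DEFAULT_WEIGHTS if weights is None else weights
    return (
        w["mutation"] * terms.mutation
        + w["environment"] * terms.environment
        + w["pairwise"] * terms.pairwise
        + w["secondary_structure"] * terms.secondary_structure
        - w["gap_penalty"] * terms.gap_penalty
    )


example = ScoreTerms(mutation=2.0, environment=1.5, pairwise=0.5,
                     secondary_structure=1.0, gap_penalty=3.0)
example_score = total_score(example)
```

Keeping the terms separate makes it possible to reweight them, which matters because the balance between terms affects alignment accuracy.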
C. Threading alignment:
Align the target sequence with each of the structure
templates by optimizing the designed scoring function.
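This step is the main computational task of threading
programs whose scoring function includes the pairwise
contact potential, because the score at one position then
depends on which residues are placed at other positions.
If the pairwise term is left out, a dynamic programming
algorithm can find the optimal alignment.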
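For the case without a pairwise term, the dynamic programming alignment can be sketched as a global (Needleman-Wunsch style) alignment of the target sequence against template positions. Everything named here is an assumption for illustration: `thread_by_dp`, the linear gap penalty, and the toy `toy_fit` function that rewards hydrophobic residues at buried positions (`B`) and other residues at exposed positions (`E`).

```python
from typing import Callable, List, Tuple


def thread_by_dp(
    sequence: str,
    n_positions: int,
    fit: Callable[[str, int], float],
    gap: float = 1.0,
) -> Tuple[float, List[Tuple[int, int]]]:
    """Globally align a sequence to template positions; return (score, aligned pairs)."""
    n, m = len(sequence), n_positions
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = -gap * i
    for j in range(1, m + 1):
        score[0][j] = -gap * j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + fit(sequence[i - 1], j - 1),  # residue on position
                score[i - 1][j] - gap,  # residue left unaligned
                score[i][j - 1] - gap,  # template position skipped
            )
    # Trace back from the corner to recover (sequence index, template index) pairs.
    pairs: List[Tuple[int, int]] = []
    i, j = n, m
    while i > 0 and j > 0:
        if score[i][j] == score[i - 1][j - 1] + fit(sequence[i - 1], j - 1):
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] - gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return score[n][m], pairs


TEMPLATE_ENV = "BEB"      # toy template: buried, exposed, buried
HYDROPHOBIC = set("AVLIFMW")


def toy_fit(residue: str, position: int) -> float:
    """+1 if the residue type suits the template environment, otherwise -1."""
    buried = TEMPLATE_ENV[position] == "B"
    return 1.0 if (residue in HYDROPHOBIC) == buried else -1.0


full_score, full_pairs = thread_by_dp("LKV", len(TEMPLATE_ENV), toy_fit)
short_score, short_pairs = thread_by_dp("LV", len(TEMPLATE_ENV), toy_fit)
```

In the second call the two hydrophobic residues are placed on the two buried positions and the exposed position is skipped at the cost of one gap. This works only because each position is scored independently; adding a pairwise term breaks that independence.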
D. Threading prediction:
Select the threading alignment that is statistically most
probable as the threading prediction.
Then construct a structure model for the target by
placing the backbone atoms of the target sequence at
their aligned backbone positions of the selected
structural template.
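Step D can be sketched in two parts: ranking templates and copying backbone coordinates. The sketch assumes that statistical significance is approximated by a Z-score of each template's raw score against all templates, which is one common choice but not necessarily what a given program uses. The names `z_scores`, `pick_template`, and `copy_backbone`, and the coordinate values, are hypothetical.

```python
from statistics import mean, pstdev
from typing import Dict, List, Tuple

Coord = Tuple[float, float, float]


def z_scores(raw: Dict[str, float]) -> Dict[str, float]:
    """Standardize raw threading scores across all templates."""
    values = list(raw.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return {name: 0.0 for name in raw}
    return {name: (value - mu) / sigma for name, value in raw.items()}


def pick_template(raw: Dict[str, float]) -> Tuple[str, float]:
    """Return the template with the highest Z-score and that Z-score."""
    z = z_scores(raw)
    best = max(z, key=z.get)
    return best, z[best]


def copy_backbone(
    pairs: List[Tuple[int, int]], template_coords: List[Coord]
) -> Dict[int, Coord]:
    """Give each aligned target residue the coordinates of its template position."""
    return {seq_idx: template_coords[tpl_idx] for seq_idx, tpl_idx in pairs}


raw_scores = {"a": 1.0, "b": 2.0, "c": 9.0}
best_name, best_z = pick_template(raw_scores)

# Toy backbone: one point per template position, spaced 3.8 apart.
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model = copy_backbone([(0, 0), (1, 2)], toy_coords)
```

Target residues that are not aligned to any template position receive no coordinates in this sketch; building those regions is a separate modeling task.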
Comparison of threading with homology modeling
a) Homology modeling and protein threading are both
template-based methods, and there is no rigorous
boundary between them in terms of prediction
techniques.
b) They differ in the kind of target protein they are
applied to.
c) Homology modeling is for targets that have
homologous proteins of known structure
(usually from the same family), while protein
threading is for targets for which only fold-level
similarity can be found.
Thank you
