Protein Structural Bioinformatics
Definition
The subdiscipline of bioinformatics that focuses on the
representation, storage, retrieval, analysis, and display of
structural information at the atomic and subcellular spatial
scales.
(From Structural Bioinformatics, P.E. Bourne & H. Weissig (eds.), John Wiley &
Sons, Inc., 2003, p. 4.)
Why is STRUCTURAL bioinformatics important?
Because a protein’s function is determined by its structure.
Knowledge of a protein’s structure is necessary in order to gain
a full understanding of the biological role of a protein.
Bioinformatics methods can be used to analyze
protein structural data in the following ways:
• Visualization of protein structures
• Alignment of protein structures
• Classification of proteins into families, based on similarity
of their structures
• Prediction of protein structures
• Simulation of protein folding and dynamic motions
Protein structure determination by X-ray crystallography or
NMR is difficult (see PowerPoint slides from last module).
It takes 1-3 years to solve a protein structure by these methods. Certain
proteins, such as membrane proteins, are extremely difficult or impossible to
solve by these methods. Due to genomic sequencing efforts, the gap
between known protein sequences and known protein structures is
increasing: only about 3,000 unique protein structures have been
determined, but over 1 million unique sequences have been determined.
Therefore, it is necessary to use bioinformatics methods to predict the
structures of proteins for which a crystal structure or NMR structure has not
been determined.
Bioinformatics methods can predict:
(1) secondary structural elements in a protein sequence
(2) the tertiary structure of the entire sequence
(3) “special” structures, such as transmembrane α-helices,
transmembrane β-barrels, coiled coils, and leucine zippers
Protein Secondary Structure Prediction
All secondary structure prediction is based on the assumption that there
should be a correlation between amino acid sequence and secondary
structure– in other words, it is assumed that certain stretches of amino acids
are more likely to form one type of secondary structure than another.
During secondary structure prediction, the conformational state of each
residue in a protein sequence is predicted; generally each residue is
predicted as having one of three possible states:
(1) α-helical structure
(2) β-strand
(3) “other” (β-turn, loop, or random coil)
Sometimes β-turn is separated as a 4th state.
Why is prediction of secondary structure useful?
It can help guide sequence alignment or improve existing sequence
alignment of distantly related sequences. It is also an intermediate step in
some methods for tertiary structure prediction.
Methods of secondary structure prediction fall into
two broad classes:
Ab initio methods– predict secondary structure based solely
on protein sequence; these methods compute statistics for the
residues that occur in different secondary structural elements in
proteins with known structures, in order to identify “patterns” in
the types of residues that occur in a given type of secondary
structure.
Homology-based methods– make use of multiple sequence
alignments of homologous proteins to predict secondary
structure; these methods are able to locate conserved patterns
that are characteristic of particular secondary structural
elements across the aligned family members.
Certain amino acids are observed more frequently than others in
α-helices, β-strands, and β-turns in crystal structures (see Figure). This
leads to the idea that each amino acid tends to “prefer” being
constrained in a certain type of secondary structure, or has an
“intrinsic propensity” to adopt that secondary structure.
Fig. 4-10 from Lehninger Principles of Biochemistry, 4th ed.
The figure shows that:
• Glu, Met, Ala are most frequent in α-helices
• Val, Tyr, Ile are most frequent in β-strands
• Pro, Gly, Asn are most frequent in β-turns
Based on these data, it is believed that Glu has a high α-helical
propensity, but a low β-strand propensity.
Ab initio methods of secondary structure prediction:
• These methods calculate the relative propensity (intrinsic tendency) of each
amino acid in a protein sequence to belong to a certain secondary structural
element.
• Propensity scores for the 20 amino acids are derived from known protein
structures: these propensities are calculated from the relative frequency of a
given amino acid within the proteins, its frequency in a given type of
secondary structure, and the fraction of all amino acids occurring in that type
of secondary structure.
• Stretches of a protein’s sequence that contain many residues with a high
α-helical propensity are predicted to fold into α-helices. Stretches of sequence
that contain many residues with a high β-strand propensity are predicted to
fold into β-strands.
• Two examples: Chou-Fasman method, GOR method
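The propensity calculation described above can be sketched in a few lines. This is a minimal illustration, not the published Chou-Fasman or GOR procedure: it assumes a toy training set of sequences with known per-residue states, and the state letters 'H' (helix), 'E' (strand), and 'C' (other) as well as the function name are choices made here for the example. The propensity of an amino acid for a state is taken as the fraction of that amino acid found in the state, divided by the fraction of all residues found in that state.

```python
from collections import Counter


def propensities(examples):
    """Return {(amino_acid, state): propensity} from (sequence, states) pairs.

    `states` is a string of the same length as `sequence`, using the
    assumed labels 'H' (helix), 'E' (strand), 'C' (other).
    """
    aa_total = Counter()   # occurrences of each amino acid
    ss_total = Counter()   # number of residues observed in each state
    joint = Counter()      # occurrences of each (amino acid, state) pair
    for seq, states in examples:
        if len(seq) != len(states):
            raise ValueError("sequence and state string differ in length")
        for aa, ss in zip(seq, states):
            aa_total[aa] += 1
            ss_total[ss] += 1
            joint[(aa, ss)] += 1

    n_residues = sum(aa_total.values())
    table = {}
    for (aa, ss), count in joint.items():
        frac_of_aa_in_state = count / aa_total[aa]   # e.g. fraction of all Glu that is helical
        frac_of_all_in_state = ss_total[ss] / n_residues
        table[(aa, ss)] = frac_of_aa_in_state / frac_of_all_in_state
    return table


# Toy data, invented for illustration only (not real structures).
toy = [("EEAAGG", "HHHHCC"), ("VVEGPA", "EECCCH")]
table = propensities(toy)
print(table[("E", "H")])   # propensity of Glu (amino acid 'E') for helix (state 'H')
```

A value above 1 means the amino acid occurs in that state more often than expected from the overall composition; a value below 1 means it occurs less often.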
Accuracy of ab initio methods:
• These methods are not very accurate:
  • Chou-Fasman method: 50%-60% accuracy
  • GOR method: 64% accuracy; drastically underpredicts β-strands
• These methods are only a little better than randomly assigning secondary
structure! Known proteins consist of ~31% α-helix and ~28% β-sheet, so
randomly assigning secondary structural elements to residues would result in
~30% accuracy.
• Specific problems with these methods:
  • Tend to underpredict the lengths of α-helices and β-strands: they cannot
  identify the first and last residues of helices and strands very well
  • Tend to miss β-strands completely
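The accuracy figures above are per-residue accuracies over three states. The sketch below shows how such a score can be computed and where the random baseline comes from. It assumes that the residues not counted as helix or sheet (~41%) all fall into the "other" state, and that a random predictor guesses each state with the same frequency at which it occurs; both function names are chosen for this example.

```python
def three_state_accuracy(predicted, observed):
    """Fraction of residues whose predicted state matches the observed state."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("need two non-empty state strings of equal length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


def expected_random_accuracy(state_frequencies):
    """Expected accuracy when states are guessed at random with the given
    frequencies: a guess is right with probability f for a residue in a
    state of frequency f, so the expectation is the sum of f squared."""
    return sum(f * f for f in state_frequencies.values())


# Frequencies from the slide; the 0.41 for "other" is the assumed remainder.
baseline = expected_random_accuracy({"H": 0.31, "E": 0.28, "C": 0.41})
print(round(baseline, 3))
```

Under these assumptions the baseline comes out at about a third of residues, which is in line with the rough ~30% figure quoted above.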
A few homology-based secondary structure prediction methods:
Neural network methods:
PROFsec (an improved version of PHDsec)
http://www.predictprotein.org/
PSIPRED
http://bioinf.cs.ucl.ac.uk/psipred/
SSpro (newest version is 4.0)
http://scratch.proteomics.ics.uci.edu/
SAM-T (SAM-T08 is the newest version; SAM-T06, SAM-T02, and SAM-T99 are older versions)
http://compbio.soe.ucsc.edu/SAM_T08/T08-query.html
Nearest-neighbor methods:
NNSSP
no longer available online
PREDATOR
http://mobyle.pasteur.fr/cgi-bin/portal.py?#forms::predator
HMM methods:
HMMSTR
http://www.bioinfo.rpi.edu/~bystrc/hmmstr/server.php
A few methods for predicting transmembrane α-helices:
TMHMM
http://www.cbs.dtu.dk/services/TMHMM/
HMMTOP
http://www.enzim.hu/hmmtop/index.html
Phobius (also predicts presence of signal peptides)
http://phobius.sbc.su.se/
TopPred
http://mobyle.pasteur.fr/cgi-bin/portal.py?#forms::toppred
PRED-TMR
http://athina.biol.uoa.gr/PRED-TMR/
DAS
http://mendel.imp.ac.at/sat/DAS/DAS.html
TMpred
http://www.ch.embnet.org/software/TMPRED_form.html
MEMSAT
http://bioinf.cs.ucl.ac.uk/psipred/
Accuracies of the methods:
Levels of accuracy are reported by the developers to be in the range of 75-95%.
At least one study (2001) found TMHMM to be the best performing program.
It is best to use several methods and compare the results to arrive at a consensus
prediction. When different methods, specifically methods that are based on different
algorithms, give similar results, the reliability of the results is higher.
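A consensus prediction can be as simple as a per-residue majority vote. The sketch below assumes each method's output has already been converted by hand into a string with one label per residue ('M' for membrane, '-' for non-membrane); that encoding and the function name are assumptions for this example, since each server reports its results in its own format.

```python
from collections import Counter


def consensus(predictions):
    """Per-residue majority vote over equal-length prediction strings.

    Positions without a strict majority are marked '?'.
    """
    if len({len(p) for p in predictions}) != 1:
        raise ValueError("need at least one prediction, all of equal length")
    result = []
    for column in zip(*predictions):
        label, votes = Counter(column).most_common(1)[0]
        # Keep the label only if more than half of the methods agree on it.
        result.append(label if 2 * votes > len(predictions) else "?")
    return "".join(result)


# Hypothetical per-residue calls from three different methods.
print(consensus(["--MMMM--", "-MMMMM--", "--MMM---"]))
```

Residues where the methods disagree (here the helix ends) are the ones to treat with the least confidence.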
Tertiary structure prediction methods fall into three
classes:
(1) Homology modeling (also called comparative modeling)
A structure is built based on the known structure of another protein that is
similar in sequence (a homolog).
(2) Threading (also called structural fold recognition)
A structure is predicted for a protein by “threading” its sequence through a
variety of known structures to determine which structure the sequence best
fits.
(3) Ab initio prediction (also called de novo prediction)
A structure is predicted based only on the amino acid sequence of the
protein, using the physicochemical properties of its residues and the
principles governing protein folding.
Homology modeling for tertiary structure prediction:
Homology modeling is based on the idea that if two proteins share a high
degree of sequence similarity (i.e., they are close homologs), they are likely
to have very similar 3D structures. In general, proteins that share >30%
sequence identity are likely to be quite similar in structure.
Therefore, if a protein of unknown structure is similar in sequence to a
protein of known structure, the known structure can be used as a template to
which the unknown sequence is fit. The structure that is built for the
unknown sequence is then called a homology model for the structure of that
sequence.
The “safe homology
modeling zone,” above the
gray curve, is the region
where two proteins are likely
to have the same structure.
Fig. 5 from R. Nair & B. Rost,
Protein Science (2002) 11: 2836-47.
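The 30% rule of thumb requires a percent identity computed from an alignment. A minimal sketch follows, assuming two already-aligned strings with '-' as the gap character. Several conventions exist for the denominator; this example divides by the number of columns in which both sequences have a residue, which is one choice among them.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where both sequences have a residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue            # columns with a gap are not counted
        compared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


# Toy alignment, invented for illustration.
identity = percent_identity("ACDEFG", "ACD-FA")
print(identity, identity > 30.0)
```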
Steps in homology modeling for tertiary structure
prediction:
The protein of unknown structure for which a structural model is to be built
will be called the “target sequence.”
1. Template selection– Identify protein(s) in the PDB that are
homologous to the target sequence using BLAST or PSI-BLAST. If a close
homolog with known structure is found, its structure will serve as a template
to which the target sequence will be matched. The template should have
at least 30% sequence identity with the target. (Proteins that share less
than 30% sequence identity may not be similar enough in structure to carry
out homology modeling.) If PSI-BLAST does not identify a suitable template,
it will probably be necessary to construct a structural model by threading.
It is possible to use multiple templates if more than one good template is
identified. When multiple templates are available, it is best to use more than
one template to avoid biasing the model toward a single protein. The
template used in the next step of homology modeling will then be an
averaged structure based on all of the chosen templates.
2. Sequence alignment– Construct a multiple sequence alignment of
the target, the template, and other homologous sequences. It is actually the
alignment of the target and template that is of interest, but the inclusion of
other homologs provides more information, helping to ensure that the best
alignment of homologous residues is achieved. The quality of the target-
template alignment is critical for constructing an accurate structural
model for the target. If a given residue in the target is not aligned with the
proper residue in the template, the error cannot be corrected in later steps of
model building. A robust multiple sequence alignment program should be
used for this step, and the resulting alignment should be very carefully
examined and manually refined if necessary.
3. Backbone model building– Residues in the aligned regions of the
target and template are assumed to adopt the same structure. Therefore,
the backbone atoms of these residues in the target can be placed in the
same 3D location as the backbone atoms of these residues in the template.
See the alignment below as an example.
Target:   ...FKSQAAIHEAYCNFHYKVTAAASRTPEIDFDVHFSSIF...
Template: ...FKQQANIHCAYCNGAYKIG-------GKELQVHFSWLF...
For these residues, backbone atoms of the target are assumed
to occupy the same 3D location as those of the template.
F aligned with F. They are identical,
so all atoms of target F will overlap
the 3D positions of all atoms of
template F.
D aligned with E. They are not identical, but
their backbone atoms can be assumed to
occupy the same 3D position. So backbone
atoms of target D will overlap the 3D
positions of backbone atoms of template E.
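In code, backbone model building amounts to walking the alignment column by column and copying template backbone coordinates to the matching target residue. The sketch below uses the alignment fragment above. The coordinate layout (a dict from template residue index to a dict of atom name to (x, y, z)) and both function names are assumptions for this example; real modeling programs read coordinates from PDB files.

```python
TARGET = "FKSQAAIHEAYCNFHYKVTAAASRTPEIDFDVHFSSIF"
TEMPLATE = "FKQQANIHCAYCNGAYKIG-------GKELQVHFSWLF"


def aligned_pairs(target_aln, template_aln, gap="-"):
    """Return (target_index, template_index) for every column in which both
    sequences have a residue. Indices are 0-based positions in the ungapped
    sequences."""
    if len(target_aln) != len(template_aln):
        raise ValueError("aligned sequences must have the same length")
    pairs = []
    t_idx = p_idx = 0
    for t, p in zip(target_aln, template_aln):
        if t != gap and p != gap:
            pairs.append((t_idx, p_idx))
        if t != gap:
            t_idx += 1
        if p != gap:
            p_idx += 1
    return pairs


def copy_backbone(pairs, template_backbone):
    """Give each aligned target residue the backbone coordinates of its
    template partner (template residues without coordinates are skipped)."""
    return {t: dict(template_backbone[p]) for t, p in pairs if p in template_backbone}


pairs = aligned_pairs(TARGET, TEMPLATE)
print(len(pairs), pairs[21])   # number of aligned columns; the D/E column
```

Target residues that have no partner in the template (the gap region) receive no coordinates at this stage.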
4. Loop building– There are likely to be regions in the alignment where
gaps appear because the target sequence does not match the template. The
target sequence residues in these gap regions are assumed to form a loop that
is not present in the template structure. The structure of this loop can be built
using several different methods. In any case, it is a difficult problem since the
template provides no information to guide the building of the loop structure.
Target:   ...FKSQAAIHEAYCNFHYKVTAAASRTPEIDFDVHFSSIF...
Template: ...FKQQANIHCAYCNGAYKIG-------GKELQVHFSWLF...
“Extra” residues in the target sequence do not
match the template and are assumed to form a loop.
target loop
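The residues that need loop building can be read directly off the alignment: they are the runs of target residues that face gaps in the template. A minimal sketch using the alignment fragment above (the function name is chosen for this example, and it only locates the loop; it does not build its structure):

```python
TARGET = "FKSQAAIHEAYCNFHYKVTAAASRTPEIDFDVHFSSIF"
TEMPLATE = "FKQQANIHCAYCNGAYKIG-------GKELQVHFSWLF"


def target_insertions(target_aln, template_aln, gap="-"):
    """Return (start, end) index pairs, 0-based and inclusive in the ungapped
    target, for each run of target residues aligned to template gaps."""
    if len(target_aln) != len(template_aln):
        raise ValueError("aligned sequences must have the same length")
    runs = []
    t_idx = 0            # index of the current target residue
    run_start = None
    for t, p in zip(target_aln, template_aln):
        inserting = t != gap and p == gap
        if inserting and run_start is None:
            run_start = t_idx
        if not inserting and run_start is not None:
            runs.append((run_start, t_idx - 1))
            run_start = None
        if t != gap:
            t_idx += 1
    if run_start is not None:          # alignment ended inside an insertion
        runs.append((run_start, t_idx - 1))
    return runs


for start, end in target_insertions(TARGET, TEMPLATE):
    print(start, end, TARGET.replace("-", "")[start:end + 1])
```

For this fragment the only such run is the seven-residue stretch AAASRTP.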
5. Side chain addition– The side chains are added to the backbone
structure. Each side chain could potentially have many possible
conformations due to bond rotation, but steric clashes with neighboring
atoms are not allowed. Therefore, the side-chain conformations (rotamers) that have the lowest interaction
energy with nearby atoms are chosen.
Target:   ...FKSQAAIHEAYCNFHYKVTAAASRTPEIDFDVHFSSIF...
Template: ...FKQQANIHCAYCNGAYKIG-------GKELQVHFSWLF...
Target and template are both F, so
all atoms of the target side chain
can be modeled as having the same
3D positions as the template side
chain, at least initially. (Small
changes in position may be
necessary in later refinement steps.)
Target and template have different
side chains (D vs. E), so the side
chain rotamer that is chosen for the
target D must not overlap/clash with
any neighboring atoms.
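Rotamer selection can be illustrated with a simple clash penalty in place of a real interaction energy. This is a deliberate simplification: modeling programs score rotamers with a force field or statistical potential, whereas the sketch below only penalizes atom pairs closer than a cutoff. The 3.0 Å cutoff, the coordinates, and the function names are illustrative assumptions.

```python
import math


def clash_score(rotamer_atoms, neighbor_atoms, min_distance=3.0):
    """Sum of overlaps (in Å) for atom pairs closer than min_distance."""
    score = 0.0
    for a in rotamer_atoms:
        for b in neighbor_atoms:
            d = math.dist(a, b)
            if d < min_distance:
                score += min_distance - d
    return score


def pick_rotamer(rotamers, neighbor_atoms, min_distance=3.0):
    """Return the name of the candidate rotamer with the lowest clash score.

    `rotamers` maps a rotamer name to a list of (x, y, z) atom positions.
    """
    return min(
        rotamers,
        key=lambda name: clash_score(rotamers[name], neighbor_atoms, min_distance),
    )


# Hypothetical candidates for one side chain and one neighboring atom.
candidates = {
    "r1": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    "r2": [(0.0, 0.0, 0.0), (0.0, 4.0, 0.0)],
}
print(pick_rotamer(candidates, [(2.0, 0.0, 0.0)]))
```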
6. Model refinement– Unfavorable bond angles, bond lengths, and
atom contacts are likely to exist in the preliminary model, so an energy
minimization procedure is applied to refine the model. In this procedure,
atom positions are shifted so that the overall conformation of the entire
structure has the lowest potential energy. Only limited energy minimization
should be applied (a few hundred iterations) so that major errors are
removed but residues are not moved from their correct positions.
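The idea of limited energy minimization can be shown with a toy steepest-descent loop. The sketch assumes an energy made only of harmonic bond-length terms; real force fields also include angle, torsion, and nonbonded terms. The force constant, step size, and the default of 200 steps are illustrative values, not recommendations.

```python
import math


def bond_energy(coords, bonds, k=1.0):
    """Harmonic energy: sum of 0.5 * k * (d - d0)^2 over (i, j, d0) bonds."""
    return sum(0.5 * k * (math.dist(coords[i], coords[j]) - d0) ** 2
               for i, j, d0 in bonds)


def minimize(coords, bonds, steps=200, step_size=0.05, k=1.0):
    """Run a fixed number of steepest-descent steps and return new coordinates."""
    coords = [list(c) for c in coords]           # work on a copy
    for _ in range(steps):
        grads = [[0.0, 0.0, 0.0] for _ in coords]
        for i, j, d0 in bonds:
            d = math.dist(coords[i], coords[j])
            if d == 0.0:
                continue                         # direction undefined; skip this bond
            factor = k * (d - d0) / d            # dE/dd divided by d
            for axis in range(3):
                delta = coords[i][axis] - coords[j][axis]
                grads[i][axis] += factor * delta
                grads[j][axis] -= factor * delta
        for c, g in zip(coords, grads):
            for axis in range(3):
                c[axis] -= step_size * g[axis]   # move against the gradient
    return coords


# Two atoms 2.0 Å apart whose ideal bond length is assumed to be 1.5 Å.
start = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
bonds = [(0, 1, 1.5)]
relaxed = minimize(start, bonds)
print(bond_energy(start, bonds), bond_energy(relaxed, bonds))
```

Capping the number of steps is what keeps the procedure "limited": strained geometry is relieved without letting atoms drift far from their starting positions.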
7. Model evaluation– The model is checked for anomalies in dihedral
angles, bond lengths, and atom contacts.
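Checking dihedral angles starts with computing them. A dihedral is defined by four points; the backbone angle phi, for example, uses the atoms C(i-1), N(i), CA(i), C(i), and psi uses N(i), CA(i), C(i), N(i+1). The sketch below is a plain-Python implementation of the standard atan2 formula; the coordinates in the usage line are made up, and comparing the result against allowed regions is left out.

```python
import math


def dihedral(p0, p1, p2, p3):
    """Dihedral angle in degrees (range -180 to 180) defined by four points."""
    def sub(a, b):
        return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])

    def dot(a, b):
        return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

    b1, b2, b3 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n1 = cross(b1, b2)              # normal of the plane through p0, p1, p2
    n2 = cross(b2, b3)              # normal of the plane through p1, p2, p3
    x = dot(n1, n2)
    y = math.sqrt(dot(b2, b2)) * dot(b1, n2)
    return math.degrees(math.atan2(y, x))


print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```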
Programs for homology modeling:
Many programs for automated homology modeling are now available, so
anyone can construct a homology model on a regular PC. However,
construction of a “good” homology model (at least for sequences that are not
highly similar) usually requires some expertise and usually should be done
with human intervention, rather than in a fully automated fashion.
A few of the freely available programs for homology
modeling:
SWISS-MODEL– Produces accurate models; fast; good tutorials available.
http://swissmodel.expasy.org/
I-TASSER– Produces accurate models; easy to use, but slow
http://zhanglab.ccmb.med.umich.edu/I-TASSER/
Modeller– must be downloaded and installed locally
http://salilab.org/modeller/modeller.html
WHAT IF
http://swift.cmbi.ru.nl/servers/html/index.html
http://swift.cmbi.ru.nl/whatif/
Is a homology model CORRECT?
Since the actual (experimentally determined) structure of the target is not
known, there is no way to say whether or not the homology model is
“correct.” Instead, the best a researcher can do is compare the homology
model to the structure of the template from which it was derived. If the atom
positions in the model do not deviate very much from those of the template,
the homology model is said to be “accurate.” The greater the deviation
between model and template, the lower the accuracy of the model.
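The deviation between model and template is commonly summarized as a root-mean-square deviation (RMSD) over matching atoms. A minimal sketch, assuming the two coordinate lists are already superimposed and list the same atoms in the same order (the superposition step itself is not shown, and the function name is chosen for this example):

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


# Made-up coordinates for two atoms in a model and in its template.
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
template = [(0.0, 0.0, 0.0), (1.0, 2.0, 0.0)]
print(rmsd(model, template))
```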
When is a homology model definitely INCORRECT?
A homology model has regions that are incorrect if it contains structural
features that do not occur in native proteins, such as:
• Hydrophobic side chains on the surface of the model (these side
chains should be buried)
• Unreasonable bond lengths or angles
• Unfavorable noncovalent contacts between atoms (clashes)
• Unreasonable dihedral angles
Accuracy of homology modeling:
The template selection and alignment accuracy are crucial to the accuracy of a homology
model. The accuracy of the model depends on the percentage of sequence identity
between the target and template. The average coordinate agreement between the
modeled structure and the actual structure drops ~0.3 Å for each 10% reduction in
sequence identity.
The largest structural differences between homologous proteins are in surface loops. In
other words, the structure of the protein core is more highly conserved. Therefore, the
regions that are most likely to be in error in a homology model are the surface loops.
High-accuracy homology models can be built when the target and template have 50%
or greater sequence identity. Errors are mostly mistakes in side-chain packing, small
shifts of the core backbone regions, and occasionally larger errors in loops.
Medium-accuracy homology models can be built when the proteins share 30-50%
sequence identity. There can be alignment mistakes, and there are more frequent side-
chain packing, core distortion, and loop modeling errors.
Low-accuracy homology models are based on proteins that share <30% sequence
identity. If a model is based on an almost insignificant alignment to a known structure, the
model may have an entirely incorrect fold.
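The three accuracy classes above translate into a small lookup. The thresholds are the ones given on this slide; the function name and the returned labels are chosen for this example.

```python
def expected_model_accuracy(percent_identity):
    """Map target-template sequence identity (in percent) to the accuracy
    class described above."""
    if not 0 <= percent_identity <= 100:
        raise ValueError("percent identity must be between 0 and 100")
    if percent_identity >= 50:
        return "high"
    if percent_identity >= 30:
        return "medium"
    return "low"


for identity in (72.0, 41.5, 18.0):
    print(identity, expected_model_accuracy(identity))
```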
The best model-building programs will produce models of similar accuracy, provided that
the methods are used optimally.
Stephen James
stephen@macfast.org
9746935363

HOMOLOGY MODELLING.pptxHOMOLOGY MODELLING.pptx
HOMOLOGY MODELLING.pptx
 
Mapping protein to function
Mapping protein to functionMapping protein to function
Mapping protein to function
 
HMM’S INTERPOLATION OF PROTIENS FOR PROFILE ANALYSIS
HMM’S INTERPOLATION OF PROTIENS FOR PROFILE ANALYSISHMM’S INTERPOLATION OF PROTIENS FOR PROFILE ANALYSIS
HMM’S INTERPOLATION OF PROTIENS FOR PROFILE ANALYSIS
 
Performance Improvement of BLAST with Use of MSA Techniques to Search Ancesto...
Performance Improvement of BLAST with Use of MSA Techniques to Search Ancesto...Performance Improvement of BLAST with Use of MSA Techniques to Search Ancesto...
Performance Improvement of BLAST with Use of MSA Techniques to Search Ancesto...
 
Performance Improvement of BLAST with Use of MSA Techniques to Search Ancesto...
Performance Improvement of BLAST with Use of MSA Techniques to Search Ancesto...Performance Improvement of BLAST with Use of MSA Techniques to Search Ancesto...
Performance Improvement of BLAST with Use of MSA Techniques to Search Ancesto...
 

More from Rainu Rajeev

Jsir 59(2) 87 101
Jsir 59(2) 87 101Jsir 59(2) 87 101
Jsir 59(2) 87 101
Rainu Rajeev
 
Areps siddiqui etal 2013
Areps siddiqui etal 2013Areps siddiqui etal 2013
Areps siddiqui etal 2013
Rainu Rajeev
 
Marine microbiology ecology &amp; applications colin munn
Marine microbiology  ecology &amp; applications colin munnMarine microbiology  ecology &amp; applications colin munn
Marine microbiology ecology &amp; applications colin munn
Rainu Rajeev
 
Agricultural biotechnology
Agricultural biotechnologyAgricultural biotechnology
Agricultural biotechnology
Rainu Rajeev
 
Marine board pp17_microcean
Marine board pp17_microceanMarine board pp17_microcean
Marine board pp17_microcean
Rainu Rajeev
 
Bioinformatics final
Bioinformatics finalBioinformatics final
Bioinformatics final
Rainu Rajeev
 

More from Rainu Rajeev (6)

Jsir 59(2) 87 101
Jsir 59(2) 87 101Jsir 59(2) 87 101
Jsir 59(2) 87 101
 
Areps siddiqui etal 2013
Areps siddiqui etal 2013Areps siddiqui etal 2013
Areps siddiqui etal 2013
 
Marine microbiology ecology &amp; applications colin munn
Marine microbiology  ecology &amp; applications colin munnMarine microbiology  ecology &amp; applications colin munn
Marine microbiology ecology &amp; applications colin munn
 
Agricultural biotechnology
Agricultural biotechnologyAgricultural biotechnology
Agricultural biotechnology
 
Marine board pp17_microcean
Marine board pp17_microceanMarine board pp17_microcean
Marine board pp17_microcean
 
Bioinformatics final
Bioinformatics finalBioinformatics final
Bioinformatics final
 

Recently uploaded

Equivariant neural networks and representation theory
Equivariant neural networks and representation theoryEquivariant neural networks and representation theory
Equivariant neural networks and representation theory
Daniel Tubbenhauer
 
Eukaryotic Transcription Presentation.pptx
Eukaryotic Transcription Presentation.pptxEukaryotic Transcription Presentation.pptx
Eukaryotic Transcription Presentation.pptx
RitabrataSarkar3
 
Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...
Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...
Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...
Travis Hills MN
 
Authoring a personal GPT for your research and practice: How we created the Q...
Authoring a personal GPT for your research and practice: How we created the Q...Authoring a personal GPT for your research and practice: How we created the Q...
Authoring a personal GPT for your research and practice: How we created the Q...
Leonel Morgado
 
Randomised Optimisation Algorithms in DAPHNE
Randomised Optimisation Algorithms in DAPHNERandomised Optimisation Algorithms in DAPHNE
Randomised Optimisation Algorithms in DAPHNE
University of Maribor
 
Bob Reedy - Nitrate in Texas Groundwater.pdf
Bob Reedy - Nitrate in Texas Groundwater.pdfBob Reedy - Nitrate in Texas Groundwater.pdf
Bob Reedy - Nitrate in Texas Groundwater.pdf
Texas Alliance of Groundwater Districts
 
原版制作(carleton毕业证书)卡尔顿大学毕业证硕士文凭原版一模一样
原版制作(carleton毕业证书)卡尔顿大学毕业证硕士文凭原版一模一样原版制作(carleton毕业证书)卡尔顿大学毕业证硕士文凭原版一模一样
原版制作(carleton毕业证书)卡尔顿大学毕业证硕士文凭原版一模一样
yqqaatn0
 
Medical Orthopedic PowerPoint Templates.pptx
Medical Orthopedic PowerPoint Templates.pptxMedical Orthopedic PowerPoint Templates.pptx
Medical Orthopedic PowerPoint Templates.pptx
terusbelajar5
 
Basics of crystallography, crystal systems, classes and different forms
Basics of crystallography, crystal systems, classes and different formsBasics of crystallography, crystal systems, classes and different forms
Basics of crystallography, crystal systems, classes and different forms
MaheshaNanjegowda
 
Describing and Interpreting an Immersive Learning Case with the Immersion Cub...
Describing and Interpreting an Immersive Learning Case with the Immersion Cub...Describing and Interpreting an Immersive Learning Case with the Immersion Cub...
Describing and Interpreting an Immersive Learning Case with the Immersion Cub...
Leonel Morgado
 
如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样
如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样
如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样
yqqaatn0
 
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
vluwdy49
 
Shallowest Oil Discovery of Turkiye.pptx
Shallowest Oil Discovery of Turkiye.pptxShallowest Oil Discovery of Turkiye.pptx
Shallowest Oil Discovery of Turkiye.pptx
Gokturk Mehmet Dilci
 
8.Isolation of pure cultures and preservation of cultures.pdf
8.Isolation of pure cultures and preservation of cultures.pdf8.Isolation of pure cultures and preservation of cultures.pdf
8.Isolation of pure cultures and preservation of cultures.pdf
by6843629
 
bordetella pertussis.................................ppt
bordetella pertussis.................................pptbordetella pertussis.................................ppt
bordetella pertussis.................................ppt
kejapriya1
 
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
Sérgio Sacani
 
aziz sancar nobel prize winner: from mardin to nobel
aziz sancar nobel prize winner: from mardin to nobelaziz sancar nobel prize winner: from mardin to nobel
aziz sancar nobel prize winner: from mardin to nobel
İsa Badur
 
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
University of Maribor
 
Applied Science: Thermodynamics, Laws & Methodology.pdf
Applied Science: Thermodynamics, Laws & Methodology.pdfApplied Science: Thermodynamics, Laws & Methodology.pdf
Applied Science: Thermodynamics, Laws & Methodology.pdf
University of Hertfordshire
 
Topic: SICKLE CELL DISEASE IN CHILDREN-3.pdf
Topic: SICKLE CELL DISEASE IN CHILDREN-3.pdfTopic: SICKLE CELL DISEASE IN CHILDREN-3.pdf
Topic: SICKLE CELL DISEASE IN CHILDREN-3.pdf
TinyAnderson
 

Recently uploaded (20)

Equivariant neural networks and representation theory
Equivariant neural networks and representation theoryEquivariant neural networks and representation theory
Equivariant neural networks and representation theory
 
Eukaryotic Transcription Presentation.pptx
Eukaryotic Transcription Presentation.pptxEukaryotic Transcription Presentation.pptx
Eukaryotic Transcription Presentation.pptx
 
Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...
Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...
Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...
 
Authoring a personal GPT for your research and practice: How we created the Q...
Authoring a personal GPT for your research and practice: How we created the Q...Authoring a personal GPT for your research and practice: How we created the Q...
Authoring a personal GPT for your research and practice: How we created the Q...
 
Randomised Optimisation Algorithms in DAPHNE
Randomised Optimisation Algorithms in DAPHNERandomised Optimisation Algorithms in DAPHNE
Randomised Optimisation Algorithms in DAPHNE
 
Bob Reedy - Nitrate in Texas Groundwater.pdf
Bob Reedy - Nitrate in Texas Groundwater.pdfBob Reedy - Nitrate in Texas Groundwater.pdf
Bob Reedy - Nitrate in Texas Groundwater.pdf
 
原版制作(carleton毕业证书)卡尔顿大学毕业证硕士文凭原版一模一样
原版制作(carleton毕业证书)卡尔顿大学毕业证硕士文凭原版一模一样原版制作(carleton毕业证书)卡尔顿大学毕业证硕士文凭原版一模一样
原版制作(carleton毕业证书)卡尔顿大学毕业证硕士文凭原版一模一样
 
Medical Orthopedic PowerPoint Templates.pptx
Medical Orthopedic PowerPoint Templates.pptxMedical Orthopedic PowerPoint Templates.pptx
Medical Orthopedic PowerPoint Templates.pptx
 
Basics of crystallography, crystal systems, classes and different forms
Basics of crystallography, crystal systems, classes and different formsBasics of crystallography, crystal systems, classes and different forms
Basics of crystallography, crystal systems, classes and different forms
 
Describing and Interpreting an Immersive Learning Case with the Immersion Cub...
Describing and Interpreting an Immersive Learning Case with the Immersion Cub...Describing and Interpreting an Immersive Learning Case with the Immersion Cub...
Describing and Interpreting an Immersive Learning Case with the Immersion Cub...
 
如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样
如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样
如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样
 
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
 
Shallowest Oil Discovery of Turkiye.pptx
Shallowest Oil Discovery of Turkiye.pptxShallowest Oil Discovery of Turkiye.pptx
Shallowest Oil Discovery of Turkiye.pptx
 
8.Isolation of pure cultures and preservation of cultures.pdf
8.Isolation of pure cultures and preservation of cultures.pdf8.Isolation of pure cultures and preservation of cultures.pdf
8.Isolation of pure cultures and preservation of cultures.pdf
 
bordetella pertussis.................................ppt
bordetella pertussis.................................pptbordetella pertussis.................................ppt
bordetella pertussis.................................ppt
 
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
 
aziz sancar nobel prize winner: from mardin to nobel
aziz sancar nobel prize winner: from mardin to nobelaziz sancar nobel prize winner: from mardin to nobel
aziz sancar nobel prize winner: from mardin to nobel
 
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
 
Applied Science: Thermodynamics, Laws & Methodology.pdf
Applied Science: Thermodynamics, Laws & Methodology.pdfApplied Science: Thermodynamics, Laws & Methodology.pdf
Applied Science: Thermodynamics, Laws & Methodology.pdf
 
Topic: SICKLE CELL DISEASE IN CHILDREN-3.pdf
Topic: SICKLE CELL DISEASE IN CHILDREN-3.pdfTopic: SICKLE CELL DISEASE IN CHILDREN-3.pdf
Topic: SICKLE CELL DISEASE IN CHILDREN-3.pdf
 

Protein structure 2

  • 2. Protein Structural Bioinformatics Definition: The subdiscipline of bioinformatics that focuses on the representation, storage, retrieval, analysis, and display of structural information at the atomic and subcellular spatial scales. (From Structural Bioinformatics, by P.E. Bourne & H. Weissig (eds.), John Wiley & Sons, Inc., 2003, p. 4.) Why is STRUCTURAL bioinformatics important? Because a protein’s function is determined by its structure. Knowledge of a protein’s structure is necessary in order to gain a full understanding of the biological role of a protein.
  • 3. Bioinformatics methods can be used to analyze protein structural data in the following ways: • Visualization of protein structures • Alignment of protein structures • Classification of proteins into families, based on similarity of their structures • Prediction of protein structures • Simulation of protein folding and dynamic motions
  • 4. Protein structure determination by X-ray crystallography or NMR is difficult (see PowerPoint slides from last module). It takes 1-3 years to solve a protein structure by these methods. Certain proteins, such as membrane proteins, are extremely difficult or impossible to solve by these methods. Due to genomic sequencing efforts, the gap between known protein sequences and known protein structures is increasing: only about 3,000 unique protein structures have been determined, but over 1 million unique sequences have been determined. Therefore, it is necessary to use bioinformatics methods to predict the structures of proteins for which a crystal structure or NMR structure has not been determined. Bioinformatics methods can predict: (1) secondary structural elements in a protein sequence; (2) the tertiary structure of the entire sequence; (3) “special” structures, such as transmembrane α-helices, transmembrane β-barrels, coiled coils, and leucine zippers.
  • 5. Protein Secondary Structure Prediction. All secondary structure prediction is based on the assumption that there should be a correlation between amino acid sequence and secondary structure; in other words, it is assumed that certain stretches of amino acids are more likely to form one type of secondary structure than another. During secondary structure prediction, the conformational state of each residue in a protein sequence is predicted; generally each residue is predicted as having one of three possible states: (1) α-helical structure, (2) β-strand, (3) “other” (β-turn, loop, or random coil). Sometimes β-turn is separated as a 4th state. Why is prediction of secondary structure useful? It can help guide sequence alignment or improve existing sequence alignment of distantly related sequences. It is also an intermediate step in some methods for tertiary structure prediction.
  • 6. Methods of secondary structure prediction fall into two broad classes. Ab initio methods: predict secondary structure based solely on protein sequence; these methods compute statistics for the residues that occur in different secondary structural elements in proteins with known structures, in order to identify “patterns” in the types of residues that occur in a given type of secondary structure. Homology-based methods: make use of multiple sequence alignments of homologous proteins to predict secondary structure; these methods are able to locate conserved patterns that are characteristic of particular secondary structural elements across the aligned family members.
  • 7. Certain amino acids are observed more frequently than others in α-helices, β-strands, and β-turns in crystal structures (see Figure). This leads to the idea that each amino acid tends to “prefer” being constrained in a certain type of secondary structure, or has an “intrinsic propensity” to adopt that secondary structure. Fig. 4-10 from Lehninger Principles of Biochemistry, 4th ed. The figure shows that: Glu, Met, Ala are most frequent in α-helices; Val, Tyr, Ile are most frequent in β-strands; Pro, Gly, Asn are most frequent in β-turns. Based on these data, it is believed that Glu has a high α-helical propensity, but a low β-strand propensity.
  • 8. Ab initio methods of secondary structure prediction: • These methods calculate the relative propensity (intrinsic tendency) of each amino acid in a protein sequence to belong to a certain secondary structural element. • Propensity scores for the 20 amino acids are derived from known protein structures: these propensities are calculated from the relative frequency of a given amino acid within the proteins, its frequency in a given type of secondary structure, and the fraction of all amino acids occurring in that type of secondary structure. • Stretches of a protein’s sequence that contain many residues with a high α-helical propensity are predicted to fold into α-helices. Stretches of sequence that contain many residues with a high β-strand propensity are predicted to fold into β-strands. • Two examples: Chou-Fasman method, GOR method
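The propensity calculation described on this slide can be sketched as follows. This is a minimal illustration, assuming one common form of the score, P(residue, state) = (fraction of that residue found in the state) / (fraction of all residues found in the state). The training data, the function names, and the six-residue window are made up for the example and are not taken from the Chou-Fasman or GOR publications.

```python
from collections import Counter

# Toy training set (made up for illustration): each entry pairs a sequence
# with a per-residue state string, H = helix, E = strand, C = other.
TRAINING = [
    ("AEELLKKA", "HHHHHHHC"),
    ("VIVTYGNP", "EEEEECCC"),
    ("MAEVKGIY", "HHHCCEEE"),
]

def propensities(examples, state):
    """Return {residue: propensity} for one secondary-structure state."""
    residue_total = Counter()      # occurrences of each residue overall
    residue_in_state = Counter()   # occurrences of each residue in `state`
    n_total = 0
    n_state = 0
    for seq, ss in examples:
        if len(seq) != len(ss):
            raise ValueError("sequence and state string differ in length")
        for aa, s in zip(seq, ss):
            residue_total[aa] += 1
            n_total += 1
            if s == state:
                residue_in_state[aa] += 1
                n_state += 1
    if n_state == 0:
        raise ValueError("state never occurs in the training data")
    fraction_state = n_state / n_total
    return {aa: (residue_in_state[aa] / residue_total[aa]) / fraction_state
            for aa in residue_total}

def window_score(seq, table, start, width=6):
    """Average propensity over a window; unknown residues count as 1.0."""
    window = seq[start:start + width]
    if not window:
        raise ValueError("window is empty")
    return sum(table.get(aa, 1.0) for aa in window) / len(window)

helix = propensities(TRAINING, "H")
print(round(helix["E"], 2), round(helix["V"], 2))
```

A value above 1 means the residue occurs in the state more often than an average residue does; a stretch whose window score stays well above 1 would be a candidate helix under this simplified scheme.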
  • 9. Accuracy of ab initio methods: • These methods are not very accurate: • Chou-Fasman method, 50%-60% accuracy • GOR method, 64% accuracy, drastically underpredicts β-strands • These methods are only a little better than randomly assigning secondary structure! Known proteins consist of ~31% α-helix and ~28% β-sheet, so randomly assigning secondary structural elements to residues would result in ~30% accuracy. • Specific problems with these methods: • Tend to underpredict the lengths of α-helices and β-strands, because they do not identify the first and last residues of helices and strands very well • Tend to miss β-strands completely
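Per-residue accuracy figures like those quoted above are usually computed as the fraction of residues whose predicted state equals the observed state (commonly called Q3 for three-state predictions). A minimal sketch, with made-up state strings:

```python
def q3(predicted, observed):
    """Fraction of residues whose predicted state matches the observed state."""
    if len(predicted) != len(observed):
        raise ValueError("state strings must have the same length")
    if not observed:
        raise ValueError("empty input")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)

# Hypothetical example: H = helix, E = strand, C = other.
observed = "CHHHHHHCCEEEEC"
predicted = "CCHHHHCCCCEEEC"
print(f"Q3 = {q3(predicted, observed):.2f}")
```

Note how the prediction in this example clips the ends of both the helix and the strand, which is the kind of error the slide describes.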
  • 10. A few homology-based secondary structure prediction methods: Neural network methods: PROFsec (an improved version of PHDsec), http://www.predictprotein.org/; PSIPRED, http://bioinf.cs.ucl.ac.uk/psipred/; SSpro (newest version is 4.0), http://scratch.proteomics.ics.uci.edu/; SAM-T (SAM-T08 is the newest version; SAM-T06, SAM-T02, and SAM-T99 are older versions), http://compbio.soe.ucsc.edu/SAM_T08/T08-query.html. Nearest-neighbor methods: NNSSP (no longer available online); PREDATOR, http://mobyle.pasteur.fr/cgi-bin/portal.py?#forms::predator. HMM methods: HMMSTR, http://www.bioinfo.rpi.edu/~bystrc/hmmstr/server.php
  • 11. A few methods for predicting transmembrane α-helices: TMHMM, http://www.cbs.dtu.dk/services/TMHMM/; HMMTOP, http://www.enzim.hu/hmmtop/index.html; Phobius (also predicts presence of signal peptides), http://phobius.sbc.su.se/; TopPred, http://mobyle.pasteur.fr/cgi-bin/portal.py?#forms::toppred; PRED-TMR, http://athina.biol.uoa.gr/PRED-TMR/; DAS, http://mendel.imp.ac.at/sat/DAS/DAS.html; TMpred, http://www.ch.embnet.org/software/TMPRED_form.html; MEMSAT, http://bioinf.cs.ucl.ac.uk/psipred/. Accuracies of the methods: Levels of accuracy are reported by the developers to be in the range of 75-95%. At least one study (2001) found TMHMM to be the best performing program. It is best to use several methods and compare the results to arrive at a consensus prediction. When different methods, specifically methods that are based on different algorithms, give similar results, the reliability of the results is higher.
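One simple way to form the consensus prediction recommended above is a per-residue majority vote. The sketch below assumes each method's output has already been converted to a string of per-residue labels of equal length; the label alphabet (M, i, o) and the three example outputs are hypothetical and do not reproduce the output format of any listed server.

```python
from collections import Counter

def consensus(predictions, min_agree=None):
    """Per-residue majority vote over equal-length prediction strings.

    A position is labelled '?' when no state reaches `min_agree` votes
    (default: a strict majority of the methods).
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("predictions must have the same length")
    if min_agree is None:
        min_agree = len(predictions) // 2 + 1
    result = []
    for column in zip(*predictions):
        state, votes = Counter(column).most_common(1)[0]
        result.append(state if votes >= min_agree else "?")
    return "".join(result)

# Hypothetical outputs of three methods:
# M = membrane helix, i = inside, o = outside.
method_a = "iiiMMMMMMooo"
method_b = "iiMMMMMMMooo"
method_c = "iiiiMMMMMMoo"
print(consensus([method_a, method_b, method_c]))
```

Positions marked `?` are where the methods disagree and deserve manual inspection, typically the ends of predicted helices.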
  • 12. Tertiary structure prediction methods fall into three classes: (1) Homology modeling (also called comparative modeling): A structure is built based on the known structure of another protein that is similar in sequence (a homolog). (2) Threading (also called structural fold recognition): A structure is predicted for a protein by “threading” its sequence through a variety of known structures to determine which structure the sequence best fits. (3) Ab initio prediction (also called de novo prediction): A structure is predicted based only on the amino acid sequence of the protein, using the physicochemical properties of its residues and the principles governing protein folding.
  • 13. Homology modeling for tertiary structure prediction: Homology modeling is based on the idea that if two proteins share a high degree of sequence similarity (i.e., they are close homologs), they are likely to have very similar 3D structures. In general, proteins that share >30% sequence identity are likely to be quite similar in structure. Therefore, if a protein of unknown structure is similar in sequence to a protein of known structure, the known structure can be used as a template to which the unknown sequence is fit. The structure that is built for the unknown sequence is then called a homology model for the structure of that sequence. The “safe homology modeling zone,” above the gray curve, is the region where two proteins are likely to have the same structure. Fig. 5 from R. Nair & B. Rost, Protein Science (2002) 11: 2836-47.
  • 14. Steps in homology modeling for tertiary structure prediction: The protein of unknown structure for which a structural model is to be built will be called the “target sequence.” 1. Template selection: Identify protein(s) in the PDB that are homologous to the target sequence using BLAST or PSI-BLAST. If a close homolog with known structure is found, its structure will serve as a template to which the target sequence will be matched. The template should have at least 30% sequence identity with the target. (Proteins that share less than 30% sequence identity may not be similar enough in structure to carry out homology modeling.) If PSI-BLAST does not identify a suitable template, it will probably be necessary to construct a structural model by threading. It is possible to use multiple templates if more than one good template is identified. When multiple templates are available, it is best to use more than one template to avoid biasing the model toward a single protein. The template used in the next step of homology modeling will then be an averaged structure based on all of the chosen templates.
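The 30% identity criterion can be checked directly from a pairwise alignment. The sketch below uses the fragment alignment shown on the backbone-building slide as sample input. It assumes identity is counted over aligned residue pairs only (columns with a gap are skipped), which is one of several conventions in use, and a short fragment like this says nothing reliable about a full-length protein.

```python
def percent_identity(aligned_target, aligned_template, gap="-"):
    """Return (identical, aligned, percent) for two rows of one alignment.

    Columns where either row holds a gap are skipped, so the percentage
    is taken over aligned residue pairs only.
    """
    if len(aligned_target) != len(aligned_template):
        raise ValueError("alignment rows must have the same length")
    identical = 0
    aligned = 0
    for t, p in zip(aligned_target, aligned_template):
        if t == gap or p == gap:
            continue
        aligned += 1
        if t == p:
            identical += 1
    if aligned == 0:
        raise ValueError("no aligned residue pairs")
    return identical, aligned, 100.0 * identical / aligned

# Fragment alignment from the backbone-building slide.
TARGET = "FKSQAAIHEAYCNFHYKVTAAASRTPEIDFDVHFSSIF"
TEMPLATE = "FKQQANIHCAYCNGAYKIG-------GKELQVHFSWLF"

MIN_IDENTITY = 30.0  # threshold stated in the slides
identical, aligned, pct = percent_identity(TARGET, TEMPLATE)
print(f"{identical}/{aligned} identical = {pct:.1f}%")
print("usable template:", pct >= MIN_IDENTITY)
```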
  • 15. Steps in homology modeling for tertiary structure prediction: 2. Sequence alignment: Construct a multiple sequence alignment of the target, the template, and other homologous sequences. It is actually the alignment of the target and template that is of interest, but the inclusion of other homologs provides more information, helping to ensure that the best alignment of homologous residues is achieved. The quality of the target-template alignment is critical for constructing an accurate structural model for the target. If a given residue in the target is not aligned with the proper residue in the template, the error cannot be corrected in later steps of model building. A robust multiple sequence alignment program should be used for this step, and the resulting alignment should be very carefully examined and manually refined if necessary.
  • 16. Steps in homology modeling for tertiary structure prediction: 3. Backbone model building: Residues in the aligned regions of the target and template are assumed to adopt the same structure. Therefore, the backbone atoms of these residues in the target can be placed in the same 3D location as the backbone atoms of these residues in the template. See the alignment below as an example. Target: ...FKSQAAIHEAYCNFHYKVTAAASRTPEIDFDVHFSSIF... Template: ...FKQQANIHCAYCNGAYKIG-------GKELQVHFSWLF... For the aligned residues, backbone atoms of the target are assumed to occupy the same 3D location as those of the template. Target F aligned with template F: they are identical, so all atoms of target F will overlap the 3D positions of all atoms of template F. Target D aligned with template E: they are not identical, but their backbone atoms can be assumed to occupy the same 3D position, so the backbone atoms of target D will overlap the 3D positions of the backbone atoms of template E.
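The backbone-copying step can be sketched as a walk along the alignment that hands each aligned target residue the backbone coordinates of its template partner. Everything below is illustrative: the function name, the dictionary layout, the five-residue alignment, and the coordinates are made up, and a real modeling program does considerably more than this.

```python
def copy_backbone(aligned_target, aligned_template, template_backbone, gap="-"):
    """Map target residue index -> backbone coordinates taken from the template.

    `template_backbone` is a list indexed by template residue; each item is a
    dict such as {"N": (x, y, z), "CA": (x, y, z), "C": ..., "O": ...}.
    Target residues that face a template gap get None (to be built as loops).
    """
    if len(aligned_target) != len(aligned_template):
        raise ValueError("alignment rows must have the same length")
    model = {}
    t_idx = 0  # running residue index in the target
    p_idx = 0  # running residue index in the template
    for t_char, p_char in zip(aligned_target, aligned_template):
        if t_char != gap and p_char != gap:
            model[t_idx] = dict(template_backbone[p_idx])
        elif t_char != gap:
            model[t_idx] = None
        if t_char != gap:
            t_idx += 1
        if p_char != gap:
            p_idx += 1
    return model

# Made-up coordinates for a three-residue template (not real geometry).
toy_template = [
    {"N": (3.8 * i, 0.0, 0.0), "CA": (3.8 * i + 1.5, 0.0, 0.0),
     "C": (3.8 * i + 2.5, 1.0, 0.0), "O": (3.8 * i + 2.5, 2.2, 0.0)}
    for i in range(3)
]

model = copy_backbone("FKAAD", "FK--E", toy_template)
print(model[0]["CA"], model[2], model[4]["CA"])
```

In the toy alignment the final target residue (D) takes its backbone from the final template residue (E), mirroring the D/E pair discussed on the slide, while the two unaligned target residues are left for loop building.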
  • 17. Steps in homology modeling for tertiary structure prediction: 4. Loop building: There are likely to be regions in the alignment where gaps appear because the target sequence does not match the template. The target sequence residues in these gap regions are assumed to form a loop that is not present in the template structure. The structure of this loop can be built using several different methods. In any case, it is a difficult problem since the template provides no information to guide the building of the loop structure. Target: ...FKSQAAIHEAYCNFHYKVTAAASRTPEIDFDVHFSSIF... Template: ...FKQQANIHCAYCNGAYKIG-------GKELQVHFSWLF... The “extra” residues in the target sequence (AAASRTP) do not match the template and are assumed to form a loop.
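Locating the residues that need loop building amounts to finding runs of target residues that face gaps in the template. A minimal sketch using the alignment above; the function name and the half-open index convention are choices made for this example.

```python
def loop_segments(aligned_target, aligned_template, gap="-"):
    """Return half-open (start, end) target-residue ranges facing template gaps."""
    if len(aligned_target) != len(aligned_template):
        raise ValueError("alignment rows must have the same length")
    segments = []
    t_idx = 0     # running residue index in the target
    start = None  # start of the loop currently being collected
    for t_char, p_char in zip(aligned_target, aligned_template):
        in_loop = t_char != gap and p_char == gap
        if in_loop and start is None:
            start = t_idx
        elif not in_loop and start is not None:
            segments.append((start, t_idx))
            start = None
        if t_char != gap:
            t_idx += 1
    if start is not None:
        segments.append((start, t_idx))
    return segments

TARGET = "FKSQAAIHEAYCNFHYKVTAAASRTPEIDFDVHFSSIF"
TEMPLATE = "FKQQANIHCAYCNGAYKIG-------GKELQVHFSWLF"

for start, end in loop_segments(TARGET, TEMPLATE):
    print(start, end, TARGET[start:end])
```

Slicing `TARGET` by the returned indices works here only because the target row contains no gaps; with gaps in the target row the indices refer to the ungapped target sequence.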
  • 18. Steps in homology modeling for tertiary structure prediction: 5. Side chain addition: The side chains are added to the backbone structure. Each side chain could potentially have many possible conformations (rotamers) due to bond rotation, but steric clashes with neighboring atoms are not allowed. Therefore, the side chain conformations that have the lowest interaction energy with nearby atoms are chosen. Target: ...FKSQAAIHEAYCNFHYKVTAAASRTPEIDFDVHFSSIF... Template: ...FKQQANIHCAYCNGAYKIG-------GKELQVHFSWLF... Where target and template are both F, all atoms of the target side chain can be modeled as having the same 3D positions as the template side chain, at least initially. (Small changes in position may be necessary in later refinement steps.) Where target and template have different side chains (D vs. E), the side chain rotamer that is chosen for the target D must not overlap/clash with any neighboring atoms.
  • 19. Steps in homology modeling for tertiary structure prediction: 6. Model refinement: Unfavorable bond angles, bond lengths, and atom contacts are likely to exist in the preliminary model, so an energy minimization procedure is applied to refine the model. In this procedure, atom positions are shifted so that the overall conformation of the entire structure has the lowest potential energy. Only limited energy minimization should be applied (a few hundred iterations) so that major errors are removed but residues are not moved from their correct positions. 7. Model evaluation: The model is checked for anomalies in dihedral angles, bond lengths, and atom contacts.
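The idea of limited energy minimization can be illustrated with a toy model: a chain of points whose only energy term is a harmonic penalty on the distance between consecutive points, relaxed by a fixed number of steepest-descent steps. This is a sketch under strong simplifying assumptions. Real refinement uses a full molecular force field (bond angles, dihedrals, nonbonded contacts); the ideal distance of 3.8, the force constant, the step size, and the starting coordinates are all made-up values.

```python
import math

def bond_energy(coords, ideal=3.8, k=1.0):
    """Sum of harmonic penalties on distances between consecutive points."""
    return sum(0.5 * k * (math.dist(a, b) - ideal) ** 2
               for a, b in zip(coords, coords[1:]))

def minimize(coords, ideal=3.8, k=1.0, step=0.05, max_iter=200):
    """Steepest descent with a fixed step and a hard cap on iterations."""
    coords = [list(p) for p in coords]
    for _ in range(max_iter):
        grads = [[0.0, 0.0, 0.0] for _ in coords]
        for i in range(len(coords) - 1):
            a, b = coords[i], coords[i + 1]
            d = math.dist(a, b)
            if d == 0.0:
                continue  # direction undefined for coincident points
            factor = k * (d - ideal) / d
            for axis in range(3):
                g = factor * (a[axis] - b[axis])
                grads[i][axis] += g      # dE/da
                grads[i + 1][axis] -= g  # dE/db
        for p, g in zip(coords, grads):
            for axis in range(3):
                p[axis] -= step * g[axis]
    return [tuple(p) for p in coords]

# Made-up starting geometry with distorted spacings.
start = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (7.5, 0.5, 0.0), (10.0, 0.0, 0.0)]
e_before = bond_energy(start)
relaxed = minimize(start)
e_after = bond_energy(relaxed)
print(f"energy before: {e_before:.3f}, after: {e_after:.5f}")
```

The iteration cap plays the role of the "few hundred iterations" limit on the slide: the distorted spacings are repaired while the points move only a short distance from where they started.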
  • 20. Programs for homology modeling: Many programs for automated homology modeling are now available, so anyone can construct a homology model on a regular PC. However, construction of a “good” homology model (at least for sequences that are not highly similar) usually requires some expertise and usually should be done with human intervention, rather than in a fully automated fashion. A few of the freely available programs for homology modeling: SWISS-MODEL: produces accurate models; fast; good tutorials available, http://swissmodel.expasy.org/. I-TASSER: produces accurate models; easy to use, but slow, http://zhanglab.ccmb.med.umich.edu/I-TASSER/. Modeller: must be downloaded and installed locally, http://salilab.org/modeller/modeller.html. WHAT IF: http://swift.cmbi.ru.nl/servers/html/index.html and http://swift.cmbi.ru.nl/whatif/
  • 21. Is a homology model CORRECT? Since the actual (experimentally determined) structure of the target is not known, there is no way to say whether or not the homology model is “correct.” Instead, the best a researcher can do is compare the homology model to the structure of the template from which it was derived. If the atom positions in the model do not deviate very much from those of the template, the homology model is said to be “accurate.” The greater the deviation between model and template, the lower the accuracy of the model. When is a homology model definitely INCORRECT? A homology model has regions that are incorrect if it contains structural features that do not occur in native proteins, such as: • Hydrophobic side chains on the surface of the model (these side chains should be buried) • Unreasonable bond lengths or angles • Unfavorable noncovalent contacts between atoms (clashes) • Unreasonable dihedral angles
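The deviation between model and template atom positions is commonly summarized as a root-mean-square deviation (RMSD) over matched atoms. The sketch below assumes the two coordinate lists are already superimposed and listed in the same atom order; it does not perform the optimal superposition that structure-comparison programs apply first, and the coordinates are made up.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched coordinate lists."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    if not coords_a:
        raise ValueError("empty input")
    total = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))

# Made-up C-alpha positions: the model is shifted by 1 unit along z.
template_ca = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model_ca = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0), (7.6, 0.0, 1.0)]
print(f"RMSD = {rmsd(model_ca, template_ca):.2f}")
```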
  • 22. Accuracy of homology modeling: The template selection and alignment accuracy are crucial to the accuracy of a homology model. The accuracy of the model depends on the percentage of sequence identity between the target and template. The average coordinate agreement between the modeled structure and the actual structure drops ~0.3 Å for each 10% reduction in sequence identity. The largest structural differences between homologous proteins are in surface loops. In other words, the structure of the protein core is more highly conserved. Therefore, the regions that are most likely to be in error in a homology model are the surface loops. High-accuracy homology models can be built when the target and template have 50% or greater sequence identity. Errors are mostly mistakes in side-chain packing, small shifts of the core backbone regions, and occasionally larger errors in loops. Medium-accuracy homology models can be built when the proteins share 30-50% sequence identity. There can be alignment mistakes, and there are more frequent side-chain packing, core distortion, and loop modeling errors. Low-accuracy homology models are based on proteins that share <30% sequence identity. If a model is based on an almost insignificant alignment to a known structure, the model may have an entirely incorrect fold. The best model-building programs will produce models of similar accuracy, provided that the methods are used optimally.
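The identity ranges and the rule of thumb on this slide translate into two small helper functions. The function names are made up, and the linear estimate simply restates the "~0.3 Å per 10% identity" figure; it is a rough guide, not a predictor of the error of any particular model.

```python
def accuracy_class(percent_identity):
    """Classify expected model accuracy from target-template identity (%)."""
    if not 0 <= percent_identity <= 100:
        raise ValueError("percent identity must lie between 0 and 100")
    if percent_identity >= 50:
        return "high"
    if percent_identity >= 30:
        return "medium"
    return "low"

def extra_deviation(identity_from, identity_to, per_10_percent=0.3):
    """Rule-of-thumb loss of coordinate agreement (in Å) when identity drops."""
    drop = identity_from - identity_to
    if drop < 0:
        raise ValueError("identity_to must not exceed identity_from")
    return per_10_percent * (drop / 10.0)

for identity in (75, 40, 20):
    print(identity, accuracy_class(identity))
print(f"80% -> 50%: about {extra_deviation(80, 50):.1f} Å worse agreement")
```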