Protein Structure Modeling
Learning Objectives
• At the end of this topic the student should be
able to:
(i) Perform computer-aided protein structure
determination and validation
(ii) Design docking experiments
(iii) Demonstrate understanding of protein
interactions and simulations
Computer-Aided Protein Structure
Prediction
Protein
Sequence +
Structure
Recap
Structure prediction aims at building models of the 3D structures of proteins that
can be useful for understanding structure-function relationships.
STAT115 5
Protein Structure Levels
Folded structure:
lowest energy state
Watch Video
http://www.youtube.com/watch?v=fvBO3TqJ6FE&feature=fvw
Protein Structure Determination &
Prediction
• Experimental Techniques
– X-ray Crystallography
– NMR
• Limitations of Current Experimental
Techniques
– Protein Data Bank (PDB) -> 30,000 protein structures
– Unique structures number only between 4,000 and 5,000
– Non-redundant (NR) -> 10,000,000 protein sequences
• Importance of Structure Prediction
– Fill gap between known sequence and structures
– Protein engineering: to alter the function of a protein
– Rational Drug Design
Protein Structure Prediction
• Since AA sequence determines structure, can we
predict protein structure from its AA sequence?
– predicting the three backbone dihedral angles (φ, ψ, ω) of every residue: a huge number of degrees of freedom (DoF)!
• Physical properties that determine fold
– Rigidity of the protein backbone
– Interactions among amino acids, including
• Electrostatic interactions
• van der Waals forces
• Volume constraints
• Hydrogen, disulfide bonds
– Interactions of amino acids with water
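The "three angles" above are taken here to be the backbone dihedral angles φ, ψ and ω (an assumption; the slide does not name them). As a minimal sketch, a dihedral angle can be computed from four atom positions with the standard atan2 formula. The function name and the tuple-based coordinate format are illustrative choices, not part of the original material.

```python
import math


def _sub(a, b):
    """Vector a - b for 3D points given as (x, y, z) tuples."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(a, b):
    """Cross product of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a, b):
    """Dot product of two 3D vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def dihedral_degrees(p0, p1, p2, p3):
    """Dihedral angle in degrees defined by four points.

    For a protein backbone the four points would be, for example,
    C(i-1), N(i), CA(i), C(i) for phi and N(i), CA(i), C(i), N(i+1) for psi.
    """
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)  # normal of the plane through p1, p2, p3
    x = _dot(n1, n2)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    return math.degrees(math.atan2(y, x))
```

With one φ and one ψ per residue (ω is usually close to planar), a chain of N residues already has on the order of 2N freely varying angles, which is why the search space grows so quickly.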
Why Protein Modeling
• The goal of research in the area of structural genomics is to provide the means to characterize and identify the large number of protein sequences that are being discovered
• Knowledge of the three-dimensional structure
– helps in the rational design of site-directed mutations
– can be of great importance for the design of drugs
– greatly enhances our understanding of how proteins function and how they interact with each other; for example, it can explain antigenic behaviour, DNA-binding specificity, etc.
Computational methods for Protein Structure Prediction
• Homology or comparative modeling
• Fold recognition or threading methods
• Ab initio methods that utilize knowledge-based information
• Ab initio methods without the aid of knowledge-based information
Why do we need computational approaches?
• Structural information from X-ray crystallography or NMR is obtained much more slowly than sequence data:
– the techniques involve elaborate technical procedures
– many proteins fail to crystallize at all and/or cannot be obtained or dissolved in large enough quantities for NMR measurements
– the size of the protein is also a limiting factor for NMR
• With a good computational method, a structural model can be obtained much faster.
Tertiary Structure Prediction Methods
Flowchart:
• Start from any given protein sequence
• Compare sequence identity with proteins that have a solved structure
– > 35% identity: homology modeling
– < 35% identity: fold recognition
– < 35% identity: ab initio folding
• Structure selection
• Structure refinement
• Final structure
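The branching in this flowchart can be sketched as a small function. This is only an illustration of the decision rule shown on the slide: the function name is hypothetical, the 35% cutoff is the slide's value, and treating exactly 35% as the low-identity branch is an assumption, since the slide does not say.

```python
def choose_prediction_method(best_identity_percent, threshold=35.0):
    """Pick a modeling route from the best sequence identity to a solved structure.

    best_identity_percent: highest % identity between the target sequence and
    any protein of known structure (how it is obtained is outside this sketch).
    threshold: the 35% cutoff shown in the flowchart.
    """
    if best_identity_percent > threshold:
        return "homology modeling"
    # Below the cutoff the flowchart offers two alternatives.
    return "fold recognition or ab initio folding"
```

For example, a target whose best template shares 60% identity would be routed to homology modeling, while one at 20% would fall to fold recognition or ab initio folding.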
Homology Modeling
• Predicts the three-dimensional structure of a
given protein sequence (TARGET) based on an
alignment to one or more known protein
structures (TEMPLATES)
• If similarity between the TARGET sequence and
the TEMPLATE sequence is detected, structural
similarity can be assumed.
• In general, at least 30% sequence identity is required for
generating useful models.
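Since the usefulness of a model is tied to sequence identity, it helps to see how that number can be computed from a pairwise alignment. The sketch below assumes the two sequences are already aligned, use "-" for gaps, and counts identity over aligned (non-gap) positions only. Other conventions for the denominator exist, so this is one possible definition rather than the definition.

```python
def percent_identity(aligned_target, aligned_template, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where either sequence has a gap are ignored (an assumed convention).
    """
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have the same length")
    aligned_pairs = 0
    matches = 0
    for a, b in zip(aligned_target, aligned_template):
        if a == gap or b == gap:
            continue  # skip gapped columns
        aligned_pairs += 1
        if a == b:
            matches += 1
    if aligned_pairs == 0:
        return 0.0
    return 100.0 * matches / aligned_pairs
```

Calling `percent_identity("ACDE-G", "ACDFHG")` compares five aligned positions, four of which match.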
Homology Modeling Process
• Template recognition
• Alignment
• Determining structurally conserved regions
• Backbone generation
• Building loops or variable regions
• Conformational search for side chains
• Refinement of structure
• Validating structures
Template Recognition
• First, we search a structural database of proteins for sequences related to the target sequence (templates)
• The accuracy of the model depends on the selection of a proper template
• FASTA and BLAST (PSI-BLAST), available from EMBL-EBI and NCBI, can be used
• This gives a probable set of templates, but the final one is not yet decided
• After initial alignments and finding structurally conserved regions among the
templates, we choose the final template
How good can homology modeling be?
Sequence identity and what the model can be used for:
• 60-100%: comparable to medium-resolution NMR; substrate specificity
• 30-60%: molecular replacement in crystallography; support of site-directed mutagenesis through visualization
• <30%: serious errors!
Protein Structural Databases
• Templates can be found using the TARGET sequence as a
query for searching using FASTA or BLAST
– PDB (http://www.rcsb.org/pdb)
– MODELLER (http://guitar.rockefeller.edu/modeller/modeller.html)
– ModBase (http://pipe.rockefeller.edu/modbase/general-info.html)
– 3DCrunch (http://www.expasy.ch/swissmod/SM_3DCrunch.html)
Target-Template Alignment
• No current comparative modeling method can
recover from an incorrect alignment
• Use multiple sequence alignments as initial guide.
• Consider slightly alternative alignments in areas of
uncertainty, build multiple models
Note: the sequence from a sequence database and the sequence in
the actual PDB entry are not always identical
Target-Multiple Template Alignment
• Alignment is prepared by superimposing all template
structures
• Add target sequence to this alignment
• Compare with multiple sequence alignment and adjust
Gaining confidence in template searching
• Once a suitable template is found, it is a good idea to do a
literature search (PubMed) on the relevant fold to
determine what biological role(s) it plays.
• Does this match the biological/biochemical function that
you expect?
Other factors to consider in selecting
templates
• Template environment
– pH
– Ligands present?
• Resolution of the templates
• Family of proteins
– Phylogenetic tree construction can help find the
subfamily closest to the target sequence
• Multiple templates?
Determining Structurally Conserved Regions
(SCRs)
• When two or more reference protein structures are available
• Establish structural guidelines for the family of proteins under consideration
• First step in building a model protein by homology is determining what
regions are structurally conserved or constant among all the reference
proteins
• Target protein is supposed to assume the same conformation in conserved
regions
Structurally Conserved Regions
• SCRs are regions that are nearly identical in structure in all proteins of a particular family
• They tend to be in the inner cores of the proteins
• They usually contain alpha-helices and beta-sheets
• No SCR can span more than one secondary structure element
Alignment of Target Protein with SCR
• After aligning the reference proteins and finding the SCRs, we align the unknown sequence with the aligned reference proteins, using the knowledge of the SCRs
• Since SCRs cannot contain insertions or deletions, an enhanced version of the standard alignment algorithm is used
• After the suitable template has been found and the unknown sequence aligned with it, coordinates are assigned within the conserved regions
Assignment of coordinates within conserved region
• Once the correspondence between amino acids in the reference and model sequences has been made, the coordinates for an SCR can be assigned
• The reference proteins' coordinates are used as a basis for this assignment
• Where the side chains of the reference and model proteins are the same at corresponding locations along the sequence, all the coordinates for the amino acid are transferred
• Where they differ, the backbone coordinates are transferred, but the side-chain atoms are automatically replaced to preserve the model protein's residue types
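The transfer rule described above can be sketched in a few lines. The data layout (a list of residue-name and atom-dictionary pairs), the function name and the backbone atom set N, CA, C, O are assumptions made for illustration; rebuilding the replaced side chains is left to the later rotamer search and is not attempted here.

```python
BACKBONE_ATOMS = {"N", "CA", "C", "O"}  # assumed backbone atom names (PDB style)


def transfer_scr_coordinates(template_residues, target_sequence):
    """Copy coordinates from a template SCR onto the aligned target residues.

    template_residues: list of (residue_name, {atom_name: (x, y, z)}) tuples.
    target_sequence:   list of residue names aligned one-to-one with the
                       template (SCRs contain no insertions or deletions).
    Returns a list in the same format, carrying the target residue names.
    """
    if len(template_residues) != len(target_sequence):
        raise ValueError("SCR alignment must be gap-free and of equal length")
    model = []
    for (template_name, atoms), target_name in zip(template_residues, target_sequence):
        if template_name == target_name:
            # Same residue type: transfer every atom, side chain included.
            copied = dict(atoms)
        else:
            # Different residue type: keep only the backbone; the side chain
            # must be rebuilt for the target residue type in a later step.
            copied = {name: xyz for name, xyz in atoms.items()
                      if name in BACKBONE_ATOMS}
        model.append((target_name, copied))
    return model
```

The function returns new dictionaries, so the template coordinates are left untouched and several targets can be built from the same template.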
Assignment of coordinates in loop or variable
region
Two main methods
• Finding similar peptide segments in other proteins
• Generating a segment de novo
Finding similar peptide segments in other proteins
• Advantage: all loops found are guaranteed to have reasonable internal geometries and conformations
• Disadvantage: they may not fit properly into the given model protein's framework; in this case, the de novo method is advisable
Assignment of coordinates in loop or
variable region
Selection Of Loops
• Check the loops on the basis of steric overlaps
• A specified degree of overlap can be tolerated
• Check the atoms within the loop against each other
• Then check the loop atoms against the rest of the protein's atoms
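A steric-overlap check of this kind can be sketched as a distance test. To keep the example short, each residue is represented by a single point (for instance its Cα atom), so that bonded neighbours are not reported as clashes. The 3.0 Å cutoff and the `max_clashes` tolerance are illustrative values, not taken from the slides.

```python
import math


def count_internal_clashes(atoms, min_distance=3.0):
    """Count pairs of points within one loop that are closer than min_distance."""
    clashes = 0
    for i in range(len(atoms)):
        for j in range(i + 1, len(atoms)):
            if math.dist(atoms[i], atoms[j]) < min_distance:
                clashes += 1
    return clashes


def count_clashes(atoms_a, atoms_b, min_distance=3.0):
    """Count pairs (one point from each set) closer than min_distance."""
    return sum(1 for a in atoms_a for b in atoms_b
               if math.dist(a, b) < min_distance)


def loop_is_acceptable(loop_atoms, rest_atoms, min_distance=3.0, max_clashes=0):
    """Check the loop against itself first, then against the rest of the protein."""
    internal = count_internal_clashes(loop_atoms, min_distance)
    if internal > max_clashes:
        return False
    external = count_clashes(loop_atoms, rest_atoms, min_distance)
    return internal + external <= max_clashes
```

Raising `max_clashes` corresponds to tolerating a specified degree of overlap; a candidate loop that fails can be discarded in favour of the next database hit or a de novo segment.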
Sidechains on surface of protein
• Exposed sidechains on the surface can be highly flexible, without a
single dominant conformation
• So ultimately, if these solvent-exposed sidechains do not form
binding interactions with other molecules and are not involved in, say, a
catalytic reaction, then their accuracy may not be critical
Side Chain Conformation Search
• With bond lengths, bond angles and two rotatable backbone bonds per residue (φ and ψ), it is very difficult to find the best conformation of a side chain
• In addition, the side chains of many residues have one or more degrees of freedom of their own
• Hence a side-chain conformational search in loop regions is a must
• Side chains replaced during coordinate transfer should also be checked
Selection Of Good Rotamer
• Fortunately, statistical studies show that side chains adopt only a small number of the many possible conformations
• The correct rotamer of a particular residue is mainly determined by its local environment
• Side chains generally adopt conformations in which they are closely packed
Refinement of model using Molecular Mechanics
Many structural artifacts can be introduced while the model protein is being built:
• Substitution of large side chains for small ones
• Strained peptide bonds between segments taken from different reference proteins
• Non-optimal conformations of loops
Model Validation
• Every homology model contains errors. The two main factors are
– the % sequence identity between reference and model
– the number of errors in the templates
• Hence it is essential to check the correctness of the overall fold/structure,
errors in localized regions, and stereochemical parameters: bond
lengths, angles, geometries
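One very simple stereochemical check can be sketched as follows: consecutive Cα atoms joined by a trans peptide bond lie about 3.8 Å apart, so a pair that deviates strongly from this hints at a chain break or a strained junction. The tolerance of 0.2 Å and the function name are illustrative assumptions. Dedicated validation programs such as PROCHECK or WHAT IF check far more than this.

```python
import math


def flag_ca_distance_outliers(ca_coords, ideal=3.8, tolerance=0.2):
    """Return (i, i + 1, distance) for consecutive CA pairs far from the ideal.

    ca_coords: list of (x, y, z) CA positions in chain order, in angstroms.
    ideal:     approximate CA-CA distance across a trans peptide bond.
    tolerance: allowed deviation before a pair is reported (assumed value).
    """
    outliers = []
    for i in range(len(ca_coords) - 1):
        distance = math.dist(ca_coords[i], ca_coords[i + 1])
        if abs(distance - ideal) > tolerance:
            outliers.append((i, i + 1, distance))
    return outliers
```

Flagged positions are candidates for the localized-region checks mentioned above, for example the joins between segments copied from different reference proteins.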
Structural Prediction by Homology Modeling
[Flowchart]
Stages shown: Structural Databases; Reference Proteins; Conserved Regions; Protein Sequence; Predicted Conserved Regions; Initial Model; Structure Analysis; Refined Model
Methods and tools named in the flowchart: SeqFold, Profiles-3D, PSI-BLAST, BLAST & FASTA and fold-recognition methods; Cα matrix matching; sequence alignment; coordinate assignment; loop searching/generation; sidechain rotamers and/or MM/MD; WHAT IF, PROCHECK, PROSAII; MODELLER
Errors in Homology Modeling
a) Side chain packing b) Distortions and shifts c) No template d) Misalignments e) Incorrect template
Marti-Renom et al., Ann. Rev. Biophys. Biomol. Struct., 2000, 29:291-325.
Automated Web-Based Homology Modelling
• SWISS Model : http://www.expasy.org/swissmod/SWISS-MODEL.html
• WHAT IF : http://www.cmbi.kun.nl/swift/servers/
• The CPHModels Server : http://www.cbs.dtu.dk/services/CPHmodels/
• 3D Jigsaw : http://www.bmm.icnet.uk/~3djigsaw/
• SDSC1 : http://cl.sdsc.edu/hm.html
• EsyPred3D : http://www.fundp.ac.be/urbm/bioinfo/esypred/
• HHpred server: https://toolkit.tuebingen.mpg.de/tools/hhpred
Tools for Model Evaluation
• WHAT IF http://www.cmbi.kun.nl/gv/servers/WIWWWI/
• SOV http://predictioncenter.llnl.gov/local/sov/sov.html
• PROVE http://www.ucmb.ulb.ac.be/UCMB/PROVE/
• ANOLEA http://www.fundp.ac.be/pub/ANOLEA.html
• ERRAT http://www.doe-mbi.ucla.edu/Services/ERRATv2/
• VERIFY3D http://shannon.mbi.ucla.edu/DOE/Services/Verify_3D/
• BIOTECH http://biotech.embl-ebi.ac.uk:8400/
• ProsaII http://www.came.sbg.ac.at
• WHATCHECK http://www.sander.embl-heidelberg.de/whatcheck/
Challenges
• To model proteins with lower similarity (e.g. < 30% sequence identity)
• To increase the accuracy of models and to make the process fully automated
• Improvements may include simultaneous optimization techniques in side-chain modeling and loop modeling
• Developing better optimizers and potential functions, which can lead the model structure away from the template and towards the correct structure
• Although comparative modelling needs significant improvement, it is already a mature technique that can be used to address many practical problems
Comparative Modeling Server & Program
• COMPOSER http://www.tripos.com/sciTech/inSilicoDisc/bioInformatics/matchmaker.html
• MODELLER http://salilab.org/modeler
• InsightII http://www.msi.com/
• SYBYL http://www.tripos.com/
Fold Recognition
• Also called protein threading
• Given a new sequence and a library of known
folds, find the best alignment of the sequence to each
fold and return the most favorable one
Protein Folding with
Dynamic Programming
• D. Eisenberg 1994
• Align the sequence to each fold (a string of
environmental classes)
• Advantages: fast and works pretty well
• Disadvantages: does not consider AA contacts
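As a rough sketch of the idea, a sequence can be aligned to a fold written as a string of environment classes using ordinary global dynamic programming, with the usual substitution matrix replaced by a residue-versus-environment score. Everything specific below is a toy assumption made for illustration: only two classes ("B" buried, "E" exposed), a +1/-1 score based on a hand-picked hydrophobic set, and a linear gap penalty. The published method uses a much richer set of classes and a statistically derived score table.

```python
HYDROPHOBIC = set("AVLIMFWC")  # toy assumption, not a published scale


def environment_score(residue, env_class):
    """+1 if a hydrophobic residue sits in a buried ('B') position or a polar
    residue sits in an exposed ('E') position, otherwise -1 (toy scoring)."""
    buried = env_class == "B"
    hydrophobic = residue in HYDROPHOBIC
    return 1 if buried == hydrophobic else -1


def thread_score(sequence, fold_environments, gap_penalty=-2):
    """Best global alignment score of a sequence against one fold string."""
    n, m = len(sequence), len(fold_environments)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap_penalty
    for j in range(1, m + 1):
        score[0][j] = j * gap_penalty
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            match = score[i - 1][j - 1] + environment_score(
                sequence[i - 1], fold_environments[j - 1])
            skip_residue = score[i - 1][j] + gap_penalty
            skip_position = score[i][j - 1] + gap_penalty
            score[i][j] = max(match, skip_residue, skip_position)
    return score[n][m]


def best_fold(sequence, fold_library):
    """Name of the fold in the library (name -> environment string) that
    gives the highest threading score for the sequence."""
    return max(fold_library,
               key=lambda name: thread_score(sequence, fold_library[name]))
```

Because each position is scored independently of the others, the alignment is fast, and it is also clear why contacts between pairs of amino acids are not captured by this kind of scheme.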
Fold Recognition
• Each predictor can submit N top hits
• Every predictor does well on something
• Common folds (more examples) are easier to
recognize
Fold Recognition
• Alignment (seq to fold) is a big problem
ab initio
• Predict interresidue contacts and then
compute structure (mild success)
• Simplified energy term + reduced search space
(phi/psi or lattice) (moderate success)
• Creative ways to memorize sequence-structure
correlations in short segments from
the PDB, and use these to model new
structures: ROSETTA
Ab initio prediction of protein structure
• Sample: search conformational space such that native-like conformations are found
– astronomically large number of conformations: 5 states per residue over 100 residues = 5^100 ≈ 10^70
• Select: it is hard to design functions that are not fooled by non-native conformations (“decoys”)
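The size of the search space quoted above can be checked with a few lines of arithmetic. The sketch simply reproduces the slide's example of 5 states per residue and 100 residues; the variable names are illustrative.

```python
import math

states_per_residue = 5
residues = 100

# Exact number of conformations if every residue independently takes one of 5 states.
conformations = states_per_residue ** residues

# Power of ten: log10(5^100) = 100 * log10(5), roughly 69.9.
exponent = residues * math.log10(states_per_residue)

print(f"5^100 has {len(str(conformations))} digits, about 10^{exponent:.1f}")
```

The exponent comes out just under 70, which is where the figure of roughly 10^70 conformations comes from and why exhaustive enumeration is not an option.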
Programs for Model Protein Prediction
https://molbiol-tools.ca/Protein_tertiary_structure.htm
L1Protein_Structure_Analysis.pptx

More Related Content

Similar to L1Protein_Structure_Analysis.pptx

Presentation1
Presentation1Presentation1
Presentation1
firesea
 
Modelling Proteins By Computational Structural Biology
Modelling Proteins By Computational Structural BiologyModelling Proteins By Computational Structural Biology
Modelling Proteins By Computational Structural Biology
Antonio E. Serrano
 
HOMOLOGY MODELLING.pptx
HOMOLOGY MODELLING.pptxHOMOLOGY MODELLING.pptx
HOMOLOGY MODELLING.pptx
MO.SHAHANAWAZ
 
homology modellign lecture .pdf
homology modellign lecture .pdfhomology modellign lecture .pdf
homology modellign lecture .pdf
AliAhamd7
 
homology modellign lecture .pdf
homology modellign lecture .pdfhomology modellign lecture .pdf
homology modellign lecture .pdf
AliAhamd7
 
Homology Modeling.pptx
Homology Modeling.pptxHomology Modeling.pptx
Homology Modeling.pptx
AmnaAkram29
 
Computer Aided Molecular Modeling
Computer Aided Molecular ModelingComputer Aided Molecular Modeling
Computer Aided Molecular Modeling
pkchoudhury
 
protein Modeling Abi.pptx
protein Modeling Abi.pptxprotein Modeling Abi.pptx
protein Modeling Abi.pptx
MuhammadRizwan863722
 
Homology modelling
Homology modellingHomology modelling
Homology modelling
Ayesha Choudhury
 
threading and homology modelling methods
threading and homology modelling methodsthreading and homology modelling methods
threading and homology modelling methods
mohammed muzammil
 
Protien Structure Prediction
Protien Structure PredictionProtien Structure Prediction
Protien Structure Prediction
SelimReza76
 
Homology modeling of proteins (ppt)
Homology modeling of proteins (ppt)Homology modeling of proteins (ppt)
Homology modeling of proteins (ppt)
Melvin Alex
 
Tertiary structure prediction- MODELLER, RASMOL
Tertiary structure prediction- MODELLER, RASMOLTertiary structure prediction- MODELLER, RASMOL
Tertiary structure prediction- MODELLER, RASMOL
kaverinair1
 
methods for protein structure prediction
methods for protein structure predictionmethods for protein structure prediction
methods for protein structure prediction
karamveer prajapat
 
Protein struc pred-Ab initio and other methods as a short introduction.ppt
Protein struc pred-Ab initio and other methods as a short introduction.pptProtein struc pred-Ab initio and other methods as a short introduction.ppt
Protein struc pred-Ab initio and other methods as a short introduction.ppt
60BT119YAZHINIK
 
Drug discovery presentation
Drug discovery presentationDrug discovery presentation
Drug discovery presentation
Theertha Raveendran
 
Computational Prediction Of Protein-1.pptx
Computational Prediction Of Protein-1.pptxComputational Prediction Of Protein-1.pptx
Computational Prediction Of Protein-1.pptx
ashharnomani
 
Comparative Protein Structure Modeling and itsApplications
Comparative Protein Structure Modeling and itsApplicationsComparative Protein Structure Modeling and itsApplications
Comparative Protein Structure Modeling and itsApplications
LynellBull52
 
Homology Modelling
Homology ModellingHomology Modelling
Homology Modelling
MAYANK ,MEHENDIRATTA
 
protein design, principles and examples.pptx
protein design, principles and examples.pptxprotein design, principles and examples.pptx
protein design, principles and examples.pptx
GopiChand121
 

Similar to L1Protein_Structure_Analysis.pptx (20)

Presentation1
Presentation1Presentation1
Presentation1
 
Modelling Proteins By Computational Structural Biology
Modelling Proteins By Computational Structural BiologyModelling Proteins By Computational Structural Biology
Modelling Proteins By Computational Structural Biology
 
HOMOLOGY MODELLING.pptx
HOMOLOGY MODELLING.pptxHOMOLOGY MODELLING.pptx
HOMOLOGY MODELLING.pptx
 
homology modellign lecture .pdf
homology modellign lecture .pdfhomology modellign lecture .pdf
homology modellign lecture .pdf
 
homology modellign lecture .pdf
homology modellign lecture .pdfhomology modellign lecture .pdf
homology modellign lecture .pdf
 
Homology Modeling.pptx
Homology Modeling.pptxHomology Modeling.pptx
Homology Modeling.pptx
 
Computer Aided Molecular Modeling
Computer Aided Molecular ModelingComputer Aided Molecular Modeling
Computer Aided Molecular Modeling
 
protein Modeling Abi.pptx
protein Modeling Abi.pptxprotein Modeling Abi.pptx
protein Modeling Abi.pptx
 
Homology modelling
Homology modellingHomology modelling
Homology modelling
 
threading and homology modelling methods
threading and homology modelling methodsthreading and homology modelling methods
threading and homology modelling methods
 
Protien Structure Prediction
Protien Structure PredictionProtien Structure Prediction
Protien Structure Prediction
 
Homology modeling of proteins (ppt)
Homology modeling of proteins (ppt)Homology modeling of proteins (ppt)
Homology modeling of proteins (ppt)
 
Tertiary structure prediction- MODELLER, RASMOL
Tertiary structure prediction- MODELLER, RASMOLTertiary structure prediction- MODELLER, RASMOL
Tertiary structure prediction- MODELLER, RASMOL
 
methods for protein structure prediction
methods for protein structure predictionmethods for protein structure prediction
methods for protein structure prediction
 
Protein struc pred-Ab initio and other methods as a short introduction.ppt
Protein struc pred-Ab initio and other methods as a short introduction.pptProtein struc pred-Ab initio and other methods as a short introduction.ppt
Protein struc pred-Ab initio and other methods as a short introduction.ppt
 
Drug discovery presentation
Drug discovery presentationDrug discovery presentation
Drug discovery presentation
 
Computational Prediction Of Protein-1.pptx
Computational Prediction Of Protein-1.pptxComputational Prediction Of Protein-1.pptx
Computational Prediction Of Protein-1.pptx
 
Comparative Protein Structure Modeling and itsApplications
Comparative Protein Structure Modeling and itsApplicationsComparative Protein Structure Modeling and itsApplications
Comparative Protein Structure Modeling and itsApplications
 
Homology Modelling
Homology ModellingHomology Modelling
Homology Modelling
 
protein design, principles and examples.pptx
protein design, principles and examples.pptxprotein design, principles and examples.pptx
protein design, principles and examples.pptx
 

Recently uploaded

THEMATIC APPERCEPTION TEST(TAT) cognitive abilities, creativity, and critic...
THEMATIC  APPERCEPTION  TEST(TAT) cognitive abilities, creativity, and critic...THEMATIC  APPERCEPTION  TEST(TAT) cognitive abilities, creativity, and critic...
THEMATIC APPERCEPTION TEST(TAT) cognitive abilities, creativity, and critic...
Abdul Wali Khan University Mardan,kP,Pakistan
 
Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...
Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...
Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...
Travis Hills MN
 
Leaf Initiation, Growth and Differentiation.pdf
Leaf Initiation, Growth and Differentiation.pdfLeaf Initiation, Growth and Differentiation.pdf
Leaf Initiation, Growth and Differentiation.pdf
RenuJangid3
 
ANAMOLOUS SECONDARY GROWTH IN DICOT ROOTS.pptx
ANAMOLOUS SECONDARY GROWTH IN DICOT ROOTS.pptxANAMOLOUS SECONDARY GROWTH IN DICOT ROOTS.pptx
ANAMOLOUS SECONDARY GROWTH IN DICOT ROOTS.pptx
RASHMI M G
 
Earliest Galaxies in the JADES Origins Field: Luminosity Function and Cosmic ...
Earliest Galaxies in the JADES Origins Field: Luminosity Function and Cosmic ...Earliest Galaxies in the JADES Origins Field: Luminosity Function and Cosmic ...
Earliest Galaxies in the JADES Origins Field: Luminosity Function and Cosmic ...
Sérgio Sacani
 
原版制作(carleton毕业证书)卡尔顿大学毕业证硕士文凭原版一模一样
原版制作(carleton毕业证书)卡尔顿大学毕业证硕士文凭原版一模一样原版制作(carleton毕业证书)卡尔顿大学毕业证硕士文凭原版一模一样
原版制作(carleton毕业证书)卡尔顿大学毕业证硕士文凭原版一模一样
yqqaatn0
 
NuGOweek 2024 Ghent programme overview flyer
NuGOweek 2024 Ghent programme overview flyerNuGOweek 2024 Ghent programme overview flyer
NuGOweek 2024 Ghent programme overview flyer
pablovgd
 
Orion Air Quality Monitoring Systems - CWS
Orion Air Quality Monitoring Systems - CWSOrion Air Quality Monitoring Systems - CWS
Orion Air Quality Monitoring Systems - CWS
Columbia Weather Systems
 
Salas, V. (2024) "John of St. Thomas (Poinsot) on the Science of Sacred Theol...
Salas, V. (2024) "John of St. Thomas (Poinsot) on the Science of Sacred Theol...Salas, V. (2024) "John of St. Thomas (Poinsot) on the Science of Sacred Theol...
Salas, V. (2024) "John of St. Thomas (Poinsot) on the Science of Sacred Theol...
Studia Poinsotiana
 
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
University of Maribor
 
bordetella pertussis.................................ppt
bordetella pertussis.................................pptbordetella pertussis.................................ppt
bordetella pertussis.................................ppt
kejapriya1
 
In silico drugs analogue design: novobiocin analogues.pptx
In silico drugs analogue design: novobiocin analogues.pptxIn silico drugs analogue design: novobiocin analogues.pptx
In silico drugs analogue design: novobiocin analogues.pptx
AlaminAfendy1
 
如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样
如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样
如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样
yqqaatn0
 
Nucleic Acid-its structural and functional complexity.
Nucleic Acid-its structural and functional complexity.Nucleic Acid-its structural and functional complexity.
Nucleic Acid-its structural and functional complexity.
Nistarini College, Purulia (W.B) India
 
Seminar of U.V. Spectroscopy by SAMIR PANDA
 Seminar of U.V. Spectroscopy by SAMIR PANDA Seminar of U.V. Spectroscopy by SAMIR PANDA
Seminar of U.V. Spectroscopy by SAMIR PANDA
SAMIR PANDA
 
Unveiling the Energy Potential of Marshmallow Deposits.pdf
Unveiling the Energy Potential of Marshmallow Deposits.pdfUnveiling the Energy Potential of Marshmallow Deposits.pdf
Unveiling the Energy Potential of Marshmallow Deposits.pdf
Erdal Coalmaker
 
Nutraceutical market, scope and growth: Herbal drug technology
Nutraceutical market, scope and growth: Herbal drug technologyNutraceutical market, scope and growth: Herbal drug technology
Nutraceutical market, scope and growth: Herbal drug technology
Lokesh Patil
 
PRESENTATION ABOUT PRINCIPLE OF COSMATIC EVALUATION
PRESENTATION ABOUT PRINCIPLE OF COSMATIC EVALUATIONPRESENTATION ABOUT PRINCIPLE OF COSMATIC EVALUATION
PRESENTATION ABOUT PRINCIPLE OF COSMATIC EVALUATION
ChetanK57
 
nodule formation by alisha dewangan.pptx
nodule formation by alisha dewangan.pptxnodule formation by alisha dewangan.pptx
nodule formation by alisha dewangan.pptx
alishadewangan1
 
Deep Software Variability and Frictionless Reproducibility
Deep Software Variability and Frictionless ReproducibilityDeep Software Variability and Frictionless Reproducibility
Deep Software Variability and Frictionless Reproducibility
University of Rennes, INSA Rennes, Inria/IRISA, CNRS
 

Recently uploaded (20)

THEMATIC APPERCEPTION TEST(TAT) cognitive abilities, creativity, and critic...
THEMATIC  APPERCEPTION  TEST(TAT) cognitive abilities, creativity, and critic...THEMATIC  APPERCEPTION  TEST(TAT) cognitive abilities, creativity, and critic...
THEMATIC APPERCEPTION TEST(TAT) cognitive abilities, creativity, and critic...
 
Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...
Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...
Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...
 
Leaf Initiation, Growth and Differentiation.pdf
Leaf Initiation, Growth and Differentiation.pdfLeaf Initiation, Growth and Differentiation.pdf
Leaf Initiation, Growth and Differentiation.pdf
 
ANAMOLOUS SECONDARY GROWTH IN DICOT ROOTS.pptx
ANAMOLOUS SECONDARY GROWTH IN DICOT ROOTS.pptxANAMOLOUS SECONDARY GROWTH IN DICOT ROOTS.pptx
ANAMOLOUS SECONDARY GROWTH IN DICOT ROOTS.pptx
 
Earliest Galaxies in the JADES Origins Field: Luminosity Function and Cosmic ...
Earliest Galaxies in the JADES Origins Field: Luminosity Function and Cosmic ...Earliest Galaxies in the JADES Origins Field: Luminosity Function and Cosmic ...
Earliest Galaxies in the JADES Origins Field: Luminosity Function and Cosmic ...
 
原版制作(carleton毕业证书)卡尔顿大学毕业证硕士文凭原版一模一样
原版制作(carleton毕业证书)卡尔顿大学毕业证硕士文凭原版一模一样原版制作(carleton毕业证书)卡尔顿大学毕业证硕士文凭原版一模一样
原版制作(carleton毕业证书)卡尔顿大学毕业证硕士文凭原版一模一样
 
NuGOweek 2024 Ghent programme overview flyer
NuGOweek 2024 Ghent programme overview flyerNuGOweek 2024 Ghent programme overview flyer
NuGOweek 2024 Ghent programme overview flyer
 
Orion Air Quality Monitoring Systems - CWS
Orion Air Quality Monitoring Systems - CWSOrion Air Quality Monitoring Systems - CWS
Orion Air Quality Monitoring Systems - CWS
 
Salas, V. (2024) "John of St. Thomas (Poinsot) on the Science of Sacred Theol...
Salas, V. (2024) "John of St. Thomas (Poinsot) on the Science of Sacred Theol...Salas, V. (2024) "John of St. Thomas (Poinsot) on the Science of Sacred Theol...
Salas, V. (2024) "John of St. Thomas (Poinsot) on the Science of Sacred Theol...
 
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
 
bordetella pertussis.................................ppt
bordetella pertussis.................................pptbordetella pertussis.................................ppt
bordetella pertussis.................................ppt
 
In silico drugs analogue design: novobiocin analogues.pptx
In silico drugs analogue design: novobiocin analogues.pptxIn silico drugs analogue design: novobiocin analogues.pptx
In silico drugs analogue design: novobiocin analogues.pptx
 
如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样
如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样
如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样
 
Nucleic Acid-its structural and functional complexity.
Nucleic Acid-its structural and functional complexity.Nucleic Acid-its structural and functional complexity.
Nucleic Acid-its structural and functional complexity.
 
Seminar of U.V. Spectroscopy by SAMIR PANDA
 Seminar of U.V. Spectroscopy by SAMIR PANDA Seminar of U.V. Spectroscopy by SAMIR PANDA
Seminar of U.V. Spectroscopy by SAMIR PANDA
 
Unveiling the Energy Potential of Marshmallow Deposits.pdf
Unveiling the Energy Potential of Marshmallow Deposits.pdfUnveiling the Energy Potential of Marshmallow Deposits.pdf
Unveiling the Energy Potential of Marshmallow Deposits.pdf
 
Nutraceutical market, scope and growth: Herbal drug technology
Nutraceutical market, scope and growth: Herbal drug technologyNutraceutical market, scope and growth: Herbal drug technology
Nutraceutical market, scope and growth: Herbal drug technology
 
PRESENTATION ABOUT PRINCIPLE OF COSMATIC EVALUATION
PRESENTATION ABOUT PRINCIPLE OF COSMATIC EVALUATIONPRESENTATION ABOUT PRINCIPLE OF COSMATIC EVALUATION
PRESENTATION ABOUT PRINCIPLE OF COSMATIC EVALUATION
 
nodule formation by alisha dewangan.pptx
nodule formation by alisha dewangan.pptxnodule formation by alisha dewangan.pptx
nodule formation by alisha dewangan.pptx
 
Deep Software Variability and Frictionless Reproducibility
Deep Software Variability and Frictionless ReproducibilityDeep Software Variability and Frictionless Reproducibility
Deep Software Variability and Frictionless Reproducibility
 

L1Protein_Structure_Analysis.pptx

  • 2. Learning Objectives • At the end of this topic the student should be able to: (i) Perform computer aided- protein structure determination and validation (ii) Design docking experiments (iii) Demonstrate understanding of protein interactions and simulations
  • 4. Recap Structure prediction aims at building models of 3D structures of proteins that could be useful for understanding structure function relationships.
  • 5. STAT115 5 Protein Structure Levels Folded structure: lowest energy state
  • 7. Protein Structure Determination & Prediction • Experimental Techniques – X-ray Crystallography – NMR • Limitations of Current Experimental Techniques – Protein DataBank (PDB) -> 30,000 protein structures – Unique structure are between 4000 to 5000 only – Non-redudant (NR) -> 10,000,000 proteins sequences • Importance of Structure Prediction – Fill gap between known sequence and structures – Protein Engineering.- To alter function of a protein – Rational Drug Design
  • 8. STAT115 8 Protein Structure Prediction • Since AA sequence determines structure, can we predict protein structure from its AA sequence? – predicting the three angles, unlimited DoF! • Physical properties that determine fold – Rigidity of the protein backbone – Interactions among amino acids, including • Electrostatic interactions • van der Waals forces • Volume constraints • Hydrogen, disulfide bonds – Interactions of amino acids with water
  • 9. Why Protein Modeling  The goal of research in the area of structural genomics is to provide the means to characterize and identify the large number of protein sequences that are being discovered Knowledge of the three-dimensional structure  helps in the rational design of site-directed mutations  can be of great importance for the design of drugs  greatly enhances our understanding of how proteins function and how they interact with each other , for example, explain antigenic behaviour, DNA binding specificity, etc
  • 10. Computational methods for Protein Structure Prediction  Homology or Comparative Modeling  Fold Recognition or threading Methods  Ab initio methods that utilize knowledge-based information  Ab initio methods without the aid of knowledge-based information
  • 11. Why do we need computational approaches?  Structural information from x-ray crystallography or NMR results obtained much more slowly. techniques involve elaborate technical procedures many proteins fail to crystallize at all and/or cannot be obtained or dissolved in large enough quantities for NMR measurements The size of the protein is also a limiting factor for NMR  With a better computational method this can be done extremely fast.
  • 12. Tertiary Structure Prediction Methods (flowchart) • Start: any given protein sequence • Compare sequence identity with proteins that have solved structures – > 35%: homology modeling – < 35%: fold recognition or ab initio folding • All routes then pass through structure selection and structure refinement to give the final structure
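The decision in this flowchart can be sketched as a small function. This is only an illustration of the slide's 35% cutoff; the function name `choose_method` and its return strings are assumptions made for the example:

```python
def choose_method(identity_percent, threshold=35.0):
    """Pick a prediction route from the best sequence identity (in percent)
    between the target and any protein with a solved structure."""
    if not 0.0 <= identity_percent <= 100.0:
        raise ValueError("identity must be a percentage between 0 and 100")
    if identity_percent > threshold:
        return "homology modeling"
    # Below the cutoff the flowchart offers two alternatives.
    return "fold recognition or ab initio folding"


route_for_close_homolog = choose_method(62.0)
route_for_remote_target = choose_method(18.0)
```

The cutoff is a rule of thumb rather than a sharp boundary, which is why it is exposed as a parameter.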
  • 13. Homology Modeling • Predicts the three-dimensional structure of a given protein sequence (TARGET) based on an alignment to one or more known protein structures (TEMPLATES) • If similarity between the TARGET sequence and the TEMPLATE sequence is detected, structural similarity can be assumed • In general, at least 30% sequence identity is required for generating useful models
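Sequence identity between TARGET and TEMPLATE is read off a pairwise alignment. A minimal sketch, assuming the two sequences are already aligned to equal length with "-" marking gaps, and assuming identity is counted over columns where both sequences have a residue (other tools use other denominators, so reported values can differ):

```python
def percent_identity(aligned_target, aligned_template, gap="-"):
    """Percent of aligned (non-gap) columns with identical residues."""
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have the same length")
    aligned_columns = 0
    identical = 0
    for a, b in zip(aligned_target, aligned_template):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        aligned_columns += 1
        if a == b:
            identical += 1
    if aligned_columns == 0:
        return 0.0
    return 100.0 * identical / aligned_columns


# Toy alignment: 5 aligned columns, 4 of them identical.
identity = percent_identity("ACDE-F", "ACDQGF")
```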
  • 14. Homology Modeling Process • Template recognition • Alignment • Determining structurally conserved regions • Backbone generation • Building loops or variable regions • Conformational search for side chains • Refinement of structure • Validating structures
  • 15. Template Recognition • First, search a structural database of proteins for sequences related to the target sequence (the templates) • The accuracy of the model depends on the selection of a proper template • FASTA and BLAST (PSI-BLAST) from EMBL-EBI and NCBI can be used • This gives a probable set of templates, but the final one is not yet decided • After initial alignments and finding structurally conserved regions among the templates, we choose the final template
  • 16. How good can homology modeling be? (by sequence identity) • 60-100%: comparable to medium-resolution NMR; substrate specificity • 30-60%: molecular replacement in crystallography; supports site-directed mutagenesis through visualization • <30%: serious errors
  • 17. Protein Structural Databases • Templates can be found using the TARGET sequence as a query for searching using FASTA or BLAST – PDB (http://www.rcsb.org/pdb) – MODELLER (http://guitar.rockefeller.edu/modeller/modeller.html) – ModBase (http://pipe.rockefeller.edu/modbase/general-info.html) – 3DCrunch (http://www.expasy.ch/swissmod/SM_3DCrunch.html)
  • 18. Target-Template Alignment • No current comparative modeling method can recover from an incorrect alignment • Use multiple sequence alignments as an initial guide • Consider slightly different alternative alignments in areas of uncertainty and build multiple models • Note: the sequence in a sequence database and the sequence in the actual PDB entry are not always identical
  • 19. Target-Multiple Template Alignment • Alignment is prepared by superimposing all template structures • Add target sequence to this alignment • Compare with multiple sequence alignment and adjust
  • 20. Gaining confidence in template searching • Once a suitable template is found, it is a good idea to do a literature search (PubMed) on the relevant fold to determine what biological role(s) it plays. • Does this match the biological/biochemical function that you expect?
  • 21. Other factors to consider in selecting templates • Template environment – pH – Ligands present? • Resolution of the templates • Family of proteins – Phylogenetic tree construction can help find the subfamily closest to the target sequence • Multiple templates?
  • 22. Determining Structurally Conserved Regions (SCRs) • Applies when two or more reference protein structures are available • These establish structural guidelines for the family of proteins under consideration • The first step in building a model protein by homology is determining which regions are structurally conserved or constant among all the reference proteins • The target protein is assumed to adopt the same conformation in the conserved regions
  • 23. Structurally Conserved Regions • SCRs are regions in all proteins of a particular family that are nearly identical in structure • They tend to be in the inner cores of the proteins • They usually contain alpha-helices and beta-sheets • No SCR can span more than one secondary structure
  • 24. Alignment of Target Protein with SCR • After aligning the reference proteins and finding the SCRs, the unknown sequence is aligned with the aligned reference proteins using knowledge of the SCRs • Because SCRs cannot contain insertions or deletions, an enhanced version of the standard alignment algorithm is used • Once a suitable template has been found and the unknown sequence aligned with it, coordinates are assigned within the conserved regions
  • 25. Assignment of coordinates within conserved region • Once the correspondence between amino acids in the reference and model sequences has been made, the coordinates for an SCR can be assigned • The reference proteins' coordinates are used as a basis for this assignment • Where the side chains of the reference and model proteins are the same at corresponding locations along the sequence, all the coordinates for the amino acid are transferred • Where they differ, the backbone coordinates are transferred, but the side chain atoms are automatically replaced to preserve the model protein's residue types
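The transfer rule on this slide can be sketched as follows. The data layout is an assumption made for the example: each reference residue is a (residue name, {atom name: xyz}) pair, the target is a list of residue names already matched one-to-one with the reference, and side chains that must be replaced are only flagged here rather than rebuilt:

```python
BACKBONE_ATOMS = ("N", "CA", "C", "O")


def assign_scr_coordinates(target_residues, reference_residues):
    """Copy coordinates from reference to model within one conserved region."""
    if len(target_residues) != len(reference_residues):
        raise ValueError("an SCR contains no insertions or deletions")
    model = []
    for target_name, (reference_name, atoms) in zip(
        target_residues, reference_residues
    ):
        if target_name == reference_name:
            # Same residue type: transfer every atom.
            copied = dict(atoms)
            rebuild = False
        else:
            # Different residue type: transfer the backbone only.
            copied = {
                name: xyz for name, xyz in atoms.items()
                if name in BACKBONE_ATOMS
            }
            rebuild = True
        model.append(
            {"residue": target_name, "atoms": copied,
             "rebuild_side_chain": rebuild}
        )
    return model


# Toy two-residue SCR with made-up coordinates.
reference = [
    ("ALA", {"N": (0.0, 0.0, 0.0), "CA": (1.5, 0.0, 0.0),
             "C": (2.0, 1.4, 0.0), "O": (1.3, 2.4, 0.0),
             "CB": (2.0, -0.8, 1.2)}),
    ("SER", {"N": (3.3, 1.5, 0.0), "CA": (4.0, 2.8, 0.0),
             "C": (5.5, 2.6, 0.0), "O": (6.0, 1.5, 0.0),
             "CB": (3.6, 3.6, 1.2), "OG": (4.1, 4.9, 1.1)}),
]
model = assign_scr_coordinates(["ALA", "THR"], reference)
```

In the toy case the alanine is copied whole, while the serine-to-threonine position keeps only its backbone and is marked for side chain rebuilding.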
  • 26. Assignment of coordinates in loop or variable region • Two main methods: – Finding similar peptide segments in other proteins – Generating a segment de novo
  • 27. Assignment of coordinates in loop or variable region: finding similar peptide segments in other proteins • Advantage: all loops found are guaranteed to have reasonable internal geometries and conformations • Disadvantage: the segment may not fit properly into the given model protein’s framework; in this case, the de novo method is advisable
  • 28. Selection Of Loops • Check the loops on the basis of steric overlaps • A specified degree of overlap can be tolerated • Check the atoms within the loop against each other • Then check loop atoms against the rest of the protein’s atoms
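A rough sketch of this two-stage overlap check. Several simplifications are assumptions of the example and not part of the slides: each residue is reduced to one point, directly bonded neighbours within the loop are skipped, and the 3.0 Å cutoff and the tolerated overlap count are arbitrary illustrative values:

```python
import math


def count_internal_overlaps(loop, cutoff, skip=1):
    """Count pairs of loop points closer than cutoff, ignoring neighbours."""
    overlaps = 0
    for i in range(len(loop)):
        for j in range(i + skip + 1, len(loop)):
            if math.dist(loop[i], loop[j]) < cutoff:
                overlaps += 1
    return overlaps


def count_external_overlaps(loop, rest, cutoff):
    """Count loop/protein point pairs closer than cutoff."""
    return sum(
        1 for a in loop for b in rest if math.dist(a, b) < cutoff
    )


def loop_is_acceptable(loop, rest, cutoff=3.0, tolerated=0):
    """First check the loop against itself, then against the rest."""
    internal = count_internal_overlaps(loop, cutoff)
    if internal > tolerated:
        return False
    external = count_external_overlaps(loop, rest, cutoff)
    return internal + external <= tolerated


# Toy extended loop and one nearby point from the rest of the protein.
loop = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
crowded = [(3.8, 1.0, 0.0)]
distant = [(3.8, 10.0, 0.0)]
```

Raising `tolerated` corresponds to the slide's point that a specified degree of overlap can be accepted.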
  • 29. Sidechains on surface of protein • Exposed sidechains on the surface can be highly flexible, without a single dominant conformation • So ultimately, if these solvent-exposed sidechains do not form binding interactions with other molecules and are not involved in, say, a catalytic reaction, then accuracy may not be
  • 30. Side Chain Conformation Search • With bond lengths, bond angles and two rotatable backbone bonds per residue (φ and ψ), it is very difficult to find the best conformation of a side chain • In addition, the side chains of many residues have one or more degrees of freedom • Hence a side chain conformational search in loop regions is a must • Side chains replaced during coordinate assignment should also be checked
  • 31. Selection Of Good Rotamer • Fortunately, statistical studies show that side chains adopt only a small number of the many possible conformations • The correct rotamer of a particular residue is mainly determined by its local environment • Side chains generally adopt conformations in which they are closely packed
  • 32. Refinement of model using Molecular Mechanics • Many structural artifacts can be introduced while the model protein is being built: – Substitution of large side chains for small ones – Strained peptide bonds between segments taken from different reference proteins – Non-optimal conformation of loops
  • 33. Model Validation • Every homology model contains errors, for two main reasons: – The % sequence identity between reference and model – The number of errors in the templates • Hence it is essential to check the correctness of the overall fold/structure, errors in localized regions, and stereochemical parameters: bond lengths, angles, geometries
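As a toy illustration of a local geometry check, the sketch below flags consecutive Cα atoms whose separation departs from the roughly 3.8 Å expected across a trans peptide bond. The tolerance and the input format (a list of Cα coordinates in chain order) are assumptions made for the example, and a check this simple is no substitute for a full validation program:

```python
import math


def flag_ca_distances(ca_coords, expected=3.8, tolerance=0.2):
    """Return (i, i + 1, distance) for suspicious consecutive CA pairs."""
    flagged = []
    for i in range(len(ca_coords) - 1):
        distance = math.dist(ca_coords[i], ca_coords[i + 1])
        if abs(distance - expected) > tolerance:
            flagged.append((i, i + 1, round(distance, 2)))
    return flagged


# Toy chain: the second pair is stretched to 5.0 angstroms.
chain = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 5.0, 0.0)]
problems = flag_ca_distances(chain)
```

A flagged pair points to a localized region worth inspecting, for example a chain break or a badly joined loop.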
  • 34. Structural Prediction by Homology Modeling (flowchart) • Stages shown: structural databases, reference proteins, conserved regions, protein sequence, predicted conserved regions, initial model, structure analysis, refined model • Methods and tools named in the figure: SeqFold, Profiles-3D, PSI-BLAST, BLAST & FASTA and fold-recognition methods; Cα matrix matching; sequence alignment; coordinate assignment; loop searching/generation; sidechain rotamers and/or MM/MD; MODELER; WHAT IF, PROCHECK, PROSAII
  • 35. Errors in Homology Modeling a) Side chain packing b) Distortions and shifts c) no template
  • 36. Errors in Homology Modeling d) Misalignments e) incorrect template Marti-Renom et al., Ann. Rev. Biophys. Biomol. Struct., 2000, 29:291-325.
  • 37. Automated Web-Based Homology Modelling • SWISS-MODEL: http://www.expasy.org/swissmod/SWISS-MODEL.html • WHAT IF: http://www.cmbi.kun.nl/swift/servers/ • The CPHModels Server: http://www.cbs.dtu.dk/services/CPHmodels/ • 3D Jigsaw: http://www.bmm.icnet.uk/~3djigsaw/ • SDSC1: http://cl.sdsc.edu/hm.html • EsyPred3D: http://www.fundp.ac.be/urbm/bioinfo/esypred/ • HHpred server: https://toolkit.tuebingen.mpg.de/tools/hhpred
  • 38. Tools for Model Evaluation • WHAT IF: http://www.cmbi.kun.nl/gv/servers/WIWWWI/ • SOV: http://predictioncenter.llnl.gov/local/sov/sov.html • PROVE: http://www.ucmb.ulb.ac.be/UCMB/PROVE/ • ANOLEA: http://www.fundp.ac.be/pub/ANOLEA.html • ERRAT: http://www.doe-mbi.ucla.edu/Services/ERRATv2/ • VERIFY3D: http://shannon.mbi.ucla.edu/DOE/Services/Verify_3D/ • BIOTECH: http://biotech.embl-ebi.ac.uk:8400/ • ProsaII: http://www.came.sbg.ac.at • WHATCHECK: http://www.sander.embl-heidelberg.de/whatcheck/
  • 39. Challenges • To model proteins with lower similarity (e.g. < 30% sequence identity) • To increase the accuracy of models and to make modeling fully automated • Improvements may include simultaneous optimization techniques in side chain modeling and loop modeling • Developing better optimizers and potential functions, which can lead the model structure away from the template and towards the correct structure • Although comparative modelling needs significant improvement, it is already a mature technique that can be used to address many practical problems
  • 40. Comparative Modeling Server & Program • COMPOSER: http://www.tripos.com/sciTech/inSilicoDisc/bioInformatics/matchmaker.html • MODELER: http://salilab.org/modeler • InsightII: http://www.msi.com/ • SYBYL: http://www.tripos.com/
  • 41. Fold Recognition • Also called protein threading • Given a new sequence and a library of known folds, find the best alignment of the sequence to each fold and return the most favorable one
  • 42. Protein Folding with Dynamic Programming • D. Eisenberg, 1994 • Align the sequence to each fold (a string of environmental classes) • Advantages: fast and works pretty well • Disadvantages: does not consider amino acid contacts
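The idea of aligning a sequence to a string of environmental classes can be sketched with a standard global-alignment dynamic program. Everything concrete below is a toy assumption: the two classes ("B" buried, "E" exposed), the preference scores, the gap penalty and the fold names are invented for illustration, whereas real threading methods derive such scores from known structures:

```python
def threading_score(sequence, environments, preference,
                    gap=-2.0, default=-1.0):
    """Best global alignment score of a sequence against one fold."""
    rows, cols = len(sequence) + 1, len(environments) + 1
    table = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        table[i][0] = i * gap
    for j in range(1, cols):
        table[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            # Score for placing residue i-1 into environment j-1.
            pair = preference.get(
                (sequence[i - 1], environments[j - 1]), default
            )
            table[i][j] = max(
                table[i - 1][j - 1] + pair,  # residue placed in this position
                table[i - 1][j] + gap,       # residue left unplaced
                table[i][j - 1] + gap,       # fold position left empty
            )
    return table[-1][-1]


def best_fold(sequence, fold_library, preference):
    """Return the name of the fold with the most favorable score."""
    return max(
        fold_library,
        key=lambda name: threading_score(
            sequence, fold_library[name], preference
        ),
    )


# Toy preferences: hydrophobic residues like buried positions,
# charged residues like exposed ones.
preference = {("L", "B"): 2, ("V", "B"): 2, ("K", "E"): 2, ("D", "E"): 2}
folds = {"alternating": "BEBE", "blocked": "BBEE"}
winner = best_fold("LKVD", folds, preference)
```

Because each residue is scored only against the environment of its own position, the sketch shares the disadvantage named on the slide: contacts between pairs of amino acids are not considered.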
  • 43. Fold Recognition • Each predictor can submit N top hits • Every predictor does well on something • Common folds (more examples) are easier to recognize
  • 44. Fold Recognition • Alignment (seq to fold) is a big problem
  • 45. Ab initio • Predict interresidue contacts and then compute structure (mild success) • Simplified energy term + reduced search space (phi/psi or lattice) (moderate success) • Creative ways to memorize sequence → structure correlations in short segments from the PDB, and use these to model new structures: ROSETTA
  • 46. Ab initio prediction of protein structure • Sample: conformational space must be sampled such that native-like conformations are found – Astronomically large number of conformations – 5 states per residue over 100 residues: 5^100 ≈ 10^70 • Select: it is hard to design functions that are not fooled by non-native conformations (“decoys”)
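The size of the search space quoted on this slide follows from simple counting: if each residue independently takes one of a few states, the totals multiply. A short sketch that reproduces the order of magnitude; the function name is illustrative:

```python
import math


def conformation_count(states_per_residue, residues):
    """Number of conformations if every residue picks a state independently."""
    return states_per_residue ** residues


count = conformation_count(5, 100)
order_of_magnitude = math.log10(count)  # power of ten of the count
print(f"about 10^{order_of_magnitude:.1f} conformations")
```

Exhaustive enumeration is therefore out of the question, which is why ab initio methods rely on reduced search spaces and on fragments taken from known structures.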
  • 47. Programs for Model Protein Prediction https://molbiol-tools.ca/Protein_tertiary_structure.htm