P. K. Choudhury (Ph. D, 1st year,
       Dairy Microbiology)
The druggable genome


(Diagram: Human genome; Polysaccharides, Lipids, Nucleic Acids, Proteins; Proteins with binding site)



Druggable genome: Subset of genes which
  express proteins capable of binding small
  drug-like molecules
Protein Structure Prediction

    Why predict protein structure if we can use
        experimental tools to determine it?


• Experimental methods are slow and expensive

• Some structures could not be solved experimentally
• A representative structure from a family can suffice to
  deduce the structures of the other sequences in that family
Outline to modeling………………

1. Introduction to protein structure and databases

2. Structure prediction approaches

   •   Ab-initio
   •   Threading
   •   Homology modeling

3. Hands on molecular modeling

4. Model evaluation
1. Protein structure and databases

Protein structure is hierarchical:
• Pauling built models based on the following
  principles, codified by G. N. Ramachandran:
   •   Bond lengths and angles -should be similar to
       those found in individual amino acids and
       small peptides
   •   Peptide bond -should be planar
   •   Overlaps-not permitted, pairs of non-bonded atoms no
       closer than sum of their van der Waals radii
   •   Stabilization-have sterics that permit
       hydrogen bonding
• Two degrees of freedom:
   •   φ (phi) angle = rotation about N-Cα
   •   ψ (psi) angle = rotation about Cα-C

• A linear amino acid polymer with some local folds
  is an improvement, but it is still neither functional
  nor packed in a fully energetically favorable way
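The two backbone angles above are ordinary torsion angles, so they can be computed from four atom positions each. The sketch below is a minimal, standard-library-only illustration. It assumes the usual atom sets (φ from C of the previous residue, N, Cα, C; ψ from N, Cα, C, N of the next residue); the function names are illustrative and do not come from the slides.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def dihedral(p1, p2, p3, p4):
    """Signed torsion angle in degrees about the p2-p3 bond."""
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    b3 = _sub(p4, p3)
    n1 = _cross(b1, b2)   # normal of the plane through p1, p2, p3
    n2 = _cross(b2, b3)   # normal of the plane through p2, p3, p4
    b2_len = math.sqrt(_dot(b2, b2))
    y = b2_len * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

def phi(c_prev, n, ca, c):
    """phi: rotation about the N-CA bond (assumed atom order C(i-1), N, CA, C)."""
    return dihedral(c_prev, n, ca, c)

def psi(n, ca, c, n_next):
    """psi: rotation about the CA-C bond (assumed atom order N, CA, C, N(i+1))."""
    return dihedral(n, ca, c, n_next)
```

Plotting the (φ, ψ) pair of every residue of a structure gives the Ramachandran plot.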
Ramachandran Plot
SCOP-Fold classification




 All alpha (α)    All beta (β)   Alpha and beta(α, β)
Databases

• RCSB Protein Data Bank (PDB): all deposited structures
   •     Experimentally-determined structures of proteins, nucleic acids, and
         complex assemblies.
   •     Currently holds about 65,000 structures

• UniProt: main sequence database
       o     Swiss-Prot
       o     TrEMBL
   Collaborations:
   European Bioinformatics Institute (EBI),
   Swiss Institute of Bioinformatics (SIB) ,
   Protein Information Resource (PIR)


• NCBI: many databases, including sequences and structures
• PDBsum: combines structural & sequence data
2. Structure Prediction Approaches

o Ab-initio fold prediction
   •   Not based on similarity to a
       known sequence or structure

o Threading (Fold Recognition)
   •   Requires that the protein adopt a fold
       similar to a known structure

o Homology modeling
   •   Based on sequence similarity
       with a protein for which a
       structure has been solved.
a. Ab initio modeling
o   Structure prediction from
    “first principles”:

o   Shows that we understand
    the process.

o   Given only the sequence, try
    to predict the structure
    based on physico-chemical
    properties
    •   The force field
    •   Molecular dynamics
    •   Minimal energy
Force field
•    Mathematical expressions describing the potential energy of a
     molecular system
•    Each expression describes a different type of physico-chemical
     interaction between atoms in the system:
     • Van der Waals forces, Covalent bonds, Hydrogen bonds,
         Charges, Hydrophobic effects
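As a rough illustration of what such expressions look like, the sketch below sums two common non-bonded terms, a Lennard-Jones 12-6 term for van der Waals interactions and a Coulomb term for charges. It is a toy under stated assumptions: one shared epsilon/sigma for all atom pairs, energies in kcal/mol, distances in Å, and a rounded conversion constant. Real force fields use per-atom-type parameters and also include bonded terms.

```python
import math

# Approximate Coulomb conversion constant in kcal*Angstrom/(mol*e^2).
COULOMB_K = 332.06

def lennard_jones(r, epsilon, sigma):
    """12-6 Lennard-Jones energy for two atoms at distance r."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)

def coulomb(r, q1, q2, dielectric=1.0):
    """Electrostatic energy of two point charges at distance r."""
    return COULOMB_K * q1 * q2 / (dielectric * r)

def nonbonded_energy(atoms, epsilon=0.1, sigma=3.4):
    """Sum both terms over all atom pairs.

    atoms: list of ((x, y, z), charge) tuples. The shared epsilon and
    sigma are illustrative placeholders, not real parameters.
    """
    total = 0.0
    for i in range(len(atoms)):
        for j in range(i + 1, len(atoms)):
            (pos_i, q_i), (pos_j, q_j) = atoms[i], atoms[j]
            r = math.dist(pos_i, pos_j)
            total += lennard_jones(r, epsilon, sigma) + coulomb(r, q_i, q_j)
    return total
```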

    Molecular dynamics
•    Simulates the forces that govern the protein in water.
•    Since proteins usually fold spontaneously, the simulation should in
     principle lead to the native protein structure.
Minimal Energy
Assumption: the folded
form is the minimal energy
conformation of a protein

•   Use of simplified energy
    function
•   Search methods for minimal
    energy conformation:

    •   Greedy search
    •   Simulated annealing
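The two search methods can be compared on a toy problem. The sketch below minimizes a made-up one-dimensional "energy" with several local minima; the energy function, step sizes and temperature schedule are all illustrative assumptions, not a protein energy function. Greedy search only accepts downhill moves, while simulated annealing also accepts some uphill moves with the Metropolis probability exp(-ΔE/T), which lets it escape local minima while the temperature is high.

```python
import math
import random

def energy(x):
    """Toy rugged landscape (hypothetical, for illustration only)."""
    return 0.1 * x * x + math.sin(3.0 * x)

def greedy_search(x, step=0.1, iters=2000, seed=0):
    """Accept a random neighbour only if it lowers the energy."""
    rng = random.Random(seed)
    e = energy(x)
    for _ in range(iters):
        cand = x + rng.uniform(-step, step)
        e_cand = energy(cand)
        if e_cand < e:
            x, e = cand, e_cand
    return x, e

def simulated_annealing(x, t_start=2.0, t_end=0.01, step=0.5,
                        iters=5000, seed=0):
    """Metropolis acceptance with a geometrically decreasing temperature."""
    rng = random.Random(seed)
    e = energy(x)
    best_x, best_e = x, e
    for i in range(iters):
        t = t_start * (t_end / t_start) ** (i / (iters - 1))
        cand = x + rng.uniform(-step, step)
        e_cand = energy(cand)
        # Downhill moves are always taken; uphill moves sometimes.
        if e_cand <= e or rng.random() < math.exp(-(e_cand - e) / t):
            x, e = cand, e_cand
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e
```

Calling both functions from the same starting point, for example `greedy_search(4.0)` and `simulated_annealing(4.0)`, shows whether the greedy run stopped in a nearby local minimum.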
b. Threading (Fold recognition)

Given a sequence and a library of folds, thread the
sequence through each fold. Take the one with the
highest score (e.g., I-TASSER).

•   Method will fail if new protein does not belong
    to any fold in the library.

•   Score of the threading is computed based on
    known physical chemistry properties and
    statistics of amino acids.
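The idea of scoring a sequence against every fold in a library can be sketched with a deliberately crude toy. Everything here is a hypothetical simplification: each fold is reduced to a string of per-position environments ('B' buried, 'E' exposed), the score only rewards hydrophobic residues in buried positions and polar residues in exposed ones, and no gaps are allowed. Real threading programs use far richer statistical potentials and alignments.

```python
# Residues treated as hydrophobic in this toy (an illustrative choice).
HYDROPHOBIC = set("AVLIMFWC")

def position_score(residue, environment):
    """+1 if the residue suits the environment ('B' or 'E'), else -1."""
    buried_ok = residue in HYDROPHOBIC and environment == "B"
    exposed_ok = residue not in HYDROPHOBIC and environment == "E"
    return 1 if (buried_ok or exposed_ok) else -1

def thread_score(sequence, fold_environments):
    """Best gapless score over all offsets; None if the fold is too short."""
    n, m = len(sequence), len(fold_environments)
    if n > m:
        return None
    best = None
    for offset in range(m - n + 1):
        score = sum(position_score(res, fold_environments[offset + i])
                    for i, res in enumerate(sequence))
        if best is None or score > best:
            best = score
    return best

def best_fold(sequence, library):
    """Return (name of highest-scoring fold, all scores)."""
    scores = {}
    for name, environments in library.items():
        score = thread_score(sequence, environments)
        if score is not None:
            scores[name] = score
    if not scores:
        raise ValueError("no fold in the library can hold the sequence")
    return max(scores, key=scores.get), scores
```

The `ValueError` branch mirrors the failure case noted above: if no fold in the library fits, no prediction is made.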
c. Homology Modeling…………………

o   A protein structure is
    defined by its amino acid
    sequence.

o   Closely related sequences
    adopt      highly     similar
    structures, distantly related
    sequences may still fold
    into similar structures.

o   The three-dimensional structure of proteins from
    the same family is more conserved than their sequence
    (example: triosephosphate isomerases, 44.7% sequence
    identity, 0.95 RMSD)
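Sequence identity figures such as the one quoted above are computed from a pairwise alignment. The sketch below is one minimal way to do it, assuming the two sequences are already aligned to equal length with '-' as the gap character. Conventions for the denominator differ between tools; this version counts only columns without gaps.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over gap-free alignment columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0
```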
3. Hands on molecular modeling




        Homology
The Query Protein

Name: Dihydrodipicolinate reductase
Enzyme reaction:




Molecular process: Lysine biosynthesis (early stages)
Organism: E. coli
Sequence length: 273 aa
Steps in homology modeling




1.   Searching for structures related to the query sequence

2.   Selecting templates

3.   Aligning query sequence with template structures

4.   Building a model for the query using information from the
     template structures (Modeller 9.10)

5.   Evaluating the model
1. Searching For Structures

              Get your sequence
>DAPB_ECOLI
MHDANIRVAIAGAGGRMGRQLIQAALALEGVQLGAALEREGSSLLGSDAGEL
AGAGKTGVTVQSSLDAVKDDFDVFIDFTRPEGTLNHLAFCRQHGKGMVIGTT
GFDEAGKQAIRDAAADIAIVFAANFSVGVNVMLKLLEKAAKVMGDYTDIEII
EAHHRHKVDAPSGTALAMGEAIAHALDKDLKDCAVYSREGHTGERVPGTIGF
ATVRAGDIVGEHTAMFADIGERLEITHKASSRMTFANGAVRSALWLSGKESG
LFDMRDVLDLNNL*
PIR Format
>P1;1ARZ/A
sequence:1ARZ/A:1::273::Dihydrodipicolinate reductase:Escherichia coli:0.00:0.00
MHDANIRVAIAGAGGRMGRQLIQAALALEGVQLGAALEREGSSLLGSDAGELAGA
GKTGVTVQSSLDAVKDDFDVFIDFTRPEGTLNHLAFCRQHGKGMVIGTTGFDEAG
KQAIRDAAADIAIVFAANFSVGVNVMLKLLEKAAKVMGDYTDIEIIEAHHRHKVD
APSGTALAMGEAIAHALDKDLKDCAVYSREGHTGERVPGTIGFATVRAGDIVGEH
TAMFADIGERLEITHKASSRMTFANGAVRSALWLSGKESGLFDMRDVLDLNNL*
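Converting a FASTA record into the PIR layout shown above is a small text transformation. The sketch below is a hedged illustration, not part of Modeller: the function name and the `code` argument are made up, and it writes the minimal ten-field description line (type, code, then empty fields, then two numeric placeholders) that Modeller's PIR alignment format expects for a target sequence.

```python
def fasta_to_pir(fasta_text, code, width=75):
    """Convert a single-record FASTA string into a PIR record."""
    lines = [ln.strip() for ln in fasta_text.strip().splitlines()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("expected a FASTA record starting with '>'")
    # Join the sequence lines and drop any terminator already present.
    seq = "".join(lines[1:]).replace("*", "")
    if not seq:
        raise ValueError("FASTA record has no sequence")
    header = ">P1;" + code
    # Ten colon-separated fields; the unknown ones are left empty.
    description = "sequence:" + code + ":::::::0.00: 0.00"
    body = [seq[i:i + width] for i in range(0, len(seq), width)]
    body[-1] += "*"  # PIR sequences end with an asterisk
    return "\n".join([header, description] + body) + "\n"
```

For example, `fasta_to_pir(open("query.fasta").read(), "DAPB")` would produce a record for a hypothetical file `query.fasta`.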
•   Aligning query sequence with template structures

•   Building a model for the query using information from the
    template structures (Modeller 9.10)

•   Modeller 9.10 generates model PDB files based on the
    template structure.

•   Evaluate the resulting structure with SAVES
4. Model evaluation

Examples of assessment approaches:

1. Assessment of the model’s stereochemistry

2. Prediction of unreliable regions of the model -
   “pseudo energy” profile: peaks indicate errors

3. Consistency with experimental observations

4. Consistency with evolutionary conservation rates
Structural Analysis and Verification Server (SAVES)




                          http://nihserver.mbi.ucla.edu/SAVES/
Real vs. model superimposition
Outline to docking………….
 •   Introduction to protein-ligand docking
 •   Scoring functions
 •   Assessing performance
 •   Practical aspects
Protein ligand Docking
• A Structure-Based Drug Design (SBDD) method
    “structure” means “using protein structure”
• Computational method that mimics the binding of a ligand to a protein
• Given...




• Predicts...
   • The pose of the molecule in
      the binding site
   • The binding affinity or a score
      representing the strength of
      binding
Pose vs. binding site
• Binding site (or “active site”)
   •   The part of the protein where the ligand
       binds
   •   Generally a cavity on the protein surface
   •   Can be identified by looking at the
       crystal structure of the protein

• Pose (or “binding mode”)
   • The geometry of the ligand in the binding
     site
   • Geometry = location, orientation and
     conformation
• Protein-ligand docking is not about
  identifying the binding site
Components of docking software
 • Typically, protein-ligand docking software consists
   of two main components which work together:
 1. Search algorithm
     • Generates a large number of poses of a molecule in the
        binding site
 2. Scoring function
     • Calculates a score or binding affinity for a particular pose

 • Together they provide:
     • The pose of the molecule in the binding site
     • The binding affinity or a score representing the
       strength of binding
The perfect scoring function will

• Accurately calculate the binding affinity
   • Will allow actives to be identified in a virtual screen
   • Be able to rank actives in terms of affinity

• Score the poses of an active higher than poses of an
  inactive
   • Will rank actives higher than inactives in a virtual screen

• Score the correct pose of the active higher than an
  incorrect pose of the active
   • Will allow the correct pose of the active to be identified

“actives” = molecules with biological activity
Broadly speaking, scoring functions can be
  divided into the following classes:
  • Forcefield-based
     • Based on terms from molecular mechanics force fields
     • GoldScore, DOCK, AutoDock
  • Empirical
     • Parameterised against experimental binding affinities
     • ChemScore, PLP, Glide SP/XP
  • Knowledge-based potentials
     • Based on statistical analysis of observed pairwise
       distributions
     • PMF, DrugScore, ASP
Böhm’s empirical scoring function
 •   In general, scoring functions assume that the free energy of binding can be written as
     a linear sum of terms to reflect the various contributions to binding


 •   Böhm’s scoring function included contributions from
     hydrogen bonding, ionic interactions, lipophilic
     interactions and the loss of internal conformational
     freedom of the ligand.


 •   The ∆G values on the right of the equation are all constants
 •   ∆G0 is a contribution to the binding energy that does not directly depend on any specific
     interactions with the protein
 •   The hydrogen bonding and ionic terms are both dependent on the geometry of the
     interaction, with large deviations from ideal geometries (ideal distance R, ideal angle α)
     being penalised.
 •   The lipophilic term is proportional to the contact surface area (Alipo) between protein and
     ligand involving non-polar atoms.
 •   The conformational entropy term is the penalty associated with freezing internal rotations
     of the ligand. It is largely entropic in nature. Here the value is directly proportional to the
     number of rotatable bonds in the ligand (NROT).
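The equation itself did not survive on this slide. For reference, the form in which Böhm's empirical function is usually quoted, and which matches the terms described above, is sketched below. Here f(ΔR, Δα) stands for the penalty function on deviations from the ideal distance and angle; the notation is an assumption chosen to agree with the symbols used in the bullets.

```latex
\Delta G_{\mathrm{bind}} = \Delta G_{0}
  + \Delta G_{\mathrm{hb}} \sum_{\text{h-bonds}} f(\Delta R, \Delta\alpha)
  + \Delta G_{\mathrm{ionic}} \sum_{\text{ionic}} f(\Delta R, \Delta\alpha)
  + \Delta G_{\mathrm{lipo}} \, \lvert A_{\mathrm{lipo}} \rvert
  + \Delta G_{\mathrm{rot}} \, N_{\mathrm{ROT}}
```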
Pose prediction accuracy

 •   Accuracy is measured by RMSD (root mean squared deviation) compared to known
     crystal structures
     •    RMSD = square root of the average of (the difference between a particular
          coordinate in the crystal and that coordinate in the pose) squared
     •    An RMSD within 2.0 Å is the usual cut-off for an accurate pose

 •   In general, the best docking software predicts the correct pose about 70% of the
     time
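The RMSD definition above translates directly into code. This is a minimal sketch assuming both coordinate lists contain the same ligand atoms in the same order and in the same coordinate frame (the docked pose and the crystal ligand both sit in the protein's frame, so no superposition step is included).

```python
import math

def rmsd(coords_a, coords_b):
    """Root mean squared deviation between two matched coordinate lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for a, b in zip(coords_a, coords_b):
        # Squared distance between the matched atoms.
        total += sum((x - y) ** 2 for x, y in zip(a, b))
    return math.sqrt(total / len(coords_a))

def is_accurate_pose(pose, crystal, cutoff=2.0):
    """Apply the 2.0 Angstrom cut-off quoted above."""
    return rmsd(pose, crystal) <= cutoff
```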
 •   Need a dataset of Nact known actives, plus a set of inactives
 •   Dock all molecules, and rank each by score
 •   Ideally, all actives would be at the top of the list

 •   Define enrichment, E, as the number of actives found (Nfound) in the top X% of
     scores (typically 1% or 5%), compared with the number expected by chance

       E = Nfound / (Nact * X/100)
       E > 1 implies “positive enrichment”, better than random
       E < 1 implies “negative enrichment”, worse than random
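The enrichment formula above can be computed from a ranked list of active/inactive labels. The sketch below assumes the list is already sorted from best to worst docking score; the function name and the rounding of the top-X% cut-off are illustrative choices.

```python
def enrichment(ranked_is_active, top_percent):
    """E = Nfound / (Nact * X / 100) for a score-ranked list of booleans."""
    n_total = len(ranked_is_active)
    n_act = sum(ranked_is_active)
    if n_total == 0 or n_act == 0:
        raise ValueError("need at least one molecule and one active")
    # Number of molecules in the top X% (at least one).
    n_top = max(1, round(n_total * top_percent / 100.0))
    n_found = sum(ranked_is_active[:n_top])
    expected_by_chance = n_act * top_percent / 100.0
    return n_found / expected_by_chance
```

With 100 molecules, 10 actives and 5 of them in the top 10%, the expected number by chance is 1, so E = 5.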
Protein preparation
• The Protein Data Bank (PDB) is a repository of protein crystal
  structures, often in complexes with inhibitors
• PDB structures often contain water molecules
   • In general, all water molecules are removed except where it is known
     that they play an important role in coordinating to the ligand
• PDB crystal structures generally lack hydrogen atoms
   • Many docking programs require the protein to have explicit hydrogens.
      In general these can be added unambiguously, except in the case of
      acidic/basic side chains
• An incorrect assignment of protonation states in the active site will
  give poor results
• Glutamate and aspartate can be COO- or COOH
   • OH is a hydrogen bond donor, O- is not
• Histidine is a base and its neutral form has two tautomers.
  (Figure: histidine side-chain ring in its two neutral tautomers and its protonated form)
Ligand preparation
A reasonable 3D structure is required as starting point
   •   Even during flexible docking, bond lengths and angles are held
       fixed

The protonation state and tautomeric form of a particular
ligand could influence its hydrogen bonding ability
   •   Either protonate as expected for physiological pH and use a single
       tautomer
   •   Or generate and dock all possible protonation states and
       tautomers, and retain the one with the highest score

   (Figure: enol (OH) and ketone (O) tautomers, interconverted by transfer of H+)
Conclusions

• Computational prediction of protein structure with modeling
  tools saves effort and reduces errors.
• Homology modeling can be applied successfully if the structure
  of a protein with sufficient sequence similarity is known.
• Protein-ligand docking is an essential tool for computational
  drug design
   •   Widely used in pharmaceutical companies
• But it’s not a silver bullet
   •   The perfect scoring function has yet to be found
   •   The performance varies from target to target, and scoring function
       to scoring function
• Care needs to be taken when preparing both the protein and
  the ligands
• The more information you have, the better your chances.
Useful links…..
1. SEARCHING FOR STRUCTURES
        • PDB-Blast at NCBI- http://blast.ncbi.nlm.nih.gov/Blast.cgi
        • Meta server- 3D-Jury http://bioinfo.pl/meta/
        • FFAS03- http://ffas.ljcrf.edu/ffas-cgi/cgi/ffas.pl
        • HHPRED- http://toolkit.tuebingen.mpg.de/hhpred

2. SELECTING TEMPLATES

3. ALIGNING QUERY SEQUENCE WITH TEMPLATE STRUCTURES
         • MSA - MUSCLE, T-coffee and MAFFT at
           http://toolkit.tuebingen.mpg.de/sections/alignment
         • Alignment editor – Bioedit - http://www.mbio.ncsu.edu/BioEdit/bioedit.html

4. BUILDING A MODEL
         • Nest - http://wiki.c2b2.columbia.edu/honiglab_public/index.php/Software:nest
         • Modeller - http://salilab.org/modeller/modeller.html

5. EVALUATING THE MODEL
        • ConSurf http://consurf.tau.ac.il
        • PROCHECK http://www.biochem.ucl.ac.uk/~roman/procheck/procheck.html
        • WHATCHECK www.cmbi.kun.nl/swift/whatcheck/
        • ProSA https://prosa.services.came.sbg.ac.at/prosa.php
        • ProQ http://www.sbc.su.se/~bjornw/ProQ/ProQ.cgi
Computer Aided Molecular Modeling

Computer Aided Molecular Modeling

  • 1. P. K. Choudhury (Ph. D, 1st year, Dairy Microbiology)
  • 2. The druggable genome Human genome Polysaccharides Lipids Nucleic Acids Proteins Proteins with binding site Druggable genome: Subset of genes which express proteins capable of binding small drug-like molecules
  • 3. Protein Structure Prediction Why predict protein structure if we can use experimental tools to determine it? • Experimental methods are slow and expensive • Some structures were failed to be solved • A representative family structure can suffice to deduce structures of the entire family sequences
  • 4. Outline to modeling……………… 1. Introduction to protein structure and databases 2. Structure prediction approaches • Ab-initio • Threading • Homology modeling 3. Hands on molecular modeling 4. Model evaluation
  • 5. 1. Protein structure and databases Protein structure is hierarchic:
  • 6. • Pauling built models based on the following principles, codified by G. N. Ramachandran: • Bond lengths and angles -should be similar to those found in individual amino acids and small peptides • Peptide bond -should be planer • Overlaps-not permitted, pairs of atoms no closer than sum of their covalent radii • Stabilization-have sterics that permit hydrogen bonding • Two degrees of freedom: •  (phi) angle = rotation about N-C •  (psi) angle = rotation about C-C • A linear amino acid polymer with some folds is better but still not functional nor completely energetically favorable packing!
  • 8. SCOP-Fold classification All alpha (α) All beta (β) Alpha and beta(α, β)
  • 9. Databases • RCSB - the Protein Data Bank - all deposited structures • Experimentally-determined structures of proteins, nucleic acids, and complex assemblies. • About 65,000 structures at the time of the presentation • UniProt - main sequence database o Swiss-Prot o TrEMBL Collaborations: European Bioinformatics Institute (EBI), Swiss Institute of Bioinformatics (SIB), Protein Information Resource (PIR) • NCBI - many databases, including sequence and structure databases • PDBsum - combines structural and sequence data
  • 10. 2. Structure Prediction Approaches o Ab-initio fold prediction • Not based on similarity to a sequence of known structure o Threading (Fold Recognition) • Requires the query to adopt a fold similar to a known structure o Homology modeling • Based on sequence similarity with a protein for which a structure has been solved.
  • 11. a. Ab initio modeling o Structure prediction from “first principles”: o Shows that we understand the process. o Given only the sequence, try to predict the structure based on physico-chemical properties • The force field • Molecular dynamics • Minimal energy
  • 12. Force field • Mathematical expressions describing the potential energy of a molecular system • Each expression describes a different type of physico-chemical interaction between atoms in the system: • Van der Waals forces, covalent bonds, hydrogen bonds, charges, hydrophobic effects Molecular dynamics • Simulates the forces that govern the protein in water. • Since proteins usually fold spontaneously, the simulation should in principle lead to the native protein structure.
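To make "mathematical expressions describing the potential energy" concrete, here is a minimal sketch of two textbook non-bonded terms: a 12-6 Lennard-Jones potential for van der Waals interactions and a Coulomb term for charges. These functional forms are standard, but the slides do not say which force field or parameters were used, so the function names, the parameter values and the conversion constant (about 332.06 kcal·Å/(mol·e²), which differs slightly between packages) are assumptions for illustration only.

```python
def lennard_jones(r, epsilon, sigma):
    """12-6 Lennard-Jones energy at distance r (same length unit as sigma)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)

def coulomb(r, q1, q2, k=332.06):
    """Coulomb energy in kcal/mol for charges in units of e and r in angstroms.

    k is an approximate conversion constant (assumed value).
    """
    return k * q1 * q2 / r

# Illustrative parameters only, not taken from any specific force field.
epsilon, sigma = 0.15, 3.4
r_min = 2.0 ** (1.0 / 6.0) * sigma  # distance of the energy minimum
print(lennard_jones(r_min, epsilon, sigma))  # close to -epsilon
print(coulomb(3.0, +1.0, -1.0))              # attractive, so negative
```

A full force field sums terms like these over all atom pairs and adds bonded terms (bond stretching, angle bending, torsions).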
  • 13. Minimal Energy Assumption: the folded form is the minimal energy conformation of a protein • Use of simplified energy function • Search methods for minimal energy conformation: • Greedy search • Simulated annealing
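As a rough illustration of the simulated annealing search named above, the sketch below minimizes a toy one-dimensional "energy" with several local minima. It is not a protein energy function: `toy_energy`, the step size, the temperature schedule and the seed are all assumptions chosen only to show the accept/reject logic (downhill moves are always accepted, uphill moves with probability exp(-ΔE/T)).

```python
import math
import random

def toy_energy(x):
    """Toy rugged landscape: a parabola with a sine ripple (illustrative only)."""
    return 0.1 * x * x + math.sin(3.0 * x)

def simulated_annealing(energy, x0, step=0.5, t_start=5.0, t_end=1e-3,
                        n_steps=5000, seed=0):
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    for i in range(n_steps):
        # Geometric cooling from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (i / (n_steps - 1))
        x_new = x + rng.uniform(-step, step)
        e_new = energy(x_new)
        # Metropolis criterion: occasionally accept uphill moves.
        if e_new <= e or rng.random() < math.exp(-(e_new - e) / t):
            x, e = x_new, e_new
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e

best_x, best_e = simulated_annealing(toy_energy, x0=4.0)
print(best_x, best_e)
```

A greedy search corresponds to the same loop with the uphill branch removed, which is why it tends to get trapped in the nearest local minimum.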
  • 14. b. Threading (Fold recognition) Given a sequence and a library of folds, thread the sequence through each fold. Take the one with the highest score (e.g. I-TASSER). • The method will fail if the new protein does not belong to any fold in the library. • The threading score is computed from known physico-chemical properties and statistics of amino acids.
  • 16. c. Homology Modeling………………… o A protein structure is defined by its amino acid sequence. o Closely related sequences adopt highly similar structures; distantly related sequences may still fold into similar structures. o The three-dimensional structure of proteins from the same family is more conserved than their sequences. Example: triosephosphate isomerases, 44.7% sequence identity, 0.95 Å RMSD
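The "sequence identity" figure quoted for a pair of homologs is computed from a pairwise alignment. A minimal sketch follows, assuming the two sequences are already aligned with "-" as the gap character and that identity is taken over columns where neither sequence has a gap; other denominators (for example the shorter sequence length) are also in use, so the numbers from different tools may differ. The function name is hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over gap-free columns of two aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)

# Toy alignment (made up for illustration): 6 matches in 7 gap-free columns.
print(percent_identity("MHDA-NIR", "MHEAANIR"))
```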
  • 17. 3. Hands on molecular modeling Homology
  • 18. The Query Protein Name: Dihydrodipicolinate reductase Enzyme reaction: Molecular process: Lysine biosynthesis (early stages) Organism: E. coli Sequence length: 273 aa
  • 19. Steps in homology modeling 1. Searching for structures related to the query sequence 2. Selecting templates 3. Aligning query sequence with template structures 4. Building a model for the query using information from the template structures (Modeller 9.10) 5. Evaluating the model
  • 20. 1. Searching For Structures Get your sequence >DAPB_ECOLI MHDANIRVAIAGAGGRMGRQLIQAALALEGVQLGAALEREGSSLLGSDAGEL AGAGKTGVTVQSSLDAVKDDFDVFIDFTRPEGTLNHLAFCRQHGKGMVIGTT GFDEAGKQAIRDAAADIAIVFAANFSVGVNVMLKLLEKAAKVMGDYTDIEII EAHHRHKVDAPSGTALAMGEAIAHALDKDLKDCAVYSREGHTGERVPGTIGF ATVRAGDIVGEHTAMFADIGERLEITHKASSRMTFANGAVRSALWLSGKESG LFDMRDVLDLNNL*
  • 27. Get your sequence >DAPB_ECOLI MHDANIRVAIAGAGGRMGRQLIQAALALEGVQLGAALEREGSSLLGSDAGELAGAGKT GVTVQSSLDAVKDDFDVFIDFTRPEGTLNHLAFCRQHGKGMVIGTTGFDEAGKQAIRD AAADIAIVFAANFSVGVNVMLKLLEKAAKVMGDYTDIEIIEAHHRHKVDAPSGTALAM GEAIAHALDKDLKDCAVYSREGHTGERVPGTIGFATVRAGDIVGEHTAMFADIGERLE ITHKASSRMTFANGAVRSALWLSGKESGLFDMRDVLDLNNL PIR Format >P1;1ARZ/A Sequence;1ARZ/A: 1 :: 273:: Dihydrodipicolinate reductase:: Escherichia Coli:0.00:0.00 MHDANIRVAIAGAGGRMGRQLIQAALALEGVQLGAALEREGSSLLGSDAGELAGA GKTGVTVQSSLDAVKDDFDVFIDFTRPEGTLNHLAFCRQHGKGMVIGTTGFDEAG KQAIRDAAADIAIVFAANFSVGVNVMLKLLEKAAKVMGDYTDIEIIEAHHRHKVD APSGTALAMGEAIAHALDKDLKDCAVYSREGHTGERVPGTIGFATVRAGDIVGEH TAMFADIGERLEITHKASSRMTFANGAVRSALWLSGKESGLFDMRDVLDLNNL*
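The FASTA-to-PIR step can be scripted. The sketch below writes a query entry in the PIR layout that Modeller alignment files are generally described as using: a `>P1;code` line, a description line with ten colon-separated fields beginning with `sequence`, and the residues terminated by `*`. This is an assumption based on the Modeller documentation as commonly cited, not on the slide itself, so the field layout should be checked against the manual for the installed version; `fasta_to_pir` is a hypothetical helper name.

```python
def fasta_to_pir(fasta_text, code, width=60):
    """Convert a single-record FASTA string to a PIR entry for a query sequence."""
    lines = [ln.strip() for ln in fasta_text.strip().splitlines()]
    # Join sequence lines, dropping the header, internal spaces and any trailing '*'.
    seq = "".join(ln for ln in lines if ln and not ln.startswith(">"))
    seq = seq.replace(" ", "").rstrip("*")
    # Ten fields: type, code, first residue, chain, last residue, chain,
    # protein name, source, resolution, R-factor (blank ones left empty).
    fields = ["sequence", code, "", "", "", "", "", "", "0.00", "0.00"]
    body = "\n".join(seq[i:i + width] for i in range(0, len(seq), width))
    return ">P1;" + code + "\n" + ":".join(fields) + "\n" + body + "*\n"

# First residues of the query from the slide, truncated for brevity.
example = ">DAPB_ECOLI\nMHDANIRVAIAGAGGRMGRQ\nLIQAALALEG*"
print(fasta_to_pir(example, "DAPB_ECOLI"))
```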
  • 28. Aligning query sequence with template structures • Building a model for the query using information from the template structures (Modeller 9.10) • Modeller 9.10 will generate PDB files with reference to the template structure. • Evaluation of the structure in SAVES
  • 29. 4. Model evaluation Examples of assessment approaches: 1. Assessment of the model’s stereochemistry 2. Prediction of unreliable regions of the model - “pseudo energy” profile: peaks → errors 3. Consistency with experimental observations 4. Consistency with evolutionary conservation rates
  • 30. Structural Analysis Verification Server http://nihserver.mbi.ucla.edu/SAVES/
  • 32. Real vs. model superimposition
  • 34. Outline to docking…………. • Introduction to protein-ligand docking • Scoring functions • Assessing performance • Practical aspects
  • 35. Protein ligand Docking • A Structure-Based Drug Design (SBDD) method “structure” means “using protein structure” • Computational method that mimics the binding of a ligand to a protein • Given... • Predicts... • The pose of the molecule in the binding site • The binding affinity or a score representing the strength of binding
  • 36. Pose vs. binding site • Binding site (or “active site”) • The part of the protein where the ligand binds • Generally a cavity on the protein surface • Can be identified by looking at the crystal structure of the protein • Pose (or “binding mode”) • The geometry of the ligand in the binding site • Geometry = location, orientation and conformation • Protein-ligand docking is not about identifying the binding site
  • 37. Outline to docking…………. • Introduction to protein-ligand docking • Scoring functions • Assessing performance • Practical aspects
  • 38. Components of docking software • Typically, protein-ligand docking software consists of two main components which work together: 1. Search algorithm • Generates a large number of poses of a molecule in the binding site 2. Scoring function • Calculates a score or binding affinity for a particular pose • Together they provide: • The pose of the molecule in the binding site • The binding affinity or a score representing the strength of binding
  • 39. The perfect scoring function will • Accurately calculate the binding affinity • Will allow actives to be identified in a virtual screen • Be able to rank actives in terms of affinity • Score the poses of an active higher than poses of an inactive • Will rank actives higher than inactives in a virtual screen • Score the correct pose of the active higher than an incorrect pose of the active • Will allow the correct pose of the active to be identified “actives” = molecules with biological activity
  • 40. Broadly speaking, scoring functions can be divided into the following classes: • Forcefield-based • Based on terms from molecular mechanics force fields • GoldScore, DOCK, AutoDock • Empirical • Parameterised against experimental binding affinities • ChemScore, PLP, Glide SP/XP • Knowledge-based potentials • Based on statistical analysis of observed pairwise distributions • PMF, DrugScore, ASP
  • 41. Böhm’s empirical scoring function • In general, scoring functions assume that the free energy of binding can be written as a linear sum of terms that reflect the various contributions to binding • Böhm’s scoring function included contributions from hydrogen bonding, ionic interactions, lipophilic interactions and the loss of internal conformational freedom of the ligand. • The ∆G values on the right of the equation are all constants • ∆G0 is a contribution to the binding energy that does not directly depend on any specific interactions with the protein • The hydrogen bonding and ionic terms both depend on the geometry of the interaction, with large deviations from ideal geometries (ideal distance R, ideal angle α) being penalised. • The lipophilic term is proportional to the contact surface area (Alipo) between protein and ligand involving non-polar atoms. • The conformational term is the penalty associated with freezing internal rotations of the ligand. It is largely entropic in nature. Here the value is directly proportional to the number of rotatable bonds in the ligand (NROT).
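The equation itself did not survive in this transcript. The form below is the commonly cited version of Böhm's function and is consistent with the terms described on the slide; it is given here as a reconstruction, with the penalty function written as f(ΔR, Δα) for deviations from the ideal distance and angle, and should be checked against the original publication.

```latex
\Delta G_{\mathrm{bind}} = \Delta G_{0}
  + \Delta G_{\mathrm{hb}} \sum_{\text{h-bonds}} f(\Delta R, \Delta\alpha)
  + \Delta G_{\mathrm{ionic}} \sum_{\text{ionic int.}} f(\Delta R, \Delta\alpha)
  + \Delta G_{\mathrm{lipo}} \, \lvert A_{\mathrm{lipo}} \rvert
  + \Delta G_{\mathrm{rot}} \, N_{\mathrm{ROT}}
```

Each ΔG coefficient on the right is a constant fitted to experimental binding affinities, which is what makes the function "empirical".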
  • 42. Outline to docking…………. • Introduction to protein-ligand docking • Scoring functions • Assessing performance • Practical aspects
  • 43. Pose prediction accuracy • Accuracy measured by RMSD (root mean squared deviation) compared to known crystal structures • RMSD = square root of the average of (the difference between a particular coordinate in the crystal and that coordinate in the pose)² • A pose within 2.0 Å RMSD is considered accurate • In general, the best docking software predicts the correct pose about 70% of the time Virtual screening performance • Need a dataset of Nact known actives, and inactives • Dock all molecules, and rank each by score • Ideally, all actives would be at the top of the list • Define enrichment, E, as the number of actives found (Nfound) in the top X% of scores (typically 1% or 5%), compared to how many would be expected by chance • E = Nfound / (Nact * X/100) • E > 1 implies “positive enrichment”, better than random • E < 1 implies “negative enrichment”, worse than random
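Both measures on this slide are short formulas, sketched below in plain Python. The RMSD here averages squared atom-to-atom distances over matched atoms, which is the usual convention, and assumes the two coordinate lists are in the same atom order and the same reference frame (no superposition is performed). The enrichment function follows the slide's definition E = Nfound / (Nact * X/100); the function names and the toy data are assumptions.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD between two equally ordered lists of (x, y, z) tuples."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for a, b in zip(coords_a, coords_b):
        total += sum((x - y) ** 2 for x, y in zip(a, b))
    return math.sqrt(total / len(coords_a))

def enrichment(ranked_is_active, top_percent):
    """Enrichment E for a list of booleans sorted from best to worst score."""
    n_act = sum(ranked_is_active)
    n_top = max(1, round(len(ranked_is_active) * top_percent / 100))
    n_found = sum(ranked_is_active[:n_top])
    return n_found / (n_act * top_percent / 100)

# Toy pose displaced by 2 angstroms along z relative to the "crystal" pose.
print(rmsd([(0, 0, 0), (1, 0, 0)], [(0, 0, 2), (1, 0, 2)]))

# Toy screen: 100 molecules, 10 actives, 5 of them ranked in the top 10%.
ranked = [True] * 5 + [False] * 5 + [True] * 5 + [False] * 85
print(enrichment(ranked, 10))
```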
  • 44. Outline to docking…………. • Introduction to protein-ligand docking • Scoring functions • Assessing performance • Practical aspects
  • 45. Protein preparation • The Protein Data Bank (PDB) is a repository of protein crystal structures, often in complexes with inhibitors • PDB structures often contain water molecules • In general, all water molecules are removed except where it is known that they play an important role in coordinating to the ligand • PDB crystal structures generally lack hydrogen atoms • Many docking programs require the protein to have explicit hydrogens. In general these can be added unambiguously, except in the case of acidic/basic side chains • An incorrect assignment of protonation states in the active site will give poor results • Glutamate and aspartate can be COO- or COOH • OH is a hydrogen bond donor, O- is not • Histidine is a base and its neutral form has two tautomers. (Figure: histidine protonation states and tautomers)
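The water-removal step can be sketched as a simple filter over PDB-format text. The example assumes the standard fixed-column PDB layout (residue name in columns 18-20, residue number in columns 23-26) and that waters are named HOH or WAT; `strip_waters` and its `keep_resseqs` argument are hypothetical names, and a real workflow would also need to consider chain IDs, alternate locations and the docking program's own preparation tools.

```python
def strip_waters(pdb_text, keep_resseqs=()):
    """Drop water records from PDB text, keeping waters whose residue
    number is listed in keep_resseqs (e.g. waters bridging the ligand)."""
    keep = set(keep_resseqs)
    kept_lines = []
    for line in pdb_text.splitlines():
        is_atom_record = line.startswith(("ATOM", "HETATM"))
        if is_atom_record and line[17:20] in ("HOH", "WAT"):
            res_seq = int(line[22:26])  # columns 23-26, 1-based
            if res_seq not in keep:
                continue  # discard this water
        kept_lines.append(line)
    return "\n".join(kept_lines)

# Toy records (coordinates omitted for brevity; only the leading columns matter here).
pdb = "\n".join([
    "ATOM      1  N   MET A   1",
    "HETATM 2001  O   HOH A 301",
    "HETATM 2002  O   HOH A 302",
])
print(strip_waters(pdb, keep_resseqs=[302]))
```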
  • 46. Ligand preparation • A reasonable 3D structure is required as starting point • Even during flexible docking, bond lengths and angles are held fixed • The protonation state and tautomeric form of a particular ligand could influence its hydrogen bonding ability • Either protonate as expected for physiological pH and use a single tautomer • Or generate and dock all possible protonation states and tautomers, and retain the one with the highest score (Figure: enol and ketone tautomers)
  • 48. Conclusions • Computational prediction of protein structure using modeling tools saves effort and reduces errors. • Homology modeling can be applied successfully if the structure of a protein with similar sequence is known. • Protein-ligand docking is an essential tool for computational drug design • Widely used in pharmaceutical companies • But it’s not a silver bullet • The perfect scoring function has yet to be found • The performance varies from target to target, and scoring function to scoring function • Care needs to be taken when preparing both the protein and the ligands • The more information you have, the better your chances.
  • 49. Useful links….. 1. SEARCHING FOR STRUCTURES • PDB-Blast at NCBI - http://blast.ncbi.nlm.nih.gov/Blast.cgi • Meta server - 3D-Jury http://bioinfo.pl/meta/ • FFAS03 - http://ffas.ljcrf.edu/ffas-cgi/cgi/ffas.pl • HHPRED - http://toolkit.tuebingen.mpg.de/hhpred 2. SELECTING TEMPLATES 3. ALIGNING QUERY SEQUENCE WITH TEMPLATE STRUCTURES • MSA - MUSCLE, T-Coffee and MAFFT at http://toolkit.tuebingen.mpg.de/sections/alignment • Alignment editor - BioEdit - http://www.mbio.ncsu.edu/BioEdit/bioedit.html 4. BUILDING A MODEL • Nest - http://wiki.c2b2.columbia.edu/honiglab_public/index.php/Software:nest • Modeller - http://salilab.org/modeller/modeller.html 5. EVALUATING THE MODEL • ConSurf http://consurf.tau.ac.il • PROCHECK http://www.biochem.ucl.ac.uk/~roman/procheck/procheck.html • WHATCHECK www.cmbi.kun.nl/swift/whatcheck/ • ProSA https://prosa.services.came.sbg.ac.at/prosa.php • ProQ http://www.sbc.su.se/~bjornw/ProQ/ProQ.cgi