Computer Aided Molecular Modeling


Published in: Education, Technology
Computer Aided Molecular Modeling

  1. P. K. Choudhury (Ph.D., 1st year, Dairy Microbiology)
  2. The druggable genome
     • Diagram labels: Human genome; Polysaccharides; Lipids; Nucleic acids; Proteins; Proteins with binding site
     • Druggable genome: the subset of genes that express proteins capable of binding small drug-like molecules
  3. Protein Structure Prediction
     Why predict protein structure if we can use experimental tools to determine it?
     • Experimental methods are slow and expensive
     • Some structures could not be solved experimentally
     • One representative structure can suffice to deduce the structures of all the sequences in a family
  4. Outline to modeling
     1. Introduction to protein structure and databases
     2. Structure prediction approaches
        • Ab initio
        • Threading
        • Homology modeling
     3. Hands-on molecular modeling
     4. Model evaluation
  5. 1. Protein structure and databases
     Protein structure is hierarchical.
  6. Pauling built models based on the following principles, codified by G. N. Ramachandran:
     • Bond lengths and angles: should be similar to those found in individual amino acids and small peptides
     • Peptide bond: should be planar
     • Overlaps: not permitted; pairs of atoms may be no closer than the sum of their covalent radii
     • Stabilization: sterics should permit hydrogen bonding
     Two degrees of freedom:
     • φ (phi) angle = rotation about the N-Cα bond
     • ψ (psi) angle = rotation about the Cα-C bond
     A linear amino acid polymer with some folds is better, but it is still neither functional nor packed in a completely energetically favorable way.
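The two backbone degrees of freedom named above are dihedral (torsion) angles, each defined by four consecutive atoms. The sketch below computes a dihedral angle from four 3D points with the standard atan2 formulation. It assumes the usual atom choices, which the slides do not spell out: φ uses C(i-1), N(i), Cα(i), C(i) and ψ uses N(i), Cα(i), C(i), N(i+1). The function name `dihedral` and the sample coordinates are illustrative only.

```python
import math


def dihedral(p0, p1, p2, p3):
    """Return the signed dihedral angle in degrees for four 3D points.

    For phi pass C(i-1), N(i), CA(i), C(i); for psi pass N(i), CA(i), C(i), N(i+1)
    (assumed convention, not stated in the slides).
    """
    def sub(a, b):
        return tuple(x - y for x, y in zip(a, b))

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])

    b0 = sub(p1, p0)          # first bond vector
    b1 = sub(p2, p1)          # central bond (rotation axis)
    b2 = sub(p3, p2)          # last bond vector
    n1 = cross(b0, b1)        # normal of the plane through p0, p1, p2
    n2 = cross(b1, b2)        # normal of the plane through p1, p2, p3
    x = dot(n1, n2)
    y = math.sqrt(dot(b1, b1)) * dot(b0, n2)
    return math.degrees(math.atan2(y, x))


# Toy geometries (not real protein coordinates): cis, trans and a quarter turn.
cis = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
trans = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
quarter = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
```

Computing φ and ψ for every residue of a structure and plotting one against the other gives the Ramachandran plot shown on the next slide.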
  7. Ramachandran Plot
  8. SCOP fold classification: all alpha (α); all beta (β); alpha and beta (α, β)
  9. Databases
     • RCSB Protein Data Bank (PDB): all deposited structures
       o Experimentally determined structures of proteins, nucleic acids, and complex assemblies
       o Currently holds 65,000 structures
     • UniProt: main sequence database
       o Swiss-Prot
       o TrEMBL
       o Collaborations: European Bioinformatics Institute (EBI), Swiss Institute of Bioinformatics (SIB), Protein Information Resource (PIR)
     • NCBI: many databases, including sequences and structures
     • PDBsum: combines structural and sequence data
  10. 2. Structure Prediction Approaches
      • Ab initio fold prediction
        o Not based on similarity to a known sequence or structure
      • Threading (fold recognition)
        o Requires that the target's structure be similar to a known structure
      • Homology modeling
        o Based on sequence similarity with a protein for which a structure has been solved
  11. a. Ab initio modeling
      • Structure prediction from “first principles”
      • Shows that we understand the process
      • Given only the sequence, try to predict the structure based on physico-chemical properties:
        o The force field
        o Molecular dynamics
        o Minimal energy
  12. Force field
      • Mathematical expressions describing the potential energy of a molecular system
      • Each expression describes a different type of physico-chemical interaction between atoms in the system:
        o Van der Waals forces, covalent bonds, hydrogen bonds, charges, hydrophobic effects
      Molecular dynamics
      • Simulates the forces that govern the protein in water
      • Since proteins usually fold naturally, this would lead to the native protein structure
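To make "mathematical expressions describing the potential energy" concrete, here is a minimal sketch of three common functional forms: a harmonic term for covalent bonds, a 12-6 Lennard-Jones term for van der Waals interactions, and a Coulomb term for charges. The slides do not say which force field or functional forms are meant, so these are textbook assumptions; all parameter values in the example are made up, and the Coulomb constant (about 332.06 kcal·Å/(mol·e²)) is an approximate conversion factor.

```python
import math


def harmonic_bond(r, k, r0):
    """Covalent bond stretch: energy rises quadratically away from the ideal length r0."""
    return 0.5 * k * (r - r0) ** 2


def lennard_jones(r, epsilon, sigma):
    """12-6 van der Waals term: repulsive at short range, weakly attractive further out."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def coulomb(r, q1, q2, k=332.06):
    """Electrostatic term; k converts e^2/Angstrom to kcal/mol (approximate value)."""
    return k * q1 * q2 / r


def nonbonded_energy(coords, charges, epsilon, sigma):
    """Sum Lennard-Jones and Coulomb terms over all atom pairs.

    Simplification for illustration: every pair shares the same epsilon and sigma.
    """
    total = 0.0
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            r = math.dist(coords[i], coords[j])
            total += lennard_jones(r, epsilon, sigma)
            total += coulomb(r, charges[i], charges[j])
    return total


# Hypothetical three-atom system with made-up parameters.
energy = nonbonded_energy(
    coords=[(0.0, 0.0, 0.0), (3.5, 0.0, 0.0), (0.0, 4.0, 0.0)],
    charges=[0.2, -0.2, 0.0],
    epsilon=0.1,
    sigma=3.4,
)
```

A real force field adds angle, torsion and other terms and uses per-atom-type parameters; a molecular dynamics program then derives forces from the gradient of this energy.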
  13. Minimal Energy
      Assumption: the folded form is the minimal-energy conformation of a protein
      • Use of a simplified energy function
      • Search methods for the minimal-energy conformation:
        o Greedy search
        o Simulated annealing
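As an illustration of the search idea, the following sketch runs simulated annealing on a one-dimensional toy energy function with several local minima. It is not a protein energy function: `toy_energy`, the step size, the temperatures and the cooling rate are all arbitrary assumptions chosen for the demonstration. Setting the temperature near zero would reduce the acceptance rule to a greedy search that only accepts downhill moves.

```python
import math
import random


def toy_energy(x):
    """Made-up 1D landscape with several local minima (stand-in for a real energy function)."""
    return 0.1 * x * x + math.sin(3.0 * x)


def simulated_annealing(energy, x0, step=0.5, t_start=5.0, t_end=1e-3,
                        cooling=0.995, seed=0):
    """Minimise `energy` starting from x0 using the Metropolis acceptance rule."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    t = t_start
    while t > t_end:
        candidate = x + rng.uniform(-step, step)
        e_candidate = energy(candidate)
        # Always accept downhill moves; accept uphill moves with probability exp(-dE/T).
        if e_candidate <= e or rng.random() < math.exp(-(e_candidate - e) / t):
            x, e = candidate, e_candidate
            if e < best_e:
                best_x, best_e = x, e
        t *= cooling  # geometric cooling schedule
    return best_x, best_e


best_x, best_e = simulated_annealing(toy_energy, x0=4.0)
```

Accepting occasional uphill moves at high temperature is what lets the search escape local minima that would trap a purely greedy method.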
  14. b. Threading (fold recognition)
      Given a sequence and a library of folds, thread the sequence through each fold. Take the one with the highest score (e.g., I-TASSER).
      • The method will fail if the new protein does not belong to any fold in the library.
      • The threading score is computed from known physico-chemical properties and statistics of amino acids.
  15. c. Homology Modeling
      • A protein structure is defined by its amino acid sequence.
      • Closely related sequences adopt highly similar structures; distantly related sequences may still fold into similar structures.
      • The three-dimensional structure of proteins from the same family is more conserved than their sequences.
      • Figure: triosephosphate isomerases, 44.7% sequence identity, 0.95 RMSD
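Sequence identity, as quoted for the triosephosphate isomerase pair, is computed from a pairwise alignment. The sketch below counts identical residues over aligned (non-gap) columns. This is one common convention and an assumption here, since other definitions divide by the shorter sequence length or by the full alignment length; the function name and example strings are illustrative.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    aligned_columns = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        aligned_columns += 1
        if a == b:
            matches += 1
    if aligned_columns == 0:
        return 0.0
    return 100.0 * matches / aligned_columns


# Made-up six-column alignment: four identical residues out of five aligned columns.
identity = percent_identity("ACDE-G", "ACEEFG")
```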
  16. 3. Hands-on molecular modeling: Homology
  17. The Query Protein
      • Name: Dihydrodipicolinate reductase
      • Enzyme reaction:
      • Molecular process: Lysine biosynthesis (early stages)
      • Organism: E. coli
      • Sequence length: 273 aa
  18. Steps in homology modeling
      1. Searching for structures related to the query sequence
      2. Selecting templates
      3. Aligning the query sequence with the template structures
      4. Building a model for the query using information from the template structures (Modeller 9.10)
      5. Evaluating the model
  21. • Aligning the query sequence with the template structures
      • Building a model for the query using information from the template structures (Modeller 9.10)
      • Modeller 9.10 generates PDB files for the model, built with reference to the template structure.
      • Evaluation of the structure in SAVES
  22. 4. Model evaluation
      Examples of assessment approaches:
      1. Assessment of the model’s stereochemistry
      2. Prediction of unreliable regions of the model: “pseudo energy” profile, where peaks indicate errors
      3. Consistency with experimental observations
      4. Consistency with evolutionary conservation rates
  23. Structural Analysis Verification Server (SAVES)
  24. Real vs. model superimposition
  25. Outline to docking
      • Introduction to protein-ligand docking
      • Scoring functions
      • Assessing performance
      • Practical aspects
  26. Protein-ligand docking
      • A Structure-Based Drug Design (SBDD) method; “structure” means “using protein structure”
      • Computational method that mimics the binding of a ligand to a protein
      • Given...
      • Predicts...
        o The pose of the molecule in the binding site
        o The binding affinity or a score representing the strength of binding
  27. Pose vs. binding site
      • Binding site (or “active site”)
        o The part of the protein where the ligand binds
        o Generally a cavity on the protein surface
        o Can be identified by looking at the crystal structure of the protein
      • Pose (or “binding mode”)
        o The geometry of the ligand in the binding site
        o Geometry = location, orientation and conformation
      • Protein-ligand docking is not about identifying the binding site
  28. Outline to docking
      • Introduction to protein-ligand docking
      • Scoring functions
      • Assessing performance
      • Practical aspects
  29. Components of docking software
      • Typically, protein-ligand docking software consists of two main components which work together:
        1. Search algorithm
           o Generates a large number of poses of a molecule in the binding site
        2. Scoring function
           o Calculates a score or binding affinity for a particular pose
      • Together they provide:
        o The pose of the molecule in the binding site
        o The binding affinity or a score representing the strength of binding
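The division of labour between the two components can be sketched in a few lines. The example below is deliberately a toy: the "pose" is just a 3D position (no orientation or conformation), the search is plain random sampling, and `toy_score` rewards closeness to a hypothetical pocket centre. None of this reflects how any particular docking program works; it only shows a search routine proposing poses and a scoring function ranking them.

```python
import random


def toy_score(pose, pocket=(2.0, -1.0, 3.0)):
    """Hypothetical scoring function: higher (less negative) means closer to the pocket centre."""
    return -sum((p - c) ** 2 for p, c in zip(pose, pocket))


def random_search(score, n_poses=2000, box=10.0, seed=0):
    """Search algorithm: sample many candidate poses and keep the best-scoring one."""
    rng = random.Random(seed)
    best_pose, best_score = None, float("-inf")
    for _ in range(n_poses):
        pose = tuple(rng.uniform(-box, box) for _ in range(3))
        s = score(pose)  # the scoring function evaluates each generated pose
        if s > best_score:
            best_pose, best_score = pose, s
    return best_pose, best_score


best_pose, best_score = random_search(toy_score)
```

Because the search only sees the scores it is given, any weakness in the scoring function carries straight through to the predicted pose.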
  30. The perfect scoring function will
      • Accurately calculate the binding affinity
        o Will allow actives to be identified in a virtual screen
        o Be able to rank actives in terms of affinity
      • Score the poses of an active higher than poses of an inactive
        o Will rank actives higher than inactives in a virtual screen
      • Score the correct pose of the active higher than an incorrect pose of the active
        o Will allow the correct pose of the active to be identified
      “Actives” = molecules with biological activity
  31. Broadly speaking, scoring functions can be divided into the following classes:
      • Force-field-based
        o Based on terms from molecular mechanics force fields
        o GoldScore, DOCK, AutoDock
      • Empirical
        o Parameterised against experimental binding affinities
        o ChemScore, PLP, Glide SP/XP
      • Knowledge-based potentials
        o Based on statistical analysis of observed pairwise distributions
        o PMF, DrugScore, ASP
  32. Böhm’s empirical scoring function
      • In general, scoring functions assume that the free energy of binding can be written as a linear sum of terms to reflect the various contributions to binding
      • Böhm’s scoring function included contributions from hydrogen bonding, ionic interactions, lipophilic interactions and the loss of internal conformational freedom of the ligand.
      • The ΔG values on the right of the equation are all constants
      • ΔG0 is a contribution to the binding energy that does not directly depend on any specific interactions with the protein
      • The hydrogen bonding and ionic terms are both dependent on the geometry of the interaction, with large deviations from ideal geometries (ideal distance R, ideal angle α) being penalised.
      • The lipophilic term is proportional to the contact surface area (Alipo) between protein and ligand involving non-polar atoms.
      • The conformational entropy term is the penalty associated with freezing internal rotations of the ligand. It is largely entropic in nature. Here the value is directly proportional to the number of rotatable bonds in the ligand (NROT).
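The equation these bullets describe did not survive in the transcript. A reconstruction from the bullets, following the commonly cited form of Böhm's function, is given below; the penalty function f(ΔR, Δα) and the summation labels are written as they usually appear and should be read as an assumption about what the slide showed.

```latex
\Delta G_{\mathrm{bind}} = \Delta G_{0}
  + \Delta G_{\mathrm{hb}} \sum_{\text{h-bonds}} f(\Delta R, \Delta\alpha)
  + \Delta G_{\mathrm{ionic}} \sum_{\text{ionic}} f(\Delta R, \Delta\alpha)
  + \Delta G_{\mathrm{lipo}} \, \lvert A_{\mathrm{lipo}} \rvert
  + \Delta G_{\mathrm{rot}} \, \mathrm{NROT}
```

Here ΔR and Δα are the deviations from the ideal distance R and ideal angle α, so f equals 1 for ideal geometry and falls towards 0 as the deviation grows.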
  33. Outline to docking
      • Introduction to protein-ligand docking
      • Scoring functions
      • Assessing performance
      • Practical aspects
  34. Pose prediction accuracy
      • Accuracy is measured by RMSD (root mean squared deviation) compared to known crystal structures
        o RMSD = square root of the average of the squared differences between each coordinate in the crystal structure and the corresponding coordinate in the pose
        o An RMSD within 2.0 Å is considered the cut-off for accuracy
      • In general, the best docking software predicts the correct pose about 70% of the time
      • Need a dataset of Nact known actives, and inactives
      • Dock all molecules, and rank each by score
      • Ideally, all actives would be at the top of the list
      • Define enrichment, E, as the number of actives found (Nfound) in the top X% of scores (typically 1% or 5%), compared to how many would be expected by chance
        o E = Nfound / (Nact * X/100)
        o E > 1 implies “positive enrichment”, better than random
        o E < 1 implies “negative enrichment”, worse than random
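Both measures on this slide are short calculations. The sketch below assumes that RMSD is averaged per atom over matched atom pairs with no superposition step (the usual choice for docking poses, since pose and crystal structure share the protein frame) and that a higher score means a better-ranked molecule. Function names and the example data are illustrative.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean squared deviation between two equally ordered lists of (x, y, z) atoms."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for a, b in zip(coords_a, coords_b):
        total += sum((x - y) ** 2 for x, y in zip(a, b))  # squared distance per atom
    return math.sqrt(total / len(coords_a))


def enrichment(scores, is_active, top_percent):
    """E = Nfound / (Nact * X/100), with Nfound counted in the top X% by score.

    Assumes higher scores are better; flip the sort for programs where lower is better.
    """
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0], reverse=True)
    n_top = max(1, round(len(ranked) * top_percent / 100.0))
    n_found = sum(1 for _, active in ranked[:n_top] if active)
    n_act = sum(1 for active in is_active if active)
    if n_act == 0:
        raise ValueError("dataset contains no actives")
    return n_found / (n_act * top_percent / 100.0)


# Made-up screen: 100 molecules, the 10 actives happen to have the 10 best scores.
scores = list(range(100, 0, -1))
labels = [i < 10 for i in range(100)]
e_top10 = enrichment(scores, labels, top_percent=10)

# Two-atom toy pose where only the second atom is displaced by 2 Angstrom.
pose_rmsd = rmsd([(0, 0, 0), (1, 0, 0)], [(0, 0, 0), (1, 0, 2)])
```

With all actives ranked first, the top 10% contains ten actives against one expected by chance, so the enrichment is 10; a pose would count as correct here because its RMSD is below the 2.0 Å cut-off.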
  35. Outline to docking
      • Introduction to protein-ligand docking
      • Scoring functions
      • Assessing performance
      • Practical aspects
  36. Protein preparation
      • The Protein Data Bank (PDB) is a repository of protein crystal structures, often in complexes with inhibitors
      • PDB structures often contain water molecules
        o In general, all water molecules are removed except where they are known to play an important role in coordinating to the ligand
      • PDB structures are missing all hydrogen atoms
        o Many docking programs require the protein to have explicit hydrogens. In general these can be added unambiguously, except in the case of acidic/basic side chains
      • An incorrect assignment of protonation states in the active site will give poor results
      • Glutamate and aspartate have COO- or COOH
        o OH is a hydrogen bond donor, O- is not
      • Histidine is a base and its neutral form has two tautomers
  37. Ligand preparation
      • A reasonable 3D structure is required as a starting point
        o Even during flexible docking, bond lengths and angles are held fixed
      • The protonation state and tautomeric form of a particular ligand could influence its hydrogen bonding ability
        o Either protonate as expected for physiological pH and use a single tautomer
        o Or generate and dock all possible protonation states and tautomers, and retain the one with the highest score
      • Figure: enol and ketone tautomers
  38. Conclusions
      • Computational prediction of protein structure using modeling tools saves effort and minimizes errors.
      • Homology modeling can be applied successfully if a structure with known sequence similarity is available.
      • Protein-ligand docking is an essential tool for computational drug design
        o Widely used in pharmaceutical companies
      • But it’s not a silver bullet
        o The perfect scoring function has yet to be found
        o The performance varies from target to target, and from scoring function to scoring function
      • Care needs to be taken when preparing both the protein and the ligands
      • The more information you have, the better your chances.
  39. Useful links
      1. SEARCHING FOR STRUCTURES
         • PDB-Blast at NCBI
         • Meta server: 3D-Jury
         • FFAS03
         • HHpred
      2. SELECTING TEMPLATES
      3. ALIGNING QUERY SEQUENCE WITH TEMPLATE STRUCTURES
         • MSA: MUSCLE, T-Coffee and MAFFT
         • Alignment editor: BioEdit
      4. BUILDING A MODEL
         • Nest
         • Modeller
      5. EVALUATING THE MODEL
         • ConSurf
         • PROCHECK
         • WHATCHECK
         • ProSA
         • ProQ