Computer Aided Molecular Modeling
Published in: Education, Technology
Transcript

  • 1. P. K. Choudhury (Ph. D, 1st year, Dairy Microbiology)
  • 2. The druggable genome. [Figure labels: human genome; polysaccharides, lipids, nucleic acids, proteins; proteins with binding site.] Druggable genome: the subset of genes that express proteins capable of binding small drug-like molecules.
  • 3. Protein Structure Prediction. Why predict protein structure if we can use experimental tools to determine it?
      • Experimental methods are slow and expensive
      • Some structures could not be solved experimentally
      • A representative structure for a family can suffice to deduce the structures of all sequences in that family
  • 4. Outline to modeling
      1. Introduction to protein structure and databases
      2. Structure prediction approaches
          • Ab initio
          • Threading
          • Homology modeling
      3. Hands-on molecular modeling
      4. Model evaluation
  • 5. 1. Protein structure and databases. Protein structure is hierarchical (primary, secondary, tertiary, quaternary).
  • 6. Pauling built models based on the following principles, codified by G. N. Ramachandran:
      • Bond lengths and angles: should be similar to those found in individual amino acids and small peptides
      • Peptide bond: should be planar
      • Overlaps: not permitted; pairs of non-bonded atoms come no closer than the sum of their van der Waals radii
      • Stabilization: sterics should permit hydrogen bonding
    Two degrees of freedom:
      • φ (phi) angle = rotation about the N-Cα bond
      • ψ (psi) angle = rotation about the Cα-C bond
    A linear amino acid polymer with some folds is better, but it is still neither functional nor packed in a completely energetically favorable way.
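The φ and ψ angles named on this slide are dihedral (torsion) angles, each defined by four consecutive backbone atoms (C-N-Cα-C for φ, N-Cα-C-N for ψ). As an illustration only, here is a minimal Python sketch of a dihedral calculation from four 3D points; the function name `dihedral` and the plain-tuple coordinate representation are assumptions made for this example, not part of the presentation.

```python
import math

def dihedral(p0, p1, p2, p3):
    """Return the dihedral angle in degrees (range -180..180) for four 3D points.

    The angle is measured about the p1-p2 axis, with the usual sign
    convention (clockwise rotation, looking from p1 to p2, is positive).
    """
    def sub(a, b):
        return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

    def dot(a, b):
        return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])

    b1 = sub(p1, p0)          # first bond vector
    b2 = sub(p2, p1)          # central bond vector (rotation axis)
    b3 = sub(p3, p2)          # last bond vector
    n1 = cross(b1, b2)        # normal of the plane p0, p1, p2
    n2 = cross(b2, b3)        # normal of the plane p1, p2, p3
    x = dot(n1, n2)                               # proportional to cos(angle)
    y = math.sqrt(dot(b2, b2)) * dot(b1, n2)      # proportional to sin(angle)
    return math.degrees(math.atan2(y, x))
```

For a residue, φ would be obtained by passing the coordinates of C (previous residue), N, Cα and C, and ψ by passing N, Cα, C and N (next residue); plotting the two values against each other for every residue gives the Ramachandran plot.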
  • 7. Ramachandran Plot
  • 8. SCOP fold classification: all alpha (α); all beta (β); alpha and beta (α, β)
  • 9. Databases
      • RCSB Protein Data Bank (PDB): all deposited structures
          • Experimentally determined structures of proteins, nucleic acids, and complex assemblies
          • About 65,000 structures at the time of this presentation
      • UniProt: main sequence database
          • Swiss-Prot
          • TrEMBL
          • Collaboration between the European Bioinformatics Institute (EBI), the Swiss Institute of Bioinformatics (SIB), and the Protein Information Resource (PIR)
      • NCBI: many databases, including sequences and structures
      • PDBsum: combines structural and sequence data
  • 10. 2. Structure Prediction Approaches
      • Ab initio fold prediction: not based on similarity to a sequence of known structure
      • Threading (fold recognition): requires that the query adopt a fold similar to a known structure
      • Homology modeling: based on sequence similarity with a protein for which a structure has been solved
  • 11. a. Ab initio modeling
      • Structure prediction from “first principles”
      • Shows that we understand the process
      • Given only the sequence, try to predict the structure based on physico-chemical properties
          • The force field
          • Molecular dynamics
          • Minimal energy
  • 12. Force field
      • Mathematical expressions describing the potential energy of a molecular system
      • Each expression describes a different type of physico-chemical interaction between atoms in the system: van der Waals forces, covalent bonds, hydrogen bonds, charges, hydrophobic effects
    Molecular dynamics
      • Simulates the forces that govern the protein in water
      • Since proteins usually fold spontaneously, the simulation should in principle lead to the native protein structure
  • 13. Minimal energy. Assumption: the folded form is the minimal-energy conformation of a protein
      • Use of a simplified energy function
      • Search methods for the minimal-energy conformation:
          • Greedy search
          • Simulated annealing
  • 14. b. Threading (fold recognition). Given a sequence and a library of folds, thread the sequence through each fold. Take the one with the highest score (I-TASSER).
      • The method will fail if the new protein does not belong to any fold in the library
      • The threading score is computed from known physico-chemical properties and statistics of amino acids
  • 15. c. Homology modeling
      • A protein structure is defined by its amino acid sequence
      • Closely related sequences adopt highly similar structures; distantly related sequences may still fold into similar structures
      • The three-dimensional structure of proteins from the same family is more conserved than their sequences
    [Figure: triosephosphate isomerases, 44.7% sequence identity, 0.95 RMSD]
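To make the two search methods on this slide concrete, here is a minimal Python sketch of simulated annealing on a toy one-dimensional "energy" function. Everything in it is an assumption chosen for illustration: the double-well function `energy`, the geometric cooling schedule, and the parameter defaults. A real protein energy function has thousands of degrees of freedom, so this only shows the accept/reject logic.

```python
import math
import random

def energy(x):
    """Toy double-well energy (hypothetical): two minima, the left one is lower."""
    return (x * x - 1.0) ** 2 + 0.3 * x

def simulated_annealing(start, steps=5000, t_start=2.0, t_end=0.01,
                        step_size=0.5, seed=0):
    """Return (best_x, best_energy) found by a Metropolis-style search."""
    rng = random.Random(seed)
    x, e = start, energy(start)
    best_x, best_e = x, e
    for i in range(steps):
        # Geometric cooling from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (i / (steps - 1))
        cand = x + rng.uniform(-step_size, step_size)
        cand_e = energy(cand)
        # Always accept downhill moves; accept uphill moves with
        # probability exp(-dE / T), which shrinks as T drops.
        if cand_e <= e or rng.random() < math.exp(-(cand_e - e) / t):
            x, e = cand, cand_e
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e
```

A greedy search corresponds to the same loop with the uphill branch removed: it only ever accepts moves that lower the energy, so it stops in whichever minimum is nearest the starting point. The temperature term is what lets simulated annealing climb out of a local minimum early in the run.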
  • 16. 3. Hands-on molecular modeling: homology
  • 17. The query protein
      Name: Dihydrodipicolinate reductase
      Enzyme reaction: [figure]
      Molecular process: Lysine biosynthesis (early stages)
      Organism: E. coli
      Sequence length: 273 aa
  • 18. Steps in homology modeling
      1. Searching for structures related to the query sequence
      2. Selecting templates
      3. Aligning query sequence with template structures
      4. Building a model for the query using information from the template structures (Modeller 9.10)
      5. Evaluating the model
  • 19. 1. Searching for structures. Get your sequence:
      >DAPB_ECOLI
      MHDANIRVAIAGAGGRMGRQLIQAALALEGVQLGAALEREGSSLLGSDAGELAGAGKTGVTVQSSLDAVKDDFDVFIDFTRPEGTLNHLAFCRQHGKGMVIGTTGFDEAGKQAIRDAAADIAIVFAANFSVGVNVMLKLLEKAAKVMGDYTDIEIIEAHHRHKVDAPSGTALAMGEAIAHALDKDLKDCAVYSREGHTGERVPGTIGFATVRAGDIVGEHTAMFADIGERLEITHKASSRMTFANGAVRSALWLSGKESGLFDMRDVLDLNNL*
  • 20. Get your sequence:
      >DAPB_ECOLI
      MHDANIRVAIAGAGGRMGRQLIQAALALEGVQLGAALEREGSSLLGSDAGELAGAGKTGVTVQSSLDAVKDDFDVFIDFTRPEGTLNHLAFCRQHGKGMVIGTTGFDEAGKQAIRDAAADIAIVFAANFSVGVNVMLKLLEKAAKVMGDYTDIEIIEAHHRHKVDAPSGTALAMGEAIAHALDKDLKDCAVYSREGHTGERVPGTIGFATVRAGDIVGEHTAMFADIGERLEITHKASSRMTFANGAVRSALWLSGKESGLFDMRDVLDLNNL
    PIR format:
      >P1;1ARZ/A
      sequence:1ARZ/A:1::273::Dihydrodipicolinate reductase:Escherichia coli:0.00:0.00
      MHDANIRVAIAGAGGRMGRQLIQAALALEGVQLGAALEREGSSLLGSDAGELAGAGKTGVTVQSSLDAVKDDFDVFIDFTRPEGTLNHLAFCRQHGKGMVIGTTGFDEAGKQAIRDAAADIAIVFAANFSVGVNVMLKLLEKAAKVMGDYTDIEIIEAHHRHKVDAPSGTALAMGEAIAHALDKDLKDCAVYSREGHTGERVPGTIGFATVRAGDIVGEHTAMFADIGERLEITHKASSRMTFANGAVRSALWLSGKESGLFDMRDVLDLNNL*
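The FASTA-to-PIR conversion shown on this slide can be sketched in a few lines of Python. The helper name `to_pir` is hypothetical, and the ten colon-separated header fields (type, code, first residue, first chain, last residue, last chain, protein name, source, resolution, R-factor) reflect an assumed reading of the Modeller-style PIR layout; check the Modeller manual before relying on it.

```python
def to_pir(code, sequence, name="", source="", width=75):
    """Format a plain amino acid sequence as a Modeller-style PIR record.

    Assumption: `name` and `source` contain no ':' characters, since the
    colon is the field separator of the header line.
    """
    seq = sequence.strip().rstrip("*")   # tolerate a FASTA-style trailing '*'
    if not seq:
        raise ValueError("empty sequence")
    fields = ["sequence", code, "1", "", str(len(seq)), "",
              name, source, "0.00", "0.00"]
    # Wrap the sequence and terminate the record with '*'.
    body = [seq[i:i + width] for i in range(0, len(seq), width)]
    body[-1] += "*"
    return "\n".join([">P1;" + code, ":".join(fields)] + body)
```

Calling `to_pir("1ARZ/A", seq, "Dihydrodipicolinate reductase", "Escherichia coli")` on the 273-residue sequence above produces a record of the same shape as the one on the slide, with the sequence wrapped at 75 characters per line.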
  • 21. • Aligning query sequence with template structures• Building a model for the query using information from the template structures (Modeller 9.10)• Modeller 9.10 will generate PDB files with reference to the template structure.• Evaluation of the structure in SAVES
  • 22. 4. Model evaluation. Examples of assessment approaches:
      1. Assessment of the model’s stereochemistry
      2. Prediction of unreliable regions of the model: “pseudo energy” profile, peaks → errors
      3. Consistency with experimental observations
      4. Consistency with evolutionary conservation rates
  • 23. Structural Analysis Verification Server http://nihserver.mbi.ucla.edu/SAVES/
  • 24. Real vs. model superimposition
  • 25. Outline to docking
      • Introduction to protein-ligand docking
      • Scoring functions
      • Assessing performance
      • Practical aspects
  • 26. Protein-ligand docking
      • A Structure-Based Drug Design (SBDD) method; “structure” means “using protein structure”
      • Computational method that mimics the binding of a ligand to a protein
      • Given...
      • Predicts...
          • The pose of the molecule in the binding site
          • The binding affinity or a score representing the strength of binding
  • 27. Pose vs. binding site
      • Binding site (or “active site”)
          • The part of the protein where the ligand binds
          • Generally a cavity on the protein surface
          • Can be identified by looking at the crystal structure of the protein
      • Pose (or “binding mode”)
          • The geometry of the ligand in the binding site
          • Geometry = location, orientation and conformation
      • Protein-ligand docking is not about identifying the binding site
  • 28. Outline to docking
      • Introduction to protein-ligand docking
      • Scoring functions
      • Assessing performance
      • Practical aspects
  • 29. Components of docking software
      • Typically, protein-ligand docking software consists of two main components that work together:
          1. Search algorithm: generates a large number of poses of a molecule in the binding site
          2. Scoring function: calculates a score or binding affinity for a particular pose
      • Together they provide:
          • The pose of the molecule in the binding site
          • The binding affinity or a score representing the strength of binding
  • 30. The perfect scoring function will
      • Accurately calculate the binding affinity
          • Will allow actives to be identified in a virtual screen
          • Will be able to rank actives in terms of affinity
      • Score the poses of an active higher than poses of an inactive
          • Will rank actives higher than inactives in a virtual screen
      • Score the correct pose of the active higher than an incorrect pose of the active
          • Will allow the correct pose of the active to be identified
    “actives” = molecules with biological activity
  • 31. Broadly speaking, scoring functions can be divided into the following classes:
      • Force-field-based
          • Based on terms from molecular mechanics force fields
          • GoldScore, DOCK, AutoDock
      • Empirical
          • Parameterised against experimental binding affinities
          • ChemScore, PLP, Glide SP/XP
      • Knowledge-based potentials
          • Based on statistical analysis of observed pairwise distributions
          • PMF, DrugScore, ASP
  • 32. Böhm’s empirical scoring function
      • In general, scoring functions assume that the free energy of binding can be written as a linear sum of terms to reflect the various contributions to binding
      • Böhm’s scoring function included contributions from hydrogen bonding, ionic interactions, lipophilic interactions and the loss of internal conformational freedom of the ligand
      • The ∆G values on the right of the equation are all constants
      • ∆G0 is a contribution to the binding energy that does not directly depend on any specific interactions with the protein
      • The hydrogen bonding and ionic terms are both dependent on the geometry of the interaction, with large deviations from ideal geometries (ideal distance R, ideal angle α) being penalised
      • The lipophilic term is proportional to the contact surface area (Alipo) between protein and ligand involving non-polar atoms
      • The conformational entropy term is the penalty associated with freezing internal rotations of the ligand. It is largely entropic in nature. Here the value is directly proportional to the number of rotatable bonds in the ligand (NROT).
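The equation this slide refers to did not survive in the transcript. Going by the terms listed above, it presumably has the following commonly cited form; this is a reconstruction under that assumption, not a copy of the slide, and the penalty function $f(\Delta R, \Delta\alpha)$ is written generically.

```latex
\Delta G_{\text{bind}} = \Delta G_{0}
  + \Delta G_{\text{hb}} \sum_{\text{h-bonds}} f(\Delta R, \Delta\alpha)
  + \Delta G_{\text{ionic}} \sum_{\text{ionic}} f(\Delta R, \Delta\alpha)
  + \Delta G_{\text{lipo}} \, \lvert A_{\text{lipo}} \rvert
  + \Delta G_{\text{rot}} \, N_{\text{ROT}}
```

Here $\Delta G_{0}$, $\Delta G_{\text{hb}}$, $\Delta G_{\text{ionic}}$, $\Delta G_{\text{lipo}}$ and $\Delta G_{\text{rot}}$ are the fitted constants, $f$ penalises deviations $\Delta R$ and $\Delta\alpha$ from the ideal distance and angle, $A_{\text{lipo}}$ is the lipophilic contact surface area, and $N_{\text{ROT}}$ is the number of rotatable bonds in the ligand.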
  • 33. Outline to docking
      • Introduction to protein-ligand docking
      • Scoring functions
      • Assessing performance
      • Practical aspects
  • 34. Pose prediction accuracy
      • Accuracy is measured by RMSD (root mean squared deviation) compared to known crystal structures
          • RMSD = square root of the average of (the difference between a particular coordinate in the crystal and that coordinate in the pose)²
          • An RMSD within 2.0 Å is considered the cut-off for accuracy
      • In general, the best docking software predicts the correct pose about 70% of the time
      • Need a dataset of Nact known actives, and inactives
      • Dock all molecules, and rank each by score
      • Ideally, all actives would be at the top of the list
      • Define enrichment, E, as the number of actives found (Nfound) in the top X% of scores (typically 1% or 5%), compared to how many are expected by chance
          • E = Nfound / (Nact * X/100)
          • E > 1 implies “positive enrichment”, better than random
          • E < 1 implies “negative enrichment”, worse than random
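Both measures on this slide are simple enough to sketch in Python. The function names, the list-of-tuples inputs, the rule for rounding the size of the top X% down (with a minimum of one molecule), and the convention that a higher score is better are all assumptions made for this illustration; some docking programs report scores where lower is better, in which case the sort order must be flipped.

```python
import math

def rmsd(crystal, pose):
    """RMSD between two equally ordered lists of (x, y, z) atom coordinates.

    No superposition is done: the pose is compared in the frame of the
    crystal structure, as is usual when judging docking poses.
    """
    if len(crystal) != len(pose) or not crystal:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = sum((a - b) ** 2
                for atom_a, atom_b in zip(crystal, pose)
                for a, b in zip(atom_a, atom_b))
    return math.sqrt(total / len(crystal))

def enrichment(scored, top_percent):
    """E = Nfound / (Nact * X/100) for a list of (score, is_active) pairs."""
    n_act = sum(1 for _, active in scored if active)
    if n_act == 0:
        raise ValueError("dataset contains no actives")
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    n_top = max(1, int(len(ranked) * top_percent / 100))   # size of the top X%
    n_found = sum(1 for _, active in ranked[:n_top] if active)
    return n_found / (n_act * top_percent / 100)
```

With the 2.0 Å cut-off from the slide, a pose would count as correct when `rmsd(crystal, pose) <= 2.0`.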
  • 35. Outline to docking
      • Introduction to protein-ligand docking
      • Scoring functions
      • Assessing performance
      • Practical aspects
  • 36. Protein preparation
      • The Protein Data Bank (PDB) is a repository of protein crystal structures, often in complexes with inhibitors
      • PDB structures often contain water molecules
          • In general, all water molecules are removed except where it is known that they play an important role in coordinating to the ligand
      • PDB crystal structures are usually missing hydrogen atoms
          • Many docking programs require the protein to have explicit hydrogens. In general these can be added unambiguously, except in the case of acidic/basic side chains
      • An incorrect assignment of protonation states in the active site will give poor results
      • Glutamate and aspartate have COO- or COOH
          • OH is a hydrogen bond donor, O- is not
      • Histidine is a base and its neutral form has two tautomers
    [Figure: histidine protonation states and tautomers]
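The water-removal step described on this slide can be sketched as a filter over the lines of a PDB file. This is a rough illustration rather than a full preparation workflow: the function name `strip_waters` and the `keep` argument are hypothetical, and the code assumes fixed-column PDB records in which waters carry the residue name HOH.

```python
def strip_waters(pdb_lines, keep=()):
    """Drop water records from PDB lines, except waters listed in `keep`.

    `keep` is an iterable of (chain_id, residue_number) pairs for waters
    known to mediate ligand binding. Assumes standard PDB columns:
    residue name in columns 18-20, chain in 22, residue number in 23-26.
    """
    keep = set(keep)
    kept = []
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")) and line[17:20] == "HOH":
            chain = line[21]
            resseq = int(line[22:26])
            if (chain, resseq) not in keep:
                continue          # discard this water atom
        kept.append(line)
    return kept
```

For example, `strip_waters(open("protein.pdb").read().splitlines(), keep={("A", 301)})` would remove every water except residue 301 of chain A; the file name and the residue number here are placeholders. Adding hydrogens and assigning protonation states are separate steps that this sketch does not attempt.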
  • 37. Ligand preparation
      • A reasonable 3D structure is required as starting point
          • Even during flexible docking, bond lengths and angles are held fixed
      • The protonation state and tautomeric form of a particular ligand could influence its hydrogen bonding ability
          • Either protonate as expected for physiological pH and use a single tautomer
          • Or generate and dock all possible protonation states and tautomers, and retain the one with the highest score
    [Figure: enol and ketone tautomers]
  • 38. Conclusions
      • Computational prediction of protein structure with modeling tools saves effort and minimizes errors
      • Homology modeling can be applied successfully if the structure of a protein with known sequence similarity is available
      • Protein-ligand docking is an essential tool for computational drug design
          • Widely used in pharmaceutical companies
      • But it’s not a silver bullet
          • The perfect scoring function has yet to be found
          • The performance varies from target to target, and scoring function to scoring function
      • Care needs to be taken when preparing both the protein and the ligands
      • The more information you have, the better your chances
  • 39. Useful links
      1. SEARCHING FOR STRUCTURES
          • PDB-Blast at NCBI: http://blast.ncbi.nlm.nih.gov/Blast.cgi
          • Meta server (3D-Jury): http://bioinfo.pl/meta/
          • FFAS03: http://ffas.ljcrf.edu/ffas-cgi/cgi/ffas.pl
          • HHpred: http://toolkit.tuebingen.mpg.de/hhpred
      2. SELECTING TEMPLATES
      3. ALIGNING QUERY SEQUENCE WITH TEMPLATE STRUCTURES
          • MSA: MUSCLE, T-Coffee and MAFFT at http://toolkit.tuebingen.mpg.de/sections/alignment
          • Alignment editor (BioEdit): http://www.mbio.ncsu.edu/BioEdit/bioedit.html
      4. BUILDING A MODEL
          • Nest: http://wiki.c2b2.columbia.edu/honiglab_public/index.php/Software:nest
          • Modeller: http://salilab.org/modeller/modeller.html
      5. EVALUATING THE MODEL
          • ConSurf: http://consurf.tau.ac.il
          • PROCHECK: http://www.biochem.ucl.ac.uk/~roman/procheck/procheck.html
          • WHATCHECK: www.cmbi.kun.nl/swift/whatcheck/
          • ProSA: https://prosa.services.came.sbg.ac.at/prosa.php
          • ProQ: http://www.sbc.su.se/~bjornw/ProQ/ProQ.cgi
