Prepared By: Ms. Jyoti Rani
Assistant Professor
BUEST, SPES
CONTENTS
 Introduction to QSAR
 QSAR Analysis models
 3D QSAR
 CoMFA
 CoMSIA
 Applications
 Case study
INTRODUCTION TO QSAR
 QSAR relates the biological activity of a series of compounds to their
physicochemical parameters in a quantitative fashion using a
mathematical formula.
 The fundamental principle is that differences in structural properties
are responsible for variations in the biological activities of the compounds.
 Physico-chemical parameters:
1. Hydrophobicity of substituents
2. Hydrophobicity of the molecule
3. Electronic properties of substituents
4. Steric properties of substituents
QSAR Analysis models
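 Hansch Analysis: Correlates biological activity with physico-chemical
parameters.
log(1/C) = k1 log P + k2 σ + k3 Es + k4
where C is the concentration needed to produce a defined biological effect, P is the
partition coefficient, σ is the Hammett substituent constant, Es is the Taft steric
factor, and k1 to k4 are constants obtained by regression.
 Free-Wilson Analysis: Correlates biological activity with certain
structural features of the compound.
 Limitations of these classical models:
1. They do not consider 3D structure.
2. They give no graphical output, which makes interpreting the results in
familiar chemical terms frequently difficult, if not impossible.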
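The Hansch equation above is normally fitted by multiple linear regression. The following is a minimal sketch of such a fit in plain Python. The descriptor values are hypothetical (not measured data), the activities are generated from assumed coefficients so that the fit can be checked, and the helper names `solve_linear` and `fit_hansch` are illustrative only.

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_hansch(log_p, sigma, es, activity):
    """Least-squares fit of log(1/C) = k1*logP + k2*sigma + k3*Es + k4."""
    rows = [[lp, s, e, 1.0] for lp, s, e in zip(log_p, sigma, es)]
    # Normal equations: (X^T X) k = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(4)] for i in range(4)]
    xty = [sum(r[i] * y for r, y in zip(rows, activity)) for i in range(4)]
    return solve_linear(xtx, xty)


# Hypothetical descriptor values for six analogues (illustration only).
log_p = [0.5, 1.2, 2.0, 2.8, 3.1, 1.7]
sigma = [0.00, 0.23, -0.17, 0.54, 0.06, -0.27]
es = [0.00, -1.24, -0.55, -2.78, -0.97, -1.31]

# Activities generated from assumed coefficients so the fit can be verified.
assumed_k = (0.8, 1.1, 0.3, 2.0)
activity = [
    assumed_k[0] * lp + assumed_k[1] * s + assumed_k[2] * e + assumed_k[3]
    for lp, s, e in zip(log_p, sigma, es)
]

k1, k2, k3, k4 = fit_hansch(log_p, sigma, es, activity)
print(f"k1={k1:.3f} k2={k2:.3f} k3={k3:.3f} k4={k4:.3f}")
```

With real data the activities would contain experimental noise, so the fitted constants would be reported together with statistics such as the correlation coefficient.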
3D QSAR
 3D QSAR is an extension of classical QSAR that exploits the three-dimensional
properties of the ligands to predict their biological activity using robust statistical
methods such as PLS, G/PLS, and ANN.
 3D-QSAR uses probe-based sampling within a molecular lattice to determine the
three-dimensional properties of molecules and then correlates these 3D descriptors with
biological activity.
 No QSAR model can replace the experimental assays, though experimental
techniques are also not free from errors.
 Some major factors that contribute to the overall free energy of binding, such as
desolvation energetics, temperature, diffusion, transport, pH, and salt concentration,
are difficult to handle and are therefore usually ignored.
 Despite these problems, 3D-QSAR remains a useful alternative approach.
Statistical Techniques for Building QSAR Models
CoMFA (Comparative Molecular Field Analysis)
 In 1987, Cramer developed the predecessor of 3D approaches called Dynamic
Lattice-Oriented Molecular Modeling System (DYLOMMS) that involves the
use of PCA to extract vectors from the molecular interaction fields, which are
then correlated with biological activities.
 Soon after, he modified it by combining two existing techniques, GRID and
PLS, to develop a powerful 3D QSAR methodology, Comparative Molecular
Field Analysis (CoMFA).
 The underlying idea of CoMFA is that differences in a target property, e.g.,
biological activity, are often closely related to equivalent changes in shapes
and strengths of non-covalent interaction fields surrounding the molecules.
 Hence, the molecules are placed in a cubic grid and the interaction energies
between the molecule and a defined probe are calculated for each grid point.
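The grid calculation described above can be sketched as follows. This is a rough illustration, not the Sybyl implementation: it assumes a Lennard-Jones 6-12 term for the steric field, a Coulomb term with a constant dielectric of 1 for the electrostatic field, and a 30 kcal/mol truncation value. The atom parameters, the cut-off value, and the function name `probe_fields` are all assumptions made for the example.

```python
import math

COULOMB_CONSTANT = 332.06  # kcal*Angstrom/(mol*e^2)


def probe_fields(atoms, grid_points, probe_charge=1.0, cutoff=30.0):
    """Return (steric, electrostatic) probe energies at each grid point.

    atoms: list of (x, y, z, partial_charge, epsilon, r_min) tuples, where
    epsilon and r_min are assumed probe-atom Lennard-Jones pair parameters.
    """
    steric, electrostatic = [], []
    for point in grid_points:
        e_vdw = 0.0
        e_coul = 0.0
        for x, y, z, charge, epsilon, r_min in atoms:
            # Guard against a probe sitting exactly on an atom.
            r = max(math.dist(point, (x, y, z)), 1e-6)
            ratio6 = (r_min / r) ** 6
            e_vdw += epsilon * (ratio6 ** 2 - 2.0 * ratio6)
            e_coul += COULOMB_CONSTANT * probe_charge * charge / r
        # Both potentials diverge near atoms, so values are truncated.
        steric.append(min(e_vdw, cutoff))
        electrostatic.append(max(-cutoff, min(e_coul, cutoff)))
    return steric, electrostatic


# Hypothetical two-atom "molecule" and two lattice points (illustration only).
atoms = [
    (0.0, 0.0, 0.0, -0.4, 0.1, 3.8),
    (1.5, 0.0, 0.0, 0.4, 0.1, 3.8),
]
grid_points = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]

steric, electrostatic = probe_fields(atoms, grid_points)
for point, s, e in zip(grid_points, steric, electrostatic):
    print(point, round(s, 4), round(e, 4))
```

The first lattice point coincides with an atom, so both energies hit the truncation value; this is the behaviour that makes cut-off limits necessary in CoMFA.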
Protocol for CoMFA:
 A standard CoMFA procedure, as implemented in the Sybyl software, follows these
sequential steps:
1. Bioactive conformations of each molecule are determined.
2. All the molecules are superimposed or aligned using either manual or automated methods, in
a manner defined by the supposed mode of interaction with the receptor.
3. The overlaid molecules are placed in the center of a lattice grid with a spacing of 2 Å.
4. The algorithm compares, in three dimensions, the steric and electrostatic fields calculated
around the molecules with different probe groups positioned at all intersections of the lattice.
5. The interaction energy or field values are correlated with the biological activity data using
the PLS technique, which identifies and extracts the quantitative influence of specific chemical
features of the molecules on their biological activity.
6. The results are expressed as correlation equations containing a number of latent variable terms,
each of which is a linear combination of the original independent lattice descriptors.
7. For visual understanding, the PLS output is presented as interactive graphics
consisting of colored contour plots of the coefficients of the corresponding field variables at each
lattice intersection, showing the important favorable and unfavorable regions in
three-dimensional space that are significantly associated with the biological activity.
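Steps 5 and 6 of the protocol above rely on PLS, which builds each latent variable as a linear combination of the lattice descriptors. The following is a minimal sketch of single-response PLS (PLS1, NIPALS-style deflation) in plain Python; it is not the Sybyl implementation. The small field matrix is hypothetical, with one row per molecule and one column per lattice point, and the function names `pls1_fit` and `pls1_predict` are illustrative.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def pls1_fit(X, y, n_components):
    """Fit PLS1 by successive deflation; returns means and component terms."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centered X
    f = [v - y_mean for v in y]                                # centered y
    components = []
    for _ in range(n_components):
        # Weight vector: direction in descriptor space most covariant with y.
        w = [dot([row[j] for row in E], f) for j in range(p)]
        norm = dot(w, w) ** 0.5
        if norm < 1e-12:  # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [dot(row, w) for row in E]  # latent variable scores
        tt = dot(t, t)
        loading = [dot([row[j] for row in E], t) / tt for j in range(p)]
        c = dot(f, t) / tt              # regression of y on the scores
        E = [[row[j] - t[i] * loading[j] for j in range(p)]
             for i, row in enumerate(E)]
        f = [f[i] - c * t[i] for i in range(n)]
        components.append((w, loading, c))
    return x_mean, y_mean, components


def pls1_predict(model, x):
    """Predict the activity of one molecule from its descriptor vector."""
    x_mean, y_mean, components = model
    e = [a - m for a, m in zip(x, x_mean)]
    y_hat = y_mean
    for w, loading, c in components:
        t = dot(e, w)
        y_hat += c * t
        e = [a - t * b for a, b in zip(e, loading)]
    return y_hat


# Hypothetical field values (6 molecules x 3 lattice points) and activities
# constructed to be exactly linear in the fields, so the fit can be checked.
X = [[0.2, 1.0, -0.5], [0.9, 0.3, 0.1], [1.5, -0.4, 0.8],
     [2.1, 0.7, -0.2], [2.8, -0.9, 0.4], [3.3, 0.2, 1.1]]
y = [5.0 + 0.6 * a - 0.4 * b + 0.9 * c for a, b, c in X]

model = pls1_fit(X, y, n_components=3)
print(pls1_predict(model, [1.0, 0.0, 0.5]))
```

In a real CoMFA study the matrix has thousands of lattice columns and only tens of molecules, which is why a small number of latent variables is used rather than ordinary regression.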
DRAWBACKS AND LIMITATIONS OF CoMFA
 CoMFA has several pitfalls and imperfections:
1. Too many adjustable parameters like overall orientation, lattice placement, step size,
probe atom type etc.
2. Uncertainty in selection of compounds and variables
3. Fragmented contour maps with variable selection procedures
4. Hydrophobicity not well-quantified
5. Cut-off limits used
6. Low signal to noise ratio due to many useless field variables
7. Imperfections in potential energy functions
8. Various practical problems with PLS
9. Applicable only to in vitro data
Comparative Molecular Similarity Indices Analysis (CoMSIA)
 CoMSIA was developed to overcome certain limitations of CoMFA.
 In CoMSIA, molecular similarity indices calculated from modified SEAL similarity
fields are employed as descriptors to simultaneously consider steric, electrostatic,
hydrophobic and hydrogen bonding properties.
 These indices are estimated indirectly by comparing the similarity of each
molecule in the dataset with a common probe atom (having a radius of 1 Å,
charge of +1 and hydrophobicity of +1) positioned at the intersections of a
surrounding grid/lattice.
 For computing similarity at all grid points, the mutual distances between the probe
atom and the atoms of the molecules in the aligned dataset are also taken into
account.
 To describe this distance dependence and calculate the molecular properties, Gaussian-type
functions are employed. Since the underlying Gaussian-type functional forms are ‘smooth’
with no singularities, their slopes are not as steep as those of the Coulombic and Lennard-Jones
potentials in CoMFA; therefore, no arbitrary cut-off limits need to be defined.
 CoMSIA is provided by Tripos Inc. in the Sybyl software, along with CoMFA.
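The Gaussian distance dependence described above can be sketched as follows. This is a minimal illustration, assuming the commonly quoted form A(q) = -Σ w_probe · w_i · exp(-α r_iq²) with an attenuation factor α of 0.3; the attenuation value, the atom property values, and the function name `comsia_index` are assumptions for the example rather than details taken from the slides.

```python
import math


def comsia_index(atoms, grid_point, probe_value=1.0, alpha=0.3):
    """Similarity index of one property at one grid point.

    atoms: list of (x, y, z, property_value) tuples, where property_value
    is the atom's steric, electrostatic, hydrophobic or H-bond descriptor.
    """
    gx, gy, gz = grid_point
    total = 0.0
    for x, y, z, w in atoms:
        r2 = (x - gx) ** 2 + (y - gy) ** 2 + (z - gz) ** 2
        # The Gaussian stays finite at r = 0, so no cut-off is needed.
        total += probe_value * w * math.exp(-alpha * r2)
    return -total


# Hypothetical single atom with property value 0.5 (illustration only).
atoms = [(0.0, 0.0, 0.0, 0.5)]
on_atom = comsia_index(atoms, (0.0, 0.0, 0.0))
far_away = comsia_index(atoms, (10.0, 0.0, 0.0))
print(on_atom, far_away)
```

Unlike the Lennard-Jones and Coulomb terms, the value at a grid point that coincides with an atom is bounded, and it decays smoothly to zero with distance.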
The Comparison of CoMFA and CoMSIA
APPLICATIONS:
1. QSAR in Chromatography: Quantitative Structure–Retention
Relationships (QSRRs)
2. The Use of QSAR and Computational Methods in Drug Design.
3. In Silico Approaches for Predicting ADME Properties.
4. Prediction of Harmful Human Health Effects of Chemicals from Structure.
5. Chemometric Methods and Theoretical Molecular Descriptors in Predictive
QSAR Modeling of the Environmental Behavior of Organic Pollutants.
6. The Role of QSAR Methodology in the Regulatory Assessment of
Chemicals.
7. Nanomaterials – the Great Challenge for QSAR Modelers
Case study: Human Eosinophil Phosphodiesterase
 Phosphodiesterase type IV (PDE4) plays an important role
in regulating intracellular levels of cAMP.
 PDE4 is highly expressed in inflammatory and immune cells
and in airway smooth muscle, where it degrades cAMP.
 Inhibition of PDE4 increases intracellular cAMP
concentrations, which suppresses inflammatory cells and relaxes airway
smooth muscle.
 The development of selective PDE4 inhibitors as anti-inflammatory
and anti-asthmatic drugs has attracted extensive research.
QSAR studies of PDE4 inhibitors have also
been carried out using the CoMFA and CoMSIA methods.
 More potent and selective PDE4 inhibitors, a series of
5,6-dihydro-(9H)-pyrazolo[3,4-c]-1,2,4-triazolo[4,3-a]pyridines, were designed and synthesized
based on the structure of 7-oxo-4,5,6,7-tetrahydro-1H-pyrazolo[3,4-c]pyridine.
 To study the interaction mechanism of PDE4 with 31 new
compounds, a QSAR model was built using CoMFA.
[Figure: Structures of the 5,6-dihydro-(9H)-pyrazolo[3,4-c]-1,2,4-triazolo[4,3-a]pyridines.]
[Figure: The superposition of the 31 structures of the
5,6-dihydro-(9H)-pyrazolo[3,4-c]-1,2,4-triazolo[4,3-a]pyridines.]
Four compounds were randomly selected as the test set, and the other twenty-seven
compounds were used as the training set.
Method:
 The structures of the 31 compounds were built with a molecular sketch program.
 Gasteiger-Hückel charges were then assigned to each atom, and each
molecule was energy-minimized.
 All the compounds studied had a common rigid substructure. Therefore,
common rigid substructure alignment was carried out using the database
alignment tool.
 The most active compound, bearing a cyclobutyl substituent, was used as the template
molecule.
 All aligned molecules were placed in a 3D cubic lattice extending at least
4 Å beyond the volumes of all investigated molecules on all axes.
 In the 3D lattice, the grid spacing was set to 2.0 Å in the x, y, and z
directions.
 An sp3-hybridized carbon atom with a charge of +1 was used as the probe
atom to calculate the CoMFA steric and electrostatic interaction fields.
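The lattice described in the method above (4 Å margin, 2.0 Å spacing) can be sketched as follows. This is an illustrative construction only; the atom coordinates are hypothetical and the function name `build_lattice` is an assumption, not part of any modeling package.

```python
import math


def build_lattice(coords, margin=4.0, spacing=2.0):
    """Return lattice points covering all atoms plus a margin on every axis."""
    axes = []
    for k in range(3):
        low = min(c[k] for c in coords) - margin
        high = max(c[k] for c in coords) + margin
        # Enough points that the last one reaches or passes the upper bound.
        count = math.ceil((high - low) / spacing) + 1
        axes.append([low + i * spacing for i in range(count)])
    return [(x, y, z) for x in axes[0] for y in axes[1] for z in axes[2]]


# Hypothetical coordinates (in Angstrom) pooled from the aligned molecules.
coords = [(0.0, 0.0, 0.0), (3.0, 1.0, 0.0), (1.5, 4.0, 2.0)]
points = build_lattice(coords)
print(len(points), points[0], points[-1])
```

Each lattice point then receives one steric and one electrostatic probe energy per molecule, so the number of descriptor columns is twice the number of points.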
Method contd.
 The partial least squares (PLS) method was used to build the 3D-QSAR models.
Leave-one-out (LOO) cross-validated PLS analysis was used to check the
predictive ability of the models and to determine the optimal number of components
to be used in the final QSAR models.
 The PLS analysis gave a CoMFA model with a cross-validated q² value of 0.565 for 3
optimal components. The non-cross-validated PLS analysis of these compounds
was repeated with the optimal number of components, and the R² value was 0.867.
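The two statistics reported above can be illustrated with a short sketch. It computes the conventional R² on the full fit and the leave-one-out q² as 1 - PRESS/SS, taking SS about the mean of all activities (an assumed convention). To keep the example self-contained, a single-descriptor least-squares line stands in for the PLS model, so this is not the CoMFA model itself; the data values and function names are hypothetical.

```python
def fit_line(x, y):
    """Least-squares line; a simple stand-in for the PLS model."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_line(model, x_value):
    slope, intercept = model
    return slope * x_value + intercept


def r_squared(y, y_hat):
    """1 - (residual sum of squares) / (total sum of squares about the mean)."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


def loo_q2(x, y, fit, predict):
    """Leave-one-out q2: each compound is predicted by a model built without it."""
    predictions = []
    for i in range(len(x)):
        model = fit(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        predictions.append(predict(model, x[i]))
    return r_squared(y, predictions)


# Hypothetical descriptor and pIC50 values (illustration only).
descriptor = [0.1, 0.5, 0.9, 1.4, 1.8, 2.3, 2.7]
pic50 = [5.2, 5.6, 5.7, 6.3, 6.4, 7.0, 7.1]

full_model = fit_line(descriptor, pic50)
r2 = r_squared(pic50, [predict_line(full_model, v) for v in descriptor])
q2 = loo_q2(descriptor, pic50, fit_line, predict_line)
print(f"R2 = {r2:.3f}, q2 = {q2:.3f}")
```

Because each left-out compound is predicted by a model that never saw it, q² is lower than R² for a least-squares fit, which matches the pattern of the reported values (0.565 versus 0.867).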
The correlation between the experimental and predicted values is shown in the plot below.
[Figure: Plot of the predicted pIC50 vs. experimental pIC50.]
• The interacting mode of the compound with the protein.
• The inhibitors and the important residues for
inhibitor-protein interaction are represented
by ball-and-stick models.
• The green dashed lines denote the hydrogen
bonds.
Structures of ERβ protein binding with
compounds obtained from molecular
docking
Contour Maps of CoMFA Model
Contour plots of the CoMFA steric fields (left) and electrostatic fields (right) of
compound No. 24
CONCLUSION
 CoMFA and CoMSIA are useful techniques for understanding the
pharmacological properties of the studied compounds, and they
have been successfully used in modern drug design.
 Despite all its pitfalls, 3D-QSAR is now used globally for drug
discovery. Based on well-established principles of statistics, it is
an intrinsically valuable and viable medicinal chemistry tool
whose application domain ranges from explaining
structure-activity relationships quantitatively and retrospectively to
providing synthetic guidance that leads to logical and
experimentally testable hypotheses.
 Apart from synthetic applications, it has also been used in
various other fields.
References
 Graham L. Patrick, “An Introduction to Medicinal Chemistry”, Fifth Edition.
 Yuhong Xiang, Zhuoyong Zhang*, Aijing Xiao, and Jinxu Huo,
“Recent Studies of QSAR on Inhibitors of Estrogen
Receptor and Human Eosinophil Phosphodiesterase”,
Department of Chemistry, Capital Normal University,
Beijing 100048, P.R. China.
 Jitender Verma, Vijay M. Khedkar, and Evans C. Coutinho,
“3D-QSAR in Drug Design - A Review”, Department of
Pharmaceutical Chemistry, Bombay College of Pharmacy,
Kalina, Santacruz (E), Mumbai 400 098, India.
Thank you

3 D QSAR Approaches and Contour Map Analysis
