2. • Developing New QSAR methodologies
•CoRIA and its Variants
•HomoSAR
•LISA
•eCoRIA and eQSAR
•CoOAN
•Solving Protein Structures (using NMR)
•Computational Prediction of Resistance and QMAR
•Lead optimization strategies for Anti-TB, Dengue, AD etc
• Studies on reaction pathways and transition states using ab initio and
Quantum Mechanics.
• Molecular dynamics of Drug-Cyclodextrin complexes
Research in Prof. Coutinho’s Lab
3. Molecular Modeling in Drug Design
Receptor Unknown, Ligand Unknown
Receptor Known, Ligand Unknown
Receptor Known, Ligand Known
Receptor Unknown, Ligand Known
4. What is QSAR?
Compounds + biological activity → QSAR → New compounds with improved biological activity
5. The number of compounds that must be synthesized in
order to place 10 different groups in 4 positions of a
benzene ring is 10^4
Solution: synthesize a small number of compounds
and from their data derive rules to predict the
biological activity of other compounds.
Why QSAR?
6. QSAR dates back to the 19th century
A.F.A. Cros (University of Strasbourg; 1863)
Increased toxicity of alcohols with decrease in water solubility
H. H. Meyer (University of Marburg; 1890s) and Charles Ernest
Overton (University of Zurich; 1890s) [working independently]
Toxicity of organic compounds depended on their lipophilicity
Crum-Brown and Fraser
the physiological action of a substance was a function of its chemical
composition and constitution
Richet
inverse relationship between the cytotoxicities of a diverse set of simple
organic molecules and their water solubilities
7. Hammett
"sigma-rho" culture; to understand the effect of substituents on
organic reactions
Taft
devised a way to separate polar, steric, and resonance effects and
introduced the first steric parameter, Es
Hansch and Fujita
The contributions of Hammett and Taft together laid the
mechanistic basis for the development of the QSAR paradigm
8. Hammett Equation
Linear Free Energy Relationships
Louis Hammett (1894-1987) correlated electronic properties of organic
acids and bases with their equilibrium constants and reactivity
Measures the electron-withdrawing or electron-donating effect of a
substituent in comparison to benzoic acid (how the substituent affects its ionization)
Consider the dissociation of benzoic acid:
9. Hammett Equation
› m-NO2 increases dissociation constant (nitro
group is EWG stabilizing the negative charge)
› p-NO2 exhibits greater electron withdrawing effect
› p-C2H5 group on benzoic acid
10. Hammett observed similar substituent effects on the dissociation of
other organic acids and bases, such as phenylacetic acid.
Hammett Equation
11. A linear free-energy relationship is said to exist if ‘the same series of changes
in conditions affects the rate or equilibrium of a second reaction in exactly the
same way as the first’
The free energy is proportional to the logarithm of the equilibrium
constant
Graph for a linear free energy
relationship
12. › The following equation was derived as the relationship is linear,
where ρ is the slope of the line and the abscissa values are always
those for benzoic acid and are given the symbol σ (substituent
constant); equation simplified as:
› ρ (reaction constant) relates the effect of substituents on that
equilibrium to the effect of those substituents on the benzoic acid
equilibrium
› The reaction constant depends on the nature of the chemical
reaction as well as the reaction conditions (solvent, temperature,
etc.)
› The sign and magnitude of the reaction constant are indicative
of the extent of charge build-up during the progress of the reaction
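For reference, the relationships described on the last two slides are conventionally written as below. This is the standard textbook form of the Hammett equation, given here as an assumption about what the missing slide equations showed. K_X and K_H denote the equilibrium constants of the substituted and unsubstituted compounds, and the superscript BA (a label introduced here) marks the benzoic acid reference reaction.

```latex
\begin{align}
\Delta G^{\circ} &= -RT \ln K = -2.303\,RT \log K \\
\log\frac{K_X}{K_H} &= \rho \,\log\frac{K_X^{\mathrm{BA}}}{K_H^{\mathrm{BA}}} \\
\sigma_X &\equiv \log\frac{K_X^{\mathrm{BA}}}{K_H^{\mathrm{BA}}} \\
\log\frac{K_X}{K_H} &= \rho\,\sigma_X
\end{align}
```

The first line is why a linear plot of log K for one reaction against log K for another is called a linear free-energy relationship.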
13. › Reactions with ρ > 0 are favored by electron withdrawing
groups (i.e., the stabilization of negative charge)
› Reactions with ρ < 0 are favored by electron donating groups
(i.e., the stabilization of positive charge)
› For benzoic acid ρ is equal to 1.00 in pure water at 25 °C
› σ is a descriptor of the substituents;
› The magnitude of σ gives the relative strength of the electron-
withdrawing or -donating properties of the substituents
› σ is positive if the substituent is electron-withdrawing and;
› σ is negative if the substituent is electron-donating
› The relationships as developed by Hammett are termed linear
free energy relationships
14. › By definition, σ for hydrogen is ZERO
› A positive σ for the NO2 group indicates an electron-withdrawing effect:
m-NO2 (inductive effect); p-NO2 (inductive + resonance
effect)
› Electronegative chlorine produces an inductive electron-withdrawing
effect
The magnitude of the effect in the p-Cl position is less than in
the m-Cl position, and only the inductive effect is possible with chlorine
› The CH3O- group can be electron-donating or -withdrawing, depending
on the position of substitution
m-CH3O: an inductive electron-withdrawing effect is seen
p-CH3O: only a small inductive effect is expected; an electron-
donating resonance effect occurs, giving an overall
electron-donating effect
Hammett Constant
15. Applications of the Hammett Equation
› The prediction of the pKa of ionization equilibria
› Therefore,
› For benzoic acid the equation is
› Consider for substituted benzoic acid
› Given σmeta = 0.71 for NO2 and σpara = -0.13 for CH3 groups, the
calculated pKa = 2.91, compared to the experimental value of
2.97
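The pKa prediction can be checked numerically. Taking pKa = -log K, the Hammett equation rearranges to pKa(substituted) = pKa(benzoic acid) - ρ Σσ. The sketch below makes two assumptions that the extracted text does not state: the pKa of benzoic acid is taken as 4.20, and the acid is taken to carry two meta-NO2 groups and one para-CH3 group, a pattern chosen because it reproduces the quoted 2.91. The function name `hammett_pka` is illustrative only.

```python
# Hammett estimate of the pKa of a substituted benzoic acid:
#   pKa(substituted) = pKa(benzoic acid) - rho * sum(sigma)

PKA_BENZOIC_ACID = 4.20  # assumed reference value (water, 25 °C)
RHO_BENZOIC_ACID = 1.00  # reaction constant for benzoic acid ionization

SIGMA_META_NO2 = 0.71    # value quoted on the slide
SIGMA_PARA_CH3 = -0.13   # value quoted on the slide


def hammett_pka(sigmas, pka_ref=PKA_BENZOIC_ACID, rho=RHO_BENZOIC_ACID):
    """Return the predicted pKa for a list of substituent constants."""
    return pka_ref - rho * sum(sigmas)


# Assumed substitution pattern: two meta-NO2 groups and one para-CH3 group.
predicted = hammett_pka([SIGMA_META_NO2, SIGMA_META_NO2, SIGMA_PARA_CH3])
print(round(predicted, 2))  # 2.91
```

Summing the σ values assumes the substituent effects are additive, which is the usual working approximation for this kind of estimate.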
16. Applications of the Hammett Equation
› The applicability of Hammett's electronic descriptors in a QSAR
relating the inhibition of bacterial growth by a series of
sulfonamides
› where X represents various substituents
› A QSAR was developed based on the σ values of the substituents
› where C is the minimum concentration of compound that inhibited
growth of E. coli
› The electron-withdrawing substituents favor inhibition of growth
17. Log P is a measure of the drug’s hydrophobicity, which was
selected as a measure of its ability to pass through cell
membranes.
The log P (or log Po/w) value reflects the relative solubility of the
drug in octanol (representing the lipid bilayer of a cell
membrane) and water (the fluid within the cell and in blood).
Log P values may be measured experimentally or, more
commonly, calculated.
Hansch’s Approach
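As a small illustration of how an experimental log P follows from measured concentrations, the sketch below uses the usual definition, log P = log10([drug] in octanol / [drug] in water). The function name `log_p` and the example concentrations are hypothetical.

```python
import math


def log_p(conc_octanol, conc_water):
    """log10 of the octanol/water partition coefficient.

    Both concentrations must be in the same units and measured at equilibrium.
    """
    if conc_octanol <= 0 or conc_water <= 0:
        raise ValueError("concentrations must be positive")
    return math.log10(conc_octanol / conc_water)


# Hypothetical compound that is 100 times more concentrated in octanol.
print(log_p(100.0, 1.0))  # 2.0
```

A positive value means the compound prefers the octanol phase (more hydrophobic); a negative value means it prefers water.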
19. › The Hammett substituent constant (σ) reflects the drug
molecule’s intrinsic reactivity, related to electronic
factors caused by aryl substituents.
› In chemical reactions, aromatic ring substituents can
alter the rate of reaction by up to 6 orders of magnitude!
› For example, the rate of the reaction below is ~10^5 times
slower when X = NO2 than when X = CH3
[Reaction scheme: X-Ar-C(H)-Cl + CH3OH → X-Ar-C(H)-OCH3 + HCl]
20. › Log 1/C = Σ ai + μ
where log 1/C = predicted activity,
ai = contribution per group, and
μ = activity of reference
Free-Wilson Analysis
Log 1/C = -0.30 [m-F] + 0.21 [m-Cl] + 0.43 [m-Br]
+ 0.58 [m-I] + 0.45 [m-Me] + 0.34 [p-F] + 0.77 [p-Cl]
+ 1.02 [p-Br] + 1.43 [p-I] + 1.26 [p-Me] + 7.82
[Structure: parent scaffold (HCl salt) with ring substituents X and Y]
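The Free-Wilson equation above can be evaluated directly: the predicted log 1/C is the reference activity plus the contribution of each substituent present. The sketch below uses only the coefficients quoted on the slide; the function name `free_wilson_activity` and the example compound (m-Br, p-Me) are chosen here for illustration.

```python
# Group contributions (a_i) and reference activity taken from the equation above.
GROUP_CONTRIBUTIONS = {
    "m-F": -0.30, "m-Cl": 0.21, "m-Br": 0.43, "m-I": 0.58, "m-Me": 0.45,
    "p-F": 0.34, "p-Cl": 0.77, "p-Br": 1.02, "p-I": 1.43, "p-Me": 1.26,
}
REFERENCE_ACTIVITY = 7.82  # activity of the unsubstituted reference compound


def free_wilson_activity(substituents):
    """Predicted log 1/C for a compound carrying the given substituents."""
    return REFERENCE_ACTIVITY + sum(GROUP_CONTRIBUTIONS[s] for s in substituents)


# Illustrative compound: meta-Br and para-Me.
print(round(free_wilson_activity(["m-Br", "p-Me"]), 2))  # 9.51
```

An empty substituent list returns the reference activity, and a substituent not in the table raises `KeyError`, reflecting that a Free-Wilson model cannot predict for groups absent from its training set.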
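21. 8. Topliss Scheme
Used to decide which substituents to use if optimising compounds
one by one (where synthesis is complex and slow)
Example: Aromatic substituents
[Figure: Topliss decision tree for aromatic substituents, starting from H and then 4-Cl. At each node the next substituent is chosen according to whether the new analogue is less active (L), equally active (E) or more active (M) than the previous one; one branch is marked "See Central Branch". Substituents appearing in the tree: H, 4-Cl, 4-CH3, 4-OMe, 3,4-Cl2, 4-But, 3-CF3-4-Cl, 3-Cl, 4-CF3, 2,4-Cl2, 4-NO2, 3-NMe2, 3-CF3-4-NO2, 3-CH3, 2-Cl, 3-CF3, 3,5-Cl2, 3-NO2, 4-F, 4-NMe2, 3-Me-4-NMe2, 4-NH2]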
22. Rationale
Replace H with para-Cl (+π and +σ)
› Activity increases: +π and/or +σ advantageous; add second Cl to increase π and σ further
› Activity shows little change: favourable π, unfavourable σ; replace with Me (+π and -σ)
› Activity decreases: +π and/or +σ disadvantageous; replace with OMe (-π and -σ)
Further changes suggested based on arguments of π, σ and
steric strain
8. Topliss Scheme
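The first decision in the rationale can be sketched as a lookup table keyed by the activity of the 4-Cl analogue relative to the parent (H) compound. The assignment of each outcome to a branch follows the usual reading of the Topliss scheme and is an assumption wherever the slide layout is ambiguous; the names `NEXT_AFTER_4_CL` and `next_substituent` are illustrative.

```python
# Outcome codes: "M" = more active, "E" = equally active, "L" = less active
# than the parent compound, after replacing H with 4-Cl.
NEXT_AFTER_4_CL = {
    "M": ("3,4-Cl2", "+pi and/or +sigma advantageous: add a second Cl"),
    "E": ("4-CH3", "favourable pi, unfavourable sigma: try +pi, -sigma"),
    "L": ("4-OMe", "+pi and/or +sigma disadvantageous: try -pi, -sigma"),
}


def next_substituent(outcome):
    """Return (next substituent, reasoning) for the 4-Cl analogue's outcome."""
    try:
        return NEXT_AFTER_4_CL[outcome]
    except KeyError:
        raise ValueError("outcome must be 'L', 'E' or 'M'") from None


substituent, reason = next_substituent("E")
print(substituent, "-", reason)
```

Later levels of the tree would be encoded the same way, with one table per node.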
24. Contents
I. Basics of regression analysis - linear and multiple
linear regression,
II. Introduction to PCA & PCR, PLS, ANN and GFA.
III. Validation of QSAR models
A. Correlation coefficients (r² and r²pred), F-test,
standard error,
B. cross-validation by calculation of q², boot-strap
analysis and randomization.
IV. Applicability domain for predictions using a QSAR model.
V. Design of training and test sets using factorial design
25. Linear and multiple linear Regression
(Image Courtesy: CAMO Software AS)
Linear Data
Non-Linear Data
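27. Simple linear regression
y = b0 + b1x + e
[Figure: plot of y against x showing the fitted line with intercept b0 and slope b1, and the error e]
Least squares (LS) used
for estimation of regression coefficients
[Equation: expression for b in terms of Σ(x − x̄)(y − ȳ), Σ(x − x̄)² and Σ(y − ȳ)²]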
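A minimal sketch of least-squares estimation for the model y = b0 + b1x + e follows. It uses the standard closed-form estimates, b1 = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)² and b0 = ȳ − b1x̄, which are assumed here to be what the slide intends. The function name and the data are hypothetical.

```python
def fit_simple_linear(xs, ys):
    """Ordinary least-squares fit of y = b0 + b1*x; returns (b0, b1, residuals)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b1 = sxy / sxx              # slope
    b0 = y_mean - b1 * x_mean   # intercept
    residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]  # the error term e
    return b0, b1, residuals


# Hypothetical data lying exactly on y = 1 + 2x.
b0, b1, residuals = fit_simple_linear([1, 2, 3, 4], [3, 5, 7, 9])
print(b0, b1)  # 1.0 2.0
```

With real data the residuals are non-zero, and their sum of squares is the quantity that least squares minimizes.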
29. Linear and Multiple linear Regression
• When to use
• When the no. of observations is greater than the no. of variables
• Not used in current QSAR formalisms
• Limitations
• Inaccurate when inter-correlated variables are present
• Cannot be applied when the no. of variables exceeds the no. of
observations
42. Selection of training and test set using factorial
designs
1. In factorial designs the investigated factors are varied
at fixed levels.
2. Each factor (chemical feature or descriptors) is
investigated at levels based on type of factorial
experiment.
3. Full factorial design for K chemical
features/descriptors at n levels gives n^K compounds (2^K at two levels).
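The compound count of a full factorial design can be illustrated by enumerating every combination of factor levels. In the sketch below the descriptor names and the coded levels -1/+1 (low/high) are hypothetical placeholders, and `full_factorial` is an illustrative helper rather than a function from any particular package.

```python
from itertools import product


def full_factorial(factor_names, levels=(-1, +1)):
    """Enumerate all level combinations: len(levels) ** len(factor_names) runs."""
    return [
        dict(zip(factor_names, combo))
        for combo in product(levels, repeat=len(factor_names))
    ]


# Three hypothetical descriptors at two levels give 2**3 = 8 compounds.
design = full_factorial(["logP", "sigma", "Es"])
print(len(design))  # 8
for run in design:
    print(run)
```

Each row describes one compound to look for or synthesize, with each descriptor at its low or high level. The count grows quickly with the number of descriptors.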
44. Applicability Domain in QSAR
• OECD Definition: Applicability domain (AD) of a QSAR model is
the physico-chemical, structural or biological space, knowledge or
information on which the training set of the model has been
developed, and for which it is applicable to make predictions for new
compounds.
• A new European legislation on chemicals, REACH (Registration,
Evaluation, Authorisation and Restriction of Chemicals), came into
force in 2007.
• Purpose
• Reliable application of (Q)SAR
• Interpolation is better than extrapolation
45. What are the key aspects in defining the AD of
QSAR models ?
• Identification of the subspace of chemical structures.
• Defined AD determines the degree of generalization of a given
predictive model.
• A well defined AD indicates if the endpoint for the chemical
structures under evaluation can be reliably predicted.
• Characterization of the interpolation space is very significant
to define the AD for a given QSAR model
46. How can the AD of a model be defined ?
• Range Based methods
• Bounding Box or convex hull
• PCA Bounding Box
• Distance based methods
• Geometric Methods
• Probability Density Distribution based methods
[Figure: descriptor-space plot showing a dense region and an empty region inside the bounding box or convex hull]
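The simplest range-based method, the bounding box, can be sketched in a few lines: the box is the minimum and maximum of each descriptor over the training set, and a query compound is inside the domain if every descriptor falls within those ranges. The descriptor values and function names below are hypothetical.

```python
def bounding_box(training_descriptors):
    """Per-descriptor (min, max) ranges over the training set."""
    columns = list(zip(*training_descriptors))
    return [(min(col), max(col)) for col in columns]


def in_domain(compound, box):
    """True if every descriptor of the compound lies within the box."""
    return all(low <= value <= high for value, (low, high) in zip(compound, box))


# Hypothetical training set: two descriptors per compound.
training = [(1.2, 0.10), (2.5, 0.45), (3.1, -0.20), (1.8, 0.30)]
box = bounding_box(training)

print(in_domain((2.0, 0.20), box))  # True: inside both ranges
print(in_domain((4.0, 0.20), box))  # False: first descriptor out of range
```

A bounding box can enclose empty regions where no training compound lies, so a point inside the box is not automatically well covered. Distance-based and density-based methods address this.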
48. Is it correct to say :
• “prediction result is always
reliable for a point within
the application region” ?
• “prediction is always
unreliable if the point is
outside the application
region” ?
Concluding remark