Quantitative Structure-Activity
Relationship
Elvis A. F. Martis
Graduate Student (Ph.D.)
Department of Pharmaceutical Chemistry
Bombay College of Pharmacy
• Developing New QSAR methodologies
•CoRIA and its Variants
•HomoSAR
•LISA
•eCoRIA and eQSAR
•CoOAN
•Solving Protein Structures (using NMR)
•Computational Prediction of Resistance and QMAR
•Lead optimization strategies for Anti-TB, Dengue, AD etc
• Studies on reaction pathways and transition states using ab initio and
Quantum Mechanics.
• Molecular dynamics of Drug-Cyclodextrin complexes
Research in Prof. Coutinho’s Lab
Molecular Modeling in Drug Design
• Receptor unknown, ligand unknown
• Receptor known, ligand unknown
• Receptor known, ligand known
• Receptor unknown, ligand known
What is QSAR?
Compounds + biological activity → QSAR → New compounds with improved biological activity
The number of compounds required for synthesis in
order to place 10 different groups in 4 positions of
benzene ring is 10^4
Solution: synthesize a small number of compounds
and from their data derive rules to predict the
biological activity of other compounds.
Why QSAR?
QSAR dates back to the 19th century
• A.F.A. Cros (University of Strasbourg; 1863)
Increased toxicity of alcohols with decrease in water solubility
• H. H. Meyer (University of Marburg; 1890s) and Charles Ernest
Overton (University of Zurich; 1890s) [working independently]
Toxicity of organic compounds depended on their lipophilicity
• Crum-Brown and Fraser
The physiological action of a substance was a function of its chemical
composition and constitution
• Richet
Inverse relationship between the cytotoxicities of a diverse set of simple
organic molecules and their water solubilities
• Hammett
"Sigma-rho" culture; to understand the effect of substituents on
organic reactions
• Taft
Devised a way to separate polar, steric, and resonance effects and
introduced the first steric parameter, Es
• Hansch and Fujita
The contributions of Hammett and Taft together laid the
mechanistic basis for the development of the QSAR paradigm
Hammett Equation
• Linear Free Energy Relationships
Louis Hammett (1894-1987) correlated electronic properties of organic
acids and bases with their equilibrium constants and reactivity
• Measures the electron-withdrawing or electron-donating effect of a
substituent in comparison to benzoic acid (i.e., how the substituent affects its ionization)
Consider the dissociation of benzoic acid:
Hammett Equation
› m-NO2 increases the dissociation constant (the nitro
group is an EWG that stabilizes the negative charge)
› p-NO2 exhibits a greater electron-withdrawing effect
› p-C2H5 group on benzoic acid
Hammett observed similar substituent effects on the dissociation of other
organic acids and bases, such as phenylacetic acid.
Hammett Equation
• A linear free-energy relationship is said to exist if ‘the same series of changes
in conditions affects the rate or equilibrium of a second reaction in exactly the
same way as the first’
• The free energy is proportional to the logarithm of the equilibrium
constant ($\Delta G^\circ = -RT \ln K$)
Graph for a linear free energy
relationship
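› Because the relationship is linear, the following equation was derived:
$\log(K'_X / K'_H) = \rho \, \log(K_X / K_H)$
where K_X and K_H are the constants for the substituted and unsubstituted
compound, primed constants belong to the reaction under study, ρ is the
slope of the line, and the abscissa values are always those for benzoic
acid and are given the symbol σ (substituent constant); the equation
simplifies to:
$\log(K'_X / K'_H) = \rho \, \sigma$
› ρ (reaction constant) relates the effect of substituents on that
equilibrium to the effect of those substituents on the benzoic acid
equilibrium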
› The reaction constant depends on the nature of the chemical
reaction as well as the reaction conditions (solvent, temperature,
etc.)
› The sign and magnitude of the reaction constant are indicative
of the extent of charge build up during the reaction progress
› Reactions with ρ > 0 are favored by electron withdrawing
groups (i.e., the stabilization of negative charge)
› Reactions with ρ < 0 are favored by electron donating groups
(i.e., the stabilization of positive charge)
› For benzoic acid ρ is equal to 1.00 in pure water at 25 °C
› σ is a descriptor of the substituents
› The magnitude of σ gives the relative strength of the
electron-withdrawing or -donating properties of the substituents
› σ is positive if the substituent is electron-withdrawing
› σ is negative if the substituent is electron-donating
› The relationships as developed by Hammett are termed linear
free energy relationships
› By definition, σ for hydrogen is ZERO
› Positive σ for the NO2 group indicates an electron-withdrawing effect
• m-NO2 (inductive effect); while p-NO2 (inductive + resonance
effect)
› Electronegative chlorine produces an inductive electron-withdrawing
effect
• The magnitude of the effect in the p-Cl position is less than in
the m-Cl position; the inductive effect dominates with chlorine
› The CH3O- group can be electron-donating or -withdrawing, depending
on the position of substitution
• m-CH3O: an inductive electron-withdrawing effect is seen
• p-CH3O: only a small inductive effect is expected; an
electron-donating resonance effect occurs for p-CH3O, giving an overall
electron-donating effect
Hammett Constant
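Applications of the Hammett Equation
› The prediction of the pKa of ionization equilibria
› Therefore, since $\mathrm{p}K_a = -\log K_a$,
$\mathrm{p}K_a(\mathrm{X}) = \mathrm{p}K_a(\mathrm{H}) - \rho \sum \sigma$
› For benzoic acid (ρ = 1.00, pKa = 4.20) the equation is
$\mathrm{p}K_a = 4.20 - \sum \sigma$
› Consider a substituted benzoic acid
› Given σ_meta = 0.71 for NO2 and σ_para = -0.13 for CH3 groups, the
calculated pKa = 2.91, compared to the experimental value of
2.97 (the calculated value corresponds to two meta-NO2 groups and one
para-CH3 group: 4.20 - (2 × 0.71 - 0.13) = 2.91)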
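The pKa estimate on this slide can be checked with a few lines of Python. This is a minimal sketch under stated assumptions: the parent pKa of 4.20 for benzoic acid and the substitution pattern (two meta-NO2 groups plus one para-CH3) are not given in the text and are chosen here because they reproduce the quoted 2.91. The function name `hammett_pka` is illustrative only.

```python
# Hammett estimate of the pKa of a substituted benzoic acid.
# Assumptions (not stated on the slide): pKa(benzoic acid) = 4.20, rho = 1.00.
PKA_BENZOIC = 4.20
RHO = 1.00


def hammett_pka(sigmas, pka_parent=PKA_BENZOIC, rho=RHO):
    """Return pKa(X) = pKa(H) - rho * sum(sigma) for the listed substituents."""
    return pka_parent - rho * sum(sigmas)


# Assumed pattern: two meta-NO2 groups (0.71 each) and one para-CH3 (-0.13).
sigmas = [0.71, 0.71, -0.13]
predicted = hammett_pka(sigmas)
print(f"predicted pKa = {predicted:.2f}")
```

Electron-withdrawing groups (positive σ) lower the predicted pKa, while electron-donating groups (negative σ) raise it, which matches the sign convention listed earlier.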
Applications of the Hammett Equation
› Hammett's electronic descriptors were applied in a QSAR
for the inhibition of bacterial growth by a series of
sulfonamides
› where X represents various substituents
› A QSAR was developed based on the σ values of the substituents
› where C is the minimum concentration of compound that inhibited
growth of E. coli
› The electron-withdrawing substituents favor inhibition of growth
Log P is a measure of the drug’s hydrophobicity, which was
selected as a measure of its ability to pass through cell
membranes.
The log P (or log Po/w) value reflects the relative solubility of the
drug in octanol (representing the lipid bilayer of a cell
membrane) and water (the fluid within the cell and in blood).
Log P values may be measured experimentally or, more
commonly, calculated.
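As a small illustration of the definition above, log P can be computed from the concentrations of the (neutral) compound in the two phases at equilibrium. The concentrations below are hypothetical, and `log_p` is an illustrative helper, not a routine from any particular package.

```python
import math


def log_p(conc_octanol, conc_water):
    """log P = log10([solute] in octanol / [solute] in water); same units for both."""
    if conc_octanol <= 0 or conc_water <= 0:
        raise ValueError("concentrations must be positive")
    return math.log10(conc_octanol / conc_water)


# Hypothetical shake-flask result: 100-fold higher concentration in octanol.
example = log_p(conc_octanol=5.0, conc_water=0.05)
print(f"log P = {example:.2f}")
```

A positive value means the compound prefers octanol (more hydrophobic); a negative value means it prefers water.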
Hansch’s Approach
Hansch’s Approach
› The Hammett substituent constant (σ) reflects the drug
molecule’s intrinsic reactivity, related to electronic
factors caused by aryl substituents.
› In chemical reactions, aromatic ring substituents can
alter the rate of reaction by up to 6 orders of magnitude!
› For example, the rate of the reaction below is ~10^5 times
slower when X = NO2 than when X = CH3
[Scheme: X-substituted aryl C(H)-Cl + CH3OH → X-substituted aryl C(H)-OCH3 + HCl]
Free-Wilson Analysis
› Log 1/C = Σ ai + μ
where Log 1/C = predicted activity (C is the concentration producing the effect),
ai = contribution per group, and
μ = activity of reference
Log 1/C = -0.30 [m-F] + 0.21 [m-Cl] + 0.43 [m-Br]
+ 0.58 [m-I] + 0.45 [m-Me] + 0.34 [p-F] + 0.77 [p-Cl]
+ 1.02 [p-Br] + 1.43 [p-I] + 1.26 [p-Me] + 7.82
[Structure: parent compound with ring substituents X and Y, containing N and Br, as the HCl salt]
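The Free-Wilson equation above is additive, so a prediction is just a lookup and a sum. The sketch below copies the coefficients from the equation; treating hydrogen at either position as contributing 0 (the unsubstituted compound being the reference with activity 7.82) is an assumption, and the names `META`, `PARA` and `free_wilson` are illustrative.

```python
# Group contributions taken from the Free-Wilson equation above.
# Assumption: H contributes 0.0, i.e. the unsubstituted compound is the reference.
META = {"H": 0.0, "F": -0.30, "Cl": 0.21, "Br": 0.43, "I": 0.58, "Me": 0.45}
PARA = {"H": 0.0, "F": 0.34, "Cl": 0.77, "Br": 1.02, "I": 1.43, "Me": 1.26}
REFERENCE = 7.82  # activity of the reference compound


def free_wilson(meta="H", para="H"):
    """Predicted log 1/C = contribution(meta) + contribution(para) + reference."""
    return META[meta] + PARA[para] + REFERENCE


# Rank every meta/para combination by predicted activity.
ranked = sorted(
    ((free_wilson(m, p), m, p) for m in META for p in PARA),
    reverse=True,
)
best_activity, best_meta, best_para = ranked[0]
print(f"m-Cl, p-Br: {free_wilson('Cl', 'Br'):.2f}")
print(f"best: m-{best_meta}, p-{best_para} -> {best_activity:.2f}")
```

Because the model has no interaction terms, the best combination is simply the best meta group paired with the best para group.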
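Topliss Scheme
Used to decide which substituents to use if optimising compounds
one by one (where synthesis is complex and slow)
Example: Aromatic substituents
[Figure: Topliss decision tree for aromatic substituents. Starting from H, the 4-Cl analogue is made first; each new analogue is rated L (less active), E (equally active) or M (more active) than its predecessor to select the next branch. Substituents appearing in the tree: H, 4-Cl, 4-CH3, 4-OMe, 3,4-Cl2, 4-tBu, 3-CF3-4-Cl, 3-Cl, 4-CF3, 2,4-Cl2, 4-NO2, 3-NMe2, 3-CF3-4-NO2, 3-CH3, 2-Cl, 3-CF3, 3,5-Cl2, 3-NO2, 4-F, 4-NMe2, 3-Me-4-NMe2, 4-NH2. One branch is marked "See Central Branch".]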
Rationale
Replace H with para-Cl (+π and +σ)
• Activity increases: +π and/or +σ advantageous; add second Cl to
increase π and σ further
• Activity shows little change: favourable π, unfavourable σ; replace
with Me (+π and -σ)
• Activity decreases: +π and/or +σ disadvantageous; replace with OMe
(-π and -σ)
Further changes suggested based on arguments of π, σ and
steric strain
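The first decision of the Topliss rationale can be written as a small lookup. This is only a sketch of that single step: the `tolerance` used to decide that two activities are "equal" is a hypothetical parameter, and the function names are illustrative.

```python
# First Topliss decision for an aromatic ring: compare the 4-Cl analogue
# with the unsubstituted (H) parent and pick the next compound to make.
NEXT_ANALOGUE = {
    "more": "3,4-Cl2",  # +pi and/or +sigma helped: increase both further
    "equal": "4-CH3",   # pi favourable, sigma unfavourable: keep +pi, use -sigma
    "less": "4-OMe",    # +pi and/or +sigma hurt: try -pi and -sigma
}


def classify(activity_parent, activity_4cl, tolerance=0.1):
    """Rate the 4-Cl analogue as 'more', 'equal' or 'less' active than the parent."""
    diff = activity_4cl - activity_parent
    if diff > tolerance:
        return "more"
    if diff < -tolerance:
        return "less"
    return "equal"


def next_substituent(activity_parent, activity_4cl, tolerance=0.1):
    """Return the substituent suggested by the first branch of the scheme."""
    return NEXT_ANALOGUE[classify(activity_parent, activity_4cl, tolerance)]


# Hypothetical log 1/C values for the parent and its 4-Cl analogue.
print(next_substituent(5.0, 6.0))
```

Later branches follow the same pattern, with each new analogue compared against its predecessor.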
Chemometrics in
QSAR
Contents
I. Basics of regression analysis - linear and multiple
linear regression
II. Introduction to PCA & PCR, PLS, ANN and GFA
III. Validation of QSAR models
A. Correlation coefficients (r² and r²pred), F-test,
standard error
B. Cross-validation by calculation of q², bootstrap
analysis and randomization
IV. Applicability domain for predictions using a QSAR model
V. Design of training and test sets using factorial design
Linear and multiple linear Regression
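(Image Courtesy: CAMO Software AS)
[Figure: scatter plots contrasting linear data and non-linear data]
Data structure
[Figure: an X-variable column and a Y-variable column; objects, same number in x- and y-column]
Simple linear regression
$y = b_0 + b_1 x + e$
[Figure: fitted line on x-y axes with intercept b0 and slope b1]
Least squares (LS) used
for estimation of regression coefficients
$b_1 = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2}, \qquad b_0 = \bar{y} - b_1 \bar{x}$
The correlation coefficient of the fit is
$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{[\sum (x - \bar{x})^2][\sum (y - \bar{y})^2]}}$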
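The least-squares estimates for the simple model y = b0 + b1·x + e can be computed directly from the sums of squares. The sketch below uses only the standard library; the data points are hypothetical and `fit_line` is an illustrative name.

```python
def fit_line(xs, ys):
    """Least-squares estimates (b0, b1) for y = b0 + b1 * x + e."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    b1 = sxy / sxx              # slope
    b0 = y_mean - b1 * x_mean   # intercept
    return b0, b1


# Hypothetical descriptor (x) and activity (y) values.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.1, 4.9, 7.2, 8.8]
b0, b1 = fit_line(xs, ys)
print(f"y = {b0:.2f} + {b1:.2f} x")

# Prediction for a future x, as in the regression workflow.
x_new = 5.0
print(f"predicted y at x = {x_new}: {b0 + b1 * x_new:.2f}")
```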
What does regression analysis do?
[Diagram: Data (X, Y) → Regression analysis → Model + Error; Model + Future X → Prediction. Side notes: Outliers? Pre-processing, Interpretation]
Linear and Multiple Linear Regression
• When to use
• When the number of observations exceeds the number of variables
• Not used in current QSAR formalisms
• Limitations
• Inaccurate when inter-correlated variables are present
• Cannot be applied when the number of variables exceeds the number of
observations
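Principal Component Analysis (PCA)
PCA
• Overcomes the limitations of linear regression listed above (inter-correlated
variables; more variables than observations)
• Data compression
Basic Principle of Principal Components
Variable matrix (X) = Score matrix (T) × Loading matrix (Pᵀ) + Error or residual (E)
Regression by data compression
[Figure: PCA compresses the variables x1, x2, x3 into the score t on PC1; y is then regressed on the scores ti with coefficient q]
More than one Principal Component
[Figure: percentage of variance explained by PC1, PC2, ... (labels in the original: 75%, 15%, 15%, 100%)]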
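To make the "variance explained per component" idea concrete, the sketch below runs PCA on two correlated descriptors. With only two variables the eigenvalues of the covariance matrix have a closed form, so no linear-algebra library is needed. The data are hypothetical, the variables are mean-centred but not scaled (an assumption), and `explained_variance_2d` is an illustrative name.

```python
import math


def explained_variance_2d(xs, ys):
    """Fraction of total variance carried by PC1 and PC2 for two variables."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # Sample covariance matrix [[sxx, sxy], [sxy, syy]] of the centred data.
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    # Eigenvalues of a symmetric 2x2 matrix in closed form.
    centre = (sxx + syy) / 2
    spread = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    lam1, lam2 = centre + spread, centre - spread
    total = lam1 + lam2
    return lam1 / total, lam2 / total


# Two strongly inter-correlated (hypothetical) descriptors.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.1, 3.9, 6.2, 7.8, 10.1]
pc1, pc2 = explained_variance_2d(x1, x2)
print(f"PC1: {pc1:.1%}, PC2: {pc2:.1%}")
```

When descriptors are this collinear, nearly all the variance sits in PC1, which is why regressing on a few scores sidesteps the collinearity problem of MLR.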
Partial Least Squares (PLS)
[Diagram: the variable matrix is decomposed into a score matrix and loading matrices]
Comparison of MLR, PCR and PLS
[Figure: MLR regresses y directly on x1, x2, x3, x4; PCR and PLS regress y on latent scores t1 and t2 derived from x1, x2, x3, x4]
Genetic Function Approximation (GFA) and
Genetic/Partial Least Squares (G/PLS)
Artificial Neural Networks (ANN)
Backpropagation Networks
› Popularized by Rumelhart, Hinton and Williams (1986) and the PDP
volumes edited by Rumelhart and McClelland
› To get past the limits of linear classifiers, we can
construct multilayer networks. Typically we have fully
connected, feedforward networks.
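[Figure: fully connected feedforward network. Input layer: I1, I2, I3 plus a bias unit fixed at 1; hidden layer: H1, H2 plus a bias unit fixed at 1; output layer: O1, O2. Weights W_{i,j} connect input to hidden units and W_{j,k} connect hidden to output units.]
$H(x) = \frac{1}{1 + e^{-\sum_i w_{i,x} I_i}}$
$O(x) = \frac{1}{1 + e^{-\sum_j w_{j,x} H_j}}$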
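The hidden and output activations on this slide are logistic (sigmoid) functions of weighted sums, with a constant 1 appended to each layer as the bias unit. A minimal forward pass for a 3-2-2 network might look as follows; the weights are arbitrary placeholders rather than trained values, and the function names are illustrative.

```python
import math


def sigmoid(z):
    """Logistic activation 1 / (1 + e^-z)."""
    return 1.0 / (1.0 + math.exp(-z))


def layer(inputs, weights):
    """Activations of one layer; the last weight in each row multiplies the bias 1."""
    extended = list(inputs) + [1.0]
    return [sigmoid(sum(w * v for w, v in zip(row, extended))) for row in weights]


def forward(inputs, w_hidden, w_output):
    """Input layer -> hidden layer -> output layer."""
    hidden = layer(inputs, w_hidden)
    return layer(hidden, w_output)


# Placeholder weights: 3 inputs (+ bias) -> 2 hidden units (+ bias) -> 2 outputs.
w_hidden = [[0.5, -0.2, 0.1, 0.0],
            [-0.3, 0.8, 0.4, 0.1]]
w_output = [[1.0, -1.0, 0.0],
            [0.5, 0.5, -0.2]]

outputs = forward([1.0, 0.0, 1.0], w_hidden, w_output)
print([round(o, 3) for o in outputs])
```

Backpropagation would then adjust `w_hidden` and `w_output` from the output errors; that training step is not shown here.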
Validation of QSAR Models
• Internal validation:
• The correlation coefficient, r
• The coefficient of determination, r²
• Cross-validation (CV)
• Leave-one-out
• Leave-few-out
• Bootstrapping
• Randomization or y-scrambling
• Fisher statistic (F value)
• Full
• Sequential
• External Validation
• Predictive correlation coefficient (r²pred)
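Two of the internal-validation statistics listed above, r² and the leave-one-out cross-validated q², can be sketched for a one-descriptor model. The data are hypothetical. Here q² is taken as 1 - PRESS/SS_total with SS_total computed around the mean of the full training set, which is one common convention and an assumption of this sketch.

```python
def fit_line(xs, ys):
    """Least-squares (b0, b1) for y = b0 + b1 * x."""
    n = len(xs)
    x_mean, y_mean = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    b1 = sxy / sxx
    return y_mean - b1 * x_mean, b1


def total_sum_of_squares(ys):
    y_mean = sum(ys) / len(ys)
    return sum((y - y_mean) ** 2 for y in ys)


def r_squared(xs, ys):
    """Fitted r2 = 1 - SS_residual / SS_total on the training data."""
    b0, b1 = fit_line(xs, ys)
    ss_res = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    return 1.0 - ss_res / total_sum_of_squares(ys)


def q_squared_loo(xs, ys):
    """Leave-one-out q2 = 1 - PRESS / SS_total."""
    press = 0.0
    for i in range(len(xs)):
        # Refit without compound i, then predict compound i.
        b0, b1 = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (b0 + b1 * xs[i])) ** 2
    return 1.0 - press / total_sum_of_squares(ys)


# Hypothetical descriptor values and activities.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.2, 1.9, 3.2, 3.8, 5.1, 6.1]
r2 = r_squared(xs, ys)
q2 = q_squared_loo(xs, ys)
print(f"r2 = {r2:.3f}, q2 (LOO) = {q2:.3f}")
```

For a least-squares model q² cannot exceed r², so a large gap between the two is a warning sign of overfitting.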
Practical Considerations for QSAR modeling
How to Begin?
What to do?
What to Expect?
How to Conclude?
Selection of training and test set using factorial
designs
1. In factorial designs the investigated factors are varied
at fixed levels.
2. Each factor (chemical feature or descriptor) is
investigated at levels based on the type of factorial
experiment.
3. A full factorial design for K chemical
features/descriptors at n levels gives n^K compounds
(2^K at two levels).
Experiments in a design
with three variables
Group       π       Es      MR
H           0.00    0.00    1.03
CH3         0.56   -1.24    5.65
C2H5        1.02   -1.31   10.30
n-C3H7      1.55   -1.60   14.96
i-C3H7      1.53   -1.71   14.96
n-C4H9      2.13   -1.63   19.61
t-C4H9      1.98   -2.78   19.62
H2C=CH**    0.82           10.99
C6H5**      1.96   -3.82   25.36
CH2Cl       0.17   -1.48   10.49
CF3         0.88   -2.40    5.02
CN         -0.57   -0.51    6.33
F           0.14   -0.46    0.92
Cl          0.71   -0.97    6.03
Br          0.86   -1.16    8.88
I           1.12   -1.40   13.94
OH         -0.67   -0.55    2.85
OCH3       -0.02   -0.55    7.87
OCH2CH3     0.38           12.47
SH          0.39   -1.07    9.22
SCH3        0.61   -1.07   13.82
NO2**      -0.28   -2.52    7.36
2^3 factorial design
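A full factorial design simply enumerates every combination of factor levels. The sketch below builds the two-level, three-factor design for the descriptors π, Es and MR; the "low"/"high" labels are placeholders for whatever cut-offs are chosen, and `full_factorial` is an illustrative name.

```python
from itertools import product


def full_factorial(levels_by_factor):
    """Return every combination of levels as a list of {factor: level} dicts."""
    names = list(levels_by_factor)
    combos = product(*(levels_by_factor[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]


# Three descriptors, each at two levels: 2^3 = 8 design points.
design = full_factorial({
    "pi": ("low", "high"),
    "Es": ("low", "high"),
    "MR": ("low", "high"),
})

for run, point in enumerate(design, start=1):
    print(run, point)
```

Each design point is then filled with a substituent from the table whose π, Es and MR values fall into the corresponding levels.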
Applicability Domain in QSAR
• OECD Definition: Applicability domain (AD) of a QSAR model is
the physico-chemical, structural or biological space, knowledge or
information on which the training set of the model has been
developed, and for which it is applicable to make predictions for new
compounds.
• A new European legislation on chemicals, REACH (Registration,
Evaluation, Authorization and restriction of Chemicals), came into
force in 2007.
• Purpose
• Reliable application of (Q)SAR
• Interpolation is more reliable than extrapolation
What are the key aspects in defining the AD of
QSAR models ?
• Identification of the subspace of chemical structures.
• Defined AD determines the degree of generalization of a given
predictive model.
• A well defined AD indicates if the endpoint for the chemical
structures under evaluation can be reliably predicted.
• Characterization of the interpolation space is very significant
to define the AD for a given QSAR model
How can the AD of a model be defined ?
• Range Based methods
• Bounding Box or convex hull
• PCA Bounding Box
• Distance based methods
• Geometric Methods
• Probability Density Distribution based methods
[Figure: training compounds in descriptor space enclosed by a bounding box or convex hull, showing a dense region and an empty region; AD approaches: descriptor ranges, distances, geometric, probabilistic]
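The simplest range-based approach, the bounding box, can be sketched in a few lines: a query compound is inside the AD if every descriptor lies within the range spanned by the training set. The training values below are (π, MR) pairs for CH3, Cl, Br and I taken from the substituent table earlier in the deck; using only two descriptors, the query points, and the function names are assumptions of this sketch.

```python
def descriptor_ranges(training_set):
    """Per-descriptor (min, max) over the training compounds."""
    columns = list(zip(*training_set))
    return [(min(col), max(col)) for col in columns]


def in_bounding_box(compound, ranges):
    """True if every descriptor of the compound lies within the training range."""
    return all(low <= value <= high for value, (low, high) in zip(compound, ranges))


# (pi, MR) for CH3, Cl, Br and I.
training = [(0.56, 5.65), (0.71, 6.03), (0.86, 8.88), (1.12, 13.94)]
ranges = descriptor_ranges(training)

inside = in_bounding_box((0.80, 7.00), ranges)   # hypothetical query compound
outside = in_bounding_box((0.88, 5.02), ranges)  # CF3: MR below the training range
print(ranges, inside, outside)
```

A bounding box says nothing about empty regions inside the box, so a point that passes this check may still be far from every training compound.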
Is it correct to say :
• “prediction result is always
reliable for a point within
the application region” ?
• “prediction is always
unreliable if the point is
outside the application
region” ?
Concluding remark
Questions?
THANK YOU For bearing with me

Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
Ana Luísa Pinho
 

Recently uploaded (20)

Unveiling the Energy Potential of Marshmallow Deposits.pdf
Unveiling the Energy Potential of Marshmallow Deposits.pdfUnveiling the Energy Potential of Marshmallow Deposits.pdf
Unveiling the Energy Potential of Marshmallow Deposits.pdf
 
Lateral Ventricles.pdf very easy good diagrams comprehensive
Lateral Ventricles.pdf very easy good diagrams comprehensiveLateral Ventricles.pdf very easy good diagrams comprehensive
Lateral Ventricles.pdf very easy good diagrams comprehensive
 
GBSN- Microbiology (Lab 3) Gram Staining
GBSN- Microbiology (Lab 3) Gram StainingGBSN- Microbiology (Lab 3) Gram Staining
GBSN- Microbiology (Lab 3) Gram Staining
 
What is greenhouse gasses and how many gasses are there to affect the Earth.
What is greenhouse gasses and how many gasses are there to affect the Earth.What is greenhouse gasses and how many gasses are there to affect the Earth.
What is greenhouse gasses and how many gasses are there to affect the Earth.
 
In silico drugs analogue design: novobiocin analogues.pptx
In silico drugs analogue design: novobiocin analogues.pptxIn silico drugs analogue design: novobiocin analogues.pptx
In silico drugs analogue design: novobiocin analogues.pptx
 
Orion Air Quality Monitoring Systems - CWS
Orion Air Quality Monitoring Systems - CWSOrion Air Quality Monitoring Systems - CWS
Orion Air Quality Monitoring Systems - CWS
 
Introduction to Mean Field Theory(MFT).pptx
Introduction to Mean Field Theory(MFT).pptxIntroduction to Mean Field Theory(MFT).pptx
Introduction to Mean Field Theory(MFT).pptx
 
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
 
Nucleic Acid-its structural and functional complexity.
Nucleic Acid-its structural and functional complexity.Nucleic Acid-its structural and functional complexity.
Nucleic Acid-its structural and functional complexity.
 
3D Hybrid PIC simulation of the plasma expansion (ISSS-14)
3D Hybrid PIC simulation of the plasma expansion (ISSS-14)3D Hybrid PIC simulation of the plasma expansion (ISSS-14)
3D Hybrid PIC simulation of the plasma expansion (ISSS-14)
 
in vitro propagation of plants lecture note.pptx
in vitro propagation of plants lecture note.pptxin vitro propagation of plants lecture note.pptx
in vitro propagation of plants lecture note.pptx
 
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
 
platelets_clotting_biogenesis.clot retractionpptx
platelets_clotting_biogenesis.clot retractionpptxplatelets_clotting_biogenesis.clot retractionpptx
platelets_clotting_biogenesis.clot retractionpptx
 
Phenomics assisted breeding in crop improvement
Phenomics assisted breeding in crop improvementPhenomics assisted breeding in crop improvement
Phenomics assisted breeding in crop improvement
 
PRESENTATION ABOUT PRINCIPLE OF COSMATIC EVALUATION
PRESENTATION ABOUT PRINCIPLE OF COSMATIC EVALUATIONPRESENTATION ABOUT PRINCIPLE OF COSMATIC EVALUATION
PRESENTATION ABOUT PRINCIPLE OF COSMATIC EVALUATION
 
bordetella pertussis.................................ppt
bordetella pertussis.................................pptbordetella pertussis.................................ppt
bordetella pertussis.................................ppt
 
BLOOD AND BLOOD COMPONENT- introduction to blood physiology
BLOOD AND BLOOD COMPONENT- introduction to blood physiologyBLOOD AND BLOOD COMPONENT- introduction to blood physiology
BLOOD AND BLOOD COMPONENT- introduction to blood physiology
 
Toxic effects of heavy metals : Lead and Arsenic
Toxic effects of heavy metals : Lead and ArsenicToxic effects of heavy metals : Lead and Arsenic
Toxic effects of heavy metals : Lead and Arsenic
 
Chapter 12 - climate change and the energy crisis
Chapter 12 - climate change and the energy crisisChapter 12 - climate change and the energy crisis
Chapter 12 - climate change and the energy crisis
 
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
 

QSAR

  • 1. Quantitative Structure-Activity Relationship. Elvis A. F. Martis, Graduate Student (Ph.D.), Department of Pharmaceutical Chemistry, Bombay College of Pharmacy
  • 2. Research in Prof. Coutinho’s Lab • Developing new QSAR methodologies • CoRIA and its variants • HomoSAR • LISA • eCoRIA and eQSAR • CoOAN • Solving protein structures (using NMR) • Computational prediction of resistance and QMAR • Lead optimization strategies for anti-TB, dengue, AD, etc. • Studies on reaction pathways and transition states using ab initio and quantum mechanics • Molecular dynamics of drug-cyclodextrin complexes
  • 3. Molecular Modeling in Drug Design Receptor Unknown- Ligand Unknown Receptor Known- Ligand Unknown Receptor Known- Ligand Known Receptor Unknown – Ligand Known
  • 4. What is QSAR? Compounds + biological activity New compounds with improved biological activity QSAR
  • 5. Why QSAR? The number of compounds that must be synthesized to place 10 different groups at 4 positions of a benzene ring is 10^4. Solution: synthesize a small number of compounds and, from their data, derive rules to predict the biological activity of other compounds.
  • 6. QSAR dates back to the 19th century • A.F.A. Cros (University of Strasbourg; 1863): toxicity of alcohols increases as their water solubility decreases • H. H. Meyer (University of Marburg; 1890s) and Charles Ernest Overton (University of Zurich; 1890s), working independently: toxicity of organic compounds depends on their lipophilicity • Crum-Brown and Fraser: the physiological action of a substance is a function of its chemical composition and constitution • Richet: inverse relationship between the cytotoxicities of a diverse set of simple organic molecules and their water solubilities
  • 7. • Hammett: the "sigma-rho" culture, used to understand the effect of substituents on organic reactions • Taft: devised a way to separate polar, steric, and resonance effects and introduced the first steric parameter, Es • Hansch and Fujita: the contributions of Hammett and Taft together laid the mechanistic basis for their development of the QSAR paradigm
  • 8. Hammett Equation • Linear free energy relationships: Louis Hammett (1894-1987) correlated the electronic properties of organic acids and bases with their equilibrium constants and reactivity • Measures the electron-withdrawing or electron-donating effect of a substituent, relative to benzoic acid, and how it affects ionization. Consider the dissociation of benzoic acid:
  • 9. Hammett Equation › m-NO2 increases dissociation constant (nitro group is EWG stabilizing the negative charge) › p-NO2 exhibits greater electron withdrawing effect › p-C2H5 group on benzoic acid
  • 10. Hammett Equation: Hammett observed similar substituent effects on the dissociation of other organic acids and bases, such as phenylacetic acid.
  • 11. • A linear free-energy relationship is said to exist if ‘the same series of changes in conditions affects the rate or equilibrium of a second reaction in exactly the same way as the first’ • The free energy is proportional to the logarithm of the equilibrium constant. [Graph for a linear free energy relationship]
  • 12. › Because the relationship is linear, it can be written as an equation in which ρ is the slope of the line and the abscissa values are always those for benzoic acid and are given the symbol σ (substituent constant); in simplified form: log(K_X/K_H) = ρσ, where K_X and K_H are the equilibrium constants of the substituted and unsubstituted compounds › ρ (reaction constant) relates the effect of substituents on that equilibrium to the effect of those substituents on the benzoic acid equilibrium › The reaction constant depends on the nature of the chemical reaction as well as the reaction conditions (solvent, temperature, etc.) › The sign and magnitude of the reaction constant indicate the extent of charge build-up as the reaction progresses
  • 13. › Reactions with ρ > 0 are favored by electron-withdrawing groups (i.e., the stabilization of negative charge) › Reactions with ρ < 0 are favored by electron-donating groups (i.e., the stabilization of positive charge) › For benzoic acid, ρ is equal to 1.00 in pure water at 25 °C › σ is a descriptor of the substituents › The magnitude of σ gives the relative strength of the electron-withdrawing or -donating properties of the substituents › σ is positive if the substituent is electron-withdrawing › σ is negative if the substituent is electron-donating › The relationships developed by Hammett are termed linear free energy relationships
  • 14. Hammett Constant › By definition, σ for hydrogen is zero › Positive σ for the NO2 group indicates an electron-withdrawing effect: m-NO2 (inductive effect), p-NO2 (inductive + resonance effect) › Electronegative chlorine produces an inductive electron-withdrawing effect: the magnitude of the effect in the p-Cl position is less than in the m-Cl position, and only the inductive effect is possible with chlorine › The CH3O- group can be electron-donating or -withdrawing, depending on the position of substitution: for m-CH3O an inductive electron-withdrawing effect is seen; for p-CH3O only a small inductive effect is expected, and an electron-donating resonance effect occurs, giving an overall electron-donating effect
  • 15. Applications of the Hammett Equation › The prediction of the pKa of ionization equilibria › Therefore, › For benzoic acid the equation is › Consider for substituted benzoic acid › Given σmeta = 0.71 for NO2 and σpara = -0.13 for CH3 groups, the calculated pKa = 2.91, compared to the experimental value of 2.97
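The pKa prediction on this slide can be sketched numerically. The structure image and the intermediate equations are missing from the transcript, so the following is only an illustration under stated assumptions: the relation pKa = pKa(benzoic acid) - ρΣσ, a reference pKa of 4.20 for benzoic acid, and a substitution pattern of two meta-NO2 groups plus one para-CH3 group are all assumptions, chosen because together they reproduce the quoted value of 2.91. Only the two σ values and ρ = 1.00 come from the slides.

```python
# Hammett-type pKa estimate for a substituted benzoic acid (illustrative sketch).
# Assumed relation: pKa(substituted) = pKa(benzoic acid) - rho * sum(sigma)

PKA_BENZOIC_ACID = 4.20  # assumption: reference pKa, not stated in the slides
RHO_BENZOIC_ACID = 1.00  # reaction constant for benzoic acid ionization (slide 13)

# Substituent constants quoted on the slide.
SIGMA = {
    "m-NO2": 0.71,
    "p-CH3": -0.13,
}


def hammett_pka(substituents, pka_ref=PKA_BENZOIC_ACID, rho=RHO_BENZOIC_ACID):
    """Estimate pKa from the sum of the substituent constants."""
    total_sigma = sum(SIGMA[name] for name in substituents)
    return pka_ref - rho * total_sigma


# Assumption: two meta-NO2 groups and one para-CH3 group on the ring.
estimate = hammett_pka(["m-NO2", "m-NO2", "p-CH3"])
print(round(estimate, 2))
```

With these assumptions the summed σ is 1.29, so the estimate lands at the slide's calculated value; a different reference pKa or substitution pattern would shift it accordingly.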
  • 16. Applications of the Hammett Equation › The applicability of Hammett's electronic descriptors in a QSAR relating the inhibition of bacterial growth by a series of sulfonamides › where X represents various substituents › A QSAR was developed based on the σ values of the substituents › where C is the minimum concentration of compound that inhibited growth of E. coli › The electron-withdrawing substituents favor inhibition of growth
  • 17. Log P is a measure of the drug’s hydrophobicity, which was selected as a measure of its ability to pass through cell membranes. The log P (or log Po/w) value reflects the relative solubility of the drug in octanol (representing the lipid bilayer of a cell membrane) and water (the fluid within the cell and in blood). Log P values may be measured experimentally or, more commonly, calculated. Hansch’s Approach
  • 19. › The Hammett substituent constant (σ) reflects the drug molecule’s intrinsic reactivity, related to electronic factors caused by aryl substituents. › In chemical reactions, aromatic ring substituents can alter the rate of reaction by up to 6 orders of magnitude! › For example, the rate of the reaction below is ~10^5 times slower when X = NO2 than when X = CH3. [Reaction scheme: an X-substituted chloride (C-Cl) reacts with CH3OH to give the methyl ether (C-OCH3) and HCl]
  • 20. Free-Wilson Analysis › log 1/C = Σ ai + μ, where log 1/C = predicted activity, ai = contribution per group, and μ = activity of the reference compound. Example: log 1/C = -0.30 [m-F] + 0.21 [m-Cl] + 0.43 [m-Br] + 0.58 [m-I] + 0.45 [m-Me] + 0.34 [p-F] + 0.77 [p-Cl] + 1.02 [p-Br] + 1.43 [p-I] + 1.26 [p-Me] + 7.82 [Structure image: parent compound with variable groups X and Y, as the HCl salt]
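The Free-Wilson equation on this slide is additive, so a prediction is just the reference activity plus the contribution of each group present. A minimal sketch follows, using the coefficients quoted on the slide; treating an unsubstituted position (H) as contributing zero is an assumption, and the function name is hypothetical.

```python
# Free-Wilson prediction: log 1/C = sum of group contributions + reference activity.
# Coefficients are those quoted on the slide.
CONTRIBUTIONS = {
    ("m", "F"): -0.30, ("m", "Cl"): 0.21, ("m", "Br"): 0.43,
    ("m", "I"): 0.58, ("m", "Me"): 0.45,
    ("p", "F"): 0.34, ("p", "Cl"): 0.77, ("p", "Br"): 1.02,
    ("p", "I"): 1.43, ("p", "Me"): 1.26,
}
REFERENCE_ACTIVITY = 7.82  # constant term of the slide's equation


def contribution(position, group):
    """Contribution of one group; H is assumed to contribute nothing."""
    if group == "H":
        return 0.0
    return CONTRIBUTIONS[(position, group)]


def free_wilson_activity(meta="H", para="H"):
    """Predicted log 1/C for a compound with the given meta and para groups."""
    return REFERENCE_ACTIVITY + contribution("m", meta) + contribution("p", para)


print(round(free_wilson_activity(meta="Cl", para="Br"), 2))
```

A group outside the fitted set raises a `KeyError` rather than silently predicting, which reflects the fact that a Free-Wilson model cannot estimate contributions for substituents it has never seen.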
  • 21. Topliss Scheme: Used to decide which substituents to use when optimising compounds one by one (where synthesis is complex and slow). Example: aromatic substituents. [Decision tree figure: starting from H and then 4-Cl, each analogue is rated less (L), equally (E) or more (M) active than its predecessor to select the next substituent; substituents appearing in the tree: 4-CH3, 4-OMe, 3,4-Cl2, 4-t-Bu, 3-CF3-4-Cl, 3-Cl, 4-CF3, 2,4-Cl2, 4-NO2, 3-NMe2, 3-CF3-4-NO2, 3-CH3, 2-Cl, 3-CF3, 3,5-Cl2, 3-NO2, 4-F, 4-NMe2, 3-Me-4-NMe2, 4-NH2]
  • 22. Topliss Scheme: Rationale. Replace H with para-Cl (+π and +σ). If activity increases, +π and/or +σ is advantageous: add a second Cl to increase π and σ further. If activity changes little, favourable π is offset by unfavourable σ: replace with Me (+π and -σ). If activity decreases, +π and/or +σ is disadvantageous: replace with OMe (-π and -σ). Further changes are suggested based on arguments of π, σ and steric strain.
  • 24. Contents I. Basics of regression analysis: linear and multiple linear regression II. Introduction to PCA & PCR, PLS, ANN and GFA III. Validation of QSAR models A. Correlation coefficients (r² and r²pred), F-test, standard error B. Cross-validation by calculation of q², bootstrap analysis and randomization IV. Applicability domain for predictions using a QSAR model V. Design of training and test sets using factorial design
  • 25. Linear and multiple linear Regression (Image Courtesy: CAMO Software AS) [Figure: linear data and non-linear data]
  • 26. Data structure: X-variable and Y-variable columns; objects, same number in x- and y-column [Figure: two columns of example values, 2, 4, 1, ... and 7, 6, 8, ...]
  • 27. Simple linear regression: y = b0 + b1x + e, where b0 is the intercept, b1 is the slope and e is the error. Least squares (LS) is used for estimation of the regression coefficients: b1 = Σ(x - x̄)(y - ȳ) / Σ(x - x̄)²
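The least-squares fit of y = b0 + b1x + e can be sketched in a few lines. The code below is an illustration only: it uses the standard closed-form estimates (slope from the centered cross-products, intercept from the means), and the function name and data are made up for the example.

```python
def fit_line(x, y):
    """Least-squares estimates (b0, b1) for y = b0 + b1*x + e."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    # Sum of centered cross-products and sum of squared x deviations.
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    b1 = sxy / sxx               # slope
    b0 = y_mean - b1 * x_mean    # intercept: the line passes through the means
    return b0, b1


# Hypothetical data lying exactly on y = 1 + 2x.
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]
b0, b1 = fit_line(x, y)
print(b0, b1)
```

For data with scatter, the same function returns the line that minimizes the sum of squared residuals e.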
  • 28. What does regression analysis do? [Flow diagram: data (X, Y), pre-processing (outliers?), regression analysis, model, interpretation; future X, prediction]
  • 29. Linear and Multiple linear Regression • When to use • When the no. of observations is greater than the no. of variables • Not used in current QSAR formalisms • Limitations • Inaccurate when inter-correlated variables are present • Cannot be applied when the no. of variables is greater than the no. of observations
  • 30. Principal Component Analysis (PCA) • Overcomes the limitations of linear regression (inter-correlated variables; more variables than observations) • Data compression
  • 31. Basic Principle of Principal Components: Variable matrix (X) = score matrix (T) × loading matrix (P) + error or residual (E)
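The decomposition of the variable matrix into scores, loadings and a residual can be illustrated for a single component. The sketch below is not taken from the slides: it extracts the first principal component by power iteration on the covariance matrix, which is one simple technique among several, and all names and data are hypothetical.

```python
import math


def first_principal_component(rows, iterations=100):
    """Return (scores, loading, residual) for the first principal component.

    Power iteration is used here for brevity; it assumes the start vector
    is not orthogonal to the leading component.
    """
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    centered = [[r[j] - means[j] for j in range(k)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(k)]
           for a in range(k)]
    loading = [1.0] * k
    for _ in range(iterations):
        loading = [sum(cov[a][b] * loading[b] for b in range(k)) for a in range(k)]
        norm = math.sqrt(sum(v * v for v in loading))
        loading = [v / norm for v in loading]
    # Scores: projection of each centered object onto the loading vector.
    scores = [sum(c[j] * loading[j] for j in range(k)) for c in centered]
    # Residual: what the single component fails to reproduce.
    residual = [[c[j] - t * loading[j] for j in range(k)]
                for c, t in zip(centered, scores)]
    return scores, loading, residual


# Hypothetical, perfectly inter-correlated data: the second variable is twice the first.
data = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
scores, loading, residual = first_principal_component(data)
print(loading)
```

Because the two variables are perfectly correlated, one component reproduces the centered data and the residual is essentially zero, which is the data-compression idea in miniature.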
  • 32. Regression by data compression: PCA compresses the data (x1, x2, x3) onto PC1, giving t-scores (ti); y is then regressed on the scores (coefficient q)
  • 33. More than one Principal Component [Figure: PC1 and PC2; labels 75%, 15%, 15%, 100%]
  • 34. Partial Least Squares (PLS) Variable Matrix Score Matrix Loading Matrix Loading Matrix
  • 35. Comparison of MLR, PCA and PLS [Figure: three diagrams relating x1, x2, x3, x4 to y: directly (MLR), and through latent scores t1, t2 (PCR and PLS)]
  • 36. Genetic Function Approximation (GFA) and Genetic/Partial Least Squares (G/PLS)
  • 39. Backpropagation Networks › Attributed to Rumelhart and McClelland, late 1970s › To bypass the linear classification problem, we can construct multilayer networks. Typically we have fully connected, feedforward networks. [Network diagram: input layer (I1, I2, I3, plus a bias unit fixed at 1), one hidden layer (H1, H2, plus a bias unit fixed at 1), output layer (O1, O2); weights Wi,j connect input to hidden and Wj,k connect hidden to output] Hidden unit: H(x) = 1 / (1 + exp(-Σi wi,x Ii)); output unit: O(x) = 1 / (1 + exp(-Σj wj,x Hj))
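The forward pass through such a fully connected, feedforward network can be sketched as below. This is an illustration under assumptions: sigmoid units in both layers, a bias handled as an extra input fixed at 1, and hypothetical weight values; training by backpropagation is not shown.

```python
import math


def sigmoid(z):
    """Logistic activation, 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def layer(inputs, weights):
    """Activations of one layer.

    weights[x] holds the weights feeding unit x; the last entry of each
    row multiplies the constant bias input of 1.
    """
    extended = list(inputs) + [1.0]
    return [sigmoid(sum(w * v for w, v in zip(unit_weights, extended)))
            for unit_weights in weights]


def forward(inputs, w_hidden, w_output):
    """Input layer -> one hidden layer -> output layer."""
    hidden = layer(inputs, w_hidden)
    return layer(hidden, w_output)


# Hypothetical weights: 3 inputs, 2 hidden units, 2 output units.
w_hidden = [[0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
w_output = [[1.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
outputs = forward([0.2, 0.4, 0.6], w_hidden, w_output)
print(outputs)
```

With all hidden weights at zero, both hidden units output 0.5, so the first output unit sees a net input of 1.0 and the second a net input of 0.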
  • 40. Validation of QSAR Models • Internal validation: • Pearson’s correlation coefficient, r • The coefficient of determination, r² • Cross-validation (CV): leave-one-out, leave-few-out • Bootstrapping • Randomization or y-scrambling • Fisher statistic (F value): full, sequential • External validation: • Predictive correlation coefficient (r²pred)
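Leave-one-out cross-validation can be sketched for a one-descriptor linear model. The code below is an illustration, assuming the common definition q² = 1 - PRESS / SS, where PRESS is the sum of squared errors for the left-out compounds and SS is the sum of squared deviations of the observed activities from their mean; some authors use the training-set mean of each fold instead. The function names and data are hypothetical.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for y = b0 + b1*x."""
    n = len(x)
    x_mean, y_mean = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    b1 = sxy / sxx
    return y_mean - b1 * x_mean, b1


def loo_q2(x, y):
    """Leave-one-out cross-validated q2 (assumed form: 1 - PRESS / SS)."""
    n = len(x)
    y_mean = sum(y) / n
    press = 0.0
    for i in range(n):
        # Refit the model without compound i, then predict compound i.
        b0, b1 = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (b0 + b1 * x[i])) ** 2
    ss_total = sum((yi - y_mean) ** 2 for yi in y)
    return 1.0 - press / ss_total


# Hypothetical data: one exactly linear set and one with scatter.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
q2_perfect = loo_q2(x, [3.0, 5.0, 7.0, 9.0, 11.0])
q2_noisy = loo_q2(x, [3.1, 4.9, 7.2, 8.8, 11.0])
print(q2_perfect, q2_noisy)
```

Unlike r², q² is computed only from predictions for compounds the model did not see during fitting, so it drops as soon as the model fails to generalize.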
  • 41. Practical Considerations for QSAR modeling How to Begin? What to do? What to Expect? How to Conclude?
  • 42. Selection of training and test set using factorial designs 1. In factorial designs the investigated factors are varied at fixed levels. 2. Each factor (chemical feature or descriptor) is investigated at levels based on the type of factorial experiment. 3. A full factorial design for K chemical features/descriptors at n levels gives n^K compounds (2^K at two levels).
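The size of a full factorial design follows directly from enumerating every combination of levels. A minimal sketch follows, with hypothetical level labels and the three descriptors π, Es and MR used purely as example factor names.

```python
from itertools import product


def full_factorial(factors):
    """All combinations of factor levels, one dict per design point."""
    names = list(factors)
    level_lists = [factors[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*level_lists)]


# Three descriptors at two levels each: 2^3 = 8 design points.
design = full_factorial({
    "pi": ["low", "high"],
    "Es": ["low", "high"],
    "MR": ["low", "high"],
})
print(len(design))
for point in design:
    print(point)
```

Each design point would then be matched to a real substituent whose descriptor values fall in the required low or high ranges.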
  • 43. Experiments in a design with three variables (2^3 factorial design)
      Group       π       Es          MR
      H           0.00    0.00        1.03
      CH3         0.56    -1.24       5.65
      C2H5        1.02    -1.31       10.30
      n-C3H7      1.55    -1.60       14.96
      i-C3H7      1.53    -1.71       14.96
      n-C4H9      2.13    -1.63       19.61
      t-C4H9      1.98    -2.78       19.62
      H2C=CH**    0.82    not given   10.99
      C6H5**      1.96    -3.82       25.36
      CH2Cl       0.17    -1.48       10.49
      CF3         0.88    -2.40       5.02
      CN          -0.57   -0.51       6.33
      F           0.14    -0.46       0.92
      Cl          0.71    -0.97       6.03
      Br          0.86    -1.16       8.88
      I           1.12    -1.40       13.94
      OH          -0.67   -0.55       2.85
      OCH3        -0.02   -0.55       7.87
      OCH2CH3     0.38    not given   12.47
      SH          0.39    -1.07       9.22
      SCH3        0.61    -1.07       13.82
      NO2**       -0.28   -2.52       7.36
  • 44. Applicability Domain in QSAR • OECD Definition: Applicability domain (AD) of a QSAR model is the physico-chemical, structural or biological space, knowledge or information on which the training set of the model has been developed, and for which it is applicable to make predictions for new compounds. • New European legislation on chemicals, REACH (Registration, Evaluation, Authorization and restriction of Chemicals), came into force in 2007. • Purpose • Reliable application of (Q)SAR • Interpolation is better than extrapolation
  • 45. What are the key aspects in defining the AD of QSAR models ? • Identification of the subspace of chemical structures. • Defined AD determines the degree of generalization of a given predictive model. • A well defined AD indicates if the endpoint for the chemical structures under evaluation can be reliably predicted. • Characterization of the interpolation space is very significant to define the AD for a given QSAR model
  • 46. How can the AD of a model be defined? • Range-based methods • Bounding box or convex hull • PCA bounding box • Distance-based methods • Geometric methods • Probability density distribution based methods [Figure: bounding box or convex hull around the training data, showing a dense region and an empty region]
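The simplest of these, the range-based bounding box, can be sketched as follows. This is an illustration only: the descriptor pairs are (π, MR) values for H, CH3, Cl and Br from the substituent table on slide 43, used here as a hypothetical training set, and the function names are made up.

```python
def bounding_box(training):
    """Per-descriptor minimum and maximum over the training set."""
    k = len(training[0])
    lows = [min(row[j] for row in training) for j in range(k)]
    highs = [max(row[j] for row in training) for j in range(k)]
    return lows, highs


def in_domain(compound, box):
    """True if every descriptor lies within the training-set range."""
    lows, highs = box
    return all(lo <= value <= hi for value, lo, hi in zip(compound, lows, highs))


# Hypothetical training set: (pi, MR) for H, CH3, Cl and Br.
training = [(0.00, 1.03), (0.56, 5.65), (0.71, 6.03), (0.86, 8.88)]
box = bounding_box(training)

inside = in_domain((0.60, 6.00), box)    # within both ranges
outside = in_domain((1.12, 13.94), box)  # iodine: beyond both ranges
print(inside, outside)
```

A bounding box treats the whole rectangle as covered, including empty regions with no training compounds, which is one reason distance-based and density-based methods are also listed.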
  • 48. Is it correct to say : • “prediction result is always reliable for a point within the application region” ? • “prediction is always unreliable if the point is outside the application region” ? Concluding remark
  • 50. THANK YOU For bearing with me