Quantitative Structure Activity
Relationship (QSAR): Statistical
method and Product concept
Presented by:
Radha Sureshrao Chafle
F. Y. M. Pharm. (Semester II)
(Pharmacology)
Guided by:
Dr. Mrs. Vandana S. Nikam
HOD, Associate Professor in Pharmacology
What is QSAR?
• QSAR is a mathematical relationship that describes the structural
dependence of biological activities by physicochemical parameters,
by indicator variables encoding different structural features, or by
three-dimensional molecular property profiles of the compounds.
Contd.
Drugs that exert their biological effects by interaction
with a specific target must have:
• a three-dimensional structure which, in the arrangement
of its functional groups and in its surface properties, is
more or less complementary to a binding site.
• The better the steric fit and the complementarity of the
surface properties of a drug to its binding site, the
higher its affinity will be and the higher its biological
activity may be.
Classical QSAR analyses
 consider only 2D structures
 main field of application is in substituent
variation of a common scaffold.
3D QSAR
• has a much broader scope.
• starts from 3D structures and
correlates biological activities with
3D-property fields.
History & Development of QSAR
 1868, Crum-Brown and Fraser: Published an equation Φ =
f(C) i.e. assumption of physiological activity Φ as function of
the chemical structure C.
 1900, H. H. Meyer and C. E. Overton: lipoid theory of
narcosis
 1930‘s, L. Hammett: electronic sigma constants
 1964, C. Hansch and T. Fujita: QSAR
 1984, P. Andrews: affinity contributions of functional groups
 1985, P. Goodford: GRID (hot spots at protein surface)
 1988, R. Cramer: 3D QSAR
 1992, H.-J. Böhm: LUDI interaction sites, docking, scoring
 1997, C. Lipinski: bioavailability rule of five
 1998, Ajay, W. P. Walters and M. A. Murcko; J. Sadowski
and H. Kubinyi: drug-like character
Basic Requirements in QSAR
Studies
 all analogs belong to a congeneric series
 all analogs exert the same mechanism of
action
 all analogs bind in a comparable manner
 the effects of isosteric replacement can be
predicted
 binding affinity is correlated to interaction
energies
 biological activities are correlated to binding
affinity
Molecular Properties and Their
Parameters
Molecular Property | Corresponding Interaction | Parameters
Lipophilicity | hydrophobic interactions | log P, π, f, R_M, χ
Polarizability | van der Waals interactions | MR, parachor, MV
Electron density | ionic bonds, dipole-dipole interactions, hydrogen bonds, charge transfer interactions | σ, R, F, κ, quantum chemical indices
Topology | steric hindrance, geometric fit | E_s, r_v, L, B, distances, volumes
QSAR models
• Hansch model (property-property relationship):
• Definition of the lipophilicity parameter π
π_X = log P_RX - log P_RH
where P_RX is the n-octanol/water partition coefficient of the
substituted compound and P_RH that of the unsubstituted parent compound.
• Linear Hansch model
log 1/C = a log P + b σ + c MR + ... + k
• Nonlinear Hansch models
log 1/C = a (log P)^2 + b log P + c σ + ... + k
log 1/C = a π^2 + b π + c σ + ... + k
log 1/C = a log P - b log (βP + 1) + c σ + ... + k
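As a minimal sketch of the Hansch relations above, the Python functions below compute π from two log P values and evaluate the parabolic and bilinear models. The function names and the coefficients in the usage note are hypothetical and chosen only for illustration; the log P values of benzene (2.13) and chlorobenzene (2.84) are commonly quoted literature values, not data from these slides.

```python
import math


def pi_substituent(log_p_rx, log_p_rh):
    """pi_X = log P_RX - log P_RH."""
    return log_p_rx - log_p_rh


def hansch_parabolic(log_p, sigma, a, b, c, k):
    """log 1/C = a (log P)^2 + b log P + c sigma + k."""
    return a * log_p ** 2 + b * log_p + c * sigma + k


def hansch_bilinear(log_p, sigma, a, b, c, k, beta):
    """log 1/C = a log P - b log(beta P + 1) + c sigma + k."""
    p = 10.0 ** log_p  # partition coefficient recovered from log P
    return a * log_p - b * math.log10(beta * p + 1.0) + c * sigma + k


def optimum_log_p(a, b):
    """log P at the maximum of the parabolic model (requires a < 0)."""
    if a >= 0:
        raise ValueError("the parabola has a maximum only when a < 0")
    return -b / (2.0 * a)


# pi of a chloro substituent on benzene from the two literature log P values
pi_cl = pi_substituent(2.84, 2.13)
```

With the hypothetical coefficients a = -0.5 and b = 2.0, `optimum_log_p(-0.5, 2.0)` gives 2.0, the log P at which the parabolic model predicts the highest activity.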
Contd.
• Free-Wilson model (structure-property relationship)
log 1/C = Σ a_i + µ
a_i = substituent group contributions
µ = activity contribution of the reference compound
• Mixed Hansch/Free-Wilson model
log 1/C = a (log P)^2 + b log P + c σ + ... + Σ a_i + k
log 1/C = a log P - b log (βP + 1) + c σ + ... + Σ a_i + k
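The Free-Wilson relation above can be sketched in a few lines of Python. The substituent positions ("R1", "R2"), the group contributions and the value of µ are hypothetical numbers used only to show how the sum is formed; in a real analysis they would come from a regression on indicator variables.

```python
def free_wilson_predict(substituents, contributions, mu):
    """log 1/C = sum of a_i for the substituents present + mu.

    substituents  maps a position to the group at that position.
    contributions maps (position, group) to its contribution a_i.
    mu            is the activity contribution of the reference compound.
    """
    return mu + sum(contributions[(pos, grp)] for pos, grp in substituents.items())


# Hypothetical group contributions (illustrative values only)
contributions = {
    ("R1", "H"): 0.0,
    ("R1", "Cl"): 0.40,
    ("R2", "H"): 0.0,
    ("R2", "CH3"): 0.25,
}

predicted = free_wilson_predict({"R1": "Cl", "R2": "CH3"}, contributions, mu=5.0)
```

The reference compound (H at both positions) returns µ itself, which is what the model definition requires.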
Lipophilicity
Hydrophobic interaction between a drug
and a binding site at a receptor
Definition of Partition Coefficients
P = c_org / c_aq (n-octanol/water system)
n-Octanol/Water as a Standard
System
 membrane analogous structure
 hydrogen bond donor and acceptor
 practically insoluble in water
 no desolvation on transfer into organic
phase
 very low vapor pressure
 transparent in the UV region
 large data base of log P values
• Additivity principle of π values (C. Hansch, 1964)
π_X = log P_R-X - log P_R-H
The lipophilicity parameter π is an additive,
constitutive molecular parameter;
compare the Hammett equation:
ρσ_X = log K_R-X - log K_R-H
• Non-additivity of π values
- intramolecular hydrogen bonds
- ortho effects (phenols)
- polysubstituted aromatic compounds
- conjugation (push-pull effect)
- heterocyclic compounds
- cyclophanes
Hydrophobic Fragmental Constants f
• The hydrophobic fragmental constant f of a substituent
or molecular fragment represents the lipophilicity
contribution of that fragment.
(R. Rekker, The Hydrophobic Fragmental Constant,
Elsevier, Amsterdam 1977; R. Rekker, Eur. J. Med. Chem.
14, 479 (1979))
log P = Σ a_i f_i (R. Rekker, 1973)
where a_i is the number of occurrences of fragment i and f_i its
fragmental constant.
• Experimental determination of log P values
- Shake flask method
- Reversed phase thin layer chromatography
- High performance liquid chromatography (HPLC)
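The fragment sum log P = Σ a_i f_i can be illustrated with a short Python sketch. The fragment constants below are placeholder numbers chosen for illustration, not Rekker's published values, and the function name is an assumption.

```python
def log_p_from_fragments(fragment_counts, fragment_constants):
    """log P = sum over fragments of (count a_i) * (constant f_i)."""
    return sum(count * fragment_constants[name]
               for name, count in fragment_counts.items())


# Placeholder fragment constants (illustrative, not published values)
fragment_constants = {"CH3": 0.70, "CH2": 0.53, "OH": -1.49}

# A butanol-like skeleton: one CH3, three CH2 and one OH fragment
estimated_log_p = log_p_from_fragments({"CH3": 1, "CH2": 3, "OH": 1},
                                       fragment_constants)
```

Because the estimate is only a weighted sum, the quality of the result depends entirely on the fragment table that is supplied.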
Polarizability Parameters
• Molar volume, molar refractivity, parachor
MV = MW / d
MR = ((n^2 - 1) / (n^2 + 2)) · (MW / d)
PA = γ^(1/4) · (MW / d)
MW = molecular weight; d = density; n = refractive index;
γ = surface tension
(MR is most often scaled by a factor of 0.1)
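The three polarizability parameters can be evaluated directly from the definitions above. In this sketch the function names are assumptions, and the benzene constants (MW 78.11, d 0.8765 g/cm^3, n 1.5011, γ 28.9 mN/m) are approximate handbook-style values used only as an example.

```python
def molar_volume(mw, density):
    """MV = MW / d."""
    return mw / density


def molar_refractivity(mw, density, refractive_index):
    """MR = (n^2 - 1) / (n^2 + 2) * MW / d (Lorentz-Lorenz form)."""
    n2 = refractive_index ** 2
    return (n2 - 1.0) / (n2 + 2.0) * mw / density


def parachor(mw, density, surface_tension):
    """PA = gamma^(1/4) * MW / d."""
    return surface_tension ** 0.25 * mw / density


# Approximate values for benzene near room temperature (assumed inputs)
mv = molar_volume(78.11, 0.8765)
mr = molar_refractivity(78.11, 0.8765, 1.5011)
pa = parachor(78.11, 0.8765, 28.9)
mr_scaled = 0.1 * mr  # the 0.1 scaling mentioned on the slide
```

With these inputs MV comes out near 89 cm^3/mol, MR near 26 cm^3/mol and the parachor a little above 200, which is the expected order of magnitude for a small aromatic molecule.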
Electronic Parameters
• Hammett equation
ρσ = log K_RX - log K_RH
• Calculation of pKa values
pKa(R-X) = pKa(R-H) - ρσ
Example: pKa of 3,5-dinitro-4-methylbenzoic acid
(pKa of benzoic acid = 4.20; ρ = 1 for the ionization of benzoic acids;
σ_m of NO2 = 0.71, σ_p of CH3 = -0.17)
experimental value = 2.97
calculated value = 4.20 - (0.71 - 0.17 + 0.71) = 2.95
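The pKa estimate above is a one-line calculation. The sketch below reproduces it; the function name is an assumption, and ρ = 1 is used because the ionization of benzoic acids is the reference reaction of the Hammett scale.

```python
def hammett_pka(pka_parent, sigmas, rho=1.0):
    """pKa(R-X) = pKa(R-H) - rho * (sum of substituent sigma constants)."""
    return pka_parent - rho * sum(sigmas)


# 3,5-dinitro-4-methylbenzoic acid: two meta NO2 groups and one para CH3
pka_calc = hammett_pka(4.20, [0.71, -0.17, 0.71])
```

The result, 2.95, matches the calculated value on the slide and lies close to the experimental 2.97.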
Quantum Mechanical Descriptors
• Atom partial charges:
Mulliken population analysis (orbital population)
ESP charges (mapping the electrostatic potential to atom locations)
• Dipole moment:
strength and orientation behavior of a molecule in an
electrostatic field
• HOMO / LUMO ("frontier orbital theory"):
HOMO = energy of the highest occupied molecular orbital,
"nucleophilicity"
LUMO = energy of the lowest unoccupied molecular orbital,
"electrophilicity"
• Superdelocalizability:
estimate of the reactivity of positions in aromatic hydrocarbons
3D QSAR
 3D QSAR is an extension of classical QSAR which
exploits the 3 dimensional properties of the ligands
to predict their biological activity using robust
statistical analysis like PLS, G/PLS, ANN etc.
 3D QSAR uses probe based sampling within a
molecular lattice to determine three-dimensional
properties of molecules and can then correlate these
3D descriptors with biological activity.
 Some of the major factors like desolvation
energetics, temperature, diffusion, transport, pH,
salt concentration etc. which contribute to the
overall free energy of binding are difficult to handle,
and thus usually ignored.
Classification of 3D QSAR
• On the basis of intermolecular bonding
- Ligand-based 3D QSAR, e.g. CoMFA, CoMSIA, COMPASS, CoMMA, SoMFA
- Receptor-based 3D QSAR, e.g. COMBINE, AFMoC, HIFA, CoRIA
• On the basis of alignment criterion
- Alignment-dependent 3D QSAR, e.g. CoMFA, CoMSIA, GERM, COMBINE, AFMoC, HIFA, CoRIA
- Alignment-independent 3D QSAR, e.g. COMPASS, CoMMA, HQSAR, WHIM, EVA/CoSA, GRIND
• On the basis of chemometric techniques used
- Linear 3D QSAR, e.g. CoMFA, CoMSIA, AFMoC, GERM, CoMMA, SoMFA
Comparative Molecular Field
Analysis (CoMFA)
• In 1987, Cramer developed the predecessor of the 3D approaches,
the Dynamic Lattice-Oriented Molecular Modeling System
(DYLOMMS), which uses PCA to extract vectors from the
molecular interaction fields; these vectors are then
correlated with biological activities.
• CoMFA, a powerful 3D QSAR methodology, is a
combination of GRID and PLS.
Protocol for CoMFA
• Determination of the bioactive conformations of the
molecules.
• Superimposition or alignment of the molecules,
using either manual or automated methods, in a
manner defined by the supposed mode of
interaction with the receptor.
• The overlaid molecules are placed in the center of
a lattice grid with a spacing of 2 Å.
• The steric and electrostatic fields are calculated
around the molecules with different probe groups
positioned at all intersections of the lattice.
Contd.
• The PLS technique is used to correlate the
interaction energy or field values with the
biological activity, so that the quantitative
influence of specific chemical features of the
molecules on their biological activity can be
identified and extracted.
• The results are expressed as correlation
equations with a number of latent variable
terms, each of which is a linear combination
of the original independent lattice descriptors.
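The lattice and field steps of the protocol can be sketched as follows. This is a simplified illustration, not the CoMFA implementation: the Lennard-Jones parameters (r_min, well_depth), the +1 probe charge, the 4 Å margin and the 30 kcal/mol cut-off are assumed values, and all function names are hypothetical.

```python
import math

COULOMB_CONSTANT = 332.06  # converts e^2/Angstrom to kcal/mol


def make_lattice(atoms, spacing=2.0, margin=4.0):
    """Grid points of a regular lattice enclosing all atoms.

    Each atom is a tuple (x, y, z, partial_charge), coordinates in Angstrom.
    """
    axes = []
    for dim in range(3):
        low = min(atom[dim] for atom in atoms) - margin
        high = max(atom[dim] for atom in atoms) + margin
        n_points = int((high - low) // spacing) + 1
        axes.append([low + i * spacing for i in range(n_points)])
    return [(x, y, z) for x in axes[0] for y in axes[1] for z in axes[2]]


def probe_fields(atoms, point, probe_charge=1.0, r_min=3.4,
                 well_depth=0.1, cutoff=30.0):
    """Steric (Lennard-Jones) and electrostatic (Coulomb) energy at one point."""
    steric = 0.0
    electrostatic = 0.0
    for x, y, z, charge in atoms:
        r = max(math.dist(point, (x, y, z)), 1e-3)  # avoid division by zero
        ratio6 = (r_min / r) ** 6
        steric += well_depth * (ratio6 ** 2 - 2.0 * ratio6)
        electrostatic += COULOMB_CONSTANT * probe_charge * charge / r
    # Both potentials diverge close to an atom, so values are truncated.
    steric = min(steric, cutoff)
    electrostatic = max(-cutoff, min(electrostatic, cutoff))
    return steric, electrostatic


def field_descriptors(atoms, lattice):
    """One descriptor row: all steric values followed by all electrostatic ones."""
    fields = [probe_fields(atoms, point) for point in lattice]
    return [s for s, _ in fields] + [e for _, e in fields]
```

For a real data set the lattice would be built once around all aligned molecules, so that every molecule yields a descriptor row of the same length for the subsequent PLS step.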
Steps in CoMFA:
• A set of molecules is first selected.
• All molecules have to interact with the same kind
of receptor (or enzyme, ion channel, transporter)
in the same manner, i.e., with identical binding
sites in the same relative geometry.
• A subgroup of the molecules is selected as the
training set from which the CoMFA model is derived.
• The remaining molecules form a test set, which is
used to check the validity of the derived model(s)
independently.
• Atomic partial charges are calculated and
(several) low-energy conformations are
generated. A pharmacophore hypothesis is
derived to orient the superposition of all
individual molecules and to afford a rational
and consistent alignment.
• A sufficiently large box is positioned around
the molecules and a grid distance is defined.
• The field values at the grid points are correlated
with the biological activities; PLS analysis is the
most appropriate method for this purpose. Normally,
cross-validation is used to check the internal
predictivity of the derived model.
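The cross-validation mentioned in the last step is often reported as a leave-one-out q^2 value, q^2 = 1 - PRESS / SS. The sketch below shows the idea with a single-descriptor least-squares line standing in for the PLS model, which is an assumption made to keep the example short; the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def loo_q2(xs, ys):
    """Leave-one-out cross-validated q^2 = 1 - PRESS / SS."""
    mean_y = sum(ys) / len(ys)
    press = 0.0
    for i in range(len(xs)):
        # Refit the model without compound i, then predict compound i.
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (a + b * xs[i])) ** 2
    ss = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss
```

A q^2 close to 1 indicates good internal predictivity; values near zero or below zero indicate that the model predicts left-out compounds no better than the mean activity.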
Pharmacophore hypotheses and
alignment:
Drawbacks of CoMFA:
• Too many adjustable parameters.
• Uncertainty in the selection of compounds and
variables.
• Fragmented contour maps with variable
selection procedures.
• Hydrophobicity is not well quantified.
• Cut-off limits are used.
• Low signal-to-noise ratio due to many useless
field variables.
• Imperfections in the potential energy functions.
• Applicable only to in vitro data.
Comparative Molecular Similarity
Indices Analysis (CoMSIA)
• Molecular similarity indices calculated
from modified SEAL similarity fields are
employed as descriptors to consider steric,
electrostatic, hydrophobic and hydrogen
bonding properties simultaneously.
• These indices are estimated indirectly by
comparing the similarity of each molecule in
the dataset with a common probe atom
(having a radius of 1 Å, a charge of +1 and a
hydrophobicity of +1) positioned at the
intersections of a surrounding grid/lattice.
• For computing similarity at all grid points, the
mutual distances between the probe atom and the
atoms of the molecules in the aligned dataset are
also taken into account.
• Gaussian-type functions are employed to describe
this distance dependence and to calculate the
molecular properties.
• Since the underlying Gaussian-type functional
forms are 'smooth' with no singularities, their
slopes are not as steep as those of the Coulombic and
Lennard-Jones potentials in CoMFA; therefore no
arbitrary cut-off limits need to be defined.
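The Gaussian distance dependence can be sketched as below. The functional form A = -Σ w_probe · w_i · exp(-α r^2) and the attenuation factor α = 0.3 follow the commonly cited description of CoMSIA and are stated here as assumptions; the function name and the example atom are hypothetical.

```python
import math


def comsia_index(atoms, grid_point, probe_value=1.0, alpha=0.3):
    """Similarity index of one property at one grid point.

    atoms is a list of ((x, y, z), property_value) pairs, where the
    property value is e.g. a partial charge or a hydrophobicity constant.
    """
    total = 0.0
    for position, value in atoms:
        r_squared = sum((p - g) ** 2 for p, g in zip(position, grid_point))
        total += probe_value * value * math.exp(-alpha * r_squared)
    return -total


atom = [((0.0, 0.0, 0.0), 0.5)]
on_atom = comsia_index(atom, (0.0, 0.0, 0.0))   # finite even at r = 0
nearby = comsia_index(atom, (2.0, 0.0, 0.0))    # decays smoothly with distance
```

Unlike the Coulomb and Lennard-Jones terms, the value stays finite when the grid point coincides with an atom, which is why no cut-off is needed.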
Comparison between CoMFA and
CoMSIA
Property | CoMFA | CoMSIA
Function type | Lennard-Jones potential, Coulomb potential | Gaussian
Descriptors | Interaction energies | Similarity indices
Cut-off | Required | Not required
Fields | Steric, electrostatic | Steric, electrostatic, hydrophobic, hydrogen bond donor and hydrogen bond acceptor
Contour map | Often not contiguous | Contiguous
Model reproducibility | Poor | Good
*CoMSIA is provided by TRIPOS Inc. in the Sybyl software, along
with CoMFA.
Statistical methods used in QSAR
Linear Regression Analysis (LRA) | Multivariate Data Analysis | Pattern Recognition
Simple linear regression | Principal component analysis (PCA) | Cluster analysis
Multiple linear regression (MLR) | Principal component regression (PCR) | Artificial neural networks (ANNs)
Stepwise multiple linear regression | Partial least squares analysis (PLS) | k-nearest neighbor (kNN)
- | Genetic function approximation (GFA) | -
- | Genetic partial least squares (G/PLS) | -
Linear Regression Analysis (LRA)
• Linear regression analyses are considered
easily interpretable methods for QSAR analysis.
• These techniques construct a statistical
model that represents the correlation of one
or more independent (explanatory) variables (x)
with a dependent variable (y).
• The model can be used to predict y from
knowledge of the x variables, which may be
either quantitative or qualitative.
a. Simple Linear regression
method
• A standard linear regression calculation
generates a set of QSAR equations, each of which
includes a single independent descriptor x
and the dependent variable y.
• A one-term linear equation is produced
separately for each independent variable
from the descriptor set.
y = a + bx
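A minimal sketch of this procedure in Python fits y = a + bx once per descriptor and reports r^2 for each one-term equation. The function name and the descriptor values are hypothetical.

```python
def one_term_models(descriptors, ys):
    """Fit y = a + b*x separately for every descriptor.

    descriptors maps a descriptor name to its list of values.
    Returns a dict: name -> (a, b, r_squared).
    """
    n = len(ys)
    mean_y = sum(ys) / n
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    models = {}
    for name, xs in descriptors.items():
        mean_x = sum(xs) / n
        sxx = sum((x - mean_x) ** 2 for x in xs)
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        b = sxy / sxx
        a = mean_y - b * mean_x
        ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
        models[name] = (a, b, 1.0 - ss_res / ss_total)
    return models


# Hypothetical descriptor set and activities
models = one_term_models(
    {"logP": [0.0, 1.0, 2.0, 3.0], "MR": [1.0, 0.0, 1.0, 0.0]},
    [1.0, 3.0, 5.0, 7.0],
)
```

Comparing the r^2 values of the one-term equations gives a first indication of which descriptor carries most of the information about the activity.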
b. Multiple Linear regression (MLR)
• Also referred to as the linear free energy
relationship (LFER) method.
• Generates QSAR equations by performing
standard multivariable regression
calculations to identify the dependence of
a drug property on any or all of the
descriptors under investigation.
• Involves more than one independent variable.
y = b_0 + b_1 x_1 + b_2 x_2 + ... + b_m x_m + e
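The coefficients b_0 ... b_m of the MLR equation can be obtained from the normal equations (XᵀX) b = Xᵀy. The sketch below solves them with plain Gaussian elimination so that it needs no external library; the function names and the data are hypothetical, and a production analysis would normally use a numerical library instead.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("descriptors are collinear; system is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_mlr(rows, ys):
    """Return [b0, b1, ..., bm] for y = b0 + b1*x1 + ... + bm*xm."""
    design = [[1.0] + list(row) for row in rows]  # leading 1 for the intercept
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Hypothetical data generated from y = 1 + 2*x1 - 3*x2
coefficients = fit_mlr([(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)],
                       [1, 3, -2, 0, 2])
```

The explicit singularity check reflects a practical limit of MLR: strongly intercorrelated descriptors make the normal equations unsolvable or unstable.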
c. Stepwise multiple linear
regression
• A commonly used variant of MLR.
• Creates a multiple-term linear equation, but not
all the independent variables are used.
• Each independent variable is added to the
equation sequentially, and a new regression is
performed each time.
• The new term is retained only if the model
passes a test for significance.
• Useful when the number of descriptors is
large and the key descriptors are unknown.
Multivariate Data Analysis
• It is often used in place of LRA when the
number of descriptors is large.
• It tries to explain an extended set of
variables by means of a reduced number
of new latent variables that carry the
maximum amount of information relevant
to the problem.
a. Principal component analysis
(PCA)
• A data reduction technique that does not
itself generate a QSAR model.
• Creates a new set of orthogonal descriptors,
i.e. principal components, which describe
most of the information contained in the
independent variables.
• Reduces the dimensionality of a multivariate
data set of descriptors to the actual
amount of data available.
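The first principal component can be found with a short power iteration on the covariance matrix of the descriptors. This is a sketch under simplifying assumptions (only one component, a fixed number of iterations, a start vector of ones that must not be orthogonal to the component); the function name is hypothetical.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (loading vector, scores, explained variance fraction) of PC1."""
    n, p = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
           for i in range(p)]
    v = [1.0] * p  # start vector
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            raise ValueError("start vector is orthogonal to the data variance")
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p))
                     for i in range(p))
    total_variance = sum(cov[i][i] for i in range(p))
    scores = [sum(c[j] * v[j] for j in range(p)) for c in centered]
    return v, scores, eigenvalue / total_variance


# Two perfectly correlated hypothetical descriptors
loadings, scores, explained = first_principal_component(
    [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
)
```

For the two perfectly correlated descriptors in the example, a single component explains all of the variance, which is the sense in which PCA reduces dimensionality.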
b. Principal component regression
(PCR)
• When principal components are employed
as the independent variables in a linear
regression, the method is termed
principal component regression.
• PCR applies the scores from the PCA
decomposition as regressors in the QSAR
model to generate a multiple-term linear
equation.
c. Partial least square analysis (PLS)
• An iterative regression procedure that
produces its solution by a linear
transformation of a large number of original
descriptors into a small number of new
orthogonal terms called latent variables.
• PLS is able to analyze complex SAR data
in a more realistic way.
• It is able to interpret the influence of
molecular structure on biological activity.
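To make the idea of a latent variable concrete, the sketch below extracts a single PLS component for one response (PLS1). It is a deliberately reduced illustration, assuming mean-centred data and only one latent variable; a full PLS implementation would deflate the data and extract further components. The function name and data are hypothetical.

```python
import math


def pls1_one_component(rows, ys):
    """Fit a one-latent-variable PLS1 model; returns (weights, q, predict)."""
    n, p = len(rows), len(rows[0])
    x_mean = [sum(row[j] for row in rows) / n for j in range(p)]
    y_mean = sum(ys) / n
    xc = [[row[j] - x_mean[j] for j in range(p)] for row in rows]
    yc = [y - y_mean for y in ys]
    # Weight vector: direction in descriptor space with maximal covariance to y
    w = [sum(xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(value * value for value in w))
    w = [value / norm for value in w]
    # Scores: projection of every compound onto the latent variable
    t = [sum(xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    # Regress y on the scores
    q = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)

    def predict(row):
        score = sum((row[j] - x_mean[j]) * w[j] for j in range(p))
        return y_mean + q * score

    return w, q, predict


# Two collinear hypothetical descriptors; y depends linearly on the first
weights, q, predict = pls1_one_component(
    [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)],
    [3.0, 6.0, 9.0, 12.0],
)
```

The example uses collinear descriptors on purpose: the normal equations of MLR are singular for such data, whereas the latent variable is still well defined.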
d. Genetic function approximation
(GFA)
• Serves as an alternative to standard
regression analysis for building QSAR
equations.
• Employs the natural principles of the evolution
of species, which lead to improvements through
recombination.
• Suitable for obtaining QSAR equations when
dealing with a large number of independent
variables.
• Results in multiple models, which are evolved
from a set of initial models using genetic algorithms.
e. Genetic partial least squares
(G/PLS)
• It is a valuable analytical tool that has
evolved by combining the best features of
GFA and PLS.
Pattern Recognition
• The method is based on the principle of
analogy.
• The method is used to detect the distance
or closeness between objects within large
amounts of multivariate data.
a. Cluster analysis
 Statistical pattern recognition method used to
investigate the relationship between
observations associated with several
properties and to partition the data set into
categories consisting of similar elements.
 Allows for the consideration of the inactive
compounds in the analysis.
 Can be used to study a large set of
substituents to identify subsets which share
similar physical properties.
b. Artificial neural networks (ANNs)
• The technique is modeled on the real
neurons present in an animal brain.
• ANNs are parallel computational systems
consisting of groups of highly
interconnected processing elements called
neurons, which are arranged in a series of
layers:
a) First layer: input layer
b) Subsequent layer(s): hidden layer(s)
c) Last layer: output layer
Contd.
• Each layer performs its own computations
and passes the results to the next one.
• The weighted inputs are summed and
supplied to the hidden layers, where a
nonlinear transfer function does the required
processing.
• The results of the transfer function are
communicated to the neurons in the output
layer, where the results are interpreted and
finally presented to the user.
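The flow of information through the layers can be sketched as a single forward pass. The network below has one hidden layer with a sigmoid transfer function and a linear output neuron; these choices, the function names and the weights are assumptions for illustration, and training of the weights is not shown.

```python
import math


def sigmoid(z):
    """Nonlinear transfer function used in the hidden layer."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """Propagate one descriptor vector through input, hidden and output layer."""
    hidden = [
        sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    # Linear output neuron: weighted sum of the hidden activations
    return sum(w * h for w, h in zip(output_weights, hidden)) + output_bias


# Two descriptors, two hidden neurons; all hidden weights set to zero so that
# each hidden neuron outputs sigmoid(0) = 0.5
prediction = forward([1.0, 2.0],
                     [[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0],
                     [1.0, 3.0], 0.5)
```

With all hidden weights at zero the prediction is 1.0 · 0.5 + 3.0 · 0.5 + 0.5 = 2.5, which makes the arithmetic of the pass easy to follow by hand.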
c. k-nearest neighbor (kNN)
• One of the simplest machine learning algorithms.
• Most commonly used for classifying a new
pattern (e.g. a molecule).
• The technique is based on a simple distance
learning approach, in which an unknown/new
molecule is classified according to the majority
class of its k nearest neighbors in the training set.
• The nearness is determined by a distance
metric such as the Euclidean distance (e.g. a
similarity measure computed using the structural
descriptors of the molecules).
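A minimal kNN classifier following the description above fits in a few lines of Python. The descriptor vectors and class labels are hypothetical, and the function name is an assumption.

```python
import math
from collections import Counter


def knn_classify(query, training_set, k=3):
    """Assign the majority class of the k nearest training molecules.

    training_set is a list of (descriptor_vector, class_label) pairs;
    nearness is measured by the Euclidean distance between descriptors.
    """
    neighbors = sorted(training_set,
                       key=lambda item: math.dist(query, item[0]))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical training set with two descriptors per molecule
training = [
    ((0.0, 0.0), "inactive"), ((0.0, 1.0), "inactive"), ((1.0, 0.0), "inactive"),
    ((5.0, 5.0), "active"), ((5.0, 6.0), "active"), ((6.0, 5.0), "active"),
]
label = knn_classify((5.5, 5.5), training, k=3)
```

An odd k is a common choice for two-class problems because it avoids tied votes.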
Importance of statistical parameter
• The equations generated/established in QSAR
studies are linear regression equations.
• A number of equations may be
generated/established for one
problem/case under study.
• Statistics helps in selecting the most suitable
best-fit equation among them.
Contd.
• This may be done by checking the standard
deviation/variance and other related
parameters for the data set used for the QSAR
study.
• The correlation coefficient computed for the
data set under study also helps in selecting
the appropriate QSAR equation.
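The parameters usually compared between candidate equations are the squared correlation coefficient r^2, the standard deviation s of the residuals and the F value. The sketch below computes them from observed and calculated activities using the standard definitions; the function name and the numbers are hypothetical.

```python
import math


def regression_statistics(observed, predicted, n_descriptors):
    """Return r^2, standard deviation s and F value of a regression equation."""
    n = len(observed)
    dof = n - n_descriptors - 1  # residual degrees of freedom
    if dof <= 0:
        raise ValueError("too few compounds for this number of descriptors")
    mean = sum(observed) / n
    rss = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    tss = sum((o - mean) ** 2 for o in observed)
    r2 = 1.0 - rss / tss
    s = math.sqrt(rss / dof)
    f_value = math.inf if r2 >= 1.0 else (r2 / n_descriptors) / ((1.0 - r2) / dof)
    return {"r2": r2, "s": s, "F": f_value}


# Hypothetical observed and calculated activities for a one-descriptor equation
stats = regression_statistics([1.0, 2.0, 3.0, 4.0, 5.0],
                              [1.1, 1.9, 3.2, 3.8, 5.0],
                              n_descriptors=1)
```

Among several candidate equations, the one with high r^2 and F and low s, obtained with as few descriptors as possible, is normally preferred.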
Quantitative Structure Activity Relationship.pptx

More Related Content

What's hot

What's hot (20)

3 d qsar approaches structure
3 d qsar approaches structure3 d qsar approaches structure
3 d qsar approaches structure
 
PHARMACOHORE MAPPING AND VIRTUAL SCRRENING FOR RESEARCH DEPARTMENT
PHARMACOHORE MAPPING AND VIRTUAL SCRRENING FOR RESEARCH DEPARTMENTPHARMACOHORE MAPPING AND VIRTUAL SCRRENING FOR RESEARCH DEPARTMENT
PHARMACOHORE MAPPING AND VIRTUAL SCRRENING FOR RESEARCH DEPARTMENT
 
(Kartik Tiwari) Denovo Drug Design.pptx
(Kartik Tiwari) Denovo Drug Design.pptx(Kartik Tiwari) Denovo Drug Design.pptx
(Kartik Tiwari) Denovo Drug Design.pptx
 
Rationale of prodrug design and practical consideration of
Rationale of prodrug design and practical consideration ofRationale of prodrug design and practical consideration of
Rationale of prodrug design and practical consideration of
 
3D QSAR
3D QSAR3D QSAR
3D QSAR
 
Pharmacophore mapping
Pharmacophore mapping Pharmacophore mapping
Pharmacophore mapping
 
Virtual sreening
Virtual sreeningVirtual sreening
Virtual sreening
 
Virtual Screening in Drug Discovery
Virtual Screening in Drug DiscoveryVirtual Screening in Drug Discovery
Virtual Screening in Drug Discovery
 
Rationale of prodrug design and practical considertions of prodrug design
Rationale of prodrug design and practical considertions of prodrug designRationale of prodrug design and practical considertions of prodrug design
Rationale of prodrug design and practical considertions of prodrug design
 
molecular docking its types and de novo drug design and application and softw...
molecular docking its types and de novo drug design and application and softw...molecular docking its types and de novo drug design and application and softw...
molecular docking its types and de novo drug design and application and softw...
 
QSAR.pptx
QSAR.pptxQSAR.pptx
QSAR.pptx
 
statistical tools used in qsar analysis
statistical tools used in qsar analysisstatistical tools used in qsar analysis
statistical tools used in qsar analysis
 
Virtual screening techniques
Virtual screening techniquesVirtual screening techniques
Virtual screening techniques
 
Quantitative Structure Activity Relationship (QSAR)
Quantitative Structure Activity Relationship (QSAR)Quantitative Structure Activity Relationship (QSAR)
Quantitative Structure Activity Relationship (QSAR)
 
SAR & QSAR
SAR & QSARSAR & QSAR
SAR & QSAR
 
Virtual screening ppt
Virtual screening pptVirtual screening ppt
Virtual screening ppt
 
OVERVIEW OF MODERN DRUG DISCOVERY PROCESS
OVERVIEW OF MODERN DRUG DISCOVERY PROCESSOVERVIEW OF MODERN DRUG DISCOVERY PROCESS
OVERVIEW OF MODERN DRUG DISCOVERY PROCESS
 
STATISTICAL METHOD OF QSAR
STATISTICAL METHOD OF QSARSTATISTICAL METHOD OF QSAR
STATISTICAL METHOD OF QSAR
 
LEAD IDENTIFICATION BY SUHAS PATIL (S.K.)
LEAD IDENTIFICATION BY SUHAS PATIL (S.K.)LEAD IDENTIFICATION BY SUHAS PATIL (S.K.)
LEAD IDENTIFICATION BY SUHAS PATIL (S.K.)
 
Target discovery and validation
Target discovery and validation Target discovery and validation
Target discovery and validation
 

Similar to Quantitative Structure Activity Relationship.pptx

Quantative Structure-Activity Relationships (QSAR)
Quantative Structure-Activity Relationships (QSAR)Quantative Structure-Activity Relationships (QSAR)
Quantative Structure-Activity Relationships (QSAR)
Atai Rabby
 

Similar to Quantitative Structure Activity Relationship.pptx (20)

Physicochemical properties (descriptors) in QSAR.pdf
Physicochemical properties (descriptors) in QSAR.pdfPhysicochemical properties (descriptors) in QSAR.pdf
Physicochemical properties (descriptors) in QSAR.pdf
 
Qsar
QsarQsar
Qsar
 
Quantative Structure-Activity Relationships (QSAR)
Quantative Structure-Activity Relationships (QSAR)Quantative Structure-Activity Relationships (QSAR)
Quantative Structure-Activity Relationships (QSAR)
 
QSAR by Faizan Deshmukh
QSAR by Faizan DeshmukhQSAR by Faizan Deshmukh
QSAR by Faizan Deshmukh
 
QSAR.pptx
QSAR.pptxQSAR.pptx
QSAR.pptx
 
Qsar studies
Qsar studiesQsar studies
Qsar studies
 
Ligand based drug design
Ligand based drug designLigand based drug design
Ligand based drug design
 
Insilico methods for design of novel inhibitors of Human leukocyte elastase
Insilico methods for design of novel inhibitors of Human leukocyte elastaseInsilico methods for design of novel inhibitors of Human leukocyte elastase
Insilico methods for design of novel inhibitors of Human leukocyte elastase
 
Qsar parameter
Qsar parameterQsar parameter
Qsar parameter
 
Lecture 5
Lecture 5Lecture 5
Lecture 5
 
Qsar
QsarQsar
Qsar
 
Qsar UMA
Qsar   UMAQsar   UMA
Qsar UMA
 
Lecture 6
Lecture 6Lecture 6
Lecture 6
 
QSAR
QSARQSAR
QSAR
 
Quantitative Structure Activity Relationship
Quantitative Structure Activity RelationshipQuantitative Structure Activity Relationship
Quantitative Structure Activity Relationship
 
QSAR
QSARQSAR
QSAR
 
RATIONAL DRUG DESIGN.pptx
RATIONAL DRUG DESIGN.pptxRATIONAL DRUG DESIGN.pptx
RATIONAL DRUG DESIGN.pptx
 
Qsar ppt
Qsar pptQsar ppt
Qsar ppt
 
Lecture5 100717171918-phpapp01
Lecture5 100717171918-phpapp01Lecture5 100717171918-phpapp01
Lecture5 100717171918-phpapp01
 
Introduction to Drug Design
 Introduction to Drug Design Introduction to Drug Design
Introduction to Drug Design
 

Recently uploaded

Pests of sugarcane_Binomics_IPM_Dr.UPR.pdf
Pests of sugarcane_Binomics_IPM_Dr.UPR.pdfPests of sugarcane_Binomics_IPM_Dr.UPR.pdf
Pests of sugarcane_Binomics_IPM_Dr.UPR.pdf
PirithiRaju
 
Jet reorientation in central galaxies of clusters and groups: insights from V...
Jet reorientation in central galaxies of clusters and groups: insights from V...Jet reorientation in central galaxies of clusters and groups: insights from V...
Jet reorientation in central galaxies of clusters and groups: insights from V...
Sérgio Sacani
 
Quantifying Artificial Intelligence and What Comes Next!
Quantifying Artificial Intelligence and What Comes Next!Quantifying Artificial Intelligence and What Comes Next!
Quantifying Artificial Intelligence and What Comes Next!
University of Hertfordshire
 
Gliese 12 b: A Temperate Earth-sized Planet at 12 pc Ideal for Atmospheric Tr...
Gliese 12 b: A Temperate Earth-sized Planet at 12 pc Ideal for Atmospheric Tr...Gliese 12 b: A Temperate Earth-sized Planet at 12 pc Ideal for Atmospheric Tr...
Gliese 12 b: A Temperate Earth-sized Planet at 12 pc Ideal for Atmospheric Tr...
Sérgio Sacani
 
Aerodynamics. flippatterncn5tm5ttnj6nmnynyppt
Aerodynamics. flippatterncn5tm5ttnj6nmnynypptAerodynamics. flippatterncn5tm5ttnj6nmnynyppt
Aerodynamics. flippatterncn5tm5ttnj6nmnynyppt
sreddyrahul
 

Recently uploaded (20)

Pests of sugarcane_Binomics_IPM_Dr.UPR.pdf
Pests of sugarcane_Binomics_IPM_Dr.UPR.pdfPests of sugarcane_Binomics_IPM_Dr.UPR.pdf
Pests of sugarcane_Binomics_IPM_Dr.UPR.pdf
 
Hemoglobin metabolism: C Kalyan & E. Muralinath
Hemoglobin metabolism: C Kalyan & E. MuralinathHemoglobin metabolism: C Kalyan & E. Muralinath
Hemoglobin metabolism: C Kalyan & E. Muralinath
 
Molecular and Cellular Mechanism of Action of Hormones such as Growth Hormone...
Molecular and Cellular Mechanism of Action of Hormones such as Growth Hormone...Molecular and Cellular Mechanism of Action of Hormones such as Growth Hormone...
Molecular and Cellular Mechanism of Action of Hormones such as Growth Hormone...
 
Jet reorientation in central galaxies of clusters and groups: insights from V...
Jet reorientation in central galaxies of clusters and groups: insights from V...Jet reorientation in central galaxies of clusters and groups: insights from V...
Jet reorientation in central galaxies of clusters and groups: insights from V...
 
Emergent ribozyme behaviors in oxychlorine brines indicate a unique niche for...
Emergent ribozyme behaviors in oxychlorine brines indicate a unique niche for...Emergent ribozyme behaviors in oxychlorine brines indicate a unique niche for...
Emergent ribozyme behaviors in oxychlorine brines indicate a unique niche for...
 
TEST BANK for Organic Chemistry 6th Edition.pdf
TEST BANK for Organic Chemistry 6th Edition.pdfTEST BANK for Organic Chemistry 6th Edition.pdf
TEST BANK for Organic Chemistry 6th Edition.pdf
 
Quantifying Artificial Intelligence and What Comes Next!
Quantifying Artificial Intelligence and What Comes Next!Quantifying Artificial Intelligence and What Comes Next!
Quantifying Artificial Intelligence and What Comes Next!
 
Gliese 12 b: A Temperate Earth-sized Planet at 12 pc Ideal for Atmospheric Tr...
Gliese 12 b: A Temperate Earth-sized Planet at 12 pc Ideal for Atmospheric Tr...Gliese 12 b: A Temperate Earth-sized Planet at 12 pc Ideal for Atmospheric Tr...
Gliese 12 b: A Temperate Earth-sized Planet at 12 pc Ideal for Atmospheric Tr...
 
Gliese 12 b, a temperate Earth-sized planet at 12 parsecs discovered with TES...
Gliese 12 b, a temperate Earth-sized planet at 12 parsecs discovered with TES...Gliese 12 b, a temperate Earth-sized planet at 12 parsecs discovered with TES...
Gliese 12 b, a temperate Earth-sized planet at 12 parsecs discovered with TES...
 
GBSN - Microbiology (Unit 7) Microbiology in Everyday Life
GBSN - Microbiology (Unit 7) Microbiology in Everyday LifeGBSN - Microbiology (Unit 7) Microbiology in Everyday Life
GBSN - Microbiology (Unit 7) Microbiology in Everyday Life
 
Land use land cover change analysis and detection of its drivers using geospa...
Land use land cover change analysis and detection of its drivers using geospa...Land use land cover change analysis and detection of its drivers using geospa...
Land use land cover change analysis and detection of its drivers using geospa...
 
family therapy psychotherapy types .pdf
family therapy psychotherapy types  .pdffamily therapy psychotherapy types  .pdf
family therapy psychotherapy types .pdf
 
GBSN - Microbiology Lab (Compound Microscope)
GBSN - Microbiology Lab (Compound Microscope)GBSN - Microbiology Lab (Compound Microscope)
GBSN - Microbiology Lab (Compound Microscope)
 
Film Coated Tablet and Film Coating raw materials.pdf
Film Coated Tablet and Film Coating raw materials.pdfFilm Coated Tablet and Film Coating raw materials.pdf
Film Coated Tablet and Film Coating raw materials.pdf
 
Ostiguy & Panizza & Moffitt (eds.) - Populism in Global Perspective. A Perfor...
Ostiguy & Panizza & Moffitt (eds.) - Populism in Global Perspective. A Perfor...Ostiguy & Panizza & Moffitt (eds.) - Populism in Global Perspective. A Perfor...
Ostiguy & Panizza & Moffitt (eds.) - Populism in Global Perspective. A Perfor...
 
The Scientific names of some important families of Industrial plants .pdf
The Scientific names of some important families of Industrial plants .pdfThe Scientific names of some important families of Industrial plants .pdf
The Scientific names of some important families of Industrial plants .pdf
 
In-pond Race way systems for Aquaculture (IPRS).pptx
In-pond Race way systems for Aquaculture (IPRS).pptxIn-pond Race way systems for Aquaculture (IPRS).pptx
In-pond Race way systems for Aquaculture (IPRS).pptx
 
Mining Activity and Investment Opportunity in Myanmar.pptx
Mining Activity and Investment Opportunity in Myanmar.pptxMining Activity and Investment Opportunity in Myanmar.pptx
Mining Activity and Investment Opportunity in Myanmar.pptx
 
Aerodynamics. flippatterncn5tm5ttnj6nmnynyppt
Aerodynamics. flippatterncn5tm5ttnj6nmnynypptAerodynamics. flippatterncn5tm5ttnj6nmnynyppt
Aerodynamics. flippatterncn5tm5ttnj6nmnynyppt
 
SCHISTOSOMA HEAMATOBIUM life cycle .pdf
SCHISTOSOMA HEAMATOBIUM life cycle  .pdfSCHISTOSOMA HEAMATOBIUM life cycle  .pdf
SCHISTOSOMA HEAMATOBIUM life cycle .pdf
 

Quantitative Structure Activity Relationship.pptx

  • 1. Quantitative Structure Activity Relationship (QSAR): Statistical method and Product concept Presented by: Radha Sureshrao Chafle F. Y. M. Pharm. (Semester II) (Pharmacology) Guided by: Dr. Mrs. Vandana S. Nikam HOD, Associate Professor in Pharmacology
  • 2. What is QSAR?  QSAR is a mathematical relationship which describe the structural dependence of biological activities either by physicochemical parameters, by indicator variables encoding different structural features , or by three-dimensional molecular property profiles of the compounds.
  • 3. Contd. Drugs, which exert their biological effects by interaction with a specific target must have :  a three-dimensional structure, which in the arrangement of its functional groups and in its surface properties is more or less complementary to a binding site.  Better the steric fit and complementarity of the of the surface properties of a drug to its binding site are, the higher its affinity will be and the higher may be its biological activity.
  • 4. Classical QSAR analyses  consider only 2D structures  main field of application is in substituent variation of a common scaffold. 3D QSAR • has a much broader scope. • starts from 3D structures and correlates biological activities with 3D-property fields.
  • 5. History & Development of QSAR  1868, Crum-Brown and Fraser: Published an equation Φ = f(C) i.e. assumption of physiological activity Φ as function of the chemical structure C.  1900, H. H. Meyer and C. E. Overton: lipoid theory of narcosis  1930‘s, L. Hammett: electronic sigma constants  1964, C. Hansch and T. Fujita: QSAR  1984, P. Andrews: affinity contributions of functional groups  1985, P. Goodford: GRID (hot spots at protein surface)  1988, R. Cramer: 3D QSAR  1992, H.-J. Bohm: LUDI interaction sites, docking, scoring  1997, C. Lipinski: bioavailability rule of five  1998, Ajay, W. P. Walters and M. A. Murcko; J. Sadowski and H. Kubinyi: drug-like character
  • 6. Basic Requirements in QSAR Studies  all analogs belong to a congeneric series  all analogs exert the same mechanism of action  all analogs bind in a comparable manner  the effects of isosteric replacement can be predicted  binding affinity is correlated to interaction energies  biological activities are correlated to binding affinity
  • 7. Molecular Properties and Their Parameters Molecular Property Corresponding Interaction Parameters Lipophilicity hydrophobic interactions log P, 𝜋, 𝑓, RM, 𝜒 Polarizability van-der-Waals interactions MR, parachor, MV Electron density ionic bonds, dipole- dipole interactions, hydrogen bonds, charge transfer interactions σ, R, F, κ, quantum chemical indices Topology steric hindrance geometric fit Es, rv, L, B, distances, volumes
  • 8. QSAR models
      • Hansch model (property-property relationship)
      • Definition of the lipophilicity parameter π:
          π_X = log P_RX - log P_RH
        where P_RX is the n-octanol/water partition coefficient of the substituted compound and P_RH that of the unsubstituted parent compound.
      • Linear Hansch model (C = molar concentration that produces a defined biological response):
          log 1/C = a log P + b σ + c MR + ... + k
      • Nonlinear Hansch models:
          log 1/C = a (log P)² + b log P + c σ + ... + k
          log 1/C = a π² + b π + c σ + ... + k
          log 1/C = a log P - b log (βP + 1) + c σ + ... + k
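The definition of π and the parabolic Hansch model can be tried out with a few lines of Python. This is only a sketch: the log P values are commonly quoted literature values for monosubstituted benzenes and are treated here as assumptions, and the coefficients a, b, k are hypothetical, not fitted to any data set.

```python
# Illustrative n-octanol/water log P values for C6H5-X (assumed literature values).
LOG_P = {"H": 2.13, "CH3": 2.69, "Cl": 2.84, "NO2": 1.85}


def pi_value(substituent, log_p=LOG_P):
    """pi_X = log P_RX - log P_RH, rounded to two decimals."""
    return round(log_p[substituent] - log_p["H"], 2)


def hansch_parabolic(log_p, a, b, k):
    """log 1/C = a (log P)^2 + b log P + k (electronic and steric terms omitted)."""
    return a * log_p ** 2 + b * log_p + k


def optimum_log_p(a, b):
    """log P at the vertex of the parabola; a maximum only if a < 0."""
    return -b / (2 * a)


pi = {x: pi_value(x) for x in ("CH3", "Cl", "NO2")}
print(pi)

# Hypothetical coefficients, chosen only to show the shape of the model.
a, b, k = -0.25, 1.0, 2.0
print(optimum_log_p(a, b), hansch_parabolic(optimum_log_p(a, b), a, b, k))
```

A positive π means the substituent makes the compound more lipophilic than the parent; a negative π (as for the nitro group here) means less lipophilic.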
  • 9. Contd.
      • Free-Wilson model (structure-property relationship):
          log 1/C = Σ a_i + µ
        a_i = substituent group contributions; µ = activity contribution of the reference compound
      • Mixed Hansch/Free-Wilson model:
          log 1/C = a (log P)² + b log P + c σ + ... + Σ a_i + k
          log 1/C = a log P - b log (βP + 1) + c σ + ... + Σ a_i + k
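A Free-Wilson equation is evaluated by encoding each compound as a row of 0/1 indicator variables and summing the contributions of the groups that are present. The sketch below assumes a hypothetical two-position scaffold (R1, R2); the group contributions and µ are invented numbers for illustration, not fitted values.

```python
# Hypothetical group contributions a_i, keyed by (position, substituent).
# Hydrogen is the reference substituent and therefore contributes nothing.
CONTRIBUTIONS = {
    ("R1", "Cl"): 0.40,
    ("R1", "CH3"): 0.15,
    ("R2", "OH"): -0.30,
    ("R2", "OCH3"): 0.10,
}
MU = 5.00  # hypothetical log 1/C of the reference compound (R1 = R2 = H)

COLUMNS = sorted(CONTRIBUTIONS)  # fixed column order of the indicator matrix


def indicator_row(compound):
    """1 if the compound carries that substituent at that position, else 0."""
    return [1 if compound.get(pos) == sub else 0 for pos, sub in COLUMNS]


def free_wilson(compound):
    """log 1/C = sum of a_i over the groups present + mu."""
    row = indicator_row(compound)
    return MU + sum(x * CONTRIBUTIONS[col] for x, col in zip(row, COLUMNS))


analog = {"R1": "Cl", "R2": "OH"}
print(indicator_row(analog), free_wilson(analog))
```

In a real analysis the a_i and µ would be obtained by regressing the measured log 1/C values on the indicator matrix.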
  • 10. Lipophilicity
      • Hydrophobic interaction between a drug and a binding site at a receptor
      • Definition of the partition coefficient: P = c_org / c_aq (n-octanol/water system)
  • 11. n-Octanol/Water as a Standard System
      • membrane-analogous structure
      • hydrogen bond donor and acceptor
      • practically insoluble in water
      • no desolvation on transfer into the organic phase
      • very low vapor pressure
      • transparent in the UV region
      • large database of log P values
  • 12. Additivity Principle of π Values (C. Hansch, 1964)
          π_X = log P_RX - log P_RH
      • The lipophilicity parameter π is an additive, constitutive molecular parameter; compare the Hammett equation:
          ρσ_X = log K_RX - log K_RH
  • 13. Non-additivity of π Values
      - intramolecular hydrogen bonds
      - ortho effects (phenols)
      - polysubstituted aromatic compounds
      - conjugation (push-pull effect)
      - heterocyclic compounds
      - cyclophanes
  • 14. Hydrophobic Fragmental Constants f
      • The hydrophobic fragmental constant of a substituent or molecular fragment represents the lipophilicity contribution of that fragment (R. Rekker, The Hydrophobic Fragmental Constant, Elsevier, Amsterdam 1977; R. Rekker, Eur. J. Med. Chem. 14, 479 (1979)).
          log P = Σ a_i f_i (R. Rekker, 1973)
        where a_i is the number of occurrences of fragment i and f_i its fragmental constant.
      • Experimental determination of log P values:
          - Shake-flask method
          - Reversed-phase thin-layer chromatography
          - High-performance liquid chromatography (HPLC)
  • 15. Polarizability Parameters
      • Molar volume, molar refractivity, parachor:
          MV = MW / d
          MR = [(n² - 1) / (n² + 2)] · (MW / d)
          PA = γ^(1/4) · (MW / d)
        MW = molecular weight; d = density; n = refractive index; γ = surface tension
        (MR is most often scaled by a factor of 0.1)
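The three polarizability parameters follow directly from measurable bulk properties. A minimal sketch, using approximate handbook values for benzene at about 20 °C as assumed inputs (MW in g/mol, d in g/cm³, γ in mN/m):

```python
def molar_volume(mw, d):
    """MV = MW / d, in cm^3/mol."""
    return mw / d


def molar_refractivity(n, mw, d):
    """MR = (n^2 - 1) / (n^2 + 2) * MW / d (Lorentz-Lorenz form)."""
    return (n ** 2 - 1) / (n ** 2 + 2) * mw / d


def parachor(gamma, mw, d):
    """PA = gamma^(1/4) * MW / d (vapor density neglected)."""
    return gamma ** 0.25 * mw / d


# Approximate values for benzene; treat them as illustrative assumptions.
MW, D, N, GAMMA = 78.11, 0.8765, 1.5011, 28.9

mv = molar_volume(MW, D)
mr = molar_refractivity(N, MW, D)
pa = parachor(GAMMA, MW, D)
print(round(mv, 1), round(mr, 1), round(pa, 1))
print("scaled MR:", round(0.1 * mr, 2))  # the 0.1 scaling mentioned on the slide
```

The scaling by 0.1 brings MR onto roughly the same numerical range as π and σ, which keeps regression coefficients comparable.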
  • 16. Electronic Parameters
      • Hammett equation: ρσ = log K_RX - log K_RH
      • Calculation of pKa values: pKa(R-X) = pKa(R-H) - ρσ
      • Example: pKa of 3,5-dinitro-4-methylbenzoic acid (pKa of benzoic acid = 4.20; ρ = 1 for benzoic acids; σ_m(NO2) = 0.71, σ_p(CH3) = -0.17)
          calculated value = 4.20 - (0.71 - 0.17 + 0.71) = 2.95
          experimental value = 2.97
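The pKa example can be reproduced in a few lines. The sketch assumes that the σ values are additive and that ρ = 1 (ionization of benzoic acids in water, the reference reaction of the Hammett scale); the σ constants are the ones used in the calculation on the slide.

```python
# Hammett sigma constants, keyed by (substituent, ring position).
SIGMA = {("NO2", "meta"): 0.71, ("CH3", "para"): -0.17}


def hammett_pka(pka_parent, substituents, rho=1.0):
    """pKa(R-X) = pKa(R-H) - rho * (sum of sigma), assuming additivity."""
    return pka_parent - rho * sum(SIGMA[s] for s in substituents)


# 3,5-dinitro-4-methylbenzoic acid: two meta NO2 groups and one para CH3 group.
pka = hammett_pka(4.20, [("NO2", "meta"), ("CH3", "para"), ("NO2", "meta")])
print(round(pka, 2))  # calculated; the slide quotes 2.97 as the experimental value
```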
  • 17. Quantum Mechanical Descriptors
      • Atom partial charges: Mulliken population analysis (orbital population); ESP charges (electrostatic potential mapped to atom locations)
      • Dipole moment: strength and orientation behavior of a molecule in an electrostatic field
      • HOMO / LUMO ("frontier orbital theory"): HOMO = energy of the highest occupied molecular orbital ("nucleophilicity"); LUMO = energy of the lowest unoccupied molecular orbital ("electrophilicity")
      • Superdelocalizability: estimate of the reactivity of positions in aromatic hydrocarbons
  • 18. 3D QSAR
      • 3D QSAR is an extension of classical QSAR that exploits the three-dimensional properties of the ligands to predict their biological activity, using robust statistical methods such as PLS, G/PLS and ANN.
      • 3D QSAR uses probe-based sampling within a molecular lattice to determine three-dimensional properties of molecules and then correlates these 3D descriptors with biological activity.
      • Some major factors that contribute to the overall free energy of binding, such as desolvation energetics, temperature, diffusion, transport, pH and salt concentration, are difficult to handle and are therefore usually ignored.
  • 19. Classification of 3D QSAR
      • On the basis of intermolecular bonding:
          - Ligand-based 3D QSAR, e.g. CoMFA, CoMSIA, COMPASS, CoMMA, SoMFA
          - Receptor-based 3D QSAR, e.g. COMBINE, AFMoC, HIFA, CoRIA
      • On the basis of alignment criterion:
          - Alignment-dependent 3D QSAR, e.g. CoMFA, CoMSIA, GERM, COMBINE, AFMoC, HIFA, CoRIA
          - Alignment-independent 3D QSAR, e.g. COMPASS, CoMMA, HQSAR, WHIM, EVA/CoSA, GRIND
      • On the basis of chemometric techniques used:
          - Linear 3D QSAR, e.g. CoMFA, CoMSIA, AFMoC, GERM, CoMMA, SoMFA
  • 20. Comparative Molecular Field Analysis (CoMFA)
      • In 1987, Cramer developed the predecessor of the 3D approaches, the Dynamic Lattice Oriented Molecular Modeling System (DYLOMMS), which uses PCA to extract vectors from the molecular interaction fields; these vectors are then correlated with biological activities.
      • CoMFA, a powerful 3D QSAR methodology, is a combination of GRID and PLS.
  • 21. Protocol for CoMFA
      • Determination of the bioactive conformations of the molecules.
      • Superimposition (alignment) of the molecules, using either manual or automated methods, in a manner defined by the supposed mode of interaction with the receptor.
      • The overlaid molecules are placed in the center of a lattice grid with a spacing of 2 Å.
      • The steric and electrostatic fields around the molecules are calculated with different probe groups positioned at all intersections of the lattice.
  • 22. Contd.
      • The PLS technique is used to correlate the interaction energies (field values) with the biological activity, so that the quantitative influence of specific chemical features of the molecules on their biological activity can be identified and extracted.
      • The results are expressed as correlation equations with a number of latent-variable terms, each of which is a linear combination of the original independent lattice descriptors.
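The field-calculation step of the protocol can be sketched as follows: build a lattice with 2 Å spacing around the aligned molecule and evaluate a steric (Lennard-Jones 12-6) and an electrostatic (Coulomb) energy for a probe at every lattice intersection. Everything specific here is an assumption made for illustration: the Lennard-Jones parameters `lj_a` and `lj_c`, the constant dielectric of 1, the 4 Å box margin, the ±30 kcal/mol truncation and the two-atom toy molecule. It is not the implementation of any particular program.

```python
import itertools
import math

COULOMB_K = 332.06  # kcal*A/(mol*e^2), converts q1*q2/r to kcal/mol
CUTOFF = 30.0       # kcal/mol truncation of both fields (assumed value)


def lattice(atoms, spacing=2.0, margin=4.0):
    """Grid points of a box that encloses all atoms plus `margin` on each side."""
    axes = []
    for dim in range(3):
        lo = min(a["xyz"][dim] for a in atoms) - margin
        hi = max(a["xyz"][dim] for a in atoms) + margin
        n = int(math.floor((hi - lo) / spacing)) + 1
        axes.append([lo + i * spacing for i in range(n)])
    return list(itertools.product(*axes))


def fields_at(point, atoms, probe_charge=1.0, lj_a=1.0e5, lj_c=1.0e2):
    """Truncated steric and electrostatic probe energies at one grid point."""
    steric = electrostatic = 0.0
    for atom in atoms:
        r = max(math.dist(point, atom["xyz"]), 1e-3)  # avoid division by zero
        steric += lj_a / r ** 12 - lj_c / r ** 6
        electrostatic += COULOMB_K * probe_charge * atom["q"] / r

    def clip(e):
        return max(-CUTOFF, min(CUTOFF, e))

    return clip(steric), clip(electrostatic)


# Toy "molecule": two atoms with hypothetical partial charges.
atoms = [{"xyz": (0.0, 0.0, 0.0), "q": -0.4}, {"xyz": (1.5, 0.0, 0.0), "q": 0.4}]
grid = lattice(atoms)
# One descriptor row per molecule: steric and electrostatic value per grid point.
row = [e for pt in grid for e in fields_at(pt, atoms)]
print(len(grid), len(row))
```

Stacking one such row per aligned molecule gives the descriptor matrix that PLS then correlates with the activities. The truncation is needed because both potentials grow without bound close to an atom.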
  • 23. Steps in CoMFA:
      • A set of molecules is first selected.
      • All molecules have to interact with the same kind of receptor (or enzyme, ion channel, transporter) in the same manner, i.e. with identical binding sites in the same relative geometry.
      • A subgroup of the molecules is selected as the training set from which the CoMFA model is derived.
      • The remaining molecules form the test set, which independently checks the validity of the derived model(s).
  • 24. Contd.
      • Atomic partial charges are calculated and (several) low-energy conformations are generated. A pharmacophore hypothesis is derived to orient the superposition of all individual molecules and to afford a rational and consistent alignment.
      • A sufficiently large box is positioned around the molecules and a grid distance is defined.
      • PLS analysis is the most appropriate method for correlating the field values with the biological activities. Normally, cross-validation is used to check the internal predictivity of the derived model.
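Leave-one-out cross-validation, the usual internal check, can be sketched as below. To keep the example short, a one-descriptor least-squares line stands in for the PLS model (an assumption of this sketch); the statistic q² = 1 - PRESS/SS is computed the same way whatever model is refitted. The data are made up.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def loo_q2(x, y):
    """Cross-validated q^2: each compound is predicted by a model fitted without it."""
    press = 0.0
    for i in range(len(x)):
        a, b = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (a + b * x[i])) ** 2
    mean_y = sum(y) / len(y)
    ss = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - press / ss


# Hypothetical descriptor values and activities.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
q2 = loo_q2(x, y)
print(round(q2, 3))
```

Unlike r², q² can become negative when the model predicts left-out compounds worse than the mean activity would.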
  • 27. Drawbacks of CoMFA:
      • Too many adjustable parameters.
      • Uncertainty in the selection of compounds and variables.
      • Fragmented contour maps with variable selection procedures.
      • Hydrophobicity not well quantified.
      • Cut-off limits used.
      • Low signal-to-noise ratio due to many useless field variables.
      • Imperfections in potential energy functions.
      • Applicable only to in vitro data.
  • 28. Comparative Molecular Similarity Indices Analysis (CoMSIA)
      • Molecular similarity indices calculated from modified SEAL similarity fields are employed as descriptors to consider steric, electrostatic, hydrophobic and hydrogen-bonding properties simultaneously.
      • These indices are estimated indirectly by comparing the similarity of each molecule in the data set with a common probe atom (radius 1 Å, charge +1, hydrophobicity +1) positioned at the intersections of a surrounding grid/lattice.
  • 29. Contd.
      • For computing similarity at all grid points, the mutual distances between the probe atom and the atoms of the molecules in the aligned data set are also taken into account.
      • Gaussian-type functions are employed to describe this distance dependence and to calculate the molecular properties.
      • Since the underlying Gaussian-type functional forms are "smooth" with no singularities, their slopes are not as steep as those of the Coulombic and Lennard-Jones potentials in CoMFA; therefore no arbitrary cut-off limits need to be defined.
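The Gaussian distance dependence can be sketched in a few lines. The function below sums, over all atoms, the product of the probe property, the atom property and exp(-α r²), with a negative sign as in the usual formulation of the index. The attenuation factor α = 0.3 is a commonly used default and is treated as an assumption here, as are the atom property values.

```python
import math


def similarity_index(grid_point, atoms, prop, probe_value=1.0, alpha=0.3):
    """A = -sum_i probe_value * w_i * exp(-alpha * r_i^2) for one property."""
    total = 0.0
    for atom in atoms:
        r2 = sum((g - c) ** 2 for g, c in zip(grid_point, atom["xyz"]))
        total += probe_value * atom[prop] * math.exp(-alpha * r2)
    return -total


# One atom with a hypothetical hydrophobicity of 1.0 at the origin.
atoms = [{"xyz": (0.0, 0.0, 0.0), "hydrophobicity": 1.0}]

on_atom = similarity_index((0.0, 0.0, 0.0), atoms, "hydrophobicity")
at_2A = similarity_index((2.0, 0.0, 0.0), atoms, "hydrophobicity")
print(on_atom, round(at_2A, 4))
```

The value stays finite even when the probe sits exactly on an atom, which is why no cut-off is needed, in contrast to the 1/r and 1/r¹² terms of CoMFA.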
  • 30. Comparison between CoMFA and CoMSIA
      Feature | CoMFA | CoMSIA
      Function type | Lennard-Jones potential, Coulomb potential | Gaussian
      Descriptors | Interaction energies | Similarity indices
      Cut-off | Required | Not required
      Fields | Steric, electrostatic | Steric, electrostatic, hydrophobic, hydrogen bond donor, hydrogen bond acceptor
      Contour map | Often not contiguous | Contiguous
      Model reproducibility | Poor | Good
      *CoMSIA is provided by Tripos Inc. in the SYBYL software, along with CoMFA.
  • 31. Statistical methods used in QSAR
      Linear Regression Analysis (LRA) | Multivariate Data Analysis | Pattern Recognition
      Simple linear regression | Principal component analysis (PCA) | Cluster analysis
      Multiple linear regression (MLR) | Principal component regression (PCR) | Artificial neural networks (ANNs)
      Stepwise multiple linear regression | Partial least squares analysis (PLS) | k-nearest neighbor (kNN)
      | Genetic function approximation (GFA) |
      | Genetic partial least squares (G/PLS) |
  • 32. Linear Regression Analysis (LRA)
      • Linear regression analyses are easily interpretable methods for QSAR analysis.
      • These techniques construct a statistical model that represents the correlation of one or more independent variables (x) with a dependent variable (y).
      • The model can be used to predict y from known x variables, which may be quantitative or qualitative.
  • 33. a. Simple linear regression method
      • Standard linear regression calculation that generates a set of QSAR equations, each containing a single independent descriptor x and the dependent variable y.
      • A one-term linear equation is produced separately for each independent variable in the descriptor set:
          y = a + bx
  • 34. b. Multiple linear regression (MLR)
      • Also referred to as the linear free energy relationship (LFER) method.
      • Generates QSAR equations by standard multivariable regression calculations to identify the dependence of a drug property on any or all of the descriptors under investigation.
      • Involves more than one independent variable:
          y = b0 + b1x1 + b2x2 + ... + bmxm + e
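The MLR coefficients b0 ... bm are the least-squares solution of the normal equations (XᵀX) b = Xᵀy. A minimal pure-Python sketch follows; the data are constructed to follow y = 1 + 2·x1 - 3·x2 exactly so that the result is easy to check, and the two descriptors are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_mlr(X, y):
    """Return [b0, b1, ..., bm] minimizing the sum of squared residuals e."""
    rows = [[1.0] + list(r) for r in X]  # leading 1 gives the intercept b0
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)


X = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]]  # two hypothetical descriptors
y = [1, 3, -2, 0, 2]                           # equals 1 + 2*x1 - 3*x2
coef = fit_mlr(X, y)
print([round(c, 6) for c in coef])
```

The normal equations become ill-conditioned when descriptors are strongly intercorrelated, which is one reason the latent-variable methods on the later slides are used for large descriptor sets.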
  • 35. c. Stepwise multiple linear regression
      • Commonly used variant of MLR.
      • Creates a multiple-term linear equation, but not all independent variables are used.
      • Each independent variable is added to the equation in turn, and a new regression is performed every time.
      • The new term is kept only if the model passes a test for significance.
      • Useful when the number of descriptors is large and the key descriptors are unknown.
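The add-one-term-and-test loop can be sketched as forward selection with a partial F-test. The sketch orthogonalizes the residual and the remaining candidates against each accepted descriptor, which gives the same reduction in residual sum of squares as refitting the full regression. The F-to-enter threshold of 4.0 is a conventional choice assumed here, and the data are constructed for illustration.

```python
def center(v):
    m = sum(v) / len(v)
    return [vi - m for vi in v]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def remove_component(v, q):
    """Subtract from v its projection onto q."""
    coef = dot(v, q) / dot(q, q)
    return [vi - coef * qi for vi, qi in zip(v, q)]


def forward_stepwise(columns, y, f_to_enter=4.0):
    """Return descriptor names in order of entry (intercept always included)."""
    n = len(y)
    resid = center(y)
    cand = {name: center(col) for name, col in columns.items()}
    selected = []
    while cand:
        rss = dot(resid, resid)
        # Drop in RSS that each candidate would produce if added now.
        gains = {k: dot(resid, c) ** 2 / dot(c, c)
                 for k, c in cand.items() if dot(c, c) > 1e-12}
        if not gains:
            break
        best = max(gains, key=gains.get)
        rss_new = rss - gains[best]
        dof = n - len(selected) - 2  # n minus intercept minus terms incl. the new one
        if dof <= 0:
            break
        f = gains[best] / (rss_new / dof) if rss_new > 1e-12 else float("inf")
        if f < f_to_enter:
            break  # the new term is not significant: stop
        q = cand.pop(best)
        selected.append(best)
        resid = remove_component(resid, q)
        cand = {k: remove_component(c, q) for k, c in cand.items()}
    return selected


x1 = [1, 2, 3, 4, 5, 6, 7, 8]
x2 = [1, -1, -1, 1, 1, -1, -1, 1]
noise = [0.1, -0.1, 0.1, -0.1, -0.1, 0.1, -0.1, 0.1]
y_a = [2 * a + e for a, e in zip(x1, noise)]                    # depends on x1 only
y_b = [2 * a + 1.5 * b + e for a, b, e in zip(x1, x2, noise)]   # depends on both
print(forward_stepwise({"x1": x1, "x2": x2}, y_a))
print(forward_stepwise({"x1": x1, "x2": x2}, y_b))
```

Full stepwise procedures usually also test whether previously entered terms should be removed again; that backward step is left out of this sketch.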
  • 36. Multivariate Data Analysis
      • Has largely replaced LRA.
      • Aims to explain an extended set of variables by means of a reduced number of new latent variables that carry the maximum amount of information relevant to the problem.
  • 37. a. Principal component analysis (PCA)
      • Data reduction technique that does not itself generate a QSAR model.
      • Creates a new set of orthogonal descriptors, the principal components, which describe most of the information contained in the independent variables.
      • Reduces the dimensionality of a multivariate data set of descriptors to the actual amount of data available.
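The first principal component is the eigenvector of the descriptor covariance matrix with the largest eigenvalue. A minimal sketch using power iteration (chosen here only because it needs no linear-algebra library); the two-descriptor data set is made up so that all variance lies along one direction.

```python
import math


def first_principal_component(rows, iters=200):
    """Return (loading vector, scores, explained variance fraction) of PC1."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
           for i in range(p)]
    # Power iteration; the start vector must not be orthogonal to PC1.
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p)) for i in range(p))
    scores = [sum(r[j] * v[j] for j in range(p)) for r in centered]
    explained = eigenvalue / sum(cov[i][i] for i in range(p))
    return v, scores, explained


# Two perfectly correlated hypothetical descriptors (second = 2 x first).
data = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
loadings, scores, explained = first_principal_component(data)
print([round(x, 4) for x in loadings], round(explained, 4))
```

Here one component carries all of the variance, so the two descriptors can be replaced by a single score per compound. Descriptors on very different scales are normally autoscaled first, which this sketch omits.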
  • 38. b. Principal component regression (PCR)
      • When principal components are employed as the independent variables in a linear regression, the method is termed principal component regression.
      • PCR uses the scores from the PCA decomposition as regressors in the QSAR model to generate a multiple-term linear equation.
  • 39. c. Partial least squares analysis (PLS)
      • An iterative regression procedure that produces its solution by linear transformation of a large number of original descriptors into a small number of new orthogonal terms called latent variables.
      • PLS is able to analyze complex SAR data in a more realistic way.
      • It is able to interpret the influence of molecular structure on biological activity.
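The iterative extraction of latent variables can be sketched with the NIPALS form of PLS for a single response (often called PLS1). Each pass finds the weight vector along which the descriptors covary most with the remaining activity, forms the score t, and removes what t explains from both the descriptors and the activity. The function only returns fitted values for the training compounds, and the data are constructed for illustration.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def pls1_fitted(X, y, n_components):
    """Fitted y values of a PLS1 model with the given number of latent variables."""
    n, p = len(X), len(X[0])
    x_mean = [sum(r[j] for r in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[r[j] - x_mean[j] for j in range(p)] for r in X]  # centered descriptors
    f = [yi - y_mean for yi in y]                          # centered activities
    fitted = [y_mean] * n
    for _ in range(n_components):
        w = [dot([row[j] for row in E], f) for j in range(p)]  # w ~ E^T f
        norm = dot(w, w) ** 0.5
        if norm < 1e-12:
            break  # nothing left to explain
        w = [wj / norm for wj in w]
        t = [dot(row, w) for row in E]                     # scores (latent variable)
        tt = dot(t, t)
        loading = [dot([row[j] for row in E], t) / tt for j in range(p)]
        q = dot(f, t) / tt                                 # regression of f on t
        E = [[row[j] - ti * loading[j] for j in range(p)] for row, ti in zip(E, t)]
        f = [fi - q * ti for fi, ti in zip(f, t)]
        fitted = [yh + q * ti for yh, ti in zip(fitted, t)]
    return fitted


X = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]]  # two hypothetical descriptors
y = [1, 3, -2, 0, 2]                           # equals 1 + 2*x1 - 3*x2
one = pls1_fitted(X, y, 1)
two = pls1_fitted(X, y, 2)
print([round(v, 3) for v in one])
print([round(v, 3) for v in two])
```

With as many latent variables as there are independent descriptors, PLS reproduces the ordinary least-squares fit; in practice the number of components is chosen by cross-validation and is much smaller than the number of descriptors.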
  • 40. d. Genetic function approximation (GFA)
      • Serves as an alternative to standard regression analysis for building QSAR equations.
      • Employs the natural principles of evolution, in which recombination leads to improvement.
      • Suitable for obtaining QSAR equations when dealing with a large number of independent variables.
      • Produces multiple models, evolved from an initial population of models by a genetic algorithm.
  • 41. e. Genetic partial least squares (G/PLS)
      • A valuable analytical tool that combines the best features of GFA and PLS.
  • 42. Pattern Recognition
      • The method is based on the principle of analogy.
      • It is used to detect distance or closeness within large amounts of multivariate data.
  • 43. a. Cluster analysis
      • Statistical pattern recognition method used to investigate the relationship between observations associated with several properties and to partition the data set into categories of similar elements.
      • Allows inactive compounds to be considered in the analysis.
      • Can be used to study a large set of substituents and identify subsets that share similar physical properties.
  • 44. b. Artificial neural networks (ANNs)
      • The technique is modeled on the real neurons of an animal brain.
      • ANNs are parallel computational systems consisting of groups of highly interconnected processing elements called neurons, which are arranged in a series of layers:
          a) First layer: input layer
          b) Subsequent layer(s): hidden layer
          c) Last layer: output layer
  • 45. Contd.
      • Each layer performs its own computations and passes the results to the next one.
      • The weighted inputs are summed and supplied to the hidden layer, where a nonlinear transfer function does the required processing.
      • The results of the transfer function are passed to the neurons in the output layer, where they are interpreted and finally presented to the user.
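The flow of information through the layers can be sketched as a single forward pass. The sketch assumes one hidden layer, a sigmoid as the nonlinear transfer function and a linear output neuron; the weights are hypothetical and would normally be obtained by training on compounds with known activity.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Input layer -> hidden layer (weighted sum + sigmoid) -> linear output."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(ws, x)) + b)
              for ws, b in zip(w_hidden, b_hidden)]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


# Three descriptors in, two hidden neurons, one predicted activity out.
descriptors = [0.56, -0.17, 1.03]                 # hypothetical descriptor values
w_hidden = [[0.8, -0.5, 0.3], [-0.2, 0.9, 0.4]]   # hypothetical weights
b_hidden = [0.1, -0.1]
w_out, b_out = [1.5, -0.7], 4.0
print(round(forward(descriptors, w_hidden, b_hidden, w_out, b_out), 3))
```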
  • 46. c. k-nearest neighbor (kNN)
      • One of the simplest machine learning algorithms.
      • Most commonly used for classifying a new pattern (e.g. a molecule).
      • The technique is based on a simple distance learning approach, in which an unknown or new molecule is classified according to the majority class of its k nearest neighbors in the training set.
      • Nearness is determined by the Euclidean distance metric (e.g. a similarity measure computed from the structural descriptors of the molecules).
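The majority vote among the k nearest neighbors takes only a few lines. The training set below is hypothetical: each molecule is a pair of descriptor values with an active/inactive label.

```python
import math
from collections import Counter


def knn_classify(query, training, k=3):
    """Label of the majority among the k training points closest to `query`."""
    ranked = sorted(training, key=lambda item: math.dist(query, item[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# (descriptor vector, class) pairs; values are made up for illustration.
training = [
    ((1.0, 1.0), "active"),
    ((1.2, 0.8), "active"),
    ((0.9, 1.1), "active"),
    ((4.0, 4.2), "inactive"),
    ((3.8, 4.0), "inactive"),
    ((4.1, 3.9), "inactive"),
]
print(knn_classify((1.1, 1.0), training))
print(knn_classify((3.9, 4.1), training))
```

Because the Euclidean distance is dominated by descriptors with large numerical ranges, descriptors are usually scaled before kNN is applied; an odd k avoids ties in two-class problems.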
  • 47. Importance of statistical parameters
      • The equations generated in QSAR studies are linear regression equations.
      • Several equations may be generated for one problem or case under study.
      • Statistics helps in selecting the single best-fit equation among them.
  • 48. Contd.
      • This may be done by checking the standard deviation/variance and other related parameters for the data set used in the QSAR study.
      • The correlation coefficient computed for the data set under study also helps in selecting the appropriate QSAR equation.
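The parameters usually reported with a QSAR equation are the squared correlation coefficient r², the standard deviation of the residuals s and the F value. A minimal sketch, assuming the calculated values come from a least-squares equation with an intercept and `n_descriptors` descriptor terms; the numbers are made up.

```python
def regression_stats(y_obs, y_calc, n_descriptors):
    """Return (r2, s, F) for one candidate QSAR equation."""
    n = len(y_obs)
    mean = sum(y_obs) / n
    ss_tot = sum((y - mean) ** 2 for y in y_obs)
    ss_res = sum((yo - yc) ** 2 for yo, yc in zip(y_obs, y_calc))
    dof = n - n_descriptors - 1                  # residual degrees of freedom
    r2 = 1.0 - ss_res / ss_tot
    s = (ss_res / dof) ** 0.5
    f = float("inf") if ss_res == 0 else (r2 / n_descriptors) / ((1.0 - r2) / dof)
    return r2, s, f


y_obs = [1.0, 2.0, 3.0, 4.0, 5.0]    # hypothetical observed log 1/C values
y_calc = [1.1, 1.9, 3.0, 4.1, 4.9]   # values calculated by a one-descriptor equation
r2, s, f = regression_stats(y_obs, y_calc, n_descriptors=1)
print(round(r2, 3), round(s, 3), round(f, 1))
```

Among candidate equations, a higher r² and F together with a lower s indicate a better fit; adding descriptors always raises r², so s and F, which account for the degrees of freedom, are the fairer comparison.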
  • 49. References:
      • Kubinyi H., Introduction, In: Mannhold R., Larsen P., Timmerman H. QSAR: Hansch Analysis and Related Approaches. New York, VCH Publishers, 1993. p. 4-8
      • Kubinyi H., Introduction, In: Mannhold R., Larsen P., Timmerman H. QSAR: Hansch Analysis and Related Approaches. New York, VCH Publishers, 1993. p. 27-54
      • Kubinyi H., Introduction, In: Mannhold R., Larsen P., Timmerman H. QSAR: Hansch Analysis and Related Approaches. New York, VCH Publishers, 1993. p. 57-68
      • Kubinyi H., Introduction, In: Mannhold R., Larsen P., Timmerman H. QSAR: Hansch Analysis and Related Approaches. New York, VCH Publishers, 1993. p. 159