Government Post Graduate College Mandian Abbottabad
Assignment no 1: Quantitative Structure-Activity Relationship (QSAR)
Submitted by:
Name: Zarlish Attique
Registration no: 187104
Subject: Pharmacoinformatics
Department: Bioinformatics
Semester: 5th
Submitted to:
Teacher Name: Sir Imran
Department of Bioinformatics
Date of Submission: November 10, 2020
Quantitative structure–activity relationship
Introduction to Quantitative structure-activity relationship (QSAR)
Quantitative structure-activity relationship (QSAR) is a computational modeling method for
revealing relationships between structural properties of chemical compounds and biological
activities in a quantitative manner for a series of compounds. QSAR modeling is essential
for drug discovery, but it has many constraints.
Put differently, QSAR is an approach for finding quantitative relationships between the
chemical structure of compounds and their biological activity. QSAR models are theoretical
models that relate a quantitative measure of chemical structure to a physical property or a
biological activity.
Principle: Structurally similar chemicals are likely to have similar physicochemical and
biological properties.
Mathematical form
QSAR models are of the form:
Apred = f(D1, D2,...Dn)
where,
❖ Apred: biological activity (or toxicological endpoint)
❖ D1,D2,...Dn: chemical or structural properties (molecular descriptors)
Or
A QSAR has the form of a mathematical model:
Activity = f (physicochemical properties and/or structural properties) + error
The error includes model error (bias) and observational variability, that is, the variability in
observations even on a correct model.
Note: Qualitative SARs and quantitative SARs are collectively referred to as (Q)SARs.
Qualitative relationships are derived from non-continuous data (e.g., yes or no data), while
quantitative relationships are derived for continuous data (e.g., toxic potency data).
Quantitative structure–activity relationship Models
Quantitative structure–activity relationship models (QSAR models) are regression or
classification models used in the chemical and biological sciences and engineering. Like other
regression models, QSAR regression models relate a set of "predictor" variables (X) to the
potency of the response variable (Y), while classification QSAR models relate the predictor
variables to a categorical value of the response variable.
In QSAR modeling, the predictors consist of physico-chemical properties or theoretical
molecular descriptors of chemicals; the QSAR response-variable could be a biological activity of
the chemicals. QSAR models first summarize a supposed relationship between chemical
structures and biological activity in a data-set of chemicals. Second, QSAR models predict the
activities of new chemicals.
Related terms
Related terms include quantitative structure–property relationships (QSPR) when a chemical
property is modeled as the response variable. "Different properties or behaviors of chemical
molecules have been investigated in the field of QSPR. Some examples are quantitative
structure–reactivity relationships (QSRRs), quantitative structure–chromatography relationships
(QSCRs) and, quantitative structure–toxicity relationships (QSTRs), quantitative structure–
electrochemistry relationships (QSERs), and quantitative structure–biodegradability relationships
(QSBRs)."
Example for demonstration
As an example, biological activity can be expressed quantitatively as the concentration of a
substance required to give a certain biological response. Additionally, when physicochemical
properties or structures are expressed by numbers, one can find a mathematical relationship, or
quantitative structure-activity relationship, between the two. The mathematical expression, if
carefully validated, can then be used to predict the modeled response of other chemical structures.
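As a small illustration of expressing activity as a number, the sketch below converts a concentration into the negative-logarithm scale that is commonly used as the QSAR response. It assumes the "concentration required to give a certain biological response" is an IC50 reported in nanomolar; the function name and the three compounds are hypothetical.

```python
import math

def pic50_from_nanomolar(ic50_nm: float) -> float:
    """Convert an IC50 given in nanomolar to pIC50 = -log10(IC50 in mol/L)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    return -math.log10(ic50_nm * 1e-9)

# Hypothetical IC50 values (nM) for three compounds, for illustration only.
ic50_values_nm = {"compound_A": 10.0, "compound_B": 250.0, "compound_C": 5000.0}

# A lower concentration (more potent compound) gives a higher pIC50.
pic50 = {name: pic50_from_nanomolar(v) for name, v in ic50_values_nm.items()}
for name, value in pic50.items():
    print(f"{name}: pIC50 = {value:.2f}")
```

Working on the logarithmic scale keeps potencies that span several orders of magnitude on a comparable numeric range, which is convenient for the regression methods discussed later.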
Principal steps of QSAR
The principal steps of QSAR include (i) selection of the data set and extraction of
structural/empirical descriptors, (ii) variable selection, (iii) model construction and
(iv) validation and evaluation.
The QSAR method involves recognition that a molecule (organic, peptide, protein, etc.) is really
a three–dimensional distribution of properties. The most important of these properties are steric
(e.g., shape and volume), electronic (e.g. electric charge and electrostatic potential), and what are
termed ‘lipophilic’ properties (how polar or non–polar the sections of the molecule are, usually
exemplified by the log of the octanol–water partition coefficient, log P).
The QSAR method (and analogously QSTR and QSPR) involves a number of key steps:
1. Converting molecular structures into mathematical descriptors that encapsulate the key
properties of the molecules relevant to the activity or property being modelled.
2. Selecting the best descriptors from a larger set of accessible, relevant descriptors.
3. Mapping the molecular descriptors into the properties, preferably using a ‘model–free’
mapping system in which no assumptions are needed as to the functional form of the
structure–activity relationship. These relationships are often complex, unknown and
non–linear.
4. Validating the model to determine how predictive it is, and how well it will generalise to new
molecules not in the data set used to generate the model (the training set).
“Step 1- Generation of descriptors”
There are myriad methods for generating molecular descriptors. Packages such as Dragon
are able to generate over a thousand descriptors, while methods such as CoMFA generate
many thousands. Molecular descriptors can be of diverse types.
Molecular descriptors
Molecular descriptors are the final products of mathematical procedures that transform the
chemical information encoded within a molecular structure into a numerical representation.
Dimensionality of molecular descriptors can identify QSAR model type as described below:
0D QSAR- These are descriptors derived from molecular formula e.g. molecular weight, number
and type of atoms etc.
1D QSAR- A substructure list representation of a molecule can be considered as a one-
dimensional (1D) molecular representation and consists of a list of molecular fragments (e.g.
functional groups, rings, bonds, substituents etc.).
2D QSAR- A molecular graph contains topological or two dimensional (2D) information. It
describes how the atoms are bonded in a molecule, both the type of bonding and the interaction
of particular atoms (e.g. total path count, molecular connectivity indices etc.).
3D QSAR- These are calculated starting from a geometrical or 3D representation of a molecule.
These descriptors include molecular surface, molecular volume and other geometrical properties.
There are different types of 3D descriptors e.g. electronic, steric, shape etc.
4D QSAR- Four dimensional information is described in this type of model, and the fourth
dimension is an ensemble of conformations of each ligand.
5D QSAR- Five dimensional information is described in this type of model, and the fifth
dimension is the possibility to represent an ensemble of up to six different induced-fit models.
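To make the lowest level of this hierarchy concrete, here is a minimal sketch of 0D descriptors (atom counts and molecular weight) computed from nothing but a molecular formula. The parser is an assumption of this example: it only handles simple formulas without parentheses or charges, and the mass table covers just a few elements.

```python
import re

# Standard atomic masses (g/mol), rounded; only a few elements are included.
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999, "S": 32.06, "Cl": 35.45}

def atom_counts(formula: str) -> dict:
    """Parse a simple molecular formula such as 'C9H8O4' into element counts."""
    counts = {}
    # Each match is an element symbol followed by an optional count.
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts

def molecular_weight(formula: str) -> float:
    """0D descriptor: sum of atomic masses weighted by atom counts."""
    return sum(ATOMIC_MASS[el] * n for el, n in atom_counts(formula).items())

# Aspirin has the molecular formula C9H8O4.
print(atom_counts("C9H8O4"))
print(round(molecular_weight("C9H8O4"), 3))
```

Because 0D descriptors ignore connectivity, isomers with the same formula receive identical values, which is why the higher-dimensional descriptors above are needed to tell them apart.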
The descriptors fall into four classes: topological, geometrical, electronic and hybrid.
Mainly we have,
1. Topological descriptors: Topological descriptors are graph invariants generated by
applying the theorems of graph theory. Examples of topological descriptors are: atom
counts, ring counts, molecular weight, weighted paths, molecular connectivity indices,
substructure counts, molecular distance edge descriptors, kappa indices,
electrotopological state indices, and some other invariants.
2. Electronic descriptors: Aspects of the structures related to the electrons are encoded
by calculating electronic descriptors. Examples of electronic descriptors are: partial
atomic charges, HOMO or LUMO energies, dipole moment.
3. Geometric descriptors: Geometric descriptors are used to encode the 3-D aspects of the
molecular structure such as moments of inertia, solvent accessible surface area,
length-to-breadth ratios, shadow areas, gravitational index.
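A graph invariant of the kind described under topological descriptors can be computed with a few lines of code. The sketch below uses the Wiener index (the sum of shortest-path bond counts over all atom pairs), chosen here only as a classic, easily checked example; it is not one of the descriptors named in the list above, and the adjacency-list encoding of the molecules is an assumption of this example.

```python
from collections import deque

def wiener_index(adjacency: dict) -> int:
    """Sum of shortest-path bond counts over all unordered atom pairs."""
    total = 0
    for start in adjacency:
        # Breadth-first search gives topological distances from `start`.
        dist = {start: 0}
        queue = deque([start])
        while queue:
            atom = queue.popleft()
            for neighbour in adjacency[atom]:
                if neighbour not in dist:
                    dist[neighbour] = dist[atom] + 1
                    queue.append(neighbour)
        total += sum(dist.values())
    return total // 2  # each pair was counted from both ends

# Hydrogen-suppressed carbon skeletons (atom numbering is arbitrary).
n_butane = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
isobutane = {0: [1], 1: [0, 2, 3], 2: [1], 3: [1]}

print(wiener_index(n_butane), wiener_index(isobutane))
```

The two isomers share the same formula but give different values (10 for the linear chain, 9 for the branched one), showing how a topological descriptor captures branching that a formula-based descriptor cannot.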
Descriptor selection
To build a good QSAR model, a minimal set of information–rich descriptors is required. The
large number of possible indices creates several problems for the modeller:
❖ Many descriptors do not contain molecular information relevant to the problem.
❖ Many descriptors are linearly dependent (contain essentially the same information).
❖ Use of poor descriptors in QSAR yields poor and misleading models.
❖ Including too many descriptors in the model, even if they contain relevant information,
can result in overfitting of the model, and loss of ability of the model to generalise to
unseen molecules.
❖ Many methods of screening this large pool of potential descriptors for relevant ones
can lead to chance correlations (correlations that arise by chance).
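One simple way to address the linear-dependence problem listed above is a pairwise correlation filter. The following is a minimal sketch, assuming a greedy strategy and a cut-off of |r| = 0.95; both choices, the function names and the small descriptor table are hypothetical, and the sketch does not handle constant (zero-variance) descriptors.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def drop_correlated(descriptors: dict, threshold: float = 0.95) -> list:
    """Greedily keep a descriptor only if |r| with every kept one is below threshold."""
    kept = []
    for name, values in descriptors.items():
        if all(abs(pearson(values, descriptors[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical descriptor table: five molecules, three descriptors.
table = {
    "mol_weight": [100.0, 150.0, 200.0, 250.0, 300.0],
    "heavy_atoms": [7.0, 10.0, 14.0, 17.0, 21.0],  # nearly proportional to weight
    "logP": [1.2, 0.3, 2.5, 0.9, 1.7],
}

print(drop_correlated(table))
```

In this toy table the heavy-atom count carries essentially the same information as molecular weight and is discarded, while log P is retained. A filter like this reduces redundancy but does not by itself guard against chance correlations with the response.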
“Step 2- Structure–activity mapping”
Many methods have been used to map molecular descriptors to properties. The majority are
regression methods, of which multiple linear regression was the first used.
Regression methods attempt to fit a specific function with free parameters to a set of data. They
usually do this using a criterion such as least squares, which finds the set of free parameters
that minimises the sum of the squares of the errors between the measured values of the
dependent variables and those calculated by the fitted function. Some QSAR problems have
relatively linear response surfaces that can be modelled successfully by linear regression
methods.
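The least-squares idea can be illustrated with the simplest possible case, a single descriptor and a straight line, for which the best parameters have a closed-form solution. This is only a sketch: the descriptor and activity values are hypothetical, and a real QSAR model would normally use several descriptors (multiple linear regression).

```python
def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept (one descriptor)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def sum_squared_error(x, y, slope, intercept):
    """Sum of squared differences between measured and calculated values."""
    return sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x, y))

# Hypothetical data: one descriptor value and one measured activity per molecule.
descriptor = [0.0, 1.0, 2.0, 3.0, 4.0]
activity = [1.1, 2.9, 5.2, 6.8, 9.0]

slope, intercept = fit_line(descriptor, activity)
print(f"activity = {slope:.2f} * descriptor + {intercept:.2f}")
print(f"SSE = {sum_squared_error(descriptor, activity, slope, intercept):.4f}")
```

Any other choice of slope or intercept gives a larger sum of squared errors on these data, which is exactly the sense in which the fitted parameters are "best".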
“Step 3- Validation and testing”
Once a model has been derived, it is important to know how predictive it is, in order to show
whether a structure–property mapping method has overfitted the data, a neural net has been
overtrained, or chance correlations are present.
Several methods have been developed to estimate the validity or predictivity of the derived
structure–property model. The most common method is ‘leave–one–out’ cross–validation.
This involves leaving each molecule out of the training set in turn, then creating a model
using the remainder of the training set. The property of the omitted molecule is predicted
using the model derived from all of the other molecules. This method is not a very rigorous
test of the predictivity of the model and suffers from two other major deficiencies: the time to
carry out the cross–validation increases as the square of the size of the training set; the
method produces n final models (each corresponding to one of the training set molecules
being left out) and it is not clear which is the ‘best’ model.
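The leave-one-out procedure described above can be sketched as follows for a one-descriptor linear model. The cross-validated q² is taken here as 1 - PRESS / SS, with SS computed around the mean of all observed values; other conventions exist, so this definition, like the data and function names, is an assumption of the example.

```python
def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def loo_q2(x, y):
    """Leave-one-out cross-validated q2 = 1 - PRESS / SS_total."""
    n = len(x)
    press = 0.0
    for i in range(n):
        # Refit the model without molecule i, then predict molecule i.
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        slope, intercept = fit_line(x_train, y_train)
        press += (y[i] - (slope * x[i] + intercept)) ** 2
    mean_y = sum(y) / n
    ss_total = sum((v - mean_y) ** 2 for v in y)
    return 1.0 - press / ss_total

# Hypothetical training set: descriptor values and measured activities.
descriptor = [0.0, 1.0, 2.0, 3.0, 4.0]
activity = [1.1, 2.9, 5.2, 6.8, 9.0]
print(f"q2 = {loo_q2(descriptor, activity):.3f}")
```

The loop makes the cost visible: the model is refitted once per molecule, so n separate models are produced, none of which is the final model.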
A better method is to remove a percentage of the training set into a test set. The
structure–property model is derived using the reduced training set, and the properties of the
test set are predicted using this model. This is a more rigorous test of the quality of the
structure–property model but again suffers from problems: not all of the available data can be
used to make the model, as some must be held back for the test set; and it is not clear how the
test set is best selected from the training set, e.g. randomly or using cluster analysis.
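A random hold-out split, the simpler of the two selection strategies mentioned, might look like the sketch below. The 20% test fraction, the fixed seed and the molecule identifiers are all hypothetical choices made for illustration; a cluster-based selection would replace the shuffle with a grouping step.

```python
import random

def split_train_test(items, test_fraction=0.2, seed=0):
    """Randomly move a fraction of the data set into a test set."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    # First n_test shuffled items form the test set, the rest the training set.
    return shuffled[n_test:], shuffled[:n_test]

molecules = [f"mol_{i}" for i in range(10)]  # hypothetical identifiers
train, test = split_train_test(molecules, test_fraction=0.2, seed=42)
print(len(train), "training molecules;", len(test), "test molecules")
```

The model is then built from `train` only, and `test` is used once, at the end, to estimate predictivity.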
Overall study design of a QSAR-guided drug discovery project.
Types of QSAR
1. Fragment-based QSAR
Analogously, the "partition coefficient"—a measurement of differential solubility and itself a
component of QSAR predictions—can be predicted either by atomic methods (known as
"XLogP" or "ALogP") or by chemical fragment methods (known as "CLogP" and other
variations). It has been shown that the logP of a compound can be determined by the sum of its
fragments; fragment-based methods are generally accepted as better predictors than atomic-based
methods. Fragmentary values have been determined statistically, based on empirical data for
known logP values. This method gives mixed results and is generally not trusted to have
accuracy of more than ±0.1 units.
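The additive idea behind fragment methods can be shown in a few lines. This is a sketch only: the fragment names and contribution values below are invented for illustration and are not taken from CLogP or any published fragment table, and real fragment methods are more elaborate than a plain sum.

```python
# Hypothetical fragment contributions to log P (illustrative numbers only,
# NOT taken from CLogP or any published fragment table).
FRAGMENT_LOGP = {"CH3": 0.5, "CH2": 0.5, "OH": -1.0, "C6H5": 2.0}

def estimate_logp(fragments: dict) -> float:
    """Sum fragment contributions; `fragments` maps fragment name -> count."""
    return sum(FRAGMENT_LOGP[name] * count for name, count in fragments.items())

# Molecules described as bags of fragments (hypothetical decomposition).
ethanol = {"CH3": 1, "CH2": 1, "OH": 1}
toluene = {"C6H5": 1, "CH3": 1}

print(estimate_logp(ethanol), estimate_logp(toluene))
```

With these made-up values the hydroxyl group lowers the estimate and the aromatic ring raises it, which mirrors the qualitative behaviour that fragment tables are fitted to reproduce.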
Group or Fragment based QSAR is also known as GQSAR. GQSAR allows flexibility to study
various molecular fragments of interest in relation to the variation in biological response. The
molecular fragments could be substituents at various substitution sites in congeneric set of
molecules or could be on the basis of pre-defined chemical rules in case of non-congeneric sets.
GQSAR also considers cross-terms fragment descriptors, which could be helpful in identification
of key fragment interactions in determining variation of activity. Lead discovery using
Fragnomics is an emerging paradigm. In this context FB-QSAR proves to be a promising
strategy for fragment library design and in fragment-to-lead identification endeavours.
An advanced approach to fragment- or group-based QSAR, based on the concept of
pharmacophore similarity, has also been developed. This method, pharmacophore-similarity-based
QSAR (PS-QSAR), uses topological pharmacophoric descriptors to develop QSAR models. The
resulting activity prediction may help to assess the contribution of certain pharmacophore
features, encoded by the respective fragments, toward activity improvement and/or detrimental
effects.
2. 3D-QSAR
The acronym 3D-QSAR or 3-D QSAR refers to the application of force field calculations
requiring three-dimensional structures of a given set of small molecules with known activities
(training set). The training set needs to be superimposed (aligned) by either experimental data
(e.g. based on ligand-protein crystallography) or molecule superimposition software. It uses
computed potentials, e.g. the Lennard-Jones potential, rather than experimental constants and is
concerned with the overall molecule rather than a single substituent. The first 3-D QSAR was
named Comparative Molecular Field Analysis (CoMFA) by Cramer et al. It examined the steric
fields (shape of the molecule) and the electrostatic fields which were correlated by means
of partial least squares regression (PLS).
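To show what a computed steric potential looks like, the sketch below evaluates the 12-6 Lennard-Jones energy between probe points and atom positions. It is not the CoMFA procedure itself: the single shared epsilon and sigma, the two-atom "molecule" and the probe coordinates are hypothetical, and the sketch does not handle a probe placed exactly on an atom (r = 0).

```python
def lennard_jones(r: float, epsilon: float, sigma: float) -> float:
    """12-6 Lennard-Jones energy: 4 * eps * ((sigma/r)**12 - (sigma/r)**6)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)

def steric_field(probe_points, atoms, epsilon=0.1, sigma=3.4):
    """Sum the LJ energy between each probe point and every atom position."""
    field = []
    for px, py, pz in probe_points:
        energy = 0.0
        for ax, ay, az in atoms:
            r = ((px - ax) ** 2 + (py - ay) ** 2 + (pz - az) ** 2) ** 0.5
            energy += lennard_jones(r, epsilon, sigma)
        field.append(energy)
    return field

# Hypothetical two-atom molecule and three probe positions (angstroms).
atoms = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
probes = [(4.0, 0.0, 0.0), (6.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
print(steric_field(probes, atoms))
```

Each probe point yields one number per molecule, so a full grid produces a very large number of field values per compound, which is why a method such as PLS is used to correlate the fields with activity.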
The created data space is then usually reduced by a subsequent feature extraction step. An alternative
approach uses multiple-instance learning by encoding molecules as sets of data instances, each of
which represents a possible molecular conformation. A label or response is assigned to each set
corresponding to the activity of the molecule, which is assumed to be determined by at least one
instance in the set (i.e. some conformation of the molecule).
As of June 18, 2011, the Comparative Molecular Field Analysis (CoMFA) patent no longer places
any restriction on the use of GRID and partial least-squares (PLS) technologies.
3. Chemical descriptor based QSAR
In this approach, descriptors quantifying various electronic, geometric, or steric properties of a
molecule are computed and used to develop a QSAR. This approach is different from the
fragment (or group contribution) approach in that the descriptors are computed for the system as
a whole rather than from the properties of individual fragments. This approach is different from
the 3D-QSAR approach in that the descriptors are computed from scalar quantities (e.g.,
energies, geometric parameters) rather than from 3D fields. An example of this approach is the
QSARs developed for olefin polymerization by half-sandwich compounds.
Modeling in QSAR
The literature often shows that chemists have a preference for partial least squares (PLS)
methods, since these apply feature extraction and induction in one step.
Data mining approach
Computer QSAR models typically calculate a relatively large number of features. Because these
features are not directly interpretable in structural terms, the preprocessing steps face a
feature selection problem (i.e., which structural features should be interpreted to determine
the structure-activity relationship).
Feature selection can be accomplished by visual inspection (qualitative selection by a human);
by data mining; or by molecule mining.
A typical data mining based prediction uses, for example, support vector machines, decision
trees or artificial neural networks to induce a predictive learning model.
Molecule mining approaches, a special case of structured data mining approaches, apply a
similarity matrix based prediction or an automatic fragmentation scheme into molecular
substructures. There are also approaches that use maximum common subgraph searches or
graph kernels.
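A similarity-based prediction of the kind mentioned above can be sketched with the Tanimoto coefficient on sets of structural features. The choice of Tanimoto similarity, the nearest-neighbour averaging, and the fragment sets and activities below are assumptions made for this example rather than a description of any particular molecule mining tool.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto (Jaccard) similarity between two feature sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def predict_by_similarity(query: set, training: list, k: int = 2) -> float:
    """Average the activities of the k training molecules most similar to query."""
    ranked = sorted(training, key=lambda item: tanimoto(query, item[0]), reverse=True)
    nearest = ranked[:k]
    return sum(activity for _, activity in nearest) / len(nearest)

# Hypothetical training set: (set of structural fragments, measured activity).
training_set = [
    ({"benzene", "OH"}, 6.0),
    ({"benzene", "OH", "CH3"}, 6.4),
    ({"COOH", "CH3"}, 4.1),
]
query = {"benzene", "OH", "Cl"}

print(predict_by_similarity(query, training_set, k=2))
```

The query shares fragments only with the first two training molecules, so the prediction is the mean of their activities. This directly applies the similarity principle: structurally similar compounds are expected to have similar activities.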
Matched molecular pair analysis
QSAR models derived from non-linear machine learning are typically seen as a "black box"
that fails to guide medicinal chemists. A relatively new concept is matched molecular pair
analysis (MMPA), or prediction-driven MMPA, which is coupled with a QSAR model in order
to identify activity cliffs.
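The bookkeeping behind matched molecular pair analysis can be illustrated as follows, taking a matched pair to be two molecules that differ by one small, well-defined structural change. The transformation labels, activity values and the cliff threshold of one log unit are hypothetical, and the sketch assumes the pairs have already been identified.

```python
from collections import defaultdict

# Hypothetical matched pairs: (transformation, activity of A, activity of B).
pairs = [
    ("H>>F", 5.0, 5.4),
    ("H>>F", 6.1, 6.3),
    ("H>>OH", 5.5, 4.5),
    ("H>>OH", 6.0, 5.4),
]

def mean_change_per_transformation(matched_pairs):
    """Average activity change (B minus A) for each structural transformation."""
    deltas = defaultdict(list)
    for transformation, act_a, act_b in matched_pairs:
        deltas[transformation].append(act_b - act_a)
    return {t: sum(v) / len(v) for t, v in deltas.items()}

def activity_cliffs(matched_pairs, min_jump=1.0):
    """Pairs whose small structural change causes a large activity change."""
    return [p for p in matched_pairs if abs(p[2] - p[1]) >= min_jump]

print(mean_change_per_transformation(pairs))
print(activity_cliffs(pairs))
```

Unlike a black-box model, the output is directly readable by a chemist: each transformation is associated with an average effect on activity, and unusually large effects are flagged for closer inspection.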
Evaluation of the quality of QSAR models
QSAR modeling produces predictive models derived from application of statistical tools
correlating biological activity (including desirable therapeutic effect and undesirable side effects)
or physico-chemical properties in QSPR models of chemicals (drugs/toxicants/environmental
pollutants) with descriptors representative of molecular structure or properties. QSARs are being
applied in many disciplines, for example: risk assessment, toxicity prediction, and regulatory
decisions in addition to drug discovery and lead optimization. Obtaining a good quality QSAR
model depends on many factors, such as the quality of input data, the choice of descriptors and
statistical methods for modeling and for validation. Any QSAR modeling should ultimately lead
to statistically robust and predictive models capable of making accurate and reliable predictions
of the modeled response of new compounds.
For validation of QSAR models, usually various strategies are adopted:
❖ Internal validation or cross-validation (cross-validation is a measure of model robustness:
the more robust a model is (higher q²), the less the removal of data perturbs the original
model);
❖ External validation by splitting the available data set into training set for model development and
prediction set for model predictivity check;
❖ Blind external validation by application of model on new external data and
❖ Data randomization or Y-scrambling for verifying the absence of chance correlation between the
response and the modeling descriptors.
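Of the strategies listed above, Y-scrambling is the easiest to sketch: the responses are shuffled many times and the fit quality of each scrambled data set is compared with that of the real one. The one-descriptor R², the 100 rounds, the seed and the data are assumptions of this example.

```python
import random

def r_squared(x, y):
    """Squared Pearson correlation, i.e. R^2 of a one-descriptor linear fit."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)

def y_scramble(x, y, n_rounds=100, seed=0):
    """Return (R^2 on real data, mean R^2 over randomly shuffled responses)."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_rounds):
        shuffled = list(y)
        rng.shuffle(shuffled)  # break any real link between x and y
        scores.append(r_squared(x, shuffled))
    return r_squared(x, y), sum(scores) / len(scores)

# Hypothetical descriptor values and responses for ten molecules.
descriptor = [float(i) for i in range(10)]
response = [0.1, 2.2, 3.9, 6.1, 8.0, 9.8, 12.1, 14.0, 16.2, 17.9]

real, scrambled = y_scramble(descriptor, response)
print(f"real R2 = {real:.3f}, mean scrambled R2 = {scrambled:.3f}")
```

A model whose real R² is not clearly higher than the scrambled values is likely to reflect a chance correlation rather than a genuine structure-activity relationship.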
The success of any QSAR model depends on accuracy of the input data, selection of appropriate
descriptors and statistical tools, and most importantly validation of the developed model.
Validation is the process by which the reliability and relevance of a procedure are established for
a specific purpose; for QSAR models validation must be mainly for robustness, prediction
performances and applicability domain (AD) of the models.
Some validation methodologies can be problematic. For example, leave-one-out cross-validation
generally leads to an overestimation of predictive capacity. Even with external validation, it is
difficult to determine whether the selection of training and test sets was manipulated to maximize
the predictive capacity of the model being published.
Different aspects of validation of QSAR models that need attention include methods of selection
of training set compounds, setting training set size and impact of variable selection for training
set models for determining the quality of prediction. Development of novel validation parameters
for judging quality of QSAR models is also important.
Computational resources for QSAR
The main objective of QSAR models is to allow the prediction of biological activities of untested
or novel compounds and to provide insight into the relevant and consistent chemical properties or
descriptors (2D/3D) that define the biological activity. Once a series of predictive models has
been collected, these can be used for database mining to identify novel chemical compounds,
particularly those having drug-like properties (following Lipinski's Rule of Five) along with
suitable pharmacokinetic properties.
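A drug-likeness filter based on Lipinski's Rule of Five can be sketched as below. The four thresholds are the standard ones; allowing at most one violation is the commonly quoted reading of the rule and is exposed as a parameter. The candidate compounds and their property values are hypothetical, and the properties are assumed to have been computed beforehand.

```python
def lipinski_violations(mol_weight, logp, h_donors, h_acceptors):
    """Count violations of Lipinski's Rule of Five."""
    return sum([
        mol_weight > 500,   # molecular weight above 500 Da
        logp > 5,           # octanol-water log P above 5
        h_donors > 5,       # more than 5 hydrogen-bond donors
        h_acceptors > 10,   # more than 10 hydrogen-bond acceptors
    ])

def is_drug_like(props, max_violations=1):
    """True if the compound breaks no more than `max_violations` rules."""
    return lipinski_violations(**props) <= max_violations

# Hypothetical precomputed properties for two database hits.
candidates = {
    "cand_1": {"mol_weight": 350.0, "logp": 2.1, "h_donors": 2, "h_acceptors": 5},
    "cand_2": {"mol_weight": 720.0, "logp": 6.3, "h_donors": 3, "h_acceptors": 12},
}

drug_like = [name for name, props in candidates.items() if is_drug_like(props)]
print(drug_like)
```

In a database mining workflow such a filter would typically be applied alongside the QSAR activity predictions, so that only compounds that are both predicted active and drug-like are carried forward.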
Figure: Software used for computing chemical descriptors, and the different types of descriptors.
Web Servers/Databases/Mirror Sites
Chemical Libraries
Name: Description
ZINC: A free database of commercially available compounds for virtual screening.
NCI: 250251 open structures ready for searching.
Web interface on Libraries
Name: Description
RCDK: Allows the user to load molecules, evaluate fingerprints, calculate molecular descriptors and view structures in 2D.
JoeLib: A cheminformatics algorithm library designed for prototyping, data mining, graph mining, and algorithm development.
Molecular Modelling: Open-source program library for molecular simulation applications.
Standalone software
Structure Drawing
Name: Description
ACD/ChemSketch 11.0 Freeware: Draws chemical structures including organics, organometallics, polymers, and Markush structures.
ACD/3D Viewer: 3D viewing tool for use with both ISIS Draw and ISIS Base.
Biomer: Java-based online biomolecular modelling package.
MOLEKEL: An open-source (GPL) multiplatform molecular visualization program.
Descriptor Calculation
Name: Description
ACD/LogP Freeware: Fragment-based algorithm for log P prediction.
Steric: A program to calculate molecular steric effects.
DALTON: Program for ab initio calculation of molecular properties.
Chemoinformatics Kits and Optimizations
Name: Description
Ghemical: Computational chemistry software package released under the GNU GPL.
gOpenMol: Tool for the visualization and analysis of molecular structures and their chemical properties.
LIGPLOT: Automatically plots protein-ligand interactions.
VIEWMOL: Graphical front end for some quantum chemical and molecular modeling programs.
VEGA: Molecular software package.
TINKER: Software tools for molecular design.
MolPOV: A graphics file converter.
MOLMOL: A molecular graphics program.
MOPAC: A semiempirical quantum chemistry program based on Dewar and Thiel's NDDO approximation.
MEQI: Molecular scaffold analysis.
ASC: Analytic Surface Calculation package for PDBs.
Babel: File format converter.
POWERMV: A software environment for statistical analysis, molecular viewing, descriptor generation, and similarity search.
Links
Chemical Database With Information
ChemIDPlus; ACD (Available Chemicals Directory); Bionet Database; CAP (Chemicals Available for
Purchase); ChEBI; ASINEX; Maybridge Database; Zinc; ChemBank; ChemDB; ChemStar; CSD (Cambridge
Structural Database); HIC-Up (Hetero-compound Information Centre - Uppsala); IBS
(InterBioScreen) Database; MDPI (Molecular Diversity Preservation International); MSDchem; NCI
Database; PDBsum; PubChem; Relibase; SPRESI; GLIDA; ChemBioGRID; Ched; ECOTOX; QueryCHEM
Drug Database
BIOSTER; CMC (Comprehensive Medicinal Chemistry); Dictionary of Drug; DrugBank; MDDR (MDL Drug
Data Report); MedChem; PharmGKB; Rx-list; SDD (Super Drug Database); WDI (World Drug Index);
WOMBAT; AID/DRUGS
QSAR / in silico Tools
VEGA platform
Using the VEGA platform, you can access a series of QSAR (quantitative structure-activity
relationship) models for regulatory purposes, or develop your own model for research purposes.
CAESAR software (version 2)
This is the standalone version of the CAESAR software (version 1). Most of the new features
that will be present in the new CAESAR v2.0 software have been integrated in the stand-alone
models for Developmental Toxicity and Mutagenicity.
CAESAR software (version 1)
The CAESAR Application is a JAVA™ web application that allows the access to all the toxicity
predictive models developed within the CAESAR Project.
DEMETRA
DEMETRA is an EU-funded project. Its aim has been to develop predictive models and
software which give a quantitative prediction of the toxicity of a molecule, in particular
molecules of pesticides, candidate pesticides, and their derivatives. The input is the chemical
structure of the compound, and the software algorithms use “Quantitative Structure-Activity
Relationships” (QSARs). The DEMETRA software tool can be used for toxicity prediction of
molecules of pesticides and related compounds.
T.E.S.T
The Toxicity Estimation Software Tool (T.E.S.T.) enables users to easily estimate acute toxicity
using QSAR methodologies.
Toxtree
Toxtree is a full-featured and flexible user-friendly open source application, which can estimate
toxic hazard by applying a decision tree approach. Toxtree could be applied to datasets from
various compatible file types. User-defined molecular structures are also supported – they could
be entered by SMILES, or by using the built-in 2D structure diagram editor.
OCHEM
OCHEM is an online database of experimental measurements integrated with a modeling
environment. Submit your experimental data or use the data uploaded by other users to build
predictive QSAR models for physical-chemical or biological properties.
Chemistry Development Kit (CDK)
The Chemistry Development Kit (CDK) is a Java library for structural chemo- and
bioinformatics. It is now developed by more than 50 developers all over the world and used in
more than 10 different academic as well as industrial projects world wide.
Weka
Weka is a collection of machine learning algorithms for data mining tasks. The algorithms can
either be applied directly to a dataset or called from your own Java code. Weka contains tools for
data pre-processing, classification, regression, clustering, association rules, and visualization.
AMBIT
AMBIT is software for chemoinformatic data management. It is an outcome of the LRI
project ‘Building blocks for a future (Q)SAR ((Quantitative) Structure Activity Relationship)
decision support system’.
ChemAxon Marvin
Marvin is a collection of tools for drawing, displaying and characterizing chemical structures,
queries, macromolecules and reactions.
Sybyl-X
Whether you need to find new lead candidates, optimize lead series, or perform other related life
science experiments like modeling a protein structure, SYBYL-X has solutions to move your
discovery research forward. With capabilities for small molecule modeling and simulation,
macromolecular modeling and simulation, cheminformatics, lead identification, and lead
optimization, all wrapped up in an easy to use, cost-effective interface, SYBYL-X has the tools
and capabilities you need for molecular design.
Discovery Studio
Discovery Studio is a software suite of life science molecular design solutions for computational
chemists and computational biologists. Discovery Studio makes it easier to examine the
properties of large and small molecules, study systems, identify leads and optimize candidates.
ADF: Amsterdam Density Functional software
ADF has a 30-year track record as a premium-quality quantum chemistry software package
based on Density Functional Theory (DFT). It consists of
❖ the molecular DFT program ADF
❖ the periodic DFT program BAND
❖ the post-ADF COSMO-RS program for thermodynamics of liquids
❖ The ReaxFF program for modeling chemical reactions
AMBER: Assisted Model Building with Energy Refinement
“Amber” refers to two things: a set of molecular mechanical force fields for the simulation of
biomolecules (which are in the public domain, and are used in a variety of simulation programs);
and a package of molecular simulation programs which includes source code and demos. The
current version of the code is Amber version 11, which is distributed by UCSF.
ChemDraw Ultra
ChemDraw Ultra 12 is an outstanding package for chemical structure drawing and editing, and
for preparing chemical schemes. Graphics prepared in ChemDraw Ultra have a professional
and neat appearance.
Gaussian 09: Expanding the limits of computational chemistry
Gaussian 09 is the latest version of the Gaussian series of electronic structure programs. Starting
from the fundamental laws of quantum mechanics, Gaussian 09 predicts the energies, molecular
structures, vibrational frequencies and molecular properties of molecules and reactions in a wide
variety of chemical environments. Gaussian 09’s models can be applied to both stable species
and compounds which are difficult or impossible to observe experimentally (e.g., short-lived
intermediates and transition structures).
Molecular Operating Environment
MOE provides a suite of applications for manipulating and analyzing large collections of
compounds, building property models, consensus models and SD pipeline command line tools.
MolProp Tk
The MolProp TK provides a customizable framework for molecular property calculation geared
towards enabling rapid database filtering. Filtering attempts to eliminate inappropriate or
undesirable compounds from a large set before beginning to use them in modelling studies. The
goal is to remove all of the compounds that should not be suggested to a medicinal chemist as a
potential hit.
ADMEWORKS ModelBuilder
ADMEWORKS ModelBuilder is a tool dedicated for building QSAR/QSPR models that can
later be used for predicting various chemical and biological properties of compounds. Two
classes of models (Qualitative and Quantitative) can be built using various algorithms. The
models are based on values of physicochemical, topological, geometrical, and electronic
properties derived from the molecular structure.
KnowItAll
Bio-Rad’s award-winning KnowItAll Informatics System offers fully integrated software and/or
database desktop solutions that provide scientific researchers multiple tools such as database
building, management, mining/search, analysis, prediction, structure drawing, and reporting, all
within a single user interface.
MetaDrug™
MetaDrug is a unique systems pharmacology platform designed for evaluation of biological
effects of small molecule compounds on the human body, with pathway analysis and other
bioinformatics applications from toxicogenomics to translational medicine.
Noraymet ADME
ADME predictive software: the software includes a selection of INVIVE models (in vivo-in vitro
extrapolation) that allow the in vivo pharmacokinetics to be predicted from in vitro results.
PreADME
Around half of all drugs in clinical development fail to commercialize because of poor ADME
and toxicity properties. There is increasing interest in the early prediction of ADME properties,
in order to increase the success rate of compounds reaching development. The ADME predictions
produced by PreADME can be used as practical guidance in early drug discovery.
Sarchitect
Sarchitect Designer is the model building edition of Sarchitect. The product vision is to enable
building “best possible” models. Sarchitect Miner is the interface between the model builders
and the model users. Miner enables the use of models by
❖ medicinal chemists to profile and optimize their compounds in silico, and
❖ DMPK groups to rank compounds for prioritization in their studies
Volsurf+
The pharmacokinetic behaviour of compounds is linked to their efficacy and thus is critical for
drug discovery. Understanding how to optimise compounds according to multiple simultaneous
criteria is a great advantage in focusing design efforts. VolSurf+ creates 128 molecular
descriptors from 3D Molecular Interaction Fields (MIFs) produced by our software GRID, which
are particularly relevant to ADME prediction and are also simple to interpret. One example
would be the interaction energy moment descriptor between hydrophobic and hydrophilic
regions, which is important for membrane permeability prediction. These can then be used with
provided chemometric tools to build statistical models.
Bio-Loom
BioByte's Bio-Loom program combines several different sources of data into a cohesive
whole. Bio-Loom calculates hydrophobic and molar refractivity parameters
via CLOGP and CMR, calculations which have been widely used standards for decades, and it
can also access BioByte's entire Thor Masterfile database, which includes over 60,000
measured log P and log D values (in many solvent systems), as well as 14,000 pKa values, including
associated references.
MolCode Toolbox
The modules of the MolCode Toolbox software predict a wide range of experimentally unknown
property values of compounds, including physicochemical, biological, ADME-Tox,
ecological pathways/ecotoxicity and adverse drug effects. The modules
consist of internally encoded computational models built on various datasets related to
the aforementioned properties. Each model in MolCode Toolbox is specified by the model
name, the CAS number (not obligatory) and systematic name of the compounds, the (bio)assay,
the property/activity value and unit, and a reference.
TerraQSAR
All TerraQSAR programs use proprietary neural networks for the computation of biological
effects and physico-chemical properties of defined chemicals. The input is in the form of 2D/3D
SMILES strings.
ADRIANA.Code
ADRIANA.Code comprises a unique combination of methods for calculating molecular structure
descriptors on a sound geometric and physicochemical basis. These descriptors can be used for a
wide range of applications in all areas of chemistry.
Applications of QSAR
1. Chemical applications of QSAR
❖ One of the first historical QSAR applications was to predict boiling points.
❖ It is well known that within a particular family of chemical compounds,
especially in organic chemistry, there are strong correlations between structure and
observed properties. A simple example is the relationship between the number of carbons
in alkanes and their boiling points. The boiling point rises steadily
with the number of carbons, and this trend serves as a means of predicting the
boiling points of higher alkanes.
❖ Other applications that remain of interest are the Hammett equation, the Taft equation and pKa
prediction methods.
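The alkane example can be sketched as a one-descriptor model: a straight line fitted to boiling point against carbon count. The snippet below is only an illustration, not a method taken from this assignment. The boiling points are approximate literature values for n-butane to n-octane, and the function names are made up for the example.

```python
# Approximate normal boiling points (deg C) of n-alkanes, keyed by carbon count.
# These values are approximate literature figures used only for illustration.
BOILING_POINTS_C = {4: -0.5, 5: 36.1, 6: 68.7, 7: 98.4, 8: 125.6}


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


carbons = sorted(BOILING_POINTS_C)
slope, intercept = fit_line(carbons, [BOILING_POINTS_C[c] for c in carbons])


def predict_bp(n_carbons):
    """Predict the boiling point (deg C) of an n-alkane from its carbon count."""
    return slope * n_carbons + intercept


if __name__ == "__main__":
    print(f"slope = {slope:.2f} deg C per carbon, intercept = {intercept:.2f} deg C")
    print(f"predicted boiling point of nonane (C9): {predict_bp(9):.1f} deg C")
```

The fitted line predicts a somewhat higher value for nonane than the experimental boiling point of about 151 °C, because the real trend flattens as the chain grows. This is a small reminder that extrapolating beyond the training compounds is less reliable than interpolating.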
2. Biological applications of QSAR
❖ The biological activity of molecules is usually measured in assays to establish the level of
inhibition of particular signal transduction or metabolic pathways. Drug discovery often
involves the use of QSAR to identify chemical structures that could have good inhibitory
effects on specific targets and have low toxicity (non-specific activity). Of special interest
is the prediction of partition coefficient log P, which is an important measure used in
identifying "druglikeness" according to Lipinski's Rule of Five.
❖ While many quantitative structure activity relationship analyses involve the interactions
of a family of molecules with an enzyme or receptor binding site, QSAR can also be used
to study the interactions between the structural domains of proteins. Protein-protein
interactions can be quantitatively analyzed for structural variations resulting from
site-directed mutagenesis.
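❖ QSAR modelling can be treated as a machine learning problem. Careful model building reduces
the risk of a SAR paradox (structurally similar molecules do not always have similar activities),
especially given that only a finite amount of data is available. In general, all QSAR
problems can be divided into coding (representing structures as descriptors) and learning (fitting
the model).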
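Lipinski's Rule of Five, mentioned above, can be written as a simple descriptor filter. The sketch below assumes the descriptors have already been calculated by some other tool. The class and function names are hypothetical, the "at most one violation" cut-off is a commonly used convention rather than something stated in this text, and the two example compounds use approximate or invented values.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Pre-computed descriptors for one compound (values supplied by the user)."""
    mol_weight: float      # molecular weight in Da
    log_p: float           # octanol-water partition coefficient
    h_bond_donors: int     # number of hydrogen-bond donors
    h_bond_acceptors: int  # number of hydrogen-bond acceptors


def rule_of_five_violations(d):
    """Return the list of Rule of Five criteria that the compound violates."""
    violations = []
    if d.mol_weight > 500:
        violations.append("molecular weight > 500")
    if d.log_p > 5:
        violations.append("log P > 5")
    if d.h_bond_donors > 5:
        violations.append("H-bond donors > 5")
    if d.h_bond_acceptors > 10:
        violations.append("H-bond acceptors > 10")
    return violations


def is_druglike(d, max_violations=1):
    """Assumed convention: tolerate at most one violation."""
    return len(rule_of_five_violations(d)) <= max_violations


# Approximate values for a small aspirin-like molecule (illustrative only).
small_molecule = Descriptors(mol_weight=180.2, log_p=1.2, h_bond_donors=1, h_bond_acceptors=4)
# Invented values for a large, lipophilic compound.
large_lipophilic = Descriptors(mol_weight=650.0, log_p=6.2, h_bond_donors=2, h_bond_acceptors=8)

if __name__ == "__main__":
    for name, d in [("small_molecule", small_molecule), ("large_lipophilic", large_lipophilic)]:
        print(name, is_druglike(d), rule_of_five_violations(d))
```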
3. For risk management
❖ (Q)SAR models have been used for risk management. QSARs are suggested by
regulatory authorities; in the European Union, for example, they are suggested by
the REACH regulation, where "REACH" abbreviates "Registration, Evaluation,
Authorisation and Restriction of Chemicals". Regulatory application of QSAR methods
includes in silico toxicological assessment of genotoxic impurities. Commonly used
QSAR assessment software such as DEREK or MCASE is used to assess the genotoxicity of
impurities according to ICH M7.
❖ The chemical descriptor space whose convex hull is generated by a particular training set
of chemicals is called the training set's applicability domain. Prediction of properties of
novel chemicals that are located outside the applicability domain uses extrapolation, and
so is less reliable (on average) than prediction within the applicability domain. The
assessment of the reliability of QSAR predictions remains a research topic.
❖ The QSAR equations can be used to predict biological activities of newer molecules
before their synthesis.
❖ QSAR has been applied extensively and successfully over several decades to find predictive
models for the activity of bioactive agents. It has also been applied to areas related to the discovery and
subsequent development of bioactive agents: distinguishing drug-like from non-drug-like
molecules [87], drug resistance [88], toxicity prediction [89-94], physicochemical property prediction
(e.g. water solubility, lipophilicity) [95], gastrointestinal absorption [96], activity of peptides [97], data
mining [98], drug metabolism [99], and prediction of other pharmacokinetic and ADME
properties [100, 101]. Recent reviews [102-112] summarise work in a number of these areas, and a
book [78] has summarised the application of neural networks to combinatorial discovery. The
journal Quantitative Structure-Activity Relationships contains abstracts of QSAR studies in
other journals in each issue.
It is clear that the number of potential applications for structure–property modelling, in the most general
case, is extensive and growing daily. Improved molecular descriptors, based on a better understanding
of which molecular attributes are most important for a given property being modelled, and increasing
use of genetic and artificial intelligence methods will raise QSAR to even greater levels of usefulness
than the current high level. A basic understanding of QSAR concepts is essential for most people,
across a diverse range of skills, who design molecules.
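The applicability-domain idea described in this section can be illustrated with a deliberately simplified check. The text defines the domain through the convex hull of the training set; the sketch below swaps in a per-descriptor range (bounding box) test instead. That test is easier to compute but more permissive, since the box always contains the hull. The training values and function names are hypothetical.

```python
def descriptor_ranges(training_set):
    """Return the (min, max) range of each descriptor across the training set."""
    columns = list(zip(*training_set))
    return [(min(col), max(col)) for col in columns]


def in_domain(compound, ranges):
    """True if every descriptor of the compound lies within the training range."""
    return all(low <= value <= high for value, (low, high) in zip(compound, ranges))


# Hypothetical training set: (log P, molecular weight) for four compounds.
TRAINING = [(1.2, 180.0), (2.5, 250.0), (3.1, 310.0), (0.8, 150.0)]
RANGES = descriptor_ranges(TRAINING)

if __name__ == "__main__":
    for candidate in [(2.0, 200.0), (4.5, 200.0)]:
        status = "inside" if in_domain(candidate, RANGES) else "outside"
        print(candidate, "is", status, "the bounding-box domain")
```

A prediction for a compound flagged as outside the box is an extrapolation and should be treated with more caution. A compound inside the box may still lie outside the convex hull, so this check is a first screen only.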
In the field of drug design
1. Information from the intercept values
2. Importance of the log P0 concept
3. Bioisosterism
4. Enzyme Inhibition
5. Information on receptor site
6. Importance in Drug Research
1. Information from the intercept values
The intercept represents the activity of the unsubstituted compound. The activity increases or
decreases depending on the substitution, which is reflected in the slope (regression coefficient).
If the intercept is very high and the slope is low, the basic nucleus or parent compound has high activity.
The intercept is therefore a measure of intrinsic activity.
When two series have similar slopes (regression coefficients) but different intercepts, the series
with the higher intercept is intrinsically more active.
Example 1: Alcohols and carbamates inhibiting bacterial luminescence. The slopes are similar, the
intercepts differ, and the carbamates are more active.
Example 2: Esters and alkyl carbamates and their narcotic action on tadpoles. Carbamates are
more active than the esters.
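The role of the intercept can be made concrete with a short calculation. The sketch assumes a linear Hansch-type relation, log(1/C) = slope × log P + intercept, for two series that share the same slope. All coefficients are invented for illustration and do not come from the studies cited above.

```python
def log_inv_c(log_p, slope, intercept):
    """Activity as log(1/C) from an assumed linear Hansch-type equation."""
    return slope * log_p + intercept


def potency_ratio(intercept_high, intercept_low):
    """Fold difference in the concentration needed for the same effect at equal log P.

    Valid only when both series share the same slope, so that the gap in
    log(1/C) equals the gap between the intercepts.
    """
    return 10 ** (intercept_high - intercept_low)


SLOPE = 0.9          # hypothetical common slope
INTERCEPT_A = 1.0    # hypothetical intercept of the less active series
INTERCEPT_B = 1.5    # hypothetical intercept of the more active series

if __name__ == "__main__":
    gap = log_inv_c(2.0, SLOPE, INTERCEPT_B) - log_inv_c(2.0, SLOPE, INTERCEPT_A)
    print(f"gap in log(1/C) at log P = 2.0: {gap:.2f}")
    print(f"series B is about {potency_ratio(INTERCEPT_B, INTERCEPT_A):.1f} times more potent")
```

With equal slopes the gap is the same at every log P, which is why the intercept alone can be read as a measure of the intrinsic activity of the series.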
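2. Importance of the log P0 concept
Log P0 is the optimum log P, that is, the lipophilicity at which activity within a series is at its maximum.
Example 1: Analgesic activity of hydroxycodeinone esters.
Example 2: Anticonvulsant activity (barbiturates, benzodiazepines, etc.).
The log P value of many CNS drugs was around 2.0.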
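The log P0 concept can be sketched with the parabolic form often used in Hansch analysis, log(1/C) = -a(log P)² + b(log P) + c, whose maximum lies at log P0 = b / (2a). The coefficients below are invented and were chosen so that the optimum falls at 2.0, matching the value quoted above for many CNS drugs. They are not fitted to any real data set.

```python
def log_inv_c(log_p, a, b, c):
    """Activity from an assumed parabolic model: -a*(logP)^2 + b*logP + c."""
    return -a * log_p ** 2 + b * log_p + c


def optimum_log_p(a, b):
    """Return log P0, the log P at which the parabolic model peaks (requires a > 0)."""
    if a <= 0:
        raise ValueError("a must be positive for the parabola to have a maximum")
    return b / (2 * a)


A, B, C = 0.25, 1.0, 2.0  # hypothetical regression coefficients
LOG_P0 = optimum_log_p(A, B)

if __name__ == "__main__":
    print(f"log P0 = {LOG_P0:.2f}")
    for log_p in (1.0, 2.0, 3.0):
        print(f"log P = {log_p:.1f} -> log(1/C) = {log_inv_c(log_p, A, B, C):.2f}")
```

Compounds that are either less or more lipophilic than log P0 are predicted to be less active, which is the practical reason for estimating the optimum before synthesising new analogues.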
3. Bioisosterism
Bioisosterism is the replacement of one functional group with another that has similar properties,
both qualitatively and quantitatively. An example is the discovery of cyanoguanidine as the
bioisostere of thiourea in the development of H2 antagonists.
Example 1: A guanidine isostere, in which C=S is replaced with C=NH, resulted in increased
basicity and reduced activity. To decrease the basicity, an electron-withdrawing group such as -NO2
or -CN was introduced into the guanidino group.
The cyanoguanidine group, -NH(C=NCN)-NH-, is an ideal isostere for thiourea, -NH(C=S)-NHR, with
reduced basicity.
Thiourea is replaced with the -NH(C=CHNO2)-NHR group.
4. Enzyme Inhibition
Dihydrofolate reductase (DHFR) is the most extensively investigated enzyme. DHFR inhibitors are
therapeutically important as highly selective antibacterial, antimalarial and antitumor agents.
Replacement of one methoxy group of trimethoprim by an acidic side chain (a carboxylate group)
increased inhibitory activity, but selectivity and membrane permeability decreased
significantly. Replacement by a sulfonyl group: Mycobacterium lufu DHFR.
5. Information on receptor site
Example: inhibition of dihydrofolate reductase (DHFR) from bovine liver and from E. coli by
benzylpyrimidines (trimethoprim type).
Inhibition of the mammalian enzyme depends on hydrophobicity, whereas inhibition of the bacterial
enzyme depends on the bulk of the substituent.
QSAR on quinazolines: the hydrophobic parameter for the substituent at the 5-position has a large
positive coefficient, indicating a hydrophobic pocket in both mammalian and bacterial DHFR.
Triazines: the hydrophobic pocket is larger in the bacterial enzyme than in the mammalian enzyme.
6. Importance in Drug Research
QSAR has correctly predicted the activity of a large number of compounds before their synthesis.
Example: toxicity of the anticancer drug colchicine.
I is the indicator variable for the presence of the -COCH3 group at the 10-position.
Conclusion
From the above discussion it is concluded that QSAR is a good predictive tool for
investigating drug activity or binding mode at specific receptors. The descriptors that show the best
correlation in an investigation give information about the important functional groups in the
structures of the tested compounds. So, by changing some groups in the structure of a drug, we can
improve its pharmacological activity or physicochemical properties. In general,
experimental determinations are very expensive, and QSAR studies allow a reduction of this
cost. QSAR is basically used to relate biological activities to various properties associated with
the structures, which helps to explain how structural features in a drug molecule influence
its biological activity. QSAR methods can be used to build models that predict properties
or activities of organic compounds. However, an effective way to encode the structures with
calculated molecular structure descriptors is required for the development of accurate models. The
descriptors incorporated during model development provide an opportunity to focus on the specific
features that account for the property or activity of interest in the compounds. QSAR should not
replace experimental values, but it is a useful predictive tool, particularly when no experimental
data are available.

GFP in rDNA Technology (Biotechnology).pptxGFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptx
 
Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...
 
Scheme-of-Work-Science-Stage-4 cambridge science.docx
Scheme-of-Work-Science-Stage-4 cambridge science.docxScheme-of-Work-Science-Stage-4 cambridge science.docx
Scheme-of-Work-Science-Stage-4 cambridge science.docx
 
Animal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxAnimal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptx
 
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfBehavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
 
Orientation, design and principles of polyhouse
Orientation, design and principles of polyhouseOrientation, design and principles of polyhouse
Orientation, design and principles of polyhouse
 
Luciferase in rDNA technology (biotechnology).pptx
Luciferase in rDNA technology (biotechnology).pptxLuciferase in rDNA technology (biotechnology).pptx
Luciferase in rDNA technology (biotechnology).pptx
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)
 
Physiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptxPhysiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptx
 

QSAR quantitative structure activity relationship

  • 5. Quantitative structure–activity relationship models: Quantitative structure–activity relationship models (QSAR models) are regression or classification models used in the chemical and biological sciences and engineering. Like other regression models, QSAR regression models relate a set of "predictor" variables (X) to the potency of the response variable (Y), while classification QSAR models relate the predictor variables to a categorical value of the response variable. In QSAR modeling, the predictors consist of physico-chemical properties or theoretical molecular descriptors of chemicals; the QSAR response variable could be a biological activity of the chemicals. QSAR models first summarize a supposed relationship between chemical structures and biological activity in a data set of chemicals. Second, QSAR models predict the activities of new chemicals. Related terms: Related terms include quantitative structure–property relationships (QSPR), in which a chemical property is modeled as the response variable. "Different properties or behaviors of chemical molecules have been investigated in the field of QSPR. Some examples are quantitative structure–reactivity relationships (QSRRs), quantitative structure–chromatography relationships (QSCRs), quantitative structure–toxicity relationships (QSTRs), quantitative structure–electrochemistry relationships (QSERs), and quantitative structure–biodegradability relationships (QSBRs)." Example for demonstration: As an example, biological activity can be expressed quantitatively as the concentration of a substance required to give a certain biological response. Additionally, when physicochemical properties or structures are expressed by numbers, one can find a mathematical relationship, or quantitative structure–activity relationship, between the two. The mathematical expression, if carefully validated, can then be used to predict the modeled response of other chemical structures.
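The idea of expressing activity as a concentration can be sketched in a few lines of Python. The conversion to pIC50 (the negative base-10 logarithm of the molar IC50) is a common convention rather than something stated in the slides, and the compound names and IC50 values below are hypothetical.

```python
import math


def pic50_from_ic50_nm(ic50_nm: float) -> float:
    """Convert an IC50 given in nanomolar to pIC50 = -log10(IC50 in mol/L)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_nm * 1e-9)


# Hypothetical compounds and IC50 values (nM), for illustration only.
ic50_values_nm = {"compound_A": 10.0, "compound_B": 1000.0}

# A lower IC50 (more potent compound) gives a higher pIC50.
pic50_values = {name: pic50_from_ic50_nm(v) for name, v in ic50_values_nm.items()}
```

Working on a logarithmic scale keeps the response variable in a range that regression methods handle well, which is presumably why it is so widely used as the Y variable.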
  • 7. Principal steps of QSAR: The principal steps of QSAR are (i) selection of the data set and extraction of structural/empirical descriptors, (ii) variable selection, (iii) model construction and (iv) validation and evaluation. The QSAR method involves recognition that a molecule (organic, peptide, protein, etc.) is really a three-dimensional distribution of properties. The most important of these properties are steric (e.g., shape and volume), electronic (e.g., electric charge and electrostatic potential), and what are termed ‘lipophilic’ properties (how polar or non-polar the sections of the molecule are, usually exemplified by the log of the octanol–water partition coefficient, log P). The QSAR method (and analogously QSTR and QSPR) involves a number of key steps: 1. Converting molecular structures into mathematical descriptors that encapsulate the key properties of the molecules relevant to the activity or property being modelled. 2. Selecting the best descriptors from a larger set of accessible, relevant descriptors. 3. Mapping the molecular descriptors into the properties, preferably using a ‘model-free’ mapping system in which no assumptions are needed as to the functional form of the structure–activity relationship. These relationships are often complex, unknown and non-linear. 4. Validating the model to determine how predictive it is, and how well it will generalise to new molecules not in the data set used to generate the model (the training set). “Step 1- Generation of descriptors” There are myriad methods for generating molecular descriptors. Packages such as Dragon are able to generate over a thousand descriptors, while methods such as CoMFA generate many thousands. Molecular descriptors can be of diverse types.
  • 8. Molecular descriptors: Molecular descriptors are the final products of mathematical procedures that transform the chemical information encoded within a molecular structure into a numerical representation. The dimensionality of the molecular descriptors identifies the QSAR model type, as described below: 0D QSAR: These descriptors are derived from the molecular formula, e.g. molecular weight, number and type of atoms etc. 1D QSAR: A substructure list representation of a molecule can be considered a one-dimensional (1D) molecular representation and consists of a list of molecular fragments (e.g. functional groups, rings, bonds, substituents etc.). 2D QSAR: A molecular graph contains topological or two-dimensional (2D) information. It describes how the atoms are bonded in a molecule, both the type of bonding and the interaction of particular atoms (e.g. total path count, molecular connectivity indices etc.). 3D QSAR: These descriptors are calculated starting from a geometrical or 3D representation of a molecule. They include molecular surface, molecular volume and other geometrical properties. There are different types of 3D descriptors, e.g. electronic, steric, shape etc. 4D QSAR: Four-dimensional information is described in this type of model; the fourth dimension is an ensemble of conformations of each ligand. 5D QSAR: Five-dimensional information is described in this type of model; the fifth dimension is the possibility to represent an ensemble of up to six different induced-fit models. The descriptors fall into 4 classes: Topological, Geometrical, Electronic and Hybrid. Mainly we have: 1. Topological descriptors: Topological descriptors are graph invariants generated by applying the theorems of graph theory. Examples of topological descriptors are: atom counts, ring counts, molecular weight, weighted paths, molecular connectivity indices, substructure counts, molecular distance edge descriptors, kappa indices, electro-topological state indices, and some other invariants.
  • 9. 2. Electronic descriptors: Aspects of the structures related to the electrons are encoded by calculating electronic descriptors. Examples of electronic descriptors are: partial atomic charges, HOMO or LUMO energies, dipole moment. 3. Geometric descriptors: Geometric descriptors are used to encode the 3-D aspects of the molecular structure, such as moments of inertia, solvent-accessible surface area, length-to-breadth ratios, shadow areas, gravitational index. Descriptor selection: To build a good QSAR model, a minimal set of information-rich descriptors is required. The large number of possible indices creates several problems for the modeller: ❖ Many descriptors do not contain molecular information relevant to the problem. ❖ Many descriptors are linearly dependent (contain essentially the same information). ❖ Use of poor descriptors in QSAR yields poor and misleading models. ❖ Including too many descriptors in the model, even if they contain relevant information, can result in overfitting of the model and loss of the ability of the model to generalise to unseen molecules.
  • 10. ❖ Many methods of screening this large pool of potential descriptors for relevant ones can lead to chance correlations (correlations that arise by chance). “Step 2- Structure–activity mapping” Many methods have been used to map molecular descriptors to properties. The majority are regression methods, of which multiple linear regression was the first used. Regression methods attempt to fit a specific function with free parameters to a set of data. They usually do this using an optimisation criterion such as least squares, which finds the set of free parameters that minimises the sum of the squares of the errors between the measured values of the dependent variable and those calculated by the fitted function. Some QSAR problems have relatively linear response surfaces that can be modelled successfully by linear regression methods. “Step 3- Validation and testing” It is important to know how predictive a model is, once derived, to show whether a structure–property mapping method has overfitted the data, a neural net has overtrained or that chance
  • 11. correlations are present. Several methods have been developed to estimate the validity or predictivity of the derived structure–property model. The most common method is ‘leave-one-out’ cross-validation. This involves leaving each molecule out of the training set in turn, then creating a model using the remainder of the training set. The property of the omitted molecule is predicted using the model derived from all of the other molecules. This method is not a very rigorous test of the predictivity of the model and suffers from two other major deficiencies: the time to carry out the cross-validation increases as the square of the size of the training set; and the method produces n final models (each corresponding to one of the training set molecules being left out), so it is not clear which is the ‘best’ model. A better method is to remove a percentage of the training set into a test set. The structure–
  • 12. property model is derived using the reduced training set, and the properties of the test set are predicted using this model. This is a more rigorous test of the quality of the structure–property model but again suffers from problems: not all of the available data can be used to make the model, as some must be held back for the test set; and it is not clear how the test set is best selected from the training set, e.g. randomly or using cluster analysis. Figure: Overall study design of a QSAR-guided drug discovery project.
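The mapping and validation steps described above can be illustrated with a minimal sketch, assuming a single descriptor (log P) and a straight-line model fitted by least squares. The data values are hypothetical, the function names are illustrative, and q2 is computed here as 1 - PRESS / SS_total about the mean of all observed activities, which is one common definition among several.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def predict(model, x):
    """Apply a fitted (slope, intercept) model to a list of descriptor values."""
    slope, intercept = model
    return [slope * xi + intercept for xi in x]


def r_squared(y_true, y_pred):
    """1 - SS_residual / SS_total, with SS_total taken about the mean of y_true."""
    y_bar = mean(y_true)
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    ss_tot = sum((yt - y_bar) ** 2 for yt in y_true)
    return 1.0 - ss_res / ss_tot


def loo_q2(x, y):
    """Leave-one-out q2: refit without each molecule, then predict that molecule."""
    preds = []
    for i in range(len(x)):
        model = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        preds.append(predict(model, [x[i]])[0])
    return r_squared(y, preds)


# Hypothetical descriptor values (log P) and activities, for illustration only.
log_p = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
activity = [4.1, 4.6, 5.2, 5.4, 6.1, 6.5, 6.8, 7.6]

# Fitted R2 on the full training set and leave-one-out q2.
r2_fit = r_squared(activity, predict(fit_line(log_p, activity), log_p))
q2_loo = loo_q2(log_p, activity)

# Held-out test set: indices chosen arbitrarily here; the slides note that
# how best to choose them (randomly, by clustering) is an open question.
test_idx = {1, 4, 7}
x_train = [v for i, v in enumerate(log_p) if i not in test_idx]
y_train = [v for i, v in enumerate(activity) if i not in test_idx]
x_test = [v for i, v in enumerate(log_p) if i in test_idx]
y_test = [v for i, v in enumerate(activity) if i in test_idx]
r2_test = r_squared(y_test, predict(fit_line(x_train, y_train), x_test))
```

For a least-squares model, q2 can never exceed the fitted R2, so a large gap between the two is a warning sign of overfitting.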
  • 14. “Types of QSAR” 1. Fragment-based QSAR: The "partition coefficient", a measurement of differential solubility and itself a component of QSAR predictions, can be predicted either by atomic methods (known as "XLogP" or "ALogP") or by chemical fragment methods (known as "CLogP" and other variations). It has been shown that the logP of a compound can be determined by the sum of its fragments; fragment-based methods are generally accepted as better predictors than atomic-based methods. Fragmentary values have been determined statistically, based on empirical data for known logP values. This method gives mixed results and is generally not trusted to have accuracy of more than ±0.1 units. Group- or fragment-based QSAR is also known as GQSAR. GQSAR allows flexibility to study various molecular fragments of interest in relation to the variation in biological response. The molecular fragments could be substituents at various substitution sites in a congeneric set of molecules or could be defined on the basis of pre-defined chemical rules in the case of non-congeneric sets. GQSAR also considers cross-term fragment descriptors, which could be helpful in identification
  • 15. of key fragment interactions in determining variation of activity. Lead discovery using Fragnomics is an emerging paradigm. In this context, FB-QSAR proves to be a promising strategy for fragment library design and in fragment-to-lead identification endeavours. An advanced approach to fragment- or group-based QSAR, based on the concept of pharmacophore similarity, has been developed. This method, pharmacophore-similarity-based QSAR (PS-QSAR), uses topological pharmacophoric descriptors to develop QSAR models. This activity prediction may help assess the contribution of certain pharmacophore features encoded by respective fragments toward activity improvement and/or detrimental effects. 2. 3D-QSAR: The acronym 3D-QSAR or 3-D QSAR refers to the application of force field calculations requiring three-dimensional structures of a given set of small molecules with known activities (training set). The training set needs to be superimposed (aligned) by either experimental data (e.g. based on ligand-protein crystallography) or molecule superimposition software. It uses
  • 16. computed potentials, e.g. the Lennard-Jones potential, rather than experimental constants and is concerned with the overall molecule rather than a single substituent. The first 3-D QSAR was named Comparative Molecular Field Analysis (CoMFA) by Cramer et al. It examined the steric fields (shape of the molecule) and the electrostatic fields, which were correlated by means of partial least squares regression (PLS). The created data space is then usually reduced by a subsequent feature extraction step. An alternative approach uses multiple-instance learning by encoding molecules as sets of data instances, each of which represents a possible molecular conformation. A label or response is assigned to each set corresponding to the activity of the molecule, which is assumed to be determined by at least one instance in the set (i.e. some conformation of the molecule). On June 18, 2011, the Comparative Molecular Field Analysis (CoMFA) patent dropped any restriction on the use of GRID and partial least-squares (PLS) technologies.
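To make the idea of a computed steric field more concrete, the sketch below evaluates a 12-6 Lennard-Jones potential between a probe and each atom of a molecule at a few grid points. It is only an illustration of the principle: the atom coordinates, the epsilon and sigma parameters, and the energy cutoff are all hypothetical, and this is not the CoMFA implementation.

```python
import math


def lennard_jones(r, epsilon, sigma):
    """12-6 Lennard-Jones potential: 4 * epsilon * ((sigma/r)^12 - (sigma/r)^6)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def steric_field(atoms, grid_points, epsilon=0.1, sigma=3.4, energy_cutoff=30.0):
    """Sum probe-atom Lennard-Jones energies at each grid point.

    Energies are truncated at energy_cutoff so that grid points lying very
    close to an atom do not dominate the descriptor matrix.
    """
    field = []
    for point in grid_points:
        total = 0.0
        for atom in atoms:
            r = max(math.dist(point, atom), 1e-6)  # avoid division by zero
            total += lennard_jones(r, epsilon, sigma)
        field.append(min(total, energy_cutoff))
    return field


# Hypothetical aligned molecule (atom coordinates in angstroms) and grid points.
atoms = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
grid_points = [(0.0, 0.0, 0.0), (0.0, 4.0, 0.0), (0.0, 8.0, 0.0)]

# One field value per grid point; these values become columns of the X matrix.
field_values = steric_field(atoms, grid_points)
```

In a 3D-QSAR study each aligned molecule would contribute one such row of field values, and a method such as PLS would then correlate the rows with the measured activities.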
  • 17. 3. Chemical descriptor based QSAR: In this approach, descriptors quantifying various electronic, geometric, or steric properties of a molecule are computed and used to develop a QSAR. This approach is different from the fragment (or group contribution) approach in that the descriptors are computed for the system as a whole rather than from the properties of individual fragments. This approach is different from the 3D-QSAR approach in that the descriptors are computed from scalar quantities (e.g., energies, geometric parameters) rather than from 3D fields. An example of this approach is the QSARs developed for olefin polymerization by half-sandwich compounds.
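For contrast with whole-molecule descriptors, the fragment (group contribution) idea can be sketched as a simple sum of per-fragment values. The fragment names and contribution values below are hypothetical placeholders chosen for readability, not published constants, and real fragment methods such as CLogP also apply correction factors that are omitted here.

```python
# Hypothetical fragment contributions to log P, for illustration only.
FRAGMENT_LOGP = {
    "CH3": 0.5,
    "CH2": 0.5,
    "OH": -1.0,
}


def estimate_logp(fragment_counts, contributions=FRAGMENT_LOGP):
    """Estimate log P as the sum of fragment contributions times their counts."""
    total = 0.0
    for fragment, count in fragment_counts.items():
        if fragment not in contributions:
            raise KeyError(f"No contribution value for fragment {fragment!r}")
        total += contributions[fragment] * count
    return total


# Ethanol decomposed as CH3 + CH2 + OH; 1-propanol adds one more CH2.
ethanol_logp = estimate_logp({"CH3": 1, "CH2": 1, "OH": 1})
propanol_logp = estimate_logp({"CH3": 1, "CH2": 2, "OH": 1})
```

The additive form makes the effect of each fragment easy to read off: extending the chain by one CH2 raises the estimate by exactly that fragment's contribution.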
  • 19. Modeling in QSAR: The literature often shows that chemists prefer partial least squares (PLS) methods, since they apply feature extraction and induction in one step. Data mining approach: Computer QSAR models typically calculate a relatively large number of features. Because those lack structural interpretation ability, the preprocessing steps face a feature selection problem (i.e., which structural features should be interpreted to determine the structure–activity relationship). Feature selection can be accomplished by visual inspection (qualitative selection by a human), by data mining, or by molecule mining. A typical data-mining-based prediction uses, for example, support vector machines, decision trees or artificial neural networks to induce a predictive learning model. Molecule mining approaches, a special case of structured data mining approaches, apply a similarity-matrix-based prediction or an automatic fragmentation scheme into molecular substructures. There are also approaches that use maximum common subgraph searches or graph kernels.
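A similarity-matrix-based prediction can be sketched as follows. The slides do not name a similarity measure; this example assumes the Tanimoto coefficient on fingerprints stored as sets of "on" bit positions, followed by a similarity-weighted nearest-neighbour average. The fingerprints and activity values are hypothetical.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient: shared bits divided by bits present in either set."""
    union = fp_a | fp_b
    if not union:
        return 1.0  # two empty fingerprints are treated as identical
    return len(fp_a & fp_b) / len(union)


def similarity_matrix(fingerprints):
    """Pairwise Tanimoto similarities for a list of fingerprints."""
    return [[tanimoto(a, b) for b in fingerprints] for a in fingerprints]


def predict_activity(query_fp, train_fps, train_activities, k=2):
    """Similarity-weighted mean activity of the k most similar training molecules."""
    scored = sorted(
        ((tanimoto(query_fp, fp), act) for fp, act in zip(train_fps, train_activities)),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    weight_sum = sum(sim for sim, _ in scored)
    if weight_sum == 0:
        # No overlap with any neighbour: fall back to an unweighted mean.
        return sum(act for _, act in scored) / len(scored)
    return sum(sim * act for sim, act in scored) / weight_sum


# Hypothetical fingerprints (sets of "on" bit positions) and activities.
train_fps = [{1, 2, 3, 4}, {1, 2, 3, 5}, {7, 8, 9}]
train_activities = [6.0, 7.0, 4.0]

matrix = similarity_matrix(train_fps)
predicted = predict_activity({1, 2, 3}, train_fps, train_activities, k=2)
```

The query shares three of four bits with each of the first two training molecules, so both receive the same weight and the prediction is the average of their activities.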
  • 20. Matched molecular pair analysis: QSAR models derived from non-linear machine learning are typically seen as a "black box", which fails to guide medicinal chemists. A relatively new concept, matched molecular pair analysis (MMPA) or prediction-driven MMPA, is coupled with a QSAR model in order to identify activity cliffs.
  • 21. Evaluation of the quality of QSAR models
  • 22. Evaluation of the quality of QSAR models: QSAR modeling produces predictive models derived from the application of statistical tools correlating biological activity (including desirable therapeutic effects and undesirable side effects) or, in QSPR models, physico-chemical properties of chemicals (drugs/toxicants/environmental pollutants) with descriptors representative of molecular structure or properties. QSARs are being applied in many disciplines, for example risk assessment, toxicity prediction, and regulatory decisions, in addition to drug discovery and lead optimization. Obtaining a good-quality QSAR model depends on many factors, such as the quality of the input data, the choice of descriptors and the statistical methods for modeling and for validation. Any QSAR modeling should ultimately lead to statistically robust and predictive models capable of making accurate and reliable predictions of the modeled response of new compounds. For validation of QSAR models, various strategies are usually adopted: ❖ Internal validation or cross-validation (cross-validation is in fact a measure of model robustness: the more robust a model is (higher q2), the less the removal of data perturbs the original model); ❖ External validation by splitting the available data set into a training set for model development and a prediction set for checking model predictivity; ❖ Blind external validation by application of the model to new external data; and ❖ Data randomization or Y-scrambling for verifying the absence of chance correlation between the response and the modeling descriptors. The success of any QSAR model depends on the accuracy of the input data, the selection of appropriate descriptors and statistical tools, and, most importantly, validation of the developed model.
Validation is the process by which the reliability and relevance of a procedure are established for a specific purpose; for QSAR models, validation must mainly address robustness, prediction performance and the applicability domain (AD) of the models. Some validation methodologies can be problematic. For example, leave-one-out cross-validation generally leads to an overestimation of predictive capacity. Even with external validation, it is difficult to determine whether the selection of training and test sets was manipulated to maximize the predictive capacity of the model being published.
  • 23. Different aspects of validation of QSAR models that need attention include methods of selection of training set compounds, setting training set size and impact of variable selection for training set models for determining the quality of prediction. Development of novel validation parameters for judging quality of QSAR models is also important.
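The Y-scrambling check mentioned among the validation strategies can be sketched as below, assuming a single-descriptor linear model so that R2 equals the squared Pearson correlation. The data, the number of rounds and the seed are hypothetical choices made only for illustration.

```python
import random
from statistics import mean


def r2_of_linear_fit(x, y):
    """R2 of the least-squares line y ~ x (the squared Pearson correlation)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy ** 2 / (sxx * syy)


def y_scramble(x, y, n_rounds=100, seed=0):
    """Refit the model on randomly permuted responses and collect the R2 values."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_rounds):
        shuffled = list(y)
        rng.shuffle(shuffled)  # breaks any real structure-activity link
        scores.append(r2_of_linear_fit(x, shuffled))
    return scores


# Hypothetical descriptor values and activities, for illustration only.
descriptor = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
activity = [4.1, 4.6, 5.2, 5.4, 6.1, 6.5, 6.8, 7.6]

real_r2 = r2_of_linear_fit(descriptor, activity)
scrambled_r2 = y_scramble(descriptor, activity)
mean_scrambled_r2 = mean(scrambled_r2)
```

If the model built on the real responses does not score clearly higher than the models built on scrambled responses, the original correlation is likely to have arisen by chance.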
  • 25. Computational resources for QSAR: The main objective of QSAR models is to allow the prediction of biological activities of untested or novel compounds and to provide insight into the relevant and consistent chemical properties or descriptors (2D/3D) which define the biological activity. Once a series of predictive models has been collected, these can be used for database mining to identify novel chemical compounds, particularly those having drug-like properties (following Lipinski's Rule of Five) along with suitable pharmacokinetic properties. Figure: A diagram listing software used for computing chemical descriptors and the different types of descriptors.
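A drug-likeness filter based on Lipinski's Rule of Five can be sketched as below. It assumes the four properties have already been computed by some descriptor package; the class, field and compound names are hypothetical, and the thresholds are the usual ones (molecular weight at most 500, log P at most 5, at most 5 hydrogen-bond donors and at most 10 acceptors), with at most one violation tolerated.

```python
from dataclasses import dataclass


@dataclass
class CompoundProperties:
    """Precomputed properties for one compound (values supplied by the user)."""
    name: str
    mol_weight: float
    log_p: float
    h_bond_donors: int
    h_bond_acceptors: int


def lipinski_violations(compound: CompoundProperties) -> int:
    """Count how many of the four Rule of Five criteria are violated."""
    checks = [
        compound.mol_weight > 500,
        compound.log_p > 5,
        compound.h_bond_donors > 5,
        compound.h_bond_acceptors > 10,
    ]
    return sum(checks)


def is_drug_like(compound: CompoundProperties, max_violations: int = 1) -> bool:
    """Accept compounds with no more than max_violations violations."""
    return lipinski_violations(compound) <= max_violations


# Hypothetical database entries, for illustration only.
database = [
    CompoundProperties("candidate_1", 350.0, 2.5, 2, 5),
    CompoundProperties("candidate_2", 700.0, 6.0, 6, 12),
]
drug_like = [c.name for c in database if is_drug_like(c)]
```

In a database-mining workflow such a filter would typically run before the QSAR models, so that activity is only predicted for compounds worth considering.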
  • 26. Web Servers/Databases/Mirror Sites
    Chemical Libraries
      ❖ ZINC: A free database of commercially available compounds for virtual screening.
      ❖ NCI: 250251 open structures ready for searching. Web interface on
    Libraries
      ❖ RCDK: Allows the user to load molecules, evaluate fingerprints, calculate molecular descriptors and view structures in 2D.
      ❖ JoeLib: A cheminformatics algorithm library, which was designed for prototyping, data mining, graph mining, and of course algorithm development.
      ❖ Molecular Modelling: Open-source program library for molecular simulation applications.
    Standalone software: Structure Drawing
      ❖ ACD/ChemSketch 11.0 Freeware: Allows you to draw chemical structures including organics, organometallics, polymers, and Markush structures.
      ❖ ACD/3D Viewer: 3D viewing tool for use with both ISIS Draw and ISIS Base.
      ❖ Biomer: Java-based online biomolecular modelling package.
      ❖ MOLEKEL: An open-source (GPL) multiplatform molecular visualization program.
  • 27. Descriptor Calculation
      ❖ ACD/LogP Freeware: Fragment-based algorithm for logP prediction.
      ❖ Steric: A program to calculate molecular steric effects.
      ❖ DALTON: Program for ab initio calculation of molecular properties.
    Chemoinformatics Kits and Optimizations
      ❖ Ghemical: Computational chemistry software package released under the GNU GPL.
      ❖ gOpenMol: Tool for the visualization and analysis of molecular structures and their chemical properties.
      ❖ LIGPLOT: Automatically plots protein-ligand interactions.
      ❖ VIEWMOL: Graphical front end for some quantum chemical and molecular modeling programs.
      ❖ VEGA: Molecular software package.
      ❖ TINKER: Software tools for molecular design.
      ❖ MolPOV: A graphics file converter.
      ❖ MOLMOL: A molecular graphics program.
      ❖ MOPAC: A semiempirical quantum chemistry program based on Dewar and Thiel's NDDO approximation.
      ❖ MEQI: Molecular scaffold analysis.
      ❖ ASC: Analytic Surface Calculation package for PDBs.
      ❖ Babel: File format converter.
      ❖ POWERMV: A software environment for statistical analysis, molecular viewing, descriptor generation, and similarity search.
    Links: Chemical Database With Information
  • 28. ChemIDPlus, ACD (Available Chemicals Directory), Bionet Database, CAP (Chemicals Available for Purchase), ChEBI, ASINEX, Maybridge Database, Zinc, ChemBank, ChemDB, ChemStar, CSD (Cambridge Structural Database), HIC-Up (Hetero-compound Information Centre - Uppsala), IBS (InterBioScreen) Database, MDPI (Molecular Diversity Preservation International), MSDchem, NCI Database, PDBsum, PubChem, Relibase, SPRESI, GLIDA, ChemBioGRID, Ched, ECOTOX, QueryCHEM. Drug Database: BIOSTER, CMC (Comprehensive Medicinal Chemistry), Dictionary of Drug, DrugBank, MDDR (MDL Drug Data Report), MedChem, PharmGKB, Rx-list, SDD (Super Drug Database), WDI (The Drug Index), WOMBAT, AID/DRUGS
  • 29. QSAR / in silico Tools. VEGA platform: Using the VEGA platform, you can access a series of QSAR (quantitative structure-activity relationship) models for regulatory purposes, or develop your own model for research purposes. CAESAR software (version 2): This is the standalone version of the CAESAR software (version 1). Most of the new features that will be present in the new CAESAR v2.0 software have been integrated in the stand-alone models for Developmental Toxicity and Mutagenicity. CAESAR software (version 1): The CAESAR Application is a JAVA™ web application that allows access to all the toxicity predictive models developed within the CAESAR Project. DEMETRA: DEMETRA is an EU-funded project. The aim of this project has been to develop predictive models and software which give a quantitative prediction of the toxicity of a molecule, in particular molecules of pesticides, candidate pesticides, and their derivatives. The input is the chemical structure of the compound, and the software algorithms use “Quantitative Structure-Activity Relationships” (QSARs). The DEMETRA software tool can be used for toxicity prediction of molecules of pesticides and related compounds. T.E.S.T.: The Toxicity Estimation Software Tool (T.E.S.T.) enables users to easily estimate acute toxicity using QSAR methodologies. Toxtree: Toxtree is a full-featured, flexible and user-friendly open-source application, which can estimate toxic hazard by applying a decision tree approach. Toxtree can be applied to datasets from
  • 30. various compatible file types. User-defined molecular structures are also supported; they can be entered as SMILES or by using the built-in 2D structure diagram editor. OCHEM: OCHEM is an online database of experimental measurements integrated with a modeling environment. Submit your experimental data or use the data uploaded by other users to build predictive QSAR models for physical-chemical or biological properties. Chemistry Development Kit (CDK): The Chemistry Development Kit (CDK) is a Java library for structural chemo- and bioinformatics. It is now developed by more than 50 developers all over the world and used in more than 10 different academic as well as industrial projects worldwide. Weka: Weka is a collection of machine learning algorithms for data mining tasks. The algorithms can either be applied directly to a dataset or called from your own Java code. Weka contains tools for data pre-processing, classification, regression, clustering, association rules, and visualization. AMBIT: AMBIT is software for chemoinformatic data management and is an outcome of the LRI project ‘Building blocks for a future (Q)SAR ((Quantitative) Structure Activity Relationship) decision support system’. ChemAxon Marvin: Marvin is a collection of tools for drawing, displaying and characterizing chemical structures, queries, macromolecules and reactions. Sybyl-X: Whether you need to find new lead candidates, optimize lead series, or perform other related life science experiments like modeling a protein structure, SYBYL-X has solutions to move your discovery research forward. With capabilities for small molecule modeling and simulation,
  • 31. macromolecular modeling and simulation, cheminformatics, lead identification, and lead optimization, all wrapped up in an easy-to-use, cost-effective interface, SYBYL-X has the tools and capabilities you need for molecular design. Discovery Studio: Discovery Studio is a software suite of life science molecular design solutions for computational chemists and computational biologists. Discovery Studio makes it easier to examine the properties of large and small molecules, study systems, identify leads and optimize candidates. ADF (Amsterdam Density Functional software): ADF has a 30-year track record as a premium-quality quantum chemistry software package based on Density Functional Theory (DFT). It consists of ❖ the molecular DFT program ADF ❖ the periodic DFT program BAND ❖ the post-ADF COSMO-RS program for thermodynamics of liquids ❖ the ReaxFF program for modeling chemical reactions. AMBER (Assisted Model Building with Energy Refinement): “Amber” refers to two things: a set of molecular mechanical force fields for the simulation of biomolecules (which are in the public domain, and are used in a variety of simulation programs); and a package of molecular simulation programs which includes source code and demos. The current version of the code is Amber version 11, which is distributed by UCSF. ChemDraw Ultra: ChemDraw Ultra 12 is a package for drawing and editing chemical structures and for preparing chemical schemes. Graphics prepared in ChemDraw Ultra have a professional and neat appearance. Gaussian 09: Expanding the limits of computational chemistry
  • 32. Gaussian 09 is the latest version of the Gaussian series of electronic structure programs. Starting from the fundamental laws of quantum mechanics, Gaussian 09 predicts the energies, molecular structures, vibrational frequencies and molecular properties of molecules and reactions in a wide variety of chemical environments. Gaussian 09’s models can be applied to both stable species and compounds which are difficult or impossible to observe experimentally (e.g., short-lived intermediates and transition structures). Molecular Operating Environment: MOE provides a suite of applications for manipulating and analyzing large collections of compounds, building property models and consensus models, and SD pipeline command-line tools. MolProp TK: The MolProp TK provides a customizable framework for molecular property calculation geared towards enabling rapid database filtering. Filtering attempts to eliminate inappropriate or undesirable compounds from a large set before they are used in modelling studies. The goal is to remove all of the compounds that should not be suggested to a medicinal chemist as a potential hit. ADMEWORKS ModelBuilder: ADMEWORKS ModelBuilder is a tool dedicated to building QSAR/QSPR models that can later be used for predicting various chemical and biological properties of compounds. Two classes of models (qualitative and quantitative) can be built using various algorithms. The models are based on values of physicochemical, topological, geometrical, and electronic properties derived from the molecular structure. KnowItAll: Bio-Rad’s award-winning KnowItAll Informatics System offers fully integrated software and/or database desktop solutions that provide scientific researchers with multiple tools such as database building, management, mining/search, analysis, prediction, structure drawing, and reporting, all within a single user interface. MetaDrug™
  • 33. MetaDrug is a unique systems pharmacology platform designed for evaluation of the biological effects of small molecule compounds on the human body, with pathway analysis and other bioinformatics applications from toxicogenomics to translational medicine.
    Noraymet ADME
    ADME predictive software: the software includes a selection of INVIVE (in vivo-in vitro extrapolation) models that allow in vivo pharmacokinetics to be predicted from in vitro results.
    PreADME
    Around half of all drugs in clinical development fail to commercialize because of poor ADME and toxicity properties. There is increasing interest in the early prediction of ADME properties in order to increase the success rate of compounds reaching development. ADME predictions from PreADME can serve as practical guidance in early drug discovery.
    Sarchitect
    Sarchitect Designer is the model-building edition of Sarchitect. The product vision is to enable building “best possible” models. Sarchitect Miner is the interface between the model builders and the model users. Miner enables the use of models by
    ❖ medicinal chemists to profile and optimize their compounds in silico, and
    ❖ DMPK groups to rank compounds for prioritization in their studies
    VolSurf+
    The pharmacokinetic behaviour of compounds is linked to their efficacy and thus is critical for drug discovery. Understanding how to optimise compounds according to multiple simultaneous criteria is a great advantage in focusing design efforts. VolSurf+ creates 128 molecular descriptors from 3D Molecular Interaction Fields (MIFs) produced by the GRID software, which are particularly relevant to ADME prediction and are also simple to interpret. One example
  • 34. would be the interaction energy moment descriptor between hydrophobic and hydrophilic regions, which is important for membrane permeability prediction. These descriptors can then be used with the provided chemometric tools to build statistical models.
    Bio-Loom
    BioByte’s Bio-Loom program weaves several different threads of data into a cohesive whole. Bio-Loom still calculates hydrophobic and molecular refractivity parameters via CLOGP and CMR, calculations which have been the world standard for decades, but it can now also access BioByte’s entire Thor Masterfile database, which includes over 60,000 measured log P and log D values (in many solvent systems) as well as 14,000 pKa values, including associated references.
    MolCode Toolbox
    The modules of the Molcode Toolbox software predict a wide range of experimentally unknown property values of compounds, including physicochemical, biological, ADME-Tox, ecological pathway/ecotoxicity and adverse drug effect properties. The modules consist of internally encoded computational models built on various datasets related to these properties. All the models in Molcode Toolbox are specified by the model’s name, CAS number (not obligatory) and systematic name of the compounds, (bio)assay, property/activity value and unit, and reference.
    TerraQSAR
    All TerraQSAR programs use proprietary neural networks for the computation of biological effects and physico-chemical properties of defined chemicals. The input is in the form of 2D/3D SMILES strings.
    ADRIANA.Code
    ADRIANA.Code comprises a unique combination of methods for calculating molecular structure descriptors on a sound geometric and physicochemical basis. These descriptors can be used for a wide range of applications in all areas of chemistry.
  • 36. Applications of QSAR
    1. Chemical applications of QSAR
    ❖ One of the first historical QSAR applications was the prediction of boiling points.
    ❖ It is well known that within a particular family of chemical compounds, especially in organic chemistry, there are strong correlations between structure and observed properties. A simple example is the relationship between the number of carbons in alkanes and their boiling points. The boiling point clearly increases with the number of carbons, and this trend serves as a means of predicting the boiling points of higher alkanes.
    ❖ Other applications that remain of interest are the Hammett equation, the Taft equation and pKa prediction methods.
    2. Biological applications of QSAR
    ❖ The biological activity of molecules is usually measured in assays to establish the level of inhibition of particular signal transduction or metabolic pathways. Drug discovery often involves the use of QSAR to identify chemical structures that could have good inhibitory effects on specific targets and have low toxicity (non-specific activity). Of special interest
  • 37. is the prediction of the partition coefficient log P, which is an important measure used in identifying "druglikeness" according to Lipinski's Rule of Five.
    ❖ While many quantitative structure-activity relationship analyses involve the interactions of a family of molecules with an enzyme or receptor binding site, QSAR can also be used to study the interactions between the structural domains of proteins. Protein-protein interactions can be quantitatively analyzed for structural variations resulting from site-directed mutagenesis.
    ❖ Reducing the risk of a SAR paradox (structurally similar molecules that do not have similar activities) is part of the machine learning approach, especially given that only a finite amount of data is available. In general, all QSAR problems can be divided into coding and learning.
    3. For risk management
    ❖ (Q)SAR models have been used for risk management. Regulatory authorities suggest the use of QSARs; in the European Union they are suggested by the REACH regulation, where "REACH" abbreviates "Registration, Evaluation, Authorisation and Restriction of Chemicals". Regulatory application of QSAR methods includes in silico toxicological assessment of genotoxic impurities. Commonly used QSAR assessment software such as DEREK or MCASE is used to assess the genotoxicity of impurities according to ICH M7.
    ❖ The chemical descriptor space whose convex hull is generated by a particular training set of chemicals is called the training set's applicability domain. Prediction of properties of novel chemicals that are located outside the applicability domain uses extrapolation, and so is less reliable (on average) than prediction within the applicability domain. The assessment of the reliability of QSAR predictions remains a research topic.
    ❖ The QSAR equations can be used to predict the biological activities of newer molecules before their synthesis.
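The applicability-domain idea described above (the convex hull of the training set in descriptor space) can be sketched in a few lines. This is only an illustration under stated assumptions: it uses just two descriptors so that the hull is a polygon, the training values are invented, and the function names are hypothetical rather than taken from any QSAR package.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: List[Point]) -> List[Point]:
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other one.
    return lower[:-1] + upper[:-1]


def in_applicability_domain(hull: List[Point], query: Point) -> bool:
    """True if query lies inside or on the boundary of the counter-clockwise hull."""
    n = len(hull)
    if n < 3:
        return False  # degenerate training set: no area to interpolate within
    return all(cross(hull[i], hull[(i + 1) % n], query) >= 0 for i in range(n))


# Hypothetical training set: (log P, molecular weight / 100) for five compounds.
training = [(1.0, 1.5), (2.0, 3.0), (3.5, 2.5), (3.0, 1.0), (2.2, 2.0)]
hull = convex_hull(training)

inside = in_applicability_domain(hull, (2.5, 2.0))   # interpolation
outside = in_applicability_domain(hull, (5.0, 4.5))  # extrapolation
print(inside, outside)
```

A prediction for the second query compound would be an extrapolation and should be treated as less reliable. With many descriptors the same test is usually done with a numerical library or replaced by a simpler range or distance check.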
  • 38. ❖ QSAR has been applied extensively and successfully over several decades to find predictive models for the activity of bioactive agents. It has also been applied to areas related to the discovery and subsequent development of bioactive agents: distinguishing drug-like from non-drug-like molecules, drug resistance, toxicity prediction, physicochemical properties prediction (e.g. water solubility, lipophilicity), gastrointestinal absorption, activity of peptides, data mining, drug metabolism, and prediction of other pharmacokinetic and ADME properties. Recent reviews summarise work in a number of these areas, and a book has summarised the application of neural networks to combinatorial discovery. The journal Quantitative Structure-Activity Relationships contains abstracts of QSAR studies in other journals in each issue.
    The number of potential applications for structure-property modelling, in the most general case, is extensive and growing. Improved molecular descriptors, based on a better understanding of which molecular attributes are most important for the property being modelled, and increasing use of genetic and artificial intelligence methods will raise QSAR to even greater levels of usefulness. A basic understanding of QSAR concepts is essential for most people, across a diverse range of skills, who design molecules.
    In the field of drug designing
    1. Information from the intercept values
    2. Importance of log P0 concept
    3. Bioisosterism
    4. Enzyme Inhibition
    5. Information on receptor site
    6. Importance in Drug Research
    1. Information from the intercept values
    The intercept represents the activity of the unsubstituted compound. The activity increases or decreases depending on the substitution, which is reflected in the slope or regression coefficient.
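The reading of intercept and slope can be made concrete with a small least-squares fit. The sketch below is an illustration only: it assumes a single substituent descriptor x that is 0 for the unsubstituted parent compound, and the activity values are invented so that the arithmetic is easy to follow.

```python
from typing import Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical series: x = 0 is the unsubstituted parent compound.
substituent_descriptor = [0.0, 0.5, 1.0, 1.5]
activity = [3.0, 3.4, 3.8, 4.2]  # e.g. log(1/C), invented values

slope, intercept = fit_line(substituent_descriptor, activity)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
```

In this toy series the fitted intercept equals the activity of the parent compound (x = 0), while the slope tells how strongly the substituent descriptor changes the activity.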
  • 39. When the intercept is very high and the slope is low, the basic nucleus (the parent compound) itself has high activity. The intercept is therefore a measure of intrinsic activity.
    Example 1: Alcohols and carbamates inhibit bacterial luminescence. The two series have similar slopes or correlation coefficients, but their intercepts differ, and the carbamates are more active.
    Example 2: Esters and alkyl carbonates and their narcotic action on tadpoles. Carbamates are more active than the esters.
    2. Importance of log P0 concept
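The heading above names log P0 without defining it. In the usual Hansch-type treatment, log P0 is the log P value at which activity reaches its maximum. A short worked derivation follows, assuming a parabolic dependence of activity on log P; the coefficients a, b and c are generic symbols and are not taken from this document.

```latex
\begin{align}
\log\frac{1}{C} &= -a\,(\log P)^2 + b\,\log P + c, \qquad a > 0 \\
\frac{d}{d(\log P)}\,\log\frac{1}{C} &= -2a\,\log P + b = 0 \\
\log P_0 &= \frac{b}{2a}
\end{align}
```

Under this assumption, compounds with log P below log P0 gain activity as lipophilicity increases, and compounds above it lose activity.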
  • 40. Example 1: Analgesic activity of hydroxycodenone esters.
    Example 2: Anticonvulsant activity (barbiturates, benzodiazepines, etc.). The log P value of many CNS drugs is around 2.0.
    3. Bio-isosterism
    Bioisosterism is the replacement of one functional group with another that has similar properties, both qualitatively and quantitatively. Examples are the discovery of cyanoguanidine as the bioisostere of thiourea and the development of H2 antagonists.
  • 41. Example 1: A guanidine isostere, in which C=S is replaced with C=NH, showed increased basicity and reduced activity. To decrease the basicity, electron-withdrawing groups such as -NO2 and -CN were introduced into the guanido group. The cyanoguanidine group -NH(C=NCN)-NH- is an ideal isostere for thiourea -NH(C=S)-NHR, with reduced basicity. Thiourea is also replaced with the -NH(C=CHNO2)-NHR group.
    4. Enzyme Inhibition
    Dihydrofolate reductase (DHFR) is the most extensively investigated enzyme. DHFR inhibitors are therapeutically important as highly selective antibacterial, antimalarial and antitumor agents. Replacement of one methoxy group of trimethoprim by an acidic side chain (a carboxylate group) increased inhibitory activity, but selectivity and membrane permeability decreased significantly. Replacement with a sulfonyl group: Mycobacterium lufu DHFR.
  • 42. 5. Information on receptor site
    Inhibition of dihydrofolate reductase (DHFR) by benzyl pyrimidines (trimethoprim type): inhibition of DHFR from bovine liver and from E. coli. Inhibition of the mammalian enzyme depends on hydrophobicity, whereas inhibition of the bacterial enzyme depends on the bulk of the substituent.
    QSAR on quinazolines: the hydrophobic parameter for the substituent at the 5-position has a large positive coefficient, which points to a hydrophobic pocket in both mammalian and bacterial DHFR.
    Triazines: the hydrophobic pocket is larger in the bacterial enzyme than in the mammalian enzyme.
    6. Importance in Drug Research
    QSAR has correctly predicted the activity of a large number of compounds before their synthesis.
    Colchicine (anti-cancer drug), toxicity: I is the indicator variable for the presence of the group -COCH3 at the 10-position.
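How an indicator variable enters a QSAR equation can be shown with a short sketch. The colchicine equation itself is not reproduced in this document, so the model form (a linear term in log P plus an indicator term) and all coefficients below are hypothetical and serve only to show the mechanics.

```python
def predict_activity(log_p: float, has_group: bool,
                     a: float, b: float, c: float) -> float:
    """Linear QSAR with one indicator variable: activity = a*logP + b*I + c.

    I is 1 when the structural feature (for example a -COCH3 group at a
    given position) is present and 0 when it is absent.
    """
    indicator = 1.0 if has_group else 0.0
    return a * log_p + b * indicator + c


# Hypothetical coefficients, NOT fitted to any colchicine data.
A, B, C = 0.5, -0.75, 2.0

with_group = predict_activity(1.0, True, A, B, C)
without_group = predict_activity(1.0, False, A, B, C)

# Two analogues with the same log P differ by exactly the coefficient b.
print(with_group - without_group)
```

The coefficient of the indicator variable therefore measures the average contribution of that single structural feature to the activity, with all other descriptors held equal.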
  • 43. Conclusion
    From the above discussion it is concluded that a QSAR study is a good prediction tool for investigating drug activity or binding mode on specific receptors. The descriptors that show the best correlation give information about the important functional groups in the structures of the tested compounds. By changing some groups in the structure of a drug, we can therefore improve its pharmacological activity or physicochemical properties. Experimental determinations are generally very expensive, and QSAR studies allow this cost to be reduced.
    QSAR is basically used to relate biological activities to various properties associated with the structures, which helps to explain how structural features in a drug molecule influence its biological activities. QSAR methods can be used to build models that predict properties or activities of organic compounds. However, an effective way to encode the structures with calculated molecular structure descriptors is required for the development of accurate models. The descriptors incorporated in model development make it possible to focus on the specific features that account for the property or activity of interest in the compounds. QSAR should not replace experimental values, but it is a useful predictive tool and may be usable when no data are available.