By
KAUSHAL KUMAR SAHU
Assistant Professor (Ad Hoc)
Department of Biotechnology
Govt. Digvijay Autonomous P. G. College
Raj-Nandgaon ( C. G. )
SYNOPSIS
INTRODUCTION
THE NEED FOR CHEMOINFORMATICS
CHEMOINFORMATICS AND DRUG DISCOVERY
HISTORICAL EVOLUTION
BASIC CONCEPTS
Chemistry Space
Molecular Descriptors
High-Throughput Screening
The Similar-Structure, Similar-Property Principle
Graph theory and Chemoinformatics
CHEMOINFORMATICS TASKS
MOLECULAR REPRESENTATIONS
Topological Representations
Geometrical Representations
TYPES OF MOLECULAR DESCRIPTORS
IN SILICO DE NOVO MOLECULAR DESIGN
FREE CHEMISTRY DATABASES
FUTURE
CONCLUSION
REFERENCES
INTRODUCTION & DEFINITION
G. Paris, 1998
“Chemoinformatics is a generic term that encompasses the
design, creation, organization, management, retrieval, analysis,
dissemination, visualization, and use of chemical information.”
J. Gasteiger, 2004
“Chemoinformatics is the application of informatics methods to
solve chemical problems.”
F.K. Brown, 1998
“Chemoinformatics is the mixing of those information resources
to transform data into information and information into
knowledge for the intended purpose of making better decisions
faster in the area of drug lead identification and optimization.”
Varnek, 2007
“Chemoinformatics is a field dealing with molecular objects
(graphs, vectors) in multidimensional chemical space.”
Cheminformatics
WHAT IS CHEMINFORMATICS?
Cheminformatics combines chemistry, statistics, informatics, and mathematics.
Who Uses Cheminformatics, and Why?
• The life sciences, biochemistry, and the drug industry use
cheminformatics.
20 years ago: 80% of time in the lab, 20% in front of a computer
Now: 20% of time in the lab, 80% in front of a computer
Examples:
• Organic chemistry – automated reaction planning,
Beilstein search
• Physical chemistry – modeling of structure
properties (boiling points)
• Inorganic chemistry – ligand bond interactions
• Analytical chemistry – structure elucidation of
small compounds
• Biochemistry – protein/small molecule interaction
networks
HISTORY
• The first edition of Beilstein’s Handbuch der Organischen Chemie was published
in 1881. Its two volumes ran to more than 2000 pages and registered 1500
compounds.
• Chemical Abstracts has been published since 1907.
• The Cambridge Structural Database, which holds three-dimensional crystal
structures of compounds, was established in 1965.
• In 1975, the Journal of Chemical Documentation changed its name to the
Journal of Chemical Information and Computer Sciences.
• In 1997, the National Cancer Institute built and publicly distributed its
database of compounds with associated biological anti-tumor data.
• Chemical Informatics Letters, an open-access web journal, has been published
since 2000.
BASIC CONCEPT
• Chemistry Space: Chemistry space is the term given to the space
that contains all of the theoretically possible molecules and is therefore
theoretically infinite.
Nested regions, from largest to smallest: chemical space, drug-like chemical
space, hit space, drug.
• Molecular Descriptors: "The molecular descriptor is the final result of a logic and
mathematical procedure which transforms chemical information encoded within a
symbolic representation of a molecule into a useful number or the result of some
standardized experiment."
• Two main categories: experimental measurements, such as log P, molar refractivity,
dipole moment, polarizability, and, in general, physico-chemical properties, and
theoretical molecular descriptors, which are derived from a symbolic
representation of the molecule and can be further classified according to the different
types of molecular representation.
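As an illustration of a theoretical descriptor derived from a symbolic representation, the sketch below computes the Wiener index, the sum of shortest-path distances (counted in bonds) over all pairs of heavy atoms. The Wiener index is chosen here only as a familiar example and is not named in the slides; the function name and the atom numbering are assumptions made for this sketch.

```python
from collections import deque

def wiener_index(n_atoms, bonds):
    """Sum of shortest-path lengths (in bonds) over all pairs of atoms.

    Atoms are numbered 0..n_atoms-1 (hydrogens omitted); bonds is a list
    of (i, j) index pairs. Both conventions are assumptions of this sketch.
    """
    neighbors = {i: [] for i in range(n_atoms)}
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)

    total = 0
    for start in range(n_atoms):
        # Breadth-first search gives topological distances from `start`.
        dist = {start: 0}
        queue = deque([start])
        while queue:
            atom = queue.popleft()
            for nxt in neighbors[atom]:
                if nxt not in dist:
                    dist[nxt] = dist[atom] + 1
                    queue.append(nxt)
        total += sum(dist.values())
    return total // 2  # every pair was counted twice

# Two C4H10 isomers, carbon skeletons only.
n_butane = [(0, 1), (1, 2), (2, 3)]    # linear chain
isobutane = [(0, 1), (1, 2), (1, 3)]   # branched at atom 1

print(wiener_index(4, n_butane))   # 10
print(wiener_index(4, isobutane))  # 9
```

The two isomers share a molecular formula but receive different values, which is the kind of isomer discrimination a descriptor is usually expected to provide.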
Basic requirements for optimal descriptors
• Have a structural interpretation
• Correlate well with at least one property
• Preferably discriminate among isomers
• Be applicable to local structure
• Be generalizable to "higher" descriptors
• Not be based on experimental properties
• Not be trivially related to other descriptors
• Be efficient to construct
• Use familiar structural concepts
• Change gradually with gradual change in structure
• Show the correct size dependence, if related to molecule size
High-Throughput Screening
• Rapid, automated testing of large compound libraries
• Uses well plates (96, 384, or 1536 wells)
• Two extremes are evident in many HTS programs:
diverse and focused screening libraries.
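The plate sizes above correspond to the standard layouts of 8 x 12, 16 x 24, and 32 x 48 wells. The following sketch maps a zero-based well index to the usual row-letter and column-number label. The row-major numbering and the function names are assumptions made for illustration, not something specified in the slides.

```python
# Standard microplate layouts: number of wells -> (rows, columns).
PLATE_SHAPES = {96: (8, 12), 384: (16, 24), 1536: (32, 48)}

def row_label(row):
    """Convert a zero-based row number to letters: 0 -> A, 25 -> Z, 26 -> AA."""
    label = ""
    row += 1
    while row > 0:
        row, rem = divmod(row - 1, 26)
        label = chr(ord("A") + rem) + label
    return label

def well_label(index, n_wells):
    """Label for a zero-based well index, assuming row-major numbering."""
    n_rows, n_cols = PLATE_SHAPES[n_wells]
    if not 0 <= index < n_wells:
        raise IndexError(f"well index {index} outside a {n_wells}-well plate")
    row, col = divmod(index, n_cols)
    return f"{row_label(row)}{col + 1}"

print(well_label(0, 96))       # A1
print(well_label(95, 96))      # H12
print(well_label(1535, 1536))  # AF48
```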
The Similar-Structure, Similar-Property Principle
• Much of chemoinformatics is essentially based on the fundamental
assertion that similar molecules will also tend to exhibit similar
properties; this is known as the similar-structure, similar-property
principle, often simply referred to as the similar property principle.
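Applying this principle requires a numerical measure of how similar two molecules are. One widely used choice is the Tanimoto coefficient computed on fingerprint bits. The sketch below assumes each molecule is already described by the set of its "on" bit positions; the bit sets shown are made up for illustration and do not come from real molecules.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bits.

    Returns shared bits divided by total distinct bits, in the range 0 to 1.
    Returning 0.0 for two empty fingerprints is a convention chosen here.
    """
    a, b = set(fp_a), set(fp_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)

# Hypothetical fingerprints (positions of the bits that are set).
query = {1, 2, 3, 5}
close_analogue = {1, 2, 3, 8}
unrelated = {20, 21}

print(tanimoto(query, close_analogue))  # 0.6
print(tanimoto(query, unrelated))       # 0.0
```

In a similarity search, database compounds would be ranked by this score against the query, on the expectation that high-scoring compounds tend to share its properties.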
MOLECULAR REPRESENTATIONS
Adjacency Matrix
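An adjacency matrix stores a molecule as a graph: entry (i, j) is 1 if atoms i and j are bonded and 0 otherwise. The sketch below builds one for the heavy atoms of ethanol. The atom ordering and helper name are assumptions made for this example.

```python
def adjacency_matrix(n_atoms, bonds):
    """Symmetric 0/1 matrix; entry [i][j] is 1 when atoms i and j are bonded."""
    matrix = [[0] * n_atoms for _ in range(n_atoms)]
    for i, j in bonds:
        matrix[i][j] = 1
        matrix[j][i] = 1
    return matrix

# Ethanol, hydrogens omitted: C0 - C1 - O2
atoms = ["C", "C", "O"]
bonds = [(0, 1), (1, 2)]

matrix = adjacency_matrix(len(atoms), bonds)
for symbol, row in zip(atoms, matrix):
    print(symbol, row)

# Row sums give the number of heavy-atom neighbours of each atom.
degrees = [sum(row) for row in matrix]
print(degrees)  # [1, 2, 1]
```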
SMILES NOTATION
The original SMILES specification was developed by Arthur
Weininger and David Weininger in the late 1980s.
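A SMILES string encodes the molecular graph as a line of text: atoms appear in order, parentheses mark branches, and matching digits close rings. The sketch below reads a deliberately small subset of the notation (unbracketed B, C, N, O, P, S, F, Cl, Br, I atoms, the bond symbols -, =, #, branches, and single-digit ring closures). It is a teaching aid under those assumptions, not a full SMILES parser, and it ignores aromatic atoms, charges, and stereochemistry.

```python
BOND_ORDERS = {"-": 1, "=": 2, "#": 3}
ONE_LETTER_ATOMS = set("BCNOPSFI")

def parse_smiles(smiles):
    """Return (atoms, bonds) for a small SMILES subset.

    atoms is a list of element symbols; bonds is a list of
    (atom_i, atom_j, bond_order) tuples.
    """
    atoms, bonds = [], []
    branch_stack = []   # atoms to return to when a branch closes
    open_rings = {}     # ring digit -> (atom index, bond order)
    prev = None         # atom that the next atom will bond to
    order = 1           # order of the next bond
    i = 0
    while i < len(smiles):
        ch = smiles[i]
        two = smiles[i:i + 2]
        if two in ("Cl", "Br") or ch in ONE_LETTER_ATOMS:
            symbol = two if two in ("Cl", "Br") else ch
            atoms.append(symbol)
            cur = len(atoms) - 1
            if prev is not None:
                bonds.append((prev, cur, order))
            prev, order = cur, 1
            i += len(symbol)
            continue
        if ch in BOND_ORDERS:
            order = BOND_ORDERS[ch]
        elif ch == "(":
            branch_stack.append(prev)
        elif ch == ")":
            prev = branch_stack.pop()
        elif ch.isdigit():
            if ch in open_rings:
                start, start_order = open_rings.pop(ch)
                bonds.append((start, prev, max(order, start_order)))
            else:
                open_rings[ch] = (prev, order)
            order = 1
        else:
            raise ValueError(f"unsupported SMILES character: {ch!r}")
        i += 1
    if branch_stack or open_rings:
        raise ValueError("unbalanced branch or ring closure")
    return atoms, bonds

print(parse_smiles("CCO"))        # ethanol
print(parse_smiles("CC(=O)O"))    # acetic acid
print(parse_smiles("C1CCCCC1"))   # cyclohexane
```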
Geometrical Representations
• Conformer
• 2D & 3D Molecular Representations
TYPES OF MOLECULAR DESCRIPTORS
• Vector Representations (Molecular Fingerprints)
  Structure-Key Fingerprints
  Hash-Key Fingerprints
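The two fingerprint types differ in how a bit position is assigned. A structure-key fingerprint reserves one bit for each entry in a predefined dictionary of substructures, while a hash-key fingerprint hashes every fragment found in the molecule into a fixed-length bit vector. The sketch below contrasts the two. It assumes the fragments of a molecule have already been enumerated as strings; the key dictionary, fragment names, and bit length are all made up for illustration.

```python
import hashlib

# Hypothetical key dictionary: one bit per predefined substructure.
STRUCTURE_KEYS = ["C-C", "C-O", "C=O", "C-N", "O-H"]

def structure_key_fingerprint(fragments):
    """Bit i is set when the i-th predefined key occurs in the molecule."""
    return [1 if key in fragments else 0 for key in STRUCTURE_KEYS]

def hashed_fingerprint(fragments, n_bits=32):
    """Each fragment is hashed to a bit position; no dictionary is needed.

    Different fragments can collide on the same bit, so a set bit does not
    identify a unique fragment.
    """
    bits = [0] * n_bits
    for fragment in fragments:
        digest = hashlib.sha1(fragment.encode("utf-8")).digest()
        position = int.from_bytes(digest[:4], "big") % n_bits
        bits[position] = 1
    return bits

# Illustrative fragment set for ethanol (not produced by a real toolkit).
ethanol_fragments = {"C-C", "C-O", "O-H"}

key_fp = structure_key_fingerprint(ethanol_fragments)
hash_fp = hashed_fingerprint(ethanol_fragments)
print(key_fp)  # [1, 1, 0, 0, 1]
print(hash_fp)
```

A key-based bit is directly interpretable because it names a substructure, whereas a hashed fingerprint covers fragments nobody listed in advance at the cost of possible bit collisions.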
Pharmacophore Models
The term pharmacophore was first used by Paul Ehrlich
(1854–1915) in 1909.
• Structure-Based Pharmacophores
• Ligand-Based Pharmacophores
Molecular Scaffolds and Scaffold Hopping
• A scaffold describes the core structure of a molecule
• Scaffold hopping is also called leapfrogging, lead-hopping, chemotype
switching, and scaffold searching
In silico de novo MOLECULAR DESIGN
• De Novo Design
• Virtual Combinatorial Synthesis
Quantitative Structure-Activity Relationships
Molecular Docking
Docking consists of two components:
• Search algorithm: generates plausible structures (poses)
• Scoring function: identifies which of the generated structures are of
most interest
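The split between search and scoring can be shown with a toy example. In the sketch below the "search algorithm" is a plain random search over translations of a rigid two-atom ligand in two dimensions, and the "scoring function" penalises clashes and rewards close contacts. The coordinates, cut-off distances, and weights are invented for illustration and do not correspond to any real docking program.

```python
import math
import random

def score(ligand, receptor):
    """Toy scoring function; lower is better."""
    total = 0.0
    for lx, ly in ligand:
        for rx, ry in receptor:
            d = math.hypot(lx - rx, ly - ry)
            if d < 1.0:
                total += 10.0   # steric clash penalty
            elif d <= 2.0:
                total -= 1.0    # favourable contact
    return total

def translate(ligand, dx, dy):
    """Move a rigid ligand by (dx, dy)."""
    return [(x + dx, y + dy) for x, y in ligand]

def dock(ligand, receptor, n_poses=500, box=5.0, seed=0):
    """Search step: sample random translations and keep the best-scoring pose."""
    rng = random.Random(seed)
    best_pose, best_score = ligand, score(ligand, receptor)
    for _ in range(n_poses):
        pose = translate(ligand, rng.uniform(-box, box), rng.uniform(-box, box))
        pose_score = score(pose, receptor)
        if pose_score < best_score:
            best_pose, best_score = pose, pose_score
    return best_pose, best_score

# Hypothetical receptor atoms and a ligand that starts in a clashing position.
receptor = [(0.0, 0.0), (3.0, 0.0), (0.0, 3.0), (3.0, 3.0)]
ligand = [(0.0, 0.0), (1.0, 0.0)]

start_score = score(ligand, receptor)
best_pose, best_score = dock(ligand, receptor)
print(start_score, best_score)
```

Real programs replace both parts with far more elaborate methods, for example genetic algorithms for the search and physics-based or empirical functions for the scoring, but the division of labour is the same.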
Common programs for molecular docking:
• DOCK
• AutoDock
• FlexX
• ArgusLab
• GOLD (Genetic Optimization for Ligand Docking)
FREE CHEMISTRY DATABASES
PUBCHEM
ZINC
EMOLECULES
CHEBI
NIST CHEMISTRY WEBBOOK
CHEMEXPER
DRUGBANK
CHEMBANK
IUPAC-NIST SOLUBILITY DATABASE
FUTURE
The computer is used to analyze the interactions between the drug and
the receptor site and design molecules with an optimal fit. Once targets
are developed, libraries of compounds are screened for activity with one
or more relevant assays using High Throughput Screening. Hits are then
evaluated for binding, potency, selectivity, and functional activity.
Seeking to improve:
• Potency
• Absorption
• Distribution
• Metabolism
• Elimination
APPLICATIONS OF CHEMOINFORMATICS
1. Chemical Information
• Storage and retrieval of chemical structures and associated data to
manage the flood of data; software is available for structure drawing
and for databases.
• Dissemination of data on the internet
• Cross-linking of data to information
2. All fields of chemistry
• Prediction of the physical, chemical, or biological properties of
compounds
3. Analytical Chemistry
• Chemicals of concern: chemical-specific data; structural, property, and
biological or mechanistic analogues; structure-searchable databases; data
mining; structure-activity relationships
• Analysis of data from analytical chemistry to make predictions about
the quality, origin, and age of the investigated objects
• Elucidation of the structure of a compound based on spectroscopic
data
REFERENCES
• Johann Gasteiger and Thomas Engel, Chemoinformatics: A Textbook,
Wiley-VCH, 2003
• Andrew R. Leach and Valerie J. Gillet, An Introduction to Chemoinformatics,
Springer, 2007
• J. Polanski, University of Silesia, Katowice, Poland (pdf)
• http://www.mdpi.org
• http://www.springer.com

Cheminformatics, concept by kk sahu sir
