CHEMINFORMATICS
By
KAUSHAL KUMAR SAHU
Assistant Professor (Ad Hoc)
Department of Biotechnology
Govt. Digvijay Autonomous P. G. College
Raj-Nandgaon (C. G.)
ROADMAP
• Introduction
• What is cheminformatics?
• Why do we have to use informatics methods in chemistry?
• Is it cheminformatics or chemoinformatics?
• Emergence of cheminformatics
• Three major aspects of cheminformatics
• Basics of cheminformatics
• Topological representations
• Tools used for cheminformatics
• Application of cheminformatics
• Role of cheminformatics in modern drug discovery
• Conclusion
• Bibliography
INTRODUCTION
• Informatics is the discipline of science which investigates the
structure and properties of scientific information, as well as the
activity of handling that information: its theory, history,
methodology and organization.
• Disciplines of informatics include:
– Bioinformatics,
– Chemoinformatics,
– Geoinformatics,
– Health informatics,
– Laboratory informatics,
– Neuroinformatics,
– Social informatics.
[Diagram: Chemistry, Statistics, Informatics, Mathematics]
CHEMINFORMATICS
• Cheminformatics is
• Purpose: Cheminformatics involves organization of
chemical data in a logical form to facilitate the process of:
– Data organization.
– Understanding chemical properties.
– Their relationship to structures.
– Making inferences.
– Assessing the properties of new compounds by comparison
with the known compounds.
• Informatics largely reduces the risk involved in earlier
trial-and-error processes, such as those used in drug discovery.
[Diagram: Information, Data, Knowledge]
• Greg Paris came up with a much broader
definition: “Chemoinformatics is a generic term that
encompasses the design, creation, organization,
management, retrieval, analysis, dissemination,
visualization, and use of chemical information.”
• Chemoinformatics is the application of informatics
methods to solve chemical problems.
• It has been recognized in recent years as a distinct
discipline in computational molecular sciences.
• Cheminformatics is also described as an interface science, as it
combines physics, chemistry, biology, mathematics,
biochemistry, statistics and informatics.
WHY DO WE HAVE TO USE INFORMATICS
METHODS IN CHEMISTRY?
• Chemistry produces an enormous amount of data.
• Many problems in chemistry are too complex to be
solved by methods based on first principles through
theoretical calculations.
• This is true, for example, for the relationships between the
structure of a compound and its biological activity, or
for the influence of reaction conditions on chemical
reactivity.
IS IT CHEMINFORMATICS OR
CHEMOINFORMATICS?
• Cheminformatics
• Chemoinformatics
• Chemi-informatics
• Molecular informatics
• Chemical informatics
• Chemobioinformatics
EMERGENCE
• The subject was first covered by the Journal of Chemical Documentation,
started in 1961 (renamed the Journal of Chemical Information
and Computer Sciences in 1975).
• The first book appeared in 1971 (Lynch, Harrison, Town and
Ash, Computer Handling of Chemical Structure Information).
• The first international conference on the subject was held in 1973
at Noordwijkerhout, and it has been held every three years since 1987.
• The term chemoinformatics was coined by Frank Brown in 1998.
THREE MAJOR ASPECTS OF
CHEMINFORMATICS
• i) Information Acquisition: a process of generating and
collecting data empirically (experimentation) or from
theory.
• ii) Information Management: deals with storage and
retrieval of information.
• iii) Information Use: includes data analysis,
correlation, and application to problems in the
chemical and biochemical sciences.
BASICS OF CHEMINFORMATICS
LOOKUP TABLE
• Simply stores information about each atom.
• E.g., acetaminophen
CONNECTION TABLE
• A connection table stores
the same information that is
present in a 2D structure
diagram, namely the atoms
that are present in a
molecule and what bonds
exist between the atoms.
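As a rough illustration of the connection table described above, the sketch below stores acetaminophen as a list of atoms plus a list of bonds. The atom numbering, the bond-type labels, and the helper name `neighbors` are assumptions made for this example; hydrogens are left out.

```python
# Connection table for acetaminophen, CC(=O)Nc1ccc(O)cc1 (hydrogens omitted).
# The atom numbering is arbitrary and chosen only for this illustration.
atoms = {
    1: "C",   # methyl carbon
    2: "C",   # carbonyl carbon
    3: "O",   # carbonyl oxygen
    4: "N",   # amide nitrogen
    5: "C",   # ring carbon bonded to N
    6: "C",
    7: "C",
    8: "C",   # ring carbon bonded to the hydroxyl O
    9: "O",   # hydroxyl oxygen
    10: "C",
    11: "C",
}

# Each bond is (atom_a, atom_b, bond_type).
bonds = [
    (1, 2, "single"), (2, 3, "double"), (2, 4, "single"), (4, 5, "single"),
    (5, 6, "aromatic"), (6, 7, "aromatic"), (7, 8, "aromatic"),
    (8, 9, "single"), (8, 10, "aromatic"), (10, 11, "aromatic"),
    (11, 5, "aromatic"),
]


def neighbors(atom_index):
    """Return the sorted indices of the atoms bonded to atom_index."""
    found = []
    for a, b, _ in bonds:
        if a == atom_index:
            found.append(b)
        elif b == atom_index:
            found.append(a)
    return sorted(found)


print(neighbors(8))  # [7, 9, 10]
```

The atom list alone corresponds to the lookup table; adding the bond list is what makes it a connection table.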
STRUCTURE SEARCHING
• Structure searching relies on the Morgan algorithm.
• It numbers the atoms of a
molecule in a unique way.
• With the Morgan algorithm, atoms
of the same elemental type
can be identified as topologically
equivalent.
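The core idea can be sketched in a few lines. This is a simplified version of the extended-connectivity step only, assuming a molecule given as an atom count and a list of bonded atom pairs; the full Morgan algorithm goes on to break ties and assign a unique numbering, which is omitted here. The function name `morgan_ec` is made up for this example.

```python
def morgan_ec(n_atoms, bonds):
    """Return extended-connectivity (EC) values, one per atom.

    Each atom starts with its number of neighbours. In every round an
    atom's new value is the sum of its neighbours' values. Iteration
    stops when a round no longer increases the number of distinct values.
    """
    neighbors = {i: [] for i in range(n_atoms)}
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)

    ec = [len(neighbors[i]) for i in range(n_atoms)]
    n_classes = len(set(ec))
    while True:
        new_ec = [sum(ec[j] for j in neighbors[i]) for i in range(n_atoms)]
        new_classes = len(set(new_ec))
        if new_classes <= n_classes:
            return ec  # keep the last round that still improved
        ec, n_classes = new_ec, new_classes


# n-Pentane as a chain of five carbons: 0-1-2-3-4
pentane_bonds = [(0, 1), (1, 2), (2, 3), (3, 4)]
print(morgan_ec(5, pentane_bonds))  # [2, 3, 4, 3, 2]
```

In the pentane example the two end carbons share one value and the two inner carbons share another, which reflects their topological equivalence. Equal values are only a hint, though: in branched molecules, atoms that are not equivalent can still end up with the same value.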
SUBSTRUCTURE SEARCHING
• A substructure search involves
finding all the structures in a
database that contain one or
more particular structural
fragments.
• For example, we might want
to find all of the structures in a
database which contain the
nitro group.
• This is done with a subgraph isomorphism
algorithm.
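A minimal backtracking sketch of subgraph isomorphism is shown below, using the nitro group example. It is a deliberate simplification: atoms are matched by element symbol only, bond orders and charges are ignored, and the function name `has_substructure` and the tiny hand-written molecules are assumptions for illustration. Production systems use far more refined algorithms and screening steps.

```python
def has_substructure(mol_atoms, mol_bonds, pat_atoms, pat_bonds):
    """Return True if the pattern graph maps onto part of the molecule graph."""

    def adjacency(n, bonds):
        adj = {i: set() for i in range(n)}
        for a, b in bonds:
            adj[a].add(b)
            adj[b].add(a)
        return adj

    mol_adj = adjacency(len(mol_atoms), mol_bonds)
    pat_adj = adjacency(len(pat_atoms), pat_bonds)

    def extend(mapping):
        k = len(mapping)  # index of the next pattern atom to place
        if k == len(pat_atoms):
            return True   # every pattern atom has a partner
        for cand in range(len(mol_atoms)):
            if cand in mapping.values():
                continue  # molecule atom already used
            if mol_atoms[cand] != pat_atoms[k]:
                continue  # element symbols must agree
            # Every already-placed pattern neighbour must also be bonded
            # to the candidate in the molecule.
            if all(mapping[p] in mol_adj[cand]
                   for p in pat_adj[k] if p in mapping):
                mapping[k] = cand
                if extend(mapping):
                    return True
                del mapping[k]  # backtrack
        return False

    return extend({})


# Nitro group pattern: one N bonded to two O atoms.
nitro = (["N", "O", "O"], [(0, 1), (0, 2)])

nitromethane = (["C", "N", "O", "O"], [(0, 1), (1, 2), (1, 3)])
methylamine = (["C", "N"], [(0, 1)])

print(has_substructure(*nitromethane, *nitro))  # True
print(has_substructure(*methylamine, *nitro))   # False
```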
• SIMILARITY SEARCHING:
– Similarity searching involves looking for all the structures in a database
that are highly similar to a given structure.
• QUANTITATIVE STRUCTURE-ACTIVITY / STRUCTURE-PROPERTY
RELATIONSHIPS (QSAR/QSPR)
– Building on work by Hammett and Taft in the fifties, Hansch and Fujita
showed in 1964 that the influence of substituents on biological activity
data can be quantified.
– It is based on the assumption that the activity of a molecule is related to its
structure.
– For example, during drug design, the following points have to be
considered:
  • Single therapeutic target
  • Drug-like chemical
  • Some toxicity anticipated
  • Multiple unknown targets
  • Diverse structures
  • Human and ecosystems
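For the similarity searching described on this slide, the slides do not name a particular similarity measure. One widely used choice is the Tanimoto coefficient computed on structural fingerprints, and the sketch below assumes that choice. Fingerprints are represented as plain sets of "on" bit positions; the function names, the threshold value, and the made-up bit sets are all hypothetical.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of bit positions."""
    if not fp_a and not fp_b:
        return 1.0  # convention chosen here: two empty fingerprints are identical
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def similarity_search(query_fp, database, threshold=0.7):
    """Return (name, score) pairs with score >= threshold, best first."""
    hits = [(name, tanimoto(query_fp, fp)) for name, fp in database.items()]
    return sorted((h for h in hits if h[1] >= threshold), key=lambda h: -h[1])


# Made-up fingerprints, for illustration only.
database = {
    "compound_a": {1, 2, 3, 4},
    "compound_b": {1, 2, 3},
    "compound_c": {9},
}
print(similarity_search({1, 2, 3, 4}, database))
# [('compound_a', 1.0), ('compound_b', 0.75)]
```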
CHEMOMETRICS
• Chemometrics is the analysis of chemical data.
• Methods include multilinear regression analysis.
• Pattern recognition methods were introduced
in the seventies to analyze chemical data.
• In the nineties, artificial neural networks
gained prominence for analyzing chemical
data.
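To make the regression idea concrete, here is a least-squares fit in its simplest form, with a single descriptor instead of several. It is only a sketch: the function name `fit_line` is hypothetical, and the descriptor and activity values are synthetic numbers invented for the example, not measured data.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one descriptor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic data: a descriptor value and an "activity" for four compounds.
descriptor = [1.0, 2.0, 3.0, 4.0]
activity = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(descriptor, activity)
predicted = slope * 5.0 + intercept  # estimate for a new compound
print(slope, intercept, predicted)
```

Multilinear regression extends the same principle to several descriptors per compound, which is the form typically used in QSAR work.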
MOLECULAR MODELING
• Commonly available software packages for
molecular modeling include ArgusLab, Chimera,
and Ghemical.
• R. Langridge and coworkers visualized 3D molecular models on the
screens of cathode ray tubes.
• G. Marshall visualized protein structures on graphics
screens.
TOPOLOGICAL REPRESENTATIONS
• Structure representations are needed to perceive features such as
rings and aromaticity, and to treat stereochemistry, 3D
structures, or molecular surfaces.
• LINEAR NOTATIONS
• In SMILES, atoms are generally represented by
their chemical symbol, with upper-case
representing an aliphatic atom (C = aliphatic
carbon, N = aliphatic nitrogen, etc.) and lower-case
representing an aromatic atom (c =
aromatic carbon, etc.).
• Hydrogens are not normally represented
explicitly.
• Consecutive atom symbols represent atoms bonded
together with a single bond. Therefore, the
SMILES for propane is simply CCC, and for
1-propanol it is CCCO.
• Double bonds are represented by an “=” sign,
e.g. propene is C=CC.
• Parentheses are used to represent branching in
the molecule, e.g. the SMILES for isopropyl
alcohol (2-propanol) is CC(O)C.
• Atoms other than the major organic ones (B, C,
N, O, P, S, F, Cl, Br, I), as well as ions, must be enclosed in
square brackets.
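The rules above can be turned into a small reader that converts a SMILES string into a list of atoms and bonds. This is a toy sketch that handles only what these rules cover (organic-subset atoms, single and double bonds, and parenthesized branches); aromatic lower-case atoms, ring closures, and bracket atoms are rejected. The function name `parse_simple_smiles` is invented for this example.

```python
import re

# Two-letter symbols come first so "Cl" is not read as "C" followed by "l".
TOKEN = re.compile(r"Cl|Br|[BCNOPSFI]|[=()]")


def parse_simple_smiles(smiles):
    """Return (atoms, bonds) for a restricted SMILES string.

    atoms is a list of element symbols; bonds is a list of
    (atom_index_a, atom_index_b, bond_order) tuples.
    """
    tokens = TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError("unsupported SMILES feature in: " + smiles)

    atoms, bonds = [], []
    prev = None    # atom the next atom will be bonded to
    order = 1      # bond order for the next bond
    stack = []     # open branch points
    for tok in tokens:
        if tok == "(":
            stack.append(prev)       # remember where the branch starts
        elif tok == ")":
            prev = stack.pop()       # return to the branch point
        elif tok == "=":
            order = 2
        else:
            atoms.append(tok)
            idx = len(atoms) - 1
            if prev is not None:
                bonds.append((prev, idx, order))
            prev, order = idx, 1
    return atoms, bonds


print(parse_simple_smiles("CC(O)C"))
# (['C', 'C', 'O', 'C'], [(0, 1, 1), (1, 2, 1), (1, 3, 1)])
print(parse_simple_smiles("C=CC"))
# (['C', 'C', 'C'], [(0, 1, 2), (1, 2, 1)])
```

The output has the same shape as a connection table: the branch in 2-propanol shows up as two bonds that both start at atom 1.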
TOOLS USED FOR CHEMINFORMATICS
• ISIS-Draw
• ChemAxon
• ChemDraw
• ChemSketch
APPLICATION OF
CHEMINFORMATICS
1. Chemical Information
• storage and retrieval of chemical structures and associated data, to
manage the flood of data; software is available for structure drawing and
for databases
• dissemination of data on the internet
• cross-linking of data to information
2. All fields of chemistry
• prediction of the physical, chemical, or biological properties of compounds
3. Analytical Chemistry
• analysis of data from analytical chemistry to make predictions on the
quality, origin, and age of the investigated objects
• elucidation of the structure of a compound based on spectroscopic data
4. Organic Chemistry
• prediction of the course and products of organic reactions
• design of organic syntheses
5. Design of Drugs and Other Bioactive Molecules
• identification of new lead structures
• optimization of lead structures
• establishment of quantitative structure-activity
relationships
• comparison of chemical libraries
• definition and analysis of structural diversity
• planning of chemical libraries
• analysis of high-throughput data
• docking of a ligand into a receptor
CONCLUSION
• Identifying and understanding the structural and functional behaviour of
chemical compounds and biomolecules
• Relating molecular structures to desirable properties
• Determining 3D structures
BIBLIOGRAPHY
BOOK
• Johann Gasteiger and Thomas Engel, Chemoinformatics: A Textbook,
Wiley-VCH, 2003
PDFS
• Md. Wasim Aktar and Sidhu Murmu, Chemoinformatics: Principles and Applications
• B. Firdaus Begam and Dr. J. Satheesh Kumar, A Study on Cheminformatics
and its Applications on Modern Drug Discovery
INTERNET RESOURCES
• http://booksite.elsevier.com/brochures/compchemometrics/PDF/Chemoinformatics.pdf
• http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.98.4831&rep=rep1&type=pdf
• http://accolade-ltd.com/chemoinformatics-principles-and-applications/
• https://www.acs.org/content/acs/en/careers/college-to-career/chemistry-careers/cheminformatics.html