This document provides an overview of chemoinformatics. It begins with an introduction to chemoinformatics, defining it as a field that uses computers to store and analyze large quantities of chemical data. It notes that central concepts in chemoinformatics like QSAR have been used for over 30 years, but the field is now playing a larger role in drug discovery. The document then discusses the differences between chemoinformatics, bioinformatics, and molecular modeling. It provides examples of important chemical databases and methods for searching databases and predicting compound properties in chemoinformatics.
Lecture 12 – Chemoinformatics
1. Lecture 12 – Chemoinformatics
BTT-516 – Drug Designing and Development
2. Topics to be covered today
1. Introduction to Chemoinformatics
2. Difference between Chemoinformatics and Bioinformatics
3. Databases in Cheminformatics
4. Homework
3. Chemoinformatics
Cheminformatics is a field of information technology that uses computers and
computer programs to facilitate the collection, storage, analysis, and
manipulation of large quantities of chemical data.
Chemical data includes chemical formulas, chemical structures, chemical
properties, chemical spectra, and biochemical or biological activities. The term
cheminformatics, an abbreviated form of “chemical informatics,” was coined by
Frank Brown (Brown, 1998).
4. Central concepts behind cheminformatics, such as quantitative structure–activity
relationships (QSARs) and compound property prediction, have been around for
more than 30 years. Until recently cheminformatics was a relatively obscure
discipline with a comparatively small academic or industrial presence. However,
with the advent of high throughput drug screening and the need for million-
compound chemical libraries, cheminformatics is now playing a key role in many
aspects of drug discovery and drug development.
Cheminformatics is also playing a vital role in emerging fields such as chemical
genomics (Yang et al., 2006), systems biology (Schnackenberg and Beger, 2006), and
metabolomics (Schlotterbeck et al., 2006; Wishart et al., 2006).
Cheminformatics (as it is known in North America), or chemoinformatics (as it is
known in Europe and the rest of the world), is a close cousin to bioinformatics.
6. Databases in Cheminformatics
There are three types of cheminformatic databases:
(1) Archival or “global” compound databases, e.g., PubChem
(2) Specialized or highly curated databases
(3) Structural databases
In the second category of databases (curated or highly annotated) are a number of
smaller, more specialized resources. A partial list of these databases includes KEGG
(Kanehisa et al., 2006), MetaCyc (Caspi et al., 2006), DrugBank (Wishart et al., 2006),
Pharmabase (http://www.pharmabase.org), TTD (Chen et al., 2002), HMDB (Wishart
et al., 2007), ChEBI (Brooksbank et al., 2005), and PharmGKB (Hewett et al., 2002).
The third category of cheminformatic databases consists of structural databases
containing 3-D coordinate data. Some databases, such as the Cambridge
Structural Database (http://www.ccdc.cam.ac.uk), contain the 3-D coordinates
of chemical structures that were determined experimentally.
7. Database Searching in Cheminformatics
In cheminformatics, there are a number of methods, analogous to sequence searching
in bioinformatics, for performing both “sequence” (i.e., string) and structure
matching against large chemical compound libraries.
Thanks to the development of standardized text representations of chemical compounds
through InChI (IUPAC International Chemical Identifier) strings and SMILES strings, it
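is possible to give every chemical a unique character string (for SMILES, this
requires a canonical form, since one compound can be written in several ways).
InChI and canonical SMILES strings uniquely define chemical compounds, much like a gene or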
protein can be uniquely defined by its sequence. As a result, if a chemical database such as
PubChem, ZINC, or DrugBank is converted into a collection of SMILES strings or InChI
identifiers, it is then possible to use character string comparison to do compound
matching.
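The string-matching idea above can be sketched in a few lines of Python. This is only an illustration: the three-entry library and the function name find_compound are hypothetical, not an export or API of PubChem, ZINC, or DrugBank. It assumes that every SMILES string, in the library and in the query, has already been canonicalized by the same toolkit; without that step, two valid spellings of one compound will not match.

```python
# Toy compound library keyed by SMILES string (illustrative entries only,
# assumed to have been canonicalized by one and the same procedure).
LIBRARY = {
    "CCO": "ethanol",
    "c1ccccc1": "benzene",
    "CC(=O)Oc1ccccc1C(=O)O": "aspirin",
}


def find_compound(smiles, library=LIBRARY):
    """Return the name stored under an identical SMILES key, else None."""
    return library.get(smiles.strip())


print(find_compound("CCO"))  # ethanol
print(find_compound("OCC"))  # None: also ethanol, but spelled differently
```

The second lookup shows why canonicalization matters: "OCC" is a valid SMILES string for ethanol, yet a plain character comparison cannot tell that it names the same compound as "CCO".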
Several Web-based conversion sites, including the Molecular Structure File Converter
(http://iris12.colby.edu/~www/sconv.cgi), the Cactus Structure File Converter
(http://cactus.nci.nih.gov/services/translate/), and the InChI converter
(http://inchi.info/converter_en.html), are now available to facilitate conversion between
MOL, SDF, PDB, SMILES, and InChI formats.
8. Property Prediction in Cheminformatics
Compound property prediction is something common to both bioinformatics
and cheminformatics software. In bioinformatics, the compounds being
analyzed are typically macromolecules such as peptides, proteins, RNA, or
DNA.
In cheminformatics, the compounds being analyzed are usually small molecule
drugs, drug leads, toxins, or metabolites.
In cheminformatics, the properties of interest include electronic or charge distribution,
preferred conformations, heats of formation, solubility, LogP, pKa, refractivity,
melting point, molecule length, molecular area, molecular volume, and reactive groups.
Some of these chemical properties, such as solubility, LogP, and charge, are
particularly relevant to understanding or predicting the absorption,
distribution, metabolism, excretion, and toxicity (ADMET) of drug compounds
(Hansch and Zhang, 1993; Hou and Xu, 2003).
9. Homology modeling
• Homology modeling, also known as comparative modeling of proteins, refers to
constructing an atomic-resolution model of the "target" protein from its amino
acid sequence and an experimental three-dimensional structure of a related
homologous protein (the "template").
10. Lennard-Jones potential
• The Lennard-Jones potential is a simplified model that nevertheless describes
the essential features of interactions between simple atoms and molecules: two
interacting particles repel each other at very close distance, attract each
other at moderate distance, and do not interact at infinite distance.
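The three regimes described above can be checked numerically with a short Python sketch. The slide does not state a formula, so this assumes the commonly used 12-6 form, V(r) = 4ε[(σ/r)^12 − (σ/r)^6], where ε is the depth of the energy well and σ is the distance at which the potential crosses zero. The function name lennard_jones and the default values ε = σ = 1 (reduced units) are choices made for this illustration.

```python
def lennard_jones(r, epsilon=1.0, sigma=1.0):
    """12-6 pair potential: V(r) = 4*epsilon*((sigma/r)**12 - (sigma/r)**6)."""
    if r <= 0:
        raise ValueError("distance r must be positive")
    sr6 = (sigma / r) ** 6          # attractive term (sigma/r)^6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


# Energy minimum of the 12-6 form lies at r = 2^(1/6) * sigma.
r_min = 2 ** (1 / 6)

# Sample the short-range, zero-crossing, minimum, mid-range and long-range regions.
for r in (0.9, 1.0, r_min, 2.0, 10.0):
    print(f"r = {r:.3f}  V = {lennard_jones(r):+.4f}")
```

In reduced units the potential is positive (repulsive) for r below σ, reaches its minimum of −ε at r = 2^(1/6)σ, and approaches zero from below as r grows, which matches the repel, attract, and no-interaction behavior listed on the slide.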
11. Thank you
Er. Rajan Rolta
Faculty of Applied Sciences and Biotechnology
Shoolini University,
Village Bhajol, Solan (H.P)
+91-7018792621 (Mob No.)
rajanrolta@shooliniuniversity.com