EBI is an Outstation of the European Molecular Biology Laboratory.
Small Molecules in Bioinformatics
EBI Bioinformatics Roadshow
Dr. Louisa Bellis, ChEMBL
Copenhagen, June 2011
Small molecule resources at the EBI
18.03.2024
Agenda
• Introduction
• Small molecule resources
• ChEMBL
• ChEBI
• Searching and browsing
• Hands-on Exercises
Small molecules participate in
all processes of life
What are Small Molecules?
• A small molecule is defined as a low molecular weight
organic compound.
• Most drugs are small molecules, which allows them to cross
cell membranes and be orally bioavailable.
• They are also able to bind to proteins and enzymes,
thereby altering function, which can lead to a therapeutic
effect.
Some common small molecules: amino acids
Metabolism
Adenosine 5’-triphosphate (ATP): the
"molecular unit of currency" of intracellular
energy transfer.
• generated in the cell by energy-consuming processes, broken down by
energy-releasing processes
• many proteins that bind ATP do so through a characteristic protein fold known as the
Rossmann fold, a general nucleotide-binding structural domain that
can also bind the cofactor NAD
Adenosine 5'-triphosphate
Enzymes
• Enzyme inhibitors are molecules that bind to enzymes and
decrease their activity.
• Many drugs are enzyme inhibitors.
They are also used as herbicides
and pesticides.
• Enzyme activators bind to enzymes and increase their enzymatic
activity.
• Enzyme activators are often involved in the allosteric regulation of
enzymes in the control of metabolism.
Clavulanic acid acts as a suicide inhibitor of bacterial β-lactamase enzymes.
Pathways
http://www.genome.jp/kegg-bin/highlight_pathway?scale=1.0&map=map00231&keyword=tryptophan
Drug types 2003 - 2009
'Small molecules' in various shades of blue (http://chembl.blogspot.com/)
Small Molecule Databases
• Small molecule databases can be used to:
• Investigate historical compounds and associated bioactivity data.
• Give fresh insight into previously rejected drugs.
• Create Structure-Activity Relationships (SARs)
• Look at how changing a functional group can change the
biological activity of a compound – before you start your own
synthesis.
• Direct synthesis
• Could reduce number of compounds made – if any similar
compounds have significant toxicity or unfavourable binding data,
you can save time by not making analogues.
• Direct end product testing
• Suggest what testing could be carried out – the database can
give you an idea of what testing has given ‘good’ (i.e. clear)
results.
• Reduce number of compounds put through High Throughput
Screening (HTS).
ChEBI and ChEMBL
What is ChEBI?
• Chemical Entities of Biological Interest
• Freely available
• Focused on ‘small’ chemical entities (no proteins or
nucleic acids)
• Illustrated dictionary of chemical nomenclature
• High quality, manually annotated
• Provides chemical ontology
Access ChEBI at http://www.ebi.ac.uk/chebi/
ChEBI home page
ChEBI data overview
• Visualisation (structure image)
• Nomenclature: caffeine; 1,3,7-trimethylxanthine; methyltheobromine
• Chemical data: Formula: C8H10N4O2; Charge: 0; Mass: 194.19
• Ontology: metabolite; CNS stimulant; trimethylxanthines
• Database Xrefs: MSDchem: CFF; KEGG DRUG: D00528
• Chemical Informatics:
InChI=1/C8H10N4O2/c1-10-4-9-6-5(10)7(13)12(3)8(14)11(6)2/h4H,1-3H3
SMILES: CN1C(=O)N(C)c2ncn(C)c2C1=O
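The mass in the chemical data follows directly from the formula. A minimal sketch of that arithmetic, assuming rounded standard atomic weights and a simple formula with no brackets or charges (the function names are illustrative and not part of ChEBI):

```python
import re

# Rounded standard atomic weights; only the elements needed for this example.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}

def parse_formula(formula):
    """Return {element: count} for a simple formula such as 'C8H10N4O2'."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts

def average_mass(formula):
    """Sum the atomic weights over all atoms in the formula."""
    return sum(ATOMIC_MASS[el] * n for el, n in parse_formula(formula).items())

caffeine = parse_formula("C8H10N4O2")
mass = average_mass("C8H10N4O2")
print(caffeine, round(mass, 2))  # mass rounds to 194.19, as on the slide
```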
ChEBI – Chemical Entities of Biological Interest
ChEBI entry view
Chemical Structures
• Chemical structure may be
interactively explored
using the MarvinView applet
• Available in formats
• Image
• Molfile
• InChI and InChIKey
• SMILES
Automatic Cross-references
What is ChEMBL?
• Database of bioactive, drug-like small molecules.
• Contains 2D structures and calculated properties (logP, molecular
weight, Lipinski parameters, etc.)
• Contains abstracted bioactivity data, e.g. binding data
and IC50, from multiple primary scientific journals
• Covers about 30 years of compound synthesis and
testing
• Annotated FDA-approved drugs
Access ChEMBL at https://www.ebi.ac.uk/chembldb/
ChEMBL Main Search Page
Calculated properties
Drug information
Clickable structure
Structural Representations
Parent and Salt Forms
Database links
ChEBI Link:
This will take you back to ChEMBL
ChemSpider Links:
The link works both ways: ChEMBL links to ChemSpider, and ChemSpider links back to ChEMBL. The links are made on the Standard_Inchi.
Wikipedia Links:
We also have links with Wikipedia. These also use the Standard_Inchi as the common identifier and point to the Compound Report Card in ChEMBL.
The links are added by a ChemoBot and can be updated with each release, if required.
STRUCTURAL
REPRESENTATION
Stereoisomers
• Compounds that have the same molecular formula and
connectivity, but differ in the 3-dimensional orientation of their atoms.
• A tetrahedral carbon with 4 different molecular
groups/atoms attached is known as a chiral
centre.
Stereoisomerism Example - Thalidomide
• Caused thousands of deformities in babies across 46
countries between 1957 and 1961.
• The R isomer was effective against morning sickness, but the S
isomer was teratogenic.
• Sparked more tightly controlled laboratory practices
across the world.
Stereoisomers
• Where known, the stereochemistry of the compound is
noted in the structure and in the name.
• If a stereoisomer of an existing compound is submitted, it
is given a separate id number.
• If data are submitted for a mixture of two stereoisomers, the
mixture is also given a separate id number if the activity of
the individual compounds cannot be isolated.
• If you draw a structure without stereochemistry in the structure search,
you will receive data on all stereoisomers.
Ofloxacin, Levofloxacin and Dextrofloxacin
• Fluoroquinolone antibiotics
• Ofloxacin is a racemic (equal) mixture of Levo and Dextro
isomers.
• Levofloxacin is the more active stereoisomer
• Dextrofloxacin is the less active stereoisomer
• ChEMBL has data on each with separate bioactivities.
Tautomers (keto-enol form)
• Two forms readily interconvert via the migration of a
hydrogen to the adjacent oxygen and the swapping of a
single to a double bond, and vice versa.
• ChEMBL does not differentiate between different
tautomers.
• The preferred tautomeric structure is retained.
• ChEBI does differentiate and will store the separate
tautomers.
Salts
• About 50% of marketed drugs are formulated as salts to
aid in their activity.
• Some salts prevent the drug from being absorbed in the mouth.
• Some salts help the drug be activated in the intestines, rather
than the stomach.
• There are approx 53,450 ChEMBL compounds with salts.
• Bioactivity data is recorded against the parent drug and
against the salt.
• Therefore, it’s important to give these compounds
different ChEMBL ids.
Salt Example: Morphine
• Morphine can be administered with many different salts:
• Hydrochloride (HCl)
• Sulphate (SO4)
• Tartrate
• Acetate
• Citrate
• Methobromide (MeBr)
• Hydrobromide (HBr)
• Hydroiodide (HI)
• Lactate
• Chloride (Cl)
• Bitartrate
Dealing with Salts in ChEMBL
• Each compound, if in a salt form, is analysed and
matched to a ‘parent’ – i.e. the base form of the
compound. (Not inorganic compounds)
• For example, morphine hydrochloride (CHEMBL556578),
morphine sulfate (CHEMBL422878) and morphine sulfate
hydrate (CHEMBL1200603) are matched to their parent
morphine (CHEMBL70)
• This relationship is shown on the interface of the
compound page.
• Additionally, when you run a search for a compound,
only the parent form is returned in the results
grid.
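The salt-to-parent matching described above can be pictured as a simple lookup. The sketch below uses the ChEMBL ids quoted on the slide; the activity values and the function names are hypothetical, and this is not how ChEMBL itself stores the relationship.

```python
from collections import defaultdict

# Salt-to-parent links quoted on the slide.
PARENT_OF = {
    "CHEMBL556578": "CHEMBL70",   # morphine hydrochloride
    "CHEMBL422878": "CHEMBL70",   # morphine sulfate
    "CHEMBL1200603": "CHEMBL70",  # morphine sulfate hydrate
}

def parent_id(compound_id):
    """Return the parent id; a compound with no recorded parent is its own parent."""
    return PARENT_OF.get(compound_id, compound_id)

def group_by_parent(activities):
    """Group (compound_id, value) records under the parent compound."""
    grouped = defaultdict(list)
    for compound_id, value in activities:
        grouped[parent_id(compound_id)].append((compound_id, value))
    return dict(grouped)

# Hypothetical activity values, for illustration only.
records = [("CHEMBL70", 1.2), ("CHEMBL556578", 1.5), ("CHEMBL422878", 0.9)]
by_parent = group_by_parent(records)
print(by_parent)
```

Grouping under the parent corresponds to the "all forms" view, while filtering on a single salt id corresponds to the salt-specific view.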
Parents and Salts on the Compound Page
PARENT (compound report page)
SALTS (with hyperlinks beneath)
• Clicking on the Bioactivity Summary pie chart will give
you the bioactivity data for ALL forms of the compound
• To get salt specific bioactivity data, click on the hyperlink
beneath the salt form of interest to be taken to its
compound page.
Morphine - All Data
Morphine HCl specific data
Naming and Classification
Chemical names
Common or trivial names are those in widespread use.
Advantages of common names include
simplicity,
pronounceability and
universal recognition.
The main disadvantage is ambiguity – the same common
name may refer to more than one type of chemical.
Systematic names
A systematic name is one which corresponds to the chemical
structure such that the structure can be determined from the
name, e.g. 1,2-dimethyl-naphthalene
Software packages exist which can generate structures from
the systematic names (e.g. ACD/Name, ChemOffice,
MarvinSketch).
More than one correct systematic name can be assigned to the
same molecular structure, depending on the manner in which
naming rules are applied.
Examples of common and systematic names
Common names: caffeine; guaranine; theine
Systematic names: 1,3,7-trimethyl-3,7-dihydro-1H-purine-2,6-dione; 7-methyltheophylline; 1,3,7-trimethyl-2,6-dioxopurine
SEARCHING IN CHEBI
Why?
• Ontological data
• Structure classification
• Chemical entity, e.g. hydrocarbon
• Role, e.g. ligand
• Subatomic particle, e.g. electron
• Links to other databases
• KEGG
• DrugBank
• PDBeChem
• Citations
How?
Text-based
Drawing
The ChEBI ontology
Organised into three sub-ontologies, namely
• Molecular structure ontology
• Subatomic particle ontology
• Role ontology
(R)-adrenaline
Molecular structure ontology
Role ontology
ChEBI ontology relationships
• Generic ontology relationships
• Chemistry-specific relationships
Viewing ChEBI ontology
Simple and advanced text search
Narrow by category
AND, OR and BUT NOT
Structure search
Search options
Structure drawing tools
Search Results
Click to go to compound page
Hover over for search menu
Types of structure search
• Identity – based on InChI
• Substructure – uses fingerprints to narrow search range, then
performs full substructure search algorithm
• Similarity – based on Tanimoto coefficient calculated between the
fingerprints
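Example InChI (water): InChI=1/H2O/h1H2
Example fingerprints: 1010110111 (7 bits set) and 0010110010 (4 bits set), with 4 bits set in both
Tanimoto(a,b) = c / (a+b-c) = 4 / (4+7-4) = 0.57
where a and b are the numbers of bits set in each fingerprint and c is the number of bits set in both.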
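The similarity calculation and the fingerprint pre-screen used for substructure search can both be sketched on the two example bit strings from the slide. This is a minimal illustration with plain bit strings, not the fingerprinting scheme ChEBI actually uses; the function names are made up for the example.

```python
def bits(fingerprint):
    """Convert a '0'/'1' string into an integer bit mask."""
    return int(fingerprint, 2)

def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient c / (a + b - c) on two bit-string fingerprints."""
    x, y = bits(fp_a), bits(fp_b)
    a = bin(x).count("1")      # bits set in the first fingerprint
    b = bin(y).count("1")      # bits set in the second fingerprint
    c = bin(x & y).count("1")  # bits set in both
    return c / (a + b - c) if (a + b - c) else 0.0

def may_contain(query_fp, candidate_fp):
    """Substructure pre-screen: every query bit must also be set in the candidate."""
    q, cand = bits(query_fp), bits(candidate_fp)
    return (q & cand) == q

score = tanimoto("1010110111", "0010110010")
print(round(score, 2))                            # 0.57
print(may_contain("0010110010", "1010110111"))    # True
```

A candidate that passes `may_contain` is only a possible match; the full substructure algorithm still has to confirm it, which is why the screen is used to narrow the search range first.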
Browse via Periodic Table
Molecular entities / Elements
Navigate via links in ontology
Click to follow links
CHEBI SEARCH EXAMPLE
ChEBI example
• Search for ‘Glycine’
• What is the ChEBI ID for this? 15428
• Is it available as a KEGG compound? Yes
• What are the IUPAC names? Glycine, aminoacetic acid
• What is ‘glycine zwitterion’? It is a tautomer of glycine
SEARCHING IN CHEMBL
How to search in ChEMBL:
• Keywords
• Compound name – dopamine, haloperidol
• Assay name – cytotoxicity, liver hepatotoxicity
• Target – RAF-1, IRAK-4
• Structure
• BLAST search – FASTA sequence from UniProt
• Protein or taxonomy hierarchy
Where to search:
Using the search field (found on main page):
• Best for single words
• E.g. ‘dopamine’, ‘Muscarinic’
• Looks for matching text in compound name, key or
synonym
• 3-o-methyl-alpha-methyldopamine
• Muscarinic receptor 4
• Needs an exact match
• Can’t use wildcards, e.g. ‘%’, ‘?’…
Using the Protein Sequence Search
• Useful for searching for a specific protein or a protein
from the same family
• The results show a percentage similarity
to the input sequence.
• An exact match will give 100%.
• The same target from a different organism will give ~90%
Compound Drawing
• Can draw the full structure of
interest or a partial structure
• Using the Substructure
Search you can find
compounds containing your
partial structure
• Using the Similarity Search,
you can find similar
compounds – based on a
percentage score (70-100%)
DOWNLOAD AND ANALYSIS
OF CHEMBL RESULTS
• The compounds can be downloaded as an *.SDFile.
• The bioactivity data can be downloaded as *.XLS
CHEMBL WORKED EXAMPLE
STRUCTURE ACTIVITY
RELATIONSHIPS
Drug design
• Ligand-based: relies on knowledge of other molecules that bind to the
biological target of interest.
• Structure-based: relies on knowledge of the 3D structure of the biological
target.
• A lead has
• evidence that modulation of the target will have therapeutic value: e.g. disease
linkage studies showing associations between mutations in the biological target
and certain disease states.
• evidence that the target is druggable, i.e. capable of binding to a small molecule
and that its activity can be modulated by the small molecule.
• Target is cloned and expressed, then libraries of potential drug compounds
are screened using screening assays
Drug Discovery Process
[Figure: drug discovery pipeline and the data held in the ChEMBL database]
Data funnel: > 2,900,000 bioactivities; > 600,000 compounds; ~30,000 distinct lead series; ~12,000 candidates; ~2000 drugs
Stages: Target Discovery; Lead Discovery; Lead Optimisation; Preclinical Development; Phase 1; Phase 2; Phase 3; Launch
Activities, in pipeline order: Target identification; Microarray profiling; Target validation; Assay development; Biochemistry; Clinical/Animal disease models; High-throughput Screening (HTS); Fragment-based screening; Focused libraries; Screening collection; Medicinal Chemistry; Structure-based drug design; Selectivity screens; ADMET screens; Cellular/Animal disease models; Pharmacokinetics; Toxicology; In vivo safety pharmacology; Formulation; Dose prediction
Clinical trials and launch: PK tolerability; Efficacy; Safety & Efficacy; Indication Discovery & expansion
Overall phases: Discovery; Development; Use
ChEMBL database content: Med. Chem. SAR; Clinical Candidates; Drugs
SAR Data
Compound (structure image)
Assay: Ki = 4.5 nM
>Thrombin
MAHVRGLQLPGCLALAALCSLVHSQHVFLAPQQARSLLQRVRRANTFLEEVRKGNLERECVEETCSY
EEAFEALESSTATDVFWAKYTACETARTPRDKLAACLEGNCAEGLGTNYRGHVNITRSGIECQLWRS
RYPHKPEINSTTHPGADLQENFCRNPDSSTTGPWCYTTDPTVRRQECSIPVCGQDQVTVAMTPRSEG
SSVNLSPPLEQCVPDRGQQYQGRLAVTTHGLPCLAWASAQAKALSKHQDFNSAVQLVENFCRNPDGD
EEGVWCYVAGKPGDFGYCDLNYCEEAVEEETGDGLDEDSDRAIEGRTATSEYQTFFNPRTFGSGEAD
CGLRPLFEKKSLEDKTERELLESYIDGRIVEGSDAEIGMSPWQVMLFRKSPQELLCGASLISDRWVL
TAAHCLLYPPWDKNFTENDLLVRIGKHSRTRYERNIEKISMLEKIYIHPRYNWRENLDRDIALMKLK
KPVAFSDYIHPVCLPDRETAASLLQAGYKGRVTGWGNLKETWTANVGKGQPSVLQVVNLPIVERPVC
KDSTRIRITDNMFCAGYKPDEGKRGDACEGDSGGPFVMKSPFNNRWYQMGIVSWGEGCDRDGKYGFY
THVFRLKKWIQKVIDQFGE
APTT 11 min
Target, Compound, Bioactivity
Current Data Content (ChEMBL_10)
• Abstracted from 40,623 papers from 27 journals
• Ongoing curation and clean-up of all data
• 785,746 compound records
• 639,570 distinct compound structures
• 8,371 targets
• 5,190 protein molecular targets
• Over 3,200,000 experimental bioactivities
• binding measurements, functional assays and ADMET
ChEMBL Assay Data
• ChEMBL contains >3 million data points relating compounds to
targets or effects.
• These activities come from ~500K assays reported in
medicinal chemistry literature.
• Assays can be classified as:
• functional assay endpoints, e.g., vasodilation
• binding measurements, e.g., IC50
• ADME/toxicity data, e.g., LD50
Chart labels: Functional, Binding, ADMET; chart values: 55, 29, 16
Compound Properties and Selectivity
• Stores a wide range of calculated compound properties
(e.g., mol wt, logP, RO5 violations)
• Can be used to identify compounds most likely to have good in
vivo properties (Absorption, Distribution, Metabolism, Excretion)
• Contains activity information against liability targets (e.g.,
cytochrome P450s, HERG K+ channel)
• If compounds have been tested in these assays, can avoid those
with potential toxicity issues
• Contains data on a wide range of targets
• If compounds have been tested against multiple targets, can get
an idea of their selectivity (important for validation studies)
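As an illustration of how calculated properties can be used to flag compounds, the sketch below counts rule-of-five (RO5) violations using the usual Lipinski thresholds. The compound names and property values are hypothetical, and this is not ChEMBL's own calculation.

```python
def ro5_violations(mol_wt, logp, h_donors, h_acceptors):
    """Count Lipinski rule-of-five violations for one compound."""
    checks = [
        mol_wt > 500,       # molecular weight above 500 Da
        logp > 5,           # calculated logP above 5
        h_donors > 5,       # more than 5 hydrogen-bond donors
        h_acceptors > 10,   # more than 10 hydrogen-bond acceptors
    ]
    return sum(checks)

# Hypothetical property values, for illustration only.
compounds = {
    "example_small": dict(mol_wt=194.19, logp=-0.1, h_donors=0, h_acceptors=6),
    "example_large": dict(mol_wt=720.5, logp=6.3, h_donors=4, h_acceptors=12),
}
violations = {name: ro5_violations(**props) for name, props in compounds.items()}
print(violations)
```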
Why Use SARs?
• A compound's chemical structure determines its physical and
biological characteristics.
• Small changes to the structure can have a large impact
on activities.
• Understanding what changes have the greatest/least
effect can aid in drug design.
• Using the many available databases that contain this
information reduces the time and money spent on synthesis of
potential drug compounds.
Example:
1. You are interested in creating a compound to target
IRAK4 and the compound must have an aniline core
structure.
2. Run a search for IRAK4 and download all of the
compounds as an SDFile and all of the IC50 data as a
text file.
3. Combine the compounds and data into one SDFile.
4. Analyse the SAR data with an external program.
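Step 3 above can be sketched with the standard library alone. The example assumes that each SD file record carries the compound id in a data item named `chembl_id` and that the text file is tab-delimited with `chembl_id` and `ic50_nM` columns; those names, the ids and the placeholder molblocks are all hypothetical and will differ in a real download.

```python
import csv
import io

def split_sd_records(sd_text):
    """Split SD file text into records; each record ends with a '$$$$' line."""
    records, current = [], []
    for line in sd_text.splitlines():
        if line.strip() == "$$$$":
            records.append(current)
            current = []
        else:
            current.append(line)
    return records

def read_field(record, name):
    """Return the first value line of the data item '> <name>', or None."""
    for i, line in enumerate(record):
        if line.startswith(">") and f"<{name}>" in line and i + 1 < len(record):
            return record[i + 1].strip()
    return None

def merge_ic50(sd_text, ic50_text, id_field="chembl_id"):
    """Append an IC50_nM data item to every record that has a matching id."""
    reader = csv.DictReader(io.StringIO(ic50_text), delimiter="\t")
    ic50 = {row[id_field]: row["ic50_nM"] for row in reader}
    out = []
    for record in split_sd_records(sd_text):
        compound_id = read_field(record, id_field)
        lines = list(record)
        if compound_id in ic50:
            lines += ["> <IC50_nM>", ic50[compound_id], ""]
        out.append("\n".join(lines + ["$$$$"]))
    return "\n".join(out) + "\n"

# Hypothetical two-compound SD file; the molblocks are empty placeholders.
MOLBLOCK = ["", "  hypothetical", "",
            "  0  0  0  0  0  0  0  0  0  0999 V2000", "M  END"]
SD_TEXT = "\n".join(
    MOLBLOCK + ["> <chembl_id>", "EXAMPLE_1", "", "$$$$"]
    + MOLBLOCK + ["> <chembl_id>", "EXAMPLE_2", "", "$$$$"]
)
IC50_TEXT = "chembl_id\tic50_nM\nEXAMPLE_1\t12\nEXAMPLE_2\t450\n"

merged = merge_ic50(SD_TEXT, IC50_TEXT)
print(merged)
```

The merged text can then be saved as a single SD file and loaded into whichever external program is used for the SAR analysis.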
• There are over 3,000,000 data points in ChEMBL
• Difficult to manually look through them all
• Pipeline Pilot™ is used in the ChEMBL team to visualise
large amounts of data.
• SAR grids can be created using downloaded structures
and associated bioactivity data.
Sort the data by
ascending IC50 value
Simple SAR
• A compound with an IC50 < 100 nM for a target is
considered to be ‘good’.
• Search for IRAK4 and filter for IC50 < 100nM.
• Download the filtered bioactivities as an XLS spreadsheet
(26 bioactivities).
• Extract the list of ChEMBL_IDs from the spreadsheet and
paste them into the search box (24 ids).
• Run same search and filter on the bioactivities of IC50 >
100nM (96 bioactivities).
• Download the bioactivity data and extract the list of
ChEMBL_IDs (7 ids).
• These 7 compounds are ‘potentially’ selective for IRAK4,
i.e. potent against IRAK4 but not against any other target tested.
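One reading of the filtering steps above is a pair of set operations on the downloaded bioactivities. The sketch below assumes records of the form (compound id, target, IC50 in nM); the ids, targets and values are invented for illustration and are not ChEMBL data.

```python
POTENCY_CUTOFF_NM = 100  # the slide's threshold for a 'good' IC50

# Hypothetical records: (compound_id, target, IC50 in nM).
bioactivities = [
    ("CPD_1", "IRAK4", 12), ("CPD_1", "TARGET_X", 3500),
    ("CPD_2", "IRAK4", 40), ("CPD_2", "TARGET_X", 25),
    ("CPD_3", "IRAK4", 900),
    ("CPD_4", "IRAK4", 75),
]

def potent_compounds(records, target):
    """Ids with at least one IC50 below the cutoff against the given target."""
    return {cid for cid, tgt, ic50 in records
            if tgt == target and ic50 < POTENCY_CUTOFF_NM}

def potent_elsewhere(records, target):
    """Ids with an IC50 below the cutoff against any other target."""
    return {cid for cid, tgt, ic50 in records
            if tgt != target and ic50 < POTENCY_CUTOFF_NM}

hits = potent_compounds(bioactivities, "IRAK4")
selective = hits - potent_elsewhere(bioactivities, "IRAK4")
# Sort the IRAK4 records by ascending IC50, most potent first.
ranked = sorted((r for r in bioactivities if r[1] == "IRAK4"), key=lambda r: r[2])
print(sorted(hits), sorted(selective), ranked[0])
```

Note that CPD_4 counts as selective only because nothing else was measured for it, which is why the result is best described as "potentially" selective.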
Specific to IRAK4
Specific to IRAK4 and Others
Downloads and programmatic access
Downloading ChEBI flavours
• All downloads come in two flavours
• 3 star only entries (manually annotated ChEBI
entries)
• 2 and 3 star entries (manually annotated ChEBI,
ChEMBL and user submissions)
Downloading ChEBI
• OBO file
• Use with OBO-Edit
• SDF File
• Compatible with chemistry software such as Bioclipse
• Flat file, tab delimited
• Import all the data into Excel
• Parse it into your own database structure
• Oracle binary dumps
• Import into an Oracle database
• Generic SQL insert statements
• Import into a MySQL or PostgreSQL database
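For readers who prefer to script against the OBO download rather than open it in an editor, the sketch below reads `[Term]` stanzas and follows `is_a` links upwards. It handles only the `id`, `name` and `is_a` tags, and the sample terms and ids are invented placeholders rather than real ChEBI entries.

```python
def parse_obo_terms(obo_text):
    """Return {term_id: {'name': str, 'is_a': [parent_ids]}} from OBO text."""
    terms, current, in_term = {}, None, False
    for raw in obo_text.splitlines():
        line = raw.strip()
        if line.startswith("["):            # a new stanza begins
            in_term = (line == "[Term]")
            current = None
        elif in_term and line.startswith("id: "):
            current = terms.setdefault(line[4:], {"name": "", "is_a": []})
        elif in_term and current is not None and line.startswith("name: "):
            current["name"] = line[6:]
        elif in_term and current is not None and line.startswith("is_a: "):
            # drop the trailing '! comment' that OBO allows after the id
            current["is_a"].append(line[6:].split("!")[0].strip())
    return terms

def ancestors(terms, term_id):
    """All terms reachable from term_id by following is_a links."""
    seen, stack = set(), list(terms[term_id]["is_a"])
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(terms.get(parent, {"is_a": []})["is_a"])
    return seen

# Invented sample in OBO syntax; not real ChEBI content.
OBO_TEXT = """format-version: 1.2

[Term]
id: EX:1
name: chemical entity

[Term]
id: EX:2
name: example class
is_a: EX:1 ! chemical entity

[Term]
id: EX:3
name: example compound
is_a: EX:2 ! example class

[Typedef]
id: has_role
name: has role
"""

terms = parse_obo_terms(OBO_TEXT)
print(terms["EX:3"]["name"], sorted(ancestors(terms, "EX:3")))
```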
The ChEBI web service
• Programmatic access to a ChEBI entry
• SOAP-based Java implementation
• Clients currently available in Java and Perl
• Methods
• getLiteEntity
• getCompleteEntity and getCompleteEntityByList
• getOntologyParents
• getOntologyChildren and getAllOntologyChildrenInPath
• getStructureSearch
• Documented at
http://www.ebi.ac.uk/chebi/webServices.do.
Downloading ChEMBL
• Frequent releases (approx monthly)
• SDFile
• Text
• MySQL
• Oracle
Help and Feedback
• Email addresses for support queries and feedback
• General questions and feedback on ChEMBL interface:
chembl-help@ebi.ac.uk
• Reporting of data errors:
chembl-data@ebi.ac.uk
• General questions, support and feedback on ChEBI
chebi-help@ebi.ac.uk
Thank you

 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 

louisa_bellis_small_molecules_copenhagen_roadshow.pptx

  • 1. EBI is an Outstation of the European Molecular Biology Laboratory. Small Molecules in Bioinformatics EBI Bioinformatics Roadshow Dr. Louisa Bellis, ChEMBL Copenhagen, June 2011
  • 2. Small molecule resources at the EBI 18.03.2024 2 Agenda • Introduction • Small molecule resources • ChEMBL • ChEBI • Searching and browsing • Hands-on Exercises
  • 3. Small molecules participate in all processes of life
  • 4. What are Small Molecules? • A small molecule is defined as a low molecular weight organic compound. • Most drugs are small molecules to allow passage over cell membranes and oral bioavailability. • They are also able to bind to proteins and enzymes, thereby altering function, which can lead to a therapeutic effect.
  • 6. Metabolism Adenosine 5'-triphosphate (ATP): the "molecular unit of currency" of intracellular energy transfer. • generated in the cell by energy-consuming processes, broken down by energy-releasing processes • many proteins that bind ATP do so in a characteristic protein fold known as the Rossmann fold, which is a general nucleotide-binding structural domain that can also bind the cofactor NAD
  • 7. Enzymes • Enzyme inhibitors are molecules that bind to enzymes and decrease their activity. • Many drugs are enzyme inhibitors. They are also used as herbicides and pesticides. • Enzyme activators bind to enzymes and increase their enzymatic activity. • Enzyme activators are often involved in the allosteric regulation of enzymes in the control of metabolism. • Example: clavulanic acid acts as a suicide inhibitor of bacterial β-lactamase enzymes.
  • 9. Drug types 2003 - 2009 'Small molecules' in various shades of blue (http://chembl.blogspot.com/)
  • 10. Small Molecule Databases • Small Molecule Databases can be used to: • Investigate historical compounds and associated bioactivity data. • Give fresh insight into previously rejected drugs. • Create Structure-Activity Relationships (SARs). • Look at how changing a functional group can change the biological activity of a compound – before you start your own synthesis.
  • 11. • Direct synthesis • Could reduce the number of compounds made – if any similar compounds have significant toxicity or unfavourable binding data, you can save time by not making analogues. • Direct end product testing • Suggest what testing could be carried out – the database can give you an idea of what testing has given ‘good’ (i.e. clear) results. • Reduce the number of compounds put through High Throughput Screening (HTS).
  • 12. ChEBI and ChEMBL
  • 13. What is ChEBI? • Chemical Entities of Biological Interest • Freely available • Focused on ‘small’ chemical entities (no proteins or nucleic acids) • Illustrated dictionary of chemical nomenclature • High quality, manually annotated • Provides chemical ontology Access ChEBI at http://www.ebi.ac.uk/chebi/
  • 14. ChEBI home page
  • 15. ChEBI data overview (example entry: caffeine) • Visualisation • Nomenclature: caffeine; 1,3,7-trimethylxanthine; methyltheobromine • Chemical data: Formula: C8H10N4O2; Charge: 0; Mass: 194.19 • Ontology: metabolite; CNS stimulant; trimethylxanthines • Database Xrefs: MSDchem: CFF; KEGG DRUG: D00528 • Chemical Informatics: InChI=1/C8H10N4O2/c1-10-4-9-6-5(10)7(13)12(3)8(14)11(6)2/h4H,1-3H3; SMILES: CN1C(=O)N(C)c2ncn(C)c2C1=O
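The formula and mass on the caffeine entry can be cross-checked with a few lines of Python. This is only a sketch: the `formula_mass` helper and the rounded atomic masses are assumptions made for illustration, not part of ChEBI, and the parser handles only simple formulae without brackets or charges.

```python
import re

# Rounded standard atomic masses (g/mol); only the elements needed here.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}

def formula_mass(formula: str) -> float:
    """Average molecular mass of a simple formula such as 'C8H10N4O2'."""
    total = 0.0
    # Each match is (element symbol, optional count); a missing count means 1.
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_MASS[symbol] * (int(count) if count else 1)
    return total

caffeine_mass = formula_mass("C8H10N4O2")
print(round(caffeine_mass, 2))  # 194.19, matching the mass shown on the slide
```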
  • 16. ChEBI entry view
  • 17. Chemical Structures • Chemical structure may be interactively explored using MarvinView applet • Available in formats • Image • Molfile • InChI and InChIKey • SMILES
  • 18. Automatic Cross-references
  • 19. What is ChEMBL? • Database of bioactive, drug-like small molecules. • Contains 2D structures, calculated properties (logP, mol weight, Lipinski etc) • Contains abstracted bioactivity data, e.g. binding data and IC50, from multiple primary scientific journals • Covers about 30 years of compound synthesis and testing • Annotated FDA-approved drugs Access ChEMBL at https://www.ebi.ac.uk/chembldb/
  • 20. ChEMBL Main Search Page
  • 22. Structural Representations
  • 24. Parent and Salt Forms • Database links
  • 25. ChEBI Link:
  • 26. This will take you back to ChEMBL
  • 27. ChemSpider Links: The links work both ways: ChEMBL links TO ChemSpider and ChemSpider links back to ChEMBL. Records are matched on Standard_Inchi.
  • 28. Wikipedia Links: We also have links with Wikipedia. These also use the Standard_Inchi as the common identifier. These links point to the Compound Report Card in ChEMBL. The links are added by a ChemoBot and can be updated with each release, if required.
  • 30. Stereoisomers • Compounds that have the same molecular formula and connectivity (the same atoms bonded in the same order), but differ in the 3-dimensional orientation of their atoms. • A central tetrahedral carbon with 4 different groups/atoms attached is known as a chiral centre.
  • 31. Stereoisomerism Example - Thalidomide • Caused thousands of deformities in babies across 46 countries between 1957 and 1961. • The R isomer was effective against morning sickness, but the S isomer was teratogenic. • Sparked more tightly controlled laboratory practices across the world.
  • 32. Stereoisomers • Where known, the stereochemistry of the compound is noted in the structure and in the name. • If a stereoisomer of an existing compound is submitted, it is given a separate id number. • If data are submitted for a mixture of two stereoisomers, the mixture is also given a separate id number if the activities of the individual compounds cannot be separated. • If you draw a flat structure (with no stereochemistry specified) in the structure search, you will receive data on all stereoisomers.
  • 33. Ofloxacin, Levofloxacin and Dextrofloxacin • Fluoroquinolone antibiotics • Ofloxacin is a racemic (equal) mixture of Levo and Dextro isomers. • Levofloxacin is the more active stereoisomer • Dextrofloxacin is the less active stereoisomer • ChEMBL has data on each with separate bioactivities.
  • 34. Tautomers (keto-enol form) • Two forms readily interconvert via the migration of a hydrogen to the adjacent oxygen and the swapping of a single to a double bond, and vice versa. • ChEMBL does not differentiate between different tautomers. • The preferred tautomeric structure is retained. • ChEBI does differentiate and will store the separate tautomers.
  • 35. Salts • About 50% of marketed drugs are administered as salts to aid in their activity. • Some salts prevent the drug from being absorbed in the mouth. • Some salts help the drug be activated in the intestines, rather than the stomach. • There are approx 53,450 ChEMBL compounds with salts. • Bioactivity data is recorded against the parent drug and against the salt. • Therefore, it’s important to give these compounds different ChEMBL ids.
  • 36. Salt Example: Morphine • Morphine can be administered with many different salts: • Hydrochloride (HCl) • Sulphate (SO4) • Tartrate • Acetate • Citrate • Methobromide (MeBr) • Hydrobromide (HBr) • Hydroiodide (HI) • Lactate • Chloride (Cl) • Bitartrate
  • 37. Dealing with Salts in ChEMBL • Each compound, if in a salt form, is analysed and matched to a ‘parent’ – i.e. the base form of the compound. (This does not apply to inorganic compounds.) • For example, morphine hydrochloride (CHEMBL556578), morphine sulfate (CHEMBL422878) and morphine sulfate hydrate (CHEMBL1200603) are matched to their parent morphine (CHEMBL70). • This relationship is shown on the interface of the compound page. • Additionally, when you run a search for a compound, only the parent form is returned in the results grid.
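The salt-to-parent matching described on this slide can be pictured as a simple lookup that groups records under the parent id. The sketch below uses the four ChEMBL ids quoted on the slide; the `records` list, its IC50 values and the `group_by_parent` helper are hypothetical and only illustrate the idea, not how ChEMBL implements it.

```python
from collections import defaultdict

# Salt-to-parent links quoted on the slide; the parent maps to itself.
PARENT_OF = {
    "CHEMBL556578": "CHEMBL70",   # morphine hydrochloride
    "CHEMBL422878": "CHEMBL70",   # morphine sulfate
    "CHEMBL1200603": "CHEMBL70",  # morphine sulfate hydrate
    "CHEMBL70": "CHEMBL70",       # morphine (parent)
}

# Hypothetical bioactivity records: (compound id, IC50 in nM). Values are made up.
records = [("CHEMBL70", 12.0), ("CHEMBL556578", 15.0), ("CHEMBL422878", 9.5)]

def group_by_parent(rows):
    """Collect records under their parent id, keeping the original form's id."""
    grouped = defaultdict(list)
    for compound_id, value in rows:
        # Compounds with no known parent are treated as their own parent.
        parent = PARENT_OF.get(compound_id, compound_id)
        grouped[parent].append((compound_id, value))
    return dict(grouped)

by_parent = group_by_parent(records)
print(by_parent["CHEMBL70"])  # all three forms listed under the parent
```

Keeping the original id inside each grouped record mirrors the slide's point that data for all forms can be viewed together while salt-specific data stays retrievable.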
  • 38. Parents and Salts on the Compound Page • PARENT (compound report page) • SALTS (with hyperlinks beneath)
  • 39. • Clicking on the Bioactivity Summary pie chart will give you the bioactivity data for ALL forms of the compound • To get salt specific bioactivity data, click on the hyperlink beneath the salt form of interest to be taken to its compound page. • Screenshots: Morphine - All Data; Morphine HCl specific data
  • 40. Naming and Classification
  • 41. Chemical names Common or trivial names are those in widespread use. Advantages of common names include simplicity, pronounceability and universal recognition. The main disadvantage is ambiguity – the same common name may refer to more than one type of chemical.
  • 42. Systematic names A systematic name is one which corresponds to the chemical structure such that the structure can be determined from the name, e.g. 1,2-dimethyl-naphthalene. Software packages exist which can generate structures from the systematic names (e.g. ACD/Name, ChemOffice, MarvinSketch). More than one correct systematic name can be assigned to the same molecular structure, depending on the manner in which naming rules are applied.
  • 43. Examples of common and systematic names • Common names: caffeine; guaranine; theine • Systematic names: 1,3,7-trimethyl-3,7-dihydro-1H-purine-2,6-dione; 7-methyltheophylline; 1,3,7-trimethyl-2,6-dioxopurine
  • 45. Why? • Ontological data • Structure classification • Chemical entity, e.g. hydrocarbon • Role, e.g. ligand • Subatomic particle, e.g. electron • Links to other databases • KEGG • DrugBank • PDBeChem • Citations
  • 47. The ChEBI ontology Organised into three sub-ontologies, namely • Molecular structure ontology • Subatomic particle ontology • Role ontology Example shown: (R)-adrenaline
  • 48. Molecular structure ontology
  • 49. Role ontology
  • 50. ChEBI ontology relationships • Generic ontology relationships • Chemistry-specific relationships
  • 51. Viewing ChEBI ontology
  • 52. Simple and advanced text search • Narrow by category • AND, OR and BUT NOT
  • 53. Structure search • Search options • Structure drawing tools
  • 54. Search Results • Click to go to compound page • Hover-over for search menu
  • 55. Types of structure search • Identity – based on InChI (e.g. InChI=1/H2O/h1H2) • Substructure – uses fingerprints to narrow search range, then performs full substructure search algorithm • Similarity – based on Tanimoto coefficient calculated between the fingerprints • Example: for the fingerprints 1010110111 (7 bits set) and 0010110010 (4 bits set), where a and b are the numbers of bits set in each fingerprint and c = 4 is the number of bits set in both: Tanimoto(a,b) = c / (a+b-c) = 4 / (4+7-4) = 0.57
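The Tanimoto arithmetic on this slide can be reproduced with a short Python sketch. It treats each fingerprint as a string of '0' and '1' characters, which is an assumption made for readability; real fingerprints are usually packed bit vectors. The `tanimoto` function is illustrative and is not the ChEBI implementation.

```python
def tanimoto(fp_a: str, fp_b: str) -> float:
    """Tanimoto coefficient c / (a + b - c) for two equal-length bit strings."""
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same length")
    a = fp_a.count("1")  # bits set in the first fingerprint
    b = fp_b.count("1")  # bits set in the second fingerprint
    c = sum(1 for x, y in zip(fp_a, fp_b) if x == "1" and y == "1")  # bits set in both
    if a + b - c == 0:
        return 0.0  # assumed convention for two all-zero fingerprints
    return c / (a + b - c)

# The two fingerprints shown on the slide.
score = tanimoto("1010110111", "0010110010")
print(f"{score:.2f}")  # 0.57
```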
  • 56. Browse via Periodic Table • Molecular entities / Elements
  • 57. Navigate via links in ontology • Click to follow links
  • 59. ChEBI example • Search for ‘Glycine’ • What is the ChEBI ID for this? Answer: 15428 • Is it available as a KEGG compound? Answer: Yes • What are the IUPAC names? Answer: Glycine, aminoacetic acid • What is ‘glycine zwitterion’? Answer: It is a tautomer of glycine
  • 60. SEARCHING IN CHEMBL
  • 61. How to search in ChEMBL: • Keywords • Compound name – dopamine, haloperidol • Assay name – cytotoxicity, liver hepatotoxicity • Target – RAF-1, IRAK-4 • Structure • BLAST search – FASTA sequence from UniProt • Protein or taxonomy hierarchy
  • 62. Where to search:
  • 63. Using the search field (found on main page): • Best for single words • E.g. ‘dopamine’, ‘Muscarinic’ • Looks for matching text in compound name, key or synonym • 3-o-methyl-alpha-methyldopamine • Muscarinic receptor 4 • Needs an exact match • Can’t use wildcards, e.g. ‘%’, ‘?’…
  • 64. Using the Protein Sequence Search • Useful for searching for a specific protein or a protein from the same family • The results brought back will show a percentage similarity to the input sequence. • An exact match will give 100%. • The same target from a different organism will give ~90%
  • 65. Compound Drawing • Can draw the full structure of interest or a partial structure • Using the Substructure Search you can find compounds containing your partial structure • Using the Similarity Search, you can find similar compounds – based on a percentage score (70-100%)
  • 66. DOWNLOAD AND ANALYSIS OF CHEMBL RESULTS
  • 68. • The compounds can be downloaded as an SDFile (*.sdf).
  • 69. • The bioactivity data can be downloaded as *.XLS
  • 73. Drug design • Ligand-based: relies on knowledge of other molecules that bind to the biological target of interest. • Structure-based: relies on knowledge of the 3D structure of the biological target. • A lead has • evidence that modulation of the target will have therapeutic value: e.g. disease linkage studies showing associations between mutations in the biological target and certain disease states. • evidence that the target is druggable, i.e. capable of binding to a small molecule and that its activity can be modulated by the small molecule. • Target is cloned and expressed, then libraries of potential drug compounds are screened using screening assays
  • 74. Drug Discovery Process • ChEMBL database: > 2,900,000 bioactivities; > 600,000 compounds; ~30,000 distinct lead series; ~12,000 candidates; ~2000 drugs • Stages: Target Discovery, Lead Discovery, Lead Optimisation, Preclinical Development, Phase 1, Phase 2, Phase 3, Launch • Target Discovery: Target identification; Microarray profiling; Target validation; Assay development; Biochemistry; Clinical/Animal disease models • Lead Discovery: High-throughput Screening (HTS); Fragment-based screening; Focused libraries; Screening collection • Lead Optimisation: Medicinal Chemistry; Structure-based drug design; Selectivity screens; ADMET screens; Cellular/Animal disease models; Pharmacokinetics • Preclinical Development: Toxicology; In vivo safety pharmacology; Formulation; Dose prediction • Clinical Trials: Phase 1 (PK, tolerability); Phase 2 (Efficacy); Phase 3 (Safety & Efficacy) • Launch: Indication Discovery & expansion • Outputs: Med. Chem. SAR; Clinical Candidates; Drugs • Overall phases: Discovery, Development, Use
  • 76. Current Data Content (ChEMBL_10) • Abstracted from 40,623 papers from 27 journals • Ongoing curation and clean-up of all data • 785,746 compound records • 639,570 distinct compound structures • 8,371 targets • 5,190 protein molecular targets • Over 3,200,000 experimental bioactivities • binding measurements, functional assays and ADMET
  • 77. ChEMBL Assay Data • ChEMBL contains >3 million data points relating compounds to targets or effects. • These activities come from ~500K assays reported in medicinal chemistry literature. • Assays can be classified as: • functional assay endpoints e.g., Vasodilation • binding measurements e.g., IC50 • ADME/toxicity data e.g., LD50 • Pie chart values: Functional 55; Binding 29; ADMET 16
  • 78. Compound Properties and Selectivity • Stores a wide range of calculated compound properties (e.g., mol wt, logP, RO5 violations) • Can be used to identify compounds most likely to have good in vivo properties (Absorption, Distribution, Metabolism, Excretion) • Contains activity information against liability targets (e.g., cytochrome P450s, HERG K+ channel) • If compounds have been tested in these assays, can avoid those with potential toxicity issues • Contains data on a wide range of targets • If compounds have been tested against multiple targets, can get an idea of their selectivity (important for validation studies)
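The "RO5 violations" property mentioned on this slide refers to Lipinski's rule of five. A minimal sketch of how such a count can be derived from calculated properties is shown below; the `CompoundProperties` class, the `ro5_violations` function and both sets of example values are illustrative assumptions, not the ChEMBL schema or real database entries.

```python
from dataclasses import dataclass

@dataclass
class CompoundProperties:
    """Calculated properties; field names are illustrative, not ChEMBL's."""
    mol_weight: float      # molecular weight in Da
    logp: float            # calculated octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int

def ro5_violations(p: CompoundProperties) -> int:
    """Count how many of Lipinski's four rule-of-five criteria are violated."""
    checks = [
        p.mol_weight > 500,
        p.logp > 5,
        p.h_bond_donors > 5,
        p.h_bond_acceptors > 10,
    ]
    return sum(checks)

# Illustrative values only: a small drug-like compound and a large lipophilic one.
small = CompoundProperties(mol_weight=194.19, logp=-0.1, h_bond_donors=0, h_bond_acceptors=6)
large = CompoundProperties(mol_weight=650.0, logp=6.2, h_bond_donors=6, h_bond_acceptors=8)
print(ro5_violations(small), ro5_violations(large))  # 0 3
```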
  • 79. Why Use SARs? • A chemical structure determines its physical and biological characteristics. • Small changes to the structure can have a large impact on activities. • Understanding what changes have the greatest/least effect can aid in drug design. • Using the many available databases that contain this information reduces the time and money spent on synthesis of potential drug compounds.
  • 80. Example: 1. You are interested in creating a compound to target IRAK4 and the compound must have an aniline core structure. 2. Run a search for IRAK4 and download all of the compounds as an SDFile and all of the IC50 data as a text file. 3. Combine the compounds and data into one SDFile. 4. Analyse the SAR data with an external program.
  • 81. • There are over 3,000,000 data points in ChEMBL • Difficult to manually look through them all • Pipeline Pilot™ is used in the ChEMBL team to visualise large amounts of data. • SAR grids can be created using downloaded structures and associated bioactivity data.
  • 82. Sort the data into ascending IC50 values
  • 86. Simple SAR • A compound with an IC50 < 100 nM for a target is considered to be ‘good’. • Search for IRAK4 and filter for IC50 < 100 nM. • Download the filtered bioactivities as an XLS spreadsheet (26 bioactivities). • Extract the list of ChEMBL_IDs from the spreadsheet and paste them into the search box (24 ids). • Run the same search and filter on the bioactivities of IC50 > 100 nM (96 bioactivities). • Download the bioactivity data and extract the list of ChEMBL_IDs (7 ids). • These 7 compounds are ‘potentially’ selective for IRAK4, showing only weak activity (IC50 > 100 nM) against other targets.
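The filtering steps on this slide can be approximated in Python once the bioactivity rows have been exported. The sketch below is one possible reading of the workflow, not the ChEMBL team's procedure: the compound ids, the second target name and all IC50 values are hypothetical, and "potentially selective" is taken to mean potent against IRAK4 with no potent result against any other target.

```python
# Hypothetical exported rows: (compound id, target name, IC50 in nM).
rows = [
    ("CMPD_A", "IRAK4", 20.0),
    ("CMPD_A", "TARGET_X", 5000.0),
    ("CMPD_B", "IRAK4", 50.0),
    ("CMPD_B", "TARGET_X", 30.0),
    ("CMPD_C", "IRAK4", 800.0),
]

THRESHOLD_NM = 100.0  # the slide's cut-off for a 'good' IC50

def potentially_selective(data, target):
    """Ids potent (< threshold) on `target` and not potent on any other target."""
    on_target = {cid for cid, t, ic50 in data if t == target and ic50 < THRESHOLD_NM}
    off_target = {cid for cid, t, ic50 in data if t != target and ic50 < THRESHOLD_NM}
    return sorted(on_target - off_target)

# Rank the IRAK4 rows by ascending IC50, most potent first.
ranked = sorted((r for r in rows if r[1] == "IRAK4"), key=lambda r: r[2])
selective = potentially_selective(rows, "IRAK4")
print(ranked[0][0], selective)  # CMPD_A ['CMPD_A']
```

A compound with no data against other targets would also pass this filter, which is why the slide's hedge of "potentially" selective matters.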
  • 88. Specific to IRAK4 and Others
  • 90. Downloading ChEBI flavours • All downloads come in two flavours • 3 star only entries (manually annotated ChEBI entries) • 2 and 3 star entries (manually annotated ChEBI, ChEMBL and user submissions)
  • 91. Downloading ChEBI • OBO file • Use with OBO-Edit • SDF File • Chemistry software compliant such as Bioclipse • Flat file, tab delimited • Import all the data into Excel • Parse it into your own database structure • Oracle binary dumps • Import into an Oracle database • Generic SQL insert statements • Import into a MySQL or PostgreSQL database
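As a sketch of the "parse it into your own database structure" option, a tab-delimited flat file can be read with Python's standard `csv` module. The column names (`ID`, `NAME`, `STAR`) and the sample rows are assumptions made for this example and may not match the real ChEBI files; only the glycine id 15428 comes from the slides.

```python
import csv
import io

# Stand-in for a downloaded tab-delimited file; layout and values are illustrative.
SAMPLE = "ID\tNAME\tSTAR\n15428\tglycine\t3\n900001\texample entity\t2\n"

def load_entries(handle):
    """Read tab-delimited rows into dictionaries keyed by the header names."""
    return list(csv.DictReader(handle, delimiter="\t"))

entries = load_entries(io.StringIO(SAMPLE))

# Keep only the 3 star (manually annotated) entries.
three_star = [row["NAME"] for row in entries if row["STAR"] == "3"]
print(three_star)  # ['glycine']
```

For a real download, `io.StringIO(SAMPLE)` would be replaced by an opened file, for example `open(path, newline="", encoding="utf-8")`.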
  • 92. The ChEBI web service • Programmatic access to a ChEBI entry • SOAP based Java implementation • Clients currently available in Java and Perl • Methods • getLiteEntity • getCompleteEntity and getCompleteEntityByList • getOntologyParents • getOntologyChildren and getAllOntologyChildrenInPath • getStructureSearch • Documented at http://www.ebi.ac.uk/chebi/webServices.do.
  • 93. Downloading ChEMBL • Frequent releases (approx monthly) • SDFile • Text • MySQL • Oracle
  • 94. Downloading ChEMBL
  • 95. Help and Feedback • Email addresses for support queries and feedback • General questions and feedback on ChEMBL interface: chembl-help@ebi.ac.uk • Reporting of data errors: chembl-data@ebi.ac.uk • General questions, support and feedback on ChEBI: chebi-help@ebi.ac.uk