Computational Drug Discovery
Machine Learning for Making Sense of Big Data in Drug Discovery
Associate Professor Dr. Chanin Nantasenamat
E-mail: chanin.nan@mahidol.edu
YouTube: http://bit.ly/dataprofessor
About the Speaker
• Research group website at http://codes.bio
• Code and data at http://github.com/chaninn and http://github.com/chaninlab
• YouTube channel called Data Professor, available at http://bit.ly/dataprofessor
• Data Professor Facebook page at http://facebook.com/dataprofessor
Icon made by Freepik from www.flaticon.com
Disease
• The word ‘disease’ is defined by the Cambridge Dictionary as:
“illness of people, animals, plants, etc., caused by infection or a failure of health rather than by an accident”
http://static.filmannex.com/users/galleries/294182/19265_fa_rszd.jpg
Drugs
• A ‘drug’ is a biological or chemical entity that can modulate the course of a disease state by interacting with its target protein
• Biological entity (e.g. antibodies)
• Chemical entity (e.g. small molecules)
Natthapon Ngamnithiporn. Image from Freepik. http://www.freepik.com/free-photo/packings-of-pills-and-capsules-of-medicines_1178867.htm
Li et al. BMC Syst Biol 8 (2014) 141.
Drug Discovery Process
• Costs about 2 billion USD
• Takes about 10-15 years
• Failure rate is > 90%
http://drugdiscovery.nd.edu/
Drug Discovery Process
1. Identify a target protein that is key in modulating the disease
2. Screen for ‘hit’ molecules that can inhibit the target protein
3. ‘Hit-to-lead’ and ‘lead optimization’
4. Evaluate pharmacokinetic properties
5. Initiate clinical trials to evaluate safety and dosage, efficacy and side effects, and adverse reactions to long-term use
6. Drug reaches the market
Ashburn and Thor. Nature Rev. Drug Discov. 3 (2004) 673-683
https://slideplayer.com/slide/13182763/
From a million to one
Multi-objective optimization
• A drug must not only act on the target protein of interest but also possess other properties
• Desirable characteristics of a drug:
1. Binds selectively to the target protein
2. Is absorbed in the gut (oral drugs)
3. Permeates the gut wall or cell membrane (can reach the target site)
4. Metabolically stable
5. Non-toxic
6. Can be synthesized
• To achieve all these desirable properties, the chemical structure needs to be optimized (an optimal balance must be struck among many factors)
Creating new compounds
• We can look to nature for inspiration (biologically inspired) or use existing drugs as a starting point
• Medicinal chemists optimize existing compounds by modifying them in a process known as bioisosteric replacement (e.g. replacing a hydrogen atom with a halogen atom)
• Cheminformaticians can computationally enumerate a compound library (compound enumeration) using the rules of organic chemistry (taking chemical stability and synthetic feasibility into account)
Icon made by dDara from www.flaticon.com
Molecules
• Molecules can be thought of as a framework of atoms (molecular graph) where atoms are vertices and bonds are edges
- Each vertex is typically one of a small set of elements (C, N, O, F, P, S, Cl or Br)
- Each edge that links two vertices can be a single, double or triple bond
• Compound enumeration as performed by the research group of JL Reymond (Acc Chem Res 2015, 48(3):722-730)
- Molecules of up to 13 atoms ⟶ 977 million possible molecules (10^9)
- Molecules of up to 17 atoms ⟶ 166 billion possible molecules (10^11)
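The molecular-graph view can be made concrete with a small sketch. It stores the heavy atoms of ethanol as vertices and its bonds as edges labelled with a bond order. The `MolGraph` class and its method names are illustrative assumptions and do not come from any cheminformatics library.

```python
from dataclasses import dataclass, field


@dataclass
class MolGraph:
    """Minimal molecular graph: atoms are vertices, bonds are edges."""
    atoms: list = field(default_factory=list)   # element symbol per vertex
    bonds: dict = field(default_factory=dict)   # frozenset({i, j}) -> bond order

    def add_atom(self, element):
        """Add a vertex and return its index."""
        self.atoms.append(element)
        return len(self.atoms) - 1

    def add_bond(self, i, j, order=1):
        """Add an edge between vertices i and j (single, double or triple)."""
        if order not in (1, 2, 3):
            raise ValueError("bond order must be 1, 2 or 3")
        self.bonds[frozenset((i, j))] = order

    def neighbors(self, i):
        """Return the sorted indices of the vertices bonded to vertex i."""
        return sorted(j for pair in self.bonds if i in pair for j in pair if j != i)


# Ethanol, heavy atoms only: C-C-O
ethanol = MolGraph()
c1, c2, o = (ethanol.add_atom(e) for e in ("C", "C", "O"))
ethanol.add_bond(c1, c2)
ethanol.add_bond(c2, o)
print(ethanol.neighbors(c2))
```

Compound enumeration can be pictured as generating every valid graph of this kind up to a given number of vertices, which is why the counts grow so quickly with atom number.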
Chemical space
• Theoretically possible chemical space as revealed via compound enumeration by the research group of JL Reymond (Acc Chem Res 2015, 48(3):722-730)
- Molecules of up to 13 atoms ⟶ 977 million possible molecules (10^9)
- Molecules of up to 17 atoms ⟶ 166 billion possible molecules (10^11)
• Drug space (<500 Da) is estimated to span up to 40 atoms (in some cases, even more) ⟶ roughly 10^60 molecules
Drug Discovery Toolbox
• Combinatorial Chemistry
• Chemical Libraries
• Chemical Space
• High-Throughput Screening
• Property Filters
• Computational Chemistry
• Machine Learning
• QSAR
• Proteochemometrics
• Molecular Modeling
• Molecular Dynamics
• Molecular Docking
Bioactivity
• Bioactivity is the activity elicited by the target protein of interest
• Such target proteins are typically involved in key pathways that influence the course of a disease
• Thus, much attention has been devoted to modulating these target proteins
• Primary literature
• Curated databases
‣ ChEMBL, BindingDB, MOAD, PubChem
• Open innovation
‣ Pharmaceutical companies are making data publicly available for non-commercial diseases
What can computers do?
• Computers have defeated humans at chess (IBM Deep Blue) and Jeopardy! (IBM Watson)
• Google has developed a self-driving car
• NASA uses computers to simulate space missions
• Computers are being used to design aircraft and cars
• Supermarkets and shopping malls use our purchase history to analyze and predict our spending behavior
• Why not use them to discover, design and develop new drugs?
• Computers (deep learning) can paint like Van Gogh and Picasso
• Computers can be used to code music programmatically (Sonic Pi)
• Computers can dream
http://www.boredpanda.com/computer-deep-learning-algorithm-painting-masters/
https://storage.googleapis.com/cdn.thenewstack.io/media/2015/07/google-deep-dream-artificial-neural-networks-12.jpg
Why do we need computational models in drug discovery?
• To discern the structure-activity relationship of a chemical library
• In vitro data are limited, and generating them is expensive, time-consuming and laborious
• Computational models can be built quickly to give a preliminary prediction of the pharmacokinetics and bioactivity of query compounds
Anuwongcharoen et al. PeerJ 4 (2016) e1958
Questions that can be answered by computational models
• What target proteins could my compound(s) bind to and modulate?
• Would my compound bind unspecifically to other proteins and thus have off-target activity?
• What type of compounds can bind and modulate the bioactivity of my target protein of interest?
• Are there compounds similar to my query compound that may exert similar binding behavior?
• How does my compound bind to the protein structure of its target?
• How can I modify the structure of my compound to enhance its pharmacokinetics and bioactivity?
Hall et al. Prog Biophys Mol Biol 116 (2014) 82-91.
Overview of Computational Drug Discovery
[Figure: “Maximizing computational tools for successful drug discovery”. Panel labels: Ligand-based, Structure-based, Systems-based, Fragment-based, Chemogenomics, Cheminformatics, Bioinformatics, Medicinal chemistry. Groups of methods shown in the figure:]
• ADMET; QSAR; Pharmacophore; Statistical molecular design
• Molecular modeling; Protein structure prediction (homology/comparative, ab initio); Molecular dynamics; Normal mode analysis; Docking/reverse docking; Binding cavity analysis; Pharmacophore
• Protein–ligand interactome; Protein–protein interactome; Drug target gene expression; Intrinsically disordered proteins; Allo-network drugs
• High-throughput synthesis; High-throughput screening; Privileged structures; Bioisostere; Chemoisostere; Scaffold hopping
• Sequence alignment; BLAST; Phylogenetic analysis; Biological space
• Computational chemistry; Molecular descriptors; Chemical space; Profiling; Filtering (Lipinski’s rule of 5); Search (molecular similarity; substructure similarity; shape, volume and charge-based similarity)
• Databases: small molecules (DrugBank, ChEMBL, PubChem, BindingDB, ZINC); proteins (PDB, UniProt, SCOP); protein-protein (MINT, STITCH, STRING); pathway (KEGG, Reactome)
• Proteochemometrics; Computational chemogenomics; Graph/network theory
• Fragment-based docking; Fragment-based QSAR; Ligand growing
Nantasenamat and Prachayasittikul. Expert Opin Drug Discov 10 (2015) 321-329.
Bioinformatics
• Bioinformatics is a discipline entailing the use of computational approaches to analyze biological data
‣ Analyze and compare genes, proteins and genomes
‣ Explore structures and functions of biomolecules (DNA, protein, lipid and carbohydrate)
‣ Explore network biology and metabolic pathways
http://www.gettyimages.com/detail/photo/bioinformatics-background-concept-royalty-free-image/475811932?esource=SEO_GIS_CDN_Redirect
[Figure: protein binding site with labelled residues I424, L428, F404, R394, E353, A350, D351, L354, P535, W383 and L525]
Suvannang et al. Manuscript under preparation.
Cheminformatics
• Cheminformatics is a discipline at the interface of chemistry and computers that enables the analysis of various aspects relevant to chemical structures
‣ Chemical space for investigating molecular similarity/diversity
‣ Molecular descriptors (e.g. MW, LogP, nHBdon, nHBacc) and quantum chemical descriptors (HOMO, LUMO, HOMO-LUMO gap)
Ertl and Rohde. J Cheminf 4 (2012) 12.
Drugs and their precursors
• Fragments are substructures found in a compound (drug)
• Privileged substructures are substructures commonly found in inhibitors/activators (drugs) against several therapeutic targets
• Hits are a small subset of compounds from large chemical libraries, identified by high-throughput screening
• Leads are compounds derived from hits through minor structural optimization. Leads often undergo further rounds of “lead optimization”
• Drugs are the few leads that have passed rigorous tests (pre-clinical and clinical trials) before reaching the market
Identifying hits
• So how does one go about identifying hit compounds?
- High-throughput screening (experimental and computational)
- Find compounds similar to known actives, as the bioactivity of each compound is not an isolated point (similar chemical structures tend to have similar biological activity)
- About 30% of compounds that are similar to known actives are themselves active
https://southernresearch.org/news/nih-contract-high-throughput-screening-for-zika/
Hernández-Santoyo et al. Protein-protein and protein-ligand docking. DOI:10.5772/56376
Martin YC, J Med Chem 2002, 45(19):4350-4358
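A similarity search against known actives can be sketched as follows. The slides do not name a similarity measure, so the Tanimoto coefficient on binary fingerprints is used here as an assumption (it is a common choice). The fingerprints are invented sets of on-bit indices, and the 0.85 cut-off is an illustrative default rather than a value taken from the slides.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient: shared on-bits divided by all on-bits."""
    a, b = set(fp_a), set(fp_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_to_active(query_fp, library, threshold=0.85):
    """Return (name, score) pairs at or above the threshold, best first."""
    scored = [(name, tanimoto(query_fp, fp)) for name, fp in library.items()]
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical fingerprints, written as sets of on-bit positions
known_active = {1, 4, 7, 9, 12, 15, 20}
library = {
    "cmpd_A": {1, 4, 7, 9, 12, 15, 20},   # identical to the active
    "cmpd_B": {1, 4, 7, 9, 12, 15},       # one bit missing
    "cmpd_C": {2, 4, 8, 9, 30},           # mostly different
}

for name, score in similar_to_active(known_active, library):
    print(name, round(score, 3))
```

Compounds returned by such a search are only candidates: as the slide notes, a minority of them turn out to be active when tested.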
Lead generation (Hit-to-Lead)
• Hits identified from high-throughput screens are transformed into leads by means of limited structural modification (so as to optimize their ADMET properties)
• Generated leads are subjected to further rounds of lead optimization
Fuller N et al. Drug Discov Today 2016, 21(8):1272-1283.
Fragment-based Drug Design
Source: http://practicalfragments.blogspot.com/2011/08/first-fragment-based-drug-approved.html
Zelboraf treats melanoma by inhibiting BRAF.
DeLaBarre B. http://consultingbiochemist.com/2014/12/cone-chemical-space/
Lipinski’s Rule of 5
• Christopher Lipinski (Pfizer) analyzed a set of more than 2,000 orally active drugs, which led to what is known as Lipinski’s Rule of 5, a set of rules defining the drug-likeness of small molecules
‣ Molecular weight ≤ 500 Da
‣ Lipophilicity (LogP) ≤ 5
‣ Hydrogen bond donors ≤ 5
‣ Hydrogen bond acceptors ≤ 10
Lipinski et al. Adv Drug Deliv Rev 23 (1997) 3-25
Suvannang et al. (2017) Unpublished results
Lead-like Rule of 3
• In drug discovery, lipophilicity and molecular weight tend to increase as lead optimization progresses in order to improve the drug’s affinity and selectivity, so starting points are held to stricter limits
‣ Molecular weight < 300 Da
‣ Lipophilicity (LogP) < 3
‣ Hydrogen bond donors < 3
‣ Hydrogen bond acceptors < 3
‣ Rotatable bonds < 3
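Both rule sets amount to comparing a handful of descriptors with fixed limits, which the following sketch illustrates. The descriptor values are invented for a hypothetical compound, and the dictionary keys (`mw`, `logp`, `hbd`, `hba`, `rotb`) are names chosen for this example. The sketch counts a violation only when a value exceeds the limit, which is an assumption about how the boundary case is handled.

```python
# Upper limits for each rule set, keyed by descriptor name
RULE_OF_5 = {"mw": 500, "logp": 5, "hbd": 5, "hba": 10}
RULE_OF_3 = {"mw": 300, "logp": 3, "hbd": 3, "hba": 3, "rotb": 3}


def violations(descriptors, rule):
    """List the descriptors whose values exceed the limits of a rule set."""
    return [name for name, limit in rule.items() if descriptors[name] > limit]


# Hypothetical compound with made-up descriptor values
compound = {"mw": 342.4, "logp": 2.8, "hbd": 2, "hba": 5, "rotb": 6}

print("Rule of 5 violations:", violations(compound, RULE_OF_5))
print("Rule of 3 violations:", violations(compound, RULE_OF_3))
```

In this example the compound is drug-like under the Rule of 5 but too large and too flexible to pass the Rule of 3, which is the typical situation for an optimized lead.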
Chemical space
• Chemical space can be generally defined as the universe of synthetically feasible small molecules of <500 Da, estimated to be on the order of ~10^60 molecules
• Visualizing it gives a bird’s-eye view of the relative diversity/likeness of chemical libraries
• The Reymond group at the University of Bern, Switzerland developed a computational algorithm that enumerates all possible chemical structures that can be built from up to 17 heavy atoms; their GDB-17 database amounts to 166.4 billion molecules
Reymond and Awale. ACS Chem Neurosci 3 (2012) 649-657.
Biological space
• Biological space refers to the chemical space of druggable protein families
‣ ADMET
‣ Aminergic/Lipophilic GPCR space
‣ Kinase space
‣ Protease space
‣ CYP450
‣ Nuclear receptors
Petit-Zeman S. http://www.nature.com/horizon/chemicalspace/background/figs/explore_b1.html
Fragment space
• Fragment space can be defined as the universe or collection of all possible molecular fragments (substructures)
• Fragments are < 300 Da
• Use of the fragment space has been suggested to allow a more diverse exploration of the possible chemical space
• The Reymond group also extracted 10 million fragments from GDB-17
https://software.zbh.uni-hamburg.de/assets/softwareserverslide6-a0e42ecb3651120926821932574540d5b2e83ff0209654f9ab14804c7858451a.png
Virshup et al. J Am Chem Soc 135 (2013) 7296-7303
Koch et al. PNAS 102 (2005) 17272-17277
Structural classification of natural products (SCONP)
Nadin et al. Angew Chem Int Ed 51 (2012) 1114-1122.
Polypharmacology
• There is a paradigm shift from ‘one drug-one target’ to ‘one drug-multiple targets’
• Unintended off-target binding may elicit undesirable side effects and adverse effects
• Desirable off-target binding opens up drug repositioning opportunities
• Knowledge of polypharmacology may aid the design of multi-targeted drugs
Reddy and Zhang. Expert Rev Clin Pharmacol 6 (2013) 41-47
Kinase targets of staurosporine
Drug repositioning/repurposing
• There is a need to discover new drugs, especially for rare and neglected diseases
• Drug repositioning/repurposing is a lucrative approach as it tests existing FDA-approved drugs against various other whole-cell and target assays
Wu et al. Mol BioSyst 9 (2013) 1268-1281.
[Figure: scatter plot of predicted activity (pIC50) against experimental activity (pIC50); both axes run from 5.0 to 8.0]
What is QSAR? (1)
• QSAR/QSPR stands for Quantitative Structure-Activity/Property Relationship
• QSAR seeks to correlate structural features of compounds with their biological activities
What is QSAR? (2)
• Structure governs activity/property
• In the medicinal chemistry literature, the effects of substituent groups on activity are typically studied extensively
[Figure: ring scaffold with substituent positions numbered 1 to 6]
• QSAR/QSPR studies exploit this knowledge for modeling biological or chemical activities/properties
What is QSAR? (3)
• QSAR involves three main concepts:
1. Selecting a biological activity or chemical property of interest
2. Generating the physicochemical description
3. Predicting the biological activity or chemical property
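The three concepts can be illustrated with the simplest possible model: a straight line relating one descriptor to the activity, scored with R2 and RMSE. The descriptor and activity values below are invented for illustration, and a least-squares line stands in for whatever learning method a real study would use. The metrics here are computed on the training data only, so they say nothing about predictive ability on new compounds.

```python
from math import sqrt


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r2_and_rmse(y_true, y_pred):
    """Coefficient of determination and root-mean-square error."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot, sqrt(ss_res / len(y_true))


# Step 1: activity of interest (hypothetical pIC50 values)
activity = [5.1, 5.9, 7.1, 7.9]
# Step 2: physicochemical description (one hypothetical descriptor per compound)
descriptor = [1.0, 2.0, 3.0, 4.0]
# Step 3: build the model and predict
slope, intercept = fit_line(descriptor, activity)
predicted = [slope * d + intercept for d in descriptor]
r2, rmse = r2_and_rmse(activity, predicted)
print(round(slope, 2), round(intercept, 2), round(r2, 3), round(rmse, 3))
```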
Molecular descriptors and biological activity of two example compounds:
Qm       Energy     μ        HOMO       LUMO       HOMO-LUMO gap   IC50
0.2271   -309.834   1.0521   -0.21346   -0.0127    0.20076         0.05
0.2142   -195.31    0.2337   -0.22611   -0.01915   0.20696         1.50
[Diagram: computational chemistry turns the compounds of interest into molecular descriptors; machine learning relates the descriptors to biological activity and is used to predict it]
Growth of QSAR?
• A search in SCOPUS shows the growing trend of QSAR publications
A typical QSAR workflow
Data set preparation
• ChEMBL 23: bioactivity data of ER α inhibitors ⟶ initial data set (10,666 bioactivity data for 5,809 compounds)
• Keep bioactivity measured by IC50 ⟶ IC50 subset (3,527 compounds)
• Select entries with CONFIDENCE_SCORE=9 and assay_type=B ⟶ selected data set (1,346 compounds)
• Delete entries with < or > signs and those with …
• Delete entries with missing SMILES notation
• Remove duplicate SMILES
• Transform IC50 to pIC50
• Final data set (1,299 compounds)
QSAR modeling
• Final data set
• Salt removal and tautomer standardization
• Descriptor calculation: 12 sets of PaDEL fingerprints
• Feature selection: remove collinear descriptors
• Data splitting: 70/30 split ratio, perform 10 data splits
• QSAR model ⟶ predicted pIC50 values
• Evaluate performance: R2, Q2, Rm2, RMSE
• Y-scrambling for evaluating chance correlation
• Mechanistic interpretation of feature importance
Suvannang et al. RSC Adv 2018, 8: 11344-11356
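Two steps of the workflow are simple enough to sketch directly: the IC50 to pIC50 transformation and the repeated 70/30 data split. The sketch assumes IC50 values are given in nanomolar (the slide does not state the unit) and uses invented values. The function names are hypothetical, and a seeded shuffle stands in for whatever splitting procedure the authors used.

```python
import math
import random


def pic50_from_nm(ic50_nm):
    """Convert an IC50 in nanomolar to pIC50 = -log10(IC50 in molar)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    return -math.log10(ic50_nm * 1e-9)


def split_70_30(items, seed):
    """Shuffle a copy of the items and cut it into 70% training and 30% test."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical IC50 values in nM
ic50_values_nm = [50.0, 1500.0, 12.0, 830.0, 4.5, 96.0, 310.0, 27.0, 2200.0, 640.0]
pic50_values = [pic50_from_nm(v) for v in ic50_values_nm]

# Ten independent splits, one per seed, as in "perform 10 data splits"
splits = [split_70_30(pic50_values, seed) for seed in range(10)]
train, test = splits[0]
print(len(train), len(test))
```

Working on the pIC50 scale means that more potent compounds get larger values and that the activity range is compressed into a few log units, which is the scale used on the axes of the predicted-versus-experimental plot.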
Applications of QSAR/QSPR models
• Regulatory use: QSAR for modelling environmental toxicity/chemical hazards by EPA and OECD
• Drug design: QSAR for modelling biological activities
• Materials design: QSPR for modelling chemical properties
[Figure: word cloud of modelled topics: GFP, LPS, QSAR, DNA, PCP, BPA, bacitracin, quorum, furin, vasorelaxant, vitamin E, template-monomer, phenol, sulfonamide, EDTA, DPPC, copper complex, antimalarial, anti-H1N1, aromatase inhibitors, CYP450 inhibitors, Monte Carlo, feature selection, text mining]
Biological activity/chemical property modeled by QSAR
Biological Activity        Chemical Property
Bioconcentration           Boiling point
Biodegradation             Chromatographic retention time
Carcinogenicity            Dielectric constant
Drug metabolism            Diffusion coefficient
Inhibitor constant         Dissociation constant
Mutagenicity               Melting point
Permeability               Reactivity
- Blood brain barrier      Solubility
- Skin                     Stability
Pharmacokinetics           Thermodynamic properties
Receptor binding           Viscosity
Nantasenamat et al. EXCLI J. (2009) 8: 74-88
QSAR: multiple compounds, single target protein
Proteochemometrics: multiple compounds, multiple target proteins
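The difference between the two settings shows up in how one row of the data matrix is built. In the sketch below a QSAR row holds compound descriptors only, while a proteochemometric (PCM) row describes a compound-protein pair. Adding products of compound and protein descriptors (cross-terms) is one common way to represent the interaction and is used here as an assumption. All descriptor values and names are invented.

```python
from itertools import product


def pcm_features(ligand_desc, protein_desc, cross_terms=True):
    """Concatenate ligand and protein descriptors, optionally with cross-terms."""
    features = list(ligand_desc) + list(protein_desc)
    if cross_terms:
        # One product per (ligand descriptor, protein descriptor) pair
        features += [l * p for l, p in product(ligand_desc, protein_desc)]
    return features


# Hypothetical descriptors
ligands = {"cmpd_1": [0.5, 2.0], "cmpd_2": [1.5, 1.0]}
targets = {"protein_A": [1.0, 0.0], "protein_B": [0.0, 1.0]}

# QSAR: one row per compound, the single target is implicit
qsar_rows = {name: list(desc) for name, desc in ligands.items()}

# PCM: one row per compound-target pair
pcm_rows = {
    (cmpd, prot): pcm_features(lig_desc, prot_desc)
    for (cmpd, lig_desc), (prot, prot_desc) in product(ligands.items(), targets.items())
}
print(len(qsar_rows), len(pcm_rows))
```

Because every compound is paired with every target, a PCM data set grows with the product of the two counts, which is what lets a single model cover several related proteins.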
Summary
• QSAR models allow us to understand how changes to the chromophore structure lead to GFP color change
• PCM models allow us to understand how changes to the chromophore structure, changes to the protein structure and the chromophore-protein interaction influence GFP color change
• Insights from the predictive models could be used to further extend the spectral repertoire of GFP
Nantasenamat C et al. J. Comput. Chem. 35(27): 1951-1966.
Proteochemometrics
• Proteochemometrics was developed by Maris Lapins and Jarl Wikberg of Uppsala University in 2001
• Advantages
‣ Can explain ligand-target affinity by providing detailed maps down to the substructure and amino acid level
‣ Can be used to rationalise why a ligand is active toward one target and not toward another related target
‣ Has been shown to be useful for drug repositioning
‣ Could be used for personalized medicine
Conclusion (1)
• Without a doubt, the QSAR paradigm offers much benefit for the rational design of robust compounds
• Nevertheless, certain shortcomings may limit the widespread application of QSAR
‣ Workflow of QSAR model development
‣ High dimensionality of the input space
‣ Representation of the molecular structure
‣ Interpretability and meaning of the developed QSAR models
‣ Presence of outliers or activity cliffs
‣ Validation of QSAR model performance
‣ Applicability in real-world settings
Conclusion (2)
• In spite of certain inherent flaws, the QSAR paradigm is inevitably one of the most useful forces contributing to the rapid development of drug discovery and design.
• As with all technologies, QSAR is not perfect; however, its weaknesses and flaws are continuously being identified, solved and reformed to help shape a new, improved and robust approach that is approaching minimal predictive error
• To help realize the goal of an intuitive approach to developing robust QSAR models, our laboratory has developed software that affords semi-automated, if not fully automated, QSAR modeling.
Conclusion (3)
• After more than 10 years of QSAR research, we can say that the demise of QSAR is a myth, provided that QSAR is done properly, and that we have only scratched the surface of its full potential
• QSAR is continuously evolving… starting from 2D-QSAR to 8D-QSAR!
• Proteochemometrics (so to say, multi-target QSAR) enables us to take advantage of the explosion of omics data
AutoWeka
• Software for performing automated data mining
• AutoWeka is a Python wrapper of Weka
• It is freely available at http://www.mt.mahidol.ac.th/autoweka/
• Nantasenamat et al. Chapter 8: AutoWeka: Toward an Automated Data Mining Software for QSAR and QSPR Studies. In: Cartwright H. Artificial Neural Networks, Springer, pp. 119-147.
BioCurator
Nantasenamat et al. Manuscript under preparation.
• We have developed a web application that allows users to upload ChEMBL bioactivity data for automatic data curation
Protocol
• The web app selects a subset of IC50/Ki data
• Removes redundant compounds if bioactivity values exceed 2 SD
• Removes data with < or > symbols in the bioactivity label
• Removes redundant compounds based on SMILES notation
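A curation protocol of this kind can be sketched in a few lines. This is not BioCurator's implementation: the record fields (`smiles`, `type`, `relation`, `value`) are hypothetical names, and duplicate SMILES are merged by averaging their values, which is an assumption. The 2 SD rule is left out because the slide does not define it precisely.

```python
from collections import defaultdict
from statistics import mean


def curate(records, keep_types=("IC50", "Ki")):
    """Keep exact IC50/Ki measurements and merge duplicate SMILES."""
    grouped = defaultdict(list)
    for rec in records:
        if rec["type"] not in keep_types:
            continue                      # select the IC50/Ki subset
        if rec["relation"] != "=":
            continue                      # drop entries qualified by < or >
        grouped[(rec["smiles"], rec["type"])].append(rec["value"])
    # One entry per unique SMILES and measurement type (mean of replicates)
    return [
        {"smiles": smiles, "type": kind, "value": mean(values)}
        for (smiles, kind), values in grouped.items()
    ]


# Hypothetical bioactivity records
records = [
    {"smiles": "CCO", "type": "IC50", "relation": "=", "value": 100.0},
    {"smiles": "CCO", "type": "IC50", "relation": "=", "value": 300.0},
    {"smiles": "CCN", "type": "IC50", "relation": ">", "value": 10000.0},
    {"smiles": "CCC", "type": "EC50", "relation": "=", "value": 50.0},
    {"smiles": "c1ccccc1", "type": "Ki", "relation": "=", "value": 20.0},
]
curated = curate(records)
print(len(curated))
```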
osFP
Simeon et al. J Cheminf 8 (2016) 72.
Protocol
• The web app accepts the input peptide sequence and computes amino acid composition descriptors
• Applies the constructed predictive model to predict the class label of the query peptide
• The predicted class label is relayed to the Results output
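The descriptor step of this protocol can be sketched as follows. Amino acid composition is the fraction of each of the 20 standard residues in the sequence. The function name and the example sequence are illustrative, non-standard letters are simply ignored (an assumption), and the prediction step is not shown because the trained models are not part of the slides.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(sequence):
    """Return the fraction of each standard amino acid in the sequence."""
    counts = Counter(sequence.upper())
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


# Hypothetical query peptide
composition = amino_acid_composition("GIGKFLHSAKKF")
print(composition["K"], composition["G"])
```

The resulting 20 numbers form a fixed-length feature vector regardless of peptide length, which is what allows a single predictive model to accept any query sequence.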
Simeon S, Shoombuatong W, Anuwongcharoen N, Preeyanon L, Prachayasittikul V, Wikberg JES, Nantasenamat C. osFP: a web server for predicting the oligomeric states of fluorescent proteins. J Cheminform 8 (2016) 72. DOI 10.1186/s13321-016-0185-8 (Research article, Open Access)
Abstract (excerpt): Background: Currently, monomeric fluorescent proteins (FP) are ideal markers for protein tagging. The prediction of …
HemoPred
Win et al. Future Med Chem 9 (2017) 275-291.
Protocol
• The web app accepts the input peptide sequence and computes amino acid composition descriptors
• Applies the constructed predictive model to predict the class label of the query peptide
• The predicted class label is relayed to the Results output
Research article in Future Medicinal Chemistry: “HemoPred: a web server for predicting the hemolytic activity of peptides”
CryoProtect
Pratiwi et al. J Chem (2017) Article ID 9861752.
Protocol
• The web app accepts the input peptide sequence and computes amino acid composition descriptors
• Applies the constructed predictive model to predict the class label of the query peptide
• The predicted class label is relayed to the Results output
Research article: Reny Pratiwi (1,2), Aijaz Ahmad Malik (1), Nalini Schaduangrat (1), Virapong Prachayasittikul (3), Jarl E. S. Wikberg (4), Chanin Nantasenamat (1) and Watshara Shoombuatong (1). CryoProtect: A Web Server for Classifying Antifreeze Proteins from Nonantifreeze Proteins. Hindawi, Journal of Chemistry, Volume 2017, Article ID 9861752, 15 pages. https://doi.org/10.1155/2017/9861752
1 Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand
2 Department of Medical Laboratory Technology, Faculty of Health Science, Setia Budi University, Surakarta 57127, Indonesia
3 Department of Clinical Microbiology and Applied Technology, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand
4 …
How to get started in CDD?
• Hardware
‣ Laptop
‣ Desktop
‣ High-performance computer
‣ Compute clusters
‣ Cloud computing
• Software
‣ Commercial
‣ Free
• Programming
‣ C, Java, etc.
‣ R, Python, MATLAB, etc.
Computational Drug Discovery based on Open Source
• Data source
◦ Bioactivity data: ChEMBL, PubChem, BindingDB
◦ Chemical database: ZINC, ChemSpider, GDB-17
◦ Biological database: PDB, UniProt
• Data curation and pre-processing
◦ BioCurator (developed in-house)
◦ Babel
• Descriptor calculation
◦ Rcpi, PyDPI, CDK, PaDEL
• Multivariate analysis
◦ R: caret
◦ Python: scikit-learn
• Plots
◦ R: ggplot2
◦ Python: Matplotlib, seaborn
• Molecular modeling
◦ Avogadro
◦ PyMOL
◦ Chimera
◦ VMD
• Molecular docking
◦ AutoDock
• Molecular dynamics
◦ GROMACS
◦ NAMD

More Related Content

What's hot

Virtual sreening
Virtual sreeningVirtual sreening
Virtual sreeningMahendra G S
Ā 
Computer aided drug designing
Computer aided drug designing Computer aided drug designing
Computer aided drug designing Ayesha Aftab
Ā 
Threading modeling methods
Threading modeling methodsThreading modeling methods
Threading modeling methodsratanvishwas
Ā 
A Brief Overview of Cheminformatics
A Brief Overview of CheminformaticsA Brief Overview of Cheminformatics
A Brief Overview of CheminformaticsSunghwan Kim
Ā 
Drug Discovery and Development Using AI
Drug Discovery and Development Using AIDrug Discovery and Development Using AI
Drug Discovery and Development Using AIDatabricks
Ā 
Presentation on insilico drug design and virtual screening
Presentation on insilico drug design and virtual screeningPresentation on insilico drug design and virtual screening
Presentation on insilico drug design and virtual screeningJoon Jyoti Sahariah
Ā 
Virtual screening techniques
Virtual screening techniquesVirtual screening techniques
Virtual screening techniquesROHIT PAL
Ā 
Methods of Protein structure determination
Methods of  Protein structure determination Methods of  Protein structure determination
Methods of Protein structure determination EL Sayed Sabry
Ā 
ā€œDocking Studies and Drug Designā€
ā€œDocking  Studies and Drug Designā€ā€œDocking  Studies and Drug Designā€
ā€œDocking Studies and Drug Designā€Naresh Juttu
Ā 
Artificial intelligence in drug discovery
Artificial intelligence in drug discoveryArtificial intelligence in drug discovery
Artificial intelligence in drug discoveryRAVINDRABABUKOPPERA
Ā 
HOMOLOGY MODELING IN EASIER WAY
HOMOLOGY MODELING IN EASIER WAYHOMOLOGY MODELING IN EASIER WAY
HOMOLOGY MODELING IN EASIER WAYShikha Popali
Ā 
Chemoinformatics.ppt
Chemoinformatics.pptChemoinformatics.ppt
Chemoinformatics.pptwadhava gurumeet
Ā 
Cheminformatics
CheminformaticsCheminformatics
Cheminformaticsbaoilleach
Ā 
In silico drug design/Molecular docking
In silico drug design/Molecular dockingIn silico drug design/Molecular docking
In silico drug design/Molecular dockingKannan Iyanar
Ā 
Virtual Screening in Drug Discovery
Virtual Screening in Drug DiscoveryVirtual Screening in Drug Discovery
Virtual Screening in Drug DiscoveryAbhik Seal
Ā 

What's hot (20)

Virtual sreening
Virtual sreeningVirtual sreening
Virtual sreening
Ā 
Chemoinformatic
Chemoinformatic Chemoinformatic
Chemoinformatic
Ā 
Computer aided drug designing
Computer aided drug designing Computer aided drug designing
Computer aided drug designing
Ā 
Threading modeling methods
Threading modeling methodsThreading modeling methods
Threading modeling methods
Ā 
A Brief Overview of Cheminformatics
A Brief Overview of CheminformaticsA Brief Overview of Cheminformatics
A Brief Overview of Cheminformatics
Ā 
Drug Discovery and Development Using AI
Drug Discovery and Development Using AIDrug Discovery and Development Using AI
Drug Discovery and Development Using AI
Ā 
docking
docking docking
docking
Ā 
Homology modelling
Homology modellingHomology modelling
Homology modelling
Ā 
Presentation on insilico drug design and virtual screening
Presentation on insilico drug design and virtual screeningPresentation on insilico drug design and virtual screening
Presentation on insilico drug design and virtual screening
Ā 
Virtual screening techniques
Virtual screening techniquesVirtual screening techniques
Virtual screening techniques
Ā 
MOLECULAR DOCKING
MOLECULAR DOCKINGMOLECULAR DOCKING
MOLECULAR DOCKING
Ā 
Protein structure prediction with a focus on Rosetta
Protein structure prediction with a focus on RosettaProtein structure prediction with a focus on Rosetta
Protein structure prediction with a focus on Rosetta
Ā 
Methods of Protein structure determination
Methods of  Protein structure determination Methods of  Protein structure determination
Methods of Protein structure determination
Ā 
ā€œDocking Studies and Drug Designā€
ā€œDocking  Studies and Drug Designā€ā€œDocking  Studies and Drug Designā€
ā€œDocking Studies and Drug Designā€
Ā 
Artificial intelligence in drug discovery
Artificial intelligence in drug discoveryArtificial intelligence in drug discovery
Artificial intelligence in drug discovery
Ā 
HOMOLOGY MODELING IN EASIER WAY
HOMOLOGY MODELING IN EASIER WAYHOMOLOGY MODELING IN EASIER WAY
HOMOLOGY MODELING IN EASIER WAY
Ā 
Chemoinformatics.ppt
Chemoinformatics.pptChemoinformatics.ppt
Chemoinformatics.ppt
Ā 
Cheminformatics
CheminformaticsCheminformatics
Cheminformatics
Ā 
In silico drug design/Molecular docking
In silico drug design/Molecular dockingIn silico drug design/Molecular docking
In silico drug design/Molecular docking
Ā 
Virtual Screening in Drug Discovery
Virtual Screening in Drug DiscoveryVirtual Screening in Drug Discovery
Virtual Screening in Drug Discovery
Ā 

Similar to Computational Drug Discovery: Machine Learning for Making Sense of Big Data in Drug Discovery

Drug discovery clinical evaluation of new drugs
Drug discovery clinical evaluation of new drugsDrug discovery clinical evaluation of new drugs
Drug discovery clinical evaluation of new drugsKedar Bandekar
Ā 
Drug discovery clinical evaluation of new drugs
Drug discovery clinical evaluation of new drugsDrug discovery clinical evaluation of new drugs
Drug discovery clinical evaluation of new drugsKedar Bandekar
Ā 
Sparsh bioinfo.ppt
Sparsh bioinfo.pptSparsh bioinfo.ppt
Sparsh bioinfo.pptSparshTiwari14
Ā 
Bioinformatics t9-t10-biocheminformatics v2014
Bioinformatics t9-t10-biocheminformatics v2014Bioinformatics t9-t10-biocheminformatics v2014
Bioinformatics t9-t10-biocheminformatics v2014Prof. Wim Van Criekinge
Ā 
Bioinformatics t9-t10-bio cheminformatics-wimvancriekinge_v2013
Bioinformatics t9-t10-bio cheminformatics-wimvancriekinge_v2013Bioinformatics t9-t10-bio cheminformatics-wimvancriekinge_v2013
Bioinformatics t9-t10-bio cheminformatics-wimvancriekinge_v2013Prof. Wim Van Criekinge
Ā 
2016 bioinformatics i_bio_cheminformatics_wimvancriekinge
2016 bioinformatics i_bio_cheminformatics_wimvancriekinge2016 bioinformatics i_bio_cheminformatics_wimvancriekinge
2016 bioinformatics i_bio_cheminformatics_wimvancriekingeProf. Wim Van Criekinge
Ā 
2015 bioinformatics bio_cheminformatics_wim_vancriekinge
2015 bioinformatics bio_cheminformatics_wim_vancriekinge2015 bioinformatics bio_cheminformatics_wim_vancriekinge
2015 bioinformatics bio_cheminformatics_wim_vancriekingeProf. Wim Van Criekinge
Ā 
Bioinformatics
BioinformaticsBioinformatics
BioinformaticsAfra Fathima
Ā 
druggggggggggggggjjjhjgjgygygjhfggfdgfdgdfppt
druggggggggggggggjjjhjgjgygygjhfggfdgfdgdfpptdruggggggggggggggjjjhjgjgygygjhfggfdgfdgdfppt
druggggggggggggggjjjhjgjgygygjhfggfdgfdgdfppttaoufikakabli1
Ā 
Computer Aided Drug Design
Computer Aided Drug DesignComputer Aided Drug Design
Computer Aided Drug Designpooja sabarinathan
Ā 
Natural products in drug discovery
Natural products in drug discoveryNatural products in drug discovery
Natural products in drug discoverySAKTHIVEL G
Ā 
Drug Discovery subject (clinical research)
Drug Discovery subject (clinical research)Drug Discovery subject (clinical research)
Drug Discovery subject (clinical research)Jannat985397
Ā 
Computer aided drug designing (cadd)
Computer aided drug designing (cadd)Computer aided drug designing (cadd)
Computer aided drug designing (cadd)University of Allahabad
Ā 
Lecture 7 computer aided drug design
Lecture 7  computer aided drug designLecture 7  computer aided drug design
Lecture 7 computer aided drug designRAJAN ROLTA
Ā 
Hit to lead drug discovery .pptx
Hit to lead drug discovery .pptxHit to lead drug discovery .pptx
Hit to lead drug discovery .pptxBhupinder Solanki
Ā 
The Many Roles of Computation in Drug Discovery
The Many Roles of Computation in Drug DiscoveryThe Many Roles of Computation in Drug Discovery
The Many Roles of Computation in Drug DiscoveryNishoanth Ramanathan
Ā 

Computational Drug Discovery: Machine Learning for Making Sense of Big Data in Drug Discovery

  • 1. Computational Drug Discovery: Machine Learning for Making Sense of Big Data in Drug Discovery
      Associate Professor Dr. Chanin Nantasenamat
      E-mail: chanin.nan@mahidol.edu
      YouTube: http://bit.ly/dataprofessor
  • 2. About the Speaker
      • Research group website at http://codes.bio
      • Code and data at http://github.com/chaninn and http://github.com/chaninlab
      • YouTube channel called Data Professor, available at http://bit.ly/dataprofessor
      • Data Professor Facebook page at http://facebook.com/dataprofessor
      Icon made by Freepik from www.flaticon.com
  • 3. Disease
      • The word 'disease' is defined by the Cambridge Dictionary as: illness of people, animals, plants, etc., caused by infection or a failure of health rather than by an accident
      Image: http://static.filmannex.com/users/galleries/294182/19265_fa_rszd.jpg
  • 4. Drugs
      • A 'drug' is a biological or chemical entity that can modulate the course of a disease state by interacting with its target protein
      • Biological entity (e.g. antibodies)
      • Chemical entity (e.g. small molecules)
      Natthapon Ngamnithiporn. Image from FreePik. http://www.freepik.com/free-photo/packings-of-pills-and-capsules-of-medicines_1178867.htm
  • 5. Li et al. BMC Syst Biol 8 (2014) 141.
  • 6. Drug Discovery Process
      • Costs ~2 billion USD
      • Takes about 10-15 years
      • Failure rate is > 90%
      http://drugdiscovery.nd.edu/
  • 7. Drug Discovery Process
      1. Identify a target protein that is key in modulating the disease
      2. Screen for 'hit' molecules that can inhibit the target protein
      3. 'Hit-to-lead' and 'lead optimization'
      4. Evaluate pharmacokinetic properties
      5. Initiate clinical trials to evaluate safety & dosage; efficacy & side effects; adverse reaction to long-term use
      6. Drug reaches the market
      Ashburn and Thor. Nature Rev. Drug Discov. 3 (2004) 673-683
  • 9. Multi-objective optimization
      • A drug must not only act on the protein of interest but must also possess other properties
      • Desirable characteristics of a drug:
          1. Binds selectively to the target protein
          2. Is absorbed in the stomach (oral drugs)
          3. Permeates the gut wall or cell wall (can reach the target site)
          4. Is metabolically stable
          5. Is non-toxic
          6. Can be synthesized
      • To achieve all of these desirable properties, the chemical structure needs to be optimized (an optimal balance must be struck among many factors)
  • 10. Creating new compounds
      • We can look to nature for inspiration (biologically inspired) or use existing drugs as a starting point
      • Medicinal chemists optimize existing compounds by modifying them in a process known as bioisosteric replacement (e.g. replacing a hydrogen atom with a halogen atom)
      • Cheminformaticians can computationally enumerate a compound library (compound enumeration) using the rules of organic chemistry (taking chemical stability and synthetic feasibility into account)
      Icon made by dDara from www.flaticon.com
  • 11. Molecules
      • Molecules can be thought of as a framework of atoms (molecular graph) where atoms are vertices and bonds are edges
          - Each vertex can typically be one of a small set of atom types (C, N, O, F, P, S, Cl or Br)
          - Each edge that links two vertices can be a single, double or triple bond
      • Compound enumeration as performed by the research group of JL Reymond (Acc Chem Res 2015, 48(3):722-730)
          - Molecules of up to 13 atoms ⟶ 977 million possible molecules (10^9)
          - Molecules of up to 17 atoms ⟶ 166 billion possible molecules (10^11)
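The molecular-graph view described on this slide can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the `Molecule` class and its method names are hypothetical (they come neither from the slides nor from any cheminformatics toolkit), hydrogens are left implicit, and no valence checking is attempted.

```python
from collections import defaultdict


class Molecule:
    """Toy molecular graph: atoms are vertices, bonds are edges with an order."""

    def __init__(self):
        self.atoms = []                 # element symbol for each vertex index
        self.bonds = defaultdict(dict)  # bonds[i][j] = bond order (1, 2 or 3)

    def add_atom(self, symbol):
        """Add a vertex and return its index."""
        self.atoms.append(symbol)
        return len(self.atoms) - 1

    def add_bond(self, i, j, order=1):
        """Add an undirected edge between vertices i and j."""
        if order not in (1, 2, 3):
            raise ValueError("bond order must be 1, 2 or 3")
        self.bonds[i][j] = order
        self.bonds[j][i] = order

    def degree(self, i):
        """Number of heavy-atom neighbours of vertex i."""
        return len(self.bonds[i])

    def bond_order_sum(self, i):
        """Sum of bond orders around vertex i (hydrogens not counted)."""
        return sum(self.bonds[i].values())


# Heavy-atom skeleton of acetic acid: C-C(=O)-O
mol = Molecule()
c1 = mol.add_atom("C")
c2 = mol.add_atom("C")
o1 = mol.add_atom("O")
o2 = mol.add_atom("O")
mol.add_bond(c1, c2, 1)
mol.add_bond(c2, o1, 2)
mol.add_bond(c2, o2, 1)
```

Compound enumeration can then be pictured as systematically generating such graphs and discarding those that break valence or stability rules, although the filters a real enumeration uses are far more elaborate than anything shown here.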
  • 12. Chemical space
      • Theoretically possible chemical space as revealed via compound enumeration by the research group of JL Reymond (Acc Chem Res 2015, 48(3):722-730)
          - Molecules of up to 13 atoms ⟶ 977 million possible molecules (10^9)
          - Molecules of up to 17 atoms ⟶ 166 billion possible molecules (10^11)
      • Drug space (<500 Da) is estimated to comprise molecules of up to 40 atoms (in some cases, even more) ⟶ roughly 10^60 molecules
  • 14. Bioactivity
      • Bioactivity is the activity elicited by the target protein of interest
      • Such target proteins are typically involved in key pathways that influence the course of a disease
      • Thus, great attention has been placed on modulating these target proteins
      • Primary literature
      • Curated databases
          • ChEMBL, BindingDB, MOAD, PubChem
      • Open innovation
          • Pharmaceutical companies are making data publicly available for non-commercial diseases
  • 15. What can computers do?
      • Computers (IBM Deep Blue and Watson) have defeated humans at chess and Jeopardy!
      • Google released a self-driving car
      • NASA uses computers to simulate space missions
      • Computers are being used to design aircraft and cars
      • Supermarkets and shopping malls are using our purchase history to analyze and predict our spending behavior
      • Why not use them to discover, design and develop new drugs?
      • Computers (deep learning) can paint like Van Gogh and Picasso
      • Computers can programmatically code music (Sonic Pi)
      • Computers can dream
  • 18. Why do we need computational models in drug discovery?
      • To discern the structure-activity relationship of a chemical library
      • In vitro data are limited, expensive, time-consuming, laborious, etc.
      • Computational models can be quickly built to preliminarily predict the pharmacokinetics and bioactivity of query compounds
      Anuwongcharoen et al. PeerJ 4 (2016) e1958
  • 19. Questions that can be answered by computational models
      • What target proteins could my compound(s) bind to and modulate?
      • Would my compound bind unspecifically to other proteins and thus have off-target activity?
      • What type of compounds can bind and modulate the bioactivity of the target protein of my interest?
      • Are there similar compounds to my query compound that may potentially exert similar binding behavior?
      • How does my compound bind to the protein structure of its target?
      • How can I modify the structure of my compound to enhance its pharmacokinetics and bioactivity?
      Hall et al. Prog Biophys Mol Biol 116 (2014) 82-91.
  • 20.
  • 21. Overview of Computational Drug Discovery: maximizing computational tools for successful drug discovery
      • Ligand-based: ADMET; QSAR; pharmacophore; statistical molecular design
      • Structure-based: molecular modeling; protein structure prediction (homology/comparative, ab initio); molecular dynamics; normal mode analysis; docking/reverse docking; binding cavity analysis; pharmacophore
      • Systems-based: protein–ligand interactome; protein–protein interactome; drug target gene expression; intrinsically disordered proteins; allo-network drugs
      • Medicinal chemistry: high-throughput synthesis; high-throughput screening; privileged structures; bioisostere; chemoisostere; scaffold hopping
      • Bioinformatics: sequence alignment; BLAST; phylogenetic analysis; biological space
      • Cheminformatics: computational chemistry; molecular descriptors; chemical space; profiling; filtering (Lipinski's rule of 5); search (molecular similarity; substructure similarity; shape, volume and charge-based similarity)
      • Databases
          - Small molecules: DrugBank, ChEMBL, PubChem, BindingDB, ZINC
          - Proteins: PDB, UniProt, SCOP
          - Protein-protein: MINT, STITCH, STRING
          - Pathway: KEGG, Reactome
      • Chemogenomics: proteochemometrics; computational chemogenomics; graph/network theory
      • Fragment-based: fragment-based docking; fragment-based QSAR; ligand growing
      Nantasenamat and Prachayasittikul. Expert Opin Drug Discov 10 (2015) 321-329.
  • 22. Bioinformatics
      • Bioinformatics is a discipline entailing the use of computational approaches to analyze biological data
          ‣ Analyze and compare genes, proteins and genomes
          ‣ Explore structures and functions of biomolecules (DNA, protein, lipid and carbohydrate)
          ‣ Explore network biology and metabolic pathways
      Image: http://www.gettyimages.com/detail/photo/bioinformatics-background-concept-royalty-free-image/475811932?esource=SEO_GIS_CDN_Redirect
      [Figure: binding-site residues I424, L428, F404, R394, E353, A350, D351, L354, P535, W383, L525. Suvannang et al. Manuscript under preparation.]
  • 23. Cheminformatics
      • Cheminformatics is a discipline at the interface of chemistry and computers that enables the analysis of various aspects relevant to chemical structures
          ‣ Chemical space for investigating molecular similarity/diversity
          ‣ Molecular descriptors (e.g. MW, LogP, nHBdon, nHBacc) and quantum chemical descriptors (HOMO, LUMO, HOMO-LUMO)
      Ertl and Rohde. J Cheminf 4 (2012) 12.
  • 24. Drugs and their precursors
      • Fragments are one of many substructures found in a compound (drug)
      • Privileged substructures are substructures that are commonly found in inhibitors/activators (drugs) against several therapeutic targets
      • Hits are a small subset of compounds from large chemical libraries that are identified by high-throughput screening
      • Leads are compounds that have undergone minor structural optimization from hits. From there, these leads often undergo further rounds of "lead optimization"
      • Drugs are the few leads that have passed rigorous tests (pre-clinical and clinical trials) before reaching the market
  • 25. Identifying hits
      • So how does one go about identifying hit compounds?
          - High-throughput screening (experimental and computational)
          - Find compounds similar to known actives, since the bioactivity of each compound is not an isolated point (similar chemical structures tend to have similar biological activity)
              ‣ 30% of the compounds that are similar to known actives are themselves active
      https://southernresearch.org/news/nih-contract-high-throughput-screening-for-zika/
      Hernandez-Santoyo et al. Protein-protein and protein-ligand docking. DOI:10.5772/56376
      Martin YC, J Med Chem 2002, 45(19):4350-4358
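The idea of searching for compounds similar to known actives can be illustrated with the Tanimoto coefficient, a common similarity measure for binary fingerprints. The sketch below is a minimal example, not the method used in the cited studies: the fingerprints are hand-made sets of "on" bit positions, and the compound names and the helper `rank_by_similarity` are hypothetical.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints given as sets of 'on' bit indices."""
    a, b = set(fp_a), set(fp_b)
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(a | b)


def rank_by_similarity(query_fp, library):
    """Return (name, similarity) pairs sorted from most to least similar to the query."""
    scored = [(name, tanimoto(query_fp, fp)) for name, fp in library.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical fingerprints: each set lists the bit positions that are switched on
known_active = {1, 4, 7, 9, 12}
library = {
    "cmpd_A": {1, 4, 7, 9, 13},
    "cmpd_B": {2, 3, 5},
    "cmpd_C": {1, 4, 7, 9, 12},
}

ranking = rank_by_similarity(known_active, library)
```

In practice the fingerprints would be computed from chemical structures by a cheminformatics toolkit, and compounds above a chosen similarity threshold would be prioritized for testing.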
  • 26. Lead generation (Hit-to-Lead)
      • Hits identified from high-throughput screens are transformed into leads by means of limited structural modification (so as to optimize their ADMET properties)
      • Generated leads are subjected to further rounds of lead optimization
      Fuller N et al. Drug Discov Today 2016, 21(8):1272-1283.
  • 27. Fragment-based Drug Design
      Zelboraf treats melanoma by inhibiting BRAF.
      Source: http://practicalfragments.blogspot.com/2011/08/first-fragment-based-drug-approved.html
  • 29. Lipinski's Rule of 5
      • Christopher Lipinski (Pfizer) analyzed a large set of > 2,000 orally active drugs, which led to what is known as Lipinski's Rule of 5, a set of rules defining the drug-likeness of small molecules
          ‣ Molecular weight ≤ 500 Da
          ‣ Lipophilicity (LogP) ≤ 5
          ‣ Hydrogen bond donors ≤ 5
          ‣ Hydrogen bond acceptors ≤ 10
      Lipinski et al. Adv Drug Deliv Rev 23 (1997) 3-25
      Suvannang et al. (2017) Unpublished results
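The four criteria on this slide translate directly into a filter. The following is a minimal sketch that assumes the descriptor values (molecular weight, LogP, donor and acceptor counts) have already been computed elsewhere; the function names and the example values are hypothetical, and each limit is treated as inclusive, which is the usual statement of the rule.

```python
def lipinski_violations(mw, logp, hbd, hba):
    """Return the list of Rule of 5 criteria that a compound violates."""
    violations = []
    if mw > 500:
        violations.append("molecular weight")
    if logp > 5:
        violations.append("logP")
    if hbd > 5:
        violations.append("H-bond donors")
    if hba > 10:
        violations.append("H-bond acceptors")
    return violations


def is_drug_like(mw, logp, hbd, hba, max_violations=0):
    """True if the compound breaks no more than `max_violations` criteria."""
    return len(lipinski_violations(mw, logp, hbd, hba)) <= max_violations


# Hypothetical descriptor values for two query compounds
small_polar = dict(mw=320.4, logp=2.1, hbd=2, hba=5)
large_greasy = dict(mw=712.9, logp=6.3, hbd=3, hba=9)

small_ok = is_drug_like(**small_polar)
large_problems = lipinski_violations(**large_greasy)
```

The `max_violations` parameter is a design choice of this sketch: some screening protocols tolerate one violation, so the strictness is left to the caller.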
  • 30. Lead-like Rule of 3
      • In drug discovery, lipophilicity and molecular weight tend to increase as lead optimization progresses, so as to improve the drug's affinity and selectivity
          ‣ Molecular weight < 300 Da
          ‣ Lipophilicity (LogP) < 3
          ‣ Hydrogen bond donors < 3
          ‣ Hydrogen bond acceptors < 3
          ‣ Rotatable bonds < 3
  • 31. Chemical space
      • Chemical space can be generally defined as the universe of synthetically feasible small molecules of <500 Da, which is estimated to be on the order of ~10^60 molecules
      • Visualizing it gives us a bird's-eye view of the relative diversity/likeness of chemical libraries
      • The Reymond group at the University of Bern, Switzerland developed a computational algorithm that enumerates all possible chemical structures that can be built from 17 heavy atoms; their GDB-17 database amounts to 166.4 billion molecules
      Reymond and Awale. ACS Chem Neurosci 3 (2012) 649-657.
  • 32. Biological space
      • Biological space refers to the chemical space of druggable protein families
          ‣ ADMET
          ‣ Aminergic/Lipophilic GPCR space
          ‣ Kinase space
          ‣ Protease space
          ‣ CYP450
          ‣ Nuclear receptors
      Petit-Zeman S. http://www.nature.com/horizon/chemicalspace/background/figs/explore_b1.html
  • 33. Fragment space
      • Fragment space can be defined as the universe or collection of all possible molecular fragments (substructures)
      • Fragments are < 300 Da
      • Utilization of the fragment space has been suggested to allow more diverse exploration of the possible chemical space
      • The Reymond group also extracted 10 million fragments from GDB-17
      Image: https://software.zbh.uni-hamburg.de/assets/softwareserverslide6-a0e42ecb3651120926821932574540d5b2e83ff0209654f9ab14804c7858451a.png
      Virshup et al. J Am Chem Soc 135 (2013) 7296-7303
  • 34. Structural classification of natural products (SCONP)
      Koch et al. PNAS 102 (2005) 17272-17277
  • 35. Nadin et al. Angew Chem Int Ed 51 (2012) 1114-1122.
  • 36. Polypharmacology
      • There is a paradigm shift from 'one drug-one target' to 'one drug-multiple targets'
      • Unintended off-target binding may elicit undesirable side effects and adverse effects
      • Desirable off-target binding gives you drug repositioning opportunities
      • Knowledge of polypharmacology may aid in the design of multi-targeted drugs
      [Figure: kinase targets of staurosporine]
      Reddy and Zhang. Expert Rev Clin Pharmacol 6 (2013) 41-47
  • 37. Drug repositioning/repurposing
      • There is a need to discover new drugs, especially for the treatment of rare and neglected diseases
      • Drug repositioning/repurposing is a lucrative approach as it tests existing FDA-approved drugs against various other whole-cell and target assays
      Wu et al. Mol BioSyst 9 (2013) 1268-1281.
  • 38. What is QSAR? (1)
      • QSAR/QSPR is the acronym of Quantitative Structure-Activity/Property Relationship
      • QSAR seeks to correlate structural features of compounds with their biological activities
      [Figure: predicted activity (pIC50) plotted against experimental activity (pIC50), both axes from 5.0 to 8.0]
  • 39. What is QSAR? (2)
      • Structure governs activity/property
      • Typically in the medicinal chemistry literature, the effects of substituent groups on activity are extensively studied
      • QSAR/QSPR studies exploit this knowledge for modeling biological or chemical activities/properties
      [Figure: chemical structure with substituent positions numbered 1 to 6]
  • 40. What is QSAR? (3)
      • QSAR involves three main concepts:
          1. Selecting a biological activity or chemical property of interest
          2. Generating the physicochemical description
          3. Predicting the biological activity or chemical property
      Workflow: Compounds of Interest ⟶ Molecular Descriptors (Computational Chemistry) ⟶ Machine Learning ⟶ Predict Biological Activity
      Example molecular descriptors:
          Qm       Energy      μ        HOMO       LUMO       HOMO-LUMO gap
          0.2271   -309.834    1.0521   -0.21346   -0.0127    0.20076
          0.2142   -195.31     0.2337   -0.22611   -0.01915   0.20696
      Example biological activity (IC50): 0.05, 1.50
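The three concepts on this slide (choose an activity, describe the compounds numerically, predict the activity) can be sketched with the simplest possible model. This is an illustration only: it swaps in a one-descriptor ordinary least-squares line where a real QSAR study would use many descriptors and a machine learning algorithm, and the descriptor values and pIC50 values below are made up for the example.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training set: one descriptor value and one measured pIC50 per compound
descriptor = [0.20, 0.21, 0.22, 0.23, 0.24]
pic50 = [5.0, 5.5, 6.0, 6.5, 7.0]

slope, intercept = fit_line(descriptor, pic50)


def predict(descriptor_value):
    """Predict pIC50 for a new compound from its descriptor value."""
    return slope * descriptor_value + intercept


predicted_new = predict(0.25)
```

The same pattern (fit on compounds with known activity, then predict for query compounds) carries over unchanged when the line is replaced by a multivariate model.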
  • 41. Growth of QSAR
      • A search in SCOPUS shows the growing trend of QSAR publications
  • 42. A typical QSAR workflow
      • Data set preparation
          - Source: ChEMBL 23, bioactivity data of ERα inhibitors
          - Initial data set: 10,666 bioactivity data for 5,809 compounds
          - Select bioactivity measured by IC50 ⟶ IC50 subset: 3,527 compounds
          - Select entries with CONFIDENCE_SCORE=9 and assay_type=B ⟶ selected data set: 1,346 compounds
          - Delete entries with < or > signs and those with …
          - Delete entries with missing SMILES notation
          - Remove duplicate SMILES
          - Transform IC50 to pIC50
          - Final data set: 1,299 compounds
      • QSAR modeling
          - Final data set: salt removal and tautomer standardization
          - Descriptor calculation: 12 sets of PaDEL fingerprints
          - Feature selection: remove collinear descriptors
          - Data splitting: 70/30 split ratio, perform 10 data splits
          - QSAR model ⟶ predicted pIC50 values
          - Evaluate performance: R2, Q2, Rm2, RMSE
          - Y-scrambling for evaluating chance correlation
          - Mechanistic interpretation of feature importance
      Suvannang et al. RSC Adv 2018, 8: 11344-11356
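Two steps of this workflow are simple enough to write down directly: the transformation of IC50 to pIC50 and the calculation of performance metrics. The sketch below assumes IC50 values are given in nanomolar (the slide does not state the unit) and uses the coefficient-of-determination form of R2; the cited study may define its statistics differently, so the function names and conventions here are assumptions.

```python
import math


def ic50_nm_to_pic50(ic50_nm):
    """pIC50 = -log10(IC50 in mol/L); the input is assumed to be in nanomolar."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    return 9.0 - math.log10(ic50_nm)


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical IC50 values (nM) converted to the pIC50 scale
pic50_values = [ic50_nm_to_pic50(v) for v in (1000.0, 100.0, 10.0)]
```

Working on the pIC50 scale makes the activity values more evenly distributed, so that higher values simply mean more potent compounds.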
  • 43. Applications of QSAR/QSPR models
      • Regulatory use: QSAR for modelling environmental toxicity/chemical hazards by EPA and OECD
      • Drug design: QSAR for modelling biological activities
      • Materials design: QSPR for modelling chemical properties
  • 45. Biological activity/chemical property modeled by QSAR
      Biological Activity                        Chemical Property
      Bioconcentration                           Boiling point
      Biodegradation                             Chromatographic retention time
      Carcinogenicity                            Dielectric constant
      Drug metabolism                            Diffusion coefficient
      Inhibitor constant                         Dissociation constant
      Mutagenicity                               Melting point
      Permeability (blood brain barrier, skin)   Reactivity
      Pharmacokinetics                           Solubility
      Receptor binding                           Stability
                                                 Thermodynamic properties
                                                 Viscosity
      Nantasenamat et al. EXCLI J. (2009) 8: 74-88
  • 46. QSAR versus proteochemometrics
      • QSAR: multiple compounds, single target protein
      • Proteochemometrics: multiple compounds, multiple target proteins
  • 47. Summary
      • QSAR models allow us to understand how changes to the chromophore structure lead to GFP color change
      • PCM models allow us to understand how changes to the chromophore structure, changes to the protein structure and the chromophore-protein interaction influence GFP color change
      • Insights from the predictive models could be used to further extend the spectral repertoire of GFP
      Nantasenamat C et al. J. Comput. Chem. 35(27): 1951-1966.
  • 48. Proteochemometrics
      • Proteochemometrics was developed by Maris Lapins and Jarl Wikberg of Uppsala University in 2001
      • Advantages
          • Can explain ligand-target affinity by providing detailed maps down to the substructure and amino acid level
          • Can be used to rationalise why a ligand is active toward one target and not toward another related target
          • Has been shown to be useful for drug repositioning
          • Could be used for personalized medicine
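What distinguishes a proteochemometric model from a QSAR model is its input: each row describes a ligand-protein pair rather than a ligand alone. The sketch below shows one common way to build such rows, by concatenating ligand descriptors, protein descriptors and their pairwise products (cross-terms). The descriptor values, the names `lig_1`, `prot_A` and so on, and the function `pcm_features` are all hypothetical.

```python
def pcm_features(ligand_desc, protein_desc, cross_terms=True):
    """Concatenate ligand and protein descriptors, optionally adding pairwise products."""
    features = list(ligand_desc) + list(protein_desc)
    if cross_terms:
        # Cross-terms let a model capture ligand-protein interaction effects
        features += [l * p for l in ligand_desc for p in protein_desc]
    return features


# Hypothetical descriptor vectors for two ligands and two related target proteins
ligands = {"lig_1": [0.5, 2.0], "lig_2": [1.5, 1.0]}
proteins = {"prot_A": [1.0, 0.0, 2.0], "prot_B": [0.0, 1.0, 1.0]}

# One feature row per ligand-protein pair
rows = {
    (lig_name, prot_name): pcm_features(lig_desc, prot_desc)
    for lig_name, lig_desc in ligands.items()
    for prot_name, prot_desc in proteins.items()
}
```

Each row would be paired with the measured activity of that ligand against that protein, so a single model can be trained across several targets at once.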
  • 49. Conclusion (1)
      • It is without a doubt that the QSAR paradigm boasts much benefit for the rational design of robust compounds
      • Nevertheless, there are certain shortcomings that may limit the widespread application of QSAR
          • Workflow of QSAR model development
          • High dimensionality of the input space
          • Representation of the molecular structure
          • Interpretability and meaning of the developed QSAR models
          • Presence of outliers or activity cliffs
          • Validation of QSAR model performance
          • Applicability in real-world settings
  • 50. Conclusion (2)
      • In spite of certain inherent flaws, the QSAR paradigm is inevitably one of the most useful forces contributing to the rapid development of drug discovery and design.
      • As with all technologies, QSAR is not perfect; however, its weaknesses and flaws are continuously being identified, solved and reformed to help shape a new, improved and robust approach that approaches minimal predictive error
      • To help realize the goal of an intuitive approach to developing robust QSAR models, our laboratory has developed software that affords semi-automated, if not automated, QSAR modeling.
  • 51. Conclusion (3)
      • After more than 10 years of QSAR research, we can say that the demise of QSAR is a myth, provided QSAR is done properly, and that we have only scratched the surface of its full potential
      • QSAR is continuously evolving… starting from 2D-QSAR to 8D-QSAR!
      • Proteochemometrics (so to speak, multi-target QSAR) enables us to take advantage of the explosion of omics data
  • 52. AutoWeka
      Software for performing automated data mining
      • AutoWeka is a Python wrapper of Weka
      • It is freely available at http://www.mt.mahidol.ac.th/autoweka/
      • Nantasenamat et al. Chapter 8: AutoWeka: Toward an Automated Data Mining Software for QSAR and QSPR Studies. In: Cartwright H. Artificial Neural Networks, Springer, pp. 119-147.
  • 53. BioCurator
      Nantasenamat et al. Manuscript under preparation.
      • We have developed a web application that allows users to upload ChEMBL bioactivity data for automatic data curation
      Protocol
      • The web app selects a subset of IC50/Ki data
      • Removes redundant compounds if bioactivity values exceed 2 SD
      • Removes data with < or > symbols in the bioactivity label
      • Removes redundant compounds based on SMILES notation
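The protocol above can be sketched as a small curation function. This is not BioCurator's code, which is not shown in the slides; it is one possible reading of the protocol. The record field names (`smiles`, `type`, `relation`, `value`) are hypothetical, and the "2 SD" step is interpreted here as discarding a compound whose replicate measurements have a standard deviation above a threshold and averaging the replicates otherwise.

```python
from statistics import mean, pstdev


def curate(records, activity_type="IC50", max_sd=2.0):
    """Filter bioactivity records and merge replicates per SMILES string."""
    grouped = {}
    for rec in records:
        if rec["type"] != activity_type:
            continue  # keep only the selected activity type
        if rec["relation"] != "=":
            continue  # drop censored values reported with < or >
        if not rec["smiles"]:
            continue  # drop entries without a structure
        grouped.setdefault(rec["smiles"], []).append(rec["value"])

    curated = {}
    for smiles, values in grouped.items():
        if len(values) > 1 and pstdev(values) > max_sd:
            continue  # replicates disagree too much (assumed meaning of "2 SD")
        curated[smiles] = mean(values)
    return curated


# Hypothetical records loosely resembling a bioactivity export
records = [
    {"smiles": "CCO", "type": "IC50", "relation": "=", "value": 10.0},
    {"smiles": "CCO", "type": "IC50", "relation": "=", "value": 12.0},
    {"smiles": "CCN", "type": "IC50", "relation": ">", "value": 100.0},
    {"smiles": "CCC", "type": "IC50", "relation": "=", "value": 1.0},
    {"smiles": "CCC", "type": "IC50", "relation": "=", "value": 50.0},
    {"smiles": "CCCl", "type": "Ki", "relation": "=", "value": 5.0},
]

curated = curate(records)
```

Grouping by SMILES string assumes the structures have already been standardized, since the same compound can otherwise be written as several different SMILES strings.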
  • 54. osFP
      Simeon et al. J Cheminf 8 (2016) 72.
      Protocol
      • The web app accepts the input peptide sequence and computes amino acid composition descriptors
      • Applies the constructed predictive model to predict the class label of the query peptide
      • Predicted class label is relayed to the Results output
      Paper: osFP: a web server for predicting the oligomeric states of fluorescent proteins. Saw Simeon, Watshara Shoombuatong, Nuttapat Anuwongcharoen, Likit Preeyanon, Virapong Prachayasittikul, Jarl E. S. Wikberg and Chanin Nantasenamat. J Cheminform (2016) 8:72. DOI 10.1186/s13321-016-0185-8. Research article, Open Access.
      Abstract (excerpt): Currently, monomeric fluorescent proteins (FP) are ideal markers for protein tagging.
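The first step of this protocol, computing amino acid composition descriptors, is straightforward to illustrate: the descriptor is the fraction of each of the 20 standard amino acids in the sequence. The sketch below is a minimal version, not the web server's implementation; the example peptide is made up, and ignoring non-standard characters is a choice made for this example.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids, one-letter codes


def amino_acid_composition(sequence):
    """Return the fraction of each standard amino acid in the sequence."""
    residues = [aa for aa in sequence.upper() if aa in AMINO_ACIDS]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    total = len(residues)
    return {aa: residues.count(aa) / total for aa in AMINO_ACIDS}


# Hypothetical query peptide
aac = amino_acid_composition("MKTAYIAKQR")
descriptor_vector = [aac[aa] for aa in AMINO_ACIDS]
```

The resulting 20-element vector has the same length for every peptide, which is what allows sequences of different lengths to be fed to the same predictive model.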
  • 55. HemoPred • Win et al. Future Med Chem 9 (2017) 275-291. • Protocol ◦ The web app accepts the input peptide sequence and computes amino acid composition descriptors ◦ Applies the constructed predictive model to predict the class label of the query peptide ◦ Predicted class label is relayed into the Results output • [Image: first page of the article] HemoPred: a web server for predicting the hemolytic activity of peptides. Future Medicinal Chemistry, Research Article.
  • 56. CryoProtect • Pratiwi et al. Journal of Chemistry 2017, Article ID 9861752. • Protocol ◦ The web app accepts the input peptide sequence and computes amino acid composition descriptors ◦ Applies the constructed predictive model to predict the class label of the query peptide ◦ Predicted class label is relayed into the Results output • [Image: first page of the article] Reny Pratiwi, Aijaz Ahmad Malik, Nalini Schaduangrat, Virapong Prachayasittikul, Jarl E. S. Wikberg, Chanin Nantasenamat and Watshara Shoombuatong. CryoProtect: A Web Server for Classifying Antifreeze Proteins from Nonantifreeze Proteins. Journal of Chemistry (Hindawi), Volume 2017, Article ID 9861752, 15 pages. https://doi.org/10.1155/2017/9861752
  • 57. How to get started in CDD? • Hardware ◦ Laptop ◦ Desktop ◦ High-performance computer ◦ Compute clusters ◦ Cloud computing • Software ◦ Commercial ◦ Free • Programming ◦ C, Java, etc. ◦ R, Python, MATLAB, etc.
  • 58. Computational Drug Discovery based on Open Source • Data source ◦ Bioactivity data: ChEMBL, PubChem, BindingDB ◦ Chemical database: ZINC, ChemSpider, GDB-17 ◦ Biological database: PDB, UniProt • Data curation and pre-processing ◦ BioCurator (developed in-house) ◦ Babel • Descriptor calculation ◦ Rcpi, PyDPI, CDK, PaDEL • Multivariate analysis ◦ R: caret ◦ Python: scikit-learn • Plots ◦ R: ggplot2 ◦ Python: Matplotlib, seaborn • Molecular modeling ◦ Avogadro ◦ PyMOL ◦ Chimera ◦ VMD • Molecular docking ◦ AutoDock • Molecular dynamics ◦ GROMACS ◦ NAMD