AI-augmented Drug Discovery.pdf

AI-Augmented
Drug Discovery
Creative Biolabs provides innovative drug discovery services
based on our original Artificial Intelligence-augmented
technology, especially for the discovery of therapeutic
antibodies and small molecules.
Email: info@creative-biolabs.com
Address: SUITE 203, 17 Ramsey Road, Shirley, NY 11967, USA
Web: www.creative-biolabs.com

Introducing a new drug to market
can cost pharmaceutical
companies an average of $2.6 billion
and 11-15 years of research and
development.
Even once new drug candidates
show potential in laboratory
testing, fewer than 10% of drug
candidates make it to market
following Phase I trials.
Between 2010 and 2017, 76% of
new drugs approved by the US
Food and Drug Administration
(FDA) were small molecules.
$2.6 B 10% 76%
WHY USE AI IN DRUG
DISCOVERY?

After a drug makes it through the preclinical development
phase and receives approval from the FDA,
researchers begin testing it with human
participants. AI can facilitate participant monitoring
during clinical trials, generating a larger set of data
more quickly, and aid in participant retention by
personalizing the trial experience.
AI in Clinical Trials
(Phase III)
The drug discovery process ranges from reading and analyzing
already existing literature, to testing the ways potential drugs
interact with targets. According to one report, AI could curb drug
discovery costs for companies by as much as 70%.
AI in Drug Discovery
(Phase I)
The preclinical development phase of drug discovery involves
testing potential drug candidates in animal models. Utilizing AI
during this phase could help trials run smoothly and enable
researchers to more quickly and successfully predict how a
drug might interact with the animal model.
AI in Preclinical Development
(Phase II)

• Predicting 3D structure of
target protein
• Predicting drug-protein
interactions
• AI in determining drug
activity
• AI in de novo drug design
AI in
drug design
AI In Drug Discovery
AI in
polypharmacology
• Designing biospecific
drug molecules
• Designing multitarget
drug molecules
AI in
chemical synthesis
• Predicting reaction yield
• Predicting retrosynthesis
pathways
• Developing insights into
reaction mechanisms
• Designing synthetic route
AI in
drug repurposing
• Identification of
therapeutic target
• Prediction of new
therapeutic use
AI in
drug screening
• Prediction of toxicity
• Prediction of bioactivity
• Prediction of
physicochemical property
• Identification and
classification of target cells

Classes of Learning Tasks and Techniques
A mix of supervised and unsupervised learning, in which less expensive and more abundant unlabeled
data can be used alongside labeled data to train a classifier.
Semisupervised Learning (Fig. A)
A learning algorithm can interactively query the user to determine labels for unlabeled data in the
regions of the input space about which the model is least certain.
Active Learning (Fig. B)
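The querying step described above can be sketched with uncertainty sampling, one common active learning strategy. This is a minimal illustration only: the function name `most_uncertain` and the probability values are hypothetical, and it assumes a binary classifier that already outputs a probability for each unlabeled compound.

```python
def most_uncertain(probabilities, batch_size=1):
    """Return indices of the unlabeled samples whose predicted probability
    of the positive class is closest to 0.5, i.e. where the model is least certain."""
    ranked = sorted(range(len(probabilities)),
                    key=lambda i: abs(probabilities[i] - 0.5))
    return ranked[:batch_size]

# Hypothetical model outputs for five unlabeled compounds.
pool_probs = [0.95, 0.52, 0.10, 0.45, 0.80]

# Indices of the compounds to send to the user (or an assay) for labeling.
query = most_uncertain(pool_probs, batch_size=2)
print(query)
```

After the queried compounds are labeled, they would be added to the training set and the model retrained before the next query round.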
Describes a family of algorithms that relax the common assumption that the training and test data
should be in the same feature space and follow the same distribution.
Transfer Learning (Fig. D)
Can be treated as a geometric or topological problem; the goal is to find similarities and differences
between data points, which are then used to spatially order the data.
Unsupervised Learning
The goal is to reconstruct the unknown function f that assigns output values y to data points x.
Supervised Learning
Instead of learning only one task at a time, as in single-task learning, several different but
conceptually related tasks are learned in parallel and make use of a shared internal representation.
Multitask Learning (Fig. E)
Strives, to some extent, to emulate reward-driven learning; in its simplest configuration, an agent
attempts to find the optimal set of actions to promote some outcome.
Reinforcement Learning (Fig. C)
Xin Y, et al. Concepts of Artificial Intelligence for Computer-Assisted Drug Discovery. Chem. Rev. 2019, 119 (18): 10520-10594.

Bayesian methods are those that explicitly
apply Bayes’ theorem to classification and
regression problems.
Bayesian Algorithms
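As a rough illustration of applying Bayes' theorem to classification, the sketch below implements a small Bernoulli naive Bayes classifier. It is not the ChemStable method or any specific published model; the function names, the add-one smoothing, and the toy "substructure flag" data are all assumptions made for the example.

```python
import math

def train_naive_bayes(X, y):
    """Fit a Bernoulli naive Bayes model with add-one (Laplace) smoothing.
    X: list of binary feature vectors, y: list of class labels."""
    n_features = len(X[0])
    model = {}
    for c in sorted(set(y)):
        rows = [x for x, label in zip(X, y) if label == c]
        prior = len(rows) / len(X)
        # P(feature_j = 1 | class c), smoothed so it is never exactly 0 or 1.
        likelihood = [(sum(r[j] for r in rows) + 1) / (len(rows) + 2)
                      for j in range(n_features)]
        model[c] = (prior, likelihood)
    return model

def predict_proba(model, x):
    """Bayes' theorem: posterior is proportional to prior times likelihood."""
    log_scores = {}
    for c, (prior, likelihood) in model.items():
        score = math.log(prior)
        for xj, p in zip(x, likelihood):
            score += math.log(p if xj else 1 - p)
        log_scores[c] = score
    # Normalize in log space for numerical stability.
    top = max(log_scores.values())
    unnorm = {c: math.exp(s - top) for c, s in log_scores.items()}
    total = sum(unnorm.values())
    return {c: v / total for c, v in unnorm.items()}

# Hypothetical toy data: two substructure flags per compound,
# label 1 = "stable", label 0 = "unstable".
X = [[1, 0], [1, 1], [0, 1], [0, 0]]
y = [1, 1, 0, 0]
model = train_naive_bayes(X, y)
posterior = predict_proba(model, [1, 0])
print(posterior)
```

The "naive" part is the assumption that features are conditionally independent given the class, which is what lets the likelihood factor into a simple product.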
These methods are called instance-based because they build
hypotheses directly from the training instances.
They are also known as memory-based learning
or lazy learning.
Instance-Based Methods
Algorithms for constructing decision trees
usually work top-down, by choosing a
variable at each step that best splits the
set of items.
Decision Tree Algorithms
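The top-down step of choosing the variable that best splits the set of items can be sketched as follows. The example uses Gini impurity as the split criterion, which is one common choice and an assumption here; the descriptor values and labels are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means perfectly pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(X, y):
    """Search every feature and threshold for the split 'x[f] <= t' with the
    lowest weighted Gini impurity. Returns (feature_index, threshold, score)."""
    best = None
    for f in range(len(X[0])):
        for t in sorted(set(row[f] for row in X)):
            left = [label for row, label in zip(X, y) if row[f] <= t]
            right = [label for row, label in zip(X, y) if row[f] > t]
            if not left or not right:
                continue  # a split must send items to both sides
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[2]:
                best = (f, t, score)
    return best

# Hypothetical descriptors per compound: [molecular weight, logP]; label 1 = active.
X = [[250, 1.2], [310, 3.5], [420, 4.1], [480, 2.0]]
y = [0, 0, 1, 1]
split = best_split(X, y)
print(split)
```

A full tree builder would apply `best_split` recursively to the left and right subsets until a stopping rule (purity, depth, or minimum size) is met.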
In statistics and machine learning,
ensemble methods use multiple
learning algorithms to obtain better
predictive performance than could be
obtained from any of the constituent
learning algorithms alone.
Ensemble Algorithms
Dimensionality reduction seeks a lower-
dimensional representation of numerical
input data that preserves the salient
relationships in the data.
Dimensionality Reduction
Artificial neural networks (ANNs) consist of
input, hidden, and output layers with
connected neurons (nodes) to simulate the
human brain.
Artificial Neural Networks
Common Learning Algorithms

Bayesian Algorithms
Liu ZH, et al. ChemStable: A web server for rule-embedded naïve Bayesian learning approach to predict
compound stability. J. Comput. Aided Mol. Des. 2014, 28: 941-950.

Instance-Based Methods
SVM is a supervised machine learning algorithm used for both classification
and regression. For classification, the objective of the SVM algorithm is to find a
hyperplane in an N-dimensional space that distinctly separates the data points by class.
Support Vector Machine
A self-organizing map (SOM), or self-organizing feature map, is an unsupervised machine learning
technique used to produce a low-dimensional representation of a higher-dimensional
data set while preserving the topological structure of the data.
Self-organizing Map
KNN is a simple, supervised machine learning algorithm that can be used to
solve both classification and regression problems.
K-nearest Neighbor
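A minimal sketch of KNN classification is shown below. It assumes compounds are represented as sets of "on" fingerprint bits and compared with Tanimoto similarity, a choice that is common in cheminformatics but is not stated in the slide; all fingerprints and labels are made up for illustration.

```python
from collections import Counter

def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of 'on' fingerprint bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def knn_predict(train, query, k=3):
    """train: list of (fingerprint_set, label) pairs.
    Predict by majority vote among the k training items most similar to query."""
    neighbors = sorted(train, key=lambda item: tanimoto(item[0], query),
                       reverse=True)[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Hypothetical training set of fingerprint bit sets.
train = [
    ({1, 2, 3, 4}, "active"),
    ({1, 2, 3, 9}, "active"),
    ({2, 3, 4, 5}, "active"),
    ({10, 11, 12}, "inactive"),
    ({10, 12, 13}, "inactive"),
]
prediction = knn_predict(train, {1, 2, 3, 5}, k=3)
print(prediction)
```

No model is fitted in advance: all the work happens at prediction time, which is why KNN is grouped with the instance-based (lazy learning) methods. For regression, the vote would be replaced by an average of the neighbors' values.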
Xin Y, et al. Concepts of Artificial Intelligence for Computer-Assisted Drug Discovery. Chem. Rev. 2019,
119 (18): 10520-10594.

Decision Tree Algorithms
Random forests, or random decision forests, are an ensemble learning method for
classification, regression, and other tasks that operates by constructing a multitude of
decision trees at training time.
Random Forest
A decision tree is a decision support tool that uses a tree-like model of decisions and their
possible consequences, including chance event outcomes, resource costs, and utility.
Decision Tree

Ensemble Algorithms
Boosting is an ensemble learning method that combines a set of weak
learners into a strong learner to minimize training errors. In boosting, a
random sample of data is selected and fitted with a model, and further
models are then trained sequentially; that is, each model tries to
compensate for the weaknesses of its predecessor.
Boosting
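The idea of each model compensating for its predecessor can be sketched with a small gradient-boosting loop for regression, in which every new weak learner (a one-split "stump") is fitted to the residual errors left by the models before it. This is one boosting variant chosen for brevity, not the only one; the data, the learning rate, and all function names are assumptions.

```python
def fit_stump(xs, residuals):
    """Weak learner: one threshold on a 1-D feature, predicting the mean
    residual on each side. Returns (threshold, left_mean, right_mean)."""
    best = None
    for t in sorted(set(xs))[:-1]:  # skip the largest value so both sides are non-empty
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = (sum((r - lm) ** 2 for r in left)
               + sum((r - rm) ** 2 for r in right))
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    return best[1:]

def boost(xs, ys, n_rounds=20, learning_rate=0.5):
    """Sequentially fit stumps to the current residuals and add them up."""
    base = sum(ys) / len(ys)          # start from the mean prediction
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        t, lm, rm = fit_stump(xs, residuals)
        stumps.append((t, lm, rm))
        preds = [p + learning_rate * (lm if x <= t else rm)
                 for x, p in zip(xs, preds)]
    return base, stumps, preds

def mse(ys, preds):
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)

# Hypothetical 1-D data (e.g. a descriptor value vs. a measured activity).
xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.2, 3.1, 3.0, 5.2, 5.1]
base, stumps, preds = boost(xs, ys)
baseline_error = mse(ys, [base] * len(ys))
boosted_error = mse(ys, preds)
print(baseline_error, boosted_error)
```

The training error of the boosted ensemble is lower than that of the constant baseline; in practice the number of rounds and the learning rate are tuned on held-out data to avoid overfitting.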
Bagging is an ensemble learning method that is commonly used
to reduce variance within a noisy dataset. In bagging, a random
sample of data in a training set is selected with replacement,
meaning that the individual data points can be chosen more than
once.
Bagging
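Sampling with replacement and then aggregating the resulting models can be sketched as below. The base learner is a deliberately simple threshold rule on one feature, and the data, seed, and function names are hypothetical; real bagging ensembles typically use decision trees as base learners.

```python
import random
from collections import Counter

def bootstrap_sample(data, rng):
    """Draw len(data) items with replacement, so items may repeat."""
    return [rng.choice(data) for _ in range(len(data))]

def fit_threshold(sample):
    """Toy base learner on (x, label) pairs: put the threshold halfway
    between the class means and predict 1 when x lies above it."""
    pos = [x for x, label in sample if label == 1]
    neg = [x for x, label in sample if label == 0]
    if not pos or not neg:            # bootstrap sample contains one class only
        only = 1 if pos else 0
        return lambda x: only
    threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda x: 1 if x > threshold else 0

def bagging_fit(data, n_models=25, seed=0):
    """Train each model on its own bootstrap sample of the training set."""
    rng = random.Random(seed)
    return [fit_threshold(bootstrap_sample(data, rng)) for _ in range(n_models)]

def bagging_predict(models, x):
    """Aggregate the individual predictions by majority vote."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]

# Hypothetical 1-D training data: (feature value, label).
data = [(0.5, 0), (1.0, 0), (1.5, 0), (2.0, 0),
        (6.0, 1), (6.5, 1), (7.0, 1), (7.5, 1)]
models = bagging_fit(data)
print(bagging_predict(models, 0.2), bagging_predict(models, 8.0))
```

Because each model sees a slightly different sample, their individual errors tend to differ, and voting averages much of that variance away.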

Dimensionality Reduction
LDA is a generalization of Fisher's linear discriminant, a method used in
statistics, pattern recognition and machine learning to find a linear
combination of features that characterizes or separates two or more classes
of objects or events.
Linear Discriminant Analysis
Image From Wikipedia
A visual depiction of the resulting PCA projection for a set of 2D points.
A visual depiction of the resulting LDA projection for a set of 2D points.
PCA is a popular technique for analyzing large datasets containing a high
number of dimensions/features per observation, increasing the
interpretability of data while preserving the maximum amount of information,
and enabling the visualization of multidimensional data.
Principal Component Analysis
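To make the projection idea concrete, the sketch below finds the first principal component of a small 2D point set using power iteration on the covariance matrix, then projects the points onto it. Power iteration is used here only to keep the example dependency-free (library implementations usually rely on SVD), and the points are made up.

```python
import math

def first_principal_component(points, n_iter=100):
    """Return (mean, direction), where direction is the unit vector of
    maximum variance, found by power iteration on the covariance matrix."""
    n, d = len(points), len(points[0])
    mean = [sum(p[j] for p in points) / n for j in range(d)]
    centered = [[p[j] - mean[j] for j in range(d)] for p in points]
    cov = [[sum(row[a] * row[b] for row in centered) / (n - 1)
            for b in range(d)] for a in range(d)]
    v = [1.0] * d
    for _ in range(n_iter):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return mean, v

def project(points, mean, v):
    """1-D coordinates of each point along the direction v."""
    return [sum((p[j] - mean[j]) * v[j] for j in range(len(v))) for p in points]

# Hypothetical 2D points lying roughly along the diagonal y = x.
points = [[1.0, 1.1], [2.0, 1.9], [3.0, 3.2], [4.0, 3.9], [5.0, 5.1]]
mean, pc1 = first_principal_component(points)
scores = project(points, mean, pc1)
print(pc1, scores)
```

For this data the first component points close to the diagonal, so a single coordinate per point retains most of the variance of the original two.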

Artificial Neural Networks
A DNN is an ANN that has several hidden layers between its input
and output layers. Deep nets process data in complex ways by employing
sophisticated mathematical modeling.
Deep Neural Networks
ANNs are computing systems inspired by the biological neural networks
that constitute animal brains. A typical ANN architecture contains many
artificial neurons arranged in a series of layers: the input layer; the output
layer, i.e., the top layer, which generates a desired prediction (ADMET
properties, activity, a fingerprint vector, etc.); and one or more hidden
layers in which intermediate representations of the input data are
computed.
Artificial neural networks
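The layer structure described above can be illustrated with a forward pass through a tiny network with one hidden layer. The weights are hand-picked, hypothetical values rather than trained parameters, and the ReLU and sigmoid activations are assumptions chosen for the example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def relu(z):
    return max(0.0, z)

def dense(inputs, weights, biases, activation):
    """One fully connected layer: each neuron applies the activation to a
    weighted sum of the inputs plus a bias (one weight row per neuron)."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def forward(x, layers):
    """Pass the input through every layer in order."""
    for weights, biases, activation in layers:
        x = dense(x, weights, biases, activation)
    return x

# Hypothetical network: 3 inputs -> 2 hidden neurons -> 1 output neuron.
layers = [
    ([[0.5, -0.2, 0.1], [-0.3, 0.8, 0.4]], [0.0, 0.1], relu),   # hidden layer
    ([[1.0, -1.0]], [0.0], sigmoid),                           # output layer
]
output = forward([1.0, 0.0, 1.0], layers)
print(output)
```

Here the single output could be read as a predicted probability (for example, of activity); training would adjust the weights and biases by backpropagation, which is not shown.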

DeepVS: Boosting Docking-Based Virtual
Screening with DL
Pereira J.C. Boosting docking-based virtual screening with deep learning. J. Chem. Inf. Model. 2016;56:2495–2506.
Mostafa K. DeepAffinity: interpretable deep learning of compound–protein affinity through unified recurrent and convolutional neural networks. Bioinformatics. 2019, 35(18):3329–3338.
DeepVS is a deep neural network that uses the output of a
docking program and learns how to extract relevant features from basic
data. The approach introduces the use of atom and amino acid
embeddings and implements an effective way of creating distributed
vector representations of protein–ligand complexes by modeling the
compound as a set of atom contexts that is further processed by a
convolutional layer.
DeepVS

DeepAffinity: DL Method
Used to Measure DTBA
Mostafa K. DeepAffinity: interpretable deep learning of compound–protein affinity through unified recurrent and convolutional neural networks. Bioinformatics. 2019, 35(18):3329–3338.
DeepAffinity is a deep learning method used to predict drug-target
binding affinity (DTBA). Under novel representations of
structurally-annotated protein sequences, a semi-supervised
deep learning model that unifies recurrent and convolutional
neural networks has been proposed to exploit both unlabeled
and labeled data, for jointly encoding molecular
representations and predicting affinities. Performance for new
protein classes with few labeled data is further improved by
transfer learning.
DeepAffinity

DeepTox: Toxicity Prediction Using Deep Learning
Mayr A. DeepTox: toxicity prediction using deep learning. Front. Environ. Sci. 2016, 3:80.
Representation of a toxicophore by hierarchically related features.

AI-Based QSAR Models
Profile-QSAR
SVM QSAR
Bayesian QSAR
Multitask QSAR

• High throughput, screen large numbers of clones
• Large library capacity: from 10^7 to over 10^8
• Various phage display systems (M13, λ, T7)
• Tailored biopanning strategies
• Wide range of applications
Antibody Production by Phage Display
Creative Biolabs has combined AI, big data, machine learning, and phage display to generate a novel AI-powered computational antibody drug discovery
platform. Aided by this innovative platform, one-stop human antibody discovery services are provided, including antibody-antigen binding prediction,
antibody candidate generation, antibody sequence optimization, and antibody production & characterization.
AI-Based One-stop Antibody
Discovery Platform
• Discover and analyze new antibody clusters
• Generate new sequences within existing clusters
• Accelerate the generation of high-affinity antibodies
• Rapidly generate novel antibody sequences using
computational algorithms to help improve affinity, solubility,
manufacturability, specificity, and stability
Augmented Antibody Discovery with AI

AI can typically generate 10 times more
antibody sequence clusters than a laboratory-
based approach alone. Diversity leads to the
discovery of new binding modalities and
potentially new therapeutic modes-of-action.
Antibody Discovery Services
Creative Biolabs specializes in designing and
performing high-quality custom AI-based antibody
screening assays, with different formats, endpoints,
and parameters, to satisfy any specific requirement.
Antibody Screening Services
Creative Biolabs offers a wide variety of antibody
engineering services that use AI-based algorithms to quickly
and efficiently optimize properties of existing antibodies, such
as affinity, solubility, cross-reactivity, manufacturability,
immunogenicity, specificity, and stability.
Creative Biolabs has applied AI technology in small
molecule design and optimization to improve affinity,
specificity, and validity. Our innovative AI methods range
from in silico molecule screening and molecular modeling to
AI-based molecule optimization.
Small Molecule Design & Optimization
Creative Biolabs provides the best strategy and
customized protocols for its model training data
service, ultimately accelerating the discovery of
novel drug candidates.
AI-Augmented Drug Discovery at Creative Biolabs
Antibody Engineering Services
Model Training Data Services
