Biotech Data Science @ GUGC in Korea: Deep Learning for Prediction of Drug-Target Interaction and DNA Analysis.
Poster presented at the BIG N2N Symposium 2016.
1. http://datasciencelab.ugent.be/
Ghent University – iMinds, ELIS Department/Data Science Lab
Ghent University Global Campus – Center for Biotech Data Science
Mijung Kim, Jasper Zuallaert, Wesley De Neve, and Rik Van de Walle
BIG N2N Annual Symposium
{mijung.kim, jasper.zuallaert, wesley.deneve, rik.vandewalle}@ugent.be
May 19, 2016 | Ghent | Belgium
Overview of Deep Machine Learning
Prediction of Drug-Target Interaction
Input layer → Hidden layers → Output layer
Automatic end-to-end learning of hierarchical features through multi-layered neural networks
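As a rough illustration of the layered structure described above, the following sketch runs a forward pass through a small fully connected network. It is not the poster's model: the layer sizes, the ReLU and sigmoid activations, the random weights, and the helper names (`dense`, `init_layer`, `forward`) are all assumptions chosen for illustration, and no training step is shown.

```python
import math
import random


def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """Fully connected layer: one weight row and one bias per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def init_layer(n_in, n_out, rng):
    """Random weights in [-1, 1] and zero biases (placeholder initialisation)."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def forward(inputs, layers):
    """Input layer -> hidden layers (ReLU) -> output layer (sigmoid)."""
    activations = inputs
    for index, (weights, biases) in enumerate(layers):
        activations = dense(activations, weights, biases)
        if index < len(layers) - 1:  # hidden layers only
            activations = relu(activations)
    return [1.0 / (1.0 + math.exp(-v)) for v in activations]


rng = random.Random(0)
sizes = [4, 8, 8, 1]  # hypothetical: 4 inputs, two hidden layers of 8, 1 output
layers = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
output = forward([1.0, 0.0, 0.0, 1.0], layers)
print(output)
```

Each hidden layer re-represents the output of the previous one, which is the sense in which the features are hierarchical. In an end-to-end setup, all weights would be fitted jointly from labelled data rather than drawn at random.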
Training Data: ChEMBL Dataset by EMBL-EBI
Model using Deep Learning (TensorFlow)
New Drug → Potential Target
DNA Analysis using Natural Language Processing Techniques
Word2Vec on DNA sequences
→ Represents every 𝑛-gram by a vector
e.g., ACG = [0.359, …, -0.129]
Vectors are calculated based on surrounding 𝑛-grams
e.g., ... TTA CGA ACG TGG CAT ...
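The idea of learning an 𝑛-gram's vector from its neighbours can be sketched as follows. The code splits a DNA string into 𝑛-grams and lists the (center, context) pairs that a skip-gram style Word2Vec model would be trained on. The function names, the window size of 2, and the stride of 1 are assumptions made for illustration. The poster does not state which Word2Vec variant or window was used.

```python
def kmers(sequence, k=3, stride=1):
    """Split a DNA string into n-grams of length k (overlapping when stride < k)."""
    return [sequence[i:i + k] for i in range(0, len(sequence) - k + 1, stride)]


def context_pairs(tokens, window=2):
    """List (center, context) pairs within `window` positions of each token."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


# The 3-grams from the example above; ACG is the center token.
tokens = "TTA CGA ACG TGG CAT".split()
pairs = context_pairs(tokens, window=2)
acg_contexts = [context for center, context in pairs if center == "ACG"]
print(acg_contexts)      # ['TTA', 'CGA', 'TGG', 'CAT']
print(kmers("ACCATA"))   # ['ACC', 'CCA', 'CAT', 'ATA']
```

A Word2Vec implementation would consume pairs like these and adjust the vectors so that 𝑛-grams appearing in similar contexts end up close together.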
Convolutional Neural Networks
ACG GCT CTA TAA AAG AGA GAC ACC CCT CTA …
… c_{i-4} c_{i-3} c_{i-2} c_{i-1} c_i c_{i+1} c_{i+2} …
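A convolutional layer over such a sequence slides a filter across a window of neighbouring positions and emits one activation per window. The sketch below shows this for a single filter in plain Python. The filter width, the ReLU activation, and the toy two-dimensional vectors are assumptions, not values from the poster.

```python
def conv1d(vectors, kernel, bias=0.0):
    """Valid 1D convolution (cross-correlation) with a single filter.

    vectors: list of T position vectors, each a list of D floats
    kernel:  list of W weight vectors, each a list of D floats
    returns: list of T - W + 1 ReLU activations, one per window
    """
    width = len(kernel)
    outputs = []
    for start in range(len(vectors) - width + 1):
        total = bias
        for offset in range(width):
            total += sum(w * x for w, x in zip(kernel[offset], vectors[start + offset]))
        outputs.append(max(0.0, total))  # ReLU
    return outputs


# Toy input: four positions with 2-dimensional vectors (hypothetical values).
positions = [[1, 0], [0, 1], [1, 1], [0, 0]]
kernel = [[1, 0], [0, 1]]  # filter spanning two neighbouring positions
print(conv1d(positions, kernel))  # [2.0, 1.0, 1.0]
```

Because the same filter weights are reused at every position, a pattern is detected regardless of where it occurs in the sequence.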
Long Short-Term Memory Networks
ACG GCT CTA TAA AAG AGA GAC ACC CCT CTA …
…
One-hot representation
    A  C  C  A  T  A  …
A   1  0  0  1  0  1
C   0  1  1  0  0  0
G   0  0  0  0  0  0
T   0  0  0  0  1  0
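The one-hot matrix for the sequence A C C A T A can be reproduced with a few lines of Python. The row order A, C, G, T is an assumption, chosen because it is consistent with the values shown. The function name `one_hot` is illustrative.

```python
ALPHABET = "ACGT"  # assumed row order: A, C, G, T


def one_hot(sequence, alphabet=ALPHABET):
    """Return one row per nucleotide and one column per sequence position."""
    return [[1 if base == symbol else 0 for base in sequence]
            for symbol in alphabet]


matrix = one_hot("ACCATA")
for symbol, row in zip(ALPHABET, matrix):
    print(symbol, *row)
# A 1 0 0 1 0 1
# C 0 1 1 0 0 0
# G 0 0 0 0 0 0
# T 0 0 0 0 1 0
```

Every column contains exactly one 1, so the encoding carries no notion of similarity between nucleotides. That is one motivation for the learned 𝑛-gram vectors.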
ACC     CCA     CAT     ATA
 0.241   0.124  -0.549   0.421
-0.853   0.513   0.185  -0.129
 …       …       …       …
-0.252  -0.884   0.112   0.466
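Replacing each 𝑛-gram by its vector amounts to a table lookup. The sketch below builds a vocabulary of all 64 possible 3-grams and an embedding table, then looks up the four 3-grams of ACCATA. The embedding values here are random placeholders, not the trained values shown on the poster. The dimension of 4 and the helper names are likewise assumptions.

```python
import itertools
import random


def build_vocabulary(k=3, alphabet="ACGT"):
    """Map every possible n-gram of length k to a row index."""
    return {"".join(p): i
            for i, p in enumerate(itertools.product(alphabet, repeat=k))}


def init_embeddings(vocab_size, dim, seed=0):
    """Random placeholder vectors; a trained Word2Vec model would supply real ones."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(vocab_size)]


def embed(tokens, vocabulary, embeddings):
    """Look up one embedding vector per n-gram."""
    return [embeddings[vocabulary[token]] for token in tokens]


vocabulary = build_vocabulary()
embeddings = init_embeddings(len(vocabulary), dim=4)
vectors = embed(["ACC", "CCA", "CAT", "ATA"], vocabulary, embeddings)
for token, vector in zip(["ACC", "CCA", "CAT", "ATA"], vectors):
    print(token, [round(v, 3) for v in vector])
```

The resulting list of dense vectors, one per 𝑛-gram, is the kind of input a convolutional or recurrent network would take in place of the sparse one-hot matrix.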
Diagram labels: Training, Prediction