SlideShare a Scribd company logo
Network embedding in biomedical data science
Chang Su, Jie Tong, Yongjun Zhu, Peng Cui & Fei Wang
Briefings in Bioinformatics
December 2018
https://doi.org/10.1093/bib/bby117
Journal Club 01-Feb-2019
“Recent advances in biomedical research as well as computer software and
hardware technologies have led to an inrush of a large number of relational data
interlinking drugs, genes, proteins, chemical compounds, diseases and medical
concepts extracted from clinical data.”
The availability of such
relational data has
greatly facilitated the
biomedical studies, such
as network biology,
network medicine,
pharmacogenomics,
disease diagnosis,
clinical phenotyping,
etc.
Numerous network-
based learning methods
have been developed to
explore reliable tools for
multiple applications.
Although existing
methods show capacity
of processing networks
and demonstrate great
promises, they usually
suffer from high
computational and space
cost, owning to high
dimensionality and
sparsity of the networks.
Efficient analysis of these networks heavily relies on the ways how networks are represented. Often, a
discrete adjacency matrix is used to represent a network, which only captures neighboring relationships
between vertices. Indeed, this simple representation cannot embody more complex, higher-order structure
relationships, such as paths, frequent substructure etc. As a result, such a traditional routine often makes
many network analytic tasks computationally expensive and intractable over largescale networks.
--Zhang, Daokun, et al. "Network representation learning: A survey." IEEE transactions on Big Data (2018).
What is it?
• Network embedding aims at converting the network into a low-dimensional space while structural
information of the network is preserved.
• In this way, nodes and/or edges of the network can be represented as compacted yet informative vectors
in the embedding space.
Advantages:
• Typical non-network-based machine learning methods such as linear regression, Support Vector Machine
(SVM) and decision forest, which have been demonstrated to be effective and efficient as the state-of-
the-art techniques, can be applied to such vectors.
Current status:
• Efforts of applying network embedding to improve biomedical data analysis are already planned or
underway.
Difficulties:
• The biomedical networks are sparse, noisy, incomplete, heterogeneous and usually consist of biomedical
text and other domain knowledge. It makes embedding tasks more complicated than other application
fields.
In the network embedding space, the relationships
among the nodes, which were originally represented
by edges or other high-order topological measures in
graphs, is captured by the distances between nodes
in the vector space, and the topological and
structural characteristics of a node are encoded into
its embedding vector.
An example is shown in Fig. After embedding the
karate club network into a twodimensional space,
the similar nodes marked by the same color are
close to each other in the embedding space,
demonstrating that the network structure can be
well modeled in the two-dimensional embedding
space.
Ref.: Cui, Peng, et al. "A survey on network embedding." IEEE Transactions on Knowledge and Data Engineering (2018).
Basic idea: Embed nodes so that distances in embedding space reflect node similarities in the original network.
A non-attributed network is also known
as homogeneous network, of which all
nodes and edges belong to a unique
type, respectively.
The attributed networks, also known as
heterogeneous networks, allow nodes
and/or edges to belong to multiple
types.
Table: Network embedding models
surveyed in this study.
Table: Open-source software packages of the network embedding techniques.
Applications in biomedical data science
Table: Biomedical applications of network embedding surveyed in this study.
The use of network embedding for biomedical data analysis is recent and not thoroughly explored.
Drug repositioning: Computational drug repositioning, also known as drug repurposing, is a promising and
efficient tool for exploring new usage for existing drugs to save drug development cost and increase
productivity. Drugs bind with target proteins and affect their downstream activity, consequently lead to
impact on human body to treat the disease. A drug repositioning tool usually aims at predicting unknown
drug–target or drug–disease interactions.
A network integration approach for drug-target interaction prediction and computational drug repositioning
from heterogeneous information.
-- Luo, Yunan, et al. Nature communications 8.1 (2017): 573.
• Proposed DTINet by extending diffusion component analysis (DCA) by separately performing
random walk with restart (RWR) on drug–drug, drug–disease, drug side effect and drug similarity
networks for drug embedding and on protein–protein, protein–disease and protein similarity
networks for target protein embedding. After that, DTINet projected drugs into the embedding
space of target proteins and made prediction based on geometric proximity.
• Developed a computational pipeline, called DTINet, to predict novel drug–target interactions from
a constructed heterogeneous network, which integrates diverse drug-related information.
• DTINet focuses on learning a low-dimensional vector representation of features, which accurately
explains the topological properties of individual nodes in the heterogeneous network, and then
makes prediction based on these representations via a vector space projection scheme.
• DTINet achieves substantial performance improvement over other state-of-the-art methods for
drug–target interaction prediction.
Large‐scale extraction of drug–disease pairs from the medical literature.
-- Wang, Pengwei, et al. Journal of the Association for Information Science and Technology 68.11 (2017): 2649-2661.
• Proposed to detect unknown drug–disease interactions from the medical literature by using
neuro-linguistic programming (NLP) and network embedding techniques.
• Using treatment and inducement drug–disease pairs extracted from 27 million PubMed articles,
they first constructed a heterogeneous network.
• They next expanded the network embedding method, LINE, by modifying the 1st-order proximity
to encode treatment and inducement relations as positive and negative effects to the objective
function, respectively.
• The result showed that the embeddings lead to significant improvement in predictions of both
types of drug–disease interactions.
Adverse drug reaction analysis: An adverse drug reaction (ADR) is defined as any undesirable effect from
the medical use of drugs beyond its anticipated therapeutic effects that occurs at a usual dosage. The study
of ADRs is the concern of drug development especially before a drug is launched on clinical application.
Detecting potential ADRs is always time consuming and expensive. To address this, computational methods
based on network embedding have been introduced to ADR analysis.
Large-scale structural and textual similarity-based mining of knowledge graph to predict drug–drug interactions.
-- Abdelaziz, Ibrahim, et al. Web Semantics: Science, Services and Agents on the World Wide Web 44 (2017): 104-117.
• Tiresias, a large-scale similarity-based framework that predicts DDIs through link prediction.
• Tiresias takes in various sources of drug-related data and knowledge as inputs, and provides DDI
predictions as outputs.
• The process starts with semantic integration of the input data that results in a knowledge graph
describing drug attributes and relationships with various related entities such as enzymes,
chemical structures, and pathways.
• The knowledge graph is then used to compute several similarity measures between all the drugs
in a scalable and distributed framework.
• The resulting similarity metrics are used to build features for a large-scale logistic regression
model to predict potential DDIs.
Exploiting ontology graph for predicting sparsely annotated gene function.
-- Wang, Sheng, et al. Bioinformatics 31.12 (2015): i357-i364.
• Proposed clusDCA to predict gene function, by applying diffusion component analysis (DCA) to
gene–gene interaction and GO to learn low-dimensional representations for genes and GO labels,
respectively.
• Based on such embedding results, they trained a projection model from gene space to GO space
such that genes geometrically closed to their known GO labels.
• It bridges latent gene features and GO labels and results in desirable prediction of sparsely
annotated gene functions.
Genomics Data Analysis
Geometric de-noising of protein-protein interaction networks.
-- Kuchaiev, Oleksii, et al. PLoS computational biology 5.8 (2009): e1000454.
• Proposed an embedding algorithm based on multidimensional scaling (MDS) for PPI network de-
noising.
• By using the embeddings of proteins, they predicted new PPIs and assessed the confidence of
existing PPIs.
Proteomics Data Analysis
miRNA-Disease Association Prediction with Collaborative Matrix Factorization.
-- Shen, Zhen, et al. Complexity 2017 (2017).
• Developed CMFMDA that introduced matrix factorization to bipartite miRNA-disease network for
embedding to predict new associations.
• In CMFMDA, miRNA functional similarity and disease semantic similarity were involved in
factorization in terms of regularizations to improve embedding.
• The evaluation was performed to discover esophageal neoplasms-related miRNAs that were
previously confirmed by miR2Disease and dbDEMC.
• The results showed that CMFMDA can outperform other computational methods.
Transcriptomics Data Analysis
Challenges
Data quality: Unlike other domain where the data are clean and well structured,
networks constructed from the biomedical data are usually noisy and incomplete.
Local and global:
• Performances of the network embedding model and its downstream tasks rely on the type
of structural property to preserve.
• Preserving local property will gather connected nodes in the embedding space, while
preserving global property will project topologically similar (even far separated) nodes
together.
• Designing embedding method by properly considering local and global structure properties
according to application scenarios is an important aspect that will require the development
of novel solutions.
Challenges
Network evolution:
• Networks are always not static, especially in the biomedical domain.
• Existing network embedding models mainly focused on the static networks, and the
settings of network evolution were overlooked.
• To learn embeddings for a dynamic network, existing methods should be trained
repeatedly for each timestamp, which is definitely time consuming and may not capture
the temporal properties.
• Therefore, most of the existing network embedding methods cannot be directly applied to
evolving biomedical networks.
Challenges
Domain complexity:
• Different from network embedding application in other domains, the issue on biomedicine
and health care is much more complicated.
• For example, in a biomedical network, each interaction between entities usually represents
a complex genetic, pathological or pharmacological event or process, and there is usually
no complete knowledge on how it progresses.
• When applying an embedding model, the biomedical domain knowledge is also needed to
better understand the network structure.
Opportunities
Local and global trade-off embedding
Dynamic embedding
Text-associated embedding
Domain-knowledge-associated embedding

More Related Content

What's hot

NRNB Annual Report 2016: Overall
NRNB Annual Report 2016: OverallNRNB Annual Report 2016: Overall
NRNB Annual Report 2016: Overall
Alexander Pico
 
NRNB Annual Report 2018
NRNB Annual Report 2018NRNB Annual Report 2018
NRNB Annual Report 2018
Alexander Pico
 
NRNB Annual Report 2017
NRNB Annual Report 2017NRNB Annual Report 2017
NRNB Annual Report 2017
Alexander Pico
 
Computational Biology Methods for Drug Discovery_Phase 1-5_November 2015
Computational Biology Methods for Drug Discovery_Phase 1-5_November 2015Computational Biology Methods for Drug Discovery_Phase 1-5_November 2015
Computational Biology Methods for Drug Discovery_Phase 1-5_November 2015Mathew Varghese
 
An optimal unsupervised text data segmentation 3
An optimal unsupervised text data segmentation 3An optimal unsupervised text data segmentation 3
An optimal unsupervised text data segmentation 3
prj_publication
 
A survey of heterogeneous information network analysis
A survey of heterogeneous information network analysisA survey of heterogeneous information network analysis
A survey of heterogeneous information network analysis
SOYEON KIM
 
Application of Hybrid Genetic Algorithm Using Artificial Neural Network in Da...
Application of Hybrid Genetic Algorithm Using Artificial Neural Network in Da...Application of Hybrid Genetic Algorithm Using Artificial Neural Network in Da...
Application of Hybrid Genetic Algorithm Using Artificial Neural Network in Da...
IOSRjournaljce
 
Clustering Approaches for Evaluation and Analysis on Formal Gene Expression C...
Clustering Approaches for Evaluation and Analysis on Formal Gene Expression C...Clustering Approaches for Evaluation and Analysis on Formal Gene Expression C...
Clustering Approaches for Evaluation and Analysis on Formal Gene Expression C...
rahulmonikasharma
 
GRAPHICAL MODEL AND CLUSTERINGREGRESSION BASED METHODS FOR CAUSAL INTERACTION...
GRAPHICAL MODEL AND CLUSTERINGREGRESSION BASED METHODS FOR CAUSAL INTERACTION...GRAPHICAL MODEL AND CLUSTERINGREGRESSION BASED METHODS FOR CAUSAL INTERACTION...
GRAPHICAL MODEL AND CLUSTERINGREGRESSION BASED METHODS FOR CAUSAL INTERACTION...
ijaia
 
Review on Computational Bioinformatics and Molecular Modelling Novel Tool for...
Review on Computational Bioinformatics and Molecular Modelling Novel Tool for...Review on Computational Bioinformatics and Molecular Modelling Novel Tool for...
Review on Computational Bioinformatics and Molecular Modelling Novel Tool for...
ijtsrd
 
Efficiency of LSB steganography on medical information
Efficiency of LSB steganography on medical information Efficiency of LSB steganography on medical information
Efficiency of LSB steganography on medical information
IJECEIAES
 
F363941
F363941F363941
F363941
IJERA Editor
 
A Comparative Analysis of Data Privacy and Utility Parameter Adjustment, Usin...
A Comparative Analysis of Data Privacy and Utility Parameter Adjustment, Usin...A Comparative Analysis of Data Privacy and Utility Parameter Adjustment, Usin...
A Comparative Analysis of Data Privacy and Utility Parameter Adjustment, Usin...
Kato Mivule
 
NetBioSIG2013-Talk David Amar
NetBioSIG2013-Talk David AmarNetBioSIG2013-Talk David Amar
NetBioSIG2013-Talk David Amar
Alexander Pico
 
ONTOLOGY-DRIVEN INFORMATION RETRIEVAL FOR HEALTHCARE INFORMATION SYSTEM : ...
ONTOLOGY-DRIVEN INFORMATION RETRIEVAL  FOR HEALTHCARE INFORMATION SYSTEM :   ...ONTOLOGY-DRIVEN INFORMATION RETRIEVAL  FOR HEALTHCARE INFORMATION SYSTEM :   ...
ONTOLOGY-DRIVEN INFORMATION RETRIEVAL FOR HEALTHCARE INFORMATION SYSTEM : ...
IJNSA Journal
 
Network-based machine learning approach for aggregating multi-modal data
Network-based machine learning approach for aggregating multi-modal dataNetwork-based machine learning approach for aggregating multi-modal data
Network-based machine learning approach for aggregating multi-modal data
SOYEON KIM
 
A Codon Frequency Obfuscation Heuristic for Raw Genomic Data Privacy
A Codon Frequency Obfuscation Heuristic for Raw Genomic Data PrivacyA Codon Frequency Obfuscation Heuristic for Raw Genomic Data Privacy
A Codon Frequency Obfuscation Heuristic for Raw Genomic Data Privacy
Kato Mivule
 

What's hot (19)

NRNB Annual Report 2016: Overall
NRNB Annual Report 2016: OverallNRNB Annual Report 2016: Overall
NRNB Annual Report 2016: Overall
 
NRNB Annual Report 2018
NRNB Annual Report 2018NRNB Annual Report 2018
NRNB Annual Report 2018
 
NRNB Annual Report 2017
NRNB Annual Report 2017NRNB Annual Report 2017
NRNB Annual Report 2017
 
dream
dreamdream
dream
 
Computational Biology Methods for Drug Discovery_Phase 1-5_November 2015
Computational Biology Methods for Drug Discovery_Phase 1-5_November 2015Computational Biology Methods for Drug Discovery_Phase 1-5_November 2015
Computational Biology Methods for Drug Discovery_Phase 1-5_November 2015
 
An optimal unsupervised text data segmentation 3
An optimal unsupervised text data segmentation 3An optimal unsupervised text data segmentation 3
An optimal unsupervised text data segmentation 3
 
50120140506002
5012014050600250120140506002
50120140506002
 
A survey of heterogeneous information network analysis
A survey of heterogeneous information network analysisA survey of heterogeneous information network analysis
A survey of heterogeneous information network analysis
 
Application of Hybrid Genetic Algorithm Using Artificial Neural Network in Da...
Application of Hybrid Genetic Algorithm Using Artificial Neural Network in Da...Application of Hybrid Genetic Algorithm Using Artificial Neural Network in Da...
Application of Hybrid Genetic Algorithm Using Artificial Neural Network in Da...
 
Clustering Approaches for Evaluation and Analysis on Formal Gene Expression C...
Clustering Approaches for Evaluation and Analysis on Formal Gene Expression C...Clustering Approaches for Evaluation and Analysis on Formal Gene Expression C...
Clustering Approaches for Evaluation and Analysis on Formal Gene Expression C...
 
GRAPHICAL MODEL AND CLUSTERINGREGRESSION BASED METHODS FOR CAUSAL INTERACTION...
GRAPHICAL MODEL AND CLUSTERINGREGRESSION BASED METHODS FOR CAUSAL INTERACTION...GRAPHICAL MODEL AND CLUSTERINGREGRESSION BASED METHODS FOR CAUSAL INTERACTION...
GRAPHICAL MODEL AND CLUSTERINGREGRESSION BASED METHODS FOR CAUSAL INTERACTION...
 
Review on Computational Bioinformatics and Molecular Modelling Novel Tool for...
Review on Computational Bioinformatics and Molecular Modelling Novel Tool for...Review on Computational Bioinformatics and Molecular Modelling Novel Tool for...
Review on Computational Bioinformatics and Molecular Modelling Novel Tool for...
 
Efficiency of LSB steganography on medical information
Efficiency of LSB steganography on medical information Efficiency of LSB steganography on medical information
Efficiency of LSB steganography on medical information
 
F363941
F363941F363941
F363941
 
A Comparative Analysis of Data Privacy and Utility Parameter Adjustment, Usin...
A Comparative Analysis of Data Privacy and Utility Parameter Adjustment, Usin...A Comparative Analysis of Data Privacy and Utility Parameter Adjustment, Usin...
A Comparative Analysis of Data Privacy and Utility Parameter Adjustment, Usin...
 
NetBioSIG2013-Talk David Amar
NetBioSIG2013-Talk David AmarNetBioSIG2013-Talk David Amar
NetBioSIG2013-Talk David Amar
 
ONTOLOGY-DRIVEN INFORMATION RETRIEVAL FOR HEALTHCARE INFORMATION SYSTEM : ...
ONTOLOGY-DRIVEN INFORMATION RETRIEVAL  FOR HEALTHCARE INFORMATION SYSTEM :   ...ONTOLOGY-DRIVEN INFORMATION RETRIEVAL  FOR HEALTHCARE INFORMATION SYSTEM :   ...
ONTOLOGY-DRIVEN INFORMATION RETRIEVAL FOR HEALTHCARE INFORMATION SYSTEM : ...
 
Network-based machine learning approach for aggregating multi-modal data
Network-based machine learning approach for aggregating multi-modal dataNetwork-based machine learning approach for aggregating multi-modal data
Network-based machine learning approach for aggregating multi-modal data
 
A Codon Frequency Obfuscation Heuristic for Raw Genomic Data Privacy
A Codon Frequency Obfuscation Heuristic for Raw Genomic Data PrivacyA Codon Frequency Obfuscation Heuristic for Raw Genomic Data Privacy
A Codon Frequency Obfuscation Heuristic for Raw Genomic Data Privacy
 

Similar to Network embedding in biomedical data science

GENE-GENE INTERACTION ANALYSIS IN ALZHEIMER
GENE-GENE INTERACTION ANALYSIS IN ALZHEIMERGENE-GENE INTERACTION ANALYSIS IN ALZHEIMER
GENE-GENE INTERACTION ANALYSIS IN ALZHEIMER
ijcsit
 
GENE-GENE INTERACTION ANALYSIS IN ALZHEIMER
GENE-GENE INTERACTION ANALYSIS IN ALZHEIMERGENE-GENE INTERACTION ANALYSIS IN ALZHEIMER
GENE-GENE INTERACTION ANALYSIS IN ALZHEIMER
AIRCC Publishing Corporation
 
Overall Vision for NRNB: 2015-2020
Overall Vision for NRNB: 2015-2020Overall Vision for NRNB: 2015-2020
Overall Vision for NRNB: 2015-2020
Alexander Pico
 
DSAGLSTM-DTA: Prediction of Drug-Target Affinity using Dual Self-Attention an...
DSAGLSTM-DTA: Prediction of Drug-Target Affinity using Dual Self-Attention an...DSAGLSTM-DTA: Prediction of Drug-Target Affinity using Dual Self-Attention an...
DSAGLSTM-DTA: Prediction of Drug-Target Affinity using Dual Self-Attention an...
mlaij
 
DSAGLSTM-DTA: PREDICTION OF DRUG-TARGET AFFINITY USING DUAL SELF-ATTENTION AN...
DSAGLSTM-DTA: PREDICTION OF DRUG-TARGET AFFINITY USING DUAL SELF-ATTENTION AN...DSAGLSTM-DTA: PREDICTION OF DRUG-TARGET AFFINITY USING DUAL SELF-ATTENTION AN...
DSAGLSTM-DTA: PREDICTION OF DRUG-TARGET AFFINITY USING DUAL SELF-ATTENTION AN...
mlaij
 
NRNB EAC Report 2011
NRNB EAC Report 2011NRNB EAC Report 2011
NRNB EAC Report 2011
Alexander Pico
 
Outcome-guided mutual information networks for investigating gene-gene intera...
Outcome-guided mutual information networks for investigating gene-gene intera...Outcome-guided mutual information networks for investigating gene-gene intera...
Outcome-guided mutual information networks for investigating gene-gene intera...
SOYEON KIM
 
A novel optimized deep learning method for protein-protein prediction in bioi...
A novel optimized deep learning method for protein-protein prediction in bioi...A novel optimized deep learning method for protein-protein prediction in bioi...
A novel optimized deep learning method for protein-protein prediction in bioi...
IJECEIAES
 
Evaluation of Logistic Regression and Neural Network Model With Sensitivity A...
Evaluation of Logistic Regression and Neural Network Model With Sensitivity A...Evaluation of Logistic Regression and Neural Network Model With Sensitivity A...
Evaluation of Logistic Regression and Neural Network Model With Sensitivity A...
CSCJournals
 
OPTIMIZED PREDICTION IN MEDICAL DIAGNOSIS USING DNA SEQUENCES AND STRUCTURE I...
OPTIMIZED PREDICTION IN MEDICAL DIAGNOSIS USING DNA SEQUENCES AND STRUCTURE I...OPTIMIZED PREDICTION IN MEDICAL DIAGNOSIS USING DNA SEQUENCES AND STRUCTURE I...
OPTIMIZED PREDICTION IN MEDICAL DIAGNOSIS USING DNA SEQUENCES AND STRUCTURE I...
IAEME Publication
 
Statistical Analysis based Hypothesis Testing Method in Biological Knowledge ...
Statistical Analysis based Hypothesis Testing Method in Biological Knowledge ...Statistical Analysis based Hypothesis Testing Method in Biological Knowledge ...
Statistical Analysis based Hypothesis Testing Method in Biological Knowledge ...
ijcsa
 
Technology R&D Theme 1: Differential Networks
Technology R&D Theme 1: Differential NetworksTechnology R&D Theme 1: Differential Networks
Technology R&D Theme 1: Differential Networks
Alexander Pico
 
Bayesian statistics
Bayesian statisticsBayesian statistics
Bayesian statistics
Sagar Kamble
 
AI for drug discovery
AI for drug discoveryAI for drug discovery
AI for drug discovery
Deakin University
 
Enhancing Genomic Insights: 40 Pivotal Use Cases of Data Science and Machine ...
Enhancing Genomic Insights: 40 Pivotal Use Cases of Data Science and Machine ...Enhancing Genomic Insights: 40 Pivotal Use Cases of Data Science and Machine ...
Enhancing Genomic Insights: 40 Pivotal Use Cases of Data Science and Machine ...
Harri Sonailent
 
Statistical SignificancePieceFinal
Statistical SignificancePieceFinalStatistical SignificancePieceFinal
Statistical SignificancePieceFinalJami Jackson
 
PSN for Precision Medicine
PSN for Precision MedicinePSN for Precision Medicine
PSN for Precision Medicine
Thi K. Tran-Nguyen, PhD
 
FAIR & AI Ready KGs for Explainable Predictions
FAIR & AI Ready KGs for Explainable PredictionsFAIR & AI Ready KGs for Explainable Predictions
FAIR & AI Ready KGs for Explainable Predictions
Michel Dumontier
 
IRJET- Gene Mutation Data using Multiplicative Adaptive Algorithm and Gene On...
IRJET- Gene Mutation Data using Multiplicative Adaptive Algorithm and Gene On...IRJET- Gene Mutation Data using Multiplicative Adaptive Algorithm and Gene On...
IRJET- Gene Mutation Data using Multiplicative Adaptive Algorithm and Gene On...
IRJET Journal
 
New prediction method for data spreading in social networks based on machine ...
New prediction method for data spreading in social networks based on machine ...New prediction method for data spreading in social networks based on machine ...
New prediction method for data spreading in social networks based on machine ...
TELKOMNIKA JOURNAL
 

Similar to Network embedding in biomedical data science (20)

GENE-GENE INTERACTION ANALYSIS IN ALZHEIMER
GENE-GENE INTERACTION ANALYSIS IN ALZHEIMERGENE-GENE INTERACTION ANALYSIS IN ALZHEIMER
GENE-GENE INTERACTION ANALYSIS IN ALZHEIMER
 
GENE-GENE INTERACTION ANALYSIS IN ALZHEIMER
GENE-GENE INTERACTION ANALYSIS IN ALZHEIMERGENE-GENE INTERACTION ANALYSIS IN ALZHEIMER
GENE-GENE INTERACTION ANALYSIS IN ALZHEIMER
 
Overall Vision for NRNB: 2015-2020
Overall Vision for NRNB: 2015-2020Overall Vision for NRNB: 2015-2020
Overall Vision for NRNB: 2015-2020
 
DSAGLSTM-DTA: Prediction of Drug-Target Affinity using Dual Self-Attention an...
DSAGLSTM-DTA: Prediction of Drug-Target Affinity using Dual Self-Attention an...DSAGLSTM-DTA: Prediction of Drug-Target Affinity using Dual Self-Attention an...
DSAGLSTM-DTA: Prediction of Drug-Target Affinity using Dual Self-Attention an...
 
DSAGLSTM-DTA: PREDICTION OF DRUG-TARGET AFFINITY USING DUAL SELF-ATTENTION AN...
DSAGLSTM-DTA: PREDICTION OF DRUG-TARGET AFFINITY USING DUAL SELF-ATTENTION AN...DSAGLSTM-DTA: PREDICTION OF DRUG-TARGET AFFINITY USING DUAL SELF-ATTENTION AN...
DSAGLSTM-DTA: PREDICTION OF DRUG-TARGET AFFINITY USING DUAL SELF-ATTENTION AN...
 
NRNB EAC Report 2011
NRNB EAC Report 2011NRNB EAC Report 2011
NRNB EAC Report 2011
 
Outcome-guided mutual information networks for investigating gene-gene intera...
Outcome-guided mutual information networks for investigating gene-gene intera...Outcome-guided mutual information networks for investigating gene-gene intera...
Outcome-guided mutual information networks for investigating gene-gene intera...
 
A novel optimized deep learning method for protein-protein prediction in bioi...
A novel optimized deep learning method for protein-protein prediction in bioi...A novel optimized deep learning method for protein-protein prediction in bioi...
A novel optimized deep learning method for protein-protein prediction in bioi...
 
Evaluation of Logistic Regression and Neural Network Model With Sensitivity A...
Evaluation of Logistic Regression and Neural Network Model With Sensitivity A...Evaluation of Logistic Regression and Neural Network Model With Sensitivity A...
Evaluation of Logistic Regression and Neural Network Model With Sensitivity A...
 
OPTIMIZED PREDICTION IN MEDICAL DIAGNOSIS USING DNA SEQUENCES AND STRUCTURE I...
OPTIMIZED PREDICTION IN MEDICAL DIAGNOSIS USING DNA SEQUENCES AND STRUCTURE I...OPTIMIZED PREDICTION IN MEDICAL DIAGNOSIS USING DNA SEQUENCES AND STRUCTURE I...
OPTIMIZED PREDICTION IN MEDICAL DIAGNOSIS USING DNA SEQUENCES AND STRUCTURE I...
 
Statistical Analysis based Hypothesis Testing Method in Biological Knowledge ...
Statistical Analysis based Hypothesis Testing Method in Biological Knowledge ...Statistical Analysis based Hypothesis Testing Method in Biological Knowledge ...
Statistical Analysis based Hypothesis Testing Method in Biological Knowledge ...
 
Technology R&D Theme 1: Differential Networks
Technology R&D Theme 1: Differential NetworksTechnology R&D Theme 1: Differential Networks
Technology R&D Theme 1: Differential Networks
 
Bayesian statistics
Bayesian statisticsBayesian statistics
Bayesian statistics
 
AI for drug discovery
AI for drug discoveryAI for drug discovery
AI for drug discovery
 
Enhancing Genomic Insights: 40 Pivotal Use Cases of Data Science and Machine ...
Enhancing Genomic Insights: 40 Pivotal Use Cases of Data Science and Machine ...Enhancing Genomic Insights: 40 Pivotal Use Cases of Data Science and Machine ...
Enhancing Genomic Insights: 40 Pivotal Use Cases of Data Science and Machine ...
 
Statistical SignificancePieceFinal
Statistical SignificancePieceFinalStatistical SignificancePieceFinal
Statistical SignificancePieceFinal
 
PSN for Precision Medicine
PSN for Precision MedicinePSN for Precision Medicine
PSN for Precision Medicine
 
FAIR & AI Ready KGs for Explainable Predictions
FAIR & AI Ready KGs for Explainable PredictionsFAIR & AI Ready KGs for Explainable Predictions
FAIR & AI Ready KGs for Explainable Predictions
 
IRJET- Gene Mutation Data using Multiplicative Adaptive Algorithm and Gene On...
IRJET- Gene Mutation Data using Multiplicative Adaptive Algorithm and Gene On...IRJET- Gene Mutation Data using Multiplicative Adaptive Algorithm and Gene On...
IRJET- Gene Mutation Data using Multiplicative Adaptive Algorithm and Gene On...
 
New prediction method for data spreading in social networks based on machine ...
New prediction method for data spreading in social networks based on machine ...New prediction method for data spreading in social networks based on machine ...
New prediction method for data spreading in social networks based on machine ...
 

More from Arindam Ghosh

Next Generation Sequencing
Next Generation SequencingNext Generation Sequencing
Next Generation Sequencing
Arindam Ghosh
 
Sequence alignment
Sequence alignmentSequence alignment
Sequence alignment
Arindam Ghosh
 
Pharmacogenomics & its ethical issues
Pharmacogenomics & its ethical  issuesPharmacogenomics & its ethical  issues
Pharmacogenomics & its ethical issues
Arindam Ghosh
 
Limb development in vertebrates
Limb development in vertebratesLimb development in vertebrates
Limb development in vertebrates
Arindam Ghosh
 
Canning fish
Canning fishCanning fish
Canning fish
Arindam Ghosh
 
Polymerase Chain Reaction (PCR)
Polymerase Chain Reaction (PCR)Polymerase Chain Reaction (PCR)
Polymerase Chain Reaction (PCR)
Arindam Ghosh
 
Carbon Nanotubes
Carbon NanotubesCarbon Nanotubes
Carbon Nanotubes
Arindam Ghosh
 
Monte Carlo Simulations & Membrane Simulation and Dynamics
Monte Carlo Simulations & Membrane Simulation and DynamicsMonte Carlo Simulations & Membrane Simulation and Dynamics
Monte Carlo Simulations & Membrane Simulation and Dynamics
Arindam Ghosh
 
Java - Interfaces & Packages
Java - Interfaces & PackagesJava - Interfaces & Packages
Java - Interfaces & Packages
Arindam Ghosh
 
Freshers day anchoring script
Freshers day anchoring scriptFreshers day anchoring script
Freshers day anchoring script
Arindam Ghosh
 
Ab Initio Protein Structure Prediction
Ab Initio Protein Structure PredictionAb Initio Protein Structure Prediction
Ab Initio Protein Structure Prediction
Arindam Ghosh
 
Artificial Vectors
Artificial VectorsArtificial Vectors
Artificial Vectors
Arindam Ghosh
 
Pseudo code
Pseudo codePseudo code
Pseudo code
Arindam Ghosh
 
Hamiltonian path
Hamiltonian pathHamiltonian path
Hamiltonian path
Arindam Ghosh
 
Cedrus of Himachal Pradesh
Cedrus of Himachal PradeshCedrus of Himachal Pradesh
Cedrus of Himachal Pradesh
Arindam Ghosh
 
MySQL and bioinformatics
MySQL and bioinformatics MySQL and bioinformatics
MySQL and bioinformatics
Arindam Ghosh
 
Protein sorting in mitochondria
Protein sorting in mitochondriaProtein sorting in mitochondria
Protein sorting in mitochondria
Arindam Ghosh
 
Survey of softwares for phylogenetic analysis
Survey of softwares for phylogenetic analysisSurvey of softwares for phylogenetic analysis
Survey of softwares for phylogenetic analysis
Arindam Ghosh
 
Publicly available tools and open resources in Bioinformatics
Publicly available  tools and open resources in BioinformaticsPublicly available  tools and open resources in Bioinformatics
Publicly available tools and open resources in Bioinformatics
Arindam Ghosh
 

More from Arindam Ghosh (19)

Next Generation Sequencing
Next Generation SequencingNext Generation Sequencing
Next Generation Sequencing
 
Sequence alignment
Sequence alignmentSequence alignment
Sequence alignment
 
Pharmacogenomics & its ethical issues
Pharmacogenomics & its ethical  issuesPharmacogenomics & its ethical  issues
Pharmacogenomics & its ethical issues
 
Limb development in vertebrates
Limb development in vertebratesLimb development in vertebrates
Limb development in vertebrates
 
Canning fish
Canning fishCanning fish
Canning fish
 
Polymerase Chain Reaction (PCR)
Polymerase Chain Reaction (PCR)Polymerase Chain Reaction (PCR)
Polymerase Chain Reaction (PCR)
 
Carbon Nanotubes
Carbon NanotubesCarbon Nanotubes
Carbon Nanotubes
 
Monte Carlo Simulations & Membrane Simulation and Dynamics
Monte Carlo Simulations & Membrane Simulation and DynamicsMonte Carlo Simulations & Membrane Simulation and Dynamics
Monte Carlo Simulations & Membrane Simulation and Dynamics
 
Java - Interfaces & Packages
Java - Interfaces & PackagesJava - Interfaces & Packages
Java - Interfaces & Packages
 
Freshers day anchoring script
Freshers day anchoring scriptFreshers day anchoring script
Freshers day anchoring script
 
Ab Initio Protein Structure Prediction
Ab Initio Protein Structure PredictionAb Initio Protein Structure Prediction
Ab Initio Protein Structure Prediction
 
Artificial Vectors
Artificial VectorsArtificial Vectors
Artificial Vectors
 
Pseudo code
Pseudo codePseudo code
Pseudo code
 
Hamiltonian path
Hamiltonian pathHamiltonian path
Hamiltonian path
 
Cedrus of Himachal Pradesh
Cedrus of Himachal PradeshCedrus of Himachal Pradesh
Cedrus of Himachal Pradesh
 
MySQL and bioinformatics
MySQL and bioinformatics MySQL and bioinformatics
MySQL and bioinformatics
 
Protein sorting in mitochondria
Protein sorting in mitochondriaProtein sorting in mitochondria
Protein sorting in mitochondria
 
Survey of softwares for phylogenetic analysis
Survey of softwares for phylogenetic analysisSurvey of softwares for phylogenetic analysis
Survey of softwares for phylogenetic analysis
 
Publicly available tools and open resources in Bioinformatics
Publicly available  tools and open resources in BioinformaticsPublicly available  tools and open resources in Bioinformatics
Publicly available tools and open resources in Bioinformatics
 

Recently uploaded

Structures and textures of metamorphic rocks
Structures and textures of metamorphic rocksStructures and textures of metamorphic rocks
Structures and textures of metamorphic rocks
kumarmathi863
 
EY - Supply Chain Services 2018_template.pptx
EY - Supply Chain Services 2018_template.pptxEY - Supply Chain Services 2018_template.pptx
EY - Supply Chain Services 2018_template.pptx
AlguinaldoKong
 
Body fluids_tonicity_dehydration_hypovolemia_hypervolemia.pptx
Body fluids_tonicity_dehydration_hypovolemia_hypervolemia.pptxBody fluids_tonicity_dehydration_hypovolemia_hypervolemia.pptx
Body fluids_tonicity_dehydration_hypovolemia_hypervolemia.pptx
muralinath2
 
Mammalian Pineal Body Structure and Also Functions
Mammalian Pineal Body Structure and Also FunctionsMammalian Pineal Body Structure and Also Functions
Mammalian Pineal Body Structure and Also Functions
YOGESH DOGRA
 
Leaf Initiation, Growth and Differentiation.pdf
Leaf Initiation, Growth and Differentiation.pdfLeaf Initiation, Growth and Differentiation.pdf
Leaf Initiation, Growth and Differentiation.pdf
RenuJangid3
 
Citrus Greening Disease and its Management
Citrus Greening Disease and its ManagementCitrus Greening Disease and its Management
Citrus Greening Disease and its Management
subedisuryaofficial
 
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
Ana Luísa Pinho
 
Hemoglobin metabolism_pathophysiology.pptx
Hemoglobin metabolism_pathophysiology.pptxHemoglobin metabolism_pathophysiology.pptx
Hemoglobin metabolism_pathophysiology.pptx
muralinath2
 
Structural Classification Of Protein (SCOP)
Structural Classification Of Protein  (SCOP)Structural Classification Of Protein  (SCOP)
Structural Classification Of Protein (SCOP)
aishnasrivastava
 
Circulatory system_ Laplace law. Ohms law.reynaults law,baro-chemo-receptors-...
Circulatory system_ Laplace law. Ohms law.reynaults law,baro-chemo-receptors-...Circulatory system_ Laplace law. Ohms law.reynaults law,baro-chemo-receptors-...
Circulatory system_ Laplace law. Ohms law.reynaults law,baro-chemo-receptors-...
muralinath2
 
Unveiling the Energy Potential of Marshmallow Deposits.pdf
Unveiling the Energy Potential of Marshmallow Deposits.pdfUnveiling the Energy Potential of Marshmallow Deposits.pdf
Unveiling the Energy Potential of Marshmallow Deposits.pdf
Erdal Coalmaker
 
GBSN - Biochemistry (Unit 5) Chemistry of Lipids
GBSN - Biochemistry (Unit 5) Chemistry of LipidsGBSN - Biochemistry (Unit 5) Chemistry of Lipids
GBSN - Biochemistry (Unit 5) Chemistry of Lipids
Areesha Ahmad
 
Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...
Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...
Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...
Sérgio Sacani
 
GBSN- Microbiology (Lab 3) Gram Staining
GBSN- Microbiology (Lab 3) Gram StainingGBSN- Microbiology (Lab 3) Gram Staining
GBSN- Microbiology (Lab 3) Gram Staining
Areesha Ahmad
 
extra-chromosomal-inheritance[1].pptx.pdfpdf
extra-chromosomal-inheritance[1].pptx.pdfpdfextra-chromosomal-inheritance[1].pptx.pdfpdf
extra-chromosomal-inheritance[1].pptx.pdfpdf
DiyaBiswas10
 
Cancer cell metabolism: special Reference to Lactate Pathway
Cancer cell metabolism: special Reference to Lactate PathwayCancer cell metabolism: special Reference to Lactate Pathway
Cancer cell metabolism: special Reference to Lactate Pathway
AADYARAJPANDEY1
 
platelets- lifespan -Clot retraction-disorders.pptx
platelets- lifespan -Clot retraction-disorders.pptxplatelets- lifespan -Clot retraction-disorders.pptx
platelets- lifespan -Clot retraction-disorders.pptx
muralinath2
 
Hemostasis_importance& clinical significance.pptx
Hemostasis_importance& clinical significance.pptxHemostasis_importance& clinical significance.pptx
Hemostasis_importance& clinical significance.pptx
muralinath2
 
4. An Overview of Sugarcane White Leaf Disease in Vietnam.pdf
4. An Overview of Sugarcane White Leaf Disease in Vietnam.pdf4. An Overview of Sugarcane White Leaf Disease in Vietnam.pdf
4. An Overview of Sugarcane White Leaf Disease in Vietnam.pdf
ssuserbfdca9
 
platelets_clotting_biogenesis.clot retractionpptx
platelets_clotting_biogenesis.clot retractionpptxplatelets_clotting_biogenesis.clot retractionpptx
platelets_clotting_biogenesis.clot retractionpptx
muralinath2
 

Recently uploaded (20)

Structures and textures of metamorphic rocks
Structures and textures of metamorphic rocksStructures and textures of metamorphic rocks
Structures and textures of metamorphic rocks
 
EY - Supply Chain Services 2018_template.pptx
EY - Supply Chain Services 2018_template.pptxEY - Supply Chain Services 2018_template.pptx
EY - Supply Chain Services 2018_template.pptx
 
Body fluids_tonicity_dehydration_hypovolemia_hypervolemia.pptx
Body fluids_tonicity_dehydration_hypovolemia_hypervolemia.pptxBody fluids_tonicity_dehydration_hypovolemia_hypervolemia.pptx
Body fluids_tonicity_dehydration_hypovolemia_hypervolemia.pptx
 
Mammalian Pineal Body Structure and Also Functions
Mammalian Pineal Body Structure and Also FunctionsMammalian Pineal Body Structure and Also Functions
Mammalian Pineal Body Structure and Also Functions
 
Leaf Initiation, Growth and Differentiation.pdf
Leaf Initiation, Growth and Differentiation.pdfLeaf Initiation, Growth and Differentiation.pdf
Leaf Initiation, Growth and Differentiation.pdf
 
Citrus Greening Disease and its Management
Citrus Greening Disease and its ManagementCitrus Greening Disease and its Management
Citrus Greening Disease and its Management
 
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
 
Hemoglobin metabolism_pathophysiology.pptx
Hemoglobin metabolism_pathophysiology.pptxHemoglobin metabolism_pathophysiology.pptx
Hemoglobin metabolism_pathophysiology.pptx
 
Structural Classification Of Protein (SCOP)
Structural Classification Of Protein  (SCOP)Structural Classification Of Protein  (SCOP)
Structural Classification Of Protein (SCOP)
 
Circulatory system_ Laplace law. Ohms law.reynaults law,baro-chemo-receptors-...
Circulatory system_ Laplace law. Ohms law.reynaults law,baro-chemo-receptors-...Circulatory system_ Laplace law. Ohms law.reynaults law,baro-chemo-receptors-...
Circulatory system_ Laplace law. Ohms law.reynaults law,baro-chemo-receptors-...
 
Unveiling the Energy Potential of Marshmallow Deposits.pdf
Unveiling the Energy Potential of Marshmallow Deposits.pdfUnveiling the Energy Potential of Marshmallow Deposits.pdf
Unveiling the Energy Potential of Marshmallow Deposits.pdf
 
GBSN - Biochemistry (Unit 5) Chemistry of Lipids
GBSN - Biochemistry (Unit 5) Chemistry of LipidsGBSN - Biochemistry (Unit 5) Chemistry of Lipids
GBSN - Biochemistry (Unit 5) Chemistry of Lipids
 
Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...
Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...
Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...
 
GBSN- Microbiology (Lab 3) Gram Staining
GBSN- Microbiology (Lab 3) Gram StainingGBSN- Microbiology (Lab 3) Gram Staining
GBSN- Microbiology (Lab 3) Gram Staining
 
extra-chromosomal-inheritance[1].pptx.pdfpdf
extra-chromosomal-inheritance[1].pptx.pdfpdfextra-chromosomal-inheritance[1].pptx.pdfpdf
extra-chromosomal-inheritance[1].pptx.pdfpdf
 
Cancer cell metabolism: special Reference to Lactate Pathway
Cancer cell metabolism: special Reference to Lactate PathwayCancer cell metabolism: special Reference to Lactate Pathway
Cancer cell metabolism: special Reference to Lactate Pathway
 
platelets- lifespan -Clot retraction-disorders.pptx
platelets- lifespan -Clot retraction-disorders.pptxplatelets- lifespan -Clot retraction-disorders.pptx
platelets- lifespan -Clot retraction-disorders.pptx
 
Hemostasis_importance& clinical significance.pptx
Hemostasis_importance& clinical significance.pptxHemostasis_importance& clinical significance.pptx
Hemostasis_importance& clinical significance.pptx
 
4. An Overview of Sugarcane White Leaf Disease in Vietnam.pdf
4. An Overview of Sugarcane White Leaf Disease in Vietnam.pdf4. An Overview of Sugarcane White Leaf Disease in Vietnam.pdf
4. An Overview of Sugarcane White Leaf Disease in Vietnam.pdf
 
platelets_clotting_biogenesis.clot retractionpptx
platelets_clotting_biogenesis.clot retractionpptxplatelets_clotting_biogenesis.clot retractionpptx
platelets_clotting_biogenesis.clot retractionpptx
 

Network embedding in biomedical data science

  • 1. Network embedding in biomedical data science Chang Su, Jie Tong, Yongjun Zhu, Peng Cui & Fei Wang Briefings in Bioinformatics December 2018 https://doi.org/10.1093/bib/bby117 Journal Club 01-Feb-2019
  • 2. “Recent advances in biomedical research as well as computer software and hardware technologies have led to an inrush of a large number of relational data interlinking drugs, genes, proteins, chemical compounds, diseases and medical concepts extracted from clinical data.” The availability of such relational data has greatly facilitated the biomedical studies, such as network biology, network medicine, pharmacogenomics, disease diagnosis, clinical phenotyping, etc. Numerous network- based learning methods have been developed to explore reliable tools for multiple applications. Although existing methods show capacity of processing networks and demonstrate great promises, they usually suffer from high computational and space cost, owning to high dimensionality and sparsity of the networks.
  • 3. Efficient analysis of these networks heavily relies on the ways how networks are represented. Often, a discrete adjacency matrix is used to represent a network, which only captures neighboring relationships between vertices. Indeed, this simple representation cannot embody more complex, higher-order structure relationships, such as paths, frequent substructure etc. As a result, such a traditional routine often makes many network analytic tasks computationally expensive and intractable over largescale networks. --Zhang, Daokun, et al. "Network representation learning: A survey." IEEE transactions on Big Data (2018).
  • 4. What is it? • Network embedding aims at converting the network into a low-dimensional space while structural information of the network is preserved. • In this way, nodes and/or edges of the network can be represented as compacted yet informative vectors in the embedding space. Advantages: • Typical non-network-based machine learning methods such as linear regression, Support Vector Machine (SVM) and decision forest, which have been demonstrated to be effective and efficient as the state-of- the-art techniques, can be applied to such vectors. Current status: • Efforts of applying network embedding to improve biomedical data analysis are already planned or underway. Difficulties: • The biomedical networks are sparse, noisy, incomplete, heterogeneous and usually consist of biomedical text and other domain knowledge. It makes embedding tasks more complicated than other application fields.
  • 5. In the network embedding space, the relationships among the nodes, which were originally represented by edges or other high-order topological measures in graphs, is captured by the distances between nodes in the vector space, and the topological and structural characteristics of a node are encoded into its embedding vector. An example is shown in Fig. After embedding the karate club network into a twodimensional space, the similar nodes marked by the same color are close to each other in the embedding space, demonstrating that the network structure can be well modeled in the two-dimensional embedding space. Ref.: Cui, Peng, et al. "A survey on network embedding." IEEE Transactions on Knowledge and Data Engineering (2018). Basic idea: Embed nodes so that distances in embedding space reflect node similarities in the original network.
  • 6. A non-attributed network is also known as homogeneous network, of which all nodes and edges belong to a unique type, respectively. The attributed networks, also known as heterogeneous networks, allow nodes and/or edges to belong to multiple types.
  • 7. Table: Network embedding models surveyed in this study.
  • 8. Table: Open-source software packages of the network embedding techniques.
  • 9. Applications in biomedical data science Table: Biomedical applications of network embedding surveyed in this study. The use of network embedding for biomedical data analysis is recent and not thoroughly explored.
  • 10. Drug repositioning: Computational drug repositioning, also known as drug repurposing, is a promising and efficient tool for exploring new usage for existing drugs to save drug development cost and increase productivity. Drugs bind with target proteins and affect their downstream activity, consequently lead to impact on human body to treat the disease. A drug repositioning tool usually aims at predicting unknown drug–target or drug–disease interactions.
  • 11. A network integration approach for drug-target interaction prediction and computational drug repositioning from heterogeneous information. -- Luo, Yunan, et al. Nature communications 8.1 (2017): 573. • Proposed DTINet by extending diffusion component analysis (DCA) by separately performing random walk with restart (RWR) on drug–drug, drug–disease, drug side effect and drug similarity networks for drug embedding and on protein–protein, protein–disease and protein similarity networks for target protein embedding. After that, DTINet projected drugs into the embedding space of target proteins and made prediction based on geometric proximity. • Developed a computational pipeline, called DTINet, to predict novel drug–target interactions from a constructed heterogeneous network, which integrates diverse drug-related information. • DTINet focuses on learning a low-dimensional vector representation of features, which accurately explains the topological properties of individual nodes in the heterogeneous network, and then makes prediction based on these representations via a vector space projection scheme. • DTINet achieves substantial performance improvement over other state-of-the-art methods for drug–target interaction prediction.
  • 12. Large‐scale extraction of drug–disease pairs from the medical literature. -- Wang, Pengwei, et al. Journal of the Association for Information Science and Technology 68.11 (2017): 2649-2661. • Proposed to detect unknown drug–disease interactions from the medical literature by using neuro-linguistic programming (NLP) and network embedding techniques. • Using treatment and inducement drug–disease pairs extracted from 27 million PubMed articles, they first constructed a heterogeneous network. • They next expanded the network embedding method, LINE, by modifying the 1st-order proximity to encode treatment and inducement relations as positive and negative effects to the objective function, respectively. • The result showed that the embeddings lead to significant improvement in predictions of both types of drug–disease interactions.
  • 13. Adverse drug reaction analysis: An adverse drug reaction (ADR) is defined as any undesirable effect from the medical use of drugs beyond its anticipated therapeutic effects that occurs at a usual dosage. The study of ADRs is the concern of drug development especially before a drug is launched on clinical application. Detecting potential ADRs is always time consuming and expensive. To address this, computational methods based on network embedding have been introduced to ADR analysis.
  • 14. Large-scale structural and textual similarity-based mining of knowledge graph to predict drug–drug interactions. -- Abdelaziz, Ibrahim, et al. Web Semantics: Science, Services and Agents on the World Wide Web 44 (2017): 104-117. • Tiresias, a large-scale similarity-based framework that predicts DDIs through link prediction. • Tiresias takes in various sources of drug-related data and knowledge as inputs, and provides DDI predictions as outputs. • The process starts with semantic integration of the input data that results in a knowledge graph describing drug attributes and relationships with various related entities such as enzymes, chemical structures, and pathways. • The knowledge graph is then used to compute several similarity measures between all the drugs in a scalable and distributed framework. • The resulting similarity metrics are used to build features for a large-scale logistic regression model to predict potential DDIs.
  • 15. Exploiting ontology graph for predicting sparsely annotated gene function. -- Wang, Sheng, et al. Bioinformatics 31.12 (2015): i357-i364. • Proposed clusDCA to predict gene function, by applying diffusion component analysis (DCA) to gene–gene interaction and GO to learn low-dimensional representations for genes and GO labels, respectively. • Based on such embedding results, they trained a projection model from gene space to GO space such that genes geometrically closed to their known GO labels. • It bridges latent gene features and GO labels and results in desirable prediction of sparsely annotated gene functions. Genomics Data Analysis
  • 16. Geometric de-noising of protein-protein interaction networks. -- Kuchaiev, Oleksii, et al. PLoS computational biology 5.8 (2009): e1000454. • Proposed an embedding algorithm based on multidimensional scaling (MDS) for PPI network de- noising. • By using the embeddings of proteins, they predicted new PPIs and assessed the confidence of existing PPIs. Proteomics Data Analysis
  • 17. miRNA-Disease Association Prediction with Collaborative Matrix Factorization. -- Shen, Zhen, et al. Complexity 2017 (2017). • Developed CMFMDA that introduced matrix factorization to bipartite miRNA-disease network for embedding to predict new associations. • In CMFMDA, miRNA functional similarity and disease semantic similarity were involved in factorization in terms of regularizations to improve embedding. • The evaluation was performed to discover esophageal neoplasms-related miRNAs that were previously confirmed by miR2Disease and dbDEMC. • The results showed that CMFMDA can outperform other computational methods. Transcriptomics Data Analysis
  • 18. Challenges Data quality: Unlike other domain where the data are clean and well structured, networks constructed from the biomedical data are usually noisy and incomplete. Local and global: • Performances of the network embedding model and its downstream tasks rely on the type of structural property to preserve. • Preserving local property will gather connected nodes in the embedding space, while preserving global property will project topologically similar (even far separated) nodes together. • Designing embedding method by properly considering local and global structure properties according to application scenarios is an important aspect that will require the development of novel solutions.
  • 19. Challenges Network evolution: • Networks are always not static, especially in the biomedical domain. • Existing network embedding models mainly focused on the static networks, and the settings of network evolution were overlooked. • To learn embeddings for a dynamic network, existing methods should be trained repeatedly for each timestamp, which is definitely time consuming and may not capture the temporal properties. • Therefore, most of the existing network embedding methods cannot be directly applied to evolving biomedical networks.
  • 20. Challenges Domain complexity: • Different from network embedding application in other domains, the issue on biomedicine and health care is much more complicated. • For example, in a biomedical network, each interaction between entities usually represents a complex genetic, pathological or pharmacological event or process, and there is usually no complete knowledge on how it progresses. • When applying an embedding model, the biomedical domain knowledge is also needed to better understand the network structure.
  • 21. Opportunities Local and global trade-off embedding Dynamic embedding Text-associated embedding Domain-knowledge-associated embedding