Hyo Eun Lee
Network Science Lab
Dept. of Biotechnology
The Catholic University of Korea
E-mail: gydnsml@gmail.com
2023.06.28
Bioinformatics 2020
• Introduction
- Motivation and tasks
- Biomedical graph
- Purpose
• Method
- Graph embedding methods
- Application of graph embedding on biomedical networks
• Results
- Datasets and experimental set-up
- Link prediction / node classification results
- Influence of hyperparameters
• Discussion and Conclusion
1. Introduction
Motivation
• Graph embedding is underutilized in biomedical networks
• Applying graph embedding to biomedical networks can help uncover potential discoveries
Tasks: biomedical link prediction and node classification
• Biomedical link prediction tasks
- DDA (Drug-Disease Association)
- DDI (Drug-Drug Interaction)
- PPI (Protein-Protein Interaction)
• Node classification tasks
- Medical term semantic type classification
- Protein function prediction
Biomedical graph
• Graph
- Node: biomedical entity
- Edge: relation between entities
• Benefits of graph analysis: DDA-based prediction of potential drug indications, clinical decision support, and detection of lncRNA function
• Embedding methods automatically learn a low-dimensional feature representation of each node
- They preserve the structural information of the graph
- The learned representations can be used for downstream tasks
Purpose
1. Investigate the potential of advanced graph embedding methods on biomedical networks
2. Apply link prediction to three critical biomedical applications (DDA, DDI, PPI)
3. Formalize the semantic type classification of medical terms and solve it with embedding techniques
4. Suggest a proper embedding method and hyperparameter settings for each task
Fig. 1. Pipeline for applying graph embedding methods to biomedical tasks. Low-dimensional node representations are first learned from biomedical networks by graph embedding methods and then used as features to build specific classifiers for different tasks. For (a) matrix factorization-based methods, they use a data matrix (e.g. adjacency matrix) as the input to learn embeddings through matrix factorization. For (b) random walk-based methods, they first generate sequences of nodes through random walks and then feed the sequences into the word2vec model (Mikolov et al., 2013) to learn node representations. For (c) neural network-based methods, their architectures and inputs vary from different models (see Section 2 for details)
2. Method
Graph embedding methods
• 11 embedding methods of three types: matrix factorization (MF, 5), random walk (3), neural network (3)
Graph embedding methods
• First-order proximity: based on direct connections between two nodes (local)
• Second-order proximity: based on indirect connections, i.e. how similar the neighborhoods of two nodes are (global)
• High-order proximity: also considers the neighbors of neighbors
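The proximity notions above can be made concrete on a toy graph. The sketch below is illustrative only: the graph and node names are hypothetical, and Jaccard overlap of neighbor sets is used as a simple stand-in for second-order proximity (individual embedding methods define it in their own way).

```python
def first_order(adj, u, v):
    """1 if u and v share an edge, else 0 (unweighted, undirected graph)."""
    return 1 if v in adj[u] else 0


def second_order(adj, u, v):
    """Jaccard overlap of the two neighbor sets (illustrative stand-in)."""
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0


# Hypothetical toy graph: two drugs with the same two targets, no direct edge.
adj = {
    "drugA": {"p1", "p2"},
    "drugB": {"p1", "p2"},
    "p1": {"drugA", "drugB"},
    "p2": {"drugA", "drugB"},
}

print(first_order(adj, "drugA", "drugB"))   # no direct edge between the drugs
print(second_order(adj, "drugA", "drugB"))  # identical neighborhoods
```

The two drugs are not directly connected, yet their neighborhoods coincide, which is the kind of signal second-order proximity is meant to capture.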
Graph embedding methods
• (a) MF-based methods
- Factorize a data matrix (e.g. the adjacency matrix) into low-dimensional node vectors
- Preserve the hidden manifold structure and topological properties of the graph
- Examples: HOPE, GraRep
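As a rough illustration of the MF idea, the sketch below factorizes a shifted adjacency matrix with plain power iteration and deflation. This is an assumption-laden toy, not HOPE or GraRep (both factorize higher-order proximity matrices); the function name `spectral_embedding`, the diagonal shift and the start vector are choices made for this example only.

```python
import math


def matvec(M, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def spectral_embedding(A, dim, iters=1000):
    """Embed nodes using the top `dim` eigenpairs of A + shift * I."""
    n = len(A)
    shift = max(sum(row) for row in A)  # max degree, makes the matrix PSD
    M = [[A[i][j] + (shift if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    emb = [[] for _ in range(n)]
    for k in range(dim):
        v = [1.0 + 0.1 * ((i + k) % n) for i in range(n)]  # fixed start vector
        for _ in range(iters):  # power iteration
            w = matvec(M, v)
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        lam = sum(x * y for x, y in zip(v, matvec(M, v)))  # Rayleigh quotient
        scale = math.sqrt(max(lam, 0.0))
        for i in range(n):
            emb[i].append(scale * v[i])
            for j in range(n):
                M[i][j] -= lam * v[i] * v[j]  # deflate the found component
    return emb


# Path graph 0 - 1 - 2 (toy input)
A = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
emb = spectral_embedding(A, dim=3)
```

With `dim` equal to the number of nodes, the inner products of the node vectors reproduce the shifted matrix; a smaller `dim` keeps only the dominant components, which is what makes the representation low-dimensional.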
Graph embedding methods
• (b) Random walk-based methods
- Generate node sequences by random walks and learn node representations from them
- Examples: DeepWalk, node2vec, struc2vec
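The sequence-generation step can be sketched as follows. This is a minimal uniform random walk in the spirit of DeepWalk, under assumptions: the toy graph, the function name `random_walks` and its parameters are hypothetical, and the biased walks of node2vec and the word2vec training step are left out.

```python
import random


def random_walks(adj, num_walks, walk_length, seed=0):
    """Generate `num_walks` uniform random walks starting from every node."""
    rng = random.Random(seed)
    nodes = sorted(adj)
    walks = []
    for _ in range(num_walks):
        rng.shuffle(nodes)  # visit start nodes in a new order each pass
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                neighbors = sorted(adj[walk[-1]])
                if not neighbors:  # dead end: stop this walk early
                    break
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks


# Hypothetical toy graph given as neighbor sets
adj = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
walks = random_walks(adj, num_walks=2, walk_length=5)
```

Each walk plays the role of a sentence and each node the role of a word, so the resulting list can be passed to a word2vec-style model to obtain node vectors.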
Graph embedding methods
• (c) Neural network-based methods
- Different methods use different architectures and input information
- Examples: LINE, SDNE, GAE
Application of graph embedding on biomedical networks
• Five tasks of two types: link prediction (3: DDA, DDI, PPI) and node classification (2)
Application of graph embedding on biomedical networks
• 1) Link prediction
- Formalized as predicting potential interactions between biomedical entities from the known interactions
• Traditional methods use biological features, gene ontology and graph properties
- Problem 1: biological features are difficult to obtain and use
- Problem 2: fit of the biological features to the task
- ⇒ Graph embedding methods are used to address these problems
• Predictions are made with supervised or semi-supervised graph inference models
Application of graph embedding on biomedical networks
• 2) Node classification
- Protein function prediction and medical term classification
Protein function prediction
• Laboratory experiments are expensive, so graph-based methods were introduced
Medical term classification
• Growing volumes of clinical text can be modeled to improve personalized care and support clinical judgment
• To overcome privacy concerns, medical terms are extracted (using UMLS) and their co-occurrence is measured
Fig. 2. Illustration of (a) how medical term–term co-occurrence graph is constructed and (b) node type classification in the graph. Our work assumes that the graph is given as in Finlayson et al. (2014) and mainly focuses on (b), i.e. testing various embedding methods on the classification performance
Summary of embedding methods (table on the original slide, not included in this transcript)
3. Results
Datasets (7)
Link prediction: DDA (2), DDI (1), PPI (1)
• DDA
- Validated associations between chemicals and diseases in CTD
- Drug-disease relationships from NDF-RT in UMLS
• DDI
- Comprehensive data from DrugBank
• PPI
- Homo sapiens PPIs from STRING
Node classification: term-term co-occurrence graph (1), PPI (2)
• Term-term co-occurrence graph
- Refined from Stanford hospitals and clinics data using co-occurrence frequency statistics
• PPI
- PPI data from Mashup and from node2vec
Data statistics (table on the original slide, not included in this transcript)
Experimental set-up
Link prediction
• Known interactions (positive samples): 80% training, 20% testing
• Unknown interactions (the majority of node pairs): negative sampling
• Evaluation: area under the ROC curve (AUC), accuracy, F1 score
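The split described above can be sketched as follows. This is a minimal illustration under assumptions not stated on the slide: the graph is undirected without self-loops, as many negatives as positives are sampled, and the function name `split_link_data` and the toy ring graph are hypothetical.

```python
import random


def split_link_data(edges, nodes, test_ratio=0.2, seed=0):
    """Split known edges into train/test and sample unknown pairs as negatives."""
    rng = random.Random(seed)
    positives = sorted({tuple(sorted(e)) for e in edges})
    rng.shuffle(positives)
    n_test = int(len(positives) * test_ratio)

    known = set(positives)
    nodes = sorted(nodes)
    max_negatives = len(nodes) * (len(nodes) - 1) // 2 - len(known)
    target = min(len(positives), max_negatives)  # assumed 1:1 ratio

    negatives = set()
    while len(negatives) < target:
        pair = tuple(sorted(rng.sample(nodes, 2)))
        if pair not in known:  # keep only pairs with no known interaction
            negatives.add(pair)
    negatives = sorted(negatives)
    rng.shuffle(negatives)

    train = (positives[n_test:], negatives[n_test:])
    test = (positives[:n_test], negatives[:n_test])
    return train, test


# Toy ring graph with 10 nodes and 10 known interactions
nodes = list(range(10))
edges = [(i, (i + 1) % 10) for i in range(10)]
(train_pos, train_neg), (test_pos, test_neg) = split_link_data(edges, nodes)
```

Sampled negatives are only "unknown" pairs, so some of them may be true interactions that have not been observed yet; this is inherent to negative sampling rather than specific to this sketch.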
Node classification
• Embeddings are trained on the entire graph
• Nodes with label information: 80% training, 20% testing
• Evaluation: Micro-F1 and Macro-F1 (in percent)
• Embedding dimension: 100
• Grid search is used to tune 1-2 critical hyperparameters
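The two F1 variants differ in how they average over classes. The sketch below assumes the single-label case for simplicity (protein function prediction is often multi-label); the function name `micro_macro_f1` and the example labels are hypothetical.

```python
from collections import Counter


def micro_macro_f1(y_true, y_pred):
    """Return (micro-F1, macro-F1) for single-label classification."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative

    def f1(tp_c, fp_c, fn_c):
        denom = 2 * tp_c + fp_c + fn_c
        return 2 * tp_c / denom if denom else 0.0

    labels = sorted(set(y_true) | set(y_pred))  # assumes non-empty input
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return micro, macro


y_true = ["drug", "drug", "drug", "disease"]
y_pred = ["drug", "drug", "disease", "disease"]
micro, macro = micro_macro_f1(y_true, y_pred)
```

Micro-F1 pools the counts over all classes, so frequent classes dominate; Macro-F1 averages the per-class scores, so rare classes weigh as much as frequent ones.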
Link prediction results
Note: Due to the limited space, we only show the AUC value. Other evaluation metrics can be found in Supplementary Material. The best performing method in each category is in bold.
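Since AUC is the reported metric, a small worked example may help. The sketch below computes it as the fraction of positive/negative pairs that are ranked correctly, counting ties as one half; the function name `roc_auc` and the scores are made up for illustration, and the quadratic loop is only suitable for small inputs.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive is scored above a negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half
    return wins / (len(pos) * len(neg))


labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]  # hypothetical predicted link probabilities
auc = roc_auc(labels, scores)
```

Here three of the four positive/negative pairs are ordered correctly, so the AUC is 0.75; a random scorer would be expected to reach about 0.5.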
Link prediction results
Fig. 3. (a) Comparison with the state-of-the-arts for drug-disease association prediction (LRSSL) (Liang et al., 2017); (b) drug–drug interaction prediction (DeepDDI) (Ryu et al., 2018) and (c) gene (protein) function prediction (Mashup) (Cho et al., 2016). Same as Mashup, we evaluate their performance on three-level human Biological Process (BP) gene annotations (each containing GO terms with 101–300, 31–100 and 11–30 genes, respectively). As can be seen, in each task, general graph embedding methods achieve competitive performance against them
Node classification results
Note: The best performing method in each category is in bold. (a) The source code of GAE provided by the authors does not support a large-scale graph (nodes > 40k). We omit its performance on ‘Clini COOC’ here.
Influence of hyperparameters
• The embedding dimension affects prediction performance and time efficiency
- When the dimensionality exceeds 100, performance saturates while the time cost increases rapidly
Fig. 4. The influence of dimensionality on the performance and training time of different embedding methods based on ‘CTD DDA’ dataset
4. Discussion and Conclusion
Discussion
• A comprehensive evaluation of graph embedding methods on biomedical networks is needed
• Future research
- Exploring graph embedding methods for other biomedical challenges (such as gene expression analysis and disease diagnosis)
- Investigating the interpretability of graph embeddings and developing methods to incorporate domain knowledge into the embedding process
• Open-source tools and datasets are important and need further development
Conclusion
• Evaluated 11 graph embedding methods on 7 biomedical datasets
• Found that the embedding methods performed well and show potential for future predictive work
• Provided guidance on hyperparameter settings and discussed potential directions for future work
NS-CUK Journal club: HELee, Review on "Graph embedding on biomedical networks: methods, applications and evaluations", Bioinformatics 2020


Editor's Notes

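  1. Motivation: Until now, graph embedding has been used on social networks or simple bioinformatics networks, but it has not been applied to biomedical networks with systematic experiments and analysis. Applying it to biomedical networks may therefore lead to potential discoveries. Task: This paper applies 11 embedding methods to two broad tasks, biomedical link prediction and node classification. In detail, --.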