Van Thuy Hoang
Network Science Lab
Dept. of Artificial Intelligence
The Catholic University of Korea
E-mail: hoangvanthuy90@gmail.com
240302
Kha-Dinh Luong et al., NeurIPS 2023
Graph Convolutional Networks (GCNs)
 Generate node embeddings based on local network neighborhoods
 Nodes have embeddings at each layer, repeatedly combining messages
from their neighbors using neural networks
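The neighborhood aggregation described above can be sketched as a single message-passing round. This is a minimal illustration, not a full GCN layer: it only averages each node's vector with those of its neighbors, and it leaves out the learned weight matrix and nonlinearity that a real layer would apply. The names `mean_aggregate`, `features`, and `neighbors` are hypothetical.

```python
def mean_aggregate(features, neighbors):
    """One message-passing round: average each node's own vector with its neighbors'.

    features:  list of equal-length float lists, one per node
    neighbors: dict mapping node index -> list of neighbor indices
    """
    out = []
    for v, own in enumerate(features):
        # Collect the node's own embedding plus one message per neighbor.
        msgs = [own] + [features[u] for u in neighbors[v]]
        # Column-wise mean over all collected vectors.
        out.append([sum(col) / len(msgs) for col in zip(*msgs)])
    return out


# Path graph 0 - 1 - 2 with one-dimensional features.
feats = [[0.0], [3.0], [6.0]]
adj = {0: [1], 1: [0, 2], 2: [1]}
print(mean_aggregate(feats, adj))  # [[1.5], [3.0], [4.5]]
```

Stacking several such rounds lets information travel further than one hop, which is what "embeddings at each layer" refers to.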
Higher-order Graph Neural Networks
 Figure source: Higher-order Graph Neural Networks (Semantic Scholar)
Higher-order structural arrangements
 Motif-based or fragment-based pretraining is a new direction that
potentially overcomes these problems.
 Existing fragment-based methods use either suboptimal
fragmentation or suboptimal fragment embeddings.
 GROVER predicts fragments from node and graph embeddings;
however, its fragments are k-hop subgraphs that cannot
account for chemically meaningful subgraphs of varying sizes
and structures.
Molecule Fragmentation
 Fragment-based contrastive pretraining framework
Principal Subgraph Mining
Molecule generation by principal subgraph mining and assembling. NIPS, 2022.
Principal Subgraph Extraction
 Given a graph G = (V, E), a subgraph of G is defined as S = (V', E'), where V' ⊆ V and E' ⊆ E
 Fragment extraction on {C=CC=C, CC=CC, C=CCC}.
(a) Initialize vocabulary with atoms.
(b) Fragment CC is the most frequent and added to the vocabulary. All
CC are merged and highlighted in red.
(c) Fragment C=CC is the most frequent and added to the vocabulary.
All C=CC are merged and highlighted in green (molecules 1 and 3).
After 2 iterations the vocabulary is {C, CC, C=CC}.
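The merge-the-most-frequent-pair procedure in steps (a) to (c) can be reproduced with a small sketch. It is a deliberate simplification and not the principal subgraph mining algorithm itself: it assumes unbranched chains of single-letter atoms with only single and double bonds, and it treats a fragment and its reversal as the same fragment. The helper names `parse_chain`, `canon`, `merge`, and `extract_vocab` are hypothetical.

```python
from collections import Counter


def parse_chain(smiles):
    """Split a linear SMILES of single-letter atoms into atoms and bonds ('' single, '=' double)."""
    atoms, bonds, pending = [], [], ""
    for ch in smiles:
        if ch == "=":
            pending = "="
        else:
            if atoms:
                bonds.append(pending)
            atoms.append(ch)
            pending = ""
    return atoms, bonds


def canon(fragment):
    """A chain read in either direction is the same fragment; keep the smaller string."""
    return min(fragment, fragment[::-1])


def merge(frags, bonds, target):
    """Greedily merge adjacent fragments (left to right) whose union equals target."""
    new_frags, new_bonds, i = [], [], 0
    while i < len(frags):
        if i + 1 < len(frags) and canon(frags[i] + bonds[i] + frags[i + 1]) == target:
            new_frags.append(frags[i] + bonds[i] + frags[i + 1])
            i += 2
        else:
            new_frags.append(frags[i])
            i += 1
        if i < len(frags):
            new_bonds.append(bonds[i - 1])  # bond to the next remaining fragment
    return new_frags, new_bonds


def extract_vocab(smiles_list, n_iters):
    mols = [parse_chain(s) for s in smiles_list]
    vocab = sorted({a for atoms, _ in mols for a in atoms})  # (a) start from atoms
    for _ in range(n_iters):
        counts = Counter()
        for frags, bonds in mols:
            for i in range(len(frags) - 1):
                counts[canon(frags[i] + bonds[i] + frags[i + 1])] += 1
        if not counts:
            break
        best = counts.most_common(1)[0][0]  # most frequent merged neighbor pair
        vocab.append(best)
        mols = [merge(f, b, best) for f, b in mols]
    return vocab


print(extract_vocab(["C=CC=C", "CC=CC", "C=CCC"], 2))  # ['C', 'CC', 'C=CC']
```

On general molecules with rings and branches, the same idea has to operate on graph neighborhoods rather than on string positions.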
Fragment-based Contrastive Pretraining
 To obtain the collective embedding of atom nodes corresponding to
a fragment, we define a function FRAGPOOL(·) that combines node
embeddings
 FRAGPOOL: average function in the experiments
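With averaging as the pooling function, FRAGPOOL amounts to a per-fragment mean over atom embeddings. The sketch below assumes each atom belongs to exactly one fragment and that every fragment contains at least one atom; the function name `fragpool_mean` and the argument layout are hypothetical.

```python
def fragpool_mean(node_emb, frag_of_node, num_frags):
    """Average the embeddings of the atoms assigned to each fragment.

    node_emb:     list of equal-length float lists, one per atom
    frag_of_node: fragment index of each atom
    num_frags:    number of fragments in the molecule
    """
    dim = len(node_emb[0])
    sums = [[0.0] * dim for _ in range(num_frags)]
    counts = [0] * num_frags
    for vec, f in zip(node_emb, frag_of_node):
        counts[f] += 1
        for d, x in enumerate(vec):
            sums[f][d] += x
    # Assumes every fragment has at least one atom, so counts are nonzero.
    return [[s / c for s in row] for row, c in zip(sums, counts)]


emb = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
print(fragpool_mean(emb, [0, 0, 1], 2))  # [[2.0, 3.0], [5.0, 6.0]]
```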
Contrastive learning objective
 We minimize the contrastive learning objective based on the InfoNCE
loss
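The slide's formula is not in the transcript, so the following is only a generic InfoNCE sketch rather than the exact objective used in the paper. It assumes the i-th anchor and the i-th positive form a matching pair, that all other positives in the batch act as negatives, and that similarity is cosine similarity scaled by a temperature `tau`; these choices and the names `cosine` and `info_nce` are assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity of two nonzero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchors, positives, tau=0.1):
    """Mean over i of -log softmax_i(sim(anchor_i, positives) / tau)."""
    total = 0.0
    for i, a in enumerate(anchors):
        logits = [cosine(a, p) / tau for p in positives]
        # Log-sum-exp with the max subtracted for numerical stability.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / len(anchors)


a = [[1.0, 0.0], [0.0, 1.0]]
print(info_nce(a, a) < info_nce(a, a[::-1]))  # True: aligned pairs give a lower loss
```

Minimizing this quantity pulls matching pairs together and pushes non-matching pairs apart in embedding space.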
Fragment-based predictive pretraining (task 1)
 A multi-label prediction task that outputs a vocabulary-size binary
vector indicating which fragments exist in the molecular graph.
 Thanks to the optimized fragmentation procedure that we use, the
output dimension is compact without extremely rare classes or
fragments, resulting in more robust learning.
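A minimal sketch of the target and loss for this multi-label task, assuming a binary cross-entropy loss over the vocabulary (the transcript does not state the loss, so that choice is an assumption). The names `fragment_targets` and `bce_with_logits` are hypothetical.

```python
import math


def fragment_targets(fragments_present, vocab):
    """Multi-hot vector: 1 where a vocabulary fragment occurs in the molecule."""
    present = set(fragments_present)
    return [1 if frag in present else 0 for frag in vocab]


def bce_with_logits(logits, targets):
    """Mean binary cross-entropy over vocabulary entries, in a numerically stable form."""
    total = 0.0
    for z, y in zip(logits, targets):
        # Equivalent to -[y*log(sigmoid(z)) + (1-y)*log(1-sigmoid(z))].
        total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
    return total / len(logits)


vocab = ["C", "CC", "C=CC"]
y = fragment_targets(["C=CC", "C"], vocab)
print(y)  # [1, 0, 1]
print(bce_with_logits([5.0, -5.0, 5.0], y) < bce_with_logits([0.0, 0.0, 0.0], y))  # True
```

The output dimension equals the vocabulary size, which is why a compact vocabulary without very rare fragments keeps this task well behaved.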
Fragment Graph Structure Prediction (task 2)
 Predict the structural backbones of fragment graphs.
 The number of classes is the number of unique structural backbones.
Essentially, a backbone is a fragment graph with no node or edge
attributes.
 Each task is trained with its own predictive objective.
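One way to turn backbones into class labels is to map every attribute-free fragment graph to a canonical key and number the distinct keys. The sketch below does this by brute force over all node relabelings, which is only practical for very small graphs; it illustrates the idea and is not the procedure used in the paper. The names `backbone_key` and `assign_classes` are hypothetical.

```python
from itertools import permutations


def backbone_key(num_nodes, edges):
    """Canonical key of an unlabeled graph: the smallest sorted edge list over all relabelings."""
    best = None
    for perm in permutations(range(num_nodes)):
        relabeled = tuple(sorted(tuple(sorted((perm[u], perm[v]))) for u, v in edges))
        if best is None or relabeled < best:
            best = relabeled
    return (num_nodes, best)


def assign_classes(graphs):
    """Give each graph the class id of its backbone; ids follow first appearance."""
    class_of = {}
    return [class_of.setdefault(backbone_key(n, e), len(class_of)) for n, e in graphs]


path_a = (3, [(0, 1), (1, 2)])
path_b = (3, [(0, 2), (2, 1)])          # same path, different node numbering
triangle = (3, [(0, 1), (1, 2), (0, 2)])
print(assign_classes([path_a, path_b, triangle]))  # [0, 0, 1]
```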
Experimental Settings
 Dataset:
 a processed subset containing 456K molecules from the ChEMBL
database
 A fragment vocabulary of size 800 is extracted
 Model: 5-layer Graph Isomorphism Network (GIN)
On binary molecular property prediction
 Test ROC-AUC on binary molecular property prediction benchmarks
using different pretraining strategies in GraphFP
 C, P, and F indicate contrastive pretraining, predictive pretraining, and
inclusion of fragment encoders in downstream prediction, respectively
On Long-range Chemical Benchmarks
 Performance on Peptides-func (graph classification) and Peptides-struct
(graph regression).
 These tasks require capturing long-range interactions within large
peptide molecules.
On vocabulary of various sizes
 Downstream performance of GINs pretrained on vocabularies of
various sizes.
Conclusions and Future Work
 Contrastive and predictive learning strategies for pretraining GNNs
based on graph fragmentation
 Two separate encoders are pretrained for molecular graphs and fragment
graphs, thus capturing structural information at different resolutions.
 When benchmarked on chemical and long-range peptide datasets,
the method achieves competitive or better results compared to
existing methods.
 Future work: pretraining on larger datasets, more extensive featurizations,
better fragmentations, and improved representations.
Van Thuy Hoang
Network Science Lab
Dept. of Artificial Intelligence
The Catholic University of Korea
E-mail: hoangvanthuy90@gmail.com
240302
Namkyeong Lee et al., ICLR 2023
BACKGROUND
 Molecular Relational Learning
 Learning the interaction behavior between a pair of molecules
 Examples
 Predicting optical properties when a chromophore and a solvent
react
 Predicting solubility when a solute and a solvent react
 Predicting side effects when taking two types of drugs
simultaneously
Functional Group
 Specific atomic groups that play an important role in determining
the chemical reactivity of organic compounds
 Compounds with the same functional group generally have similar
properties and undergo similar chemical reactions
 Hence, it is important to consider functional groups in molecular
relational learning
 A molecule can be represented as a graph, and a functional group can be
represented as a subgraph
INFORMATION BOTTLENECK
 A theoretical framework for the trade-off between information compression
and preservation
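The trade-off is usually written as the information bottleneck Lagrangian. The formula below is the standard textbook form, not text recovered from the slide; the symbols X (input), Y (target), Z (compressed representation), and the weight β are assumed notation.

```latex
\min_{Z} \; -I(Y; Z) \;+\; \beta \, I(X; Z)
```

The first term rewards keeping information about the target (preservation), and the second penalizes retaining information about the input (compression). A larger β therefore puts more weight on compression, and a smaller β puts more weight on prediction.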
Information Bottleneck Graph
 Subgraph that maximally preserves the property of the original
graph
 Motif in ordinary graphs
 Functional group in molecules
Extract a subgraph in terms of nodes
 Inject noise into node embeddings to perform graph compression
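A rough sketch of node-level noise injection: each node keeps its embedding with some probability and is otherwise replaced by Gaussian noise drawn from the per-dimension mean and standard deviation of the embeddings. The keep probabilities are taken as given here; how they are produced by the model, and the exact noise distribution, are assumptions not stated in the transcript. The name `inject_noise` is hypothetical.

```python
import random
import statistics


def inject_noise(node_emb, keep_prob, rng=random):
    """Keep node i's embedding with probability keep_prob[i]; otherwise replace it with noise."""
    dim = len(node_emb[0])
    cols = list(zip(*node_emb))
    mean = [statistics.fmean(c) for c in cols]   # per-dimension mean over nodes
    std = [statistics.pstdev(c) for c in cols]   # per-dimension standard deviation
    out = []
    for vec, p in zip(node_emb, keep_prob):
        if rng.random() < p:
            out.append(list(vec))                # node survives compression
        else:
            out.append([rng.gauss(mean[d], std[d]) for d in range(dim)])
    return out


emb = [[1.0, 2.0], [3.0, 4.0]]
print(inject_noise(emb, [1.0, 1.0]) == emb)  # True: probability 1 keeps every node
```

Nodes that are mostly replaced by noise carry little information forward, so the surviving nodes act as the extracted subgraph.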
Conditional Graph Information Bottleneck
 Consider Graph 2 (Solvent) when detecting the important subgraph
from Graph 1 (Solute)
Graph Information Bottleneck
Conditional Graph Information Bottleneck
CONDITIONAL GRAPH INFORMATION BOTTLENECK
 Overall procedure
 Decompose the conditional MI based on
the chain rule of MI, and then derive the
upper bound of the decomposed terms
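The decomposition step can be written out as follows. This is a sketch under assumed notation, since the slide's formulas are not in the transcript: G^1 and G^2 denote the two input graphs, G_CIB the extracted subgraph of G^1, Y the target, and β the trade-off weight. The first line is the conditional counterpart of the information bottleneck objective, and the second line is the chain rule of mutual information applied to the compression term.

```latex
\begin{aligned}
&\min_{\mathcal{G}_{\mathrm{CIB}}} \; -I\big(Y; \mathcal{G}_{\mathrm{CIB}} \mid \mathcal{G}^{2}\big)
  + \beta \, I\big(\mathcal{G}^{1}; \mathcal{G}_{\mathrm{CIB}} \mid \mathcal{G}^{2}\big) \\
&I\big(\mathcal{G}^{1}; \mathcal{G}_{\mathrm{CIB}} \mid \mathcal{G}^{2}\big)
  = I\big(\mathcal{G}_{\mathrm{CIB}}; \mathcal{G}^{1}, \mathcal{G}^{2}\big)
  - I\big(\mathcal{G}_{\mathrm{CIB}}; \mathcal{G}^{2}\big)
\end{aligned}
```

Each resulting term is then replaced by a tractable bound, which is what makes the objective trainable.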
EXPERIMENTS
 Chromophore dataset: Absorption max, Emission max, Lifetime
 Solvation Free Energy datasets: MNSol, FreeSolv, CompSol, Abraham, CombiSolv
 Drug-Drug Interaction datasets: ZhangDDI, ChChMiner
MAIN TABLE
 Observations
 CGIB outperforms the baselines on both molecular interaction and
drug-drug interaction tasks
SENSITIVITY ANALYSIS
 β = 1.0:
 CGIB focuses on compression, e.g., CGIB focuses on an aromatic ring,
which is not relevant to chemical reactions
 β = 0.01:
 CGIB focuses on prediction, e.g., CGIB focuses on the external part,
which is generally more relevant to chemical reactions
QUALITATIVE ANALYSIS
 Observations:
 (a) Chromophore interacting with ordinary solvents: focus on
external parts → aligns with domain knowledge
 (b) Chromophore interacting with liquid oxygen solvents: focus on
all parts → aligns with domain knowledge
CONCLUSION
 Proposed a method for tackling relational learning tasks, which are
crucial for scientific discovery
 Based on the Conditional Graph Information Bottleneck
 It is crucial to consider Graph 2 (Solvent) when detecting the
important subgraph from Graph 1 (Chromophore)
240318_Thuy_Labseminar[Fragment-based Pretraining and Finetuning on Molecular Graphs].pptx

Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Education
 
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdfEnzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
 
Hybridoma Technology ( Production , Purification , and Application )
Hybridoma Technology  ( Production , Purification , and Application  ) Hybridoma Technology  ( Production , Purification , and Application  )
Hybridoma Technology ( Production , Purification , and Application )
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptx
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13
 
Pharmacognosy Flower 3. Compositae 2023.pdf
Pharmacognosy Flower 3. Compositae 2023.pdfPharmacognosy Flower 3. Compositae 2023.pdf
Pharmacognosy Flower 3. Compositae 2023.pdf
 

240318_Thuy_Labseminar[Fragment-based Pretraining and Finetuning on Molecular Graphs].pptx

  • 1. Van Thuy Hoang, Network Science Lab, Dept. of Artificial Intelligence, The Catholic University of Korea. E-mail: hoangvanthuy90@gmail.com. 240302. Kha-Dinh Luong et al., NIPS23
  • 2. Graph Convolutional Networks (GCNs): Generate node embeddings based on local network neighborhoods. Nodes have embeddings at each layer, repeatedly combining messages from their neighbors using neural networks.
  • 3. Higher-order Graph Neural Networks: Higher-order Graph Neural Networks | Semantic Scholar
  • 4. Higher-order structural arrangements: Motif-based or fragment-based pretraining is a new direction that potentially overcomes these problems. Existing fragment-based methods use either suboptimal fragmentation or fragmentation embeddings. GROVER predicts fragments from node and graph embeddings; however, its fragments are k-hop subgraphs that cannot account for chemically meaningful subgraphs with varying sizes and structures.
  • 5. Molecule Fragmentation: Fragment-based contrastive pretraining framework. Principal Subgraph Mining (Molecule generation by principal subgraph mining and assembling. NIPS, 2022).
  • 6. Principal Subgraph Extraction: Given a graph G = (V, E), a subgraph of G is defined as S. Fragment extraction on {C=CC=C, CC=CC, C=CCC}: (a) Initialize the vocabulary with atoms. (b) Fragment CC is the most frequent and is added to the vocabulary. All CC are merged and highlighted in red. (c) Fragment C=CC is the most frequent and is added to the vocabulary. All C=CC are merged and highlighted in green (molecules 1 and 3). After 2 iterations the vocabulary is {C, CC, C=CC}.
  • 7. Fragment-based Contrastive Pretraining: To obtain the collective embedding of atom nodes corresponding to a fragment, we define a function FRAGPOOL(·) that combines node embeddings. FRAGPOOL is the average function in the experiments.
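
Since the slide states that FRAGPOOL is an average in the experiments, the pooling step can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `frag_pool`, the `frag_ids` assignment array, and the toy embeddings are all assumptions made for the example.

```python
import numpy as np

def frag_pool(node_emb, frag_ids, num_frags):
    """Average the embeddings of the atoms assigned to each fragment.

    node_emb:  (num_atoms, dim) atom embeddings from a GNN encoder.
    frag_ids:  (num_atoms,) index of the fragment each atom belongs to
               (hypothetical input; it would come from the fragmentation step).
    """
    dim = node_emb.shape[1]
    sums = np.zeros((num_frags, dim))
    # Unbuffered scatter-add: every atom row is added to its fragment's row.
    np.add.at(sums, frag_ids, node_emb)
    counts = np.bincount(frag_ids, minlength=num_frags).reshape(-1, 1)
    # Guard against empty fragments to avoid division by zero.
    return sums / np.maximum(counts, 1)

# Toy example: atoms 0 and 1 form fragment 0, atom 2 forms fragment 1.
node_emb = np.array([[1.0, 0.0], [3.0, 2.0], [0.0, 4.0]])
frag_ids = np.array([0, 0, 1])
pooled = frag_pool(node_emb, frag_ids, num_frags=2)
```

Each row of `pooled` is the mean of the atom embeddings in one fragment, so its shape is (number of fragments, embedding dimension).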
  • 8. Contrastive learning objective: We minimize the contrastive learning objective based on the InfoNCE loss.
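
A generic InfoNCE loss can be sketched as follows. The slide does not spell out which pairs are treated as positives, the similarity function, or the temperature, so the choices below (row i of `anchors` matches row i of `positives`, cosine similarity, `temperature=0.1`) are assumptions for illustration only.

```python
import numpy as np

def info_nce(anchors, positives, temperature=0.1):
    """InfoNCE loss where row i of `anchors` is paired with row i of `positives`.

    All other rows of `positives` in the batch act as negatives for anchor i.
    """
    # Cosine similarity via L2-normalised dot products.
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature
    # Subtract the row maximum for a numerically stable log-softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # The positive pair for anchor i sits on the diagonal.
    return -np.mean(np.diag(log_prob))

# Toy check: correctly aligned pairs give a much lower loss than shuffled pairs.
emb = np.eye(3)
aligned_loss = info_nce(emb, emb)
shuffled_loss = info_nce(emb, np.roll(emb, 1, axis=0))
```

Minimizing this loss pulls each positive pair together while pushing the other embeddings in the batch apart.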
  • 9. Fragment-based predictive pretraining (task 1): A multi-label prediction task that outputs a vocabulary-size binary vector indicating which fragments exist in the molecular graph. Thanks to the optimized fragmentation procedure that we use, the output dimension is compact without extremely rare classes or fragments, resulting in more robust learning.
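
The multi-label target described in this slide can be illustrated with the three-entry vocabulary from slide 6. This is a sketch under assumptions: the helper names, the example molecule's fragment list, and the use of binary cross-entropy as the training loss are not taken from the slides.

```python
import numpy as np

def fragment_presence_target(fragments_in_molecule, vocabulary):
    """Build a vocabulary-size binary vector marking which fragments occur."""
    index = {frag: i for i, frag in enumerate(vocabulary)}
    target = np.zeros(len(vocabulary), dtype=np.float64)
    for frag in fragments_in_molecule:
        target[index[frag]] = 1.0
    return target

def multilabel_bce(logits, target):
    """Numerically stable binary cross-entropy with logits, averaged over labels."""
    return np.mean(
        np.maximum(logits, 0.0) - logits * target + np.log1p(np.exp(-np.abs(logits)))
    )

vocabulary = ["C", "CC", "C=CC"]  # vocabulary from the extraction example
# Hypothetical fragmentation of one molecule into a C=CC fragment and a lone C.
target = fragment_presence_target(["C=CC", "C"], vocabulary)

good_loss = multilabel_bce(np.array([10.0, -10.0, 10.0]), target)
bad_loss = multilabel_bce(np.array([-10.0, 10.0, -10.0]), target)
```

A compact vocabulary keeps this target vector short, which is the property the slide credits for more robust learning.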
  • 10. Fragment Graph Structure Prediction (task 2): Predict the structural backbones of fragment graphs. The number of classes is the number of unique structural backbones. Essentially, a backbone is a fragment graph with no node or edge attributes. With predictive objective of each task.
  • 11. Experimental Settings: Dataset: a processed subset containing 456K molecules from the ChEMBL database. A fragment vocabulary of size 800 is extracted. Models: 5-layer Graph Isomorphism Network (GIN).
  • 12. On binary molecular property prediction: Test ROC-AUC on binary molecular property prediction benchmarks using different pretraining strategies in GraphFP. C, P, and F indicate contrastive pretraining, predictive pretraining, and inclusion of fragment encoders in downstream prediction, respectively.
  • 13. On Long-range Chemical Benchmarks: Performances on PEPTIDE-FUNC (graph classification) and PEPTIDE-STRUCT (graph regression). These tasks require capturing long-range interactions within large peptide molecules.
  • 14. On vocabulary of various sizes: Downstream performances with GINs pretrained on vocabularies of various sizes.
  • 15. Conclusions and Future Work: Contrastive and predictive learning strategies for pretraining GNNs based on graph fragmentation. Pretrain two separate encoders for molecular graphs and fragment graphs, thus capturing structural information at different resolutions. When benchmarked on chemical and long-range peptide datasets, the method achieves competitive or better results compared to existing methods. Future work: pretraining via larger datasets, more extensive featurizations, better fragmentations, and more optimal representations.
  • 16.
  • 17. Van Thuy Hoang, Network Science Lab, Dept. of Artificial Intelligence, The Catholic University of Korea. E-mail: hoangvanthuy90@gmail.com. 240302. Namkyeong Lee et al., ICLR2023
  • 18. BACKGROUND: Molecular Relational Learning: learning the interaction behavior between a pair of molecules. Examples: predicting optical properties when a chromophore and a solvent react; predicting solubility when a solute and a solvent react; predicting side effects when taking two types of drugs simultaneously.
  • 19. Functional Group: Specific atomic groups that play an important role in determining the chemical reactivity of organic compounds. Compounds with the same functional group generally have similar properties and undergo similar chemical reactions. Hence, it is important to consider functional groups for molecular relational learning.
  • 20. Functional Group: Specific atomic groups that play an important role in determining the chemical reactivity of organic compounds. Compounds with the same functional group generally have similar properties and undergo similar chemical reactions. A molecule can be represented as a graph; a functional group can be represented as a subgraph.
  • 21. INFORMATION BOTTLENECK: A theoretical approach to the trade-off between information compression and preservation.
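
For reference, the standard information bottleneck objective can be written as below. The symbols are notation assumed here rather than taken from the slide: X is the input, Y the target, T the compressed representation, I(·;·) mutual information, and β the trade-off weight.

```latex
\min_{T} \; -I(Y; T) + \beta \, I(X; T)
```

The first term rewards preserving information about the target and the second penalizes retaining information about the input. With this convention a larger β puts more weight on compression and a smaller β on prediction, which matches the β = 1.0 and β = 0.01 cases in the sensitivity analysis on slide 29.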
  • 22. Information Bottleneck Graph: Subgraph that maximally preserves the property of the original graph. Motif in ordinary graphs; functional group in molecules.
  • 23. Extract a subgraph in terms of nodes: Inject noise into node embeddings to perform graph compression.
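
One way to read "inject noise into node embeddings" is sketched below: each node is either kept or replaced by Gaussian noise drawn from the embedding statistics, so replaced nodes carry no node-specific information. This is an assumed formulation for illustration; the slide does not give the exact procedure, and `keep_prob` is a hypothetical input that would normally be predicted by a learned module.

```python
import numpy as np

def inject_noise(node_emb, keep_prob, rng):
    """Keep node i with probability keep_prob[i]; otherwise replace it with noise.

    The noise is sampled per feature from a Gaussian matching the mean and
    standard deviation of the embeddings across nodes.
    """
    mu = node_emb.mean(axis=0)
    sigma = node_emb.std(axis=0)
    # Bernoulli sample per node: 1 means keep, 0 means replace with noise.
    keep = rng.binomial(1, keep_prob).reshape(-1, 1)
    noise = rng.normal(mu, sigma, size=node_emb.shape)
    return keep * node_emb + (1 - keep) * noise, keep.ravel()

rng = np.random.default_rng(0)
node_emb = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
keep_prob = np.array([1.0, 0.0, 1.0])  # deterministic here for a readable demo
compressed, keep = inject_noise(node_emb, keep_prob, rng)
```

Nodes with a high keep probability form the retained subgraph; the rest are compressed away by the noise.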
  • 24. Conditional Graph Information Bottleneck: Consider Graph 2 (Solvent) when detecting the important subgraph from Graph 1 (Solute). Figure panels: Graph Information Bottleneck; Conditional Graph Information Bottleneck.
  • 25. CONDITIONAL GRAPH INFORMATION BOTTLENECK: Overall procedure: decompose the conditional MI based on the chain rule of MI, and then derive the upper bound of the decomposed terms.
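
The chain rule of mutual information mentioned here states that I(A; B, C) = I(A; C) + I(A; B | C). Applied to the conditional compression term, it gives the decomposition below. The symbols are assumed notation, not taken from the slide: G^1 and G^2 are the two input graphs and G_CIB is the extracted subgraph of G^1.

```latex
I(\mathcal{G}_{\mathrm{CIB}}; \mathcal{G}^{1} \mid \mathcal{G}^{2})
  = I(\mathcal{G}_{\mathrm{CIB}}; \mathcal{G}^{1}, \mathcal{G}^{2})
  - I(\mathcal{G}_{\mathrm{CIB}}; \mathcal{G}^{2})
```

Each term on the right-hand side is an unconditional mutual information, which is what makes it possible to bound the terms separately as the slide describes.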
  • 27. EXPERIMENTS: Chromophore dataset: Absorption max, Emission max, Lifetime. Solvation Free Energy datasets: MNSol, FreeSolv, CompSol, Abraham, CombiSolv. Drug-Drug Interaction datasets: ZhangDDI, ChChMiner.
  • 28. MAIN TABLE: Observations: outperforms baselines on both Molecular Interaction and Drug-Drug Interaction tasks.
  • 29. SENSITIVITY ANALYSIS: β = 1.0: CGIB focuses on compression, e.g., CGIB focuses on an aromatic ring, which is not relevant to chemical reactions. β = 0.01: CGIB focuses on prediction, e.g., CGIB focuses on the external part, which is generally more relevant to chemical reactions.
  • 30. QUALITATIVE ANALYSIS: Observations: (a) Chromophores interacting with ordinary solvents: focus on external parts → aligns with domain knowledge. (b) Chromophores interacting with liquid oxygen solvents: focus on all parts → aligns with domain knowledge.
  • 31. CONCLUSION: Proposed a method for tackling relational learning tasks, which are crucial for scientific discovery. Based on the Conditional Information Bottleneck. It is crucial to consider Graph 2 (Solvent) when detecting the important subgraph from Graph 1 (Chromophore).