Slides presented at the Thirty-Second International Joint Conference on Artificial Intelligence, 2023, Macao, SAR. https://doi.org/10.24963/ijcai.2023/554
1. SemiGNN-PPI: Self-Ensembling Multi-Graph Neural Network for Efficient and Generalizable Protein-Protein Interaction Prediction
Ziyuan Zhao1,2,*, Peisheng Qian1,*, Xulei Yang1,+, Zeng Zeng3, Cuntai Guan2, Wai Leong Tam4, Xiaoli Li1,2
Presenter: Ziyuan Zhao, Peisheng Qian
1 Institute for Infocomm Research (I2R), A*STAR, Singapore
2 School of Computer Science and Engineering (SCSE), Nanyang Technological University, Singapore
3 School of Microelectronics, Shanghai University, China
4 Genome Institute of Singapore (GIS), A*STAR, Singapore
Paper ID: 2877
2. Challenges in Protein-Protein Interaction Prediction
- Protein-protein interactions (PPIs) are central to various cellular functions and processes.
- Label Scarcity: PPIs need to be annotated experimentally, and labels may not be available.
- Domain Shift: models trained on one domain can suffer tremendous performance degradation when evaluated on another domain.
3. Improving Efficiency and Generalization in PPI Prediction
- Machine learning (ML) based, deep learning (DL) based, and graph neural network (GNN) based methods have been investigated.
- However, dealing with imperfect data to improve model efficiency and generalization in PPI prediction remains underexplored.
5. Multi-Graph Encoding
- PPI graph: proteins and PPIs.
- Label graph: PPI types and their correlations.
- Protein-Graph Encoding (PGE) aggregates representations from neighboring proteins.
- Label-Graph Encoding (LGE) learns inter-dependent classifiers.
- The multi-graph based classifier applies the classifiers learned by LGE to the representations from PGE to obtain the PPI prediction scores.
6. Self-Ensemble Graph Learning
- We adopt mean teaching with graph data augmentation.
- Edge Manipulation (EM): to handle connectivity variations, we randomly replace a certain percentage of edges.
- Node Manipulation (NM): to handle missing attributes, we randomly remove node features with zero masking.
- We construct two augmented graph views for the student and teacher networks and encourage consistent predictions.
7. Graph Consistency Constraints
- We model the fine-grained structural protein-protein relations in the feature embedding space [Ma et al., 2022].
- Edge matching:
  - Student embedding graph
  - Teacher embedding graph
  - Consistent instance-wise correlations
  - Edge matching loss
Yuchen Ma, Yanbei Chen, and Zeynep Akata. Distilling knowledge from self-supervised teacher by embedding graph alignment. In 33rd British Machine Vision Conference, BMVA Press, 2022.
[Figure: student and teacher protein encoding branches]
8. Graph Consistency Constraints
- Node matching:
  - Edge embedding graph
  - Aligning encodings of the same protein
  - Node matching loss
- Overall loss function
9. Datasets and Settings
- 3 datasets: STRING, SHS148k, and SHS27k.
- 7 PPI types: activation, binding, catalysis, expression, inhibition, post-translational modification (ptmod), and reaction.
- Random, breadth-first search (BFS), and depth-first search (DFS) partitions [Lv et al., 2021].
- Evaluation metric: F1.
Guofeng Lv, Zhiqiang Hu, Yanguang Bi, and Shaoting Zhang. Learning unknown from correlations: Graph neural network for inter-novel-protein interaction prediction. In Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pages 3677-3683, 2021.
10. Comparison with Baselines
- Comparing with machine learning (ML) and deep learning (DL) approaches, and GNN-PPI (a strong graph-based baseline).
- We outperform baselines by a clear margin in all datasets and partition schemes.
11. Experiments under Label Scarcity
- We use 5%, 10%, and 20% of the labels in the train set.
- Our method achieves better performance under all scenarios with different datasets, label ratios, and partition schemes.
12. Experiments under Domain Shift
- The model is trained and tested in 3 settings:
  - Domain Generalization (DG)
  - Inductive Domain Adaptation (IDA)
  - Transductive Domain Adaptation (TDA)
13. Inter-novel-protein Interaction Prediction
- In the labeled train set:
  - BS subset (both proteins of the PPI are present).
  - ES subset (either one protein of the PPI is present).
  - NS subset (neither of the proteins is present).
14. Conclusions
- We identified 2 challenges in PPI prediction: label scarcity and domain shift. We addressed them with a novel SemiGNN-PPI for efficient and generalizable multi-type PPI prediction.
- To enhance generalization capability, we constructed and processed graphs at the protein and label levels.
- To leverage unlabeled PPI data, we integrated GNN into Mean Teacher and designed multiple graph consistency constraints.
- Experimental results validated the effectiveness of SemiGNN-PPI.
15. Acknowledgement
This research was funded by Competitive Research Programme "NRF-CRP22-2019-0003", National Research Foundation Singapore, and partially supported by A*STAR core funding.
Good afternoon session chairs and everyone here today. My name is Peisheng. Today I will present our paper SemiGNN-PPI: Self-Ensembling Multi-Graph Neural Network for Efficient and Generalizable Protein-Protein Interaction Prediction.
Protein-protein interactions are central to many cellular functions and processes. PPI prediction is a classification problem in which we predict the classes of the interactions between two proteins. This is important because PPIs have significant implications for drug development and disease diagnosis. However, in real-world scenarios, PPI prediction is affected by various factors, such as label scarcity and domain shift.
In the label scarcity scenario, PPIs need to be annotated from experiments, and only a small portion of them can be used for training. The lack of labels can be a significant bottleneck for PPI prediction.
In the domain shift scenario, most existing methods are only developed and validated using in-distribution data, and they suffer severe performance degradation on unseen data with different distributions.
Therefore, label scarcity and domain shift are 2 challenges in PPI prediction.
To deal with label scarcity, we aim at improving the data efficiency. To alleviate domain shift, we aim at enhancing the generalization capability for PPI prediction.
Computational approaches for PPI prediction include machine learning based and deep learning based methods. Since PPI can naturally be formulated as graphs with proteins as nodes and interactions as edges, graph neural networks have also been investigated. However, dealing with imperfect data for improving model efficiency and generalization remains a vital but underexplored issue.
In our approach, for generalizable PPI prediction, we use multi-graph encoding to model protein correlations and label dependencies, in which we construct graphs to learn correlations between proteins and label dependencies simultaneously. For data-efficient PPI prediction, we advance GNN with Mean Teacher to explore unlabeled data through self-ensemble graph learning. Moreover, we apply multiple graph consistency constraints for regularization in self-ensemble learning.
We will go through the 3 points in the following slides.
The proposed multi-graph encoding is based on two graphs. In the PPI graph, the proteins are nodes and PPIs are edges. In the label graph, the PPI types are nodes and correlations between the PPI types are edges.
To obtain protein graph encodings, we use graph neural networks to aggregate representations from neighboring proteins in the PPI graph.
To learn correlations among different types of interactions, we learn inter-dependent classifiers using a Graph Convolutional Network.
Then, we apply the learned classifiers from label graph encoding to the learned representations from protein graph encoding and obtain the predicted classification scores.
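As a rough illustration of this pipeline, the sketch below combines a one-step protein-graph encoder (PGE) with classifiers produced from a label graph (LGE) in plain PyTorch. The layer sizes, the normalization of the adjacency matrices, and the element-wise product used to build pair representations are our assumptions for the sketch, not the paper's exact design.

```python
import torch
import torch.nn as nn


class MultiGraphClassifier(nn.Module):
    """Minimal sketch of multi-graph encoding: PGE over the PPI graph, LGE over the label graph."""

    def __init__(self, in_dim, hid_dim, num_classes, label_feat_dim):
        super().__init__()
        # Protein-Graph Encoding (PGE): one graph-convolution step (A_hat @ X @ W).
        self.pge = nn.Linear(in_dim, hid_dim)
        # Label-Graph Encoding (LGE): maps label-node features to per-class classifiers.
        self.lge = nn.Linear(label_feat_dim, hid_dim)

    def forward(self, x, adj_ppi, label_feat, adj_label, pairs):
        # PGE: aggregate representations from neighboring proteins.
        h = torch.relu(self.pge(adj_ppi @ x))          # (num_proteins, hid_dim)
        # LGE: propagate over the label-correlation graph to get inter-dependent classifiers.
        w = self.lge(adj_label @ label_feat)            # (num_classes, hid_dim)
        # Represent each PPI by combining its two protein embeddings
        # (the element-wise product is an assumption; the paper may use another operator).
        e = h[pairs[:, 0]] * h[pairs[:, 1]]             # (num_pairs, hid_dim)
        # Apply the learned classifiers to the pair representations.
        return e @ w.t()                                # (num_pairs, num_classes) logits
```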
Next, to leverage unlabeled data, we adopt the mean teaching architecture. To facilitate self-ensemble graph learning, we use two data augmentation methods at both the edge and node level.
Edge manipulation aims to improve robustness against connectivity variations: we randomly replace a certain percentage of edges in the input to the models, since some PPIs could be unidentified or wrongly identified.
Node manipulation aims to improve robustness against missing attributes: we randomly mask node features with zeros and feed them into the models, expecting the model to learn effective features even when attribute information is missing.
We use edge and node manipulations to construct two augmented graph views to feed the student and teacher networks separately, and encourage them to generate consistent predictions.
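The two augmentations can be sketched as follows. The replacement scheme for EM (uniform random new endpoints) and the masking ratio for NM are assumptions, since the talk only states that edges are randomly replaced and node features are zero-masked.

```python
import torch


def edge_manipulation(edge_index, num_nodes, ratio=0.1):
    """Randomly replace a fraction of edges with new random edges (EM).

    `edge_index` is a (2, num_edges) tensor; the replacement scheme is an assumption.
    """
    num_edges = edge_index.size(1)
    num_replace = int(ratio * num_edges)
    idx = torch.randperm(num_edges)[:num_replace]
    new_edges = torch.randint(0, num_nodes, (2, num_replace))
    edge_index = edge_index.clone()
    edge_index[:, idx] = new_edges
    return edge_index


def node_manipulation(x, ratio=0.1):
    """Zero-mask the features of a random subset of nodes (NM)."""
    x = x.clone()
    mask = torch.rand(x.size(0)) < ratio
    x[mask] = 0.0
    return x


# Two augmented views, one fed to the student and one to the teacher
# (x: node features, edge_index: PPI edges; variable names are ours):
# student_view = (node_manipulation(x), edge_manipulation(edge_index, x.size(0)))
# teacher_view = (node_manipulation(x), edge_manipulation(edge_index, x.size(0)))
```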
Aside from consistency in the prediction space, we also want to model the fine-grained structural protein-protein relations in the feature embedding space.
To achieve this, we use edge matching and node matching.
For edge matching, we calculate all pairwise Pearson's correlation coefficients between node embeddings in the same batch from the student network and call the result the student embedding graph. Similarly, we construct the teacher embedding graph. Then we enforce consistent instance-wise correlations using the edge matching loss. In the loss function, Gse refers to the student embedding graph, Gte refers to the teacher embedding graph, and Adj refers to the adjacency matrix.
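Based on this spoken description alone (the exact formulation in the paper may differ), the edge matching loss could be written as:

```latex
% G^{se}: student embedding graph, G^{te}: teacher embedding graph,
% Adj(.): adjacency matrix of pairwise Pearson correlations within a batch.
\mathcal{L}_{em} = \left\lVert \mathrm{Adj}\!\left(\mathcal{G}^{se}\right) - \mathrm{Adj}\!\left(\mathcal{G}^{te}\right) \right\rVert_F^2
```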
We also use node matching as another constraint.
For node matching, we formulate the edge embedding graph by calculating all pairwise Pearson's correlation coefficients between the student encodings and the teacher encodings in the same batch. We align the encodings of the same protein from the two networks with a node matching loss. In the loss function, Gste is the edge embedding graph, I is the identity matrix, and diag is the operation that keeps only the diagonal values of the matrix and sets the rest to 0.
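Again reconstructing from the spoken description only, the node matching loss could take a form like:

```latex
% G^{ste}: edge embedding graph of student-teacher Pearson correlations,
% diag(.): keep only diagonal entries, I: identity matrix.
\mathcal{L}_{nm} = \left\lVert \mathrm{diag}\!\left(\mathrm{Adj}\!\left(\mathcal{G}^{ste}\right)\right) - I \right\rVert_F^2
```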
The overall loss is a weighted sum of the supervised loss, the consistency loss, and the node and edge matching losses.
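Putting the pieces together, the weighted sum described here can be written as below; the weights are placeholders, and the paper's symbols and values may differ.

```latex
\mathcal{L} = \mathcal{L}_{sup} + \lambda_1 \mathcal{L}_{cons} + \lambda_2 \mathcal{L}_{em} + \lambda_3 \mathcal{L}_{nm}
```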
Experiments were conducted on 3 datasets, STRING, SHS148k, and SHS27k. The PPIs are annotated with 7 types. Each PPI is labeled with at least one of them.
We follow existing partition algorithms and use random, breadth-first search, and depth-first search over protein nodes to create test sets with 20% of the data. The rest of the data is used as the train set. BFS and DFS create test data with more unseen proteins, which are more challenging scenarios.
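For intuition, a BFS-based split over protein nodes might look like the sketch below. The seed handling, stopping rule, and how PPIs are assigned to the test set follow our reading of Lv et al. [2021] and are not taken from their code.

```python
from collections import deque
import random


def bfs_partition(adjacency, test_fraction=0.2, seed=0):
    """Grow a BFS tree over protein nodes until roughly `test_fraction` of them
    is covered; PPIs among the selected proteins then form the test set.

    `adjacency` maps each protein to the set of its interacting proteins.
    This is only a sketch of the idea behind the BFS partition; details are assumptions.
    """
    random.seed(seed)
    proteins = list(adjacency)
    target = int(test_fraction * len(proteins))
    start = random.choice(proteins)
    selected, queue = {start}, deque([start])
    while queue and len(selected) < target:
        node = queue.popleft()
        for nb in adjacency[node]:
            if nb not in selected:
                selected.add(nb)
                queue.append(nb)
                if len(selected) >= target:
                    break
    return selected  # test proteins; the remaining proteins/PPIs form the train set
```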
We use F1 score as the evaluation metric.
We compare with Machine Learning and Deep Learning baselines. Particularly, GNN-PPI is a strong graph-based baseline for multi-class PPI prediction.
In this table, our method outperforms baselines by a clear margin in all datasets and all partition schemes.
Next, we use 5%, 10%, and 20% of the labels in the train set to simulate the label scarcity scenario.
In this case, GNN-PPI suffers severe performance degradation with fewer labels. In comparison, our method achieves better performance under all scenarios with different datasets, label ratios, and partition schemes.
To assess the generalization capability of the proposed method, we test the model in 3 evaluation settings:
For Domain Generalization (DG), the model does not have access to the trainset-heterologous dataset and is tested on the unseen dataset.
For Inductive Domain Adaptation (IDA), the model has access to unlabeled training data in the trainset-heterologous dataset.
For Transductive Domain Adaptation (TDA), the model has access to the whole unlabeled trainset-heterologous dataset.
Our method outperforms GNN-PPI in all of the 3 settings.
Next, we analyze model performance on inter-novel-protein interaction, where the proteins could be present or absent in the labeled trainset. The ES and NS subsets are more challenging because the PPI to predict is between proteins that the model did not see during training.
In this table, our method outperforms GNN-PPI in most subsets.
In conclusion, we identified two challenges in PPI prediction: label scarcity and domain shift. We addressed them with a novel self-ensembling multi-graph neural network for efficient and generalizable multi-type PPI prediction.
To enhance generalization capability, we constructed and processed graphs at the protein and label levels. To leverage unlabeled PPI data, we integrated GNN into Mean Teacher and used multiple graph consistency constraints to align feature embeddings. Finally, experimental results demonstrated the effectiveness of our approach.
The research was funded by the National Research Foundation Singapore and partially supported by A*STAR core funding. We would also like to thank all collaborators from NTU, Shanghai University, and the Genome Institute of Singapore.