Learning to Extrapolate Knowledge:
Transductive Few-shot Out-of-Graph Link Prediction
Jinheon Baek1, Dong Bok Lee1, Sung Ju Hwang1,2
1Graduate School of AI, KAIST, South Korea
2AITRICS, South Korea
Link Prediction
While graphs contain and express a huge amount of knowledge, they are highly
incomplete.
(A) Incomplete knowledge graph. (B) Predicting missing links.
Thus automatic graph completion, known as link prediction, is practically important.
Challenges on Real-world Graphs
Real-world graphs:
• Have an evolving nature, where new entities can emerge over time.
• Exhibit long-tail distributions, where most entities have few triplets to train on.
(A) Evolving nature. (B) Long-tail distribution.
We propose a few-shot out-of-graph link prediction problem whose goal is to predict
links between seen and unseen, or among unseen entities, with few links per entity.
Meta-Learning Framework
To tackle the out-of-graph link prediction problem, we propose a novel
meta-learning framework, which meta-learns the node embedding for unseen entities.
Our meta-learning framework learns by simulating the unseen entities during
training, and extrapolates this knowledge to the real unseen entities.
Training a network with massively
generated simulated unseen entities.
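The simulation step above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `sample_episode`, its parameters, and the support/query split are assumptions made for the example. A few seen entities are sampled to play the role of unseen ones, and each one's triplets are split into a small support set (the few shots) and a query set (the links to predict).

```python
import random


def sample_episode(triplets, seen_entities, num_simulated=2, num_support=3, seed=None):
    """Pick seen entities to act as simulated-unseen ones for one training episode.

    triplets: list of (head, relation, tail) tuples over seen entities.
    Returns (episode, background), where episode maps each simulated-unseen
    entity to its support/query triplets (hypothetical layout, for illustration).
    """
    rng = random.Random(seed)
    # Sort first so that sampling is reproducible for a given seed.
    simulated = set(rng.sample(sorted(seen_entities), num_simulated))

    episode = {}
    for entity in simulated:
        # Every triplet that touches the simulated-unseen entity.
        own = [t for t in triplets if entity in (t[0], t[2])]
        rng.shuffle(own)
        episode[entity] = {
            "support": own[:num_support],  # the few links the model may look at
            "query": own[num_support:],    # held-out links to be predicted
        }

    # Triplets that involve no simulated-unseen entity form the in-graph part.
    background = [
        t for t in triplets if t[0] not in simulated and t[2] not in simulated
    ]
    return episode, background
```

Repeating this sampling over many episodes is one way to generate a large number of simulated unseen entities from a fixed training graph.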
This meta-learning makes the model generalize well to the link prediction tasks on
unseen out-of-graph entities.
Generalization over real unseen
entities with meta-learned network.
Graph Extrapolation Network (GEN)
(Inductive) GEN learns to predict links between seen and unseen entities from its
output embeddings, by simulating unseen entities with seen entities.
(A) Meta-learning framework. (B) Meta-learned Network
(Graph Extrapolation Network).
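To make the idea of an output embedding for an unseen entity concrete, here is a rough sketch. The slides do not detail GEN's architecture, so everything in it is an assumption: the unseen entity is embedded with a plain mean over the embeddings of its seen neighbors (a stand-in aggregator), and a candidate link is scored with a DistMult-style product. The names `embed_unseen` and `score` are hypothetical.

```python
def embed_unseen(support, entity, entity_emb):
    """Embed an unseen entity as the mean of its seen neighbors' embeddings.

    support: few-shot (head, relation, tail) triplets touching `entity`.
    entity_emb: dict mapping seen entity -> list of floats.
    A plain mean is an assumed stand-in, not GEN's actual aggregation.
    """
    # The neighbor is whichever end of the triplet is not the unseen entity.
    neighbors = [t if h == entity else h for h, _, t in support]
    known = [entity_emb[n] for n in neighbors if n in entity_emb]
    if not known:
        raise ValueError("no seen neighbors in the support set")
    dim = len(known[0])
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


def score(head_vec, rel_vec, tail_vec):
    """DistMult-style trilinear score (assumed); higher means more plausible."""
    return sum(h * r * t for h, r, t in zip(head_vec, rel_vec, tail_vec))
```

With such a function, a seen-to-unseen link can be scored by pairing the aggregated embedding of the unseen entity with the stored embedding of a seen candidate.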
(Transductive) GEN further learns to predict links even among unseen entities,
using simulated unseen entities during meta-training.
Results
Transductive-GEN (T-GEN) outperforms all baselines on out-of-graph link prediction
tasks for knowledge graph completion and drug-drug interaction prediction.
                                                FB15k-237        NELL-995
Types                                   Models  MRR    Hits@10   MRR    Hits@10
Seen-to-Seen                            TransE  0.053  0.082     0.009  0.020
Seen-to-Seen                            R-GCN   0.008  0.011     0.004  0.007
Seen-to-Seen, re-trained from scratch   TransE  0.071  0.159     0.071  0.129
Seen-to-Seen, re-trained from scratch   R-GCN   0.099  0.181     0.112  0.184
Seen-to-Unseen                          MEAN    0.105  0.207     0.158  0.263
Seen-to-Unseen                          LAN     0.112  0.214     0.159  0.255
Ours                                    T-GEN   0.367  0.530     0.282  0.421
(A) Knowledge Graph Completion.
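For readers unfamiliar with the two metrics in table (A), the standard definitions can be written as a short sketch. It assumes each test triplet has already been assigned the 1-based rank of its correct entity among all candidates; the function names are illustrative only.

```python
def mean_reciprocal_rank(ranks):
    """MRR: average of 1/rank over all test triplets (ranks are 1-based)."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at_k(ranks, k=10):
    """Hits@k: fraction of test triplets whose correct entity ranks in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)
```

Both metrics lie in (0, 1], and higher is better, which is how the table should be read.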
                                                DeepDDI          BIOSNAP-sub
Types                                   Models  PR     Acc       PR     Acc
Seen-to-Seen, re-trained from scratch   MLP     0.476  0.528     0.034  0.049
Seen-to-Seen, re-trained from scratch   MPNN    0.478  0.681     0.026  0.067
Seen-to-Seen, re-trained from scratch   R-GCN   0.397  0.640     0.041  0.051
Ours                                    T-GEN   0.708  0.815     0.067  0.089
(B) Drug-Drug Interaction Prediction.
Results
Why does GEN generalize well to link prediction with out-of-graph entities?
This is because GEN embeds the unseen entities on the manifold of seen entities,
while the baselines embed the unseen entities off-manifold.
(A) Seen-to-Unseen Baseline
(LAN [Wang et al.]).
(B) Seen-to-Seen Baseline, retrained
from scratch (TransE [Bordes et al.]).
(C) Ours (T-GEN).
Conclusion
• We define a realistic problem setting of few-shot out-of-graph link prediction,
aiming to perform link prediction for unseen entities.
• To tackle this problem, we propose a novel meta-learning framework, which
meta-learns the node embedding for unseen entities.
• We validate our model on knowledge graph completion and drug-drug
interaction tasks, on which it significantly outperforms relevant baselines.
