Graph Neural Prompting with Large Language Models
Van Thuy Hoang
Network Science Lab
Dept. of Artificial Intelligence
The Catholic University of Korea
E-mail: hoangvanthuy90@gmail.com
2023-10-09
Yijun Tian, et al.
The traditional learning paradigm: Pre-training & Fine-tuning
Fine-tune the parameters of the pre-trained model for a specific downstream task using a large
corpus (hundreds of thousands of examples) of labeled data.
Keep training the model via repeated gradient updates.
Characteristics:
Strong performance on many benchmarks.
Need a new large dataset for each task.
Potential for poor out-of-distribution generalization.
Potential to exploit spurious features of the data.
In-context learning
No training or optimization of the model parameters in the “adaptation step”.
Simply give the model a task description together with zero, one, or a few examples as the input at inference time: 0-SHOT (task description only), 1-SHOT, or FEW-SHOT.
No gradient updates are performed.
The model needs to figure out:
The input distribution (e.g., financial or general news)
The output distribution (e.g., Positive/Negative or topic labels)
The input-output mapping (e.g., sentiment or topic classification)
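As an illustration (not from the paper), a minimal sketch of how such a prompt can be assembled at inference time; the build_prompt helper, the sentiment task, and the example reviews are all hypothetical:

# Minimal sketch of few-shot prompt construction for in-context learning.
# The demonstrations and labels below are illustrative, not from the paper.

def build_prompt(task_description, demonstrations, query):
    """Assemble a prompt: task description, k labeled examples, then the query."""
    lines = [task_description]
    for text, label in demonstrations:            # k = 0 gives zero-shot, k = 1 one-shot, etc.
        lines.append(f"Review: {text}\nSentiment: {label}")
    lines.append(f"Review: {query}\nSentiment:")  # the model completes the label
    return "\n\n".join(lines)

prompt = build_prompt(
    "Classify the sentiment of each review as Positive or Negative.",
    [("The movie was a delight.", "Positive"),
     ("Two hours I will never get back.", "Negative")],
    "Surprisingly moving and well acted.",
)
print(prompt)  # sent to the frozen LLM at inference time; no parameters are updated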
In-context learning: example
The language model (LM) uses the in-context learning prompt to “locate” a previously
learned concept and perform the in-context learning task.
If the LM can infer the prompt's underlying concept from the demonstrations in the
prompt, then in-context learning succeeds.
Knowledge graphs
Knowledge graphs (KGs), which store enormous numbers of facts, serve as a structured and
systematic way of representing knowledge.
Consequently, existing methods have incorporated KGs to assist language
modeling for the task of question answering, often by designing customized
model architectures to accommodate both KGs and textual data.
The question
Can we learn beneficial knowledge from KGs and integrate it into pre-trained LLMs?
This is the first attempt to study the learning of beneficial knowledge from KGs
for pre-trained LLMs.
Problems
Given a multiple-choice question, first retrieve subgraphs from the knowledge graph based on the entities in the question and answer options.
Then develop Graph Neural Prompting (GNP) to encode the pertinent factual knowledge and structural information and obtain the Graph Neural Prompt.
GNP contains various designs, including a GNN encoder, a cross-modality pooling module, a domain projector, and a self-supervised link prediction objective.
The obtained Graph Neural Prompt is then sent into the LLM for inference along with the input text embedding.
The standard maximum likelihood objective is used for downstream task adaptation, while the LLM is kept frozen or tuned depending on the experimental setting (see the pipeline sketch below).
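A rough, end-to-end sketch of this pipeline under toy stand-ins; every function and dimension below (retrieve_subgraph, GraphNeuralPrompting, embed_text, score_with_llm, D) is a placeholder, not the authors' implementation:

import torch
import torch.nn as nn

D = 16  # shared embedding width (illustrative)

def retrieve_subgraph(question, option):
    # Placeholder: a real system links entities in question+option and pulls their 2-hop neighborhood.
    return torch.randn(5, D)  # 5 fake node embeddings

class GraphNeuralPrompting(nn.Module):
    # Placeholder GNP: collapses node embeddings into one soft-prompt token
    # (the real GNP uses a GNN encoder, cross-modality pooling, and a domain projector).
    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(D, D)
    def forward(self, node_emb, text_emb):  # text_emb is ignored in this toy version
        return self.proj(node_emb.mean(dim=0, keepdim=True)).unsqueeze(0)  # (1, 1, D)

def embed_text(text):
    # Placeholder: a real system uses the LLM's own token-embedding table.
    return torch.randn(1, len(text.split()), D)

def score_with_llm(inputs_embeds):
    # Placeholder frozen-LLM scorer: a real system returns the option's log-likelihood.
    return inputs_embeds.mean().item()

def answer(question, options, gnp):
    scores = []
    for option in options:
        subgraph = retrieve_subgraph(question, option)   # 1. subgraph retrieval from the KG
        text_emb = embed_text(question + " " + option)   # 2. input text embedding
        prompt = gnp(subgraph, text_emb)                 # 3. Graph Neural Prompt (soft prompt)
        full = torch.cat([prompt, text_emb], dim=1)      # 4. prepend prompt to text embeddings
        scores.append(score_with_llm(full))              # 5. LLM scores the option
    return options[scores.index(max(scores))]

print(answer("Where do fish live?", ["water", "desert"], GraphNeuralPrompting()))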
Methodology
Prompting LLMs for Question Answering
The LLM can be trained for downstream task adaptation using a standard
maximum likelihood objective with teacher forcing and a cross-entropy loss:
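The equation on this slide did not survive the export; assuming it is the standard teacher-forced objective, the loss takes the form

\mathcal{L}_{\text{llm}} = -\sum_{t=1}^{|Y|} \log p_{\theta}\left(y_t \mid y_{<t}, X\right),

where X denotes the tokenized input (context, question, and answer options), Y = (y_1, ..., y_{|Y|}) is the target answer sequence, and θ are the trainable parameters.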
Methodology
Subgraph Retrieval:
For each answer option a_k and its corresponding context C and question Q,
first obtain a set of matched entities E via entity linking, matching the tokens in
X to the entities in G.
Then retrieve a subgraph G′ based on the entities in E by including their two-hop
neighbors and the relations that connect them (a retrieval sketch follows below).
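A minimal sketch of this retrieval step, assuming a toy KG stored as a networkx MultiDiGraph and naive string matching in place of a real entity linker:

import networkx as nx

# Toy KG; each edge carries a relation attribute.
kg = nx.MultiDiGraph()
kg.add_edge("fish", "water", relation="lives_in")
kg.add_edge("water", "ocean", relation="part_of")
kg.add_edge("ocean", "earth", relation="located_on")

def retrieve_subgraph(kg, question, option, hops=2):
    text = f"{question} {option}".lower()
    matched = {n for n in kg.nodes if n.lower() in text}   # entity linking -> matched entities E
    keep = set(matched)
    for node in matched:                                   # expand to two-hop neighbors
        lengths = nx.single_source_shortest_path_length(kg.to_undirected(), node, cutoff=hops)
        keep.update(lengths)
    return kg.subgraph(keep).copy()                        # G' with the connecting relations

sub = retrieve_subgraph(kg, "Where do fish live?", "water")
print(sub.nodes(), list(sub.edges(data="relation")))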
Methodology: Graph Neural Prompting
GNN Encoder: encodes the retrieved subgraph into node embeddings.
Cross-modality Pooling: identifies the nodes most pertinent to the question and
consolidates the node embeddings. A transformation is first applied to the text
embeddings T to obtain the transformed text embeddings (a pooling sketch follows below).
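One plausible reading of this module, sketched below: the text embeddings T are projected into the node space and the nodes attend to the question tokens, so that question-relevant nodes are emphasized. The attention design and dimensions here are assumptions, not the paper's exact architecture.

import torch
import torch.nn as nn

class CrossModalityPooling(nn.Module):
    """Hedged sketch: nodes attend to the transformed text tokens, emphasizing
    question-relevant nodes; returns updated (consolidated) node embeddings."""
    def __init__(self, d_node, d_text):
        super().__init__()
        self.text_proj = nn.Linear(d_text, d_node)                 # transform T into the node space
        self.attn = nn.MultiheadAttention(d_node, num_heads=4, batch_first=True)

    def forward(self, node_emb, text_emb):
        # node_emb: (num_nodes, d_node) from the GNN encoder
        # text_emb: (num_tokens, d_text) from the LLM's embedding layer
        t = self.text_proj(text_emb).unsqueeze(0)                  # (1, num_tokens, d_node)
        q = node_emb.unsqueeze(0)                                  # (1, num_nodes, d_node)
        attended, _ = self.attn(q, t, t)                           # nodes attend to question tokens
        return (attended + q).squeeze(0)                           # residual; updated node embeddings

pool = CrossModalityPooling(d_node=32, d_text=64)
h_nodes = pool(torch.randn(5, 32), torch.randn(7, 64))
print(h_nodes.shape)  # torch.Size([5, 32])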
Methodology: Graph Neural Prompting
Generate the graph-level embedding by average-pooling the node embeddings H3.
Domain Projector: a mapping between the graph-level embedding and the
text domain to facilitate comprehension by the LLM.
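A small sketch of these two steps, assuming an MLP projector and illustrative layer sizes (d_node and d_llm are placeholders):

import torch
import torch.nn as nn

class DomainProjector(nn.Module):
    """Hedged sketch: average-pool node embeddings into a graph-level embedding,
    then project it into the LLM's text-embedding space."""
    def __init__(self, d_node, d_llm):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(d_node, d_llm), nn.GELU(), nn.Linear(d_llm, d_llm))

    def forward(self, node_emb):
        graph_emb = node_emb.mean(dim=0)   # average pooling over nodes -> graph-level embedding
        return self.mlp(graph_emb)         # Graph Neural Prompt token in the LLM embedding space

proj = DomainProjector(d_node=32, d_llm=768)
prompt_token = proj(torch.randn(5, 32))
print(prompt_token.shape)  # torch.Size([768]); prepended to the input text embeddings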
Methodology: Graph Neural Prompting
Self-supervised Link Prediction:
This encourages the model to learn to use the partial graph content and
structure to reason about missing links.
DistMult (Yang et al., 2015) is used to map the entity embeddings and relations in
the KG to vectors h, r, t.
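DistMult scores a triple (h, r, t) by the trilinear product sum(h * r * t). Below is a hedged sketch of how the self-supervised objective can be formed by scoring held-out (masked) triples against corrupted ones; the negative-sampling and loss choices are assumptions, not necessarily the paper's exact setup.

import torch
import torch.nn.functional as F

def distmult_score(h, r, t):
    # DistMult: score(h, r, t) = sum over dimensions of h * r * t
    return (h * r * t).sum(dim=-1)

d = 32
h, r, t = torch.randn(4, d), torch.randn(4, d), torch.randn(4, d)  # held-out positive triples
t_neg = torch.randn(4, d)                                          # corrupted tails (negatives)

pos = distmult_score(h, r, t)
neg = distmult_score(h, r, t_neg)
loss = F.binary_cross_entropy_with_logits(
    torch.cat([pos, neg]),
    torch.cat([torch.ones_like(pos), torch.zeros_like(neg)]),
)
print(loss.item())  # added to the main LLM loss as a self-supervised auxiliary term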
Overall experimental results
Domains:
The general domain (commonsense reasoning)
The biomedical domain (biomedical reasoning)
Two Settings:
LLM Frozen vs. LLM Tuned
Ablation Study
GNP contains various model components:
Cross-modality pooling (CMP)
Self-supervised link prediction (SLP)
Domain projector (DP)
Conclusion
Addresses the limitations of LLMs in precisely capturing and returning grounded
knowledge.
Proposes Graph Neural Prompting (GNP), a novel plug-and-play method to assist pre-
trained LLMs in learning beneficial knowledge from KGs.