Graph Neural Prompting with Large Language Models
Van Thuy Hoang
Network Science Lab
Dept. of Artificial Intelligence
The Catholic University of Korea
E-mail: hoangvanthuy90@gmail.com
2023-10-09
Yijun Tian, et al.
The traditional learning paradigm: Pre-training & Fine-tuning
Fine-tune the parameters of the pre-trained model for a specific downstream task using a large
corpus (hundreds of thousands of examples) of labeled data.
Keep training the model via repeated gradient updates.
Characteristics:
Strong performance on many benchmarks.
Need a new large dataset for each task.
Potential for poor out-of-distribution generalization.
Potential to exploit spurious features of the data.
In-context learning
No training or optimization of the model parameters in the “adaptation step”.
Simply give the model a task description together with zero, one, or a few examples as the input at inference time: 0-SHOT (task description only), 1-SHOT, or FEW-SHOT.
No gradient updates are performed.
The model needs to figure out:
The input distribution (e.g., financial or general news)
The output distribution (e.g., Positive/Negative or topic labels)
The input-output mapping (e.g., sentiment or topic classification)
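As an illustration (not from the paper), a minimal sketch of how such a prompt can be assembled at inference time; the build_prompt helper, the sentiment task, and the example reviews are all hypothetical:

# Minimal sketch of few-shot prompt construction for in-context learning.
# The demonstrations and labels below are illustrative, not from the paper.

def build_prompt(task_description, demonstrations, query):
    """Assemble a prompt: task description, k labeled examples, then the query."""
    lines = [task_description]
    for text, label in demonstrations:            # k = 0 gives zero-shot, k = 1 one-shot, etc.
        lines.append(f"Review: {text}\nSentiment: {label}")
    lines.append(f"Review: {query}\nSentiment:")  # the model completes the label
    return "\n\n".join(lines)

prompt = build_prompt(
    "Classify the sentiment of each review as Positive or Negative.",
    [("The movie was a delight.", "Positive"),
     ("Two hours I will never get back.", "Negative")],
    "Surprisingly moving and well acted.",
)
print(prompt)  # sent to the frozen LLM at inference time; no parameters are updated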
In-context learning: example
The language model (LM) uses the in-context learning prompt to “locate” a previously
learned concept and perform the in-context learning task.
If the LM can infer the prompt's underlying concept from the demonstrations in the
prompt, then in-context learning succeeds.
Knowledge graphs
Knowledge graphs (KGs), which store enormous numbers of facts, serve as a structured and
systematic way of representing knowledge.
Consequently, existing methods have incorporated KGs to assist language
modeling for the task of question answering, often by designing customized
model architectures to accommodate both KGs and textual data.
The question
Can we learn beneficial knowledge from KGs and integrate it into pre-trained LLMs?
This is the first attempt to study the learning of beneficial knowledge from KGs
for pre-trained LLMs.
Problems
Given a multiple-choice question, first retrieve subgraphs from the knowledge graph based on the entities in the question and answer options.
Then develop Graph Neural Prompting (GNP) to encode the pertinent factual knowledge and structural information and obtain the Graph Neural Prompt.
GNP contains various designs, including a GNN encoder, a cross-modality pooling module, a domain projector, and a self-supervised link prediction objective.
The obtained Graph Neural Prompt is then sent into the LLM for inference along with the input text embedding.
The standard maximum likelihood objective is used for downstream task adaptation, while the LLM is kept frozen or tuned depending on the experimental setting (see the pipeline sketch below).
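A rough, end-to-end sketch of this pipeline under toy stand-ins; every function and dimension below (retrieve_subgraph, GraphNeuralPrompting, embed_text, score_with_llm, D) is a placeholder, not the authors' implementation:

import torch
import torch.nn as nn

D = 16  # shared embedding width (illustrative)

def retrieve_subgraph(question, option):
    # Placeholder: a real system links entities in question+option and pulls their 2-hop neighborhood.
    return torch.randn(5, D)  # 5 fake node embeddings

class GraphNeuralPrompting(nn.Module):
    # Placeholder GNP: collapses node embeddings into one soft-prompt token
    # (the real GNP uses a GNN encoder, cross-modality pooling, and a domain projector).
    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(D, D)
    def forward(self, node_emb, text_emb):  # text_emb is ignored in this toy version
        return self.proj(node_emb.mean(dim=0, keepdim=True)).unsqueeze(0)  # (1, 1, D)

def embed_text(text):
    # Placeholder: a real system uses the LLM's own token-embedding table.
    return torch.randn(1, len(text.split()), D)

def score_with_llm(inputs_embeds):
    # Placeholder frozen-LLM scorer: a real system returns the option's log-likelihood.
    return inputs_embeds.mean().item()

def answer(question, options, gnp):
    scores = []
    for option in options:
        subgraph = retrieve_subgraph(question, option)   # 1. subgraph retrieval from the KG
        text_emb = embed_text(question + " " + option)   # 2. input text embedding
        prompt = gnp(subgraph, text_emb)                 # 3. Graph Neural Prompt (soft prompt)
        full = torch.cat([prompt, text_emb], dim=1)      # 4. prepend prompt to text embeddings
        scores.append(score_with_llm(full))              # 5. LLM scores the option
    return options[scores.index(max(scores))]

print(answer("Where do fish live?", ["water", "desert"], GraphNeuralPrompting()))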
Methodology
Prompting LLMs for Question Answering
The LLM can be trained for downstream task adaptation using a standard
maximum likelihood objective with teacher forcing and a cross-entropy loss:
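The equation on this slide did not survive the export; assuming it is the standard teacher-forced objective, the loss takes the form

\mathcal{L}_{\text{llm}} = -\sum_{t=1}^{|Y|} \log p_{\theta}\left(y_t \mid y_{<t}, X\right),

where X denotes the tokenized input (context, question, and answer options), Y = (y_1, ..., y_{|Y|}) is the target answer sequence, and θ are the trainable parameters.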
Methodology
Subgraph Retrieval:
For each answer option a_k and its corresponding context C and question Q,
first obtain a set of matched entities E via entity linking, matching the tokens in
X to the entities in G.
Then retrieve a subgraph G′ based on the entities in E by including their two-hop
neighbors and the relations that connect them (a retrieval sketch follows below).
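A minimal sketch of this retrieval step, assuming a toy KG stored as a networkx MultiDiGraph and naive string matching in place of a real entity linker:

import networkx as nx

# Toy KG; each edge carries a relation attribute.
kg = nx.MultiDiGraph()
kg.add_edge("fish", "water", relation="lives_in")
kg.add_edge("water", "ocean", relation="part_of")
kg.add_edge("ocean", "earth", relation="located_on")

def retrieve_subgraph(kg, question, option, hops=2):
    text = f"{question} {option}".lower()
    matched = {n for n in kg.nodes if n.lower() in text}   # entity linking -> matched entities E
    keep = set(matched)
    for node in matched:                                   # expand to two-hop neighbors
        lengths = nx.single_source_shortest_path_length(kg.to_undirected(), node, cutoff=hops)
        keep.update(lengths)
    return kg.subgraph(keep).copy()                        # G' with the connecting relations

sub = retrieve_subgraph(kg, "Where do fish live?", "water")
print(sub.nodes(), list(sub.edges(data="relation")))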
Methodology: Graph Neural Prompting
GNN Encoder: encodes the retrieved subgraph into node embeddings.
Cross-modality Pooling: identifies the nodes most pertinent to the question and
consolidates the node embeddings. A transformation is first applied to the text
embeddings T to obtain the transformed text embeddings (a pooling sketch follows below).
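One plausible reading of this module, sketched below: the text embeddings T are projected into the node space and the nodes attend to the question tokens, so that question-relevant nodes are emphasized. The attention design and dimensions here are assumptions, not the paper's exact architecture.

import torch
import torch.nn as nn

class CrossModalityPooling(nn.Module):
    """Hedged sketch: nodes attend to the transformed text tokens, emphasizing
    question-relevant nodes; returns updated (consolidated) node embeddings."""
    def __init__(self, d_node, d_text):
        super().__init__()
        self.text_proj = nn.Linear(d_text, d_node)                 # transform T into the node space
        self.attn = nn.MultiheadAttention(d_node, num_heads=4, batch_first=True)

    def forward(self, node_emb, text_emb):
        # node_emb: (num_nodes, d_node) from the GNN encoder
        # text_emb: (num_tokens, d_text) from the LLM's embedding layer
        t = self.text_proj(text_emb).unsqueeze(0)                  # (1, num_tokens, d_node)
        q = node_emb.unsqueeze(0)                                  # (1, num_nodes, d_node)
        attended, _ = self.attn(q, t, t)                           # nodes attend to question tokens
        return (attended + q).squeeze(0)                           # residual; updated node embeddings

pool = CrossModalityPooling(d_node=32, d_text=64)
h_nodes = pool(torch.randn(5, 32), torch.randn(7, 64))
print(h_nodes.shape)  # torch.Size([5, 32])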
Methodology: Graph Neural Prompting
Generate the graph-level embedding by average-pooling the node embeddings H3.
Domain Projector: a mapping between the graph-level embedding and the
text domain to facilitate comprehension by the LLM.
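A small sketch of these two steps, assuming an MLP projector and illustrative layer sizes (d_node and d_llm are placeholders):

import torch
import torch.nn as nn

class DomainProjector(nn.Module):
    """Hedged sketch: average-pool node embeddings into a graph-level embedding,
    then project it into the LLM's text-embedding space."""
    def __init__(self, d_node, d_llm):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(d_node, d_llm), nn.GELU(), nn.Linear(d_llm, d_llm))

    def forward(self, node_emb):
        graph_emb = node_emb.mean(dim=0)   # average pooling over nodes -> graph-level embedding
        return self.mlp(graph_emb)         # Graph Neural Prompt token in the LLM embedding space

proj = DomainProjector(d_node=32, d_llm=768)
prompt_token = proj(torch.randn(5, 32))
print(prompt_token.shape)  # torch.Size([768]); prepended to the input text embeddings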
Methodology: Graph Neural Prompting
Self-supervised Link Prediction:
This encourages the model to learn to use the partial graph content and
structure to reason about missing links.
DistMult (Yang et al., 2015) is used to map the entity embeddings and relations in
the KG to vectors h, r, t.
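DistMult scores a triple (h, r, t) by the trilinear product sum(h * r * t). Below is a hedged sketch of how the self-supervised objective can be formed by scoring held-out (masked) triples against corrupted ones; the negative-sampling and loss choices are assumptions, not necessarily the paper's exact setup.

import torch
import torch.nn.functional as F

def distmult_score(h, r, t):
    # DistMult: score(h, r, t) = sum over dimensions of h * r * t
    return (h * r * t).sum(dim=-1)

d = 32
h, r, t = torch.randn(4, d), torch.randn(4, d), torch.randn(4, d)  # held-out positive triples
t_neg = torch.randn(4, d)                                          # corrupted tails (negatives)

pos = distmult_score(h, r, t)
neg = distmult_score(h, r, t_neg)
loss = F.binary_cross_entropy_with_logits(
    torch.cat([pos, neg]),
    torch.cat([torch.ones_like(pos), torch.zeros_like(neg)]),
)
print(loss.item())  # added to the main LLM loss as a self-supervised auxiliary term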
Overall experimental results
Domains:
The general domain (commonsense reasoning)
The biomedical domain (biomedical reasoning)
Two Settings:
LLM Frozen vs. LLM Tuned
Ablation Study
GNP contains various model components:
Cross-modality pooling (CMP)
Self-supervised link prediction (SLP)
Domain projector (DP)
Conclusion
Addresses the limitations of LLMs in precisely capturing and returning grounded
knowledge.
Proposes Graph Neural Prompting (GNP), a novel plug-and-play method to assist pre-
trained LLMs in learning beneficial knowledge from KGs.