The document discusses graph-to-text generation and its applications to dialogue systems. It surveys current approaches to graph-to-text generation, including rule-based, statistical, sequence-to-sequence, and graph-to-sequence models; recent advances use pretrained language models and graph neural networks. While current systems show promise, they still struggle with omissions, repetitions, and unnatural language. The document proposes two threads of future work: exploring graph-to-text on dialogue data and implementing model improvements.
4. Motivation
● Downstream application: graph-oriented dialogue system
● Dialogue response generation
○ Response selection/expansion
■ What should be said next?
○ Language Generation
■ How will it be said?
6. Language Generation Task
● Graph-to-Text paradigm
● Given: Graph containing all information that was selected to be expressed
● Goal: Produce natural language capturing all AND only that information in the graph
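To make the task concrete, a minimal sketch with a hypothetical WebNLG-style triple graph and a toy faithfulness check (the example data and the `facts_covered` helper are illustrative, not from any dataset):

```python
# Toy illustration of the Graph-to-Text task: the input graph fixes
# *what* to say; the model decides only *how* to say it.
# (Hypothetical example data, as WebNLG-style subject-relation-object triples.)

graph = [
    ("Alan_Turing", "birthPlace", "London"),
    ("Alan_Turing", "field", "computer_science"),
]

# A faithful output expresses all and only the graph's facts:
reference = "Alan Turing, born in London, worked in computer science."

def facts_covered(triples, text):
    """Crude faithfulness check: does every object surface in the text?"""
    return all(obj.replace("_", " ") in text for _, _, obj in triples)

print(facts_covered(graph, reference))  # True
```

A real evaluation would need semantic matching rather than substring checks, but the sketch shows the "all AND only" contract.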
7. Graph-to-Text Datasets
● Most approaches are structure-agnostic
● Focus on AMR
○ most semantically and topically diverse and complex
○ most similar to (our) desired structure for dialogue
● Datasets:
○ Abstract Meaning Representation (AMR): rooted graphs for semantic meaning
○ WebNLG: KGs consisting of triples from DBpedia
○ AGENDA: KGs representing scientific abstracts of AI papers
○ E2E: restaurant information meaning representations (intents and slots)
○ ViGGO: video game information meaning representations (intents and slots)
○ Syntax NMT: syntax trees for machine translation
17. Graph-to-Text: Pretrained Language Models
● Main idea: Finetune pretrained LM on Graph-to-Text data
○ Decoder LM (GPT-2)
○ Encoder-Decoder LM (BART, T5)
● * these approaches use linearized sequence representations of graphs (no graph encoders)!
[Table adapted from Ribeiro et al (2020): AMR17 results for HetGT, GPT-2 Rec, and GPT-2 SFC]
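A minimal sketch of what "linearized sequence representation" means in practice, assuming WebNLG-style triples; the `<H>/<R>/<T>` markers follow one common convention and are illustrative:

```python
# Sketch of the "linearized graph" input these PLM approaches consume:
# the graph is flattened into a token sequence with special markers, so a
# standard sequence model (GPT-2, BART, T5) can read it with no graph encoder.

def linearize(triples):
    """Flatten (head, relation, tail) triples into one input string."""
    parts = []
    for head, rel, tail in triples:
        parts += ["<H>", head, "<R>", rel, "<T>", tail]
    return " ".join(parts)

triples = [("Alan_Turing", "birthPlace", "London")]
print(linearize(triples))
# <H> Alan_Turing <R> birthPlace <T> London
```

The finetuned model then maps this string to the target text like any other sequence-to-sequence pair.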
20. Graph-to-Text: Pretrained Language Models
● Main idea: Finetune pretrained LM on Graph-to-Text data
○ Decoder LM (GPT-2)
○ Encoder-Decoder LM (BART, T5)
● Rerank text candidates
○ Reconstruction score
[Diagram: model output text → AMR parser → reconstruction score (e.g., 0.75)]
○ Semantic Fidelity Classifier
[Diagram: model output text → SFC → fidelity label (accurate, omit, repeat, …)]
[Table adapted from Ribeiro et al (2020): AMR17 results for HetGT, GPT-2 Rec, and GPT-2 SFC]
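A sketch of reconstruction-score reranking: each candidate is parsed back into a graph, and the candidate whose parse best matches the input wins. A real system would use an AMR parser and a graph metric such as Smatch; here a stub parser and triple overlap stand in for both:

```python
# Reconstruction-score reranking, in miniature.

def triple_overlap(gold, parsed):
    """F1-style overlap between input triples and re-parsed triples."""
    gold, parsed = set(gold), set(parsed)
    if not gold or not parsed:
        return 0.0
    hits = len(gold & parsed)
    prec, rec = hits / len(parsed), hits / len(gold)
    return 2 * prec * rec / (prec + rec) if prec + rec else 0.0

def rerank(input_graph, candidates, parse):
    """Return the candidate text that best reconstructs the input graph."""
    return max(candidates, key=lambda text: triple_overlap(input_graph, parse(text)))

# Stub parser standing in for a real AMR parser:
graph = [("i", "like", "cookies")]
parses = {
    "I like cookies.": [("i", "like", "cookies")],
    "I like cake.": [("i", "like", "cake")],
}
print(rerank(graph, list(parses), parses.get))  # I like cookies.
```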
21. How good are current systems really?
● Challenging to interpret automated metrics in a grounded way
● Automated metrics (like BLEU) seem to agree with human judgements
● Missing information seems to be the most prevalent error
● Low fluency occurs with high rates of anonymization and repetition in output
Table adapted from Manning et al (2020)
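As a toy illustration of what such metrics measure, a clipped unigram precision in the spirit of BLEU (real evaluations use full BLEU with higher-order n-grams and a brevity penalty, e.g., via sacrebleu):

```python
# Clipped unigram precision: the simplest ingredient of BLEU.
from collections import Counter

def unigram_precision(candidate, reference):
    """Clipped unigram matches divided by candidate length."""
    cand = candidate.split()
    ref_counts = Counter(reference.split())
    matched = sum(min(n, ref_counts[tok]) for tok, n in Counter(cand).items())
    return matched / len(cand)

print(unigram_precision("i like the cookies", "i like cookies"))  # 0.75
```

Note the slide's caveat: scores like this are hard to interpret in a grounded way, even when they correlate with human judgements.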
22. Observed Shortcomings
● Omission and hallucination
● Unnatural-sounding
○ Repetition
○ Ungrammatical
● “Open-class” tokens
● Edge label encodings
○ not “real” words
26. Model Ideas
● Leverage both graph encoders and pretrained language model
○ Multi-objective loss
[Diagram: graph → GE → GD → text, trained with a multi-objective loss that includes a perplexity term]
○ Hierarchical
[Diagram: graph → GE → GD → text, then text → PLME → PLMD → text]
○ Ensemble
[Diagram: graph → GE → GD and linearized graph → PLME → PLMD, combined into text]
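The multi-objective variant can be sketched at the loss level. The weighting scheme, the `alpha` value, and the toy loss numbers below are illustrative assumptions, not a specification of the proposed model:

```python
# Sketch: combine a graph-decoder NLL with a perplexity-style LM term,
# pushing the model toward both graph faithfulness and fluent text.
import math

def multi_objective_loss(graph_nll, lm_token_nlls, alpha=0.7):
    """Weighted sum of graph-decoder NLL and mean per-token LM NLL."""
    lm_nll = sum(lm_token_nlls) / len(lm_token_nlls)  # mean per-token NLL
    perplexity = math.exp(lm_nll)                     # reported, not optimized
    loss = alpha * graph_nll + (1 - alpha) * lm_nll
    return loss, perplexity

loss, ppl = multi_objective_loss(2.0, [1.0, 1.0])
print(loss)  # 1.7
```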
29. Graph-to-Text on Dialogue Language
● More first and second person expressions, rather than third
● More collaborative language
● Dialogue utterance types
○ predicate statement (i like cookies)
○ predicate question (what did you buy?)
○ clarification question (oh a brother?)
○ generic agreement prefix (yeah, it was fun)
○ emotional reaction prefix (that’s great, i hope you enjoyed it a lot)
○ yes-answer
○ no-answer
○ isolated concept answer (a dog)
○ single word question (why?)
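A rough heuristic sketch of tagging a few of these utterance types, to show how dialogue output differs from the declarative text in standard Graph-to-Text corpora. The regex rules are hypothetical toy rules, not a trained classifier, and cover only a subset of the types above:

```python
# Toy utterance-type tagger: first matching rule wins.
import re

RULES = [
    ("yes-answer", re.compile(r"^(yes|yeah|yep)\b")),
    ("no-answer", re.compile(r"^(no|nope)\b")),
    ("single word question", re.compile(r"^\w+\?$")),
    ("predicate question", re.compile(r".+\?$")),
    ("predicate statement", re.compile(r".+")),
]

def tag_utterance(utterance):
    text = utterance.strip().lower()
    for label, pattern in RULES:
        if pattern.match(text):
            return label
    return "unknown"

print(tag_utterance("why?"))              # single word question
print(tag_utterance("what did you buy?")) # predicate question
print(tag_utterance("i like cookies"))    # predicate statement
```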
31. Graph-to-Text on Dialogue Language
1. Silver AMR annotations of dialogue (DAMR)
2. Graph-to-Text models on DAMR
Setup limitations:
● current AMR parsers are not trained on dialogue data, so silver annotations are potentially noisier and more incomplete than standard ones
● still should reveal information about what the models are missing
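The two-step setup can be sketched as a small pipeline. `parse_amr` is a stub standing in for a real off-the-shelf AMR parser, and its output is a placeholder string rather than genuine AMR:

```python
# Step 1: silver-annotate dialogue utterances with an AMR parser (stubbed).
# Step 2: pair each (graph, utterance) as a Graph-to-Text training example.

def parse_amr(utterance):
    """Stub: a real system would return the parser's AMR graph here."""
    return f'(u / utterance :snt "{utterance}")'

def build_silver_damr(dialogue_turns):
    """Build (silver graph, text) training pairs from dialogue turns."""
    return [(parse_amr(turn), turn) for turn in dialogue_turns]

data = build_silver_damr(["i like cookies", "what did you buy?"])
for graph, text in data:
    print(graph, "->", text)
```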
34. Two Main Threads
● Explore Graph-to-Text on Dialogue
1. Collect dialogue graph structures for evaluation
2. Collect current approaches for Graph-to-Text
3. Apply current approaches to dialogue (pretrained, finetuned, scratch)
4. Identify main limitations
5. Devise strategies to overcome them
● Implement Graph-to-Text improvements over previous work
1. Start with Language Modeling ideas
2. Incorporate strategies from step 5 above
3. Apply to standard datasets and dialogue
35. References
Daniel Beck, Gholamreza Haffari, and Trevor Cohn. 2018. Graph-to-sequence learning using gated graph neural networks. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 273–283, Melbourne, Australia. Association for Computational Linguistics.
Linfeng Song, Xiaochang Peng, Yue Zhang, Zhiguo Wang, and Daniel Gildea. 2017. AMR-to-text generation with synchronous node replacement grammar. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 7–13, Vancouver, Canada. Association for Computational Linguistics.
Marco Damonte and Shay B. Cohen. 2019. Structural neural encoders for AMR-to-text generation. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 3649–3658, Minneapolis, Minnesota. Association for Computational Linguistics.
Zhijiang Guo, Yan Zhang, Zhiyang Teng, and Wei Lu. 2019. Densely connected graph convolutional networks for graph-to-sequence learning. Transactions of the Association for Computational Linguistics, 7:297–312.
Leonardo F. R. Ribeiro, Claire Gardent, and Iryna Gurevych. 2019. Enhancing AMR-to-text generation with dual graph representations. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 3181–3192, Hong Kong, China. Association for Computational Linguistics.
Jie Zhu, Junhui Li, Muhua Zhu, Longhua Qian, Min Zhang, and Guodong Zhou. 2019. Modeling graph structure in transformer for better AMR-to-text generation. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 5458–5467, Hong Kong, China. Association for Computational Linguistics.
Deng Cai and Wai Lam. 2020. Graph transformer for graph-to-sequence learning. In Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI).
S. Yao, T. Wang, and X. Wan. 2020. Heterogeneous graph transformer for graph-to-sequence learning. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 7145–7154.
36. References
Manuel Mager, Ramón Fernandez Astudillo, Tahira Naseem, Md Arafat Sultan, Young-Suk Lee, Radu Florian, and Salim Roukos. 2020. GPT-too: A language-model-first approach for AMR-to-text generation. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 1846–1852, Online. Association for Computational Linguistics.
Hamza Harkous, Isabel Groves, and Amir Saffari. 2020. Have your text and use it too! End-to-end neural data-to-text generation with semantic fidelity. arXiv e-prints.
L. F. R. Ribeiro, M. Schmitt, H. Schütze, and I. Gurevych. 2020. Investigating pretrained language models for graph-to-text generation. arXiv preprint arXiv:2007.08426.
E. Manning, S. Wein, and N. Schneider. 2020. A human evaluation of AMR-to-English generation systems. arXiv preprint arXiv:2004.06814.