Get To The Point: Summarization with Pointer-Generator Networks
Abigail See*, Peter J. Liu+, Christopher Manning*
*Stanford NLP   +Google Brain
1st August 2017
Two approaches to summarization

Extractive Summarization: select parts (typically sentences) of the original text to form a summary.
● Easier
● Too restrictive (no paraphrasing)
● Most past work is extractive

Abstractive Summarization: generate novel sentences using natural language generation techniques.
● More difficult
● More flexible and human
● Necessary for future progress
CNN / Daily Mail dataset
● Long news articles (average ~800 words)
● Multi-sentence summaries (usually 3 or 4 sentences, average 56 words)
● Summary contains information from throughout the article
Sequence-to-sequence + attention model

[Diagram: encoder hidden states are computed over the source text "Germany emerge victorious in 2-0 win against Argentina on Saturday ...". Given the partial summary "<START> Germany", the decoder's attention distribution over source words is used as weights in a weighted sum of encoder states to form a context vector; combined with the decoder hidden state, this yields a vocabulary distribution from which the next word, "beat", is produced.]
[Diagram, subsequent decoding step: the generated word "beat" is fed back to the decoder as part of the partial summary, and attention over the source text is recomputed.]
[Diagram: decoding continues until the full summary "Germany beat Argentina 2-0 <STOP>" has been produced.]
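To make the diagrams concrete, here is a minimal NumPy sketch of one decoder step of a seq2seq + attention model. The shapes are simplified and the weight names (W_h, W_s, v, W_out, b_out) are illustrative; the actual model in the paper uses a bidirectional LSTM encoder and learned parameters.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_decoder_step(enc_states, dec_state, W_h, W_s, v, W_out, b_out):
    """One decoder step of a seq2seq + attention model (simplified sketch).

    enc_states: (src_len, hidden)  encoder hidden states, one per source word
    dec_state:  (hidden,)          current decoder hidden state
    Returns the attention distribution over source words, the context vector,
    and the vocabulary distribution for the next summary word.
    """
    # Attention distribution: how strongly to look at each source word.
    scores = np.tanh(enc_states @ W_h + dec_state @ W_s) @ v      # (src_len,)
    attn_dist = softmax(scores)

    # Context vector: attention-weighted sum of the encoder hidden states.
    context = attn_dist @ enc_states                               # (hidden,)

    # Vocabulary distribution: project [decoder state; context] onto the vocabulary.
    vocab_dist = softmax(np.concatenate([dec_state, context]) @ W_out + b_out)
    return attn_dist, context, vocab_dist

# Toy usage with random weights (hidden size 8, vocabulary size 50, 10 source words).
rng = np.random.default_rng(0)
H, V, L = 8, 50, 10
attn, ctx, vocab = attention_decoder_step(
    rng.normal(size=(L, H)), rng.normal(size=H),
    rng.normal(size=(H, H)), rng.normal(size=(H, H)), rng.normal(size=H),
    rng.normal(size=(2 * H, V)), np.zeros(V))
```

The attention distribution and context vector correspond to the "Attention Distribution" and "Context Vector" in the diagrams; feeding the chosen word back in and repeating this step produces the summary word by word.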
Two Problems

Problem 1: The summaries sometimes reproduce factual details inaccurately (an incorrect rare or out-of-vocabulary word).
e.g. Germany beat Argentina 3-2
Solution: use a pointer to copy words.

Problem 2: The summaries sometimes repeat themselves.
e.g. Germany beat Germany beat Germany beat…
Get to the point!

[Diagram: over the source text "Germany emerge victorious in 2-0 win against Argentina on Saturday ...", the summary "Germany beat Argentina 2-0" is produced by pointing to copy "Germany", "Argentina" and "2-0" from the source, and generating "beat" from the vocabulary.]

Best of both worlds: extraction + abstraction

[1] Incorporating copying mechanism in sequence-to-sequence learning. Gu et al., 2016.
[2] Language as a latent variable: Discrete generative models for sentence compression. Miao and Blunsom, 2016.
Pointer-generator network

[Diagram: as before, encoder hidden states, an attention distribution and a context vector are computed over the source text, and the decoder produces a vocabulary distribution. The attention distribution and the vocabulary distribution are then mixed into a final distribution over an extended vocabulary, so the model can copy source words such as "Argentina" and "2-0" as well as generate words like "beat" (sketched below).]
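A hedged sketch of how the final distribution could be computed, reusing the outputs of the attention step above. Mixing generation and copying as p_gen * vocab_dist + (1 - p_gen) * attn_dist follows the slide's description; variable names such as w_c, w_s, w_x and b_ptr are illustrative, not taken from the released code.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def pointer_generator_dist(vocab_dist, attn_dist, context, dec_state, dec_input,
                           w_c, w_s, w_x, b_ptr, src_ids, extended_vocab_size):
    """Mix generating from the vocabulary with copying (pointing) from the source.

    vocab_dist: (vocab,)    distribution over the fixed vocabulary
    attn_dist:  (src_len,)  attention distribution over source positions
    src_ids:    (src_len,)  id of each source word in the extended vocabulary
                            (fixed vocabulary plus this article's OOV words)
    Returns a distribution over the extended vocabulary.
    """
    # Generation probability: generate from the vocabulary vs. copy from the source.
    p_gen = sigmoid(w_c @ context + w_s @ dec_state + w_x @ dec_input + b_ptr)

    final_dist = np.zeros(extended_vocab_size)
    final_dist[:len(vocab_dist)] += p_gen * vocab_dist            # generate
    np.add.at(final_dist, src_ids, (1.0 - p_gen) * attn_dist)     # point (copy)
    return final_dist
```

Because source words keep their probability mass even when they are out of vocabulary, the model can output names like "gaioz nigalidze" instead of UNK.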
Improvements

Before: UNK UNK was expelled from the dubai open chess tournament
After:  gaioz nigalidze was expelled from the dubai open chess tournament

Before: the 2015 rio olympic games
After:  the 2016 rio olympic games
Two Problems (recap)

Problem 1: The summaries sometimes reproduce factual details inaccurately.
e.g. Germany beat Argentina 3-2
Solution: use a pointer to copy words.

Problem 2: The summaries sometimes repeat themselves.
e.g. Germany beat Germany beat Germany beat…
Solution: penalize repeatedly attending to the same parts of the source text.
Reducing repetition with coverage

Coverage = cumulative attention = what has been covered so far

1. Use coverage as an extra input to the attention mechanism.
2. Penalize attending to things that have already been covered ("don't attend here"); see the sketch below.

Result: repetition rate reduced to a level similar to human summaries.

[4] Modeling coverage for neural machine translation. Tu et al., 2016.
[5] Coverage embedding models for neural machine translation. Mi et al., 2016.
[6] Distraction-based neural networks for modeling documents. Chen et al., 2016.
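A minimal sketch of the coverage idea under the same simplified setup: coverage is the running sum of past attention distributions, it is fed into the attention scores, and a coverage loss penalizes attending again to already-covered source words. The extra weight w_cov is an illustrative name.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_with_coverage(enc_states, dec_state, coverage, W_h, W_s, w_cov, v):
    """Attention that also sees how much each source word has already been attended to."""
    # 1. Coverage is an extra input to the attention scores.
    scores = np.tanh(enc_states @ W_h + dec_state @ W_s
                     + np.outer(coverage, w_cov)) @ v              # (src_len,)
    attn_dist = softmax(scores)

    # 2. Coverage loss: overlap between this step's attention and past attention,
    #    penalizing re-attending to source words that are already covered.
    cov_loss = np.minimum(attn_dist, coverage).sum()

    # Coverage accumulates the attention distributions over decoder steps.
    new_coverage = coverage + attn_dist
    return attn_dist, cov_loss, new_coverage
```

The coverage loss is added to the usual training loss, so repeatedly attending to (and therefore repeatedly copying) the same source passage becomes expensive.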
Summaries are still mostly extractive

[Diagram: the final coverage highlighted over the source text.]
Results

ROUGE-1 / ROUGE-2 / ROUGE-L:
Nallapati et al. 2016:                    35.5 / 13.3 / 32.7  (previous best abstractive result)
Ours (seq2seq baseline):                  31.3 / 11.8 / 28.8
Ours (pointer-generator):                 36.4 / 15.7 / 33.4
Ours (pointer-generator + coverage):      39.5 / 17.3 / 36.4
Paulus et al. 2017 (hybrid RL approach):  39.9 / 15.8 / 36.9  (worse ROUGE; better human eval)
Paulus et al. 2017 (RL-only approach):    41.2 / 15.8 / 39.1  (better ROUGE; worse human eval)

ROUGE compares the machine-generated summary to the human-written reference summary and counts co-occurrence of 1-grams (ROUGE-1), 2-grams (ROUGE-2), and the longest common subsequence (ROUGE-L). A toy computation is sketched below.
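For intuition only, here is a toy ROUGE-N recall over whitespace tokens; real evaluation uses the official ROUGE package (with stemming, F-scores and the LCS-based ROUGE-L), so this is just an illustration of n-gram counting. The example sentences are made up.

```python
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def rouge_n_recall(candidate, reference, n):
    """Fraction of the reference's n-grams that also appear in the candidate."""
    cand, ref = ngrams(candidate.split(), n), ngrams(reference.split(), n)
    if not ref:
        return 0.0
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / sum(ref.values())

# A summary that copies the reference's wording scores far higher than a
# paraphrase with the same meaning, which is why ROUGE rewards extraction.
ref = "germany beat argentina 2-0 in the world cup final"
print(rouge_n_recall("germany beat argentina 2-0", ref, 2))               # 0.375
print(rouge_n_recall("germany won the final against argentina", ref, 2))  # 0.0
```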
The difficulty of evaluating summarization

● Summarization is subjective
  ○ There are many correct ways to summarize
● ROUGE is based on strict comparison to a reference summary
  ○ Intolerant to rephrasing
  ○ Rewards extractive strategies
● Taking the first 3 sentences of the article as the summary gives higher ROUGE than (almost) any published system
  ○ Partially due to news article structure
First sentences not always a good summary

[Annotated example article about robots tested in Japanese companies. The opening sentences are marked "Irrelevant"; "Our system starts here" marks where the summarized content begins:]

"A crowd gathers near the entrance of Tokyo's upscale Mitsukoshi Department Store, which traces its roots to a kimono shop in the late 17th century. Fitting with the store's history, the new greeter wears a traditional Japanese kimono while delivering information to the growing crowd, whose expressions vary from amusement to bewilderment. It's hard to imagine the store's founders in the late 1600's could have imagined this kind of employee. That's because the greeter is not a human -- it's a robot. Aiko Chihira is an android manufactured by Toshiba, designed to look and move like a real person. ..."
What next?

[Diagram: a landscape metaphor. Extractive methods sit on safe ground ("SAFETY"); between them and human-level summarization rises "MOUNT ABSTRACTION", with the "SWAMP OF BASIC ERRORS" (repetition, copying errors, nonsense) at its foot. RNNs carry us out of the swamp towards paraphrasing and long-text understanding; getting further may require more high-level understanding, more scalability, and better metrics.]
Thank you!
Blog post: www.abigailsee.com
Code: github.com/abisee/pointer-generator