Text summarization using Abstract Meaning Representation (AMR)
Amit Nagarkoti
Supervisor: Dr. Harish Karnick
Department of Computer Science and Engineering
Indian Institute of Technology Kanpur
Table of contents
1. Introduction
2. Extractive Summarization Methods
3. Seq2Seq Learning
4. Results
Introduction
Definition
Text summarization is the process of reducing the original document to a much more concise form such that the most relevant facts in the original document are retained.
Summaries are always lossy.
Summarization methodologies
Extractive summarization: extract important words and sentences
Abstractive summarization (harder): rephrase and generate new words
Why Important?
user point of view: quickly evaluate the importance of an article.
linguistic/scientific/philosophical view: solving summarization is equivalent to solving the problem of language understanding.
AMR: "welcome to amr"
(w / welcome-01
:ARG2 (a / amr))
represents a sentence as a rooted DAG
captures "who is doing what to whom"
nodes: verb senses (see-01) and objects (boy, marble)
edges: relations (ARG0, ARG1)
Abstraction in AMR
Figure: AMR example
Both of the following sentences map to the same AMR:
The boy sees the white marble.
The boy saw the marble that was white.
PropBank Lexicon
white-03: refers to the color white
ARG1: thing that is white in color (marble)
ARG2: specific part of ARG1, if also mentioned
see-01: to see or view
ARG0: viewer (boy)
ARG1: thing viewed (marble)
ARG2: attribute of ARG1, further description (white; the example AMR above instead encodes white as its own concept)
How to get an AMR?
Use JAMR: 84% accuracy for concept node identification.
JAMR gives an AMR for each sentence, with a node alignment path after each "|":
(d / discover-01 | 0
:ARG0 (r / rope | 0.0)
:ARG1 (n / noose | 0.1)
:location (c / campus | 0.2))
the noose made of rope was discovered on campus
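Such an AMR string (with the alignment annotations stripped) can also be loaded and inspected programmatically. Below is a minimal sketch using the penman Python library, which is an assumption made for this example; the work itself consumed JAMR output directly.

    import penman

    # Same AMR as above, without the JAMR alignment annotations.
    amr = """
    (d / discover-01
       :ARG0 (r / rope)
       :ARG1 (n / noose)
       :location (c / campus))
    """

    g = penman.decode(amr)                 # parse the string into a graph
    print(g.top)                           # root variable: 'd'
    for var, _, concept in g.instances():  # nodes: variable -> concept
        print(var, concept)
    for src, role, tgt in g.edges():       # labelled edges between nodes
        print(src, role, tgt)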
word-node Alignment
node          alignment   word
discover-01   6-7—0       discovered
rope          4-5—0.0     rope
noose         1-2—0.1     noose
campus        8-9—0.2     campus
Table: word-node alignment
These alignments will be used to generate summaries from AMRs.
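A sketch of how these alignments turn a set of selected AMR nodes back into words; the node-to-span dictionary below is a simplified, assumed reading of the table (start-end token offsets into the source sentence).

    # Hypothetical reading of the alignment table: node -> (start, end) token span.
    alignments = {
        "discover-01": (6, 7),
        "rope": (4, 5),
        "noose": (1, 2),
        "campus": (8, 9),
    }
    tokens = "the noose made of rope was discovered on campus".split()

    selected = ["noose", "discover-01", "campus"]    # nodes kept in a summary graph
    spans = sorted(alignments[n] for n in selected)  # restore source word order
    summary = " ".join(w for s, e in spans for w in tokens[s:e])
    print(summary)                                   # -> noose discovered campus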
Extractive Summarization Methods
Using AMRs
Figure: Steps to generate Document graph from sentence AMR (modified from
[Liu et al., 2015])
Document Graph
Figure: Dense Document Graph
Finding the summary sub-graph
maximize Σ_{i=1}^{n} ψ_i θ^T f(v_i)    (1)
here ψ_i is a binary variable indicating whether node i is selected
θ = [θ_1, . . . , θ_m] are model parameters
f(v_i) = [f_1(v_i), . . . , f_m(v_i)] are node features
Text summarization using Abstract Meaning Representation
Extractive Summarization Methods
Constraints for a valid sub-graph
v_i − e_{ij} ≥ 0,  v_j − e_{ij} ≥ 0  (an edge can be selected only if both of its endpoint nodes are selected)
Σ_i f_{0i} − Σ_i v_i = 0  (flow sent from the root equals the number of selected nodes)
Σ_i f_{ij} − Σ_k f_{jk} − v_j = 0  (flow conservation: each selected node consumes one unit of flow)
N·e_{ij} − f_{ij} ≥ 0, ∀i, j  (sanity constraint: flow may only pass through selected edges)
Σ_{i,j} e_{ij} ≤ L  (size constraint)
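A hedged sketch of this ILP using the PuLP solver library (an assumption; the deck does not name a solver). The node scores stand in for the learned θ^T f(v_i) terms, and the flow-based connectivity constraints are omitted for brevity; they would add integer flow variables f_ij in the same style.

    import pulp

    nodes = ["discover", "noose", "rope", "campus"]
    edges = [("discover", "noose"), ("discover", "rope"), ("discover", "campus")]
    score = {"discover": 1.2, "noose": 0.9, "rope": 0.1, "campus": 0.4}  # theta^T f(v_i)
    L = 2                                    # at most L edges in the summary graph

    prob = pulp.LpProblem("summary_subgraph", pulp.LpMaximize)
    v = {n: pulp.LpVariable(f"v_{n}", cat="Binary") for n in nodes}
    e = {(a, b): pulp.LpVariable(f"e_{a}_{b}", cat="Binary") for a, b in edges}

    prob += pulp.lpSum(score[n] * v[n] for n in nodes)   # objective (1)
    for a, b in edges:                                   # edge only if both ends kept
        prob += v[a] - e[(a, b)] >= 0
        prob += v[b] - e[(a, b)] >= 0
    prob += pulp.lpSum(e.values()) <= L                  # size constraint

    prob.solve()
    print([n for n in nodes if v[n].value() == 1])       # selected summary nodes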
Complete Algorithm
for cur_doc in corpus:
    create the document graph
    add ILP constraints
    solve the objective to minimize the loss
    calculate gradients
    update model parameters
LSA for extractive summarization
Figure: term-sentence matrix
term vector t_i^T = [x_{i1} . . . x_{in}]
sentence vector s_j^T = [x_{1j} . . . x_{mj}]
the term-sentence matrix can be huge
no relation between terms
sparsity
Figure: SVD over the term-sentence matrix. Source: Wikipedia
Figure: low rank approximation
Finally a score is calculated for each sentence vector, given by
S_l = Σ_{i=1}^{n} v_{l,i}^2 · σ_i^2    (2)
where S_l is the score for sentence l
choose the L sentences with the highest scores
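A numpy sketch of equation (2) on a toy term-sentence matrix; the matrix values are made up for illustration.

    import numpy as np

    A = np.array([[1, 0, 1, 0],    # rows: terms
                  [1, 1, 0, 0],    # columns: sentences
                  [0, 1, 1, 1],
                  [0, 0, 1, 1]], dtype=float)

    U, sigma, Vt = np.linalg.svd(A, full_matrices=False)
    k = 2                                  # keep the top-k latent topics
    V = Vt[:k].T                           # sentence vectors, shape (n_sentences, k)
    scores = (V ** 2) @ (sigma[:k] ** 2)   # S_l = sum_i v_{l,i}^2 * sigma_i^2
    L = 2
    print(np.argsort(scores)[::-1][:L])    # indices of the L best sentences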
Seq2Seq Learning
s2s
Figure: Sequence to Sequence Model
Figure: RNN cell: s_t = f(U x_t + W s_{t−1}), o_t = softmax(V s_t). Source: Nature [LeCun et al., 2015]
Figure: Chat client using an s2s model based on LSTMs. Source: [Christopher, ]
S2S continued
During training the model tries to minimize the negative log likelihood of the target word:
loss_D = Σ_{t=1}^{T} −log P(w_t)    (3)
here w_t is the target word at step t.
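A tiny numeric sketch of equation (3); the per-step probabilities assigned to the gold target words are made up.

    import numpy as np

    p_target = np.array([0.60, 0.25, 0.90, 0.40])  # P(w_t) for steps t = 1..T
    loss = -np.log(p_target).sum()                 # loss_D = sum_t -log P(w_t)
    print(loss)                                    # ~ 2.92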
Problems in s2s
slow training with long sequences, so sequence lengths must be limited
limited context for the decoder, which is critical for s2s
large vocabularies
Attention to rescue
Attend to (re-visit) critical information when making decisions
Figure: Attention in RNNs. Source: distill.pub
Attention complete view
Figure: Attention at step t
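A sketch of one attention step in numpy; dot-product scoring is an assumption here, since the deck does not fix a particular scoring function.

    import numpy as np

    rng = np.random.default_rng(0)
    H = rng.standard_normal((6, 8))   # encoder states, one row per input token
    s_t = rng.standard_normal(8)      # decoder state at step t

    scores = H @ s_t                  # alignment score per input position
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()              # attention weights, sum to 1
    context = alpha @ H               # context vector fed to the decoder
    print(alpha.round(2), context.shape)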
Attention Heatmap
X-axis: the input sequence; Y-axis: the generated output
Figure: Attention Heatmap during Decoding
Byte Pair Encoding
Definition
BPE is a compression technique where the most frequent pair of consecutive bytes is replaced by a byte that does not occur in the document.
BPE has been adapted for NMT [Sennrich et al., 2015] using the idea of subword units.
“lower” will be represented as “l o w e r @”.
vocab size = number of merge operations + number of characters.
BPE merge operations learned from the dictionary {low, lowest, newer, wider}, using 4 merge operations:
r @  → r@
l o  → lo
lo w → low
e r@ → er@
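A sketch of learning BPE merges, adapted from the pseudo-code of [Sennrich et al., 2015]; the word frequencies here are invented, so the learned merge order may differ from the table above when pair counts tie.

    import re
    from collections import Counter

    vocab = {"l o w @": 5, "l o w e s t @": 2, "n e w e r @": 6, "w i d e r @": 3}

    def get_stats(vocab):
        pairs = Counter()
        for word, freq in vocab.items():
            symbols = word.split()
            for pair in zip(symbols, symbols[1:]):
                pairs[pair] += freq       # count symbol pairs weighted by word frequency
        return pairs

    def merge_vocab(pair, vocab):
        pattern = re.compile(r"(?<!\S)" + re.escape(" ".join(pair)) + r"(?!\S)")
        return {pattern.sub("".join(pair), w): f for w, f in vocab.items()}

    for _ in range(4):                    # 4 merge operations
        best = get_stats(vocab).most_common(1)[0][0]
        vocab = merge_vocab(best, vocab)
        print(best)                       # the pair merged at this iteration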
Pointer Generator Model for OOVs
Figure: Pointer Gen model [See et al., 2017]
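A numeric sketch of the final distribution in the pointer-generator model of [See et al., 2017], P(w) = p_gen · P_vocab(w) + (1 - p_gen) · Σ_i α_i [source_i = w]; all numbers below are made up.

    import numpy as np

    vocab = ["the", "noose", "was", "found", "<unk>"]
    P_vocab = np.array([0.35, 0.05, 0.25, 0.30, 0.05])  # decoder softmax
    src = ["the", "noose", "on", "campus"]              # "on", "campus" are OOV
    alpha = np.array([0.1, 0.6, 0.1, 0.2])              # attention over source
    p_gen = 0.3

    ext = vocab + [w for w in src if w not in vocab]    # extended vocabulary
    P = np.zeros(len(ext))
    P[:len(vocab)] = p_gen * P_vocab                    # generation probability
    for a, w in zip(alpha, src):                        # add copy probability
        P[ext.index(w)] += (1 - p_gen) * a
    print(ext[int(P.argmax())])                         # -> noose (copied)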
Coverage
“Coverage” is used in MT to control over- and under-production of target words.
Some words may never get enough attention, resulting in poor translations/summaries.
The solution is coverage-guided attention [Tu et al., 2016] and [See et al., 2017]:
accumulate all attention weights and penalize extra attention.
c^t = Σ_{t'=1}^{t−1} α^{t'}    coverage vector    (4)
covloss_t = Σ_{i=1}^{encsteps} min(α_i^t, c_i^t)    (5)
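A sketch of equations (4) and (5) over made-up attention weights; the loss grows whenever the model re-attends to positions it has already covered.

    import numpy as np

    attn = np.array([[0.7, 0.2, 0.1],   # alpha^1 over 3 encoder steps
                     [0.6, 0.3, 0.1],   # alpha^2
                     [0.5, 0.3, 0.2]])  # alpha^3

    cov_loss = 0.0
    coverage = np.zeros(attn.shape[1])                   # c^1 = 0
    for alpha_t in attn:
        cov_loss += np.minimum(alpha_t, coverage).sum()  # equation (5)
        coverage += alpha_t                              # equation (4)
    print(cov_loss)                                      # 1.9 for these weights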
AMR Linearization
Figure: A DFS traversal of the AMR gives the linearization -TOP-( eat-01 ARG1( bone )ARG1 ARG0( dog )ARG0 )-TOP-
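A sketch of the DFS linearization; the nested-tuple encoding of the AMR is an assumption made for the example.

    def linearize(node):
        concept, children = node
        out = [concept]
        for role, child in children:
            out += [f"{role}("] + linearize(child) + [f"){role}"]  # open/close role markers
        return out

    amr = ("eat-01", [("ARG1", ("bone", [])), ("ARG0", ("dog", []))])
    print(" ".join(["-TOP-("] + linearize(amr) + [")-TOP-"]))
    # -> -TOP-( eat-01 ARG1( bone )ARG1 ARG0( dog )ARG0 )-TOP-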
Data Augmentation/Extension with POS
A POS sequence p_1 · · · p_n is obtained using the Stanford Parser
Figure: s2s model generating multiple output distributions
Hierarchical Sequence Encoder for AMRs
the sentence vector S_i is obtained by attention over the word-encoder states
the document vector D_i is obtained by attention over the sentence-encoder states
Figure: hierarchical models learn the document vector using level-wise learning
Using Dependency Parsing
“A Dependency Parse of a sentence is a tree with labelled edges such that the main verb or the focused noun is the root and the edges are the relations between words in the sentence.”
To reduce the document size we used a context of size L around the root word, reducing the sentence length to at most 2L + 1.
The figure below has “capital” as the root; with L = 3 we are able to extract the crux of the sentence, i.e. “Delhi is the capital of India”.
Figure: Dependency Parse :: Delhi is the capital of India , with lots of people.
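A sketch of the ±L window around the dependency root, using spaCy as a stand-in parser (the talk used the Stanford Parser; note that spaCy's dependency scheme may choose "is" rather than "capital" as the root of this sentence).

    import spacy

    nlp = spacy.load("en_core_web_sm")   # assumes the small English model is installed
    doc = nlp("Delhi is the capital of India, with lots of people.")
    sent = next(doc.sents)

    L = 3
    root = sent.root
    window = doc[max(root.i - L, 0): root.i + L + 1]   # at most 2L + 1 tokens
    print(root.text, "->", window.text)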
Results
Rouge
RougeN-Precision = (matching n-grams) / (count of n-grams in candidate)
RougeN-Recall = (matching n-grams) / (count of n-grams in reference)
RougeN-F1 = 2 · RougeN_pre · RougeN_re / (RougeN_pre + RougeN_re)
RougeL-Precision = len(LCS(ref, can)) / len(candidate)
RougeL-Recall = len(LCS(ref, can)) / len(reference)
RougeL-F1 = 2 · RougeL_pre · RougeL_re / (RougeL_pre + RougeL_re)
Table: Rouge Formulas
Examples
Rouge-N
candidate :: the cat was found under the bed
reference :: the cat was under the bed
has recall 6/6 = 1 and precision 6/7 = 0.86
Rouge-L
reference :: police killed the gunman
candidate1 :: police kill the gunman
candidate2 :: the gunman kill police
candidate1 has a Rouge-L F1 of 0.75 and candidate2 has a Rouge-L F1 of 0.5
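A sketch of Rouge-1 precision/recall via clipped n-gram counts, reproducing the cat/bed example above.

    from collections import Counter

    def ngrams(text, n=1):
        toks = text.split()
        return Counter(tuple(toks[i:i + n]) for i in range(len(toks) - n + 1))

    cand = ngrams("the cat was found under the bed")
    ref = ngrams("the cat was under the bed")
    match = sum((cand & ref).values())   # clipped matching n-grams
    print(match / sum(cand.values()))    # precision: 6/7 ~ 0.86
    print(match / sum(ref.values()))     # recall: 6/6 = 1.0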
Dataset Description
We used the CNN/Dailymail dataset. The dataset split is as follows:
Train 287,226
Test 11,490
Validation 13,368
Table: CNN/Dailymail split
average number of sentences per article 31
average number of sentences per summary 3
average number of words per article 790
average number of words per summary 55
average number of words per article (BPE vsize 50k) 818
Table: CNN/Dailymail average stats for the training set
Results on CNN/Dailymail
Model R1-f1 R2-f1 RL-f1
bpe-no-cov 35.39 15.53 32.31
bpe-cov 36.31 15.69 33.63
text-no-cov 31.4 11.6 27.2
text-cov 33.19 12.38 28.12
text-200 34.66 16.23 30.96
text-200-cov 37.18 17.41 32.68
pos-full* 35.76 16.9 31.86
pos-full-cov* 37.96 17.71 33.23
Table: pos-full minimizes a joint loss for word generation and POS-tag generation; all models use the pointer-copying mechanism by default except the bpe models; text-200 uses GloVe vectors for word-vector initialization
AMR S2S
AMR as augmented data using 25k training examples
Model R1-f1 R2-f1 RL-f1
aug-no-cov 30.18 11 26.52
aug-cov 34.53 13.22 29.21
Table: AMRs as Augmented Data
AMR to AMR using s2s
Model R1-f1 R2-f1 RL-f1
s2s-cnn-1 17.97 6.31 17.35
s2s-cnn-2 25.60 8.31 25.01
s2s-cnn-3 31.96 10.71 29.12
Table: 1: cnn, no pointer-gen; 2: cnn with pointer-gen; 3: cnn with coverage and pointer-gen
AMR using Doc Graph
Model R1-f1
number of edges L <= nodes/2 - 1 (ref gold) 18.59
number of edges L <= nodes/3 -1 (ref gold) 19.72
number of edges L <= nodes/4 -1 (ref gold) 19.57
number of edges L <= nodes/2 -1 (ref *generated) 38.05
number of edges L <= nodes/3 -1 (ref *generated) 44.72
number of edges L <= nodes/4 -1 (ref *generated) 43.60
Table: Graph-based AMR summaries. *Reference summaries were generated using the word alignments in the summary graph
Extractive + Abstractive
We used LSA and the dependency parse for the extractive phase
Model R1-f1 R2-f1 RL-f1
dep-no-cov 25.89 8.72 23.81
dep-cov 30.53 10.26 28.06
lsa-10-no-cov 29.75 11.27 27.3
lsa-10-cov 32.9 12.17 30.41
lsa-15-no-cov 31.64 12.34 28.74
lsa-15-cov 33.22 12.86 30.11
Table: Comparing the mixed approach to summarization; the dependency-based method used L = 7 as the context-window value
Sample Outputs
repetition
Reference: roseanne barr told the daily beast that she is slowly
going blind barr said she has macular degeneration and glaucoma
Generated: roseanne barr said she is slowly going blind in interview
with the daily beast barr said she suffers from macular degeneration
and glaucoma [she suffers from macular degeneration and glaucoma]
[she suffers from macular degeneration and glaucoma]
less abstraction
Reference: batman superman dawn justice trailer leaked thursday
yanked offline film will released march 25 2016 stars ben affleck
henry cavill
Generated: batman superman dawn justice [leaked handheld camera
phone user uploaded handheld camera phone capture trailer spanish]
superman dawn justice will released march 25 2016 [stars affleck
batman henry cavill superman film also stars amy adams] much costs
Sample Outputs
fact redundancy
Reference: coast guard says about 50 people were rescued from
mobile bay more than 100 sailboats took part in the dauphin island
race an annual event
Generated: [sailboats took part in the dauphin island race] and as
many as 50 [sailboats took part in the dauphin island race] and as
many as 50 people in all were rescued from water [the coast guard
says this is the 57th year] for the event [the club says this is the 57th
year] for the event
Conclusion and Future Work
In this work we walked through some of the techniques for text summarization and suggested some changes to them.
Extractive methods always degrade the summaries due to the lack of complete information, but they improve training times.
We looked into AMRs, their applicability to the task of summarization, and their effectiveness on a smaller data set.
Future work:
Finding global graph embeddings for AMR-type structures.
Using co-referencing in AMRs.
Data augmentation without increasing the model complexity.
Expanding memory networks for summarization.
Reinforcement learning for summarization.
Better extraction using dependency trees.
Thank You!
(t / thank-01
:ARG1 (y / you))
References I
Christopher, O. Understanding LSTM networks.
LeCun, Y., Bengio, Y., and Hinton, G. (2015). Deep learning. Nature, 521(7553):436–444.
Liu, F., Flanigan, J., Thomson, S., Sadeh, N., and Smith, N. A. (2015). Toward abstractive summarization using semantic representations.
See, A., Liu, P. J., and Manning, C. D. (2017). Get to the point: Summarization with pointer-generator networks. arXiv preprint arXiv:1704.04368.
References II
Sennrich, R., Haddow, B., and Birch, A. (2015). Neural machine translation of rare words with subword units. arXiv preprint arXiv:1508.07909.
Tu, Z., Lu, Z., Liu, Y., Liu, X., and Li, H. (2016). Modeling coverage for neural machine translation. arXiv preprint arXiv:1601.04811.
