Presentation from my thesis defense on text summarization. It reviews existing state-of-the-art models, examines the effectiveness of AMR (Abstract Meaning Representation) for text summarization, and shows how AMRs can be used with seq2seq models. We also discuss other techniques such as BPE (Byte Pair Encoding) and their effectiveness for the task, and look at how data augmentation with POS tags and AMRs affects summarization with s2s learning.
1. Text summarization using Abstract Meaning Representation
Text summarization using Abstract Meaning Representation (AMR)
Amit Nagarkoti
Supervisor: Dr. Harish Karnick
Department of Computer Science and Engineering
Indian Institute of Technology Kanpur
2. Text summarization using Abstract Meaning Representation
Table of contents
1. Introduction
2. Extractive Summarization Methods
3. Seq2Seq Learning
4. Results
3. Text summarization using Abstract Meaning Representation
Introduction
Outline
1 Introduction
2 Extractive Summarization Methods
3 Seq2Seq Learning
4 Results
4. Text summarization using Abstract Meaning Representation
Introduction
Introduction
Definition
Text summarization is the process of reducing the size of the original document to a much more concise form such that the most relevant facts in the original document are retained.
Summaries are always lossy.
Summarization methodologies
Extractive summarization: extract important words and sentences
Abstractive summarization (harder): rephrase, generate similar words
5. Text summarization using Abstract Meaning Representation
Introduction
Why Important?
user point of view: evaluate the importance of an article.
linguistic/scientific/philosophical view: solving summarization is equivalent to solving the problem of language understanding.
6. Text summarization using Abstract Meaning Representation
Introduction
AMR : welcome to amr
(w / welcome-01
:ARG2 (a / amr))
represents a sentence as a rooted DAG
captures "who is doing what to whom"
nodes : verb senses (see-01), objects (boy, marble)
edges : relations (ARG1, ARG0)
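As a concrete illustration, here is a minimal sketch of reading this AMR into a graph of triples, assuming the third-party penman library (pip install penman) is available; any AMR in this deck can be inspected the same way.

import penman

g = penman.decode("(w / welcome-01 :ARG2 (a / amr))")
for source, role, target in g.triples:
    print(source, role, target)
# w :instance welcome-01  -> node w carries the concept welcome-01
# w :ARG2 a               -> edge (relation) from w to a
# a :instance amr         -> node a carries the concept amr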
7. Text summarization using Abstract Meaning Representation
Introduction
Abstraction in AMR
Figure: AMR example
Both of the following sentences map to the same AMR:
The boy sees the white marble.
The boy saw the marble that was white.
8. Text summarization using Abstract Meaning Representation
Introduction
Propbank Lexicon
white-03 : refers to the color white
ARG1: thing that is white in color (marble)
ARG2: specific part of ARG1, if also mentioned
see-01 : to see or view
ARG0: viewer (boy)
ARG1: thing viewed (marble)
ARG2: attribute of ARG1, further description (white, but we have a different AMR here)
9. Text summarization using Abstract Meaning Representation
Introduction
How to get an AMR?
Use JAMR : 84% accuracy for concept node identification.
gives the AMR for a sentence:
(d / discover-01 | 0
:ARG0 (r / rope | 0.0)
:ARG1 (n / noose | 0.1)
:location (c / campus | 0.2))
the noose made of rope was discovered on campus
10. Text summarization using Abstract Meaning Representation
Introduction
word-node Alignment
node alignment word
discover-01 6-7—0 discovered
rope 4-5—0.0 rope
noose 1-2—0.1 noose
campus 8-9—0.2 campus
Table: word-node alignment
These alignments will be used to generate summaries from AMRs
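The sketch below (toy Python, names illustrative) shows how such alignments let us project a set of selected AMR nodes back to surface words, which is how a summary is read off the graph:

# node -> (start, end) token span, from the alignment table above
alignments = {
    "discover-01": (6, 7),
    "rope": (4, 5),
    "noose": (1, 2),
    "campus": (8, 9),
}
sentence = "the noose made of rope was discovered on campus".split()

selected = ["noose", "discover-01", "campus"]   # nodes kept in the sub-graph
spans = sorted(alignments[n] for n in selected)
print(" ".join(sentence[start] for start, _ in spans))
# -> noose discovered campus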
11. Text summarization using Abstract Meaning Representation
Extractive Summarization Methods
Outline
1 Introduction
2 Extractive Summarization Methods
3 Seq2Seq Learning
4 Results
12. Text summarization using Abstract Meaning Representation
Extractive Summarization Methods
Using AMRs
Figure: Steps to generate Document graph from sentence AMR (modified from
[Liu et al., 2015])
13. Text summarization using Abstract Meaning Representation
Extractive Summarization Methods
Document Graph
Figure: Dense Document Graph
14. Text summarization using Abstract Meaning Representation
Extractive Summarization Methods
Finding the summary sub-graph
maximize $\sum_{i=1}^{n} \psi_i \, \theta^T f(v_i)$   (1)
here $\psi_i$ is a binary variable indicating whether node $i$ is selected
$\theta = [\theta_1, \ldots, \theta_m]$ are model parameters
$f(v_i) = [f_1(v_i), \ldots, f_m(v_i)]$ are node features
15. Text summarization using Abstract Meaning Representation
Extractive Summarization Methods
Constraints for a valid sub-graph
$v_i - e_{ij} \ge 0, \quad v_j - e_{ij} \ge 0$ (an edge may be selected only if both its endpoints are)
$\sum_i f_{0i} - \sum_i v_i = 0$ (the root emits one unit of flow per selected node)
$\sum_i f_{ij} - \sum_k f_{jk} - v_j = 0$ (flow conservation: each selected node consumes one unit)
$N e_{ij} - f_{ij} \ge 0, \; \forall i,j$ sanity constraint (flow only on selected edges)
$\sum_{i,j} e_{ij} \le L$ size constraint
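Below is a minimal PuLP sketch of this ILP on a toy graph. The node scores standing in for $\theta^T f(v_i)$ are made-up numbers, and the flow constraints are replaced by a simpler "each selected node needs an incoming selected edge" surrogate that suffices for this tree rooted at node 0.

from pulp import LpBinary, LpMaximize, LpProblem, LpVariable, lpSum

nodes = ["discover-01", "rope", "noose", "campus"]
edges = [(0, 1), (0, 2), (0, 3)]            # directed AMR edges from the root
score = {0: 1.2, 1: 0.4, 2: 0.9, 3: 0.7}    # assumed theta^T f(v_i) values
L = 2                                       # max edges in the summary graph

prob = LpProblem("summary_subgraph", LpMaximize)
v = [LpVariable(f"v{i}", cat=LpBinary) for i in range(len(nodes))]
e = {ij: LpVariable(f"e{ij[0]}_{ij[1]}", cat=LpBinary) for ij in edges}

prob += lpSum(score[i] * v[i] for i in range(len(nodes)))   # objective (1)
for (i, j) in edges:
    prob += v[i] - e[(i, j)] >= 0           # edge selected => head selected
    prob += v[j] - e[(i, j)] >= 0           # edge selected => tail selected
for j in range(1, len(nodes)):              # surrogate for the flow constraints
    prob += lpSum(e[ij] for ij in edges if ij[1] == j) >= v[j]
prob += lpSum(e.values()) <= L              # size constraint
prob.solve()
print([nodes[i] for i in range(len(nodes)) if v[i].value() == 1])
# with L = 2, the root and the two highest-scoring non-root nodes survive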
16. Text summarization using Abstract Meaning Representation
Extractive Summarization Methods
Complete Algorithm
for cur_doc in corpus:
    create doc graph
    add ILP constraints
    solve objective to minimize loss
    calculate gradients
    update model parameters
17. Text summarization using Abstract Meaning Representation
Extractive Summarization Methods
LSA for extractive summarization
Figure: term-sentence matrix
term vector $t_i^T = [x_{i1} \ldots x_{in}]$
sentence vector $s_j^T = [x_{1j} \ldots x_{mj}]$
the term-sentence matrix can be huge
no relations between terms
sparsity
18. Text summarization using Abstract Meaning Representation
Extractive Summarization Methods
LSA for extractive summarization
Figure: SVD over term-sentence matrix source:Wikipedia
Figure: low rank approximation
19. Text summarization using Abstract Meaning Representation
Extractive Summarization Methods
LSA for extractive summarization
Finally a score is calculated for each sentence vector, given by
$S_l = \sum_{i=1}^{n} v_{l,i}^2 \cdot \sigma_i^2$   (2)
where $S_l$ is the score for sentence $l$
choose the $L$ sentences with the highest scores
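A small numpy sketch of Eq. (2) on a toy term-sentence count matrix (rows are terms, columns are sentences; the numbers are invented):

import numpy as np

A = np.array([[1, 0, 1],
              [0, 2, 0],
              [1, 1, 0],
              [0, 0, 3]], dtype=float)     # terms x sentences

U, sigma, Vt = np.linalg.svd(A, full_matrices=False)
k = 2                                      # rank of the low-rank approximation
# S_l = sum_i v_{l,i}^2 * sigma_i^2, with v_l the l-th column of V^T
scores = np.sum((Vt[:k, :] ** 2) * (sigma[:k, None] ** 2), axis=0)
print(scores, np.argsort(-scores))         # pick the L best-ranked sentences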
20. Text summarization using Abstract Meaning Representation
Seq2Seq Learning
Outline
1 Introduction
2 Extractive Summarization Methods
3 Seq2Seq Learning
4 Results
21. Text summarization using Abstract Meaning Representation
Seq2Seq Learning
s2s
Figure: Sequence to Sequence Model
22. Text summarization using Abstract Meaning Representation
Seq2Seq Learning
Figure: RNN cell, $s_t = f(U x_t + W s_{t-1})$, $o_t = \mathrm{softmax}(V s_t)$, source: Nature [LeCun et al., 2015]
Figure: Chat client using an s2s model based on LSTMs, source: [Christopher, O.]
23. Text summarization using Abstract Meaning Representation
Seq2Seq Learning
S2S continued
During training the model tries to minimize the negative log-likelihood of the target word:
$\mathrm{loss}_D = \sum_{t=1}^{T} -\log(P(w_t))$   (3)
here $w_t$ is the target word at step $t$.
Problems in s2s
slow training with long sequences, so sequence lengths must be limited
limited context for the decoder, which is critical for s2s
large vocabularies
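Eq. (3) is just a sum of per-step negative log-probabilities; a toy numpy sketch with a 5-word vocabulary:

import numpy as np

probs = np.array([[0.1, 0.6, 0.1, 0.1, 0.1],    # decoder P(w) at step 1
                  [0.2, 0.2, 0.5, 0.05, 0.05]]) # decoder P(w) at step 2
targets = [1, 2]                                # target word ids w_t

loss = -sum(np.log(probs[t, w]) for t, w in enumerate(targets))
print(loss)   # loss_D = -log 0.6 - log 0.5, roughly 1.20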
24. Text summarization using Abstract Meaning Representation
Seq2Seq Learning
Attention to rescue
Attend to or re-visit critical information when making decisions
Figure: Attention in RNNs source:distill.pub
25. Text summarization using Abstract Meaning Representation
Seq2Seq Learning
Attention complete view
Figure: Attention at step t
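A sketch of one attention step in numpy, simplified to dot-product scoring (the models in this deck use learned score functions, but the flow is the same): score every encoder state against the decoder state, softmax into weights, and take the weighted sum as the context vector.

import numpy as np

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

enc_states = np.random.randn(6, 8)   # 6 encoder steps, hidden size 8
dec_state = np.random.randn(8)       # decoder state at step t

scores = enc_states @ dec_state      # one score per input position
alpha = softmax(scores)              # attention weights (one heatmap row)
context = alpha @ enc_states         # context vector fed to the decoder
print(alpha.round(2), context.shape)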
26. Text summarization using Abstract Meaning Representation
Seq2Seq Learning
Attention Heatmap
The X-axis is the input sequence, the Y-axis is the generated output.
Figure: Attention Heatmap during Decoding
27. Text summarization using Abstract Meaning Representation
Seq2Seq Learning
Byte Pair Encoding
Definition
BPE is a compression technique where the most frequent pair of consecutive bytes is replaced by a byte not occurring in the document.
BPE has been adapted for NMT [Sennrich et al., 2015] using the idea of subword units.
"lower" will be represented as "l o w e r @".
vocabsize = numOfIterations + numOfChars.
BPE merge operations learned from the dictionary {low, lowest, newer, wider}, using 4 merge operations:
r @  -> r@
l o  -> lo
lo w -> low
e r@ -> er@
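A sketch of learning BPE merges in the style of Sennrich et al. (2015), with '@' marking end-of-word as above; the word frequencies are invented, so ties may merge in a different order than the slide's example.

import re
from collections import Counter

def pair_counts(vocab):
    pairs = Counter()
    for word, freq in vocab.items():
        symbols = word.split()
        for pair in zip(symbols, symbols[1:]):
            pairs[pair] += freq
    return pairs

def apply_merge(pair, vocab):
    # merge whole symbols only, never inside an already-merged symbol
    pattern = re.compile(r"(?<!\S)" + re.escape(" ".join(pair)) + r"(?!\S)")
    return {pattern.sub("".join(pair), w): f for w, f in vocab.items()}

vocab = {"l o w @": 5, "l o w e s t @": 2, "n e w e r @": 6, "w i d e r @": 3}
for _ in range(4):                   # 4 merge operations, as on the slide
    best = pair_counts(vocab).most_common(1)[0][0]
    print(" ".join(best), "->", "".join(best))
    vocab = apply_merge(best, vocab)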
28. Text summarization using Abstract Meaning Representation
Seq2Seq Learning
Pointer Generator Model for OOVs
Figure: Pointer Gen model [See et al., 2017]
29. Text summarization using Abstract Meaning Representation
Seq2Seq Learning
Coverage
“Coverage” is used in MT to control over- and under-production of target words.
Some words may never get enough attention, resulting in poor translations/summaries.
The solution is to use coverage to guide attention [Tu et al., 2016] and [See et al., 2017]:
accumulate all attention weights and penalize extra attention.
$c^t = \sum_{t'=1}^{t-1} \alpha^{t'}$   coverage vector (4)
$\mathrm{covloss}_t = \sum_{i=1}^{\text{encsteps}} \min(\alpha_i^t, c_i^t)$   (5)
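Eqs. (4) and (5) in a toy numpy sketch: the coverage vector accumulates past attention, and the loss penalizes re-attending positions that are already covered (step 2 below re-attends position 0, so it pays the largest penalty).

import numpy as np

alphas = np.array([[0.7, 0.2, 0.1],     # attention at decoder step 1
                   [0.6, 0.3, 0.1],     # step 2 re-attends position 0
                   [0.1, 0.2, 0.7]])

coverage = np.zeros(alphas.shape[1])    # c^1 = 0
for t, alpha in enumerate(alphas, 1):
    covloss = np.minimum(alpha, coverage).sum()   # Eq. (5)
    print(f"step {t}: covloss = {covloss:.2f}")
    coverage += alpha                             # Eq. (4)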
30. Text summarization using Abstract Meaning Representation
Seq2Seq Learning
AMR Linearization
Figure: An AMR DFS traversal gives the linearization -TOP-( eat-01 ARG1( bone )ARG1 ARG0( dog )ARG0 )-TOP-
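The traversal itself is a few lines; this sketch reproduces the slide's linearization with opening and closing relation markers (the nested-tuple encoding of the AMR is just for illustration):

def linearize(node, relation="-TOP-"):
    concept, children = node
    tokens = [f"{relation}(", concept]
    for rel, child in children:
        tokens += linearize(child, rel)    # DFS into each sub-graph
    tokens.append(f"){relation}")
    return tokens

amr = ("eat-01", [("ARG1", ("bone", [])), ("ARG0", ("dog", []))])
print(" ".join(linearize(amr)))
# -TOP-( eat-01 ARG1( bone )ARG1 ARG0( dog )ARG0 )-TOP-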
31. Text summarization using Abstract Meaning Representation
Seq2Seq Learning
Data Augmentation/Extension with POS
The POS sequence $p_1 \cdots p_n$ is obtained using the Stanford Parser.
Figure: s2s model generating multiple output distributions
32. Text summarization using Abstract Meaning Representation
Seq2Seq Learning
Hierarchical Sequence Encoder for AMRs
the sentence vector $S_i$ is obtained by using attention over the word-encoder states
the document vector $D_i$ is obtained by using attention over the sentence-encoder states
Figure: hierarchical models learn the document vector using level-wise learning
33. Text summarization using Abstract Meaning Representation
Seq2Seq Learning
Using Dependency Parsing
“A dependency parse of a sentence is a tree with labelled edges such that the main verb or the focused noun is the root and the edges are the relations between words in the sentence.”
To reduce document size we used a context of size L around the root word, reducing the sentence length to at most 2L + 1 words.
The figure below has “capital” as the root; with L = 3 we are able to extract the crux of the sentence, i.e. “Delhi is the capital of India”.
Figure: Dependency Parse :: Delhi is the capital of India , with lots of people.
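A sketch of the window extraction; the root index would come from a dependency parser (e.g., the Stanford Parser), but here it is hard-coded for the slide's example.

def root_window(tokens, root_idx, L):
    lo = max(0, root_idx - L)
    return tokens[lo: root_idx + L + 1]    # at most 2L + 1 tokens

sent = "Delhi is the capital of India , with lots of people .".split()
print(" ".join(root_window(sent, sent.index("capital"), L=3)))
# -> Delhi is the capital of India ,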
34. Text summarization using Abstract Meaning Representation
Results
Outline
1 Introduction
2 Extractive Summarization Methods
3 Seq2Seq Learning
4 Results
36. Text summarization using Abstract Meaning Representation
Results
Examples
Rouge-N
candidate :: the cat was found under the bed
reference :: the cat was under the bed
has recall 6/6 = 1 and precision 6/7 ≈ 0.86
Rouge-L
reference :: police killed the gunman
candidate1 :: police kill the gunman
candidate2 :: the gunman kill police
candidate1 has a ROUGE-L F1 of 0.75 and candidate2 has a ROUGE-L F1 of 0.5
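A sketch of ROUGE-1 recall and precision for the example above (unigram overlap with clipped counts):

from collections import Counter

def rouge1(candidate, reference):
    cand, ref = Counter(candidate.split()), Counter(reference.split())
    overlap = sum(min(c, ref[w]) for w, c in cand.items())
    return overlap / sum(ref.values()), overlap / sum(cand.values())

recall, precision = rouge1("the cat was found under the bed",
                           "the cat was under the bed")
print(recall, round(precision, 2))   # 1.0 0.86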
37. Text summarization using Abstract Meaning Representation
Results
Dataset Description
We used the CNN/Dailymail dataset. The split is as follows:
Train 287,226
Test 11,490
Validation 13,368
Table: CNN/Dailymail split
average number of sentences per article 31
average number of sentences per summary 3
average number of words per article 790
average number of words per summary 55
average number of words per article (BPE vsize 50k) 818
Table: CNN/Dailymail average stats for the training set
38. Text summarization using Abstract Meaning Representation
Results
Results on CNN/Dailymail
Model R1-f1 R2-f1 RL-f1
bpe-no-cov 35.39 15.53 32.31
bpe-cov 36.31 15.69 33.63
text-no-cov 31.4 11.6 27.2
text-cov 33.19 12.38 28.12
text-200 34.66 16.23 30.96
text-200-cov 37.18 17.41 32.68
pos-full* 35.76 16.9 31.86
pos-full-cov* 37.96 17.71 33.23
Table: pos-full minimizes the loss for both word generation and POS tag generation; all models use the pointer copying mechanism by default except the BPE models; text-200 uses GloVe vectors for word vector initialization
39. Text summarization using Abstract Meaning Representation
Results
AMR S2S
AMR as augmented data using 25k training examples
Model R1-f1 R2-f1 RL-f1
aug-no-cov 30.18 11 26.52
aug-cov 34.53 13.22 29.21
Table: AMRs as Augmented Data
AMR to AMR using s2s
Model R1-f1 R2-f1 RL-f1
s2s-cnn-1 17.97 6.31 17.35
s2s-cnn-2 25.60 8.31 25.01
s2s-cnn-3 31.96 10.71 29.12
Table: (1) CNN, no pointer-gen; (2) CNN with pointer-gen; (3) CNN with coverage and pointer-gen
40. Text summarization using Abstract Meaning Representation
Results
AMR using Doc Graph
Model R1-f1
number of edges L <= nodes/2 - 1 (ref gold) 18.59
number of edges L <= nodes/3 -1 (ref gold) 19.72
number of edges L <= nodes/4 -1 (ref gold) 19.57
number of edges L <= nodes/2 -1 (ref *generated) 38.05
number of edges L <= nodes/3 -1 (ref *generated) 44.72
number of edges L <= nodes/4 -1 (ref *generated) 43.60
Table: Graph-based AMR summaries; *reference summaries were generated using the word alignments in the summary graph
41. Text summarization using Abstract Meaning Representation
Results
Extractive + Abstractive
We used LSA and the dependency parse for the extractive phase
Model R1-f1 R2-f1 RL-f1
dep-no-cov 25.89 8.72 23.81
dep-cov 30.53 10.26 28.06
lsa-10-no-cov 29.75 11.27 27.3
lsa-10-cov 32.9 12.17 30.41
lsa-15-no-cov 31.64 12.34 28.74
lsa-15-cov 33.22 12.86 30.11
Table: Comparing the mixed approach to summarization; the dependency-based method used L = 7 as the context window value
42. Text summarization using Abstract Meaning Representation
Results
Sample Outputs
repetition
Reference: roseanne barr told the daily beast that she is slowly
going blind barr said she has macular degeneration and glaucoma
Generated: roseanne barr said she is slowly going blind in interview
with the daily beast barr said she suffers from macular degeneration
and glaucoma [she suffers from macular degeneration and glaucoma]
[she suffers from macular degeneration and glaucoma]
less abstraction
Reference: batman superman dawn justice trailer leaked thursday
yanked offline film will released march 25 2016 stars ben affleck
henry cavill
Generated: batman superman dawn justice [leaked handheld camera
phone user uploaded handheld camera phone capture trailer spanish]
superman dawn justice will released march 25 2016 [stars affleck
batman henry cavill superman film also stars amy adams] much costs
43. Text summarization using Abstract Meaning Representation
Results
Sample Outputs
facts redundancy
Reference: coast guard says about 50 people were rescued from
mobile bay more than 100 sailboats took part in the dauphin island
race an annual event
Generated: [sailboats took part in the dauphin island race] and as
many as 50 [sailboats took part in the dauphin island race] and as
many as 50 people in all were rescued from water [the coast guard
says this is the 57th year] for the event [the club says this is the 57th
year] for the event
44. Text summarization using Abstract Meaning Representation
Results
Conclusion and Future Work
In this work we walked through some of the techniques for text summarization and suggested some changes to them.
Extractive methods always degrade the summaries due to the lack of complete information, but they improve training times.
We looked into AMRs, their applicability to the task of summarization, and their effectiveness on a smaller data set.
Future work:
Finding Global graph embeddings for AMR type structures.
Using co-referencing in AMRs.
Data Augmentation without increasing the model complexity.
Expanding Memory Networks for Summarization.
Reinforcement Learning for Summarization.
Better Extraction using Dependency Trees.
45. Text summarization using Abstract Meaning Representation
Results
Thank You!
(t / thank-01
:ARG1 (y / you))
46. Text summarization using Abstract Meaning Representation
Results
References I
Christopher, O. Understanding LSTM Networks.
LeCun, Y., Bengio, Y., and Hinton, G. (2015). Deep learning. Nature, 521(7553):436-444.
Liu, F., Flanigan, J., Thomson, S., Sadeh, N., and Smith, N. A. (2015). Toward abstractive summarization using semantic representations.
See, A., Liu, P. J., and Manning, C. D. (2017). Get to the point: Summarization with pointer-generator networks. arXiv preprint arXiv:1704.04368.
47. Text summarization using Abstract Meaning Representation
Results
References II
Sennrich, R., Haddow, B., and Birch, A. (2015). Neural machine translation of rare words with subword units. arXiv preprint arXiv:1508.07909.
Tu, Z., Lu, Z., Liu, Y., Liu, X., and Li, H. (2016). Modeling coverage for neural machine translation. arXiv preprint arXiv:1601.04811.