Improving Neural Abstractive Text Summarization
with Prior Knowledge
Gaetano Rossiello, Pierpaolo Basile, Giovanni Semeraro, Marco Di Ciano and
Gaetano Grasso
gaetano.rossiello@uniba.it
Department of Computer Science
University of Bari - Aldo Moro, Italy
URANIA 16 - 1st Italian Workshop on Deep Understanding and Reasoning:
A challenge for Next-generation Intelligent Agents
28 November 2016
AI*IA 16 - Genoa, Italy
Text Summarization
The goal of summarization is to produce a shorter version of a source
text while preserving the meaning and the key contents of the original.
A well-written summary can significantly reduce the amount of
cognitive work needed to digest large amounts of text.
Information Overload
Information overload is a problem of modern digital society caused
by the explosion in the amount of information produced both on the
World Wide Web and in enterprise environments.
Text Summarization - Approaches
Input
Single-document
Multi-document
Output
Extractive
Abstractive
Headline
Extractive Summarization
The generated summary is a selection of relevant sentences from the
source text in a copy-paste fashion.
Abstractive Summarization
The generated summary is a new cohesive text not necessarily present
in the original source.
Extractive Summarization - Methods
Statistical methods
Feature-based
Machine Learning
Fuzzy Logic
Graph-based
Distributional semantics
LSA (Latent Semantic Analysis)
NMF (Non-negative Matrix Factorization)
Word2Vec
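A minimal sketch of the distributional-semantic family, assuming scikit-learn is available: sentences are ranked by their weight on the first latent topic of an SVD of the TF-IDF sentence-term matrix (the sentences and the single-topic ranking are illustrative, not from the talk).

```python
# LSA-style extractive scoring sketch (illustrative, not the talk's code).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD

sentences = [
    "Russian defense minister Ivanov called Sunday for a joint front.",
    "The meeting took place in Moscow.",
    "He urged combating global terrorism together.",
]

tfidf = TfidfVectorizer().fit_transform(sentences)       # (n_sentences, n_terms)
topics = TruncatedSVD(n_components=2).fit_transform(tfidf)
scores = abs(topics[:, 0])                               # weight on the top latent topic
print(sentences[scores.argmax()])                        # best-ranked sentence = extract
```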
Abstractive Summarization: a Challenging Task
Abstractive summarization requires deep understanding of and
reasoning over the text: determining the explicit or implicit meaning
of each element, such as words, phrases, sentences and paragraphs,
and making inferences about their properties (Norvig, P.: Inference
in text understanding. AAAI 1987) in order to generate new sentences
which compose the summary.
– Abstractive Example –
Original: Russian defense minister Ivanov called Sunday for the
creation of a joint front for combating global terrorism.
Summary: Russia calls for joint front against terrorism.
Deep Learning for Abstractive Text Summarization
Idea
Cast the summarization task as a neural machine translation
problem: models trained on a large amount of data learn the
alignments between the input text and the target summary
through an attention-based encoder-decoder paradigm.
Rush, A., et al. A neural attention model for abstractive
sentence summarization. EMNLP 2015
Nallapati, R., et al. Sequence-to-sequence RNNs for text
summarization and Beyond. CoNLL 2016
Chopra, S., et al. Abstractive Sentence Summarization
with Attentive Recurrent Neural Networks. NAACL 2016
Deep Learning for Abstractive Text Summarization
[Figure: the attention-based encoder-decoder architecture of Rush, A., et al.:
A neural attention model for abstractive sentence summarization. EMNLP 2015]
Abstractive Summarization - Problem Formulation
Let us consider:
Original text x = {x_1, x_2, ..., x_n}
Summary y = {y_1, y_2, ..., y_m}
where n ≫ m and x_i, y_j ∈ V (V is the vocabulary)
A probabilistic perspective goal
The summarization problem consists in finding an output sequence
y that maximizes the conditional probability of y given an input
sequence x
arg max_y P(y|x)

P(y|x) = P(y|x; θ) = ∏_{t=1}^{|y|} P(y_t | {y_1, ..., y_{t−1}}, x; θ)
where θ denotes a set of parameters learnt from a training set of
source text and target summary pairs.
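Read operationally, the factorization says the log-probability of a summary is a sum of per-token conditional log-probabilities. A minimal sketch in Python; the model object and its next_token_probs(prefix, x) interface are hypothetical, introduced only for illustration:

```python
import math

def sequence_log_prob(model, x, y):
    """log P(y | x; theta) = sum over t of log P(y_t | y_<t, x; theta).

    `model.next_token_probs(prefix, x)` is a hypothetical interface
    returning a dict from each vocabulary word to its probability.
    """
    log_p = 0.0
    for t in range(len(y)):
        probs = model.next_token_probs(y[:t], x)  # distribution over V
        log_p += math.log(probs[y[t]])            # probability of the gold token
    return log_p
```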
Recurrent Neural Networks
A recurrent neural network (RNN) is a neural network model
proposed in the 1980s for modelling time series.
The structure of the network is similar to that of a feedforward
neural network, with the distinction that it allows a recurrent
hidden state whose activation at each time step depends on that of
the previous step (a cycle).
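A minimal Elman-style recurrence in NumPy (a sketch with toy dimensions, just to make the cycle explicit):

```python
import numpy as np

def elman_step(x_t, h_prev, W_xh, W_hh, b_h):
    """One recurrent step: the new hidden state depends on the current
    input AND on the previous hidden state (the cycle)."""
    return np.tanh(x_t @ W_xh + h_prev @ W_hh + b_h)

rng = np.random.default_rng(0)
d_in, d_h = 4, 8                                  # toy dimensions
W_xh = rng.normal(size=(d_in, d_h))
W_hh = rng.normal(size=(d_h, d_h))
b_h = np.zeros(d_h)

h = np.zeros(d_h)
for x_t in rng.normal(size=(5, d_in)):            # a sequence of 5 input vectors
    h = elman_step(x_t, h, W_xh, W_hh, b_h)       # h carries the history forward
```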
Sequence to Sequence Learning
The sequence-to-sequence learning problem can be modeled by
RNNs using an encoder-decoder paradigm.
The encoder is an RNN that reads one token at a time from the
input source and returns a fixed-size vector representing the
input text.
The decoder is another RNN that generates words for the
summary and is conditioned by the vector representation
returned by the first network.
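A schematic of the paradigm (a sketch, not the authors' implementation; embed, enc_step, dec_step and output_word are placeholder components to be filled with a concrete RNN cell and output layer):

```python
def encode(tokens, embed, enc_step, h0):
    """Encoder RNN: read one token at a time; the final hidden state
    is the fixed-size vector c representing the whole input text."""
    h = h0
    for tok in tokens:
        h = enc_step(embed[tok], h)
    return h  # the context vector c

def decode(c, embed, dec_step, output_word, h0, bos="<s>", eos="</s>", max_len=30):
    """Decoder RNN: generate summary words, conditioned on c."""
    h, word, summary = h0, bos, []
    while len(summary) < max_len:
        h = dec_step(embed[word], h, c)   # h_t = g(y_{t-1}, h_{t-1}, c)
        word = output_word(h, c)          # pick y_t from h_t and c
        if word == eos:
            break
        summary.append(word)
    return summary
```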
Abstractive Summarization and Sequence to Sequence
P(y_t | {y_1, ..., y_{t−1}}, x; θ) = g_θ(h_t, c)
where:
h_t = g_θ(y_{t−1}, h_{t−1}, c)
The context vector c is the output of the encoder and encodes the
representation of the whole input source.
g_θ is an RNN and can be modeled using:
Elman RNN
LSTM (Long Short-Term Memory)
GRU (Gated Recurrent Unit)
At time step t, the decoder RNN computes the probability of the word
y_t given the last hidden state h_t and the input context c.
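One common parametrization of this output distribution (a sketch, not necessarily the one used in the cited papers) is a softmax over the vocabulary computed from the concatenation of h_t and c:

```python
import numpy as np

def softmax(z):
    z = z - z.max()          # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def decoder_word_probs(h_t, c, W_out, b_out):
    """P(y_t | y_<t, x; theta) = softmax(W_out [h_t; c] + b_out):
    one score per vocabulary word from the hidden state and context."""
    return softmax(W_out @ np.concatenate([h_t, c]) + b_out)
```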
Limits of the State-of-the-Art Neural Models
The proposed neural attention-based models for abstractive
summarization are still at an early stage, thus they show some
limitations:
Problems in distinguishing rare and unknown words
Grammar errors in the generated summaries
– Example –
Suppose that neither of the two tokens 10 and Genoa belongs to the
vocabulary; then the model cannot distinguish the probabilities of the
two sentences:
The airport is about 10 kilometers.
The airport is about Genoa kilometers.
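The problem is easy to reproduce: with the standard preprocessing that maps every out-of-vocabulary token to a single <unk> symbol, the two sentences become identical inputs, so any model necessarily assigns them the same probability. A minimal sketch:

```python
def preprocess(sentence, vocab):
    """Replace every out-of-vocabulary token with the same <unk> symbol."""
    return [w if w in vocab else "<unk>" for w in sentence.split()]

vocab = {"The", "airport", "is", "about", "kilometers", "."}
s1 = preprocess("The airport is about 10 kilometers .", vocab)
s2 = preprocess("The airport is about Genoa kilometers .", vocab)
assert s1 == s2   # identical token sequences => identical model probability
```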
Infuse Prior Knowledge into Neural Networks
Our Idea
Infuse prior knowledge, such as linguistic features, into RNNs
in order to overcome these limits.
Motivation:
The/DT airport/NN is/VBZ about/IN ?/CD kilometers/NNS
where CD is the Part-of-Speech (POS) tag that identifies a cardinal
number.
Thus, 10 is the unknown token with the highest probability, because
it is tagged as CD.
By introducing information about the syntactic role of each word, the
neural network can learn the correct collocation of words belonging
to a certain part-of-speech class.
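One simple way to expose this syntactic information to the network, sketched here under the assumption of a fixed tag set and pre-trained word embeddings, is to concatenate a one-hot POS vector to each word embedding before feeding it to the RNN:

```python
import numpy as np

POS_TAGS = ["DT", "NN", "VBZ", "IN", "CD", "NNS"]   # toy tag set

def pos_one_hot(tag):
    v = np.zeros(len(POS_TAGS))
    v[POS_TAGS.index(tag)] = 1.0
    return v

def input_vector(word_embedding, pos_tag):
    """RNN input = word embedding concatenated with a one-hot POS
    feature, so the network also sees the syntactic class of the token."""
    return np.concatenate([word_embedding, pos_one_hot(pos_tag)])

emb = np.random.default_rng(0).normal(size=50)      # toy 50-dim embedding
x_t = input_vector(emb, "CD")                       # e.g. for the unknown token "10"
```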
Infuse Prior Knowledge into Neural Networks
Preliminary approach:
Combine hand-crafted linguistic features and embeddings as
input vectors to RNNs.
Substitute the softmax layer of the neural network with a
log-linear model.
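A generic log-linear output layer, as a sketch of the general idea rather than of the exact proposed model: each candidate word is scored by a weighted sum of arbitrary features, which can mix embeddings with hand-crafted linguistic features such as POS indicators.

```python
import numpy as np

def log_linear_probs(features, weights):
    """Log-linear model: P(y | context) proportional to exp(w . f(y, context)).
    `features[y]` is the feature vector of candidate word y."""
    scores = np.array([weights @ f for f in features])
    scores -= scores.max()                 # numerical stability
    e = np.exp(scores)
    return e / e.sum()
```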
Evaluation Plan - Dataset
We plan to evaluate our models on gold-standard datasets for the
summarization task:
DUC (Document Understanding Conference) 2002-2007 [1]
TAC (Text Analysis Conference) 2008-2011 [2]
Gigaword [3]
CNN/DailyMail [4]
Cornell University Library [5]
Local government documents [6]
[1] http://duc.nist.gov/
[2] http://tac.nist.gov/data/index.html
[3] https://catalog.ldc.upenn.edu/LDC2012T21
[4] https://github.com/deepmind/rc-data
[5] https://arxiv.org/
[6] made available by InnovaPuglia S.p.A.
Evaluation Plan - Metric
ROUGE (Recall-Oriented Understudy for Gisting Evaluation)
ROUGE metrics (Lin, Chin-Yew: ROUGE: a Package for Automatic
Evaluation of Summaries. WAS 2004) compare an automatically produced
summary against a reference summary or a set of (human-produced)
reference summaries.
ROUGE-N: N-gram based co-occurrence statistics.
ROUGE-L: Longest Common Subsequence (LCS) based
statistics.
ROUGE-N(X) = ( Σ_{S ∈ RefSummaries} Σ_{gram_n ∈ S} Count_match(gram_n, X) ) / ( Σ_{S ∈ RefSummaries} Σ_{gram_n ∈ S} Count(gram_n) )
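A minimal ROUGE-N recall implementation following the formula above (a sketch; the official ROUGE package adds options such as stemming and stopword removal):

```python
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def rouge_n(candidate, references, n=1):
    """Matched reference n-grams / total reference n-grams (recall)."""
    cand = ngrams(candidate.split(), n)
    match = total = 0
    for ref in references:
        ref_counts = ngrams(ref.split(), n)
        match += sum(min(c, cand[g]) for g, c in ref_counts.items())
        total += sum(ref_counts.values())
    return match / total

print(rouge_n("russia calls for joint front against terrorism",
              ["russia calls for joint front for combating terrorism"]))  # 0.75
```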
Future Works
Evaluate the proposed approach by comparing it with the
state-of-the-art models.
Integrate relational semantic knowledge into RNNs in order
to jointly learn word and knowledge embeddings by exploiting
knowledge bases and lexical thesauri.
Generate abstractive summaries from whole documents or multiple
documents.
  • 18. Future Works Evaluate the proposed approach by comparing it with the SOA models. Integrate relational semantic knowledge into RNNs in order to learn jointly word and knowledge embeddings by exploiting knowledge bases and lexical thesaurus. Abstractive summaries from whole documents or multiple documents. Gaetano Rossiello, et al. Neural Abstractive Text Summarization