SummaRuNNer
Ramesh Nallapati, Feifei Zhai, Bowen Zhou
Presented by:
Sharath T.S
Shubhangi Tandon
Contributions of this paper
● SummaRuNNer, a simple recurrent-network-based sequence classifier
that outperforms or matches state-of-the-art models for extractive
summarization
● The simple formulation of the model facilitates interpretable visualization of
its decisions
● A novel training mechanism that allows the extractive model to be trained
end-to-end using abstractive summaries
SummaRuNNer
● Treat extractive summarization as a sequence classification problem
● Each sentence is visited sequentially in the original document order
● A binary decision is made for each sentence (taking into account previous decisions)
● A GRU-based RNN is the basic building block of the sequence classifier
● The GRU is a recurrent unit with two gates: u, the update gate, and r, the reset gate
Recurrent neural networks
LSTMs:
● Input gate: Decides how much of the new input flowing into the LSTM
cell is written to the cell state.
LSTMs - Continued
● Forget gate and cell-state update: Decides how much of the current cell state to forget,
then adds the new information to the cell state.
LSTMs - Continued
● Output gate: Evaluates the new cell state and decides which parts of the
information are output.
Refer: http://colah.github.io/posts/2015-08-Understanding-LSTMs/
GRUs
Modifications compared to LSTMs:
● It combines the forget (f) and input (i) gates into a single update gate.
● It merges the cell state and hidden state into one state.
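To make the two gates concrete, here is a minimal pure-Python sketch of a single GRU step. The parameter names (`W_ux`, `W_uh`, and so on), the dictionary layout, and the convention that the update gate `u` weights the previous state are illustrative assumptions, not something taken from the slides.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def gate(w_x, w_h, bias, x, h, activation):
    """Element-wise activation(W_x x + W_h h + b)."""
    return [activation(a + b + c)
            for a, b, c in zip(matvec(w_x, x), matvec(w_h, h), bias)]

def gru_step(x, h_prev, p):
    """One GRU step; p is a dict of hypothetical parameter names."""
    u = gate(p["W_ux"], p["W_uh"], p["b_u"], x, h_prev, sigmoid)  # update gate
    r = gate(p["W_rx"], p["W_rh"], p["b_r"], x, h_prev, sigmoid)  # reset gate
    reset_h = [ri * hi for ri, hi in zip(r, h_prev)]
    h_cand = gate(p["W_hx"], p["W_hh"], p["b_h"], x, reset_h, math.tanh)
    # Single merged state: interpolate between the candidate and the previous state.
    return [(1.0 - ui) * ci + ui * hi for ui, ci, hi in zip(u, h_cand, h_prev)]
```

Note that there is no separate cell state: the one vector returned by `gru_step` plays both roles, which is the merge described in the second bullet.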
The Model
SummaRuNNer
Model:
● Two-layer bi-directional GRU-RNN: the first layer of the RNN runs at the word level and computes
a hidden state representation at each word position. A second word-level RNN runs
backwards from the last word to the first.
● The second layer is a bi-directional RNN that runs at the sentence level and accepts the average-pooled,
concatenated hidden states of the word-level RNNs as input.
● Document representation: a non-linear transformation of the average-pooled, concatenated
hidden states of the sentence-level bi-directional RNN.
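The pooling step between the two layers can be sketched as below. This is only an illustration: the function names are hypothetical, hidden states are plain Python lists, and the backward states are assumed to be already aligned with word positions.

```python
def average_pool(states):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(states)
    return [sum(column) / n for column in zip(*states)]

def sentence_input(forward_states, backward_states):
    """Concatenate forward and backward word states per position, then average-pool.

    The result is the vector fed to the sentence-level RNN for this sentence.
    """
    concatenated = [f + b for f, b in zip(forward_states, backward_states)]
    return average_pool(concatenated)
```

The same pooling, applied to the sentence-level states and followed by a non-linear layer, would give a document representation.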
Computing Posterior - Logistic loss
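[Equation image not recovered; Equation (7) is referenced on the decoder slide as the summary representation.]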
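For reference, the posterior, the running summary representation and the logistic loss can be written as follows. This is a reconstruction from the SummaRuNNer paper as best recalled, not recovered from the slide, so the exact symbols should be checked against the paper. Here $h_j$ is the sentence-level hidden state, $d$ the document representation, $s_j$ the running summary representation, and $p^a_j$, $p^r_j$ the absolute and relative position embeddings.

```latex
\begin{align*}
P(y_j = 1 \mid h_j, s_j, d) &= \sigma\bigl( W_c h_j + h_j^{\top} W_s d
    - h_j^{\top} W_r \tanh(s_j) + W_{ap}\, p^a_j + W_{rp}\, p^r_j + b \bigr) \\
s_j &= \sum_{i=1}^{j-1} h_i \, P(y_i = 1 \mid h_i, s_i, d) \tag{7} \\
l(W, b) &= -\sum_{d=1}^{N} \sum_{j=1}^{N_d} \Bigl( y_j^d \log P(y_j^d = 1 \mid h_j^d, s_j^d, d_d)
    + (1 - y_j^d) \log\bigl( 1 - P(y_j^d = 1 \mid h_j^d, s_j^d, d_d) \bigr) \Bigr)
\end{align*}
```

The terms inside the sigmoid correspond to content, salience with respect to the document, novelty with respect to the summary so far (hence the minus sign), and the two position features.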
Extractive Summary labels - Greedy Algorithm
Why is it needed?
● Most summarization corpora contain only human-written abstractive
summaries as ground truth, so sentence-level extractive labels have to be derived.
● Algorithm
○ Add sentences to the summary one at a time, each time choosing the sentence that maximizes the ROUGE
score of the selected set with respect to the gold summary.
○ Stop when none of the remaining candidate sentences improves the ROUGE score when added.
● Train the network on the resulting labelled data.
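The greedy labelling procedure can be sketched as follows. This is a minimal illustration under stated assumptions: `rouge1_f1` is a simplified unigram-overlap F1 standing in for a real ROUGE scorer, tokenization is plain whitespace splitting, and both function names are hypothetical.

```python
from collections import Counter

def rouge1_f1(candidate_tokens, reference_tokens):
    """Unigram-overlap F1, a simplified stand-in for a real ROUGE scorer."""
    if not candidate_tokens or not reference_tokens:
        return 0.0
    overlap = sum((Counter(candidate_tokens) & Counter(reference_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(candidate_tokens)
    recall = overlap / len(reference_tokens)
    return 2 * precision * recall / (precision + recall)

def greedy_extractive_labels(sentences, reference):
    """Return a 0/1 label per sentence, chosen greedily to maximise the score."""
    ref_tokens = reference.lower().split()
    sent_tokens = [s.lower().split() for s in sentences]
    selected, best_score = [], 0.0
    while True:
        best_idx = None
        for i in range(len(sentences)):
            if i in selected:
                continue
            # Score the candidate set with sentences kept in document order.
            candidate = [t for k in sorted(selected + [i]) for t in sent_tokens[k]]
            score = rouge1_f1(candidate, ref_tokens)
            if score > best_score:
                best_idx, best_score = i, score
        if best_idx is None:  # no remaining sentence improves the score
            break
        selected.append(best_idx)
    return [1 if i in selected else 0 for i in range(len(sentences))]

labels = greedy_extractive_labels(
    ["the cat sat on the mat", "stocks fell sharply today", "a dog barked"],
    "the cat sat on the mat while a dog barked",
)
print(labels)  # [1, 0, 1]
```

The greedy search avoids scoring every subset of sentences, at the cost of possibly missing the best subset.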
Abstractive training - Decoder
● In addition to the sigmoid layer that computes the class each sentence belongs to,
an RNN decoder is attached during training that does the following:
○ Takes x_k, the embedding of the word from the previous time step, and the previous hidden state as
input, together with s_{-1}, the summary representation computed at the last sentence of the
sentence-level RNN (Equation 7).
○ Computes a softmax over the vocabulary to output the most probable word.
○ Maximizes the log-likelihood of the words in the abstractive summaries (context is captured by the
RNN).
○ At test time, the decoder is dropped and sentences are classified using the learned weights W alone.
Decoder - Continued
How does it work?
● The summary representation s_{-1} acts as an information channel between
the SummaRuNNer model and the decoder.
● Maximizing the probability of abstractive summary words as computed by
the decoder will require the model to learn a good summary
representation, which in turn depends on accurate estimates of the extractive
probabilities p(y_j).
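A small sketch of why the summary representation depends on the extractive probabilities: assuming, as in the paper's formulation as best recalled, that s_{-1} is the sum of sentence hidden states weighted by their extraction probabilities, it can be computed as below. The function name and the list-based vectors are illustrative choices.

```python
def summary_representation(hidden_states, extract_probs):
    """Probability-weighted sum of sentence hidden states (the vector s_{-1})."""
    dim = len(hidden_states[0])
    summary = [0.0] * dim
    for h, p in zip(hidden_states, extract_probs):
        for k in range(dim):
            summary[k] += p * h[k]
    return summary
```

Because every p(y_j) enters this sum, gradients from the decoder's word-level loss flow back into the extractive classifier, which is what lets the abstractive summaries supervise it.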
SummaRuNNer Visualisation
Corpus used
● Daily Mail (Cheng & Lapata): 200k training, 12k validation, 10k test
● Daily Mail/CNN (Nallapati): 286k training, 13k validation, 11k test
● DUC 2002: 567 documents (out-of-domain testing)
● Average statistics
○ 28 sentences/doc
○ 3-4 sentences in reference summary
○ 802 words/doc
● Training data constraints
○ Vocabulary size: 150k
○ Maximum sentences/doc: 100
○ Maximum sentence length: 50 words
○ Model hidden state size: 200
○ Batch size: 64
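One way to apply the two length limits during preprocessing is shown below. The slide gives only the limits, so clipping (rather than, say, discarding over-long documents) is an assumption, and the function and constant names are hypothetical.

```python
MAX_SENTENCES_PER_DOC = 100   # limit stated on the slide
MAX_WORDS_PER_SENTENCE = 50   # limit stated on the slide

def truncate_document(sentences,
                      max_sentences=MAX_SENTENCES_PER_DOC,
                      max_words=MAX_WORDS_PER_SENTENCE):
    """Clip a tokenised document (a list of token lists) to the size limits."""
    return [tokens[:max_words] for tokens in sentences[:max_sentences]]
```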
Experiments and Results : Daily Mail Corpus
Experiments and Results : Daily Mail /CNN data
Experiments and Results : DUC 2002 data
Future Work
● Pre-train the extractive model using abstractive training
● Construct a joint extractive-abstractive model where the predictions of the
extractive component form stochastic intermediate units to be consumed
by the abstractive component.

SummaRuNNer: A Recurrent Neural Network based Sequence Model for Extractive Summarization of Documents - Ramesh Nallapati, Feifei Zhai, Bowen Zhou
