Text Summarization - Machine Learning
    TEXT SUMMARIZATION
    Kareem El-Sayed Hashem
    Mohamed Mohsen Brary
TEXT SUMMARIZATION
   Goal: use a computer program to reduce a text to a
    summary that retains the most important points of
    the original text.

   Summarization Applications
       Summaries of email threads
       Action items from a meeting
       Simplifying text by compressing sentences

WHAT TO SUMMARIZE?
SINGLE VS. MULTIPLE DOCUMENTS
   Single-Document Summarization
       Given a single document, produce:
           Abstract
           Outline
           Headline

   Multi-Document Summarization
       Given a group of documents, produce a gist of
        their content, for example:
           A series of news stories about the same event
           A set of web pages about some topic or question

QUERY-FOCUSED SUMMARIZATION
& GENERIC SUMMARIZATION
   Generic Summarization
       Summarize the content of a document

   Query-Focused Summarization
       Summarize a document with respect to an
        information need expressed in a user query
       A kind of complex question answering
           Answer a question by summarizing a document that has
            the information to construct the answer

SUMMARIZATION FOR QUESTION
ANSWERING
   Snippets
       Create snippets summarizing a web page for a query

   Multiple Documents
       Create answers to complex questions by summarizing
        multiple documents
           Instead of giving a snippet for each document,
            create a cohesive answer that combines information
            from each document

EXTRACTIVE SUMMARIZATION
& ABSTRACTIVE SUMMARIZATION
   Extractive Summarization
       Create the summary from phrases or sentences in the
        source document(s)

   Abstractive Summarization
       Express the ideas in the source document using
        different words

SUMMARIZATION: THREE STAGES
   Content Selection: choose sentences to extract
    from the document
   Information Ordering: choose an order in which to
    place them in the summary
   Sentence Realization: clean up the sentences

UNSUPERVISED CONTENT SELECTION
   Intuition dating back to Luhn (1958):
       Choose sentences that have distinctive or
        informative words

   Two approaches to defining distinctive words
       tf-idf: weigh each word w_i in document j by its
        tf-idf score
       Topic signature: choose a smaller set of distinctive
        words
           Log-likelihood ratio (LLR)

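As a rough illustration of the tf-idf option above, the sketch below weighs each word by its term frequency times its inverse document frequency. The helper name `tfidf_weights`, the whitespace tokenization, the toy documents, and the unsmoothed `log(N / df)` form of idf are assumptions made for illustration; the slides do not say which tf-idf variant is meant.

```python
import math
from collections import Counter

def tfidf_weights(docs):
    """Return one {word: tf-idf} dict per document.

    docs is a list of token lists. tf is the raw count of the word in the
    document; idf is log(N / df), where df is the number of documents that
    contain the word (an assumed, unsmoothed variant).
    """
    n_docs = len(docs)
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # count each word once per document
    weights = []
    for tokens in docs:
        tf = Counter(tokens)
        weights.append({w: tf[w] * math.log(n_docs / df[w]) for w in tf})
    return weights

# Toy corpus (hypothetical data).
docs = [
    "water spinach is a leaf vegetable".split(),
    "spinach is a green vegetable".split(),
    "the hajj is a type of ritual".split(),
]
weights = tfidf_weights(docs)
```

Words that occur in every document (here "is" and "a") get weight zero, while a word found in only one document ("water") outweighs one found in two ("spinach").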
TOPIC SIGNATURE-BASED CONTENT
SELECTION WITH QUERIES
   Choose words that are informative either
       By log-likelihood ratio (LLR)
       Or by appearing in the query

   Weigh a sentence by the weight of its words

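The formulas on this slide did not survive extraction, so the following is only a sketch of one common formulation, not necessarily the one the slide showed. It assumes a word gets weight 1 if it appears in the query or if its log-likelihood ratio (computed against a background corpus) exceeds a cutoff, and that a sentence is weighed by the average weight of its words. The function names, the cutoff of 10, and the counts in the usage lines are all assumptions.

```python
import math

def _log_l(p, k, n):
    """Binomial log-likelihood k*log(p) + (n-k)*log(1-p), treating 0*log(0) as 0."""
    total = 0.0
    if k > 0:
        total += k * math.log(p)
    if n - k > 0:
        total += (n - k) * math.log(1 - p)
    return total

def llr(k1, n1, k2, n2):
    """-2 log(lambda) for a word seen k1 times in n1 input tokens
    and k2 times in n2 background tokens.

    Note: the statistic is two-sided, so a word that is unusually rare
    in the input also scores high.
    """
    p1, p2 = k1 / n1, k2 / n2
    p = (k1 + k2) / (n1 + n2)
    return 2 * (_log_l(p1, k1, n1) + _log_l(p2, k2, n2)
                - _log_l(p, k1, n1) - _log_l(p, k2, n2))

def word_weight(word, llr_scores, query, threshold=10.0):
    """1 if the word is in the query or its LLR exceeds the threshold, else 0."""
    return 1 if word in query or llr_scores.get(word, 0.0) > threshold else 0

def sentence_weight(tokens, llr_scores, query, threshold=10.0):
    """Average word weight over the tokens of one sentence."""
    if not tokens:
        return 0.0
    return sum(word_weight(w, llr_scores, query, threshold) for w in tokens) / len(tokens)

# Hypothetical counts: word -> (count in input, count in background).
counts = {"spinach": (30, 5), "the": (60, 6000)}
n_input, n_background = 1000, 100000
llr_scores = {w: llr(k1, n_input, k2, n_background) for w, (k1, k2) in counts.items()}
score = sentence_weight(["the", "spinach", "grows"], llr_scores, query={"grows"})
```

In this toy setting "the" is equally frequent in both corpora and gets weight 0, "spinach" passes the cutoff, and "grows" counts because it is a query word, so the sentence weight is 2/3.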
SUPERVISED CONTENT SELECTION
   Given
       A labeled training set of good summaries for each
        document
   Align
       The sentences in the document with sentences in the
        summary
   Extract Features
       Position
       Length of sentence
       Word informativeness
       Cohesion

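The Align and Extract Features steps above can be sketched as follows. This is a minimal illustration under stated assumptions: alignment is approximated by word overlap between a document sentence and any summary sentence, and only two of the listed features (position and length) are computed. The function names, the 0.5 overlap cutoff, and the example sentences are hypothetical.

```python
def extract_features(sentences):
    """One feature dict per sentence: relative position (0 = first,
    1 = last) and length in whitespace-separated tokens."""
    n = len(sentences)
    return [
        {"position": i / max(n - 1, 1), "length": len(s.split())}
        for i, s in enumerate(sentences)
    ]

def align_labels(sentences, summary_sentences, min_overlap=0.5):
    """Label a document sentence 1 if at least min_overlap of its distinct
    words also occur in some summary sentence (a crude alignment heuristic)."""
    labels = []
    for s in sentences:
        words = set(s.lower().split())
        if not words:
            labels.append(0)
            continue
        best = max(
            (len(words & set(t.lower().split())) / len(words)
             for t in summary_sentences),
            default=0.0,
        )
        labels.append(1 if best >= min_overlap else 0)
    return labels

# Hypothetical document and human-written summary.
document = [
    "Water spinach is a leaf vegetable.",
    "It is grown in the tropics.",
    "Many people like soup.",
]
summary = ["Water spinach is a tropical leaf vegetable."]
features = extract_features(document)
labels = align_labels(document, summary)
```

The resulting feature dicts and 0/1 labels are the kind of training pairs a binary sentence classifier would consume.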
SUPERVISED CONTENT SELECTION
   Train
       A binary classifier (put sentence in summary? yes or
        no)
   Problems
       Hard to get labeled training data
       Alignment is difficult
       Performance is not better than unsupervised algorithms

EVALUATING SUMMARIES: ROUGE
   ROUGE: “Recall-Oriented Understudy for
    Gisting Evaluation”
   Intrinsic metric for automatically evaluating
    summaries
       Based on BLEU (a metric used for machine
        translation)
       Not as good as human evaluation
       But much more convenient

EVALUATING SUMMARIES: ROUGE
   Given a document D and an automatic
    summary X:
       Have N humans produce a set of reference
        summaries of D
       Run the system, giving automatic summary X
       What percentage of the bigrams from the reference
        summaries appear in X?

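The bigram question above is usually written as the formula below. This is the standard ROUGE-2 recall definition rather than something recovered from the slide, and the notation (RefSummaries, bigrams, count) is chosen here for illustration.

```latex
\[
\text{ROUGE-2} =
\frac{\displaystyle\sum_{S \in \text{RefSummaries}} \; \sum_{b \in \text{bigrams}(S)}
      \min\bigl(\text{count}(b, X),\, \text{count}(b, S)\bigr)}
     {\displaystyle\sum_{S \in \text{RefSummaries}} \; \sum_{b \in \text{bigrams}(S)}
      \text{count}(b, S)}
\]
```

Here $b$ ranges over the distinct bigrams of reference summary $S$, and $\text{count}(b, \cdot)$ is the number of times $b$ occurs in the given summary.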
EXAMPLE
   Human 1: water spinach is a green leafy
    vegetable grown in the tropics.
   Human 2: water spinach is a semi-aquatic
    tropical plant grown as a vegetable.
   Human 3: water spinach is a commonly eaten
    leaf vegetable of Asia.

   System: water spinach is a leaf vegetable
    commonly eaten in tropical areas of Asia.

   ROUGE-2 = 12/28 = 0.43

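The example can be checked with a short sketch of ROUGE-2 as bigram recall. The tokenization (lowercasing, dropping punctuation, keeping hyphenated words whole) and the function names are assumptions. Under this tokenization the three references contain 10 + 10 + 9 = 29 bigrams, 12 of which match the system summary, so the sketch yields 12/29 rather than the 12/28 shown on the slide; the numerator agrees, and the denominator depends on how the reference bigrams are counted.

```python
import re
from collections import Counter

def bigrams(text):
    """Lowercase the text, keep word tokens (hyphenated words stay whole),
    and count adjacent token pairs."""
    tokens = re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())
    return Counter(zip(tokens, tokens[1:]))

def rouge_2(references, system):
    """Return (matched, total, recall) over all reference bigrams."""
    sys_bigrams = bigrams(system)
    matched = total = 0
    for ref in references:
        ref_bigrams = bigrams(ref)
        total += sum(ref_bigrams.values())
        # Clip each match at the bigram's count in the system summary.
        matched += sum(min(c, sys_bigrams[b]) for b, c in ref_bigrams.items())
    return matched, total, matched / total

references = [
    "water spinach is a green leafy vegetable grown in the tropics.",
    "water spinach is a semi-aquatic tropical plant grown as a vegetable.",
    "water spinach is a commonly eaten leaf vegetable of Asia.",
]
system = "water spinach is a leaf vegetable commonly eaten in tropical areas of Asia."
matched, total, recall = rouge_2(references, system)
```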
ANSWERING HARDER QUESTIONS:
QUERY-FOCUSED MULTI-DOCUMENT
SUMMARIZATION
   The (bottom-up) snippet method
       Find a set of relevant documents
       Extract informative sentences from the documents
       Order and modify the sentences into an answer

   The (top-down) information extraction method
       Build specific answers for different question types:
           Definition questions
           Biography questions
           Certain medical questions

QUERY-FOCUSED MULTI-DOCUMENT
SUMMARIZATION




MAXIMAL MARGINAL RELEVANCE (MMR)
   An iterative method for content selection from
    multiple documents
   Iteratively (greedily) choose the best sentence to
    insert in the summary/answer so far:
       Relevant: maximally relevant to the user query
           High cosine similarity to the query
       Novel: minimally redundant with the summary so
        far
           Low cosine similarity to the summary
   Stop when the summary reaches the desired length

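A minimal sketch of the greedy MMR loop described above, under several assumptions: sentences and the query are represented as bag-of-words count vectors, redundancy is the maximum cosine similarity to any sentence already selected, the two criteria are combined linearly with a weight `lam`, and the desired length is given as a sentence budget. The names `cosine`, `mmr_select`, `lam`, and `max_sentences`, and the example sentences, are illustrative.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def mmr_select(sentences, query, max_sentences=2, lam=0.5):
    """Greedy MMR: at each step pick the sentence with the best trade-off
    between similarity to the query and similarity to what is selected."""
    vectors = [Counter(s.lower().split()) for s in sentences]
    q = Counter(query.lower().split())
    selected = []
    while len(selected) < min(max_sentences, len(sentences)):
        best, best_score = None, -math.inf
        for i, v in enumerate(vectors):
            if i in selected:
                continue
            redundancy = max((cosine(v, vectors[j]) for j in selected), default=0.0)
            score = lam * cosine(v, q) - (1 - lam) * redundancy
            if score > best_score:  # ties keep the earlier sentence
                best, best_score = i, score
        selected.append(best)
    return [sentences[i] for i in selected]

sentences = [
    "water spinach is a leaf vegetable",
    "water spinach is a leafy vegetable",
    "it is commonly eaten in asia",
]
summary = mmr_select(sentences, "water spinach vegetable asia", max_sentences=2)
```

The second sentence is nearly identical to the first, so after the first is chosen its redundancy penalty outweighs its relevance and the third sentence is selected instead.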
LLR + MMR: CHOOSING INFORMATIVE YET
NON-REDUNDANT SENTENCES
   One of many ways to combine the intuitions of
    LLR and MMR:
       Score each sentence based on LLR (including query
        words)
       Include the sentence with the highest score in the
        summary
       Iteratively add to the summary high-scoring
        sentences that are not redundant with the summary
        so far

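The combination above might look like the following sketch. It assumes the 0/1 word weights have already been derived (for example from an LLR cutoff plus the query words), scores a sentence by its average word weight, and treats a candidate as redundant when its Jaccard word overlap with any kept sentence reaches a cutoff. The Jaccard test is a swapped-in simplification rather than the cosine similarity used by MMR, and all names and values are hypothetical.

```python
def select_sentences(sentences, word_weights, max_sentences=2, max_overlap=0.5):
    """Rank sentences by average word weight, then greedily keep those whose
    word overlap with every kept sentence stays below max_overlap."""

    def score(sentence):
        tokens = sentence.lower().split()
        if not tokens:
            return 0.0
        return sum(word_weights.get(w, 0) for w in tokens) / len(tokens)

    def jaccard(a, b):
        a, b = set(a.lower().split()), set(b.lower().split())
        return len(a & b) / len(a | b) if a | b else 0.0

    summary = []
    # sorted() is stable, so equally scored sentences keep document order.
    for candidate in sorted(sentences, key=score, reverse=True):
        if len(summary) == max_sentences:
            break
        if all(jaccard(candidate, kept) < max_overlap for kept in summary):
            summary.append(candidate)
    return summary

# Hypothetical 0/1 weights for topic-signature and query words.
word_weights = {"water": 1, "spinach": 1, "vegetable": 1, "asia": 1}
sentences = [
    "water spinach is a leaf vegetable",
    "water spinach is a leafy vegetable",
    "it is commonly eaten in asia",
]
summary = select_sentences(sentences, word_weights)
```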
INFORMATION ORDERING
   Chronological ordering:
       Order sentences by the date of the document (for
        summarizing news)
   Coherence:
       Choose an ordering that makes neighboring sentences
        similar (by cosine similarity)
       Choose an ordering in which neighboring sentences
        discuss the same entity
   Topical ordering:
       Learn the ordering of topics in the source documents

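Of the three strategies above, chronological ordering is the simplest to sketch. The example below assumes each selected sentence carries the date of its source document and its position within that document, and it breaks date ties by position; the tuple layout, function name, and sample sentences are hypothetical.

```python
from datetime import date

def chronological_order(selected):
    """Sort (document_date, position_in_document, sentence) tuples by date,
    breaking ties by position within the source document."""
    ordered = sorted(selected, key=lambda item: (item[0], item[1]))
    return [sentence for _, _, sentence in ordered]

# Hypothetical sentences selected from two news stories.
selected = [
    (date(2024, 3, 2), 0, "Officials confirmed the damage on Saturday."),
    (date(2024, 3, 1), 4, "Rescue teams arrived within hours."),
    (date(2024, 3, 1), 0, "A storm hit the coast on Friday."),
]
ordered = chronological_order(selected)
```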
DOMAIN-SPECIFIC ANSWERING:
THE INFORMATION EXTRACTION METHOD
   A good biography of a person contains:
       The person’s birth/death, fame factor, education, etc.
   A good definition contains:
       Type or category: “The Hajj is a type of ritual”
   A medical answer about a drug’s use contains:
       The problem: medical condition
       The intervention: drug or procedure
       The outcome: the result of the study

INFORMATION THAT SHOULD BE IN THE
ANSWER FOR 3 KINDS OF QUESTIONS




ARCHITECTURE FOR ANSWERING COMPLEX
QUESTIONS




REFERENCES:
   NLP Stanford course.

