Query-Based Summarization, Mariana Damova, 30.07.2010
Outline
The task of query-based summarization
Types of summaries
Steps in the query-based summarization process
Evaluation of DUC
Approaches based on Document graphs
Approaches based on Document graphs (continued)
Approaches using linguistics
Approaches using linguistics (continued)
Machine-learning approaches
Machine-learning approaches (continued)
Application Tailored Systems
Medical Information Summarization System
Opinion summarization
Conclusion

Editor's Notes

  1. Automatic summarization is the process by which the information in a source text is expressed in a more concise fashion, with a minimal loss of information.
  2. Abstractive summaries generate new text from the important parts of the documents; extractive summaries identify important sections of the text and use them in the summary as they are. Single-document summaries represent a single document. Multi-document summaries are produced from multiple documents and have to deal with three major problems: recognizing and coping with redundancy; identifying important differences among documents; and ensuring summary coherence. Generic summaries present the main topics of a given text in a concise manner; query-based summaries are constructed as an answer to an information need expressed by a user's query. Indicative summaries point to information in the document, which helps the user decide whether the document should be read; informative summaries provide all the relevant information needed to represent the original document.
  3. ROUGE is based on MT evaluation. In this approach, human-made summaries are compared with automatic summaries using n-gram co-occurrence statistics. Gisting: the gist is "the choicest or most essential or most vital part of some idea or experience". The product of machine translation is sometimes called a "gisting translation": MT will often produce only a rough translation that will at best allow the reader to "get the gist" of the source text, but is unlikely to convey a complete understanding of it. To evaluate system performance, the NIST assessors who created the "ideal" written summaries did pairwise comparisons of their summaries to the system-generated summaries, other assessors' summaries, and baseline summaries. They used the Summary Evaluation Environment. DUC evaluation: sets of documents with their human-made summaries are provided, together with sets of unseen documents; the ideal summary is created; the summaries are compared pairwise. Recall at different compression ratios has been used in summarization research to measure how well an automatic system retains the important content of the original documents. However, the simple sentence recall measure cannot differentiate system performance appropriately. Instead of a pure sentence recall score, we use the coverage score C.
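The n-gram co-occurrence comparison mentioned in this note can be illustrated with a minimal sketch of a ROUGE-N style recall. This is not the official ROUGE toolkit: whitespace tokenization, lowercasing, and the function names `ngrams` and `rouge_n_recall` are assumptions made for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter (multiset) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, references, n=2):
    """Clipped n-gram matches with the references, divided by the
    total number of n-grams in the references."""
    cand = ngrams(candidate.lower().split(), n)
    matched = 0
    total = 0
    for ref in references:
        ref_counts = ngrams(ref.lower().split(), n)
        # A reference n-gram counts as matched at most as often as
        # it occurs in the candidate summary.
        matched += sum(min(count, cand[gram]) for gram, count in ref_counts.items())
        total += sum(ref_counts.values())
    return matched / total if total else 0.0


# Hypothetical human-made and system summaries.
reference = "the cat sat on the mat"
system = "the cat lay on the mat"
unigram_recall = rouge_n_recall(system, [reference], n=1)
bigram_recall = rouge_n_recall(system, [reference], n=2)
```

Here five of the six reference unigrams and three of the five reference bigrams are found in the system summary, so the recall drops as n grows.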
  4. RST (Rhetorical Structure Theory): a text has a kind of unity that arbitrary collections of sentences generally lack. RST offers an explanation of the coherence of texts. For every part of a coherent text, there is some function, some plausible reason for its presence, evident to readers. RST is intended to describe texts. It posits a range of possibilities of structure: various sorts of "building blocks" which can be observed to occur in texts. These "blocks" are at two levels, the principal one dealing with "nuclearity" and "relations" (often called coherence relations in the linguistic literature).
  5. A hidden Markov model (HMM) is a statistical model in which the system being modeled is assumed to be a Markov process with unobserved state. An HMM can be considered the simplest dynamic Bayesian network. In a regular Markov model, the state is directly visible to the observer, and therefore the state transition probabilities are the only parameters. In a hidden Markov model, the state is not directly visible, but output, dependent on the state, is visible. Each state has a probability distribution over the possible output tokens.
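The pieces named in this note (hidden states, transition probabilities, and a per-state distribution over output tokens) can be made concrete with a small sketch of the forward algorithm, which computes the probability of an observed sequence. The two-state model below, with states "in" and "out" of a summary and all of its probabilities, is a hypothetical toy and not taken from any system in the slides.

```python
def forward_probability(observations, states, start_p, trans_p, emit_p):
    """Probability of an observation sequence under an HMM (forward algorithm)."""
    # alpha[s] = P(observations seen so far, current hidden state is s)
    alpha = {s: start_p[s] * emit_p[s][observations[0]] for s in states}
    for obs in observations[1:]:
        alpha = {
            s: emit_p[s][obs] * sum(alpha[prev] * trans_p[prev][s] for prev in states)
            for s in states
        }
    return sum(alpha.values())


# Hypothetical model: the hidden state says whether a sentence belongs
# in the summary; the visible output says whether it contains a query term.
states = ("in", "out")
start_p = {"in": 0.5, "out": 0.5}
trans_p = {
    "in": {"in": 0.7, "out": 0.3},
    "out": {"in": 0.4, "out": 0.6},
}
emit_p = {
    "in": {"query_term": 0.8, "no_query_term": 0.2},
    "out": {"query_term": 0.1, "no_query_term": 0.9},
}
likelihood = forward_probability(
    ["query_term", "no_query_term"], states, start_p, trans_p, emit_p
)
```

The observer only ever sees the output tokens; the forward pass sums over every hidden state path that could have produced them.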
  6. Redundancy removal: this identifies any repetition of information in the source (input) texts, thus minimising redundant or repetitive content in the final summary. Definition of basic elements: (a) the head of a major syntactic constituent, expressed as a single item; (b) a relation between a head basic element and a single dependent, expressed as a triple (head | modifier | relation). Basic elements can be created by using a parser to produce a syntactic parse tree and a set of ‘cutting rules’ to extract just the valid basic elements from the tree. With basic elements represented as triples, one can quite easily decide whether any two units match or not. The query-based basic elements summarizer includes four major stages: (a) interpret the query; (b) identify important basic elements; (c) identify important sentences; (d) generate summaries.
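The claim that triples make matching easy can be shown with a short sketch. It assumes the basic elements have already been produced by a parser and cutting rules, so the triples below are written by hand; the class name `BasicElement`, the relation label "mod", and the exact-match criterion after lowercasing are illustrative assumptions.

```python
from typing import NamedTuple


class BasicElement(NamedTuple):
    """A (head | modifier | relation) triple."""
    head: str
    modifier: str
    relation: str


def matches(a, b):
    """Two basic elements match when all three fields agree, ignoring case."""
    return tuple(f.lower() for f in a) == tuple(f.lower() for f in b)


def sentence_score(sentence_bes, query_bes):
    """Count the basic elements of a sentence that match some query basic element."""
    return sum(any(matches(be, q) for q in query_bes) for be in sentence_bes)


# Hand-written triples standing in for parser output (hypothetical).
query_bes = [BasicElement("summarization", "query-based", "mod")]
sentence_bes = [
    BasicElement("Summarization", "query-based", "mod"),
    BasicElement("system", "fast", "mod"),
]
score = sentence_score(sentence_bes, query_bes)
```

A score of this kind could feed the "identify important sentences" stage, although the actual weighting used by the summarizer is not given in the notes.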
  7. "Capturing Sentence Prior for Query-Based Multi-Document Summarization" generates a fixed-length multi-document summary that satisfies a specific information need by topic-oriented informative multi-document summarization. Information retrieval techniques have been explored to improve the relevance scoring of sentences with respect to the information need. A measure that captures the notion of importance, or prior, of a sentence has been identified. Following the Probability Ranking Principle, the calculated importance/prior is incorporated into the final sentence scoring by weighted linear combination. The system outperformed all systems at the DUC 2006 challenge in terms of ROUGE scores, with a significant margin over the next best system. The information need, or topic, consists of two main components: the first is the title of the topic, the second is the actual information need, expressed as multiple questions. In this approach, information retrieval techniques are combined with summarization techniques to produce the extracts. The system consists of the following stages: information need enrichment, content selection and summary generation. Using the conditional sampling assumption that the query words are independent of each other while their dependencies on w are kept intact, it computes the required joint probability. Most current query-based summarization systems concentrate only on features that measure the relevance of sentences to the query. They do not explicitly attempt to capture the centrality/prior knowledge carried by a sentence pertaining to a domain. The approach defines a new measure which captures sentence importance based on the distribution of its constituent words in the domain corpus. An entropy measure is used to compute the information content of a sentence based on a unigram model learned from the document corpus.
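A rough sketch of the two ideas in this note, an entropy-style sentence prior from a unigram model and its weighted linear combination with a relevance score, is given below. The notes do not state the exact formulas, so the add-one smoothing, the per-word averaging, the weight of 0.7, and all function names are assumptions made for illustration only.

```python
import math
from collections import Counter


def unigram_model(corpus_sentences):
    """Learn a smoothed unigram probability function from a corpus."""
    counts = Counter(w for s in corpus_sentences for w in s.lower().split())
    total = sum(counts.values())
    vocab = len(counts)
    # Add-one smoothing, with one extra slot reserved for unseen words.
    return lambda word: (counts[word] + 1) / (total + vocab + 1)


def information_content(sentence, prob):
    """Average negative log2 probability of the words in a sentence."""
    words = sentence.lower().split()
    if not words:
        return 0.0
    return -sum(math.log2(prob(w)) for w in words) / len(words)


def combined_score(relevance, prior, weight=0.7):
    """Weighted linear combination of query relevance and sentence prior."""
    return weight * relevance + (1 - weight) * prior


# Hypothetical domain corpus.
prob = unigram_model(["the storm hit the coast", "the storm passed"])
common = information_content("the", prob)
rare = information_content("coast", prob)
```

Under this sketch, a sentence made of words that are rare in the domain corpus carries more information than one made of frequent words; how the original system maps that quantity onto a prior is not specified in the notes.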
  8. FastSum is based solely on word-frequency features of clusters, documents and topics. Summary sentences are ranked by a regression SVM. The summarizer does not use any expensive NLP techniques. Because of a detailed feature analysis using Least Angle Regression, FastSum can rely on a minimal set of features, leading to fast processing times, e.g. 1250 news documents per 60 seconds. The method only involves sentence splitting, filtering candidate sentences and computing the word frequencies in the documents of a cluster, the topic description and the topic title. A machine learning technique called regression SVM is used, and for feature selection a new model selection technique is adopted, called Least Angle Regression (LARS). The focus is on selecting a minimal set of features that are computationally less expensive than other features. The approach ranks all sentences in the topic cluster for summarizability. Features are mainly based on the frequencies of words in the clusters, documents and topics. A cluster contains 25 documents and is associated with a topic. The topic contains a topic title and a topic description. The topic title is a list of key words or phrases describing the topic; the topic description contains the actual query or queries. The features used are word-based and sentence-based. Word-based features are computed from the probability of words in the different containers. Sentence-based features include the length and position of the sentence in the document.
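The word-based and sentence-based features described in this note can be sketched as follows. This covers feature extraction only; the regression SVM and LARS steps are left out. Averaging the word probabilities per sentence, the container names, and the function names are assumptions, since the notes do not give the exact feature definitions.

```python
from collections import Counter


def word_probs(text):
    """Relative frequency of each word in a container (title, cluster, ...)."""
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def sentence_features(sentence, position, containers):
    """Word-based features (one per container) plus length and position."""
    words = sentence.lower().split()
    features = {}
    for name, probs in containers.items():
        # Average probability of the sentence's words in this container.
        features[name] = (
            sum(probs.get(w, 0.0) for w in words) / len(words) if words else 0.0
        )
    features["length"] = len(words)
    features["position"] = position
    return features


# Hypothetical topic and a two-document cluster.
documents = [
    "The hurricane caused severe damage",
    "Residents fled before the hurricane arrived",
]
containers = {
    "title": word_probs("hurricane damage"),
    "description": word_probs("what damage did the hurricane cause"),
    "cluster": word_probs(" ".join(documents)),
}
features = sentence_features(documents[0], position=0, containers=containers)
```

Each sentence becomes a small numeric vector of this kind, which is the sort of input a regression model could then rank.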
  9. A Query-Based Medical Information Summarization System Using Ontology Knowledge proposes a technique using UMLS (Unified Medical Language System), an ontology from the National Library of Medicine. The ontology-based approach performs clearly better than the keyword-only approach. A general web search engine tries to serve as an information access agent: it retrieves and ranks information according to a user's query. The document summarization system presented here is specialized for the medical domain and retrieves and summarizes up-to-date medical information from trustworthy online sources according to users' queries. In a web context, the summaries a user wants need to be generated on the fly based on the query keywords. The summarization algorithm is term-based, i.e. only terms defined in UMLS are recognized and processed. The summarization procedure is as follows: (a) revise the query with UMLS ontology knowledge; (b) calculate the distance of each sentence in the document to the finalized query. The distance function is a metric satisfying d(x,x)=0, symmetry, and the triangle inequality. If the distance is smaller than the threshold, the sentence becomes a candidate for inclusion in the summary; (c) calculate pair-wise distances among the candidate sentences, then divide the candidate sentences into groups based on a threshold and select the highest-ranked one from each group. To determine which sentences will be included in the summary, three different "scores" are generated: (i) simply count the number of matched original keywords and select the sentences with many matching keywords; (ii) if a sentence contains an original keyword, assign weight 1 to it. 
If a sentence contains an expanded keyword, assign weight 0.5 to this keyword. Add all the weights together to get the score for each sentence, and select the sentences with high scores; (iii) after the scores are obtained, normalize each score by the length of the sentence, and select the sentences with high normalized scores.
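The candidate filtering, grouping and keyword scoring described above can be sketched as follows. The source does not name the distance function, so Jaccard distance over term sets is used here purely as an assumption (it satisfies the stated metric properties). The greedy grouping strategy, the per-occurrence weighting and all function and parameter names are likewise hypothetical.

```python
def jaccard_distance(a, b):
    """Jaccard distance between two term collections (assumed metric)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


def weighted_score(sentence, original, expanded):
    """Weight 1 per original keyword, 0.5 per expanded keyword."""
    return sum(
        1.0 if t in original else 0.5 if t in expanded else 0.0
        for t in sentence
    )


def normalized_score(sentence, original, expanded):
    """Weighted score divided by sentence length."""
    if not sentence:
        return 0.0
    return weighted_score(sentence, original, expanded) / len(sentence)


def summarize(sentences, original, expanded, query_threshold, group_threshold):
    """Select one best-scoring sentence per group of similar candidates."""
    query = set(original) | set(expanded)
    # Step (b): keep sentences close enough to the revised query.
    candidates = [
        s for s in sentences if jaccard_distance(s, query) < query_threshold
    ]
    # Step (c): greedy grouping against each group's first member (assumption).
    groups = []
    for s in candidates:
        for group in groups:
            if jaccard_distance(s, group[0]) < group_threshold:
                group.append(s)
                break
        else:
            groups.append([s])
    return [
        max(g, key=lambda s: normalized_score(s, original, expanded))
        for g in groups
    ]
```

Normalizing by length keeps long sentences from winning simply because they have more room for keywords, which is why the third score can prefer a short, dense sentence over a longer one in the same group.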
  10. Each summary has to address a set of complex questions about the target, where the questions cannot be answered simply with a named entity. The input to the summarization task comprises a target, some opinion-related questions about the target, and a set of documents that contain answers to the questions. The output is a summary for each target that summarizes the answers to the questions. It has been found that users have a strong preference for summarizers that model sentiment over non-sentiment baselines. A filtering component identifies sentences that are unlikely to be in a good summary. Another filter is concerned with the sentiment of a sentence. The system performs the following steps: A. Preprocessing, B. Question sentiment and target analyzer, C. Filtering (C1. Sentiment tagger, C2. Target overlap), D. Feature extraction, E. Sentence ranker, F. Redundancy removal. Several preprocessing steps take place before Web-based blog entries are introduced to the FastSum engine. These include translating the original legal opinion topics into queries and identifying any target entities or concepts within those queries, running the queries through the blog search engine and aggregating the top-ranked results, and passing those results through a "marginal relevance filter" to ensure that the entries serving as FastSum input data surpass a minimum relevance criterion.