Summarization has emerged as an increasingly useful approach to the problem of information overload. Extracting information from online conversations can be of considerable commercial and educational value. However, most of this information is present as noisy, unstructured text, which makes traditional document summarization techniques difficult to apply. In this project, we propose a novel approach to the problem of conversation summarization. We develop an automatic text summarizer that extracts sentences from email conversations to form a summary. Our approach consists of three phases. In the first phase, we prepare the dataset by correcting spelling and segmenting the text. In the second phase, we represent each sentence by a set of predefined features. Finally, in the third phase, we use a machine learning algorithm to train the summarizer on the resulting set of feature vectors. We also developed an interface that takes as input the document to be summarized and returns an extractive summary.
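The three phases can be illustrated with a minimal sketch. It is not the project's actual implementation: the abstract does not name the features or the learning algorithm, so the three features used here (sentence position, relative length, mean word frequency) and the choice of logistic regression are assumptions made purely for illustration, as are all function names. Spelling correction from the first phase is omitted.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Phase 1 (simplified): naive segmentation on ., ! or ? plus whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def sentence_features(sentences):
    """Phase 2: one feature vector per sentence (hypothetical features)."""
    words = [re.findall(r"[a-z']+", s.lower()) for s in sentences]
    doc_counts = Counter(w for ws in words for w in ws)
    total = sum(doc_counts.values()) or 1
    max_len = max((len(ws) for ws in words), default=1) or 1
    n = len(sentences)
    vectors = []
    for i, ws in enumerate(words):
        position = 1.0 - i / n       # earlier sentences get a higher value
        length = len(ws) / max_len   # length relative to the longest sentence
        # mean relative document frequency of the words in this sentence
        freq = sum(doc_counts[w] for w in ws) / (len(ws) * total) if ws else 0.0
        vectors.append([1.0, position, length, freq])  # leading 1.0 is the bias
    return vectors


def _sigmoid(z):
    z = max(-60.0, min(60.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, epochs=500, lr=0.5):
    """Phase 3: logistic regression fitted by stochastic gradient ascent.

    y[i] is 1 if sentence i belongs in the reference summary, else 0.
    """
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        for x, label in zip(X, y):
            p = _sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
            w = [wi + lr * (label - p) * xi for wi, xi in zip(w, x)]
    return w


def summarize(text, weights, k=2):
    """Return the k highest-scoring sentences, in their original order."""
    sentences = split_sentences(text)
    X = sentence_features(sentences)
    scores = [sum(wi * xi for wi, xi in zip(weights, x)) for x in X]
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

In use, `train_logistic` would be fitted on feature vectors from conversations whose summary sentences have been labeled by hand, and `summarize` would then be called on an unseen email thread with the learned weights. Keeping the selected sentences in their original order preserves the flow of the conversation in the extractive summary.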