Comments-Oriented Blog Summarization
Motivation
• Comments left by readers on Web documents contain valuable information
that can be used in information retrieval tasks such as document search,
visualization, and summarization.
• In this project we aim to summarize a Web document (e.g. a blog post) by
considering the comments left by its readers.
• Web documents are now presented with annotations from their readers in
the form of tags, comments, ratings, and more.
• These annotations are valuable input from users and can be used in
different IR tasks.
• By considering the comments, the generated summary can better capture the
input from the readers, rather than reflecting only the author of the
document.
• A comments-oriented summary provides a balanced view from both the author
and the readers.
Introduction
Problem Statement
Given a blog post consisting of a set of sentences P = {s1, s2, ..., sn} and
the set of comments C = {c1, c2, ..., cm} associated with that post, the task of
comments-oriented blog summarization is to extract a subset of sentences from P,
denoted by Sr (Sr ⊂ P), that best represents the discussion in C.
Solution
• Score blog sentences by their similarity to the top-scored relevant
comments.
• Comments are scored using an RQT graph/tensor-based approach and a
Named Entity Similarity score.
Abstract Solution
Comment Oriented Blog Summarization
Approach
• In summary generation it is important to retrieve the relevant comments.
• A comment is relevant if it reflects the topic discussed in the blog post
or has many replies.
• Each comment is scored using the RQT model and the Named Entity Similarity
model, and the top-scoring comments are selected as relevant comments.
• The similarity score of each sentence is the sum of the cosine
similarities between that sentence and the top comments.
• The top-scored sentences are the ones picked up by most commenters and
are therefore relevant for the summary.
RQT Model
• Three factors determine the RQT score (Rc) of a comment:
  - Response Count (Cr): number of replies to the comment.
  - Topic-Related Cluster Count (Ct): obtained by clustering the comments using cosine similarity.
  - Quotation Count (Cq): number of times the comment is quoted in other comments.
Rc = Cr + Ct + Cq
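As a rough illustration of the RQT sum, here is a minimal Python sketch. It assumes the three counts have already been computed for each comment; the `CommentCounts` class and its field names are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass


@dataclass
class CommentCounts:
    """Per-comment counts. Field names are illustrative assumptions."""
    replies: int        # Cr: number of replies to the comment
    topic_cluster: int  # Ct: topic-related cluster count
    quotations: int     # Cq: number of times other comments quote it


def rqt_score(counts: CommentCounts) -> int:
    """Rc = Cr + Ct + Cq, an unweighted sum as written on the slide."""
    return counts.replies + counts.topic_cluster + counts.quotations


example = CommentCounts(replies=3, topic_cluster=2, quotations=1)
print(rqt_score(example))  # 6
```

The sum is unweighted here because the slide gives no weights; a weighted variant would be a separate design decision.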
Additional Factors
• Likes Count (Cl): Our dataset comes from TechCrunch.com, where people
comment using Facebook. Likes on a comment significantly increase its
relevance, so the number of likes (Cl) is added to the RQT score.
Rc = Cr + Ct + Cq + Cl
• Named Entity Similarity: Named entities in a comment are identified with
the Stanford POS tagger, and the named entity score (Ec) is the number of
named entities in the comment.
The final score for each comment is calculated as
Score(C) = Rc + Ec
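The combined comment score can be sketched as follows. This is only an illustration under stated assumptions: the named entities are taken as an already-extracted list (the tagging step itself is not shown), and the function and parameter names are hypothetical.

```python
from typing import Sequence


def comment_score(
    replies: int,
    topic_cluster: int,
    quotations: int,
    likes: int,
    named_entities: Sequence[str],
) -> int:
    """Score(C) = Rc + Ec, with Rc = Cr + Ct + Cq + Cl.

    `named_entities` is assumed to be the output of an external tagger;
    Ec is simply how many entities that list contains.
    """
    rc = replies + topic_cluster + quotations + likes
    ec = len(named_entities)
    return rc + ec


# Hypothetical comment: 3 replies, cluster count 2, quoted once, 10 likes,
# and two named entities found by the tagger.
print(comment_score(3, 2, 1, 10, ["Facebook", "TechCrunch"]))  # 18
```

Whether repeated mentions of the same entity should count once or several times is not stated on the slide; the sketch counts whatever the list contains.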
Sentence Scoring
• Comments whose scores exceed a threshold value are chosen as top
comments.
• The cosine similarity of each blog sentence with each top comment is
calculated.
• Each sentence is assigned a score based on its cosine similarity with the
top comments:
Score(Si) = Summation(CS(Si, c)) over all top comments c
CS: Cosine Similarity   Si: Blog Sentence
• Only the top 5 to 7 sentences that have more than 6 to 8 words are
selected as the summary of the blog post. More or fewer sentences can be
selected depending on the percentage of the post the summary should cover.
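The sentence scoring and selection steps could look roughly like the sketch below. Several details are assumptions rather than something the slides specify: raw term counts are used as vectors (no TF-IDF weighting), tokenization is a simple lowercase regex, and the selected sentences are returned in their original order. All function names are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercased term counts; a deliberately simple tokenizer."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity of two term-count vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_top_comments(scored_comments, threshold):
    """Keep the text of comments whose score is greater than the threshold."""
    return [text for text, score in scored_comments if score > threshold]


def summarize(sentences, top_comments, max_sentences=5, min_words=6):
    """Score(Si) = sum of CS(Si, c) over the top comments c.

    Sentences with `min_words` words or fewer are skipped, then the
    `max_sentences` best-scoring sentences are returned in document order.
    """
    comment_vecs = [bag_of_words(c) for c in top_comments]
    scored = []
    for idx, sentence in enumerate(sentences):
        vec = bag_of_words(sentence)
        if sum(vec.values()) <= min_words:
            continue
        score = sum(cosine_similarity(vec, cv) for cv in comment_vecs)
        scored.append((score, idx))
    best = sorted(scored, key=lambda pair: (-pair[0], pair[1]))[:max_sentences]
    return [sentences[idx] for _, idx in sorted(best, key=lambda pair: pair[1])]
```

Returning the sentences in document order rather than score order is a readability choice; the slides do not say which order the summary uses.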
Experiments And Results
• Ten blog posts with a large number of comments were chosen at random, and
summaries were generated at 30% and 20% of the word count.
• Summaries were also generated with online summarization tools, and the
system-generated summaries were compared against them using the ROUGE
Summary Evaluation Package by Chin-Yew Lin.
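To give a feel for what a ROUGE-style comparison measures, here is a simplified ROUGE-1 recall sketch: the fraction of reference unigrams that also appear in the candidate summary, with counts clipped. It is not a substitute for the ROUGE package used in the experiments, which offers further options such as stemming and longer n-grams; the function name and tokenizer are assumptions.

```python
import re
from collections import Counter


def rouge_1_recall(candidate: str, reference: str) -> float:
    """Clipped unigram overlap divided by the number of reference unigrams."""
    cand = Counter(re.findall(r"\w+", candidate.lower()))
    ref = Counter(re.findall(r"\w+", reference.lower()))
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Each reference token is matched at most as often as it occurs
    # in the candidate.
    overlap = sum(min(count, cand[token]) for token, count in ref.items())
    return overlap / total


score = rouge_1_recall("the cat sat on the mat", "the cat is on the mat")
print(round(score, 3))  # 0.833
```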
Conclusions
• Our approach depends on a number of factors, such as the number of likes
each comment has received and the length of the blog content.
• The generated summary received a lower ROUGE score when the post did not
have enough comments.
• The generated summary was less accurate when none of the comments were
accurate.
• Scoring the comments based on named entities significantly increased the
accuracy of the comment ranking.
• The system needs more testing and a larger dataset in order to find
optimal values for the constants.
