Automatic Document Summarization
 

Usage Rights: CC Attribution-NonCommercial-ShareAlike License

    Automatic Document Summarization: Presentation Transcript

    • AUTOMATIC DOCUMENT SUMMARIZATION FINDWISE
    • Single document summarization
        Proposed use for Findwise:
          • Metadata for indexing service
        Unsupervised:
          • No need for training set
          • Relative domain independence
          • Relative language independence
    • Preprocessing
        Mandatory:
          • Sentence splitting
          • Tokenization
          • Stemming
          • PoS-tagging
        Additional:
          • Named Entity Recognition
          • Keyword extraction
          • tf-idf term weighting
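As a rough illustration of the preprocessing steps, the sketch below covers sentence splitting, tokenization, and tf-idf term weighting using only the Python standard library. The function names, the regex-based splitter, and the choice to treat each sentence as a "document" for the idf term are all assumptions made for this example; stemming, PoS-tagging, and Named Entity Recognition are left out.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    # Lowercase and keep runs of letters and digits.
    return re.findall(r"[a-z0-9]+", sentence.lower())


def tfidf_weights(sentences):
    """Return one {term: weight} dict per sentence.

    Each sentence is treated as a 'document' for the idf term (assumption).
    """
    tokenized = [tokenize(s) for s in sentences]
    n = len(tokenized)
    # Document frequency: number of sentences containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)
        weights.append(
            {t: (c / len(tokens)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return weights
```

With this weighting, a term that occurs in every sentence gets weight 0, while terms confined to a single sentence get the highest weight.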
    • Sentence extraction
        Sentence ranking:
          • Real value ranking
          • Relevance ordering
        Sentence selection:
          • Desired summary length
        Sentence ordering:
          • Final presentation
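The three stages (ranking, selection, ordering) can be sketched as a single small function. This is a minimal example, assuming the ranking stage has already produced one real-valued score per sentence, that the desired summary length is given as a sentence count, and that the final presentation restores original document order; the name `extract_summary` is hypothetical.

```python
def extract_summary(sentences, scores, max_sentences):
    """Pick the highest-scoring sentences and present them in document order."""
    # Ranking: indices sorted by descending score.
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    # Selection: keep only as many as the desired summary length allows.
    selected = ranked[:max_sentences]
    # Ordering: restore original document order (assumption for this sketch).
    return [sentences[i] for i in sorted(selected)]
```

For example, `extract_summary(["a", "b", "c", "d"], [0.1, 0.9, 0.3, 0.8], 2)` keeps the two best-scoring sentences and returns them in their original order.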
    • TextRank
        Graph based:
          • Sentences as vertices
          • Similarity as edges
        Iterative ranking:
          • PageRank
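A minimal sketch of the iterative ranking step: a weighted PageRank over a sentence graph, where `similarity[i][j]` is the edge weight between sentences i and j. The damping factor of 0.85, the fixed iteration count, and the function name are assumptions for this example, not values taken from the slides.

```python
def textrank(similarity, damping=0.85, iterations=50):
    """Weighted PageRank over a square similarity matrix; returns one score per sentence."""
    n = len(similarity)
    if n == 0:
        return []
    scores = [1.0 / n] * n
    # Total outgoing edge weight per vertex, ignoring self-similarity.
    out_weight = [
        sum(row[j] for j in range(n) if j != i) for i, row in enumerate(similarity)
    ]
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            rank = 0.0
            for j in range(n):
                if j != i and out_weight[j] > 0:
                    # Vertex j passes on its score in proportion to the edge weight.
                    rank += similarity[j][i] / out_weight[j] * scores[j]
            new_scores.append((1 - damping) / n + damping * rank)
        scores = new_scores
    return scores
```

A sentence that is similar to many other sentences collects score from all of them, so it ends up ranked above sentences that have few or weak edges.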
    • Sentence Similarity
        What makes two sentences similar?
        Explored variations:
          • Shared words
          • Word importance
          • Lexical filtering
          • Length normalization
          • Advanced analysis
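One way to combine shared words with length normalization is the overlap measure from the original TextRank formulation: the number of shared word types divided by the sum of the logarithms of the sentence lengths. Whether this exact variant is among those explored in the slides is an assumption; the function name is hypothetical.

```python
import math


def overlap_similarity(tokens_a, tokens_b):
    """Shared word types, normalized by log sentence lengths."""
    if not tokens_a or not tokens_b:
        return 0.0
    shared = set(tokens_a) & set(tokens_b)
    denominator = math.log(len(tokens_a)) + math.log(len(tokens_b))
    if denominator <= 0:
        # Two one-word sentences: log(1) + log(1) == 0, so no normalization is possible.
        return 0.0
    return len(shared) / denominator
```

The log normalization keeps long sentences from dominating simply because they contain more words to share.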
    • K-means clustering
        Approach:
          • Sentences as points
          • Divide into clusters
          • Select sentences from each cluster
          • Diverse summaries
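The clustering approach can be sketched as follows, assuming each sentence has already been turned into a numeric vector (for example, from term weights). Seeding the centroids with the first k points and picking the sentence closest to each centroid are assumptions made to keep the example deterministic; all names are hypothetical.

```python
def sq_dist(p, q):
    # Squared Euclidean distance between two equal-length vectors.
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=20):
    """Plain k-means; centroids are seeded with the first k points (assumption)."""
    k = min(k, len(points))
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        for i, p in enumerate(points):
            assignment[i] = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [points[i] for i, a in enumerate(assignment) if a == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignment, centroids


def pick_representatives(points, k):
    """Return the index of the sentence closest to each cluster centroid."""
    assignment, centroids = kmeans(points, k)
    picks = []
    for c in range(len(centroids)):
        members = [i for i, a in enumerate(assignment) if a == c]
        if members:
            picks.append(min(members, key=lambda i: sq_dist(points[i], centroids[c])))
    return sorted(picks)
```

Taking one sentence per cluster is what makes the summary diverse: sentences that say nearly the same thing tend to land in the same cluster, and only one of them is selected.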
    • Domain customization
        Domain: short news articles in English
          • Sentence position important
          • Use domain knowledge to improve performance
          • Other boosting for other domains
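A position boost for news articles might look like the sketch below, which multiplies each sentence score by a factor that decays with the sentence's position. The decay formula and the `boost` parameter are purely illustrative assumptions; the slides do not state how the boosting was actually computed.

```python
def boost_by_position(scores, boost=1.0):
    """Scale scores so earlier sentences count more (hypothetical boosting scheme)."""
    # Sentence 0 is multiplied by (1 + boost), sentence 1 by (1 + boost / 2), and so on.
    return [s * (1.0 + boost / (i + 1)) for i, s in enumerate(scores)]
```

Setting `boost=0` leaves the scores unchanged, which makes it easy to switch the domain-specific behaviour off for other kinds of documents.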
    • Multi document summarization
        Sentence ranking:
          • TextRank
          • K-means clustering
        Sentence selection:
          • Similarity threshold
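In the multi-document setting, several sources often state the same fact, so selection with a similarity threshold can be sketched as a greedy filter: walk the ranked sentences and skip any candidate that is too similar to one already chosen. The Jaccard word overlap used here is a stand-in similarity for the example, and the function names and threshold value are assumptions.

```python
def jaccard(a, b):
    # Word-set overlap between two sentences (stand-in similarity measure).
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


def select_non_redundant(ranked_sentences, similarity, threshold, max_sentences):
    """Greedily keep top-ranked sentences that are not too similar to earlier picks."""
    selected = []
    for candidate in ranked_sentences:
        if len(selected) >= max_sentences:
            break
        if all(similarity(candidate, chosen) < threshold for chosen in selected):
            selected.append(candidate)
    return selected
```

A lower threshold gives a more diverse summary at the cost of possibly discarding sentences that add detail to an already covered topic.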
    • Sentence Ordering
        Paragraph selection:
          • Topical closeness
          • Sentence similarity
        Paragraph merging:
          • Date of publication
          • Original position
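As a simplified illustration of ordering by date of publication and original position, the sketch below sorts selected sentences first by the publication date of their source document and then by their position within it. The dictionary keys (`text`, `published`, `position`) are hypothetical, and the paragraph selection step based on topical closeness is not covered.

```python
from datetime import date


def order_for_presentation(sentences):
    """Sort by source publication date, then by position in the source document."""
    return sorted(sentences, key=lambda s: (s["published"], s["position"]))
```

Sorting on the `(published, position)` pair keeps sentences from the same article in their original relative order while placing older material first.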
    • Results single document

        Algorithm            ROUGE Ngram(1,1)
        TextRank             0.4797
        K-means              0.4680
        One-class SVM        0.4343
        TextRank Original    0.4708
        K-means Original     0.4791
        Baseline 1           0.4649
        Baseline 2           0.3998
    • Results multi document

        Algorithm            ROUGE Ngram(1,1)
        TextRank             0.2537
        K-means              0.2400
        MetaRank             0.2561
        Baseline 1           0.2317
        Baseline 2           0.2054
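For readers unfamiliar with the metric, ROUGE Ngram(1,1) compares unigrams in a generated summary against a reference summary. The sketch below computes unigram recall with clipped counts; it only illustrates the idea and is not the evaluation toolkit or exact configuration behind the numbers reported in the slides. The function name is hypothetical.

```python
from collections import Counter


def rouge_1_recall(candidate_tokens, reference_tokens):
    """Fraction of reference unigrams that also appear in the candidate (clipped counts)."""
    candidate = Counter(candidate_tokens)
    reference = Counter(reference_tokens)
    # A reference word is matched at most as many times as it occurs in the candidate.
    overlap = sum(min(count, candidate[term]) for term, count in reference.items())
    total = sum(reference.values())
    return overlap / total if total else 0.0
```

A score of 1.0 would mean every word of the reference summary is covered by the generated one, so higher values in the tables indicate closer agreement with the reference summaries.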