Clusterrank

Published in: Technology, Education
Transcript

  • 1. ClusterRank: A Graph Based Method for Meeting Summarization. Nikhil Garg, School of Computer Science, EPFL
  • 2. Meeting Summarization
    • Dialog Acts
          • Example:
    [1] A: Due to severity of this problem
    [2] B: Which problem?
    [3] A: Global warming.
    [4] B: Oh. Yeah, right
    [5] A: This severe problem has made us look into other sources of energy
    • Abstractive summarization
            • Rephrase sentences, make new sentences
            • e.g. The severity of global warming has made us look into other sources of energy
    • Extractive summarization
            • Select sentences from the given text
            • e.g. [5] This severe problem has made us look into other sources of energy
  • 5. Meeting vs Text summarization
    • Dialog Acts
          • Example:
    [1] A: Due to severity of this problem
    [2] B: Which problem?
    [3] A: Global warming.
    [4] B: Oh. Yeah, right
    [5] A: This severe problem has made us look into other sources of energy
    • Challenges
          • Incomplete sentences
          • Random chit-chat
          • Redundancy
          • ASR errors
          • Disfluencies
  • 10. Outline
    • Previous work
    • TextRank
    • ClusterRank
    • Experiments
    • Conclusions
  • 11. Previous Work
      • Supervised sentence ranking: [Kupiec et al., 1995], [Maskey et al., 2003]
      • Maximal Marginal Relevance: [Goldstein et al., 1999]
      • Keyphrases: [Riedhammer et al., 2008], [Xie and Liu, 2008]
      • Prosodic Features: [Murray et al., 2005]
      • Text as a graph: [Zha, 2002], [Mihalcea and Tarau, 2004], [Erkan and Radev, 2004]
  • 12. TextRank
    • Nodes: Sentences
    • Edges: Sentence similarity
    • Sentences scored according to “centrality”
      • PageRank [Brin and Page, 1998]
    • Pick top ranked sentences for summary
    [Figure: example graph over sentences S1 to S5 with values 0.6, 0.7, 0.1, 0.2, 0.9]
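The TextRank recipe on this slide (sentences as nodes, similarity as edge weights, PageRank for centrality) can be sketched roughly as follows. This is a minimal illustration, not the implementation behind the slides: the bag-of-words cosine similarity, the `damping` value, and the fixed iteration count are all assumptions, and the function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into word tokens."""
    return re.findall(r"[a-z']+", sentence.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def textrank(sentences, damping=0.85, iterations=50):
    """Score sentences with weighted PageRank over a similarity graph."""
    n = len(sentences)
    if n == 0:
        return []
    bags = [Counter(tokenize(s)) for s in sentences]
    # Edge weights: pairwise similarity, with no self-loops (assumed choice).
    w = [[cosine(bags[i], bags[j]) if i != j else 0.0 for j in range(n)]
         for i in range(n)]
    out = [sum(row) for row in w]  # total outgoing weight of each node
    scores = [1.0 / n] * n
    for _ in range(iterations):
        # Isolated nodes (out == 0) pass on no score; they keep only the
        # (1 - damping) / n base term.
        scores = [
            (1 - damping) / n
            + damping * sum(scores[j] * w[j][i] / out[j]
                            for j in range(n) if out[j] > 0)
            for i in range(n)
        ]
    return scores
```

The top ranked sentences under these scores would then be picked for the summary. A sentence that shares no words with any other ends up with the minimum score.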
  • 14. TextRank in Meetings: Challenges in Meetings
    • Incomplete Sentences
    - Information split across multiple utterances
    • Repeated Words
    - Often same information repeated in consecutive dialogs
    • Random chit-chat
    - Off-topic conversations in between
    • Redundancy
    - Repeated information in several utterances
    [Figure: example graph over sentences S1 to S5 with values 0.6, 0.7, 0.1, 0.2, 0.9]
  • 15. Clustering
    • Select a merge point and windows of text above and below
    • Clustering based on word overlap
    • Proximity of text is important
    [Figure (b): Merging multiple sentences (or clusters) above and below the merge point. Shown on the example dialog [1] to [5], with labels window_above=3, window_below=2, merge_point, and "High similarity".]
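One way to read the clustering step is sketched below. The slides give only the ingredients (a merge point, windows above and below, word overlap, proximity), so the details here are assumptions: the overlap coefficient as the similarity measure, the `threshold` value, and the rule that a match merges every utterance lying between the merge point and the matching neighbour. The names `window_above` and `window_below` come from the figure; everything else is hypothetical.

```python
import re


def words(utterance):
    """Set of lowercase word tokens in an utterance."""
    return set(re.findall(r"[a-z']+", utterance.lower()))


def overlap(a, b):
    """Overlap coefficient: shared words divided by the smaller set size."""
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def cluster_utterances(utterances, window_above=3, window_below=2,
                       threshold=0.5):
    """Group utterance indices into contiguous clusters."""
    bags = [words(u) for u in utterances]
    n = len(bags)
    # joined[k] is True when utterances k and k + 1 share a cluster.
    joined = [False] * max(n - 1, 0)
    for mp in range(n):  # treat every utterance as a candidate merge point
        lo = max(0, mp - window_above)
        hi = min(n - 1, mp + window_below)
        for j in range(lo, hi + 1):
            if j != mp and overlap(bags[mp], bags[j]) >= threshold:
                # Assumed rule: merge the whole span between the two.
                for k in range(min(mp, j), max(mp, j)):
                    joined[k] = True
    clusters, current = [], []
    for i in range(n):
        current.append(i)
        if i == n - 1 or not joined[i]:
            clusters.append(current)
            current = []
    return clusters
```

On the example dialog, this groups utterances [1] to [5] into a single cluster, because [2] overlaps with both [1] and [5] on the word "problem", while an unrelated utterance appended afterwards stays on its own.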
  • 18. ClusterRank: The Big Picture
    Meeting Transcript → [Clustering] → Clusters → [Graph Construction] → Cluster Graph → [Page Rank] → Cluster Scores → [Sentence Scoring] → Sentence Scores → [Greedy Selection] → Summary
  • 19. ClusterRank
    [Figure: example cluster graph over clusters C1 to C5 with values 0.6, 0.7, 0.1, 0.2, 0.9]
  • 20. ClusterRank
    Score(Si) = similarity(Si, centroid) * PR(cluster)
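The scoring rule on this slide can be illustrated with a short sketch. The slide does not say how the centroid or the similarity is computed, so two assumptions are made here: the centroid is the summed bag of words of the cluster's sentences, and similarity is cosine similarity. `cluster_pagerank` stands for PR(cluster) and is supplied by the caller; the function names are hypothetical.

```python
import math
import re
from collections import Counter


def bag(text):
    """Bag-of-words Counter over lowercase word tokens."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def score_sentences(cluster_sentences, cluster_pagerank):
    """Score(Si) = similarity(Si, centroid) * PR(cluster)."""
    bags = [bag(s) for s in cluster_sentences]
    centroid = Counter()
    for b in bags:
        centroid.update(b)  # assumed centroid: summed word counts
    return [cosine(b, centroid) * cluster_pagerank for b in bags]
```

Under these assumptions, a sentence that covers more of its cluster's vocabulary scores higher, and no sentence can score above its cluster's PageRank value.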
  • 21. ClusterRank
    • Pick high-scoring sentences with minimal redundancy
    • Optional: Normalize sentence score by length to promote shorter sentences
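A greedy selection along these lines might look like the sketch below. The slides do not specify the redundancy test or the stopping rule, so these are assumptions: redundancy is measured as Jaccard word overlap against the sentences already chosen, selection runs under a word budget `max_words`, and the optional length normalization divides the score by the word count. All parameter names are hypothetical.

```python
import re


def tokens(sentence):
    """Lowercase word tokens of a sentence."""
    return re.findall(r"[a-z']+", sentence.lower())


def jaccard(a, b):
    """Jaccard similarity between two token lists, treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def greedy_select(sentences, scores, max_words, max_similarity=0.5,
                  normalize_by_length=False):
    """Return indices of selected sentences in transcript order."""
    toks = [tokens(s) for s in sentences]

    def rank(i):
        # Optional: divide by length to promote shorter sentences.
        if normalize_by_length and toks[i]:
            return scores[i] / len(toks[i])
        return scores[i]

    order = sorted(range(len(sentences)), key=rank, reverse=True)
    chosen, used = [], 0
    for i in order:
        if used + len(toks[i]) > max_words:
            continue  # does not fit in the remaining word budget
        if any(jaccard(toks[i], toks[j]) > max_similarity for j in chosen):
            continue  # too similar to a sentence already selected
        chosen.append(i)
        used += len(toks[i])
    return sorted(chosen)
```

With a budget of 12 words, a near-duplicate of the top sentence is skipped in favour of a lower-scoring but novel one.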
  • 22. Experiments
    • AMI Meeting Corpus
    • 137 meeting transcripts
    • ROUGE [Lin, 2004] evaluation scores
    • Summary length: 6% of words
    • Comparison against human abstractive summaries
    * denotes significant improvement
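For orientation, ROUGE-1 recall, the simplest member of the ROUGE family, can be computed as below. The slides do not state which ROUGE variants were reported, so this is only an illustration of the general idea (clipped unigram overlap with a human reference summary), not the evaluation setup used in the experiments. The function name is hypothetical.

```python
import re
from collections import Counter


def rouge1_recall(candidate, reference):
    """Fraction of reference unigrams that also appear in the candidate."""
    cand = Counter(re.findall(r"[a-z']+", candidate.lower()))
    ref = Counter(re.findall(r"[a-z']+", reference.lower()))
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Clip each word's match count to its count in the candidate.
    matched = sum(min(count, cand[word]) for word, count in ref.items())
    return matched / total
```

Applied to the extractive and abstractive examples from the start of the deck, the extractive sentence [5] matches 9 of the 14 words in the abstractive reference.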
  • 27. Experiments
    • Performance comparison for different summary lengths
  • 28. Conclusions
    • Unsupervised graph based algorithm for extractive summarization
    • Extension of TextRank with additional measures for high noise and redundancy
    • Limitations
      • Hand-tuned parameters in clustering
      • Clustering might suppress desired topics
    • Future Work
      • Try the system on ASR transcripts
      • Better clustering algorithm
      • Integrating user query
      • Use clusters for abstractive summarization
  • 34. Thank You. Questions?
