503 Final Presentation
 


Computational Linguistics Course Project

Usage Rights

© All Rights Reserved

    Presentation Transcript

    • TIMELINE FROM NEWS (KK Lo)
    • GOAL...
    • RELATED WORK
    • 2 communities: Topic Detection and Tracking; Temporal and Event Tagging
    • Topic Detection and Tracking: discovering new topics, classifying documents, tracking topics. Events of interest?
    • Problems: assumes each article is an event; publication date = time the event happened?; lack of details
    • Temporal and Event Tagging: tagging events and their temporal relationships
    • Problems: too many events (result obtained from the TARSQI toolkit)
    • MY SOLUTION
    • APPLY SUMMARIZATION TECHNIQUE AS EVENT FILTERING
    • 3 components
    • Prior Ranking: sentences are ordered by prior probability (1. Sentence A, 2. Sentence B, 3. Sentence C, 4. ...); sentences at the beginning have a higher prior probability
    • Grasshopper: a PageRank-like ranking algorithm over sentences (s1 ... s5) linked by their cosine similarities
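The two slides above (prior ranking and similarity-based ranking) can be illustrated with a small random-walk sketch. This is not the Grasshopper implementation: the actual algorithm additionally turns already-selected items into absorbing states to encourage diversity, and that step is omitted here. The prior form `1 / (position + 1)`, the damping value `0.85`, the plain word-count vectors, and the function names are all assumptions made for illustration.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_sentences(sentences, damping=0.85, iterations=50):
    """Return sentence indices ordered from highest to lowest score."""
    n = len(sentences)
    vectors = [Counter(s.lower().split()) for s in sentences]

    # Assumed prior: earlier sentences receive more probability mass.
    raw = [1.0 / (i + 1) for i in range(n)]
    total = sum(raw)
    prior = [r / total for r in raw]

    # Row-normalised similarities act as transition probabilities.
    # A sentence with no similar neighbour falls back to the prior.
    transitions = []
    for i in range(n):
        row = [cosine(vectors[i], vectors[j]) if i != j else 0.0 for j in range(n)]
        row_sum = sum(row)
        transitions.append([v / row_sum for v in row] if row_sum else prior[:])

    # Power iteration: teleport to the prior with probability (1 - damping).
    score = prior[:]
    for _ in range(iterations):
        score = [
            (1 - damping) * prior[j]
            + damping * sum(score[i] * transitions[i][j] for i in range(n))
            for j in range(n)
        ]
    return sorted(range(n), key=lambda i: score[i], reverse=True)
```

When no two sentences share a word, the walk reduces to the prior, so the ranking simply follows sentence position.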
    • TARSQI Toolkit, from text to TimeML: explicit times, event instances, event-event links, event-time links
    • Event Filtering: for each event in TimeML, does it appear in the top selected sentences? YES: pick it. NO: bye.
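The filtering decision on this slide amounts to a membership test. The sketch below assumes each tagged event has already been mapped to the index of the sentence it came from; the `sentence_id` key and the dictionary layout are hypothetical and are not the TARSQI output format.

```python
def filter_events(events, ranked_sentence_ids, top_k=10):
    """Keep only events that occur in one of the top_k ranked sentences.

    events: list of dicts with a hypothetical "sentence_id" key.
    ranked_sentence_ids: sentence indices, best first.
    """
    selected = set(ranked_sentence_ids[:top_k])
    return [event for event in events if event["sentence_id"] in selected]


# Usage with made-up events: only sentences 4 and 0 are selected.
events = [
    {"id": "e1", "sentence_id": 0},
    {"id": "e2", "sentence_id": 3},
    {"id": "e3", "sentence_id": 4},
]
kept = filter_events(events, ranked_sentence_ids=[4, 0, 3], top_k=2)
print([event["id"] for event in kept])
```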
    • Temporal Reasoner: find the (start, end) bound for each event (e.g. event1, event2, event3 placed between 2008 Dec and 2009)
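One way to picture what a temporal reasoner does with such links is simple bound propagation. This is only a sketch under strong assumptions: events are treated as points, only "a happens before b" links are used, and the dates and event names below are invented to mirror the slide. It does not reproduce the reasoner used in the project.

```python
from datetime import date


def propagate_bounds(bounds, before_links):
    """Tighten (earliest, latest) bounds using 'a happens before b' links.

    bounds: dict mapping event name -> (earliest, latest) dates.
    before_links: list of (a, b) pairs meaning a happens before b.
    """
    bounds = dict(bounds)
    changed = True
    while changed:
        changed = False
        for a, b in before_links:
            a_lo, a_hi = bounds[a]
            b_lo, b_hi = bounds[b]
            new_a = (a_lo, min(a_hi, b_hi))  # a cannot end later than b can
            new_b = (max(b_lo, a_lo), b_hi)  # b cannot start earlier than a can
            if new_a != bounds[a] or new_b != bounds[b]:
                bounds[a], bounds[b] = new_a, new_b
                changed = True
    return bounds


# event1 is anchored to December 2008, event3 to 2009, event2 has no anchor.
result = propagate_bounds(
    {
        "event1": (date(2008, 12, 1), date(2008, 12, 31)),
        "event2": (date.min, date.max),
        "event3": (date(2009, 1, 1), date(2009, 12, 31)),
    },
    [("event1", "event2"), ("event2", "event3")],
)
print(result["event2"])
```

An event with no link to any anchored event keeps the uninformative bound `(date.min, date.max)`, which is one way an anchoring attempt can fail.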
    • RESULT?
    • Sentence Selection Quality: 250-word summary from 25 documents, evaluated with the DUC2007 data set. Special thanks for the data and ROUGE =p
    • Effect of Sentence Filtering (choosing the top 10 sentences)
                                      D0701A   D0720E
          #Event before filtering     3320     1435
          #Event after filtering      67       37
      How can we represent 3320 events on a timeline?
    • Time-Event Anchoring
                                      D0701A   D0720E
          #Event before filtering     3320     1435
          #Failure                    3085     1129
          #Event after filtering      67       37
          #Failure                    49       29
      This shows that my approach is a failure
    • WHY? TARSQI only supports single documents. Unable to deduce the relation for all pairs of events: e.g. with 50 tagged events, only 50 pairs of relationships are tagged, while the number of pairs of events should be 50C2 = 1225
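The arithmetic on this slide can be checked directly. The helper name `pair_coverage` is made up for illustration; `math.comb` requires Python 3.8 or later.

```python
from math import comb


def pair_coverage(num_events, num_tagged_links):
    """Return (number of possible event pairs, fraction of pairs tagged)."""
    total_pairs = comb(num_events, 2)
    return total_pairs, num_tagged_links / total_pairs


total, fraction = pair_coverage(50, 50)
print(total, round(fraction, 3))
```

With 50 events and 50 tagged links, roughly 4% of the possible pairs carry an explicit relation, so most orderings would have to be inferred.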
    • LESSON LEARNED
    • My project spans 3 areas: Temporal and Event Tagging, Automatic Summarization, Topic Detection and Tracking
    • The limit of existing technology, OR EVEN the limit of temporal analysis: cannot get enough information from the documents
    • Cosine similarity with tf-idf weighting is computationally expensive: 2.5 hrs for 867 sentences
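To show where the cost comes from, here is a minimal sketch of pairwise cosine similarity over tf-idf vectors. The idf form `log(n / df)`, the whitespace tokenisation, and the function names are assumptions; the 2.5 hours is the author's own measurement and is not reproduced by this code. The number of comparisons grows quadratically: 867 sentences give 867 × 866 / 2 = 375,411 pairs.

```python
import math
from collections import Counter
from itertools import combinations


def tfidf_vectors(sentences):
    """Build sparse tf-idf vectors, treating each sentence as a document."""
    docs = [Counter(s.lower().split()) for s in sentences]
    n = len(docs)
    df = Counter(term for doc in docs for term in doc)  # document frequency
    return [{t: tf * math.log(n / df[t]) for t, tf in doc.items()} for doc in docs]


def cosine(a, b):
    """Cosine similarity between two sparse weight vectors."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the shorter vector
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def pairwise_similarities(sentences):
    """Compute the similarity of every unordered sentence pair."""
    vectors = tfidf_vectors(sentences)
    return {
        (i, j): cosine(vectors[i], vectors[j])
        for i, j in combinations(range(len(vectors)), 2)
    }
```

Keeping the vectors sparse and iterating over the shorter one are common ways to keep each comparison cheap, although the number of pairs stays quadratic.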
    • DUC2007 documents are hard to parse: different documents have different formats, there is no standard date format, and some contain special characters that cause trouble for XML parsers
    • Q&A