Detecting Topic Drift with Compound Topic Models

    Detecting Topic Drift with Compound Topic Models - Presentation Transcript

    1. Detecting Topic Drift with Compound Topic Models
       Dan Knights, Mike Mozer, Nicolas Nicolov
       J.D. Power and Associates, McGraw-Hill, U.S.A.; Boulder, CO 80303
       Goals:
       - Track topics over time
       - Detect topic drift
       - Identify emerging topics
       - Visualize topic trends
       Dan Knights (JDPA) Detecting Topic Drift May 19, 2009 1/9
    2. Topic tracking challenge: emerging topics
       Dataset 1 (LDA):
         0: energy hybrid gas prius fuel
         1: million billion economy stock
         ...
       Dataset 2 (LDA):
         0: money stock dow economy
         1: hybrid gas prius alternative
         2: obama mccain election race
         ...
       [Figure: topic probability vs. topic index for each dataset]
       Topic correspondence between the two models is not guaranteed.
    3. Compound topic models guarantee correspondence
       CTM: Dataset 1 + Dataset 2, modeled together with LDA
       Topics (the same list for both datasets):
         0: money stock dow economy
         1: hybrid gas prius alternative
         2: obama mccain election race
         ...
       [Figure: topic probability vs. topic index (0, 1, 2) for each dataset]
       Topic correspondence is guaranteed.
    4. Potential indicators of drift
       3 kinds of indicator:
       - Kullback-Leibler divergence (KLD)
       - Relative perplexity (RP)
       - Chi-square test (not shown)
       2 kinds of model:
       - Topic model
       - Unigram model
       3 x 2 = 6 potential indicators
    5. Case study: synthetic topic drift
       Gradual topic drift, days 150-179
       [Figure: three periods: days 1-149, days 150-179 (drift), days 180-300]
       Drift indicators: all indicators detect the drift.
    6. Case study: Toyota
       All blogs mentioning “Toyota”, 6 months (January – June 2008)
       [Figure: drift indicators over time] Highest drift?
    7. Emerging topics, Toyota (Mar-Jun 2008)
       - Emerging “energy” topic
       - Chapman auto accident topic
       - “Energy” topic tracks gas price
    8. Case study: iPhone
       Public blogs mentioning “iPhone” and “platform”, 12 months (April 2007 – March 2008)
       Most variable topics for Aug-Nov 2007:
       - “Apple opens window: iPhone platform”
       - “Google launches Android platform”
    9. Summary
       Compound topic models help with:
       - tracking topics between distinct data sets
       - detecting drift related to news events
       - avoiding the topic/vocabulary matching problem
       - visualizing topic trends
       Open questions:
       - How should drift indicators be interpreted?
       - Are unigram models sufficient for detecting topic drift? They are fast and frugal compared to topic models.
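
The slides name Kullback-Leibler divergence and relative perplexity as drift indicators but do not give formulas. The sketch below shows one plausible way to compute both for the unigram-model case. Everything in it is an assumption for illustration, not the authors' method: the add-one smoothing, the function names, the toy token lists, and in particular the definition of relative perplexity as the ratio between a window's perplexity under a reference model and its perplexity under its own model.

```python
import math
from collections import Counter


def unigram_model(tokens, vocab, alpha=1.0):
    """Additively smoothed unigram distribution over a fixed vocabulary.

    alpha is a hypothetical smoothing constant; it keeps every probability
    positive so that the logarithms below are always defined.
    """
    counts = Counter(tokens)
    total = sum(counts[w] for w in vocab) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def kl_divergence(p, q):
    """KL(p || q) in nats for two distributions over the same vocabulary."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p)


def perplexity(model, tokens):
    """exp of the average negative log-likelihood of tokens under model."""
    log_lik = sum(math.log(model[w]) for w in tokens)
    return math.exp(-log_lik / len(tokens))


def relative_perplexity(reference_model, window_model, tokens):
    """Assumed definition: perplexity under the reference model divided by
    perplexity under the window's own model. Values well above 1 suggest
    the window is poorly explained by the reference period."""
    return perplexity(reference_model, tokens) / perplexity(window_model, tokens)


# Toy data loosely based on the topic words shown in the slides.
baseline_tokens = "hybrid gas prius fuel hybrid gas".split()
later_tokens = "money stock dow economy stock".split()
vocab = sorted(set(baseline_tokens) | set(later_tokens))

baseline = unigram_model(baseline_tokens, vocab)
later = unigram_model(later_tokens, vocab)

print("KLD(later || baseline):", kl_divergence(later, baseline))
print("RP of later window:", relative_perplexity(baseline, later, later_tokens))
```

For the topic-model variant of each indicator, the same two functions could in principle be applied to per-period topic distributions instead of word distributions, since a compound topic model gives every period the same topic indices.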

    A poster presented at ICWSM 2009 (International AAA…

    CC Attribution License
