Impulse Technologies
                                      Beacons U to World of technology
        044-42133143, 98401 03301,9841091117 ieeeprojects@yahoo.com www.impulse.net.in
              Topic Mining over Asynchronous Text Sequences
   Abstract
          Time-stamped texts, or text sequences, are ubiquitous in real-world
   applications. Multiple text sequences are often related to each other by sharing
   common topics. The correlation among these sequences provides more meaningful
   and comprehensive clues for topic mining than any individual sequence does.
   However, exploiting this correlation is nontrivial when the sequences are
   asynchronous, i.e., when documents from different sequences about the same
   topic have different time stamps. In this paper, we formally address this
   problem and put forward a novel algorithm based on the generative topic model.
   Our algorithm consists of two alternating steps: the first step extracts
   common topics from multiple sequences based on the adjusted time stamps
   provided by the second step; the second step adjusts the time stamps of the
   documents according to the time distribution of the topics discovered by the
   first step. We perform these two steps alternately, and monotonic convergence
   of our objective function is guaranteed over the iterations. The effectiveness
   and advantages of our approach are demonstrated through extensive empirical
   studies on two real data sets, consisting of six research paper repositories
   and two news article feeds, respectively.
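
   The alternation between the two steps can be illustrated with a small sketch.
   This is not the paper's implementation: it assumes a plain PLSA-style mixture
   p(w|t) = sum_z p(z|t) p(w|z) as the generative topic model, a log-likelihood
   objective, and an order-preserving constraint on the adjusted time stamps
   within each sequence. The function names (extract_topics, sync_time, mine) and
   all parameters are hypothetical and chosen only for illustration.

```python
import math
import random
from collections import Counter

def log_lik(doc, t, p_zt, p_wz):
    """Log-likelihood of one document (a Counter of words) placed at time slot t."""
    K = len(p_wz)
    return sum(c * math.log(max(sum(p_zt[t][z] * p_wz[z][w] for z in range(K)), 1e-12))
               for w, c in doc.items())

def extract_topics(seqs, stamps, p_zt, p_wz, em_iters=5):
    """Step 1 (assumed PLSA-style EM): update p(z|t) and p(w|z) in place,
    using word counts pooled across sequences by adjusted time stamp."""
    T, K = len(p_zt), len(p_wz)
    pooled = [Counter() for _ in range(T)]
    for docs, ts in zip(seqs, stamps):
        for doc, t in zip(docs, ts):
            pooled[t].update(doc)
    for _ in range(em_iters):
        new_zt = [[0.0] * K for _ in range(T)]
        new_wz = [Counter() for _ in range(K)]
        for t in range(T):
            for w, n in pooled[t].items():
                joint = [p_zt[t][z] * p_wz[z][w] for z in range(K)]
                norm = sum(joint)
                if norm <= 0:
                    continue
                for z in range(K):
                    r = n * joint[z] / norm  # expected count of (t, w) under topic z
                    new_zt[t][z] += r
                    new_wz[z][w] += r
        for t in range(T):
            s = sum(new_zt[t])
            if s > 0:  # empty slots keep their previous topic mixture
                p_zt[t] = [v / s for v in new_zt[t]]
        for z in range(K):
            s = sum(new_wz[z].values())
            if s > 0:
                p_wz[z] = {w: new_wz[z][w] / s for w in p_wz[z]}

def sync_time(docs, p_zt, p_wz):
    """Step 2: choose non-decreasing time stamps for one sequence that maximise
    the log-likelihood under the current topics (dynamic programming)."""
    T = len(p_zt)
    score = [[log_lik(d, t, p_zt, p_wz) for t in range(T)] for d in docs]
    best = [score[0][:]]  # best[i][t]: best total for docs 0..i with doc i at slot t
    back = []
    for i in range(1, len(docs)):
        row, ptr, arg = [], [], 0
        for t in range(T):
            if best[-1][t] > best[-1][arg]:
                arg = t  # running argmax over slots <= t
            row.append(best[-1][arg] + score[i][t])
            ptr.append(arg)
        best.append(row)
        back.append(ptr)
    t = max(range(T), key=lambda s: best[-1][s])
    out = [t]
    for ptr in reversed(back):
        t = ptr[t]
        out.append(t)
    return out[::-1]

def mine(seqs, stamps, T, K, rounds=5, seed=0):
    """Alternate step 1 and step 2; return adjusted stamps, topics, objective history."""
    rng = random.Random(seed)
    vocab = sorted({w for docs in seqs for d in docs for w in d})

    def rand_dist(keys):
        raw = {k: rng.random() + 0.1 for k in keys}
        s = sum(raw.values())
        return {k: v / s for k, v in raw.items()}

    p_zt = [list(rand_dist(range(K)).values()) for _ in range(T)]
    p_wz = [rand_dist(vocab) for _ in range(K)]
    stamps = [list(ts) for ts in stamps]
    history = []
    for _ in range(rounds):
        extract_topics(seqs, stamps, p_zt, p_wz)
        stamps = [sync_time(docs, p_zt, p_wz) for docs in seqs]
        history.append(sum(log_lik(d, t, p_zt, p_wz)
                           for docs, ts in zip(seqs, stamps) for d, t in zip(docs, ts)))
    return stamps, p_wz, history
```

   Each sequence is passed as a time-ordered list of word-count Counters together
   with its original time-slot indices. Because the topic parameters are carried
   over between rounds and the current time stamps always remain a feasible choice
   in step 2, the recorded objective in this sketch does not decrease from one
   round to the next, which mirrors the monotonic convergence stated in the
   abstract.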




  Your own ideas, or any project from any company, can be implemented
at a better price. (All projects can be done in Java or .NET, whichever the student prefers.)
