Using Topic Models for Twitter hashtag recommendation

Presentation given at the Making Sense of Microposts Workshop at the World Wide Web Conference 2013

  1. ELIS – Multimedia Lab
     Using Topic Models for Twitter Hashtag Recommendation
     Fréderic Godin, Viktor Slavkovikj, Wesley De Neve, Benjamin Schrauwen and Rik Van de Walle
     Multimedia Lab, Ghent University – iMinds, Belgium
     Reservoir Lab, Ghent University, Belgium
     Image and Video Systems Lab, KAIST, South Korea
  2. ELIS – Multimedia Lab
     Using Topic Models for Twitter Hashtag Recommendation
     Fréderic Godin, Viktor Slavkovikj, Wesley De Neve, Benjamin Schrauwen and Rik Van de Walle
     Making Sense of Microposts Workshop @ World Wide Web Conference 2013
     Introduction (1)
     Indexing, Search, Linking, General Topic, Memes, Grouping, Information retrieval
  3. Introduction (2)
     ±10% of tweets contain a hashtag
     3% of the hashtags are used more than 5 times
     Indexing, Search, Linking, General Topic, Memes, Grouping
  4. Goal
     Suggest keywords that resemble the general topic of a tweet and that could be used as a hashtag
     Promote hashtags for effective indexing
     Allow for effective search of tweets through hashtags
     Reduce the use of sparse hashtags
  5. Architectural overview
     Tweet → Basic filter → Language identification → Topic distribution → Hashtag suggestion → Hashtagged tweet
  6. Basic filter
     Clean up the tweet: URLs, special HTML entities, digits, punctuation, the hash character, …
     During training:
       Remove tweets with just one word
       Remove retweets
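The basic filter slide can be illustrated with a minimal Python sketch. The slides do not give the actual rules, so everything here is an assumption: the regular expressions, the ASCII-only letter filter, the function names `clean_tweet` and `keep_for_training`, and the convention that a retweet starts with "RT".

```python
import html
import re

URL_RE = re.compile(r"https?://\S+|www\.\S+")
NON_LETTER_RE = re.compile(r"[^a-z\s]")  # assumption: keep ASCII letters only


def clean_tweet(text):
    """Strip URLs, HTML entities, digits, punctuation and the hash character."""
    text = html.unescape(text)            # e.g. "&amp;" becomes "&"
    text = URL_RE.sub(" ", text)          # drop URLs before punctuation is removed
    text = text.lower()
    text = NON_LETTER_RE.sub(" ", text)   # digits, punctuation, '#', '@', ...
    return text.split()


def keep_for_training(text):
    """Training-time filter: drop retweets and tweets with just one word."""
    if re.match(r"\s*RT\b", text):        # assumed retweet convention: leading "RT"
        return False
    return len(clean_tweet(text)) > 1
```

With these rules the example tweet "Please RT!! sign Bernie Sanders petition ..." is kept for training, because "RT" occurs inside the text rather than at its start.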
  7. Language identification
     Why: We need to build a language-dependent topic model.
     Goal: Build an unsupervised classifier that discriminates between English and non-English tweets.
     How: Using Naive Bayes and the Expectation-Maximization algorithm + character n-gram features
     Result: Evaluation on a test set of 1000 randomly selected tweets

                   Lui & Baldwin (LangID.py)   Our algorithm
     Precision     97.9%                       97.0%
     Recall        91.8%                       97.8%
     F1            94.8%                       97.4%
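The combination named on the language identification slide (Naive Bayes, Expectation-Maximization, character n-grams) can be sketched as a two-class mixture model. This is not the authors' implementation. The slides do not say how EM was initialised, so the seed-word heuristic (`SEED_WORDS`, `seed_responsibility`) is purely hypothetical, as are the bigram size, the smoothing value and all function names. Class 0 is meant to be "English", class 1 "non-English".

```python
import math
from collections import Counter

# Hypothetical seed words, used only to initialise EM; not from the slides.
SEED_WORDS = {"the", "and", "is", "you", "to"}


def char_ngrams(text, n=2):
    """Character n-grams of a lower-cased, space-padded string."""
    padded = f" {text.lower()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def seed_responsibility(text):
    """Initial guess of P(English | tweet): 0.9 if a seed word occurs, else 0.1."""
    return 0.9 if SEED_WORDS & set(text.lower().split()) else 0.1


def posterior(feats, priors, log_lik):
    """P(class | n-gram counts) for both classes; unseen n-grams are skipped."""
    scores = []
    for k in (0, 1):
        score = math.log(priors[k])
        for gram, count in feats.items():
            if gram in log_lik[k]:
                score += count * log_lik[k][gram]
        scores.append(score)
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]   # shift for numerical stability
    return (exps[0] / sum(exps), exps[1] / sum(exps))


def train_nb_em(docs, init_resp, n=2, iters=10, smooth=0.1):
    """EM for a two-class multinomial Naive Bayes mixture over char n-grams."""
    feats = [Counter(char_ngrams(d, n)) for d in docs]
    vocab = {g for f in feats for g in f}
    resp = [(r, 1.0 - r) for r in init_resp]
    for _ in range(iters):
        # M-step: re-estimate priors and n-gram likelihoods from soft counts.
        priors = [(sum(r[k] for r in resp) + 1.0) / (len(docs) + 2.0) for k in (0, 1)]
        log_lik = []
        for k in (0, 1):
            counts = Counter()
            for f, r in zip(feats, resp):
                for gram, count in f.items():
                    counts[gram] += r[k] * count
            total = sum(counts.values()) + smooth * len(vocab)
            log_lik.append({g: math.log((counts[g] + smooth) / total) for g in vocab})
        # E-step: recompute the class responsibilities of every document.
        resp = [posterior(f, priors, log_lik) for f in feats]
    return priors, log_lik


def p_english(text, model, n=2):
    """Probability that `text` belongs to class 0 under a trained model."""
    priors, log_lik = model
    return posterior(Counter(char_ngrams(text, n)), priors, log_lik)[0]
```

A model is trained with `train_nb_em(tweets, [seed_responsibility(t) for t in tweets])`. The seed guess only nudges EM towards calling class 0 "English"; no labelled tweets are used.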
  8. Calculating the topic distribution
     Idea: Find the general topic(s) of a tweet
     How: Using Latent Dirichlet Allocation to find the topic distribution in an unsupervised manner
     Training: 1.8 million tweets pre-filtered on 4000 keywords; 200 topics, α=0.1, β=0.1
     Example: “Please RT!! sign Bernie Sanders petition for the fiscal cliff! http://..”
       Topic index:    0     1     2     3    …    57    …   199
       Probability:  [0.1 ; 0.0 ; 0.0 ; 0.0 ; … ; 0.8 ; … ; 0.05]
       Topic 57: 1. Fiscal 2. Political 3. President …
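To make the topic distribution step concrete, here is a small collapsed Gibbs sampler for Latent Dirichlet Allocation in plain Python. It is a sketch under assumptions: the slides name LDA and the hyperparameters (α=0.1, β=0.1, 200 topics) but not the inference method or toolkit, and the function name `train_lda`, the iteration count and the fixed seed are illustrative only. It returns the topic distribution of each training document and the top-ranked words of each topic.

```python
import random


def train_lda(docs, num_topics, alpha=0.1, beta=0.1, iters=200, top_n=3, seed=0):
    """Collapsed Gibbs sampling for LDA on tokenised documents (lists of words)."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_dk = [[0] * num_topics for _ in docs]              # topic counts per document
    n_kw = [[0] * len(vocab) for _ in range(num_topics)]  # word counts per topic
    n_k = [0] * num_topics                               # total words per topic

    def bump(d, k, v, delta):
        n_dk[d][k] += delta
        n_kw[k][v] += delta
        n_k[k] += delta

    # Random initial topic assignment for every word occurrence.
    z = []
    for d, doc in enumerate(docs):
        z.append([rng.randrange(num_topics) for _ in doc])
        for w, k in zip(doc, z[d]):
            bump(d, k, word_id[w], +1)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                bump(d, z[d][i], v, -1)                   # remove current assignment
                weights = [
                    (n_dk[d][t] + alpha) * (n_kw[t][v] + beta)
                    / (n_k[t] + len(vocab) * beta)
                    for t in range(num_topics)
                ]
                z[d][i] = rng.choices(range(num_topics), weights=weights)[0]
                bump(d, z[d][i], v, +1)                   # add the new assignment

    # Smoothed per-document topic distributions and top words per topic.
    theta = [
        [(n_dk[d][k] + alpha) / (len(doc) + num_topics * alpha)
         for k in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    topics = [
        [vocab[v] for v in sorted(range(len(vocab)), key=lambda v: -n_kw[k][v])[:top_n]]
        for k in range(num_topics)
    ]
    return theta, topics
```

Each row of `theta` plays the role of the vector in the slide's example, and each entry of `topics` corresponds to a ranked keyword list such as the one shown for topic 57.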
  9. Hashtag suggestion (1)
     Idea: Suggest a number of hashtags based on the topic distribution of the tweet
     How: Sample the topic distribution and suggest the top ranked keywords
     Example:
       Yay, we got sixth period today → school, business, light, time, period
       Please RT!! Sign Bernie Sanders petition for the fiscal cliff! Http://.. → fiscal, political, traffic, president, policy
       comfort, elegance, prettiness → little, good, love, relationship, god
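One plausible reading of "sample the topic distribution and suggest the top ranked keywords" is sketched below. The slides do not spell out the procedure, so this is an assumption rather than the authors' method: a topic is drawn in proportion to its probability, the best not-yet-used keyword of that topic is taken, and this repeats until enough suggestions are collected. The function name `suggest_hashtags` and its parameters are hypothetical.

```python
import random


def suggest_hashtags(theta, topic_keywords, n_suggestions=5, rng=None):
    """Draw topics from `theta` and collect their top-ranked unused keywords.

    theta          -- topic probabilities of one tweet
    topic_keywords -- per topic, keywords ordered from most to least probable
    """
    rng = rng or random.Random()
    next_rank = [0] * len(theta)      # next unused keyword position per topic
    suggestions = []
    while len(suggestions) < n_suggestions:
        # Only topics with non-zero probability and keywords left can be drawn.
        candidates = [
            k for k in range(len(theta))
            if theta[k] > 0 and next_rank[k] < len(topic_keywords[k])
        ]
        if not candidates:
            break                     # ran out of keywords before reaching the target
        k = rng.choices(candidates, weights=[theta[c] for c in candidates])[0]
        word = topic_keywords[k][next_rank[k]]
        next_rank[k] += 1
        if word not in suggestions:   # the same word may rank high in several topics
            suggestions.append(word)
    return suggestions
```

Under this reading a dominant topic contributes most of the suggestions, while a low-probability topic can occasionally contribute one, which would be consistent with an off-topic keyword such as "traffic" appearing among the suggestions for the fiscal cliff tweet.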
  10. Hashtag suggestion (2)
      [Figure: percentage of tweets (%) by number of correctly suggested hashtags (0 to 10), shown for 5 hashtags and 10 hashtags. Evaluation of 100 tweets.]
  11. Conclusions and Future Work
      We built a hashtag recommendation system:
        Suggests general keywords
        Unsupervised
      In the future:
        Use more context information: semantic web, social graph, …
        Adopt a hybrid approach between general and specific hashtags
  12. #Questions @frederic_godin
