What makes a tweet relevant for a topic?

Slides for the presentation at the #MSM2012 workshop, #WWW2012.

Published in: Technology, Education

  • Greetings… Self-introduction: I'm a second-year PhD student in the Web Information Systems group at TU Delft. Today I will report on our work titled “What makes a tweet relevant for a topic?” The general purpose: we want to provide a better experience for users who are seeking information on Twitter.
  • In 2010, a paper suggested that Twitter is more like a news medium than a social network, because 85% of the tweets are related to news. Here is a picture showing how many tweets were posted per second when several big events occurred, such as the Japanese earthquake, the U.K. royal wedding, the raid on Osama bin Laden, and the UEFA Champions League. Twitter is a valuable source of information, particularly for news. As a large amount of information is being generated on Twitter, how can we retrieve the interesting or relevant items for our needs? The straightforward solution is to search, just as you do several times per day on Google. But Teevan et al. tell us that search behavior on Twitter is quite different. For example, (i) queries are shorter, (ii) they use special expressions such as hashtags, and (iii) they mention celebrities or names. (How does this differ? Check the paper.) Slide design notes: color of the table, black and white, green to highlight; list only the important rows. Therefore, we are curious how we can improve finding relevant or interesting tweets on Twitter.
  • RQ: Topic -> determine whether the tweets are relevant based on their characteristics. If we consider semantics in the tweets, will the relevance estimation be better? Numbering. Challenge: amount of data. Make them explicit.
  • Traditional Twitter search. Highlight what keyword matching means: the keywords in the search query and in the tweets.
  • Title -> syntactical features. Box in the tweet. Green boxes for the hypotheses. Flow from keyword-based relevance to … Slides 5-8, flow.
  • Subtitle -> question?
  • Introduction to the usage of @, including mentions and replies. Reply tweets frequently occur in private conversations. Therefore, make a hypothesis specifically about reply tweets.
  • Example tweet: The 21st International World Wide Web Conference #www2012 will take place in Lyon, France April 16-20 2012 @www2012Lyon www2012.wwwconference.org. Subtitle question. One short, one long; comparison.
  • Fade in the question later.
  • Fade in the entities one by one.
  • Mention the name of the features.
  • Consider a comparison between this and another tweet with only one type of entities. Title, subtitle.
  • Sentiment. Raise a discussion -> relevant, interesting. Another example tweet could be…
  • First summarize. Fade in the question later. How can we extract the semantic features that are topic-sensitive?
  • Color of the box (query)
  • Do not highlight www, lyon, france. "How can you infer that WWW in the query refers to the conference and not to the World Wide Web?"
  • 18: Can we utilize the contextual features?
  • Titles,
  • Timeline
  • Number of features. Meeting notes (11/4/12 16:19): The temporal context -> sensitive. WSDM <- Named Entity Recognition.
  • Number the research questions. Refer to the paper, Section 5. Explain the setup given a topic…
  • What is the relevance judgment? Number of topics.
  • Comparison. Fade in the pairs. Highlight. Textbox -> conclusion (precision).
  • Meeting notes (11/4/12 16:19): The Indri score <- relevance, not enough. Hardly visible <- point out. Title <- feature weights (influence)...
  • Too much text. Number of features. Last -> future work, link to Twinder.
  • Too much text. Number of features. Last -> future work, link to Twinder.
  • Meeting notes (11/4/12 16:13): An example to explain what I mean by relevant or interesting. Unify the style of the animations.
  • What makes a tweet relevant for a topic?

    1. What makes a tweet relevant for a topic? #MSM2012, Lyon, April 16th, 2012. Ke Tao, Fabian Abel, Claudia Hauff, Geert-Jan Houben. Web Information Systems, TU Delft.
    2. Get information from Twitter. How do people use Twitter as a source of information?
       • Twitter is more like a news medium.
       • How do people search Twitter? Search on Twitter (Teevan et al.):

                                 Web Search    Twitter Search
         Query length (chars)    18.80         12.00
         Query length (words)    3.08          1.64
         Is a celebrity name     3.11%         15.22%

    3. Research Questions. What are the challenges we are facing?
       1. Given a topic, can we identify the relevant tweets based on the characteristics of the tweets?
       2. Are semantics meaningful for determining the tweets’ relevance for a topic?
    4. Search on Twitter. Traditional solution:
       • Twitter search interface
       • Ordered by time
       • Keyword-based match
    5. Syntactical feature: Hashtag. Is a tweet more relevant if it contains a #hashtag? Hypothesis 1: tweets that contain hashtags are more likely to be relevant than tweets that do not contain hashtags.
    6. Syntactical feature: hasURL. Is a tweet that contains a URL more relevant? Hypothesis 2: tweets that contain a URL are more likely to be relevant than tweets that do not contain a URL.
    7. Syntactical feature: isReply. Is a tweet which is a reply to @somebody more relevant? Hypothesis 3: tweets that are formulated as a reply to another tweet are less likely to be relevant than other tweets.
    8. Syntactical feature: length. Does the length of a tweet influence its relevance for a topic? Hypothesis 4: the longer a tweet, the more likely it is to be relevant and interesting.
    9. Overview of features. Short summary:
       • Topic-sensitive: keyword-based relevance
       • Topic-insensitive: syntactical features
       Are there further features that allow for estimating the relevance?
    10. Semantic features. Find semantics in a tweet to estimate the relevance. Entities: dbp:Tim_Berners-Lee, dbp:World_Wide_Web, dbp:France, dbp:Lyon, dbp:International_World_Wide_Web_Conference.
    11. Semantic features: #entity. Is a tweet with more entities more interesting?
       • 5 entities extracted.
       Hypothesis 5: the more entities a tweet mentions, the more likely it is to be relevant and interesting.
    12. Semantic features: #entity(type). Do the types of semantics influence the relevance?
       • Person: 1
       • Artifact: 1
       • Event: 1
       • Location: 2
       Hypothesis 6: different types of entities are of different importance for estimating the relevance of a tweet.
    13. Semantic features: diversity. How many types are there in the entities?
       • 4 types of entities
       Hypothesis 7: the greater the diversity of concepts mentioned in a tweet, the more likely it is to be interesting and relevant.
    14. Semantic features: sentiment. Was the author of the tweet happy or not?
       • Sentiment: Neutral
       Hypothesis 8: the likelihood of a tweet’s relevance is influenced by its sentiment polarity.
    15. Overview of features. What do we have now?
       • Topic-sensitive: keyword-based, ?
       • Topic-insensitive: syntactical, semantics
       Semantics in the given topic?
    16. Semantic-based relevance. Expand the queries to match more tweets. Example entities: dbp:Lyon, dbp:Tim_Berners-Lee. The reformulated query is expected to get a more accurate retrieval score.
    17. Semantic-based relatedness. Is there a semantic overlap between the query and the tweet? Example entities: dbp:Lyon, dbp:Tim_Berners-Lee.
    18. Overview of features. By now, we have 4 types of features.
       • Topic-sensitive: keyword-based, semantic-based
       • Topic-insensitive: syntactical, semantics
       Can we utilize the contextual information of tweets?
    19. Topic-insensitive contextual features. Does the number of posts influence the relatedness? Hypothesis 9: the higher the number of tweets that have been published by the creator of a tweet, the more likely it is that the tweet is relevant.
    20. Topic-sensitive contextual features. Is this tweet too old? Hypothesis 10: the lower the temporal distance between the query time and the creation time of a tweet, the more likely the tweet is to be relevant to the topic. [Timeline figure labels: T, Query; March 31, April 16]
    21. Summary of features.
       • Topic-sensitive: keyword-based, semantic-based, contextual
       • Topic-insensitive: syntactical, semantics, contextual
    22. Analysis
       • Research questions:
         1. Which features are more influential on predicting the relatedness of a tweet to a certain topic?
         2. Which types of features are more important? Are semantics meaningful?
         3. What’s the performance that we can achieve by utilizing these features?
       • Experimental setup:
         • Consider the search problem as a classification task
         • Classification algorithm = Logistic Regression
    23. Dataset. From the TREC 2011 Microblog Track.
       • Twitter corpus
         • 16 million tweets (Jan. 24th, 2011 – Feb. 8th)
         • 4,766,901 tweets classified as English
         • 6.2 million entity-extractions (140k distinct entities)
       • Relevance judgments
         • 49 topics
         • 40,855 (topic, tweet) pairs
         • 60.31 relevant tweets per topic (on average)
    24. Results. Which type of features matters?

         Features              Precision   Recall   F-measure
         keyword relevance     0.3040      0.2924   0.2981
         semantic relevance    0.3053      0.2931   0.2991
         topic-sensitive       0.3017      0.3419   0.3206
         topic-insensitive     0.1294      0.0170   0.0300
         without semantics     0.3363      0.4828   0.3965
         all features          0.3674      0.4736   0.4138

       Overall, we can achieve precision and recall of over 35% and 45% respectively by applying all the features.
    25. Weights of features. Which feature matters? [Figure: feature weights, with axis ticks from -1 to 2, grouped by feature type. Keyword-based: keyword-based relevance. Syntactical: hasHashtag, hasURL, isReply, length. Semantics: #entities, diversity, sentiment. Semantic-based: relevance, relatedness. Contextual: social context, temporal context.]
    26. Conclusions
       1. We constructed 17 features, including keyword-based relevance, semantics-based relevance, syntactical features, semantic features and contextual features.
       2. Semantic features and topic-sensitive features are meaningful.
       3. The contextual features have little impact on the prediction.
    27. Future work
       • We plan to build on the work done in this paper to implement a search engine for Twitter: Twinder (Tweets Finder).
       • The progress on this work can be found at: http://wis.ewi.tudelft.nl/twinder/
    28. QUESTIONS? April 16th, 2012. Slides: http://goo.gl/fQLaQ. k.tao@tudelft.nl, http://ktao.nl/. THANK YOU!
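
Hypotheses 1 to 4 (slides 5 to 8) each rest on a simple syntactical feature of the tweet text. The following is a minimal sketch of how such features might be computed, not the authors' implementation: the regular expressions, the rule that a reply is a tweet starting with "@", the character-count definition of length, and the example tweets are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass

# Assumed detection patterns; the slides do not specify how hashtags
# or URLs were recognized.
HASHTAG_RE = re.compile(r"#\w+")
URL_RE = re.compile(r"https?://\S+")


@dataclass
class SyntacticFeatures:
    has_hashtag: bool  # Hypothesis 1
    has_url: bool      # Hypothesis 2
    is_reply: bool     # Hypothesis 3
    length: int        # Hypothesis 4 (here: number of characters)


def extract_syntactic_features(text: str) -> SyntacticFeatures:
    """Compute the four syntactical features for one tweet text."""
    stripped = text.strip()
    return SyntacticFeatures(
        has_hashtag=bool(HASHTAG_RE.search(stripped)),
        has_url=bool(URL_RE.search(stripped)),
        # Assumption: a reply is a tweet whose text begins with an @mention.
        is_reply=stripped.startswith("@"),
        length=len(stripped),
    )


# Hypothetical example tweets, not taken from the TREC corpus.
for tweet in (
    "@alice see you at #www2012 http://example.org",
    "just had lunch",
):
    print(extract_syntactic_features(tweet))
```

In a classification setup like the one on slide 22, feature values of this kind would form part of the input vector for each (topic, tweet) pair.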

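The F-measure column in the results table (slide 24) can be cross-checked against the precision and recall columns. The sketch below assumes the F-measure is the balanced harmonic mean of precision and recall (F1). The slides do not state the formula, but the reported numbers agree with it up to rounding of the four-decimal inputs.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F-measure) copied from the results slide.
reported = {
    "keyword relevance": (0.3040, 0.2924, 0.2981),
    "semantic relevance": (0.3053, 0.2931, 0.2991),
    "topic-sensitive": (0.3017, 0.3419, 0.3206),
    "topic-insensitive": (0.1294, 0.0170, 0.0300),
    "without semantics": (0.3363, 0.4828, 0.3965),
    "all features": (0.3674, 0.4736, 0.4138),
}

for name, (p, r, f_reported) in reported.items():
    f = f_measure(p, r)
    print(f"{name:20s} computed={f:.4f} reported={f_reported:.4f}")
```

Small differences in the last digit are expected, since precision and recall are themselves rounded values.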