Searching Twitter: Separating the Tweet from the Chaff


This presentation was given at ICWSM 2011. In this presentation, we report on a qualitative investigation into the different factors that make tweets ‘useful’ and ‘not useful’ for a set of common search tasks. The investigation found 16 features that help make a tweet useful, noting that useful tweets often showed 2 or 3 of these features. ‘Not useful’ tweets, however, typically had only one of 17 clear and striking features.

Our results contribute a novel framework for extracting useful information from real-time streams of social-media content.

Published in: Technology, Business

Searching Twitter: Separating the Tweet from the Chaff

  1. 1. Searching Twitter: Separating the Tweet from the Chaff. Jonathan Hurlock & Max L. Wilson
  2. 2. You sure can!
  3. 3. How do I follow my Interests?
  4. 4.
  5. 5. Yet more Data: Meta Data, Profile Data, Linked Data
  6. 6. Any of it Useful? Who cares how much data there is! “I think the challenge not only for twitter, but for the technology industry at large. Is building more relevant filters, in real time. Like being able to surface valuable information immediately. No matter who it is, who’s listening or who’s broadcasting, is a really really hard problem, and it makes twitter a lot more meaningful [...] We’ve gotten really really good at being able to put content in, into media [...] getting it out in a relevant, valuable way, in real time is still very difficult.” - Jack Dorsey (Creator of Twitter)
  7. 7. Why Twitter? Where is the value? (slide graphic: scattered currency symbols)
  8. 8. Let's go back...
  9. 9. Let's go back... Great Scott!
  10. 10. Asking Friends: Hey, what are you doing? (diagram labels: you & me)
  11. 11. Social Search: What is everyone else doing? (diagram labels: you & me)
  12. 12. Social Search: What is everyone else doing? (diagram labels: friend, friend, friend, friend, you & me)
  13. 13. Existing Knowledge: No need to reinvent the wheel (diagram labels: bob & lisa, you & me). Meredith Ringel Morris, Jaime Teevan, and Katrina Panovich. 2010. What do people ask their social networks, and why?: a survey study of status message Q&A behavior. In Proceedings of the 28th international conference on Human factors in computing systems (CHI '10). ACM, New York, NY, USA, 1739-1748.
  14. 14. Existing Knowledge: No need to reinvent the wheel (diagram labels: lisa, bob & me, you). Meredith Ringel Morris, Jaime Teevan, and Katrina Panovich. 2010. What do people ask their social networks, and why?: a survey study of status message Q&A behavior. In Proceedings of the 28th international conference on Human factors in computing systems (CHI '10). ACM, New York, NY, USA, 1739-1748.
  15. 15. Let's go back to the network: Remember... (diagram labels: you & me)
  16. 16. and if we take a step back... Please mind the gap (diagram labels: friend, friend, friend, friend, you, me)
  17. 17. We start to see interesting things...
  18. 18. Which have value!
  19. 19. Location, experiences, temporal data Yardi, Sarita and Boyd, Danah. ICWSM 2010. Tweeting from the Town Square: Measuring Geographic Local Networks
  20. 20. Location, experiences, temporal data. Political upheaval, emergency events... so what are you tweeting now? Yardi, Sarita and Boyd, Danah. ICWSM 2010. Tweeting from the Town Square: Measuring Geographic Local Networks
  21. 21. Twitter Search: How do you find useful information?
  22. 22. Displaying Results: Realtime
  23. 23. Displaying Results: RT Time, ReTweets, Location, Popularity?
  24. 24. Displaying Results: RT Time, ReTweets, Location, Popularity?
  25. 25. Displaying Results: Making sense of the data.
  26. 26. Displaying Results: Making sense of the data. Michael S. Bernstein, Bongwon Suh, Lichan Hong, Jilin Chen, Sanjay Kairam, Ed H. Chi. Eddi: Interactive Topic-based Browsing of Social Status Streams. In Proc. of ACM User Interface Software and Technology (UIST) conference, Oct. 2010. New York, NY.
  27. 27. Displaying Results: Making sense of the data. Diakopoulos, N.; Naaman, M.; Kivran-Swaine, F., "Diamonds in the rough: Social media visual analytics for journalistic inquiry," Visual Analytics Science and Technology (VAST), 2010 IEEE Symposium on, pp. 115-122, 25-26 Oct. 2010
  28. 28. Interestingness Not necessarily useful! Naveed, Nasir and Gottron, Thomas and Kunegis, Jérôme and Alhadi, Arifah Che (2011) Bad News Travel Fast: A Content-based Analysis of Interestingness on Twitter. pp. 1-7. In: Proceedings of the ACM WebSci11, June 14-17 2011, Koblenz, Germany.
  29. 29. How are we different? What makes us unique?
  30. 30. Finding Usefulness! What constitutes a useful Tweet? (slide graphic: the word "usefulness")
  31. 31. The Method: How did we go about this?
  32. 32. Information Seeking: 3 Information Seeking Tasks. Teevan, J., Ramage, D., & Morris, M. R. (2011). #TwitterSearch: a comparison of microblog search and web search. WSDM '11: Proceedings of the fourth ACM international conference on Web search and data mining (pp. 35-44). New York, NY, USA: ACM.
  33. 33. 20 Participants: They were really nice people!
  34. 34. Search Interface: A simple, easy to understand interface
  35. 35. Think aloud + Interviews: To help us provide more insight (speech bubbles: "It’s useful because..." and "I didn’t because...")
  36. 36. Analysis: Lots and lots of it! (slide graphics: ∑, K)
  37. 37. Grounded Theory: Inductive Coding = Lots of Post-its! Glaser, B. G., & Strauss, A. L. (2009). The Discovery of Grounded Theory: Strategies for Qualitative Research. Piscataway, New Jersey, USA: Transaction Publishers.
  38. 38. Kappa Analysis: Cohen... Fleiss... Landis, J. R., & Koch, G. G. (1977). The Measurement of Observer Agreement for Categorical Data. Biometrics, 33 (1), 159-174.
  39. 39. Extended Kappa Analysis: Multi Coded Kappa. 0.73 (Substantial Agreement) between evaluators & 0.62 (Substantial Agreement) with an independent untrained coder. Harris, J. K., & Burke, R. C. (2005). Do you see what I see? An application of inter-coder reliability in qualitative analysis. American Public Health Association 133rd Annual Meeting & Exposition. Washington, DC, USA: American Public Health Association.
  40. 40. What did we find? Useful & Not-Useful
  41. 41. Useful
      In Tweet Content
        Experience: Someone reporting a personal experience, but not necessarily a suggestion / direction.
        Direct Recommendation: Someone making a direct recommendation, but not necessarily relaying a personal experience.
        Social Knowledge: Containing information that is spreading socially, or becoming general knowledge.
        Specific Information: Where facts are listed directly in tweets, e.g. prices, times etc.
      Reflection on Tweet
        Entertaining: The reader finds them amusing.
        Shared Sentiment: The reader agrees with the author of the tweet.
      Relevant
        Time: The time is current.
        Location: The location is relevant to the query.
  42. 42. Useful (cont.)
      Trust
        Trusted Author: The Twitter account has a reputation / following.
        Trusted Avatar: The visual appearance cultivates trust.
        Trusted Link: A link to a trustworthy, recognisable domain.
      Links
        Actionable Link: The user can perform a transaction by using the link (heavily dependent on trust).
        Media Link: The link is to rich multimedia content.
        Useful Link: The link provides valuable information content, e.g. authoritative information, educated reviews.
      Meta Tweet
        ReTweeted Lots: It's information that others have passed on lots.
        Conversation: It's part of a series of tweets, and they all need to be useful.
  43. 43. Not Useful
      Tweet Content
        No Information: Absence of any event or factual points.
        Introspective: Personal content and personal thoughts for no social benefit.
        Off Topic: Result not related to the given query / TF-IDF irrelevant.
        Too Technical: The content requires specific domain knowledge the reader doesn’t possess.
        Poorly Constructed: Tweets that may have grammatical / spelling errors, or malformed URLs.
      Bad Tweets
        SPAM: Irrelevant or inappropriate messages.
        Wrong Language: Messages sent in a language foreign to the reader.
        Dead Link: A URL which does not work, i.e. a 404.
      Not Relevant
        Time: Out of date content.
        Location: Wrong geographic location.
  44. 44. Not Useful (cont.)
      Trust
        Un-trusted Author: An author the reader feels uneasy about or suspicious of.
        Un-trusted Link: A link the reader feels is suspicious.
      Subjective
        Perspective Oriented: A tweet that is perspective centric, meaning the author is providing their view or projecting an attitude on a subject matter or to a subject / reader.
        Disagree with Tweet: A disagreement between the reader and the author.
        Not Funny: A tweet that is aimed to be humorous, which the reader does not feel is humorous.
      Meta Tweet
        QnA: Part of a conversation; the reader desires the whole convo, not just the question or the answer.
        Repeated: Content the reader has seen before.
  45. 45. Insights: Interesting finds
  46. 46. The Possible Impact: Where could we see the impact of this work?
  47. 47. Search System: A work in progress
  48. 48. Conclusions: So just remember.
  49. 49. Thank you for Listening. Jonathan Hurlock @jonhurlock, Max L. Wilson @gingdottwit. Like the talk? Then please tweet it, by quickly visiting:
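
Slides 38 and 39 report inter-coder agreement as kappa scores, interpreted with the Landis & Koch (1977) bands. The transcript does not show how the authors computed their multi-coded kappa, so the following is only a minimal sketch of plain two-coder Cohen's kappa, not the extended analysis used in the talk. The function names and the example labels are illustrative assumptions; only the code names (taken from the slide tables) and the band thresholds come from the cited sources.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Plain Cohen's kappa for two coders labelling the same items.

    Illustrative only: the talk used an extended, multi-coded kappa
    whose details are not given in the slides.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both coders must label the same non-empty item list")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same code by both coders.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each coder's marginal rate, summed over codes.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        # Both coders used a single identical code; treat as perfect agreement.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def landis_koch(kappa):
    """Map a kappa value to the Landis & Koch (1977) agreement band."""
    if kappa < 0.0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical codings of six tweets by two coders (made-up data).
coder_a = ["Experience", "SPAM", "Experience", "Dead Link", "SPAM", "Experience"]
coder_b = ["Experience", "SPAM", "SPAM", "Dead Link", "SPAM", "Experience"]

kappa = cohen_kappa(coder_a, coder_b)
print(round(kappa, 3), landis_koch(kappa))
```

Under these bands, both scores reported on slide 39 (0.73 and 0.62) fall in the "substantial" range, which matches the labels on the slide.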