
Discovery of Twitter User Interestingness Based on Retweets, Reply Mentions and Pure Mentions Relationships


With the rising popularity of social media such as Facebook, Twitter and Instagram, sentiment classification for social media has become a hot research topic. Many studies have been conducted on Twitter, as it is one of the most widely used social media platforms. Previous studies have approached the problem as a tweet-level classification task in which each tweet is classified as positive, negative or neutral. However, an overall sentiment might not be useful to business organizations that use Twitter to monitor consumer opinion of their products and services. Instead, it is more useful to determine specifically what users are happy or unhappy about in their tweets. This paper proposes the discovery of Twitter user-level interestingness based on relationships such as retweets, reply-mentions and pure-mentions, using Google's PageRank algorithm. We conducted experiments and compared the results with hand-marked results from seven annotators.

Published in: Data & Analytics


  1. Discovery of Twitter User Interestingness Based on Retweets, Reply Mentions and Pure Mentions Relationships. Ong Kok Chien, Poo Kuan Hoong and Chiung Ching Ho, Faculty of Computing and Informatics, Multimedia University, Cyberjaya. 2016 International Conference on Information in Business and Technology Management (I2BM)
  2. Outline
       • Introduction
       • Objective
       • Methods
       • Results
       • Summary
  3. Introduction
       • Explore the graph relationships between Retweets (RT), Reply-Mentions (RM) and Pure-Mentions (PM)
       • Compare the ranking with the hand-marked (HM) ranking by seven (7) annotators
  4. Twitter
       • Microblogging site with a maximum of 140 characters per tweet.
       • "A Tweet is an expression of a moment or idea. It can contain text, photos, and videos. Millions of Tweets are shared in real time, every day."
       • Features shown: Reply, Retweet, Favorite, Hashtags
       • Source: https://about.twitter.com/what-is-twitter/story-of-a-tweet
  5. Objectives
       • To rank Twitter users using PageRank.
  6. Methods
       • Link-based ranking algorithm (PageRank)
       • Twitter users as nodes.
       • Relationships as edges.
  7. Example
       • PageRank (PR)
       • E.g.: backlinks in websites, referring back to the original content.
       • Sergey Brin & Larry Page (1998). The anatomy of a large-scale hypertextual Web search engine.
       • [Figure: PageRank illustration, image extracted from Wikipedia]
  8. Example
       • [Figure: example tweet. Labels: Khairykj (Minister of Youth & Sports), shatyrah2, AyenSanji]
       • Source: https://twitter.com/Khairykj/status/410964119521460224
  9. Architecture
       • [Diagram: Twitter Streaming API, Configure Keywords, JSON raw data, HiveQL, Unix script; steps numbered 1 to 4]
  10. Keywords
       • HyppTV
       • Streamyx
       • UMobile
       • Unifi
       • Yes4G
       • Celcom
       • xpaxsays
  11. Basic Statistics
       Dataset (1 Feb 2015 to 7 Feb 2015):
       • Total tweets: 7,931
       • After discarding non-native retweets: 7,922
       • English tweets (language=en): 2,229
       • Unique RT pairs of users: 512
       • Unique PM pairs of users: 620
       • Unique RM pairs of users: 545
       • Unique Full-Mention (FM) pairs of users: 1,154
  12. Categories of Tweets
       Tweets are categorized into the following categories:
       • (1) News - products/company info;
       • (2) Advertisements - promotion;
       • (3) Business - special offers;
       • (4) Jokes - funny/prank content;
       • (5) Questions - seeking answers/responses;
       • (6) Answers - response to a question (@mentions);
       • (7) Statement - complaints/comments/feedback;
       • (8) Conversation - response to another tweet; and
       • (9) Irrelevant - not related to telco products/services.
  13. Interestingness score schema
       The interestingness score ranges from 0 to 4:
       • 0 = Irrelevant;
       • 1 = Less Interesting/Informative;
       • 2 = Interesting/Informative;
       • 3 = Quite Interesting/Informative; and
       • 4 = Very Interesting/Informative
  14. Results
       • Average informative/interestingness score: 1.33 out of 4, from 7 annotators over 2 iterations.
       • Agreement amongst the 7 annotators after 2 iterations was 62.27% for the 9 categories and 51.64% for the score (0 to 4).
  15. Results
       Rank  HM              RM             PM              FM             RT
       1     Asianadotmy     Azwnrafi       Zaynneutron     Zaynneutron    Zaynneutron
       2     Zulhusnia       Lauravinzant   Shahril_Wokay2  Ndiarzali88    Ndiarzali88
       3     IzRijap         TuneTalk       Azwnrafi        TuneTalk       UniFiEdge
       4     Socasnov        FirdausAzil    TuneTalk        Lauravinzant   TuneTalk
       5     Pjolll          Alliebnormand  Alliebnormand   Azwnrafi       Asianadotmy
       6     uk_htc          FookHeng_Lee   Twtwanitaa      FirdausAzil    uk_htc
       7     FookHeng_Lee    Pjolll         Lauravinzant    FookHeng_Lee   HyppWorld
       8     TuneTalk        Ndiarzali88    FirdausAzil     Twtwanitaa     Zulhusnia
       9     NurIllihazwani  UniFiEdge      FookHeng_Lee    Pjolll         IzRijap
       10    Shahril_Wokay2  HyppWorld      Pjolll          Alliebnormand  FookHeng_Lee
       • RT shows the closest match to the HM ranking sequence, compared with RM and PM.
       • Between RM and PM, RM appears to be the better match to the HM sequence.
  16. Summary
       • A PageRank (PR) graph-relationship analysis of how RT, RM and PM affect the perception of user-level informativeness/interestingness, validated against HM evaluations.
  17. Future Work
       • Further evaluation to be conducted using different weightages of RT, RM and PM.
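
The Methods slide treats Twitter users as nodes and relationships as edges, ranked with PageRank. A minimal sketch of that idea follows. It is not the authors' implementation: the edge direction (from the user who retweets or mentions to the user being retweeted or mentioned, by analogy with backlinks), the damping factor of 0.85, and the toy user names are all assumptions made for illustration.

```python
def pagerank(edges, damping=0.85, iterations=100, tol=1e-10):
    """Power-iteration PageRank over directed (source, target) edges."""
    nodes = sorted({user for edge in edges for user in edge})
    out_links = {user: set() for user in nodes}
    for src, dst in edges:
        out_links[src].add(dst)

    n = len(nodes)
    rank = {user: 1.0 / n for user in nodes}
    for _ in range(iterations):
        # Rank held by users with no outgoing edges is spread evenly.
        dangling = sum(rank[u] for u in nodes if not out_links[u])
        new_rank = {u: (1.0 - damping) / n + damping * dangling / n for u in nodes}
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        converged = sum(abs(new_rank[u] - rank[u]) for u in nodes) < tol
        rank = new_rank
        if converged:
            break
    return rank


# Hypothetical toy graph: an edge (a, b) means user a retweeted or mentioned user b.
toy_edges = [
    ("alice", "carol"),
    ("bob", "carol"),
    ("dave", "carol"),
    ("carol", "alice"),
]
scores = pagerank(toy_edges)
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy graph the user who is referred to most often ends up at the top of `ranking`, which mirrors how a heavily backlinked page ranks highly on the web.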
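
The Basic Statistics slide counts unique user pairs per relationship, derived from the raw JSON collected through the Streaming API. The slides do not define how each relationship is detected, so the sketch below rests on assumptions: a tweet with a `retweeted_status` object yields an RT pair, a mention in a tweet with `in_reply_to_screen_name` set is treated as RM, and a mention in any other tweet is treated as PM. The authors' pipeline used HiveQL and a Unix script; Python is used here only for illustration, and the sample tweets are made up.

```python
import json
from collections import defaultdict


def relationship_pairs(tweet):
    """Return (kind, source_user, target_user) tuples for one tweet dict."""
    author = tweet["user"]["screen_name"]
    original = tweet.get("retweeted_status")
    if original:
        # Assumption: a native retweet contributes only an RT pair.
        return [("RT", author, original["user"]["screen_name"])]
    mentions = tweet.get("entities", {}).get("user_mentions", [])
    # Assumption: mentions inside a reply are RM, all other mentions are PM.
    kind = "RM" if tweet.get("in_reply_to_screen_name") else "PM"
    return [(kind, author, m["screen_name"]) for m in mentions]


def unique_pairs(json_lines, lang="en"):
    """Collect unique (source, target) pairs per relationship kind."""
    pairs = defaultdict(set)
    for line in json_lines:
        line = line.strip()
        if not line:
            continue
        tweet = json.loads(line)
        if tweet.get("lang") != lang:
            continue  # keep only tweets in the requested language
        for kind, src, dst in relationship_pairs(tweet):
            pairs[kind].add((src, dst))
    return pairs


# Made-up sample records, one JSON object per line.
sample = [
    json.dumps({"lang": "en", "user": {"screen_name": "user_a"},
                "retweeted_status": {"user": {"screen_name": "user_b"}},
                "entities": {"user_mentions": [{"screen_name": "user_b"}]}}),
    json.dumps({"lang": "en", "user": {"screen_name": "user_c"},
                "in_reply_to_screen_name": "user_b",
                "entities": {"user_mentions": [{"screen_name": "user_b"}]}}),
    json.dumps({"lang": "en", "user": {"screen_name": "user_a"},
                "in_reply_to_screen_name": None,
                "entities": {"user_mentions": [{"screen_name": "user_d"}]}}),
    json.dumps({"lang": "ms", "user": {"screen_name": "user_e"},
                "entities": {"user_mentions": [{"screen_name": "user_a"}]}}),
]
pairs = unique_pairs(sample)
```

Storing pairs in sets means a user who retweets the same account several times still contributes a single pair, which matches the "unique pairs" wording on the slide.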
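
The Future Work slide proposes evaluating different weightages of RT, RM and PM. One way this could be explored is to merge the three edge sets into a single weighted graph and run a weighted PageRank over it. The sketch below is a guess at such a setup, not a method described in the slides; the weights, user names and function names are hypothetical.

```python
from collections import defaultdict


def combine_edges(edge_sets, weights):
    """Merge per-relationship edge sets into one {(src, dst): weight} mapping."""
    combined = defaultdict(float)
    for kind, edges in edge_sets.items():
        weight = weights[kind]
        if weight <= 0:
            continue  # a relationship with zero weight contributes no edges
        for src, dst in edges:
            combined[(src, dst)] += weight
    return dict(combined)


def weighted_pagerank(weighted_edges, damping=0.85, iterations=100):
    """PageRank where a user's rank is split in proportion to edge weights."""
    nodes = sorted({user for edge in weighted_edges for user in edge})
    out_weight = defaultdict(float)
    for (src, _), w in weighted_edges.items():
        out_weight[src] += w

    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iterations):
        # Users with no outgoing edges share their rank evenly with everyone.
        dangling = sum(rank[u] for u in nodes if out_weight[u] == 0)
        new_rank = {u: (1.0 - damping) / n + damping * dangling / n for u in nodes}
        for (src, dst), w in weighted_edges.items():
            new_rank[dst] += damping * rank[src] * w / out_weight[src]
        rank = new_rank
    return rank


# Hypothetical edge sets and weights, chosen only to show the mechanics.
edge_sets = {
    "RT": {("user_a", "user_b")},
    "RM": {("user_c", "user_b"), ("user_b", "user_c")},
    "PM": {("user_a", "user_c")},
}
weights = {"RT": 3.0, "RM": 2.0, "PM": 1.0}
combined = combine_edges(edge_sets, weights)
weighted_scores = weighted_pagerank(combined)
```

Changing the entries in `weights` and comparing the resulting rankings against the hand-marked ranking would be one way to run the evaluation the slide describes.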
