It’s not in their tweets: Modeling topical expertise of Twitter users

Presented at the ASE/IEEE International Conference on Social Computing 2012 in Amsterdam.
  • Let me start by motivating a bit the research we are doing. On social media applications like Twitter, everyone can provide information. Judging the topical expertise of users is therefore often crucial in order to decide whether the information provided by a user is trustworthy and relevant.
  • What people usually do, then, is check out the profile page of the Twitter user they are interested in, in order to figure out who this person is and what he/she is doing. On the profile page people are confronted with four different types of textual information which may potentially reveal information about the user’s expertise: at the top of the page they find the bio information; below the bio section they find the most recent tweets and retweets; and on the right side of the page they find the most recent list memberships of a user.
  • In this work we try to address the following two research questions.
  • To address our first research question we conducted a user study to find out which types of user-related data are most useful for humans to make sense of the expertise of a Twitter user.
  • One can see from this that content alone is not enough to inform expertise judgments: our participants were not able to differentiate experts from novices just by seeing their tweets. Contextual information – i.e., lists and bio – was more useful for informing expertise judgments, and the best performance was achieved when participants saw everything – i.e., tweets, lists and bio information.
  • For the second experiment we created a dataset as follows. Users with a high Wefollow rank are nominated by many other users in that directory, and previous research (which won the best paper award at this conference in 2010) showed that users with a high Wefollow rank tend to be perceived as topical experts.
  • We ended up having a bit more than 1100 users.
  • If one wants to create topical expertise profiles of users, the first thing one needs is to learn topics. We used our dataset to learn semantically cohesive bags of words (i.e., topics): we fitted a topic model (LDA) to our dataset, where the aggregation of all data per user formed one document.
  • Next we used statistical inference to relate users with topics. We used four types of user-related data to infer the topic distribution of a user.
  • Our results show that on average the topic profiles of users inferred from lists differ most from those inferred from the other data types. This indicates that lists contain other types of information. One potential explanation is that users add other users to lists which reflect what they know about them from real life. For example, I might add a colleague of mine to a list called “semantic web researcher” although this person does not use Twitter at all for talking about the semantic web.
  • To gain further insight into the nature of lists we decided to inspect a small sample of lists manually.
  • The third category of lists is the most useful for our task. We therefore wanted to know what percentage of lists belong to this class.
  • So far we only know that information obtained from user lists differs significantly from other types of information and that many user lists are topical lists, but we don’t know which type of information best reflects the topical expertise of users.
  • The NMI measures the mutual dependence of two random variables. A higher NMI value implies that a topic distribution more closely matches the underlying category information. List-based topic distributions reflect the underlying category information best – i.e., users in one Wefollow directory tend to appear in topically similar lists.
  • So far we only know that different types of user-related data lead to different topic distributions, but we don’t know which topic distributions are best – i.e., reflect the expertise of a user most accurately. To answer this question we compared the different topic distributions within a classification task whose aim was to classify users into the right Wefollow directory. We only used Wefollow users with a high rank (previous research has shown that such users tend to be perceived as experts for that topic) who showed up in only one of the 10 directories. We assume that the topic distribution which allows classifying users most accurately into their expertise area is the one that reflects users’ expertise most accurately.
  • Our results show that for all 10 Wefollow directories a classifier trained with list-based topic distributions as features performs best – i.e., leads to the highest F-score.
  • The number of topics we learn may also impact the performance of our classifier. We therefore compared the F-scores of classifiers trained with different numbers of topics. From this graph we can see again that list-based topic annotations lead to the best performing classifier, no matter how broad or fine-grained the topics are.
  • Inspecting the confusion matrices also shows that a classifier trained with list-based features exhibits less confusion. An optimal classifier would produce a red box with a white diagonal line. The x-axis of each confusion matrix shows the reference values and the y-axis shows the predictions; the lighter the color, the higher the value.

    1. It’s not in their tweets: Modeling topical expertise of Twitter users. Claudia Wagner, Vera Liao, Peter Pirolli, Les Nelson and Markus Strohmaier. Amsterdam, 16.4.2012
    2. with… Vera Liao, Peter Pirolli, Markus Strohmaier, Les Nelson
    3. Motivation: On Twitter, information consumption is mainly driven by social networks. Users need to decide whom to follow in order to get trustworthy and relevant information about the topics they are interested in. They draw on evidence from real life or search online for evidence.
    4. Searching for evidence on a Twitter user’s profile page: bio, list memberships, tweets and retweets.
    5. Research Questions: How useful are different types of user-related data for humans to inform their expertise judgments of Twitter users? How useful are different types of user-related data for learning computational expertise models of users?
    6. User Study: Expertise Judgments of Humans. 16 participants. Task: rate (1-5) the expertise level of selected Twitter users (with high and low expertise) for the topic „semanticweb“. Three conditions under which the user accounts were presented to subjects: Condition 1: tweets, retweets, lists and bio; Condition 2: only tweets and retweets are shown; Condition 3: only lists and bio are shown. For each condition and expertise level there are 4 Twitter pages (4 replicates), i.e. 4 * 3 * 2 = 24 pages to rate per subject.
    7. User Study: Expertise Judgments of Humans. 2-way ANOVA with within-subject variables: Twitter user expertise (high/low) and the 3 conditions. The interaction between conditions and Twitter user expertise is significant (F(2) = 8.326, p < 0.01). A post-hoc test shows that participants’ ability to correctly judge the expertise of Twitter users differs significantly between conditions 1 and 2 and between conditions 2 and 3. [Figure: mean rating per Twitter user for low vs. high expertise under condition 1 (tweets, bio and lists), condition 2 (only tweets) and condition 3 (only bio and lists).] (A sketch of this ANOVA follows the transcript.)
    8. Research Questions (revisited): How useful are different types of user-related data for humans to inform their expertise judgments of Twitter users? How useful are different types of user-related data for learning computational expertise models of users?
    9. Dataset: 10 topics (semanticweb, biking, wine, democrat, republican, medicine, surfing, dogs, nutrition and diabetes). We use Wefollow directories as a manually created proxy ground truth for expertise: top 150 users per Wefollow directory, excluding users who are in more than one of the 10 directories and users who mainly tweet non-English.
    10. Dataset: 1145 users; most recent 1000 tweets and retweets, most recent 300 user lists, and bio info per user. Information on Twitter is sparse, so we extend URLs in tweets, retweets and bio, use list names as search query terms, and use the top 5 search result snippets obtained from Yahoo BOSS to enrich list information.
    11. Computational Expertise Models, Methodology: Learn latent semantic structures (topics) from Twitter communication by fitting an LDA model. [Figure: top 20 stemmed words of 3 randomly selected topics (T1, T2, T3) learned by an LDA model with T=50.] (A gensim sketch of this step follows the transcript.)
    12. Computational Expertise Models, Methodology: Associate users with topics by using statistical inference based on the different types of user-related data; this yields a user’s topical expertise profile. [Figure: separate topic distributions inferred from bio, lists, tweets and retweets.] (A sketch of this inference step follows the transcript.)
    13. Topical similarity between lists/bio/tweets/retweets. [Figure: JS-divergence between topic distributions inferred from each pair of data types (List-Bio, List-Tweet, List-Retweet, Bio-Tweet, Bio-Retweet, Tweet-Retweet) for 10 to 600 topics.] (A sketch of the divergence computation follows the transcript.)
    14. Types of User Lists: manual inspection of user lists. We selected 10 users at random and inspected their user list memberships (455 user lists). We found 3 main classes of user lists: personal judgments (e.g., “great people”, “geeks”), personal relationships (e.g., “my family”, “colleagues”) and topical lists (e.g., “science”, “researcher”, “healthcare”).
    15. Value of User Lists: 3 human raters judged whether a list (label and/or description) belongs to the class of topical lists. 77.67% of user lists were topical lists; inter-rater agreement Kappa = 0.62. (A sketch of one way to compute such an agreement score follows the transcript.)
    16. Quantify the Value of Lists/Bio/Tweets/RTs: Which type of information best reflects the topical expertise of a user? Information-theoretic evaluation: which type of topic distribution best reflects the underlying category information of the user? Measured via the Normalized Mutual Information (NMI) between users’ topic distributions and users’ Wefollow directories. Task-based evaluation: which type of topic distribution is most useful for classifying users into their Wefollow directories? Measured via the F1-score of classification models.
    17. Information-Theoretic Evaluation of Computational Expertise Models. [Figure: NMI of tweet-, bio-, list- and retweet-based topic distributions for T=10, 50, 80, 200, 400 and 600 topics.] (A sketch of one NMI reading follows the transcript.)
    18. Task-based Evaluation of Computational Expertise Models: Compare topic distributions inferred via different types of user-related data within a classification task. Objective: classify users into Wefollow directories by using topic distributions as features. Classification task: train a Partial Least Squares classifier with topic distributions inferred via the different types of user-related data as features; perform 5-fold cross-validation; use the F-measure (harmonic mean of precision and recall) to compare the classifiers’ performance. (A sketch of this evaluation loop follows the transcript.)
    19. Task-based Evaluation of Computational Expertise Models. [Figure: F-measure per Wefollow directory (biking, democrat, diabetes, dogs, medicine, nutrition, republican, semanticweb, surfing, wine) for bio-, list-, tweet- and retweet-based features.]
    20. Task-based Evaluation of Computational Expertise Models. [Figure: F-measure of bio-, list-, tweet- and retweet-based classifiers as the number of topics varies from T=10 to T=700.]
    21. Task-based Evaluation of Computational Expertise Models, T=300. [Figure: confusion matrices of the list-based and tweet-based classifiers over the 10 directories; the x-axis shows reference values, the y-axis shows predictions.] (A plotting sketch follows the transcript.)
    22. Conclusions: Different types of user-related data lead to different topic annotations. List-based topic annotations are most distinct from all others; bio-, tweet- and retweet-based topic annotations are quite similar. For creating topical expertise profiles of users, information about their list memberships is most useful. For informing humans’ expertise judgments about Twitter users, contextual information (a user’s bio and list memberships) is most useful.
    23. Implications & Limitations: User interface: make user lists and bio information more prominent, and create incentives for people to use lists more heavily, e.g. by providing weekly list summaries. Search and recommender systems could benefit from exploiting user list information. Limitation: results are biased towards users with a high Wefollow rank.
    24. Bio and user lists are useful for judging topical expertise. THANK YOU. claudia.wagner@joanneum.at, http://claudiawagner.info (image src: http://adobeairstream.com/green/a-natural-predicament-sustainability-in-the-21st-century/)
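The interaction test on slide 7 is a two-way repeated-measures ANOVA. A minimal sketch of how it could be run with statsmodels, assuming a hypothetical long-format file ratings.csv with participant, condition, expertise and rating columns (the deck does not specify its analysis tooling):

```python
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# Hypothetical long format: one row per (participant, condition, expertise)
# cell, holding the participant's mean rating for that cell.
ratings = pd.read_csv("ratings.csv")

# Two within-subject factors: presentation condition (1-3) and the rated
# user's actual expertise level (high/low).
anova = AnovaRM(
    data=ratings,
    depvar="rating",
    subject="participant",
    within=["condition", "expertise"],
).fit()

print(anova)  # inspect the condition x expertise interaction term
```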
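Slide 11's fitting step, sketched with gensim. The toy documents below stand in for the real per-user aggregated text; only T=50 is taken from the slide, the other parameters are illustrative:

```python
from gensim import corpora
from gensim.models import LdaModel

# Toy stand-in for the real corpus: one (stemmed, stop-filtered) token list
# per user, aggregating that user's tweets, retweets, bio and list texts.
user_docs = [
    ["semant", "web", "ontolog", "rdf", "data"],
    ["bike", "ride", "trail", "gear", "race"],
]

dictionary = corpora.Dictionary(user_docs)
corpus = [dictionary.doc2bow(doc) for doc in user_docs]

# T=50 topics, as on the slide; the deck also varies T from 10 to 700.
lda = LdaModel(corpus=corpus, id2word=dictionary, num_topics=50, passes=10)

# Top words of one topic, mirroring the slide's topic table.
print(lda.show_topic(0, topn=20))
```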
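Continuing the sketch above, slide 12's inference step can be approximated by folding each type of user-related text into the trained model separately. The token lists are hypothetical placeholders for one user's bio, lists, tweets and retweets:

```python
# Hypothetical per-data-type token lists for a single user.
bio_tokens, list_tokens = ["semant", "web"], ["rdf", "ontolog"]
tweet_tokens, rt_tokens = ["bike", "ride"], ["trail"]

def topic_distribution(lda, dictionary, tokens, num_topics=50):
    """Infer a dense topic distribution for one bag of tokens."""
    bow = dictionary.doc2bow(tokens)
    dist = [0.0] * num_topics
    for topic_id, prob in lda.get_document_topics(bow, minimum_probability=0.0):
        dist[topic_id] = prob
    return dist

bio_dist = topic_distribution(lda, dictionary, bio_tokens)
list_dist = topic_distribution(lda, dictionary, list_tokens)
tweet_dist = topic_distribution(lda, dictionary, tweet_tokens)
rt_dist = topic_distribution(lda, dictionary, rt_tokens)
```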
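The pairwise comparison on slide 13 uses the Jensen-Shannon divergence. Note that SciPy's jensenshannon returns the JS distance (the square root of the divergence), so this sketch squares it; base 2 keeps values in [0, 1], matching the slide's axis. Variables continue from the previous sketch:

```python
import numpy as np
from scipy.spatial.distance import jensenshannon

def js_divergence(p, q):
    # jensenshannon() is the distance; square it to get the divergence.
    return jensenshannon(np.asarray(p), np.asarray(q), base=2) ** 2

print(js_divergence(list_dist, bio_dist))    # List-Bio
print(js_divergence(list_dist, tweet_dist))  # List-Tweet
print(js_divergence(bio_dist, tweet_dist))   # Bio-Tweet
```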
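Slide 15 does not say which kappa variant was used for the three raters; one plausible multi-rater choice is Fleiss' kappa, sketched here with a hypothetical judgment matrix:

```python
import numpy as np
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

# Hypothetical matrix: one row per list, one column per rater;
# 1 = the rater judged the list "topical", 0 = not topical.
judgments = np.array([
    [1, 1, 1],
    [1, 0, 1],
    [0, 0, 0],
    [1, 1, 1],
])

counts, _ = aggregate_raters(judgments)  # rows become per-category counts
print(fleiss_kappa(counts))
```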
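Slide 17 leaves open how a continuous topic distribution is compared against discrete directory labels via NMI; one plausible reading is to use each user's dominant topic as a cluster label, as in this sketch (all data below is a toy stand-in):

```python
import numpy as np
from sklearn.metrics import normalized_mutual_info_score

# Toy stand-ins: list-based topic distributions for 8 users and their
# Wefollow directory labels.
topic_dists = np.random.dirichlet(np.ones(50), size=8)
directories = ["biking", "wine", "biking", "wine",
               "dogs", "dogs", "wine", "biking"]

# Treat each user's most probable topic as a cluster assignment and
# measure how well it aligns with the directory labels.
dominant_topic = np.argmax(topic_dists, axis=1)
print(normalized_mutual_info_score(directories, dominant_topic))
```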
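scikit-learn has no dedicated Partial Least Squares classifier, so this sketch of slide 18's evaluation loop uses PLS-DA (regressing one-hot labels with PLSRegression and taking the argmax), a common way to use PLS for classification; the deck does not detail its exact PLS variant, and X, y and all parameters below are illustrative:

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import KFold
from sklearn.preprocessing import LabelBinarizer

def pls_da_f1(X, y, n_components=10, folds=5):
    """Mean macro F1 of a PLS-DA classifier under k-fold cross-validation."""
    lb = LabelBinarizer()
    Y = lb.fit_transform(y)  # one-hot directory labels
    scores = []
    for train, test in KFold(n_splits=folds, shuffle=True, random_state=0).split(X):
        pls = PLSRegression(n_components=n_components)
        pls.fit(X[train], Y[train])
        # Predicted class = directory with the highest regression score.
        pred = lb.classes_[np.argmax(pls.predict(X[test]), axis=1)]
        scores.append(f1_score(y[test], pred, average="macro"))
    return float(np.mean(scores))

# Toy usage: random topic-distribution features for 40 users, 4 directories.
X = np.random.dirichlet(np.ones(50), size=40)
y = np.array(["biking", "wine", "dogs", "medicine"] * 10)
print(pls_da_f1(X, y))
```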
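Heatmaps like those described on slide 21 can be reproduced with a few lines (toy labels below). The transpose puts reference values on the x-axis and predictions on the y-axis, and the "hot" colormap renders higher values lighter, matching the speaker notes:

```python
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix

# Toy reference and predicted directory labels.
y_true = ["biking", "wine", "biking", "dogs", "wine", "dogs"]
y_pred = ["biking", "wine", "dogs", "dogs", "wine", "biking"]

labels = sorted(set(y_true))
cm = confusion_matrix(y_true, y_pred, labels=labels)

# Transpose so the x-axis shows reference values, the y-axis predictions.
plt.imshow(cm.T, cmap="hot")
plt.xticks(range(len(labels)), labels, rotation=90)
plt.yticks(range(len(labels)), labels)
plt.xlabel("reference")
plt.ylabel("prediction")
plt.show()
```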
