Let me start by motivating our research a bit. On social media applications like Twitter, everyone can provide information. Judging the topical expertise of users is therefore often crucial for deciding whether the information provided by a user is trustworthy and relevant.
So what people usually do is check out the profile page of the Twitter user they are interested in, in order to figure out who this person is and what he/she is doing. On the profile page, people are confronted with four different types of textual information which may potentially reveal the user's expertise: at the top of the page they find the bio information; below the bio section they find the most recent tweets and retweets; and on the right side of the page they find the user's most recent list memberships.
In this work we try to address the following two research questions.
To address our first research question, we conducted a user study to find out which types of user-related data are most useful for humans when making sense of the expertise of a Twitter user.
One can see from this that content alone is not enough to inform users' expertise judgments: our participants were not able to differentiate experts from novices just by seeing their tweets. Contextual information, i.e. lists and bio, was more useful for informing expertise judgments, and the best performance was achieved when participants saw everything, i.e. tweets, lists and bio information.
For the second experiment we created a dataset as follows. Users with a high Wefollow rank are nominated by many other users in that directory. Previous research (which won the best paper award at this conference in 2010) showed that users with a high Wefollow rank tend to be perceived as topical experts.
We ended up with a bit more than 1,100 users.
If one wants to create topical expertise profiles of users, the first thing one needs is to learn topics. We used our dataset to learn semantically cohesive bags of words (i.e. topics): we fitted a topic model (LDA) to our dataset, where the aggregation of all data per user formed one document.
Next we used statistical inference to relate users with topics. We used four types of user-related data to infer the topic distribution of a user.
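As an illustration (not the original pipeline), the two steps above, learning topics from aggregated per-user documents and then inferring a topic distribution from a single data source such as the bio, might be sketched with scikit-learn's LDA implementation; the toy corpus and the three-topic setting here are invented:

```python
# Sketch with invented toy data; the paper used far larger corpora and T=50+ topics.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Step 1: one "document" per user, aggregating all of that user's text.
user_docs = [
    "semantic web linked data ontology rdf sparql",
    "bike ride trail mountain cycling race",
    "wine tasting vineyard grape cellar bottle",
]
vec = CountVectorizer()
X = vec.fit_transform(user_docs)

lda = LatentDirichletAllocation(n_components=3, random_state=0)
lda.fit(X)

# Step 2: infer a topic distribution for one type of user-related data
# in isolation (here, a hypothetical bio-only text).
bio_only = vec.transform(["rdf ontology sparql"])
theta = lda.transform(bio_only)  # per-document topic distribution, sums to 1
```

Repeating the inference step per data source (bio, lists, tweets, retweets) yields one topic distribution per source for the same user, which is what the later comparisons rely on.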
Our results show that on average topic profiles of users inferred from… This indicates that lists contain other types of information. One potential explanation is that users add other users to lists that reflect what they know about them from real life. For example, I might add a colleague of mine to a list called "semantic web researcher" although this person does not use Twitter at all for talking about the semantic web.
To gain further insight into the nature of lists, we decided to inspect a small sample of lists manually.
The third category of lists is most useful for our task. Therefore we wanted to know what percentage of lists belong to this class.
So far we only know that information obtained from user lists differs significantly from other types of information, and that many user lists are topical lists, but we don't know which type of information best reflects the topical expertise of users.
The NMI measures the mutual dependence of two random variables. A higher NMI value implies that a topic distribution more closely matches the underlying category information. List-based topic distributions reflect the underlying category information best, i.e. users in one Wefollow directory tend to appear in topically similar lists.
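For reference, the NMI comparison could be sketched as follows with scikit-learn; the directories and topic assignments below are toy values, and reducing each user to a single dominant topic is a simplification of comparing full topic distributions:

```python
# Toy sketch: compare topic assignments against Wefollow directory labels.
from sklearn.metrics import normalized_mutual_info_score

directories = ["semanticweb", "semanticweb", "biking", "biking", "wine", "wine"]
dominant_topic = [0, 0, 1, 1, 2, 2]  # perfectly aligned with the directories
noisy_topic = [0, 1, 1, 2, 2, 0]     # poorly aligned

nmi_good = normalized_mutual_info_score(directories, dominant_topic)
nmi_bad = normalized_mutual_info_score(directories, noisy_topic)
# The higher NMI indicates the assignment that better matches
# the underlying category information.
```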
So far we only know that different types of user-related data lead to different topic distributions, but we don't know which topic distributions are best, i.e. reflect the expertise of a user most accurately. To answer this question we compared the different topic distributions within a classification task. The aim of the classification task was to classify users into the right Wefollow directory. We only used Wefollow users with a high rank (previous research has shown that users with a high Wefollow rank tend to be perceived as experts for that topic) who showed up in only one of the 10 directories. We assume that the topic distribution which allows classifying users most accurately into their expertise area is the one that reflects users' expertise most accurately.
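A minimal sketch of this evaluation protocol follows, with invented random data and a logistic regression standing in for the Partial Least Squares classifier used in the actual experiments (scikit-learn offers no off-the-shelf PLS classifier); only the protocol itself, topic distributions as features, 5-fold cross-validation, macro-averaged F-measure, mirrors the talk:

```python
# Sketch with synthetic data; the real features were LDA topic distributions.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_users, n_topics = 200, 10
# Stand-in for per-user topic distributions (rows sum to 1).
X = rng.dirichlet(np.ones(n_topics), size=n_users)
# Toy directory label, loosely tied to the dominant topic.
y = X.argmax(axis=1)

clf = LogisticRegression(max_iter=1000)  # stand-in for the PLS classifier
scores = cross_val_score(clf, X, y, cv=5, scoring="f1_macro")
mean_f1 = scores.mean()
```

Running this once per feature type (bio-, list-, tweet- and retweet-based distributions) and comparing the mean F-scores is the comparison the next slides report.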
Our results show that for all 10 Wefollow directories a classifier trained with list-based topic distributions as features performs best – i.e. leads to the highest F-score.
The number of topics we learn may also impact the performance of our classifier. We therefore compared the F-scores of classifiers trained with different numbers of topics. From this graph we can see again that list-based topic annotations lead to the best-performing classifier, no matter how broad or fine-grained the topics are.
Also by inspecting the confusion matrices one can see that a classifier trained with list-based features shows less confusion. An optimal classifier would lead to a red box with a white diagonal line. The x-axis of each confusion matrix shows the reference values and the y-axis shows the predictions. The lighter the color the higher the value.
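To make the axis convention concrete, here is a toy confusion matrix in the slide's layout (x-axis: reference values, y-axis: predictions); the labels and predictions are invented:

```python
# Toy example; the real matrices cover all 10 Wefollow directories at T=300.
from sklearn.metrics import confusion_matrix

reference = ["biking", "biking", "wine", "wine", "semanticweb"]
predicted = ["biking", "wine", "wine", "wine", "semanticweb"]
labels = ["biking", "semanticweb", "wine"]

# scikit-learn puts true labels on rows and predictions on columns,
# so transpose to match the slide's layout (x = reference, y = prediction).
cm = confusion_matrix(reference, predicted, labels=labels).T
# Off-diagonal mass is "confusion"; a perfect classifier is purely diagonal.
```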
It's not in their tweets: Modeling topical expertise of Twitter users
Claudia Wagner, Vera Liao, Peter Pirolli, Les Nelson and Markus Strohmaier
Amsterdam, 16.4.2012
with… Vera Liao, Peter Pirolli, Markus Strohmaier, Les Nelson
3 Motivation
On Twitter, information consumption is mainly driven by social networks
Users need to decide whom to follow in order to get trustworthy and relevant information about the topics they are interested in
Evidence from real life
Search online for evidence
Searching for evidence at a Twitter user's profile page
Bio
List Memberships
Tweets and Retweets
6 Research Questions
How useful are different types of user-related data for humans to inform their expertise judgments of Twitter users?
How useful are different types of user-related data for learning computational expertise models of users?
User Study: Expertise Judgments of Humans
16 participants
Task: rate (1-5) the expertise level of selected Twitter users (with high and low expertise) for the topic "semantic web"
3 conditions under which the user accounts were presented to subjects:
Condition 1: Tweets, Retweets, List, Bio
Condition 2: Only tweets and retweets are shown
Condition 3: Only list and bio are shown
For each condition and expertise level we have 4 Twitter pages (4 replicates)
4 * 3 * 2 = 24 pages to rate per subject
User Study: Expertise Judgments of Humans
2-way ANOVA; within-subject variables: Twitter user expertise (high/low) and the 3 conditions
Interaction between conditions and Twitter user expertise is significant (F(2) = 8.326, p < 0.01)
Post-hoc test shows that users' ability to correctly judge the expertise of Twitter users differs significantly under conditions 1 and 2 and under conditions 2 and 3
[Chart: mean rating per Twitter user (low vs. high expertise) under condition 1 (tweets, bio and lists), condition 2 (only tweets) and condition 3 (only bio and lists)]
9 Research Questions
How useful are different types of user-related data for humans to inform their expertise judgments of Twitter users?
How useful are different types of user-related data for learning computational expertise models of users?
10 Dataset
10 topics: semanticweb, biking, wine, democrat, republican, medicine, surfing, dogs, nutrition and diabetes
We use Wefollow directories as a manually created proxy ground truth for expertise
Top 150 users per Wefollow directory
Excluded users who are in more than one of the 10 directories and users who mainly tweet non-English
11 Dataset
1145 users
Most recent 1000 tweets and retweets
Most recent 300 user lists
Bio info
Information on Twitter is sparse:
Extend URLs in tweets, RTs and bio
Use list names as search query terms
Use top 5 search query result snippets obtained from Yahoo BOSS to enrich list information
Computational Expertise Models: Methodology
Learn latent semantic structures (topics) from Twitter communication by fitting an LDA model
[Figure: top 20 stemmed words of 3 randomly selected topics (T1, T2, T3) learned by an LDA model with T=50]
Computational Expertise Models: Methodology
Associate users with topics by using statistical inference based on different types of user-related data, yielding the user's topical expertise profile
[Figure: bio, lists, tweets and RTs each inferring a topic distribution over topics T1, T2, T3]
15 Types of User Lists
Manual inspection of user lists: selected 10 users at random and inspected their user list memberships (455 user lists)
We found 3 main classes of user lists:
Personal judgments (e.g., "great people", "geeks")
Personal relationships (e.g., "my family", "colleagues")
Topical lists (e.g., "science", "researcher", "healthcare")
16 Value of User Lists
3 human raters judged whether a list (label and/or description) belongs to the class of topical lists
77.67% of user lists were topical lists
Inter-rater agreement: Kappa = 0.62
17 Quantify the Value of Lists/Bio/Tweets/RTs
Which type of information best reflects the topical expertise of a user?
Information-theoretic evaluation: which type of topic distribution best reflects the underlying category information of the user? Normalized Mutual Information (NMI) between users' topic distributions and users' Wefollow directories
Task-based evaluation: which types of topic distributions are most useful for classifying users into their Wefollow directories? F1-score of classification models
Task-based Evaluation of Computational Expertise Models
Compare topic distributions inferred via different types of user-related data within a classification task
Objective: classify users into Wefollow directories by using topic distributions as features
Classification task:
Train a Partial Least Squares classifier with topic distributions inferred via different types of user-related data as features
Perform 5-fold cross-validation
Use the F-measure (harmonic mean of precision and recall) to compare classifiers' performance
Task-based Evaluation of Computational Expertise Models
[Figure: F-measure per Wefollow directory (biking, democrat, diabetes, dogs, medicine, nutrition, republican, semanticweb, surfing, wine) for classifiers trained with bio-, list-, tweet- and retweet-based topic distributions]
Task-based Evaluation of Computational Expertise Models
[Figure: F-measure as a function of the number of topics (T=10 to T=700) for classifiers trained with bio-, list-, tweet- and retweet-based topic distributions]
Task-based Evaluation of Computational Expertise Models
[Figure: confusion matrices at T=300 for list- and tweet-based features over the 10 Wefollow directories; x-axis shows reference values, y-axis shows predictions]
Conclusions
Different types of user-related data lead to different topic annotations:
List-based topic annotations are most distinct from all others
Bio-, tweet- and retweet-based topic annotations are quite similar
For creating topical expertise profiles of users, information about their list memberships is most useful
For informing humans' expertise judgments about Twitter users, contextual information (user's bio and list memberships) is most useful
24 Implications & Limitations
User interface: make user lists and bio information more prominent
Incentives for people to use lists more heavily, e.g. provide weekly list summaries
Search and recommender systems could benefit from exploiting user list information
Results are biased towards users with high Wefollow rank
Bio and user lists are useful for judging topical expertise
THANK YOU
firstname.lastname@example.org
http://claudiawagner.info
Image src: http://adobeairstream.com/green/a-natural-predicament-sustainability-in-the-21st-century/