Analyzing Cross-System User Modeling on the Social Web


  • Observations: Even though Flickr profiles are large, their entropy is rather low. The entropy of aggregated profiles is the highest.
  • Source: which tags do we put into the profile? Semantic enrichment: do we do anything further with the tags already selected for the profile (here: do we add further tags)? Weighting scheme: how do we weight the tags? In the paper we compare two dimensions: (i) the type of weighting (-> TF vs. TFxIDF) and (ii) where we count, i.e. whether we take the TF statistics from (a) the personal profile in the foreign system or (b) the “global statistics” of the target system. In this talk we look only at (ii): we use TF and do not report on TFxIDF.
  • Here, we do “semantic enrichment” based on “Tag-similarity” (see slide 15: User Modeling Building Blocks)
  • Here, we do “semantic enrichment” based on “cross-system rules” (see slide 15: User Modeling Building Blocks)
  • Characteristics: overlap is small; still, one gets significantly more information. Performance: cross-system UM leads to very high improvements for cold-start recommendations (some personal information is better than nothing). To optimize, we need to know the characteristics of the system. We can be stupid and simply aggregate what we can get -> this is fine, as we will get improvements anyhow; but we can massively optimize if we carefully select the different building blocks of the cross-system UM strategy with respect to the given application (e.g. recommending bookmarks in Delicious -> select tags from the personal Twitter profile, but weight them according to the global Delicious tag frequencies).

    1. 1. Analyzing Cross-System User Modeling on the Social Web
        ICWE, Cyprus, June 22, 2011
        Fabian Abel, Samur Araujo, Qi Gao, Geert-Jan Houben
        Web Information Systems, TU Delft
    2. 2. What we do: Science and Engineering for the Personal Web
        Domains: news, social media, cultural heritage, public data, e-learning
        Personalized Recommendations, Personalized Search, Adaptive Systems
        Analysis and User Modeling
        Semantic Enrichment, Linkage and Alignment
        user/usage data
        Social Web
    3. 3. Pitfalls of User-adaptive Systems
        “Hi, I’m your new user. Give me personalization!” / “Hi, I have a new-user problem!”
        “Hi, I’m back and I have new interests.” / “Hi, I don’t know that your interests changed!”
        How can we tackle these problems?
        Diagram labels: System A, System B, System C, System D; profile; time
    4. 4. Cross-system user modeling on the Social Web
        User data on the Social Web
    5. 5. Interweaving public user data with Mypes
        Input: Google Profile URI
        1. get other accounts of user (Account Mapping, SocialGraph API)
        2. aggregate public profile data (Social Web Aggregator): blog posts, bookmarks, other media, social networking profiles
        3. map profiles to target user model (Profile Alignment; FOAF, vCard)
        4. enrich data with semantics (Semantic Enhancement; WordNet®)
        5. generate user profiles (Analysis and user modeling)
        Output: aggregated, enriched profile (e.g., in RDF or vCard)
    6. 6. In this paper: User Modeling across Twitter, Flickr and Delicious
        Twitter and Delicious: 1500 users, 80k + 620k TAS (tag assignments)
        Flickr and Delicious: 1467 users, 890k + 680k TAS
        Example user Bob: tags such as travel, google IO, web, socialmedia, identity; tweet “This is #interesting: #web”
        Diagram labels: Twitter, Delicious, Flickr
    7. 7. Tag-based user profiles
        Tag-based profile of a user u = set of weighted tags: P(u) = {(t, w(u, t))}
        t: tag of interest
        w(u, t): weight that indicates to what degree the user is interested in t
        Lightweight weighting scheme: count how often the user applied the tag
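The lightweight weighting scheme on this slide (count how often the user applied the tag) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `build_tag_profile`, the optional normalization and the sample tags are assumptions made for this example.

```python
from collections import Counter


def build_tag_profile(tag_assignments, normalize=False):
    """Build a tag-based profile: a mapping tag -> weight.

    tag_assignments: iterable of tags the user applied (one entry per use).
    The weight is the usage count; with normalize=True (an assumption of
    this sketch) the counts are scaled so that the weights sum to 1.
    """
    counts = Counter(tag_assignments)
    if not normalize:
        return dict(counts)
    total = sum(counts.values())
    return {tag: count / total for tag, count in counts.items()}


# Hypothetical tag assignments of one user in one system.
bob_profile = build_tag_profile(["web", "travel", "web", "socialmedia"])
print(bob_profile)
```

Keeping raw counts makes profiles from different systems easy to aggregate by simple addition; normalizing is only needed when profiles of very different sizes are compared.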
    8. 8. Characteristics of tag-based profiles
    9. 9. Characteristics of tag-based profiles
        What are the characteristics of the individual tag-based profiles in Twitter, Flickr and Delicious?
        How do the tag-based profiles of individual users overlap between the different systems?
    10. 10. Size of tag-based profiles
        (plot; series: Delicious, Flickr, Twitter)
    11. 11. Overlap of tag-based profiles
        Overlap of tag-based profiles is less than 10% for more than 90% of the users
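The slides do not spell out how overlap is measured. As a hedged sketch, one plausible reading is the share of distinct tags in a user's profile in one system that also occur in the same user's profile in the other system. The function name `tag_overlap` and this exact definition are assumptions of the example.

```python
def tag_overlap(tags_a, tags_b):
    """Share of distinct tags from profile A that also occur in profile B.

    Assumed definition for illustration: |A ∩ B| / |A|, computed on the
    sets of distinct tags. Returns 0.0 for an empty profile A.
    """
    a, b = set(tags_a), set(tags_b)
    if not a:
        return 0.0
    return len(a & b) / len(a)


# Hypothetical profiles of the same user in two systems.
delicious_tags = ["web", "travel", "java", "blog"]
flickr_tags = ["web", "nikon"]
print(tag_overlap(delicious_tags, flickr_tags))
```

Note that this measure is asymmetric: the overlap of A with B generally differs from the overlap of B with A when the profiles have different sizes.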
    12. 12. Entropy of Tag-based profiles
        entropy(Tu) = - sum over t in Tu of p(t) * log2 p(t)
        where:
        - p(t) = probability that t occurs in Tu
        - Tu = tags in user profile P(u)
        (plot; series: Twitter, Flickr, Delicious, Twitter & Delicious, Flickr & Delicious)
        With respect to entropy, aggregated profiles reveal significantly more information than the service-specific profiles.
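As a rough illustration of the entropy measure, the sketch below computes the Shannon entropy of a profile from a list of tag assignments. It assumes that p(t) is estimated as the relative frequency of t among the user's tag assignments and that the logarithm is base 2; the function name `profile_entropy` and the sample data are hypothetical.

```python
import math
from collections import Counter


def profile_entropy(tag_assignments):
    """Shannon entropy (in bits) of a tag-based profile.

    p(t) is estimated as count(t) / total number of tag assignments,
    which is an assumption of this sketch.
    """
    counts = Counter(tag_assignments)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical service-specific profiles and their aggregation.
twitter_tags = ["web", "web", "web", "identity"]
delicious_tags = ["java", "blog", "travel", "web"]
print(profile_entropy(twitter_tags))
print(profile_entropy(delicious_tags))
print(profile_entropy(twitter_tags + delicious_tags))
```

A profile dominated by a single tag has entropy close to 0, while a profile that spreads its assignments evenly over many tags has high entropy.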
    13. 13. Observations
        Profile size varies from system to system (e.g. tag-based Twitter profiles are rather sparse)
        Tag-based profiles of an individual user overlap only slightly (e.g. overlap is less than 10% for more than 90% of the users)
        Entropy of tag-based profiles: Twitter < Flickr < Delicious < aggregated profiles
    14. 14. Cross-System User Modeling for Cold-start recommendations
    15. 15. Evaluation: Recommending tags / bookmarks
        How does cross-system user modeling impact the recommendation quality (in cold-start situations)?
        Setting: a new user arrives at Delicious (“Hi, I’m your new user. Give me personalization!”) and no profile is available there
        Pipeline: cross-system user modeling -> profile -> cosine-based recommender -> tags to explore / Web sites to bookmark
        Ground truth: actual tags and bookmarks of the user (leave-n-out evaluation)
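The slides name a cosine-based recommender without giving details. The following is a minimal sketch under the assumption that the user profile and each candidate item are represented as tag-weight mappings and that candidates are ranked by cosine similarity to the profile. The function names and the example candidates are hypothetical.

```python
import math


def cosine(p, q):
    """Cosine similarity between two sparse tag-weight mappings."""
    dot = sum(weight * q.get(tag, 0.0) for tag, weight in p.items())
    norm_p = math.sqrt(sum(w * w for w in p.values()))
    norm_q = math.sqrt(sum(w * w for w in q.values()))
    if norm_p == 0 or norm_q == 0:
        return 0.0
    return dot / (norm_p * norm_q)


def recommend(profile, candidates, k=10):
    """Return the k candidate names most similar to the profile.

    candidates: mapping name -> tag-weight mapping (assumed representation).
    """
    ranked = sorted(candidates,
                    key=lambda name: cosine(profile, candidates[name]),
                    reverse=True)
    return ranked[:k]


# Hypothetical profile built from a foreign system, and candidate bookmarks.
profile = {"web": 2, "java": 1}
candidates = {
    "a.example": {"web": 1},
    "b.example": {"nikon": 1},
    "c.example": {"java": 1, "web": 1},
}
print(recommend(profile, candidates, k=2))
```

In a leave-n-out evaluation, the returned ranking would be compared against the tags or bookmarks that were held out for that user.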
    16. 16. User Modeling Building Blocks
        1. Source: Which tags should be contained in the profile?
        2. Semantic Enrichment: Further enrich/align tags?
        3. Weighting Scheme: How to weight the tags?
        Diagram labels: System A, System B; analyze, enrich, weight; example profile with tags t1, t2, t3, t4, t5 and weights 0.1, 0.1, 0.5, 0.2, 0.1
    17. 17. User Modeling Building Blocks (in this talk)
        Source:
        - Personal tags from foreign system (personal profile)
        - Popular tags from target system (popular profile)
        Semantic Enrichment:
        - Enrich tags with similar tags (based on Jaro-Winkler similarity)
        - Cross-system rules: if tag A was used in foreign system then add tag B
        Weighting scheme:
        - Personal usage frequency in foreign system
        - Global usage frequency in target system (requires profile to compute recommendations)
        Examples:
        a) simJaro(blog, blogs) is high
        b) Cross-system rule: blog (foreign) -> nikon (target)
        Example tags in the diagram (Foreign / Target): web, blog, java, blogs, france
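The three building blocks can be combined as in the sketch below. It is illustrative only: the function names, the similarity threshold, the example rule and the frequency numbers are all assumptions, and the standard library's `difflib.SequenceMatcher` ratio is used as a stand-in for Jaro-Winkler similarity, which Python does not ship.

```python
from collections import Counter
from difflib import SequenceMatcher


def enrich_by_similarity(tags, target_vocabulary, threshold=0.85):
    """Add target-system tags that are string-similar to a profile tag.

    Uses difflib's ratio as a stand-in for Jaro-Winkler (an assumption).
    """
    enriched = set(tags)
    for tag in tags:
        for candidate in target_vocabulary:
            if SequenceMatcher(None, tag, candidate).ratio() >= threshold:
                enriched.add(candidate)
    return enriched


def enrich_by_rules(tags, rules):
    """Apply cross-system rules: if tag A was used, add the mapped tags B."""
    enriched = set(tags)
    for tag in tags:
        enriched.update(rules.get(tag, ()))
    return enriched


def weight_tags(tags, frequencies):
    """Weight each tag by a frequency table (personal or global)."""
    return {tag: frequencies.get(tag, 0) for tag in tags}


# Source: personal tags from the foreign system (hypothetical data).
foreign_assignments = ["web", "web", "blog", "java"]
personal_freq = Counter(foreign_assignments)
# Hypothetical global tag frequencies in the target system.
global_freq = {"web": 300, "blogs": 120, "france": 40}

tags = enrich_by_similarity(set(foreign_assignments), global_freq.keys())
tags = enrich_by_rules(tags, {"java": ["programming"]})  # hypothetical rule

print(weight_tags(tags, personal_freq))  # personal weighting
print(weight_tags(tags, global_freq))    # global weighting
```

Swapping the frequency table passed to `weight_tags` is all it takes to move between the personal and the global weighting scheme, which is the dimension this talk compares.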
    18. 18. Cross-System User Modeling for Cold-start recommendations
        Which user modeling strategy performs best in which context?
        How do the different building blocks of the user modeling strategies (e.g. source of user data) influence the quality of the tag-based profiles?
    19. 19. Tag recommendations: Twitter / Delicious
        As you can easily see… :-)
    20. 20. Tag recommendations: Twitter -> Delicious
        Significant improvements regarding all metrics!
        Improvement regarding P@10, but “global Delicious trend” performs better regarding MRR & S@1.
        Cross-system strategies lead to significant improvement (impact of semantic enrichment is rather low)
        Diagram labels: baseline; cross-system user modeling; source: popular or personal; weights: personal or global (global tag frequencies); enrichment: similarity; user’s tags; user profile
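The metrics named on this slide can be computed per user as sketched below, using their standard definitions: P@k is the fraction of the top k recommendations that are relevant, the reciprocal rank is 1 divided by the rank of the first relevant item (MRR is its mean over users), and S@k is 1 if at least one relevant item appears in the top k. The function names and sample data are assumptions of this example.

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k recommended items that are relevant."""
    return sum(1 for item in ranked[:k] if item in relevant) / k


def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant item, or 0.0 if none is found."""
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def success_at_k(ranked, relevant, k):
    """1.0 if any of the top-k items is relevant, else 0.0."""
    return 1.0 if any(item in relevant for item in ranked[:k]) else 0.0


def mean(values):
    """Average over users; MRR is mean() of the reciprocal ranks."""
    values = list(values)
    return sum(values) / len(values) if values else 0.0


# Hypothetical ranking and held-out ground truth for one user.
ranked = ["web", "java", "blog", "nikon"]
relevant = {"java", "nikon"}
print(precision_at_k(ranked, relevant, 2))
print(reciprocal_rank(ranked, relevant))
print(success_at_k(ranked, relevant, 1))
```

MRR and S@1 reward getting a single relevant item to the very top, whereas P@10 rewards filling the whole top-10 list with relevant items, which is why a strategy can do well on one and worse on the other.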
    21. 21. Tag recommendations: Delicious -> Twitter
        Semantic enrichment (cross-system rules) allows for significant improvement regarding P@10
        Significant improvements regarding all metrics!
        Tag-based profile information from Delicious seems to be more valuable than hashtag-based Twitter profiles
        Diagram labels: baseline; cross-system user modeling; source: popular or personal; weights: personal or global; enrichment: cross rules; user’s tags and tag frequencies (weights); user profile
    22. 22. Tag Recommendations: different settings
        Cross-system user modeling allows for cold-start tag recommendations in Delicious: Twitter profiles are more appropriate than Flickr profiles.
        Cross-system user modeling is also beneficial for cold-start tag recommendations in Flickr.
        Cross-system user modeling has significant impact on the recommendation performance
        To optimize the performance one has to adapt to the given application setting
    23. 23. Bookmark Recommendations
        Cross-system user modeling also achieves significant improvements for cold-start bookmark recommendations
        Twitter is again a more appropriate source than Flickr
        (chart; series: baseline, Cross UM)
    24. 24. Conclusions
        Characteristics of distributed tag-based profiles:
        - Overlap of tag-based profiles, which an individual user creates at different services, is low
        - Aggregated profiles reveal significantly more information (regarding entropy) than service-specific profiles
        Performance of cross-system user modeling for cold-start recommendations:
        - Cross-system UM leads to tremendous (and significant) improvements of the tag and bookmark recommendation quality
        - To optimize the performance one has to adapt the cross-system strategies to the concrete application setting
    25. 25. Thank you!
        Fabian Abel, Qi Gao, Geert-Jan Houben, Ke Tao
        Datasets:
        Twitter: @persweb