
Social Recommender Systems Tutorial - WWW 2011



  1. 1. Social Recommender Systems Ido Guy, David Carmel IBM Research-Haifa, Israel WWW 2011, March 28th - April 1st, Hyderabad, India
  2. 2. Outline <ul><li>Introduction </li></ul><ul><li>Fundamental Recommendation Approaches </li></ul><ul><li>Entity Recommendation: Content, Tags, People, Communities </li></ul><ul><li>Recommendation for Groups </li></ul><ul><li>The Cold Start Problem </li></ul><ul><li>Trust </li></ul><ul><li>Social Recommender Systems in the Enterprise </li></ul><ul><li>Temporal Aspects in Social Recommendation </li></ul><ul><li>Social Recommendation over Activity Streams </li></ul><ul><li>Evaluation Methods </li></ul><ul><li>Summary of Open Challenges </li></ul>
  3. 3. Introduction IBM Research WWW 2011, March 28th-April 1st, Hyderabad, India
  4. 4. Web 2.0 and Social Media <ul><li>Web 2.0 [O’Reilly03] </li></ul><ul><ul><li>It’s all about people </li></ul></ul><ul><ul><ul><li>Joining online communities </li></ul></ul></ul><ul><ul><ul><li>Connecting to each other on social networks </li></ul></ul></ul><ul><ul><ul><li>Creating content as in wikis and blogs </li></ul></ul></ul><ul><ul><ul><li>Annotating content with tags, comments, ratings </li></ul></ul></ul><ul><li>Social Media </li></ul><ul><ul><li>Refers to Web 2.0 sites that allow users to share and interact </li></ul></ul><ul><ul><li>Characterized by </li></ul></ul><ul><ul><ul><li>User-centered design </li></ul></ul></ul><ul><ul><ul><li>User-generated content (e.g., tags) </li></ul></ul></ul><ul><ul><ul><li>Social networks and online communities </li></ul></ul></ul>
  5. 5. Top 10 Websites (3/2011) <ul><li>Google </li></ul><ul><li>Facebook </li></ul><ul><li>YouTube </li></ul><ul><li>Yahoo! </li></ul><ul><li>Windows Live </li></ul><ul><li> </li></ul><ul><li>Baidu </li></ul><ul><li>Wikipedia </li></ul><ul><li>Twitter </li></ul><ul><li> </li></ul>
  6. 6. Social Overload <ul><li>Facebook – largest social network site </li></ul><ul><ul><li>600,000,000 users, half log in every day </li></ul></ul><ul><ul><li>35,000,000,000 online “friendships” </li></ul></ul><ul><ul><li>900,000,000 objects people interact with </li></ul></ul><ul><ul><li>30,000,000,000 shared content items / month </li></ul></ul><ul><li>YouTube – largest video sharing site </li></ul><ul><ul><li>2,000,000,000 views per day </li></ul></ul><ul><ul><li>1,000,000 video hours uploaded per month </li></ul></ul><ul><li>Twitter – largest microblogging site </li></ul><ul><ul><li>200,000,000 users per month </li></ul></ul><ul><ul><li>65,000,000 tweets per day (750 per second) </li></ul></ul><ul><ul><li>8,000,000 followers of most popular user </li></ul></ul>
  7. 7. Social Overload <ul><li>Information overload – blogs, microblogs, forums, wikis, news, bookmarked web pages, photos, videos, … </li></ul><ul><li>Interaction overload – friends, followers, followees, commenters, co-members, voters, “likers”, taggers, … </li></ul>
  8. 8. Social Recommender Systems <ul><li>Recommender Systems that target the social media domain </li></ul><ul><li>Aim at coping with the challenge of social overload by presenting the most attractive and relevant content </li></ul><ul><li>Also aim at increasing adoption and engagement </li></ul><ul><li>Often apply personalization techniques </li></ul>
  9. 9. Recommender Systems and Social Media <ul><li>Recommender systems are an augmentation of the social process in which we rely on advice or suggestions from other people [Resnick97] </li></ul><ul><li>Any CF system has social characteristics; however, here we focus on the social media domain </li></ul><ul><li>Social media and recommender systems can mutually benefit each other: </li></ul><ul><ul><li>Social media introduces new types of data and metadata that can be leveraged by RS (tags, comments, votes, explicit social relationships) </li></ul></ul><ul><ul><li>RS can significantly impact the success of social media by ensuring each user is presented with the most relevant items that suit her personal needs </li></ul></ul>
  10. 10. Real-World Examples
  15. 15. Real-World Examples <ul><li>Want to know what movie you should see this weekend? Ask Facebook . What should you eat for dinner tonight? Yelp knows! Should you wear the pink sweater or the blue one? You don’t have to decide – you can just ask Twitter ! </li></ul><ul><li>“The Culture of Recommendation: Has It Gone Too Far?” Susan Murphy at SuzeMuse Blog, Feb 22nd, 2011 </li></ul>
  16. 16. Fundamental Recommendation Approaches
  17. 17. Fundamental Recommendation Approaches <ul><li>Collaborative filtering-based recommendation </li></ul><ul><ul><li>Aggregate ratings of objects from users and generate recommendations based on inter-user similarity </li></ul></ul><ul><li>Demographic recommendation </li></ul><ul><ul><li>Categorize users based on personal attributes (age, gender, income, ...) and make recommendations based on demographic classes </li></ul></ul><ul><li>Content-based recommendation </li></ul><ul><ul><li>A user profile is constructed based on the features of the items the user has rated/consumed. This profile is used to identify new interesting items for the user (that match his profile) </li></ul></ul><ul><li>Knowledge-based recommendation </li></ul><ul><ul><li>Compute the utility of each item with respect to the user’s needs </li></ul></ul><ul><li>Hybrid methods </li></ul><ul><ul><li>Combine several approaches together </li></ul></ul>
  18. 18. Recommendation techniques (Burke, 2002) <ul><li>CF – Background: user-item matrix; Input: ratings from u to items; Process: identify similar users, extrapolate from their ratings </li></ul><ul><li>Demographic – Background: demographic information about users; Input: demographic information about u; Process: identify similar users, extrapolate from their ratings </li></ul><ul><li>Content-based – Background: features of items; Input: ratings from u to items; Process: generate a classifier based on u’s ratings, use it to classify new items </li></ul><ul><li>Knowledge-based – Background: features of items; Input: user needs; Process: infer a match between items and u’s needs </li></ul>
  19. 19. Collaborative Filtering <ul><li>In the real world we seek advice from people we trust (friends, colleagues, experts) </li></ul><ul><li>CF automates this “word-of-mouth” process: </li></ul><ul><ul><li>Weight all users with respect to their similarity to the active user </li></ul></ul><ul><ul><li>Select a subset of the users ( neighbors ) to use as recommenders </li></ul></ul><ul><ul><li>Predict the rating of the active user for specific items </li></ul></ul><ul><ul><ul><li>based on the neighbors’ ratings </li></ul></ul></ul><ul><ul><li>Recommend the items with maximum predicted rating </li></ul></ul>
  20. 20. User-based CF algorithm <ul><li>The User x Item matrix: </li></ul><ul><ul><li>Alice: Shrek=Like, Snow-white=Like, Superman=Dislike </li></ul></ul><ul><ul><li>Bob: Shrek=?, Snow-white=Dislike, Superman=Like </li></ul></ul><ul><ul><li>Chris: Shrek=Like, Snow-white=Like, Superman=Dislike </li></ul></ul><ul><ul><li>Jon: Shrek=Like, Snow-white=Like, Superman=? </li></ul></ul><ul><li>Shall we recommend Superman to Jon? </li></ul><ul><li>Jon’s taste is similar to both Chris’s and Alice’s tastes => Do not recommend Superman to Jon </li></ul>
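The neighbor-based prediction can be run on the slide's toy matrix. A hedged sketch, with Like/Dislike encoded as +1/-1 and cosine similarity over co-rated items (the encoding and helper names are illustrative, not from the tutorial):

```python
# Toy user-based CF on the slide's Like/Dislike matrix (Like=1, Dislike=-1).
import math

ratings = {
    "Alice": {"Shrek": 1, "Snow-white": 1, "Superman": -1},
    "Bob":   {"Snow-white": -1, "Superman": 1},
    "Chris": {"Shrek": 1, "Snow-white": 1, "Superman": -1},
    "Jon":   {"Shrek": 1, "Snow-white": 1},
}

def cosine_sim(u, v):
    """Cosine similarity between two users over co-rated items only."""
    common = set(ratings[u]) & set(ratings[v])
    if not common:
        return 0.0
    dot = sum(ratings[u][i] * ratings[v][i] for i in common)
    nu = math.sqrt(sum(ratings[u][i] ** 2 for i in common))
    nv = math.sqrt(sum(ratings[v][i] ** 2 for i in common))
    return dot / (nu * nv)

def predict(user, item):
    """Similarity-weighted average of the neighbors' ratings for the item."""
    num = den = 0.0
    for other, r in ratings.items():
        if other == user or item not in r:
            continue
        w = cosine_sim(user, other)
        num += w * r[item]
        den += abs(w)
    return num / den if den else 0.0

score = predict("Jon", "Superman")
print(score)  # negative: Jon's closest neighbors (Alice, Chris) dislike Superman
```

Running this reproduces the slide's conclusion: the prediction for (Jon, Superman) comes out negative, so Superman is not recommended.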
  21. 21. User-based CF algorithm (cont.) <ul><li>Voting: v_ij corresponds to the vote of user i for item j </li></ul><ul><li>I_i is the set of items on which user i voted </li></ul><ul><li>w(i,k) – the similarity between u_i and u_k </li></ul><ul><li>κ – a normalization factor </li></ul><ul><li>Predictive vote for the active user </li></ul><ul><li>The mean vote for user i </li></ul>
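The formulas themselves were images on the slide and did not survive extraction. A standard reconstruction in the memory-based CF notation of Breese et al., matching the symbols defined above (mean vote, then the predictive vote for active user a):

```latex
\bar{v}_i = \frac{1}{|I_i|} \sum_{j \in I_i} v_{ij}
\qquad
p_{aj} = \bar{v}_a + \kappa \sum_{i=1}^{n} w(a,i)\,\bigl(v_{ij} - \bar{v}_i\bigr)
```

Here κ normalizes the similarity weights so that the correction term stays on the rating scale.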
  22. 22. Cosine-based similarity between users; Pearson correlation-based similarity between users
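The two similarity formulas were images on the original slide. A minimal sketch of both measures (the toy rating vectors are illustrative; vectors are assumed to be restricted to co-rated items by the caller):

```python
# Cosine similarity and Pearson correlation between two users' rating vectors.
import numpy as np

def cosine_similarity(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def pearson_similarity(u, v):
    # Pearson = cosine of the mean-centered vectors.
    uc, vc = u - u.mean(), v - v.mean()
    return float(np.dot(uc, vc) / (np.linalg.norm(uc) * np.linalg.norm(vc)))

a = np.array([5.0, 3.0, 4.0])
b = np.array([4.0, 2.0, 3.0])
print(cosine_similarity(a, b), pearson_similarity(a, b))
```

Note the difference: user b rates everything one point lower than user a, so Pearson (which centers each user's ratings) reports perfect agreement while plain cosine reports slightly less.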
  23. 23. CF - Practical challenges <ul><li>Ratings data is often sparse, and pairs of users with few co-ratings are prone to skewed correlations </li></ul><ul><li>Fails to incorporate agreement about an item in the population as a whole </li></ul><ul><ul><li>Agreement about a universally loved item is much less important than agreement for a controversial item </li></ul></ul><ul><ul><ul><li>Some algorithms account for global item agreement by including weights inversely proportional to an item’s popularity </li></ul></ul></ul><ul><li>Calculating a user’s perfect neighborhood is expensive – requiring comparison against all other users </li></ul><ul><ul><li>Sampling: a subset of users is selected prior to prediction computation </li></ul></ul><ul><ul><li>Clustering: can be used to quickly locate a user's neighbors </li></ul></ul>
  24. 24. Item-Based Nearest Neighbor Algorithms <ul><li>The transpose of the user-based algorithms </li></ul><ul><ul><li>generate predictions based on similarities between items </li></ul></ul><ul><ul><li>The prediction for an item is based on the user’s ratings for similar items </li></ul></ul><ul><li>Traverse the items rated by user i and average their ratings, weighted by their similarity to the predicted item </li></ul><ul><li>w(k,j) is a measure of item similarity – usually the cosine measure </li></ul><ul><li>Average correcting is not needed because the component ratings all come from the same target user </li></ul><ul><li>On the same User x Item matrix as in slide 20: Bob dislikes Snow-white (which is similar to Shrek) => do not recommend Shrek to Bob </li></ul>
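A hedged sketch of the item-based prediction on the same toy matrix (Like/Dislike as +1/-1; encoding and helper names are illustrative):

```python
# Item-based prediction: score an item for a user as the similarity-weighted
# average of that user's own ratings on similar items.
import math

def cosine(col_a, col_b):
    """Cosine similarity between two item rating columns {user: rating}."""
    common = set(col_a) & set(col_b)
    if not common:
        return 0.0
    dot = sum(col_a[u] * col_b[u] for u in common)
    na = math.sqrt(sum(col_a[u] ** 2 for u in common))
    nb = math.sqrt(sum(col_b[u] ** 2 for u in common))
    return dot / (na * nb)

# Item columns from the slide's matrix (Like=1, Dislike=-1).
items = {
    "Shrek":      {"Alice": 1, "Chris": 1, "Jon": 1},
    "Snow-white": {"Alice": 1, "Bob": -1, "Chris": 1, "Jon": 1},
    "Superman":   {"Alice": -1, "Bob": 1, "Chris": -1},
}

def predict(user, target):
    num = den = 0.0
    for other, col in items.items():
        if other == target or user not in col:
            continue
        w = cosine(items[target], col)
        num += w * col[user]
        den += abs(w)
    return num / den if den else 0.0

print(predict("Bob", "Shrek"))  # negative: Bob dislikes Snow-white, similar to Shrek
```

As on the slide, the prediction is negative, so Shrek is not recommended to Bob.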
  25. 25. Dimensionality Reduction Algorithms <ul><li>Reduce domain complexity by mapping the item space to a smaller number of underlying “dimensions” </li></ul><ul><ul><li>represent the latent topics or tastes present in those items </li></ul></ul><ul><ul><li>for the most part, improve accuracy in predicting ratings </li></ul></ul><ul><ul><li>reduce run-time performance needs and lead to larger numbers of co-rated dimensions </li></ul></ul><ul><li>Popular techniques: Singular Value Decomposition and Principal Component Analysis </li></ul><ul><ul><li>Require an extremely expensive offline computation step to generate the latent dimensional space </li></ul></ul>
  26. 26. Singular Value Decomposition (SVD) <ul><li>Handy mathematical technique that has application to many problems </li></ul><ul><li>Given any m x n matrix R , find matrices U , Σ , and V such that </li></ul><ul><ul><li>R = U Σ V^T </li></ul></ul><ul><ul><li>U is m x r and orthonormal </li></ul></ul><ul><ul><li>Σ is r x r and diagonal </li></ul></ul><ul><ul><li>V is n x r and orthonormal </li></ul></ul><ul><li>Discard the smallest singular values to get R_k with k << r </li></ul><ul><ul><li>R_k is a rank-k approximation of R based on the k most important latent features </li></ul></ul><ul><ul><li>Recommendations will be given based on R_k </li></ul></ul>
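A minimal sketch of the truncation step with NumPy (the small ratings matrix and the choice k = 2 are made up for illustration):

```python
# Truncated SVD: factor R = U Σ V^T and keep only the k largest singular values.
import numpy as np

R = np.array([[5.0, 3.0, 0.0, 1.0],
              [4.0, 0.0, 0.0, 1.0],
              [1.0, 1.0, 0.0, 5.0],
              [1.0, 0.0, 0.0, 4.0]])

U, s, Vt = np.linalg.svd(R, full_matrices=False)  # s is sorted descending

k = 2  # number of latent dimensions to keep
R_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# R_k is the best rank-k approximation of R in the Frobenius norm.
print(np.round(np.linalg.norm(R - R_k), 3))
```

Predictions are then read off the reconstructed entries of R_k, including cells that were unrated in R.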
  27. 27. Hybrid recommendation methods <ul><li>Every recommendation approach has pros and cons </li></ul><ul><ul><li>e.g., CF & CB both suffer from the cold start problem </li></ul></ul><ul><ul><ul><li>but CF can recommend “outside the box” compared to content-based approaches </li></ul></ul></ul><ul><li>A hybrid recommender system combines two or more techniques to gain better performance with fewer drawbacks </li></ul><ul><li>Hybrid methods: </li></ul><ul><ul><li>Weighted: scores of several recommenders are combined together </li></ul></ul><ul><ul><li>Switching: switch between recommenders according to the current situation </li></ul></ul><ul><ul><li>Mixed: present recommendations coming from several recommenders </li></ul></ul><ul><ul><li>Cascade: one recommender refines the recommendations given by another </li></ul></ul>
  28. 28. References <ul><li>Recommender Systems: An Introduction. Jannach et al., 2011 </li></ul><ul><li>Recommender Systems Handbook. Ricci et al., 2010 </li></ul><ul><li>Hybrid Recommender Systems: Survey and Experiments. Burke, User Modeling and User-Adapted Interaction, 2002 </li></ul><ul><li>Toward the Next Generation of Recommender Systems: A Survey of the State-of-the-Art and Possible Extensions. Adomavicius et al., IEEE Transactions on Knowledge and Data Engineering, 2005 </li></ul>
  29. 30. Content Recommendation
  30. 30. Video Recommendations <ul><li>The YouTube Video Recommendation System [Davidson et al., RecSys ’10] </li></ul><ul><li>Goals </li></ul><ul><ul><li>recent and fresh </li></ul></ul><ul><ul><li>diverse </li></ul></ul><ul><ul><li>relevant to the user’s recent actions </li></ul></ul><ul><ul><li>help users understand why a video was recommended to them </li></ul></ul><ul><li>Based on the user’s personal activities (watched, favorited, liked) </li></ul><ul><li>Expanding through the co-visitation graph of videos </li></ul><ul><li>Ranking based on a variety of signals for relevance and diversity </li></ul>
  31. 31. Video Recommendations <ul><li>Calculating related videos (CB): </li></ul><ul><ul><li>Association rule mining (co-visitation count) </li></ul></ul><ul><ul><li>Additional issues: presentation bias, noisy watch data </li></ul></ul><ul><ul><li>More data sources: sequence and timestamp of video watches, video metadata </li></ul></ul><ul><li>Expansion through the related video graph </li></ul><ul><li>Ranking </li></ul><ul><ul><li>Video quality </li></ul></ul><ul><ul><li>User specificity (personalization) </li></ul></ul><ul><li>Topic diversification by removing very similar videos or ones yielded by the same seed video </li></ul><ul><li>Evaluation – A/B testing </li></ul><ul><ul><li>CTR, long CTR, session length, time to long watch </li></ul></ul><ul><ul><li>60% of all homepage video clicks are recommendations </li></ul></ul>
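The co-visitation counting behind the association rules can be sketched as follows. This is an illustrative sketch, not YouTube's implementation: the sessions are made up, and normalizing by the product of the videos' view counts is one plausible choice for the normalizer, stated as an assumption:

```python
# Count how often two videos co-occur in the same watch session, then
# normalize by each video's overall popularity.
from collections import Counter
from itertools import combinations

sessions = [
    ["v1", "v2", "v3"],
    ["v1", "v2"],
    ["v2", "v3"],
    ["v1", "v4"],
]

views = Counter(v for s in sessions for v in s)

co = Counter()
for s in sessions:
    for a, b in combinations(sorted(set(s)), 2):
        co[(a, b)] += 1

def relatedness(a, b):
    """Co-visitation count normalized by popularity (assumed normalizer)."""
    pair = tuple(sorted((a, b)))
    return co[pair] / (views[a] * views[b])

print(relatedness("v1", "v2"), relatedness("v1", "v4"))
```

The normalization dampens pairs that co-occur merely because both videos are popular, which is exactly the presentation-bias concern the slide mentions.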
  32. 32. News Recommendations <ul><li>Personalized news recommendation based on click behavior [Liu et al., IUI 2010] </li></ul><ul><li>News recommendation on the Google News website </li></ul><ul><li>Combined CB-CF approach: Rec(article) = IF(article) x CF(article) </li></ul><ul><li>IF is based on the topic of the article and two main factors: </li></ul><ul><ul><li>User’s own past clicks (reflecting the user’s genuine news interests) </li></ul></ul><ul><ul><li>General news trends based on click behavior from the general public </li></ul></ul><ul><ul><li>Combination of long-term and short-term behavior </li></ul></ul><ul><li>CF is based on clustering dynamic datasets </li></ul><ul><ul><li>MinHash – fuzzy clustering of users based on the proportional overlap between the sets of items they clicked </li></ul></ul><ul><ul><li>Shown to scale and retain quality </li></ul></ul>
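The MinHash idea can be sketched in a few lines: users whose clicked-item sets overlap heavily get matching minhash slots with probability equal to their Jaccard overlap. A hedged sketch with made-up click sets; using MD5 as the hash family is my assumption, not the paper's construction:

```python
# MinHash signatures over users' clicked-item sets.
import hashlib

def h(item, seed):
    """Deterministic hash of an item under a given seed (illustrative)."""
    return int(hashlib.md5(f"{seed}:{item}".encode()).hexdigest(), 16)

def minhash_signature(items, num_hashes=16):
    return tuple(min(h(i, seed) for i in items) for seed in range(num_hashes))

clicks_a = {"story1", "story2", "story3", "story4"}
clicks_b = {"story1", "story2", "story3", "story5"}  # heavy overlap with a
clicks_c = {"story8", "story9"}                      # no overlap with a

sig_a = minhash_signature(clicks_a)
sig_b = minhash_signature(clicks_b)
sig_c = minhash_signature(clicks_c)

# Fraction of matching slots estimates the Jaccard overlap of the click sets.
match_ab = sum(x == y for x, y in zip(sig_a, sig_b)) / len(sig_a)
match_ac = sum(x == y for x, y in zip(sig_a, sig_c)) / len(sig_a)
print(match_ab, match_ac)
```

Users are then bucketed by signature, so heavily overlapping users land in the same fuzzy cluster without pairwise comparison.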
  33. 33. News Recommendations <ul><li>Evaluation based on a live trial </li></ul><ul><li>Hybrid method shown to perform best </li></ul><ul><ul><li>31% CF over popularity; 38% hybrid over popularity </li></ul></ul><ul><li>Noticeable effect on return rate in the longer run (after a week of getting used to the feature) </li></ul><ul><li>No effect on overall stories read on the News homepage (maybe people only have a fixed amount of time) </li></ul>
  34. 34. News Recommendations <ul><li>Social network and social information filtering on Digg [Lerman, ICWSM ’07] </li></ul><ul><li>Digg – social news aggregator, allowing users to submit links to, vote on, and discuss news stories </li></ul><ul><li>Recommendations by Digg Friends </li></ul><ul><li>Live trial by tracking Digg users over time </li></ul><ul><ul><li>Users tend to like stories submitted by friends </li></ul></ul><ul><ul><li>Users tend to like stories their friends read and liked </li></ul></ul><ul><li>Need to control for user diversity to avoid a “tyranny of the minority”, where the lion’s share of front page stories comes from the most active users </li></ul><ul><li>In practice, Digg also uses the concept of “diggers like me” </li></ul>
  35. 35. Enhancing CF with Friends <ul><li>The user’s network of friends and people of interest becomes more accessible in the Web 2.0 era (Facebook, LinkedIn, Twitter, ...) </li></ul><ul><li>Such social relationships can be very effective for recommendation compared to traditional CF </li></ul><ul><ul><li>Recommendations come from people the user knows and thus can judge </li></ul></ul><ul><ul><li>Spares the need for explicit feedback such as ratings </li></ul></ul><ul><ul><li>Effective for new users </li></ul></ul><ul><li>Various works have shown the effectiveness of friend-based recommendation over CF, e.g.: </li></ul><ul><ul><li>Movie and book recommendation – Comparing Recommendations Made by Online Systems and Friends [Sinha & Swearingen, 2001] </li></ul></ul><ul><ul><li>Friends as trusted recommenders for movies [Golbeck, 2006] </li></ul></ul><ul><ul><li>Club recommendation within a German SNS – Collaborative Filtering vs. Social Filtering [Groh & Ehmig, GROUP 2007] </li></ul></ul>
  36. 36. Blog Recommendations <ul><li>Document representation and query expansion models for blog recommendation [Arguello et al., ICWSM ’08] </li></ul><ul><li>Personalized recommendation of blogs in response to a query </li></ul><ul><li>A blog is a collection of documents (blog entries) </li></ul><ul><li>The query represents an ongoing (multifaceted) interest in a topic </li></ul><ul><li>Two document models </li></ul><ul><ul><li>Large document model – based on the blog as a whole, a virtual concatenation of its respective entries </li></ul></ul><ul><ul><li>Smoothed small document model – each entry is a document, aggregation at the ranking level </li></ul></ul><ul><li>Query expansion using Wikipedia </li></ul><ul><li>Evaluation using the TrecBlog06 collection </li></ul><ul><ul><li>The two document models perform equally well; hybridization further improves </li></ul></ul><ul><ul><li>Query expansion shown to improve recommendation results </li></ul></ul><ul><li>Recommendation vs. Search </li></ul>
  37. 37. Mixed Social Media Item Recommendations <ul><li>Social network-based recommendations of blogs, bookmarks, and communities </li></ul><ul><li>Key distinction: </li></ul><ul><ul><li>Familiarity: co-authorship, org chart, direct connection or tagging, etc. </li></ul></ul><ul><ul><li>Similarity – co-usage of tags, co-bookmarking, co-membership, co-commenting </li></ul></ul><ul><li>Explanations – showing the “implicit recommender” and her relationship to the user and item </li></ul>Personalized Recommendation of Social Software Items based on Social Relations [Guy et al., RecSys ’09]
  38. 38. Mixed Social Media Item Recommendations <ul><li>Evaluation combines a user survey and a live system </li></ul><ul><li>Recommendations from familiar people are significantly more accurate than recommendations from similar people </li></ul><ul><ul><li>57% to 43% interest ratio </li></ul></ul><ul><li>Similar people yield more diverse, less expected items </li></ul><ul><li>Explanations have an instant effect increasing interest in recommended items </li></ul>
  39. 39. Mixed Social Media Item Recommendations <ul><li>Social media recommendation based on people and tags [Guy et al., SIGIR ’10] </li></ul><ul><li>Underlying social aggregation system – SaND </li></ul><ul><li>5 item types: blogs, bookmarks, communities, wikis, files </li></ul><ul><li>First comprehensive study to compare people-based and tag-based recommenders </li></ul><ul><li>Outgoing and incoming tags </li></ul>
  40. 40. Mixed Social Media Item Recommendations <ul><li>Direct tag evaluation </li></ul><ul><ul><li>65 participants rated 16 tags each </li></ul></ul><ul><ul><li>Hybrid tags most accurate </li></ul></ul><ul><ul><li>Incoming slightly outperforms outgoing </li></ul></ul><ul><ul><li>Indirect are least effective </li></ul></ul><ul><li>Large-scale user study </li></ul><ul><ul><li>412 participants rated 16 items each </li></ul></ul><ul><ul><li>All personalization methods outperform popularity </li></ul></ul><ul><ul><li>Tag-based significantly outperforms people-based in terms of accuracy </li></ul></ul><ul><ul><li>Yet has less diversity, more expected results, and less effective explanations </li></ul></ul><ul><ul><li>Hybrid combines the best of both worlds </li></ul></ul><ul><ul><li>Reaches a 70:30 interest ratio for the first 16 items </li></ul></ul>
  41. 41. Tag-based Movie Recommendations <ul><li>Tagommenders: connecting users to items through tags [Sen et al., WWW ’09] </li></ul><ul><li>Inspecting various ways to recommend items based on tags </li></ul><ul><ul><li>Movie ratings </li></ul></ul><ul><ul><li>Movie clicks </li></ul></ul><ul><ul><li>Tag applications </li></ul></ul><ul><ul><li>Tag searches </li></ul></ul><ul><li>Evaluation based on MovieLens (featuring tags) </li></ul><ul><ul><li>Algorithms based on item ratings performed best, followed by those based on implicit item signals, followed by those based on tag signals </li></ul></ul><ul><ul><li>A linear combination of all algorithms performed best </li></ul></ul><ul><ul><li>Tag-based algorithms outperformed state-of-the-art CF </li></ul></ul><ul><ul><ul><li>Leads to RS that leverage the item features users find most useful </li></ul></ul></ul>
  42. 42. Summary of Key Points <ul><li>Articulated social networks play an important role in CF for social media </li></ul><ul><ul><li>Enhance regular CF in various manners </li></ul></ul><ul><li>“Tagommenders” are highly effective for recommendation </li></ul><ul><ul><li>Outperform regular CF </li></ul></ul><ul><li>As in traditional RS, hybrid approaches (e.g., tags + social networks, short- + long-term interests) typically further improve </li></ul><ul><li>Many users => strong evaluation on live systems </li></ul><ul><li>Accuracy vs. serendipity tradeoff </li></ul><ul><ul><li>Accuracy alone is not enough – serendipity and diversity also play a key role [McNee et al., CHI ’06] </li></ul></ul>
  43. 43. References <ul><li>Arguello, J., Elsas, J., Callan, J., & Carbonell, J. Document representation and query expansion models for blog recommendation. Proc. ICWSM ’08. </li></ul><ul><li>Davidson, J., Liebald, B., Liu, J., et al. The YouTube video recommendation system. Proc. RecSys ’10, 293-296. </li></ul><ul><li>Groh, G., & Ehmig, C. Recommendations in taste related domains: collaborative filtering vs. social filtering. Proc. GROUP ’07, 127-136. </li></ul><ul><li>Golbeck, J. Generating predictive movie recommendations from trust in social networks. Proc. 4th Int. Conf. on Trust Management, Pisa, Italy. </li></ul><ul><li>Guy, I., Zwerdling, N., Carmel, D., et al. Personalized recommendation of social software items based on social relations. Proc. RecSys ’09, 53-60. </li></ul><ul><li>Guy, I., Zwerdling, N., Ronen, I., et al. Social media recommendation based on people and tags. Proc. SIGIR ’10, 194-201. </li></ul>
  44. 44. References <ul><li>Lerman, K. Social networks and social information filtering on Digg. Proc. ICWSM ’07. </li></ul><ul><li>Liu, J., Dolan, P., & Pedersen, E.R. Personalized news recommendation based on click behavior. Proc. IUI ’10, 31-40. </li></ul><ul><li>McNee, S.M., Riedl, J., & Konstan, J.A. 2006. Being accurate is not enough: how accuracy metrics have hurt recommender systems. Proc. CHI ’06, 1097-1101. </li></ul><ul><li>Sen, S., Vig, J., & Riedl, J. Tagommenders: connecting users to items through tags. Proc. WWW ’09, 671-680. </li></ul><ul><li>Sinha, R. & Swearingen, K. Comparing recommendations made by online systems and friends. 2001 DELOS-NSF Workshop on Personalization and Recommender Systems in Digital Libraries. </li></ul>
  45. 45. Tag Recommendation
  46. 46. Tag Recommendation <ul><li>Adding terms (tags) to objects by the public provides additional contextual and semantic information for various resources: </li></ul><ul><ul><li>Web pages (e.g. Delicious) </li></ul></ul><ul><ul><li>Academic publications (e.g. CiteULike) </li></ul></ul><ul><ul><li>Multimedia objects (e.g. Flickr, Last.fm, YouTube) </li></ul></ul><ul><li>External tags are useful for many applications </li></ul><ul><ul><li>search/browse, classification, tag-cloud representation, query expansion </li></ul></ul><ul><li>Tag recommendation: recommend appropriate tags for the user to apply when annotating a specific item </li></ul><ul><ul><li>assist the user in the tagging phase </li></ul></ul><ul><ul><li>reduce undesired noise in the aggregated folksonomy </li></ul></ul>
  47. 48. Tag Recommendation Approaches <ul><li>Popular: </li></ul><ul><ul><li>Recommend the most popular tags to the user </li></ul></ul><ul><ul><ul><li>Popular tags already assigned to the target item (Golder 2005) </li></ul></ul></ul><ul><ul><ul><li>Frequent tags previously used by the user </li></ul></ul></ul><ul><ul><ul><li>Tags that co-occur with already-assigned tags (Sigurbjörnsson 2008) </li></ul></ul></ul><ul><li>Collaborative Filtering: </li></ul><ul><ul><li>Recommend tags associated with “similar” items </li></ul></ul><ul><ul><li>Recommend tags given by “similar” users </li></ul></ul><ul><li>Hybrid: </li></ul><ul><ul><li>Recommend tags given by similar users to similar items (Symeonidis 08, Rendle 10, Carmel 10) </li></ul></ul>
  48. 49. Flickr’s tag recommender (Sigurbjörnsson, WWW 2008)
  49. 50. Content-based tag recommendation <ul><li>Recommend keywords/phrases from the item’s associated text (content, anchor text, metadata, etc.) </li></ul><ul><ul><li>e.g. terms with the highest tf-idf score </li></ul></ul><ul><li>Analyze the mutual relationship between content and tags </li></ul><ul><ul><li>Recommend tags that have the highest co-occurrence with important keywords </li></ul></ul><ul><ul><li>Language modeling approach (Givon 2010): </li></ul></ul><ul><ul><ul><li>Estimate the joint tag and keyword probability distribution. </li></ul></ul></ul><ul><ul><ul><li>This provides an estimate that a given item will be annotated with certain tags, given a background collection of annotated items </li></ul></ul></ul>
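The "terms with the highest tf-idf score" idea can be sketched as follows. The toy corpus and the smoothed-idf variant are assumptions for illustration:

```python
# Rank an item's own terms by tf-idf against a background collection
# and suggest the top-k as candidate tags.
import math
from collections import Counter

corpus = [
    "social recommender systems for social media",
    "collaborative filtering predicts ratings from similar users",
    "tag recommendation suggests tags for bookmarks and photos",
]

def idf(word, corpus):
    n = len(corpus)
    df = sum(1 for doc in corpus if word in doc.split())
    return math.log((n + 1) / (df + 1)) + 1  # smoothed idf (one common variant)

def tfidf_tags(doc, corpus, k=3):
    tf = Counter(doc.split())
    # Deterministic ordering: higher tf-idf first, alphabetical tie-break.
    ranked = sorted(tf, key=lambda w: (-tf[w] * idf(w, corpus), w))
    return ranked[:k]

print(tfidf_tags("tag tag recommendation for social bookmarks", corpus))
```

Frequent terms in the item that are rare in the background collection float to the top, while stop-word-like terms such as "for" are pushed down.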
  50. 51. Graph-based approaches <ul><li>The FolkRank algorithm (Hotho 2006): </li></ul><ul><ul><li>a resource which is tagged with important tags by important users becomes important </li></ul></ul><ul><ul><ul><li>The same holds, symmetrically, for tags and users </li></ul></ul></ul><ul><li>We have a graph of connected vertices (resources, users, tags) which mutually reinforce each other by spreading their weights </li></ul><ul><li>Graph nodes are scored by random-walk techniques, iterating w ← λAw + (1 − λ)p, where: </li></ul><ul><ul><li>w – a weight vector over the nodes </li></ul></ul><ul><ul><li>A – a row-stochastic matrix of the graph </li></ul></ul><ul><ul><li>p – a preference vector over the nodes </li></ul></ul><ul><li>For tag recommendation, return the top-ranked tags, while setting p to bias the desired pair of user and resource </li></ul>
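The weight-spreading iteration can be sketched as follows. A minimal sketch, not the FolkRank implementation: the tiny graph, the damping value 0.7, and the choice of bias nodes are all made up:

```python
# Preference-biased random-walk scoring over a small user/tag/resource graph.
import numpy as np

# Symmetric adjacency over 5 nodes (users, tags, and resources mixed together).
adj = np.array([[0, 1, 1, 0, 0],
                [1, 0, 1, 1, 0],
                [1, 1, 0, 0, 1],
                [0, 1, 0, 0, 1],
                [0, 0, 1, 1, 0]], dtype=float)

A = adj / adj.sum(axis=1, keepdims=True)  # row-stochastic

def spread_weights(A, p, lam=0.7, iters=100):
    """Iterate w <- lam * A^T w + (1 - lam) * p.

    For a row-stochastic A and a column weight vector w, A^T @ w spreads each
    node's weight to its neighbors while preserving the total weight.
    """
    w = np.full(A.shape[0], 1.0 / A.shape[0])
    for _ in range(iters):
        w = lam * (A.T @ w) + (1 - lam) * p
    return w

p = np.zeros(5)
p[[0, 2]] = 0.5  # bias toward the target user and resource
w = spread_weights(A, p)
print(w)
```

Nodes near the biased user/resource pair accumulate more weight, and the highest-scoring tag nodes become the recommendations.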
  51. 52. Tag Recommendations in Folksonomies (Jaeschke et al., PKDD ’07) <ul><li>For each user we pick one of his posts at random </li></ul><ul><li>The task of the different recommenders is to predict the user’s tags for this post, based on the rest of the folksonomy </li></ul><ul><li>We measure how many of those tags are covered by the top-k recommended tags </li></ul>
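The leave-one-post-out protocol above reduces to a simple recall-at-k check per held-out post (the recommended list and hidden tags below are hypothetical):

```python
# How many of the held-out post's tags appear in the top-k recommendations?
def recall_at_k(recommended, hidden_tags, k):
    topk = set(recommended[:k])
    return len(topk & set(hidden_tags)) / len(hidden_tags)

# Hypothetical recommender output for one held-out post.
recommended = ["python", "web", "tutorial", "java", "search"]
hidden_tags = ["web", "search", "nlp"]

print(recall_at_k(recommended, hidden_tags, 5))
```

Averaging this score over all users' held-out posts gives the coverage curve as a function of k.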
  52. 53. References <ul><li>The Structure of Collaborative Tagging Systems. Golder et al. Journal of Information Science, 2005 </li></ul><ul><li>Flickr tag recommendation based on collective knowledge. Sigurbjörnsson et al. WWW 2008 </li></ul><ul><li>Tag recommendations based on tensor dimensionality reduction. Symeonidis et al. RecSys 2008 </li></ul><ul><li>Pairwise interaction tensor factorization for personalized tag recommendation. Rendle et al. WSDM 2010 </li></ul><ul><li>Social bookmark weighting for search and recommendation. Carmel et al. VLDB Journal, 2010 </li></ul><ul><li>Large scale book annotation with social tags. Givon et al. ICWSM 2009 </li></ul><ul><li>FolkRank: A ranking algorithm for folksonomies. Hotho et al. FGIR 2006 </li></ul><ul><li>Tag recommendations in folksonomies. Jaeschke et al. PKDD 2007 </li></ul>
  53. 54. People Recommendation
  54. 55. Social Matching <ul><li>Social matching: a framework and research agenda [Terveen & McDonald, 2005] </li></ul><ul><li>Social matching systems = recommender systems that recommend people to each other </li></ul><ul><ul><li>Must reveal some amount of personal information </li></ul></ul><ul><ul><li>Privacy, trust, reputation, and interpersonal attraction have greater importance </li></ul></ul><ul><ul><li>Interaction overload vs. information overload </li></ul></ul>
  55. 56. Recommending People to Connect with <ul><li>Do You Know? Recommending People to Invite into Your Social Network [Guy et al., IUI ’09] </li></ul><ul><li>Recommendation in the enterprise based on the following signals: </li></ul><ul><ul><li>Org chart relationships </li></ul></ul><ul><ul><li>Paper and patent co-authorship </li></ul></ul><ul><ul><li>Project co-membership </li></ul></ul><ul><ul><li>Blog commenting </li></ul></ul><ul><ul><li>People tagging </li></ul></ul><ul><ul><li>Mutual connections </li></ul></ul><ul><ul><li>Connection in another SNS </li></ul></ul><ul><ul><li>Wiki co-editing </li></ul></ul><ul><ul><li>File sharing </li></ul></ul><ul><li>Rich and detailed “evidence” </li></ul>
  56. 57. Recommending People to Connect with <ul><li>Evaluation based on the Fringe enterprise SNS </li></ul><ul><li>Dramatic increase in the number of invitations sent and users sending invitations </li></ul><ul><ul><li>“I must say I am a lazy social networker, but Fringe was the first application motivating me to go ahead and send out some invitations to others to connect” </li></ul></ul><ul><li>Evidence increases users’ trust in the system and makes them feel more comfortable </li></ul><ul><ul><li>“If I see more direct connections I’m more likely to add them […] I know they are not recommended by accident” </li></ul></ul><ul><li>Substantial increase in friends per user </li></ul><ul><li>Sharp decay in usage over time </li></ul><ul><ul><li>Excitement drops, connections exhausted </li></ul></ul>
  57. 58. Recommending People to Connect with <ul><li>Make new friends, but keep the old: recommending people on social networking sites [Chen et al., CHI ’09] – 4 algorithms compared: </li></ul><ul><li>Content Matching (CM) </li></ul><ul><ul><li>Profile entries, status messages, photo text, shared lists, job title, location, description, tags </li></ul></ul><ul><ul><li>v_u(w_i) = TF_u(w_i) · IDF_u(w_i) </li></ul></ul><ul><ul><li>Cosine similarity of both users’ word vectors </li></ul></ul><ul><ul><li>Latent semantic analysis did not perform better </li></ul></ul><ul><ul><ul><li>And does not yield intuitive explanations </li></ul></ul></ul><ul><li>Content-plus-Link (CplusL) </li></ul><ul><ul><li>Hybrid CM + social link </li></ul></ul><ul><ul><li>Social link: a sequence of 3 or 4 users </li></ul></ul><ul><ul><ul><li>e.g., a connects to b, a comments on b, b connects to a </li></ul></ul></ul><ul><li>Friend-of-Friend (FoF) </li></ul><ul><ul><li>Based on number of mutual friends </li></ul></ul><ul><ul><li>One or more recommendations for 57.2% of the users </li></ul></ul><ul><li>Aggregated Relationships (SONAR) </li></ul><ul><ul><li>Similar to the “Do you know?” algorithm </li></ul></ul><ul><ul><li>One or more recommendations for 87.7% of the users </li></ul></ul>
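The FoF idea ("based on number of mutual friends") can be sketched as follows, on a made-up toy network:

```python
# Rank a user's non-friends by how many friends they share with the user.
friends = {
    "ann":  {"bob", "carl", "dana"},
    "bob":  {"ann", "carl", "eve"},
    "carl": {"ann", "bob", "eve"},
    "dana": {"ann"},
    "eve":  {"bob", "carl"},
}

def fof_candidates(user):
    scores = {}
    for other in friends:
        if other == user or other in friends[user]:
            continue  # skip self and existing friends
        mutual = friends[user] & friends[other]
        if mutual:
            scores[other] = len(mutual)
    return sorted(scores.items(), key=lambda kv: -kv[1])

print(fof_candidates("ann"))  # eve shares two friends (bob, carl) with ann
```

Since candidates must already share a friend with the user, FoF mostly surfaces people the user knows, which matches the study's finding that FoF recommendations are largely familiar.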
  58. 59. Recommending People to Connect with <ul><li>Evaluation based on the SocialBlue enterprise SNS (“Beehive”) </li></ul><ul><li>Survey with 258 participants </li></ul><ul><ul><li>CM and CplusL yield mostly unknown people, while FoF and SONAR yield mostly known </li></ul></ul><ul><ul><li>Content-similarity vs. relationship-based algorithms </li></ul></ul><ul><ul><li>The latter are more accurate overall </li></ul></ul><ul><ul><li>The former are better at discovering new friends </li></ul></ul><ul><li>Controlled field study with 3,000 users </li></ul><ul><ul><li>SONAR yields the most effective results </li></ul></ul><ul><ul><li>Combine relationships (at first) and content similarity (when the network grows)? </li></ul></ul>
  59. 60. Recommending People to Connect with <ul><li>The network effects of recommending social connections [Daly et al., RecSys ’10] </li></ul><ul><li>FoF is highly biased towards well-connected users, leading to high recommendation frequency of the same users </li></ul><ul><li>CM is most diverse and often recommends users with only a few connections </li></ul><ul><li>CM and SONAR affect betweenness centrality most significantly </li></ul><ul><li>CM is most biased for same country but least biased for same division </li></ul><ul><li>SONAR substantially increases cross-country and intra-division connections </li></ul><ul><li>Highlight network effects when recommending people? </li></ul>(Figures: degree distribution; betweenness delta)
  60. 61. Recommending People to Connect with <ul><li>FriendSensing: recommending friends using mobile phones [Quercia et al., RecSys ’09] </li></ul><ul><li>“Sensing” friends based on Bluetooth information from mobile devices </li></ul><ul><li>Friends are recommended based on collocation: </li></ul><ul><ul><li>Creating a weighted graph based on frequency and duration </li></ul></ul><ul><ul><li>Applying link prediction on the graph (shortest path, PageRank, k-Markov chain, HITS) </li></ul></ul><ul><li>Simulation-based evaluation </li></ul>
  61. 62. Recommending People to Follow <ul><li>Recommending twitter users to follow using content and collaborative filtering approaches [Hannon et al., RecSys ’10] </li></ul><ul><li>CB, CF, and hybrid approaches </li></ul><ul><li>User profiles based on </li></ul><ul><ul><li>Own tweets </li></ul></ul><ul><ul><li>Followers’ tweets </li></ul></ul><ul><ul><li>Followees’ tweets </li></ul></ul><ul><ul><li>Followers </li></ul></ul><ul><ul><li>Followees </li></ul></ul><ul><li>Using Lucene to index users by their profile, after applying TF-IDF to boost distinctive terms/users within the profile </li></ul>
  62. 63. Recommending People to Follow <ul><li>Offline Evaluation, 20K users </li></ul><ul><ul><li>19,000 training set Twitter users </li></ul></ul><ul><ul><li>1,000 test users </li></ul></ul><ul><ul><li>Create index per profile and predict followees </li></ul></ul><ul><ul><li>Measure by precision and position </li></ul></ul><ul><ul><li>Results show trade-off between the two </li></ul></ul><ul><ul><li>Slight advantage to followers and tweets of followers </li></ul></ul><ul><ul><li>Hybrid improves results (precision close to 0.3) </li></ul></ul><ul><li>Live User Trial, 34 participants </li></ul><ul><ul><li>Hybrid approach combining all types </li></ul></ul><ul><ul><li>30 recommended Twitter users </li></ul></ul><ul><ul><li>Indicate whom s/he is likely to follow </li></ul></ul><ul><ul><ul><li>No actual effect </li></ul></ul></ul><ul><ul><li>On average, 6.9 out of 30 </li></ul></ul>
  63. 64. Recommending People to Follow <ul><li>Who should I follow? Recommending people in directed social networks [Brzozowski & Romero, 2010] </li></ul><ul><li>Experiments with the WaterCooler enterprise SNS </li></ul><ul><li>110 users followed 774 new people during a 24-day trial period </li></ul><ul><li>Strongest pattern A<-X->B </li></ul><ul><ul><li>Sharing an audience with someone is a surprisingly compelling reason to follow them </li></ul></ul><ul><li>Similarity (and most read) are not so strong indicators </li></ul><ul><li>Most replied is a good indicator </li></ul>
  64. 65. Expertise Location <ul><li>Expertise location systems aid in finding a person with regard to a specific topic </li></ul><ul><li>For answering a question, getting advice, having a conversation, promoting an idea, etc. </li></ul><ul><li>Sometimes referred to as “expert recommendation” systems </li></ul><ul><li>E.g., Expertise recommender: a flexible recommendation system and architecture [McDonald & Ackerman, CSCW ’00] </li></ul><ul><li>Different from people recommendation </li></ul><ul><ul><li>Initiated by the user </li></ul></ul><ul><ul><li>In context: the user types a query </li></ul></ul><ul><ul><li>Typically answers an ad-hoc user need </li></ul></ul><ul><ul><li>Recommendation vs. Search </li></ul></ul>
  65. 66. References <ul><li>Brzozowski, M.J. & Romero, D.M. Who should I follow? Recommending people in directed social networks. </li></ul><ul><li>Chen, J., Geyer, W., Dugan, C., Muller, M., & Guy, I. 2009. Make new friends, but keep the old: recommending people on social networking sites. Proc. CHI ’09, 201-210. </li></ul><ul><li>Daly, E.M., Geyer, W., & Millen, D.R. The network effects of recommending social connections. Proc. RecSys ’10, 301-304. </li></ul><ul><li>Guy, I., Ronen, I., & Wilcox, E. Do you know? Recommending people to invite into your social network. Proc. IUI ’09, 77-86. </li></ul><ul><li>Hannon, J., Bennett, M., & Smyth, B. Recommending twitter users to follow using content and collaborative filtering approaches. Proc. RecSys ’10, 199-206. </li></ul><ul><li>McDonald, D.W. & Ackerman, M.S. 2000. Expertise recommender: a flexible recommendation system and architecture. Proc. CSCW ’00, 231-240. </li></ul><ul><li>Quercia, D. & Capra, L. 2009. FriendSensing: recommending friends using mobile phones. Proc. RecSys ’09, 273-276. </li></ul><ul><li>Terveen, L. & McDonald, D.W. Social matching: a framework and research agenda. ACM TOCHI 12, 3 (2005), 401-434. </li></ul>
  66. 67. Community Recommendations IBM Research WWW 2011, March 28 th -April 1 st , Hyderabad, India
  67. 68. Recommending Similar Communities <ul><li>Evaluating similarity measures: a large-scale study in the Orkut social network [Spertus et al., KDD ’05] </li></ul><ul><li>Orkut – SNS by Google, at the time of the experiments the largest in Brazil and India </li></ul><ul><ul><li>20K communities with over 20 members </li></ul></ul><ul><ul><li>180K distinct members </li></ul></ul><ul><ul><li>Over 2M memberships </li></ul></ul><ul><li>6 community similarity measures, based on community membership </li></ul><ul><ul><li>How appropriate is r as a recommendation for b </li></ul></ul><ul><ul><li>Measures: L1-Norm, L2-Norm, Log-Odds, Salton (IDF), Pointwise Mutual-Info (pos. correlations), Pointwise Mutual-Info (pos. and neg. correlations) </li></ul></ul>
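Two of the membership-based measures reduce to simple set arithmetic on member ids. A hedged sketch (function names are mine; communities are assumed non-empty sets of member ids):

```python
import math

def l1_similarity(a, b):
    """L1-normalized similarity: |A ∩ B| / (|A| * |B|).

    a, b: sets of member ids of two communities (assumed non-empty).
    """
    return len(a & b) / (len(a) * len(b))

def l2_similarity(a, b):
    """L2-normalized (cosine) similarity on binary membership
    vectors: |A ∩ B| / sqrt(|A| * |B|)."""
    return len(a & b) / math.sqrt(len(a) * len(b))
```

L2 penalizes large communities less aggressively than L1, which is consistent with L2 winning the click-through comparison on the next slide.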
  68. 69. Recommending Similar Communities <ul><li>Live trial on the Orkut SNS </li></ul><ul><li>Click-through to measure user acceptance </li></ul><ul><li>L2 shown as the best similarity measure </li></ul><ul><li>Followed by MI1, MI2, IDF, L1, and Log-Odds </li></ul><ul><li>Conversion rate - % of non-members who clicked through and then joined </li></ul><ul><ul><li>46% for base members, 17% for base non-members </li></ul></ul><ul><li>Measure user similarity through common communities – will L2 win too? </li></ul>
  69. 70. Personalized Community Recommendation <ul><li>Collaborative filtering for Orkut communities: discovery of user latent behavior [Chen et al., WWW ’09] </li></ul><ul><li>Personalized community recommendation using CF of two types </li></ul><ul><ul><li>Association rule mining (ARM) – association between communities shared between many users: users who join X typically join Y </li></ul></ul><ul><ul><li>Latent Dirichlet Allocation (LDA) – user-community co-occurrences using latent aspects (topics): x is related to y through a semantic feature, e.g., “baseball” </li></ul></ul><ul><ul><ul><li>Users=docs, communities=words, membership=co-occurrence </li></ul></ul></ul><ul><ul><ul><li>Per-topic distribution of users and communities </li></ul></ul></ul>
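The ARM side ("users who join X typically join Y") can be illustrated with a plain confidence computation over co-membership counts. This is a toy sketch, not the paper's mining pipeline; the function name and `min_support` threshold are my own choices.

```python
from collections import Counter
from itertools import permutations

def association_scores(memberships, min_support=2):
    """Score rules X -> Y by confidence = support(X, Y) / support(X).

    memberships: list of sets, one per user, of community ids.
    min_support: minimum number of users who joined both X and Y.
    """
    single = Counter()   # how many users joined each community
    pair = Counter()     # how many users joined each ordered pair
    for comms in memberships:
        single.update(comms)
        pair.update(permutations(sorted(comms), 2))
    rules = {}
    for (x, y), support in pair.items():
        if support >= min_support:
            rules[(x, y)] = support / single[x]
    return rules
```

LDA goes further by connecting X and Y through latent topics even when they were never co-joined, which is why it wins for longer recommendation lists.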
  70. 71. Personalized Community Recommendation <ul><li>Orkut membership data: 492K users, 118K communities </li></ul><ul><li>Top-k recommendation: withhold 1 community the user has joined with k-1 random communities, obtain rank (k=1001) </li></ul><ul><li>ARM is better when recommending lists of up to 3 communities </li></ul><ul><li>LDA is consistently better when recommending a list of 4 or more </li></ul><ul><li>In general, LDA ranks communities better than ARM </li></ul><ul><li>LDA is parallelized to improve efficiency </li></ul>
  71. 72. Personalized Community Recommendation <ul><li>Flickr group recommendation based on tensor decomposition [Zheng et al., SIGIR ’10] </li></ul><ul><li>Latent association between users and groups through tags </li></ul><ul><li>Model through a three-mode tensor </li></ul><ul><li>Experiments using 197x5328x4064 tensor </li></ul><ul><ul><li>Statistically significant superiority over user-based and non-negative matrix factorization approaches </li></ul></ul>
  72. 73. Personalized Community Recommendation <ul><li>Group proximity measure for recommending groups in online social networks [Saha & Getoor, SNA-KDD ’08] </li></ul><ul><li>Group proximity based on escape probability </li></ul><ul><ul><li>A random walk starting in G1 visits G2 before any other node in G1 </li></ul></ul><ul><ul><li>Extended with core vs. outlier nodes </li></ul></ul><ul><li>Method 1: recommend the closest groups to those the user is a member of </li></ul><ul><li>Method 2: boost groups that are closer to many of the user’s groups </li></ul><ul><li>GM and GL – two SVM classifiers trained over group membership data </li></ul><ul><li>Accuracy = predicted membership in top 5 </li></ul>
  73. 74. Personalized Community Recommendation <ul><li>Group Recommendation System for Facebook [Baatarjav et al., OTM ’08] </li></ul><ul><li>Matching user profiles with group identities </li></ul><ul><li>Combining hierarchical clustering and decision tree </li></ul><ul><li>Facebook data for the University of North Texas (1,580 users), focus on 17 groups </li></ul><ul><li>15 profile features: age, gender, timezone, relationship status, political view, interests, movies, affiliations, … </li></ul><ul><li>Characterize groups by the majority of their members </li></ul><ul><li>Remove noise by discarding members far from the group’s center </li></ul><ul><li>Reported average accuracy: 73% </li></ul><ul><li>From the LinkedIn Blog (“groups you may like”): </li></ul><ul><ul><li>Building a virtual profile per group by selecting the most representative features of group members using Information Theory techniques like Mutual Information and KL Divergence </li></ul></ul><ul><ul><li>Mapping user’s attributes to the group’s virtual profile </li></ul></ul><ul><ul><li>Adding more recommendations based on CF </li></ul></ul>
  74. 75. References <ul><li>Baatarjav, E.A., Phithakkitnukoon, S., & Dantu, R. Group recommendation system for Facebook. Proc. OTM ’08, 211-219. </li></ul><ul><li>Chen, W.Y., Chu, J.C., Luan, J., Bai, H., Wang, Y., & Chang, E.Y. Collaborative filtering for Orkut communities: discovery of user latent behavior. Proc. WWW ’09, 681-690. </li></ul><ul><li>Official LinkedIn Blog - The engineering behind LinkedIn products “you may like” </li></ul><ul><li>Saha & Getoor. Group proximity measure for recommending groups in online social networks. Proc. SNA-KDD ’08. </li></ul><ul><li>Spertus, E., Sahami, M., & Buyukkokten, O. Evaluating similarity measures: a large-scale study in the Orkut social network. Proc. KDD ’05, 678-684. </li></ul><ul><li>Zheng, N., Li, Q., Liao, S., & Zhang, L. Flickr group recommendation based on tensor decomposition. Proc. SIGIR ’10, 737-738. </li></ul>
  75. 76. Recommendations for Groups IBM Research WWW 2011, March 28th-April 1st, Hyderabad, India
  76. 77. Recommendation for a group rather than individuals <ul><li>Friends collaborating with a recommender system to design the perfect vacation </li></ul><ul><li>Family selecting a movie or TV show to watch together </li></ul><ul><li>Group of colleagues choosing a restaurant for an evening out </li></ul><ul><ul><li>Or looking for a recipe for a joint meal </li></ul></ul>
  77. 78. Issues in recommendation for groups <ul><li>Phase: Members specify their preferences </li></ul><ul><ul><li>Difference from individuals: members can examine each other </li></ul></ul><ul><ul><li>General issue: benefits/drawbacks for the group/system </li></ul></ul><ul><li>Phase: System generates recommendations </li></ul><ul><ul><li>Difference from individuals: aggregating preferences/results must be applied </li></ul></ul><ul><ul><li>General issue: aggregation methods </li></ul></ul><ul><li>Phase: Members decide which recommendation (if any) to accept </li></ul><ul><ul><li>Difference from individuals: negotiation may be required </li></ul></ul><ul><ul><li>General issue: how to support the process of arriving at a final decision </li></ul></ul>
  78. 79. Explicit specification of preferences: MusicFX [McCarthy et al., CSCW ’98] Hate Don’t mind Love Alternative rock: 1 2 3 4 5 Hot country: 1 2 3 4 5 50’s oldies: 1 2 3 4 5 … 90 more genres
  79. 80. Collaborative specification of preferences in the Travel Decision Forum [Jameson ’04]
  80. 81. Collaborative specification - Advantages <ul><li>Persuade other members to specify a similar preference, perhaps by giving them information that they previously lacked </li></ul><ul><li>Explain and justify a member’s preference </li></ul><ul><ul><ul><li>(e.g., “I can’t go hiking because of an injury”) </li></ul></ul></ul><ul><li>Take into account the attitudes and anticipated behavior of other members </li></ul><ul><li>Encourage assimilation to facilitate the reaching of agreement </li></ul>
  81. 82. Aggregating Preferences <ul><li>Least misery: </li></ul><ul><ul><li>Recommendations are based on the lowest predicted rating </li></ul></ul><ul><ul><li>Rate item i by R_i = min{r_i1, …, r_ik} (for a group of users {1..k}) </li></ul></ul><ul><li>Fairness: </li></ul><ul><ul><li>No one wants a perfectly fair solution that makes everyone equally miserable </li></ul></ul><ul><ul><ul><li>R_i = avg{r_i1, …, r_ik} − w · std(r_i1, …, r_ik) </li></ul></ul></ul><ul><li>Fusion: </li></ul><ul><ul><li>Aggregate the item rankings created for the individuals (e.g., Borda count) </li></ul></ul><ul><li>Group modeling: </li></ul><ul><ul><li>Compute an aggregate preference model that represents the preferences of the group as a whole </li></ul></ul><ul><ul><ul><li>e.g., the system computes a model of the group by forming a linear combination of the individual models </li></ul></ul></ul>
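The first three strategies above can be sketched directly. A toy illustration; the fairness weight w=0.5 and the use of population standard deviation are my assumptions, not prescribed by the slide.

```python
from collections import defaultdict
from statistics import mean, pstdev

def least_misery(ratings):
    """Score each item by the lowest predicted rating among members.

    ratings: dict item -> list of predicted ratings, one per member.
    """
    return {i: min(rs) for i, rs in ratings.items()}

def fairness(ratings, w=0.5):
    """Average penalized by rating spread, so divisive items drop."""
    return {i: mean(rs) - w * pstdev(rs) for i, rs in ratings.items()}

def borda(ranked_lists):
    """Fuse individual ranked item lists with a Borda count."""
    scores = defaultdict(int)
    for ranking in ranked_lists:
        n = len(ranking)
        for pos, item in enumerate(ranking):
            scores[item] += n - pos  # top position earns n points
    return sorted(scores, key=scores.get, reverse=True)
```

Note how the strategies disagree: an item rated [5, 5, 1] beats [3, 3, 3] on average, but loses under both least misery and fairness.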
  82. 83. Aggregation Methods [Berkovsky, RecSys ’10]
  83. 84. Group recommendations [Baltrunas et al., RecSys ’10] <ul><li>Individual recommendation: </li></ul><ul><ul><li>candidate list contains all the items that are not present in the training set of any group member </li></ul></ul><ul><li>Group recommendation: </li></ul><ul><ul><li>takes as input the individual ranked lists of item recommendations for each user in the group </li></ul></ul><ul><ul><li>returns a ranked list of recommendations for the whole group </li></ul></ul>
  84. 85. References <ul><li>McCarthy, J.F. & Anagnost, T.D. MusicFX: an arbiter of group preferences for computer supported collaborative workouts. Proc. CSCW ’98. </li></ul><ul><li>Jameson, A. et al. Two methods for enhancing mutual awareness in a group recommender system. Proc. AVI ’04. </li></ul><ul><li>Jameson, A. & Smyth, B. Recommendation to groups. 2007. </li></ul><ul><li>Baltrunas, L. et al. Group recommendations with rank aggregation and collaborative filtering. Proc. RecSys ’10. </li></ul><ul><li>Berkovsky, S. et al. Group-based recipe recommendations: analysis of data aggregation strategies. Proc. RecSys ’10. </li></ul>
  85. 86. The Cold Start Problem IBM Research WWW 2011, March 28 th -April 1 st , Hyderabad, India
  85. 86. The Cold Start Problem <ul><li>The cold start problem concerns cases where the RS cannot draw inferences for users or items for which it has not yet gathered sufficient information </li></ul><ul><li>New items </li></ul><ul><ul><li>e.g., a newly created document w/o tags or bookmarks </li></ul></ul><ul><ul><li>e.g., a newly created community w/o members </li></ul></ul><ul><li>New users </li></ul><ul><ul><li>e.g., a user who has just signed up to a new site </li></ul></ul><ul><ul><li>e.g., a new member or employee </li></ul></ul><ul><li>Typically addressed by applying a hybrid approach </li></ul>
  86. 87. The Cold Start Problem of New Items <ul><li>Traditional CF systems are based on item ratings </li></ul><ul><ul><li>Until rated by a substantial number of users, the system will not be able to recommend the item </li></ul></ul><ul><li>a.k.a. the “early rater” problem – the first person to rate an item gets little benefit </li></ul><ul><li>Same for implicit feedback over items – clicks, searches, comments, tags </li></ul><ul><li>Even more acute for activity streams, where items quickly come and go </li></ul><ul><li>Typically addressed by integrating CB similarity measurements </li></ul><ul><ul><li>Recommendation based on the data of older similar items </li></ul></ul>(Diagram: CB – user A likes item X, X is similar to item Y, recommend Y to A; CF+CB – user B is similar to user A, recommend Y to B)
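The CF+CB fallback idea above can be sketched as: score with CF when the item has a rating history, otherwise back off to a content-similarity-weighted average of the CF scores of older similar items. A hedged sketch; the interface (`cf_score` callable, `content_sim` map) is my own.

```python
def predict(user, item, cf_score, content_sim):
    """Hybrid prediction with a content-based fallback for cold items.

    cf_score: callable (user, item) -> float, or None when the item
              has no rating history yet.
    content_sim: dict item -> list of (similar_item, similarity) pairs
                 computed from item content.
    """
    s = cf_score(user, item)
    if s is not None:
        return s
    # cold item: weighted average of CF scores of content neighbors
    num = den = 0.0
    for other, sim in content_sim.get(item, []):
        s2 = cf_score(user, other)
        if s2 is not None:
            num += sim * s2
            den += sim
    return num / den if den else None
```

A brand new item thus inherits predictions from its content neighbors until it accumulates feedback of its own.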
  88. 89. The Cold Start Problem of New Items <ul><li>Methods and metrics for cold-start recommendations [Schein et al., SIGIR ’02] </li></ul><ul><li>Extend hybrid RS to average content data in a model-based fashion </li></ul>
  89. 90. The Cold Start Problem for New Users <ul><li>Sometimes also referred to as the “ New User Problem ” </li></ul><ul><li>User needs to rate sufficient items for a CB recommender to really understand the user’s preferences </li></ul><ul><li>Mitigated by CF – similar users who rated more items can yield more recommendations </li></ul><ul><li>Traditional CF still faces an issue if the user did not provide any explicit feedback (or very small amount of feedback) </li></ul><ul><li>Typically resolved through building a user profile by integrating other user activity (implicit feedback) </li></ul><ul><ul><li>Browsing history, click-through data, searches </li></ul></ul><ul><li>Social media introduces new ways to learn about the user from external sources </li></ul><ul><ul><li>Friends (“social filtering”), tags, communities, … </li></ul></ul><ul><ul><li>More public information which is less sensitive to privacy issues </li></ul></ul>
  90. 91. The Cold Start Problem for New Users <ul><li>Increasing engagement through early recommender intervention [Freyne et al., RecSys ’09] </li></ul><ul><li>New users of an enterprise SNS </li></ul><ul><li>Use aggregation approach – extract social network data from external sources (project and patent DB, org chart, wikis and blogs) </li></ul><ul><li>A brand new employee – still has org chart and basic profile data (location, division, project, etc.) </li></ul>
  91. 92. The Cold Start Problem for New Users <ul><li>Getting to know you: learning new user preferences in recommender systems [Rashid et al., IUI ’02] </li></ul><ul><li>Experimentation with MovieLens (offline and online) </li></ul><ul><li>Focus on minimizing user effort while maximizing accuracy </li></ul><ul><li>6 techniques a CF system can use to learn about new users </li></ul><ul><ul><li>Random </li></ul></ul><ul><ul><li>Popular (#ratings), adds less information </li></ul></ul><ul><ul><li>Entropy (diverse ratings), adds a lot of information </li></ul></ul><ul><ul><li>Pop*Ent – product of popularity and entropy </li></ul></ul><ul><ul><li>Item-Item personalized – once the user has rated one movie (before that – random from top 200 of Pop*Ent) </li></ul></ul>
  92. 93. Cold Start for Tag-based Recommenders <ul><li>Tag-based recommenders are sensitive to both new item and new user cold start problems </li></ul><ul><ul><li>New items are “tagless” </li></ul></ul><ul><ul><li>New users are not associated with tags </li></ul></ul><ul><li>Possible Solutions: </li></ul><ul><ul><li>Hybridize with a CB recommender </li></ul></ul><ul><ul><li>Use CB automatic tag extraction </li></ul></ul><ul><ul><li>Apply CF for tags to enrich tags </li></ul></ul><ul><ul><ul><li>Will not work for brand new users or items </li></ul></ul></ul>
  93. 94. References <ul><li>Freyne J., Jacovi M., Guy I., and Geyer, W. Increasing engagement through early recommender intervention. Proc. RecSys '09 , 85-92. </li></ul><ul><li>Schein A.I., Popescul A., Ungar L.H, & Pennock D.M. Methods and metrics for cold-start recommendations. Proc. SIGIR ’02 , 253-260. </li></ul><ul><li>Rashid A.M., Albert I., Cosley D., Lam S.K., McNee S.M., & Konstan, J.A. Getting to know you: learning new user preferences in recommender systems. Proc. IUI ’02 , 127-134. </li></ul>
  94. 95. Trust and Distrust IBM Research WWW 2011, March 28th-April 1st, Hyderabad, India
  95. 96. Social relations based recommendation <ul><li>Traditional recommender systems ignore the social connections between users </li></ul><ul><li>To improve recommendation accuracy, users’ social networks should be taken into consideration </li></ul><ul><ul><li>People who are socially connected are more likely to share the same or similar interests </li></ul></ul><ul><ul><li>Users can be easily influenced by the friends they trust, and prefer their friends’ recommendations </li></ul></ul>
  96. 98. Trust Enhanced Recommendation <ul><li>The question of whom to trust and what information to trust has become both more important and more difficult to answer </li></ul><ul><li>Social trust relationships, derived from social networks, are uniquely suited to speak to the quality of online information </li></ul><ul><li>Merging social networks based trust and recommender systems can improve the accuracy of recommendations and improve the user's experience. </li></ul>
  97. 99. Problem Definition (Figures: social trust graph; user-item rating matrix)
  98. 100. TidalTrust: Accumulate recommendations from trusted people only [Golbeck ’06]
  99. 101. Integrating Trust with CF <ul><li>Items recommended by trusted neighbors will be boosted </li></ul><ul><li>Note that we can still get recommendations from similar non-trusted neighbors (w(a,u) << t(a,v) if v is trusted and u is not) </li></ul><ul><li>Note: </li></ul><ul><ul><li>Trust is highly correlated with similarity </li></ul></ul><ul><ul><ul><li>People usually have more trust in similar people </li></ul></ul></ul><ul><ul><li>However, trust captures nuances that similarity cannot assess (Golbeck) </li></ul></ul>
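The boosting scheme above fits the standard neighborhood prediction formula, with the neighbor weight blending similarity and trust so that trusted neighbors dominate. A minimal sketch, assuming mean-centered ratings; the `weight` callable stands in for whatever similarity/trust combination w and t the system uses.

```python
def trust_cf_predict(a, item, ratings, means, weight):
    """Neighborhood prediction where the neighbor weight blends
    similarity and trust (trusted neighbors get boosted).

    ratings: dict user -> dict item -> rating
    means:   dict user -> mean rating of that user
    weight:  callable (a, v) -> combined similarity/trust weight
    """
    num = den = 0.0
    for v, r in ratings.items():
        if v == a or item not in r:
            continue
        w = weight(a, v)
        num += w * (r[item] - means[v])  # mean-centered deviation
        den += abs(w)
    # fall back to the active user's mean when no neighbor rated the item
    return means[a] + num / den if den else means[a]
```

Because `weight` can return small values for merely similar users and large ones for trusted users, recommendations from non-trusted but similar neighbors still come through, only weaker.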
  100. 102. Inferring Trust The Goal: Select two individuals - the source and sink - and recommend to the source how much to trust the sink . Sink Source
  101. 103. Trust computations <ul><li>If the source does not know the sink, the source asks all of its friends how much to trust the sink, and computes a trust value by a weighted average </li></ul><ul><li>Neighbors repeat the process if they do not have a direct rating for the sink </li></ul>
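The recursive weighted-average propagation described above can be sketched as follows. This is a simplified illustration, not the full TidalTrust algorithm (which additionally restricts propagation to strongest shortest paths); the function name and graph encoding are mine.

```python
def infer_trust(graph, source, sink, visited=None):
    """Infer trust(source, sink) as the trust-weighted average of the
    neighbors' (possibly themselves inferred) trust in the sink.

    graph: dict user -> dict neighbor -> direct trust in [0, 1]
    Returns None when no path to the sink exists.
    """
    if visited is None:
        visited = {source}
    direct = graph.get(source, {})
    if sink in direct:              # direct rating: no inference needed
        return direct[sink]
    num = den = 0.0
    for v, t in direct.items():
        if v in visited:            # avoid cycles
            continue
        tv = infer_trust(graph, v, sink, visited | {v})
        if tv is not None:
            num += t * tv           # weight neighbor's answer by trust in them
            den += t
    return num / den if den else None
```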
  102. 104. Reputation <ul><li>We can measure the reputation (“global trust”) of the user using Network analysis </li></ul><ul><ul><li>e.g. PageRank </li></ul></ul><ul><li>These reputation values can be used to bias recommendations from highly trusted (reliable) people </li></ul>
  103. 105. What about Distrust? (Victor 2009) <ul><li>Use the distrust set to filter out “unwanted” individuals from collaborative recommendation processes </li></ul><ul><ul><li>Ignore users from R_Dist in the CF process </li></ul></ul><ul><li>Distrust as an indicator to reverse deviations </li></ul><ul><ul><li>consider distrust scores as negative weights </li></ul></ul><ul><ul><ul><li>Reduce the recommendation score of items recommended by distrusted neighbors </li></ul></ul></ul><ul><li>Using distrust for recommendation is still an open challenge </li></ul>
  104. 106. Trust in Recommendation (by explanations) Good explanations could help inspire user trust and loyalty, increase satisfaction, make it quicker and easier for users to find what they want, and persuade them to try or purchase a recommended item MoviExplain: A Recommender System with Explanations (Symeonidis09)
  105. 107. Explanation Aims: <ul><li>Transparency </li></ul><ul><ul><li>Explain how the system works </li></ul></ul><ul><li>Scrutability </li></ul><ul><ul><li>Allow users to tell the system it is wrong </li></ul></ul><ul><li>Trust </li></ul><ul><ul><li>Increase users’ confidence in the system </li></ul></ul><ul><li>Effectiveness </li></ul><ul><ul><li>Help users make good decisions </li></ul></ul><ul><li>Persuasiveness </li></ul><ul><ul><li>Convince users to try or buy </li></ul></ul><ul><li>Efficiency </li></ul><ul><ul><li>Help users make decisions faster </li></ul></ul><ul><li>Satisfaction </li></ul><ul><ul><li>Increase the ease of usability or enjoyment </li></ul></ul>
  106. 108. Explanation types <ul><li>Nearest neighbor explanation </li></ul><ul><ul><li>Customers who bought item X also bought items Y, Z </li></ul></ul><ul><ul><li>Item Y is recommended because you rated related item X </li></ul></ul><ul><li>Content based explanation </li></ul><ul><ul><li>This story deals with topics X, Y which belong to your topics of interest </li></ul></ul><ul><li>Social based explanation </li></ul><ul><ul><li>People leverage their social network to reach information and make use of trust relationships to filter information </li></ul></ul><ul><ul><ul><li>Your friend X wrote that blog </li></ul></ul></ul><ul><ul><ul><li>50% of your friends liked this item (while only 5% disliked it) </li></ul></ul></ul>
  107. 109. TagSplanations [Vig et al., IUI ’09] <ul><li>Tagsplanations have two key components: </li></ul><ul><ul><li>tag relevance, the degree to which a tag describes an item </li></ul></ul><ul><ul><li>tag preference, the user’s sentiment toward a tag </li></ul></ul>(Examples: Rushmore, Rear Window)
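A simple proxy for tag preference can be computed from a user's own ratings: average the ratings the user gave to items carrying the tag. This is only a rough sketch of the concept; the paper derives both components with more careful, tag-share-weighted measures, and the function below is my own illustration.

```python
from statistics import mean

def tag_preference(user_ratings, item_tags, tag):
    """Proxy for tag preference: the user's mean rating over items
    that carry the tag.

    user_ratings: dict item -> rating given by this user
    item_tags:    dict item -> set of tags applied to the item
    Returns None when the user has rated no item with the tag.
    """
    rated = [r for item, r in user_ratings.items()
             if tag in item_tags.get(item, ())]
    return mean(rated) if rated else None
```

Sorting a recommended item's tags by such a preference score yields the kind of "because you like quirky, dark comedy" explanation the paper studies.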
  108. 110. Social-based Explanation (Guy09)
  109. 111. References <ul><li>Victor, P. et al. Trust and distrust based recommendations for controversial reviews. IEEE Intelligent Systems. </li></ul><ul><li>Golbeck, J. et al. Inferring binary trust relationships in Web-based social networks. ACM TOIT 2006. </li></ul><ul><li>Symeonidis, P. et al. MoviExplain: a recommender system with explanations. Proc. RecSys ’09. </li></ul><ul><li>Vig, J. et al. Tagsplanations: explaining recommendations using tags. Proc. IUI ’09. </li></ul><ul><li>Guy, I. et al. Personalized recommendation of social software items based on social relations. Proc. RecSys ’09. </li></ul>
  110. 112. Social Recommender Systems in the Enterprise IBM Research WWW 2011, March 28 th -April 1 st , Hyderabad, India
  111. 113. Social Media in the Enterprise <ul><li>Following their success on the web, social media applications have emerged in large enterprises </li></ul><ul><ul><li>Corporate blogs and wikis (Blog Central) </li></ul></ul><ul><ul><li>Social bookmarking (Dogear) </li></ul></ul><ul><ul><li>Social network sites (Fringe, Beehive/SocialBlue, Town Square) </li></ul></ul><ul><ul><li>People tagging </li></ul></ul><ul><ul><li>Social file sharing </li></ul></ul><ul><li>Several differences observed from use outside the firewall </li></ul><ul><ul><li>e.g., people seek more strongly to connect with strangers </li></ul></ul><ul><li>Information overload analogously created within the enterprise </li></ul><ul><li>Enterprise social media products on the market: IBM Connections, Jive, Yammer </li></ul>
  112. 114. IBM Lotus Connections – Enterprise Social Software
  113. 115. Social Recommenders in the Enterprise <ul><li>Same user identity in all systems </li></ul><ul><ul><li>Facilitates aggregation </li></ul></ul><ul><ul><li>Higher accountability </li></ul></ul><ul><li>Lower scales, but also lower density </li></ul><ul><li>Cold start for new employees </li></ul><ul><ul><li>Mitigated with directory data (location, division, role, etc.) </li></ul></ul>
  114. 116. Recommending Content to Create <ul><li>Recommending topics for self-descriptions in online user profiles [Geyer et al., RecSys ’08] </li></ul><ul><li>‘About You’ Q&A pairs in the SocialBlue SNS </li></ul><ul><li>Content algorithm – user profile based on all SocialBlue content </li></ul><ul><li>Network algorithm – if one or more friends have created this ‘About You’ </li></ul><ul><li>Network algorithm performed best </li></ul>
  115. 117. Increasing Engagement in Enterprise Social Media <ul><li>Increasing Engagement through Early Recommender Intervention [Freyne et al, RecSys ’09] </li></ul><ul><li>Challenge of adoption and engagement of new social software users </li></ul><ul><li>Research over SocialBlue enterprise SNS </li></ul><ul><li>Combined people and content-creation recommendations have significant effect in increasing overall contributions and page views </li></ul><ul><ul><li>Short and long term effects </li></ul></ul><ul><li>People recommendations ( sorted by overall activity ) have high effect in increasing retention rate </li></ul>
  116. 118. Stranger Recommendation <ul><li>Do you want to know? Recommending strangers in the enterprise [Guy et al., CSCW ’11] </li></ul><ul><li>Recommendation of people who are unknown yet interesting in the organization </li></ul><ul><li>May be useful to </li></ul><ul><ul><li>Get help or advice </li></ul></ul><ul><ul><li>Reach new opportunities </li></ul></ul><ul><ul><li>Discover new routes for career development </li></ul></ul><ul><ul><li>Learn about new assets that can be leveraged </li></ul></ul><ul><ul><li>Connect with SMEs and influencers </li></ul></ul><ul><ul><li>Cultivate organizational social capital </li></ul></ul><ul><ul><li>Grow one’s own reputation and influence within the organization </li></ul></ul><ul><li>Complements recommendation of people to connect with, as those are quickly exhausted over time </li></ul>
  117. 119. Stranger Recommendation <ul><li>Method - subtracting the familiarity network from the similarity network </li></ul><ul><li>Similarity : common things and places: tags, communities, wikis </li></ul><ul><li>Score based on Jaccard’s index </li></ul><ul><li>Presentation with evidence </li></ul><ul><li>Two-thirds of the recommendations are strangers </li></ul><ul><li>Significantly more interesting than a random person </li></ul><ul><li>Out of 9 recommendations, 67% got at least one stranger rated 3 or above </li></ul><ul><li>Exploratory recommendation </li></ul><ul><ul><li>Low accuracy, high serendipity </li></ul></ul>
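The similarity-minus-familiarity idea above can be sketched as: rank all non-familiar users by Jaccard overlap of shared artifacts (tags, communities, wikis). A minimal sketch; the data encoding and function name are my assumptions.

```python
def recommend_strangers(me, artifacts, familiar, top_k=3):
    """Rank unknown users by Jaccard similarity of shared artifacts,
    after subtracting the familiarity network.

    artifacts: dict user -> set of artifact ids (tags, communities, wikis)
    familiar:  dict user -> set of users they already know
    """
    mine = artifacts[me]
    scores = {}
    for other, theirs in artifacts.items():
        if other == me or other in familiar.get(me, set()):
            continue  # drop self and already-familiar users
        union = mine | theirs
        scores[other] = len(mine & theirs) / len(union) if union else 0.0
    return sorted(scores, key=scores.get, reverse=True)[:top_k]
```

The shared artifacts that drive the Jaccard score double as the evidence shown alongside each recommendation.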
  118. 120. References <ul><li>Freyne J., Jacovi, M., Guy I., & Geyer W. 2009. Increasing engagement through early recommender intervention. Proc RecSys ’09, 85-92. </li></ul><ul><li>Geyer, W., Dugan, C., Millen, D., Muller, M., & Freyne, J. Recommending topics for self-descriptions in online user profiles. Proc. RecSys ’08 , 59-66. </li></ul><ul><li>Blog Muse </li></ul><ul><li>Guy I., Ur, S., Ronen, I., Perer, A., & Jacovi, M. Do you want to know? recommending strangers in the enterprise. Proc. CSCW ’11 . </li></ul>
  119. 121. Temporal Aspects in Social Recommendation IBM Research WWW 2011, March 28 th -April 1 st , Hyderabad, India
  120. 122. The Time Factor in RS <ul><li>Most RS works ignore temporal factors </li></ul><ul><li>In practice, time yields many changes: </li></ul><ul><ul><li>Changes in user tastes and needs </li></ul></ul><ul><ul><ul><li>e.g., a scientist moving to a new domain </li></ul></ul></ul><ul><ul><li>Appearance of new items and users, disappearance of others </li></ul></ul><ul><ul><ul><li>e.g., a new action movie about aliens </li></ul></ul></ul><ul><ul><li>Changes in items and their features </li></ul></ul><ul><ul><ul><li>e.g., a restaurant changing its menu </li></ul></ul></ul><ul><li>The pace of change is greater in the social media world, where the masses are involved in constantly creating new content, communities, comments, tags, etc. </li></ul><ul><li>Real-time web mediums like Twitter further magnify the importance of time </li></ul>
  121. 123. Temporal CF <ul><li>Collaborative filtering with temporal dynamics [Koren, KDD’09] </li></ul><ul><li>Tracking customer preferences for products over time </li></ul><ul><li>Both users and products go through a distinct series of changes in their characteristics </li></ul><ul><li>A mere decay model loses too much signal </li></ul><ul><li>Separate transient factors from lasting ones </li></ul><ul><ul><li>Matrix factorization – model both user and product change along time, to distill longer term trends </li></ul></ul><ul><ul><li>Item-item neighborhood – learning how influence between two items rated by a user decays over time, to reveal the fundamental item-item relations </li></ul></ul><ul><li>Leading to improved results over Netflix in both models </li></ul>
  122. 124. Temporal Diversity <ul><li>Temporal diversity in recommender systems [Lathia et al., SIGIR ’10] </li></ul><ul><li>Over time, the same items might be recommended again and again </li></ul><ul><ul><li>Users may lose interest in interacting with the RS </li></ul></ul><ul><li>Temporal diversity is therefore important </li></ul><ul><ul><li>Netflix data changes over time: #movies grows, #users grows, #ratings grows </li></ul></ul><ul><li>User survey – simulated 5 “weeks” </li></ul><ul><ul><li>S1 – top-10 most popular movies (no diversity) </li></ul></ul><ul><ul><li>S2 – ~7 popular movies replaced each week </li></ul></ul><ul><ul><li>S3 – random out of the Netflix dataset </li></ul></ul>
  123. 125. Temporal Diversity <ul><li>Comparing diversity of 3 CF algorithms over time: </li></ul><ul><ul><li>Baseline – item’s mean rating </li></ul></ul><ul><ul><li>Item-item based kNN </li></ul></ul><ul><ul><li>Matrix factorization based on SVD </li></ul></ul><ul><li>Growing Netflix dataset, 249 system updates (simulated 7-day windows) </li></ul>
  124. 126. Temporal User Profile <ul><li>Adaptive web search based on user profile constructed without any effort from users [Sugiyama et al., WWW’04] </li></ul><ul><ul><li>Personalized search based on browsing history </li></ul></ul><ul><ul><li>Given a query, search results are adapted based on the user’s information needs </li></ul></ul><ul><ul><li>User profile evolves over time, combining </li></ul></ul><ul><ul><ul><li>Persistent user behavior computed over a fixed time decay window </li></ul></ul></ul><ul><ul><ul><li>Ephemeral aspects captured from the current session (day) </li></ul></ul></ul><ul><li>Personalizing search via automated analysis of interests and activities [Teevan et al., SIGIR ’05] </li></ul><ul><ul><li>Wide range of implicit user activities over short and long time periods using a relevance feedback framework </li></ul></ul><ul><ul><ul><li>Search queries, visited pages – “implicit” short-term relevance feedback </li></ul></ul></ul><ul><ul><ul><li>Documents and email – longer-term interests </li></ul></ul></ul>
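The two systems above differ in their details; the sketch below only illustrates the shared idea of blending a persistent (decay-weighted) term profile with the ephemeral profile of the current session. The function name, the dict representation, and the mixing weight are illustrative assumptions, not taken from either paper:

```python
def blend_profiles(long_term, session, alpha=0.7):
    """Combine a persistent term profile with the current session's profile.

    long_term, session: dicts mapping term -> weight.
    alpha: share given to the persistent profile (an illustrative value;
           a real system would tune it, or vary it per user).
    """
    terms = set(long_term) | set(session)
    return {t: alpha * long_term.get(t, 0.0) + (1 - alpha) * session.get(t, 0.0)
            for t in terms}
```

A query or recommendation request is then scored against the blended profile, so lasting interests dominate while today's activity still shifts the results.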
  125. 127. Time-based Evaluation in SRS <ul><li>Google Reader news recommendation </li></ul><ul><li>Network effects of people recommendation </li></ul><ul><li>Increasing engagement of new users </li></ul>
  126. 128. Future Directions <ul><li>Enhance user modeling approaches to consider time in a smarter way </li></ul><ul><ul><li>e.g., someone you “friended” 5 years ago – a very good friend or a one-time encounter? </li></ul></ul><ul><ul><ul><li>Measure interaction over time </li></ul></ul></ul><ul><ul><li>e.g., tag usage as a signal to changes of interest over time </li></ul></ul><ul><ul><ul><li>Distinguish transient vs. lasting interests </li></ul></ul></ul><ul><li>Learn from user feedback to compensate for decay in interest </li></ul><ul><ul><li>Exploitation vs. exploration </li></ul></ul><ul><li>Different recommendations for new users, old users, heavy users, etc. </li></ul><ul><li>Impose diversity and avoid recommending the same items again and again </li></ul><ul><li>More live user studies over time </li></ul>
  127. 129. References <ul><li>Koren Y. Collaborative filtering with temporal dynamics. Proc. KDD ’09 , 447-456. </li></ul><ul><li>Lathia N., Hailes, S., Capra L., & Amatriain X. 2010. Temporal diversity in recommender systems. Proc. SIGIR ’10 , 210-217. </li></ul><ul><li>Sugiyama K., Hatano K., & Yoshikawa M. Adaptive web search based on user profile constructed without any effort from users. Proc. WWW '04, 675-684. </li></ul><ul><li>Teevan, J., Dumais, S. T., & Horvitz, E. Personalizing search via automated analysis of interests and activities. Proc. SIGIR ’05 , 449-456. </li></ul>
  128. 130. Social Recommendation over Activity Streams IBM Research WWW 2011, March 28 th -April 1 st , Hyderabad, India
  129. 131. Activity Streams and the Real-time Web <ul><li>The emergence of Twitter and Facebook marked a new phase in the evolution of the Web, characterized by streams of updates and news (“new items”) </li></ul><ul><li>Millions of users sharing activities with friends and followers, creating a “firehose” of data in real time </li></ul><ul><li>Twitter consists of 140-character status messages (microblogs) and allows asymmetric following </li></ul><ul><li>The Facebook Newsfeed shows different activities from Facebook friends: status updates, friend additions, group joining, page liking, photo sharing and tagging, … </li></ul><ul><li>FriendFeed is an example of a social aggregator, including activities from 3rd-party sites: blogs, SNSs, social bookmarking, … </li></ul><ul><li>Activity stream formats: </li></ul><ul><ul><li>Twitter – Asymmetric network (follow), Homogeneous content (status updates), Internal source </li></ul></ul><ul><ul><li>Facebook – Symmetric network (friends), Heterogeneous content, Internal source </li></ul></ul><ul><ul><li>FriendFeed – Asymmetric network (follow), Heterogeneous content, External (aggregated) source </li></ul></ul>
  130. 132. Utilizing the Stream for Recommendation <ul><li>Stream data is intensive, fresh, and represents the “crowds” </li></ul><ul><li>Yet, it is also noisy and sparse in content and meta-data </li></ul><ul><li>Using the stream smartly can enhance current SRS </li></ul><ul><ul><li>Especially for modeling recent interests and trends </li></ul></ul><ul><li>Stream data requires special techniques </li></ul><ul><ul><li>Frequency is high, so is noise </li></ul></ul><ul><ul><li>Recency plays special role </li></ul></ul><ul><ul><li>Content is sparse, no tags (apart from #tags) </li></ul></ul>
  131. 133. Enhancing News Recommendation <ul><li>Using twitter to recommend real-time topical news [Phelan et al., RecSys '09] </li></ul><ul><li>CB approach based on co-occurrence of popular terms within the user’s RSS and Twitter items </li></ul><ul><li>3 recommendation strategies </li></ul><ul><ul><li>Most recent public tweets </li></ul></ul><ul><ul><li>Tweets from the user’s followees </li></ul></ul><ul><ul><li>Non-tweet, based on top 100 RSS terms </li></ul></ul><ul><li>First indication that a personalized Twitter-based profile can substantially enhance recommendations </li></ul>
  132. 134. Enhancing Movie Recommendations <ul><li>On the real-time web as a source of recommendation knowledge [Esparza et al., RecSys ’10] </li></ul><ul><li>Blippr – a Twitter-like short textual movie review service, also supporting tags </li></ul><ul><li>Users are represented based on their blips (and tags) </li></ul><ul><li>Initial results show superiority over CF </li></ul><ul><li>Adding tags did not improve the performance in this case </li></ul>
  133. 135. Personalized Filtering of the Stream <ul><li>Current stream filtering is done based on the network of “followees” or friends, which is insufficient </li></ul><ul><li>On the one hand, users often receive hundreds of news items per day </li></ul><ul><ul><li>Much beyond what users have time to process </li></ul></ul><ul><ul><li>Need to filter the stream to those items that are indeed of interest </li></ul></ul><ul><li>On the other hand, there may be other useful news outside the circle of friends or followees </li></ul><ul><li>Challenges: </li></ul><ul><ul><li>Frequency – huge flood of news items </li></ul></ul><ul><ul><li>Recency plays a big role – news items are interesting only when very fresh </li></ul></ul><ul><ul><li>News items are sparse in content and unstructured (sparse in metadata) </li></ul></ul><ul><ul><li>Cold start problem of new items </li></ul></ul>
  134. 136. Recommending Twitter URLs <ul><li>Short and tweet: experiments on recommending content from information streams [Chen et al., CHI ‘10] </li></ul><ul><li>Directly recommend content through URLs </li></ul><ul><li>Candidate selection </li></ul><ul><ul><li>Followees and followees-of-followees </li></ul></ul><ul><ul><li>Popular tweets </li></ul></ul><ul><li>Topic Relevance </li></ul><ul><ul><li>Cosine similarity between user and URL </li></ul></ul><ul><ul><li>Both based on TF-IDF </li></ul></ul><ul><ul><li>User profile based on self tweets and followees’ tweets </li></ul></ul><ul><ul><li>URL based on tweets (independent of page content) </li></ul></ul><ul><li>Social process </li></ul><ul><ul><li>Based on “votes” by followees of followees (fof) </li></ul></ul><ul><ul><li>A vote increases as an fof is followed by more of the user’s followees </li></ul></ul><ul><ul><li>A vote increases as an fof tweets less frequently </li></ul></ul>
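The topic-relevance component above boils down to a cosine similarity between two TF-IDF term vectors (user profile vs. the tweets mentioning a URL); a self-contained sketch of that computation — the exact tokenization and weighting in [Chen et al.] may differ:

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """docs: list of token lists (e.g., one list per profile or URL).
    Returns one sparse TF-IDF dict per document."""
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))        # document frequency
    idf = {t: math.log(n / df[t]) for t in df}
    return [{t: tf * idf[t] for t, tf in Counter(d).items()} for d in docs]

def cosine(u, v):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0
```

In the paper's setting, one "document" is built from the user's own and followees' tweets, and another from the tweets containing a candidate URL; the cosine between the two serves as the topic-relevance score.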
  135. 137. Recommending Twitter URLs <ul><li>Comparing 12 algorithms: (2 candidate) * (3 topic relevance) * (2 social process) </li></ul><ul><li>Field study, 44 subjects </li></ul><ul><li>At least 20 followees and 50 tweets </li></ul><ul><li>Rating the top-5 URLs from each algorithm </li></ul><ul><ul><li>Interesting or not interesting </li></ul></ul><ul><li>Each URL is shown with up to 3 tweets </li></ul><ul><ul><li>From the user’s fof, if available </li></ul></ul><ul><li>Social process beats topic relevance </li></ul><ul><li>FoF beats popularity </li></ul><ul><li>Self beats followee (due to taking top 5?) </li></ul>
  136. 138. Topic-based Twitter Filtering <ul><li>Eddi: interactive topic-based browsing of social status streams [Bernstein et al., UIST ’10] </li></ul>
  137. 139. Topic-based Twitter Filtering <ul><li>TweetTopic – topic assignment algorithm </li></ul><ul><li>“macbook died, but the Genius guys gave me a new 1!” -> not just ‘macbook’ but also ‘apple’ </li></ul><ul><li>Search engines as an external knowledge source </li></ul><ul><li>Transform a tweet to a search query </li></ul><ul><ul><li>Part-of-speech analysis – identify noun phrases </li></ul></ul><ul><ul><li>Iterative backoff to get at least 10 results – remove the word with the fewest web occurrences </li></ul></ul><ul><li>Retrieve search results through a search engine (e.g., Y!BOSS) </li></ul><ul><li>Identify popular terms in the results using TF-IDF (5/10 pages popularity threshold) </li></ul><ul><li>IR-style evaluation over 100 tweets </li></ul>
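The iterative backoff step can be sketched as follows, with the search API abstracted as a callable so the logic is testable offline. A real implementation would call an engine such as Y!BOSS; the function and parameter names here are illustrative:

```python
def backoff_query(noun_phrases, result_count, min_results=10):
    """Iteratively drop the rarest term until the query returns enough hits.

    noun_phrases: candidate query terms extracted by POS analysis.
    result_count: callable taking a list of terms and returning the number
                  of search hits (stands in for a real search engine call).
    """
    terms = list(noun_phrases)
    while terms and result_count(terms) < min_results:
        # drop the term with the fewest hits on its own, i.e., the rarest word
        terms.remove(min(terms, key=lambda t: result_count([t])))
    return terms
```

Once the query returns at least `min_results` pages, TweetTopic mines the result snippets for popular terms (via TF-IDF) and assigns those as the tweet's topics.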
  138. 140. References <ul><li>Bernstein M.S., Suh B., Hong L., Chen J., Kairam S., & Chi E.H. Eddi: interactive topic-based browsing of social status streams. Proc. UIST ’10, 303-312. </li></ul><ul><li>Chen J., Nairn R., Nelson L., Bernstein M., & Chi E. Short and tweet: experiments on recommending content from information streams. Proc. CHI ’10, 1185-1194. </li></ul><ul><li>Garcia Esparza S., O'Mahony M. P., & Smyth B. 2010. On the real-time web as a source of recommendation knowledge. Proc. RecSys '10, 305-308. </li></ul><ul><li>Phelan O., McCarthy K., & Smyth B. 2009. Using twitter to recommend real-time topical news. Proc. RecSys '09, 385-388. </li></ul>
  139. 141. Evaluation Methods IBM Research WWW 2011, March 28th-April 1st, Hyderabad, India
  140. 142. Evaluation Goals <ul><li>An application designer who wishes to add a recommendation system to her application must decide on the most appropriate algorithm for her goals </li></ul><ul><li>Most evaluation methods rank systems based on </li></ul><ul><ul><li>Prediction power – the ability to accurately predict the user’s choices </li></ul></ul><ul><ul><li>Classification accuracy – the ability to differentiate good items from bad ones </li></ul></ul><ul><ul><li>Novelty and exploration ability – discovering new items and exploring diverse items </li></ul></ul><ul><ul><li>Other features </li></ul></ul><ul><ul><ul><li>Preserving user privacy </li></ul></ul></ul><ul><ul><ul><li>Fast response </li></ul></ul></ul><ul><ul><ul><li>Ease of interaction with the recommendation engine </li></ul></ul></ul>
  141. 143. Offline Evaluation <ul><li>Based on a pre-collected data set of users choosing or rating items </li></ul><ul><ul><li>Usually done by recording historical user data, and then hiding some of these interactions in order to compare the predicted ratings with the user's actual ratings </li></ul></ul><ul><li>No interaction with real users, thus allowing comparison of a wide range of candidate algorithms at a low cost </li></ul><ul><li>Mostly useful for evaluating the prediction power of the system and for system tuning </li></ul><ul><li>For example: the Netflix challenge </li></ul>
  142. 144. The Netflix Challenge (Bennett, KDD Cup ’07) <ul><li>Open competition for the best collaborative filtering algorithm to predict user ratings for films </li></ul><ul><ul><li>based on previous movie ratings (1-5 scale) </li></ul></ul><ul><li>Netflix provided a training data set of </li></ul><ul><ul><li>100,480,507 ratings </li></ul></ul><ul><ul><li>by 480,189 users </li></ul></ul><ul><ul><li>for 17,770 movies </li></ul></ul><ul><li>Contestants were judged according to their predictions for 3 million withheld ratings </li></ul><ul><li>A $1 million prize was awarded to the team that achieved a 10 percent improvement over the accuracy of the Netflix movie recommendation system </li></ul>
  143. 145. Evaluating Prediction Accuracy <ul><li>Predicting the user’s opinions/ratings over items </li></ul><ul><li>Root Mean Squared Error (RMSE) – the most popular metric used in evaluating the accuracy of predicted ratings: RMSE = sqrt( (1/|T|) Σ (r̂_ui − r_ui)² ), summing over all test pairs (u,i) in T </li></ul><ul><li>A popular alternative – Mean Absolute Error (MAE): MAE = (1/|T|) Σ |r̂_ui − r_ui| </li></ul><ul><li>Both measure the average error of the system’s predictions </li></ul><ul><li>RMSE disproportionately penalizes large errors, compared to MAE </li></ul>
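Both metrics are straightforward to implement over a test set of (predicted, actual) rating pairs:

```python
import math

def rmse(pairs):
    """Root Mean Squared Error over (predicted, actual) rating pairs."""
    return math.sqrt(sum((p - a) ** 2 for p, a in pairs) / len(pairs))

def mae(pairs):
    """Mean Absolute Error over (predicted, actual) rating pairs."""
    return sum(abs(p - a) for p, a in pairs) / len(pairs)
```

The penalty difference is easy to see: two predictions that are each off by 1 star and two predictions off by 2 and 0 stars give the same MAE, but the second case gets a strictly larger RMSE.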
  144. 146. Measuring Ranking Accuracy <ul><li>Evaluating how many recommended items were purchased by the user </li></ul><ul><ul><li>Recall – how many of the acquired items were recommended </li></ul></ul><ul><ul><li>Precision – how many recommended items were acquired </li></ul></ul><ul><ul><ul><li>Prec@N – how many top-N recommended items were acquired </li></ul></ul></ul><ul><ul><li>A trade-off is expected </li></ul></ul><ul><ul><ul><li>Long recommendation lists typically improve recall while reducing precision </li></ul></ul></ul><ul><li>For a ranked list of recommended items use NDCG (Normalized Discounted Cumulative Gain): NDCG = DCG / DCG* </li></ul><ul><ul><ul><li>g(u,i) – the utility of item i to user u (e.g., the rating) </li></ul></ul></ul><ul><ul><ul><li>DCG* – the DCG of the optimal ranking </li></ul></ul></ul>
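The metrics above can be sketched as follows, with the per-item utilities g(u,i) passed in ranked order (the log base-2 discount is the common convention; specific papers sometimes vary it):

```python
import math

def precision_at_n(recommended, acquired, n):
    """Fraction of the top-N recommendations the user actually acquired."""
    return sum(1 for item in recommended[:n] if item in acquired) / n

def ndcg(ranked_gains):
    """NDCG for a list of utilities g(u, i), given in recommendation order.

    DCG discounts each gain by log2(rank + 1); DCG* is the DCG of the
    ideal (descending-gain) ordering of the same items.
    """
    def dcg(gains):
        return sum(g / math.log2(r + 2) for r, g in enumerate(gains))
    ideal = dcg(sorted(ranked_gains, reverse=True))
    return dcg(ranked_gains) / ideal if ideal else 0.0
```

An already-optimal ranking scores 1.0; pushing high-utility items down the list lowers the score, which is what makes NDCG rank-sensitive where plain precision/recall are not.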
  145. 147. Evaluating top-N recommendation (Cremonesi et al., RecSys ’10) <ul><li>For each item i rated by user u : </li></ul><ul><ul><li>(i) Randomly select 1000 additional items unrated by user u. </li></ul></ul><ul><ul><ul><li>We may assume that most of them will not be of interest to user u. </li></ul></ul></ul><ul><ul><li>(ii) Predict the ratings for the test item i and for the additional 1000 items. </li></ul></ul><ul><ul><li>(iii) Form a ranked list by ordering all the 1001 items according to their predicted ratings. </li></ul></ul><ul><ul><ul><li>The best result corresponds to the case where item i precedes all the random items </li></ul></ul></ul><ul><ul><li>(iv) Form a top-N recommendation list by picking the N top ranked items from the list. </li></ul></ul><ul><ul><ul><li>If rank(i) ≤ N we have a hit (i.e., the test item i is recommended to the user). </li></ul></ul></ul><ul><ul><ul><li>Otherwise we have a miss </li></ul></ul></ul><ul><li>Recall/Precision are defined by averaging over all rated items T </li></ul><ul><ul><li>recall@N = #hits / |T| </li></ul></ul><ul><ul><li>precision@N = #hits/ (|T|*N) </li></ul></ul>
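The steps above can be sketched as follows; the `predict` callable stands in for whatever rating predictor is under evaluation, and the names are illustrative:

```python
import random

def hit_rank(predict, test_item, unrated_pool, sample_size=1000, rng=None):
    """Rank of the held-out test item among randomly sampled unrated items.

    predict: callable item -> predicted rating.
    Returns a 0-based rank: 0 means the test item beat every random item.
    """
    rng = rng or random.Random(0)
    sample = rng.sample(unrated_pool, min(sample_size, len(unrated_pool)))
    score = predict(test_item)
    return sum(1 for j in sample if predict(j) > score)

def recall_precision_at_n(ranks, n):
    """ranks: hit_rank values for all held-out test items T.
    A rank < N is a hit (the item would appear in the top-N list)."""
    hits = sum(1 for r in ranks if r < n)
    return hits / len(ranks), hits / (len(ranks) * n)
```

This avoids having to rank the entire catalog per test item, at the cost of the "most sampled items are uninteresting" assumption noted above.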
  146. 148. Off-line evaluation of a Tag Recommendation System using social bookmarks (Carmel et al., CIKM ’09) <ul><li>Given a dataset of bookmarks {(u,d,t)} </li></ul><ul><li>For each (u,d) pair: </li></ul><ul><ul><li>Hide all bookmarks related to this pair from the bookmark data (all (u,d,t’) triplets) </li></ul></ul><ul><ul><li>Recommend tags for this pair </li></ul></ul><ul><ul><li>Score the recommendation lists by Mean Average Precision (MAP), given the actual tags used by u to bookmark d </li></ul></ul>(d1,u1): t1, t2, t3, t4, t5, t6, t7, t8, t9, t10 … (d2,u2): t1, t2, t3, t4, t5, t6, t7, t8, t9, t10 … (d3,u3): t1, t2, t3, t4, t5, t6, t7, t8, t9, t10 …
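The scoring step is a standard mean-average-precision computation over the ranked tag lists; a minimal sketch:

```python
def average_precision(recommended, actual):
    """AP of a ranked tag list against the tags the user actually applied."""
    hits, score = 0, 0.0
    for rank, tag in enumerate(recommended, start=1):
        if tag in actual:
            hits += 1
            score += hits / rank          # precision at each relevant rank
    return score / len(actual) if actual else 0.0

def mean_average_precision(runs):
    """runs: list of (recommended_list, actual_tag_set) pairs,
    one per held-out (u, d) bookmark pair."""
    return sum(average_precision(rec, act) for rec, act in runs) / len(runs)
```

Each held-out (u,d) pair contributes one AP value, and the system's final score is the mean over all pairs.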
  147. 149. User Studies <ul><li>A user study is conducted by recruiting a set of users, and asking them to perform several interaction tasks with the recommendation system. </li></ul><ul><ul><li>While the subjects perform the tasks, we observe and record their behavior </li></ul></ul><ul><ul><ul><li>the portion of the task completed </li></ul></ul></ul><ul><ul><ul><li>the accuracy of the task results </li></ul></ul></ul><ul><ul><ul><li>the time taken to perform the task </li></ul></ul></ul><ul><ul><ul><li>general impression </li></ul></ul></ul><ul><ul><ul><li>etc. </li></ul></ul></ul>
  148. 150. Questions to measure subjective constructs
  149. 151. User studies – Pros and Cons <ul><li>Enable online test of the user behavior when interacting with the recommendation system </li></ul><ul><li>Are very expensive to conduct </li></ul><ul><ul><li>typically restricted to a small set of subjects and a relatively small set of tasks, and cannot test all possible scenarios </li></ul></ul><ul><ul><li>The test subjects must represent the population of users of the real system </li></ul></ul><ul><ul><ul><li>as closely as possible </li></ul></ul></ul>
  150. 152. Online Evaluation <ul><li>Evaluate the system with real users performing real tasks </li></ul><ul><ul><li>Provides the strongest evidence for the true value of the system to its users </li></ul></ul><ul><ul><li>The real effect of the recommendation system depends on a variety of user-dependent factors that change dynamically </li></ul></ul><ul><ul><ul><li>The user’s current intent </li></ul></ul></ul><ul><ul><ul><li>The user’s current context </li></ul></ul></ul><ul><li>Feedback from the users is collected by observing their reactions to the system’s recommendations </li></ul><ul><ul><li>Systems are evaluated according to the acquired vs. non-acquired ratio </li></ul></ul><ul><li>Such a live user experiment may be controlled </li></ul><ul><ul><li>Randomly assign users to different conditions </li></ul></ul><ul><ul><ul><li>e.g., test a new version of your system on a test set of users </li></ul></ul></ul><ul><ul><li>A/B testing: split users into test groups and measure effectiveness of different conditions/algorithms on the groups </li></ul></ul><ul><li>On-line evaluation studies are done on a regular basis by commercial recommendation systems </li></ul>
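A common way to implement the random assignment is a deterministic hash-based split, sketched below. This is an illustrative scheme, not one tied to any particular system discussed here:

```python
import hashlib

def ab_bucket(user_id, experiment, groups=('control', 'treatment')):
    """Deterministically assign a user to an A/B test group.

    Hashing (experiment, user) gives a stable, roughly uniform split:
    the same user always sees the same condition within an experiment,
    while different experiments split the user base independently.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return groups[int(digest, 16) % len(groups)]
```

Because the assignment is a pure function of the IDs, no assignment table is needed, and per-group metrics (e.g., the acquired vs. non-acquired ratio) can be aggregated from logs alone.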
  151. 153. References <ul><li>Bennett J. & Lanning S. The Netflix Prize. KDD Cup 2007. </li></ul><ul><li>Herlocker J. et al. Evaluating collaborative filtering recommender systems. TOIS 2004. </li></ul><ul><li>Carmel D. et al. Personalized social search based on the user's social network. Proc. CIKM 2009. </li></ul><ul><li>Chen et al. User evaluation framework of recommender systems. SRS 2010. </li></ul><ul><li>Cremonesi et al. Performance of recommender algorithms on top-N recommendation tasks. Proc. RecSys 2010. </li></ul>
  152. 154. Open Issues and Research Challenges IBM Research WWW 2011, March 28th-April 1st, Hyderabad, India
  153. 155. Open Issues and Research Challenges <ul><li>Privacy </li></ul><ul><ul><li>Especially with explanations </li></ul></ul><ul><li>Aggregation outside the firewall </li></ul><ul><ul><li>Dealing with the cold start of new users </li></ul></ul><ul><li>Beyond accuracy </li></ul><ul><ul><li>Diversity </li></ul></ul><ul><ul><li>Serendipity, “surprise effect” </li></ul></ul><ul><li>Crowdsourcing evaluation approach for SRS </li></ul><ul><li>Inspection of recommendation over time </li></ul><ul><ul><li>Learning from user feedback </li></ul></ul>
  154. 156. Questions? <ul><li>Ido Guy [email_address] </li></ul><ul><li>David Carmel [email_address] </li></ul>

Editor's Notes

  • The sites in the top sites lists are ordered by their 1 month alexa traffic rank. The 1 month rank is calculated using a combination of average daily visitors and pageviews over the past month. The site with the highest combination of visitors and pageviews is ranked #1.
  • Facebook – updated March 2011 Twitter – June 2010 YouTube – May 2010
  • To mention that table entries can be filled by implicit feedback
  • Outside the box – since similar users can recommend surprising items compared to content-based approaches which is limited to items that fit the user profile
  • What a Difference a Group Makes: Web-Based Recommendations for Interrelated Users (Mason &amp; Smith)
  • Explaining collaborative filtering
  • Tagsplanations: explaining recommendations using tags, Vig, IUI09
  • User Evaluation Framework of Recommender Systems, Chen (SRS 2010)