
SIGIR 2016 COFIBA - Collaborative Filtering Bandits, the 39th ACM SIGIR.


Classical collaborative filtering and content-based filtering methods try to learn a static recommendation model given training data. These approaches are far from ideal in highly dynamic recommendation domains such as news recommendation and computational advertising, where the set of items and users is very fluid. In this work, we investigate an adaptive clustering technique for content recommendation based on exploration-exploitation strategies in contextual multi-armed bandit settings. Our algorithm takes into account the collaborative effects that arise due to the interaction of the users with the items, by dynamically grouping users based on the items under consideration and, at the same time, grouping items based on the similarity of the clusterings induced over the users. The resulting algorithm thus takes advantage of preference patterns in the data in a way akin to collaborative filtering methods. We provide an empirical analysis on medium-size real-world datasets, showing scalability and increased prediction performance (as measured by click-through rate) over state-of-the-art methods for clustering bandits. We also provide a regret analysis within a standard linear stochastic noise setting.



  1. Collaborative Filtering Bandits. Shuai Li, University of Insubria. The 39th SIGIR, Jul 20, 2016. Joint work with Alexandros Karatzoglou and Claudio Gentile
  2. Overview • Contextual bandits have been used in recommender systems • Traditional bandits do not take collaborative filtering effects into account • COFIBA ("Coffee Bar"): the main idea is to generalize Clustering of Bandits with co-clustering to capture collaborative effects
  3. Motivation • Classical CF methods struggle in dynamic environments (video, music, ads) • Clustering bandits have been successful for large-scale recommendation [ICML'14] • Latent clustering is more efficient, cheaper, and more scalable [ICML'16] • Netflix: 2/3 of the movies watched are recommended • Google News: recommendations generate 38% more click-throughs • According to Google's 2011 annual report, "Advertising recommendation revenues made up 97% of our revenues in 2009 and 96% of our revenues in 2010 and 2011" • Google gains billions in market value as YouTube drives ad growth (16.3% at 699.62 billion USD, a 65 billion USD increase driven by YouTube ad clicks) – Jul 17, 2015
  4. Continuous Cold-Start Problem in dynamic recommendation settings • Which item should we pick for the user? (X-Box, Google Glass, PSP, iPad) • Items and users are ever-changing, and each interaction yields only one data record [figure: candidate items grouped into same vs. different categories]
  5. Challenge and Design Principle • How to adapt to a highly dynamic environment? • Does it have a technically sound theoretical guarantee? • Does it scale to big-data scenarios? • Is it simple to deploy in industrial systems? • Can it be applied to many other application domains?
  6. Multi-Armed Bandit • Multi-armed bandits and regret • A statistical problem for slot-machine players: one slot machine pays more; how can we detect it by playing? • The player faces a tradeoff between "exploitation" of the machine with the highest expected payoff and "exploration" to get more information about the expected payoffs of the other machines • In recommender systems → different items have different utilities; how can we find the one with the highest?
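The exploration-exploitation tradeoff described on this slide is what index policies such as UCB1 resolve. A minimal sketch (plain UCB1 on simulated slot machines, not the contextual algorithm used in COFIBA; the payoff probabilities 0.3 and 0.7 are made up for illustration):

```python
import math
import random

def ucb1(payoff_fns, horizon, seed=0):
    """UCB1: play each arm once, then repeatedly pick the arm maximizing
    empirical mean + sqrt(2 ln t / n_i), balancing exploitation and exploration."""
    rng = random.Random(seed)
    k = len(payoff_fns)
    counts = [0] * k
    means = [0.0] * k
    total = 0.0
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1  # initialization: try every arm once
        else:
            arm = max(range(k), key=lambda i:
                      means[i] + math.sqrt(2 * math.log(t) / counts[i]))
        reward = payoff_fns[arm](rng)
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # incremental mean
        total += reward
    return counts, total

# Two Bernoulli slot machines; the second pays more on average.
arms = [lambda r: 1.0 if r.random() < 0.3 else 0.0,
        lambda r: 1.0 if r.random() < 0.7 else 0.0]
counts, total = ucb1(arms, horizon=2000)
```

After 2000 rounds the better machine accumulates the vast majority of the plays, which is exactly the "detect it by playing" behavior the slide asks for.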
  7. Traditional Bandits vs. Clustering Bandits • Graph with n users • Unknown user profiles ui, i = 1 . . . n [figure: 8-node user graph] • Available connections need not reflect similar interests among users ⇒ need to infer the connections in the highly dynamic environment
  8. Forming the Implicit Graph for Users • Set of n users • Unknown user profiles ui, i = 1 . . . n [figure: user graph with inferred edges] • Draw edges based on the observed behavior of users (clustering algorithms) • Content universe changing rapidly over time • Many users: scaling properties are a major concern
  9. Forming the Graph and Clustering Bandit Model (group recommendation to subcommunities): • m << n clusters • Each cluster has a single (unknown) profile uj • Need to learn both the user profile vectors and the cluster profile vectors • User proxies wi are used to compute similarity for the graph; cluster proxies zj are used for recommendations • zj is an aggregation of the proxies wi of the users in cluster j [figure: user graph with node proxies wi and cluster proxies z1, z2]
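The "zj is an aggregation of proxies wi" step can be sketched as a simple average of the node proxies within each estimated cluster. This averaging rule is an illustrative assumption; the actual algorithm aggregates the underlying regression statistics of the cluster members rather than the proxy vectors themselves:

```python
def cluster_profiles(w, comp):
    """Average the node proxies w[i] within each estimated cluster
    (comp[i] = cluster id of user i) to get a cluster proxy z[j]."""
    groups = {}
    for i, c in enumerate(comp):
        groups.setdefault(c, []).append(w[i])
    z = {}
    for c, vs in groups.items():
        d = len(vs[0])
        z[c] = tuple(sum(v[k] for v in vs) / len(vs) for k in range(d))
    return z

# Three users in two estimated clusters.
w = [(0.0, 0.0), (0.2, 0.0), (1.0, 1.0)]
comp = [0, 0, 1]
z = cluster_profiles(w, comp)
```

The cluster proxy z[j] is then what gets used to score recommendations, while the individual w[i] keep driving the graph similarity.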
  10. Pruning the edges of the graph • Start off from the full n-node graph (or a sparsified version thereof) and a single estimated cluster • Use node proxies wi to delete edges: if ||wi − wj|| > θ(i,j) ⇒ delete (i, j) • Estimated clusters are the current connected components • When serving user i in estimated cluster j, update both the node proxy wi and the cluster proxy zj • Recompute clusters upon changing the graph [figure: user graph after edges are deleted]
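A minimal sketch of this pruning step, using a fixed threshold theta in place of the count-dependent confidence threshold θ(i,j) that the algorithm actually uses:

```python
from collections import deque

def prune_and_cluster(n, edges, w, theta):
    """Keep edge (i, j) only if the node proxies are close, i.e.
    ||w_i - w_j|| <= theta; clusters are the resulting connected components."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    kept = [(i, j) for (i, j) in edges if dist(w[i], w[j]) <= theta]
    adj = [[] for _ in range(n)]
    for i, j in kept:
        adj[i].append(j)
        adj[j].append(i)
    # Connected components via BFS: each component is one estimated cluster.
    comp = [-1] * n
    c = 0
    for s in range(n):
        if comp[s] == -1:
            comp[s] = c
            q = deque([s])
            while q:
                u = q.popleft()
                for v in adj[u]:
                    if comp[v] == -1:
                        comp[v] = c
                        q.append(v)
            c += 1
    return comp, kept

# Four users, two underlying preference groups, starting from a denser graph.
w = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
edges = [(0, 1), (1, 2), (2, 3), (0, 3)]
comp, kept = prune_and_cluster(4, edges, w, theta=1.0)
```

The between-group edges (0, 3) and (1, 2) are deleted, and the components {0, 1} and {2, 3} emerge as the estimated clusters.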
  11. Co-Clustering for Collaborative Filtering (slides 11–13 are figure-only, illustrating the co-clustering of users and items)
  14. COFIBA with Co-Clustering of the User-Item Graph Challenges • Statistical: tight theoretical convergence guarantee • Computational: running-time and memory cost • Performance: online prediction over user-item graphs Tricks 1. Start off from a random (Erdős–Rényi) graph 2. Cluster by connected components: current clusters are unions of underlying clusters
  15. Tricks 1. Start off from a random (Erdős–Rényi) graph Known fact: • Random (Erdős–Rényi) graphs lead to one initial cluster (a single connected component) with a pre-specified probability • Initial n-clique graph G
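The known fact can be checked empirically: sampling G(n, p) with p a constant factor above the log(n)/n connectivity threshold almost always yields a single connected component, so the sparse random graph is a cheap stand-in for the full n-clique. A pure-Python sketch:

```python
import math
import random

def erdos_renyi(n, p, rng):
    """Sample G(n, p): each of the n*(n-1)/2 possible edges appears with prob p."""
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if rng.random() < p]

def is_connected(n, edges):
    """DFS from node 0; the graph is connected iff every node is reached."""
    adj = [[] for _ in range(n)]
    for i, j in edges:
        adj[i].append(j)
        adj[j].append(i)
    seen = {0}
    stack = [0]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == n

rng = random.Random(42)
n = 200
p = 3 * math.log(n) / n  # comfortably above the ~log(n)/n connectivity threshold
g = erdos_renyi(n, p, rng)
```

With n = 200 this keeps only about 8% of the clique's edges while the probability of ending up disconnected is vanishingly small.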
  16. Tricks 2. During the learning process, clusters are unions of underlying preference clusters [figure: user graph with within- and between-cluster edges] • Within-cluster edges (w.r.t. the underlying clustering) are never deleted • Between-cluster edges (w.r.t. the underlying clustering) will eventually be deleted, assuming a gap between cluster profile vectors and enough observed payoff values
  17. Algorithmic Idea • Group users based on items, and group items based on the clustering induced over the users • Item pool: I = {x1, . . . , x|I|} • Maintain multiple clusterings over the set of users U and a single clustering over the set of items I • The neighborhood sets Nit,t(xt,k) w.r.t. the items in Cit determine the user-side clusters pointed to by those items • Update the clusterings at the user side and the unique clustering at the item side [figure: evolution of the item graph and the per-item-cluster user graphs at initialization, time t, and time t+1]
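A drastically simplified, non-contextual sketch of the selection step: each candidate item looks up the serving user's connected component in the user graph attached to that item's cluster, aggregates payoff statistics over that component, and is scored optimistically. All names and the scoring rule here are illustrative, not the paper's exact confidence bounds:

```python
import math
from collections import defaultdict

def component(u, adj):
    """Users reachable from u in the user graph kept for this item cluster."""
    seen, stack = {u}, [u]
    while stack:
        x = stack.pop()
        for y in adj[x]:
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return seen

def select_item(user, items, item_cluster, user_graphs, counts, sums, t, cb=1.0):
    """Score each candidate item with statistics aggregated over the serving
    user's estimated cluster in the user graph attached to that item's cluster."""
    best, best_score = None, -float("inf")
    for it in items:
        g = user_graphs[item_cluster[it]]
        cluster = component(user, g)
        n = sum(counts[(u, it)] for u in cluster)
        s = sum(sums[(u, it)] for u in cluster)
        mean = s / n if n else 0.0
        width = cb * math.sqrt(math.log(t + 1) / (n + 1))  # optimism bonus
        score = mean + width
        if score > best_score:
            best, best_score = it, score
    return best

# Four users split into two components; both items share one item cluster.
counts, sums = defaultdict(int), defaultdict(float)
adj = {0: [1], 1: [0], 2: [3], 3: [2]}
user_graphs = [adj]
item_cluster = {0: 0, 1: 0}
counts[(0, 1)], sums[(0, 1)] = 5, 4.0  # user 0's cluster liked item 1 a lot
counts[(1, 1)], sums[(1, 1)] = 5, 5.0
counts[(0, 0)], sums[(0, 0)] = 5, 1.0
pick = select_item(0, [0, 1], item_cluster, user_graphs, counts, sums, t=20)
```

Because the scores are pooled over the whole component {0, 1} rather than user 0 alone, the collaborative signal from user 1 pushes item 1 to the top.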
  18. Advancements • Exploit the collaborative effects that arise due to the ever-changing interaction of both customers and products (e.g., an IR system such as Google) • Design a truly online collaborative filtering solution augmented by an exploration-exploitation strategy • Dynamically group users based on the items under consideration and, at the same time, group items based on the similarity of the clusterings induced over the users • A principled recipe to alleviate the cold-start problem in terms of both theory and practice
  19. Theoretical Guarantee • Let it be generated uniformly at random from U; the j-th induced partition P(ηj) over U is made up of mj clusters of cardinality vj,1, vj,2, . . . , vj,mj • The sequence of items in Cit is generated i.i.d. according to a given but unknown distribution over I, with at ∈ [−1, 1] and E[at] = uit⊤x̄t • Let parameters α and α2 be suitable functions of log(1/δ). If ct ≤ c for all t then, as T grows large, with probability at least 1 − δ the cumulative regret satisfies ∑_{t=1}^T rt = Õ( (E_j[S] + 1 + (2c − 1)·√(VAR_j(S))) · d·√(T/n) ), where S = S(j) = ∑_{k=1}^{mj} √vj,k, and E_j[·] and VAR_j(·) denote, respectively, the expectation and the variance w.r.t. the distribution of ηj over I
  20. Data Sets • Yahoo: ICML 2012 Exploration & Exploitation Challenge, news-article recommendation algorithms on the "Today Module", 3M records • Telefonica: clicks on ads displayed to users on one of the websites that Telefonica operates, 15M records • Avazu: data provided for the challenge of predicting the click-through rate of impressions on mobile devices, 40M records
  21. Experimental Evidence • In general, the longer the lifecycle of an item and the fewer the items, the higher the chance that users with similar preferences will consume it, and hence the bigger the collaborative effects contained in the data • It is therefore reasonable to expect that our algorithm will be more effective on datasets where the collaborative effects are indeed strong
  22. Experimental Result: Benchmark News Data [figure: CTR vs. rounds on the Yahoo dataset for LINUCB-ONE, DYNUCB, LINUCB-IND, CLUB, LINUCB-V, and COFIBA]
  23. Experimental Result: Benchmark News Data • The users here span a wide range of demographic characteristics; this dataset is derived from the consumption of news items that are often interesting to large portions of these users and, as such, do not create strong polarization into subcommunities • This implies that, more often than not, there are quite a few specific hot news items that all users might express interest in, and it is natural to expect that these pieces of news are intended to reach a wide audience of consumers • Even in this non-trivial case, COFIBA still achieves significantly increased prediction accuracy compared to its competitors, suggesting that simultaneous clustering at both the user and the item (the news) side might be an even more effective strategy to earn clicks in news recommendation systems
  24. Experimental Result: Benchmark Advertising Data [figure: CTR vs. rounds on the Avazu dataset for LINUCB-ONE, DYNUCB, LINUCB-IND, CLUB, LINUCB-V, and COFIBA]
  25. Experimental Result: Benchmark Advertising Data • The Avazu data comes from its professional digital-advertising platform, where customers click ad impressions via iOS/Android mobile apps or through websites, serving either the publisher or the advertiser, which leads to a daily high volume of internet traffic • On this dataset, COFIBA works extremely well during the cold start, and comparatively best in all later stages
  26. Experimental Result: Production Advertising Data [figure: CTR vs. rounds on the Telefonica dataset for LINUCB-ONE, DYNUCB, LINUCB-IND, CLUB, LINUCB-V, and COFIBA]
  27. Experimental Result: Production Advertising Data • Most of the users in the Telefonica data are drawn from a diverse sample of people in Spain, and it is easy to imagine that this dataset spans a large number of communities across its population • Thus we can assume that collaborative effects will be much more evident, and indeed COFIBA is able to leverage these effects efficiently
  28. Experimental Result: Typical Distribution of Users [figure: five bar plots, one per item cluster, each showing the fraction of users in the corresponding user clusters]
  29. Experimental Result: Typical Distribution of Users • A typical distribution of cluster sizes over users for the Yahoo! dataset: each bar plot corresponds to a cluster at the item side, and each bar represents the fraction of users contained in the corresponding user cluster • The emerging pattern is always the same: we have few clusters over the items, with very unbalanced sizes, and, corresponding to each item cluster, few clusters over the users, again with very unbalanced sizes • This recurring pattern confirms our theoretical findings and reflects a property of the data that the COFIBA algorithm can provably take advantage of
  30. Experimental Result: Summary • Despite the differences among the datasets, the experimental evidence we collected on them is quite consistent, in that in all cases COFIBA significantly outperforms all the other competing methods we tested • This is especially noticeable during the cold-start period, but the same relative behavior essentially shows up during the whole time window of our experiments • Moreover, COFIBA is far more effective in exploiting the collaborative effects embedded in the data, while still being amenable to running on large datasets
  31. Conclusion and Future Work • Introduced collaborative filtering bandits that make the best use of the data seen so far • Provided a sharp theoretical guarantee • Significantly outperformed the state of the art • Some directions: – Extend the analysis to asynchronous networks [ICML'16] – Develop a more generic context-aware clustering method – Extend our techniques to the quantification domain [SIGKDD'16]
  32. Questions? • Papers etc. → • Thanks :-)