Bayesian Personalized Ranking for Non-Uniformly Sampled Items

The slide set describing our approach to the KDD Cup 2011, presented at the KDD Cup workshop in San Diego, California.

Published in: Technology, Education


  1. Bayesian Personalized Ranking for Non-Uniformly Sampled Items. Zeno Gantner, Lucas Drumond, Christoph Freudenthaler, Lars Schmidt-Thieme, University of Hildesheim. 21 August 2011.
  2. Questions (and Answers): What? Who? Which? How? Where? Why?
  3. Which problem to solve? Rating Prediction (Track 1) vs. Item Prediction (Track 2)
  4. How did we tackle the problem? Bayesian Personalized Ranking:
     \[ \mathrm{BPR}(D_S) = \operatorname*{argmax}_{\Theta} \sum_{(u,i,j) \in D_S} \ln \sigma\bigl(\hat{s}_{u,i}(\Theta) - \hat{s}_{u,j}(\Theta)\bigr) - \lambda \|\Theta\|^2 \]
       • $D_S$ contains all pairs of positive and negative items for each user,
       • $\sigma(x) = \frac{1}{1 + e^{-x}}$ is the logistic function,
       • $\Theta$ represents the model parameters,
       • $\hat{s}_{u,i}(\Theta)$ is the predicted score for user $u$ and item $i$, and
       • $\lambda \|\Theta\|^2$ is a regularization term to prevent overfitting.
     Interpretation 1: reduce ranking to pairwise classification [Balcan et al. 2008].
     Interpretation 2: optimize for the smoothed area under the ROC curve (AUC).
     Model: matrix factorization. Learning: stochastic gradient ascent [Rendle et al., UAI 2009].
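The learning step named on this slide (stochastic gradient ascent on the BPR criterion with a matrix factorization model) can be sketched as follows. This is a minimal illustration, not the authors' MyMediaLite code: the function names `bpr_sgd_step` and `train_bpr`, the uniform sampling of triples, and all hyperparameter values are assumptions made for the example. The score is assumed to be the dot product of a user factor row and an item factor row.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bpr_sgd_step(W, H, u, i, j, lr=0.05, reg=0.01):
    """One gradient ascent step on ln sigma(s_ui - s_uj) - reg * ||Theta||^2.

    W holds user factors, H holds item factors (both updated in place).
    The factor 2 from differentiating the squared norm is absorbed into reg.
    """
    w_u, h_i, h_j = W[u].copy(), H[i].copy(), H[j].copy()
    x_uij = w_u @ (h_i - h_j)          # score difference s_ui - s_uj
    g = sigmoid(-x_uij)                # d/dx ln sigma(x) = sigma(-x)
    W[u] += lr * (g * (h_i - h_j) - reg * w_u)
    H[i] += lr * (g * w_u - reg * h_i)
    H[j] += lr * (-g * w_u - reg * h_j)

def train_bpr(positives, n_users, n_items, k=4, n_steps=4000,
              lr=0.05, reg=0.01, seed=0):
    """Train on uniformly sampled (u, i, j) triples.

    positives maps each user id to a non-empty set of positive item ids.
    Assumes every user has at least one non-positive item.
    """
    rng = np.random.default_rng(seed)
    W = rng.normal(0.0, 0.1, (n_users, k))
    H = rng.normal(0.0, 0.1, (n_items, k))
    users = sorted(positives)
    pos_lists = {u: sorted(positives[u]) for u in users}
    for _ in range(n_steps):
        u = users[int(rng.integers(len(users)))]
        i = pos_lists[u][int(rng.integers(len(pos_lists[u])))]
        j = int(rng.integers(n_items))
        while j in positives[u]:       # resample until j is a negative item
            j = int(rng.integers(n_items))
        bpr_sgd_step(W, H, u, i, j, lr=lr, reg=reg)
    return W, H

if __name__ == "__main__":
    toy = {0: {0, 1}, 1: {2, 3}}       # toy data: user -> positive items
    W, H = train_bpr(toy, n_users=2, n_items=4)
    print(np.round(W @ H.T, 2))        # predicted score matrix
```

Here every negative item is drawn with the same probability, which is exactly the property the following slides identify as a problem.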
  5. How did we tackle the problem?
     \[ \mathrm{BPR}(D_S) = \operatorname*{argmax}_{\Theta} \sum_{(u,i,j) \in D_S} \ln \sigma(\hat{s}_{u,i} - \hat{s}_{u,j}) - \lambda \|\Theta\|^2 \]
     Problem: all negative items $j$ are given the same weight.
  6. How did we tackle the problem?
     Problem: all negative items $j$ are given the same weight.
     Solution: adapt the weights in the optimization criterion (and the sampling probabilities in the learning algorithm):
     \[ \mathrm{WBPR}(D_S) = \operatorname*{argmax}_{\Theta} \sum_{(u,i,j) \in D_S} w_u w_i w_j \ln \sigma(\hat{s}_{u,i} - \hat{s}_{u,j}) - \lambda \|\Theta\|^2, \]
     where
     \[ w_j = \sum_{u \in U} \delta(j \in I_u^+). \quad (1) \]
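Equation (1) makes $w_j$ the number of users for whom item $j$ is a positive item. A minimal sketch of computing these weights and using them as sampling probabilities for negative items is shown below. The helper names `item_weights` and `sample_negative` are hypothetical, and the sketch covers only $w_j$; the slides do not define $w_u$ and $w_i$, so they are left out here.

```python
import numpy as np

def item_weights(positives, n_items):
    """w_j = sum over users u of delta(j in I_u^+), as in equation (1)."""
    w = np.zeros(n_items)
    for items in positives.values():
        for j in items:
            w[j] += 1.0
    return w

def sample_negative(u, positives, w, rng):
    """Draw a negative item for user u with probability proportional to w_j.

    Items that are positive for u get probability zero. Raises if no
    candidate with non-zero weight remains.
    """
    p = w.copy()
    for j in positives[u]:
        p[j] = 0.0
    total = p.sum()
    if total == 0.0:
        raise ValueError("no negative item with non-zero weight for this user")
    return int(rng.choice(len(p), p=p / total))

if __name__ == "__main__":
    toy = {0: {0, 1}, 1: {1, 2}, 2: {1}}   # toy data: user -> positive items
    w = item_weights(toy, n_items=4)
    rng = np.random.default_rng(0)
    print(w, sample_negative(2, toy, w, rng))
```

In this toy data item 3 is positive for nobody, so its weight is zero and it is never drawn as a negative; under uniform sampling it would be drawn as often as any other item.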
  7. Why did we not win? But also: why did we perform better than others?
     Why did we perform better than others?
       • straightforward model that matches the prediction task pretty well
       • scalability (e.g. k = 480 factors per user/item)
       • integration of rating information (see paper)
       • ensembles (see paper)
     Why did we not win? . . . two possible answers . . .
  8. Why did we not win? Taxonomy [figure not preserved in the transcript]
  9. Why did we not win? Learn the right contrast
     [Figure, shown in four build steps (slides 9 to 12): three candidate contrasts, labelled "liked?", "rated?" and "?", drawn over the item groups "rating >= 80", "rating < 80" and "no rating".]
  13. Where next?
       • classification → ranking → pairwise classification
       • pairwise classification: try other losses, e.g. soft margin (hinge) loss
       • Bayesian² Personalized Ranking
       • beyond KDD Cup: consider different sampling schemes . . .
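To illustrate the "other losses" item, the sketch below puts the logistic pairwise loss implied by BPR (the negated $\ln \sigma$ term) next to a soft margin (hinge) loss on the same score difference $x = \hat{s}_{u,i} - \hat{s}_{u,j}$. The function names and the margin of 1 are assumptions for the example; the slides do not specify how a hinge variant would be set up.

```python
import numpy as np

def logistic_pair_loss(x):
    """-ln sigma(x) = ln(1 + exp(-x)), computed in a numerically stable way."""
    return np.logaddexp(0.0, -x)

def hinge_pair_loss(x, margin=1.0):
    """Soft margin loss: zero once the positive item leads by at least margin."""
    return np.maximum(0.0, margin - x)

def hinge_pair_subgradient(x, margin=1.0):
    """Subgradient of the hinge loss with respect to x."""
    return np.where(x < margin, -1.0, 0.0)

if __name__ == "__main__":
    diffs = np.array([-2.0, 0.0, 0.5, 2.0])   # example score differences
    print(logistic_pair_loss(diffs))
    print(hinge_pair_loss(diffs))
```

The practical difference is that the hinge loss stops producing updates for pairs that are already separated by the margin, while the logistic loss keeps a small gradient for every pair.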
  14. Summary
       • Use matrix factorization optimized for Bayesian Personalized Ranking (BPR) to solve the item ranking problem.
       • BPR reduces ranking (in this case: binary variables) to pairwise classification.
       • Extend BPR to use a different sampling scheme: Weighted BPR (WBPR).
       • Open question: learn a different contrast?
      Details can be found in the paper. Code: http://ismll.de/mymedialite/examples/kddcup2011.html
      Advertisement: contribute to http://recsyswiki.com!
  15. Questions
  16. Acknowledgements
      Thank you
       • The organizers, for hosting a great competition.
       • The participants, for sharing their insights.
      Funding
       • German Research Council (Deutsche Forschungsgemeinschaft, DFG) project Multirelational Factorization Models.
       • Development of the MyMediaLite software was co-funded by the European Commission FP7 project MyMedia under the grant agreement no. 215006.
      Picture credits
       • by Michael Sauers, under Creative Commons by-nc-sa 2.0: http://www.flickr.com/photos/travelinlibrarian/223839049/
       • by Rob Starling, under Creative Commons by-sa 2.0: http://en.wikipedia.org/wiki/File:Air_New_Zealand_B747-400_ZK-SUI_at_LHR.jpg
  17. Numbers?
        contrast   k     error in %
        “liked”    320   5.52
        “liked”    480   5.08
        “rated”    320   5.15
        “rated”    480   4.87
      Estimated error on validation split (not leaderboard).
  18. Advertisement: MyMediaLite, a Recommender System Algorithm Library
      functionality
       • rating prediction
       • item recommendation from implicit feedback
       • group recommendation
      target groups
       • researchers, educators and students
       • application developers
      development
       • written in C#, runs on Mono
       • GNU General Public License (GPL)
       • regular releases (ca. 1 per month)
      Keywords: simple, free, scalable, well-documented, well-tested, choice
      http://ismll.de/mymedialite
  19. Advertisement: RecSys Wiki is looking for contributions [pictures labelled "Alan" and "Zeno"] http://recsyswiki.com
