Comparing State-of-the-Art Collaborative Filtering Systems

Transcript

  • 1. Comparing State-of-the-Art Collaborative Filtering Systems. Laurent Candillier, Frank Meyer, Marc Boullé (France Telecom R&D Lannion). MLDM 2007. Outline: 1 Introduction, 2 Collaborative approaches, 3 Experiments, 4 Conclusions
  • 2. Recommender systems. Help users find items they should appreciate from huge catalogues [Adomavicius and Tuzhilin, 2005]. ⇒ Collaborative filtering: based on the user-to-item rating matrix. Example: a sparse matrix of ratings given by users u1 to u7 to items i1 to i5, in which one rating, marked ?, is unknown and must be predicted.
  • 3. User-based approaches. Recommend items appreciated by users whose tastes are similar to those of the given user [Resnick et al., 1994]. ⇒ This requires a similarity measure between users, e.g. Pearson similarity, the cosine of the deviations from the mean:
      $w(a,u) = \frac{\sum_{i \in S_a \cap S_u} (v_{ai} - \bar{v}_a)(v_{ui} - \bar{v}_u)}{\sqrt{\sum_{i \in S_a \cap S_u} (v_{ai} - \bar{v}_a)^2 \, \sum_{i \in S_a \cap S_u} (v_{ui} - \bar{v}_u)^2}}$
    where $v_{ui}$ is the rating of user u on item i, $S_u$ is the set of items rated by user u, and $\bar{v}_u = \frac{\sum_{i \in S_u} v_{ui}}{|S_u|}$ is the mean rating of user u.
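    The Pearson similarity on this slide can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the toy `ratings` dictionary and the function names are hypothetical, and returning 0.0 when two users share no item (or when a denominator is zero) is an assumption the slides do not specify. As in the slide's definition, each user's mean is taken over all items that user rated, while the sums run over the co-rated items only.

```python
from math import sqrt

# Hypothetical toy data (not from the slides): user -> {item: rating}.
ratings = {
    "a": {"i1": 4, "i2": 2, "i3": 5},
    "u": {"i1": 5, "i2": 1, "i3": 4, "i4": 3},
}

def mean_rating(user_ratings):
    """Mean rating of a user over all items that user rated (S_u)."""
    return sum(user_ratings.values()) / len(user_ratings)

def pearson(ratings, a, u):
    """Pearson similarity w(a, u), summed over the co-rated items S_a ∩ S_u."""
    common = set(ratings[a]) & set(ratings[u])
    if not common:
        return 0.0  # assumption: no shared item means no evidence of similarity
    mean_a = mean_rating(ratings[a])
    mean_u = mean_rating(ratings[u])
    num = sum((ratings[a][i] - mean_a) * (ratings[u][i] - mean_u) for i in common)
    den = sqrt(
        sum((ratings[a][i] - mean_a) ** 2 for i in common)
        * sum((ratings[u][i] - mean_u) ** 2 for i in common)
    )
    return num / den if den else 0.0  # assumption: zero variance gives 0.0

w = pearson(ratings, "a", "u")
```

    On the toy data the two users agree on which items they like and dislike, so `w` is positive and close to 1.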
  • 4. User-based approaches. Which rating should be predicted for the active user a on item i?
    Prediction using a weighted sum:
      $p_{ai} = \frac{\sum_{\{u \mid i \in S_u\}} w(a,u) \times v_{ui}}{\sum_{\{u \mid i \in S_u\}} |w(a,u)|}$
    Prediction using a weighted sum of deviations from the mean:
      $p_{ai} = \bar{v}_a + \frac{\sum_{\{u \mid i \in S_u\}} w(a,u) \times (v_{ui} - \bar{v}_u)}{\sum_{\{u \mid i \in S_u\}} |w(a,u)|}$
    Open question: how many neighbors should be considered?
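    As a rough sketch of the two prediction schemes on this slide, the Python below takes the similarities w(a, u) as already computed. All names and numbers are hypothetical illustration values, not results from the experiments. The optional `k` parameter keeps only the k neighbors with the largest |w(a, u)|, which is one plausible way to handle the neighborhood size and is an assumption here.

```python
# Hypothetical inputs (not from the slides).
weights = {"u1": 0.9, "u2": -0.4, "u3": 0.5}      # w(a, u) for each neighbor u
item_ratings = {"u1": 5, "u2": 2, "u3": 4}        # v_ui: neighbors' ratings of item i
means = {"u1": 4.0, "u2": 3.0, "u3": 3.5}         # mean rating of each neighbor
mean_a = 3.0                                      # mean rating of the active user

def top_neighbors(weights, candidates, k=None):
    """Users who rated the item, ranked by |w(a, u)|, truncated to k if given."""
    ranked = sorted(candidates, key=lambda u: abs(weights[u]), reverse=True)
    return ranked if k is None else ranked[:k]

def predict_weighted_sum(weights, item_ratings, k=None):
    """p_ai as the similarity-weighted sum of the neighbors' ratings."""
    users = top_neighbors(weights, item_ratings, k)
    den = sum(abs(weights[u]) for u in users)
    if den == 0:
        return None  # assumption: no usable neighbor, no prediction
    return sum(weights[u] * item_ratings[u] for u in users) / den

def predict_deviation(mean_a, weights, item_ratings, means, k=None):
    """p_ai as the active user's mean plus weighted deviations from neighbors' means."""
    users = top_neighbors(weights, item_ratings, k)
    den = sum(abs(weights[u]) for u in users)
    if den == 0:
        return mean_a  # assumption: fall back to the active user's mean
    num = sum(weights[u] * (item_ratings[u] - means[u]) for u in users)
    return mean_a + num / den

p_sum = predict_weighted_sum(weights, item_ratings)
p_dev = predict_deviation(mean_a, weights, item_ratings, means)
```

    The deviation scheme corrects for users who rate systematically high or low, which is why the two predictions differ on the same inputs.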
  • 5. Cluster-based approaches. Recommend items appreciated by users that belong to the same group as the given user [Breese et al., 1998]. ⇒ This requires a clustering method (e.g. K-means) and a distance measure (e.g. Euclidean distance). The predicted rating of a user on an item is then the mean rating given to that item by the users in the same cluster. Open question: how many clusters should be considered?
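    A minimal sketch of the cluster-based scheme, assuming a small K-means over sparse rating dictionaries. Several choices here are assumptions the slides do not state: the distance is a root-mean-square Euclidean distance restricted to items present in both the user and the centroid, the initial centroids are copied from named seed users so the run is deterministic, and the toy data and function names are hypothetical.

```python
from math import sqrt

# Hypothetical toy data (not from the slides): user -> {item: rating}.
ratings = {
    "u1": {"i1": 5, "i2": 4, "i3": 1},
    "u2": {"i1": 4, "i2": 5},
    "u3": {"i1": 1, "i3": 5, "i4": 4},
    "u4": {"i2": 1, "i3": 4, "i4": 5},
}

def distance(user_ratings, centroid):
    """RMS Euclidean distance over the items present in both vectors."""
    common = [i for i in user_ratings if i in centroid]
    if not common:
        return float("inf")
    return sqrt(sum((user_ratings[i] - centroid[i]) ** 2 for i in common) / len(common))

def centroid_of(members, ratings):
    """Per-item mean rating over the members that rated the item."""
    sums, counts = {}, {}
    for u in members:
        for i, v in ratings[u].items():
            sums[i] = sums.get(i, 0) + v
            counts[i] = counts.get(i, 0) + 1
    return {i: sums[i] / counts[i] for i in sums}

def kmeans(ratings, seeds, iterations=10):
    """Assign users to the nearest centroid, then recompute centroids; repeat."""
    centroids = [dict(ratings[s]) for s in seeds]
    assignment = {}
    for _ in range(iterations):
        assignment = {
            u: min(range(len(centroids)), key=lambda c: distance(r, centroids[c]))
            for u, r in ratings.items()
        }
        for c in range(len(centroids)):
            members = [u for u, a in assignment.items() if a == c]
            if members:
                centroids[c] = centroid_of(members, ratings)
    return assignment, centroids

def predict(user, item, assignment, centroids):
    """Mean rating of the item within the user's cluster (None if nobody rated it)."""
    return centroids[assignment[user]].get(item)

assignment, centroids = kmeans(ratings, seeds=["u1", "u3"])
p = predict("u2", "i3", assignment, centroids)
```

    Prediction is a single dictionary lookup once the clusters are built, which is consistent with the low prediction time reported for the cluster-based approach in the summary slide.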
  • 6. Item-based approaches. Recommend items similar to those appreciated by the given user [Karypis, 2001]. ⇒ This is the dual of the user-based approach:
      $p_{ai} = \bar{v}_i + \frac{\sum_{\{j \in S_a \mid j \neq i\}} sim(i,j) \times (v_{aj} - \bar{v}_j)}{\sum_{\{j \in S_a \mid j \neq i\}} |sim(i,j)|}$
    where $sim(i,j)$ is the similarity measure between items i and j, $S_a$ is the set of items rated by user a, and $\bar{v}_i$ is the mean rating on item i.
    Open question: how many neighbors should be considered?
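    The item-based prediction on this slide can be sketched as follows, with the item similarities taken as given. The similarity values, item means, and names below are hypothetical; the slides compare several similarity measures and this sketch does not pick one. Keeping only the `k` most similar items (by absolute similarity) is an assumption about how the neighborhood is chosen.

```python
# Hypothetical inputs (not from the slides).
item_means = {"i1": 3.5, "i2": 4.0, "i3": 2.0}    # mean rating of each item
user_ratings = {"i2": 5, "i3": 1}                 # v_aj: items rated by the active user a
sim = {"i1": {"i2": 0.8, "i3": -0.2}}             # sim(i, j) for the target item i1

def predict_item_based(target, user_ratings, item_means, sim, k=None):
    """p_ai: target item's mean plus weighted deviations of the user's other ratings."""
    neighbors = [j for j in user_ratings if j != target and j in sim.get(target, {})]
    neighbors.sort(key=lambda j: abs(sim[target][j]), reverse=True)
    if k is not None:
        neighbors = neighbors[:k]
    den = sum(abs(sim[target][j]) for j in neighbors)
    if den == 0:
        return item_means[target]  # assumption: fall back to the item's mean
    num = sum(sim[target][j] * (user_ratings[j] - item_means[j]) for j in neighbors)
    return item_means[target] + num / den

p = predict_item_based("i1", user_ratings, item_means, sim)
```

    Here the user rated i2 above its mean and i2 is similar to i1, while the user rated i3 below its mean and i3 is dissimilar to i1, so both neighbors push the prediction above the mean of i1.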
  • 7. Experiments.
    For user- and item-based approaches, choose: the similarity measure, the prediction scheme, the neighborhood size K.
    For cluster-based approaches, choose: the distance measure, the prediction scheme, the number of clusters.
    Evaluation protocol [Herlocker et al., 2004]: movie rating dataset MovieLens (6040 users × 3706 movies); 10-fold cross-validation (each of the 10 runs learns on 9/10 of the data and is tested on the remaining 1/10); Mean Absolute Error (MAE) on the test set $T = \{(u, i, r)\}$:
      $MAE = \frac{1}{|T|} \sum_{(u,i,r) \in T} |p_{ui} - r|$
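    A sketch of this evaluation protocol, under stated assumptions: ratings are (user, item, rating) triples, folds are formed by shuffling with a fixed seed, and the final score is the average of the per-fold MAE values (the slides do not say whether errors are averaged per fold or pooled). The function names and the global-mean baseline are hypothetical and stand in for any of the compared approaches.

```python
import random

def mae(predict, test_set):
    """Mean absolute error of predict(u, i) over (u, i, r) triples."""
    return sum(abs(predict(u, i) - r) for u, i, r in test_set) / len(test_set)

def k_fold(triples, k=10, seed=0):
    """Yield (train, test) pairs; each triple appears in exactly one test fold."""
    shuffled = list(triples)
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[f::k] for f in range(k)]
    for f in range(k):
        train = [t for g in range(k) if g != f for t in folds[g]]
        yield train, folds[f]

def cross_validated_mae(triples, fit, k=10, seed=0):
    """Average test MAE over the k folds; fit(train) must return predict(u, i)."""
    scores = [mae(fit(train), test) for train, test in k_fold(triples, k, seed)]
    return sum(scores) / len(scores)

def fit_global_mean(train):
    """Hypothetical baseline: always predict the mean rating of the training set."""
    mean = sum(r for _, _, r in train) / len(train)
    return lambda u, i: mean
```

    Any model exposing the same `fit(train)` interface can be passed to `cross_validated_mae`, which is what makes it possible to compare all approaches under identical conditions.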
  • 8. User-based approaches, similarity measures. [Figure: MAE (y-axis, 0.68 to 0.8) as a function of the neighborhood size K (x-axis, 0 to 2500) for the Pearson, Constraint, Cosine, Adjusted and Proba similarity measures.]
  • 9. User-based approaches, prediction schemes. [Figure: MAE (y-axis, 0.68 to 0.8) as a function of the neighborhood size K (x-axis, 0 to 2500) for PearsonWeighted, PearsonDeviation, ProbaWeighted and ProbaDeviation.]
  • 10. Item-based approaches, similarity measures. [Figure: MAE (y-axis, 0.64 to 0.76) as a function of the neighborhood size K (x-axis, 0 to 1400) for the Pearson, Constraint, Cosine, Adjusted and Proba similarity measures.]
  • 11. Summary of experiments.

                                          BestDefault  BestUser  BestItem  BestCluster
      model construction time (in sec.)             1       730       170          254
      prediction time (in sec.)                     1        31         3            1
      MAE                                      0.6829    0.6688    0.6382       0.6736

    BestDefault: Bayes minimizing MAE.
    BestUser: Pearson similarity, 1500 neighbors, prediction using deviation from the mean.
    BestItem: probabilistic similarity, 400 neighbors, prediction using deviation from the mean.
    BestCluster: K-means, Euclidean distance, 4 clusters, prediction using Bayes minimizing MAE.
  • 12. Conclusions.
    All approaches, and all their possible options, are tested under exactly the same conditions.
    Bayes is a good compromise: low error rate, low execution time, incremental.
    Deviation from the mean: better results, and new for item-based approaches.
    Similarity measures: Pearson for user-based, probabilistic for item-based.
  • 13. Conclusions. The item-based approach:
    achieves the best performance in the experiments;
    seems to need fewer neighbors than the user-based approach;
    is also appropriate for navigating item catalogues, even with no user information;
    may naturally use content data about items to improve its results (likewise, the user-based approach may use demographic data).
    Open question: do the results depend on the number of items compared to the number of users?
  • 14. Next.
    Need to scale well even when faced with huge datasets, e.g. the Netflix Prize: 100,480,507 ratings from 480,189 users on 17,770 movies. Possible directions:
    select the most relevant users [Yu et al., 2002];
    reduce dimensionality with PCA or SVD [Goldberg et al., 2001, Vozalis and Margaritis, 2005];
    create a set of super-users [Rashid et al., 2006];
    sampling? stochastic methods? bagging?
    Combine approaches ⇒ ensemble methods [Polikar, 2006].
  • 15–17. References.
    P. Resnick, N. Iacovou, M. Suchak, P. Bergstrom and J. Riedl (1994). Grouplens: an open architecture for collaborative filtering of netnews. In Conference on Computer Supported Cooperative Work, pages 175–186. ACM.
    J. Breese, D. Heckerman and C. Kadie (1998). Empirical analysis of predictive algorithms for collaborative filtering. In 14th Conference on Uncertainty in Artificial Intelligence, pages 43–52. Morgan Kaufmann.
    G. Karypis (2001). Evaluation of item-based top-N recommendation algorithms. In 10th International Conference on Information and Knowledge Management, pages 247–254.
    K. Goldberg, T. Roeder, D. Gupta and C. Perkins (2001). Eigentaste: a constant time collaborative filtering algorithm. Information Retrieval, 4(2):133–151.
    K. Yu, X. Xu, J. Tao, M. Ester and H. Kriegel (2002). Instance selection techniques for memory-based collaborative filtering. In SIAM Data Mining.
    J. Herlocker, J. Konstan, L. Terveen and J. Riedl (2004). Evaluating collaborative filtering recommender systems. ACM Transactions on Information Systems, 22(1):5–53.
    G. Adomavicius and A. Tuzhilin (2005). Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions. IEEE Transactions on Knowledge and Data Engineering, 17(6):734–749.
    M. Vozalis and K. Margaritis (2005). Applying SVD on item-based filtering. In 5th International Conference on Intelligent Systems Design and Applications, pages 464–469.
    A.M. Rashid, S.K. Lam, G. Karypis and J. Riedl (2006). ClustKNN: a highly scalable hybrid model- & memory-based CF algorithm. In KDD Workshop on Web Mining and Web Usage Analysis.
    R. Polikar (2006). Ensemble systems in decision making. IEEE Circuits & Systems Magazine, 6(3):21–45.