
- 1. Comparing State-of-the-Art Collaborative Filtering Systems
  Laurent Candillier, Frank Meyer, Marc Boullé. France Telecom R&D Lannion. MLDM 2007.
  Outline: 1 Introduction, 2 Collaborative approaches, 3 Experiments, 4 Conclusions
- 2. Recommender systems
  Help users find items they should appreciate from huge catalogues [Adomavicius and Tuzhilin, 2005].
  ⇒ Collaborative filtering: based on the user-to-item rating matrix.
  Example matrix over items i1 to i5 and users u1 to u7 (ratings listed per user in column order, empty cells omitted; "?" marks the rating to predict):
  u1: 4 4 1; u2: 4 3; u3: 5 2 1; u4: 4 5; u5: 5 4; u6: 5 3; u7: 4 ? 1
- 3. User-based approaches
  Recommend items appreciated by users whose tastes are similar to those of the given user [Resnick et al., 1994].
  ⇒ Needs a similarity measure between users, e.g. Pearson similarity (cosine of the deviations from the mean):
  $$w(a,u) = \frac{\sum_{i \in S_a \cap S_u} (v_{ai} - \bar{v}_a)(v_{ui} - \bar{v}_u)}{\sqrt{\sum_{i \in S_a \cap S_u} (v_{ai} - \bar{v}_a)^2 \; \sum_{i \in S_a \cap S_u} (v_{ui} - \bar{v}_u)^2}}$$
  where $v_{ui}$ is the rating of user $u$ on item $i$, $S_u$ is the set of items rated by user $u$, and $\bar{v}_u = \frac{1}{|S_u|} \sum_{i \in S_u} v_{ui}$ is the mean rating of user $u$.
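The Pearson similarity on this slide can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' implementation: the function names and the dict-based representation (item id mapped to rating) are assumptions, and returning 0.0 when users share no items or have zero variance is a convention chosen here, not something the slides specify.

```python
from math import sqrt

def mean_rating(ratings):
    """Mean rating of one user over all items they rated (S_u)."""
    return sum(ratings.values()) / len(ratings)

def pearson_similarity(ratings_a, ratings_u):
    """w(a, u): sums run over co-rated items, means over each user's full set."""
    common = ratings_a.keys() & ratings_u.keys()
    if not common:
        return 0.0  # assumption: no overlap means no evidence of similarity
    mean_a = mean_rating(ratings_a)
    mean_u = mean_rating(ratings_u)
    num = sum((ratings_a[i] - mean_a) * (ratings_u[i] - mean_u) for i in common)
    den_a = sum((ratings_a[i] - mean_a) ** 2 for i in common)
    den_u = sum((ratings_u[i] - mean_u) ** 2 for i in common)
    den = sqrt(den_a * den_u)
    return num / den if den > 0 else 0.0
```

Two users who deviate from their own means in the same direction on every co-rated item get a similarity of 1, and users who deviate in opposite directions get -1.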
- 4. User-based approaches
  Which rating should be predicted for the active user $a$ on item $i$?
  - Prediction using a weighted sum:
    $$p_{ai} = \frac{\sum_{\{u \mid i \in S_u\}} w(a,u) \times v_{ui}}{\sum_{\{u \mid i \in S_u\}} |w(a,u)|}$$
  - Prediction using a weighted sum of deviations from the mean:
    $$p_{ai} = \bar{v}_a + \frac{\sum_{\{u \mid i \in S_u\}} w(a,u) \times (v_{ui} - \bar{v}_u)}{\sum_{\{u \mid i \in S_u\}} |w(a,u)|}$$
  How many neighbors should be considered?
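The two prediction schemes can be compared side by side with a small Python sketch. It assumes the similarity weights w(a, u) have already been computed and are passed in as a dict; the function names, the dict layout, and the fallback values used when no neighbor rated the item are assumptions made for illustration. Restricting `weights` to the K most similar users would give the neighborhood variant.

```python
def predict_weighted_sum(item, weights, ratings):
    """p_ai as the similarity-weighted sum of the neighbors' ratings."""
    num = den = 0.0
    for user, w in weights.items():
        if item in ratings[user]:
            num += w * ratings[user][item]
            den += abs(w)
    return num / den if den > 0 else None  # assumption: no prediction possible

def predict_deviation(item, mean_a, weights, ratings):
    """p_ai as the active user's mean plus weighted neighbor deviations."""
    num = den = 0.0
    for user, w in weights.items():
        user_ratings = ratings[user]
        if item in user_ratings:
            mean_u = sum(user_ratings.values()) / len(user_ratings)
            num += w * (user_ratings[item] - mean_u)
            den += abs(w)
    return mean_a + num / den if den > 0 else mean_a  # assumption: fall back to mean

# Hypothetical toy data: two neighbors, one positively and one negatively correlated.
ratings = {"u1": {"i1": 4, "i2": 2}, "u2": {"i1": 2, "i2": 4}}
weights = {"u1": 1.0, "u2": -1.0}
print(predict_weighted_sum("i1", weights, ratings))
print(predict_deviation("i1", 3.5, weights, ratings))
```

In this toy case the plain weighted sum is pulled below the rating scale by the negative weight, while the deviation-based scheme uses the negatively correlated neighbor's below-average rating as evidence for an above-average prediction.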
- 5. Cluster-based approaches
  Recommend items appreciated by users that belong to the same group as the given user [Breese et al., 1998].
  ⇒ Needs:
  - a clustering method, e.g. K-means
  - a distance measure, e.g. Euclidean distance
  The rating of a user on an item is then the mean rating given by the users that belong to the same cluster.
  How many clusters should be considered?
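The prediction step of the cluster-based approach is simple enough to sketch directly. The sketch below assumes the cluster assignments were already produced by some clustering method such as K-means; it does not implement the clustering itself. The function name, the `assignments` dict, and returning `None` when nobody in the cluster rated the item are illustrative assumptions.

```python
def predict_from_cluster(user, item, assignments, ratings):
    """Mean rating on `item` among users in the same cluster as `user`.

    Assumes the target user's own rating on `item` is absent from `ratings`.
    """
    cluster = assignments[user]
    votes = [
        user_ratings[item]
        for other, user_ratings in ratings.items()
        if assignments[other] == cluster and item in user_ratings
    ]
    return sum(votes) / len(votes) if votes else None

# Hypothetical toy data: users a, b, c share cluster 0; d is alone in cluster 1.
assignments = {"a": 0, "b": 0, "c": 0, "d": 1}
ratings = {
    "a": {"i1": 5},
    "b": {"i1": 4, "i2": 4},
    "c": {"i2": 2},
    "d": {"i2": 1},
}
print(predict_from_cluster("a", "i2", assignments, ratings))
```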
- 6. Item-based approaches
  Recommend items similar to those appreciated by the given user [Karypis, 2001].
  ⇒ Dual of the user-based approach:
  $$p_{ai} = \bar{v}_i + \frac{\sum_{\{j \in S_a \mid j \neq i\}} sim(i,j) \times (v_{aj} - \bar{v}_j)}{\sum_{\{j \in S_a \mid j \neq i\}} |sim(i,j)|}$$
  where $sim(i,j)$ is a similarity measure between items $i$ and $j$, $S_a$ is the set of items rated by user $a$, and $\bar{v}_i$ is the mean rating on item $i$.
  How many neighbors should be considered?
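A minimal Python sketch of the item-based prediction with deviations from the item means follows. It assumes item similarities and item means are precomputed; the function name, the storage of similarities as a dict keyed by item pairs (looked up in either order), and the fallback to the item mean when no similar item was rated are assumptions for illustration.

```python
def predict_item_based(user_ratings, item, item_means, sim):
    """p_ai: item mean plus similarity-weighted deviations of the user's other ratings."""
    num = den = 0.0
    for j, v_aj in user_ratings.items():
        if j == item:
            continue  # the sum runs over j in S_a with j != i
        s = sim.get((item, j), sim.get((j, item), 0.0))
        num += s * (v_aj - item_means[j])
        den += abs(s)
    base = item_means[item]
    return base + num / den if den > 0 else base

# Hypothetical toy data.
item_means = {"i1": 3.0, "i2": 4.0, "i3": 2.0}
sim = {("i1", "i2"): 1.0, ("i3", "i1"): 1.0}
print(predict_item_based({"i2": 5, "i3": 2}, "i1", item_means, sim))
```

Here the user rated i2 one point above its mean and i3 exactly at its mean, so the prediction for i1 lands half a point above the mean of i1.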
- 7. Experiments
  For user- and item-based approaches, choose:
  - similarity measure
  - prediction scheme
  - neighborhood size K
  For cluster-based approaches, choose:
  - distance measure
  - prediction scheme
  - number of clusters
  Evaluation protocol [Herlocker et al., 2004]:
  - movie rating dataset: MovieLens (6040 users × 3706 movies)
  - 10-fold cross validation (10 × 9/10th for learning)
  - Mean Absolute Error (MAE) on the test set $T = \{(u, i, r)\}$:
    $$MAE = \frac{1}{|T|} \sum_{(u,i,r) \in T} |p_{ui} - r|$$
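The evaluation protocol can be sketched as follows. This is a hedged illustration, not the authors' experimental code: the function names, the fixed random seed, the round-robin fold assignment, and the `predict(user, item)` callback are all assumptions.

```python
import random

def mean_absolute_error(test_set, predict):
    """MAE over a test set of (user, item, rating) triples."""
    return sum(abs(predict(u, i) - r) for u, i, r in test_set) / len(test_set)

def k_fold_splits(triples, k=10, seed=0):
    """Yield (train, test) pairs; each fold is the test set exactly once."""
    shuffled = list(triples)
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[f::k] for f in range(k)]
    for f in range(k):
        train = [t for g in range(k) if g != f for t in folds[g]]
        yield train, folds[f]

# Hypothetical usage with a constant predictor standing in for a real model.
data = [("u%d" % n, "i1", 1 + n % 5) for n in range(20)]
scores = [mean_absolute_error(test, lambda u, i: 3) for _, test in k_fold_splits(data)]
print(sum(scores) / len(scores))
```

Averaging the per-fold MAE, as in the last lines, gives a single figure per configuration that can be compared across approaches.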
- 8. User-based approaches, similarity measures
  [Figure: MAE (y-axis, 0.68 to 0.80) as a function of the neighborhood size K (x-axis, 0 to 2500) for five similarity measures: Pearson, Constraint, Cosine, Adjusted, Proba.]
- 9. User-based approaches, prediction schemes
  [Figure: MAE (y-axis, 0.68 to 0.80) as a function of the neighborhood size K (x-axis, 0 to 2500) for four configurations: PearsonWeighted, PearsonDeviation, ProbaWeighted, ProbaDeviation.]
- 10. Item-based approaches, similarity measures
  [Figure: MAE (y-axis, 0.64 to 0.76) as a function of the neighborhood size K (x-axis, 0 to 1400) for five similarity measures: Pearson, Constraint, Cosine, Adjusted, Proba.]
- 11. Summary of experiments

  |                                   | BestDefault | BestUser | BestItem | BestCluster |
  |-----------------------------------|-------------|----------|----------|-------------|
  | model construction time (in sec.) | 1           | 730      | 170      | 254         |
  | prediction time (in sec.)         | 1           | 31       | 3        | 1           |
  | MAE                               | 0.6829      | 0.6688   | 0.6382   | 0.6736      |

  - BestDefault: Bayes minimizing MAE
  - BestUser: Pearson similarity, 1500 neighbors, prediction using deviation from the mean
  - BestItem: probabilistic similarity, 400 neighbors, prediction using deviation from the mean
  - BestCluster: K-means, Euclidean distance, 4 clusters, prediction using Bayes minimizing MAE
- 12. Conclusions
  - All approaches, and all their possible options, are tested under exactly the same conditions
  - Bayes is a good compromise: low error rate, low execution time, incremental
  - Deviation from the mean: better results, new for item-based approaches
  - Similarity measures: Pearson for user-based, probabilistic for item-based
- 13. Conclusions
  The item-based approach:
  - gets the best performance in the experiments
  - seems to need fewer neighbors than the user-based approach
  - is also appropriate for navigating item catalogues, even with no user information
  - may naturally use content data about items to improve its results (likewise, the user-based approach may use demographic data)
  - open question: do the results depend on the number of items compared to the number of users?
- 14. Next
  Need to scale well even when faced with huge datasets, e.g. the Netflix Prize: 100,480,507 ratings from 480,189 users on 17,770 movies
  - select the most relevant users [Yu et al., 2002]
  - reduce dimensionality with PCA or SVD [Goldberg et al., 2001, Vozalis and Margaritis, 2005]
  - create a set of super-users [Rashid et al., 2006]
  - sampling? stochastic? bagging?
  Combine approaches ⇒ ensemble methods [Polikar, 2006]
- 15. References
  - P. Resnick, N. Iacovou, M. Suchak, P. Bergstrom and J. Riedl (1994). GroupLens: an open architecture for collaborative filtering of netnews. In Conference on Computer Supported Cooperative Work, pages 175–186. ACM.
  - J. Breese, D. Heckerman and C. Kadie (1998). Empirical analysis of predictive algorithms for collaborative filtering. In 14th Conference on Uncertainty in Artificial Intelligence, pages 43–52. Morgan Kaufmann.
  - G. Karypis (2001). Evaluation of item-based top-N recommendation algorithms. In 10th International Conference on Information and Knowledge Management, pages 247–254.
- 16. References (continued)
  - K. Goldberg, T. Roeder, D. Gupta and C. Perkins (2001). Eigentaste: a constant time collaborative filtering algorithm. Information Retrieval, 4(2):133–151.
  - K. Yu, X. Xu, J. Tao, M. Ester and H. Kriegel (2002). Instance selection techniques for memory-based collaborative filtering. In SIAM Data Mining.
  - J. Herlocker, J. Konstan, L. Terveen and J. Riedl (2004). Evaluating collaborative filtering recommender systems. ACM Transactions on Information Systems, 22(1):5–53.
  - G. Adomavicius and A. Tuzhilin (2005). Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions. IEEE Transactions on Knowledge and Data Engineering, 17(6):734–749.
- 17. References (continued)
  - M. Vozalis and K. Margaritis (2005). Applying SVD on item-based filtering. In 5th International Conference on Intelligent Systems Design and Applications, pages 464–469.
  - A.M. Rashid, S.K. Lam, G. Karypis and J. Riedl (2006). ClustKNN: a highly scalable hybrid model- & memory-based CF algorithm. In KDD Workshop on Web Mining and Web Usage Analysis.
  - R. Polikar (2006). Ensemble systems in decision making. IEEE Circuits & Systems Magazine, 6(3):21–45.
