- 1. Survey of Recommendation Systems
- 2. Outline • Introduction • Collaborative Filtering Algorithm • Challenges • Experiments (demo) • Summary • Future work
- 4. Introduction • What is a recommendation system? – Recommends related items – Personalizes the experience • How to build a recommendation system? – Content-Based – Collaborative Filtering Algorithm • Examples – Amazon – Youa
- 5. Examples Browsing a book Recommendations Rating?
- 7. CF Algorithm • Memory-Based – User-Based – Item-Based • Model-Based – Bayes – Clustering
- 8. User-Based CF Algorithm
- 9. User-Based CF Algorithm User by Item Matrix: Table 1: An example of user-item matrix Table 2: A simple example of ratings matrix
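- 10. User-Based CF Algorithm Voting: $v_{i,j}$ is the vote of user $i$ on item $j$. Mean vote: $\bar{v}_i = \frac{1}{|I_i|} \sum_{j \in I_i} v_{i,j}$, where $I_i$ is the set of items on which user $i$ voted. Predicted vote: $p_{a,j} = \bar{v}_a + \kappa \sum_{i=1}^{n} w(a,i)\,(v_{i,j} - \bar{v}_i)$, where $w(a,i)$ are the weights of the $n$ similar users and $\kappa$ is a normalizer.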
- 11. Similarity Computation • Vector Cosine-Based Similarity • Correlation-Based Similarity (Pearson) • Other Similarities
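- 12. Vector Cosine-Based Similarity Vector cosine similarity: $w_{A,B} = \cos(\vec{A}, \vec{B}) = \frac{\vec{A} \cdot \vec{B}}{\|\vec{A}\| \, \|\vec{B}\|}$. Adjusted cosine similarity (for users with different rating scales): $w_{i,j} = \frac{\sum_{u \in U} (r_{u,i} - \bar{r}_u)(r_{u,j} - \bar{r}_u)}{\sqrt{\sum_{u \in U} (r_{u,i} - \bar{r}_u)^2} \, \sqrt{\sum_{u \in U} (r_{u,j} - \bar{r}_u)^2}}$, where $U$ is the set of users who rated both items $i$ and $j$ and $\bar{r}_u$ is the average rating of user $u$.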
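
The two similarities named on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the `ratings` layout (user -> item -> rating) and the sample values are hypothetical and do not come from the slides.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length rating vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: an all-zero vector is similar to nothing
    return dot / (norm_a * norm_b)

def adjusted_cosine_similarity(ratings, i, j):
    """Similarity of items i and j after subtracting each user's mean rating."""
    num = den_i = den_j = 0.0
    for user_ratings in ratings.values():
        if i in user_ratings and j in user_ratings:
            mean = sum(user_ratings.values()) / len(user_ratings)
            di = user_ratings[i] - mean
            dj = user_ratings[j] - mean
            num += di * dj
            den_i += di * di
            den_j += dj * dj
    if den_i == 0 or den_j == 0:
        return 0.0
    return num / math.sqrt(den_i * den_j)

# Hypothetical ratings: both users prefer item "a" over item "b".
sample = {"u1": {"a": 5, "b": 1}, "u2": {"a": 4, "b": 2}}
print(cosine_similarity([1, 2], [2, 4]))
print(adjusted_cosine_similarity(sample, "a", "b"))
```

Subtracting the per-user mean is what lets the adjusted variant compare a generous rater with a strict one.
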
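- 13. Correlation-Based Similarity Pearson correlation between users $u$ and $v$: $w_{u,v} = \frac{\sum_{i \in I} (r_{u,i} - \bar{r}_u)(r_{v,i} - \bar{r}_v)}{\sqrt{\sum_{i \in I} (r_{u,i} - \bar{r}_u)^2} \, \sqrt{\sum_{i \in I} (r_{v,i} - \bar{r}_v)^2}}$, where $I$ is the set of items rated by both users and $\bar{r}_u$ is the average rating of user $u$ on those co-rated items. Thus in the example in Table 2, we have $w_{1,5} = 0.756$.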
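
A minimal sketch of the Pearson correlation between two users follows. The ratings for `u1` and `u5` are an assumption: the slide's Table 2 did not survive in this transcript, so the values were chosen to reproduce the quoted weight. Means are taken over the co-rated items only, which is also an assumption about the slide's convention.

```python
import math

def pearson_similarity(ratings_u, ratings_v):
    """Pearson correlation over the items both users have rated."""
    common = sorted(set(ratings_u) & set(ratings_v))
    if len(common) < 2:
        return 0.0  # assumption: too little overlap means no evidence
    mean_u = sum(ratings_u[i] for i in common) / len(common)
    mean_v = sum(ratings_v[i] for i in common) / len(common)
    num = sum((ratings_u[i] - mean_u) * (ratings_v[i] - mean_v) for i in common)
    den_u = math.sqrt(sum((ratings_u[i] - mean_u) ** 2 for i in common))
    den_v = math.sqrt(sum((ratings_v[i] - mean_v) ** 2 for i in common))
    if den_u == 0 or den_v == 0:
        return 0.0
    return num / (den_u * den_v)

# Assumed ratings (not taken from the slide).
u1 = {"I1": 4, "I3": 5, "I4": 5}
u5 = {"I1": 2, "I2": 1, "I3": 3, "I4": 5}
print(round(pearson_similarity(u1, u5), 3))  # 0.756
```
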
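- 14. Prediction Computation Weighted Sum of Others’ Ratings: $P_{a,i} = \bar{r}_a + \frac{\sum_{u \in U} (r_{u,i} - \bar{r}_u)\, w_{a,u}}{\sum_{u \in U} |w_{a,u}|}$, where $\bar{r}_a$ and $\bar{r}_u$ are the average ratings of the active user $a$ and of user $u$, $w_{a,u}$ is the weight between them, and $U$ is the set of users who rated item $i$. The simple example in Table 4 applies this user-based CF formula to predict the rating for U1 on I2.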
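
The weighted sum of others' ratings can be sketched as below: the active user's mean plus the similarity-weighted average of each neighbor's deviation from that neighbor's own mean. The function name, the tuple layout and the numbers are hypothetical.

```python
def predict_rating(active_mean, neighbors):
    """Predict a rating from (weight, neighbor_rating, neighbor_mean) tuples."""
    num = sum(w * (r - m) for w, r, m in neighbors)
    den = sum(abs(w) for w, _, _ in neighbors)
    if den == 0:
        return active_mean  # assumption: fall back to the user's own mean
    return active_mean + num / den

# Hypothetical neighbors: one rated the item above their mean, one below.
neighbors = [(1.0, 5, 3.0), (0.5, 2, 3.0)]
print(predict_rating(4.0, neighbors))  # 5.0
```

Dividing by the sum of absolute weights keeps the prediction on the rating scale even when some weights are negative.
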
- 15. Recommendations I Rating Prediction Algorithm: a) Calculate $P_{a,i}$ for each item $i$ using the prediction computation formula. b) Recommend the top-N highest-rated items that the active user $a$ has not purchased.
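
Steps a) and b) amount to filtering and sorting predicted ratings. The sketch below assumes the predictions are already computed; the function name and data are hypothetical.

```python
def top_n_by_prediction(predictions, purchased, n):
    """Return the n unpurchased items with the highest predicted rating."""
    candidates = [(item, score) for item, score in predictions.items()
                  if item not in purchased]
    # Sort by descending score; break ties by item name for determinism.
    candidates.sort(key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in candidates[:n]]

predictions = {"A": 4.5, "B": 3.0, "C": 4.9, "D": 4.0}
print(top_n_by_prediction(predictions, purchased={"C"}, n=2))  # ['A', 'D']
```
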
- 16. Recommendations II K Nearest Neighbors Algorithm: a) Find k most similar users (KNN). b) Identify a set of items, C, purchased by the group together with their frequency. c) Recommend the top-N most frequent items in C that the active user has not purchased.
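
The three K-nearest-neighbor steps can be sketched as follows, assuming user similarities are precomputed. All names and sample data are hypothetical.

```python
from collections import Counter

def knn_top_n(similarities, purchases, active_user, k, n):
    """Recommend the n items bought most often by the k most similar users."""
    # a) Find the k most similar users.
    neighbors = sorted((u for u in similarities if u != active_user),
                       key=lambda u: (-similarities[u], u))[:k]
    # b) Count how often each item was purchased by that group.
    counts = Counter(item for u in neighbors for item in purchases.get(u, ()))
    # c) Keep the most frequent items the active user does not own yet.
    owned = set(purchases.get(active_user, ()))
    ranked = sorted((item for item in counts if item not in owned),
                    key=lambda item: (-counts[item], item))
    return ranked[:n]

similarities = {"u2": 0.9, "u3": 0.8, "u4": 0.1}
purchases = {"a": {"x"}, "u2": {"x", "y", "z"}, "u3": {"y"}, "u4": {"w"}}
print(knn_top_n(similarities, purchases, "a", k=2, n=2))  # ['y', 'z']
```
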
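- 17. Item-Based CF Algorithm Correlation-Based Similarity (computed on the user-item matrix): $w_{i,j} = \frac{\sum_{u \in U} (r_{u,i} - \bar{r}_i)(r_{u,j} - \bar{r}_j)}{\sqrt{\sum_{u \in U} (r_{u,i} - \bar{r}_i)^2} \, \sqrt{\sum_{u \in U} (r_{u,j} - \bar{r}_j)^2}}$, where $U$ is the set of users who rated both items $i$ and $j$, $r_{u,i}$ is the rating of user $u$ on item $i$, and $\bar{r}_i$ is the average rating of the $i$th item by those users.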
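- 18. Prediction Computation Simple Weighted Average: $P_{u,i} = \frac{\sum_{n \in N} r_{u,n}\, w_{i,n}}{\sum_{n \in N} |w_{i,n}|}$, where $N$ is the set of other items rated by user $u$, $w_{i,n}$ is the weight between items $i$ and $n$, and $r_{u,n}$ is the rating for user $u$ on item $n$.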
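
A minimal sketch of the simple weighted average for item-based prediction: the user's own ratings on other items, weighted by each item's similarity to the target item. The function name and values are hypothetical.

```python
def predict_item_based(user_ratings, item_weights):
    """Weighted average of a user's ratings, weighted by item similarity."""
    rated = [n for n in user_ratings if n in item_weights]
    num = sum(user_ratings[n] * item_weights[n] for n in rated)
    den = sum(abs(item_weights[n]) for n in rated)
    if den == 0:
        return None  # assumption: no usable neighbors means no prediction
    return num / den

# Hypothetical: the target item resembles "A" more than "B".
print(predict_item_based({"A": 4, "B": 2}, {"A": 0.75, "B": 0.25, "C": 0.9}))  # 3.5
```
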
- 19. Extensions • Default Voting • Inverse User Frequency • Case Amplification
- 20. Default Voting Problem: • Pair-wise similarity is computed only from the ratings in the intersection of the items both users have rated. • There are too few votes at the beginning. Solution: • Assuming default voting values for the missing ratings can improve CF prediction performance. • Dimension reduction, such as SVD or PCA.
- 21. Inverse User Frequency Definition: $f_j = \log(n / n_j)$, where $n_j$ is the number of users who have rated item $j$ and $n$ is the total number of users.
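
Inverse user frequency is the logarithm of the total number of users divided by the number of users who rated the item. The sketch below uses a hypothetical `ratings` layout (user -> item -> rating); returning 0.0 for an item nobody rated is an assumption, since the definition is undefined there.

```python
import math

def inverse_user_frequency(ratings, item):
    """log(n / n_j): n users in total, n_j of whom rated the item."""
    n = len(ratings)
    n_j = sum(1 for user_ratings in ratings.values() if item in user_ratings)
    if n_j == 0:
        return 0.0  # assumption: unrated items carry no weight
    return math.log(n / n_j)

ratings = {
    "u1": {"popular": 5, "niche": 4},
    "u2": {"popular": 3},
    "u3": {"popular": 4},
    "u4": {"popular": 2},
}
print(inverse_user_frequency(ratings, "popular"))  # 0.0
print(inverse_user_frequency(ratings, "niche"))    # log(4), about 1.386
```

An item everyone has rated gets weight zero, so it no longer influences the similarity.
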
- 22. Case Amplification $w'_{i,j} = w_{i,j} \cdot |w_{i,j}|^{\rho - 1}$, where $\rho$ is the case amplification power, $\rho \ge 1$, and a typical choice of $\rho$ is 2.5. Case amplification reduces noise in the data. It tends to favor high weights, as small values raised to a power become negligible. For example, if $w_{i,j} = 0.9$, it remains high ($0.9^{2.5} \approx 0.8$); if $w_{i,j} = 0.1$, it becomes negligible ($0.1^{2.5} \approx 0.003$).
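
The arithmetic on this slide can be checked with a one-line transform that raises the magnitude of a weight to the power ρ while keeping its sign. The function name is hypothetical.

```python
def amplify(weight, rho=2.5):
    """Case amplification: w * |w|**(rho - 1), which preserves the sign of w."""
    return weight * abs(weight) ** (rho - 1)

print(round(amplify(0.9), 3))   # 0.768
print(round(amplify(0.1), 3))   # 0.003
print(round(amplify(-0.9), 3))  # -0.768
```
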
- 23. Model-Based CF Algorithm • Simple Bayesian CF Algorithm • Clustering CF Algorithm
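- 24. Simple Bayesian CF Algorithm Simple Bayesian: $\text{class} = \arg\max_{j \in \text{classSet}} p(\text{class}_j) \prod_{o} P(X_o = x_o \mid \text{class}_j)$. Laplace Estimator: $P(X_i = x_i \mid Y = y) = \frac{\#(X_i = x_i, Y = y) + 1}{\#(Y = y) + |X_i|}$, where $|X_i|$ is the size of the class set.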
- 25. Simple Bayesian CF Algorithm Example: for the ratings in Table 4, produce the rating for U1 on I2 using the Simple Bayesian CF algorithm and the Laplace Estimator.
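
A rough sketch of a simple Bayesian prediction with a Laplace estimator follows. Everything concrete here is an assumption: the ratings matrix is invented for illustration (the slide's table did not survive in this transcript), the function name is hypothetical, and the choice to treat the active user's rated items as training examples and the other users' votes as features is one common reading of the method, not necessarily the presenter's.

```python
def simple_bayes_predict(ratings, active, target, classes):
    """Return (best_class, scores) for the active user's rating of target."""
    # Training examples: items the active user has already rated.
    train_items = [i for i in ratings[active] if i != target]
    if not train_items:
        raise ValueError("active user has no rated items to learn from")
    # Features: votes of the other users who rated the target item.
    voters = [u for u in ratings if u != active and target in ratings[u]]
    scores = {}
    for c in classes:
        items_c = [i for i in train_items if ratings[active][i] == c]
        score = len(items_c) / len(train_items)  # class prior
        for u in voters:
            x = ratings[u][target]
            matches = sum(1 for i in items_c if ratings[u].get(i) == x)
            # Laplace estimator: (count + 1) / (class count + number of classes)
            score *= (matches + 1) / (len(items_c) + len(classes))
        scores[c] = score
    best = max(classes, key=lambda c: scores[c])
    return best, scores

# Assumed ratings matrix (user -> item -> rating), not taken from the slide.
ratings = {
    "U1": {"I1": 4, "I3": 5, "I4": 5},
    "U2": {"I1": 4, "I2": 2, "I3": 1},
    "U3": {"I1": 3, "I3": 2, "I4": 4},
    "U4": {"I1": 4, "I2": 4},
    "U5": {"I1": 2, "I2": 1, "I3": 3, "I4": 5},
}
best, scores = simple_bayes_predict(ratings, "U1", "I2", [1, 2, 3, 4, 5])
print(best)  # 4
```

The Laplace estimator keeps a single unseen vote from zeroing out an otherwise plausible class.
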
- 26. Clustering CF Algorithm For two data objects, $X = (x_1, x_2, \ldots, x_n)$ and $Y = (y_1, y_2, \ldots, y_n)$, the popular Minkowski distance is defined as $d(X, Y) = \sqrt[q]{\sum_{i=1}^{n} |x_i - y_i|^q}$, where $n$ is the number of dimensions of the objects and $q$ is a positive integer. When $q = 1$, $d$ is the Manhattan distance; when $q = 2$, $d$ is the Euclidean distance.
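
The Minkowski distance and its two special cases can be checked with a short sketch. The function name and sample points are hypothetical.

```python
def minkowski_distance(x, y, q):
    """q-th root of the sum of |x_i - y_i|**q over all dimensions."""
    if len(x) != len(y):
        raise ValueError("objects must have the same number of dimensions")
    if q < 1:
        raise ValueError("q must be a positive integer")
    return sum(abs(a - b) ** q for a, b in zip(x, y)) ** (1.0 / q)

print(minkowski_distance([0, 0], [3, 4], q=1))  # 7.0 (Manhattan)
print(minkowski_distance([0, 0], [3, 4], q=2))  # 5.0 (Euclidean)
```
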
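- 27. Evaluation Metrics Mean Absolute Error and Normalized Mean Absolute Error: $\mathrm{MAE} = \frac{\sum_{\{i,j\}} |p_{i,j} - r_{i,j}|}{n}$ and $\mathrm{NMAE} = \frac{\mathrm{MAE}}{r_{\max} - r_{\min}}$, where $p_{i,j}$ is the predicted and $r_{i,j}$ the actual rating of user $i$ on item $j$, $n$ is the number of predicted ratings, and $r_{\max}$ and $r_{\min}$ are the upper and lower bounds of the ratings.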
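
Both metrics are easy to compute once predicted and actual ratings are paired up. The sketch below assumes two equal-length lists; the function names and numbers are hypothetical.

```python
def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and actual ratings."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty lists of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, actual)) / len(predicted)

def normalized_mae(predicted, actual, r_min, r_max):
    """MAE divided by the width of the rating scale."""
    return mean_absolute_error(predicted, actual) / (r_max - r_min)

predicted = [4, 3, 5]
actual = [5, 3, 3]
print(mean_absolute_error(predicted, actual))    # 1.0
print(normalized_mae(predicted, actual, 1, 5))   # 0.25
```

Normalizing by the scale width makes errors comparable across datasets with different rating ranges.
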
- 29. Challenges • Data sparsity • Scalability • Synonymy • Gray Sheep • Shilling Attacks
- 31. Demo • Tools: Mahout, a scalable machine learning and data mining library, http://mahout.apache.org/ • Data: MovieLens, http://www.movielens.org/
- 33. Conclusions: Memory-based CF • Representative techniques: item-based/user-based top-N recommendations • Main advantages: 1. easy implementation 2. new data can be added easily and incrementally 3. need not consider the content of the items being recommended 4. scales well with co-rated items • Main shortcomings: 1. depends on human ratings 2. performance decreases when data are sparse 3. cannot recommend for new users and items 4. has limited scalability for large datasets
- 34. Conclusions: Model-based CF • Representative techniques: 1. Bayesian belief nets CF 2. Clustering CF 3. CF using dimensionality reduction techniques (SVD, PCA) • Main advantages: 1. better addresses sparsity, scalability and other problems 2. improves prediction performance 3. gives an intuitive rationale for recommendations • Main shortcomings: 1. expensive model-building 2. trade-off between prediction performance and scalability 3. loses useful information with dimensionality reduction techniques
- 36. Future work • Scalability • Real-time
- 37. Q & A
- 38. References • J. Breese, D. Heckerman, and C. Kadie, “Empirical analysis of predictive algorithms for collaborative filtering,” in Proceedings of the 14th Conference on Uncertainty in Artificial Intelligence (UAI ’98), 1998. • B. Sarwar, G. Karypis, J. Konstan, and J. Riedl, “Item-based collaborative filtering recommendation algorithms,” in Proceedings of the WWW Conference, 2001. • K. Miyahara and M. J. Pazzani, “Collaborative filtering with the simple Bayesian classifier,” in Proceedings of the 6th Pacific Rim International Conference on Artificial Intelligence, pp. 679–689, 2000. • L. H. Ungar and D. P. Foster, “Clustering methods for collaborative filtering,” in Proceedings of the Workshop on Recommendation Systems, AAAI Press, 1998. • X. Su and T. M. Khoshgoftaar, “A survey of collaborative filtering techniques,” Advances in Artificial Intelligence, vol. 2009, Article ID 421425, 19 pages.