Recommender Systems, Lenskit, Mahout

Published in: Technology


  1. 1. Introduction to Recommender Systems Bob Brehm
  2. 2. Intro to Recommenders
  3. 3. Intro to Recommenders
  4. 4. Intro to Recommenders
  5. 5. What is a recommender? ● Wikipedia [3]: ● A subclass of [an] information filtering system that seeks to predict the 'rating' or 'preference' that a user would give to an item ● My addition: A subclass of machine learning. ● Recommender model [2]: ● Users ● Items ● Ratings ● Community
  6. 6. What is a recommender? [2]
  7. 7. What is a recommender? [2] ● Simple use case ● Tripadvisor hotel recommender. Hotel ratings are averaged, and the user then examines the averages. ● More complex use case ● News article recommender. A model of the user's preferences is built. The items' attributes are ranked. A matrix operation is performed to predict recommendations for a given user.
  8. 8. History of recommenders ● Konstan - Of Ants and Cavemen [1] ● Ants will follow a chemical trail – similar to recommenders ● Paleolithic ancestors would follow recommendations on edible items vs. poisonous items ● Suggests a psychological and/or biological need to follow recommendations
  9. 9. History of recommenders ● Manual filtering ● Usenet – recommenders based on TF-IDF and user profile. [2] ● PARC Tapestry (1992) - database of comments and contents. [1] ● Active CF (1995) – forwarding content to relevant readers. [1]
  10. 10. History of recommenders ● Automatic filtering [2] ● GroupLens project (1994) – User ratings for Usenet, Nearest Neighbor algorithm ● Commercial era – PHOAKS system – helped users locate information on the web using collaborative filtering. – CDnow – dot.bomb purchased by Amazon
  11. 11. History of recommenders ● Automatic filtering [3] ● Netflix Prize (2006-2009) – offered a prize for a 10% improvement in the accuracy of predicting users' movie ratings. ● Netflix awarded the $1M prize to “BellKor’s Pragmatic Chaos” - a merged team that included researchers from AT&T Labs. ● BellKor’s Pragmatic Chaos blended earlier work with better predictive models to win.
  12. 12. Recommender types ● Non-personalized recommender [2] ● Simple average of item ratings ● Can be misleading when context is missing. What if your favorite sauce is ketchup and you order ice cream? ● "If X then Y" recommenders can be improved by also considering "if !X then Y" ● Example: Zagat restaurant ratings, Tripadvisor hotel ratings
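The two non-personalized ideas on this slide can be sketched in a few lines of Python. This is a minimal illustration, not code from the deck: the function names, the `(user, item, rating)` triple layout, and the basket data are all assumptions. The second function compares how often Y appears with X against how often it appears without X, which is one common way to read the "X vs. !X" bullet.

```python
from collections import defaultdict

def mean_item_ratings(ratings):
    """Average rating per item from (user, item, rating) triples."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for _user, item, rating in ratings:
        totals[item] += rating
        counts[item] += 1
    return {item: totals[item] / counts[item] for item in totals}

def association_score(baskets, x, y):
    """Ratio P(Y | X) / P(Y | not X) over a list of item sets.

    Values well above 1 suggest Y really goes with X; values near or
    below 1 suggest Y is merely popular. Returns None when the ratio
    cannot be computed (no baskets with X, none without X, or Y never
    appears at all).
    """
    with_x = [b for b in baskets if x in b]
    without_x = [b for b in baskets if x not in b]
    if not with_x or not without_x:
        return None
    p_y_given_x = sum(y in b for b in with_x) / len(with_x)
    p_y_given_not_x = sum(y in b for b in without_x) / len(without_x)
    if p_y_given_not_x == 0:
        return float("inf") if p_y_given_x > 0 else None
    return p_y_given_x / p_y_given_not_x

# Hypothetical data for illustration only.
hotel_ratings = [("u1", "hotel_a", 4), ("u2", "hotel_a", 5), ("u1", "hotel_b", 2)]
baskets = [
    {"fries", "ketchup"}, {"burger", "ketchup"}, {"fries", "ketchup"},
    {"ice cream", "sprinkles"}, {"ice cream", "ketchup"}, {"burger"},
]
print(mean_item_ratings(hotel_ratings))
print(association_score(baskets, "fries", "ketchup"))
print(association_score(baskets, "ice cream", "ketchup"))
```

With this toy data, ketchup scores above 1 for fries and below 1 for ice cream, even though ketchup is the most common item overall.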
  13. 13. Recommender types ● Content-based filtering (user-item) [2] ● Model items by attribute keywords. Each item then has a position in the keyword vector space. ● Model the user's taste profile by attribute keywords. The user profile also has a position in the keyword vector space. ● The relevance ranking is the cosine of the angle between these vectors. ● Factor in item ratings by threshold, +/- weight, etc. ● Term Frequency – Inverse Document Frequency (TF-IDF) to weight the keywords in each item vector. ● Multi-linear regression for analysis.
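The content-based pipeline above (keyword vectors, TF-IDF weighting, cosine ranking) might look roughly like the following sketch. Everything here is an assumption made for illustration: the item ids, the keywords, the unsmoothed `log(N / df)` form of IDF, and the choice to build the user profile by summing the vectors of liked items.

```python
import math
from collections import Counter

def tf_idf_vectors(docs):
    """docs: {item_id: [keyword, ...]} -> {item_id: {keyword: weight}}."""
    n = len(docs)
    df = Counter()  # number of items each keyword appears in
    for words in docs.values():
        df.update(set(words))
    vectors = {}
    for item, words in docs.items():
        tf = Counter(words)
        vectors[item] = {
            w: (count / len(words)) * math.log(n / df[w])
            for w, count in tf.items()
        }
    return vectors

def cosine(a, b):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(word, 0.0) for word, weight in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def build_profile(vectors, liked_items):
    """User profile as the sum of the vectors of the items the user liked."""
    profile = Counter()
    for item in liked_items:
        for word, weight in vectors[item].items():
            profile[word] += weight
    return dict(profile)

def rank_items(profile, vectors, exclude=()):
    """Items sorted by cosine similarity to the profile, best first."""
    scored = [(cosine(profile, vec), item)
              for item, vec in vectors.items() if item not in exclude]
    return sorted(scored, reverse=True)

# Hypothetical catalogue for illustration only.
items = {
    "m1": ["space", "robot", "action"],
    "m2": ["space", "alien", "action"],
    "m3": ["romance", "paris", "comedy"],
}
vectors = tf_idf_vectors(items)
profile = build_profile(vectors, ["m1"])
print(rank_items(profile, vectors, exclude={"m1"}))
```

A user who liked only `m1` gets `m2` ranked ahead of `m3`, because `m2` shares weighted keywords with the profile and `m3` shares none.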
  14. 14. Recommender types ● Collaborative filtering (user-user, item-item) [2] ● User-user CF is used extensively for friend suggestions on social media – Facebook, LinkedIn, etc. [1][4]
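A minimal sketch of user-user collaborative filtering, combining Pearson correlation with a k-nearest-neighbor prediction. The data, the function names, and several design choices are assumptions: the correlation is computed over co-rated items only, negatively correlated users are ignored, and the prediction is the target user's mean plus a similarity-weighted average of the neighbors' mean-centered ratings. Other variants exist.

```python
import math

def pearson(ratings_a, ratings_b):
    """Pearson correlation over the items both users have rated."""
    common = set(ratings_a) & set(ratings_b)
    if len(common) < 2:
        return 0.0
    mean_a = sum(ratings_a[i] for i in common) / len(common)
    mean_b = sum(ratings_b[i] for i in common) / len(common)
    num = sum((ratings_a[i] - mean_a) * (ratings_b[i] - mean_b) for i in common)
    den = (math.sqrt(sum((ratings_a[i] - mean_a) ** 2 for i in common))
           * math.sqrt(sum((ratings_b[i] - mean_b) ** 2 for i in common)))
    return num / den if den else 0.0

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def predict(ratings, user, item, k=2):
    """Predict user's rating for item from the k most similar raters of it.

    ratings: {user: {item: rating}}. Falls back to the user's own mean
    when no positively correlated neighbor has rated the item. The
    result is not clamped to the rating scale.
    """
    user_mean = mean(ratings[user].values())
    neighbors = []
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = pearson(ratings[user], other_ratings)
        if sim > 0:
            neighbors.append((sim, other))
    top = sorted(neighbors, reverse=True)[:k]
    if not top:
        return user_mean
    num = sum(sim * (ratings[o][item] - mean(ratings[o].values())) for sim, o in top)
    den = sum(sim for sim, _o in top)
    return user_mean + num / den

# Hypothetical ratings for illustration only.
ratings = {
    "alice": {"a": 5, "b": 3, "c": 4},
    "bob":   {"a": 4, "b": 2, "c": 3, "d": 4},
    "carol": {"a": 1, "b": 5, "c": 2, "d": 1},
}
print(predict(ratings, "alice", "d"))
```

Here bob's tastes track alice's closely while carol's run opposite, so only bob contributes to the prediction for item `d`.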
  15. 15. Recommender types ● Hybrid Recommender [3] ● Combination of collaborative and content-based filtering [3] ● Netflix uses a hybrid system – they use collaborative filtering to find similar user habits and content filtering to find similar items. [2] ● Hybrids exist to overcome inherent difficulties such as the cold-start problem, which is how to deal with a new user or new item. [4]
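One simple way to combine the two approaches is a weighted blend whose weight shifts toward collaborative filtering as more ratings become available. The sketch below is only an illustration of that idea under stated assumptions: `hybrid_score`, its parameters, and the `full_trust_at` threshold are hypothetical, and this is not a description of how Netflix or any other production system works.

```python
def hybrid_score(cf_score, content_score, n_ratings, full_trust_at=20):
    """Blend a collaborative score with a content-based score.

    cf_score      -- collaborative-filtering prediction, or None when
                     there is no rating data yet (cold start)
    content_score -- content-based prediction for the same user/item
    n_ratings     -- number of ratings behind the CF prediction
    full_trust_at -- hypothetical count at which CF is trusted fully
    """
    if cf_score is None:
        # Cold start: nothing to collaborate on, use content alone.
        return content_score
    weight = min(n_ratings / full_trust_at, 1.0)
    return weight * cf_score + (1 - weight) * content_score

print(hybrid_score(None, 3.0, 0))   # new item: content score only
print(hybrid_score(5.0, 3.0, 10))   # halfway: equal blend
print(hybrid_score(5.0, 3.0, 50))   # plenty of data: CF score only
```

The linear ramp is the simplest possible choice; a real system would more likely tune or learn the weighting.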
  16. 16. Algorithms ● Simple averages (Non-personalized) ● Cosine similarity (Content-based) ● TF-IDF (Content-based) ● Multi-linear regression (Content-based) ● Pearson Correlation (Collaborative) ● K-nearest neighbor (Collaborative)
  17. 17. Lenskit ● Lenskit is an open-source recommender system tool suite that can be used for production but is primarily useful for research and prototyping IMHO. ● Features of Lenskit: ● Mavenized project including goals and archetypes. ● Data Access Objects (DAOs) and cursors. ● ItemScorer – implement this however you want. ● RatingPredictor – output is in the desired scale. ● ItemRecommender – provides Top-N recommendations.
  18. 18. Lenskit ● Features of Lenskit (cont.): ● Handy annotation classes. ● Support for Groovy. ● Post processing in R. ● MovieLens data sets (through GroupLens Research) ● Support for sparse matrices. ● Speed optimizations and profiling.
  19. 19. Mahout ● Started as a subproject of Lucene in 2008. ● The idea behind Mahout is that it provides a framework for the development and deployment of machine learning algorithms. ● Currently it has three distinct branches: ● Classification ● Clustering ● Recommenders
  20. 20. Mahout ● Support for recommenders includes: ● DataModel – provides connections to data ● UserSimilarity – computes the similarity between users ● ItemSimilarity – computes the similarity between items ● UserNeighborhood – finds a neighborhood (mini cluster) of like-minded users. ● Recommender – the producer of recommendations. ● Algorithms!
  21. 21. Demo Lenskit
  22. 22. Implementing Recommenders ● Decide whether you want to make or buy. There are commercial companies out there that already do this. ● If you decide to make, some hints: ● Get your users to log in. ● Build the user's profile explicitly through preferences and implicitly through logging. ● Choose the simplest algorithms that get the job done. ● Test, test, test.
  23. 23. ??
  24. 24. Thanks ● A special thanks to Professor Joseph Konstan of the University of Minnesota who has put together an excellent MOOC called “Intro to Recommenders” through Coursera. Some of the material in this presentation is based on that class. ● Thanks for your time!
  25. 25. References ● [1] Introduction to recommender systems. Joseph Konstan. SIGMOD 2008. ● [2] Intro to recommendations. Coursera. Retrieved from ● [3] Recommender system. Wikipedia. ● [4] An Algorithmic Framework for Performing Collaborative Filtering. ● [5] Hybrid Web Recommender Systems.