Collaborative Filtering Recommendation System

Presentation to explain how collaborative filtering is leveraged to improve recommendation systems.

  1. 1. Collaborative Filtering Recommender System: VIMALENDU SHEKHAR, MILIND GOKHALE, RENUKA DESHMUKH
  2. 2. Recommender Systems • A subclass of information filtering systems that seeks to predict the 'rating' or 'preference' that a user would give to an item. • Help users decide what to wear, what to buy, which stocks to purchase, etc. • Applied in a variety of domains such as movies, books, and research articles.
  3. 3. Evolution • People used to rely on recommendations from their peers. • This method doesn’t take the personal preferences of the user into account. • It also limits the search space. • Computer-based recommender systems overcome this by expanding the search space and providing more fine-tuned results.
  4. 4. Tasks of Recommender Systems • Predict task: predict the user’s preference for an item. • Recommend task: produce the best ranked list of n items for the user’s need.
  5. 5. Collaborative Filtering • Collaborative filtering is the process of filtering for information or patterns using techniques involving collaboration among multiple agents, viewpoints, data sources, etc. • For recommender systems, collaborative filtering is a method of making automatic predictions about the interests of a user by collecting preference information from many users. • Based on the idea that people who agreed in their evaluation of certain items in the past are likely to agree again in the future.
  6. 6. Recommender Systems
  7. 7. User-User Collaborative Filtering • Basic idea: find other users whose past rating behavior is similar to that of the current user, and use their ratings on other items to predict what the current user will like. • Required: a ratings matrix and a similarity function that computes the similarity between two users.
  8. 8. Idea
  9. 9. • The selection of neighbors can be random or based on a threshold value. • User u’s prediction for item i is denoted p_{u,i}.
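
The prediction formula itself is not present in the slide text. One common formulation, assumed here purely for illustration and not necessarily the presenters' formula, predicts p_{u,i} as the user's mean rating plus a similarity-weighted average of the neighbors' mean-centered ratings, with Pearson correlation as the similarity function and a threshold for neighbor selection. The function names and the dictionary layout (`user -> {item: rating}`) are hypothetical.

```python
from math import sqrt


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def pearson(ratings_u, ratings_v):
    """Pearson correlation over co-rated items (0.0 when undefined)."""
    common = set(ratings_u) & set(ratings_v)
    if len(common) < 2:
        return 0.0
    mu_u = mean(ratings_u[i] for i in common)
    mu_v = mean(ratings_v[i] for i in common)
    num = sum((ratings_u[i] - mu_u) * (ratings_v[i] - mu_v) for i in common)
    norm_u = sqrt(sum((ratings_u[i] - mu_u) ** 2 for i in common))
    norm_v = sqrt(sum((ratings_v[i] - mu_v) ** 2 for i in common))
    den = norm_u * norm_v
    return num / den if den else 0.0


def predict_user_user(ratings, user, item, threshold=0.0):
    """Assumed formulation: user mean + weighted mean-centered neighbor ratings."""
    mu_u = mean(ratings[user].values())
    num = den = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = pearson(ratings[user], other_ratings)
        if sim <= threshold:  # threshold-based neighbor selection
            continue
        num += sim * (other_ratings[item] - mean(other_ratings.values()))
        den += abs(sim)
    # Fall back to the user's own mean when no neighbor qualifies.
    return mu_u + num / den if den else mu_u


ratings = {
    "alice": {"a": 5, "b": 3, "c": 4},
    "bob": {"a": 5, "b": 3, "c": 4, "d": 5},
    "carol": {"a": 1, "b": 5, "c": 2, "d": 1},
}
prediction = predict_user_user(ratings, "alice", "d")
```

Here only bob passes the threshold, because carol's ratings are negatively correlated with alice's, so the prediction is driven entirely by how far bob's rating of item d sits above his own average.
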
  10. 10. Item-Item Collaborative Filtering • Basic idea: recommend items that are similar to the user’s highly preferred items. • Provides performance gains by lending itself well to pre-computing the similarity matrix.
  11. 11. Idea
  12. 12. Prediction • User u’s prediction for item i is denoted p_{u,i}. • Cosine similarity or conditional probability is used to compute item-item similarity.
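
As a rough sketch of item-item prediction with cosine similarity: each item is treated as a sparse vector of user ratings, and p_{u,i} is taken as the similarity-weighted average of the user's ratings on other items. The weighted-average form and all names below are assumptions for illustration; the slide's own formula is not in the text.

```python
from math import sqrt


def cosine(vec_a, vec_b):
    """Cosine similarity between two sparse item vectors (user -> rating)."""
    dot = sum(vec_a[u] * vec_b[u] for u in set(vec_a) & set(vec_b))
    norm_a = sqrt(sum(r * r for r in vec_a.values()))
    norm_b = sqrt(sum(r * r for r in vec_b.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


def item_vectors(ratings):
    """Transpose user -> {item: rating} into item -> {user: rating}."""
    items = {}
    for user, user_ratings in ratings.items():
        for item, rating in user_ratings.items():
            items.setdefault(item, {})[user] = rating
    return items


def predict_item_item(ratings, user, item):
    """Assumed formulation: similarity-weighted average of the user's ratings."""
    items = item_vectors(ratings)
    target = items.get(item, {})
    num = den = 0.0
    for other_item, rating in ratings[user].items():
        if other_item == item:
            continue
        sim = cosine(target, items[other_item])
        num += sim * rating
        den += abs(sim)
    return num / den if den else None  # None: no similar rated item


ratings = {"u1": {"j": 4}, "u2": {"i": 5, "j": 3}}
prediction = predict_item_item(ratings, "u1", "i")
```

In a deployed system the pairwise similarities would normally be pre-computed offline rather than recomputed per prediction as this sketch does.
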
  13. 13. Dimensionality Reduction • Problem: in user-user or item-item CF, the user-item ratings domain is a vector space, which leads to redundancy; in information retrieval, the term-document matrix is likewise a high-dimensional representation of terms and documents, subject to synonymy, polysemy, and noise. • Can we reduce the number of dimensions to a constant k? • Truncated SVD: dimensionality reduction by singular value decomposition. • Applications: information retrieval (latent semantic analysis/indexing, LSA/LSI) and CF.
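
For reference, truncated SVD can be written as follows. The symbols are assumptions, not taken from the slides: R is the m × n ratings matrix, and only the k largest singular values are kept.

```latex
R \approx R_k = U_k \, \Sigma_k \, V_k^{\top},
\qquad U_k \in \mathbb{R}^{m \times k},\;
\Sigma_k = \operatorname{diag}(\sigma_1, \dots, \sigma_k),\;
V_k \in \mathbb{R}^{n \times k}

p_{u,i} = \sum_{f=1}^{k} (U_k)_{u,f} \, \sigma_f \, (V_k)_{i,f}
```

Each user and each item is then described by k numbers instead of a full row or column of R. Practical systems often add baseline terms such as user or item means, which this sketch omits.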
  14. 14. Probabilistic Methods • The core idea of probabilistic methods is to compute either P(i|u), the probability that user u will purchase or view item i, or the probability distribution P(r_{u,i}|u) over user u’s rating of item i. • Cross-Sell System: uses pairwise conditional probabilities with the naïve Bayes assumption to make recommendations in unary e-commerce domains. • Based on user purchase histories, the algorithm estimates P(a|b) (the probability that a user purchases a given that they have purchased b) for each pair of items a, b. The user’s currently viewed item or shopping basket is combined with these pairwise probabilities to recommend items that optimize the expected value of site-defined objective functions.
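
The pairwise estimate P(a|b) can be sketched by simple counting over purchase histories. This is a simplified illustration, not the Cross-Sell system itself: it ranks by P(a|b) for a single currently viewed item only, with no site-defined objective function and no basket combination. The function names are hypothetical.

```python
from collections import Counter
from itertools import permutations


def pairwise_conditionals(baskets):
    """Estimate P(a|b) = (#histories with both a and b) / (#histories with b)."""
    bought = Counter()
    bought_both = Counter()
    for basket in baskets:
        items = set(basket)
        bought.update(items)
        # Ordered pairs (a, b) with a != b that co-occur in this history.
        bought_both.update(permutations(items, 2))
    return {(a, b): n / bought[b] for (a, b), n in bought_both.items()}


def recommend(baskets, current_item, n=3):
    """Rank candidate items a by the estimated P(a | current_item)."""
    probs = pairwise_conditionals(baskets)
    scored = [(p, a) for (a, b), p in probs.items() if b == current_item]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [a for _, a in scored[:n]]


baskets = [["bread", "butter"], ["bread", "butter", "jam"], ["bread"], ["jam"]]
probs = pairwise_conditionals(baskets)
top = recommend(baskets, "bread")
```
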
  15. 15. Probabilistic Matrix Factorization • Probabilistic latent semantic analysis/indexing (PLSA/PLSI). • PLSA decomposes the probability P(i|u) by introducing a set Z of latent factors. Here z is a factor on the basis of which user u decides which item i to view or purchase. • P(i|u) is therefore the sum over z in Z of P(i|z) P(z|u). • Users are thus represented as a mixture of preference profiles (feature preferences), and a user’s item preferences are attributed to the preference profiles rather than directly to the user. • Û is the matrix of the mixtures of preference profiles for each user. • T̂ is the matrix of preference-profile probabilities of selecting various items. • Σ is a diagonal matrix such that σ_z = P(z).
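
A small numeric sketch of the PLSA mixture, P(i|u) = sum over z of P(i|z) P(z|u), may help. The probabilities below are made-up illustrative values, and the function name is hypothetical. In practice these distributions are learned from data, which the sketch does not attempt.

```python
def plsa_item_probabilities(p_z_given_u, p_i_given_z):
    """Return P(i|u) for every item i as a mixture over latent factors z.

    p_z_given_u: list of P(z|u) for one user, one entry per factor z.
    p_i_given_z: one row per factor z, each row a distribution over items.
    """
    n_items = len(p_i_given_z[0])
    return [
        sum(p_z * p_i_given_z[z][i] for z, p_z in enumerate(p_z_given_u))
        for i in range(n_items)
    ]


# Illustrative values only: two latent preference profiles, three items.
p_z_given_u = [0.75, 0.25]
p_i_given_z = [
    [0.5, 0.5, 0.0],  # profile 0 prefers items 0 and 1
    [0.0, 0.5, 0.5],  # profile 1 prefers items 1 and 2
]
p_i_given_u = plsa_item_probabilities(p_z_given_u, p_i_given_z)
```

Because each input row is a probability distribution, the resulting P(i|u) values also sum to 1.
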
  16. 16. Hybrid Recommenders • Hybrids can be particularly beneficial when the algorithms involved cover different use cases or different aspects of the data set. • Seven classes of hybrid recommenders: (1) Weighted: takes the scores produced by several recommenders and combines them. (2) Switching: switches between different algorithms according to the context. (3) Mixed: presents the results of several recommenders without combining them into a single list. (4) Feature-combining: uses multiple recommendation data sources as input to a single meta-recommender algorithm. (5) Cascading: chains the algorithms, feeding the output of one into the next. (6) Feature-augmenting: uses the output of one algorithm as one of the inputs to another. (7) Meta-level: trains a model using one algorithm and gives it as input to another. • Example: Netflix Prize, feature-weighted linear stacking, which uses functions g_j of item meta-features, such as number of ratings or genre, to alter the blending ratio of the various algorithms’ predictions on an item-by-item basis.
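
The weighted class is the simplest to illustrate. The sketch below assumes each recommender returns a dictionary of item scores on a comparable scale and that the weights are fixed; both are assumptions, and an item missing from one recommender simply contributes nothing for that recommender.

```python
def weighted_hybrid(score_lists, weights):
    """Combine per-item scores from several recommenders as a weighted sum."""
    combined = {}
    for scores, weight in zip(score_lists, weights):
        for item, score in scores.items():
            combined[item] = combined.get(item, 0.0) + weight * score
    # Highest combined score first; ties are broken alphabetically.
    ranking = sorted(combined, key=lambda item: (-combined[item], item))
    return ranking, combined


# Hypothetical scores from a collaborative and a content-based recommender.
cf_scores = {"a": 0.9, "b": 0.2}
content_scores = {"a": 0.1, "b": 0.8, "c": 0.5}
ranking, combined = weighted_hybrid([cf_scores, content_scores], [0.7, 0.3])
```

Feature-weighted linear stacking, mentioned in the Netflix Prize example, goes further by making the weights functions of item meta-features instead of constants.
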
  17. 17. Algorithm Selection • User-based algorithms: more tractable when there are more items than users. • Item-based algorithms: more tractable when there are more users than items. • Minimal offline computation but higher online computation. • Matrix factorization methods: (-) require expensive offline model computation; (+) fast for online use; (+) reduced impact of ratings noise; (+) reduced impact of users’ ratings on each other’s ratings. • Probabilistic models: suitable when the recommendation process should follow models of user behavior.
  18. 18. Evaluating Recommender Systems • It can be costly to try algorithms on real sets of users and measure the effects. • Offline algorithmic evaluations: pre-test algorithms to understand their behavior before user testing. • Offline evaluation is beneficial for direct, objective comparison of different algorithms in a reproducible fashion.
  19. 19. Data Sets • EachMovie: by DEC Systems Research Center, 2.8M user ratings of movies. • MovieLens: data sets of 100K timestamped user ratings, 1M ratings, and 10M ratings with 100K timestamped records of users tagging movies. • Jester: ratings of 100 jokes from 73,421 users between April 1999 and May 2003, and ratings of 150 jokes from 63,974 users between November 2006 and May 2009. • BookCrossing: 1.1M ratings from 279K users for 271K books. • Netflix: 100M datestamped ratings of 17K movies from 480K users.
  20. 20. Offline Evaluation Structure • The users in the data set are split into two groups: a training set and a test set. • A recommender model is built against the training set. • Each test user’s ratings are then split into two parts: a query set and a target set. • The recommender is given the query set as a user history and asked to recommend items or to predict ratings for the items in the target set. • It is then evaluated on how well its recommendations or predictions match the held-out ratings in the target set. • This whole process is frequently repeated, as in k-fold cross-validation, by splitting the users into k equal sets.
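
The two-level split described above can be sketched as follows. The function names, the default 20% test fraction, and holding out one rating per test user are all illustrative assumptions.

```python
import random


def split_users(ratings, test_fraction=0.2, seed=0):
    """Split users (not individual ratings) into training and test groups."""
    users = sorted(ratings)
    random.Random(seed).shuffle(users)
    n_test = max(1, int(len(users) * test_fraction))
    test = {u: ratings[u] for u in users[:n_test]}
    train = {u: ratings[u] for u in users[n_test:]}
    return train, test


def split_query_target(user_ratings, n_target=1, seed=0):
    """Split one test user's ratings into a query set and a held-out target set."""
    items = sorted(user_ratings)
    random.Random(seed).shuffle(items)
    target = {i: user_ratings[i] for i in items[:n_target]}
    query = {i: user_ratings[i] for i in items[n_target:]}
    return query, target


ratings = {f"u{k}": {"a": 1, "b": 2, "c": 3} for k in range(5)}
train, test = split_users(ratings)
test_user = next(iter(test))
query, target = split_query_target(test[test_user])
```

The recommender would be trained on `train`, shown `query` as the test user's history, and scored against `target`. For k-fold cross-validation the user split would be repeated with each fold taking a turn as the test set.
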
  21. 21. Prediction Accuracy: MAE • Mean absolute error (MAE): the average absolute difference between predicted and actual ratings. • Example: on a 5-star scale [1, 5], an MAE of 0.7 means that the algorithm was, on average, off by 0.7 stars. • This is useful for understanding the results in a particular context, but makes it difficult to compare results across data sets, as they have differing rating ranges. • Normalized mean absolute error (NMAE): divides MAE by the range of possible ratings, giving a common metric range of [0, 1].
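
A minimal sketch of both metrics, assuming predictions and actual ratings are given as parallel lists (the function names and sample values are illustrative):

```python
def mae(predictions, ratings):
    """Mean absolute error between predicted and actual ratings."""
    return sum(abs(p - r) for p, r in zip(predictions, ratings)) / len(ratings)


def nmae(predictions, ratings, r_min, r_max):
    """MAE divided by the rating range, so the result lies in [0, 1]."""
    return mae(predictions, ratings) / (r_max - r_min)


predicted = [4, 3, 5, 2]
actual = [5, 3, 4, 4]
error = mae(predicted, actual)                 # absolute errors: 1, 0, 1, 2
normalized = nmae(predicted, actual, 1, 5)     # 5-star scale [1, 5]
```
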
  22. 22. Prediction Accuracy: RMSE • Root mean square error (RMSE): amplifies the larger absolute errors. • Netflix Prize: a $1M prize was awarded for a 10% improvement in RMSE over Netflix’s internal algorithm. • RMSE can also be normalized, like NMAE, by dividing by the range of the rating scale. • Which of the three metrics to use depends on how the results are to be compared. • These metrics are mostly used for evaluating predict tasks.
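
To see how RMSE amplifies larger errors, the sketch below compares two sets of predictions with the same MAE of 1: one with four errors of 1 star, the other with a single error of 4 stars. The function name and sample data are illustrative.

```python
from math import sqrt


def rmse(predictions, ratings):
    """Root mean square error between predicted and actual ratings."""
    squared = sum((p - r) ** 2 for p, r in zip(predictions, ratings))
    return sqrt(squared / len(ratings))


actual = [3, 3, 3, 3]
evenly_off = [4, 4, 4, 4]   # four errors of 1 star each
one_big_miss = [3, 3, 3, 7]  # one error of 4 stars (off-scale, for illustration)

rmse_even = rmse(evenly_off, actual)
rmse_big = rmse(one_big_miss, actual)
```

Both prediction sets have the same mean absolute error, but the single large miss doubles the RMSE.
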
  23. 23. Accuracy over time • Temporal versions of MAE and RMSE were introduced to measure the accuracy of recommender systems over time, as more users are added to the system. • Timestamped data sets are therefore very useful for measuring accuracy over time. • n_t: number of ratings computed up through time t. • t_{u,i}: the time of rating r_{u,i}.
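
One way to read these definitions is that the error is averaged only over the n_t ratings whose timestamp t_{u,i} falls at or before t. The sketch below applies that reading to MAE. Whether the cutoff is inclusive, and the `(timestamp, predicted, actual)` record layout, are assumptions.

```python
def mae_up_to(records, t):
    """MAE over ratings with timestamp <= t; None if there are none yet.

    records: iterable of (timestamp, predicted, actual) tuples (assumed layout).
    """
    errors = [abs(p - r) for ts, p, r in records if ts <= t]
    n_t = len(errors)  # number of ratings up through time t
    return sum(errors) / n_t if n_t else None


records = [(1, 4, 5), (2, 3, 3), (3, 5, 2)]
accuracy_curve = [mae_up_to(records, t) for t in (0, 1, 2, 3)]
```

Evaluating the function at successive times gives a curve showing how accuracy changes as ratings accumulate.
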
  24. 24. Decision Support Metrics • This framework examines the capacity of a retrieval system to accurately identify resources relevant to a query, measuring separately its capacity to find all relevant items (recall) and to avoid finding irrelevant items (precision). • A confusion matrix is used for measuring this.
  25. 25. Decision Support Metrics • High-precision system, example: movie recommendation. • High-recall system, example: legal precedent search.
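
Precision and recall follow directly from the confusion matrix counts. A minimal sketch, assuming the recommended and relevant items are given as collections of item identifiers (names are illustrative):

```python
def precision_recall(recommended, relevant):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    recommended, relevant = set(recommended), set(relevant)
    true_positives = len(recommended & relevant)
    precision = true_positives / len(recommended) if recommended else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


precision, recall = precision_recall(
    recommended=["a", "b", "c", "d"],
    relevant=["a", "c", "e"],
)
```

A movie recommender can tolerate missing some good films as long as what it shows is good (precision), whereas a legal search must not miss a relevant precedent (recall).
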
  26. 26. Online Evaluation • Offline evaluation, though useful, is limited to operating on past data. • Recommender systems with similar metric performance can still give different results, and a decrease in error may or may not make the system better at meeting user needs. • For this, online user testing is needed. • Field trials: the recommender is deployed in the live system and users’ interactions with the system are recorded. • Virtual lab studies: these generally have a small user base of invited participants instead of the live user base.
  27. 27. Building a Data Set • The need for preference data can be decomposed into two types of information needs: • User information: the user’s preferences. • Item information: what kinds of users like or dislike each item. • User-item preferences: a set of characteristics, user preferences for those characteristics, and those characteristics’ applicability to various items. • Item-item model: what items are liked by the same users, as well as the current user’s preferences.
  28. 28. Cold-Start Problem • The problem of providing recommendations when there is not yet data available. • Item cold-start: a new item has been added to the database (e.g., when a new movie or book is released) but has not yet received enough ratings to be recommendable. • User cold-start: a new user has joined the system but their preferences are not yet known.
  29. 29. Sources of Preference Data • Preference data (ratings) comes from two primary sources. • Explicit ratings: preferences the user has explicitly stated for particular items. • Implicit ratings: preferences inferred by the system from observable user activity, such as purchases or clicks. • Many recommender systems obtain ratings by having users explicitly state their preferences for items. These stated preferences are then used to estimate the user’s preference for items they have not rated. • Drawback: there can, for many reasons, be a discrepancy between what users say and what they do.
  30. 30. Preferences can also be inferred from user behavior • Usenet reading domain: time spent reading; saving or replying; copying text into new articles; mentions of URLs. • Intelligent music management system: infers the user’s preference for various songs in their library as they skip them or allow them to play to completion. • E-commerce domain: page views; item purchases (as gifts or for personal use). Shared accounts can be misleading.
  31. 31. Types of preference data
  32. 32. Rating Scales • GroupLens used a 5-star scale. • Jester uses a semi-continuous −10 to +10 graphical scale. • Ringo used a 7-star scale. • Pandora music uses a “like”/“dislike” method.
  33. 33. Dealing with Noise • Noise in ratings can be introduced by normal human error and other factors. • Natural noise in ratings can be detected by asking users to re-rate items. • Another approach is to detect and ignore noisy ratings: each rating is compared to the user’s predicted preference for that item, and ratings that differ from the prediction by more than some threshold are discarded from the prediction and recommendation process.
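
The threshold-based filter can be sketched in a few lines. The `predict` callable stands in for whatever recommender produces the predicted preference; it, the tuple layout, and the threshold value are placeholders for illustration.

```python
def drop_noisy_ratings(ratings, predict, threshold):
    """Keep ratings whose distance from the predicted preference is within threshold.

    ratings: iterable of (user, item, rating) tuples (assumed layout).
    predict: callable (user, item) -> predicted rating (hypothetical stand-in).
    """
    return [
        (user, item, rating)
        for user, item, rating in ratings
        if abs(rating - predict(user, item)) <= threshold
    ]


def constant_predictor(user, item):
    """Toy stand-in for a real model: always predicts 3 stars."""
    return 3.0


ratings = [("u1", "a", 3.5), ("u1", "b", 5.0), ("u2", "a", 1.0)]
kept = drop_noisy_ratings(ratings, constant_predictor, threshold=1.5)
```

With a real model the threshold needs care: set too low, it discards genuine but unusual preferences along with the noise.
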
  34. 34. Thank you