Collaborative Filtering
Recommender System
VIMALENDU SHEKHAR
MILIND GOKHALE
RENUKA DESHMUKH
Recommender Systems
 Subclass of information filtering systems that seek to predict the 'rating' or
'preference' that a user would give to an item.
 Helps in deciding what to wear, what to buy, what stocks to purchase, etc.
 Applied in a variety of domains such as movies, books, and research articles.
Evolution
 People relied on the recommendations from their peers.
 This method doesn’t take the personal preference of the user into account.
 It also limits the search space.
 Computer-based recommender systems overcome this by expanding the search
space and providing more finely tuned results.
Tasks of Recommender Systems
 Predict task – estimate the user’s preference for an item.
 Recommend task – produce the best ranked list of n items for the user’s need.
Collaborative Filtering
 Collaborative filtering is the process of filtering for information or patterns using
techniques involving collaboration among multiple agents, viewpoints, data
sources, etc.
 For recommender systems, collaborative filtering is a method of making automatic
predictions about the interests of a user by collecting preference information
from many users.
 Based on the idea that people who agreed in their evaluation of certain items in
the past are likely to agree again in the future.
Recommender Systems
User - User Collaborative Filtering
 Basic idea – find other users whose past rating behavior is similar to that of the
current user and use their ratings on other items to predict what the current user
will like.
 Required: Ratings matrix and similarity function that computes the similarity
between two users.
Idea
 The selection of neighbors can be random or based on a threshold value.
 User u’s prediction for item i is given by p_{u,i}.
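A common form of this prediction, sketched here on the assumption that a mean-centered, similarity-weighted average over the neighborhood is used (the exact weighting from the original slide is not reproduced):

    p_{u,i} = \bar{r}_u + \frac{\sum_{v \in N(u)} s(u,v)\,(r_{v,i} - \bar{r}_v)}{\sum_{v \in N(u)} |s(u,v)|}

where N(u) is the neighborhood of users similar to u, s(u,v) is the user-user similarity (e.g., Pearson correlation), and \bar{r}_u is user u's mean rating.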
Item-Item Collaborative Filtering
 Basic idea – recommend items that are similar to the user’s highly preferred items.
 Provides performance gains by lending itself well to pre-computing the similarity
matrix.
Idea
Prediction
 User u’s prediction for item i is given by p_{u,i}.
 Cosine similarity or conditional probability is used to compute item-item
similarity.
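A typical item-item form, sketched under the assumption that the prediction is a similarity-weighted average of the user's own ratings (the slide's exact formula is not reproduced):

    p_{u,i} = \frac{\sum_{j \in S(i)} s(i,j)\, r_{u,j}}{\sum_{j \in S(i)} |s(i,j)|},
    \qquad
    s(i,j) = \frac{\vec{r}_i \cdot \vec{r}_j}{\lVert \vec{r}_i \rVert\, \lVert \vec{r}_j \rVert}

where S(i) is the set of items rated by u that are most similar to i, and s(i,j) is the cosine similarity of the rating vectors of items i and j.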
Dimensionality Reduction
 Problem:
 User-User or Item-Item CF: the user-item ratings form a high-dimensional vector space with much redundancy.
 Information Retrieval: the term-document matrix is a high-dimensional representation of terms
and documents.
 Synonymy, polysemy, noise.
 Can we reduce the number of dimensions to a constant k?
 Truncated SVD – dimensionality reduction by singular value decomposition.
 Applications:
 Information retrieval: LSA/LSI (latent semantic analysis / indexing).
 CF
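A minimal Python sketch of truncated SVD on a toy ratings matrix (assumed layout: rows are users, columns are items; missing ratings are shown as zeros purely for illustration, whereas real systems impute means or factor only the observed entries):

    import numpy as np

    ratings = np.array([          # rows = users, columns = items
        [5, 4, 0, 1],
        [4, 5, 1, 0],
        [0, 1, 5, 4],
        [1, 0, 4, 5],
    ], dtype=float)

    k = 2                                        # number of latent dimensions to keep
    U, s, Vt = np.linalg.svd(ratings, full_matrices=False)
    U_k, s_k, Vt_k = U[:, :k], s[:k], Vt[:k, :]  # truncate to the top-k singular factors

    approx = U_k @ np.diag(s_k) @ Vt_k           # rank-k reconstruction of the ratings
    print(np.round(approx, 2))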
Probabilistic Methods
 The core idea of probabilistic methods is to compute either P(i|u), the probability
that user u will purchase or view item i, or the probability distribution P(r_{u,i}|u) over
user u’s rating of item i.
 Cross-Sell System:
 uses pairwise conditional probabilities with the naïve Bayes assumption to do
recommendation in unary e-commerce domains.
 Based on user purchase histories, the algorithm estimates P(a|b) (the probability that a
user purchases a given that they have purchased b) for each pair of items a, b. The
user’s currently-viewed item or shopping basket is combined with these pairwise
probabilities to recommend items optimizing the expected value of site-defined
objective functions.
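A minimal Python sketch of the pairwise estimate described above, assuming toy purchase histories (one set of item ids per user):

    from collections import Counter
    from itertools import permutations

    baskets = [                      # hypothetical purchase histories
        {"milk", "bread", "eggs"},
        {"milk", "bread"},
        {"bread", "eggs"},
    ]

    item_count = Counter()           # number of users who purchased b
    pair_count = Counter()           # number of users who purchased both a and b
    for basket in baskets:
        item_count.update(basket)
        pair_count.update(permutations(basket, 2))

    def p_a_given_b(a, b):
        # Estimate P(a|b): fraction of b's purchasers who also purchased a.
        return pair_count[(a, b)] / item_count[b] if item_count[b] else 0.0

    print(p_a_given_b("milk", "bread"))   # 2 of 3 bread buyers bought milk -> 0.667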
Probabilistic Matrix Factorization
 Probabilistic latent semantic analysis/indexing (PLSA/PLSI)
 PLSA decomposes the probability P(i|u) by introducing a set Z of latent factors.
Here z is a factor on the basis of which user (u) decides which item (i) to view or
purchase.
 P(i|u) is therefore decomposed as P(i|u) = \sum_{z \in Z} P(i|z)\, P(z|u).
 Users are thus represented as mixtures of preference profiles (feature
preferences), and each item preference is attributed to the user’s preference profiles
rather than directly to the user.
Û is the matrix of preference-profile mixtures for each user,
T̂ is the matrix of each preference profile’s probabilities of selecting the various items, and
Σ is a diagonal matrix with σ_z = P(z), giving an SVD-like decomposition Û Σ T̂ᵀ of the preference probabilities.
Hybrid Recommenders
 Hybrids can be particularly beneficial when the algorithms involved cover different use cases or
different aspects of the data set.
 7 Classes of Hybrid Recommenders
 Weighted – takes scores produced by several recommenders and combines them
 Switching – switch between different algorithms according to the context
 Mixed – present several recommenders’ results, but not combined into a single list
 Feature-combining – use multiple recommendation data sources as inputs to a single meta-recommender
algorithm
 Cascading – chain the algorithms (output of one to the other as input)
 Feature-augmenting – uses output of one algo as one of the inputs to the other algo
 Meta-level – train a model using one algo and give it as input to another algo
 Example: Netflix Prize – feature-weighted linear stacking;
uses functions g_j of item meta-features, such as number of ratings or genre, to alter the blending ratio of the
various algorithms’ predictions on an item-by-item basis
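In general form, feature-weighted linear stacking blends the component predictions p_k(u,i) with weights that are themselves linear functions of the meta-features g_j (a hedged sketch of the general idea, not the exact Netflix Prize formulation):

    \hat{r}_{u,i} = \sum_{k} w_k(u,i)\, p_k(u,i),
    \qquad
    w_k(u,i) = \sum_{j} v_{jk}\, g_j(u,i)

where the coefficients v_{jk} are learned from held-out data.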
Algorithm Selection
 User-based Algo: more tractable when there are more items than users
 Item-based Algo: more tractable when there are more users than items
 Minimal offline computation but higher online computation
 Matrix Factorization methods:
 - Require expensive offline model
 + Fast for online use
 + Reduced impact of ratings noise
 + Reduced impact of user rating on each others’ ratings
 Probabilistic Models: useful when the recommendation process should follow models of user
behavior.
Evaluating Recommender Systems
 It can be costly to try algorithms on real sets of users and measure the effects.
 Offline Algorithmic Evaluations:
 Pre-test algorithms prior to user testing.
 Beneficial for performing direct, objective comparisons of different algorithms in a
reproducible fashion.
Data Sets
 EachMovie: by the DEC Systems Research Center – 2.8M user ratings of movies
 MovieLens: 100K timestamped user ratings, 1M ratings, and 10M ratings plus 100K
timestamped records of users tagging movies.
 Jester: ratings of 100 jokes from 73,421 users between April 1999 and May 2003, and
ratings of 150 jokes from 63,974 users between November 2006 and May 2009
 BookCrossing: 1.1M ratings from 279K users for 271K books
 Netflix: 100M datestamped ratings of 17K movies from 480K users.
Offline Evaluation Structure
 The users in the data set are split into two groups: training set and test set.
 A recommender model is built against the training set.
 The users in the test set are then split into two parts: query set and target set.
 The recommender is given the query set as a user history and asked to
recommend items or to predict ratings for the items in the target set;
 it is then evaluated on how well its recommendations or predictions match
the ratings held out in the target set.
 This whole process is frequently repeated as in k-fold cross-validation by splitting
the users into k equal sets.
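A minimal Python sketch of this split, assuming a hypothetical data layout mapping each user id to a dict of item ratings:

    import random

    def split_users(ratings_by_user, test_fraction=0.2, query_fraction=0.5, seed=0):
        rng = random.Random(seed)
        users = list(ratings_by_user)
        rng.shuffle(users)
        n_test = int(len(users) * test_fraction)
        test_users, train_users = users[:n_test], users[n_test:]

        train = {u: ratings_by_user[u] for u in train_users}   # model is built on this
        query, target = {}, {}
        for u in test_users:
            items = list(ratings_by_user[u])
            rng.shuffle(items)
            n_query = int(len(items) * query_fraction)
            query[u] = {i: ratings_by_user[u][i] for i in items[:n_query]}   # shown as history
            target[u] = {i: ratings_by_user[u][i] for i in items[n_query:]}  # held out for scoring
        return train, query, target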
Prediction Accuracy: MAE
 (MAE) Mean Absolute Error:
 Example: 5-star scale [1, 5], an MAE of 0.7 means that the algorithm, on average,
was off by 0.7 stars.
 This is useful for understanding the results in a particular context, but makes it
difficult to compare results across data sets, as they have differing rating ranges.
 (NMAE) Normalized mean absolute error: divides by the range of possible ratings,
giving a common metric range of [0, 1].
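In standard form, with predictions p_{u,i}, held-out ratings r_{u,i}, n test ratings, and rating range [r_min, r_max]:

    \mathrm{MAE} = \frac{1}{n} \sum_{u,i} \left| p_{u,i} - r_{u,i} \right|,
    \qquad
    \mathrm{NMAE} = \frac{\mathrm{MAE}}{r_{\max} - r_{\min}}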
Prediction Accuracy: RMSE
 (RMSE) Root Mean Square Error: Amplifies the larger absolute errors
 Netflix Prize: $1M prize was awarded for a 10% improvement in RMSE over Netflix’s
internal algorithm.
 Further, RMSE can also be normalized like NMAE by dividing by the rating range.
 Which of the three metrics to use depends on how the results are to be
compared.
 These metrics are mostly used for evaluation of predict tasks.
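For reference, a standard formulation of RMSE in the same notation as MAE above:

    \mathrm{RMSE} = \sqrt{ \frac{1}{n} \sum_{u,i} \left( p_{u,i} - r_{u,i} \right)^{2} }

Squaring the errors before averaging is what amplifies the larger absolute errors.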
Accuracy over time
 Temporal versions of MAE and RMSE have been introduced to measure the accuracy of
recommender systems over time, as more users are added to the
system.
 Hence timestamped datasets prove very useful for measuring accuracy
over time.
n_t – the number of ratings computed up through time t
t_{u,i} – the time of rating r_{u,i}
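Using these definitions, the time-averaged variant of MAE can be written (a sketch of the standard form) as:

    \mathrm{MAE}_t = \frac{1}{n_t} \sum_{t_{u,i} \le t} \left| p_{u,i} - r_{u,i} \right|

and the analogous time-averaged RMSE replaces the absolute error with the squared error under a square root.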
Decision Support Metrics
 This framework examines the capacity of a retrieval system to accurately identify
resources relevant to a query, measuring separately its capacity to find all relevant
items and its capacity to avoid returning irrelevant items.
 A confusion matrix is used to measure this.
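From the confusion matrix counts of true positives (TP), false positives (FP), and false negatives (FN), the two capacities are measured as:

    \mathrm{Precision} = \frac{TP}{TP + FP},
    \qquad
    \mathrm{Recall} = \frac{TP}{TP + FN}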
Decision Support Metrics
 High-precision system: Example – movie recommendation, where users see only a few
suggestions and each should be relevant.
 High-recall system: Example – legal precedent research, where missing a relevant item is costly.
Online Evaluation
 Offline evaluation, though useful, is limited to operating on past data.
 Recommender systems with similar metric performance can still give different
results, and a decrease in the error may or may not make the system better at
meeting user needs.
 For this, online user testing is needed.
 Field Trials: here the recommender is deployed in a live system and users’
interactions with the system are recorded.
 Virtual Lab Studies: these generally have a small user base who are invited to
participate, instead of a live user base.
Building a Data Set
 The need for preference data can be decomposed into two types of information
needs:
 User information: user’s preferences
 Item information: what kinds of users like or dislike each item
 User–item preferences: Set of characteristics, user preferences for those
characteristics, and those characteristics’ applicability to various items.
 Item–item model: What items are liked by the same users as well as the current
user’s preferences.
Cold-Start Problem
 Problem of providing recommendations when there is not yet data available
 Item cold-start: A new item has been added to the database (e.g., when a new
movie or book is released) but has not yet received enough ratings to be
recommendable.
 User cold-start: A new user has joined the system but their preferences are not yet
known
Sources of Preference Data
 Preference data (ratings) comes from two primary sources.
 Explicit ratings: Preferences the user has explicitly stated for particular items.
 Implicit ratings: Preferences inferred by the system from observable user activity, such as
purchases or clicks.
 Many recommender systems obtain ratings by having users explicitly state their
preferences for items. These stated preferences are then used to estimate the
user’s preference for items they have not rated.
 Drawback: There can, for many reasons, be a discrepancy between what the users say
and what they do.
Preferences can also be inferred from user behavior
 Usenet (reading domain): time spent reading; saving or replying; copying text into new articles; mentions of URLs.
 Intelligent Music Mgmt System: infers the user’s preference for various songs in their library as they skip them or allow them to play to completion.
 E-commerce domain: page views; item purchases as gifts or for personal use; shared accounts can be misleading.
Types of preference data
Rating Scales
 GroupLens used a 5-star scale
 Jester uses a semi-continuous −10 to +10 graphical scale
 Ringo used a 7-star scale
 Pandora music uses a “like”/“dislike” method
Dealing with Noise
 Noise in ratings can be introduced by normal human error and other factors.
 Natural noise in ratings can be detected by asking users to re-rate items.
 Another approach is to detect and ignore noisy ratings: each rating is compared
to the user’s predicted preference for that item, and ratings whose difference from
the prediction exceeds some threshold are discarded from the prediction and
recommendation process.
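A minimal Python sketch of the threshold-based filtering described above, assuming a hypothetical predict(user, item) function from an existing recommender model:

    def filter_noisy_ratings(ratings, predict, threshold=2.0):
        # ratings: iterable of (user, item, rating) triples.
        kept = []
        for user, item, rating in ratings:
            # Discard ratings that disagree with the user's predicted preference
            # by more than the chosen threshold.
            if abs(rating - predict(user, item)) <= threshold:
                kept.append((user, item, rating))
        return kept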
Thank you
