Finance: Boosting is used with machine learning models to automate critical tasks, including fraud detection, pricing analysis, and more. For example, boosting methods in credit card fraud detection and financial-product pricing analysis improve the accuracy of analyzing massive data sets, minimizing financial losses.
Recommender systems are software tools and techniques providing suggestions for items to be of interest to a user. Recommender systems have proved in recent years to be a valuable means of helping Web users by providing useful and effective recommendations or suggestions.
The World Wide Web is moving from a Web of hyper-linked documents to a Web of linked data. Thanks to the Semantic Web technology stack and to the more recent Linked Open Data (LOD) initiative, a vast amount of RDF data has been published in freely accessible datasets, connected with each other to form the so-called LOD cloud. As of today, plenty of RDF data is available in the Web of Data, but only a few applications really exploit its potential. The availability of such data is certainly an opportunity to feed personalized information access tools such as recommender systems. We will show how to plug Linked Open Data into a recommendation engine in order to build a new generation of LOD-enabled applications.
(Lecture given @ the 11th Reasoning Web Summer School - Berlin - August 1, 2015)
Artificial Intelligence: Case-based & Model-based Reasoning – The Integral Worm
This presentation covers case-based and model-based reasoning for artificial intelligence. Topics covered are as follows: case-based reasoning; case-based reasoning components (case base, retriever, adapter, refiner, executor, and evaluator); and model-based reasoning.
How to Win Machine Learning Competitions? – HackerEarth
This presentation was given by Marios Michailidis (a.k.a. Kazanova), current Kaggle rank #3, to help the community learn machine learning better. It comprises useful ML tips and techniques to perform better in machine learning competitions. Read the full blog: http://blog.hackerearth.com/winning-tips-machine-learning-competitions-kazanova-current-kaggle-3
Slides of my talk I gave at the big data conference: http://www.globalbigdataconference.com/santa-clara/5th-annual-global-big-data-conference/schedule-85.html
Data Analyst, Data Scientist, and Data Engineer are three distinct roles within the field of data and analytics, each with its own set of responsibilities and skill requirements. Here's a brief overview of each role:
In this talk, I explain feature selection and extraction, with emphasis on image processing. Methods such as Principal Component Analysis and Canonical Analysis are explained with numerical examples.
Basic functions and terminology of recommendation systems, with some algorithmic implementations on sample datasets for understanding. All layers of the RS framework are explained.
Techniques to optimize the PageRank algorithm usually fall into two categories: reducing the work per iteration, and reducing the number of iterations. These goals are often at odds with one another. Skipping computation on vertices which have already converged can save iteration time. Skipping in-identical vertices (those with the same in-links) reduces duplicate computations and thus can also reduce iteration time. Road networks often have chains which can be short-circuited before PageRank computation to improve performance, since the final ranks of chain nodes can be easily calculated; this can reduce both the iteration time and the number of iterations. If a graph has no dangling nodes, the PageRank of each strongly connected component can be computed in topological order, which can reduce the iteration time and the number of iterations, and also enables multi-iteration concurrency in the PageRank computation. The combination of all of the above methods is the STICD algorithm [sticd]. For dynamic graphs, unchanged components whose ranks are unaffected can be skipped altogether.
Adjusting primitives for graphs: SHORT REPORT / NOTES – Subhajit Sahu
Graph algorithms such as PageRank commonly operate on Compressed Sparse Row (CSR), an adjacency-list based graph representation that is compact and efficient to traverse.
Multiply with different modes (map)
1. Performance of sequential vs OpenMP-based vector multiply.
2. Comparing various launch configs for CUDA-based vector multiply.
Sum with different storage types (reduce)
1. Performance of vector element sum using float vs bfloat16 as the storage type.
Sum with different modes (reduce)
1. Performance of sequential vs OpenMP-based vector element sum.
2. Performance of memcpy-based vs in-place CUDA vector element sum.
3. Comparing various launch configs for CUDA-based vector element sum (memcpy).
4. Comparing various launch configs for CUDA-based vector element sum (in-place).
Sum with in-place strategies of CUDA mode (reduce)
1. Comparing various launch configs for CUDA-based vector element sum (in-place).
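For reference, the two primitives under test can be sketched in plain Python, with the parallel mode simulated by chunked partial sums (the same decomposition an OpenMP or CUDA reduction uses); function names are illustrative only.

```python
def vector_multiply(x, y):
    """Map primitive: elementwise product of two vectors."""
    return [a * b for a, b in zip(x, y)]

def sum_sequential(x):
    """Reduce primitive, sequential mode: a single running accumulator."""
    total = 0.0
    for v in x:
        total += v
    return total

def sum_chunked(x, chunks=4):
    """Reduce primitive, 'parallel' mode: independent per-chunk partial
    sums combined at the end -- the decomposition OpenMP/CUDA reductions
    use, simulated here sequentially."""
    n = len(x)
    step = (n + chunks - 1) // chunks
    partials = [sum(x[i:i + step]) for i in range(0, n, step)]
    return sum(partials)
```

Both reduce modes compute the same mathematical sum; they differ only in how the accumulation is decomposed, which is what the launch-config experiments vary.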
Explore our comprehensive data analysis project presentation on predicting product ad campaign performance. Learn how data-driven insights can optimize your marketing strategies and enhance campaign effectiveness. Perfect for professionals and students looking to understand the power of data analysis in advertising. for more details visit: https://bostoninstituteofanalytics.org/data-science-and-artificial-intelligence/
5. RS – Approaches
• Content-based: how similar is 𝑖 to items 𝑢 has rated/liked in the past?
  – Use metadata for measuring similarity.
  + Works even when no ratings are available on the affected items.
  − Requires metadata!
• Collaborative Filtering: identify items (users) with their rating vector; no need for metadata; but cold-start is a problem.
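A minimal sketch of the content-based idea, assuming the metadata takes the form of tag sets (e.g., genres) and using Jaccard similarity; the function names and the averaging scheme are illustrative assumptions, not the lecture's prescription.

```python
def jaccard(a, b):
    """Jaccard similarity of two tag sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def content_based_score(candidate_tags, liked_items_tags):
    """Score a candidate item by its average metadata similarity to the
    items the user has liked (hypothetical scoring rule)."""
    sims = [jaccard(candidate_tags, t) for t in liked_items_tags]
    return sum(sims) / len(sims) if sims else 0.0
```

Note that nothing here uses other users' ratings, which is why the approach works with no ratings at all but fails without metadata.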
6. RS – Approaches
• CF can be memory-based (as sketched on p5): item 𝑖's characteristics are captured by the ratings it has received (its rating vector).
• Or it can be model-based: model user/item behavior via latent factors (to be learned from data).
  – Dimensionality reduction.
  – The original ratings matrix is usually (very) low rank. Matrix completion:
    • using Singular Value Decomposition (SVD).
    • using matrix factorization (MF) [and variants].
• MovieLens – example of RS using CF.
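The SVD route to matrix completion can be sketched with NumPy: mean-impute the missing entries, then project onto a rank-k approximation and read predictions from the reconstruction. This is a toy illustration only; practical MF methods fit latent factors to the observed entries alone.

```python
import numpy as np

def svd_complete(R, k=2):
    """R: ratings matrix with np.nan marking missing entries.
    Mean-impute per user (row), then keep the top-k singular triplets.
    Returns the dense rank-k reconstruction (illustrative sketch)."""
    M = R.copy()
    row_means = np.nanmean(M, axis=1, keepdims=True)
    missing = np.isnan(M)
    # Fill each missing cell with its user's mean rating.
    M[missing] = np.broadcast_to(row_means, M.shape)[missing]
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
```

The low-rank assumption on the slide is what justifies truncating at small k: most of the variation in the ratings matrix is captured by a few latent dimensions.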
8. Key concepts/questions
• How is user feedback expressed: ratings or implicit?
• How to measure similarity?
• How many nearest neighbors to pick (if memory- or neighborhood-based)?
• How to predict unknown ratings?
• Distinguished (also called active) user and (target) item.
9. A Naïve Algorithm (memory-based)
• Find the top-ℓ most similar neighbors to the distinguished user 𝑢 (using a chosen similarity or proximity measure).
• ∀ item 𝑖 rated by sufficiently many of these, compute 𝑟𝑢𝑖 by aggregating the ratings of the chosen neighbors above.
• Sort items by predicted rating and recommend the top-𝑘 items to 𝑢.
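The three steps above can be sketched as follows, assuming ratings are stored as nested dicts (user → item → rating), cosine similarity as the proximity measure, and a simple average as the aggregate; all names are illustrative.

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse rating dicts (item -> rating)."""
    common = set(u) & set(v)
    dot = sum(u[i] * v[i] for i in common)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def recommend(ratings, user, ell=2, k=2, min_support=1):
    """Naive memory-based RS: pick top-ell neighbors, average their
    ratings per unseen item, return the top-k items by predicted rating."""
    nbrs = sorted((v for v in ratings if v != user),
                  key=lambda v: cosine(ratings[user], ratings[v]),
                  reverse=True)[:ell]
    preds = {}
    candidates = {i for v in nbrs for i in ratings[v]} - set(ratings[user])
    for i in candidates:
        rs = [ratings[v][i] for v in nbrs if i in ratings[v]]
        if len(rs) >= min_support:        # "rated by sufficiently many"
            preds[i] = sum(rs) / len(rs)
    return sorted(preds, key=preds.get, reverse=True)[:k]
```

The `min_support` threshold plays the role of "rated by sufficiently many of these" in the slide; the aggregation step is the part the later slides refine.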
10. An Example

        𝑖1  𝑖2  𝑖3  𝑖4  𝑖5  𝑖6  𝑖7
𝑢1 (A)   4   –   –   5   1   –   –
𝑢2 (B)   5   5   4   –   –   –   –
𝑢3 (C)   –   –   –   2   4   5   –
𝑢4 (D)   –   3   –   –   –   –   3

• Jaccard(A,B) = 1/5 < 2/4 = Jaccard(A,C)!
• cos(A,B) = (4×5)/(|A|·|B|) ≈ 0.380 > 0.322 ≈ cos(A,C). – OK, but ignores internal "rating scales" of easy/hard graders.
• See the Rajaraman et al. book for "rounded" Jaccard/Cosine.
• A more principled approach: subtract from each rating the corresponding user's mean rating, then apply Jaccard/cosine.
11. An Example

        𝑖1    𝑖2    𝑖3    𝑖4    𝑖5    𝑖6    𝑖7
𝑢1    2/3     –     –    5/3  −7/3    –     –
𝑢2    1/3   1/3  −2/3    –     –     –     –
𝑢3     –     –     –   −5/3   1/3   4/3    –
𝑢4     –     0     –     –     –     –     0

• See what just happened to the ratings!
• User behaviors and items are more well-separated.
• Cosine can now be + or −: check (A,B) and (A,C).
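The numbers in this example can be checked directly. A small sketch (users A, B, C as 𝑢1–𝑢3 above) computing raw and mean-centered cosine:

```python
import math

# The example ratings matrix as sparse dicts (missing entries omitted).
ratings = {
    'A': {'i1': 4, 'i4': 5, 'i5': 1},
    'B': {'i1': 5, 'i2': 5, 'i3': 4},
    'C': {'i4': 2, 'i5': 4, 'i6': 5},
}

def cosine(u, v):
    """Cosine over sparse rating dicts; norms use all of each user's ratings."""
    common = set(u) & set(v)
    dot = sum(u[i] * v[i] for i in common)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv)

def mean_center(u):
    """Subtract the user's mean rating from each of their ratings."""
    m = sum(u.values()) / len(u)
    return {i: r - m for i, r in u.items()}

raw_ab = cosine(ratings['A'], ratings['B'])   # ~0.380, as on the slide
raw_ac = cosine(ratings['A'], ratings['C'])   # ~0.322
cen_ab = cosine(mean_center(ratings['A']), mean_center(ratings['B']))  # positive
cen_ac = cosine(mean_center(ratings['A']), mean_center(ratings['C']))  # negative
```

After mean-centering, (A,B) becomes positive and (A,C) negative: the centered signal separates agreeing and disagreeing users, which raw cosine hides behind each user's rating scale.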
12. Prediction using Memory/Neighborhood-based approaches
• A popular approach – using the Pearson correlation coefficient.
• 𝑟𝑢𝑖 = 𝑟̄𝑢 + 𝐾 · Σ𝑣∈𝑁(𝑢)∩𝑈(𝑖) 𝑤𝑢𝑣 · (𝑟𝑣𝑖 − 𝑟̄𝑣), where
  𝑤𝑢𝑣 = Σ𝑗∈𝐼(𝑢)∩𝐼(𝑣) (𝑟𝑢𝑗 − 𝑟̄𝑢)(𝑟𝑣𝑗 − 𝑟̄𝑣) / [√(Σ𝑗∈𝐼(𝑢)∩𝐼(𝑣) (𝑟𝑢𝑗 − 𝑟̄𝑢)²) · √(Σ𝑗∈𝐼(𝑢)∩𝐼(𝑣) (𝑟𝑣𝑗 − 𝑟̄𝑣)²)]
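A sketch of the formula above, assuming nested-dict ratings and taking 𝐾 = 1/Σ|𝑤𝑢𝑣| (a common normalization choice, assumed here). The means 𝑟̄𝑢, 𝑟̄𝑣 are each user's overall mean; some variants center by the mean over co-rated items only.

```python
import math

def mean(u):
    return sum(u.values()) / len(u)

def pearson(u, v):
    """Pearson correlation of rating dicts u, v over co-rated items,
    centered by each user's overall mean (one of several variants)."""
    common = set(u) & set(v)
    if len(common) < 2:
        return 0.0
    mu, mv = mean(u), mean(v)
    num = sum((u[j] - mu) * (v[j] - mv) for j in common)
    du = math.sqrt(sum((u[j] - mu) ** 2 for j in common))
    dv = math.sqrt(sum((v[j] - mv) ** 2 for j in common))
    return num / (du * dv) if du and dv else 0.0

def predict(ratings, u, i):
    """r_ui = mean(u) + K * sum_v w_uv * (r_vi - mean(v)) over users v
    who rated i, with K = 1 / sum_v |w_uv| (assumed normalization)."""
    terms = []
    for v, rv in ratings.items():
        if v == u or i not in rv:
            continue
        w = pearson(ratings[u], rv)
        if w != 0.0:
            terms.append((w, rv[i] - mean(rv)))
    if not terms:
        return mean(ratings[u])
    K = 1.0 / sum(abs(w) for w, _ in terms)
    return mean(ratings[u]) + K * sum(w * d for w, d in terms)
```

Negatively correlated neighbors pull the prediction in the opposite direction of their own (mean-centered) rating, which is exactly what the signed weights in the formula encode.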
13. User-User vs Item-Item
• User-User CF: what we just discussed!
• Item-Item – dual in principle: find the items most similar to the distinguished item 𝑖; for every user 𝑢 who did not rate the distinguished item but rated sufficiently many from the similarity group, compute 𝑟𝑢𝑖.
• In practice, item-item has been found to be better than user-user.
14. Simpler Alternatives for Rating Estimation
• Simple average of ratings by most similar neighbors.
• Weighted average.
• User's mean plus an offset corresponding to the weighted average of offsets by most similar neighbors (Pearson!).
• Or use the popular vote by most similar neighbors: e.g., 𝑢 has 5 most similar neighbors who have rated 𝑖.
  – 𝑣1, 𝑣2 rated 1; 𝑣3 rated 3; 𝑣4 rated 4; 𝑣5 rated 5.
  – Simple majority: 𝑟𝑢𝑖 = 1.
  – Suppose 𝑤𝑢𝑣1 = 𝑤𝑢𝑣2 = 0.2; 𝑤𝑢𝑣3 = 0.3; 𝑤𝑢𝑣4 = 0.8; 𝑤𝑢𝑣5 = 1.0. Then 𝑟𝑢𝑖 = 5 (rating 5 carries total weight 1.0, the largest). Tie-breaking is arbitrary.
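The popular-vote alternative amounts to summing neighbor weights per rating value and taking the heaviest bucket; a sketch (function name illustrative) reproducing the example above:

```python
from collections import defaultdict

def weighted_vote(neighbor_ratings, weights):
    """Pick the rating value with the largest total neighbor weight.
    neighbor_ratings: neighbor -> rating; weights: neighbor -> similarity.
    Ties are broken arbitrarily (whichever max() returns first)."""
    score = defaultdict(float)
    for v, r in neighbor_ratings.items():
        score[r] += weights[v]
    return max(score, key=score.get)
```

With uniform weights this degenerates to the simple-majority vote from the slide (𝑟𝑢𝑖 = 1); with the given weights the single high-similarity neighbor wins (𝑟𝑢𝑖 = 5).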
15. Item-based CF
• Dual to user-based CF, in principle.
• "People who bought 𝑆 also bought 𝑇".
• Natural connection to association rules (each user = a transaction).
• Predict the unknown rating of user 𝑢 on item 𝑖 as the aggregate of ratings by 𝑢 on items similar to 𝑖.
• E.g., using mean-centering and Pearson correlation for item-item similarity,
  𝑟𝑢𝑖 = 𝑟̄𝑖 + 𝐾 · Σ𝑗∈𝐼(𝑢)∩𝑁(𝑖) 𝑤𝑖𝑗 · (𝑟𝑢𝑗 − 𝑟̄𝑗)
  where 𝑟̄𝑖 = mean rating of 𝑖 by various users, 𝑤𝑖𝑗 = similarity b/w 𝑖 and 𝑗, and 𝐾 – the usual normalization factor.
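A sketch of the item-based prediction above, assuming nested-dict ratings, mean-centered (Pearson-style) item-item similarity over co-raters, and 𝐾 = 1/Σ|𝑤𝑖𝑗| (an assumed normalization; function names illustrative):

```python
import math

def item_mean(ratings, i):
    """Mean rating of item i; ratings: user -> {item: rating}."""
    vals = [r[i] for r in ratings.values() if i in r]
    return sum(vals) / len(vals)

def item_sim(ratings, i, j):
    """Pearson-style item-item similarity: mean-center each item's
    ratings, correlate over the users who rated both."""
    co = [u for u, r in ratings.items() if i in r and j in r]
    if len(co) < 2:
        return 0.0
    mi, mj = item_mean(ratings, i), item_mean(ratings, j)
    num = sum((ratings[u][i] - mi) * (ratings[u][j] - mj) for u in co)
    di = math.sqrt(sum((ratings[u][i] - mi) ** 2 for u in co))
    dj = math.sqrt(sum((ratings[u][j] - mj) ** 2 for u in co))
    return num / (di * dj) if di and dj else 0.0

def predict_item_based(ratings, u, i):
    """r_ui = mean(i) + K * sum_j w_ij * (r_uj - mean(j)), with j ranging
    over items u rated; K = 1 / sum_j |w_ij| (assumed normalization)."""
    terms = []
    for j in ratings[u]:
        if j == i:
            continue
        w = item_sim(ratings, i, j)
        if w != 0.0:
            terms.append((w, ratings[u][j] - item_mean(ratings, j)))
    if not terms:
        return item_mean(ratings, i)
    K = 1.0 / sum(abs(w) for w, _ in terms)
    return item_mean(ratings, i) + K * sum(w * d for w, d in terms)
```

Compare with the user-user predictor: the roles of users and items are swapped, which is exactly the duality the slide asserts.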
16. Item-based CF Computation Illustrated
• Similarities: computing sim. b/w all pairs of items is prohibitive!
• But do we need to?
• How efficiently can we compute the sim. of all pairs of items for which the sim. is positive?
[Figure: a sparse user–item ratings matrix, highlighting item 𝑖's column and user 𝑢's ratings (marked X).]
17. Item-based CF – Recommendation Generation
[Figure: user 𝑢's rated items (marked X) in the ratings matrix, with the question "similar items?" posed for each, relative to item 𝑖.]
• How efficiently can we generate recommendations for a given user?
18. Some empirical facts re. user-based vs. item-based CF
• User profiles are typically thinner than item profiles; depends on the application domain.
  – Certainly holds for movies (Netflix).
• As users provide more ratings, user-user sim. can change more dynamically than item-item sim.
• Can we precompute item-item sim. and speed up prediction computation?
• What about refreshing sim. against updates? Can we do it incrementally? How often should we do this?
• Why not do this for user-user?
19. User & Item-based CF are both personalized
• Non-personalized would estimate an unknown rating as a global average.
• Every user gets the same recommendation list, modulo items s/he may have already rated.
• Personalized clearly leads to better predictions.