Collaborative Filtering 1: User-based CF

Recommender Systems &
Collaborative Filtering
Yusuke Yamamoto
Faculty of Informatics
Senior Lecturer
yusuke_yamamoto@acm.org
Data Engineering （Recommender System 1）
2019.10.16

1 Introduction to
Recommender Systems

Recommender System
Predicts a user’s preferences (user model) and
decides which items should be recommended.

The most familiar recommender system: Amazon(1/2)
Image source: https://www.amazon.co.jp/
Recommend items which users will like

The most familiar recommender system: Amazon(2/2)
Image source: https://www.amazon.co.jp/
Recommend items related to a target item

SNS x Recommender System
User recommendation
on Twitter
Event recommendation
on Facebook

Movie recommendation from Netflix

Apple Music x Recommender System

Computational Ads
Image source: http://www.apple-style.com

Recommender systems appear everywhere!

Why use recommender systems?
Value for users
● Find things that are interesting
● Narrow down the set of choices
Value for providers
● Increase trust and customer loyalty
● Increase sales, click rates, conversions, etc.
● Discover new things
● Opportunities for promotion

Definition of Recommender System
Explicit preference info.
● Favorite artist setting
● Favorite genre setting
● Favorite brand setting
● Rating, Comment, Retweet, …
＋
Implicit preference info.
● Purchase
● Dwell time
● Clickthrough
● Bookmark
● …

Definition of Recommender System
● Ad
● Music
● Product
● Web page
● User
● Event
● …

Paradigms of Recommender System
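Source: PPT slides for Dietmar Jannach’s “Recommender Systems: An Introduction”
[Figure: a Recommender System takes the inputs below and outputs a recommendation list for a target user]
Inputs
● User profile & context (user model)
● Community data
● Item features
Output: Recommendation list for a target user
item    score
item1   0.9
item2   1
item3   0.3
…       …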

Three main approaches to recommendation
Each approach decides which items should be recommended:
Collaborative filtering
based on past behavior logs of similar users
Content-based filtering
based on item features and their metadata
Knowledge-based filtering
based on preference info. that users explicitly provide

Problem Definition
Input
§ User u’s behavior data set B_u = {b_1, b_2, …, b_n}
§ Item set I = {i_1, i_2, …, i_m}
Output
Ranked list of the item set I (∀ i ∈ I), based on Rel(p_u, i)
where
§ p_u: user u’s profile (user model)
§ Rel(p_u, i): relevance between p_u and item i
Point
● How to model user profiles?
● How to compute relevance?
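
The output step (ranking the item set by Rel(p_u, i)) can be sketched in Python as follows. This is only an illustration under stated assumptions: the names `rank_items` and `lookup_rel` are hypothetical, and the user model here is a toy lookup table of scores, not a method from the lecture.

```python
from typing import Callable, Dict, List

def rank_items(profile: Dict[str, float], items: List[str],
               rel: Callable[[Dict[str, float], str], float]) -> List[str]:
    """Return the items sorted by Rel(p_u, i), highest relevance first."""
    return sorted(items, key=lambda item: rel(profile, item), reverse=True)

# Hypothetical user model p_u: a lookup table of relevance scores per item.
profile = {"item1": 0.9, "item2": 1.0, "item3": 0.3}

def lookup_rel(p: Dict[str, float], item: str) -> float:
    # Toy relevance function: read the stored score, 0.0 for unknown items.
    return p.get(item, 0.0)

ranked = rank_items(profile, ["item1", "item2", "item3"], lookup_rel)
print(ranked)  # ['item2', 'item1', 'item3']
```

Any real recommender differs in how `profile` is built and how the relevance function is computed, which are exactly the two open questions listed under Point.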

Content list of this lecture
1. Collaborative Filtering (CF)
2. Content-based Filtering
3. Link analysis
4. Advanced CF
Lecture + programming work,
so that you can see how the methods work

References for this lecture
Image source: https://www.amazon.co.jp/

2 Collaborative Filtering – Part 1

Collaborative Filtering (CF)
Approach
Uses the preferences of a community (community data) to
recommend items
Basic assumptions
• Users give appropriate ratings to items
• Patterns in the rating data help us predict ratings
Practical points
• Large commercial e-commerce sites use CF
• Well understood
• Applicable in many domains, as long as rating data can be
obtained

Example
Q. How much does Alice like Item5?
        Item1  Item2  Item3  Item4  Item5
Alice   ✓      ✓      ✓      ✓      ?
✓: Items purchased by Alice and her ratings
?: Un-purchased item

Let’s observe other users’ ratings
Q. Can we predict Alice’s rating using others’ ratings?
        Item1  Item2  Item3  Item4  Item5
Alice   5      3      4      4      ?
User1   3      1      2      3      3
User2   4      3      4      3      5
User3   3      3      1      5      4
User4   1      5      5      2      1

Some users’ ratings are similar to Alice’s; others are dissimilar.

Will Alice give about 5 to Item5?

User-based Collaborative Filtering
Idea: Use the ratings of users with similar preferences
Point
1. How to compute user similarity?
2. How do we combine the ratings of the similar users
to predict Alice’s rating?
3. Which/how many similar users’ ratings to consider?

Similarity between users（1/3）
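Pearson Correlation Coefficient
$$
sim(u_a, u_b) = \frac{\sum_{i \in I} (r_{u_a,i} - \bar{r}_{u_a})(r_{u_b,i} - \bar{r}_{u_b})}{\sqrt{\sum_{i \in I} (r_{u_a,i} - \bar{r}_{u_a})^2}\,\sqrt{\sum_{i \in I} (r_{u_b,i} - \bar{r}_{u_b})^2}}
$$
$u_a, u_b$: Users a, b
$r_{u_a,i}$: User a’s rating of item i
$I$: Item set
$\bar{r}_{u_a}, \bar{r}_{u_b}$: Average ratings of users a, b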
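
The Pearson correlation between two users can be sketched in plain Python as below. The function name `pearson_sim` and the dictionary layout are illustrative assumptions. The sketch also assumes that the sums and the averages run over the items both users have rated (Item1 to Item4 for Alice), which is the reading that reproduces the similarity values quoted in this lecture.

```python
from math import sqrt
from typing import Dict

def pearson_sim(ratings_a: Dict[str, float], ratings_b: Dict[str, float]) -> float:
    """Pearson correlation over the items rated by both users."""
    common = [item for item in ratings_a if item in ratings_b]
    if len(common) < 2:
        return 0.0  # not enough overlap to measure a correlation
    # Assumption: averages are taken over the co-rated items only.
    mean_a = sum(ratings_a[i] for i in common) / len(common)
    mean_b = sum(ratings_b[i] for i in common) / len(common)
    num = sum((ratings_a[i] - mean_a) * (ratings_b[i] - mean_b) for i in common)
    den = (sqrt(sum((ratings_a[i] - mean_a) ** 2 for i in common))
           * sqrt(sum((ratings_b[i] - mean_b) ** 2 for i in common)))
    return num / den if den else 0.0

# Rating table from the example (Alice has not rated Item5).
ratings = {
    "Alice": {"Item1": 5, "Item2": 3, "Item3": 4, "Item4": 4},
    "User1": {"Item1": 3, "Item2": 1, "Item3": 2, "Item4": 3, "Item5": 3},
    "User2": {"Item1": 4, "Item2": 3, "Item3": 4, "Item4": 3, "Item5": 5},
    "User3": {"Item1": 3, "Item2": 3, "Item3": 1, "Item4": 5, "Item5": 4},
    "User4": {"Item1": 1, "Item2": 5, "Item3": 5, "Item4": 2, "Item5": 1},
}

sims = {name: pearson_sim(ratings["Alice"], r)
        for name, r in ratings.items() if name != "Alice"}
for name, s in sims.items():
    print(f"sim(Alice, {name}) = {s:.2f}")
```

Returning 0.0 when the overlap is too small or a user’s ratings have no variance is a design choice of this sketch; other conventions are possible.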

        Item1  Item2  Item3  Item4
Alice   5      3      4      4
User1   3      1      2      3
User2   4      3      4      3      sim(Alice, User2) = 0.71
User3   3      3      1      5
User4   1      5      5      2      sim(Alice, User4) = -0.79

sim = ?
Let’s calculate
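
As a worked instance of the Pearson formula, sim(Alice, User1) can be computed from Item1 to Item4. This sketch assumes the averages are taken over those four co-rated items (Alice: 4, User1: 2.25), which reproduces the value 0.85 reported for User1 in this lecture.

$$
\begin{aligned}
\text{numerator} &= (5-4)(3-2.25) + (3-4)(1-2.25) + (4-4)(2-2.25) + (4-4)(3-2.25) \\
&= 0.75 + 1.25 + 0 + 0 = 2.0 \\
\text{denominator} &= \sqrt{1^2 + (-1)^2 + 0^2 + 0^2}\,\sqrt{0.75^2 + (-1.25)^2 + (-0.25)^2 + 0.75^2} \\
&= \sqrt{2}\,\sqrt{2.75} \approx 2.345 \\
sim(\text{Alice}, \text{User1}) &= \frac{2.0}{2.345} \approx 0.85
\end{aligned}
$$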

Pearson correlation（1/2）
A measure of the linear correlation between
two variables X and Y
(It takes differences in rating behavior into account)
[Figure: rating scores (axis 0 to 6) of Alice, User1, and User4]
sim(Alice, User1) = 0.85
sim(Alice, User4) = -0.79

Pearson correlation（2/2）
A measure of the linear correlation between
two variables X and Y
(It takes differences in rating behavior into account)
[Figure: rating scores (axis 0 to 6) of Alice, User1, and User2]
sim(Alice, User2) = 0.71

Predicting rating scores based on user similarity（1/3）
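A typical prediction function
$$
\mathit{predict}(u_a, i) = \bar{r}_{u_a} + \frac{\sum_{u \in U_a} sim(u_a, u) \cdot (r_{u,i} - \bar{r}_{u})}{\sum_{u \in U_a} sim(u_a, u)}
$$
$u_a$: Target user a
$i$: Target item i
$r_{u,i}$: Rating score of user u for item i
$\bar{r}_{u_a}$: User a’s average rating score
$\bar{r}_{u}$: User u’s average rating score
$U_a$: A set of users similar to $u_a$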

        Item5  sim   Average rating
Alice   ?      1     4
User1   3      0.85  2.4             (similar user)
User2   5      0.71  3.8             (similar user)
Predicted rating score of Alice for Item5:
$$
4.0 + \frac{0.85 \times (3 - 2.4) + 0.71 \times (5 - 3.8)}{0.85 + 0.71} = 4.87
$$
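
The worked example can be reproduced with a short Python sketch. The function name `predict` and the tuple layout of `neighbors` are illustrative assumptions; the numbers are the ones shown in the table (similarity, rating for Item5, and average rating of each similar user).

```python
from typing import List, Tuple

def predict(target_mean: float,
            neighbors: List[Tuple[float, float, float]]) -> float:
    """Predict a rating from similar users.

    neighbors: list of (similarity, rating_for_item, neighbor_average).
    Each neighbor contributes its deviation from its own average rating,
    weighted by its similarity to the target user.
    """
    num = sum(sim * (rating - mean) for sim, rating, mean in neighbors)
    den = sum(sim for sim, _, _ in neighbors)
    if den == 0:
        return target_mean  # fall back to the target user's average
    return target_mean + num / den

# (sim, rating for Item5, average rating) for User1 and User2.
neighbors = [(0.85, 3, 2.4), (0.71, 5, 3.8)]
predicted = predict(4.0, neighbors)
print(round(predicted, 2))  # 4.87
```

Falling back to the target user’s average when the similarities sum to zero is a choice made for this sketch, not something stated in the lecture.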

Q. How to choose similar users?

How to decide “similar users” (nearest neighbors)?
Set a threshold for user similarity
• A user whose similarity exceeds the threshold is
regarded as a “similar” user
• In the worst case, no similar users will be found
Focus on the top K similar users (kNN method)
• A user who ranks in the top K by similarity is
regarded as a similar user
• K is often set between 50 and 200
• In the worst case, the system uses rating information of users
with low similarity
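
Both selection strategies can be sketched in a few lines of Python. The function names `by_threshold` and `top_k` are hypothetical. The similarity values for User1, User2, and User4 are the ones quoted in this lecture; User3’s value of 0.0 is not on the slides but follows from applying the Pearson formula to Item1 to Item4.

```python
from typing import Dict, List

def by_threshold(sims: Dict[str, float], threshold: float) -> List[str]:
    """Keep every user whose similarity exceeds the threshold."""
    return [user for user, s in sims.items() if s > threshold]

def top_k(sims: Dict[str, float], k: int) -> List[str]:
    """Keep the k users with the highest similarity (kNN)."""
    return sorted(sims, key=sims.get, reverse=True)[:k]

# Similarities to Alice.
sims = {"User1": 0.85, "User2": 0.71, "User3": 0.0, "User4": -0.79}

print(by_threshold(sims, 0.5))  # ['User1', 'User2']
print(by_threshold(sims, 0.9))  # [] : no similar users are found
print(top_k(sims, 3))           # ['User1', 'User2', 'User3'] : includes a low-similarity user
```

The last two calls illustrate the worst cases named above: a strict threshold can return nobody, while a fixed K can pull in users with low similarity.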

Summary of User-based Collaborative Filtering
Basic Approach
• User similarities are obtained from a rating matrix
• Based on the rating scores of similar users, the system predicts
the target user’s rating score for a target item
Similarity Calculation
The Pearson correlation coefficient is often used
Selection of Similar Users
The top K users by similarity are often selected as
similar users
