Collaborative filtering algorithms recommend items to users based on the items liked by similar users. There are two main approaches: model-based, which builds a predictive model from user data, and memory-based, which identifies similar users and recommends items that are popular among them. This document describes memory-based collaborative filtering using cosine similarity to compute user similarities from the items liked in common, normalized by the number of items per user. An example in R shows how to generate recommendations for a new user from a training user-item matrix and the similarity calculations.
An introduction to recommendation algorithms using collaborative filtering
1. An introduction to recommendation algorithms
Collaborative filtering: how does it work?
Arnaud de Myttenaere
2. About me
Data Scientist, PhD
Founder of Uchidata
Consultant at Octo Technology, Sydney
Several projects on recommendation algorithms (Viadeo social network, e-commerce, news website, ...)
5. The different approaches
Model Based
Memory Based
Collaborative filtering using graph libraries
Context
A simple example
Cosine similarity
R code
More formally
Notations
Similarity function
Cosine similarity
Conclusion
6. Model Based approach
1. Build a dataset which summarizes the data
UserId  Like  Gender  Age  Artist       Style    Country
1       1     M       25   Daft Punk    Electro  France
1       0     M       25   Lady Gaga    Pop      USA
2       1     F       20   The Beatles  Rock     UK
(Like is the target; Gender and Age are user information; Artist, Style and Country are item information.)
2. Learn a model to predict the target variable using your favorite
algorithm: Linear Regression, Random Forest, XGBoost, . . .
3. For each new customer, apply the model on a set of artists
and recommend the ones with the highest scores.
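A minimal R sketch of these three steps (the file likes.csv, the chosen columns and the logistic regression are illustrative assumptions, not the exact setup from the talk):
train = read.csv("likes.csv")                          # hypothetical file with the columns shown above
model = glm(Like ~ Gender + Age + Style + Country,     # any classifier could be used instead
            data = train, family = binomial)
candidates = unique(train[, c("Artist", "Style", "Country")])   # set of artists to score
new_customer = data.frame(candidates, Gender = "F", Age = 30)   # attach the new customer's profile
scores = predict(model, newdata = new_customer, type = "response")
head(candidates[order(scores, decreasing = TRUE), ], 3)         # top 3 recommended artists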
7. The different approaches
Model Based
Memory Based
Collaborative filtering using graph libraries
Context
A simple example
Cosine similarity
R code
More formally
Notations
Similarity function
Cosine similarity
Conclusion
8. Memory Based approach
How to recommend items to a particular customer or user?
For each new customer:
Search for similar customers in historical data
Recommend popular items among similar customers
Example: Collaborative Filtering.
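Before the matrix formulation developed in the following slides, here is a minimal, naive R sketch of these two steps (the function name and the choice of counting common items are illustrative assumptions, not part of the original slides):
recommend_memory_based = function(history, new_user_items, k = 2) {
  # history: data.frame with columns Customer and Item; new_user_items: character vector
  items_by_user = split(history$Item, history$Customer)
  # 1. Search for similar customers: here, count the items in common with the new customer
  similarity = sapply(items_by_user, function(items) length(intersect(items, new_user_items)))
  similar_users = names(sort(similarity, decreasing = TRUE))[1:k]
  # 2. Recommend popular items among the similar customers
  candidate_items = unlist(items_by_user[similar_users], use.names = FALSE)
  sort(table(candidate_items), decreasing = TRUE)
}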
10. Collaborative filtering: why?
Collaborative Filtering algorithms¹ ...
are intuitive
are simple to implement
scale relatively well
capture many implicit signals
are hard to beat
¹ Olivier Koch (Criteo), "Product Recommendation Beyond Collaborative Filtering - Welcome to the Twilight Zone!", RecSys Meetup London, Sept 20, 2017.
11. But Collaborative Filtering algorithms also have some limitations²:
do not scale that well in practice
do not capture temporal signals
do not solve the cold-start problem
do not address exploration in the long tail
² Olivier Koch (Criteo), "Product Recommendation Beyond Collaborative Filtering - Welcome to the Twilight Zone!", RecSys Meetup London, Sept 20, 2017.
12. The different approaches
Model Based
Memory Based
Collaborative filtering using graph libraries
Context
A simple example
Cosine similarity
R code
More formally
Notations
Similarity function
Cosine similarity
Conclusion
13. Context
Let us consider the following example:
John likes Rock
Mike likes Pop and Electro
Dan likes Pop, R&B and Rock
Lea likes Pop
This information can be loaded into the following dataset:
Customer Item
John Rock
Mike Pop
Mike Electro
Dan Pop
Dan R&B
Dan Rock
Lea Pop
Objective: find the best recommendation for Lea.
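For completeness, the table above can be written to the data.csv file that the next slides load (a small helper that is not part of the original slides):
d = data.frame(
  Customer = c("John", "Mike", "Mike", "Dan", "Dan", "Dan", "Lea"),
  Item     = c("Rock", "Pop", "Electro", "Pop", "R&B", "Rock", "Pop")
)
write.csv(d, "data.csv", row.names = FALSE)   # file used by the igraph code below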
14. Graph visualization
The data can be visualized as a (bipartite) graph.
Code: R
Library: igraph
[Bipartite graph: customers (John, Mike, Dan, Lea) on one side, items (Rock, Pop, Electro, R&B) on the other.]
library(igraph)
d = read.csv("data.csv")                    # Load data
g = graph.data.frame(d)                     # Load data into a graph
V(g)$type <- V(g)$name %in% d$Item          # Set graph as bipartite
plot(g, layout = layout.bipartite, vertex.color = c("green", "cyan")[V(g)$type + 1])
15. Incidence matrix
This graph can be represented by a matrix...
Rock Pop Electro R&B
John : 1 0 0 0
Mike : 0 1 1 0
Dan : 1 1 0 1
Lea : 0 1 0 0
A = get.incidence(g, sparse = TRUE)
16. Incidence matrices
... or by two matrices:
A_train =
Rock Pop Electro R&B
John : 1 0 0 0
Mike : 0 1 1 0
Dan : 1 1 0 1
A_test =
Rock Pop Electro R&B
Lea : 0 1 0 0
A_train = A[which(rownames(A) != "Lea"), ]
A_test  = A[which(rownames(A) == "Lea"), ]
17. The different approaches
Model Based
Memory Based
Collaborative filtering using graph libraries
Context
A simple example
Cosine similarity
R code
More formally
Notations
Similarity function
Cosine similarity
Conclusion
18. If similarity is the number of items in common (1/2)
The similarity vector is given by:
SimMatrix = A_train · t(A_test)

i.e.

            | 1 0 0 0 |   | 0 |   | 0 |  John
SimMatrix = | 0 1 1 0 | · | 1 | = | 1 |  Mike
            | 1 1 0 1 |   | 0 |   | 1 |  Dan
                          | 0 |
Indeed, Lea has no item in common with John, but has one item in common with both Mike and Dan (Pop).
sim_matrix = A_train %*% A_test      # number of items in common with each training user
19. If similarity is the number of items in common (2/2)
Then the recommendation scores are given by
scoreMatrix = t(SimMatrix) · A_train

i.e.

                          | 1 0 0 0 |
scoreMatrix = ( 0 1 1 ) · | 0 1 1 0 |
                          | 1 1 0 1 |

So

               Rock  Pop  Electro  R&B
scoreMatrix = (  1    2      1      1 )
scoreMatrix = t(as.matrix(sim_matrix)) %*% A_train      # recommendation scores per item
20. Comments
If similarity is the number of items in common...
→ not optimal, since users with a lot of items will be very similar to (almost) every user.
→ hard to use, because it leads to many items with the same recommendation score.
Better similarity metric: cosine similarity.
→ Idea: normalize the similarity using the number of items associated to each user.
21. The different approaches
Model Based
Memory Based
Collaborative filtering using graph libraries
Context
A simple example
Cosine similarity
R code
More formally
Notations
Similarity function
Cosine similarity
Conclusion
22. Cosine similarity (1/3)
Using the same data:
A_train =
Rock Pop Electro R&B
John : 1 0 0 0
Mike : 0 1 1 0
Dan : 1 1 0 1
The normalization vector is:
N_train = ( 1, 1/√2, 1/√3 )
John is associated to 1 item (Rock)
Mike is associated to 2 items (Pop, Electro)
Dan is associated to 3 items (Rock, Pop, R&B)
23. Cosine similarity (2/3)
Then the normalized similarity matrix is given by:
                 | 1   .    .   |   | 0 |   | 0    |     | 0    |  John
SimMatrix_norm = | .  1/√2  .   | · | 1 | = | 1/√2 |  ≈  | 0.71 |  Mike
                 | .   .   1/√3 |   | 1 |   | 1/√3 |     | 0.58 |  Dan
→ Lea is more similar to Mike than to Dan: she has one item in common with both, but Mike is associated to fewer items than Dan, so the link with Mike is stronger.
N_train = 1/sqrt(A_train %*% rep(1, ncol(A)))   # 1/||I_v|| for each training user
M_norm = diag(as.vector(N_train))               # diagonal normalization matrix
24. Cosine similarity (3/3)
The matrix of scores is given by:
scoreMatrix = t(SimMatrix_norm) · A_train

i.e.

                                | 1 0 0 0 |
scoreMatrix = ( 0 0.71 0.58 ) · | 0 1 1 0 |
                                | 1 1 0 1 |

So

               Rock   Pop   Electro  R&B
scoreMatrix = ( 0.58  1.29   0.71   0.58 )
scoreMatrix = t(as.matrix(sim_matrix_norm)) %*% A_train      # recommendation scores per item
25. Comments
For
A_test =
Rock Pop Electro R&B
Lea : 0 1 0 0
Recommendation scores are
               Rock   Pop   Electro  R&B
scoreMatrix = ( 0.58  1.29   0.71   0.58 )
1. Pop is the best recommendation for Lea, but she is already
associated to Pop.
2. If the objective is to recommend new items, Electro is the
best recommendation for Lea.
3. Rock and R&B have the same score and can be ordered by frequency or randomly.
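A small follow-up in R, turning the scores above into an ordered list of new recommendations (this step is implied by the comments but is not part of the original code):
scores = scoreMatrix[1, ]          # named scores: Rock, Pop, Electro, R&B
scores[A_test == 1] = -Inf         # remove items Lea is already associated to (Pop)
sort(scores, decreasing = TRUE)    # Electro first, then Rock and R&B (tied)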
26. The different approaches
Model Based
Memory Based
Collaborative filtering using graph libraries
Context
A simple example
Cosine similarity
R code
More formally
Notations
Similarity function
Cosine similarity
Conclusion
27. R code
Collaborative filtering in about ten lines of code.
library(igraph)                                            # Load graph library
d = read.csv("data.csv")                                   # Read data
g = graph.data.frame(d)                                    # Convert data into a graph
V(g)$type <- V(g)$name %in% d$Item                         # Set graph as bipartite
A = get.incidence(g, sparse = TRUE)                        # Compute incidence matrix
A_train = A[which(rownames(A) != "Lea"), ]                 # Training users
A_test  = A[which(rownames(A) == "Lea"), ]                 # Target user
N_train = 1/sqrt(A_train %*% rep(1, ncol(A)))              # 1/||I_v|| per training user
M_norm = diag(as.vector(N_train))                          # Normalization matrix
sim_matrix_norm = M_norm %*% (A_train %*% A_test)          # Normalized similarities
scoreMatrix = t(as.matrix(sim_matrix_norm)) %*% A_train    # Recommendation scores
28. In practice
Can be precomputed and do not need to be updated in real time: A_train and M_norm.
Must be computed in real time: A_test and scoreMatrix (a matrix calculation).
Optimal number of training users:
too small → poor performance,
too large → too slow (unless the computation is parallelized).
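A hedged sketch of this split in R, reusing M_norm, A_train and A_test from the previous slides (the function name recommend and the top-n filtering are assumptions, not part of the original slides):
# Precomputed offline, once:
A_train_norm = M_norm %*% A_train                 # each training user's row divided by ||I_v||
# Computed in real time for each new user vector a_test (same columns as A_train):
recommend = function(a_test, A_train_norm, A_train, n = 3) {
  sim = A_train_norm %*% a_test                   # normalized similarities to the training users
  scores = t(as.matrix(sim)) %*% A_train          # recommendation scores per item
  scores = scores[1, ]
  scores[a_test == 1] = -Inf                      # skip items the user is already associated to
  head(sort(scores, decreasing = TRUE), n)
}
recommend(A_test, A_train_norm, A_train)          # top recommendations for Lea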
29. The different approaches
Model Based
Memory Based
Collaborative filtering using graph libraries
Context
A simple example
Cosine similarity
R code
More formally
Notations
Similarity function
Cosine similarity
Conclusion
30. Collaborative Filtering
Notations
Let I_u(t) be the vector of items associated to a user u at time t:
I_u(t) = (0, 0, ..., 1, ..., 0)
where the k-th coefficient is equal to 1 if item k is associated to user u at time t, and 0 otherwise.
Example: in music recommendation an "item" could be an artist (or a song), and coefficient k is equal to 1 if user u likes artist (or song) k.
31. Collaborative Filtering
Then, for t′ > t, the collaborative filtering algorithm estimates I_u(t′) (the future vector of items associated to the user u) by:

I_u(t′) = Σ_{v ≠ u} sim(v, u) · I_v(t)

where sim(v, u) represents the similarity between users u and v.
→ The most relevant items for the user u are the ones with the
highest score.
32. The different approaches
Model Based
Memory Based
Collaborative filtering using graph libraries
Context
A simple example
Cosine similarity
R code
More formally
Notations
Similarity function
Cosine similarity
Conclusion
33. Similarity function
The similarity between two users can be defined as the number of items in common. Then

sim(u, v) = ⟨I_u | I_v⟩

where ⟨· | ·⟩ is the classical scalar product.
→ not optimal, since users with a lot of items will be very similar to every user.
34. The different approaches
Model Based
Memory Based
Collaborative filtering using graph libraries
Context
A simple example
Cosine similarity
R code
More formally
Notations
Similarity function
Cosine similarity
Conclusion
35. Cosine similarity
One can normalize the similarity by the number of items associated to users u and v:

sim(u, v) = ⟨I_u | I_v⟩ / (‖I_u‖ · ‖I_v‖)

However, as

I_u(t′) = Σ_{v ≠ u} ( ⟨I_u | I_v⟩ / (‖I_u‖ · ‖I_v‖) ) · I_v(t) = (1 / ‖I_u‖) · Σ_{v ≠ u} ( ⟨I_u | I_v⟩ / ‖I_v‖ ) · I_v(t)

the order of recommendations for the user u is the same as the one obtained with sim(u, v) = ⟨I_u | I_v⟩ / ‖I_v‖ (for a binary vector, ‖I_v‖ = √(number of items associated to v)).

→ In practice we can use sim(u, v) = ⟨I_u | I_v⟩ / ‖I_v‖.
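A quick sanity check of this claim on the running example, reusing scoreMatrix and A_test from the R code above (not part of the original slides):
norm_u = sqrt(sum(A_test))                       # ||I_u|| for Lea (here equal to 1, since Lea has a single item)
scores_cosine = as.vector(scoreMatrix) / norm_u  # scores with the full cosine similarity
identical(order(scores_cosine, decreasing = TRUE),
          order(as.vector(scoreMatrix), decreasing = TRUE))   # TRUE: dividing by a positive constant keeps the order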
36. The different approaches
Model Based
Memory Based
Collaborative filtering using graph libraries
Context
A simple example
Cosine similarity
R code
More formally
Notations
Similarity function
Cosine similarity
Conclusion
37. Conclusion
Two different approaches:
Model Based
Memory Based
→ Choose the number of users in A_train to fit your practical constraints.
→ The definition of similarity between users can be modified to take user attributes and context into account.
38. Conclusion
However
This algorithm is based on past behavior, so it never suggests new content.
→ It is necessary to refresh the training set often.