DN 2017 | Building a recommender system using collaborative filtering (CF) | Sarah Mestiri

Building a recommender
system using Collaborative
Filtering (CF)
Sarah Mestiri
Machine Learning Engineer

Introduction
“Customers expect to be treated as humans, not numbers.”
“State of the Connected Customer” report, Salesforce, 2016

Outline
Experiencing personalization through collaborative filtering
Common methods in CF
Building a recommender system using CF

Experiencing
personalization through
collaborative filtering

What’s Collaborative
Filtering?

Memory-based CF
Types
User-based collaborative filtering: recommendations based on the ratings of users similar to the target user.
Item-based collaborative filtering: recommendations based on the target user's own ratings of similar items.

Memory-based CF
Process explained
Prediction:
1- Similarity calculation between items (users).
2- “Peer group”: top-k most similar items (users).

      i1   i2   i3   i4   i5
u1     3    5    1    1    ?
u2     1    0    0    2    1
u3     2    5    1    ?    ?
u4     0    1    1    ?    ?
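
The two prediction steps can be sketched in Python on the ratings matrix above. This is a minimal user-based sketch, not the speaker's implementation: it assumes "?" means "not rated" (stored as NaN), treats 0 as an actual rating, uses cosine similarity over co-rated items, and predicts with a similarity-weighted average. The function names are hypothetical.

```python
import numpy as np

# Ratings from the slide; "?" becomes NaN (assumed to mean "not rated").
R = np.array([
    [3, 5, 1, 1, np.nan],            # u1
    [1, 0, 0, 2, 1],                 # u2
    [2, 5, 1, np.nan, np.nan],       # u3
    [0, 1, 1, np.nan, np.nan],       # u4
], dtype=float)

def cosine_on_corated(a, b):
    """Cosine similarity restricted to the items both users have rated."""
    mask = ~np.isnan(a) & ~np.isnan(b)
    if not mask.any():
        return 0.0
    a, b = a[mask], b[mask]
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def predict_user_based(R, user, item, k=2):
    """Similarity-weighted average of the top-k peers' ratings for `item`."""
    # Step 1: similarity to every other user who has rated the item.
    candidates = [v for v in range(R.shape[0])
                  if v != user and not np.isnan(R[v, item])]
    sims = np.array([cosine_on_corated(R[user], R[v]) for v in candidates])
    # Step 2: the peer group is the k most similar candidates.
    order = np.argsort(sims)[::-1][:k]
    peers = [candidates[i] for i in order]
    weights = sims[order]
    if weights.sum() <= 0:
        return float("nan")
    return float(weights @ R[peers, item] / weights.sum())

# Predict u3's missing rating for i4 (0-based indices: user 2, item 3).
pred_k2 = predict_user_based(R, user=2, item=3, k=2)
pred_k1 = predict_user_based(R, user=2, item=3, k=1)
print(pred_k2, pred_k1)
```

With k=1 the prediction simply copies the rating of the single most similar peer (u1); with k=2 it is pulled toward u2's rating in proportion to u2's similarity.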

Memory-based CF
Implementation & challenges
Step 1: Choice of similarity measure
What if one user tends to rate generously while another is harsh in their ratings?
- Pearson (mean-centered ratings)
- Cosine
- Adjusted cosine
Step 2: Determination of peer group
- Top-k most similar items (users)
Step 3: Recommendation
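
The question about generous and harsh raters can be illustrated with a small sketch of the three similarity measures. The two rating vectors are made-up examples (the same preference pattern shifted by two points), and the functions assume dense vectors with no missing ratings; they are illustrative helpers, not a library API.

```python
import numpy as np

def cosine(a, b):
    """Plain cosine similarity between two rating vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def pearson(a, b):
    """Pearson correlation: cosine of the vectors centered on their own means."""
    return cosine(a - a.mean(), b - b.mean())

def adjusted_cosine(R, i, j):
    """Item-item similarity after subtracting each user's mean rating.

    R is a dense users x items matrix (assumed to have no missing values).
    """
    centered = R - R.mean(axis=1, keepdims=True)
    return cosine(centered[:, i], centered[:, j])

# Hypothetical users with identical taste but different rating scales.
generous = np.array([5.0, 4.0, 5.0, 3.0])
harsh = np.array([3.0, 2.0, 3.0, 1.0])

raw = cosine(generous, harsh)        # slightly below 1: the scale offset leaks in
centered = pearson(generous, harsh)  # 1 up to rounding: mean-centering removes the offset

R = np.vstack([generous, harsh])
item_sim = adjusted_cosine(R, 0, 3)  # item 0 is liked, item 3 disliked by both users
print(raw, centered, item_sim)
```

Mean-centering is what lets Pearson and adjusted cosine compare users who use different parts of the rating scale.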

Model-based CF
Introduction
Where do neighborhood methods fail?
Challenges:
- Computation
- Scalability
- Sparsity
- Accuracy

Model-based CF
What to recommend to Alice?

Model-based CF
Matrix factorization

Ratings matrix (? = unknown rating):

          Matrix  Inception  Frozen  King-kong  Zootopia
Alice       -1       -1         1        1         ?
Patrick      1        0         0       -1         1
John         1       -1        -1        ?         ?
Sara         0        1         1        ?         ?

Latent factors: user factors X item factors

User factors:

          Children  Action
Alice         1       -1
Patrick      -1        1
John         -1        1
Sara          1        0

Item factors:

          Matrix  Inception  Frozen  King-kong  Zootopia
Children    -1       -1         1        1         1
Action       1        1        -1       -1         0
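
As a quick check of how the latent factors produce a recommendation, the user-factor and item-factor tables from the slide can be multiplied with NumPy. The numbers are the slide's illustrative values; the product gives preference scores whose sign indicates like or dislike, and it is not claimed to reproduce every observed rating exactly.

```python
import numpy as np

users = ["Alice", "Patrick", "John", "Sara"]
movies = ["Matrix", "Inception", "Frozen", "King-kong", "Zootopia"]

# User factors from the slide: columns are (Children, Action).
U = np.array([
    [1, -1],   # Alice
    [-1, 1],   # Patrick
    [-1, 1],   # John
    [1, 0],    # Sara
], dtype=float)

# Item factors from the slide: rows are (Children, Action), one column per movie.
V = np.array([
    [-1, -1, 1, 1, 1],   # Children
    [1, 1, -1, -1, 0],   # Action
], dtype=float)

# Predicted score for every (user, movie) pair.
scores = U @ V

alice_zootopia = scores[users.index("Alice"), movies.index("Zootopia")]
print(alice_zootopia)
```

Alice's score for Zootopia comes out positive because she loads on the Children factor and Zootopia does too, so under these factors it would be a reasonable recommendation for her.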

Model-based CF
Ratings prediction
How can one determine the factor matrices U and V?
Goal: minimize the difference between predicted ratings and observed ratings.
Stochastic Gradient Descent, SGD (~SVD), or Alternating Least Squares, ALS
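
A minimal SGD sketch of this idea follows, assuming a squared-error objective over the observed ratings with L2 regularization. The hyperparameters (learning rate, regularization strength, number of epochs) and the function names are illustrative assumptions, and the ratings are the Alice/Patrick/John/Sara matrix from the earlier slide with "?" stored as NaN.

```python
import numpy as np

def factorize_sgd(R, n_factors=2, lr=0.05, reg=0.02, n_epochs=500, seed=0):
    """Learn U (users x factors) and V (items x factors) so that U @ V.T ~ R.

    Only observed (non-NaN) entries of R contribute to the updates.
    Note that V is stored items x factors here, i.e. transposed relative
    to the item-factor table on the slide.
    """
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    U = rng.normal(scale=0.1, size=(n_users, n_factors))
    V = rng.normal(scale=0.1, size=(n_items, n_factors))
    observed = [(u, i) for u in range(n_users) for i in range(n_items)
                if not np.isnan(R[u, i])]
    for _ in range(n_epochs):
        for idx in rng.permutation(len(observed)):
            u, i = observed[idx]
            err = R[u, i] - U[u] @ V[i]          # prediction error on one rating
            u_old = U[u].copy()                  # keep the pre-update user vector
            U[u] += lr * (err * V[i] - reg * U[u])
            V[i] += lr * (err * u_old - reg * V[i])
    return U, V

def rmse_observed(R, P):
    """RMSE between R and predictions P over the observed entries only."""
    mask = ~np.isnan(R)
    return float(np.sqrt(np.mean((R[mask] - P[mask]) ** 2)))

R = np.array([
    [-1, -1, 1, 1, np.nan],
    [1, 0, 0, -1, 1],
    [1, -1, -1, np.nan, np.nan],
    [0, 1, 1, np.nan, np.nan],
], dtype=float)

U, V = factorize_sgd(R)
predictions = U @ V.T
baseline_rmse = rmse_observed(R, np.zeros_like(R))   # always predicting 0
train_rmse = rmse_observed(R, predictions)
print(baseline_rmse, train_rmse)
```

ALS targets the same objective but alternates between solving for U with V fixed and for V with U fixed, each step being a least-squares problem.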

Step 1: Understand your Data
Visualizations (Pandas, Matplotlib,
Seaborn)

Step 2: Pre-Process your data
Cleanup, merge, etc. (Pandas,
Numpy)
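
A small pandas sketch of a typical cleanup-and-merge step is shown below. The column names (`user_id`, `movie_id`, `rating`, `title`) and the toy rows are hypothetical and not taken from the talk's dataset.

```python
import pandas as pd

# Hypothetical raw ratings, including a duplicate row and a missing rating.
ratings = pd.DataFrame({
    "user_id": [1, 1, 2, 2, 2, 3],
    "movie_id": [10, 20, 10, 20, 20, 30],
    "rating": [4.0, 5.0, 3.0, 2.0, 2.0, None],
})
movies = pd.DataFrame({
    "movie_id": [10, 20, 30],
    "title": ["A", "B", "C"],
})

# Cleanup: drop rows without a rating, then repeated (user, movie) pairs.
clean = (ratings
         .dropna(subset=["rating"])
         .drop_duplicates(subset=["user_id", "movie_id"]))

# Merge: attach movie titles to each rating.
merged = clean.merge(movies, on="movie_id", how="left")

# Reshape into a user x item matrix; unrated pairs become NaN.
user_item = merged.pivot(index="user_id", columns="title", values="rating")
print(user_item)
```

The resulting user-item matrix is the shape of input the memory-based and model-based methods above operate on.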

Step 3: Build your ML model
Implement the CF method (KNN, SGD, ALS) or use an implementation from available libraries (scikit-learn, Surprise, Spark MLlib)

Step 4: Train your model
Fit the model and predict ratings on the train set.

Step 5: Test your model
Predict ratings on the test set.
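
Steps 4 and 5 presuppose that the ratings have been split into a train set and a test set. One simple way to do this, sketched here under the assumption that ratings are stored as (user, item, rating) triples, is a random hold-out split. The function name and the 20% test fraction are illustrative choices.

```python
import numpy as np

def train_test_split_ratings(triples, test_fraction=0.2, seed=0):
    """Randomly hold out a fraction of (user, item, rating) triples for testing."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(triples))
    n_test = int(round(test_fraction * len(triples)))
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    train = [triples[i] for i in train_idx]
    test = [triples[i] for i in test_idx]
    return train, test

# Hypothetical triples: 5 users x 2 items.
triples = [(u, i, float((u + i) % 5 + 1)) for u in range(5) for i in range(2)]
train, test = train_test_split_ratings(triples)
print(len(train), len(test))
```

The model is fitted on `train` only; ratings in `test` stay hidden until evaluation.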

Step 6: Evaluate your model
Measure accuracy using RMSE (root mean squared error)
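
RMSE is the square root of the mean squared difference between observed and predicted ratings, so lower is better. A minimal NumPy sketch, with made-up ratings for illustration:

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted ratings."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Hypothetical test-set ratings and model predictions.
observed = [3, 5, 1]
predicted = [2, 5, 3]
score = rmse(observed, predicted)   # sqrt((1 + 0 + 4) / 3)
print(score)
```

Because errors are squared before averaging, a few large misses raise RMSE more than many small ones.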

GitHub Repo: https://github.com/SarahMestiri/RecommenderSystems

Thank you!
mestiri.sa@gmail.com
Sarahmestiri.com
@mestirisarah
