This document discusses recommender systems and singular value decomposition (SVD). It begins with an overview of recommendation techniques like content-based filtering, collaborative filtering, and similarity measures. It then explains SVD in detail, how it decomposes a matrix into three matrices, and its applications in recommender systems for matrix factorization and baseline estimation. Finally, it briefly mentions other applications of SVD like dimensionality reduction and image compression.
2. CONTENTS
• Dealing with Data
• Recommendation Techniques Overview
• Content Based Filtering
• Collaborative Filtering
• Similarity Measures
Recommender Systems by Navule Pavan Kumar Rao
3. DEALING WITH DATA
Implicit Data Explicit Data
Data Gathered from User Behavior Well appreciated Data by User
Ex: Frequency of a product bought by user.
List of products bought by user.
Ratings, Reviews, Comments
Ex: user rate product on scale of 1 to 5
Advantage:
Lot of such data can be captured
Unbiased and versatile data
Advantage: More detailed preference towards
products by users.
More structured Data.
Disadvantage:
Data is more noisy, unstructured.
No evidence what user feels about the
product he uses
Disadvantage:
All users not interested in giving ratings to
products they purchase (sparsity).
Possibility of Bias(from user), Fraud (from
seller)
Recommender Systems by Navule Pavan Kumar Rao
5. CONTENT BASED FILTERING
• Content Based Filtering (CBF) emphasizes more on attributes of product being
recommended.
• CBF recommends the products based on the most similar features in other
products.
• Only related or similar products will be recommended to the user based on
similarities.
• Independent of other users’ preferences.
• Similarity Measure employed in CBF:
• Cosine Similarity, Euclidian Distance, Pearson Correlation, TF-IDF
Recommender Systems by Navule Pavan Kumar Rao
6. CONTENT BASED FILTERING
Examples:
Genre based Recommendations
• News: Political, Business, Economies, Cryptocurrencies etc.
• Movies: Fantasy, Action, Comedy, Horror etc.
• Cuisine: Indian, Continental, Italian etc.
Recommender Systems by Navule Pavan Kumar Rao
7. COLLABORATIVE FILTERING
• Collaborative Filtering (CF) recommends products based on users preferences.
• CF depends on user-item relationship.
• Based on this, it calculates the similarities between their most relevant majority
user profiles and recommends items.
• Most relevant?
• Similarity Measures!
• Majority?
• Weighted Average of their ratings.
U I I1 I2 I3 I4
U1 1 2 1 1
U2 2 3 5 1
U3 3 4 2 5
U4 5 3 5 4
Recommender Systems by Navule Pavan Kumar Rao
8. COLLABORATIVE FILTERING
User
Based
Filtering:
Computes the
similar users
Calculate the
Weighted
average across
items*
Predict
Item
Based
Filtering
Compute Similar
Items based on
user’s
preference
history
Calculate
Weighted
Average
across items*
Predict
* Can use other approaches based on the CBF technique employed
Recommender Systems by Navule Pavan Kumar Rao
Memory Based CBF Techniques:
10. SIMILARITY MEASURES
Cosine Similarity
The resultant of Cosine Similarity is the Amplitude which is on y axis.
Higher the result lesser the angle between A and B and more similar they are.
Recommender Systems by Navule Pavan Kumar Rao
11. SIMILARITY MEASURES
Cosine Similarity
UserMovie Race 1 Race 2 Race 3
UA 4 3 2
UB 4 4 1
UC 2 1 5
𝐶𝑜𝑠𝑖𝑛𝑒 𝑠𝑖𝑚𝑖𝑙𝑎𝑟𝑖𝑡𝑦(𝑈𝐴, 𝑈𝐵) =
4𝑥4 + 3𝑥4 + 2𝑥1
42 + 32 + 22 𝑥 42 + 42 + 12
= 0.969
UA and UB are similar users
Recommender Systems by Navule Pavan Kumar Rao
13. CONTENTS
• Introduction
• SVD in Detail
• SVD in Recommender Systems
• Baseline Estimation Techniques
• Other applications
Recommender Systems by Navule Pavan Kumar Rao
14. INTRODUCTION
• Data like user preferences, ratings etc,. are represented in the form of matrix
• Represent this matrix into a product(factors) of many matrices and each matrix
holding some significance about the original matrix.
• This technique is referred as Matrix Factorization or Matrix Decomposition.
Recommender Systems by Navule Pavan Kumar Rao
UserMovie Race 1 Race 2 Race 3
UA 4 3 2
UB 4 4 1
UC 2 1 5
15. MATRIX FACTORIZATION TECHNIQUES
• Based on Eigen Values
• Eigen decomposition
• Jordan decomposition
• Schur decomposition
• Real Schur decomposition
• QZ decomposition
• Takagi's factorization
• Singular value decomposition
• Solve system of Linear Equations
• LU decomposition
• LU reduction
• Block LU decomposition
• Rank factorization
• Cholesky decomposition
• QR decomposition
• RRQR factorization
• Interpolative decomposition
And Many other..!
Recommender Systems by Navule Pavan Kumar Rao
16. SVD IN DETAIL
SVD (Singular Value Decomposition) is a representation of rectangular matrix of
Gene-Expression Data,
Anxp= Unxn Snxp VT
pxp
For Ex:
n rows – Gene – assumed as users.
p columns – Expression – assumed as ratings to a particular product.
Recommender Systems by Navule Pavan Kumar Rao
UserMovie Race 1 Race 2 Race 3
UA 4 3 2
UB 4 4 1
UC 2 1 5
17. SVD IN DETAIL
Anxp= Unxn Snxp VT
pxp
Anxp Unxn (Gene) Snxp VT
pxp (Expression)=
Recommender Systems by Navule Pavan Kumar Rao
18. SVD IN DETAIL
Anxp= Unxn Snxp VT
pxp
Where U’s Columns are Eigen Vectors of AAT
and V’s Columns are Eigen Vectors of ATA
S is a diagonal matrix whose elements are singular values* of A in
the decreasing order.
U and V and are Orthogonal, i.e.
UTU = Inxn
VTV = Ipxp
* Singular Values are Square roots of Eigen Values
Recommender Systems by Navule Pavan Kumar Rao
19. SVD IN RECOMMENDER SYSTEMS
Anxp= Unxn Snxp VT
pxp
SVD is used in Collaborative Filtering approach in the field of Recommender
Systems.
Reverse approach of SVD to find U or V by using Stochastic Gradient Descent
Optimization technique where Anxp is sparse.
Find pg and qe such that
• pg makes the row of matrix U where pg is the gth Gene vector of U
• qe makes the column of matrix VT where qe is the eth Expression vector of VT
age = pg . qe
All the vectors pg are mutually orthogonal, as well as the vectors qe
4 1 ?
? 2 5
3 ? ?
Recommender Systems by Navule Pavan Kumar Rao
20. SVD IN RECOMMENDER SYSTEMS
Anxp= Unxn VT
pxp
aeg
Anxp Unxn
qe
VT
pxp
=
pg
S being a diagonal matrix it just acts as a scaler on U or VT and is ignor
Recommender Systems by Navule Pavan Kumar Rao
21. SVD IN RECOMMENDER SYSTEMS
Anxp= Unxn Snxp VT
pxp
• The problem here is we don’t know how to build pg and qe , so we take
any random values* for both pg and qe
• Perform baseline estimate either via
• SGD (Stochastic Gradient Descent)
• ALS (Alternative Least Squares)
• Once convergence is achieved, we obtain optimal values of pg and qe
We can then estimate any gene expression as age(estimated) = pg . qe
4 1 ?
? 2 5
3 ? ?
* Random values form normal distribution
Recommender Systems by Navule Pavan Kumar Rao
22. SVD IN RECOMMENDER SYSTEMS
Anxp= Unxn Snxp VT
pxp
As we have few age(actual) available in matrix A we calculate the age and try to reduce the
difference between the actual and calculated value (Minimization of error)
Errormin= ∑(age(actual) – pg . qe)2
We thus find for vectors that makes the sum of errors minimal.
Then we construct U and V with
• pg as rows of U and
• qe as columns of VT
Finally we have, A = UVT
S being a diagonal matrix it just acts as a scaler on U or VT and is ignor
Recommender Systems by Navule Pavan Kumar Rao
23. SVD IN RECOMMENDER SYSTEMS
Accuracy Measures
RMSE (Root Mean Square Error)
MAP (Mean Average Precision)
Recommender Systems by Navule Pavan Kumar Rao
24. SVD IN RECOMMENDER SYSTEMS
There are variantions in SVD with this approach. The one explained is Sparse
SVD for High Dimensional Data.
• PMF (Probabilistic Matrix Factorization)
• SVD++
• NMF (Non-Negative Matrix Factorization)
4 1 ?
? 2 5
3 ? ?
Recommender Systems by Navule Pavan Kumar Rao
25. OTHER APPLICATIONS OF SVD
Dimensionality Reduction (PCA)
Image Compression
Natural Language Processing (LSI)
And many more..
Recommender Systems by Navule Pavan Kumar Rao