Recommender systems

RECOMMENDER SYSTEMS
NAVULE PAVAN KUMAR RAO

CONTENTS
• Dealing with Data
• Recommendation Techniques Overview
• Content Based Filtering
• Collaborative Filtering
• Similarity Measures
Recommender Systems by Navule Pavan Kumar Rao

DEALING WITH DATA
Implicit Data Explicit Data
Data Gathered from User Behavior Well appreciated Data by User
Ex: Frequency of a product bought by user.
List of products bought by user.
Ratings, Reviews, Comments
Ex: user rate product on scale of 1 to 5
Advantage:
Lot of such data can be captured
Unbiased and versatile data
Advantage: More detailed preference towards
products by users.
More structured Data.
Disadvantage:
Data is more noisy, unstructured.
No evidence what user feels about the
product he uses
Disadvantage:
All users not interested in giving ratings to
products they purchase (sparsity).
Possibility of Bias(from user), Fraud (from
seller)

RECOMMENDATION TECHNIQUES
Recommender
System
Content based
Filtering
Collaborative
Filtering
Model based
Filtering
Clustering,
Association Rules
Mining, Bayesian
Networks, Neural
Networks
Memory based
Filtering
User based Item based
Hybrid Filtering

CONTENT BASED FILTERING
• Content Based Filtering (CBF) emphasizes more on attributes of product being
recommended.
• CBF recommends the products based on the most similar features in other
products.
• Only related or similar products will be recommended to the user based on
similarities.
• Independent of other users’ preferences.
• Similarity Measure employed in CBF:
• Cosine Similarity, Euclidian Distance, Pearson Correlation, TF-IDF

CONTENT BASED FILTERING
Examples:
Genre based Recommendations
• News: Political, Business, Economies, Cryptocurrencies etc.
• Movies: Fantasy, Action, Comedy, Horror etc.
• Cuisine: Indian, Continental, Italian etc.

COLLABORATIVE FILTERING
• Collaborative Filtering (CF) recommends products based on users preferences.
• CF depends on user-item relationship.
• Based on this, it calculates the similarities between their most relevant majority
user profiles and recommends items.
• Most relevant?
• Similarity Measures!
• Majority?
• Weighted Average of their ratings.
U I I1 I2 I3 I4
U1 1 2 1 1
U2 2 3 5 1
U3 3 4 2 5
U4 5 3 5 4

COLLABORATIVE FILTERING
User
Based
Filtering:
Computes the
similar users
Calculate the
Weighted
average across
items*
Predict
Item
Based
Filtering
Compute Similar
Items based on
user’s
preference
history
Calculate
Weighted
Average
across items*
Predict
* Can use other approaches based on the CBF technique employed
Memory Based CBF Techniques:

SIMILARITY MEASURES
• Cosine Similarity
• Euclidian Distance
• TF-IDF
• Spearman’s Rank Correlation
• Pearson Correlation

SIMILARITY MEASURES
Cosine Similarity
The resultant of Cosine Similarity is the Amplitude which is on y axis.
Higher the result lesser the angle between A and B and more similar they are.

SIMILARITY MEASURES
Cosine Similarity
UserMovie Race 1 Race 2 Race 3
UA 4 3 2
UB 4 4 1
UC 2 1 5
𝐶𝑜𝑠𝑖𝑛𝑒 𝑠𝑖𝑚𝑖𝑙𝑎𝑟𝑖𝑡𝑦(𝑈𝐴, 𝑈𝐵) =
4𝑥4 + 3𝑥4 + 2𝑥1
42 + 32 + 22 𝑥 42 + 42 + 12
= 0.969
UA and UB are similar users

SINGULAR VALUE DECOMPOSITION
NAVULE PAVAN KUMAR RAO

CONTENTS
• Introduction
• SVD in Detail
• SVD in Recommender Systems
• Baseline Estimation Techniques
• Other applications

INTRODUCTION
• Data like user preferences, ratings etc,. are represented in the form of matrix
• Represent this matrix into a product(factors) of many matrices and each matrix
holding some significance about the original matrix.
• This technique is referred as Matrix Factorization or Matrix Decomposition.
UA 4 3 2
UB 4 4 1
UC 2 1 5

MATRIX FACTORIZATION TECHNIQUES
• Based on Eigen Values
• Eigen decomposition
• Jordan decomposition
• Schur decomposition
• Real Schur decomposition
• QZ decomposition
• Takagi's factorization
• Singular value decomposition
• Solve system of Linear Equations
• LU decomposition
• LU reduction
• Block LU decomposition
• Rank factorization
• Cholesky decomposition
• QR decomposition
• RRQR factorization
• Interpolative decomposition
And Many other..!

SVD IN DETAIL
SVD (Singular Value Decomposition) is a representation of rectangular matrix of
Gene-Expression Data,
Anxp= Unxn Snxp VT
pxp
For Ex:
n rows – Gene – assumed as users.
p columns – Expression – assumed as ratings to a particular product.
UA 4 3 2
UB 4 4 1
UC 2 1 5

SVD IN DETAIL
Anxp= Unxn Snxp VT
pxp
Anxp Unxn (Gene) Snxp VT
pxp (Expression)=

SVD IN DETAIL
Anxp= Unxn Snxp VT
pxp
Where U’s Columns are Eigen Vectors of AAT
and V’s Columns are Eigen Vectors of ATA
S is a diagonal matrix whose elements are singular values* of A in
the decreasing order.
U and V and are Orthogonal, i.e.
UTU = Inxn
VTV = Ipxp
* Singular Values are Square roots of Eigen Values

SVD IN RECOMMENDER SYSTEMS
Anxp= Unxn Snxp VT
pxp
SVD is used in Collaborative Filtering approach in the field of Recommender
Systems.
Reverse approach of SVD to find U or V by using Stochastic Gradient Descent
Optimization technique where Anxp is sparse.
Find pg and qe such that
• pg makes the row of matrix U where pg is the gth Gene vector of U
• qe makes the column of matrix VT where qe is the eth Expression vector of VT
age = pg . qe
All the vectors pg are mutually orthogonal, as well as the vectors qe
4 1 ?
? 2 5
3 ? ?

Anxp= Unxn VT
pxp
aeg
Anxp Unxn
qe
VT
pxp
=
pg
S being a diagonal matrix it just acts as a scaler on U or VT and is ignor

Anxp= Unxn Snxp VT
pxp
• The problem here is we don’t know how to build pg and qe , so we take
any random values* for both pg and qe
• Perform baseline estimate either via
• SGD (Stochastic Gradient Descent)
• ALS (Alternative Least Squares)
• Once convergence is achieved, we obtain optimal values of pg and qe
We can then estimate any gene expression as age(estimated) = pg . qe
4 1 ?
? 2 5
3 ? ?
* Random values form normal distribution

Anxp= Unxn Snxp VT
pxp
As we have few age(actual) available in matrix A we calculate the age and try to reduce the
difference between the actual and calculated value (Minimization of error)
Errormin= ∑(age(actual) – pg . qe)2
We thus find for vectors that makes the sum of errors minimal.
Then we construct U and V with
• pg as rows of U and
• qe as columns of VT
Finally we have, A = UVT
S being a diagonal matrix it just acts as a scaler on U or VT and is ignor

Accuracy Measures
RMSE (Root Mean Square Error)
MAP (Mean Average Precision)

There are variantions in SVD with this approach. The one explained is Sparse
SVD for High Dimensional Data.
• PMF (Probabilistic Matrix Factorization)
• SVD++
• NMF (Non-Negative Matrix Factorization)
4 1 ?
? 2 5
3 ? ?

OTHER APPLICATIONS OF SVD
Dimensionality Reduction (PCA)
Image Compression
Natural Language Processing (LSI)
And many more..

REFERENCES
• http://web.mit.edu/be.400/www/SVD/Singular_Value_Decomposition.htm
• https://ocw.mit.edu/courses/mathematics/18-06sc-linear-algebra-fall-
2011/positive-definite-matrices-and-applications/singular-value-
decomposition/MIT18_06SCF11_Ses3.5sum.pdf
• http://nicolas-hug.com/blog/matrix_facto_2

Thank you!

Recommender systems

Recommended

Recommended

More Related Content

What's hot

What's hot (6)

Similar to Recommender systems

Similar to Recommender systems (20)

Recently uploaded

Recently uploaded (20)

Recommender systems

Editor's Notes