Define the original user-item matrix, R, of size m x n, which contains the ratings of m users on n items. r_ij denotes the rating of user u_i on item i_j.
Preprocess the user-item matrix R to fill in all missing values (for example, with the average rating of the corresponding item or user), so that SVD can be applied to a complete matrix.
Compute the SVD of R to obtain matrices U, S, and V, of size m x m, m x n, and n x n, respectively. Their relationship is expressed by R = U * S * V^T.
Perform the dimensionality reduction step by keeping only the k largest diagonal entries of matrix S to obtain a k x k matrix, S_k. Similarly, matrices U_k (m x k) and V_k^T (k x n) are generated. The "reduced" user-item matrix, R', is obtained by R' = U_k * S_k * V_k^T, where r'_ij denotes the rating by user u_i on item i_j in this reduced matrix.
Compute sqrt(S_k) and then calculate two matrix products: U_k * sqrt(S_k)^T, which represents the m users, and sqrt(S_k) * V_k^T, which represents the n items in the k-dimensional feature space. We are particularly interested in the latter matrix, of size k x n.
Apply KNN to the user matrix or to the item matrix to find similar users or similar items, or multiply a user row by an item column to predict that user's rating on every item.
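The steps above can be sketched in numpy. This is a minimal illustration, not the author's code, assuming a small dense rating matrix in which 0 marks a missing rating:

```python
import numpy as np

# Step 1: user-item matrix R (4 users x 4 items), 0 = missing rating
R = np.array([
    [5.0, 3.0, 0.0, 1.0],
    [4.0, 0.0, 0.0, 1.0],
    [1.0, 1.0, 0.0, 5.0],
    [1.0, 0.0, 0.0, 4.0],
])

# Step 2: fill missing entries, here with each item's mean observed rating
mask = R > 0
counts = np.maximum(mask.sum(axis=0), 1)
item_means = np.where(mask.any(axis=0), R.sum(axis=0) / counts, 0.0)
R_filled = np.where(mask, R, item_means)

# Step 3: full SVD, R = U * S * V^T
U, s, Vt = np.linalg.svd(R_filled, full_matrices=True)

# Step 4: keep only the k largest singular values
k = 2
U_k = U[:, :k]                 # m x k
S_k = np.diag(s[:k])           # k x k
Vt_k = Vt[:k, :]               # k x n
R_reduced = U_k @ S_k @ Vt_k   # m x n, the "reduced" matrix R'

# Step 5: sqrt(S_k) splits R' into a user factor and an item factor
sqrt_S = np.sqrt(S_k)
user_factors = U_k @ sqrt_S    # m x k, one row per user
item_factors = sqrt_S @ Vt_k   # k x n, one column per item

# Step 6: a user row times an item column predicts that rating
pred = user_factors[0] @ item_factors[:, 2]
```

Multiplying the two factor matrices reproduces R' exactly, which is why a single dot product suffices for each predicted rating.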
Demo: which two people have the most similar tastes? Which two songs are the closest?
Magic: divisi!

#!/usr/bin/env python
# coding=utf-8
import divisi
from divisi.cnet import *

data = divisi.SparseLabeledTensor(ndim=2)
# read some ratings into data
# data[user_id, song_id] = 4

svd_result = data.svd(k=128)

# get songs that the user may like
# predict_features(svd_result, user_id).top_items(100)

# get similar songs
# feature_similarity(svd_result, song_id).top_items(100)

# get users that have similar tastes
# concept_similarity(svd_result, user_id).top_items(100)
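The divisi library is old and may be hard to install today. The same three queries can be sketched with plain numpy: rank similar songs or similar users by cosine similarity in the k-dimensional feature space, and predict a user's ratings by a dot product. The function and variable names below are illustrative, not divisi's API, and the ratings are randomly generated stand-ins:

```python
import numpy as np

rng = np.random.default_rng(0)
R = rng.integers(1, 6, size=(20, 10)).astype(float)  # fake 20 users x 10 songs

U, s, Vt = np.linalg.svd(R, full_matrices=False)
k = 4
user_vecs = U[:, :k] * s[:k]      # one row per user in feature space
item_vecs = Vt[:k, :].T * s[:k]   # one row per song in feature space

def top_similar(matrix, idx, n):
    """Indices of the n rows of `matrix` most cosine-similar to row `idx`."""
    norms = np.linalg.norm(matrix, axis=1)
    sims = matrix @ matrix[idx] / (norms * norms[idx] + 1e-12)
    order = np.argsort(-sims)
    return [i for i in order if i != idx][:n]

# ~ feature_similarity(svd_result, song_id).top_items(...)
similar_songs = top_similar(item_vecs, 0, 3)
# ~ concept_similarity(svd_result, user_id).top_items(...)
similar_users = top_similar(user_vecs, 0, 3)
# ~ predict_features(svd_result, user_id): user 0's predicted rating per song
predicted = user_vecs[0] @ Vt[:k, :]
```

Sorting `predicted` then gives the songs the user may like, mirroring the demo's `top_items` calls.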