CSTalks - Quaternary Semantics Recommendation System - 24 Aug
A Unified Framework for Recommendations Based on Quaternary Semantic Analysis. Wei Chen*, Wynne Hsu*, Mong Li Lee*. *School of Computing, National University of Singapore
Introduction: The amount of information on the web is increasing at a lightning pace, e.g. products on Amazon, videos on YouTube, movies on Netflix. Recommendation is therefore necessary.
Introduction: Recommendation systems are typically classified into four types: user recommendation, item recommendation, tag recommendation and item rating prediction.
Related Work: Most of the work in recommendation systems utilizes only ternary relationships in generating recommendations. Collaborative filtering-based recommendation systems use <user, rating, item> [B. Sarwar, WWW’01, SIGIR’09]. Tag-based recommendation systems utilize <user, tag, item>.
Motivation: We argue that recommendations based on ternary relationships are not accurate because they miss important associations.
Motivation example: Groundhog Day is recommended to U3.
Motivation: A quaternary relationship is necessary. This is reinforced by the following observations: Users may use the same tag for an item but have different ratings for it. Items may have multiple tags indicating their different facets. Some tags may carry implicit semantics that can reveal the users’ preferences.
Overview of the paper: We propose a model that uses a tensor to represent the quaternary relationship. Higher-Order Singular Value Decomposition (HOSVD) is applied to the 4-order tensor to reveal the latent semantic associations among users, items, tags and ratings.
BACKGROUND - Tensor: A tensor is a multidimensional array. An N-order tensor is denoted as A ∈ R^(I1 × I2 × ... × IN).
BACKGROUND - Tensor unfolding: The matrix unfolding A(i) of an N-order tensor along dimension i is the matrix whose rows are the vectors obtained by keeping the index of dimension i fixed while varying the other indices.
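The unfolding can be sketched in a few lines of NumPy. This is only an illustration: the helper name `unfold` and the toy 2 x 3 x 4 tensor are assumptions, and the column ordering of the unfolding is one of several conventions in use.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-`mode` matrix unfolding.

    Row r of the result holds every entry whose index along `mode`
    equals r; the remaining indices are flattened into the columns.
    """
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

# Toy 3-order tensor (assumed data, for illustration only).
A = np.arange(24).reshape(2, 3, 4)

A0 = unfold(A, 0)  # shape (2, 12)
A1 = unfold(A, 1)  # shape (3, 8)
A2 = unfold(A, 2)  # shape (4, 6)
```

Each unfolding has as many rows as the chosen dimension has indices, so an N-order tensor yields N different matrices.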
BACKGROUND - HOSVD: HOSVD is a generalization of Singular Value Decomposition (SVD) to higher-order tensors and can be written as an n-mode product: A = S ×1 U(1) ×2 U(2) ... ×N U(N), where U(n) contains the orthonormal vectors (n-mode singular vectors) spanning the column space of the matrix unfolding A(n), and S is the core tensor.
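A minimal NumPy sketch of the decomposition follows. The function names `unfold`, `mode_product` and `hosvd` and the random test tensor are assumptions made for illustration; they are not taken from the slides.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-`mode` matrix unfolding (rows indexed by the chosen mode)."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def mode_product(tensor, matrix, mode):
    """n-mode product: multiply `matrix` into axis `mode` of `tensor`."""
    result = np.tensordot(matrix, tensor, axes=(1, mode))
    return np.moveaxis(result, 0, mode)

def hosvd(tensor):
    """Return the core tensor S and the factor matrices U(1), ..., U(N)."""
    factors = [
        np.linalg.svd(unfold(tensor, n), full_matrices=False)[0]
        for n in range(tensor.ndim)
    ]
    core = tensor
    for n, U in enumerate(factors):
        core = mode_product(core, U.T, n)  # project onto the singular vectors
    return core, factors

# Assumed toy data: a random 3 x 4 x 5 tensor.
rng = np.random.default_rng(0)
tensor = rng.random((3, 4, 5))
core, factors = hosvd(tensor)

# Multiplying the core by every U(n) again recovers the tensor.
rebuilt = core
for n, U in enumerate(factors):
    rebuilt = mode_product(rebuilt, U, n)
```

Without truncation the factor matrices are square and orthogonal here, so `rebuilt` matches `tensor` up to floating-point error.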
BACKGROUND: Suppose we want to get the RANK-(2,3,3) approximation. We first retain the first ci columns of the matrix U(i) at each mode i, that is, c1 = 2, c2 = 3 and c3 = 3.
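BACKGROUND - Tensor Approximation: We can now construct the approximate core tensor using Ŝ = A ×1 Û(1)^T ×2 Û(2)^T ×3 Û(3)^T, where Û(i) denotes U(i) truncated to its first ci columns.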
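BACKGROUND: Finally, we obtain the RANK-(2,3,3) approximation Â = Ŝ ×1 Û(1) ×2 Û(2) ×3 Û(3), where Ŝ is the approximate core tensor and Û(i) is U(i) truncated to its first ci columns.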
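The three steps (truncate the singular matrices, build the approximate core, rebuild the tensor) might look as follows in NumPy. The function name `truncated_hosvd` and the random 3 x 4 x 5 input are assumptions for illustration; the slides' own numeric example is not reproduced here.

```python
import numpy as np

def truncated_hosvd(tensor, ranks):
    """Rank-(c1, ..., cN) approximation via truncated HOSVD."""
    # Step 1: keep the first c_i left singular vectors of each unfolding.
    factors = []
    for mode, c in enumerate(ranks):
        unfolding = np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)
        U, _, _ = np.linalg.svd(unfolding, full_matrices=False)
        factors.append(U[:, :c])

    # Step 2: approximate core = tensor multiplied by each truncated U^T.
    core = tensor
    for mode, U in enumerate(factors):
        core = np.moveaxis(np.tensordot(U.T, core, axes=(1, mode)), 0, mode)

    # Step 3: approximation = core multiplied by each truncated U.
    approx = core
    for mode, U in enumerate(factors):
        approx = np.moveaxis(np.tensordot(U, approx, axes=(1, mode)), 0, mode)
    return core, factors, approx

# Assumed toy data: a random 3 x 4 x 5 tensor.
rng = np.random.default_rng(1)
tensor = rng.random((3, 4, 5))
core, factors, approx = truncated_hosvd(tensor, ranks=(2, 3, 3))
```

The approximation has the same shape as the input, while the core shrinks to 2 x 3 x 3.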
QUATERNARY SEMANTIC ANALYSIS: The main idea is to capture the underlying relationships among users, tags, items and ratings by reducing the rank of the original tensor, which minimizes the effect of noise on the underlying population and reduces sparseness.
QUATERNARY SEMANTIC ANALYSIS - Initialization: Input: list of quadruples <user, tag, rating, item>.
QUATERNARY SEMANTIC ANALYSIS - Initialization: The tensor is constructed from the quadruples, where |U|, |T|, |R| and |V| are the numbers of users, tags, ratings and items, respectively.
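One possible way to build such a tensor is sketched below. Several details are assumptions rather than facts from the slides: the function name `build_tensor`, the axis order (user, item, tag, rating), the choice of storing 1.0 for every observed quadruple, and the toy quadruples themselves.

```python
import numpy as np

def build_tensor(quadruples):
    """Build a 4-order tensor from (user, tag, rating, item) quadruples.

    Assumed conventions: axes are ordered (user, item, tag, rating) and
    an observed quadruple is marked with 1.0, everything else with 0.0.
    """
    quadruples = list(quadruples)
    users = sorted({u for u, _, _, _ in quadruples})
    tags = sorted({t for _, t, _, _ in quadruples})
    ratings = sorted({r for _, _, r, _ in quadruples})
    items = sorted({v for _, _, _, v in quadruples})

    # Map every entity to its position along the corresponding axis.
    u_idx = {u: i for i, u in enumerate(users)}
    v_idx = {v: i for i, v in enumerate(items)}
    t_idx = {t: i for i, t in enumerate(tags)}
    r_idx = {r: i for i, r in enumerate(ratings)}

    tensor = np.zeros((len(users), len(items), len(tags), len(ratings)))
    for u, t, r, v in quadruples:
        tensor[u_idx[u], v_idx[v], t_idx[t], r_idx[r]] = 1.0
    return tensor, (users, items, tags, ratings)

# Hypothetical toy input.
quads = [
    ("U1", "funny", 5, "M1"),
    ("U2", "funny", 2, "M1"),
    ("U3", "classic", 4, "M2"),
]
tensor, (users, items, tags, ratings) = build_tensor(quads)
```

Real tagging data is very sparse, so a dense array like this only suits small examples.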
QUATERNARY SEMANTIC ANALYSIS: Calculate the matrix unfoldings A(1), A(2), A(3) and A(4) from the tensor. Perform SVD on each matrix unfolding to obtain the left singular matrices U(1), U(2), U(3) and U(4).
QUATERNARY SEMANTIC ANALYSIS: Remove the |U|-c1, |V|-c2, |T|-c3 and |R|-c4 least significant columns from U(1), U(2), U(3) and U(4), respectively. We choose c1 = 4, c2 = 4, c3 = 4 and c4 = 2.
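QUATERNARY SEMANTIC ANALYSIS: Calculate the approximate core tensor: Ŝ = A ×1 Û(1)^T ×2 Û(2)^T ×3 Û(3)^T ×4 Û(4)^T, where Û(i) is U(i) truncated to its first ci columns. Approximate the original tensor by: Â = Ŝ ×1 Û(1) ×2 Û(2) ×3 Û(3) ×4 Û(4).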
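For the 4-order case, the core tensor and the approximation can be sketched compactly with `np.einsum`. This is an illustration under assumptions: the random binary 6 x 5 x 5 x 3 tensor, the axis order (user, item, tag, rating) and the helper name `left_singular_vectors` are not from the slides; only the choice c1 = 4, c2 = 4, c3 = 4, c4 = 2 is.

```python
import numpy as np

def left_singular_vectors(tensor, mode, keep):
    """First `keep` left singular vectors of the mode-`mode` unfolding."""
    unfolding = np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)
    U, _, _ = np.linalg.svd(unfolding, full_matrices=False)
    return U[:, :keep]

# Assumed toy data: sparse binary tensor, axes (user, item, tag, rating).
rng = np.random.default_rng(0)
A = (rng.random((6, 5, 5, 3)) < 0.1).astype(float)

c = (4, 4, 4, 2)  # retained columns per mode, as chosen on the slide
U1, U2, U3, U4 = (left_singular_vectors(A, m, k) for m, k in enumerate(c))

# Approximate core: multiply A by the transposed truncated factors.
core = np.einsum("abcd,ai,bj,ck,dl->ijkl", A, U1, U2, U3, U4)

# Approximation of the original tensor: multiply the core back.
A_hat = np.einsum("ijkl,ai,bj,ck,dl->abcd", core, U1, U2, U3, U4)
```

Entries of `A_hat` that are nonzero where `A` was zero can be read as candidate latent associations, which is presumably how new quadruples are surfaced.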
QUATERNARY SEMANTIC ANALYSIS: Latent associations such as the newly added quadruples in Table 6 may not be found if the tensor data is sparse. We overcome this problem by applying a smoothing technique to the tensor in the algorithm.
Experimental result - dataset description: Datasets: MovieLens data. The first file contains users’ tags on different movies. The second file contains users’ ratings on different movies on a scale of 1 to 5. By joining these two files over user and movie, we obtain the quadruples <user, movie, tag, rating>. After preprocessing, the dataset has 11122 tuples with 201 users, 501 movies and 404 tags.
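The join over user and movie can be sketched in plain Python. The function name `join_tags_and_ratings`, the in-memory row format and the sample rows are hypothetical; the actual MovieLens file layout and the preprocessing steps are not shown on the slide.

```python
def join_tags_and_ratings(tag_rows, rating_rows):
    """Join (user, movie, tag) rows with (user, movie, rating) rows.

    Returns (user, movie, tag, rating) quadruples for every tag whose
    (user, movie) pair also has a rating; unmatched rows are dropped.
    """
    rating_of = {(user, movie): rating for user, movie, rating in rating_rows}
    return [
        (user, movie, tag, rating_of[(user, movie)])
        for user, movie, tag in tag_rows
        if (user, movie) in rating_of
    ]

# Hypothetical sample rows.
tag_rows = [
    ("U1", "M1", "funny"),
    ("U1", "M1", "classic"),
    ("U2", "M2", "dark"),
]
rating_rows = [
    ("U1", "M1", 5),
    ("U3", "M2", 3),
]
quadruples = join_tags_and_ratings(tag_rows, rating_rows)
```

Here the tag by U2 on M2 is dropped because U2 has no rating for M2, which is the behaviour of an inner join.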
Experimental result - Item Recommendation: Compared methods: UPCC (user-based recommendation), IPCC (item-based recommendation) and Probabilistic Matrix Factorization (PMF).
Conclusion: We have shown that quaternary semantic analysis can lead to more accurate recommendations. We have proposed using a 4-order tensor to model the four heterogeneous entities: users, items, tags and ratings. A unified framework is proposed that utilizes the quaternary relation for user recommendation, item recommendation, tag recommendation and rating prediction.