Collaborative filtering with CCAM


Published by ICMLA'11 in Honolulu, Hawaii.

Published in: Education, News & Politics

Collaborative filtering with CCAM

  1. ICMLA'11, Honolulu, Hawaii
     COLLABORATIVE FILTERING WITH CCAM
     Presenter: Meng-Lun Wu
     Authors: Meng-Lun Wu, Chia-Hui Chang and Rei-Zhe Liu
     Date: 2011/12/21
  2. Outline
     • Introduction
     • Related Work
     • Preliminary
     • Collaborative Filtering with CCAM
     • Experiment
     • Conclusion
  3. Introduction (1/2)
     • In any recommender system, the number of ratings already obtained is usually very small compared to the number of ratings that need to be predicted.
     • Dimensionality reduction methods are one possible solution, since they can alleviate data sparsity.
     • Clustering is typically the simplest such method to extend to recommender systems: it yields a compact model and avoids the sparsity problem.
  4. Introduction (2/2)
     • In recent years, co-clustering based on information theory has attracted increasing attention.
     • We have extended an information-theoretic co-clustering algorithm to augmented data matrices; the result is called Co-Clustering with Augmented data Matrix (CCAM).
     • In this paper, we consider how to alleviate the sparsity problem and achieve precise predictions through collaborative filtering with CCAM.
  5. Related Work
     • Information-theoretic co-clustering
       • Dhillon et al. (2003) developed a co-clustering method from information theory that optimizes an objective function based on the loss of mutual information between the clustered random variables.
     • Matrix factorization co-clustering
       • Chen et al. (2008) linearly combined user-based CF, item-based CF, and matrix factorization results, relying on ONMTF (orthogonal nonnegative matrix tri-factorization), to predict ratings.
       • Li et al. (2009) presented a cross-domain collaborative filtering method that co-clusters movie information via ONMTF and reconstructs knowledge for recommending books and movies.
  6. Preliminary (1/2)
     • Suppose we are given a click information matrix R over a user set U = {u_1, u_2, …, u_{n_u}} and an ad set A = {a_1, a_2, …, a_{n_a}}.
       • n_u and n_a denote the number of users and ads, respectively.
     • Memory-based CF methods inevitably run into sparsity in the data they need before similar neighbors can be found.
     • Dhillon et al. (2003) proposed a co-clustering algorithm that monotonically decreases the information loss of tabular data to form a compact model.
  7. Preliminary (2/2)
     • Assume U and A are random variables with joint probability distribution p(U, A) and marginal distributions p(U) and p(A). The mutual information I(U; A) is defined as
       $$I(U; A) = \sum_{u} \sum_{a} p(u, a) \log \frac{p(u, a)}{p(u)\, p(a)}$$
     • Suppose there are G1 user clusters C_U = {cu(1), cu(2), …, cu(G1)} and G2 ad clusters C_A = {ca(1), ca(2), …, ca(G2)}. To judge the quality of a co-clustering, we define the loss in mutual information as
       $$I(U; A) - I(C_U; C_A)$$
     • PROPOSITION 1. Further properties of this loss are also stated and proven.
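The mutual information and its loss under a co-clustering can be computed directly from a joint distribution table. The following is a minimal sketch, not the authors' implementation: the function names, the list-of-rows representation, and the toy 4 x 4 distribution are all assumptions made for illustration.

```python
import math


def mutual_information(joint):
    """Return I(X; Y) in nats for a joint distribution given as a list of rows."""
    row_marginals = [sum(row) for row in joint]
    col_marginals = [sum(col) for col in zip(*joint)]
    total = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0:  # the term 0 * log 0 is taken as 0
                total += p * math.log(p / (row_marginals[i] * col_marginals[j]))
    return total


def cluster_joint(joint, row_labels, col_labels, g1, g2):
    """Aggregate p(u, a) into p(cu, ca) by summing over the members of each co-cluster."""
    aggregated = [[0.0] * g2 for _ in range(g1)]
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            aggregated[row_labels[i]][col_labels[j]] += p
    return aggregated


def mutual_information_loss(joint, row_labels, col_labels, g1, g2):
    """Loss in mutual information: I(U; A) - I(C_U; C_A)."""
    clustered = cluster_joint(joint, row_labels, col_labels, g1, g2)
    return mutual_information(joint) - mutual_information(clustered)


# Hypothetical toy data: users 0-1 click only ads 0-1, users 2-3 click only ads 2-3.
toy_joint = [
    [0.125, 0.125, 0.0, 0.0],
    [0.125, 0.125, 0.0, 0.0],
    [0.0, 0.0, 0.125, 0.125],
    [0.0, 0.0, 0.125, 0.125],
]
loss_matching = mutual_information_loss(toy_joint, [0, 0, 1, 1], [0, 0, 1, 1], 2, 2)
loss_mixed = mutual_information_loss(toy_joint, [0, 1, 0, 1], [0, 1, 0, 1], 2, 2)
```

On this toy table, a co-clustering that follows the block structure loses no mutual information, while one that mixes the two blocks loses all of it, which is the behavior the loss is meant to capture.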
  8. Co-Clustering with Augmented data Matrix, CCAM (1/4)
     • When Dhillon et al. (2003) first proposed the optimization problem of loss in mutual information, it was designed for and applied to a single data table.
     • In many cases, however, related tables exist alongside the main data set and may provide useful information.
     • Co-Clustering with Augmented data Matrix (CCAM) modifies the co-clusters of multiple augmented data matrices simultaneously to reduce the information loss.
     • Two further sets, the feature set F = {f_1, f_2, …, f_{n_f}} and the profile set P = {p_1, p_2, …, p_{n_p}}, carry additional information about ads and users and form the augmented matrices,
       • where n_f and n_p denote the number of features and profiles, respectively.
  9. Co-Clustering with Augmented data Matrix, CCAM (2/4)
     • PROPOSITION 2. Further properties hold when p(A, F) and p(U, P) are considered,
       • which are also stated and proven.
     • DEFINITION 1. The optimal co-clustering (C_U, C_A) we seek is the one that minimizes the information loss.
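The objective in Definition 1 is not preserved in this transcript. The sketch below shows one plausible form, assuming the objective is a weighted sum of three mutual information losses (user-ad, ad-feature, user-profile) with weights λ and φ, the two parameters tuned later in the deck. The weighted-sum form, the choice to leave features and profiles unclustered, and every function name here are assumptions, not the authors' stated definition.

```python
import math


def mutual_information(joint):
    """Return I(X; Y) in nats for a joint distribution given as a list of rows."""
    row_marginals = [sum(row) for row in joint]
    col_marginals = [sum(col) for col in zip(*joint)]
    total = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0:
                total += p * math.log(p / (row_marginals[i] * col_marginals[j]))
    return total


def merge_rows(joint, labels, k):
    """Sum the rows of a joint distribution that share a cluster label."""
    merged = [[0.0] * len(joint[0]) for _ in range(k)]
    for row, label in zip(joint, labels):
        for j, p in enumerate(row):
            merged[label][j] += p
    return merged


def merge_cols(joint, labels, k):
    """Sum the columns that share a cluster label (transpose, merge rows, transpose back)."""
    transposed = [list(col) for col in zip(*joint)]
    return [list(col) for col in zip(*merge_rows(transposed, labels, k))]


def ccam_objective(p_ua, p_af, p_up, user_labels, ad_labels, g1, g2, lam, phi):
    """Assumed objective: loss(U, A) + lam * loss(A, F) + phi * loss(U, P)."""
    p_cu_ca = merge_cols(merge_rows(p_ua, user_labels, g1), ad_labels, g2)
    loss_ua = mutual_information(p_ua) - mutual_information(p_cu_ca)
    # Assumption: only the ad side of p(A, F) and the user side of p(U, P) are clustered.
    loss_af = mutual_information(p_af) - mutual_information(merge_rows(p_af, ad_labels, g2))
    loss_up = mutual_information(p_up) - mutual_information(merge_rows(p_up, user_labels, g1))
    return loss_ua + lam * loss_af + phi * loss_up


# Hypothetical toy data: two user groups, two ad groups, two features, two profile answers.
p_ua = [
    [0.125, 0.125, 0.0, 0.0],
    [0.125, 0.125, 0.0, 0.0],
    [0.0, 0.0, 0.125, 0.125],
    [0.0, 0.0, 0.125, 0.125],
]
p_af = [[0.25, 0.0], [0.25, 0.0], [0.0, 0.25], [0.0, 0.25]]
p_up = [[0.25, 0.0], [0.25, 0.0], [0.0, 0.25], [0.0, 0.25]]

objective_matching = ccam_objective(p_ua, p_af, p_up, [0, 0, 1, 1], [0, 0, 1, 1], 2, 2, 0.2, 0.1)
objective_mixed = ccam_objective(p_ua, p_af, p_up, [0, 1, 0, 1], [0, 1, 0, 1], 2, 2, 0.2, 0.1)
```

Under this assumed form, a user clustering is penalized both for losing user-ad information and, with weight φ, for grouping users whose profiles differ; the ad side works the same way with weight λ.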
  10. Co-Clustering with Augmented data Matrix, CCAM (3/4)
  11. Algorithm 1: Co-Clustering with Augmented data Matrix algorithm
  12. Collaborative filtering with CCAM (1/5)
  13. Collaborative filtering with CCAM (2/5)
     • DEFINITION 3. Since CCAM is built on KL-divergence, the distance metrics take a similar form.
       • Here we define the distance between each user and each user cluster, and between each ad and each ad cluster.
     • Note that these distances are measured against the ad cluster prototype and user cluster prototype of CCAM.
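A KL-divergence-based distance between a user and a set of user cluster prototypes can be sketched as follows. The exact prototype formula from the slide is not preserved, so this example simply treats each prototype as a given probability distribution over ads; the function names and the toy numbers are hypothetical.

```python
import math


def kl_divergence(p, q):
    """Return D(p || q) in nats for two discrete distributions over the same support."""
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i > 0:
            if q_i == 0:
                return math.inf  # p puts mass where q has none
            total += p_i * math.log(p_i / q_i)
    return total


def nearest_cluster(distribution, prototypes):
    """Return (index of the closest prototype, list of KL distances to every prototype)."""
    distances = [kl_divergence(distribution, prototype) for prototype in prototypes]
    best = min(range(len(prototypes)), key=distances.__getitem__)
    return best, distances


# Hypothetical example: one user's click distribution over three ads
# and two user cluster prototypes.
user_distribution = [0.7, 0.2, 0.1]
cluster_prototypes = [[0.6, 0.3, 0.1], [0.1, 0.2, 0.7]]
best_cluster, kl_distances = nearest_cluster(user_distribution, cluster_prototypes)
```

The same two functions apply unchanged to the ad side by passing an ad's distribution over users together with the ad cluster prototypes.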
  14. Collaborative filtering with CCAM (3/5)
  15. Collaborative filtering with CCAM (4/5)
  16. Collaborative filtering with CCAM (5/5)
  17. Data set
     • The data set used in the experiments was obtained from a financial social web site, Ad$Mart, and covers 2009/09/01 to 2010/03/31.
     • For each test user, 15 observed clicking rates (Given15) are provided to find nearest neighbors, and the remaining clicking rates are used for evaluation.
     • To ensure each test user has clicked at least 15 ads, only users with more than 20 clicked ads and ads with more than 10 clicked user-ad pairs are retained.
       • User-Ad: The click data is provided by 1786 users and 520 ads. After preprocessing, we convert it to a joint probability distribution over users and ads, and also recast it as a clicking rate matrix scaled from 1 to 5.
       • Ad-Feature: An advertisement feature data set compiling 37 statistics of 530 ads.
       • User-Profile: A questionnaire data set provided by 520 users on 24 survey questions.
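The filtering rule and the Given15 protocol described on this slide can be sketched as below. This is an illustrative reading, not the authors' preprocessing code: the dictionary layout (user to ad to clicking rate), the single filtering pass, the random choice of the 15 observed ads, and the function names are all assumptions.

```python
import random


def filter_clicks(clicks, min_ads_per_user=20, min_users_per_ad=10):
    """Keep users with more than min_ads_per_user clicked ads, and ads that occur
    in more than min_users_per_ad user-ad pairs. Counts are taken once, on the raw data."""
    ad_counts = {}
    for ads in clicks.values():
        for ad in ads:
            ad_counts[ad] = ad_counts.get(ad, 0) + 1
    kept_ads = {ad for ad, count in ad_counts.items() if count > min_users_per_ad}
    return {
        user: {ad: rate for ad, rate in ads.items() if ad in kept_ads}
        for user, ads in clicks.items()
        if len(ads) > min_ads_per_user
    }


def given_n_split(user_rates, n=15, seed=0):
    """Split one test user's clicking rates into n observed entries and the held-out rest."""
    rng = random.Random(seed)
    ads = sorted(user_rates)
    observed = set(rng.sample(ads, n))  # raises ValueError if the user has fewer than n ads
    given = {ad: user_rates[ad] for ad in ads if ad in observed}
    held_out = {ad: user_rates[ad] for ad in ads if ad not in observed}
    return given, held_out
```

Because the counts are taken in a single pass, removing sparse ads can leave a retained user with fewer ads than the threshold; a stricter variant would repeat the filter until nothing changes.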
  18. Evaluation methodology (1/2)
  19. Evaluation methodology (2/2)
  20.  and  tuning based on k-NN
  21. G1 and G2 tuning based on K-Means
     • We also have to determine which value of G1 yields a good MAE.
     • To avoid tuning too many parameters, we simply fix G2 = 10 and K1 = K2 = 5.
     • We then observe how k-Means responds to different values of G1 (7, 15, 30, 60) and keep the best one for use with the other algorithms.
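Selecting G1 by MAE amounts to evaluating each candidate and keeping the one with the lowest error. The sketch below shows the standard mean absolute error and a small selection helper; `pick_best` and its `evaluate` callback are hypothetical names, and the callback stands in for a full train-and-predict run that is not shown here.

```python
def mean_absolute_error(predicted, actual):
    """Return the mean of |predicted - actual| over paired values."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("predicted and actual must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def pick_best(candidates, evaluate):
    """Evaluate every candidate setting and return (best candidate, all scores).

    `evaluate` is a caller-supplied function mapping a candidate (for example a
    value of G1) to the MAE obtained with that setting; lower is better.
    """
    scores = {candidate: evaluate(candidate) for candidate in candidates}
    return min(scores, key=scores.get), scores
```

A call such as `pick_best([7, 15, 30, 60], evaluate)` would mirror the tuning described above, with `evaluate` running k-Means-based prediction at the given G1 and returning its MAE on the held-out clicking rates.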
  22. Parameter tuning with CCAM (1/2)
     • To evaluate the co-clustering result, we run a classification algorithm (Weka J48) on the user data and measure the F-measure under 10-fold cross-validation; the ad side is evaluated in the same way.
     • We use the clustering result of the user data (user-ad matrix and user-profile matrix) as the target labels when evaluating user clustering; the ad data (ad-user matrix and ad-feature matrix) is treated likewise.
     • To examine the effectiveness of co-clustering, we reduce the columns of the user-ad matrix to a smaller user-ad-cluster matrix. The reduced data is then added to the user data for classification, and likewise for the ad data.
     (Figure: clustering result, user-ad cluster matrix, and user data of user-ad and user-profile.)
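The column reduction described on this slide can be sketched as follows: columns of the user-ad matrix that belong to the same ad cluster are collapsed into one column, and the result is joined to the user data before classification. Summation as the way of collapsing columns, and the helper names, are assumptions; the slide does not say how the columns are combined.

```python
def reduce_columns(matrix, col_labels, n_clusters):
    """Collapse the columns of a matrix into n_clusters columns by summing
    the columns that share a cluster label."""
    reduced = [[0.0] * n_clusters for _ in matrix]
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            reduced[i][col_labels[j]] += value
    return reduced


def append_features(base_rows, extra_rows):
    """Concatenate two feature tables row by row (same row order in both)."""
    return [list(base) + list(extra) for base, extra in zip(base_rows, extra_rows)]


# Hypothetical example: 2 users, 4 ads grouped into 2 ad clusters, 2 profile columns.
user_ad = [[1, 0, 2, 1], [0, 3, 0, 1]]
ad_cluster_labels = [0, 0, 1, 1]
user_profile = [[0.5, 0.5], [1.0, 0.0]]

user_ad_cluster = reduce_columns(user_ad, ad_cluster_labels, 2)
user_data = append_features(user_profile, user_ad_cluster)
```

The rows of `user_data`, paired with each user's cluster label as the target, are what a decision tree classifier such as J48 would then be trained and cross-validated on.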
  23. Parameter tuning with CCAM (2/2)
     • We find that when G1 = 60, the best setting is λ = 0.2, φ = 0.1.
     • We therefore apply these optimal CCAM parameters in the next section to compare with the other algorithms.
  24. Results
     • Table 3 compares the model-based approaches.
     • Table 4 compares the hybrid model approaches under the previous parameter settings.
  25. Conclusion
     • In this paper, we applied Chen's rating framework to evaluate the performance of hybrid CF with various model constructions.
     • To give a fair comparison, we started by tuning each individual approach for its best performance.
     • We then compared four algorithms: CCAM, ITCC, k-Means and k-NN. The MAE metric shows that CCAM outperformed the other three.
     • In future work, for a more thorough discussion, we will investigate our algorithm on other real-world data sets,
       • such as the MovieLens, EachMovie and Book-Crossing data sets, which contain users' movie and book rating data.
  26. THANK YOU FOR LISTENING.
     Q&A