Co-clustering with augmented data

Co-clustering with augmented data matrix
Authors: Meng-Lun Wu, Chia-Hui Chang, and Rui-Zhe Liu
Dept. of Computer Science and Information Engineering, National Central University
2011/8/24, DaWaK 2011 in Toulouse, France

Outline
Introduction
Related Work
Problem Formulation
Co-Clustering Algorithm
Experimental Results and Evaluation
Conclusion

Introduction
Over the past decade, co-clustering has emerged as a way to cluster both dimensions of dyadic data simultaneously. However, most research uses only the dyadic data as the main clustering matrix and ignores any additional information.
In addition to a user-movie click matrix, we might have user preferences and movie descriptions.
Similarly, in addition to a document-word co-occurrence matrix, we might have document genres and word meanings.

Introduction (cont.)
To fully utilize the augmented matrices, we propose a new method called Co-Clustering with Augmented data Matrix (CCAM).
The Umatch [1] social website provides the Ad$mart service, which lets users click on ads and shares the profit with them.
We were able to cooperate with the Umatch website, which asked us to analyze its ad-user information using the following data: ad-user click data, ad setting data, and user profiles (Lohas questionnaire).
1. Umatch: http://www.morgenstern.com.tw/users2/index.php/u_match1/

Related work
Co-clustering research can be divided into three categories: MDCC, MOCC [2], and ITCC.
MDCC: matrix decomposition co-clustering
Long et al. (2005), “Co-clustering by Block Value Decomposition”
Ding et al. (2005) gave a similar co-clustering approach based on nonnegative matrix factorization.
MOCC: topic model based co-clustering
Shafiei et al. (2006), “Latent Dirichlet Co-clustering”
Hanhuai et al. (2008), “Bayesian Co-clustering”
2. M. Mahdi Shafiei and Evangelos E. Milios, “Model-based Overlapping Co-Clustering”. Supported by grants from the Natural Sciences and Engineering Research.

Related work (cont.)
ITCC: an optimization method
Dhillon et al. (2003), “Information-Theoretic Co-Clustering”
Banerjee et al. (2004), “A Generalized Maximum Entropy Approach to Bregman Co-clustering and Matrix Approximation”
Li et al. employ the ITCC framework to propagate class structure and knowledge from in-domain data to out-of-domain data.
Inspired by Li and Dhillon, we extend the ITCC framework with augmented matrices to co-cluster ads and users.

Problem formulation
Let A, U, S and L be discrete random variables.
A denotes ads, ranging over {a1,…,am}.
U denotes users, ranging over {u1,…,un}.
S denotes ad settings, ranging over {s1,…,sr}.
L denotes user Lohas questionnaire items, ranging over {l1,…,lv}.
Input data: the joint probability distributions
p(A, U): ad-user link matrix
p(A, S): ad-setting matrix
p(U, L): user-Lohas matrix
Given p(A, U), the mutual information is defined as
$$I(A;U) = \sum_{a}\sum_{u} p(a,u)\,\log\frac{p(a,u)}{p(a)\,p(u)}$$
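The definition of I(A;U) can be tried on a small example. The following is a minimal sketch in plain Python, assuming the joint distribution is stored as a list of rows that sums to 1; the function name `mutual_information` and the toy matrices are illustrative and not taken from the talk.

```python
import math

def mutual_information(joint):
    """I(X;Y) in nats for a joint distribution stored as a list of rows."""
    row_marg = [sum(row) for row in joint]        # p(a)
    col_marg = [sum(col) for col in zip(*joint)]  # p(u)
    mi = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0:  # terms with p(a,u) = 0 contribute nothing
                mi += p * math.log(p / (row_marg[i] * col_marg[j]))
    return mi

# Hypothetical 2-ad x 2-user click distributions (illustrative values only).
independent = [[0.25, 0.25], [0.25, 0.25]]
dependent = [[0.5, 0.0], [0.0, 0.5]]
print(mutual_information(independent))  # no dependence between ads and users
print(mutual_information(dependent))    # ln 2, about 0.693
```

When clicks are independent of the ad, the mutual information is zero; when each ad is clicked by exactly one user, it reaches ln 2 for this 2x2 case.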

Problem formulation
Goal: to obtain
k ad clusters, denoted by {â1, …, âk}
l user groups, denoted by {û1, …, ûl}
such that the mutual information loss after co-clustering is minimized. The objective function is
$$f(\hat{A},\hat{U}) = [I(A;U) - I(\hat{A};\hat{U})] + \lambda\,[I(A;S) - I(\hat{A};S)] + \varphi\,[I(U;L) - I(\hat{U};L)]$$
where λ and φ are trade-off parameters that balance the influence of the ad-setting and user-Lohas terms on the ad clusters and user groups.
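The objective can be evaluated directly for any fixed hard co-clustering. The sketch below is one plain-Python way to do so, assuming each input is a joint distribution stored as a list of rows and each clustering is a list of cluster ids; the helper names (`merge_rows`, `ccam_objective`) and the toy data are hypothetical, not the authors' implementation.

```python
import math

def mutual_information(joint):
    """I(X;Y) in nats for a joint distribution stored as a list of rows."""
    row_marg = [sum(row) for row in joint]
    col_marg = [sum(col) for col in zip(*joint)]
    return sum(
        p * math.log(p / (row_marg[i] * col_marg[j]))
        for i, row in enumerate(joint)
        for j, p in enumerate(row)
        if p > 0
    )

def merge_rows(joint, row_cluster, k):
    """Sum the rows that share a cluster id (hard clustering)."""
    merged = [[0.0] * len(joint[0]) for _ in range(k)]
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            merged[row_cluster[i]][j] += p
    return merged

def transpose(matrix):
    return [list(col) for col in zip(*matrix)]

def ccam_objective(p_au, p_as, p_ul, ad_cluster, user_cluster, k, l, lam, phi):
    """Mutual information loss f(A_hat, U_hat) for a fixed hard co-clustering."""
    p_hat_as = merge_rows(p_as, ad_cluster, k)    # p(A_hat, S)
    p_hat_ul = merge_rows(p_ul, user_cluster, l)  # p(U_hat, L)
    ads_merged = merge_rows(p_au, ad_cluster, k)  # p(A_hat, U)
    p_hat_au = transpose(merge_rows(transpose(ads_merged), user_cluster, l))
    loss_au = mutual_information(p_au) - mutual_information(p_hat_au)
    loss_as = mutual_information(p_as) - mutual_information(p_hat_as)
    loss_ul = mutual_information(p_ul) - mutual_information(p_hat_ul)
    return loss_au + lam * loss_as + phi * loss_ul

# Hypothetical toy data: ads 0-1 go with users 0-1, ads 2-3 with users 2-3.
p_au = [[0.125, 0.125, 0.0, 0.0],
        [0.125, 0.125, 0.0, 0.0],
        [0.0, 0.0, 0.125, 0.125],
        [0.0, 0.0, 0.125, 0.125]]
p_as = [[0.25, 0.0], [0.25, 0.0], [0.0, 0.25], [0.0, 0.25]]
p_ul = [[0.25, 0.0], [0.25, 0.0], [0.0, 0.25], [0.0, 0.25]]

aligned = ccam_objective(p_au, p_as, p_ul, [0, 0, 1, 1], [0, 0, 1, 1], 2, 2, 0.6, 1.0)
mixed = ccam_objective(p_au, p_as, p_ul, [0, 1, 0, 1], [0, 1, 0, 1], 2, 2, 0.6, 1.0)
print(aligned, mixed)  # aligned is close to 0; mixed is about 2.6 * ln 2
```

A co-clustering that follows the block structure loses almost no information, while one that mixes the blocks loses all of it in each of the three terms.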

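Problem formulation (cont.)
Lemma 1. For a fixed co-clustering (Â, Û), we can write the loss in mutual information as
$$\begin{aligned}
f(\hat{A},\hat{U}) &= [I(A;U) - I(\hat{A};\hat{U})] + \lambda\,[I(A;S) - I(\hat{A};S)] + \varphi\,[I(U;L) - I(\hat{U};L)] \\
&= D(p(A,U)\,\|\,q(A,U)) + \lambda \cdot D(p(A,S)\,\|\,q(A,S)) + \varphi \cdot D(p(U,L)\,\|\,q(U,L))
\end{aligned}$$
where D(·‖·) denotes the Kullback-Leibler divergence and q(A, U), q(A, S) and q(U, L) are obtained by
$$q(a,u) = p(\hat{a},\hat{u})\,p(a|\hat{a})\,p(u|\hat{u}), \quad \text{where } \hat{a} = C_A(a) \text{ and } \hat{u} = C_U(u)$$
$$q(a,s) = p(\hat{a},s)\,p(a|\hat{a}), \quad \text{where } \hat{a} = C_A(a)$$
$$q(u,l) = p(\hat{u},l)\,p(u|\hat{u}), \quad \text{where } \hat{u} = C_U(u)$$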

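Lemma 1 Proof
Since we are considering hard clustering,
$$p(\hat{a},\hat{u}) = \sum_{a\in\hat{a}}\sum_{u\in\hat{u}} p(a,u), \quad p(\hat{a},s) = \sum_{a\in\hat{a}} p(a,s), \quad p(\hat{u},l) = \sum_{u\in\hat{u}} p(u,l)$$
$$\begin{aligned}
I(A;U) - I(\hat{A};\hat{U})
&= \sum_{\hat{a}}\sum_{\hat{u}}\sum_{a\in\hat{a}}\sum_{u\in\hat{u}} p(a,u)\log\frac{p(a,u)}{p(a)\,p(u)} - \sum_{\hat{a}}\sum_{\hat{u}}\sum_{a\in\hat{a}}\sum_{u\in\hat{u}} p(a,u)\log\frac{p(\hat{a},\hat{u})}{p(\hat{a})\,p(\hat{u})} \\
&= \sum_{\hat{a}}\sum_{\hat{u}}\sum_{a\in\hat{a}}\sum_{u\in\hat{u}} p(a,u)\log\frac{p(a,u)}{p(\hat{a},\hat{u})\,\frac{p(a)}{p(\hat{a})}\,\frac{p(u)}{p(\hat{u})}} \\
&= \sum_{\hat{a}}\sum_{\hat{u}}\sum_{a\in\hat{a}}\sum_{u\in\hat{u}} p(a,u)\log\frac{p(a,u)}{q(a,u)} \\
&= D(p(A,U)\,\|\,q(A,U))
\end{aligned}$$
where $p(a|\hat{a}) = p(a)/p(\hat{a})$ for $\hat{a} = C_A(a)$, and similarly for $p(u|\hat{u})$.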

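Lemma 1 Proof (cont.)
$$\begin{aligned}
I(A;S) - I(\hat{A};S)
&= \sum_{\hat{a}}\sum_{a\in\hat{a}}\sum_{s} p(a,s)\log\frac{p(a,s)}{p(a)\,p(s)} - \sum_{\hat{a}}\sum_{a\in\hat{a}}\sum_{s} p(a,s)\log\frac{p(\hat{a},s)}{p(\hat{a})\,p(s)} \\
&= \sum_{\hat{a}}\sum_{a\in\hat{a}}\sum_{s} p(a,s)\log\frac{p(a,s)}{p(\hat{a},s)\,\frac{p(a)}{p(\hat{a})}} \\
&= \sum_{\hat{a}}\sum_{a\in\hat{a}}\sum_{s} p(a,s)\log\frac{p(a,s)}{q(a,s)} \\
&= D(p(A,S)\,\|\,q(A,S))
\end{aligned}$$
$$\begin{aligned}
I(U;L) - I(\hat{U};L)
&= \sum_{\hat{u}}\sum_{u\in\hat{u}}\sum_{l} p(u,l)\log\frac{p(u,l)}{p(u)\,p(l)} - \sum_{\hat{u}}\sum_{u\in\hat{u}}\sum_{l} p(u,l)\log\frac{p(\hat{u},l)}{p(\hat{u})\,p(l)} \\
&= \sum_{\hat{u}}\sum_{u\in\hat{u}}\sum_{l} p(u,l)\log\frac{p(u,l)}{p(\hat{u},l)\,\frac{p(u)}{p(\hat{u})}} \\
&= \sum_{\hat{u}}\sum_{u\in\hat{u}}\sum_{l} p(u,l)\log\frac{p(u,l)}{q(u,l)} \\
&= D(p(U,L)\,\|\,q(U,L))
\end{aligned}$$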
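Lemma 1 can also be checked numerically for the ad-user term. The sketch below builds q(a,u) = p(â,û) p(a|â) p(u|û) for a hypothetical 3-ad by 4-user distribution and compares the mutual information loss with the KL divergence; all function names and values are illustrative assumptions, not part of the talk.

```python
import math

def mutual_information(joint):
    """I(X;Y) in nats for a joint distribution stored as a list of rows."""
    row_marg = [sum(row) for row in joint]
    col_marg = [sum(col) for col in zip(*joint)]
    return sum(
        p * math.log(p / (row_marg[i] * col_marg[j]))
        for i, row in enumerate(joint)
        for j, p in enumerate(row)
        if p > 0
    )

def kl_divergence(p, q):
    """D(p || q) in nats for two joint distributions of the same shape."""
    return sum(
        p_val * math.log(p_val / q_val)
        for p_row, q_row in zip(p, q)
        for p_val, q_val in zip(p_row, q_row)
        if p_val > 0
    )

def cluster_joint(p, row_cluster, col_cluster, k, l):
    """p(a_hat, u_hat): sum p(a, u) over the members of each cluster pair."""
    p_hat = [[0.0] * l for _ in range(k)]
    for i, row in enumerate(p):
        for j, val in enumerate(row):
            p_hat[row_cluster[i]][col_cluster[j]] += val
    return p_hat

def approximation_q(p, row_cluster, col_cluster, k, l):
    """q(a, u) = p(a_hat, u_hat) * p(a | a_hat) * p(u | u_hat)."""
    p_hat = cluster_joint(p, row_cluster, col_cluster, k, l)
    p_a = [sum(row) for row in p]
    p_u = [sum(col) for col in zip(*p)]
    p_a_hat = [sum(row) for row in p_hat]
    p_u_hat = [sum(col) for col in zip(*p_hat)]
    q = []
    for i, a_hat in enumerate(row_cluster):
        q.append([
            p_hat[a_hat][u_hat] * (p_a[i] / p_a_hat[a_hat]) * (p_u[j] / p_u_hat[u_hat])
            for j, u_hat in enumerate(col_cluster)
        ])
    return q

# Hypothetical joint distribution (sums to 1) and hard co-clustering.
p_au = [[0.10, 0.05, 0.05, 0.05],
        [0.05, 0.10, 0.05, 0.05],
        [0.05, 0.05, 0.20, 0.20]]
ad_cluster, user_cluster = [0, 0, 1], [0, 0, 1, 1]

p_hat = cluster_joint(p_au, ad_cluster, user_cluster, 2, 2)
q_au = approximation_q(p_au, ad_cluster, user_cluster, 2, 2)
mi_loss = mutual_information(p_au) - mutual_information(p_hat)
kl = kl_divergence(p_au, q_au)
print(mi_loss, kl)  # the two values agree up to floating-point error
```

The same construction applies to the ad-setting and user-Lohas terms by clustering only the rows.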

Co-clustering algorithm


Experimental results and evaluation
Evaluating clustering performance is difficult because there is no standard target to compare against. We therefore present two evaluation methods, based on class prediction and group variance:
Classification based evaluation
Mutual information based evaluation
We retrieved data from 2009/09/01 to 2010/03/31, covering 530 ads and 9,865 users. For Lohas, only 2,124 users have values (they filled in the Lohas questionnaire); the others are filled with zeros.

Classification based evaluation
Clustering is evaluated here as a classification task. Since we do not have target labels, we generate them as follows.
Target (initial cluster) generation: the targets come from K-means clustering applied to the following data.
Ad matrix (Ad): p(A, S) + p(A, U)
User matrix (User): p(U, L) + p(U, A)
Parameter setting:
Iterations of K-means: 1000
Number of clusters K: from 2 to 5
Output: ad clusters $C_A^{(0)}$ and user groups $C_U^{(0)}$
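The target generation step can be sketched as follows. This is a minimal plain-Python version of Lloyd's K-means, under two assumptions that the slides do not state: the "+" in "p(A, S) + p(A, U)" is read as concatenating the two feature rows of each ad, and the first K points seed the centroids. The data values are hypothetical.

```python
import math

def kmeans(points, k, iterations=1000):
    """Plain Lloyd's K-means; returns one cluster id per point.

    Assumption: the first k points are used as initial centroids.
    """
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for step in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if step > 0 and new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [sum(vals) / len(members) for vals in zip(*members)]
    return labels

# Hypothetical p(A, S) and p(A, U) rows for four ads (illustrative values).
p_as = [[0.25, 0.0], [0.0, 0.25], [0.25, 0.0], [0.0, 0.25]]
p_au = [[0.2, 0.05, 0.0], [0.0, 0.05, 0.2], [0.15, 0.1, 0.0], [0.0, 0.1, 0.15]]

# Ad matrix: one concatenated feature row per ad.
ad_matrix = [s_row + u_row for s_row, u_row in zip(p_as, p_au)]
initial_ad_clusters = kmeans(ad_matrix, k=2)
print(initial_ad_clusters)
```

The user matrix would be built the same way from the rows of p(U, L) and p(U, A), and K would be varied from 2 to 5 as on the slide.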

Classification based evaluation (cont.)
Co-clustering features (ITCC and CCAM):
User-ad cluster matrix: summation over the ads $a_i$ that belong to ad cluster $\hat{a}_k$.
$$U\hat{A}_{jk} = \ln \sum_{a_i \in \hat{a}_k} UA_{ji}$$
Ad-user group matrix: summation over the users $u_j$ that belong to user group $\hat{u}_l$.
$$A\hat{U}_{il} = \ln \sum_{u_j \in \hat{u}_l} AU_{ij}$$
After generating the targets and the co-clustering features, we apply a decision tree to classify the co-clustering result and use the F-measure as the evaluation metric.
Testing data with co-clustering features:
Ad + AÛ
User + UÂ
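The two feature matrices follow the same pattern: sum the entries of a row over the members of each cluster, then take the natural logarithm. The sketch below illustrates this for the user-ad cluster matrix. The function name, the toy click counts, and the treatment of empty sums (mapped to 0.0) are assumptions; the slides do not say how zero sums are handled.

```python
import math

def cluster_log_features(matrix, col_cluster, k):
    """For each row, ln of the sum of its entries over each column cluster."""
    features = []
    for row in matrix:
        sums = [0.0] * k
        for i, value in enumerate(row):
            sums[col_cluster[i]] += value
        # Assumption: an empty sum maps to 0.0, since ln(0) is undefined.
        features.append([math.log(s) if s > 0 else 0.0 for s in sums])
    return features

# Hypothetical user-ad click counts: 2 users (rows) x 4 ads (columns).
ua = [[3, 1, 0, 0],
      [0, 0, 2, 5]]
ad_cluster = [0, 0, 1, 1]  # ads 0-1 in cluster 0, ads 2-3 in cluster 1

ua_hat = cluster_log_features(ua, ad_cluster, 2)
print(ua_hat)  # [[ln 4, 0.0], [0.0, ln 7]]
```

The ad-user group matrix is obtained by calling the same function on the ad-user matrix with the user group assignment.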

Ad cluster evaluation
[Figure: ad cluster evaluation. Annotated CCAM settings: λ=0.6, φ=1.0; λ=0.2, φ=1.0; λ=0.8, φ=1.0; λ=0.6, φ=1.0]

User group evaluation
[Figure: user group evaluation. Annotated CCAM settings: λ=0.6, φ=1.0; λ=0.2, φ=1.0; λ=0.8, φ=1.0; λ=0.6, φ=1.0]

Parameter tuning of CCAM
We fix φ=1.0, vary λ from 0.2 to 1.0, and observe the average F-measure over ads and users.
The optimal parameters for each K are:
K=2, 4: φ=1.0, λ=0.6
K=3: φ=1.0, λ=0.8
K=5: φ=1.0, λ=0.2
We also fix λ=1.0 and vary φ from 0.2 to 1.0 for K from 3 to 5, but the results do not change.
φ controls the contribution of p(U, L), and we suspect it has no effect because zero entries dominate the 161x7736 p(U, L) matrix.

Parameter tuning (one trade-off parameter fixed at 1.0)

Parameter tuning (one trade-off parameter fixed at 1.0, cont.)

Mutual information based evaluation
Mutual information captures the nature of co-clustering by measuring how much the ad clusters and user groups tell us about each other. The higher the value, the better the clustering.
We use the following equation to measure the mutual information:
$$I(\hat{A};\hat{U}) = \sum_{\hat{a}}\sum_{\hat{u}} p(\hat{a},\hat{u})\,\log\frac{p(\hat{a},\hat{u})}{p(\hat{a})\,p(\hat{u})}$$
where $p(\hat{a},\hat{u}) = \sum_{a\in\hat{a}}\sum_{u\in\hat{u}} p(a,u)$
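As an illustration of this measure, the sketch below aggregates p(a,u) into p(â,û) and computes I(Â;Û) for two candidate co-clusterings of a hypothetical block-structured click distribution. The function name and data are assumptions made for the example.

```python
import math

def clustered_mutual_information(p_au, ad_cluster, user_cluster, k, l):
    """I(A_hat; U_hat) in nats after summing p(a, u) within each cluster pair."""
    p_hat = [[0.0] * l for _ in range(k)]
    for i, row in enumerate(p_au):
        for j, p in enumerate(row):
            p_hat[ad_cluster[i]][user_cluster[j]] += p
    p_a_hat = [sum(row) for row in p_hat]
    p_u_hat = [sum(col) for col in zip(*p_hat)]
    mi = 0.0
    for a in range(k):
        for u in range(l):
            p = p_hat[a][u]
            if p > 0:
                mi += p * math.log(p / (p_a_hat[a] * p_u_hat[u]))
    return mi

# Hypothetical clicks: ads 0-1 are clicked by users 0-1, ads 2-3 by users 2-3.
p_au = [[0.125, 0.125, 0.0, 0.0],
        [0.125, 0.125, 0.0, 0.0],
        [0.0, 0.0, 0.125, 0.125],
        [0.0, 0.0, 0.125, 0.125]]

aligned = clustered_mutual_information(p_au, [0, 0, 1, 1], [0, 0, 1, 1], 2, 2)
mixed = clustered_mutual_information(p_au, [0, 1, 0, 1], [0, 1, 0, 1], 2, 2)
print(aligned, mixed)  # the aligned co-clustering keeps ln 2; the mixed one keeps none
```

A co-clustering that preserves more mutual information between ad clusters and user groups scores higher under this evaluation.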

Mutual information based evaluation (cont.)

Monotonically decreasing mutual information loss

Conclusion
Co-clustering aims to achieve the dual goals of row clustering and column clustering.
However, most co-clustering algorithms focus only on the correlation matrix between rows and columns.
Our proposed method, Co-Clustering with Augmented Matrix (CCAM), makes full use of the augmented data to achieve better co-clustering.
CCAM achieves better classification performance than ITCC and comparable performance in the mutual information evaluation.

Thank you for listening. Q & A
