


- 1. Co-clustering with Augmented Data Matrix
  Authors: Meng-Lun Wu, Chia-Hui Chang, and Rui-Zhe Liu
  Dept. of Computer Science and Information Engineering, National Central University
  DaWak 2011, Toulouse, France, 2011/8/24

- 2. Outline
  - Introduction
  - Related Work
  - Problem Formulation
  - Co-Clustering Algorithm
  - Experiment Results and Evaluation
  - Conclusion

- 3. Introduction
  - Over the past decade, co-clustering has emerged to cluster both dimensions of dyadic data simultaneously.
  - However, most research treats the dyadic data as the only clustering matrix and does not consider additional information.
  - In addition to a user-movie click matrix, we might have user preferences and movie descriptions.
  - Similarly, in addition to a document-word co-occurrence matrix, we might have document genres and word meanings.

- 4. Introduction (cont.)
  - To fully utilize augmented matrices, we propose a new method called Co-Clustering with Augmented data Matrix (CCAM).
  - The Umatch1 social website provides the Ad$mart service, which lets users click ads and shares the profit with them.
  - We cooperate with Umatch, which asks us to analyze its ad-user information based on the following data: ad-user click data, ad setting data, and user profiles (Lohas questionnaire).
  1. Umatch: http://www.morgenstern.com.tw/users2/index.php/u_match1/

- 5. Related work
  - Co-clustering research can be divided into three categories: MDCC, MOCC2, and ITCC.
  - MDCC: matrix-decomposition co-clustering
    - Long et al. (2005), "Co-clustering by Block Value Decomposition"
    - Ding et al. (2005) gave a similar co-clustering approach based on nonnegative matrix factorization.
  - MOCC2: topic-model-based co-clustering
    - Shafiei et al. (2006), "Latent Dirichlet Co-clustering"
    - Hanhuai et al. (2008), "Bayesian Co-clustering"
  2. M. Mahdi Shafiei and Evangelos E. Milios, "Model-based Overlapping Co-Clustering", supported by grants from the Natural Sciences and Engineering Research Council.

- 6. Related work (cont.)
  - ITCC: an optimization-based method
    - Dhillon et al. (2003), "Information-Theoretic Co-Clustering"
    - Banerjee et al. (2004), "A Generalized Maximum Entropy Approach to Bregman Co-clustering and Matrix Approximation"
    - Li et al. employ the ITCC framework to propagate class structure and knowledge from in-domain data to out-of-domain data.
  - Inspired by Li and Dhillon, we extend the ITCC framework with augmented matrices to co-cluster ads and users.

- 7. Problem formulation
  - Let A, U, S, and L be discrete random variables.
    - A denotes ads, ranging over {a1, ..., am}
    - U denotes users, ranging over {u1, ..., un}
    - S denotes ad settings, ranging over {s1, ..., sr}
    - L denotes the user Lohas questionnaire, ranging over {l1, ..., lv}
  - Input data: the joint probability distributions
    - p(A, U): ad-user link matrix
    - p(A, S): ad-setting matrix
    - p(U, L): user-Lohas matrix
  - Given p(A, U), the mutual information is defined as
    $$I(A;U) = \sum_a \sum_u p(a,u)\,\log\frac{p(a,u)}{p(a)\,p(u)}$$

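The mutual information above can be computed directly from a joint probability table. A minimal sketch in Python (the function name and the list-of-lists representation are our own, not from the paper):

```python
import math

def mutual_information(p_joint):
    """I(A;U) = sum over (a,u) of p(a,u) * log(p(a,u) / (p(a) p(u))).

    p_joint: list of lists forming a joint probability table that sums to 1.
    """
    m, n = len(p_joint), len(p_joint[0])
    p_a = [sum(p_joint[i]) for i in range(m)]                       # row marginals p(a)
    p_u = [sum(p_joint[i][j] for i in range(m)) for j in range(n)]  # column marginals p(u)
    mi = 0.0
    for i in range(m):
        for j in range(n):
            p = p_joint[i][j]
            if p > 0:  # 0 * log 0 is treated as 0 by convention
                mi += p * math.log(p / (p_a[i] * p_u[j]))
    return mi

# Independent rows and columns carry zero mutual information:
print(mutual_information([[0.25, 0.25], [0.25, 0.25]]))  # 0.0
```

A perfectly correlated table such as [[0.5, 0], [0, 0.5]] instead gives ln 2, the maximum for two binary variables.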
- 8. Problem formulation (cont.)
  - Goal: to obtain
    - k ad clusters, denoted {â1, ..., âk}
    - l user groups, denoted {û1, ..., ûl}
  - such that the mutual information loss after co-clustering minimizes the objective function
    $$f(A,U) = \big[I(A;U) - I(\hat{A};\hat{U})\big] + \lambda\,\big[I(A;S) - I(\hat{A};S)\big] + \varphi\,\big[I(U;L) - I(\hat{U};L)\big]$$
  - where λ and φ are trade-off parameters that balance the effect on the ad clusters and the user groups.

- 9. Problem formulation (cont.)
  - Let q(A, U) denote the approximation distribution for p(A, U).
  - Lemma 1. For a fixed co-clustering (Â, Û), the loss in mutual information can be written as
    $$f(A,U) = \big[I(A;U) - I(\hat{A};\hat{U})\big] + \lambda\,\big[I(A;S) - I(\hat{A};S)\big] + \varphi\,\big[I(U;L) - I(\hat{U};L)\big]$$
    $$= D\big(p(A,U)\,\|\,q(A,U)\big) + \lambda \cdot D\big(p(A,S)\,\|\,q(A,S)\big) + \varphi \cdot D\big(p(U,L)\,\|\,q(U,L)\big)$$
  - where q(A, U), q(A, S), and q(U, L) are obtained by
    $$q(a,u) = p(\hat{a},\hat{u})\,p(a|\hat{a})\,p(u|\hat{u}), \quad\text{where } \hat{a}=C_A(a) \text{ and } \hat{u}=C_U(u)$$
    $$q(a,s) = p(\hat{a},s)\,p(a|\hat{a}), \quad\text{where } \hat{a}=C_A(a)$$
    $$q(u,l) = p(\hat{u},l)\,p(u|\hat{u}), \quad\text{where } \hat{u}=C_U(u)$$

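The approximation q(a, u) = p(â, û) p(a|â) p(u|û) can be materialized from the joint table and the hard cluster assignments. A sketch under our own representation (dict-based cluster masses; all names are illustrative):

```python
from collections import defaultdict

def q_approx(p, C_A, C_U):
    """Build the table q(a,u) = p(ahat,uhat) * p(a|ahat) * p(u|uhat),
    with p(a|ahat) = p(a)/p(ahat), for hard assignments C_A, C_U."""
    m, n = len(p), len(p[0])
    p_a = [sum(row) for row in p]                             # p(a)
    p_u = [sum(p[i][j] for i in range(m)) for j in range(n)]  # p(u)
    p_ahat = defaultdict(float)   # p(ahat): mass of each ad cluster
    p_uhat = defaultdict(float)   # p(uhat): mass of each user group
    p_hat = defaultdict(float)    # p(ahat, uhat): block masses
    for i in range(m):
        p_ahat[C_A[i]] += p_a[i]
    for j in range(n):
        p_uhat[C_U[j]] += p_u[j]
    for i in range(m):
        for j in range(n):
            p_hat[(C_A[i], C_U[j])] += p[i][j]
    return [[p_hat[(C_A[i], C_U[j])]
             * (p_a[i] / p_ahat[C_A[i]])
             * (p_u[j] / p_uhat[C_U[j]])
             for j in range(n)] for i in range(m)]
```

Since each factor preserves the relevant marginals, q always sums to 1; when the matrix is block-constant within the chosen clusters, q reproduces p exactly.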
- 10. Lemma 1 proof
  - Since we are considering hard clustering:
    $$p(\hat{a},\hat{u}) = \sum_{a\in\hat{a}}\sum_{u\in\hat{u}} p(a,u), \qquad p(\hat{a},s) = \sum_{a\in\hat{a}} p(a,s), \qquad p(\hat{u},l) = \sum_{u\in\hat{u}} p(u,l)$$
  - Then
    $$I(A;U) - I(\hat{A};\hat{U}) = \sum_{\hat{a}}\sum_{\hat{u}}\sum_{a\in\hat{a}}\sum_{u\in\hat{u}} p(a,u)\log\frac{p(a,u)}{p(a)\,p(u)} - \sum_{\hat{a}}\sum_{\hat{u}}\sum_{a\in\hat{a}}\sum_{u\in\hat{u}} p(a,u)\log\frac{p(\hat{a},\hat{u})}{p(\hat{a})\,p(\hat{u})}$$
    $$= \sum_{\hat{a}}\sum_{\hat{u}}\sum_{a\in\hat{a}}\sum_{u\in\hat{u}} p(a,u)\log\frac{p(a,u)\,p(\hat{a})\,p(\hat{u})}{p(\hat{a},\hat{u})\,p(a)\,p(u)}$$
    $$= \sum_{\hat{a}}\sum_{\hat{u}}\sum_{a\in\hat{a}}\sum_{u\in\hat{u}} p(a,u)\log\frac{p(a,u)}{q(a,u)} = D\big(p(A,U)\,\|\,q(A,U)\big)$$
  - where $p(a|\hat{a}) = p(a)/p(\hat{a})$ for $\hat{a} = C_A(a)$, and similarly for $p(u|\hat{u})$.

- 11. Lemma 1 proof (cont.)
  - Analogously for the augmented matrices:
    $$I(A;S) - I(\hat{A};S) = \sum_{\hat{a}}\sum_{a\in\hat{a}}\sum_s p(a,s)\log\frac{p(a,s)}{p(a)\,p(s)} - \sum_{\hat{a}}\sum_{a\in\hat{a}}\sum_s p(a,s)\log\frac{p(\hat{a},s)}{p(\hat{a})\,p(s)}$$
    $$= \sum_{\hat{a}}\sum_{a\in\hat{a}}\sum_s p(a,s)\log\frac{p(a,s)\,p(\hat{a})}{p(\hat{a},s)\,p(a)}
    = \sum_{\hat{a}}\sum_{a\in\hat{a}}\sum_s p(a,s)\log\frac{p(a,s)}{q(a,s)} = D\big(p(A,S)\,\|\,q(A,S)\big)$$
    $$I(U;L) - I(\hat{U};L) = \sum_{\hat{u}}\sum_{u\in\hat{u}}\sum_l p(u,l)\log\frac{p(u,l)}{p(u)\,p(l)} - \sum_{\hat{u}}\sum_{u\in\hat{u}}\sum_l p(u,l)\log\frac{p(\hat{u},l)}{p(\hat{u})\,p(l)}$$
    $$= \sum_{\hat{u}}\sum_{u\in\hat{u}}\sum_l p(u,l)\log\frac{p(u,l)\,p(\hat{u})}{p(\hat{u},l)\,p(u)}
    = \sum_{\hat{u}}\sum_{u\in\hat{u}}\sum_l p(u,l)\log\frac{p(u,l)}{q(u,l)} = D\big(p(U,L)\,\|\,q(U,L)\big)$$

- 12. Problem formulation (cont.)
  - Lemma 2. An alternative approach iteratively reduces the K-L divergence values:
    $$D\big(p(A,U)\,\|\,q(A,U)\big) = \sum_{\hat{a}}\sum_{a\in\hat{a}} p(a)\,D\big(p(U|a)\,\|\,q(U|\hat{a})\big) = \sum_{\hat{u}}\sum_{u\in\hat{u}} p(u)\,D\big(p(A|u)\,\|\,q(A|\hat{u})\big)$$
    $$D\big(p(U,L)\,\|\,q(U,L)\big) = \sum_{\hat{u}}\sum_{u\in\hat{u}} p(u)\,D\big(p(L|u)\,\|\,q(L|\hat{u})\big)$$
    $$D\big(p(A,S)\,\|\,q(A,S)\big) = \sum_{\hat{a}}\sum_{a\in\hat{a}} p(a)\,D\big(p(S|a)\,\|\,q(S|\hat{a})\big)$$
  - Theorem 1. The CCAM algorithm monotonically decreases the objective function:
    $$f^{(t)}(A,U) \ge f^{(t+1)}(A,U)$$
  - where t is the iteration number.

- 13. Co-clustering algorithm (algorithm flow figure)

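The algorithm slides are figures, but the core row-update step of an ITCC-style alternation (which CCAM extends with the λ- and φ-weighted augmented terms) can be sketched: each ad row moves to the cluster whose conditional q(U|â) is nearest in KL divergence to its own p(U|a). This is our reconstruction of the plain p(A, U) step only, with illustrative names, not the paper's code:

```python
import math

def kl(p, q):
    """D(p || q) for discrete distributions; inf if q is 0 where p is not."""
    d = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:
            if qi == 0:
                return math.inf
            d += pi * math.log(pi / qi)
    return d

def reassign_rows(p, C_A, C_U, k):
    """One ITCC-style row step over the joint table p (m x n).
    Assumes every cluster is non-empty under the current assignment."""
    m, n = len(p), len(p[0])
    p_a = [sum(row) for row in p]
    p_u = [sum(p[i][j] for i in range(m)) for j in range(n)]
    l = max(C_U) + 1
    p_ahat = [0.0] * k
    p_uhat = [0.0] * l
    p_hat = [[0.0] * l for _ in range(k)]   # p(ahat, uhat)
    for i in range(m):
        p_ahat[C_A[i]] += p_a[i]
    for j in range(n):
        p_uhat[C_U[j]] += p_u[j]
    for i in range(m):
        for j in range(n):
            p_hat[C_A[i]][C_U[j]] += p[i][j]
    # q(u | ahat) = p(uhat | ahat) * p(u | uhat)
    q_cond = [[(p_hat[c][C_U[j]] / p_ahat[c]) * (p_u[j] / p_uhat[C_U[j]])
               for j in range(n)] for c in range(k)]
    new_C_A = []
    for i in range(m):
        row_cond = [p[i][j] / p_a[i] for j in range(n)]   # p(U | a_i)
        new_C_A.append(min(range(k), key=lambda c: kl(row_cond, q_cond[c])))
    return new_C_A
```

A symmetric column step reassigns user groups; alternating the two steps (plus the analogous steps for the augmented matrices) is what drives the monotone decrease in Theorem 1.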
- 14.-15. (algorithm figures, cont.)

- 16. Experiment results and evaluation
  - The difficulty of clustering research is performance evaluation, because there is no standard target.
  - Therefore, we present two evaluation methods, based on class prediction and group variance:
    - classification-based evaluation
    - mutual-information-based evaluation
  - We retrieved data from 2009/09/01 to 2010/03/31, containing 530 ads and 9,865 users.
  - For Lohas, only 2,124 users have values (have filled in the Lohas questionnaire); the others are filled with zeros.

- 17. Classification-based evaluation
  - Clustering evaluation is usually done via classification; since we have no target labels, we generate labels as follows.
  - Target (initial cluster) generation: the target is based on K-means clustering applied to the following data:
    - Ad matrix (Ad): p(A, S) + p(A, U)
    - User matrix (User): p(U, L) + p(U, A)
  - Parameter settings:
    - K-means iterations: 1000
    - cluster number K is set from 2 to 5
  - Output: ad clusters C_A^(0) and user groups C_U^(0)

- 18. Classification-based evaluation (cont.)
  - Co-clustering features (ITCC and CCAM):
    - User-ad-cluster matrix: summation over the ads a_i belonging to ad cluster â_k:
      $$U\hat{A}_{jk} = \ln \sum_{a_i \in \hat{a}_k} UA_{ji}$$
    - Ad-user-group matrix: summation over the users u_j belonging to user group û_l:
      $$A\hat{U}_{il} = \ln \sum_{u_j \in \hat{u}_l} AU_{ij}$$
  - After generating targets and co-clustering features, we apply a decision tree to classify the co-clustering result and use the F-measure as the evaluation metric.
  - Test data with co-clustering features:
    - Ad + AÛ
    - User + UÂ

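Collapsing the columns of a count matrix into per-cluster features, as in the formulas above, can be sketched as follows. We use ln(1 + x) rather than the slide's plain ln so that zero sums stay finite; that substitution is our own choice, not the paper's:

```python
import math

def cluster_features(M, col_cluster, k):
    """Collapse an (n_rows x n_cols) count matrix M into an (n_rows x k)
    feature matrix by summing columns within each cluster, then applying
    log1p (our stand-in for the slide's ln transform)."""
    n_rows = len(M)
    sums = [[0.0] * k for _ in range(n_rows)]
    for j, c in enumerate(col_cluster):   # c: cluster id of column j
        for i in range(n_rows):
            sums[i][c] += M[i][j]
    return [[math.log1p(v) for v in row] for row in sums]

# Two column clusters {0,1} and {2,3}: the row [1, 2, 3, 4] collapses
# to sums [3, 7] before the log transform.
print(cluster_features([[1, 2, 3, 4]], [0, 0, 1, 1], 2))
```

These collapsed columns are what get appended to the Ad and User matrices before training the decision tree.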
- 19. Ad cluster evaluation (F-measure charts per K; parameter labels λ = 0.6, 0.2, 0.8, 0.6, each with φ = 1.0)

- 20. User group evaluation (F-measure charts per K; parameter labels λ = 0.6, 0.2, 0.8, 0.6, each with φ = 1.0)

- 21. Parameter tuning of CCAM
  - We fix φ = 1.0, set λ from 0.2 to 1.0, and observe the average F-measure over ads and users.
  - The optimal parameters for different K are:
    - K = 2, 4: φ = 1.0, λ = 0.6
    - K = 3: φ = 1.0, λ = 0.8
    - K = 5: φ = 1.0, λ = 0.2
  - Conversely, we fix λ = 1.0 and set φ from 0.2 to 1.0, with K from 3 to 5; nothing changes.
  - We suspect that φ controls p(U, L), but the zero entries dominate the 161x7736 matrix p(U, L).

- 22. Parameter tuning (fix φ = 1.0)

- 23. Parameter tuning (fix λ = 1.0)

- 24. Mutual-information-based evaluation
  - Mutual information exploits the nature of co-clustering by measuring the dependence between ad clusters and user groups.
  - The higher the mutual information, the better the clustering.
  - We use the following equation:
    $$I(\hat{A};\hat{U}) = \sum_{\hat{a}}\sum_{\hat{u}} p(\hat{a},\hat{u})\,\log\frac{p(\hat{a},\hat{u})}{p(\hat{a})\,p(\hat{u})}, \quad\text{where } p(\hat{a},\hat{u}) = \sum_{a\in\hat{a}}\sum_{u\in\hat{u}} p(a,u)$$

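Computing I(Â;Û) on the collapsed table can be sketched like this (representation and names are our own):

```python
import math

def cluster_mutual_information(p, C_A, C_U, k, l):
    """I(Ahat;Uhat) on the collapsed table
    p(ahat,uhat) = sum over the block of p(a,u)."""
    P = [[0.0] * l for _ in range(k)]   # collapsed k x l joint table
    for i, row in enumerate(p):
        for j, v in enumerate(row):
            P[C_A[i]][C_U[j]] += v
    p_row = [sum(r) for r in P]                             # p(ahat)
    p_col = [sum(P[i][j] for i in range(k)) for j in range(l)]  # p(uhat)
    mi = 0.0
    for i in range(k):
        for j in range(l):
            if P[i][j] > 0:
                mi += P[i][j] * math.log(P[i][j] / (p_row[i] * p_col[j]))
    return mi
```

A block-diagonal p under the matching clustering gives the largest possible I(Â;Û), while merging everything into one cluster per side drives it to zero, which is why higher values indicate a sharper co-clustering.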
- 25. Mutual-information-based evaluation (cont.) (results chart)

- 26. Monotonic decrease of the mutual information loss (convergence chart)

- 27. Conclusion
  - Co-clustering achieves the dual goals of row clustering and column clustering.
  - However, most co-clustering algorithms focus only on the correlation matrix between rows and columns.
  - Our proposed method, Co-Clustering with Augmented data Matrix (CCAM), can fully utilize the augmented data to achieve better co-clustering.
  - CCAM achieves better classification performance than ITCC and comparable performance in the mutual information evaluation.

- 28. Thank you for listening. Q & A
