- 1. Introduction Preliminaries The Bi-clustering Game Framework Reward Functions Experimental Results Conclusion Game Theoretic Framework for Heterogeneous Information Network Clustering Faris Alqadah Johns Hopkins University
- 2. Outline: 1 Introduction (Motivation); 2 Preliminaries (HINs and FCA; Game Theory); 3 The Bi-clustering Game (Party-Planners); 4 Framework (GHIN); 5 Reward Functions (Expected Satisfaction); 6 Experimental Results (Real world HINs); 7 Conclusion.
- 4. Motivation: Heterogeneous Information Networks (HINs) are pervasive in applications ranging from bioinformatics to e-commerce. HIN clustering generalizes bi-clustering to pairwise relations, as opposed to tensor spaces. There is no unified definition of a HIN-cluster and no algorithmic framework for mining them. The work also addresses shortcomings of 'pattern'-based approaches.
- 5. HINs: Objects are derived from distinct domains. The topology of the network is determined by pairwise binary relations among domains. The graph representation of a HIN is a multi-partite graph. Examples: clicking patterns, social networks, gene networks from different experiments.
- 6. Related Work: Three major categories of work. Multi-way clustering [5, 4, 1, 2]: directly extends bi-clustering or co-clustering; mostly hard clusters. Information-network [10, 11]: combines ranking and clustering using probability generating models; limited by network topology; hard clustering. Pattern-based [3, 12, 7]: Formal Concept Analysis; overlapping clustering; too many clusters; parameter settings.
- 7. Key Idea: For a single-edge HIN, there is a trade-off between the number of nodes in the two bipartite sets.
- 15. Key Idea: In a multiple-edge HIN, clustering influences compete. An 'ideal' HIN-cluster should be an equilibrium point among all competing clustering influences. Nash equilibrium: no player can do any better, assuming every other player retains the same strategy.
- 17. Notation: A context $K_{ij} = (G_i, G_j, I_{ij})$ consists of two sets and a relation between them. A HIN is a graph $\mathcal{G}^n = (V, E)$ where $V$ is a set of domains $\{G_1, \ldots, G_n\}$ and $(G_i, G_j) \in E$ iff $\exists K_{ij}$.
- 19. Concepts (maximal bicliques): Common neighbors: $\psi^j(A_i) = \{g_j \in G_j \mid g_j \, I_{ij} \, g_i \ \forall g_i \in A_i\}$ if $(G_i, G_j) \in E$, and $\psi^j(A_i) = \emptyset$ otherwise. Concept or maximal bi-clique: a pair $(A_i, A_j)$ such that $\psi^j(A_i) = A_j$ and $\psi^i(A_j) = A_i$.
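To make the common-neighbor operator and the concept condition concrete, here is a minimal Python sketch for a single context $K_{12}$. The toy relation and the helper names `psi`, `transpose`, `is_concept`, and `all_concepts` are illustrative assumptions, not part of the talk, and the brute-force enumeration is only practical for tiny contexts.

```python
from itertools import combinations

# Hypothetical toy context K12 = (G1, G2, I12): each object of G1 maps to its neighbors in G2.
G1 = {"g1", "g2", "g3"}
G2 = {"m1", "m2", "m3"}
I12 = {"g1": {"m1", "m2", "m3"}, "g2": {"m1"}, "g3": {"m1", "m2"}}


def psi(objects, relation, codomain):
    """Common neighbors: members of `codomain` related to every object in `objects`."""
    common = set(codomain)
    for obj in objects:
        common &= relation.get(obj, set())
    return common


def transpose(relation):
    """Flip the relation so it can be read from the other domain."""
    flipped = {}
    for src, targets in relation.items():
        for tgt in targets:
            flipped.setdefault(tgt, set()).add(src)
    return flipped


I21 = transpose(I12)


def is_concept(a1, a2):
    """(a1, a2) is a concept when each side is exactly the common neighbors of the other."""
    return psi(a1, I12, G2) == set(a2) and psi(a2, I21, G1) == set(a1)


def all_concepts():
    """Brute force: test every non-empty subset of G1 (exponential, toy sizes only)."""
    found = []
    for size in range(1, len(G1) + 1):
        for subset in combinations(sorted(G1), size):
            a2 = psi(subset, I12, G2)
            if a2 and is_concept(subset, a2):
                found.append((set(subset), a2))
    return found
```

On this toy context, `all_concepts()` yields three maximal bi-cliques: `{g1}` with all three `m` objects, `{g1, g3}` with `{m1, m2}`, and all of `G1` with `{m1}`.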
- 21. FCA-based approaches: Generalize the notion of a concept (several definitions exist) and enumerate all such concepts. Parameter settings are not always intuitive. A simple change in definition requires a substantially different algorithm design. For a suitably defined game, Nash equilibrium points capture maximal bi-cliques.
- 23. Normal form game: A finite, $n$-player, normal form game $G$ is a triple $\langle N, (M_i), (r_i) \rangle$ where $N = \{1, \ldots, n\}$ is the set of players; $M_i = \{m_{i1}, \ldots, m_{il_i}\}$ is the set of moves available to player $i$, with $l_i$ the number of available moves for that player; and $r_i : M_1 \times \cdots \times M_n \to \mathbb{R}$ is the reward function for player $i$, which maps a profile of moves to a value. Each player $i$ selects a strategy from the set of all available strategies, $P_i = \{p_i : M_i \to [0, 1]\}$.
- 25. Nash equilibrium and example: A Nash equilibrium is a strategy profile in which no player has an incentive to unilaterally deviate [8, 6]: $\forall i \in N,\ p_i \in P_i:\ r_i(p_1^*, \ldots, p_{i-1}^*, p_i, \ldots, p_n^*) \le r_i(p_1^*, \ldots, p_n^*)$. Example payoffs, written as (player 1, player 2):

  | | Player 2 chooses 0 | Player 2 chooses 1 | Player 2 chooses 2 |
  |---|---|---|---|
  | Player 1 chooses 0 | (0,0) | (1,0) | (2,-2) |
  | Player 1 chooses 1 | (0,1) | (1,1) | (3,-2) |
  | Player 1 chooses 2 | (-2,2) | (-2,3) | (2,2) |
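As a worked check of the definition, the sketch below searches the 3×3 example for Nash equilibria by testing, for every cell, whether either player could gain by deviating alone. It is restricted to pure strategies (each player picks one move with probability 1), which is a simplifying assumption; the function name `pure_nash_equilibria` is illustrative.

```python
# Payoffs from the example table: PAYOFF[a][b] = (reward to player 1, reward to player 2)
# when player 1 chooses move a and player 2 chooses move b.
PAYOFF = [
    [(0, 0), (1, 0), (2, -2)],
    [(0, 1), (1, 1), (3, -2)],
    [(-2, 2), (-2, 3), (2, 2)],
]


def pure_nash_equilibria(payoff):
    """Return all (a, b) cells where neither player gains by a unilateral deviation."""
    rows, cols = len(payoff), len(payoff[0])
    equilibria = []
    for a in range(rows):
        for b in range(cols):
            r1, r2 = payoff[a][b]
            # Player 1 deviates within column b; player 2 deviates within row a.
            best_for_1 = all(payoff[alt][b][0] <= r1 for alt in range(rows))
            best_for_2 = all(payoff[a][alt][1] <= r2 for alt in range(cols))
            if best_for_1 and best_for_2:
                equilibria.append((a, b))
    return equilibria
```

For this table the search returns the four cells where both players choose 0 or 1. The cell (2, 2) pays both players more than any of them, yet it is not an equilibrium, because player 1 can raise the reward from 2 to 3 by switching to move 1.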
- 28. Party planner game: Two party planners $P_1$ and $P_2$ plan a party by inviting guests from disjoint sets of clients $G_1$ and $G_2$. Party planners receive compensation based on the overall satisfaction of clients. Client satisfaction is a function of positive and negative interactions at the party. $P_1$ and $P_2$ do not cooperate, but each is privy to the other's guest list at any point. Both wish to maximize compensation.
- 29. Satisfaction Reward Function: Let $(A_1, A_2)$ be a party. Define the satisfaction of $g_1 \in A_1$ attending party $(A_1, A_2)$ as

  $$\mathrm{sat}_1(g_1, A_2) = \frac{|\psi^2(g_1) \cap A_2| - w \cdot |A_2 \setminus \psi^2(g_1)|}{|A_2|} \tag{1}$$

  The overall reward to party planner $i$ is

  $$r_i^{sat}(A_i, A_j) = \sum_{g_i \in A_i} \mathrm{sat}_i(g_i, A_j) \tag{2}$$
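The two formulas can be sketched in a few lines of Python. This reads the negative-interaction term as the invited guests on the other side that $g_1$ is not related to, $|A_2 \setminus \psi^2(g_1)|$. The relation `I12` and the weight `w = 5` are assumptions chosen so that the values agree with the example payoff table in the talk; the function names are illustrative.

```python
from fractions import Fraction

# Assumed toy relation: neighbors in G2 of each object in G1.
I12 = {"G1": {"M1", "M2", "M3"}, "G2": {"M1"}, "G3": {"M1", "M2"}}


def sat(neighbors, party, w):
    """Equation (1): satisfaction of one guest given the other planner's guest list.

    `w` must be an int or Fraction; `party` must be non-empty.
    """
    hits = len(neighbors & party)      # positive interactions
    misses = len(party - neighbors)    # negative interactions
    return Fraction(hits - w * misses, len(party))


def reward_sat(own_party, other_party, relation, w):
    """Equation (2): a planner's reward is the summed satisfaction of the planner's guests."""
    return sum((sat(relation[g], other_party, w) for g in own_party), Fraction(0))
```

For example, with `w = 5` the guest `G2` at a party with `{M1, M2}` has satisfaction (1 − 5·1)/2 = −2, and planner 1's reward for inviting `{G1, G2}` is 1 + (−2) = −1.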
- 30. Concepts as Nash equilibrium points: payoff table, rows are guest lists from $G_1$, columns are guest lists from $G_2$.

  | | M1 | M1, M2 | M1, M2, M3 | M1, M3 | M2 | M2, M3 | M3 |
  |---|---|---|---|---|---|---|---|
  | G1 | (1,1) | (1,2) | (1,3) | (1,2) | (1,1) | (1,2) | (1,1) |
  | G1, G2 | (2,1) | (-1,-1) | (-2,-3) | (-1,-1) | (-4,-2) | (-4,-4) | (-4,-2) |
  | G1, G2, G3 | (3,1) | (0,0) | (-3,-3) | (-3,-2) | (-3,-1) | (-6,-4) | (-9,-3) |
  | G1, G3 | (2,1) | (2,2) | (0,0) | (-1,-1) | (2,1) | (-1,-1) | (-4,-2) |
  | G2 | (1,1) | (-2,-4) | (-3,-9) | (-2,-4) | (-5,-5) | (-5,-10) | (-5,-5) |
  | G2, G3 | (2,1) | (-1,-1) | (-4,-6) | (-4,-4) | (-4,-2) | (-7,-7) | (-10,-5) |
  | G3 | (1,1) | (1,2) | (-1,-3) | (-2,-4) | (1,1) | (-2,-4) | (-5,-5) |
- 31. Concepts as Nash equilibrium points. Theorem: For any instance of the bi-clustering game $G_{bicluster}$ in which $r_i^{sat}$ is the selected reward function, there exists $w^*$ such that $\forall w \ge w^*$, if $(A_1^*, A_2^*)$ is a concept of $K = (G_1, G_2, I_{12})$ then $(A_1^*, A_2^*)$ is a Nash equilibrium point of $G_{bicluster}$.
- 33. HIN-clustering game: Extend the bi-clustering game to $n$ party planners and $n$ sets of guests. Guest interactions are determined by the network topology. Mining HIN-clusters is equivalent to finding Nash equilibrium points of the HIN-clustering game. Finding a Nash equilibrium is non-trivial [9]. Adapt a simple strategy and a key heuristic to enumerate the Nash equilibrium points.
- 37. Strategy and heuristics: payoff table with marked components (`**`).

  | | M1 | M1, M2 | M1, M2, M3 | M1, M3 | M2 | M2, M3 | M3 |
  |---|---|---|---|---|---|---|---|
  | G1 | (1,1) | (1,2) | (1**,3**) | (1**,2) | (1,1) | (1**,2) | (1**,1) |
  | G1, G2 | (2,1**) | (-1,-1) | (-2,-3) | (-1,-1) | (-4,-2) | (-4,-4) | (-4,-2) |
  | G1, G2, G3 | (3**,1**) | (0,0) | (-3,-3) | (-3,-2) | (-3,-1) | (-6,-4) | (-9,-3) |
  | G1, G3 | (2,1) | (2**,2**) | (0,0) | (-1,-1) | (2**,1) | (-1,-1) | (-4,-2) |
  | G2 | (1,1**) | (-2,-4) | (-3,-9) | (-2,-4) | (-5,-5) | (-5,-10) | (-5,-5) |
  | G2, G3 | (2,1**) | (-1,-1) | (-4,-6) | (-4,-4) | (-4,-2) | (-7,-7) | (-10,-5) |
  | G3 | (1,1) | (1,2**) | (-1,-3) | (-2,-4) | (1,1) | (-2,-4) | (-5,-5) |

  1. Mark all second components that are maximal in each row.
  2. Mark all first components that are maximal in each column.
  3. Any cell that has both components marked is a Nash equilibrium.

  Heuristic: Every Nash equilibrium point is a superset of an n-concept.
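The three marking steps can be run mechanically. The sketch below rebuilds the payoff table from the satisfaction reward and then applies the row and column marking. The relation and the weight `W = 5` are not stated on the slide; they are assumptions inferred so that the computed payoffs match the table, and the function names are illustrative.

```python
from fractions import Fraction
from itertools import combinations

# Assumed relation (and its transpose) and weight, inferred from the table values.
I12 = {"G1": {"M1", "M2", "M3"}, "G2": {"M1"}, "G3": {"M1", "M2"}}
I21 = {"M1": {"G1", "G2", "G3"}, "M2": {"G1", "G3"}, "M3": {"G1"}}
W = 5


def nonempty_subsets(items):
    """All non-empty subsets: the moves available to one party planner."""
    items = sorted(items)
    return [frozenset(c) for r in range(1, len(items) + 1) for c in combinations(items, r)]


def reward(own, other, relation):
    """Satisfaction reward of the planner who invited `own`, given the other guest list."""
    total = Fraction(0)
    for guest in own:
        nbrs = relation[guest]
        total += Fraction(len(nbrs & other) - W * len(other - nbrs), len(other))
    return total


def marked_cells():
    rows = nonempty_subsets(I12)  # moves of planner 1
    cols = nonempty_subsets(I21)  # moves of planner 2
    table = {(a1, a2): (reward(a1, a2, I12), reward(a2, a1, I21))
             for a1 in rows for a2 in cols}
    # Step 1: mark second components that are maximal in their row.
    second = {(a1, a2) for a1 in rows for a2 in cols
              if table[a1, a2][1] == max(table[a1, c][1] for c in cols)}
    # Step 2: mark first components that are maximal in their column.
    first = {(a1, a2) for a1 in rows for a2 in cols
             if table[a1, a2][0] == max(table[r, a2][0] for r in rows)}
    # Step 3: cells with both components marked are pure Nash equilibria.
    return first & second
```

Under these assumptions, `marked_cells()` returns exactly the three doubly marked cells of the table, and each of them is a maximal bi-clique of the assumed relation.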
- 38. GHIN framework: Even with the heuristic, exponential run time is still possible. GHIN sacrifices completeness but guarantees correctness: it attempts to form a Nash equilibrium point with each object in the HIN.
- 39. GHIN framework:
  1. For each object $g_i$ in the seed set, attempt to form a maximally large $n$-partite clique in the HIN.
  2. Add objects from all domains to the clique while the reward increases.
  3. Remove objects not in the original clique from all domains while the reward increases.
  4. If steps 2 and 3 produce no change, a Nash equilibrium has been found; otherwise repeat steps 2 and 3.
  5. Update the seed set by removing all objects in the cluster.
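The five steps above can be sketched as a greedy local search. The version below is a loose illustration, not the GHIN implementation: it handles only two domains, takes seeds from the first domain only, uses the satisfaction reward, assumes each planner judges a move by that planner's own reward, and considers single-object additions and removals. All function and parameter names, including the `max_rounds` cap, are hypothetical.

```python
from fractions import Fraction


def reward(own, other, nbrs, w):
    """Satisfaction reward for the planner holding `own`; `nbrs` maps own-side objects to neighbors."""
    if not other:
        return Fraction(0)
    return sum((Fraction(len(nbrs[g] & other) - w * len(other - nbrs[g]), len(other))
                for g in own), Fraction(0))


def common(objs, nbrs, universe):
    """Objects of `universe` related to every member of `objs`."""
    out = set(universe)
    for obj in objs:
        out &= nbrs[obj]
    return out


def grow(seed, nbrs1, nbrs2, w, max_rounds=100):
    # Step 1: the largest bi-clique containing the seed and all of its neighbors.
    a2 = set(nbrs1[seed])
    a1 = common(a2, nbrs2, nbrs1.keys())
    core1, core2 = set(a1), set(a2)
    for _ in range(max_rounds):
        before = (set(a1), set(a2))
        for own, other, nbrs, core in ((a1, a2, nbrs1, core1), (a2, a1, nbrs2, core2)):
            for obj in sorted(set(nbrs) - own):      # Step 2: add while the reward increases
                if reward(own | {obj}, other, nbrs, w) > reward(own, other, nbrs, w):
                    own.add(obj)
            for obj in sorted(own - core):           # Step 3: remove non-core objects
                if reward(own - {obj}, other, nbrs, w) > reward(own, other, nbrs, w):
                    own.discard(obj)
        if (a1, a2) == before:                       # Step 4: no change, stop
            break
    return a1, a2


def ghin_two_domains(nbrs1, nbrs2, w):
    seeds, clusters = sorted(nbrs1), []
    while seeds:
        a1, a2 = grow(seeds[0], nbrs1, nbrs2, w)
        clusters.append((a1, a2))
        seeds = [s for s in seeds[1:] if s not in a1]  # Step 5: drop clustered seeds
    return clusters
```

Because only single-object moves are tried, a stable result is a local equilibrium rather than a verified Nash point, and if `max_rounds` is reached the result may not be stable at all. Removing clustered objects from the seed set means some equilibria are never visited, which is the completeness trade-off the talk mentions.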
- 41. Shortcomings of satisfaction reward function: The satisfaction reward function is simple, intuitive, and efficient. However, if the matrices in the HIN have significantly different density levels, bias occurs. Use expected satisfaction instead.
- 42. Expected satisfaction: Assume all objects are independent. For a given party $(A_1, \ldots, A_n)$, the number of interactions is the number of successes in $|A_j|$ draws from a finite population of $|G_j|$ objects, so the number of successes is a hypergeometrically distributed random variable.
- 43. Expected satisfaction:

  $$\mathrm{exp}_{ij}(g_i, A_j) = \frac{|A_j| \cdot |\psi^j(g_i)|}{|G_j|}$$

  $$\mathrm{var}_{ij}(g_i, A_j) = \frac{|A_j| \cdot |\psi^j(g_i)| \cdot (|G_j| - |A_j|) \cdot (|G_j| - |\psi^j(g_i)|)}{|G_j|^2 \cdot (|G_j| - 1)}$$

  $$\mathrm{esat}(g_i, A_j) = \frac{|\psi^j(g_i) \cap A_j| - \mathrm{exp}_{ij}(g_i, A_j)}{\mathrm{var}_{ij}(g_i, A_j)} - w$$

  $$\mathrm{esat}(g_i, A_{-i}) = \sum_{A_j \subseteq G_j,\ (G_i, G_j) \in E} \mathrm{esat}(g_i, A_j)$$

  $$r_i^{esat}(A_i, A_{-i}) = \sum_{g_i \in A_i} \mathrm{esat}(g_i, A_{-i})$$
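A small numeric sketch of these quantities, using the standard hypergeometric mean and variance with population $|G_j|$, $|A_j|$ draws, and $|\psi^j(g_i)|$ possible successes. The form of `esat` below, (observed − expected) / variance − w, is one reading of the slide's layout and should be treated as an assumption; the function and parameter names are illustrative.

```python
from fractions import Fraction


def hypergeom_mean(pop, draws, successes):
    """Expected number of neighbors among `draws` invited objects."""
    return Fraction(draws * successes, pop)


def hypergeom_var(pop, draws, successes):
    """Variance of that count; requires pop > 1."""
    return Fraction(draws * successes * (pop - draws) * (pop - successes),
                    pop * pop * (pop - 1))


def esat(overlap, pop, draws, successes, w):
    """Expected satisfaction of one object with respect to one neighboring domain.

    overlap   = |psi_j(g_i) ∩ A_j|, the observed number of interactions
    pop       = |G_j|, draws = |A_j|, successes = |psi_j(g_i)|
    """
    var = hypergeom_var(pop, draws, successes)
    if var == 0:
        # Happens when A_j is all of G_j or when g_i has no or all possible neighbors.
        raise ValueError("variance is zero; esat is undefined for this party")
    return (overlap - hypergeom_mean(pop, draws, successes)) / var - w
```

For instance, with $|G_j| = 10$, $|A_j| = 4$, and $|\psi^j(g_i)| = 5$, two interactions are expected with variance 2/3, so observing four interactions gives (4 − 2) / (2/3) − w = 3 − w.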
- 45. Tiring party goers: Incorporate a 'tiring' factor to avoid too much overlap. Let $c(g_i)$ denote the number of clusters $g_i$ has appeared in up to the current time-step, then let $t = f(c(g_i))$ where $f : \mathbb{N} \to (0, 1]$ and $f$ is anti-monotonic. For example: $f(x) = \frac{1}{x^2}$ or $f(x) = \frac{1}{e^x}$.
- 47. HINs and evaluation:

  | HIN name | Description | Num domains | Num classes | Total num objects |
  |---|---|---|---|---|
  | MER | Newsgroup, Middle East politics and Religion | 3 | 2 | 24,783 |
  | REC | Newsgroup, recreation | 3 | 2 | 26,225 |
  | SCI | Newsgroup, science | 3 | 5 | 37,413 |
  | PC | Newsgroup, pc and software | 3 | 5 | 35,186 |
  | PCR | Newsgroup, politics and Christianity | 3 | 2 | 24,485 |
  | FOUR_AREAS | DBLP subset of database, data mining, AI, and IR papers | 4 | 4 | 70,517 |

  Extrinsic evaluation, $B^3$ recall and precision:

  $$\mathrm{Prec}(g, g') = \frac{\min(|C(g) \cap C(g')|,\ |L(g) \cap L(g')|)}{|C(g) \cap C(g')|}$$

  $$\mathrm{Rcl}(g, g') = \frac{\min(|C(g) \cap C(g')|,\ |L(g) \cap L(g')|)}{|L(g) \cap L(g')|}$$

  $$B^3\,\mathrm{Prec} = \mathrm{Avg}_g\big[\mathrm{Avg}_{g' :\, C(g) \cap C(g') \ne \emptyset}[\mathrm{Prec}(g, g')]\big]$$

  $$B^3\,\mathrm{Rcl} = \mathrm{Avg}_g\big[\mathrm{Avg}_{g' :\, L(g) \cap L(g') \ne \emptyset}[\mathrm{Rcl}(g, g')]\big]$$
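The $B^3$ measures for overlapping clusterings can be computed directly from the pairwise definitions. The sketch below assumes $C(g)$ is the set of clusters containing object $g$ and $L(g)$ is its set of class labels, which the slide does not spell out; objects that appear in no cluster (or carry no label) are skipped in the corresponding average. The function names are illustrative.

```python
from fractions import Fraction


def _average(values):
    values = list(values)
    return sum(values, Fraction(0)) / len(values)


def bcubed(clusters, labels):
    """Return (B3 precision, B3 recall).

    clusters: dict mapping each object to the set of cluster ids it belongs to
    labels:   dict mapping each object to its set of class labels
    Assumes at least one object is clustered and at least one is labeled.
    """
    items = sorted(clusters)

    def shared(assignment, g, h):
        return len(assignment[g] & assignment[h])

    def one_sided(assignment):
        # `assignment` supplies the denominator: clusters for precision, labels for recall.
        per_item = []
        for g in items:
            scores = [Fraction(min(shared(clusters, g, h), shared(labels, g, h)),
                               shared(assignment, g, h))
                      for h in items if shared(assignment, g, h) > 0]
            if scores:
                per_item.append(_average(scores))
        return _average(per_item)

    return one_sided(clusters), one_sided(labels)
```

As a small check, take three objects of the same class where two share a cluster and the third sits alone: every cluster is pure, so precision is 1, while recall drops to 5/9 because the class is split.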
- 50. Results:

  | HIN | Algorithm | F1 | F0.5 | F2 |
  |---|---|---|---|---|
  | MER | GHIN expsat | 0.627051 | 0.736396 | 0.622735 |
  | MER | GHIN sat | 0.553790 | 0.649559 | 0.569664 |
  | MER | NetClus | 0.3759 | 0.4512 | 0.322 |
  | MER | MDC | 0.3661 | 0.4533 | 0.3070 |
  | REC | GHIN expsat | 0.544189 | 0.633362 | 0.508778 |
  | REC | GHIN sat | 0.434367 | 0.485025 | 0.451840 |
  | REC | NetClus | 0.2784 | 0.2870 | 0.2704 |
  | REC | MDC | 0.2845 | 0.2953 | 0.2746 |
  | SCI | GHIN expsat | 0.484068 | 0.589704 | 0.530239 |
  | SCI | GHIN sat | 0.402306 | 0.481798 | 0.462886 |
  | SCI | NetClus | 0.2609 | 0.2583 | 0.2635 |
  | SCI | MDC | 0.2532 | 0.2529 | 0.2535 |
  | PC | GHIN expsat | 0.334827 | 0.520472 | 0.302943 |
  | PC | GHIN sat | 0.306503 | 0.432229 | 0.345382 |
  | PC | NetClus | 0.2254 | 0.2068 | 0.2477 |
  | PC | MDC | 0.2282 | 0.2116 | 0.2476 |
  | PCR | GHIN expsat | 0.640894 | 0.793399 | 0.508778 |
  | PCR | GHIN sat | 0.541986 | 0.574588 | 0.530971 |
  | PCR | NetClus | 0.3642 | 0.4396 | 0.3109 |
  | PCR | MDC | 0.3440 | 0.4268 | 0.2810 |
  | FOUR_AREAS | GHIN expsat | 0.623117 | 0.598877 | 0.650079 |
  | FOUR_AREAS | GHIN sat | 0.5315 | 0.506687 | 0.5588 |
  | FOUR_AREAS | NetClus | 0.3612 | 0.36655 | 0.3560 |
  | FOUR_AREAS | MDC | 0.5085 | 0.5162 | 0.5010 |
- 51. Class distributions in clusters:

  | Algorithm | Class | C1 | C2 | C3 | C4 |
  |---|---|---|---|---|---|
  | GHIN expsat | DB | 0.0601266 | 0.93633 | 0.0133188 | 0.0512748 |
  | GHIN expsat | DM | 0.028481 | 0.0363608 | 0.0106007 | 0.850142 |
  | GHIN expsat | IR | 0.882911 | 0.0204432 | 0.133188 | 0.0339943 |
  | GHIN expsat | AI | 0.028481 | 0.00686642 | 0.842892 | 0.0645892 |
  | NetClus | DB | 0.0553833 | 0.450802 | 0.500074 | 0.0955971 |
  | NetClus | DM | 0.163934 | 0.15815 | 0.128535 | 0.304584 |
  | NetClus | IR | 0.179553 | 0.0512035 | 0.242707 | 0.112786 |
  | NetClus | AI | 0.60113 | 0.339844 | 0.128684 | 0.487033 |
  | MDC | DB | 0.186681 | 0.232455 | 0.803727 | 0.000000 |
  | MDC | DM | 0.261844 | 0.000000 | 0.128592 | 0.161790 |
  | MDC | IR | 0.003183 | 0.278748 | 0.000000 | 0.75888 |
  | MDC | AI | 0.548292 | 0.488797 | 0.067680 | 0.079323 |
- 52. Sample Clusters:

  | Terms | Authors | Conferences |
  |---|---|---|
  | data | Surajit Chaudhuri | VLDB |
  | database | Divesh Srivastava | SIGMOD |
  | queries | H. V. Jagadish | ICDE |
  | databases | Jeffrey F. Naughton | PODS |
  | querys | Michael J. Carey | EDBT |
  | xml | Raghu Ramakrishnan | |

  | Terms | Authors | Conferences |
  |---|---|---|
  | mining | Jiawei Han | KDD |
  | learning | Christos Faloutsos | PAKDD |
  | data | Wei Wang | ICDM |
  | frequent | Heikki Mannila | SDM |
  | association | Srinivasan Parthasarathy | PKDD |
  | patterns | Ke Wang | ICML |
- 53. Applying GHIN to EMAP data: E-MAP (epistatic miniarray profiles) data relates query and target genes. The genetic interaction score indicates whether a strain is healthier or sicker than expected (positive or negative). A negative network is derived by using scores ≤ −2.5. Find Nash points and use functional enrichment: do we find small functional classes?
- 54. Applying GHIN to EMAP data: [Figure: two panels, "Functional enrichment by large classes (31−500)" and "Functional enrichment by small classes". x-axis: P-value threshold; y-axis: fraction of patterns enriched; legend entries: Exp sat tiring, Sat.]
- 55. Clusters exclusively annotated by small functional classes: YBR078W ECM33, YIL034C CAP2, YIL159W BNR1, YKL007W CAP1, YMR054W STV1, YMR058W FET3, YMR089C YTA12, YFL031W HAC1, YHR079C IRE1, YJL095W BCK1, YCL048W SPS22, YIL073C SPO22, YJL155C FBP26, YLR267W BOP2.
- 56. Parameter study: Effect of $w$ on extrinsic clustering quality. [Figure: three panels plotting F0.5 score, F1 score, and F2 score against $w$ (0 to 12), one curve each for mer, rec, pcr, pc, sci, and four.]
- 57. Parameter study: Effect of $w$ on algorithm operation. [Figure: three panels plotting average number of iterations to find Nash, total number of iterations, and number of clusters against $w$ (0 to 12), one curve each for mer, rec, pcr, pc, sci, and four.]
- 58. Conclusion: Novel framework for defining and enumerating HIN-clusters. First (as far as I know) connection between information network clustering and game theory. Initial experimental results show promise.
- 59. Ongoing and future work: Development of reward functions (information theoretic, spectral?). Clustering in biological data: do we find smaller functional classes compared to other bi-clustering methods? Extension of the framework to weighted HINs. More algorithmic development. Compare algorithms with an actual Nash solver.
- 60. References:
  - [1] A. Banerjee, S. Basu, and S. Merugu. Multi-way clustering on relation graphs. In Proceedings of the SIAM International Conference on Data Mining, 2007.
  - [2] R. Bekkerman, R. El-Yaniv, and A. McCallum. Multi-way distributional clustering via pairwise interactions. In ICML '05: Proceedings of the 22nd International Conference on Machine Learning, pages 41–48, New York, NY, USA, 2005. ACM.
  - [3] J. Li, G. Liu, H. Li, and L. Wong. Maximal biclique subgraphs and closed pattern pairs of the adjacency matrix: A one-to-one correspondence and mining algorithms. IEEE Trans. Knowl. Data Eng., 19(12):1625–1637, 2007.
  - [4] B. Long, X. Wu, Z. M. Zhang, and P. S. Yu. Unsupervised learning on k-partite graphs. In KDD '06: Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 317–326, New York, NY, USA, 2006. ACM.
  - [5] B. Long, Z. M. Zhang, X. Wu, and P. S. Yu. Spectral clustering for multi-type relational data. In ICML '06: Proceedings of the 23rd International Conference on Machine Learning, pages 585–592, New York, NY, USA, 2006. ACM.
  - [6] E. Mendelson. Introducing Game Theory and Its Applications. Chapman & Hall / CRC, 2004.
  - [7] M. J. Zaki, M. Peters, I. Assent, and T. Seidl. Clicks: An effective algorithm for mining subspace clusters in categorical datasets. Data and Knowledge Engineering, special issue on Intelligent Data Mining, 60(2):51–70, 2007.
  - [8] G. Owen. Game Theory. Academic Press, 1995.
  - [9] R. Porter, E. Nudelman, and Y. Shoham. Simple search methods for finding a Nash equilibrium. In Games and Economic Behavior, pages 664–669, 2004.
  - [10] Y. Sun, J. Han, P. Zhao, Z. Yin, H. Cheng, and T. Wu. RankClus: Integrating clustering with ranking for heterogeneous information network analysis. In Proc. 2009 Int. Conf. on Extending Data Base Technology (EDBT'09), 2009.
  - [11] Y. Sun, Y. Yu, and J. Han. Ranking-based clustering of heterogeneous information networks with star network schema. In Proc. 2009 ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (KDD'09), 2009.
  - [12] A. Tanay, R. Sharan, and R. Shamir. Discovering statistically significant biclusters in gene expression data. In Proceedings of ISMB 2002, 2002.
