Co-clustering by Block Value Decomposition

Transcript

  • 1. Author / Bo Long, Zhongfei Zhang and Philip S. Yu
    Source / ACM KDD’05, August 21-24, 2005, pp. 635–640
    Presenter / Allen Wu
    Co-clustering by Block Value Decomposition
    1
  • 2. Outline
    Introduction
    Block value decomposition
    Derivation of the Algorithm
    Empirical Evaluation
    Conclusion
    2
  • 3. Introduction
    Dyadic data refer to a domain with two finite sets of objects in which observations are made for dyads.
    Co-clustering can effectively deal with high-dimensional, sparse data by clustering rows and columns simultaneously.
    In this paper, a new co-clustering framework, Block Value Decomposition (BVD), is proposed.
    3
  • 4. Introduction (cont.)
    This paper develops a specific novel co-clustering algorithm for a special yet very popular case: non-negative dyadic data.
    The algorithm performs an implicitly adaptive dimensionality reduction, which works well for typical sparse data.
    The dyadic data matrix is factorized into three components.
    The row-coefficient matrix: R
    The block value matrix: B
    The column-coefficient matrix: C
    4
  • 5. The definition of dyadic data
    5
    The notion dyadic refers to a domain with two sets of objects, X = {x1, …, xn} and Y = {y1, …, ym}.
    The data can be organized as an n by m two-dimensional matrix Z.
    Each w(x,y) corresponds to one element of Z.
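For concreteness, dyadic data such as word-document co-occurrence counts can be laid out directly as the matrix Z. A minimal sketch (the words and documents here are invented for illustration, not taken from the paper's datasets):

```python
import numpy as np

# Hypothetical dyadic data: X = words, Y = documents.
# Each entry w(x, y) is the count of word x in document y.
words = ["graph", "matrix", "game", "score"]   # X, n = 4
documents = ["doc1", "doc2", "doc3"]           # Y, m = 3

Z = np.array([
    [3, 0, 1],   # counts of "graph" in doc1..doc3
    [2, 0, 0],   # "matrix"
    [0, 4, 1],   # "game"
    [0, 2, 3],   # "score"
])  # Z is the n-by-m data matrix; Z[i, j] = w(x_i, y_j)
```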
  • 6. 6
    [Figure: the data matrix Z (n×m) is factorized as R (n×k) × B (k×l) × C (l×m), where B is the k×l block value matrix.]
  • 7. 7
    [Figure: a 4×4 example with rows x1–x4 and columns y1–y4, showing the data matrix Z reconstructed as the product R × B × C (RBC).]
  • 8. Block value decomposition definition
    8
    Non-negative block value decomposition of a non-negative data matrix Z ∈ ℝ^(n×m) (i.e., ∀ij: Zij ≥ 0) is given by the minimization of
    f(R, B, C) = ||Z − RBC||²
    subject to the constraints ∀ij: Rij ≥ 0, Bij ≥ 0, and Cij ≥ 0, where R ∈ ℝ^(n×k), B ∈ ℝ^(k×l), C ∈ ℝ^(l×m), k ≪ n, and l ≪ m.
    If R = Cᵀ, symmetric non-negative block value decomposition of a symmetric non-negative data matrix Z ∈ ℝ^(n×n) (i.e., ∀ij: Zij ≥ 0) is given by the minimization of
    f(S, B) = ||Z − SBSᵀ||²
    subject to ∀ij: Sij ≥ 0 and Bij ≥ 0, where S ∈ ℝ^(n×k), B ∈ ℝ^(k×k), and k ≪ n.
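As a quick numerical illustration of the objective (a sketch with randomly chosen non-negative factors, not the algorithm's output):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, k, l = 6, 5, 2, 3

Z = rng.random((n, m))   # non-negative data matrix
R = rng.random((n, k))   # row-coefficient matrix
B = rng.random((k, l))   # block value matrix
C = rng.random((l, m))   # column-coefficient matrix

# f(R, B, C) = ||Z - RBC||^2 (squared Frobenius norm)
f = np.linalg.norm(Z - R @ B @ C, "fro") ** 2
print(f)
```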
  • 9. Derivation of the algorithm
    9
    The objective function is convex in each of R, B, and C separately. However, it is not convex in all of them simultaneously.
    Thus, it is unrealistic to expect an algorithm to find the global minimum.
    Theorem 1. If (R, B, C) is a local minimizer of the objective function, then the equations
    (ZCᵀBᵀ) ∘ R − (RBCCᵀBᵀ) ∘ R = 0
    (RᵀZCᵀ) ∘ B − (RᵀRBCCᵀ) ∘ B = 0
    (BᵀRᵀZ) ∘ C − (BᵀRᵀRBC) ∘ C = 0
    are satisfied, where ∘ denotes the Hadamard (element-wise) product of two matrices.
  • 10. Derivation of the algorithm (cont.)
    10
    Let λ1, λ2, and λ3 be the Lagrange multipliers for the constraints R, B, C ≥ 0, respectively, where λ1 ∈ ℝ^(k×n), λ2 ∈ ℝ^(l×k), and λ3 ∈ ℝ^(m×l). The Lagrange function L(R, B, C, λ1, λ2, λ3) becomes:
    L = f(R, B, C) − tr(λ1Rᵀ) − tr(λ2Bᵀ) − tr(λ3Cᵀ)
    The Kuhn-Tucker conditions are:
    ∂L/∂R = ∂L/∂B = ∂L/∂C = 0
    λ1 ∘ R = λ2 ∘ B = λ3 ∘ C = 0
    Taking the derivatives, we obtain the following three equations, respectively:
    2ZCᵀBᵀ − 2RBCCᵀBᵀ + λ1 = 0
    2RᵀZCᵀ − 2RᵀRBCCᵀ + λ2 = 0
    2BᵀRᵀZ − 2BᵀRᵀRBC + λ3 = 0
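Eliminating the multiplier λ1 from the first derivative equation via the complementary-slackness condition λ1 ∘ R = 0 recovers the first fixed-point equation of Theorem 1; the other two factors follow analogously. Sketched in LaTeX:

```latex
% From \partial L / \partial R = 0:
%   \lambda_1 = 2\, R B C C^{\mathsf T} B^{\mathsf T} - 2\, Z C^{\mathsf T} B^{\mathsf T}
% Substituting into \lambda_1 \circ R = 0 and dividing by 2:
\bigl( Z C^{\mathsf T} B^{\mathsf T} \bigr) \circ R
  - \bigl( R B C C^{\mathsf T} B^{\mathsf T} \bigr) \circ R = 0
```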
  • 11. Derivation of the algorithm (cont.)
    11
    Based on Theorem 1, we propose the following updating rules (a sketch appears after this slide).
    If R = Cᵀ, we derive the updating rules for the symmetric case; note that symmetric NBVD provides only one clustering result.
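A minimal NumPy sketch of multiplicative updates consistent with the fixed-point conditions above; the random initialization, fixed iteration count, and eps guard against division by zero are illustrative assumptions rather than details taken from the paper:

```python
import numpy as np

def nbvd(Z, k, l, n_iters=200, eps=1e-9, seed=0):
    """Non-negative BVD: Z (n x m) ~= R (n x k) @ B (k x l) @ C (l x m)."""
    rng = np.random.default_rng(seed)
    n, m = Z.shape
    R = rng.random((n, k))
    B = rng.random((k, l))
    C = rng.random((l, m))
    for _ in range(n_iters):
        # Element-wise multiplicative updates; each factor stays non-negative.
        R *= (Z @ C.T @ B.T) / (R @ B @ C @ C.T @ B.T + eps)
        B *= (R.T @ Z @ C.T) / (R.T @ R @ B @ C @ C.T + eps)
        C *= (B.T @ R.T @ Z) / (B.T @ R.T @ R @ B @ C + eps)
    return R, B, C

# Row clusters can be read off from R and column clusters from C.
Z = np.random.default_rng(1).random((20, 12))
R, B, C = nbvd(Z, k=3, l=2)
row_clusters = R.argmax(axis=1)   # cluster label for each of the 20 rows
col_clusters = C.argmax(axis=0)   # cluster label for each of the 12 columns
```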
  • 12. EMPIRICAL EVALUATIONS
    12
    The experimental datasets are drawn from the 20-Newsgroups data and the CLASSIC3 dataset.
    We measure the clustering performance using the accuracy given by the confusion matrix between the obtained clusters and the "real" classes.
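One common way to turn the confusion matrix into a single accuracy number is to match obtained clusters to the "real" classes with the best one-to-one assignment. A sketch (the Hungarian solver from SciPy is an implementation choice made here, not something specified in the paper):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def clustering_accuracy(confusion):
    """confusion[i, j] = number of samples of true class i placed in cluster j."""
    # Find the cluster-to-class mapping that maximizes correctly assigned samples.
    rows, cols = linear_sum_assignment(confusion, maximize=True)
    return confusion[rows, cols].sum() / confusion.sum()

# Hypothetical 3-class confusion matrix.
confusion = np.array([[90,  5,  5],
                      [10, 80, 10],
                      [ 0, 20, 80]])
print(clustering_accuracy(confusion))  # 0.8333...
```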
  • 13. EMPIRICAL EVALUATIONS (cont.)
    13
  • 14. EMPIRICAL EVALUATIONS (cont.)
    14
  • 15. Conclusion
    15
    In this paper, we have proposed a new co-clustering framework for dyadic data called Block Value Decomposition.
    Under this framework, we focus on a special but also very popular case, Non-negative Block Value Decomposition.
    We have shown the correctness of the NBVD algorithm theoretically.
    The empirical evaluations demonstrate the effectiveness and the great potential of the BVD framework.
