
- 1. Donglin Niu, Jennifer G. Dy (Department of Electrical and Computer Engineering, Northeastern University, Boston, MA) and Michael I. Jordan (EECS and Statistics Departments, University of California, Berkeley)
- 2. Given medical data: from the doctor's view, cluster patients according to type of disease; from the insurance company's view, cluster based on patient cost/risk.
- 3. Two kinds of approaches: iterative and simultaneous. Iterative methods take an existing clustering and find another clustering:
  - Conditional Information Bottleneck. Gondek and Hofmann (2004)
  - COALA. Bae and Bailey (2006)
  - Minimizing KL-divergence. Qi and Davidson (2009)
  - Orthogonal Projection (multiple alternative clusterings). Cui et al. (2007)
- 4. Simultaneous methods discover all the possible partitionings at once:
  - Meta Clustering. Caruana et al. (2006)
  - De-correlated k-means. Jain et al. (2008)
- 5. Related families: Ensemble Clustering; Hierarchical Clustering.
- 6. [Figure: the same data clustered two different ways, View 1 and View 2.] There are O(K^N) possible clustering solutions. We'd like to find solutions that 1. have high cluster quality and 2. are non-redundant, and we'd like to 3. simultaneously learn the subspace in which each view resides.
- 7. Normalized Cut (spectral clustering; Ng et al.): maximize within-cluster similarity and minimize between-cluster similarity. Let U be the (relaxed) cluster assignment:

    max_U tr(U^T D^(-1/2) K D^(-1/2) U)   s.t.  U^T U = I

  Advantage: can discover arbitrarily-shaped clusters.
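As a concrete illustration of the normalized-cut relaxation on this slide, here is a minimal NumPy sketch (the function names and the RBF kernel choice are my own, not from the slides): the embedding U is the matrix of top-k eigenvectors of D^(-1/2) K D^(-1/2), row-normalized before k-means as in Ng et al.

```python
import numpy as np

def rbf_kernel(X, sigma=1.0):
    # Pairwise Gaussian similarities K_ij = exp(-||x_i - x_j||^2 / (2 sigma^2)).
    sq = np.sum(X**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2 * X @ X.T, 0.0)
    return np.exp(-d2 / (2 * sigma**2))

def ncut_embedding(K, k):
    """Top-k eigenvectors of D^(-1/2) K D^(-1/2); rows are spectral embeddings."""
    d = K.sum(axis=1)
    M = K / np.sqrt(np.outer(d, d))       # D^(-1/2) K D^(-1/2)
    _, V = np.linalg.eigh(M)              # eigenvalues in ascending order
    U = V[:, -k:]                         # keep the k largest
    # Row-normalize before running k-means on the rows (the Ng et al. step).
    return U / np.linalg.norm(U, axis=1, keepdims=True)
```

On well-separated data, rows of U belonging to the same cluster nearly coincide, so ordinary k-means on the rows recovers the partition, however curved the clusters are in the input space.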
- 8. There are several possible redundancy criteria: correlation and mutual information. Correlation can capture only linear dependencies. Mutual information can capture non-linear dependencies, but requires estimating the joint probability distribution. We instead choose the Hilbert-Schmidt Independence Criterion:

    HSIC(x, y) = ||C_xy||_HS^2

  Advantage: it can detect non-linear dependence and does not require estimating joint probability distributions.
- 9. HSIC is the squared Hilbert-Schmidt norm of the cross-covariance operator in kernel space:

    HSIC(x, y) = ||C_xy||_HS^2,   C_xy = E_xy[(phi(x) - mu_x) (x) (psi(y) - mu_y)]

  Its empirical estimate over n observations is

    HSIC(X, Y) := (1/n^2) tr(K H L H)

  where H, K, L are n x n matrices, K_ij := k(x_i, x_j) and L_ij := l(y_i, y_j) are kernel matrices, and H = I - (1/n) 1_n 1_n^T is the centering matrix.
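The empirical estimator above is essentially a one-liner. A small sketch (function name is my own):

```python
import numpy as np

def hsic(K, L):
    """Empirical HSIC: (1/n^2) tr(K H L H), with centering H = I - (1/n) 1 1^T."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return np.trace(K @ H @ L @ H) / n**2
```

With linear kernels K = x x^T and L = y y^T this reduces to the squared empirical covariance, so dependent pairs score far higher than independent ones; with universal (e.g. RBF) kernels it also detects non-linear dependence.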
- 10. Cluster quality (normalized cut) plus a redundancy penalty (HSIC):

    max_{U_v, W_v}  sum_v tr(U_v^T D_v^(-1/2) K_v D_v^(-1/2) U_v) - lambda * sum_{v != q} tr(K_v H K_q H)
    s.t.  U_v^T U_v = I,  W_v^T W_v = I,  K_v,ij = k(W_v^T x_i, W_v^T x_j)

  where U_v is the spectral embedding, K_v the kernel matrix, and D_v the degree matrix for each view v; H centers the kernel matrices. All of these are defined in the subspace W_v.
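Written out, the objective on this slide is straightforward to evaluate. A minimal sketch (the function names, the RBF kernel, and lambda as a scalar trade-off are my assumptions): for each view v, project onto W_v, build the kernel, sum the top-k eigenvalues of the normalized kernel (the value of the quality term at the optimal U_v), and subtract lambda times the pairwise redundancy terms.

```python
import numpy as np

def view_kernel(X, W, sigma=1.0):
    # Kernel in the subspace of a view: K_ij = k(W^T x_i, W^T x_j), RBF here.
    Z = X @ W
    sq = np.sum(Z**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2 * Z @ Z.T, 0.0)
    return np.exp(-d2 / (2 * sigma**2))

def msc_objective(X, Ws, k, lam=1.0):
    """sum_v tr(U_v^T D^-1/2 K_v D^-1/2 U_v) - lam * sum_{v != q} tr(K_v H K_q H)."""
    n = X.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    Ks = [view_kernel(X, W) for W in Ws]
    quality = 0.0
    for K in Ks:
        d = K.sum(axis=1)
        M = K / np.sqrt(np.outer(d, d))
        # At the optimum over orthonormal U_v, the trace equals the sum of
        # the k largest eigenvalues of the normalized kernel.
        quality += np.linalg.eigvalsh(M)[-k:].sum()
    redundancy = sum(np.trace(Ks[v] @ H @ Ks[q] @ H)
                     for v in range(len(Ks)) for q in range(len(Ks)) if v != q)
    return quality - lam * redundancy
```

Because each redundancy term is a nonnegative HSIC value, increasing lambda can only lower the objective for fixed subspaces, trading cluster quality against inter-view dependence.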
- 11. We use a coordinate ascent approach. Step 1: fix W_v and optimize U_v; the solution for U_v is the set of eigenvectors with the largest eigenvalues of the normalized kernel similarity matrix. Step 2: fix U_v and optimize W_v by gradient ascent on a Stiefel manifold. Repeat Steps 1 and 2 until convergence. K-means step: normalize the rows of U_v and apply k-means to U_v.
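A toy version of this loop for a single view, under simplifications of mine: the analytic gradient ascent on the Stiefel manifold from the slide is replaced by a finite-difference gradient followed by a QR retraction (which restores W^T W = I), and the kernel is RBF. This is a sketch of the alternation, not the paper's implementation.

```python
import numpy as np

def subspace_kernel(X, W, sigma=1.0):
    Z = X @ W
    sq = np.sum(Z**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2 * Z @ Z.T, 0.0)
    return np.exp(-d2 / (2 * sigma**2))

def ncut_value(X, W, k):
    # Quality term with U at its closed-form optimum (top-k eigenvalue sum).
    K = subspace_kernel(X, W)
    d = K.sum(axis=1)
    M = K / np.sqrt(np.outer(d, d))
    return np.linalg.eigvalsh(M)[-k:].sum()

def coordinate_ascent(X, k, dim, iters=3, lr=0.5, eps=1e-5):
    rng = np.random.RandomState(0)
    W, _ = np.linalg.qr(rng.randn(X.shape[1], dim))  # random orthonormal start
    for _ in range(iters):
        # Step 1: with W fixed, U is the matrix of top-k eigenvectors of the
        # normalized kernel similarity matrix (closed form).
        K = subspace_kernel(X, W)
        d = K.sum(axis=1)
        M = K / np.sqrt(np.outer(d, d))
        _, V = np.linalg.eigh(M)
        U = V[:, -k:]
        # Step 2: with U fixed, ascend in W. Finite differences stand in for
        # the analytic gradient; QR retracts back onto the Stiefel manifold.
        G = np.zeros_like(W)
        base = ncut_value(X, W, k)
        for i in range(W.shape[0]):
            for j in range(W.shape[1]):
                Wp = W.copy()
                Wp[i, j] += eps
                G[i, j] = (ncut_value(X, Wp, k) - base) / eps
        W, _ = np.linalg.qr(W + lr * G)
    return U, W
```

After convergence, the k-means step of the slide applies: row-normalize U and run k-means on its rows.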
- 12. Initialization: cluster the features using spectral clustering. Given data x = [f1 f2 f3 f4 f5 ... fd], compute feature similarities based on HSIC(f_i, f_j), group the features, and form each transformation matrix W_v as a 0/1 indicator matrix that selects the features assigned to view v.
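A crude sketch of this initialization, with simplifications of my own: linear per-feature kernels for HSIC, and an argmax over the leading eigenvectors of the feature-similarity matrix instead of a full spectral-clustering k-means step.

```python
import numpy as np

def hsic_linear(f, g):
    # Empirical HSIC with linear kernels reduces to the squared covariance.
    fc = f - f.mean()
    gc = g - g.mean()
    return (fc @ gc / len(f)) ** 2

def init_views(X, n_views):
    """Group features by pairwise HSIC; return one 0/1 selection matrix per view."""
    n, d = X.shape
    S = np.zeros((d, d))
    for i in range(d):
        for j in range(d):
            S[i, j] = hsic_linear(X[:, i], X[:, j])
    # Leading eigenvectors of the feature-similarity matrix; each feature is
    # assigned to the eigenvector on which it loads most heavily.
    _, V = np.linalg.eigh(S)
    E = np.abs(V[:, -n_views:])
    labels = np.argmax(E, axis=1)
    I = np.eye(d)
    return [I[:, labels == v] for v in range(n_views)]
```

Each returned W_v has orthonormal 0/1 columns, so it satisfies the W_v^T W_v = I constraint of the objective and gives the coordinate ascent a sensible starting subspace per view.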
- 13. Synthetic data (two datasets, two views each). mSC: our algorithm; OPC: orthogonal projection (Cui et al., 2007); DK: de-correlated k-means (Jain et al., 2008); SC: spectral clustering. Normalized Mutual Information (NMI) results:

            DATA 1            DATA 2
            VIEW 1  VIEW 2    VIEW 1  VIEW 2
    mSC     0.94    0.95      0.90    0.93
    OPC     0.89    0.85      0.02    0.07
    DK      0.87    0.94      0.03    0.05
    SC      0.37    0.42      0.31    0.25
    Kmeans  0.36    0.34      0.03    0.05
- 14. Face data: an identity (ID) view and a pose view. NMI results:

    FACE    ID      POSE
    mSC     0.79    0.42
    OPC     0.67    0.37
    DK      0.70    0.40
    SC      0.67    0.22
    Kmeans  0.64    0.24

  (The figure shows the mean face per cluster; the number below each image is cluster purity.)
- 15. WebKB data. High-weight words in each subspace view: view 1: Cornell, Texas, Wisconsin, Madison, Washington; view 2: homework, student, professor, project, Ph.D. NMI results:

    WEBKB   UNIV.   TYPE
    mSC     0.81    0.54
    OPC     0.43    0.53
    DK      0.48    0.57
    SC      0.25    0.39
    Kmeans  0.10    0.50
- 16. NSF Award data: high-frequency words per discovered view. Subjects view: Physics (materials, chemical, metal, optical, quantum, surface); Information (control, programming, information, function, languages); Biology (cell, gene, protein, DNA, biological). Work Type view: theory (methods, mathematical, develop, equation, theoretical); experiment (experiments, processes, techniques, measurements).
- 17. Machine sound data. NMI results:

    SOUND   MOTOR   FAN     PUMP
    mSC     0.82    0.75    0.83
    OPC     0.73    0.68    0.47
    DK      0.64    0.58    0.75
    SC      0.42    0.16    0.09
    Kmeans  0.57    0.16    0.09
- 18. Most clustering algorithms find only a single clustering solution. However, data may be multi-faceted, i.e., it can be interpreted in many different ways. We introduced a new method for discovering multiple non-redundant clusterings. Our approach, mSC, optimizes both a spectral clustering term (to measure quality) and an HSIC regularization (to penalize redundancy). mSC can discover multiple clusterings with flexibly shaped clusters while simultaneously finding the subspace in which each clustering view resides.
- 19. Thank you!
