Spectrum = the set of eigenvalues
By looking at the spectrum we can learn about
the graph itself!
A way of normalizing data (canonical form)
and then performing clustering (e.g. via k-means)
on this normalized/reduced space.
Input: A similarity matrix
Output: A set of (non-overlapping/hard) clusters
Undirected graph G(V, E)
V: set of vertices (nodes in the network)
E: set of edges (links in the network)
▪ Weight w_ij is the weight of the edge connecting vertices i
and j (represented by the affinity matrix W).
Degree: sum of weights on outgoing edges of a vertex
Measuring the size of a subset A of V
How to create the affinity matrix W from the
similarity matrix S?
▪ Connect all vertices that have similarity greater than ε
k-nearest neighbor graph
▪ Connect the k-nearest neighbors of each vertex.
▪ Mutual k-nearest neighbor graphs for asymmetric S.
Fully connected graph
▪ Use the Gaussian similarity function (kernel)
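The constructions listed above can be sketched in NumPy. This is a minimal illustration, not a reference implementation: the function names are hypothetical, the Gaussian kernel is taken in its common form w_ij = exp(-||x_i - x_j||^2 / (2 sigma^2)), the diagonal is zeroed (no self-loops), and S is assumed to be symmetric.

```python
import numpy as np

def gaussian_affinity(X, sigma=1.0):
    """Fully connected graph with Gaussian weights (common form, assumed here)."""
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # no self-loops (a convention assumed here)
    return W

def epsilon_graph(S, eps):
    """Connect i and j when their similarity is greater than eps (unweighted)."""
    W = (S > eps).astype(float)
    np.fill_diagonal(W, 0.0)
    return W

def knn_graph(S, k, mutual=False):
    """Keep each vertex's k most similar neighbors, then symmetrize."""
    n = S.shape[0]
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < n")
    S_off = S.astype(float)
    np.fill_diagonal(S_off, -np.inf)          # a vertex is not its own neighbor
    idx = np.argsort(-S_off, axis=1)[:, :k]   # k largest similarities per row
    A = np.zeros((n, n), dtype=bool)
    A[np.repeat(np.arange(n), k), idx.ravel()] = True
    # "or" gives the k-nearest neighbor graph, "and" the mutual variant
    A = (A & A.T) if mutual else (A | A.T)
    return np.where(A, S_off, 0.0)

# Four points forming two tight pairs
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
S = gaussian_affinity(X, sigma=1.0)
W_knn = knn_graph(S, k=1)
W_eps = epsilon_graph(S, eps=0.5)
```

With k = 1 each point is linked only to its partner, so the resulting graph has two connected components.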
L = D - W
D: degree matrix. A diagonal matrix diag(d_1, ..., d_n)
For every vector f in R^n: f^T L f = (1/2) Σ_{i,j} w_ij (f_i - f_j)^2
L is symmetric and positive semi-definite
The smallest eigenvalue of L is zero and the
corresponding eigenvector is 1 = (1, ..., 1)^T
L has n non-negative, real-valued eigenvalues
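The properties above can be checked numerically on a small graph. The sketch below uses a made-up 4-vertex affinity matrix (a triangle with one pendant vertex) and an arbitrary test vector f. It also checks the standard quadratic-form identity f^T L f = (1/2) Σ_{i,j} w_ij (f_i - f_j)^2.

```python
import numpy as np

# Hypothetical symmetric affinity matrix: triangle 0-1-2 plus pendant vertex 3
W = np.array([[0.0, 1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0, 0.0],
              [1.0, 1.0, 0.0, 1.0],
              [0.0, 0.0, 1.0, 0.0]])
n = W.shape[0]

d = W.sum(axis=1)   # vertex degrees d_1, ..., d_n
D = np.diag(d)      # degree matrix
L = D - W           # unnormalized graph Laplacian

# Quadratic form versus the pairwise-difference sum
f = np.array([1.0, -2.0, 0.5, 3.0])
quad = f @ L @ f
pairwise = 0.5 * sum(W[i, j] * (f[i] - f[j]) ** 2
                     for i in range(n) for j in range(n))

# L is symmetric, so eigvalsh returns real eigenvalues in ascending order
eigvals = np.linalg.eigvalsh(L)
ones = np.ones(n)
```

Here `quad` and `pairwise` agree, `L @ ones` is the zero vector, and the smallest entry of `eigvals` is zero up to floating-point error.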
Two versions exist
L_sym = D^{-1/2} L D^{-1/2} = I - D^{-1/2} W D^{-1/2}
L_rw = D^{-1} L = I - D^{-1} W
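Both normalized versions can be computed without forming D^{-1/2} or D^{-1} explicitly. The following is a minimal sketch; the function name is hypothetical and every vertex is assumed to have positive degree.

```python
import numpy as np

def normalized_laplacians(W):
    """Return (L_sym, L_rw) for a symmetric affinity matrix W."""
    d = W.sum(axis=1)
    if np.any(d <= 0):
        raise ValueError("every vertex needs a positive degree")
    L = np.diag(d) - W
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # Row and column scaling implements D^{-1/2} L D^{-1/2}
    L_sym = d_inv_sqrt[:, None] * L * d_inv_sqrt[None, :]
    # Row scaling implements D^{-1} L
    L_rw = L / d[:, None]
    return L_sym, L_rw

# Hypothetical example graph: triangle 0-1-2 plus pendant vertex 3
W = np.array([[0.0, 1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0, 0.0],
              [1.0, 1.0, 0.0, 1.0],
              [0.0, 0.0, 1.0, 0.0]])
L_sym, L_rw = normalized_laplacians(W)
```

L_sym is symmetric while L_rw generally is not, but the two matrices are similar and therefore share the same eigenvalues.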
The partition (A1,...,Ak) induces a cut on the graph
Two types of graph cuts exist
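The value of a cut can be computed directly from W and a label vector. The sketch below assumes the two types are RatioCut (each term divided by the number of vertices |A|) and Ncut (each term divided by the volume vol(A), the sum of degrees in A), which is the usual formulation; the function name and the `size` parameter are hypothetical.

```python
import numpy as np

def graph_cut(W, labels, size="none"):
    """Cut value of a partition; size selects plain cut, RatioCut or Ncut."""
    labels = np.asarray(labels)
    d = W.sum(axis=1)
    total = 0.0
    for c in np.unique(labels):
        inside = labels == c
        # Total weight of edges leaving cluster c
        boundary = W[np.ix_(inside, ~inside)].sum()
        if size == "count":        # RatioCut: divide by |A|
            denom = inside.sum()
        elif size == "vol":        # Ncut: divide by vol(A)
            denom = d[inside].sum()
        else:                      # plain cut
            denom = 1.0
        total += boundary / denom
    return 0.5 * total             # each crossing edge was counted twice

# Hypothetical example graph: triangle 0-1-2 plus pendant vertex 3
W = np.array([[0.0, 1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0, 0.0],
              [1.0, 1.0, 0.0, 1.0],
              [0.0, 0.0, 1.0, 0.0]])
labels = [0, 0, 0, 1]              # split off the pendant vertex

cut = graph_cut(W, labels)                  # 1.0
ratio_cut = graph_cut(W, labels, "count")   # 0.5 * (1/3 + 1/1) = 2/3
ncut = graph_cut(W, labels, "vol")          # 0.5 * (1/7 + 1/1) = 4/7
```

Dividing by a size measure penalizes partitions that split off very small clusters, which the plain cut tends to favor.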
Spectral clustering solves a relaxed version of the
mincut problem (therefore it is an approximation)
By the Rayleigh-Ritz
theorem it follows that the
second eigenvalue is the
Transition probability matrix and Laplacian
P = D^{-1} W
L_rw = I - P
L_rw-based spectral clustering (Shi &
Malik, 2000) is usually the better choice (especially when the
degree distribution is uneven).
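One way to sketch L_rw-based spectral clustering end to end is shown below. It is an illustration under stated assumptions rather than the method from the cited paper: the eigenvectors of L_rw are obtained from the symmetric L_sym (if L_sym v = λ v, then u = D^{-1/2} v satisfies L_rw u = λ u), and the k-means step is a bare-bones Lloyd iteration with a deterministic farthest-point initialization chosen only to keep the example reproducible. All function names are hypothetical.

```python
import numpy as np

def kmeans(Y, k, n_iter=50):
    """Plain Lloyd iterations with farthest-point initialization."""
    centers = [Y[0]]
    for _ in range(1, k):
        dist = np.min([np.sum((Y - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(Y[np.argmax(dist)])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        sq = ((Y[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = np.argmin(sq, axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = Y[labels == c].mean(axis=0)
    return labels

def spectral_clustering_rw(W, k):
    """Cluster using the first k eigenvectors of L_rw = I - D^{-1} W."""
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(d)          # assumes positive degrees
    L_sym = np.eye(len(d)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, V = np.linalg.eigh(L_sym)           # eigenvalues in ascending order
    U = d_inv_sqrt[:, None] * V[:, :k]     # eigenvectors of L_rw
    return kmeans(U, k)                    # each row of U is one vertex

# Two triangles joined by one weak edge (hypothetical data)
W = np.zeros((6, 6))
for group in ([0, 1, 2], [3, 4, 5]):
    for i in group:
        for j in group:
            if i != j:
                W[i, j] = 1.0
W[2, 3] = W[3, 2] = 0.1
labels = spectral_clustering_rw(W, k=2)
```

On this graph the two triangles end up in different clusters, since cutting the weak edge is by far the cheapest way to separate the vertices.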
Use k-nearest neighbor graphs
How to set the number of clusters:
Use the eigengap heuristic
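The eigengap heuristic picks k so that the first k eigenvalues are small and the (k+1)-th is comparatively large. A minimal sketch, assuming the largest gap between consecutive sorted eigenvalues marks the cutoff (the function name and the test graph are made up for illustration):

```python
import numpy as np

def eigengap_k(eigvals):
    """Return k such that the gap between eigenvalue k and k+1 is largest."""
    lam = np.sort(np.asarray(eigvals, dtype=float))
    gaps = np.diff(lam)                 # gaps[i] = lam[i+1] - lam[i]
    return int(np.argmax(gaps)) + 1     # convert 0-based gap index to a count

# Three dense blocks of 3 vertices, weakly linked to each other
W = np.full((9, 9), 0.01)
for b in range(3):
    W[3 * b:3 * b + 3, 3 * b:3 * b + 3] = 1.0
np.fill_diagonal(W, 0.0)

d = W.sum(axis=1)
L_sym = np.eye(9) - W / np.sqrt(np.outer(d, d))
k = eigengap_k(np.linalg.eigvalsh(L_sym))
```

For this graph the three smallest eigenvalues are close to zero and the rest are close to 1.5, so the heuristic returns k = 3. When the clusters overlap heavily, the gaps become less pronounced and the heuristic is less informative.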
If using the Gaussian kernel, how to set sigma:
Mean distance of a point to its log(n)+1 nearest neighbors
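The sigma rule of thumb can be sketched as follows. The slide's wording leaves some room for interpretation; this version assumes it means the distance from each point to its k-th nearest neighbor with k = floor(ln n) + 1, averaged over all points. The function name is hypothetical.

```python
import numpy as np

def sigma_heuristic(X):
    """Mean distance to the k-th nearest neighbor, k = floor(ln n) + 1."""
    n = X.shape[0]
    if n < 2:
        raise ValueError("need at least two points")
    k = min(int(np.log(n)) + 1, n - 1)   # natural logarithm assumed
    dists = np.sqrt(np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1))
    dists.sort(axis=1)                   # column 0 is the distance to itself
    return float(dists[:, k].mean())

# Four equally spaced points on a line: k = 2, distances 2, 1, 1, 2
X = np.array([[0.0], [1.0], [2.0], [3.0]])
sigma = sigma_heuristic(X)               # 1.5
```

The resulting value is only a starting point; the clustering can be sensitive to sigma, so checking a few values around it is a reasonable precaution.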
The low-rank approximation B of a matrix A such that
rank(B) = r < rank(A) is given by
B = USV*, where A = UZV* is the singular value decomposition of A and S is the same as Z
except that the (r+1)-th and all subsequent singular values of Z are
set to zero.
The approximation minimizes the Frobenius norm of the error:
▪ min_B ||A - B||_F, subject to rank(B) = r
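The truncation described above maps directly onto `numpy.linalg.svd`. A minimal sketch (the function name and the random test matrix are illustrative only):

```python
import numpy as np

def low_rank_approx(A, r):
    """Best rank-r approximation of A in the Frobenius norm."""
    U, z, Vh = np.linalg.svd(A, full_matrices=False)  # A = U diag(z) Vh
    s = z.copy()
    s[r:] = 0.0                  # zero the (r+1)-th and later singular values
    return (U * s) @ Vh          # B = U diag(s) Vh

rng = np.random.default_rng(0)
A = rng.normal(size=(6, 4))
r = 2
B = low_rank_approx(A, r)

z = np.linalg.svd(A, compute_uv=False)
error = np.linalg.norm(A - B, "fro")
expected_error = np.sqrt(np.sum(z[r:] ** 2))
```

The approximation error equals the square root of the sum of the squared discarded singular values, which is why dropping the smallest ones gives the minimum.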