Scalable Constrained Spectral Clustering
Jianyuan Li, Member, IEEE, Yingjie Xia, Member, IEEE, Zhenyu Shan, and Yuncai Liu, Senior Member, IEEE
Abstract—Constrained spectral clustering (CSC) algorithms have shown great promise in significantly improving clustering accuracy by encoding side information into spectral clustering algorithms. However, existing CSC algorithms are inefficient in handling moderate and large datasets. In this paper, we aim to develop a scalable and efficient CSC algorithm by integrating sparse coding based graph construction into a framework called constrained normalized cuts. To this end, we formulate a scalable constrained normalized-cuts problem and solve it based on a closed-form mathematical analysis. We demonstrate that this problem can be reduced to a generalized eigenvalue problem that can be solved very efficiently. We also describe a principled k-way CSC algorithm for handling moderate and large datasets. Experimental results on benchmark datasets demonstrate that the proposed algorithm is highly cost-effective, in the sense that (1) with less side information, it obtains significant improvements in accuracy over the unsupervised baseline; and (2) with less computational time, it achieves clustering accuracies close to those of the state-of-the-art.
Index Terms—Constrained spectral clustering, sparse coding, efficiency,
scalability
1 INTRODUCTION
Data in a wide variety of areas are currently growing to large scales. For many traditional learning based data mining algorithms, it is a major challenge to efficiently mine knowledge from such rapidly growing data as information streams, images, and even videos. To overcome this challenge, it is important to develop scalable learning algorithms.
Constrained clustering is an important area of machine learning research. Researchers have proposed many algorithms [1], [2], [3], [4], [5], [6], [7], [8], [9], [10], [11], [12], [13], [14], [15], [16] that improve clustering accuracy by encoding side information into unsupervised clustering algorithms. Here, side information might be labelled data [15], pairwise constraints [2], relative comparison constraints [17], and so forth; in this paper, side information takes the form of labelled data. In practice, labelled data are often costly to obtain, so the typical problem in this area is to improve clustering using only a small amount of side information.
Constrained spectral clustering (CSC) algorithms [9], [10], [11], [12], [13], [14], [15], [16] are in general more accurate than other constrained clustering algorithms, in part because of the high accuracies of unsupervised spectral clustering [18], [19], [20], [21], [22]. However, the scalability and efficiency of previous CSC algorithms are in general poor. Specifically, the memory complexity of existing CSC algorithms [9], [10], [11], [12], [13], [14], [15], [16] is $O(n^2)$ and the time complexity is $O(n^3)$, where $n$ is the total number of instances. This hampers the application of CSC algorithms to moderate and large datasets.
In this paper, we develop an efficient and scalable CSC algorithm that can handle moderate and large datasets well. The proposed algorithm, termed Scalable Constrained Spectral Clustering (SCACS), can be understood as a scalable version of the well-designed but less efficient algorithm known as Flexible Constrained Spectral Clustering (FCSC) [15], [16]. To the best of our knowledge, our algorithm is the first efficient and scalable algorithm in this area. It is derived by integrating two recent studies, the constrained normalized cuts [15], [16] and the graph construction method based on sparse coding [23]. However, it is by no means straightforward to integrate the two existing methods. In the rest of the paper, we mainly answer three questions: how do we achieve an effective integration? how do we derive the SCACS algorithm in a principled way? and how well does the proposed algorithm perform?
The rest of the paper is structured as follows: in Section 2, we revisit the constrained normalized-cuts problem and the method of graph construction based on sparse coding; in Section 3, we formulate the problem to be solved and present a closed-form mathematical analysis; in Section 4, we propose our algorithm based on the results of Section 3; in Section 5, we evaluate the proposed algorithm on benchmark datasets and discuss the results; in Section 6, we conclude and discuss future work.
For notation, vectors and matrices are denoted by bold lower-
case and upper-case letters, respectively. Sets are denoted by italic
upper-case letters. Scalars are denoted by italic lower-case letters.
2 BACKGROUND
In this section, we revisit two pioneering studies, namely the constrained normalized cuts [15], [16] and the sparse coding based graph construction method [23].
2.1 Constrained Normalized Cuts
Given a vector dataset $X = \{x_i\}_{i=1}^{n}$ where $x_i \in \mathbb{R}^d$, and a constraint set $\{C_{=}, C_{\neq}\}$ where $(x_i, x_j) \in C_{=}$ if the patterns of $x_i$ and $x_j$ are similar and $(x_i, x_j) \in C_{\neq}$ otherwise, the aim is to partition $X$ into $k$ clusters biased by the constraint set. Let $W$ be the similarity matrix over $X$, where $W_{ij}$ represents the similarity between instances $x_i$ and $x_j$. Let $D$ be the degree matrix over $X$, a diagonal matrix with elements $D_{ii} = \sum_j W_{ij}$. Let $L = I - D^{-1/2} W D^{-1/2}$ be the normalized graph Laplacian, where $I$ denotes the identity matrix. Let $Q$ denote the constraint matrix, where $Q_{ij} = 1$ expresses $(x_i, x_j) \in C_{=}$, $Q_{ij} = -1$ expresses $(x_i, x_j) \in C_{\neq}$, and $Q_{ij} = 0$ expresses no available side information. Let $\bar{Q} = D^{-1/2} Q D^{-1/2}$ be the normalized constraint matrix. The constrained normalized cuts [15], [16] can be written as:

$$\arg\min_{v \in \{+\frac{1}{\sqrt{n}}, -\frac{1}{\sqrt{n}}\}^n} \; v^T L v \quad \text{s.t.} \quad v^T \bar{Q} v \geq \alpha, \;\; v^T v = 1, \;\; v \perp D^{1/2}\mathbf{1}. \tag{1}$$
Here the parameter $\alpha$ controls to what degree the input side information is respected. After removing the constraint $v^T \bar{Q} v \geq \alpha$, Eq. (1) degenerates to the standard normalized cuts [18]. The problem is NP-hard; the feasible way is to allow $v$ to take arbitrary real values [21], which is called relaxation in the literature. The relaxed solution vector can be obtained by generalized eigenvalue decomposition [15], [16]. However, this still requires $O(n^2)$ memory cost and $O(n^3)$ time cost.
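For concreteness, the following minimal NumPy sketch (our own illustration, not code from [15], [16]) builds the normalized graph Laplacian $L$ and the normalized constraint matrix $\bar{Q}$ from a dense similarity matrix W and constraint matrix Q; the small epsilon guarding against isolated nodes is our addition.

import numpy as np

def normalized_laplacian_and_constraints(W, Q):
    """Return L = I - D^{-1/2} W D^{-1/2} and Qbar = D^{-1/2} Q D^{-1/2}."""
    d = W.sum(axis=1)                                   # degrees D_ii = sum_j W_ij
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))    # guard against zero degrees (our addition)
    n = W.shape[0]
    L = np.eye(n) - (d_inv_sqrt[:, None] * W) * d_inv_sqrt[None, :]
    Qbar = (d_inv_sqrt[:, None] * Q) * d_inv_sqrt[None, :]
    return L, Qbar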
2.2 Sparse Coding Based Graph Construction
Graph construction amounts to computing a similarity matrix.
There exist a variety of methods [18], [21], [23], [24], [25], [26], [27].
• J. Li is with the College of Computer Science, Zhejiang University, Hangzhou 310058, China, and the Big Data Research Center of Enjoyor Co. Ltd, Hangzhou 310030, China. E-mail: jylbob@gmail.com.
• Y. Xia is with the College of Computer Science, Zhejiang University, Hangzhou 310058, China. E-mail: xiayingjie@zju.edu.cn.
• Z. Shan is with the College of Computer Science, Hangzhou Normal University, Hangzhou 310012, China. E-mail: shanzhenyu@zju.edu.cn.
• Y. Liu is with the Department of Automation, Shanghai Jiaotong University, Shanghai 200240, China. E-mail: company8110@gmail.com.

Manuscript received 1 Aug. 2013; revised 13 June 2014; accepted 18 Aug. 2014. Date of publication 9 Sept. 2014; date of current version 23 Dec. 2014.
Recommended for acceptance by S. Chawla.
For information on obtaining reprints of this article, please send e-mail to: reprints@ieee.org, and reference the Digital Object Identifier below.
Digital Object Identifier no. 10.1109/TKDE.2014.2356471
Among them, the method in [23] is based on sparse coding theory
[28] and is designed for handling large datasets.
Here we briefly introduce the sparse coding based graph construction. Given a dataset $X$, in the form of a $d$-by-$n$ matrix, sparse coding aims to find a pair of matrices $U \in \mathbb{R}^{d \times p}$ and $Z \in \mathbb{R}^{p \times n}$ such that $UZ$ best approximates $X$, where the columns of $U$ represent the desired base vectors and the columns of $Z$ represent sparse coefficient vectors, each with few non-zero components. The cost function to be minimized is

$$f(U, Z) = \|X - UZ\|_F^2. \tag{2}$$

Unfortunately, it is costly to solve exactly for $U^*$ and $Z^*$ [29]. In practice, an approximate and efficient way [23] is to randomly choose $p$ instances from the input dataset to act as base vectors, and then to estimate each column vector of $Z$ according to

$$Z_{ij} = \frac{K_\sigma(x_j, u_i)}{\sum_{i \in rNB(j)} K_\sigma(x_j, u_i)}, \tag{3}$$

where $\sigma$ denotes the bandwidth of the Gaussian kernel $K_\sigma(\cdot, \cdot)$ and $i \in rNB(j)$ means that the base vector $u_i$ is among the $r$ nearest base (rNB) vectors of instance $x_j$. The sparseness of the matrix $Z$ is controlled by the parameter $r$.
After obtaining $Z$, one can construct two forms of graph matrices: $G = Z^T Z$ and $S = Z Z^T$. Choose $\hat{Z} = D^{-1/2} Z$, where $D_{ii} = \sum_j Z_{ij}$. We can then obtain the normalized graph matrices $\hat{G} = \hat{Z}^T \hat{Z} \in \mathbb{R}^{n \times n}$ and $\hat{S} = \hat{Z} \hat{Z}^T \in \mathbb{R}^{p \times p}$. It is easy to check that the normalized graph Laplacian over $X$ is $(I - \hat{G})$. The time complexity for computing $\hat{G}$ is $O(pn^2)$, but that for computing $\hat{S}$ is just $O(np^2)$ ($p \ll n$).
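The following compact NumPy sketch, written by us under stated assumptions (a dense distance computation and a caller-supplied bandwidth sigma; for large n the distances would be computed in blocks), illustrates this construction and produces the matrices $\hat{Z}$ and $\hat{S}$ used in the rest of the paper.

import numpy as np

def sparse_coding_graph(X, p=500, r=3, sigma=1.0, rng=None):
    """X: d-by-n data matrix. Returns Zhat (p-by-n) and Shat (p-by-p)."""
    rng = np.random.default_rng(rng)
    d, n = X.shape
    U = X[:, rng.choice(n, size=p, replace=False)]              # p random base vectors
    # Squared distances between every base vector u_i and instance x_j (p-by-n).
    d2 = (U**2).sum(0)[:, None] + (X**2).sum(0)[None, :] - 2.0 * U.T @ X
    K = np.exp(-d2 / (2.0 * sigma**2))                          # Gaussian kernel values
    Z = np.zeros((p, n))
    nearest = np.argpartition(-K, r, axis=0)[:r, :]             # r nearest bases per column
    cols = np.arange(n)
    for row in nearest:                                         # keep only the r largest entries
        Z[row, cols] = K[row, cols]
    Z /= Z.sum(axis=0, keepdims=True)                           # Eq. (3): column normalization
    row_sums = np.maximum(Z.sum(axis=1, keepdims=True), 1e-12)  # D_ii = sum_j Z_ij (guarded)
    Zhat = Z / np.sqrt(row_sums)                                # Zhat = D^{-1/2} Z
    Shat = Zhat @ Zhat.T                                        # p-by-p, costs O(n p^2)
    return Zhat, Shat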
3 PROBLEM FORMULATION
In this section, we formulate a scalable constrained normalized-
cuts problem and demonstrate how to solve it.
3.1 The Scalable Constrained Normalized Cuts
In the following, Problem 1 refers to a straightforward integration
of the constrained normalized cuts and the sparse coding based
graph construction, and Problem 2 refers to the formulated scalable
constrained normalized-cuts problem.
Problem 1. Let $\hat{G}$ be the similarity matrix over $X$. The relaxed constrained normalized-cuts problem can be formulated as

$$\min_{v \in \mathbb{R}^n} v^T L v, \quad \text{s.t.} \quad v^T \bar{Q} v \geq \alpha, \;\; v^T v = 1, \;\; v \perp \mathbf{1}, \tag{4}$$

where $L = I - \hat{G}$ is the normalized graph Laplacian.

Based on [15], the normal approach for finding the solution vector of Problem 1 can be reduced to the following generalized eigenvalue problem:

$$L v = \lambda (\bar{Q} - \beta I) v, \tag{5}$$

where $\beta$ is a lower bound of $\alpha$. The time cost for solving this problem is $O(n^3)$, which is infeasible for handling large datasets.
To seek a scalable solution, we write $v \in \mathbb{R}^n$ as $\hat{Z}^T u$, where $u \in \mathbb{R}^p$. Plugging $v = \hat{Z}^T u$ into Eq. (4), we reformulate Problem 1 as the following Problem 2.
Problem 2.

$$\min_{u \in \mathbb{R}^p} u^T A u, \quad \text{s.t.} \quad u^T \hat{Q} u \geq \alpha, \;\; u^T \hat{S} u = 1, \;\; \mathbf{1}^T \hat{S} u = 0, \tag{6}$$

where

$$A = \hat{S} - \hat{S}\hat{S}, \qquad \hat{Q} = \hat{Z} Q \hat{Z}^T. \tag{7}$$
This problem is mathematically equivalent to Problem 1, but it results in two significant changes: (1) the $n$-by-$n$ normalized graph Laplacian $L$ is compressed into the $p$-by-$p$ matrix $A$; (2) the $n$-by-$n$ constraint matrix $Q$ is naturally compressed into the $p$-by-$p$ matrix $\hat{Q}$. Consequently, the solution of Problem 1 can be efficiently recovered from the solution of Problem 2, considering $p \ll n$.
3.2 Solving Problem 2
To solve Problem 2, we use Lagrange multipliers and obtain

$$\mathcal{L}(u, \lambda, \mu) = u^T A u - \lambda (u^T \hat{Q} u - \alpha) - \mu (u^T \hat{S} u - 1). \tag{8}$$

Using the KKT theorem [30], feasible solutions must satisfy the following conditions:

$$A u - \lambda \hat{Q} u - \mu \hat{S} u = 0, \tag{9}$$
$$u^T \hat{Q} u \geq \alpha, \tag{10}$$
$$u^T \hat{S} u = 1, \tag{11}$$
$$\mathbf{1}^T \hat{S} u = 0, \tag{12}$$
$$\lambda \geq 0, \tag{13}$$
$$\lambda (u^T \hat{Q} u - \alpha) = 0. \tag{14}$$

When $\lambda = 0$, Eq. (9) degenerates to the standard eigen-system $\hat{S} u = (1 - \mu) u$; that is, side information does not take effect. To use side information, we require $\lambda > 0$, which reduces Eq. (10) and Eq. (14) to

$$u^T \hat{Q} u = \alpha. \tag{15}$$

Setting $-\mu/\lambda = \beta$, Eq. (9) can be reduced to

$$A u = \lambda (\hat{Q} - \beta \hat{S}) u, \tag{16}$$

which is a generalized eigenvalue problem that can be solved very efficiently since $p \ll n$.
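As a minimal sketch (our illustration, assuming $\hat{S}$ and $\hat{Q}$ have already been computed), Eq. (16) can be solved with SciPy's generalized eigensolver; the real-part cast below discards round-off imaginary parts, since the matrices involved are symmetric.

import numpy as np
from scipy.linalg import eig

def solve_problem2(Shat, Qhat, beta):
    """Solve Au = lambda (Qhat - beta*Shat) u and keep the feasible eigenpairs."""
    A = Shat - Shat @ Shat                   # Eq. (7): A = Shat - Shat Shat
    lam, U = eig(A, Qhat - beta * Shat)      # p-by-p generalized eigenproblem
    lam, U = lam.real, U.real                # discard round-off imaginary parts
    keep = lam > 0                           # feasibility requires lambda > 0 (Eq. 13)
    return lam[keep], U[:, keep]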
To derive a scalable constrained spectral clustering (SCACS) algorithm, it is essential to discuss how to set the parameters $\beta$ and $\alpha$. Via a few algebraic manipulations, we find two facts that are useful for algorithm design. First, the matrix $A$ is positive semi-definite; hence we have

$$u^T (\hat{Q} - \beta \hat{S}) u = (\alpha - \beta) \geq 0, \tag{17}$$

meaning that the parameter $\alpha$ is lower-bounded by $\beta$. Thus, users need only specify $\beta$, regardless of $\alpha$. Second, to ensure that Eq. (16) has at least $i$ meaningful solution vectors in terms of non-negative normalized cuts, $\beta < \gamma_i$ is a sufficient condition, where $\gamma_i$ denotes the $i$-th largest eigenvalue of the generalized eigen-system $\hat{Q} x = \gamma \hat{S} x$.
4 ALGORITHM AND ANALYSIS
Based on the mathematical analysis above, in this section we derive the algorithm and analyze its complexity.
4.1 The Proposed Algorithm
Problem 2 indicates a binary constrained spectral clustering problem, where the solution vector $v^*$ plays the role of grouping indicator. Without loss of generality, here we directly derive an algorithm for $k$-class problems ($k \geq 2$). We call the proposed algorithm Scalable Constrained Spectral Clustering and list its 13 steps in Algorithm 1.
Algorithm 1. Scalable Constrained Spectral Clustering
1: Input: a dataset $X \in \mathbb{R}^{d \times n}$, the base-vector number $p$, the $n$-by-$n$ constraint matrix $Q$, $\beta$, cluster number $k$.
2: Choose $p$ vector data from the input dataset at random, and stack them in the columns of matrix $U \in \mathbb{R}^{d \times p}$.
3: Compute $Z \in \mathbb{R}^{p \times n}$ using Eq. (3) and then compute $\hat{Z} = D^{-1/2} Z$, where $D$ denotes a diagonal matrix with elements $D_{ii} = \sum_j Z_{ij}$.
4: Compute $\hat{S} = \hat{Z} \hat{Z}^T$ and $\hat{Q} = \hat{Z} Q \hat{Z}^T$.
5: Find the largest eigenvalue $\gamma_{\max}$ of the generalized eigen-system $\hat{Q} x = \gamma \hat{S} x$.
6: If $\beta \geq \gamma_{\max}$, return $\{v^*\} = \emptyset$; otherwise, find all the eigenvectors $\{u_i\}$ by solving the generalized eigen-system of Eq. (16), where $u_i$ denotes the $i$-th eigenvector, $1 \leq i \leq p$.
7: Find among $\{u_i\}$ the eigenvectors $\{u_i\}^+$ associated with positive eigenvalues.
8: Normalize each $u_i \in \{u_i\}^+$ by multiplying it by the factor $\sqrt{1 / (u_i^T \hat{S} u_i)}$.
9: Remove from $\{u_i\}^+$ the eigenvectors that are not orthogonal to the vector $\mathbf{1}^T \hat{S}$.
10: Find among $\{u_i\}^+$ the $m$ eigenvectors that lead to the smallest values of $u_i^T A u_i$, where $m = \min\{k - 1, |\{u_i\}^+|\}$, then stack them in the columns of matrix $V$.
11: Compute $V^{(r)} = \hat{Z}^T V (I - V^T A V)$.
12: Normalize the rows of $V^{(r)}$ to unit length, then feed it to the k-means algorithm.
13: Output: the grouping indicator.
Several key steps are interpreted as follows: (1) Step 7 aims to satisfy the condition $\lambda > 0$; (2) Step 8 scales each eigenvector to satisfy the condition of Eq. (11); (3) Step 9 aims to satisfy the condition of Eq. (12); (4) in Step 11, we recover the solution vectors by the linear transformation $\hat{Z}^T u$ and weight each solution vector by one minus the associated value of the objective function.
It is worth mentioning that the input parameter $\beta$ is tunable, making the algorithm flexible towards noisy side information or inappropriate mathematical expressions of side information. Usually, the larger $\beta$ is, the more the side information is respected.
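Putting the pieces together, the following sketch (ours, not the authors' released code) follows the 13 steps of Algorithm 1, reusing the hypothetical helpers sparse_coding_graph and solve_problem2 from the earlier sketches; k-means is taken from scikit-learn, and the orthogonality tolerance in Step 9 is our own choice.

import numpy as np
from scipy.linalg import eigvals
from sklearn.cluster import KMeans

def scacs(X, Q, p=500, r=3, beta0=0.1, k=2, sigma=1.0, rng=None):
    Zhat, Shat = sparse_coding_graph(X, p=p, r=r, sigma=sigma, rng=rng)  # Steps 2-4
    Qhat = Zhat @ Q @ Zhat.T                                             # Step 4
    gammas = np.sort(eigvals(Qhat, Shat).real)[::-1]                     # Step 5
    beta = beta0 * gammas[k - 2]            # beta = beta0 * gamma_{k-1}, the Section 5.2 heuristic
    if beta >= gammas[0]:
        return None                                                      # Step 6: infeasible
    lam, U = solve_problem2(Shat, Qhat, beta)                            # Steps 6-7
    A = Shat - Shat @ Shat
    U = U / np.sqrt(np.einsum('ij,jk,ki->i', U.T, Shat, U))              # Step 8: u^T Shat u = 1
    s = Shat.sum(axis=0)                                                 # the vector 1^T Shat
    U = U[:, np.abs(s @ U) < 1e-8]                                       # Step 9 (tolerance is ours)
    costs = np.einsum('ij,jk,ki->i', U.T, A, U)                          # Step 10: objective values
    m = min(k - 1, U.shape[1])
    V = U[:, np.argsort(costs)[:m]]
    Vr = Zhat.T @ V @ (np.eye(m) - V.T @ A @ V)                          # Step 11: recover and weight
    Vr /= np.linalg.norm(Vr, axis=1, keepdims=True) + 1e-12              # Step 12: row normalization
    return KMeans(n_clusters=k, n_init=10).fit_predict(Vr)               # Step 13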
4.2 Complexity Analysis
The algorithm's complexity is analyzed as follows. The time cost for computing the matrix $\hat{S}$ is in general $O(np^2)$; that for computing the generalized eigenvalue decomposition in Step 5 or Step 6 is $O(p^3)$; and that for computing $\hat{Z} Q \hat{Z}^T$ is $O(kp^2 + kpn)$. Hence the general time complexity of our algorithm is

$$O(kpn + kp^2 + np^2 + p^3). \tag{18}$$

In practical applications, $p$ can be chosen far smaller than $n$, so that the running time grows almost linearly with $n$. The memory costs for storing the matrices $\hat{Z}$ and $\hat{S}$ are $O(np)$ and $O(p^2)$, respectively. The general memory complexity is

$$O(np + p^2). \tag{19}$$
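As a rough illustration using the MNIST figures from Table 1 ($n = 70{,}000$, $p = 500$), Eq. (19) amounts to storing about $np + p^2 = 3.5 \times 10^7 + 2.5 \times 10^5 \approx 3.53 \times 10^7$ matrix entries, whereas a full $n$-by-$n$ similarity matrix would require $n^2 = 4.9 \times 10^9$ entries, roughly 140 times more.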
5 EXPERIMENT
In this section, we carry out experiments to assess the proposed SCACS algorithm.
5.1 Experiment Setup
We implement our algorithm and the compared algorithms on Linux machines; the machine used for recording computational time has a 3.10 GHz CPU and 8 GB of main memory. The datasets that we used were downloaded from the UCI machine learning repository¹ and two other web sites²,³. Basic information is listed in Table 1.
Side information is generated based on the ground-truth labels of the datasets. The following expression gives the encoding rules for $Q$:

$$Q_{ij} = \begin{cases} 1 & \text{if } x_i \text{ and } x_j \text{ have consistent labels} \\ 1 & \text{if } i = j \\ -1 & \text{if } x_i \text{ and } x_j \text{ have different labels} \\ 0 & \text{no side information.} \end{cases} \tag{20}$$

In our experiments, we randomly sample $c$ labelled instances from a given input dataset, and then obtain $Q$ based on the rules of Eq. (20).
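A minimal sketch (our illustration) of this sampling-and-encoding procedure; the function name and the vectorized filling of the labelled block are our own choices.

import numpy as np

def build_constraint_matrix(labels, c, rng=None):
    """labels: length-n ground-truth vector. Returns the n-by-n matrix Q of Eq. (20)."""
    rng = np.random.default_rng(rng)
    n = len(labels)
    Q = np.zeros((n, n))
    idx = rng.choice(n, size=c, replace=False)      # the c labelled instances
    sub = np.asarray(labels)[idx]
    # +1 for consistent labels, -1 for different labels, over the labelled pairs.
    Q[np.ix_(idx, idx)] = np.where(sub[:, None] == sub[None, :], 1.0, -1.0)
    np.fill_diagonal(Q, 1.0)                        # Q_ij = 1 if i = j
    return Q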
Clustering accuracy is evaluated by the best matching rate (ACC). Let $h$ be the resulting label vector obtained from a clustering algorithm and let $g$ be the ground-truth label vector. Then the best matching rate is defined as

$$\mathrm{ACC} = \frac{\sum_{i=1}^{n} \delta(g_i, \mathrm{map}(h_i))}{n}, \tag{21}$$

where $\delta(a, b)$ denotes the delta function that returns 1 if $a = b$ and 0 otherwise, and $\mathrm{map}(h_i)$ is the permutation mapping function that maps each cluster label $h_i$ to the equivalent label from the data corpus.
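The optimal permutation can be found with the Hungarian algorithm; the following sketch (ours, not from the paper) does so via SciPy.

import numpy as np
from scipy.optimize import linear_sum_assignment

def acc(g, h):
    """g: ground-truth labels, h: cluster labels; both length-n integer arrays."""
    g, h = np.asarray(g), np.asarray(h)
    classes, clusters = np.unique(g), np.unique(h)
    # cost[i, j] = number of points in cluster i carrying ground-truth class j
    cost = np.array([[np.sum((h == ci) & (g == gj)) for gj in classes]
                     for ci in clusters])
    row, col = linear_sum_assignment(-cost)         # maximize total agreement
    return cost[row, col].sum() / len(g)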
5.2 Comparisons with Benchmarks and FCSC
In this part, we compare our SCACS algorithm with three spectral algorithms (LSC-R, SL, and FCSC). Among them, "LSC-R" [23] is the unsupervised baseline that constructs similarity graphs based on sparse coding. "SL" [10] is the CSC baseline, which encodes side information by directly modifying similarity matrices. "FCSC" [16] is the state-of-the-art in terms of accuracy. FCSC is too slow to handle large datasets; hence, in this experiment, we used four small datasets:
1) The Image Segmentation dataset.
2) A subset sampled from the USPS dataset. We used the first
300 instances of each class, 3,000 instances in total.
3) A subset sampled from the Pen Digits dataset. We used the first 300 instances of each class, 3,000 instances in total.
4) A subset sampled from the Letter Recognition dataset. We
used the first five classes of instances, 3,864 instances in
total.
We set the Gaussian kernel width $\sigma$ to the average Euclidean distance between all instances and base vectors. We use $p = 500$, $r = 3$, and $\beta = \beta_0 \gamma_{k-1}$, where $\beta_0 = 0.5 + 0.4 \times c/n$ and $\gamma_{k-1}$ denotes the $(k-1)$-th largest eigenvalue of the generalized eigen-system $\hat{Q} x = \gamma \hat{S} x$. For fair comparison, the randomly selected labelled instances are
TABLE 1
Dataset Description

Dataset               # Instances   # Attributes   # Classes
Image Segmentation    2,310         19             7
USPS                  9,298         256            10
Pen Digits            10,992        16             10
Letter Recognition    20,000        16             26
MNIST                 70,000        784            10
CoverType             581,012       54             7
1. http://archive.ics.uci.edu/ml
2. http://www.zjucadcg.cn/dengcai
3. http://yann.lecun.com/exdb/mnist
consistent for SL, SCACS, and FCSC. To show how the results vary with $c$, we use 10 different values ranging from 100 to 1,000 with a step size of 100; for each value, we carry out 20 trials with 20 different constraint sets. For SL, we use the $k$-nearest-neighbor method to compute similarity matrices ($k = 30$), and we use the radial basis function (RBF) to compute the similarities between pairwise instances. For FCSC, we directly use the RBF kernel function to compute similarity matrices, as done by the authors.
Fig. 1. Accuracy comparisons among different algorithms. Here $c$ denotes the number of labelled data.

We report the algorithm accuracies and the standard deviations in Figs. 1a, 1b, 1c, and 1d. One can see that our algorithm outperforms LSC-R over all the datasets by large margins, indicating that our approach to encoding side information is appropriate. In addition, FCSC significantly outperforms SL, indicating that the constrained normalized-cuts framework performs much better at encoding side information. Over the four small datasets, our algorithm outperforms SL on average and is close to FCSC.
5.3 Comparisons with Efficient Algorithms
In this part, we compare our algorithm with three efficient algorithms (LSC-R, CKM, and ML). Among them, "CKM" refers to the k-means based constrained clustering algorithm [2]. "ML" refers to the side-information-based metric learning method [4]; we use its efficient version, which learns a diagonal matrix. The datasets that we used are listed in Table 1.
For fair comparison, we use the same base-vector selections for LSC-R and SCACS, and we use the same random selections of constrained instances for CKM, ML, and SCACS. We carry out 20 trials with 20 different constrained instance sets, and report the average clustering accuracies and the standard deviations. The parameters that we used are $p = 500$, $\beta_0 = 0.1$, and $r = 3$.
The results are illustrated in Figs. 1e, 1f, 1g, 1h, 1i, and 1j. Over five datasets (Image Segmentation, USPS, Pen Digits, Letter Recognition, and MNIST), our algorithm significantly outperforms the other three algorithms; over the remaining CoverType dataset, our algorithm performs better than LSC-R and CKM. Note that ML performs well on low-dimensional datasets such as Letter Recognition and CoverType, with decreased performance on high-dimensional datasets such as USPS and MNIST. In contrast, our algorithm performs much better on high-dimensional datasets.
The running times are recorded in Table 2. The results show that SCACS is much faster than FCSC. For the Pen Digits dataset, FCSC demands more than 11 hours, whereas our algorithm demands only 3.22 seconds. For the two largest datasets, MNIST and CoverType, FCSC could not return results within a week, whereas our algorithm demands only 12.02 and 64.47 seconds, respectively.

TABLE 2
Running Time (seconds), c = 100, p = 500

Dataset      LSC-R    CKM      ML       SCACS    FCSC
ImageSeg     0.41     0.01     3.46     2.35     153
USPS         0.96     0.82     21.05    3.29     14,461
PenDigits    0.82     0.31     5.42     3.22     39,948
LetterRec    4.12     4.10     15.17    6.59     212,600
MNIST        9.32     56.70    492.9    12.02    -
CoverType    49.12    33.31    190.58   64.47    -
5.4 Influence of Parameters
Below we show how the parameters ($r$, $p$, $\beta_0$) influence SCACS. We use two datasets, Pen Digits and MNIST, of lower and higher dimensionality, respectively. To measure the influence of $r$, we fix $c = 500$ and vary $r$ from 3 to 30 with a step size of 1. To measure the influence of $p$, we fix $c = 500$ and vary $p$ from 100 to 1,000 with a step size of 100. To measure the influence of $\beta_0$, we vary $\beta_0$ from 0.1 to 0.9 with a step size of 0.1.
Fig. 2. The influence of the parameters ($r$, $p$, $\beta_0$) on the proposed algorithm.

Fig. 2a shows the influence of $r$. The accuracies of SCACS tend to decrease as $r$ increases, indicating that sparseness plays a key role in our algorithm. This suggests that a relatively small $r$ is preferable. The accuracy decreases faster over the MNIST dataset than over the Pen Digits dataset, which might indicate that sparseness is more important for grouping high-dimensional datasets than for grouping low-dimensional ones. Fig. 2b shows that the accuracies of SCACS vary mildly with $p$; in particular, for $p$ ranging from 500 to 1,000, our algorithm is notably robust. This indicates that it is easy to set a usable value for $p$. Figs. 2c and 2d show how the clustering accuracies vary with $\beta_0$. One may notice that the clustering accuracies drop severely if the constrained instances are very few (e.g., $c = 100$) but $\beta_0$ takes a larger value (e.g., $\beta_0 = 0.8$). Conversely, if constrained instances are abundant, a larger $\beta_0$ might yield a significant improvement in accuracy; e.g., for the Pen Digits dataset, the clustering accuracies improve significantly as $\beta_0$ increases when $c = 5{,}000$.
6 CONCLUSION AND FUTURE WORK
We have developed a new $k$-way scalable constrained spectral clustering algorithm based on a closed-form integration of the constrained normalized cuts and the sparse coding based graph construction. Experimental results show that (1) with less side information, our algorithm obtains significant improvements in accuracy compared to the unsupervised baseline; (2) with less computational time, our algorithm achieves high clustering accuracies close to those of the state-of-the-art; (3) its input parameters are easy to select; and (4) it performs well in grouping high-dimensional image data. In the future, we will consider active selection of pairwise instances for labelling; we will also apply our algorithm to grouping urban transportation big data, which might significantly boost sensor placement optimization.
ACKNOWLEDGMENTS
This work was supported in part by the following funds: National High Technology Research and Development Program of China (2011AA010101), National Natural Science Foundation of China (61002009 and 61304188), Key Science and Technology Program of Zhejiang Province of China (2012C01035-1), Zhejiang Provincial Natural Science Foundation of China (LZ13F020004 and LR14F020003), and the Postdoctoral Fund of China. The authors would like to thank Prof. Xiaofei He for beneficial suggestions. Yingjie Xia is the corresponding author.
REFERENCES
[1] K. Wagstaff and C. Cardie, “Clustering with instance-level constraints,” in
Proc. 17th Int. Conf. Mach. Learn., 2000, pp. 1103–1110.
[2] K. Wagstaff, C. Cardie, S. Rogers, and S. Schroedl, “Constrained k-means
clustering with background knowledge,” in Proc. 18th Int. Conf. Mach.
Learn., 2001, pp. 577–584.
[3] S. Basu, A. Banerjee, and R. Mooney, “Semi-supervised clustering by
seeding,” in Proc. 19th Int. Conf. Mach. Learn., 2002, pp. 27–34.
[4] E. P. Xing, A. Y. Ng, M. I. Jordan, and S. J. Russell, “Distance metric learn-
ing with application to clustering with side-information,” in Proc. Adv. Neu-
ral Inf. Process. Syst., 2003, pp. 505–512.
[5] S. Basu, M. Bilenko, and R. J. Mooney, "A probabilistic framework for semi-supervised clustering," in Proc. ACM SIGKDD Int. Conf. Knowl. Discov. Data Mining, 2004, pp. 59–68.
[6] N. Shental, A. Bar-Hillel, T. Hertz, and D. Weinshall, "Computing Gaussian mixture models with EM using equivalence constraints," in Proc. Adv. Neural Inf. Process. Syst. 16, 2003, pp. 505–512.
[7] B. Kulis, S. Basu, I. Dhillon, and R. Mooney, “Semi-supervised graph clus-
tering: A kernel approach,” in Proc. 22nd Int. Conf. Mach. Learn., 2005,
pp. 457–464.
[8] Y.-M. Cheung and H. Zeng, “Semi-supervised maximum margin clustering
with pairwise constraints,” IEEE Trans. Knowl. Data Eng., vol. 24, no. 5,
pp. 926–939, May 2012.
[9] S. X. Yu and J. B. Shi, “Grouping with bias,” in Proc. Adv. Neural Inf. Process.
Syst., 2001, pp. 1327–1334.
[10] S. D. Kamvar, D. Klein, and C. D. Manning, “Spectral learning,” in Proc. Int.
Joint Conf. Artif. Intell., 2003, pp. 561–566.
[11] X. Ji and W. Xu, “Document clustering with prior knowledge,” in Proc.
29th Annu. Int. Conf. Res. Develop. Inf. Retrieval, 2006, pp. 405–412.
[12] Z. Lu and M. A. Carreira-Perpinan, “Constrained spectral clustering
through affinity propagation,” in Proc. IEEE Conf. Comput. Vis. Pattern Rec-
ognit., 2008, pp. 1–8.
[13] Z. Li, J. Liu, and X. Tang, “Constrained clustering via spectral regu-
larization,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2009, pp. 421–
428.
[14] T. Coleman, J. Saunderson, and A. Wirth, “Spectral clustering with incon-
sistent advice,” in Proc. 25th Int. Conf. Mach. Learn., 2008, pp. 152–159.
[15] X. Wang and I. Davidson, “Flexible constrained spectral clustering,” in
Proc. 16th ACM SIGKDD Int. Conf. Knowl. Discov. Data Mining, 2010,
pp. 563–572.
[16] X. Wang, B. Qian, and I. Davidson, “On constrained spectral clustering and
its applications,” Data Mining Knowl. Discov., vol. 28, no. 1, pp. 1–30, 2012.
[17] M. Schultz and T. Joachims, “Learning a distance metric from relative
comparisons,” in Proc. Adv. Neural Inf. Process. Syst., 2003, pp. 41–48.
[18] J. B. Shi and J. Malik, “Normalized cuts and image segmentation,” IEEE
Trans. Pattern Anal. Mach. Intell., vol. 22, no. 8, pp. 888–905, Aug. 2000.
[19] A. Y. Ng, M. I. Jordan, and Y. Weiss, “On spectral clustering: Analysis and
an algorithm,” in Proc. Adv. Neural Inf. Process. Syst. 14, 2001, pp. 849–856.
[20] C. Fowlkes, S. Belongie, F. R. K. Chung, and J. Malik, "Spectral grouping using the Nyström method," IEEE Trans. Pattern Anal. Mach. Intell., vol. 26, no. 2, pp. 214–225, Feb. 2004.
[21] U. von Luxburg, "A tutorial on spectral clustering," Statist. Comput., vol. 17, no. 4, pp. 395–416, 2007.
[22] Y. Yang, Y. Yang, H. Shen, Y. Zhang, X. Du, and X. Zhou,
“Discriminative nonnegative spectral clustering with out-of-sample
extension,” IEEE Trans. Knowl. Data Eng., vol. 25, no. 8, pp. 1760–1771,
Aug. 2013.
[23] X. Chen and D. Cai, “Large scale spectral clustering with landmark-based
representation,” in Proc. 25th Conf. Artif. Intell., 2011, p. 1.
[24] S. I. Daitch, J. A. Kelner, and D. A. Spielman, “Fitting a graph to vector
data,” in Proc. 26th Annu. Int. Conf. Mach. Learn., 2009, p. 26.
[25] T. Jebara, J. Wang, and S. Chang, "Graph construction and b-matching for semi-supervised learning," in Proc. Int. Conf. Mach. Learn., 2009, p. 56.
[26] B. Cheng, J. Yang, S. Yan, Y. Fu, and T. S. Huang, "Learning with l1-graph for image analysis," IEEE Trans. Image Process., vol. 19, no. 4, pp. 858–866, Apr. 2010.
[27] W. Liu, J. He, and S. Chang, “Large graph construction for scalable semi-
supervised learning,” in Proc. Int. Conf. Mach. Learn., 2010, pp. 679–686.
[28] B. A. Olshausen, “Emergence of simple-cell receptive field properties by
learning a sparse code for natural images,” Nature, vol. 381, pp. 607–609,
1996.
[29] H. Lee, A. Battle, R. Raina, and A. Y. Ng, "Efficient sparse coding algorithms," in Proc. Adv. Neural Inf. Process. Syst. 19, 2006, pp. 801–808.
[30] H. W. Kuhn and A. W. Tucker, “Nonlinear programming,” in Proc. 2nd Ber-
keley Symp., 1951, pp. 481–492.
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy Consulting
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communication
 
IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...
IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...
IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdf
 
9548086042 for call girls in Indira Nagar with room service
9548086042  for call girls in Indira Nagar  with room service9548086042  for call girls in Indira Nagar  with room service
9548086042 for call girls in Indira Nagar with room service
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global Impact
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdf
 
Student login on Anyboli platform.helpin
Student login on Anyboli platform.helpinStudent login on Anyboli platform.helpin
Student login on Anyboli platform.helpin
 

area, which is derived by integrating two recent studies: the constrained normalized cuts [15], [16] and the graph construction method based on sparse coding [23]. The integration is, however, by no means straightforward. In the remainder of this paper, we answer three questions: how do we achieve an effective integration? How do we derive the SCACS algorithm in a principled way? And how well does the proposed algorithm perform?

The rest of the paper is structured as follows. In Section 2, we revisit the constrained normalized-cuts problem and the method of graph construction based on sparse coding. In Section 3, we formulate the problem to be solved and present a closed-form mathematical analysis. In Section 4, we propose our algorithm based on the results of Section 3. In Section 5, we evaluate the proposed algorithm over benchmark datasets and discuss the results. In Section 6, we draw conclusions and mention future work.

For notation, vectors and matrices are denoted by bold lower-case and upper-case letters, respectively. Sets are denoted by italic upper-case letters. Scalars are denoted by italic lower-case letters.

2 BACKGROUND

In this section, we revisit two pioneering studies, namely the constrained normalized cuts [15], [16] and the sparse coding based graph construction method [23].

2.1 Constrained Normalized Cuts

Given a vector dataset $X = \{x_i\}_{i=1}^{n}$ with $x_i \in \mathbb{R}^d$, and a constraint set $\{C_{=}, C_{\neq}\}$ where $(x_i, x_j) \in C_{=}$ if the patterns of $x_i$ and $x_j$ are similar and $(x_i, x_j) \in C_{\neq}$ otherwise, the aim is to partition $X$ into $k$ clusters biased by the constraint set.

Let $W$ be the similarity matrix over $X$, where $W_{ij}$ represents the similarity between instances $x_i$ and $x_j$. Let $D$ be the degree matrix over $X$, a diagonal matrix with elements $D_{ii} = \sum_j W_{ij}$. Let $L = I - D^{-1/2} W D^{-1/2}$ be the normalized graph Laplacian, where $I$ denotes the identity matrix. Let $Q$ denote the constraint matrix, where $Q_{ij} = 1$ expresses $(x_i, x_j) \in C_{=}$, $Q_{ij} = -1$ expresses $(x_i, x_j) \in C_{\neq}$, and $Q_{ij} = 0$ expresses no available side information. Let $\bar{Q} = D^{-1/2} Q D^{-1/2}$ be the normalized constraint matrix. The constrained normalized cuts [15], [16] can be written as

$$\arg\min_{\mathbf{v} \in \{+\frac{1}{\sqrt{n}}, -\frac{1}{\sqrt{n}}\}^{n}} \mathbf{v}^T L \mathbf{v} \quad \text{s.t.} \quad \mathbf{v}^T \bar{Q} \mathbf{v} \ge \alpha, \;\; \mathbf{v}^T \mathbf{v} = 1, \;\; \mathbf{v} \perp D^{1/2} \mathbf{1}. \qquad (1)$$

Here the parameter $\alpha$ controls to what degree the input side information is respected. If the constraint $\mathbf{v}^T \bar{Q} \mathbf{v} \ge \alpha$ is removed, Eq. (1) degenerates to the standard normalized cuts [18]. The problem is NP-hard, and the feasible way is to allow $\mathbf{v}$ to take any real values [21], which is called a relaxed method in the literature. The relaxed solution vector can be obtained by generalized eigenvalue decomposition [15], [16]. However, this still requires $O(n^2)$ memory cost and $O(n^3)$ time cost.
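To make these definitions concrete, the following numpy sketch assembles $L$ and $\bar{Q}$ from a data matrix and a constraint matrix. It is an illustration under our own assumptions (an RBF similarity and the hypothetical helper name `normalized_laplacian_and_constraints`), not the authors' code; it deliberately exhibits the $O(n^2)$ memory footprint that Section 3 is designed to avoid.

```python
import numpy as np

def normalized_laplacian_and_constraints(X, Q, sigma=1.0):
    """Assemble L = I - D^{-1/2} W D^{-1/2} and Qbar = D^{-1/2} Q D^{-1/2}.

    X is a d-by-n data matrix; Q is the n-by-n constraint matrix with
    entries in {1, -1, 0}. Everything here is dense, i.e. O(n^2) memory,
    and the downstream eigendecomposition would cost O(n^3) time.
    """
    n = X.shape[1]
    # RBF similarity matrix W from pairwise squared Euclidean distances.
    sq = np.sum(X ** 2, axis=0)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X.T @ X), 0.0)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    # D^{-1/2} applied on both sides, kept as a vector for efficiency.
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    L = np.eye(n) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    Qbar = d_inv_sqrt[:, None] * Q * d_inv_sqrt[None, :]
    return L, Qbar
```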
2.2 Sparse Coding Based Graph Construction

Graph construction amounts to computing a similarity matrix, and a variety of methods exist [18], [21], [23], [24], [25], [26], [27]. Among them, the method in [23] is based on sparse coding theory [28] and is designed for handling large datasets.

Here we briefly introduce the sparse coding based graph construction. Given a dataset $X$, in the form of a $d \times n$ matrix $\mathbf{X}$, sparse coding aims to find a pair of matrices $\mathbf{U} \in \mathbb{R}^{d \times p}$ and $\mathbf{Z} \in \mathbb{R}^{p \times n}$ such that $\mathbf{U}\mathbf{Z}$ best approximates $\mathbf{X}$, where the columns of $\mathbf{U}$ represent the desired base vectors and the columns of $\mathbf{Z}$ represent sparse coefficient vectors, each with few non-zero components. The cost function to be minimized is

$$f(\mathbf{U}, \mathbf{Z}) = \|\mathbf{X} - \mathbf{U}\mathbf{Z}\|_F^2. \qquad (2)$$

Unfortunately, it is costly to solve precisely for $\mathbf{U}^{*}$ and $\mathbf{Z}^{*}$ [29]. In practice, an approximate and efficient way [23] is to randomly choose $p$ instances from the input dataset to act as base vectors, and then to estimate each column vector of $\mathbf{Z}$ according to

$$Z_{ij} = \frac{K_{\sigma}(x_j, u_i)}{\sum_{i' \in rNB(j)} K_{\sigma}(x_j, u_{i'})}, \qquad (3)$$

where $\sigma$ denotes the bandwidth of the Gaussian kernel $K_{\sigma}(\cdot,\cdot)$ and $i \in rNB(j)$ means that the base vector $u_i$ is among the $r$ nearest base (rNB) vectors of instance $x_j$; $Z_{ij} = 0$ for $i \notin rNB(j)$. The sparseness of the matrix $\mathbf{Z}$ is thus controlled by the parameter $r$.

Having obtained $\mathbf{Z}$, one can construct two forms of graph matrices, $G = \mathbf{Z}^T \mathbf{Z}$ and $S = \mathbf{Z}\mathbf{Z}^T$. Choosing $\hat{\mathbf{Z}} = D^{-1/2}\mathbf{Z}$, where $D_{ii} = \sum_j Z_{ij}$, we obtain the normalized graph matrices $\hat{G} = \hat{\mathbf{Z}}^T \hat{\mathbf{Z}} \in \mathbb{R}^{n \times n}$ and $\hat{S} = \hat{\mathbf{Z}}\hat{\mathbf{Z}}^T \in \mathbb{R}^{p \times p}$. It is easy to check that the normalized graph Laplacian over $X$ is $(I - \hat{G})$. The time complexity for computing $\hat{G}$ is $O(pn^2)$, whereas that for computing $\hat{S}$ is only $O(np^2)$ ($p \ll n$).
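A minimal sketch of this construction might look as follows; the function name, the bandwidth heuristic, and the guard against unused base vectors are our own assumptions rather than details fixed by [23].

```python
import numpy as np

def sparse_representation(X, p=500, r=3, sigma=None, rng=None):
    """Landmark-style sparse coding, Eq. (3): pick p random base vectors,
    give each instance r kernel-weighted non-zeros, then row-normalize.

    X is d-by-n with p <= n. Returns Zhat = D^{-1/2} Z (p-by-n).
    """
    rng = np.random.default_rng(rng)
    d, n = X.shape
    U = X[:, rng.choice(n, size=p, replace=False)]   # random base vectors
    # p-by-n squared Euclidean distances between bases and instances.
    d2 = np.maximum(np.sum(U ** 2, axis=0)[:, None]
                    + np.sum(X ** 2, axis=0)[None, :] - 2.0 * (U.T @ X), 0.0)
    if sigma is None:
        sigma = np.sqrt(d2.mean())                   # crude bandwidth heuristic
    Z = np.zeros((p, n))
    nearest = np.argsort(d2, axis=0)[:r, :]          # r nearest bases per x_j
    for j in range(n):
        w = np.exp(-d2[nearest[:, j], j] / (2.0 * sigma ** 2))
        Z[nearest[:, j], j] = w / w.sum()            # Eq. (3)
    # Zhat = D^{-1/2} Z with D_ii = sum_j Z_ij; guard unused base vectors.
    row_sums = Z.sum(axis=1, keepdims=True)
    return Z / np.sqrt(np.maximum(row_sums, 1e-12))

# The p-by-p matrix Shat = Zhat @ Zhat.T is cheap to form; the n-by-n
# matrix Ghat never needs to be materialized.
```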
3 PROBLEM FORMULATION

In this section, we formulate a scalable constrained normalized-cuts problem and demonstrate how to solve it.

3.1 The Scalable Constrained Normalized Cuts

In the following, Problem 1 refers to a straightforward integration of the constrained normalized cuts and the sparse coding based graph construction, and Problem 2 refers to the formulated scalable constrained normalized-cuts problem.

Problem 1. Let $\hat{G}$ be the similarity matrix over $X$. The relaxed constrained normalized-cuts problem can be formulated as

$$\min_{\mathbf{v} \in \mathbb{R}^n} \mathbf{v}^T L \mathbf{v} \quad \text{s.t.} \quad \mathbf{v}^T \bar{Q} \mathbf{v} \ge \alpha, \;\; \mathbf{v}^T \mathbf{v} = 1, \;\; \mathbf{v} \perp \mathbf{1}, \qquad (4)$$

where $L = I - \hat{G}$ is the normalized graph Laplacian.

Based on [15], the standard approach for finding the solution vector of Problem 1 reduces to the generalized eigenvalue problem

$$L\mathbf{v} = \lambda(\bar{Q} - \beta I)\mathbf{v}, \qquad (5)$$

where $\beta$ is a lower bound of $\alpha$. The time cost for solving this problem is $O(n^3)$, infeasible for handling large datasets.

To seek a scalable solution, we write $\mathbf{v} \in \mathbb{R}^n$ as $\hat{\mathbf{Z}}^T \mathbf{u}$, where $\mathbf{u} \in \mathbb{R}^p$. Plugging $\mathbf{v} = \hat{\mathbf{Z}}^T \mathbf{u}$ into Eq. (4), we reformulate Problem 1 as the following Problem 2.

Problem 2.

$$\min_{\mathbf{u} \in \mathbb{R}^p} \mathbf{u}^T A \mathbf{u} \quad \text{s.t.} \quad \mathbf{u}^T \hat{Q} \mathbf{u} \ge \alpha, \;\; \mathbf{u}^T \hat{S} \mathbf{u} = 1, \;\; \mathbf{1}^T \hat{S} \mathbf{u} = 0, \qquad (6)$$

where

$$A = \hat{S} - \hat{S}\hat{S}, \qquad \hat{Q} = \hat{\mathbf{Z}} Q \hat{\mathbf{Z}}^T. \qquad (7)$$

This problem is mathematically equivalent to Problem 1, but it results in two significant changes: (1) the $n \times n$ normalized graph Laplacian $L$ is compressed into the $p \times p$ matrix $A$; (2) the $n \times n$ constraint matrix $Q$ is naturally compressed into the $p \times p$ matrix $\hat{Q}$. Consequently, the solution of Problem 1 can be efficiently recovered from the solution of Problem 2, considering $p \ll n$.
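The compression step itself is only a handful of matrix products. The sketch below, with the hypothetical helper name `compress`, illustrates Eq. (7) under the assumption that $Q$ is available as a dense array; storing $Q$ sparsely would reduce the cost further, since it has non-zeros only for constrained pairs.

```python
import numpy as np

def compress(Zhat, Q):
    """Form the p-by-p matrices of Problem 2, Eq. (7).

    Zhat is p-by-n; Q is the n-by-n constraint matrix. With Q stored as a
    scipy.sparse matrix, Zhat @ Q costs O(p * nnz(Q)) instead of O(p * n^2).
    """
    Shat = Zhat @ Zhat.T          # p-by-p, O(n p^2)
    A = Shat - Shat @ Shat        # compressed normalized Laplacian
    Qhat = Zhat @ Q @ Zhat.T      # compressed constraint matrix
    return A, Shat, Qhat

# A solution u of Problem 2 recovers the relaxed indicator of Problem 1
# via v = Zhat.T @ u, an O(np) matrix-vector product.
```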
3.2 Solving Problem 2

To solve Problem 2, we use Lagrange multipliers and obtain

$$\mathcal{L}(\mathbf{u}, \lambda, \mu) = \mathbf{u}^T A \mathbf{u} - \lambda(\mathbf{u}^T \hat{Q} \mathbf{u} - \alpha) - \mu(\mathbf{u}^T \hat{S} \mathbf{u} - 1). \qquad (8)$$

By the KKT theorem [30], feasible solutions must satisfy the following conditions:

$$A\mathbf{u} - \lambda \hat{Q}\mathbf{u} - \mu \hat{S}\mathbf{u} = 0, \qquad (9)$$
$$\mathbf{u}^T \hat{Q} \mathbf{u} \ge \alpha, \qquad (10)$$
$$\mathbf{u}^T \hat{S} \mathbf{u} = 1, \qquad (11)$$
$$\mathbf{1}^T \hat{S} \mathbf{u} = 0, \qquad (12)$$
$$\lambda \ge 0, \qquad (13)$$
$$\lambda(\mathbf{u}^T \hat{Q} \mathbf{u} - \alpha) = 0. \qquad (14)$$

When $\lambda = 0$, Eq. (9) degenerates to the standard eigen-system $\hat{S}\mathbf{u} = (1 - \mu)\mathbf{u}$; that is, side information does not take effect. To use side information, we restrict $\lambda > 0$, which reduces Eq. (10) and Eq. (14) to

$$\mathbf{u}^T \hat{Q} \mathbf{u} = \alpha. \qquad (15)$$

Setting $-\mu/\lambda = \beta$, Eq. (9) can be reduced to

$$A\mathbf{u} = \lambda(\hat{Q} - \beta\hat{S})\mathbf{u}, \qquad (16)$$

which is a generalized eigenvalue problem that can be solved very efficiently since $p \ll n$.

To derive a scalable constrained spectral clustering (SCACS) algorithm, it is essential to discuss how to set the parameters $\beta$ and $\alpha$. Via a few algebraic manipulations, we find two facts that are useful for algorithm design. First, the matrix $A$ is positive semi-definite; hence we have

$$\mathbf{u}^T (\hat{Q} - \beta\hat{S})\mathbf{u} = \alpha - \beta \ge 0, \qquad (17)$$

meaning that the parameter $\alpha$ is lower-bounded by $\beta$. Thus, users need only specify $\beta$, regardless of $\alpha$. Second, to ensure that Eq. (16) has at least $i$ meaningful solution vectors in terms of non-negative normalized cuts, $\beta < \gamma_i$ is a sufficient condition, where $\gamma_i$ denotes the $i$th largest eigenvalue of the generalized eigen-system $\hat{Q}\mathbf{x} = \gamma\hat{S}\mathbf{x}$.
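Because all matrices involved are $p \times p$, Eq. (16) can be handled by an off-the-shelf dense solver. A minimal sketch, under our own naming: since $(\hat{Q} - \beta\hat{S})$ is symmetric but generally indefinite, we fall back on scipy's general generalized eigensolver and keep the numerically real eigenpairs.

```python
import numpy as np
from scipy.linalg import eig

def solve_eq16(A, Shat, Qhat, beta):
    """Solve A u = lambda (Qhat - beta * Shat) u, i.e. Eq. (16).

    All matrices are p-by-p, so a dense O(p^3) solver is affordable.
    The right-hand matrix is symmetric but generally indefinite, so we
    use the general solver and keep the numerically real eigenpairs.
    """
    B = Qhat - beta * Shat
    w, V = eig(A, B)                      # generalized eigenpairs
    keep = np.abs(w.imag) < 1e-8
    return w[keep].real, V[:, keep].real
```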
4 ALGORITHM AND ANALYSIS

Based on the mathematical analysis above, in this section we derive the algorithm and analyze its complexity.

4.1 The Proposed Algorithm

Problem 2 specifies a binary constrained spectral clustering problem in which the solution vector $\mathbf{v}^{*}$ plays the role of grouping indicator. Without loss of generality, here we directly derive an algorithm for $k$-class problems ($k \ge 2$). We call the proposed algorithm scalable constrained spectral clustering and list its 13 steps in Algorithm 1; an illustrative code sketch follows the complexity analysis below.

Algorithm 1. Scalable Constrained Spectral Clustering
1: Input: a dataset $X \in \mathbb{R}^{d \times n}$, the base-vector number $p$, the $n \times n$ constraint matrix $Q$, the parameter $\beta$, and the cluster number $k$.
2: Choose $p$ instances from the input dataset at random and stack them as the columns of the matrix $U \in \mathbb{R}^{d \times p}$.
3: Compute $Z \in \mathbb{R}^{p \times n}$ using Eq. (3), then compute $\hat{Z} = D^{-1/2} Z$, where $D$ denotes a diagonal matrix with elements $D_{ii} = \sum_j Z_{ij}$.
4: Compute $\hat{S} = \hat{Z}\hat{Z}^T$ and $\hat{Q} = \hat{Z} Q \hat{Z}^T$.
5: Find the largest eigenvalue $\gamma_{\max}$ of the generalized eigen-system $\hat{Q}\mathbf{x} = \gamma\hat{S}\mathbf{x}$.
6: If $\beta \ge \gamma_{\max}$, return $\{\mathbf{v}^{*}\} = \emptyset$; otherwise, find all the eigenvectors $\{\mathbf{u}_i\}$ by solving the generalized eigen-system of Eq. (16), where $\mathbf{u}_i$ denotes the $i$th eigenvector, $1 \le i \le p$.
7: Find among $\{\mathbf{u}_i\}$ the eigenvectors $\{\mathbf{u}_i\}^{+}$ associated with positive eigenvalues.
8: Normalize each $\mathbf{u}_i \in \{\mathbf{u}_i\}^{+}$ by multiplying it by the factor $\sqrt{1 / (\mathbf{u}_i^T \hat{S} \mathbf{u}_i)}$.
9: Remove from $\{\mathbf{u}_i\}^{+}$ the eigenvectors that are not orthogonal to the vector $\mathbf{1}^T \hat{S}$.
10: Find among $\{\mathbf{u}_i\}^{+}$ the $m$ eigenvectors that yield the smallest values of $\mathbf{u}_i^T A \mathbf{u}_i$, where $m = \min\{k-1, |\{\mathbf{u}_i\}^{+}|\}$, and stack them as the columns of the matrix $V$.
11: Compute $V^{(r)} = \hat{Z}^T V (I - V^T A V)$.
12: Normalize the rows of $V^{(r)}$ to unit length, then feed it to the k-means algorithm.
13: Output: the grouping indicator.

Several key steps are interpreted as follows: (1) Step 7 aims to satisfy the condition $\lambda > 0$; (2) Step 8 scales each eigenvector to satisfy the condition of Eq. (11); (3) Step 9 aims to satisfy the condition of Eq. (12); (4) in Step 11, we recover the solution vectors by the linear transformation $\hat{Z}^T \mathbf{u}$ and weight each solution vector by one minus the associated value of the objective function.

It is worth mentioning that the input parameter $\beta$ is tunable, making the algorithm flexible to noisy side information or inappropriate mathematical expressions of side information. Usually, the larger $\beta$ is, the more the side information is respected.

4.2 Complexity Analysis

The time cost for computing the matrix $\hat{S}$ is in general $O(np^2)$; that for computing the generalized eigenvalue decomposition in Step 5 or Step 6 is $O(p^3)$; and that for computing $\hat{Z} Q \hat{Z}^T$ is $O(kp^2 + kpn)$. Hence the overall time complexity of our algorithm is

$$O(kpn + kp^2 + np^2 + p^3). \qquad (18)$$

In practical applications, $p$ can be chosen far smaller than $n$, so the running time grows almost linearly with $n$. The memory costs for storing the matrices $\hat{Z}$ and $\hat{S}$ are $O(np)$ and $O(p^2)$, respectively. The overall memory complexity is

$$O(np + p^2). \qquad (19)$$
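Putting the pieces together, the sketch below chains the illustrative helpers introduced in Sections 2 and 3 (`sparse_representation`, `compress`, and `solve_eq16`, all hypothetical names of ours) into an end-to-end reading of Algorithm 1. It is a minimal interpretation of the thirteen steps, not the authors' implementation; the numerical tolerances and the k-means initialization are arbitrary choices.

```python
import numpy as np
from scipy.cluster.vq import kmeans2
from scipy.linalg import eigvals

def scacs(X, Q, p=500, r=3, beta=0.1, k=2, rng=None):
    """Illustrative end-to-end reading of Algorithm 1 (steps 2-12)."""
    Zhat = sparse_representation(X, p=p, r=r, rng=rng)       # steps 2-3
    A, Shat, Qhat = compress(Zhat, Q)                        # step 4
    # Step 5: largest eigenvalue of Qhat x = gamma Shat x.
    gmax = eigvals(Qhat, Shat).real.max()
    if beta >= gmax:                                         # step 6, infeasible
        return None
    lam, U = solve_eq16(A, Shat, Qhat, beta)                 # step 6
    U = U[:, lam > 0]                                        # step 7
    # Step 8: scale each u so that u^T Shat u = 1.
    norms = np.einsum('ij,jk,ki->i', U.T, Shat, U)
    U = U / np.sqrt(np.maximum(norms, 1e-12))[None, :]
    # Step 9: keep only eigenvectors orthogonal to 1^T Shat.
    U = U[:, np.abs(Shat.sum(axis=0) @ U) < 1e-6]
    # Step 10: the m vectors with smallest objective u^T A u.
    costs = np.einsum('ij,jk,ki->i', U.T, A, U)
    m = min(k - 1, U.shape[1])
    V = U[:, np.argsort(costs)[:m]]
    # Step 11: recover and weight the solution vectors.
    Vr = Zhat.T @ V @ (np.eye(m) - V.T @ A @ V)
    # Step 12: row-normalize and cluster with k-means.
    Vr = Vr / np.maximum(np.linalg.norm(Vr, axis=1, keepdims=True), 1e-12)
    _, labels = kmeans2(Vr, k, minit='points')
    return labels                                            # grouping indicator
```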
5 EXPERIMENT

In this section, we carry out experiments to assess the proposed SCACS algorithm.

5.1 Experiment Setup

We implement our algorithm and the compared algorithms on Linux machines; the machine used for recording computational time has a 3.10 GHz CPU and 8 GB of main memory.

The datasets we used were downloaded from the UCI machine learning repository (http://archive.ics.uci.edu/ml) and two other web sites (http://www.zjucadcg.cn/dengcai and http://yann.lecun.com/exdb/mnist). Basic information is listed in Table 1.

TABLE 1
Dataset Description

Dataset              # Instances   # Attributes   # Classes
Image Segmentation   2,310         19             7
USPS                 9,298         256            10
Pen Digits           10,992        16             10
Letter Recognition   20,000        16             26
MNIST                70,000        784            10
CoverType            581,012       54             7

Side information is generated based on the ground-truth labels of the datasets. The following rules encode $Q$:

$$Q_{ij} = \begin{cases} 1 & \text{if } x_i \text{ and } x_j \text{ have consistent labels} \\ 1 & \text{if } i = j \\ -1 & \text{if } x_i \text{ and } x_j \text{ have different labels} \\ 0 & \text{no side information.} \end{cases} \qquad (20)$$

In our experiments, we randomly sample $c$ labelled instances from a given input dataset and then obtain $Q$ based on the rules of Eq. (20).

Clustering accuracy is evaluated by the best matching rate (ACC). Let $\mathbf{h}$ be the resulting label vector obtained from a clustering algorithm, and let $\mathbf{g}$ be the ground-truth label vector. Then the best matching rate is defined as

$$ACC = \frac{\sum_{i=1}^{n} \delta(g_i, \mathrm{map}(h_i))}{n}, \qquad (21)$$

where $\delta(a, b)$ denotes the delta function that returns 1 if $a = b$ and 0 otherwise, and $\mathrm{map}(h_i)$ is the permutation mapping function that maps each cluster label $h_i$ to the equivalent label from the data corpus.
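For reference, the encoding of Eq. (20) and the best matching rate of Eq. (21) can be sketched as follows; $\mathrm{map}(\cdot)$ is realized with the Hungarian algorithm via scipy, and integer labels in $\{0, \ldots, k-1\}$ are assumed. The function names are our own.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def build_Q(sample_labels, sample_idx, n):
    """Encode Eq. (20) from the c sampled ground-truth labels."""
    Q = np.zeros((n, n))
    for a, ia in zip(sample_labels, sample_idx):
        for b, ib in zip(sample_labels, sample_idx):
            Q[ia, ib] = 1.0 if a == b else -1.0
    np.fill_diagonal(Q, 1.0)            # Q_ij = 1 when i = j
    return Q

def acc(g, h):
    """Best matching rate, Eq. (21), with map(.) found by the Hungarian
    algorithm; g and h are integer label vectors in {0, ..., k-1}."""
    k = int(max(g.max(), h.max())) + 1
    counts = np.zeros((k, k), dtype=int)
    for gi, hi in zip(g, h):
        counts[hi, gi] += 1             # contingency table
    rows, cols = linear_sum_assignment(-counts)   # maximize agreements
    mapping = dict(zip(rows, cols))
    return np.mean([mapping[hi] == gi for gi, hi in zip(g, h)])
```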
5.2 Comparisons with Benchmarks and FCSC

In this part, we compare our SCACS algorithm with three spectral algorithms (LSC-R, SL and FCSC). Among them, LSC-R [23] is the unsupervised baseline that constructs similarity graphs based on sparse coding; SL [10] is the CSC baseline that encodes side information by directly modifying similarity matrices; and FCSC [16] is the state-of-the-art in terms of accuracy. FCSC is too slow to handle large datasets, hence in this experiment we used four small datasets:
1) the Image Segmentation dataset;
2) a subset of the USPS dataset: the first 300 instances of each class, 3,000 instances in total;
3) a subset of the Pen Digits dataset: the first 300 instances of each class, 3,000 instances in total;
4) a subset of the Letter Recognition dataset: the first five classes of instances, 3,864 instances in total.

We set the Gaussian kernel width $\sigma$ to the average Euclidean distance between all instances and base vectors. We use $p = 500$, $r = 3$ and $\beta = \beta_0 \gamma_{k-1}$, where $\beta_0 = 0.5 + 0.4 \times c/n$ and $\gamma_{k-1}$ denotes the $(k-1)$th largest eigenvalue of the generalized eigen-system $\hat{Q}\mathbf{x} = \gamma\hat{S}\mathbf{x}$. For fair comparison, the randomly selected labelled instances are consistent across SL, SCACS and FCSC. To show how the results vary with $c$, we use 10 different values ranging from 100 to 1,000 with an increasing step size of 100; for each value, we carry out 20 trials with 20 different constraint sets. For SL, we use the k-nearest-neighbor method to compute similarity matrices ($k = 30$) and radial basis functions (RBF) to compute the similarities between pairwise instances. For FCSC, we directly use the RBF kernel function to compute similarity matrices, as its authors did.

We report the algorithm accuracies and standard deviations in Figs. 1a, 1b, 1c, and 1d (Fig. 1 compares accuracies among the different algorithms, where $c$ denotes the number of labelled data). One can see that our algorithm outperforms LSC-R over all the datasets by large margins, indicating that our way of encoding side information is appropriate. In addition, FCSC significantly outperforms SL, indicating that the constrained normalized-cuts framework performs much better at encoding side information. Over the four small datasets, our algorithm outperforms SL on average and is close to FCSC.

5.3 Comparisons with Efficient Algorithms

In this part, we compare our algorithm with three efficient algorithms (LSC-R, CKM, and ML). Among them, CKM refers to the k-means based constrained clustering algorithm [2]; ML refers to the side-information based metric learning method [4], for which we use the efficient version that learns a diagonal matrix. The datasets we used are listed in Table 1.

For fair comparison, we use the same base-vector selections for LSC-R and SCACS, and the same random selections of constrained instances for CKM, ML and SCACS. We carry out 20 trials with 20 different constrained instance sets, and report the average clustering accuracies and standard deviations. The parameters we used are $p = 500$, $\beta_0 = 0.1$ and $r = 3$.

The results are illustrated in Figs. 1e, 1f, 1g, 1h, 1i, and 1j. Over five datasets (Image Segmentation, USPS, Pen Digits, Letter Recognition and MNIST), our algorithm significantly outperforms the other three algorithms; over the remaining CoverType dataset, our algorithm performs better than LSC-R and CKM. Note that ML performs well on low-dimensional datasets such as Letter Recognition and CoverType, with decreased performance on high-dimensional datasets such as USPS and MNIST. In contrast, our algorithm performs much better on high-dimensional datasets.

The running time is recorded in Table 2. The results show that SCACS is much faster than FCSC. For the Pen Digits dataset, FCSC demands more than 11 hours, whereas our algorithm demands only 3.22 seconds. For the two largest datasets, MNIST and CoverType, FCSC could not return results within a week, whereas our algorithm demands only 12.02 and 64.47 seconds, respectively.

TABLE 2
Running Time (seconds), c = 100, p = 500

Dataset     LSC-R   CKM     ML       SCACS   FCSC
ImageSeg    0.41    0.01    3.46     2.35    153
USPS        0.96    0.82    21.05    3.29    14,461
PenDigits   0.82    0.31    5.42     3.22    39,948
LetterRec   4.12    4.10    15.17    6.59    212,600
MNIST       9.32    56.70   492.9    12.02   -
CoverType   49.12   33.31   190.58   64.47   -
5.4 Influence of Parameters

Below we show how the parameters ($r$, $p$, $\beta_0$) influence SCACS (Fig. 2). We use two datasets, Pen Digits and MNIST, of lower and higher dimensionality, respectively. For the influence of $r$, we fix $c = 500$ and vary $r$ from 3 to 30 with a step size of 1. For the influence of $p$, we fix $c = 500$ and vary $p$ from 100 to 1,000 with a step size of 100. For the influence of $\beta_0$, we vary $\beta_0$ from 0.1 to 0.9 with a step size of 0.1.

Fig. 2a shows the influence of $r$. The accuracies of SCACS tend to decrease as $r$ increases, indicating that sparseness plays a key role in our algorithm and suggesting that a relatively small $r$ is preferable. The accuracy decreases faster over the MNIST dataset than over the Pen Digits dataset, which may indicate that sparseness matters more for grouping high-dimensional datasets than for grouping low-dimensional ones.

Fig. 2b shows that the accuracies of SCACS vary mildly with $p$; in particular, for $p$ ranging from 500 to 1,000, our algorithm is significantly robust. This suggests that it is easy to set a usable value for $p$.

Figs. 2c and 2d show how the clustering accuracies vary with $\beta_0$. One may notice that the clustering accuracies drop seriously if constrained instances are very few (e.g., $c = 100$) but $\beta_0$ takes a large value (e.g., $\beta_0 = 0.8$). Conversely, if constrained instances are abundant, a larger $\beta_0$ can yield a significant improvement in accuracy; e.g., for the Pen Digits dataset, the clustering accuracies improve significantly as $\beta_0$ increases when $c = 5{,}000$.
6 CONCLUSION AND FUTURE WORK

We have developed a new k-way scalable constrained spectral clustering algorithm based on a closed-form integration of the constrained normalized cuts and the sparse coding based graph construction. Experimental results show that (1) with less side information, our algorithm obtains significant improvements in accuracy compared to the unsupervised baseline; (2) with less computational time, our algorithm achieves clustering accuracies close to those of the state-of-the-art; (3) the input parameters are easy to select; and (4) our algorithm performs well in grouping high-dimensional image data. In the future, we will consider active selection of pairwise instances for labelling; we will also apply our algorithm to grouping urban transportation big data, which might significantly boost sensor placement optimization.

ACKNOWLEDGMENTS

This paper was supported in part by the following funds: National High Technology Research and Development Program of China (2011AA010101), National Natural Science Foundation of China (61002009 and 61304188), Key Science and Technology Program of Zhejiang Province of China (2012C01035-1), Zhejiang Provincial Natural Science Foundation of China (LZ13F020004 and LR14F020003), and the Postdoctoral Fund of China. The authors would like to thank Prof. Xiaofei He for beneficial suggestions. Yingjie Xia is the corresponding author.

REFERENCES

[1] K. Wagstaff and C. Cardie, "Clustering with instance-level constraints," in Proc. 17th Int. Conf. Mach. Learn., 2000, pp. 1103-1110.
[2] K. Wagstaff, C. Cardie, S. Rogers, and S. Schroedl, "Constrained k-means clustering with background knowledge," in Proc. 18th Int. Conf. Mach. Learn., 2001, pp. 577-584.
[3] S. Basu, A. Banerjee, and R. Mooney, "Semi-supervised clustering by seeding," in Proc. 19th Int. Conf. Mach. Learn., 2002, pp. 27-34.
[4] E. P. Xing, A. Y. Ng, M. I. Jordan, and S. J. Russell, "Distance metric learning with application to clustering with side-information," in Proc. Adv. Neural Inf. Process. Syst., 2003, pp. 505-512.
[5] S. Basu, M. Bilenko, and R. J. Mooney, "A probabilistic framework for semi-supervised clustering," in Proc. ACM SIGKDD Int. Conf. Knowl. Discov. Data Mining, 2004, pp. 59-68.
[6] N. Shental, A. Bar-Hillel, T. Hertz, and D. Weinshall, "Computing Gaussian mixture models with EM using equivalence constraints," in Proc. Adv. Neural Inf. Process. Syst. 16, 2003, pp. 505-512.
[7] B. Kulis, S. Basu, I. Dhillon, and R. Mooney, "Semi-supervised graph clustering: A kernel approach," in Proc. 22nd Int. Conf. Mach. Learn., 2005, pp. 457-464.
[8] Y.-M. Cheung and H. Zeng, "Semi-supervised maximum margin clustering with pairwise constraints," IEEE Trans. Knowl. Data Eng., vol. 24, no. 5, pp. 926-939, May 2012.
[9] S. X. Yu and J. B. Shi, "Grouping with bias," in Proc. Adv. Neural Inf. Process. Syst., 2001, pp. 1327-1334.
[10] S. D. Kamvar, D. Klein, and C. D. Manning, "Spectral learning," in Proc. Int. Joint Conf. Artif. Intell., 2003, pp. 561-566.
[11] X. Ji and W. Xu, "Document clustering with prior knowledge," in Proc. 29th Annu. Int. Conf. Res. Develop. Inf. Retrieval, 2006, pp. 405-412.
[12] Z. Lu and M. A. Carreira-Perpinan, "Constrained spectral clustering through affinity propagation," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2008, pp. 1-8.
[13] Z. Li, J. Liu, and X. Tang, "Constrained clustering via spectral regularization," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2009, pp. 421-428.
[14] T. Coleman, J. Saunderson, and A. Wirth, "Spectral clustering with inconsistent advice," in Proc. 25th Int. Conf. Mach. Learn., 2008, pp. 152-159.
[15] X. Wang and I. Davidson, "Flexible constrained spectral clustering," in Proc. 16th ACM SIGKDD Int. Conf. Knowl. Discov. Data Mining, 2010, pp. 563-572.
[16] X. Wang, B. Qian, and I. Davidson, "On constrained spectral clustering and its applications," Data Mining Knowl. Discov., vol. 28, no. 1, pp. 1-30, 2012.
Davidson, “On constrained spectral clustering and its applications,” Data Mining Knowl. Discov., vol. 28, no. 1, pp. 1–30, 2012. [17] M. Schultz and T. Joachims, “Learning a distance metric from relative comparisons,” in Proc. Adv. Neural Inf. Process. Syst., 2003, pp. 41–48. [18] J. B. Shi and J. Malik, “Normalized cuts and image segmentation,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 22, no. 8, pp. 888–905, Aug. 2000. [19] A. Y. Ng, M. I. Jordan, and Y. Weiss, “On spectral clustering: Analysis and an algorithm,” in Proc. Adv. Neural Inf. Process. Syst. 14, 2001, pp. 849–856. [20] C. Fowlkes, S. Belongie, F. R. K. Chung, and J. Malik, “Spectral grouping using the Nystr€om method,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 26, no. 2, pp. 214–225, Feb. 2004. [21] U. V. Luxburg, “A tutorial on spectral clustering,” Statist. Comput., vol. 17, no. 4, pp. 395–416, 2007. [22] Y. Yang, Y. Yang, H. Shen, Y. Zhang, X. Du, and X. Zhou, “Discriminative nonnegative spectral clustering with out-of-sample extension,” IEEE Trans. Knowl. Data Eng., vol. 25, no. 8, pp. 1760–1771, Aug. 2013. [23] X. Chen and D. Cai, “Large scale spectral clustering with landmark-based representation,” in Proc. 25th Conf. Artif. Intell., 2011, p. 1. [24] S. I. Daitch, J. A. Kelner, and D. A. Spielman, “Fitting a graph to vector data,” in Proc. 26th Annu. Int. Conf. Mach. Learn., 2009, p. 26. [25] T. Jebara J. Wang, and S. Chang, “Graph construction and b-matching for semi-supervised learning,” in Proc. Int. Conf. Mach. Learn., 2009, p. 56. [26] B. Cheng, J. Yang S. Yan Y. Fu, and T. S. Huang, “Learning with l1-graph for image analysis,” IEEE Trans. Image Process., vol. 19, no. 4, pp. 858–866, Apr. 2010. [27] W. Liu, J. He, and S. Chang, “Large graph construction for scalable semi- supervised learning,” in Proc. Int. Conf. Mach. Learn., 2010, pp. 679–686. [28] B. A. Olshausen, “Emergence of simple-cell receptive field properties by learning a sparse code for natural images,” Nature, vol. 381, pp. 607–609, 1996. [29] H. Lee A. Battle R. Raina, and A. Y. Ng, “Efficient sparse coding algo- rithms,” in Proc. Adv. Neural Inf. Process. Syst. 19, 2006, pp. 801–808. [30] H. W. Kuhn and A. W. Tucker, “Nonlinear programming,” in Proc. 2nd Ber- keley Symp., 1951, pp. 481–492. For more information on this or any other computing topic, please visit our Digital Library at www.computer.org/publications/dlib. Fig. 2. The influence of parameters (r,p,b0) to the proposed algorithm. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 27, NO. 2, FEBRUARY 2015 593