956 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 24, NO. 3, MARCH 2015
Multiview Alignment Hashing for
Efficient Image Search
Li Liu, Mengyang Yu, Student Member, IEEE, and Ling Shao, Senior Member, IEEE
Abstract—Hashing is a popular and efficient method for
nearest neighbor search in large-scale data spaces by embedding
high-dimensional feature descriptors into a similarity preserving
Hamming space with a low dimension. For most hashing methods,
the performance of retrieval heavily depends on the choice of
the high-dimensional feature descriptor. Furthermore, a single
type of feature cannot be descriptive enough for different
images when it is used for hashing. Thus, how to combine
multiple representations for learning effective hashing functions
is a pressing task. In this paper, we present a novel
unsupervised multiview alignment hashing approach based on
regularized kernel nonnegative matrix factorization, which can
find a compact representation uncovering the hidden semantics
and simultaneously respecting the joint probability distribution
of data. In particular, we aim to seek a matrix factorization
to effectively fuse the multiple information sources meanwhile
discarding the feature redundancy. Since the resulting problem is
nonconvex and discrete, our objective function is optimized via
alternate optimization with relaxation and converges
to a locally optimal solution. After finding the low-dimensional
representation, the hashing functions are finally obtained through
multivariable logistic regression. The proposed method is
systematically evaluated on three data sets: 1) Caltech-256;
2) CIFAR-10; and 3) CIFAR-20, and the results show that our
method significantly outperforms the state-of-the-art multiview
hashing techniques.
Index Terms—Hashing, multiview, NMF, alternate
optimization, logistic regression, image similarity search.
I. INTRODUCTION
LEARNING discriminative embedding has been a critical
problem in many fields of information processing and
analysis, such as object recognition [1], [2], image/video
retrieval [3], [4] and visual detection [5]. Among them,
scalable retrieval of similar visual information is attractive,
since with the advances of computer technologies and the
development of the World Wide Web, a huge amount of digital
data has been generated and applied. The most basic but
essential scheme for similarity search is nearest neighbor (NN)
search: given a query image, find the image most similar to it
within a large database and assign the label of that nearest
neighbor to the query image.
Manuscript received May 25, 2014; revised January 5, 2015; accepted
January 5, 2015. Date of publication January 12, 2015; date of current
version January 30, 2015. The associate editor coordinating the review of this
manuscript and approving it for publication was Prof. Jie Liang.
(Corresponding author: Ling Shao)
The authors are with the Department of Computer Science and Digital
Technologies, Northumbria University, Newcastle upon Tyne NE1 8ST,
U.K. (e-mail: li2.liu@northumbria.ac.uk; m.y.yu@ieee.org; ling.shao@
northumbria.ac.uk).
Color versions of one or more of the figures in this paper are available
online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TIP.2015.2390975
NN search is regarded as a linear search scheme (O(N)),
which is not scalable due to the large sample size in datasets
of practical applications. Later, to overcome this computational
complexity problem, tree-based search schemes were proposed to
partition the data space via various tree structures. Among them,
the KD-tree and R-tree [6] have been successfully applied to index
data for fast query responses.
However, these methods cannot operate with high-dimensional
data and do not guarantee faster search compared to the linear
scan. In fact, most vision-based tasks suffer from the curse of
dimensionality,¹ because visual descriptors usually have hundreds
or even thousands of dimensions.
Thus, some hashing schemes are proposed to effectively
embed data from a high-dimensional feature space into a
similarity-preserving low-dimensional Hamming space where
an approximate nearest neighbor of a given query can be found
with sub-linear time complexity.
One of the most well-known hashing techniques that
preserve similarity information is Locality-Sensitive
Hashing (LSH) [7]. LSH simply employs random linear
projections (followed by random thresholding) to map data
points close in a Euclidean space to similar codes. Spectral
Hashing (SpH) [8] is a representative unsupervised hashing
method, in which the Laplace-Beltrami eigenfunctions of
manifolds are used to determine binary codes. Moreover,
principled linear projections like PCA Hashing (PCAH) [9]
have been suggested for better quantization than
random-projection hashing. Besides, another popular hashing
approach, Anchor Graphs Hashing (AGH) [10], is proposed to
learn compact binary codes via tractable low-rank adjacency
matrices. AGH allows constant time hashing of a new
data point by extrapolating graph Laplacian eigenvectors to
eigenfunctions. More relevant hashing methods can be seen
in [11]–[13].
However, previous exploration of hashing methods has focused
mainly on single-view hashing. In their
architectures, only one type of feature descriptor is used
for learning hashing functions. In practice, to make a
more comprehensive description, objects/images are always
represented via several different kinds of features and each of
them has its own characteristics. Thus, it is desirable to incor-
porate these heterogenous feature descriptors into learning
hashing functions, leading to multi-view hashing approaches.
Multiview learning techniques [14]–[17] have been well
¹ The effectiveness and efficiency of these methods drop exponentially as
the dimensionality increases, which is commonly referred to as the curse of
dimensionality.
explored in the past few years and widely applied to visual
information fusion [18]–[20]. Recently, a number of multiview
hashing methods have been proposed for efficient
similarity search, such as Multi-View Anchor Graph
Hashing (MVAGH) [21], Sequential Update for Multi-
View Spectral Hashing (SU-MVSH) [22], Multi-View
Hashing (MVH-CS) [23], Composite Hashing with Multiple
Information Sources (CHMIS) [24] and Deep Multi-view
Hashing (DMVH) [25]. These methods mainly depend on
spectral, graph or deep learning techniques to achieve data
structure-preserving encoding. Nevertheless, hashing
purely with the above schemes is usually sensitive to data
noise and suffers from high computational complexity.
The above drawbacks of prior work motivate us to
propose a novel unsupervised multiview hashing approach,
termed Multiview Alignment Hashing (MAH), which can
effectively fuse multiple information sources and exploit the
discriminative low-dimensional embedding via Nonnegative
Matrix Factorization (NMF) [26]. NMF is a popular method in
data mining tasks including clustering, collaborative filtering,
outlier detection, etc. Unlike other embedding methods
with positive and negative values, NMF seeks to learn a
nonnegative parts-based representation that gives better visual
interpretation of factoring matrices for high-dimensional data.
Therefore, in many cases, NMF may be more suitable for sub-
space learning tasks, because it provides a non-global basis set
which intuitively contains the localized parts of objects [26].
In addition, since the flexibility of matrix factorization can
handle widely varying data distributions, NMF enables more
robust subspace learning. More importantly, NMF decomposes
an original matrix into a part-based representation that gives
better interpretation of factoring matrices for non-negative
data. When applying NMF to multiview fusion tasks,
a part-based representation can reduce the corruption between
any two views and gain more discriminative codes.
To the best of our knowledge, this is the first work using
NMF to combine multiple views for image hashing. It is
worthwhile to highlight several contributions of the proposed
method:
• MAH can find a compact representation uncovering the
hidden semantics from different view aspects and
simultaneously respecting the joint probability
distribution of data.
• To solve our nonconvex objective function, a new
alternate optimization has been proposed to get the final
solution.
• We utilize multivariable logistic regression to generate the
hashing function and achieve the out-of-sample extension.
The rest of this paper is organized as follows. In Section II,
we give a brief review of NMF. The details of our method are
described in Section III. Section IV reports the experimental
results. Finally, we conclude this paper in Section V.
II. A BRIEF REVIEW OF NMF
In this section, we mainly review some related algorithms,
focusing on Nonnegative Matrix Factorization (NMF) and
its variants. NMF is proposed to learn the nonnegative
parts of objects. Given a nonnegative data matrix
$X = [x_1, \ldots, x_N] \in \mathbb{R}_{\ge 0}^{D \times N}$, each column of which is a
sample, NMF aims to find two nonnegative matrices
$U \in \mathbb{R}_{\ge 0}^{D \times d}$ and $V \in \mathbb{R}_{\ge 0}^{d \times N}$ with full rank whose product can
approximately represent the original matrix, i.e., $X \approx UV$.
In practice, we always have $d < \min(D, N)$. Thus, we
minimize the following objective function:

$$L_{NMF} = \|X - UV\|^2, \quad \text{s.t.} \; U, V \ge 0, \tag{1}$$
where $\|\cdot\|$ is the Frobenius norm. To optimize this objective,
an iterative updating procedure was developed in [26] as follows:

$$V_{ij} \leftarrow \frac{(U^T X)_{ij}}{(U^T U V)_{ij}} V_{ij}, \qquad U_{ij} \leftarrow \frac{(X V^T)_{ij}}{(U V V^T)_{ij}} U_{ij}, \tag{2}$$

together with the normalization

$$U_{ij} \leftarrow \frac{U_{ij}}{\sum_i U_{ij}}. \tag{3}$$
It has been proved that this updating procedure finds a
local minimum of $L_{NMF}$. The matrix $V$ obtained by NMF
is regarded as the low-dimensional representation, while
the matrix $U$ denotes the basis matrix.
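For concreteness, a minimal NumPy sketch of this procedure (not the authors' code; the iteration count, the random seed, and the small `eps` guard against division by zero are illustrative choices) could look as follows:

```python
import numpy as np

def nmf(X, d, n_iter=200, eps=1e-10):
    """Standard NMF via the multiplicative updates of Eq. (2)-(3).

    X : (D, N) nonnegative data matrix, columns are samples.
    Returns U (D, d) and V (d, N) with X ~ U @ V.
    """
    D, N = X.shape
    rng = np.random.default_rng(0)
    U = rng.random((D, d))
    V = rng.random((d, N))
    for _ in range(n_iter):
        V *= (U.T @ X) / (U.T @ U @ V + eps)   # Eq. (2), V-update
        U *= (X @ V.T) / (U @ V @ V.T + eps)   # Eq. (2), U-update
        U /= U.sum(axis=0, keepdims=True)      # Eq. (3), column normalization
    return U, V
```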
Furthermore, there also exist several variants of NMF. Local
NMF (LNMF) [27] imposes a spatial localized constraint
on the bases. In [28], sparse NMF was proposed and later,
NMF constrained with neighborhood preserving regulariza-
tion (NPNMF) [29] was developed. Besides, researchers
also proposed graph regularized NMF (GNMF) [30], which
effectively preserves the locality structure of data. Beyond
these methods, [31] extends the original NMF with the kernel
trick as kernelized NMF (KNMF), which could extract more
useful features hidden in the original data through some
kernel-induced nonlinear mappings. Moreover, it can deal with
data where only relationships (similarities or dissimilarities)
between objects are known. Specifically, in their work, the
matrix factorization is addressed as $K \approx UV$, where
$K$ is the kernel matrix instead of the data matrix $X$, and
$U$ and $V$ play the same roles as in standard NMF. More related to our
work, a multiple kernels NMF (MKNMF) [32] approach was
proposed, where linear programming is applied to determine
the combination of different kernels.
In this paper, we present a Regularized Kernel
Nonnegative Matrix Factorization (RKNMF) framework
for hashing, which can effectively preserve the data intrinsic
probability distribution and simultaneously reduce the
redundancy of low-dimensional representations. Rather than
locality-based graph regularization, we measure the joint
probability of pairwise data by the Gaussian function, which
is defined over all the potential neighbors and has been
proved to effectively resist data noise [33]. This kind of
measurement is capable of capturing the local structure of the
high-dimensional data while also revealing global structure
such as the presence of clusters at several scales. To the
best of our knowledge, this is the first time that NMF with
multiview hashing has been successfully applied to feature
embedding for large-scale similarity search.
Fig. 1. Illustration of the workflow of MAH. During the training phase, the proposed optimization method is derived by an alternate optimization
procedure which optimizes the kernel weights α and the factorization matrices (U, V) alternately. Using multivariable logistic regression, the algorithm
then outputs the projection matrix P and the regression matrix Θ to generate the hash function, which are directly used in the test phase.
III. MULTIVIEW ALIGNMENT HASHING
In this section, we introduce our new Multiview Alignment
Hashing approach, referred to as MAH. Our goal is to learn a
hash embedding function, which fuses the various alignment
representations from multiple sources while preserving the
high-dimensional joint distribution and obtaining the orthogonal
bases simultaneously during RKNMF. The desired solution is
binary; however, we first relax it to the real-valued domain so
that a more tractable solution can be obtained. After the
alternate optimization converges, we convert
the real-valued solutions into binary codes. Fig. 1 shows the
outline of the proposed MAH approach.
A. Objective Function
Given the $i$-th view training data $X^{(i)} = [x_1^{(i)}, \ldots, x_N^{(i)}] \in \mathbb{R}^{D_i \times N}$, we construct the corresponding $N \times N$ kernel
matrix $K_i$ using the heat kernel formulation:

$$K_i\big(x_p^{(i)}, x_q^{(i)}\big) = \exp\left(-\frac{\|x_p^{(i)} - x_q^{(i)}\|^2}{2\tau_i^2}\right), \quad \forall p, q,$$
where $\tau_i$ is the corresponding scale parameter. Our approach
can work with any legitimate kernel; without loss of generality,
the popular heat kernel is used in the proposed method, and the
discussion in the following sections focuses on it.
Multiple kernel matrices $\{K_1, \ldots, K_n\}$, one per view, are
then computed, with $K_i \in \mathbb{R}^{N \times N}_{\ge 0}, \forall i$. We further
define the fused kernel matrix

$$K = \sum_{i=1}^{n} \alpha_i K_i, \quad \text{s.t.} \; \sum_{i=1}^{n} \alpha_i = 1, \; \alpha_i \ge 0, \; \forall i.$$
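As an illustration, the per-view heat kernels and their convex combination might be computed as in the following sketch (the helper names are our own; `tau` corresponds to $\tau_i$ and `alpha` to the weights $\alpha_i$):

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

def heat_kernel(X, tau):
    """N x N heat-kernel matrix for one view; X is (D_i, N), columns are samples."""
    sq = squareform(pdist(X.T, metric="sqeuclidean"))  # pairwise ||x_p - x_q||^2
    return np.exp(-sq / (2.0 * tau ** 2))

def fuse_kernels(kernels, alpha):
    """K = sum_i alpha_i K_i with alpha on the simplex (nonnegative, sums to 1)."""
    alpha = np.asarray(alpha, dtype=float)
    assert np.all(alpha >= 0) and np.isclose(alpha.sum(), 1.0)
    return sum(a * K for a, K in zip(alpha, kernels))
```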
To obtain a meaningful low-dimensional matrix factorization,
we then set a constraint for the binary representation
V = [v1, . . . , vN ] as the similarity probability regularization,
which is utilized to preserve the data distribution in the
intrinsic objective space. The optimization is expressed as:
$$\arg\min_{V \in \{0,1\}^{d \times N}} \frac{1}{2} \sum_{p,q} W_{pq}^{(i)} \|v_p - v_q\|^2, \tag{4}$$

where $W_{pq}^{(i)} = \Pr(x_p^{(i)}, x_q^{(i)})$ is the symmetric joint probability
between $x_p^{(i)}$ and $x_q^{(i)}$ in the $i$-th view feature space. In this
paper, we use a Gaussian function² [33] to measure it:
$$\Pr\big(x_p^{(i)}, x_q^{(i)}\big) = \begin{cases} \dfrac{\exp\big(-\|x_p^{(i)} - x_q^{(i)}\|^2 / 2\sigma_i^2\big)}{\sum_{k \neq l} \exp\big(-\|x_k^{(i)} - x_l^{(i)}\|^2 / 2\sigma_i^2\big)}, & \text{if } p \neq q \\ 0, & \text{if } p = q \end{cases} \tag{5}$$

where $\sigma_i$ is the Gaussian smoothing parameter and $\|x_p^{(i)} - x_q^{(i)}\|$ is
always measured by the Euclidean distance. Following the
framework in [30], the similarity probability regularization for
the $i$-th view can be reduced to

$$\arg\min_{V \in \{0,1\}^{d \times N}} \mathrm{Tr}(V L_i V^T), \tag{6}$$

where $L_i = D^{(i)} - W^{(i)}$ is the Laplacian matrix,
$W^{(i)} = \big(W_{pq}^{(i)}\big) \in \mathbb{R}^{N \times N}$ is the symmetric similarity matrix,
and $D^{(i)}$ is a diagonal matrix with entries $D_{pp}^{(i)} = \sum_q W_{pq}^{(i)}$.
In addition, to reduce the redundancy of the subspaces and
simultaneously obtain compact bases in NMF, we would like the
basis matrix $U$ to be as orthogonal as possible, i.e.,
$U^T U - I = 0$, to minimize the redundancy. Here we
minimize $\|U^T U - I\|^2$ to make $U$ near-orthogonal,
since this objective can be fused into the computation of the
$L_2$-norm loss function of NMF. Finally, we combine the
above two constraints on $U$ and $V$ and construct
our optimization problem as follows:

$$\begin{aligned} \arg\min_{U, V, \alpha_i} \; & \Big\| \sum_{i=1}^{n} \alpha_i K_i - UV \Big\|^2 + \gamma \sum_{i=1}^{n} \alpha_i \, \mathrm{Tr}(V L_i V^T) + \eta \, \|U^T U - I\|^2, \\ \text{s.t.} \; & V \in \{0,1\}^{d \times N}, \; U, V \ge 0, \; \sum_{i=1}^{n} \alpha_i = 1, \; \alpha_i \ge 0, \; \forall i, \end{aligned} \tag{7}$$

where $\gamma$ and $\eta$ are two positive coefficients balancing the
tradeoff between the NMF approximation error and the additional
constraints.

² Our approach can work with any legitimate similarity probability measure,
though we focus on the Gaussian similarity probability in this paper.
B. Alternate Optimization via Relaxation
In this section, we first relax the discreteness condition to
the continuous condition, i.e., we relax the binary domain
V ∈ {0, 1}d×N in Eq. (7) to the real number domain
V ∈ Rd×N but keep the NMF requirements (nonnegative
conditions). To the best of our knowledge, there is no direct
way to obtain its global solution for a linearly constrained
nonconvex optimization problem. Thus, in this paper the
optimization procedure is implemented as an iterative proce-
dure and divided into two steps by optimizing (U, V ) and
α = (α1, . . . , αn) alternately [34]. For each step, one of (U, V)
and α is optimized while the other is fixed and at the next step,
we switch (U, V) and α. The iteration procedure stops until
it converges.
1) Optimizing (U, V): By first fixing $\alpha$, we substitute
$K = \sum_{i=1}^{n} \alpha_i K_i$ and $L = \sum_{i=1}^{n} \alpha_i L_i$. A Lagrangian
function is introduced for our problem as:

$$L_1(U, V, \Phi, \Psi) = \|K - UV\|^2 + \gamma \, \mathrm{Tr}(V L V^T) + \eta \, \|U^T U - I\|^2 + \mathrm{Tr}(\Phi U^T) + \mathrm{Tr}(\Psi V^T), \tag{8}$$

where $\Phi$ and $\Psi$ are two matrices whose entries are the
Lagrangian multipliers constraining $U \ge 0$ and $V \ge 0$, respectively.
Setting the partial derivatives of $L_1$ with respect
to $U$ and $V$ to zero, i.e., $\nabla_{U,V} L_1 = 0$, we obtain

$$\nabla_U L_1 = 2\big({-K V^T} + U V V^T + 2\eta U U^T U - 2\eta U\big) + \Phi = 0, \tag{9}$$

$$\nabla_V L_1 = 2\big({-U^T K} + U^T U V + \gamma V L\big) + \Psi = 0. \tag{10}$$
Using the Karush-Kuhn-Tucker (KKT) conditions [35], since
our objective function satisfies the constraint qualification
condition, we have the complementary slackness conditions
$\Phi_{ij} U_{ij} = 0$ and $\Psi_{ij} V_{ij} = 0$, $\forall i, j$.
Multiplying the corresponding entries of Eq. (9) and (10)
by $U_{ij}$ and $V_{ij}$, we have the following equations:

$$\big({-K V^T} + U V V^T + 2\eta U U^T U - 2\eta U\big)_{ij} U_{ij} = 0,$$
$$\big({-U^T K} + U^T U V + \gamma V L\big)_{ij} V_{ij} = 0.$$

Hence, similar to the standard procedure of NMF [26], we get
our updating rules

$$U_{ij} \leftarrow \frac{\big(K V^T + 2\eta U\big)_{ij}}{\big(U V V^T + 2\eta U U^T U\big)_{ij}} U_{ij}, \tag{11}$$

$$V_{ij} \leftarrow \frac{\big(U^T K + \gamma V W\big)_{ij}}{\big(U^T U V + \gamma V D\big)_{ij}} V_{ij}, \tag{12}$$
where $D = \sum_{i=1}^{n} \alpha_i D^{(i)}$ and $W = \sum_{i=1}^{n} \alpha_i W^{(i)}$. This
allocation guarantees that all the entries of $U$ and $V$ remain
nonnegative. $U$ also needs to be normalized as in Eq. (3). The
convergence of $U$ is based on [36] and the convergence of $V$ is
based on [30]: similar to [37], they have proven that after
each update of $U$ or $V$, the objective function is monotonically
nonincreasing.
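The following sketch shows one round of these multiplicative updates (Eq. (11)-(12)) in NumPy; the function name and the `eps` stabilizer are our own, and `K`, `W`, `D` are the α-fused matrices defined above:

```python
import numpy as np

def rknmf_update(U, V, K, W, D, gamma, eta, eps=1e-10):
    """One round of the multiplicative updates of Eq. (11)-(12).

    K, W, D are the fused kernel, similarity, and degree matrices
    (all N x N); U is (N, d), V is (d, N). A sketch, not the paper's code.
    """
    U *= (K @ V.T + 2 * eta * U) / (U @ V @ V.T + 2 * eta * U @ U.T @ U + eps)  # Eq. (11)
    U /= U.sum(axis=0, keepdims=True)                                           # Eq. (3)
    V *= (U.T @ K + gamma * V @ W) / (U.T @ U @ V + gamma * V @ D + eps)        # Eq. (12)
    return U, V
```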
2) Optimizing α: Due to the quadratic norm in our objective
function, the procedure for optimizing $\alpha$ differs from
general linear programming. For fixed $U$ and $V$, omitting the
irrelevant terms, we define the Lagrangian function

$$L_2(\alpha, \lambda, \beta) = \Big\| \sum_{i=1}^{n} \alpha_i K_i - UV \Big\|^2 + \gamma \sum_{i=1}^{n} \alpha_i \, \mathrm{Tr}(V L_i V^T) + \lambda \Big( \sum_{i=1}^{n} \alpha_i - 1 \Big) + \beta \alpha^T, \tag{13}$$
where $\lambda$ and $\beta = (\beta_1, \ldots, \beta_n)$ are the Lagrangian multipliers
for the constraints and $\alpha = (\alpha_1, \ldots, \alpha_n)$. Considering the
partial derivatives of $L_2$ with respect to $\alpha$, $\lambda$ and $\beta$, we have
the equality $\nabla_{\alpha,\lambda} L_2 = 0$ and the inequality $\nabla_\beta L_2 \ge 0$. Then
we need

$$\nabla_\alpha L_2 = \left( 2 \, \mathrm{Tr}\Big( \big( \sum_{i=1}^{n} \alpha_i K_i - UV \big) K_j \Big) + \gamma \, \mathrm{Tr}(V L_j V^T) + \lambda + \beta_j \right)_{1 \le j \le n} = 0, \tag{14}$$

$$\nabla_\lambda L_2 = \sum_{i=1}^{n} \alpha_i - 1 = 0, \tag{15}$$

$$\nabla_\beta L_2 = \alpha \ge 0. \tag{16}$$

We also have the complementary slackness conditions

$$\beta_j \alpha_j = 0, \quad j = 1, \ldots, n. \tag{17}$$
Suppose $\alpha_j = 0$ for some $j$; specifically, denote
$J = \{ j \mid \alpha_j = 0 \}$. Then the optimal solution would include some
zeros, and the problem reduces to the same optimization
procedure minimizing $\| \sum_{j \notin J} \alpha_j K_j - UV \|^2$ over the
remaining views. Without loss of generality, we suppose
$\alpha_j > 0, \forall j$. Then $\beta = 0$, and from Eq. (14) we have

$$\sum_{i=1}^{n} \alpha_i \, \mathrm{Tr}(K_i K_j) = \mathrm{Tr}(U V K_j) - \gamma \, \mathrm{Tr}(V L_j V^T)/2 - \lambda/2, \tag{18}$$
for $j = 1, \ldots, n$. Transforming the above equations into matrix
form and denoting $T_j = \mathrm{Tr}(U V K_j) - \gamma \, \mathrm{Tr}(V L_j V^T)/2$, we obtain

$$\begin{pmatrix} \mathrm{Tr}(K_1^2) & \mathrm{Tr}(K_2 K_1) & \cdots & \mathrm{Tr}(K_n K_1) \\ \mathrm{Tr}(K_1 K_2) & \mathrm{Tr}(K_2^2) & \cdots & \mathrm{Tr}(K_n K_2) \\ \vdots & \vdots & \ddots & \vdots \\ \mathrm{Tr}(K_1 K_n) & \mathrm{Tr}(K_2 K_n) & \cdots & \mathrm{Tr}(K_n^2) \end{pmatrix} \begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_n \end{pmatrix} = \begin{pmatrix} T_1 - \lambda/2 \\ T_2 - \lambda/2 \\ \vdots \\ T_n - \lambda/2 \end{pmatrix}. \tag{19}$$

Let us denote Eq. (19) by $A \alpha^T = B$. Note that the matrix $A$
is actually the Gram matrix of the $K_i$ under the Frobenius
inner product $\langle K_i, K_j \rangle = \mathrm{Tr}(K_i K_j^T) = \mathrm{Tr}(K_i K_j)$. Let
$M = (\mathrm{vec}(K_1), \ldots, \mathrm{vec}(K_n))$, where $\mathrm{vec}(K_i)$ is the
vectorization of $K_i$; then $A = M^T M$ and $\mathrm{rank}(A) =
\mathrm{rank}(M^T M) = \mathrm{rank}(M) = n$, under the assumption that the
kernel matrices $K_1, \ldots, K_n$ from the $n$ different views are linearly
independent. Combining this with the constraint of Eq. (15) and
eliminating $\lambda$ by subtracting the last row of Eq. (19) from each of
the other rows, we obtain the linear system

$$\begin{pmatrix} \mathrm{Tr}(K_1^2) - \mathrm{Tr}(K_1 K_n) & \cdots & \mathrm{Tr}(K_n K_1) - \mathrm{Tr}(K_n^2) \\ \vdots & \ddots & \vdots \\ \mathrm{Tr}(K_1 K_{n-1}) - \mathrm{Tr}(K_1 K_n) & \cdots & \mathrm{Tr}(K_n K_{n-1}) - \mathrm{Tr}(K_n^2) \\ 1 & \cdots & 1 \end{pmatrix} \begin{pmatrix} \alpha_1 \\ \vdots \\ \alpha_n \end{pmatrix} = \begin{pmatrix} T_1 - T_n \\ \vdots \\ T_{n-1} - T_n \\ 1 \end{pmatrix}. \tag{20}$$

We denote Eq. (20) by $\tilde{A} \alpha^T = \tilde{B}$. Because the views are distinct,
the all-ones row $\mathbf{1} = (1, \ldots, 1)$ is linearly independent of the
other rows of $\tilde{A}$, so $\mathrm{rank}(\tilde{A}) = (\mathrm{rank}(A) - 1) + 1 = n$. Thus the
inverse of $\tilde{A}$ exists and we have

$$\alpha^T = \tilde{A}^{-1} \tilde{B}.$$
3) Global Convergence: Let us denote our original objective
function in Eq. (7) by $L(U, V, \alpha)$. Then for any $m$-th step, the
alternate iteration procedure is

$$(U^{(m)}, V^{(m)}) \leftarrow \arg\min_{U, V} L(U, V, \alpha^{(m-1)})$$

and

$$\alpha^{(m)} \leftarrow \arg\min_{\alpha} L(U^{(m)}, V^{(m)}, \alpha).$$

Thus we have the following chain of inequalities:

$$L(U^{(m-1)}, V^{(m-1)}, \alpha^{(m-1)}) \ge L(U^{(m)}, V^{(m)}, \alpha^{(m-1)}) \ge L(U^{(m)}, V^{(m)}, \alpha^{(m)}) \ge L(U^{(m+1)}, V^{(m+1)}, \alpha^{(m)}) \ge L(U^{(m+1)}, V^{(m+1)}, \alpha^{(m+1)}) \ge \cdots.$$

In other words, $L(U^{(m)}, V^{(m)}, \alpha^{(m)})$ is monotonically nonincreasing
as $m \to \infty$. Since $L(U, V, \alpha) \ge 0$, the alternate iteration converges.
Following the above optimization procedure on $(U, V)$
and $\alpha$, we compute them alternately, fixing one while updating
the other, until the objective function converges. In practice, we
terminate the iteration when the difference $|L(U^{(m)}, V^{(m)}, \alpha^{(m)}) -
L(U^{(m-1)}, V^{(m-1)}, \alpha^{(m-1)})|$ is less than a small threshold or
the number of iterations reaches a maximum value.
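Putting the two steps together, the alternate optimization could be driven by a loop like the following sketch, which reuses the `rknmf_update` and `solve_alpha` helpers above plus an evaluation of the (relaxed) objective of Eq. (7); all names, the inner iteration count, and the defaults are illustrative:

```python
import numpy as np

def objective(kernels, laplacians, alpha, U, V, gamma, eta):
    """Evaluate Eq. (7) with the discreteness of V relaxed."""
    K = sum(a * Ki for a, Ki in zip(alpha, kernels))
    reg = sum(a * np.trace(V @ Li @ V.T) for a, Li in zip(alpha, laplacians))
    d = U.shape[1]
    return (np.linalg.norm(K - U @ V) ** 2 + gamma * reg
            + eta * np.linalg.norm(U.T @ U - np.eye(d)) ** 2)

def mah_train(kernels, Ws, Ds, laplacians, d, gamma, eta, max_iter=10, tol=1e-4):
    """Alternate optimization: update (U, V) with alpha fixed, then alpha
    with (U, V) fixed, until Eq. (7) stops decreasing."""
    n, N = len(kernels), kernels[0].shape[0]
    alpha = np.full(n, 1.0 / n)                   # uniform initialization
    rng = np.random.default_rng(0)
    U, V = rng.random((N, d)), rng.random((d, N))
    prev = np.inf
    for _ in range(max_iter):
        K = sum(a * M for a, M in zip(alpha, kernels))
        W = sum(a * M for a, M in zip(alpha, Ws))
        D = sum(a * M for a, M in zip(alpha, Ds))
        for _ in range(50):                       # inner NMF iterations
            U, V = rknmf_update(U, V, K, W, D, gamma, eta)
        alpha = solve_alpha(kernels, U, V, laplacians, gamma)
        cur = objective(kernels, laplacians, alpha, U, V, gamma, eta)
        if abs(prev - cur) < tol:                 # stopping criterion above
            break
        prev = cur
    return U, V, alpha
```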
C. Hashing Function Generation
From the above section, we can compute the weight
vector $\alpha = (\alpha_1, \ldots, \alpha_n)$ and further obtain the fused kernel
matrix $K$ and the combined joint probability Laplacian
matrix $L$. Thus, from Eq. (11) and Eq. (12) we can obtain the
multiview RKNMF bases $U \in \mathbb{R}^{N \times d}$ and the low-dimensional
representation $V \in \mathbb{R}^{d \times N}$, where $d \ll D_i$, $i = 1, \ldots, n$.
We now convert the low-dimensional real-valued
representation $V = [v_1, \ldots, v_N]$ into binary codes
via thresholding: if the $l$-th element of $v_p$ is larger than
a specified threshold, the mapped bit $\hat{v}_p^{(l)} = 1$; otherwise
$\hat{v}_p^{(l)} = 0$, where $p = 1, \ldots, N$ and $l = 1, \ldots, d$.
According to [38], a well-designed semantic hashing should
also be entropy-maximizing to ensure efficiency. Moreover,
from information theory [39], the maximal entropy of a
source alphabet is attained by a uniform probability
distribution. If the entropy of codes over the bit-string is
small, documents are mapped to only a small fraction of codes
(hash bins), rendering the hash table inefficient. In our method,
we set the threshold for the $l$-th bit as the median of
$\{v_p^{(l)}\}_{p=1}^{N}$, which meets the entropy-maximizing criterion
mentioned above: each bit is 1 for half of the training set
and 0 for the other half. This scheme gives each distinct binary
code roughly equal probability of occurring in the data
collection, and hence achieves the best utilization of the hash
table. The binary codes can be represented as
$\hat{V} = [\hat{v}_1, \ldots, \hat{v}_N]$, where $\hat{v}_p \in \{0,1\}^d$
and $p = 1, \ldots, N$. A similar scheme has been used
in [40] and [41].
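This median thresholding is a one-liner in NumPy; a sketch (helper name ours):

```python
import numpy as np

def binarize_by_median(V):
    """Entropy-maximizing thresholding: bit l is 1 when v_p^(l) exceeds the
    median of row l of V, so every bit splits the training set in half.
    V is (d, N); returns (d, N) binary codes."""
    thresholds = np.median(V, axis=1, keepdims=True)  # one threshold per bit
    return (V > thresholds).astype(np.uint8)
```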
So far, we have only described how to compute the hash codes
of items in the training set; for a new incoming sample we cannot
yet explicitly find the corresponding hashing function. Inspired
by [42], we determine our out-of-sample extension using a
regression technique with multiple variables. In this paper,
instead of applying linear regression as in [42], the binary
coding environment leads us to implement our task via logistic
regression [43], which can be treated as a probabilistic
statistical classification model whose probabilities describe
the possibilities of binary responses. Given $n$ distributions
$Y_i | X_i \sim \mathrm{Bernoulli}(p_i)$, $i = 1, \ldots, n$, and the function
$\Pr(Y_i = 1 | X_i = x) := h_\theta(x)$ with parameter $\theta$,
Fig. 2. The illustration of the embedding via Eq. (23), i.e., the nearest integer function.
the likelihood function is

$$\prod_{i=1}^{n} \Pr(Y_i = y_i | X_i = x_i) = \prod_{i=1}^{n} h_\theta(x_i)^{y_i} \big(1 - h_\theta(x_i)\big)^{1 - y_i}.$$
According to the maximum log-likelihood criterion, we define
our logistic regression cost function:

$$J(\Theta) = -\frac{1}{N} \sum_{p=1}^{N} \Big( \big\langle \hat{v}_p, \log(h_\Theta(v_p)) \big\rangle + \big\langle \mathbf{1} - \hat{v}_p, \log(\mathbf{1} - h_\Theta(v_p)) \big\rangle \Big) + \xi \|\Theta\|^2, \tag{21}$$

where

$$h_\Theta(v_p) = \left( \frac{1}{1 + e^{-(\Theta^T v_p)_i}} \right)^T_{1 \le i \le d}$$

is the regression function applied to each component of $v_p$;
$\log(x) = (\log(x_1), \ldots, \log(x_n))^T$ for $x = (x_1, \ldots, x_n)^T \in \mathbb{R}^n$;
$\langle \cdot, \cdot \rangle$ denotes the inner product; $\Theta$ is the corresponding
regression matrix of size $d \times d$; $\mathbf{1}$ denotes the all-ones vector;
and $\xi$ is the parameter of the regularization term
$\xi \|\Theta\|^2$, which avoids overfitting.
To minimize $J(\Theta)$, a standard gradient
descent algorithm is employed. According to [43], the
update with learning rate $r$ can be written as:

$$\Theta^{t+1} = \Theta^{t} - r \left( \frac{1}{N} \sum_{p=1}^{N} \big( h_{\Theta^t}(v_p) - \hat{v}_p \big) v_p^T + \frac{\xi}{N} \Theta^{t} \right). \tag{22}$$
We stop the update iteration when the squared norm of the
difference between $\Theta^{t+1}$ and $\Theta^{t}$, $\|\Theta^{t+1} - \Theta^{t}\|^2$, is less than
a small empirical value (i.e., convergence is reached), and then
output the regression matrix $\Theta$.
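A sketch of this training loop in NumPy (Eq. (21)-(22)); the hyperparameter defaults here are illustrative, not the paper's settings:

```python
import numpy as np

def sigmoid(Z):
    return 1.0 / (1.0 + np.exp(-Z))

def train_logistic_regression(V, V_hat, xi=0.05, r=0.01, max_iter=1000, tol=1e-6):
    """Gradient descent on the cost of Eq. (21) via the update of Eq. (22).

    V is (d, N) real-valued codes, V_hat is (d, N) binary targets;
    returns the d x d regression matrix Theta.
    """
    d, N = V.shape
    Theta = np.zeros((d, d))
    for _ in range(max_iter):
        H = sigmoid(Theta.T @ V)                          # h_Theta(v_p) for all p
        grad = (H - V_hat) @ V.T / N + (xi / N) * Theta   # gradient in Eq. (22)
        Theta_new = Theta - r * grad
        if np.linalg.norm(Theta_new - Theta) ** 2 < tol:  # stopping criterion
            return Theta_new
        Theta = Theta_new
    return Theta
```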
In this way, given a new sample, the related kernel matrices
$\{K_1^{new}, \ldots, K_n^{new}\}$ for each view are first computed via the heat
kernel, where $K_i^{new}$ is an $N \times 1$ matrix, $\forall i$. We then fuse these
kernels with the optimized weights $\alpha$:

$$K^{new} = \sum_{i=1}^{n} \alpha_i K_i^{new}$$

and obtain the low-dimensional real-valued representation through
the linear projection matrix

$$P = (U^T U)^{-1} U^T,$$

i.e., the pseudoinverse of the multiview RKNMF basis $U$. Since $h_\Theta$
is a sigmoid function, the hash code for the new sample can
finally be calculated as:

$$\hat{v}_{new} = \lfloor h_\Theta(P \cdot K^{new}) \rceil, \tag{23}$$

where $\lfloor \cdot \rceil$ is the nearest integer function applied to each
entry of $h_\Theta$. In fact, since $h_\Theta \in (0, 1)$, this binarizes
$\hat{v}_{new}$ with a threshold of 0.5: if any bit of the output of
$h_\Theta(P \cdot K^{new})$ is larger than 0.5, we assign "1" to
this bit, otherwise "0". In this way, we obtain the final
Multiview Alignment Hashing (MAH) codes for any data
point (see Fig. 2).
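For a single query, the out-of-sample encoding of Eq. (23) could be sketched as follows (names are ours; `sigmoid` is the helper defined above, and the per-view heat-kernel columns are computed against the stored training data):

```python
import numpy as np

def hash_new_sample(x_views, train_views, taus, alpha, U, Theta):
    """Out-of-sample extension of Eq. (23) for one query.

    x_views: list of per-view descriptor vectors; train_views: list of
    (D_i, N) training matrices; taus: heat-kernel parameters tau_i.
    """
    # Fused N x 1 kernel column between the query and all training points.
    k_new = sum(a * np.exp(-np.sum((X - x[:, None]) ** 2, axis=0) / (2 * t ** 2))
                for a, X, x, t in zip(alpha, train_views, x_views, taus))
    P = np.linalg.pinv(U)                  # P = (U^T U)^{-1} U^T
    v = sigmoid(Theta.T @ (P @ k_new))     # h_Theta(P . K_new)
    return (v > 0.5).astype(np.uint8)      # nearest-integer rounding
```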
It is noteworthy that our approach can be regarded as an
embedding method. In practice, we map all the training and
test data in the same way, via Eq. (23) with $\{P, \alpha, \Theta\}$
obtained from the multiview RKNMF optimization and logistic
regression, to ensure that they lie in the same subspace. This
procedure is similar to a handcrafted embedding scheme,
without re-training. The corresponding integrated MAH
algorithm is depicted in Algorithm 1.

Algorithm 1 Multiview Alignment Hashing (MAH)
D. Complexity Analysis
The cost of MAH learning mainly consists of two parts.
The first part is the construction of the heat kernels and the
similarity probability regularization terms for the different views,
i.e., $K_i$ and $L_i$. From Section III-A, the time complexity of
this part is $O\big(2(\sum_{i=1}^{n} D_i) N^2\big)$. The second part is the
alternate optimization. The time complexity of the matrix
factorization in the $(U, V)$-update step is $O(N^2 d)$ according
to [30], and the update of $\alpha$ has complexity $O(n^2 N^2)$.
In total, the time complexity of MAH learning is
$O\big(2(\sum_{i=1}^{n} D_i) N^2 + T \times (N^2 d + n^2 N^2)\big)$, where $T$ is the
number of iterations of the alternate optimization. Empirically,
$T$ is always less than 10, i.e., MAH converges within
10 rounds.
IV. EXPERIMENTS AND RESULTS
In this section, the MAH algorithm is evaluated for the high
dimensional nearest neighbor search problem. Three different
datasets are used in our experiments: Caltech-256 [44],
CIFAR-10 [45] and CIFAR-20 [45]. Caltech-256 consists
of 30,607 images associated with 256 object categories.
CIFAR-10 and CIFAR-20 are both 60,000-image subsets
collected from the 80-million tiny images dataset [46] with
10 class labels and 20 class labels, respectively. Following
the experimental setting in [21] and [22], for each dataset we
randomly select 1000 images as the query set and use the rest
of the dataset as the training set. Given an image, we describe
it with multiview features extracted from it. The descriptors
are expected to capture the orientation, intensity, texture and
color information, which are the main cues of an image.
Therefore, 512-dim GIST³ [47], 1152-dim histogram of
oriented gradients (HOG)⁴ [48], 256-dim local binary
patterns (LBP)⁵ [49] and a 192-dim color histogram
(ColorHist)⁶ are employed for image representation.
In the test phase, a returned point is regarded as a true
neighbor if it lies in the top 100, 500 and 500 points closest
to a query for Caltech-256, CIFAR-10 and CIFAR-20,
respectively. For each query, all the data points in the database
are ranked according to their Hamming distances to the query,
since it is fast enough with short hash codes in practice.
We then evaluate the retrieval results by the Mean Average
Precision (MAP) and the precision-recall curve. Additionally,
we also report the training time and the test time (the average
searching time used for each query) for all the methods.
All experiments are performed using Matlab 2013a on a
server configured with a 12-core processor and 128 GB of RAM,
running Linux.
A. Compared Methods and Settings
We compare our method against six popular
unsupervised multiview hashing algorithms, i.e., Multi-View
Anchor Graph Hashing (MVAGH) [21], Sequential Update
for Multi-View Spectral Hashing (SU-MVSH) [22],
Multi-View Hashing (MVH-CS) [23], Composite Hashing
with Multiple Information Sources (CHMIS) [24],
³ Gabor filters are applied on images with 8 different orientations and
4 scales. Each filtered image is then averaged over a 4 × 4 grid, leading to
a 512-dimensional vector (8 × 4 × 16 = 512).
⁴ 36 4 × 4 non-overlapping windows yield a 1152-dimensional vector.
⁵ The LBP labels the pixels of an image by thresholding a 3 × 3 neighborhood,
and the responses are mapped to a 256-dimensional vector.
⁶ For each R, G, B channel, a 64-bin histogram is computed; the total
length is 3 × 64 = 192.
Deep Multi-view Hashing (DMVH) [25] with a 4-layer
deep net, and a derived version of MVH-CS, termed
MVH-CCA, which is a special case of MVH-CS in which the
averaged similarity matrix is fixed as the identity matrix [23].
Besides, we also compare our method with two state-of-the-art
single-view hashing methods, i.e., Spectral Hashing (SpH) [8]
and Anchor Graphs Hashing (AGH) [10]. For single-view
hashing methods, data from multiviews are concatenated into a
single representation for hashing learning. It is noteworthy that
we normalize all the features to the same scale before
concatenating the multiple features into a single representation
for the single-view hashing algorithms. All of the above methods
are then evaluated on six different lengths of codes
(16, 32, 48, 64, 80, 96). Following the same experimental
setting, all the parameters used in the compared methods
have been strictly chosen according to their original papers.
For our MAH, we apply the heat kernel

$$K_i\big(x_p^{(i)}, x_q^{(i)}\big) = \exp\left(-\frac{\|x_p^{(i)} - x_q^{(i)}\|^2}{2\tau_i^2}\right), \quad \forall p, q,$$

to construct the original kernel matrices, where $\tau_i$ is set as
to construct the original kernel matrices, where τi is set as
the median of pairwise distances of data points. The selection
of the smooth parameter σi used in Eq. (5) also follows
with the same scheme of τi. The optimal learning rate r for
each dataset is selected from one of {0.01, 0.02, . . ., 0.10}
with the step of 0.01 which yields the best performance
by 10-fold cross-validation on the training data. The choice
of three regularization parameters {γ, η, ξ} is also done via
cross-validation on the training set and we finally fix γ = 0.15,
η = 0.325 and ξ = 0.05 for all three datasets. To further speed
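The median heuristic used for both $\tau_i$ and $\sigma_i$ can be sketched as follows (helper name ours):

```python
import numpy as np
from scipy.spatial.distance import pdist

def median_bandwidth(X):
    """Median heuristic for tau_i and sigma_i: the median of all pairwise
    Euclidean distances within one view. X is (D_i, N)."""
    return np.median(pdist(X.T))
```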
To further speed up the convergence of the proposed alternate
optimization procedure, we apply a small trick in our experiments,
with the following steps:
1) The first time we calculate U and V in the (U, V)-update
step of Section III-B, we use Eq. (11) and Eq. (12) with
random initialization, following the original NMF
procedure [26], and then store the obtained U and V.
2) From the second iteration onward, we optimize (U, V) by
initializing the NMF algorithm with the stored U and V
from the previous iteration instead of random values.
This small improvement can effectively reduce the time of
convergence in the training phase and make the proposed
method more efficient for large-scale tasks. Fig. 3 shows
the retrieval performance of MAH when four descriptors
(i.e., GIST, HOG, LBP, ColorHist) are used together via different
kernel combinations, or when single descriptors are used.
Specifically, MAH denotes that the kernels are combined by
the proposed method, $K = \sum_{i=1}^{n} \alpha_i K_i$, where the weights
$\alpha_i$ are obtained via the alternate optimization; MAH (Average)
means that the kernels are combined by the arithmetic mean,
$K = \frac{1}{n} \sum_{i=1}^{n} K_i$;
Fig. 3. Performance (MAP) of MAH when four descriptors are used together via different kernel combinations in comparison to that of single descriptors.
Fig. 4. Performance (MAP) comparison with different numbers of bits.
Fig. 5. The precision-recall curves of all compared algorithms on the three datasets with the code length of 96 bits.
and MAH (Product) indicates the combination of kernels
through the geometric mean, $K = \big( \prod_{i=1}^{n} K_i \big)^{1/n}$.
The corresponding combinations of the similarity probability
regularization terms $L_i$ follow analogous schemes. The results
on the three datasets demonstrate that integrating multiple
features achieves better performance than using any single
feature, and that the proposed weighted combination improves
performance compared with the average and product schemes.
Fig. 4 illustrates the MAP curves of all compared algorithms
on the three datasets. Overall, the retrieval accuracies on the
CIFAR-10 dataset are clearly higher than those on the more
complicated CIFAR-20 and Caltech-256 datasets, and the
multiview methods consistently achieve better results than the
single-view schemes. In particular, in most
Fig. 6. Some retrieval results on the Caltech-256 dataset. The top-left image in each block is the query and the rest are the 8-nearest images. The incorrect
results are marked by red boxes.
Fig. 7. Comparison of some retrieval results via compared hashing methods on the Caltech-256 dataset. The top-left image in each block is the query and
the rest are the 8-nearest images. The incorrect results are marked by red boxes.
cases, MVAGH achieves higher performance than SU-MVSH,
MVH-CS, MVH-CCA, DMVH (4 layers) and CHMIS. The
results of MVH-CS always climb and then fall as the code
length increases; the same tendency also appears with CHMIS.
In general, our MAH outperforms all the other compared
methods (see also Table I). In addition, we have compared our
proposed method with various baselines (with/without the
probability regularization or the orthogonal constraint) in
Table II. It is clearly observed that the proposed method
significantly improves the effectiveness of NMF and its
variants in terms of accuracy.
Furthermore, we present the precision-recall curves of all
the algorithms on three datasets with the code length of 96 bits
in Fig. 5. From this figure, we can further observe that MAH
consistently achieves the best results, as measured by the Area
Under the Curve (AUC). Some examples of retrieval results on
Caltech-256 are shown in Fig. 6 and Fig. 7.
The proposed algorithm aims to generate a non-linear
binary code, which can better fit the intrinsic data structure
than linear ones. The only existing non-linear multiview
hashing algorithm is MVAGH. However, the proposed
algorithm can successfully find the optimal aggregation of
the kernel matrices, whereas MVAGH cannot. Therefore, it is
expected that the proposed algorithm performs better than
MVAGH and the existing linear multiview hashing algorithms.

TABLE I
MEAN AVERAGE PRECISION (MAP) OF 32 BITS WITH TRAINING AND TEST TIME ON THREE DATASETS

TABLE II
COMPARISON OF DIFFERENT VARIANTS OF MAH TO PROVE
THE EFFECTIVENESS OF THE IMPROVEMENT
Finally, we list the training time and the test time for
different algorithms on three datasets in Table I. Considering
the training time, since DMVH involves learning of a 4-layer
deep-net, it spends the most time to train hashing functions.
MVH-CCA spends the second longest time for training. Our
MAH only costs slightly more time than MVAGH and CHMIS
but significantly less than the other methods for training. In
the test phase, MVH-CS and MVH-CCA are the most efficient
methods, while MAH has search time comparable to
SU-MVSH. DMVH needs the most time for testing as well.
V. CONCLUSION
In this paper, we have presented a novel unsupervised
hashing method called Multiview Alignment Hashing (MAH),
where hashing functions are effectively learnt via kernelized
Nonnegative Matrix Factorization with preserving data joint
probability distribution. We incorporate multiple visual fea-
tures from different views together and an alternate way is
introduced to optimize the weights for different views and
simultaneously produce the low-dimensional representation.
We formulate this as a nonconvex optimization problem whose
alternate optimization procedure converges to a locally optimal
solution. For the out-of-sample extension, multivariable
logistic regression has been successfully applied to obtain the
regression matrix for fast hash encoding. Numerical experiments
have been systematically conducted on the Caltech-256,
CIFAR-10 and CIFAR-20 datasets. The results show that
our MAH significantly outperforms the state-of-the-art
multiview hashing techniques in terms of search accuracy.
REFERENCES
[1] L. Shao, L. Liu, and X. Li, “Feature learning for image classification via
multiobjective genetic programming,” IEEE Trans. Neural Netw. Learn.
Syst., vol. 25, no. 7, pp. 1359–1371, Jul. 2014.
[2] F. Zhu and L. Shao, “Weakly-supervised cross-domain dictionary learn-
ing for visual recognition,” Int. J. Comput. Vis., vol. 109, nos. 1–2,
pp. 42–59, Aug. 2014.
[3] J. Han et al., “Representing and retrieving video shots in human-centric
brain imaging space,” IEEE Trans. Image Process., vol. 22, no. 7,
pp. 2723–2736, Jul. 2013.
[4] J. Han, K. N. Ngan, M. Li, and H.-J. Zhang, “A memory learning
framework for effective image retrieval,” IEEE Trans. Image Process.,
vol. 14, no. 4, pp. 511–524, Apr. 2005.
[5] J. Han, S. He, X. Qian, D. Wang, L. Guo, and T. Liu, “An object-
oriented visual saliency detection framework based on sparse coding
representations,” IEEE Trans. Circuits Syst. Video Technol., vol. 23,
no. 12, pp. 2009–2021, Dec. 2013.
[6] V. Gaede and O. Günther, “Multidimensional access methods,” ACM
Comput. Surv., vol. 30, no. 2, pp. 170–231, Jun. 1998.
[7] A. Gionis, P. Indyk, and R. Motwani, “Similarity search in high
dimensions via hashing,” in Proc. Int. Conf. Very Large Data Bases,
vol. 99. 1999, pp. 518–529.
[8] Y. Weiss, A. Torralba, and R. Fergus, “Spectral hashing,” in Proc. Neural
Inf. Process. Syst., 2008, pp. 1753–1760.
[9] J. Wang, S. Kumar, and S.-F. Chang, “Semi-supervised hashing for large-
scale search,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 34, no. 12,
pp. 2393–2406, Dec. 2012.
[10] W. Liu, J. Wang, S. Kumar, and S.-F. Chang, “Hashing with graphs,” in
Proc. Int. Conf. Mach. Learn., 2011, pp. 1–8.
[11] K. Grauman and R. Fergus, “Learning binary hash codes for large-
scale image search,” in Machine Learning for Computer Vision. Berlin,
Germany: Springer-Verlag, 2013, pp. 49–87.
[12] J. Cheng, C. Leng, P. Li, M. Wang, and H. Lu, “Semi-supervised
multi-graph hashing for scalable similarity search,” Comput. Vis. Image
Understand., vol. 124, pp. 12–21, Jul. 2014.
[13] P. Li, M. Wang, J. Cheng, C. Xu, and H. Lu, “Spectral hashing
with semantically consistent graph for image indexing,” IEEE Trans.
Multimedia, vol. 15, no. 1, pp. 141–152, Jan. 2013.
[14] C. Xu, D. Tao, and C. Xu, “Large-margin multi-view information
bottleneck,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 36, no. 8,
pp. 1559–1572, Aug. 2014.
[15] T. Xia, D. Tao, T. Mei, and Y. Zhang, “Multiview spectral embed-
ding,” IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 40, no. 6,
pp. 1438–1446, Dec. 2010.
[16] C. Xu, D. Tao, and C. Xu. (2013). “A survey on multi-view learning.”
[Online]. Available: http://arxiv.org/abs/1304.5634
[17] B. Xie, Y. Mu, D. Tao, and K. Huang, “m-SNE: Multiview stochastic
neighbor embedding,” IEEE Trans. Syst., Man, Cybern. B, Cybern.,
vol. 41, no. 4, pp. 1088–1096, Aug. 2011.
[18] M. Wang, X.-S. Hua, R. Hong, J. Tang, G.-J. Qi, and Y. Song, “Unified
video annotation via multigraph learning,” IEEE Trans. Circuits Syst.
Video Technol., vol. 19, no. 5, pp. 733–746, May 2009.
[19] M. Wang, H. Li, D. Tao, K. Lu, and X. Wu, “Multimodal graph-based
reranking for web image search,” IEEE Trans. Image Process., vol. 21,
no. 11, pp. 4649–4661, Nov. 2012.
[20] J. Yu, M. Wang, and D. Tao, “Semisupervised multiview distance metric
learning for cartoon synthesis,” IEEE Trans. Image Process., vol. 21,
no. 11, pp. 4636–4648, Nov. 2012.
[21] S. Kim and S. Choi, “Multi-view anchor graph hashing,” in Proc. IEEE
Int. Conf. Acoust., Speech, Signal Process., May 2013, pp. 3123–3127.
[22] S. Kim, Y. Kang, and S. Choi, “Sequential spectral learning to hash with
multiple representations,” in Proc. 12th Eur. Conf. Comput. Vis., 2012,
pp. 538–551.
[23] S. Kumar and R. Udupa, “Learning hash functions for cross-view
similarity search,” in Proc. Int. Joint Conf. Artif. Intell., 2011, p. 1360.
[24] D. Zhang, F. Wang, and L. Si, “Composite hashing with multiple
information sources,” in Proc. ACM SIGIR Conf. Res. Develop. Inf. Retr.,
2011, pp. 225–234.
[25] Y. Kang, S. Kim, and S. Choi, “Deep learning to hash with multiple
representations,” in Proc. IEEE 12th Int. Conf. Data Mining, Dec. 2012,
pp. 930–935.
[26] D. D. Lee and H. S. Seung, “Learning the parts of objects by non-
negative matrix factorization,” Nature, vol. 401, no. 6755, pp. 788–791,
Oct. 1999.
[27] S. Z. Li, X. Hou, H. Zhang, and Q. Cheng, “Learning spatially localized,
parts-based representation,” in Proc. IEEE Conf. Comput. Vis. Pattern
Recognit., Dec. 2001, pp. I-207–I-212.
[28] P. O. Hoyer, “Non-negative matrix factorization with sparseness con-
straints,” J. Mach. Learn. Res., vol. 5, pp. 1457–1469, Dec. 2004.
[29] Q. Gu and J. Zhou, “Neighborhood preserving nonnegative matrix
factorization,” in Proc. Brit. Mach. Vis. Conf., 2009, pp. 1–10.
[30] D. Cai, X. He, J. Han, and T. S. Huang, “Graph regularized nonnegative
matrix factorization for data representation,” IEEE Trans. Pattern Anal.
Mach. Intell., vol. 33, no. 8, pp. 1548–1560, Aug. 2011.
[31] D. Zhang, Z.-H. Zhou, and S. Chen, “Non-negative matrix factorization
on kernels,” in Proc. 9th Pacific Rim Int. Conf. Artif. Intell., 2006,
pp. 404–412.
[32] S. An, J.-M. Yun, and S. Choi, “Multiple kernel nonnegative matrix
factorization,” in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process.,
May 2011, pp. 1976–1979.
[33] L. van der Maaten and G. Hinton, “Visualizing data using t-SNE,”
J. Mach. Learn. Res., vol. 9, no. 11, pp. 2579–2605, Nov. 2008.
[34] J. C. Bezdek and R. J. Hathaway, “Some notes on alternating optimiza-
tion,” in Proc. Adv. Soft Comput. (AFSS), 2002, pp. 288–300.
[35] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge, U.K.:
Cambridge Univ. Press, 2004.
[36] W. Zheng, Y. Qian, and H. Tang, “Dimensionality reduction with
category information fusion and non-negative matrix factorization for
text categorization,” in Proc. 3rd Int. Conf. Artif. Intell. Comput. Intell.,
2011, pp. 505–512.
[37] D. D. Lee and H. S. Seung, “Algorithms for non-negative matrix
factorization,” in Proc. Neural Inf. Process. Syst., 2000, pp. 556–562.
[38] S. Baluja and M. Covell, “Learning to hash: Forgiving hash functions
and applications,” Data Mining Knowl. Discovery, vol. 17, no. 3,
pp. 402–430, Dec. 2008.
[39] C. E. Shannon, “A mathematical theory of communication,” ACM
SIGMOBILE Mobile Comput. Commun. Rev., vol. 5, no. 1, pp. 3–55,
Jan. 2001.
[40] D. Zhang, J. Wang, D. Cai, and J. Lu, “Self-taught hashing for fast
similarity search,” in Proc. ACM SIGIR Conf. Res. Develop. Inf. Retr.,
2010, pp. 18–25.
[41] Y. Lin, R. Jin, D. Cai, S. Yan, and X. Li, “Compressed hashing,” in Proc.
IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2013, pp. 446–451.
[42] D. Cai, X. He, and J. Han, “Spectral regression for efficient regular-
ized subspace learning,” in Proc. IEEE 11th Int. Conf. Comput. Vis.,
Oct. 2007, pp. 1–8.
[43] D. W. Hosmer, Jr., and S. Lemeshow, Applied Logistic Regression.
New York, NY, USA: Wiley, 2004.
[44] G. Griffin, A. Holub, and P. Perona, “Caltech-256 object cat-
egory dataset,” California Inst. Technol., Pasadena, CA, USA,
Tech. Rep. CaltechAUTHORS:CNS-TR-2007-001, 2007.
[45] A. Krizhevsky and G. Hinton, “Learning multiple layers of features from
tiny images,” Dept. Comput. Sci., Univ. Toronto, Toronto, ON, Canada,
Tech. Rep. TR-2009, 2009.
[46] A. Torralba, R. Fergus, and W. T. Freeman, “80 million tiny images:
A large data set for nonparametric object and scene recognition,” IEEE
Trans. Pattern Anal. Mach. Intell., vol. 30, no. 11, pp. 1958–1970,
Nov. 2008.
[47] A. Oliva and A. Torralba, “Modeling the shape of the scene: A holistic
representation of the spatial envelope,” Int. J. Comput. Vis., vol. 42,
no. 3, pp. 145–175, May 2001.
[48] N. Dalal and B. Triggs, “Histograms of oriented gradients for human
detection,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit.,
Jun. 2005, pp. 886–893.
[49] T. Ahonen, A. Hadid, and M. Pietikäinen, “Face recognition with
local binary patterns,” in Proc. 8th Eur. Conf. Comput. Vis., 2004,
pp. 469–481.
Li Liu received the B.Eng. degree in electronic
information engineering from Xi’an Jiaotong Uni-
versity, Xi’an, China, in 2011, and the Ph.D. degree
from the Department of Electronic and Electrical
Engineering, University of Sheffield, Sheffield, U.K.,
in 2014. He is currently a Research Fellow with the
Department of Computer Science and Digital Tech-
nologies, Northumbria University, Newcastle upon
Tyne, U.K. His research interests include computer
vision, machine learning, and data mining.
Mengyang Yu (S’14) received the B.S. and
M.S. degrees from the School of Mathematical
Sciences, Peking University, Beijing, China, in 2010
and 2013, respectively. He is currently pursuing
the Ph.D. degree with the Department of Computer
Science and Digital Technologies, Northumbria Uni-
versity, Newcastle upon Tyne, U.K. His research
interests include computer vision, machine learning,
and data mining.
Ling Shao (M’09–SM’10) is currently a Professor
with the Department of Computer Science and
Digital Technologies, Northumbria University,
Newcastle upon Tyne, U.K. He was a Senior Lec-
turer with the Department of Electronic and Electri-
cal Engineering, University of Sheffield, Sheffield,
U.K., from 2009 to 2014, and a Senior Scientist
with Philips Research, Eindhoven, The Netherlands,
from 2005 to 2009. His research interests include
computer vision, image/video processing, and
machine learning. He is an Associate Editor of the
IEEE TRANSACTIONS ON IMAGE PROCESSING, the IEEE TRANSACTIONS
ON CYBERNETICS, and several other journals. He is a fellow of the British
Computer Society and the Institution of Engineering and Technology.
Defeating Jamming With the Power of Silence: A Game-Theoretic AnalysisDefeating Jamming With the Power of Silence: A Game-Theoretic Analysis
Defeating Jamming With the Power of Silence: A Game-Theoretic Analysis1crore projects
 
Optimal Configuration of Network Coding in Ad Hoc Networks
Optimal Configuration of Network Coding in Ad Hoc NetworksOptimal Configuration of Network Coding in Ad Hoc Networks
Optimal Configuration of Network Coding in Ad Hoc Networks1crore projects
 
Privacy-Preserving and Truthful Detection of Packet Dropping Attacks in Wirel...
Privacy-Preserving and Truthful Detection of Packet Dropping Attacks in Wirel...Privacy-Preserving and Truthful Detection of Packet Dropping Attacks in Wirel...
Privacy-Preserving and Truthful Detection of Packet Dropping Attacks in Wirel...1crore projects
 

Viewers also liked (20)

Collision Tolerant and Collision Free Packet Scheduling for Underwater Acoust...
Collision Tolerant and Collision Free Packet Scheduling for Underwater Acoust...Collision Tolerant and Collision Free Packet Scheduling for Underwater Acoust...
Collision Tolerant and Collision Free Packet Scheduling for Underwater Acoust...
 
Control Cloud Data Access Privilege and Anonymity with Fully Anonymous Attrib...
Control Cloud Data Access Privilege and Anonymity with Fully Anonymous Attrib...Control Cloud Data Access Privilege and Anonymity with Fully Anonymous Attrib...
Control Cloud Data Access Privilege and Anonymity with Fully Anonymous Attrib...
 
Enabling Fine-grained Multi-keyword Search Supporting Classified Sub-dictiona...
Enabling Fine-grained Multi-keyword Search Supporting Classified Sub-dictiona...Enabling Fine-grained Multi-keyword Search Supporting Classified Sub-dictiona...
Enabling Fine-grained Multi-keyword Search Supporting Classified Sub-dictiona...
 
Cloud-Based Multimedia Content Protection System
Cloud-Based Multimedia Content Protection SystemCloud-Based Multimedia Content Protection System
Cloud-Based Multimedia Content Protection System
 
An Attribute-Assisted Reranking Model for Web Image Search
An Attribute-Assisted Reranking Model for Web Image SearchAn Attribute-Assisted Reranking Model for Web Image Search
An Attribute-Assisted Reranking Model for Web Image Search
 
Eye Gaze Tracking With a Web Camera in a Desktop Environment
Eye Gaze Tracking With a Web Camera in a Desktop EnvironmentEye Gaze Tracking With a Web Camera in a Desktop Environment
Eye Gaze Tracking With a Web Camera in a Desktop Environment
 
Energy-aware Load Balancing and Application Scaling for the Cloud Ecosystem
Energy-aware Load Balancing and Application Scaling for the Cloud EcosystemEnergy-aware Load Balancing and Application Scaling for the Cloud Ecosystem
Energy-aware Load Balancing and Application Scaling for the Cloud Ecosystem
 
SelCSP: A Framework to Facilitate Selection of Cloud Service Providers
SelCSP: A Framework to Facilitate Selection of Cloud Service ProvidersSelCSP: A Framework to Facilitate Selection of Cloud Service Providers
SelCSP: A Framework to Facilitate Selection of Cloud Service Providers
 
Key-Recovery Attacks on KIDS, a Keyed Anomaly Detection System
Key-Recovery Attacks on KIDS, a Keyed Anomaly Detection SystemKey-Recovery Attacks on KIDS, a Keyed Anomaly Detection System
Key-Recovery Attacks on KIDS, a Keyed Anomaly Detection System
 
Steganography Using Reversible Texture Synthesis
Steganography Using Reversible Texture SynthesisSteganography Using Reversible Texture Synthesis
Steganography Using Reversible Texture Synthesis
 
Query Aware Determinization of Uncertain Objects
Query Aware Determinization of Uncertain ObjectsQuery Aware Determinization of Uncertain Objects
Query Aware Determinization of Uncertain Objects
 
Improved Privacy-Preserving P2P Multimedia Distribution Based on Recombined F...
Improved Privacy-Preserving P2P Multimedia Distribution Based on Recombined F...Improved Privacy-Preserving P2P Multimedia Distribution Based on Recombined F...
Improved Privacy-Preserving P2P Multimedia Distribution Based on Recombined F...
 
Universal Network Coding-Based Opportunistic Routing for Unicast
Universal Network Coding-Based Opportunistic Routing for UnicastUniversal Network Coding-Based Opportunistic Routing for Unicast
Universal Network Coding-Based Opportunistic Routing for Unicast
 
Privacy Policy Inference of User-Uploaded Images on Content Sharing Sites
Privacy Policy Inference of User-Uploaded Images on Content Sharing SitesPrivacy Policy Inference of User-Uploaded Images on Content Sharing Sites
Privacy Policy Inference of User-Uploaded Images on Content Sharing Sites
 
Learning Fingerprint Reconstruction: From Minutiae to Image
Learning Fingerprint Reconstruction: From Minutiae to ImageLearning Fingerprint Reconstruction: From Minutiae to Image
Learning Fingerprint Reconstruction: From Minutiae to Image
 
Identity-Based Distributed Provable Data Possession in Multicloud Storage
Identity-Based Distributed Provable Data Possession in Multicloud StorageIdentity-Based Distributed Provable Data Possession in Multicloud Storage
Identity-Based Distributed Provable Data Possession in Multicloud Storage
 
Location-Aware and Personalized Collaborative Filtering for Web Service Recom...
Location-Aware and Personalized Collaborative Filtering for Web Service Recom...Location-Aware and Personalized Collaborative Filtering for Web Service Recom...
Location-Aware and Personalized Collaborative Filtering for Web Service Recom...
 
Defeating Jamming With the Power of Silence: A Game-Theoretic Analysis
Defeating Jamming With the Power of Silence: A Game-Theoretic AnalysisDefeating Jamming With the Power of Silence: A Game-Theoretic Analysis
Defeating Jamming With the Power of Silence: A Game-Theoretic Analysis
 
Optimal Configuration of Network Coding in Ad Hoc Networks
Optimal Configuration of Network Coding in Ad Hoc NetworksOptimal Configuration of Network Coding in Ad Hoc Networks
Optimal Configuration of Network Coding in Ad Hoc Networks
 
Privacy-Preserving and Truthful Detection of Packet Dropping Attacks in Wirel...
Privacy-Preserving and Truthful Detection of Packet Dropping Attacks in Wirel...Privacy-Preserving and Truthful Detection of Packet Dropping Attacks in Wirel...
Privacy-Preserving and Truthful Detection of Packet Dropping Attacks in Wirel...
 

Similar to Multiview Alignment Hashing for Efficient Image Search

Multiview alignment hashing for
Multiview alignment hashing forMultiview alignment hashing for
Multiview alignment hashing forjpstudcorner
 
Volume 2-issue-6-1930-1932
Volume 2-issue-6-1930-1932Volume 2-issue-6-1930-1932
Volume 2-issue-6-1930-1932Editor IJARCET
 
Emr a scalable graph based ranking model for content-based image retrieval
Emr a scalable graph based ranking model for content-based image retrievalEmr a scalable graph based ranking model for content-based image retrieval
Emr a scalable graph based ranking model for content-based image retrievalPvrtechnologies Nellore
 
An Ensemble Approach To Improve Homomorphic Encrypted Data Classification Per...
An Ensemble Approach To Improve Homomorphic Encrypted Data Classification Per...An Ensemble Approach To Improve Homomorphic Encrypted Data Classification Per...
An Ensemble Approach To Improve Homomorphic Encrypted Data Classification Per...IJCI JOURNAL
 
Survey on Location Based Recommendation System Using POI
Survey on Location Based Recommendation System Using POISurvey on Location Based Recommendation System Using POI
Survey on Location Based Recommendation System Using POIIRJET Journal
 
Memory based recognition for 3 d object-kunal
Memory based recognition for 3 d object-kunalMemory based recognition for 3 d object-kunal
Memory based recognition for 3 d object-kunalKunal Kishor Nirala
 
Heterogeneous Information Network Embedding for Recommendation
Heterogeneous Information Network Embedding for RecommendationHeterogeneous Information Network Embedding for Recommendation
Heterogeneous Information Network Embedding for RecommendationJAYAPRAKASH JPINFOTECH
 
Heterogeneous Information Network Embedding for Recommendation
Heterogeneous Information Network Embedding for RecommendationHeterogeneous Information Network Embedding for Recommendation
Heterogeneous Information Network Embedding for RecommendationJAYAPRAKASH JPINFOTECH
 
IEEE 2015 Matlab Projects
IEEE 2015 Matlab ProjectsIEEE 2015 Matlab Projects
IEEE 2015 Matlab ProjectsVijay Karan
 
Model Evaluation in the land of Deep Learning
Model Evaluation in the land of Deep LearningModel Evaluation in the land of Deep Learning
Model Evaluation in the land of Deep LearningPramit Choudhary
 
Design of an Effective Method for Image Retrieval
Design of an Effective Method for Image RetrievalDesign of an Effective Method for Image Retrieval
Design of an Effective Method for Image RetrievalAM Publications
 
EMR: A Scalable Graph-based Ranking Model for Content-based Image Retrieval
EMR: A Scalable Graph-based Ranking Model for Content-based Image RetrievalEMR: A Scalable Graph-based Ranking Model for Content-based Image Retrieval
EMR: A Scalable Graph-based Ranking Model for Content-based Image Retrieval1crore projects
 
A Novel Approach for Travel Package Recommendation Using Probabilistic Matrix...
A Novel Approach for Travel Package Recommendation Using Probabilistic Matrix...A Novel Approach for Travel Package Recommendation Using Probabilistic Matrix...
A Novel Approach for Travel Package Recommendation Using Probabilistic Matrix...IJSRD
 
Expandable bayesian
Expandable bayesianExpandable bayesian
Expandable bayesianAhmad Amri
 
IEEE 2015 Matlab Projects
IEEE 2015 Matlab ProjectsIEEE 2015 Matlab Projects
IEEE 2015 Matlab ProjectsVijay Karan
 
MAP/REDUCE DESIGN AND IMPLEMENTATION OF APRIORIALGORITHM FOR HANDLING VOLUMIN...
MAP/REDUCE DESIGN AND IMPLEMENTATION OF APRIORIALGORITHM FOR HANDLING VOLUMIN...MAP/REDUCE DESIGN AND IMPLEMENTATION OF APRIORIALGORITHM FOR HANDLING VOLUMIN...
MAP/REDUCE DESIGN AND IMPLEMENTATION OF APRIORIALGORITHM FOR HANDLING VOLUMIN...acijjournal
 
Integrated Hidden Markov Model and Kalman Filter for Online Object Tracking
Integrated Hidden Markov Model and Kalman Filter for Online Object TrackingIntegrated Hidden Markov Model and Kalman Filter for Online Object Tracking
Integrated Hidden Markov Model and Kalman Filter for Online Object Trackingijsrd.com
 

Similar to Multiview Alignment Hashing for Efficient Image Search (20)

Multiview alignment hashing for
Multiview alignment hashing forMultiview alignment hashing for
Multiview alignment hashing for
 
Volume 2-issue-6-1930-1932
Volume 2-issue-6-1930-1932Volume 2-issue-6-1930-1932
Volume 2-issue-6-1930-1932
 
Emr a scalable graph based ranking model for content-based image retrieval
Emr a scalable graph based ranking model for content-based image retrievalEmr a scalable graph based ranking model for content-based image retrieval
Emr a scalable graph based ranking model for content-based image retrieval
 
An Ensemble Approach To Improve Homomorphic Encrypted Data Classification Per...
An Ensemble Approach To Improve Homomorphic Encrypted Data Classification Per...An Ensemble Approach To Improve Homomorphic Encrypted Data Classification Per...
An Ensemble Approach To Improve Homomorphic Encrypted Data Classification Per...
 
Survey on Location Based Recommendation System Using POI
Survey on Location Based Recommendation System Using POISurvey on Location Based Recommendation System Using POI
Survey on Location Based Recommendation System Using POI
 
Memory based recognition for 3 d object-kunal
Memory based recognition for 3 d object-kunalMemory based recognition for 3 d object-kunal
Memory based recognition for 3 d object-kunal
 
Heterogeneous Information Network Embedding for Recommendation
Heterogeneous Information Network Embedding for RecommendationHeterogeneous Information Network Embedding for Recommendation
Heterogeneous Information Network Embedding for Recommendation
 
Heterogeneous Information Network Embedding for Recommendation
Heterogeneous Information Network Embedding for RecommendationHeterogeneous Information Network Embedding for Recommendation
Heterogeneous Information Network Embedding for Recommendation
 
IEEE 2015 Matlab Projects
IEEE 2015 Matlab ProjectsIEEE 2015 Matlab Projects
IEEE 2015 Matlab Projects
 
Model Evaluation in the land of Deep Learning
Model Evaluation in the land of Deep LearningModel Evaluation in the land of Deep Learning
Model Evaluation in the land of Deep Learning
 
Design of an Effective Method for Image Retrieval
Design of an Effective Method for Image RetrievalDesign of an Effective Method for Image Retrieval
Design of an Effective Method for Image Retrieval
 
EMR: A Scalable Graph-based Ranking Model for Content-based Image Retrieval
EMR: A Scalable Graph-based Ranking Model for Content-based Image RetrievalEMR: A Scalable Graph-based Ranking Model for Content-based Image Retrieval
EMR: A Scalable Graph-based Ranking Model for Content-based Image Retrieval
 
A Novel Approach for Travel Package Recommendation Using Probabilistic Matrix...
A Novel Approach for Travel Package Recommendation Using Probabilistic Matrix...A Novel Approach for Travel Package Recommendation Using Probabilistic Matrix...
A Novel Approach for Travel Package Recommendation Using Probabilistic Matrix...
 
Expandable bayesian
Expandable bayesianExpandable bayesian
Expandable bayesian
 
IEEE 2015 Matlab Projects
IEEE 2015 Matlab ProjectsIEEE 2015 Matlab Projects
IEEE 2015 Matlab Projects
 
20120140505011
2012014050501120120140505011
20120140505011
 
MAP/REDUCE DESIGN AND IMPLEMENTATION OF APRIORIALGORITHM FOR HANDLING VOLUMIN...
MAP/REDUCE DESIGN AND IMPLEMENTATION OF APRIORIALGORITHM FOR HANDLING VOLUMIN...MAP/REDUCE DESIGN AND IMPLEMENTATION OF APRIORIALGORITHM FOR HANDLING VOLUMIN...
MAP/REDUCE DESIGN AND IMPLEMENTATION OF APRIORIALGORITHM FOR HANDLING VOLUMIN...
 
G44093135
G44093135G44093135
G44093135
 
61_Empirical
61_Empirical61_Empirical
61_Empirical
 
Integrated Hidden Markov Model and Kalman Filter for Online Object Tracking
Integrated Hidden Markov Model and Kalman Filter for Online Object TrackingIntegrated Hidden Markov Model and Kalman Filter for Online Object Tracking
Integrated Hidden Markov Model and Kalman Filter for Online Object Tracking
 

Recently uploaded

Python Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxPython Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxRamakrishna Reddy Bijjam
 
Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)Jisc
 
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptxHMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptxmarlenawright1
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfagholdier
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfAdmir Softic
 
How to Add New Custom Addons Path in Odoo 17
How to Add New Custom Addons Path in Odoo 17How to Add New Custom Addons Path in Odoo 17
How to Add New Custom Addons Path in Odoo 17Celine George
 
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptxHMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptxEsquimalt MFRC
 
How to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POSHow to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POSCeline George
 
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...Nguyen Thanh Tu Collection
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.MaryamAhmad92
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17Celine George
 
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdfUGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdfNirmal Dwivedi
 
Google Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptxGoogle Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptxDr. Sarita Anand
 
Plant propagation: Sexual and Asexual propapagation.pptx
Plant propagation: Sexual and Asexual propapagation.pptxPlant propagation: Sexual and Asexual propapagation.pptx
Plant propagation: Sexual and Asexual propapagation.pptxUmeshTimilsina1
 
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptxOn_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptxPooja Bhuva
 
Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Jisc
 
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxBasic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxDenish Jangid
 
REMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptxREMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptxDr. Ravikiran H M Gowda
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.christianmathematics
 
FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024Elizabeth Walsh
 

Recently uploaded (20)

Python Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxPython Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docx
 
Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)
 
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptxHMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdf
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
How to Add New Custom Addons Path in Odoo 17
How to Add New Custom Addons Path in Odoo 17How to Add New Custom Addons Path in Odoo 17
How to Add New Custom Addons Path in Odoo 17
 
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptxHMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
 
How to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POSHow to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POS
 
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17
 
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdfUGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
 
Google Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptxGoogle Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptx
 
Plant propagation: Sexual and Asexual propapagation.pptx
Plant propagation: Sexual and Asexual propapagation.pptxPlant propagation: Sexual and Asexual propapagation.pptx
Plant propagation: Sexual and Asexual propapagation.pptx
 
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptxOn_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
 
Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)
 
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxBasic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
 
REMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptxREMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptx
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.
 
FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024
 

Multiview Alignment Hashing for Efficient Image Search

  • 1. 956 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 24, NO. 3, MARCH 2015 Multiview Alignment Hashing for Efficient Image Search Li Liu, Mengyang Yu, Student Member, IEEE, and Ling Shao, Senior Member, IEEE Abstract—Hashing is a popular and efficient method for nearest neighbor search in large-scale data spaces by embedding high-dimensional feature descriptors into a similarity preserving Hamming space with a low dimension. For most hashing methods, the performance of retrieval heavily depends on the choice of the high-dimensional feature descriptor. Furthermore, a single type of feature cannot be descriptive enough for different images when it is used for hashing. Thus, how to combine multiple representations for learning effective hashing functions is an imminent task. In this paper, we present a novel unsupervised multiview alignment hashing approach based on regularized kernel nonnegative matrix factorization, which can find a compact representation uncovering the hidden semantics and simultaneously respecting the joint probability distribution of data. In particular, we aim to seek a matrix factorization to effectively fuse the multiple information sources meanwhile discarding the feature redundancy. Since the raised problem is regarded as nonconvex and discrete, our objective function is then optimized via an alternate way with relaxation and converges to a locally optimal solution. After finding the low-dimensional representation, the hashing functions are finally obtained through multivariable logistic regression. The proposed method is systematically evaluated on three data sets: 1) Caltech-256; 2) CIFAR-10; and 3) CIFAR-20, and the results show that our method significantly outperforms the state-of-the-art multiview hashing techniques. Index Terms—Hashing, multiview, NMF, alternate optimization, logistic regression, image similarity search. I. INTRODUCTION LEARNING discriminative embedding has been a critical problem in many fields of information processing and analysis, such as object recognition [1], [2], image/video retrieval [3], [4] and visual detection [5]. Among them, scalable retrieval of similar visual information is attractive, since with the advances of computer technologies and the development of the World Wide Web, a huge amount of digital data has been generated and applied. The most basic but essential scheme for similarity search is the nearest neigh- bor (NN) search: given a query image, to find an image that is most similar to it within a large database and assign the same label of the nearest neighbor to this query image. Manuscript received May 25, 2014; revised January 5, 2015; accepted January 5, 2015. Date of publication January 12, 2015; date of current version January 30, 2015. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Jie Liang. (Corresponding author: Ling Shao) The authors are with the Department of Computer Science and Digital Technologies, Northumbria University, Newcastle upon Tyne NE1 8ST, U.K. (e-mail: li2.liu@northumbria.ac.uk; m.y.yu@ieee.org; ling.shao@ northumbria.ac.uk). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TIP.2015.2390975 NN search is regarded as a linear search scheme (O(N)), which is not scalable due to the large sample size in datasets of practical applications. 
schemes are proposed to partition the data space via various tree structures. Among them, KD-tree and R-tree [6] have been successfully applied to index data for fast query responses. However, these methods cannot cope with high-dimensional data and do not guarantee a faster search than a linear scan. In fact, most vision-based tasks suffer from the curse of dimensionality (the effectiveness and efficiency of such methods drop exponentially as the dimensionality increases), because visual descriptors usually have hundreds or even thousands of dimensions. Thus, hashing schemes have been proposed to effectively embed data from a high-dimensional feature space into a similarity-preserving low-dimensional Hamming space, in which an approximate nearest neighbor of a given query can be found with sub-linear time complexity.

One of the most well-known similarity-preserving hashing techniques is Locality-Sensitive Hashing (LSH) [7]. LSH simply employs random linear projections (followed by random thresholding) to map data points that are close in a Euclidean space to similar codes. Spectral Hashing (SpH) [8] is a representative unsupervised hashing method, in which the Laplace-Beltrami eigenfunctions of manifolds are used to determine binary codes. Moreover, principled linear projections such as PCA Hashing (PCAH) [9] have been suggested for better quantization than random-projection hashing. Another popular approach, Anchor Graph Hashing (AGH) [10], learns compact binary codes via tractable low-rank adjacency matrices and allows constant-time hashing of a new data point by extrapolating graph Laplacian eigenvectors to eigenfunctions. More relevant hashing methods can be found in [11]–[13].
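For intuition, the LSH scheme just described can be sketched in a few lines of NumPy. This is only an illustration of the baseline idea, not part of the proposed method; the function name lsh_codes and the choice of zero as the threshold are our own assumptions.

```python
import numpy as np

def lsh_codes(X, n_bits, seed=0):
    """Random-projection LSH: sign-threshold random linear projections so that
    points close in Euclidean space tend to receive similar binary codes."""
    rng = np.random.default_rng(seed)
    R = rng.standard_normal((X.shape[1], n_bits))  # one random direction per bit
    return (X @ R > 0).astype(np.uint8)            # random thresholding at zero
```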
However, previous explorations of hashing have focused mainly on single-view hashing, in which only one type of feature descriptor is used to learn the hashing functions. In practice, to obtain a more comprehensive description, objects/images are usually represented by several different kinds of features, each with its own characteristics. It is therefore desirable to incorporate these heterogeneous feature descriptors into the learning of hashing functions, which leads to multiview hashing approaches. Multiview learning techniques [14]–[17] have been well explored in the past few years and widely applied to visual information fusion [18]–[20]. Recently, a number of multiview hashing methods have been proposed for efficient similarity search, such as Multi-View Anchor Graph Hashing (MVAGH) [21], Sequential Update for Multi-View Spectral Hashing (SU-MVSH) [22], Multi-View Hashing (MVH-CS) [23], Composite Hashing with Multiple Information Sources (CHMIS) [24] and Deep Multi-view Hashing (DMVH) [25]. These methods mainly depend on spectral, graph or deep learning techniques to achieve structure-preserving encoding. Nevertheless, hashing schemes built purely on such techniques are usually sensitive to data noise and suffer from high computational complexity.

These drawbacks of prior work motivate us to propose a novel unsupervised multiview hashing approach, termed Multiview Alignment Hashing (MAH), which can effectively fuse multiple information sources and exploit a discriminative low-dimensional embedding via Nonnegative Matrix Factorization (NMF) [26]. NMF is a popular method in data mining tasks including clustering, collaborative filtering and outlier detection. Unlike other embedding methods, which allow both positive and negative values, NMF learns a nonnegative parts-based representation that gives a better visual interpretation of the factoring matrices for high-dimensional data. Therefore, in many cases NMF is more suitable for subspace learning, because it provides a non-global basis set which intuitively contains the localized parts of objects [26]. In addition, since the flexibility of matrix factorization can handle widely varying data distributions, NMF enables more robust subspace learning. When NMF is applied to multiview fusion tasks, a parts-based representation can reduce the corruption between any two views and yield more discriminative codes. To the best of our knowledge, this is the first work using NMF to combine multiple views for image hashing.

The main contributions of the proposed method are as follows:
• MAH can find a compact representation that uncovers the hidden semantics from the different views and simultaneously respects the joint probability distribution of the data.
• To solve our nonconvex objective function, a new alternate optimization procedure is proposed to obtain the final solution.
• We utilize multivariable logistic regression to generate the hashing function and achieve the out-of-sample extension.

The rest of this paper is organized as follows. Section II gives a brief review of NMF. The details of our method are described in Section III, and Section IV reports the experimental results. Finally, we conclude the paper in Section V.

II. A BRIEF REVIEW OF NMF

In this section, we review the algorithms most related to our work, focusing on Nonnegative Matrix Factorization (NMF) and its variants. NMF is proposed to learn the nonnegative parts of objects. Given a nonnegative data matrix $X = [x_1, \ldots, x_N] \in \mathbb{R}^{D \times N}_{\geq 0}$, in which each column is a sample, NMF aims to find two nonnegative matrices $U \in \mathbb{R}^{D \times d}_{\geq 0}$ and $V \in \mathbb{R}^{d \times N}_{\geq 0}$ of full rank whose product approximates the original matrix, i.e., $X \approx UV$. In practice, we always have $d < \min(D, N)$. Thus, we minimize the objective function

$\mathcal{L}_{NMF} = \|X - UV\|^2, \quad \text{s.t.} \ U, V \geq 0,$ (1)

where $\|\cdot\|$ denotes the Frobenius norm.
To optimize this objective function, an iterative updating procedure was developed in [26]:

$V_{ij} \leftarrow \frac{(U^T X)_{ij}}{(U^T U V)_{ij}}\, V_{ij}, \qquad U_{ij} \leftarrow \frac{(X V^T)_{ij}}{(U V V^T)_{ij}}\, U_{ij},$ (2)

followed by the normalization

$U_{ij} \leftarrow \frac{U_{ij}}{\sum_i U_{ij}}.$ (3)

It has been proved that this updating procedure finds a local minimum of $\mathcal{L}_{NMF}$. The matrix V obtained by NMF is regarded as the low-dimensional representation, while the matrix U denotes the basis matrix.

Several variants of NMF exist. Local NMF (LNMF) [27] imposes a spatial localization constraint on the bases. Sparse NMF was proposed in [28], and NMF constrained with neighborhood preserving regularization (NPNMF) [29] was developed later. Graph regularized NMF (GNMF) [30] effectively preserves the locality structure of the data. Beyond these methods, [31] extends the original NMF with the kernel trick to kernelized NMF (KNMF), which can extract useful features hidden in the original data through kernel-induced nonlinear mappings and can deal with data for which only the relationships (similarities or dissimilarities) between objects are known. Specifically, KNMF addresses the factorization $K \approx UV$, where K is the kernel matrix instead of the data matrix X, and (U, V) are analogous to those of standard NMF. More related to our work, a multiple kernel NMF (MKNMF) [32] approach was proposed, in which linear programming is applied to determine the combination of the different kernels.

In this paper, we present a Regularized Kernel Nonnegative Matrix Factorization (RKNMF) framework for hashing, which can effectively preserve the intrinsic probability distribution of the data and simultaneously reduce the redundancy of the low-dimensional representations. Rather than using locality-based graph regularization, we measure the joint probability of pairwise data with a Gaussian function, which is defined over all potential neighbors and has been proved to effectively resist data noise [33]. This kind of measurement captures the local structure of the high-dimensional data while also revealing global structure, such as the presence of clusters at several scales. To the best of our knowledge, this is the first time that NMF with multiview hashing has been successfully applied to feature embedding for large-scale similarity search.
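For concreteness, the multiplicative updates of Eqs. (2) and (3) can be sketched in NumPy as below. The random initialization, the fixed iteration count and the small eps guard against division by zero are our own assumptions rather than details fixed by [26].

```python
import numpy as np

def nmf(X, d, n_iter=200, eps=1e-10, seed=0):
    """Standard NMF via the multiplicative updates of Eqs. (2)-(3).

    X : (D, N) nonnegative data matrix, one sample per column.
    Returns a basis U (D, d) and a representation V (d, N) with X ~ U @ V.
    """
    D, N = X.shape
    rng = np.random.default_rng(seed)
    U, V = rng.random((D, d)), rng.random((d, N))
    for _ in range(n_iter):
        V *= (U.T @ X) / (U.T @ U @ V + eps)     # Eq. (2), V update
        U *= (X @ V.T) / (U @ V @ V.T + eps)     # Eq. (2), U update
        U /= U.sum(axis=0, keepdims=True) + eps  # Eq. (3), column normalization
    return U, V
```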
III. MULTIVIEW ALIGNMENT HASHING

Fig. 1. Illustration of the workflow of MAH. During the training phase, an alternate optimization procedure optimizes the kernel weights α and the factorization matrices (U, V) in turn. Using multivariable logistic regression, the algorithm then outputs the projection matrix P and the regression matrix Θ to generate the hash function, which are directly used in the test phase.

In this section, we introduce the proposed Multiview Alignment Hashing approach, referred to as MAH. Our goal is to learn a hash embedding function which fuses the various alignment representations from multiple sources while preserving the high-dimensional joint distribution and simultaneously obtaining near-orthogonal bases during RKNMF. Originally we need to find a binary solution; this is first relaxed to a real-valued range so that a more suitable solution can be gained, and after the alternate optimization the real-valued solutions are converted into binary codes. Fig. 1 shows the outline of the proposed MAH approach.

A. Objective Function

Given the training data of the i-th view, $X^{(i)} = [x_1^{(i)}, \ldots, x_N^{(i)}] \in \mathbb{R}^{D_i \times N}$, we construct the corresponding $N \times N$ kernel matrix $K_i$ using the heat kernel:

$K_i\big(x_p^{(i)}, x_q^{(i)}\big) = \exp\left(-\frac{\|x_p^{(i)} - x_q^{(i)}\|^2}{2\tau_i^2}\right), \quad \forall p, q,$

where $\tau_i$ is the related scale parameter. Our approach can work with any legitimate kernel; without loss of generality, the heat kernel, one of the most popular kernels, is used in the proposed method, and the discussion in the following sections focuses on it. The multiple kernel matrices $\{K_1, \ldots, K_n\}$, with $K_i \in \mathbb{R}^{N \times N}_{\geq 0}$, $\forall i$, are then computed from each view, and we define the fused kernel

$K = \sum_{i=1}^{n} \alpha_i K_i, \quad \text{s.t.} \ \sum_{i=1}^{n} \alpha_i = 1, \ \alpha_i \geq 0, \ \forall i.$

To obtain a meaningful low-dimensional matrix factorization, we constrain the binary representation $V = [v_1, \ldots, v_N]$ with a similarity probability regularization, which is utilized to preserve the data distribution of the intrinsic objective space. The optimization is expressed as

$\arg\min_{V \in \{0,1\}^{d \times N}} \ \frac{1}{2} \sum_{p,q} W_{pq}^{(i)} \|v_p - v_q\|^2,$ (4)

where $W_{pq}^{(i)} = \Pr\big(x_p^{(i)}, x_q^{(i)}\big)$ is the symmetric joint probability between $x_p^{(i)}$ and $x_q^{(i)}$ in the i-th view feature space. We measure it with a Gaussian function [33] (our approach can work with any legitimate similarity probability measure; we focus on the Gaussian one in this paper):

$\Pr\big(x_p^{(i)}, x_q^{(i)}\big) = \begin{cases} \dfrac{\exp\big(-\|x_p^{(i)} - x_q^{(i)}\|^2 / 2\sigma_i^2\big)}{\sum_{k \neq l} \exp\big(-\|x_k^{(i)} - x_l^{(i)}\|^2 / 2\sigma_i^2\big)}, & \text{if } p \neq q, \\ 0, & \text{if } p = q, \end{cases}$ (5)

where $\sigma_i$ is the Gaussian smoothing parameter and $\|x_p^{(i)} - x_q^{(i)}\|$ is measured by the Euclidean distance. Following the framework in [30], the similarity probability regularization for the i-th view reduces to

$\arg\min_{V \in \{0,1\}^{d \times N}} \ \operatorname{Tr}(V L_i V^T),$ (6)

where $L_i = D^{(i)} - W^{(i)}$ is the Laplacian matrix, $W^{(i)} = \big(W_{pq}^{(i)}\big) \in \mathbb{R}^{N \times N}$ is the symmetric similarity matrix, and $D^{(i)}$ is a diagonal matrix with entries $D_{pp}^{(i)} = \sum_q W_{pq}^{(i)}$.
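A minimal sketch of how the per-view heat kernel and the joint-probability Laplacian of Eqs. (5) and (6) could be assembled. The helper names and the SciPy pairwise-distance shortcut are illustrative assumptions; the median-of-distances default for τ anticipates the parameter setting reported in Section IV-A.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

def heat_kernel(X, tau=None):
    """Heat-kernel matrix K_i for one view; X is (N, D_i), one sample per row."""
    d2 = squareform(pdist(X, "sqeuclidean"))
    if tau is None:  # median of pairwise distances, as chosen in Section IV-A
        tau = np.median(np.sqrt(d2[d2 > 0]))
    return np.exp(-d2 / (2.0 * tau ** 2))

def joint_prob_laplacian(X, sigma):
    """Gaussian joint probability W^(i) of Eq. (5), its degree matrix D^(i),
    and the Laplacian L_i = D^(i) - W^(i) of Eq. (6)."""
    d2 = squareform(pdist(X, "sqeuclidean"))
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)   # Pr(x_p, x_p) = 0 by definition
    W /= W.sum()               # normalize over all pairs k != l
    D = np.diag(W.sum(axis=1))
    return W, D, D - W
```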
In addition, to reduce the redundancy of the subspaces and simultaneously obtain compact bases in NMF, we want the basis matrix U to be as orthogonal as possible, i.e., $U^T U - I = 0$. Here we minimize $\|U^T U - I\|^2$ to make U near-orthogonal, since this term can be fused into the computation of the L2-norm loss function of NMF. Finally, we combine the above two constraints on U and V and construct our optimization problem as follows:

$\arg\min_{U, V, \alpha_i} \ \left\|\sum_{i=1}^{n} \alpha_i K_i - UV\right\|^2 + \gamma \sum_{i=1}^{n} \alpha_i \operatorname{Tr}(V L_i V^T) + \eta\, \|U^T U - I\|^2,$
$\text{s.t.} \ V \in \{0,1\}^{d \times N}, \ U, V \geq 0, \ \sum_{i=1}^{n} \alpha_i = 1, \ \alpha_i \geq 0, \ \forall i,$ (7)

where γ and η are two positive coefficients balancing the trade-off between the NMF approximation error and the additional constraints.

B. Alternate Optimization via Relaxation

We first relax the discreteness condition to a continuous one, i.e., the binary domain $V \in \{0,1\}^{d \times N}$ in Eq. (7) is relaxed to the real domain $V \in \mathbb{R}^{d \times N}$ while the NMF requirements (nonnegativity conditions) are kept. To the best of our knowledge, there is no direct way to obtain the global solution of a linearly constrained nonconvex optimization problem. Thus, in this paper the optimization procedure is implemented as an iterative procedure divided into two steps: $(U, V)$ and $\alpha = (\alpha_1, \ldots, \alpha_n)$ are optimized alternately [34], one being optimized while the other is fixed, with the roles switched at the next step. The iteration stops when the procedure converges.

1) Optimizing (U, V): Fixing α first, we substitute $K = \sum_{i=1}^{n} \alpha_i K_i$ and $L = \sum_{i=1}^{n} \alpha_i L_i$, and introduce the Lagrangian

$\mathcal{L}_1(U, V, \Phi, \Psi) = \|K - UV\|^2 + \gamma \operatorname{Tr}(V L V^T) + \eta\, \|U^T U - I\|^2 + \operatorname{Tr}(\Phi U^T) + \operatorname{Tr}(\Psi V^T),$ (8)

where Φ and Ψ are matrices whose entries are the Lagrangian multipliers for the constraints $U \geq 0$ and $V \geq 0$, respectively. Setting the partial derivatives of $\mathcal{L}_1$ with respect to U and V to zero, i.e., $\nabla_{U,V}\, \mathcal{L}_1 = 0$, we obtain

$\nabla_U \mathcal{L}_1 = 2\big(-K V^T + U V V^T + 2\eta\, U U^T U - 2\eta\, U\big) + \Phi = 0,$ (9)
$\nabla_V \mathcal{L}_1 = 2\big(-U^T K + U^T U V + \gamma\, V L\big) + \Psi = 0.$ (10)

Using the Karush-Kuhn-Tucker (KKT) conditions [35], since our objective function satisfies the constraint qualification condition, we have the complementary slackness conditions $\Phi_{ij} U_{ij} = 0$ and $\Psi_{ij} V_{ij} = 0$, $\forall i, j$. Multiplying the corresponding entries of Eqs. (9) and (10) by $U_{ij}$ and $V_{ij}$ gives

$\big(-K V^T + U V V^T + 2\eta\, U U^T U - 2\eta\, U\big)_{ij}\, U_{ij} = 0,$
$\big(-U^T K + U^T U V + \gamma\, V L\big)_{ij}\, V_{ij} = 0.$

Hence, following the standard NMF procedure [26], we obtain the updating rules

$U_{ij} \leftarrow \frac{\big(K V^T + 2\eta\, U\big)_{ij}}{\big(U V V^T + 2\eta\, U U^T U\big)_{ij}}\, U_{ij},$ (11)
$V_{ij} \leftarrow \frac{\big(U^T K + \gamma\, V W\big)_{ij}}{\big(U^T U V + \gamma\, V D\big)_{ij}}\, V_{ij},$ (12)

where $D = \sum_{i=1}^{n} \alpha_i D^{(i)}$ and $W = \sum_{i=1}^{n} \alpha_i W^{(i)}$. This form guarantees that all entries of U and V remain nonnegative; U is also normalized as in Eq. (3). The convergence of U is based on [36] and the convergence of V is based on [30]; similarly to [37], it can be proved that each update of U or V monotonically nonincreases the objective function.
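A hedged sketch of the regularized multiplicative updates of Eqs. (11) and (12) for a fixed α. The warm-start arguments, the iteration count and the eps stabilizer are our own additions; the update bodies follow the equations above.

```python
import numpy as np

def update_UV(K, W, D, d, gamma=0.15, eta=0.325, n_iter=100, eps=1e-10,
              U=None, V=None, seed=0):
    """One (U, V)-step of the alternate optimization, Eqs. (11)-(12).

    K, W, D : fused kernel, similarity and degree matrices (all N x N).
    U, V may be passed in to warm-start the iteration (see Section IV-A).
    """
    N = K.shape[0]
    rng = np.random.default_rng(seed)
    U = rng.random((N, d)) if U is None else U
    V = rng.random((d, N)) if V is None else V
    for _ in range(n_iter):
        U *= (K @ V.T + 2 * eta * U) / (U @ V @ V.T + 2 * eta * U @ U.T @ U + eps)
        U /= U.sum(axis=0, keepdims=True) + eps  # normalization of Eq. (3)
        V *= (U.T @ K + gamma * V @ W) / (U.T @ U @ V + gamma * V @ D + eps)
    return U, V
```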
2) Optimizing α: Owing to the quadratic norm in our objective function, the procedure for optimizing α differs from general linear programming. For fixed U and V, omitting the irrelevant terms, we define the Lagrangian

$\mathcal{L}_2(\alpha, \lambda, \beta) = \left\|\sum_{i=1}^{n} \alpha_i K_i - UV\right\|^2 + \gamma \sum_{i=1}^{n} \alpha_i \operatorname{Tr}(V L_i V^T) + \lambda \left(\sum_{i=1}^{n} \alpha_i - 1\right) + \beta \alpha^T,$ (13)

where λ and $\beta = (\beta_1, \ldots, \beta_n)$ are the Lagrangian multipliers for the constraints and $\alpha = (\alpha_1, \ldots, \alpha_n)$. Considering the partial derivatives of $\mathcal{L}_2$ with respect to α, λ and β, we need the equality $\nabla_{\alpha,\lambda}\, \mathcal{L}_2 = 0$ and the inequality $\nabla_{\beta}\, \mathcal{L}_2 \geq 0$, i.e.,

$\nabla_{\alpha_j} \mathcal{L}_2 = 2 \operatorname{Tr}\left(\Big(\sum_{i=1}^{n} \alpha_i K_i - UV\Big) K_j\right) + \gamma \operatorname{Tr}(V L_j V^T) + \lambda + \beta_j = 0, \quad 1 \leq j \leq n,$ (14)
$\nabla_{\lambda} \mathcal{L}_2 = \sum_{i=1}^{n} \alpha_i - 1 = 0,$ (15)
$\nabla_{\beta} \mathcal{L}_2 = \alpha \geq 0,$ (16)

together with the complementary slackness conditions

$\beta_j \alpha_j = 0, \quad j = 1, \ldots, n.$ (17)

Suppose $\alpha_j = 0$ for some j, and denote $J = \{\, j \mid \alpha_j = 0 \,\}$; then the optimal solution would contain some zeros, and the problem is no different from minimizing $\big\|\sum_{j \notin J} \alpha_j K_j - UV\big\|^2$ over the remaining views. Without loss of generality, we therefore suppose $\alpha_j > 0$, $\forall j$, so that $\beta = 0$. From Eq. (14) we have

$\sum_{i=1}^{n} \alpha_i \operatorname{Tr}(K_i K_j) = \operatorname{Tr}(UV K_j) - \gamma \operatorname{Tr}(V L_j V^T)/2 - \lambda/2,$ (18)
for $j = 1, \ldots, n$. Transforming these equations into matrix form and denoting $T_j = \operatorname{Tr}(UV K_j) - \gamma \operatorname{Tr}(V L_j V^T)/2$, we obtain

$\begin{pmatrix} \operatorname{Tr}(K_1^2) & \operatorname{Tr}(K_2 K_1) & \cdots & \operatorname{Tr}(K_n K_1) \\ \operatorname{Tr}(K_1 K_2) & \operatorname{Tr}(K_2^2) & \cdots & \operatorname{Tr}(K_n K_2) \\ \vdots & \vdots & \ddots & \vdots \\ \operatorname{Tr}(K_1 K_n) & \operatorname{Tr}(K_2 K_n) & \cdots & \operatorname{Tr}(K_n^2) \end{pmatrix} \begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_n \end{pmatrix} = \begin{pmatrix} T_1 - \lambda/2 \\ T_2 - \lambda/2 \\ \vdots \\ T_n - \lambda/2 \end{pmatrix}.$ (19)

Let us denote Eq. (19) by $A \alpha^T = B$. Note that A is the Gram matrix of the $K_i$ under the Frobenius inner product, $\langle K_i, K_j \rangle = \operatorname{Tr}(K_i K_j^T) = \operatorname{Tr}(K_i K_j)$. Let $M = (\operatorname{vec}(K_1), \ldots, \operatorname{vec}(K_n))$, where $\operatorname{vec}(K_i)$ is the vectorization of $K_i$; then $A = M^T M$ and $\operatorname{rank}(A) = \operatorname{rank}(M^T M) = \operatorname{rank}(M) = n$, under the assumption that the kernel matrices $K_1, \ldots, K_n$ from the n different views are linearly independent. Combining this with the constraint of Eq. (15) and eliminating λ by subtracting the n-th row of Eq. (19) from each of the others, we obtain the linear system

$\begin{pmatrix} \operatorname{Tr}(K_1^2) - \operatorname{Tr}(K_1 K_n) & \cdots & \operatorname{Tr}(K_n K_1) - \operatorname{Tr}(K_n^2) \\ \vdots & \ddots & \vdots \\ \operatorname{Tr}(K_1 K_{n-1}) - \operatorname{Tr}(K_1 K_n) & \cdots & \operatorname{Tr}(K_n K_{n-1}) - \operatorname{Tr}(K_n^2) \\ 1 & \cdots & 1 \end{pmatrix} \begin{pmatrix} \alpha_1 \\ \vdots \\ \alpha_n \end{pmatrix} = \begin{pmatrix} T_1 - T_n \\ \vdots \\ T_{n-1} - T_n \\ 1 \end{pmatrix},$ (20)

which we denote by $\tilde{A} \alpha^T = \tilde{B}$. According to the variety of the different views and the all-ones row $(1, \ldots, 1)$, all rows of $\tilde{A}$ are linearly independent, so $\operatorname{rank}(\tilde{A}) = \operatorname{rank}(A) - 1 + 1 = n$. Thus the inverse of $\tilde{A}$ exists and $\alpha^T = \tilde{A}^{-1} \tilde{B}$.

3) Global Convergence: Denote our original objective function in Eq. (7) by $\mathcal{L}(U, V, \alpha)$. At the m-th step, the alternate iteration computes

$(U^{(m)}, V^{(m)}) \leftarrow \arg\min_{U,V}\, \mathcal{L}(U, V, \alpha^{(m-1)}) \quad \text{and} \quad \alpha^{(m)} \leftarrow \arg\min_{\alpha}\, \mathcal{L}(U^{(m)}, V^{(m)}, \alpha).$

Thus we have the chain of inequalities

$\mathcal{L}(U^{(m-1)}, V^{(m-1)}, \alpha^{(m-1)}) \geq \mathcal{L}(U^{(m)}, V^{(m)}, \alpha^{(m-1)}) \geq \mathcal{L}(U^{(m)}, V^{(m)}, \alpha^{(m)}) \geq \mathcal{L}(U^{(m+1)}, V^{(m+1)}, \alpha^{(m)}) \geq \cdots.$

In other words, $\mathcal{L}(U^{(m)}, V^{(m)}, \alpha^{(m)})$ is monotonically nonincreasing as $m \to \infty$; since $\mathcal{L}(U, V, \alpha) \geq 0$, the alternate iteration converges. Following the above optimization procedure, we compute $(U, V)$ and α alternately, fixing one while optimizing the other, until the objective function converges. In practice, we terminate the iteration when the difference $|\mathcal{L}(U^{(m)}, V^{(m)}, \alpha^{(m)}) - \mathcal{L}(U^{(m-1)}, V^{(m-1)}, \alpha^{(m-1)})|$ is less than a small threshold or the number of iterations reaches a maximum value.
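The α-step therefore reduces to a small n × n linear solve of Eq. (20). Below is a sketch under the paper's assumption that the view kernels are linearly independent; the function name and the argument layout are illustrative.

```python
import numpy as np

def update_alpha(Ks, Ls, U, V, gamma=0.15):
    """Closed-form view-weight update via the linear system of Eq. (20).

    Ks, Ls : lists of the n per-view kernel and Laplacian matrices (N x N each).
    """
    n = len(Ks)
    # Gram matrix A of Eq. (19): A[j, i] = Tr(K_i K_j)
    G = np.array([[np.trace(Ki @ Kj) for Ki in Ks] for Kj in Ks])
    T = np.array([np.trace(U @ V @ Kj) - 0.5 * gamma * np.trace(V @ Lj @ V.T)
                  for Kj, Lj in zip(Ks, Ls)])
    # Eliminate lambda (subtract the n-th row) and append the sum-to-one constraint
    A = np.vstack([G[:n - 1] - G[-1], np.ones(n)])
    B = np.append(T[:n - 1] - T[-1], 1.0)
    return np.linalg.solve(A, B)  # alpha, assumed positive after convergence
```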
C. Hashing Function Generation

From the above section we obtain the weight vector $\alpha = (\alpha_1, \ldots, \alpha_n)$ and, further, the fused kernel matrix K and the combined joint probability Laplacian matrix L. From Eqs. (11) and (12) we then obtain the multiview RKNMF bases $U \in \mathbb{R}^{N \times d}$ and the low-dimensional representation $V \in \mathbb{R}^{d \times N}$, where $d \ll D_i$, $i = 1, \ldots, n$. We now convert the low-dimensional real-valued representation $V = [v_1, \ldots, v_N]$ into binary codes via thresholding: if the l-th element of $v_p$ is larger than the specified threshold, the mapped bit $\hat{v}_p^{(l)} = 1$; otherwise $\hat{v}_p^{(l)} = 0$, where $p = 1, \ldots, N$ and $l = 1, \ldots, d$.

According to [38], a well-designed semantic hashing should also be entropy-maximized to ensure efficiency. Moreover, from information theory [39], the maximal entropy of a source alphabet is attained by a uniform probability distribution. If the entropy of the codes over the bit-string is small, documents are mapped to only a small portion of the codes (hash bins), rendering the hash table inefficient. In our method, we set the threshold for the l-th bit $\hat{v}_p^{(l)}$ as the median value of $v_p^{(l)}$ over the training set, which meets the entropy-maximizing criterion mentioned above: each bit is 1 for half of the training samples and 0 for the other half. This scheme gives each distinct binary code roughly equal probability of occurring in the data collection and hence achieves the best utilization of the hash table. The binary codes are represented as $\hat{V} = [\hat{v}_1, \ldots, \hat{v}_N]$, where $\hat{v}_p \in \{0,1\}^d$, $p = 1, \ldots, N$. A similar scheme has been used in [40] and [41].
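The median-thresholding rule can be written in a couple of lines; a sketch assuming V stores one bit dimension per row, so each row is thresholded at its own median.

```python
import numpy as np

def binarize_median(V):
    """Entropy-maximizing binarization: threshold each row of V (one row per bit)
    at its median, so every bit is 1 for half of the training samples."""
    thresholds = np.median(V, axis=1, keepdims=True)
    return (V > thresholds).astype(np.uint8), thresholds
```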
So far we have only shown how to compute the hash codes of items in the training set; for a new sample we cannot yet explicitly find the corresponding hashing function. Inspired by [42], we determine our out-of-sample extension using a regression technique with multiple variables. Instead of the linear regression used in [42], the binary coding environment leads us to logistic regression [43], which can be treated as a probabilistic statistical classification model whose probabilities describe the possibilities of binary responses. Given n distributions $Y_i \mid X_i \sim \mathrm{Bernoulli}(p_i)$, $i = 1, \ldots, n$, and the function $\Pr(Y_i = 1 \mid X_i = x) := h_\theta(x)$ with parameter θ, the likelihood function is

$\prod_{i=1}^{n} \Pr(Y_i = y_i \mid X_i = x_i) = \prod_{i=1}^{n} h_\theta(x_i)^{y_i} \big(1 - h_\theta(x_i)\big)^{1 - y_i}.$

According to the maximum log-likelihood criterion, we define our logistic regression cost function as

$J(\Theta) = -\frac{1}{N} \left( \sum_{p=1}^{N} \Big[ \big\langle \hat{v}_p, \log\big(h_\Theta(v_p)\big) \big\rangle + \big\langle \mathbf{1} - \hat{v}_p, \log\big(\mathbf{1} - h_\Theta(v_p)\big) \big\rangle \Big] \right) + \xi \|\Theta\|^2,$ (21)

where $h_\Theta(v_p) = \left( \frac{1}{1 + e^{-(\Theta^T v_p)_i}} \right)^T_{1 \leq i \leq d}$ is the regression function for each component of $v_p$; $\log(x) = (\log(x_1), \ldots, \log(x_n))^T$ for $x = (x_1, \ldots, x_n)^T \in \mathbb{R}^n$; $\langle \cdot, \cdot \rangle$ represents the inner product; Θ is the corresponding $d \times d$ regression matrix; $\mathbf{1}$ denotes the all-ones vector; and ξ is the parameter of the regularization term $\xi \|\Theta\|^2$, which avoids overfitting. To minimize $J(\Theta)$, a standard gradient descent algorithm is employed. According to [43], the update with learning rate r can be written as

$\Theta_{t+1} = \Theta_t - r \left( \frac{1}{N} \sum_{p=1}^{N} \big(h_{\Theta_t}(v_p) - \hat{v}_p\big)\, v_p^T + \frac{\xi}{N}\, \Theta_t \right).$ (22)

We stop the update iteration when the norm of the difference $\|\Theta_{t+1} - \Theta_t\|^2$ is less than an empirically small value (i.e., convergence is reached), and then output the regression matrix Θ.

In this way, given a new sample, the related kernel matrices $\{K_1^{new}, \ldots, K_n^{new}\}$ are first computed for each view via the heat kernel, where each $K_i^{new}$ is an $N \times 1$ matrix. We then fuse these kernels with the optimized weights α, $K^{new} = \sum_{i=1}^{n} \alpha_i K_i^{new}$, and obtain the low-dimensional real-valued representation via the linear projection matrix $P = (U^T U)^{-1} U^T$, the pseudoinverse of the RKNMF basis U. Since $h_\Theta$ is a sigmoid function, the hash code of the new sample is finally calculated as

$\hat{v}^{new} = \big\lfloor h_\Theta(P \cdot K^{new}) \big\rceil,$ (23)

where $\lfloor \cdot \rceil$ is the nearest integer function applied to each entry of $h_\Theta$. Because $h_\Theta \in (0, 1)$, this binarizes $\hat{v}^{new}$ with a threshold of 0.5: if any bit of the output of $h_\Theta(P \cdot K^{new})$ is larger than 0.5, we assign "1" to that bit, and "0" otherwise. In this way, we obtain the final Multiview Alignment Hashing (MAH) codes for arbitrary data points (see Fig. 2).

Fig. 2. Illustration of the embedding via Eq. (23), i.e., the nearest integer function.

It is noteworthy that our approach can be regarded as an embedding method. In practice, we map all the training and test data in the same way to ensure that they lie in the same subspace, via Eq. (23) with $\{P, \alpha, \Theta\}$ obtained from the multiview RKNMF optimization and logistic regression. This procedure is similar to a handcrafted embedding scheme, without re-training. The integrated MAH procedure is summarized in Algorithm 1 (Multiview Alignment Hashing via RKNMF).
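A sketch of the out-of-sample machinery: gradient descent on Eq. (22) followed by the embedding of Eq. (23). The zero initialization of Θ, the tolerance and the iteration cap are our own assumptions; note that np.linalg.pinv(U) equals $P = (U^T U)^{-1} U^T$ when U has full column rank.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_regression(V, V_hat, xi=0.05, r=0.05, tol=1e-6, max_iter=5000):
    """Gradient descent of Eq. (22) for the d x d regression matrix Theta.
    V : (d, N) real-valued embeddings; V_hat : (d, N) binary targets."""
    d, N = V.shape
    Theta = np.zeros((d, d))
    for _ in range(max_iter):
        grad = ((sigmoid(Theta.T @ V) - V_hat) @ V.T) / N + (xi / N) * Theta
        Theta_new = Theta - r * grad
        if np.linalg.norm(Theta_new - Theta) ** 2 < tol:  # stopping rule of the text
            return Theta_new
        Theta = Theta_new
    return Theta

def hash_new_sample(k_news, alpha, U, Theta):
    """Eq. (23): fuse the per-view kernel columns, project, and round the sigmoid."""
    k = sum(a * ki for a, ki in zip(alpha, k_news))  # K^new, an (N,) vector
    P = np.linalg.pinv(U)                            # P = (U^T U)^{-1} U^T
    return (sigmoid(Theta.T @ (P @ k)) > 0.5).astype(np.uint8)
```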
D. Complexity Analysis

The cost of MAH learning mainly consists of two parts. The first is the construction of the heat kernels and the similarity probability regularization terms for the different views, i.e., $K_i$ and $L_i$; from Section III-A, the time complexity of this part is $O\big(2(\sum_{i=1}^{n} D_i) N^2\big)$. The second is the alternate optimization: the matrix factorization in the (U, V)-updating step has time complexity $O(N^2 d)$ according to [30], and the update of α has complexity $O(n^2 N^2)$. In total, the time complexity of MAH learning is $O\big(2(\sum_{i=1}^{n} D_i) N^2 + T \times (N^2 d + n^2 N^2)\big)$, where T is the number of iterations of the alternate optimization. Empirically, T is always less than 10, i.e., MAH converges within 10 rounds.

IV. EXPERIMENTS AND RESULTS

In this section, the MAH algorithm is evaluated on the high-dimensional nearest neighbor search problem. Three datasets are used in our experiments: Caltech-256 [44], CIFAR-10 [45] and CIFAR-20 [45]. Caltech-256 consists of 30,607 images in 256 object categories. CIFAR-10 and CIFAR-20 are both 60,000-image subsets of the 80-million tiny images dataset [46], with 10 and 20 class labels, respectively. Following the experimental setting in [21] and [22], for each dataset we randomly select 1,000 images as the query set and use the rest as the training set.

Given an image, we describe it with multiview features that capture the orientation, intensity, texture and color information, the main cues of an image. Accordingly, four descriptors are employed for image representation: 512-dim GIST [47] (Gabor filters with 8 orientations and 4 scales are applied, and each filtered image is averaged over a 4 × 4 grid, giving 8 × 4 × 16 = 512 dimensions); 1152-dim histogram of oriented gradients (HOG) [48] (36 non-overlapping 4 × 4 windows yield a 1152-dimensional vector); 256-dim local binary pattern (LBP) [49] (pixels are labeled by thresholding a 3 × 3 neighborhood, and the responses are mapped to a 256-dimensional vector); and 192-dim color histogram (ColorHist) (a 64-bin histogram per R, G, B channel, for a total of 3 × 64 = 192 dimensions).

In the test phase, a returned point is regarded as a true neighbor if it lies among the top 100, 500 and 500 points closest to a query for Caltech-256, CIFAR-10 and CIFAR-20, respectively. For each query, all data points in the database are ranked according to their Hamming distances to the query, which is fast enough with short hash codes in practice. We then evaluate the retrieval results by the Mean Average Precision (MAP) and the precision-recall curve; additionally, we report the training time and the test time (the average search time per query) for all methods. All experiments are performed using Matlab 2013a on a server with a 12-core processor and 128 GB of RAM running the Linux OS.

A. Compared Methods and Settings

We compare our method against six popular unsupervised multiview hashing algorithms: Multi-View Anchor Graph Hashing (MVAGH) [21], Sequential Update for Multi-View Spectral Hashing (SU-MVSH) [22], Multi-View Hashing (MVH-CS) [23], Composite Hashing with Multiple Information Sources (CHMIS) [24], Deep Multi-view Hashing (DMVH) [25] with a 4-layer deep net, and a derived version of MVH-CS, termed MVH-CCA, which is the special case of MVH-CS in which the averaged similarity matrix is fixed as the identity matrix [23]. Besides, we also compare our method with two state-of-the-art single-view hashing methods, i.e., Spectral Hashing (SpH) [8] and Anchor Graph Hashing (AGH) [10].
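For reference, a sketch of the evaluation protocol described above: rank the database by Hamming distance to each query and average the precision over the ground-truth neighbors. The function and argument names are illustrative, not from the paper.

```python
import numpy as np

def mean_average_precision(codes_db, codes_q, true_neighbors):
    """MAP over Hamming-ranked retrieval.

    codes_db : (N, d) binary database codes; codes_q : (Q, d) query codes.
    true_neighbors : one set of ground-truth database indices per query.
    """
    aps = []
    for q, relevant in zip(codes_q, true_neighbors):
        dist = np.count_nonzero(codes_db != q, axis=1)  # Hamming distances
        order = np.argsort(dist, kind="stable")         # ranked database indices
        hits = np.isin(order, list(relevant))
        ranks = np.flatnonzero(hits) + 1                # 1-based ranks of the hits
        precisions = np.arange(1, len(ranks) + 1) / ranks
        aps.append(precisions.mean() if len(ranks) else 0.0)
    return float(np.mean(aps))
```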
A. Compared Methods and Settings

We compare our method against six popular unsupervised multiview hashing algorithms: Multi-View Anchor Graph Hashing (MVAGH) [21], Sequential Update for Multi-View Spectral Hashing (SU-MVSH) [22], Multi-View Hashing (MVH-CS) [23], Composite Hashing with Multiple Information Sources (CHMIS) [24], Deep Multi-view Hashing (DMVH) [25] with a 4-layer deep-net, and a derived version of MVH-CS, termed MVH-CCA, which is the special case of MVH-CS in which the averaged similarity matrix is fixed as the identity matrix [23]. We also compare our method with two state-of-the-art single-view hashing methods: Spectral Hashing (SpH) [8] and Anchor Graph Hashing (AGH) [10]. For the single-view methods, the features from all views are first normalized to the same scale and then concatenated into a single representation for hash learning. All of the above methods are evaluated at six code lengths (16, 32, 48, 64, 80 and 96 bits). Under this common experimental setting, all parameters of the compared methods are chosen strictly according to their original papers.

For our MAH, we apply the Heat Kernel

$$K_i\big(x_p^{(i)}, x_q^{(i)}\big) = \exp\left( -\frac{\big\|x_p^{(i)} - x_q^{(i)}\big\|^2}{2\tau_i^2} \right), \quad \forall p, q,$$

to construct the original kernel matrices, where $\tau_i$ is set to the median of the pairwise distances between data points. The smoothing parameter $\sigma_i$ used in Eq. (5) is selected by the same scheme as $\tau_i$. The optimal learning rate $r$ for each dataset is selected from $\{0.01, 0.02, \ldots, 0.10\}$ (step 0.01) as the value yielding the best performance under 10-fold cross-validation on the training data. The three regularization parameters $\{\gamma, \eta, \xi\}$ are also chosen by cross-validation on the training set; we finally fix $\gamma = 0.15$, $\eta = 0.325$ and $\xi = 0.05$ for all three datasets.

To further speed up the convergence of the proposed alternate optimization, we apply a small warm-start trick in our experiments:
1) The first time $U$ and $V$ are computed in the $(U, V)$-optimization step of Section III-B, we use Eq. (11) and Eq. (12) with random initialization, following the original NMF procedure [26], and store the resulting $U$ and $V$.
2) In every subsequent round, we initialize the NMF algorithm with the stored $U$ and $V$ from the previous round instead of random values.
This small improvement effectively reduces the convergence time in the training phase and makes the proposed method more efficient for large-scale tasks (see the sketch after the kernel examples below).

Fig. 3 shows the retrieval performance of MAH when the four descriptors (i.e., GIST, HOG, LBP, ColorHist) are used together via different kernel combinations, compared with using single descriptors. Specifically, MAH denotes the kernels combined by the proposed method, $K = \sum_{i=1}^{n} \alpha_i K_i$, where the weights $\alpha_i$ are obtained via the alternate optimization; MAH (Average) combines the kernels by the arithmetic mean $K = \frac{1}{n} \sum_{i=1}^{n} K_i$; and MAH (Product) combines them through the geometric mean $K = \big(\prod_{i=1}^{n} K_i\big)^{1/n}$. The corresponding combinations of the similarity probability regularization terms $L_i$ follow analogous schemes. (The kernel construction and the three fusion schemes are sketched below.)
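A minimal sketch of the kernel construction with the median-distance bandwidth follows; the use of SciPy's pdist here is our own convenience, not part of the paper.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

def heat_kernel(X, tau=None):
    """Heat-kernel matrix exp(-||x_p - x_q||^2 / (2 tau^2)) for one view.

    X : N x D_i feature matrix of view i. If tau is None, it defaults to the
    median of the pairwise distances, as described above.
    """
    dist = squareform(pdist(X))                 # N x N Euclidean distances
    if tau is None:
        tau = np.median(dist[np.triu_indices_from(dist, k=1)])
    return np.exp(-dist ** 2 / (2.0 * tau ** 2))
```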
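The three fusion schemes compared in Fig. 3 can be sketched as follows. Note that we interpret the geometric mean element-wise; this is a plausible reading, since Heat-Kernel entries are positive, but it is an assumption on our part.

```python
import numpy as np

def combine_kernels(kernels, alpha):
    """Weighted (MAH), average and product fusions of the view kernels K_1..K_n."""
    n = len(kernels)
    weighted = sum(a * K for a, K in zip(alpha, kernels))  # K = sum_i alpha_i K_i
    average = sum(kernels) / n                             # K = (1/n) sum_i K_i
    product = np.ones_like(kernels[0])
    for K in kernels:
        product = product * K                              # element-wise product
    product = product ** (1.0 / n)                         # K = (prod_i K_i)^(1/n)
    return weighted, average, product
```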
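Finally, a sketch of the warm-start trick described above. Since the RKNMF updates of Eqs. (11)-(12) are not reproduced here, the inner solver below uses the standard multiplicative updates of Lee and Seung [26], [37] as a stand-in; the point is the reuse of (U, V) across rounds, not the update rule itself.

```python
import numpy as np

def nmf(K, d, U0=None, V0=None, n_iter=200, eps=1e-10):
    """Multiplicative-update NMF, K ~ U V^T (a stand-in for Eqs. (11)-(12))."""
    N = K.shape[0]
    U = np.random.rand(N, d) if U0 is None else U0
    V = np.random.rand(N, d) if V0 is None else V0
    for _ in range(n_iter):
        U *= (K @ V) / np.maximum(U @ (V.T @ V), eps)
        V *= (K.T @ U) / np.maximum(V @ (U.T @ U), eps)
    return U, V

# Demo with a random nonnegative matrix standing in for a fused kernel:
X = np.random.rand(50, 8)
K = X @ X.T
U, V = nmf(K, d=16)                  # round 1: random initialization [26]
U, V = nmf(K, d=16, U0=U, V0=V)      # later rounds: warm start from stored U, V
```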
Fig. 3. Performance (MAP) of MAH when four descriptors are used together via different kernel combinations, in comparison to that of single descriptors.

Fig. 4. Performance (MAP) comparison with different numbers of bits.

Fig. 5. The precision-recall curves of all compared algorithms on the three datasets with a code length of 96 bits.

The results on the three datasets demonstrate that integrating multiple features achieves better performance than using single features, and that the proposed weighted combination improves on the average and product schemes.

Fig. 4 illustrates the MAP curves of all compared algorithms on the three datasets. Overall, first, the retrieval accuracies on the CIFAR-10 dataset are clearly higher than those on the more complicated CIFAR-20 and Caltech-256 datasets. Second, the multiview methods consistently achieve better results than the single-view schemes.
In particular, in most cases MVAGH achieves higher performance than SU-MVSH, MVH-CS, MVH-CCA, DMVH (4 layers) and CHMIS. The performance of MVH-CS always climbs and then drops as the code length increases; the same tendency appears with CHMIS. In general, our MAH outperforms all other compared methods (see also Table I). In addition, we have compared our proposed method with various baselines (with/without the probability regularization or the orthogonal constraint) in Table II. It can be clearly observed that the proposed method significantly improves the accuracy of NMF and its variants.

Furthermore, we present the precision-recall curves of all algorithms on the three datasets with a code length of 96 bits in Fig. 5. From this figure, we can further see that MAH consistently achieves better results, as measured by the Area Under the Curve (AUC). Some examples of retrieval results on Caltech-256 are shown in Fig. 6 and Fig. 7.

Fig. 6. Some retrieval results on the Caltech-256 dataset. The top-left image in each block is the query and the rest are the 8 nearest images. Incorrect results are marked by red boxes.

Fig. 7. Comparison of retrieval results of the compared hashing methods on the Caltech-256 dataset. The top-left image in each block is the query and the rest are the 8 nearest images. Incorrect results are marked by red boxes.

The proposed algorithm aims to generate a non-linear binary code, which can fit the intrinsic data structure better than linear ones. The only existing non-linear multiview hashing algorithm is MVAGH. However, the proposed algorithm can successfully find the optimal aggregation of the kernel matrix, whereas MVAGH cannot.
TABLE I. MEAN AVERAGE PRECISION (MAP) OF 32 BITS WITH TRAINING AND TEST TIME ON THREE DATASETS.

TABLE II. COMPARISON OF DIFFERENT VARIANTS OF MAH TO PROVE THE EFFECTIVENESS OF THE IMPROVEMENT.

Therefore, it is expected that the proposed algorithm performs better than MVAGH and the existing linear multiview hashing algorithms.

Finally, we list the training and test times of the different algorithms on the three datasets in Table I. Regarding training time, DMVH involves learning a 4-layer deep-net and therefore spends the most time training its hashing functions; MVH-CCA requires the second-longest training time. Our MAH costs only slightly more time than MVAGH and CHMIS, but significantly less than the other methods. In the test phase, MVH-CS and MVH-CCA are the most efficient methods, while MAH has a search time competitive with SU-MVSH; DMVH also needs the most time for testing.

V. CONCLUSION

In this paper, we have presented a novel unsupervised hashing method called Multiview Alignment Hashing (MAH), in which hashing functions are effectively learned via kernelized Nonnegative Matrix Factorization while preserving the joint probability distribution of the data. We incorporate multiple visual features from different views, and an alternate optimization is introduced to learn the weights of the different views and simultaneously produce the low-dimensional representation. We address this as a nonconvex optimization problem whose alternate procedure finally converges to a locally optimal solution. For the out-of-sample extension, multivariable logistic regression has been successfully applied to obtain the regression matrix for fast hash encoding. Experiments have been systematically conducted on the Caltech-256, CIFAR-10 and CIFAR-20 datasets. The results show that our MAH significantly outperforms the state-of-the-art multiview hashing techniques in terms of search accuracy.

REFERENCES

[1] L. Shao, L. Liu, and X. Li, "Feature learning for image classification via multiobjective genetic programming," IEEE Trans. Neural Netw. Learn. Syst., vol. 25, no. 7, pp. 1359–1371, Jul. 2014.
[2] F. Zhu and L. Shao, "Weakly-supervised cross-domain dictionary learning for visual recognition," Int. J. Comput. Vis., vol. 109, nos. 1–2, pp. 42–59, Aug. 2014.
[3] J. Han et al., "Representing and retrieving video shots in human-centric brain imaging space," IEEE Trans. Image Process., vol. 22, no. 7, pp. 2723–2736, Jul. 2013.
[4] J. Han, K. N. Ngan, M. Li, and H.-J. Zhang, "A memory learning framework for effective image retrieval," IEEE Trans. Image Process., vol. 14, no. 4, pp. 511–524, Apr. 2005.
[5] J. Han, S. He, X. Qian, D. Wang, L. Guo, and T. Liu, "An object-oriented visual saliency detection framework based on sparse coding representations," IEEE Trans. Circuits Syst. Video Technol., vol. 23, no. 12, pp. 2009–2021, Dec. 2013.
[6] V. Gaede and O. Günther, "Multidimensional access methods," ACM Comput. Surv., vol. 30, no. 2, pp. 170–231, Jun. 1998.
[7] A. Gionis, P. Indyk, and R. Motwani, "Similarity search in high dimensions via hashing," in Proc. Int. Conf. Very Large Data Bases, vol. 99, 1999, pp. 518–529.
[8] Y. Weiss, A. Torralba, and R. Fergus, "Spectral hashing," in Proc. Neural Inf. Process. Syst., 2008, pp. 1753–1760.
[9] J. Wang, S. Kumar, and S.-F. Chang, "Semi-supervised hashing for large-scale search," IEEE Trans. Pattern Anal. Mach. Intell., vol. 34, no. 12, pp. 2393–2406, Dec. 2012.
[10] W. Liu, J. Wang, S. Kumar, and S.-F. Chang, "Hashing with graphs," in Proc. Int. Conf. Mach. Learn., 2011, pp. 1–8.
[11] K. Grauman and R. Fergus, "Learning binary hash codes for large-scale image search," in Machine Learning for Computer Vision. Berlin, Germany: Springer-Verlag, 2013, pp. 49–87.
[12] J. Cheng, C. Leng, P. Li, M. Wang, and H. Lu, "Semi-supervised multi-graph hashing for scalable similarity search," Comput. Vis. Image Understand., vol. 124, pp. 12–21, Jul. 2014.
[13] P. Li, M. Wang, J. Cheng, C. Xu, and H. Lu, "Spectral hashing with semantically consistent graph for image indexing," IEEE Trans. Multimedia, vol. 15, no. 1, pp. 141–152, Jan. 2013.
[14] C. Xu, D. Tao, and C. Xu, "Large-margin multi-view information bottleneck," IEEE Trans. Pattern Anal. Mach. Intell., vol. 36, no. 8, pp. 1559–1572, Aug. 2014.
[15] T. Xia, D. Tao, T. Mei, and Y. Zhang, "Multiview spectral embedding," IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 40, no. 6, pp. 1438–1446, Dec. 2010.
[16] C. Xu, D. Tao, and C. Xu. (2013). "A survey on multi-view learning." [Online]. Available: http://arxiv.org/abs/1304.5634
[17] B. Xie, Y. Mu, D. Tao, and K. Huang, "m-SNE: Multiview stochastic neighbor embedding," IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 41, no. 4, pp. 1088–1096, Aug. 2011.
[18] M. Wang, X.-S. Hua, R. Hong, J. Tang, G.-J. Qi, and Y. Song, "Unified video annotation via multigraph learning," IEEE Trans. Circuits Syst. Video Technol., vol. 19, no. 5, pp. 733–746, May 2009.
[19] M. Wang, H. Li, D. Tao, K. Lu, and X. Wu, "Multimodal graph-based reranking for web image search," IEEE Trans. Image Process., vol. 21, no. 11, pp. 4649–4661, Nov. 2012.
[20] J. Yu, M. Wang, and D. Tao, "Semisupervised multiview distance metric learning for cartoon synthesis," IEEE Trans. Image Process., vol. 21, no. 11, pp. 4636–4648, Nov. 2012.
[21] S. Kim and S. Choi, "Multi-view anchor graph hashing," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., May 2013, pp. 3123–3127.
[22] S. Kim, Y. Kang, and S. Choi, "Sequential spectral learning to hash with multiple representations," in Proc. 12th Eur. Conf. Comput. Vis., 2012, pp. 538–551.
[23] S. Kumar and R. Udupa, "Learning hash functions for cross-view similarity search," in Proc. Int. Joint Conf. Artif. Intell., 2011, p. 1360.
[24] D. Zhang, F. Wang, and L. Si, "Composite hashing with multiple information sources," in Proc. ACM SIGIR Conf. Res. Develop. Inf. Retr., 2011, pp. 225–234.
[25] Y. Kang, S. Kim, and S. Choi, "Deep learning to hash with multiple representations," in Proc. IEEE 12th Int. Conf. Data Mining, Dec. 2012, pp. 930–935.
[26] D. D. Lee and H. S. Seung, "Learning the parts of objects by non-negative matrix factorization," Nature, vol. 401, no. 6755, pp. 788–791, Oct. 1999.
[27] S. Z. Li, X. Hou, H. Zhang, and Q. Cheng, "Learning spatially localized, parts-based representation," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Dec. 2001, pp. I-207–I-212.
[28] P. O. Hoyer, "Non-negative matrix factorization with sparseness constraints," J. Mach. Learn. Res., vol. 5, pp. 1457–1469, Dec. 2004.
[29] Q. Gu and J. Zhou, "Neighborhood preserving nonnegative matrix factorization," in Proc. Brit. Mach. Vis. Conf., 2009, pp. 1–10.
[30] D. Cai, X. He, J. Han, and T. S. Huang, "Graph regularized nonnegative matrix factorization for data representation," IEEE Trans. Pattern Anal. Mach. Intell., vol. 33, no. 8, pp. 1548–1560, Aug. 2011.
[31] D. Zhang, Z.-H. Zhou, and S. Chen, "Non-negative matrix factorization on kernels," in Proc. 9th Pacific Rim Int. Conf. Artif. Intell., 2006, pp. 404–412.
[32] S. An, J.-M. Yun, and S. Choi, "Multiple kernel nonnegative matrix factorization," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., May 2011, pp. 1976–1979.
[33] L. van der Maaten and G. Hinton, "Visualizing data using t-SNE," J. Mach. Learn. Res., vol. 9, no. 11, pp. 2579–2605, Nov. 2008.
[34] J. C. Bezdek and R. J. Hathaway, "Some notes on alternating optimization," in Proc. Adv. Soft Comput. (AFSS), 2002, pp. 288–300.
[35] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge, U.K.: Cambridge Univ. Press, 2004.
[36] W. Zheng, Y. Qian, and H. Tang, "Dimensionality reduction with category information fusion and non-negative matrix factorization for text categorization," in Proc. 3rd Int. Conf. Artif. Intell. Comput. Intell., 2011, pp. 505–512.
[37] D. D. Lee and H. S. Seung, "Algorithms for non-negative matrix factorization," in Proc. Neural Inf. Process. Syst., 2000, pp. 556–562.
[38] S. Baluja and M. Covell, "Learning to hash: Forgiving hash functions and applications," Data Mining Knowl. Discovery, vol. 17, no. 3, pp. 402–430, Dec. 2008.
[39] C. E. Shannon, "A mathematical theory of communication," ACM SIGMOBILE Mobile Comput. Commun. Rev., vol. 5, no. 1, pp. 3–55, Jan. 2001.
[40] D. Zhang, J. Wang, D. Cai, and J. Lu, "Self-taught hashing for fast similarity search," in Proc. ACM SIGIR Conf. Res. Develop. Inf. Retr., 2010, pp. 18–25.
[41] Y. Lin, R. Jin, D. Cai, S. Yan, and X. Li, "Compressed hashing," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2013, pp. 446–451.
[42] D. Cai, X. He, and J. Han, "Spectral regression for efficient regularized subspace learning," in Proc. IEEE 11th Int. Conf. Comput. Vis., Oct. 2007, pp. 1–8.
[43] D. W. Hosmer, Jr., and S. Lemeshow, Applied Logistic Regression. New York, NY, USA: Wiley, 2004.
[44] G. Griffin, A. Holub, and P. Perona, "Caltech-256 object category dataset," California Inst. Technol., Pasadena, CA, USA, Tech. Rep. CaltechAUTHORS:CNS-TR-2007-001, 2007.
[45] A. Krizhevsky and G. Hinton, "Learning multiple layers of features from tiny images," Dept. Comput. Sci., Univ. Toronto, Toronto, ON, Canada, Tech. Rep. TR-2009, 2009.
[46] A. Torralba, R. Fergus, and W. T. Freeman, "80 million tiny images: A large data set for nonparametric object and scene recognition," IEEE Trans. Pattern Anal. Mach. Intell., vol. 30, no. 11, pp. 1958–1970, Nov. 2008.
[47] A. Oliva and A. Torralba, "Modeling the shape of the scene: A holistic representation of the spatial envelope," Int. J. Comput. Vis., vol. 42, no. 3, pp. 145–175, May 2001.
[48] N. Dalal and B. Triggs, "Histograms of oriented gradients for human detection," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2005, pp. 886–893.
[49] T. Ahonen, A. Hadid, and M. Pietikäinen, "Face recognition with local binary patterns," in Proc. 8th Eur. Conf. Comput. Vis., 2004, pp. 469–481.

Li Liu received the B.Eng. degree in electronic information engineering from Xi'an Jiaotong University, Xi'an, China, in 2011, and the Ph.D. degree from the Department of Electronic and Electrical Engineering, University of Sheffield, Sheffield, U.K., in 2014. He is currently a Research Fellow with the Department of Computer Science and Digital Technologies, Northumbria University, Newcastle upon Tyne, U.K. His research interests include computer vision, machine learning, and data mining.

Mengyang Yu (S'14) received the B.S. and M.S. degrees from the School of Mathematical Sciences, Peking University, Beijing, China, in 2010 and 2013, respectively. He is currently pursuing the Ph.D. degree with the Department of Computer Science and Digital Technologies, Northumbria University, Newcastle upon Tyne, U.K. His research interests include computer vision, machine learning, and data mining.

Ling Shao (M'09–SM'10) is currently a Professor with the Department of Computer Science and Digital Technologies, Northumbria University, Newcastle upon Tyne, U.K. He was a Senior Lecturer with the Department of Electronic and Electrical Engineering, University of Sheffield, Sheffield, U.K., from 2009 to 2014, and a Senior Scientist with Philips Research, Eindhoven, The Netherlands, from 2005 to 2009. His research interests include computer vision, image/video processing, and machine learning. He is an Associate Editor of the IEEE TRANSACTIONS ON IMAGE PROCESSING, the IEEE TRANSACTIONS ON CYBERNETICS, and several other journals. He is a fellow of the British Computer Society and the Institution of Engineering and Technology.