Hashing has witnessed an increase in popularity over the past few years due to the promise of compact encoding and fast query time. To be effective, a hashing method must maximally preserve the similarity between data points in the underlying binary representation. The best-performing hashing techniques to date have utilised supervision. In this paper we propose a two-step iterative scheme, Graph Regularised Hashing (GRH), for incrementally adjusting the positioning of the hashing hypersurfaces to better conform to the supervisory signal: in the first step, the binary bits are regularised over a data similarity graph so that similar data points receive similar bits. In the second step, the regularised hashcodes form targets for a set of binary classifiers that shift the position of each hypersurface so as to separate opposite bits with maximum margin. GRH exhibits superior retrieval accuracy to competing hashing methods.
1. Graph Regularised Hashing
Sean Moran and Victor Lavrenko
Institute of Language, Cognition and Computation
School of Informatics
University of Edinburgh
ECIR’15 Vienna, March 2015
12. Graph Regularised Hashing (GRH)
Two step iterative hashing model:
Step A: Graph Regularisation
    L_m ← sgn( α S D⁻¹ L_{m−1} + (1 − α) L_0 )
Step B: Data-Space Partitioning
    for k = 1…K:  min ||h_k||² + C Σ_{i=1}^{N} ξ_ik
    s.t.  L_ik (h_kᵀ x_i + b_k) ≥ 1 − ξ_ik,  for i = 1…N
Repeat for a set number of iterations (M)
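The Step A update can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' released code; `alpha` and the number of regularisation sweeps are free parameters here:

```python
import numpy as np

def graph_regularise(L0, S, alpha=0.9, n_steps=1):
    """Step A sketch: smooth hashcode bits over the similarity graph S,
    anchored to the initial codes L0 (rows: data points, columns: bits)."""
    Dinv = np.diag(1.0 / S.sum(axis=1))   # inverse degree matrix of S
    L = L0.astype(float)
    for _ in range(n_steps):
        # interpolate between neighbourhood-averaged bits and the anchors L0,
        # then snap back to {-1, +1}
        L = np.sign(alpha * S @ Dinv @ L + (1 - alpha) * L0)
    return L
```

With a diagonal S (no cross-point similarity) the codes are left unchanged, while a dense S pulls each point's bits towards those of its neighbours.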
18. Graph Regularised Hashing (GRH)
[Diagram: points a, b, c on the similarity graph with initial hashcodes a: −1 −1 −1, b: −1 1 1, c: 1 1 1]

S:       a   b   c
   a     1   1   0
   b     1   1   1
   c     0   1   1

D⁻¹:     a     b     c
   a    0.5    0     0
   b     0    0.33   0
   c     0     0    0.5

L0:     b1   b2   b3
   a    −1   −1   −1
   b    −1    1    1
   c     1    1    1
20. Graph Regularised Hashing (GRH)
[Diagram: after one regularisation step, a's hashcode flips to −1 1 1, agreeing with its neighbour b; b: −1 1 1, c: 1 1 1]

L1:     b1   b2   b3
   a    −1    1    1
   b    −1    1    1
   c     1    1    1
21. Graph Regularised Hashing (GRH)
Step B: Data-Space Partitioning
    for k = 1…K:  min ||h_k||² + C Σ_{i=1}^{N} ξ_ik
    s.t.  L_ik (h_kᵀ x_i + b_k) ≥ 1 − ξ_ik,  for i = 1…N
h_k: hyperplane k
b_k: bias of hyperplane k
x_i: data-point i
L_ik: bit k of data-point i
ξ_ik: slack variable for data-point i, bit k
K: # bits
N: # data-points
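Step B amounts to training one max-margin binary classifier per bit, using the regularised bits as labels. A self-contained sketch follows; plain subgradient descent on the hinge loss stands in for a proper SVM solver, and the learning rate and epoch count are illustrative choices:

```python
import numpy as np

def fit_hyperplanes(X, L, C=1.0, lr=0.01, epochs=200):
    """Step B sketch: for each bit k, fit a linear separator (h_k, b_k)
    that places points with L_ik = +1 and L_ik = -1 on opposite sides
    with a large margin. X: (N, d) data, L: (N, K) target bits."""
    N, d = X.shape
    K = L.shape[1]
    H = np.zeros((K, d))   # hyperplane normals h_k
    b = np.zeros(K)        # biases b_k
    for k in range(K):
        h, bk = np.zeros(d), 0.0
        for _ in range(epochs):
            margins = L[:, k] * (X @ h + bk)
            viol = margins < 1                          # margin violators
            # subgradient of ||h||^2/2 + C * sum of hinge losses
            grad_h = h - C * (L[viol, k][:, None] * X[viol]).sum(axis=0)
            grad_b = -C * L[viol, k].sum()
            h -= lr * grad_h
            bk -= lr * grad_b
        H[k], b[k] = h, bk
    return H, b

def hash_codes(X, H, b):
    """Hash new points by which side of each hyperplane they fall on."""
    return np.sign(X @ H.T + b)
```

An off-the-shelf solver (e.g. liblinear) would be the natural choice in practice; the point of the sketch is only the structure: K independent binary problems, one per bit.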
34. Datasets/Features
Standard evaluation datasets [Liu et al. ’12], [Gong and Lazebnik ’11]:
CIFAR-10: 60K images, GIST descriptors, 10 classes (http://www.cs.toronto.edu/~kriz/cifar.html)
MNIST: 70K images, grayscale pixels, 10 classes (http://yann.lecun.com/exdb/mnist/)
NUS-WIDE: 270K images, GIST descriptors, 21 classes (http://lms.comp.nus.edu.sg/research/NUS-WIDE.htm)
True NNs: images that share at least one class in common [Liu et al. ’12]
35. Evaluation Metrics
Hamming ranking evaluation paradigm [Liu et al. ’12], [Gong
and Lazebnik ’11]
Standard evaluation metrics [Liu et al. ’12], [Gong and
Lazebnik ’11]:
Mean average precision (mAP)
Precision at Hamming radius 2 (P@R2)
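Both metrics are easy to compute from Hamming distances; a small NumPy sketch of each (illustrative only, not the evaluation code used in the paper):

```python
import numpy as np

def average_precision(query_code, db_codes, relevant):
    """AP for one query under the Hamming ranking paradigm.
    `relevant` is a boolean mask marking the true nearest neighbours."""
    dists = (query_code != db_codes).sum(axis=1)       # Hamming distances
    rel = relevant[np.argsort(dists, kind="stable")]   # relevance in rank order
    hits = np.cumsum(rel)
    prec_at_i = hits / np.arange(1, len(rel) + 1)      # precision at each rank
    return (prec_at_i * rel).sum() / max(rel.sum(), 1)

def precision_at_radius(query_code, db_codes, relevant, radius=2):
    """P@R2: precision over database items within Hamming radius 2."""
    within = (query_code != db_codes).sum(axis=1) <= radius
    return relevant[within].mean() if within.any() else 0.0
```

mAP is then just the mean of `average_precision` over the query set.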
36. GRH vs Literature (CIFAR-10 @ 32 bits)
[Bar chart: mAP on CIFAR-10 at 32 bits for LSH, BRE, STH, KSH, GRH (Linear) and GRH (RBF); y-axis mAP from 0.10 to 0.35; callouts mark the linear and non-linear GRH variants]
37. GRH vs Literature (CIFAR-10 @ 32 bits)
[Bar chart: mAP on CIFAR-10 at 32 bits for LSH, BRE, STH, KSH, GRH (Linear) and GRH (RBF); callout: GRH's straightforward objective outperforms more complex objectives]
46. Conclusions and Future Work
Supervised hashing model that is both accurate and easily
scalable
Take-home messages:
Regularising bits over a graph is effective (and efficient) for
hashcode learning
An intermediate eigendecomposition step is not necessary
Hyperplanes (linear hypersurfaces) can achieve very good retrieval accuracy
Future work: extend to the cross-modal hashing scenario (e.g.
Image ↔ Text, English ↔ Spanish)
47. Thank you for your attention
Sean Moran
sean.moran@ed.ac.uk
Code and datasets available at: www.seanjmoran.com