rinko2011-agh

Slides presented at the mathematical informatics seminar (rinko) on June 24, 2011. An introduction to the ICML 2011 paper "Hashing with Graphs".


  1. Hashing with Graphs: Anchor Graph Hashing. Seminar talk, June 24, 2011. http://blog.beam2d.net @beam2d
  2. Paper introduced in this talk: Hashing with Graphs. Liu, W., Wang, J., Kumar, S. and Chang, S.-F. ICML '11, 2011.
  3. (Outline slide; bullet text lost in extraction.)
  4. (Outline slide; bullet text lost in extraction.)
  5. Near Neighbor Search. Given data points x_i ∈ R^d (i ∈ {1, …, n}) and a distance D: R^d × R^d → R, for a query x ∈ R^d find the points among x_1, …, x_n that are close to x under D.
  6. (Figure illustrating k-nearest neighbors; text lost in extraction.)
  7. Naive approach: a linear scan over x_1, …, x_n costs O(dn) time per query and O(dn) memory, which does not scale. (Remaining bullet text lost in extraction.)
  8. Hashing: map each x ∈ R^d to a short binary code y of r bits, and search in Hamming space instead of R^d, so that near points receive near codes. (Some bullet text lost in extraction.)
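The payoff of binary codes is that comparing two items reduces to bitwise operations. A minimal sketch with hypothetical 8-bit codes (the values are made up for illustration):

```python
# Hamming distance between two r-bit codes: XOR then popcount.
# This replaces d floating-point operations per comparison.
a = 0b10110010  # hypothetical 8-bit code of one point
b = 0b10011010  # hypothetical 8-bit code of another
hamming = bin(a ^ b).count("1")  # number of differing bits
print(hamming)  # → 2
```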
  9. (Figure; text lost in extraction.)
  10. (Bullet text lost in extraction.)
  11. (Bullet text lost in extraction.)
  12. Anchor Graph Hashing (AGH). (Bullet text lost in extraction.)
  13. (Outline slide; bullet text lost in extraction.)
  14. Notation:
     - x_i ∈ R^d (i ∈ {1, …, n}): data points
     - D: R^d × R^d → R: distance
     - h: R^d × R^d → R: similarity, h(x, x′) = exp(−D²(x, x′)/t) with bandwidth t > 0
     - A = (h(x_i, x_j))_{ij} ∈ R^{n×n}: affinity matrix
     - Y_i ∈ {1, −1}^r: the r-bit code of x_i
     - Y ∈ R^{n×r}: the matrix whose i-th row is Y_i^T
     - y_k: the k-th column of Y
  15. Find codes Y that keep similar points close in Hamming space:
     min_Y (1/2) Σ_{i,j=1}^n ‖Y_i − Y_j‖² A_ij
     s.t. Y ∈ {1, −1}^{n×r}, 1ᵀY = 0, YᵀY = n I_{r×r}.
     (1ᵀY = 0: each bit splits the data evenly into 1 and −1; YᵀY = nI: the bits are uncorrelated.)
  16. Even for r = 1 the problem is NP-hard: view the x_i as nodes of a graph G with edge weights A_ij; the problem asks for a balanced partition into {i : Y_i = −1} and {i : Y_i = 1} with minimum cut.
  17. Spectral relaxation: drop the integrality of Y:
     min_Y (1/2) Σ_{i,j=1}^n ‖Y_i − Y_j‖² A_ij
     s.t. Y ∈ R^{n×r}, 1ᵀY = 0, YᵀY = n I_{r×r},
     then threshold the resulting Y back to {1, −1}.
  18. Graph Laplacian of G: let D = diag(A1) and L = D − A. Then
     (1/2) Σ_{i,j=1}^n ‖Y_i − Y_j‖² A_ij = Σ_{i=1}^n ‖Y_i‖² D_ii − Σ_{i,j=1}^n Y_iᵀ Y_j A_ij = tr(YᵀDY) − tr(YᵀAY) = tr(YᵀLY).
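The identity on this slide can be checked numerically. A small sketch with a random symmetric A, assuming nothing beyond the definitions D = diag(A1), L = D − A:

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 6, 3
A = rng.random((n, n))
A = (A + A.T) / 2                     # symmetric affinity matrix
Y = rng.standard_normal((n, r))       # relaxed (real-valued) code matrix

D = np.diag(A.sum(axis=1))            # D = diag(A1)
L = D - A                             # graph Laplacian

lhs = 0.5 * sum(A[i, j] * np.sum((Y[i] - Y[j]) ** 2)
                for i in range(n) for j in range(n))
rhs = np.trace(Y.T @ L @ Y)
assert np.isclose(lhs, rhs)           # (1/2) Σ ||Yi − Yj||² Aij = tr(YᵀLY)
```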
  19. So the relaxed problem is
     min_Y tr(YᵀLY) s.t. Y ∈ R^{n×r}, 1ᵀY = 0, YᵀY = n I_{r×r}.
     The optimal columns y_k are the eigenvectors of L with the smallest eigenvalues, excluding the eigenvalue-0 eigenvector 1 (which the constraint 1ᵀY = 0 rules out).
  20. Problem: L is n × n, so its eigendecomposition costs O(n³), and A (hence L) is dense. This is infeasible for large n.
  21. (Outline slide; bullet text lost in extraction.)
  22. Anchor points: pick m (≪ n) anchors u_1, …, u_m ∈ R^d (e.g. as k-means centroids) and connect each data point only to its s nearest anchors (e.g. s = 2).
  23. Truncated similarity: define Z ∈ R^{n×m} by
     Z_ij = h(x_i, u_j) / Σ_{j′ ∈ ⟨i⟩} h(x_i, u_j′) if j ∈ ⟨i⟩, and Z_ij = 0 otherwise,
     where ⟨i⟩ is the index set of the s (≪ m) anchors nearest to x_i. Each row of Z has only s nonzero entries.
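In numpy the construction of Z might look as follows; the Gaussian form of h comes from slide 14, and the function name `truncated_similarity` is mine:

```python
import numpy as np

def truncated_similarity(X, U, s=2, t=1.0):
    """n×m matrix Z: each row keeps only the s nearest anchors,
    with h(x, u) = exp(-||x - u||² / t), renormalized to sum to 1."""
    D2 = ((X[:, None, :] - U[None, :, :]) ** 2).sum(-1)  # squared distances to anchors
    H = np.exp(-D2 / t)
    Z = np.zeros_like(H)
    nearest = np.argsort(D2, axis=1)[:, :s]              # s nearest anchors per point
    rows = np.arange(X.shape[0])[:, None]
    Z[rows, nearest] = H[rows, nearest]
    return Z / Z.sum(axis=1, keepdims=True)              # each row sums to 1
```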
  24. Anchor Graph: let Λ = diag(1ᵀZ) ∈ R^{m×m} and Â = ZΛ⁻¹Zᵀ. Â has rank at most m and approximates the affinity matrix A of G: (Â)_ij aggregates paths of length 2 between x_i and x_j through shared anchors.
  25. Â is doubly stochastic, i.e. Â1 = 1, so the Laplacian of the anchor graph is
     L̂ = diag(Â1) − Â = I − Â.
     Hence the eigenvectors of L̂ for its smallest eigenvalues are exactly the eigenvectors of Â for its largest eigenvalues (an eigenvalue σ of Â corresponds to 1 − σ of L̂).
  26. Let M = Λ^{−1/2} Zᵀ Z Λ^{−1/2} ∈ R^{m×m}. Since Â = (ZΛ^{−1/2})(ZΛ^{−1/2})ᵀ, take the SVD ZΛ^{−1/2} = U Σ^{1/2} Vᵀ (U ∈ R^{n×m}, Σ ∈ R^{m×m}, V ∈ R^{m×m}). Then
     Â = U Σ^{1/2} Vᵀ V Σ^{1/2} Uᵀ = U Σ Uᵀ,  M = V Σ^{1/2} Uᵀ U Σ^{1/2} Vᵀ = V Σ Vᵀ,
     so U = Z Λ^{−1/2} V Σ^{−1/2}: the leading r columns of U, which give Y, can be obtained from the small m × m eigenproblem on M.
  27. Let the eigenvalues of Σ satisfy 1 > σ_1 ≥ … ≥ σ_r ≥ …, and let v_1, …, v_r ∈ R^m be the eigenvectors of M for σ_1, …, σ_r. Put Σ_r = diag(σ_1, …, σ_r), V_r = [v_1, …, v_r], and
     W = √n Λ^{−1/2} V_r Σ_r^{−1/2} ∈ R^{m×r}.
     Then the embedding is simply Y = ZW.
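Slides 26-27 reduce the embedding to a small m × m eigenproblem. A numpy sketch under those definitions (the function name `agh_codes` is mine; `np.linalg.eigh` returns eigenvalues in ascending order, so the trivial eigenvalue 1 is skipped from the top):

```python
import numpy as np

def agh_codes(Z, r):
    """Spectral embedding of the anchor graph (slides 26-27):
    eigendecompose the small m×m matrix M instead of the n×n graph."""
    n = Z.shape[0]
    lam = Z.sum(axis=0)                        # diagonal of Λ = diag(1ᵀZ)
    B = Z / np.sqrt(lam)                       # Z Λ^{-1/2}
    M = B.T @ B                                # Λ^{-1/2} Zᵀ Z Λ^{-1/2}
    sigma, V = np.linalg.eigh(M)               # ascending eigenvalues
    order = np.argsort(sigma)[::-1][1:r + 1]   # skip the trivial eigenvalue 1
    Sr, Vr = sigma[order], V[:, order]
    W = np.sqrt(n) * (Vr / np.sqrt(lam)[:, None]) / np.sqrt(Sr)  # √n Λ^{-1/2} Vr Σr^{-1/2}
    Y = Z @ W                                  # real embedding; bits are sign(Y)
    return Y, W
```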
  28. (Outline slide; bullet text lost in extraction.)
  29. Out-of-sample extension: how to compute the code of a new query x that was not seen at training time. (Bullet text lost in extraction.)
  30. For a new point x ∈ R^d define
     z(x) = [δ_1(x) h(x, u_1), …, δ_m(x) h(x, u_m)]ᵀ / Σ_{j=1}^m δ_j(x) h(x, u_j),
     where δ_j(x) = 1 if u_j is among the s anchors nearest to x, and 0 otherwise. This extends Â to new points:
     Â(x, x′) = zᵀ(x) Λ⁻¹ z(x′).
  31. As n → ∞, Â converges to a kernel K. With data density p(x), K defines an operator G with eigenfunctions given by
     (Gf)(·) = ∫ K(·, x) f(x) dp(x),
     and the eigenvectors of Â approximate these eigenfunctions of G.
  32. Nyström method: the eigenfunction of Â for the k-th eigenvalue σ_k is approximated by
     φ_{n,k}(x) = (1/σ_k) Σ_{i=1}^n Â(x, x_i) Y_ik.
  33. AGH vs. Nyström: starting from the same formula
     φ_{n,k}(x) = (1/σ_k) Σ_{i=1}^n Â(x, x_i) Y_ik,
     for the anchor graph it simplifies to φ_{n,k}(x) = w_kᵀ z(x), so hashing a query costs only O(dm) rather than a sum over all n training points.
  34. (Outline slide; bullet text lost in extraction.)
  35. Parameters: d: input dimension; n: number of data points; m: number of anchors; T: number of k-means iterations; s: number of nearest anchors per point; r: code length in bits.
  36. AGH training:
     1. Choose anchors u_1, …, u_m (k-means)
     2. Compute Z and Λ
     3. M = Λ^{−1/2} Zᵀ Z Λ^{−1/2}
     4. Eigendecompose M to obtain Σ_r, V_r
     5. W = √n Λ^{−1/2} V_r Σ_r^{−1/2}
     6. Y = ZW
     Time: O(dmnT + dmn + m²n), where T is the number of k-means iterations. Memory: O((d + s + r)n), or O((s + r)n) once the raw data can be discarded.
  37. AGH query: for a query x, compute z(x) and take the sign of Wᵀ z(x). Time: O(dm + sr); memory: O((d + r)m).
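The query step might be sketched as below; `hash_query` is my name, W comes from training, U holds the anchors, and the Gaussian h from slide 14 is assumed:

```python
import numpy as np

def hash_query(x, U, W, s=2, t=1.0):
    """Out-of-sample hashing (slides 30 and 37): build the sparse
    anchor-similarity vector z(x), then take signs of Wᵀz(x)."""
    d2 = ((U - x) ** 2).sum(axis=1)   # squared distances to the m anchors
    nearest = np.argsort(d2)[:s]      # indices of the s nearest anchors
    z = np.zeros(U.shape[0])
    z[nearest] = np.exp(-d2[nearest] / t)
    z /= z.sum()                      # normalize like a row of Z
    return np.sign(W.T @ z)           # r-bit code in {−1, +1}
```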
  38. (Outline slide; bullet text lost in extraction.)
  39. Experiments on MNIST*: handwritten digit images of 28 × 28 = 784 pixels, treated as 784-dimensional vectors; n = 69,000 database points and 1,000 queries.
     * http://yann.lecun.com/exdb/mnist/
  40. (Results figure.) Settings: m = 300, s = 2.
  41. (Outline slide; bullet text lost in extraction.)
  42. (Bullet text lost in extraction.)
  43. Conclusions. (Bullet text lost in extraction.)
  44. References:
     A. Andoni and P. Indyk. Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions. Proceedings of FOCS, 2006.
     Y. Bengio, O. Delalleau, N. Le Roux, and J.-F. Paiement. Learning eigenfunctions links spectral embedding and kernel PCA. Neural Computation, 2004.
     A. Gionis, P. Indyk, and R. Motwani. Similarity search in high dimensions via hashing. Proceedings of VLDB, 1999.
     P. Indyk and R. Motwani. Approximate nearest neighbor: Towards removing the curse of dimensionality. Proceedings of STOC, 1998.
     B. Kulis and T. Darrell. Learning to hash with binary reconstructive embeddings. NIPS 22, 2010.
     B. Kulis and K. Grauman. Kernelized locality-sensitive hashing for scalable image search. Proceedings of ICCV, 2009.
  45. References (cont.):
     W. Liu, J. He, and S.-F. Chang. Large graph construction for scalable semi-supervised learning. Proceedings of ICML, 2010.
     W. Liu, J. Wang, S. Kumar, and S.-F. Chang. Hashing with graphs. ICML, 2011.
     M. Raginsky and S. Lazebnik. Locality-sensitive binary codes from shift-invariant kernels. NIPS 22, 2010.
     J. Wang, S. Kumar, and S.-F. Chang. Sequential projection learning for hashing with compact codes. Proceedings of ICML, 2010.
     Y. Weiss, A. Torralba, and R. Fergus. Spectral hashing. NIPS 21, 2009.
     C. Williams and M. Seeger. The effect of the input density distribution on kernel-based classifiers. Proceedings of ICML, 2000.
