
rinko2011-agh

12,644 views

Published on June 24, 2011

Slides from a presentation at the Mathematical Informatics Seminar (数理情報学輪講) on June 24, 2011. An introduction to the ICML 2011 paper "Hashing with Graphs".

Published in: Technology


  1. Hashing with Graphs (Anchor Graph Hashing). June 24, 2011. http://blog.beam2d.net @beam2d
  2. Paper: Liu, W., Wang, J., Kumar, S. and Chang, S.-F. Hashing with Graphs. ICML '11, 2011.
  3. (Outline)
  4. (Outline)
  5. Near Neighbor Search: given data x_i ∈ R^d (i ∈ {1, …, n}) and a distance D: R^d × R^d → R, for a query x ∈ R^d find the points among x_1, …, x_n that are close to x under D.
  6. (Illustration: k-nearest neighbor search)
  7. Naive approach: a linear scan computes D(x, x_i) for every i, so each query costs O(dn) time, and holding the raw data costs O(dn) memory; both are prohibitive at large scale.
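To make the O(dn) query cost above concrete, here is a minimal numpy sketch of the linear-scan baseline; the function name and the random data are illustrative, not from the paper.

```python
import numpy as np

def linear_scan(X, query, k=5):
    """Indices of the k nearest rows of X to `query` (Euclidean distance)."""
    dists = np.linalg.norm(X - query, axis=1)  # touches all n points: O(dn)
    return np.argsort(dists)[:k]

X = np.random.randn(10000, 64)                 # n = 10,000 points, d = 64
neighbors = linear_scan(X, np.random.randn(64))
```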
  8. Hashing: map each point x ∈ R^d to a short binary code y of r bits (each bit taking one of 2 values) so that similar points receive similar codes; search then works on the compact codes instead of the raw vectors, which is fast and memory-light.
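The payoff of binary codes is that comparisons reduce to cheap bitwise operations, independent of d. A small sketch, assuming codes are stored as 0/1 arrays (names illustrative):

```python
import numpy as np

def hamming_distances(codes, query_code):
    """codes: (n, r) array over {0, 1}; Hamming distance to each row."""
    return np.count_nonzero(codes != query_code, axis=1)

codes = np.random.randint(0, 2, size=(10000, 32))  # n codes of r = 32 bits
dists = hamming_distances(codes, codes[0])
assert dists[0] == 0                               # a code matches itself
```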
  9. (Illustration)
  10. (text not recovered)
  11. (text not recovered)
  12. Anchor Graph Hashing (AGH) (bullet text not recovered)
  13. (Outline)
  14. Notation:
      x_i ∈ R^d (i ∈ {1, …, n}): data points
      D: R^d × R^d → R: distance
      h: R^d × R^d → R: similarity, h(x, x′) = exp(−D²(x, x′)/t) (t > 0: bandwidth parameter)
      A = (h(x_i, x_j))_ij ∈ R^{n×n}: affinity matrix
      Y_i ∈ {1, −1}^r: r-bit binary code of x_i
      Y ∈ R^{n×r}: matrix whose i-th row is Y_i^T
      y_k: k-th column of Y (the k-th bit over all points)
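A direct numpy sketch of the affinity matrix A defined on slide 14; it is quadratic in n, which is precisely the scalability problem the anchor graph will remove later:

```python
import numpy as np

def affinity_matrix(X, t):
    """A_ij = exp(-||x_i - x_j||^2 / t) for all pairs; O(n^2) memory."""
    sq = np.sum(X ** 2, axis=1)
    D2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T   # squared distances
    return np.exp(-np.maximum(D2, 0.0) / t)

A = affinity_matrix(np.random.randn(500, 16), t=1.0)  # fine at n = 500
```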
  15. Objective: find codes Y such that similar points receive nearby codes:
      min_Y (1/2) Σ_{i,j=1}^n ‖Y_i − Y_j‖² A_ij
      s.t. Y ∈ {1, −1}^{n×r}, 1^T Y = 0, Y^T Y = n I_{r×r}
      (1 is the all-ones vector; 1^T Y = 0 balances each bit over the data, and Y^T Y = n I_{r×r} decorrelates the bits.)
  16. Even the r = 1 case is NP-hard: regard the x_i (i ∈ {1, …, n}) as vertices of a graph G with edge weights A_ij; choosing one balanced bit splits the vertices into {Y_i = −1} and {Y_i = 1}, i.e., balanced graph partitioning, which is NP-hard.
  17. Spectral relaxation: allow Y to take real values and solve
      min_Y (1/2) Σ_{i,j=1}^n ‖Y_i − Y_j‖² A_ij
      s.t. Y ∈ R^{n×r}, 1^T Y = 0, Y^T Y = n I_{r×r},
      then threshold the resulting Y back to {1, −1}.
  18. Graph Laplacian: for the graph G with adjacency A, let D = diag(A1) and L = D − A. Then
      (1/2) Σ_{i,j=1}^n ‖Y_i − Y_j‖² A_ij = Σ_{i=1}^n ‖Y_i‖² D_ii − Σ_{i,j=1}^n Y_i^T Y_j A_ij = tr(Y^T D Y) − tr(Y^T A Y) = tr(Y^T L Y).
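The identity on slide 18 is easy to verify numerically; a minimal sketch with a random symmetric affinity matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 50, 4
A = rng.random((n, n))
A = (A + A.T) / 2                                  # symmetric affinities
Y = rng.standard_normal((n, r))

L = np.diag(A.sum(axis=1)) - A                     # L = D - A
lhs = 0.5 * sum(A[i, j] * np.sum((Y[i] - Y[j]) ** 2)
                for i in range(n) for j in range(n))
rhs = np.trace(Y.T @ L @ Y)
assert np.isclose(lhs, rhs)
```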
  19. Eigenproblem: the relaxed problem becomes
      min_Y tr(Y^T L Y) s.t. Y ∈ R^{n×r}, 1^T Y = 0, Y^T Y = n I_{r×r}.
      The columns y_k are the eigenvectors of L with the r smallest nonzero eigenvalues (scaled to the right norm); the eigenvalue-0 eigenvector 1 is excluded by the constraint 1^T Y = 0.
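For small n the relaxed problem can be solved exactly as slide 19 describes; a sketch assuming the graph is connected, so that L has a single zero eigenvalue (otherwise more leading eigenvectors must be skipped):

```python
import numpy as np

def spectral_codes(A, r):
    """Relaxed solution: r eigenvectors of L above the zero eigenvalue."""
    n = A.shape[0]
    L = np.diag(A.sum(axis=1)) - A
    vals, vecs = np.linalg.eigh(L)         # eigenvalues in ascending order
    Y = np.sqrt(n) * vecs[:, 1:r + 1]      # skip the trivial eigenvector 1
    return np.where(Y >= 0, 1, -1)         # binarize back to {1, -1}
```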
  20. Scalability problem: L is n × n, so a full eigendecomposition costs O(n³); moreover A and L are dense, so for large n even storing them is infeasible.
  21. (Outline)
  22. Anchors: choose m (≪ n) anchor points u_1, …, u_m ∈ R^d (e.g., as k-means centers) and relate each data point only to its s nearest anchors (e.g., s = 2).
  23. Truncated similarity: define Z ∈ R^{n×m} by
      Z_ij = h(x_i, u_j) / Σ_{j′ ∈ ⟨i⟩} h(x_i, u_j′) if j ∈ ⟨i⟩, and 0 otherwise,
      where ⟨i⟩ ⊂ {1, …, m} indexes the s (≪ m) anchors nearest to x_i. Each row of Z has only s nonzero entries.
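A sketch of constructing Z, dense and loop-based for clarity; in practice Z is stored as a sparse matrix with s nonzeros per row, and the name build_Z is illustrative:

```python
import numpy as np

def build_Z(X, U, s, t):
    """Truncated, renormalized similarities to the s nearest anchors."""
    D2 = ((X[:, None, :] - U[None, :, :]) ** 2).sum(axis=2)  # (n, m)
    Z = np.zeros(D2.shape)
    for i in range(len(X)):
        nearest = np.argsort(D2[i])[:s]       # the index set <i>
        w = np.exp(-D2[i, nearest] / t)
        Z[i, nearest] = w / w.sum()           # each row sums to 1
    return Z
```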
  24. Anchor Graph: let Λ = diag(1^T Z) ∈ R^{m×m} and Â = Z Λ^{−1} Z^T. Â has rank ≤ m and serves as the adjacency matrix of the anchor graph G: Â_ij aggregates the length-2 paths x_i → anchor → x_j through the Z_ij, so points sharing anchors are connected.
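Writing Â out directly shows its structure; this dense form is for illustration only, since the real algorithm never materializes the n × n matrix:

```python
import numpy as np

def anchor_adjacency(Z):
    """Â = Z Λ^{-1} Z^T with Λ = diag(1^T Z); rank(Â) ≤ m."""
    lam = Z.sum(axis=0)                    # diagonal of Λ (anchor degrees)
    return (Z / lam) @ Z.T                 # Z Λ^{-1} Z^T without forming Λ

Z = np.random.rand(200, 10)
Z /= Z.sum(axis=1, keepdims=True)          # rows sum to 1, like the real Z
A_hat = anchor_adjacency(Z)
assert np.allclose(A_hat.sum(axis=1), 1.0) # Â1 = 1, used on slide 25
```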
  25. Laplacian of the anchor graph: since each row of Z sums to 1, Â1 = Z Λ^{−1} Z^T 1 = Z1 = 1, hence
      L̂ = diag(Â1) − Â = I − Â.
      So the eigenvectors of L̂ with the smallest eigenvalues are exactly the eigenvectors of Â with the largest eigenvalues (the eigenvalue-1 eigenvector of Â corresponds to the trivial eigenvalue 0 of L̂).
  26. Reduction to an m × m problem: let M = Λ^{−1/2} Z^T Z Λ^{−1/2} ∈ R^{m×m}. Writing Â = (Z Λ^{−1/2})(Z Λ^{−1/2})^T and taking the SVD Z Λ^{−1/2} = U Σ^{1/2} V^T (U ∈ R^{n×m}, Σ ∈ R^{m×m}, V ∈ R^{m×m}) gives
      Â = U Σ^{1/2} V^T V Σ^{1/2} U^T = U Σ U^T, M = V Σ^{1/2} U^T U Σ^{1/2} V^T = V Σ V^T.
      Hence U = Z Λ^{−1/2} V Σ^{−1/2}, and the first r columns of U yield Y: the eigenvectors of Â can be read off from those of the small matrix M.
  27. Solution: let 1 > σ_1 ≥ … ≥ σ_r be the r largest eigenvalues of M below 1 (the eigenvalue 1 is the trivial one and is discarded), with eigenvectors v_1, …, v_r ∈ R^m. With Σ_r = diag(σ_1, …, σ_r) and V_r = [v_1, …, v_r], set
      W = √n Λ^{−1/2} V_r Σ_r^{−1/2} ∈ R^{m×r},
      and the spectral embedding is Y = Z W.
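Putting slides 26 and 27 together, a sketch that computes W and the real-valued embedding Y = ZW from Z alone (illustrative names; assumes every anchor is used by at least one point so Λ is invertible):

```python
import numpy as np

def agh_embedding(Z, r):
    """Y = Z W per slides 26-27; returns (Y_real, W)."""
    n = Z.shape[0]
    lam = Z.sum(axis=0)                         # diagonal of Λ
    Zs = Z / np.sqrt(lam)                       # Z Λ^{-1/2}
    M = Zs.T @ Zs                               # Λ^{-1/2} Z^T Z Λ^{-1/2}
    vals, vecs = np.linalg.eigh(M)              # ascending eigenvalues
    idx = np.argsort(vals)[::-1][1:r + 1]       # top r, skipping σ = 1
    Sr, Vr = vals[idx], vecs[:, idx]
    W = np.sqrt(n) * (Vr / np.sqrt(lam)[:, None]) / np.sqrt(Sr)
    return Z @ W, W
```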
  28. (Outline)
  29. Out-of-sample extension: the spectral solution gives codes only for the n training points; to hash an unseen query x, an explicit hash function defined on all of R^d is needed.
  30. For a new point x ∈ R^d, define
      z(x) = [δ_1(x) h(x, u_1), …, δ_m(x) h(x, u_m)]^T / Σ_{j=1}^m δ_j(x) h(x, u_j),
      where δ_j(x) = 1 if u_j is among the s anchors nearest to x and δ_j(x) = 0 otherwise. The anchor-graph affinity then extends to new points as
      Â(x, x′) = z^T(x) Λ^{−1} z(x′)
      (Λ is the one computed on the training data).
  31. Why this generalizes: as n → ∞, the anchor-graph affinity Â behaves like a kernel K over the data distribution p(x), and the embedding approximates eigenfunctions of the integral operator
      (Gf)(·) = ∫ K(·, x) f(x) dp(x),
      so values at new points are well defined.
  32. Nyström method: extend the k-th eigenvector of Â to a function of x by
      φ_{n,k}(x) = (1/σ_k) Σ_{i=1}^n Â(x, x_i) Y_ik.
  33. AGH's hash function is exactly this Nyström extension: substituting Â(x, x_i) = z^T(x) Λ^{−1} z_i into
      φ_{n,k}(x) = (1/σ_k) Σ_{i=1}^n Â(x, x_i) Y_ik
      collapses it to
      φ_{n,k}(x) = w_k^T z(x)
      (w_k: the k-th column of W), which is evaluated in O(dm) time, independent of n.
  34. (Outline)
  35. Parameters:
      d: data dimension
      n: number of data points
      m: number of anchors
      T: number of k-means iterations
      s: number of nearest anchors per point
      r: number of bits (code length)
  36. AGH training:
      1. Choose anchors u_1, …, u_m (k-means)
      2. Compute Z and Λ
      3. M = Λ^{−1/2} Z^T Z Λ^{−1/2}
      4. Eigendecompose M to get Σ_r, V_r
      5. W = √n Λ^{−1/2} V_r Σ_r^{−1/2}
      6. Y = Z W (binarized by sign)
      Time: O(dmnT + dmn + m²n) (T: k-means iterations). Space: O((d + s + r)n) including the data, O((s + r)n) beyond it.
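An end-to-end sketch of the six training steps, under the same illustrative conventions as the snippets above (bare-bones Lloyd k-means, dense Z; assumes float input and that every anchor attracts at least one point):

```python
import numpy as np

def train_agh(X, m, s, r, t, T=10, seed=0):
    """Steps 1-6 of slide 36; dense and loop-based for readability."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    U = X[rng.choice(n, m, replace=False)].copy()      # 1. init anchors
    for _ in range(T):                                 # k-means: O(dmnT)
        D2 = ((X[:, None, :] - U[None, :, :]) ** 2).sum(axis=2)
        assign = D2.argmin(axis=1)
        for j in range(m):
            members = X[assign == j]
            if len(members) > 0:
                U[j] = members.mean(axis=0)
    D2 = ((X[:, None, :] - U[None, :, :]) ** 2).sum(axis=2)
    Z = np.zeros((n, m))                               # 2. Z and Λ
    for i in range(n):
        nn = np.argsort(D2[i])[:s]
        w = np.exp(-D2[i, nn] / t)
        Z[i, nn] = w / w.sum()
    lam = Z.sum(axis=0)
    Zs = Z / np.sqrt(lam)
    M = Zs.T @ Zs                                      # 3. M
    vals, vecs = np.linalg.eigh(M)                     # 4. Σ_r, V_r
    idx = np.argsort(vals)[::-1][1:r + 1]              # drop eigenvalue 1
    W = np.sqrt(n) * (vecs[:, idx] / np.sqrt(lam)[:, None]) / np.sqrt(vals[idx])
    Y = np.where(Z @ W >= 0, 1, -1)                    # 5-6. Y = sgn(ZW)
    return U, W, Y
```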
  37. Hashing a query with AGH:
      1. Compute z(x) from the query x, then apply W: the code is sgn(W^T z(x))
      Time: O(dm + sr) per query. Space: O((d + r)m).
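And the matching query pipeline: build z(x) as on slide 30, then take sgn(W^T z(x)) as on slide 33. The dense matrix-vector product below stands in for the O(sr) sparse one; U and W come from training:

```python
import numpy as np

def hash_query(x, U, W, s, t):
    """Code of an unseen x: sgn(W^T z(x))."""
    D2 = ((U - x) ** 2).sum(axis=1)        # distance to each anchor: O(dm)
    nearest = np.argsort(D2)[:s]           # anchors with δ_j(x) = 1
    z = np.zeros(len(U))
    w = np.exp(-D2[nearest] / t)
    z[nearest] = w / w.sum()               # z(x) as on slide 30
    return np.where(W.T @ z >= 0, 1, -1)   # r-bit code
```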
  38. (text not recovered)
  39. Dataset: MNIST* (handwritten digit images)
      Each image is 28 × 28 = 784 pixels, treated as a 784-dimensional vector
      n = 69,000 database points
      1,000 query points
      * http://yann.lecun.com/exdb/mnist/
  40. Results (figure): m = 300, s = 2
  41. (text not recovered)
  42. (text not recovered)
  43. (text not recovered)
  44. References:
      A. Andoni and P. Indyk. Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions. Proceedings of FOCS, 2006.
      Y. Bengio, O. Delalleau, N. Le Roux, and J.-F. Paiement. Learning eigenfunctions links spectral embedding and kernel PCA. Neural Computation, 2004.
      A. Gionis, P. Indyk, and R. Motwani. Similarity search in high dimensions via hashing. Proceedings of VLDB, 1999.
      P. Indyk and R. Motwani. Approximate nearest neighbors: Towards removing the curse of dimensionality. Proceedings of STOC, 1998.
      B. Kulis and T. Darrell. Learning to hash with binary reconstructive embeddings. NIPS 22, 2010.
      B. Kulis and K. Grauman. Kernelized locality-sensitive hashing for scalable image search. Proceedings of ICCV, 2009.
  45. W. Liu, J. He, and S.-F. Chang. Large graph construction for scalable semi-supervised learning. Proceedings of ICML, 2010.
      W. Liu, J. Wang, S. Kumar, and S.-F. Chang. Hashing with graphs. Proceedings of ICML, 2011.
      M. Raginsky and S. Lazebnik. Locality-sensitive binary codes from shift-invariant kernels. NIPS 22, 2010.
      J. Wang, S. Kumar, and S.-F. Chang. Sequential projection learning for hashing with compact codes. Proceedings of ICML, 2010.
      Y. Weiss, A. Torralba, and R. Fergus. Spectral hashing. NIPS 21, 2009.
      C. Williams and M. Seeger. The effect of the input density distribution on kernel-based classifiers. Proceedings of ICML, 2000.
