Jokyokai2

Transcript

  • 1. Fast Convergence Rate of Multiple Kernel Learning with Elastic-Net Regularization (2011-04-25). Sections: Introduction, Mixed-Norm-Elasticnet-MKL, Mini-max, Lp-MKL, Conclusion, References.
  • 2. Outline
    1. Introduction (MKL)
    2. Mixed-Norm-Elasticnet-MKL (Mixed-Elasticnet-MKL)
    3. Mini-max
    4. Lp-MKL
    5. Conclusion
  • 4. MKL: reproducing kernel Hilbert space (RKHS)
    A kernel corresponds to an RKHS: $k(x, x') \Leftrightarrow \mathcal{H}_k$.
    $$\hat f \leftarrow \min_{f \in \mathcal{H}_k} \frac{1}{n}\sum_{i=1}^{n} \ell(y_i, f(x_i)) + C\|f\|_{\mathcal{H}_k}$$
    Representer theorem: $\exists\, \alpha_i \in \mathbb{R}$ s.t. $\hat f(x) = \sum_{i=1}^{n} \alpha_i k(x_i, x)$.
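A minimal sketch of the representer theorem on slide 4. It assumes squared loss and a squared RKHS-norm penalty (kernel ridge regression) in place of the slide's general loss $\ell$ and unsquared penalty, because that variant has a closed-form solution. The Gaussian kernel, the parameters `gamma` and `lam`, and all function names are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def gaussian_kernel(A, B, gamma=1.0):
    """k(x, x') = exp(-gamma * ||x - x'||^2) for all pairs of rows of A and B."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def fit_alpha(X, y, lam=1e-2, gamma=1.0):
    """Minimise (1/n) sum_i (y_i - f(x_i))^2 + lam * ||f||_H^2 over f in H_k.

    By the representer theorem f(x) = sum_i alpha_i k(x_i, x), and for this
    (assumed) objective alpha = (K + n * lam * I)^{-1} y.
    """
    n = len(y)
    K = gaussian_kernel(X, X, gamma)
    return np.linalg.solve(K + n * lam * np.eye(n), y)

def predict(X_train, alpha, X_new, gamma=1.0):
    """Evaluate f(x) = sum_i alpha_i k(x_i, x) at the rows of X_new."""
    return gaussian_kernel(X_new, X_train, gamma) @ alpha

# Toy data (hypothetical): noiseless samples of sin(3x) on [-1, 1].
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(50, 1))
y = np.sin(3 * X[:, 0])
alpha = fit_alpha(X, y)
y_hat = predict(X, alpha, X)
```

The fitted function is stored entirely in the $n$ coefficients `alpha`, which is the practical content of the representer theorem.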
  • 5. MKL. Challenge: Multiple Kernel Learning
  • 6. Multiple Kernel Learning
    Single Kernel Learning:
    $$\hat f \leftarrow \min_{f \in \mathcal{H}_k} \frac{1}{n}\sum_{i=1}^{n} \ell(y_i, f(x_i)) + C\|f\|_{\mathcal{H}_k}$$
    Multiple Kernel Learning (Lanckriet et al., 2004; Bach et al., 2004):
    $$\hat f = \sum_{m=1}^{M} \hat f_m \leftarrow \min_{f_m \in \mathcal{H}_m} \frac{1}{n}\sum_{i=1}^{n} \ell\Big(y_i, \sum_{m=1}^{M} f_m(x_i)\Big) + C\sum_{m=1}^{M}\|f_m\|_{\mathcal{H}_m}$$
    ($\mathcal{H}_m$: the RKHS of $k_m$)
    Group Lasso (Sonnenburg et al., 2006; Rakotomamonjy et al., 2008; Suzuki & Tomioka, 2009)
  • 7. MKL
    L1-MKL (Lanckriet et al., 2004; Bach et al., 2004):
    $$\min_{f_m \in \mathcal{H}_m} L\Big(\sum_{m=1}^{M} f_m\Big) + C\sum_{m=1}^{M}\|f_m\|_{\mathcal{H}_m}$$
    L2-MKL:
    $$\min_{f_m \in \mathcal{H}_m} L\Big(\sum_{m=1}^{M} f_m\Big) + C\sum_{m=1}^{M}\|f_m\|_{\mathcal{H}_m}^2$$
  • 8. MKL
    L1-MKL (Lanckriet et al., 2004; Bach et al., 2004):
    $$\min_{f_m \in \mathcal{H}_m} L\Big(\sum_{m=1}^{M} f_m\Big) + C\sum_{m=1}^{M}\|f_m\|_{\mathcal{H}_m}$$
    L2-MKL:
    $$\min_{f_m \in \mathcal{H}_m} L\Big(\sum_{m=1}^{M} f_m\Big) + C\sum_{m=1}^{M}\|f_m\|_{\mathcal{H}_m}^2$$
    Elasticnet-MKL (Tomioka & Suzuki, 2009):
    $$\min_{f_m \in \mathcal{H}_m} L\Big(\sum_{m=1}^{M} f_m\Big) + C_1\sum_{m=1}^{M}\|f_m\|_{\mathcal{H}_m} + C_2\sum_{m=1}^{M}\|f_m\|_{\mathcal{H}_m}^2$$
    Mixed-Norm-Elasticnet-MKL (Meier et al., 2009):
    $$\min_{f_m \in \mathcal{H}_m} L\Big(\sum_{m=1}^{M} f_m\Big) + C_1\sum_{m=1}^{M}\sqrt{\|f_m\|_n^2 + C_2\|f_m\|_{\mathcal{H}_m}^2} + C_3\sum_{m=1}^{M}\|f_m\|_{\mathcal{H}_m}^2$$
    where $\|f\|_n^2 := \frac{1}{n}\sum_{i=1}^{n} f(x_i)^2$.
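To make the four regularizers on slide 8 concrete, here is a small sketch that evaluates each penalty from the per-kernel norms $\|f_m\|_{\mathcal{H}_m}$ and $\|f_m\|_n$. The function name `mkl_penalties` and the example numbers are hypothetical, and the single constant $C$ of L1-MKL and L2-MKL is passed as `C1` for brevity.

```python
import numpy as np

def mkl_penalties(h_norms, n_norms, C1=1.0, C2=1.0, C3=1.0):
    """Evaluate the penalty terms of slide 8 (the loss L is not included).

    h_norms[m] = ||f_m||_{H_m}  (RKHS norm of the m-th component)
    n_norms[m] = ||f_m||_n      (empirical L2 norm of the m-th component)
    """
    h = np.asarray(h_norms, dtype=float)
    e = np.asarray(n_norms, dtype=float)
    return {
        "L1": C1 * h.sum(),
        "L2": C1 * (h ** 2).sum(),
        "elasticnet": C1 * h.sum() + C2 * (h ** 2).sum(),
        "mixed_norm_elasticnet": C1 * np.sqrt(e ** 2 + C2 * h ** 2).sum()
                                 + C3 * (h ** 2).sum(),
    }

# Hypothetical example with M = 3 kernels, one of them inactive.
pen = mkl_penalties(h_norms=[3.0, 0.0, 4.0], n_norms=[4.0, 0.0, 3.0])
```

In this example the inactive component contributes nothing to any penalty, while the non-squared terms are the ones that can drive whole components to zero.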
  • 10. Mixed-Norm-Elasticnet-MKL: regression
    Squared loss: $L(f) = \frac{1}{n}\sum_{i=1}^{n} (f(x_i) - y_i)^2$
    True function: $f^*(x) = \sum_{m=1}^{M} f_m^*(x) \;(= \mathrm{E}[Y \mid x])$
  • 11. Convergence rates of $\|\hat f - f^*\|_{L_2}^2$, with $d = |\{m \mid \|f_m^*\|_{\mathcal{H}_m} \neq 0\}|$
    L1-MKL (Koltchinskii & Yuan, 2008):
    $$O_p\Big(d^{\frac{1-s}{1+s}}\, n^{-\frac{1}{1+s}} + \frac{d\log(M)}{n}\Big)$$
    Mixed-Norm-Elasticnet-MKL (Meier et al., 2009), which does not attain the mini-max rate:
    $$O_p\Big(d\,\Big(\frac{\log(M)}{n}\Big)^{\frac{1}{1+s}}\Big)$$
    Mixed-Norm-L1-MKL (Koltchinskii & Yuan, 2010), mini-max optimal, with penalty $\sum_m (C_1\|f_m\|_n + C_2\|f_m\|_{\mathcal{H}_m})$:
    $$O_p\Big(d\, n^{-\frac{1}{1+s}} + \frac{d\log(M)}{n}\Big)$$
    Mini-max rate (Raskutti et al., 2009):
    $$O_p\Big(d\, n^{-\frac{1}{1+s}} + \frac{d\log(M/d)}{n}\Big)$$
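The rates on slide 11 are easier to compare with numbers plugged in. The sketch below evaluates each expression with all constants set to 1, which is an assumption: the $O_p$ statements hide constants, so only the scaling in $d$, $n$, $M$ and $s$ is meaningful. Function names and the example values are hypothetical.

```python
import math

def rate_l1_ky2008(d, n, M, s):
    """L1-MKL (Koltchinskii & Yuan, 2008)."""
    return d ** ((1 - s) / (1 + s)) * n ** (-1 / (1 + s)) + d * math.log(M) / n

def rate_meier2009(d, n, M, s):
    """Mixed-Norm-Elasticnet-MKL (Meier et al., 2009)."""
    return d * (math.log(M) / n) ** (1 / (1 + s))

def rate_ky2010(d, n, M, s):
    """Mixed-Norm-L1-MKL (Koltchinskii & Yuan, 2010)."""
    return d * n ** (-1 / (1 + s)) + d * math.log(M) / n

def rate_minimax(d, n, M, s):
    """Mini-max rate (Raskutti et al., 2009)."""
    return d * n ** (-1 / (1 + s)) + d * math.log(M / d) / n

# Hypothetical setting: 5 active kernels out of 100, 1000 samples, s = 0.5.
d, n, M, s = 5, 1000, 100, 0.5
rates = {
    "K&Y 2008": rate_l1_ky2008(d, n, M, s),
    "Meier 2009": rate_meier2009(d, n, M, s),
    "K&Y 2010": rate_ky2010(d, n, M, s),
    "mini-max": rate_minimax(d, n, M, s),
}
```

The Koltchinskii & Yuan (2010) expression differs from the mini-max expression only through $\log(M)$ versus $\log(M/d)$, whereas the Meier et al. expression carries $\log(M)$ inside the leading power.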
  • 12. Mixed-Norm-Elasticnet-MKL: result
    $$\|\hat f - f^*\|_{L_2}^2 = O_p\Big(d^{\frac{1+q}{1+q+s}}\, n^{-\frac{1+q}{1+q+s}}\, R_2^{\frac{2s}{1+q+s}} + \frac{d\log(M)}{n}\Big)$$
    Here $q$ is the smoothness parameter of $f^*$ and $R_2$ is an ℓ2-norm-type bound for $f^*$. The rate is mini-max over the ℓ2-ball (compare the ℓ∞-ball).
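  • 13. Comparison of convergence rates
    Method | smoothness (q) | mini-max | rate
    Koltchinskii & Yuan (2008) | $q = 1$ | ? | $d^{\frac{1-s}{1+s}} n^{-\frac{1}{1+s}} + \frac{d\log(M)}{n}$
    Meier et al. (2009) | $q = 0$ | no | $d\big(\frac{\log(M)}{n}\big)^{\frac{1}{1+s}}$
    Koltchinskii & Yuan (2010) | $q = 0$ | ℓ∞-ball | $d\, n^{-\frac{1}{1+s}} + \frac{d\log(M)}{n}$
    IBIS2010 | $0 \le q \le 1$ | ℓ∞-ball | $d\, n^{-\frac{1+q}{1+q+s}} + \frac{d\log(M)}{n}$
    Mixed-Norm-Elasticnet-MKL | $0 \le q \le 1$ | ℓ2-ball | $\big(\frac{d}{n}\big)^{\frac{1+q}{1+q+s}} R_2^{\frac{2s}{1+q+s}} + \frac{d\log(M)}{n}$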
  • 15. Notation
    $I_0 := \{m \mid \|f_m^*\|_{\mathcal{H}_m} \neq 0\}$, so that $\|f_m^*\|_{\mathcal{H}_m} > 0$ for $m \in I_0$ and $\|f_m^*\|_{\mathcal{H}_m} = 0$ for $m \in I_0^c$.
    $d = |I_0|$ (the number of nonzero components).
  • 16. Spectrum Condition (s)
    $0 < s < 1$. Mercer expansion:
    $$k_m(x, x') = \sum_{\ell=1}^{\infty} \mu_{\ell,m}\, \phi_{\ell,m}(x)\, \phi_{\ell,m}(x')$$
    where $\{\phi_{\ell,m}\}_{\ell=1}^{\infty}$ is an orthonormal system (ONS) in $L_2(P)$.
    Spectrum Condition (s): there exists $0 < s < 1$ such that $\mu_{\ell,m} \le C\,\ell^{-\frac{1}{s}}$ ($\forall \ell, m$).
    $s$ measures the complexity of the RKHS: a smaller $s$ means faster eigenvalue decay and a simpler RKHS.
    Proposition (Steinwart et al. (2009)): $\mu_{\ell,m} \sim \ell^{-\frac{1}{s}} \Leftrightarrow N(B(\mathcal{H}_m), \epsilon, L_2(P)) \sim \epsilon^{-2s}$.
  • 17. Convolution Condition (q)
    $0 \le q \le 1$: a condition on $f^*$.
    Define $\Sigma_m : \mathcal{H}_m \to \mathcal{H}_m$ by $\langle f, \Sigma_m g\rangle_{\mathcal{H}_m} := \mathrm{E}[f(X)g(X)]$.
    Convolution Condition (q) (Caponnetto & de Vito, 2007): there exist $0 \le q \le 1$ and $g_m^* \in \mathcal{H}_m$ such that
    $$f_m^* = \Sigma_m^{q/2}\, g_m^*.$$
    With $k_m^{(q/2)}(x, x') := \sum_{\ell=1}^{\infty} \mu_{\ell,m}^{q/2}\, \phi_{\ell,m}(x)\, \phi_{\ell,m}(x')$, this reads
    $$f_m^*(x) = \int k_m^{(q/2)}(x, x')\, g_m^*(x')\, dP(x').$$
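In the eigenbasis $\{\phi_{\ell,m}\}$ the Convolution Condition is a coefficient-wise rescaling, which the sketch below illustrates for a single kernel. It assumes a finite truncation of the Mercer expansion, eigenvalues that meet the Spectrum Condition with equality, and the standard identity $\|f\|_{\mathcal{H}}^2 = \sum_\ell a_\ell^2/\mu_\ell$ for $f = \sum_\ell a_\ell \phi_\ell$. All function names and example values are hypothetical.

```python
import numpy as np

def spectrum(num_terms, s, C=1.0):
    """Eigenvalues mu_l = C * l^(-1/s), meeting the Spectrum Condition with equality."""
    l = np.arange(1, num_terms + 1, dtype=float)
    return C * l ** (-1.0 / s)

def apply_sigma_power(g_coef, mu, q):
    """Coefficients of f = Sigma^(q/2) g in the basis {phi_l}: a_l -> mu_l^(q/2) * a_l."""
    return mu ** (q / 2.0) * g_coef

def rkhs_norm(coef, mu):
    """||f||_H for f = sum_l a_l phi_l, using ||f||_H^2 = sum_l a_l^2 / mu_l."""
    return float(np.sqrt(np.sum(coef ** 2 / mu)))

def l2_norm(coef):
    """||f||_{L2(P)} for the same expansion; {phi_l} is orthonormal in L2(P)."""
    return float(np.sqrt(np.sum(coef ** 2)))

# Truncated example: 4 eigenvalues with s = 0.5, i.e. mu = 1, 1/4, 1/9, 1/16.
mu = spectrum(4, s=0.5)
g_coef = np.sqrt(mu)                        # chosen so that ||g||_H = 2
f_coef = apply_sigma_power(g_coef, mu, q=1.0)
```

With $q = 0$ the map is the identity, so $f_m^* = g_m^*$; a larger $q$ damps the high-index coefficients more strongly, which is the sense in which $q$ encodes extra smoothness of $f^*$.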
  • 18. [Figure: relation between $s$, $q$ and $f^*$, in three panels: (a) $q = 0$, (b) $q > 0$, (c) $q > 0$.]
  • 19. Incoherence Condition
    Incoherence Condition (Koltchinskii & Yuan, 2008; Meier et al., 2009): there exists a constant $C > 0$ such that
    $$0 < C < \kappa(I_0)\,(1 - \rho^2(I_0)),$$
    where
    $$\kappa(I) := \sup\Big\{\kappa \ge 0 \;\Big|\; \kappa \le \frac{\|\sum_{m \in I} f_m\|_{L_2}^2}{\sum_{m \in I}\|f_m\|_{L_2}^2},\ \forall f_m \in \mathcal{H}_m\ (m \in I)\Big\},$$
    $$\rho(I) := \sup\Big\{\frac{\langle f_I, g_{I^c}\rangle_{L_2}}{\|f_I\|_{L_2}\,\|g_{I^c}\|_{L_2}} \;\Big|\; f_I \in \mathcal{H}_I,\ g_{I^c} \in \mathcal{H}_{I^c},\ f_I \neq 0,\ g_{I^c} \neq 0\Big\}.$$
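For intuition, $\kappa(I)$ and $\rho(I)$ on slide 19 can be computed exactly in a finite-dimensional stand-in. The sketch below is a toy under strong assumptions that are not in the slides: each $\mathcal{H}_m$ is replaced by the column space of a matrix of function values at $n$ points, $\mathcal{H}_I$ is taken to be the sum of the spaces $\mathcal{H}_m$ with $m \in I$, and the $L_2$ inner product is replaced by the Euclidean one (the $1/n$ factor cancels in both ratios). The function names are hypothetical.

```python
import numpy as np

def orthonormal_basis(B, tol=1e-10):
    """Orthonormal basis (as columns) of the column space of B, via SVD."""
    U, sv, _ = np.linalg.svd(B, full_matrices=False)
    return U[:, sv > tol * sv.max()]

def kappa(blocks):
    """Largest kappa with ||sum_m f_m||^2 >= kappa * sum_m ||f_m||^2.

    With an orthonormal basis Q_m per block this is the smallest eigenvalue
    of the Gram matrix of the stacked bases.
    """
    Q = np.hstack([orthonormal_basis(B) for B in blocks])
    return float(np.linalg.eigvalsh(Q.T @ Q)[0])

def rho(blocks_in, blocks_out):
    """Largest cosine between the span of blocks_in and the span of blocks_out."""
    U = orthonormal_basis(np.hstack(blocks_in))
    V = orthonormal_basis(np.hstack(blocks_out))
    return float(np.linalg.svd(U.T @ V, compute_uv=False)[0])

# Toy example in R^3: I_0 = {e1, e2}, and one inactive direction (e1 + e3) / sqrt(2).
e1 = np.array([[1.0], [0.0], [0.0]])
e2 = np.array([[0.0], [1.0], [0.0]])
h3 = np.array([[1.0], [0.0], [1.0]]) / np.sqrt(2.0)

k0 = kappa([e1, e2])
r0 = rho([e1, e2], [h3])
incoherence = k0 * (1.0 - r0 ** 2)   # must be bounded away from 0
```

The condition fails exactly when the active components are nearly collinear among themselves ($\kappa \to 0$) or nearly reproducible from the inactive ones ($\rho \to 1$).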
  • 20. Basic Condition and ∞-norm Bound Condition
    Basic Condition:
    $\mathrm{E}[Y \mid X] = f^*(X) = \sum_{m=1}^{M} f_m^*(X)$.
    The noise $\epsilon := Y - f^*(X)$ is bounded: $|\epsilon| \le L$.
    $\sup_{X \in \mathcal{X}} |k_m(X, X)| \le 1$ ($\forall m$).
    ∞-norm Bound Condition: for the $s$ of the Spectrum Condition (s),
    $$\|f_m\|_{\infty} \le C\, \|f_m\|_{L_2(P)}^{1-s}\, \|f_m\|_{\mathcal{H}_m}^{s}.$$
    Examples: Gaussian, Sobolev (Mendelson and Neeman (2010); Steinwart et al. (2009)).
  • 21. Mixed-Elasticnet-MKL: main theorem
    Mixed-Norm-Elasticnet-MKL:
    $$\min_{f_m \in \mathcal{H}_m} L\Big(\sum_{m=1}^{M} f_m\Big) + \lambda_1^{(n)}\sum_{m=1}^{M}\sqrt{\|f_m\|_n^2 + \lambda_2^{(n)}\|f_m\|_{\mathcal{H}_m}^2} + \lambda_3^{(n)}\sum_{m=1}^{M}\|f_m\|_{\mathcal{H}_m}^2.$$
    Theorem (Suzuki et al. (2011)): assume the Spectrum Condition (s), Convolution Condition (q), Incoherence Condition, Basic Condition and ∞-norm Bound Condition. Then, for suitably chosen $\lambda_1^{(n)}, \lambda_2^{(n)}, \lambda_3^{(n)}$,
    $$\|\hat f - f^*\|_{L_2}^2 \le C'\Big(d^{\frac{1+q}{1+q+s}}\, n^{-\frac{1+q}{1+q+s}}\, R_{2,g^*}^{\frac{2s}{1+q+s}} + \frac{d\log(M)}{n}\Big)\,\eta(t)^2$$
    with probability $1 - e^{-\sqrt{n}\,t} - e^{-\sqrt{n}}$ ($\forall t \ge 1$), where
    $$\eta(t) := \max\big(\sqrt{t},\ t/\sqrt{n}\big), \qquad R_{2,g^*} := \Big(\sum_{m=1}^{M}\|g_m^*\|_{\mathcal{H}_m}^2\Big)^{\frac{1}{2}}.$$
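A short sketch of how the bound on slide 21 behaves numerically. The unknown constant $C'$ is set to 1, which is an assumption, so the values indicate scaling only. Function names and the example inputs are hypothetical.

```python
import math

def eta(t, n):
    """eta(t) = max(sqrt(t), t / sqrt(n))."""
    return max(math.sqrt(t), t / math.sqrt(n))

def r2_norm(g_norms):
    """R_{2,g*} = (sum_m ||g*_m||_{H_m}^2)^(1/2), from the per-kernel norms."""
    return math.sqrt(sum(v * v for v in g_norms))

def elasticnet_mkl_rate(d, n, M, s, q, R, t=1.0):
    """Right-hand side of the bound with C' = 1 (an assumption)."""
    a = (1 + q) / (1 + q + s)
    main = d ** a * n ** (-a) * R ** (2 * s / (1 + q + s))
    return (main + d * math.log(M) / n) * eta(t, n) ** 2

# Hypothetical example: 4 active kernels with ||g*_m|| = 1, so R = 2.
R = r2_norm([1.0, 1.0, 1.0, 1.0])
bound = elasticnet_mkl_rate(d=4, n=10_000, M=100, s=0.5, q=0.0, R=R)
```

For $t \le n$ the factor $\eta(t)^2$ equals $t$, so the confidence level enters the bound only linearly in that range.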
  • 22. Mixed-Elasticnet-MKL: comparison of bounds
    For $q = 0$, Koltchinskii and Yuan (2010) give $d\, n^{-\frac{1}{1+s}} + \frac{d\log(M)}{n}$.
    The Mixed-Norm-Elasticnet-MKL bound is $d^{\frac{1+q}{1+q+s}}\, n^{-\frac{1+q}{1+q+s}}\, R_{2,g^*}^{\frac{2s}{1+q+s}} + \frac{d\log(M)}{n}$.
    1. $\|f_m^*\|_{\mathcal{H}_m} = 1$ ($m = 1, \dots, d$): $d\, n^{-\frac{1}{1+s}} + \frac{d\log(M)}{n}$, the same rate as Koltchinskii and Yuan (2010).
    2. $\|f_m^*\|_{\mathcal{H}_m} = m^{-1}$ ($m = 1, \dots, d$): $d^{\frac{1}{1+s}}\, n^{-\frac{1}{1+s}} + \frac{d\log(M)}{n}$, whose leading term is smaller than that of Koltchinskii and Yuan (2010) by a factor $d^{\frac{s}{1+s}}$ (the two coincide at $s = 0$).
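The two cases on slide 22 follow from evaluating $R_{2,g^*}$ with $q = 0$, where $g_m^* = f_m^*$. The sketch below does that arithmetic for the leading term only, with constants set to 1 (an assumption); the function names and the values of `d`, `n`, `s` are hypothetical.

```python
import math

def r2(norms):
    """R_2 = (sum_m norms[m]^2)^(1/2)."""
    return math.sqrt(sum(v * v for v in norms))

def leading_term(d, n, s, R):
    """Leading term for q = 0: d^(1/(1+s)) * n^(-1/(1+s)) * R^(2s/(1+s))."""
    return d ** (1 / (1 + s)) * n ** (-1 / (1 + s)) * R ** (2 * s / (1 + s))

def ky2010_leading_term(d, n, s):
    """Leading term of Koltchinskii and Yuan (2010): d * n^(-1/(1+s))."""
    return d * n ** (-1 / (1 + s))

d, n, s = 16, 10_000, 0.5

# Case 1: all norms equal to 1, so R_2 = sqrt(d) and R_2^(2s/(1+s)) = d^(s/(1+s)).
case1 = leading_term(d, n, s, r2([1.0] * d))

# Case 2: norms 1/m, so R_2^2 = sum 1/m^2 stays below pi^2 / 6 for every d.
R_case2 = r2([1.0 / m for m in range(1, d + 1)])
case2 = leading_term(d, n, s, R_case2)

ratio = ky2010_leading_term(d, n, s) / case2   # grows like d^(s/(1+s))
```

In case 1 the factor $R_2^{2s/(1+s)} = d^{s/(1+s)}$ exactly restores the power of $d$ to 1, while in case 2 it stays bounded, which is where the gain of order $d^{s/(1+s)}$ comes from.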
  • 24. Mini-max rate
    Mini-max rates under $f_m^* = \Sigma_m^{q/2}\, g_m^*$:
    1. $\big(\sum_{m=1}^{M}\|g_m^*\|_{\mathcal{H}_m}^2\big)^{\frac{1}{2}} \le R_2$ ($g^*$ in the ℓ2-ball of radius $R_2$):
    $$d^{\frac{1+q}{1+q+s}}\, n^{-\frac{1+q}{1+q+s}}\, R_2^{\frac{2s}{1+q+s}} + \frac{d\log(M/d)}{n}$$
    2. $\max_m \|g_m^*\|_{\mathcal{H}_m} \le R_\infty$ ($g^*$ in the ℓ∞-ball of radius $R_\infty$):
    $$d\, n^{-\frac{1+q}{1+q+s}}\, R_\infty^{\frac{2s}{1+q+s}} + \frac{d\log(M/d)}{n}$$
    The case $q = 0$, $R_\infty = 1$ corresponds to Koltchinskii and Yuan (2010).
  • 26. Lp-MKL
    Lp-MKL (Kloft et al., 2009):
    $$\min_{f_m \in \mathcal{H}_m} L\Big(\sum_{m=1}^{M} f_m\Big) + \lambda_1^{(n)}\sum_{m=1}^{M}\|f_m\|_{\mathcal{H}_m}^p$$
    $$\eta(t) := \max\Big(\sqrt{t},\ \frac{t}{\sqrt{n}}\Big), \qquad R_p := \Big(\sum_{m=1}^{M}\|f_m^*\|_{\mathcal{H}_m}^p\Big)^{\frac{1}{p}}$$
    Theorem (Lp-MKL): assume the Spectrum Condition (s), Incoherence Condition, Basic Condition and ∞-norm Bound Condition. With $\lambda_1^{(n)} = n^{-\frac{1}{1+s}}\, M^{1-\frac{2s}{p(1+s)}}\, R_p^{-\frac{2}{1+s}}$,
    $$\|\hat f - f^*\|_{L_2}^2 \le C\Big(n^{-\frac{1}{1+s}}\, M^{1-\frac{2s}{p(1+s)}}\, R_p^{\frac{2s}{1+s}} + \frac{M\log(M)}{n}\Big)\,\eta(t)^2$$
    with probability $1 - \exp(-t) - \exp(-\sqrt{n})$.
    Mini-max.
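The Lp-MKL quantities on slide 26 can be evaluated directly. The sketch below reads the regularization parameter as $\lambda_1^{(n)} = n^{-1/(1+s)} M^{1-2s/(p(1+s))} R_p^{-2/(1+s)}$, which is a reconstruction of a garbled formula and should be treated as an assumption, as should setting the constant $C$ to 1. The function names and example values are hypothetical.

```python
import math

def rp_norm(h_norms, p):
    """R_p = (sum_m ||f*_m||_{H_m}^p)^(1/p)."""
    return sum(v ** p for v in h_norms) ** (1.0 / p)

def lp_lambda(n, M, s, p, Rp):
    """Assumed reading: n^(-1/(1+s)) * M^(1 - 2s/(p(1+s))) * Rp^(-2/(1+s))."""
    return n ** (-1 / (1 + s)) * M ** (1 - 2 * s / (p * (1 + s))) * Rp ** (-2 / (1 + s))

def lp_mkl_rate(n, M, s, p, Rp, t=1.0):
    """Right-hand side of the Lp-MKL bound with C = 1 (an assumption)."""
    eta = max(math.sqrt(t), t / math.sqrt(n))
    main = n ** (-1 / (1 + s)) * M ** (1 - 2 * s / (p * (1 + s))) * Rp ** (2 * s / (1 + s))
    return (main + M * math.log(M) / n) * eta ** 2

# Hypothetical example: M = 2 kernels with norms 3 and 4, p = 2.
Rp = rp_norm([3.0, 4.0], p=2)
lam = lp_lambda(n=10_000, M=2, s=0.5, p=2, Rp=Rp)
bound = lp_mkl_rate(n=10_000, M=2, s=0.5, p=2, Rp=Rp)
```

Under this reading, $\lambda_1^{(n)} R_p^2$ equals the leading term of the bound, so the penalty and the estimation error are of the same order. Note also that the second term scales with $M$ rather than $d$, since Lp-MKL with $p > 1$ does not exploit sparsity.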
  • 28. Conclusion
    Mixed-Norm-Elasticnet-MKL: smoothness $q$ of $f^*$; mini-max on the ℓ2-ball.
    Lp-MKL.
    arXiv: http://arxiv.org/abs/1103.0431
    slide: http://www.simplex.t.u-tokyo.ac.jp/~s-taiji/data/IBISML2011.pdf
  • 29. References
    Bach, F., Lanckriet, G., & Jordan, M. (2004). Multiple kernel learning, conic duality, and the SMO algorithm. the 21st International Conference on Machine Learning (pp. 41–48).
    Caponnetto, A., & de Vito, E. (2007). Optimal rates for regularized least-squares algorithm. Foundations of Computational Mathematics, 7, 331–368.
    Kloft, M., Brefeld, U., Sonnenburg, S., Laskov, P., Müller, K.-R., & Zien, A. (2009). Efficient and accurate ℓp-norm multiple kernel learning. Advances in Neural Information Processing Systems 22 (pp. 997–1005). Cambridge, MA: MIT Press.
    Koltchinskii, V., & Yuan, M. (2008). Sparse recovery in large ensembles of kernel machines. Proceedings of the Annual Conference on Learning Theory (pp. 229–238).
    Koltchinskii, V., & Yuan, M. (2010). Sparsity in multiple kernel learning. The Annals of Statistics, 38, 3660–3695.
    Lanckriet, G., Cristianini, N., Ghaoui, L. E., Bartlett, P., & Jordan, M. (2004). Learning the kernel matrix with semi-definite programming. Journal of Machine Learning Research, 5, 27–72.
    Meier, L., van de Geer, S., & Bühlmann, P. (2009). High-dimensional additive modeling. The Annals of Statistics, 37, 3779–3821.
  • 30. References (continued)
    Mendelson, S., & Neeman, J. (2010). Regularization in kernel learning. The Annals of Statistics, 38, 526–565.
    Rakotomamonjy, A., Bach, F., Canu, S., & Grandvalet, Y. (2008). SimpleMKL. Journal of Machine Learning Research, 9, 2491–2521.
    Raskutti, G., Wainwright, M., & Yu, B. (2009). Lower bounds on minimax rates for nonparametric regression with additive sparsity and smoothness. Advances in Neural Information Processing Systems 22 (pp. 1563–1570). Cambridge, MA: MIT Press.
    Sonnenburg, S., Rätsch, G., Schäfer, C., & Schölkopf, B. (2006). Large scale multiple kernel learning. Journal of Machine Learning Research, 7, 1531–1565.
    Steinwart, I., Hush, D., & Scovel, C. (2009). Optimal rates for regularized least squares regression. Proceedings of the Annual Conference on Learning Theory (pp. 79–93).
    Suzuki, T., & Tomioka, R. (2009). SpicyMKL. arXiv:0909.5026.
    Suzuki, T., Tomioka, R., & Sugiyama, M. (2011). Fast convergence rate of multiple kernel learning with elastic-net regularization. arXiv:1103.0431.
  • 31. References (continued)
    Tomioka, R., & Suzuki, T. (2009). Sparsity-accuracy trade-off in MKL. NIPS 2009 Workshop: Understanding Multiple Kernel Learning Methods. Whistler. arXiv:1001.2615.
