Jokyokai2
 



    Jokyokai2 Presentation Transcript

    • Introduction, Mixed-Norm-Elasticnet-MKL, Mini-max, Lp-MKL, Conclusion, References
      Fast Convergence Rate of Multiple Kernel Learning with Elastic-Net Regularization
      April 25, 2011
    • Outline
      1. Introduction: MKL
      2. Mixed-Norm-Elasticnet-MKL: Mixed-Elasticnet-MKL
      3. Mini-max
      4. Lp-MKL
      5. Conclusion
    • MKL (RKHS)
      Kernel and RKHS: $k(x, x') \Leftrightarrow H_k$
      $\hat f \leftarrow \min_{f \in H_k} \frac{1}{n} \sum_{i=1}^{n} \ell(y_i, f(x_i)) + C \|f\|_{H_k}$
      $\exists \alpha_i \in \mathbb{R}$ s.t. $\hat f(x) = \sum_{i=1}^{n} \alpha_i k(x_i, x)$
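
The kernel expansion $\hat f(x) = \sum_i \alpha_i k(x_i, x)$ on this slide can be made concrete with a small sketch. It is only an illustration under assumptions that are not on the slide: the squared loss, a squared RKHS-norm penalty (kernel ridge regression, for which $\alpha = (K + nC I)^{-1} y$ in closed form), a Gaussian kernel on scalar inputs, and hypothetical helper names.

```python
import math


def gaussian_kernel(x, z, bandwidth=1.0):
    """Gaussian kernel on scalars (an assumed choice of k)."""
    return math.exp(-((x - z) ** 2) / (2.0 * bandwidth ** 2))


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_kernel_ridge(xs, ys, c=1e-3, kernel=gaussian_kernel):
    """Return alpha solving (K + n*c*I) alpha = y (squared loss, squared norm)."""
    n = len(xs)
    gram = [[kernel(xi, xj) for xj in xs] for xi in xs]
    for i in range(n):
        gram[i][i] += n * c
    return solve_linear_system(gram, ys)


def predict(xs, alpha, x, kernel=gaussian_kernel):
    """Evaluate f(x) = sum_i alpha_i * k(x_i, x)."""
    return sum(a * kernel(xi, x) for a, xi in zip(alpha, xs))


xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [math.sin(x) for x in xs]
alpha = fit_kernel_ridge(xs, ys)
print(predict(xs, alpha, 0.75))
```

The fitted function is fully described by the n coefficients in `alpha`, which is what makes optimization over the (possibly infinite-dimensional) RKHS tractable.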
    • MKL: Challenge. Multiple Kernel Learning.
    • MKL: Multiple Kernel Learning
      Single Kernel Learning: $\hat f \leftarrow \min_{f \in H_k} \frac{1}{n} \sum_{i=1}^{n} \ell(y_i, f(x_i)) + C \|f\|_{H_k}$
      Multiple Kernel Learning (Lanckriet et al., 2004; Bach et al., 2004): $\hat f = \sum_{m=1}^{M} \hat f_m \leftarrow \min_{f_m \in H_m} \frac{1}{n} \sum_{i=1}^{n} \ell\left(y_i, \sum_{m=1}^{M} f_m(x_i)\right) + C \sum_{m=1}^{M} \|f_m\|_{H_m}$ ($H_m$: the RKHS of $k_m$)
      Group Lasso (Sonnenburg et al., 2006; Rakotomamonjy et al., 2008; Suzuki & Tomioka, 2009)
    • MKL
      L1-MKL (Lanckriet et al., 2004; Bach et al., 2004): $\min_{f_m \in H_m} L\left(\sum_{m=1}^{M} f_m\right) + C \sum_{m=1}^{M} \|f_m\|_{H_m}$
      L2-MKL: $\min_{f_m \in H_m} L\left(\sum_{m=1}^{M} f_m\right) + C \sum_{m=1}^{M} \|f_m\|_{H_m}^2$
      Elasticnet-MKL (Tomioka & Suzuki, 2009): $\min_{f_m \in H_m} L\left(\sum_{m=1}^{M} f_m\right) + C_1 \sum_{m=1}^{M} \|f_m\|_{H_m} + C_2 \sum_{m=1}^{M} \|f_m\|_{H_m}^2$
      Mixed-Norm-Elasticnet-MKL (Meier et al., 2009): $\min_{f_m \in H_m} L\left(\sum_{m=1}^{M} f_m\right) + C_1 \sum_{m=1}^{M} \sqrt{\|f_m\|_n^2 + C_2 \|f_m\|_{H_m}^2} + C_3 \sum_{m=1}^{M} \|f_m\|_{H_m}^2$, where $\|f\|_n^2 := \frac{1}{n} \sum_{i=1}^{n} f(x_i)^2$
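
To compare the four regularizers on this slide side by side, the following sketch evaluates each penalty from precomputed per-kernel norms. The function names and the example numbers are assumptions made for illustration; computing the RKHS norms $\|f_m\|_{H_m}$ themselves is outside the scope of the sketch.

```python
import math


def empirical_norm(values):
    """||f||_n = sqrt((1/n) * sum_i f(x_i)^2), given the values f(x_1), ..., f(x_n)."""
    return math.sqrt(sum(v ** 2 for v in values) / len(values))


def l1_penalty(rkhs_norms, c):
    """L1-MKL: C * sum_m ||f_m||_{H_m}."""
    return c * sum(rkhs_norms)


def l2_penalty(rkhs_norms, c):
    """L2-MKL: C * sum_m ||f_m||_{H_m}^2."""
    return c * sum(h ** 2 for h in rkhs_norms)


def elasticnet_penalty(rkhs_norms, c1, c2):
    """Elasticnet-MKL: C1 * sum_m ||f_m|| + C2 * sum_m ||f_m||^2."""
    return l1_penalty(rkhs_norms, c1) + l2_penalty(rkhs_norms, c2)


def mixed_norm_elasticnet_penalty(empirical_norms, rkhs_norms, c1, c2, c3):
    """C1 * sum_m sqrt(||f_m||_n^2 + C2 ||f_m||_{H_m}^2) + C3 * sum_m ||f_m||_{H_m}^2."""
    mixed = sum(math.sqrt(e ** 2 + c2 * h ** 2)
                for e, h in zip(empirical_norms, rkhs_norms))
    return c1 * mixed + c3 * sum(h ** 2 for h in rkhs_norms)


# Hypothetical norms for M = 3 components; the second component is exactly zero.
rkhs_norms = [3.0, 0.0, 4.0]
empirical_norms = [4.0, 0.0, 3.0]

print(l1_penalty(rkhs_norms, 1.0))
print(l2_penalty(rkhs_norms, 1.0))
print(elasticnet_penalty(rkhs_norms, 1.0, 0.5))
print(mixed_norm_elasticnet_penalty(empirical_norms, rkhs_norms, 1.0, 1.0, 1.0))
```

A zero component contributes nothing to any of the penalties; the non-squared terms are the ones that favor such exact zeros, as in the group lasso.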
    • Mixed-Norm-Elasticnet-MKL: regression
      $L(f) = \frac{1}{n} \sum_{i=1}^{n} (f(x_i) - y_i)^2$
      $f^*(x) = \sum_{m=1}^{M} f^*_m(x) \; (= E[Y \mid x])$
    • Bounds on $\|\hat f - f^*\|_{L_2}^2$, with $d = |\{m \mid \|f^*_m\|_{H_m} \neq 0\}|$
      L1-MKL (Koltchinskii & Yuan, 2008): $O_p\left(d^{\frac{1-s}{1+s}} n^{-\frac{1}{1+s}} + \frac{d \log(M)}{n}\right)$
      Mixed-Norm-Elasticnet-MKL (Meier et al., 2009): $O_p\left(d \left(\frac{\log(M)}{n}\right)^{\frac{1}{1+s}}\right)$
      Mixed-Norm-L1-MKL (Koltchinskii & Yuan, 2010), with penalty $\sum_m (C_1 \|f_m\|_n + C_2 \|f_m\|_{H_m})$: $O_p\left(d n^{-\frac{1}{1+s}} + \frac{d \log(M)}{n}\right)$
      Mini-max (Raskutti et al., 2009): $O_p\left(d n^{-\frac{1}{1+s}} + \frac{d \log(M/d)}{n}\right)$
    • Mixed-Norm-Elasticnet-MKL
      $\|\hat f - f^*\|_{L_2}^2 = O_p\left(d^{\frac{1+q}{1+q+s}} n^{-\frac{1+q}{1+q+s}} R_2^{\frac{2s}{1+q+s}} + \frac{d \log(M)}{n}\right)$.
      The bound depends on $f^*$ through $q$ and through $R_2$; it is a mini-max rate on the $\ell_2$-ball, in contrast to earlier $\ell_\infty$-ball results.
    • Comparison of rates (by $q$ and ball)
      K&Y (2008): $q = 1$ ?; $d^{\frac{1-s}{1+s}} n^{-\frac{1}{1+s}} + \frac{d \log(M)}{n}$
      Meier et al. (2009): $q = 0$; $d \left(\frac{\log(M)}{n}\right)^{\frac{1}{1+s}}$
      K&Y (2010): $q = 0$, $\ell_\infty$-ball; $d n^{-\frac{1}{1+s}} + \frac{d \log(M)}{n}$
      IBIS2010: $0 \le q \le 1$, $\ell_\infty$-ball; $d n^{-\frac{1+q}{1+q+s}} + \frac{d \log(M)}{n}$
      Suzuki et al. (2011): $0 \le q \le 1$, $\ell_2$-ball; $\left(\frac{d}{n}\right)^{\frac{1+q}{1+q+s}} R_2^{\frac{2s}{1+q+s}} + \frac{d \log(M)}{n}$
    • $I_0 := \{m \mid \|f^*_m\|_{H_m} \neq 0\}$, so that $\|f^*_m\|_{H_m} > 0$ $(m \in I_0)$ and $\|f^*_m\|_{H_m} = 0$ $(m \in I_0^c)$; $d = |I_0|$.
    • Spectrum Condition (s), $0 < s < 1$
      Mercer expansion: $k_m(x, x') = \sum_{\ell=1}^{\infty} \mu_{\ell,m} \phi_{\ell,m}(x) \phi_{\ell,m}(x')$, where $\{\phi_{\ell,m}\}_{\ell=1}^{\infty}$ is an ONS in $L_2(P)$.
      Spectrum Condition (s): there exists $0 < s < 1$ such that $\mu_{\ell,m} \le C \ell^{-\frac{1}{s}}$ $(\forall \ell, m)$.
      $s$ measures the complexity of the RKHS.
      Proposition (Steinwart et al. (2009)): $\mu_{\ell,m} \sim \ell^{-\frac{1}{s}} \Leftrightarrow N(B(H_m), \epsilon, L_2(P)) \sim \epsilon^{-2s}$
    • Convolution Condition (q), $0 \le q \le 1$: a condition on $f^*$
      Define $\Sigma_m : H_m \to H_m$ by $\langle f, \Sigma_m g \rangle_{H_m} := E[f(X) g(X)]$.
      Convolution Condition (q) (Caponnetto & de Vito, 2007): there exist $0 \le q \le 1$ and $g^*_m \in H_m$ such that $f^*_m = \Sigma_m^{q/2} g^*_m$.
      With $k_m^{(q/2)}(x, x') := \sum_{\ell=1}^{\infty} \mu_{\ell,m}^{q/2} \phi_{\ell,m}(x) \phi_{\ell,m}(x')$, this reads $f^*_m(x) = \int k_m^{(q/2)}(x, x') g^*_m(x') \, dP(x')$.
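
The Convolution Condition can be read coordinate-wise: if $g^*_m = \sum_\ell b_\ell \phi_{\ell,m}$ in the $L_2(P)$ basis, then $f^*_m = \Sigma_m^{q/2} g^*_m$ has coefficients $\mu_{\ell,m}^{q/2} b_\ell$. The sketch below illustrates this on a finite truncation of the spectrum. The truncation level, the polynomial spectrum $\mu_\ell = \ell^{-1/s}$, the choice of $b_\ell$, and all function names are assumptions made for the example.

```python
def spectrum(num_terms, s):
    """Eigenvalues mu_l = l^(-1/s), the boundary case of the Spectrum Condition."""
    return [l ** (-1.0 / s) for l in range(1, num_terms + 1)]


def apply_power_of_covariance(g_coeffs, mu, q):
    """L2(P)-basis coefficients of Sigma^(q/2) g: multiply b_l by mu_l^(q/2)."""
    return [m ** (q / 2.0) * b for m, b in zip(mu, g_coeffs)]


def rkhs_norm(coeffs, mu):
    """RKHS norm of sum_l a_l phi_l, i.e. sqrt(sum_l a_l^2 / mu_l)."""
    return sum(a * a / m for a, m in zip(coeffs, mu)) ** 0.5


mu = spectrum(50, s=0.5)
g_coeffs = list(mu)  # hypothetical g with coefficients b_l = mu_l

for q in (0.0, 0.5, 1.0):
    f_coeffs = apply_power_of_covariance(g_coeffs, mu, q)
    # Larger q damps the high-frequency coefficients more strongly.
    print(q, f_coeffs[-1], rkhs_norm(f_coeffs, mu))
```

With $q = 0$ the map is the identity, so $f^*_m = g^*_m$; as $q$ grows, the tail coefficients of $f^*_m$ shrink faster.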
    • [Figure: $s$, $q$ and $f^*$. Panels: (a) $s$, $q = 0$; (b) $s$, $q > 0$; (c) $s$, $q > 0$]
    • Incoherence Condition
      Incoherence Condition (Koltchinskii & Yuan, 2008; Meier et al., 2009): there exists a constant $C$ such that $0 < C < \kappa(I_0)(1 - \rho^2(I_0))$, where
      $\kappa(I) := \sup\left\{\kappa \ge 0 \;\middle|\; \kappa \le \frac{\|\sum_{m \in I} f_m\|_{L_2}^2}{\sum_{m \in I} \|f_m\|_{L_2}^2}, \; \forall f_m \in H_m \; (m \in I)\right\}$,
      $\rho(I) := \sup\left\{\frac{\langle f_I, g_{I^c} \rangle_{L_2}}{\|f_I\|_{L_2} \|g_{I^c}\|_{L_2}} \;\middle|\; f_I \in H_I, \; g_{I^c} \in H_{I^c}, \; f_I \neq 0, \; g_{I^c} \neq 0\right\}$.
    • Basic Condition: $E[Y \mid X] = f^*(X) = \sum_{m=1}^{M} f^*_m(X)$; the noise $\epsilon := Y - f^*(X)$ satisfies $|\epsilon| \le L$; $\sup_{X \in \mathcal{X}} |k_m(X, X)| \le 1$ $(\forall m)$.
      ∞-norm Bound Condition: for the $s$ of Spectrum Condition (s), $\|f_m\|_\infty \le C \|f_m\|_{L_2(P)}^{1-s} \|f_m\|_{H_m}^{s}$.
      Examples: Gaussian, Sobolev (Mendelson and Neeman (2010); Steinwart et al. (2009)).
    • Mixed-Elasticnet-MKL
      Mixed-Norm-Elasticnet-MKL: $\min_{f_m \in H_m} L\left(\sum_{m=1}^{M} f_m\right) + \lambda_1^{(n)} \sum_{m=1}^{M} \sqrt{\|f_m\|_n^2 + \lambda_2^{(n)} \|f_m\|_{H_m}^2} + \lambda_3^{(n)} \sum_{m=1}^{M} \|f_m\|_{H_m}^2$.
      Theorem (Suzuki et al. (2011)): Assume Spectrum Condition (s), Convolution Condition (q), Incoherence Condition, Basic Condition, and ∞-norm Bound Condition. Then, for suitable $n$ and $\lambda_1^{(n)}, \lambda_2^{(n)}, \lambda_3^{(n)}$,
      $\|\hat f - f^*\|_{L_2}^2 \le C' \left(d^{\frac{1+q}{1+q+s}} n^{-\frac{1+q}{1+q+s}} R_{2,g^*}^{\frac{2s}{1+q+s}} + \frac{d \log(M)}{n}\right) \eta(t)^2$
      with probability $1 - e^{-t} - e^{-\sqrt{n}}$ $(\forall t \ge 1)$, where $\eta(t) := \max(\sqrt{t}, t/\sqrt{n})$ and $R_{2,g^*} := \left(\sum_{m=1}^{M} \|g^*_m\|_{H_m}^2\right)^{\frac{1}{2}}$.
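    • Mixed-Elasticnet-MKL: Bound
      For $q = 0$, the bound of Koltchinskii and Yuan (2010) is $d n^{-\frac{1}{1+s}} + \frac{d \log(M)}{n}$.
      The Mixed-Norm-Elasticnet-MKL bound is $d^{\frac{1+q}{1+q+s}} n^{-\frac{1+q}{1+q+s}} R_{2,g^*}^{\frac{2s}{1+q+s}} + \frac{d \log(M)}{n}$. Two examples with $q = 0$ (so $g^*_m = f^*_m$):
      1. $\|f^*_m\|_{H_m} = 1$ $(m = 1, \dots, d)$: here $R_{2,g^*} = \sqrt{d}$, giving $d n^{-\frac{1}{1+s}} + \frac{d \log(M)}{n}$, the same as Koltchinskii and Yuan (2010).
      2. $\|f^*_m\|_{H_m} = m^{-1}$ $(m = 1, \dots, d)$: here $R_{2,g^*}$ stays bounded in $d$, giving $d^{\frac{1}{1+s}} n^{-\frac{1}{1+s}} + \frac{d \log(M)}{n}$, whose first term is smaller than that of Koltchinskii and Yuan (2010) by a factor of $d^{\frac{s}{1+s}}$ (the factor is 1 only at $s = 0$).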
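
The two examples on this slide can be checked numerically. The sketch below evaluates both rates with all constants set to 1, which is an assumption (the theorem only gives the rate up to a constant $C'$); the function names and the values of $d$, $n$, $M$, $s$ are likewise chosen only for illustration.

```python
import math


def elasticnet_mkl_rate(d, n, M, s, q, r2):
    """d^((1+q)/(1+q+s)) * n^(-(1+q)/(1+q+s)) * R_2^(2s/(1+q+s)) + d*log(M)/n."""
    e = 1.0 + q + s
    main = d ** ((1 + q) / e) * n ** (-(1 + q) / e) * r2 ** (2 * s / e)
    return main + d * math.log(M) / n


def koltchinskii_yuan_2010_rate(d, n, M, s):
    """d * n^(-1/(1+s)) + d*log(M)/n."""
    return d * n ** (-1.0 / (1 + s)) + d * math.log(M) / n


d, n, M, s = 16, 10000, 100, 0.5

# Example 1: ||f*_m|| = 1 for m = 1..d, hence R_2 = sqrt(d).
case1 = elasticnet_mkl_rate(d, n, M, s, q=0, r2=math.sqrt(d))

# Example 2: ||f*_m|| = 1/m for m = 1..d, hence R_2^2 = sum_m m^(-2).
r2_decay = math.sqrt(sum(m ** -2.0 for m in range(1, d + 1)))
case2 = elasticnet_mkl_rate(d, n, M, s, q=0, r2=r2_decay)

print(case1, koltchinskii_yuan_2010_rate(d, n, M, s), case2)
```

In example 1 the two rates coincide, while in example 2 the dependence on $R_2$ instead of $d$ makes the first term smaller.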
    • Mini-max
      Mini-max rates under $f^*_m = \Sigma_m^{q/2} g^*_m$:
      1. $\left(\sum_{m=1}^{M} \|g^*_m\|_{H_m}^2\right)^{\frac{1}{2}} \le R_2$ ($g^*$ in the $\ell_2$-ball of radius $R_2$): $d^{\frac{1+q}{1+q+s}} n^{-\frac{1+q}{1+q+s}} R_2^{\frac{2s}{1+q+s}} + \frac{d \log(M/d)}{n}$
      2. $\max_m \|g^*_m\|_{H_m} \le R_\infty$ ($g^*$ in the $\ell_\infty$-ball of radius $R_\infty$): $d n^{-\frac{1+q}{1+q+s}} R_\infty^{\frac{2s}{1+q+s}} + \frac{d \log(M/d)}{n}$
      The case $q = 0$, $R_\infty = 1$ corresponds to Koltchinskii and Yuan (2010).
    • Lp-MKL
      Lp-MKL (Kloft et al., 2009): $\min_{f_m \in H_m} L\left(\sum_{m=1}^{M} f_m\right) + \lambda_1^{(n)} \sum_{m=1}^{M} \|f_m\|_{H_m}^p$
      $\eta(t) := \max\left(\sqrt{t}, \frac{t}{\sqrt{n}}\right)$, $R_p := \left(\sum_{m=1}^{M} \|f^*_m\|_{H_m}^p\right)^{\frac{1}{p}}$
      Theorem (Lp-MKL): Assume Spectrum Condition (s), Incoherence Condition, Basic Condition, and ∞-norm Bound Condition, and set $\lambda_1^{(n)} = n^{-\frac{1}{1+s}} M^{1 - \frac{2s}{p(1+s)}} R_p^{-\frac{2}{1+s}}$. Then
      $\|\hat f - f^*\|_{L_2}^2 \le C \left(n^{-\frac{1}{1+s}} M^{1 - \frac{2s}{p(1+s)}} R_p^{\frac{2s}{1+s}} + \frac{M \log(M)}{n}\right) \eta(t)^2$
      with probability $1 - \exp(-t) - \exp(-\sqrt{n})$.
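
As a numerical companion to the Lp-MKL theorem, the sketch below evaluates $\eta(t)$, $R_p$, the regularization parameter and the bound. It assumes the reading $\lambda_1^{(n)} = n^{-1/(1+s)} M^{1 - 2s/(p(1+s))} R_p^{-2/(1+s)}$ of the garbled slide text and sets the unspecified constant $C$ to 1; the function names and the example values are hypothetical.

```python
import math


def eta(t, n):
    """eta(t) = max(sqrt(t), t / sqrt(n))."""
    return max(math.sqrt(t), t / math.sqrt(n))


def lp_radius(rkhs_norms, p):
    """R_p = (sum_m ||f*_m||^p)^(1/p) from the per-kernel RKHS norms."""
    return sum(h ** p for h in rkhs_norms) ** (1.0 / p)


def lp_mkl_lambda(n, M, s, p, r_p):
    """Assumed reading: n^(-1/(1+s)) * M^(1 - 2s/(p(1+s))) * R_p^(-2/(1+s))."""
    return (n ** (-1.0 / (1 + s))
            * M ** (1 - 2 * s / (p * (1 + s)))
            * r_p ** (-2.0 / (1 + s)))


def lp_mkl_main_term(n, M, s, p, r_p):
    """n^(-1/(1+s)) * M^(1 - 2s/(p(1+s))) * R_p^(2s/(1+s))."""
    return (n ** (-1.0 / (1 + s))
            * M ** (1 - 2 * s / (p * (1 + s)))
            * r_p ** (2 * s / (1 + s)))


def lp_mkl_bound(n, M, s, p, r_p, t=1.0, c=1.0):
    """c * (main term + M*log(M)/n) * eta(t)^2, with c = 1 as a placeholder."""
    return c * (lp_mkl_main_term(n, M, s, p, r_p) + M * math.log(M) / n) * eta(t, n) ** 2


n, M, s, p = 1000, 50, 0.5, 2
r_p = lp_radius([3.0, 4.0] + [0.0] * (M - 2), p)
print(lp_mkl_lambda(n, M, s, p, r_p), lp_mkl_bound(n, M, s, p, r_p))
```

Under this reading, $\lambda_1^{(n)} R_p^2$ equals the first term of the bound, which is a quick consistency check on the two exponents of $R_p$.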
    • Conclusion
      Mixed-Norm-Elasticnet-MKL: bound depending on $q$ of $f^*$; $\ell_2$ mini-max.
      Lp-MKL.
      arXiv: http://arxiv.org/abs/1103.0431
      slide: http://www.simplex.t.u-tokyo.ac.jp/~s-taiji/data/IBISML2011.pdf
    • References
      Bach, F., Lanckriet, G., & Jordan, M. (2004). Multiple kernel learning, conic duality, and the SMO algorithm. The 21st International Conference on Machine Learning (pp. 41–48).
      Caponnetto, A., & de Vito, E. (2007). Optimal rates for regularized least-squares algorithm. Foundations of Computational Mathematics, 7, 331–368.
      Kloft, M., Brefeld, U., Sonnenburg, S., Laskov, P., Müller, K.-R., & Zien, A. (2009). Efficient and accurate ℓp-norm multiple kernel learning. Advances in Neural Information Processing Systems 22 (pp. 997–1005). Cambridge, MA: MIT Press.
      Koltchinskii, V., & Yuan, M. (2008). Sparse recovery in large ensembles of kernel machines. Proceedings of the Annual Conference on Learning Theory (pp. 229–238).
      Koltchinskii, V., & Yuan, M. (2010). Sparsity in multiple kernel learning. The Annals of Statistics, 38, 3660–3695.
      Lanckriet, G., Cristianini, N., Ghaoui, L. E., Bartlett, P., & Jordan, M. (2004). Learning the kernel matrix with semi-definite programming. Journal of Machine Learning Research, 5, 27–72.
      Meier, L., van de Geer, S., & Bühlmann, P. (2009). High-dimensional additive modeling. The Annals of Statistics, 37, 3779–3821.
      Mendelson, S., & Neeman, J. (2010). Regularization in kernel learning. The Annals of Statistics, 38, 526–565.
      Rakotomamonjy, A., Bach, F., Canu, S., & Grandvalet, Y. (2008). SimpleMKL. Journal of Machine Learning Research, 9, 2491–2521.
      Raskutti, G., Wainwright, M., & Yu, B. (2009). Lower bounds on minimax rates for nonparametric regression with additive sparsity and smoothness. Advances in Neural Information Processing Systems 22 (pp. 1563–1570). Cambridge, MA: MIT Press.
      Sonnenburg, S., Rätsch, G., Schäfer, C., & Schölkopf, B. (2006). Large scale multiple kernel learning. Journal of Machine Learning Research, 7, 1531–1565.
      Steinwart, I., Hush, D., & Scovel, C. (2009). Optimal rates for regularized least squares regression. Proceedings of the Annual Conference on Learning Theory (pp. 79–93).
      Suzuki, T., & Tomioka, R. (2009). SpicyMKL. arXiv:0909.5026.
      Suzuki, T., Tomioka, R., & Sugiyama, M. (2011). Fast convergence rate of multiple kernel learning with elastic-net regularization. arXiv:1103.0431.
      Tomioka, R., & Suzuki, T. (2009). Sparsity-accuracy trade-off in MKL. NIPS 2009 Workshop: Understanding Multiple Kernel Learning Methods. Whistler. arXiv:1001.2615.