Datamining 6th Svm

Transcript

  • 1. Review: classification with k-NN — given labeled training data (Yes/No), predict the label of each test example from its nearest neighbors.
  • 2. (Figure-only slide; no text survived extraction.)
  • 3. Setup: training data $(x_i, y_i)$, $i = 1, \dots, l$, with $x_i \in \mathbb{R}^n$ and labels $y_i \in \{1, -1\}$. The data are linearly separable if some weight vector $w$ and bias $b$ satisfy $y_i (w \cdot x_i + b) > 0$ for all $i = 1, \dots, l$.
  • 4. Decision function: classify by which side of the hyperplane $w \cdot x + b = 0$ a point falls on,
  $d(x) = \begin{cases} 1, & \text{if } w \cdot x + b \ge 0 \\ -1, & \text{otherwise.} \end{cases}$
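
A minimal NumPy sketch of this decision rule (the particular $w$ and $b$ below are illustrative values, not taken from the slides):

```python
import numpy as np

def d(x, w, b):
    """Linear decision function: +1 if w.x + b >= 0, else -1."""
    return 1 if np.dot(w, x) + b >= 0 else -1

# Illustrative weight vector and bias.
w = np.array([2.0, -1.0])
b = 0.5
print(d(np.array([1.0, 1.0]), w, b))   # w.x + b = 1.5 >= 0 -> 1
print(d(np.array([-1.0, 1.0]), w, b))  # w.x + b = -2.5 < 0 -> -1
```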
  • 5. Fisher's linear discriminant (1). With two classes, project the data onto a direction $w$ and separate them with the hyperplane $w \cdot x + b = 0$; the question is how to choose $w$ so the projected classes are as well separated as possible.
  • 6. Fisher's linear discriminant (2). Let $m_+$ and $m_-$ be the class means,
  $m_+ = \frac{\sum_{d(x)=1} x}{|\{x \mid d(x)=1\}|}, \qquad m_- = \frac{\sum_{d(x)=-1} x}{|\{x \mid d(x)=-1\}|}.$
  The between-class separation along $w$ is $|(m_+ - m_-) \cdot w|$; the within-class scatter about the hyperplane $w \cdot x + b = 0$ is
  $\sum_{d(x)=1} ((x - m_+) \cdot w)^2 + \sum_{d(x)=-1} ((x - m_-) \cdot w)^2.$
  A good $w$ makes the former large and the latter small.
  • 7. Fisher's linear discriminant (3). Normalize $|w| = 1$ and maximize the criterion (the bias $b$ in $w \cdot x + b$ does not affect it and is chosen afterwards):
  $J(w) = \frac{|(m_+ - m_-) \cdot w|^2}{\sum_{d(x)=1} ((x - m_+) \cdot w)^2 + \sum_{d(x)=-1} ((x - m_-) \cdot w)^2}.$
  The optimal $w$ is found by setting the gradient of $J(w)$ with respect to $w$ to $0$.
  • 8. Fisher's linear discriminant (4). In matrix form,
  $J(w) = \frac{w^T S_B w}{w^T S_W w}, \qquad S_B = (m_+ - m_-)(m_+ - m_-)^T, \qquad S_W = \sum_{d(x)=1} (x - m_+)(x - m_+)^T + \sum_{d(x)=-1} (x - m_-)(x - m_-)^T.$
  Setting $\partial J(w)/\partial w = 0$ via the quotient rule $\left(\frac{f}{g}\right)' = \frac{f'g - fg'}{g^2}$ gives
  $(w^T S_B w)\, S_W w = (w^T S_W w)\, S_B w.$
  Since $S_B w$ is always parallel to $m_+ - m_-$, the solution is $w \propto S_W^{-1} (m_+ - m_-)$.
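
The closed form $w \propto S_W^{-1}(m_+ - m_-)$ translates directly to code. A minimal NumPy sketch on synthetic two-class data (the data and the function name are illustrative):

```python
import numpy as np

def fisher_direction(X_pos, X_neg):
    """Fisher discriminant direction w = S_W^{-1} (m+ - m-).

    X_pos, X_neg: arrays of shape (n_samples, n_features), one per class.
    """
    m_pos = X_pos.mean(axis=0)
    m_neg = X_neg.mean(axis=0)
    # Within-class scatter: sum of outer products of centered samples.
    S_W = (X_pos - m_pos).T @ (X_pos - m_pos) \
        + (X_neg - m_neg).T @ (X_neg - m_neg)
    return np.linalg.solve(S_W, m_pos - m_neg)

# Two small synthetic clusters.
rng = np.random.default_rng(0)
X_pos = rng.normal([2.0, 2.0], 0.5, size=(50, 2))
X_neg = rng.normal([0.0, 0.0], 0.5, size=(50, 2))
w = fisher_direction(X_pos, X_neg)
print(w / np.linalg.norm(w))  # unit-length projection direction
```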
  • 9. SVM (Support Vector Machine): among all separating hyperplanes, choose the one that maximizes the margin — the distance to the nearest training points.
  • 10. Margin. For a separating hyperplane $(w, b)$, define the margin
  $\rho(w, b) = \min_{\{x_i \mid y_i = 1\}} \frac{x_i \cdot w}{|w|} \;-\; \max_{\{x_i \mid y_i = -1\}} \frac{x_i \cdot w}{|w|}.$
  • 11. Canonical form. Scale $(w_0, b_0)$ so that the training points nearest the hyperplane satisfy $w_0 \cdot x + b_0 = \pm 1$. Then
  $\rho(w_0, b_0) = \min_{\{x_i \mid y_i=1\}} \frac{x_i \cdot w_0}{|w_0|} - \max_{\{x_i \mid y_i=-1\}} \frac{x_i \cdot w_0}{|w_0|} = \frac{1 - b_0}{|w_0|} - \frac{-1 - b_0}{|w_0|} = \frac{2}{|w_0|}.$
  • 12. Optimization problem. Maximizing the margin $2/|w_0|$ is equivalent to minimizing $w_0 \cdot w_0$ subject to
  $y_i (w_0 \cdot x_i + b) \ge 1 \quad (i = 1, \dots, l):$
  a quadratic objective under linear inequality constraints, i.e. a quadratic program, which has a unique global solution.
  • 13. Lagrangian (1). Attach a multiplier $\alpha_i \ge 0$ to each constraint $y_i (w_0 \cdot x_i + b) \ge 1$ (1), writing $\Lambda = (\alpha_1, \dots, \alpha_l)$:
  $L(w, b, \Lambda) = \frac{|w|^2}{2} - \sum_{i=1}^{l} \alpha_i \left( y_i (x_i \cdot w + b) - 1 \right).$
  The solution is a saddle point: minimize $L$ over $w, b$ and maximize it over $\Lambda$.
  • 14. Lagrangian (2). At the optimum $w = w_0$, $b = b_0$, the derivatives of $L(w, b, \Lambda)$ vanish:
  $\left.\frac{\partial L}{\partial w}\right|_{w=w_0} = w_0 - \sum_{i=1}^{l} \alpha_i y_i x_i = 0, \qquad \left.\frac{\partial L}{\partial b}\right|_{b=b_0} = -\sum_{i=1}^{l} \alpha_i y_i = 0, \qquad (2)$
  hence $w_0 = \sum_{i=1}^{l} \alpha_i y_i x_i$ and $\sum_{i=1}^{l} \alpha_i y_i = 0$. Substituting these back eliminates $w$ and $b$, leaving a function of $\Lambda$ alone:
  $L(w_0, b_0, \Lambda) = \frac{1}{2} w_0 \cdot w_0 - \sum_{i=1}^{l} \alpha_i \left[ y_i (x_i \cdot w_0 + b_0) - 1 \right] = \sum_{i=1}^{l} \alpha_i - \frac{1}{2} \sum_{i=1}^{l} \sum_{j=1}^{l} \alpha_i \alpha_j y_i y_j \, x_i \cdot x_j.$
  • 15. The SVM dual problem. Instead of solving for $w, b$ directly, maximize over $\Lambda$
  $L(w_0, b_0, \Lambda) = \sum_{i=1}^{l} \alpha_i - \frac{1}{2} \sum_{i=1}^{l} \sum_{j=1}^{l} \alpha_i \alpha_j y_i y_j \, x_i \cdot x_j \qquad (3)$
  subject to $\sum_{i=1}^{l} \alpha_i y_i = 0$ and $\alpha_i \ge 0$. Given the optimal $\Lambda$, recover $w_0$ from (2) $\left( w_0 = \sum_{i=1}^{l} \alpha_i y_i x_i \right)$. By (2), only the $x_i$ with $\alpha_i \ne 0$ contribute to $w_0$; these are the support vectors, and by the KKT complementarity condition
  $\alpha_i \left[ y_i (x_i \cdot w_0 + b_0) - 1 \right] = 0$
  they lie exactly on the margin.
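
Since the dual is just a small quadratic program, it can be handed to a generic constrained optimizer. A sketch using SciPy's SLSQP on a four-point toy problem (illustrative data; in practice dedicated QP or SMO solvers, as on the later slides, are used instead):

```python
import numpy as np
from scipy.optimize import minimize

def svm_dual(X, y):
    """Solve the hard-margin SVM dual with a generic solver (SLSQP)."""
    l = len(y)
    Q = (y[:, None] * y[None, :]) * (X @ X.T)   # Q_ij = y_i y_j x_i . x_j
    obj = lambda a: 0.5 * a @ Q @ a - a.sum()   # minimize -L(alpha)
    res = minimize(obj, np.zeros(l), method="SLSQP",
                   bounds=[(0, None)] * l,
                   constraints=[{"type": "eq", "fun": lambda a: a @ y}])
    alpha = res.x
    w = (alpha * y) @ X                         # w0 = sum_i alpha_i y_i x_i
    sv = np.argmax(alpha)                       # a support vector (alpha > 0)
    b = y[sv] - X[sv] @ w                       # from y_i (w.x_i + b) = 1
    return w, b, alpha

X = np.array([[2.0, 2.0], [1.5, 2.5], [0.0, 0.0], [0.5, -0.5]])
y = np.array([1.0, 1.0, -1.0, -1.0])
w, b, alpha = svm_dual(X, y)
print(w, b)
```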
  • 16. (Figure-only slide: panels (A) and (B); no text survived extraction.)
  • 17. Feature mapping. If the data are not linearly separable in the input space, map each $x$ to $\Phi(x)$ in a higher-dimensional feature space and run the linear SVM there. The dual objective
  $L(w_0, b_0, \Lambda) = \sum_{i=1}^{l} \alpha_i - \frac{1}{2} \sum_{i=1}^{l} \sum_{j=1}^{l} \alpha_i \alpha_j y_i y_j \, x_i \cdot x_j$
  becomes, after replacing $x$ with $\Phi(x)$,
  $L(w_0, b_0, \Lambda) = \sum_{i=1}^{l} \alpha_i - \frac{1}{2} \sum_{i=1}^{l} \sum_{j=1}^{l} \alpha_i \alpha_j y_i y_j \, \Phi(x_i) \cdot \Phi(x_j),$
  and the decision boundary is
  $\Phi(x) \cdot w_0 + b_0 = \sum_{i=1}^{l} \alpha_i y_i \, \Phi(x) \cdot \Phi(x_i) + b_0 = 0.$
  Note that $\Phi$ enters only through inner products.
  • 18. Kernel trick. Define $K(x, y) = \Phi(x) \cdot \Phi(y)$. Example: for $\Phi((x_1, x_2)) = (x_1^2, \sqrt{2}\, x_1 x_2, x_2^2, \sqrt{2}\, x_1, \sqrt{2}\, x_2, 1)$,
  $\Phi((x_1, x_2)) \cdot \Phi((y_1, y_2)) = (x_1 y_1)^2 + 2 x_1 y_1 x_2 y_2 + (x_2 y_2)^2 + 2 x_1 y_1 + 2 x_2 y_2 + 1 = (x_1 y_1 + x_2 y_2 + 1)^2 = ((x_1, x_2) \cdot (y_1, y_2) + 1)^2,$
  so the inner product in the 6-dimensional feature space is computed in the original 2-dimensional space. Common kernels: polynomial $(x \cdot y + 1)^d$, RBF $\exp(-\|x - y\|^2 / 2\sigma^2)$, sigmoid $\tanh(\kappa \, x \cdot y - \delta)$, with parameters $d$, $\sigma$, $\kappa$, $\delta$. A valid kernel must satisfy Mercer's condition.
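
The worked identity above is easy to verify numerically. A short NumPy check that the explicit 6-dimensional map and the kernel $((x \cdot y) + 1)^2$ agree:

```python
import numpy as np

def phi(x):
    """Explicit feature map for the degree-2 polynomial kernel (x.y + 1)^2."""
    x1, x2 = x
    return np.array([x1**2, np.sqrt(2) * x1 * x2, x2**2,
                     np.sqrt(2) * x1, np.sqrt(2) * x2, 1.0])

x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])
print(phi(x) @ phi(y))       # inner product in the 6-dim feature space
print((x @ y + 1.0) ** 2)    # kernel evaluated in the original 2-dim space
# Both print 4.0: K(x, y) = phi(x).phi(y) without ever forming phi.
```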
  • 19. Soft margin. When the data cannot be separated (even in feature space), introduce slack variables $\xi_i$ that allow margin violations:
  $y_i (w \cdot x_i + b) \ge 1 - \xi_i, \quad \text{where } \xi_i \ge 0 \quad (i = 1, \dots, l),$
  and minimize $\frac{1}{2} w \cdot w + C \sum_{i=1}^{l} \xi_i$, where $C$ trades margin width against the total violation.
  • 20. Soft margin (1). With multipliers $\Lambda = (\alpha_1, \dots, \alpha_l)$ for the margin constraints and $R = (r_1, \dots, r_l)$ for $\xi_i \ge 0$, the Lagrangian $L$ is
  $L(w, \xi, b, \Lambda, R) = \frac{1}{2} w \cdot w + C \sum_{i=1}^{l} \xi_i - \sum_{i=1}^{l} \alpha_i \left[ y_i (x_i \cdot w + b) - 1 + \xi_i \right] - \sum_{i=1}^{l} r_i \xi_i.$
  Setting the derivatives with respect to $w$, $b$, $\xi_i$ to $0$ at the optimum $(w_0, b_0, \xi_i^0)$ gives the KKT stationarity conditions
  $\left.\frac{\partial L}{\partial w}\right|_{w=w_0} = w_0 - \sum_{i=1}^{l} \alpha_i y_i x_i = 0, \qquad \left.\frac{\partial L}{\partial b}\right|_{b=b_0} = -\sum_{i=1}^{l} \alpha_i y_i = 0, \qquad \left.\frac{\partial L}{\partial \xi_i}\right|_{\xi = \xi_i^0} = C - \alpha_i - r_i = 0.$
  • 21. Soft margin (2). Substituting back into $L$ yields the same dual objective as the hard-margin SVM; $C$ and $\xi$ drop out:
  $L(w, \xi, b, \Lambda, R) = \sum_{i=1}^{l} \alpha_i - \frac{1}{2} \sum_{i=1}^{l} \sum_{j=1}^{l} \alpha_i \alpha_j y_i y_j \, x_i \cdot x_j.$
  The only change is the constraint on $\alpha_i$: from $C - \alpha_i - r_i = 0$ and $r_i \ge 0$ we get $0 \le \alpha_i \le C$. The soft-margin dual is therefore: maximize the objective above over $\Lambda$ subject to
  $\sum_{i=1}^{l} \alpha_i y_i = 0, \qquad 0 \le \alpha_i \le C.$
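
The effect of $C$ can be seen with any off-the-shelf soft-margin solver; a sketch using scikit-learn's SVC on synthetic overlapping clusters (the data and the chosen $C$ values are illustrative):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(1.5, 1.0, size=(50, 2)),
               rng.normal(-1.5, 1.0, size=(50, 2))])
y = np.array([1] * 50 + [-1] * 50)

# Small C tolerates margin violations (wider margin, more support vectors);
# large C penalizes them heavily, approaching the hard-margin solution.
for C in (0.01, 1.0, 100.0):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    print(f"C={C}: {clf.n_support_.sum()} support vectors")
```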
  • 22. Background: the Karush-Kuhn-Tucker (KKT) conditions. To minimize $f(x)$, $x = (x_1, x_2, \dots, x_n)$, subject to $g_i(x) \le 0$ $(i = 1, \dots, m)$, the KKT conditions are:
  $\frac{\partial f(x)}{\partial x_j} + \sum_{i=1}^{m} \lambda_i \frac{\partial g_i(x)}{\partial x_j} = 0, \quad j = 1, 2, \dots, n,$
  $\lambda_i \, g_i(x) = 0, \quad \lambda_i \ge 0, \quad g_i(x) \le 0, \quad i = 1, 2, \dots, m.$
  When $f(x)$ and the $g_i(x)$ are convex, any $(x, \lambda)$ satisfying the KKT conditions is a global minimum of $f(x)$ under the constraints.
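
Applied to the soft-margin SVM of slides 19-21 (where $\alpha_i$ and $r_i = C - \alpha_i$ play the roles of the $\lambda_i$), the complementarity conditions $\alpha_i [y_i(x_i \cdot w + b) - 1 + \xi_i] = 0$ and $r_i \xi_i = 0$ sort the training points into three regimes. This is a standard consequence, spelled out here because the SMO stopping test below checks exactly these conditions:

```latex
% Regimes implied by KKT complementarity for the soft-margin SVM.
\begin{align*}
\alpha_i = 0     &\;\Rightarrow\; \xi_i = 0,\; y_i(x_i \cdot w + b) \ge 1
                 && \text{(outside the margin)}\\
0 < \alpha_i < C &\;\Rightarrow\; \xi_i = 0,\; y_i(x_i \cdot w + b) = 1
                 && \text{(exactly on the margin)}\\
\alpha_i = C     &\;\Rightarrow\; \xi_i \ge 0,\; y_i(x_i \cdot w + b) \le 1
                 && \text{(inside the margin or misclassified)}
\end{align*}
```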
  • 23. SMO (Sequential Minimal Optimization). Solving the SVM dual numerically over all of $\Lambda = (\alpha_1, \alpha_2, \dots, \alpha_l)$ at once is expensive — with, say, 6000 training points the kernel matrix is 6000 x 6000. SMO instead repeatedly picks a working set of just two variables $(\alpha_i, \alpha_j)$, optimizes the dual objective
  $L_D = L(w, \xi, b, \Lambda, R) = \sum_{i=1}^{l} \alpha_i - \frac{1}{2} \sum_{i=1}^{l} \sum_{j=1}^{l} \alpha_i \alpha_j y_i y_j \, x_i \cdot x_j$
  over that pair analytically with the rest held fixed, and iterates. Two is the smallest possible working set: the equality constraint $\sum_i \alpha_i y_i = 0$ forbids changing a single $\alpha_i$ alone.
  • 24. The two-variable subproblem (1). Optimize $L_D$ over $\alpha_1, \alpha_2$, updating $(\alpha_1^{old}, \alpha_2^{old})$ to $(\alpha_1^{new}, \alpha_2^{new})$. Define the prediction errors and the curvature
  $E_i \equiv w^{old} \cdot x_i + b^{old} - y_i, \qquad \eta \equiv 2 K_{12} - K_{11} - K_{22}, \quad \text{where } K_{ij} = x_i \cdot x_j.$
  Because $\sum_{i=1}^{l} \alpha_i y_i = 0$, the pair is confined to the line $\gamma \equiv \alpha_1 + s \alpha_2 = \text{const}$ (with $s = y_1 y_2$). Along that line $L_D$ is a concave quadratic in $\alpha_2$, since
  $\eta = 2 K_{12} - K_{11} - K_{22} = -|x_2 - x_1|^2 \le 0,$
  so setting $L_D' = 0$ gives the unconstrained optimum
  $\alpha_2^{new} = \alpha_2^{old} - \frac{y_2 (E_1 - E_2)}{\eta}.$
  • 25. The two-variable subproblem (2). The pair $(\alpha_1, \alpha_2)$ must stay on the line $\gamma \equiv \alpha_1 + s \alpha_2 = \text{const}$ while also respecting the box constraints $0 \le \alpha_1, \alpha_2 \le C$, i.e. it moves along the segment where the line meets the box. The unconstrained optimum $\alpha_2^{new}$ is therefore clipped onto that segment, giving $\alpha_2^{clipped}$ (figures (A) and (B) illustrate the two orientations of the line).
  • 26. The two-variable subproblem (3). The endpoints of the feasible segment depend on $s$ and $\gamma$:
  if $y_1 = y_2$ $(s = 1)$: $L = \max(0, \alpha_1^{old} + \alpha_2^{old} - C)$, $H = \min(C, \alpha_1^{old} + \alpha_2^{old})$;
  if $y_1 \ne y_2$ $(s = -1)$: $L = \max(0, \alpha_2^{old} - \alpha_1^{old})$, $H = \min(C, C + \alpha_2^{old} - \alpha_1^{old})$.
  With $L \le \alpha_2 \le H$, the clipped value
  $\alpha_2^{clipped} = \begin{cases} H, & \text{if } \alpha_2^{new} \ge H \\ \alpha_2^{new}, & \text{if } L < \alpha_2^{new} < H \\ L, & \text{if } \alpha_2^{new} \le L \end{cases}$
  maximizes $L_D$ on the feasible segment.
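
These bounds and the clipping step are mechanical to implement. A small Python sketch (function names are illustrative; they are reused in the full update sketch after slide 29):

```python
def clip_bounds(alpha1, alpha2, y1, y2, C):
    """Feasible interval [L, H] for alpha2 on the constraint segment."""
    if y1 == y2:                       # s = +1: alpha1 + alpha2 is constant
        L = max(0.0, alpha1 + alpha2 - C)
        H = min(C, alpha1 + alpha2)
    else:                              # s = -1: alpha2 - alpha1 is constant
        L = max(0.0, alpha2 - alpha1)
        H = min(C, C + alpha2 - alpha1)
    return L, H

def clip(a2_new, L, H):
    """Clip the unconstrained optimum back into [L, H]."""
    return min(max(a2_new, L), H)
```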
  • 27. (Figure-only slide: four cases (A)-(D) of the feasible interval $L \le \alpha_2 \le H$.)
  • 28. (Figure-only slide: the clipping cases (A)-(D), comparing the unconstrained optimum $(\alpha_1^{new}, \alpha_2^{new})$ with the clipped solution $(\alpha_1^{new}, \alpha_2^{clipped})$.)
  • 29. The two-variable update, in full:
  1. Compute $\eta = 2 K_{12} - K_{11} - K_{22}$.
  2. If $\eta < 0$: (a) $\alpha_2^{new} = \alpha_2^{old} + \frac{y_2 (E_2 - E_1)}{\eta}$; (b) clip to obtain $\alpha_2^{clipped}$; (c) $\alpha_1^{new} = \alpha_1^{old} - s (\alpha_2^{clipped} - \alpha_2^{old})$.
  3. If $\eta = 0$: $L_D$ is linear in $\alpha_2$, so evaluate it at the endpoints $L$ and $H$, take the better one as $\alpha_2^{clipped}$, and update $\alpha_1$ as in 2(c).
  4. Update $b$ so that the new error vanishes, $E^{new} = 0$, at a support vector. Since
  $w^{new} = w^{old} + (\alpha_1^{new} - \alpha_1^{old}) y_1 x_1 + (\alpha_2^{clipped} - \alpha_2^{old}) y_2 x_2,$
  $E^{new}(x, y) = E^{old}(x, y) + y_1 (\alpha_1^{new} - \alpha_1^{old}) \, x_1 \cdot x + y_2 (\alpha_2^{clipped} - \alpha_2^{old}) \, x_2 \cdot x - b^{old} + b^{new},$
  setting $E^{new}(x, y) = 0$ gives
  $b^{new} = b^{old} - E^{old}(x, y) - y_1 (\alpha_1^{new} - \alpha_1^{old}) \, x_1 \cdot x - y_2 (\alpha_2^{clipped} - \alpha_2^{old}) \, x_2 \cdot x.$
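
Putting steps 1-4 together for a linear kernel, a sketch of one SMO update (it reuses the clip_bounds and clip helpers sketched above, recomputes $w$ and the errors from scratch rather than caching them, and for brevity skips the $\eta = 0$ endpoint case):

```python
def smo_step(i, j, alpha, X, y, b, C):
    """One analytic SMO update of the pair (alpha_i, alpha_j), linear kernel.

    alpha, y: NumPy arrays of shape (l,); X: array of shape (l, n).
    Follows steps 1-4 of the slide; real implementations also handle
    eta == 0 and maintain a cache of the errors E_k.
    """
    w = (alpha * y) @ X                       # w = sum_k alpha_k y_k x_k
    E_i = X[i] @ w + b - y[i]                 # E_k = w.x_k + b - y_k
    E_j = X[j] @ w + b - y[j]
    K = lambda p, q: float(X[p] @ X[q])
    eta = 2 * K(i, j) - K(i, i) - K(j, j)     # eta = -|x_i - x_j|^2 <= 0
    if eta >= 0:
        return alpha, b                       # degenerate pair; skip it
    a_j = alpha[j] + y[j] * (E_j - E_i) / eta          # step 2(a)
    L, H = clip_bounds(alpha[i], alpha[j], y[i], y[j], C)
    a_j = clip(a_j, L, H)                              # step 2(b)
    s = y[i] * y[j]
    a_i = alpha[i] - s * (a_j - alpha[j])              # step 2(c)
    # Step 4: pick b so the updated error vanishes at x_i.
    b = b - E_i - y[i] * (a_i - alpha[i]) * K(i, i) \
          - y[j] * (a_j - alpha[j]) * K(i, j)
    alpha = alpha.copy()
    alpha[i], alpha[j] = a_i, a_j
    return alpha, b
```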
  • 30. Choosing the pair $(\alpha_1, \alpha_2)$. For $\alpha_1$, scan for examples that violate the KKT conditions, preferring the non-bound ones ($0 < \alpha_i < C$), whose constraints should hold with equality; periodically fall back to a full pass over all examples. For $\alpha_2$, pick the example expected to give the largest step in $L_D$, i.e. the one maximizing $|E_1 - E_2|$: if $E_1$ is positive, choose the example with the smallest $E_2$, and vice versa.
  • 31. Training an SVM with SMO, overall: repeat (i) choose a pair of variables by the heuristics above (a KKT violator for the first, the largest $|E_2 - E_1|$ for the second), and (ii) solve the two-variable problem analytically, which increases $L_D$ at every step; stop when every $\alpha_i$ satisfies the KKT conditions. The examples with $\alpha \ne 0$ at convergence are the support vectors.
  • 32. Beyond binary classification:
  - Multiclass (3 or more classes): combine binary SVMs, e.g. one classifier per pair of classes (A vs. B) or one-vs-rest.
  - Regression (regression problem): predict a numeric value; one simple reduction is to discretize the range (e.g. 0-100 into bins 0-10, 10-20, ...) and classify, though support vector regression handles it directly.
  - One-Class SVM: learn from examples of a single class only (e.g. only "normal" data, such as ordinary Web pages) and flag everything else as an outlier.
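
As a concrete example of the last variant, scikit-learn ships a One-Class SVM. A minimal sketch that trains on "normal" points only and flags an obvious outlier (the data and the nu value are illustrative):

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
X_train = rng.normal(0.0, 1.0, size=(200, 2))    # "normal" data only
X_test = np.array([[0.1, -0.2], [6.0, 6.0]])     # one inlier, one outlier

# nu bounds the fraction of training points treated as outliers.
clf = OneClassSVM(kernel="rbf", nu=0.05).fit(X_train)
print(clf.predict(X_test))   # +1 = inlier, -1 = outlier; expect [ 1 -1 ]
```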
