Least Squares Support Vector Machine Classifier

This presentation discusses the Support Vector Machine classifier, with the main focus on the Least Squares Support Vector Machine classifier.

1. Least Squares Support Vector Machine. Rajkumar Singh, November 25, 2012.
2. Table of Contents: Support Vector Machine; Least Squares Support Vector Machine Classifier; Conclusion.
3. Support Vector Machines
   The SVM is a classifier derived from statistical learning theory by Vapnik and Chervonenkis. SVMs were introduced by Boser, Guyon, and Vapnik at COLT-92, were initially popularized in the NIPS community, and are now an important and active field of machine learning research.
   What is an SVM? SVMs are learning systems that use a hypothesis space of linear functions in a high-dimensional feature space (kernel functions), are trained with a learning algorithm from optimization theory (Lagrange multipliers), and implement a learning bias derived from statistical learning theory (generalization).
4. Support Vector Machines for Classification
   Given a training set of N data points \{y_k, x_k\}_{k=1}^{N}, the support vector method approach aims at constructing a classifier of the form

   y(x) = \mathrm{sign}\left[ \sum_{k=1}^{N} \alpha_k y_k \psi(x, x_k) + b \right]      (1)

   where x_k \in \mathbb{R}^n is the k-th input pattern, y_k \in \mathbb{R} is the k-th output, the \alpha_k are positive real constants, and b is a real constant. Common kernel choices are

   \psi(x, x_k) =
     \begin{cases}
       x_k^T x                              & \text{Linear SVM} \\
       (x_k^T x + 1)^d                      & \text{Polynomial SVM of degree } d \\
       \exp\{ -\|x - x_k\|_2^2 / \sigma^2 \} & \text{RBF SVM} \\
       \tanh(\kappa\, x_k^T x + \theta)     & \text{Two-layer neural SVM}
     \end{cases}

   where \sigma, \theta, \kappa are constants.
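As a small illustration of the kernel choices in (1), the NumPy sketch below implements the four kernels ψ(x, x_k) listed on this slide. The function names and the default values of d, σ, κ, θ are assumptions made for this example, not part of the original slides.

```python
import numpy as np

def linear_kernel(x, xk):
    # psi(x, x_k) = x_k^T x  (Linear SVM)
    return np.dot(xk, x)

def polynomial_kernel(x, xk, d=3):
    # psi(x, x_k) = (x_k^T x + 1)^d  (Polynomial SVM of degree d)
    return (np.dot(xk, x) + 1.0) ** d

def rbf_kernel(x, xk, sigma=1.0):
    # psi(x, x_k) = exp(-||x - x_k||_2^2 / sigma^2)  (RBF SVM)
    return np.exp(-np.sum((x - xk) ** 2) / sigma ** 2)

def mlp_kernel(x, xk, kappa=1.0, theta=0.0):
    # psi(x, x_k) = tanh(kappa * x_k^T x + theta)  (two-layer neural SVM;
    # Mercer's condition holds only for certain kappa, theta)
    return np.tanh(kappa * np.dot(xk, x) + theta)
```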
5. SVMs for Classification
   The classifier is constructed as follows. One assumes that

   \omega^T \phi(x_k) + b \ge +1, \quad \text{if } y_k = +1
   \omega^T \phi(x_k) + b \le -1, \quad \text{if } y_k = -1      (2)

   which is equivalent to

   y_k[\omega^T \phi(x_k) + b] \ge 1, \quad k = 1, \dots, N      (3)

   where \phi(\cdot) is a nonlinear function which maps the input space into a higher-dimensional space. In order to allow violations of (3), in case a separating hyperplane in this high-dimensional space does not exist, slack variables \xi_k are introduced such that

   y_k[\omega^T \phi(x_k) + b] \ge 1 - \xi_k, \quad k = 1, \dots, N
   \xi_k \ge 0, \quad k = 1, \dots, N      (4)
6. SVMs for Classification
   According to the structural risk minimization principle, the risk bound is minimized by formulating the optimization problem

   \min_{\omega, \xi_k} J_1(\omega, \xi_k) = \frac{1}{2}\omega^T \omega + c \sum_{k=1}^{N} \xi_k      (5)

   subject to (4). Therefore, one constructs the Lagrangian

   L_1(\omega, b, \xi_k; \alpha_k, v_k) = J_1(\omega, \xi_k) - \sum_{k=1}^{N} \alpha_k \{ y_k[\omega^T \phi(x_k) + b] - 1 + \xi_k \} - \sum_{k=1}^{N} v_k \xi_k      (6)

   by introducing Lagrange multipliers \alpha_k \ge 0, v_k \ge 0 (k = 1, \dots, N). The solution is given by the saddle point of the Lagrangian, computed from

   \max_{\alpha_k, v_k} \min_{\omega, b, \xi_k} L_1(\omega, b, \xi_k; \alpha_k, v_k).      (7)
7. SVMs for Classification
   From (7) one obtains

   \frac{\partial L_1}{\partial \omega} = 0 \;\rightarrow\; \omega = \sum_{k=1}^{N} \alpha_k y_k \phi(x_k)
   \frac{\partial L_1}{\partial b} = 0 \;\rightarrow\; \sum_{k=1}^{N} \alpha_k y_k = 0      (8)
   \frac{\partial L_1}{\partial \xi_k} = 0 \;\rightarrow\; 0 \le \alpha_k \le c, \quad k = 1, \dots, N

   which leads to the solution of the following quadratic programming problem

   \max_{\alpha_k} Q_1(\alpha_k; \phi(x_k)) = -\frac{1}{2}\sum_{k,l=1}^{N} y_k y_l\, \phi(x_k)^T \phi(x_l)\, \alpha_k \alpha_l + \sum_{k=1}^{N} \alpha_k      (9)

   such that \sum_{k=1}^{N} \alpha_k y_k = 0 and 0 \le \alpha_k \le c, k = 1, \dots, N. The function \phi(x_k) in (9) is then related to \psi(x, x_k) by imposing

   \phi(x)^T \phi(x_k) = \psi(x, x_k).      (10)
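For concreteness, the sketch below sets up the dual problem (9), written in terms of the kernel via (10), as a standard QP and hands it to the cvxopt solver: it minimizes (1/2) α^T P α - 1^T α with P_kl = y_k y_l ψ(x_k, x_l), subject to 0 ≤ α_k ≤ c and Σ_k α_k y_k = 0. The use of cvxopt and the helper name svm_dual_qp are assumptions made for this illustration; any QP solver would do.

```python
import numpy as np
from cvxopt import matrix, solvers

def svm_dual_qp(K, y, c):
    """Solve the SVM dual (9): max -0.5 sum y_k y_l psi(x_k,x_l) a_k a_l + sum a_k,
    s.t. sum_k a_k y_k = 0 and 0 <= a_k <= c.  K is the N x N kernel matrix psi(x_k, x_l)."""
    N = len(y)
    # cvxopt minimizes 0.5 a^T P a + q^T a, so negate the objective of (9).
    P = matrix(np.outer(y, y).astype(float) * K)      # P_kl = y_k y_l psi(x_k, x_l)
    q = matrix(-np.ones(N))                           # corresponds to -sum_k a_k
    G = matrix(np.vstack([-np.eye(N), np.eye(N)]))    # -a_k <= 0  and  a_k <= c
    h = matrix(np.hstack([np.zeros(N), c * np.ones(N)]))
    A = matrix(y.reshape(1, -1).astype(float))        # equality constraint sum_k a_k y_k = 0
    b = matrix(0.0)
    sol = solvers.qp(P, q, G, h, A, b)
    return np.ravel(sol['x'])                         # the Lagrange multipliers alpha_k
```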
8. Note that for the two-layer neural SVM, Mercer's condition only holds for certain parameter values of \kappa and \theta. The classifier (3) is designed by solving

   \max_{\alpha_k} Q_1(\alpha_k; \psi(x_k, x_l)) = -\frac{1}{2}\sum_{k,l=1}^{N} y_k y_l\, \psi(x_k, x_l)\, \alpha_k \alpha_l + \sum_{k=1}^{N} \alpha_k      (11)

   subject to the constraints in (9). One does not have to calculate \omega nor \phi(x_k) in order to determine the decision surface. The solution to (11) is global.
   Further, it can be shown that hyperplanes (3) satisfying the constraint \|\omega\|_2 \le a have a VC-dimension h which is bounded by

   h \le \min([r^2 a^2], n) + 1      (12)

   where [\cdot] denotes the integer part and r is the radius of the smallest ball containing the points \phi(x_1), \dots, \phi(x_N). Such a ball is found by defining the Lagrangian

   L_2(r, q, \lambda_k) = r^2 - \sum_{k=1}^{N} \lambda_k \left( r^2 - \|\phi(x_k) - q\|_2^2 \right)      (13)
9. SVMs for Classification
   In (13), q is the center of the ball and the \lambda_k are positive Lagrange multipliers. Here q = \sum_k \lambda_k \phi(x_k), and the dual problem follows as

   \max_{\lambda_k} Q_2(\lambda_k; \phi(x_k)) = -\sum_{k,l=1}^{N} \phi(x_k)^T \phi(x_l)\, \lambda_k \lambda_l + \sum_{k=1}^{N} \lambda_k\, \phi(x_k)^T \phi(x_k)      (14)

   such that \sum_{k=1}^{N} \lambda_k = 1 and \lambda_k \ge 0, k = 1, \dots, N. Based on (10), Q_2 can also be expressed in terms of \psi(x_k, x_l). Finally, one selects a support vector machine with small VC-dimension by solving (11) and computing the bound (12) using (14).
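The enclosing-ball problem (13)-(14) is itself a small QP, so the r^2 needed in the bound (12) can be computed the same way as the dual above. The sketch below expresses (14) in kernel form and solves it with cvxopt; the function name and the solver choice are assumptions made here, not something prescribed by the slides.

```python
import numpy as np
from cvxopt import matrix, solvers

def smallest_ball_radius_sq(K):
    """Solve (14) in kernel form: max -sum_{k,l} K_kl l_k l_l + sum_k l_k K_kk,
    s.t. sum_k l_k = 1 and l_k >= 0.  Returns r^2 for use in the VC bound (12)."""
    N = K.shape[0]
    # cvxopt minimizes 0.5 l^T P l + q^T l, so P = 2K and q = -diag(K).
    P = matrix(2.0 * K)
    q = matrix(-np.diag(K).astype(float))
    G = matrix(-np.eye(N))                 # lambda_k >= 0
    h = matrix(np.zeros(N))
    A = matrix(np.ones((1, N)))            # sum_k lambda_k = 1
    b = matrix(1.0)
    lam = np.ravel(solvers.qp(P, q, G, h, A, b)['x'])
    # r^2 = max_k ||phi(x_k) - q||^2 with q = sum_l lambda_l phi(x_l), in kernel terms.
    center_sq = lam @ K @ lam
    dist_sq = np.diag(K) - 2.0 * (K @ lam) + center_sq
    return float(dist_sq.max())
```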
10. Least Squares Support Vector Machines
    The least squares version of the SVM classifier is obtained by formulating the classification problem as

    \min_{\omega, b, e} J_3(\omega, b, e) = \frac{1}{2}\omega^T \omega + \gamma\, \frac{1}{2} \sum_{k=1}^{N} e_k^2      (15)

    subject to the equality constraints

    y_k[\omega^T \phi(x_k) + b] = 1 - e_k, \quad k = 1, \dots, N.      (16)

    The Lagrangian is defined as

    L_3(\omega, b, e; \alpha) = J_3(\omega, b, e) - \sum_{k=1}^{N} \alpha_k \{ y_k[\omega^T \phi(x_k) + b] - 1 + e_k \}      (17)

    where the \alpha_k are Lagrange multipliers. The conditions for optimality are

    \frac{\partial L_3}{\partial \omega} = 0 \;\rightarrow\; \omega = \sum_{k=1}^{N} \alpha_k y_k \phi(x_k)
    \frac{\partial L_3}{\partial b} = 0 \;\rightarrow\; \sum_{k=1}^{N} \alpha_k y_k = 0      (18)
11. Least Squares Support Vector Machines
    The remaining optimality conditions of (18) are

    \frac{\partial L_3}{\partial e_k} = 0 \;\rightarrow\; \alpha_k = \gamma e_k, \quad k = 1, \dots, N
    \frac{\partial L_3}{\partial \alpha_k} = 0 \;\rightarrow\; y_k[\omega^T \phi(x_k) + b] - 1 + e_k = 0, \quad k = 1, \dots, N.

    These conditions can be written as the solution of the following set of linear equations:

    \begin{bmatrix}
      I & 0 & 0        & -Z^T \\
      0 & 0 & 0        & -Y^T \\
      0 & 0 & \gamma I & -I   \\
      Z & Y & I        & 0
    \end{bmatrix}
    \begin{bmatrix} \omega \\ b \\ e \\ \alpha \end{bmatrix}
    =
    \begin{bmatrix} 0 \\ 0 \\ 0 \\ \vec{1} \end{bmatrix}      (19)

    where Z = [\phi(x_1)^T y_1; \dots; \phi(x_N)^T y_N], \; Y = [y_1; \dots; y_N], \; \vec{1} = [1; \dots; 1], \; e = [e_1; \dots; e_N], \; \alpha = [\alpha_1; \dots; \alpha_N].
12. Least Squares Support Vector Machines
    The solution is given by

    \begin{bmatrix} 0 & -Y^T \\ Y & ZZ^T + \gamma^{-1} I \end{bmatrix}
    \begin{bmatrix} b \\ \alpha \end{bmatrix}
    =
    \begin{bmatrix} 0 \\ \vec{1} \end{bmatrix}      (20)

    Mercer's condition can be applied again to the matrix \Omega = ZZ^T, where

    \Omega_{kl} = y_k y_l\, \phi(x_k)^T \phi(x_l) = y_k y_l\, \psi(x_k, x_l).      (21)

    Hence the classifier (1) is found by solving the linear system (20)-(21) instead of a quadratic programming problem. The parameters of the kernel, such as \sigma for the RBF kernel, can be optimally chosen according to (12). The support values \alpha_k are proportional to the errors at the data points by (18), while in the standard SVM problem (9) most values are equal to zero. Hence one could rather speak of a support value spectrum in the least squares case.
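Because (20) is just an (N+1) x (N+1) linear system, LS-SVM training amounts to a single linear solve. The sketch below builds Ω from (21) with an RBF kernel, solves (20) with NumPy, and evaluates the classifier (1). The function names, the RBF kernel choice, and the default values of γ and σ are assumptions made for this example.

```python
import numpy as np

def rbf(X1, X2, sigma=1.0):
    # Pairwise RBF kernel: psi(x, z) = exp(-||x - z||_2^2 / sigma^2)
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / sigma ** 2)

def lssvm_train(X, y, gamma=1.0, sigma=1.0):
    """Solve the linear system (20): [[0, -Y^T], [Y, Omega + gamma^-1 I]] [b; alpha] = [0; 1]."""
    N = len(y)
    Omega = np.outer(y, y) * rbf(X, X, sigma)     # Omega_kl = y_k y_l psi(x_k, x_l), eq. (21)
    A = np.zeros((N + 1, N + 1))
    A[0, 1:] = -y                                 # first row follows the sign convention of (20);
    A[1:, 0] = y                                  # flipping it gives an equivalent symmetric system
    A[1:, 1:] = Omega + np.eye(N) / gamma
    rhs = np.hstack([0.0, np.ones(N)])
    sol = np.linalg.solve(A, rhs)
    return sol[1:], sol[0]                        # alpha, b

def lssvm_predict(X_new, X, y, alpha, b, sigma=1.0):
    # Classifier (1): y(x) = sign(sum_k alpha_k y_k psi(x, x_k) + b)
    return np.sign(rbf(X_new, X, sigma) @ (alpha * y) + b)
```

On a small separable toy set, lssvm_predict(X, X, y, alpha, b) should typically reproduce the training labels for reasonable choices of γ and σ.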
13. Conclusion
    Due to the equality constraints, a set of linear equations has to be solved instead of a quadratic programming problem. Mercer's condition is applied as in other SVMs. A least squares SVM with RBF kernel is readily found, with excellent generalization performance and low computational cost.
    References
    1. J.A.K. Suykens and J. Vandewalle, "Least Squares Support Vector Machine Classifiers," Neural Processing Letters, vol. 9, no. 3, pp. 293-300, 1999.
