Support vector machine
    Presentation Transcript

    • Musa Al hawamdah 128129001011 1
    • Topics:
      Linear Support Vector Machines
        - Lagrangian (primal) formulation
        - Dual formulation
        - Discussion
      Linearly non-separable case
        - Soft-margin classification
      Nonlinear Support Vector Machines
        - Nonlinear basis functions
        - The Kernel trick
        - Mercer’s condition
        - Popular kernels 2
    • Support Vector Machine (SVM):  Basic idea: - The SVM tries to find a classifier which maximizes the margin between positive and negative data points. - Up to now: consider linear classifiers. • Formulation as a convex optimization problem: find the hyperplane satisfying the margin constraints sketched below. 3
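      A standard way to write this convex problem, assuming weight vector w, bias b, and training pairs (x_n, y_n) with y_n ∈ {−1, +1} (common SVM notation rather than symbols taken from the slide):

      \[
      \min_{\mathbf{w},\,b}\ \tfrac{1}{2}\lVert\mathbf{w}\rVert^2
      \quad\text{subject to}\quad
      y_n\,(\mathbf{w}^\top \mathbf{x}_n + b) \ge 1,\qquad n = 1,\dots,N
      \]

      The constraints force every training point onto the correct side of the hyperplane w^T x + b = 0, and minimizing ||w||^2 maximizes the resulting margin 2/||w||.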
    • SVM – Primal Formulation:  • Lagrangian primal form (sketched below): • The solution of L_p needs to fulfill the KKT conditions - necessary and sufficient conditions here, since the problem is convex. 4
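      A sketch of the Lagrangian primal form meant here, with one multiplier a_n ≥ 0 per constraint (standard notation, assumed rather than copied from the slide):

      \[
      L_p(\mathbf{w}, b, \mathbf{a}) = \tfrac{1}{2}\lVert\mathbf{w}\rVert^2
      - \sum_{n=1}^{N} a_n \left[ y_n(\mathbf{w}^\top \mathbf{x}_n + b) - 1 \right]
      \]

      The KKT conditions then consist of stationarity (\(\partial L_p/\partial \mathbf{w} = 0\), \(\partial L_p/\partial b = 0\)), primal feasibility \(y_n(\mathbf{w}^\top \mathbf{x}_n + b) - 1 \ge 0\), dual feasibility \(a_n \ge 0\), and complementary slackness \(a_n\left[y_n(\mathbf{w}^\top \mathbf{x}_n + b) - 1\right] = 0\).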
    • SVM – Solution:  Solution for the hyperplane: ** computed as a linear combination of the training examples; ** sparse solution: a_n > 0 only for some points, the support vectors ⇒ only the SVs actually influence the decision boundary! ** compute b by averaging over all support vectors (formulas below). 5
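      The formulas summarized on this slide, in the usual textbook notation (S denotes the index set of support vectors; this is the standard result, not a verbatim copy of the slide):

      \[
      \mathbf{w} = \sum_{n=1}^{N} a_n y_n \mathbf{x}_n,
      \qquad
      b = \frac{1}{|S|} \sum_{n \in S} \Big( y_n - \sum_{m \in S} a_m y_m\, \mathbf{x}_m^\top \mathbf{x}_n \Big)
      \]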
    • SVM – Support Vectors:  The training points for which a_n > 0 are called “support vectors”.  Graphical interpretation: ** The support vectors are the points on the margin. ** They define the margin and thus the hyperplane. ⇒ All other data points can be discarded! 6
    • SVM – Discussion (Part 1):  Linear SVM:  Linear classifier.  Approximate implementation of the SRM principle.  In case of separable data, the SVM produces an empirical risk of zero with minimal value of the VC confidence (i.e. a classifier minimizing the upper bound on the actual risk).  SVMs thus have a “guaranteed” generalization capability.  Formulation as a convex optimization problem ⇒ globally optimal solution!  Primal form formulation:  Solving a quadratic programming problem in M variables costs O(M^3) in general.  Here: D variables ⇒ O(D^3).  Problem: scaling with high-dimensional data (“curse of dimensionality”) 7
    • SVM – Dual Formulation: 8
    • SVM – Dual Formulation: 9
    • SVM – Dual Formulation: 10
    • SVM – Dual Formulation: 11
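      These slides carry the derivation of the dual. Substituting w = Σ_n a_n y_n x_n and Σ_n a_n y_n = 0 back into L_p gives the standard Wolfe dual, sketched here under the same notation assumptions as above:

      \[
      \max_{\mathbf{a}}\ L_d(\mathbf{a}) = \sum_{n=1}^{N} a_n
      - \tfrac{1}{2} \sum_{n=1}^{N} \sum_{m=1}^{N} a_n a_m y_n y_m\, \mathbf{x}_n^\top \mathbf{x}_m
      \quad\text{subject to}\quad a_n \ge 0,\ \ \sum_{n=1}^{N} a_n y_n = 0
      \]

      The data enter only through inner products x_n^T x_m, which is what later makes the kernel trick possible.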
    • SVM – Discussion (Part 2):  Dual form formulation:  In going to the dual, we now have a problem in N variables (a_n).  Isn’t this worse? We now penalize large training sets!  However… 12
    • SVM – Support Vectors:  Only looked at linearly separable case…  Current problem formulation has no solution if the data are not linearly separable!  Need to introduce some tolerance to outlier data points. • 13
    • SVM – Non-Separable Data: 14
    • SVM – Soft-Margin Classification: 15
    • SVM – Non-Separable Data: 16
    • SVM – New Primal Formulation: 17
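      The soft-margin (new primal) formulation introduces a slack variable ξ_n per point and a trade-off constant C; a standard statement of it, with notation assumed rather than taken from the slides:

      \[
      \min_{\mathbf{w},\,b,\,\boldsymbol{\xi}}\ \tfrac{1}{2}\lVert\mathbf{w}\rVert^2 + C \sum_{n=1}^{N} \xi_n
      \quad\text{subject to}\quad
      y_n(\mathbf{w}^\top \mathbf{x}_n + b) \ge 1 - \xi_n,\quad \xi_n \ge 0
      \]

      Points with 0 < ξ_n ≤ 1 lie inside the margin but on the correct side; points with ξ_n > 1 are misclassified. A larger C penalizes violations more strongly.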
    • SVM – New Dual Formulation 18
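      The corresponding dual keeps exactly the same objective as before; only the constraint on the multipliers changes to a box constraint (standard result, sketched here):

      \[
      \max_{\mathbf{a}}\ \sum_{n=1}^{N} a_n
      - \tfrac{1}{2} \sum_{n=1}^{N} \sum_{m=1}^{N} a_n a_m y_n y_m\, \mathbf{x}_n^\top \mathbf{x}_m
      \quad\text{subject to}\quad 0 \le a_n \le C,\ \ \sum_{n=1}^{N} a_n y_n = 0
      \]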
    • SVM – New Solution: 19
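      In the soft-margin case w is still Σ_n a_n y_n x_n, but the support vectors split into margin SVs with 0 < a_n < C (exactly on the margin) and bound SVs with a_n = C (inside the margin or misclassified). A common way to compute b, assumed here, averages only over the margin SVs M = {n : 0 < a_n < C}:

      \[
      b = \frac{1}{|M|} \sum_{n \in M} \Big( y_n - \sum_{m} a_m y_m\, \mathbf{x}_m^\top \mathbf{x}_n \Big)
      \]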
    • Nonlinear SVM: 20
    • Another Example:  Non-separable by a hyperplane in 2D: 21
    • Separable by a surface in 3D 22
    • Nonlinear SVM – Feature Spaces:  General idea: the original input space can be mapped to some higher-dimensional feature space where the training set is separable (a classic example map is sketched below): 23
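      A classic example of such a mapping, assumed here for illustration and in the spirit of the 2D/3D example above:

      \[
      \phi(x_1, x_2) = \big(x_1^2,\ \sqrt{2}\,x_1 x_2,\ x_2^2\big)
      \]

      Under this map a circle x_1^2 + x_2^2 = r^2 in the input space becomes the plane z_1 + z_3 = r^2 in feature space, so data separable only by a circle in 2D become linearly separable in 3D.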
    • Nonlinear SVM: 24
    • What Could This Look Like? 25
    • Problem with High-dim. Basis Functions 26
    • Solution: The Kernel Trick: 27
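      The kernel trick rests on the observation that the dual and the decision function use the data only through inner products. Defining a kernel k(x, x') = φ(x)^T φ(x'), every inner product can be replaced by a kernel evaluation and φ never has to be computed explicitly (standard form, notation assumed as above):

      \[
      L_d(\mathbf{a}) = \sum_{n=1}^{N} a_n
      - \tfrac{1}{2} \sum_{n=1}^{N} \sum_{m=1}^{N} a_n a_m y_n y_m\, k(\mathbf{x}_n, \mathbf{x}_m)
      \]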
    • Back to Our Previous Example… 28
    • SVMs with Kernels: 29
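      With a kernel, the resulting classifier takes the standard form (S again denoting the set of support vectors):

      \[
      y(\mathbf{x}) = \operatorname{sign}\Big( \sum_{n \in S} a_n y_n\, k(\mathbf{x}_n, \mathbf{x}) + b \Big)
      \]

      Classifying a new point therefore costs one kernel evaluation per support vector, independent of the dimensionality of the feature space.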
    • Which Functions are Valid Kernels?  Mercer’s theorem (modernized version):  Every positive definite symmetric function is a kernel.  Positive definite symmetric functions correspond to a positive definite symmetric Gram matrix, as spelled out below: 30
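      The Gram matrix condition referred to here: for any finite set of points x_1, …, x_N, the matrix K with entries K_{nm} = k(x_n, x_m) must be symmetric and positive (semi-)definite,

      \[
      \mathbf{c}^\top K\, \mathbf{c} = \sum_{n,m} c_n c_m\, k(\mathbf{x}_n, \mathbf{x}_m) \ge 0
      \quad\text{for all } \mathbf{c} \in \mathbb{R}^N.
      \]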
    • Kernels Fulfilling Mercer’s Condition: 31
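      Popular kernels satisfying Mercer's condition include (parameter names are the usual ones, assumed here):

      \[
      k(\mathbf{x}, \mathbf{x}') = \mathbf{x}^\top \mathbf{x}' \ \text{(linear)},\qquad
      k(\mathbf{x}, \mathbf{x}') = (\mathbf{x}^\top \mathbf{x}' + c)^p \ \text{(polynomial)},\qquad
      k(\mathbf{x}, \mathbf{x}') = \exp\!\Big(-\frac{\lVert \mathbf{x} - \mathbf{x}' \rVert^2}{2\sigma^2}\Big) \ \text{(Gaussian / RBF)}
      \]

      The sigmoid kernel tanh(κ x^T x' + θ) is also widely used, but it satisfies Mercer's condition only for some parameter choices.

      As a quick usage sketch (not part of the slides), scikit-learn's SVC exposes exactly these kernels; the toy dataset and parameter values below are arbitrary illustrative choices:

      # Illustrative example (assumed): soft-margin kernel SVMs on a toy nonlinear dataset.
      from sklearn.datasets import make_moons
      from sklearn.svm import SVC

      X, y = make_moons(n_samples=200, noise=0.15, random_state=0)  # not linearly separable in 2D

      linear_svm = SVC(kernel="linear", C=1.0).fit(X, y)                   # linear kernel
      poly_svm = SVC(kernel="poly", degree=3, coef0=1, C=1.0).fit(X, y)    # polynomial kernel
      rbf_svm = SVC(kernel="rbf", gamma=0.5, C=1.0).fit(X, y)              # Gaussian / RBF kernel

      # The RBF kernel typically separates this data well; the linear kernel cannot.
      print(linear_svm.score(X, y), poly_svm.score(X, y), rbf_svm.score(X, y))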
    • THANK YOU 32