Data Selection For Support Vector Machine Classifier - Presentation Transcript
Glenn Fung and Olvi L. Mangasarian August 2000 20081021 Kuan-Chi-I
Outline
Introduction
SVM
MSVM
Comparisons
Conclusion
Introduction
A method for selecting a small set of support vectors that determines a separating-plane classifier.
Useful for applications containing millions of data points.
SVM
A method for classification.
SVM (Linearly Separable Case)
SVM
Finding the maximum margin is equivalent to minimizing ½‖w‖².
We can transform the above problem into a quadratic program with parameter ν > 0.
A : a real m×n matrix.
e : a column vector of ones of arbitrary dimension.
e′ : the transpose of e.
y : a vector of nonnegative slack variables.
D : an m×m diagonal matrix with +1 or −1 on the diagonal, giving the class of each point.
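A reconstruction of this quadratic program from the symbols just defined, assuming the standard soft-margin form with ν as the positive weight on the slack term and γ as the plane offset, might read as follows. The tag (1) is inferred from the later reference to (1).

```latex
\min_{w,\gamma,y} \; \nu\, e'y + \tfrac{1}{2}\, w'w
\quad \text{s.t.} \quad D(Aw - e\gamma) + y \ge e, \qquad y \ge 0. \tag{1}
```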
SVM
Written in individual component notation.
A_i : the i-th row vector of matrix A.
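In component form the constraints presumably read A_i w + y_i ≥ γ + 1 for D_ii = +1 and A_i w − y_i ≤ γ − 1 for D_ii = −1. A minimal sketch, assuming NumPy and a hypothetical helper name `slack`, that computes the smallest slack satisfying these constraints for a given plane:

```python
import numpy as np

def slack(A, d, w, gamma):
    """Smallest y >= 0 with D(Aw - e*gamma) + y >= e, computed row by row.

    A: (m, n) data matrix, d: (m,) labels in {+1, -1} (the diagonal of D).
    """
    signed = d * (A @ w - gamma)        # D_ii * (A_i w - gamma) for each row i
    return np.maximum(0.0, 1.0 - signed)

# Illustrative data (not from the slides): three points on a line.
A = np.array([[2.0, 0.0], [0.5, 0.0], [-2.0, 0.0]])
d = np.array([1.0, 1.0, -1.0])
w = np.array([1.0, 0.0])
gamma = 0.0
y = slack(A, d, w, gamma)               # only the second point needs slack
```

A point has zero slack exactly when it lies on or beyond its own bounding plane.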
SVM
x′w = γ + 1 bounds the class A+ points.
x′w = γ − 1 bounds the class A− points.
γ : determines the location of the planes relative to the origin.
w : normal to the bounding planes.
The linear separating surface is the plane x′w = γ, midway between the two bounding planes.
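As an illustration of how the plane is used, the sketch below classifies points by the sign of x′w − γ and computes the distance 2/‖w‖₂ between the two bounding planes. The function names `classify` and `margin` and the sample values are hypothetical.

```python
import numpy as np

def classify(X, w, gamma):
    """Return +1 for points with x'w > gamma and -1 otherwise."""
    return np.where(X @ w - gamma > 0, 1, -1)

def margin(w):
    """Distance between the planes x'w = gamma + 1 and x'w = gamma - 1."""
    return 2.0 / np.linalg.norm(w)

# Illustrative plane and points (not from the slides).
w = np.array([3.0, 4.0])
gamma = 1.0
X = np.array([[1.0, 1.0], [0.0, 0.0]])
labels = classify(X, w, gamma)
width = margin(w)
```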
SVM (Linearly Inseparable Case)
SVM (Inseparable)
If the classes are inseparable, then the two planes bound the two classes with a "soft margin".
MSVM (1-Norm SVM)
A minimal support vector machine (MSVM).
In order to make use of a faster linear-programming-based approach, we reformulate (1) by replacing the 2-norm with a 1-norm as follows:
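A plausible form of the 1-norm problem, assuming the constraints of the quadratic program are kept unchanged and only the ½w′w term is swapped for ‖w‖₁ (the tag (7) is inferred from the next slide's reference):

```latex
\min_{w,\gamma,y} \; \nu\, e'y + \|w\|_1
\quad \text{s.t.} \quad D(Aw - e\gamma) + y \ge e, \qquad y \ge 0. \tag{7}
```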
MSVM
The mathematical program (7) is easily converted to a linear program as follows:
υ : a vector bounding the absolute value |w| of w componentwise, υ_i ≥ |w_i|.
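A minimal sketch of this linear program, assuming it has the form min ν e′y + e′v subject to D(Aw − eγ) + y ≥ e, −v ≤ w ≤ v, y ≥ 0, where v plays the role of the bound on |w|. It uses SciPy's `linprog` as a stand-in solver, which is a choice made here and not the authors' implementation. The function name `one_norm_svm` and the toy data are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

def one_norm_svm(A, d, nu=1.0):
    """Solve the assumed 1-norm SVM LP; variables are stacked as [w, gamma, y, v]."""
    m, n = A.shape
    # Objective: nu * e'y + e'v (w and gamma carry no cost).
    c = np.concatenate([np.zeros(n + 1), nu * np.ones(m), np.ones(n)])
    DA = d[:, None] * A
    # D(Aw - e*gamma) + y >= e, rewritten as -DAw + d*gamma - y <= -e.
    margin_rows = np.hstack([-DA, d[:, None], -np.eye(m), np.zeros((m, n))])
    # w - v <= 0 and -w - v <= 0 together encode -v <= w <= v.
    upper_rows = np.hstack([np.eye(n), np.zeros((n, 1 + m)), -np.eye(n)])
    lower_rows = np.hstack([-np.eye(n), np.zeros((n, 1 + m)), -np.eye(n)])
    A_ub = np.vstack([margin_rows, upper_rows, lower_rows])
    b_ub = np.concatenate([-np.ones(m), np.zeros(2 * n)])
    # w and gamma are free; y and v are nonnegative.
    bounds = [(None, None)] * (n + 1) + [(0, None)] * (m + n)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:n], res.x[n], res.x[n + 1:n + 1 + m]

# Illustrative separable data (not from the slides).
A = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -3.0]])
d = np.array([1.0, 1.0, -1.0, -1.0])
w, gamma, y = one_norm_svm(A, d, nu=1.0)
```

Because every term is linear, any LP solver can be substituted for `linprog` without changing the formulation.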
MSVM
If we define nonnegative multipliers u ∈ R^m associated with the first set of constraints of the linear program (8), and multipliers (r, s) ∈ R^(n+n) for the second set of constraints of (8), then the dual linear program associated with the linear SVM formulation (8) is the following:
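One way to write this dual, derived here from the Lagrangian under the assumption that (8) is min ν e′y + e′v subject to D(Aw − eγ) + y ≥ e, −v ≤ w ≤ v, y ≥ 0, with r attached to v − w ≥ 0 and s attached to v + w ≥ 0. The original slide may arrange the signs differently.

```latex
\max_{u,r,s} \; e'u
\quad \text{s.t.} \quad
\begin{aligned}
A'Du - r + s &= 0, \\
e'Du &= 0, \\
r + s &= e, \\
0 \le u &\le \nu e, \qquad r,\, s \ge 0.
\end{aligned}
```

Each equality comes from setting the derivative of the Lagrangian to zero with respect to w, γ, and v in turn, and the bound u ≤ νe comes from y ≥ 0.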
MSVM
We modify the linear program to generate an SVM with as few support vectors as possible by adding an error term e′y*.
The term e′y* suppresses misclassified points and results in our minimal support vector machine (MSVM):
y* : the step vector in R^m with components (y*)_i = 1 if y_i > 0 and 0 otherwise.
μ : a positive parameter, chosen using a tuning set.
MSVM
We approximate e′y* here by a smooth concave exponential on the nonnegative real line, as was done in earlier work on feature selection. For y ≥ 0, the step vector y* of (9) is approximated componentwise by the concave exponential (y*)_i ≈ 1 − ε^(−α y_i), i = 1, ..., m, where ε is the base of natural logarithms and α > 0.
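To see how such a smoothing behaves, the sketch below compares the exact count e′y* with a concave exponential surrogate, assuming the approximation takes the form 1 − exp(−α y_i). The steepness parameter `alpha`, the function names, and the sample slack values are hypothetical.

```python
import numpy as np

def step(y):
    """Exact step vector: 1 where y_i > 0, else 0."""
    return (y > 0).astype(float)

def smooth_step(y, alpha=5.0):
    """Concave exponential surrogate 1 - exp(-alpha * y_i), defined for y >= 0."""
    return 1.0 - np.exp(-alpha * y)

# Illustrative slack values (not from the slides).
y = np.array([0.0, 0.1, 1.0, 3.0])
exact = step(y).sum()                        # e'y*: number of positive slacks
approx = smooth_step(y, alpha=5.0).sum()     # smooth underestimate of the count
sharp = smooth_step(y, alpha=100.0).sum()    # larger alpha tracks the count more closely
```

The surrogate is zero at y_i = 0, never exceeds the step value, and approaches it as alpha grows, which is what makes it usable inside a concave minimization.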
MSVM
The smooth MSVM:
MSVM (SLA)
Comparison
Observations of Comparisons
1. For all test problems, MSVM had the fewest support vectors.
2. For the Ionosphere problem, the reduction in the number of support vectors of MSVM over SVM‖·‖₁ is 81%, and the average reduction in the number of support vectors of MSVM over SVM‖·‖₁ is 65.8%.
3. Tenfold testing set correctness of MSVM was good.
4. Computing times were higher for MSVM than for other classifiers.
Conclusion
We proposed a minimal support vector machine.
Useful in classifying very large datasets by using only a fraction of the data.
Improves generalization over other classifiers that use a higher number of data points.
MSVM requires the solution of a few linear programs to determine a separating surface.