4BA10/CS7008 Tutorial – SVM
Darren Caulfield
2 March 2009

Support vector machines
         http://en.wikipedia.org/wiki/Support_vector_machine
A support vector machine (SVM) is a type of classifier that became popular in the
early 1990s. A classifier takes a feature vector (a vector of numbers) and assigns a
class (a label) to the vector. The number of elements in the feature vector corresponds
to its dimensionality. When a classifier is “trained” to learn the class associated with
different feature vectors (as with SVMs), we have supervised classification.

Maximum-margin hyperplane

[Figure: two classes of data points separated by the maximum-margin hyperplane,
with the support vectors circled.]

During the training stage, SVMs find the maximum-margin hyperplane between two
classes. This is the line (in two dimensions), plane (in three dimensions) or
hyperplane (in higher dimensions) that maximises the distance to the nearest data
point of either class. Such hyperplanes generally lead to classifiers with good
generalisation ability.
They are less likely to overfit the training data, i.e. the classifier should do
approximately as well, in terms of classification accuracy, with unseen data (the “test
set”) as it does with the “training set”. Cross-validation is another technique used to
reduce the chances of overfitting.

The vectors (data points) that are closest to the hyperplane (circled in the above
image) are called the support vectors. The other points do not influence the position
of this decision boundary.
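
For reference (the optimisation behind this is not stated in the handout): writing
the training vectors as x_i with labels y_i in {-1, +1}, the separable, hard-margin
problem is

       \min_{w,b} \tfrac{1}{2}\|w\|^2 \quad \text{subject to} \quad
       y_i (w \cdot x_i + b) \ge 1 \ \text{for all } i.

The margin equals 2/\|w\|, so minimising \|w\| maximises the margin, and the support
vectors are exactly the points for which the constraint holds with equality.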

Kernel trick
It is unlikely that a dataset can be well separated by a simple line, plane or hyperplane
in its original feature space. (That would be an example of a linear classifier.) Instead,
the SVM implicitly maps the data into a higher-dimensional feature space and finds
the maximum-margin hyperplane in that space. This is called the “kernel trick”. It
only requires the specification of a function – the kernel – that returns the inner
product (a measure of similarity) between any two points in that higher-dimensional
space, without ever computing their coordinates in that space explicitly.
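
A standard worked example (not from the handout) of how a kernel implies a mapping:
for two-dimensional vectors, the polynomial kernel K(x,y) = (x \cdot y)^2 expands as

       (x_1 y_1 + x_2 y_2)^2 = x_1^2 y_1^2 + 2 x_1 x_2 y_1 y_2 + x_2^2 y_2^2
                             = \varphi(x) \cdot \varphi(y), \qquad
       \varphi(x) = (x_1^2,\ \sqrt{2}\, x_1 x_2,\ x_2^2),

i.e. an inner product in a three-dimensional space whose coordinates are never
computed explicitly.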

The most popular kernels are listed below, with the parameter names used by both
LIBSVM and OpenCV. Custom kernels can, however, significantly improve classification
accuracy for particular problems; for example, a string kernel could be defined for
DNA sequences.

Linear: no mapping is done; linear discrimination (or regression) is done in the
       original feature space. It is the fastest option.
       d(x,y) = x•y
Poly: polynomial kernel:
       d(x,y) = (gamma*(x•y) + coef0)^degree
RBF: radial-basis-function kernel; a good choice in most cases:
       d(x,y) = exp(-gamma*|x-y|^2)
Sigmoid: the sigmoid function is used as a kernel:
       d(x,y) = tanh(gamma*(x•y) + coef0)
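
As an illustration (this code is not part of the handout), the RBF kernel above can
be computed directly; a minimal C++ sketch, assuming dense feature vectors of equal
length:

       #include <cmath>
       #include <vector>

       // RBF (Gaussian) kernel: exp(-gamma * |x - y|^2).
       // gamma corresponds to LIBSVM's -g option.
       double rbf_kernel(const std::vector<double>& x,
                         const std::vector<double>& y,
                         double gamma)
       {
           double sq_dist = 0.0;
           for (std::size_t i = 0; i < x.size(); ++i) {
               const double d = x[i] - y[i];
               sq_dist += d * d;
           }
           return std::exp(-gamma * sq_dist);
       }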


Soft margin SVM
Even with the kernel trick, some datasets are not perfectly separable, either because
the features do not discriminate between the classes well enough or because some
data points have been mis-labelled. “Soft margin” SVMs find hyperplanes that split
the data as cleanly as possible, while allowing some examples to remain on the wrong
side of the hyperplane.
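
For reference (again not stated in the handout), the soft-margin problem introduces
a slack variable \xi_i measuring how far example i lies on the wrong side of the
margin:

       \min_{w,b,\xi} \tfrac{1}{2}\|w\|^2 + C \sum_i \xi_i \quad \text{subject to}
       \quad y_i (w \cdot x_i + b) \ge 1 - \xi_i, \ \ \xi_i \ge 0.

The constant C (LIBSVM's -c cost option, below) sets the trade-off: a large C
penalises training errors heavily, while a small C accepts more misclassified points
in exchange for a wider margin.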

OpenCV implementation
The Machine Learning library in OpenCV 1.0 implements several types of classifier,
including SVMs. However, very little SVM sample code is available to date. The
documentation can be found here:
       http://opencvlibrary.svn.sourceforge.net/viewvc/opencvlibrary/trunk/opencv/doc/ref/opencvref_ml.htm
The functionality closely mirrors that of the more mature LIBSVM (see below).
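
Since sample code is scarce, here is a minimal sketch of training and prediction with
OpenCV's CvSVM class, written against the ML reference above (the data and parameter
values are invented for illustration; check the calls against your OpenCV version):

       #include "ml.h"   // OpenCV Machine Learning library (includes cxcore.h)

       int main()
       {
           // Four 2-D training points with class labels +1 / -1 (toy data).
           float train_data[] = { 1.f, 1.f,  2.f, 2.f,  -1.f, -1.f,  -2.f, -2.f };
           float labels[]     = { 1.f, 1.f, -1.f, -1.f };
           CvMat train_mat = cvMat(4, 2, CV_32FC1, train_data);
           CvMat label_mat = cvMat(4, 1, CV_32FC1, labels);

           // C-SVC with an RBF kernel; gamma and C are arbitrary here and
           // should really be tuned, e.g. by cross-validation.
           CvSVMParams params;
           params.svm_type    = CvSVM::C_SVC;
           params.kernel_type = CvSVM::RBF;
           params.gamma       = 0.5;
           params.C           = 10;
           params.term_crit   = cvTermCriteria(CV_TERMCRIT_ITER, 100, 1e-6);

           CvSVM svm;
           svm.train(&train_mat, &label_mat, 0, 0, params);

           // Classify a new point; predict() returns the class label.
           float query[] = { 1.5f, 1.5f };
           CvMat query_mat = cvMat(1, 2, CV_32FC1, query);
           float predicted = svm.predict(&query_mat);

           return (int)predicted;   // expected: 1
       }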

Other classifiers to be found in OpenCV include: Bayes Classifier, k Nearest
Neighbours, Decision Trees, Boosting, Random Trees, Expectation-Maximization and
Neural Networks.

Evaluation
Classifiers often have their accuracy evaluated in terms of true positives and false
positives for a given threshold:

       true positive rate = TP / (TP + FN)
       false positive rate = FP / (FP + TN)

(TP, FP, TN and FN are the counts of true positives, false positives, true negatives
and false negatives respectively), or by plotting the true positive rate against the
false positive rate while varying the threshold – a receiver operating characteristic
(ROC) curve.
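
To make the ROC idea concrete, here is a small C++ sketch (not from the handout) that
turns classifier scores and ground-truth labels into one (false positive rate, true
positive rate) point per threshold:

       #include <algorithm>
       #include <cstdio>
       #include <vector>

       // Print one (FP rate, TP rate) pair per candidate threshold.
       // scores: raw classifier outputs; labels: true classes (+1 / -1).
       // Assumes both classes are present in labels.
       void roc_points(const std::vector<double>& scores,
                       const std::vector<int>& labels)
       {
           int P = 0, N = 0;                        // count positives and negatives
           for (int y : labels) { if (y > 0) ++P; else ++N; }

           std::vector<double> thresholds = scores; // use each score as a threshold
           std::sort(thresholds.begin(), thresholds.end());

           for (double t : thresholds) {
               int tp = 0, fp = 0;
               for (std::size_t i = 0; i < scores.size(); ++i) {
                   if (scores[i] >= t) {            // predicted positive
                       if (labels[i] > 0) ++tp; else ++fp;
                   }
               }
               std::printf("FP rate = %.3f   TP rate = %.3f\n",
                           (double)fp / N, (double)tp / P);
           }
       }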




The importance of features
Much of the research literature is concerned with the accuracy of various classifiers,
often benchmarked against various standard datasets. It is important to realise that the
best way to “solve” a classification problem (or at least improve the accuracy) is to
find, extract or develop better features. With discriminative features, a “basic”
approach, e.g. Naïve Bayes or k Nearest Neighbour, will usually do as well as an
advanced approach. No classifier will ever be accurate with weak features.


Tutorial tasks
Download and unzip LIBSVM and the other associated files:
       https://www.cs.tcd.ie/Darren.Caulfield/vision
Further information: Chih-Chung Chang and Chih-Jen Lin, “LIBSVM: a library for
support vector machines”, 2001. The software is available at
       http://www.csie.ntu.edu.tw/~cjlin/libsvm

svm-toy
Navigate to the “windows” folder and run “svm-toy.exe”. Load the data file
“fourclass_rescaled_for_app.txt”. (It is actually only a two-class dataset, adapted from
the LIBSVM dataset page.)

Here is the LIBSVM parameters guide (compare to the kernels listed above):
       -s svm_type : set type of SVM (default 0)
                 0 -- C-SVC
                 1 -- nu-SVC
                 2 -- one-class SVM
                 3 -- epsilon-SVR
                 4 -- nu-SVR
       -t kernel_type : set type of kernel function (default 2)
                 0 -- linear: u'*v
                 1 -- polynomial: (gamma*u'*v + coef0)^degree
                 2 -- radial basis function: exp(-gamma*|u-v|^2)
                 3 -- sigmoid: tanh(gamma*u'*v + coef0)
       -d degree : set degree in kernel function (default 3)
       -g gamma : set gamma in kernel function (default 1/k)
       -r coef0 : set coef0 in kernel function (default 0)
       -c cost : set the parameter C of C-SVC, epsilon-SVR, and nu-SVR (default 1)
       -n nu : set the parameter nu of nu-SVC, one-class SVM, and nu-SVR (default 0.5)
       -p epsilon : set the epsilon in loss function of epsilon-SVR (default 0.1)
       -m cachesize : set cache memory size in MB (default 100)
       -e epsilon : set tolerance of termination criterion (default 0.001)
       -h shrinking: whether to use the shrinking heuristics, 0 or 1 (default 1)
       -b probability_estimates: whether to train a SVC or SVR model for probability estimates, 0 or 1 (default 0)
       -wi weight: set the parameter C of class i to weight*C, for C-SVC (default 1)

       The k in the -g option means the number of attributes in the input data.

       option -v randomly splits the data into n parts and calculates cross
       validation accuracy/mean squared error on them.
Click “Run” with the default parameters left unchanged and observe the classification
result.




Change the parameters (in the text box at the bottom right). In particular, try
changing the t, c, g, d and r values. Find parameters that leave the two classes well
separated.


svm-train and svm-predict
Download and unzip the “a1a” dataset (training and test sets) and put the files in the
“windows” folder of LIBSVM. Open a command prompt in that folder.

       Usage: svm-train [options] training_set_file [model_file]

       Usage: svm-predict [options] test_file model_file output_file

Run the following commands. They train a classifier (on the training set) using an
RBF kernel (the default) and then use it for prediction (classification) on the test
set; svm-predict prints the resulting classification accuracy:

       svm-train.exe -c 10    a1a.txt          a1a.model

       svm-predict.exe        a1a.t            a1a.model     a1a.output

Change the -c parameter from 0.01 to 10000 (increase by a factor of 10 each time)
and study the effect.

Change the -g (gamma) parameter.

This training set is unbalanced: there are 1210 examples from one class and 395
examples from the other. Try the “-w1 weight” and “-w-1 weight” options to adjust
the penalty for misclassification.




See the following page for some 3D results:
       http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/svmtoy3d/examples/



