2. ➔ One of the most popular supervised learning algorithms.
➔ Used for both classification and regression tasks (but most widely used for
classification).
➔ Works better on smaller datasets.
➔ Works well in high-dimensional spaces (many independent variables).
➔ Works well even when the number of features is greater than the number of
observations.
➔ Does not perform well when the dataset is noisy, i.e. when the target classes
overlap.
➔ Does not perform well on large datasets, because the required training time
is high.
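The points above can be sketched with scikit-learn's `SVC`; the synthetic dataset and all parameter values here are illustrative choices, not part of the original notes:

```python
# Minimal SVM classification sketch with scikit-learn.
# Dataset and parameters are illustrative (small sample, many features).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# A small, high-dimensional dataset: the regime where SVMs tend to do well.
X, y = make_classification(n_samples=200, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = SVC(kernel="linear")  # linear decision boundary (a hyperplane)
clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))  # held-out accuracy
```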
3. The goal of the support vector machine algorithm is to create the best
line or decision boundary in an n-dimensional space (n — the number
of features) that distinctly classifies the data points.
This best decision boundary is called a hyperplane.
4. Our objective is to find the plane that has the maximum margin, i.e. the
maximum distance between the data points of the two classes.
5. Terminologies:
Margin: the perpendicular distance between the closest data points and the
hyperplane.
The optimal line (hyperplane) with the maximum margin is termed the Maximal
Margin Hyperplane.
The closest points, from which the margin distance is measured, are called
support vectors.
6. ➔ Support vectors are the data points closest to the hyperplane.
➔ Support vectors determine the position and orientation of the hyperplane.
➔ With the help of the support vectors, we maximise the margin of the classifier.
➔ Support vectors are the points that define the SVM classifier.
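A fitted linear `SVC` exposes its support vectors directly, and for a linear kernel the margin width can be recovered as 2/‖w‖. The toy 2-D points below are invented for illustration:

```python
# Sketch: inspecting the support vectors of a fitted linear SVC.
# The six 2-D points are a made-up, linearly separable toy dataset.
import numpy as np
from sklearn.svm import SVC

X = np.array([[1, 2], [2, 3], [2, 1], [4, 5], [5, 4], [5, 6]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel="linear", C=1000)  # large C: close to a hard margin
clf.fit(X, y)

print(clf.support_vectors_)        # the closest points that define the margin
w = clf.coef_[0]
print(2 / np.linalg.norm(w))       # margin width = 2 / ||w||
```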
8. Terminologies:
Regularization:
➔ The ‘C’ parameter in the Python sklearn library.
➔ Controls how strongly the classifier penalises misclassified training points.
Large C → small margin of the hyperplane.
Small C → large margin of the hyperplane.
Problems with setting C:
C too small → risk of underfitting.
C too large → risk of overfitting.
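The effect of C can be seen by fitting the same data twice; a softer margin (small C) typically keeps more support vectors inside or near the margin. The dataset and the two C values are illustrative:

```python
# Sketch: effect of C on the margin. A small C allows a wide, soft margin
# (more support vectors); a large C forces a narrow margin (fewer).
# Dataset and C values are illustrative choices.
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=2, n_redundant=0,
                           n_clusters_per_class=1, flip_y=0.1, random_state=0)

soft = SVC(kernel="linear", C=0.01).fit(X, y)   # small C: wide margin
hard = SVC(kernel="linear", C=100.0).fit(X, y)  # large C: narrow margin
print(soft.n_support_.sum(), hard.n_support_.sum())
```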
9. (Figure: large C → small margin; small C → large margin.)
10. Terminologies:
Gamma:
➔ Defines how far the influence of a single training point reaches in the
calculation of the line of separation.
➔ Low gamma - even points far from the hyperplane are considered in the
calculation.
➔ High gamma - only points close to the hyperplane are considered in the
calculation.
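With an RBF kernel, gamma is passed straight to `SVC`; a very high gamma lets the boundary hug the training points. The dataset and gamma values below are illustrative:

```python
# Sketch: gamma in an RBF-kernel SVC. Low gamma -> each point's influence
# reaches far (smoother boundary); high gamma -> influence is local, so the
# boundary can hug the training points (overfitting risk).
# Dataset and gamma values are illustrative.
from sklearn.datasets import make_moons
from sklearn.svm import SVC

X, y = make_moons(n_samples=200, noise=0.2, random_state=0)

low = SVC(kernel="rbf", gamma=0.1).fit(X, y)
high = SVC(kernel="rbf", gamma=100.0).fit(X, y)
# High gamma fits the training data much more tightly.
print(low.score(X, y), high.score(X, y))
```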
12. Outliers in the data can shift the decision boundary and lead
to wrong predictions.
13. Terminologies:
Kernels:
➔ A kernel is the technique SVM uses to classify non-linear data.
➔ Kernel functions increase the dimension of the data so that SVM can fit the
optimal hyperplane to separate it.
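The dimension-lifting idea can be shown with an explicit feature map (what a kernel does implicitly): concentric circles are not separable by a line in 2-D, but adding z = x² + y² as a third feature makes them linearly separable. The dataset and the chosen feature map are illustrative:

```python
# Sketch of the idea behind kernels: lifting data to a higher dimension can
# make it linearly separable. Concentric circles (not separable by a line
# in 2-D) become separable once z = x^2 + y^2 is added as a third feature.
# Dataset parameters are illustrative.
import numpy as np
from sklearn.datasets import make_circles
from sklearn.svm import SVC

X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

# Explicit feature map: (x, y) -> (x, y, x^2 + y^2)
X3 = np.column_stack([X, (X ** 2).sum(axis=1)])

flat = SVC(kernel="linear").fit(X, y)     # a line cannot separate circles
lifted = SVC(kernel="linear").fit(X3, y)  # a flat hyperplane separates in 3-D
print(flat.score(X, y), lifted.score(X3, y))
```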
14. 2D and 3D feature space
If the number of input features is 2, then the hyperplane is just a line.
If the number of input features is 3, then the hyperplane becomes a two-dimensional plane.
It becomes difficult to imagine the hyperplane when the number of features exceeds 3.
15. Types of SVM
01 Linear SVM
Linear SVM is used for linearly separable data, i.e. when a dataset can be
classified into two classes by a single straight line.
02 Non-Linear SVM
Non-Linear SVM is used for non-linearly separable data, i.e. when a dataset
cannot be classified by a straight line; such data is termed non-linear data,
and the classifier used is called a non-linear SVM classifier.
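The two types can be contrasted on data a straight line cannot separate; the half-moons dataset and kernel choices here are illustrative:

```python
# Sketch: linear vs non-linear SVM on data that no straight line separates
# (two interleaving half-moons; dataset choice is illustrative).
from sklearn.datasets import make_moons
from sklearn.svm import SVC

X, y = make_moons(n_samples=300, noise=0.15, random_state=1)

linear = SVC(kernel="linear").fit(X, y)
nonlinear = SVC(kernel="rbf").fit(X, y)  # the kernel handles the curvature
print(linear.score(X, y), nonlinear.score(X, y))
```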