SlideShare a Scribd company logo
1 of 14
Machine Learning
Jason Tseng
Information Engineering, KSU
Classification Algorithms – Support
Vector Machine Algorithm
 Support vector machines (SVMs) are flexible supervised machine
learning algorithms which are linear models used both for
classification and regression.
 It can solve linear and non-linear problems and work well for
many practical problems
 In 1960s, SVMs were first introduced but later they got refined in
1990. SVMs have their unique way of implementation as
compared to other machine learning algorithms.
 Lately, they are extremely popular because of their ability to
handle multiple continuous and categorical variables.
Support Vector Machine Algorithm-
Theory
 The idea of SVM is simple: The algorithm creates a
line or a hyperplane in multidimensional space
which separates the data into classes.
 The hyperplane will be generated in an iterative
manner by SVM so that the error can be minimized.
 The goal of SVM is to divide the datasets into
classes to find a maximum marginal hyperplane
(MMH).
Support Vector Machine
Algorithm- Problem
 Suppose you have a dataset as shown below
and you need to classify the red rectangles
from the blue ellipses(let’s say positives from
the negatives).
 So your task is to find an ideal line that
separates this dataset in two classes (say red
and blue).
 There isn’t a unique line that does the job. In
fact, we have an infinite lines that can
separate these two classes. So how does SVM
find the ideal one???
Support Vector Machine
Algorithm- Problem
 Suppose we have two candidates here, the green
colored line and the yellow colored line. Which
line according to you best separates the data?
 If you selected the yellow line then congratulations,
because that’s the line we are looking for. It’s
visually quite intuitive in this case that the yellow
line classifies better. But, we need something
concrete to fix our line.
 The green line in the image above is quite close to
the red class. Though it classifies the current
datasets it is not a generalized line and in machine
learning our goal is to get a more generalized
separator.
SVM’s way to find the best line
 The main goal of SVM is to divide the datasets into classes
to find a maximum marginal hyperplane (MMH) in the
following two steps −
 Generates hyperplanes iteratively that segregates the classes in best
way.
 Choose the hyperplane that separates the classes correctly.
 Support vector: find the points closest to the line from
both the classes. These points are called support vectors.
 Hyperplane: a decision plane or space which is divided
between a set of objects having different classes.
 Margin: compute the distance between the line and the
support vectors. This distance is called the margin.
 Our goal is to maximize the margin. The hyperplane for
Case study
 Let’s consider a bit complex dataset, which is not
linearly separable.
 We cannot draw a straight line that can classify this
data. But, this data can be converted to linearly
separable data in higher dimension.
 Lets add one more dimension and call it z-axis. Let
the co-ordinates on z-axis be governed by the
constraint,
 z co-ordinate is the square of distance of the point
from origin. Let’s plot the data on z-axis.
Case study
 Let the purple line separating the data in higher dimension
be z=k, where k is a constant. Since, z=x²+y² we get x² +
y² = k; which is an equation of a circle.
 We can project this linear separator in higher dimension
back in original dimensions using this transformation.
 finding the correct transformation for any given dataset
isn’t that easy. Thankfully, we can use kernels in sklearn’s
SVM implementation to do this job.
 A hyperplane in an n-dimensional Euclidean
space is a flat with n-1 dimensional subset of
that space that divides the space into two
disconnected parts.
decision boundary
SVM Kernels
 SVM algorithm is implemented with kernel that transforms an
input data space into the required form.
 A technique called the kernel trick in which kernel takes a low
dimensional input space and transforms it into a higher
dimensional space. i.e, kernel converts non-separable problems
into separable problems by adding more dimensions to it.
Types of kernels used by SVM
 Linear Kernels: A dot product between any two
observations/vectors (x, xi )
 Polynomial Kernel: more generalized form of linear kernel and
distinguish curved or nonlinear input space (d is the degree of
polynomial
 Radial Basis Function (RBF) Kernel: mostly used in SVM
classification, maps input space in indefinite dimensional space
(gamma ranges from 0 to 1, default value of gamma is 0.1)
A simple python implementation
 Data set: training points in X and the classes they belong to in Y
 Train the SVM model using a linear kernel
 Predict the class of new dataset:[0, 6]
TUNING PARAMETERS for SVM
Some important parameters for SVM:
 Regularization parameter C: It controls the trade off
between smooth decision boundary and classifying training
points correctly. A large value of c means you will get
more training points correctly.
 Eg. There are a number of decision boundaries that we can
draw for this dataset. Consider a straight (green colored)
decision boundary which is quite simple but it comes at the
cost of a few points being misclassified. These
misclassified points are called outliers.
TUNING PARAMETERS for SVM
 We can also make something that is considerably more
wiggly(sky blue colored decision boundary) but where we get
potentially all of the training points correct.
 The trade off having something that is very intricate, very
complicated like this is that chances are it is not going to
generalize quite as well to our test set.
 Something that is simple, more straight maybe actually the
better choice if you look at the accuracy. Large value of c
means you will get more intricate decision curves trying to fit
in all the points.
 Figuring out how much you want to have a smooth decision
boundary vs one that gets things correct is part of artistry of
machine learning. So try different values of c for your dataset
to get the perfectly balanced curve and avoid over fitting.
TUNING PARAMETERS for SVM
Important parameters for SVM:
 Gamma: define how far the influence of a single training example reaches. If
it has a low value it means that every point has a far reach and conversely high
value of gamma means that every point has close reach.
 If gamma has a very high value, then the decision boundary is just going to be
dependent upon the points that are very close to the line which effectively
results in ignoring some of the points that are very far from the decision
boundary.
 This is because the closer points get more weight and it results in a wiggly
curve as shown in previous graph. On the other hand, if the gamma value is
low even that the far away points get considerable weight and we get a more
linear curve.

More Related Content

Similar to classification algorithms in machine learning.pptx

Data Science - Part IX - Support Vector Machine
Data Science - Part IX -  Support Vector MachineData Science - Part IX -  Support Vector Machine
Data Science - Part IX - Support Vector MachineDerek Kane
 
Supervised and unsupervised learning
Supervised and unsupervised learningSupervised and unsupervised learning
Supervised and unsupervised learningAmAn Singh
 
Support Vector Machines USING MACHINE LEARNING HOW IT WORKS
Support Vector Machines USING MACHINE LEARNING HOW IT WORKSSupport Vector Machines USING MACHINE LEARNING HOW IT WORKS
Support Vector Machines USING MACHINE LEARNING HOW IT WORKSrajalakshmi5921
 
SVM & KNN Presentation.pptx
SVM & KNN Presentation.pptxSVM & KNN Presentation.pptx
SVM & KNN Presentation.pptxMohamedMonir33
 
Machine learning session8(svm nlp)
Machine learning   session8(svm nlp)Machine learning   session8(svm nlp)
Machine learning session8(svm nlp)Abhimanyu Dwivedi
 
sentiment analysis using support vector machine
sentiment analysis using support vector machinesentiment analysis using support vector machine
sentiment analysis using support vector machineShital Andhale
 
svm-proyekt.pptx
svm-proyekt.pptxsvm-proyekt.pptx
svm-proyekt.pptxElinEliyev
 
Lec_XX_Support Vector Machine Algorithm.pptx
Lec_XX_Support Vector Machine Algorithm.pptxLec_XX_Support Vector Machine Algorithm.pptx
Lec_XX_Support Vector Machine Algorithm.pptxpiwig56192
 
properties, application and issues of support vector machine
properties, application and issues of support vector machineproperties, application and issues of support vector machine
properties, application and issues of support vector machineDr. Radhey Shyam
 
Support Vector Machine ppt presentation
Support Vector Machine ppt presentationSupport Vector Machine ppt presentation
Support Vector Machine ppt presentationAyanaRukasar
 
ML Softmax JP 24.pptx
ML Softmax JP 24.pptxML Softmax JP 24.pptx
ML Softmax JP 24.pptxJayesh Patil
 
UE19EC353 ML Unit4_slides.pptx
UE19EC353 ML Unit4_slides.pptxUE19EC353 ML Unit4_slides.pptx
UE19EC353 ML Unit4_slides.pptxpremkumar901866
 

Similar to classification algorithms in machine learning.pptx (20)

Data Science - Part IX - Support Vector Machine
Data Science - Part IX -  Support Vector MachineData Science - Part IX -  Support Vector Machine
Data Science - Part IX - Support Vector Machine
 
Supervised and unsupervised learning
Supervised and unsupervised learningSupervised and unsupervised learning
Supervised and unsupervised learning
 
Support Vector Machines USING MACHINE LEARNING HOW IT WORKS
Support Vector Machines USING MACHINE LEARNING HOW IT WORKSSupport Vector Machines USING MACHINE LEARNING HOW IT WORKS
Support Vector Machines USING MACHINE LEARNING HOW IT WORKS
 
SVM & KNN Presentation.pptx
SVM & KNN Presentation.pptxSVM & KNN Presentation.pptx
SVM & KNN Presentation.pptx
 
lec10svm.ppt
lec10svm.pptlec10svm.ppt
lec10svm.ppt
 
SVM_notes.pdf
SVM_notes.pdfSVM_notes.pdf
SVM_notes.pdf
 
Machine learning session8(svm nlp)
Machine learning   session8(svm nlp)Machine learning   session8(svm nlp)
Machine learning session8(svm nlp)
 
sentiment analysis using support vector machine
sentiment analysis using support vector machinesentiment analysis using support vector machine
sentiment analysis using support vector machine
 
svm-proyekt.pptx
svm-proyekt.pptxsvm-proyekt.pptx
svm-proyekt.pptx
 
Lec_XX_Support Vector Machine Algorithm.pptx
Lec_XX_Support Vector Machine Algorithm.pptxLec_XX_Support Vector Machine Algorithm.pptx
Lec_XX_Support Vector Machine Algorithm.pptx
 
properties, application and issues of support vector machine
properties, application and issues of support vector machineproperties, application and issues of support vector machine
properties, application and issues of support vector machine
 
Support Vector Machine ppt presentation
Support Vector Machine ppt presentationSupport Vector Machine ppt presentation
Support Vector Machine ppt presentation
 
MSE.pptx
MSE.pptxMSE.pptx
MSE.pptx
 
lec10svm.ppt
lec10svm.pptlec10svm.ppt
lec10svm.ppt
 
Svm ms
Svm msSvm ms
Svm ms
 
lec10svm.ppt
lec10svm.pptlec10svm.ppt
lec10svm.ppt
 
ML Softmax JP 24.pptx
ML Softmax JP 24.pptxML Softmax JP 24.pptx
ML Softmax JP 24.pptx
 
Support vector machine
Support vector machineSupport vector machine
Support vector machine
 
Support vector machines
Support vector machinesSupport vector machines
Support vector machines
 
UE19EC353 ML Unit4_slides.pptx
UE19EC353 ML Unit4_slides.pptxUE19EC353 ML Unit4_slides.pptx
UE19EC353 ML Unit4_slides.pptx
 

Recently uploaded

ENGLISH6-Q4-W3.pptxqurter our high choom
ENGLISH6-Q4-W3.pptxqurter our high choomENGLISH6-Q4-W3.pptxqurter our high choom
ENGLISH6-Q4-W3.pptxqurter our high choomnelietumpap1
 
Types of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxTypes of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxEyham Joco
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Educationpboyjonauth
 
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...Nguyen Thanh Tu Collection
 
Quarter 4 Peace-education.pptx Catch Up Friday
Quarter 4 Peace-education.pptx Catch Up FridayQuarter 4 Peace-education.pptx Catch Up Friday
Quarter 4 Peace-education.pptx Catch Up FridayMakMakNepo
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentInMediaRes1
 
How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17Celine George
 
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxEPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxRaymartEstabillo3
 
Grade 9 Q4-MELC1-Active and Passive Voice.pptx
Grade 9 Q4-MELC1-Active and Passive Voice.pptxGrade 9 Q4-MELC1-Active and Passive Voice.pptx
Grade 9 Q4-MELC1-Active and Passive Voice.pptxChelloAnnAsuncion2
 
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdfFraming an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdfUjwalaBharambe
 
Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Mark Reed
 
ROOT CAUSE ANALYSIS PowerPoint Presentation
ROOT CAUSE ANALYSIS PowerPoint PresentationROOT CAUSE ANALYSIS PowerPoint Presentation
ROOT CAUSE ANALYSIS PowerPoint PresentationAadityaSharma884161
 
Roles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceRoles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceSamikshaHamane
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxpboyjonauth
 
Crayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon ACrayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon AUnboundStockton
 
Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Celine George
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersSabitha Banu
 

Recently uploaded (20)

Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
 
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdfTataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
 
ENGLISH6-Q4-W3.pptxqurter our high choom
ENGLISH6-Q4-W3.pptxqurter our high choomENGLISH6-Q4-W3.pptxqurter our high choom
ENGLISH6-Q4-W3.pptxqurter our high choom
 
Types of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxTypes of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptx
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Education
 
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
 
Quarter 4 Peace-education.pptx Catch Up Friday
Quarter 4 Peace-education.pptx Catch Up FridayQuarter 4 Peace-education.pptx Catch Up Friday
Quarter 4 Peace-education.pptx Catch Up Friday
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media Component
 
9953330565 Low Rate Call Girls In Rohini Delhi NCR
9953330565 Low Rate Call Girls In Rohini  Delhi NCR9953330565 Low Rate Call Girls In Rohini  Delhi NCR
9953330565 Low Rate Call Girls In Rohini Delhi NCR
 
How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17
 
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxEPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
 
Grade 9 Q4-MELC1-Active and Passive Voice.pptx
Grade 9 Q4-MELC1-Active and Passive Voice.pptxGrade 9 Q4-MELC1-Active and Passive Voice.pptx
Grade 9 Q4-MELC1-Active and Passive Voice.pptx
 
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdfFraming an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
 
Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)
 
ROOT CAUSE ANALYSIS PowerPoint Presentation
ROOT CAUSE ANALYSIS PowerPoint PresentationROOT CAUSE ANALYSIS PowerPoint Presentation
ROOT CAUSE ANALYSIS PowerPoint Presentation
 
Roles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceRoles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in Pharmacovigilance
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptx
 
Crayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon ACrayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon A
 
Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginners
 

classification algorithms in machine learning.pptx

  • 2. Classification Algorithms – Support Vector Machine Algorithm  Support vector machines (SVMs) are flexible supervised machine learning algorithms which are linear models used both for classification and regression.  It can solve linear and non-linear problems and work well for many practical problems  In 1960s, SVMs were first introduced but later they got refined in 1990. SVMs have their unique way of implementation as compared to other machine learning algorithms.  Lately, they are extremely popular because of their ability to handle multiple continuous and categorical variables.
  • 3. Support Vector Machine Algorithm- Theory  The idea of SVM is simple: The algorithm creates a line or a hyperplane in multidimensional space which separates the data into classes.  The hyperplane will be generated in an iterative manner by SVM so that the error can be minimized.  The goal of SVM is to divide the datasets into classes to find a maximum marginal hyperplane (MMH).
  • 4. Support Vector Machine Algorithm- Problem  Suppose you have a dataset as shown below and you need to classify the red rectangles from the blue ellipses(let’s say positives from the negatives).  So your task is to find an ideal line that separates this dataset in two classes (say red and blue).  There isn’t a unique line that does the job. In fact, we have an infinite lines that can separate these two classes. So how does SVM find the ideal one???
  • 5. Support Vector Machine Algorithm- Problem  Suppose we have two candidates here, the green colored line and the yellow colored line. Which line according to you best separates the data?  If you selected the yellow line then congratulations, because that’s the line we are looking for. It’s visually quite intuitive in this case that the yellow line classifies better. But, we need something concrete to fix our line.  The green line in the image above is quite close to the red class. Though it classifies the current datasets it is not a generalized line and in machine learning our goal is to get a more generalized separator.
  • 6. SVM’s way to find the best line  The main goal of SVM is to divide the datasets into classes to find a maximum marginal hyperplane (MMH) in the following two steps −  Generates hyperplanes iteratively that segregates the classes in best way.  Choose the hyperplane that separates the classes correctly.  Support vector: find the points closest to the line from both the classes. These points are called support vectors.  Hyperplane: a decision plane or space which is divided between a set of objects having different classes.  Margin: compute the distance between the line and the support vectors. This distance is called the margin.  Our goal is to maximize the margin. The hyperplane for
  • 7. Case study  Let’s consider a bit complex dataset, which is not linearly separable.  We cannot draw a straight line that can classify this data. But, this data can be converted to linearly separable data in higher dimension.  Lets add one more dimension and call it z-axis. Let the co-ordinates on z-axis be governed by the constraint,  z co-ordinate is the square of distance of the point from origin. Let’s plot the data on z-axis.
  • 8. Case study  Let the purple line separating the data in higher dimension be z=k, where k is a constant. Since, z=x²+y² we get x² + y² = k; which is an equation of a circle.  We can project this linear separator in higher dimension back in original dimensions using this transformation.  finding the correct transformation for any given dataset isn’t that easy. Thankfully, we can use kernels in sklearn’s SVM implementation to do this job.  A hyperplane in an n-dimensional Euclidean space is a flat with n-1 dimensional subset of that space that divides the space into two disconnected parts. decision boundary
  • 9. SVM Kernels  SVM algorithm is implemented with kernel that transforms an input data space into the required form.  A technique called the kernel trick in which kernel takes a low dimensional input space and transforms it into a higher dimensional space. i.e, kernel converts non-separable problems into separable problems by adding more dimensions to it.
  • 10. Types of kernels used by SVM  Linear Kernels: A dot product between any two observations/vectors (x, xi )  Polynomial Kernel: more generalized form of linear kernel and distinguish curved or nonlinear input space (d is the degree of polynomial  Radial Basis Function (RBF) Kernel: mostly used in SVM classification, maps input space in indefinite dimensional space (gamma ranges from 0 to 1, default value of gamma is 0.1)
  • 11. A simple python implementation  Data set: training points in X and the classes they belong to in Y  Train the SVM model using a linear kernel  Predict the class of new dataset:[0, 6]
  • 12. TUNING PARAMETERS for SVM Some important parameters for SVM:  Regularization parameter C: It controls the trade off between smooth decision boundary and classifying training points correctly. A large value of c means you will get more training points correctly.  Eg. There are a number of decision boundaries that we can draw for this dataset. Consider a straight (green colored) decision boundary which is quite simple but it comes at the cost of a few points being misclassified. These misclassified points are called outliers.
  • 13. TUNING PARAMETERS for SVM  We can also make something that is considerably more wiggly(sky blue colored decision boundary) but where we get potentially all of the training points correct.  The trade off having something that is very intricate, very complicated like this is that chances are it is not going to generalize quite as well to our test set.  Something that is simple, more straight maybe actually the better choice if you look at the accuracy. Large value of c means you will get more intricate decision curves trying to fit in all the points.  Figuring out how much you want to have a smooth decision boundary vs one that gets things correct is part of artistry of machine learning. So try different values of c for your dataset to get the perfectly balanced curve and avoid over fitting.
  • 14. TUNING PARAMETERS for SVM Important parameters for SVM:  Gamma: define how far the influence of a single training example reaches. If it has a low value it means that every point has a far reach and conversely high value of gamma means that every point has close reach.  If gamma has a very high value, then the decision boundary is just going to be dependent upon the points that are very close to the line which effectively results in ignoring some of the points that are very far from the decision boundary.  This is because the closer points get more weight and it results in a wiggly curve as shown in previous graph. On the other hand, if the gamma value is low even that the far away points get considerable weight and we get a more linear curve.