SVM Tutorial
Zoya Gavrilov
Just the basics with a little bit of spoon-feeding...
1 Simplest case: linearly-separable data, binary classification
Goal: we want to find the hyperplane (i.e. decision boundary) linearly separating our classes. Our boundary will have equation: $w^T x + b = 0$.
Anything above the decision boundary should have label 1.
i.e., $x_i$ s.t. $w^T x_i + b > 0$ will have corresponding $y_i = 1$.
Similarly, anything below the decision boundary should have label $-1$.
i.e., $x_i$ s.t. $w^T x_i + b < 0$ will have corresponding $y_i = -1$.
The reason for this labelling scheme is that it lets us condense the formulation for the decision function to $f(x) = \mathrm{sign}(w^T x + b)$, since $f(x) = +1$ for all $x$ above the boundary, and $f(x) = -1$ for all $x$ below the boundary.
Thus, we can figure out if an instance has been classified properly by checking that $y(w^T x + b) > 0$ (which will be the case as long as either both $y$ and $w^T x + b$ are positive, or both are negative). After the rescaling below, we will ask for the stronger condition $y(w^T x + b) \geq 1$.
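As a quick illustration, here is a minimal numpy sketch of this check; the $w$ and $b$ below are made up for the example, not learned:

import numpy as np

w = np.array([2.0, -1.0])   # hypothetical weight vector
b = 0.5                     # hypothetical bias

def f(x):
    # decision function: +1 above the boundary, -1 below
    return np.sign(w @ x + b)

x, y = np.array([1.0, 1.0]), 1.0
print(f(x))                   # 1.0
print(y * (w @ x + b) > 0)    # True: correctly classified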
You'll notice that we will now have some space between our decision boundary and the nearest data points of either class. Thus, let's rescale $w$ and $b$ such that anything on or above the boundary $w^T x + b = 1$ is of one class (with label 1), and anything on or below the boundary $w^T x + b = -1$ is of the other class (with label $-1$).
What is the distance between these newly added boundaries?
First note that the two lines are parallel, and thus share their parameters $w$, $b$. Pick an arbitrary point $x_1$ lying on the line $w^T x + b = -1$. Then the closest point on the line $w^T x + b = 1$ to $x_1$ is the point $x_2 = x_1 + \lambda w$ (since the closest point will always lie on the perpendicular; recall that the vector $w$ is perpendicular to both lines). Using this formulation, $\lambda w$ is the line segment connecting $x_1$ and $x_2$, and thus $\lambda \|w\|$, the distance between $x_1$ and $x_2$, is the shortest distance between the two lines/boundaries. Solving for $\lambda$:
$w^T x_2 + b = 1$ where $x_2 = x_1 + \lambda w$
$\Rightarrow w^T(x_1 + \lambda w) + b = 1$
$\Rightarrow w^T x_1 + b + \lambda w^T w = 1$, where $w^T x_1 + b = -1$
$\Rightarrow -1 + \lambda w^T w = 1$
$\Rightarrow \lambda w^T w = 2$
$\Rightarrow \lambda = \frac{2}{w^T w} = \frac{2}{\|w\|^2}$
And so, the distance $\lambda \|w\|$ is $\frac{2}{\|w\|^2}\|w\| = \frac{2}{\|w\|} = \frac{2}{\sqrt{w^T w}}$.
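To see this result numerically, here is a small sketch (the $w$ and $b$ are arbitrary made-up values): pick a point $x_1$ on the $-1$ hyperplane, step by $\lambda w$, and confirm you land on the $+1$ hyperplane at distance $2/\|w\|$:

import numpy as np

w = np.array([3.0, 4.0])        # arbitrary example; ||w|| = 5
b = -2.0

x1 = w * (-1 - b) / (w @ w)     # a point with w @ x1 + b == -1
lam = 2 / (w @ w)
x2 = x1 + lam * w               # should satisfy w @ x2 + b == 1

print(w @ x1 + b)               # -1.0
print(w @ x2 + b)               #  1.0
print(np.linalg.norm(x2 - x1))  #  0.4, i.e. 2 / ||w||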
It's intuitive that we would want to maximize the distance between the two boundaries demarcating the classes. (Why? We want to be as sure as possible that we are not making classification mistakes, and thus we want our data points from the two classes to lie as far away from each other as possible.) This distance is called the margin, so what we want to do is to obtain the maximal margin.
Thus, we want to maximize $\frac{2}{\sqrt{w^T w}}$, which is equivalent to minimizing $\frac{\sqrt{w^T w}}{2}$, which is in turn equivalent to minimizing $\frac{w^T w}{2}$ (since the square root is a monotonic function).
This quadratic programming problem is expressed as:
$\min_{w,b} \frac{w^T w}{2}$
subject to: $y_i(w^T x_i + b) \geq 1$ ($\forall$ data points $x_i$).
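In practice one hands this QP to a solver or an SVM library. As a sketch (the toy data is made up), scikit-learn's SVC with a linear kernel and a very large C approximates the hard-margin problem:

import numpy as np
from sklearn.svm import SVC

# Toy linearly-separable data (made up for illustration).
X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -1.0]])
y = np.array([1, 1, -1, -1])

clf = SVC(kernel='linear', C=1e10)   # a huge C approximates the hard margin
clf.fit(X, y)

w, b = clf.coef_[0], clf.intercept_[0]
print('w =', w, 'b =', b)
print('margin =', 2 / np.linalg.norm(w))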
2 Soft-margin extension
Consider the case that your data isn’t perfectly linearly separable. For instance,
maybe you aren’t guaranteed that all your data points are correctly labelled, so
you want to allow some data points of one class to appear on the other side of
the boundary.
We can introduce slack variables: a $\xi_i \geq 0$ for each $x_i$. Our quadratic programming problem becomes:
$\min_{w,b,\xi} \frac{w^T w}{2} + C \sum_i \xi_i$
subject to: $y_i(w^T x_i + b) \geq 1 - \xi_i$ and $\xi_i \geq 0$ ($\forall$ data points $x_i$).
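The constant $C$ controls the trade-off: a small $C$ makes slack cheap (wider margin, more violations), a large $C$ penalizes it heavily. A sketch of that effect on made-up overlapping data:

import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1.0, 1.2, (50, 2)),   # two overlapping blobs
               rng.normal(+1.0, 1.2, (50, 2))])
y = np.array([-1] * 50 + [1] * 50)

for C in (0.01, 1.0, 100.0):
    clf = SVC(kernel='linear', C=C).fit(X, y)
    margin = 2 / np.linalg.norm(clf.coef_[0])
    print(f'C={C}: margin={margin:.3f}, support vectors={len(clf.support_vectors_)}')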
3 Nonlinear decision boundary
Mapping your data vectors, $x_i$, into a higher-dimensional (even infinite-dimensional) feature space may make them linearly separable in that space (whereas they may not be linearly separable in the original space). The formulation of the quadratic programming problem is as above, but with all $x_i$ replaced with $\phi(x_i)$, where $\phi$ provides the higher-dimensional mapping. So we have the standard SVM formulation:
$\min_{w,b,\xi} \frac{w^T w}{2} + C \sum_i \xi_i$
subject to: $y_i(w^T \phi(x_i) + b) \geq 1 - \xi_i$ and $\xi_i \geq 0$ ($\forall$ data points $x_i$).
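To make the idea concrete before kernels enter, here is a sketch with a hypothetical explicit map: 1-D data where one class sits inside an interval is not linearly separable on the line, but becomes separable under $\phi(x) = (x, x^2)$:

import numpy as np
from sklearn.svm import SVC

x = np.linspace(-3, 3, 61)
y = np.where(np.abs(x) < 1.5, 1, -1)   # inner interval vs. outside: not separable in 1-D

phi = np.column_stack([x, x ** 2])     # explicit map into R^2
clf = SVC(kernel='linear', C=1e6).fit(phi, y)
print(clf.score(phi, y))               # 1.0: linearly separable after the mapping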
4 Reformulating as a Lagrangian
We can introduce Lagrange multipliers to represent the condition that $y_i(w^T \phi(x_i) + b)$ must be at least 1.
This condition is captured by: $\max_{\alpha_i \geq 0} \alpha_i[1 - y_i(w^T \phi(x_i) + b)]$
This ensures that when $y_i(w^T \phi(x_i) + b) \geq 1$, the expression above is maximal when $\alpha_i = 0$ (since $[1 - y_i(w^T \phi(x_i) + b)]$ ends up being non-positive). Otherwise, $y_i(w^T \phi(x_i) + b) < 1$, so $[1 - y_i(w^T \phi(x_i) + b)]$ is a positive value, and the expression is maximal when $\alpha_i \to \infty$. This has the effect of penalizing any misclassified data points, while assigning 0 penalty to properly classified instances.
We thus have the following formulation:
$\min_{w,b}\left[\frac{w^T w}{2} + \sum_i \max_{\alpha_i \geq 0} \alpha_i[1 - y_i(w^T \phi(x_i) + b)]\right]$
To allow for slack (soft margin), preventing the $\alpha$ variables from going to $\infty$, we can impose constraints on the Lagrange multipliers to lie within: $0 \leq \alpha_i \leq C$.
We can define the dual problem by interchanging the max and min as follows (i.e. we minimize after fixing $\alpha$):
$\max_{\alpha \geq 0}[\min_{w,b} J(w, b; \alpha)]$, where $J(w, b; \alpha) = \frac{w^T w}{2} + \sum_i \alpha_i[1 - y_i(w^T \phi(x_i) + b)]$
Since we're solving an optimization problem, we set $\frac{\partial J}{\partial w} = 0$ and discover that the optimal setting of $w$ is $\sum_i \alpha_i y_i \phi(x_i)$, while setting $\frac{\partial J}{\partial b} = 0$ yields the constraint $\sum_i \alpha_i y_i = 0$.
Thus, after substituting and simplifying, we get:
$\min_{w,b} J(w, b; \alpha) = \sum_i \alpha_i - \frac{1}{2}\sum_{i,j} \alpha_i \alpha_j y_i y_j \phi(x_i)^T \phi(x_j)$
And thus our dual is:
$\max_{\alpha \geq 0}\left[\sum_i \alpha_i - \frac{1}{2}\sum_{i,j} \alpha_i \alpha_j y_i y_j \phi(x_i)^T \phi(x_j)\right]$
subject to: $\sum_i \alpha_i y_i = 0$ and $0 \leq \alpha_i \leq C$
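For concreteness, the dual objective is a short expression once the Gram matrix $G_{ij} = \phi(x_i)^T \phi(x_j)$ is available. A sketch with made-up data and a made-up feasible $\alpha$ (it satisfies $\sum_i \alpha_i y_i = 0$), taking $\phi$ to be the identity:

import numpy as np

X = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])   # made-up data
y = np.array([1.0, 1.0, -1.0])
alpha = np.array([0.3, 0.1, 0.4])                      # sum(alpha * y) == 0

G = X @ X.T                                            # Gram matrix (phi = identity here)
dual = alpha.sum() - 0.5 * (alpha * y) @ G @ (alpha * y)
print(dual)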
5 Kernel Trick
Because we're working in a higher-dimensional space (and potentially even an infinite-dimensional space), calculating $\phi(x_i)^T \phi(x_j)$ may be intractable. However, it turns out that there are special kernel functions that operate on the lower-dimensional vectors $x_i$ and $x_j$ to produce a value equivalent to the dot product of the higher-dimensional vectors.
For instance, consider the function $\phi : \mathbb{R}^3 \to \mathbb{R}^{10}$, where:
$\phi(x) = (1,\ \sqrt{2}x^{(1)},\ \sqrt{2}x^{(2)},\ \sqrt{2}x^{(3)},\ [x^{(1)}]^2,\ [x^{(2)}]^2,\ [x^{(3)}]^2,\ \sqrt{2}x^{(1)}x^{(2)},\ \sqrt{2}x^{(1)}x^{(3)},\ \sqrt{2}x^{(2)}x^{(3)})$
Instead, we have the kernel trick, which tells us that $K(x_i, x_j) = (1 + x_i^T x_j)^2 = \phi(x_i)^T \phi(x_j)$ for the given $\phi$. Thus, we can simplify our calculations.
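This identity is easy to verify numerically; a sketch checking that $(1 + x^T z)^2$ and $\phi(x)^T \phi(z)$ agree on random vectors, for the $\phi$ defined above:

import numpy as np

def phi(x):
    # the explicit R^3 -> R^10 map from above
    r2 = np.sqrt(2)
    return np.array([1.0,
                     r2 * x[0], r2 * x[1], r2 * x[2],
                     x[0] ** 2, x[1] ** 2, x[2] ** 2,
                     r2 * x[0] * x[1], r2 * x[0] * x[2], r2 * x[1] * x[2]])

rng = np.random.default_rng(1)
x, z = rng.normal(size=3), rng.normal(size=3)
print((1 + x @ z) ** 2)    # kernel evaluated on the R^3 vectors
print(phi(x) @ phi(z))     # dot product in R^10 -- the same value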
Rewriting the dual in terms of the kernel yields:
$\max_{\alpha \geq 0}\left[\sum_i \alpha_i - \frac{1}{2}\sum_{i,j} \alpha_i \alpha_j y_i y_j K(x_i, x_j)\right]$
6 Decision function
To classify a novel instance $x$ once you've learned the optimal $\alpha_i$ parameters, all you have to do is calculate $f(x) = \mathrm{sign}(w^T \phi(x) + b) = \mathrm{sign}\left(\sum_i \alpha_i y_i K(x_i, x) + b\right)$ (by setting $w = \sum_i \alpha_i y_i \phi(x_i)$ and using the kernel trick).
Note that $\alpha_i$ is only non-zero for instances $\phi(x_i)$ on or near the boundary; those are called the support vectors, since they alone specify the decision boundary. We can toss out the other data points once training is complete. Thus, we only sum over the $x_i$ which constitute the support vectors.
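scikit-learn's SVC exposes exactly these pieces after training: support_vectors_ holds the $x_i$ with non-zero $\alpha_i$, dual_coef_ holds the products $\alpha_i y_i$, and intercept_ holds $b$. A sketch (with made-up data) recovering $f(x)$ by hand, using a polynomial kernel that matches the $K$ above:

import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(2)
X = rng.normal(size=(40, 3))
y = np.where(X[:, 0] ** 2 + X[:, 1] ** 2 > 1, 1, -1)   # made-up nonlinear labels

# With gamma=1 and coef0=1, sklearn's degree-2 poly kernel is (1 + x.z)^2.
clf = SVC(kernel='poly', degree=2, gamma=1.0, coef0=1.0).fit(X, y)

def f(x):
    K = (1 + clf.support_vectors_ @ x) ** 2            # K(x_i, x) over the support vectors
    return np.sign(clf.dual_coef_[0] @ K + clf.intercept_[0])

x_new = rng.normal(size=3)
print(f(x_new), clf.predict([x_new])[0])               # the two agree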
7 Learn more
A great resource is the MLSS 2006 (Machine Learning Summer School) talk by Chih-Jen Lin, available at: videolectures.net/mlss06tw_lin_svm/. Make sure to download his presentation slides. Starting with the basics, the talk continues on to discuss practical issues, parameter/kernel selection, as well as more advanced topics.
The typical reference for such matters: Pattern Recognition and Machine Learning by Christopher M. Bishop.