SlideShare a Scribd company logo
Digg Data
Support Vector Machine
Ankit Sharma
www.diggdata.in
without tears
Digg Data
Content
SVM and its application
Basic SVM
•Hyperplane
•Understanding of basics
•Optimization
Soft margin SVM
Non-linear decision boundary
SVMs in “loss + penalty” form
Kernel method
•Gaussian kernel
SVM usage beyond classification
Thursday, August 7, 2014 WITHOUT TEARS SERIES | www.diggdata.in 2
Digg Data
• In machine learning, support vector machines are supervised
learning models with associated learning algorithms that analyze
data and recognize patterns, used for classification and regression
analysis.
• Properties of SVM :
Duality
Kernels
Margin
Convexity
Sparseness
SVM : Support Vector Machine
Thursday, August 7, 2014 WITHOUT TEARS SERIES | www.diggdata.in 3
Digg Data
Time Series
analysis
Classification
Anomaly
detection
Regression
Machine
Vision
Text
categorization
Application of SVM
Thursday, August 7, 2014 WITHOUT TEARS SERIES | www.diggdata.in 4
Digg Data
Basic concept of SVM
Find a linear decision surface (“hyperplane”) that can separate classes and has the largest
distance (i.e., largest “gap” or “margin”) between border-line patients (i.e., “support vectors”)
Thursday, August 7, 2014 WITHOUT TEARS SERIES | www.diggdata.in 5
Digg Data
Hyperplane as a Decision boundary
• A hyperplane is a linear decision surface that splits the space into two parts;
• It is obvious that a hyperplane is a binary classifier
Thursday, August 7, 2014 WITHOUT TEARS SERIES | www.diggdata.in 6
Digg Data
Equation of a hyperplane
An equation of a hyperplane is defined by a
point (P0) and a perpendicular vector to the
plane (𝑤) at that point.
Thursday, August 7, 2014 WITHOUT TEARS SERIES | www.diggdata.in 7
Digg Data
• g(x) is a linear function:
x1
x2
w x + b < 0
w x + b > 0
 A hyper-plane in the feature
space
 (Unit-length) normal vector of
the hyper-plane:

w
n
w
n
Understanding the basics
Thursday, August 7, 2014 WITHOUT TEARS SERIES | www.diggdata.in 8
Digg Data
x1
x2How to classify these points using
a linear discriminant function in
order to minimize the error rate?
 Infinite number of answers!
 Which one is the best?
Understanding the basics
denotes +1
denotes -1Thursday, August 7, 2014 WITHOUT TEARS SERIES | www.diggdata.in 9
Digg Data
• The linear discriminant
function (classifier) with the
maximum margin is the best
“safe zone”
 Margin is defined as the width
that the boundary could be
increased by before hitting a
data point
 Why it is the best?
Robust to outliners and thus
strong generalization ability
Margin
x1
x2
Understanding the basics
denotes +1
denotes -1Thursday, August 7, 2014 WITHOUT TEARS SERIES | www.diggdata.in 10
Digg Data
• Given a set of data points:
 With a scale transformation on
both w and b, the above is
equivalent to
x1
x2
{( , )}, 1,2, ,i iy i nx , where
𝑭𝒐𝒓 𝒚𝒊 = +𝟏, 𝑾 𝑿𝒊 + 𝒃> 0
𝑭𝒐𝒓 𝒚𝒊 = −𝟏, 𝑾 𝑿𝒊 + 𝒃 < 𝟎
𝑭𝒐𝒓 𝒚𝒊 = +𝟏, 𝑾 𝑿𝒊 + 𝒃> +1
𝑭𝒐𝒓 𝒚𝒊 = −𝟏, 𝑾 𝑿𝒊 + 𝒃 < -1
Understanding the basics
denotes +1
denotes -1Thursday, August 7, 2014 WITHOUT TEARS SERIES | www.diggdata.in 11
Digg Data
• We know that
 The margin width is:
x1
x2
Margin
x+
x+
x-
( )
2
( )
M  
 
  
   
x x n
w
x x
w w
n
Support Vectors
𝑾 𝑿+ + 𝒃 = +𝟏
𝑾 𝑿− + 𝒃 = −𝟏
Understanding the basics
denotes +1
denotes -1Thursday, August 7, 2014 WITHOUT TEARS SERIES | www.diggdata.in 12
Digg Data
• Formulation:
x1
x2
Margin
x+
x+
x-
n
such that
2
maximize
w
𝑭𝒐𝒓 𝒚𝒊 = +𝟏, 𝑾 𝑿𝒊 + 𝒃> +1
𝑭𝒐𝒓 𝒚𝒊 = −𝟏, 𝑾 𝑿𝒊 + 𝒃 < -1
Understanding the basics
denotes +1
denotes -1Thursday, August 7, 2014 WITHOUT TEARS SERIES | www.diggdata.in 13
Digg Data
• Formulation:
x1
x2
Margin
x+
x+
x-
n
21
minimize
2
w
such that
𝑭𝒐𝒓 𝒚𝒊 = +𝟏, 𝑾 𝑿𝒊 + 𝒃> +1
𝑭𝒐𝒓 𝒚𝒊 = −𝟏, 𝑾 𝑿𝒊 + 𝒃 < -1
Understanding the basics
denotes +1
denotes -1Thursday, August 7, 2014 WITHOUT TEARS SERIES | www.diggdata.in 14
Digg Data
• Formulation:
x1
x2
Margin
x+
x+
x-
n
21
minimize
2
w
such that
𝐲𝐢 𝐖 𝐗 + 𝐛 ≥ 𝟏
Understanding the basics
denotes +1
denotes -1Thursday, August 7, 2014 WITHOUT TEARS SERIES | www.diggdata.in 15
Digg Data
Basics of optimization: Convex functions
• A function is called convex if the function lies below the straight line
segment connecting two points, for any two points in the interval.
• Property: Any local minimum is a global minimum!
Thursday, August 7, 2014 WITHOUT TEARS SERIES | www.diggdata.in 16
Digg Data
Basics of optimization: Quadratic programming
• Quadratic programming (QP) is a special optimization problem: the function to
optimize (“objective”) is quadratic, subject to linear constraints.
• Convex QP problems have convex objective functions.
• These problems can be solved easily and efficiently by greedy algorithms (because
every local minimum is a global minimum).
Thursday, August 7, 2014 WITHOUT TEARS SERIES | www.diggdata.in 17
Digg Data
SVM optimization problem: Primal formulation
• This is called “primal formulation of linear SVMs”
• It is a convex quadratic programming (QP) optimization problem with n
variables (wi, i= 1,…,n), where n is the number of features in the dataset.
Thursday, August 7, 2014 WITHOUT TEARS SERIES | www.diggdata.in 18
Digg Data
SVM optimization problem: Dual formulation
• The previous problem can be recast in the so-called “dual form” giving rise to
“dual formulation of linear SVMs”.
• Apply the method of Lagrange multipliers.
• We need to minimize this Lagrangian with respect to and simultaneously require
that the derivative with respect to vanishes , all subject to the constraints that
αi > 0
Thursday, August 7, 2014 WITHOUT TEARS SERIES | www.diggdata.in 19
Digg Data
SVM optimization problem: Dual formulation
Cond…
It is also a convex quadratic programming problem but with N variables (αi, i= 1,…,N), where N is
the number of samples.
Thursday, August 7, 2014 WITHOUT TEARS SERIES | www.diggdata.in 20
Digg Data
SVM optimization problem: Benefits of using
dual formulation
1) No need to access original data, need to access only dot products.
2) Number of free parameters is bounded by the number of support vectors
and not by the number of variables (beneficial for high-dimensional
problems).
Thursday, August 7, 2014 WITHOUT TEARS SERIES | www.diggdata.in 21
Digg Data
Non linearly separable data: “Soft-margin” linear SVM
Assign a “slack variable” to each instance ,
ξi > 0 which can be thought of distance from the
separating hyperplane if an instance is misclassified
and 0 otherwise.
Primal formulation:
Dual formulation:
• When C is very large, the soft-margin SVM is equivalent
to hard-margin SVM;
• When C is very small, we admit misclassifications in the
training data at the expense of having w-vector with
small norm;
• C has to be selected for the distribution at hand as it will
be discussed later in this tutorial.
Thursday, August 7, 2014 WITHOUT TEARS SERIES | www.diggdata.in 22
Digg Data
SVMs in “loss + penalty” form
• Many statistical learning algorithms (including SVMs) search for a decision function by solving the
following optimization problem:
Minimize (Loss+ λ Penalty)
– Loss measures error of fitting the data
– Penalty penalizes complexity of the learned function
– λ is regularization parameter that balances Loss and Penalty
• Overfitting → Poor generalization
Can also be stated as
Thursday, August 7, 2014 WITHOUT TEARS SERIES | www.diggdata.in 23
Digg Data
Nonlinear decision boundary
Non Linear
Decision
Boundary
Kernel
method
Thursday, August 7, 2014 WITHOUT TEARS SERIES | www.diggdata.in 24
Digg Data
Kernel method
• Kernel methods involve
– Nonlinear transformation of data to a higher dimensional feature space induced by a Mercer kernel
– Detection of optimal linear solutions in the kernel feature space
• Transformation to a higher dimensional space is expected to be helpful in conversion of nonlinear relations
into linear relations (Cover’s theorem)
– Nonlinearly separable patterns to linearly separable patterns
– Nonlinear regression to linear regression
– Nonlinear separation of clusters to linear separation of clusters
• Pattern analysis methods are implemented in such a way that the kernel feature space representation is
not explicitly required. They involve computation of pair-wise inner-products only.
• The pair-wise inner-products are computed efficiently directly from the original representation of data
using a kernel function (Kernel trick)
Thursday, August 7, 2014 WITHOUT TEARS SERIES | www.diggdata.in 25
Digg Data
Kernel trick
Not every function RN×RN -> R can be a valid kernel; it has to satisfy so-called Mercer conditions.
Otherwise, the underlying quadratic program may not be solvable.
Thursday, August 7, 2014 WITHOUT TEARS SERIES | www.diggdata.in 26
Digg Data
Popular kernels
Thursday, August 7, 2014 WITHOUT TEARS SERIES | www.diggdata.in 27
Digg Data
Gaussian kernel
Consider the Gaussian kernel:
Geometrically, this is a “bump” or “cavity”
centered at the training data point 𝑥j :
The resulting mapping function is a
combination of bumps and cavities.
Thursday, August 7, 2014 WITHOUT TEARS SERIES | www.diggdata.in 28
Digg Data
SVM usage beyond classification
Regression analysis
(ε-Support vector
regression)
Anomaly detection
(One-class SVM)
Clustering analysis
(Support Vector
Domain
Description)
Thursday, August 7, 2014 WITHOUT TEARS SERIES | www.diggdata.in 29
Digg Data
Thank you
Thursday, August 7, 2014 WITHOUT TEARS SERIES | www.diggdata.in 30

More Related Content

What's hot

Linear regression with gradient descent
Linear regression with gradient descentLinear regression with gradient descent
Linear regression with gradient descent
Suraj Parmar
 
Naive Bayes
Naive BayesNaive Bayes
Naive Bayes
CloudxLab
 
K mean-clustering algorithm
K mean-clustering algorithmK mean-clustering algorithm
K mean-clustering algorithm
parry prabhu
 
Classification Based Machine Learning Algorithms
Classification Based Machine Learning AlgorithmsClassification Based Machine Learning Algorithms
Classification Based Machine Learning Algorithms
Md. Main Uddin Rony
 
Support vector machines (svm)
Support vector machines (svm)Support vector machines (svm)
Support vector machines (svm)
Sharayu Patil
 
Support Vector machine
Support Vector machineSupport Vector machine
Support Vector machine
Anandha L Ranganathan
 
Latent Dirichlet Allocation
Latent Dirichlet AllocationLatent Dirichlet Allocation
Latent Dirichlet Allocation
Sangwoo Mo
 
Data Science - Part IX - Support Vector Machine
Data Science - Part IX -  Support Vector MachineData Science - Part IX -  Support Vector Machine
Data Science - Part IX - Support Vector Machine
Derek Kane
 
K - Nearest neighbor ( KNN )
K - Nearest neighbor  ( KNN )K - Nearest neighbor  ( KNN )
K - Nearest neighbor ( KNN )
Mohammad Junaid Khan
 
Support vector machine-SVM's
Support vector machine-SVM'sSupport vector machine-SVM's
Support vector machine-SVM's
Anudeep Chowdary Kamepalli
 
Random forest
Random forestRandom forest
Random forest
Ujjawal
 
Lecture 9 Perceptron
Lecture 9 PerceptronLecture 9 Perceptron
Lecture 9 Perceptron
Marina Santini
 
Autoencoders in Deep Learning
Autoencoders in Deep LearningAutoencoders in Deep Learning
Autoencoders in Deep Learning
milad abbasi
 
Support vector machine
Support vector machineSupport vector machine
Support vector machine
Musa Hawamdah
 
Svm
SvmSvm
Logistic regression
Logistic regressionLogistic regression
Logistic regression
MartinHogg9
 
Machine learning session4(linear regression)
Machine learning   session4(linear regression)Machine learning   session4(linear regression)
Machine learning session4(linear regression)
Abhimanyu Dwivedi
 
Linear regression
Linear regressionLinear regression
Linear regression
MartinHogg9
 
Perceptron & Neural Networks
Perceptron & Neural NetworksPerceptron & Neural Networks
Perceptron & Neural Networks
NAGUR SHAREEF SHAIK
 
Ensemble methods in machine learning
Ensemble methods in machine learningEnsemble methods in machine learning
Ensemble methods in machine learning
SANTHOSH RAJA M G
 

What's hot (20)

Linear regression with gradient descent
Linear regression with gradient descentLinear regression with gradient descent
Linear regression with gradient descent
 
Naive Bayes
Naive BayesNaive Bayes
Naive Bayes
 
K mean-clustering algorithm
K mean-clustering algorithmK mean-clustering algorithm
K mean-clustering algorithm
 
Classification Based Machine Learning Algorithms
Classification Based Machine Learning AlgorithmsClassification Based Machine Learning Algorithms
Classification Based Machine Learning Algorithms
 
Support vector machines (svm)
Support vector machines (svm)Support vector machines (svm)
Support vector machines (svm)
 
Support Vector machine
Support Vector machineSupport Vector machine
Support Vector machine
 
Latent Dirichlet Allocation
Latent Dirichlet AllocationLatent Dirichlet Allocation
Latent Dirichlet Allocation
 
Data Science - Part IX - Support Vector Machine
Data Science - Part IX -  Support Vector MachineData Science - Part IX -  Support Vector Machine
Data Science - Part IX - Support Vector Machine
 
K - Nearest neighbor ( KNN )
K - Nearest neighbor  ( KNN )K - Nearest neighbor  ( KNN )
K - Nearest neighbor ( KNN )
 
Support vector machine-SVM's
Support vector machine-SVM'sSupport vector machine-SVM's
Support vector machine-SVM's
 
Random forest
Random forestRandom forest
Random forest
 
Lecture 9 Perceptron
Lecture 9 PerceptronLecture 9 Perceptron
Lecture 9 Perceptron
 
Autoencoders in Deep Learning
Autoencoders in Deep LearningAutoencoders in Deep Learning
Autoencoders in Deep Learning
 
Support vector machine
Support vector machineSupport vector machine
Support vector machine
 
Svm
SvmSvm
Svm
 
Logistic regression
Logistic regressionLogistic regression
Logistic regression
 
Machine learning session4(linear regression)
Machine learning   session4(linear regression)Machine learning   session4(linear regression)
Machine learning session4(linear regression)
 
Linear regression
Linear regressionLinear regression
Linear regression
 
Perceptron & Neural Networks
Perceptron & Neural NetworksPerceptron & Neural Networks
Perceptron & Neural Networks
 
Ensemble methods in machine learning
Ensemble methods in machine learningEnsemble methods in machine learning
Ensemble methods in machine learning
 

Similar to Support Vector Machine without tears

background.pptx
background.pptxbackground.pptx
background.pptx
KabileshCm
 
Support vector machines
Support vector machinesSupport vector machines
Support vector machines
manaswinimysore
 
Structured Forests for Fast Edge Detection [Paper Presentation]
Structured Forests for Fast Edge Detection [Paper Presentation]Structured Forests for Fast Edge Detection [Paper Presentation]
Structured Forests for Fast Edge Detection [Paper Presentation]
Mohammad Shaker
 
lec10svm.ppt
lec10svm.pptlec10svm.ppt
lec10svm.ppt
pushkarjoshi42
 
lec10svm.ppt
lec10svm.pptlec10svm.ppt
lec10svm.ppt
kibrualemu812
 
Svm ms
Svm msSvm ms
Svm ms
student
 
lec10svm.ppt
lec10svm.pptlec10svm.ppt
lec10svm.ppt
TheULTIMATEALLROUNDE
 
Support Vector Machine and Implementation using Weka
Support Vector Machine and Implementation using WekaSupport Vector Machine and Implementation using Weka
Support Vector Machine and Implementation using Weka
Macha Pujitha
 
OM-DS-Fall2022-Session10-Support vector machine.pdf
OM-DS-Fall2022-Session10-Support vector machine.pdfOM-DS-Fall2022-Session10-Support vector machine.pdf
OM-DS-Fall2022-Session10-Support vector machine.pdf
ssuserb016ab
 
How Machine Learning Helps Organizations to Work More Efficiently?
How Machine Learning Helps Organizations to Work More Efficiently?How Machine Learning Helps Organizations to Work More Efficiently?
How Machine Learning Helps Organizations to Work More Efficiently?
Tuan Yang
 
Deeplearning
Deeplearning Deeplearning
Deeplearning
Nimrita Koul
 
Support Vector Machines USING MACHINE LEARNING HOW IT WORKS
Support Vector Machines USING MACHINE LEARNING HOW IT WORKSSupport Vector Machines USING MACHINE LEARNING HOW IT WORKS
Support Vector Machines USING MACHINE LEARNING HOW IT WORKS
rajalakshmi5921
 
Machine Learning workshop by GDSC Amity University Chhattisgarh
Machine Learning workshop by GDSC Amity University ChhattisgarhMachine Learning workshop by GDSC Amity University Chhattisgarh
Machine Learning workshop by GDSC Amity University Chhattisgarh
Poorabpatel
 
Computer Vision for Beginners
Computer Vision for BeginnersComputer Vision for Beginners
Computer Vision for Beginners
Sanghamitra Deb
 
Machine learning, biomarker accuracy and best practices
Machine learning, biomarker accuracy and best practicesMachine learning, biomarker accuracy and best practices
Machine learning, biomarker accuracy and best practices
Pradeep Redddy Raamana
 
Machine Learning Foundations for Professional Managers
Machine Learning Foundations for Professional ManagersMachine Learning Foundations for Professional Managers
Machine Learning Foundations for Professional Managers
Albert Y. C. Chen
 
Regression vs Deep Neural net vs SVM
Regression vs Deep Neural net vs SVMRegression vs Deep Neural net vs SVM
Regression vs Deep Neural net vs SVM
Ratul Alahy
 
Throttling Malware Families in 2D
Throttling Malware Families in 2DThrottling Malware Families in 2D
Throttling Malware Families in 2D
Mohamed Nassar
 
Seminar nov2017
Seminar nov2017Seminar nov2017
Seminar nov2017
Ahmed Youssef Ali Amer
 
Talk@rmit 09112017
Talk@rmit 09112017Talk@rmit 09112017
Talk@rmit 09112017
Shuai Zhang
 

Similar to Support Vector Machine without tears (20)

background.pptx
background.pptxbackground.pptx
background.pptx
 
Support vector machines
Support vector machinesSupport vector machines
Support vector machines
 
Structured Forests for Fast Edge Detection [Paper Presentation]
Structured Forests for Fast Edge Detection [Paper Presentation]Structured Forests for Fast Edge Detection [Paper Presentation]
Structured Forests for Fast Edge Detection [Paper Presentation]
 
lec10svm.ppt
lec10svm.pptlec10svm.ppt
lec10svm.ppt
 
lec10svm.ppt
lec10svm.pptlec10svm.ppt
lec10svm.ppt
 
Svm ms
Svm msSvm ms
Svm ms
 
lec10svm.ppt
lec10svm.pptlec10svm.ppt
lec10svm.ppt
 
Support Vector Machine and Implementation using Weka
Support Vector Machine and Implementation using WekaSupport Vector Machine and Implementation using Weka
Support Vector Machine and Implementation using Weka
 
OM-DS-Fall2022-Session10-Support vector machine.pdf
OM-DS-Fall2022-Session10-Support vector machine.pdfOM-DS-Fall2022-Session10-Support vector machine.pdf
OM-DS-Fall2022-Session10-Support vector machine.pdf
 
How Machine Learning Helps Organizations to Work More Efficiently?
How Machine Learning Helps Organizations to Work More Efficiently?How Machine Learning Helps Organizations to Work More Efficiently?
How Machine Learning Helps Organizations to Work More Efficiently?
 
Deeplearning
Deeplearning Deeplearning
Deeplearning
 
Support Vector Machines USING MACHINE LEARNING HOW IT WORKS
Support Vector Machines USING MACHINE LEARNING HOW IT WORKSSupport Vector Machines USING MACHINE LEARNING HOW IT WORKS
Support Vector Machines USING MACHINE LEARNING HOW IT WORKS
 
Machine Learning workshop by GDSC Amity University Chhattisgarh
Machine Learning workshop by GDSC Amity University ChhattisgarhMachine Learning workshop by GDSC Amity University Chhattisgarh
Machine Learning workshop by GDSC Amity University Chhattisgarh
 
Computer Vision for Beginners
Computer Vision for BeginnersComputer Vision for Beginners
Computer Vision for Beginners
 
Machine learning, biomarker accuracy and best practices
Machine learning, biomarker accuracy and best practicesMachine learning, biomarker accuracy and best practices
Machine learning, biomarker accuracy and best practices
 
Machine Learning Foundations for Professional Managers
Machine Learning Foundations for Professional ManagersMachine Learning Foundations for Professional Managers
Machine Learning Foundations for Professional Managers
 
Regression vs Deep Neural net vs SVM
Regression vs Deep Neural net vs SVMRegression vs Deep Neural net vs SVM
Regression vs Deep Neural net vs SVM
 
Throttling Malware Families in 2D
Throttling Malware Families in 2DThrottling Malware Families in 2D
Throttling Malware Families in 2D
 
Seminar nov2017
Seminar nov2017Seminar nov2017
Seminar nov2017
 
Talk@rmit 09112017
Talk@rmit 09112017Talk@rmit 09112017
Talk@rmit 09112017
 

Recently uploaded

一比一原版(UofT毕业证)多伦多大学毕业证如何办理
一比一原版(UofT毕业证)多伦多大学毕业证如何办理一比一原版(UofT毕业证)多伦多大学毕业证如何办理
一比一原版(UofT毕业证)多伦多大学毕业证如何办理
exukyp
 
一比一原版爱尔兰都柏林大学毕业证(本硕)ucd学位证书如何办理
一比一原版爱尔兰都柏林大学毕业证(本硕)ucd学位证书如何办理一比一原版爱尔兰都柏林大学毕业证(本硕)ucd学位证书如何办理
一比一原版爱尔兰都柏林大学毕业证(本硕)ucd学位证书如何办理
hqfek
 
一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理
bmucuha
 
Sid Sigma educational and problem solving power point- Six Sigma.ppt
Sid Sigma educational and problem solving power point- Six Sigma.pptSid Sigma educational and problem solving power point- Six Sigma.ppt
Sid Sigma educational and problem solving power point- Six Sigma.ppt
ArshadAyub49
 
Sample Devops SRE Product Companies .pdf
Sample Devops SRE  Product Companies .pdfSample Devops SRE  Product Companies .pdf
Sample Devops SRE Product Companies .pdf
Vineet
 
Senior Engineering Sample EM DOE - Sheet1.pdf
Senior Engineering Sample EM DOE  - Sheet1.pdfSenior Engineering Sample EM DOE  - Sheet1.pdf
Senior Engineering Sample EM DOE - Sheet1.pdf
Vineet
 
社内勉強会資料_Hallucination of LLMs               .
社内勉強会資料_Hallucination of LLMs               .社内勉強会資料_Hallucination of LLMs               .
社内勉強会資料_Hallucination of LLMs               .
NABLAS株式会社
 
06-20-2024-AI Camp Meetup-Unstructured Data and Vector Databases
06-20-2024-AI Camp Meetup-Unstructured Data and Vector Databases06-20-2024-AI Camp Meetup-Unstructured Data and Vector Databases
06-20-2024-AI Camp Meetup-Unstructured Data and Vector Databases
Timothy Spann
 
一比一原版多伦多大学毕业证(UofT毕业证书)学历如何办理
一比一原版多伦多大学毕业证(UofT毕业证书)学历如何办理一比一原版多伦多大学毕业证(UofT毕业证书)学历如何办理
一比一原版多伦多大学毕业证(UofT毕业证书)学历如何办理
eoxhsaa
 
一比一原版莱斯大学毕业证(rice毕业证)如何办理
一比一原版莱斯大学毕业证(rice毕业证)如何办理一比一原版莱斯大学毕业证(rice毕业证)如何办理
一比一原版莱斯大学毕业证(rice毕业证)如何办理
zsafxbf
 
一比一原版美国帕森斯设计学院毕业证(parsons毕业证书)如何办理
一比一原版美国帕森斯设计学院毕业证(parsons毕业证书)如何办理一比一原版美国帕森斯设计学院毕业证(parsons毕业证书)如何办理
一比一原版美国帕森斯设计学院毕业证(parsons毕业证书)如何办理
asyed10
 
Module 1 ppt BIG DATA ANALYTICS NOTES FOR MCA
Module 1 ppt BIG DATA ANALYTICS NOTES FOR MCAModule 1 ppt BIG DATA ANALYTICS NOTES FOR MCA
Module 1 ppt BIG DATA ANALYTICS NOTES FOR MCA
yuvarajkumar334
 
一比一原版英属哥伦比亚大学毕业证(UBC毕业证书)学历如何办理
一比一原版英属哥伦比亚大学毕业证(UBC毕业证书)学历如何办理一比一原版英属哥伦比亚大学毕业证(UBC毕业证书)学历如何办理
一比一原版英属哥伦比亚大学毕业证(UBC毕业证书)学历如何办理
z6osjkqvd
 
Bangalore ℂall Girl 000000 Bangalore Escorts Service
Bangalore ℂall Girl 000000 Bangalore Escorts ServiceBangalore ℂall Girl 000000 Bangalore Escorts Service
Bangalore ℂall Girl 000000 Bangalore Escorts Service
nhero3888
 
Salesforce AI + Data Community Tour Slides - Canarias
Salesforce AI + Data Community Tour Slides - CanariasSalesforce AI + Data Community Tour Slides - Canarias
Salesforce AI + Data Community Tour Slides - Canarias
davidpietrzykowski1
 
A gentle exploration of Retrieval Augmented Generation
A gentle exploration of Retrieval Augmented GenerationA gentle exploration of Retrieval Augmented Generation
A gentle exploration of Retrieval Augmented Generation
dataschool1
 
一比一原版英国赫特福德大学毕业证(hertfordshire毕业证书)如何办理
一比一原版英国赫特福德大学毕业证(hertfordshire毕业证书)如何办理一比一原版英国赫特福德大学毕业证(hertfordshire毕业证书)如何办理
一比一原版英国赫特福德大学毕业证(hertfordshire毕业证书)如何办理
nyvan3
 
How To Control IO Usage using Resource Manager
How To Control IO Usage using Resource ManagerHow To Control IO Usage using Resource Manager
How To Control IO Usage using Resource Manager
Alireza Kamrani
 
一比一原版(lbs毕业证书)伦敦商学院毕业证如何办理
一比一原版(lbs毕业证书)伦敦商学院毕业证如何办理一比一原版(lbs毕业证书)伦敦商学院毕业证如何办理
一比一原版(lbs毕业证书)伦敦商学院毕业证如何办理
ywqeos
 
[VCOSA] Monthly Report - Cotton & Yarn Statistics May 2024
[VCOSA] Monthly Report - Cotton & Yarn Statistics May 2024[VCOSA] Monthly Report - Cotton & Yarn Statistics May 2024
[VCOSA] Monthly Report - Cotton & Yarn Statistics May 2024
Vietnam Cotton & Spinning Association
 

Recently uploaded (20)

一比一原版(UofT毕业证)多伦多大学毕业证如何办理
一比一原版(UofT毕业证)多伦多大学毕业证如何办理一比一原版(UofT毕业证)多伦多大学毕业证如何办理
一比一原版(UofT毕业证)多伦多大学毕业证如何办理
 
一比一原版爱尔兰都柏林大学毕业证(本硕)ucd学位证书如何办理
一比一原版爱尔兰都柏林大学毕业证(本硕)ucd学位证书如何办理一比一原版爱尔兰都柏林大学毕业证(本硕)ucd学位证书如何办理
一比一原版爱尔兰都柏林大学毕业证(本硕)ucd学位证书如何办理
 
一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理
 
Sid Sigma educational and problem solving power point- Six Sigma.ppt
Sid Sigma educational and problem solving power point- Six Sigma.pptSid Sigma educational and problem solving power point- Six Sigma.ppt
Sid Sigma educational and problem solving power point- Six Sigma.ppt
 
Sample Devops SRE Product Companies .pdf
Sample Devops SRE  Product Companies .pdfSample Devops SRE  Product Companies .pdf
Sample Devops SRE Product Companies .pdf
 
Senior Engineering Sample EM DOE - Sheet1.pdf
Senior Engineering Sample EM DOE  - Sheet1.pdfSenior Engineering Sample EM DOE  - Sheet1.pdf
Senior Engineering Sample EM DOE - Sheet1.pdf
 
社内勉強会資料_Hallucination of LLMs               .
社内勉強会資料_Hallucination of LLMs               .社内勉強会資料_Hallucination of LLMs               .
社内勉強会資料_Hallucination of LLMs               .
 
06-20-2024-AI Camp Meetup-Unstructured Data and Vector Databases
06-20-2024-AI Camp Meetup-Unstructured Data and Vector Databases06-20-2024-AI Camp Meetup-Unstructured Data and Vector Databases
06-20-2024-AI Camp Meetup-Unstructured Data and Vector Databases
 
一比一原版多伦多大学毕业证(UofT毕业证书)学历如何办理
一比一原版多伦多大学毕业证(UofT毕业证书)学历如何办理一比一原版多伦多大学毕业证(UofT毕业证书)学历如何办理
一比一原版多伦多大学毕业证(UofT毕业证书)学历如何办理
 
一比一原版莱斯大学毕业证(rice毕业证)如何办理
一比一原版莱斯大学毕业证(rice毕业证)如何办理一比一原版莱斯大学毕业证(rice毕业证)如何办理
一比一原版莱斯大学毕业证(rice毕业证)如何办理
 
一比一原版美国帕森斯设计学院毕业证(parsons毕业证书)如何办理
一比一原版美国帕森斯设计学院毕业证(parsons毕业证书)如何办理一比一原版美国帕森斯设计学院毕业证(parsons毕业证书)如何办理
一比一原版美国帕森斯设计学院毕业证(parsons毕业证书)如何办理
 
Module 1 ppt BIG DATA ANALYTICS NOTES FOR MCA
Module 1 ppt BIG DATA ANALYTICS NOTES FOR MCAModule 1 ppt BIG DATA ANALYTICS NOTES FOR MCA
Module 1 ppt BIG DATA ANALYTICS NOTES FOR MCA
 
一比一原版英属哥伦比亚大学毕业证(UBC毕业证书)学历如何办理
一比一原版英属哥伦比亚大学毕业证(UBC毕业证书)学历如何办理一比一原版英属哥伦比亚大学毕业证(UBC毕业证书)学历如何办理
一比一原版英属哥伦比亚大学毕业证(UBC毕业证书)学历如何办理
 
Bangalore ℂall Girl 000000 Bangalore Escorts Service
Bangalore ℂall Girl 000000 Bangalore Escorts ServiceBangalore ℂall Girl 000000 Bangalore Escorts Service
Bangalore ℂall Girl 000000 Bangalore Escorts Service
 
Salesforce AI + Data Community Tour Slides - Canarias
Salesforce AI + Data Community Tour Slides - CanariasSalesforce AI + Data Community Tour Slides - Canarias
Salesforce AI + Data Community Tour Slides - Canarias
 
A gentle exploration of Retrieval Augmented Generation
A gentle exploration of Retrieval Augmented GenerationA gentle exploration of Retrieval Augmented Generation
A gentle exploration of Retrieval Augmented Generation
 
一比一原版英国赫特福德大学毕业证(hertfordshire毕业证书)如何办理
一比一原版英国赫特福德大学毕业证(hertfordshire毕业证书)如何办理一比一原版英国赫特福德大学毕业证(hertfordshire毕业证书)如何办理
一比一原版英国赫特福德大学毕业证(hertfordshire毕业证书)如何办理
 
How To Control IO Usage using Resource Manager
How To Control IO Usage using Resource ManagerHow To Control IO Usage using Resource Manager
How To Control IO Usage using Resource Manager
 
一比一原版(lbs毕业证书)伦敦商学院毕业证如何办理
一比一原版(lbs毕业证书)伦敦商学院毕业证如何办理一比一原版(lbs毕业证书)伦敦商学院毕业证如何办理
一比一原版(lbs毕业证书)伦敦商学院毕业证如何办理
 
[VCOSA] Monthly Report - Cotton & Yarn Statistics May 2024
[VCOSA] Monthly Report - Cotton & Yarn Statistics May 2024[VCOSA] Monthly Report - Cotton & Yarn Statistics May 2024
[VCOSA] Monthly Report - Cotton & Yarn Statistics May 2024
 

Support Vector Machine without tears

  • 1. Digg Data Support Vector Machine Ankit Sharma www.diggdata.in without tears
  • 2. Digg Data Content SVM and its application Basic SVM •Hyperplane •Understanding of basics •Optimization Soft margin SVM Non-linear decision boundary SVMs in “loss + penalty” form Kernel method •Gaussian kernel SVM usage beyond classification Thursday, August 7, 2014 WITHOUT TEARS SERIES | www.diggdata.in 2
  • 3. Digg Data • In machine learning, support vector machines are supervised learning models with associated learning algorithms that analyze data and recognize patterns, used for classification and regression analysis. • Properties of SVM : Duality Kernels Margin Convexity Sparseness SVM : Support Vector Machine Thursday, August 7, 2014 WITHOUT TEARS SERIES | www.diggdata.in 3
  • 5. Digg Data Basic concept of SVM Find a linear decision surface (“hyperplane”) that can separate classes and has the largest distance (i.e., largest “gap” or “margin”) between border-line patients (i.e., “support vectors”) Thursday, August 7, 2014 WITHOUT TEARS SERIES | www.diggdata.in 5
  • 6. Digg Data Hyperplane as a Decision boundary • A hyperplane is a linear decision surface that splits the space into two parts; • It is obvious that a hyperplane is a binary classifier Thursday, August 7, 2014 WITHOUT TEARS SERIES | www.diggdata.in 6
  • 7. Digg Data Equation of a hyperplane An equation of a hyperplane is defined by a point (P0) and a perpendicular vector to the plane (𝑤) at that point. Thursday, August 7, 2014 WITHOUT TEARS SERIES | www.diggdata.in 7
  • 8. Digg Data • g(x) is a linear function: x1 x2 w x + b < 0 w x + b > 0  A hyper-plane in the feature space  (Unit-length) normal vector of the hyper-plane:  w n w n Understanding the basics Thursday, August 7, 2014 WITHOUT TEARS SERIES | www.diggdata.in 8
  • 9. Digg Data x1 x2How to classify these points using a linear discriminant function in order to minimize the error rate?  Infinite number of answers!  Which one is the best? Understanding the basics denotes +1 denotes -1Thursday, August 7, 2014 WITHOUT TEARS SERIES | www.diggdata.in 9
  • 10. Digg Data • The linear discriminant function (classifier) with the maximum margin is the best “safe zone”  Margin is defined as the width that the boundary could be increased by before hitting a data point  Why it is the best? Robust to outliners and thus strong generalization ability Margin x1 x2 Understanding the basics denotes +1 denotes -1Thursday, August 7, 2014 WITHOUT TEARS SERIES | www.diggdata.in 10
  • 11. Digg Data • Given a set of data points:  With a scale transformation on both w and b, the above is equivalent to x1 x2 {( , )}, 1,2, ,i iy i nx , where 𝑭𝒐𝒓 𝒚𝒊 = +𝟏, 𝑾 𝑿𝒊 + 𝒃> 0 𝑭𝒐𝒓 𝒚𝒊 = −𝟏, 𝑾 𝑿𝒊 + 𝒃 < 𝟎 𝑭𝒐𝒓 𝒚𝒊 = +𝟏, 𝑾 𝑿𝒊 + 𝒃> +1 𝑭𝒐𝒓 𝒚𝒊 = −𝟏, 𝑾 𝑿𝒊 + 𝒃 < -1 Understanding the basics denotes +1 denotes -1Thursday, August 7, 2014 WITHOUT TEARS SERIES | www.diggdata.in 11
  • 12. Digg Data • We know that  The margin width is: x1 x2 Margin x+ x+ x- ( ) 2 ( ) M            x x n w x x w w n Support Vectors 𝑾 𝑿+ + 𝒃 = +𝟏 𝑾 𝑿− + 𝒃 = −𝟏 Understanding the basics denotes +1 denotes -1Thursday, August 7, 2014 WITHOUT TEARS SERIES | www.diggdata.in 12
  • 13. Digg Data • Formulation: x1 x2 Margin x+ x+ x- n such that 2 maximize w 𝑭𝒐𝒓 𝒚𝒊 = +𝟏, 𝑾 𝑿𝒊 + 𝒃> +1 𝑭𝒐𝒓 𝒚𝒊 = −𝟏, 𝑾 𝑿𝒊 + 𝒃 < -1 Understanding the basics denotes +1 denotes -1Thursday, August 7, 2014 WITHOUT TEARS SERIES | www.diggdata.in 13
  • 14. Digg Data • Formulation: x1 x2 Margin x+ x+ x- n 21 minimize 2 w such that 𝑭𝒐𝒓 𝒚𝒊 = +𝟏, 𝑾 𝑿𝒊 + 𝒃> +1 𝑭𝒐𝒓 𝒚𝒊 = −𝟏, 𝑾 𝑿𝒊 + 𝒃 < -1 Understanding the basics denotes +1 denotes -1Thursday, August 7, 2014 WITHOUT TEARS SERIES | www.diggdata.in 14
  • 15. Digg Data • Formulation: x1 x2 Margin x+ x+ x- n 21 minimize 2 w such that 𝐲𝐢 𝐖 𝐗 + 𝐛 ≥ 𝟏 Understanding the basics denotes +1 denotes -1Thursday, August 7, 2014 WITHOUT TEARS SERIES | www.diggdata.in 15
  • 16. Digg Data Basics of optimization: Convex functions • A function is called convex if the function lies below the straight line segment connecting two points, for any two points in the interval. • Property: Any local minimum is a global minimum! Thursday, August 7, 2014 WITHOUT TEARS SERIES | www.diggdata.in 16
  • 17. Digg Data Basics of optimization: Quadratic programming • Quadratic programming (QP) is a special optimization problem: the function to optimize (“objective”) is quadratic, subject to linear constraints. • Convex QP problems have convex objective functions. • These problems can be solved easily and efficiently by greedy algorithms (because every local minimum is a global minimum). Thursday, August 7, 2014 WITHOUT TEARS SERIES | www.diggdata.in 17
  • 18. Digg Data SVM optimization problem: Primal formulation • This is called “primal formulation of linear SVMs” • It is a convex quadratic programming (QP) optimization problem with n variables (wi, i= 1,…,n), where n is the number of features in the dataset. Thursday, August 7, 2014 WITHOUT TEARS SERIES | www.diggdata.in 18
  • 19. Digg Data SVM optimization problem: Dual formulation • The previous problem can be recast in the so-called “dual form” giving rise to “dual formulation of linear SVMs”. • Apply the method of Lagrange multipliers. • We need to minimize this Lagrangian with respect to and simultaneously require that the derivative with respect to vanishes , all subject to the constraints that αi > 0 Thursday, August 7, 2014 WITHOUT TEARS SERIES | www.diggdata.in 19
  • 20. Digg Data SVM optimization problem: Dual formulation Cond… It is also a convex quadratic programming problem but with N variables (αi, i= 1,…,N), where N is the number of samples. Thursday, August 7, 2014 WITHOUT TEARS SERIES | www.diggdata.in 20
  • 21. Digg Data SVM optimization problem: Benefits of using dual formulation 1) No need to access original data, need to access only dot products. 2) Number of free parameters is bounded by the number of support vectors and not by the number of variables (beneficial for high-dimensional problems). Thursday, August 7, 2014 WITHOUT TEARS SERIES | www.diggdata.in 21
  • 22. Digg Data Non linearly separable data: “Soft-margin” linear SVM Assign a “slack variable” to each instance , ξi > 0 which can be thought of distance from the separating hyperplane if an instance is misclassified and 0 otherwise. Primal formulation: Dual formulation: • When C is very large, the soft-margin SVM is equivalent to hard-margin SVM; • When C is very small, we admit misclassifications in the training data at the expense of having w-vector with small norm; • C has to be selected for the distribution at hand as it will be discussed later in this tutorial. Thursday, August 7, 2014 WITHOUT TEARS SERIES | www.diggdata.in 22
  • 23. Digg Data SVMs in “loss + penalty” form • Many statistical learning algorithms (including SVMs) search for a decision function by solving the following optimization problem: Minimize (Loss+ λ Penalty) – Loss measures error of fitting the data – Penalty penalizes complexity of the learned function – λ is regularization parameter that balances Loss and Penalty • Overfitting → Poor generalization Can also be stated as Thursday, August 7, 2014 WITHOUT TEARS SERIES | www.diggdata.in 23
  • 24. Digg Data Nonlinear decision boundary Non Linear Decision Boundary Kernel method Thursday, August 7, 2014 WITHOUT TEARS SERIES | www.diggdata.in 24
  • 25. Digg Data Kernel method • Kernel methods involve – Nonlinear transformation of data to a higher dimensional feature space induced by a Mercer kernel – Detection of optimal linear solutions in the kernel feature space • Transformation to a higher dimensional space is expected to be helpful in conversion of nonlinear relations into linear relations (Cover’s theorem) – Nonlinearly separable patterns to linearly separable patterns – Nonlinear regression to linear regression – Nonlinear separation of clusters to linear separation of clusters • Pattern analysis methods are implemented in such a way that the kernel feature space representation is not explicitly required. They involve computation of pair-wise inner-products only. • The pair-wise inner-products are computed efficiently directly from the original representation of data using a kernel function (Kernel trick) Thursday, August 7, 2014 WITHOUT TEARS SERIES | www.diggdata.in 25
  • 26. Digg Data Kernel trick Not every function RN×RN -> R can be a valid kernel; it has to satisfy so-called Mercer conditions. Otherwise, the underlying quadratic program may not be solvable. Thursday, August 7, 2014 WITHOUT TEARS SERIES | www.diggdata.in 26
  • 27. Digg Data Popular kernels Thursday, August 7, 2014 WITHOUT TEARS SERIES | www.diggdata.in 27
  • 28. Digg Data Gaussian kernel Consider the Gaussian kernel: Geometrically, this is a “bump” or “cavity” centered at the training data point 𝑥j : The resulting mapping function is a combination of bumps and cavities. Thursday, August 7, 2014 WITHOUT TEARS SERIES | www.diggdata.in 28
  • 29. Digg Data SVM usage beyond classification Regression analysis (ε-Support vector regression) Anomaly detection (One-class SVM) Clustering analysis (Support Vector Domain Description) Thursday, August 7, 2014 WITHOUT TEARS SERIES | www.diggdata.in 29
  • 30. Digg Data Thank you Thursday, August 7, 2014 WITHOUT TEARS SERIES | www.diggdata.in 30