MACHINE LEARNING
SOFTMAX MARGIN
CLASSIFICATION
- JAYESH SUKDEO PATIL
Support Vector
Machines
• A Support Vector Machine (SVM) is
a very powerful and versatile
Machine Learning model, capable
of performing linear or nonlinear
classification, regression, and even
outlier detection. It is one of the
most popular models in Machine
Learning, and anyone interested in
Machine Learning should have it in
their toolbox. SVMs are particularly
well suited for classification of
complex but small- or medium-
sized datasets.
Concept of SVM in
three parts:
• Linear SVM – Hard Margin Classifier
• Linear SVM – Soft Margin Classifier
• Non-Linear SVM
Linear SVM vs. Non-Linear SVM
• Linear SVM: the data can be easily separated with a
straight line. Data is classified with the help of a
hyperplane.
• Non-Linear SVM: the data cannot be easily separated with a
straight line. We use kernels to turn non-separable data into
separable data.
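A minimal sketch of the idea behind kernels, assuming a made-up 1-D dataset and a hand-picked feature map x -> (x, x²); both are hypothetical and not taken from the slides. It only illustrates that data which no single threshold can separate may become linearly separable after a mapping. A real kernel SVM typically works with inner products in the mapped space rather than computing the mapping explicitly.

```python
# Hypothetical 1-D dataset: the positive class sits between the negatives,
# so no single threshold on x separates the two classes.
xs = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [-1, -1, 1, 1, 1, -1, -1]

def separable_by_threshold(values, labels):
    """Return True if some threshold on `values` puts each class on its own side."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == -1]
    return max(pos) < min(neg) or max(neg) < min(pos)

def feature_map(x):
    """Illustrative mapping into 2-D: keep x and add x squared."""
    return (x, x * x)

# In the original space a single threshold fails.
print(separable_by_threshold(xs, ys))        # False

# After mapping, the second coordinate alone (x squared) separates the classes,
# i.e. a horizontal line in the new 2-D space is a valid linear boundary.
squared = [feature_map(x)[1] for x in xs]
print(separable_by_threshold(squared, ys))   # True
```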
Support Vector Machine
• 1. Linear SVM – Hard Margin Classifier
• Here we will build our initial concept of SVM by
classifying a perfectly separated dataset (linear classification).
This is also called “Linear SVM – Hard Margin
Classifier”. We will define the objective function. This
tutorial is dedicated to the Hard Margin Classifier.
• 2. Linear SVM – Soft Margin Classifier
• We will extend our concept of the Hard Margin Classifier to
handle datasets that contain some outliers. In this case
not all of the data points can be separated using a straight line;
there will be some misclassified points. This is similar to
adding regularization to a regression model.
• 3. Non-Linear SVM
• Finally we will learn how to derive the Non-linear SVM using
kernels. I will probably have a separate tutorial on kernels
before this.
Maximal Margin Classifier
• A margin classifier is a classifier that can give, for each
example, an associated distance from the decision boundary.
• Hyperplane
• We can use a line to separate data that is in two
dimensions (that is, with 2 features, x1 and x2). Similarly, we need a
2D plane to separate data in 3 dimensions. To generalize the
concept to n dimensions of data, where n > 0, we call the
separating surface a hyperplane instead of a line or plane;
in n dimensions it has n − 1 dimensions.
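As a sketch of how a margin classifier can attach a distance to each example: a hyperplane can be written as w·x + b = 0, and the signed normal distance of a point x from it is (w·x + b) / ||w||. The names `w`, `b`, and `signed_distance`, and the example line below, are assumptions made for illustration; they do not appear on the slides.

```python
import math

def signed_distance(w, b, x):
    """Signed normal distance from point x to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm

# Hypothetical line in 2-D: 3*x1 + 4*x2 - 5 = 0
w, b = [3.0, 4.0], -5.0
print(signed_distance(w, b, [3.0, 4.0]))   # 4.0  (positive side)
print(signed_distance(w, b, [0.0, 0.0]))   # -1.0 (negative side)
```

The sign tells which side of the hyperplane the point falls on, and the magnitude is how far it is from the boundary.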
What is Margin?
• The margin can be defined as the minimum normal
(perpendicular) distance from the observations to a
given separating hyperplane. Let’s see how we can use the
margin to find the optimal hyperplane.
• What is classification margin?
• The classification margin is the difference between the
classification score for the true class and the maximal
classification score for the false classes. Computed for a
whole dataset, it is a column vector with one entry per
row of the data matrix X.
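Both definitions above can be sketched in a few lines. The function names, the example hyperplane, and the score rows are hypothetical and chosen only to illustrate the definitions; labels are assumed to be +1 / -1 for the first function and class indices for the second.

```python
import math

def dataset_margin(w, b, points, labels):
    """Smallest normal distance from any observation to the hyperplane w.x + b = 0.
    Labels are +1 / -1; a negative result means some point is misclassified."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return min(
        y * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
        for x, y in zip(points, labels)
    )

def classification_margins(scores, true_classes):
    """One margin per row: true-class score minus the best false-class score."""
    margins = []
    for row, k in zip(scores, true_classes):
        best_false = max(s for j, s in enumerate(row) if j != k)
        margins.append(row[k] - best_false)
    return margins

points = [[2.0, 2.0], [3.0, 3.0], [0.0, 0.0], [-1.0, 0.0]]
labels = [1, 1, -1, -1]
print(dataset_margin([1.0, 1.0], -2.0, points, labels))   # about 1.414 (sqrt(2))

scores = [[2.0, 5.0, -1.0], [1.0, 4.5, 2.0]]   # one row of class scores per example
print(classification_margins(scores, [0, 1]))  # [-3.0, 2.5]
```

A negative classification margin (first row) means the example is scored higher for a wrong class than for its true class.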
What is hard margin
SVM?
• A hard margin means that an
SVM is very rigid in
classification and tries to classify
every point in the training set correctly,
which can cause overfitting.
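A hard margin SVM is commonly stated as the optimization problem below. This is the standard textbook form rather than something shown on the slides; the symbols are assumptions: w is the weight vector, b the bias, x_i the i-th training example, y_i ∈ {−1, +1} its label, and m the number of examples.

```latex
\min_{\mathbf{w},\, b} \; \frac{1}{2}\lVert \mathbf{w} \rVert^{2}
\quad \text{subject to} \quad
y_i\,(\mathbf{w}^{\top}\mathbf{x}_i + b) \ge 1, \qquad i = 1, \dots, m
```

Under this scaling the width of the street is 2 / ||w||, so minimizing ||w|| maximizes the margin, and the constraints allow no training point inside the street or on the wrong side.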
What is soft margin
classification?
• Soft Margin Classifier
• The constraint of maximizing the margin of the line
that separates the classes must be relaxed. This is
often called the soft margin classifier. This change
allows some points in the training data to violate the
separating line.
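The relaxation is usually written with one slack variable per training point. The form below is the standard textbook one, not taken from the slides; the symbols are assumptions: w and b define the hyperplane, y_i ∈ {−1, +1} is the label of example x_i, ξ_i is the slack of example i, and C > 0 weights the violations.

```latex
\min_{\mathbf{w},\, b,\, \boldsymbol{\xi}} \;
\frac{1}{2}\lVert \mathbf{w} \rVert^{2} + C \sum_{i=1}^{m} \xi_i
\quad \text{subject to} \quad
y_i\,(\mathbf{w}^{\top}\mathbf{x}_i + b) \ge 1 - \xi_i, \quad
\xi_i \ge 0, \qquad i = 1, \dots, m
```

A point with ξ_i = 0 respects the margin, 0 < ξ_i ≤ 1 places it inside the street but on the correct side, and ξ_i > 1 means it is misclassified.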
What is soft and
hard margin
in SVM?
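• The difference between a hard
margin and a soft margin in
SVMs lies in the separability of the
data. If the data is linearly separable,
a hard margin can be used. If it is not,
some points must be allowed to violate
the margin; in this case, a soft margin SVM
is appropriate. Sometimes, the data
is linearly separable, but the margin
is so small that the model becomes
prone to overfitting or too
sensitive to outliers; a soft margin
can be preferable here as well.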
What & Why of SVM as Soft Margin Classifier?
• A linear classifier can be built using support vector machines. One
disadvantage of an SVM is that a classifier with NO regularization cost
is updated only until all the training points are classified correctly.
• SVM-based classifiers do not distinguish models based on how well or
confidently they classify the data. As a result, it is difficult to compare
the quality of two models. A softmax classifier is a better choice when
we are also concerned about the quality of classification.
• For example, both SVM models in the accompanying figure classify the data
accurately; however, the one on the right is preferred because it has a higher
margin. An SVM update rule without regularized weights will not be able
to pick out this difference. Worse, it is possible that with regularized
weights the SVM method chooses the classifier with a smaller margin.
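The contrast between the two losses can be sketched as follows. This uses the multiclass hinge loss and the softmax cross-entropy loss in the form given in the CS231n notes listed in the references; the score vectors and the margin `delta=1.0` are made-up illustration values.

```python
import math

def multiclass_hinge_loss(scores, true_class, delta=1.0):
    """Multiclass SVM (hinge) loss: sum of max(0, s_j - s_true + delta) over false classes."""
    s_true = scores[true_class]
    return sum(
        max(0.0, s - s_true + delta)
        for j, s in enumerate(scores)
        if j != true_class
    )

def softmax_cross_entropy(scores, true_class):
    """Softmax loss: negative log of the normalized probability of the true class."""
    shift = max(scores)  # subtract the max for numerical stability
    log_sum = math.log(sum(math.exp(s - shift) for s in scores))
    return -(scores[true_class] - shift - log_sum)

confident = [10.0, -2.0, 3.0]  # true class 0 wins by a wide gap
barely = [10.0, 9.0, 9.0]      # true class 0 wins by exactly the margin delta

for scores in (confident, barely):
    print(multiclass_hinge_loss(scores, 0), round(softmax_cross_entropy(scores, 0), 4))
```

Both score vectors give a hinge loss of zero, so the SVM loss cannot tell them apart, while the softmax loss is clearly lower for the confident scores (roughly 0.001 versus roughly 0.55).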
SOFTMAX
MARGIN
CLASSIFICATION
• A single outlier can shift the decision
boundary considerably, so that the margin
becomes very narrow.
• Even when a linear decision boundary
can classify most of the target classes properly,
the data may not be perfectly separable using a
straight line (no clear boundary).
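A small 1-D sketch of the outlier effect, under the simplifying assumption that the negative class lies to the left of the positive class. In one dimension the hard margin boundary is simply the midpoint of the gap between the classes. The data and the function name `hard_margin_1d` are hypothetical.

```python
def hard_margin_1d(points, labels):
    """Hard margin classifier for separable 1-D data with negatives left of positives.
    Returns (threshold, margin): the midpoint of the gap between classes and half its width."""
    right_most_neg = max(x for x, y in zip(points, labels) if y == -1)
    left_most_pos = min(x for x, y in zip(points, labels) if y == 1)
    if right_most_neg >= left_most_pos:
        raise ValueError("data is not separable by a single threshold")
    return (right_most_neg + left_most_pos) / 2, (left_most_pos - right_most_neg) / 2

points = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
labels = [-1, -1, -1, 1, 1, 1]
print(hard_margin_1d(points, labels))                 # (5.0, 2.0)

# One negative outlier close to the positive class
print(hard_margin_1d(points + [6.5], labels + [-1]))  # (6.75, 0.25)
```

One extra point moves the boundary from 5.0 to 6.75 and shrinks the margin from 2.0 to 0.25. If the outlier landed at 7.5 instead, the function would raise an error because no hard margin exists at all.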
• If we strictly impose that all instances be off the street and on the
right side, this is called hard margin classification. There are two
main issues with hard margin classification.
• First, it only works if the data is linearly separable, and second it is
quite sensitive to outliers. Figure 5-3 shows the iris dataset with just
one additional outlier: on the left, it is impossible to find a hard
margin, and on the right the decision boundary ends up very
different from the one we saw in Figure 5-1 without the outlier, and
it will probably not generalize as well.
• To avoid these issues it is preferable to use a more flexible model.
The objective is to find a good balance between keeping the street
as large as possible and limiting the margin violations (i.e., instances
that end up in the middle of the street or even on the wrong side).
This is called soft margin classification.
• In Scikit-Learn’s SVM classes, you can control this balance using
the C hyperparameter: a smaller C value leads to a wider street but
more margin violations. Figure 5-4 shows the decision boundaries
and margins of two soft margin SVM classifiers on a nonlinearly
separable dataset. On the left, using a high C value the classifier
makes fewer margin violations but ends up with a smaller margin.
• On the right, using a low C value the margin is much larger, but
many instances end up on the street. However, it seems likely that
the second classifier will generalize better: in fact even on this
training set it makes fewer prediction errors, since most of the
margin violations are actually on the correct side of the decision
boundary.
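The trade-off controlled by C can be illustrated without training anything, by scoring two fixed candidate classifiers with the usual soft margin objective 0.5·w² + C·Σ max(0, 1 − y(wx + b)). This is only a sketch: the 1-D data, the two candidates, and the helper names are made up, and a real solver (for example behind the `C` parameter of Scikit-Learn's `SVC` or `LinearSVC`) searches over all possible w and b instead of comparing two.

```python
def soft_margin_objective(w, b, points, labels, C):
    """0.5 * w^2 + C * sum of hinge slacks for a 1-D linear classifier f(x) = w*x + b."""
    slack = sum(max(0.0, 1.0 - y * (w * x + b)) for x, y in zip(points, labels))
    return 0.5 * w * w + C * slack

points = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0, 6.5]
labels = [-1, -1, -1, 1, 1, 1, -1]   # the last point is an outlier

# Each candidate is (w, b); in 1-D the margin is 1 / |w|.
candidates = {
    "wide street (1 violation)": (0.5, -2.5),      # boundary at x = 5, margin 2
    "narrow street (0 violations)": (4.0, -27.0),  # boundary at x = 6.75, margin 0.25
}

def preferred(C):
    """Name of the candidate with the lower soft margin objective for this C."""
    return min(
        candidates,
        key=lambda name: soft_margin_objective(*candidates[name], points, labels, C),
    )

print(preferred(0.1))    # wide street (1 violation)
print(preferred(100.0))  # narrow street (0 violations)
```

With a small C the violation is cheap, so the wide street wins; with a large C the violation dominates the objective and the narrow street that fits every point is preferred.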
Softmax classifier Formula
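The softmax classifier is commonly written as follows, in the form used by the CS231n notes listed in the references. The symbols are assumptions: s_j is the score the model assigns to class j for an input x, and y_i is the true class of training example i.

```latex
P(y = k \mid \mathbf{x}) = \frac{e^{s_k}}{\sum_{j} e^{s_j}},
\qquad
L_i = -\log\!\left( \frac{e^{s_{y_i}}}{\sum_{j} e^{s_j}} \right)
```

The first expression turns the class scores into probabilities that sum to 1; the second is the cross-entropy loss for example i, which keeps decreasing as the true class's probability approaches 1.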
References:
• https://www.cs.toronto.edu/~tang/papers/dlsvm.pdf
• https://vitalflux.com/svm-soft-margin-classifier-c-value-importance/
• https://www.youtube.com/watch?v=7vSGI9FCCaY
• https://cs231n.github.io/linear-classify/#softmax-classifier