10/11/2021 Introduction to Data Mining, 2nd Edition 1
Data Mining
Support Vector Machines
Introduction to Data Mining, 2nd Edition
by
Tan, Steinbach, Karpatne, Kumar
Support Vector Machines
• Find a linear hyperplane (decision boundary) that will separate the data
Support Vector Machines
• One Possible Solution
[Figure: candidate decision boundary B1]
Support Vector Machines
• Another possible solution
[Figure: candidate decision boundary B2]
Support Vector Machines
• Other possible solutions
[Figure: other candidate decision boundaries, including B2]
Support Vector Machines
• Which one is better? B1 or B2?
• How do you define better?
[Figure: candidate decision boundaries B1 and B2]
Support Vector Machines
• Find the hyperplane that maximizes the margin => B1 is better than B2
[Figure: decision boundaries B1 (margin hyperplanes b11, b12) and B2 (margin hyperplanes b21, b22), with the margin marked]
Support Vector Machines
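[Figure: decision boundary B1 with margin hyperplanes b11 and b12]
Decision boundary B1: $\mathbf{w} \cdot \mathbf{x} + b = 0$
Margin hyperplanes (b11, b12): $\mathbf{w} \cdot \mathbf{x} + b = -1$ and $\mathbf{w} \cdot \mathbf{x} + b = +1$
$f(\mathbf{x}) = \begin{cases} 1 & \text{if } \mathbf{w} \cdot \mathbf{x} + b \ge 1 \\ -1 & \text{if } \mathbf{w} \cdot \mathbf{x} + b \le -1 \end{cases}$
$\text{Margin} = \dfrac{2}{\|\mathbf{w}\|}$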
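The margin formula can be checked with a short derivation. This is a sketch under stated assumptions: $\mathbf{x}_1$ is any point on the hyperplane $\mathbf{w} \cdot \mathbf{x} + b = +1$, $\mathbf{x}_2$ is any point on $\mathbf{w} \cdot \mathbf{x} + b = -1$, and $d$ is the distance between the two hyperplanes (all three names are introduced here for illustration).

```latex
\begin{align*}
\mathbf{w} \cdot \mathbf{x}_1 + b &= +1 \\
\mathbf{w} \cdot \mathbf{x}_2 + b &= -1 \\
\mathbf{w} \cdot (\mathbf{x}_1 - \mathbf{x}_2) &= 2
  && \text{(subtract the second line from the first)} \\
d = \frac{\mathbf{w}}{\|\mathbf{w}\|} \cdot (\mathbf{x}_1 - \mathbf{x}_2)
  &= \frac{2}{\|\mathbf{w}\|}
  && \text{(project onto the unit normal)}
\end{align*}
```

Because $\mathbf{w}$ is normal to both hyperplanes, projecting $\mathbf{x}_1 - \mathbf{x}_2$ onto $\mathbf{w} / \|\mathbf{w}\|$ gives their separation, so a smaller $\|\mathbf{w}\|$ means a wider margin.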
Linear SVM
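• Linear model: $f(\mathbf{x}) = \begin{cases} 1 & \text{if } \mathbf{w} \cdot \mathbf{x} + b \ge 1 \\ -1 & \text{if } \mathbf{w} \cdot \mathbf{x} + b \le -1 \end{cases}$
• Learning the model is equivalent to determining the values of $\mathbf{w}$ and $b$
– How to find $\mathbf{w}$ and $b$ from training data?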
Learning Linear SVM
• Objective is to maximize: $\text{Margin} = \dfrac{2}{\|\mathbf{w}\|}$
– Which is equivalent to minimizing: $L(\mathbf{w}) = \dfrac{\|\mathbf{w}\|^2}{2}$
– Subject to the following constraints: $y_i = \begin{cases} 1 & \text{if } \mathbf{w} \cdot \mathbf{x}_i + b \ge 1 \\ -1 & \text{if } \mathbf{w} \cdot \mathbf{x}_i + b \le -1 \end{cases}$
or $y_i(\mathbf{w} \cdot \mathbf{x}_i + b) \ge 1, \quad i = 1, 2, \ldots, N$
▪ This is a constrained optimization problem
– Solve it using the Lagrange multiplier method
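The Lagrange multiplier step can be sketched as follows. This is the standard hard-margin form rather than anything specific to these slides; the multipliers $\lambda_i \ge 0$ and the names $L_P$ (primal) and $L_D$ (dual) are notation assumed here.

```latex
\begin{align*}
L_P &= \frac{\|\mathbf{w}\|^2}{2}
      - \sum_{i=1}^{N} \lambda_i \left[ y_i(\mathbf{w} \cdot \mathbf{x}_i + b) - 1 \right] \\
\frac{\partial L_P}{\partial \mathbf{w}} = 0
    &\;\Rightarrow\; \mathbf{w} = \sum_{i=1}^{N} \lambda_i y_i \mathbf{x}_i \\
\frac{\partial L_P}{\partial b} = 0
    &\;\Rightarrow\; \sum_{i=1}^{N} \lambda_i y_i = 0 \\
L_D &= \sum_{i=1}^{N} \lambda_i
      - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N}
        \lambda_i \lambda_j y_i y_j \, \mathbf{x}_i \cdot \mathbf{x}_j \\
\lambda_i \left[ y_i(\mathbf{w} \cdot \mathbf{x}_i + b) - 1 \right] &= 0
    \quad \text{for all } i
\end{align*}
```

The last line (complementary slackness) implies that $\lambda_i$ can be nonzero only for records lying exactly on a margin hyperplane, which is why only those records determine $\mathbf{w}$ and $b$.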
Example of Linear SVM
x1      x2      y    λ
0.3858  0.4687   1   65.5261
0.4871  0.6110  -1   65.5261
0.9218  0.4103  -1   0
0.7382  0.8936  -1   0
0.1763  0.0579   1   0
0.4057  0.3529   1   0
0.9355  0.8132  -1   0
0.2146  0.0099   1   0
Support vectors: the first two records (the only ones with λ > 0)
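As a minimal sketch, the table can be turned into a classifier by applying w = Σ λ_i y_i x_i and solving y_i(w · x_i + b) = 1 at the support vectors. The helper names (`dot`, `classify`) and the choice to average b over the two support vectors are assumptions made for illustration, not part of the slides.

```python
import math

# Training records from the example table: (x1, x2, y, lambda).
records = [
    (0.3858, 0.4687, 1, 65.5261),
    (0.4871, 0.6110, -1, 65.5261),
    (0.9218, 0.4103, -1, 0.0),
    (0.7382, 0.8936, -1, 0.0),
    (0.1763, 0.0579, 1, 0.0),
    (0.4057, 0.3529, 1, 0.0),
    (0.9355, 0.8132, -1, 0.0),
    (0.2146, 0.0099, 1, 0.0),
]


def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


# Support vectors are the records with a nonzero multiplier.
support = [r for r in records if r[3] > 0]

# w = sum_i lambda_i * y_i * x_i; records with lambda = 0 contribute nothing.
w = (
    sum(lam * y * x1 for x1, x2, y, lam in support),
    sum(lam * y * x2 for x1, x2, y, lam in support),
)

# Each support vector satisfies y * (w . x + b) = 1, so b = y - w . x
# (y is +1 or -1). Average the per-vector estimates to smooth rounding error.
b_values = [y - dot(w, (x1, x2)) for x1, x2, y, lam in support]
b = sum(b_values) / len(b_values)

margin = 2.0 / math.sqrt(dot(w, w))


def classify(x):
    """Assign a record to class +1 or -1 by the side of the boundary it falls on."""
    return 1 if dot(w, x) + b >= 0 else -1


print("w =", [round(v, 2) for v in w], "b =", round(b, 2))
print("margin =", round(margin, 4))
print("training labels recovered:",
      all(classify((x1, x2)) == y for x1, x2, y, lam in records))
```

With the rounded values in the table this gives w ≈ (-6.64, -9.32) and b ≈ 7.93. Dropping any record whose λ is 0 leaves w and b unchanged, since those records never enter the sums.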
Learning Linear SVM
• Decision boundary depends only on support
vectors
– If you have a data set with the same support
vectors, the decision boundary will not change
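– How to classify using SVM once $\mathbf{w}$ and $b$ are found? Given a test record, $\mathbf{x}_i$:
$f(\mathbf{x}_i) = \begin{cases} 1 & \text{if } \mathbf{w} \cdot \mathbf{x}_i + b \ge 1 \\ -1 & \text{if } \mathbf{w} \cdot \mathbf{x}_i + b \le -1 \end{cases}$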
Support Vector Machines
• What if the problem is not linearly separable?
– Introduce slack variables $\xi_i$
▪ Need to minimize: $L(\mathbf{w}) = \dfrac{\|\mathbf{w}\|^2}{2} + C \left( \sum_{i=1}^{N} \xi_i^k \right)$
▪ Subject to: $y_i = \begin{cases} 1 & \text{if } \mathbf{w} \cdot \mathbf{x}_i + b \ge 1 - \xi_i \\ -1 & \text{if } \mathbf{w} \cdot \mathbf{x}_i + b \le -1 + \xi_i \end{cases}$
▪ If k is 1 or 2, this leads to a similar objective function as linear SVM but with different constraints (see textbook)
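To make the slack-variable objective concrete, here is a small sketch that evaluates it for a fixed hyperplane. The toy 1-D data, the hand-picked w and b, and the function names are all hypothetical; the code only evaluates L(w) = ||w||^2 / 2 + C Σ ξ_i^k and does not solve the optimization.

```python
def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def slack(w, b, x, y):
    """Smallest xi >= 0 that satisfies y * (w . x + b) >= 1 - xi."""
    return max(0.0, 1.0 - y * (dot(w, x) + b))


def soft_margin_objective(w, b, data, C=1.0, k=1):
    """Evaluate ||w||^2 / 2 + C * sum_i xi_i^k for a given hyperplane."""
    total_slack = sum(slack(w, b, x, y) ** k for x, y in data)
    return dot(w, w) / 2.0 + C * total_slack


# Hypothetical 1-D toy data: the last record sits on the wrong side of x = 0.
data = [((2.0,), 1), ((1.5,), 1), ((-2.0,), -1), ((0.5,), -1)]
w, b = (1.0,), 0.0  # hand-picked hyperplane, not a learned one

slacks = [slack(w, b, x, y) for x, y in data]
print("slacks:", slacks)
print("objective, C=1:", soft_margin_objective(w, b, data, C=1.0, k=1))
print("objective, C=10:", soft_margin_objective(w, b, data, C=10.0, k=1))
```

Records that satisfy the margin constraint get zero slack, so only the violating record contributes to the penalty. Raising C makes that violation more expensive relative to the margin term.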
Support Vector Machines
• Find the hyperplane that optimizes both factors: a wide margin and a small total slack (few training errors)
[Figure: decision boundaries B1 (margin hyperplanes b11, b12) and B2 (margin hyperplanes b21, b22), with the margin marked]
Nonlinear Support Vector Machines
• What if the decision boundary is not linear?
Nonlinear Support Vector Machines
• Transform data into a higher-dimensional space
Decision boundary: $\mathbf{w} \cdot \Phi(\mathbf{x}) + b = 0$
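A minimal sketch of the idea, assuming a hypothetical mapping Φ(x1, x2) = (x1², x2²) and toy data whose classes are split by the unit circle. The weights are chosen by hand to show that a circular boundary in the original space becomes a linear one after the transformation; nothing here is learned.

```python
def phi(x):
    """Hypothetical mapping (x1, x2) -> (x1^2, x2^2), chosen for illustration."""
    x1, x2 = x
    return (x1 * x1, x2 * x2)


# Toy data: class +1 lies inside the unit circle, class -1 outside it.
data = [
    ((0.0, 0.0), 1), ((0.5, -0.5), 1), ((-0.3, 0.6), 1),
    ((1.5, 0.0), -1), ((-1.0, 1.2), -1), ((0.9, -1.1), -1),
]

# The circle x1^2 + x2^2 = 1 is the line z1 + z2 = 1 in the transformed space,
# i.e. w . phi(x) + b = 0 with w = (-1, -1) and b = 1 (hand-picked).
w, b = (-1.0, -1.0), 1.0


def classify(x):
    """Apply the linear rule in the transformed space."""
    z = phi(x)
    return 1 if w[0] * z[0] + w[1] * z[1] + b >= 0 else -1


predictions = [classify(x) for x, _ in data]
print(predictions)
```

No straight line in the original (x1, x2) plane separates these two classes, but a single hyperplane does once the data is mapped through `phi`.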
Learning Nonlinear SVM
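• Optimization problem: minimize $\dfrac{\|\mathbf{w}\|^2}{2}$ subject to $y_i(\mathbf{w} \cdot \Phi(\mathbf{x}_i) + b) \ge 1, \quad i = 1, 2, \ldots, N$
• Which leads to the same set of equations (but involving $\Phi(\mathbf{x})$ instead of $\mathbf{x}$)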
Learning Nonlinear SVM
• Issues:
– What type of mapping function $\Phi$ should be used?
– How to do the computation in high-dimensional space?
▪ Most computations involve the dot product $\Phi(\mathbf{x}_i) \cdot \Phi(\mathbf{x}_j)$
▪ Curse of dimensionality?
Learning Nonlinear SVM
• Kernel Trick:
– $\Phi(\mathbf{x}_i) \cdot \Phi(\mathbf{x}_j) = K(\mathbf{x}_i, \mathbf{x}_j)$
– $K(\mathbf{x}_i, \mathbf{x}_j)$ is a kernel function (expressed in terms of the coordinates in the original space)
▪ Examples:
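The kernel trick can be checked numerically. The sketch below assumes the degree-2 polynomial kernel K(u, v) = (u · v + 1)², one commonly used kernel chosen here for illustration, together with an explicit 6-dimensional mapping Φ that reproduces it for 2-D inputs. The function names are hypothetical.

```python
import math


def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def poly2_kernel(u, v):
    """Degree-2 polynomial kernel, computed in the original 2-D space."""
    return (dot(u, v) + 1.0) ** 2


def phi(x):
    """Explicit feature map whose dot product equals poly2_kernel (2-D input)."""
    x1, x2 = x
    s = math.sqrt(2.0)
    return (x1 * x1, x2 * x2, s * x1 * x2, s * x1, s * x2, 1.0)


pairs = [
    ((0.3858, 0.4687), (0.4871, 0.6110)),
    ((1.0, -2.0), (0.5, 3.0)),
    ((0.0, 0.0), (4.0, -1.0)),
]

for u, v in pairs:
    in_original_space = poly2_kernel(u, v)
    in_feature_space = dot(phi(u), phi(v))
    print(round(in_original_space, 6), round(in_feature_space, 6))
```

Both columns agree up to floating-point rounding: the kernel needs one 2-D dot product, while the explicit route first builds two 6-D vectors. The gap widens quickly as the input dimension or polynomial degree grows.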
Example of Nonlinear SVM
[Figure: SVM with a degree-2 polynomial kernel]
Learning Nonlinear SVM
• Advantages of using a kernel:
– Don’t have to know the mapping function $\Phi$
– Computing the dot product $\Phi(\mathbf{x}_i) \cdot \Phi(\mathbf{x}_j)$ in the
original space avoids the curse of dimensionality
• Not all functions can be kernels
– Must make sure there is a corresponding $\Phi$ in
some high-dimensional space
– Mercer’s theorem (see textbook)
Characteristics of SVM
• The learning problem is formulated as a convex optimization problem
– Efficient algorithms are available to find the global minimum
– Many other methods use greedy approaches and find only locally
optimal solutions
– High computational complexity for building the model
• Robust to noise
• Overfitting is handled by maximizing the margin of the decision boundary
• SVM can handle irrelevant and redundant attributes better than many
other techniques
• The user needs to provide the type of kernel function and cost function
• Difficult to handle missing values
• What about categorical variables?
