CHAPTER 06
SUPPORT VECTOR MACHINES
CSC445: Neural Networks
Prof. Dr. Mostafa Gadal-Haqq M. Mostafa
Computer Science Department
Faculty of Computer & Information Sciences
AIN SHAMS UNIVERSITY
(some of the figures in this presentation are copyrighted to Pearson Education, Inc.)
ASU-CSC445: Neural Networks Prof. Dr. Mostafa Gadal-Haqq
Outline
• Introduction
• Optimal Hyperplane for Linearly Separable Patterns
• Quadratic Optimization for Finding the Optimal Hyperplane
• Optimal Hyperplane for Nonseparable Patterns
• Underlying Philosophy of SVM for Pattern Classification
• SVM Viewed as a Kernel Machine
• The XOR Problem
• Computer Experiment
Introduction
• The main idea of SVMs may be summed up as follows:
  • “Given a training sample, the SVM constructs a hyperplane as the decision surface in such a way that the margin of separation between positive and negative examples is maximized.”
Linearly Separable Patterns
• SVM is a binary learning machine.
• Binary classification is the task of separating classes in feature space.
Decision boundary: $w^T x + b = 0$
The two sides of the boundary: $w^T x + b < 0$ and $w^T x + b > 0$
Discriminant function: $g(x) = w^T x + b$
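The discriminant function can be turned into a classifier by taking its sign. The following is a minimal sketch, assuming plain Python lists for vectors; the hyperplane values `w` and `b` are hypothetical and chosen only for illustration.

```python
def g(w, b, x):
    """Discriminant g(x) = w^T x + b for plain Python sequences."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def classify(w, b, x):
    """Return +1 or -1 by the side of the hyperplane; 0 if x lies on it."""
    value = g(w, b, x)
    return (value > 0) - (value < 0)


# Hypothetical hyperplane x1 + x2 - 1 = 0 (not taken from the slides).
w, b = [1.0, 1.0], -1.0
print(classify(w, b, [2.0, 2.0]))   # point on the positive side
print(classify(w, b, [0.0, 0.0]))   # point on the negative side
```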
Linearly Separable Patterns
• Which of the linear separators is optimal?
Optimal Decision Boundary
• The optimal decision boundary is the one that maximizes the margin ρ.
[Figure: decision boundary with the distance r and the margin ρ marked.]
The Margin ρ
$x = x_P + r \frac{w}{\|w\|}$
where $x_P$ is the projection of $x$ onto the decision boundary and $r$ is the signed distance from $x$ to it.
The Margin ρ
$g(x) = w^T x + b$, with $x = x_P + r \frac{w}{\|w\|}$
$g(x) = w^T x_P + b + r \frac{w^T w}{\|w\|}$
Since $g(x_P) = w^T x_P + b = 0$, then $g(x) = r \|w\|$
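The relation $g(x) = r \|w\|$ can be checked numerically. The sketch below computes the signed distance $r = g(x) / \|w\|$ and the projection $x_P = x - r\, w / \|w\|$ for a hypothetical hyperplane; the helper names `signed_distance` and `project` are illustrative, not from the lecture.

```python
import math


def dot(u, v):
    return sum(a * c for a, c in zip(u, v))


def signed_distance(w, b, x):
    """r = g(x) / ||w||, the algebraic distance from x to the hyperplane."""
    return (dot(w, x) + b) / math.sqrt(dot(w, w))


def project(w, b, x):
    """x_P = x - r * w / ||w||, the foot of the perpendicular from x."""
    norm = math.sqrt(dot(w, w))
    r = signed_distance(w, b, x)
    return [xi - r * wi / norm for xi, wi in zip(x, w)]


# Hypothetical hyperplane with ||w|| = 5 and a test point with g(x) = 20.
w, b = [3.0, 4.0], -5.0
x = [3.0, 4.0]
r = signed_distance(w, b, x)
x_p = project(w, b, x)
print(r, x_p, dot(w, x_p) + b)   # g(x_P) should vanish up to rounding
```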
The Margin ρ
$g(x) = w^T x + b = \pm 1$ for $d = \pm 1$
$r = \frac{g(x)}{\|w\|} = \begin{cases} \frac{1}{\|w\|} & \text{if } d = +1 \\ -\frac{1}{\|w\|} & \text{if } d = -1 \end{cases}$
Then the margin is given as:
$\rho = 2r = \frac{2}{\|w\|}$
[Figure: the hyperplanes $w^T x + b = 1$, $w^T x + b = 0$, and $w^T x + b = -1$, with the distance r and the margin ρ marked.]
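The formula $\rho = 2 / \|w\|$ assumes the hyperplane is scaled so that the closest points satisfy $|w^T x + b| = 1$. A small sketch of that rescaling follows, assuming the given hyperplane already separates the data; the points and the helper name `canonical_form` are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * c for a, c in zip(u, v))


def canonical_form(w, b, points):
    """Rescale (w, b) so that the closest point satisfies |w^T x + b| = 1."""
    scale = min(abs(dot(w, x) + b) for x in points)
    return [wi / scale for wi in w], b / scale


def margin(w):
    """rho = 2 / ||w|| for a hyperplane in canonical form."""
    return 2.0 / math.sqrt(dot(w, w))


# Hypothetical separable data; the closest points sit at x1 = +2 and x1 = -2.
points = [[2.0, 0.0], [3.0, 1.0], [-2.0, 0.0], [-4.0, 1.0]]
w, b = canonical_form([5.0, 0.0], 0.0, points)
rho = margin(w)
print(w, b, rho)
```

Here the margin equals the gap between the two closest points, which is what the figure on this slide depicts.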
Optimal Decision Boundary
• Let $\{x_1, \dots, x_N\}$ be our data set and let $d_i \in \{+1, -1\}$ be the class label of $x_i$.
• The decision boundary should classify all points correctly.
• That is, we have a constrained optimization problem:
Maximize $\rho = 2r = \frac{2}{\|w\|}$, or equivalently minimize $\|w\|$,
subject to $d_i (w^T x_i + b) \ge 1$ for all $i$.
The Optimization Problem
• Introduce Lagrange multipliers $\alpha_i$.
• That is, the Lagrange function
$J(w, b, \alpha) = \frac{1}{2} \|w\|^2 - \sum_{i=1}^{N} \alpha_i \left[ d_i (w^T x_i + b) - 1 \right]$
is to be minimized with respect to w and b, i.e.,
$\frac{\partial J(w, b, \alpha)}{\partial w} = 0$ and $\frac{\partial J(w, b, \alpha)}{\partial b} = 0$
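Carrying out the two differentiations shows where the dual constraints come from. This is a short worked step that uses only the Lagrange function stated on this slide:

```latex
\begin{aligned}
\frac{\partial J}{\partial w} &= w - \sum_{i=1}^{N} \alpha_i d_i x_i = 0
  &&\Rightarrow\quad w = \sum_{i=1}^{N} \alpha_i d_i x_i \\
\frac{\partial J}{\partial b} &= -\sum_{i=1}^{N} \alpha_i d_i = 0
  &&\Rightarrow\quad \sum_{i=1}^{N} \alpha_i d_i = 0
\end{aligned}
```

Substituting the first result back into $J$ removes $w$ and $b$ and leaves a function of the multipliers alone.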
Solving the Optimization Problem
• We need to optimize a quadratic function subject to linear constraints.
• The solution involves constructing a dual problem in which a Lagrange multiplier αi is associated with every constraint in the primal problem:
Find $\alpha_1, \dots, \alpha_N$ such that
$Q(\alpha) = \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j d_i d_j x_i^T x_j$
is maximized, subject to
(1) $\sum_{i=1}^{N} \alpha_i d_i = 0$
(2) $\alpha_i \ge 0$ for all $i$
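To make the dual concrete, the sketch below evaluates $Q(\alpha)$ and checks the two constraints on a hypothetical two-point training set. It is not a general solver: with one sample per class, constraint (1) forces $\alpha_1 = \alpha_2$, so a one-dimensional scan is enough to locate the maximum. Practical implementations use a quadratic-programming routine instead.

```python
def dot(u, v):
    return sum(a * c for a, c in zip(u, v))


def dual_objective(alpha, X, d):
    """Q(alpha) = sum_i alpha_i - 1/2 sum_ij alpha_i alpha_j d_i d_j x_i^T x_j."""
    n = len(X)
    quad = sum(alpha[i] * alpha[j] * d[i] * d[j] * dot(X[i], X[j])
               for i in range(n) for j in range(n))
    return sum(alpha) - 0.5 * quad


def is_feasible(alpha, d, tol=1e-12):
    """Check constraints (1) sum_i alpha_i d_i = 0 and (2) alpha_i >= 0."""
    return abs(dot(alpha, d)) <= tol and all(a >= 0 for a in alpha)


# Hypothetical two-point training set, one sample per class.
X = [[1.0, 1.0], [-1.0, -1.0]]
d = [1.0, -1.0]

# Scan alpha_1 = alpha_2 = a over a coarse grid (illustration only).
candidates = [k / 100.0 for k in range(0, 101)]
best = max(candidates, key=lambda a: dual_objective([a, a], X, d))
print(best, dual_objective([best, best], X, d))
```

For this data $Q(a, a) = 2a - 4a^2$, so the scan should settle on $a = 0.25$.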
The Optimization Problem
• The solution has the form:
$w = \sum_{i=1}^{N} \alpha_i d_i x_i$ and $b = d_k - \sum_{i=1}^{N} \alpha_i d_i x_i^T x_k$ for any $x_k$ such that $\alpha_k \ne 0$
• Each non-zero αi indicates that the corresponding xi is a support vector.
• Then the classifying function will have the form:
$g(x) = \sum_{i=1}^{N} \alpha_i d_i x_i^T x + b$
• Notice that it relies on an inner product between the test point x and the support vectors xi.
• Also keep in mind that solving the optimization problem involved computing the inner products $x_i^T x_j$ between all training points!
The Optimization Problem
• Support vectors are samples that have non-zero α.
[Figure: Class 1 and Class 2 samples labeled with their multipliers: α1 = 0.8, α6 = 1.4, α8 = 0.6, and α2 = α3 = α4 = α5 = α7 = α9 = α10 = 0.]
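Given the multipliers, the weight vector, the bias, and the classifier follow directly from the solution formulas. The sketch below assumes a hypothetical three-point data set whose dual solution is taken as known (`alpha` is an assumed input, not computed here); the third point lies far from the boundary, so its multiplier is zero and it is not a support vector.

```python
def dot(u, v):
    return sum(a * c for a, c in zip(u, v))


def recover_w(alpha, X, d):
    """w = sum_i alpha_i d_i x_i."""
    dim = len(X[0])
    return [sum(alpha[i] * d[i] * X[i][k] for i in range(len(X)))
            for k in range(dim)]


def recover_b(alpha, X, d, tol=1e-12):
    """b = d_k - w^T x_k, using the first support vector (alpha_k > tol)."""
    w = recover_w(alpha, X, d)
    k = next(i for i, a in enumerate(alpha) if a > tol)
    return d[k] - dot(w, X[k])


def decision(alpha, X, d, b, x):
    """g(x) = sum_i alpha_i d_i x_i^T x + b; only support vectors contribute."""
    return sum(a * di * dot(xi, x)
               for a, di, xi in zip(alpha, d, X) if a != 0) + b


# Hypothetical data and an assumed dual solution for it.
X = [[1.0, 1.0], [-1.0, -1.0], [3.0, 3.0]]
d = [1.0, -1.0, 1.0]
alpha = [0.25, 0.25, 0.0]

w = recover_w(alpha, X, d)
b = recover_b(alpha, X, d)
support = [i for i, a in enumerate(alpha) if a > 0]
print(w, b, support, decision(alpha, X, d, b, [2.0, 0.0]))
```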
Optimal Hyperplane for Nonseparable Patterns
Figure 6.3 Soft margin hyperplane (a) Data point xi (belonging to class C1,
represented by a small square) falls inside the region of separation, but on the
correct side of the decision surface. (b) Data point xi (belonging to class C2,
represented by a small circle) falls on the wrong side of the decision surface.
Optimal Hyperplane for Nonseparable Patterns
• We allow an “error” ξi in classification.
[Figure: data points with their slack values ξi marked.]
Soft Margin Hyperplane
• The old formulation:
Find w and b such that
$\Phi(w) = \frac{1}{2} w^T w$ is minimized,
subject to $d_i (w^T x_i + b) \ge 1$ for all training pairs $(x_i, d_i)$.
• The new formulation, incorporating slack variables:
Find w and b such that
$\Phi(w, \xi) = \frac{1}{2} w^T w + C \sum_i \xi_i$ is minimized,
subject to $d_i (w^T x_i + b) \ge 1 - \xi_i$ and $\xi_i \ge 0$ for all $i$.
Parameter C can be viewed as a way to control overfitting.
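For a fixed hyperplane, the smallest slack that satisfies the relaxed constraint is $\xi_i = \max(0,\, 1 - d_i (w^T x_i + b))$. The sketch below computes the slacks and the soft-margin objective for hypothetical data; `w`, `b`, and the values of `C` are picked by hand for illustration, not optimized.

```python
def dot(u, v):
    return sum(a * c for a, c in zip(u, v))


def slacks(w, b, X, d):
    """Smallest xi_i >= 0 satisfying d_i (w^T x_i + b) >= 1 - xi_i."""
    return [max(0.0, 1.0 - di * (dot(w, xi) + b)) for xi, di in zip(X, d)]


def soft_margin_objective(w, b, X, d, C):
    """Phi(w, xi) = 1/2 w^T w + C * sum_i xi_i."""
    return 0.5 * dot(w, w) + C * sum(slacks(w, b, X, d))


# Hypothetical data: the second point falls inside the margin on the
# correct side, and the fourth falls on the wrong side of the boundary.
X = [[2.0, 0.0], [0.5, 0.0], [-1.0, 0.0], [0.5, 1.0]]
d = [1.0, 1.0, -1.0, -1.0]
w, b = [1.0, 0.0], 0.0

xi = slacks(w, b, X, d)
print(xi)
print(soft_margin_objective(w, b, X, d, C=1.0))
print(soft_margin_objective(w, b, X, d, C=10.0))
```

A slack between 0 and 1 corresponds to case (a) of Figure 6.3 and a slack above 1 to case (b). Raising C makes the same violations cost more.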
Soft Margin Hyperplane
• Again, xi with non-zero αi will be support vectors.
• Solution to the dual problem is:
$w = \sum_i \alpha_i d_i x_i$
and
$b = d_i (1 - \xi_i) - w^T x_i$ for any $i$ such that $\alpha_i \ne 0$
Extension to Non-linear Decision Boundary
• Key idea: transform xi to a higher dimensional space
• Input space: the space of xi
• Feature space: the “kernel” space of φ(xi)
[Figure: the mapping φ(·) carries each point from the input space to the feature space.]
Kernel Trick
• The linear classifier relies on the inner product between vectors:
$K(x_i, x_j) = x_i^T x_j$
If every data point is mapped into a high-dimensional space via some transformation Φ: x → φ(x), the inner product becomes:
$K(x_i, x_j) = \varphi(x_i)^T \varphi(x_j)$
• A kernel function is a function that corresponds to an inner product in some feature space.
• $K(x_i, x_j)$ needs to satisfy a technical condition (Mercer's condition) in order for φ(·) to exist.
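In code, the kernel trick amounts to replacing every inner product in the classifier with a call to a kernel function. The sketch below assumes the decision function $g(x) = \sum_i \alpha_i d_i K(x_i, x) + b$; the RBF kernel and its width `sigma` are one common choice used here for illustration, and the multipliers are hypothetical inputs rather than a solved dual.

```python
import math


def linear_kernel(u, v):
    """K(u, v) = u^T v."""
    return sum(a * c for a, c in zip(u, v))


def rbf_kernel(u, v, sigma=1.0):
    """exp(-||u - v||^2 / (2 sigma^2)); sigma is a user-chosen width."""
    sq = sum((a - c) ** 2 for a, c in zip(u, v))
    return math.exp(-sq / (2.0 * sigma ** 2))


def kernel_decision(alpha, X, d, b, x, kernel):
    """g(x) = sum_i alpha_i d_i K(x_i, x) + b."""
    return sum(a * di * kernel(xi, x) for a, di, xi in zip(alpha, d, X)) + b


# Hypothetical support vectors and multipliers (assumed, not solved for).
X = [[1.0, 1.0], [-1.0, -1.0]]
d = [1.0, -1.0]
alpha = [0.25, 0.25]

x_new = [2.0, 0.0]
print(kernel_decision(alpha, X, d, 0.0, x_new, linear_kernel))
print(kernel_decision(alpha, X, d, 0.0, x_new, rbf_kernel))
```

Only the `kernel` argument changes between the linear and the non-linear classifier; the feature map φ is never computed explicitly.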
Mercer’s Theorem
• The matrix $K = \{k(x_i, x_j)\}_{i,j=1}^{N}$ has to be non-negative definite (positive semidefinite); that is, it satisfies
$a^T K a \ge 0$ for any real vector $a$.
• Some kernel functions that satisfy Mercer’s condition:
[Table: kernel functions satisfying Mercer’s condition.]
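The condition $a^T K a \ge 0$ can be probed numerically. The sketch below builds the Gram matrix of a polynomial kernel on four hypothetical points and evaluates the quadratic form for random vectors $a$. This is only a sanity check on one data set, not a proof of Mercer's condition. For contrast it also tries the function $-u^T v$, which is not a valid kernel.

```python
import random


def poly_kernel(u, v, p=2):
    """(1 + u^T v)^p, a polynomial kernel."""
    return (1.0 + sum(a * c for a, c in zip(u, v))) ** p


def not_a_kernel(u, v):
    """-u^T v: its Gram matrix is negative semidefinite."""
    return -sum(a * c for a, c in zip(u, v))


def gram_matrix(X, kernel):
    return [[kernel(xi, xj) for xj in X] for xi in X]


def quadratic_form(K, a):
    """a^T K a."""
    n = len(a)
    return sum(a[i] * K[i][j] * a[j] for i in range(n) for j in range(n))


random.seed(0)
X = [[-1.0, -1.0], [-1.0, 1.0], [1.0, -1.0], [1.0, 1.0]]
trials = [[random.uniform(-1, 1) for _ in X] for _ in range(1000)]

K_good = gram_matrix(X, poly_kernel)
K_bad = gram_matrix(X, not_a_kernel)
smallest_good = min(quadratic_form(K_good, a) for a in trials)
smallest_bad = min(quadratic_form(K_bad, a) for a in trials)
print(smallest_good, smallest_bad)
```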
The SVM viewed as Kernel Machine
Figure 6.5 Architecture of support vector machine, using a
radial-basis function network.
The XOR Problem
• For the two-dimensional vectors $x = [x_1, x_2]^T$,
• define the following kernel:
$k(x, x_i) = (1 + x^T x_i)^2$
• We need to show that $K(x_i, x_j) = \varphi(x_i)^T \varphi(x_j)$:
$K(x_i, x_j) = (1 + x_i^T x_j)^2$
$= 1 + x_{i1}^2 x_{j1}^2 + 2 x_{i1} x_{j1} x_{i2} x_{j2} + x_{i2}^2 x_{j2}^2 + 2 x_{i1} x_{j1} + 2 x_{i2} x_{j2}$
$= [1, x_{i1}^2, \sqrt{2} x_{i1} x_{i2}, x_{i2}^2, \sqrt{2} x_{i1}, \sqrt{2} x_{i2}] \, [1, x_{j1}^2, \sqrt{2} x_{j1} x_{j2}, x_{j2}^2, \sqrt{2} x_{j1}, \sqrt{2} x_{j2}]^T$
$= \varphi(x_i)^T \varphi(x_j)$,
where
$\varphi(x) = [1, x_1^2, \sqrt{2} x_1 x_2, x_2^2, \sqrt{2} x_1, \sqrt{2} x_2]^T$
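The identity $K(x_i, x_j) = \varphi(x_i)^T \varphi(x_j)$ is easy to confirm numerically. The sketch below compares both sides on the four XOR input points and reports the largest discrepancy, which should be at the level of floating-point rounding.

```python
import math


def kernel(u, v):
    """k(u, v) = (1 + u^T v)^2."""
    return (1.0 + u[0] * v[0] + u[1] * v[1]) ** 2


def phi(x):
    """phi(x) = [1, x1^2, sqrt(2) x1 x2, x2^2, sqrt(2) x1, sqrt(2) x2]."""
    s = math.sqrt(2.0)
    x1, x2 = x
    return [1.0, x1 * x1, s * x1 * x2, x2 * x2, s * x1, s * x2]


def feature_inner(u, v):
    """phi(u)^T phi(v), computed explicitly in the 6-dimensional space."""
    return sum(a * c for a, c in zip(phi(u), phi(v)))


xor_points = [(-1.0, -1.0), (-1.0, 1.0), (1.0, -1.0), (1.0, 1.0)]
max_gap = max(abs(kernel(u, v) - feature_inner(u, v))
              for u in xor_points for v in xor_points)
print(max_gap)
```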
The XOR Problem
• This yields the optimal hyperplane:
$-x_1 x_2 = 0$
Figure 6.6 (a) Polynomial machine for solving the XOR problem. (b) Induced images in the feature space due to the four data points of the XOR problem.
[Figure labels: the four input points (1, -1), (-1, 1), (-1, -1), and (1, 1); axis tick -1.0.]
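The hyperplane $-x_1 x_2 = 0$ can be reproduced from the feature map. The sketch below makes two assumptions that are not stated on the slides: the XOR labels are taken as $d = -1$ when the two inputs have the same sign and $d = +1$ otherwise, and the dual solution is taken as $\alpha_i = 1/8$ for all four points, with zero bias. Under those assumptions, $w = \sum_i \alpha_i d_i \varphi(x_i)$ has a single non-zero component and $w^T \varphi(x)$ reduces to $-x_1 x_2$.

```python
import math


def phi(x):
    """phi(x) = [1, x1^2, sqrt(2) x1 x2, x2^2, sqrt(2) x1, sqrt(2) x2]."""
    s = math.sqrt(2.0)
    x1, x2 = x
    return [1.0, x1 * x1, s * x1 * x2, x2 * x2, s * x1, s * x2]


# Assumed labels and assumed dual solution (see the note above).
X = [(-1.0, -1.0), (-1.0, 1.0), (1.0, -1.0), (1.0, 1.0)]
d = [-1.0, 1.0, 1.0, -1.0]
alpha = [0.125] * 4

# w = sum_i alpha_i d_i phi(x_i), built one feature-space column at a time.
w = [sum(a * di * f for a, di, f in zip(alpha, d, column))
     for column in zip(*(phi(x) for x in X))]


def g(x):
    """Decision function w^T phi(x); the bias is zero for this solution."""
    return sum(wi * fi for wi, fi in zip(w, phi(x)))


outputs = [g(x) for x in X]
print(w)
print(outputs)
```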
Conclusion
• SVM is a useful alternative to neural networks.
• Two key concepts of SVM: maximizing the margin and the kernel trick.
• Much active research is taking place in areas related to SVM.
• Many SVM implementations are available on the web for you to try on your data set!
Computer Experiment
Figure 6.7 Experiment on SVM for the double-moon of Fig. 1.8 with
distance d = –6.
Computer Experiment
Figure 6.8 Experiment on SVM for the double-moon of Fig. 1.8 with
distance d = –6.5.
Next Time
Principal Component Analysis (PCA)

More Related Content

What's hot

Neural network
Neural networkNeural network
Neural network
Ramesh Giri
 
Neural Networks: Radial Bases Functions (RBF)
Neural Networks: Radial Bases Functions (RBF)Neural Networks: Radial Bases Functions (RBF)
Neural Networks: Radial Bases Functions (RBF)
Mostafa G. M. Mostafa
 
Perceptron (neural network)
Perceptron (neural network)Perceptron (neural network)
Perceptron (neural network)
EdutechLearners
 
Neural Networks: Principal Component Analysis (PCA)
Neural Networks: Principal Component Analysis (PCA)Neural Networks: Principal Component Analysis (PCA)
Neural Networks: Principal Component Analysis (PCA)
Mostafa G. M. Mostafa
 
Chapter 9 morphological image processing
Chapter 9   morphological image processingChapter 9   morphological image processing
Chapter 9 morphological image processing
Ahmed Daoud
 
Neural Networks: Multilayer Perceptron
Neural Networks: Multilayer PerceptronNeural Networks: Multilayer Perceptron
Neural Networks: Multilayer Perceptron
Mostafa G. M. Mostafa
 
Back propagation
Back propagationBack propagation
Back propagation
Nagarajan
 
Self-organizing map
Self-organizing mapSelf-organizing map
Self-organizing map
Tarat Diloksawatdikul
 
Mc culloch pitts neuron
Mc culloch pitts neuronMc culloch pitts neuron
Hebbian Learning
Hebbian LearningHebbian Learning
Hebbian LearningESCOM
 
Activation function
Activation functionActivation function
Activation function
Astha Jain
 
Yolo
YoloYolo
Radial Basis Function
Radial Basis FunctionRadial Basis Function
Radial Basis Function
Madhawa Gunasekara
 
Digital Image Processing
Digital Image ProcessingDigital Image Processing
Digital Image Processing
lalithambiga kamaraj
 
Artificial neural network
Artificial neural networkArtificial neural network
Artificial neural network
Vajiheh Zoghiyan
 
Artificial Neural Networks Lect1: Introduction & neural computation
Artificial Neural Networks Lect1: Introduction & neural computationArtificial Neural Networks Lect1: Introduction & neural computation
Artificial Neural Networks Lect1: Introduction & neural computation
Mohammed Bennamoun
 
Cnn
CnnCnn
Object detection with deep learning
Object detection with deep learningObject detection with deep learning
Object detection with deep learning
Sushant Shrivastava
 
Wiener Filter
Wiener FilterWiener Filter
Wiener Filter
Akshat Ratanpal
 
Anfis (1)
Anfis (1)Anfis (1)
Anfis (1)
TarekBarhoum
 

What's hot (20)

Neural network
Neural networkNeural network
Neural network
 
Neural Networks: Radial Bases Functions (RBF)
Neural Networks: Radial Bases Functions (RBF)Neural Networks: Radial Bases Functions (RBF)
Neural Networks: Radial Bases Functions (RBF)
 
Perceptron (neural network)
Perceptron (neural network)Perceptron (neural network)
Perceptron (neural network)
 
Neural Networks: Principal Component Analysis (PCA)
Neural Networks: Principal Component Analysis (PCA)Neural Networks: Principal Component Analysis (PCA)
Neural Networks: Principal Component Analysis (PCA)
 
Chapter 9 morphological image processing
Chapter 9   morphological image processingChapter 9   morphological image processing
Chapter 9 morphological image processing
 
Neural Networks: Multilayer Perceptron
Neural Networks: Multilayer PerceptronNeural Networks: Multilayer Perceptron
Neural Networks: Multilayer Perceptron
 
Back propagation
Back propagationBack propagation
Back propagation
 
Self-organizing map
Self-organizing mapSelf-organizing map
Self-organizing map
 
Mc culloch pitts neuron
Mc culloch pitts neuronMc culloch pitts neuron
Mc culloch pitts neuron
 
Hebbian Learning
Hebbian LearningHebbian Learning
Hebbian Learning
 
Activation function
Activation functionActivation function
Activation function
 
Yolo
YoloYolo
Yolo
 
Radial Basis Function
Radial Basis FunctionRadial Basis Function
Radial Basis Function
 
Digital Image Processing
Digital Image ProcessingDigital Image Processing
Digital Image Processing
 
Artificial neural network
Artificial neural networkArtificial neural network
Artificial neural network
 
Artificial Neural Networks Lect1: Introduction & neural computation
Artificial Neural Networks Lect1: Introduction & neural computationArtificial Neural Networks Lect1: Introduction & neural computation
Artificial Neural Networks Lect1: Introduction & neural computation
 
Cnn
CnnCnn
Cnn
 
Object detection with deep learning
Object detection with deep learningObject detection with deep learning
Object detection with deep learning
 
Wiener Filter
Wiener FilterWiener Filter
Wiener Filter
 
Anfis (1)
Anfis (1)Anfis (1)
Anfis (1)
 

Viewers also liked

Csc446: Pattern Recognition
Csc446: Pattern Recognition Csc446: Pattern Recognition
Csc446: Pattern Recognition
Mostafa G. M. Mostafa
 
Neural Networks: Least Mean Square (LSM) Algorithm
Neural Networks: Least Mean Square (LSM) AlgorithmNeural Networks: Least Mean Square (LSM) Algorithm
Neural Networks: Least Mean Square (LSM) Algorithm
Mostafa G. M. Mostafa
 
Neural Networks: Self-Organizing Maps (SOM)
Neural Networks:  Self-Organizing Maps (SOM)Neural Networks:  Self-Organizing Maps (SOM)
Neural Networks: Self-Organizing Maps (SOM)
Mostafa G. M. Mostafa
 
Neural Networks: Rosenblatt's Perceptron
Neural Networks: Rosenblatt's PerceptronNeural Networks: Rosenblatt's Perceptron
Neural Networks: Rosenblatt's Perceptron
Mostafa G. M. Mostafa
 
Neural Networks: Model Building Through Linear Regression
Neural Networks: Model Building Through Linear RegressionNeural Networks: Model Building Through Linear Regression
Neural Networks: Model Building Through Linear Regression
Mostafa G. M. Mostafa
 
CSC446: Pattern Recognition (LN7)
CSC446: Pattern Recognition (LN7)CSC446: Pattern Recognition (LN7)
CSC446: Pattern Recognition (LN7)
Mostafa G. M. Mostafa
 
Digital Image Processing: Image Enhancement in the Frequency Domain
Digital Image Processing: Image Enhancement in the Frequency DomainDigital Image Processing: Image Enhancement in the Frequency Domain
Digital Image Processing: Image Enhancement in the Frequency Domain
Mostafa G. M. Mostafa
 
Digital Image Processing: Image Segmentation
Digital Image Processing: Image SegmentationDigital Image Processing: Image Segmentation
Digital Image Processing: Image Segmentation
Mostafa G. M. Mostafa
 
Csc446: Pattren Recognition (LN1)
Csc446: Pattren Recognition (LN1)Csc446: Pattren Recognition (LN1)
Csc446: Pattren Recognition (LN1)
Mostafa G. M. Mostafa
 
Csc446: Pattren Recognition
Csc446: Pattren RecognitionCsc446: Pattren Recognition
Csc446: Pattren Recognition
Mostafa G. M. Mostafa
 
Csc446: Pattren Recognition (LN2)
Csc446: Pattren Recognition (LN2)Csc446: Pattren Recognition (LN2)
Csc446: Pattren Recognition (LN2)
Mostafa G. M. Mostafa
 
CSC446: Pattern Recognition (LN3)
CSC446: Pattern Recognition (LN3)CSC446: Pattern Recognition (LN3)
CSC446: Pattern Recognition (LN3)
Mostafa G. M. Mostafa
 
CSC446: Pattern Recognition (LN4)
CSC446: Pattern Recognition (LN4)CSC446: Pattern Recognition (LN4)
CSC446: Pattern Recognition (LN4)
Mostafa G. M. Mostafa
 
Self Organizing Maps
Self Organizing MapsSelf Organizing Maps
Self Organizing Maps
Daksh Raj Chopra
 
CSC446: Pattern Recognition (LN8)
CSC446: Pattern Recognition (LN8)CSC446: Pattern Recognition (LN8)
CSC446: Pattern Recognition (LN8)
Mostafa G. M. Mostafa
 
Digital Image Processing: Image Restoration
Digital Image Processing: Image RestorationDigital Image Processing: Image Restoration
Digital Image Processing: Image Restoration
Mostafa G. M. Mostafa
 
Digital Image Processing: Image Enhancement in the Spatial Domain
Digital Image Processing: Image Enhancement in the Spatial DomainDigital Image Processing: Image Enhancement in the Spatial Domain
Digital Image Processing: Image Enhancement in the Spatial Domain
Mostafa G. M. Mostafa
 
Correspondence analysis(step by step)
Correspondence analysis(step by step)Correspondence analysis(step by step)
Correspondence analysis(step by step)
Nguyen Van Chuc
 
Sefl Organizing Map
Sefl Organizing MapSefl Organizing Map
Sefl Organizing Map
Nguyen Van Chuc
 
Neural networks Self Organizing Map by Engr. Edgar Carrillo II
Neural networks Self Organizing Map by Engr. Edgar Carrillo IINeural networks Self Organizing Map by Engr. Edgar Carrillo II
Neural networks Self Organizing Map by Engr. Edgar Carrillo II
Edgar Carrillo
 

Viewers also liked (20)

Csc446: Pattern Recognition
Csc446: Pattern Recognition Csc446: Pattern Recognition
Csc446: Pattern Recognition
 
Neural Networks: Least Mean Square (LSM) Algorithm
Neural Networks: Least Mean Square (LSM) AlgorithmNeural Networks: Least Mean Square (LSM) Algorithm
Neural Networks: Least Mean Square (LSM) Algorithm
 
Neural Networks: Self-Organizing Maps (SOM)
Neural Networks:  Self-Organizing Maps (SOM)Neural Networks:  Self-Organizing Maps (SOM)
Neural Networks: Self-Organizing Maps (SOM)
 
Neural Networks: Rosenblatt's Perceptron
Neural Networks: Rosenblatt's PerceptronNeural Networks: Rosenblatt's Perceptron
Neural Networks: Rosenblatt's Perceptron
 
Neural Networks: Model Building Through Linear Regression
Neural Networks: Model Building Through Linear RegressionNeural Networks: Model Building Through Linear Regression
Neural Networks: Model Building Through Linear Regression
 
CSC446: Pattern Recognition (LN7)
CSC446: Pattern Recognition (LN7)CSC446: Pattern Recognition (LN7)
CSC446: Pattern Recognition (LN7)
 
Digital Image Processing: Image Enhancement in the Frequency Domain
Digital Image Processing: Image Enhancement in the Frequency DomainDigital Image Processing: Image Enhancement in the Frequency Domain
Digital Image Processing: Image Enhancement in the Frequency Domain
 
Digital Image Processing: Image Segmentation
Digital Image Processing: Image SegmentationDigital Image Processing: Image Segmentation
Digital Image Processing: Image Segmentation
 
Csc446: Pattren Recognition (LN1)
Csc446: Pattren Recognition (LN1)Csc446: Pattren Recognition (LN1)
Csc446: Pattren Recognition (LN1)
 
Csc446: Pattren Recognition
Csc446: Pattren RecognitionCsc446: Pattren Recognition
Csc446: Pattren Recognition
 
Csc446: Pattren Recognition (LN2)
Csc446: Pattren Recognition (LN2)Csc446: Pattren Recognition (LN2)
Csc446: Pattren Recognition (LN2)
 
CSC446: Pattern Recognition (LN3)
CSC446: Pattern Recognition (LN3)CSC446: Pattern Recognition (LN3)
CSC446: Pattern Recognition (LN3)
 
CSC446: Pattern Recognition (LN4)
CSC446: Pattern Recognition (LN4)CSC446: Pattern Recognition (LN4)
CSC446: Pattern Recognition (LN4)
 
Self Organizing Maps
Self Organizing MapsSelf Organizing Maps
Self Organizing Maps
 
CSC446: Pattern Recognition (LN8)
CSC446: Pattern Recognition (LN8)CSC446: Pattern Recognition (LN8)
CSC446: Pattern Recognition (LN8)
 
Digital Image Processing: Image Restoration
Digital Image Processing: Image RestorationDigital Image Processing: Image Restoration
Digital Image Processing: Image Restoration
 
Digital Image Processing: Image Enhancement in the Spatial Domain
Digital Image Processing: Image Enhancement in the Spatial DomainDigital Image Processing: Image Enhancement in the Spatial Domain
Digital Image Processing: Image Enhancement in the Spatial Domain
 
Correspondence analysis(step by step)
Correspondence analysis(step by step)Correspondence analysis(step by step)
Correspondence analysis(step by step)
 
Sefl Organizing Map
Sefl Organizing MapSefl Organizing Map
Sefl Organizing Map
 
Neural networks Self Organizing Map by Engr. Edgar Carrillo II
Neural networks Self Organizing Map by Engr. Edgar Carrillo IINeural networks Self Organizing Map by Engr. Edgar Carrillo II
Neural networks Self Organizing Map by Engr. Edgar Carrillo II
 

Similar to Neural Networks: Support Vector machines

The reversible residual network
The reversible residual networkThe reversible residual network
The reversible residual network
ThyrixYang1
 
"Let us talk about output features! by Florence d’Alché-Buc, LTCI & Full Prof...
"Let us talk about output features! by Florence d’Alché-Buc, LTCI & Full Prof..."Let us talk about output features! by Florence d’Alché-Buc, LTCI & Full Prof...
"Let us talk about output features! by Florence d’Alché-Buc, LTCI & Full Prof...
Paris Women in Machine Learning and Data Science
 
The world of loss function
The world of loss functionThe world of loss function
The world of loss function
홍배 김
 
Approximate bounded-knowledge-extractionusing-type-i-fuzzy-logic
Approximate bounded-knowledge-extractionusing-type-i-fuzzy-logicApproximate bounded-knowledge-extractionusing-type-i-fuzzy-logic
Approximate bounded-knowledge-extractionusing-type-i-fuzzy-logicCemal Ardil
 
4.Support Vector Machines.ppt machine learning and development
4.Support Vector Machines.ppt machine learning and development4.Support Vector Machines.ppt machine learning and development
4.Support Vector Machines.ppt machine learning and development
PriyankaRamavath3
 
From RNN to neural networks for cyclic undirected graphs
From RNN to neural networks for cyclic undirected graphsFrom RNN to neural networks for cyclic undirected graphs
From RNN to neural networks for cyclic undirected graphs
tuxette
 
Recent Advances in Kernel-Based Graph Classification
Recent Advances in Kernel-Based Graph ClassificationRecent Advances in Kernel-Based Graph Classification
Recent Advances in Kernel-Based Graph Classification
Christopher Morris
 
Anomaly detection using deep one class classifier
Anomaly detection using deep one class classifierAnomaly detection using deep one class classifier
Anomaly detection using deep one class classifier
홍배 김
 
SimCLR: A Simple Framework for Contrastive Learning of Visual Representations
SimCLR: A Simple Framework for Contrastive Learning of Visual RepresentationsSimCLR: A Simple Framework for Contrastive Learning of Visual Representations
SimCLR: A Simple Framework for Contrastive Learning of Visual Representations
ynxm25hpxp
 
3D Brain Image Segmentation Model using Deep Learning and Hidden Markov Rando...
3D Brain Image Segmentation Model using Deep Learning and Hidden Markov Rando...3D Brain Image Segmentation Model using Deep Learning and Hidden Markov Rando...
3D Brain Image Segmentation Model using Deep Learning and Hidden Markov Rando...
EL-Hachemi Guerrout
 
⭐⭐⭐⭐⭐ Device Free Indoor Localization in the 28 GHz band based on machine lea...
⭐⭐⭐⭐⭐ Device Free Indoor Localization in the 28 GHz band based on machine lea...⭐⭐⭐⭐⭐ Device Free Indoor Localization in the 28 GHz band based on machine lea...
⭐⭐⭐⭐⭐ Device Free Indoor Localization in the 28 GHz band based on machine lea...
Victor Asanza
 
Multilayer Neuronal network hardware implementation
Multilayer Neuronal network hardware implementation Multilayer Neuronal network hardware implementation
Multilayer Neuronal network hardware implementation
Nabil Chouba
 
Constrained Support Vector Quantile Regression for Conditional Quantile Estim...
Constrained Support Vector Quantile Regression for Conditional Quantile Estim...Constrained Support Vector Quantile Regression for Conditional Quantile Estim...
Constrained Support Vector Quantile Regression for Conditional Quantile Estim...
Kostas Hatalis, PhD
 
Digit recognizer by convolutional neural network
Digit recognizer by convolutional neural networkDigit recognizer by convolutional neural network
Digit recognizer by convolutional neural network
Ding Li
 
Optimization Techniques.pdf
Optimization Techniques.pdfOptimization Techniques.pdf
Optimization Techniques.pdf
anandsimple
 
JAISTサマースクール2016「脳を知るための理論」講義04 Neural Networks and Neuroscience
JAISTサマースクール2016「脳を知るための理論」講義04 Neural Networks and Neuroscience JAISTサマースクール2016「脳を知るための理論」講義04 Neural Networks and Neuroscience
JAISTサマースクール2016「脳を知るための理論」講義04 Neural Networks and Neuroscience
hirokazutanaka
 
Fast Object Recognition from 3D Depth Data with Extreme Learning Machine
Fast Object Recognition from 3D Depth Data with Extreme Learning MachineFast Object Recognition from 3D Depth Data with Extreme Learning Machine
Fast Object Recognition from 3D Depth Data with Extreme Learning Machine
Soma Boubou
 

Similar to Neural Networks: Support Vector machines (20)

The reversible residual network
The reversible residual networkThe reversible residual network
The reversible residual network
 
1.pptx
1.pptx1.pptx
1.pptx
 
"Let us talk about output features! by Florence d’Alché-Buc, LTCI & Full Prof...
"Let us talk about output features! by Florence d’Alché-Buc, LTCI & Full Prof..."Let us talk about output features! by Florence d’Alché-Buc, LTCI & Full Prof...
"Let us talk about output features! by Florence d’Alché-Buc, LTCI & Full Prof...
 
The world of loss function
The world of loss functionThe world of loss function
The world of loss function
 
Approximate bounded-knowledge-extractionusing-type-i-fuzzy-logic
Approximate bounded-knowledge-extractionusing-type-i-fuzzy-logicApproximate bounded-knowledge-extractionusing-type-i-fuzzy-logic
Approximate bounded-knowledge-extractionusing-type-i-fuzzy-logic
 
4.Support Vector Machines.ppt machine learning and development
4.Support Vector Machines.ppt machine learning and development4.Support Vector Machines.ppt machine learning and development
4.Support Vector Machines.ppt machine learning and development
 
From RNN to neural networks for cyclic undirected graphs
From RNN to neural networks for cyclic undirected graphsFrom RNN to neural networks for cyclic undirected graphs
From RNN to neural networks for cyclic undirected graphs
 
Recent Advances in Kernel-Based Graph Classification
Recent Advances in Kernel-Based Graph ClassificationRecent Advances in Kernel-Based Graph Classification
Recent Advances in Kernel-Based Graph Classification
 
Anomaly detection using deep one class classifier
Anomaly detection using deep one class classifierAnomaly detection using deep one class classifier
Anomaly detection using deep one class classifier
 
SimCLR: A Simple Framework for Contrastive Learning of Visual Representations
SimCLR: A Simple Framework for Contrastive Learning of Visual RepresentationsSimCLR: A Simple Framework for Contrastive Learning of Visual Representations
SimCLR: A Simple Framework for Contrastive Learning of Visual Representations
 
3D Brain Image Segmentation Model using Deep Learning and Hidden Markov Rando...
3D Brain Image Segmentation Model using Deep Learning and Hidden Markov Rando...3D Brain Image Segmentation Model using Deep Learning and Hidden Markov Rando...
3D Brain Image Segmentation Model using Deep Learning and Hidden Markov Rando...
 
ann-ics320Part4.ppt
ann-ics320Part4.pptann-ics320Part4.ppt
ann-ics320Part4.ppt
 
ann-ics320Part4.ppt
ann-ics320Part4.pptann-ics320Part4.ppt
ann-ics320Part4.ppt
 
⭐⭐⭐⭐⭐ Device Free Indoor Localization in the 28 GHz band based on machine lea...
⭐⭐⭐⭐⭐ Device Free Indoor Localization in the 28 GHz band based on machine lea...⭐⭐⭐⭐⭐ Device Free Indoor Localization in the 28 GHz band based on machine lea...
⭐⭐⭐⭐⭐ Device Free Indoor Localization in the 28 GHz band based on machine lea...
 
Multilayer Neuronal network hardware implementation
Multilayer Neuronal network hardware implementation Multilayer Neuronal network hardware implementation
Multilayer Neuronal network hardware implementation
 
Constrained Support Vector Quantile Regression for Conditional Quantile Estim...
Constrained Support Vector Quantile Regression for Conditional Quantile Estim...Constrained Support Vector Quantile Regression for Conditional Quantile Estim...
Constrained Support Vector Quantile Regression for Conditional Quantile Estim...
 
Digit recognizer by convolutional neural network
Digit recognizer by convolutional neural networkDigit recognizer by convolutional neural network
Digit recognizer by convolutional neural network
 
Optimization Techniques.pdf
Optimization Techniques.pdfOptimization Techniques.pdf
Optimization Techniques.pdf
 
JAISTサマースクール2016「脳を知るための理論」講義04 Neural Networks and Neuroscience
JAISTサマースクール2016「脳を知るための理論」講義04 Neural Networks and Neuroscience JAISTサマースクール2016「脳を知るための理論」講義04 Neural Networks and Neuroscience
JAISTサマースクール2016「脳を知るための理論」講義04 Neural Networks and Neuroscience
 
Fast Object Recognition from 3D Depth Data with Extreme Learning Machine
Fast Object Recognition from 3D Depth Data with Extreme Learning MachineFast Object Recognition from 3D Depth Data with Extreme Learning Machine
Fast Object Recognition from 3D Depth Data with Extreme Learning Machine
 

Recently uploaded

Language Across the Curriculm LAC B.Ed.
Language Across the  Curriculm LAC B.Ed.Language Across the  Curriculm LAC B.Ed.
Language Across the Curriculm LAC B.Ed.
Atul Kumar Singh
 
Sha'Carri Richardson Presentation 202345
Sha'Carri Richardson Presentation 202345Sha'Carri Richardson Presentation 202345
Sha'Carri Richardson Presentation 202345
beazzy04
 
Unit 8 - Information and Communication Technology (Paper I).pdf
Unit 8 - Information and Communication Technology (Paper I).pdfUnit 8 - Information and Communication Technology (Paper I).pdf
Unit 8 - Information and Communication Technology (Paper I).pdf
Thiyagu K
 
Synthetic Fiber Construction in lab .pptx
Synthetic Fiber Construction in lab .pptxSynthetic Fiber Construction in lab .pptx
Synthetic Fiber Construction in lab .pptx
Pavel ( NSTU)
 
Additional Benefits for Employee Website.pdf
Additional Benefits for Employee Website.pdfAdditional Benefits for Employee Website.pdf
Additional Benefits for Employee Website.pdf
joachimlavalley1
 
How to Make a Field invisible in Odoo 17
How to Make a Field invisible in Odoo 17How to Make a Field invisible in Odoo 17
How to Make a Field invisible in Odoo 17
Celine George
 
Palestine last event orientationfvgnh .pptx
Palestine last event orientationfvgnh .pptxPalestine last event orientationfvgnh .pptx
Palestine last event orientationfvgnh .pptx
RaedMohamed3
 
2024.06.01 Introducing a competency framework for languag learning materials ...
2024.06.01 Introducing a competency framework for languag learning materials ...2024.06.01 Introducing a competency framework for languag learning materials ...
Neural Networks: Support Vector machines

  • 1. CHAPTER 06: SUPPORT VECTOR MACHINES. CSC445: Neural Networks. Prof. Dr. Mostafa Gadal-Haqq M. Mostafa, Computer Science Department, Faculty of Computer & Information Sciences, AIN SHAMS UNIVERSITY. (Some of the figures in this presentation are copyrighted to Pearson Education, Inc.)
  • 2. ASU-CSC445: Neural Networks, Prof. Dr. Mostafa Gadal-Haqq. Outline
    - Introduction
    - Optimal Hyperplane for Linearly Separable Patterns
    - Quadratic Optimization for Finding the Optimal Hyperplane
    - Optimal Hyperplane for Nonseparable Patterns
    - Underlying Philosophy of SVM for Pattern Classification
    - SVM Viewed as a Kernel Machine
    - The XOR Problem
    - Computer Experiment
  • 3. Introduction
    - The main idea of SVMs may be summed up as follows:
    - “Given a training sample, the SVM constructs a hyperplane as the decision surface in such a way that the margin of separation between positive and negative examples is maximized.”
  • 4. Linearly Separable Patterns
    - SVM is a binary learning machine.
    - Binary classification is the task of separating classes in feature space.
    - The decision function is $g(\mathbf{x}) = \mathbf{w}^T\mathbf{x} + b$. The hyperplane $\mathbf{w}^T\mathbf{x} + b = 0$ separates the region $\mathbf{w}^T\mathbf{x} + b > 0$ from the region $\mathbf{w}^T\mathbf{x} + b < 0$.
  • 5. Linearly Separable Patterns
    - Which of the linear separators is optimal?
  • 6. Optimal Decision Boundary
    - The optimal decision boundary is the one that maximizes the margin $\rho$.
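  • 7. The Margin $\rho$
    - Any point $\mathbf{x}$ can be written as $\mathbf{x} = \mathbf{x}_P + r\,\frac{\mathbf{w}}{\|\mathbf{w}\|}$, where $\mathbf{x}_P$ is the projection of $\mathbf{x}$ onto the hyperplane and $r$ is the signed distance from $\mathbf{x}$ to the hyperplane.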
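  • 8. The Margin $\rho$
    - With $g(\mathbf{x}) = \mathbf{w}^T\mathbf{x} + b$ and $\mathbf{x} = \mathbf{x}_P + r\,\frac{\mathbf{w}}{\|\mathbf{w}\|}$:
      $$g(\mathbf{x}) = \mathbf{w}^T\mathbf{x}_P + b + r\,\frac{\mathbf{w}^T\mathbf{w}}{\|\mathbf{w}\|}$$
    - Since $\mathbf{w}^T\mathbf{x}_P + b = 0$, then $g(\mathbf{x}) = r\,\|\mathbf{w}\|$.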
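  • 9. The Margin $\rho$
    - For the samples closest to the hyperplane, $g(\mathbf{x}) = \mathbf{w}^T\mathbf{x} + b = \pm 1$ for $d = \pm 1$, so
      $$r = \frac{g(\mathbf{x})}{\|\mathbf{w}\|} = \begin{cases} \dfrac{1}{\|\mathbf{w}\|} & \text{if } d = +1 \\[4pt] -\dfrac{1}{\|\mathbf{w}\|} & \text{if } d = -1 \end{cases}$$
    - Then the margin is given as $\rho = 2r = \frac{2}{\|\mathbf{w}\|}$.
    - Figure: the hyperplanes $\mathbf{w}^T\mathbf{x} + b = 1$, $\mathbf{w}^T\mathbf{x} + b = 0$, and $\mathbf{w}^T\mathbf{x} + b = -1$, with the distances $r$ and $\rho$ marked.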
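The margin formulas on slides 7 to 9 can be checked numerically. The sketch below is only an illustration: the weight vector, bias, and test point are made-up values, and the function names are hypothetical, not from the slides.

```python
import numpy as np

def signed_distance(w, b, x):
    """Signed distance r = g(x) / ||w|| from x to the hyperplane w^T x + b = 0."""
    return (np.dot(w, x) + b) / np.linalg.norm(w)

def margin(w):
    """Margin rho = 2 / ||w|| between the hyperplanes g(x) = +1 and g(x) = -1."""
    return 2.0 / np.linalg.norm(w)

# Made-up values for illustration: ||w|| = 5.
w = np.array([3.0, 4.0])
b = -5.0
x = np.array([3.0, 4.0])

r = signed_distance(w, b, x)   # g(x) = 9 + 16 - 5 = 20, so r = 20 / 5
rho = margin(w)                # 2 / 5
```

A larger $\|\mathbf{w}\|$ gives a smaller margin, which is why maximizing $\rho$ amounts to minimizing $\|\mathbf{w}\|$.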
  • 10. Optimal Decision Boundary
    - Let $\{\mathbf{x}_1, \dots, \mathbf{x}_N\}$ be our data set and let $d_i \in \{1, -1\}$ be the class label of $\mathbf{x}_i$.
    - The decision boundary should classify all points correctly.
    - That is, we have a constrained optimization problem: maximize $\rho = 2r = \frac{2}{\|\mathbf{w}\|}$, or equivalently minimize $\|\mathbf{w}\|$, subject to $d_i(\mathbf{w}^T\mathbf{x}_i + b) \ge 1$ for all $i$.
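  • 11. The Optimization Problem
    - Introduce Lagrange multipliers $\alpha_i$, one for each constraint.
    - That is, the Lagrange function
      $$J(\mathbf{w}, b, \alpha) = \frac{1}{2}\|\mathbf{w}\|^2 - \sum_{i=1}^{N} \alpha_i \left[ d_i(\mathbf{w}^T\mathbf{x}_i + b) - 1 \right]$$
      is to be minimized with respect to $\mathbf{w}$ and $b$, i.e., $\frac{\partial J(\mathbf{w}, b, \alpha)}{\partial \mathbf{w}} = \mathbf{0}$ and $\frac{\partial J(\mathbf{w}, b, \alpha)}{\partial b} = 0$.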
  • 12. Solving the Optimization Problem
    - Need to optimize a quadratic function subject to linear constraints.
    - The solution involves constructing a dual problem where a Lagrange multiplier $\alpha_i$ is associated with every constraint in the primal problem. Find $\alpha_1, \dots, \alpha_N$ such that
      $$Q(\alpha) = \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j d_i d_j \mathbf{x}_i^T \mathbf{x}_j$$
      is maximized, subject to (1) $\sum_{i=1}^{N} \alpha_i d_i = 0$ and (2) $\alpha_i \ge 0 \;\; \forall i$.
  • 13. The Optimization Problem
    - The solution has the form $\mathbf{w} = \sum_{i=1}^{N} \alpha_i d_i \mathbf{x}_i$ and $b = d_k - \sum_{i=1}^{N} \alpha_i d_i \mathbf{x}_i^T \mathbf{x}_k$ for any $k$ such that $\alpha_k \ne 0$.
    - Each non-zero $\alpha_i$ indicates that the corresponding $\mathbf{x}_i$ is a support vector.
    - Then the classifying function will have the form $g(\mathbf{x}) = \sum_{i=1}^{N} \alpha_i d_i \mathbf{x}_i^T \mathbf{x} + b$.
    - Notice that it relies on an inner product between the test point $\mathbf{x}$ and the support vectors $\mathbf{x}_i$.
    - Also keep in mind that solving the optimization problem involved computing the inner products $\mathbf{x}_i^T \mathbf{x}_j$ between all training points!
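To make the solution form on slide 13 concrete, here is a minimal sketch on a two-sample toy set. The data are an assumption chosen for illustration. With $\alpha_1 = \alpha_2 = \alpha$ (forced by $\sum_i \alpha_i d_i = 0$), the dual objective reduces to $Q = 2\alpha - 4\alpha^2$, which is maximized at $\alpha = 0.25$, so the multipliers are hard-coded rather than produced by a solver.

```python
import numpy as np

# Toy data (an assumption for illustration): one sample per class.
X = np.array([[1.0, 1.0], [-1.0, -1.0]])
d = np.array([1.0, -1.0])

# Multipliers that maximize Q(alpha) for this toy set; both samples are support vectors.
alpha = np.array([0.25, 0.25])

# w = sum_i alpha_i d_i x_i
w = (alpha * d) @ X

# b = d_k - w^T x_k for any support vector k (alpha_k != 0)
k = int(np.flatnonzero(alpha > 1e-12)[0])
b = d[k] - w @ X[k]

def g(x):
    """Classifying function g(x) = sum_i alpha_i d_i x_i^T x + b."""
    return float(np.sum(alpha * d * (X @ x)) + b)
```

Note that `g` only touches the training data through the inner products `X @ x`, which is the property the kernel trick later exploits.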
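  • 14. The Optimization Problem
    - Support vectors are samples that have non-zero $\alpha$.
    - Figure (samples of Class 1 and Class 2 labeled with their multipliers): $\alpha_1 = 0.8$, $\alpha_2 = 0$, $\alpha_3 = 0$, $\alpha_4 = 0$, $\alpha_5 = 0$, $\alpha_6 = 1.4$, $\alpha_7 = 0$, $\alpha_8 = 0.6$, $\alpha_9 = 0$, $\alpha_{10} = 0$.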
  • 15. Optimal Hyperplane for Nonseparable Patterns
    - Figure 6.3 Soft margin hyperplane. (a) Data point $\mathbf{x}_i$ (belonging to class C1, represented by a small square) falls inside the region of separation, but on the correct side of the decision surface. (b) Data point $\mathbf{x}_i$ (belonging to class C2, represented by a small circle) falls on the wrong side of the decision surface.
  • 16. Optimal Hyperplane for Nonseparable Patterns
    - We allow an “error” $\xi_i$ in classification.
  • 17. Soft Margin Hyperplane
    - The old formulation: find $\mathbf{w}$ and $b$ such that $\Phi(\mathbf{w}) = \frac{1}{2}\mathbf{w}^T\mathbf{w}$ is minimized, subject to $d_i(\mathbf{w}^T\mathbf{x}_i + b) \ge 1$ for all $\{(\mathbf{x}_i, d_i)\}$.
    - The new formulation incorporating slack variables: find $\mathbf{w}$ and $b$ such that $\Phi(\mathbf{w}) = \frac{1}{2}\mathbf{w}^T\mathbf{w} + C\sum_i \xi_i$ is minimized, subject to $d_i(\mathbf{w}^T\mathbf{x}_i + b) \ge 1 - \xi_i$ and $\xi_i \ge 0$ for all $i$.
    - Parameter $C$ can be viewed as a way to control overfitting.
  • 18. Soft Margin Hyperplane
    - Again, $\mathbf{x}_i$ with non-zero $\alpha_i$ will be support vectors.
    - Solution to the dual problem is $\mathbf{w} = \sum_i \alpha_i d_i \mathbf{x}_i$ and $b = d_i(1 - \xi_i) - \mathbf{w}^T\mathbf{x}_i$ for any support vector $\mathbf{x}_i$.
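The role of the slack variables in the soft-margin objective can be illustrated with a small sketch. It assumes a fixed, made-up hyperplane and three made-up samples, and it uses the fact that for a given $(\mathbf{w}, b)$ the smallest feasible slack is $\xi_i = \max(0,\, 1 - d_i(\mathbf{w}^T\mathbf{x}_i + b))$. The function names are hypothetical.

```python
import numpy as np

def slack(w, b, X, d):
    """Smallest slack satisfying d_i (w^T x_i + b) >= 1 - xi_i and xi_i >= 0."""
    return np.maximum(0.0, 1.0 - d * (X @ w + b))

def soft_margin_objective(w, b, X, d, C):
    """Phi(w) = 0.5 * w^T w + C * sum_i xi_i."""
    return 0.5 * (w @ w) + C * slack(w, b, X, d).sum()

# Made-up hyperplane and samples for illustration.
w = np.array([1.0, 0.0])
b = 0.0
X = np.array([[2.0, 0.0],    # outside the margin, correct side: xi = 0
              [0.5, 0.0],    # inside the margin, correct side: 0 < xi <= 1
              [0.5, 1.0]])   # wrong side of the decision surface: xi > 1
d = np.array([1.0, 1.0, -1.0])

xi = slack(w, b, X, d)
objective = soft_margin_objective(w, b, X, d, C=1.0)
```

The second and third samples correspond to cases (a) and (b) of Figure 6.3. Increasing `C` penalizes such violations more heavily.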
  • 19. Extension to Non-linear Decision Boundary
    - Key idea: transform $\mathbf{x}_i$ to a higher dimensional space.
    - Input space: the space of $\mathbf{x}_i$.
    - Feature space: the “kernel” space of $\varphi(\mathbf{x}_i)$.
    - Figure: the mapping $\varphi(\cdot)$ takes each point from the input space to the feature space.
  • 20. Kernel Trick
    - The linear classifier relies on the inner product between vectors: $K(\mathbf{x}_i, \mathbf{x}_j) = \mathbf{x}_i^T \mathbf{x}_j$.
    - If every data point is mapped into a high-dimensional space via some transformation $\Phi: \mathbf{x} \to \varphi(\mathbf{x})$, the inner product becomes $K(\mathbf{x}_i, \mathbf{x}_j) = \varphi(\mathbf{x}_i)^T \varphi(\mathbf{x}_j)$.
    - A kernel function is a function that corresponds to an inner product in some feature space.
    - $K(\mathbf{x}_i, \mathbf{x}_j)$ needs to satisfy a technical condition (Mercer condition) in order for $\varphi(\cdot)$ to exist.
  • 21. Mercer’s Theorem
    - The matrix $\mathbf{K} = [k(\mathbf{x}_i, \mathbf{x}_j)]$, taken over all $i, j$, has to be non-negative definite (positive semidefinite); that is, it satisfies $\mathbf{a}^T \mathbf{K} \mathbf{a} \ge 0$ for every vector $\mathbf{a}$.
    - Some kernel functions satisfy Mercer’s condition (they are listed on the original slide).
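The non-negative-definite condition can be probed numerically on a finite sample. The sketch below is an illustration under assumptions: it uses the polynomial kernel $(1 + \mathbf{x}^T\mathbf{y})^2$ that appears later in the XOR example, random made-up points, and a small tolerance for floating-point round-off. Passing such a check on one sample is consistent with Mercer's condition but does not prove it.

```python
import numpy as np

def poly_kernel(x, y):
    """Polynomial kernel (1 + x^T y)^2."""
    return (1.0 + np.dot(x, y)) ** 2

def gram_matrix(X, kernel):
    """Matrix K with entries K[i, j] = kernel(x_i, x_j)."""
    n = len(X)
    K = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            K[i, j] = kernel(X[i], X[j])
    return K

# Random sample points (an assumption for illustration).
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))

K = gram_matrix(X, poly_kernel)
eigvals = np.linalg.eigvalsh(K)   # eigenvalues of the symmetric matrix K

# a^T K a >= 0 for all a is equivalent to all eigenvalues being >= 0;
# allow a small negative tolerance for round-off.
is_psd = bool(eigvals.min() >= -1e-8 * max(1.0, eigvals.max()))
```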
  • 22. The SVM Viewed as a Kernel Machine
    - Figure 6.5 Architecture of support vector machine, using a radial-basis function network.
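  • 23. The XOR Problem
    - For the two-dimensional vectors $\mathbf{x} = [x_1 \;\; x_2]^T$, define the following kernel: $k(\mathbf{x}, \mathbf{x}_i) = (1 + \mathbf{x}^T\mathbf{x}_i)^2$.
    - Need to show that $K(\mathbf{x}_i, \mathbf{x}_j) = \varphi(\mathbf{x}_i)^T \varphi(\mathbf{x}_j)$:
      $$K(\mathbf{x}_i, \mathbf{x}_j) = (1 + \mathbf{x}_i^T\mathbf{x}_j)^2 = 1 + x_{i1}^2 x_{j1}^2 + 2 x_{i1} x_{j1} x_{i2} x_{j2} + x_{i2}^2 x_{j2}^2 + 2 x_{i1} x_{j1} + 2 x_{i2} x_{j2}$$
      $$= [1 \;\; x_{i1}^2 \;\; \sqrt{2}\,x_{i1}x_{i2} \;\; x_{i2}^2 \;\; \sqrt{2}\,x_{i1} \;\; \sqrt{2}\,x_{i2}] \, [1 \;\; x_{j1}^2 \;\; \sqrt{2}\,x_{j1}x_{j2} \;\; x_{j2}^2 \;\; \sqrt{2}\,x_{j1} \;\; \sqrt{2}\,x_{j2}]^T = \varphi(\mathbf{x}_i)^T \varphi(\mathbf{x}_j),$$
      where $\varphi(\mathbf{x}) = [1 \;\; x_1^2 \;\; \sqrt{2}\,x_1 x_2 \;\; x_2^2 \;\; \sqrt{2}\,x_1 \;\; \sqrt{2}\,x_2]^T$.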
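The identity on slide 23 can be confirmed numerically. This is a small sketch with hypothetical function names `kernel` and `phi`; it evaluates both sides on every pair of the four XOR input points.

```python
import numpy as np

def kernel(x, y):
    """Polynomial kernel k(x, y) = (1 + x^T y)^2."""
    return (1.0 + np.dot(x, y)) ** 2

def phi(x):
    """Feature map phi(x) = [1, x1^2, sqrt(2) x1 x2, x2^2, sqrt(2) x1, sqrt(2) x2]."""
    x1, x2 = x
    s = np.sqrt(2.0)
    return np.array([1.0, x1 ** 2, s * x1 * x2, x2 ** 2, s * x1, s * x2])

xor_points = np.array([[-1.0, -1.0], [-1.0, 1.0], [1.0, -1.0], [1.0, 1.0]])

# Largest difference between k(a, b) and phi(a)^T phi(b) over all pairs.
max_gap = max(abs(kernel(a, b) - phi(a) @ phi(b))
              for a in xor_points for b in xor_points)
```

The kernel needs one inner product in two dimensions, while the explicit map needs one in six dimensions; both give the same number.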
  • 24. The XOR Problem
    - Solving the dual problem with this kernel gives the optimal hyperplane as $-x_1 x_2 = 0$.
    - Figure 6.6 (a) Polynomial machine for solving the XOR problem. (b) Induced images in the feature space due to the four data points of the XOR problem: (1, -1), (-1, 1), (-1, -1), (1, 1).
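The XOR result can be reproduced in a few lines. The sketch below makes two assumptions that are not stated on the slides: the labels are $d = -1$ for $(-1,-1)$ and $(1,1)$ and $d = +1$ for the other two points, and the dual is solved by setting the gradient of $Q(\alpha)$ to zero (a linear system) instead of calling a quadratic-programming solver. That shortcut is valid here only because the resulting multipliers happen to satisfy $\alpha_i \ge 0$ and $\sum_i \alpha_i d_i = 0$. No separate bias is added, since the kernel's feature map already contains a constant component.

```python
import numpy as np

# XOR inputs and assumed labels.
X = np.array([[-1.0, -1.0], [-1.0, 1.0], [1.0, -1.0], [1.0, 1.0]])
d = np.array([-1.0, 1.0, 1.0, -1.0])

# Kernel matrix K[i, j] = (1 + x_i^T x_j)^2.
K = (1.0 + X @ X.T) ** 2

# Stationary point of Q(alpha) = sum(alpha) - 0.5 * alpha^T H alpha,
# with H[i, j] = d_i d_j K[i, j], solves H alpha = 1.
H = np.outer(d, d) * K
alpha = np.linalg.solve(H, np.ones(4))

def g(x):
    """Decision function g(x) = sum_i alpha_i d_i (1 + x_i^T x)^2."""
    return float(np.sum(alpha * d * (1.0 + X @ x) ** 2))
```

All four multipliers come out equal, so every training point is a support vector, and `g(x)` agrees with $-x_1 x_2$ at any input.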
  • 25. Conclusion
    - SVM is a useful alternative to neural networks.
    - Two key concepts of SVM: maximizing the margin and the kernel trick.
    - Much active research is taking place in areas related to SVM.
    - Many SVM implementations are available on the web for you to try on your data set!
  • 26. Computer Experiment
    - Figure 6.7 Experiment on SVM for the double-moon of Fig. 1.8 with distance d = –6.
  • 27. Computer Experiment
    - Figure 6.8 Experiment on SVM for the double-moon of Fig. 1.8 with distance d = –6.5.