UNIT 4
Part-1
RBFN
Introduction:
 A Radial Basis Function Network (RBFN) is a type of artificial neural network that
uses radial basis functions as activation functions.
 It is commonly applied to problems involving pattern recognition, function
approximation, and classification, by mapping input data to a higher-dimensional
space for better representation and decision-making.
 An RBFN uses supervised learning and is based on the curve-fitting problem in a
high-dimensional space. It is essentially a feed-forward network with radial basis
functions as activation functions.
 In an RBFN, the aim of the learning process is to find a surface in multi-dimensional
space that gives the best fit (in a statistical sense) to the given set of data.
 In the neural network context, the hidden nodes provide a set of functions, called
radial basis functions, that constitute an arbitrary basis for the input data when it is
expanded into the hidden space.
Radial Basis Function Network
The construction of a radial basis function network is entirely different from that of
single-layer and multi-layer feed-forward neural networks.
The input layer is made up of nodes that connect the RBFN to the outside
environment.
The only hidden layer of the network applies a non-linear transformation from the
input space to the hidden space (in most cases the hidden space is of very high dimension).
The last layer of the RBFN is linear and gives the response of the network to the outside
environment.
 RBFN provides a universal method for function approximation, and it is successfully
applied in regression modelling and pattern classification.
In short, an RBFN is an artificial neural network that uses RBFs as activation functions.
Components of Radial-basis Function Network
 Input Vector:
• The input consists of the n-dimensional attributes of the data to be classified. Each RBF neuron in the
network receives the input vector and processes it.
 RBF Nodes:
• Each RBF node stores one vector from the training dataset; this vector is known as the
centre.
• At each RBF node, the similarity (distance) between the input vector and the centre is computed, and an
output is generated in the range between 0 and 1.
• Euclidean distance may be used as the similarity measure. The node outputs a value of 1 if the input coincides
with the centre, and the output falls exponentially towards 0 as the distance between the centre and the input
increases.
• The output value of a neuron is called its activation value, and the response of an RBF neuron
has the shape of a bell curve.
Components of Radial-basis Function Network (continued)
 Output Nodes:
 Each node in the output layer corresponds to one of the classes into which the input needs to be
categorized.
 Every RBF node is connected to every output-layer node, and the connection links are associated
with weights.
 Each output node computes the weighted sum of the activation values received from the RBF
nodes.
 The input is finally assigned to the class with the highest weighted sum.
 RBF Neuron Activation Function:
 At every RBF node, the similarity value between the input vector and the centre is computed.
 Inputs that are closer to the centre result in a value closer to 1, and inputs that differ
more from the centre result in a value closer to 0.
 Several choices are available for the activation function; one of the most commonly
used is based on the Gaussian function.
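As a rough illustration of the components above (not taken from the slides; the centres, widths and weights below are made-up values for demonstration), a minimal Gaussian RBFN forward pass might look like this:

```python
import numpy as np

def rbf_activation(x, centre, sigma):
    # Gaussian RBF: equals 1 when x coincides with the centre,
    # and falls exponentially towards 0 as the distance grows.
    return np.exp(-np.linalg.norm(x - centre) ** 2 / (2 * sigma ** 2))

def rbfn_forward(x, centres, sigmas, weights):
    # Each output node takes a weighted sum of the hidden activations;
    # the input is assigned to the class with the highest sum.
    phi = np.array([rbf_activation(x, c, s) for c, s in zip(centres, sigmas)])
    return weights @ phi  # weights has shape (n_classes, n_hidden)

# Two hidden nodes (centres) and two output classes (illustrative values)
centres = [np.array([0.0, 0.0]), np.array([1.0, 1.0])]
sigmas = [0.5, 0.5]
weights = np.array([[1.0, 0.0],   # class 0 keyed to the first centre
                    [0.0, 1.0]])  # class 1 keyed to the second centre

scores = rbfn_forward(np.array([0.1, 0.1]), centres, sigmas, weights)
predicted = int(np.argmax(scores))  # input near (0, 0) -> class 0
```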
Radial Functions
 A function is known as an RBF if its output depends on the distance between the input vector (X) and a
vector stored at that node, called the centre (C).
 An important property of these radial basis functions is that they decrease monotonically with the distance
between the centre and the input vector.
 The parameters of a radial basis function include the distance scale, the centre and the precise shape of the
radial function.
 If the model is linear in its parameters, the values of all these parameters are fixed. In any RBFN there is one hidden layer whose
neurons have RBF activation functions that act as local receptors.
 The outputs of the hidden-layer neurons are combined linearly at the nodes of the output layer.
 A radial basis function is a real-valued function; the norm used is the Euclidean distance.
Radial Functions (continued)
 The radial distance between the input and the centre is r = ||X − C||.
 A Gaussian RBF decreases monotonically with distance from the centre. Radial basis functions are typically
used to build up function approximations of the form

y(X) = Σᵢ₌₁..ₘ Wᵢ ϕ(||X − Cᵢ||)        ... (1)

i.e., the approximating function y(X) is a weighted sum of m radial basis functions, each associated
with a different centre Cᵢ and weighted by a coefficient Wᵢ.
Types of Radial Basis Functions
 Many functions can be used in an RBFN. The most commonly used is the Gaussian function.
 The Gaussian function has the form

ϕ(r) = exp(−r² / (2σ²))

where r = ||X − C|| is the distance from the centre and σ > 0 is an arbitrary real constant (the width).
Few more radial basis functions: the multiquadric ϕ(r) = √(r² + σ²), the inverse multiquadric
ϕ(r) = 1/√(r² + σ²), and the thin-plate spline ϕ(r) = r² ln(r).
Architecture of Radial Basis Function Network
1. Structure of RBFN:
 An RBFN consists of three layers:
• Input layer: has n nodes, corresponding to the features of the input data.
• Hidden layer: contains m nodes with radial basis functions as activation functions. The number of
nodes depends on the complexity of the classification or regression task.
• Output layer: typically has one node (for single-output applications) but can include more for multi-output
scenarios.
2. Cover's Theorem:
 According to Cover's theorem on the separability of patterns, a complex pattern classification problem with
non-linearly separable data is more likely to be linearly separable in a high-dimensional space
than in a low-dimensional space.
 Radial basis function networks solve complex pattern classification problems by transforming them
into a higher-dimensional space in a non-linear fashion. Thus, when the transformation
from the input space to the hidden space is non-linear and the dimension of the hidden space is
higher than that of the input space, there is a high probability that a non-separable pattern
classification problem in the input space becomes linearly separable in the hidden space.
Let X = {X₁, X₂, ..., X_N} be a set of N patterns (the training dataset), where each Xᵢ is a vector of n attributes. The problem is to
train the neural network such that each pattern is assigned to one of two classes, C₁ and C₂. The Cᵢ (1 ≤ i ≤ m) are the centres
at the hidden layer.
Let the mapping ϕ: Rⁿ → Rᵐ be defined by the real-valued functions {ϕᵢ(X) | 1 ≤ i ≤ m}, so that
ϕ(X) = [ϕ₁(X), ..., ϕₘ(X)].
Here X = [X₁, X₂, ..., Xₙ] is a vector in the n-dimensional input space. The vector ϕ(X) maps each point of the
n-dimensional input space to a corresponding point in the m-dimensional hidden space.
We refer to ϕᵢ(X) as the radial basis function at the i-th hidden node; it serves as the activation function. The hidden
nodes play a role similar to that of the hidden neurons in an FFNN. The training dataset X is ϕ-separable into two
classes C₁ and C₂ if there exists an m-dimensional weight vector W = [W₁, W₂, ..., Wₘ] such that

Wᵀϕ(X) > 0, if X ∈ C₁
Wᵀϕ(X) < 0, if X ∈ C₂
RBFN Learning:
The learning algorithm of an RBF network may be split into two phases.
 Phase 1: hidden-layer learning, or basis function selection.
 Phase 2: learning the output weights.
The first phase is unsupervised: it does not use any information about the target
outputs and deals with the set of inputs only. At this stage, we
 select the number of radial basis functions (hidden-layer nodes),
 select a centre for each basis function, and
 select a value for the parameter R, which characterizes the basis function's range of
definition (the range of its influence).
In the second phase, once the hidden-unit activations are fixed, the output weight
vector W is determined by optimizing a single-layer linear network at the output layer,
similar to an FFNN (MLP).
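The second phase reduces to ordinary linear least squares. The sketch below is an assumed illustration (not from the slides): the fixed-centres approach of Phase 1 with a made-up target function, with the output weights solved by `numpy.linalg.lstsq`:

```python
import numpy as np

def design_matrix(X, centres, sigma):
    # Phi[i, j] = Gaussian activation of hidden node j for training input i
    d = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2)
    return np.exp(-d ** 2 / (2 * sigma ** 2))

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(50, 2))        # 50 training inputs, 2 features
y = np.sin(X[:, 0]) + X[:, 1] ** 2          # illustrative target function

# Phase 1 (fixed-centres approach): pick 8 centres at random, fix the width R
centres = X[rng.choice(len(X), size=8, replace=False)]
Phi = design_matrix(X, centres, sigma=0.7)

# Phase 2: hidden activations are now fixed, so the output weights W
# solve a plain linear least-squares problem -- no local minima.
W, *_ = np.linalg.lstsq(Phi, y, rcond=None)
y_hat = Phi @ W
```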
Learning of centres for hidden layer:
 Centres may be defined in several ways:
1. Fixed centres
2. Clustering approaches
 Fixed-Centres-Based Approach
The simplest approach for setting the RBF parameters is to have the
1. centres selected randomly from the N training examples, and
2. widths R equal and fixed, of appropriate size.
The problem with random selection of RBF centres is that the training examples may not be evenly distributed
throughout the input space.
 Clustering-Based Approaches
Rather than selecting centres randomly, a better approach is to use a clustering technique, which reflects the
distribution of the data points more accurately. The following two clustering approaches can be used:
1. K-Means Clustering
2. Self-Organizing Maps
These approaches group the training examples into m clusters/partitions of the input space. The centre
of each cluster/partition is then chosen as the centre of the corresponding hidden node. Once the RBF centres have
been identified, each RBF width can be set according to the variance of the input points in the corresponding cluster.
K-Means Clustering Algorithm
• Unsupervised Learning
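A sketch of the clustering-based approach (assumed details: two synthetic blobs, a plain k-means loop, and widths taken from each cluster's standard deviation):

```python
import numpy as np

def kmeans(X, m, iters=20, seed=0):
    # Plain k-means; the final cluster centres become the RBF centres.
    rng = np.random.default_rng(seed)
    centres = X[rng.choice(len(X), size=m, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest centre, then move each centre
        # to the mean of the points assigned to it.
        labels = np.argmin(np.linalg.norm(X[:, None] - centres[None], axis=2), axis=1)
        for j in range(m):
            if np.any(labels == j):
                centres[j] = X[labels == j].mean(axis=0)
    return centres, labels

# Two synthetic clusters around (0, 0) and (3, 3)
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.3, (40, 2)), rng.normal(3.0, 0.3, (40, 2))])

centres, labels = kmeans(X, m=2)
# Set each RBF width from the spread of the points in its cluster
widths = np.array([X[labels == j].std() for j in range(2)])
```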
RBFN For The XOR Problem
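The slide title above refers to the classic demonstration: XOR is not linearly separable in the input space, but after a non-linear mapping through two Gaussian hidden units it becomes linearly separable (Cover's theorem in action). A minimal sketch, with the standard choice of centres at (0,0) and (1,1) and an assumed threshold:

```python
import numpy as np

def phi(x, c):
    # Gaussian hidden unit with exponent -||x - c||^2 (i.e. sigma = 1/sqrt(2))
    return np.exp(-np.sum((x - c) ** 2))

# The four XOR inputs; desired outputs are 0, 1, 1, 0
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
c1, c2 = np.array([0.0, 0.0]), np.array([1.0, 1.0])

# Map each input into the 2-D hidden space (phi1, phi2)
H = np.array([[phi(x, c1), phi(x, c2)] for x in X])

# In hidden space, (0,1) and (1,0) collapse onto the same point, away from
# (0,0) and (1,1), so a single linear threshold on phi1 + phi2 separates XOR.
scores = H.sum(axis=1)                 # ~1.14 for class 0, ~0.74 for class 1
labels = (scores < 0.9).astype(int)    # reproduces 0, 1, 1, 0
```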
Comparison of RBF Network with FFNN
RBF network learning is similar to a curve-fitting problem in a high-dimensional space. RBF networks
differ from FFNNs (MLP: Multi-Layer Perceptron) in some basic ways:
 Both types of networks are non-linear layered feed-forward networks and are universal approximators.
 In general, feed-forward neural networks are considered global approximators, whereas radial basis
function networks are considered local approximators.
 A feed-forward neural network can have any number of hidden layers, whereas radial basis function
networks usually have a single hidden layer.
 Commonly, the hidden- and output-layer neurons of a feed-forward neural network share the same neuron
model, whereas in radial basis function networks the model of the hidden-layer neurons is
different from that of the output-layer neurons.
 Generally, both the hidden and output layers of a feed-forward neural network are non-linear, whereas
in radial basis function networks the hidden layer is non-linear and the output layer is linear.
 The activation function of the hidden layer of an RBFN computes the Euclidean distance between the
input vector and the centre of the unit, whereas the activation function of an FFNN computes the inner
product between the input vector and the weight vector.
 A radial unit in an RBFN is defined by its centre and a radius, unlike an FFNN unit, which
is defined by its weights and threshold.
Advantages of RBFN over FFNN
 A radial basis function network can model any non-linear function using
only a single hidden layer. This reduces the burden of design decisions,
such as choosing the number of layers for the network.
 The output layer of a radial basis function network can use a simple linear
transformation, which can be fully optimized using traditional linear modelling
techniques. Unlike feed-forward neural network training techniques, these
traditional linear techniques do not suffer from local minima. Thus,
RBFNs can be trained faster than FFNNs.
Disadvantages of RBFN over FFNN
 In RBF networks, the number of radial units required, the centres of
those units and their deviations (widths) must all be decided before linear
optimization can be applied to the output layer.
 Radial basis function networks are faster to train than feed-forward neural
networks, but they are equally susceptible to finding sub-optimal combinations of parameters.
 Radial basis function networks are also more vulnerable to the curse of
dimensionality, and they face greater problems when the number of input
units is large.
RBFN Applications
RBFNs are generally used for classification problems, but they are
general mapping networks with universal approximation capabilities
and are also used for function approximation. Typical applications include:
◆ Pattern recognition
◆ Speech processing
◆ Vision processing
◆ Image processing
◆ Time series prediction
◆ Control problems
Improving RBF Networks
The basic structure of RBF networks can be improved in a
number of ways:
◆ The number m of hidden nodes need not equal the number N
of training examples. In general, it is better to have m < N.
◆ The centres of the basis functions need not be taken from the
training dataset; instead, they can be determined by a training algorithm.
◆ The basis functions need not all have the same radius/spread
R; the spreads can also be determined by a training algorithm.
