CHAPTER 03
THE LEAST-MEAN SQUARE
ALGORITHM
CSC445: Neural Networks
Prof. Dr. Mostafa Gadal-Haqq M. Mostafa
Computer Science Department
Faculty of Computer & Information Sciences
AIN SHAMS UNIVERSITY
(Most of the figures in this presentation are copyrighted to Pearson Education, Inc.)
ASU-CSC445: Neural Networks Prof. Dr. Mostafa Gadal-Haqq
 Introduction
 Filtering Structure of the LMS Algorithm
 Unconstrained Optimization: A Review
 Method of Steepest Descent
 Newton’s Method
 Gauss-Newton Method
 The Least-Mean Square Algorithm
 Computer Experiment
Introduction
 The least-mean-square (LMS) algorithm, developed by Widrow
and Hoff (1960), was the first linear adaptive-
filtering algorithm (inspired by the perceptron) for
solving problems such as prediction.
 Some features of the LMS algorithm:
 Linear computational complexity with respect to the adjustable
parameters.
 Simple to code.
 Robust with respect to external disturbances.
Filtering Structure of the LMS Algorithm
 The unknown dynamic system is described by the training sample
T : {x(i), d(i); i = 1, 2, ..., n, ...}
where
x(i) = [x1(i), x2(i), ..., xM(i)]^T
and M is the input dimensionality.
Figure 3.1 (a) Unknown dynamic system. (b) Signal-flow graph of adaptive model
for the system; the graph embodies a feedback loop set in color.
 The stimulus vector x(i) can arise in
either of the following ways:
 Snapshot data: the M input elements of x
originate at different points in space.
 Temporal data: the M input elements of x
represent the set of present and (M − 1) past
values of some excitation.
Filtering Structure of the LMS Algorithm
 The problem addressed is "how to design a multi-
input, single-output model of the unknown dynamic
system by building it around a single linear neuron."
 Adaptive filtering (system identification):
 Start from an arbitrary setting of the adjustable weights.
 Adjustments to the weights are made on a continuous basis.
 Computations of the weight adjustments are completed inside
one interval that is one sampling period long.
 The adaptive filter consists of two continuous processes:
 Filtering process: computation of the output signal y(i) and the
error signal e(i).
 Adaptive process: the automatic adjustment of the weights
according to the error signal.
Filtering Structure of the LMS Algorithm
 These two processes together constitute a feedback
loop acting around the neuron.
 Since the neuron is linear, its output is
y(i) = v(i) = Σ_{k=1..M} wk(i) xk(i) = w^T(i) x(i)
where
w(i) = [w1(i), w2(i), ..., wM(i)]^T
 The error signal
e(i) = d(i) − y(i)
is determined by the cost function used to derive the
adaptive-filtering algorithm, which is closely
related to the optimization process.
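The filtering step above can be sketched in a few lines of NumPy (an illustrative sketch, not part of the lecture; the vectors w and x and the desired response d are made-up toy values):

```python
import numpy as np

def filter_step(w, x, d):
    """One filtering step: the linear neuron's output y(i) = w^T(i) x(i)
    and the error e(i) = d(i) - y(i) against the desired response."""
    y = float(np.dot(w, x))  # linear neuron: inner product, no nonlinearity
    e = d - y                # error signal
    return y, e

# toy values (illustrative only)
w = np.array([0.5, -0.2])
x = np.array([1.0, 2.0])
y, e = filter_step(w, x, d=1.0)
```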
Unconstrained Optimization: A Review
 Consider a cost function E(w) that is a continuously
differentiable function of the unknown weight vector w. We need to find
the optimal solution w* that satisfies:
E(w*) ≤ E(w)  for all w
 That is, we need to solve the unconstrained-optimization
problem stated as:
"Minimize the cost function E(w) with respect to the weight
vector w."
 The necessary condition for optimality:
∇E(w*) = 0
 Usually the solution is based on the idea of local iterative
descent:
starting with an initial guess w(0), generate a sequence of weight
vectors w(1), w(2), ..., such that:
E(w(n + 1)) < E(w(n))
Unconstrained Optimization: A Review
Method of Steepest descent
 The successive adjustments applied to the weight vector w are
in the direction of steepest descent; that is, in a direction
opposite to the gradient of E(w).
 If g = ∇E(w), then the steepest-descent algorithm is:
w(n + 1) = w(n) − η g(n)
where η is a positive constant called the step-size, or learning-
rate, parameter. In each step, the algorithm applies the
correction:
Δw(n) = w(n + 1) − w(n) = −η g(n)
 HW: Prove the convergence of this algorithm and show how it
is influenced by the learning rate η. (Sec. 3.3 of Haykin)
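The steepest-descent iteration can be sketched as follows (illustrative Python; the quadratic demo cost E(w) = ½ wᵀAw and the matrix A are assumptions chosen so the minimizer w* = 0 is known):

```python
import numpy as np

def steepest_descent(grad, w0, eta, n_steps):
    """Iterate w(n+1) = w(n) - eta * g(n), where g(n) = grad(w(n))."""
    w = np.asarray(w0, dtype=float)
    for _ in range(n_steps):
        w = w - eta * grad(w)  # correction: delta_w(n) = -eta * g(n)
    return w

# demo cost E(w) = 0.5 * w^T A w has gradient g(w) = A w and minimum at w* = 0
A = np.array([[2.0, 0.0], [0.0, 1.0]])
w_final = steepest_descent(lambda w: A @ w, w0=[1.0, 1.0], eta=0.1, n_steps=200)
```

With a small η the iterates shrink smoothly toward the origin, matching the overdamped behavior described on the next slide.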
Unconstrained Optimization: A Review
 When η is small, the transient response of the algorithm is
overdamped; i.e., w(n) follows a smooth path.
 When η is large, the transient response of the algorithm is
underdamped; i.e., w(n) follows a zigzagging (oscillatory) path.
 When η exceeds a critical value, the algorithm becomes unstable.
Figure 3.2 Trajectory of the method of steepest descent in a two-dimensional space for two
different values of the learning-rate parameter: (a) small η; (b) large η. The coordinates w1 and w2
are elements of the weight vector w; they both lie in the W-plane.
Unconstrained Optimization: A Review
Newton’s Method
 The idea is to minimize the quadratic approximation of the
cost function E(w) around the current point w(n):
ΔE(w(n)) = E(w(n + 1)) − E(w(n)) ≈ g^T(n) Δw(n) + (1/2) Δw^T(n) H(n) Δw(n)
 where H(n) is the Hessian matrix of E(w):
H(n) = ∇²E(w(n)) =
[ ∂²E/∂w1²     ∂²E/∂w1∂w2   ...  ∂²E/∂w1∂wM
  ∂²E/∂w2∂w1   ∂²E/∂w2²     ...  ∂²E/∂w2∂wM
  ...
  ∂²E/∂wM∂w1   ∂²E/∂wM∂w2   ...  ∂²E/∂wM² ]
 The weights are updated by minimizing the quadratic
approximation with respect to Δw(n):
w(n + 1) = w(n) + Δw(n) = w(n) − H⁻¹(n) g(n)
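A minimal sketch of the Newton update (illustrative Python; the quadratic demo cost with matrix A and vector b is an assumption chosen so the minimizer w* = A⁻¹b is known in closed form):

```python
import numpy as np

def newton_step(w, grad, hessian):
    """One Newton update: w(n+1) = w(n) - H^{-1}(n) g(n)."""
    # solve H dw = g rather than explicitly inverting H
    return w - np.linalg.solve(hessian(w), grad(w))

# demo quadratic E(w) = 0.5 w^T A w - b^T w; a single Newton step
# reaches the minimizer w* = A^{-1} b exactly from any starting point
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
w = newton_step(np.zeros(2), grad=lambda w: A @ w - b, hessian=lambda w: A)
```

For this A and b, w* = A⁻¹b = [0.2, 0.4], which one step recovers; this one-step exactness on quadratics is what makes Newton's method fast near a minimum.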
Unconstrained Optimization: A Review
Newton’s Method
 Generally, Newton's method converges quickly and does not
exhibit the zigzagging behavior of the method of steepest
descent.
 However, Newton's method has two main disadvantages:
 The Hessian matrix H(n) has to be positive definite for
all n, which is not guaranteed by the algorithm.
 This is solved by the modified Gauss-Newton method.
 It has high computational complexity.
The Least-Mean Square Algorithm
 The aim of the LMS algorithm is to minimize the
instantaneous value of the cost function E(ŵ):
E(ŵ) = (1/2) e²(n)
 Differentiation of E(ŵ) with respect to ŵ yields:
∂E(ŵ)/∂ŵ = e(n) ∂e(n)/∂ŵ
 As with the least-squares filters, the LMS operates on a linear
neuron, so we can write:
e(n) = d(n) − ŵ^T(n) x(n)
and
∂e(n)/∂ŵ = −x(n),  so that  ĝ(n) = ∂E(ŵ)/∂ŵ = −x(n) e(n)
 Finally:
ŵ(n + 1) = ŵ(n) − η ĝ(n) = ŵ(n) + η x(n) e(n)
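The derived gradient ĝ(n) = −x(n)e(n) can be checked numerically against central differences on the instantaneous cost (an illustrative sketch; w, x, and d are arbitrary toy values):

```python
import numpy as np

def instantaneous_cost(w, x, d):
    """E(w) = 0.5 * e(n)^2 with e(n) = d(n) - w^T x(n)."""
    e = d - np.dot(w, x)
    return 0.5 * e ** 2

# arbitrary toy values
w = np.array([0.3, -0.7])
x = np.array([1.5, 0.5])
d = 1.0

# analytic gradient from the derivation: g_hat(n) = -x(n) e(n)
e = d - np.dot(w, x)
g_analytic = -x * e

# numerical gradient by central differences, component by component
eps = 1e-6
g_numeric = np.zeros_like(w)
for k in range(len(w)):
    dw = np.zeros_like(w)
    dw[k] = eps
    g_numeric[k] = (instantaneous_cost(w + dw, x, d)
                    - instantaneous_cost(w - dw, x, d)) / (2 * eps)
```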
The Least-Mean Square Algorithm
 The inverse of the learning-rate parameter acts as a memory of the LMS
algorithm: the smaller the learning rate η, the longer the
memory span over the past data,
 which leads to more accurate results but a slower convergence rate.
 In the steepest-descent algorithm, the weight vector w(n) follows
a well-defined trajectory in the weight space for a prescribed η.
 In contrast, in the LMS algorithm the weight vector w(n) traces a
random trajectory. For this reason, the LMS algorithm is
sometimes referred to as a "stochastic gradient algorithm."
 Unlike steepest descent, the LMS algorithm does not require
knowledge of the statistics of the environment; it produces an
instantaneous estimate of the weight vector.
The Least-Mean Square Algorithm
 Summary of the LMS Algorithm
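The summarized algorithm can be sketched as a complete training loop (illustrative Python; the system-identification demo with a made-up true weight vector w_true and noiseless data is an assumption, not part of the lecture):

```python
import numpy as np

def lms(x_samples, d_samples, eta, M):
    """LMS algorithm. For each sample n = 1, 2, ...:
         y(n) = w^T(n) x(n)              (filtering process)
         e(n) = d(n) - y(n)
         w(n+1) = w(n) + eta * x(n) e(n)  (adaptive process)
    """
    w = np.zeros(M)  # arbitrary initial setting of the adjustable weights
    for x, d in zip(x_samples, d_samples):
        e = d - np.dot(w, x)
        w = w + eta * x * e
    return w

# identify a made-up 2-tap system from noiseless data
rng = np.random.default_rng(0)
w_true = np.array([0.8, -0.5])
X = rng.standard_normal((500, 2))
d = X @ w_true
w_hat = lms(X, d, eta=0.05, M=2)
```

Note the per-sample cost: each update is one inner product and one scaled vector addition, i.e., linear in the number of adjustable parameters, as claimed above.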
Virtues and Limitations of the LMS Algorithm
Computational Simplicity and Efficiency:
 The algorithm is very simple to code: only two or
three lines of code.
 The computational complexity of the algorithm is
linear in the adjustable parameters.
A COMPUTATIONAL
EXAMPLE
Cost Function
Linear models usually produce convex cost functions.
Figure copyright of Andrew Ng.
1
0
J(0,1)
Cost Function
Different starting points could lead to different solutions.
Figure copyright of Andrew Ng.
Finding a Solution
A sequence of slides shows the corresponding solution at each gradient-descent step on a contour plot of the cost function.
Figure copyright of Andrew Ng.
Virtues and Limitations of the LMS Algorithm
Robustness
 Since the LMS is model independent, it is robust with
respect to disturbances.
 That is, the LMS is optimal in accordance with the H∞ norm (the
response of the transfer operator T of the estimator).
 The philosophy of this optimality is to accommodate the worst-case
scenario:
• "If you don't know what you are up against, plan for the worst-case scenario and
optimize."
Figure 3.8 Formulation of the optimal H∞ estimation problem. The generic estimation error at the transfer
operator's output could be the weight-error vector, the explanational error, etc.
Virtues and Limitations of the LMS Algorithm
Factors Limiting the LMS Performance
 The primary limitations of the LMS algorithm are:
 its slow rate of convergence (which becomes serious when the
dimensionality of the input space becomes high); it typically requires
a number of iterations equal to about 10 times the dimensionality of
the input data space to converge to a stable solution, and
 its sensitivity to variations in the eigenstructure of the input.
• The sensitivity to changes in the environment becomes particularly
acute when the condition number of the LMS algorithm is high.
• The condition number is
χ(R) = λmax / λmin
• where λmax and λmin are the maximum and minimum
eigenvalues of the correlation matrix Rxx.
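The condition number χ(R) can be estimated numerically (an illustrative sketch; the nearly collinear input construction is an assumption chosen deliberately to make Rxx ill conditioned):

```python
import numpy as np

# estimate chi(R) = lambda_max / lambda_min of the input correlation matrix
rng = np.random.default_rng(1)
n = 10_000
x1 = rng.standard_normal(n)
# the second input is nearly a copy of the first, so R_xx is ill conditioned
X = np.column_stack([x1,
                     x1 + 0.01 * rng.standard_normal(n),
                     rng.standard_normal(n)])
R = (X.T @ X) / n                # sample correlation matrix R_xx
eigvals = np.linalg.eigvalsh(R)  # eigenvalues in ascending order
chi = eigvals[-1] / eigvals[0]   # condition number chi(R)
```

For inputs like these, χ(R) is in the thousands, which is exactly the regime where LMS convergence suffers.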
Learning-rate Annealing Schedules
 The slow rate of convergence may be
attributed to maintaining the learning-
rate parameter, η, constant.
 In stochastic approximation, the
learning-rate parameter is a time-
varying parameter.
 The most common form used is:
η(n) = c / n
 An alternative form is the search-then-
converge schedule:
η(n) = η0 / (1 + n/τ)
Figure 3.9 Learning-rate annealing
schedules: The horizontal axis,
printed in color, pertains to the
standard LMS algorithm.
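Both schedules are one-liners (illustrative Python; the constants c, η0, and τ are arbitrary defaults, not values from the lecture):

```python
def annealed_rate(n, c=1.0):
    """Most common annealing form: eta(n) = c / n, for n = 1, 2, ..."""
    return c / n

def search_then_converge(n, eta0=0.1, tau=100.0):
    """Search-then-converge schedule: eta(n) = eta0 / (1 + n/tau).
    Nearly constant at eta0 while n << tau (search phase), then
    decays like c/n for n >> tau (converge phase)."""
    return eta0 / (1.0 + n / tau)
```

The search-then-converge form avoids the very large early steps of c/n while keeping the asymptotic 1/n decay required by stochastic approximation.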
Computer Experiment
 d=1
Figure 3.6 LMS classification with distance 1, based on the double-moon
configuration of Fig. 1.8.
Computer Experiment
 d=-4
Figure 3.7 LMS classification with distance –4, based on the double-moon
configuration of Fig. 1.8.
Homework 3
• Problems: 3.3
• Computer Experiment: 3.10, 3.13
Next Time
Multilayer Perceptrons
Procuring digital preservation CAN be quick and painless with our new dynamic...Jisc
 
Romantic Opera MUSIC FOR GRADE NINE pptx
Romantic Opera MUSIC FOR GRADE NINE pptxRomantic Opera MUSIC FOR GRADE NINE pptx
Romantic Opera MUSIC FOR GRADE NINE pptxsqpmdrvczh
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatYousafMalik24
 
Keynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designKeynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designMIPLM
 
ROOT CAUSE ANALYSIS PowerPoint Presentation
ROOT CAUSE ANALYSIS PowerPoint PresentationROOT CAUSE ANALYSIS PowerPoint Presentation
ROOT CAUSE ANALYSIS PowerPoint PresentationAadityaSharma884161
 
Grade 9 Q4-MELC1-Active and Passive Voice.pptx
Grade 9 Q4-MELC1-Active and Passive Voice.pptxGrade 9 Q4-MELC1-Active and Passive Voice.pptx
Grade 9 Q4-MELC1-Active and Passive Voice.pptxChelloAnnAsuncion2
 
AmericanHighSchoolsprezentacijaoskolama.
AmericanHighSchoolsprezentacijaoskolama.AmericanHighSchoolsprezentacijaoskolama.
AmericanHighSchoolsprezentacijaoskolama.arsicmarija21
 
Types of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxTypes of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxEyham Joco
 
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdfAMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdfphamnguyenenglishnb
 
Hierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementHierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementmkooblal
 
Planning a health career 4th Quarter.pptx
Planning a health career 4th Quarter.pptxPlanning a health career 4th Quarter.pptx
Planning a health career 4th Quarter.pptxLigayaBacuel1
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Educationpboyjonauth
 
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Gas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxGas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxDr.Ibrahim Hassaan
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersSabitha Banu
 
How to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPHow to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPCeline George
 
Roles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceRoles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceSamikshaHamane
 

Recently uploaded (20)

Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
 
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxEPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
 
Quarter 4 Peace-education.pptx Catch Up Friday
Quarter 4 Peace-education.pptx Catch Up FridayQuarter 4 Peace-education.pptx Catch Up Friday
Quarter 4 Peace-education.pptx Catch Up Friday
 
Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...
 
Romantic Opera MUSIC FOR GRADE NINE pptx
Romantic Opera MUSIC FOR GRADE NINE pptxRomantic Opera MUSIC FOR GRADE NINE pptx
Romantic Opera MUSIC FOR GRADE NINE pptx
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice great
 
Keynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designKeynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-design
 
ROOT CAUSE ANALYSIS PowerPoint Presentation
ROOT CAUSE ANALYSIS PowerPoint PresentationROOT CAUSE ANALYSIS PowerPoint Presentation
ROOT CAUSE ANALYSIS PowerPoint Presentation
 
Grade 9 Q4-MELC1-Active and Passive Voice.pptx
Grade 9 Q4-MELC1-Active and Passive Voice.pptxGrade 9 Q4-MELC1-Active and Passive Voice.pptx
Grade 9 Q4-MELC1-Active and Passive Voice.pptx
 
AmericanHighSchoolsprezentacijaoskolama.
AmericanHighSchoolsprezentacijaoskolama.AmericanHighSchoolsprezentacijaoskolama.
AmericanHighSchoolsprezentacijaoskolama.
 
Types of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxTypes of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptx
 
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdfAMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
 
Hierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementHierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of management
 
Planning a health career 4th Quarter.pptx
Planning a health career 4th Quarter.pptxPlanning a health career 4th Quarter.pptx
Planning a health career 4th Quarter.pptx
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Education
 
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
 
Gas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxGas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptx
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginners
 
How to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPHow to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERP
 
Roles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceRoles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in Pharmacovigilance
 

Neural Networks: The Least-Mean-Square (LMS) Algorithm
  • 5. Filtering Structure of the LMS Algorithm: adaptive filtering (system identification)
 Start from an arbitrary setting of the adjustable weights.
 Adjustments to the weights are made on a continuous basis.
 Computation of the adjustments to the weights is completed inside an interval that is one sampling period long.
 The adaptive filter consists of two continuous processes:
 Filtering process: computation of the output signal y(i) and of the error signal e(i).
 Adaptive process: the automatic adjustment of the weights according to the error signal.
  • 6. Filtering Structure of the LMS Algorithm
 These two processes together constitute a feedback loop acting around the neuron.
 Since the neuron is linear, its output is
   $y(i) = v(i) = \sum_{k=1}^{M} w_k(i)\,x_k(i) = \mathbf{w}^T(i)\,\mathbf{x}(i)$,
   where $\mathbf{w}(i) = [w_1(i), w_2(i), \ldots, w_M(i)]^T$.
 The error signal $e(i) = d(i) - y(i)$ is determined by the cost function used to derive the adaptive-filtering algorithm, which is closely tied to the optimization process.
  • 7. Unconstrained Optimization: A Review
 Consider a cost function $E(\mathbf{w})$ that is a continuously differentiable function of the unknown weight vector $\mathbf{w}$. We need to find the optimal solution $\mathbf{w}^*$ that satisfies
   $E(\mathbf{w}^*) \le E(\mathbf{w})$ for all $\mathbf{w}$.
 That is, we solve the unconstrained-optimization problem: "Minimize the cost function $E(\mathbf{w})$ with respect to the weight vector $\mathbf{w}$."
 The necessary condition for optimality is $\nabla E(\mathbf{w}^*) = \mathbf{0}$.
 The solution is usually based on local iterative descent: starting with an initial guess $\mathbf{w}(0)$, generate a sequence of weight vectors $\mathbf{w}(1), \mathbf{w}(2), \ldots$ such that
   $E(\mathbf{w}(n+1)) < E(\mathbf{w}(n))$.
  • 8. Unconstrained Optimization: Method of Steepest Descent
 The successive adjustments applied to the weight vector $\mathbf{w}$ are in the direction of steepest descent, that is, in the direction opposite to the gradient of $E(\mathbf{w})$.
 With $\mathbf{g} = \nabla E(\mathbf{w})$, the steepest-descent algorithm is
   $\mathbf{w}(n+1) = \mathbf{w}(n) - \eta\,\mathbf{g}(n)$,
 where $\eta$ is a positive constant called the step size, or learning-rate, parameter. In each step, the algorithm applies the correction
   $\Delta\mathbf{w}(n) = \mathbf{w}(n+1) - \mathbf{w}(n) = -\eta\,\mathbf{g}(n)$.
 HW: Prove the convergence of this algorithm and show how it is influenced by the learning rate $\eta$ (Sec. 3.3 of Haykin).
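As an illustrative aside (not part of the original slides), the steepest-descent update can be sketched in a few lines of Python; the toy quadratic cost and its settings are assumptions chosen for the demo:

```python
def steepest_descent(grad, w, eta, n_steps):
    """Iterate w(n+1) = w(n) - eta * g(n), where g(n) is the gradient at w(n)."""
    for _ in range(n_steps):
        g = grad(w)
        w = [wk - eta * gk for wk, gk in zip(w, g)]
    return w

# Toy cost E(w) = 0.5 * (w1^2 + w2^2); its gradient at w is simply w itself.
w = steepest_descent(lambda w: w, [4.0, -2.0], eta=0.1, n_steps=100)
# Each step multiplies the weights by (1 - eta), so w decays smoothly toward [0, 0].
```

With a small η the iterates shrink monotonically toward the minimizer; for this particular cost, pushing η past the critical value 2 makes the iteration diverge, matching the stability discussion in the slides.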
  • 9. Unconstrained Optimization: A Review
 When $\eta$ is small, the transient response of the algorithm is overdamped, i.e., $\mathbf{w}(n)$ follows a smooth path.
 When $\eta$ is large, the transient response is underdamped, i.e., $\mathbf{w}(n)$ follows a zigzagging (oscillatory) path.
 When $\eta$ exceeds a critical value, the algorithm becomes unstable.
Figure 3.2 Trajectory of the method of steepest descent in a two-dimensional space for two different values of the learning-rate parameter: (a) small η; (b) large η. The coordinates w1 and w2 are elements of the weight vector w; they both lie in the W-plane.
  • 10. Unconstrained Optimization: Newton's Method
 The idea is to minimize the quadratic approximation of the cost function $E(\mathbf{w})$ around the current point $\mathbf{w}(n)$:
   $\Delta E(\mathbf{w}(n)) = E(\mathbf{w}(n+1)) - E(\mathbf{w}(n)) \approx \mathbf{g}^T(n)\,\Delta\mathbf{w}(n) + \tfrac{1}{2}\,\Delta\mathbf{w}^T(n)\,\mathbf{H}(n)\,\Delta\mathbf{w}(n)$,
 where $\mathbf{H}(n)$ is the Hessian matrix of $E(\mathbf{w})$:
   $\mathbf{H}(n) = \nabla^2 E(\mathbf{w}(n)) = \begin{bmatrix} \frac{\partial^2 E}{\partial w_1^2} & \frac{\partial^2 E}{\partial w_1 \partial w_2} & \cdots & \frac{\partial^2 E}{\partial w_1 \partial w_M} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial^2 E}{\partial w_M \partial w_1} & \frac{\partial^2 E}{\partial w_M \partial w_2} & \cdots & \frac{\partial^2 E}{\partial w_M^2} \end{bmatrix}$.
 Minimizing the quadratic approximation with respect to $\Delta\mathbf{w}(n)$ gives $\Delta\mathbf{w}(n) = -\mathbf{H}^{-1}(n)\,\mathbf{g}(n)$, so the weights are updated as
   $\mathbf{w}(n+1) = \mathbf{w}(n) - \mathbf{H}^{-1}(n)\,\mathbf{g}(n)$.
  • 11. Unconstrained Optimization: Newton's Method (cont.)
 Generally, Newton's method converges quickly and does not exhibit the zigzagging behavior of the method of steepest descent.
 However, Newton's method has two main disadvantages:
 The Hessian matrix H(n) has to be a positive-definite matrix for all n, which the algorithm does not guarantee. This is addressed by the modified Gauss-Newton method.
 It has high computational complexity.
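To make the update concrete, here is a small sketch (my own illustration, not from the slides) of one Newton step $\mathbf{w}(n+1) = \mathbf{w}(n) - \mathbf{H}^{-1}(n)\,\mathbf{g}(n)$ on a two-dimensional quadratic cost; the particular H and starting point are arbitrary assumptions:

```python
def newton_step(w, g, H):
    """One Newton update in 2-D: w <- w - H^{-1} g, via the explicit 2x2 inverse."""
    a, b, c, d = H[0][0], H[0][1], H[1][0], H[1][1]
    det = a * d - b * c                  # positive when H is positive definite
    dw0 = (d * g[0] - b * g[1]) / det
    dw1 = (-c * g[0] + a * g[1]) / det
    return [w[0] - dw0, w[1] - dw1]

# Quadratic cost E(w) = 0.5 * w^T H w has gradient g(w) = H w and minimum at the origin.
H = [[2.0, 0.5], [0.5, 1.0]]
w = [3.0, -1.0]
g = [H[0][0] * w[0] + H[0][1] * w[1], H[1][0] * w[0] + H[1][1] * w[1]]
w_new = newton_step(w, g, H)
# On a quadratic cost, a single Newton step lands exactly on the minimizer [0, 0].
```

The explicit inverse also hints at the cost of the method: for M weights, forming and inverting H(n) is roughly O(M^3) per step, which is the complexity disadvantage noted in the slides.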
  • 12. The Least-Mean-Square Algorithm
 The aim of the LMS algorithm is to minimize the instantaneous value of the cost function:
   $E(\hat{\mathbf{w}}) = \tfrac{1}{2} e^2(n)$.
 Differentiation of $E(\hat{\mathbf{w}})$ with respect to $\hat{\mathbf{w}}$ yields
   $\frac{\partial E(\hat{\mathbf{w}})}{\partial \hat{\mathbf{w}}} = e(n)\,\frac{\partial e(n)}{\partial \hat{\mathbf{w}}}$.
 As with the least-squares filter, the LMS operates on a linear neuron, so we can write
   $e(n) = d(n) - \mathbf{x}^T(n)\,\hat{\mathbf{w}}(n)$, hence $\frac{\partial e(n)}{\partial \hat{\mathbf{w}}} = -\mathbf{x}(n)$
 and the gradient estimate
   $\hat{\mathbf{g}}(n) = \frac{\partial E(\hat{\mathbf{w}})}{\partial \hat{\mathbf{w}}} = -\mathbf{x}(n)\,e(n)$.
 Finally,
   $\hat{\mathbf{w}}(n+1) = \hat{\mathbf{w}}(n) - \eta\,\hat{\mathbf{g}}(n) = \hat{\mathbf{w}}(n) + \eta\,\mathbf{x}(n)\,e(n)$.
  • 13. The Least-Mean-Square Algorithm (cont.)
 The inverse of the learning rate acts as a memory of the LMS algorithm: the smaller the learning rate $\eta$, the longer the memory span over the past data, which leads to more accurate results but a slower convergence rate.
 In the steepest-descent algorithm, the weight vector w(n) follows a well-defined trajectory in weight space for a prescribed $\eta$.
 In contrast, in the LMS algorithm the weight vector w(n) traces a random trajectory. For this reason, the LMS algorithm is sometimes referred to as a "stochastic gradient algorithm."
 Unlike steepest descent, the LMS algorithm does not require knowledge of the statistics of the environment; it produces an instantaneous estimate of the weight vector.
  • 14. The Least-Mean-Square Algorithm: Summary of the LMS Algorithm
 Training sample: input signal vector $\mathbf{x}(n)$ and desired response $d(n)$.
 User-selected parameter: $\eta$.
 Initialization: set $\hat{\mathbf{w}}(0) = \mathbf{0}$.
 Computation: for $n = 1, 2, \ldots$, compute
   $e(n) = d(n) - \hat{\mathbf{w}}^T(n)\,\mathbf{x}(n)$
   $\hat{\mathbf{w}}(n+1) = \hat{\mathbf{w}}(n) + \eta\,\mathbf{x}(n)\,e(n)$
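The whole algorithm fits in a few lines of code. The following is a minimal pure-Python sketch; the toy system-identification task, its coefficients, and the input sequence are assumptions for illustration only:

```python
def lms(samples, M, eta):
    """LMS adaptation: one filtering step and one weight update per sample."""
    w = [0.0] * M                                         # arbitrary initial weights
    for x, d in samples:                                  # x: input vector, d: desired response
        y = sum(wk * xk for wk, xk in zip(w, x))          # filtering: y(n) = w^T(n) x(n)
        e = d - y                                         # error: e(n) = d(n) - y(n)
        w = [wk + eta * e * xk for wk, xk in zip(w, x)]   # adaptation: w += eta * e * x
    return w

# Toy system identification: the "unknown system" is d = 0.8*x1 - 0.3*x2 (noiseless).
true_w = [0.8, -0.3]
inputs = [[(i * 7 % 5) - 2.0, (i * 3 % 4) - 1.5] for i in range(200)]
samples = [(x, true_w[0] * x[0] + true_w[1] * x[1]) for x in inputs]
w = lms(samples, M=2, eta=0.05)
# w approaches the true coefficients [0.8, -0.3].
```

Note that the update touches only the instantaneous pair (x(n), d(n)), never the statistics of the environment, which is the model independence the slides emphasize.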
  • 15. Virtues and Limitations of the LMS Algorithm: Computational Simplicity and Efficiency
 The algorithm is very simple to code, requiring only two or three lines of code.
 The computational complexity of the algorithm is linear in the number of adjustable parameters.
  • 17. Cost Function: linear models usually produce convex (bowl-shaped) cost functions. (Figure copyright of Andrew Ng.)
  • 18. Cost Function: on a non-convex cost surface J(θ0, θ1), different starting points could lead to different solutions. (Figure copyright of Andrew Ng.)
  • 19-27. Finding a Solution: a sequence of figures showing successive gradient-descent steps on the contour plot of the cost function, each paired with the corresponding fitted solution. (Figures copyright of Andrew Ng.)
  • 28. Virtues and Limitations of the LMS Algorithm: Robustness
 Since the LMS is model independent, it is robust with respect to disturbances.
 That is, the LMS is optimal in accordance with the H∞ norm (of the transfer operator T of the estimator).
 The philosophy of this optimality is to accommodate the worst-case scenario: "If you don't know what you are up against, plan for the worst scenario and optimize."
Figure 3.8 Formulation of the optimal H∞ estimation problem. The generic estimation error at the transfer operator's output could be the weight-error vector, the explanational error, etc.
  • 29. Virtues and Limitations of the LMS Algorithm: Factors Limiting the LMS Performance
 The primary limitations of the LMS algorithm are:
 its slow rate of convergence, which becomes serious when the dimensionality of the input space is high; it typically requires a number of iterations equal to about 10 times the dimensionality of the input data space to converge to a stable solution; and
 its sensitivity to variations in the eigenstructure of the input.
 The sensitivity to changes in the environment becomes particularly acute when the condition number of the LMS algorithm is high.
 The condition number is $\chi(\mathbf{R}) = \lambda_{\max}/\lambda_{\min}$, where $\lambda_{\max}$ and $\lambda_{\min}$ are the maximum and minimum eigenvalues of the correlation matrix $\mathbf{R}_{xx}$.
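The condition number is easy to compute for a small example. This sketch (my own illustration, using the closed-form eigenvalues of a symmetric 2×2 matrix) shows how input correlation inflates χ(R):

```python
import math

def condition_number_2x2(R):
    """chi(R) = lambda_max / lambda_min for a symmetric positive-definite 2x2 matrix."""
    a, b, d = R[0][0], R[0][1], R[1][1]
    tr, det = a + d, a * d - b * b
    disc = math.sqrt(tr * tr - 4.0 * det)   # discriminant is >= 0 for a symmetric matrix
    # lambda_max = (tr + disc)/2 and lambda_min = (tr - disc)/2, so the 2s cancel.
    return (tr + disc) / (tr - disc)

# White input: uncorrelated components with equal power -> chi = 1 (best case for LMS).
chi_white = condition_number_2x2([[1.0, 0.0], [0.0, 1.0]])
# Strongly correlated input: eigenvalues 1.9 and 0.1 -> chi of about 19 (slow convergence).
chi_corr = condition_number_2x2([[1.0, 0.9], [0.9, 1.0]])
```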
  • 30. Learning-Rate Annealing Schedules
 The slow rate of convergence may be attributed to keeping the learning-rate parameter $\eta$ constant.
 In stochastic approximation, the learning-rate parameter is a time-varying parameter; the most common form used is
   $\eta(n) = c / n$.
 An alternative form is the search-then-converge schedule:
   $\eta(n) = \dfrac{\eta_0}{1 + n/\tau}$.
Figure 3.9 Learning-rate annealing schedules: the horizontal axis, printed in color, pertains to the standard LMS algorithm.
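Both schedules are one-liners in code. A small sketch (the constants c, η0, and τ are arbitrary assumptions) makes the "search" and "converge" phases visible:

```python
def eta_stochastic(n, c=1.0):
    """Stochastic-approximation schedule: eta(n) = c / n (decays from the first step)."""
    return c / n

def eta_search_then_converge(n, eta0=0.1, tau=100.0):
    """Search-then-converge schedule: eta(n) = eta0 / (1 + n / tau)."""
    return eta0 / (1.0 + n / tau)

# For n << tau the schedule stays nearly flat at eta0 (search phase);
# for n >> tau it behaves like (eta0 * tau) / n (converge phase).
early = eta_search_then_converge(1)       # close to eta0 = 0.1
late = eta_search_then_converge(10000)    # roughly eta0 * tau / n = 0.001
```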
  • 31. Computer Experiment: d = 1. Figure 3.6 LMS classification with distance 1, based on the double-moon configuration of Fig. 1.8.
  • 32. Computer Experiment: d = -4. Figure 3.7 LMS classification with distance -4, based on the double-moon configuration of Fig. 1.8.